I am developing an application that manages microservices running on Kubernetes. I have already done some cool things, like moving a microservice from one node to another. The problem is that all replicas move together.
Imagine that a microservice has two replicas and runs in a cluster with two nodes.
I want to place one replica on each node. Is that possible? Can it be done in a YAML file?
I am trying to write my own scheduler to do that, but I have had no success so far.
I think what you are looking for is inter-pod anti-affinity for your ReplicaSet. From the documentation:
Inter-pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled based on labels on pods that are already running on the node rather than based on labels on nodes.
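To make the anti-affinity idea concrete, here is a minimal sketch that builds a Deployment manifest as a Python dict and prints it as JSON (kubectl accepts JSON as well as YAML). The function name, the "app" label, the service name and the image are assumptions made up for illustration. The required anti-affinity rule with topologyKey kubernetes.io/hostname tells the scheduler not to put two pods carrying the same label on one node.

```python
import json


def deployment_one_replica_per_node(name: str, image: str, replicas: int) -> dict:
    """Build a Deployment whose pods refuse to share a node with each other."""
    labels = {"app": name}  # hypothetical label, used by selector and anti-affinity
    anti_affinity = {
        "podAntiAffinity": {
            # Hard rule: a pod is not scheduled onto a node (hostname topology)
            # that already runs a pod matching the label selector.
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": labels},
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": anti_affinity,
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


# Example values are placeholders, not taken from the question.
manifest = deployment_one_replica_per_node("my-service", "example/my-service:1.0", 2)
print(json.dumps(manifest, indent=2))
```

The output can be piped into kubectl apply -f -. Note that with a hard rule, a replica stays Pending if there are more replicas than eligible nodes.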
I can't find where it's documented, but I recently read somewhere that replicas will be distributed across nodes when you create the kubernetes service BEFORE the deployment / replicaset.


Run different replica count for different containers within same pod

I have a pod with 2 closely related services running as containers. I run it as a StatefulSet with replicas set to 5, so 5 pods are created, each containing both containers.
My requirement now is to have the second container run in only 1 pod, not in all 5, while the first service still runs in 5 pods.
Is there a way to define this in the deployment YAML file for Kubernetes? Please help.
A "pod" is the smallest entity managed by Kubernetes, and one pod can contain multiple containers. However, you can only specify one pod template per Deployment/StatefulSet, so there is no way to accomplish what you are asking for with a single Deployment/StatefulSet.
If you want to scale them independently of each other, you can create two Deployments/StatefulSets. This is, in my opinion, the only way to do so.
see for more information.
Containers are like processes,
Pods are like VMs,
and Statefulsets/Deployments are like the supervisor program controlling the VM's horizontal scaling.
The only way for your scenario is to define the second container in a new deployment's pod template, and set its replicas to 1, while keeping the old statefulset with 5 replicas.
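As a rough sketch of that split, the snippet below builds two separate workloads: a StatefulSet with 5 replicas for the first container and a Deployment with 1 replica for the second. All names and images are placeholders invented for the example, and the StatefulSet's serviceName assumes a headless Service of the same name exists. The manifests are emitted as a JSON List, which kubectl can apply in one go.

```python
import json


def workload(kind: str, name: str, image: str, replicas: int) -> dict:
    """Build a single-container Deployment or StatefulSet manifest."""
    labels = {"app": name}  # hypothetical label
    spec = {
        "replicas": replicas,
        "selector": {"matchLabels": labels},
        "template": {
            "metadata": {"labels": labels},
            "spec": {"containers": [{"name": name, "image": image}]},
        },
    }
    if kind == "StatefulSet":
        # A StatefulSet needs a governing Service; assumed here to be a
        # headless Service with the same name as the workload.
        spec["serviceName"] = name
    return {
        "apiVersion": "apps/v1",
        "kind": kind,
        "metadata": {"name": name},
        "spec": spec,
    }


# Placeholder names: "main-service" is the container that needs 5 copies,
# "support-service" is the one that should run only once.
manifests = [
    workload("StatefulSet", "main-service", "example/main:1.0", 5),
    workload("Deployment", "support-service", "example/support:1.0", 1),
]
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Because the two containers no longer share a pod, they also no longer share localhost or volumes, so the main service would have to reach the support service through its own Service.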
Here are some definitions from the documentation (links in the references):
Containers are technologies that allow you to package and isolate applications with their entire runtime environment—all of the files necessary to run. This makes it easy to move the contained application between environments (dev, test, production, etc.) while retaining full functionality. [1]
Pods are the smallest, most basic deployable objects in Kubernetes. A Pod represents a single instance of a running process in your cluster. Pods contain one or more containers. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod's resources. [2]
A deployment provides declarative updates for Pods and ReplicaSets. [3]
StatefulSet is the workload API object used to manage stateful applications. Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods. [4]
Based on all that information, it is impossible to meet your requirements using a single Deployment/StatefulSet.
I advise you to try the idea @David Maze mentioned in a comment under your question:
If it's possible to have 4 of the main application container not having a matching same-pod support container, then they're not so "closely related" they need to run in the same pod. Run the second container in a separate Deployment/StatefulSet (also with a separate Service) and you can independently control the replica counts.
Can a pod run on multiple nodes?

I have one Kubernetes master and three Kubernetes nodes. I created one pod, which is running on a specific node. I want to run that pod on 2 nodes. How can I achieve this? Does the replica concept help me? If yes, how?
Yes, you can assign pods to one or more nodes of your cluster (each individual pod runs on exactly one node, so running on 2 nodes means running 2 copies). Here are some options to achieve this:
nodeSelector is the simplest recommended form of node selection constraint. nodeSelector is a field of PodSpec. It specifies a map of key-value pairs. For the pod to be eligible to run on a node, the node must have each of the indicated key-value pairs as labels (it can have additional labels as well). The most common usage is one key-value pair.
affinity and anti-affinity
Node affinity is conceptually similar to nodeSelector -- it allows you to constrain which nodes your pod is eligible to be scheduled on, based on labels on the node.
nodeSelector provides a very simple way to constrain pods to nodes with particular labels. The affinity/anti-affinity feature greatly expands the types of constraints you can express. The key enhancements are:
The affinity/anti-affinity language is more expressive. The language offers more matching rules besides exact matches created with a logical AND operation;
you can indicate that the rule is "soft"/"preference" rather than a hard requirement, so if the scheduler can't satisfy it, the pod will still be scheduled;
you can constrain against labels on other pods running on the node (or other topological domain), rather than against labels on the node itself, which allows rules about which pods can and cannot be co-located
A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.
Some typical uses of a DaemonSet are:
running a cluster storage daemon on every node
running a logs collection daemon on every node
running a node monitoring daemon on every node
Please check this link to read more about how to assign pods to nodes.
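The first two options above can be sketched as pod manifests built in Python and printed as JSON. The label disktype=ssd, the pod names and the image are assumptions for illustration only. The first function uses nodeSelector (hard, exact match); the second uses a "soft" node affinity preference, so the pod is still scheduled when no node carries the label.

```python
import json


def base_pod(name: str, image: str) -> dict:
    """Minimal single-container Pod manifest."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {"containers": [{"name": name, "image": image}]},
    }


def pod_with_node_selector(name: str, image: str, node_labels: dict) -> dict:
    """Pod that may only run on nodes carrying every given label."""
    pod = base_pod(name, image)
    pod["spec"]["nodeSelector"] = dict(node_labels)
    return pod


def pod_with_preferred_node_affinity(name, image, key, values, weight=1):
    """Pod that prefers, but does not require, nodes whose label is in values."""
    pod = base_pod(name, image)
    pod["spec"]["affinity"] = {
        "nodeAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,  # valid range is 1 to 100
                    "preference": {
                        "matchExpressions": [
                            {"key": key, "operator": "In", "values": list(values)}
                        ]
                    },
                }
            ]
        }
    }
    return pod


# "disktype": "ssd" is a hypothetical node label.
hard = pod_with_node_selector("web-hard", "example/web:1.0", {"disktype": "ssd"})
soft = pod_with_preferred_node_affinity("web-soft", "example/web:1.0", "disktype", ["ssd"])
print(json.dumps([hard, soft], indent=2))
```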
It's not good practice to run pods directly on the nodes, as nodes and pods can crash at any time. It's better to use the K8S controllers, as mentioned in the K8S documentation here.
K8S supports multiple controllers, and the appropriate one depends on the requirement. From the question alone, it's difficult to say which controller to use.
You can use a DaemonSet if you want to run a pod on each node.
From what I see, you are trying to deploy a pod on each node. It's better to let the scheduler decide where the pod should be deployed based on the available resources.
This works best even in worst-case scenarios, I mean in case of node failures.
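For the DaemonSet route mentioned above, a minimal manifest could look like the sketch below, again built as a Python dict and printed as JSON. The name and image are placeholders. Note that a DaemonSet has no replicas field: the number of pods follows the number of eligible nodes.

```python
import json


def daemonset(name: str, image: str) -> dict:
    """Build a DaemonSet that runs one copy of the pod on each eligible node."""
    labels = {"app": name}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            # No "replicas" here: one pod per eligible node is implied.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Placeholder name and image.
ds = daemonset("node-agent", "example/node-agent:1.0")
print(json.dumps(ds, indent=2))
```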

Is There a Way To Control a DaemonSet's Rolling Update Order In Kubernetes?

I have three DaemonSet pods, each containing a Hadoop ResourceManager container. One of the three is the active node, and the other two are standby nodes.
Is there a way to let Kubernetes know whether the Hadoop ResourceManager inside a pod is an active node or a standby node?
I want to control the rolling update so that the standby nodes are updated first and the active node last, to reduce the number of active-node changes, which can be risky.
Consider the following: Deployments, DaemonSets and ReplicaSets are abstractions meant to manage a uniform group of objects.
In your specific case, although you're running the same application, you can't say it's a uniform group of objects, as you have two types: active and standby objects.
There is no way to tell Kubernetes which is which if they're grouped in what is supposed to be a uniform set of objects.
As suggested by @wolmi, having them in a Deployment instead of a DaemonSet still leaves you with the issue that deployment strategies can't individually identify objects to control when they're updated, because of the aforementioned logic.
My suggestion would be that, in addition to using a Deployment with node affinity to ensure a highly available environment, you separate active and standby objects into different Deployments/Services and base your rolling update strategy on that scenario.
This will ensure that you're updating the standby nodes first, removing the risk of updating the active nodes before the others.
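A rough sketch of that separation follows. It assumes two Deployments distinguished by a made-up "role" label (standby and active), with placeholder names and image, and a small helper that decides the order in which an external script would update them. Nothing here is built into Kubernetes; the ordering is done entirely by your own tooling.

```python
import json


def resource_manager_deployment(role: str, replicas: int, image: str) -> dict:
    """Deployment for one group of ResourceManagers ("active" or "standby")."""
    name = f"resource-manager-{role}"  # hypothetical naming scheme
    labels = {"app": "resource-manager", "role": role}  # "role" is an assumed label
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "resource-manager", "image": image}]},
            },
        },
    }


def update_order(deployments: list) -> list:
    """Return the deployments with standby groups first and active groups last."""
    # False sorts before True, so "standby" entries come first; sorted() is stable.
    return sorted(deployments, key=lambda d: d["metadata"]["labels"]["role"] != "standby")


deployments = [
    resource_manager_deployment("active", 1, "example/hadoop-rm:1.0"),
    resource_manager_deployment("standby", 2, "example/hadoop-rm:1.0"),
]
for d in update_order(deployments):
    print(d["metadata"]["name"])
print(json.dumps(update_order(deployments)[0]["metadata"], indent=2))
```

An update script would then roll out the new image to each Deployment in that order and wait for one rollout to finish before starting the next. This only helps if the active role really stays with the pod in the "active" Deployment, which is something Hadoop decides, not Kubernetes.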
I think this is not the best way to do that. I understand that you use a DaemonSet to make sure Hadoop runs in an HA setup, one per node, but you can get the same result with a Deployment and affinity parameters, more concretely pod anti-affinity, which ensures that only one Hadoop node exists per K8S node.
With that new approach, you can use a replication controller to control the rolling update. Some resources from the documentation:

Kubernetes - How is high availability ensured if I deploy a containerised app?

I am new to the Kubernetes environment. While deploying an application, I figured out how to do autoscaling but did not quite understand how high availability is ensured. If it's not, how can I configure it?
Edit: By HA, I mean how to ensure that pod is scheduled across multiple nodes to ensure HA on pod/service level.
Please guide. Thanks in advance! :)
By HA, I mean how to ensure that pod is scheduled across multiple nodes to ensure HA on pod/service level.
I'm guessing your app is cloud compatible and can be scaled. In this situation there are multiple features you can take advantage of:
DaemonSets: pods in a DaemonSet run on every single node, unless you include/exclude certain nodes.
Deployments: Deployments are the next generation of Replication Controllers. Using Deployments you can easily scale your application as well as ensure the availability of a certain number of pods. Please note that to stay available on node failure, you need to set affinity rules on the pods (pod anti-affinity is what keeps replicas on different nodes). You set them in the pod template. In 1.6, affinity can be specified as a field in PodSpec rather than using annotations.
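As an illustration of the affinity field in the pod template, here is a small sketch of a "soft" pod anti-affinity stanza. The "app" label value and the weight are assumptions. Unlike a required rule, a preferred rule asks the scheduler to spread replicas across nodes but still schedules them when there are fewer nodes than replicas.

```python
import json


def preferred_spread_affinity(app_label: str, weight: int = 100) -> dict:
    """Affinity stanza that prefers placing pods with the same label on different nodes."""
    return {
        "podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,  # 1 to 100; higher means a stronger preference
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": {"app": app_label}},
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }


# This dict would go under spec.template.spec.affinity of the Deployment.
# "my-app" is a placeholder label value.
affinity = preferred_spread_affinity("my-app")
print(json.dumps({"affinity": affinity}, indent=2))
```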

How can I set the max count of pods per node for a replication controller?

Let's imagine a situation.
For example, we have three nodes within the cluster,
and we have a replication controller with spec: replicas: 3.
I would like to have no more than one pod per node,
because if one node has more than one pod when it dies, I'll lose two or more pods.
How can I do that?
The Kubernetes scheduler already prioritizes spreading pods from the same replication controller out across your nodes.
However, if you want to be 100% sure that no two pods will end up on the same node, you can set the container's HostPort field. No two containers with the same HostPort can ever run on the same node.
If you want to get even fancier, you could write your own scheduler plugin :)
In general, though, the idea of Kubernetes is that you shouldn't have to think about your nodes except in special circumstances; you can instead trust the system to keep your applications running.
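For reference, the HostPort approach mentioned above could be sketched like this: a ReplicationController manifest built as a Python dict, where the container claims a port on the node itself. The name, image and port number are placeholders. In the manifest the field is spelled hostPort and sits in the container's ports list.

```python
import json


def rc_with_host_port(name: str, image: str, replicas: int, port: int) -> dict:
    """ReplicationController whose pods each claim the same port on their node."""
    labels = {"app": name}  # hypothetical label
    return {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # A ReplicationController selector is a plain label map,
            # not a matchLabels block as in Deployments.
            "selector": labels,
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Two pods asking for the same hostPort cannot
                            # be placed on the same node.
                            "ports": [{"containerPort": port, "hostPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Placeholder values; 8080 is an arbitrary port.
rc = rc_with_host_port("my-app", "example/my-app:1.0", 3, 8080)
print(json.dumps(rc, indent=2))
```

The trade-off is that the chosen port is taken on every node that runs a pod, and a fourth replica on a three-node cluster would stay unscheduled.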