I have a simple replication controller yaml file which looks like this:
apiVersion: v1
kind: ReplicationController
metadata:
name: nginx
spec:
replicas: 3
selector:
app: nginx
template:
spec:
containers:
- image: library/nginx:3.2
imagePullPolicy: IfNotPresent
name: nginx
ports:
- containerPort: 80
metadata:
labels:
app: nginx
And after running this replication controller, I will get 3 different pods whose names are "nginx-xxx", where "xxx" represents a random string of letters and digits.
What I want is to specify names for the pods created by the replication controller, so that the pods' name can be "nginx-01", "nginx-02", "nginx-03". And further more, for say if pod "nginx-02" is down for some reason, and replication controller will automatically create another nginx pod, and I want this new nginx pod's name to remain as "nginx-02".
I wonder if this is possible? Thanks in advance.
You should be using statefulset instead of replication controllers. Moreover, replication controllers are replaced with ReplicaSets.
StatefulSet Pods have a unique identity that is comprised of an ordinal. For a StatefulSet with N replicas, each Pod in the StatefulSet will be assigned an integer ordinal, from 0 up through N-1, that is unique over the Set. Each Pod in a StatefulSet derives its hostname from the name of the StatefulSet and the ordinal of the Pod.
StatefulSets matches your requirements and hence use it in your deployment.
Try the deployment files below:
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx # has to match .spec.template.metadata.labels
serviceName: "nginx"
replicas: 3 # by default is 1
template:
metadata:
labels:
app: nginx # has to match .spec.selector.matchLabels
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: k8s.gcr.io/nginx-slim:0.8
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumes:
- name: www
emptyDir:
This can be implemented using statefulsets which is out of beta since version 1.9. Quoting the documentation: When using kind: StatefulSet,
Pods have a unique identity that is comprised of an ordinal, a stable network identity, and stable storage. The identity sticks to the Pod, regardless of which node it’s (re)scheduled on.
Each Pod in a StatefulSet derives its hostname from the name of the StatefulSet and the ordinal of the Pod. The pattern for the constructed hostname is $(statefulset name)-$(ordinal).
So in the example above, you would get nginx-0,nginx-1,nginx-2
If you're running stateless workloads, I cannot imagine why you would want to have fixed identities associated with each object if your intention is to run N replicas of a particular pod.
There is no way to do this using a ReplicaSet/ReplicationController. When the controller creates new pods, it will have a generated name suffix after the pod name.
If that is what you really want (fixed identity/ordinal index), the property is satisfied by the StatefulSet resource which is stable since Kubernetes v1.9. However, it also comes with additional guarantees that you probably do not need.
Related
I've experienced a surprising behavior when playing around with Kubernetes and I wanted to know if there is any good explanation behind it.
I've noticed that when two Kubernetes deployments are created with the same labels, and with the same spec.selector, the deployments still function correctly, even though using the same selector "should" cause them to be confused regarding which pods is related to each one.
Example configurations which present this -
example_deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
extra_label: one
spec:
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.14.2
ports:
- containerPort: 80
example_deployment_2.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment-2
labels:
app: nginx
extra_label: two
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.14.2
ports:
- containerPort: 80
I expected the deployments not to work correctly, since they will select pods from each other and assume it is theirs.
The actual result is that the deployments seem to be created correctly, but entering the deployment from k9s returns all of the pods. This is true for both deployments.
Can anyone please shed light regarding why this is happening? Is there additional internal filtering in Kubernetes to to prevent pods which were not really created by the deployment from being associated with it?
I'll note that I've seen this behavior in AWS and have reproduced it in Minikube.
When you create a K8S Deployment, K8S creates a ReplicaSet to manage the pods, then this ReplicaSet creates the pods based on the number of replicas provided or patched by the hpa. Addition to the provided labels and annotations you provide, the ReplicaSet add ownerReferences which contains its name and uid, so even if you have 4 pods with the same labels, each two pods will have a different ownerReferences used by the ReplicaSet to manage them:
apiVersion: v1
kind: Pod
metadata:
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: <replicaset name>
uid: <replicaset uid>
...
Application is deployed on K8s using StatefulSet because of stateful in nature. There is around 250+ pods are running and HPA has been implemented on it too that can scale upto 400 pods.
When new deployment occurs, it takes longer time (~ 10-15m) to update all pods in Rolling Update fashion.
Problem: End user get response from 2 version of pods until all pods are replaced with new revision.
I am googling for an architecture where overall deployment time can be reduced and getting the best possible solutions to use BLUE/GREEN strategy but it has bunch of impact with integrated services like monitoring, logging, telemetry etc because of 2 naming conventions.
Ideally I am looking for a solutions like maxSurge for Deployment in which firstly new pods are created and then traffic are shifted to it at a time but in case of StatefulSet, it won't support maxSurge with RollingUpdate strategy & controller will delete and recreate each Pod in the StatefulSet based on ordinal index from bigger to smaller.
The solution is to do a partitioning rolling update along with a canary deployment.
Let’s suppose we have the statefulset workload defined by the following yaml file:
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
version: "1.20"
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
version: "1.20"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx # Label selector that determines which Pods belong to the StatefulSet
# Must match spec: template: metadata: labels
serviceName: "nginx"
replicas: 3
template:
metadata:
labels:
app: nginx # Pod template's label selector
version: "1.20"
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: nginx:1.20
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Gi
You could patch the statefulset to create a partition, and change the image and version label for the remaining pods: (In this case, since there are only 3 pods, the last one will be the one that will change its image.)
$ kubectl patch statefulset web -p '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":2}}}}'
$ kubectl patch statefulset web --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"nginx:1.21"}]'
$ kubectl patch statefulset web --type='json' -p='[{"op": "replace", "path": "/spec/template/metadata/labels/version", "value":"1.21"}]'
At this point, you have a pod with the new image and version label ready to use, but since the version label is different, the traffic is still going to the other two pods. If you change the version in the yaml file and apply the new configuration, the rollout will be transparent, since there is already a pod ready to migrate the traffic:
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
version: "1.21"
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
version: "1.21"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx # Label selector that determines which Pods belong to the StatefulSet
# Must match spec: template: metadata: labels
serviceName: "nginx"
replicas: 3
template:
metadata:
labels:
app: nginx # Pod template's label selector
version: "1.21"
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: nginx:1.21
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Gi
$ kubectl apply -f file-name.yaml
Once traffic is migrated to the pod containing the new image and version label, you should patch again the statefulset and remove the partition with the command kubectl patch statefulset web -p '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":0}}}}'
Note: You will need to be very careful with the size of the partition, since the remaining pods will handle the whole traffic for some time.
My application has to deployments with a POD.
Can I create a Service to distribute load across these 2 PODs, part of different deployments ?
If so, How ?
Yes it is possible to achieve. Good explanation how to do it can be found on Kubernete documentation. However, keep in mind that both deployments should provide the same functionality, as the output should have the same format.
A Kubernetes Service is an abstraction which defines a logical set of Pods running somewhere in your cluster, that all provide the same functionality. When created, each Service is assigned a unique IP address (also called clusterIP). This address is tied to the lifespan of the Service, and will not change while the Service is alive. Pods can be configured to talk to the Service, and know that communication to the Service will be automatically load-balanced out to some pod that is a member of the Service.
Based on example from Documentation.
1. nginx Deployment. Keep in mind that Deployment can have more than 1 label.
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx
spec:
selector:
matchLabels:
run: nginx
env: dev
replicas: 2
template:
metadata:
labels:
run: nginx
env: dev
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
2. nginx-second Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-second
spec:
selector:
matchLabels:
run: nginx
env: prod
replicas: 2
template:
metadata:
labels:
run: nginx
env: prod
spec:
containers:
- name: nginx-second
image: nginx
ports:
- containerPort: 80
Now to pair Deployments with Services you have to use Selector based on Deployments labels. Below you can find 2 service YAMLs. nginx-service which pointing to both deployments and nginx-service-1 which points only to nginx-second deployment.
## Both Deployments
apiVersion: v1
kind: Service
metadata:
name: nginx-service
spec:
ports:
- port: 80
protocol: TCP
selector:
run: nginx
---
### To nginx-second deployment
apiVersion: v1
kind: Service
metadata:
name: nginx-service-1
spec:
ports:
- port: 80
protocol: TCP
selector:
env: prod
You can verify that service binds to deployment by checking the endpoints.
$ kubectl get pods -l run=nginx -o yaml | grep podIP
podIP: 10.32.0.9
podIP: 10.32.2.10
podIP: 10.32.0.10
podIP: 10.32.2.11
$ kk get ep nginx-service
NAME ENDPOINTS AGE
nginx-service 10.32.0.10:80,10.32.0.9:80,10.32.2.10:80 + 1 more... 3m33s
$ kk get ep nginx-service-1
NAME ENDPOINTS AGE
nginx-service-1 10.32.0.10:80,10.32.2.11:80 3m36s
Yes, you can do that.
Add a common label key pair to both the deployment pod spec and use that common label as selector in service definition
With the above defined service the requests would be load balanced across all the matching pods.
I am using statefulset and I spin up multiple pods but they are not replica of each other. I want to set the hostname of the pods and passing these hostname as a env variable to all the pods so that they can communicate with each other.
I tried to use hostname under pod spec but hostname is never to set to specified hostname. However, it is set to hostname as podname-0.
# Source: testrep/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: orbiting-butterfly-testrep
labels:
app.kubernetes.io/name: testrep
helm.sh/chart: testrep-0.1.0
app.kubernetes.io/instance: orbiting-butterfly
app.kubernetes.io/managed-by: Tiller
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: testrep
app.kubernetes.io/instance: orbiting-butterfly
strategy:
type: Recreate
template:
metadata:
labels:
app.kubernetes.io/name: testrep
app.kubernetes.io/instance: orbiting-butterfly
spec:
nodeSelector:
testol: ad3
hostname: test1
containers:
- name: testrep
image: "test/database:v1"
imagePullPolicy: IfNotPresent
env:
- name: DB_HOSTS
value: test1,test2,test3
As per documentation:
StatefulSet is the workload API object used to manage stateful applications.
Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.
StatefulSets are valuable for applications that require one or more of the following:
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, automated rolling updates.
Statefulset Limitations:
StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service.
Pod Identity
StatefulSet Pods have a unique identity that is comprised of an ordinal, a stable network identity, and stable storage. The identity sticks to the Pod, regardless of which node it’s (re)scheduled on.
For a StatefulSet with N replicas, each Pod in the StatefulSet will be assigned an integer ordinal, from 0 up through N-1, that is unique over the Set.
Stable Network ID
Each Pod in a StatefulSet derives its hostname from the name of the StatefulSet and the ordinal of the Pod. The pattern for the constructed hostname is $(statefulset name)-$(ordinal). The example above will create three Pods named web-0,web-1,web-2. A StatefulSet can use a Headless Service to control the domain of its Pods. The domain managed by this Service takes the form: $(service name).$(namespace).svc.cluster.local, where “cluster.local” is the cluster domain. As each Pod is created, it gets a matching DNS subdomain, taking the form: $(podname).$(governing service domain), where the governing service is defined by the serviceName field on the StatefulSet.
Note:
You are responsible for creating the Headless Service responsible for the network identity of the pods.
So as described by vjdhama Please create your Statefulset with Headless Service.
You can find this example in the docs:
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx # has to match .spec.template.metadata.labels
serviceName: "nginx" # has to match headless Service metadata.name
replicas: 3 # by default is 1
template:
metadata:
labels:
app: nginx # has to match .spec.selector.matchLabels
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: k8s.gcr.io/nginx-slim:0.8
ports:
- containerPort: 80
In this scenario Pod DNS and Pod Hostnames should be respectively:
Pod DNS
web-{0..N-1}.nginx.default.svc.cluster.local
Pod Hostname
web-{0..N-1}
NAME READY STATUS RESTARTS AGE IP
pod/web-0 1/1 Running 0 5m 192.168.148.78
pod/web-1 1/1 Running 0 4m53s 192.168.148.79
pod/web-2 1/1 Running 0 4m51s 192.168.148.80
From the Pod perspective:
root#web-2:# nslookup nginx
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: nginx.default.svc.cluster.local
Address: 192.168.148.80
Name: nginx.default.svc.cluster.local
Address: 192.168.148.78
Name: nginx.default.svc.cluster.local
Address: 192.168.148.79
So you can call each of the respective pods using the Pod DNS, like:
web-0.nginx.default.svc.cluster.local
Update:
Exposing single pod from StatefulSet.
Pod Name Label
When the StatefulSet controller creates a Pod, it adds a label, statefulset.kubernetes.io/pod-name, that is set to the name of the Pod. This label allows you to attach a Service to a specific Pod in the StatefulSet.
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-name-label
You can find here tricky way.
Using described above advantages of Statefulset:
The pattern for the constructed hostname is $(statefulset name)-$(ordinal). The example above will create three Pods named web-0,web-1,web-2.
So as an example:
apiVersion: v1
kind: Service
metadata:
name: app-0
spec:
type: LoadBalancer
selector:
statefulset.kubernetes.io/pod-name: web-0
ports:
- protocol: TCP
port: 80
targetPort: 80
Will do this for you.
Hope this help.
Kubernetes has service discovery baked in, so you don't have to do that. You can setup a headless service for your StatefulSet to let other applications talk to it.
The pods that are created with StatefulSets are ordered and setup sequentially. Hence the postfix integer value. You can read more on it here.
I'm new to Kubernetes and recently I am using it to deploy hadoop. I want to do a thing that set a specific hostname to pod which is created using statefulSets with hostNetwork = true.
Here is my yaml config file.
apiVersion: v1
kind: Service
metadata:
name: test-bbox
labels:
app: test-bbox
spec:
clusterIP: None
selector:
app: test-bbox
ports:
- name: foo
port: 1234
targetPort: 1234
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: test-bbox
spec:
serviceName: "test-bbox"
replicas: 1
selector:
matchLabels:
app: test-bbox
template:
metadata:
labels:
app: test-bbox
spec:
hostNetwork: true
hostname: test-hostname
containers:
- image: busybox
name: busybox
env:
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
command:
- sleep
- "7200"
As the yaml says, the attribute hostnetwork is set to true. Then the pod test-bbox-0's hostname is the hostname of the Node where it is created.
If I set hostnetwork to false, the hostname is auto-generated by statefulSets as a format such as test-bbox-0.test-bbox.default.svc.cluster.local, which is just I need.
But in my case I need to set hostnetwork to true and at the same time to customize the hostname to the format mentioned above rather than the Node's hostname.
So the question is, is there any way to customize hostname for Pod ?
Kubernetes version used : 1.9
It is not possible to set the hostname for a Pod that is using hostNetwork. If you try to change the hostname in such a Pod you'll see that you are changing the hostname of the node too; this is because they are sharing the UTS namespace, not only the networking one.
Pods managed by a StatefulSet are a special case and their hostname is set my StatefulSet and it can't be configured directly. The hostname can be influenced by the name of StatefulSet itself while domainname by naming the Service appropriately:
Each Pod in a StatefulSet derives its hostname from the name of the StatefulSet and the ordinal of the Pod. The pattern for the constructed hostname is $(statefulset name)-$(ordinal). [...] A StatefulSet can use a Headless Service to control the domain of its Pods. The domain managed by this Service takes the form: $(service name).$(namespace).svc.cluster.local, where cluster.local is the cluster domain. As each Pod is created, it gets a matching DNS subdomain, taking the form: $(podname).$(governing service domain), where the governing service is defined by the serviceName field on the StatefulSet.