Kubernetes Statefulsets: Restart all pods concurrently (instead of in sequence) - kubernetes

I have a use-case for concurrent restart of all pods in a statefulset.
Does kubernetes statefulset support concurrent restart of all pods?
According to the StatefulSet documentation, this can be accomplished by setting the pod management policy to Parallel, as in this example:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-db
spec:
  podManagementPolicy: Parallel
  replicas: 3
However this does not seem to work in practice, as demonstrated on this statefulset running on EKS:
Apply this:
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: producer
  namespace: ragnarok
spec:
  selector:
    matchLabels:
      app: producer
  replicas: 10
  podManagementPolicy: "Parallel"
  serviceName: producer-service
  template:
    metadata:
      labels:
        app: producer
    spec:
      containers:
      - name: producer
        image: archbungle/load-tester:pulsar-0.0.49
        imagePullPolicy: IfNotPresent
The rollout restart still happens in sequence, as if the pod management policy were ignored:
(base) welcome@Traianos-MacBook-Pro eks-deploy % kubectl get pods -n ragnarok | egrep producer
producer-0 1/1 Running 0 3m58s
producer-1 1/1 Running 0 3m56s
producer-2 1/1 Running 0 3m53s
producer-3 1/1 Running 0 3m47s
producer-4 1/1 Running 0 3m45s
producer-5 1/1 Running 0 3m43s
producer-6 1/1 Running 1 3m34s
producer-7 0/1 ContainerCreating 0 1s
producer-8 1/1 Running 0 16s
producer-9 1/1 Running 0 19s
(base) welcome@Traianos-MacBook-Pro eks-deploy % kubectl get pods -n ragnarok | egrep producer
producer-0 1/1 Running 0 4m2s
producer-1 1/1 Running 0 4m
producer-2 1/1 Running 0 3m57s
producer-3 1/1 Running 0 3m51s
producer-4 1/1 Running 0 3m49s
producer-5 1/1 Running 0 3m47s
producer-6 0/1 Terminating 1 3m38s
producer-7 1/1 Running 0 5s
producer-8 1/1 Running 0 20s
producer-9 1/1 Running 0 23s
(base) welcome@Traianos-MacBook-Pro eks-deploy % kubectl get pods -n ragnarok | egrep producer
producer-0 1/1 Running 0 4m8s
producer-1 1/1 Running 0 4m6s
producer-2 1/1 Running 0 4m3s
producer-3 1/1 Running 0 3m57s
producer-4 1/1 Running 0 3m55s
producer-5 0/1 Terminating 0 3m53s
producer-6 1/1 Running 0 4s
producer-7 1/1 Running 0 11s
producer-8 1/1 Running 0 26s
producer-9 1/1 Running 0 29s

As the documentation points out, Parallel pod management takes effect only during scaling operations: "This option only affects the behavior for scaling operations. Updates are not affected."
Maybe you can try something like
kubectl scale statefulset producer --replicas=0 -n ragnarok
and
kubectl scale statefulset producer --replicas=10 -n ragnarok
According to documentation, all pods should be deleted and created together by scaling them with the Parallel policy.
Reference : https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#parallel-pod-management
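The scale-down/scale-up workaround above can be scripted. The following is a minimal Python sketch, not a definitive tool: the helper name parallel_restart_commands is made up for illustration, and it only builds the two kubectl scale invocations from the answer as argument lists so they can be inspected before anything is run. Actually executing them assumes kubectl is on the PATH and pointed at the right cluster, and that a brief window with zero replicas is acceptable.

```python
from typing import List


def parallel_restart_commands(name: str, namespace: str, replicas: int) -> List[List[str]]:
    """Build the two kubectl scale commands: down to 0, then back up.

    Hypothetical helper: it does not talk to a cluster, it only assembles
    the argument lists for the two commands shown in the answer.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    base = ["kubectl", "scale", "statefulset", name, "-n", namespace]
    return [
        base + ["--replicas=0"],
        base + [f"--replicas={replicas}"],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them; to run them for real,
    # pass each list to subprocess.run(cmd, check=True).
    for cmd in parallel_restart_commands("producer", "ragnarok", 10):
        print(" ".join(cmd))
```

Building the commands as lists rather than shell strings avoids quoting problems if a name or namespace ever contains unusual characters.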

Related

Why does the official helm chart for fluent-bit start 20 pods

I followed the official helm chart for fluent-bit and ended up with 20 pods, in a namespace. How do I configure it to use 1 pod?
The replicaCount attribute in values.yaml is set to 1.
https://github.com/fluent/helm-charts/tree/main/charts/fluent-bit
helm upgrade -i fluent-bit helm/efk/fluent-bit --namespace some-ns
kubectl get pods -n some-ns
NAME READY STATUS RESTARTS AGE
fluent-bit-22dx4 1/1 Running 0 15h
fluent-bit-2x6rn 1/1 Running 0 15h
fluent-bit-42rfd 1/1 Running 0 15h
fluent-bit-54drx 1/1 Running 0 15h
fluent-bit-8f8pl 1/1 Running 0 15h
fluent-bit-8rtp9 1/1 Running 0 15h
fluent-bit-8wfcc 1/1 Running 0 15h
fluent-bit-bffh8 1/1 Running 0 15h
fluent-bit-lgl9k 1/1 Running 0 15h
fluent-bit-lqdrs 1/1 Running 0 15h
fluent-bit-mdvlc 1/1 Running 0 15h
fluent-bit-qgvww 1/1 Running 0 15h
fluent-bit-qqwh6 1/1 Running 0 15h
fluent-bit-qxbjt 1/1 Running 0 15h
fluent-bit-rqr8g 1/1 Running 0 15h
fluent-bit-t8vbv 1/1 Running 0 15h
fluent-bit-vkcfl 1/1 Running 0 15h
fluent-bit-wnwtq 1/1 Running 0 15h
fluent-bit-xqwxk 1/1 Running 0 15h
fluent-bit-xxj8q 1/1 Running 0 15h
Note that the chart's templates support two workload kinds, DaemonSet and Deployment.
This is controlled by the kind field of the values file.
The values currently specify DaemonSet, so, leaving affinity aside, one replica is started on each node.
If you want a single pod, set kind to Deployment, set replicaCount to 1, and redeploy.
values.yaml
# Default values for fluent-bit.
# kind -- DaemonSet or Deployment
kind: Deployment
# replicaCount -- Only applicable if kind=Deployment
replicaCount: 1
https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
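To make the difference concrete, here is a small Python sketch of how many pods each workload kind aims for. It is a simplified model, not chart code: the function expected_pods and its parameters are hypothetical, and it assumes every node counted in schedulable_nodes is eligible to run the DaemonSet pod (no taints, node selectors, or affinity rules excluding it).

```python
def expected_pods(kind: str, replica_count: int, schedulable_nodes: int) -> int:
    """Return the pod count a workload of the given kind would aim for.

    Simplified model: a DaemonSet runs one pod per eligible node and
    ignores replica_count; a Deployment runs exactly replica_count pods.
    """
    if kind == "DaemonSet":
        return schedulable_nodes
    if kind == "Deployment":
        return replica_count
    raise ValueError(f"unsupported kind: {kind!r}")


if __name__ == "__main__":
    # On a 20-node cluster, replicaCount=1 only matters for a Deployment.
    print(expected_pods("DaemonSet", 1, 20))   # 20
    print(expected_pods("Deployment", 1, 20))  # 1
```

Under this model, the 20 pods in the question would correspond to a 20-node cluster running the chart with kind set to DaemonSet.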

Prometheus operator: alertmanager-main-0 goes Pending and disappears

What happened?
kubernetes version: 1.12
prometheus operator: release-0.1
I followed the README:
$ kubectl create -f manifests/
# It can take a few seconds for the above 'create manifests' command to fully create the following resources, so verify the resources are ready before proceeding.
$ until kubectl get customresourcedefinitions servicemonitors.monitoring.coreos.com ; do date; sleep 1; echo ""; done
$ until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
$ kubectl apply -f manifests/ # This command sometimes may need to be done twice (to workaround a race condition).
Then I ran the following command, and the output looked like this:
[root@VM_8_3_centos /data/hansenwu/kube-prometheus/manifests]# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 66s
alertmanager-main-1 1/2 Running 0 47s
grafana-54f84fdf45-kt2j9 1/1 Running 0 72s
kube-state-metrics-65b8dbf498-h7d8g 4/4 Running 0 57s
node-exporter-7mpjw 2/2 Running 0 72s
node-exporter-crfgv 2/2 Running 0 72s
node-exporter-l7s9g 2/2 Running 0 72s
node-exporter-lqpns 2/2 Running 0 72s
prometheus-adapter-5b6f856dbc-ndfwl 1/1 Running 0 72s
prometheus-k8s-0 3/3 Running 1 59s
prometheus-k8s-1 3/3 Running 1 59s
prometheus-operator-5c64c8969-lqvkb 1/1 Running 0 72s
[root@VM_8_3_centos /data/hansenwu/kube-prometheus/manifests]# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 0/2 Pending 0 0s
grafana-54f84fdf45-kt2j9 1/1 Running 0 75s
kube-state-metrics-65b8dbf498-h7d8g 4/4 Running 0 60s
node-exporter-7mpjw 2/2 Running 0 75s
node-exporter-crfgv 2/2 Running 0 75s
node-exporter-l7s9g 2/2 Running 0 75s
node-exporter-lqpns 2/2 Running 0 75s
prometheus-adapter-5b6f856dbc-ndfwl 1/1 Running 0 75s
prometheus-k8s-0 3/3 Running 1 62s
prometheus-k8s-1 3/3 Running 1 62s
prometheus-operator-5c64c8969-lqvkb 1/1 Running 0 75s
I don't know why the pod alertmanager-main-0 goes Pending, disappears, and then restarts.
The events show:
72s Warning FailedCreate StatefulSet create Pod alertmanager-main-0 in StatefulSet alertmanager-main failed error: The POST operation against Pod could not be completed at this time, please try again.
72s Warning FailedCreate StatefulSet create Pod alertmanager-main-0 in StatefulSet alertmanager-main failed error: The POST operation against Pod could not be completed at this time, please try again.
72s Warning^Z FailedCreate StatefulSet
[10]+ Stopped kubectl get events -n monitoring
Most likely Alertmanager does not get enough time to start correctly.
Have a look at this answer: https://github.com/coreos/prometheus-operator/issues/965#issuecomment-460223268
You can set the paused field to true on the Alertmanager resource, and then modify the StatefulSet to check whether extending the liveness/readiness probe timings solves your issue.

Failed to open topo server on vitess with etcd

I'm running a simple example with Helm. Take a look below at values.yaml file:
cat << EOF | helm install helm/vitess -n vitess -f -
topology:
  cells:
    - name: 'zone1'
      keyspaces:
        - name: 'vitess'
          shards:
            - name: '0'
              tablets:
                - type: 'replica'
                  vttablet:
                    replicas: 1
      mysqlProtocol:
        enabled: true
        authType: secret
        username: vitess
        passwordSecret: vitess-db-password
      etcd:
        replicas: 3
      vtctld:
        replicas: 1
      vtgate:
        replicas: 3
vttablet:
  dataVolumeClaimSpec:
    storageClassName: nfs-slow
EOF
Take a look at the output of current pods running below:
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-fb8b8dccf-8f5kt 1/1 Running 0 32m
kube-system coredns-fb8b8dccf-qbd6c 1/1 Running 0 32m
kube-system etcd-master1 1/1 Running 0 32m
kube-system kube-apiserver-master1 1/1 Running 0 31m
kube-system kube-controller-manager-master1 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-bkg9z 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-q8vh4 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-vqmnz 1/1 Running 0 32m
kube-system kube-proxy-bd8mf 1/1 Running 0 32m
kube-system kube-proxy-nlc2b 1/1 Running 0 32m
kube-system kube-proxy-x7cd5 1/1 Running 0 32m
kube-system kube-scheduler-master1 1/1 Running 0 32m
kube-system tiller-deploy-8458f6c667-cx2mv 1/1 Running 0 27m
vitess etcd-global-6pwvnv29th 0/1 Init:0/1 0 16m
vitess etcd-operator-84db9bc774-j4wml 1/1 Running 0 30m
vitess etcd-zone1-zwgvd7spzc 0/1 Init:0/1 0 16m
vitess vtctld-86cd78b6f5-zgfqg 0/1 CrashLoopBackOff 7 16m
vitess vtgate-zone1-58744956c4-x8ms2 0/1 CrashLoopBackOff 7 16m
vitess zone1-vitess-0-init-shard-master-mbbph 1/1 Running 0 16m
vitess zone1-vitess-0-replica-0 0/6 Init:CrashLoopBackOff 7 16m
Running logs I see this error:
$ kubectl logs -n vitess vtctld-86cd78b6f5-zgfqg
++ cat
+ eval exec /vt/bin/vtctld '-cell="zone1"' '-web_dir="/vt/web/vtctld"' '-web_dir2="/vt/web/vtctld2/app"' -workflow_manager_init -workflow_manager_use_election -logtostderr=true -stderrthreshold=0 -port=15000 -grpc_port=15999 '-service_map="grpc-vtctl"' '-topo_implementation="etcd2"' '-topo_global_server_address="etcd-global-client.vitess:2379"' -topo_global_root=/vitess/global
++ exec /vt/bin/vtctld -cell=zone1 -web_dir=/vt/web/vtctld -web_dir2=/vt/web/vtctld2/app -workflow_manager_init -workflow_manager_use_election -logtostderr=true -stderrthreshold=0 -port=15000 -grpc_port=15999 -service_map=grpc-vtctl -topo_implementation=etcd2 -topo_global_server_address=etcd-global-client.vitess:2379 -topo_global_root=/vitess/global
ERROR: logging before flag.Parse: E0422 02:35:34.020928 1 syslogger.go:122] can't connect to syslog
F0422 02:35:39.025400 1 server.go:221] Failed to open topo server (etcd2,etcd-global-client.vitess:2379,/vitess/global): grpc: timed out when dialing
I'm running under Vagrant with 1 master and 2 nodes. I suspect that it is an issue with eth1.
The storage is configured to use NFS.
$ kubectl logs etcd-operator-84db9bc774-j4wml
time="2019-04-22T17:26:51Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-zone1-zwgvd7spzc])" cluster-name=etcd-zone1 cluster-namespace=vitess pkg=cluster
time="2019-04-22T17:26:51Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-zone1-zwgvd7spzc])" cluster-name=etcd-global cluster-namespace=vitess pkg=cluster
It appears that etcd is not fully initializing. Note that neither the pod for the global lockserver (etcd-global-6pwvnv29th) nor the local one for cell zone1 (etcd-zone1-zwgvd7spzc) is ready; both are stuck in Init:0/1.

Kubernetes pod not starting

I have a kubernetes cluster with 5 nodes. When I add a simple nginx pod it will be scheduled to one of the nodes but it will not start up. It will not even pull the image.
This is the nginx.yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
When I describe the pod there is one event: Successfully assigned busybox to up02. When I log in to up02 and check whether any images were pulled, I see that the image was not pulled, so I pulled it manually (I thought maybe it needs some kick start ;) )
The pod always stays in the ContainerCreating state. It's not only this pod; the problem occurs with any pod I try to add.
There are some pods running on the machine which are necessary for Kubernetes to operate:
up@up01:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default busybox 0/1 ContainerCreating 0 11m
default nginx 0/1 ContainerCreating 0 22m
kube-system dummy-2088944543-n1cd5 1/1 Running 0 5d
kube-system etcd-up01 1/1 Running 0 5d
kube-system kube-apiserver-up01 1/1 Running 0 5d
kube-system kube-controller-manager-up01 1/1 Running 0 5d
kube-system kube-discovery-1769846148-xfpls 1/1 Running 0 5d
kube-system kube-dns-2924299975-5rzz8 4/4 Running 0 5d
kube-system kube-proxy-17bpl 1/1 Running 2 3d
kube-system kube-proxy-3pk63 1/1 Running 0 3d
kube-system kube-proxy-h3wrj 1/1 Running 0 5d
kube-system kube-proxy-wzqv4 1/1 Running 0 3d
kube-system kube-proxy-z3xxx 1/1 Running 0 3d
kube-system kube-scheduler-up01 1/1 Running 0 5d
kube-system kubernetes-dashboard-3203831700-3xfbd 1/1 Running 0 5d
kube-system weave-net-6c0nr 2/2 Running 0 3d
kube-system weave-net-dchhf 2/2 Running 0 5d
kube-system weave-net-hshvg 2/2 Running 4 3d
kube-system weave-net-n684c 2/2 Running 1 3d
kube-system weave-net-r5319 2/2 Running 0 3d
You can do
kubectl describe pods <pod>
to get more info on what's happening.
Can you recreate the nginx pod again in namespace kube-system?
kubectl create --namespace kube-system -f nginx.yaml
this should fix your problem.
Second, do you have a proxy in your environment? Check that as well.
Make sure that your namespace and service account information is correct. If you've configured your services or deployments to use a namespace or service account, that namespace needs to exist.
If you configured it to use a non-default service account, then that has to exist as well, and the service account should be created after the namespace.
You shouldn't necessarily be using the kube-system namespace. Namespaces exist so that there can be more than one of them and so that you can control the flow of traffic inside a cluster.
You should also consider setting permissions for your namespace. Read about this here:
https://kubernetes.io/docs/reference/access-authn-authz/rbac/#service-account-permissions

Kubernetes: how to scale my pods

I'm new to Kubernetes and am trying to scale my pods. First I started 3 pods:
./cluster/kubectl.sh run my-nginx --image=nginx --replicas=3 --port=80
3 pods were started. First I tried to scale up/down by using a replication controller, but none existed. It seems to be a ReplicaSet now.
./cluster/kubectl.sh get rs
NAME DESIRED CURRENT AGE
my-nginx-2494149703 3 3 9h
I tried to change the amount of replicas described in my replicaset:
./cluster/kubectl.sh scale --replicas=5 rs/my-nginx-2494149703
replicaset "my-nginx-2494149703" scaled
But I still see my 3 original pods
./cluster/kubectl.sh get pods
NAME READY STATUS RESTARTS AGE
my-nginx-2494149703-04xrd 1/1 Running 0 9h
my-nginx-2494149703-h3krk 1/1 Running 0 9h
my-nginx-2494149703-hnayu 1/1 Running 0 9h
I would expect to see 5 pods.
./cluster/kubectl.sh describe rs/my-nginx-2494149703
Name: my-nginx-2494149703
Namespace: default
Image(s): nginx
Selector: pod-template-hash=2494149703,run=my-nginx
Labels: pod-template-hash=2494149703
run=my-nginx
Replicas: 3 current / 3 desired
Pods Status: 3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Why isn't it scaling up? Do I also have to change something in the deployment?
I see something like this when I describe my rs after scaling up:
(Here I try to scale from one running pod to 3 running pods.) But only one pod remains running. The other 2 are started and killed immediately:
34s 34s 1 {replicaset-controller } Normal SuccessfulCreate Created pod: my-nginx-1908062973-lylsz
34s 34s 1 {replicaset-controller } Normal SuccessfulCreate Created pod: my-nginx-1908062973-5rv8u
34s 34s 1 {replicaset-controller } Normal SuccessfulDelete Deleted pod: my-nginx-1908062973-lylsz
34s 34s 1 {replicaset-controller } Normal SuccessfulDelete Deleted pod: my-nginx-1908062973-5rv8u
This is working for me
kubectl scale --replicas=<expected_replica_num> deployment <deployment_label_name> -n <namespace>
Example
# kubectl scale --replicas=3 deployment xyz -n my_namespace
TL;DR: You need to scale your deployment instead of the replica set directly.
If you try to scale the replica set, then it will (for a very short time) have a new count of 5. But the deployment controller will see that the current count of the replica set is 5 and since it knows that it is supposed to be 3, it will reset it back to 3. By manually modifying the replica set that was created for you, you are fighting with the system controller (which is untiring and will pretty much always outlast you).
In kubectl run my-nginx --image=nginx --replicas=3 --port=80, kubectl run creates a deployment or job to manage the created container(s).
Deployment --> ReplicaSet --> Pod: this is how Kubernetes works.
If you change a lower-level object, its higher-level object will undo your change. You have to change the top-level object.
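The ownership chain described above can be illustrated with a toy model. This Python sketch is not how the real controllers are implemented; the classes and the reconcile method are hypothetical stand-ins that only capture the idea that the Deployment's desired count is the source of truth and is pushed down to the ReplicaSet it owns.

```python
class ReplicaSet:
    """Toy stand-in for a ReplicaSet: just holds a replica count."""

    def __init__(self, replicas: int) -> None:
        self.replicas = replicas


class Deployment:
    """Toy stand-in for a Deployment that owns one ReplicaSet."""

    def __init__(self, replicas: int) -> None:
        self.replicas = replicas                 # desired count (source of truth)
        self.replica_set = ReplicaSet(replicas)  # owned lower-level object

    def reconcile(self) -> None:
        """Force the owned ReplicaSet back to the Deployment's desired count."""
        if self.replica_set.replicas != self.replicas:
            self.replica_set.replicas = self.replicas


if __name__ == "__main__":
    deploy = Deployment(replicas=3)

    # Scaling the ReplicaSet directly is undone on the next reconcile pass.
    deploy.replica_set.replicas = 5
    deploy.reconcile()
    print(deploy.replica_set.replicas)  # 3

    # Scaling the Deployment sticks and propagates downward.
    deploy.replicas = 5
    deploy.reconcile()
    print(deploy.replica_set.replicas)  # 5
```

This matches the events in the question, where pods created by the scaled ReplicaSet were deleted again almost immediately.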
Scale it down to zero and then back up to the number of pods you require (3, I guess):
kubectl scale deployment <deployment-name> --replicas=0 -n <namespace>
kubectl scale deployment <deployment-name> --replicas=3 -n <namespace>
Not sure if this is the best way as I'm starting out with kubernetes, but I did this by updating my yaml file
# app.yaml
apiVersion: apps/v1
...
spec:
  replicas: <new value>
and running $ kubectl scale -f app.yaml --replicas=<new value>
you can verify your new number of replicas by running $ kubectl get pods
In my case I was also interested in scaling back my VMs, on google cloud. I did this with $ gcloud container clusters resize appName --size=1 --zone "my-zone"
The example below shows how to scale your "pods/resource/deployments" up and down.
k8smaster@k8smaster:~/debashish$ more createdeb_deployment1.yaml
---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: debdeploy-webserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: debdeploy1webserver
  template:
    metadata:
      labels:
        app: debdeploy1webserver
    spec:
      containers:
        -
          image: "docker.io/debu3645/debapachewebserver:v1"
          name: deb-deploy1-container
          ports:
            -
              containerPort: 6060
Create the deployment:
kubectl -n debns1 create -f createdeb_deployment1.yaml
k8smaster@k8smaster:~/debashish$ kubectl scale --replicas=5 deployment/debdeploy-webserver -n debns1
(Scale up to 5 replicas)
k8smaster@k8smaster:~/debashish$ kubectl get pods -n debns1
NAME READY STATUS RESTARTS AGE
debdeploy-webserver-7cf4fb74c5-8wvzx 1/1 Running 0 16s
debdeploy-webserver-7cf4fb74c5-jrf6v 1/1 Running 0 16s
debdeploy-webserver-7cf4fb74c5-m9fpw 1/1 Running 0 16s
debdeploy-webserver-7cf4fb74c5-q9n7r 1/1 Running 0 16s
debdeploy-webserver-7cf4fb74c5-ttw6p 1/1 Running 1 19h
resourcepod-deb1 1/1 Running 5 6d18h
k8smaster@k8smaster:~/debashish$ kubectl get ep -n debns1
NAME ENDPOINTS AGE
frontend-svc-deb 192.168.1.10:80,192.168.1.11:80,192.168.1.12:80 + 2 more... 18h
frontend-svc1-deb 192.168.1.8:80 14d
frontend-svc2-deb 192.168.1.8:80 5d19h
k8smaster@k8smaster:~/debashish$ kubectl scale --replicas=2 deployment/debdeploy-webserver -n debns1
(Scale down from 5 to 2)
deployment.extensions/debdeploy-webserver scaled
k8smaster@k8smaster:~/debashish$ kubectl get pods -n debns1
NAME READY STATUS RESTARTS AGE
debdeploy-webserver-7cf4fb74c5-8wvzx 1/1 Terminating 0 35m
debdeploy-webserver-7cf4fb74c5-jrf6v 1/1 Terminating 0 35m
debdeploy-webserver-7cf4fb74c5-m9fpw 1/1 Terminating 0 35m
debdeploy-webserver-7cf4fb74c5-q9n7r 1/1 Running 0 35m
debdeploy-webserver-7cf4fb74c5-ttw6p 1/1 Running 1 19h
resourcepod-deb1 1/1 Running 5 6d19h
k8smaster@k8smaster:~/debashish$ kubectl get pods -n debns1
NAME READY STATUS RESTARTS AGE
debdeploy-webserver-7cf4fb74c5-q9n7r 1/1 Running 0 37m
debdeploy-webserver-7cf4fb74c5-ttw6p 1/1 Running 1 19h
resourcepod-deb1 1/1 Running 5 6d19h
k8smaster@k8smaster:~/debashish$ kubectl scale --current-replicas=4 --replicas=2 deployment/debdeploy-webserver -n debns1
(Checks the current number of replicas: if it is 4, scale down to 2; otherwise do nothing)
error: Expected replicas to be 4, was 2
k8smaster@k8smaster:~/debashish$ kubectl scale --current-replicas=3 --replicas=10 deployment/debdeploy-webserver -n debns1
error: Expected replicas to be 3, was 2
k8smaster@k8smaster:~/debashish$ kubectl scale --current-replicas=2 --replicas=10 deployment/debdeploy-webserver -n debns1
deployment.extensions/debdeploy-webserver scaled
k8smaster@k8smaster:~/debashish$ kubectl get pods -n debns1
NAME READY STATUS RESTARTS AGE
debdeploy-webserver-7cf4fb74c5-46bxg 1/1 Running 0 6s
debdeploy-webserver-7cf4fb74c5-d6qsx 0/1 ContainerCreating 0 6s
debdeploy-webserver-7cf4fb74c5-fdq6v 1/1 Running 0 6s
debdeploy-webserver-7cf4fb74c5-gd87t 1/1 Running 0 6s
debdeploy-webserver-7cf4fb74c5-kqdbj 0/1 ContainerCreating 0 6s
debdeploy-webserver-7cf4fb74c5-q9n7r 1/1 Running 0 47m
debdeploy-webserver-7cf4fb74c5-qjvm6 1/1 Running 0 6s
debdeploy-webserver-7cf4fb74c5-skxq4 0/1 ContainerCreating 0 6s
debdeploy-webserver-7cf4fb74c5-ttw6p 1/1 Running 1 19h
debdeploy-webserver-7cf4fb74c5-wlc7q 0/1 ContainerCreating 0 6s
resourcepod-deb1 1/1 Running 5 6d19h
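The --current-replicas precondition seen in the transcript above can be modeled in a few lines. This is a Python sketch of the observed behavior only, not kubectl's implementation: the function scale, its parameters, and the PreconditionError class are hypothetical names, and the error text is copied from the output shown above.

```python
from typing import Optional


class PreconditionError(Exception):
    """Raised when the live replica count differs from the expected one."""


def scale(live: int, target: int, expected_current: Optional[int] = None) -> int:
    """Return the new replica count, honoring an optional precondition.

    If expected_current is given and does not match the live count,
    nothing is changed and an error is raised instead.
    """
    if expected_current is not None and expected_current != live:
        raise PreconditionError(
            f"Expected replicas to be {expected_current}, was {live}"
        )
    return target


if __name__ == "__main__":
    live = 2  # the deployment currently has 2 replicas, as in the transcript
    for expected, target in [(4, 2), (3, 10), (2, 10)]:
        try:
            live = scale(live, target, expected_current=expected)
            print(f"scaled to {live}")
        except PreconditionError as err:
            print(f"error: {err}")
```

The precondition is useful in scripts that should only scale when the workload is in a known state, since a mismatch leaves the replica count untouched.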
for deployment
kubectl scale deployment <deployment-name> --replicas=3 -n <namespace>
for statefulset
kubectl scale statefulsets <stateful-set-name> --replicas=3 -n <namespace>