All containers showing ready, but pod is not ready - kubernetes

When I describe my pod, I can see the following conditions:
$ kubectl describe pod blah-84c6554d77-6wn42
...
Conditions:
Type Status
Initialized True
Ready False
ContainersReady True
PodScheduled True
...
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
blah-84c6554d77-6wn42 1/1 Running 46 23h 10.247.76.179 xxx-x-x-xx-123.nodes.xxx.d.ocp.xxx.xxx.br <none> <none>
...
I wonder how this can be possible: all the containers in the pod show ready=true, but the pod shows ready=false.
Has anyone experienced this before? Do you know what else could be causing the pod to not be ready?
I'm running Kubernetes version 1.15.4.
I can see in the code that
// The status of "Ready" condition is "True", if all containers in a pod are ready
// AND all matching conditions specified in the ReadinessGates have status equal to "True".
but I haven't defined any custom readiness gates.
I wonder how I can check the reason for the check failure.
I couldn't find this in the docs for pod-readiness-gate.
here is the full pod yaml
$ kubectl get pod blah-84c6554d77-6wn42 -o yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: "2019-10-17T04:05:30Z"
generateName: blah-84c6554d77-
labels:
app: blah
commit: cb511424a5ec43f8dghdfdwervxz8a19edbb
pod-template-hash: 84c6554d77
name: blah-84c6554d77-6wn42
namespace: default
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: blah-84c6554d77
uid: 514da64b-c242-11e9-9c5b-0050123234523
resourceVersion: "19780031"
selfLink: /api/v1/namespaces/blah/pods/blah-84c6554d77-6wn42
uid: 62be74a1-541a-4fdf-987d-39c97644b0c8
spec:
containers:
- env:
- name: URL
valueFrom:
configMapKeyRef:
key: url
name: external-mk9249b92m
image: myregistry/blah:3.0.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 5
httpGet:
path: /healthcheck
port: 8080
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 3
name: blah
ports:
- containerPort: 8080
name: http
protocol: TCP
readinessProbe:
failureThreshold: 10
httpGet:
path: /healthcheck
port: 8080
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 3
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-4tp6z
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: xxxxxxxxxxx
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: default-token-4tp6z
secret:
defaultMode: 420
secretName: default-token-4tp6z
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2019-10-17T04:14:22Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2019-10-17T09:47:15Z"
status: "False"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2019-10-17T07:54:55Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2019-10-17T04:14:18Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://33820f432a5a372d028c18f1b0376e2526ef65871f4f5c021e2cbea5dcdbe3ea
image: myregistry/blah:3.0.0
imageID: docker-pullable://myregistry/blah@sha256:5c0634f03627bsdasdasdasdasdasdc91ce2147993a0181f53a
lastState:
terminated:
containerID: docker://5c8d284f79aaeaasdasdaqweqweqrwt9811e34da48f355081
exitCode: 1
finishedAt: "2019-10-17T07:49:36Z"
reason: Error
startedAt: "2019-10-17T07:49:35Z"
name: blah
ready: true
restartCount: 46
state:
running:
startedAt: "2019-10-17T07:54:39Z"
hostIP: 10.247.64.115
phase: Running
podIP: 10.247.76.179
qosClass: BestEffort
startTime: "2019-10-17T04:14:22Z"
Thanks

You've got a readinessProbe configured:
readinessProbe:
  failureThreshold: 10
  httpGet:
    path: /healthcheck
    port: 8080
    scheme: HTTP
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-readiness-probes
Readiness probes are configured similarly to liveness probes. The only difference is that you use the readinessProbe field instead of the livenessProbe field.
tl;dr: check whether /healthcheck on port 8080 is returning a successful HTTP status code and, if the probe is not used or not necessary, drop it.
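A couple of quick checks, as a sketch (assuming the image ships wget; the pod name is the one from the question):
# Hit the readiness endpoint from inside the pod
kubectl exec blah-84c6554d77-6wn42 -- wget -qO- http://localhost:8080/healthcheck
# Print only the Ready condition, including any reason/message the kubelet recorded
kubectl get pod blah-84c6554d77-6wn42 -o jsonpath='{.status.conditions[?(@.type=="Ready")]}'
# Probe failures also show up as events on the pod
kubectl get events --field-selector involvedObject.name=blah-84c6554d77-6wn42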

Related

0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports when starting the Prometheus exporter

After I installed Prometheus using Helm in my Kubernetes cluster, the pod shows an error like this:
0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
This is the pod YAML:
apiVersion: v1
kind: Pod
metadata:
name: kube-prometheus-1660560589-node-exporter-n7rzg
generateName: kube-prometheus-1660560589-node-exporter-
namespace: reddwarf-monitor
uid: 73986565-ccd8-421c-bcbb-33879437c4f3
resourceVersion: '71494023'
creationTimestamp: '2022-08-15T10:51:07Z'
labels:
app.kubernetes.io/instance: kube-prometheus-1660560589
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: node-exporter
controller-revision-hash: 65c69f9b58
helm.sh/chart: node-exporter-3.0.8
pod-template-generation: '1'
ownerReferences:
- apiVersion: apps/v1
kind: DaemonSet
name: kube-prometheus-1660560589-node-exporter
uid: 921f98b9-ccc9-4e84-b092-585865bca024
controller: true
blockOwnerDeletion: true
status:
phase: Pending
conditions:
- type: PodScheduled
status: 'False'
lastProbeTime: null
lastTransitionTime: '2022-08-15T10:51:07Z'
reason: Unschedulable
message: >-
0/1 nodes are available: 1 node(s) didn't have free ports for the
requested pod ports.
qosClass: BestEffort
spec:
volumes:
- name: proc
hostPath:
path: /proc
type: ''
- name: sys
hostPath:
path: /sys
type: ''
- name: kube-api-access-9fj8v
projected:
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
name: kube-root-ca.crt
items:
- key: ca.crt
path: ca.crt
- downwardAPI:
items:
- path: namespace
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
defaultMode: 420
containers:
- name: node-exporter
image: docker.io/bitnami/node-exporter:1.3.1-debian-11-r23
args:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
- '--web.listen-address=0.0.0.0:9100'
- >-
--collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
- >-
--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
ports:
- name: metrics
hostPort: 9100
containerPort: 9100
protocol: TCP
resources: {}
volumeMounts:
- name: proc
readOnly: true
mountPath: /host/proc
- name: sys
readOnly: true
mountPath: /host/sys
- name: kube-api-access-9fj8v
readOnly: true
mountPath: /var/run/secrets/kubernetes.io/serviceaccount
livenessProbe:
httpGet:
path: /
port: metrics
scheme: HTTP
initialDelaySeconds: 120
timeoutSeconds: 5
periodSeconds: 10
successThreshold: 1
failureThreshold: 6
readinessProbe:
httpGet:
path: /
port: metrics
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
periodSeconds: 10
successThreshold: 1
failureThreshold: 6
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
securityContext:
runAsUser: 1001
runAsNonRoot: true
restartPolicy: Always
terminationGracePeriodSeconds: 30
dnsPolicy: ClusterFirst
serviceAccountName: kube-prometheus-1660560589-node-exporter
serviceAccount: kube-prometheus-1660560589-node-exporter
hostNetwork: true
hostPID: true
securityContext:
fsGroup: 1001
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- k8smasterone
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
podAffinityTerm:
labelSelector:
matchLabels:
app.kubernetes.io/instance: kube-prometheus-1660560589
app.kubernetes.io/name: node-exporter
namespaces:
- reddwarf-monitor
topologyKey: kubernetes.io/hostname
schedulerName: default-scheduler
tolerations:
- key: node.kubernetes.io/not-ready
operator: Exists
effect: NoExecute
- key: node.kubernetes.io/unreachable
operator: Exists
effect: NoExecute
- key: node.kubernetes.io/disk-pressure
operator: Exists
effect: NoSchedule
- key: node.kubernetes.io/memory-pressure
operator: Exists
effect: NoSchedule
- key: node.kubernetes.io/pid-pressure
operator: Exists
effect: NoSchedule
- key: node.kubernetes.io/unschedulable
operator: Exists
effect: NoSchedule
- key: node.kubernetes.io/network-unavailable
operator: Exists
effect: NoSchedule
priority: 0
enableServiceLinks: true
preemptionPolicy: PreemptLowerPriority
I have checked the host machine and found that port 9100 is free, so why am I still being told that there is no free port for this pod? What should I do to avoid this problem? This is the check of host port 9100:
[root@k8smasterone grafana]# lsof -i:9100
[root@k8smasterone grafana]#
This is the pod describe output:
➜ ~ kubectl describe pod kube-prometheus-1660560589-node-exporter-n7rzg -n reddwarf-monitor
Name: kube-prometheus-1660560589-node-exporter-n7rzg
Namespace: reddwarf-monitor
Priority: 0
Node: <none>
Labels: app.kubernetes.io/instance=kube-prometheus-1660560589
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=node-exporter
controller-revision-hash=65c69f9b58
helm.sh/chart=node-exporter-3.0.8
pod-template-generation=1
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: DaemonSet/kube-prometheus-1660560589-node-exporter
Containers:
node-exporter:
Image: docker.io/bitnami/node-exporter:1.3.1-debian-11-r23
Port: 9100/TCP
Host Port: 9100/TCP
Args:
--path.procfs=/host/proc
--path.sysfs=/host/sys
--web.listen-address=0.0.0.0:9100
--collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
Liveness: http-get http://:metrics/ delay=120s timeout=5s period=10s #success=1 #failure=6
Readiness: http-get http://:metrics/ delay=30s timeout=5s period=10s #success=1 #failure=6
Environment: <none>
Mounts:
/host/proc from proc (ro)
/host/sys from sys (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9fj8v (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
proc:
Type: HostPath (bare host directory volume)
Path: /proc
HostPathType:
sys:
Type: HostPath (bare host directory volume)
Path: /sys
HostPathType:
kube-api-access-9fj8v:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m54s (x233 over 3h53m) default-scheduler 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
This is the netstat output:
[root@k8smasterone ~]# netstat -plant |grep 9100
[root@k8smasterone ~]#
I also tried to allow the pods to run on the master node by adding this toleration:
tolerations:
- effect: NoSchedule
  key: node-role.kubernetes.io/master
It still did not fix the problem.
When you configure your pod with hostNetwork: true, the containers running in this pod can directly see the network interfaces of the host machine where the pod was started.
The container port will be exposed to the external network at <hostIP>:<hostPort>, where hostPort is the port you request via the container's hostPort field. Note that the scheduler reserves a declared hostPort for every pod already assigned to the node, so the conflict can occur even when nothing is currently listening on that port.
To work around your problem, you have two options:
set hostNetwork: false
choose a different hostPort (preferably in the range 49152 to 65535), as sketched below
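For the second option, a minimal sketch of what the relevant parts of the container spec could look like with an alternative port (49152 is an arbitrary free port; if you keep hostNetwork: true the exporter itself must also listen on that port, hence the matching --web.listen-address):
args:
  # ...other args unchanged
  - '--web.listen-address=0.0.0.0:49152'
ports:
  - name: metrics
    containerPort: 49152
    hostPort: 49152
    protocol: TCP
If you deploy through the Helm chart, make the equivalent change in your chart values rather than editing the pod directly, since the DaemonSet controller recreates pods from its template.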

terminationGracePeriodSeconds not shown in kubectl describe result

I created a Pod with terminationGracePeriodSeconds specified in its spec, but I can't check whether this setting has been applied using kubectl describe. How can I check whether the terminationGracePeriodSeconds option has been applied successfully? I'm running Kubernetes version 1.19.
apiVersion: v1
kind: Pod
metadata:
  name: mysql-client
spec:
  serviceAccountName: test
  terminationGracePeriodSeconds: 60
  containers:
  - name: mysql-cli
    image: blah
    command: ["/bin/sh", "-c"]
    args:
    - sleep 2000
  restartPolicy: OnFailure
Assuming the pod is running successfully, you should be able to see the setting in the pod manifest.
terminationGracePeriodSeconds is available in v1.19 as per the following page (search for "terminationGracePeriodSeconds" there):
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/
Now try this command:
kubectl get pod mysql-client -o yaml | grep -A10 -B10 terminationGracePeriodSeconds
How can I check whether terminationGracePeriodSeconds option has been successfully applied?
First of all, you need to make sure your pod has been created correctly. I will show this with an example. I have deployed a very simple pod using the following YAML:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
Then I run the command kubectl get pods:
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 9m1s
Everything is fine.
I can't check whether this spec has been applied successfully using kubectl describe.
That is also correct, because this command doesn't return information about the termination grace period. To find this information you need to run the kubectl get pod <your pod name> -o yaml command. The result will be similar to the one below:
apiVersion: v1
kind: Pod
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"spec":{"containers":[{"image":"nginx:1.14.2","name":"nginx","ports":[{"containerPort":80}]}],"terminationGracePeriodSeconds":60}}
creationTimestamp: "2022-01-11T11:34:58Z"
name: nginx
namespace: default
resourceVersion: "57260566"
uid: <MY-UID>
spec:
containers:
- image: nginx:1.14.2
imagePullPolicy: IfNotPresent
name: nginx
ports:
- containerPort: 80
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: <name>
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: <my-node-name>
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 60
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: kube-api-access-nj88r
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2022-01-11T11:35:01Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2022-01-11T11:35:07Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2022-01-11T11:35:07Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2022-01-11T11:35:01Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: containerd://<ID>
image: docker.io/library/nginx:1.14.2
imageID: docker.io/library/nginx@sha256:<sha256>
lastState: {}
name: nginx
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2022-01-11T11:35:06Z"
hostIP: <IP>
phase: Running
podIP: <IP>
podIPs:
- ip: <IP>
qosClass: BestEffort
startTime: "2022-01-11T11:35:01Z"
The most important part will be here:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"spec":{"containers":[{"image":"nginx:1.14.2","name":"nginx","ports":[{"containerPort":80}]}],"terminationGracePeriodSeconds":60}}
and here
terminationGracePeriodSeconds: 60
At this point you can be sure that terminationGracePeriodSeconds has been applied successfully.
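If you only want the single value rather than the whole manifest, a jsonpath query is a convenient shortcut (using the nginx pod from the example above):
kubectl get pod nginx -o jsonpath='{.spec.terminationGracePeriodSeconds}'
# prints: 60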

Multiple kube-schedulers

I'm trying to create a kube-scheduler that will be used by my organisation's pod deployments (a data center). The objective is to have the pods of any deployment split equally between the two sites of the data center. To achieve this I created a new kube-scheduler manifest (see below), in which I changed the ports that are used and added a new configuration file (see below) using the --config parameter. The problem is that even after setting the --port parameter to a new port that isn't already used by the original kube-scheduler, it still tries to use the old one. As such, the new kube-scheduler can't start:
I1109 20:27:59.225996 1 registry.go:173] Registering SelectorSpread plugin
I1109 20:27:59.226097 1 registry.go:173] Registering SelectorSpread plugin
I1109 20:27:59.926994 1 serving.go:331] Generated self-signed cert in-memory
failed to create listener: failed to listen on 0.0.0.0:10251: listen tcp 0.0.0.0:10251: bind: address already in use
How can I force the use of the specified port, OR how can I delete the original kube-scheduler so that there won't be any conflict between the two? The first solution would be preferable.
Configuration kube-scheduler (manifest - /etc/kubernetes/manifest/kube-scheduler.yaml) :
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
component: kube-scheduler
tier: control-plane
name: kube-scheduler
namespace: kube-system
spec:
containers:
- command:
- kube-scheduler
- --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
- --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
- --bind-address=127.0.0.1
- --kubeconfig=/etc/kubernetes/scheduler.conf
- --leader-elect=true
#- --port=0
image: k8s.gcr.io/kube-scheduler:v1.19.3
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /healthz
port: 10259
scheme: HTTPS
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 15
name: kube-scheduler
resources:
requests:
cpu: 100m
startupProbe:
failureThreshold: 24
httpGet:
host: 127.0.0.1
path: /healthz
port: 10259
scheme: HTTPS
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 15
volumeMounts:
- mountPath: /etc/kubernetes/scheduler.conf
name: kubeconfig
readOnly: true
hostNetwork: true
priorityClassName: system-node-critical
volumes:
- hostPath:
path: /etc/kubernetes/scheduler.conf
type: FileOrCreate
name: kubeconfig
status: {}
Configuration kube-custom-scheduler.yaml (/etc/kubernetes/manifest/kube-custom-scheduler.yaml):
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
component: kube-custom-scheduler
tier: control-plane
name: kube-custom-scheduler
namespace: kube-system
spec:
containers:
- command:
- kube-scheduler
- --config=/etc/kubernetes/scheduler-custom.conf
- --master=true
- --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
- --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
- --bind-address=127.0.0.1
- --kubeconfig=/etc/kubernetes/scheduler.conf
- --leader-elect=false
- --secure-port=10269
- --scheduler-name=kube-custom-shceduler
- --port=10261
image: k8s.gcr.io/kube-scheduler:v1.19.3
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /healthz
port: 10269
scheme: HTTPS
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 15
name: kube-scheduler
resources:
requests:
cpu: 100m
startupProbe:
failureThreshold: 24
httpGet:
host: 127.0.0.1
path: /healthz
port: 10269
scheme: HTTPS
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 15
volumeMounts:
- mountPath: /etc/kubernetes/scheduler.conf
name: kubeconfig
readOnly: true
- mountPath: /etc/kubernetes/scheduler-custom.conf
name: customconfig
readOnly: true
hostNetwork: true
priorityClassName: system-node-critical
volumes:
- hostPath:
path: /etc/kubernetes/scheduler.conf
type: FileOrCreate
name: kubeconfig
- hostPath:
path: /etc/kubernetes/scheduler-custom.conf
type: FileOrCreate
name: customconfig
status: {}
Custom configuration mentioned in the kube-custom-scheduler.yaml (/etc/kubernetes/scheduler-custom.conf):
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
profiles:
- pluginConfig:
  - name: PodTopologySpread
    args:
      defaultConstraints:
      - maxSkew: 1
        topologyKey: topology.kube.io/datacenter
        whenUnsatisfiable: ScheduleAnyway
Please don't hesitate to ask if you need more information about the cluster.
According to https://kubernetes.io/docs/reference/command-line-tools-reference/kube-scheduler/, the --port flag is overridden by the file specified by --config. Adding healthzBindAddress and metricsBindAddress to /etc/kubernetes/scheduler-custom.conf works:
healthzBindAddress: 0.0.0.0:10261
metricsBindAddress: 0.0.0.0:10261
If you want to remove the default scheduler, you can move kube-scheduler.yaml out of /etc/kubernetes/manifests/.
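Putting the two pieces together, a sketch of the full /etc/kubernetes/scheduler-custom.conf could look like this (port 10261 is the one the question tried to pass via --port; schedulerName is the profile field that pods reference through spec.schedulerName):
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
healthzBindAddress: 0.0.0.0:10261
metricsBindAddress: 0.0.0.0:10261
profiles:
- schedulerName: kube-custom-scheduler
  pluginConfig:
  - name: PodTopologySpread
    args:
      defaultConstraints:
      - maxSkew: 1
        topologyKey: topology.kube.io/datacenter
        whenUnsatisfiable: ScheduleAnyway
With the bind addresses set in the file, the --port flag can be dropped from the pod command. If you prefer to remove the default scheduler instead, moving its manifest out of the static pod directory (e.g. mv /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/) is enough; the kubelet stops the pod once the file disappears.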

Configure pod resource request for k8s etcd pod

When running k8s 1.18 with a default "on-cluster" etcd pod deployment, what is the way to assign a resource (CPU/memory) request, or otherwise influence the pod spec for the etcd container?
The default configuration provides no resource requests or limits.
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system etcd-172-25-87-82-hybrid.com 0 (0%) 0 (0%) 0 (0%) 0 (0%) 77m
I’m aware of how one can pass extra args to etcd via kubeadm extraArgs config but these do not cover the etcd pod resources.
etcd:
  local:
    extraArgs:
      heartbeat-interval: "1000"
      election-timeout: "5000"
The question can be extended to the other resources in the kube-system namespace, e.g. coredns.
After you initialize the cluster, you can find the generated /etc/kubernetes/manifests/etcd.yaml. Have you tried editing it? The kubelet should pick up the changes and restart the etcd instance.
root@kube-1:~# cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
kubeadm.kubernetes.io/etcd.advertise-client-urls: https://10.154.0.33:2379
creationTimestamp: null
labels:
component: etcd
tier: control-plane
name: etcd
namespace: kube-system
spec:
containers:
- command:
- etcd
- --advertise-client-urls=https://10.154.0.33:2379
- --cert-file=/etc/kubernetes/pki/etcd/server.crt
- --client-cert-auth=true
- --data-dir=/var/lib/etcd
- --initial-advertise-peer-urls=https://10.154.0.33:2380
- --initial-cluster=kube-1=https://10.154.0.33:2380
- --key-file=/etc/kubernetes/pki/etcd/server.key
- --listen-client-urls=https://127.0.0.1:2379,https://10.154.0.33:2379
- --listen-metrics-urls=http://127.0.0.1:2381
- --listen-peer-urls=https://10.154.0.33:2380
- --name=kube-1
- --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
- --peer-client-cert-auth=true
- --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
- --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
- --snapshot-count=10000
- --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
image: k8s.gcr.io/etcd:3.4.13-0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /health
port: 2381
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 15
name: etcd
resources: {}
startupProbe:
failureThreshold: 24
httpGet:
host: 127.0.0.1
path: /health
port: 2381
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 15
volumeMounts:
- mountPath: /var/lib/etcd
name: etcd-data
- mountPath: /etc/kubernetes/pki/etcd
name: etcd-certs
hostNetwork: true
priorityClassName: system-node-critical
volumes:
- hostPath:
path: /etc/kubernetes/pki/etcd
type: DirectoryOrCreate
name: etcd-certs
- hostPath:
path: /var/lib/etcd
type: DirectoryOrCreate
name: etcd-data
status: {}
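As a sketch of that edit (the values are illustrative, not a recommendation), the empty resources: {} stanza of the etcd container in that file could be replaced with:
    resources:
      requests:
        cpu: 200m
        memory: 512Mi
The kubelet watches /etc/kubernetes/manifests/ and recreates the static pod when the file changes. Keep in mind that kubeadm may regenerate this manifest during a cluster upgrade, so the edit may need to be reapplied afterwards.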

kubernetes Pod's readinessProbe errored but endpoint not removed from Service

I'm running Spinnaker on Kubernetes 1.10.11 [1]. One of the Spinnaker services is a Pod running a service called Clouddriver. This Pod was running fine, but then the readinessProbe started erroring continuously. The Kubernetes docs say:
readinessProbe: Indicates whether the Container is ready to service requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod.
— https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes
But this Pod's IP is still in the Service's endpoints. Why?
Clouddriver Pod YAML
kubectl -n spinnaker-test get pods spin-clouddriver-5559d44484-mp8q9 -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/psp: spotify.backend-service
creationTimestamp: 2019-02-15T20:46:38Z
generateName: spin-clouddriver-5559d44484-
labels:
app: spin
app.kubernetes.io/managed-by: halyard
app.kubernetes.io/name: clouddriver
app.kubernetes.io/part-of: spinnaker
app.kubernetes.io/version: 1.12.1
cluster: spin-clouddriver
pod-template-hash: "1115800040"
name: spin-clouddriver-5559d44484-mp8q9
namespace: spinnaker-test
ownerReferences:
- apiVersion: extensions/v1beta1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: spin-clouddriver-5559d44484
uid: ce79561c-3161-11e9-acdf-42010a800082
resourceVersion: "53541277"
selfLink: /api/v1/namespaces/spinnaker-test/pods/spin-clouddriver-5559d44484-mp8q9
uid: caa66d7c-3162-11e9-acdf-42010a800082
spec:
containers:
- env:
- name: JAVA_OPTS
value: -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=2
- name: SPRING_PROFILES_ACTIVE
value: local
image: gcr.io/spinnaker-marketplace/clouddriver:4.3.1-20190130095322
imagePullPolicy: IfNotPresent
lifecycle: {}
name: clouddriver
ports:
- containerPort: 7002
protocol: TCP
readinessProbe:
exec:
command:
- wget
- --no-check-certificate
- --spider
- -q
- http://localhost:7002/health
failureThreshold: 3
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
cpu: "20"
memory: 5000Mi
requests:
cpu: "20"
memory: 5000Mi
securityContext:
allowPrivilegeEscalation: false
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /opt/spinnaker/config
name: spin-clouddriver-files-1952526246
- mountPath: /home/halyard/.hal/k8s-spinnaker/staging/dependencies
name: spin-clouddriver-files-1757773194
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-w2lt5
readOnly: true
dnsPolicy: ClusterFirst
nodeName: gke-production-us-ce-terraform-201812-d63606d6-9vq9
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 720
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: spin-clouddriver-files-1952526246
secret:
defaultMode: 420
secretName: spin-clouddriver-files-1952526246
- name: spin-clouddriver-files-1757773194
secret:
defaultMode: 420
secretName: spin-clouddriver-files-1757773194
- name: default-token-w2lt5
secret:
defaultMode: 420
secretName: default-token-w2lt5
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2019-02-15T20:46:38Z
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: 2019-02-15T20:53:40Z
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: 2019-02-15T20:46:38Z
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://3509b48511b1ea7bc97812cb82831c559d9410cb9eaaa26b4f492d881603fb31
image: gcr.io/spinnaker-marketplace/clouddriver:4.3.1-20190130095322
imageID: docker-pullable://gcr.io/spinnaker-marketplace/clouddriver@sha256:466228b97b8c4a61a0270c53ae4c397eb04bc3661bc4f1ee9ef4d5fce70d187d
lastState: {}
name: clouddriver
ready: true
restartCount: 0
state:
running:
startedAt: 2019-02-15T20:47:26Z
hostIP: 10.178.32.98
phase: Running
podIP: 10.179.34.24
qosClass: Guaranteed
startTime: 2019-02-15T20:46:38Z
Describing the Pod shows the readinessProbe has been continuously erroring for over a day.
kubectl -n spinnaker-test describe pods spin-clouddriver-5559d44484-mp8q9
Name: spin-clouddriver-5559d44484-mp8q9
Namespace: spinnaker-test
Node: gke-production-us-ce-terraform-201812-d63606d6-9vq9/10.178.32.98
Start Time: Fri, 15 Feb 2019 15:46:38 -0500
Labels: app=spin
app.kubernetes.io/managed-by=halyard
app.kubernetes.io/name=clouddriver
app.kubernetes.io/part-of=spinnaker
app.kubernetes.io/version=1.12.1
cluster=spin-clouddriver
pod-template-hash=1115800040
Annotations: kubernetes.io/psp=spotify.backend-service
Status: Running
IP: 10.179.34.24
Controlled By: ReplicaSet/spin-clouddriver-5559d44484
Containers:
clouddriver:
Container ID: docker://3509b48511b1ea7bc97812cb82831c559d9410cb9eaaa26b4f492d881603fb31
Image: gcr.io/spinnaker-marketplace/clouddriver:4.3.1-20190130095322
Image ID: docker-pullable://gcr.io/spinnaker-marketplace/clouddriver@sha256:466228b97b8c4a61a0270c53ae4c397eb04bc3661bc4f1ee9ef4d5fce70d187d
Port: 7002/TCP
Host Port: 0/TCP
State: Running
Started: Fri, 15 Feb 2019 15:47:26 -0500
Ready: True
Restart Count: 0
Limits:
cpu: 20
memory: 5000Mi
Requests:
cpu: 20
memory: 5000Mi
Readiness: exec [wget --no-check-certificate --spider -q http://localhost:7002/health] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
JAVA_OPTS: -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=2
SPRING_PROFILES_ACTIVE: local
Mounts:
/home/halyard/.hal/k8s-spinnaker/staging/dependencies from spin-clouddriver-files-1757773194 (rw)
/opt/spinnaker/config from spin-clouddriver-files-1952526246 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-w2lt5 (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
spin-clouddriver-files-1952526246:
Type: Secret (a volume populated by a Secret)
SecretName: spin-clouddriver-files-1952526246
Optional: false
spin-clouddriver-files-1757773194:
Type: Secret (a volume populated by a Secret)
SecretName: spin-clouddriver-files-1757773194
Optional: false
default-token-w2lt5:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-w2lt5
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 3m (x321 over 1d) kubelet, gke-production-us-ce-terraform-201812-d63606d6-9vq9 Readiness probe errored: rpc error: code = DeadlineExceeded desc = context deadline exceeded
But Service still has the Pod's IP of 10.179.34.24 in its Endpoints.
kubectl -n spinnaker-test describe services spin-clouddriver
Name: spin-clouddriver
Namespace: spinnaker-test
Labels: app=spin
cluster=spin-clouddriver
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"spin","cluster":"spin-clouddriver"},"name":"spin-clouddriver","namesp...
Selector: app=spin,cluster=spin-clouddriver
Type: ClusterIP
IP: 10.178.65.100
Port: <unset> 7002/TCP
TargetPort: 7002/TCP
Endpoints: 10.179.34.24:7002
Session Affinity: None
Events: <none>
kubectl -n spinnaker-test describe endpoints spin-clouddriver
Name: spin-clouddriver
Namespace: spinnaker-test
Labels: app=spin
cluster=spin-clouddriver
Annotations: <none>
Subsets:
Addresses: 10.179.34.24
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
<unset> 7002 TCP
Events: <none>
Footnotes
[1] GKE 1.10.11-gke.1 to be exact, but the fact that it's GKE shouldn't matter.
A probe by the kubelet can end in one of three states:
successful
failed (the command returned a non-zero exit code)
errored (the command did not return before the timeout elapsed, the command does not exist inside the container, etc.)
Here is the code (in 1.10.11) where the "probe errored" event is recorded; note that err != nil.
Here is the code that calls the above function: when err != nil (the probe returned an error), the result is discarded.
Only probes that fail will actually cause the pod's ready state to be changed.
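A practical consequence, as a sketch (assuming the image provides a timeout utility, e.g. coreutils or BusyBox): wrapping the probe command so that a hang produces a non-zero exit code turns an errored probe into a failed one, which does flip the pod's Ready condition:
readinessProbe:
  exec:
    command:
    - sh
    - -c
    - timeout 3 wget --no-check-certificate --spider -q http://localhost:7002/health
  timeoutSeconds: 5
  periodSeconds: 10
  failureThreshold: 3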