Eureka pod turns to Pending state after running for a period of time in Kubernetes cluster - kubernetes

I deployed a Eureka pod in a Kubernetes cluster (v1.15.2). Today the pod's status turned to Pending even though its actual state is Running. Other services could not access Eureka, and the pod status icon shows: this pod is in a pending state. This is my StatefulSet manifest:
{
"kind": "StatefulSet",
"apiVersion": "apps/v1beta2",
"metadata": {
"name": "eureka",
"namespace": "dabai-fat",
"selfLink": "/apis/apps/v1beta2/namespaces/dabai-fat/statefulsets/eureka",
"uid": "92eefc3d-4601-4ebc-9414-8437f9934461",
"resourceVersion": "20195760",
"generation": 21,
"creationTimestamp": "2020-02-01T16:55:54Z",
"labels": {
"app": "eureka"
}
},
"spec": {
"replicas": 1,
"selector": {
"matchLabels": {
"app": "eureka"
}
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "eureka"
}
},
"spec": {
"containers": [
{
"name": "eureka",
"image": "registry.cn-hangzhou.aliyuncs.com/dabai_app_k8s/dabai_fat/soa-eureka:v1.0.0",
"ports": [
{
"name": "server",
"containerPort": 8761,
"protocol": "TCP"
},
{
"name": "management",
"containerPort": 8081,
"protocol": "TCP"
}
],
"env": [
{
"name": "APP_NAME",
"value": "eureka"
},
{
"name": "POD_NAME",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.name"
}
}
},
{
"name": "APP_OPTS",
"value": " --spring.application.name=${APP_NAME} --eureka.instance.hostname=${POD_NAME}.${APP_NAME} --registerWithEureka=true --fetchRegistry=true --eureka.instance.preferIpAddress=false --eureka.client.serviceUrl.defaultZone=http://eureka-0.${APP_NAME}:8761/eureka/"
},
{
"name": "APOLLO_META",
"valueFrom": {
"configMapKeyRef": {
"name": "fat-config",
"key": "apollo.meta"
}
}
},
{
"name": "ENV",
"valueFrom": {
"configMapKeyRef": {
"name": "fat-config",
"key": "env"
}
}
}
],
"resources": {
"limits": {
"cpu": "2",
"memory": "1Gi"
},
"requests": {
"cpu": "2",
"memory": "1Gi"
}
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 10,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"imagePullSecrets": [
{
"name": "regcred"
}
],
"schedulerName": "default-scheduler"
}
},
"serviceName": "eureka-service",
"podManagementPolicy": "Parallel",
"updateStrategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"partition": 0
}
},
"revisionHistoryLimit": 10
},
"status": {
"observedGeneration": 21,
"replicas": 1,
"readyReplicas": 1,
"currentReplicas": 1,
"updatedReplicas": 1,
"currentRevision": "eureka-5976977b7d",
"updateRevision": "eureka-5976977b7d",
"collisionCount": 0
}
}
This is the describe output of the pod in the Pending state:
$ kubectl describe pod eureka-0
Name: eureka-0
Namespace: dabai-fat
Priority: 0
Node: uat-k8s-01/172.19.104.233
Start Time: Mon, 23 Mar 2020 18:40:11 +0800
Labels: app=eureka
controller-revision-hash=eureka-5976977b7d
statefulset.kubernetes.io/pod-name=eureka-0
Annotations: <none>
Status: Running
IP: 172.30.248.8
IPs: <none>
Controlled By: StatefulSet/eureka
Containers:
eureka:
Container ID: docker://5e5eea624e1facc9437fef739669ffeaaa5a7ab655a1297c4acb1e4fd00701ea
Image: registry.cn-hangzhou.aliyuncs.com/dabai_app_k8s/dabai_fat/soa-eureka:v1.0.0
Image ID: docker-pullable://registry.cn-hangzhou.aliyuncs.com/dabai_app_k8s/dabai_fat/soa-eureka@sha256:7cd4878ae8efec32984a2b9eec623484c66ae11b9449f8306017cadefbf626ca
Ports: 8761/TCP, 8081/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Mon, 23 Mar 2020 18:40:18 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 2
memory: 1Gi
Environment:
APP_NAME: eureka
POD_NAME: eureka-0 (v1:metadata.name)
APP_OPTS: --spring.application.name=${APP_NAME} --eureka.instance.hostname=${POD_NAME}.${APP_NAME} --registerWithEureka=true --fetchRegistry=true --eureka.instance.preferIpAddress=false --eureka.client.serviceUrl.defaultZone=http://eureka-0.${APP_NAME}:8761/eureka/
APOLLO_META: <set to the key 'apollo.meta' of config map 'fat-config'> Optional: false
ENV: <set to the key 'env' of config map 'fat-config'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-xnrwt (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady True
PodScheduled True
Volumes:
default-token-xnrwt:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-xnrwt
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 360s
node.kubernetes.io/unreachable:NoExecute for 360s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 16h default-scheduler Successfully assigned dabai-fat/eureka-0 to uat-k8s-01
Normal Pulling 16h kubelet, uat-k8s-01 Pulling image "registry.cn-hangzhou.aliyuncs.com/dabai_app_k8s/dabai_fat/soa-eureka:v1.0.0"
Normal Pulled 16h kubelet, uat-k8s-01 Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/dabai_app_k8s/dabai_fat/soa-eureka:v1.0.0"
Normal Created 16h kubelet, uat-k8s-01 Created container eureka
Normal Started 16h kubelet, uat-k8s-01 Started container eureka
How could this happen? What should I do to avoid this situation? After I restarted the Eureka pod the problem disappeared, but I still want to know what caused it.

Sounds like a Kubernetes bug: the describe output shows the pod's Ready condition as False even though ContainersReady is True and the container is Running, which is why other services treat it as unavailable. Try to reproduce it on a current version of Kubernetes. You can also dive into the kubelet logs on that node to see if there is anything useful in them.
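If it happens again, a couple of commands may help narrow it down (a sketch; the pod and node names are taken from the describe output above, and the journalctl line assumes the kubelet on that node runs under systemd):
$ kubectl get pod eureka-0 -n dabai-fat -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
$ ssh uat-k8s-01 'journalctl -u kubelet --since "2 hours ago" | grep -i eureka-0'
A Ready=False condition on a pod that has no readiness probe and a running container usually points at the kubelet's status sync for that pod rather than at the application itself.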

Related

When updating the service target I get 502 errors for 5 to 60 seconds

I want to manually reroute (when I need to) a web server (a Helm deployment on GKE) to another one.
To do that I have 3 Helm deployments:
Application X
Application Y
Ingress on application X
Everything works fine, but if I run a Helm upgrade of the Ingress chart that changes only the selector of the Service it targets, I get 502 errors :(
Source of the Service:
apiVersion: v1
kind: Service
metadata:
  name: {{ .Values.Service.Name }}-https
  labels:
    app: {{ .Values.Service.Name }}
    type: svc
    name: {{ .Values.Service.Name }}
    environment: {{ .Values.Environment.Name }}
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    beta.cloud.google.com/backend-config: '{"ports": {"{{ .Values.Application.Port }}":"{{ .Values.Service.Name }}-https"}}'
spec:
  type: NodePort
  selector:
    name: {{ .Values.Application.Name }}
    environment: {{ .Values.Environment.Name }}
  ports:
    - protocol: TCP
      port: {{ .Values.Application.Port }}
      targetPort: {{ .Values.Application.Port }}
---
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: {{ .Values.Service.Name }}-https
spec:
  timeoutSec: 50
  connectionDraining:
    drainingTimeoutSec: 60
  sessionAffinity:
    affinityType: "GENERATED_COOKIE"
    affinityCookieTtlSec: 300
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: {{ .Values.Service.Name }}-https
  labels:
    app: {{ .Values.Service.Name }}
    type: ingress
    name: {{ .Values.Service.Name }}
    environment: {{ .Values.Environment.Name }}
  annotations:
    kubernetes.io/ingress.global-static-ip-name: {{ .Values.Service.PublicIpName }}
    networking.gke.io/managed-certificates: "{{ join "," .Values.Service.DomainNames }}"
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  backend:
    serviceName: {{ $.Values.Service.Name }}-https
    servicePort: 80
  rules:
  {{- range .Values.Service.DomainNames }}
    - host: {{ . | title | lower }}
      http:
        paths:
          - backend:
              serviceName: {{ $.Values.Service.Name }}-https
              servicePort: 80
  {{- end }}
The only thing that changes from one call to another is the value of "{{ .Values.Application.Name }}"; all other values are strictly the same.
The targeted pods are always up and running, and they all respond 200 when tested through kubectl port forwarding.
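For reference, that port-forward check looks roughly like this (a sketch; the pod name is one of those from the listing below, and the local port 8080 is arbitrary):
kubectl port-forward pod/drupal-dummy-v1-pod-77f5bf55c6-9dq8n 8080:80
curl -I http://127.0.0.1:8080/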
Here is the status of all my namespace objects :
NAME READY STATUS RESTARTS AGE
pod/drupal-dummy-404-v1-pod-744454b7ff-m4hjk 1/1 Running 0 2m32s
pod/drupal-dummy-404-v1-pod-744454b7ff-z5l29 1/1 Running 0 2m32s
pod/drupal-dummy-v1-pod-77f5bf55c6-9dq8n 1/1 Running 0 3m58s
pod/drupal-dummy-v1-pod-77f5bf55c6-njfl9 1/1 Running 0 3m57s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/drupal-dummy-v1-service-https NodePort 172.16.90.71 <none> 80:31391/TCP 3m49s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/drupal-dummy-404-v1-pod 2/2 2 2 2m32s
deployment.apps/drupal-dummy-v1-pod 2/2 2 2 3m58s
NAME DESIRED CURRENT READY AGE
replicaset.apps/drupal-dummy-404-v1-pod-744454b7ff 2 2 2 2m32s
replicaset.apps/drupal-dummy-v1-pod-77f5bf55c6 2 2 2 3m58s
NAME AGE
managedcertificate.networking.gke.io/d8.syspod.fr 161m
managedcertificate.networking.gke.io/d8gfi.syspod.fr 128m
managedcertificate.networking.gke.io/dummydrupald8.cnes.fr 162m
NAME HOSTS ADDRESS PORTS AGE
ingress.extensions/drupal-dummy-v1-service-https d8gfi.syspod.fr 34.120.106.136 80 3m50s
Another test was to pre-launch two services, one for each deployment, and only update the Ingress Helm release, this time changing "{{ $.Values.Service.Name }}". Same problem, and in this case the site is unavailable for 60s to 300s.
Here is the status of all my namespace objects (for this second test) :
root@47475bc8c41f:/opt/bin# k get all,svc,ingress,managedcertificates
NAME READY STATUS RESTARTS AGE
pod/drupal-dummy-404-v1-pod-744454b7ff-8r5pm 1/1 Running 0 26m
pod/drupal-dummy-404-v1-pod-744454b7ff-9cplz 1/1 Running 0 26m
pod/drupal-dummy-v1-pod-77f5bf55c6-56dnr 1/1 Running 0 30m
pod/drupal-dummy-v1-pod-77f5bf55c6-mg95j 1/1 Running 0 30m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/drupal-dummy-404-v1-pod-https NodePort 172.16.106.121 <none> 80:31030/TCP 26m
service/drupal-dummy-v1-pod-https NodePort 172.16.245.251 <none> 80:31759/TCP 27m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/drupal-dummy-404-v1-pod 2/2 2 2 26m
deployment.apps/drupal-dummy-v1-pod 2/2 2 2 30m
NAME DESIRED CURRENT READY AGE
replicaset.apps/bastion-66bb77bfd5 1 1 1 148m
replicaset.apps/drupal-dummy-404-v1-pod-744454b7ff 2 2 2 26m
replicaset.apps/drupal-dummy-v1-pod-77f5bf55c6 2 2 2 30m
NAME HOSTS ADDRESS PORTS AGE
ingress.extensions/drupal-dummy-v1-service-https d8gfi.syspod.fr 34.120.106.136 80 14m
Does anybody have an explanation (and a solution)?
Added the Deployment dump (surely something is missing but I don't see it):
root@c55834fbdf1a:/# k get deployment.apps/drupal-dummy-v1-pod -o json
{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"annotations": {
"deployment.kubernetes.io/revision": "2",
"meta.helm.sh/release-name": "drupal-dummy-v1-pod",
"meta.helm.sh/release-namespace": "e1"
},
"creationTimestamp": "2020-06-23T18:49:59Z",
"generation": 2,
"labels": {
"app.kubernetes.io/managed-by": "Helm",
"environment": "e1",
"name": "drupal-dummy-v1-pod",
"type": "dep"
},
"name": "drupal-dummy-v1-pod",
"namespace": "e1",
"resourceVersion": "3977170",
"selfLink": "/apis/apps/v1/namespaces/e1/deployments/drupal-dummy-v1-pod",
"uid": "56f74fb9-b582-11ea-9df2-42010a000006"
},
"spec": {
"progressDeadlineSeconds": 600,
"replicas": 2,
"revisionHistoryLimit": 10,
"selector": {
"matchLabels": {
"environment": "e1",
"name": "drupal-dummy-v1-pod",
"type": "dep"
}
},
"strategy": {
"rollingUpdate": {
"maxSurge": "25%",
"maxUnavailable": "25%"
},
"type": "RollingUpdate"
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"environment": "e1",
"name": "drupal-dummy-v1-pod",
"type": "dep"
}
},
"spec": {
"containers": [
{
"env": [
{
"name": "APPLICATION",
"value": "drupal-dummy-v1-pod"
},
{
"name": "DB_PASS",
"valueFrom": {
"secretKeyRef": {
"key": "password",
"name": "dbpassword"
}
}
},
{
"name": "DB_FQDN",
"valueFrom": {
"configMapKeyRef": {
"key": "dbip",
"name": "gcpenv"
}
}
},
{
"name": "DB_PORT",
"valueFrom": {
"configMapKeyRef": {
"key": "dbport",
"name": "gcpenv"
}
}
},
{
"name": "DB_NAME",
"valueFrom": {
"configMapKeyRef": {
"key": "dbdatabase",
"name": "gcpenv"
}
}
},
{
"name": "DB_USER",
"valueFrom": {
"configMapKeyRef": {
"key": "dbuser",
"name": "gcpenv"
}
}
}
],
"image": "eu.gcr.io/gke-drupal-276313/drupal-dummy:1.0.0",
"imagePullPolicy": "Always",
"livenessProbe": {
"failureThreshold": 3,
"httpGet": {
"path": "/",
"port": 80,
"scheme": "HTTP"
},
"initialDelaySeconds": 60,
"periodSeconds": 10,
"successThreshold": 1,
"timeoutSeconds": 5
},
"name": "drupal-dummy-v1-pod",
"ports": [
{
"containerPort": 80,
"protocol": "TCP"
}
],
"readinessProbe": {
"failureThreshold": 3,
"httpGet": {
"path": "/",
"port": 80,
"scheme": "HTTP"
},
"initialDelaySeconds": 60,
"periodSeconds": 10,
"successThreshold": 1,
"timeoutSeconds": 5
},
"resources": {},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"volumeMounts": [
{
"mountPath": "/var/www/html/sites/default",
"name": "drupal-dummy-v1-pod"
}
]
}
],
"dnsPolicy": "ClusterFirst",
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"terminationGracePeriodSeconds": 30,
"volumes": [
{
"name": "drupal-dummy-v1-pod",
"persistentVolumeClaim": {
"claimName": "drupal-dummy-v1-pod"
}
}
]
}
}
},
"status": {
"availableReplicas": 2,
"conditions": [
{
"lastTransitionTime": "2020-06-23T18:56:05Z",
"lastUpdateTime": "2020-06-23T18:56:05Z",
"message": "Deployment has minimum availability.",
"reason": "MinimumReplicasAvailable",
"status": "True",
"type": "Available"
},
{
"lastTransitionTime": "2020-06-23T18:49:59Z",
"lastUpdateTime": "2020-06-23T18:56:05Z",
"message": "ReplicaSet \"drupal-dummy-v1-pod-6865d969cd\" has successfully progressed.",
"reason": "NewReplicaSetAvailable",
"status": "True",
"type": "Progressing"
}
],
"observedGeneration": 2,
"readyReplicas": 2,
"replicas": 2,
"updatedReplicas": 2
}
}
Here is the Service dump too:
root@c55834fbdf1a:/# k get service/drupal-dummy-v1-service-https -o json
{
"apiVersion": "v1",
"kind": "Service",
"metadata": {
"annotations": {
"beta.cloud.google.com/backend-config": "{\"ports\": {\"80\":\"drupal-dummy-v1-service-https\"}}",
"cloud.google.com/neg": "{\"ingress\": true}",
"cloud.google.com/neg-status": "{\"network_endpoint_groups\":{\"80\":\"k8s1-4846660e-e1-drupal-dummy-v1-service-https-80-36c11551\"},\"zones\":[\"europe-west3-a\",\"europe-west3-b\"]}",
"meta.helm.sh/release-name": "drupal-dummy-v1-service",
"meta.helm.sh/release-namespace": "e1"
},
"creationTimestamp": "2020-06-23T18:50:45Z",
"labels": {
"app": "drupal-dummy-v1-service",
"app.kubernetes.io/managed-by": "Helm",
"environment": "e1",
"name": "drupal-dummy-v1-service",
"type": "svc"
},
"name": "drupal-dummy-v1-service-https",
"namespace": "e1",
"resourceVersion": "3982781",
"selfLink": "/api/v1/namespaces/e1/services/drupal-dummy-v1-service-https",
"uid": "722d3a99-b582-11ea-9df2-42010a000006"
},
"spec": {
"clusterIP": "172.16.103.181",
"externalTrafficPolicy": "Cluster",
"ports": [
{
"nodePort": 32396,
"port": 80,
"protocol": "TCP",
"targetPort": 80
}
],
"selector": {
"environment": "e1",
"name": "drupal-dummy-v1-pod"
},
"sessionAffinity": "None",
"type": "NodePort"
},
"status": {
"loadBalancer": {}
}
}
And the Ingress one:
root@c55834fbdf1a:/# k get ingress.extensions/drupal-dummy-v1-service-https -o json
{
"apiVersion": "extensions/v1beta1",
"kind": "Ingress",
"metadata": {
"annotations": {
"ingress.gcp.kubernetes.io/pre-shared-cert": "mcrt-a15e339b-6c3f-4f23-8f6b-688dc98b33a6,mcrt-f3a385de-0541-4b9c-8047-6dcfcbd4d74f",
"ingress.kubernetes.io/backends": "{\"k8s1-4846660e-e1-drupal-dummy-v1-service-https-80-36c11551\":\"HEALTHY\"}",
"ingress.kubernetes.io/forwarding-rule": "k8s-fw-e1-drupal-dummy-v1-service-https--4846660e8b9bd880",
"ingress.kubernetes.io/https-forwarding-rule": "k8s-fws-e1-drupal-dummy-v1-service-https--4846660e8b9bd880",
"ingress.kubernetes.io/https-target-proxy": "k8s-tps-e1-drupal-dummy-v1-service-https--4846660e8b9bd880",
"ingress.kubernetes.io/ssl-cert": "mcrt-a15e339b-6c3f-4f23-8f6b-688dc98b33a6,mcrt-f3a385de-0541-4b9c-8047-6dcfcbd4d74f",
"ingress.kubernetes.io/target-proxy": "k8s-tp-e1-drupal-dummy-v1-service-https--4846660e8b9bd880",
"ingress.kubernetes.io/url-map": "k8s-um-e1-drupal-dummy-v1-service-https--4846660e8b9bd880",
"kubernetes.io/ingress.global-static-ip-name": "gkxe-k1312-e1-drupal-dummy-v1",
"meta.helm.sh/release-name": "drupal-dummy-v1-service",
"meta.helm.sh/release-namespace": "e1",
"networking.gke.io/managed-certificates": "dummydrupald8.cnes.fr,d8.syspod.fr",
"nginx.ingress.kubernetes.io/rewrite-target": "/"
},
"creationTimestamp": "2020-06-23T18:50:45Z",
"generation": 1,
"labels": {
"app": "drupal-dummy-v1-service",
"app.kubernetes.io/managed-by": "Helm",
"environment": "e1",
"name": "drupal-dummy-v1-service",
"type": "ingress"
},
"name": "drupal-dummy-v1-service-https",
"namespace": "e1",
"resourceVersion": "3978178",
"selfLink": "/apis/extensions/v1beta1/namespaces/e1/ingresses/drupal-dummy-v1-service-https",
"uid": "7237fc51-b582-11ea-9df2-42010a000006"
},
"spec": {
"backend": {
"serviceName": "drupal-dummy-v1-service-https",
"servicePort": 80
},
"rules": [
{
"host": "dummydrupald8.cnes.fr",
"http": {
"paths": [
{
"backend": {
"serviceName": "drupal-dummy-v1-service-https",
"servicePort": 80
}
}
]
}
},
{
"host": "d8.syspod.fr",
"http": {
"paths": [
{
"backend": {
"serviceName": "drupal-dummy-v1-service-https",
"servicePort": 80
}
}
]
}
}
]
},
"status": {
"loadBalancer": {
"ingress": [
{
"ip": "34.98.97.102"
}
]
}
}
}
I have seen this in the Kubernetes events (only when I reconfigure my service selector to target the first or second deployment).
Switch to the unavailability page (3 seconds of 502s):
81s Normal Attach service/drupal-dummy-v1-service-https Attach 1 network endpoint(s) (NEG "k8s1-4846660e-e1-drupal-dummy-v1-service-https-80-36c11551" in zone "europe-west3-b")
78s Normal Attach service/drupal-dummy-v1-service-https Attach 1 network endpoint(s) (NEG "k8s1-4846660e-e1-drupal-dummy-v1-service-https-80-36c11551" in zone "europe-west3-a")
Switch back to the application (15 seconds of 502s; never the same duration):
7s Normal Attach service/drupal-dummy-v1-service-https Attach 1 network endpoint(s) (NEG "k8s1-4846660e-e1-drupal-dummy-v1-service-https-80-36c11551" in zone "europe-west3-a")
7s Normal Attach service/drupal-dummy-v1-service-https Attach 1 network endpoint(s) (NEG "k8s1-4846660e-e1-drupal-dummy-v1-service-https-80-36c11551" in zone "europe-west3-b")
I could check that the NEG events appear just before the 502 errors end. I suspect that when we change the service definition a new NEG is created, but that takes time, and while we wait for it the old backend is already gone, so there is no service at all during that window :(
Is there no "rolling update" of a service definition?
No solution at all, even with two services and only upgrading the Ingress. Confirmed with GCP support: GKE will always destroy and then recreate something, so it is not possible to do a rolling update of a Service or an Ingress with zero downtime. They suggested keeping two full silos and playing with DNS; we chose another solution: keep a single Deployment and just do a simple rolling update of that Deployment, changing the referenced Docker image. Not really what we were aiming for, but it works...
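For reference, a minimal sketch of that final workaround, using the Deployment and container names from the dumps above and a hypothetical new image tag:
kubectl -n e1 set image deployment/drupal-dummy-v1-pod drupal-dummy-v1-pod=eu.gcr.io/gke-drupal-276313/drupal-dummy:1.1.0
kubectl -n e1 rollout status deployment/drupal-dummy-v1-pod
Because the Service selector and the Ingress never change, the NEG stays attached and traffic shifts pod by pod as the rolling update progresses.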

How to add resource requests and limits on Kubernetes Engine on Google Cloud Platform

I am trying to add resource requests and limits to my deployment on Kubernetes Engine, since one of my pods is continuously getting evicted with the error message The node was low on resource: memory. Container model-run was using 1904944Ki, which exceeds its request of 0. I assume the issue can be resolved by adding resource requests.
When I try to add resource requests and deploy, the deployment is successful, but when I go back and view detailed information about the Pod with the command
kubectl get pod default-pod-name --output=yaml --namespace=default
It still says the pod has a request of cpu: 100m, with no mention of the memory I allotted. I am guessing that the cpu request of 100m is a default. Please let me know how I can set the requests and limits; the command I am using to deploy is as follows:
kubectl run model-run --image-pull-policy=Always --overrides='
{
"apiVersion": "apps/v1beta1",
"kind": "Deployment",
"metadata": {
"name": "model-run",
"labels": {
"app": "model-run"
}
},
"spec": {
"selector": {
"matchLabels": {
"app": "model-run"
}
},
"template": {
"metadata": {
"labels": {
"app": "model-run"
}
},
"spec": {
"containers": [
{
"name": "model-run",
"image": "gcr.io/some-project/news/model-run:development",
"imagePullPolicy": "Always",
"resouces": {
"requests": [
{
"memory": "2048Mi",
"cpu": "500m"
}
],
"limits": [
{
"memory": "2500Mi",
"cpu": "750m"
}
]
},
"volumeMounts": [
{
"name": "credentials",
"readOnly": true,
"mountPath":"/path/collection/keys"
}
],
"env":[
{
"name":"GOOGLE_APPLICATION_CREDENTIALS",
"value":"/path/collection/keys/key.json"
}
]
}
],
"volumes": [
{
"name": "credentials",
"secret": {
"secretName": "credentials"
}
}
]
}
}
}
}
' --image=gcr.io/some-project/news/model-run:development
Any solution will be appreciated
The node was low on resource: memory. Container model-run was using 1904944Ki, which exceeds its request of 0.
At first the message seems to point to a lack of resources on the node itself, but the second part makes me believe you are correct in trying to set a resource request for the container.
Just keep in mind that if you still face errors after this change, you might need to add more powerful node pools to your cluster.
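For example, adding a larger node pool on GKE could look like this (a sketch; the pool name, machine type and node count are placeholders to adapt to your cluster and workload):
gcloud container node-pools create high-mem-pool \
  --cluster=your-cluster-name \
  --machine-type=e2-highmem-2 \
  --num-nodes=2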
I went through your command; there are a few issues I'd like to highlight:
kubectl run was deprecated in 1.12 for all resources except pods, and since version 1.18 it only creates pods.
apiVersion: apps/v1beta1 is deprecated and, starting with v1.16, is no longer served; I replaced it with apps/v1.
In spec.template.spec.containers, "resouces" is written instead of "resources".
After fixing the resources key, the next issue is that requests and limits are written as arrays, but they need to be maps, otherwise you get this error:
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
error: v1beta1.Deployment.Spec: v1beta1.DeploymentSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Resources: v1.ResourceRequirements.Limits: ReadMapCB: expect { or n, but found [, error found in #10 byte of ...|"limits":[{"cpu":"75|..., bigger context ...|Always","name":"model-run","resources":{"limits":[{"cpu":"750m","memory":"2500Mi"}],"requests":[{"cp|...
Here is the fixed format of your command:
kubectl run model-run --image-pull-policy=Always --overrides='{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"name": "model-run",
"labels": {
"app": "model-run"
}
},
"spec": {
"selector": {
"matchLabels": {
"app": "model-run"
}
},
"template": {
"metadata": {
"labels": {
"app": "model-run"
}
},
"spec": {
"containers": [
{
"name": "model-run",
"image": "nginx",
"imagePullPolicy": "Always",
"resources": {
"requests": {
"memory": "2048Mi",
"cpu": "500m"
},
"limits": {
"memory": "2500Mi",
"cpu": "750m"
}
},
"volumeMounts": [
{
"name": "credentials",
"readOnly": true,
"mountPath": "/path/collection/keys"
}
],
"env": [
{
"name": "GOOGLE_APPLICATION_CREDENTIALS",
"value": "/path/collection/keys/key.json"
}
]
}
],
"volumes": [
{
"name": "credentials",
"secret": {
"secretName": "credentials"
}
}
]
}
}
}
}' --image=gcr.io/some-project/news/model-run:development
Now, after applying it on my Kubernetes Engine cluster v1.15.11-gke.13, here is the output of kubectl get pod X -o yaml:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
model-run-7bd8d79c7d-brmrw 1/1 Running 0 17s
$ kubectl get pod model-run-7bd8d79c7d-brmrw -o yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: model-run
    pod-template-hash: 7bd8d79c7d
    run: model-run
  name: model-run-7bd8d79c7d-brmrw
  namespace: default
spec:
  containers:
  - env:
    - name: GOOGLE_APPLICATION_CREDENTIALS
      value: /path/collection/keys/key.json
    image: nginx
    imagePullPolicy: Always
    name: model-run
    resources:
      limits:
        cpu: 750m
        memory: 2500Mi
      requests:
        cpu: 500m
        memory: 2Gi
    volumeMounts:
    - mountPath: /path/collection/keys
      name: credentials
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-tjn5t
      readOnly: true
  nodeName: gke-cluster-115-default-pool-abca4833-4jtx
  restartPolicy: Always
  volumes:
  - name: credentials
    secret:
      defaultMode: 420
      secretName: credentials
You can see that the resource limits and requests were set.
If you still have any questions, let me know in the comments!
It seems we cannot override limits through the --overrides flag.
What you can do is pass the requests and limits directly as flags on the kubectl command:
kubectl run model-run --image-pull-policy=Always --requests='cpu=500m,memory=2048Mi' --limits='cpu=750m,memory=2500Mi' --overrides='
{
"apiVersion": "apps/v1beta1",
"kind": "Deployment",
"metadata": {
"name": "model-run",
"labels": {
"app": "model-run"
}
},
"spec": {
"selector": {
"matchLabels": {
"app": "model-run"
}
},
"template": {
"metadata": {
"labels": {
"app": "model-run"
}
},
"spec": {
"containers": [
{
"name": "model-run",
"image": "gcr.io/some-project/news/model-run:development",
"imagePullPolicy": "Always",
"resouces": {
"requests": [
{
"memory": "2048Mi",
"cpu": "500m"
}
],
"limits": [
{
"memory": "2500Mi",
"cpu": "750m"
}
]
},
"volumeMounts": [
{
"name": "credentials",
"readOnly": true,
"mountPath":"/path/collection/keys"
}
],
"env":[
{
"name":"GOOGLE_APPLICATION_CREDENTIALS",
"value":"/path/collection/keys/key.json"
}
]
}
],
"volumes": [
{
"name": "credentials",
"secret": {
"secretName": "credentials"
}
}
]
}
}
}
}
' --image=gcr.io/some-project/news/model-run:development
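On clusters where kubectl run no longer creates Deployments (see the first answer), a roughly equivalent approach is to create the Deployment and then set the resources on it. This is only a sketch: it skips the volume, secret and env settings from the overrides above, which would still need to be added separately:
kubectl create deployment model-run --image=gcr.io/some-project/news/model-run:development
kubectl set resources deployment model-run -c model-run \
  --requests=cpu=500m,memory=2048Mi \
  --limits=cpu=750m,memory=2500Mi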

How to check what port a pod is listening on with kubectl, without looking at the Dockerfile?

I have a pod running and want to port-forward to it so I can access the pod from the internal network.
I don't know what port it is listening on, though; there is no Service yet.
I describe the pod:
$ kubectl describe pod queue-l7wck
Name: queue-l7wck
Namespace: default
Priority: 0
Node: minikube/192.168.64.3
Start Time: Wed, 18 Dec 2019 05:13:56 +0200
Labels: app=work-queue
chapter=jobs
component=queue
Annotations: <none>
Status: Running
IP: 172.17.0.2
IPs:
IP: 172.17.0.2
Controlled By: ReplicaSet/queue
Containers:
queue:
Container ID: docker://13780475170fa2c0d8e616ba1a3b1554d31f404cc0a597877e790cbf01838e63
Image: gcr.io/kuar-demo/kuard-amd64:blue
Image ID: docker-pullable://gcr.io/kuar-demo/kuard-amd64@sha256:1ecc9fb2c871302fdb57a25e0c076311b7b352b0a9246d442940ca8fb4efe229
Port: <none>
Host Port: <none>
State: Running
Started: Wed, 18 Dec 2019 05:14:02 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mbn5b (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-mbn5b:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mbn5b
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/queue-l7wck to minikube
Normal Pulling 31h kubelet, minikube Pulling image "gcr.io/kuar-demo/kuard-amd64:blue"
Normal Pulled 31h kubelet, minikube Successfully pulled image "gcr.io/kuar-demo/kuard-amd64:blue"
Normal Created 31h kubelet, minikube Created container queue
Normal Started 31h kubelet, minikube Started container queue
even the JSON has nothing:
$ kubectl get pods queue-l7wck -o json
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"creationTimestamp": "2019-12-18T03:13:56Z",
"generateName": "queue-",
"labels": {
"app": "work-queue",
"chapter": "jobs",
"component": "queue"
},
"name": "queue-l7wck",
"namespace": "default",
"ownerReferences": [
{
"apiVersion": "apps/v1",
"blockOwnerDeletion": true,
"controller": true,
"kind": "ReplicaSet",
"name": "queue",
"uid": "a9ec07f7-07a3-4462-9ac4-a72226f54556"
}
],
"resourceVersion": "375402",
"selfLink": "/api/v1/namespaces/default/pods/queue-l7wck",
"uid": "af43027d-8377-4227-b366-bcd4940b8709"
},
"spec": {
"containers": [
{
"image": "gcr.io/kuar-demo/kuard-amd64:blue",
"imagePullPolicy": "Always",
"name": "queue",
"resources": {},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"volumeMounts": [
{
"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
"name": "default-token-mbn5b",
"readOnly": true
}
]
}
],
"dnsPolicy": "ClusterFirst",
"enableServiceLinks": true,
"nodeName": "minikube",
"priority": 0,
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"serviceAccount": "default",
"serviceAccountName": "default",
"terminationGracePeriodSeconds": 30,
"tolerations": [
{
"effect": "NoExecute",
"key": "node.kubernetes.io/not-ready",
"operator": "Exists",
"tolerationSeconds": 300
},
{
"effect": "NoExecute",
"key": "node.kubernetes.io/unreachable",
"operator": "Exists",
"tolerationSeconds": 300
}
],
"volumes": [
{
"name": "default-token-mbn5b",
"secret": {
"defaultMode": 420,
"secretName": "default-token-mbn5b"
}
}
]
},
"status": {
"conditions": [
{
"lastProbeTime": null,
"lastTransitionTime": "2019-12-18T03:13:56Z",
"status": "True",
"type": "Initialized"
},
{
"lastProbeTime": null,
"lastTransitionTime": "2019-12-18T03:14:02Z",
"status": "True",
"type": "Ready"
},
{
"lastProbeTime": null,
"lastTransitionTime": "2019-12-18T03:14:02Z",
"status": "True",
"type": "ContainersReady"
},
{
"lastProbeTime": null,
"lastTransitionTime": "2019-12-18T03:13:56Z",
"status": "True",
"type": "PodScheduled"
}
],
"containerStatuses": [
{
"containerID": "docker://13780475170fa2c0d8e616ba1a3b1554d31f404cc0a597877e790cbf01838e63",
"image": "gcr.io/kuar-demo/kuard-amd64:blue",
"imageID": "docker-pullable://gcr.io/kuar-demo/kuard-amd64#sha256:1ecc9fb2c871302fdb57a25e0c076311b7b352b0a9246d442940ca8fb4efe229",
"lastState": {},
"name": "queue",
"ready": true,
"restartCount": 0,
"started": true,
"state": {
"running": {
"startedAt": "2019-12-18T03:14:02Z"
}
}
}
],
"hostIP": "192.168.64.3",
"phase": "Running",
"podIP": "172.17.0.2",
"podIPs": [
{
"ip": "172.17.0.2"
}
],
"qosClass": "BestEffort",
"startTime": "2019-12-18T03:13:56Z"
}
}
How do you check what port a pod is listening on with kubectl?
Update
If I exec into the pod and run netstat -tulpn as suggested in the comments I get:
$ kubectl exec -it queue-pfmq2 -- sh
~ $ netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 :::8080 :::* LISTEN 1/kuard
But this method is not using kubectl.
Your container image has a port declared during the build (looks like port 8080 in your case) using the EXPOSE instruction in the Dockerfile. Since the exposed port is baked into the image, k8s does not keep track of it, because k8s does not need to take any steps to open it.
Since k8s is not responsible for opening the port, you won't be able to find the listening port using kubectl or by checking the pod YAML.
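If you can pull the image locally, one way to see what was declared with EXPOSE at build time (a sketch; this needs a Docker daemon, not kubectl) is:
docker image inspect gcr.io/kuar-demo/kuard-amd64:blue --format '{{json .Config.ExposedPorts}}'
This prints the exposed ports recorded in the image config, if any were declared.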
Try combining kubectl with your Linux command to get the port the container is listening on:
kubectl exec <pod name here> -- netstat -tulpn
Further, you can pipe this result through grep to narrow down the findings if required, e.g.
kubectl exec <pod name here> -- netstat -tulpn | grep "search string"
Note: this will work only if your container's base image includes the netstat command, and per your Update section it seems it does.
The above solution is simply a smart combination of the commands you already used, in two parts: first exec into the container (interactively, with -it), then list the listening ports inside the container.
One answer suggested running netstat inside the container.
This only works if netstat is part of the container's image.
As an alternative, you can run netstat on the host, executing it inside the container's network namespace.
Get the container's process ID on the host (this is the application running inside the container). Then change to the container's network namespace (run as root on the host):
host# PS1='container# ' nsenter -t <PID> -n
The PS1 environment variable is modified to show a different prompt while you are in the container's network namespace.
Get the listening ports in the container:
container# netstat -na
....
container# exit
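To find the <PID> for nsenter, one option (a sketch, assuming Docker as the container runtime and using the container ID shown by kubectl describe pod above) is:
host# CID=13780475170fa
host# PID=$(docker inspect --format '{{.State.Pid}}' $CID)
host# nsenter -t $PID -n netstat -tulpn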
If whoever created the image added the right OpenShift label, then you can use the following command (unfortunately your image does not have the label):
skopeo inspect docker://image-url:tag | grep expose-service
e.g.
skopeo inspect docker://quay.io/redhattraining/loadtest:v1.0 | grep expose-service
output:
"io.openshift.expose-services": "8080:http"
So 8080 is the port exposed by the image
Hope this helps
Normally a container will be able to run curl, so you can use curl to check whether a port is open:
for port in 8080 50000 443 8443; do curl -I --connect-timeout 1 127.0.0.1:$port; done
This can be run with sh.

Google Cloud Container: Cannot connect to mongodb service

I created a mongodb replication controller and a mongo service. I tried to connect to it from a different mongo pod just to test the connection. But that does not work
root@mongo-test:/# mongo mongo-service/mydb
MongoDB shell version: 3.2.0
connecting to: mongo-service/mydb
2015-12-09T11:05:55.256+0000 E QUERY [thread1] Error: network error while attempting to run command 'isMaster' on host 'mongo-service:27017' :
connect@src/mongo/shell/mongo.js:226:14
@(connect):1:6
exception: connect failed
I am not sure what I have done wrong in the configuration; I may be missing something here.
kubectl get rc
CONTROLLER CONTAINER(S) IMAGE(S) SELECTOR REPLICAS AGE
mongo mongo mongo:latest name=mongo 1 9s
kubectl get pods
NAME READY STATUS RESTARTS AGE
mongo-6bnak 1/1 Running 0 1m
mongo-test 1/1 Running 0 21m
kubectl get services
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
kubernetes 10.119.240.1 <none> 443/TCP <none> 23h
mongo-service 10.119.254.202 <none> 27017/TCP name=mongo,role=mongo 1m
I configured the RC and Service with the following configs
mongo-rc
{
"metadata": {
"name": "mongo",
"labels": { "name": "mongo" }
},
"kind": "ReplicationController",
"apiVersion": "v1",
"spec": {
"replicas": 1,
"template": {
"metadata": {
"labels": { "name": "mongo" }
},
"spec": {
"volumes": [
{
"name": "mongo-disk",
"gcePersistentDisk": {
"pdName": "mongo-disk",
"fsType": "ext4"
}
}
],
"containers": [
{
"name": "mongo",
"image": "mongo:latest",
"ports": [{
"name":"mongo",
"containerPort": 27017
}],
"volumeMounts": [
{
"name": "mongo-disk",
"mountPath": "/data/db"
}
]
}
]
}
}
}
}
mongo-service:
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "mongo-service"
},
"spec": {
"ports": [
{
"port": 27017,
"targetPort": "mongo"
}
],
"selector": {
"name": "mongo",
"role": "mongo"
}
}
}
Almost a bit embarrassing.
The issue was that I used the selector "role" in the Service but did not define that label on the RC's pod template.
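For reference, either of the changes below lines them up (a sketch based on the manifests above; one of the two is enough).
Either drop the extra key from the Service selector:
"selector": { "name": "mongo" }
or add the matching label to the RC's pod template metadata:
"labels": { "name": "mongo", "role": "mongo" }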

Kubernetes endpoints throw `ServiceUnavailable`

I have a new Kubernetes cluster on AWS that was built using the kube-up script from v1.1.1. I can successfully access the Elasticsearch/Kibana/KubeUI/Grafana endpoints, but cannot access Heapster/KubeDNS/InfluxDB from my machine through the API proxy. I have seen some ancillary issues related to this on the K8S project, but no clear identification of what's going on. From what I can gather, everything is running fine, so I'm not sure what is wrong here. I'd really like to use the embedded monitoring of Grafana/Influx/Heapster, but the Grafana dashboard is just blank with a series error.
Kubernetes version
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.1", GitCommit:"92635e23dfafb2ddc828c8ac6c03c7a7205a84d8", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.1", GitCommit:"92635e23dfafb2ddc828c8ac6c03c7a7205a84d8", GitTreeState:"clean"}
Cluster-info
$ kubectl cluster-info
Kubernetes master is running at https://MASTER_IP
Elasticsearch is running at https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging
Heapster is running at https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/heapster
Kibana is running at https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/kibana-logging
KubeDNS is running at https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/kube-dns
KubeUI is running at https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/kube-ui
Grafana is running at https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
InfluxDB is running at https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb
Accessing influxDB from the API proxy URL above
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "no endpoints available for service \"monitoring-influxdb\"",
"reason": "ServiceUnavailable",
"code": 503
}
Endpoint details from the Host
$ curl http://localhost:8080/api/v1/namespaces/kube-system/endpoints/monitoring-influxdb
{
"kind": "Endpoints",
"apiVersion": "v1",
"metadata": {
"name": "monitoring-influxdb",
"namespace": "kube-system",
"selfLink": "/api/v1/namespaces/kube-system/endpoints/monitoring-influxdb",
"uid": "2f75b259-8a22-11e5-b248-028ff74b9b1b",
"resourceVersion": "131",
"creationTimestamp": "2015-11-13T16:18:33Z",
"labels": {
"kubernetes.io/cluster-service": "true",
"kubernetes.io/name": "InfluxDB"
}
},
"subsets": [
{
"addresses": [
{
"ip": "10.244.1.4",
"targetRef": {
"kind": "Pod",
"namespace": "kube-system",
"name": "monitoring-influxdb-grafana-v2-n6jx1",
"uid": "2f31ed90-8a22-11e5-b248-028ff74b9b1b",
"resourceVersion": "127"
}
}
],
"ports": [
{
"name": "http",
"port": 8083,
"protocol": "TCP"
},
{
"name": "api",
"port": 8086,
"protocol": "TCP"
}
]
}
]
}
Querying the service from the Host
$ curl -IL 10.244.1.4:8083
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 13751
Content-Type: text/html; charset=utf-8
Last-Modified: Fri, 14 Nov 2014 21:55:58 GMT
Date: Tue, 17 Nov 2015 21:31:48 GMT
Monitoring-InfluxDB Service
$ curl http://localhost:8080/api/v1/namespaces/kube-system/services/monitoring-influxdb
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "monitoring-influxdb",
"namespace": "kube-system",
"selfLink": "/api/v1/namespaces/kube-system/services/monitoring-influxdb",
"uid": "2f715831-8a22-11e5-b248-028ff74b9b1b",
"resourceVersion": "60",
"creationTimestamp": "2015-11-13T16:18:33Z",
"labels": {
"kubernetes.io/cluster-service": "true",
"kubernetes.io/name": "InfluxDB"
}
},
"spec": {
"ports": [
{
"name": "http",
"protocol": "TCP",
"port": 8083,
"targetPort": 8083
},
{
"name": "api",
"protocol": "TCP",
"port": 8086,
"targetPort": 8086
}
],
"selector": {
"k8s-app": "influxGrafana"
},
"clusterIP": "10.0.35.241",
"type": "ClusterIP",
"sessionAffinity": "None"
},
"status": {
"loadBalancer": {}
}
}
Pod Details
$ kubectl describe pod --namespace=kube-system monitoring-influxdb-grafana-v2-n6jx
Name: monitoring-influxdb-grafana-v2-n6jx1
Namespace: kube-system
Image(s): gcr.io/google_containers/heapster_influxdb:v0.4,beta.gcr.io/google_containers/heapster_grafana:v2.1.1
Node: ip-172-20-0-44.us-west-2.compute.internal/172.20.0.44
Start Time: Fri, 13 Nov 2015 08:21:36 -0800
Labels: k8s-app=influxGrafana,kubernetes.io/cluster-service=true,version=v2
Status: Running
Reason:
Message:
IP: 10.244.1.4
Replication Controllers: monitoring-influxdb-grafana-v2 (1/1 replicas created)
Containers:
influxdb:
Container ID: docker://564724318ca81d33d6079978d24f78b3c6ff8eb08a9023c845e250eeb888aafd
Image: gcr.io/google_containers/heapster_influxdb:v0.4
Image ID: docker://8b8118c488e431cc43e7ff9060968d88402cc6c38a6390c4221352403aa7ac1b
QoS Tier:
memory: Guaranteed
cpu: Guaranteed
Limits:
memory: 200Mi
cpu: 100m
Requests:
memory: 200Mi
cpu: 100m
State: Running
Started: Fri, 13 Nov 2015 08:22:55 -0800
Ready: True
Restart Count: 0
Environment Variables:
grafana:
Container ID: docker://518dea564a0ee014345e9006da6113fb6584ff1ebc6d0cc9609a608abc995f45
Image: beta.gcr.io/google_containers/heapster_grafana:v2.1.1
Image ID: docker://200e77ba156a5a86879e49667b97afe84dca42b5bb67ab1e06217e6a19c5a6a6
QoS Tier:
cpu: Guaranteed
memory: Guaranteed
Limits:
memory: 100Mi
cpu: 100m
Requests:
cpu: 100m
memory: 100Mi
State: Running
Started: Fri, 13 Nov 2015 08:22:35 -0800
Ready: True
Restart Count: 0
Environment Variables:
INFLUXDB_SERVICE_URL: http://monitoring-influxdb:8086
GF_AUTH_BASIC_ENABLED: false
GF_AUTH_ANONYMOUS_ENABLED: true
GF_AUTH_ANONYMOUS_ORG_ROLE: Admin
GF_SERVER_ROOT_URL: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
Conditions:
Type Status
Ready True
Volumes:
influxdb-persistent-storage:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
grafana-persistent-storage:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-bo89c:
Type: Secret (a secret that should populate this volume)
SecretName: default-token-bo89c
No events.
Unfortunately those URLs are incomplete. Influx's ports are named, so you need to say which port you want.
https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:http or https://MASTER_IP/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:api
There's a bug open to give better errors for this.
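For example, with kubectl proxy running locally on its default port 8001 (a sketch; the :http suffix is the named port 8083 from the Service shown above, and :api would select 8086):
$ kubectl proxy &
$ curl http://localhost:8001/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:http/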