Spring Cloud Data Flow server for Kubernetes: liveness timeout values

I am using SCDF for Kubernetes to deploy streams. Some of the Kubernetes pods deployed by the SCDF server are constantly restarting because the livenessProbe initialDelaySeconds of 10s is too short:
#> kubectl get pods
NAME READY STATUS RESTARTS AGE
datapipeline-confirmation-0-g261e 0/1 CrashLoopBackOff 37 2h
As these pods are created by SCDF, I cannot figure out how to tell the SCDF server to use larger timeout values when deploying pods. I have tried just about everything short of diving into the SCDF Java source code and asking on StackExchange. Thanks in advance!

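You could try to patch the probe settings directly. The probe fields live under spec.containers and cannot be changed on an existing pod, so the patch has to target the controller that owns the pod (a Deployment in this example, which is an assumption about how SCDF created it),
something like:
kubectl patch deployment <deployment name> -p '{"spec": {"template": {"spec": {"containers": [{"name": "<container name>", "livenessProbe": {"initialDelaySeconds": 60}}]}}}}'
assuming you have the credentials to do so. Otherwise you might need to ask SCDF to fix their problem.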
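Before and after changing anything, it can help to read the probe values the pod was actually created with. A minimal sketch, assuming the probe is defined on the first container of the pod; probe_settings is a made-up helper name, while the jsonpath fields are the standard pod spec ones:

```shell
#!/usr/bin/env bash
# Print "<initialDelaySeconds><TAB><timeoutSeconds>" of the liveness probe
# defined on the first container of a pod.
# probe_settings is a hypothetical helper name used only for illustration.
probe_settings() {
  local pod="$1"
  kubectl get pod "$pod" -o jsonpath='{.spec.containers[0].livenessProbe.initialDelaySeconds}{"\t"}{.spec.containers[0].livenessProbe.timeoutSeconds}{"\n"}'
}
```

Usage would be probe_settings datapipeline-confirmation-0-g261e. If a field is not set in the pod spec, the corresponding column is simply empty.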

Related

How do I know why my SonarQube helm chart is getting auto-killed by Kubernetes

This question is about logging/monitoring.
I'm running a 3 node cluster on AKS, with 3 orgs, Dev, Test and Prod. The chart worked fine in Dev, but the same chart keeps getting killed by Kubernetes in Test, and it keeps getting recreated, and re-killed. Is there a way to extract details on why this is happening? All I see when I describe the pod is Reason: Killed
Please give me more details on this or some suggestions. Thanks!
List the events sorted by timestamp:
kubectl get events --sort-by=.metadata.creationTimestamp
There might be various reasons for it to be killed, e.g. insufficient resources or a failed liveness probe.
The SonarQube chart configures a liveness and a readiness probe, so one of them might be failing. Also, as described in the Helm chart's values:
If an ingress path other than the root (/) is defined, it should be reflected here
A trailing "/" must be included
You can also check whether there are sufficient resources on the node:
check which node the pods are running on with kubectl get pods -o wide, and
then run kubectl describe node <node-name> to check whether there is any
disk or memory pressure.
You can also run kubectl logs <pod-name> and kubectl describe pod <pod-name>, which might give you some insight into the kill reason.
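The checks above can be narrowed down by asking the API server why the container was last terminated. This is only a sketch: last_kill_reason and explain_kill_reason are made-up helper names, the hints are rough rules of thumb rather than a complete diagnosis, and the jsonpath only returns something once the pod has been restarted at least once.

```shell
#!/usr/bin/env bash
# Print the reason the containers of a pod were last terminated
# (for example OOMKilled or Error). Hypothetical helper name.
last_kill_reason() {
  local pod="$1" namespace="${2:-default}"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Map a termination reason to a hint about where to look next.
explain_kill_reason() {
  case "$1" in
    OOMKilled)
      echo "memory limit exceeded: check resources.limits.memory and node memory pressure" ;;
    Error|Completed)
      echo "container exited or was restarted after a failed probe: check 'kubectl logs --previous' and the pod events" ;;
    "")
      echo "no previous termination recorded: check 'kubectl get events' for evictions or probe failures" ;;
    *)
      echo "reason '$1': check 'kubectl describe pod' for details" ;;
  esac
}
```

For example: explain_kill_reason "$(last_kill_reason <pod-name> <namespace>)".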

AKS - incorrect Pod Status

I have an AKS Cluster with two nodepools. Node pool 1 has 3 nodes, and nodepool 2 has 1 node - all Linux VMs. I noticed that after stopping the VMs and then doing kubectl get pods, the Pods status shows "running" though the VMs are not actually running. How is this possible?
This is the command I tried: kubectl get pods -n development -o=wide
Though the VMs are not running, the Pod status shows "running". However, trying to access the app using the public IP of the service resulted in
ERR_CONNECTION_TIMED_OUT
Here is a full thread (https://github.com/kubernetes/kubernetes/issues/55713) on this issue. The problem here is that, by default, a pod waits 5 minutes before being evicted to another node when its current node becomes NotReady, but in this case none of the worker nodes are ready, and hence the pods are not getting evicted. Refer to the GitHub issue; there are some suggestions and solutions provided.
What is actually going on is that the kubelet processes running on the nodes cannot report their status to the Kubernetes API server. Kubernetes keeps assuming that your pods are running while the nodes associated with them are offline. Because all nodes are offline, your pods are in fact not running and hence not accessible, which causes the ERR_CONNECTION_TIMED_OUT.
You can run kubectl get nodes to get the status of the nodes; they should show NotReady. Please check and let me know.
Also, can you please provide the output for kubectl get pods -A
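To make the node check quick to repeat, the output of kubectl get nodes can be filtered for anything that is not plainly Ready. A small sketch; not_ready_nodes is a made-up name, and it deliberately also reports combined states such as Ready,SchedulingDisabled:

```shell
#!/usr/bin/env bash
# Read `kubectl get nodes --no-headers` output on stdin and print the names of
# all nodes whose STATUS column (second column) is not exactly "Ready".
# not_ready_nodes is a hypothetical helper name.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}
```

Usage: kubectl get nodes --no-headers | not_ready_nodes. With all VMs stopped, every node should be listed.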

Telemetry mixer logs

I deployed Istio 1.2.5 on a K8s cluster.
According to the documentation (https://istio.io/faq/mixer/), in the rules section, running:
kubectl get rules --all-namespaces
should return the list of rules. In my cluster I got No resources found.
But if I use:
kubectl get rules.config.istio.io -n istio-system
I got the list:
NAME AGE
kubeattrgenrulerule 5h
promhttp 5h
promtcp 5h
promtcpconnectionclosed 5h
promtcpconnectionopen 5h
stdio 5h
stdiotcp 5h
Does anyone know the difference?
Also, if I try:
kubectl -n istio-system logs -f istio-telemetry-7df96d454b-4kxs9 -c mixer
I don't get the request logs in the output (I found it works in another cluster). Do you know why?
I tried to reproduce your issue on both Istio 1.2.5 and Istio 1.3.0, in environments such as GKE, Minikube and Kubeadm.
I tried installing it both manually and using Helm. Each time everything worked as it should.
Based on the information you have provided (it works in another cluster, and you are using bare metal), I would guess that this cluster has some specific configuration or that some of the Kubernetes/Istio objects have insufficient resources. You can check with:
$ kubectl describe node [node-name]
Please keep in mind that you might have installed an Istio configuration profile that requests too many resources. Each profile requests a different amount of resources for each component (citadel, egress, galley, pilot, telemetry, etc.). For example, the Istio docs state:
The Envoy proxy uses 0.6 vCPU and 50 MB memory per 1000 requests per second going through the proxy.
The istio-telemetry service uses 0.6 vCPU per 1000 mesh-wide requests per second.
Pilot uses 1 vCPU and 1.5 GB of memory.

kubectl get pod status always ContainerCreating

k8s version: 1.12.1
I created a pod with the API on a node and allocated an IP (through flanneld). When I used the kubectl describe pod command, I could not get the pod IP, and there was no such IP in etcd storage.
It was only a few minutes later that the IP could be obtained, and then the kubectl get pod STATUS was Running.
Has anyone ever encountered this problem?
As MatthiasSommer mentioned in a comment, the process of creating a pod might take a while.
If the pod stays in ContainerCreating status for a long time, you can check what is stopping it from changing to Running with the command:
kubectl describe pod <pod_name>
Why might creating a pod take a long time?
Depending on what is included in the manifest, a pod can share namespaces, storage volumes, secrets, assigned resources, configmaps, etc.
kube-apiserver validates and configures data for API objects.
kube-scheduler needs to check and collect resource requirements, constraints, etc. and assign the pod to a node.
kubelet runs on each node and ensures that all containers fulfill the pod specification and are healthy.
kube-proxy also runs on each node and is responsible for the pod's networking.
As you can see, there are many requests, validations and syncs, so it takes a while to create a pod that fulfills all requirements.
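To see at a glance which pods are still waiting, the pod list can be filtered on the STATUS column. A minimal sketch; creating_pods is a made-up name, and the column position assumes the --all-namespaces layout (NAMESPACE NAME READY STATUS RESTARTS AGE):

```shell
#!/usr/bin/env bash
# Read `kubectl get pods --all-namespaces --no-headers` output on stdin and
# print "namespace/name" for every pod whose STATUS is ContainerCreating.
# creating_pods is a hypothetical helper name.
creating_pods() {
  awk '$4 == "ContainerCreating" { print $1 "/" $2 }'
}
```

Usage: kubectl get pods --all-namespaces --no-headers | creating_pods, then run kubectl describe pod on whatever it prints.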

Minikube got stuck when creating container

I recently started learning Kubernetes by using Minikube locally on my Mac. Previously, I was able to start a local Kubernetes cluster with Minikube 0.10.0, create a deployment and view the Kubernetes dashboard.
Yesterday I deleted the cluster and tried to redo everything from scratch. However, I found that I cannot get the assets deployed and cannot view the dashboard. From what I saw, everything seemed to get stuck during container creation.
After I ran minikube start, it reported
Starting local Kubernetes cluster...
Kubectl is now configured to use the cluster.
When I ran kubectl get pods --all-namespaces, it reported (pay attention to the STATUS column):
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-addon-manager-minikube 0/1 ContainerCreating 0 51s
docker ps showed nothing:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
minikube status tells me the VM and cluster are running:
minikubeVM: Running
localkube: Running
If I tried to create a deployment and an autoscaler, I was told they were created successfully:
kubectl create -f configs
deployment "hello-minikube" created
horizontalpodautoscaler "hello-minikube-autoscaler" created
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default hello-minikube-661011369-1pgey 0/1 ContainerCreating 0 1m
default hello-minikube-661011369-91iyw 0/1 ContainerCreating 0 1m
kube-system kube-addon-manager-minikube 0/1 ContainerCreating 0 21m
When exposing the service, it said:
$ kubectl expose deployment hello-minikube --type=NodePort
service "hello-minikube" exposed
$ kubectl get service
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-minikube 10.0.0.32 <nodes> 8080/TCP 6s
kubernetes 10.0.0.1 <none> 443/TCP 22m
When I tried to access the service, I was told:
curl $(minikube service hello-minikube --url)
Waiting, endpoint for service is not ready yet...
docker ps still showed nothing. It looked to me like everything got stuck when creating a container. I tried some other ways to work around this issue:
Upgraded to minikube 0.11.0
Used the xhyve driver instead of the VirtualBox driver
Deleted everything cached, like ~/.minikube, ~/.kube, and the cluster, and retried
None of them worked for me.
Kubernetes is still new to me and I would like to know:
How can I troubleshoot this kind of issue?
What could be the cause of this issue?
Any help is appreciated. Thanks.
It turned out to be a network problem in my case.
The pod status is "ContainerCreating", and I found that during container creation the Docker image is pulled from gcr.io, which is inaccessible in China (blocked by the GFW). It worked for me previously because I happened to be connected through a VPN.
I haven't tried minikube, but I do use Kubernetes. With the information provided it is difficult to say what causes the issue. Your minikube has no problem creating resources, but ContainerCreating points to a problem with the Docker daemon, improper communication between kube-api and the Docker daemon, or some problem with the kubelet.
You can try the following command:
kubectl describe po POD_NAME
This will give you the pod's events. Maybe this will provide a path to the root cause of the issue.
You may also check the kubelet logs to get the events.
I had this problem on Windows, but it was related to an NTLM proxy. I deleted the minikube VM then recreated it with the correct proxy settings for my CNTLM installation:
minikube start \
--docker-env http_proxy=http://10.0.2.2:3128 \
--docker-env https_proxy=http://10.0.2.2:3128 \
--docker-env no_proxy=localhost,127.0.0.1,::1,192.168.99.100
See https://blog.alexellis.io/minikube-behind-proxy/
The horizontalpodautoscaler (hpa) requires heapster to work. You'll need to run heapster in minikube for that to work. You can always debug these kinds of issues with minikube logs or interactively through the dashboard found at minikube dashboard.
You can find the steps to run heapster and grafana at https://github.com/kubernetes/heapster
For me, it takes several minutes before I see the ContainerCreating problem. After executing the following command:
systemctl status kube-controller-manager.service
I get this error:
Sync "default/redis-master-2229813293" failed with unable to create pods: No API token found for service account "default", retry after the token is automatically created and added to the service account.
There are two ways to solve this:
Set up the service account with a token
Remove ServiceAccount from the KUBE_ADMISSION_CONTROL setting of the API server