Grafana & Loki agents not deployed on tainted nodes - Kubernetes

We are running our workloads on AKS with two node pools:
1. System-Node-Pool: where all system pods run
2. Apps-Node-Pool: where our actual workloads/apps run
Our Apps-Node-Pool is tainted, whereas the System-Node-Pool isn't. I deployed the Loki-Grafana stack for monitoring and log analysis, using the Helm command below:
helm upgrade --install loki grafana/loki-stack \
  --set grafana.enabled=true \
  --set prometheus.enabled=true \
  --set prometheus.alertmanager.persistentVolume.enabled=false \
  --set prometheus.server.persistentVolume.enabled=false \
  --set loki.persistence.enabled=true \
  --set loki.persistence.storageClassName=standard \
  --set loki.persistence.size=5Gi
Since no toleration is added in the Helm command (or in values.yaml), all of the Grafana and Loki pods get scheduled on the System-Node-Pool. The problem is that because the necessary agents (for example, the Promtail pods) aren't running on the Apps-Node-Pool, I can't check the logs of my app pods.
Because the taint exists only on the Apps-Node-Pool, adding a toleration to the Helm release would allow the monitoring pods to be scheduled there, but it still wouldn't guarantee it: a toleration only permits scheduling onto the tainted nodes, and the pods could still land on the System-Node-Pool since it has no taint.
So, given this cluster layout, what can I do to make sure the agent pods also run on the tainted nodes?

In my case the requirement was to run the Promtail pods on the Apps-Node-Pool. No toleration was set on the Promtail pods, so I added one, and the Promtail pods were then successfully scheduled on the Apps-Node-Pool.
However, adding a toleration alone doesn't guarantee that the Promtail pods end up on the Apps-Node-Pool, because in my case the System-Node-Pool has no taint.
In this situation you can combine node affinity with the toleration to pin the pods to a specific node pool, as in the sketch below.
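Below is a minimal values.yaml sketch of that combination for the loki-stack chart. The taint key/value (workload=apps:NoSchedule) and the node pool name (appspool) are assumptions, so replace them with whatever your Apps-Node-Pool actually uses, and check your promtail chart version's values.yaml to confirm it exposes tolerations and affinity.

# promtail-values.yaml -- taint and label values below are assumed, adjust to your cluster
promtail:
  tolerations:
    - key: "workload"            # taint key on Apps-Node-Pool (assumed)
      operator: "Equal"
      value: "apps"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: agentpool   # AKS labels each node with its pool name
                operator: In
                values:
                  - appspool     # assumed Apps-Node-Pool name

Pass it with -f promtail-values.yaml on the helm upgrade command above. Keep in mind that pinning Promtail to the Apps-Node-Pool with node affinity means it will no longer collect logs from the System-Node-Pool; if you want logs from every node, the toleration alone (so the DaemonSet can cover both pools) is usually enough.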

Related

Pod is not visible after sometime

I deployed a pod in a Kubernetes cluster. The deployment succeeds and I can see my pod running, but after some time the pod is missing from the list of workloads. Why is this so?

Fail to upgrade operator in K8s

I'm writing an operator with operator-sdk, and I create a StatefulSet from the operator using the Kubernetes API like this:
r.client.Create(context.TODO(), statefulset)
This works correctly and the StatefulSet pod is created. But now I want to upgrade the operator already running in the cluster so that I can add a command to the pod, like:
Containers: []corev1.Container{{
    Command: []string{.....},
}},
First I build the new operator image and delete the operator pod in the cluster. Kubernetes quickly restarts the operator with the newer image (kubectl describe pod myoperator shows the newer image is used).
Second, I delete the StatefulSet pod, and Kubernetes also restarts it within seconds.
But the restarted StatefulSet pod doesn't contain the command I added in the operator (kubectl describe pod statefulsetpod). If I delete all the resources in the cluster and redeploy them, it works.
The operator creates a lot of resources, so I don't want to redeploy everything.
You should delete the StatefulSet itself instead of the StatefulSet's pod. The problem is that when you delete only the pod, a new pod is automatically created from the old StatefulSet spec.
Once you delete the StatefulSet and let it be recreated, the properly updated pods are scheduled as expected.
You could also add logic to the operator that patches the already existing StatefulSet; that would avoid having to delete and recreate it each time.

Kubernetes helm waiting before killing the old pods during helm deployment

I have a "big" micro-service (website) with 3 pods deployed with Helm Chart in production env, but when I deploy a new version of the Helm chart, during 40 seconds (time to start my big microservice) I have a problem with the website (503 Service Unavailable)
So, I look at a solution to tell to kubernetes do not kill the old pod before the complete start of the new version
I tried the --wait --timeout but it did not work for me.
My EKS version : "v1.14.6-eks-5047ed"
Without more details about the Pods, I'd suggest:
Use a Deployment (if you aren't already) so that the Pods are managed by a ReplicaSet, which supports rolling updates, and combine that with a configured startup probe (on Kubernetes v1.16+) or readiness probe so that Kubernetes knows when the new Pods are ready to take traffic (a Pod is considered ready only when all of its containers are ready). During a rolling update the old Pods keep serving until enough new Pods report ready; a sketch of such a Deployment follows.
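As a rough sketch of that suggestion (the name, image, port and probe path below are placeholders, not taken from the question):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: website                        # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: website
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0                # never remove an old pod before a new one is ready
      maxSurge: 1
  template:
    metadata:
      labels:
        app: website
    spec:
      containers:
        - name: website
          image: registry.example.com/website:1.2.3   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:              # the pod receives traffic only once this succeeds
            httpGet:
              path: /healthz           # placeholder health endpoint
              port: 8080
            initialDelaySeconds: 30    # the app needs ~40s to start, so probe late
            periodSeconds: 5

With maxUnavailable: 0, Kubernetes keeps the old pods serving until the surged new pod reports ready, so the 40-second startup window is covered by the previous version instead of producing 503s.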

Difference between daemonsets and deployments

In Kelsey Hightower's Kubernetes Up and Running, he gives two commands:
kubectl get daemonSets --namespace=kube-system kube-proxy
and
kubectl get deployments --namespace=kube-system kube-dns
Why does one use daemonSets and the other deployments?
And what's the difference?
Kubernetes Deployments manage stateless services running on your cluster (as opposed to, for example, StatefulSets, which manage stateful services). Their purpose is to keep a set of identical pods running and to upgrade them in a controlled way: you define how many replicas (pods) of your app you want in the Deployment definition, and Kubernetes spreads that many replicas of your application over the nodes. If you ask for 5 replicas over 3 nodes, some nodes will run more than one replica of your app.
DaemonSets also manage groups of replicated pods, but they adhere to a one-pod-per-node model, either across the entire cluster or a subset of nodes; a DaemonSet never runs more than one replica per node. Another advantage of a DaemonSet is that if you add a node to the cluster, the DaemonSet automatically spawns a pod on that node, which a Deployment will not do.
DaemonSets are useful for ongoing background tasks that need to run on all or certain nodes and that do not require user intervention, such as storage daemons like Ceph, log collection daemons like fluentd, and node monitoring daemons like collectd.
Let's take the example from your question: why is kube-dns a Deployment and kube-proxy a DaemonSet?
kube-proxy is needed on every node in the cluster to program iptables rules, so that every node can reach every pod no matter which node it runs on. Because kube-proxy is a DaemonSet, when another node is added to the cluster later, kube-proxy is automatically spawned on it.
kube-dns's job is to resolve a service name to its IP, and a single replica (or a few, for availability) is enough for that, so kube-dns is a Deployment: it doesn't need to run on every node. The sketch below contrasts the two kinds.
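For contrast, here is a minimal, illustrative DaemonSet in the spirit of a fluentd-style log collector (the image and names are placeholders, not a production configuration):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      tolerations:
        - key: node-role.kubernetes.io/master   # also run on control-plane nodes
          effect: NoSchedule
      containers:
        - name: fluentd
          image: fluent/fluentd:v1.16            # placeholder image/tag
          volumeMounts:
            - name: varlog
              mountPath: /var/log
      volumes:
        - name: varlog
          hostPath:
            path: /var/log

Note that there is no replicas field: the DaemonSet controller creates exactly one pod per matching node and adds a pod automatically whenever a new node joins the cluster, whereas a Deployment's replica count is independent of the number of nodes.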

Can't shut down influxDB in Kubernetes

I have spun up a Kubernetes cluster in AWS using the official "kube-up" mechanism. By default, an addon that monitors the cluster and logs to InfluxDB is created. It has been noted in this post that InfluxDB quickly fills up disk space on nodes, and I am seeing this same issue.
The problem is, when I try to kill the InfluxDB replication controller and service, it "magically" comes back after a time. I do this:
kubectl delete rc --namespace=kube-system monitoring-influx-grafana-v1
kubectl delete service --namespace=kube-system monitoring-influxdb
kubectl delete service --namespace=kube-system monitoring-grafana
Then if I say:
kubectl get pods --namespace=kube-system
I do not see the pods running anymore. However after some amount of time (minutes to hours), the replication controllers, services, and pods are back. I don't know what is restarting them. I would like to kill them permanently.
You probably need to remove the manifest files for influxdb from the /etc/kubernetes/addons/ directory on your "master" host. Many of the kube-up.sh implementations use a service (usually at /etc/kubernetes/kube-master-addons.sh) that runs periodically and makes sure that all the manifests in /etc/kubernetes/addons/ are active.
You can also restart your cluster, but run export ENABLE_CLUSTER_MONITORING=none before running kube-up.sh. You can see the other environment settings that affect the cluster kube-up.sh builds in cluster/aws/config-default.sh.