I deployed Prometheus in Kubernetes following this manual.
The storage scheme I devised is:
Prometheus inside Kubernetes stores metrics for 24 hours.
Prometheus outside Kubernetes stores metrics for 1 week.
Federation is set up between them.
Has anyone run into this: after the pods are removed, metrics go missing after a certain period of time (much less than 24 hours)?
This is perfectly normal if you do not have a persistent storage configured for your prometheus pod. You should use PV/PVC to define a stable place where you keep your prometheus data, otherwise if your pod is recreated, it starts with a clean slate.
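As a sketch, a minimal claim for this might look like the following (the name, size, and storage class are illustrative; use whatever your cluster provides):

```yaml
# Illustrative PVC; mount it in the Prometheus pod spec via a
# persistentVolumeClaim volume instead of an emptyDir.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: prometheus-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```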
PV/PVC needs dedicated storage servers in the cluster. If there is no money for storage servers, here is a cheaper approach:
Label a node:
$ kubectl label nodes <node name> prometheus=yes
Force all the prometheus pods to be created on the same labeled node by using nodeSelector:
nodeSelector:
  prometheus: "yes"
Note that the label value must be quoted here: unquoted yes is parsed as a YAML boolean, not the string the label actually carries.
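You can verify the label took effect before relying on the nodeSelector:

```shell
# List the nodes carrying the prometheus=yes label applied above
kubectl get nodes -l prometheus=yes
```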
Create an emptyDir volume for each prometheus pod. An emptyDir volume is first created when the Prometheus pod is assigned to the labeled node and exists as long as that pod is running on that node. It is safe across container crashes and restarts, but note that its contents are lost if the Pod itself is deleted or recreated.
spec:
  containers:
  - image: <prometheus image>
    name: <prometheus pod name>
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir: {}
This approach makes all the Prometheus pods run on the same node, with node-local storage for the metrics - a cheaper approach that relies on the Prometheus node itself not crashing.
Below is a Kubernetes Pod definition:
apiVersion: v1
kind: Pod
metadata:
  name: static-web
  labels:
    role: myrole
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - name: web
      containerPort: 80
      protocol: TCP
As I have not specified any resources, how much memory and CPU will be allocated? Is there a kubectl command to find out what is allocated for the Pod?
If resources are not specified for the Pod, it can be scheduled onto any node; resources are not considered when choosing a node.
The Pod might be terminated if it uses more memory than is available, or get little CPU time, since Pods with specified resources are prioritized. It is good practice to set resources for your Pods.
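As a starting point, a resources stanza for the container from the question might look like this (the values are illustrative and should be tuned to your workload):

```yaml
containers:
- name: web
  image: nginx
  resources:
    requests:
      memory: "64Mi"
      cpu: "250m"
    limits:
      memory: "128Mi"
      cpu: "500m"
```

With requests and limits set, the Pod is classified as Burstable (or Guaranteed, if requests equal limits for every container) instead of BestEffort.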
See Configure Quality of Service for Pods - your Pod will be classified as "Best Effort":
For a Pod to be given a QoS class of BestEffort, the Containers in the Pod must not have any memory or CPU limits or requests.
In your case, Kubernetes will assign the QoS class BestEffort to your pod.
That means kube-scheduler has no resource information to go on when scheduling your pod and just does its best.
It also means your pod can consume any amount of resources (CPU/memory) it wants, but the kubelet will evict it if the node runs out.
To see the resource usage of your pod, you can use kubectl top pod <pod-name> (this requires metrics-server to be running in the cluster).
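You can also confirm the assigned QoS class directly from the Pod's status (pod name taken from the question):

```shell
# Print the QoS class recorded in the Pod status (e.g. BestEffort)
kubectl get pod static-web -o jsonpath='{.status.qosClass}'
# Show current CPU/memory usage; requires metrics-server
kubectl top pod static-web
```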
Is it possible for a pod/deployment/statefulset to be moved to another node, or recreated on another node automatically, if the first node fails? The pod in question is set to 1 replica. So is it possible to configure some sort of failover for Kubernetes pods? I've tried pod affinity settings, but nothing is moved automatically; it has been around 10 minutes.
The YAML for the pod in question is below:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-rbd-sc-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: ceph-rbd-sc
---
apiVersion: v1
kind: Pod
metadata:
  name: ceph-rbd-pod-pvc-sc
  labels:
    app: ceph-rbd-pod-pvc-sc
spec:
  containers:
  - name: ceph-rbd-pod-pvc-sc
    image: busybox
    command: ["sleep", "infinity"]
    volumeMounts:
    - mountPath: /mnt/ceph_rbd
      name: volume
  nodeSelector:
    etiket: worker
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: ceph-rbd-sc-pvc
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            name: ceph-rbd-pod-pvc-sc
        topologyKey: "kubernetes.io/hostname"
Edit:
I managed to get it to work. But now I have another problem: the newly created pod on the other node is stuck in ContainerCreating, and the old pod is stuck in Terminating. I also get a Multi-Attach error for the volume, stating that the PV is still in use by the old pod. The situation is the same for any deployment/statefulset with a PV attached; the problem resolves only when the failed node comes back online. Is there a solution for this?
The answer from coderanger remains valid regarding Pods. Answering your last edit:
Your issue is with CSI.
When your Pod uses a PersistentVolume whose accessModes is RWO, and the Node hosting your Pod becomes unreachable, prompting the Kubernetes scheduler to terminate the current Pod and create a new one on another Node, your PersistentVolume cannot be attached to the new Node.
The reason for this is that CSI introduced some kind of "lease", marking a volume as bound.
With previous CSI specs & implementations, this lock was not visible in terms of the Kubernetes API. If your ceph-csi deployment is recent enough, you should find a corresponding VolumeAttachment object that can be deleted to fix your issue:
# kubectl get volumeattachments -n ci
NAME ATTACHER PV NODE ATTACHED AGE
csi-194d3cfefe24d5f22616fabd3d2fb2ce5f79b16bdca75088476c2902e7751794 rbd.csi.ceph.com pvc-902c3925-11e2-4f7f-aac0-59b1edc5acf4 melpomene.xxx.com true 14d
csi-24847171efa99218448afac58918b6e0bb7b111d4d4497166ff2c4e37f18f047 rbd.csi.ceph.com pvc-b37722f7-0176-412f-b6dc-08900e4b210d clio.xxx.com true 90d
....
kubectl delete -n ci volumeattachment csi-xxxyyyzzz
Those VolumeAttachments are created by your CSI provisioner, before the device mapper attaches a volume.
They are deleted only once the corresponding PV has been released from a given Node, according to its device mapper - which needs to be running, with the kubelet up and the Node marked Ready according to the API. Until then, other Nodes can't map the volume. There is no timeout: should a Node become unreachable due to network issues or an abrupt shutdown/force-off/reset, its RWO PVs are stuck.
See: https://github.com/ceph/ceph-csi/issues/740
One workaround for this would be not to use CSI and instead stick with legacy StorageClasses, in your case installing rbd on your nodes.
Though last I checked (k8s 1.19.x) I couldn't manage to get that working, and I can't recall what was wrong. CSI tends to be "the way" to do it nowadays - despite not being suitable for production use, sadly, unless you run in an IaaS with auto-scaling groups that delete Nodes from the Kubernetes API (eventually evicting the corresponding VolumeAttachments), or use some kind of MachineHealthChecks like OpenShift 4 implements.
A bare Pod is a single immutable object. It doesn't have any of these nice things. Related: never ever use bare Pods for anything. If you try this with a Deployment you should see it spawn a new one to get back to the requested number of replicas. If the new Pod is Unschedulable you should see events emitted explaining why. For example if only node 1 matches the nodeSelector you specified, or if another Pod is already running on the other node which triggers the anti-affinity.
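As a sketch, wrapping the Pod from the question in a Deployment might look like this (only the wrapper changes; the Pod spec itself is carried over unchanged, minus the nodeSelector and anti-affinity so the scheduler is free to pick another node):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ceph-rbd-pod-pvc-sc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ceph-rbd-pod-pvc-sc
  template:
    metadata:
      labels:
        app: ceph-rbd-pod-pvc-sc
    spec:
      containers:
      - name: ceph-rbd-pod-pvc-sc
        image: busybox
        command: ["sleep", "infinity"]
        volumeMounts:
        - mountPath: /mnt/ceph_rbd
          name: volume
      volumes:
      - name: volume
        persistentVolumeClaim:
          claimName: ceph-rbd-sc-pvc
```

With a Deployment, a replacement Pod is created automatically when the node fails, subject to the RWO volume-attachment caveat discussed in the other answer.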
I have the below Pod definition:
apiVersion: v1
kind: Pod
metadata:
  name: mongodb
spec:
  volumes:
  - name: mongodb-data
    awsElasticBlockStore:
      volumeID: vol-0c0d9800c22f8c563
      fsType: ext4
  containers:
  - image: mongo
    name: mongodb
    volumeMounts:
    - name: mongodb-data
      mountPath: /data/db
    ports:
    - containerPort: 27017
      protocol: TCP
I have created a volume in AWS and tried to mount it in the container. The container is not starting.
kubectl get po
NAME READY STATUS RESTARTS AGE
mongodb 0/1 ContainerCreating 0 6m57s
When I created the volume in the availability zone where the node is running, and the pod was scheduled on that node, the volume mounted successfully. If the pod is not scheduled on that node, the mount fails. How can I make sure the volume can be accessed by all the nodes?
According to the documentation:
There are some restrictions when using an awsElasticBlockStore volume:
the nodes on which Pods are running must be AWS EC2 instances
those instances need to be in the same region and availability-zone as the EBS volume
EBS only supports a single EC2 instance mounting a volume
Make sure all of the above are met. If your nodes are in different zones, then you might need to create additional EBS volumes, for example:
aws ec2 create-volume --availability-zone=eu-west-1a --size=10 --volume-type=gp2
Please let me know if that helped.
I am trying to cut the costs of running my Kubernetes cluster on Google Cloud Platform.
I moved my node pool to preemptible VM instances. I have 1 pod for Postgres and 4 nodes for web apps.
For Postgres, I created a StorageClass to make the data persistent.
Surprisingly, or maybe not, all the storage data was erased after a day.
How do I make a specific node in GCP not preemptible?
Or could you advise what to do in this situation?
I guess I found a solution.
Create a disk on gcloud via:
gcloud compute disks create --size=[SIZE] [NAME]
For example:
gcloud compute disks create --size=10GB postgres-disk
Delete any StorageClasses, PV, PVC
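For example (the object names are illustrative; substitute the ones from your cluster):

```shell
# Remove the dynamically provisioned objects before switching
# to the pre-created GCE disk
kubectl delete pvc postgres-pvc
kubectl delete pv postgres-pv
kubectl delete storageclass postgres-storage
```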
Configure the StatefulSet manifest:
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  selector:
    matchLabels:
      app: postgres
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
        role: postgres
    spec:
      containers:
      - name: postgres
        image: postgres
        env:
        ...
        ports:
        ...
        # Especially this part should be configured!
        volumeMounts:
        - name: postgres-persistent-storage
          mountPath: /var/lib/postgresql
      volumes:
      - name: postgres-persistent-storage
        gcePersistentDisk:
          # This GCE PD must already exist.
          pdName: postgres-disk
          fsType: ext4
You can make a specific node not preemptible in a Google Kubernetes Engine cluster, as mentioned in the official documentation.
The steps to set up a cluster with both preemptible and non-preemptible node pools are:
Create a Cluster: In the GCP Console, go to Kubernetes Engine -> Create Cluster, and configure the cluster as you need.
On that configuration page, under Node pools, click on Add node pool. Enter the number of nodes for the default and the new pool.
To make one of the pools preemptible, click on the Advanced edit button under the pool name, check the Enable preemptible nodes (beta) box, and save the changes.
Click on Create.
Then you probably want to schedule specific pods only on non-preemptible nodes. For this, you can use node taints.
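As a sketch, GKE labels preemptible nodes with cloud.google.com/gke-preemptible=true, so you could taint the whole preemptible pool and keep Pods without a matching toleration (such as Postgres) off it:

```shell
# Taint every preemptible node so that only Pods carrying a matching
# toleration can be scheduled there (taint key/value are illustrative)
kubectl taint nodes -l cloud.google.com/gke-preemptible=true \
  preemptible=true:NoSchedule
```

Pods that are allowed to run on the preemptible pool would then declare a corresponding toleration in their spec.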
You can use the managed service from GCP named GKE (Google Kubernetes Engine).
As for the erased data: the StorageClass you used may not have a Retain reclaim policy, so the PV was deleted along with its PVC.
It's better to use the managed service, I think.
I have a Kubernetes cluster up and running on AWS. Now when I try to attach an AWS EBS volume to a pod, I get a "special device does not exist" error:
Output: mount: special device /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-xxxxxxx does not exist
I did some research and found that the correct AWS EBS device path should be like this format:
/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-west-2a/vol-xxxxxxxx
My suspicion is that this is because I set up the Kubernetes cluster according to this tutorial and did not set the cloud provider, and therefore the AWS device "does not exist". I wonder if my suspicion is correct, and if yes, how to set the cloud provider after the cluster is already up and running.
You need to set the cloud provider to properly mount an EBS volume. To do that after the fact set --cloud-provider=aws in the following services:
controller-manager
apiserver
kubelet
Restart everything and try mounting again.
An example pod which mounts an EBS volume explicitly may look like this:
apiVersion: v1
kind: Pod
metadata:
  name: test-ebs
spec:
  containers:
  - image: gcr.io/google_containers/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-ebs
      name: test-volume
  volumes:
  - name: test-volume
    # This AWS EBS volume must already exist.
    awsElasticBlockStore:
      volumeID: <volume-id>
      fsType: ext4
The Kubernetes version is an important factor here. EBS mounts were experimental in 1.2.x; I tried it then without success. I haven't tried again in recent releases, but be sure to check the IAM roles on the k8s VMs to make sure they have the rights to provision EBS disks.