How to save SQL storage data in running preemtible instance? - kubernetes

I am trying to cut the costs of running the kubernetes cluster on Google Cloud Platform.
I moved my node-pool to preemptible VM instances. I have 1 pod for Postgres and 4 nodes for web apps.
For Postgres, I've created StorageClass to make data persistent.
Surprisingly, maybe not, all storage data was erased after a day.
How to make a specific node in GCP not preemptible?
Or, could you advice what to do in that situation?

I guess I found a solution.
Create a disk on gcloud via:
gcloud compute disks create --size=10GB postgres-disk
gcloud compute disks create --size=[SIZE] [NAME]
Delete any StorageClasses, PV, PVC
Configure deployment file:
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
name: postgres
spec:
serviceName: postgres
selector:
matchLabels:
app: postgres
replicas: 1
template:
metadata:
labels:
app: postgres
role: postgres
spec:
containers:
- name: postgres
image: postgres
env:
...
ports:
...
# Especially this part should be configured!
volumeMounts:
- name: postgres-persistent-storage
mountPath: /var/lib/postgresql
volumes:
- name: postgres-persistent-storage
gcePersistentDisk:
# This GCE PD must already exist.
pdName: postgres-disk
fsType: ext4

You can make a specific node not preemptible in a Google Kubernetes Engine cluster, as mentioned in the official documentation.
The steps to set up a cluster with both preemptible and non-preemptible node pools are:
Create a Cluster: In the GCP Console, go to Kubernetes Engine -> Create Cluster, and configure the cluster as you need.
On that configuration page, under Node pools, click on Add node pool. Enter the number of nodes for the default and the new pool.
To make one of the pools preemptible, click on the Advance edit button under the pool name, check the Enable preemptible nodes (beta) box, and save the changes.
Click on Create.
Then you probably want to schedule specific pods only on non-preemptible nodes. For this, you can use node taints.

You can use managed service from GCP named GKE google kubernetes cluster.
And storage data erased cause of storage class change may not retain policy and PVC.
It's better to use managed service I think.

Related

how to migrate VM data from a disk to kubernetes cluster

how to migrate VM data from a disk to a Kubernetes cluster?
I have a VM with three disks attached mounted to it, each having data that needs to be migrated to a Kubernetes cluster to be attached to database services in statefulset.
This link https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/preexisting-pd does give me the way. But don't know how to use it or implement it with statefulsets so that one particular database resource (like Postgres) be able to use the same PV(created from one of those GCE persistent disks) and create multiple PVCs for new replicas.
Is the scenario I'm describing achievable?
If yes how to implement it?
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
name: elasticsearch
spec:
version: 6.8.12
http:
tls:
selfSignedCertificate:
disabled: true
nodeSets:
- name: default
count: 3
config:
node.store.allow_mmap: false
xpack.security.enabled: false
xpack.security.authc:
anonymous:
username: anonymous
roles: superuser
authz_exception: false
volumeClaimTemplates:
- metadata:
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 50Gi
storageClassName: managed-premium
StatefulSet is a preference for manually creating a PersistentVolumeClaim. For database workloads that can't be replicated, you can't set replicas: greater than 1 in either case, but the PVC management is valuable. You usually can't have multiple databases pointing at the same physical storage, containers or otherwise, and most types of Volumes can't be shared across Pods.
Postgres doesn't support multiple instances sharing the underlying volume without massive corruption so if you did set things up that way, it's definitely a mistake. More common would be to use the volumeClaimTemplate system so each pod gets its own distinct storage. Then you set up Postgres streaming replication yourself.
Refer to the document on Run a replicated stateful application and stackpost for more information.

Methods of Verifying Kubernetes Configuration

I've been working on a small side project to try and learn Kubernetes. I have a relatively simple cluster with two services, an ingress, and working on adding a Redis database now. I'm hosting this cluster in Google Kubernetes Engine (GKE), but using Minikube to run the cluster locally and try everything out before I commit any changes and push them to the prod environment in GKE.
During this project, I have noticed that GKE seems to have some slight differences in how it wants the configuration vs what works in Minikube. I've seen this previously with ingresses and now with persistent volumes.
For example, to run Redis with a persistent volume in GKE, I can use:
apiVersion: apps/v1
kind: Deployment
metadata:
name: chatter-db-deployment
labels:
app: chatter
spec:
replicas: 1
selector:
matchLabels:
app: chatter-db-service
template:
metadata:
labels:
app: chatter-db-service
spec:
containers:
- name: master
image: redis
args: [
"--save", "3600", "1", "300", "100", "60", "10000",
"--appendonly", "yes",
]
ports:
- containerPort: 6379
volumeMounts:
- name: chatter-db-storage
mountPath: /data/
volumes:
- name: chatter-db-storage
gcePersistentDisk:
pdName: chatter-db-disk
fsType: ext4
The gcePersistentDisk section at the end refers to a disk I created using gcloud compute disks create. However, this simply won't work in Minikube as I can't create disks that way.
Instead, I need to use:
volumes:
- name: chatter-db-storage
persistentVolumeClaim:
claimName: chatter-db-claim
I also need to include separate configuration for a PeristentVolume and a PersistentVolumeClaim.
I can easily get something working in either Minikube OR GKE, but I'm not sure what is the best means of getting a config which works for both. Ideally, I want to have a single k8s.yaml file which deploys this app, and kubectl apply -f k8s.yaml should work for both environments, allowing me to test locally with Minikube and then push to GKE when I'm satisfied.
I understand that there are differences between the two environments and that will probably leak into the config to some extent, but there must be an effective means of verifying a config before pushing it? What are the best practices for testing a config? My questions mainly come down to:
Is it feasible to have a single Kubernetes config which can work for both GKE and Minikube?
If not, is it feasible to have a mostly shared Kubernetes config, which overrides the GKE and Minikube specific pieces?
How do existing projects solve this particular problem?
Is the best method to simply make a separate dev cluster in GKE and test on that, rather than bothering with Minikube at all?
Yes, you have found some parts of Kubernetes configuration that was not perfect from the beginning. But there are newer solutions.
Storage abstraction
The idea in newer Kubernetes releases is that your application configuration is a Deployment with Volumes that refers to PersistentVolumeClaim for a StorageClass.
While StorageClass and PersistentVolume belongs more to the infrastructure configuration.
See Configure a Pod to Use a PersistentVolume for Storage on how to configure a Persistent Volume for Minikube. For GKE you configure a Persistent Volume with GCEPersistentDisk or if you want to deploy your app to AWS you may use a Persistent Volume for AWSElasticBlockStore.
Ingress and Service abstraction
Service with type LoadBalancer and NodePort in combination with Ingress does not work the same way across cloud providers and Ingress Controllers. In addition, Services Mesh implementations like Istio have introduced VirtualService. The plan is to improve this situation with Ingress v2 as how I understand it.

Prometheus in k8s (metrics)

I deploy prometheus in kubernetes on this manual
As a storage scheme was invented:
Prometeus in kubernetes stores the metrics within 24 hours.
Prometheus not in kubernetes stores the metrics in 1 week.
A federation is set up between them.
Who faced with the fact that after removing the pods after a certain period of time (much less than 24 hours) metrics are missing on it.
This is perfectly normal if you do not have a persistent storage configured for your prometheus pod. You should use PV/PVC to define a stable place where you keep your prometheus data, otherwise if your pod is recreated, it starts with a clean slate.
PV/PVC needs dedicated storage servers in the cluster. If there is no money for storage servers, here is a cheaper approach:
Label a node:
$ kubectl label nodes <node name> prometheus=yes
Force all the prometheus pods to be created on the same labeled node by using nodeSelector:
nodeSelector:
prometheus: yes
Create an emptyDir volume for each prometheus pod. An emptyDir volume is first created when the Prometheus pod is assigned to the labeled node and exists as long as that pod is running on that node and is safe across container crashes and pod restarts.
spec:
containers:
- image: <prometheus image>
name: <prometheus pod name>
volumeMounts:
- mountPath: /cache
name: cache-volume
volumes:
- name: cache-volume
emptyDir: {}
This approach makes all the Prometheus pods run on the same node with persistent storage for the metrics - a cheaper approach that prays the Prometheus node does not crash.

DigitalOcean blockstorage using for Kubernetes Volume

I have a K8S cluster running on DigitalOcean. I have a Postgresql database running there and I want to create a volume using the DigitalOcean BlockStorage to be used by the Postgresql pod as volume. Is there any examples on how to do that?
If it's not possible to use DigitalOcean blockstorage then how do most companies run their persistence storage for databases?
No official support yet. You can try the example from someone in this github issue:
Update: I finished writing a volume plugin for digitalocean. Attach/detach is working on my cluster. Looking for anyone willing to
test this on their k8s digitalocean cluster. My branch is
https://github.com/wardviaene/kubernetes/tree/do-volume
You can use the following spec in your pod yml:
spec:
containers:
- name: k8s-demo
image: yourimage
volumeMounts:
- mountPath: /myvol
name: myvolume
ports:
- containerPort: 3000
volumes:
- name: myvolume
digitaloceanVolume:
volumeID: mykubvolume
fsType: ext4 Where mykubvolume is the volume created in DigitalOcean in the same region.
You will need to add create a config file:
[Global] apikey = do-api-key region = your-region and add these
parameters to your kubernetes processes: --cloud-provider=digitalocean
--cloud-config=/etc/cloud.config
I'm still waiting for an issue in the godo driver to be resolved,
before I can submit a PR (digitalocean/godo#102)
I found this link here about flexvolumes This mentions how you can customize to load vendor volumes. There is also a script on how to do this at script
A Container Storage Interface (CSI) Driver for DigitalOcean Block Storage.
https://github.com/digitalocean/csi-digitalocean
Have tested with statefulset MySql, working fine

Kubernetes cannot attach AWS EBS as volume. Probably due to cloud provider issue

I hava a kubernetes cluster up running on AWS. Now when I'm trying to attach a AWS EBS as a volume to a pod, I got a "special device does not exist" problem.
Output: mount: special device /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-xxxxxxx does not exist
I did some research and found that the correct AWS EBS device path should be like this format:
/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-west-2a/vol-xxxxxxxx
My doubt is that it might because I set up the Kubernetes cluster according to this tutorial and did not set the cloud provider, and therefore the AWS device "does not exit". I wonder if my doubt is correct, and if yes, how to set the cloud provider after the cluster is already up running.
You need to set the cloud provider to properly mount an EBS volume. To do that after the fact set --cloud-provider=aws in the following services:
controller-manager
apiserver
kubelet
Restart everything and try mounting again.
An example pod which mounts an EBS volume explicitly may look like this:
apiVersion: v1
kind: Pod
metadata:
name: test-ebs
spec:
containers:
- image: gcr.io/google_containers/test-webserver
name: test-container
volumeMounts:
- mountPath: /test-ebs
name: test-volume
volumes:
- name: test-volume
# This AWS EBS volume must already exist.
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
The Kubernetes version is an important factor here. The EBS mounts was experimental in 1.2.x, I tried it then but without success. In the last releases I never tried it again but be sure to check your IAM roles on the k8s vm's to make sure they have the rights to provision EBS disks.