Methods of Verifying Kubernetes Configuration

I've been working on a small side project to try and learn Kubernetes. I have a relatively simple cluster with two services and an ingress, and I'm now working on adding a Redis database. I'm hosting this cluster in Google Kubernetes Engine (GKE), but using Minikube to run it locally and try everything out before I commit any changes and push them to the prod environment in GKE.
During this project, I have noticed that GKE seems to want the configuration slightly differently from what works in Minikube. I've seen this previously with ingresses and now with persistent volumes.
For example, to run Redis with a persistent volume in GKE, I can use:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatter-db-deployment
  labels:
    app: chatter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chatter-db-service
  template:
    metadata:
      labels:
        app: chatter-db-service
    spec:
      containers:
        - name: master
          image: redis
          args: [
            "--save", "3600", "1", "300", "100", "60", "10000",
            "--appendonly", "yes",
          ]
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: chatter-db-storage
              mountPath: /data/
      volumes:
        - name: chatter-db-storage
          gcePersistentDisk:
            pdName: chatter-db-disk
            fsType: ext4
The gcePersistentDisk section at the end refers to a disk I created using gcloud compute disks create. However, this simply won't work in Minikube as I can't create disks that way.
Instead, I need to use:
volumes:
  - name: chatter-db-storage
    persistentVolumeClaim:
      claimName: chatter-db-claim
I also need to include separate configuration for a PersistentVolume and a PersistentVolumeClaim.
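For reference, the extra Minikube-only configuration looks roughly like this (a sketch; the PersistentVolume name, hostPath, and size are just what I picked):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: chatter-db-volume          # name is arbitrary
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/chatter-db         # a directory inside the Minikube VM
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chatter-db-claim           # matches the claimName referenced by the Deployment
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi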
I can easily get something working in either Minikube OR GKE, but I'm not sure what is the best means of getting a config which works for both. Ideally, I want to have a single k8s.yaml file which deploys this app, and kubectl apply -f k8s.yaml should work for both environments, allowing me to test locally with Minikube and then push to GKE when I'm satisfied.
I understand that there are differences between the two environments and that this will probably leak into the config to some extent, but surely there is an effective means of verifying a config before pushing it? What are the best practices for testing a config? My questions mainly come down to:
Is it feasible to have a single Kubernetes config which can work for both GKE and Minikube?
If not, is it feasible to have a mostly shared Kubernetes config, which overrides the GKE and Minikube specific pieces?
How do existing projects solve this particular problem?
Is the best method to simply make a separate dev cluster in GKE and test on that, rather than bothering with Minikube at all?

Yes, you have found some parts of Kubernetes configuration that were not perfect from the beginning. But there are newer solutions.
Storage abstraction
The idea in newer Kubernetes releases is that your application configuration is a Deployment with Volumes that refer to a PersistentVolumeClaim for a StorageClass,
while the StorageClass and PersistentVolume belong more to the infrastructure configuration.
See Configure a Pod to Use a PersistentVolume for Storage for how to configure a PersistentVolume for Minikube. For GKE you configure a PersistentVolume with GCEPersistentDisk, or if you want to deploy your app to AWS you may use a PersistentVolume for AWSElasticBlockStore.
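As a minimal sketch of that split (the class name and size here are assumptions; both Minikube and GKE ship a default StorageClass, typically named standard, backed by a hostPath and a GCE PD provisioner respectively), the application side only declares a claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chatter-db-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard       # satisfied differently in each environment
  resources:
    requests:
      storage: 1Gi
The Deployment then mounts the volume via a persistentVolumeClaim exactly as in the Minikube snippet in the question, so the application YAML no longer mentions gcePersistentDisk at all.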
Ingress and Service abstraction
Service with type LoadBalancer and NodePort in combination with Ingress does not work the same way across cloud providers and Ingress controllers. In addition, service mesh implementations like Istio have introduced VirtualService. The plan is to improve this situation with Ingress v2, as I understand it.
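As an illustration of the shared part (the host, service name, and ingress class below are made up; the annotation and the controller behind it are exactly what differs between the Minikube ingress addon and GKE's built-in controller), the Ingress itself can stay fairly portable:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: chatter-ingress                          # illustrative name
  annotations:
    kubernetes.io/ingress.class: "gce"           # e.g. "nginx" when using the Minikube addon
spec:
  rules:
    - host: chatter.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: chatter-web-service   # assumed backing Service
              servicePort: 80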

Related

K8s service doesn't use autoscaled mongodb pods

I am trying to deploy MongoDB with Kubernetes (GKE more precisely). This database is used by a microservice which only needs to read from the database, so I thought of deploying multiple pods, each running a MongoDB container, so that the work is shared between them. To do that I created a MongoDB image into which I copied the database I previously used with a single Docker container.
(Here is its Dockerfile; a single deployment of this image works in k8s, so I guess this may not be linked to the issue.)
FROM mongo:latest
EXPOSE 27017
COPY /mdb/ /data/db
As the number of requests to the db varies during the day, I want to use GKE horizontal autoscaling for those "mongodb pods". Autoscaling works: new pods are created when the CPU utilization goes over the target I set in my horizontal pod autoscaler. But these new pods are not used by the service I created for the deployment file used to deploy those pods, and that's my issue.
Something strange to me is that the new pods' local IP addresses appear in my service endpoints, and when I delete the initial pod, which is the only one working at that moment, the other pods created by the autoscaler get activated and I finally get a performance improvement. However, this is obviously not a solution for me, and moreover other pods created after I deleted the initial one don't get used either.
Here are the yamls files for my mongodb deployment and service :
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: "$my_mongodb_image_in_which_I_have_my_db"
          ports:
            - containerPort: 27017
          resources:
            requests:
              memory: "1800Mi"
              cpu: "3000m"
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  type: LoadBalancer
  loadBalancerIP: $IP_reserved_for_this_service
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 80
      targetPort: 27017
And I am accessing those mongodb through pymongo, in programs that run in another pod in the same gke cluster:
from pymongo import MongoClient

def get_db(database: str):
    client = MongoClient(host="$IP_reserved_for_this_service",
                         port=80,
                         username="...",
                         password="...",
                         authSource="admin")
    return client.get_database(database)
This way of using and autoscaling mongodb might be weird and quite impractical but it's only a first model for me and I would like to make it work (or understand why it can't work).
Here are screenshots showing the behaviour of those pods:
state 1: only the initial pod is working
...but all IPs appear in the service endpoints
state 2: initial pod deleted, the others work now (except the new one created by the autoscaler after the deletion)
...and the endpoints are updated in the service (the update is in the "+ 1 more ...", I checked in the Google console)
I feel that the problem might come either from the configuration of my mongodb-service or from the way k8s or gke deals with mongodb images (anyway since I'm new to k8s I might be completely wrong on that too).
Any help or comment will be appreciated, and if you need more information let me know.
There are sticky connections in Kubernetes; this is a common and well-known behaviour. Kubernetes doesn't balance packets, it balances connections. Once your app has established a connection to a service, all requests to that service will go through this connection, and Kubernetes doesn't guarantee that the next request will go via another connection.
One of the options to solve this is a headless service: https://kubernetes.io/docs/concepts/services-networking/service/#headless-services
Or a service mesh, but that's too much for your case ))
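For illustration, a headless variant of the Service above could look like this (only clusterIP: None and the port change are new; this is a sketch, not a drop-in replacement):
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  clusterIP: None            # headless: DNS returns the individual pod IPs
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
The client would then connect from inside the cluster using the service DNS name (e.g. mongodb-service.default.svc.cluster.local) rather than the reserved load-balancer IP.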

Why is my Host Path Persistent Volume reachable from all pods?

I'm pretty stuck with this learning step of Kubernetes named PV and PVC.
What I'm trying to do here is understand how to handle shared read-write volume on multiple pods.
What I understood here is that a PVC cannot be shared between pods unless an NFS-like storage class has been configured.
I'm still using my hostPath StorageClass, and I tried the following (Docker Desktop and a 3-node MicroK8s cluster):
This PVC with dynamic hostPath provisioning:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-desktop
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Mi
Deployment with 3 replicated pods writing on the same PVC.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
spec:
  replicas: 3
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
        - name: busybox
          image: library/busybox:stable
          command: ["/bin/sh"]
          args:
            ["-c", 'while true; do echo "1: $(hostname)" >> /root/index.html; sleep 2; done;']
          volumeMounts:
            - mountPath: /root
              name: vol-desktop
      volumes:
        - name: vol-desktop
          persistentVolumeClaim:
            claimName: pvc-desktop
Nginx server for serving volume content
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:stable
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: vol-desktop
          ports:
            - containerPort: 80
      volumes:
        - name: vol-desktop
          persistentVolumeClaim:
            claimName: pvc-desktop
Following what I understood from the documentation, this should not be possible, but in reality everything ran pretty smoothly and my Nginx server displayed the up-to-date index.html file just fine.
It actually worked on both a single-node and a multi-node cluster.
What am I not getting here? Why does this work?
Is every pod mounting its own host path volume on start?
How can a hostPath storage work across multiple nodes?
EDIT: For the multi-node case, a network folder had been created between the same storage path of each machine, which is why everything was replicated successfully. I hadn't understood that the same host path is created on each node that mounts that PVC.
To anyone with the same problem: each node mounting this hostPath PVC will have its own folder created at the PV path.
So without network replication between nodes, only pods on the same node will share the same folder.
This is why it's discouraged on a multi-node cluster, due to the unpredictable location of a pod on the cluster.
Thanks!
how to handle shared read-write volume on multiple pods.
Redesign your application to avoid it. It tends to be fragile and difficult to manage multiple writers safely; you depend on your application correctly performing things like file locking, the underlying shared filesystem implementation handling things properly, and the system being tolerant of any sort of network hiccup that might happen.
The example you give is something that frequently appears in Docker Compose setups: have an application with a mix of backend code and static files, and then try to publish the static files at runtime through a volume to a reverse proxy. Instead, you can build an image that copies the static files at build time:
FROM nginx
ARG app_version=latest
COPY --from=my/app:${app_version} /app/static /usr/share/nginx/html
Have your CI system build this and push it immediately after the backend image is built. The resulting image serves the corresponding static files, but doesn't require a shared volume or any manual management of the volume contents.
For other types of content, consider storing data in a database, or use an object-storage service that maintains its own backing store and can handle the concurrency considerations. Then most of your pods can be totally stateless, and you can manage the data separately (maybe even outside Kubernetes).
How can a hostPath storage work across multiple nodes?
It doesn't. It's an instruction to Kubernetes, on whichever node the pod happens to be scheduled on, to mount that host directory into the container. There's no management of any sort of the directory content; if two pods get scheduled on the same node, they'll share the directory, and if not, they won't; and if your pod's Deployment is updated and the pod is deleted and recreated somewhere else, it might not be the same node and might not have the same data.
With some very specific exceptions you shouldn't use hostPath volumes at all. The exceptions are things like log collectors run as DaemonSets, where there is exactly one pod on every node and you're interested in picking up the host-directory content that is different on each node.
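For illustration, such an exception might look like the following DaemonSet sketch (the image and paths are placeholders, not a real log collector):
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
        - name: collector
          image: example/log-collector:latest    # placeholder image
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log                       # each node's own log directory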
In your specific setup either you're getting lucky with where the data producers and consumers are getting colocated, or there's something about your MicroK8s setup that's causing the host directories to be shared. It is not in general reliable storage.

How to save SQL storage data in a running preemptible instance?

I am trying to cut the costs of running the kubernetes cluster on Google Cloud Platform.
I moved my node-pool to preemptible VM instances. I have 1 pod for Postgres and 4 nodes for web apps.
For Postgres, I've created a StorageClass to make the data persistent.
Surprisingly, or maybe not, all storage data was erased after a day.
How can I make a specific node in GCP not preemptible?
Or could you advise what to do in that situation?
I guess I found a solution.
Create a disk on gcloud via:
gcloud compute disks create --size=10GB postgres-disk
gcloud compute disks create --size=[SIZE] [NAME]
Delete any StorageClasses, PVs, and PVCs
Configure deployment file:
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  selector:
    matchLabels:
      app: postgres
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
        role: postgres
    spec:
      containers:
        - name: postgres
          image: postgres
          env:
            ...
          ports:
            ...
          # Especially this part should be configured!
          volumeMounts:
            - name: postgres-persistent-storage
              mountPath: /var/lib/postgresql
      volumes:
        - name: postgres-persistent-storage
          gcePersistentDisk:
            # This GCE PD must already exist.
            pdName: postgres-disk
            fsType: ext4
You can make a specific node not preemptible in a Google Kubernetes Engine cluster, as mentioned in the official documentation.
The steps to set up a cluster with both preemptible and non-preemptible node pools are:
Create a Cluster: In the GCP Console, go to Kubernetes Engine -> Create Cluster, and configure the cluster as you need.
On that configuration page, under Node pools, click on Add node pool. Enter the number of nodes for the default and the new pool.
To make one of the pools preemptible, click on the Advanced edit button under the pool name, check the Enable preemptible nodes (beta) box, and save the changes.
Click on Create.
Then you probably want to schedule specific pods only on non-preemptible nodes. For this, you can use node taints.
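Instead of taints, one simple variant uses node affinity against the label GKE puts on preemptible nodes (this sketch assumes the cloud.google.com/gke-preemptible label; with taints you would taint the preemptible pool and add tolerations only to the pods allowed there). Placed inside the StatefulSet's pod template spec:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: cloud.google.com/gke-preemptible
              operator: DoesNotExist             # schedule only on non-preemptible nodes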
You can use the managed service from GCP, GKE (Google Kubernetes Engine).
As for the erased storage data: it was likely lost because your StorageClass does not use a Retain reclaim policy, so the provisioned volume was deleted along with its PVC.
It's better to use the managed service, I think.
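If you do stay with dynamic provisioning, here is a sketch of a StorageClass whose volumes keep their disks when the claim is deleted (the class name is arbitrary; the provisioner shown is the GCE PD one):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: postgres-retain              # arbitrary name
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Retain                # keep the underlying disk when the PVC is deleted
parameters:
  type: pd-standard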

Does Kubernetes support VM / node provisioning & management?

To my understanding, Kubernetes is a container orchestration service comparable to AWS ECS or Docker Swarm. Yet there are several highly rated questions on Stack Overflow that compare it to Cloud Foundry, which is a platform orchestration service.
This means that Cloud Foundry can take care of the VM layer, updating and provisioning VMs while moving containers to avoid downtime. Therefore the comparison to Kubernetes makes limited sense to my understanding.
Am I misunderstanding something, does Kubernetes support provisioning and managing the VM layer too?
Yes, you can manage VMs with KubeVirt, as #AbdennourTOUMI pointed out. However, Kubernetes focuses on container orchestration, and it also interacts with cloud providers to provision things like load balancers that can direct traffic to a cluster.
Cloud Foundry is a PaaS that provides much more than Kubernetes, which sits at a lower level. Kubernetes can run on top of an IaaS like AWS, together with something like OpenShift.
This is a diagram showing some of the differences (diagram not reproduced here):
As for VMs, my answer is yes: you can run VMs as workloads in a k8s cluster.
Indeed, the Red Hat team figured out how to run VMs in a Kubernetes cluster by adding KubeVirt.
Example from the link above:
apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachine
metadata:
  creationTimestamp: null
  labels:
    kubevirt.io/vm: vm-cirros
  name: vm-cirros
spec:
  running: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/vm: vm-cirros
    spec:
      domain:
        devices:
          disks:
            - disk:
                bus: virtio
              name: registrydisk
              volumeName: registryvolume
            - disk:
                bus: virtio
              name: cloudinitdisk
              volumeName: cloudinitvolume
        machine:
          type: ""
        resources:
          requests:
            memory: 64M
      terminationGracePeriodSeconds: 0
      volumes:
        - name: registryvolume
          registryDisk:
            image: kubevirt/cirros-registry-disk-demo:latest
        - cloudInitNoCloud:
            userDataBase64: IyEvYmluL3NoCgplY2hvICdwcmludGVkIGZyb20gY2xvdWQtaW5pdCB1c2VyZGF0YScK
          name: cloudinitvolume
Then:
kubectl create -f vm.yaml
virtualmachine "vm-cirros" created

How to configure a Kubernetes Multi-Pod Deployment

I would like to deploy an application cluster by managing my deployment via the k8s Deployment object. The documentation has me extremely confused. My basic layout has the following components that scale independently:
API server
UI server
Redis cache
Timer/Scheduled task server
Technically, all 4 above belong in separate pods that are scaled independently.
My questions are:
Do I need to create pod.yml files and then somehow reference them in the deployment.yml file, or can a deployment file also embed pod definitions?
The k8s documentation seems to imply that the spec portion of a Deployment is equivalent to defining one pod. Is that correct? What if I want to declaratively describe multi-pod deployments? Do I need multiple deployment.yml files?
Pagid's answer has most of the basics. You should create 4 Deployments for your scenario. Each Deployment will create a ReplicaSet that schedules and supervises the collection of PODs for that Deployment.
Each Deployment will most likely also require a Service in front of it for access. I usually create a single yaml file that has a Deployment and the corresponding Service in it. Here is an example for an nginx.yaml that I use:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
    - port: 80
      name: nginx
      targetPort: 80
      nodePort: 32756
  selector:
    app: nginx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginxdeployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginxcontainer
          image: nginx:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 80
Here is some additional information for clarification:
A POD is not a scalable unit. A Deployment that schedules PODs is.
A Deployment is meant to represent a single group of PODs fulfilling a single purpose together.
You can have many Deployments work together in the virtual network of the cluster.
For accessing a Deployment that may consist of many PODs running on different nodes you have to create a Service.
Deployments are meant to contain stateless services. If you need to store state, you need to create a StatefulSet instead (e.g. for a database service).
You can use the Kubernetes API reference for the Deployment: you'll find that the spec->template field is of type PodTemplateSpec, and together with the related comment (Template describes the pods that will be created.) it answers your questions. A longer description can of course be found in the Deployment user guide.
To answer your questions...
1) The Pods are managed by the Deployment and defining them separately doesn't make sense as they are created on demand by the Deployment. Keep in mind that there might be more replicas of the same pod type.
2) For each of the applications in your list, you'd have to define one Deployment - which also makes sense when it comes to different replica counts and application rollouts.
3) You haven't asked that but it's related - along with separate Deployments, each of your applications will also need a dedicated Service so the others can access it.
Additional information:
API server: use a Deployment.
UI server: use a Deployment.
Redis cache: use a StatefulSet (see the sketch after this list).
Timer/scheduled task server: maybe also a StatefulSet (if your service keeps some state).
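A minimal sketch of such a Redis StatefulSet, assuming a default StorageClass is available for the volume claim and a headless Service named redis exists (names and sizes are illustrative):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis               # assumed headless Service
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis
          args: ["--appendonly", "yes"]
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: redis-data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi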