Deploy MariaDB Galera cluster in Kubernetes/Docker Swarm

I'm trying to deploy a scalable MariaDB Galera cluster on Kubernetes or Docker Swarm. Since each pod or container needs its own Galera config, how should I create my deployment so that I can scale it without any manual work? I don't think we can use a ConfigMap, because for a 10-node cluster there would have to be 10 ConfigMaps!
Example of the MariaDB Galera config for one node:
wsrep_cluster_address="gcomm://ip_1,ip_2,ip_3"
wsrep_node_address="ip_1"
wsrep_node_name="node_1"
wsrep_cluster_name="mariadb-cluster"
For applications like this, which have a different config for each node, what is the best way to deploy?
Note: I can create the pods/containers and do the config myself (join new nodes to the cluster), but I don't think that's the right way, and I need it to be auto-scalable.

You almost definitely want to use a StatefulSet to deploy this in Kubernetes. Among other things, this has the property that each Pod will get its own PersistentVolumeClaim for storage, and that the names of individual Pods are predictable and sequential. You should create a matching headless Service and then each Pod will have a matching DNS name.
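For illustration, a minimal sketch of those two objects, assuming both the StatefulSet and the headless Service are named galera (the image tag, ports, and storage size are assumptions, not from the question):
# Headless Service: gives every StatefulSet pod a stable DNS name
# such as galera-0.galera.default.svc.cluster.local
apiVersion: v1
kind: Service
metadata:
  name: galera
spec:
  clusterIP: None            # headless
  selector:
    app: galera
  ports:
    - name: mysql
      port: 3306
    - name: galera-repl
      port: 4567
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: galera
spec:
  serviceName: galera        # must match the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: galera
  template:
    metadata:
      labels:
        app: galera
    spec:
      containers:
        - name: mariadb
          image: mariadb:10.6        # assumed image/tag
          ports:
            - containerPort: 3306
            - containerPort: 4567
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:      # one PersistentVolumeClaim per pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi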
That solves a couple of parts of the riddle:
# You pick this
wsrep_cluster_name="mariadb-cluster"
# You know what all of these DNS names will be up front
wsrep_cluster_address="gcomm://galera-0.galera.default.svc.cluster.local,...,galera-9.galera.default.svc.cluster.local"
For wsrep_node_name, the MariaDB documentation indicates that it defaults to the host name. In Kubernetes, the host name defaults to the pod name, and the pod name is one of the sequential galera-n for pods managed by a StatefulSet, so you don't need to manually set this.
wsrep_node_address is trickier. Here the documentation indicates that there are heuristics to guess it (with a specific caveat that it might not be reliable for containers). You can't know an individual pod's IP address before it's created. You can in principle use the downward API to inject a pod's IP address into an environment variable. I'd start by hoping the heuristics would guess the pod IP address and this works well enough (it is what the headless Service would ultimately resolve to).
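If the guess isn't reliable, here is a sketch of the downward API approach, added to the container spec (an entrypoint script that writes these values into the config is assumed and not shown):
# Downward API: expose the pod's IP (and name) as environment variables
env:
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: POD_NAME           # handy if you do want to set wsrep_node_name explicitly
    valueFrom:
      fieldRef:
        fieldPath: metadata.name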
That leaves the wsrep_cluster_name/wsrep_cluster_address block above for the ConfigMap, and it's global across all of the replicas. The other remaining per-Galera-node values should be automatically guessable.
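As a sketch, that global block could live in a single ConfigMap mounted into every replica (the file name, key, and section header are assumptions):
apiVersion: v1
kind: ConfigMap
metadata:
  name: galera-config
data:
  galera.cnf: |
    [galera]
    wsrep_on=ON
    wsrep_cluster_name="mariadb-cluster"
    # short names resolve within the same namespace
    wsrep_cluster_address="gcomm://galera-0.galera,galera-1.galera,galera-2.galera"
Mount it into each pod with a configMap volume, e.g. under /etc/mysql/conf.d/, which the stock mariadb image reads at startup.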

Related

Dynamically get the correct IP for a Redis pod on Kubernetes

Until now, our solution ran as three containers on the same IP.
One of the containers ran Redis. When the other two containers (communicating via message passing through Redis) needed to reach it, they simply used port 6379. Since Redis was on the same machine/IP, this was simple and straightforward.
Now there is a need to run these three Docker containers on Kubernetes, and each container now gets its own unique IP address. How does one manage that correctly?
What if additional Redis containers are needed? How does each container know which one to couple with?
When switching to Kubernetes, you have to rely on the K8s architecture itself and think in a cloud-native manner.
Consider that a pod is ephemeral: its IP can change at any time. You cannot rely on a pod's IP.
What you can do is create a Service (a ClusterIP works great for this use case) that will serve as the entry point for every replica of your Redis pod. The other containers should rely on the Service, and K8s will take care of updating the list of IP:port endpoints backing the Service.
You can find a great tutorial on doing this here.
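For illustration, a minimal ClusterIP Service for the Redis pods might look like this (the name and labels are assumed):
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  type: ClusterIP
  selector:
    app: redis            # must match the labels on the Redis pod(s)
  ports:
    - port: 6379
      targetPort: 6379
The other two containers then connect to redis:6379 (the Service's stable DNS name) instead of a hard-coded IP.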

k8s volume in memory between pods, or connect services to containers in a multi-container structure

I have an issue with k8s volumes
The structure
There are two services (pods): service one generates files, and the other service exposes those files over the web.
Service number one needs a UDP connection open to the world and also a TCP connection to communicate with other pods.
(two services, one for UDP and one for TCP)
Service number two needs a connection to the world (web).
(one service that connects to the ingress)
The Issue
I need an in-memory volume shared between those two pods, to speed up the R/W process.
The solution I checked is to use a multi-container structure with an emptyDir volume
(there is an option to back this volume with memory).
The problem with this solution is that I can't connect the k8s Service object to those containers: a Service connects only to pods, and only the pod gets an IP, not the containers.
Is there any solution or idea for this situation?
P.S. I'm running on AKS, if it matters.
If you really want to use a multi-container pod, why don't you create two Kubernetes Service resources, one mapping to containerPort A and one mapping to containerPort B? This way you would expose each container independently.
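A rough sketch of that idea, with made-up names, labels, and ports:
# Two Services selecting the same multi-container pod,
# each one exposing a different containerPort.
apiVersion: v1
kind: Service
metadata:
  name: generator-udp
spec:
  selector:
    app: my-app            # label on the shared pod (assumed)
  ports:
    - protocol: UDP
      port: 5000
      targetPort: 5000     # containerPort of the file-generating container
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080     # containerPort of the web-serving container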
Keep in mind that emptyDir is not „in memory“; it is just a shared volume on the same host, accessible only to the containers sharing the emptyDir.
Edit: as #Matt pointed out, I was wrongly informed: I was not aware of the emptyDir.medium="Memory" setting.
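A sketch of that memory-backed shared volume in a two-container pod (images, names, and paths are assumptions):
apiVersion: v1
kind: Pod
metadata:
  name: generator-and-web
  labels:
    app: my-app
spec:
  containers:
    - name: generator
      image: example/generator:latest   # assumed image
      volumeMounts:
        - name: shared
          mountPath: /out
    - name: web
      image: nginx:1.25
      volumeMounts:
        - name: shared
          mountPath: /usr/share/nginx/html
          readOnly: true
  volumes:
    - name: shared
      emptyDir:
        medium: Memory       # tmpfs-backed, shared by both containers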
So another solution could be to go with two independent pods and a dedicated volume (either at the host level or a Kubernetes persistent volume). With taints and tolerations you can then ensure that both pods are scheduled on the same node, where the actual volume is attached.

Communication between Pods in Kubernetes. Service object or Cluster Networking?

I'm a beginner in Kubernetes and I have the following situation: I have two different Pods, PodA and PodB. Firstly, I want to expose PodA to the outside world, so I create a Service (type NodePort or LoadBalancer) for PodA, which is not difficult for me to understand.
Then I want PodA to communicate with PodB, and after several hours of googling, I found the answer: I also need to create a Service (type ClusterIP if I want to keep PodB visible only inside the cluster) for PodB, and if I do so, PodA and PodB can communicate with each other. But the problem is that I also found this article. According to that webpage, communication between pods on the same node is done via cbr0, a network bridge, and communication between pods on different nodes is done via the cluster's route table, and they don't mention the Service object at all (which suggests we don't need the Service object?).
In fact, I also read the Kubernetes documentation, and I found this in Cluster Networking:
Cluster Networking
...
2. Pod-to-Pod communications: this is the primary focus of this document.
...
where they also focus on Pod-to-Pod communication, but there is nothing about the Service object.
So, I'm really confused right now, and my question is: could you please explain the connection between the things in that article and the Service object? Is the Service object a high-level abstraction of cbr0 and the route table? And in the end, how can Pods communicate with each other?
If I misunderstand something, please point it out for me; I really appreciate it.
Thank you guys !!!
Motivation behind using a service in a Kubernetes cluster.
Kubernetes Pods are mortal. They are born and when they die, they are not resurrected. If you use a Deployment to run your app, it can create and destroy Pods dynamically.
Each Pod gets its own IP address, however in a Deployment, the set of Pods running in one moment in time could be different from the set of Pods running that application a moment later.
This leads to a problem: if some set of Pods (call them “backends”) provides functionality to other Pods (call them “frontends”) inside your cluster, how do the frontends find out and keep track of which IP address to connect to, so that the frontend can use the backend part of the workload?
That being said, a Service is handy when your Deployments (PodA and PodB) are dynamically managed.
Your PodA can always communicate with PodB if it knows the address or the DNS name of PodB. In a cluster environment, there may be multiple replicas of PodB, or an instance of PodB may die and be replaced by another instance with a different address and a different name. A Service is an abstraction to deal with this situation. If you use a Service to expose your PodB, then all pods in the cluster can talk to an instance of PodB using that Service, which has a fixed name and a fixed address no matter how many instances of PodB exist and what their addresses are.
First, I read it as you are dealing with two applications, e.g. ApplicationA and ApplicationB. Don't use the Pod abstraction when you reason about your architecture. On Kubernetes, you are dealing with a distributed system, and it is designed so that you should have multiple instances of your Application, e.g. for High Availability. Each instance of your application is a Pod.
Deploy your applications ApplicationA and ApplicationB as Deployment resources. Then it is easy to do rolling upgrades without downtime, and Kubernetes will restart any instance of your application if it crashes.
For every Deployment (that is, for each of your applications), create one Service resource (e.g. ServiceA and ServiceB). When you communicate from ApplicationA to another application, use the Service, e.g. ServiceB. The Service will load balance your requests across the instances of the other application, and you can upgrade your Deployment without downtime.
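A hedged sketch of that layout for ApplicationB (names, labels, image, and ports are made up):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: application-b
spec:
  replicas: 2
  selector:
    matchLabels:
      app: application-b
  template:
    metadata:
      labels:
        app: application-b
    spec:
      containers:
        - name: application-b
          image: example/application-b:1.0   # assumed image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: service-b
spec:
  selector:
    app: application-b
  ports:
    - port: 80
      targetPort: 8080
ApplicationA then calls http://service-b (or service-b.<namespace>.svc.cluster.local), and the Service load balances across the ApplicationB pods.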
1. Cluster networking: as the name suggests, all the pods deployed in the cluster are connected by implementing a Kubernetes network model such as DANM or Flannel.
Check this link to see how to create a cluster network: Creating cluster network.
With a CNI plugin installed (i.e. the cluster network implemented), every pod gets an IP.
2. Service objects created with type ClusterIP point to these IPs (via Endpoints), which are maintained internally for communication.
Answering your question: yes, the Service object is a high-level abstraction over cbr0 and the route table.
You can use Service objects for communication between pods.
You can also use a service mesh such as Envoy or Istio if the network is complex.

Why don't Kubernetes deployments support services?

I'm new to K8s, so still trying to get my head around things. I've been looking at deployments and can appreciate how useful they will be. However, I don't understand why they don't support services (only replica sets and pods).
Why is this? Does this mean that services would typically be deployed outside of a deployment?
To answer your question: Kubernetes Deployments are used for managing stateless services running in the cluster, as opposed to StatefulSets, which are built for stateful applications. With a Deployment you describe the update strategy and the desired state of the underlying objects it creates, so its spec has dedicated fields such as the desired number of Pod replicas and the Pod template, which lists the containers that should run in each Pod.
However, as #P Ekambaram already mentioned in his answer, Services are the network communication abstraction layer inside a Kubernetes cluster; they declare a way to access Pods via the corresponding Endpoints. Services are kept separate from the Deployment manifest because their job is to provide specific network behavior for the underlying Pods dynamically, through the appropriate Service type, without affecting or restarting them whenever the communication setup changes.
Yes, services should be deployed as separate objects. Note that a Deployment is used to upgrade or roll back the image and works on top of a ReplicaSet.
Kubernetes Pods are mortal. They are born and when they die, they are not resurrected. ReplicaSets in particular create and destroy Pods dynamically (e.g. when scaling out or in). While each Pod gets its own IP address, even those IP addresses cannot be relied upon to be stable over time. This leads to a problem: if some set of Pods (let’s call them backends) provides functionality to other Pods (let’s call them frontends) inside the Kubernetes cluster, how do those frontends find out and keep track of which backends are in that set?
Services come to the rescue.
A Kubernetes Service is an abstraction which defines a logical set of Pods and a policy by which to access them. The set of Pods targeted by a Service is (usually) determined by a Label Selector.
Something I've just learnt that is somewhat related to my question: multiple K8s objects can be included in the same YAML file, separated by ---. Something like:
apiVersion: apps/v1
kind: Deployment
# other stuff here
---
apiVersion: v1
kind: Service
# other stuff here
I think the intent is to keep things decoupled and fine-grained.

Redistribute pods after adding a node in Kubernetes

What should I do with pods after adding a node to the Kubernetes cluster?
I mean, ideally I want some of them to be stopped and started on the newly added node. Do I have to manually pick some for stopping and hope that they'll be scheduled for restarting on the newly added node?
I don't care about affinity, just semi-even distribution.
Maybe there's a way to always have the number of pods be equal to the number of nodes?
For the sake of having an example:
I'm using juju to provision a small Kubernetes cluster on AWS: one master and two workers. This is just a playground.
My application is Apache serving PHP and static files. So I have a Deployment, a Service of type NodePort, and an Ingress using nginx-ingress-controller.
I've turned off one of the worker instances and my application pods were recreated on the one that remained working.
I then started the instance back, master picked it up and started nginx ingress controller there. But when I tried deleting my application pods, they were recreated on the instance that kept running, and not on the one that was restarted.
Not sure if it's important, but I don't have any DNS set up. I just added the IP of one of the instances to /etc/hosts with the host value from my ingress.
The descheduler, a Kubernetes incubator project, could be helpful. Here is its introduction:
As Kubernetes clusters are very dynamic and their state changes over time, it may be desirable to move already-running pods to other nodes for various reasons:
Some nodes are under or over utilized.
The original scheduling decision does not hold true any more, as taints or labels are added to or removed from nodes, pod/node affinity requirements are not satisfied any more.
Some nodes failed and their pods moved to other nodes.
New nodes are added to clusters.
There is no automatic redistribution in Kubernetes when you add a new node. You can force a redistribution of individual pods by deleting them and having a host-based anti-affinity policy in place. Otherwise Kubernetes will prefer using the new node for scheduling and thus achieve a redistribution over time.
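For reference, a sketch of such a host-based anti-affinity rule in the Deployment's pod template (the label is assumed):
# Prefer not to schedule two replicas of the same app on one node,
# so deleted pods tend to be re-created on the new, emptier node.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: my-app              # the pods' own label (assumed)
          topologyKey: kubernetes.io/hostname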
What are your reasons for a manually triggered redistribution?