Kubernetes: different env variables for one replica of a Service

I have a use case where one service must be scaled to 10 pods.
However, one of the pods must have different env variables. It performs certain actions such as DB changes and trigger handling, and I don't want 10 triggers handled for a single DB change. For example, 9 pods have the env variable CHANGE=0 and one pod has CHANGE=1.
Also, I am resolving by service name, so changing the service name is not what I am looking for.

It sounds like you're trying to solve an issue with your app using Kubernetes.
I say that because the whole concept of "replicas" is to have identical instances. What you're actually saying is "I have 10 identical pods but I want 1 of them to be different", and that's not how Kubernetes works.
So you need to rethink why this environment variable has to be different and what you use it for. If you share the details, maybe I can help you find an idiomatic way of doing this in Kubernetes.

The easiest way to do what you describe is to have two separate Services. One attaches to any "web" pod:
apiVersion: v1
kind: Service
metadata:
  name: myapp-web
spec:
  selector:
    app: myapp
    tier: web
The second attaches to only the master pod(s):
apiVersion: v1
kind: Service
metadata:
  name: myapp-master
spec:
  selector:
    app: myapp
    tier: web
    role: master
Then run two separate Deployments: one with a single replica for the master pod, and one with nine replicas for the ordinary server pods. Administrative requests go to myapp-master, while general requests go to myapp-web.
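A minimal sketch of those two Deployments, assuming the CHANGE variable from the question is what distinguishes the pods. The Deployment names, the role: worker label, and the image myapp:latest are placeholders for illustration, not part of the original setup.

```yaml
# Single pod that handles DB changes and triggers (CHANGE=1).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-master          # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      tier: web
      role: master
  template:
    metadata:
      labels:
        app: myapp
        tier: web
        role: master
    spec:
      containers:
      - name: myapp
        image: myapp:latest   # placeholder image
        env:
        - name: CHANGE
          value: "1"
---
# Nine ordinary pods (CHANGE=0).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-worker          # hypothetical name
spec:
  replicas: 9
  selector:
    matchLabels:
      app: myapp
      tier: web
      role: worker            # hypothetical label, keeps the two selectors disjoint
  template:
    metadata:
      labels:
        app: myapp
        tier: web
        role: worker
    spec:
      containers:
      - name: myapp
        image: myapp:latest   # placeholder image
        env:
        - name: CHANGE
          value: "0"
```

Because both pod templates carry app: myapp and tier: web, the myapp-web Service still resolves to all ten pods, so the service name the clients use does not change.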
As @omricoco suggests, there are a couple of ways to restructure this. A job queue like RabbitMQ has the property that each job is done once (with retries if a job fails). So one setup is to run such a queue and allow any server to accept administrative requests, but have the handler do nothing except write a job into the queue. Then you can run a worker process (or several) to service these jobs.

Related

Kubectl get deployments, no resources

I've just started learning Kubernetes. In every tutorial the writer generally uses "kubectl ... deployments" to control the newly created deployments. With those commands (e.g. kubectl get deployments) I always get the response No resources found in default namespace., and I have to use "pods" instead of "deployments" to make things work (which works fine).
My question is: what is causing this, and what is the difference between using a deployment and a pod? I set the docker driver when I first set up minikube; does that have something to do with it?
First, let's brush up on some terminology.
Pod - The basic building block of Kubernetes. It groups one or more containers (such as Docker containers) with shared storage/network and a specification for how to run the containers.
Deployment - A controller that wraps Pods and manages their life cycle, that is, it drives the actual state toward the desired state. There is one more layer between Deployment and Pod, the ReplicaSet: a ReplicaSet's purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
[Figure omitted: the author's own drawing of how a Deployment, its ReplicaSet, and its Pods relate.]
In your case, one of two things might have happened:
Either you created a Pod, not a Deployment. In that case kubectl get deployment shows no resources. Note that when you create a Deployment, it in turn creates a ReplicaSet for you, which creates the defined pods.
Or maybe you created your Deployment in a different namespace. If so, use this command to find it in that namespace: kubectl get deploy NAME_OF_DEPLOYMENT -n NAME_OF_NAMESPACE
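To illustrate the first case, here is a sketch of a bare Pod manifest. The name nginx-standalone is made up for the example. An object like this is listed by kubectl get pods, but since no Deployment or ReplicaSet owns it, kubectl get deployments reports no resources.

```yaml
# A bare Pod: visible under "pods", with no Deployment or ReplicaSet behind it.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-standalone   # hypothetical name
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
```

If such a Pod is deleted, nothing recreates it. That is the practical difference from Pods managed by a Deployment.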
More information to clarify your concepts:
In the manifest below, the section inside spec.template is what your Pod manifest would be if you created the Pod manually instead of taking the Deployment route. As I said earlier, in simple terms a Deployment is a wrapper around your Pods, so anything outside spec.template is the configuration that defines how you want your Pods managed (scaling, affinity, etc.).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
A Deployment is a controller providing a higher-level abstraction on top of Pods and ReplicaSets. It provides declarative updates for Pods and ReplicaSets. Internally, a Deployment creates ReplicaSets, within which the Pods are created.
The use cases for Deployments are described in the Kubernetes documentation.
One reason for No resources found in default namespace could be that you created the deployment in a specific namespace and not in the default namespace.
You can see deployments in a specific namespace or in all namespaces via
kubectl get deploy -n namespacename
kubectl get deploy -A

Make RabbitMQ durable/persistent queues survive Kubernetes pod restart

Our application uses RabbitMQ with only a single node. It is run in a single Kubernetes pod.
We use durable/persistent queues, but any time that our cloud instance is brought down and back up, and the RabbitMQ pod is restarted, our existing durable/persistent queues are gone.
At first, I thought that the volume the queues were stored on was not persistent, but that turned out not to be the case.
It appears that the queue data is stored in /var/lib/rabbitmq/mnesia/<user@hostname>. Since the pod's hostname changes each time, RabbitMQ creates a new set of data for the new hostname and loses access to the previously persisted queues. I have many sets of files built up in the mnesia folder, all from previous restarts.
How can I prevent this behavior?
The closest answer that I could find is in this question, but if I'm reading it correctly, this would only work if you have multiple nodes in a cluster simultaneously, sharing queue data. I'm not sure it would work with a single node. Or would it?
What helped in our case was to set hostname: <static-host-value>
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 1
  ...
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      ...
      containers:
      - name: rabbitmq
        image: rabbitmq:3-management
        ...
      hostname: rmq-host
How can I prevent this behavior?
By using a StatefulSet as is intended for the case where Pods have persistent data that is associated with their "identity." The Helm chart is a fine place to start reading, even if you don't end up using it.
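As a rough sketch of what that could look like for a single node (not taken from the Helm chart): a headless Service plus a one-replica StatefulSet. The object names, the volume name data, and the 1Gi size are assumptions for illustration. The pod is always named rabbitmq-0, so its hostname stays the same across restarts, and the volume claim follows it.

```yaml
# Headless Service that gives the StatefulSet pod a stable DNS identity.
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
spec:
  clusterIP: None
  selector:
    app: rabbitmq
  ports:
  - name: amqp
    port: 5672
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq        # must reference the headless Service above
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:3-management
        ports:
        - containerPort: 5672
        volumeMounts:
        - name: data
          mountPath: /var/lib/rabbitmq   # parent of the mnesia directory
  volumeClaimTemplates:
  - metadata:
      name: data               # hypothetical volume name
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi         # placeholder size
```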
I ran into this issue myself. The quickest fix I found was to set the environment variable RABBITMQ_NODENAME to a fixed value ("yourapplicationsqueuename") and to make sure the deployment had only 1 replica.
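A sketch of that approach, assuming a single-node broker where a node name of rabbit@localhost is acceptable. That value and the claim name rabbitmq-data are illustrative choices, not something the answer specifies. With a fixed node name, the data directory under mnesia no longer depends on the pod's hostname.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rabbitmq
spec:
  replicas: 1                  # a fixed node name only makes sense with one pod
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:3-management
        env:
        - name: RABBITMQ_NODENAME
          value: rabbit@localhost        # assumed value; any fixed name the node can resolve
        volumeMounts:
        - name: data
          mountPath: /var/lib/rabbitmq
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: rabbitmq-data       # hypothetical, pre-existing claim
```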

Kubernetes set deployment number of replicas based on namespace

I've split our Kubernetes cluster into two different namespaces; staging and production, aiming to have production deployments having two replicas (for rolling deployments, autoscaling comes later) and staging having one single replica.
Other than having one deployment configuration per namespace, I was wondering whether or not we could set the default number of replicas per deployment, per namespace?
When creating the deployment config, if you don't specify the number of replicas, it will default to one. Is there a way of defaulting it to two on the production namespace?
If not, is there a recommended approach for this which will prevent the need to have a deployment config per namespace?
One way of doing this would be to scale the deployment up to two replicas, manually, in the production namespace, once it has been created for the first time, but I would prefer to skip any manual steps.
It is not possible to set a different number of replicas per namespace in one deployment.
But you can have two deployment files, one per namespace, e.g. <your-app>-production.yaml and <your-app>-staging.yaml.
In these manifests you can set any custom values and settings that you need.
For example:
<your-app>-production.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <your-app>
  namespace: production # Here is the namespace
  ...
spec:
  replicas: 2 # Here is the number of replicas of your application
  template:
    spec:
      containers:
      - name: <your-app-pod-name>
        image: <your-app-image>
        ...
<your-app>-staging.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <your-app>
  namespace: staging # Here is the namespace
  ...
spec:
  replicas: 1 # Here is the number of replicas of your application
  template:
    spec:
      containers:
      - name: <your-app-pod-name>
        image: <your-app-image>
        ...
I don't think you can avoid having two deployments, but you can get rid of the duplicated code by using Helm templates (https://docs.helm.sh/chart_template_guide). You define a single deployment YAML and substitute different values at deploy time, for example with an if statement in the template.
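A minimal sketch of such a chart, using a plain value substitution rather than an if statement. The file names, the replicaCount and image value keys, and the label scheme are assumptions for illustration.

```yaml
# templates/deployment.yaml (hypothetical chart layout)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
      - name: app
        image: {{ .Values.image }}
```

The per-environment differences then live in small values files:

```yaml
# values.yaml (defaults, used as-is for staging)
replicaCount: 1
image: your-app-image    # placeholder
---
# values-production.yaml (overrides only what differs)
replicaCount: 2
```

Installing with --namespace production and -f values-production.yaml would render two replicas, while a staging install without the extra values file would render one.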
When creating the deployment config, if you don't specify the number of replicas, it will default to one. Is there a way of defaulting it to two on the production namespace?
Actually, there are two ways to do it, but both of them involve coding.
Admission Controllers:
This is the recommended way of assigning default values to fields.
When an object is created in Kubernetes, it passes through several admission controllers, one of which is the mutating admission webhook.
The mutating admission webhook has been in beta since v1.9. This admission controller modifies (mutates) the object before it is actually created (or modified/deleted), for example by assigning default values to some fields. You can change the minimum number of replicas here.
You have to implement an admission server that receives requests from Kubernetes and returns the modified object in its response.
Here is a sample admission server implemented by OpenShift: kubernetes-namespace-reservation.
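The registration side of this could look roughly like the sketch below. It only tells the API server when to call the webhook. The admission server itself (here a hypothetical Service replica-defaulter in a hypothetical namespace webhooks) still has to be written and would return a patch that raises spec.replicas to the desired minimum. The namespaceSelector assumes a cluster recent enough to label namespaces with kubernetes.io/metadata.name automatically.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: default-replicas                  # hypothetical name
webhooks:
- name: default-replicas.example.com      # hypothetical; must be a fully qualified name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore                   # do not block deployments if the webhook is down
  clientConfig:
    service:
      namespace: webhooks                 # hypothetical namespace of the admission server
      name: replica-defaulter             # hypothetical Service name
      path: /mutate
    caBundle: <base64-encoded CA certificate>
  rules:
  - operations: ["CREATE"]
    apiGroups: ["apps"]
    apiVersions: ["v1"]
    resources: ["deployments"]
  namespaceSelector:                      # only intercept objects in the production namespace
    matchLabels:
      kubernetes.io/metadata.name: production
```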
Deployment Controller:
This is comparatively easier, but it amounts to hacking the deployment procedure.
You can write a controller that watches Deployments and acts whenever one is created. There you can update the Deployment with whatever minimum values you want.
You can see the official Sample Pod Controller.
If both of these seem like a lot of work, it is better to assign the fields carefully each time for each deployment.

OpenShift - how can pods resolve each other names

I'm trying to run Cloudera Manager and Cloudera agents on OpenShift. To run the installation, I need all the pods to communicate with each other.
Manually, I modified /etc/hosts on the manager to add all the agents, and on each agent I added the manager and all the other agents.
Now I want to automate this. Suppose I add a new agent: I want it to resolve the manager and the other hosts. (I can get part of it done by passing the manager name as an env variable and adding it to /etc/hosts with a shell script. Not ideal, but still a solution.) The second part is more difficult: getting the manager to resolve every new agent, and every agent to resolve every other agent in the same service.
I was wondering if there is a way for every pod in the cluster to resolve the others' names?
I have two services: cloudera-manager with one pod, and another service, cloudera-agent, with (let's say) 3 agents.
Do you have any idea?
Thank you.
Not sure, but it looks like you could benefit from StatefulSets.
There are other ways to get the other pods' IPs (such as using a headless Service or querying the API server directly), but StatefulSets provide:
Stable, unique network identifiers.
Stable, persistent storage.
Lots of other functionality that facilitates deploying special kinds of clusters, such as distributed databases. I'm not sure 'distributed' is the right term here, but it helps me remember what they are for :).
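A sketch of how this might look for the agents, assuming a placeholder image cloudera-agent:latest and the app: cloudera-agent label (neither is from the question). With a headless Service named cloudera-agent, each pod gets a predictable DNS name of the form cloudera-agent-0.cloudera-agent.<namespace>.svc.cluster.local, so nothing has to be written into /etc/hosts.

```yaml
# Headless Service that governs the StatefulSet's per-pod DNS names.
apiVersion: v1
kind: Service
metadata:
  name: cloudera-agent
spec:
  clusterIP: None
  selector:
    app: cloudera-agent
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cloudera-agent
spec:
  serviceName: cloudera-agent    # ties pod DNS names to the headless Service
  replicas: 3                    # pods are named cloudera-agent-0, -1, -2
  selector:
    matchLabels:
      app: cloudera-agent
  template:
    metadata:
      labels:
        app: cloudera-agent
    spec:
      containers:
      - name: agent
        image: cloudera-agent:latest   # placeholder image
```

Scaling to a fourth agent adds cloudera-agent-3 with a name the manager can resolve immediately.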
If you want to get all Pods running under a certain Service, make sure to use a headless Service (i.e. set clusterIP: None). Then, you can query your local DNS-Server for the Service and will receive A-Records for all Pods assigned to it:
---
apiVersion: v1
kind: Service
metadata:
  name: my-sv
  namespace: my-ns
  labels:
    app: my-app
spec:
  clusterIP: None
  selector:
    app: my-app
Then start your Pods (make sure to give app: labels for assignment) and query your DNS-Server from any of them:
kubectl exec -ti my-pod --namespace=my-ns -- /bin/bash
$ nslookup my-sv.my-ns.svc.cluster.local
Server: 10.255.3.10
Address: 10.255.3.10#53
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.24.11
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.5.73
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.87.6

How to configure a Kubernetes Multi-Pod Deployment

I would like to deploy an application cluster by managing my deployment via k8s Deployment object. The documentation has me extremely confused. My basic layout has the following components that scale independently:
API server
UI server
Redis cache
Timer/Scheduled task server
Technically, all 4 above belong in separate pods that are scaled independently.
My questions are:
Do I need to create pod.yml files and then somehow reference them in deployment.yml file or can a deployment file also embed pod definitions?
K8s documentation seems to imply that the spec portion of a Deployment is equivalent to defining one pod. Is that correct? What if I want to declaratively describe multi-pod deployments? Do I need multiple deployment.yml files?
Pagid's answer has most of the basics. You should create 4 Deployments for your scenario. Each Deployment will create a ReplicaSet that schedules and supervises the collection of Pods for the Deployment.
Each Deployment will most likely also require a Service in front of it for access. I usually create a single yaml file that has a Deployment and the corresponding Service in it. Here is an example for an nginx.yaml that I use:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    name: nginx
    targetPort: 80
    nodePort: 32756
  selector:
    app: nginx
---
apiVersion: apps/v1     # extensions/v1beta1 Deployments were removed in Kubernetes 1.16
kind: Deployment
metadata:
  name: nginxdeployment
spec:
  replicas: 3
  selector:             # required by apps/v1; must match the template labels
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginxcontainer
        image: nginx:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 80
Here is some additional information for clarification:
A POD is not a scalable unit. A Deployment that schedules PODs is.
A Deployment is meant to represent a single group of PODs fulfilling a single purpose together.
You can have many Deployments work together in the virtual network of the cluster.
For accessing a Deployment that may consist of many PODs running on different nodes you have to create a Service.
Deployments are meant to contain stateless services. If you need to store a state you need to create StatefulSet instead (e.g. for a database service).
If you look at the Kubernetes API reference for the Deployment, you'll find that the spec->template field is of type PodTemplateSpec, with the comment "Template describes the pods that will be created." That answers your questions. A longer description can of course be found in the Deployment user guide.
To answer your questions...
1) The Pods are managed by the Deployment and defining them separately doesn't make sense as they are created on demand by the Deployment. Keep in mind that there might be more replicas of the same pod type.
2) For each of the applications in your list, you'd have to define one Deployment, which also makes sense when it comes to different replica counts and application rollouts.
3) You haven't asked this, but it's related: along with separate Deployments, each of your applications will also need a dedicated Service so the others can access it.
Additional information:
API server: use a Deployment
UI server: use a Deployment
Redis cache: use a StatefulSet
Timer/Scheduled task server: maybe use a StatefulSet (if your service holds some state)