Can I have a K8s pod per user/firm? - kubernetes

Is there a way we can have a K8s pod per user/per firm? I realise, per user/per firm grouping is mixing up the business level semantics with infrastructure but say I had this need for regulatory reasons, etc to keep things separate. Then is there a way to create a pod on the fly when a user logs in for the first time and hold this pod reference and route any further requests to the relevant pod which will host a set of containers each running an instance of one of the modules.
Is this even possible?
If possible, what are those identifiers that
can be injected into the pod on the fly that I could use to identify that this is
USER-A-POD vs USER_B_POD or FIRM_A_POD vs FIRM_B_POD ?
Effectively, I need to have a pod template that helps me create identical pods of 1 replica but the only way they differ is they are serving traffic related to one user/one firm only.

Generally, if you want to send traffic to a specific pod say from a Kubernetes Service you would use Labels and Selectors. For example, using the selector app: usera-app in the Service:
apiVersion: v1
kind: Service
metadata:
name: usera-service
spec:
selector:
app: usera-app
ports:
- protocol: TCP
port: 80
targetPort: 80
Then say if the Deployment for your pods, using the label app: usera-app:
apiVersion: apps/v1
kind: Deployment
metadata:
name: usera-deployment
spec:
selector:
matchLabels:
app: usera-app
replicas: 2
template:
metadata:
labels:
app: usera-app
spec:
containers:
- name: myservice
image: nginx
ports:
- containerPort: 80
More info here
How you assign your pods and deployments is up to you and whatever configuration you may use. If you'd like to force create some of the labels in deployments/pods you can take a look at MutatingAdminssionWebhooks.
If you are looking at projects to facilitate all this you can take a look at:
Gatekeeper which is an implementation of the Open Policy Agent for Kubernetes admission. (Still in alpha as of this writing)
Other tools that can help you with attestation and admission mechanism (would have to be adapted for labels):
Kritis
Portieris

Yes, you can create multiple virtual clusters for each user with namespaces.
https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
Namespaces are the way to divide cluster between users.

Related

kubernetes - container busy with request, then request should route to another container

I am very new to k8s and docker. But I have task on k8s. Now I stuck with a use case. That is:
If a container is busy with requests. Then incoming request should redirect to another container.
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: twopoddeploy
namespace: twopodns
spec:
selector:
matchLabels:
app: twopod
replicas: 1
template:
metadata:
labels:
app: twopod
spec:
containers:
- name: secondcontainer
image: "docker.io/tamilpugal/angmanualbuild:latest"
env:
- name: "PORT"
value: "24244"
- name: firstcontainer
image: "docker.io/tamilpugal/angmanualbuild:latest"
env:
- name: "PORT"
value: "24243"
service.yaml
apiVersion: v1
kind: Service
metadata:
name: twopodservice
spec:
type: NodePort
selector:
app: twopod
ports:
- nodePort: 31024
protocol: TCP
port: 82
targetPort: 24243
From deployment.yaml, I created a pod with two containers with same image. Because, firstcontainer is not reachable/ busy, then secondcontainer should handles the incoming requests. This our idea and use case. So (only for checking our use case) I delete firstcontainer using docker container rm -f id_of_firstcontainer. Now site is not reachable until, docker recreates the firstcontainer. But I need k8s should redirects the requests to secondcontainer instead of waiting for firstcontainer.
Then, I googled about the solution, I found Ingress and Liveness - Readiness. But Ingress does route the request based on the path instead of container status. Liveness - Readiness also recreates the container. SO also have some question and some are using ngix server. But no luck. That why I create a new question.
So my question is how to configure the two containers to reduce the downtime?
what is keyword for get the solution from the google to try myself?
Thanks,
Pugal.
A service can load-balance between multiple pods. You should delete the second copy of the container inside the deployment spec, but also change it to have replicas: 2. Now your deployment will launch two identical pods, but the service will match both of them, and requests will go to both.
apiVersion: apps/v1
kind: Deployment
metadata:
name: onepoddeploy
namespace: twopodns
spec:
selector: { ... }
replicas: 2 # not 1
template:
metadata: { ... }
spec:
containers:
- name: firstcontainer
image: "docker.io/tamilpugal/angmanualbuild:latest"
env:
- name: "PORT"
value: "24243"
# no secondcontainer
This means if two pods aren't enough to handle your load, you can kubectl scale deployment -n twopodns onepoddeploy --replicas=3 to increase the replica count. If you can tell from CPU utilization or another metric when you're getting to "not enough", you can configure the horizontal pod autoscaler to adjust the replica count for you.
(You will usually want only one container per pod. There's no way to share traffic between containers in the way you're suggesting here, and it's helpful to be able to independently scale components. This is doubly true if you're looking at a stateful component like a database and a stateless component like an HTTP service. Typical uses for multiple containers are things like log forwarders and network proxies, that are somewhat secondary to the main operation of the pod, and can scale or be terminated along with the primary container.)
There are two important caveats to running multiple pod replicas behind a service. The service load balancer isn't especially clever, so if one of your replicas winds up working on intensive jobs and the other is more or less idle, they'll still each get about half the requests. Also, if your pods are configure for HTTP health checks (recommended), if a pod is backed up to the point where it can't handle requests, it will also not be able to answer its health checks, and Kubernetes will kill it off.
You can help Kubernetes here by trying hard to answer all HTTP requests "promptly" (aiming for under 1000 ms always is probably a good target). This can mean returning a "not ready yet" response to a request that triggers a large amount of computation. This can also mean rearranging your main request handler so that an HTTP request thread isn't tied up waiting for some task to complete.

How to Route to specific pod through Kubernetes Service (like a Gateway API)

I am running Kubernetes on "Docker Desktop" in Windows.
I have a LoadBalancer Service for a deployment which has 3 replicas.
I would like to access SPECIFIC pod through some means (such as via URL path : < serviceIP >:8090/pod1).
Is there any way to achieve this usecase?
deployment.yaml :
apiVersion: v1
kind: Service
metadata:
name: my-service1
labels:
app: stream
spec:
ports:
- port: 8090
targetPort: 8090
name: port8090
selector:
app: stream
# clusterIP: None
type: LoadBalancer
---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: stream-deployment
labels:
app: stream
spec:
replicas: 3
selector:
matchLabels:
app: stream
strategy:
type: Recreate
template:
metadata:
labels:
app: stream
spec:
containers:
- image: stream-server-mock:latest
name: stream-server-mock
imagePullPolicy: Never
env:
- name: STREAMER_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: STREAMER_ADDRESS
value: stream-server-mock:8090
ports:
- containerPort: 8090
My end goal is to achieve horizontal auto-scaling of pods.
How Application designed/and works as of now (without kubernetes) :
There are 3 components : REST-Server, Stream-Server (3 instances
locally on different JVM on different ports), and RabbitMQ.
1 - The client sends a request to "REST-Server" for a stream url.
2 - The REST-Server puts in the RabbitMQ queue.
3 - One of the Stream-Server picks it up and populates its IP and sends back to REST-Server through RabbitMQ.
4 - The client receives the IP and establishes a direct WS connection using the IP.
The Problem what I face is :
1 - When the client requests for a stream IP, one of the pods (lets say POD1) picks it up and sends its URL (which is service URL, comes through LoadBalancer Service).
2 - Next time when the client tries to connect (WebSocket Connection) using the Service IP, it wont be the same pod which accepted the request.
It should be the same pod which accepted the request, and must be accessible by the client.
You can use StatefulSets if you are not required to use deployment.
For replica 3, you will have 3 pods named
stream-deployment-0
stream-deployment-1
stream-deployment-2
You can access each pod as $(podname).$(service name).$(namespace).svc.cluster.local
For details, check this
You may also want to set up an ingress to point each pod from outside of the cluster.
As mentioned by aerokite, you can use StatefulSets. However, if you don't want to modify your deployments, you can simply use Headless Services. As specified in the documentation:
For headless Services, a cluster IP is not allocated.
For headless Services that define selectors, the endpoints controller
creates Endpoints records in the API, and modifies the DNS
configuration to return records (addresses) that point directly to the
Pods backing the Service.
This means that whenever you query the DNS name for your Service (i.e. my-svc.my-namespace.svc.cluster-domain.example), what you get is a list of all the Pod IPs (unlike regular services where you get the cluster IP). You can then select your Pods using your own mechanisms.
Regarding your new question, if that is your only issue, you can use session affinity. If you set service.spec.sessionAffinity to ClientIP, then connections from a particular client will always go to the same Pod each time. You don't need other modifications like the headless Services mentioned above.
IMO, the only way to achieve this will be:
Instead of using a deployment with 3 replicas, use 3 deployments with 1 replicas each (or just create pods only); deployment1 -> pod1, deployment2 -> pod2, deployment3 -> pod3
Expose all the deployments on a separate service, service1 -> deployment1, service2 -> deployment2, service3 -> deployment3
Create an ingress resource and route to each pod using the service for each deployment. For example:
ingress-url/service1
ingress-url/service2
ingress-url/service3

how to set different environment variables of Deployment replicas in kubernetes

I have 4 k8s pods by setting the replicas of Deployment to 4 now.
apiVersion: v1
kind: Deployment
metadata:
...
spec:
...
replicas: 4
...
The POD will get items in a database and consume it, the items in database has a column class_name.
now I want one pod only get one class_name's item.
for example pod1 only get item which class_name equals class_name_1, and pod2 only get item which class_name equals class_name_2...
So I want to pass different class_name as environment variables to different Deployment PODs. Can I define it in the yaml file of Deployment?
Or is there any other way to achieve my goal?(like something other than Deployment in k8s)
For distributed job processing Deployments are not very good, because they don't have any type of ordering or consistent pod hostnames. You'd better use StatefulSet for it, because they have consistent naming, like pod-0, pod-1, pod-2. You can rely on that hostname index.
For example, if your class_name_idx - is the index of class name in class names list, num_replicas - is the number of replicas in StatefulSet and pod_idx - is the index of pod in StatefulSet, then pod should run the job only if: class_name_idx % num_replicas == pod_idx.
Unfortunately number of StatefulSet replicas cannot be obtained within the pod dynamically using Downward API, so you can either hardcode it or use Kubernetes API to obtain it from cluster.
Neither Deployment nor anything else won't help to achieve your goal. Your goal is some kind of logic and it should be implemented via code in your application.
Since the Deployment is some instances of the same application the only thing that might be useful for you is: using multiple deployments, each for its own task. The first could get class_name_1 item, while other class_name_2, class_name_3 etc. But it is not a good idea
I would not recommend this approach, but the closest thing to do what you want is using the stateful-set and use the pod name as the index.
When you deploy a stateful set, the pods will be named after their statefulset name, in the following sample:
apiVersion: v1
kind: Service
metadata:
name: kuard
labels:
app: kuard
spec:
type: NodePort
ports:
- port: 8080
name: web
selector:
app: kuard
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: kuard
spec:
serviceName: "kuard"
replicas: 3
selector:
matchLabels:
app: kuard
template:
metadata:
labels:
app: kuard
spec:
containers:
- name: kuard
image: gcr.io/kuar-demo/kuard-amd64:1
ports:
- containerPort: 8080
name: web
The pods created by the statefulset will be named as:
kuard-0
kuard-1
kuard-2
This way you could either, name the stateful-set according to the classes, i.e: class-name and the pod created will be class-name-0 and you can replace the _ by -. Or just strip the name out to get the index at the end.
To get the name just read the environment variable HOSTNAME
This naming is consistent, so you can make sure you always have 0, 1, 2, 3 after the name. And if the 2 goes down, it will be recreated.
Like I said, I would not recommend this approach because you tie the infrastructure to your code, and also can't scale(if needed) because each service are unique and adding new instances would get new ids.
A better approach would be using one deployment for each class and pass the proper values as environment variables.

Why should I specify service before deployment in a single Kubernetes configuration file?

I'm trying to understand why kubernetes docs recommend to specify service before deployment in one configuration file:
The resources will be created in the order they appear in the file. Therefore, it’s best to specify the service first, since that will ensure the scheduler can spread the pods associated with the service as they are created by the controller(s), such as Deployment.
Does it mean spread pods between kubernetes cluster nodes?
I tested with the following configuration where a deployment is located before a service and pods are distributed between nods without any issues.
apiVersion: apps/v1
kind: Deployment
metadata:
name: incorrect-order
namespace: test
spec:
selector:
matchLabels:
app: incorrect-order
replicas: 2
template:
metadata:
labels:
app: incorrect-order
spec:
containers:
- name: incorrect-order
image: nginx
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: incorrect-order
namespace: test
labels:
app: incorrect-order
spec:
type: NodePort
ports:
- port: 80
selector:
app: incorrect-order
Another explanation is that some environment variables with service URL will not be set for pods in this case. However it also works ok in case a configuration is inside one file like the example above.
Could you please explain why it is better to specify service before the deployment in case of one configuration file? Or may be it is some outdated recommendation.
If you use DNS as service discovery, the order of creation doesn't matter.
In case of Environment Vars (the second way K8S offers service discovery) the order matters, because once that vars are passed to the starting pod, they cannot be modified later if the service definition changes.
So if your service is deployed before you start your pod, the service envvars are injected inside the linked pod.
If you create a Pod/Deployment resource with labels, this resource will be exposed through a service once this last is created (with proper selector to indicate what resource to expose).
You are correct in that it effects the spread among the worker nodes.
Deployments without a Service will simply be scheduled onto the nodes with the least cpu/memory allocation. For instance, a brand new and empty node will get all new pods from a new deployment.
With a Deployment that also has a service the Scheduler tries to spread the pods between nodes, disregarding the cpu/memory load (within limits), to help the Service survive better.
It puzzles me that a Deployment on it's own doesn't cause a optimal spread but it doesn't, not yet at least.
This is the answer from the official documentation:
The resources will be created in the order they appear in the file.
Therefore, it's best to specify the service first, since that will
ensure the scheduler can spread the pods associated with the service
as they are created by the controller(s), such as Deployment.
Kubernetes Documentation/Concepts/Cluster/Administration/Managing Resources

How to configure a Kubernetes Multi-Pod Deployment

I would like to deploy an application cluster by managing my deployment via k8s Deployment object. The documentation has me extremely confused. My basic layout has the following components that scale independently:
API server
UI server
Redis cache
Timer/Scheduled task server
Technically, all 4 above belong in separate pods that are scaled independently.
My questions are:
Do I need to create pod.yml files and then somehow reference them in deployment.yml file or can a deployment file also embed pod definitions?
K8s documentation seems to imply that the spec portion of Deployment is equivalent to defining one pod. Is that correct? What if I want to declaratively describe multi-pod deployments? Do I do need multiple deployment.yml files?
Pagids answer has most of the basics. You should create 4 Deployments for your scenario. Each deployment will create a ReplicaSet that schedules and supervises the collection of PODs for the Deployment.
Each Deployment will most likely also require a Service in front of it for access. I usually create a single yaml file that has a Deployment and the corresponding Service in it. Here is an example for an nginx.yaml that I use:
apiVersion: v1
kind: Service
metadata:
annotations:
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
name: nginx
labels:
app: nginx
spec:
type: NodePort
ports:
- port: 80
name: nginx
targetPort: 80
nodePort: 32756
selector:
app: nginx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: nginxdeployment
spec:
replicas: 3
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginxcontainer
image: nginx:latest
imagePullPolicy: Always
ports:
- containerPort: 80
Here some additional information for clarification:
A POD is not a scalable unit. A Deployment that schedules PODs is.
A Deployment is meant to represent a single group of PODs fulfilling a single purpose together.
You can have many Deployments work together in the virtual network of the cluster.
For accessing a Deployment that may consist of many PODs running on different nodes you have to create a Service.
Deployments are meant to contain stateless services. If you need to store a state you need to create StatefulSet instead (e.g. for a database service).
You can use the Kubernetes API reference for the Deployment and you'll find that the spec->template field is of type PodTemplateSpec along with the related comment (Template describes the pods that will be created.) it answers you questions. A longer description can of course be found in the Deployment user guide.
To answer your questions...
1) The Pods are managed by the Deployment and defining them separately doesn't make sense as they are created on demand by the Deployment. Keep in mind that there might be more replicas of the same pod type.
2) For each of the applications in your list, you'd have to define one Deployment - which also makes sense when it comes to difference replica counts and application rollouts.
3) you haven't asked that but it's related - along with separate Deployments each of your applications will also need a dedicated Service so the others can access it.
additional information:
API server use deployment
UI server use deployment
Redis cache use statefulset
Timer/Scheduled task server maybe use a statefulset (If your service has some state in)