K8s service doesn't use autoscaled mongodb pods

I am trying to deploy mongodb with Kubernetes (GKE more precisely). This database is used by a microservice which only needs to read from the database, so I thought of deploying multiple pods, each running a mongodb Docker container, so that the work is shared between them. To do that I created a mongodb image into which I copied the database I previously used with a single Docker container.
(Here is its Dockerfile, a single deployment of this image works in k8s so I guess that this may not be linked to the issue)
FROM mongo:latest
EXPOSE 27017
COPY /mdb/ /data/db
As the number of requests to the db varies during the day, I want to use GKE horizontal autoscaling for those "mongodb pods". Autoscaling works: new pods are created when the CPU utilization goes over the target I set in my horizontal pod autoscaler. But these new pods are not used by the service I created for the deployment file used to deploy those pods, and that's my issue.
Something strange to me is that the new pods' local IP addresses appear in my service endpoints, and when I delete the initial pod, which is the only one working at that moment, the other pods created by the autoscaler get activated and I finally get a performance improvement. However, this is obviously not a solution for me, and moreover other pods created after I deleted the initial one don't get used either.
Here are the YAML files for my mongodb deployment and service:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: "$my_mongodb_image_in_which_I_have_my_db"
          ports:
            - containerPort: 27017
          resources:
            requests:
              memory: "1800Mi"
              cpu: "3000m"
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  type: LoadBalancer
  loadBalancerIP: $IP_reserved_for_this_service
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 80
      targetPort: 27017
And I am accessing MongoDB through pymongo, in programs that run in another pod in the same GKE cluster:
from pymongo import MongoClient

def get_db(database: str):
    client = MongoClient(host="$IP_reserved_for_this_service",
                         port=80,
                         username="...",
                         password="...",
                         authSource="admin")
    return client.get_database(database)
This way of using and autoscaling mongodb might be weird and quite impractical but it's only a first model for me and I would like to make it work (or understand why it can't work).
Here are screenshots showing the behaviour of those pods:
state 1: only the initial pod is working
...but all ips appear in service endpoints
state 2: initial pod deleted, the others work now (except the new one created by the autoscaler after the deletion)
...and the endpoints are updated in the service (the update is in the "+ 1 more ...", I checked in the google console)
I feel that the problem might come either from the configuration of my mongodb-service or from the way k8s or gke deals with mongodb images (anyway since I'm new to k8s I might be completely wrong on that too).
Any help or comment will be appreciated, and if you need more information let me know.

There are sticky connections in Kubernetes; this is a common and well-known behaviour. Kubernetes doesn't balance packets, it balances connections. Once your app has established a connection to a service, all requests to that service travel over that connection to the same pod; even if a new connection is opened, Kubernetes doesn't guarantee it will go to a different pod.
One option to solve this is a headless service: https://kubernetes.io/docs/concepts/services-networking/service/#headless-services
Or a service mesh, but that's too much for your case ))
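For illustration, here is a minimal sketch of what a headless variant of the questioner's mongodb-service could look like, assuming the same app: mongodb labels; with clusterIP: None the Service gets no load-balanced virtual IP, and a DNS lookup of the service name returns the individual pod IPs:
apiVersion: v1
kind: Service
metadata:
  name: mongodb-headless    # hypothetical name, not from the question
spec:
  clusterIP: None           # headless: no virtual IP, DNS returns the pod IPs
  selector:
    app: mongodb            # same label as the existing deployment
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
A client inside the cluster could then resolve mongodb-headless.<namespace>.svc.cluster.local and decide for itself how to spread its connections across the returned pod IPs, instead of keeping one sticky connection through the load balancer.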

Redis master/slave replication on Kubernetes for ultra-low latency

A diagram is always better than long sentences, so here is what I would like to do:
To sum up:
I want to have a Redis master instance outside (or inside, this is not relevant here) my K8S cluster
I want to have a Redis slave instance per node replicating the master instance
I want that when removing a node, the Redis slave pod gets unregistered from master
I want that when adding a node, a Redis slave pod is added to the node and registered to the master
I want all pods in one node to consume only the data of the local Redis slave (easy part I think)
Why do I want such an architecture?
I want to take advantage of Redis master/slave replication to avoid dealing with cache invalidation myself
I want to have ultra-low latency calls to Redis cache, so having one slave per node is the best I can get (calling on local host network)
Is it possible to automate such deployments, using Helm for instance? Are there documentation resources for building such an architecture with clean dynamic master/slave binding/unbinding?
And most of all, is this architecture a good idea for what I want to do? Is there any alternative that could be as fast?
I remember we had a discussion on this topic previously here; no worries, adding more here.
Read more about the Redis Helm chart: https://github.com/bitnami/charts/tree/master/bitnami/redis#choose-between-redis-helm-chart-and-redis-cluster-helm-chart
You should also be asking how your application will connect to the pod on the same node without using the Redis service.
For that, you can use environment variables and expose them to the application pod.
Something like :
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
This gives you the IP of the node on which the pod is running; you can then use that IP to connect to the DaemonSet (the Redis slave you are running).
You can read more at: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
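To make the idea concrete, here is a minimal, hypothetical sketch of a Redis slave DaemonSet (one replica per node); the names, image/version, hostPort choice and the redis-master.example.com address are all assumptions, not something from the original answer:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: redis-slave          # hypothetical name
spec:
  selector:
    matchLabels:
      app: redis-slave
  template:
    metadata:
      labels:
        app: redis-slave
    spec:
      containers:
        - name: redis
          image: redis:6     # assumed image/version
          # replicate from the (hypothetical) master address
          args: ["redis-server", "--replicaof", "redis-master.example.com", "6379"]
          ports:
            - containerPort: 6379
              hostPort: 6379 # expose on the node IP so pods can reach it via HOST_IP
An application pod could then combine the HOST_IP variable from above with port 6379 to talk to the Redis replica on its own node.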
Is it possible to automate such deployments, using Helm for instance?
Yes, you can write down your own Helm chart and deploy the generated YAML manifest.
And most of all, is this architecture a good idea for what I want to
do? Is there any alternative that could be as fast?
It may look like a good idea, but in my view it could create cost issues and higher cluster resource usage.
What if you are running 200 nodes, each with its own Redis slave? That would consume resources on every node and add cost to your infra.
OR
if you are planning for a specific deployment:
Your suggestion above is also good, but if you plan to use Redis with only one specific deployment, you can also use the sidecar pattern and connect the Redis instances together via configuration.
apiVersion: v1
kind: Service
metadata:
  name: web
  labels:
    app: web
spec:
  ports:
    - port: 80
      name: redis
      targetPort: 5000
  selector:
    app: web
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: web
  replicas: 3
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: redis
          image: redis
          ports:
            - containerPort: 6379
              name: redis
              protocol: TCP
        - name: web-app
          image: web-app
          env:
            - name: "REDIS_HOST"
              value: "localhost"

kubernetes - container busy with request, then request should route to another container

I am very new to k8s and Docker, but I have a task on k8s. Now I'm stuck with a use case, which is:
If a container is busy with requests, then incoming requests should be redirected to another container.
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: twopoddeploy
  namespace: twopodns
spec:
  selector:
    matchLabels:
      app: twopod
  replicas: 1
  template:
    metadata:
      labels:
        app: twopod
    spec:
      containers:
        - name: secondcontainer
          image: "docker.io/tamilpugal/angmanualbuild:latest"
          env:
            - name: "PORT"
              value: "24244"
        - name: firstcontainer
          image: "docker.io/tamilpugal/angmanualbuild:latest"
          env:
            - name: "PORT"
              value: "24243"
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: twopodservice
spec:
  type: NodePort
  selector:
    app: twopod
  ports:
    - nodePort: 31024
      protocol: TCP
      port: 82
      targetPort: 24243
From deployment.yaml, I created a pod with two containers using the same image. The idea is that when firstcontainer is unreachable or busy, secondcontainer should handle the incoming requests. So (only to check our use case) I deleted firstcontainer using docker container rm -f id_of_firstcontainer. Now the site is not reachable until Docker recreates firstcontainer, but I need k8s to redirect the requests to secondcontainer instead of waiting for firstcontainer.
Then I googled for a solution and found Ingress and liveness/readiness probes. But Ingress routes requests based on the path rather than container status, and liveness/readiness probes only recreate the container. SO also has some questions about this, some using an nginx server, but no luck. That's why I created a new question.
So my question is: how do I configure the two containers to reduce the downtime?
What keywords should I google to find a solution and try it myself?
Thanks,
Pugal.
A service can load-balance between multiple pods. You should delete the second copy of the container inside the deployment spec, but also change it to have replicas: 2. Now your deployment will launch two identical pods, but the service will match both of them, and requests will go to both.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: onepoddeploy
  namespace: twopodns
spec:
  selector: { ... }
  replicas: 2 # not 1
  template:
    metadata: { ... }
    spec:
      containers:
        - name: firstcontainer
          image: "docker.io/tamilpugal/angmanualbuild:latest"
          env:
            - name: "PORT"
              value: "24243"
        # no secondcontainer
This means if two pods aren't enough to handle your load, you can kubectl scale deployment -n twopodns onepoddeploy --replicas=3 to increase the replica count. If you can tell from CPU utilization or another metric when you're getting to "not enough", you can configure the horizontal pod autoscaler to adjust the replica count for you.
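For reference, a minimal HorizontalPodAutoscaler sketch that scales this deployment on CPU utilization could look like the following; the 70% target and the 2-5 replica range are assumptions, not values from the question:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: onepoddeploy-hpa     # hypothetical name
  namespace: twopodns
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: onepoddeploy
  minReplicas: 2             # assumed lower bound
  maxReplicas: 5             # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed CPU target
Note that the pods need CPU requests set for utilization-based scaling to work.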
(You will usually want only one container per pod. There's no way to share traffic between containers in the way you're suggesting here, and it's helpful to be able to independently scale components. This is doubly true if you're looking at a stateful component like a database and a stateless component like an HTTP service. Typical uses for multiple containers are things like log forwarders and network proxies, that are somewhat secondary to the main operation of the pod, and can scale or be terminated along with the primary container.)
There are two important caveats to running multiple pod replicas behind a service. The service load balancer isn't especially clever, so if one of your replicas winds up working on intensive jobs and the other is more or less idle, they'll still each get about half the requests. Also, if your pods are configured for HTTP health checks (recommended), then if a pod is backed up to the point where it can't handle requests, it also won't be able to answer its health checks, and Kubernetes will kill it off.
You can help Kubernetes here by trying hard to answer all HTTP requests "promptly" (aiming for under 1000 ms always is probably a good target). This can mean returning a "not ready yet" response to a request that triggers a large amount of computation. This can also mean rearranging your main request handler so that an HTTP request thread isn't tied up waiting for some task to complete.
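As a concrete illustration of the health-check caveat, a readiness probe on the container could look roughly like this; the /healthz path and the timing values are assumptions about the app, not part of the original answer:
containers:
  - name: firstcontainer
    image: "docker.io/tamilpugal/angmanualbuild:latest"
    readinessProbe:
      httpGet:
        path: /healthz       # assumed health endpoint
        port: 24243
      periodSeconds: 5
      timeoutSeconds: 1      # the pod must answer within ~1s to stay "ready"
      failureThreshold: 3    # removed from the Service endpoints after 3 failures
While a pod is not ready it is taken out of the Service endpoints, so traffic flows only to the replicas that can still answer promptly.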

Issues when using AKS to manage containers

I am using Docker + AKS to manage my containers. When I run my containers locally or on a VM using docker-compose, my services (which are containerized) can communicate with my databases, which are also in containers; the bridge between these containers is created using networks. After I converted the docker-compose file for all of my applications to the respective YAML counterparts and deployed my containers to AKS (single node), my containerized services are not able to reach the database.
All my containers have 3 YAML files:
Pvc
deployment(for pods)
svc.
I've gone through many of the getting started with AKS examples and for some reason am not able to figure it out. All application services are exposed publicly using load balancers. My question is more like how do I define which db the application services should connect to now that the concept of networks doesn't exist anymore.
In the examples provided for KS all the the front end services do is create a env and specify the name of the backend service. I tried that as well and my application still doesn't work. Sample that I referred to validate my setup is https://learn.microsoft.com/en-gb/azure/aks/kubernetes-walkthrough#run-the-application.
Any help would be great.
If you only need these services internally, you should not expose them publicly using load balancers.
Kubernetes has two possibilities for service discovery: DNS and environment variables. While DNS is an optional component, I have not seen any cluster without it, and I assume AKS uses it too.
So, for example you have a Postgres database and want to use it somewhere else:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: db
          image: postgres:11
          ports:
            - name: postgres
              containerPort: 5432
This creates a deployment which exposes port 5432. The label app: postgres is also important here, since we need it later to identify the created pods.
Now we need to create a service for it:
apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  type: ClusterIP # default value
  selector:
    app: postgres
  ports:
    - port: 5432
This creates a virtual IP address and registers all ready pods with the label app: postgres to it. Since the name of the service is postgres and it is in the default namespace, postgres is now reachable via postgres.default.svc.cluster.local:5432. You can use this address and port in your other application (e.g. Python) to connect to the database.
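As a sketch, the consuming application's Deployment could be pointed at that service DNS name through environment variables; the my-app and DATABASE_HOST/DATABASE_PORT names here are hypothetical, not from the answer:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                     # hypothetical application name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:latest     # hypothetical image
          env:
            - name: DATABASE_HOST  # hypothetical variable read by the app
              value: "postgres.default.svc.cluster.local"
            - name: DATABASE_PORT
              value: "5432"
The app then reads these variables instead of a docker-compose network alias.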

Why should I specify service before deployment in a single Kubernetes configuration file?

I'm trying to understand why kubernetes docs recommend to specify service before deployment in one configuration file:
The resources will be created in the order they appear in the file. Therefore, it’s best to specify the service first, since that will ensure the scheduler can spread the pods associated with the service as they are created by the controller(s), such as Deployment.
Does this mean spreading pods between Kubernetes cluster nodes?
I tested with the following configuration, where a Deployment is located before a Service, and the pods are distributed between nodes without any issues.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: incorrect-order
  namespace: test
spec:
  selector:
    matchLabels:
      app: incorrect-order
  replicas: 2
  template:
    metadata:
      labels:
        app: incorrect-order
    spec:
      containers:
        - name: incorrect-order
          image: nginx
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: incorrect-order
  namespace: test
  labels:
    app: incorrect-order
spec:
  type: NodePort
  ports:
    - port: 80
  selector:
    app: incorrect-order
Another explanation is that some environment variables with the service URL will not be set for the pods in this case. However, it also works OK when the configuration is inside one file, like the example above.
Could you please explain why it is better to specify the service before the deployment in the case of a single configuration file? Or maybe it is an outdated recommendation.
If you use DNS as service discovery, the order of creation doesn't matter.
In the case of environment vars (the second way K8s offers service discovery), the order matters, because once those vars are passed to the starting pod, they cannot be modified later if the service definition changes.
So if your service is deployed before you start your pod, the service env vars are injected into the linked pod.
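For illustration, here is the questioner's example reordered so the Service comes first (Deployment spec abbreviated with the { ... } placeholders used earlier in this page); with this ordering, pods created by the Deployment start with env vars such as INCORRECT_ORDER_SERVICE_HOST and INCORRECT_ORDER_SERVICE_PORT already populated:
apiVersion: v1
kind: Service
metadata:
  name: incorrect-order
  namespace: test
spec:
  type: NodePort
  ports:
    - port: 80
  selector:
    app: incorrect-order
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: incorrect-order
  namespace: test
spec:
  # same Deployment spec as in the question, unchanged
  selector: { ... }
  replicas: 2
  template: { ... }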
If you create a Pod/Deployment resource with labels, this resource will be exposed through a service once the latter is created (with the proper selector to indicate what resource to expose).
You are correct in that it affects the spread among the worker nodes.
Deployments without a Service will simply be scheduled onto the nodes with the least cpu/memory allocation. For instance, a brand new and empty node will get all new pods from a new deployment.
With a Deployment that also has a service the Scheduler tries to spread the pods between nodes, disregarding the cpu/memory load (within limits), to help the Service survive better.
It puzzles me that a Deployment on its own doesn't cause an optimal spread, but it doesn't, not yet at least.
This is the answer from the official documentation:
The resources will be created in the order they appear in the file.
Therefore, it's best to specify the service first, since that will
ensure the scheduler can spread the pods associated with the service
as they are created by the controller(s), such as Deployment.
Kubernetes Documentation/Concepts/Cluster/Administration/Managing Resources

How to configure a Kubernetes Multi-Pod Deployment

I would like to deploy an application cluster by managing my deployment via k8s Deployment object. The documentation has me extremely confused. My basic layout has the following components that scale independently:
API server
UI server
Redis cache
Timer/Scheduled task server
Technically, all 4 above belong in separate pods that are scaled independently.
My questions are:
Do I need to create pod.yml files and then somehow reference them in deployment.yml file or can a deployment file also embed pod definitions?
K8s documentation seems to imply that the spec portion of a Deployment is equivalent to defining one pod. Is that correct? What if I want to declaratively describe multi-pod deployments? Do I need multiple deployment.yml files?
Pagid's answer has most of the basics. You should create 4 Deployments for your scenario. Each Deployment will create a ReplicaSet that schedules and supervises the collection of pods for the Deployment.
Each Deployment will most likely also require a Service in front of it for access. I usually create a single yaml file that has a Deployment and the corresponding Service in it. Here is an example for an nginx.yaml that I use:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
    - port: 80
      name: nginx
      targetPort: 80
      nodePort: 32756
  selector:
    app: nginx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginxdeployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginxcontainer
          image: nginx:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 80
Here is some additional information for clarification:
A POD is not a scalable unit. A Deployment that schedules PODs is.
A Deployment is meant to represent a single group of PODs fulfilling a single purpose together.
You can have many Deployments work together in the virtual network of the cluster.
For accessing a Deployment that may consist of many PODs running on different nodes you have to create a Service.
Deployments are meant to contain stateless services. If you need to store state you need to create a StatefulSet instead (e.g. for a database service).
You can use the Kubernetes API reference for the Deployment, and you'll find that the spec->template field is of type PodTemplateSpec, along with the related comment (Template describes the pods that will be created.); it answers your questions. A longer description can of course be found in the Deployment user guide.
To answer your questions...
1) The Pods are managed by the Deployment and defining them separately doesn't make sense as they are created on demand by the Deployment. Keep in mind that there might be more replicas of the same pod type.
2) For each of the applications in your list, you'd have to define one Deployment - which also makes sense when it comes to different replica counts and application rollouts.
3) You haven't asked that, but it's related - along with separate Deployments, each of your applications will also need a dedicated Service so the others can access it.
Additional information:
API server: use a Deployment
UI server: use a Deployment
Redis cache: use a StatefulSet
Timer/Scheduled task server: maybe use a StatefulSet (if your service keeps some state); see the sketch below.
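As a minimal sketch of what a StatefulSet for the Redis cache could look like; the names, image/version and storage size are assumptions, not from the answers above, and the serviceName refers to a headless Service assumed to exist:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis              # headless Service used for stable pod DNS names
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:6          # assumed image/version
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: data
              mountPath: /data    # Redis persists its dump/AOF files here
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi          # assumed size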