Kubernetes service with clustered PODs in active/standby - service

Apologies for not keeping this short, as any such attempt would make me miss out on some important details of my problem.
I have a legacy Java application which works in an active/standby mode in a clustered environment to expose certain RESTful WebServices via a predefined port.
If there are two nodes in my app cluster, at any point in time only one would be in Active mode, and the other in Passive mode, and the requests are always served by the node with app running in Active mode. 'Active' and 'Passive' are just roles, the app as such would be running on both nodes. The Active and Passive instances communicate with each other through this same predetermined port.
Suppose I have a two-node cluster with one instance of my application running on each node; one of the instances would initially be active and the other passive. If the active node goes down for some reason, the app instance on the other node identifies this using a heartbeat mechanism, takes over control, and becomes the new active. When the old active comes back up, it detects that the other instance now owns the Active role, and hence goes into Passive mode.
The application manages to provide RESTful webservices on the same endpoint IP irrespective of which node is running the app in 'Active' mode, by using a cluster IP that piggy-backs on the active instance; the cluster IP switches over to whichever node is running the app in Active mode.
I am trying to containerize this app and run it in a Kubernetes cluster for scale and ease of deployment. I am able to containerize it and deploy it as a pod in a Kubernetes cluster.
In order to bring in the Active/Passive roles here, I am running two instances of this pod, each pinned to a separate K8S node using node affinity (each node is labeled either active or passive, and the pod definitions pin to these labels), and clustering them using my app's clustering mechanism, so that only one will be active and the other passive.
I am exposing the REST service externally using K8S Service semantics, via a NodePort on the master node.
Here's my yaml file content:
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
  labels:
    app: myapp-service
spec:
  type: NodePort
  ports:
  - port: 8443
    nodePort: 30403
  selector:
    app: myapp
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: active
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: myapp
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: nodetype
                operator: In
                values:
                - active
      volumes:
      - name: task-pv-storage
        persistentVolumeClaim:
          claimName: active-pv-claim
      containers:
      - name: active
        image: myapp:latest
        imagePullPolicy: Never
        securityContext:
          privileged: true
        ports:
        - containerPort: 8443
        volumeMounts:
        - mountPath: "/myapptmp"
          name: task-pv-storage
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: passive
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: myapp
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: nodetype
                operator: In
                values:
                - passive
      volumes:
      - name: task-pv-storage
        persistentVolumeClaim:
          claimName: active-pv-claim
      containers:
      - name: passive
        image: myapp:latest
        imagePullPolicy: Never
        securityContext:
          privileged: true
        ports:
        - containerPort: 8443
        volumeMounts:
        - mountPath: "/myapptmp"
          name: task-pv-storage
Everything seems to be working fine, except that since both pods expose the web service via the same port, the K8S Service routes incoming requests to one of these pods in a random fashion. Since my REST WebService endpoints work only on the Active node, requests via the K8S Service resource only work when they happen to be routed to the pod with the app in the Active role. If at any point the K8S Service routes an incoming request to the pod with the app in the Passive role, the service is inaccessible/not served.
How do I make this work in such a way that the K8S Service always routes the requests to the pod with the app in the Active role? Is this something doable in Kubernetes, or am I aiming for too much?
Thank you for your time!

You can use a readiness probe in conjunction with an election container. The election sidecar will always elect one master from the election pool, and if you make sure only that pod is marked as ready... only that pod will receive traffic.
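A minimal sketch of that pattern (assuming the archived leader-elector sidecar image from the Kubernetes contrib project, which serves the current leader's pod name as JSON on a local HTTP port; the election name and the probe one-liner are illustrative):
containers:
- name: myapp
  image: myapp:latest
  readinessProbe:
    exec:
      command:
      - sh
      - -c
      # Ready only while the elected leader's name matches this pod's name.
      - wget -qO- http://localhost:4040/ | grep -q "\"$HOSTNAME\""
    periodSeconds: 5
- name: elector
  image: k8s.gcr.io/leader-elector:0.5   # archived contrib image, shown as an assumption
  args: ["--election=myapp", "--http=localhost:4040"]
With this in place the Service needs no changes: the passive pod never becomes ready, so it never receives traffic.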

One way to achieve this is to add a label to the pods marking them as active or standby, and then select the active pod in your Service. This will send the traffic to the pod labeled as active.
https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#service-and-replicationcontroller
You can find another example in this document:
https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/
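A minimal sketch of that approach (the role label, its values, and the re-labeling step are illustrative; your failover tooling has to flip the label when roles switch):
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  type: NodePort
  ports:
  - port: 8443
    nodePort: 30403
  selector:
    app: myapp
    role: active   # illustrative label: only the active pod carries it
Then, on failover:
kubectl label pod <new-active-pod> role=active --overwrite
kubectl label pod <old-active-pod> role=standby --overwrite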

A bit late.
You can do it by deploying the same application as two different deployments with two services with different IPs (or ports), and then also deploying a load balancer with a custom configuration that sends traffic to one app and keeps the second as a backup. All traffic goes first to your load balancer.
You can also check whether any Kubernetes ingress controller supports a backup option.

Redis master/slave replication on Kubernetes for ultra-low latency

A diagram would show this better than more sentences. To sum up what I would like to do:
I want to have a Redis master instance outside (or inside, this is not relevant here) my K8S cluster
I want to have a Redis slave instance per node replicating the master instance
I want that when removing a node, the Redis slave pod gets unregistered from master
I want that when adding a node, a Redis slave pod is added to the node and registered to the master
I want all pods in one node to consume only the data of the local Redis slave (easy part I think)
Why do I want such an architecture?
I want to take advantage of Redis master/slave replication to avoid dealing with cache invalidation myself
I want to have ultra-low latency calls to Redis cache, so having one slave per node is the best I can get (calling on local host network)
Is it possible to automate such deployments, using Helm for instance? Are there documentation resources for building such an architecture with clean dynamic master/slave binding/unbinding?
And most of all, is this architecture a good idea for what I want to do? Is there any alternative that could be as fast?
I remember we had a discussion on this topic previously; no worries, adding more here.
Read more about the Redis helm chart : https://github.com/bitnami/charts/tree/master/bitnami/redis#choose-between-redis-helm-chart-and-redis-cluster-helm-chart
You should also be asking the question of how your application will connect to the pod on the same node without using a Redis Service.
For that, you can use environment variables and expose them to the application pod.
Something like:
env:
- name: HOST_IP
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP
It will give you the IP of the node on which the pod is running; you can then use that IP to connect to the DaemonSet (the Redis slave you are running).
You can read more at : https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
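For completeness, a minimal sketch of the node-local replica side (the master address is illustrative, and --replicaof assumes Redis 5+): a DaemonSet that publishes Redis on each node's IP via hostPort, so the app can dial $(HOST_IP):6379.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: redis-replica
spec:
  selector:
    matchLabels:
      app: redis-replica
  template:
    metadata:
      labels:
        app: redis-replica
    spec:
      containers:
      - name: redis
        image: redis
        # Replicate from the master; the address below is illustrative.
        args: ["redis-server", "--replicaof", "redis-master.example.com", "6379"]
        ports:
        - containerPort: 6379
          hostPort: 6379   # published on the node IP, reachable via HOST_IP above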
Is it possible to automate such deployments, using Helm for instance?
Yes, you can write your own Helm chart and deploy the generated YAML manifests.
And most of all, is this architecture a good idea for what I want to
do? Is there any alternative that could be as fast?
It may look like a good idea, but in my view it could create a cost issue and higher cluster resource usage.
What if you are running 200 nodes? You would be running a Redis slave on each of them, which consumes resources on every node and adds cost to your infra.
Alternatively, if you are planning to use Redis with only a specific deployment, your suggestion above is still good, but you could also use the sidecar pattern and connect the multiple Redis instances together using configuration.
apiVersion: v1
kind: Service
metadata:
  name: web
  labels:
    app: web
spec:
  ports:
  - port: 80
    name: web
    targetPort: 5000
  selector:
    app: web
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: web
  replicas: 3
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: redis
        image: redis
        ports:
        - containerPort: 6379
          name: redis
          protocol: TCP
      - name: web-app
        image: web-app
        env:
        - name: "REDIS_HOST"
          value: "localhost"

kubernetes - container busy with request, then request should route to another container

I am very new to k8s and docker, but I have a task on k8s. Now I'm stuck on a use case, which is:
If a container is busy with requests, then incoming requests should be redirected to another container.
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: twopoddeploy
  namespace: twopodns
spec:
  selector:
    matchLabels:
      app: twopod
  replicas: 1
  template:
    metadata:
      labels:
        app: twopod
    spec:
      containers:
      - name: secondcontainer
        image: "docker.io/tamilpugal/angmanualbuild:latest"
        env:
        - name: "PORT"
          value: "24244"
      - name: firstcontainer
        image: "docker.io/tamilpugal/angmanualbuild:latest"
        env:
        - name: "PORT"
          value: "24243"
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: twopodservice
spec:
  type: NodePort
  selector:
    app: twopod
  ports:
  - nodePort: 31024
    protocol: TCP
    port: 82
    targetPort: 24243
From deployment.yaml, I created a pod with two containers using the same image. When firstcontainer is not reachable or busy, secondcontainer should handle the incoming requests. This is our idea and use case. So (only to check our use case) I deleted firstcontainer using docker container rm -f id_of_firstcontainer. Now the site is not reachable until Docker recreates firstcontainer. But I need k8s to redirect the requests to secondcontainer instead of waiting for firstcontainer.
Then I googled for a solution and found Ingress and liveness/readiness probes. But Ingress routes requests based on the path rather than container status, and liveness/readiness probes recreate the container. SO also has some related questions, some using an nginx server, but no luck. That's why I created a new question.
So my question is: how do I configure the two containers to reduce the downtime?
What keywords should I google to find a solution to try myself?
Thanks,
Pugal.
A service can load-balance between multiple pods. You should delete the second copy of the container inside the deployment spec, but also change it to have replicas: 2. Now your deployment will launch two identical pods, but the service will match both of them, and requests will go to both.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: onepoddeploy
  namespace: twopodns
spec:
  selector: { ... }
  replicas: 2 # not 1
  template:
    metadata: { ... }
    spec:
      containers:
      - name: firstcontainer
        image: "docker.io/tamilpugal/angmanualbuild:latest"
        env:
        - name: "PORT"
          value: "24243"
      # no secondcontainer
This means if two pods aren't enough to handle your load, you can kubectl scale deployment -n twopodns onepoddeploy --replicas=3 to increase the replica count. If you can tell from CPU utilization or another metric when you're getting to "not enough", you can configure the horizontal pod autoscaler to adjust the replica count for you.
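A minimal sketch of such an autoscaler (the CPU target is an illustrative value, and it assumes the container declares a CPU request so utilization can be computed):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: onepoddeploy
  namespace: twopodns
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: onepoddeploy
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # illustrative target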
(You will usually want only one container per pod. There's no way to share traffic between containers in the way you're suggesting here, and it's helpful to be able to independently scale components. This is doubly true if you're looking at a stateful component like a database and a stateless component like an HTTP service. Typical uses for multiple containers are things like log forwarders and network proxies, that are somewhat secondary to the main operation of the pod, and can scale or be terminated along with the primary container.)
There are two important caveats to running multiple pod replicas behind a service. The service load balancer isn't especially clever, so if one of your replicas winds up working on intensive jobs and the other is more or less idle, they'll still each get about half the requests. Also, if your pods are configured for HTTP health checks (recommended), then if a pod is backed up to the point where it can't handle requests, it will also not be able to answer its health checks, and Kubernetes will kill it off.
You can help Kubernetes here by trying hard to answer all HTTP requests "promptly" (aiming for under 1000 ms always is probably a good target). This can mean returning a "not ready yet" response to a request that triggers a large amount of computation. This can also mean rearranging your main request handler so that an HTTP request thread isn't tied up waiting for some task to complete.
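For reference, a minimal readiness-probe sketch for the container above (the /healthz path is illustrative; your app has to actually serve such an endpoint):
readinessProbe:
  httpGet:
    path: /healthz   # illustrative; the app must serve this
    port: 24243
  periodSeconds: 10
  timeoutSeconds: 1
A pod that stops answering in time is removed from the service endpoints until it recovers, without being restarted.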

How to Route to specific pod through Kubernetes Service (like a Gateway API)

I am running Kubernetes on "Docker Desktop" in Windows.
I have a LoadBalancer Service for a deployment which has 3 replicas.
I would like to access a SPECIFIC pod through some means (such as via a URL path: <serviceIP>:8090/pod1).
Is there any way to achieve this usecase?
deployment.yaml :
apiVersion: v1
kind: Service
metadata:
  name: my-service1
  labels:
    app: stream
spec:
  ports:
  - port: 8090
    targetPort: 8090
    name: port8090
  selector:
    app: stream
  # clusterIP: None
  type: LoadBalancer
---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: stream-deployment
  labels:
    app: stream
spec:
  replicas: 3
  selector:
    matchLabels:
      app: stream
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: stream
    spec:
      containers:
      - image: stream-server-mock:latest
        name: stream-server-mock
        imagePullPolicy: Never
        env:
        - name: STREAMER_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: STREAMER_ADDRESS
          value: stream-server-mock:8090
        ports:
        - containerPort: 8090
My end goal is to achieve horizontal auto-scaling of pods.
How the application is designed and works as of now (without Kubernetes):
There are 3 components: REST-Server, Stream-Server (3 instances running locally on different JVMs on different ports), and RabbitMQ.
1 - The client sends a request to "REST-Server" for a stream url.
2 - The REST-Server puts in the RabbitMQ queue.
3 - One of the Stream-Server picks it up and populates its IP and sends back to REST-Server through RabbitMQ.
4 - The client receives the IP and establishes a direct WS connection using the IP.
The problem I face is:
1 - When the client requests a stream IP, one of the pods (let's say POD1) picks it up and sends back its URL (which is the service URL, since it comes through the LoadBalancer Service).
2 - The next time the client tries to connect (WebSocket connection) using the service IP, it won't necessarily be the same pod that accepted the request.
It should be the same pod that accepted the request, and it must be accessible by the client.
You can use StatefulSets if you are not required to use a Deployment.
With 3 replicas, you will have 3 pods named
stream-deployment-0
stream-deployment-1
stream-deployment-2
You can access each pod as $(podname).$(service name).$(namespace).svc.cluster.local
For details, check this
You may also want to set up an ingress to reach each pod from outside the cluster.
As mentioned by aerokite, you can use StatefulSets. However, if you don't want to modify your deployments, you can simply use Headless Services. As specified in the documentation:
For headless Services, a cluster IP is not allocated.
For headless Services that define selectors, the endpoints controller
creates Endpoints records in the API, and modifies the DNS
configuration to return records (addresses) that point directly to the
Pods backing the Service.
This means that whenever you query the DNS name for your Service (i.e. my-svc.my-namespace.svc.cluster-domain.example), what you get is a list of all the Pod IPs (unlike regular services where you get the cluster IP). You can then select your Pods using your own mechanisms.
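A minimal sketch of such a headless Service for the deployment in the question (clusterIP: None is what makes it headless):
apiVersion: v1
kind: Service
metadata:
  name: stream-headless
spec:
  clusterIP: None   # headless: DNS returns the pod IPs directly
  selector:
    app: stream
  ports:
  - port: 8090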
Regarding your new question, if that is your only issue, you can use session affinity. If you set service.spec.sessionAffinity to ClientIP, then connections from a particular client will always go to the same Pod each time. You don't need other modifications like the headless Services mentioned above.
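A sketch of that setting on the Service from the question (the timeout shown is the default, included for illustration):
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800   # default: 3 hours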
IMO, the only way to achieve this will be:
Instead of using a deployment with 3 replicas, use 3 deployments with 1 replica each (or just create pods only); deployment1 -> pod1, deployment2 -> pod2, deployment3 -> pod3
Expose each deployment through its own service: service1 -> deployment1, service2 -> deployment2, service3 -> deployment3
Create an ingress resource and route to each pod using the service for each deployment. For example:
ingress-url/service1
ingress-url/service2
ingress-url/service3

Is there a way to do load balancing between pods in multiple nodes?

I have a kubernetes cluster deployed with RKE which is composed of 3 nodes on 3 different servers, and on each server there is 1 pod running yatsukino/healthereum, which is a personal modification of ethereum/client-go:stable.
The problem is that I don't understand how to add an external IP to send requests to these pods.
My pods could be in 3 states:
they are syncing the ethereum blockchain
they restarted because of a sync problem
they are synced and everything is fine
I don't want my load balancer to transfer requests to pods in the first two states; only in the third state should my pod be considered up to date.
I've been searching in the kubernetes docs, but (maybe because of a misunderstanding) I only find load balancing for pods inside a single node.
Here is my deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: goerli
  name: goerli-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: goerli
  template:
    metadata:
      labels:
        app: goerli
    spec:
      containers:
      - image: yatsukino/healthereum
        name: goerli-geth
        args: ["--goerli", "--datadir", "/app", "--ipcpath", "/root/.ethereum/geth.ipc"]
        env:
        - name: LASTBLOCK
          value: "0"
        - name: FAILCOUNTER
          value: "0"
        ports:
        - containerPort: 30303
          name: geth
        - containerPort: 8545
          name: console
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - /app/health.sh
          initialDelaySeconds: 20
          periodSeconds: 60
        volumeMounts:
        - name: app
          mountPath: /app
      initContainers:
      - name: healthcheck
        image: ethereum/client-go:stable
        command: ["/bin/sh", "-c", "wget -O /app/health.sh http://my-bash-script && chmod 544 /app/health.sh"]
        volumeMounts:
        - name: app
          mountPath: "/app"
      restartPolicy: Always
      volumes:
      - name: app
        hostPath:
          path: /app/
The answers above explain the concepts, but regarding your question about services and external IPs: you must declare a Service, for example:
apiVersion: v1
kind: Service
metadata:
  name: goerli
spec:
  selector:
    app: goerli
  ports:
  - port: 8545
  type: LoadBalancer
The type: LoadBalancer will assign an external address in a public cloud, or if you use something like MetalLB. Check your address with kubectl get svc goerli. If the external address stays "pending", you have a problem...
If this is your own setup, you can use externalIPs to assign your own external IP:
apiVersion: v1
kind: Service
metadata:
  name: goerli
spec:
  selector:
    app: goerli
  ports:
  - port: 8545
  externalIPs:
  - 222.0.0.30
The externalIPs can be used from outside the cluster, but you must route traffic to any node yourself, for example:
ip route add 222.0.0.30/32 \
nexthop via 192.168.0.1 \
nexthop via 192.168.0.2 \
nexthop via 192.168.0.3
Assuming your k8s nodes have IPs 192.168.0.x, this will set up ECMP routes to your nodes. When you make a request from outside the cluster to 222.0.0.30:8545, k8s will load-balance between your ready pods.
For load balancing and exposing your pods, you can use https://kubernetes.io/docs/concepts/services-networking/service/
and for checking when a pod is ready, you can tweak your liveness and readiness probes as explained in https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
For probes you might want to consider exec actions, like executing a script that checks what is required and returns 0 or 1 depending on the status.
When a container is started, Kubernetes can be configured to wait for a configurable amount of time to pass before performing the first readiness check. After that, it invokes the probe periodically and acts based on the result of the readiness probe. If a pod reports that it's not ready, it's removed from the service. If the pod then becomes ready again, it's re-added.
Unlike liveness probes, if a container fails the readiness check, it won't be killed or restarted. This is an important distinction between liveness and readiness probes. Liveness probes keep pods healthy by killing off unhealthy containers and replacing them with new, healthy ones, whereas readiness probes make sure that only pods that are ready to serve requests receive them. This is mostly necessary during container startup, but it's also useful after the container has been running for a while.
I think you can use a readiness probe for your goal.
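A minimal sketch alongside the existing liveness probe (the ready.sh script is illustrative; it should exit 0 only when the node is fully synced):
readinessProbe:
  exec:
    command:
    - /bin/sh
    - /app/ready.sh   # illustrative: exit 0 only when synced
  initialDelaySeconds: 20
  periodSeconds: 30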

Each of the individual kubernetes containers to be made accessible from Internet - is it possible?

I'm considering kubernetes as a platform for my application. I will launch multiple StatefulSets, each containing up to, say, 32 containers. The kubernetes cluster will contain a few nodes, and each node will be assigned e.g. 32+ external IP addresses.
My application requires that clients running somewhere on the internet be able to reach each individual server instance via a static IP address and port, for client-based load balancing and failover. Servers can come up and die from time to time, but a server's address should be stable while it is running.
To summarise in simple words I would like to be able to access my containers from Internet like this:
StatefulSet 1:
container 1: node1.domain.com:1000
container 2: node2.domain.com:1000
StatefulSet 2:
container 1: node1.domain.com:1001
container 2: node2.domain.com:1001
StatefulSet 3:
container 1: node2.domain.com:1002
container 2: node3.domain.com:1002
Is this something that is possible to achieve with kubernetes? If so, could you provide a hint how and reference to relevant kubernetes documentation?
Any reason you're tied to a StatefulSet? Sounds more like a DaemonSet to me. If you want to stick with StatefulSet, just use the container/host port parameters in your container spec.
For example, run the apps overflow-foo, overflow-bar, and overflow-baz, each on their own ports, on every node matching your selector criteria in the cluster.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: overflow-app
  labels:
    app: overflow-app-agent
    version: v1
spec:
  template:
    metadata:
      labels:
        name: overflow-app
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      containers:
      - image: overflow-foo:latest
        name: overflow-foo
        command: [ "bash", "-c", "run.sh" ]
        ports:
        - containerPort: 1000
          hostPort: 1000
      - image: overflow-bar:latest
        name: overflow-bar
        command: [ "bash", "-c", "run.sh" ]
        ports:
        - containerPort: 1001
          hostPort: 1001
      - image: overflow-baz:latest
        name: overflow-baz
        command: [ "bash", "-c", "run.sh" ]
        ports:
        - containerPort: 1002
          hostPort: 1002
It sounds like you want to use Services for exposing your StatefulSets. You would define a single Service per StatefulSet and expose it to the outside world with a NodePort or LoadBalancer. A NodePort is reachable on every node in the cluster, and a LoadBalancer would be a single point of entry that also balances the load across the different pods of your StatefulSet. For more information you can read the official docs for Services, especially the sections for NodePort and LoadBalancer.
One additional note - The NodePort uses a port range 30000-32767 by default, but you can change it with the cluster parameter service-node-port-range. See docs.
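A minimal sketch of one such Service per StatefulSet (names and ports are illustrative; the nodePort must fall inside the configured range):
apiVersion: v1
kind: Service
metadata:
  name: statefulset-1
spec:
  type: NodePort
  selector:
    app: statefulset-1
  ports:
  - port: 1000
    targetPort: 1000
    nodePort: 31000   # must be within service-node-port-range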