Kubernetes nodes in different datacenters cause slowness - kubernetes

We have 2 datacenters (A and B) in 2 different regions (Eastern and Central). I've set up a single K8S cluster where the master is in datacenter A and there are a few nodes in both datacenters. Our application consists of 3 images (AppServer, DB, ReportingServer). Consider these 2 scenarios when the application is deployed:
1 - All 3 pods are created on nodes belonging to datacenter A. Everything works fine.
2 - The DB pod or AppServer pod is created on a node belonging to datacenter B and the other 2 pods are created in datacenter A. In this case the application is very slow: it takes 10-15 mins to reach the running state (instead of 2-3 mins), the login page loads very slowly, and logging in to the application usually throws an error due to a timeout.
My question: Is it normal for K8S to behave like this if nodes are in different datacenters, or is my setup wrong?
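For reference, the Kubernetes scheduler does not take inter-datacenter latency into account when placing pods, so chatty AppServer-to-DB traffic can end up crossing regions on every call, which would be one plausible explanation for the slow startup and login timeouts. One way to keep all three pods in a single datacenter is to label the nodes per datacenter and add a node affinity rule to each deployment. A minimal sketch, assuming a hypothetical zone label value dc-a and a deployment named appserver (both illustrative, not taken from the post):

# Label the nodes first, e.g.:
#   kubectl label node <node-in-dc-a> topology.kubernetes.io/zone=dc-a
apiVersion: apps/v1
kind: Deployment
metadata:
  name: appserver        # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: appserver
  template:
    metadata:
      labels:
        app: appserver
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values: ["dc-a"]   # keep this pod in datacenter A
      containers:
        - name: appserver
          image: appserver:latest      # illustrative image

The same affinity block (or a podAffinity rule against the DB pod) would go on the DB and ReportingServer deployments so the three pods are always scheduled together.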

Related

First 10 long-running transactions

I have a fairly small cluster of 6 nodes: 3 client nodes and 3 server nodes. The important configurations:
storeKeepBinary = true,
cacheMode = Partitioned (some caches, about 5-8 out of 25, are TRANSACTIONAL)
AtomicityMode = Atomic
backups = 1
readFromBackups = false
no persistence
When I run the app for a load/performance test on-prem on 2 large boxes (3 clients on one box and 3 servers on the other, all in Docker containers), I get decent performance.
However, when I move them over to AWS and run them in EKS, the only change I make is switching the cluster discovery from standard TCP (the default) to Kubernetes-based discovery, and I run the same test.
Now the performance is very bad and I keep getting:
WARN [sys-#145%test%] - [org.apache.ignite] First 10 long-running transactions [total=3]
Here the transactions run for more than a minute.
In other cases I get:
WARN [sys-#196%test-2%] - [org.apache.ignite] First 10 long-running cache futures [total=1]
Here the associated future has been running for > 3 min.
Most of the places a Google search has taken me point to a flaky/inconsistent network as the cause.
The app and the test seem to be OK, since on-prem this works just fine and the performance is decent as well.
I wanted to check whether others have faced this, or whether something else needs to be done when running on Kubernetes in the public cloud. For example, somewhere I read that nodes need to be pinned to the host in a cloud/virtual environment, but that it's not mandatory.
TIA
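For what it's worth, the answers that Google search turns up mostly do point at the network, and the long-running transaction/future warnings are consistent with delayed communication between Ignite nodes (though they can also come from lock contention). On EKS it is worth checking whether the server pods share worker nodes or sit in different availability zones. A sketch of the "pin to a host" idea mentioned above, using a required pod anti-affinity rule so that no two Ignite server pods land on the same node (names and image tag are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ignite-server          # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ignite-server
  template:
    metadata:
      labels:
        app: ignite-server
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: ignite-server
              topologyKey: kubernetes.io/hostname   # at most one server pod per node
      containers:
        - name: ignite
          image: apacheignite/ignite:2.14.0         # illustrative image/tag

Combining this with a node affinity rule that keeps the client and server pods in a single availability zone is the closest Kubernetes equivalent to "pinning nodes to the host".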

Internal k8s service communication not balanced

I’m running a k8s cluster on aws-eks.
I have two services A and B.
Service A listens to a RabbitMQ queue and sends an HTTP request to service B (which takes a couple of seconds).
Both services scale based on the number of messages in the queue.
The problem is that the requests from A to B are not balanced.
When scaled to about 100 pods each, I see that service A pods send requests to only about 60% of service B pods at a given time.
Meaning, eventually all pods get messages, but some pods are at 100 CPU receiving 5 messages at a time, while others are at 2 CPU receiving 1 message every minute or so.
That obviously causes poor performance and timeouts.
I've read that it should work round-robin, but when I set 10 fixed replicas of each service (all pods already up and running) and pushed 10 messages to the queue, I saw that all service A pods pulled a message to send to service B, yet some service B pods never got any requests while others got more than one - resulting in one whole process finishing within 4 seconds while another took about 12 seconds.
Any ideas why it works like that and how to make it more balanced?
Thanks
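For context, a regular ClusterIP Service load-balances per TCP connection: kube-proxy picks a backend pod when the connection is opened, so if service A reuses HTTP keep-alive connections, each A pod stays pinned to the B pods it first connected to, which matches the skew described above. A common workaround is to make B's Service headless so that A resolves the individual pod IPs from DNS and spreads requests itself (or to put an in-cluster load balancer or service mesh in front of B). A minimal sketch with illustrative names and port:

apiVersion: v1
kind: Service
metadata:
  name: service-b
spec:
  clusterIP: None          # headless: DNS returns the ready pod IPs directly
  selector:
    app: service-b
  ports:
    - name: http
      port: 8080           # illustrative port
      targetPort: 8080

Disabling keep-alive (or capping connection reuse) in service A's HTTP client has a similar effect, at the cost of extra connection setup.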

How to achieve HA for a Kubernetes cluster across different server locations

We have 2 server locations:
A) location_1
B) location_2
There are 5 servers in total:
location_1 -> 2 servers
location_2 -> 3 servers
We installed K8s on these 5 servers. Of those, 3 are master nodes (which also act as worker nodes) and 2 are additional worker nodes. But we need to attain HA for the Kubernetes cluster.
The setup is as below:
location_1 has 2 master nodes (which also act as worker nodes)
location_2 has 1 master node (also a worker node) and 2 more worker nodes - 3 nodes in total
But there is a chance that the whole of location_1 or location_2 goes down. In that case, how can we achieve HA for the cluster as a whole, and also per individual location?
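One constraint worth noting up front: etcd needs a majority of the master nodes to stay writable, so with 3 masters split 2 + 1 the control plane survives losing location_2 (2 of 3 masters remain) but not losing location_1 (only 1 of 3 remains). No arrangement of masters across exactly two sites can survive the loss of either site; that normally requires a small third site (or an external etcd member) to hold the tie-breaking vote. For the workloads themselves, replicas can be spread across both locations so that applications keep running when either site is up. A minimal sketch, assuming the nodes carry a hypothetical zone label per location and an illustrative deployment name:

# e.g. kubectl label node <node> topology.kubernetes.io/zone=location_1
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app               # illustrative name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway   # still schedule if one site is down
          labelSelector:
            matchLabels:
              app: my-app
      containers:
        - name: my-app
          image: my-app:latest                # illustrative image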

How to prevent data inconsistency when one node loses network connectivity in Kubernetes

I have a cluster with a service (call it A1) whose data is on remote storage - CephFS in my case. The number of replicas for my service is 1. Assume I have 5 nodes in my cluster and service A1 resides on node 1. Something happens to node 1's network and it loses connectivity with the CephFS cluster and with my Kubernetes cluster (or Docker Swarm) as well. The cluster marks it as unreachable and starts a new instance of the service (call it A2) on node 2 to keep the replica count at 1. After, say, 15 minutes node 1's network is fixed, node 1 rejoins the cluster and still has service A1 running (assume it didn't crash while it had lost connectivity to the remote storage).
I worked with Docker Swarm and recently switched to Kubernetes. I see Kubernetes has a feature called StatefulSet, but when I read about it, it doesn't answer my question (or maybe I missed something when reading about it).
Question A: What does the cluster do? Does it keep A2 and shut down A1, or let A1 keep working and shut down A2? (Logically it should shut down A1.)
Question B (and my primary question as well!): Assume the cluster wants to shut down one of these services (for example A1). The service saves some state to storage when it shuts down. In that case A1 would save its stale state to disk, even though A2 had already saved a newer state before A1's network got fixed.
There must be some lock when we mount the volume to a container, so that while it is attached to one container another container can't write to it (i.e. A1 should fail when it tries to save its old state data to disk).
The way it works - using Docker Swarm terminology -
You have a service. A service is a description of an image you'd like to run, how many replicas, and so on. Assuming the service specifies that at least 1 replica should be running, it will create a task that schedules a container on a swarm node.
So the service is associated with 0 to many tasks, and each task has 0 containers (if it is still starting) or 1 container (if the task is running or stopped), which lives on a node.
So, when Swarm (the orchestrator) detects a node go offline, it principally sees that a number of tasks associated with a service have lost their containers, so the replication (in terms of running tasks) is no longer correct for the service, and it creates new tasks which in turn schedule new containers on the available nodes.
On the disconnected node, the swarm worker notices that it has lost its connection to the swarm managers, so it cleans up all the tasks it is holding onto, as it no longer has current information about them. In the process of cleaning up the tasks, the associated containers get stopped.
This is good because when the node finally reconnects there is no race condition where there are two tasks running. Only "A2" is running and "A1" has been shut down.
This is bad if you have a situation where nodes can lose connectivity to the managers frequently, but you need the services to keep running on those nodes regardless, as they will be shut down each time the workers detach.
The process on K8s is pretty much the same, just with different terminology.
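To map that onto the two questions (hedging a little, since the details depend on the storage driver and configuration): for Question A, a Kubernetes StatefulSet deliberately does not start a replacement pod while its node is merely unreachable - the old pod has to be confirmed gone (node object deleted or pod force-deleted) first - exactly to avoid A1 and A2 running at the same time. For Question B, the volume-level guard is the access mode: a ReadWriteOnce volume (for example a Ceph RBD image, as opposed to CephFS, which is typically ReadWriteMany) can only be attached to one node at a time, so A1 and A2 can never both be writing to it. A minimal sketch with illustrative names:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: a1
spec:
  serviceName: a1
  replicas: 1
  selector:
    matchLabels:
      app: a1
  template:
    metadata:
      labels:
        app: a1
    spec:
      containers:
        - name: a1
          image: my-service:latest         # illustrative image
          volumeMounts:
            - name: data
              mountPath: /var/lib/a1       # illustrative path
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]     # only one node may mount it read-write
        storageClassName: ceph-rbd         # illustrative storage class
        resources:
          requests:
            storage: 10Gi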

CrateDB 2-Node Setup

I'm trying to set up a 2-node CrateDB cluster. I have set the following configuration values on the 2 nodes:
gateway.recover_after_nodes: 1
gateway.expected_nodes: 2
However the check is failing, as per the documentation:
(E / 2) < R <= E, where R is the number of recovery nodes and E is the number of expected nodes.
I see that most of the available documentation describes a 3-node cluster; however, at this point I can only start a 2-node cluster as a failover setup.
The behaviour I'm expecting is that if one of the nodes goes down, the other node should be able to take over the traffic, and once the 2nd node comes back up it should sync up with the other node.
If anyone has been able to successfully bring up a 2-node CrateDB cluster, please share the configuration required.
Cheers
It doesn't make sense to run a two-node cluster with only 1 required recovery node, because this could easily end up in a split brain and put the cluster into a state from which it won't be able to recover; that's why you always need more than half of the number of expected nodes.
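Concretely, with the numbers from the question: E = 2, so the documented constraint (2 / 2) < R <= 2 leaves R = 2 as the only valid value, and the check should only pass with something like:

gateway.expected_nodes: 2
gateway.recover_after_nodes: 2

That means both nodes have to be up before state recovery starts, which is part of the trade-off of running two nodes instead of the recommended three.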