Issues setting up Cadence Workflow (Cadence server) in Kubernetes

I am having trouble setting up Cadence in a Kubernetes environment. Details:
CADENCE_SERVER_IMAGE_VERSION: 0.19.2
All Cadence components run within a single pod.
It seems the Ringpop configuration requires a headless service, but headless services don't work with Istio.
Everything works fine with a single pod, but as soon as I create 2 pods they start fighting over task lists and shards, and that is my problem.
Ringpop config: { name: RINGPOP_SEEDS, value: 'api-gtp-cadence.api-gtp-cadence.svc.cluster.local:7933,api-gtp-cadence.api-gtp-cadence.svc.cluster.local:7934,api-gtp-cadence.api-gtp-cadence.svc.cluster.local:7935,api-gtp-cadence.api-gtp-cadence.svc.cluster.local:7939' }

A headless service is required for Cadence in K8s.
This is because Cadence uses Ringpop to manage cluster membership. Ringpop uses each node's IP:port as both its identity and the address it communicates on. A headless service is the only way (AFAIK) to expose each pod's own PodIP to its peers instead of a single virtual service IP. That is why only a single-pod cluster works without one: in that case, there is only one member in the Ringpop ring.
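To make "headless" concrete, here is a minimal sketch of a headless Service for the pods in the question. The Service name, the `app` label and the port names are assumptions for illustration; the namespace and port numbers are taken from the RINGPOP_SEEDS value in the question. A DNS lookup of a headless Service returns the individual pod IPs rather than one cluster IP. This only illustrates the Service shape and is not a verified fix for the Istio issue.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-gtp-cadence-headless   # hypothetical name
  namespace: api-gtp-cadence
spec:
  clusterIP: None                  # this is what makes the Service headless
  selector:
    app: api-gtp-cadence           # assumed pod label, adjust to your Deployment
  ports:                           # port numbers as listed in RINGPOP_SEEDS
    - name: tcp-frontend
      port: 7933
    - name: tcp-history
      port: 7934
    - name: tcp-matching
      port: 7935
    - name: tcp-worker
      port: 7939
```

With a Service like this, RINGPOP_SEEDS would point at the headless Service's DNS name (here `api-gtp-cadence-headless.api-gtp-cadence.svc.cluster.local`) so that each seed lookup resolves to real pod addresses.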

Related

Running other non-cluster containers on k8s node

I have a k8s cluster that runs the main workload and has a lot of nodes.
I also have a node (I call it the special node) that runs some special containers and is NOT part of the cluster. The node has access to some resources that those special containers require.
I want to be able to manage the containers on the special node along with the cluster, and to make them accessible from inside the cluster. The idea is to add the node to the cluster as a worker node, taint it to prevent normal workloads from being scheduled on it, and add tolerations to the pods that run the special containers.
The idea looks fine, but there may be a problem. Some other containers, non-container daemons and services that are not managed by the cluster will also run on the special node (they belong to other activities that have to stay separate from the cluster). I'm not sure whether that will be a problem, but I have never seen non-cluster containers running alongside pod containers on a worker node, and I could not find a similar question on the web.
So please enlighten me: is it OK to have non-cluster containers and other daemon services on a worker node? Does it require any precautions, or am I just worrying too much?
Ahmad, from the description above I understand that you are deploying a Kubernetes cluster with kubeadm, minikube, or a similar tool. One of your servers has some special capability, such as a GPU. To place your special pods on it you can use a node selector, which I hope you are already doing.
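As a sketch of how the taint, toleration and node selector fit together, the Pod below tolerates a taint on the special node and is also pinned to it by label. The key/value `dedicated=special`, the pod name and the image are hypothetical; the two `kubectl` commands in the comments show how the node might be prepared.

```yaml
# Assumed node preparation (node name is a placeholder):
#   kubectl taint nodes <special-node> dedicated=special:NoSchedule
#   kubectl label nodes <special-node> dedicated=special
apiVersion: v1
kind: Pod
metadata:
  name: special-workload           # hypothetical name
spec:
  nodeSelector:
    dedicated: special             # schedule only onto the labelled node
  tolerations:
    - key: dedicated               # allow scheduling despite the taint
      operator: Equal
      value: special
      effect: NoSchedule
  containers:
    - name: app
      image: registry.example.com/special-app:1.0   # placeholder image
```

The taint keeps ordinary pods off the node, while the node selector keeps this pod from landing anywhere else; a toleration alone would only permit the node, not require it.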
As for running a separate container runtime on one of these nodes, there are two main points to consider.
First, this can be done. If the container runtime is not integrated with Kubernetes, it is simply one more piece of software running on your server. Say you used kubeadm on all the nodes and also want to run plain Docker containers: they will stay separate, provided you have designed a proper architecture and configured a separate, isolated virtual network for them.
Second, storage. You need to create separate storage volumes for Kubernetes and for the standalone container runtime, both for isolation and so that a failure or corruption in one does not affect the other.
If you maintain proper isolation from storage through to networking, you can run Kubernetes and a standalone container runtime side by side. However, this is not a recommended setup for production environments.

Kubernetes with hybrid containers on one VM?

I have played around a little with Docker and Kubernetes and need some advice: is it a good idea to have one Pod on a VM with all of the following deployed in multiple (hybrid) containers?
This is our POC plan:
Customers access a public API endpoint through an nginx reverse proxy, e.g. abc.xyz.com or def.xyz.com
List of containers that we need:
Identity server, connected to SQL Server
Our API server with Hangfire, connected to SQL Server
The API server that connects to the Redis server
Redis, which in turn has 3 load-balanced Hangfire agents (scalable in future)
Setup 1 or 2 VMs?
Combination of Windows and Linux Containers, is that advisable?
How many Pods per VM? How many containers per Pod?
Should we attach volumes for DB?
Thank you for your help
Cluster size depends on the Kubernetes platform you want to use. With managed solutions like GKE/EKS/AKS you don't need to create a master node, but you have less control over your cluster and may not be able to use the latest Kubernetes version.
It is safer to have at least 2 worker nodes (more is better). If a node fails, its pods will be rescheduled on another healthy node.
I'd say Linux containers are more lightweight and have less overhead, but it's up to you to decide what to use.
The number of pods per VM is determined at scheduling time by the kube-scheduler and depends on the pods' requested resources and the resources available on the cluster nodes.
All data inside the running containers of a Pod is lost after the pod is restarted or deleted. You can import/restore DB content during pod startup using Init Containers (or DB replication), or configure volumes to persist data across pod restarts.
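As an illustration of the volume option, here is a minimal sketch that mounts a PersistentVolumeClaim into a Redis pod. The names, the 1Gi size and the reliance on a default StorageClass are assumptions; `/data` is the working directory of the official `redis` image, and whether Redis actually writes there depends on its persistence settings.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data                 # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi                 # assumed size; relies on a default StorageClass
---
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  containers:
    - name: redis
      image: redis:7
      volumeMounts:
        - name: data
          mountPath: /data         # survives pod restarts via the claim below
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: redis-data
```

The same pattern applies to the SQL Server container: the claim outlives the pod, so a rescheduled pod picks up the existing data.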
You can easily decide which containers to put in the same Pod if you look at your application set from the perspective of scaling, updating and availability.
If you can benefit from scaling and updating parts of the application independently, and from having several replicas of some crucial parts, it's better to put them in separate Deployments. If the application parts must always run on the same node and it's fine to restart them all at once, you can put them in one Pod.

Specify scheduling order of a Kubernetes DaemonSet

I have Consul running in my cluster and each node runs a consul-agent as a DaemonSet. I also have other DaemonSets that interact with Consul and therefore require a consul-agent to be running in order to communicate with the Consul servers.
My problem is that if my DaemonSet is started before the consul-agent, the application errors out because it cannot connect to Consul, and is subsequently restarted.
I also notice the same problem with other DaemonSets, e.g. Weave, as it requires kube-proxy and kube-dns. If Weave is started first, it will constantly restart until the kube services are ready.
I know I could add retry logic to my application, but I was wondering if it was possible to specify the order in which DaemonSets are scheduled?
Kubernetes itself does not provide a way to specify dependencies between pods / deployments / services (e.g. "start pod A only if service B is available" or "start pod A after pod B").
The current approach (based on what I found while researching this) seems to be retry logic or an init container. To quote the docs on Init Containers:
"They run to completion before any app Containers start, whereas app Containers run in parallel, so Init Containers provide an easy way to block or delay the startup of app Containers until some set of preconditions are met."
This means you can either add retry logic to your application (which I would recommend, as it also helps in other situations such as a short service outage) or use an init container that polls a health endpoint via the Kubernetes service name until it gets a satisfactory response.
Retry logic is preferred over startup dependency ordering, since it handles both the initial bring-up case and recovery from post-start outages.
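For the init container option, a minimal sketch might look like the DaemonSet below. It assumes the consul-agent is reachable on the node's own IP at port 8500 (for example via hostNetwork or a hostPort), which the question does not state; the DaemonSet name, labels and application image are hypothetical. To poll through a Service instead, replace the URL with the Service's DNS name.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-consul-client           # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-consul-client
  template:
    metadata:
      labels:
        app: my-consul-client
    spec:
      initContainers:
        - name: wait-for-consul-agent
          image: busybox:1.36
          env:
            - name: HOST_IP        # IP of the node this pod runs on
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
          command:
            - sh
            - -c
            - |
              # Block until the local agent answers its HTTP API.
              until wget -q -O /dev/null "http://${HOST_IP}:8500/v1/status/leader"; do
                echo "waiting for consul-agent on ${HOST_IP}"
                sleep 2
              done
      containers:
        - name: app
          image: registry.example.com/my-consul-client:1.0   # placeholder image
```

This only delays the first start. If the agent goes away later, the application still needs its own retry logic, which is why retries remain the preferred approach.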

Requesting a service from a minion only forwards to the pod deployed locally on that minion

I am working on a POC, and I found some strange behavior after setting up my Kubernetes cluster.
I am working with a topology of one master and two minions.
When I ran 2 pods, one per minion, and exposed a service for them, requests to the service from the master returned nothing (no response from either pod), and requests from a minion were answered only by the pod deployed on that minion, never by the other one.
This can heavily depend on how your cluster is provisioned.
For starters, you need to validate how networking is set up and whether it works as Kubernetes expects. In short, if you launch two pods on separate nodes, they should get IPs from their nodes' dedicated per-node ranges, and traffic between them should be routable across nodes. You can use a small(ish) base image (alpine/debian/ubuntu etc.) running something like sleep 1d, exec into the pods interactively with bash, and simply ping one from the other. If that does not work, your network setup is broken.
Make sure you test between pods, not directly from the node's host OS. In some configurations the node is unable to access service IPs due to routing issues, while pod-to-pod traffic works fine (I have seen this in some flannel configurations).
Also, your networking is probably provided by an overlay network solution such as flannel, Weave or Calico, so check their respective logs for signs of problems.
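The pod-to-pod check described above could be set up with something like the following sketch. The node names `minion-1` and `minion-2` are placeholders for your real node names, and the image tag is just an example. Note that alpine ships `sh` rather than `bash`.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: net-test-a
spec:
  nodeName: minion-1               # placeholder: pin to the first node
  containers:
    - name: shell
      image: alpine:3.19
      command: ["sleep", "86400"]  # keep the pod alive for a day
---
apiVersion: v1
kind: Pod
metadata:
  name: net-test-b
spec:
  nodeName: minion-2               # placeholder: pin to the second node
  containers:
    - name: shell
      image: alpine:3.19
      command: ["sleep", "86400"]
# Then, from a machine with kubectl access:
#   kubectl get pods -o wide                                  # note each pod's IP
#   kubectl exec -it net-test-a -- ping -c 3 <IP of net-test-b>
```

If the ping between the two pods fails while each pod can reach addresses on its own node, the cross-node routing of the overlay network is the first thing to inspect.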

Are there any scripts to monitor k8s status?

If used on a production system, k8s-related services might go down at some point. Are there any scripts provided that can monitor and restart the services, or do I need to write my own scripts and add them to crontab?
I'm guessing you mean things like the scheduler, apiserver, etc. If so, they're already monitored by the kubelet running on that node. The kubelet itself is monitored by a babysitter (your init system, e.g. upstart, systemd, etc.). Depending on how you provisioned your cluster, the manifest files for those kube daemons might be under /etc/kubernetes/manifest; those will have health checks.
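To give an idea of what such a health check looks like, here is an illustrative fragment of a static pod manifest for the apiserver with a liveness probe. The image tag, timings and elided flags are assumptions; the manifest your provisioning tool generated will differ, so treat this as a sketch of the shape rather than a copy of any real file.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
    - name: kube-apiserver
      image: registry.k8s.io/kube-apiserver:v1.29.0   # example tag
      # command and flags omitted for brevity
      livenessProbe:               # kubelet restarts the container if this fails
        httpGet:
          host: 127.0.0.1
          path: /livez
          port: 6443
          scheme: HTTPS
        initialDelaySeconds: 10
        periodSeconds: 10
        failureThreshold: 8
```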
Yes. How about the dashboard (web UI) and kube-dns? We recently deployed a new cluster and kube-dns was not working, and we didn't realize it until a user reported it. I'm looking for an automated test/utility that can validate that all the required Kubernetes services are running properly after a new cluster deployment. I looked into Prometheus, which helps with continuous monitoring but may not help with validating a new cluster setup.
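One simple form such a post-deployment check could take is a one-off Job that exercises cluster DNS from inside a pod. This is only a sketch of the idea, not an existing utility: the Job name is made up, and it checks DNS only, not the dashboard or other add-ons.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: dns-smoke-test             # hypothetical name
spec:
  backoffLimit: 2                  # retry a couple of times before failing
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: nslookup
          image: busybox:1.28
          # Succeeds only if cluster DNS resolves the API server's Service.
          command: ["nslookup", "kubernetes.default"]
# After applying, a deployment script could gate on the result:
#   kubectl wait --for=condition=complete job/dns-smoke-test --timeout=60s
```

A non-zero exit from `kubectl wait` would then flag the broken kube-dns case before any user runs into it.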