AWS EKS Communication Between Node Groups

I have an EKS cluster with a single node group (3 nodes) that is currently only running Jenkins.
I want to start utilising this EKS cluster for other things but want to separate out deployments into specific node groups.
For example, I want to create a 'monitoring' node group to which I will deploy prometheus and grafana. I also want another larger node group for application deployments.
I know I can create a second node group in EKS and label it with 'monitoring' so I can use nodeSelector to deploy to the correct node group.
My question is around whether I need to consider networking between the node groups, for example so that Prometheus can scrape exporters running on pods in the other node groups.
Is that something that requires some sort of ingress rule, or is it not required? If it is required, what is the correct way to implement it?

As long as the nodes are in the same cluster, belong to the same control plane, and no custom network policy prevents the node groups from reaching each other, you should be able to rely on ClusterIPs.
My concern is more about the reason why you would prefer dedicated node groups for separating tasks. Is that because of specific requirements? As long as you have available resources in your cluster, I would leverage the existing nodes and deploy Kubernetes resources (Deployments/Services/etc.) in dedicated namespaces, which is the kind of separation that looks most appropriate in your case. Then, when you need more horsepower, you can scale your cluster horizontally, even with different hardware, using specific labels and node affinity (instead of nodeSelector, for finer-grained control).
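To make this concrete, here is a minimal sketch, assuming the monitoring node group's nodes carry a nodegroup=monitoring label and that a monitoring namespace exists (both are assumptions, not anything from your cluster): Prometheus is pinned to those nodes with a nodeSelector, and it can still scrape exporters in the other node groups through their ClusterIP Services or pod IPs, with no extra ingress rule, provided no NetworkPolicy blocks the traffic.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring            # assumed namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      nodeSelector:
        nodegroup: monitoring      # assumed label on the monitoring node group
      containers:
        - name: prometheus
          image: prom/prometheus:v2.45.0   # assumed tag
          ports:
            - containerPort: 9090
          # Scrape targets in other node groups are reached via ClusterIP
          # Services (e.g. my-exporter.apps.svc.cluster.local:9100) or pod IPs;
          # that traffic crosses node groups without any extra ingress rule.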
Hope I helped.

Related

Why is there no concept of a node pool in Kubernetes?

I can see that GKE, AKS and EKS all have node pool concepts built in, but Kubernetes itself doesn't provide that support. What could be the reason behind this?
We usually need different node types for different requirements, such as the ones below:
Some pods require CPU- or memory-optimized nodes.
Some pods are processing ML/AI algorithms and need GPU-enabled nodes. These GPU-enabled nodes should be used only by certain pods, as they are expensive.
Some pods/jobs want to leverage spot/preemptible nodes to reduce the cost.
Is there any specific reason behind Kubernetes not having such support built in?
Node Pools are cloud-provider specific technologies/groupings.
Kubernetes is intended to be deployed on various infrastructures, including on-prem/bare metal. Node Pools would not mean anything in this case.
Node Pools generally are a way to provide Kubernetes with a group of identically configured nodes to use in the cluster.
You would specify the node you want using node selectors and/or taints/tolerations.
So you could taint nodes with a GPU and then require pods to have the matching toleration in order to schedule onto those nodes. Node Pools wouldn't make a difference here. You could join a physical server to the cluster and taint that node in exactly the same way -- Kubernetes would not see that any differently to a Google, Amazon or Azure-based node that was also registered to the cluster, other than some different annotations on the node.
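As a rough sketch of that pattern (the gpu=true taint key, the accelerator label and the image are illustrative assumptions, not a fixed convention): once the GPU nodes are tainted, only pods carrying the matching toleration can be scheduled onto them.

# Taint applied to the GPU nodes beforehand, e.g.:
#   kubectl taint nodes gpu-node-1 gpu=true:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: training-job               # illustrative name
spec:
  tolerations:
    - key: "gpu"                   # matches the taint above
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  nodeSelector:
    accelerator: nvidia            # assumed node label, to actively target GPU nodes
  containers:
    - name: trainer
      image: my-gpu-workload:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1        # requires the NVIDIA device plugin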
As Blender Fox mentioned, node groups are more of a cloud-provider-specific grouping/targeting option.
In AWS these are node groups (managed or self-managed, backed by Auto Scaling groups), while GKE calls them node pools.
You configure the Cluster Autoscaler and it scales the node count in the node pool or node group up and down.
If you are running Kubernetes on-prem there may not be a node pool option, as a node group is mostly a group of VMs in the cloud, whereas on-prem bare metal machines also work as worker nodes.
To scale up and down there is the Cluster Autoscaler (CA adds or removes nodes from the cluster by creating/deleting VMs), which uses the cloud provider's node group API; on bare metal it may not work out of the box.
Each provider has its own implementation and logic, which is selected on the Kubernetes side by the --cloud-provider flag (code link).
So if you are on an on-prem private cloud, you can write your own cloud client and interface.
Node groups are not strictly necessary; they are more of a cloud-provider-side implementation.
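For example, on AWS the Cluster Autoscaler is pointed at the Auto Scaling groups behind the node groups; a minimal sketch of the relevant part of its Deployment (the ASG names, min/max counts and image tag are assumptions):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.2   # assumed tag
          command:
            - ./cluster-autoscaler
            - --cloud-provider=aws            # selects the provider-specific implementation
            - --nodes=1:10:my-app-node-asg    # min:max:ASG name (assumed)
            - --nodes=1:3:my-monitoring-asg   # one --nodes flag per node group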
For the scenario:
Some pods require CPU- or memory-optimized nodes.
Some pods are processing ML/AI algorithms and need GPU-enabled nodes. These GPU-enabled nodes should be used only by certain pods as they are expensive.
Some pods/jobs want to leverage spot/preemptible nodes to reduce the cost.
You can use taints/tolerations, affinity, or node selectors as needed to schedule the pods onto the specific type of nodes.
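For instance, for the spot/preemptible case, a hedged sketch (the capacityType label shown is the one EKS managed node groups apply to spot nodes; other providers use different labels, and the pod name/image are assumptions): node affinity steers the job towards spot nodes but still allows on-demand nodes as a fallback.

apiVersion: v1
kind: Pod
metadata:
  name: batch-worker               # illustrative name
spec:
  affinity:
    nodeAffinity:
      # Prefer spot nodes, but fall back to on-demand nodes if none are available.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: eks.amazonaws.com/capacityType   # provider-specific label
                operator: In
                values:
                  - SPOT
  containers:
    - name: worker
      image: busybox:1.36
      command: ["sh", "-c", "echo processing batch work; sleep 3600"]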

In Kubernetes / AKS, how do you direct services to specific node pools' individual nodes? Is there a way for the deployments to choose, or is it automatic?

I am new to Kubernetes. One thing I am not sure about is that, when creating all of these deployments, I am starting to max out the CPU/memory usage on my current node pool.
This article states that nodes of the same configuration are grouped together as a "node pool":
In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools.
System node pools serve the primary purpose of hosting critical system pods such as CoreDNS and tunnelfront. User node pools serve the primary purpose of hosting your application pods.
Also, you can create "taints" (lol what), tags and labels, which only seem to "label", per se, the node pool, not an individual node.
When creating a node pool, you can add taints, labels, or tags to that node pool. When you add a taint, label, or tag, all nodes within that node pool also get that taint, label, or tag.
So with all of that said, it doesn't seem like the control is inside of a node pool's node. So how does it work for nodes in a node pool when deploying workloads and services?
Do I need to worry about managing that, or is that automatically managed and pods are created across the plane of nodes in a node pool? I'm not really seeing the documentation for this.
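For example, from what I understand the pool-level labels in the quote above end up on every node, so a deployment can opt into a pool with a nodeSelector (the userpool1 name below is just an assumption for whatever the pool was called at creation):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api                     # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      nodeSelector:
        agentpool: userpool1       # AKS adds this label to every node in the pool
      containers:
        - name: api
          image: nginx:1.25        # placeholder image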
One more thing, "Vertical Pod Autoscaling"
This seems like a good option, but the documentation doesn't really explain what is going on in terms of the nodes in a node pool, except for this one statement at the end.
This article showed you how to automatically scale resource utilization, such as CPU and memory, of cluster nodes to match application requirements.
My question about the vertical autoscaler (which is really vertical + horizontal, IMHO, but I understand the reference/verbiage) is: what happens if you aren't using this? Do you have to manage each deployment on your own? How do deployments distribute over the individual node pool plane?

Can two kubernetes clusters share the same external etcd and work like master slave

We have a requirement to set up a geo-redundant cluster. I am looking at sharing an external etcd cluster to run two Kubernetes clusters. It may sound absurd at first, but the requirements have come down to it. I am seeking some direction on whether it is possible, and if not, what the challenges are.
Yes, it is possible: you can have a single etcd cluster with multiple Kubernetes clusters attached to it. The key to achieving it is the --etcd-prefix flag of the Kubernetes API server. This way each cluster uses a different root path for storing its resources and avoids conflicts with the second cluster in etcd. In addition, you should also set up appropriate RBAC rules and certificates for each Kubernetes cluster. You can find more detailed information in the following article: Multi-tenant external etcd for Kubernetes clusters.
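As a rough sketch of the key part (the prefix values, etcd endpoints, certificate paths and image tag are assumptions): each cluster's kube-apiserver points at the shared etcd but with its own prefix.

# Cluster A's kube-apiserver static pod (cluster B would use e.g. --etcd-prefix=/cluster-b/registry)
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
    - name: kube-apiserver
      image: registry.k8s.io/kube-apiserver:v1.28.4   # assumed version
      command:
        - kube-apiserver
        - --etcd-servers=https://etcd-0.example.internal:2379,https://etcd-1.example.internal:2379
        - --etcd-prefix=/cluster-a/registry
        - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
        - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
        - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
        # ...remaining apiserver flags unchanged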
EDIT: Oh wait, I just noticed that you want those two clusters to behave as master-slave. In that case you could achieve it by assigning the slave cluster a read-only role in etcd and changing it to read-write when it has to become master. Theoretically it should work, but I have never tried it, and I think the best option is to use the built-in Kubernetes mechanisms for high availability, like leader election.

Set replicas on different nodes

I am developing an application for dealing with Kubernetes runtime microservices. I actually did some cool things, like moving a microservice from one node to another. The problem is that all replicas move together.
So, imagine that a microservice has two replicas and it is running in a namespace on a cluster with two nodes.
I want to set one replica in each node. Is that possible? Even in a yaml file, is that possible?
I am trying to write my own scheduler to do that, but I have had no success so far.
Thank you all
I think what you are looking for is inter-pod anti-affinity for your ReplicaSet. From the documentation:
Inter-pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled based on labels on pods that are already running on the node rather than based on labels on nodes.
Here is the documentation: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#inter-pod-affinity-and-anti-affinity-beta-feature
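A minimal sketch of that approach (the app label and image are assumptions): with a required pod anti-affinity on the hostname topology key, the two replicas cannot land on the same node.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-microservice            # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-microservice
  template:
    metadata:
      labels:
        app: my-microservice
    spec:
      affinity:
        podAntiAffinity:
          # Refuse to schedule a replica onto a node that already runs one.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: my-microservice
              topologyKey: kubernetes.io/hostname
      containers:
        - name: app
          image: nginx:1.25        # placeholder image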
I can't find where it's documented, but I recently read somewhere that replicas will be distributed across nodes when you create the kubernetes service BEFORE the deployment / replicaset.

Kubernetes - How is high availability ensured if I deploy a containerised app?

I am new to the Kubernetes environment. While deploying an application, I could figure out how to do auto scaling, but I did not quite understand how high availability is ensured. If it's not, how can I configure it?
Edit : By HA, I mean how to ensure that pod is scheduled across multiple nodes to ensure HA on pod/service level.
Please guide. Thanks in advance! :)
By HA, I mean how to ensure that pod is scheduled across multiple nodes to ensure HA on pod/service level.
I'm guessing your app is cloud compatible and can be scaled. In this situation there are multiple features you can take advantage of:
DaemonSets: containers in a DaemonSet run on every single node, unless you include/exclude certain nodes (see the sketch after this list).
Deployments: Deployments are the next generation of Replication Controllers. Using Deployments you can easily scale your application as well as ensure the availability of a certain number of pods. Please note that in order to stay available on node failure, you need to set affinity rules (for example pod anti-affinity, so replicas are spread across nodes) on the pods; you set this in the pod template. Since 1.6, affinity can be specified as a field in the PodSpec, rather than using annotations.
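To illustrate the DaemonSet item above, a minimal sketch (the name and image are placeholders): one copy of this pod runs on every node, so losing a node only loses that node's copy.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent                 # illustrative name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      # Restrict to a subset of nodes with nodeSelector/tolerations if needed.
      containers:
        - name: agent
          image: busybox:1.36      # placeholder image
          command: ["sh", "-c", "while true; do echo agent running; sleep 60; done"]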