How to select a specific pod for a service in Kubernetes

I have a Kubernetes cluster of 3 hosts, where each host has a unique id label.
On this cluster, there is a software that has 3 instances (replicas).
Each replica needs to talk to all other replicas. In addition, there is a service that selects all pods so that the application is always reachable.
So I have:
Instance1 (with labels run: theTool, instanceid: 1)
Instance2 (with labels run: theTool, instanceid: 2)
Instance3 (with labels run: theTool, instanceid: 3)
and
Service1 (selecting pods with label instanceid=1)
Service2 (selecting pods with label instanceid=2)
Service3 (selecting pods with label instanceid=3)
Service (selecting pods with label run=theTool)
This approach works, but I cannot scale or use the rolling-update feature.
I would like to define a deployment with 3 replicas, where each replica gets a unique generic label (for instance a replica ID like 1/3, 2/3 and so on).
Within the services, I could use the selector to fetch this label which will exist even after an update.
Another solution might be to select the pod/deployment, depending on the host where it is running on. I could use a DaemonSet or just a pod/deployment with affinity to ensure that each host has an exact one replica of my deployment.
But I don't know how to select a pod based on a label of the host it runs on.
Using the hostname is not an option as hostnames will change in different environments.
I have searched the docs but didn't find anything matching this use case. Hopefully, someone here has an idea how to solve this.

The feature you're looking for is called StatefulSets, which just launched to beta with Kubernetes 1.5 (note that it was previously available in alpha under a different name, PetSets).
In a StatefulSet, each replica has a unique name that is persisted across restarts. In your example, these would be instance-1, instance-2, instance-3. Since the instance names are persisted (even if the pod is recreated on another node), you don't need a service-per-instance.
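As a rough sketch of what that could look like (the names the-tool and the nginx image are placeholders I've chosen, not something from the question; the exact apiVersion depends on your cluster version):

apiVersion: v1
kind: Service
metadata:
  name: the-tool             # headless service that gives each pod a stable DNS name
spec:
  clusterIP: None
  selector:
    run: theTool
  ports:
  - port: 80
---
apiVersion: apps/v1          # older clusters (around 1.5) used apps/v1beta1 for StatefulSets
kind: StatefulSet
metadata:
  name: the-tool
spec:
  serviceName: the-tool      # must match the headless service above
  replicas: 3
  selector:
    matchLabels:
      run: theTool
  template:
    metadata:
      labels:
        run: theTool
    spec:
      containers:
      - name: app
        image: nginx         # placeholder image
        ports:
        - containerPort: 80

Each replica is then reachable under a stable name such as the-tool-0.the-tool.default.svc.cluster.local, even if the pod is rescheduled onto another node.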
The documentation has more details:
Using StatefulSets
Scaling a StatefulSet
Deleting a StatefulSet
Debugging a StatefulSet

You can map NodeIP:NodePort to PodIP:PodPort. Your pod is running on some node (instance/VM).
Assign a label to your nodes:
http://kubernetes.io/docs/user-guide/node-selection/
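For example, labelling a node could look like this (the node name is a placeholder; the label key/value match the ones used in the deployment below):

kubectl label nodes <your-node-name> nodename=mysqlnode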
Write a service for your pod, for example
service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
  labels:
    label: mysql-service
spec:
  type: NodePort
  ports:
  - port: 3306        # port on which your service is running
    nodePort: 32001   # node port on which you can access it statically
    targetPort: 3306
    protocol: TCP
    name: http
  selector:
    name: mysql-selector   # bind pods carrying this label here
Add a nodeSelector (under the pod template's spec field) to your deployment.yaml
deployment.yaml:
spec:
  nodeSelector:
    nodename: mysqlnode   # labelkey=labelvalue assigned in the first step
With this you will be able to access your pod's service via NodeIP:NodePort. For example, if I labeled the node 10.11.20.177 with
nodename=mysqlnode
then I add the node selector
nodeSelector:
  nodename: mysqlnode
Since I specified a nodePort in the service, I can now reach the pod's service (which is running in the container) at
10.11.20.177:32001
The client must be on a network that can reach that node. For outside access, make port 32001 publicly accessible in your firewall configuration. The node port stays static; the label selector takes care of the dynamic pod IPs.

Related

Access a Kubernetes service with a DNS name internally, and with FQDN externally

Application A and application B are two applications running in the same Kubernetes cluster. Application A can access B by reading the B_HOST env variable (with value b.example.com) passed to A's container. Is there any way by which A would be able to access B:
internally: using the DNS name of B's service (b.default.svc.cluster.local)
externally: using the FQDN of B, that is also defined in the ingress resource (b.example.com)
at the same time?
For example,
If you try to curl b.example.com inside the pod/container of A, it should resolve to b.default.svc.cluster.local and get the result via that service.
If you try to curl b.example.com outside the k8s cluster, it should use ingress to reach the service B and get the results.
As a concept, adding an extra host entry (that maps B's FQDN to its service IP) to container A's /etc/hosts should work. But that doesn't seem to be good practice, as it requires getting the IP address of B's service in advance and then creating A's pod with that HostAliases config. Patching this field into an existing pod is not allowed. The service IP changes when you recreate the service, and adding the DNS name of the service instead of its IP in HostAliases is also not supported.
So, what would be a good method to achieve this?
Found a similar discussion in this thread.
Additional Info:
I'm using Azure Kubernetes service (AKS) and using application gateway as ingress controller (AGIC).
You can try different methods, then see which one works for you.
Method 1 :
Modify the CoreDNS configuration of your k8s cluster.
Reference: https://coredns.io/2017/05/08/custom-dns-entries-for-kubernetes/
In AKS, it can be done as described here:
https://learn.microsoft.com/en-us/azure/aks/coredns-custom#rewrite-dns
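As a hedged sketch, an AKS coredns-custom ConfigMap for this rewrite could look roughly like the following (the zone name and rewrite target are assumptions based on the example domains in the question; check the linked docs for the exact syntax your CoreDNS version supports):

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom        # AKS-specific ConfigMap for user-supplied CoreDNS config
  namespace: kube-system
data:
  example.server: |           # data key must end in .server
    example.com:53 {
      errors
      rewrite stop {
        name regex b\.example\.com b.default.svc.cluster.local
        answer name b\.default\.svc\.cluster\.local b.example.com
      }
      forward . /etc/resolv.conf
    }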
Method 2 :
Specifying an externalIP manually for service B and then adding the same IP to pod A's /etc/hosts file using hostAliases seems to work.
Part of pod definition of app A:
apiVersion: v1
kind: Pod
metadata:
  name: a
  labels:
    app: a
spec:
  hostAliases:
  - ip: "10.0.3.165"
    hostnames:
    - "b.example.com"
Part of service definition of app B:
apiVersion: v1
kind: Service
metadata:
  name: b
spec:
  selector:
    app: b
  externalIPs:
  - 10.0.3.165
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
But I'm not sure if that is a good practice; there could be pitfalls.
One is that the externalIP we are defining could be any valid IP address, private or public, as long as it does not conflict with other IPs of cluster resources. Unpredictable behaviour can result if overlapping IP ranges are used.
Method 3 :
The clusterIP of the service will be available inside pod A as an environment variable B_SERVICE_HOST by default.
So, instead of adding an externalIP, you can try to get the actual service IP (clusterIP) of B from the B_SERVICE_HOST env variable and add it to pod A's /etc/hosts - either using hostAliases or directly, whichever works.
echo $B_SERVICE_HOST 'b.example.com' >> /etc/hosts
You can do this using a postStart hook for the container in the pod definition:
containers:
- image: "myreg/myimagea:tag"
  name: container-a
  lifecycle:
    postStart:
      exec:
        command: ["/bin/sh", "-c", "echo $B_SERVICE_HOST 'b.example.com' >> /etc/hosts"]
Since this is a container lifecycle hook, the changes will be specific to that one container. So other containers in the same pod may not have the same changes applied to their hosts file.
Also note that B's service should be created before pod A, so that the B_SERVICE_HOST env variable is populated.
Method 4 :
You can try to create a public DNS zone and a private DNS zone in your cloud tenant, then add records in them that point to the service. For example, create a private DNS zone in Azure and then do either of the following:
Add an A record mapping b.example.com to service B's clusterIP (a CLI sketch follows this list).
Add a CNAME record mapping b.example.com to the internal load balancer DNS label provided by Azure for the service. From a wider perspective, if you have multiple applications in the cluster with the same requirement: create a static IP, create a LoadBalancer-type service for your ingress controller using this static IP as loadBalancerIP and with the annotation service.beta.kubernetes.io/azure-dns-label-name as described here. You'll get a DNS label for that service. Then add a CNAME record in your private zone mapping *.example.com to this Azure-provided DNS label. Still, I doubt whether this would be suitable if your ingress controller is Azure Application Gateway.
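For the A-record option, a hedged sketch using the Azure CLI (the resource group, zone name and clusterIP value are placeholders, not from the question):

# look up the clusterIP of service b
kubectl get svc b -o jsonpath='{.spec.clusterIP}'
# add an A record for b.example.com in the private DNS zone
az network private-dns record-set a add-record \
  --resource-group myResourceGroup \
  --zone-name example.com \
  --record-set-name b \
  --ipv4-address <clusterIP-from-above>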
NOTE:
Also consider how the method you adopt will affect your debugging process in the future if any networking-related issue arises.
If you feel that would be a problem, consider using two different environment variables, B_HOST and B_PUBLIC_HOST, separately for external and internal access.

2 different services for same DaemonSet K8s

I was wondering if there is a way to create a service for a pod on a specific node.
For example:
Let's say I have a cluster with 4 worker nodes [compute-0 ... compute-3].
Nodes "compute-0" and "compute-1" have a label "app=firstApplication"
Nodes "compute-2" and "compute-3" have a different label "app=secondApplication"
I have a single Daemonset running across all 4 nodes.
I want to create 2 services, one for each couple of nodes.
Is this possible somehow?
Thanks!
EDIT
The reason for what we are trying to do is that we have an OpenShift 4.6 cluster, and for security reasons the VXLAN port is blocked off between the 2 groups of nodes. When pods try to resolve DNS queries using the default DNS service (172.30.0.10), they sometimes reach the DNS pods on the blocked-off nodes.
No - this is not possible! Since Services reference their Pods by labels, and all Pods in a DaemonSet are labelled the same, you can't do that. Of course, you could label your Pods after creation, but since this would be lost after recreation of the DaemonSet, I would not go down that route.
You could split your DaemonSet into parts and use Node Selectors or Affinity to control the distribution of Pods over Nodes.
If you specify a .spec.template.spec.nodeSelector, then the DaemonSet controller will create Pods on nodes which match that node selector. Likewise if you specify a .spec.template.spec.affinity, then DaemonSet controller will create Pods on nodes which match that node affinity.
That way, each DaemonSet can have its own Service.
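A hedged sketch of that split (the DaemonSet names, image and port are placeholders; the node labels and pod labels reuse the ones from the question and from the services below):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-ds-first
spec:
  selector:
    matchLabels:
      app: firstApplication
  template:
    metadata:
      labels:
        app: firstApplication
    spec:
      nodeSelector:
        app: firstApplication     # only runs on compute-0 / compute-1
      containers:
      - name: app
        image: my-image:latest    # placeholder image
        ports:
        - containerPort: 8080
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-ds-second
spec:
  selector:
    matchLabels:
      app: secondApplication
  template:
    metadata:
      labels:
        app: secondApplication
    spec:
      nodeSelector:
        app: secondApplication    # only runs on compute-2 / compute-3
      containers:
      - name: app
        image: my-image:latest    # placeholder image
        ports:
        - containerPort: 8080

The two services shown further below can then select app: firstApplication and app: secondApplication respectively.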
Alternatively, you can patch the existing pods and add those labels to them after creation. You may need a small operator for this: its job is to list the pods, check whether the desired label exists, and if not, patch the pod's labels (just like kubectl patch, but done with a Kubernetes client such as kubeclient). Do some research on kubeclient; there is also an example sample-controller in Kubernetes. Here is the link:
ref
If there are some extra labels on the pods, just add them to the selector.
---
kind: Service
apiVersion: v1
metadata:
  name: first-svc
  labels:
    app: firstApplication
spec:
  selector:
    app: firstApplication
  ports:
  - name: http
    port: 8080
    targetPort: 8080
---
kind: Service
apiVersion: v1
metadata:
  name: second-svc
  labels:
    app: secondApplication
spec:
  selector:
    app: secondApplication
  ports:
  - name: http
    port: 8080
    targetPort: 8080
---

AKS Kubernetes questions

Can someone please explain how POD to POD works in AKS?
From the docs, I can see it uses the kube-proxy component to send traffic to the desired pod.
But I have been told that I must use a ClusterIP service and bind all the relevant pods together.
So what is the real flow? Or did I miss something? Below are a couple of questions to make this clearer.
Questions:
How can pods on the same node talk to each other? What is the flow?
How can pods on different nodes within a cluster talk to each other? What is the flow?
If possible, it would be highly appreciated if you could describe the flows for #1 and #2 for both kubenet and CNI deployments.
Thanks a lot!
For pod-to-pod communication we use services. So first we need to understand why we need a service: a service resolves a DNS name and gives us the exact IP we need to connect to a specific pod. Since you want pod-to-pod communication, you need to create a ClusterIP service.
ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType. With a ClusterIP service you can't access a pod from outside the cluster, which is why we use a ClusterIP service when we want pod-to-pod communication only.
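A minimal ClusterIP service could look like this (the service name, selector label and ports are placeholders; since ClusterIP is the default type, the type field can also be omitted):

apiVersion: v1
kind: Service
metadata:
  name: my-backend          # placeholder name
spec:
  type: ClusterIP           # default; can be omitted
  selector:
    app: my-backend         # placeholder pod label
  ports:
  - port: 80                # port the service exposes inside the cluster
    targetPort: 8080        # port the container listens on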
kube-proxy is the network proxy that runs on each node in your cluster.
It maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.
Every service is backed by iptables rules, and kube-proxy manages these iptables rules for every service. So yes, kube-proxy is the most vital piece of the network setup in our k8s cluster.
How the Kubernetes network model works:
all Pods can communicate with all other Pods without using network address translation (NAT).
all Nodes can communicate with all Pods without NAT.
the IP that a Pod sees itself as is the same IP that others see it as.
With those points, the model covers:
Container-to-Container networking
Pod-to-Pod networking
Pod-to-Service networking
Internet-to-Service networking
kube-proxy handles transmission of packets between pods, and also to the outside world. It acts like a network proxy and load balancer for pods running on the node, implementing load balancing using NAT in iptables.
The kube-proxy process stands between the Kubernetes network and the pods that are running on that particular node. It is responsible for ensuring that communication is maintained efficiently across all elements of the cluster. When a user creates a Kubernetes service object, the kube-proxy instance is responsible for translating that object into meaningful rules in the local iptables rule set on the worker node. iptables is used to translate the virtual IP assigned to the service object to all of the pod IPs mapped by the service.
I hope that clarifies the idea behind kube-proxy.
Let's see an example of how it works.
Here I used a headless service so that I can connect to a specific pod.
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: default
spec:
  clusterIP: None
  selector:
    app: my-test
  ports:
  - port: 80
    name: rest
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-sts
spec:
  serviceName: my-service
  replicas: 3
  selector:
    matchLabels:
      app: my-test
  template:
    metadata:
      labels:
        app: my-test
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
---
This will create 3 pods: my-sts-0, my-sts-1, my-sts-2. Now if we want to connect to the pod my-sts-0, we just use the DNS name my-sts-0.my-service.default.svc:80, and the service will resolve the DNS name and provide the exact pod IP of my-sts-0. So if you need to communicate from my-sts-1 to my-sts-0, you can just use this DNS name.
The template is pod_name.service_name.namespace.svc.cluster-domain.example, but you can skip the cluster-domain.example part; pod_name.service_name.namespace.svc works fine.
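A quick way to check this from inside the cluster, assuming the manifests above are applied (curl is not necessarily present in the nginx image, so this uses a throwaway busybox pod):

# resolve and fetch my-sts-0 through its per-pod DNS name
kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- \
  wget -qO- http://my-sts-0.my-service.default.svc:80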
ref

kubernetes Service select multi labels

I have two StatefulSets named my-sts and my-sts-a, and I want to create a single service that addresses same-indexed pods from the two different StatefulSets, like my-sts-0 and my-sts-a-0. But I found that the k8s docs say:
Labels selectors for both objects are defined in json or yaml files using maps, and only equality-based requirement selectors are supported
My idea is to create a label for the two sts pods like:
my-sts-0 has label abc:sts-0
my-sts-a-0 has label abc:sts-0
my-sts-1 has label abc:sts-1
my-sts-a-1 has label abc:sts-1
How can I get the index of those pods so that I can create a label like abc=sts-<index> to achieve this?
Is there any other way?
Kubernetes already gives you a DNS name to select individual StatefulSet pods. Say you have a Service my-sts that matches every pod in the StatefulSet, and the StatefulSet is set up with serviceName: my-sts; then you can access host names my-sts-0.my-sts.namespace.svc.cluster.local and so on.
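A hedged sketch of that governing Service (the port and selector label are placeholders; per-pod DNS names require the Service to be headless, i.e. clusterIP: None, and to be referenced by the StatefulSet's serviceName):

apiVersion: v1
kind: Service
metadata:
  name: my-sts
spec:
  clusterIP: None            # headless: gives each StatefulSet pod its own DNS record
  selector:
    app: my-sts              # placeholder label; must match the StatefulSet's pod labels
  ports:
  - port: 80                 # placeholder port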
If you specifically want a service to target a specific pod, there is also a statefulset.kubernetes.io/pod-name label that gets added automatically, so you can attach to that:
apiVersion: v1
kind: Service
metadata:
  name: my-sts-master
spec:
  selector:
    statefulset.kubernetes.io/pod-name: my-sts-0
  ports: [...]

How are minions chosen for given pod creation request

How does Kubernetes choose the minion among the many available for a given pod creation command? Is it something that can be controlled/tweaked?
If replicated pods are submitted for deployment, is Kubernetes intelligent enough to place them on different minions if they expose the same container/host port pair? Or does it always place different replicas on different minions?
What about corner cases, like two different pods (not necessarily replicas) that expose the same host/container port pair? Will they carefully be placed on different minions?
If a pod has specific compute/memory requirements, can it be placed on a minion/host that has sufficient resources left to meet those requirements?
In summary, is there detailed documentation on kubernetes pod placement strategy?
Pods are scheduled onto minions using the algorithm in generic_scheduler.go
There are rules that prevent host-port conflicts, and that also make sure a node has sufficient memory and CPU to meet the pod's requirements: predicates.go
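For the resource part, a short sketch of how a pod declares the compute/memory requirements the scheduler takes into account (the pod name, image and values are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo        # placeholder name
spec:
  containers:
  - name: app
    image: nginx             # placeholder image
    resources:
      requests:
        cpu: "250m"          # the scheduler only places the pod on a node with this much free CPU
        memory: "256Mi"      # and this much allocatable memory
      limits:
        cpu: "500m"
        memory: "512Mi"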
One way to choose the minion for pod creation is to use nodeSelector. Inside the pod's YAML file, specify the label of the minion on which you want the pod to run.
apiVersion: v1
kind: Pod
metadata:
  name: nginx1
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    key: value
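For the nodeSelector to match, the target minion has to carry that label, e.g. (the node name, key and value are placeholders):

kubectl label nodes <minion-name> key=value
kubectl get nodes --show-labels    # verify the label was applied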