What is the purpose of headless service in Kubernetes? - kubernetes

I am naive in Kubernetes world. I was going through a interesting concept called headless service.
I have read it, understand it, and I can create headless service. But I am still not convinced about use cases. Like why do we need it. There are already three types of service clusterIP, NodePort and loadbalancer service with their separate use cases.
Could you please tell me what is exactly which headless service solve and all those other three services could not solve it.
I have read it that headless is mainly used with the application which is stateful like dB based pod for example cassandra, MongoDB etc. But my question is why?

A headless service doesn't provide any sort of proxy or load balancing -- it simply provides a mechanism by which clients can look up the ip address of pods. This means that when they connect to your service, they're connecting directly to the pods; there's no intervening proxy.
Consider a situation in which you have a service that matches three pods; e.g., I have this Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: example
name: example
spec:
replicas: 3
selector:
matchLabels:
app: example
template:
metadata:
labels:
app: example
spec:
containers:
- image: docker.io/traefik/whoami:latest
name: whoami
ports:
- containerPort: 80
name: http
If I'm using a typical ClusterIP type service, like this:
apiVersion: v1
kind: Service
metadata:
labels:
app: example
name: example
spec:
ports:
- name: http
port: 80
targetPort: http
selector:
app: example
Then when I look up the service in a client pod, I will see the ip address of the service proxy:
/ # host example
example.default.svc.cluster.local has address 10.96.114.63
However, when using a headless service, like this:
apiVersion: v1
kind: Service
metadata:
labels:
app: example
name: example-headless
spec:
clusterIP: None
ports:
- name: http
port: 80
targetPort: http
selector:
app: example
I will instead see the addresses of the pods:
/ # host example-headless
example-headless.default.svc.cluster.local has address 10.244.0.25
example-headless.default.svc.cluster.local has address 10.244.0.24
example-headless.default.svc.cluster.local has address 10.244.0.23
By removing the proxy from the equation, clients are aware of the actual pod ips, which may be important for some applications. This also simplifies the path between clients and the service, which may have performance benefits.

Kubernetes Services of type ClusterIP, NodePort and LoadBalancer have one thing in common: They loadbalance between all pods that match the service's selector, so you can talk to all pods via a virtual ip.
That's nice because you can have multiple pods of the same application and all requests will be spread between those, avoiding overloading one pod while others are still idle.
For some applications you might require to talk to the pods directly instead of over an virtual ip. Still you'll want a stable hostname that points to the same pod, even if the pod's ip changes, e.g. because it needs to be rescheduled on a different host.
Use cases are mostly databases where applications should always connect to the same instance for data and session consistency.

A headless service does not provide a virtual IP covering all the endpoints in its endpoint slice. If you query that service DNS, you will get all the endpoint IP addresses back. A regular service on the other hand will have sort of a virtual IP, so clients connecting to it are not aware of the underlying endpoints.
To answer your question, why this is important for a stateful. Usually, members of a statefulset need to be aware of each other. Let's say you want to run a distributed database in something like a cluster formation. The members of this cluster need to have a way to discover each other. This is where the headless service comes into play. Because the individual database pods can get the address of the other database pods by using the headless service.
The headless service also provides a stable network identify for each member of the statefulset. You can read about that here. This is, again, useful as the members of the statefulset need to coordinate with each other in some way. Let's say the pod-0 will always take the initial leader role, and every other member knows it should report itself to pod-0 to form a cluster like configuration.

Related

Restrict access to service to only some pods

I have a mosquitto broker running on a pod, this server is exposed as a service as both DNS and IP address.
But this service is accessible by any pod in the cluster.
I want to restrict access to this service such that pods trying to connect to this DNS or IP address should only be able to if the pods have certain name/metadata.
One solution I guess will be to use namespaces? What other solution is there?
The UseCase you are describing is exactly what NetworkPolicies are here for.
Basically you define selector for pods which the network traffic should be restricted (i.e. your mosquito broker) and what specifica pods need to have in order to be allowed to reach it. For example a label "broker-access: true" or whatever seems to be suitable for you.
an example network policy could look like this:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: broker-policy
namespace: default
spec:
podSelector:
matchLabels:
role: message-broker
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
broker-access: true
ports:
- protocol: TCP
port: 6379
this network policy would be applied to every pod with label role=message-broker.
and it would restrict all incoming traffic except for traffic from pods with label broker-acces=true on port 6379.
Hope this helps and gives you a bit of a skaffold for your NetworkPolicy

AKS Kubernetes questions

Can someone please explain how POD to POD works in AKS?
from the docs, I can see it uses kube proxy component to send the traffic to the desired POD.
But I have been told that I must use clusterIP service and bind all the relevant POD's together.
So what is real flow? Or I missed something. below a couple of questions to be more clear.
Questions:
how POD to POD inside one node can talk to each other? what is the flow?
how POD to POD inside a cluster (different nodes) can talk to each other? what is the flow?
if it's possible it will be highly appreciated if you can describe the flows for #1 and #2 in the deployment of kubenet and CNI.
Thanks a lot!
for pod to pod communication we use services. so first we need to understand,
why we need service: what actually do service for us that, they resolve the dns name and give us the the exact ip that we need to connect a specific pod. now as you want to communicate with pod to pod you need to create a ClusterIP service.
ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType. with ClusterIP service you can't access a pod from outside the cluster for this reason we use clusterip service if we want the communication between pod to pod only.
kube-proxy is the network proxy that runs on each node in your cluster.
it maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.
every service maintain iptables.And kube-proxy handled these ip tables for every service. so yes, kube-proxy is the most vital point for network setup in our k8s cluster.
how the network policy works in kubernetes:
all Pods can communicate with all other Pods without using network address translation (NAT).
all Nodes can communicate with all Pods without NAT.
the IP that a Pod sees itself as is the same IP that others see it as.
with those point:
Container-to-Container networking
Pod-to-Pod networking
Pod-to-Service networking
Internet-to-Service networking
It handles transmission of packets between pod to pods, and also with the outside world. It acts like a network proxy and load balancer for pods running on the node by implementing load-balancing using NAT in iptables.
The kube-proxy process stands in between the Kubernetes network and the pods that are running on that particular node. It is responsible for ensuring that communication is maintained efficiently across all elements of the cluster. When a user creates a Kubernetes service object, the kube-proxy instance is responsible to translate that object into meaningful rules in the local iptables rule set on the worker node. iptables is used to translate the virtual IP assigned to the service object to all of the pod IPs mapped by the service.
i hope it's clear your idea about kube proxy.
lets see a example how it's works.
here i used headless service so that i can connect a specific pod.
---
apiVersion: v1
kind: Service
metadata:
name: my-service
namespace: default
spec:
clusterIP: None
selector:
app: my-test
ports:
- port: 80
name: rest
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: my-sts
spec:
serviceName: my-service
replicas: 3
selector:
matchLabels:
app: my-test
template:
metadata:
labels:
app: my-test
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
name: web
---
this will create 3 pods. as : my-sts-0, my-sts-1, my-sts-2. now if we want to connect to the pod my-sts-0 just use this dns name my-sts-0.my-service.default.svc:80 . and the service will resolve the dns name and will provide the exact podip of my-sts-0. now if you need to comminucate from my-sts-1 to my-sts-0, you can just use this dns name.
The template is like my_pod_name.my_Service_Name.my_Namespace.svc.cluster-domain.example , but you can skip the cluster-domain.example part. Only Service_Name.Namespace.svc will work fine.
ref

Session Affinity Settings for multiple Pods exposed by a single service

I have a setup Metallb as LB with Nginx Ingress installed on K8S cluster.
I have read about session affinity and its significance but so far I do not have a clear picture.
How can I create a single service exposing multiple pods of the same application?
After creating the single service entry point, how to map the specific client IP to Pod abstracted by the service?
Is there any blog explaining this concept in terms of how the mapping between Client IP and POD is done in kubernetes?
But I do not see Client's IP in the YAML. Then, How is this service going to map the traffic to respective clients to its endpoints? this is the question I have.
kind: Service
apiVersion: v1
metadata:
name: my-service
spec:
selector:
app: my-app
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10000
Main concept of Session Affinity is to redirect traffic from one client always to specific node. Please keep in mind that session affinity is a best-effort method and there are scenarios where it will fail due to pod restarts or network errors.
There are two main types of Session Affinity:
1) Based on Client IP
This option works well for scenario where there is only one client per IP. In this method you don't need Ingress/Proxy between K8s services and client.
Client IP should be static, because each time when client will change IP he will be redirected to another pod.
To enable the session affinity in kubernetes, we can add the following to the service definition.
service.spec.sessionAffinity: ClientIP
Because community provided proper manifest to use this method I will not duplicate.
2) Based on Cookies
It works when there are multiple clients from the same IP, because it´s stored at web browser level. This method require Ingress object. Steps to apply this method with more detailed information can be found here under Session affinity based on Cookie section.
Create NGINX controller deployment
Create NGINX service
Create Ingress
Redirect your public DNS name to the NGINX service public/external IP.
About mapping ClientIP and POD, according to Documentation
kube-proxy is responsible for SessionAffinity. One of Kube-Proxy job
is writing to IPtables, more details here so thats how it is
mapped.
Articles which might help with understanding Session Affinity:
https://sookocheff.com/post/kubernetes/building-stateful-services/
https://medium.com/#diegomrtnzg/redirect-your-users-to-the-same-pod-by-using-session-affinity-on-kubernetes-baebf6a1733b
follow the service reference for session affinity
kind: Service
apiVersion: v1
metadata:
name: my-service
spec:
selector:
app: my-app
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10000

Kubernetes: How to update namespace without changing external IP address?

How to update namespace without changing External-IP of the Service?
Initially, it was without namespace and it was deployed:
kind: Service
metadata:
name: my-app
labels:
app: my-app
spec:
type: LoadBalancer
ports:
port: 80
protocol: TCP
selector:
app: my-app
It creates External-IP address and it was pointed to the DNS. Now I would like to update the Namespace to keep it more organised. So I have created the namespace.
apiVersion: v1
kind: Namespace
metadata:
name: my-namespace
and I have updated the service with a namespace.
kind: Service
metadata:
name: my-app
namespace: my-namespace
labels:
app: my-app
spec:
type: LoadBalancer
ports:
port: 80
protocol: TCP
selector:
app: my-app
The above file creates another Service in my-namespace and the External-IP is not the same. Is there a way to update the namespace without recreating?
Please let me know if you need any information. Thanks!
some cloud providers allow you to specify external ip of a service with https://kubernetes.io/docs/concepts/services-networking/service/#external-ips if you can ake use of this, you should be able to achieve what you need. This will not be a zero downtime operation though as you'll first need to delete the current service and recreate it under different namespace with externalIP specified.
It seems that what you are after is a static public IP address, that is, one that you have reserved and outlives your cluster. If you had one of these, you would be able to specify it for your LoadBalancer either as #Radek mentions above, or possibly with a provider-specific annotation. Then you would be able to move the IP address between LoadBalancers (same applies for Ingress too).
However it seems you have not yet allocated a static public IP yet. It may be a good time to do this, but as Azure doesn’t appear to allow you to “promote” a dynamic IP to a static IP it won’t directly help you here (*).
So you’re left with creating a new LoadBalancer resource with a new public IP. To aid transition and avoid downtime, you could use an external DNS entry to switch users over from 1st to 2nd LoadBalancer IP addresses, which can be done seamlessless and without downtime if you set things up correctly. However does take a bit of time to transition: only once the DNS TTL period is done, is it safe to delete the first LoadBalancer.
If you don’t have an external DNS, this illustrates why it’s a good idea to set it up.
(*) GCP does allow you to do this, I doubt AWS does.
The best way to address this issue is to introduce nginx ingress controller. Let Externel LB route the calls to ingress controller. just update the ingress rules with correct service mapping.
the advantage is that you can expose any number of apps/service from single external IP via ingress. Without ingress you will have to setup one external LB for each application/service that you want to expose to outside world

Is it possible to route traffic to a specific Pod?

Say I am running my app in GKE, and this is a multi-tenant application.
I create multiple Pods that hosts my application.
Now I want:
Customers 1-1000 to use Pod1
Customers 1001-2000 to use Pod2
etc.
If I have a gcloud global IP that points to my cluster, is it possible to route a request based on the incoming ipaddress/domain to the correct Pod that contains the customers data?
You can guarantee session affinity with services, but not as you are describing. So, your customers 1-1000 won't use pod-1, but they will use all the pods (as a service makes a simple load balancing), but each customer, when gets back to hit your service, will be redirected to the same pod.
Note: always within time specified in (default 10800):
service.spec.sessionAffinityConfig.clientIP.timeoutSeconds
This would be the yaml file of the service:
kind: Service
apiVersion: v1
metadata:
name: my-service
spec:
selector:
app: my-app
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
sessionAffinity: ClientIP
If you want to specify time, as well, this is what needs to be added:
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10
Note that the example above would work hitting ClusterIP type service directly (which is quite uncommon) or with Loadbalancer type service, but won't with an Ingress behind NodePort type service. This is because with an Ingress, the requests come from many, randomly chosen source IP addresses.
Not with Pods by themselves, but you should be able to with Services.
Pods are intended to be stateless and indistinguishable from one another.
But you should be able to create a Deployment per customer group, and a Service per Deployment. The Ingress nginx should be able to be told to map incoming requests by whatever attributes are relevant to specific customer group Services.