Why headless services are not recommended in stateless applications? - kubernetes

I'm a Kubernetes beginner and I found that a headless service is generally recommended for stateful applications, so I'm wondering why it isn't also used for stateless ones. Wouldn't requests be faster if they went directly to the Pod instead of through the Service?

A regular Service provides you with a single, stable, IP address to access any of the replicas associated to the Service.
Given that the application is really stateless, it does not matter which replica you talk to, so "hiding" the replicas behind a single IP is fine.
This relieves you of the responsibility of load balancing: You can just use the very same IP every time and don't have to handle added replicas, removed replicas or repeated DNS lookups.
If you used a headless service for a stateless application, which is of course possible, you would have to handle all of that yourself, or risk that the replicas do not deliver scaling and/or high availability as intended.
In fact, the order of the records returned by a DNS lookup for a headless service is shuffled to help clients with load balancing if they can't handle it themselves. But you have to perform regular DNS lookups for headless services to learn about added and/or removed instances, which is not the case for a regular service with a service IP.
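A minimal sketch of what "handling it yourself" can look like for a client of a headless service: re-resolve the name once a cached result is too old, and rotate across the returned addresses. The class name `HeadlessClient`, the `ttl` value, and the injectable `resolver` and `clock` parameters are assumptions made for illustration; only `socket.getaddrinfo` is a real standard-library call.

```python
import itertools
import socket
import time


def resolve_all(hostname, port):
    """Return the sorted, de-duplicated IPs behind a DNS name."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


class HeadlessClient:
    """Round-robins over pod IPs and re-resolves after `ttl` seconds."""

    def __init__(self, hostname, port, ttl=30.0,
                 resolver=resolve_all, clock=time.monotonic):
        self.hostname = hostname
        self.port = port
        self.ttl = ttl
        self._resolver = resolver
        self._clock = clock
        self._expires = float("-inf")  # forces a lookup on first use
        self._cycle = iter(())

    def next_address(self):
        # Refresh the address list once the cached result is too old,
        # so added or removed replicas are eventually noticed.
        if self._clock() >= self._expires:
            addresses = self._resolver(self.hostname, self.port)
            if not addresses:
                raise LookupError(f"no addresses for {self.hostname}")
            self._cycle = itertools.cycle(addresses)
            self._expires = self._clock() + self.ttl
        return next(self._cycle)
```

With a regular Service none of this is needed, because the client keeps using one stable IP and the cluster tracks the replicas behind it.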

First of all, headless Services are generally used to access the
pod replicas directly instead of going through the Service's IP.
In the case of a Deployment (stateless services), the pods are
interchangeable: if a pod needs to be rescheduled, it won't
keep the same ID as the previous pod.
StatefulSets, by contrast, maintain a unique ID for each pod.
StatefulSets provide two unique identities for each pod. The first is
the network identity, which gives the pod the same DNS name
regardless of how often it restarts. The second is the storage identity,
which also remains the same across restarts. So a
StatefulSet doesn't rely on a shared volume; each pod keeps its own.
No cluster IP address is assigned to a headless Service. Internally, it
builds the endpoints required to expose the pods by DNS name.
Conclusion: Stateful applications require pods with a distinct identity (hostname). For its pods to communicate with one another, a StatefulSet requires a headless Service. A headless Service is a Service without a service IP. As a result, a DNS lookup returns the IPs of the associated pods directly. This enables us to communicate with the pods directly instead of going through a proxy. In stateless applications, by contrast, a regular Service is used to interact with other pods.
For more detailed information refer to this article
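The stable network identity described above follows a predictable naming pattern, which a short sketch can make concrete. The helper name `statefulset_pod_dns` and the example names `web` and `web-headless` are hypothetical, and `cluster.local` is assumed as the cluster domain (it is the common default, but clusters can be configured differently).

```python
def statefulset_pod_dns(statefulset, service, namespace, replicas,
                        cluster_domain="cluster.local"):
    """Build the per-pod DNS names a StatefulSet gets via its headless Service."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical StatefulSet "web" governed by headless Service "web-headless".
for name in statefulset_pod_dns("web", "web-headless", "default", 3):
    print(name)
```

Because the ordinal is part of the name, a restarted pod comes back under the same DNS name even though its IP may have changed.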

Related

Can I have multiple headless services for StatefulSet?

In kubernetes, is it somehow possible to "assign" multiple headless services to single statefulset, or achieve behaviour describe below some other way?
Use-case:
We've got statefulset, let's call it: set. It has 3 pod, and headless service called set-headless.
It is possible to access pods, using following dns names:
set-0.set-headless.namespace.svc.cluster.local
set-1.set-headless.namespace.svc.cluster.local
set-2.set-headless.namespace.svc.cluster.local
For some reason, we would like to change these endpoints, e.g. to include more information in the headless service name: set-uswest1-headless.
To accomplish this change without downtime, it would be perfect, to have two headless services running at the same time, so pods could be accessible by following dns names:
set-0.set-headless.namespace.svc.cluster.local
set-1.set-headless.namespace.svc.cluster.local
set-2.set-headless.namespace.svc.cluster.local
set-0.set-uswest1-headless.namespace.svc.cluster.local
set-1.set-uswest1-headless.namespace.svc.cluster.local
set-2.set-uswest1-headless.namespace.svc.cluster.local
Is it possible at all? Can this be achieved some other way (not using headless servic
Yes, it all depends on the labels applied to each StatefulSet/Pod, which determine whether that pod is added to the headless service's endpoints.
You can have one headless service that routes to all the pods, and one for each distinct subset of pods.
EDIT: For your use case, in order to avoid downtime, it's important that both headless services select on the same labels.
Also, it's important to remember that headless services are for pods in the same StatefulSet to communicate with each other, while regular services are used for pods to be reached from other services. So in case you need the pods to be reached by other services or an ingress, you need the same labels applied to both the services and the StatefulSets to avoid downtime.
Or you could explain what kind of service this is, and I can help you with specific actions for that kind of service.

My understanding of headless service in k8s and two questions to verify

I am learning the headless service of kubernetes.
I understand the following without question (please correct me if I am wrong):
A headless service doesn't have a cluster IP,
It is used for communicating with stateful app
When client app container/pod communicates with a database pod via headless service the pod IP address is returned instead of the service's.
What I'm not quite sure about:
Many articles on the internet explaining headless services are vague, in my opinion. All I found were direct statements like:
If you don't need load balancing but want to directly connect to the
pod (e.g. database) you can use headless service
But what does it mean exactly?
So, following are my thoughts of headless service in k8s & two questions with an example
Let's say I have 3 replicas of a PostgreSQL database instance behind a service. If it is a regular service, I know that by default a request to the database would be routed in a round-robin fashion to one of the three database pods. That is indeed load balancing.
Question 1:
If I use a headless service instead, does the statement quoted above mean the headless service will stick with one of the three database pods and never change until that pod dies? I ask because otherwise it would still be doing load balancing if it did not stick with one of the three pods. Could someone please clarify this?
Question 2:
I feel no matter it is regular service or headless service, client application just need to know the DNS name of the service to communicate with database in k8s cluster. Isn't it so? I mean what's the point of using the headless service then? To me the headless service only makes sense if client application code really needs to know the IP address of the pod it connects to. So, as long as client application doesn't need to know the IP address it can always communicate with database either with regular service or with headless service via the service DNS name in cluster, Am I right here?
A normal Service comes with a load balancer (even if it's a ClusterIP-type Service). That load balancer has an IP address. The in-cluster DNS name of the Service resolves to the load balancer's IP address, which then forwards to the selected Pods.
A headless Service doesn't have a load balancer. The DNS name of the Service resolves to the IP addresses of the Pods themselves.
This means that, with a headless Service, basically everything is up to the caller. If the caller does a DNS lookup, picks the first address it's given, and uses that address for the lifetime of the process, then it won't round-robin requests between backing Pods, and it will not notice if that Pod disappears. With a normal Service, so long as the caller gets the Service's (cluster-internal load balancer's) IP address, these concerns are handled automatically.
A headless Service isn't specifically tied to stateful workloads, except that StatefulSets require a headless Service as part of their configuration. An individual StatefulSet Pod will actually be given a unique hostname connected to that headless Service. You can have both normal and headless Services pointing at the same Pods, though, and it might make sense to use a normal Service for cases where you don't care which replica is (initially) contacted.
A headless service will return all Pod IPs that are associated through the selector. The order is not stable, so if a client is making repeated DNS queries and uses only the first returned IP, this will result in some kind of load balancing as well.
Regarding your second question: That is correct. In general, if a client does not need to know all instances - and handle the unstable IPs - a regular service provides more benefits.
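To make the difference concrete, here is a small sketch that builds both kinds of Service manifest as Python dicts; the only difference is `clusterIP: "None"` on the headless one. The helper `service_manifest` and the names `postgres` and `postgres-headless` are hypothetical. The field names follow the `v1` Service API.

```python
import json


def service_manifest(name, selector, port, headless=False):
    """Return a Service manifest as a dict; headless=True sets clusterIP to "None"."""
    spec = {
        "selector": selector,
        "ports": [{"port": port, "targetPort": port}],
    }
    if headless:
        # The literal string "None" asks Kubernetes not to allocate a cluster IP,
        # so DNS returns the Pod IPs instead of a single virtual IP.
        spec["clusterIP"] = "None"
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": spec,
    }


# Both Services select the same Pods (hypothetical label app=postgres).
selector = {"app": "postgres"}
regular = service_manifest("postgres", selector, 5432)
headless = service_manifest("postgres-headless", selector, 5432, headless=True)
print(json.dumps(headless, indent=2))
```

A client connecting to `postgres` gets one stable IP, while a lookup of `postgres-headless` returns the individual Pod IPs, which is the "everything is up to the caller" case described above.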

Load balancing first or ingress rule matching first?

The k8s docs, along with other answers I can find, show the load balancer (LB) in front of the ingress. However, I am confused: after the ingress rule is matched, there can still be multiple containers backing the selected service. Does load balancing happen again here to select one container to route to?
https://kubernetes.io/docs/concepts/services-networking/ingress/#what-is-ingress
As you can see from the picture you posted, the Ingress chooses a Service (based on a rule), not a Pod directly. The Service may (or may not) have more than one Pod behind it.
The default Service type in Kubernetes is called ClusterIP. It receives a virtual IP, which then redirects requests to one of the Pods behind it. A kube-proxy runs on each node of the cluster and is responsible for implementing this virtual IP mechanism.
So yes, load balancing happens again after a Service is selected, if that Service selects more than one Pod. Which backend (Pod) is chosen depends on how kube-proxy is configured, and is usually either round robin or just random.
There is also a way to create a Service without a virtual IP. Such Services, called headless Services, directly use DNS to direct requests to the different backends. They are not the default because it is better to use proxies than to try to load balance with DNS, which may have side effects (depending on who makes the requests).
You can find a lot of info regarding how Services work in the docs.
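The two stages described in this answer can be sketched as a toy model: first a rule selects a Service, then one backing Pod is picked. This is only an illustration, not how an ingress controller or kube-proxy is implemented. The rule table, Service names, and Pod IPs are made up, and the rule matching is simplified to plain string prefixes.

```python
import random

# Hypothetical Ingress rules: path prefix -> Service name.
INGRESS_RULES = {"/api": "api-svc", "/": "web-svc"}

# Hypothetical endpoints: Service name -> Pod IPs currently backing it.
ENDPOINTS = {
    "api-svc": ["10.1.0.4", "10.1.1.7"],
    "web-svc": ["10.1.0.9"],
}


def match_service(path):
    """Stage 1: pick the Service whose rule has the longest matching prefix."""
    matches = [prefix for prefix in INGRESS_RULES if path.startswith(prefix)]
    if not matches:
        raise LookupError(f"no ingress rule matches {path}")
    return INGRESS_RULES[max(matches, key=len)]


def pick_pod(service, rng=random):
    """Stage 2: pick one backing Pod at random, as a stand-in for kube-proxy."""
    return rng.choice(ENDPOINTS[service])


def route(path, rng=random):
    """Run both stages and return (service, pod_ip)."""
    service = match_service(path)
    return service, pick_pod(service, rng)


print(route("/api/users"))
```

If a Service has only one Pod behind it, as `web-svc` does here, the second stage has nothing to balance and always returns that Pod.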

Communication between Pods in Kubernetes. Service object or Cluster Networking?

I'm a beginner in Kubernetes and I have the following situation: I have two different Pods, PodA and PodB. First, I want to expose PodA to the outside world, so I create a Service (type NodePort or LoadBalancer) for PodA, which is not difficult for me to understand.
Then I want PodA to communicate with PodB, and after several hours of googling, I found the answer is that I also need to create a Service (type ClusterIP if I want to keep PodB visible only inside the cluster) for PodB, and if I do so, PodA and PodB can communicate with each other. But the problem is that I also found this article. According to this webpage, communication between pods on the same node can be done via cbr0, a network bridge, and communication between pods on different nodes can be done via a route table of the cluster, and it doesn't mention the Service object at all (which would mean we don't need a Service object?).
In fact, I also read the documents of K8s and I found in the Cluster Networking
Cluster Networking
...
2. Pod-to-Pod communications: this is the primary focus of this document.
...
where they also focus on Pod-to-Pod communication, but there is nothing relevant to the Service object.
So I'm really confused right now, and my question is: could you please explain the connection between the things in the article and the Service object? Is the Service object a high-level abstraction of cbr0 and the route table? And in the end, how can the Pods communicate with each other?
If I misunderstand something, please, point it out for me, I really appreciate that.
Thank you guys !!!
Motivation behind using a service in a Kubernetes cluster.
Kubernetes Pods are mortal. They are born and when they die, they are not resurrected. If you use a Deployment to run your app, it can create and destroy Pods dynamically.
Each Pod gets its own IP address, however in a Deployment, the set of Pods running in one moment in time could be different from the set of Pods running that application a moment later.
This leads to a problem: if some set of Pods (call them “backends”) provides functionality to other Pods (call them “frontends”) inside your cluster, how do the frontends find out and keep track of which IP address to connect to, so that the frontend can use the backend part of the workload?
That being said, a service is handy when your deployments (podA and podB) are dynamically managed.
Your PodA can always communicate with PodB if it knows the address or the DNS name of PodB. In a cluster environment, there may be multiple replicas of PodB, or an instance of PodB may die and be replaced by another instance with a different address and different name. A Service is an abstraction to deal with this situation. If you use a Service to expose your PodB, then all pods in the cluster can talk to an instance of PodB using that service, which has a fixed name and fixed address no matter how many instances of PodB exists and what their addresses are.
First, I read it as you are dealing with two applications, e.g. ApplicationA and ApplicationB. Don't use the Pod abstraction when you reason about your architecture. On Kubernetes, you are dealing with a distributed system, and it is designed so that you should have multiple instances of your Application, e.g. for High Availability. Each instance of your application is a Pod.
Deploy your applications ApplicationA and ApplicationB as Deployment resources. Then it is easy to do rolling upgrades without downtime, and Kubernetes will restart any instance of your application if it crashes.
For every Deployment, i.e. for each application, create one Service resource (e.g. ServiceA and ServiceB). When you communicate from ApplicationA to another application, use the Service, e.g. ServiceB. The Service will load balance your requests across the instances of the other application, and you can upgrade your Deployment without downtime.
1. Cluster networking: as the name suggests, all the pods deployed in the cluster are connected by a network plugin that implements the Kubernetes network model, such as DANM or flannel.
Check this link to see how to create a cluster network.
Creating cluster network
With the CNI plugin installed (which implements the cluster network), every pod gets an IP.
2. Service objects created with type ClusterIP point to these pod IPs (via Endpoints that are created internally) so that pods can communicate.
To answer your question: in a sense, yes. The Service object is a higher-level abstraction built on top of the pod network (cbr0 and the route tables).
You can use a Service object to communicate between pods.
You can also add a service mesh such as Istio (which uses Envoy) if the network is complex.

How DNS service works in the Kubernetes?

I am new to Kubernetes, and I'm trying to understand how I can apply it to my use case.
I managed to install a 3-node cluster on VMs within the same network. After searching about K8S concepts and reading related articles, I still couldn't find an answer to my question below. Please let me know if you have any knowledge of this:
I've noticed that the internal DNS service of K8S applies to pods, and this way services can find each other by hostname instead of IP.
Is this applicable to communication between pods on different nodes, or only between services on a single node? (In other words, do we have a DNS service at the node level in K8S, or is it only about pods?)
The reason for this question is the scenario that I have in mind:
I need to deploy a microservice application (written in Java) with K8S. I made Docker images for each service in my application and it's working locally. Currently, these services are connected via predefined IP addresses.
Is there a way to run each of these services on a separate K8S node and use the DNS service to connect them without predefining IPs?
A service serves as an internal endpoint and (depending on the configuration) load balancer to one or several pods behind it. All communication typically is done between services, not between pods. Pods run on nodes, services don't really run anything, they are just routing traffic to the appropriate pods.
A service is a cluster-wide configuration that does not depend on a node, thus you can use a service name in the whole cluster, completely independent from where a pod is located.
So yes, your use case of running pods on different nodes and communicating via service names is a typical setup.
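As a rough illustration of replacing predefined IPs with service names, the sketch below builds a URL from a Service's cluster DNS name. It is written in Python for brevity even though the application in the question is Java. The helper `service_url`, the Service name `orders`, the port `8080`, and the `ORDERS_URL` environment variable are all hypothetical, and `cluster.local` is assumed as the cluster domain.

```python
import os


def service_url(service, namespace="default", port=8080,
                scheme="http", cluster_domain="cluster.local"):
    """Build a URL from a Service's cluster DNS name instead of a fixed IP."""
    host = f"{service}.{namespace}.svc.{cluster_domain}"
    return f"{scheme}://{host}:{port}"


# Let the environment override the target, falling back to the Service name.
ORDERS_URL = os.environ.get("ORDERS_URL", service_url("orders"))
print(ORDERS_URL)
```

The same name resolves from any pod on any node, so the calling service does not need to know where the target pods are scheduled.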