Why don't Kubernetes deployments support services? - kubernetes

I'm new to K8s, so still trying to get my head around things. I've been looking at deployments and can appreciate how useful they will be. However, I don't understand why they don't support services (only replica sets and pods).
Why is this? Does this mean that services would typically be deployed outside of a deployment?

To answer your question, Kubernetes deployments are used for managing stateless services running in the cluster instead of StatefulSets which are built for the stateful application run-time. Actually, with deployments you can describe the update strategy and road map for all underlying objects that have to be created during implementation.Therefore, we can distinguish separate specification fields for some objects determination, like needful replica number of Pods, template for Pod by describing a list of containers that should be in the Pod, etc.
However, as #P Ekambaram already mention in his answer, Services represent abstraction layer of network communication model inside Kubernetes cluster, and they declare a way to access Pods within a cluster via corresponded Endpoints. Services are separated from deployment object manifest specification, because of their mission to dynamically provide specific network behavior for the nested Pods without affecting or restarting them in case of any communication modification via appropriate Service Types.

Yes, services should be deployed as separate objects. Note that deployment is used to upgrade or rollback the image and works above ReplicaSet
Kubernetes Pods are mortal. They are born and when they die, they are not resurrected. ReplicaSets in particular create and destroy Pods dynamically (e.g. when scaling out or in). While each Pod gets its own IP address, even those IP addresses cannot be relied upon to be stable over time. This leads to a problem: if some set of Pods (let’s call them backends) provides functionality to other Pods (let’s call them frontends) inside the Kubernetes cluster, how do those frontends find out and keep track of which backends are in that set?
Services.come to the rescue.
A Kubernetes Service is an abstraction which defines a logical set of Pods and a policy by which to access them. The set of Pods targeted by a Service is (usually) determined by a Label Selector

Something I've just learnt that is somewhat related to my question: multiple K8s objects can be included in the same yaml file, separate by ---. Something like:
apiVersion: v1
kind: Deployment
# other stuff here
---
apiVersion: v1
kind: Service
# other stuff here

i think it intends to decoupled and fine-grained.

Related

Why not to use Kubernetes StatefulSet for stateless applications?

I know why use StatefulSet for stateful applications. (e.g. DB or something)
In most cases, I can see like "You want to deploy stateful app to k8s? Use StatefulSet!"
However, I couldn't see like "You want to deploy stateless app to k8s? Then, DO NOT USE StatefulSet" ever.
Even nobody says "I don't recommend to use StatefulSet for stateless app", many stateless apps is deployed through Deployment, like it is the standard.
The StatefulSet has clear pros for stateful app, but I think Deployment doesn't for stateless app.
Is there any pros in Deployment for stateless apps? Or is there any clear cons in StatefulSet for stateless apps?
I supposed that StatefulSet cannot use LoadBalancer Service or StatefulSet has penalty to use HPA, but all these are wrong.
I'm really curious about this question.
P.S. Precondition is the stateless app also uses the PV, but not persists stateful data, for example logs.
I googled "When not to use StatefulSet", "when Deployment is better than StatefulSet", "Why Deployment is used for stateless apps", or something more questions.
I also see the k8s docs about StatefulSet either.
Different Priorities
What happens when a Node becomes unreachable in a cluster?
Deployment - Stateless apps
You want to maximize availability. As soon as Kubernetes detects that there are fewer than the desired number of replicas running in your cluster, the controllers spawn new replicas of it. Since these apps are stateless, it is very easy to do for the Kubernetes controllers.
StatefulSet - Stateful apps
You want to maximize availability - but not you must ensure data consistency (the state). To ensure data consistency, each replica has its own unique ID, and there are never multiple replicas of this ID, e.g. it is unique. This means that you cannot spawn up a new replica, unless that you are sure that the replica on the unreachable Node are terminated (e.g. stops using the Persistent Volume).
Conclusion
Both Deployment and StatefulSet try to maximize the availability - but StatefulSet cannot sacrifice data consistency (e.g. your state), so it cannot act as fast as Deployment (stateless) apps can.
These priorities does not only happens when a Node becomes unreachable, but at all times, e.g. also during upgrades and deployments.
In contrast to a Kubernetes Deployment, where pods are easily replaceable, each pod in a StatefulSet is given a name and treated individually. Pods with distinct identities are necessary for stateful applications.
This implies that if any pod perishes, it will be apparent right away. StatefulSets act as controllers but do not generate ReplicaSets; rather, they generate pods with distinctive names that follow a predefined pattern. The ordinal index appears in the DNS name of a pod. A distinct persistent volume claim (PVC) is created for each pod, and each replica in a StatefulSet has its own state.
For instance, a StatefulSet with four replicas generates four pods, each of which has its own volume, or four PVCs. StatefulSets require a headless service to return the IPs of the associated pods and enable direct interaction with them. The headless service has a service IP but no IP address and has to be created separately.The major components of a StatefulSet are the set itself, the persistent volume and the headless service.
That all being said, people deploy Stateful Applications with Deployments, usually they mount a RWX PV into the pods so all "frontends" share the same backend. Quite common in CNCF projects.
A stateful set manages each POD with a unique hostname based on an index number. So with an index, it would be easy to identify the individual PODs and also easy for the application to check which on rely or unique network identities. Also, you might have read stateful sets get deleted in a specified order to maintain consistency.
When you use stateful for the stateless application it will be like a burden to manage and add complexity to unique network identities and ordering guarantees.
For example, when you scale down to zero stateful sets it goes in the controlled way while with deployment or RS it won't be the same case. However, there is no guarantee when deleting the resource stateful set.
Also, Before a scaling operation is applied to a stateful set Pod, all of its predecessors must be Running and Ready. So if you are deploying the application, three Pods will be deployed suppose in order app-0, app-1, app-2. app-1 wont be deployed before app-0 is Running & Ready, and app-2 wont be deployed until app-1 is Ready.
While with deployment you can manage the % for and handle the RollingUpdate scenario but with a stateful set it will delete and recreate new POD one by one.

Can I have multiple headless services for StatefulSet?

In kubernetes, is it somehow possible to "assign" multiple headless services to single statefulset, or achieve behaviour describe below some other way?
Use-case:
We've got statefulset, let's call it: set. It has 3 pod, and headless service called set-headless.
It is possible to access pods, using following dns names:
set-0.set-headless.namespace.svc.cluster.local
set-1.set-headless.namespace.svc.cluster.local
set-2.set-headless.namespace.svc.cluster.local
For some reasons, we would like to change this endpoints, to i.e. contain some more information in headless service name - set-uswest1-headles.
To accomplish this change without downtime, it would be perfect, to have two headless services running at the same time, so pods could be accessible by following dns names:
set-0.set-headless.namespace.svc.cluster.local
set-1.set-headless.namespace.svc.cluster.local
set-2.set-headless.namespace.svc.cluster.local
set-0.set-uswest1-headless.namespace.svc.cluster.local
set-1.set-uswest1-headless.namespace.svc.cluster.local
set-2.set-uswest1-headless.namespace.svc.cluster.local
Is it possible at all? Can this be achieved some other way (not using headless servic
Yes, it all depends on the labels applied to each statefulSet/Pod that will add that pod to the headless service endpoints.
You can have one headless service to route to all the pods, and 1 for each set of different set of pods
EDIT: For your use case, in order to not have downtime, its important that both of the headless services has the same labels.
Also, its important to remember that headless services are for pods in the same statefulset to communicate with each other and services are used for pods to be reached from other services. So in case you need the pods to be reached by other services/ingress you need the same labels applied to both services and satefulsets for no downtime.
Or you could explain what kind of service is this and i can help you with specific actions for that kind of service

In Kubernetes, what is the real purpose of replicasets?

I am aware about the hierarchical order of k8s resources. In brief,
service: a service is what exposes the application to outer world (or with in cluster). (The service types like, CluserIp, NodePort, Ingress are not so much relevant to this question. )
deployment: a deployment is what is responsible to keep a set of pods running.
replicaset: a replica set is what a deployment in turn relies on to keep the set of pods running.
pod: - a pod consist of a container or a group of container
container - the actual required application is run inside the container.
The thing i want to empasise in this question is, why we have replicaset. Why don't the deployment directly handle or take responsibility of keeping the required number of pods running. But deployment in turn relies on replicset for this.
If k8s is designed this way there should be definitely some benefit of having replicaset. And this is what i want to explore/understand in depth.
Both essentially serves the same purpose. Deployments are a higher abstraction and as the name suggests it deals with creating, maintining and upgrading the deployment (collection of pods) as a whole.
Whereas, ReplicationControllers or Replica sets primary responsibility is to maintain a set of identical replicas (which you can achieve declaratively using deployments too, but internally it creates a resplicaset to enable this).
More specifically, when you are trying to perform a "rolling" update to your deployment, such as updating the image versions, the deployment internally creates a new replica set and performs the rollout. during the rollout you can see two replicasets for the same deployment.
So in other words, Deployment needs the lower level "encapsulation" of Replica sets to achive this.

Kubernetes Statefulset access and downtime

I did read up on StatefulSets which I want to use for some stateful applications, and was wondering about two things?
1) Do I need to put a service in front? Or do I just dns query the single instances, and they do have a static ip like with a service by default, disregarding what pod runs behind?
2) How does a statefulset behave when a given pod X is down? I suppose with my theory of it having a kind of "internal" service, it will just hold back any requests done while the pod behind the IP is down until a new one is there?
1) Do I need to put a service in front? Or do I just dns query the
single instances, and they do have a static ip like with a service by
default, disregarding what pod runs behind?
StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service.
2) How does a statefulset behave when a given pod X is down? I suppose
with my theory of it having a kind of "internal" service, it will just
hold back any requests done while the pod behind the IP is down until
a new one is there?
StatefulSet ensures that, at any time, there is at most one Pod with a given identity running in a cluster. This is referred to as at most one semantics provided by a StatefulSet.The StatefulSet also recreates Pods if they’re deleted, similar to what a ReplicaSet does for stateless Pods.

What exactly Kubernetes Services are and how they are different from Deployments

After reading thru Kubernetes documents like this, deployment , service and this I still do not have a clear idea what the purpose of service is.
It seems that the service is used for 2 purposes:
expose the deployment to the outside world (e.g using LoadBalancer),
expose one deployment to another deployment (e.g. using ClusterIP services).
Is this the case? And what about the Ingress?
------ update ------
Connect a Front End to a Back End Using a Service is a good example of the service working with the deployment.
Service
A deployment consists of one or more pods and replicas of pods. Let's say, we have 3 replicas of pods running in a deployment. Now let's assume there is no service. How does other pods in the cluster access these pods? Through IP addresses of these pods. What happens if we say one of the pods goes down. Kunernetes bring up another pod. Now the IP address list of these pods changes and all the other pods need to keep track of the same. The same is the case when there is auto scaling enabled. The number of the pods increases or decreases based on demand. To avoid this problem services come into play. Thus services are basically programs that manages the list of the pods ip for a deployment.
And yes, also regarding the uses that you posted in the question.
Ingress
Ingress is something that is used for providing a single point of entry for the various services in your cluster. Let's take a simple scenario. In your cluster there are two services. One for the web app and another for documentation service. If you are using services alone and not ingress, you need to maintain two load balancers. This might cost more as well. To avoid this, ingress when defined, sits on top of services and routes to services based on the rules and path defined in the ingress.