Kubernetes Service for a Subset of StatefulSet Pods - kubernetes

I have a StatefulSet with 3 pods. The first is assigned to the master role, the rest have a read replica role.
redis-0 (master)
redis-1 (replica)
redis-2 (replica)
How can I create a Kubernetes Service that matches only the pods redis-1 and redis-2? Basically I want to service that points only to the pods acting as replicas?
Logically what I want is to select every pod in the STS except the first. In pseudocode:
selector: app=redis-sts && statefulset.kubernetes.io/pod-name!=redis-0
Alternatively, selecting all the relevant pods could be viable. Again in psuedocode:
selector: statefulset.kubernetes.io/pod-name=redis-1 || statefulset.kubernetes.io/pod-name=redis-2
Here is the relevant YAML with the selectors & service defined. Full YAML.
apiVersion: v1
kind: Service
metadata:
name: redis-service
spec:
ports:
- port: 6379
clusterIP: None
selector:
app: redis-sts
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: redis
spec:
selector:
matchLabels:
app: redis-sts
serviceName: redis-service
replicas: 3
template:
metadata:
labels:
app: redis-sts
spec:
# ...

You may use pod name labels of your redis statefulset to create the service to access a particular read replica pod.
apiVersion: v1
kind: Service
metadata:
name: redis-1
spec:
type: LoadBalancer
externalTrafficPolicy: Local
selector:
statefulset.kubernetes.io/pod-name: redis-1
ports:
- protocol: TCP
port: 6379
targetPort: 6379
And then use the service name for the pod to access the specific pods.
externalTrafficPolicy: Local will only proxy the traffic to the node that has the instance of your pod.

The Service v1 API in 1.21 doesn't support the newer "set" based LabelSelector (matchLabels or matchExpressions)
You could write a controller to apply labels to the stateful set that can then meet the simple equality logic for Service selectors. There may be Redis operators that do this type of thing already.
An idea from this stateful set label question is to use an initContainer that has the pod write access to add the label.

i would suggest to not be depends on the Kubernetes service as if your master pod got killer or restarted read replica can get change anytime in redis cluster.
https://github.com/harsh4870/Redis-Rejson-HA-Helm-Chart
here is the helm chart which deploys the Redis the same way one master and two read replica but with sentinel.
Your node or python code has to hit the service of Redis and in return redis sentinel will give you all the IP addresses for Mater and slave replicas.
Using that IP you can always connect to the Read replica and Master as per need.
If your cluster gets restarted or POD restarted master read replicas may get changed with time.

Related

Kubernetes (GKE) names , labels, selectors, matchLables in manifest files

I have a question about labels and names, in this example manifest file
apiVersion: apps/v
1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.7.9
ports:
- containerPort: 80
I can see that the name of the deployment is "nginx-deployment" and the pod name is "nginx"? or is it the running container?
Then I see in the console that the pods would have a hash attached to the end of its name, I believe this is the revision number?
I just want to decipher the names from the labels from the matchLables, so for example I can use this service manifest to expose the pods with a certain label:
apiVersion: v1
kind: Service
metadata:
name: nginx
spec:
type: LoadBalancer
selector:
app: nginx
ports:
- protocol: TCP
port: 60000
targetPort: 80
will this service expose all pods with the selector : app:nginx ?
thanks
The Deployment has your specifed name "nginx-deployment".
However you did not define a Pod with a fixed name, you define a template for the pods managed by this deployment.
The Deployment manages 3 pods (because of your replicas 3), so it will use the template to build this three pods.
There will also be a Replica Set with a hash and this will manage the Pods, but this is better seen by following the example below.
Since a deployment can manage multiple pods (like your example with 3 replicas) or needing one new Pod when updating them, a deployment will not exactly use the name specified in the template, but will always append a hash value to keep them unique.
But now you would have the problem to have all Pods loadbalanced behind one Kubernetes Service, because they have different names.
This is why you denfine a label "app:nignx" in your template so all 3 Pods will have this label regardless of there name and other labels set by Kubernetes.
The Service uses the selector to find the correct Pods. In your case it will search them by label "app:nginx".
So yes the Service will expose all 3 Pods of your deployment and will loadbalance trafic between them.
You can use --show-labels for kubectl get pods to see the name and the assigned labels.
For a more complete example see:
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

Disable Kubernetes replica set load balancing

I have a Kubernetes cluster of 3 nodes.
A sample deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.14.2
ports:
- containerPort: 80
I do not have ingress, but I do have external load balancer that round-robins the traffic at 80.11.12.10, 80.11.12.11, 80.11.12.12. So I set my service like this.
apiVersion: v1
kind: Service
metadata:
name: nginx-service
spec:
selector:
app: nginx
ports:
- protocol: TCP
port: 80
externalIPs:
- 80.11.12.10
- 80.11.12.11
- 80.11.12.12
The problem is that due to existing kubernetes service load balancer the traffic get load balancing twice. Aside that it is unnecessary it is spoils the connection persistence. Is there a way to force Kubernetes to forward traffic on local machine pod for each node?
If you set service.spec.externalTrafficPolicy to the value Local, kube-proxy only proxies proxy requests to local endpoints, and does not forward traffic to other nodes.
kubectl patch svc servicename -p '{"spec":{"externalTrafficPolicy":"Local"}}'
If there are no local endpoints, packets sent to the node are dropped.
For clusterIP type service you need to use Service Topology
Service Topology enables a service to route traffic based upon the
Node topology of the cluster. For example, a service can specify that
traffic be preferentially routed to endpoints that are on the same
Node as the client, or in the same availability zone.
It's an alpha feature available from kubernetes 1.17 which needs to be turned on by enabling the feature flag
kube-proxy is using ipvs in order to maange load-balancing in recent k8s versions. ipvsadm can be used at the node level in order to choose which load-balancing algorithm you want ipvs to use. According to https://linux.die.net/man/8/ipvsadm, you can choose between 8 load-balancing algorithms: round robin, weighted round robin, least-connection, weighted least-connection, locality-based least-connection, locality-based least-connection with replication, destination-hashing, and source-hashing. For additional information, check the documentation for the --scheduler option of ipvsadm.

How to set hostname for kubernetes pod in statefulset

I am using statefulset and I spin up multiple pods but they are not replica of each other. I want to set the hostname of the pods and passing these hostname as a env variable to all the pods so that they can communicate with each other.
I tried to use hostname under pod spec but hostname is never to set to specified hostname. However, it is set to hostname as podname-0.
# Source: testrep/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: orbiting-butterfly-testrep
labels:
app.kubernetes.io/name: testrep
helm.sh/chart: testrep-0.1.0
app.kubernetes.io/instance: orbiting-butterfly
app.kubernetes.io/managed-by: Tiller
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: testrep
app.kubernetes.io/instance: orbiting-butterfly
strategy:
type: Recreate
template:
metadata:
labels:
app.kubernetes.io/name: testrep
app.kubernetes.io/instance: orbiting-butterfly
spec:
nodeSelector:
testol: ad3
hostname: test1
containers:
- name: testrep
image: "test/database:v1"
imagePullPolicy: IfNotPresent
env:
- name: DB_HOSTS
value: test1,test2,test3
As per documentation:
StatefulSet is the workload API object used to manage stateful applications.
Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.
StatefulSets are valuable for applications that require one or more of the following:
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, automated rolling updates.
Statefulset Limitations:
StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service.
Pod Identity
StatefulSet Pods have a unique identity that is comprised of an ordinal, a stable network identity, and stable storage. The identity sticks to the Pod, regardless of which node it’s (re)scheduled on.
For a StatefulSet with N replicas, each Pod in the StatefulSet will be assigned an integer ordinal, from 0 up through N-1, that is unique over the Set.
Stable Network ID
Each Pod in a StatefulSet derives its hostname from the name of the StatefulSet and the ordinal of the Pod. The pattern for the constructed hostname is $(statefulset name)-$(ordinal). The example above will create three Pods named web-0,web-1,web-2. A StatefulSet can use a Headless Service to control the domain of its Pods. The domain managed by this Service takes the form: $(service name).$(namespace).svc.cluster.local, where “cluster.local” is the cluster domain. As each Pod is created, it gets a matching DNS subdomain, taking the form: $(podname).$(governing service domain), where the governing service is defined by the serviceName field on the StatefulSet.
Note:
You are responsible for creating the Headless Service responsible for the network identity of the pods.
So as described by vjdhama Please create your Statefulset with Headless Service.
You can find this example in the docs:
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx # has to match .spec.template.metadata.labels
serviceName: "nginx" # has to match headless Service metadata.name
replicas: 3 # by default is 1
template:
metadata:
labels:
app: nginx # has to match .spec.selector.matchLabels
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: k8s.gcr.io/nginx-slim:0.8
ports:
- containerPort: 80
In this scenario Pod DNS and Pod Hostnames should be respectively:
Pod DNS
web-{0..N-1}.nginx.default.svc.cluster.local
Pod Hostname
web-{0..N-1}
NAME READY STATUS RESTARTS AGE IP
pod/web-0 1/1 Running 0 5m 192.168.148.78
pod/web-1 1/1 Running 0 4m53s 192.168.148.79
pod/web-2 1/1 Running 0 4m51s 192.168.148.80
From the Pod perspective:
root#web-2:# nslookup nginx
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: nginx.default.svc.cluster.local
Address: 192.168.148.80
Name: nginx.default.svc.cluster.local
Address: 192.168.148.78
Name: nginx.default.svc.cluster.local
Address: 192.168.148.79
So you can call each of the respective pods using the Pod DNS, like:
web-0.nginx.default.svc.cluster.local
Update:
Exposing single pod from StatefulSet.
Pod Name Label
When the StatefulSet controller creates a Pod, it adds a label, statefulset.kubernetes.io/pod-name, that is set to the name of the Pod. This label allows you to attach a Service to a specific Pod in the StatefulSet.
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-name-label
You can find here tricky way.
Using described above advantages of Statefulset:
The pattern for the constructed hostname is $(statefulset name)-$(ordinal). The example above will create three Pods named web-0,web-1,web-2.
So as an example:
apiVersion: v1
kind: Service
metadata:
name: app-0
spec:
type: LoadBalancer
selector:
statefulset.kubernetes.io/pod-name: web-0
ports:
- protocol: TCP
port: 80
targetPort: 80
Will do this for you.
Hope this help.
Kubernetes has service discovery baked in, so you don't have to do that. You can setup a headless service for your StatefulSet to let other applications talk to it.
The pods that are created with StatefulSets are ordered and setup sequentially. Hence the postfix integer value. You can read more on it here.

Kubernetes to find Pod IP from another Pod

I have the following pods hello-abc and hello-def.
And I want to send data from hello-abc to hello-def.
How would pod hello-abc know the IP address of hello-def?
And I want to do this programmatically.
What's the easiest way for hello-abc to find where hello-def?
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: hello-abc-deployment
spec:
replicas: 1
template:
metadata:
labels:
app: hello-abc
spec:
containers:
- name: hello-abc
image: hello-abc:v0.0.1
imagePullPolicy: Always
args: ["/hello-abc"]
ports:
- containerPort: 5000
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: hello-def-deployment
spec:
replicas: 1
template:
metadata:
labels:
app: hello-def
spec:
containers:
- name: hello-def
image: hello-def:v0.0.1
imagePullPolicy: Always
args: ["/hello-def"]
ports:
- containerPort: 5001
---
apiVersion: v1
kind: Service
metadata:
name: hello-abc-service
spec:
ports:
- port: 80
targetPort: 5000
protocol: TCP
selector:
app: hello-abc
type: NodePort
---
apiVersion: v1
kind: Service
metadata:
name: hello-def-service
spec:
ports:
- port: 80
targetPort: 5001
protocol: TCP
selector:
app: hello-def
type: NodePort
Preface
Since you have defined a service that routes to each deployment, if you have deployed both services and deployments into the same namespace, you can in many modern kubernetes clusters take advantage of kube-dns and simply refer to the service by name.
Unfortunately if kube-dns is not configured in your cluster (although it is unlikely) you cannot refer to it by name.
You can read more about DNS records for services here
In addition Kubernetes features "Service Discovery" Which exposes the ports and ips of your services into any container which is deployed into the same namespace.
Solution
This means, to reach hello-def you can do so like this
curl http://hello-def-service:${HELLO_DEF_SERVICE_PORT}
based on Service Discovery https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables
Caveat: Its very possible that if the Service port changes, only pods that are created after the change in the same namespace will receive the new environment variables.
External Access
In addition, you can also reach this your service externally since you are using the NodePort feature, as long as your NodePort range is accessible from outside.
This would require you to access your service by node-ip:nodePort
You can find out the NodePort which was randomly assigned to your service with kubectl describe svc/hello-def-service
Ingress
To reach your service from outside you should implement an ingress service such as nginx-ingress
https://github.com/helm/charts/tree/master/stable/nginx-ingress
https://github.com/kubernetes/ingress-nginx
Sidecar
If your 2 services are tightly coupled, you can include both in the same pod using the Kubernetes Sidecar feature. In this case, both containers in the pod would share the same virtual network adapter and accessible via localhost:$port
https://kubernetes.io/docs/concepts/workloads/pods/pod/#uses-of-pods
Service Discovery
When a Pod is run on a Node, the kubelet adds a set of environment
variables for each active Service. It supports both Docker links
compatible variables (see makeLinkVariables) and simpler
{SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT variables, where the
Service name is upper-cased and dashes are converted to underscores.
Read more about service discovery here:
https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables
You should be able to reach hello-def-service from pods in hello-abc via DNS as specified here: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#services
However, kube-dns or CoreDNS has to be configured/installed in your k8s cluster before DNS records can be utilized in your cluster.
Specifically, you should be reach hello-def-service via the DNS record http://hello-def-service for the service running in the same namespace as hello-abc-service
And you should be able to reach hello-def-service running in another namespace ohter_namespace via the DNS record hello-def-service.other_namespace.svc.cluster.local.
If, for some reason, you do not have DNS add-ons installed in your cluster, you still can find the virtual IP of the hello-def-service via environment variables in hello-abc pods. As is documented here.

Can we configure one service with different pod resources in Kubernetes?

I want to deploy a Postgres service with replication on Kubernetes cluster.
I have defined a PetSet and a Service for this. But i am only able to define same resource limits to all pods in a service, due to which Kubernetes assigns these nodes randomly to nodes.
Is there a way, where i can have a service with different pods resource conf ?
My current yaml for reference.
https://github.com/kubernetes/charts/blob/master/incubator/patroni/templates/ps-patroni.yaml
You cannot assign different configuration options (i.e. resource limits) to pods in the same Replica. Inherently they are meant to be identical. You would need to create multiple replicas in order to accomplish this.
for a service you could have multiple deployments with different nodeSelector configurations behind one Service definition.
E.g. you could label your nodes like this:
kubectl label nodes node-1 pool=freshhardware
kubectl label nodes node-2 pool=freshhardware
kubectl label nodes node-3 pool=shakyhardware
And then have two deployments like this:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: my-nginx
spec:
replicas: 4
template:
metadata:
labels:
run: my-nginx
spec:
containers:
- name: my-nginx
image: nginx
ports:
- containerPort: 80
nodeSelector:
pool: freshhardware
... the second one could look the same with just these fields exchanged:
nodeSelector:
pool: shakyhardware
A service definition like this would then take all pods from both deployments into account:
apiVersion: v1
kind: Service
metadata:
name: my-nginx
labels:
run: my-nginx
spec:
ports:
- port: 80
protocol: TCP
selector:
run: my-nginx
The drawback is of course that you'd have to manage two Deployments at a time, but that's kind of build-in to this problem anyways.