Is it possible for a pod running in a satrefulset to get the hostname of the all the pod running in different statefulset? - kubernetes

I have a pod running in a statefulset but it needs to know the hostname or address of all pods running in another statefulset to communicate with them. The second statefulset is being created by a separate helm chart. Can the pod work this out dynamically? Can I inject this information into the pod through an env similar to setting .Status.ip?
Edit: Each statefulSet has its own headless service

As discussed in the comments, the way to go here is to use a service-resource as this will give you a static DNS within the cluster to reach all the pods that a targeted by that service.
The DNS for the service is:
the services name if you access it from within the same namespace
<my-service-name>.<namespace-name>.svc.cluster.local if you access it from another namespace, and where cluster.local is the clusters domain that might differ from cluster to cluster depending on the clusters configuration
If you further need more configuration options, e.g. when you want to deploy your chart into different cloud environments where the clusters-domain might actually differ, you can use kustomize.io to adjust your configuration at apply time.

Related

Is there a way to apply different configmap for each pod generated by damonset?

I am using filebeat as a daemonset and I would like each generated pod to export to a single port for the logstash.
Is there an approach to be used for this?
No. You cannot provide different configmap to the pods of same daemonset or deployment. If you want each of your pods of daemonset to have different configurations then you can mount some local-volume (using hostpath) so that all the pods will take configuration from that path and that can be different on each node. Or you need to deploy different daemonsets with different configmaps and select different nodes for each of them.
As you can read here:
A DaemonSet ensures that all (or some) Nodes run a copy of a Pod.
...a copy of Pod based on a single template and this is the reason why you cannot specify different ConfigMaps to be used by different Pods managed by DaemonSet Controller.
As an alternative you can configure many different DaemonSets where each one will be responsible for running copy of a Pod specified in the template only on specific node.
Another alternative is using static pods:
It is possible to create Pods by writing a file to a certain directory
watched by Kubelet. These are called static pods. Unlike DaemonSet,
static Pods cannot be managed with kubectl or other Kubernetes API
clients. Static Pods do not depend on the apiserver, making them
useful in cluster bootstrapping cases. Also, static Pods may be
deprecated in the future.
The whole procedure of creation a static Pod is described here.
I hope it helps.
You can use a ConfigMap containing config in each node and expose spec.nodeName environment to your pods. Then your pods can know which node it's running on and decide which config it loads.

How do I expose each individual Pod in a DaemonSet without hostNetwork

How do I be able to go to a specific Pod in a DaemonSet without hostNetwork? The reason is my Pods in the DaemonSet are stateful, and I prefer to have at most one worker on each Node (that's why I used DaemonSet).
My original implementation was to use hostNetwork so the worker Pods can be found by Node IP by outside clients. But in many production environment hostNetwork is disabled, so we have to create one NodePort service for each Pod of the DaemonSet. This is not flexible and obviously cannot work in the long run.
Some more background on how my application is stateful
The application works in an HDFS-taste, where Workers(datanodes) register with Masters(namenodes) with their hostname. The masters and outside clients need to go to a specific worker for what it's hosting.
hostNetwork is an optional setting and is not necessary. You can connect to your pods without specifying it.
To communicate with pods in DaemonSet you can specify hostPort in the DaemonSet’s pod spec to expose it on the node. You can then communicate with it directly by using the IP of the node it is running on.
Another approach to connect to stateful application is StatefulSet. It allows you to specify network identifiers. However it requires headless service for network identity of the Pods and you are responsible for creating such services.

How to access pods without services in Kubernetes

I was wondering how pods are accessed when no service is defined for that specific pod. If it's through the environment variables, how does the cluster retrieve these?
Also, when services are defined, where on the master node is it stored?
Kind regards,
Charles
If you define a service for your app , you can access it outside the cluster using that service
Services are of several types , including nodePort , where you can access that port on any cluster node and you will have access to the service regardless of the actual location of the pod
you can access the endpoints or actual pod ports inside the cluster as well , but not outside
all of the above uses the kubernetes service discovery
There are two type of service dicovery though
Internal Service discovery
External Service Discovery.
You cannot "access" a pods container port(s) without a service. Services are objects that define the desired state of an ultimate set of iptable rule(s).
Also, services, like all other objects, are stored in etcd and maintained through your master(s).
You could however manually create an iptable rule forwarding traffic to the local container port that docker has exposed.
Hope this helps! If you still have any questions drop them here.
Just for debugging purposes, you can forward a port from your machine to one in the pod:
kubectl port-forward POD_NAME HOST_PORT:POD_PORT
If you have to access it from anywhere, you should use services, but you got to have a deployment created
Create deployment
kubectl create -f https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/service/networking/run-my-nginx.yaml
Expose the deployment with a NodePort service
kubectl expose deployment deployment/my-nginx --type=NodePort --name=nginx-service
Then list the services and get the port of the service
kubectl get services | grep nginx-service
All cluster data is stored in etcd which is a distributed key-value store. If etcd goes down, cluster becomes unstable and no new pods can come up.
Kubernetes has a way to access any pod within the cluster. Service is a logical way to access a set of pods bound by a selector. An individual pod can still be accessed irrespective of the service. Further service can be created to access the pods from outside the cluster (NodePort service)

Pop to Pod communication for pods within the same Deployment

I have a Kubernetes deployment that has 3 replicas. It starts 3 pods which are distributed across a given cluster. I would like to know how to reliably get one pod to contact another pod within the same ReplicaSet.
The deployment above is already wrapped up in a Kubernetes Service. But Services do not cover my use case. I need each instance of my container (each Pod) to start-up a local in memory cache and have these cache communicate/sync with other cache instances running on other Pods. This is how I see a simple distributed cache working on for my service. Pod to pod communication within the same cluster is allowed as per the Kubernetes Network Model but I cannot see a reliable way to address each a pod from another pod.
I believe I can use a StatefulSet, however, I don't want to lose the ClusterIP assigned to the service which is required by Ingress for load balancing.
Ofcourse you can use statefulset, and ingress doesn't need ClusterIP that assigned to the service, since it uses the endpoints, so 'headless service' is ok.

Helm Deployment vs Service

I am trying to understand k8s and helm.
When I create a helm chart, there are 2 files: service.yaml and deployment.yaml. Both of them have a name field.
If I understand correctly, the deployment will be responsible for managing the pods, replicasets, etc and thus the service.
Basically, why am I allowed use a separate name for the service and for the deployment? Under what scenario would we want these 2 names to differ? Can a deployment have more than 1 service?
The "service" creates a persistent IP address in your cluster which is how everything else connects it. The Deployment creates a ReplicaSet, which creates a Pod, and this Pod is the backend for that service. There can be more than 1 pod, in which case the service load balances, and these pods can change over time, change IP's, but your service remains constant.
Think of the service as a load balancer which points to your pods. It's analogous to interfaces and implementations. The service is like an interface, which is backed by the pods, the impementations.
The mapping is m:n. You can have multiple services backed by a single pod, or multiple pods backing a single service.