When using GKE, I found that all the nodes in a Kubernetes cluster must be in the same network and the same subnet. So I wanted to understand the correct way to design the networking.
I have two services, A and B, that are unrelated to each other. My plan was to use a single cluster in a single region, with two nodes for each of the services A and B, in different subnets of the same network.
However, it seems that can't be done. The other way to partition a cluster is with namespaces, but I am already using namespaces to partition my development environments.
I read about cluster federation (https://kubernetes.io/docs/concepts/cluster-administration/federation/), but my services are small and I don't need them running in multiple clusters and kept in sync.
What is the correct way to set up networking for these services? Should I just use the same network and subnet for all 4 nodes that serve the two services A and B?
You can restrict incoming (or outgoing) traffic by using labels and network policies.
This way, pods accept traffic only if it was generated by a pod belonging to the same application, or according to whatever logic you want to implement.
You can follow this step-by-step tutorial, which guides you through the implementation of a POC.
kubectl run hello-web --labels app=hello \
--image=gcr.io/google-samples/hello-app:1.0 --port 8080 --expose
Example of Network policy
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: hello-allow-from-foo
spec:
  policyTypes:
  - Ingress
  podSelector:
    matchLabels:
      app: hello
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: foo
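Applied to the two unrelated services in the question, the same idea could look roughly like this. It is only a sketch: the label `app: service-a` and the policy name are hypothetical, and it assumes network policy enforcement is enabled on the cluster so that the policy actually has an effect.

```yaml
# Hypothetical policy: pods of service A accept ingress only from other service A pods.
# A second policy using app: service-b would isolate service B in the same way.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: service-a-allow-same-app   # assumed name
spec:
  policyTypes:
  - Ingress
  podSelector:
    matchLabels:
      app: service-a               # assumed label on the service A pods
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: service-a
```

Once a pod is selected by an ingress policy, traffic from any source that is not listed is dropped, including clients outside the cluster, so further `from` entries are needed for every caller that should be allowed.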
I am new to the Kubernetes world, and I came across an interesting concept called a headless service.
I have read about it, I understand it, and I can create a headless service. But I am still not convinced about the use cases. Why do we need it? There are already three types of service (ClusterIP, NodePort and LoadBalancer), each with its own use cases.
Could you please tell me exactly what a headless service solves that the other three service types cannot?
I have read that headless services are mainly used with stateful applications, such as database pods (for example Cassandra or MongoDB). But my question is: why?
A headless service doesn't provide any sort of proxy or load balancing -- it simply provides a mechanism by which clients can look up the ip address of pods. This means that when they connect to your service, they're connecting directly to the pods; there's no intervening proxy.
Consider a situation in which you have a service that matches three pods; e.g., I have this Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: example
  name: example
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - image: docker.io/traefik/whoami:latest
        name: whoami
        ports:
        - containerPort: 80
          name: http
If I'm using a typical ClusterIP type service, like this:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: example
  name: example
spec:
  ports:
  - name: http
    port: 80
    targetPort: http
  selector:
    app: example
Then when I look up the service in a client pod, I will see the ip address of the service proxy:
/ # host example
example.default.svc.cluster.local has address 10.96.114.63
However, when using a headless service, like this:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: example
  name: example-headless
spec:
  clusterIP: None
  ports:
  - name: http
    port: 80
    targetPort: http
  selector:
    app: example
I will instead see the addresses of the pods:
/ # host example-headless
example-headless.default.svc.cluster.local has address 10.244.0.25
example-headless.default.svc.cluster.local has address 10.244.0.24
example-headless.default.svc.cluster.local has address 10.244.0.23
By removing the proxy from the equation, clients are aware of the actual pod ips, which may be important for some applications. This also simplifies the path between clients and the service, which may have performance benefits.
Kubernetes Services of type ClusterIP, NodePort and LoadBalancer have one thing in common: they load-balance across all pods that match the service's selector, so you reach all of those pods through a single virtual IP.
That's nice because you can have multiple pods of the same application and all requests will be spread between those, avoiding overloading one pod while others are still idle.
Some applications, however, need to talk to individual pods directly instead of going through a virtual IP. You still want a stable hostname that points to the same pod even if the pod's IP changes, e.g. because it was rescheduled onto a different host.
Use cases are mostly databases where applications should always connect to the same instance for data and session consistency.
A headless service does not provide a virtual IP covering all the endpoints in its endpoint slice. If you query that service DNS, you will get all the endpoint IP addresses back. A regular service on the other hand will have sort of a virtual IP, so clients connecting to it are not aware of the underlying endpoints.
To answer your question about why this matters for a StatefulSet: members of a StatefulSet usually need to be aware of each other. Let's say you want to run a distributed database in a cluster formation. The members of this cluster need a way to discover each other. This is where the headless service comes into play, because the individual database pods can get the addresses of the other database pods by querying the headless service.
The headless service also provides a stable network identity for each member of the StatefulSet. You can read about that here. This is, again, useful because the members of the StatefulSet need to coordinate with each other in some way. Let's say pod-0 always takes the initial leader role, and every other member knows it should report to pod-0 to form a cluster-like configuration.
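As a rough illustration of that stable identity, a StatefulSet can point at the `example-headless` Service from the answer above through `serviceName`. The StatefulSet name `example-db` is made up for this sketch, and the image is reused from the Deployment example only as a stand-in for a real database.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-db                 # hypothetical name
spec:
  serviceName: example-headless    # headless Service that governs the per-pod DNS names
  replicas: 3
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example               # must match the headless Service selector
    spec:
      containers:
      - name: whoami
        image: docker.io/traefik/whoami:latest   # placeholder for a database image
        ports:
        - containerPort: 80
          name: http
```

With this in place the pods are named example-db-0, example-db-1 and example-db-2, and each one gets a DNS name of the form example-db-0.example-headless.default.svc.cluster.local (assuming the default namespace and the default cluster domain) that keeps pointing at the same member even when its IP changes.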
Can someone please explain how pod-to-pod communication works in AKS?
From the docs, I can see that it uses the kube-proxy component to send traffic to the desired pod.
But I have been told that I must use a ClusterIP service and bind all the relevant pods together.
So what is the real flow? Or did I miss something? Below are a couple of questions to make this clearer.
Questions:
1) How do pods on the same node talk to each other? What is the flow?
2) How do pods on different nodes in the same cluster talk to each other? What is the flow?
3) If possible, I would highly appreciate a description of the flows for #1 and #2 under both kubenet and CNI deployments.
Thanks a lot!
For pod-to-pod communication we normally use Services, so first we need to understand why.
Why we need a Service: a Service gives a group of pods a stable DNS name and IP, so a client does not have to track the changing IPs of individual pods. Since you want pods to talk to each other inside the cluster, a ClusterIP Service is what you need.
ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType. With a ClusterIP Service you can't access a pod from outside the cluster, which is why we use it when we only want pod-to-pod communication.
kube-proxy is the network proxy that runs on each node in your cluster.
It maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.
In the commonly used iptables mode, kube-proxy programs iptables rules for every Service and keeps them up to date. So yes, kube-proxy is a vital part of the network setup in a k8s cluster.
How the network model works in Kubernetes:
All Pods can communicate with all other Pods without using network address translation (NAT).
All Nodes can communicate with all Pods without NAT.
The IP that a Pod sees itself as is the same IP that others see it as.
With those points in place, there are four kinds of networking to consider:
Container-to-Container networking
Pod-to-Pod networking
Pod-to-Service networking
Internet-to-Service networking
kube-proxy handles traffic addressed to Services, both from other pods and from the outside world. It acts like a network proxy and load balancer for the pods running on the node by implementing load balancing with NAT rules in iptables.
The kube-proxy process stands in between the Kubernetes network and the pods that are running on that particular node. It is responsible for ensuring that communication is maintained efficiently across all elements of the cluster. When a user creates a Kubernetes Service object, the kube-proxy instance is responsible for translating that object into meaningful rules in the local iptables rule set on the worker node. iptables is used to translate the virtual IP assigned to the Service object to all of the pod IPs mapped by the Service.
I hope that clarifies kube-proxy.
Let's see an example of how it works.
Here I used a headless service so that I can connect to a specific pod.
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: default
spec:
  clusterIP: None
  selector:
    app: my-test
  ports:
  - port: 80
    name: rest
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-sts
spec:
  serviceName: my-service
  replicas: 3
  selector:
    matchLabels:
      app: my-test
  template:
    metadata:
      labels:
        app: my-test
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
---
This will create 3 pods: my-sts-0, my-sts-1 and my-sts-2. To connect to the pod my-sts-0, use the DNS name my-sts-0.my-service.default.svc:80. The cluster DNS resolves that name to the exact pod IP of my-sts-0. So if you need to communicate from my-sts-1 to my-sts-0, you can just use this DNS name.
The template is my_pod_name.my_Service_Name.my_Namespace.svc.cluster-domain.example, but you can skip the cluster-domain.example part; the name up to and including my_Namespace.svc works fine from inside the cluster.
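To try that name out, a throwaway client pod along these lines could be used. The pod name and the busybox image tag are assumptions made for this sketch; the DNS name is the one from the example above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-client                 # hypothetical name
  namespace: default
spec:
  restartPolicy: Never
  containers:
  - name: client
    image: busybox:1.36            # assumed image; any image with wget or curl would do
    command:
    - sh
    - -c
    # Fetch a page directly from pod my-sts-0 through its per-pod DNS name.
    - wget -qO- http://my-sts-0.my-service.default.svc:80
```

If the name resolves and nginx is running in my-sts-0, `kubectl logs dns-client` should show the response body once the pod has completed.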
Let's say I define a Service named my-backend in Kubernetes. I would like to intercept every request sent to this service, what is the proper way to do it? For example, another container under the same namespace sends a request through http://my-backend.
I tried to use Admission Controller with a validation Webhook. However, it can intercept the CRUD operations on service resources, but it fails to intercept any connection to a specific service.
There is no direct way to intercept requests to a Service in Kubernetes.
As a workaround, here is what you can do:
Create a sidecar container just to log each incoming request.
Run tcpdump -i eth0 -n in your containers and filter out the requests.
Use Zipkin.
Services created on cloud providers have their own logging mechanisms. For example, a load balancer Service on AWS can have its logs written to S3 (ELB access logs).
You can use a service mesh such as Istio. An Istio service mesh deploys an Envoy proxy sidecar alongside every pod. Envoy intercepts all incoming requests to the pod and can provide metrics such as the number of requests. A service mesh also brings in more features, such as distributed tracing and rate limiting.
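With Istio, sidecar injection is typically switched on per namespace by a label. A minimal sketch, assuming Istio is already installed with its automatic sidecar injector and that my-backend runs in the `default` namespace (the question does not say which namespace is used):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: default                    # assumed namespace of my-backend
  labels:
    istio-injection: enabled       # asks Istio to inject the Envoy sidecar into new pods
```

Pods that were created before the label was set need to be recreated before they pick up the sidecar.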
A Kubernetes NetworkPolicy object can help with this. A network policy controls how groups of pods can communicate with each other and with other network endpoints. You can allow ingress traffic to the my-backend pods based on a pod selector only. Below is an example that allows ingress traffic only from specific pods:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ingress-only-from-frontend-to-my-backend
  namespace: default
spec:
  podSelector:
    matchLabels:
      <my-backend pod label>
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          <Frontend web pod label>
I am a newbie to Kubernetes and am trying to learn Calico networking.
I am following this documentation (https://docs.aws.amazon.com/eks/latest/userguide/calico.html)
and I tried to create a NetworkPolicy to allow traffic to flow from the backend to the client:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  namespace: stars
  name: backend-client
spec:
  podSelector:
    matchLabels:
      role: client
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          role: backend
    ports:
    - protocol: TCP
      port: 9000
I finished all 10 steps in the documentation, and then tried to test by creating the policy above to allow traffic from the backend to the client.
When I applied the policy there was no error, but I don't see any traffic/connection between the two.
Please let me know what is wrong.
Creating a NetworkPolicy alone does not ensure that it is enforced. The cluster must run a network plugin, such as Calico, that integrates with Kubernetes and carries out the operations needed to achieve the intent of the NetworkPolicy.
https://kubernetes.io/docs/concepts/services-networking/network-policies/ says
"Network policies are implemented by the network plugin, so you must be using a networking solution which supports NetworkPolicy - simply creating the resource without a controller to implement it will have no effect."
I believe you need to put your policy in the client namespace instead of the stars namespace. I don't believe there are any pods with role: client in the stars namespace. A pod selector like you've specified only applies to pods in the namespace the policy is in.
While I don't think it is stated as directly as it could be, the Kubernetes Network Policy docs do mention that a NetworkPolicy applies in its own namespace. I suggest you check them out if you haven't already.
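If that is the cause, a corrected policy might look roughly like the sketch below. It assumes the demo layout from the linked guide, where the client pod lives in a namespace named `client` and the backend pods live in `stars` with the label `role: backend`. It also relies on the `kubernetes.io/metadata.name` label that recent Kubernetes versions set on every namespace. Check both assumptions against your cluster before applying it.

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  namespace: client                # assumed: the namespace that holds the role: client pods
  name: backend-client
spec:
  podSelector:
    matchLabels:
      role: client
  ingress:
  - from:
    # namespaceSelector and podSelector in the same entry: both must match.
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: stars
      podSelector:
        matchLabels:
          role: backend
    ports:
    - protocol: TCP
      port: 9000
```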
I hope that helps.
I would like to deploy an application cluster by managing my deployment via k8s Deployment object. The documentation has me extremely confused. My basic layout has the following components that scale independently:
API server
UI server
Redis cache
Timer/Scheduled task server
Technically, all 4 above belong in separate pods that are scaled independently.
My questions are:
1) Do I need to create pod.yml files and then somehow reference them in a deployment.yml file, or can a deployment file also embed pod definitions?
2) The K8s documentation seems to imply that the spec portion of a Deployment is equivalent to defining one pod. Is that correct? What if I want to declaratively describe multi-pod deployments? Do I need multiple deployment.yml files?
Pagid's answer has most of the basics. You should create 4 Deployments for your scenario. Each Deployment will create a ReplicaSet that schedules and supervises the collection of pods for the Deployment.
Each Deployment will most likely also require a Service in front of it for access. I usually create a single yaml file that has a Deployment and the corresponding Service in it. Here is an example for an nginx.yaml that I use:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    name: nginx
    targetPort: 80
    nodePort: 32756
  selector:
    app: nginx
---
# apps/v1 replaces extensions/v1beta1, which was removed in Kubernetes 1.16
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginxdeployment
spec:
  replicas: 3
  # apps/v1 requires an explicit selector that matches the template labels
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginxcontainer
        image: nginx:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 80
Here is some additional information for clarification:
A pod is not a scalable unit. A Deployment that schedules pods is.
A Deployment is meant to represent a single group of pods fulfilling a single purpose together.
You can have many Deployments work together in the virtual network of the cluster.
To access a Deployment that may consist of many pods running on different nodes, you have to create a Service.
Deployments are meant to contain stateless services. If you need to store state, you need to create a StatefulSet instead (e.g. for a database service).
You can use the Kubernetes API reference for the Deployment, where you'll find that the spec->template field is of type PodTemplateSpec, along with the related comment (Template describes the pods that will be created.), which answers your questions. A longer description can of course be found in the Deployment user guide.
To answer your questions...
1) The Pods are managed by the Deployment, and defining them separately doesn't make sense because they are created on demand by the Deployment. Keep in mind that there might be several replicas of the same pod type.
2) For each of the applications in your list, you'd have to define one Deployment, which also makes sense when it comes to different replica counts and application rollouts.
3) You haven't asked this, but it's related: along with separate Deployments, each of your applications will also need a dedicated Service so the others can access it.
Additional information:
API server: use a Deployment
UI server: use a Deployment
Redis cache: use a StatefulSet
Timer/Scheduled task server: possibly a StatefulSet (if your service keeps some state)
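For the Redis case, a StatefulSet sketch could look like this. All names, the image tag and the storage size are placeholders chosen for illustration, and it assumes the cluster has a default StorageClass that can satisfy the volume claim.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis                      # hypothetical headless Service for the StatefulSet
spec:
  clusterIP: None
  selector:
    app: redis
  ports:
  - port: 6379
    name: redis
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7             # assumed tag
        ports:
        - containerPort: 6379
          name: redis
        volumeMounts:
        - name: data
          mountPath: /data         # directory where the official image keeps its data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi             # placeholder size
```

The volume claim template gives each replica its own persistent volume that survives pod rescheduling, which is the part a plain Deployment does not provide.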