Subnetting within Kubernetes Cluster - kubernetes

I have couple of deployments - say Deployment A and Deployment B. The K8s Subnet is 10.0.0.0/20.
My requirement : Is it possible to get all pods in Deployment A to get IP from 10.0.1.0/24 and pods in Deployment B from 10.0.2.0/24.
This helps the networking clean and with help of IP itself a particular deployment can be identified.

Deployment in Kubernetes is a high-level abstraction that rely on controllers to build basic objects. That is different than object itself such as pod or service.
If you take a look into deployments spec in Kubernetes API Overview, you will notice that there is no such a thing as defining subnets, neither IP addresses that would be specific for deployment so you cannot specify subnets for deployments.
Kubernetes idea is that pod is ephemeral. You should not try to identify resources by IP addresses as IPs are randomly assigned. If the pod dies it will have another IP address. You could try to look on something like statefulsets if you are after unique stable network identifiers.
While Kubernetes does not support this feature I found workaround for this using Calico: Migrate pools feature.
First you need to have calicoctl installed. There are several ways to do that mentioned in the install calicoctl docs.
I choose to install calicoctl as a Kubernetes pod:
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
To make work faster you can setup an alias :
alias calicoctl="kubectl exec -i -n kube-system calicoctl /calicoctl -- "
I have created two yaml files to setup ip pools:
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
name: pool1
spec:
cidr: 10.0.0.0/24
ipipMode: Always
natOutgoing: true
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
name: pool2
spec:
cidr: 10.0.1.0/24
ipipMode: Always
natOutgoing: true
Then you you have apply the following configuration but since my yaml were being placed in my host filesystem and not in calico pod itself I placed the yaml as an input to the command:
➜ cat ippool1.yaml | calicoctl apply -f-
Successfully applied 1 'IPPool' resource(s)
➜ cat ippool2.yaml | calicoctl apply -f-
Successfully applied 1 'IPPool' resource(s)
Listing the ippools you will notice the new added ones:
➜ calicoctl get ippool -o wide
NAME CIDR NAT IPIPMODE VXLANMODE DISABLED SELECTOR
default-ipv4-ippool 192.168.0.0/16 true Always Never false all()
pool1 10.0.0.0/24 true Always Never false all()
pool2 10.0.1.0/24 true Always Never false all()
Then you can specify what pool you want to choose for you deployment:
---
metadata:
labels:
app: nginx
name: deployment1-pool1
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
annotations:
cni.projectcalico.org/ipv4pools: "[\"pool1\"]"
---
I have created similar one called deployment2 that used ippool2 with the results below:
deployment1-pool1-6d9ddcb64f-7tkzs 1/1 Running 0 71m 10.0.0.198 acid-fuji
deployment1-pool1-6d9ddcb64f-vkmht 1/1 Running 0 71m 10.0.0.199 acid-fuji
deployment2-pool2-79566c4566-ck8lb 1/1 Running 0 69m 10.0.1.195 acid-fuji
deployment2-pool2-79566c4566-jjbsd 1/1 Running 0 69m 10.0.1.196 acid-fuji
Also its worth mentioning that while testing this I found out that if your default deployment will have many replicas and will ran out of ips Calico will then use different pool.

Related

how to restrict a pod to connect only to 2 pods using networkpolicy and test connection in k8s in simple way?

Do I still need to expose pod via clusterip service?
There are 3 pods - main, front, api. I need to allow ingress+egress connection to main pod only from the pods- api and frontend. I also created service-main - service that exposes main pod on port:80.
I don't know how to test it, tried:
k exec main -it -- sh
netcan -z -v -w 5 service-main 80
and
k exec main -it -- sh
curl front:80
The main.yaml pod:
apiVersion: v1
kind: Pod
metadata:
labels:
app: main
item: c18
name: main
spec:
containers:
- image: busybox
name: main
command:
- /bin/sh
- -c
- sleep 1d
The front.yaml:
apiVersion: v1
kind: Pod
metadata:
labels:
app: front
name: front
spec:
containers:
- image: busybox
name: front
command:
- /bin/sh
- -c
- sleep 1d
The api.yaml
apiVersion: v1
kind: Pod
metadata:
labels:
app: api
name: api
spec:
containers:
- image: busybox
name: api
command:
- /bin/sh
- -c
- sleep 1d
The main-to-front-networkpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: front-end-policy
spec:
podSelector:
matchLabels:
app: main
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector:
matchLabels:
app: front
ports:
- port: 8080
egress:
- to:
- podSelector:
matchLabels:
app: front
ports:
- port: 8080
What am I doing wrong? Do I still need to expose main pod via service? But should not network policy take care of this already?
Also, do I need to write containerPort:80 in main pod? How to test connectivity and ensure ingress-egress works only for main pod to api, front pods?
I tried the lab from ckad prep course, it had 2 pods: secure-pod and web-pod. There was issue with connectivity, the solution was to create network policy and test using netcat from inside the web-pod's container:
k exec web-pod -it -- sh
nc -z -v -w 1 secure-service 80
connection open
UPDATE: ideally I want answers to these:
a clear explanation of the diff btw service and networkpolicy.
If both service and netpol exist - what is the order of evaluation that the traffic/request goes thru? It first goes thru netpol then service? Or vice versa?
if I want front and api pods to send/receive traffic to main - do I need separate services exposing front and api pods?
Network policies and services are two different and independent Kubernetes resources.
Service is:
An abstract way to expose an application running on a set of Pods as a network service.
Good explanation from the Kubernetes docs:
Kubernetes Pods are created and destroyed to match the state of your cluster. Pods are nonpermanent resources. If you use a Deployment to run your app, it can create and destroy Pods dynamically.
Each Pod gets its own IP address, however in a Deployment, the set of Pods running in one moment in time could be different from the set of Pods running that application a moment later.
This leads to a problem: if some set of Pods (call them "backends") provides functionality to other Pods (call them "frontends") inside your cluster, how do the frontends find out and keep track of which IP address to connect to, so that the frontend can use the backend part of the workload?
Enter Services.
Also another good explanation in this answer.
For production you should use a workload resources instead of creating pods directly:
Pods are generally not created directly and are created using workload resources. See Working with Pods for more information on how Pods are used with workload resources.
Here are some examples of workload resources that manage one or more Pods:
Deployment
StatefulSet
DaemonSet
And use services to make requests to your application.
Network policies are used to control traffic flow:
If you want to control traffic flow at the IP address or port level (OSI layer 3 or 4), then you might consider using Kubernetes NetworkPolicies for particular applications in your cluster.
Network policies target pods, not services (an abstraction). Check this answer and this one.
Regarding your examples - your network policy is correct (as I tested it below). The problem is that your cluster may not be compatible:
For Network Policies to take effect, your cluster needs to run a network plugin which also enforces them. Project Calico or Cilium are plugins that do so. This is not the default when creating a cluster!
Test on kubeadm cluster with Calico plugin -> I created similar pods as you did, but I changed container part:
spec:
containers:
- name: main
image: nginx
command: ["/bin/sh","-c"]
args: ["sed -i 's/listen .*/listen 8080;/g' /etc/nginx/conf.d/default.conf && exec nginx -g 'daemon off;'"]
ports:
- containerPort: 8080
So NGINX app is available at the 8080 port.
Let's check pods IP:
user#shell:~$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
api 1/1 Running 0 48m 192.168.156.61 example-ubuntu-kubeadm-template-2 <none> <none>
front 1/1 Running 0 48m 192.168.156.56 example-ubuntu-kubeadm-template-2 <none> <none>
main 1/1 Running 0 48m 192.168.156.52 example-ubuntu-kubeadm-template-2 <none> <none>
Let's exec into running main pod and try to make request to the front pod:
root#main:/# curl 192.168.156.61:8080
<!DOCTYPE html>
...
<title>Welcome to nginx!</title>
It is working.
After applying your network policy:
user#shell:~$ kubectl apply -f main-to-front.yaml
networkpolicy.networking.k8s.io/front-end-policy created
user#shell:~$ kubectl exec -it main -- bash
root#main:/# curl 192.168.156.61:8080
...
Not working anymore, so it means that network policy is applied successfully.
Nice option to get more information about applied network policy is to run kubectl describe command:
user#shell:~$ kubectl describe networkpolicy front-end-policy
Name: front-end-policy
Namespace: default
Created on: 2022-01-26 15:17:58 +0000 UTC
Labels: <none>
Annotations: <none>
Spec:
PodSelector: app=main
Allowing ingress traffic:
To Port: 8080/TCP
From:
PodSelector: app=front
Allowing egress traffic:
To Port: 8080/TCP
To:
PodSelector: app=front
Policy Types: Ingress, Egress

Kubernetes resource quota, have non schedulable pod staying in pending state

So I wish to limit resources used by pod running for each of my namespace, and therefor want to use resource quota.
I am following this tutorial.
It works well, but I wish something a little different.
When trying to schedule a pod which will go over the limit of my quota, I am getting a 403 error.
What I wish is the request to be scheduled, but waiting in a pending state until one of the other pod end and free some resources.
Any advice?
Instead of using straight pod definitions (kind: Pod) use deployment.
Why?
Pods in Kubernetes are designed as relatively ephemeral, disposable entities:
You'll rarely create individual Pods directly in Kubernetes—even singleton Pods. This is because Pods are designed as relatively ephemeral, disposable entities. When a Pod gets created (directly by you, or indirectly by a controller), the new Pod is scheduled to run on a Node in your cluster. The Pod remains on that node until the Pod finishes execution, the Pod object is deleted, the Pod is evicted for lack of resources, or the node fails.
Kubernetes assumes that for managing pods you should a workload resources instead of creating pods directly:
Pods are generally not created directly and are created using workload resources. See Working with Pods for more information on how Pods are used with workload resources.
Here are some examples of workload resources that manage one or more Pods:
Deployment
StatefulSet
DaemonSet
By using deployment you will get very similar behaviour to the one you want.
Example below:
Let's suppose that I created pod quota for a custom namespace, set to "2" as in this example and I have two pods running in this namespace:
kubectl get pods -n quota-demo
NAME READY STATUS RESTARTS AGE
quota-demo-1 1/1 Running 0 75s
quota-demo-2 1/1 Running 0 6s
Third pod definition:
apiVersion: v1
kind: Pod
metadata:
name: quota-demo-3
spec:
containers:
- name: quota-demo-3
image: nginx
ports:
- containerPort: 80
Now I will try to apply this third pod in this namespace:
kubectl apply -f pod.yaml -n quota-demo
Error from server (Forbidden): error when creating "pod.yaml": pods "quota-demo-3" is forbidden: exceeded quota: pod-demo, requested: pods=1, used: pods=2, limited: pods=2
Not working as expected.
Now I will change pod definition into deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
name: quota-demo-3-deployment
labels:
app: quota-demo-3
spec:
selector:
matchLabels:
app: quota-demo-3
template:
metadata:
labels:
app: quota-demo-3
spec:
containers:
- name: quota-demo-3
image: nginx
ports:
- containerPort: 80
I will apply this deployment:
kubectl apply -f deployment-v3.yaml -n quota-demo
deployment.apps/quota-demo-3-deployment created
Deployment is created successfully, but there is no new pod, Let's check this deployment:
kubectl get deploy -n quota-demo
NAME READY UP-TO-DATE AVAILABLE AGE
quota-demo-3-deployment 0/1 0 0 12s
We can see that a pod quota is working, deployment is monitoring resources and waiting for the possibility to create a new pod.
Let's now delete one of the pod and check deployment again:
kubectl delete pod quota-demo-2 -n quota-demo
pod "quota-demo-2" deleted
kubectl get deploy -n quota-demo
NAME READY UP-TO-DATE AVAILABLE AGE
quota-demo-3-deployment 1/1 1 1 2m50s
The pod from the deployment is created automatically after deletion of the pod:
kubectl get pods -n quota-demo
NAME READY STATUS RESTARTS AGE
quota-demo-1 1/1 Running 0 5m51s
quota-demo-3-deployment-7fd6ddcb69-nfmdj 1/1 Running 0 29s
It works the same way for memory and CPU quotas for namespace - when the resources are free, deployment will automatically create new pods.

Kubernetes Cronjob labeling

As I have seen few related posts but none answered my question, I thought I would ask a new question based on suggestions from other users as well here.
I have the need to make a selector label for a network policy for a running cronjob that is responsible to connect to some other services within the cluster, as far as I know there is no easy straight forward way to make a selector label for the jobs pod as that would be problematic with duplicate job labels if they ever existed. Not sure why the cronjob can't have a selector itself, and then can be applied to the job and the pod.
also there might be a possibility to just set this cronjob in its own namespace and then allow all from that one namespace to whatever needed in the network policy but does not feel like the right way to overcome that problem.
Using k8s v1.20
First of all, to select pods (spawned by your CronJob) that should be allowed by the NetworkPolicy as ingress sources or egress destinations, you may set specific label for those pods.
You can easily set a label for Jobs spawned by CronJob using labels field (another example with an explanation can be found in the OpenShift CronJobs documentation):
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: mysql-test
spec:
...
jobTemplate:
spec:
template:
metadata:
labels:
workload: cronjob # Sets a label for jobs spawned by this CronJob.
type: mysql # Sets another label for jobs spawned by this CronJob.
...
Pods spawned by this CronJob will have the labels type=mysql and workload=cronjob, using this labels you can create/customize your NetworkPolicy:
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
mysql-test-1615216560-tkdvk 0/1 Completed 0 2m2s ...,type=mysql,workload=cronjob
mysql-test-1615216620-pqzbk 0/1 Completed 0 62s ...,type=mysql,workload=cronjob
mysql-test-1615216680-8775h 0/1 Completed 0 2s ...,type=mysql,workload=cronjob
$ kubectl describe pod mysql-test-1615216560-tkdvk
Name: mysql-test-1615216560-tkdvk
Namespace: default
...
Labels: controller-uid=af99e9a3-be6b-403d-ab57-38de31ac7a9d
job-name=mysql-test-1615216560
type=mysql
workload=cronjob
...
For example this mysql-workload NetworkPolicy allows connections to all pods in the mysql namespace from any pod with the labels type=mysql and workload=cronjob (logical conjunction) in a namespace with the label namespace-name=default :
NOTE: Be careful to use correct YAML (take a look at this namespaceSelector and podSelector example).
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: mysql-workload
namespace: mysql
spec:
podSelector: {}
policyTypes:
- Ingress
ingress:
- from:
- namespaceSelector:
matchLabels:
namespace-name: default
podSelector:
matchLabels:
type: mysql
workload: cronjob
To use network policies, you must be using a networking solution which supports NetworkPolicy:
Network policies are implemented by the network plugin. To use network policies, you must be using a networking solution which supports NetworkPolicy. Creating a NetworkPolicy resource without a controller that implements it will have no effect.
You can learn more about creating Kubernetes NetworkPolicies in the Network Policies documentation.

Helm 3 Deployment Order of Kubernetes Service Catalog Resources

I am using Helm v3.3.0, with a Kubernetes 1.16.
The cluster has the Kubernetes Service Catalog installed, so external services implementing the Open Service Broker API spec can be instantiated as K8S resources - as ServiceInstances and ServiceBindings.
ServiceBindings reflect as K8S Secrets and contain the binding information of the created external service. These secrets are usually mapped into the Docker containers as environment variables or volumes in a K8S Deployment.
Now I am using Helm to deploy my Kubernetes resources, and I read here that...
The [Helm] install order of Kubernetes types is given by the enumeration InstallOrder in kind_sorter.go
In that file, the order does neither mention ServiceInstance nor ServiceBinding as resources, and that would mean that Helm installs these resource types after it has installed any of its InstallOrder list - in particular Deployments. That seems to match the output of helm install --dry-run --debug run on my chart, where the order indicates that the K8S Service Catalog resources are applied last.
Question: What I cannot understand is, why my Deployment does not fail to install with Helm.
After all my Deployment resource seems to be deployed before the ServiceBinding is. And it is the Secret generated out of the ServiceBinding that my Deployment references. I would expect it to fail, since the Secret is not there yet, when the Deployment is getting installed. But that is not the case.
Is that just a timing glitch / lucky coincidence, or is this something I can rely on, and why?
Thanks!
As said in the comment I posted:
In fact your Deployment is failing at the start with Status: CreateContainerConfigError. Your Deployment is created before Secret from the ServiceBinding. It's only working as it was recreated when the Secret from ServiceBinding was available.
I wanted to give more insight with example of why the Deployment didn't fail.
What is happening (simplified in order):
Deployment -> created and spawned a Pod
Pod -> failing pod with status: CreateContainerConfigError by lack of Secret
ServiceBinding -> created Secret in a background
Pod gets the required Secret and starts
Previously mentioned InstallOrder will leave ServiceInstace and ServiceBinding as last by comment on line 147.
Example
Assuming that:
There is a working Kubernetes cluster
Helm3 installed and ready to use
Following guides:
Kubernetes.io: Instal Service Catalog using Helm
Magalix.com: Blog: Kubernetes Service Catalog
There is a Helm chart with following files in templates/ directory:
ServiceInstance
ServiceBinding
Deployment
Files:
ServiceInstance.yaml:
apiVersion: servicecatalog.k8s.io/v1beta1
kind: ServiceInstance
metadata:
name: example-instance
spec:
clusterServiceClassExternalName: redis
clusterServicePlanExternalName: 5-0-4
ServiceBinding.yaml:
apiVersion: servicecatalog.k8s.io/v1beta1
kind: ServiceBinding
metadata:
name: example-binding
spec:
instanceRef:
name: example-instance
Deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: ubuntu
spec:
selector:
matchLabels:
app: ubuntu
replicas: 1
template:
metadata:
labels:
app: ubuntu
spec:
containers:
- name: ubuntu
image: ubuntu
command:
- sleep
- "infinity"
# part below responsible for getting secret as env variable
env:
- name: DATA
valueFrom:
secretKeyRef:
name: example-binding
key: host
Applying above resources to check what is happening can be done in 2 ways:
First method is to use timestamp from $ kubectl get RESOURCE -o yaml
Second method is to use $ kubectl get RESOURCE --watch-only=true
First method
As said previously the Pod from the Deployment couldn't start as Secret was not available when the Pod tried to spawn. After the Secret was available to use, the Pod started.
The statuses this Pod had were the following:
Pending
ContainerCreating
CreateContainerConfigError
Running
This is a table with timestamps of Pod and Secret:
| Pod | Secret |
|-------------------------------------------|-------------------------------------------|
| creationTimestamp: "2020-08-23T19:54:47Z" | - |
| - | creationTimestamp: "2020-08-23T19:54:55Z" |
| startedAt: "2020-08-23T19:55:08Z" | - |
You can get this timestamp by invoking below commands:
$ kubectl get pod pod_name -n namespace -o yaml
$ kubectl get secret secret_name -n namespace -o yaml
You can also get get additional information with:
$ kubectl get event -n namespace
$ kubectl describe pod pod_name -n namespace
Second method
This method requires preparation before running Helm chart. Open another terminal window (for this particular case 2) and run:
$ kubectl get pod -n namespace --watch-only | while read line ; do echo -e "$(gdate +"%H:%M:%S:%N")\t $line" ; done
$ kubectl get secret -n namespace --watch-only | while read line ; do echo -e "$(gdate +"%H:%M:%S:%N")\t $line" ; done
After that apply your Helm chart.
Disclaimer!
Above commands will watch for changes in resources and display them with a timestamp from the OS. Please remember that this command is only for example purposes.
The output for Pod:
21:54:47:534823000 NAME READY STATUS RESTARTS AGE
21:54:47:542107000 ubuntu-65976bb789-l48wz 0/1 Pending 0 0s
21:54:47:553799000 ubuntu-65976bb789-l48wz 0/1 Pending 0 0s
21:54:47:655593000 ubuntu-65976bb789-l48wz 0/1 ContainerCreating 0 0s
-> 21:54:52:001347000 ubuntu-65976bb789-l48wz 0/1 CreateContainerConfigError 0 4s
21:55:09:205265000 ubuntu-65976bb789-l48wz 1/1 Running 0 22s
The output for Secret:
21:54:47:385714000 NAME TYPE DATA AGE
21:54:47:393145000 sh.helm.release.v1.example.v1 helm.sh/release.v1 1 0s
21:54:47:719864000 sh.helm.release.v1.example.v1 helm.sh/release.v1 1 0s
21:54:51:182609000 understood-squid-redis Opaque 1 0s
21:54:52:001031000 understood-squid-redis Opaque 1 0s
-> 21:54:55:686461000 example-binding Opaque 6 0s
Additional resources:
Stackoverflow.com: Answer: Helm install in certain order
Alibabacloud.com: Helm charts and templates hooks and tests part 3
So to answer my own question (and thanks to #dawid-kruk and the folks on Service Catalog Sig on Slack):
In fact, the initial start of my Pods (the ones referencing the Secret created out of the ServiceBinding) fails! It fails because the Secret is actually not there the moment K8S tries to start the pods.
Kubernetes has a self-healing mechanism, in the sense that it tries (and retries) to reach the target state of the cluster as described by the various deployed resources.
By Kubernetes retrying to get the pods running, eventually (when the Secret is finally there) all conditions will be satisfied to make the pods start up nicely. Therefore, eventually, evth. is running as it should.
How could this be streamlined? One possibility would be for Helm to include the custom resources ServiceBinding and ServiceInstance into its ordered list of installable resources and install them early in the installation phase.
But even without that, Kubernetes actually deals with it just fine. The order of installation (in this case) really does not matter. And that is a good thing!

How to block traffic to db pod & service (DNS) from other namespaces in kubernates?

I have created 2 tenants(tenant1,tenant2) in 2 namespaces tenant1-namespace,tenant2-namespace
Each tenant has db pod and its services
How to isolate db pods/service i.e. how to restrict pod/service from his namespace to access other tenants db pods ?
I have used service account for each tenant and applied network policies so that namespaces are isolated.
kubectl get svc --all-namespaces
tenant1-namespace grafana-app LoadBalancer 10.64.7.233 104.x.x.x 3000:31271/TCP 92m
tenant1-namespace postgres-app NodePort 10.64.2.80 <none> 5432:31679/TCP 92m
tenant2-namespace grafana-app LoadBalancer 10.64.14.38 35.x.x.x 3000:32226/TCP 92m
tenant2-namespace postgres-app NodePort 10.64.2.143 <none> 5432:31912/TCP 92m
So
I want to restrict grafana-app to use only his postgres db in his namespace only, not in other namespace.
But problem is that using DNS qualified service name (app-name.namespace-name.svc.cluster.local)
its allowing to access each other db pods (grafana-app in namespace tenant1-namespace can have access to postgres db in other tenant2-namespace via postgres-app.tenant2-namespace.svc.cluster.local
Updates : network policies
1)
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: deny-from-other-namespaces
spec:
podSelector:
matchLabels:
ingress:
- from:
- podSelector: {}
2)
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: web-allow-external
spec:
podSelector:
matchLabels:
app: grafana-app
ingress:
- from: []
Your NetworkPolicy objects are correct, I created an example with them and will demonstrate bellow.
If you still have access to the service on the other namespace using FQDN, your NetworkPolicy may not be fully enabled on your cluster.
Run gcloud container clusters describe "CLUSTER_NAME" --zone "ZONE" and look for these two snippets:
At the beggining of the description it shows if the NetworkPolicy Plugin is enabled at Master level, it should be like this:
addonsConfig:
networkPolicyConfig: {}
At the middle of the description, you can find if the NetworkPolicy is enabled on the nodes. It should look like this:
name: cluster-1
network: default
networkConfig:
network: projects/myproject/global/networks/default
subnetwork: projects/myproject/regions/us-central1/subnetworks/default
networkPolicy:
enabled: true
provider: CALICO
If any of the above is different, check here: How to Enable Network Policy in GKE
Reproduction:
I'll create a simple example, I'll use gcr.io/google-samples/hello-app:1.0 image for tenant1 and gcr.io/google-samples/hello-app:2.0 for tenant2, so it's simplier to see where it's connecting but i'll use the names of your environment:
$ kubectl create namespace tenant1
namespace/tenant1 created
$ kubectl create namespace tenant2
namespace/tenant2 created
$ kubectl run -n tenant1 grafana-app --generator=run-pod/v1 --image=gcr.io/google-samples/hello-app:1.0
pod/grafana-app created
$ kubectl run -n tenant1 postgres-app --generator=run-pod/v1 --image=gcr.io/google-samples/hello-app:1.0
pod/postgres-app created
$ kubectl run -n tenant2 grafana-app --generator=run-pod/v1 --image=gcr.io/google-samples/hello-app:2.0
pod/grafana-app created
$ kubectl run -n tenant2 postgres-app --generator=run-pod/v1 --image=gcr.io/google-samples/hello-app:2.0
pod/postgres-app created
$ kubectl expose pod -n tenant1 grafana-app --port=8080 --type=LoadBalancer
service/grafana-app exposed
$ kubectl expose pod -n tenant1 postgres-app --port=8080 --type=NodePort
service/postgres-app exposed
$ kubectl expose pod -n tenant2 grafana-app --port=8080 --type=LoadBalancer
service/grafana-app exposed
$ kubectl expose pod -n tenant2 postgres-app --port=8080 --type=NodePort
service/postgres-app exposed
$ kubectl get all -o wide -n tenant1
NAME READY STATUS RESTARTS AGE IP NODE
pod/grafana-app 1/1 Running 0 100m 10.48.2.4 gke-cluster-114-default-pool-e5df7e35-ez7s
pod/postgres-app 1/1 Running 0 100m 10.48.0.6 gke-cluster-114-default-pool-e5df7e35-c68o
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/grafana-app LoadBalancer 10.1.23.39 34.72.118.149 8080:31604/TCP 77m run=grafana-app
service/postgres-app NodePort 10.1.20.92 <none> 8080:31033/TCP 77m run=postgres-app
$ kubectl get all -o wide -n tenant2
NAME READY STATUS RESTARTS AGE IP NODE
pod/grafana-app 1/1 Running 0 76m 10.48.4.8 gke-cluster-114-default-pool-e5df7e35-ol8n
pod/postgres-app 1/1 Running 0 100m 10.48.4.5 gke-cluster-114-default-pool-e5df7e35-ol8n
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/grafana-app LoadBalancer 10.1.17.50 104.154.135.69 8080:30534/TCP 76m run=grafana-app
service/postgres-app NodePort 10.1.29.215 <none> 8080:31667/TCP 77m run=postgres-app
Now, let's deploy your two rules: The first blocking all traffic from outside the namespace, the second allowing ingress the grafana-app from outside of the namespace:
$ cat default-deny-other-ns.yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: deny-from-other-namespaces
spec:
podSelector:
matchLabels:
ingress:
- from:
- podSelector: {}
$ cat allow-grafana-ingress.yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: web-allow-external
spec:
podSelector:
matchLabels:
run: grafana-app
ingress:
- from: []
Let's review the rules for Network Policy Isolation:
By default, pods are non-isolated; they accept traffic from any source.
Pods become isolated by having a NetworkPolicy that selects them. Once there is any NetworkPolicy in a namespace selecting a particular pod, that pod will reject any connections that are not allowed by any NetworkPolicy. (Other pods in the namespace that are not selected by any NetworkPolicy will continue to accept all traffic.)
Network policies do not conflict; they are additive. If any policy or policies select a pod, the pod is restricted to what is allowed by the union of those policies' ingress/egress rules. Thus, order of evaluation does not affect the policy result.
Then we will apply the rules on both namespaces because the scope of the rule is the namespace it's assigned to:
$ kubectl apply -n tenant1 -f default-deny-other-ns.yaml
networkpolicy.networking.k8s.io/deny-from-other-namespaces created
$ kubectl apply -n tenant2 -f default-deny-other-ns.yaml
networkpolicy.networking.k8s.io/deny-from-other-namespaces created
$ kubectl apply -n tenant1 -f allow-grafana-ingress.yaml
networkpolicy.networking.k8s.io/web-allow-external created
$ kubectl apply -n tenant2 -f allow-grafana-ingress.yaml
networkpolicy.networking.k8s.io/web-allow-external created
Now for final testing, I'll log inside grafana-app in tenant1 and try to reach the postgres-app in both namespaces and check the output:
$ kubectl exec -n tenant1 -it grafana-app -- /bin/sh
/ ### POSTGRES SAME NAMESPACE ###
/ # wget -O- postgres-app:8080
Connecting to postgres-app:8080 (10.1.20.92:8080)
Hello, world!
Version: 1.0.0
Hostname: postgres-app
/ ### GRAFANA OTHER NAMESPACE ###
/ # wget -O- --timeout=1 http://grafana-app.tenant2.svc.cluster.local:8080
Connecting to grafana-app.tenant2.svc.cluster.local:8080 (10.1.17.50:8080)
Hello, world!
Version: 2.0.0
Hostname: grafana-app
/ ### POSTGRES OTHER NAMESPACE ###
/ # wget -O- --timeout=1 http://postgres-app.tenant2.svc.cluster.local:8080
Connecting to postgres-app.tenant2.svc.cluster.local:8080 (10.1.29.215:8080)
wget: download timed out
You can see that the DNS is resolved, but the networkpolicy blocks the access to the backend pods.
If after double checking NetworkPolicy is enabled on Master and Nodes you still face the same issue let me know in the comments and we can dig further.