Network Policy in Kubernetes under the hood

Network Policy in Kubernetes under the hood - kubernetes

I have network policy created and implemented as per https://github.com/ahmetb/kubernetes-network-policy-recipes, and its working fidn , however I would like to understand how exactly this gets implemeneted in the back end , how does network policy allow or deny traffic , by modifying the iptables ? which kubernetes componenets are involved in implementing this ?

"It depends". It's up to whatever controller actually does the setup, which is usually (but not always) part of your CNI plugin.
The most common implementation is Calico's Felix daemon, which supports several backends, but iptables is a common one. Other plugins use eBPF network programs or other firewall subsystems to similar effect.

Network Policy is implemented by network plugins (calico for example) most commonly by setting up Linux Iptables Netfilter rules on the Kubernetes nodes.
From the docs here
In the Calico approach, IP packets to or from a workload are routed and firewalled by the Linux routing table and iptables infrastructure on the workload’s host. For a workload that is sending packets, Calico ensures that the host is always returned as the next hop MAC address regardless of whatever routing the workload itself might configure. For packets addressed to a workload, the last IP hop is that from the destination workload’s host to the workload itself

Related

Kubernetes services : request Assignment algorithm

What is the logic algorithm that a kubernets service uses to assign requests to pods that it exposes? Can this algorithm be customized?
Thanks.

kube-proxy in userspace mode chooses a backend via a round-robin algorithm.
kube-proxy in iptables mode chooses a backend at random.
IPVS provides more options for balancing traffic to backend Pods; these are:rr: round-robin,lc: least connection (smallest number of open connections),dh: destination hashing,sh: source hashing,sed: shortest expected delay, nq: never queue
As mentioned here:- Service
For application level routing you would need to use a service mesh like istio ,envoy, kong.

You can use a component kube-proxy. What is it?
kube-proxy is a network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept.
kube-proxy maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.
kube-proxy uses the operating system packet filtering layer if there is one and it's available. Otherwise, kube-proxy forwards the traffic itself.
But why use a proxy when there is a round-robin DNS algorithm? There are a few reasons for using proxying for Services:
There is a long history of DNS implementations not respecting record TTLs, and caching the results of name lookups after they should have expired.
Some apps do DNS lookups only once and cache the results indefinitely.
Even if apps and libraries did proper re-resolution, the low or zero TTLs on the DNS records could impose a high load on DNS that then becomes difficult to manage.
kube-proxy has many modes:
User space proxy mode - In the userspace mode, the iptables rule forwards to a local port where a go binary (kube-proxy) is listening for connections. The binary (running in userspace) terminates the connection, establishes a new connection to a backend for the service, and then forwards requests to the backend and responses back to the local process. An advantage of the userspace mode is that because the connections are created from an application, if the connection is refused, the application can retry to a different backend
Iptables proxy mode - In iptables mode, the iptables rules are installed to directly forward packets that are destined for a service to a backend for the service. This is more efficient than moving the packets from the kernel to kube-proxy and then back to the kernel so it results in higher throughput and better tail latency. The main downside is that it is more difficult to debug, because instead of a local binary that writes a log to /var/log/kube-proxy you have to inspect logs from the kernel processing iptables rules.
IPVS proxy mode - IPVS is a Linux kernel feature that is specifically designed for load balancing. In IPVS mode, kube-proxy programs the IPVS load balancer instead of using iptables. This works, it also uses a mature kernel feature and IPVS is designed for load balancing lots of services; it has an optimized API and an optimized look-up routine rather than a list of sequential rules.
You can read more here - good question about proxy mode on StackOverflow, here - comparing proxy modes and here - good article about proxy modes.
Like rohatgisanat mentioned in his answer you can also use service mesh. Here is also good article about Kubernetes service mesh comparsion.

How to assign a single static source IP address for all pods of a service or deployment in kubernetes?

Consider a microservice X which is containerized and deployed in a kubernetes cluster. X communicates with a Payment Gateway PG. However, the payment gateway requires a static IP for services contacting it as it maintains a whitelist of IP addresses which are authorized to access the payment gateway. One way for X to contact PG is through a third party proxy server like QuotaGuard which will provide a static IP address to service X which can be whitelisted by the Payment Gateway.
However, is there an inbuilt mechanism in kubernetes which can enable a service deployed in a kube-cluster to obtain a static IP address?

there's no mechanism in Kubernetes for this yet.
other possible solutions:
if nodes of the cluster are in a private network behind a NAT then just add your network's default gateway to the PG's whitelist.
if whitelist can accept a cidr apart from single IPs (like 86.34.0.0/24 for example) then add your cluster's network cidr to the whitelist
If every node of the cluster has a public IP and you can't add a cidr to the whitelist then it gets more complicated:
a naive way would be to add ever node's IP to the whitelist, but it doesn't scale above tiny clusters few just few nodes.
if you have access to administrating your network, then even though nodes have pubic IPs, you can setup a NAT for the network anyway that targets only packets with PG's IP as a destination.
if you don't have administrative access to the network, then another way is to allocate a machine with a static IP somewhere and make it act as a proxy using iptables NAT similarly like above again. This introduces a single point of failure though. In order to make it highly available, you could deploy it on a kubernetes cluster again with few (2-3) replicas (this can be the same cluster where X is running: see below). The replicas instead of using their node's IP to communicate with PG would share a VIP using keepalived that would be added to PG's whitelist. (you can have a look at easy-keepalived and either try to use it directly or learn from it how it does things). This requires high privileges on the cluster: you need be able to grant to pods of your proxy NET_ADMIN and NET_RAW capabilities in order for them to be able to add iptables rules and setup a VIP.
update:
While waiting for builds and deployments during last few days, I've polished my old VIP-iptables scripts that I used to use as a replacement for external load-balancers on bare-metal clusters, so now they can be used as well to provide egress VIP as described in the last point of my original answer. You can give them a try: https://github.com/morgwai/kevip

There are two answers to this question: for the pod IP itself, it depends on your CNI plugin. Some allow it with special pod annotations. However most CNI plugins also involve a NAT when talking to the internet so the pod IP being static on the internal network is kind of moot, what you care about is the public IP the connection ends up coming from. So the second answer is "it depends on how your node networking and NAT is set up". This is usually up to the tool you used to deploy Kubernetes (or OpenShift in your case I guess). With Kops it's pretty easy to tweak the VPC routing table.

Is there a way to customize iptables rules in filter table on kubernetes master/worker node?

I'm working on a project where we're attempting to transition legacy product (deployed as a standalone VM) to kubernetes infrastcurture.
I'm using KUBEROUTER as CNI provider.
To protect the VM against DoS(and log the attempt) we've added different chains in iptables filter table. (These include rules for ping flood, syn flood - I think network policies/ingress controller can manage syn flood, but not sure how icmp flood would be taken care of. )
When I deployed kubernetes on my VM, I found that kubernetes updates iptables and creates it's own chains. (Mainly k8s updates NAT rules but chains are added in filter table as well)
My questions are:
Is it possible to customize iptables on VM where kubernetes is running?
If I add my own chains (making sure that k8s chains are in place) to iptables configuration, would they be overwritten by k8s?
Can I add chains using plain old iptables commands or need to do so via kubectl? (From k8s documentation, I got an impression that we can only update rules in NAT table using kubectl)
Please let me know, if somebody knows more on this, thanks !
~Prasanna

Is it possible to customize iptables on VM where kubernetes is running?
Yes, you can manage your VM's iptables normally, but the rules concerning application inside of Kubernetes should be managed from inside of Kubernetes.
If I add my own chains (making sure that k8s chains are in place) to iptables configuration, would they be overwritten by k8s?
Chains should not be overwritten by Kubernetes as Kubernetes creates its own chain and manages it.
Can I add chains using plain old iptables commands or need to do so via kubectl? (From k8s documentation, I got an impression that we can only update rules in NAT table using kubectl)
You can use iptables for rules related to the VirtualMachine. To manage firewall rules you should use iptables because kubectl can’t manage the firewall. For the inbound and outbound rules in Kubernetes cluster use Kubernetes tools ( .yaml files where you specify the network policies). Be aware not to create services that might be in conflict with iptables rules.
If you intend to expose application services externally, by either using the NodePort or LoadBalancing service types, traffic forwarding must be enabled in your iptables ruleset. If you find that you are unable to access a service from outside of the network used by the pod where your application is running, check that your iptables ruleset does not contain a rule similar to the following:
:FORWARD DROP [0:0]
Kubernetes network policies are application-centric compared to standard infrastructure/network-centric standard firewalls.
Which means that we do not really use CIDR or IP based network policies, in Kubernetes they are built on labels and selectors.
Concerning DDoS protection and details of ICMP flood attacks: the truth is that "classic" methods of mitigation - limiting the ICMP responses/filtering techniques will have an impact on legitimate traffic. In the "new era" of DDoS attacks with huge traffics, firewall based solutions are not enough as the traffic is usually able to overbear them. You could consider some vendor specific solutions or if you have that kind of possibilities - prepare your infrastructure to bear huge amounts of traffic or implement solutions like ping size and frequency limitations. Also, the overall DDoS protection consists of many levels and solutions. There are solutions like black hole routing, rate limiting, anycast network diffusion, uRPF, ACL's which can also help with application-level DDoS attacks. There are many more interesting practices I could recommend, but in my opinion, it is important to have a playbook and incident response plan in case of those attacks.

Firewall at container level in GCP

I have my jenkins slaves running on gke dynamically. I need to allow those containers to access my nexus server which is running on port 8080 on a different instance but same network. In firewall I have to allow those containers to access nexus-port 8080. But I don't want to keep 0.0.0.0 in source IP ranges. What is the IP range that I should allow to make it work. I tried Internal IPs, Cluster EndPoint in Source IP and targets I allowed all instances in the network. It is not working as expected. I need some help.

What you want to use to achieve this is not fiddling directly with firewall, but utilize Kubernetes native way of limiting traffic between pods by use of Network Policies
https://kubernetes.io/docs/concepts/services-networking/network-policies/

In addition to #Radek 'Goblin' Pieczonka answer I think it’s worth to add that traditional firewall rules are no longer sufficient for containerized environments.
Kubernetes Network Policy allows you to specify the connectivity allowed within your cluster, and what should be blocked. This is not based on traditional IP firewall concept but rather on selectors, not IP addresses and ports.
Here you can read foundations of the new philosophy of security. You probably will find interesting for your project.

How to configure Kubernetes to encrypt the traffic between nodes, and pods?

In preparation for HIPAA compliance, we are transitioning our Kubernetes cluster to use secure endpoints across the fleet (between all pods). Since the cluster is composed of about 8-10 services currently using HTTP connections, it would be super useful to have this taken care of by Kubernetes.
The specific attack vector we'd like to address with this is packet sniffing between nodes (physical servers).
This question breaks down into two parts:
Does Kubernetes encrypts the traffic between pods & nodes by default?
If not, is there a way to configure it such?
Many thanks!

Actually the correct answer is "it depends". I would split the cluster into 2 separate networks.
Control Plane Network
This network is that of the physical network or the underlay network in other words.
k8s control-plane elements - kube-apiserver, kube-controller-manager, kube-scheduler, kube-proxy, kubelet - talk to each other in various ways. Except for a few endpoints (eg. metrics), it is possible to configure encryption on all endpoints.
If you're also pentesting, then kubelet authn/authz should be switched on too. Otherwise, the encryption doesn't prevent unauthorized access to the kubelet. This endpoint (at port 10250) can be hijacked with ease.
Cluster Network
The cluster network is the one used by the Pods, which is also referred to as the overlay network. Encryption is left to the 3rd-party overlay plugin to implement, failing which, the app has to implement.
The Weave overlay supports encryption. The service mesh linkerd that #lukas-eichler suggested can also achieve this, but on a different networking layer.

The replies here seem to be outdated. As of 2021-04-28 at least the following components seem to be able to provide an encrypted networking layer to Kubernetes:
Istio
Weave
linkerd
cilium
Calico (via Wireguard)
(the list above was gained via consultation of the respective projects home pages)

Does Kubernetes encrypts the traffic between pods & nodes by default?
Kubernetes does not encrypt any traffic.
There are servicemeshes like linkerd that allow you to easily introduce https communication between your http service.
You would run a instance of the service mesh on each node and all services would talk to the service mesh. The communication inside the service mesh would be encrypted.
Example:
your service -http-> localhost to servicemesh node - https-> remoteNode -http-> localhost to remote service.
When you run the service mesh node in the same pod as your service the localhost communication would run on a private virtual network device that no other pod can access.

No, kubernetes does not encrypt traffic by default
I haven't personally tried it, but the description on the Calico software defined network seems oriented toward what you are describing, with the additional benefit of already being kubernetes friendly
I thought that Calico did native encryption, but based on this GitHub issue it seems they recommend using a solution like IPSEC to encrypt just like you would a traditional host

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse