Kubernetes Network Policy for ExternalName Service

We are looking at setting up network policies for our Kubernetes cluster. However, in at least one of our namespaces we have an ExternalName service (Kubernetes reference - service types) for an AWS RDS instance. We would like to restrict traffic to this ExternalName service to a particular set of pods or, if that is not possible, to a particular namespace. Neither the namespace isolation policy nor the NetworkPolicy resource seems to apply to ExternalName services. After searching the documentation for both Weave and Project Calico, we found no mention of such functionality.
Is it possible to restrict network traffic to an ExternalName service to be from a specific set of pods or from a particular namespace?

You can't really do that. ExternalName services are a DNS construct. A client performs a DNS lookup for the service and kube-dns returns the CNAME record for, in your case, the RDS instance. Then the client connects to RDS directly.
There are two possible ways to tackle this:
1. Block just DNS lookups (pods can still connect to the DB if they know the IP or fully qualified RDS hostname):
   - change namespace isolation to support ExternalName services
   - make kube-dns figure out the client pod behind each request it gets
   - make kube-dns aware of namespace isolation settings and apply them, so it only returns CNAME records to authorized pods
2. Return DNS lookups, but block RDS connections:
   - extend NetworkPolicy somehow to also control egress traffic
   - blacklist/whitelist RDS IPs wholesale (easier said than done, since they are dynamic) or make the network controllers track the results from DNS lookups and block/allow connections accordingly.
In either case, you'll have to file a number of feature requests in Kubernetes and downstream.
Source: I wrote the EN support code.
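The networking.k8s.io/v1 NetworkPolicy API has since gained egress rules, so the second approach (allow DNS lookups, block RDS connections) can be sketched roughly as below, provided the cluster's network plugin enforces egress policy. The namespace `apps`, the label `role: db-client`, the subnet `10.0.20.0/24`, and port 5432 are all assumptions for illustration; the CIDR would have to cover wherever the RDS instance actually lives.

```yaml
# Sketch only: namespace, labels, CIDR, and port are hypothetical.
# Policy 1: isolate every pod in the namespace for egress, allowing only DNS.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-egress-except-dns
  namespace: apps
spec:
  podSelector: {}          # selects all pods in the namespace
  policyTypes:
    - Egress
  egress:
    - ports:               # DNS to any destination
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
# Policy 2: additionally let pods labeled role=db-client reach the RDS subnet.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-db-clients-to-rds
  namespace: apps
spec:
  podSelector:
    matchLabels:
      role: db-client
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.20.0/24   # assumed subnet hosting the RDS instance
      ports:
        - protocol: TCP
          port: 5432             # assumed database port
```

Note that the first policy also blocks every other egress destination for pods in that namespace, so any other traffic they need would require further allow rules. It still does not touch the ExternalName service itself, which remains a DNS-only construct.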

Related

OpenShift/OKD, what is the difference between deployment, service, route, ingress?

Could you please explain the use of each OpenShift "Kind" in a few short sentences?
I understand that a deployment contains data about the image source, pod count, limits, etc.
With a route we can determine the URL for each deployment, and likewise with an Ingress, but what is the difference, and when should I use a route and when an ingress?
And what exactly is a service used for?
Thanks in advance for your help!
Your question cannot be answered in a few words or one-line answers; go through the links and explore more.
Deployment: It declares the desired state of a set of pods (image, replica count, resource limits) and rolls out changes to them. A pod is one or more running containers; a Deployment manages groups of identical pods through ReplicaSets.
Service: A Service gives a set of pods a single stable IP address and DNS name. It makes the pods reachable and forwards each connection to an appropriate pod, whose own address the client never needs to know.
Route: Similar to the Kubernetes Ingress resource, OpenShift's Route was developed with a few additional features, including the ability to split traffic between multiple backends.
Ingress: It offers routing rules that control how external traffic reaches the services in a Kubernetes cluster.
Difference between route and ingress?
OpenShift uses HAProxy to get (HTTP) traffic into the cluster. Other Kubernetes distributions use the NGINX Ingress Controller or something similar. You can find more in this doc.
When to use route and ingress: It depends on your requirements. Compare the features of Ingress and Route and select according to your requirements.
Exact use of service:
Each pod in a Kubernetes cluster has its own unique IP address. However, the IP addresses of the Pods in a Deployment change as the Pods are rescheduled or replaced, so relying on Pod IP addresses directly is impractical. With a Service, you always have a consistent IP address even when the IP addresses of the member Pods change.
A Service also provides load balancing. Clients call a single, dependable IP address, and the Service distributes their requests across its Pods.
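To make the relationship between these kinds concrete, here is a minimal sketch of a Deployment, the Service that fronts its pods, and an OpenShift Route that exposes the Service. The names (`web`), the image `example.com/web:1.0`, the host `web.example.com`, and the ports are hypothetical placeholders, not taken from the question.

```yaml
# Sketch only: names, image, host, and ports are hypothetical.
# Deployment: desired state for the pods (image, replica count).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example.com/web:1.0
          ports:
            - name: http
              containerPort: 8080
---
# Service: one stable address in front of all pods labeled app=web.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - name: http
      port: 80
      targetPort: http
---
# Route (OpenShift): exposes the Service under an external hostname.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: web
spec:
  host: web.example.com
  to:
    kind: Service
    name: web
  port:
    targetPort: http
```

On a plain Kubernetes cluster the last object would be an Ingress instead of a Route; the Deployment and Service stay the same.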

Network policy behavior for multi-node cluster

I have a multi-node cluster setup. There are Kubernetes network policies defined for the pods in the cluster. I can access the services or pods using their clusterIP/podIP only from the node where the pod resides. For services with multiple pods, I cannot access the service from the node at all (I guess when the service directs the traffic to the pod with the resident node same as from where I am calling then the service will work).
Is this the expected behavior?
Is it a Kubernetes limitation or a security feature?
For debugging etc., we might need to access the services from the node. How can I achieve it?
No, it is not the expected behavior for Kubernetes. Pods should be accessible from all the nodes inside the same cluster through their internal IPs. A ClusterIP service exposes the service on a cluster-internal IP, making it reachable from within the cluster; ClusterIP is the default service type, as stated in the Kubernetes documentation.
Services are not node-specific and they can point to a pod regardless of where it runs in the cluster at any given moment in time. Also make sure that you are using the cluster-internal IP and port while trying to reach the services. If you can still connect to the pod only from the node where it is running, you might need to check whether something is wrong with your networking, e.g., check whether UDP ports are blocked.
EDIT: Concerning network policies - by default, a pod is non-isolated for both egress and ingress, i.e. if no NetworkPolicy resource is defined for the pod in Kubernetes, all traffic is allowed to/from this pod - so-called default-allow behavior. Basically, without network policies all pods are allowed to communicate with all other pods/services in the same cluster, as described above.
If one or more NetworkPolicy resources apply to a particular pod, it will reject all traffic that is not explicitly allowed by those policies (meaning a NetworkPolicy that both selects the pod and has "Ingress"/"Egress" in its policyTypes) - default-deny behavior.
What is more:
Network policies do not conflict; they are additive. If any policy or policies apply to a given pod for a given direction, the connections allowed in that direction from that pod is the union of what the applicable policies allow.
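As an illustration of the default-deny and additive behavior, the following sketch isolates all pods in a namespace for ingress and then adds one allow rule. The namespace `demo` and the labels `app: backend` and `app: frontend` are hypothetical.

```yaml
# Sketch only: namespace and labels are hypothetical.
# Policy 1: selects every pod and lists no ingress rules,
# so all pods in the namespace become isolated for ingress.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: demo
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# Policy 2: also selects the backend pods; the effective rule set
# for them is the union of both policies, so frontend pods get in.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
```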
So yes, it is expected behavior for Kubernetes NetworkPolicy - when a pod is isolated for ingress/egress, the only allowed connections into/from the pod are those from the pod's node and those allowed by the rules of the NetworkPolicy resources defined.
To be compatible with it, Calico network policy follows the same behavior for Kubernetes pods.
A NetworkPolicy applies to pods within its own namespace; its rules can select peers in the same or in other namespaces with the help of selectors.
As for node-specific policies - nodes can't be targeted by their Kubernetes identities. Instead, CIDR notation should be used in the form of ipBlock in the pod/service NetworkPolicy - particular IP ranges are selected to allow as ingress sources or egress destinations for the pod/service.
Whitelisting the Calico IP addresses of each node might be a valid option in this case; please have a look at the similar issue described here.
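A minimal sketch of the ipBlock approach follows. The namespace `demo`, the label `app: backend`, the CIDR `192.168.0.0/24`, and port 8080 are assumptions; the CIDR must match the source addresses the pods actually see for node-originated traffic, which with an overlay network may be the nodes' tunnel addresses rather than their primary IPs.

```yaml
# Sketch only: namespace, label, CIDR, and port are hypothetical.
# Allows ingress to the selected pods from an IP range that is
# assumed to contain the addresses the cluster nodes connect from.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-nodes
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 192.168.0.0/24   # assumed node (or node tunnel) address range
      ports:
        - protocol: TCP
          port: 8080
```

Because policies are additive, this can be applied alongside the existing policies without modifying them.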

Route traffic in kubernetes based on IP address

We need to test a pod on our production Kubernetes cluster after a data migration before we expose it to our users. What we'd like to do is route traffic from our internal IP addresses to the correct pods, and all other traffic to a maintenance pod. Is there a way we can achieve this, or do we need to briefly expose an IP address on the cluster so we can access the pods directly?
What ingress controller are you using?
Generally, ingress controllers tend to support header-based routing rather than source-IP-based routing.
For example, ingress-nginx supports header-based canary deployments out of the box.
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#canary.
Good example here.
https://medium.com/@domi.stoehr/canary-deployments-on-kubernetes-without-service-mesh-425b7e4cc862
Note that the ingress-nginx canary implementation actually requires your services to be in different namespaces.
You could try to configure your ingress-nginx with use-forwarded-headers and try to employ X-Forwarded-For header for routing.
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#use-forwarded-headers
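A rough sketch of the header-based canary setup with ingress-nginx: the regular Ingress sends everyone to the maintenance service, while a second Ingress marked as canary sends requests carrying a chosen header to the migrated application. The host `app.example.com`, the namespaces, the service names, and the header name and value are hypothetical; the two Ingresses are placed in different namespaces in line with the note above.

```yaml
# Sketch only: host, namespaces, service names, and header are hypothetical.
# Default Ingress: all traffic goes to the maintenance page.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  namespace: maintenance
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: maintenance
                port:
                  number: 80
---
# Canary Ingress: requests with "X-Internal-Test: enabled" reach the real app.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-canary
  namespace: app
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "X-Internal-Test"
    nginx.ingress.kubernetes.io/canary-by-header-value: "enabled"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app
                port:
                  number: 80
```

Testers would then send the header explicitly, for example with a browser extension or `curl -H "X-Internal-Test: enabled" https://app.example.com/`. A header is only as secret as its value, so this is a convenience for testing rather than an access control.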

Multiple Host Kubernetes Ingress Controller

I've been studying Kubernetes for a few weeks now, and using the kube-lego NGINX examples (https://github.com/jetstack/kube-lego) have successfully deployed services to Kubernetes cluster using Rancher on DigitalOcean.
I've deployed sample static sites, Wordpress, Laravel, Craft CMS, etc. All of which use custom Namespaces, Deployment, Secrets, Containers with external registries, Services, and Ingress Definitions.
Using the example (lego) NGINX Ingress Controller setup, I'm able to apply DNS to the exposed IP address of my K8s cluster, and have the resulting sites appear.
What I don't know, though, is how to allow for multiple hosts to have Ingress Controllers service the same deployments, and thus provide HA Ingress to the cluster. (by applying an external load balancer service, or geo-ip, or what-have-you).
Rancher (stable) allows me to add multiple hosts, I've spun up 3 to 5 at a time, and Kubernetes is configured and deployed across all Hosts. Furthermore, I'll define many replicas and/or deployments (listed above) and they will be spread over the cluster and accessible as would be expected. I've even specified multiple replicas of the Ingress Controller, but of course they all get scheduled on the same host, giving me only one IP address of Ingress.
So how do I allow multiple hosts (each with their own public facing IP address) to allow ingress into the cluster? I've also read about setting up multiple Ingress Controllers, but then you must specify what deployment/services are being serviced by what Ingress Controller, which then totally defeats the purpose.
Maybe I'm missing something, but if K8s multi-host is supposed to provide HA, and the Host with the Ingress Controller goes down, then the service will be rescheduled on the other Hosts, but the IP address that everything is pointing to will be dead, and thus an outage. Any way to have multiple IP Addresses to the same set of deployment/services?
I investigated my setup a bit more today, and I think I found out why I was having difficulty. The "LoadBalancer" service type is often described as being for use with cloud providers (both in the docs and in what @fiunchinho describes). I was using it with a Rancher setup, which automatically creates an HAProxy LoadBalancer ingress for you on the hosts.
By default, it will just schedule it on one of the hosts. You can specify that you want it scheduled globally by providing an 'annotation' of io.rancher.scheduler.global: "true".
Like so:
annotations:
  # Create load balancers on every host in the environment
  io.rancher.scheduler.global: "true"
http://rancher.com/docs/rancher/v1.6/en/rancher-services/load-balancer/
I preferred LoadBalancer over NodePort because I wanted the ability to send port 80 (and in the future port 443) to any of the Nodes, and have them successfully fulfil my request by inspecting the Host header, and directing as-needed.
These LBs can also be set up in the Rancher UI under the "Infrastructure Stack" menu. I have successfully removed the single LB, and re-added one with an "Always run one instance of this container on every host" option enabled.
After this was configured, I could make a request to any of the Hosts for any of the Ingresses, and get a response, no matter what host the container was scheduled on.
So cool!
The ingress controller is deployed like any regular pod. That means that you can have as many replicas as you'd like, which will be spread among all your nodes.
You need a Service object that groups all the pods for the ingress controller.
Then you just need to expose that Service to outside the cluster. You can do that using a LoadBalancer service if you are on a cloud provider. Or you can use just a NodePort service.
The point is that the service will balance the traffic that your ingress controller receives between all the pods that are running on different kubernetes nodes. If one of the nodes goes down, it doesn't really matter, because there are other nodes containing ingress controller pods.
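A sketch of that arrangement is shown below. Since the question mentions replicas all landing on the same host, the example adds pod anti-affinity so that each replica is placed on a different node; that addition, along with the names, namespace, image `example.com/ingress-controller:1.0`, and port numbers, is an assumption for illustration rather than part of the answer above.

```yaml
# Sketch only: names, namespace, image, and ports are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ingress-controller
  namespace: ingress
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ingress-controller
  template:
    metadata:
      labels:
        app: ingress-controller
    spec:
      affinity:
        podAntiAffinity:
          # Never schedule two controller pods on the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: ingress-controller
              topologyKey: kubernetes.io/hostname
      containers:
        - name: controller
          image: example.com/ingress-controller:1.0
          ports:
            - name: http
              containerPort: 80
---
# Service grouping all controller pods; NodePort opens the same
# port on every node, so any node's IP can accept the traffic.
apiVersion: v1
kind: Service
metadata:
  name: ingress-controller
  namespace: ingress
spec:
  type: NodePort
  selector:
    app: ingress-controller
  ports:
    - name: http
      port: 80
      targetPort: http
      nodePort: 30080
```

On a cloud provider, changing `type` to `LoadBalancer` would put a provider-managed load balancer in front of the same pods. With a required anti-affinity rule, the replica count cannot exceed the number of schedulable nodes.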

Routing internal traffic in Kubernetes?

We presently have a setup where applications within our mesos/marathon cluster want to reach out to services which may or may not reside in our mesos/marathon cluster. Ingress for external traffic into the cluster is accomplished via an Amazon ELB sitting in front of a cluster of Traefik instances, which then chooses the appropriate set of container instances to load-balance to via the incoming HTTP Host header compared against essentially a many-to-one association of configured host headers against a particular container instance. Internal-to-internal traffic is actually handled by this same route as well, as the DNS record that is associated with a given service is mapped to that same ELB both internal to and external to our mesos/marathon cluster. We also give the ability to have multiple DNS records pointing against the same container set.
This setup works, but it causes seemingly unnecessary network traffic and load against our ELBs as well as our Traefik cluster. It would be better if the applications in the containers, or another component, could determine that the services they wish to call are within the same mesos/marathon cluster, and then call either something internal to the cluster fronting the set of containers or the specific container directly.
From what I understand of Kubernetes, Kubernetes provides the concept of services, which essentially can act as the front for a set of pods based on configuration for which pods the service should match over. However, I'm not entirely sure of the mechanism by which we can have applications in a Kubernetes cluster know transparently to direct network traffic to the service IPs. I think that some of this can be helped by having Envoy proxy traffic meant for, e.g., <application-name>.<cluster-name>.company.com to the service name, but if we have a CNAME that maps to that previous DNS entry (say, <application-name>.company.com), I'm not entirely sure how we can avoid exiting the cluster.
Is there a good way to solve for both cases? We are trying to avoid having our applications' logic have to understand that it's sitting in a particular cluster and would prefer a component outside of the applications to perform the routing appropriately.
If I am fundamentally misunderstanding a particular component, I would gladly appreciate correction!
When you use service-to-service communication inside a cluster, you use the Service abstraction, which is something like a static point that routes traffic to the right pods.
A Service endpoint is available only from inside the cluster, by its IP or by its internal DNS name, which is provided by the internal Kubernetes DNS server. So, for communicating inside a cluster, you can use DNS names like <servicename>.<namespace>.svc.cluster.local.
But, what is more important, a Service has a static IP address.
So you can add that static IP as a hosts record to the pods inside the cluster to make sure that they communicate with each other inside the cluster.
For that, you can use the HostAliases feature. Here is an example of configuration:
apiVersion: v1
kind: Pod
metadata:
  name: hostaliases-pod
spec:
  restartPolicy: Never
  hostAliases:
  - ip: "10.0.1.23"
    hostnames:
    - "my.first.internal.service.example.com"
  - ip: "10.1.2.3"
    hostnames:
    - "my.second.internal.service.example.com"
  containers:
  - name: cat-hosts
    image: busybox
    command:
    - cat
    args:
    - "/etc/hosts"
So, if you use your internal Service IP in combination with the service's public FQDN, all traffic from your pod will stay inside the cluster, because the application will use the internal IP address.
Alternatively, you can use an upstream DNS server that contains the same aliases; the idea is the same.
With upstream DNS for a separate zone, resolving will work like this:
With newer versions of Kubernetes, which use CoreDNS to provide the DNS service and have more features, it will be a bit simpler.
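One way this might look with CoreDNS is its `rewrite` plugin, which can map a public name onto the in-cluster Service name before the query is answered. The sketch below assumes the common layout where CoreDNS reads its Corefile from a ConfigMap named `coredns` in `kube-system`; the hostname `my-app.company.com`, the Service `my-app`, and the namespace `default` are hypothetical, and the remaining Corefile lines stand in for whatever the cluster already has.

```yaml
# Sketch only: hostnames, Service name, and namespace are hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        # Answer queries for the public name with the in-cluster Service,
        # so pods using the public FQDN never leave the cluster.
        rewrite name my-app.company.com my-app.default.svc.cluster.local
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
```

Unlike hostAliases, this applies to every pod that uses the cluster DNS, so the applications need no per-pod configuration. TLS certificates and Host headers still refer to the public name, so the in-cluster Service must be prepared to serve it.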