I'm porting an application to k8s. The application currently consists of pairs of Docker containers, Trusted and Untrusted, where Trusted is connected to the regular bridge network and talks to internal services, while Untrusted is connected to a separate network that allows access only to external IPs. Untrusted accesses the network according to user-generated data, hence it must have internet access and must not be able to reach internal IPs.
Trusted and Untrusted communicate through a pair of FIFOs, since they run on the same machine (a Unix domain socket was ~20% slower; I haven't tested local TCP/IP yet but suspect an even bigger performance hit. The service is not entirely horizontally scalable due to ordering constraints, so single-machine performance matters).
I've hit a wall porting this setup to k8s: the original idea was to use a Pod for each pair of containers, sharing the FIFOs via an emptyDir volume, but there seems to be no way to apply separate network restrictions to a single container in a Pod, since all containers share the Pod's network namespace. Is there a way to do this?
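For concreteness, this is roughly the Pod layout I had in mind (the image names and mount path are just placeholders):

# Two containers sharing an emptyDir volume in which the FIFOs are created.
apiVersion: v1
kind: Pod
metadata:
  name: trusted-untrusted-pair
spec:
  volumes:
    - name: fifo-dir
      emptyDir: {}                      # shared scratch space for the FIFOs
  containers:
    - name: trusted
      image: example/trusted:latest     # placeholder image
      volumeMounts:
        - name: fifo-dir
          mountPath: /fifos
    - name: untrusted
      image: example/untrusted:latest   # placeholder image
      volumeMounts:
        - name: fifo-dir
          mountPath: /fifos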
What's the alternative if this isn't possible? Setting up the untrusted containers in a separate namespace and applying a limited network access policy allowing only tightly-controlled access to the rest of the cluster?
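What I'm picturing there is something like the following egress-only policy on the untrusted namespace (assuming a CNI plugin that actually enforces NetworkPolicy; the namespace name and the excluded CIDRs are just examples standing in for my internal ranges):

# Allow egress to the internet only; block private/internal ranges.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: untrusted-egress-internet-only
  namespace: untrusted                  # hypothetical namespace for the untrusted containers
spec:
  podSelector: {}                       # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:                     # example internal ranges to block
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16

(I assume DNS to kube-dns would then need its own allow rule, since the cluster service range usually falls inside those blocked CIDRs.)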
Related
Wondering if traffic between a pod's sidecar proxy and the pod's application(s) can be intercepted as it traverses the localhost network stack (perhaps using an eBPF module)?
Can a tenant guarantee the security of its traffic if it does not trust/control the nodes on which its pods are running?
Many thanks.
Edit: Is it possible to guarantee traffic security on untrusted infrastructure at all? And then, how to trust shared infrastructure?
Wondering if traffic between a pod's sidecar proxy and the pod's application(s) can be intercepted as it traverses the localhost network stack (perhaps using an eBPF module)?
Yes, inspecting and even changing packets on the local interface is doable with an eBPF TC or XDP program.
That said, you can also inspect local traffic with a raw socket, as tcpdump does (this requires roughly the same privileges as eBPF).
Can a tenant guarantee the security of its traffic if it does not trust/control the nodes on which its pods are running?
This very much depends on your threat model. eBPF can only be used by users with root access or special capabilities. It is very hard, if not impossible, to protect against root-level access, since such users can also access your application's memory.
But it is never bad practice to use solid encryption, even over localhost.
Is it possible to guarantee traffic security on untrusted infrastructure at all? And then, how to trust shared infrastructure?
You have to draw the line somewhere; exactly where is up to you and depends on how important your secrets are and from whom you are trying to keep them. Perhaps you can trust your infrastructure provider, but do you trust the hypervisor software, or the actual hardware? Both are most likely not made by your infrastructure provider.
We currently provide our software as software-as-a-service (SaaS) on Amazon EC2 machines. Our software is a microservice-based application with around 20 different services.
For bigger customers we use dedicated installations on a dedicated set of VMs, the number of VMs (and number of instances of our microservices) depending on the customer's requirements. A common requirement of any larger customer is that our software needs access to the customer's datacenter (e.g., for LDAP access). So far, we solved this using Amazon's virtual private gateway feature.
Now we want to move our SaaS deployments to Kubernetes. Of course we could just create a Kubernetes cluster across an individual customer's VMs (e.g., using kops), but that would offer little benefit.
Instead, in the longer term, we would like to run a single large Kubernetes cluster on which we deploy the individual customer installations into dedicated namespaces, thereby increasing resource utilization and lowering cost compared to the fixed allocation of machines to customers that we have today.
From the Kubernetes side of things, our software already works fine; we can deploy multiple installations to one cluster without problems. An open topic, however, is VPN access. What we would need is a way to allow all pods in a customer's namespace access to that customer's VPN, but not to any other customer's VPN.
When googling the topic, I found approaches that add a VPN client to the individual container (e.g., https://caveofcode.com/2017/06/how-to-setup-a-vpn-connection-from-inside-a-pod-in-kubernetes/), which is obviously not an option.
Other approaches seem to describe running a VPN server inside K8s (which is also not what we need).
Still others (like the "Strongswan IPSec VPN service", https://www.ibm.com/blogs/bluemix/2017/12/connecting-kubernetes-cluster-premises-resources/ ) use DaemonSets to "configure routing on each of the worker nodes". This also does not seem like an acceptable solution for us, since it would give all pods on a worker node (irrespective of the namespace they are in) access to the respective VPN... and it would also not work well if we have dozens of customer installations, each requiring its own VPN setup on the cluster.
Is there any approach or solution that provides what we need, i.e., VPN access for the pods in a specific namespace only?
Or are there any other approaches that could still satisfy our requirement (lower cost due to Kubernetes worker nodes being shared between customers)?
For LDAP access, one option might be to set up a kind of LDAP proxy, so that only this proxy would need VPN access to the customer network (by running the proxy on a small dedicated VM for each customer and configuring it as the LDAP endpoint for the application). However, LDAP access is only one of many aspects of connectivity that our application needs, depending on the use case.
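If we went down that proxy route, I imagine the proxy VM could be wired into each customer's namespace with a selector-less Service plus manually managed Endpoints, roughly like this (names, namespace, and IP are placeholders):

# Per-customer "ldap" Service pointing at that customer's dedicated proxy VM.
apiVersion: v1
kind: Service
metadata:
  name: ldap
  namespace: customer-a                 # hypothetical customer namespace
spec:
  ports:
    - port: 389
      targetPort: 389
---
apiVersion: v1
kind: Endpoints
metadata:
  name: ldap                            # must match the Service name
  namespace: customer-a
subsets:
  - addresses:
      - ip: 10.10.1.5                   # placeholder IP of the proxy VM
    ports:
      - port: 389

The application in that namespace would then simply use ldap:389 as its LDAP endpoint.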
If your IPsec concentrator supports VTI, it's possible to route the traffic using firewall rules. For example, pfSense supports it: https://www.netgate.com/docs/pfsense/vpn/ipsec/ipsec-routed.html.
With VTI, you can direct the traffic using some kind of policy routing: https://www.netgate.com/docs/pfsense/routing/directing-traffic-with-policy-routing.html
However, I can see two big problems here:
You cannot have two IPsec tunnels with conflicting networks. For example, your kube network is 192.168.0.0/24 and you have two customers: A (172.12.0.0/24) and B (172.12.0.0/12). Unfortunately, this can happen (unless your customers are able to NAT those networks).
Finding the right criteria for the rule match (to allow the routing) is hard, since your source network is always the same. Marking packets (using iptables mangle, or even from the application) can be an option, but you will still get stuck on the first problem.
A similar scenario is found in the architecture of WSO2 (an API gateway provider). They solved it using a reverse proxy in each network (sad but true): https://docs.wso2.com/display/APICloud/Expose+your+On-Premises+Backend+Services+to+the+API+Cloud#ExposeyourOn-PremisesBackendServicestotheAPICloud-ExposeyourservicesusingaVPN
UPDATE:
I don't know if you use GKE. If so, Alias IP may be an option: https://cloud.google.com/kubernetes-engine/docs/how-to/alias-ips. The Pods' IPs will be routable from the VPC, so you can apply some kind of routing policy based on their CIDR.
Let's say I'm using a GCE ingress to handle traffic from outside the cluster and terminate TLS (https://example.com/api/items); from here the request gets routed to one of two services that are only available inside the cluster. So far so good.
What if I have to call service B from service A, should I go all the way and use the cluster's external IP/domain and use HTTPS (https://example.com/api/user/1) to call the service or could I use the internal IP of the service and use HTTP (http://serviceb/api/user/1)? Do I have to encrypt the data or is it "safe" as long as it isn't leaving the private k8s network?
What if I want to have "internal" endpoints that should only be accessible from within the cluster? When I'm always using the external HTTPS URL, those endpoints would be reachable by everyone. Calling the service directly, I could just do http://serviceb/internal/info/abc.
What if I have to call service B from service A, should I go all the way and use the cluster's external IP/domain and use HTTPS (https://example.com/api/user/1) to call the service or could I use the internal IP of the service and use HTTP (http://serviceb/api/user/1)?
If you need the features that your API gateway offers (authentication, caching, high availability, load balancing), then YES; otherwise, DON'T. The external-facing API should contain only endpoints that are used by external clients (from outside the cluster).
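As a rough illustration (host name, service names, paths, and ports are made up), the Ingress only lists the externally used endpoints, while service B keeps a plain ClusterIP Service and is reached as http://serviceb/... from inside the cluster only:

# Expose only service A's public path externally; service B stays cluster-internal.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: public-api
spec:
  rules:
    - host: example.com
      http:
        paths:
          - path: /api/items
            pathType: Prefix
            backend:
              service:
                name: servicea
                port:
                  number: 80
---
apiVersion: v1
kind: Service
metadata:
  name: serviceb                # no Ingress rule points here, so it is only reachable in-cluster
spec:
  type: ClusterIP
  selector:
    app: serviceb               # placeholder label selector
  ports:
    - port: 80
      targetPort: 8080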
Do I have to encrypt the data or is it "safe" as long as it isn't leaving the private k8s network?
"safe" is a very relative word and I believe that there are no 100% safe networks. You should put in the balance the probability of "somebody" or "something" sniffing data from the network and the impact that it has on your business if that happens.
If this helps you: for any project that I've worked for (or I heard from somebody I know), the private network between containers/services was more than sufficient.
What if I want to have "internal" endpoints that should only be accessible from within the cluster? When I'm always using the external HTTPS URL, those endpoints would be reachable by everyone.
Exactly what I was saying at the top of the answer. Keeping those endpoints inside the cluster makes them inaccessible from outside by design.
One last thing: managing a lot of SSL certificates for many internal services is a pain that one should avoid if it isn't necessary.
I have a few questions about Kubernetes master-slave salt mode (reposting from https://github.com/kubernetes/kubernetes/issues/21215)
How do you expect anyone who has a large cluster in GCE to upgrade things in place when a new vulnerability is exposed?
How does one do things like regular key rotation, etc., without a master-minion salt setup in GCE? Doesn't that leave the GCE cluster more vulnerable in the long run?
I am not a security expert, so this is probably a naive question. Since the GCE cluster is already running inside a pretty locked-down network, is the communication between the master and the nodes a major concern? I understand that in GKE the master is hidden and access is restricted to the GCP project owner, but in GCE the master is visible. So, is this a real concern for GCE-only setups?
How do you expect anyone who has a large cluster in GCE to upgrade things in place when a new vulnerability is exposed?
By upgrading to a new version of k8s. If there is a kernel or Docker vulnerability, we would build a new base image (container-vm), send a PR to enable it in GCE, and then cut a new release referencing the new base image. If there is a k8s vulnerability, we would cut a new version of Kubernetes and you could upgrade it using the upgrade.sh script in GitHub.
How does one do things like regular key rotation, etc., without a master-minion salt setup in GCE? Doesn't that leave the GCE cluster more vulnerable in the long run?
By updating the keys on the master node, updating the keys in the node instance template, and rolling nodes from the old instance template to the new instance template. We don't want to distribute keys via salt, because then you have to figure out how to secure salt (which requires keys which then also need to be rotated). Instead we "distribute" keys out of band using the GCE metadata server.
Since the GCE cluster is already running inside a pretty locked-down network, is the communication between the master and the nodes a major concern?
For GKE, the master is running outside of the protected network, so it is a concern. GCE follows the same security model (even though it isn't strictly necessary) because it reduces the burden on the folks maintaining both systems if there is less drift in how they are configured.
So, is this a real concern for GCE-only setups?
For most folks it probably isn't a concern. But you could imagine a large company running multiple clusters (or other workloads) in the same network so that services maintained by different teams could easily communicate over the internal cloud network. In that case, you would still want to protect the communication between the master and nodes to reduce the impact an attacker (or malicious insider) could have by exploiting a single entry point into the network.
Is it possible to grant blanket access to my CloudSQL instance from ALL (current and future) GCE instances? I've tried adding the /16 internal network block address for my project's instances (copied from the "networks" tab under "Compute Engine": 10.240.0.0/16) but that won't save - it appears that I can only add single-machine (/32) IP addresses.
You need to use the external IP of your machine. Although GCE and Cloud SQL instances both run in Google's datacenters, you cannot communicate between the two using internal IPs.
I do not think there is a native way to allow access from any instance in your project. The only way would be to write your own app, running on one of your instances, that periodically queries the GCE API for running instances, gets their external IPs, and then uses the CloudSQL API to modify the security configuration on the CloudSQL instance.
You could improve this slightly by creating a pool of static IPs that you assign to the GCE machines that will access your CloudSQL instance; that way the IPs would not change. The side effect is that you would be charged for IPs that you have reserved but not allocated to instances.
Apart from that, you would have to add a rule allowing access from any IP (i.e., 0.0.0.0/0), which would not be a good idea.