AWS EKS - How to solve RulesPerSecurityGroupLimitExceeded caused by NLB rules - amazon-vpc

I have a VPC with three private subnets (10.176.128.0/20, 10.176.144.0/20, 10.176.160.0/20) and three public subnets (10.176.0.0/20, 10.176.16.0/20, 10.176.32.0/20). All private subnets have the tag kubernetes.io/role/internal-elb=1 and the public ones have the tag kubernetes.io/role/elb=1.
I run all my worker nodes in managed node groups, and AWS EKS has been responsible for creating a default security group for the cluster. That security group is what I'm referring to below.
I have two namespaces in my Kubernetes cluster, test and stage, and in each namespace I have 3 services of type LoadBalancer that together expose 8 ports per namespace. The load balancers are of type NLB.
Now to the problem: each service with a load balancer creates 4 rules per port in the security group for my nodes, one for each subnet it's located in and one for all traffic (0.0.0.0/0). 8 * 4 * 2 = 64, and the maximum number of rules per security group is 60 according to AWS, so when I'm about to create the last LB I get the RulesPerSecurityGroupLimitExceeded error.
I see two ways to solve this: either attach more security groups to my nodes, or somehow configure things so that fewer rules are created per port. The thing is, one rule would actually be enough, since 0.0.0.0/0 covers all my subnets as well. Another option might be that I'm doing something wrong in the design. I have tried the first option, attaching more security groups, and failed: the rules are still added to the group that is already full.
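For reference, each of the Services looks roughly like this (name, selector, and port are illustrative); as far as I understand it, the 0.0.0.0/0 rule per port comes from the default loadBalancerSourceRanges and the per-subnet rules are for health checks:

apiVersion: v1
kind: Service
metadata:
  name: my-service              # illustrative name
  namespace: test
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  # Defaults to 0.0.0.0/0 when omitted; this is the source of one of the
  # four rules generated per port.
  loadBalancerSourceRanges:
    - 0.0.0.0/0
  selector:
    app: my-app                 # illustrative selector
  ports:
    - name: one-of-eight
      port: 8080
      targetPort: 8080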

We are hitting this issue as well. One thing you can do is request a quota increase on rules per security group in the AWS console. Feels to me like that is only going to postpone the issue slightly though.
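The same request can be made from the CLI via Service Quotas; a sketch, assuming you look the quota code up first rather than trusting the one shown here:

# Find the quota code for "Inbound or outbound rules per security group"
aws service-quotas list-service-quotas --service-code vpc \
  --query "Quotas[?contains(QuotaName, 'rules per security group')]"

# Request the increase using the quota code returned above
aws service-quotas request-service-quota-increase \
  --service-code vpc --quota-code L-0EA8095F --desired-value 100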

Related

Understanding Kubernetes networking and public/private addressing

I'm trying to set up a Kubernetes cluster for production (with kubeadm) on a private cloud (IONOS Cloud). I've been trying to figure it all out and configure it as best as possible for a few weeks now. Still, there is one thing I don't quite understand, and I can't tell whether it is even possible.
The initial cluster would be 5 nodes: 3 masters and 2 workers. The masters sit behind one load balancer and the workers behind another. Each server has 2 interfaces, one for the public network and one for the private network (192.168.3.0/24). Each server has a firewall that by default blocks all packets.
To avoid having to create firewall rules on each server so that they can see each other over the public network, I would like all communication between Kubernetes nodes to go over the private network.
I can think of other reasons to use the private network for inter-node communication, such as: speed, latency, security...
However, I have not been able to configure it and I don't know if it is really possible to create this scenario. The problems I encounter are the following:
In order to access the API of the cluster from outside (e.g. with kubectl), I need to expose the control-plane endpoint on the public IP of the balancer. If I do this, then the etcd database endpoints are exposed on the public network. The other nodes, in the process of joining the cluster (kubeadm join), need to get some information from etcd, so they necessarily need visibility to each other over the public network.
Once all the nodes have joined, the "kubernetes" service has the endpoints of all the control-plane nodes (public). When I try to deploy Calico (or another CNI), it never finishes deploying because it makes queries to the kubernetes service, and if the nodes don't have visibility between them, it fails.
It seems that whatever I do, if I publish the API on the public network, I need all the nodes to see each other over the public network.
Maybe I'm making my life too complicated, and the simplest thing to do is to open a firewall rule for each node, but I don't know if this is a good practice.
Architecture diagram (I still cannot embed images)
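The direction I have been trying, in case it clarifies the setup: a kubeadm config (v1beta3, all addresses illustrative) that binds the control plane to the private network and adds the public balancer to the certificate SANs so kubectl can still verify the cert from outside:

# kubeadm-config.yaml (addresses are illustrative)
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.3.10        # this master's private IP
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "192.168.3.100:6443"   # private load balancer
apiServer:
  certSANs:
    - 192.168.3.100                     # private LB
    - 203.0.113.10                      # public LB, for external kubectl access

With this, etcd traffic and kubeadm join stay on 192.168.3.0/24, and only the API server needs to be reachable through the public balancer.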

DNS problem on AWS EKS when running in private subnets

I have an EKS cluster setup in a VPC. The worker nodes are launched in private subnets. I can successfully deploy pods and services.
However, I'm not able to perform DNS resolution from within the pods. (It works fine on the worker nodes, outside the container.)
Troubleshooting using https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ results in the following from nslookup (timeout after a minute or so):
Server: 172.20.0.10
Address 1: 172.20.0.10
nslookup: can't resolve 'kubernetes.default'
When I launch the cluster in an all-public VPC, I don't have this problem. Am I missing any necessary steps for DNS resolution from within a private subnet?
Many thanks,
Daniel
I feel like I have to give this a proper answer, because coming upon this question was the answer to 10 straight hours of debugging for me. As @Daniel said in his comment, the issue I found was my ACL blocking outbound traffic on UDP port 53, which Kubernetes uses to resolve DNS records.
The process was especially confusing for me because one of my pods actually worked the entire time, since (I think?) it happened to be in the same zone as the Kubernetes DNS resolver.
To elaborate on the comment from @Daniel, you need:
(1) an ingress rule for UDP port 53
(2) an ingress rule for UDP on ephemeral ports (e.g. 1025–65535)
I hadn't added (2) and was seeing CoreDNS receiving requests and trying to respond, but the response wasn't getting back to the requester.
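If you are fixing this at the NACL level, the entries look roughly like this (ACL ID, rule numbers, and the VPC CIDR are illustrative):

# 17 = UDP. Rule (1): inbound DNS queries on port 53
aws ec2 create-network-acl-entry --network-acl-id acl-0123456789abcdef0 \
  --rule-number 100 --protocol 17 --port-range From=53,To=53 \
  --cidr-block 10.0.0.0/16 --rule-action allow --no-egress

# Rule (2): inbound DNS responses on ephemeral ports
aws ec2 create-network-acl-entry --network-acl-id acl-0123456789abcdef0 \
  --rule-number 110 --protocol 17 --port-range From=1025,To=65535 \
  --cidr-block 10.0.0.0/16 --rule-action allow --no-egress

# NACLs are stateless, so matching outbound entries (--egress) are needed too.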
Some tips for others dealing with these kinds of issues: turn on CoreDNS logging by adding the log configuration to the configmap, which I was able to do with kubectl edit configmap -n kube-system coredns. See the CoreDNS docs on this: https://github.com/coredns/coredns/blob/master/README.md#examples. This can help you figure out whether the issue is CoreDNS receiving queries or sending the response back.
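Concretely, the edit just adds the log plugin to the server block in the Corefile (a sketch; your Corefile will contain more plugins than shown):

.:53 {
    log        # logs every query and response, handy for this kind of debugging
    errors
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
    }
    forward . /etc/resolv.conf
    cache 30
}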
I ran into this as well. I have multiple node groups, and each one was created from a CloudFormation template. The CloudFormation template created a security group for each node group that allowed the nodes in that group to communicate with each other.
The DNS error resulted from Pods running in separate node groups from the CoreDNS Pods, so the Pods were unable to reach CoreDNS (network communications were only permitted within node groups). I will make a new CloudFormation template for the node security group so that all the node groups in my cluster can share the same security group.
I resolved the issue for now by allowing inbound UDP traffic on port 53 for each of my node group security groups.
So I have been struggling with this issue as well, for a couple of hours I think; I lost track of time.
Since I am using the default VPC, but with the worker nodes inside a private subnet, it wasn't working.
I went through amazon-vpc-cni-k8s and found the solution.
We have to set the environment variable AWS_VPC_K8S_CNI_EXTERNALSNAT=true on the aws-node daemonset.
You can either get the new YAML and apply it, or just fix it through the dashboard. However, for it to work you have to restart the worker node instances so the IP route tables are refreshed.
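For reference, the same change as a one-liner (a sketch; kubectl set env rolls the daemonset pods, but the instance restart mentioned above may still be required):

kubectl set env daemonset/aws-node -n kube-system AWS_VPC_K8S_CNI_EXTERNALSNAT=true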
The issue link is here.
Thanks!
Re: AWS EKS Kube Cluster and Route53 internal/private Route53 queries from pods
Just wanted to post a note on what we needed to do to resolve our issues. Noting that YMMV and everyone has different environments and resolutions, etc.
Disclaimer:
We're using the community Terraform EKS module to deploy/manage the VPCs and the EKS clusters. We didn't need to modify any security groups. We are working with multiple clusters, regions, and VPCs.
ref:
Terraform EKS module
CoreDNS Changes:
We have a DNS relay for private internal DNS, so we needed to modify the coredns configmap and add in the DNS relay IP address:
...
ec2.internal:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
foo.dev.com:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
foo.stage.com:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
...
VPC DHCP option sets:
Update with the IP of the above relay server if applicable. This requires creating a new option set, as existing option sets cannot be modified.
Our DHCP options set looks like this:
["AmazonProvidedDNS", "10.1.1.245", "169.254.169.253"]
ref: AWS DHCP Option Sets
Route-53 Updates:
Associate every Route53 zone with the VPC ID where your kube cluster resides (i.e. where the pods will make queries from).
There is also a Terraform resource for that:
https://www.terraform.io/docs/providers/aws/r/route53_zone_association.html
We ran into a similar issue where DNS resolution timed out on some of the pods, but re-creating the pod a couple of times resolved the problem. Also, it was not every pod on a given node showing issues, only some pods.
It turned out to be due to a bug in version 1.5.4 of Amazon VPC CNI, more details here -- https://github.com/aws/amazon-vpc-cni-k8s/issues/641.
A quick solution is to revert to the recommended version 1.5.3 - https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
Like many others, I've been struggling with this bug for a few hours.
In my case the issue was this bug https://github.com/awslabs/amazon-eks-ami/issues/636 that basically sets up an incorrect DNS server when you specify the API server endpoint and certificate but not the DNS cluster IP.
To confirm, check:
that you have connectivity (NACLs and security groups) allowing DNS on TCP and UDP. For me, the best way was to SSH into a node and see if it resolves (nslookup); if it doesn't, it is most likely either a NACL or a security group issue, but also check that the DNS nameserver on the node is configured correctly.
if name resolution works on the node but not inside the pod, check that the nameserver in /etc/resolv.conf points to an IP in your service network (if you see 172.20.0.10, your service network should be 172.20.0.0/24 or so).
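Both checks can be run from inside the cluster with a throwaway pod (a sketch, using the busybox image from the Kubernetes DNS debugging guide linked above):

# What nameserver is the pod actually using?
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- cat /etc/resolv.conf

# Does resolution work from inside a pod?
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default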

How to make cluster nodes private on Google Kubernetes Engine?

I noticed every node in a cluster has an external IP assigned to it. That seems to be the default behavior of Google Kubernetes Engine.
I thought the nodes in my cluster should be reachable from the local network only (through their virtual IPs), but I could even connect directly to a mongo server running on a pod from my home computer, just by connecting to its hosting node (without using a LoadBalancer).
I tried to make Container Engine not assign external IPs to newly created nodes by changing the cluster instance template settings (changing the "External IP" property from "Ephemeral" to "None"). But after I did that, GCE was not able to start any pods (I got a "Does not have minimum availability" error). The new instances did not even show up in the list of nodes in my cluster.
After switching back to the default instance template with an external IP, everything went fine again. So it seems that, for some reason, Google Kubernetes Engine requires cluster nodes to be public.
Could you explain why that is, and whether there is a way to prevent GKE from exposing cluster nodes to the Internet? Should I set up a firewall? What rules should I use (since nodes are dynamically created)?
I think Google not allowing private nodes is kind of a security issue... Suppose someone discovers a security hole in a database management system. We'd feel much more comfortable working on a fix (applying patches, upgrading versions) if our database nodes were not exposed to the Internet.
GKE recently added a new feature allowing you to create private clusters, which are clusters where nodes do not have public IP addresses.
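A sketch of creating one (cluster name and master CIDR are illustrative; private clusters require VPC-native networking, hence --enable-ip-alias):

gcloud container clusters create my-private-cluster \
    --enable-private-nodes \
    --enable-ip-alias \
    --master-ipv4-cidr 172.16.0.32/28

Nodes then get only internal IPs; the control plane is reached through the private endpoint (or through the public one, restricted with master authorized networks).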
This is how GKE is designed, and there is no way around it that I am aware of. There is no harm in running Kubernetes nodes with public IPs, and if these are the IPs used for communication between nodes, you cannot avoid it.
As for your security concern: if you run that example DB on Kubernetes, it would not be accessible even with a public IP on the node, as the DB would live only on the internal pod-to-pod network, not on the nodes themselves.
As described in this article, you can use network tags to identify which GCE VMs or GKE clusters are subject to certain firewall rules and network routes.
For example, if you've created a firewall rule to allow traffic on ports 27017, 27018 and 27019, which are the default TCP ports used by MongoDB, give the desired instances a tag and then use that tag to apply the firewall rule that allows access on those ports to those instances.
Also, it is possible to create a GKE cluster with GCE tags applied to all nodes in the new node pool, so the tags can be used in firewall rules to allow/deny desired/undesired traffic to the nodes. This is described in this article under the --tags flag, and sketched below.
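A sketch of both steps together, with illustrative tag and rule names:

# Create the cluster with a network tag on all of its nodes
gcloud container clusters create my-cluster --tags=mongo-clients

# Allow the default MongoDB TCP ports only from instances carrying that tag
gcloud compute firewall-rules create allow-mongodb \
    --allow tcp:27017-27019 \
    --source-tags mongo-clients \
    --target-tags mongodb-servers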
The Kubernetes master runs outside your network, and it needs to access your nodes. This could be the reason for having public IPs.
When you create your cluster, some firewall rules are created automatically. These are required by the cluster, e.g. ingress from the master and traffic between the cluster nodes.
The 'default' network in GCP has ready-made firewall rules in place. These allow all SSH and RDP traffic from the Internet and allow pinging of your machines. You can remove these without affecting the cluster, and your nodes will no longer be visible.

Tenant isolation with Kubernetes on networking level

We want to run a multi-tenant scenario that requires tenant separation on a network level.
The idea is that every tenant receives a dedicated node and a dedicated network that the tenant's other nodes can join. Tenant nodes should be able to interact with each other in that network.
Networks should not be able to talk with each other (true network isolation).
Are there any architectural patterns to achieve this?
One Kubernetes cluster per tenant?
One Kubernetes cluster for all tenants, with one subnet per tenant?
One Kubernetes cluster across VPCs (speaking in AWS terms)?
The regular way to deal with multi-tenancy inside Kubernetes is to use namespaces. But this is within a kube cluster, meaning you still have the same underlying networking solution shared by all tenants. That is actually fine, as you have Network Policies to restrict networking in the cluster.
You can obviously run autonomous clusters per tenant, yet this is not exactly multi-tenancy then, just multiple clusters. Networking can be configured at the node level to route as expected, but you'd still be left with the issue of cross-cluster service discovery etc. Federation can help a bit with that, but I would still advise pursuing the Namespaces+Policies approach, as sketched below.
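A minimal sketch of that approach: a per-namespace policy that admits traffic only from pods in the same namespace (namespace and policy names are illustrative, and the cluster's CNI must actually enforce NetworkPolicies, e.g. Calico):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-isolation
  namespace: tenant-a          # repeat per tenant namespace
spec:
  podSelector: {}              # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}      # an empty podSelector admits any pod in this namespace

Cross-tenant traffic is then denied by default, while pods within a tenant's namespace keep talking to each other.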
I see four ways to run multi-tenant k8s clusters at the network level:
Namespaces
Ingress rules
allow/deny and ingress/egress Network Policies
Network-aware Zones

Consistent IP Addresses for Auto Scaling / Load Balanced Instances

The Setup
ECS (Containerized) Application (Node.js, API Only)
Auto Scaling Group for ECS Container Instances
Load Balancer in front of auto scaling group
VPC covering all instances and ELB
Database hosted in another VPC, not managed explicitly (MongoDB Atlas), likely not the same region.
The Problem
I want my database to use good security policies; therefore I opt for whitelisting IPs, as Atlas recommends, rather than opening up my database to the world with 0.0.0.0/0.
Each server has its own IP address, and in an auto-scaling event the new address would need to be added to the Atlas security rules by automation (which is possible, but not ideal).
How can I (using NAT Gateways? Elastic IPs?) get one IP for all of my load-balanced instances?
Failed Solutions?
I tried using a NAT Gateway, essentially scenario 2, where all of my instances were in a private subnet, the NAT was in a public subnet with internet access, and the instances went through it to reach the database. This worked! I put an Elastic IP on the NAT and was able to authorize it on Atlas. However, it had weird issues where an instance wouldn't respond for 65-75 seconds, intermittently, when pinged. I suspect this is because it's not technically available on the internet and there's some routing happening that I don't fully understand. Once you got a 200, though, everything would work fine for a bit, then another 70-second latency would hit and things went back to good again...
Really appreciate the input, have been searching for a while with no luck!
Have you tried a VPC peering connection? As long as the VPC CIDR blocks do not overlap, this is a good option because you can use security groups and private IPs between the peered VPCs.
http://docs.aws.amazon.com/AmazonVPC/latest/PeeringGuide/Welcome.html
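A sketch of the generic AWS flow, with illustrative IDs and CIDR (with Atlas specifically, the peering request is usually initiated from the Atlas UI/API instead):

# Request a peering connection from the app VPC to the database VPC
aws ec2 create-vpc-peering-connection \
    --vpc-id vpc-0aaa1111 --peer-vpc-id vpc-0bbb2222
# add --peer-region to the command above if the VPCs are in different regions

# Accept it on the peer side
aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0ccc3333

# Route traffic for the peer VPC's CIDR through the peering connection
aws ec2 create-route --route-table-id rtb-0ddd4444 \
    --destination-cidr-block 10.20.0.0/16 \
    --vpc-peering-connection-id pcx-0ccc3333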