How to set up fail2ban with Rancher Load Balancer (HAProxy)

I want to set up fail2ban on my Rancher agents.
I have a Cattle environment running a managed network, where each Rancher agent runs HAProxy as a web server and load balancer.
I want to ban users based on different criteria (too many failed logins, too many requests, etc.) on both the HTTP and HTTPS ports.
Currently I have fail2ban set up with a regex that matches against dummy logs, but fail2ban is not banning any IPs.
fail2ban-client status <my-jail> shows that the jail has started but has 0 bans, even when I make incorrect requests.
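For reference, the filter regex can be dry-run against a log file with fail2ban-regex; the log path and filter name below are placeholders:
# Test which log lines the failregex matches, without touching any running jail
fail2ban-regex /var/log/haproxy.log /etc/fail2ban/filter.d/<my-filter>.conf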

After running fail2ban in debug mode and investigating iptables, I found the problem.
The problem occurs for three reasons:
1) HAProxy is running in a Docker image whose clock is set to UTC, while your servers might be in a different time zone.
2) HAProxy runs in a Docker container on a Cattle-managed network, which means the incoming packets are FORWARD packets, not INPUT packets, as far as iptables is concerned.
3) The way Cattle handles forwarding is a bit ugly and hence does not play well with custom fail2ban rules.
In my case, since HAProxy runs in a Docker image with a different time zone, fail2ban was ignoring ban attempts because the log timestamps were a few hours off. Changing the server time fixed the first problem.
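A quick way to confirm this kind of mismatch is to compare the host clock with the clock inside the HAProxy container; the container name below is a placeholder:
# Host time
date
# Time inside the HAProxy container
docker exec <haproxy-container> date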
I could now see that IPs were indeed being banned when using:
fail2ban-client status <my-jail>
But the problem still remained, because even though I could see the correct IPs being banned, I could still access the server completely fine.
This is because of the way Rancher sets up iptables. To fix this problem I changed my /etc/fail2ban/jail.local from:
[DEFAULT]
...
chain = INPUT
...
To:
[DEFAULT]
...
chain = CATTLE_FORWARD
...
Now users are correctly banned: the time zones match, and banned IPs are put into a jail rule on the FORWARD chain, so requests from banned users are dropped.
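A minimal sketch of the resulting jail.local, assuming a custom HAProxy filter; the jail name, filter name, log path, and thresholds below are illustrative, not taken from my setup:
[DEFAULT]
# Ban in the chain Cattle actually routes container traffic through
chain = CATTLE_FORWARD
banaction = iptables-multiport

[haproxy-abuse]
enabled  = true
port     = http,https
filter   = haproxy-abuse
logpath  = /var/log/haproxy.log
maxretry = 5
findtime = 600
bantime  = 3600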

Related

Where Are TLS Handshake Errors In APISERVER Logs Coming From?

I have a cluster provisioned using KubeSpray on AWS. It has two bastions, one controller, one worker, and one etcd server.
I am seeing endless messages in the APISERVER logs:
http: TLS handshake error from 10.250.227.53:47302: EOF
They come from two IP addresses, 10.250.227.53 and 10.250.250.158. The port numbers change every time.
None of the cluster nodes correspond to those two IP addresses. The subnet cidr ranges are shown below.
The cluster seems stable. This behavior does not seem to have any negative effect. But I don't like having random HTTPS requests.
How can I debug this issue?
They're from the health check configured on the AWS ELB; you can stop those messages by changing the health check configuration to HTTPS:6443/healthz instead of the plain TCP check it is most likely using now.
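A hedged sketch of that change with the AWS CLI, assuming a classic ELB; the load balancer name and thresholds are placeholders:
aws elb configure-health-check \
  --load-balancer-name <api-elb-name> \
  --health-check Target=HTTPS:6443/healthz,Interval=30,Timeout=5,UnhealthyThreshold=2,HealthyThreshold=2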
How can I debug this issue?
Aside from just generally being cognizant of how your cluster was installed, and then observing that those connections arrive at regular intervals, I would further bet that those two IP addresses belong to the two ENIs allocated to the ELB in each public subnet (they'll show up in the Network Interfaces list on the console with an owner like "elasticloadbalancer" or something similar).
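One hedged way to confirm that from the CLI is to list ENIs whose description marks them as ELB-owned (classic ELB ENIs are described as "ELB <name>"):
aws ec2 describe-network-interfaces \
  --filters Name=description,Values="ELB *" \
  --query 'NetworkInterfaces[].{Id:NetworkInterfaceId,Ip:PrivateIpAddress,Desc:Description}'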

GKE streaming large file download fails with partial response

I have an app hosted on GKE which, among many tasks, serves a zip file to clients. These zip files are constructed on the fly from many individual files on Google Cloud Storage.
The issue I'm facing is that when these zips get particularly large, the connection fails randomly part way through (anywhere between 1.4 GB and 2.5 GB). There doesn't seem to be any pattern with timing either: it could happen between 2 and 8 minutes in.
AFAIK, the connection is being dropped somewhere between the load balancer and my app. Is the GKE ingress (load balancer) known to close long/large connections?
GKE setup:
HTTP(S) load balancer ingress
NodePort backend service
Deployment (my app)
More details/debugging steps:
I can't reproduce it locally (without kubernetes).
The load balancer logs statusDetails: "backend_connection_closed_after_partial_response_sent" while the response has a 200 status code. A google of this gave nothing helpful.
Directly accessing the pod and downloading using kubectl port-forward worked successfully (see the sketch after this list)
My app logs that the request was cancelled (by the requester)
I can verify none of the files are corrupt (can download all directly from storage)
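For reference, the port-forward check above can be reproduced with something like this; the pod name, namespace, port, and path are placeholders:
# Forward a local port straight to the pod, bypassing the ingress/load balancer
kubectl port-forward -n <namespace> pod/<my-app-pod> 8080:8080
# In a second shell, download the zip directly from the pod
curl -o test.zip http://localhost:8080/<zip-endpoint>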
I believe your "backend_connection_closed_after_partial_response_sent" issue is caused by the WebSocket connection being killed by the back end prematurely. You can see the documentation on WebSocket proxying in nginx; it explains the nature of this process. In short, by default a WebSocket connection is killed after 10 minutes.
Why does it work when you download the file directly from the pod? Because you're bypassing the load balancer and the WebSocket connection is kept alive properly. When you proxy the WebSocket, things start to happen, because WebSocket relies on hop-by-hop headers which are not proxied.
A similar case was discussed here. It was resolved by sending ping frames from the back end to the client.
In my opinion your best shot is to do the same. I've found many cases with similar issues where a WebSocket was proxied, and most of them suggest using pings, because a ping resets the connection timer and keeps the connection alive.
Here's more about pinging the client using WebSocket and timeouts
I work for Google and this is as far as I can help you - if this doesn't resolve your issue you have to contact GCP support.

getaddrinfo: Temporary failure in name resolution kubernetes + coredns

We have a service that sends tons of events in bulks. It basically opens multiple http POST connections.
Since we moved the service to Kubernetes, we're getting getaddrinfo: Temporary failure in name resolution errors from time to time (most calls work, but some fail, and it's weird).
Can anyone explain why, and how to fix it?
Check the Tinder post; they had a similar problem:
https://medium.com/tinder-engineering/tinders-move-to-kubernetes-cda2a6372f44
and the source for their dns info:
https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts
TL;DR: check the host's ARP cache gc_* parameters, try disabling AAAA queries via the container's /etc/gai.conf, move DNS to a DaemonSet, and inject the host IP as the DNS server for the pods.
Also, to help with this and speed up DNS resolution, add a trailing dot to all external domains (e.g. database.example.com.), so CoreDNS will send that query upstream directly (one query, two with IPv6) instead of walking the whole Kubernetes search domain list (about 5 queries, 10 with IPv6). Only leave the dot out where you are querying Kubernetes resources, or in apps that cannot handle it (if such apps exist, that is a bug: every absolute DNS name ends with a dot; most of the time we can leave it out, but including it is the correct form and must not fail).
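To see why the trailing dot helps, look at the resolver configuration inside a pod. The values below are typical defaults rather than anything from this particular cluster:
kubectl exec -it <some-pod> -- cat /etc/resolv.conf
# nameserver 10.96.0.10
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5
#
# With ndots:5, any name with fewer than five dots ("database.example.com")
# is expanded through every search domain before being tried as-is, while
# "database.example.com." is sent upstream directly.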

DNS problem on AWS EKS when running in private subnets

I have an EKS cluster setup in a VPC. The worker nodes are launched in private subnets. I can successfully deploy pods and services.
However, I'm not able to perform DNS resolution from within the pods. (It works fine on the worker nodes, outside the container.)
Troubleshooting using https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ results in the following from nslookup (timeout after a minute or so):
Server: 172.20.0.10
Address 1: 172.20.0.10
nslookup: can't resolve 'kubernetes.default'
When I launch the cluster in an all-public VPC, I don't have this problem. Am I missing any necessary steps for DNS resolution from within a private subnet?
Many thanks,
Daniel
I feel like I have to give this a proper answer, because coming upon this question was the answer to 10 straight hours of debugging for me. As @Daniel said in his comment, the issue I found was my ACL blocking outbound traffic on UDP port 53, which Kubernetes uses to resolve DNS records.
The process was especially confusing for me because one of my pods actually worked the entire time, since (I think?) it happened to be in the same zone as the Kubernetes DNS resolver.
To elaborate on the comment from @Daniel, you need:
an ingress rule for UDP port 53
an ingress rule for UDP on ephemeral ports (e.g. 1025–65535)
I hadn't added (2) and was seeing CoreDNS receiving requests and trying to respond, but the response wasn't getting back to the requester.
Some tips for others dealing with these kinds of issues: turn on CoreDNS logging by adding the log plugin to its ConfigMap, which I was able to do with kubectl edit configmap -n kube-system coredns. See the CoreDNS docs: https://github.com/coredns/coredns/blob/master/README.md#examples. This can help you figure out whether the issue is CoreDNS receiving the queries or sending the responses back.
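For reference, the change itself is a single log line in the Corefile held by that ConfigMap; a hedged sketch, since your Corefile will differ by version and cluster:
.:53 {
    log        # add this line to log every query and its response code
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30
}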
I ran into this as well. I have multiple node groups, and each one was created from a CloudFormation template. The CloudFormation template created a security group for each node group that allowed the nodes in that group to communicate with each other.
The DNS error resulted from Pods running in separate node groups from the CoreDNS Pods, so the Pods were unable to reach CoreDNS (network communications were only permitted within node groups). I will make a new CloudFormation template for the node security group so that all the node groups in my cluster can share the same security group.
I resolved the issue for now by allowing inbound UDP traffic on port 53 for each of my node group security groups.
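A hedged sketch of such a rule with the AWS CLI; both security group IDs are placeholders, with --source-group being the other node group's security group:
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol udp \
  --port 53 \
  --source-group sg-0fedcba9876543210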
So I had been struggling with this issue as well for a couple of hours, I think; I lost track of time.
Since I am using the default VPC, but with the worker nodes inside the private subnet, it wasn't working.
I went through amazon-vpc-cni-k8s and found the solution.
We have to set the environment variable AWS_VPC_K8S_CNI_EXTERNALSNAT=true on the aws-node DaemonSet.
You can either get the new YAML and apply it, or just fix it through the dashboard. However, for it to work you have to restart the worker node instances so that the IP route tables are refreshed.
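A sketch of making that change with kubectl instead of editing the YAML by hand (aws-node is the standard CNI DaemonSet in kube-system):
kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_EXTERNALSNAT=true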
The issue link is here.
Thanks!
Re: AWS EKS Kube Cluster and Route53 internal/private Route53 queries from pods
Just wanted to post a note on what we needed to do to resolve our issues. Note that YMMV; everyone has different environments, resolutions, etc.
Disclaimer:
We're using the community Terraform EKS module to deploy/manage the VPCs and the EKS clusters. We didn't need to modify any security groups. We are working with multiple clusters, regions, and VPCs.
ref:
Terraform EKS module
CoreDNS Changes:
We have a DNS relay for private internal resolution, so we needed to modify the coredns ConfigMap and add in the DNS relay IP address:
...
ec2.internal:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
foo.dev.com:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
foo.stage.com:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
...
VPC DHCP option sets:
Update the option set with the IP of the relay server above, if applicable. This requires creating a new option set, as existing ones cannot be modified.
Our DHCP options set looks like this:
["AmazonProvidedDNS", "10.1.1.245", "169.254.169.253"]
ref: AWS DHCP Option Sets
Route-53 Updates:
Associate every route53 zone with the VPC-ID that you need to associate it with (where our kube cluster resides and the pods will make queries from).
there is also a terraform module for that:
https://www.terraform.io/docs/providers/aws/r/route53_zone_association.html
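For completeness, a hedged AWS CLI equivalent of that association; the hosted zone ID, region, and VPC ID are placeholders:
aws route53 associate-vpc-with-hosted-zone \
  --hosted-zone-id Z1234567890ABC \
  --vpc VPCRegion=us-east-1,VPCId=vpc-0123456789abcdef0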
We had run into a similar issue where DNS resolution timed out on some of the pods, but re-creating the pod a couple of times resolved the problem. Also, it was not every pod on a given node showing issues, only some pods.
It turned out to be due to a bug in version 1.5.4 of Amazon VPC CNI, more details here -- https://github.com/aws/amazon-vpc-cni-k8s/issues/641.
A quick solution is to revert to the recommended version 1.5.3: https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
Like many others, I've been struggling with this bug for a few hours.
In my case the issue was this bug https://github.com/awslabs/amazon-eks-ami/issues/636, which basically sets up an incorrect DNS cluster IP when you specify the endpoint and certificate but not the DNS cluster IP.
To confirm, check
That you have connectivity (NACLs and security groups) allowing DNS on both TCP and UDP. For me, the best way was to SSH into a node in the cluster and see if names resolve (nslookup). If they don't resolve, it is most likely either a NACL or an SG issue; also check that the DNS nameserver on the node is configured correctly.
If you get name resolution on the node but not inside the pod, check that the nameserver in the pod's /etc/resolv.conf points to an IP in your service network (if you see 172.20.0.10, your service network should be 172.20.0.0/24 or so).
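A quick way to run those two checks from outside, assuming the pod's image ships nslookup; the pod name and namespace are placeholders:
kubectl exec -n <namespace> <my-pod> -- cat /etc/resolv.conf
kubectl exec -n <namespace> <my-pod> -- nslookup kubernetes.default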

How to get the real ip in the request in the pod in kubernetes

I have to get the real IP from the request for my business logic. Actually, I get 10.2.100.1 every time in my test environment. Is there any way to do this?
This is the same question as GCE + K8S - Accessing referral IP address and How to read client IP addresses from HTTP requests behind Kubernetes services?.
The answer, copied from them, is that this isn't yet possible in the released versions of Kubernetes.
Services go through kube-proxy, which answers the client connection and proxies through to the backend (your web server). The address that you'd see would be the IP of whichever kube-proxy the connection went through.
Work is being actively done on a solution that uses iptables as the proxy, which will cause your server to see the real client IP.
Try getting the IP of the Service that is associated with those pods.
One very roundabout way right now is to set up an HTTP liveness probe and watch the IP it originates from. Just be sure to also respond to it appropriately or it'll assume your pod is down.
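A minimal sketch of that probe, with an illustrative path and port; the handler has to keep returning 200, and logging the request's remote address in that handler shows which source IP the pod actually sees:
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 30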