I am using an Nginx Ingress Controller in a Kubernetes cluster. I've got an application within the cluster which was previously reachable directly over the internet. Now I'm accessing the application through the Ingress Controller, with the intent of showing some custom error pages.
When I access the application (which I did not write myself, so I can't change anything there), it sees the IP address of the nginx-ingress-controller pod as the source. The logs of the nginx-ingress-controller pod show that the real remote address is a different one.
I've already tried options such as use-proxy-protocol; with that I would be able to use $remote_addr and get the right IP. But as mentioned I can't change my application, so I would have to "trick" the ingress controller into using $remote_addr as its own source address.
How can I configure the ingress so the application sees the request coming from the remote IP and not from the nginx-ingress-controller pod IP? Is there a way to do this?
Edit: I'm using a bare-metal Kubernetes installation with Kubernetes v1.19.2 and the ingress-nginx Helm chart 3.29.0.
This is not achievable with a layer 7 ingress controller.
If the Ingress preserved the source IP, the response would go directly from the app pod to the client, so the client would get a response from an IP:port different from the one it connected to. Or, even worse, the client's NAT would drop the response completely because it doesn't match any existing connection.
You can take a look at this similar question on Stack Overflow with an accepted answer:
As an ingress is an above-layer-4 proxy, there is no way to preserve the source IP at the layer 3 IP level. The best you can do, and I think the Nginx Ingress already does this by default, is to put the "X-Forwarded-For" header on every forwarded HTTP request.
Your app is supposed to read and log the X-Forwarded-For header.
You can try a workaround by following this article; it could help you preserve the client IP.
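For reference, here is a minimal sketch of what such a bare-metal workaround typically looks like, assuming the controller is exposed through a NodePort (or MetalLB-backed LoadBalancer) Service; the names and labels follow the defaults of the ingress-nginx chart and may differ in your release. Note that the backend pod will still see the controller pod's IP as the TCP source; only X-Forwarded-For carries the original client address.

```yaml
# Sketch only: expose the ingress-nginx controller with
# externalTrafficPolicy: Local so the original client IP is preserved
# up to the controller (traffic is only routed to controller pods on
# the node that received it).
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: NodePort                 # or LoadBalancer, e.g. with MetalLB on bare metal
  externalTrafficPolicy: Local
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/component: controller
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
---
# Controller ConfigMap: have nginx trust and forward the client address
# in X-Forwarded-For; the application must read that header.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  use-forwarded-headers: "true"
  compute-full-forwarded-for: "true"
```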
I also recommend this very good article about load balancing and proxying. You will also learn a bit about load balancing on L7:
L7 load balancing and the OSI model
As I said above in the section on L4 load balancing, using the OSI model for describing load balancing features is problematic. The reason is that L7, at least as described by the OSI model, itself encompasses multiple discrete layers of load balancing abstraction. e.g., for HTTP traffic consider the following sublayers:
Optional Transport Layer Security (TLS). Note that networking people argue about which OSI layer TLS falls into. For the sake of this discussion we will consider TLS L7.
Physical HTTP protocol (HTTP/1 or HTTP/2).
Logical HTTP protocol (headers, body data, and trailers).
Messaging protocol (gRPC, REST, etc.).
My architecture looks like this:
Here, HTTPS requests first go to Route 53 for DNS resolution; Route 53 points the client at the Network Load Balancer. The NLB then directs the traffic to HAProxy pods running inside a Kubernetes cluster.
The HAProxy servers need to read a specific request header and, based on its value, route the traffic to a backend. To keep things simple I have shown a single Kubernetes backend cluster, but assume there is more than one such backend cluster running.
Considering this architecture:
What is the best place to perform TLS termination? Should we do it at the NLB (green box) or implement it at HAProxy (orange box)?
What are the advantages and disadvantages of each scenario?
Since you are using an NLB you can also achieve end-to-end HTTPS; however, that forces the backend service to handle TLS as well.
You can terminate at the LB level if you have multiple LBs backed by clusters; leveraging AWS Certificate Manager with the LB is an easy way to manage certificates across multiple setups.
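As a rough illustration of terminating TLS at the NLB, here is a minimal sketch of a Service of type LoadBalancer for the HAProxy pods using the in-tree AWS annotations; the ACM certificate ARN, labels and ports are placeholders, and the exact annotations depend on which AWS load balancer integration you run.

```yaml
# Sketch only: the NLB terminates TLS on port 443 using an ACM
# certificate and forwards plain TCP to the HAProxy pods.
apiVersion: v1
kind: Service
metadata:
  name: haproxy
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    # Placeholder ARN - replace with your own ACM certificate.
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:REGION:ACCOUNT_ID:certificate/CERT_ID"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "tcp"
spec:
  type: LoadBalancer
  selector:
    app: haproxy
  ports:
    - name: https
      port: 443
      targetPort: 8080    # HAProxy listens on plain TCP/HTTP here
```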
There is no guarantee that someone who gets into your network won't be able to exploit a bug and intercept traffic between services. The software-defined network (SDN) in your VPC is secure and protects against spoofing, but there is no absolute guarantee.
So there is an advantage to using TLS/SSL inside the VPC as well.
I have deployed my Kubernetes cluster on EKS. I have ingress-nginx exposed via a load balancer to route traffic to different services. In ingress-nginx the request first goes to an auth service for authentication, and if it is a valid request I allow it to move forward.
Let's say the request is in Service 1 and from there it wants to communicate with Service 2. I would like the request to go directly to the ingress, not via the load balancer, and then from the ingress to Service 2.
Is it possible to do so?
Will it help improve performance, since I bypassed the load balancer?
As the request does not move through the load balancer, load balancing won't take place; is that a serious concern?
1/ Is it possible: short answer, no.
There are edge cases, which would require someone to create another Ingress object exposing Service2 in the first place. Then you could trick the Ingress into routing you to some service that might not otherwise be reachable (if the DNS doesn't exist, some VIP was not yet exposed, ...).
There's no real issue with external clients bypassing the ELB, as long as they can't reach every port on your nodes, only the ones bound by your ingress controller.
2/ Bypassing the load balancer: won't change much in terms of performance.
If we're talking about a TCP load balancer, taking it out would help with tracking real client IPs, though. Figuring out how to swap it for an HTTP load balancer may be a better option, though not always easy.
3/ Removing the LoadBalancer: if you have several nodes hosting replicas of your ingress controller, you would still be able to do some kind of DNS-based load balancing. Though for sure, it's not the same as having a real LB.
In AWS, you could find a middle ground by setting up health-check-based Route 53 records: set one for each node hosting an ingress controller, create another grouping all healthy ingress nodes, then change your existing ingress FQDN records so they all point to your new Route 53 name. You would be able to do TCP/HTTP checks against EC2 instance IPs, which is usually good enough. But again: DNS load balancing can suffer from outdated browser caches, some ISPs not refreshing zones, ... an LB is the real thing.
I have a requirement to run multiple hiveservers as pods on a Kubernetes cluster, each serving users belonging to different AD groups. These hiveservers need to be exposed outside of the Kubernetes cluster, but each hiveserver cannot be exposed as a separate service. Ideally I would like to have a reverse proxy implemented using an ingress controller, with an ingress defined for each hiveserver, as the servers can be dynamically created and destroyed.
I see that the nginx ingress controller can be used for HTTP, but I don't see a way to make it work as a reverse proxy for Thrift-based hiveservers. I also had a look at Knox, but that seems to support HTTP transport only.
Is there a known way to set up an ingress controller as a reverse proxy fronting non-HTTP endpoints like Thrift hiveservers?
You may try to use a service mesh, if that is an option for you.
In Istio such a use case (managing TCP traffic) can be achieved with the Istio ingress gateway, which acts as an entry point for a bunch of services inside your cluster (similar to a Kubernetes Ingress, but not limited to HTTP traffic). There is even built-in support for custom protocols like the Apache Thrift protocol, which allows you to use features like rate limiting.
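As a rough sketch of what routing raw TCP through the Istio ingress gateway looks like, assuming one hiveserver exposed as an in-cluster Service named hiveserver-1 on port 10000 (all names and ports here are illustrative):

```yaml
# Sketch only: Istio Gateway accepting raw TCP on port 10000 for the
# HiveServer2 Thrift endpoint.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: hiveserver-gateway
spec:
  selector:
    istio: ingressgateway        # default Istio ingress gateway pods
  servers:
    - port:
        number: 10000
        name: tcp-thrift
        protocol: TCP
      hosts:
        - "*"
---
# VirtualService routing the TCP traffic from the gateway to one
# hiveserver Service inside the cluster.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: hiveserver-1
spec:
  hosts:
    - "*"
  gateways:
    - hiveserver-gateway
  tcp:
    - match:
        - port: 10000
      route:
        - destination:
            host: hiveserver-1.default.svc.cluster.local
            port:
              number: 10000
```

A VirtualService like this could be created and deleted together with each hiveserver, which fits the dynamic create/destroy requirement.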
I wanted to understand my load balancing options in a scenario where I want to use a single HTTPS load balancer on GCP to serve some static content from a bucket and dynamic content using a combination of a React front end and an Express backend on Kubernetes.
Additional info:
I have a domain name registered outside of Google Domains
I want to serve all content over https
I'm not starting with anything big. Just getting started with a more or less hobby type project which will attract very little traffic in the near future.
I don't mind serving my React front end and Express backend from App Engine if that helps simplify this somehow. However, in that case, if I still want something on Kubernetes, would I be able to communicate between App Engine and Kubernetes without hassle using internal IPs? And how would I load balance that traffic?
Any kind of network blueprint in the public domain that will guide me will be helpful.
I did quite a bit of reading on NodePort/LoadBalancer/Ingress, which has left me confused. From what I understand, a LoadBalancer Service does not work with HTTP(S) traffic and operates more at the TCP L4 level, so it's probably not suitable for my use case.
An Ingress provisions a dedicated load balancer of its own, on which I cannot add my own routes to a backend bucket etc., which means I may need a minimum of two load balancers, and two IPs?
NodePort exposes a port on all nodes, which means I need to handle load balancing myself, even if my HTTPS load balancer routing can somehow help.
Any guidance/pointers will be much appreciated!
EDIT: Found some information on Network Endpoint Groups (NEGs) while researching. Looks promising; I will investigate. Any thoughts about taking this route? https://cloud.google.com/kubernetes-engine/docs/how-to/standalone-neg
EDIT: Was able to get this working using a combination of NEGs and Nginx reverse proxies.
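For context, the standalone-NEG part of such a setup boils down to annotating the backing Service; a minimal sketch, assuming an Express backend Service on port 8080 (the name and port are placeholders):

```yaml
# Sketch only: the cloud.google.com/neg annotation makes GKE create
# standalone network endpoint groups (one per zone) that an external
# HTTP(S) load balancer backend service can reference directly.
apiVersion: v1
kind: Service
metadata:
  name: express-backend
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"8080": {}}}'
spec:
  type: ClusterIP
  selector:
    app: express-backend
  ports:
    - port: 8080
      targetPort: 8080
```

The load balancer, its URL map, and the backend bucket for static content are then configured on the GCP side and pointed at these NEGs.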
To resolve your concerns, please start with:
Choosing the right load balancer:
Network load balancer (Layer 4 load balancing or proxy for applications that rely on the TCP/SSL protocol): the load is forwarded to your systems based on incoming IP protocol data, such as address, port, and protocol type.
The network load balancer is a pass-through load balancer, so your backends receive the original client request. The network load balancer doesn't do any Transport Layer Security (TLS) offloading or proxying. Traffic is directly routed to your VMs.
Network load balancers terminate TLS on backends that are located in regions appropriate to your needs.
The internal HTTP(S) load balancer is a proxy-based, regional Layer 7 load balancer that enables you to run and scale your services behind a private load balancing IP address that is accessible only in the load balancer's region in your VPC network.
HTTPS and SSL Proxy load balancers terminate TLS in locations that are distributed globally.
An HTTP(S) load balancer acts as a proxy between your clients and your application. If you want to accept HTTPS requests from your clients, you have the option to use Google-managed SSL certificates (Beta) or certificates that you manage yourself.
Technical Details
When you create an Ingress object, the GKE Ingress controller configures a GCP HTTP(S) load balancer according to the rules in the Ingress manifest and the associated Service manifests. The client sends a request to the HTTP(S) load balancer. The load balancer is an actual proxy; it chooses a node and forwards the request to that node's NodeIP:NodePort combination. The node uses its iptables NAT table to choose a Pod; kube-proxy manages the iptables rules on the node. Traffic is then routed to a healthy Pod for the Service specified in your rules.
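For illustration, a minimal GKE Ingress that would provision such a load balancer might look like the following; the service name and port are placeholders:

```yaml
# Sketch only: on GKE, this Ingress causes the ingress controller to
# provision an external HTTP(S) load balancer whose URL map implements
# the rules below.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: web-frontend   # placeholder Service name
                port:
                  number: 80
```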
Per the backend buckets documentation:
An HTTP(S) load balancer can direct traffic from specified URLs to either a backend bucket or a backend service.
The bucket should be public when used with the load balancer (see Creating buckets).
During load balancer set-up you can choose a backend service and a backend bucket. You can find more information in the docs.
Please also take a look at these two tutorials, here and here, on how to build an application using Cloud Storage.
Hope this helps.
Additional resources:
Load balancers, Controllers
In this YouTube video, Brendan Burns talks about having a load balancer between each app layer. This makes good sense - and when he says load balancer, he is talking about Services, right?
The real question is: having a Service between each layer makes sense, but what about when you have a web application? Would you still need a reverse proxy like nginx as an HTTP load balancer on top of the Kubernetes Services? I can see the need to route by URL and avoid cross-domain requests, but not for balancing, since that would be handled by the Kubernetes Service, right?
Then would you have pods of nginx redirecting to other services (internal Kubernetes load balancers/services)?
Just saw this. Again, any comments are welcome.
Thanks
Yes, there are definitely use cases for which you might want a reverse proxy in front of the Kubernetes services. Experimental support for this is being added in Kubernetes version 1.1.
You can check out the design proposal here and an implementation using haproxy here.