Question
Is NLB supported for ECS with dynamic port mapping?
Background
It looks like there are attempts to use NLB with ECS, but they run into problems with health checks.
Network Load Balancer for inter-service communication
Health check interval for Network Load Balancer Target Group
NLB Target Group health checks are out of control
When I talked with AWS, they acknowledged that the NLB documentation of the health check interval is not accurate: an NLB has multiple instances, each sending health checks independently, so the interval at which an ECS task receives a health check does not follow HealthCheckIntervalSeconds.
Also, the ECS task creation page specifically mentions ALB for dynamic port mapping.
Hence, I suppose NLB is not supported for ECS? If there is documentation which states NLB is supported for ECS, please point me to it.
Update
Why are properly functioning Amazon ECS tasks registered to ELB marked as unhealthy and replaced?
Elastic Load Balancing is repeatedly flagging properly functioning Amazon Elastic Container Service (Amazon ECS) tasks as unhealthy. These incorrectly flagged tasks are stopped and new tasks are started to replace them. How can I troubleshoot this?
change the Health check grace period to an appropriate time period for your service
A Network Load Balancer makes routing decisions at the transport layer (TCP/SSL). It can handle millions of requests per second. After the load balancer receives a connection, it selects a target from the target group for the default rule using a flow hash routing algorithm. It attempts to open a TCP connection to the selected target on the port specified in the listener configuration. It forwards the request without modifying the headers. Network Load Balancers support dynamic host port mapping.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/load-balancer-types.html#nlb
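If tasks are being cycled because health checks fire before the container is ready, one mitigation along the lines of the article above is to lengthen the service's health check grace period and loosen the target group's health check thresholds. A minimal sketch with boto3, assuming placeholder cluster, service, and target group names (not from the original question):

```python
import boto3

ecs = boto3.client("ecs")
elbv2 = boto3.client("elbv2")

# Give newly started tasks time to boot before ELB health checks can mark them unhealthy.
ecs.update_service(
    cluster="my-cluster",            # placeholder cluster name
    service="my-service",            # placeholder service name
    healthCheckGracePeriodSeconds=120,
)

# Relax the target group health check so transient slowness does not cycle tasks.
# Note: NLB target groups historically restricted the interval to 10 or 30 seconds.
elbv2.modify_target_group(
    TargetGroupArn="arn:aws:elasticloadbalancing:...:targetgroup/my-tg/abc123",  # placeholder ARN
    HealthCheckIntervalSeconds=30,
    HealthyThresholdCount=3,
    UnhealthyThresholdCount=3,
)
```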
Related
On an AWS EKS cluster, I have deployed a stateful application.
In order to load balance my application across different pods and availability zones, I have added an HAProxy Ingress Controller which uses an external AWS NLB.
I have one NLB in this cluster which points to the HAProxy Service. On top of the NLB I have created a global accelerator and I've set the NLB as its target endpoint.
My requirement is to ensure that once a user connects to the DNS of the Global Accelerator, they will always be directed to the same endpoint server, i.e. the same HAProxy pod.
The connection workflow goes like this: Client User -> Global Accelerator -> NLB -> HAProxy pod.
While searching for ways to make this work, here's what I've done:
To ensure stickiness between the NLB and its target (HAProxy pods) I have enabled stickiness on the NLB targets.
Now, when it comes to the stickiness between the Global Accelerator and the NLB, it looks like the right thing to do is to set the Global Accelerator's Client Affinity attribute to "Source IP". According to the documentation, with this setting the Global Accelerator honors client affinity by routing all connections with the same source IP address to the same endpoint group.
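For reference, a minimal boto3 sketch of those two settings, assuming placeholder ARNs for the NLB target group and the Global Accelerator listener:

```python
import boto3

elbv2 = boto3.client("elbv2")
# The Global Accelerator API is a global service served out of us-west-2.
ga = boto3.client("globalaccelerator", region_name="us-west-2")

# Enable source-IP stickiness on the NLB target group fronting the HAProxy pods.
elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:...:targetgroup/haproxy-tg/abc123",  # placeholder
    Attributes=[
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "source_ip"},
    ],
)

# Enable client affinity on the Global Accelerator listener.
ga.update_listener(
    ListenerArn="arn:aws:globalaccelerator::123456789012:accelerator/abcd/listener/0123",  # placeholder
    ClientAffinity="SOURCE_IP",
)
```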
My expectations were that with these attributes enabled, the user will always get connected to the same NLB which then connects to the same HAProxy pod.
After testing, when I connected to my application via the NLB DNS, the goal was achieved and I got a sticky connection. However, when I connect via the Global Accelerator, my session keeps crashing.
Any ideas of why that might be?
Or are there any suggestions of a different way to work with this?
This is not something that AWS supports (as of June 2022).
See this document https://aws.amazon.com/blogs/networking-and-content-delivery/updating-aws-global-accelerator-ec2-endpoints-based-on-autoscaling-group-events/
They specifically state
An example is when you want to send UDP traffic with client IP preservation to a handful of instances, with a guarantee that the same backend instances will handle requests from the same clients (client affinity). This is not possible with Application Load Balancers because they do not support UDP traffic, and Network Load Balancers do not support sticky sessions or client IP preservation with AWS Global Accelerator.
We have a gRPC application deployed in a cluster (v 1.17.6) with Istio (v 1.6.2) setup. The cluster has istio-ingressgateway set up as the edge LB, with SSL termination. The istio-ingressgateway is fronted by an AWS ELB (classic LB) in passthrough mode. This setup is fully functional and the traffic flows as intended, in general. So the setup looks like:
ELB => istio-ingressgateway => virtual service => app service => [(envoy)pods]
We are running load tests on this setup using GHZ (ghz.sh), running external to the application cluster. From the tests we’ve run, we have observed that each of the app containers seems to get about 300 RPS routed to it, no matter the configuration of the GHZ test. For reference, we have tried various combinations of --concurrency and --connection settings for the tests. This ~300 RPS is lower than what we expect from the app and hence requires a lot more pods to provide the required throughput.
We are really interested in understanding the details of the physical connection (gRPC/HTTP2) setup in this case, all the way from the ELB to the app/envoy, and the details of the load balancing being done. Of particular interest is the case when the same client, e.g. GHZ, opens up multiple connections (specified via the --connection option). We have looked at Kiali and it doesn’t give us the appropriate visibility.
Questions:
How can we get visibility into the physical connections being set up from the ingress gateway to the pod/proxy?
How is the “per request gRPC” load balancing happening?
What options might exist to optimize the various components involved in this setup?
Thanks.
1. How can we get visibility into the physical connections being set up from the ingress gateway to the pod/proxy?
If Kiali doesn't show what exactly you need, maybe you could try with Jaeger?
Jaeger is an open source end to end distributed tracing system, allowing users to monitor and troubleshoot transactions in complex distributed systems.
There is Istio documentation about Jaeger.
Additionally, Prometheus and Grafana might be helpful here; take a look here.
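Another low-level option is to read the connection counters that the Envoy sidecar exposes on its admin port. A rough sketch, assuming you first port-forward the pod's admin port 15000 locally (e.g. kubectl port-forward <app-pod> 15000:15000); the stat names shown are Envoy's usual upstream/downstream connection gauges:

```python
import requests

# Assumes: kubectl port-forward <app-pod> 15000:15000 is running
STATS_URL = "http://localhost:15000/stats"

resp = requests.get(STATS_URL, timeout=5)
resp.raise_for_status()

# Print the active upstream/downstream connection gauges to see how many
# physical connections Envoy is holding open to its peers.
for line in resp.text.splitlines():
    if "upstream_cx_active" in line or "downstream_cx_active" in line:
        print(line)
```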
2. How is the “per request gRPC” load balancing happening?
As mentioned here
By default, the Envoy proxies distribute traffic across each service’s load balancing pool using a round-robin model, where requests are sent to each pool member in turn, returning to the top of the pool once each service instance has received a request.
If you want to change the default round-robin model you can use a Destination Rule for that. Destination rules let you customize Envoy’s traffic policies when calling the entire destination service or a particular service subset, such as your preferred load balancing model, TLS security mode, or circuit breaker settings.
There is Istio documentation about that.
More about load balancing in Envoy here.
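As an illustration, a DestinationRule that switches the pool from round-robin to least-connections could be applied with the Kubernetes Python client roughly like this; the host, namespace, and rule name are placeholders, not taken from the original setup:

```python
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# DestinationRule switching Envoy's load balancing from the default
# round-robin to least-connections for the app service.
destination_rule = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "DestinationRule",
    "metadata": {"name": "grpc-app-lb", "namespace": "default"},   # placeholder names
    "spec": {
        "host": "grpc-app.default.svc.cluster.local",              # placeholder service host
        "trafficPolicy": {"loadBalancer": {"simple": "LEAST_CONN"}},
    },
}

api.create_namespaced_custom_object(
    group="networking.istio.io",
    version="v1beta1",
    namespace="default",
    plural="destinationrules",
    body=destination_rule,
)
```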
3. What options might exist to optimize the various components involved in this setup?
I'm not sure if there is anything to optimize in istio components, maybe some custom configuration in Destination Rule?
Additional Resources:
itnext.io
medium.com
programmaticponderings.com
I'm having an issue with load balancing persistent tcp connections to my kubernetes replicas.
I have Unity3D clients outside of the kubernetes cluster.
My cluster is a baremetal cluster with metallb installed composed out of 3 nodes: 1 master and 2 workers.
As I have read there are two approaches:
1) client connects to all replicas and each time it needs to send a request it will do so on a random connection out of those that it has previously established. Periodically, it refreshes connections (in case autoscale happened or some of the persistent connections died).
The problem here is that I'm not sure how to access all replicas externally; headless services cannot be exposed externally.
2) service mesh ? I have vaguely read/understood that they might establish persistent tcp on your behalf. So something like this :
unity3d client <----persistent connection ---> controller <---persistent connection----> replicas
However, I'm not sure how to accomplish this and I'm not sure what will happen if the controller itself fails, will all the clients get their connections dropped ? As I see it, it will come down to the same issue as the one from 1), which is allowing a client to connect to multiple different replicas at the same time with a persistent TCP connection.
Part of question comes as a complement to this : https://learnk8s.io/kubernetes-long-lived-connections
In order to enable external traffic to your cluster you need an Ingress Gateway. Your ingress gateway could be the standard nginx Ingress, a gateway provided by a mesh like the Istio Gateway or a more specialized edge gateway like ambassador, traefik, kong, gloo, etc.
There are at least two ways you can perform load balancing in K8s:
Using a Service resource which is just a set of iptables rules managed by the kube-proxy process. This is L4 load balancing only. No L7 application protocols like HTTP2 or gRPC are supported. Depending on your case, this type of LB might not be ideal for long lived connections as connections will rarely be closed.
Using the L7 load balancing offered by any of the ingress controllers which will skip the iptables routing (using a headless Service) and allow for more advanced load balancing algorithms.
In order to benefit from the latter case you still need to ensure that connections are eventually terminated which is often done from the client to the proxy (while reusing connections from the proxy to the upstream). I'm not familiar with Unity3D connections but if terminating them is not an option you won't be able to do much load balancing after all.
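For illustration, a headless Service of the kind mentioned in the second option (clusterIP: None, so DNS returns individual pod IPs that an L7 proxy can balance across directly) might be created with the Kubernetes Python client roughly as follows; the service name, selector, and port are placeholders:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Headless Service: no cluster IP, so DNS resolves to the individual pod IPs
# and an L7 proxy can balance across them directly instead of via iptables.
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="game-backend-headless"),      # placeholder name
    spec=client.V1ServiceSpec(
        cluster_ip="None",
        selector={"app": "game-backend"},                            # placeholder selector
        ports=[client.V1ServicePort(port=7777, target_port=7777)],   # placeholder TCP port
    ),
)
v1.create_namespaced_service(namespace="default", body=service)
```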
When the controller fails, connections will be dropped and your client could either gracefully re-attempt the connection or panic. It depends on how you code it.
Please confirm whether these are true, or please point to the official AWS documentation that describes how to use dynamic port mapping with NLB and run multiple tasks of the same service on an ECS EC2 instance. I am not using Fargate.
ECS+NLB does NOT support dynamic port mapping, hence
ECS+NLB can only allow 1 task (docker container) per EC2 instance in an ECS service
This is because:
AWS ECS Developer Guide - Creating a Load Balancer only mentions ALB as being able to use dynamic ports, and does not mention NLB.
Application Load Balancers offer several features that make them attractive for use with Amazon ECS services:
* Application Load Balancers allow containers to use dynamic host port mapping (so that multiple tasks from the same service are allowed per container instance).
The ECS task creation page clearly states that dynamic port mapping is for ALB.
Network Load Balancer for inter-service communication quotes a response from the AWS support:
"However, I would like to point out that there is currently an ongoing issue with the NLB functionality with ECS, mostly seen with dynamic port mapping where the container is not able to stabilize due to health check errors, I believe the error you're seeing is related to that issue. I can only recommend that you use the ALB for now, as the NLB is still quite new so it's not fully compatible with ECS yet."
Updates
Found a document stating that NLB supports dynamic ports. However, if I switch from ALB to NLB, the ECS service does not work. When I log into an EC2 instance, the ECS agent is running but no docker container is running.
If someone has managed to make ECS (EC2 launch type) + NLB work, please provide the steps for how it was done.
Amazon ECS Developer Guide - Service Load Balancing - Load Balancer Types - NLB
Network Load Balancers support dynamic host port mapping. For example, if your task's container definition specifies port 80 for an NGINX container port, and port 0 for the host port, then the host port is dynamically chosen from the ephemeral port range of the container instance (such as 32768 to 61000 on the latest Amazon ECS-optimized AMI). When the task is launched, the NGINX container is registered with the Network Load Balancer as an instance ID and port combination, and traffic is distributed to the instance ID and port corresponding to that container. This dynamic mapping allows you to have multiple tasks from a single service on the same container instance.
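A minimal boto3 sketch of that setup, with placeholder cluster, family, and target group names; the key details are hostPort 0 in the container definition and an instance-type target group attached to the service:

```python
import boto3

ecs = boto3.client("ecs")

# Container definition with hostPort 0: ECS picks a host port from the
# ephemeral range, and the task is registered with the NLB target group
# as an (instance ID, port) pair.
ecs.register_task_definition(
    family="nginx-nlb-demo",                      # placeholder family name
    networkMode="bridge",
    containerDefinitions=[
        {
            "name": "nginx",
            "image": "nginx:latest",
            "memory": 256,
            "portMappings": [{"containerPort": 80, "hostPort": 0, "protocol": "tcp"}],
        }
    ],
)

# Service wired to an NLB target group (target type "instance").
ecs.create_service(
    cluster="my-cluster",                          # placeholder cluster name
    serviceName="nginx-nlb-demo",
    taskDefinition="nginx-nlb-demo",
    desiredCount=2,
    loadBalancers=[
        {
            "targetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/nlb-tg/abc",  # placeholder ARN
            "containerName": "nginx",
            "containerPort": 80,
        }
    ],
)
```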
I've followed the steps from Microsoft to create a Multi-Node On-Premises Service Fabric cluster. I've deployed a stateless app to the cluster and it seems to be working fine. When I have been connecting to the cluster I have used the IP Address of one of the nodes. Doing that, I can connect via Powershell using Connect-ServiceFabricCluster nodename:19000 and I can connect to the Service Fabric Explorer website (http://nodename:19080/explorer/index.html).
The examples online suggest that if I hosted in Azure I can connect to http://mycluster.eastus.cloudapp.azure.com:19000 and it resolves, however I can't work out what the equivalent is on my local. I tried connecting to my sample cluster: Connect-ServiceFabricCluster sampleCluster.domain.local:19000 but that returns:
WARNING: Failed to contact Naming Service. Attempting to contact Failover Manager Service...
WARNING: Failed to contact Failover Manager Service, Attempting to contact FMM...
False
WARNING: No such host is known
Connect-ServiceFabricCluster : No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS issue.
Am I missing something in my setup? Should there be a central DNS entry somewhere that allows me to connect to the cluster? Or am I trying to do something that isn't supported On-Premises?
Yup, you're missing a load balancer.
This is the best resource I could find to help, I'll paste relevant contents in the event of it becoming unavailable.
Reverse Proxy — When you provision a Service Fabric cluster, you have an option of installing Reverse Proxy on each of the nodes on the cluster. It performs the service resolution on the client’s behalf and forwards the request to the correct node which contains the application. In majority of the cases, services running on the Service Fabric run only on the subset of the nodes. Since the load balancer will not know which nodes contain the requested service, the client libraries will have to wrap the requests in a retry-loop to resolve service endpoints. Using Reverse Proxy will address the issue since it runs on each node and will know exactly on what nodes is the service running on. Clients outside the cluster can reach the services running inside the cluster via Reverse Proxy without any additional configuration.
Source: Azure Service Fabric is amazing
I have an Azure Service Fabric resource running, but the same rules apply. As the article states, you'll need a reverse proxy/load balancer to resolve not only what nodes are running the API, but also to balance the load between the nodes running that API. So, health probes are necessary too so that the load balancer knows which nodes are viable options for sending traffic to.
As an example, Azure creates 2 rules off the bat:
1. LBHttpRule on TCP/19080 with a TCP probe on port 19080 every 5 seconds with a 2 count error threshold.
2. LBRule on TCP/19000 with a TCP probe on port 19000 every 5 seconds with a 2 count error threshold.
What you need to add to make this forward-facing is a rule that forwards port 80 to your service's HTTP port. Then the health probe can be an HTTP probe that hits a path and expects a 200 response.
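A rough sketch of adding such a rule and probe with the azure-mgmt-network SDK; the subscription, resource group, load balancer name, backend port, and health path are all placeholders, and the SDK version is assumed to be a recent (track 2) release:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import Probe, LoadBalancingRule, SubResource

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RESOURCE_GROUP = "my-sf-rg"             # placeholder resource group
LB_NAME = "my-sf-lb"                    # placeholder load balancer name

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
lb = client.load_balancers.get(RESOURCE_GROUP, LB_NAME)

# HTTP probe that expects a 200 from the service's health path.
lb.probes.append(Probe(
    name="AppHttpProbe",
    protocol="Http",
    port=8080,                          # placeholder service port
    request_path="/health",             # placeholder health-check path
    interval_in_seconds=5,
    number_of_probes=2,
))

# Rule forwarding public port 80 to the service's HTTP port on the nodes.
lb.load_balancing_rules.append(LoadBalancingRule(
    name="AppHttpRule",
    protocol="Tcp",
    frontend_port=80,
    backend_port=8080,
    frontend_ip_configuration=SubResource(id=lb.frontend_ip_configurations[0].id),
    backend_address_pool=SubResource(id=lb.backend_address_pools[0].id),
    probe=SubResource(id=f"{lb.id}/probes/AppHttpProbe"),
))

client.load_balancers.begin_create_or_update(RESOURCE_GROUP, LB_NAME, lb).result()
```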
Once you get into the cluster, you can resolve the services normally and SF will take care of availability.
In Azure-land, this is abstracted again to using something like API Management to further reverse proxy it to SSL. What a mess but it works.
Once your load balancer is set up, you'll have a single IP to hit for management, publishing, and regular traffic.