Amazon ECS service access and load balancing in microservice architecture - kubernetes

Can someone explain the load balancing mechanism in AWS ECS for me? I clearly understand how inter-service communication is handled within a Kubernetes cluster: an automatic load balancer is applied when accessing a defined internal Service. This means Container/Pod scalability is simply predefined:
when Pod-1A from Service-A accesses Pod-1B
from a different Service-B (service-to-service communication),
the call is automatically load balanced across the Pods of
Service-B.
So with the service registry in Kubernetes we simply need to define Services, and communication is automatically load balanced to the available Pods within those Services.
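As a rough illustration of the behavior described above, the following Python sketch models a Service as a name mapped to a list of ready Pod endpoints, where every call goes to one endpoint chosen at random (similar in spirit to kube-proxy in iptables mode). The class `ServiceRegistry`, the name `service-b` and the IP addresses are hypothetical and are not Kubernetes APIs.

```python
import random
from collections import Counter


class ServiceRegistry:
    """Toy model: service name -> list of ready Pod endpoints (hypothetical)."""

    def __init__(self, rng=None):
        self._endpoints = {}
        self._rng = rng or random.Random()

    def register(self, service, endpoint):
        # A Pod that becomes ready is added to its Service's endpoint list.
        self._endpoints.setdefault(service, []).append(endpoint)

    def deregister(self, service, endpoint):
        # A Pod that is deleted or stops being ready is dropped again.
        self._endpoints[service].remove(endpoint)

    def resolve(self, service):
        # Each call picks one ready endpoint at random.
        endpoints = self._endpoints.get(service, [])
        if not endpoints:
            raise LookupError(f"no ready endpoints for {service}")
        return self._rng.choice(endpoints)


registry = ServiceRegistry(rng=random.Random(0))
for ip in ("10.0.1.11", "10.0.1.12", "10.0.1.13"):
    registry.register("service-b", f"{ip}:8080")

# Pod-1A calls "service-b" 3000 times; the calls spread over all three Pods.
hits = Counter(registry.resolve("service-b") for _ in range(3000))
print(hits)
```

The caller only ever uses the service name, so adding or removing Pods changes the endpoint list without any change on the calling side.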
So assuming that Pods are equivalent to Tasks and Services are equivalent to Services in AWS ECS, how is this load balancing mechanism handled within ECS? Do we really need to apply an Elastic Load Balancer manually at the task/pod level, unlike in Kubernetes? (That is, do we need to manually define a load balancer for every service to make the service and its tasks and containers scalable?)
Edit:
What is the reason in AWS ECS, to define a service which instantiates
multiple replicas of a Task, when no load balancer has been defined?
Will the traffic be routed only to the same Task replica (Container)
all the time? (No scaling at all?)
Please note, this is not about access from external IP addresses, where an ingress controller is needed. I am talking about microservices where each service exposes its own HTTP API to communicate with other services within the cluster (an internal microservice application). Typically there is an API gateway handling external traffic (ingress controller).

Related

API gateway for services running with Kubernetes?

We have all our services running with Kubernetes. We want to know what is the best practice to deploy our own API gateway, we thought of 2 solutions:
Deploy API gateways outside the Kubernetes cluster(s), e.g. with Kong. This means the clusters' ingress will connect to the external gateways. The gateways run on either VMs or physical machines, and you can scale by replicating many gateway instances.
Deploy the gateway from within Kubernetes (then maybe connect it to an external L4 load balancer), e.g. with Ambassador. However, with this approach, each cluster can only have 1 gateway. The only way to achieve fault tolerance is to actually replicate the entire K8s cluster.
What is the typical setup and what is better?
The typical setup for an API gateway in Kubernetes is either a Service of type LoadBalancer, if the cloud provider you are using supports dynamic provisioning of load balancers (all major cloud vendors such as GCP, AWS or Azure support it), or, even more commonly, an ingress controller.
Both of these options can scale horizontally, so you get fault tolerance. In fact, there is already an ingress controller built on Kong:
https://github.com/Kong/kubernetes-ingress-controller

How to deploy Kubernetes service (type LoadBalancer) on on-prem VMs?

When I use type=LoadBalancer, the external IP shows as "pending", but everything works fine with the same YAML if I deploy on GKS. My questions are:
Do we need a load balancer if I use type=LoadBalancer on on-prem VMs?
Can I assign the LoadBalancer IP manually in the YAML?
You need to set up MetalLB.
MetalLB hooks into your Kubernetes cluster, and provides a network load-balancer implementation. In short, it allows you to create Kubernetes services of type LoadBalancer in clusters that don’t run on a cloud provider, and thus cannot simply hook into paid products to provide load-balancers.
To install, run:
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
For more details, see the MetalLB installation documentation.
It might be helpful to check the Banzai Cloud Pipeline Kubernetes Engine (PKE) that is "a simple, secure and powerful CNCF-certified Kubernetes distribution" platform. It was designed to work on any cloud, VM or on bare metal nodes to provide a scalable and secure foundation for private clouds. PKE is cloud-aware and includes an ever-increasing number of cloud and platform integrations.
When I use type=LoadBalancer, the external IP shows as "pending", but everything works fine with the same YAML if I deploy on GKS.
If you create a LoadBalancer service — for example try to expose your own TCP based service, or install an ingress controller — the cloud provider integration will take care of creating the needed cloud resources, and writing back the endpoint where your service will be available. If you don't have a cloud provider integration or a controller for this purpose, your Service resource will remain in Pending state.
In case of Kubernetes, LoadBalancer services are the easiest and most common way to expose a service (redundant or not) for the world outside of the cluster or the mesh — to other services, to internal users, or to the internet.
Load balancing as a concept can happen on different levels of the OSI network model, mainly on L4 (transport layer, for example TCP) and L7 (application layer, for example HTTP). In Kubernetes, Services are an abstraction for L4, while Ingresses are a generic solution for L7 routing.
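To make the L4/L7 distinction above concrete, here is a minimal Python sketch. It is only a toy model: the function names, host names and backend names are hypothetical, and the path matching is a plain string prefix check rather than the matching rules of any real ingress controller.

```python
def l4_route(dst_port, port_table):
    # L4: only transport-level information is visible, here the TCP port.
    # The payload (HTTP host, path, headers) is never inspected.
    return port_table[dst_port]


def l7_route(host, path, rules):
    # L7: the HTTP request itself is inspected, so one listener can fan out
    # to many backends based on host name and path prefix.
    candidates = [
        (prefix, backend)
        for rule_host, prefix, backend in rules
        if rule_host == host and path.startswith(prefix)
    ]
    if not candidates:
        raise LookupError(f"no rule matches {host}{path}")
    # The longest matching prefix wins.
    return max(candidates, key=lambda item: len(item[0]))[1]


# Hypothetical configuration for illustration only.
port_table = {443: "ingress-controller", 5432: "postgres"}
rules = [
    ("shop.example.com", "/", "frontend"),
    ("shop.example.com", "/api", "orders-api"),
]

print(l4_route(443, port_table))
print(l7_route("shop.example.com", "/api/orders", rules))
print(l7_route("shop.example.com", "/index.html", rules))
```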
You need to set up MetalLB.
MetalLB is one of the most popular on-prem replacements for LoadBalancer cloud integrations. The whole solution runs inside the Kubernetes cluster.
The main component is an in-cluster Kubernetes controller which watches LB service resources, and based on the configuration supplied in a ConfigMap, allocates and writes back IP addresses from a dedicated pool for new services. It maintains a leader node for each service, and depending on the working mode, advertises it via BGP or ARP (sending out unsolicited ARP packets in case of failovers).
MetalLB can operate in two ways: either all requests are forwarded to pods on the leader node, or they are distributed across all nodes by kube-proxy.
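The allocation step described above (handing out addresses from a dedicated pool and writing them back to the Service) can be sketched roughly as follows. This is not MetalLB code: the `AddressPool` class, the service names and the address range are assumptions made purely for illustration.

```python
import ipaddress


class AddressPool:
    """Toy IP pool: hands out one address per LoadBalancer service."""

    def __init__(self, cidr):
        # All usable host addresses of the range start out free.
        self._free = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        self._assigned = {}

    def allocate(self, service):
        # A service that already has an address keeps it.
        if service in self._assigned:
            return self._assigned[service]
        # An exhausted pool means the service stays without an external IP.
        if not self._free:
            return None
        ip = self._free.pop(0)
        self._assigned[service] = ip
        return ip

    def release(self, service):
        # Deleting the service returns its address to the pool.
        ip = self._assigned.pop(service, None)
        if ip is not None:
            self._free.append(ip)


pool = AddressPool("192.168.10.240/30")  # two usable addresses: .241 and .242
for name in ("web", "api", "db"):
    print(name, pool.allocate(name) or "<pending>")
```

In this sketch the third service gets no address, which corresponds to the "pending" external IP seen when no address can be assigned.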
Layer 7 (usually HTTP/HTTPS) load balancer appliances like F5 BIG-IP, or HAProxy and Nginx based solutions may be integrated with an applicable ingress-controller. If you have such, you won't need a LoadBalancer implementation in most cases.
Hope that sheds some light on a "LoadBalancer on bare metal hosts" question.

AWS ECS Service Discovery for multiple tasks

If I have an ECS service, which can scale out, or always runs more than one task, is load balancing handled by the built-in service discovery?
I mean, if my service A is running 3 tasks, and another service B is making requests using the domain name, generated for A by service discovery, how will it decide where a request goes?
From this Medium article, it appears that Route 53 uses its own catalog of healthy instances to decide which task addresses to return for the service's name. So yes, load balancing will occur in that way.
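From the caller's point of view, DNS-based service discovery means a lookup of the service's name may return several task IP addresses, and the client (or its resolver) ends up using one of them. A minimal Python sketch of that client side, assuming a hypothetical discovery name such as `service-a.local` and a random choice among the returned addresses:

```python
import random
import socket


def resolve_task_addresses(hostname, port):
    """Return the distinct IP addresses DNS currently reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    return sorted({info[4][0] for info in infos})


def choose_target(addresses, rng=random):
    """Pick one of the resolved task addresses at random."""
    if not addresses:
        raise LookupError("service name resolved to no addresses")
    return rng.choice(addresses)


# Usage inside service B (hypothetical name, needs working DNS):
#   addresses = resolve_task_addresses("service-a.local", 8080)
#   target = choose_target(addresses)
```

Because DNS answers are cached for the record's TTL, the distribution is coarser than with a dedicated load balancer: a client that resolves once and reuses the connection keeps talking to the same task.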

K8 LB Networking

I understand what the LoadBalancer service type does, i.e. it spins up an LB instance in your cloud account, NodePorts are created, and traffic sent to the VIP is forwarded to the NodePorts.
However, how does this actually work in terms of kubectl and the LB spin-up? Is this a construct within the CNI? What part of K8s sends the request and instructs the cloud provider to create the LB?
Thanks,
In this case the CloudControllerManager is responsible for the creation. The CloudControllerManager contains a ServiceController that listens to Service Create/Update/Delete events and triggers the creation of a LoadBalancer based on the configuration of the Service.
In general in Kubernetes you have the concept of declaratively creating a Resource (such as a Service), of which the state is stored in State Storage (etcd in Kubernetes). The controllers are responsible for making sure that that state is realised. In this case the state is realised by creating a Load Balancer in a cloud provider and pointing it to the Kubernetes Cluster.
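The declarative pattern described above can be sketched as a reconcile function that compares the desired state (the stored Services) with what exists in the cloud and closes the gap. This is only an illustration of the idea, not the cloud-controller-manager's code: `FakeCloud`, `reconcile` and the addresses are hypothetical.

```python
class FakeCloud:
    """Stand-in for a cloud provider API (hypothetical)."""

    def __init__(self):
        self.load_balancers = {}
        self._next = 1

    def create_load_balancer(self, name):
        address = f"203.0.113.{self._next}"
        self._next += 1
        self.load_balancers[name] = address
        return address

    def delete_load_balancer(self, name):
        del self.load_balancers[name]


def reconcile(services, cloud):
    """Make the cloud's load balancers match the declared Services."""
    wanted = {name for name, svc in services.items()
              if svc["type"] == "LoadBalancer"}
    existing = set(cloud.load_balancers)
    # Declared but missing: create it and write the address back.
    for name in sorted(wanted - existing):
        services[name]["external_ip"] = cloud.create_load_balancer(name)
    # Present but no longer declared: clean it up.
    for name in sorted(existing - wanted):
        cloud.delete_load_balancer(name)


cloud = FakeCloud()
services = {
    "web": {"type": "LoadBalancer", "external_ip": None},
    "internal": {"type": "ClusterIP", "external_ip": None},
}
reconcile(services, cloud)
print(services["web"]["external_ip"], cloud.load_balancers)
```

Running `reconcile` again without changes does nothing, which is the property that lets a controller simply re-run it on every Service event.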

Access a specific pod from external

We have an old service discovery system that requires processes to register their ip:port during startup. On a Kubernetes cluster, we exposed a Service of type NodePort. The processes within the containers can register with the old system using their Pod IP:port + HostIP. Clients within the same Kubernetes cluster should be able to connect to the right process via the specific Pod IP:port. An external client knows the HostIP + NodePort and the specific Pod IP:port; is there an efficient way to route the client's request to the specific Pod? Running a proxy on each node to route the traffic (NodePort -> Pod) seems inefficient due to the additional proxy layer.
I guess you mean you don't want to add a Service of type NodePort as for your case that seems like an additional proxy layer. I can see how it is an additional layer in your case. Typically Kubernetes would be doing the orchestration and the Service would be part of the service-discovery mechanism. It sounds like you could use hostPort. But if you do go this route you should be aware it's not suggested practice as Kubernetes is intended for orchestration.