What are the ECS agent ports? - amazon-ecs

Question
Which ports does the ECS agent use? Amazon ECS Container Agent Configuration refers to ECS_RESERVED_PORTS. Are these the ports the ECS agent listens on, and therefore the ones that need to be open in the ECS EC2 security group?
ECS_RESERVED_PORTS
Example values: [22, 80, 5000, 8080]
Default value on Linux: [22, 2375, 2376, 51678, 51679, 51680]
Default value on Windows: [53, 135, 139, 445, 2375, 2376, 3389, 5985, 51678, 51679]

These are ports that are either expected to already be in use on the instance, or that would cause confusion if they were reused. You can find some of the port definitions here and here. A sample agent configuration is shown after the list below.
22 (Linux) - Used for SSH
53 (Windows) - Used by the DNS client
135 (Windows) - Used by Windows RPC
139 (Windows) - Used by NetBIOS
445 (Windows) - Used by SMB
2375, 2376 - Used for exposing the Docker API over TCP and with TLS
3389 (Windows) - Used by Remote Desktop
5985 (Windows) - Used by WinRM
51678, 51679, 51680 - Used by the ECS agent for various APIs
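For reference, a minimal sketch of how these values are typically set on a Linux container instance via the agent configuration file (the cluster name is a placeholder):
# /etc/ecs/ecs.config
ECS_CLUSTER=my-cluster
ECS_RESERVED_PORTS=[22, 2375, 2376, 51678, 51679, 51680]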

ECS Container Agent does not require inbound ports to be open
Hence no inbound security group rules are required for the ECS container agent.
AWS ECS Developer Guide - Setting Up with Amazon ECS - Create a Security Group
Amazon ECS container instances do NOT require any inbound ports to be open. However, you might want to add an SSH rule so you can log into the container instance and examine the tasks with Docker commands. You can also add rules for HTTP and HTTPS if you want your container instance to host a task that runs a web server. Container instances do require external network access to communicate with the Amazon ECS service endpoint. Complete the following steps to add these optional security group rules.
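As a rough illustration with the AWS CLI (the IDs and CIDR below are placeholders), such a security group needs no inbound rules at all, since a newly created security group already allows all outbound traffic by default; an SSH rule is added only if you want to log in:
# Security group for ECS container instances: no inbound rules required
aws ec2 create-security-group \
  --group-name ecs-container-instances \
  --description "ECS container instances" \
  --vpc-id vpc-0abc1234
# Optional: allow SSH from your admin network only
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 22 \
  --cidr 203.0.113.0/24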
I wish AWS used better terminology. An ECS "container instance" is an EC2 instance, not a Docker container. Using "container instance" is confusing because it can be read as either a Docker container or an EC2 instance; calling it an "ECS EC2 instance" would remove the ambiguity. The same goes for "API Gateway Lambda proxy integration" versus "API Gateway Lambda integration", whose names give no clue about the difference between them, etc.

Related

Connect ECS task with Service Discovery

Here is my situation: I have a docker-compose file with several containers configured on the same network.
All the containers need to be able to communicate with each other, and some of them have an environment variable that must be set to the endpoint of another container, like in this example:
services:
  containerA:
    image: imageA:0.0.1
    ports:
      - "8080"
    networks:
      - net1
  containerB:
    image: imageB:0.0.1
    environment:
      - BRIDGE=http://containerA:8080
    networks:
      - net1
networks:
  net1:
I now need to translate this docker-compose file into ECS services (one task for containerA and one task for containerB).
I'm using ecs-cli without any problem: I can create a cluster, run the services on Fargate, run all the tasks inside the same VPC using the same security group, and I have enabled Service Discovery for all the ECS services using the same namespace (so I have containerA.namespace1 and containerB.namespace1).
But I have a problem with the connection between these two tasks:
I tried to set:
BRIDGE: http://containerA:8080
BRIDGE: http://containerA.namespace1:8080
BRIDGE: http://containerA.namespace1.local
BRIDGE: http://containerA.namespace1.local:8080
but none of these options work.
I tried a temporary workaround using the public IP generated for task A, but if I update task A the public IP (rightly) changes, and then I also need to redeploy task B.
So the question is: how can I use a hostname so that I connect to the name of the service rather than to the public IP of the task?
Thanks for any suggestion.
[Update based on the thread] The answer originally provided below assumes you are open to evaluating an alternative tool to deploy your compose file on ECS.
It is very likely that, if you use the native docker compose integration with ECS, you can bring up your compose file and achieve what you want without doing anything special.
You can read more about how to do that here. BTW the service discovery mechanism you are using is identical to the one used by the application in that blog post, so I am fairly confident it will work for you.
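As a rough sketch of what that workflow looks like with the Docker Compose CLI ECS integration (the context name is a placeholder; see the linked post for the exact steps):
# Create an ECS context (prompts for AWS credentials/profile), then deploy the compose file
docker context create ecs myecscontext
docker context use myecscontext
docker compose up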

ECS+NLB does not support dynamic ports, hence only 1 task per EC2 instance?

Please confirm whether these statements are true, or point me to the official AWS documentation that describes how to use dynamic port mapping with NLB and run multiple copies of the same task on an ECS EC2 instance. I am not using Fargate.
ECS+NLB does NOT support dynamic port mapping, hence
ECS+NLB can only allow 1 task (docker container) per EC2 instance in an ECS service
This is because:
AWS ECS Developer Guide - Creating a Load Balancer only mentions that ALB can use dynamic ports, and does not mention NLB.
Application Load Balancers offer several features that make them attractive for use with Amazon ECS services:
* Application Load Balancers allow containers to use dynamic host port mapping (so that multiple tasks from the same service are allowed per container instance).
The ECS task creation page clearly states that dynamic ports are for ALB.
Network Load Balancer for inter-service communication quotes a response from the AWS support:
"However, I would like to point out that there is currently an ongoing issue with the NLB functionality with ECS, mostly seen with dynamic port mapping where the container is not able to stabilize due to health check errors, I believe the error you're seeing is related to that issue. I can only recommend that you use the ALB for now, as the NLB is still quite new so it's not fully compatible with ECS yet."
Updates
Found a document stating that NLB supports dynamic ports. However, if I switch from ALB to NLB, the ECS service does not work. When I log into an EC2 instance, the ECS agent is running but no Docker container is running.
If someone has managed to make ECS (EC2 launch type) + NLB work, please describe step by step how it was done.
Amazon ECS Developer Guide - Service Load Balancing - Load Balancer Types - NLB
Network Load Balancers support dynamic host port mapping. For example, if your task's container definition specifies port 80 for an NGINX container port, and port 0 for the host port, then the host port is dynamically chosen from the ephemeral port range of the container instance (such as 32768 to 61000 on the latest Amazon ECS-optimized AMI). When the task is launched, the NGINX container is registered with the Network Load Balancer as an instance ID and port combination, and traffic is distributed to the instance ID and port corresponding to that container. This dynamic mapping allows you to have multiple tasks from a single service on the same container instance.
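For reference, dynamic host port mapping is requested in the container definition by setting the host port to 0 in bridge mode; a minimal task-definition excerpt along those lines (family, image, and memory values are placeholders):
{
  "family": "nginx-dynamic-port",
  "networkMode": "bridge",
  "containerDefinitions": [
    {
      "name": "nginx",
      "image": "nginx:latest",
      "memory": 128,
      "essential": true,
      "portMappings": [
        { "containerPort": 80, "hostPort": 0, "protocol": "tcp" }
      ]
    }
  ]
}
The service is then created with a target group attached to the load balancer, and the agent registers each task's dynamically chosen host port with that target group.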

Google Cloud Build deploy to GKE Private Cluster

I'm running a Google Kubernetes Engine with the "private-cluster" option.
I've also defined "authorized Master Network" to be able to remotely access the environment - this works just fine.
Now I want to set up some kind of CI/CD pipeline using Google Cloud Build -
after successfully building a new docker image, this new image should be automatically deployed to GKE.
When I first fired off the new pipeline, the deployment to GKE failed - the error message was something like: "Unable to connect to the server: dial tcp xxx.xxx.xxx.xxx:443: i/o timeout".
As I had the "authorized master networks" option under suspicion for being the root cause for the connection timeout, I've added 0.0.0.0/0 to the allowed networks and started the Cloud Build job again - this time everything went well and after the docker image was created it was deployed to GKE. Good.
The only problem that remains is that I don't really want to allow the whole internet to access my Kubernetes master - that's a bad idea, isn't it?
Is there a more elegant solution that still narrows down access with master authorized networks while also allowing deployments via Cloud Build?
It's currently not possible to add Cloud Build machines to a VPC. Similarly, Cloud Build does not announce the IP ranges of its build machines. So you can't do this today without creating an SSH bastion instance or a proxy instance on GCE within that VPC.
I suspect this will change soon. GCB existed before GKE private clusters, and private clusters are still a beta feature.
We ended up doing the following:
1) Remove the deployment step from cloudbuild.yaml
2) Install Keel inside the private cluster and give it pub/sub editor privileges in the cloud builder / registry project
Keel will monitor changes in images and deploy them automatically based on your settings.
This has worked out great: we now get SHA-tagged image updates deployed automatically, without adding VMs or running any kind of bastion/SSH host.
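For illustration, a rough sketch of the kind of policy you attach to a workload so Keel redeploys it on new pushes (the names and image are placeholders, and the exact keel.sh keys should be double-checked against Keel's documentation):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                    # hypothetical workload
  labels:
    keel.sh/policy: force         # redeploy whenever the tracked tag is pushed again
  annotations:
    keel.sh/trigger: poll         # optional; with pub/sub access Keel can react to registry push events instead
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: gcr.io/my-project/my-app:latest   # placeholder image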
Updated answer (02/22/2021)
Unfortunately, while the below method works, IAP tunnels suffer from rate-limiting, it seems. If there are a lot of resources deployed via kubectl, then the tunnel times out after a while. I had to use another trick, which is to dynamically whitelist Cloud Build IP address via Terraform, and then to apply directly, which works every time.
Original answer
It is also possible to create an IAP tunnel inside a Cloud Build step:
- id: kubectl-proxy
  name: gcr.io/cloud-builders/docker
  entrypoint: sh
  args:
    - -c
    - docker run -d --net cloudbuild --name kubectl-proxy
      gcr.io/cloud-builders/gcloud compute start-iap-tunnel
      bastion-instance 8080 --local-host-port 0.0.0.0:8080 --zone us-east1-b &&
      sleep 5
This step starts a background Docker container named kubectl-proxy on the cloudbuild network, which is shared by all of the other Cloud Build steps. The Docker container establishes an IAP tunnel using the Cloud Build service account identity. The tunnel connects to a GCE instance with a SOCKS or an HTTPS proxy pre-installed on it (an exercise left to the reader).
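One possible way to provide that proxy, which is my own assumption rather than part of the original answer, is to reuse the bastion's own sshd as a SOCKS server via SSH dynamic port forwarding (host and port are illustrative):
# On bastion-instance (e.g. in a startup script), after adding the local user's
# public key to its own authorized_keys so the host can SSH to itself:
ssh -f -N -D 0.0.0.0:8080 localhost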
Inside subsequent steps, you can then access the cluster simply as
- id: setup-k8s
  name: gcr.io/cloud-builders/kubectl
  entrypoint: sh
  args:
    - -c
    - HTTPS_PROXY=socks5://kubectl-proxy:8080 kubectl apply -f config.yml
The main advantages of this approach compared to the others suggested above:
No need to have a "bastion" host with a public IP - kubectl-proxy host can be entirely private, thus maintaining the privacy of the cluster
Tunnel connection relies on default Google credentials available to Cloud Build, and as such there's no need to store/pass any long-term credentials like an SSH key
I got cloudbuild working with my private GKE cluster following this google document:
https://cloud.google.com/architecture/accessing-private-gke-clusters-with-cloud-build-private-pools
This allows me to use Cloud Build and Terraform to manage a GKE cluster with authorized network access to the control plane enabled. I considered trying to maintain a ridiculous whitelist, but that would ultimately defeat the purpose of using authorized network access control to begin with.
I would note that Cloud Build private pools are generally slower than non-private pools. This is due to the serverless nature of private pools. I have not experienced the rate limiting that others have mentioned so far.
Our workaround was to add steps to the CI/CD pipeline to whitelist Cloud Build's IP via the master authorized networks setting.
Note: Additional permission for the Cloud Build service account is needed
Kubernetes Engine Cluster Admin
In cloudbuild.yaml, add the whitelist step before the deployment(s).
This step fetches the Cloud Build worker's external IP and then updates the cluster settings:
# Authorize Cloud Build to Access the Private Cluster (Enable Control Plane Authorized Networks)
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  id: 'Authorize Cloud Build'
  entrypoint: 'bash'
  args:
    - -c
    - |
      apt-get install dnsutils -y &&
      cloudbuild_external_ip=$(dig @resolver4.opendns.com myip.opendns.com +short) &&
      gcloud container clusters update my-private-cluster --zone=$_ZONE --enable-master-authorized-networks --master-authorized-networks $cloudbuild_external_ip/32 &&
      echo $cloudbuild_external_ip
Since Cloud Build has been whitelisted, deployments will proceed without the i/o timeout error.
This removes the complexity of setting up a VPN or private worker pools.
Disable control plane authorized networks again after the deployment:
# Disable Control Plane Authorized Networks after Deployment
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  id: 'Disable Authorized Networks'
  entrypoint: 'gcloud'
  args:
    - 'container'
    - 'clusters'
    - 'update'
    - 'my-private-cluster'
    - '--zone=$_ZONE'
    - '--no-enable-master-authorized-networks'
This approach works well even in cross-project / cross-environment deployments.
Update: I suppose this won't hold up at production scale for the same reason as #dinvlad's update above, i.e., rate limiting in IAP. I'll leave my original post here because it does solve the network connectivity problem and illustrates the underlying networking mechanism.
Furthermore, even if we don't use it for Cloud Build, my method provides a way to tunnel from my laptop to a private K8s master node. Therefore, I can edit K8s YAML files on my laptop (e.g., using VS Code) and immediately execute kubectl from my laptop, rather than having to ship the code to a bastion host and execute kubectl inside it. I find this a big boost to development productivity.
Original answer
================
I think I might have an improvement to the great solution provided by #dinvlad above.
I think the solution can be simplified without installing an HTTP Proxy Server. Still need a bastion host.
I offer the following Proof of Concept (without HTTP Proxy Server). This PoC illustrates the underlying networking mechanism without involving the distraction of Google Cloud Build (GCB). (When I have time in the future, I'll test out the full implementation on Google Cloud Build.)
Suppose:
I have a GKE cluster whose master node is private, e.g., having an IP address 10.x.x.x.
I have a bastion Compute Engine instance named my-bastion. It has only a private IP, not an external one. The private IP is within the master authorized networks CIDR of the GKE cluster. Therefore, from within my-bastion, kubectl works against the private GKE master node. Because my-bastion doesn't have an external IP, my home laptop connects to it through IAP.
My laptop at home, with my home internet public IP address, doesn't readily have connectivity to the private GKE master node above.
The goal is for me to execute kubectl on my laptop against that private GKE cluster. From a network architecture perspective, my home laptop is in the same position as the Google Cloud Build server.
Theory: Knowing that gcloud compute ssh (and the associated IAP) is a wrapper for SSH, the SSH Dynamic Port Forwarding should achieve that goal for us.
Practice:
## On laptop:
LAPTOP~$ kubectl get ns
^C <<<=== Without setting anything up, this hangs (no connectivity to GKE).
## Set up SSH Dynamic Port Forwarding (SOCKS proxy) from laptop's port 8443 to my-bastion.
LAPTOP~$ gcloud compute ssh my-bastion --ssh-flag="-ND 8443" --tunnel-through-iap
In another terminal of my laptop:
## Without using the SOCKS proxy, this returns my laptop's home public IP:
LAPTOP~$ curl https://checkip.amazonaws.com
199.xxx.xxx.xxx
## Using the proxy, the same curl command above now returns a different IP address,
## i.e., the IP of my-bastion.
## Note: Although my-bastion doesn't have an external IP, I have a GCP Cloud NAT
## for its subnet (for purpose unrelated to GKE or tunneling).
## Anyway, this NAT is handy as a demonstration for our curl command here.
LAPTOP~$ HTTPS_PROXY=socks5://127.0.0.1:8443 curl -v --insecure https://checkip.amazonaws.com
* Uses proxy env variable HTTPS_PROXY == 'socks5://127.0.0.1:8443' <<<=== Confirming it's using the proxy
...
* SOCKS5 communication to checkip.amazonaws.com:443
...
* TLSv1.2 (IN), TLS handshake, Finished (20): <<<==== successful SSL handshake
...
> GET / HTTP/1.1
> Host: checkip.amazonaws.com
> User-Agent: curl/7.68.0
> Accept: */*
...
< Connection: keep-alive
<
34.xxx.xxx.xxx <<<=== Returns the GCP Cloud NAT'ed IP address for my-bastion
Finally, the moment of truth for kubectl:
## On laptop:
LAPTOP~$ HTTPS_PROXY=socks5://127.0.0.1:8443 kubectl --insecure-skip-tls-verify=true get ns
NAME STATUS AGE
default Active 3d10h
kube-system Active 3d10h
It is now possible to create a pool of VMs that are connected to your private VPC and can be accessed from Cloud Build.
Quickstart
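For completeness, a rough sketch of that setup (pool name, region, project, and network are placeholders; it assumes the Service Networking peering from the quickstart is already in place, so double-check the flags against the current gcloud docs):
# Create a private worker pool peered with your VPC
gcloud builds worker-pools create my-private-pool \
  --region=us-east1 \
  --peered-network=projects/my-project/global/networks/my-vpc
The build then selects the pool in cloudbuild.yaml:
options:
  pool:
    name: 'projects/my-project/locations/us-east1/workerPools/my-private-pool'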

Azure Container Service with Kubernetes - Containers not able to reach Internet

I created an ACS (Azure Container Service) cluster using Kubernetes by following this link: https://learn.microsoft.com/en-us/azure/container-service/container-service-kubernetes-windows-walkthrough and I deployed my .NET 4.5 app by following this link: https://learn.microsoft.com/en-us/azure/container-service/container-service-kubernetes-ui . My app needs to access Azure SQL and other resources that are part of other resource groups in my account, but my container is not able to make any outbound network calls - either inside Azure or to the internet. I opened some ports to allow outbound connections, but that is not helping either.
When I create an ACS cluster, does it come with a gateway or should I create one? How can I configure ACS so that it allows outbound network calls?
Thanks,
Ashok.
Outbound internet access works from an Azure Container Service (ACS) Kubernetes Windows cluster if you are connecting to IP addresses outside the range 10.0.0.0/16 (that is, you are not connecting to another service on your VNET).
Before Feb 22, 2017 there was a bug where internet access was not available.
Please try the latest deployment from ACS-Engine: https://github.com/Azure/acs-engine/blob/master/docs/kubernetes.windows.md, and open an issue there if you still see this, and we (Azure Container Service) can help you debug.
For communication with services running inside the cluster, you can use kube-dns, which allows you to access a service by its name. You can find more details at https://kubernetes.io/docs/admin/dns/
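For illustration, the DNS names kube-dns provides follow a fixed pattern (the service and namespace names here are placeholders); from PowerShell inside a pod you could check resolution like this:
# A Service "my-backend" in namespace "default" resolves as:
#   my-backend                              (same namespace)
#   my-backend.default                      (namespace-qualified)
#   my-backend.default.svc.cluster.local    (fully qualified)
Resolve-DnsName my-backend.default.svc.cluster.local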
For external communication (internet), there is no need to create a gateway or anything similar. By default, containers inside a pod can make outbound connections. To verify this, you can run PowerShell in one of your containers and try to run
wget http://www.google.com -OutFile testping.txt
Get-Content testping.txt
and see if it works.
To run PowerShell, SSH to your master node (instructions here) and run:
kubectl exec -it <pod_name> -- powershell

TCP Ports for managing clusters

What outgoing TCP ports are needed to fully manage a Service Fabric cluster in Azure? I was aware of 19080 being needed to access the Service Fabric Explorer but then today I discovered that 19000 is needed to publish to a cluster. This makes me wonder if there are other ports.
I need to make an official request to my IT department to open up outgoing TCP Ports and I want to be sure I cover everything in one request. Are there other ports I should be aware of?
The default port for connecting to the cluster from Visual Studio or PowerShell is 19000, but it can be changed when you create the cluster, using either the Azure portal or ARM template deployments.
Port 19080 is used by the Service Fabric Explorer.
There are no other ports used by Service Fabric itself for cluster management. Your applications can be configured to use additional ports, but that is something you control.
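For reference, both management ports appear as node type properties in the cluster's ARM template, so you can confirm or change them there. A rough fragment along these lines (property names as I recall them from the Microsoft.ServiceFabric/clusters schema, values shown are the defaults - verify against the current schema):
"nodeTypes": [
  {
    "name": "nt1vm",
    "clientConnectionEndpointPort": 19000,
    "httpGatewayEndpointPort": 19080,
    "applicationPorts": { "startPort": 20000, "endPort": 30000 },
    "ephemeralPorts": { "startPort": 49152, "endPort": 65534 },
    "isPrimary": true,
    "vmInstanceCount": 5
  }
]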