Remove ECS task for forensic analysis

I use Amazon ECS for my services. ECS autoregisters the containers with the ELB. That is all fine.
However, if one of my applications gets into a failure mode, I want to take it out of the ELB but keep it running so I can do forensics on it.
If I remove it from the ELB manually, ECS will just add it back in.
Auto Scaling groups have a "standby" state; I guess I am looking for something similar in ECS.
Is that possible somehow?
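For reference, the manual removal described above boils down to a target-group deregistration; a sketch with a hypothetical ARN and target (and, as noted, ECS will simply register the task again):

aws elbv2 deregister-targets \
  --target-group-arn arn:aws:elasticloadbalancing:eu-west-1:111122223333:targetgroup/my-tg/0123456789abcdef \
  --targets Id=10.0.1.25,Port=8080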

Related

How to deploy container to ECS+Fargate+CodeDeploy without load balancer?

I have an app that comes in two pieces:
The (Ruby/Rails) main app which is also the web frontend
A Sidekiq event handler
The first one is not a problem, done that. CodeDeploy deploys the new version by creating the new service(s) in the "blue" load balancer target group and starting to move traffic to them in a blue/green setup.
However, Sidekiq doesn't have a port, so there is no health check. This is the problem I'm having now: the NLB can't validate the health of the container, so it keeps restarting, and the deploy eventually times out because it can't get a healthy service.
I'm not sure if this is an ECS limitation or a CodeDeploy limitation (or a "me limitation" :) ), but it seems I have to have a load balancer for that to work.
But something about that just rings wrong! I'm sure AWS must have thought about "worker containers", right? How do you set up an ECS cluster with two services, each with one task - the two parts above? And how do you then use CodeDeploy to deploy to that service/task? Or do you deploy Sidekiq (in this case) in some other way?
Having two services, I can scale them individually.
At the moment, I have a CodePipeline triggering a CodeBuild to build the image and push it to ECR, then triggering CodeDeploy to do the deployment. It worked initially when I had one service with the two tasks, but then both would scale simultaneously, which isn't what I want in the end.
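For what it's worth, a worker service like the Sidekiq handler can run without any load balancer if it uses the default ECS rolling deployment controller; it is the CODE_DEPLOY (blue/green) controller that requires target groups. A minimal sketch, with hypothetical names:

aws ecs create-service \
  --cluster my-cluster \
  --service-name sidekiq-worker \
  --task-definition sidekiq:1 \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-0abc123],securityGroups=[sg-0abc123],assignPublicIp=DISABLED}" \
  --deployment-controller type=ECS
# No --load-balancers and no health-check requirement: deploys are plain rolling updates.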

How to do stress testing for AWS CPUUtilization and MemoryUtilization?

I am using AWS ECS and have created one ECS cluster, in which there is one Fargate service with two containers running behind it.
Two alarms have been created for the Fargate service with a threshold of 95.
Everything looks fine; now the testing part comes into the picture. I want to test the alarms' functionality.
Is there any easy way in AWS, using some AWS service or a manual script, to increase CPU and memory usage so that I can test the alarm functionality?
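One low-effort way, if ECS Exec is enabled on the service, is to open a shell in a running task and generate the load by hand; a sketch, with placeholder names, using the stress-ng tool (which may need to be installed in the image):

# Open a shell inside the running container (requires ECS Exec enabled on the service):
aws ecs execute-command \
  --cluster <cluster-name> \
  --task <task-id> \
  --container <container-name> \
  --interactive \
  --command "/bin/sh"

# Inside the container, load all CPU cores and hold ~90% of memory for 10 minutes
# (install stress-ng first if needed, e.g. apk add stress-ng or apt-get install stress-ng):
stress-ng --cpu 0 --vm 1 --vm-bytes 90% --timeout 600s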

How can I update an ECS service, adding an additional load balancer that the service talks to, without downtime?

We use Terraform to manage our AWS resources and have the code to change the service from one to two load balancers.
However, Terraform wants to destroy the service prior to recreating it. The AWS CLI docs indicate the reason: the API can only set load balancers during service creation, not on update.
create-service
update-service
It seems we need a blue/green deploy, with the one-LB and two-LB services existing at the same time on the same cluster. I expect we'll need to create multiple task sets and the rest of the blue/green approach prior to this change (we'd planned for this anyway, just not at this moment).
Does anyone have a good example for this scenario or know of any other approaches beyond full blue/green deployment?
Alas, it is not possible to change the number of LBs during an update. The service must be destroyed and recreated.
Ideally, one would be doing blue/green deploys with multiple ECS clusters and a set of LBs. Then cluster A could have the old service and cluster B the new service, allowing traffic to move from A to B as we go from blue to green.
We aren't quite there yet but plan to be soon. So, for now, we will do the classic parking lot switch approach:
In this example, the service that needs to go from one LB to two LBs is called target_service.
1. Clone target_service to become target_service2.
2. Deploy microservices that can talk to either target_service or target_service2.
3. Verify that target_service and target_service2 are both handling incoming data and requests.
4. Modify the target_service infrastructure-as-code to go from one LB to two.
5. Deploy the modified target_service (the Terraform deployment will destroy target_service, leaving target_service2 to cover the gap, and then deploy target_service with two LBs).
6. Verify that target_service with two LBs works and handles requests.
7. Destroy and remove target_service2, as it is no longer needed.
So, this is a blue-green like deploy albeit less elegant.
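For completeness, a quick way to confirm that both services are stable at each step of the switch (cluster name hypothetical):

aws ecs describe-services \
  --cluster my-cluster \
  --services target_service target_service2 \
  --query 'services[].{name:serviceName,desired:desiredCount,running:runningCount,status:status}'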
To update an Amazon ECS service with multiple load balancers, you need to ensure that you are using version 2 of the AWS CLI. Then, you can run the following command:
aws ecs update-service \
--cluster <cluster-name> \
--service <service-name> \
--load-balancers "[{\"containerName\": \"<container-name>\", \"containerPort\": <container-port>, \"targetGroupArn\": \"<target-group-arn1>\"}, {\"containerName\": \"<container-name>\", \"containerPort\": <container-port>, \"targetGroupArn\": \"<target-group-arn2>\"}]"
In the above command, replace <cluster-name> with the name of your ECS cluster, <service-name> with the name of your ECS service, <container-name> with the name of the container in your task definition, <container-port> with the port number the container listens on, and <target-group-arn1> and <target-group-arn2> with the ARNs of your target groups.
Update 2022
ECS now supports updating the load balancers of a service. For now, this can be achieved using the AWS CLI, AWS SDKs, or AWS CloudFormation (not from the AWS Console). From the documentation:
When you add, update, or remove a load balancer configuration, Amazon ECS starts new tasks with the updated Elastic Load Balancing configuration, and then stops the old tasks when the new tasks are running.
So, you don't need to create a new service. The AWS update regarding this is here. Please refer to the UpdateService API doc for how to make the request.

Does AWS CDK provide constructs to deploy ECS applications?

We generally use blue/green and rolling deployment strategies to deploy and update Docker containers on ECS container instances.
The Ansible ECS modules allow implementing such deployment strategies with the modules below:
https://docs.ansible.com/ansible/latest/modules/ecs_taskdefinition_module.html
https://docs.ansible.com/ansible/latest/modules/ecs_task_module.html
https://docs.ansible.com/ansible/latest/modules/ecs_service_module.html
Does AWS CDK provide such constructs for implementing deployment strategies?
CDK supports higher-level constructs for ECS called "ECS patterns". One of them is ApplicationLoadBalancedFargateService, which allows you to define an ECS Fargate service behind an Application Load Balancer. Rolling update is supported out of the box in this case. You simply run cdk deploy with a newer Docker image (see the sketch after the list below) and ECS will take care of the deployment. It will:
Start a new task with the new Docker image.
Wait for several successful health checks of the new deployment.
Start sending new traffic to the new task, while letting the existing connections gracefully finish on the old task.
Once all old connections are done, ECS will automatically stop the old task.
If your new task does not start or is not healthy, ECS will keep running the original task.
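In practice, triggering that rolling update is nothing more than a CDK CLI invocation; a minimal sketch, assuming a stack (hypothetical name) whose container image is built from a local Dockerfile asset:

cdk diff MyFargateStack     # preview the change, including the new image asset
cdk deploy MyFargateStack   # builds and pushes the image, then rolls the ECS service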
Regarding blue/green deployment, I think it's yet to be supported in CloudFormation. Once that's done, it can be implemented in CDK. If you can live without blue/green as IaC, you can define your CodeDeploy setup manually.
Check out this npm package, which helps with blue/green deployment using CDK:
https://www.npmjs.com/package/@cloudcomponents/cdk-blue-green-container-deployment
Blue/green deployments are supported in CloudFormation now:
https://aws.amazon.com/about-aws/whats-new/2020/05/aws-cloudformation-now-supports-blue-green-deployments-for-amazon-ecs/
I don't think the CDK implementation is done yet.

Where is KOPS located/running from?

I am new to Docker and Kubernetes, though I have mostly figured out how it all works at this point.
I inherited an app that uses both, as well as KOPS.
One of the last things I am having trouble with is the KOPS setup. I know for absolute certain that Kubernetes is set up via KOPS. There are two KOPS state stores in an S3 bucket (corresponding to a dev and a prod cluster, respectively).
However while I can find the server that kubectl/kubernetes is running on, absolutely none of the servers I have access to seem to have a kops command.
Am I misunderstanding how KOPS works? Does it not do some sort of dynamic monitoring (would that just be done by a ReplicaSet by itself?), but rather just set a cluster running and be done?
I can include my cluster.spec or config files, if they're helpful to anyone, but I can't really see how they're super relevant to this question.
I guess I'm just confused - as far as I can tell from my perspective, it looks like KOPS is run once, sets up a cluster, and is done. But then whenever one of my node or master servers goes down, it is self-healing. I would expect that of the node servers, but not the master servers.
This is all on AWS.
Sorry if this is a dumb question, I am just having trouble conceptually understanding what is going on here.
kops is a command line tool, you run it from your own machine (or a jumpbox) and it creates clusters for you, it’s not a long-running server itself. It’s like Terraform if you’re familiar with that, but tailored specifically to spinning up Kubernetes clusters.
kops creates nodes on AWS via autoscaling groups. It’s this construct (which is an AWS thing) that ensures your nodes come back to the desired number.
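A typical session from a workstation, pointed at a state store like the ones mentioned in the question, might look like this (bucket and cluster names hypothetical):

export KOPS_STATE_STORE=s3://my-kops-state-bucket
kops get clusters                                      # list clusters recorded in the state store
kops edit ig nodes --name dev.k8s.example.com          # adjust the node instance group (e.g. its size)
kops update cluster --name dev.k8s.example.com --yes   # apply the change to AWS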
kops is used for managing Kubernetes clusters themselves: creating them, scaling, updating, deleting. kubectl is used for managing the container workloads that run on Kubernetes. You can create, scale, update, and delete your replica sets with that. How you run workloads on Kubernetes should have nothing to do with how/what tool you (or some cluster admin) use to manage the Kubernetes cluster itself. That is, unless you're trying to change the "system components" of Kubernetes, like the Kubernetes API or kubedns, which are cluster-admin-level concerns but happen to run on top of Kubernetes as container workloads.
As for how pods get spun up when nodes go down, that’s what Kubernetes as a container orchestrator strives to do. You declare the desired state you want, and the Kubernetes system makes it so. If things crash or fail or disappear, Kubernetes aims to reconcile this difference between actual state and desired state, and schedules desired container workloads to run on available nodes to bring the actual state of the world back in line with your desired state. At a lower level, AWS does similar things — it creates VMs and keeps them running. If Amazon needs to take down a host for maintenance it will figure out how to run your VM (and attach volumes, etc.) elsewhere automatically.