ECS Service - Automating deploy with new Docker image

I want to automate the deployment of my application by having my ECS service launch with the latest Docker image. From what I've read, the way to deploy a new image version is as follows:
1. Create a new task revision (after updating the image on your Docker repository).
2. Update the service and specify the new revision.
This seems to work, but I want to do it all through the CLI so I can script it. #2 seems easy enough to do through the AWS CLI with update-service, but I don't see a way to do #1 without specifying the entire task JSON all over again to register-task-definition (my JSON will include credentials in environment variables, so I want to keep that in as few places as possible).
Is this how I should be automating deployment of my ECS Service updates? And if so, is there a "good" way to have the Task Definition launch a new revision (i.e. without duplicating everything)?

Yes, that is the correct approach.
And no, with the current API, you can't register a new revision of an existing task definition without duplicating it.
If you didn't use the CLI to generate the original task definition (or don't want to reuse the original commands that generated it), you could try something like the following through the CLI:
# Fetch the current task definition for the family
OLD_TASK_DEF=$(aws ecs describe-task-definition --task-definition <task_family_name>)
# Pull out the container definitions and swap in the new image
NEW_CONTAINER_DEFS=$(echo "$OLD_TASK_DEF" | jq '.taskDefinition.containerDefinitions' | jq '.[0].image="<new_image_name>"')
# Register the result as a new revision of the same family
aws ecs register-task-definition --family <task_family_name> --container-definitions "'$(echo $NEW_CONTAINER_DEFS)'"
This isn't 100% secure, as the last command's --container-definitions argument (which includes "environment" entries) will still be visible to other processes via ps. One of the AWS SDKs would give better peace of mind.
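If you want to stay CLI-only, one way to hedge the ps exposure (a sketch, not from the original answer) is to keep the definitions off the command line entirely; the AWS CLI accepts file:// values for JSON parameters:
# Write the container definitions to a private temp file instead of passing them inline
TMP=$(mktemp)
echo "$NEW_CONTAINER_DEFS" > "$TMP"
aws ecs register-task-definition --family <task_family_name> --container-definitions "file://$TMP"
rm -f "$TMP"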

The answer provided by Matt Callanan did not work for me: I received an error on this part:
--container-definitions "'$(echo $NEW_CONTAINER_DEFS)'"
Resulted in: Error parsing parameter '--container-definitions': Expected: '=', received: ''' for input:
'{ environment: [ { etc etc....
What I did to resolve it was:
TASK_FAMILY=<task family name>
DOCKER_IMAGE=<new_image_name>
LATEST_TASK_DEFINITION=$(aws ecs describe-task-definition --task-definition ${TASK_FAMILY})
echo $LATEST_TASK_DEFINITION \
| jq '{containerDefinitions: .taskDefinition.containerDefinitions, volumes: .taskDefinition.volumes}' \
| jq '.containerDefinitions[0].image='\"${DOCKER_IMAGE}\" \
> /tmp/tmp.json
aws ecs register-task-definition --family ${TASK_FAMILY} --cli-input-json file:///tmp/tmp.json
I take both the containerDefinitions and volumes elements from the original JSON document because my container definition mounts those volumes; the volumes part is not needed if you don't use volumes.
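To complete the deployment (step 2 from the question), point the service at the new revision. A sketch, with <your_cluster> and <your_service> as placeholders: if you pass only the family name, ECS resolves it to the latest ACTIVE revision.
aws ecs update-service --cluster <your_cluster> --service <your_service> --task-definition ${TASK_FAMILY}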

#!/bin/bash
SERVICE_NAME="your service name"
IMAGE_VERSION="v_"${BUILD_NUMBER}
TASK_FAMILY="your task defination name"
CLUSTER="your cluster name"
REGION="your region"
echo "=====================Create a new task definition for this build==========================="
sed -e "s;%BUILD_NUMBER%;${BUILD_NUMBER};g" taskdef.json > ${TASK_FAMILY}-${IMAGE_VERSION}.json
echo "=================Resgistring the task defination==========================================="
aws ecs register-task-definition --family ${TASK_FAMILY} --cli-input-json file://${TASK_FAMILY}-${IMAGE_VERSION}.json --region ${REGION}
echo "================Update the service with the new task definition and desired count================"
TASK_REVISION=$(aws ecs describe-task-definition --task-definition ${TASK_FAMILY} --region ${REGION} | jq '.taskDefinition.revision')
DESIRED_COUNT=$(aws ecs describe-services --cluster ${CLUSTER} --services ${SERVICE_NAME} --region ${REGION} | jq '.services[].desiredCount')
if [ "${DESIRED_COUNT}" = "0" ]; then
    DESIRED_COUNT="1"
fi
echo "===============Updating the service=============================================================="
aws ecs update-service --cluster ${CLUSTER} --service ${SERVICE_NAME} --task-definition ${TASK_FAMILY}:${TASK_REVISION} --desired-count ${DESIRED_COUNT} --region ${REGION}
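Side note (not part of the original script): if the task definition points at a mutable tag such as :latest, recent AWS CLI versions let you skip the re-registration step entirely and just force the service to redeploy, re-pulling the image:
aws ecs update-service --cluster ${CLUSTER} --service ${SERVICE_NAME} --force-new-deployment --region ${REGION}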

Related

Github Workflow: Unable to process file command 'env' successfully

I'm using a GitHub workflow to automate some actions for AWS. I haven't changed anything for a while, as the script has been working nicely for me. Recently I've been getting this error whenever the workflow runs: Unable to process file command 'env' successfully. I've got no idea why this is happening. Any help or pointers would be greatly appreciated. Thanks. Here's the workflow which is outputting the error:
- name: "Get AWS Resource values"
id: get_aws_resource_values
env:
SHARED_RESOURCES_ENV: ${{ github.event.inputs.shared_resources_workspace }}
run: |
BASTION_INSTANCE_ID=$(aws ec2 describe-instances \
--filters "Name=tag:env,Values=$SHARED_RESOURCES_ENV" \
--query "Reservations[*].Instances[*].InstanceId" \
--output text)
RDS_ENDPOINT=$(aws rds describe-db-instances \
--db-instance-identifier $SHARED_RESOURCES_ENV-rds \
--query "DBInstances[0].Endpoint.Address" \
--output text)
echo "rds_endpoint=$RDS_ENDPOINT" >> $GITHUB_ENV
echo "bastion_instance_id=$BASTION_INSTANCE_ID" >> $GITHUB_ENV
The multiline value comes from the instance ID query expression (Reservations[*].Instances[*].InstanceId) in your aws ec2 describe-instances command: with --output text it prints one line per matching instance. It could also be that the command produced a single-line string before you started to receive this error, and that changed at some point (for example, a second instance now matches the tag filter).
In GitHub Actions, multiline strings for environment variables and outputs need to be created with a different, heredoc-style syntax.
For the bastion instance ID you should set the environment variable like this:
echo "bastion_instance_id<<EOF" >> $GITHUB_ENV
echo "$BASTION_INSTANCE_ID" >> $GITHUB_ENV
echo "EOF" >> $GITHUB_ENV
The RDS endpoint should not be a problem, since DBInstances[0].Endpoint.Address yields a single-line string.
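If you actually expect exactly one bastion instance, an alternative (a sketch, not from the original answer) is to narrow the query so the value stays single-line in the first place:
# Return at most one instance ID by indexing instead of wildcarding
BASTION_INSTANCE_ID=$(aws ec2 describe-instances \
  --filters "Name=tag:env,Values=$SHARED_RESOURCES_ENV" "Name=instance-state-name,Values=running" \
  --query "Reservations[0].Instances[0].InstanceId" \
  --output text)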

SumoLogic dashboards - how do I automate?

I am getting some experience with SumoLogic dashboards and alerting. I would like to have all possible configuration in code. Does anyone have experience with automation of SumoLogic configuration? At the moment I am using Ansible for general server and infra provisioning.
Thanks for all info!
Best Regards,
Rafal.
(The dashboards, alerts, etc. are referred to as Content in Sumo Logic parlance)
You can use the Content Management API, especially the content-import-job. I am not an expert in Ansible, but I am not aware of any way to plug that API into Ansible.
Also there's a community Terraform provider for Sumo Logic and it supports content:
resource "sumologic_content" "test" {
parent_id = "%s"
config =
{
"type": "SavedSearchWithScheduleSyncDefinition",
"name": "test-333",
"search": {
"queryText": "\"warn\"",
"defaultTimeRange": "-15m",
[...]
Disclaimer: I am currently employed by Sumo Logic
Below is a shell script to import dashboards. This example targets the Sumo Logic AU instance (https://api.au.sumologic.com/api); the API endpoint differs by region.
Note: you can export all of your dashboards as JSON files.
#!/usr/bin/env bash
set -e
# if you are using AWS parameter store
# accessKey=$(aws ssm get-parameter --name path_to_your_key --with-decryption --query 'Parameter.Value' --region=ap-southeast-2 | tr -d \")
# accessSecret=$(aws ssm get-parameter --name path_to_your_secret --with-decryption --query 'Parameter.Value' --region=ap-southeast-2 | tr -d \")
# yourDashboardFolderName="xxxxx" # this is the folder id in the sumologic where you want to create dashboards
# if you are using just a key and secret
accessKey="your_sumologic_key"
accessSecret="your_sumologic_secret"
yourDashboardFolderName="xxxxx" # this is the folder id in the sumologic
# place all of your dashboard json files in the ./Sumologic/Dashboards folder
for f in $(find ./Sumologic/Dashboards -name '*.json'); do
  # POST each file to the import endpoint of the target folder
  curl -X POST https://api.au.sumologic.com/api/v2/content/folders/$yourDashboardFolderName/import \
    -H "Content-Type: application/json" \
    -u "$accessKey:$accessSecret" \
    -d @"$f"
done
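Exporting works through the same Content Management API. A sketch, assuming the v2 export endpoints (start an asynchronous export job, poll its status, then fetch the result); the contentId value is hypothetical:
contentId="your_dashboard_content_id"
# Start an asynchronous export job for the content item
jobId=$(curl -s -X POST "https://api.au.sumologic.com/api/v2/content/${contentId}/export" \
  -u "$accessKey:$accessSecret" | jq -r .id)
# Poll until the job reports Success (a real script should also handle Failed)
until curl -s "https://api.au.sumologic.com/api/v2/content/${contentId}/export/${jobId}/status" \
  -u "$accessKey:$accessSecret" | jq -r .status | grep -q Success; do
  sleep 2
done
# Download the exported JSON
curl -s "https://api.au.sumologic.com/api/v2/content/${contentId}/export/${jobId}/result" \
  -u "$accessKey:$accessSecret" > dashboard.json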

Google Cloud Endpoint Error when creating service config

I am trying to configure Google Cloud Endpoints using Cloud Functions. To do so, I am following the instructions from: https://cloud.google.com/endpoints/docs/openapi/get-started-cloud-functions
I have followed the steps given and have come to the point of building the service config into a new ESPv2 Beta docker image. When I give the command:
chmod +x gcloud_build_image
./gcloud_build_image -s CLOUD_RUN_HOSTNAME \
-c CONFIG_ID -p ESP_PROJECT_ID
After replacing the hostname, config ID, and project ID, I get the following error:
> -c service-host-name-xxx -p project-id
Using base image: gcr.io/endpoints-release/endpoints-runtime-serverless:2
++ mktemp -d /tmp/docker.XXXX
+ cd /tmp/docker.5l3t
+ gcloud endpoints configs describe service-host-name-xxx.run.app --project=project-id --service=service-host-name-xxx.app --format=json
ERROR: (gcloud.endpoints.configs.describe) NOT_FOUND: Service configuration 'services/service-host-name-xxx.run.app/configs/service-host-name-xxx' not found.
+ error_exit 'Failed to download service config'
+ echo './gcloud_build_image: line 46: Failed to download service config (exit 1)'
./gcloud_build_image: line 46: Failed to download service config (exit 1)
+ exit 1
Any idea what am I doing wrong? Thanks
My bad. I repeated the steps and got it working, so I guess I made some mistake while trying it out. The document works as it states.
I had the same error. When running the script twice, it works. This means you must already have a service endpoint configured, and it does not exist yet the first time the script tries to fetch the endpoint information with:
gcloud endpoints configs describe service-host-name-xxx.run.app
What I would do (in Cloud Build) is supply some sort of "empty" container first. I used the following example at the top of my cloudbuild.yaml:
gcloud run services list \
--platform managed \
--project ${PROJECT_ID} \
--region europe-west1 \
--filter=${PROJECT_ID}-esp-svc \
--format yaml | grep . ||
gcloud run deploy ${PROJECT_ID}-esp-svc \
--image="gcr.io/endpoints-release/endpoints-runtime-serverless:2" \
--allow-unauthenticated \
--platform managed \
--project=${PROJECT_ID} \
--region=europe-west1 \
--timeout=120

How to retrieve node-pool size from a k8s cluster?

I couldn't find useful information from:
gcloud container clusters describe CLUSTER_NAME
or from
gcloud container node-pools describe POOL_NAME --cluster CLUSTER_NAME
It is easy to scale up/down using gcloud tool though:
gcloud container clusters resize [CLUSTER_NAME] --node-pool [POOL_NAME] \
--size [SIZE]
But how can I know beforehand what is the size of my node-pool?
I do not agree with the currently accepted answer, because it only gives the total size of the cluster.
The question is about node-pools, and I actually needed to find out the size of a specific pool, so here is my best shot after many hours of searching and thinking.
read -p 'Cluster name: ' CLUSTER_NAME
read -p 'Pool name: ' POOL_NAME
gcloud compute instance-groups list \
| grep "^gke-$CLUSTER_NAME-$POOL_NAME" \
| awk '{print $6}';
The gcloud command returns several columns, of which the first is the instance-group name and the sixth is the group size.
The instance-group name is predictable, which lets me filter the right lines with grep; awk then selects the sixth column.
Hope this helps someone else save some time.
For some reason I overlooked the not-so-obvious approach from Migrating workloads to different machine types:
kubectl get nodes -l cloud.google.com/gke-nodepool=$POOL_NAME -o=name \
| wc -l
You should use the following command:
gcloud container clusters describe <cluster name> --zone <zone-cluster>
Check the currentNodeCount field in the output.
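If you want just that number, a small sketch using gcloud's built-in --format option (note this is the whole-cluster count, per the objection above):
gcloud container clusters describe <cluster name> --zone <zone-cluster> \
  --format 'value(currentNodeCount)'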
Building on top of hanzo2001's answer, something like this will probably show what you need:
kubectl get nodes -L cloud.google.com/gke-nodepool | grep -v GKE-NODEPOOL | awk '{print $6}' | sort | uniq -c | sort -r
16 n2-standard-4-pool
2 preempt-custom-6
2 default-pool
There is one often overlooked/hidden feature of gcloud: filters and formats. These can get you the requested information without further awk/grep-ing.
Get the current size/node count of a specific node-pool:
gcloud compute instance-groups list \
--filter "name:gke-<cluster>-<nodepool>-*" \
--format 'value(size)'
Example for cluster test and node-pool default-pool:
$ gcloud compute instance-groups list --filter "name:gke-test-default-pool-*" \
--format 'value(size)'
2
To get the current size/node count of every node-pool in the cluster, you can go further:
CLUSTER=test
for nodepool in $(gcloud container node-pools list --cluster $CLUSTER --format="value(name)"); do
echo -n "${nodepool}: "
gcloud compute instance-groups list \
--filter "name:gke-${CLUSTER}-${nodepool}-*" \
--format 'value(size)'
done
Other relevant resources:
gcloud topic filters - gcloud filters reference
gcloud topic formats - gcloud formats reference
GCP blog post on filtering and formatting

Google cloud's gcloud compute instance create gives error "The resource projects/{ourID}/global/images/family/debian-8 was not found"

We are using a server I created on Google Cloud Platform to create and manage our other servers. But when trying to create a new server from the Linux command line with gcloud compute instances create, we receive the following error:
marco#ans-mgmt-01:~/gcloud$ ./create_gcloud_instance.sh app-tst-04 tst,backend-server,bootstrap home-tst 10.20.22.104
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- The resource 'projects/REMOVED_OUR_PROJECTID/global/images/family/debian-8' was not found
Our script looks like this:
#!/bin/bash
if [ "$#" -ne 4 ]; then
echo "Usage: create_gcloud_instance <instance_name> <tags> <subnet_name> <server_ip>"
exit 1
fi
set -e
INSTANCE_NAME=$1
TAGS=$2
SERVER_SUBNET=$3
SERVER_IP=$4
gcloud compute --project "REMOVED OUR PROJECT ID" instances create "$INSTANCE_NAME" \
--zone "europe-west1-c" \
--machine-type "f1-micro" \
--network "cloudnet" \
--subnet "$SERVER_SUBNET" \
--no-address \
--private-network-ip="$SERVER_IP" \
--maintenance-policy "MIGRATE" \
--scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring.write","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
--service-account "default" \
--tags "$TAGS" \
--image-family "debian-8" \
--boot-disk-size "10" \
--boot-disk-type "pd-ssd" \
--boot-disk-device-name "bootdisk-$INSTANCE_NAME"
./clean_known_hosts.sh $INSTANCE_NAME
On the Google Cloud console (console.cloud.google.com) I enabled the Cloud API access scopes for the ans-mgmt-01 server and also tried to create a server from there, which works without problems.
The problem is that gcloud looks for the image family in your project, not in the debian-cloud project where it actually lives.
This can be fixed by simply adding --image-project debian-cloud.
This way, instead of looking for projects/{yourID}/global/images/family/debian-8, it will look for projects/debian-cloud/global/images/family/debian-8.
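Applied to the script from the question, the relevant change is the one extra flag next to --image-family (a minimal sketch; the other flags are unchanged):
gcloud compute instances create "$INSTANCE_NAME" \
  --zone "europe-west1-c" \
  --image-family "debian-8" \
  --image-project "debian-cloud"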
For me the problem was that debian-8 (and now debian-9) reached end of life and is no longer supported. Updating to debian-10 or debian-11 fixed the issue.
For me the problem was that debian-9 eventually reached end of life; updating to debian-10 fixed the issue.
You can run the command below to see which Debian images are available:
gcloud compute images list | grep debian
Below is sample output from the command:
NAME: debian-10-buster-v20221206
PROJECT: debian-cloud
FAMILY: debian-10
NAME: debian-11-bullseye-arm64-v20221102
PROJECT: debian-cloud
FAMILY: debian-11-arm64
NAME: debian-11-bullseye-v20221206
PROJECT: debian-cloud
FAMILY: debian-11
That should give you an idea of which image families are currently available in your own results.