I'm trying to execute a blue/green deployment of an ECS task in AWS using the CloudFormation approach (as documented here: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/blue-green.html), and the deployment fails.
The initial stack deployment works fine, and the ECS task is deployed and running as expected with the correct load balancer, target group, etc. However, when updating the task definition to trigger a blue/green deployment, it fails with the message:
Imports and exports are currently not supported on templates using hooks
The deployment is created in CodeDeploy, so it's clearly triggered as expected, but the deployment screen in the AWS console shows the following error:
The deployment failed because the stack update that triggered this CodeDeploy deployment failed in CloudFormation. In the AWS CloudFormation console, go to the Events tab to view status and error messages.
But the puzzling thing is that the CloudFormation template does not appear to contain any imports or exports. I have even tried copying the YAML from the documented example, and it doesn't work.
I'm executing the CloudFormation updates using the Serverless Framework, but I don't think that's the issue; the error is logged in the CloudFormation stack's Events tab.
Probably not unreasonable to expect the example in the AWS documentation to work?
We did find the cause of this issue, and the problem was in fact caused by running the CloudFormation template via the Serverless Framework.
The serverless approach works for all our other AWS deployments, but the CodeDeploy transform explicitly requires that there be no outputs from the CF template. However, Serverless adds the name of the S3 bucket it uses as an output, which breaks this particular use case.
Therefore, the solution was to invoke the CF template directly from the AWS CLI, and it works perfectly.
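For reference, here is a minimal sketch of what invoking the template directly can look like, using boto3 rather than the CLI; the stack name and template path are placeholders, and the only important point is that the template passed in contains no Outputs section:

import boto3

cfn = boto3.client("cloudformation")

# Read the blue/green template. It must contain no Outputs section, since
# the CodeDeploy blue/green hook rejects templates with imports/exports/outputs.
with open("ecs-blue-green.yml") as f:   # placeholder path
    template_body = f.read()

# Updating the task definition in the stack is what triggers the
# CodeDeploy blue/green deployment.
cfn.update_stack(
    StackName="my-ecs-service",         # placeholder stack name
    TemplateBody=template_body,
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_AUTO_EXPAND"],
)

cfn.get_waiter("stack_update_complete").wait(StackName="my-ecs-service")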
I am using AWS CloudFormation to provision some resources. Part of it is to create an ECS task definition that mounts an EFS access point. A custom resource is defined in CloudFormation, backed by a Python Lambda function that runs the ECS Fargate task. However, when I create a stack from the CloudFormation template to provision all of this, the ECS task fails to mount the EFS through the access point with the following error message:
ResourceInitializationError: failed to invoke EFS utils commands to set up EFS volumes: stderr: Failed to resolve "fs-082b4402fbb9c9972.efs.us-east-1.amazonaws.com" - check that your file system ID is correct, and ensure that the VPC has an EFS mount target for this file system ID. See https://docs.aws.amazon.com/console/efs/mount-dns-name for more detail. Attempting to lookup mount target ip address using botocore. Failed to import necessary dependency botocore, please install botocore first. : unsuccessful EFS utils command execution; code: 1
I've met a similar error before, when I created the ECS task in an AZ without a mount target, but that is definitely not the case here.
If I run the ECS task manually from the console, or run the task from local Python code, there is no error at all.
Since the CloudFormation template is a set of nested templates that create the VPC and all other resources together, I am not sure whether the CloudFormation custom resource (the Lambda calling the ECS task) should have more DependsOn: resources. I have already added the mount targets and access point to DependsOn:.
I tried to separate the CloudFormation custom resource into another file so that this part is only created after all other parts of the stack are complete. However, the result is the same.
PS: I added a 300-second delay to the Lambda function that calls the ECS task, and it works normally afterwards. Then I tried to create the original stack again without the 300-second delay, and the result was also positive. I just wonder what the problem was.
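Those symptoms (a fixed delay helps, and recreating the stack later also works) suggest a race: the mount targets exist but are not yet available, or their DNS name has not propagated, when the custom resource fires. Instead of a hard-coded 300-second sleep, one option is to have the Lambda poll EFS before starting the task. This is only a sketch, and passing the file system ID into the function is an assumption about how the custom resource is wired:

import time
import boto3

efs = boto3.client("efs")

def wait_for_mount_targets(file_system_id, timeout=600, interval=15):
    """Poll until every mount target of the file system is 'available'."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        targets = efs.describe_mount_targets(FileSystemId=file_system_id)["MountTargets"]
        if targets and all(t["LifeCycleState"] == "available" for t in targets):
            return
        time.sleep(interval)
    raise TimeoutError(f"Mount targets for {file_system_id} not available in time")

# In the custom-resource Lambda, call this before ecs.run_task(...), e.g.
#   wait_for_mount_targets("fs-082b4402fbb9c9972")
# Note: even after the targets report 'available', the EFS DNS name can take a
# little longer to propagate, so a short extra delay may still be needed.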
I am new here. I tried to search for the topic before posting, and this may have been discussed before, so please let me know before being too harsh on me :)
In my project, after making changes to either the DevOps tool set or the infrastructure, we always do some manual sanity tests, which normally include:
Building a new image and updating the Helm chart
Pushing the image to Artifactory, performing a "helm update", and seeing if it runs.
I want to automate the whole thing and would like to get advice from the community. Here are the requirements:
Validate that the Jenkins agent is able to talk to the cluster (I can do this with kubectl get all -n <some_namespace_jenkins_user_has_access_to>)
Validate that the cluster has access to GitHub (let's say I am using Argo CD to sync YAMLs)
Validate that the cluster has access to Artifactory and is able to pull images (I don't want to build a new image with a new tag and update the Helm chart just to force the cluster to pull a new image)
All of the above should be doable from the command line (so that I can implement it using Jenkins Groovy)
Any suggestion is welcome.
Thanks guys
Your best bet is probably a combination of custom Jenkins scripts (i.e. running kubectl in Jenkins) and some in-cluster checks (e.g. using kuberhealthy).
So, when your Jenkins pipeline is triggered, it could do the following:
Check connectivity to the cluster
Build and push an image, etc.
Trigger in-cluster checks for testing if the cluster has access to GitHub and Artifactory, e.g. by launching a custom Job in the cluster, or creating a KuberhealthyCheck custom resource if you use kuberhealthy
During all this, the Jenkins pipeline writes the results of its tests as metrics to a Pushgateway that is scraped by your Prometheus. The in-cluster checks also push their results as metrics to the Pushgateway, or expose them via kuberhealthy, if you decide to use it. In the end, you have the results of all checks in the same Prometheus instance, where you can react to them, e.g. by creating Prometheus alerts or Grafana dashboards.
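As a rough sketch of the Pushgateway part (the gateway address, job name, and namespace below are placeholders), a small Python script run from the Jenkins pipeline could record a connectivity check like this:

import subprocess

# pip install prometheus_client
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
check_ok = Gauge(
    "jenkins_k8s_connectivity_check",
    "1 if the Jenkins agent can talk to the cluster, 0 otherwise",
    registry=registry,
)

# The actual check: can the Jenkins agent talk to the cluster?
result = subprocess.run(
    ["kubectl", "get", "all", "-n", "some-namespace"],  # placeholder namespace
    capture_output=True,
)
check_ok.set(1 if result.returncode == 0 else 0)

# Push the result to the Pushgateway that Prometheus scrapes
push_to_gateway("pushgateway.example.com:9091", job="sanity-checks", registry=registry)

The in-cluster GitHub/Artifactory checks can push their own gauges to the same Pushgateway in the same way.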
Our entire stack is automated using CloudFormation. I have created a custom rule in AWS Config that uses a configuration-change-based trigger. Sometimes I have to update the Lambda behind the Config rule after testing; this is again done via CloudFormation. But the problem is that the Config rule is not triggered, because there is no change in the configuration of the existing resources. One solution is to comment out the Config rule altogether in the CloudFormation template, deploy it, uncomment the rule, and then deploy again. Is there a better way?
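One alternative worth checking: AWS Config can be asked to re-evaluate a custom rule on demand, so after the stack update you could trigger an evaluation instead of toggling the rule in the template. A minimal sketch, with a placeholder rule name:

import boto3

config = boto3.client("config")

# Force an on-demand evaluation of the custom rule after the Lambda behind it
# has been updated, without having to change any of the tracked resources.
config.start_config_rules_evaluation(
    ConfigRuleNames=["my-custom-config-rule"]  # placeholder rule name
)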
I am studying CI/CD on AWS (CodePipeline/CodeBuild/CodeDeploy) and have found it to be a very good tool set for managing a pipeline in the cloud with everything managed (I don't even need to install Jenkins on EC2).
I am now reading about container building and deployment. For the build phase, CodeBuild supports building container images. For the deploy phase, while I could find a CodeDeploy solution for ECS clusters, it seems there is no direct CodeDeploy solution for EKS (kindly correct me if I am wrong).
May I know if there is a solution to integrate an EKS cluster (i.e. the deploy phase can fetch the Docker image from ECR or Docker Hub and deploy it to EKS)? I have come across some ideas using Lambda functions to trigger the cluster to perform a rolling update of the container image, but I could not find a step-by-step guide on this.
=========================
(Update 17 Sep 2020)
I somehow managed to create a Lambda function that triggers EKS to perform a rolling update of the k8s deployment. Thanks to Prashanna for the source base.
I just want to share the key setup steps in the process.
(1) Update the Lambda execution role to include permission to describe EKS clusters
Create a policy with EKS describe access and attach it to the role.
Policy snippet (a minimal statement granting eks:Describe*; scope the Resource as needed):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "eks:Describe*",
      "Resource": "*"
    }
  ]
}
Or you can create an "EKSFullAccess" policy and attach it to the Lambda execution role.
(2) Update the k8s aws-auth ConfigMap, adding the Lambda execution role ARN to the mapRoles section. The corresponding k8s role/group should be one that has permission to update the container images used by the k8s deployment (say, system:masters).
You can edit the map with a command like the one below:
kubectl edit -n kube-system configmap/aws-auth
You don't have to add/update another ConfigMap even if your deployment is in another namespace; it will take effect as well.
Sample lambda function call request and response:
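The exact request and response depend on how the Lambda is written; as a rough sketch (the function name and the payload fields namespace, deployment, and image are assumptions based on the description above), invoking it with boto3 could look like this:

import json
import boto3

lambda_client = boto3.client("lambda")

# Request: pass the deployment to patch and the new image
response = lambda_client.invoke(
    FunctionName="eks-rolling-update",        # placeholder function name
    Payload=json.dumps({
        "namespace": "my-namespace",
        "deployment": "my-deployment",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:1.2.3",
    }).encode("utf-8"),
)

# Response: whatever the Lambda returns, e.g. the patch status
print(json.loads(response["Payload"].read()))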
GitLab provides built-in integration with EKS and deployment with the help of Helm charts. If you plan to use other tools, using an AWS Lambda to update the image is your best bet!
I've added my GitHub project below.
Set up a Lambda with the code below and give this Lambda RBAC access in your EKS cluster. Try invoking the Lambda by passing the required information like namespace, deployment, image, etc.
Lambda for Kubernetes image update
The Lambda requires the eks:DescribeCluster permission.
The Lambda role must be granted at least an RBAC role that can update images in the EKS cluster's RBAC setup.
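The linked project has the full code; purely for orientation, here is a generic sketch of the pattern (obtain a token for the cluster from the Lambda's IAM role, then patch the Deployment through the Kubernetes Python client). The region, cluster name, and the assumption that the container name equals the deployment name are placeholders, and the kubernetes package has to be bundled with the Lambda:

import base64
import re
import tempfile

import boto3
from botocore.signers import RequestSigner
from kubernetes import client as k8s

REGION = "us-east-1"          # placeholder region
CLUSTER = "my-eks-cluster"    # placeholder cluster name

def get_bearer_token(cluster_name, region):
    """Build an EKS auth token from the Lambda's IAM credentials (presigned STS call)."""
    session = boto3.session.Session()
    sts = session.client("sts", region_name=region)
    signer = RequestSigner(
        sts.meta.service_model.service_id,
        region, "sts", "v4",
        session.get_credentials(), session.events,
    )
    params = {
        "method": "GET",
        "url": f"https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15",
        "body": {},
        "headers": {"x-k8s-aws-id": cluster_name},
        "context": {},
    }
    signed_url = signer.generate_presigned_url(
        params, region_name=region, expires_in=60, operation_name=""
    )
    encoded = base64.urlsafe_b64encode(signed_url.encode("utf-8")).decode("utf-8")
    return "k8s-aws-v1." + re.sub(r"=*$", "", encoded)

def handler(event, context):
    # Needs eks:DescribeCluster on the Lambda role (see step 1 above)
    cluster = boto3.client("eks", region_name=REGION).describe_cluster(name=CLUSTER)["cluster"]

    # Write the cluster CA to a temp file for the Kubernetes client
    ca_file = tempfile.NamedTemporaryFile(delete=False, suffix=".crt")
    ca_file.write(base64.b64decode(cluster["certificateAuthority"]["data"]))
    ca_file.close()

    conf = k8s.Configuration()
    conf.host = cluster["endpoint"]
    conf.ssl_ca_cert = ca_file.name
    conf.api_key = {"authorization": "Bearer " + get_bearer_token(CLUSTER, REGION)}

    # Patch only the container image; Kubernetes then performs the rolling update
    apps = k8s.AppsV1Api(k8s.ApiClient(conf))
    apps.patch_namespaced_deployment(
        name=event["deployment"],
        namespace=event["namespace"],
        body={"spec": {"template": {"spec": {"containers": [
            {"name": event["deployment"], "image": event["image"]}  # assumes container name == deployment name
        ]}}}},
    )
    return {"status": "image update requested"}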
Since there's no built-in CI/CD for EKS at the moment, this is going to be a showcase of success/failure stories of 3rd-party CI/CDs in EKS :) My take: https://github.com/fluxcd/flux
Pros:
Quick to set up initially (until you get into multiple teams/environments)
Tracks and deploys image releases out of the box
Possibility to split what to auto-deploy in dev/prod using regex. E.g. all versions to dev, only minor to prod. Or separate tag prefixes for dev/prod.
All state is in git - a good practice to start with
Cons:
Gets complex for further pipeline expansion, e.g. blue-green, canary, auto-rollbacks, etc.
The dashboard is proprietary (a Weaveworks product)
Not for on-demand parametrized job runs like traditional CIs.
Setup:
Set up an automated image build (it looks like you've already figured that out)
Set up Flux and helm-operator in the cluster and point them to your "gitops repo"
For each app, create a HelmRelease object that describes a regex of the image tags to track
Done. A newly published image tag that matches the regex will be auto-deployed to the cluster, and the new version is committed to the gitops repo.
Is it possible to run Kubernetes from source (./hack/local-up-cluster.sh) and still properly configure the cloud provider with this type of setup? For example, an instance is running on AWS EC2 and all prerequisites are met, including the proper exports, AWS CLI, and configs, but I keep getting an error stating that the cloud provider was not found. KUBERNETES_PROVIDER=aws, the zone is set to us-west-2a, etc...
Failed to get AWS Cloud Provider. plugin.host.GetCloudProvider returned <nil> instead
I don't think hack/local-up-cluster.sh is designed to be run on a cloud provider. However, cluster/kube-up.sh is designed to work when building from source:
$ make release
$ export KUBERNETES_PROVIDER=aws
$ cluster/kube-up.sh # Uses the release built in step 1
There are lots of options which can be configured, and you can find more details here (just ignore the part about https://get.k8s.io).