I have a auto-scaling-group setup. When there are no running instances per that group and my application deploys, the auto-scaling-group will spin up an instance and deploy. Fantastic. ... well sorta...
If there are more than one instances in that auto-scaling-group, then my scripts might point to one instance or another.
How do I deploy to a specific instance without having to setup all the CodeDeploy application, deployment-group, send a new revision, yada, yada, yada...
Or, do you have to take all of those steps each time? How then do you track your deployments? Surely there's a better way to this?
Ideally, I would like to create an Instance based on an AMI, associate that instance with my auto-scaling-group, then deploy specifically to that instance. But I can't create-deployment to an instance, only to a deployment-group.
This is maddening.
The problem you describe can easily be solved with HashiCorp Packer.
With a packerfile you can describe the way your application is supposed to be deployed to an instance. This instance is then snapshotted and turned into an available AMI.
After which you can update your target group for your autoscaling group with a new AMI.
Documentation for Packer can be found here:
Related
We have been using Terraform for almost a year now to manage all kinds of resources on AWS from bastion hosts to VPCs, RDS and also EKS.
We are sometimes really baffled by the EKS module. It could however be due to lack of understanding (and documentation), so here it goes:
Problem: Upsizing Disk (volume)
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "12.2.0"
cluster_name = local.cluster_name
cluster_version = "1.19"
subnets = module.vpc.private_subnets
#...
node_groups = {
first = {
desired_capacity = 1
max_capacity = 5
min_capacity = 1
instance_type = "m5.large"
}
}
I thought the default value for this (dev) k8s cluster's node can easily be the default 20GBs but it's filling up fast so I know want to change disk_size to let's say 40GBs.
=> I thought I could just add something like disk_size=40 and done.
terraform plan tells me I need to replace the node. This is a 1 node cluster, so not good. And even if it were I don't want to e.g. drain nodes. That's why I thought we are using managed k8s like EKS.
Expected behaviour: since these are elastic volumes I should be able to upsize but not downsize, why is that not possible? I can def. do so from the AWS UI.
Sure with a slightly scary warning:
Are you sure that you want to modify volume vol-xx?
It may take some time for performance changes to take full effect.
You may need to extend the OS file system on the volume to use any newly-allocated space
But I can work with the provided docs on that: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html?icmpid=docs_ec2_console
Any guidelines on how to up the storage? If I do so with the UI but don't touch Terraform then my EKS state will be nuked/out of sync.
To my knowledge, there is currently no way to resize an EKS node volume without recreating the node using Terraform.
Fortunately, there is a workaround: As you also found out, you can directly change the node size via the AWS UI or API. To update your state file afterward, you can run terraform apply -refresh-only to download the latest data (e.g., the increased node volume size). After that, you can change the node size in your Terraform plan to keep both plan and state in sync.
For the future, you might want to look into moving to ephemeral nodes as (at least my) experience shows that you will have unforeseeable changes to clusters and nodes from time to time. Already planning with replaceable nodes in mind will make these changes substantially easier.
By using the terraform-aws-eks terraform module you are actually following the "ephemeral nodes" paradigm, because for both ways of creating instances (self-managed workers or managed node groups) the module is creating Autoscaling Groups that create EC2 instances out of a Launch Template.
ASG and Launch Templates are specifically designed so that you don't care anymore about specific nodes, and rather you just care about the number of nodes. This means that for updating the nodes, you just replace them with new ones, which will use the new updated launch template (with more GBs for example, or with a new updated AMI, or a new instance type).
This is called "rolling updates", and it can be done manually (adding new instances, then draining the node, then deleting the old node), with scripts (see: eks-rolling-update in github by Hellofresh), or it can be done automagically if you use the AWS managed nodes (the ones you are actually using when specifying "node_groups", that is why if you add more GB, it will replace the node automatically when you run apply).
And this paradigm is the most common when operating Kubernetes in the cloud (and also very common on-premise datacenters when using virtualization).
Option 1) Self Managed Workers
With self managed nodes, when you change a parameter like disk_size or instance_type, it will change the Launch Template. It will update the $latest version tag, which is commonly where the ASG is pointing to (although can be changed). This means that old instances will not see any change, but new ones will have the updated configuration.
If you want to change the existing instances, you actually want to replace them with new ones. That is what this ephemeral nodes paradigm is.
One by one you can drain the old instances while increasing the number of desired_instances on the ASG, or let the cluster autoscaler do the job. Alternatively, you can use an automated script which does this for you for each ASG: https://github.com/hellofresh/eks-rolling-update
In terraform_aws_eks module, you create self managed workers by either using worker_groups or worker_groups_launch_template (recommended) field
Option 2) Managed Nodes
Managed nodes is an EKS-specific feature. You configure them very similarly, but in reality, it is an abstraction, and AWS will create the actual underlying ASG.
You can specify a Launch Template to be used by the ASG and its version. Some config can be specified at the managed node level (i.e. AMI and instance_types) and at the Launch Template (if it wasn't specified in the former).
Any change on the node group level config, or on the Launch Template version, will trigger an automatic rolling update, which will replace all old instances.
You can delay the rolling update by just not pointing to the $latest version (or pointing to $default, and not updating the $default tag when changing the LT).
In terraform_aws_eks module, you create self managed workers by using the node_groups field. You can also play with these settings: create_launch_template=true and set_instance_types_on_lt=true if you want the module to create the LT for you (alternatively you can just not use it, or pass a reference to one); and to set the instance_type on such LT as specified above.
But behavior is similar to worker groups. In no case you will have your existing instances changed. You can only change them manually.
However, there is an alternative: The manual way
You can use the EKS module to create the control plane, but then use a regular EC2 resource in terraform (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/instance) to create one ore multiple (using count or for_each) instances.
If you create the instances using the aws_instance resource, then terraform will patch those instances (updated-in-place) when any change is allowed (i.e. increasing the root volue GB or the instance type; whereas changing the AMI will force a replacement).
The only tricky part, is that you need to configure the cloud-init script to make the instance join the cluster (something that is automatically done by the EKS module when using self/managed node groups).
However, it is very possible, and you can borrow the script from the module and plug it into the aws_instance's user_data field (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/instance#user_data)
In this case (when talking about disk_size), however, you still need to manually (either by SSH, or by running an hacky exec using terraform) to patch the XFS filesystem so it sees the increased disk space.
Another alternative: Consider Kubernetes storage
That said, there is also another alternative for certain use cases. If you want to increase the disk space of those instances because of one of your applications using a hostPath, then it might be the case that you can use a kubernetes built-in storage solution using the EBS CSI driver.
For example, I manage an ElasticSearch cluster in Kubernetes (and deploy it from terraform with the helm module), and it uses dynamic storage provisioning to request an EBS volume (note that performance is the same, because both root and this other volume are EBS volumes). EBS CSI driver supports volume expansion, so I can just increase this disk by changing a terraform variable.
To conclude, I would not recommend the aws_instance way, unless you understand it and are sure you really want it. It may make sense in certain cases, but definitely not common
I'm creating a Deployment Group in CodeDeploy with a CloudFormation template.
The Deployment Group is successfully created and the application is deployed perfectly fine.
The CF resource that I defined (Type: AWS::CodeDeploy::DeploymentGroup) has the "Deployment" property set. The thing is that I would like to configure automatic rollbacks for this deployment, but as per CF documentation for "AutoRollbackConfiguration" property: "Information about the automatic rollback configuration that is associated with the deployment group. If you specify this property, don't specify the Deployment property."
So my understanding is that if I specify "Deployment", I cannot set "AutoRollbackConfiguration"... Then how are you supposed to configure any rollback for the deployment? I don't see any other resource property that relates to rollbacks.
Should I create a second DeploymentGroup resource and bind it to the same instances that the original Deployment Group has? I'm not sure this is possible or makes sense but I ran out of options.
Thanks,
Nicolas
First i like to describe why you cannot specify both, deployment and rollback configuration:
Whenever you specify a deployment directly for the group, you already state which revision you like to deploy. This conflicts with the idea of CloudFormation of having resources managed by it without having a drift in the actual configuration of those resources.
I would recommend the following:
Use CloudFormation to deploy the 'underlying' infrastructure (the deployment group, application, roles, instances, etc.)
Create a CodePipline within this infrastructure template, which then includes a CodeDeploy deployment action (https://docs.aws.amazon.com/codepipeline/latest/userguide/action-reference-CodeDeploy.html, https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-codepipeline-pipeline-stages-actions-actiontypeid.html)
The pipeline can triggered whenever you have a new version inside you revision location
This approach clearly separates the underlying stuff, which is not changing dynamically and the actual application deployment, done using a proper pipeline.
Additionally in this way you can specify how you like to deploy (green/blue, canary) and how/when rollbacks should be handled. The status of your deployment also to be seen inside CodePipeline.
I didn't mention it but what you are suggesting about CodePipeline is exactly what I did.
In fact, I have one CloudFormation template that creates all the infrastructure and includes the DeploymentGroup. With this, the application is deployed for the first time to my EC2 instances.
Then I have another CF template for CI/CD purposes with a CodeDeploy stage/action that references the previous DeploymentGroup. Whenever I push some code to my repository, the Pipeline is triggered, code is built and new version successfully deployed to the instances.
However, I don't see how/where in any of the CF templates to handle/configure the rollback for the DeploymentGroup as you were saying. I think I get the idea of your explanation about the conflict CF might have in case of having a drift, but my impression is that in case of errors during the CF stack creation, CF rollback should just remove the DeploymentGroup you're trying to create. In other words, for me there's no CodeDeploy deployment rollback involved in that scenario, just removing the resource (DeploymentGroup) CF was trying to create.
One thing that really impresses me is that you can enable/disable automatic rollbacks for the DeploymentGroup through the AWS Console. Just edit and go to Advanced Configuration for the DeploymentGroup and you have a checkbox. I tried it and triggered the Pipeline again and worked perfectly. I made a faulty change to make the deployment fail in purpose, and then CodeDeploy automatically reverted back to the previous version of my application... completely expected behavior. Doesn't make much sense that this simple boolean/flag option is not available through CF.
Hope this makes sense and helps clarifying my current situation. Any extra help would be highly appreciated.
Thanks again
We are at early stages with running our services on AWS. We have our server hosted in AWS, in a VPC, having private and public subnets and have multiple instances in private and public subnets using ELB and autoscaling setup (using AMIs) for frontend web servers. The whole environement(VPC, security groups, EC2 instances, DB instances, S3 buckets, cloudfront) are setup manually using AWS console at first.
Application servers host jboss and war files are deployed on the servers.
As per AWS best practices we want to create whole infrastructure using cloudformation and have setup test/stage/prod environment.
-Would it be a good idea to have all the above componenets (VPC, security groups, EC2 instances, DB instances, S3 buckets, cloudfront etc) using one cloudformation stack/template? Or we should we create two stacks 1) having network replated components and 2) having EC2 related components?
-Once we have a prod envoronemtn running with cloudformation stact and In case we want to update the new AMIs on prod in future, how can we update the live running EC2 instances using cloudformation without interruptions?
-What are the best practices/multiple ways for code deployment to multiple EC2 notes when a new release is done? We dont use Contineus integration at the moment.
It's a very good idea to separate your setup into multiple stacks. One obvious reason is that stacks have certain limits that you may reach eventually. A more practical reason is that you don't really need to update, say, your VPC every time you just want to deploy a new version. The network architecture typically changes less frequently. Another reason to avoid having one huge template, or to make changes to an "important" template needlessly, is that you always run the risk of messing things up. If there's an error in your template and you remove an important resource by accident (e.g. commented out) you'll be very sorry. So separating stacks out of sheer caution is probably a good idea.
If you want to update your application you can simply update the template with the new AMIs and CFN will know what needs to be recreated or updated. You can read about rolling updates here. However, I'd recommend considering using something a bit more straightforward for deploying your actual code, like Ansible or Chef.
I'd also recommend you look into Docker for packaging and deploying your application's nodes. Very handy.
I'm using Amazon Web Services to create an autoscaling group of application instances behind an Elastic Load Balancer. I'm using a CloudFormation template to create the autoscaling group + load balancer and have been using Ansible to configure other instances.
I'm having trouble wrapping my head around how to design things such that when new autoscaling instances come up, they can automatically be provisioned by Ansible (that is, without me needing to find out the new instance's hostname and run Ansible for it). I've looked into Ansible's ansible-pull feature but I'm not quite sure I understand how to use it. It requires a central git repository which it pulls from, but how do you deal with sensitive information which you wouldn't want to commit?
Also, the current way I'm using Ansible with AWS is to create the stack using a CloudFormation template, then I get the hostnames as output from the stack, and then generate a hosts file for Ansible to use. This doesn't feel quite right – is there "best practice" for this?
Yes, another way is just to simply run your playbooks locally once the instance starts. For example you can create an EC2 AMI for your deployment that in the rc.local file (Linux) calls ansible-playbook -i <inventory-only-with-localhost-file> <your-playbook>.yml. rc.local is almost the last script run at startup.
You could just store that sensitive information in your EC2 AMI, but this is a very wide topic and really depends on what kind of sensitive information it is. (You can also use private git repositories to store sensitive data).
If for example your playbooks get updated regularly you can create a cron entry in your AMI that runs every so often and that actually runs your playbook to make sure your instance configuration is always up to date. Thus avoiding having "push" from a remote workstation.
This is just one approach there could be many others and it depends on what kind of service you are running, what kind data you are using, etc.
I don't think you should use Ansible to configure new auto-scaled instances. Instead use Ansible to configure a new image, of which you will create an AMI (Amazon Machine Image), and order AWS autoscaling to launch from that instead.
On top of this, you should also use Ansible to easily update your existing running instances whenever you change your playbook.
Alternatives
There are a few ways to do this. First, I wanted to cover some alternative ways.
One option is to use Ansible Tower. This creates a dependency though: your Ansible Tower server needs to be up and running at the time autoscaling or similar happens.
The other option is to use something like packer.io and build fully-functioning server AMIs. You can install all your code into these using Ansible. This doesn't have any non-AWS dependencies, and has the advantage that it means servers start up fast. Generally speaking building AMIs is the recommended approach for autoscaling.
Ansible Config in S3 Buckets
The alternative route is a bit more complex, but has worked well for us when running a large site (millions of users). It's "serverless" and only depends on AWS services. It also supports multiple Availability Zones well, and doesn't depend on running any central server.
I've put together a GitHub repo that contains a fully-working example with Cloudformation. I also put together a presentation for the London Ansible meetup.
Overall, it works as follows:
Create S3 buckets for storing the pieces that you're going to need to bootstrap your servers.
Save your Ansible playbook and roles etc in one of those S3 buckets.
Have your Autoscaling process run a small shell script. This script fetches things from your S3 buckets and uses it to "bootstrap" Ansible.
Ansible then does everything else.
All secret values such as Database passwords are stored in CloudFormation Parameter values. The 'bootstrap' shell script copies these into an Ansible fact file.
So that you're not dependent on external services being up you also need to save any build dependencies (eg: any .deb files, package install files or similar) in an S3 bucket. You want this because you don't want to require ansible.com or similar to be up and running for your Autoscale bootstrap script to be able to run. Generally speaking I've tried to only depend on Amazon services like S3.
In our case, we then also use AWS CodeDeploy to actually install the Rails application itself.
The key bits of the config relating to the above are:
S3 Bucket Creation
Script that copies things to S3
Script to copy Bootstrap Ansible. This is the core of the process. This also writes the Ansible fact files based on the CloudFormation parameters.
Use the Facts in the template.
Here is my case:
I have about 100 EC2 instances and everyone runs a java application (Java SE application, not Java EE application), I want to deploy my complied jar files and library to all the instances and then make the application run on everyone's application. Because the application is changing from time to time, every time I have to spend two hours to do this job.
Do you know if there's a management tool or software that can help me to do this work automatically, and what is your practice to deploy this application?
Do you have an auto deployment workflow for development on AWS?
Kwatee (http://www.kwatee.net), our free and lightweight deployment tool, supports EC2 instances as well as elastic load balancing. There's a short screencast of a small EC2 deployment here.
Since you're using java you can utilize AWS Elastic Beanstalk.
Development Lifecycle:
http://docs.amazonwebservices.com/elasticbeanstalk/latest/dg/create_deploy_Java.sdlc.html
Managing the Environment:
http://docs.amazonwebservices.com/elasticbeanstalk/latest/dg/using-features.managing.html
There are many more article links on the same page, probably will need to read all of them but these are the two that I feel are most related too your question. I haven't used this product so i can't give any first hand experience but it seems to be designed to help you with your exact problem.
Boxfuse does exactly what you want.
For your Java SE application you literally only have to execute:
boxfuse create my-javase-app -apptype=load-balanced
boxfuse scale my-javase-app -capacity=100:t2.micro
boxfuse run my-javase-app-1.0.jar -env=prod
This will
Create a new application and configure it to use an ELB
Scale it to 100 t2.micro instances
Create AMI
Create an ELB
Create a security group
Create an auto-scaling group
Launch your instance(s)
Any subsequent update will be done as a zero downtime blue/green deployment.
You can use AutoScaling Launch Configuration and AutoScaling Group to launch 100 EC2 instances. But hold on, Need to request EC2 instance limit with EC2 instance type to AWS Support. Typically, It will take 1 business day to complete the request.
First you suppose to create AutoScaling Launch Configuration in AWS Console. AutoScaling Launch Configuration includes type, storage, security group and you can add scripts to run at launch of EC2 Instance.
Next is AutoScaling Group. You have to choose which launch configuration is suitable for your autoscaling group. In AutoScaling Group configuration, you must mention min and max count (i.e) it will launch EC2 instances based on min count and launches upto max count. CloudWatch monitoring can be used for AutoScaling group. CloudWatch will work based on EC2 instance CPU Utilization and alarm settings.
Elastic Load Balancing will help to distribute traffic among EC2 Instances. If you want to use ELB, you have to create before AutoScaling Group. In AutoScaling Group you may include ELB to handle and distribute traffic.