Cloud Formation Template design - aws-cloudformation

What factors do folk take into account when deciding to write 1 large CF template, or nest many smaller ones? The use case I have in mind is RDS based where I'll need to define RDS instances, VPC Security groups, parameter and option groups as well as execute some custom lambda resources.
My gut feel is that this should be split, perhaps by resource type, but I was wondering if there was generally accepted practice on this.

My current rule of thumb is to split resources by deployment units - what deploys together, goes together.
I want to have the smallest deployable stack, because it's fast to deploy or fail if there's an issue. I don't follow this rule religiously. For example, I often group Lambdas together (even unrelated ones, depends on the size of the project), as they update only if the code/config changed and I tend to push small updates where only one Lambda changed.
I also often have a stack of shared resources that are used (Fn::Import-ed) throughout the other stacks like a KMS key, a shared S3 Bucket, etc.
Note that I have a CD process set up for every stack, hence the rule.
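As a rough illustration of the shared-resources pattern, here is a minimal sketch that builds two CloudFormation templates as Python dicts: one stack exports a value and another imports it with Fn::ImportValue. The export name, logical IDs and the choice of an S3 bucket plus an IAM managed policy are assumptions made for the example, not something taken from my actual setup.

```python
import json

# Hypothetical export name; exports must be unique per account and region.
EXPORT_NAME = "shared-artifacts-bucket-name"


def shared_stack_template() -> dict:
    """Shared stack: owns the bucket and exports its name."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "SharedBucket": {"Type": "AWS::S3::Bucket"},
        },
        "Outputs": {
            "SharedBucketName": {
                # Ref on an S3 bucket returns the bucket name.
                "Value": {"Ref": "SharedBucket"},
                "Export": {"Name": EXPORT_NAME},
            }
        },
    }


def consumer_stack_template() -> dict:
    """Consumer stack: imports the bucket name instead of redefining the bucket."""
    bucket_arn = {
        "Fn::Sub": [
            "arn:aws:s3:::${Bucket}/*",
            {"Bucket": {"Fn::ImportValue": EXPORT_NAME}},
        ]
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "ReadSharedBucketPolicy": {
                "Type": "AWS::IAM::ManagedPolicy",
                "Properties": {
                    "PolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [
                            {
                                "Effect": "Allow",
                                "Action": "s3:GetObject",
                                "Resource": bucket_arn,
                            }
                        ],
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(shared_stack_template(), indent=2))
    print(json.dumps(consumer_stack_template(), indent=2))
```

Keep in mind that CloudFormation will not let you delete or change an exported value while another stack still imports it, so the shared stack has to be deployed first and removed last.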

My current setup requires deploying a VPC (with endpoints), RDS and an application (API Gateway, Lambdas). I have broken them down as follows:
- VPC stack: a shared resource with 1 VPC per region with public & private subnets, VPC endpoints, S3 bucket, NAT gateways, ACLs, security groups.
- RDS stack: I can have multiple RDS clusters inside a VPC, so it made sense to keep it separate. It is created after the VPC because it needs VPC resources such as private subnets and security groups. This cluster is shared by multiple application stacks.
- Application stack: This deploys API Gateway & Lambdas (basically a serverless application) with the above RDS cluster as the DB.
So, in general, I pretty much follow what @Milan Cermak described. But in my case, these deployments are done when needed (not as part of CD), so the exported values are stored in AWS Systems Manager Parameter Store.
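To make the Parameter Store hand-off concrete, here is a minimal sketch, again as Python dicts that serialise to CloudFormation JSON. The parameter path /network/vpc-id, the CIDR block and the logical IDs are hypothetical; the idea is only that the VPC stack writes a value and a later stack reads it through an SSM-typed template parameter.

```python
import json

# Hypothetical Parameter Store path shared by both stacks.
VPC_ID_PARAM = "/network/vpc-id"


def vpc_stack_template() -> dict:
    """VPC stack: creates the VPC and publishes its ID to Parameter Store."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "Vpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {"CidrBlock": "10.0.0.0/16"},  # assumed range
            },
            "VpcIdParameter": {
                "Type": "AWS::SSM::Parameter",
                "Properties": {
                    "Name": VPC_ID_PARAM,
                    "Type": "String",
                    "Value": {"Ref": "Vpc"},  # Ref on a VPC returns its ID
                },
            },
        },
    }


def rds_stack_template() -> dict:
    """RDS stack (excerpt): resolves the VPC ID from Parameter Store at deploy time."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "VpcId": {
                # CloudFormation looks up the value stored under the default path.
                "Type": "AWS::SSM::Parameter::Value<String>",
                "Default": VPC_ID_PARAM,
            }
        },
        "Resources": {
            "DbSecurityGroup": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    "GroupDescription": "Access to the shared RDS cluster",
                    "VpcId": {"Ref": "VpcId"},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(vpc_stack_template(), indent=2))
    print(json.dumps(rds_stack_template(), indent=2))
```

Unlike Fn::ImportValue, this does not create a hard dependency between the stacks, which suits deployments that happen on demand rather than through a pipeline.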

Related

AWS Proton vs CloudFormation

Recently, I looked into the AWS Proton service and tried a hands-on exercise with it, but unfortunately I was not able to get it working.
What I am not able to understand is what advantage I am getting with Proton, because the end to end pipeline I can build using CodeCommit, CodeDeploy, CodePipeline, and CloudFormation.
It will be great if someone could jot down the use cases where Proton can be used compared to the components which I suggested above.
From what I understand, AWS Proton is similar to AWS Service Catalog in that it allows administrators to prepare CloudFormation (CFN) templates which developers/users can provision when they need them. The difference is that AWS Service Catalog is geared towards general users, e.g. those who just want to start an instance pre-configured by administrators, or provision entire infrastructures from a set of approved architectures (e.g. instance + RDS + Lambda functions). In contrast, AWS Proton is geared towards developers, so that they can provision by themselves the entire architectures they need for development, such as CI/CD pipelines.
In both cases, CFN is used as a primary way in which these architectures are defined and provisioned. You can think of AWS Service Catalog and AWS Proton as high level services, while CFN as low level service which is used as a building block for the two others.
"because the end to end pipeline I can build using CodeCommit, CodeDeploy, CodePipeline, and CloudFormation"
Yes, in both cases (AWS Service Catalog and AWS Proton) you can do all of that. But not everyone wants to do it. Many AWS users and developers do not have the time and/or interest to define all the solutions they need in CFN. This is time consuming and requires experience. Also, it's not a good security practice to allow everyone in your account to provision everything they need without any constraints.
AWS Service Catalog and AWS Proton solve these issues, as you can pre-define a set of CFN templates and allow your users and developers to easily provision them. They also provide clear role separation in your account, so you have administrators who manage the infrastructure, while everyone else is a user/developer. This way both groups concentrate on what they know best: infrastructure as code and software development.

Do I really need a VPC if I can use AWS security groups to secure my MongoDB EC2 instance?

I am really stuck deciding whether I need a VPC to deploy my MongoDB instance (and a GraphQL server) on AWS. I'm working on a project that's going to have a GraphQL server to serve a mobile app along with a MongoDB instance to store the data. I've read everywhere that you must use a VPC, but why? Can't I use the security groups that AWS provides? This will allow me to lock down my MongoDB instance, right?
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-security-groups.html
The reason I don't want to use a VPC is purely the extra cost. The project I'm working on has a small budget, and paying the extra money (min $60 a month) for the VPC on AWS just isn't viable. If I were building a massive application with tens of thousands of users that required scale and added security for peace of mind, then I'd consider using a VPC. But since it's not going to be that, and the budget is small, is it okay to use security groups to lock down my MongoDB EC2 instance?
I've looked into other hosting solutions, in particular Digitalocean as they provide a free VPC service, however Digitalocean does not have data centers in my region (amongst other things) + I've used AWS a fair bit in the past and would love to keep using it.
I would love any suggestions about what I could/should do.
Security groups are a feature of VPCs and are tightly coupled with how EC2 instances are hosted. You need a VPC to define your networking rules including if your instances that host the MongoDB and GraphQL servers are public/private and what their security group rules are.
I'm not sure what costs you are referring to as VPCs are free and all accounts come with a VPC already created for you (the default VPC). You only pay for the ingress/egress traffic that you use so if you aren't doing anything massive, then the cost will be tiny ($0.02/GB) compared to the cost of the instances used to host your machines.
To address your comment: a NAT Gateway would only be needed if you want your instances on private subnets but you want those subnets to have internet access. This is not required if you are comfortable with putting your instances on public subnets and then locking them down with security group and NACL rules (this is not the best security practice, but it is a compromise you can make to save on costs).
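For illustration, the security group lock-down could look roughly like the sketch below, which builds a CloudFormation template as a Python dict. It assumes the GraphQL server is reached over HTTPS on port 443 and MongoDB listens on its default port 27017; the logical IDs and the VpcId parameter are hypothetical. The database group accepts traffic only from instances in the API group, not from the internet.

```python
import json

MONGODB_PORT = 27017  # MongoDB default port
GRAPHQL_PORT = 443    # assumption: the GraphQL server is exposed over HTTPS


def security_groups_template() -> dict:
    """Two security groups: a public API group and a database group behind it."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            # Can be the default VPC that every account already has.
            "VpcId": {"Type": "AWS::EC2::VPC::Id"},
        },
        "Resources": {
            "ApiSecurityGroup": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    "GroupDescription": "GraphQL server, reachable from the internet",
                    "VpcId": {"Ref": "VpcId"},
                    "SecurityGroupIngress": [
                        {
                            "IpProtocol": "tcp",
                            "FromPort": GRAPHQL_PORT,
                            "ToPort": GRAPHQL_PORT,
                            "CidrIp": "0.0.0.0/0",
                        }
                    ],
                },
            },
            "DbSecurityGroup": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    "GroupDescription": "MongoDB, reachable only from the API group",
                    "VpcId": {"Ref": "VpcId"},
                    "SecurityGroupIngress": [
                        {
                            "IpProtocol": "tcp",
                            "FromPort": MONGODB_PORT,
                            "ToPort": MONGODB_PORT,
                            # Source is another security group, not an IP range.
                            "SourceSecurityGroupId": {
                                "Fn::GetAtt": ["ApiSecurityGroup", "GroupId"]
                            },
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(security_groups_template(), indent=2))
```

With this arrangement the MongoDB instance can still sit on a public subnet to avoid the NAT Gateway cost, while port 27017 stays closed to everything except the GraphQL server.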

What is the Google Cloud Platform's "Managed Infrastructure Mixer Client"?

Can someone tell me what the purpose of the “Managed Infrastructure Mixer Client” is? I have it showing up in my GCE logs and I can’t find any information on it. It is adding and removing GCE instances.
I believe it is related to GCP's recommended settings:
Automatic restart - On (recommended)
On host maintenance - Migrate VM instance (recommended)
This is the User Agent used by Managed Instance Groups when performing operations on instances. These operations can result from a user operating on the MIG (e.g. resizing, recreating instances), as well as from operations performed by Autoscaler, Autohealer, Updater, etc.
Note that this string may change in the future.

How to execute Amazon Lambda functions on dedicated EC2 server?

I am currently developing the backend for my app based on Amazon Web Services. I intended to use DynamoDB to store the user's data, but finally opted for MongoDB, which I have already installed on my EC2 instance.
I have some code written in Python to update/query... the DB, so that when a Cognito event triggers my lambda function, this code is directly executed on my instance so I can access my DB. Any ideas how can I accomplish this?
As mentioned by Gustavo Tavares, "the whole point of lambda is to run code without the need to deploy EC2 instances". And you do not have to put your EC2 with database to "public" subnets for Lambda to access them. Actually, you should never do that.
When creating/editing the Lambda configuration you may choose to run it in any of your VPCs (Configuration -> Advanced Settings -> VPC). Then select the subnet(s) to run your Lambda in. This will create ENIs (Elastic Network Interfaces) for the virtual machines your Lambdas will run on.
Your subnets must have Routing/ACL configured to access the subnets where Database resides. At least one of the SecurityGroups associated with Lambda must also have Outbound traffic allowed to the Database subnet on appropriate ports (27017).
Since you mentioned that your Lambdas are "back-end" then you should probably put them in the same "private" subnets as your MongoDB and avoid any access/routing headache.
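If you prefer to define this in a template rather than in the console, the VPC settings described above could be sketched roughly as follows (a CloudFormation template built as a Python dict). All parameter names, the handler name and the runtime version are assumptions for the example. The function's role must also be allowed to create the ENIs, for instance through the AWSLambdaVPCAccessExecutionRole managed policy.

```python
import json

MONGODB_PORT = 27017  # port mentioned above


def lambda_vpc_template() -> dict:
    """Lambda attached to private subnets, allowed out only to the DB group."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "VpcId": {"Type": "AWS::EC2::VPC::Id"},
            "PrivateSubnetIds": {"Type": "List<AWS::EC2::Subnet::Id>"},
            "DbSecurityGroupId": {"Type": "AWS::EC2::SecurityGroup::Id"},
            "LambdaRoleArn": {"Type": "String"},  # role with VPC access permissions
            "CodeBucket": {"Type": "String"},
            "CodeKey": {"Type": "String"},
        },
        "Resources": {
            "LambdaSecurityGroup": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    "GroupDescription": "Backend Lambdas",
                    "VpcId": {"Ref": "VpcId"},
                    "SecurityGroupEgress": [
                        {
                            "IpProtocol": "tcp",
                            "FromPort": MONGODB_PORT,
                            "ToPort": MONGODB_PORT,
                            "DestinationSecurityGroupId": {"Ref": "DbSecurityGroupId"},
                        }
                    ],
                },
            },
            "BackendFunction": {
                "Type": "AWS::Lambda::Function",
                "Properties": {
                    "Runtime": "python3.12",        # assumed runtime
                    "Handler": "app.handler",       # hypothetical module.function
                    "Role": {"Ref": "LambdaRoleArn"},
                    "Code": {
                        "S3Bucket": {"Ref": "CodeBucket"},
                        "S3Key": {"Ref": "CodeKey"},
                    },
                    "VpcConfig": {
                        "SubnetIds": {"Ref": "PrivateSubnetIds"},
                        "SecurityGroupIds": [{"Ref": "LambdaSecurityGroup"}],
                    },
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(lambda_vpc_template(), indent=2))
```

The database's own security group still needs a matching inbound rule on 27017 from the Lambda group; that rule is left out here to keep the sketch short.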
One way to accomplish this is to give the Lambda a SAM Template, then use sam local invoke inside of the EC2 instance to execute locally.
OK BUT WHY OH WHY WOULD ANYONE DO THIS?
If your Lambda requires access to both a VPC and the Internet, and doesn't use a lot of memory and doesn't really require scalability, and you already wrote the code (*), it's actually 10x cheaper(**) and higher-performing to launch a t3.nano EC2 Spot Instance on a public subnet than to add a NAT Gateway to the Lambda function.
(*) if you have not written the code yet, don't even bother to make it a Lambda.
(**) 10x cheaper as in $3 vs $30, so this really only applies to hobbyist projects on a shoestring budget. Don't do this at work, because the cost of engineers' time to manage and maintain an EC2 instance will far exceed $30/month over the long term.
If you want Lambda to execute code on your ec2-instances you'll need to use the SDK for the language you're writing your lambda in. Then you can simply use the AWS API to run commands on your EC2 instance.
See: http://docs.aws.amazon.com/systems-manager/latest/userguide/run-command.html
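A minimal sketch of that approach in Python with boto3 is shown below. It assumes the instance runs the SSM agent with a suitable instance profile and that the Lambda role is allowed to call ssm:SendCommand. The environment variable TARGET_INSTANCE_ID and the script path are made up for the example.

```python
import os


def build_run_command_request(instance_id: str, commands: list[str]) -> dict:
    """Build the arguments for SSM SendCommand using the AWS-RunShellScript document."""
    return {
        "InstanceIds": [instance_id],
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": commands},
    }


def handler(event, context):
    """Lambda entry point: asks Systems Manager to run a script on the instance."""
    # Imported here so the request builder can be used without boto3 installed;
    # boto3 is available in the Lambda Python runtime.
    import boto3

    ssm = boto3.client("ssm")
    request = build_run_command_request(
        os.environ["TARGET_INSTANCE_ID"],        # hypothetical environment variable
        ["python3 /opt/app/update_db.py"],       # hypothetical script on the instance
    )
    response = ssm.send_command(**request)
    # The command runs asynchronously; the ID can be used to look up its result later.
    return {"commandId": response["Command"]["CommandId"]}
```

Note that the call only queues the command. If the Lambda needs the script's output it has to poll for the invocation result afterwards.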
I think you misunderstood the idea of AWS lambda.
The whole point of lambda is to run code without the need to deploy EC2 instances. You upload the code and the infrastructure is provisioned on the fly. If your application does not need the infrastructure anymore (after a brief period), it vanishes and you will not be charged for the idle time. If you need it again a new infrastructure is provisioned.
If you have a service, like your MongoDB, running on EC2 instances, your Lambda functions can access it like any other code. You just need to configure your Lambda code to connect to the EC2 instance, as you would if your database were installed on any other internet-facing server.
For example: You can put your MongoDB server in a public subnet of your VPC and assign an elastic IP for your server. In your Python lambda code you configure your driver to connect to this elastic IP and update the database.
It will work as if every service were deployed on a different server across the internet: Cognito connects to the Lambda functions across the internet, and then the Python code deployed in Lambda connects to your MongoDB across the internet.
If I can give you a piece of advice, try DynamoDB a little more. With DynamoDB it will be even simpler to make all this work, because you will not need to configure a public subnet and request an Elastic IP. And the DynamoDB API is not very different from the MongoDB API.

cloudformation best practices in AWS

We are at an early stage of running our services on AWS. Our servers are hosted in AWS, in a VPC with private and public subnets, with multiple instances in both, using ELB and an autoscaling setup (using AMIs) for the frontend web servers. The whole environment (VPC, security groups, EC2 instances, DB instances, S3 buckets, CloudFront) was initially set up manually using the AWS console.
Application servers host jboss and war files are deployed on the servers.
As per AWS best practices we want to create the whole infrastructure using CloudFormation and set up test/stage/prod environments.
-Would it be a good idea to have all the above components (VPC, security groups, EC2 instances, DB instances, S3 buckets, CloudFront etc.) in one CloudFormation stack/template? Or should we create two stacks: 1) one with the network-related components and 2) one with the EC2-related components?
-Once we have a prod environment running from a CloudFormation stack, and we want to roll out new AMIs to prod in the future, how can we update the live running EC2 instances using CloudFormation without interruptions?
-What are the best practices/options for deploying code to multiple EC2 nodes when a new release is done? We don't use continuous integration at the moment.
It's a very good idea to separate your setup into multiple stacks. One obvious reason is that stacks have certain limits that you may reach eventually. A more practical reason is that you don't really need to update, say, your VPC every time you just want to deploy a new version. The network architecture typically changes less frequently. Another reason to avoid having one huge template, or to make changes to an "important" template needlessly, is that you always run the risk of messing things up. If there's an error in your template and you remove an important resource by accident (e.g. commented out) you'll be very sorry. So separating stacks out of sheer caution is probably a good idea.
If you want to update your application you can simply update the template with the new AMIs and CFN will know what needs to be recreated or updated. You can read about rolling updates here. However, I'd recommend considering using something a bit more straightforward for deploying your actual code, like Ansible or Chef.
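To give an idea of what a rolling update looks like in a template, here is a minimal sketch (built as a Python dict) of an Auto Scaling group with an UpdatePolicy. The logical IDs, instance type, group sizes and pause time are assumptions for the example. Changing the AmiId parameter on a stack update produces a new launch template version, and CloudFormation then replaces instances in batches while keeping a minimum number in service.

```python
import json


def web_tier_template(min_in_service: int = 1, max_batch_size: int = 1) -> dict:
    """Auto Scaling group whose instances are replaced in batches on update."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "AmiId": {"Type": "AWS::EC2::Image::Id"},
            "SubnetIds": {"Type": "List<AWS::EC2::Subnet::Id>"},
        },
        "Resources": {
            "WebLaunchTemplate": {
                "Type": "AWS::EC2::LaunchTemplate",
                "Properties": {
                    "LaunchTemplateData": {
                        "ImageId": {"Ref": "AmiId"},
                        "InstanceType": "t3.micro",  # assumed size
                    }
                },
            },
            "WebGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "MinSize": "2",
                    "MaxSize": "4",
                    "VPCZoneIdentifier": {"Ref": "SubnetIds"},
                    "LaunchTemplate": {
                        "LaunchTemplateId": {"Ref": "WebLaunchTemplate"},
                        "Version": {
                            "Fn::GetAtt": ["WebLaunchTemplate", "LatestVersionNumber"]
                        },
                    },
                },
                # UpdatePolicy is a resource attribute, a sibling of Properties.
                "UpdatePolicy": {
                    "AutoScalingRollingUpdate": {
                        "MinInstancesInService": min_in_service,
                        "MaxBatchSize": max_batch_size,
                        "PauseTime": "PT5M",  # wait 5 minutes between batches
                    }
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(web_tier_template(), indent=2))
```

Whether the update is truly interruption-free also depends on the load balancer health checks and on how quickly new instances become ready, which this sketch does not cover.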
I'd also recommend you look into Docker for packaging and deploying your application's nodes. Very handy.