How to execute AWS Lambda functions on a dedicated EC2 server? - mongodb

I am currently developing the backend for my app based on Amazon Web Services. I intended to use DynamoDB to store the users' data, but finally opted for MongoDB, which I have already installed on my EC2 instance.
I have some Python code to update and query the DB. When a Cognito event triggers my Lambda function, I want this code to be executed directly on my instance so that it can access the DB. Any ideas how I can accomplish this?

As mentioned by Gustavo Tavares, "the whole point of lambda is to run code without the need to deploy EC2 instances". And you do not have to put your database EC2 instance in a "public" subnet for Lambda to access it. Actually, you should never do that.
When creating or editing the Lambda configuration, you can choose to run it in any of your VPCs (Configuration -> Advanced Settings -> VPC). Then select the subnet(s) to run your Lambda in. This will create ENIs (Elastic Network Interfaces) for the virtual machines your Lambdas will run on.
Those subnets must have routing and ACLs configured to reach the subnet where the database resides. At least one of the security groups associated with the Lambda must also allow outbound traffic to the database subnet on the appropriate port (27017 for MongoDB).
Since you mentioned that your Lambdas are "back-end", you should probably put them in the same "private" subnets as your MongoDB and avoid any access/routing headaches.
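The VPC attachment and the outbound rule described above can also be set through the API instead of the console. Here is a minimal sketch with boto3, not a definitive setup: the function name, subnet IDs, security group ID and CIDR are all placeholders you would pass in, and the Lambda execution role is assumed to already have the permissions it needs to create ENIs.

```python
MONGO_PORT = 27017  # default MongoDB port, as mentioned above


def vpc_config(subnet_ids, security_group_ids):
    """Build the VpcConfig argument for Lambda's update_function_configuration."""
    return {
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
    }


def mongo_egress_rule(db_subnet_cidr):
    """Build an outbound rule allowing TCP traffic to MongoDB in the database subnet."""
    return {
        "IpProtocol": "tcp",
        "FromPort": MONGO_PORT,
        "ToPort": MONGO_PORT,
        "IpRanges": [{"CidrIp": db_subnet_cidr, "Description": "Lambda to MongoDB"}],
    }


def attach_lambda_to_vpc(function_name, subnet_ids, lambda_sg_id, db_subnet_cidr):
    """Apply both settings. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the helpers above work without boto3 installed

    boto3.client("ec2").authorize_security_group_egress(
        GroupId=lambda_sg_id,
        IpPermissions=[mongo_egress_rule(db_subnet_cidr)],
    )
    boto3.client("lambda").update_function_configuration(
        FunctionName=function_name,
        VpcConfig=vpc_config(subnet_ids, [lambda_sg_id]),
    )
```

The explicit egress rule only matters if the security group's default "allow all outbound" rule has been removed; the security group on the MongoDB instance still has to allow inbound traffic on the same port.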

One way to accomplish this is to give the Lambda a SAM template, then run sam local invoke on the EC2 instance to execute the function locally there.
OK BUT WHY OH WHY WOULD ANYONE DO THIS?
If your Lambda requires access to both a VPC and the Internet, and doesn't use a lot of memory and doesn't really require scalability, and you already wrote the code (*), it's actually 10x cheaper(**) and higher-performing to launch a t3.nano EC2 Spot Instance on a public subnet than to add a NAT Gateway to the Lambda function.
(*) if you have not written the code yet, don't even bother to make it a Lambda.
(**) 10x cheaper as in $3 vs $30, so this really only applies to hobbyist projects on a shoestring budget. Don't do this at work, because the cost of engineers' time to manage and maintain an EC2 instance will far exceed $30/month over the long term.

If you want Lambda to execute code on your EC2 instances, you'll need to use the AWS SDK for the language you're writing your Lambda in. Then you can use the AWS API (Systems Manager Run Command) to run commands on your EC2 instance.
See: http://docs.aws.amazon.com/systems-manager/latest/userguide/run-command.html
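For a Python Lambda, that could look roughly like the sketch below, which uses boto3's Systems Manager client and the AWS-managed AWS-RunShellScript document. The instance ID and the script path are made-up placeholders. It also assumes the instance runs the SSM agent with a suitable instance profile and that the Lambda role is allowed to call ssm:SendCommand.

```python
def run_command_request(instance_id, commands):
    """Build the arguments for Systems Manager's send_command call."""
    return {
        "InstanceIds": [instance_id],
        "DocumentName": "AWS-RunShellScript",  # AWS-managed document for shell commands
        "Parameters": {"commands": list(commands)},
    }


def lambda_handler(event, context):
    import boto3  # imported lazily so the helper above works without boto3 installed

    ssm = boto3.client("ssm")
    # Hypothetical instance ID and script path: replace them with your own.
    request = run_command_request(
        "i-0123456789abcdef0",
        ["python3 /home/ec2-user/update_db.py"],
    )
    response = ssm.send_command(**request)
    print("Started command", response["Command"]["CommandId"])
    return event  # Cognito triggers expect the event to be returned
```

Note that send_command only starts the command; it does not wait for the script to finish on the instance.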

I think you misunderstood the idea of AWS Lambda.
The whole point of Lambda is to run code without the need to deploy EC2 instances. You upload the code and the infrastructure is provisioned on the fly. If your application does not need the infrastructure anymore (after a brief period), it vanishes and you will not be charged for the idle time. If you need it again, new infrastructure is provisioned.
If you have a service, like your MongoDB, running on EC2 instances, your Lambda functions can access it like any other code. You just need to configure your Lambda code to connect to the EC2 instance, as you would if your database were installed on any other internet-facing server.
For example: you can put your MongoDB server in a public subnet of your VPC and assign an Elastic IP to it. In your Python Lambda code you configure your driver to connect to this Elastic IP and update the database.
It will work as if every service were deployed on a different server across the internet: Cognito connects to the Lambda function across the internet, and then the Python code deployed in Lambda connects to your MongoDB across the internet.
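As a rough sketch of what that Python Lambda code could look like (this is not the asker's code): it assumes pymongo is bundled with the deployment package, that the server address is supplied in a hypothetical MONGO_HOST environment variable, and that the database and collection are called "app" and "users". Authentication and TLS are left out for brevity and would be needed on any internet-facing server.

```python
import os


def mongo_uri(host, port=27017):
    """Build a MongoDB connection URI for the given host (Elastic IP or private IP)."""
    return f"mongodb://{host}:{port}"


def user_update(event):
    """Turn a Cognito trigger event into a (filter, update) pair for update_one."""
    attributes = event["request"]["userAttributes"]
    return (
        {"_id": event["userName"]},
        {"$set": {"email": attributes.get("email")}},
    )


def lambda_handler(event, context):
    from pymongo import MongoClient  # must be bundled with the deployment package

    # MONGO_HOST is a hypothetical environment variable holding the server address.
    client = MongoClient(mongo_uri(os.environ["MONGO_HOST"]), serverSelectionTimeoutMS=5000)
    query, update = user_update(event)
    # "app" and "users" are placeholder database and collection names.
    client["app"]["users"].update_one(query, update, upsert=True)
    return event  # Cognito triggers expect the event to be returned
```

The same handler works unchanged if MONGO_HOST holds a private IP and the Lambda runs inside the VPC.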
If I can give you a piece of advice, give DynamoDB another try. With DynamoDB it will be even simpler to make all this work, because you will not need to configure a public subnet and request an Elastic IP. And the DynamoDB API is not very different from the MongoDB API.

Related

How to deploy a next.js + mongo app to AWS (or any other service like G Cloud)?

I just have some experience developing in JS but almost nothing in devops, and there's a lot of documentation but I don't really know where to start.
I built a next.js app (both frontend and backend) connected to MongoDB. They run fine locally using docker-compose. Now I would like to deploy them to AWS, also because I need to store files needed by the application on S3.
What services do I typically need? Should I deploy my app to EC2, or use AWS Amplify, or another provider such as Google Cloud?
Can I deploy my images just how they are, including mongo, to EC2? Or should I, for example, just deploy next.js and connect it to a managed MongoDB, which I suppose is an additional cost?
I know it is a pretty generic question; if you can just point me to the tools I need to manage the whole deploy process, then I'll find out how to use them. Currently all the code (including Dockerfile and docker-compose.yml) is on GitHub.
This is probably not the perfect answer since the question is very general and AWS provides a lot of features but I'll give it a go.
For a JS app you could use AWS Elastic Beanstalk, which makes it easy to set up web applications because it creates all the resources (EC2, load balancers, etc.) for you. Since you're new to AWS, you can check this service out instead of manually creating EC2 instances. Even if you use AWS Elastic Beanstalk you will still have access to the EC2 instances and other resources it creates. You'll also get exposure to various services that can help speed up your application.
For images, S3 would be a great choice. However, depending on how frequently the data is accessed, I would look into the different S3 storage options as well as backup options.
As for your DB, MongoDB would work, but you'd need to run it on an EC2 instance and maintain it yourself. AWS has different managed database options, such as DynamoDB in your case, but it all depends on the tools you require, your budget, etc.

Is there a way to cache a self-hosted PostgreSQL db connection across Lambda invocations?

We have a self-hosted PostgreSQL instance (i.e. running on an EC2 instance), which we intend to access from a Step Functions workflow that invokes multiple Lambdas. I am aware that for RDS instances there is the RDS Proxy service, which solves this exact scenario. Given that our instance is self-hosted, is there a way to cache the database connection across multiple Lambda invocations without having to re-establish it for each one?
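One pattern that applies here, shown only as a sketch under assumptions: keep the connection in a module-level variable, since Lambda reuses the execution environment (and therefore module state) between warm invocations of the same function. The sketch assumes psycopg2 is bundled with the deployment package and that the connection string comes from a hypothetical PG_DSN environment variable.

```python
import os

_connection = None  # module-level state survives between warm invocations


def _default_connect():
    import psycopg2  # must be bundled with the deployment package

    # PG_DSN is a hypothetical environment variable holding the connection string.
    conn = psycopg2.connect(os.environ["PG_DSN"])
    conn.autocommit = True  # avoid leaving a transaction open between invocations
    return conn


def get_connection(connect=_default_connect):
    """Return the cached connection, opening a new one if none is usable."""
    global _connection
    # psycopg2 connections expose .closed, which is 0 while the connection is open.
    if _connection is None or getattr(_connection, "closed", 0):
        _connection = connect()
    return _connection


def lambda_handler(event, context):
    conn = get_connection()
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        return cur.fetchone()[0]
```

This only saves reconnects within one warm execution environment. Each concurrent environment, and each separate Lambda function in the workflow, still holds its own connection, so a pooler such as PgBouncer in front of PostgreSQL may still be worth considering.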

Do I really need a VPC if I can use AWS security groups to secure my MongoDB EC2 instance?

I am really stuck deciding whether I need a VPC to deploy my MongoDB instance (and a GraphQL server) on AWS. I'm working on a project that's going to have a GraphQL server serving a mobile app, along with a MongoDB instance to store the data. I've read everywhere that you must use a VPC, but why? Can't I use the security groups that AWS provides? That will allow me to lock down my MongoDB instance, right?
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-security-groups.html
The reason I don't want to use a VPC is purely the extra cost. The project I'm working on has a small budget, and paying all the extra money (min $60 a month) for the VPC on AWS just isn't viable. If I were building a massive application with tens of thousands of users that required scale and added security for peace of mind, then I'd consider using a VPC. But since it's not going to be that, and the budget is small, is it okay to use security groups to lock down my MongoDB EC2 instance?
I've looked into other hosting solutions, in particular DigitalOcean, as they provide a free VPC service. However, DigitalOcean does not have data centers in my region (amongst other things), and I've used AWS a fair bit in the past and would love to keep using it.
I would love any suggestions about what I could/should do.
Security groups are a feature of VPCs and are tightly coupled with how EC2 instances are hosted. You need a VPC to define your networking rules including if your instances that host the MongoDB and GraphQL servers are public/private and what their security group rules are.
I'm not sure what costs you are referring to as VPCs are free and all accounts come with a VPC already created for you (the default VPC). You only pay for the ingress/egress traffic that you use so if you aren't doing anything massive, then the cost will be tiny ($0.02/GB) compared to the cost of the instances used to host your machines.
To address your comment: a NAT Gateway would only be needed if you want your instances in private subnets and you want those subnets to have internet access. This is not required if you are comfortable putting your instances in public subnets and then locking them down with security group and NACL rules (this is not the best security practice, but it is a compromise you can make to save on costs).
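To make the security group lock-down concrete, here is a small sketch using boto3. It adds an inbound rule to the MongoDB instance's security group that only accepts traffic on port 27017 from the GraphQL server's security group. Both group IDs are placeholders, and the sketch assumes no broader inbound rule (such as 0.0.0.0/0) exists on the database group.

```python
def mongo_ingress_rule(app_security_group_id, port=27017):
    """Allow MongoDB traffic only from members of the application's security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "UserIdGroupPairs": [
            {"GroupId": app_security_group_id, "Description": "GraphQL server to MongoDB"}
        ],
    }


def lock_down_mongo(db_security_group_id, app_security_group_id):
    """Apply the rule. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    boto3.client("ec2").authorize_security_group_ingress(
        GroupId=db_security_group_id,
        IpPermissions=[mongo_ingress_rule(app_security_group_id)],
    )
```

Referencing the application's security group rather than an IP range means the rule keeps working if the GraphQL server's address changes.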

AWS + Elastic Beanstalk + MongoDB

I am trying to set up my microservices architecture using AWS Elastic Beanstalk and Docker. That is very easy to do, but when I launch the environment, it launches into the default VPC, thus giving public IPs to the instances. Right now, that's not too much of a concern.
What I am having a problem with is how to set up the MongoDB architecture. I have read "recommended way to install mongodb on elastic beanstalk" but am still unsure how to set this up.
So far I have tried:
1. Using the CloudFormation template from AWS here: http://docs.aws.amazon.com/quickstart/latest/mongodb/step2b.html to launch a primary with a 2-replica-node setup into the default VPC, but this gives the Mongo nodes public access. I am also not sure how to connect my application, since this does not add a NAT instance: do I simply connect directly to the primary node? If this node fails, will the secondary node's IP become the same as that of the primary node so that all connections remain consistent? Or do I need to add my own NAT instance?
2. Launching MongoDB into its own VPC (https://docs.aws.amazon.com/quickstart/latest/mongodb/step2a.html) and giving access via the NAT, but this means having two different VPCs (one for my EB instances and one for MongoDB). In this case, would I connect to the NAT from my EB VPC in order to route requests to the databases?
3. Launching a new VPC for the MongoDB architecture first and then trying to launch EB into this VPC. For some reason, the load balancing setup won't let me add it to the subnets, giving me the error: "Custom Availability Zones option not supported for VPC environments".
I am trying to launch all this in us-west-1. It's been two days now and I have no idea where to go or what the right way is to tackle this issue. I want the databases to be private (no public access) with a NAT gateway, so ideally my third method seems what I want, but I cannot seem to add the new EB instances/load balancer into the newly-created MongoDB VPC. This is the setup I'm going for: http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/images/default-vpc-diagram.png but I am trying to use the templates to do this.
What am I doing wrong here? Any help would be much, much appreciated. I have read up a lot about this but still am not sure where to go from here.
Thanks a lot in advance!
I'm having this same issue. There seems to be a complete lack of documentation on how to connect an Elastic Beanstalk node.js / express app to the MongoDB cluster created by the AWS Quick Start.
When I run the aws mongo quickstart though it launches a NAT which is public and also a private primary node... maybe this is part of your issue?

CloudFormation best practices in AWS

We are at an early stage of running our services on AWS. Our servers are hosted in a VPC with private and public subnets, with multiple instances in each, and we use an ELB and an Auto Scaling setup (based on AMIs) for the frontend web servers. The whole environment (VPC, security groups, EC2 instances, DB instances, S3 buckets, CloudFront) was initially set up manually using the AWS console.
The application servers run JBoss, and WAR files are deployed on them.
As per AWS best practices, we want to create the whole infrastructure using CloudFormation and set up test/stage/prod environments.
- Would it be a good idea to define all the above components (VPC, security groups, EC2 instances, DB instances, S3 buckets, CloudFront, etc.) in one CloudFormation stack/template? Or should we create two stacks: 1) one with the network-related components and 2) one with the EC2-related components?
- Once we have a prod environment running from a CloudFormation stack, and we want to roll out new AMIs to prod in the future, how can we update the live EC2 instances using CloudFormation without interruptions?
- What are the best practices or options for deploying code to multiple EC2 nodes when a new release is done? We don't use continuous integration at the moment.
It's a very good idea to separate your setup into multiple stacks. One obvious reason is that stacks have certain limits that you may reach eventually. A more practical reason is that you don't really need to update, say, your VPC every time you just want to deploy a new version. The network architecture typically changes less frequently. Another reason to avoid having one huge template, or to make changes to an "important" template needlessly, is that you always run the risk of messing things up. If there's an error in your template and you remove an important resource by accident (e.g. commented out) you'll be very sorry. So separating stacks out of sheer caution is probably a good idea.
If you want to update your application you can simply update the template with the new AMIs and CFN will know what needs to be recreated or updated. You can read about rolling updates here. However, I'd recommend considering using something a bit more straightforward for deploying your actual code, like Ansible or Chef.
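As an illustration of the AMI update, here is a hedged sketch using boto3's CloudFormation client. It assumes the template exposes the AMI as a stack parameter (called "AmiId" here, a made-up name) and keeps every other parameter at its previous value. Whether instances are replaced without interruption depends on the update policy defined on the Auto Scaling group in the template, which this sketch does not cover.

```python
def ami_update_parameters(parameter_keys, ami_key, new_ami_id):
    """Keep every stack parameter at its previous value except the AMI."""
    return [
        {"ParameterKey": key, "ParameterValue": new_ami_id}
        if key == ami_key
        else {"ParameterKey": key, "UsePreviousValue": True}
        for key in parameter_keys
    ]


def roll_out_ami(stack_name, parameter_keys, new_ami_id, ami_key="AmiId"):
    """Start the stack update. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    boto3.client("cloudformation").update_stack(
        StackName=stack_name,
        UsePreviousTemplate=True,  # only the parameter changes, not the template
        Parameters=ami_update_parameters(parameter_keys, ami_key, new_ami_id),
    )
```

If the stack contains IAM resources, update_stack also needs the matching Capabilities argument.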
I'd also recommend you look into Docker for packaging and deploying your application's nodes. Very handy.