Is there a way to cache a self-hosted PostgreSQL db connection across Lambda invocations? - postgresql

We have a self-hosted (i.e. on top of an EC2 instance) PostgreSQL instance, which we intend to access from a Step Functions state machine that invokes multiple Lambdas. I am aware that for RDS instances there is an RDS Proxy service available, which solves exactly this scenario. Given that our instance is self-hosted, is there a way to cache the database connection across multiple Lambda invocations without having to re-establish it for each one?
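For reference, the usual pattern is to create the connection at module scope rather than inside the handler, so warm invocations of the same execution environment reuse it. A minimal sketch with psycopg2, assuming the connection details come from hypothetical environment variables:

import os
import psycopg2

# Cached at module scope: it survives for as long as this Lambda execution
# environment stays warm, so subsequent invocations skip the connection setup.
conn = None


def get_connection():
    global conn
    if conn is None or conn.closed:
        # (Re)connect on a cold start or if the cached connection was closed;
        # a production version would also recover from connections dropped
        # server-side, e.g. by catching psycopg2.OperationalError and retrying.
        conn = psycopg2.connect(
            host=os.environ["DB_HOST"],
            port=int(os.environ.get("DB_PORT", "5432")),
            dbname=os.environ["DB_NAME"],
            user=os.environ["DB_USER"],
            password=os.environ["DB_PASSWORD"],
            connect_timeout=5,
        )
    return conn


def handler(event, context):
    with get_connection().cursor() as cur:
        cur.execute("SELECT 1")
        return {"result": cur.fetchone()[0]}

Note that each concurrent execution environment keeps its own cached connection, so the database still sees one connection per warm environment.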

Related

Limitations of using a single EC2 instance to deploy a database

My application requires MongoDB to run. I was wondering if you could explain the limitations of using a single EC2 instance to deploy a database.
What would be a couple of other options that wouldn't require EC2?
There are two basic options:
Run your own instance of MongoDB on EC2, or in a container.
Use a managed service such as MongoDB Atlas or DocumentDB
The first option gives you more control over the application and its runtime environment, and more flexibility in configuration.
The downside is the management overhead: upgrades, scaling, performance, configuration changes, and security patches. And this applies to the underlying operating system as well.

Why does RDS proxy make performance worse?

I deployed an RDS Aurora cluster for PostgreSQL 11 in AWS. My Lambda talks to this cluster via IAM authentication. Since Lambda is serverless, I have to create a connection to the database every time my Lambda is triggered and close the connection when it finishes. That is not great, since creating a DB connection is heavy and takes time. I have used X-Ray to observe the connection performance: it takes about 150 ms to create a new connection. It also puts a lot of load on the DB cluster, since there will be many short-lived connections.
After some searching I found that RDS Proxy is designed to solve this problem. So I deployed an RDS Proxy that uses username/password authentication to connect to my Aurora cluster, and my Lambda connects to the RDS Proxy via IAM authentication.
When I observe the connection-creation performance, it is worse: it takes more than 500 ms to create a connection, and sometimes even more than 1 second.
How come it is worse when using RDS Proxy? Is there anything I didn't configure in the proxy?
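For context, a minimal sketch of the per-invocation connection described above, assuming psycopg2 and hypothetical environment variables for the proxy endpoint, user, and database; the IAM token is generated with boto3 and passed as the password, and IAM authentication requires SSL:

import os
import boto3
import psycopg2

rds = boto3.client("rds")


def connect():
    # Short-lived IAM auth token (valid for 15 minutes), used as the password.
    token = rds.generate_db_auth_token(
        DBHostname=os.environ["PROXY_ENDPOINT"],
        Port=int(os.environ.get("DB_PORT", "5432")),
        DBUsername=os.environ["DB_USER"],
    )
    return psycopg2.connect(
        host=os.environ["PROXY_ENDPOINT"],
        port=int(os.environ.get("DB_PORT", "5432")),
        user=os.environ["DB_USER"],
        password=token,
        dbname=os.environ["DB_NAME"],
        sslmode="require",  # IAM authentication requires SSL
    )

Keeping the boto3 client and the token at module scope avoids regenerating them on every invocation, which is a cost separate from whatever the proxy itself adds.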

How to execute Amazon Lambda functions on dedicated EC2 server?

I am currently developing the backend for my app based on Amazon Web Services. I intended to use DynamoDB to store the user's data, but finally opted for MongoDB, which I have already installed on my EC2 instance.
I have some code written in Python to update/query the DB, and I want it to be executed directly on my instance (so it can access my DB) when a Cognito event triggers my Lambda function. Any ideas how I can accomplish this?
As mentioned by Gustavo Tavares, "the whole point of lambda is to run code without the need to deploy EC2 instances". And you do not have to put the EC2 instance with your database into "public" subnets for Lambda to access it. Actually, you should never do that.
When creating/editing the Lambda configuration you may select to run it in any of your VPCs (Configuration -> Advanced Settings -> VPC), then select the subnet(s) to run your Lambda in. This will create ENIs (Elastic Network Interfaces) for the virtual machines your Lambdas will run on.
Your subnets must have routing/ACLs configured to reach the subnets where the database resides. At least one of the security groups associated with the Lambda must also allow outbound traffic to the database subnet on the appropriate port (27017).
Since you mentioned that your Lambdas are "back-end", you should probably put them in the same "private" subnets as your MongoDB and avoid any access/routing headaches.
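If you prefer to script it rather than click through the console, a boto3 sketch; the function name, subnet IDs, and security group ID below are placeholders:

import boto3

lambda_client = boto3.client("lambda")

# Attach the function to private subnets and a security group that allows
# outbound traffic to the MongoDB subnet on port 27017.
lambda_client.update_function_configuration(
    FunctionName="my-backend-function",
    VpcConfig={
        "SubnetIds": ["subnet-0aaa1111", "subnet-0bbb2222"],
        "SecurityGroupIds": ["sg-0ccc3333"],
    },
)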
One way to accomplish this is to give the Lambda a SAM template, then use sam local invoke inside the EC2 instance to execute it locally.
OK BUT WHY OH WHY WOULD ANYONE DO THIS?
If your Lambda requires access to both a VPC and the Internet, and doesn't use a lot of memory and doesn't really require scalability, and you already wrote the code (*), it's actually 10x cheaper(**) and higher-performing to launch a t3.nano EC2 Spot Instance on a public subnet than to add a NAT Gateway to the Lambda function.
(*) if you have not written the code yet, don't even bother to make it a Lambda.
(**) 10x cheaper as in $3 vs $30, so this really only applies to hobbyist projects on a shoestring budget. Don't do this at work, because the cost of engineers' time to manage and maintain an EC2 instance will far exceed $30/month over the long term.
If you want Lambda to execute code on your EC2 instances, you'll need to use the SDK for the language you're writing your Lambda in. Then you can simply use the AWS API (Systems Manager Run Command) to run commands on your EC2 instance.
See: http://docs.aws.amazon.com/systems-manager/latest/userguide/run-command.html
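A sketch of that approach with boto3 and Run Command, assuming the instance runs the SSM agent under an instance profile that permits Systems Manager; the instance ID and script path are placeholders:

import time
import boto3

ssm = boto3.client("ssm")

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder


def handler(event, context):
    # Ask Systems Manager to run a shell command on the instance.
    sent = ssm.send_command(
        InstanceIds=[INSTANCE_ID],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["python3 /opt/app/update_db.py"]},
    )
    command_id = sent["Command"]["CommandId"]

    # Poll briefly for the result (a production version would handle timeouts
    # and errors more carefully).
    for _ in range(10):
        time.sleep(1)
        result = ssm.get_command_invocation(
            CommandId=command_id, InstanceId=INSTANCE_ID
        )
        if result["Status"] not in ("Pending", "InProgress"):
            return {"status": result["Status"], "output": result["StandardOutputContent"]}
    return {"status": "Timeout"}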
I think you misunderstood the idea of AWS lambda.
The whole point of lambda is to run code without the need to deploy EC2 instances. You upload the code and the infrastructure is provisioned on the fly. If your application does not need the infrastructure anymore (after a brief period), it vanishes and you will not be charged for the idle time. If you need it again, new infrastructure is provisioned.
If you have a service, like your MongoDB, running on EC2 instances, your Lambda functions can access it like any other code can. You just need to configure your Lambda code to connect to the EC2 instance, as you would if your database were installed on any other internet-facing server.
For example: you can put your MongoDB server in a public subnet of your VPC and assign an Elastic IP to your server. In your Python Lambda code you configure your driver to connect to this Elastic IP and update the database.
It will work as if every service were deployed on a different server across the internet: Cognito connects to the Lambda function across the internet, and then the Python code deployed in Lambda connects to your MongoDB across the internet.
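A minimal pymongo sketch of that setup: the Elastic IP, credentials, and database/collection names are placeholders, and the event fields shown assume a Cognito user pool trigger. Creating the client at module scope lets warm invocations reuse its connection pool:

from pymongo import MongoClient

# Placeholder Elastic IP and credentials.
client = MongoClient(
    "mongodb://app_user:app_password@203.0.113.10:27017/mydb",
    serverSelectionTimeoutMS=5000,
)
db = client["mydb"]


def handler(event, context):
    # Persist attributes from the Cognito trigger event, then return the
    # event unchanged, as Cognito triggers expect.
    db["users"].update_one(
        {"_id": event["userName"]},
        {"$set": {"email": event["request"]["userAttributes"].get("email")}},
        upsert=True,
    )
    return event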
If I can give you a piece of advice, try DynamoDB a little more. With DynamoDB it will be even simpler to make all this work, because you will not need to configure a public subnet and request an Elastic IP. And the API for DynamoDB is not very different from the MongoDB API.

Real-time sync between local Postgres instance and Azure Cloud Postgres instance

I need to set up a real-time sync process between an on-premises PostgreSQL instance and a cloud PostgreSQL instance. Please let me know what options are available to achieve this.
Do I have to use a specific tool, or can it be managed through replication?
Please advise.
Use PgPool
http://www.pgpool.net/mediawiki/index.php/Main_Page
from their web page:
pgpool-II can manage multiple PostgreSQL servers. Using the replication function enables creating a realtime backup on 2 or more physical disks, so that the service can continue without stopping servers in case of a disk failure.

Move RDS instance from EC2-Classic to VPC

I am currently migrating my production system from EC2-Classic to the VPC platform.
All is done except for RDS instance, which is still in EC2-Classic.
My original plan was to do migration with some downtime: shutdown all instances, take database snapshot, create new instance in VPC from this snapshot (RDS "Restore snapshot" feature).
Unfortunately, when I tried to do this I realized that I cannot restore to the instance type I want.
When I click "Restore" Amazon offers me only a limited number of options:
Expensive db.m3, db.r3 instances
Previous generation db.t1, db.m1, db.m2 instances
Ideally I'd like to create a db.t2 instance; is it possible to do that somehow?
Also, is there a way to migrate with zero downtime? So far I've found nothing in the Amazon docs.
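For what it's worth, a boto3 sketch of the snapshot-and-restore plan described above; the identifiers, DB subnet group, and instance class are placeholders, and whether db.t2 is accepted still depends on what the engine version offers at restore time:

import boto3

rds = boto3.client("rds")

# 1. Snapshot the EC2-Classic instance (during the downtime window).
rds.create_db_snapshot(
    DBInstanceIdentifier="prod-db-classic",
    DBSnapshotIdentifier="prod-db-migration-snapshot",
)
rds.get_waiter("db_snapshot_available").wait(
    DBSnapshotIdentifier="prod-db-migration-snapshot"
)

# 2. Restore the snapshot into the VPC by naming a DB subnet group there.
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="prod-db-vpc",
    DBSnapshotIdentifier="prod-db-migration-snapshot",
    DBInstanceClass="db.t2.medium",
    DBSubnetGroupName="prod-vpc-db-subnets",
)
rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="prod-db-vpc")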