Populate RDS on creation - aws-cloudformation

I am currently creating an RDS instance per account for several different AWS accounts, using CloudFormation scripts.
When creating these databases I would like them to have a similar structure. I created an SQL script which I can successfully run manually after the CloudFormation script has finished. However, I would like to execute it automatically as part of running the script.
My solution so far is to create an EC2 instance with a dependency on the RDS instance, let it run once, and then manually delete it later, but this is not a suitable solution. I couldn't find any other way, though.
Is it possible to run a query as part of a CloudFormation script?
FYI: I'm creating a PostgreSQL 11.5 instance.

The proper way is to use custom resources, but that requires some new development. If you already have an EC2 instance that populates the RDS instance from its UserData, you can automate its termination as follows:
Set InstanceInitiatedShutdownBehavior to terminate.
At the end of UserData, execute shutdown -h now to shut down the instance.
Since the shutdown behavior is terminate, the instance will be terminated automatically.
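For illustration, a minimal sketch of such a self-terminating seeding instance. This is an untested, assumption-laden example: the resource name MyRdsInstance, the AMI ID, the DbPassword parameter, and the S3 location of the SQL script are all placeholders, and the instance still needs a subnet and security groups that can reach the database.

  SeedInstance:
    Type: AWS::EC2::Instance
    DependsOn: MyRdsInstance            # hypothetical AWS::RDS::DBInstance
    Properties:
      ImageId: ami-0123456789abcdef0    # placeholder, e.g. an Amazon Linux 2 AMI
      InstanceType: t3.micro
      InstanceInitiatedShutdownBehavior: terminate
      IamInstanceProfile: MySeedProfile # placeholder; needs s3:GetObject
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          # Install the PostgreSQL client and fetch the schema script.
          yum install -y postgresql
          aws s3 cp s3://my-bucket/schema.sql /tmp/schema.sql
          # Run the schema against the freshly created database.
          PGPASSWORD='${DbPassword}' psql \
            -h ${MyRdsInstance.Endpoint.Address} \
            -U master -d mydb -f /tmp/schema.sql
          # Shutting down terminates the instance because of
          # InstanceInitiatedShutdownBehavior: terminate above.
          shutdown -h now

A Lambda-backed custom resource avoids the throwaway instance entirely, at the cost of writing and wiring up the function yourself.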

Related

Best way to set up jupyter notebook project in AWS

My current project has the following structure:
It starts with a script in a Jupyter notebook which downloads data from a CRM API into a local PostgreSQL database I manage with pgAdmin. After that it runs a cluster analysis, returns some scoring values, creates a table in the database with the results, and updates these values in the CRM with another API call. This process takes between 10 and 20 hours (the API only allows 400 requests per minute).
The second notebook reads the database, detects the last update, makes API calls to update the database since then, runs a k-means analysis to cluster the data, compares the results with the previous run, and updates the new values in the database and the CRM via the API. This second process takes less than 2 hours by my estimation, and I want this script to run every 24 hours.
After testing, this works fine. Now I'm evaluating how to put it into production on AWS. I understand that for the notebooks I need SageMaker, and from what I have seen it is not that complicated; my only doubt here is whether I can call the API without writing additional code or whether it needs some configuration. My second problem is the database. I don't understand the difference between RDS, which is the one I think I have to use for this, and Aurora or S3. My goal is to write as little code as possible, but I have tried some RDS tutorials like this one: https://www.youtube.com/watch?v=6fDTre5gikg&t=10s, and I understand it connects my local PostgreSQL to AWS, but I can't find the data in the Amazon console; it only creates an instance, and I don't see how to connect to it to analyze the data from SageMaker. My final goal is to run the notebooks in the cloud and connect to my PostgreSQL in the cloud. Any orientation about how to use these tools would be appreciated.
I don't understand the difference between RDS, which is the one I think I have to use for this, and Aurora or S3
RDS and Aurora are relational databases fully managed by AWS. "Regular" RDS lets you launch existing popular databases such as MySQL, PostgreSQL, and others, which you could just as well run at home or at work.
Aurora is AWS's in-house, cloud-native database implementation, compatible with MySQL and PostgreSQL. It can store the same data as RDS MySQL or PostgreSQL, but provides a number of features not available in regular RDS, such as more read replicas, distributed storage, global databases, and more.
S3 is not a database but an object store, where you keep files such as images, CSVs, and Excel sheets, much as you would store them on your computer.
I understand it connects my local PostgreSQL to AWS, but I can't find the data in the Amazon console; it only creates an instance
You can migrate the data from your local PostgreSQL to RDS or Aurora if you wish, but neither RDS nor Aurora will connect to your existing local database; they are databases themselves.
My final goal is to run the notebooks in the cloud and connect to my PostgreSQL in the cloud.
I don't see a reason why you wouldn't be able to connect to the database. You can try to make it work, and if you encounter difficulties you can ask a new question on SO with your RDS/Aurora setup details.

Is there a way to recover an orphaned confluence pgsql database?

We are experimenting with Kubernetes and Confluence in the cloud and have deployed Confluence connected to a pgsql database. When applying an update, something happened that caused the pgsql pod to tank and lose the persistent volume connections.
Thankfully the volume was set to retain, so we still have it, and I have since been able to point a new pgsql instance at it, but I can't find a way to get Confluence to see this existing database; Confluence just proceeds to the initial fresh-install screens. I've tried installing it on a temporary database and then modifying the confluence.cfg.xml file to point to the old data once the install completed, but Confluence will not restart when I try this.
Any help is appreciated.
Using the web installer you should get a step to select "My own database". From there you can configure the database credentials and host. Go ahead and let the installer run; it will overwrite the default settings but will retain your existing data.
Also, you may want to get on the psql shell via console and check to make sure that your data actually exists and you haven't ended up with an empty database.
If you're still stuck, reach out here and we can check out the next steps.
In my case the original solution posted here is accurate; however, I had to do this in a non-containerized environment. I installed Confluence on a VM using a blank database, then modified the confluence.cfg.xml file to point at the pgsql database in the Kubernetes cluster and restarted Confluence. I was able to see my data, so I then used Confluence's XML export feature to grab the dataset. I then blew away the Kubernetes environment, re-created it from scratch, and imported the backed-up XML into the new instance. Not a super clean way of doing it, but it got me where I needed to be.

Start stopped EC2 instance in AWS via CloudFormation

I have to start EC2 instances which are already stopped in my AWS account. I'm trying to build a CloudFormation script to do it.
Is this possible via CloudFormation without using any Lambda functions?
If not, what are the alternatives?
Thanks.
I'm assuming those instances were created manually (i.e., not via CloudFormation).
None that I'm aware of.
Using a Lambda-backed custom resource would be your best bet, in my opinion, if you really want to do it via CloudFormation. If you're open to accomplishing the same task without CloudFormation, it might be easier to do it using the AWS CLI.
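If you do want it inside CloudFormation, here is a rough, untested sketch of such a Lambda-backed custom resource. The instance ID is a placeholder, and the IAM role (which needs ec2:StartInstances plus the basic Lambda logging permissions) is omitted for brevity. Without CloudFormation, the same thing is a one-liner: aws ec2 start-instances --instance-ids i-0123456789abcdef0.

Resources:
  StartInstancesFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt StartInstancesRole.Arn   # role definition omitted
      Code:
        ZipFile: |
          import boto3
          import cfnresponse

          def handler(event, context):
              try:
                  # Only act on stack creation; updates and deletes
                  # just report success back to CloudFormation.
                  if event['RequestType'] == 'Create':
                      ids = event['ResourceProperties']['InstanceIds']
                      boto3.client('ec2').start_instances(InstanceIds=ids)
                  cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
              except Exception:
                  cfnresponse.send(event, context, cfnresponse.FAILED, {})

  StartStoppedInstances:
    Type: Custom::StartInstances
    Properties:
      ServiceToken: !GetAtt StartInstancesFunction.Arn
      InstanceIds:
        - i-0123456789abcdef0    # placeholder: your stopped instance ID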

AWS Cloudformation: Enable PostGIS Extension in RDS from Cloudformation

New to CloudFormation. I am spawning a PostgreSQL RDS instance using an AWS CloudFormation script. Is there a way to enable PostGIS (and other extensions) from the CloudFormation script?
Working with PostGIS
PostGIS is an extension to PostgreSQL for storing and managing spatial information. If you are not familiar with PostGIS, you can get a good general overview at PostGIS Introduction.
You need to perform a bit of setup before you can use the PostGIS extension. The following list shows what you need to do; each step is described in greater detail later in this section.
Connect to the DB instance using the master user name used to create the DB instance.
Load the PostGIS extensions.
Transfer ownership of the extensions to the rds_superuser role.
Transfer ownership of the objects to the rds_superuser role.
Test the extensions.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html
I'm not sure, but maybe you can create a Lambda function along with the RDS instance in your CloudFormation template and then invoke the Lambda to perform the steps above. You'd need to try it.
I think this can be done with AWSUtility::CloudFormation::CommandRunner.
Basically, it lets you run a bash command as part of your template (https://aws.amazon.com/blogs/mt/running-bash-commands-in-aws-cloudformation-templates/).
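I haven't tried it for this exact case, but based on that blog post a sketch could look like the following. The endpoint, credentials, role, and network IDs are all placeholders, and it assumes the instance that CommandRunner launches can install the psql client and reach the database:

EnablePostgis:
  Type: AWSUtility::CloudFormation::CommandRunner
  DependsOn: MyRdsInstance          # hypothetical AWS::RDS::DBInstance
  Properties:
    # Install the client, then create the extension as the master user.
    Command: >
      sudo yum install -y postgresql &&
      PGPASSWORD='mypassword'
      psql -h mydb.abc123.us-east-1.rds.amazonaws.com -U master -d mydb
      -c 'CREATE EXTENSION IF NOT EXISTS postgis;'
    Role: MyInstanceProfile         # placeholder role used by CommandRunner
    SubnetId: subnet-0123456789abcdef0
    SecurityGroupId: sg-0123456789abcdef0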
I don't think you will be able to achieve it using CloudFormation. CloudFormation is a provisioning tool, not a configuration management tool.

EC2 UserData on Child AMI [duplicate]

This question already has answers here:
Amazon EC2 custom AMI not running bootstrap (user-data)
(5 answers)
Closed 7 years ago.
My goal is to create a base AMI and then have child AMIs build on it.
I bootstrap the base AMI by passing a PowerShell script in the --user-data flag, and it works just fine.
However, when I create a child AMI from the base AMI, the child does not automatically run the script passed in the --user-data flag.
I understand that the RunOnceService registry setting can be used to execute the latest user data via the metadata call, but this seems hacky.
Is there a way to treat the child AMIs as new machines? Or get EC2 to run the script in the --user-data flag? Any other workarounds?
The default behavior of the EC2Config service is NOT to persist user-data settings after system startup. When your EC2 instance with the base AMI started up, this setting was toggled off during startup, so your subsequent child EC2 instances did not handle user data.
The easy fix is to add <persist>true</persist> to your user data. An example from the documentation:
<powershell>
insert script here
</powershell>
<persist>true</persist>
Related:
AWS Documentation - Configuring a Windows Instance Using the EC2Config Service