Terraform remote backend using Postgres

I am planning to use Postgres as the remote backend instead of S3, as the enterprise standard.
terraform {
  backend "pg" {
    conn_str = "postgres://user:pass@db.example.com/schema_name"
  }
}
With the Postgres remote backend, when we run terraform init we have to provide a schema specific to that Terraform folder, because the backend uses a single table per schema and creates a new record for each workspace name.
I am stuck now: I have 50 projects, each with 2 tiers maintained in different folders, so we would need to create 100 schemas in Postgres. It is also difficult to handle that many schemas in automated provisioning.
Can we handle this similarly to S3, where we have one bucket for all projects and multiple entries in the same bucket, each with a different key specified in each Terraform script? Can we use a single schema for all projects, with multiple tables/records based on a key provided in the backend configuration of each Terraform folder?

You can use a single database, and the pg backend will automatically create the schema you specify.
Something like this:
terraform {
  backend "pg" {
    conn_str = "postgres://user:pass@db.example.com/terraform_backend"
    schema   = "fooapp"
  }
}
This keeps the projects unique, at least. You could append a tier to that, too, or use Terraform Workspaces.
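For example, workspaces could carry the tier; a minimal sketch (the workspace name is just an illustration):

terraform workspace new prod      # with the pg backend, each workspace becomes its own state record in the schema
terraform workspace select prod   # switch to it before plan/apply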
If you specify the config on the command line (aka partial configuration), as the documentation recommends, it might be easier to set the schema dynamically for your use case:
terraform init \
  -backend-config="conn_str=postgres://user:pass@db.example.com/terraform_backend" \
  -backend-config="schema=fooapp-prod"
This works well in a scenario similar to yours: each project has a unique schema in a shared database, and no tasks beyond the initial creation/configuration of the database are needed, because the backend creates the schema as specified.
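If you want to script this across all 100 folders, a small wrapper can derive the schema name from the folder layout. A rough sketch, assuming a <project>/<tier> directory convention (the naming is an assumption about your setup):

# Run inside one tier folder, e.g. ./fooapp/prod
PROJECT=$(basename "$(dirname "$PWD")")   # e.g. "fooapp"
TIER=$(basename "$PWD")                   # e.g. "prod"
terraform init \
  -backend-config="conn_str=postgres://user:pass@db.example.com/terraform_backend" \
  -backend-config="schema=${PROJECT}-${TIER}"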

Related

How to migrate a clustered RDS (Postgres) to a different account?

I am migrating from AccountA (source) to AccountB (target), in the same region.
I ran templates, so AccountB already has an RDS cluster, but with no data. The DB instance ID is exactly the same as in the source account.
My goal is to have the same endpoint as before, since we're retiring AccountA completely and I don't want to change the code side for the updated endpoint.
I took a snapshot of the cluster (writer instance), copied the snapshot with a KMS key, and shared it with AccountB. All good. Now, from AccountB (target), I copied the snapshot and attempted to restore. I thought I could restore directly into the existing RDS cluster, but I see that's not doable, since a restore always creates a new instance.
Then I renamed the existing empty RDS cluster to something else to free up the DB instance ID, and renamed the temp cluster to the original name. It worked, but this does not seem efficient.
What is the best practice for RDS data migration?
Clustered RDS (writer/reader) and cross-account.
I didn't create a snapshot for the reader. Will it be synced from the writer automatically once I restore?
Your experience is correct -- RDS Snapshots are restored as a new RDS instance (rather than loading data into an existing instance).
By "endpoint", if you are referring to the DNS Name used to connect to the database, then the new database will not have the same endpoint. If you want to preserve an endpoint, then you can create a DNS Name with a CNAME record that resolves to the DNS Name of the database. That way, you can change the CNAME record to point to a different endpoint without needing to change the code. (However, you probably haven't done this, so you'll need to change the code to point to the new DNS Name anyway, so it's the same amount of work.)
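If you do set up such a CNAME, repointing it after a restore is a single Route 53 call; a sketch, where the hosted zone ID, record name, and cluster endpoint are placeholders:

aws route53 change-resource-record-sets \
  --hosted-zone-id Z1234567890ABC \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "db.example.internal",
        "Type": "CNAME",
        "TTL": 60,
        "ResourceRecords": [{"Value": "mycluster.cluster-abc123xyz.us-east-1.rds.amazonaws.com"}]
      }
    }]
  }'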
You are correct that you do not need to snapshot/copy the Readers -- you can simply create them from the new database. You will need to 'create' the Readers on the new database after the restore, since the Snapshot simply contains data for the main database.
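For reference, the restore-and-recreate-readers flow in AccountB could look roughly like this with the CLI, assuming an Aurora PostgreSQL cluster and placeholder identifiers:

# Restore a new cluster from the shared/copied snapshot (a restore always creates a new cluster)
aws rds restore-db-cluster-from-snapshot \
  --db-cluster-identifier mycluster \
  --snapshot-identifier copied-cluster-snapshot \
  --engine aurora-postgresql

# The snapshot carries only the cluster data, so add the writer and any reader instances afterwards
aws rds create-db-instance \
  --db-instance-identifier mycluster-writer \
  --db-cluster-identifier mycluster \
  --db-instance-class db.r5.large \
  --engine aurora-postgresql

aws rds create-db-instance \
  --db-instance-identifier mycluster-reader-1 \
  --db-cluster-identifier mycluster \
  --db-instance-class db.r5.large \
  --engine aurora-postgresql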

Google Cloud SQL PostgreSQL replication?

I want to make sure that there's not a better (easier, more elegant) way of emulating what I think is typically referred to as "logical replication" ("logical decoding"?) within the PostgreSQL community.
I've got a Cloud SQL instance (PostgreSQL v9.6) that contains two databases, A and B. I want B to mirror A, as closely as possible, but don't need to do so in real time or anything near that. Cloud SQL does not offer the capability of logical replication where write-ahead logs are used to mirror a database (or subset thereof) to another database. So I've cobbled together the following:
A Cloud Scheduler job publishes a message to a topic in Google Cloud Platform (GCP) Pub/Sub.
A Cloud Function kicks off an export. The exported file is in the form of a pg_dump file.
The dump file is written to a named bucket in Google Cloud Storage (GCS).
Another Cloud Function (the import function) is triggered by the writing of this export file to GCS.
The import function makes an API call to delete database B (the pg_dump file created by the export API call does not contain initial DROP statements and there is no documented facility for adding them via the API).
It creates database B anew.
It makes an API call to import the pg_dump file.
It deletes the old pg_dump file.
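For what it's worth, the export/drop/create/import sequence corresponds roughly to these gcloud calls (a sketch of the Admin API operations the functions perform; the instance, bucket, and database names are placeholders):

gcloud sql export sql my-instance gs://my-bucket/a-dump.sql --database=A   # export database A
gcloud sql databases delete B --instance=my-instance --quiet               # drop the mirror database
gcloud sql databases create B --instance=my-instance                       # recreate it empty
gcloud sql import sql my-instance gs://my-bucket/a-dump.sql --database=B   # load the dump into B
gsutil rm gs://my-bucket/a-dump.sql                                        # clean up the dump file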
That's five different objects across four GCP services, just to obtain already existing, native functionality in PostgreSQL.
Is there a better way to do this within Google Cloud SQL?

Create an RDS/Postgres Replica in another AWS account?

I have an AWS account with a Postgres RDS database that represents the production environment for an app. We have another team that is building an analytics infrastructure in a different AWS account. They need to be able to pull data from our production database to hydrate their reports.
From my research so far, it seems there are a couple options:
Create a bash script that runs on a cron schedule and uses pg_dump and pg_restore, and stash it on an EC2 instance in one of the accounts.
Automate the process of creating a snapshot on a schedule and then ship that to the other account's S3 bucket. Then create a Lambda (or other script) that triggers when the snapshot is placed in the S3 bucket and restores it. The downside is that we'd have to create a new RDS instance with each restore (since you can't restore a snapshot to an existing instance), which changes the FQDN of the database (we could mitigate that with Route53 and a CNAME that gets updated, but this is complicated).
Create a read-replica in the origin AWS account and open up security for that instance so they can just access it directly (but then my account is responsible for all the costs associated with hosting and accessing it).
None of these seem like good options. Is there some other way to accomplish this?
I would suggest using AWS Database Migration Service (DMS). It can listen to changes on your source database and stream them to a target (https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Task.CDC.html).
There is also a third-party blog post explaining how to set this up
https://medium.com/tensult/cross-account-and-cross-region-rds-mysql-db-replication-part-1-55d307c7ae65
Pricing is per hour, depending on the size of the replication EC2 instance. It runs in the target account, so it will not be on your cost center.
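At a high level, a DMS setup for this involves a replication instance, two endpoints, and an ongoing-replication (CDC) task; roughly the following, with ARNs and connection details as placeholders:

# In the analytics (target) account
aws dms create-replication-instance \
  --replication-instance-identifier prod-replication \
  --replication-instance-class dms.t3.medium

aws dms create-endpoint --endpoint-identifier prod-source --endpoint-type source \
  --engine-name postgres --server-name prod-db.example.com --port 5432 \
  --username dms_user --password '<password>' --database-name app

aws dms create-endpoint --endpoint-identifier analytics-target --endpoint-type target \
  --engine-name postgres --server-name analytics-db.example.com --port 5432 \
  --username dms_user --password '<password>' --database-name analytics

aws dms create-replication-task --replication-task-identifier prod-to-analytics \
  --source-endpoint-arn <source-endpoint-arn> --target-endpoint-arn <target-endpoint-arn> \
  --replication-instance-arn <replication-instance-arn> \
  --migration-type full-load-and-cdc \
  --table-mappings file://table-mappings.json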

Using Ansible to automatically configure AWS autoscaling group instances

I'm using Amazon Web Services to create an autoscaling group of application instances behind an Elastic Load Balancer. I'm using a CloudFormation template to create the autoscaling group + load balancer and have been using Ansible to configure other instances.
I'm having trouble wrapping my head around how to design things such that when new autoscaling instances come up, they can automatically be provisioned by Ansible (that is, without me needing to find out the new instance's hostname and run Ansible for it). I've looked into Ansible's ansible-pull feature but I'm not quite sure I understand how to use it. It requires a central git repository which it pulls from, but how do you deal with sensitive information which you wouldn't want to commit?
Also, the current way I'm using Ansible with AWS is to create the stack using a CloudFormation template, then I get the hostnames as output from the stack, and then generate a hosts file for Ansible to use. This doesn't feel quite right – is there a "best practice" for this?
Yes, another way is simply to run your playbooks locally once the instance starts. For example, you can create an EC2 AMI for your deployment whose rc.local file (Linux) calls ansible-playbook -i <inventory-only-with-localhost-file> <your-playbook>.yml. rc.local is almost the last script run at startup.
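As a rough illustration (paths, playbook name, and log location are just examples):

# One-line inventory baked into the AMI, targeting only the local machine
echo "localhost ansible_connection=local" > /etc/ansible/local-inventory

# Appended to /etc/rc.local so the playbook runs against localhost at boot
ansible-playbook -i /etc/ansible/local-inventory -c local /etc/ansible/site.yml \
  >> /var/log/ansible-firstboot.log 2>&1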
You could just store that sensitive information in your EC2 AMI, but this is a very wide topic and really depends on what kind of sensitive information it is. (You can also use private git repositories to store sensitive data).
If, for example, your playbooks are updated regularly, you can create a cron entry in your AMI that runs every so often and re-runs your playbook, so your instance configuration is always up to date. This avoids having to "push" from a remote workstation.
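The cron variant could be as simple as a drop-in file (schedule and paths are illustrative):

# /etc/cron.d/ansible-converge -- re-apply the playbook every 30 minutes
*/30 * * * * root ansible-playbook -i /etc/ansible/local-inventory -c local /etc/ansible/site.yml >> /var/log/ansible-cron.log 2>&1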
This is just one approach; there could be many others, and it depends on what kind of service you are running, what kind of data you are using, etc.
I don't think you should use Ansible to configure new auto-scaled instances. Instead, use Ansible to configure a new image, create an AMI (Amazon Machine Image) from it, and have AWS Auto Scaling launch from that.
On top of this, you should also use Ansible to easily update your existing running instances whenever you change your playbook.
Alternatives
There are a few ways to do this. First, I wanted to cover some alternative ways.
One option is to use Ansible Tower. This creates a dependency though: your Ansible Tower server needs to be up and running at the time autoscaling or similar happens.
The other option is to use something like packer.io and build fully functioning server AMIs. You can install all your code into these using Ansible. This doesn't have any non-AWS dependencies, and has the advantage that servers start up fast. Generally speaking, building AMIs is the recommended approach for autoscaling.
Ansible Config in S3 Buckets
The alternative route is a bit more complex, but has worked well for us when running a large site (millions of users). It's "serverless" and only depends on AWS services. It also supports multiple Availability Zones well, and doesn't depend on running any central server.
I've put together a GitHub repo that contains a fully working example with CloudFormation. I also put together a presentation for the London Ansible meetup.
Overall, it works as follows:
Create S3 buckets for storing the pieces that you're going to need to bootstrap your servers.
Save your Ansible playbook and roles etc in one of those S3 buckets.
Have your Autoscaling process run a small shell script. This script fetches things from your S3 buckets and uses them to "bootstrap" Ansible.
Ansible then does everything else.
All secret values such as Database passwords are stored in CloudFormation Parameter values. The 'bootstrap' shell script copies these into an Ansible fact file.
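A minimal sketch of what such a bootstrap script might do (the bucket name, paths, and fact-file layout are assumptions, not the exact contents of the repo mentioned above):

#!/bin/bash
set -euo pipefail

# Pull the playbook and roles that were uploaded to S3
aws s3 sync s3://my-ansible-bucket/playbooks/ /opt/ansible/

# Write the CloudFormation parameters (handed to the instance, e.g. via user data) as local Ansible facts
mkdir -p /etc/ansible/facts.d
cat > /etc/ansible/facts.d/bootstrap.fact <<EOF
{"db_password": "${DB_PASSWORD}", "app_environment": "${APP_ENVIRONMENT}"}
EOF

# Run Ansible against this machine only
ansible-playbook -i 'localhost,' -c local /opt/ansible/site.yml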
So that you're not dependent on external services being up you also need to save any build dependencies (eg: any .deb files, package install files or similar) in an S3 bucket. You want this because you don't want to require ansible.com or similar to be up and running for your Autoscale bootstrap script to be able to run. Generally speaking I've tried to only depend on Amazon services like S3.
In our case, we then also use AWS CodeDeploy to actually install the Rails application itself.
The key bits of the config relating to the above are:
S3 Bucket Creation
Script that copies things to S3
Script to copy Bootstrap Ansible. This is the core of the process. This also writes the Ansible fact files based on the CloudFormation parameters.
Use the Facts in the template.

How to replicate MySQL database to Cloud SQL Database

I have read that you can replicate a Cloud SQL database to MySQL. Instead, I want to replicate from a MySQL database (that the business uses to keep inventory) to Cloud SQL so it can have up-to-date inventory levels for use on a web site.
Is it possible to replicate MySQL to Cloud SQL? If so, how do I configure that?
This is something that is not yet possible in CloudSQL.
I'm using DBSync to do it, and it's working fine.
http://dbconvert.com/mysql.php
The Sync version does what you want.
It works well with App Engine and Cloud SQL. You must authorize external connections first.
This is a rather old question, but it is worth noting that this now seems possible by Configuring External Masters.
The high level steps are:
Create a dump of the data from the master and upload the file to a storage bucket
Create a master instance in CloudSQL
Set up a replica of that instance, using the external master IP, username, and password. Also provide the dump file location.
Set up additional replicas if needed.
Voilà!
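For the first step, a hedged sketch (host, user, database, and bucket names are placeholders; check the external-masters documentation for the exact dump flags your MySQL version needs):

# Dump the business MySQL database without taking long locks
mysqldump --host=inventory-db.internal --user=repl_user -p \
  --single-transaction --set-gtid-purged=ON --databases inventory \
  > inventory-dump.sql

# Upload the dump to a Cloud Storage bucket so Cloud SQL can read it when the replica is created
gsutil cp inventory-dump.sql gs://my-replication-bucket/inventory-dump.sql

The external master representation and the replica itself are then created through the Cloud SQL console or Admin API, as in the steps above.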