Physically recover heroku postgresql database - postgresql

Heroku makes clear in its documentation and in its blog that it stores postgres database physical backups (that's binary copies of the database cluster files) in S3 using its software called wal-e.
Does somebody know if there is a way for the final user to access them?
Notice that I'm talking about physical backup, not the logical one provided by PGBackups plugin. This issue is related with a database in the free plan, without rollbacks, forks nor follows.
Thanks a lot.

I wrote and write most of WAL-E on Heroku's behalf. There is no access to the archives because it is operating system and architecture bit-depth dependent. On the "Free" and "Hobby" tiers the archives contain a mix of data from multiple tenants, which is relevant to why fork/follow are not supported.

Related

Rundeck MariaDB hot backup

On rundeck backup guide, noted that is mandatory to stop rundeck to take full backup when using data file. Now, that guide don't show any secure method to backup full rundeck instance (rundeck server + database) when using MariaDB, PostgreSQL, or any supported database as a backend.
In a real production scenario, not seems to be possible to stop rdeck on a daily basis.
Can anybody share best pratices to take a hot full backup on rdeck installation without stop rdeck?
Is there any secure and supported way to achive a full consistent rdeck projects and jobs definitions and database on a daily basis ?
In this post, answer is not clear, because question don't describe what kinbd of backend are used.
The documentation suggests the instance shutdown because some execution could be active, and that means a potentially active transaction in the middle of the "hot backup process" which means a potential data corruption in your backup. Is the safest way to backup your database.
If you want to do a "hot" backup you can export your projects (with all content, including jobs) and keys.

pg_dump from Azure Postgres Service which contains large data set

We are facing well-known pg_dumps effeciency problems in terms of velocity. We currently have a Azure hosted PostgreSQL, which holds our resources that are being created/updated by SmileCDR. Somehow after three months it is getting larger due to saving FHIR objects. Now, we want to have brand new environment; in that case persistent data in PostgreSQL has to be ripped out and new database has to be initiated with old data set.
Please be advised.
pg_dump consumes relative much more time, almost a day. How can we speed up backup-restore process?
What kind of alternatives that we could use and apply whereas pg_dump to achieve the goal?
Important notes;
Flyway utilized by SmileCDR to make versioning in PostgreSQL.
Everything has to be copied from old one to new one.
PostgreSQL version is 11, 2vCores, 100GB storage.
FHIR objects are being kept in PostgreSQL.
Some kind of suggestions like multiple jobs, without compress, directory format have been practiced but it didn't affect significantly.
Since you put yourself in the cage of database hosting, you have no alternative to pg_dump. They make it deliberately hard for you to get your data out.
The best you can do is a directory format dump with many processes; that will read data in parallel wherever possible:
pg_dump -F d -j 6 -f backupdir -h ... -p ... -U ... dbname
In this example, I specified 6 processes to run in parallel. This will speed up processing unless you have only one large table and all the others are quite small.
Alternatively, you may use smileutil with the synchronize-fhir-servers command, bulk export API on the system level, subscription mechanism. Just a warning that these options may be too slow to migrate the 100Gb database.
As you mentioned, if you can replicate the Azure VM that may be the fastest.

Can you use AWS DMS to move Aurora DB from one account to another?

I am trying to migrate an Aurora cluster from one of our accounts to another. We actually do not have a lot write requests and the database itself is quite small, but somehow we decided to minimize the downtime.
I have looked into several options
Use snapshot: cut off the mutation in source DB, take snapshot, share and restore in another account. This would definitely introduce some downtime
Use Aurora cloning: cut off the mutation in source DB, clone the cluster in target account and switch to the target DB. According to AWS, the cloning is much faster than taking and restoring a snapshot, so the downtime should be shorter.
I am not sure if I can use DMS to do this as I did not find useful doc/tutorials about moving Aurora across accounts. Also, I am not sure whether DMS will sync any write requests to target DB during migration.
If DMS can not live sync, then I probably should use Bucardo to live migrate.
Looking at the docs, AWS Aurora with PostgreSQL compatibility is allowed as source & target endpoints. So, answering your question, yes it's possible.
Obviously, your source Aurora DB should be accessible from the target account. Check that the DB endpoint is public and the traffic is not restricted by ACLs rules or SGs rules.
Also, if you want to enable ongoing replication, you need to grant rds_replication (or rds_superuser) role to the source database user. Link to the docs.
We actually ended up using DMS for this migration. What we did was:
Take a snapshot of the target DB in the original account.
Share the snapshot to the target account and restore it over there. (You have to use snapshot for migrating things like triggers, custom types, sequence, etc)
Setup connections (like VPC peering or security groups) between two accounts.
Setup DMS in source account (endpoints, replication instance, task)
Write SQL to temporarily disable/delete constraints, triggers, etc which may cause error when load source data.
Using DMS to load source data and enable ongoing replication.
Enable/add constraints, triggers, etc back.
Post migration test

Is Postgres designed to write to shared data stores?

Here's an important tip about volume sharing in docker:
Multiple containers can also share one or more data volumes. However,
multiple containers writing to a single shared volume can cause data
corruption. Make sure your applications are designed to write to
shared data stores.
In this context, does Postgres designed to write to shared data stores?
In other words, is it safe to run multiple Postgres containers (possibly with different minor versions) working with same database files located at the data volume?
You cannot have multiple PostgreSQL installations run against the same shared data files, this is a sure recipe for data corruption.
If your need is to update PostgreSQL without downtime, you'll need to use a replication solution that works between different major PostgreSQL versions so that you can first build a copy of the database with the new version an then switch over quickly in a controlled fashion. This still causes a small outage that has to be handled by the application.
Replication solutions that can be used are external replication tools like Slony-I or logical replication. Logical replication is fairly new, it will ship with PostgreSQL v10 (which won't help you with a current upgrade problem), but you can use it with pglogical from PostgreSQL 9.4 on.

MongoDB replication to hard drive

MonogDB's dynamic schema design is driving me towards it to replace MySQL in a production site. But this project runs on only 1 dedicated server (with 2 hard drives).
Docs about "MongoDB for production" recommends multiple servers. This makes me wonder if MongoDB is only suited for large commercial projects?
Anyways... I am wondering if the live database data can be replicated to the second hard drive for backup & recovery (to recover from corrupt data due to hard stop).
Any thoughts against the use of MongoDB in a single server environment is also appreciated. In this project, the biggest database will be less than 7GB.
Thanks