Bootstrap bucardo replication after pg_restore - postgresql

I am currently setting up master/master replication with Bucardo between 5 nodes in different locations (to provide location transparency). The database holds ~500 tables that should be replicated. I grouped them into smaller replication herds of at most 50 tables each, based on their dependencies on each other. All tables have primary keys defined, and the sequences on each node are set up to provide system-wide unique identities (based on residue classes).
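For readers unfamiliar with the residue-class scheme, here is a minimal sketch of the idea, assuming node numbers 1 to 5 and a made-up sequence name. Each node starts at its own number and steps by the node count, so no two nodes ever hand out the same ID. The helper names are hypothetical and only illustrate the arithmetic.

```python
from itertools import islice

NODE_COUNT = 5  # five nodes, as in the setup described above


def node_ids(node_index: int, node_count: int = NODE_COUNT):
    """Yield the IDs one node hands out: node_index, node_index + node_count, ..."""
    if not 1 <= node_index <= node_count:
        raise ValueError("node_index must be between 1 and node_count")
    value = node_index
    while True:
        yield value
        value += node_count


def sequence_ddl(sequence_name: str, node_index: int, node_count: int = NODE_COUNT) -> str:
    """Build the CREATE SEQUENCE statement a given node would run.

    The sequence name is passed in unquoted, so it is assumed to be a plain identifier.
    """
    return (
        f"CREATE SEQUENCE {sequence_name} "
        f"START WITH {node_index} INCREMENT BY {node_count};"
    )
```

Calling sequence_ddl("my_seq", 2) gives the statement for node 2, and list(islice(node_ids(2), 3)) shows the first IDs that node would produce.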
To get an initial database on each node, I made a --data-only custom-format pg_dump into a file and restored it on each node via pg_restore. The Bucardo sync is set up with the bucardo_latest strategy to resolve conflicts. Now when I start syncing, Bucardo first deletes all records in the origin database and then inserts them again from one of the restored nodes, because all restored records have a "later timestamp" (the point in time when I called pg_restore). This effectively prevents the initial startup: Bucardo needs a very long time and also fails, because there are many records to resolve and the timeouts are often too short.
I also have 'last_modified' timestamps on each table which are managed by UPDATE triggers, but as I understand it, pg_dump inserts data via COPY, and therefore these triggers don't get fired.
Which timestamp does bucardo use to find out who is bucardo_latest?
Do I have to call pg_dump with something like set SESSION_REPLICATION_ROLE = 'replica';?
I just want bucardo to keep track of every new change, not executing pseudo changes because of the restore.
EDIT: pg_restore definitely fired several triggers at restore time. As I said, I keep track of the user and last modification date in each table, and those values are now set to the user and timestamp of the restore. I am aware that I can set SESSION_REPLICATION_ROLE for a plain-text format restore via psql. Is this also possible for pg_restore somehow?

The common approach is to do the dump/restore before configuring the replication.
So one option would be:
drop the bucardo schema in each database
do a bucardo remove for each object (most of them accept all, as in bucardo remove table all)
dump/restore your data
configure the replication again. When adding the sync, make sure to set the option onetimecopy=0. It's the default, but I feel safer making it explicit.
Which timestamp does bucardo use to find out who is bucardo_latest?
Bucardo maintains its own timestamp value. Each table should have a trigger named like bucardo.delta_myschema_mytable that performs an insert into a table named like bucardo.delta_myschema_mytable. This table has a column txntime timestamp with time zone not null default now(), and this is the timestamp that is used.
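To see why a restore can "win" every conflict, here is a simplified model of a latest-wins rule based on txntime. It is not Bucardo's actual code, only a sketch under the assumption that the newest txntime per primary key decides; the tuple layout and the function name are made up for illustration.

```python
from datetime import datetime, timezone


def resolve_latest(deltas):
    """Pick, per primary key, the node whose delta row has the newest txntime.

    `deltas` is an iterable of (node, primary_key, txntime) tuples, a made-up
    stand-in for the rows of the per-node delta tables.
    """
    winners = {}
    for node, pk, txntime in deltas:
        current = winners.get(pk)
        # Keep the entry with the most recent txntime for this primary key.
        if current is None or txntime > current[1]:
            winners[pk] = (node, txntime)
    return {pk: node for pk, (node, _) in winners.items()}
```

If a row was last changed on node A at 09:00 and the restore on node B wrote a delta row at 12:00, this rule picks B, which matches the behaviour described in the question.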
Do I have to call pg_dump with something like set SESSION_REPLICATION_ROLE = 'replica';?
AFAIK, if bucardo triggers are already set in the tables, the option --disable-triggers of pg_restore should do the trick.
You can also check these articles about working with large databases and the use of session_replication_role
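On the question of setting session_replication_role for pg_restore: pg_restore connects through libpq, which reads the PGOPTIONS environment variable, so the setting can be passed that way. Below is a minimal sketch, assuming a role that is allowed to change session_replication_role (normally a superuser); the file and database names are placeholders and the helper names are hypothetical.

```python
import os
import subprocess


def restore_env(base_env=None):
    """Return a copy of the environment that sets session_replication_role=replica."""
    env = dict(os.environ if base_env is None else base_env)
    extra = "-c session_replication_role=replica"
    # Append to an existing PGOPTIONS value instead of overwriting it.
    env["PGOPTIONS"] = f"{env['PGOPTIONS']} {extra}" if env.get("PGOPTIONS") else extra
    return env


def restore_command(dump_file, dbname):
    """Build the pg_restore argument list for a data-only custom-format dump."""
    return ["pg_restore", "--data-only", "--dbname", dbname, dump_file]


def run_restore(dump_file, dbname):
    """Run pg_restore with the replica role set for its session."""
    subprocess.run(restore_command(dump_file, dbname), env=restore_env(), check=True)
```

With the role set to replica, triggers in the default enabled state do not fire during the restore, which is the effect the question asks for. Test it on a scratch database first.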

Related

pg_repack and logical replication: any risk to missing out on changes from the table while running pg_repack?

As I understand it, pg_repack creates a temporary 'mirror' table (table B), copies the rows from the original table (table A) into it, re-indexes them, and then replaces the original with the mirror. The mirroring step creates a lot of noise with logical replication (a lot of inserts at once), so I'd like to exclude the mirror table from replication.
I'm a bit confused about what happens during the switchover, though. Is there a risk of losing some changes? I don't think there is, since all actual writes still go to the original table before and after the switch, so it should be safe, right?
We're running Postgres 10.7 on AWS Aurora, using wal2json as the output plugin for replication.
I have used neither pg_repack nor logical replication, but according to the pg_repack GitHub repository there is a possible issue when using pg_repack with logical replication, see:
https://github.com/reorg/pg_repack/issues/135
To perform a repack, pg_repack will:
create a log table to record changes made to the original table.
add a trigger onto the original table, logging INSERTs, UPDATEs, and DELETEs into our log table.
create a new table containing all the rows in the old table.
build indexes on this new table.
apply all changes which have occurred in the log table to the new table.
swap the tables, including indexes and toast tables, using the system catalogs.
drop the original table.
In my experience, the log table keeps all changes made during the repack, and they are applied to the new table after the indexes are built. The changes are applied to the original table as well, so they are not lost if the repack has to roll back.

How does AWS postgres RDS read replication handle schema switching?

I want to know how AWS RDS for PostgreSQL handles replication when I rename schemas to "swap" them within the read/write instance of the database.
Does it replicate this action to the read-replicas by sending on the "alter schema" rename commands I gave to my read/write instance? Or after my renames, does it see wholly different sets of data in the schemas and do a whole new copy of each out to the read-replicas?
For example...
In my RDS instance I have a read/write instance of "my_mega_database" which I want to create read-replicas of for my applications to connect to.
Typically, in "my_mega_database" there are two schemas "my_data" and "my_data_old", whereby "my_data" contains data that was delivered last night, and "my_data_old" contains data from the previous night. Each contains many tables and huge amounts of data.
If I were to do the following...
ALTER SCHEMA my_data_old RENAME TO my_data_tmp;
ALTER SCHEMA my_data RENAME TO my_data_old;
ALTER SCHEMA my_data_tmp RENAME TO my_data;
... I have effectively swapped these around.
My expectation is that these actions are replicated via the postgres WAL (ie: it sends the rename commands out to the replicas) and AWS RDS replication won't try and waste time copying huge amounts of data all over the place.
Is this correct?
(Speaking about PostgreSQL here, but RDS is probably similar.)
Renaming a schema (or any other object) is a small update in a catalog table, and no data are moved. Internally PostgreSQL uses only the numeric object ID, which stays the same.
You might wrap the three statements in a transaction to make the whole magic atomic.
The same is true on the standby: it is a trivial (meta)data modification.
The only thing that might be a problem is concurrent sessions holding locks.
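As a sketch of the transaction-wrapped swap suggested above, the small helper below builds the statements as one script. The function name is made up, and the schema names are inserted unquoted, so this assumes plain identifiers.

```python
def swap_schemas_sql(current: str, previous: str, scratch: str) -> str:
    """Build a single-transaction script that swaps two schema names.

    `scratch` is a temporary name that must not exist yet. Names are used
    unquoted, so they are assumed to be simple identifiers.
    """
    statements = [
        "BEGIN;",
        f"ALTER SCHEMA {previous} RENAME TO {scratch};",
        f"ALTER SCHEMA {current} RENAME TO {previous};",
        f"ALTER SCHEMA {scratch} RENAME TO {current};",
        "COMMIT;",
    ]
    return "\n".join(statements)
```

print(swap_schemas_sql("my_data", "my_data_old", "my_data_tmp")) produces the three renames from the question between BEGIN and COMMIT, so other sessions see either the old or the new naming, never a half-swapped state.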

Create a customized slave of postgresql

I need to create a slave for BI purposes, and I need to modify some tables (e.g., remove all passwords or sensitive data). My database is PostgreSQL. I wonder whether I can do this in the database layer or whether I should do it programmatically by writing code to do the replication.
You could use logical replication and have replica-enabled triggers (which fire only on replication) that modify the data when the changes are applied:
ALTER TABLE mytab DISABLE TRIGGER mytrig;
ALTER TABLE mytab ENABLE REPLICA TRIGGER mytrig;
You have to make sure that no replication conflicts can arise from these modifications. For example, never modify a key column.
Replication conflicts would stop replication and break your system.
The traditional way to solve this problem is to use an ETL process. That way you can have a different data model on the target database and for example pre-aggregate data so that the data warehouse doesn't grow too big and has a data model optimized for analytical queries.
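For the ETL route, the masking step itself can be very small. Here is a minimal sketch, assuming rows arrive as dictionaries; the column names and the function name are hypothetical and would need to match your own schema.

```python
SENSITIVE_COLUMNS = frozenset({"password", "email"})  # hypothetical column names


def mask_row(row: dict, sensitive=SENSITIVE_COLUMNS) -> dict:
    """Return a copy of the row with sensitive columns blanked out.

    Key columns must not be listed as sensitive, otherwise rows can no
    longer be matched between source and target.
    """
    return {col: (None if col in sensitive else value) for col, value in row.items()}
```

The extract and load steps around it depend on your tooling, so they are left out here.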

How to see changes in a postgresql database

My PostgreSQL database is updated each night.
At the end of each nightly update, I need to know what data changed.
The update process is complex: it takes a couple of hours and requires dozens of scripts, so I don't know if that influences how I could see what data has changed.
The database is around 1 TB in size, so any method that requires starting a temporary database may be very slow.
The database is an AWS instance (RDS). I have automated backups enabled (these are different to RDS snapshots which are user initiated). Is it possible to see the difference between two RDS automated backups?
I do not know whether it is possible to see the difference between RDS snapshots. But in the past we tested several solutions for a similar problem, so maybe you can take some inspiration from them.
The obvious solution is of course an auditing system. This way you can see in a relatively simple way what was changed, down to column values depending on the granularity of your auditing system. Of course, the audit triggers and the queries against the audit tables have an impact on your application.
Another possibility for tables with primary keys: store the primary key values together with the hidden system columns 'xmin' and 'ctid' (https://www.postgresql.org/docs/current/static/ddl-system-columns.html) for each row before the update and compare them with the values after the update. This way, however, you can only identify which rows were changed, inserted, or deleted, not which columns changed.
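The comparison of the stored primary key, xmin and ctid values can be sketched as follows. This assumes both snapshots were already fetched into dictionaries, for example with a query like SELECT id, xmin::text, ctid::text FROM mytable; the table, the column name id, and the function name are placeholders.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Classify rows by comparing two {primary_key: (xmin, ctid)} snapshots."""
    inserted = sorted(after.keys() - before.keys())
    deleted = sorted(before.keys() - after.keys())
    # A row present in both snapshots counts as changed when xmin or ctid differs.
    changed = sorted(
        pk for pk in before.keys() & after.keys() if before[pk] != after[pk]
    )
    return {"inserted": inserted, "deleted": deleted, "changed": changed}
```

Note that a row that was rewritten without any value actually changing (for example an UPDATE that sets a column to its old value) also shows up as changed, so treat the result as an upper bound.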
You can also set up a streaming replica with replication slots (and, to be on the safe side, WAL archiving). Then stop replication on the replica before the updates and compare the data after the updates using dblink selects. But these queries can be very heavy.

Slow insert and update commands during mysql to redshift replication

I am trying to build a replication server from MySQL to Redshift, and for this I am parsing the MySQL binlog. For the initial replication, I take a dump of the MySQL table, convert it into a CSV file, upload it to S3, and then use the Redshift COPY command. This performs well.
After the initial replication, for the continuous sync, I read the binlog and have to run the inserts and updates sequentially, which is very slow.
Is there anything that can be done to increase the performance?
One possible solution I can think of is to wrap the statements in a transaction and send the transaction at once, to avoid multiple network calls. But that would not address the problem that single UPDATE and INSERT statements in Redshift run very slowly. A single UPDATE statement takes 6 s. Given the limitations of Redshift (it is a columnar database, so single-row insertion is slow), what can be done to work around those limitations?
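One way to take the batching idea further, offered here as an assumption and not something from the original post: collapse the binlog events of a batch to one final state per primary key, then load that result in bulk (for example through a staging table and COPY) instead of replaying every statement. The sketch below covers only the collapsing step and assumes each insert or update event carries the full row image; the event layout and function name are made up.

```python
def collapse_events(events):
    """Reduce ordered binlog events to the final state per primary key.

    `events` is an iterable of (op, pk, row) in binlog order, where op is
    "insert", "update" or "delete" and row is the full row image (None for
    deletes). Returns (upserts, deletes): the last row per key to write,
    and the set of keys to remove from the target.
    """
    upserts, deletes = {}, set()
    for op, pk, row in events:
        if op == "delete":
            upserts.pop(pk, None)
            deletes.add(pk)
        else:
            # A later insert or update supersedes an earlier delete of the same key.
            deletes.discard(pk)
            upserts[pk] = row
    return upserts, deletes
```

A key that is inserted and deleted within the same batch ends up only in the delete set, which is harmless on the target.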
Edit 1:
Regarding DMS: I want to use Redshift as a warehousing solution that simply replicates our MySQL continuously. I don't want to denormalise the data, since I have 170+ tables in MySQL. During ongoing replication, DMS shows errors many times a day and fails completely after a day or two, and its error logs are very hard to decipher. Also, when I drop and reload tables, it deletes the existing tables on Redshift, creates new tables, and then starts inserting data, which causes downtime in my case. What I wanted was to create a new table, switch the old one with the new one, and then delete the old table.
Here is what you need to do to get DMS to work:
1) Create and run a DMS task with "migrate and ongoing replication" and "Drop tables on target".
2) This will probably fail; do not worry. "Stop" the DMS task.
3) On Redshift, make the following changes to the table:
Change all dates and timestamps to varchar (because the options DMS uses for the Redshift COPY cannot cope with the zero dates such as '0000-00-00 00:00:00' that you get in MySQL).
Change all bool columns to varchar, due to a bug in DMS.
4) In DMS, modify the task to "Truncate" in "Target table preparation mode".
5) Restart the DMS task with a full reload.
Now the initial copy and the ongoing binlog replication should work.
Make sure you are on the latest replication instance software version.
Make sure you have followed the instructions here exactly:
http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MySQL.html
If your source is Aurora, also make sure you have set binlog_checksum to "none" (this is poorly documented).