AWS DMS transformation rules not adding Column on Target Postgres DB - postgresql

I have been trying to add a new column to a target Postgres DB using a DMS transformation rule. The DMS task is set up with CDC.
DMS syncs all data changes from the table, but the add/modify column operation is never applied on the target.
Any comments on the above? What am I missing here?
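For reference, the table mapping I am using looks roughly like the sketch below (applied here with boto3; the schema name, table name, new column name, and task ARN are placeholders rather than my real settings):

import json
import boto3  # assumes AWS credentials are already configured

# Placeholder names; the real schema, table, and task ARN differ.
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-source-table",
            "object-locator": {"schema-name": "public", "table-name": "orders"},
            "rule-action": "include",
        },
        {
            # The add-column transformation that never shows up on the target.
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "add-audit-column",
            "rule-target": "column",
            "object-locator": {"schema-name": "public", "table-name": "orders"},
            "rule-action": "add-column",
            "value": "migrated_at",
            "expression": "datetime()",
            "data-type": {"type": "datetime"},
        },
    ]
}

dms = boto3.client("dms")
# Updated table mappings only take effect once the task is restarted.
dms.modify_replication_task(
    ReplicationTaskArn="arn:aws:dms:...",  # placeholder task ARN
    TableMappings=json.dumps(table_mappings),
)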

Related

Applying column filters in DMS Task for source MongoDB and target Postgres

In my AWS DMS task the source is MongoDB and the target is RDS PostgreSQL.
I am able to apply a column filter and only the 3 selected columns get migrated to the target.
The problem appears when the source Mongo document contains a nested object datatype. In the include-column filter I specify:
oid__id
phone_no
address
Params.name
Params.age
Only the top-level columns oid__id, phone_no, and address get migrated to the target; the remaining two nested columns are skipped by the DMS task.
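For context, the MongoDB source endpoint runs in table mode. The sketch below shows roughly how it is configured (boto3 parameter names, with placeholder connection details), since these nesting settings are what control how nested fields such as Params.name get flattened into columns:

import boto3

dms = boto3.client("dms")

# Placeholder connection details; the MongoDbSettings block is the relevant part.
# NestingLevel="one" selects table mode, which flattens nested documents
# (e.g. Params.name, Params.age) into their own columns based on a sample
# of DocsToInvestigate documents.
dms.create_endpoint(
    EndpointIdentifier="mongo-source",
    EndpointType="source",
    EngineName="mongodb",
    MongoDbSettings={
        "ServerName": "mongo.example.com",
        "Port": 27017,
        "DatabaseName": "appdb",
        "AuthType": "password",
        "Username": "dms_user",
        "Password": "example-password",
        "NestingLevel": "one",        # table mode
        "ExtractDocId": "true",       # produces the oid__id column
        "DocsToInvestigate": "1000",  # how many documents DMS samples to build the schema
    },
)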

AWS Glue ETL Job Missing collection name

I have Data Catalog tables generated by crawlers: one is a data source from MongoDB, and the second is a data source from PostgreSQL (RDS). The crawlers run successfully and the connection tests pass.
I am trying to define an ETL job from MongoDB to PostgreSQL (a simple transform).
In the job I defined the source as the AWS Glue Data Catalog table (MongoDB) and the target as the Data Catalog Postgres table.
When I run the job I get this error:
IllegalArgumentException: Missing collection name. Set via the 'spark.mongodb.input.uri' or 'spark.mongodb.input.collection' property
It looks like this is related to the MongoDB part. I tried to set the 'database' and 'collection' parameters on the Data Catalog tables and it didn't help.
The script generated for the source is:
AWSGlueDataCatalog_node1653400663056 = glueContext.create_dynamic_frame.from_catalog(
    database="data-catalog-db",
    table_name="data-catalog-table",
    transformation_ctx="AWSGlueDataCatalog_node1653400663056",
)
What could be missing?
I had the same problem; just add the additional_options parameter as shown below.
AWSGlueDataCatalog_node1653400663056 = glueContext.create_dynamic_frame.from_catalog(
    database="data-catalog-db",
    table_name="data-catalog-table",
    transformation_ctx="AWSGlueDataCatalog_node1653400663056",
    additional_options={"database": "data-catalog-db",
                        "collection": "data-catalog-table"},
)
Additional parameters can be found in the AWS documentation:
https://docs.aws.amazon.com/glue/latest/dg/connection-mongodb.html
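For completeness, the rest of such a job can then write the frame to the Postgres catalog table. A rough sketch (the database and table names are placeholders, not taken from the question):

import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glueContext = GlueContext(SparkContext.getOrCreate())
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Read from the MongoDB-backed catalog table, passing the database/collection
# through additional_options as described above.
source = glueContext.create_dynamic_frame.from_catalog(
    database="data-catalog-db",
    table_name="data-catalog-table",
    additional_options={"database": "data-catalog-db",
                        "collection": "data-catalog-table"},
    transformation_ctx="source",
)

# Write to the Postgres-backed catalog table (placeholder names).
glueContext.write_dynamic_frame.from_catalog(
    frame=source,
    database="data-catalog-db",
    table_name="postgres-target-table",
    transformation_ctx="target",
)

job.commit()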

How to copy a database schema from database A to database B

I have created a PostgreSQL database A using Liquibase changesets. Now I'm creating an application that allows creating a new database B and copies the schema from database A in real time, including the Liquibase changesets, since the database can still be updated later. Note that by the time the schema is copied, database A could already have been updated, making the base changesets outdated.
My main question would be:
How do I copy PostgreSQL schema x from database a (dynamically generated at run time) to database b using Liquibase? Database b could be on another server.
If it's not possible with Liquibase, what other tools or approaches would make this possible?
--
Let me add more context:
We initialize the schema of a new database a using Liquibase changesets.
We can add a new table or field to database a at run time, i.e. while the application is running. For example, we add a new table people to the schema of database a that is not originally in the changelog. This is done using the Liquibase classes too, so a changeset is added to the databasechangelog table.
Now, we create a new database b.
We want to import the schema of database a into b, including the people table.
I hope that is clear.
Thanks.
All schema changes must be run through your schema migration tool
The point of using a database schema migration tool such as Liquibase or Flyway is to have a “single source of truth” regarding the structure of your database tables. Your set of Liquibase changesets (or Flyway scripts) is supposed to be that single source of truth for your database.
If you are altering the structure of your database at runtime, such as adding a table named people, outside the scope of your migration tool, well, then you have violated the rules of the game. You have defeated the purpose of using a schema migration tool: the intention is that you make all schema changes through that tool.
If you need to add a table while running in production, you should be dropping the physical file for the Liquibase changeset (or Flyway script) into the file system of your database server environment, and then invoking Liquibase (or Flyway) to run a migration.
Perhaps you have been misunderstanding the sequence of events:
If you have built a database on server "A", that means you installed Postgres, created an empty database, then installed the collection of Liquibase changesets you have carefully built, then ran a Liquibase migration operation on that server.
When you go to create a database on server "B", you should follow the same steps: install Postgres, create an empty database, install the very same collection of Liquibase changesets, and then run a Liquibase migration operation.
Alternatively, if making a copy of server "A" to create server "B", that copy should include the exact same Liquibase changesets. So at the end of your copy process, the two databases+changesets are identical.
Here's how I solved this problem of mine using the Liquibase Java library:
1.) Export the changelog from the source database into a temporary file (XML).
Liquibase liquibase = new Liquibase(liquibaseOutFile.getAbsolutePath(), new FileSystemResourceAccessor(), sourceDatabase);
liquibase.generateChangeLog(catalogAndSchema, changeLogWriter, new PrintStream(liquibaseOutFile.getAbsolutePath()), null);
2.) Execute the temporary changelog file against the new data source.
Liquibase targetLiquibase = new Liquibase(liquibaseOutFile.getAbsolutePath(), new FileSystemResourceAccessor(), targetDatabase);
Contexts context = new Contexts();
targetLiquibase.update(context);
Here's the complete code: https://czetsuya-tech.blogspot.com/2019/12/generate-postgresql-schema-using-java.html

How to Replicate Schema changes using the SQL Replication in IBM DB2?

I have configured SQL Replication between our two IBM DB2 databases (database A and database B). The replication is working fine and all the data changes in one database are being replicated to the other database.
The issue I am facing is that if I add a new table in database A, then I have to manually add this table in Capture Control Server and Apply Control Server.
Even if I add a new column to a table for which SQL Replication is already configured, I have to register that column in the Capture Control Server and Apply Control Server.
Is there any way of replicating the schema changes automatically from one database to the other so that we don't have to manually add the new tables and columns in Capture Control Server and Apply Control Server for SQL Replication?
Regards,
Babar Hussain

Backup specific tables in AWS RDS Postgres Instance

I have two databases on Amazon RDS, both Postgres: Database 1 and Database 2.
I need to restore an instance from a snapshot of Database 1 for my Staging environment. (Database 2 is my current Staging DB).
However, I want the data from a few of the tables in Database 2 to overwrite the tables in the newly restored snapshot. What is the best way to do this?
When restoring RDS from a Snapshot, a new database instance is created. If you only wish to copy a portion of the snapshot:
Restore the snapshot to a new (temporary) database
Connect to the new database and dump the desired tables using pg_dump
Connect to your staging server and restore the tables using pg_restore (most probably deleting any matching existing tables first)
Delete the temporary database
pg_dump actually outputs SQL commands that are then used to recreate tables and restore data. Look at the content of a dump to understand how the restore process actually works.
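As a rough sketch of steps 2 and 3 (the hostnames, credentials, database and table names below are placeholders), run from a machine that can reach both instances:

import subprocess

# Placeholder endpoints; supply the password via PGPASSWORD or ~/.pgpass.
TEMP_HOST = "temp-restored-instance.example.us-east-1.rds.amazonaws.com"
STAGING_HOST = "staging-instance.example.us-east-1.rds.amazonaws.com"
TABLES = ["public.orders", "public.customers"]  # the tables you want to copy

# Step 2: dump only the desired tables from the temporary restored instance
# (custom format, so pg_restore can drop/recreate them on the target).
dump_cmd = ["pg_dump", "-h", TEMP_HOST, "-U", "postgres", "-d", "mydb",
            "-Fc", "-f", "tables.dump"]
for table in TABLES:
    dump_cmd += ["-t", table]
subprocess.run(dump_cmd, check=True)

# Step 3: restore those tables into the staging database, dropping any
# existing copies of the same tables first.
subprocess.run(
    ["pg_restore", "-h", STAGING_HOST, "-U", "postgres", "-d", "staging_db",
     "--clean", "--if-exists", "tables.dump"],
    check=True,
)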
I hope this still helps someone else.
My team and I faced a similar issue: we also had two Postgres databases and we just needed to back up some tables from db1 to db2.
What we did was use an AWS Lambda function written in Python that connected to both databases and checked whether db1.table1 had the same data as db2.table1; if not, the Lambda function wrote the missing data from db1.table1 into db2.table1. We chose Lambda because we wanted to automate the process, since the main database (let's say db1) is constantly being updated. It also allowed us to back up only the tables we wanted (say, 3 tables out of 10) instead of the whole database. A minimal sketch of the idea is shown after the note below.
Note: Maybe you want to do these writes using temporary tables to avoid issues with any constraints you have in your tables.
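To illustrate the idea, here is a minimal sketch of such a Lambda function using psycopg2; the connection strings, table name, and columns are made up for illustration, and a real function would read credentials from environment variables or Secrets Manager and handle several tables:

import psycopg2  # shipped with the Lambda package or as a layer

# Placeholder connection strings for illustration only.
SOURCE_DSN = "host=db1.example.com dbname=app user=copier password=secret"
TARGET_DSN = "host=db2.example.com dbname=app user=copier password=secret"

def handler(event, context):
    # Copy rows that exist in db1.table1 but are missing from db2.table1.
    with psycopg2.connect(SOURCE_DSN) as src, psycopg2.connect(TARGET_DSN) as dst:
        with src.cursor() as src_cur, dst.cursor() as dst_cur:
            # Which primary keys already exist on the target?
            dst_cur.execute("SELECT id FROM table1")
            existing = {row[0] for row in dst_cur.fetchall()}

            # Pull the source rows and keep only the missing ones.
            src_cur.execute("SELECT id, col_a, col_b FROM table1")
            missing = [row for row in src_cur.fetchall() if row[0] not in existing]

            # Write the missing rows into the target.
            dst_cur.executemany(
                "INSERT INTO table1 (id, col_a, col_b) VALUES (%s, %s, %s)",
                missing,
            )
        dst.commit()
    return {"copied_rows": len(missing)}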