I have to create an app which transfer data from snowflake to postgres everyday. Some tables in postgres are truncated before migration and all data from corresponding snowflake table is copied. While for other tables, data after last timestamp in postgres is copied from snowflake.
This job has to run at night sometime and not when customers are using the service at daytime.
What is the best way to do this ?
Do you have constraints, limiting your choices in:
ETL or bulk data tooling
Development languages?
According to this site, you can create a foreign data wrapper on Postgresql for snowflake
Related
How to migrate my whole database which is currently in AWS RDS Postgres to AWS Redshift and also can you please help me out how can I keep both these DBs in sync. I want to sync even if any column is updated in RDS so it must get updated in Redshift also.
I know we can achieve it with AWS Glue, but the above scenario is mandatory in my case. Migration task is easy to do but to to the CDC migration is bit challenging. I am also aware about the bookmark key but my situation is bit different, I do not have any sequential column in the tables, but it has updated_at field in all the tables so this column is the only field on which I can check whether the record is processed or not so that duplicate processing may not occur and if any new data is inserted it should also get replicated in RedShift.
So, would anyone help me out to do this even by using pyspark script?
Thanks.
I've created a DMS task of CDC and Full Load, migrating data from postgres 14 to Redshift. According to the documentation, when using pglogical and creating postgres publication with 'publish_via_partition_root' parameter of my partitioned table, changes should be published to the parent table and to to child tables. However, the data is still migrated to the child tables in Redshift and not to the parent table. Am I missing something thats needs to be configured or is it just not possible in DMS?
I am looking to find the data source of couple of Tables in Redshift. I have gone through all the stored procedures in Redshift instance. I couldn't find any stored procedure which populates these tables in Redshift. I have also checked the Data Migration Service and didn't see these tables are being migrated from RDS instance. However, the tables are updated regularly each day.
What would be the way to find how data is populated in those 2 tables? Is there any logs or system tables I can look in to?
One place I'd look is svl_statementtext. That will pull any queries and utility queries that may be inserting or running copy jobs against that table. Just use a WHERE text LIKE %yourtablenamehere% and see what comes back.
https://docs.aws.amazon.com/redshift/latest/dg/r_SVL_STATEMENTTEXT.html
Also check scheduled queries in the Redshift UI console.
I have created a db long ago using django. Now as we are migrating the application, so I need all the CREATE TABLE sql queries which django might have run to create the entire db for our service (which has around 70-80 tables and each table has avg 30-70 columns).
Both the servers old and new are using Postgres for databases.
But the technology stack is completely different (A 3rd party proprietary application which will host the service) instead of django.
If I start to write all the tables again from scratch, it will take at least a week or two.
Is there any way either from Postgres or from django which can generate the CREATE TABLE sql schema for an entire db keeping all the relationship as is?
Also, I have to do minor modification to that schema as per customer requirement.
p.s - pg_dump won't work as I need actual schema itself to get it reviewed from client.
I have a table in an MS Access db that I want to export to a PostgreSQL database. Every 2 or so months, I want to move all records from the Access table to a table in Postgres.
Right now, I am using the Export to ODBC option in Access to do this, but every time it exports as an entirely new table in Postgres. Is there a way for me to routinely append the records in the Access table to an existing table in my Postgres database? I have come across the option of a FDW but I am not familiar with how to install/use it.
I am new to using PostgreSQL, and have little to no experience working with databases other than Access, so any input/advice would be greatly appreciated.