I have created a custom script in Express that migrates a SQL Server database to MongoDB.
But I am having trouble with live syncing between the two databases.
Currently I have added an updated_by column in both databases, holding the last-modified timestamp.
I then fetch the most recently updated row from both MongoDB and SQL Server.
I compare the two timestamps and, based on the difference, update my MongoDB database.
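A simplified version of the per-table check (T-SQL; the table and column names here are only illustrative):

    -- @last_synced holds the newest updated_by timestamp already copied
    -- to MongoDB; my script reads it from MongoDB before each pass.
    DECLARE @last_synced DATETIME = '2015-01-01';

    SELECT *
    FROM dbo.Orders                   -- repeated for every table
    WHERE updated_by > @last_synced   -- updated_by stores the last-modified time
    ORDER BY updated_by;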
There are lots of tables, and I am finding it difficult to identify which table has been updated.
Is there any log in SQL Server 2008 R2 that records which table was updated and at what time?
I need a mechanism where any data update in a table is immediately synced to MongoDB.
Any other suggestions on live data syncing are also welcome.
Thanks in advance. :)
When I had a similar requirement to sync between a relational DB (MySQL in my case) and a non-relational DB (MongoDB),
I followed the steps below, which may help others in the future. The concept is generally called Change Data Capture (CDC):
Capture changes (for MySQL I use triggers; see the sketch after this list).
Transform the captured changes into a suitable form,
i.e. from the RDBMS representation to the non-RDBMS one.
Apply the transformed changes to the target database.
Remember to also sync structural (schema) changes and their corresponding implementations.
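As a minimal sketch of the capture step (MySQL syntax; table and column names are illustrative):

    -- Audit table the sync job reads, transforms into MongoDB updates,
    -- and then clears.
    CREATE TABLE customers_audit (
        id          BIGINT AUTO_INCREMENT PRIMARY KEY,
        customer_id INT NOT NULL,
        action      VARCHAR(10) NOT NULL,
        changed_at  DATETIME NOT NULL
    );

    DELIMITER //
    CREATE TRIGGER customers_after_update
    AFTER UPDATE ON customers
    FOR EACH ROW
    BEGIN
        INSERT INTO customers_audit (customer_id, action, changed_at)
        VALUES (NEW.id, 'UPDATE', NOW());
    END //
    DELIMITER ;
    -- Similar triggers cover INSERT and DELETE.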
The following link may help:
https://www.flydata.com/blog/what-change-data-capture-cdc-is-and-why-its-important/
Related
I'm trying to set up automatic data synchronization between Pervasive SQL and PostgreSQL, meaning that whenever I save data in the Pervasive SQL database, it is automatically synced to the PostgreSQL database at a set interval. Is there any tool available for this, or any other way to achieve it?
I have searched the internet but found no solution.
Coming from Microsoft SQL Server with Database Services, Integration Services and Analysis Services, it was easy to create full or incremental replication, besides SSIS packages, to sync child databases into a master reporting database when all share the same schema. Now I am looking at how this can be done with PostgreSQL, in or close to real time, while also knowing which child database each row came from.
For example:
The goal is to gather data from selected tables and fields in the child databases into a master reporting database. All tables in the child databases have the same schema, and the master reporting database has one extra column in each of the selected tables identifying the source database.
My first thought was to use Kafka, driven by a selected list of tables and fields, to generate messages that populate the master reporting database in near real time as soon as a new record is inserted into a child database. Is there a better idea?
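To make that thought concrete, one building block I am assuming (untested) is a logical-replication publication on each child database, restricted to the selected tables, which a CDC connector such as Debezium could stream into Kafka topics. This needs PostgreSQL 10+ with wal_level = logical; table names are illustrative:

    -- On each child database: publish only the selected tables.
    CREATE PUBLICATION reporting_pub
        FOR TABLE public.orders, public.customers;

    -- On the master reporting database: same schema plus one extra
    -- column identifying which child database the row came from.
    ALTER TABLE reporting.orders
        ADD COLUMN source_db text NOT NULL DEFAULT '';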
I created a database long ago using Django. Now, as we are migrating the application, I need all the CREATE TABLE SQL statements Django ran to create the entire database for our service (around 70-80 tables, each with 30-70 columns on average).
Both the old and the new servers use Postgres for their databases.
But the technology stack is completely different: a third-party proprietary application will host the service instead of Django.
If I start to write all the tables again from scratch, it will take at least a week or two.
Is there any way, either from Postgres or from Django, to generate the CREATE TABLE SQL schema for the entire database, keeping all the relationships intact?
I also have to make minor modifications to that schema per customer requirements.
P.S. pg_dump won't work, as I need the actual schema itself so the client can review it.
I have not found a real sync from PostgreSQL to a newer graph database like Neo4j, so I've decided to use PostgreSQL itself: syncing the normalized tables with JSON tables kept in the same PostgreSQL instance under a different database name. That way I have the best of both worlds, a SQL and a NoSQL database.
Whenever plain SQL is faster than the graph queries I can choose between them, and in the future, once I move the NoSQL tables to a real graph database like Neo4j and am able to sync to it, I won't need to change the app, since it already works with both synced databases.
Has anyone done this already, or is it a dumb idea? Has anyone used automatic libraries to sync from PostgreSQL to Neo4j, and in the other direction too? Or must I write sync scripts from scratch if I want to sync the two databases?
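What I have in mind for the single-PostgreSQL variant is a trigger that keeps a JSON copy of each row. A sketch, with illustrative names (note that a PostgreSQL trigger cannot write into a different database, so the JSON tables would have to live in a separate schema rather than under a different database name):

    -- JSON copy of the normalized customers table.
    CREATE SCHEMA IF NOT EXISTS docs;
    CREATE TABLE docs.customers_json (
        id  int PRIMARY KEY,
        doc jsonb NOT NULL
    );

    CREATE OR REPLACE FUNCTION docs.sync_customer_doc() RETURNS trigger AS $$
    BEGIN
        INSERT INTO docs.customers_json (id, doc)
        VALUES (NEW.id, to_jsonb(NEW))
        ON CONFLICT (id) DO UPDATE SET doc = EXCLUDED.doc;  -- PostgreSQL 9.5+
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER customers_doc_sync
    AFTER INSERT OR UPDATE ON public.customers
    FOR EACH ROW EXECUTE PROCEDURE docs.sync_customer_doc();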
Is it possible to have an MS Access backend database (Microsoft JET or Access Database Engine) set up so that whenever entries are inserted or updated, those changes are replicated* to a PostgreSQL database?
Two-way synchronization would be nice, but one way would be acceptable.
I know it's popular to link the two and use one as a frontend, but it's essential that both be backend.
Any suggestions?
* i.e. reflected, synchronized, mirrored
Can you use Microsoft SQL Server Express Edition, or do you have to use the Microsoft Access Database Engine? You will probably have more options with SQL Server Express, such as more complete triggers and logging.
Either way, you're going to need a way to accumulate a log of changed rows from the source database engine, and a program to sync them to PostgreSQL by reading the log and converting it into suitable PostgreSQL INSERT, UPDATE and DELETE statements.
You could do this by having audit triggers in MADB/Express insert a row into an audit shadow table for every "real" table whenever it changes, including inserting special "row deleted" audit entries. Your sync program would then connect to both MADB/Express and PostgreSQL, read the audit tables, apply the changes to PostgreSQL, and empty the audit tables.
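For the SQL Server Express route, such an audit trigger could look roughly like this (T-SQL; table and column names are illustrative):

    -- Shadow table the sync program reads, replays against PostgreSQL,
    -- and then empties.
    CREATE TABLE dbo.Contacts_Audit (
        audit_id   INT IDENTITY(1,1) PRIMARY KEY,
        contact_id INT NOT NULL,
        action     CHAR(1) NOT NULL,               -- 'I', 'U' or 'D'
        changed_at DATETIME NOT NULL DEFAULT GETDATE()
    );
    GO
    CREATE TRIGGER dbo.trg_Contacts_Audit
    ON dbo.Contacts
    AFTER INSERT, UPDATE, DELETE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Rows present in "inserted" are inserts or updates.
        INSERT INTO dbo.Contacts_Audit (contact_id, action)
        SELECT i.id,
               CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END
        FROM inserted AS i;
        -- Rows only in "deleted" are deletions.
        INSERT INTO dbo.Contacts_Audit (contact_id, action)
        SELECT d.id, 'D'
        FROM deleted AS d
        WHERE NOT EXISTS (SELECT 1 FROM inserted);
    END;

The Access engine has no equivalent triggers (newer versions offer data macros), so on that side you would likely need application-level logging instead.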
I'll be surprised if you find anything that does this out of the box. It's one area where Microsoft SQL Server has a big advantage, because of all the deep Access and MADB engine integration supporting its synchronisation and integration features.
There are some ETL ("Extract, Transform, Load") tools that might be helpful, like Pentaho and Talend. I don't know if you can achieve the desired degree of automation with them though.