AWS DMS "Load complete, replication ongoing" not working MongoDB to DocDB - mongodb

I am trying to make a PoC for MongoDB to DocDB migration with DMS.
I've set up a MongoDB instance with some dummy data and an empty DocDB. Source and Target endpoints are also set in DMS and both of them are connecting successfully to my databases.
When I create a migration task in DMS everything seems to be working fine. All existing data is successfully replicated from the MongoDB instance to DocDB and the migration task state is at "Load complete, replication ongoing".
At this point I tried creating new entries in existing collections as well as creating new empty collections in MongoDB but nothing happens in DocDB. If I understand correctly the replication should be real time and anything I create should be replicated instantly?
Also there are no indication of errors or warnings whatsoever... I don't suppose its a connectivity issue to the databases since the initial data is being replicated.
Also the users I am using for the migration have admin privileges in both databases.
Does anyone have any suggestions?

#PPetkov - could you check the following?
1. Check if right privileges were assigned to the user in the MongoDB endpoint according to https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MongoDB.html.
2. Check if replicate set to capture changes was appropriately configured in the MongoDB instance.
3. Once done, try to search for "]E:" or "]W:" in the CloudWatch logs to understand if there are any noticeable failures/warnings.

Related

AWS RDS Postgres: WAL_LEVEL not set to 'logical' after changing rds.logical_replication = 1

I am in the process of setting up Debezium to work with AWS Aurora Postgres (postgres version 12.6).
For Debezium to work, the WAL (Write-ahead-logging) must be set to 'logical' and not 'replica'.
On AWS, this would require a DBA to set the rds.logical_replication parameter in the parameter group to be set to 1.
The above was done. The database instance was restarted.
To verify that the WAL level was changed to 'logical', I ran the following query:
show wal_level.
However, after running this query in postgres on the target database the result showed replica.
I further looked at the log events in the AWS management console and I saw these log events.
Would anyone have an idea why this might be? There is another environment where we were able to successfully set the rds.logical_replication to 1 and following a database restart, the WAL was set to logical. But for our main environment this is not the case. Looking at the parameter groups, between these two environments, they are the same.
Any help/ advice is appreciated. thanks.
Ok after contacting the AWS support I got the information that the parameter "rds.logical_replication=1" is only active on the instance that has the Writer role (open for read-write). When you set this parameter you will have to use the writer instance for logical replication. You can connect to the writer instance either via Cluster endpoint (recommended) or instance endpoint.
I was checking the read-only instance with SHOW wal_level; but in fact it was set on the read/write instance !!!

Mongodb clone to another cluster

The idea here is, I have mongo cluster deployed in managed cloud service atlas. I have enabled Continuous Backup.
Now what I want to do is :
1) I want to use existing backup.
2) Using this existing backup I want to create similar cluster
(having same data form backup)
3) Automate this process so that every day my new cluster gets upto date from original cluster.
Note: The idea here for cloning cluster is, The original cluster is production data. I want to create a db which has similar data on which I can plug and play using any analytic tools and perform diffrent operations without affecting production data and load.
So far what I have found is to use mongorestore and mongodump.But here mongodump is putting load on production db even though my backup is enabled. I want to use same backup to clone this to another db cluster.
Deployed on Atlas, your server must have replica set.
Here are 2 solutions :
You need only reading data : connect your tools to a secondary server (ideally dedicated with priority 0 for becoming primary)
You need to read/write data : on the same server than above, play your mongodump command with --oplog option. By this way, you're dumping your data from a read-only server, preventing slowing performances of your main servers.
In this last case, what you need will find its solution in backup strategies, take a look at the doc to know more.
There's an offering for this purpose in ATLAS called analytic node.Link.
Analytic node is read replica of your database. Plus it will not interfere with your production traffic which makes it safer.
Also, you can connect BI connectors to this node and create your analytic platform.
We used redash.

Local MongoDB instance with index in remote server

One of our clients have a server running a MongoDB instance and we have to build an analytical application using the data stored in their MongoDB database which changes frequently.
Clients requirements are:
That we do not connect to their MongoDB instance directly or run another instance of MongoDB on their server but just somehow run our own MongoDB instance on our machine in our office using their MongoDB database directory with read only access remotely.
We've suggested deploying a REST application, getting a copy of their database dump but they did not want that. They just want us to run our own MongoDB intance which is hooked up with the MongoDB instance directory. Is this even possible ?
I've been searching for a solution for the past two days and we have to submit a solution by Monday. I really need some help.
I think this is normal request because analytical queries could cause too much load on the production server. It is pretty normal to separate production and analytical databases.
The easiest option is to use MongoDB replication. Set up MongoDB replica set with production database instance as primary and analytical database instance as secondary, also configure the analytical instance to never become primary.
If it is not possible to use replication - for example client doesn't want this, the servers could not connect directly to each other... - there is another option. You can read oplog from remote database and apply operations to your database instance. This is exactly the low level mechanism how replica set works, but you can do it manually too. For example MMS (Mongo Monitoring Sevice) Backup uses reading oplog for online backups of MongoDB.
Update: mongooplog could be the right tool for real-time application of replication oplog pulled from remote server on local server.
I don't think that running two databases that points to the same database files is possible or even recommended.
You could use mongorestore to restore from their data files directly, but this will only work if their mongod instance is not running (because mongorestore will need to lock the directory).
Another solution will be to do file system snapshots and then restore to your local database.
The downside to this backup/restore solutions is that your data will not be synced all the time.
Probably the best solution will be to use replica sets with hidden members.
You can create a replica set with just two members:
Primary - this will be the client server.
Secondary - hidden, with votes and priority set to 0. This will be your local instance.
Their server will always be primary (because hidden members cannot become primaries). Clients cannot see hidden members so for all intents and purposes your server will be read only.
Another upside to this is that the MongoDB replication will do all the "heavy" work of syncing the data between servers and your instance will always have the latest data.

mongoDB - manual offline replication by one command

I would like to have 2 databases: production and offline. My system will work with the production one. But time to time I would like to copy changes from production db to offline db.
In CouchDB you can use something like:
POST /_replicate HTTP/1.1
{"source":"example-database","target":"http://example.org/example-database"}
Is there other way than:
mongodump/mongorestore
db.cloneDatabase( "db0.example.net" )
...in mongoDB? I understand those operations as copying full content of database. Is that correct?
It sounds like you have a few options here depending on the constraints your database system has. In addition to the options above, you could also:
Set your offline database up as a secondary as part of a replica set. This replica could then be used for your offline work and would keep in sync with the primary. The added benefit to this is you will always have an additional copy of your data in case you run into issues with the primary. You may want to mark the "offline" replica as hidden so that it could never take over as primary. See the following links for more information: Replication in MongoDB, Replication Internals
If you really just want point in time snap shots then another option would be to backup your database files and restore them to your offline cluster. The methods to do this vary according to your database setup and environment. The following is a good start for learning about backups: MongoDB Backups

mongodb single DB replication

I've a working MongoDB "replica set" made up by 3 servers.
It is storing two DBs, I wonder if is it possible to replicate only one of the DBs without running more than one mongoDB instance(one per DB).
Here is a sketch of the "problem"
Server1 Server2 Server3
DB1 X X X
DB2 X X
X stands for Server where DBs have to be replicated in.
thank
I don't believe it is possible.
Unlike sharding, where you specify down to the collection level what gets sharded, with replica sets you're defining that a given MongoDB instance is part of a replica set. As only one node in a replica set can be the master at any given time, based on the scenario you are talking about, then there would be a problem if e.g. Server1 went down and Server3 was promoted to master - as DB2 would then not be able to be written to.
I had a simliar problem and found a quite easy solution in javascript to be executed in a mongo-shell.
Sourcecode available here:
http://www.suenkel.de/blog/2012/02/mongodb-replicate-one-database-or-collection/
With opening a tailable cursor on the oplog of the master server each operation could be applied to another server (of course you can filter by the namespace of the collections or even the databases...)
According to current MongoDB ReplicaSet architecture, you can't use a single Replica Set with some members having parts of the databases or collections.
However, if you have the requirement of replicating a single database or collection in real-time in another location, I ended up with following workaround:
Use directoryPerDB to separate the desired database files (Create a new replica with this option enabled if you don't have this already)
Copy the directory of desired database to the new location.
Deploy a new ReplicaSet with this single database.
Write a simple script and use Change Streams to perform the replication for you.
As I said, you will end up with another Replica Set dedicated for this database, but replication is done in real-time and both Replica Sets has the data in a consistent way (You have to perform your write operations on first ReplicaSet, though).