Replicating a MongoDB oplog to another MongoDB

Hi, we have a production DB on Mongo which has a set of collections, and all the activity is recorded in an oplog. Now I want to write a script to watch this oplog so that whenever a new record is added to it, I write that record to a DB on another dummy server. How can I go about this? I am new to Mongo, so I'm unsure of where to start. Any ideas would be helpful. I am thinking of something along the lines of:
// pseudocode
while (true) {
  watch(oplog);
  onNewEntry(function (record) {
    addToAnotherMongo("another.server.com", port, dbname, record);
  });
}

There are various oplog readers which can watch the oplog and replay it to a specific server. This is what replica sets do by default, and there is only one primary (writer). If you just want copies of your data, then replica sets are the best option, supported without any code.
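For reference, a minimal sketch of the no-code route in the mongo shell, assuming the second server (hostname borrowed from the question) was started with the same --replSet name:

rs.initiate()
rs.add("another.server.com:27017")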
Here are some samples of code which read the oplog:
http://github.com/wordnik/wordnik-oss/
http://github.com/RedBeard0531/mongo-oplog-watcher/
http://github.com/mongodb/mongo-java-driver/blob/master/examples/ReadOplog.java

I had a similar problem and found a quite easy solution, following your pseudocode example, written in JavaScript to be executed in a mongo shell.
source code available here
By opening a tailable cursor on the oplog of the master server, each operation can be applied to another server (and of course you can filter by the namespace of the collections, or even the databases...).
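As a minimal sketch of that approach for the legacy mongo shell, assuming the source is a replica set member (so local.oplog.rs exists) and that the target hostname from the question is reachable; it replays inserts only, for brevity:

var local = db.getSiblingDB("local");
var target = new Mongo("another.server.com:27017");

// start tailing just past the newest existing oplog entry
var last = local.oplog.rs.find().sort({$natural: -1}).limit(1).next();
var cursor = local.oplog.rs.find({ts: {$gt: last.ts}})
                           .addOption(DBQuery.Option.tailable)
                           .addOption(DBQuery.Option.awaitData);

// production code would re-establish the cursor when it dies;
// this loop just drains entries as they arrive
while (cursor.hasNext()) {
  var entry = cursor.next();
  if (entry.op === "i") {              // "i" = insert; "u"/"d" would be handled similarly
    var dot = entry.ns.indexOf(".");   // ns is "dbname.collection"
    target.getDB(entry.ns.substring(0, dot))
          .getCollection(entry.ns.substring(dot + 1))
          .insert(entry.o);
  }
}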

Related

All documents in the collection magically disappeared. Can I find out what happened?

I cloned 2 of my collections from localhost to a remote location on the MongoLab platform yesterday. I was trying to debug my (MEAN stack) application (with the WebStorm IDE) and I realized one of those collections has no data in it. Well, there were 7800 documents this morning...
I am pretty much the only one who works on the database, and especially with this collection. I didn't run any query to remove all of the documents from this collection. On MongoLab's website there is a button that says 'delete all documents from collection'. I am pretty sure I didn't hit that button. I asked my teammates; no one even opened that web page today.
Assuming that my team is telling the truth and I didn't remove everything and have a blackout...
Is there a way to find out what happened?
And is there a way to keep a query history (like the Unix command-line history) for a Mongo database that runs on a remote server? If yes, how?
So, I am just curious about what happened. Also note that I don't have any DBA responsibilities or experience in that field.
MongoDB replica sets have a special collection called oplog. This collection stores all write operations for all databases in that replica set.
Here are instructions on how to access the oplog on MongoLab:
Accessing the MongoDB oplog
Here is a query that will find all delete operations:
use local
db.oplog.rs.find({"op": "d", "ns" : "db_name.collection_name"})
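If matching entries turn up, you can also see when each delete happened; a small sketch, relying on the standard oplog entry format (o._id is the deleted document's _id, ts.t is seconds since the epoch):

db.oplog.rs.find({"op": "d", "ns": "db_name.collection_name"}).forEach(function (entry) {
  printjson({deletedId: entry.o._id, when: new Date(entry.ts.t * 1000)});
});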

Copy a data field from one Mongo collection to another, on the DB server

I have two Mongo collections. One we can call a template, and the second is instance. Every time a new instance is created, a rather large data field is copied from template to instance. Currently the field is retrieved from the template collection in the application and then sent back to the DB as part of the instance insert.
Would it be possible to somehow perform this copy on insert directly in MongoDB, to avoid sending several megabytes over the network back and forth?
Kadira is reporting a 3-second lag due to this. And documents are only going to get bigger.
I am using Meteor, but I gather that should not influence the answer much.
I have done some searching and I can't really find an elegant solution for you. The two ways I can think of doing it are:
1.) Fork a process to run a mongo command to copy your template as your new instance via db.collection.copyTo().
http://eureka.ykyuen.info/2015/02/26/meteor-run-shell-command-at-server-side/
https://docs.mongodb.org/manual/reference/method/db.collection.copyTo/
Or
2.) Attempt to access the raw Mongo collection, rather than the minimongo collection Meteor provides you with, so you can use the db.collection.copyTo() functionality supplied by Mongo:
var rawCollection = Collection.rawCollection();
rawCollection.copyTo(newCollection);
See also: Can meteor mongo driver handle $each and $position operators?
I haven't tried accessing the rawCollection to see if copyTo is available, and I also don't know if it will pull the data into Meteor before writing out the new collection. I'm just throwing this out here as an idea; hopefully someone else has a better one.
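One more server-side possibility, as a hedged sketch: on newer MongoDB versions (4.2+ for $merge) the copy can happen entirely inside the database with an aggregation pipeline, so the large field never crosses the network; the collection and field names below are illustrative, not from the question:

db.templates.aggregate([
  {$match: {_id: templateId}},                   // the template to copy from
  {$project: {_id: newInstanceId, bigField: 1}}, // keep only the large field
  {$merge: {into: "instances", whenMatched: "merge", whenNotMatched: "insert"}}
]);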

Finding Collection Data in Meteor

I'm trying to better understand the Meteor/MongoDB data model. When you create a new Meteor project, I'd like to know where the data in a collection is stored when you create a new collection or add data to one. I understand that it is supposed to be under the .meteor/local/db directory, but thus far I have not found it. I've both created new collections and added data to preexisting collections, in both a basic project and the Meteor demo projects (like Leaderboard), and I can't find where this data is stored. Could someone please guide me on this matter?
I imagine that I would at least see a JSON-type listing somewhere, or a GUI similar to something like MySQL Workbench (is there anything out there like this for Meteor? I've looked high and low but I haven't found it; Houston is insufficient).
In addition to scouring Stack Overflow for the answer to this question, I've looked through a number of APIs (like Meteor's and Mongo's) and tutorials like http://meteortips.com/book/databases-part-1/
Again, all I want to know is how I can see the data in Mongo as it is added to a collection. Thank you.
The data files are in MongoDB's own binary format and are not human-readable.
If you want to query Mongo directly while Meteor is running, from your app's directory run:
meteor mongo
If Meteor isn't running and you want to launch just the database, you can try:
mongod --smallfiles --dbpath /path/to/my/app/.meteor/local/db --port 3001
Then connect with the regular mongo shell.
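For example (assuming Meteor's default local database name, meteor):

mongo localhost:3001/meteor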
To access the database in a nice GUI form, I use Robomongo.
What's nice is that you can connect to the local MongoDB (on port 3001) or to a production one from it (see how to do that).
Update:
Remember to run the meteor command before connecting to the local MongoDB.
Thanks #iAmME
I have been using MongoVUE (http://www.mongovue.com/downloads/) for viewing the collections, and it has been very handy for checking the data.
The different kinds of views (Table View, Tree View and Text View) make it easier to understand how the data is inserted, especially for anyone (like me) jumping from an RDBMS to NoSQL.

Managing (indexing) large datasets with Meteor and Mongo

How does Meteor handle DB indexing? I've read that there are no indexes at this time, but I'm particularly concerned about very large data sets, joined with multiple lookups, etc., which will really impact performance. Are these issues taken care of by Mongo and Meteor?
I am coming from a Rails/PostgreSQL background and am about 2 days into Meteor and Mongo.
Thanks.
Meteor does expose a method for creating indexes, which maps to the Mongo method db.collection.ensureIndex.
You can access it on each Meteor.Collection instance, on the server. For example:
if (Meteor.isServer) {
  var myCollection = new Meteor.Collection("dummy");
  // create a compound index on 'dummy' covering field1 and field2
  myCollection._ensureIndex({field1: 1, field2: 1});
}
From a performance point of view, create indexes based on what you publish, but avoid over-indexing.
With oplog tailing, the initial query will only run occasionally, and changes arrive from the oplog.
Without oplog tailing, Meteor will re-run the query every 10s, so better indexes give a large gain.
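To check which indexes actually exist on a collection, the standard shell call (using the 'dummy' collection from the example above) is:

db.dummy.getIndexes()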
Got a response from the Discover Meteor book folks:
Sacha Greif: Actually, we are in the process of writing a new sidebar to address migrations. You'll have access to it for free if you're on the Full or Premium packages :)
Regarding indexes, I think we might address that in an upcoming blog post :)
Thanks much for the reply. I'm looking forward to both.

MongoDB. Keep information about sharded collections when restoring

I am using mongodump and mongorestore in a replicated shard cluster on MongoDB 2.2 to take a backup and restore it.
First, I use mongodump to create a dump of the whole system; then I drop a specific collection and restore it using mongorestore with the output of mongodump. After that, the collection is correct (the data it contains is correct, and so are the indexes), but the information about whether this collection is sharded is lost. Before dropping it, the collection was sharded; after the restore, it was not sharded anymore.
I was wondering, then, whether there exists a way of keeping this information in backups. I was thinking that maybe the sharding information for a collection is kept in the admin database, but in the dump the admin folder is empty, and using show collections for this database I get nothing. Then I thought it could be kept in the metadata, but this would be strange, because I know that the index information is stored in the metadata and indexes are correctly restored.
I would also like to know whether it could be possible to keep this information by using filesystem snapshots instead of mongodump + mongorestore; or maybe still using mongodump and mongorestore but stopping the system or locking writes. I don't think this last point is the cause, because I am not performing write operations while restoring, even without locking, but it's just to give ideas.
I would also like to know if anyone is completely sure whether this feature is simply not available in the current version.
Any ideas?
If you are using mongodump to back up your sharded collection, are you sure it really needs to be sharded? Usually sharded collections are very large, and mongodump would take too long to back them up.
What you can do to back up a large sharded collection is described here.
The key piece is to back up your config server as well as each shard, and to do it as close to "simultaneously" as possible after having stopped balancing. The config DB is small, so you should probably back it up very frequently anyway. The best way to back up large shards is via filesystem snapshots.
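A minimal sketch of the balancer step, run from a mongos shell with the standard sh helpers (the snapshot commands themselves depend on your storage setup and are only indicated as a comment):

sh.stopBalancer()      // waits for the current balancing round to finish
sh.getBalancerState()  // should now report false
// ...snapshot the config server and each shard here...
sh.startBalancer()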