server side Mongodb GridFS file copying - mongodb

Per business requirements I need provide possibility to copy content of some file on GridFS. Of course it can be done over domain-specific layer. But in this case I can see some overhead:
take stream from mongo-server
allocate memory on business-layer
read
place back to mongo-server
Obvious solution is write mongo-side JavaScript that will perform copying in bound of single server.
So my questions:
Where is the description of API to manage GridFS on JavaScript?
Is there any issues if my GridFS is sharded?
Is there any issues if my GridFS is replicated?
Thank you in advance

You never need to copy a GridFS file within a single server, because GridFS files are immutable: you can create, read, or delete them, but not modify them. So there's no reason to make a copy.
Copying from one server to another should be done via a driver; there's no built-in support for copying directly from a MongoDB server to another.

The 'normal' js driver does not support GridFS.
You could do it in Node.js. The documentation is here:
http://mongodb.github.com/node-mongodb-native/markdown-docs/gridfs.html
For replica-sets the documentation can be found here for Node.js:
http://mongodb.github.com/node-mongodb-native/markdown-docs/replicaset.html
For simple one-time copying of some files you can use mongofiles on command line (using a temporary file :( ) :
http://www.mongodb.org/display/DOCS/GridFS+Tools
mongofiles --host HOST get currentfilename
mongofiles --host HOST put -l currentfilename newfilename
rm currentfilename
I however don't know how well mongofiles works with sharding and replicas but I would expect it to work.

gridfs_session = gridfs.GridFS(mongo.session)
out_file = gridfs_session.get(file_id)
new_copy = gridfs_session.put(out_file, content_type=out_file.content_type)

Related

From GridFS to FileS ystem

I have some files in Mongo GridFS (inserted with utilities offered by the Meteor cfs:gridfs package)and I want to copy them to the filesystem in order to be compliant to an internal company backup procedure.
Can someone point me in the right direction on how to do this ? Thanks.
You can use mongofiles with gridfs in meteor.
Mongodb provides extensive features with mongofiles. Refer to official manual.

Transfer a MongoDB database over an unstable connection

I have a fairly small MongoDB instance (15GB) running on my local machine, but I need to push it to a remote server in order for my partner to work on it. The problem is twofold,
The server only has 30GB of free space
My local internet connection is very unstable
I tried copyDatabase to transfer it directly, but it would take approximately 2 straight days to finish, in which the connection is almost guaranteed to fail at some point. I have also tried both mongoexport and mongodump but both produce files that are ~40GB, which won't fit on the server, and that's ignoring the difficulties of transferring 40GB in the first place.
Is there another, more stable method that I am unaware of?
Since your mongodump output is much larger than your data, I'm assuming you are using MongoDB 3.0+ with the WiredTiger storage engine and your data is compressed but your mongodump output is not.
As at MongoDB 3.2, the mongodump and mongorestore tools now have support for compression (see: Archiving and Compression in MongoDB Tools). Compression is not used by default.
For your use case as described I'd suggest:
Use mongodump --gzip to create a dump directory with compressed backups of all of your collections.
Use rsync --partial SRC .... DEST or similar for a (resumable) file transfer over your unstable internet connection.
NOTE: There may be some directories you can tell rsync to ignore with --exclude; for example the local and test databases can probably be skipped. Alternatively, you may want to specify a database to backup with mongodump --gzip --db dbname.
Your partner can use a similar rsync commandline to transfer to their environment, and a command line like mongorestore --gzip /path/to/backup to populate their local MongoDB instance.
If you are going to transfer dumps on an ongoing basis, you will probably find rsync's --checksum option useful to include. Normally rsync transfers "updated" files based on a quick comparison of file size and modification time. A checksum involves more computation but would allow skipping collections that have identical data to previous backups (aside from the modification time).
If you need to sync data changes on ongoing basis, you also may be better moving your database to a cloud service (eg. a Database-as-a-Service provider like MongoDB Atlas or your own MongoDB instance).

How to import data into MongoDB from another MongoDB?

I've installed new MongoDB server and now I want to import data from the old one. My MongoDB stores monitoring data and it's a bit problematic to export the data from old database (it's over 10Gb), so I though it might be possible to import directly from DB, but haven't found how to do that with mongoimport.
The export/import would be the fastest option.
But if you really want to bypass it you can use the new server as a replica of the old one, and wait for full replication.
It takes longer but it's an easy way to set up a full copy without impact on the first one.
Follow this:
http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/
And then, once it's done, change configuration again.
It's easiest than it seems, but I recommend you to do a dry run with a sample database before doing it...
Note that another benefit is that the new replica will be probably smaller in size than the initial database, because MongoDb is not very good at freeing space of deleted members
mongoimport/mongoexport is per collection operating, so it's not proper for this kind of operation.
Instead to use mongodump/mongorestore.
If the old MongoDB instance can be shutdown to do this task, you can shut down it then copy all data files to the new server as its own data. And run the new instance.
Also db.cloneDatabase() can handle it to copy data directly from old instance to new one. It should be slower against copying data files directly.
You can use mongodump and pipe directly to the new database with mongorestore like:
mongodump --archive --db=test | mongorestore --archive --nsFrom='test.*' --nsTo='examples.*'
add --host --port and --username to mongorestore to connect to the remote db.
db.cloneDatabase() has been deprecated for a while.
You can use the copydb command discribed here.
Copies a database from a remote host to the current host or copies a database to another database within the current host.
copydb runs on the destination mongod instance, i.e. the host receiving the copied data.

How to perform one-time DB sync to another DB in MongoDB?

I have separate development and production MongoDB servers and I want to keep actual data in development server for sometime. What I should use for it: mongodump, mongoimport or something else?
Clarification: I want to copy data from production to development.
If it's a one time-thing
and you want fine control over parameters such as which collections to sync, you should use:
mongodump to dump bson files of your Production DB to your local machine
mongorestore to then, retrieve the dumped BSON files in your Local DB
Otherwise you should check out mongo-sync
It's a script I wrote for my self when I had to constantly copy my Local MongoDB database to and from my Production DB for a Project (I know it's stupid).
Once you put your DB details in config.yml, you can start syncing using two simple commands:
./mongo-sync push # Push DB to Remote
./mongo-sync pull # Pull DB to Local
If you use it inside some project, it's a good idea to add config.yml to .gitignore
You can use the db.copyDatabase(...) or db.cloneDatabase(...) commands:
http://www.mongodb.org/display/DOCS/Copy+Database+Commands
This is faster than mongodump / mongorestore because it skips creating the bson representation on disk.
When you want the dev database to look exactly like the production database, you can just copy the files. I am currently running a setup where I synchronize my MongoDB database between my desktop and my notebook with dropbox - even that works flawless.

Where does MongoDB store its documents?

I have inserted and fetched data using MongoDB, in PHP. Is there an actual copy of this data in a document somewhere?
By default Mongo stores its data in the directory /data/db.
You can specify a different directory using the --dbpath option.
If you’re running Mongo on Windows then the directory will be C:\data\db, where C is the drive letter of the working directory in which Mongo was started. This is quite confusing, so on Windows I’d recommend that you always specify a data directory using --dbpath.
MongoDB stores it's data in the data directory specified by --dbpath. It uses a database format so it's not actual documents, but there are multiple documents in each file and you cannot easily extract the data from this format yourself.
To read and/or update a document you need to use a MongoDB client, in the same way that you send SQL queries to MySQL through a MySQL client. You probably want to do it programmatically by using one of the client libraries for your programming language, but there is also a command-line client if you need to do manual updates.