Storing Images: MongoDB vs File System

I need to store a large number of images (around 10,000 per day), each averaging roughly 1 to 10 MB.
I can store these images in MongoDB either with the GridFS library or by storing the Base64-encoded image as a string.
My question is: is MongoDB suitable for such a payload, or is it better to use a file system?
Many Thanks,

MongoDB GridFS has several advantages over a plain file system, and it can definitely cope with the amount of data you describe, since you can scale out with a sharded MongoDB cluster. I have not stored that much binary data in it myself, but I do not think binary data behaves any differently from text here. So yes, it is suitable for the payload.
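
For a rough sense of scale, here is a back-of-the-envelope sketch of the numbers in the question. It assumes binary megabytes (1 MB = 1024 * 1024 bytes) and the 16 MB per-document limit discussed in the threads below; the function names are made up for illustration. It also shows why the Base64-as-string option is the tighter fit: Base64 inflates the data by a factor of 4/3, so a raw image of about 12 MB already reaches the document limit before any other fields are counted.

```python
import base64
import math

MB = 1024 * 1024
MAX_DOCUMENT_BYTES = 16 * MB  # per-document limit discussed in this thread


def daily_volume_gib(avg_image_mb: float, images_per_day: int = 10_000) -> float:
    """Raw bytes ingested per day, expressed in GiB."""
    return images_per_day * avg_image_mb * MB / (1024 ** 3)


def base64_length(raw_bytes: int) -> int:
    """Length in characters of the padded Base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_as_base64_string(raw_bytes: int) -> bool:
    """True if the Base64 text alone stays below the document limit.

    Ignores the (small) overhead of other fields in the same document.
    """
    return base64_length(raw_bytes) < MAX_DOCUMENT_BYTES


for avg_mb in (1, 10):
    print(f"{avg_mb} MB average: {daily_volume_gib(avg_mb):.1f} GiB per day")

# Cross-check the length formula against the standard library encoder.
sample = bytes(1 * MB)
assert len(base64.b64encode(sample)) == base64_length(len(sample))

print("10 MB image fits as Base64:", fits_as_base64_string(10 * MB))
print("13 MB image fits as Base64:", fits_as_base64_string(13 * MB))
```

At the upper end of the stated range, a single 10 MB image still fits as a Base64 string, but there is little headroom, which is one reason to prefer GridFS or raw binary storage over Base64 text.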

Implementing queries and operations on the stored file objects is easier with GridFS. In addition, GridFS takes care of backup, replication, and scaling. However, serving the files is faster with Filesystem + Nginx than with GridFS + Nginx (see https://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/). The slower serving speed of GridFS can, however, be offset when sharding is used.

Related

MongoDB to MongoDB GridFS

I'm new to MongoDB. If I initially build my app on plain MongoDB and later want to switch to MongoDB GridFS, will it be possible to switch an already large, populated database?
For example, if I start with plain MongoDB and after the app has been running for a while the documents exceed 16 MB, I guess I will have to switch to GridFS. How easy or difficult will that switch be, and is it possible at all?
Thanks.
GridFS is used to store large files. Internally it divides the data into chunks (255 KB by default). Let me give you an example of saving a PDF file in MongoDB both ways. I am assuming a PDF size of 10 MB so that we can compare the normal way and the GridFS way.
Normal Way:
Say you want to store it in the normal_book collection in the testDB database. The whole PDF is stored in a single document in this collection, and when you fetch it using db.normal_book.find(), the whole PDF is loaded into memory.
GridFS way:
In GridFS, we have two collections: one for storing the data and one for storing its metadata. The data goes into the fs.chunks collection and the metadata into the fs.files collection. The beauty of GridFS is that you can fetch the whole file at once or fetch chunks individually.
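
To make the two-collection layout concrete, here is a minimal in-memory sketch of how a 10 MB file would be split with the default 255 KB chunk size. It does not talk to MongoDB; it only builds plain dictionaries shaped like the documents GridFS keeps in fs.files and fs.chunks. The helper names and the "book-1" id are made up for illustration, and a real fs.files document carries more fields (such as the filename and upload date).

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # 255 KB, the default chunk size mentioned above


def split_into_chunks(data: bytes, file_id, chunk_size: int = DEFAULT_CHUNK_SIZE):
    """Return (files_doc, chunk_docs) shaped like fs.files / fs.chunks entries."""
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    return files_doc, chunk_docs


def reassemble(chunk_docs) -> bytes:
    """Rebuild the file by concatenating chunks in order of their sequence number n."""
    ordered = sorted(chunk_docs, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


pdf = bytes(10 * 1024 * 1024)  # stand-in for the 10 MB PDF
files_doc, chunk_docs = split_into_chunks(pdf, file_id="book-1")

print("metadata document:", files_doc)
print("number of chunks:", len(chunk_docs))
print("size of last chunk:", len(chunk_docs[-1]["data"]))
assert reassemble(chunk_docs) == pdf
```

Each chunk is an ordinary document well below the 16 MB limit, which is why a reader can pull a single chunk without loading the whole file.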
Now, coming to your question: there is no direct way or property to tell MongoDB that you now want to switch to GridFS. You need to reinsert the data into GridFS, using either the mongofiles command-line tool or one of MongoDB's drivers.
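
The "reinsert through a driver" route could look roughly like the sketch below. It is written against any object that offers upload_from_stream(filename, source, metadata=...), which is the shape of PyMongo's gridfs.GridFSBucket method; the field names "name" and "data", the function migrate_to_gridfs, and the InMemoryBucket stand-in are all assumptions made for illustration so the sketch runs without a server.

```python
import io


def migrate_to_gridfs(documents, bucket, data_field="data", name_field="name"):
    """Copy embedded binary payloads into a GridFS-style bucket.

    documents: iterable of dicts holding the payload under data_field.
    bucket: object with upload_from_stream(filename, source, metadata=...).
    Returns a mapping from each old document _id to the new file id.
    """
    id_map = {}
    for doc in documents:
        # Keep every field except the payload and the old _id as file metadata.
        metadata = {k: v for k, v in doc.items() if k not in (data_field, "_id")}
        file_id = bucket.upload_from_stream(
            doc[name_field], io.BytesIO(doc[data_field]), metadata=metadata
        )
        id_map[doc["_id"]] = file_id
    return id_map


class InMemoryBucket:
    """Hypothetical stand-in for a real bucket, used only to exercise the sketch."""

    def __init__(self):
        self.files = {}

    def upload_from_stream(self, filename, source, metadata=None):
        file_id = len(self.files) + 1
        self.files[file_id] = (filename, source.read(), metadata)
        return file_id


old_docs = [
    {"_id": "a", "name": "book.pdf", "data": b"%PDF-1.4 ...", "author": "someone"},
    {"_id": "b", "name": "notes.pdf", "data": b"%PDF-1.4 more", "author": "someone"},
]
bucket = InMemoryBucket()
print(migrate_to_gridfs(old_docs, bucket))
```

Against a live database, the documents argument would presumably be a cursor such as db.normal_book.find() and the bucket a gridfs.GridFSBucket(db). After verifying the copy, the application would reference the returned file ids instead of the embedded payloads.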

C/C++ Example for GridFS implementation in MongoDB

I just started building an application on MongoDB for saving and retrieving files and found that it has a standard specification for this purpose called GridFS. Unfortunately, I am unable to find any starter example for it in C/C++. If anyone knows anything related to it, please point me in the right direction.
Edit:
I read that GridFS is used for storing files larger than 16 MB, so what about files smaller than 16 MB? I cannot find any information about that. For smaller files, do I need to use some other mechanism, or the same GridFS?
Thanks
GridFS can be accessed through the class mongo::GridFS. The API is fairly self-explanatory.
Alternatively, you can embed the binary data of your files in normal documents as the BSON BinData type. mongo::BSONObjBuilder has the method appendBinData to add a field with binary content to a document.
The reason GridFS exists is that there is an upper limit of 16MB per document. When you want to store data larger than 16MB, you need to split it into multiple documents. GridFS is an abstraction to handle this automatically, but it can also be used for smaller files.
In general, you shouldn't mix both techniques for the same content, as it just makes things more complicated with little benefit. When you can guarantee that your data doesn't get close to 16MB, use embedding. When you occasionally have content > 16MB, you should use GridFS even for files smaller than that.
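
The rule of thumb above can be written down as a small helper. This is only a sketch: the 50% headroom factor is an arbitrary assumption standing in for "doesn't get close to 16MB", and the function name is made up.

```python
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # the 16 MB per-document limit


def choose_storage(largest_expected_bytes: int, headroom: float = 0.5) -> str:
    """Pick one technique for a whole class of content.

    Returns "embed" when even the largest expected file stays well below the
    document limit, and "gridfs" otherwise. headroom is the fraction of the
    limit treated as safely far away (an assumption, tune to taste).
    """
    safe_limit = MAX_BSON_DOCUMENT_BYTES * headroom
    return "embed" if largest_expected_bytes <= safe_limit else "gridfs"


print(choose_storage(500 * 1024))        # files of roughly 500 KB
print(choose_storage(10 * 1024 * 1024))  # files up to 10 MB
print(choose_storage(40 * 1024 * 1024))  # occasional 40 MB files
```

The decision is made once per kind of content, based on the largest file expected, so that small and large files of the same kind do not end up in different places.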

Using MongoDB for storing files of approx. 500 KB

The GridFS FAQ says that files larger than 16 MB should be stored in GridFS. I have a lot of files of about 500 KB.
Question is: which approach is more efficient - storing files' content inside document or storing file itself in GridFS? Should I consider other approaches?
As for efficiency, either approach is the same. GridFS is implemented at the driver level by paging your >16MB data across multiple documents. MongoDB is unaware that you're storing a "file", it just knows how to store documents and doesn't ask questions.
So, depending on your driver (PHP/NodeJS/Ruby), you may find some metadata features nice and opt to use GridFS because of that. Otherwise, if you are absolutely sure a document will not be larger than 16MB, storing the raw content in the document should be fairly simple and just as fast (or faster).
Generally, I'd recommend against storing files in the database. It can have a negative impact on your working set and overall speed.

Exceded maximum insert size of 16,000,000 bytes + mongoDB + ruby

I have an application that uses MongoDB as its database for storing records; the Ruby wrapper for MongoDB I'm using is Mongoid.
Everything was working fine until I hit the following error:
Exceded maximum insert size of 16,000,000 bytes
Can anyone pinpoint how to get rid of this error?
I'm running a MongoDB server that does not have a configuration file (no configuration was provided with the MongoDB source).
Can anyone help?
You have hit the maximum limit of a single document in MongoDB.
If you save large data files in MongoDB, use GridFS instead.
If your document has too many subdocuments, consider splitting it and use relations instead of nesting.
The limit of 16MB data per document is a very well known limitation.
Use GridFS for storing arbitrary binary data of arbitrary size + metadata.

Can anyone explain the MongoDB GridFS feature?

I am a MongoDB user. I have heard about the GridFS feature, but I am not sure when and where to use it, or whether it is necessary to use it at all. If you are using this feature, I hope you can explain it precisely.
Thanks,
You'd look into using GridFS if you needed to store large files in MongoDB.
Quote from the MongoDB documentation which I think sums it up well:
The database supports native storage of binary data within BSON objects. However, BSON objects in MongoDB are limited to 4MB in size [16 MB in later releases]. The GridFS spec provides a mechanism for transparently dividing a large file among multiple documents. This allows us to efficiently store large objects, and in the case of especially large files, such as videos, permits range operations (e.g., fetching only the first N bytes of a file).
There's also the GridFS specification here
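
The range operations mentioned in the quote follow from the chunk layout: because chunks are numbered and have a fixed size, a reader can work out which chunks cover a byte range and fetch only those. A minimal sketch of that arithmetic, assuming the 255 KB default chunk size cited earlier and a made-up function name:

```python
def chunks_for_range(start: int, length: int, file_length: int,
                     chunk_size: int = 255 * 1024) -> range:
    """Return the chunk sequence numbers covering bytes [start, start + length).

    The range is clipped to the end of the file; an empty range is returned
    when no bytes fall inside the file.
    """
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    end = min(start + length, file_length)
    if end <= start:
        return range(0)
    first = start // chunk_size
    last = (end - 1) // chunk_size
    return range(first, last + 1)


video_length = 700 * 1024 * 1024  # hypothetical 700 MB video
first_megabyte = chunks_for_range(0, 1024 * 1024, video_length)
print("chunks needed for the first 1 MB:", list(first_megabyte))
```

Fetching the first N bytes therefore touches only a handful of chunk documents, regardless of how large the whole file is.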