How to cache Mongodb Queries? - mongodb

I am using mongodb 2.4.6 and python 2.7 .I have frequent executing queries.Is it possible to save the frequent qaueries results in cache.?
Thanks in advance!

Yes but you will need to make one, how about memcached or redis?
However as a pre-cautionary note, MongoDB does have its recently used data cached to RAM by the OS already so unless you are doing some really resource intensive aggregation query or you are using the results outside of your working set window you might not actually find that it increases performance all that much.

Related

Set allowDiskUse:true in mongodb compass aggregation

I am having trouble trying to call {allowDiskUse:true} in the mongodb compass GUI tool. I have created a view based on an aggregation of another collection. The view returns an error of
Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting.
Hence i saw that it is required to append {allowDiskUse:true} but i am unable to find a suitable place to call. There is a possibility to use the $out stage to write to another collection but I would like to try the view first :)
ADD ON:
I have tried to run the query db.noDups.aggregate([],{allowDiskUse : true }); in command line and it works. But I would like to execute in MongoDB compass for the visualization and exporting function.
I also tried {},{},{allowDiskUse: true} in the filter condition but still no luck :(
Btw I am on MongoDB 4.2.6 Community and MongoDB compass 1.25.0
I tried appending it to the filter and it didn't work. I've looked on many different forums and kind find a solution for allowDiskUse from Compass. This seems kind of crazy that you need to add such a kludgy option to even do modest groupings on small amounts of data.
I also have been looking to see how to increase the amount of memory that Mongo can use to get around having to do this. I have Mongo installed on a server with 512GB of memory, it seems rather silly to have developers jumping through hoops like this.

How long will a mongo internal cache sustain?

I would like to know how long a mongo internal cache would sustain. I have a scenario in which i have some one million records and i have to perform a search on them using the mongo-java driver.
The initial search takes a lot of time (nearly one minute) where as the consecutive searches of same query reduces the computation time (to few seconds) due to mongo's internal caching mechanism.
But I do not know how long this cache would sustain, like is it until the system reboots or until the collection undergoes any write operation or things like that.
Any help in understanding this is appreciated!
PS:
Regarding the fields with which search is performed, some are indexed
and some are not.
Mongo version used 2.6.1
It will depend on a lot of factors, but the most prominent are the amount of memory in the server and how active the server is as MongoDB leaves much of the caching to the OS (by MMAP'ing files).
You need to take a long hard look at your log files for the initial query and try to figure out why it takes nearly a minute.
In most cases there is some internal cache invalidation mechanism that will drop your cached query internal record when write operation occurs. It is the simplest describing of process. Just from my own expirience.
But, as mentioned earlier, there are many factors besides simple invalidation that can have place.
MongoDB automatically uses all free memory on the machine as its cache.It would be better to use MongoDB 3.0+ versions because it comes with two Storage Engines MMAP and WiredTiger.
The major difference between these two is that whenever you perform a write operation in MMAP then the whole database is going to lock and whereas the locking mechanism is upto document level in WiredTiger.
If you are using MongoDB 2.6 version then you can also check the query performance and execution time taking to execute the query by explain() method and in version 3.0+ executionStats() in DB Shell Commands.
You need to index on a particular field which you will query to get results faster. A single collection cannot have more than 64 indexes. The more index you use in a collection there is performance impact in write/update operations.

Is MapReduce any faster than doing the map and reduce in JS?

I'm using MongoDB with Node.js. Is there any speed advantage to using a MapReduce in Mongo as opposed to getting the full result set and doing a map and reduce in JS on my own?
There is usually no performance advantage to retrieving the entire resultset and performing the m/r app-side. In fact, in almost all situations cramming the entire resultset in memory on your node server is a particularly bad idea.
Doing the map/reduce on MongoDB will make sure no bandwidth between the database and your app server is wasted on retrieving the resultset and writing back the results of your m/r. MongoDB's map/reduce can also be easily scaled up.
TL;DR : Always do it in MongoDB
If your database is on a different host than your server, the transfer of data will be smaller, which will waste less bandwidth and time.
The actual transfer of data can be costly and time consuming. Imagine if everytime you wanted to do an inventory count you shipped all your items to another warehouse.
Also you have to factor in how things will scale.
With mongodb you will typically want at least one replica for your data and that will add performance for read based tasks.
With node you probably wont need to add a second server for a good while due to how well it scales. Adding an intensive task to it could cause you to need to expand the amount of node servers facing outwards.

MongoDB poor read performance

I have a simple data set, a few collections, not more than 20
documents in each, in MongoDB 2.0 (previously 1.8). I'm getting poor
results when it comes to querying data (at least I think they could be
much better looking at http://mongoid.org/performance.html). At first,
I though that the mapper I use in Ruby (Mongoid) was the problem, but
I made some more tests and it seems more related to the database
itself.
I've made a simple benchmark where I query the same document 10000
times by its ID, first using the Ruby Mongo driver, then Mongoid. The
results:
user system total real
driver 7.670000 0.380000 8.050000 ( 8.770334)
mongoid 9.180000 0.380000 9.560000 ( 10.384077)
The code is here: https://gist.github.com/1303536
The machine I'm testing this on is a Core 2 Duo P8400 2.27 GHz with 4
GB of RAM running Ubuntu 11.04.
I also made a similar test using pymongo to check if the problem lies
in the Ruby driver, but the result was only slightly better (5-6 s for
10000 requests).
The bsonsize of the document I'm fetching is 67. It has some small
embedded documents, but not more than 100. Some of the embedded
documents refer documents from other collections by ID, but AFAIR this
relationship is handled by the mapper, so it shouldn't influence the
performance. Fetching this document directly in the database with explain() results in millis = 0.
The odd thing is that the HDD LED keeps blinking all the time during
the tests. Shouldn't this document be cached in RAM by Mongo after
first read? Is there something obvious I could be missing? Or is this
not a poor result at all (but comparing with http://mongoid.org/performance.html
it does seem bad)?
I dropped and recreated the database. Maybe it was because of going from 1.8 to 2.0. Anyway, the HDD led stopped blinking and everything is now 2-3x times faster.
I also looked carefully at the test that was used to benchmark Mongoid and this result (0.001s) is just for one find(), not a million. I told the Mongoid's author that I think it's not stated clearly on the web site that the number of operations applies only to some of them.
Sorry for the confusion.

Running MongoDB Queries in Map/Reduce

Is it possible to run MongoDB commands like a query to grab additional data or to do an update from with in MongoDB's MapReduce command. Either in the Map or the Reduce function?
Is this completely ludicrous to do anyways? Currently I have some documents that refer to separate collections using the MongoDB DBReference command.
Thanks for the help!
Is it possible to run MongoDB commands... from within MongoDB's MapReduce command.
In theory, this is possible. In practice there are lots of problems with this.
Problem #1: exponential work. M/R is already pretty intense and poorly logged. Adding queries can easily make M/R run out of control.
Problem #2: context. Imagine that you're running a sharded M/R and you are querying into an unsharded collection. Does the current context even have that connection?
You're basically trying to implement JOIN logic and MongoDB has no joins. Instead, you may need to build the final data in a couple of phases by running a few loops on a few sets of data.