Meteor Mongo Never Returns Data - mongodb

I'm using Meteor 1.3.2.4 with MongoDB 3.2 (a combination that, as far as I can tell, shouldn't cause major problems). Queries (really any queries) on collections larger than ~10,000 documents either never return or take many minutes to return.
I have indexes on most of these fields; this is not a no-index problem (I wish).
There is no evidence of any issues in the mongodb logs, just connection accepted. I have no mongo warnings or errors of any kind (fixed the mongo kernel warnings I had).
And the weirdest part about this is that when using the mongo cli, these queries run just fine, in a second or so.
One collection I'm running has ~500k docs and the other 15M.
What could be the issue? I read a few places that MongoDB 3.2 should work fine with Meteor, am I wrong?

Related

MongoDB closes connection on read operation

I run MongoDB 4.0 on WiredTiger under Ubuntu Server 16.04 to store complex documents. There is an issue with one of the collections: the documents have many images written as strings in base64. I understand this is a bad practice, but I need some time to fix it.
Because of this, some find operations fail, but only those which have a non-empty filter or a skip. For example, db.collection('collection').find({}) runs OK while db.collection('collection').find({category: 1}) just closes the connection after a timeout. It doesn't matter how many documents should be returned: if there's a filter, the error occurs every time (even if the query should return 0 docs), while an empty query always executes well until skip is too big.
UPD: some skip values cause queries to fail. db.collection('collection').find({}).skip(5000).limit(1) runs well, db.collection('collection').find({}).skip(9000).limit(1) takes much longer but still executes, while db.collection('collection').find({}).skip(10000).limit(1) fails every time. It looks like there's some kind of buffer where the DB stores query-related data, and at around 10000 docs it runs out of resources. The collection itself has ~10500 docs. Also, searching by _id runs OK. Unfortunately, I can't create new indexes because that operation fails just like the reads do.
What temporary solution can I use before removing the base64 images from the collection?
This happens because such a problematic data schema causes huge RAM usage. The more entities there are in the collection, the more RAM is needed, not only to perform well but even to run find at all.
Increasing MongoDB default RAM usage with storage.wiredTiger.engineConfig.cacheSizeGB config option allowed all the operations to run fine.
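To get a feel for how far the default is from what the collection needs, the default WiredTiger cache size can be estimated first. The sketch below assumes the rule documented for MongoDB 3.4 and later (the larger of 50% of (RAM - 1 GB) or 256 MB); the helper name and the sample RAM sizes are illustrative and not from the original post.

```javascript
// Sketch: estimate the default WiredTiger cache size, assuming the
// MongoDB 3.4+ rule: max(50% of (RAM - 1 GB), 256 MB).
// defaultCacheSizeGB is a hypothetical helper name.
function defaultCacheSizeGB(totalRamGB) {
  const halfOfRamMinusOne = 0.5 * (totalRamGB - 1);
  const floorGB = 0.25; // 256 MB expressed in GB
  return Math.max(halfOfRamMinusOne, floorGB);
}

// Sample machine sizes (assumptions, not from the question).
for (const ram of [1, 4, 16]) {
  console.log(`${ram} GB RAM -> ${defaultCacheSizeGB(ram)} GB default cache`);
}
```

A value set through storage.wiredTiger.engineConfig.cacheSizeGB replaces this default, so it is worth leaving room for the OS and any other processes on the same machine.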

How to intercept MongoDB Query from Presto Connector

I have written a number of Presto queries that pull from mongoDB collections, but others in our project query mongo directly. These folks would like to use my queries to save them the time of having to rewrite them.
Is there a way to obtain/extract the mongoDB query language generated by Presto?
Didn't see anything in the MongoDB connector documentation that would indicate how or if this was possible.
I am aware of SQL-mongo converters out there, but Presto SQL extends normal SQL to enable things like unwrapping arrays etc. that we encounter with non-relational stores, and in my experience these converters have trouble with such things.
You can set the MongoDB driver log level to DEBUG in log.properties:
org.mongodb=DEBUG
However, it will print many unrelated logs (e.g. healthcheck). Filed an issue https://github.com/prestosql/presto/issues/5600
I guess the easiest way is to look into MongoDB while the query is running and get it from there, for example via the profiler:
db.setProfilingLevel(2)
db.system.profile.find().pretty()
You may also use some GUIs like MongoVue or Robo 3T - I used MongoVue in the past to evaluate running queries.
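Profiling level 2 records every operation on the database, so it helps to narrow system.profile to the collection Presto is reading. A minimal sketch, assuming the standard profiler fields ns and ts; the helper name and the database and collection names are hypothetical:

```javascript
// Sketch: build a filter for db.system.profile.find(...) that keeps only
// operations on one collection recorded since a given time.
// profileFilter, "mydb" and "orders" are hypothetical names.
function profileFilter(dbName, collName, since) {
  return {
    ns: `${dbName}.${collName}`, // namespace is "<database>.<collection>"
    ts: { $gte: since },         // profiler timestamp of the operation
  };
}

const filter = profileFilter("mydb", "orders", new Date("2020-01-01T00:00:00Z"));
console.log(JSON.stringify(filter));
// In the mongo shell the filter would be used as:
//   db.system.profile.find(filter).sort({ ts: -1 }).pretty()
```

Once the queries have been captured, db.setProfilingLevel(0) turns the profiler back off.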

Mongodb not returning from query calls

I have an issue with my current mongodb deployment. It seems like when mongo is under load, sometimes it will randomly get into a state where it doesn't return from query calls. I have to restart the node server (which reestablishes connection with mongo) to fix this.
Mongo doesn't throw any errors when this is happening, so there is no way to detect this bug. I've been profiling the database, but it says all my queries return in under 5 seconds, so I don't think this is a case of unoptimized queries.
Do you guys have any other ideas as to what would cause mongodb to not return from any queries?

Mongoid: why fetching count is slower than fetching documents

I noticed some strange behavior. It might be mongoid or mongodb, I am not sure, but counting documents is slower than fetching the documents. Here are the queries I fired:
Institution.all.any_of(:portaled_at.ne => nil).any_of(portaled: true).order_by(:portaled_at.desc).count
# mongodb query and timing as per mongoid logs,
# times are consistent over multiple runs
# MONGODB (236ms) db['$cmd'].find({"count"=>"institutions", "query"=>{"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}, "fields"=>nil}).limit(-1)
# MONGODB (245ms) db['$cmd'].find({"count"=>"institutions", "query"=>{"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}, "fields"=>nil}).limit(-1)
Institution.all.any_of(:portaled_at.ne => nil).any_of(portaled: true).order_by(:portaled_at.desc).to_a
# mongodb query and timing as per mongoid logs
# times are not so consistent over multiple runs,
# but consistently much lower than count query
# MONGODB (9ms) db['institutions'].find({"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}).sort([[:portaled_at, :desc]])
# MONGODB (18ms) db['institutions'].find({"$or"=>[{:portaled_at=>{"$ne"=>nil}}, {:portaled=>true}]}).sort([[:portaled_at, :desc]])
I believe indexes are not used by mongodb for $and and $or queries, but just so if it matters, I have a sparse index on portaled_at in descending order. Out of around 200,000 documents only around 50-60 have portaled_at set.
rails 3.2.12
mongoid 2.6.0
mongodb 2.2.3
This is against my common sense and if anybody can explain what is going on I would really appreciate it.
While the two are running through different subsystems in MongoDB (one is using runCommand and the other the standard query engine), the specific issue in this case is very likely a known issue in the current version of MongoDB.
The quick summary is that counting without fetching is extremely slow because MongoDB does a lot of extra work that often isn't necessary. It's been fixed in the development branch, so it should be in 2.4 when it is released.
For some reason Mongo defaults to not counting records using only indexes. However, if you construct a query correctly, Mongo will count from the index. The trick is to only fetch the fields that are in the index, and you have to specify a query.
In Mongo Shell:
db.MyCollection.find({"_id":{$ne:''}},{"_id":1}).count()
You can check with the explain method:
db.MyCollection.find({"_id":{$ne:''}},{"_id":1}).explain()
Which will include "indexOnly" : true in the output.
And similarly the command can be executed via the Moped driver directly like so:
Mongoid::Sessions.default.command(:count => "MyCollection", :query=>{"_id"=>{"$ne"=>""}}, :fields => {:_id=>1})
Which, in my benchmarks (on my live data, YMMV) is about 100x faster than simply doing MyMongoidDocumentClass.count
Unfortunately, there doesn't seem to be a way to do this quickly through the Mongoid gem.

MongoDB poor read performance

I have a simple data set, a few collections, not more than 20
documents in each, in MongoDB 2.0 (previously 1.8). I'm getting poor
results when it comes to querying data (at least I think they could be
much better looking at http://mongoid.org/performance.html). At first,
I thought that the mapper I use in Ruby (Mongoid) was the problem, but
I made some more tests and it seems more related to the database
itself.
I've made a simple benchmark where I query the same document 10000
times by its ID, first using the Ruby Mongo driver, then Mongoid. The
results:
             user     system      total        real
driver   7.670000   0.380000   8.050000 (  8.770334)
mongoid  9.180000   0.380000   9.560000 ( 10.384077)
The code is here: https://gist.github.com/1303536
The machine I'm testing this on is a Core 2 Duo P8400 2.27 GHz with 4
GB of RAM running Ubuntu 11.04.
I also made a similar test using pymongo to check if the problem lies
in the Ruby driver, but the result was only slightly better (5-6 s for
10000 requests).
The bsonsize of the document I'm fetching is 67. It has some small
embedded documents, but not more than 100. Some of the embedded
documents refer to documents from other collections by ID, but AFAIR this
relationship is handled by the mapper, so it shouldn't influence the
performance. Fetching this document directly in the database with explain() results in millis = 0.
The odd thing is that the HDD LED keeps blinking all the time during
the tests. Shouldn't this document be cached in RAM by Mongo after
first read? Is there something obvious I could be missing? Or is this
not a poor result at all (but comparing with http://mongoid.org/performance.html
it does seem bad)?
I dropped and recreated the database. Maybe the problem was caused by going from 1.8 to 2.0. Anyway, the HDD LED stopped blinking and everything is now 2-3 times faster.
I also looked carefully at the test that was used to benchmark Mongoid, and this result (0.001s) is for just one find(), not a million. I told Mongoid's author that I think the web site doesn't state clearly that the number of operations applies only to some of the tests.
Sorry for the confusion.
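Put per operation, the original numbers were already in the same range as that 0.001 s figure. A quick sketch of the arithmetic, using the "real" column of the benchmark above and its 10000 iterations (the variable and function names are illustrative):

```javascript
// Sketch: convert the benchmark totals quoted above into per-find latency.
const ITERATIONS = 10000;
const realSeconds = { driver: 8.770334, mongoid: 10.384077 };

// Total wall-clock seconds divided by iterations, expressed in milliseconds.
function msPerOperation(totalSeconds, iterations) {
  return (totalSeconds / iterations) * 1000;
}

for (const [name, total] of Object.entries(realSeconds)) {
  console.log(`${name}: ${msPerOperation(total, ITERATIONS).toFixed(3)} ms per find`);
}
// prints:
// driver: 0.877 ms per find
// mongoid: 1.038 ms per find
```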