MongoDB performance for queries

I'm having issues with my database right now, with hundreds of people on our site. Looking at the current operations in my Compose.io dashboard, I can see one query that has been running for a very long time.
I have a compound index on _id: 1, owner: 1, and date: -1. Any idea why this query could be taking so long and locking my entire database? I can also see that the number of yields is 268, which seems very high and problematic.
Another query has been running for half a second and also seems to be locking the entire database. Why is this?
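Since the post doesn't include the operation itself, here is a minimal sketch of how to inspect (and, if necessary, kill) a long-running operation from the shell; the 5-second threshold is an arbitrary example, not from the post:
// List active operations that have been running longer than 5 seconds.
db.currentOp({ active: true, secs_running: { $gt: 5 } })
// Take the opid from the output above to terminate a runaway operation:
// db.killOp(<opid>)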

MongoDB Atlas Serverless Database | Serverless Instance Costs | RPU cost explanation

Can someone explain how RPUs are calculated, with an example?
Let's say I have a Mongo collection with 5 million documents. If I do a findOne on the collection, is the RPU charge 5 million or 1?
It depends on a crucially important step that I could not find anywhere in Mongo's pricing information.
I had 4,000 daily visits with a user database of 25,000 users, and I was calling findOne on the database twice per page load. What I thought would be 8,000 RPUs turned out to be 170 million RPUs. That's right: 42,500 RPUs per page visit.
So the critically important step is: add indexes for the keys you're matching on.
Yes, this is an obvious performance optimization, but it's also easy to overlook the first time you create a database.
I was in the MongoDB dashboard daily to check user stats and was never alerted to any abnormality. After searching online for over an hour, I could not find any mention of this from Mongo. But when I reached out to support, this was the first thing they pointed out. So they know it's an issue. They know it's a gotcha. They know how simple it would be to point out that a single call is resulting in 40,000 RPUs. It doesn't seem that they care; rather, it seems to be a sleazy business model they've intentionally adopted.
So if you do a findOne on a 5-million-document collection, will you be charged 1 RPU or 5 million RPUs? It depends on whether you remembered to index.
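For illustration, indexing the matched key is a one-line fix; the collection and field names here ("users", "email") are assumptions, not from the post:
// Index the field you match on so findOne() seeks instead of scanning.
db.users.createIndex({ email: 1 })
// With the index, this reads roughly 1 document (~1 RPU) instead of
// scanning all 25,000 user documents on every call.
db.users.findOne({ email: "user@example.com" })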

mongodb taking 500 ms in fetching 150 records

We are using MongoDB (3.0.6) for our application. We have around 150 entries in one of the collections, but it takes approximately 500ms to fetch all of these records, which I didn't expect. Mongo is deployed on a company server. What can be done to reduce this time?
We are not getting many reads and there is no CPU load. What mistake might be causing this, or what configuration should be changed?
Here is my schema: http://pastebin.com/x18iDPKf
I am just querying all the entries, which are 160 in number. I don't think the time taken is due to Mongoose or NodeJS, because when I query using RoboMongo it still takes the same time.
Output of db.<collection>.stats().size:
223000
The query I am doing is:
db.getCollection('collectionName').find({})
This shouldn't be a problem with MongoDB itself. It may be a temporary issue, or it might be due to system constraints such as available memory.
If the problem persists, add indexes on the fields you are querying:
https://docs.mongodb.com/manual/indexes/
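To see where the 500ms actually goes, it can also help to compare server-side execution time with what the client observes; a minimal sketch using the question's own query:
// executionTimeMillis is the server-side time; if it is low (a few ms),
// the latency is in the network or the client driver, not in MongoDB.
db.getCollection('collectionName').find({}).explain("executionStats").executionStats.executionTimeMillis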

Mongo shell query timing out at 90 seconds

I am using mongodb via the mongo shell to query a large collection. For some reason after 90 seconds the mongo shell seems to be stopping my query and nothing is returned.
I have tried the following two commands but neither will return anything. After 90 seconds it just gives me a new line that I can type in another command.
db.cards.find({ "Field": "Something" }).maxTimeMS(9999999)
db.cards.find({ "Field": "Something" }).addOption(DBQuery.Option.tailable)
db.cards.find() returns results, but anything with parameters times out at exactly 90 seconds and nothing is returned.
Any help would be greatly appreciated.
Given the level of detail in your question, I am going to focus on 'query a large collection' and guess that you are using the MMAPv1 storage engine, with no index coverage on your query.
Are you disk bound?
Given the above assumptions, you could be cycling data between RAM and disk. Mongo has a default 100MB RAM limit, so if your query has to examine a lot of documents (no index coverage), paging data from disk to RAM could be the culprit. I have seen the mongo shell behave as you describe, or lock up and terminate, when memory constraints are exceeded.
32-bit systems can also impose severe memory limits for large collections.
You could look at your OS-specific disk activity monitor for a clue as to whether this is your problem.
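From the shell, one rough proxy (assuming MMAPv1 on Linux, per the guess above) is the page-fault counter in serverStatus:
// A rapidly climbing page_faults count while the query runs suggests
// data is being cycled between disk and RAM.
db.serverStatus().extra_info.page_faults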
Just how large is your collection?
Next, how big is your collection? You can run show collections to see the physical size of the collection, and db.cards.count() to see your record count. This helps quantify "large collection".
NOTE: you might need the mongo-hacker extensions to see collection disk use in show collections.
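Without the extensions, db.cards.stats() reports the same sizing information:
var s = db.cards.stats()
s.count        // number of documents
s.size         // total uncompressed data size in bytes
s.storageSize  // space allocated on disk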
Mongo shell investigation
Within the mongo shell, you have a couple more places to look.
By default, mongo will log slow queries (> 100ms). After your 90 sec timeout:
db.adminCommand({getLog: "global" })
and look for slow query log entries.
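For example, you could filter the in-memory log for entries reporting a large millisecond count; the three-digit pattern below mirrors the default 100ms slow-query threshold:
// Print only log lines whose reported duration is 100ms or more.
db.adminCommand({ getLog: "global" }).log
    .filter(function (line) { return /\d{3,}ms/.test(line); })
    .forEach(function (line) { print(line); })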
Next, look at your winning query plan:
var e = db.cards.explain()
e.find({ "Field": "Something" })
I am guessing you will see
"stage": "COLLSCAN",
which means you are doing a full collection scan and you need index coverage for your query (a good idea for both queries and sorts).
Suggestions
You should have at least partial index coverage on any production query. A proper index should solve your problem (assuming you don't have documents approaching the 16MB limit).
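A minimal sketch of that fix, using the field name from your query:
// Index the queried field; on 3.x MMAPv1 a foreground build locks the
// database, so { background: true } is safer on a busy server.
db.cards.createIndex({ "Field": 1 }, { background: true })
// Re-run explain afterwards; the winning plan should now show IXSCAN.
db.cards.find({ "Field": "Something" }).explain()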
Another approach (which I don't recommend - indexing is better) is to use a cursor and iterate the results:
var cursor = db.cards.find({ "Field": "Something" })
while (cursor.hasNext()) {
    print(tojson(cursor.next()))
}
Depending on the root cause, this may work for you.

Correct way to run an ETL on a live production MongoDB database

We have the following environment:
3 servers + 3 replica set servers
Each server has 3 shards
Our main collection has around 40,000,000 documents averaging around 6KB each
Our shard key is hashed(_id) - _id being pure BsonID
We peak daily at around 25,000 I/OPS and low point at around 10,000
We want to run an ETL that loads all of the documents on the main collection, do some in memory calculation (in our application tier) and then dumps it into an external DB.
We took a very poor and naive approach and simply loaded documents without a query, using limit, skip, and batchSize - which was a complete failure (it severely hurt our service level, even though we set the readPreference to secondary):
db.Collection.find().skip(i * 5000).limit(5000).readPref("secondary")
where i is the current iteration; this ran on multiple threads to speed up the process.
I was wondering what would be the best approach to be able to load all of our documents without hurting the performance of our database.
The data can be a bit stale (a few seconds delay from the actual data on the primary is fine).
I've posted this question on the DB admins site, but it doesn't seem to attract many answers, so I'm posting it here as well. Sorry if it's against the forum rules.
Thanks!
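One approach worth trying (a sketch, not from the thread): replace skip(), which re-walks every skipped document on each batch, with range batching on _id, so each batch is an index seek. The collection name and batch size below are taken from the question; everything else is illustrative:
var batchSize = 5000;
var lastId = null;
while (true) {
    // Seek past the last _id seen instead of skipping documents.
    var filter = (lastId === null) ? {} : { _id: { $gt: lastId } };
    var batch = db.Collection.find(filter)
                  .sort({ _id: 1 })
                  .limit(batchSize)
                  .readPref("secondary")
                  .toArray();
    if (batch.length === 0) break;
    // ... hand the batch to the application tier for the in-memory calculation ...
    lastId = batch[batch.length - 1]._id;
}
Each worker thread can also be given a disjoint _id range so several readers run in parallel without overlapping.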

MongoDB C# query performance much worse than mongotop reports

I have quite a bit of experience with Mongo, but I am on the verge of tears of frustration over this problem (which of course popped up out of nowhere a day before release).
Basically, I am querying a database to retrieve a document, but the query will often take an order of magnitude (or even two) longer than it should, which is especially puzzling since it returns nothing.
Query:
//searchQuery ex: { "atomic.Basic^SessionId" : "a8297898-7fc9-435c-96be-9c5e60901e40" }
var doc = FindOne(searchQuery);
Explain:
{
    "cursor" : "BtreeCursor atomic.Basic^SessionId",
    "isMultiKey" : false,
    "n" : 0,
    "nscannedObjects" : 0,
    "nscanned" : 0,
    "nscannedObjectsAllPlans" : 0,
    "nscannedAllPlans" : 0,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "atomic.Basic^SessionId" : [
            [
                "a8297898-7fc9-435c-96be-9c5e60901e40",
                "a8297898-7fc9-435c-96be-9c5e60901e40"
            ]
        ]
    }
}
It often takes 50-150 ms, even though mongotop reports at most 15 ms of read time (and that should be across several queries). There are only 6k documents in the database (only 2k or so are in the index, and the explain says it's using the index), and since the document being searched for isn't there, it can't be a deserialization issue.
It's not this bad on every query (sub-ms most of the time), and surely the B-tree isn't large enough to have that much variance.
Any ideas will have my eternal gratitude.
MongoTop is not reporting the total query time; it reports the amount of time MongoDB spends holding particular locks.
That query returned in 0ms according to the explain (which is extremely fast). What you are describing sounds like network latency. What is the latency when you ping the node? Is it possible that the network is flaky?
What version of MongoDB are you using? Consider upgrading both MongoDB and the C# driver to the latest stable versions.
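One way to separate server time from everything else is to time the round trip from the mongo shell on the same host as the C# client; the collection name below is an assumption, not from the post:
// Time the full round trip as a client sees it.
var t0 = new Date();
db.sessions.find({ "atomic.Basic^SessionId" : "a8297898-7fc9-435c-96be-9c5e60901e40" }).toArray();
print("round trip: " + (new Date() - t0) + " ms");
// A bare ping isolates pure network and connection overhead.
db.adminCommand({ ping: 1 })
If the shell sees sub-millisecond round trips while the C# client sees 50-150ms, the problem is in the driver or its connection pool rather than the network or the server.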