optimizing query for $exists in sub property - mongodb

I need to search for the existence of a property that is within another object.
The collection contains documents that look like:
"properties": {
"source": {
"a/name": 12837,
"a/different/name": 76129
}
}
As you can see below, part of the query string is from a variable.
With some help from JohnnyHK (see mongo query - does property exist? for more info), I've got a query that works by doing the following:
var name = 'a/name';
var query = {};
query['properties.source.' + name] = {$exists: true};
collection.find(query).toArray(function...
Now I need to see if I can index the collection to improve the performance of this query.
I don't have a clue how to do this or if it is even possible to index for this.
Suggestions?

Two things are going on here.
First, you are probably looking for a sparse index:
http://docs.mongodb.org/manual/core/index-sparse/
In your case that would be a sparse index on the "properties.source.a/name" field. An index on the field lets the query avoid a full collection scan, which can improve lookup time considerably. Keep in mind that it only covers that one key; every other key name needs its own index.
db.yourCollectionName.createIndex( { "properties.source.a/name": 1 }, { sparse: true } )
Second, whenever you want to know whether a query is fast or slow, run it in the mongo shell and call the explain method on the result:
db.yourCollectionName.find(query).explain();
The output tells you whether the query used an index, how many documents it had to examine to complete, and other useful details.
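Since the key name arrives in a variable, the filter and the index specification can both be built from the same string. This is only a sketch in plain JavaScript with no driver dependency; `sourcePath`, `existsFilter` and `sparseIndexSpec` are hypothetical helper names, and the shell calls appear only in comments.

```javascript
// Hypothetical helpers; only the field layout comes from the question.
function sourcePath(name) {
  // Dot-notation path into the embedded "source" object.
  return 'properties.source.' + name;
}

function existsFilter(name) {
  // Same bracket assignment the question uses for a variable key.
  const filter = {};
  filter[sourcePath(name)] = { $exists: true };
  return filter;
}

function sparseIndexSpec(name) {
  // Key pattern and options, in the shape createIndex expects.
  const keys = {};
  keys[sourcePath(name)] = 1;
  return { keys: keys, options: { sparse: true } };
}

const name = 'a/name';
const filter = existsFilter(name);
const spec = sparseIndexSpec(name);

// In the mongo shell these would be used as (not executed here):
//   db.yourCollectionName.createIndex(spec.keys, spec.options)
//   db.yourCollectionName.find(filter).explain()
console.log(JSON.stringify(filter));
console.log(JSON.stringify(spec));
```

Building both from one helper keeps the indexed path and the queried path from drifting apart.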


conditional mongodb view based on queried value

I'm not sure this is possible, but I'd like to create a single view, or at least a single query, that looks in different collections based on what's being queried.
For example, if the first character is an "A", look in the "Aresults" collection; if it's a "B", look in the "Bresults" collection, etc.
I could potentially create an "A-Z" collection with just those letters and do a $lookup from there based on a condition, but I'm not sure how to do that either.
I am aware that I could create a view with a $unionWith over all the "*results" collections, but that seems very inefficient.
Any other ideas? Is there perhaps some type of dynamic query structure within MongoDB like in MySQL (I couldn't find any)?
Thanks
Something like this?
const prefix = db.meta_data.findOne({field: condition}).prefix ;
db.createView('view_name', prefix + 'results', [<your aggregation pipeline>]);
or this?
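const pipeline = [];
db.meta_data.find({ field: condition }).forEach(x => {
  // use the prefix stored on each matching meta_data document
  pipeline.push({ $unionWith: { coll: x.prefix + 'results' } });
});
// pipeline is already an array, so pass it directly
db.collection.aggregate(pipeline);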
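Another option is to pick the collection in application code before querying, since a view is tied to one source collection when it is created. A minimal sketch, assuming the "Aresults", "Bresults", ... naming from the question; `collectionFor` and `unionStages` are hypothetical helpers and `field` is a placeholder field name.

```javascript
// Map a queried value to its collection name, e.g. "apple" -> "Aresults".
function collectionFor(value) {
  if (typeof value !== 'string' || !/^[A-Za-z]/.test(value)) {
    throw new Error('value must start with a letter A-Z');
  }
  return value[0].toUpperCase() + 'results';
}

// Build one $unionWith stage per extra collection, for the case where
// several prefixes have to be searched in a single aggregation.
function unionStages(collectionNames) {
  return collectionNames.map(coll => ({ $unionWith: { coll: coll } }));
}

console.log(collectionFor('apple'));
console.log(JSON.stringify(unionStages(['Bresults', 'Cresults'])));

// Shell usage (not executed here):
//   db.getCollection(collectionFor(value)).find({ field: value })
```

Routing by name touches only the one collection that can contain the value, which avoids the cost of a union over all of them.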

Best way to count documents in mongoDB

We have a collection with a large number of documents, say around 100k. We now want to count the documents that have the key x set.
If I try it with Collection.countDocuments({ x: { $exists: true }}) I get the result, but it instantly triggers a warning in the console: Query Targeting: Scanned Objects / Returned has gone above 1000.
So, is there a better way to count the documents? There is an index on the field; is it possible to get the length of the index?
Thanks
There's no real way of viewing the index trees in Mongo. What other people have linked just returns the size of the tree, and I'm not sure how useful that information is in this context.
Now to your question: is this the best way to count?
The answer is yes ... -ish.
countDocuments is a wrapper function; it simply runs the following pipeline:
db.collection.aggregate([
  { $match: <query> },
  { $group: { _id: null, n: { $sum: 1 } } }
])
This pipeline is the most efficient way to go, and the difference between running the aggregation yourself and using the wrapper function is only about 100-200 milliseconds, depending on your machine spec.
Meaning that if you're looking for "way" better performance, you're not going to find it.
With that said, the warning is not very meaningful here. It fires when the ratio of scanned to returned documents goes above 1000, and a count scans every matching entry but returns a single result. Its real purpose is to alert you when you're trying to query 1-20 documents without a proper index.
You can use the indexSizes field returned by the stats() method.
The stats() method "Returns statistics about the collection".
See example here :
https://docs.mongodb.com/manual/reference/method/db.collection.stats/#basic-stats-lookup
{
...,
"indexSizes" : {
"_id_" : 237568,
"cuisine_1" : 143360,
"borough_1_cuisine_1" : 151552,
"borough_1_address.zipcode_1" : 151552
},
...
}
The indexSizes key returns the size in terms of storage space used, not the number of entries.
Check with explain() whether the index is being used, and add the output to the question as well.
You can use the hint option to compare performance after specifying an index.
Precalculating the count with the $inc operator might be a good option if your use case allows it.
You could also try cursor.count to see if it is faster. countDocuments should be faster, but there is no harm in checking.
https://docs.mongodb.com/manual/reference/method/cursor.count/
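The precalculated counter could look roughly like the sketch below: each write works out whether it changes the number of documents that have x, and produces the matching $inc update. The counter document shape ({ _id: "docsWithX", n: ... }), the "counters" collection and the helper names are all assumptions made for illustration.

```javascript
// True when the document carries the key x (even if its value is null),
// which mirrors what { x: { $exists: true } } matches.
function hasX(doc) {
  return doc != null && Object.prototype.hasOwnProperty.call(doc, 'x');
}

// Compare the document before and after a write (null = did not exist)
// and return the $inc update for the counter, or null if nothing changes.
function counterUpdate(before, after) {
  const delta = (hasX(after) ? 1 : 0) - (hasX(before) ? 1 : 0);
  return delta === 0 ? null : { $inc: { n: delta } };
}

console.log(JSON.stringify(counterUpdate(null, { x: 1 })));      // insert with x
console.log(JSON.stringify(counterUpdate({ x: 1 }, { y: 2 })));  // x removed
console.log(JSON.stringify(counterUpdate({ x: 1 }, { x: 5 })));  // x changed only

// Shell usage (not executed here), with a hypothetical counters collection:
//   db.counters.updateOne({ _id: 'docsWithX' }, update, { upsert: true })
```

Reading the count is then a single lookup by _id, at the price of keeping the counter in step with every write path.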

mongo find all with field that is object having a specified key

A mongo db has documents that look like:
{
"_id": ObjectId("55cb43e8c78b04f43f2eb503"),
<some fields>
"topics": {
"test/23/result": 149823788,
"test/27/result": 147862733,
"input/misc/test": 14672882
}
}
I need to find all documents that have a topics field containing a particular key, i.e. find all documents where topics has the key "test/27/result".
I've tried a number of things, but none work yet. Neither attempt below works;
they return no records even though some should match:
db.collName.find({"topics.test/27/result": {$exists:true}});
db.collName.find({"topics.test\/27\/result": {$exists:true}});
How can I make the query work?
The slash characters are inserted by another process. They are mqtt topic names.
I found the solution to my problem:
I was building the query wrong in my code. In the example below, evtData.source contains the key name to search for, i.e. 'test/27/result'
The query methodology that works for me is:
var query = {};
query['topics.' + evtData.source] = {$exists: true};
db.collName.find(query)
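The answer does not say what the original mistake was. One common way to get this wrong, shown here purely as an assumption, is to put the variable on the left-hand side of an object literal, where it is read as a literal property name. ES2015 computed property names avoid the two-step assignment; the variable names below are illustrative.

```javascript
const key = 'test/27/result';

// Mistake: in an object literal the left-hand side is a literal name,
// so this filter looks for a field that is actually called "path".
const path = 'topics.' + key;
const wrong = { path: { $exists: true } };

// Computed property name: the expression in brackets is evaluated,
// giving the same object as the two-step bracket assignment.
const right = { [`topics.${key}`]: { $exists: true } };

console.log(Object.keys(wrong));
console.log(Object.keys(right));
console.log(path === Object.keys(right)[0]);
```

Slashes need no escaping in the key; only dots are special, because MongoDB reads them as path separators.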

How to exclude from search results documents with fields which are not present in query?

I have two documents:
{ p1:"a", p2:"b" }
{ p1:"a", p2:"b", p3:"c" }
What should I do with the query { p1:"a", p2:"b" } to find only the first document? I want to find only documents that have exactly the fields I specified. If a document has more fields than the query, it should not appear in the search results.
I must admit I know of no normal querying method that solves this problem. The only way I know of is to use MongoDB's object comparison. To do this you would change your structure to something along the lines of:
{
  ps: ["a", "b"]
}
or:
{
  ps: { p1: "a", p2: "b" }
}
And then you would query like:
db.col.find({ ps: ["a", "b"] })
or:
db.col.find({ ps: { p1: "a", p2: "b" } })
There is one immediate problem with this, though: it is order dependent, which means that if a and b are the other way around in another document, it won't match. So you will need to keep the order consistent when saving if you do this.
Hope it helps,
It's not easy to do with that structure, you need some other indexable cue as to what to take and to leave.
If you know what fields you DON'T want, you can use $exists, http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24exists but it's not very efficient.
db.col.find({p3:{$exists:false}})
You could also just do this (p3: null matches documents where p3 is missing or null):
db.col.find({ p1:"a", p2:"b", p3:null })
Could you just throw away the fields you don't want in your own code? Or could you restructure into nested groups to make it easier to filter?
{ basicData:{p1:"a", p2:"b"}, extraData:{p3:"c",p4:"d"} }
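If throwing away the extra documents in your own code is acceptable, the check is short. A sketch, assuming the documents have already been fetched with the normal { p1:"a", p2:"b" } query; `matchesExactly` is a hypothetical helper and only compares primitive values.

```javascript
// True when doc has exactly the fields in `wanted`, with equal values.
// _id is ignored on the document side; `wanted` should not contain _id.
function matchesExactly(doc, wanted) {
  const docKeys = Object.keys(doc).filter(k => k !== '_id');
  const wantedKeys = Object.keys(wanted);
  return docKeys.length === wantedKeys.length &&
    wantedKeys.every(k =>
      Object.prototype.hasOwnProperty.call(doc, k) && doc[k] === wanted[k]);
}

// The two documents from the question.
const docs = [
  { p1: 'a', p2: 'b' },
  { p1: 'a', p2: 'b', p3: 'c' }
];
const query = { p1: 'a', p2: 'b' };

// Keep only documents with no fields beyond the ones queried.
const exact = docs.filter(d => matchesExactly(d, query));
console.log(JSON.stringify(exact));
```

This still transfers the extra documents over the wire, so it suits small result sets better than large ones.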

How to add a field to a document which contains the result of the comparison of two other fields

I would like to speed up a query on my MongoDB that uses $where to compare two fields in the document, which seems to be really slow.
My query looks like this:
db.mycollection.find({ $where : "this.lastCheckDate < this.modificationDate" })
What I would like to do is add a field to my document, e.g. isCheckDateLowerThenModDate, on which I could run a probably much faster query:
db.mycollection.find({"isCheckDateLowerThenModDate":true})
I am quite new to MongoDB and have no idea how to do this. I would appreciate it if someone could give me some hints or examples on:
How to initialize such a field on an existing collection
How to maintain this field. Which means how to update this field when lastCheckDate or modificationDate changes.
Thanks in advance for your help!
You are thinking along the right lines!
1. How to initialize such a field on an existing collection.
The simplest way is to load each document (from your language), calculate this field, update and save.
Or you could perform the update via the mongo shell:
db.mycollection.find().forEach(function(doc) {
if(doc.lastCheckDate < doc.modificationDate)
{
doc.isCheckDateLowerThenModDate = true;
}
else
{
doc.isCheckDateLowerThenModDate = false;
}
db.mycollection.save(doc);
});
2. How to maintain this field, i.e. how to update it when lastCheckDate or modificationDate changes.
You have to do it yourself from your client code. Write a wrapper for your update and save operations and recalculate the value there each time. To be sure that the update works, write unit tests.
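Such a wrapper could be as small as the sketch below. It is plain JavaScript with no driver calls; `withDerivedFlag` and `applyChanges` are hypothetical names, and the actual save or update call is left to your client code.

```javascript
// Return a copy of the document with the derived flag recomputed.
function withDerivedFlag(doc) {
  return Object.assign({}, doc, {
    isCheckDateLowerThenModDate: doc.lastCheckDate < doc.modificationDate
  });
}

// Apply field changes, then refresh the flag before the document is saved.
function applyChanges(doc, changes) {
  return withDerivedFlag(Object.assign({}, doc, changes));
}

// A document that was checked after its last modification.
const stored = withDerivedFlag({
  _id: 1,
  lastCheckDate: new Date('2012-01-10T00:00:00Z'),
  modificationDate: new Date('2012-01-05T00:00:00Z')
});

// Modifying the document later makes the check date the older one.
const modified = applyChanges(stored, {
  modificationDate: new Date('2012-02-01T00:00:00Z')
});

console.log(stored.isCheckDateLowerThenModDate);
console.log(modified.isCheckDateLowerThenModDate);
```

Because the helpers are pure functions, the unit tests mentioned above need no database.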
The $where clause is slow because it is evaluating each document using the JavaScript interpreter.
There are a few alternatives:
1) Assuming your use case is "look for records that need updating", take advantage of a sparse index:
add a boolean field like needsChecking and $set this whenever the modificationDate is updated
in your "check" procedure, find the documents that have this field set (should be fast due to the sparse index)
db.mycollection.find({'needsChecking':true});
after you've done whatever check is needed, $unset the needsChecking field.
2) A new (and faster) feature in MongoDB 2.2 is the Aggregation Framework.
Here is an example of adding a "isUpdated" field based on the date comparison, and then filtering the matching documents:
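db.mycollection.aggregate([
  // add the comparison result as a computed field
  { $project: {
    _id: 1,
    name: 1,
    type: 1,
    modificationDate: 1,
    lastCheckDate: 1,
    isUpdated: { $gt: ["$modificationDate", "$lastCheckDate"] }
  }},
  // keep only the documents modified after their last check
  { $match: {
    isUpdated: true
  }}
])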
Some current caveats of using the Aggregation Framework are:
you have to specify every field you want to include, apart from _id
the result is limited to the current maximum BSON document size (16 MB in MongoDB 2.2)
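On newer servers the same comparison can be written without $where or a hand-listed $project. The sketch below only builds the filter and stage objects; it assumes MongoDB 3.6 or later for $expr and 3.4 or later for $addFields, neither of which exists in 2.2. As far as I know the comparison is still evaluated per document rather than from an index, so a stored flag may remain the faster option.

```javascript
// Filter comparing two fields of the same document ($expr, MongoDB 3.6+).
const needsCheck = {
  $expr: { $lt: ['$lastCheckDate', '$modificationDate'] }
};

// Aggregation stage adding the flag; $addFields keeps all existing fields,
// so they do not have to be listed one by one as with $project.
const addFlag = {
  $addFields: { isUpdated: { $gt: ['$modificationDate', '$lastCheckDate'] } }
};

const pipeline = [addFlag, { $match: { isUpdated: true } }];

console.log(JSON.stringify(needsCheck));
console.log(JSON.stringify(pipeline));

// Shell usage (not executed here):
//   db.mycollection.find(needsCheck)
//   db.mycollection.aggregate(pipeline)
```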