I am trying to search a list of values on a matching field in documents which is array of documents. Using $in makes it OR between the values I supply. Using $all seems to be more logical.
For eg:
Collection: Phrases
sample doc:
{
"locales": [
{
"name": "BPT",
"internal_desc": "Entre 2 e 3 horas"
},
{
"name": "JPN",
"internal_desc": "2 ~ 3 時間"
}
]
}
Query:
db.phrases.find({"locales.name":{"$all":["BPT", "JPN"]}})
But some posts suggesting $all is bad in terms of performance. Is there any other way to achieve this?
Using $and instead of $all will result in equivalent performance. The bottom line is that given what you are trying to accomplish using $all is your best bet (as far as my understanding goes). However, $all can be optimized by making the first element in the expression more selective. For example if you know that "BPT" shows up in 2% of documents and "JPN" shows up in 20% of documents then it makes sense to list "BPT" as the first element in the $all expression. This way mongo only needs to filter through fewer documents on each consecutive element in your $all expression. Im sure you've seen the documentation but here is a link nonetheless: $all - mongodb
You can use the $and syntax, as shown in the query below;
db.phrases.find({$and : [
{"locales.name" : "BPT"},
{"locales.name" : "JPN"}
]
});
You can get information about your query, to see what the db is doing when executing the query by using the explain command, as displayed below;
db.phrases.explain().find({$and : [
{"locales.name" : "BPT"},
{"locales.name" : "JPN"}
]
});
Although, the explain command is more relevant to dbs where indexes are used, since it sort of gives you information about, which index was utilised by the db on the search.
Have a quick look into MongoDB indexes and explain() for further information.
I hope this helps.
Regards,
Nick.
Related
I have the following query to be executed on my MongoDB collection order_error. It has over 60 million documents. The main concern is I am having a $in operator within my query. I tried several possibilities of indices but none of them gave a high-performance improvement. The query is as follows
db.getCollection("order_error").find({
"$and":[
{
"type":"order"
},
{
"Origin.SN":{
"$in":[
"4095",
"4100",
"4509",
"4599",
"4510"
]
}
}
]
}).sort({"timestamp.milliseconds" : 1}).skip(1).limit(100).explain("executionStats")
One issue that needs to be noted is I am allowing sort on timestamp.milliseconds in both directions(ASC + DESC). I have limited the entries within the $in. Usually, it is more. SO what kind of index gives the performance improvement. I tried creating the following indices already
type_1_Origin.SN_1_timestamp.milliseconds_-1
type_1_timestamp.milliseconds_-1_Origin.SN
Is there any better way for index creation?
I need a mongodb query something like
db.getCollection("xyz").find({"_id" : {$regex : {$in : [xxxx/*]}}})
My Use case is -- I have a list of Strings such as
[xyz/12/poi, abc/98/mnb, ytn/65/tdx, ...]
The ids that are there in the collection(test) are something like
xyz/12/poi/2019061304.
I will get the values like xyz/12/poi from the input list, the other part of the id being yyyymmddhh format.
So, I need to go to the collection and find all the documents matching the input list with the ID of the documents in the test collection.
I can retrieve the documents individually but that does not seem to be a feasible option as the size of the input list is more than 10000.
Can you guys suggest a more feasible solution. Thanks in advance.
I tried using $in with $regex. But it seems mongodb does not support that. I have also tried pattern matching but even that is not feasible for me. Can you please suggest an alternative to using $in with $regex in mongodb.
Expected result could be an aggragate query/a normal query so that we hit the database only once and get the desired output rather than hitting the db for 10000 odd times.
Say you're querying documents based on 2 data points. One is a simple bool parameter, and the other is a complicated $geoWithin calculation.
db.collection.find( {"geoField": { "$geoWithin" : ...}, "boolField" : true} )
Will mongo reorder these parameters, so that it checks the boolField 1st, before running the complicated check?
MongoDB uses indexes like any other DBs. So the important thing for mongoDB is if any query fields has an index or not, not the order of query fields. At least there is no information in their documentation that mongoDB try to checks primitive query fields first. So for your example if boolField has an index mongoDB first check this field and eliminate documents whose boolField is false. But If geoField has an index then mongoDB first execute query on this field.
So what happens if none of them have index or both of them have? It should be the given order of fields in query because there is no suggestion or info beside of indexes in query optimization page of mongoDB. Additionally you can always test your queries performances with just adding .explain("executionStats").
So check the performance of db.collection.find( {"geoField": { "$geoWithin" : ...}, "boolField" : true} ) and db.collection.find( { "boolField" : true, "geoField": { "$geoWithin" : ...} } ). And let us know :)
To add to above response, if you want mongo to use specific index you can use cursor.hint . This https://docs.mongodb.com/manual/core/query-plans/ explains how default index selection is done.
How to limit the no. of rows retrieved in mongodb input transformation used in kettle.
I tried in mongodb input query with below queries but none of them are working :
{"$query" : {"$limit" : 10}}
or {"$limit" : 10}
Please let me know where i am going wrong.
Thanks,
Deepthi
There are several query modification operators you can use. Their names are not totally intuitive and don't match the names of functions you would use in the Mongo shell, but they do the same sorts of things.
In your case, you need the $maxScan operator. You could write your query as:
{"$query": {...}, "$maxScan": 10}
We have to use aggregation method providing by MongoDB, and here is the link: docs.mongodb.org/manual/applications/aggregation.
In my case, I use this query in MongoDB Input of Kettle, and we also have to choose Query is aggregation pipline.
{$match: {activity_type: " view "}},
{$sort: {activity_target: -1 } },
{$limit: 10}
The following screen shots can help you understand the operations more clear.
I have a collection of document, where each document looks like this:
{'name' : 'John', 'locations' :
[
{'place' : 'Paris', 'been' : true}
{'place' : 'Moscow', 'been' : false}
{'place' : 'Berlin', 'been' : true}
]
}
Where the locations array could have any length.
I want to match documents where the been field is true for all elements in the locations array. Looking at the documentation it looks like I should use $and somehow but I'm not sure if it works with sub-fields.
There are several options:
use $ne: db.destinations.find({"locations.been":{$ne:false}})
change your business logic to precompute that value before saving the document. Otherwise, this search must look through all records and then all places. This value could be indexed.
use the $where operator, but, understand the performance implications. It may require a full table scan. In this case, it would.
write a map-reduce function with the filter logic and only emit those that are valid. You'd need to incrementally update it per the docs.
write a query using the aggregation framework. There are a lot of good examples here. Although, like other solutions, this could end up looping through the entire collection.
I think it's impossible to do with standart MongoDB operators like $elemMatch or $all. The only possible way is to write custom JS query:
db.test.find("return this.locations.every(function(loc){return loc.been});")