I am working on an exercise with a MongoDB database where I need to find movies in a movies collection in which the same person appears in cast, directors, and writers.
I tried this code:
//first match the writers, directors and cast arrays which are not empty.
db.movies.aggregate( [ {$match : {$and : [ {writers: {$elemMatch: {$exists: true}}} , {directors: {$elemMatch: {$exists: true}}} , {cast: {$elemMatch : {$exists: true}}} ]}}
//then make an intersection between the matched results
, {$project: {"labers of love" :{$setIntersection: ["$writers", "$cast", "$directors"]}, _id: 0 } } ] ).pretty()
However, when I run this command I still get empty arrays, which should not be the case. I am not sure if this pipeline is working properly. When I run only the $match stage I don't seem to get any empty arrays.
This is a question about how to create efficient indexes when a query contains "or". Without "or", I know how to create an efficient index.
This is my query.
db.collection.find({
'msg.sendTime':{$gt:1},
'msg.msgType':{$in:["chat","g_card"]},
$or:[{'msg.recvId':{$in:['xm80049258']}},{'msg.userId':'xm80049258'}],
$orderby:{'msg.sendTime':-1}})
After reading some articles, I created two single-field indexes on msg.recvId and msg.userId, and this makes sense.
I want to know: when MongoDB executes "or", does it split all the documents first, and then apply msg.sendTime and msg.msgType?
How should I create efficient indexes in this case? Should I create the indexes (msg.sendTime:1,msg.msgType:1,msg.recvId:1) and
(msg.sendTime:1,msg.msgType:1,msg.userId:1)?
Thanks very much.
Paraphrasing from $or Clauses and Indexes:
When evaluating the clauses in the $or expression, MongoDB either performs a collection scan or, if all the clauses are supported by indexes, MongoDB performs index scans. That is, for MongoDB to use indexes to evaluate an $or expression, all the clauses in the $or expression must be supported by indexes.
Also from Indexing Strategies:
Generally, MongoDB only uses one index to fulfill most queries. However, each clause of an $or query may use a different index
What those paragraphs mean for $or queries is:
In a find() query, only one index can be used. Therefore it's best to create an index that aligns with the fields in your query. Otherwise, MongoDB will do a collection scan.
The exception is an $or query, where MongoDB can use one index per $or term.
In combination, if you have $or in your query, it's best to put the $or term at the top level and create an index for each term separately.
So to answer your question:
I want to know: when MongoDB executes "or", does it split all the documents first, and then apply msg.sendTime and msg.msgType?
If your query has a top-level $or clause, MongoDB can use one index per clause. Otherwise, it will do a collection scan, or a semi-collection scan. For example, if you have an index:
db.collection.createIndex({a: 1, b: 1})
There are two general types of query you can create:
1. $or NOT on the top level of the query
This query can use the index, but will not be performant:
db.collection.find({a: 1, $or: [{b: 1}, {b: 2}]})
since the explain() output of the query is:
> db.collection.explain().find({a: 1, $or: [{b: 1}, {b: 2}]})
{
"queryPlanner": {
...
"indexBounds": {
"a": [
"[1.0, 1.0]"
],
"b": [
"[MinKey, MaxKey]"
]
...
Note that the query planner cannot use a tight bound for the b field, so it ends up doing a semi-collection scan (it searches b from MinKey to MaxKey, i.e. everything). The query planner result above is basically saying: "Find documents where a = 1, and scan all of them for b having a value of 1 or 2".
2. $or on the top level of the query
However, pulling the $or clause to the top-level:
db.collection.find({$or: [{a: 1, b: 1}, {a: 1, b: 2}]})
will result in this query plan:
> db.test.explain().find({$or: [{a: 1, b: 1}, {a: 1, b: 2}]})
{
"queryPlanner": {
...
"winningPlan": {
"stage": "SUBPLAN",
...
"inputStages": [
{
"stage": "IXSCAN",
...
"indexBounds": {
"a": [
"[1.0, 1.0]"
],
"b": [
"[1.0, 1.0]"
]
}
},
{
"stage": "IXSCAN",
...
"indexBounds": {
"a": [
"[1.0, 1.0]"
],
"b": [
"[2.0, 2.0]"
]
Note that each term of the $or is treated as a separate query, each with a tight boundary. As such, the query plan above is saying: "Find documents where a = 1, b = 1 or a = 1, b = 2". As you can imagine, this query will be much more performant compared to the earlier query.
For your second question:
How should I create efficient indexes in this case? Should I create the indexes (msg.sendTime:1,msg.msgType:1,msg.recvId:1) and (msg.sendTime:1,msg.msgType:1,msg.userId:1)?
As explained above, you need to combine the proper query with the proper index to achieve the best result. MongoDB can use the two indexes you proposed, and they will work best if you rearrange your query so that the $or is at the top level.
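As a rough sketch of that rearrangement, the plain JavaScript helper below copies the shared conditions into every $or branch so that $or becomes the only top-level operator. The name hoistOr is mine, not a MongoDB function, and the $orderby entry from the question is left out on the assumption that the sort would be applied with .sort({'msg.sendTime': -1}) on the cursor instead.

```javascript
// Distribute the non-$or conditions of a filter into every $or branch,
// so that $or ends up as the only top-level operator.
// hoistOr is a hypothetical helper name, not part of MongoDB.
function hoistOr(filter) {
  const { $or: branches, ...shared } = filter;
  if (!Array.isArray(branches)) {
    return filter; // no $or present, nothing to rearrange
  }
  const merged = branches.map((branch) => {
    // If a branch reuses a shared field, keep both conditions with $and
    // instead of letting one silently overwrite the other.
    const collides = Object.keys(branch).some((key) => key in shared);
    return collides ? { $and: [shared, branch] } : { ...shared, ...branch };
  });
  return { $or: merged };
}

// The filter from the question, without the $orderby entry.
const original = {
  "msg.sendTime": { $gt: 1 },
  "msg.msgType": { $in: ["chat", "g_card"] },
  $or: [
    { "msg.recvId": { $in: ["xm80049258"] } },
    { "msg.userId": "xm80049258" },
  ],
};

const rewritten = hoistOr(original);
console.log(JSON.stringify(rewritten, null, 2));
```

Each branch of the rewritten filter then names msg.sendTime, msg.msgType and one of msg.recvId / msg.userId, which is the shape the two proposed compound indexes cover. Whether the planner actually picks them is something to confirm with explain().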
I encourage you to understand the explain() output of MongoDB, since it's the best tool to find out if your queries are using the proper indexes or not.
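To make reading the plan less error-prone, a small helper can list the stages of a winning plan so that a COLLSCAN stands out. This is only a sketch: it assumes the plan shape shown above (stage, inputStage, inputStages), collectStages is a name I made up, and the sample object is hand-written and simplified rather than real server output (the FETCH and OR stages in it are my assumption).

```javascript
// Collect every stage name in a winning plan by following
// inputStage (one child) and inputStages (several children).
// collectStages is a hypothetical helper, not a MongoDB API.
function collectStages(plan, found = []) {
  if (!plan || typeof plan !== "object") {
    return found;
  }
  if (plan.stage) {
    found.push(plan.stage);
  }
  if (plan.inputStage) {
    collectStages(plan.inputStage, found);
  }
  for (const child of plan.inputStages || []) {
    collectStages(child, found);
  }
  return found;
}

// Hand-written, simplified sample shaped like the top-level $or plan above.
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "SUBPLAN",
      inputStage: {
        stage: "FETCH",
        inputStage: {
          stage: "OR",
          inputStages: [
            { stage: "IXSCAN", indexBounds: { a: ["[1.0, 1.0]"], b: ["[1.0, 1.0]"] } },
            { stage: "IXSCAN", indexBounds: { a: ["[1.0, 1.0]"], b: ["[2.0, 2.0]"] } },
          ],
        },
      },
    },
  },
};

const stages = collectStages(sampleExplain.queryPlanner.winningPlan);
console.log(stages.join(" > "));
console.log(stages.includes("COLLSCAN") ? "uses a collection scan" : "no collection scan");
```

In the shell, the same function could be fed the queryPlanner.winningPlan of the document returned by cursor.explain(), provided your server version reports the plan in this shape.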
Relevant resources that you may find useful are:
Explain Results
Create Indexes to Support Your Queries
Indexing Strategies
In my posts collection I have documents like this:
[{
_id : ObjectId("post-object-id"),
title : 'Post #1',
category_id : ObjectId("category-object-id")
}]
I need to run some queries where I select a range of posts based on their category_id (there can be multiple ids) but exclude some of them.
I've tried with the query (in shell):
db.posts.find({$and: [
{_id: { $nin : ['ObjectId("post-object-id")']}},
{category_id : { $in : ['ObjectId("category-object-id")']}}
]})
It returns 0 when I call count().
However, if I change the category_id condition, remove the $in and include just one ID, it works, like this:
db.posts.find({$and: [
{_id: { $nin : ['ObjectId("58a1af81613119002d42ef06")']}},
{category_id : ObjectId("58761634bfb31efd5ce6e88d")}
]})
but this solution only lets me find by one category.
How would I go about combining $in and $nin with ObjectIds in the same manner as above?
This will work; just remove the single quotes around ObjectId:
db.posts.find({$and: [
{_id: { $nin : [ObjectId("post-object-id")]}},
{category_id : { $in : [ObjectId("category-object-id")]}}
]})
You should not put single quotes around ObjectId; that turns the whole expression into a string, which never equals a stored ObjectId.
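If the ids reach your code as plain hex strings, one way to avoid the quoting mistake is to validate and convert them in one place. The sketch below is an assumption-laden illustration: buildPostsFilter and asExtendedJson are made-up names, and the converter is passed in so the snippet runs without a driver (in the shell you would pass something like (hex) => ObjectId(hex)). The $and is dropped because conditions on two different fields are combined with AND anyway.

```javascript
// An ObjectId string is exactly 24 hexadecimal characters.
const HEX_24 = /^[0-9a-fA-F]{24}$/;

// Build the posts filter from plain hex strings.
// toId turns one hex string into an id value; it is injected so that
// this sketch does not depend on any particular driver.
function buildPostsFilter(excludedPostIds, categoryIds, toId) {
  const convert = (hex) => {
    if (!HEX_24.test(hex)) {
      // Rejects mistakes such as 'ObjectId("...")' passed as a string.
      throw new TypeError(`not a 24-character hex id: ${hex}`);
    }
    return toId(hex);
  };
  return {
    _id: { $nin: excludedPostIds.map(convert) },
    category_id: { $in: categoryIds.map(convert) },
  };
}

// Stand-in converter that produces the Extended JSON form of an ObjectId.
const asExtendedJson = (hex) => ({ $oid: hex });

const filter = buildPostsFilter(
  ["58a1af81613119002d42ef06"],
  ["58761634bfb31efd5ce6e88d"],
  asExtendedJson
);
console.log(JSON.stringify(filter));
```

Passing the quoted form 'ObjectId("58a1af81613119002d42ef06")' to this helper throws instead of silently matching nothing.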
I am running a query from my Meteor server, but for some reason only the first argument seems to take effect.
Users.find({"services.facebook" : {$exists : true}}, {"_id": {$nin: doNotCount}}).fetch()
only returns facebook users (disregarding {"_id": {$nin: doNotCount}})
Users.find({"_id": {$nin: doNotCount}}, {"services.facebook" : {$exists : true}}).fetch()
only returns users not in a given array (disregarding {"services.facebook" : {$exists : true}})
From the documentation, it looks like this is possible:
https://docs.mongodb.org/manual/reference/operator/projection/positional/
but I'm not having any luck.
The query is the first parameter only; the second parameter deals with sorting, limits, restricting the fields to be returned, etc. That is why whichever condition you pass second is ignored as a filter.
Change to:
Users.find({ "services.facebook" : {$exists : true}, "_id": {$nin: doNotCount }}).fetch()
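To make the split between the two arguments concrete, here is a small sketch that builds them as plain objects. The ids, the sort field and the limit are made-up values for illustration; sort, limit and fields are the option names I would expect Meteor's find to accept, so check them against your Meteor version.

```javascript
// Hypothetical list of user ids to leave out.
const doNotCount = ["userA", "userB"];

// First argument: the selector. Every filtering condition belongs here.
const selector = {
  "services.facebook": { $exists: true },
  _id: { $nin: doNotCount },
};

// Second argument: options that shape the result, not which documents match.
const options = {
  sort: { createdAt: -1 },            // assumed sort field
  limit: 20,                          // assumed page size
  fields: { "services.facebook": 1 }, // return only this field (plus _id)
};

// In Meteor server code these would be passed as:
//   Users.find(selector, options).fetch()
console.log(JSON.stringify({ selector, options }, null, 2));
```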
I have some documents in a collection in MongoDB:
{_id : ObjectId('533af69b923967ac1801e113'), fKey : '533aeb09ebef89282c6cc478', ... }
{_id : ObjectId('5343bd1e2305566008434afc'), fKey : ObjectId('5343bd1e2305566008434afc'), ...}
As you can see, my fKey field can be set to a string or an ObjectId.
I would like to get all documents which match '533aeb09ebef89282c6cc478' or ObjectId('5343bd1e2305566008434afc').
But if I run:
db.mycollection.find({fKey : '533aeb09ebef89282c6cc478'})
I only get the first document of the collection.
Is there a way to configure MongoDB to return all documents that match the request without checking the type?
Thanks for your help.
Pierre
There are two options for you here.
You could use MongoDB's $or operator:
db.mycollection.find({
$or: [
{ fKey: '533aeb09ebef89282c6cc478' },
{ fKey: ObjectId( '533aeb09ebef89282c6cc478' ) }
]
})
The $or operator performs a logical OR operation on an array of two or more <expressions> and selects the documents that satisfy at least one of the <expressions>.
You could also use the $in operator:
db.mycollection.find({
fKey: {
"$in": [ '533aeb09ebef89282c6cc478', ObjectId( '533aeb09ebef89282c6cc478' ) ]
}
})
The $in operator selects the documents where the value of a field equals any value in the specified array.
It sounds to me like these inconsistencies are not meant to be there. I recommend going through your code and data pipelines and figuring out who or what is inserting the fKey value with an inconsistent datatype.