MongoDB conditional query on nested document array

Hi, I'm trying to write a conditional query on a nested document array.
I've been reading the documentation for days and couldn't figure out how to make this work.
The DB looks like this:
[
{
"id":1,
"team":"team1",
"players":[
{
"name":"Mario",
"substitutes":[
"Luigi",
"Yoshi"
]
},
{
"name":"Wario",
"substitutes":[
]
}
]
},
{
"id":2,
"team":"team2",
"players":[
{
"name":"Bowser",
"substitutes":[
"Toad",
"Mario"
]
},
{
"name":"Wario",
"substitutes":[
]
}
]
}
]
It's hard for me to put into English, but what I'm trying to do is
find teams that include all of the queried players.
Some objects in the players array have substitutes.
For each object in the players array, if a queried player is not the main player ("players.name"), then I want to check whether one of the substitutes ("players.substitutes") is.
Team.find({players:{$in:[ 'Mario', 'Wario' ]}}) (Mongoose query)
This gives me an array with 'team1',
but what I want is both teams, because 'Mario' is one of the substitutes for 'Bowser' (team2).
I couldn't come up with a working query, but I've been trying to avoid $where, since the official MongoDB docs say:
AGGREGATION ALTERNATIVES PREFERRED
Starting in MongoDB 3.6, the $expr operator allows the use of
aggregation expressions within the query language. And, starting in
MongoDB 4.4, the $function and $accumulator allows users to define
custom aggregation expressions in JavaScript if the provided pipeline
operators cannot fulfill your application’s needs.
Given the available aggregation operators:
The use of $expr with aggregation operators that do not use JavaScript
(i.e. non-$function and non-$accumulator operators) is faster than
$where because it does not execute JavaScript and should be preferred
if possible. However, if you must create custom expressions, $function
is preferred over $where.
But if it can easily be written with the $where operator, that's totally fine.
Any suggestions or ideas that get me further would be highly appreciated.

First, your query is incorrect: players is an array of objects, so comparing it against plain strings with $in will not match. It is also not obvious what exactly your filter criteria are, so here are a few suggestions:
If you want all documents whose players.name is one of the names in your criteria (this returns both documents):
db.Team.find({"players.name":{$in:[ 'Mario', 'Wario' ]}}).pretty()
If you want all documents that have any of the provided names in the substitutes array (this returns only team2, because team1 has no substitute named Mario or Wario):
db.Team.find({"players.substitutes":{$in:[ 'Mario', 'Wario' ]}}).pretty()
If the names can appear in either name or substitutes:
db.Team.find({ $or: [{"players.substitutes":{$in:[ 'Mario', 'Wario' ]}}, {"players.name":{$in:[ 'Mario', 'Wario' ]}}] }).pretty()
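The $or query above matches a team as soon as any one of the names is found. If the goal is teams that contain every queried player, either as a main player or as a substitute, one possible sketch is to build an $and with one $or clause per name. The helper name buildAllPlayersFilter is hypothetical, and this assumes a name may match anywhere in the team's players array.

```javascript
// Hypothetical helper: one $or clause per queried name, all combined with $and.
// Each name must appear either as players.name or inside players.substitutes.
function buildAllPlayersFilter(names) {
  return {
    $and: names.map((name) => ({
      $or: [{ "players.name": name }, { "players.substitutes": name }],
    })),
  };
}

const filter = buildAllPlayersFilter(["Mario", "Wario"]);
// Pass the filter to Team.find(filter) in Mongoose or db.Team.find(filter) in the shell.
console.log(JSON.stringify(filter, null, 2));
```

With the sample data from the question, team1 satisfies both clauses through players.name, and team2 satisfies them through Bowser's substitutes (Mario) and players.name (Wario).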

Related

MongoDB: How To Save Returned Results To Another Collection?

Consider the following:
I have a MongoDB collection named C_a. It contains a very large number of documents (e.g., more than 50,000,000).
For the sake of simplicity let's assume that each document has the following schema:
{
"username" : "Aventinus",
"text": "I love StackOverflow!",
"tags": [
"programming",
"mongodb"
]
}
Using a text index, I can return all documents that contain the keyword StackOverflow like this:
db.C_a.find({$text:{$search:"StackOverflow"}})
My question is the following:
Considering that the query above may return hundreds of thousands of documents, what is the easiest/fastest way to directly save the returned results into another collection named C_b?
Note: This post explains how to use aggregate to find exact matches (i.e., specific dates). I'm interested in using Text Index to save all the posts which include a specific keyword.
The referenced answer is correct. The example query from that answer can be updated to use your criteria:
db.C_a.aggregate([
{$match: {$text: {$search:"StackOverflow"}}},
{$out:"C_b"}
]);
From the MongoDB documentation for $text:
If using the $text operator in aggregation, the following restrictions also apply.
The $match stage that includes a $text must be the first stage in the pipeline.
A text operator can only occur once in the stage.
The text operator expression cannot appear in $or or $not expressions.
The text search, by default, does not return the matching documents in order of matching scores. Use the $meta aggregation expression in the $sort stage.
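As a rough sketch of how those restrictions shape the pipeline, the $text match stays first and the optional relevance sort uses $meta before $out. The helper name buildCopyPipeline is hypothetical. Note that $out replaces the target collection if it already exists, and the natural order of the written collection should not be relied on.

```javascript
// Hypothetical helper that assembles the pipeline; the stages themselves are standard.
function buildCopyPipeline(keyword, targetCollection) {
  return [
    // The $text match has to be the first stage of the pipeline.
    { $match: { $text: { $search: keyword } } },
    // Optional: order by relevance using the text score metadata.
    { $sort: { score: { $meta: "textScore" } } },
    // Write the results to the target collection (replaces it if it exists).
    { $out: targetCollection },
  ];
}

const pipeline = buildCopyPipeline("StackOverflow", "C_b");
// In the shell this would be run as db.C_a.aggregate(pipeline).
console.log(JSON.stringify(pipeline));
```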

MongoDB aggregation $lookup to a field that is an indexed array

I am trying a fairly complex aggregate command on two collections involving a $lookup pipeline. This normally works just fine on a simple aggregation as long as an index is set on foreignField.
But my $lookup is more complex as the indexed field is not just a normal Int64 field but actually an array of Int64. When doing a simple find(), it is easy to verify using explain() that the index is being used. But explaining the aggregate pipeline does not explain whether index is being used in the $lookup pipeline. All my timing tests seem to indicate that the index is not being used. MongoDB version is 3.6.2. Db compatibility is set to 3.6.
As I said earlier, I am not using simple foreignField lookup but the 3.6-specific pipeline + $match + $expr...
Could using pipeline be a showstopper for the index? Does anyone have deep experience with the new $lookup pipeline syntax and/or an index on an array field?
Examples
Either of the following works fine and, when explained, shows that the index on followers is being used.
db.col1.find({followers: {$eq : 823778}})
db.col1.find({followers: {$in : [823778]}})
But the following one does not seem to make use of the index on followers [there are more steps in the pipeline, stripped for readability].
db.col2.aggregate([
{$match:{field: "123"}},
{$lookup:{
from: "col1",
let : {follower : "$follower"},
pipeline: [{
$match: {
$expr: {
$or: [
{ $eq : ["$follower", "$$follower"] },
{ $in : ["$$follower", "$followers"]}
]
}
}
}],
as: "followers_all"
}
}])
This is a missing feature that is going to be part of version 3.8.
Currently, $eq matches in the $lookup sub-pipeline are optimised to use indexes.
Refer to the Jira ticket fixed in 3.7.1 (dev version).
This may also be relevant for non-multikey indexes.
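Until the sub-pipeline form is optimised, one possible workaround is the classic localField/foreignField form of $lookup, which performs an equality match and matches when the value appears as an element of the followers array. This is only a sketch: it covers the followers array branch of the original $or and not the "$follower" equality, which would need a second $lookup, and whether it is faster on a given dataset should be checked with timing tests.

```javascript
// Sketch of the classic $lookup form; collection and field names are taken from the question.
const pipeline = [
  { $match: { field: "123" } },
  {
    $lookup: {
      from: "col1",
      localField: "follower",     // scalar value on col2
      foreignField: "followers",  // indexed array on col1, matched element by element
      as: "followers_all",
    },
  },
];

// In the shell this would be run as db.col2.aggregate(pipeline).
console.log(JSON.stringify(pipeline, null, 2));
```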

$elemMatch Projection on a Simple Array

Imagine a collection of movies (stored in a MongoDB collection), with each one looking something like this:
{
_id: 123456,
name: 'Blade Runner',
buyers: [1123, 1237, 1093, 2910]
}
I want to get a list of movies, each one with an indication whether buyer 2910 (for example) bought it.
Any ideas?
I know I can change [1123, 1237, 1093, 2910] to [{id:1123}, {id:1237}, {id:1093}, {id:2910}] to allow the use of $elemMatch in the projection, but would prefer not to touch the structure.
I also know I can perhaps use the $unwind operator (within the aggregation framework), but that seems very wasteful in cases where buyer has thousands of values (basically exploding each document into thousands of copies in memory before matching).
Any other ideas? Am I missing something really simple here?
You can use the $setIsSubset aggregation operator to do this:
var buyer = 2910;
db.movies.aggregate(
{$project: {
name: 1,
buyers: 1,
boughtIt: {$setIsSubset: [[buyer], '$buyers']}
}}
)
That will give you all movie docs with a boughtIt field added that indicates whether buyer is contained in the movie's buyers array.
This operator was added in MongoDB 2.6.
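On MongoDB 3.4 or later, the $in aggregation operator expresses the same membership test a little more directly. A minimal sketch, assuming every document has a buyers array (the operator raises an error if the second argument is not an array); the helper name buildBoughtItPipeline is hypothetical.

```javascript
// Hypothetical helper: adds a boolean boughtIt field using the $in aggregation operator.
function buildBoughtItPipeline(buyer) {
  return [
    {
      $project: {
        name: 1,
        // true when buyer is an element of the document's buyers array
        boughtIt: { $in: [buyer, "$buyers"] },
      },
    },
  ];
}

const pipeline = buildBoughtItPipeline(2910);
// In the shell this would be run as db.movies.aggregate(pipeline).
console.log(JSON.stringify(pipeline));
```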
Not really sure of your intent here, but you don't need to change the structure just to use $elemMatch in a projection. You can just issue a query like this:
db.movies.find({},{ "buyers": { "$elemMatch": { "$eq": 2910 } } })
That filters the returned array elements down to just the indicated "buyer", or nothing where it is not present. Note that the $eq operator used here is not actually documented, but it does exist, so it may not be immediately clear that you can construct a condition this way.
It seems a little wasteful to me though as you are returning "everything" regardless of whether the "buyer" is present or not. So a "query" seems more logical than a projection:
db.movies.find({ "buyers": 2910 })
And optionally keep only the matched element:
db.movies.find({ "buyers": 2910 },{ "buyers.$": 1})
Set operators in the aggregation framework give you more options with $project, which can do more to alter the document. But if you just want to know whether someone "bought" the item, then a "query" seems to be the logical and fastest way to do so.

Can you match sub-fields with $all in Mongo?

I have a collection of documents, where each document looks like this:
{'name' : 'John', 'locations' :
[
{'place' : 'Paris', 'been' : true},
{'place' : 'Moscow', 'been' : false},
{'place' : 'Berlin', 'been' : true}
]
}
Where the locations array could have any length.
I want to match documents where the been field is true for all elements in the locations array. Looking at the documentation it looks like I should use $and somehow but I'm not sure if it works with sub-fields.
There are several options:
use $ne: db.destinations.find({"locations.been":{$ne:false}})
change your business logic to precompute that value before saving the document. Otherwise, this search must look through all records and then all places. This value could be indexed.
use the $where operator, but understand the performance implications: it may require a full collection scan, and in this case it would.
write a map-reduce function with the filter logic and only emit those that are valid. You'd need to incrementally update it per the docs.
write a query using the aggregation framework. There are a lot of good examples here. Although, like other solutions, this could end up looping through the entire collection.
I think it's impossible to do with standard MongoDB operators like $elemMatch or $all. The only possible way is to write a custom JS query:
db.test.find("return this.locations.every(function(loc){return loc.been});")
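To make the difference between the query-operator options concrete, here is a small sketch with two filters and plain-JavaScript predicates that mirror what each one accepts. The $ne form accepts any document with no element whose been is false, while the $not/$elemMatch form requires every element to have been set to true. The sample documents other than John's and the predicate names are made up for illustration, and the predicates assume locations is always an array.

```javascript
// Filters that would be passed to db.destinations.find(...) in the shell.
const noFalseFilter = { "locations.been": { $ne: false } };
const allTrueFilter = { locations: { $not: { $elemMatch: { been: { $ne: true } } } } };

// Plain-JavaScript predicates mirroring the two filters (illustration only).
const noFalse = (doc) => !doc.locations.some((loc) => loc.been === false);
const allTrue = (doc) => doc.locations.every((loc) => loc.been === true);

// John is from the question; Jane and Alex are hypothetical.
const docs = [
  { name: "John", locations: [{ place: "Paris", been: true }, { place: "Moscow", been: false }] },
  { name: "Jane", locations: [{ place: "Paris", been: true }, { place: "Berlin", been: true }] },
  { name: "Alex", locations: [{ place: "Paris", been: true }, { place: "Rome" }] },
];

const noFalseNames = docs.filter(noFalse).map((d) => d.name);
const allTrueNames = docs.filter(allTrue).map((d) => d.name);
console.log(noFalseNames, allTrueNames);
```

The two forms only differ for documents like Alex's, where an element is missing the been field altogether.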

MongoDB Aggregate $project

I store our web server logs in MongoDB and the schema looks similar to the following:
[
{
"_id" : 12345,
"url" : "http://www.mydomain.com/xyz/abc.html",
....
},
....
]
I am trying to use the $project operator to reshape this schema a little bit before I start passing my collection through an aggregation pipeline. Basically, I need to add a new field called "type" that will later be used to perform group-by. The logic for the new field is pretty simple.
if "url" contains "pattern_A" then set "type" = "sales lead";
else if "url" contains "pattern_B" then set "type" = "existing client";
...
I'm thinking it would have to be something like this:
db.weblog.aggregate(
{
$project : {
type : { /* how to implement the logic??? */ }
}
}
);
I know how to do this using map-reduce (by setting the "keyf" attribute to a custom JS function that implements the above logic) but am now trying to use the new aggregation framework to do this. I tried to implement the logic using the expression operators but so far couldn't get it to work. Any help/suggestion would be greatly appreciated!
I am sharing my "solution" in case others have the same needs as mine.
After researching for a couple of weeks, and as #asya-kamsky suggested in one of his comments, I've decided to add a computed field to my original MongoDB schema. It's not ideal, because whenever the logic for the computed field changes I have to bulk-update all documents in my collection, but it was either that or rewrite my code to use MapReduce. I chose the former for now. Looking at the MongoDB Jira board, it appears that many people have asked for more diverse operators to be added to $project, and I certainly hope that the MongoDB dev team gets around to adding them sooner rather than later:
Operator for splitting string based on a separator.
New projection operator $elemMatch
Allow $slice operator in $project
add a $inOrder operator to $project
You need to use a combination of several operators and expressions.
First, the $cond operator in $project lets you implement if-then-else logic.
$cond takes an array of three elements: first a boolean expression, then two values to use for the field. If the boolean expression is true, it uses the second element as the value; if not, the third.
You can nest these so that the third element is itself a $cond expression, giving if-then-else-if-then-etc.
String manipulation is a little awkward, but you do have $substr available.
If you post some examples of what exactly you tried, I may be able to spot why it didn't work.
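The nesting described above can be sketched as follows. This is not the approach available at the time of the question: for the "contains" test it assumes MongoDB 3.4 or later, where $indexOfCP returns -1 when the substring is not found. The helper name buildTypeExpression and the fallback value "other" are hypothetical.

```javascript
// Rules from the question: the first matching pattern decides the type.
const rules = [
  { pattern: "pattern_A", type: "sales lead" },
  { pattern: "pattern_B", type: "existing client" },
];

// Hypothetical helper: folds the rules into nested $cond expressions,
// so each rule's "else" branch is the next rule (or the fallback).
function buildTypeExpression(ruleList, fallback) {
  return ruleList.reduceRight(
    (elseBranch, rule) => ({
      $cond: [
        // true when the url contains the pattern ($indexOfCP is -1 otherwise)
        { $gte: [{ $indexOfCP: ["$url", rule.pattern] }, 0] },
        rule.type,
        elseBranch,
      ],
    }),
    fallback
  );
}

const pipeline = [{ $project: { url: 1, type: buildTypeExpression(rules, "other") } }];
// In the shell this would be run as db.weblog.aggregate(pipeline).
console.log(JSON.stringify(pipeline, null, 2));
```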