How to match multiple subdocuments in MongoDB?

Assuming that I have the following data in my books collection:
[
{
name: "Animal Farm",
readers: [
{
name: "Johny"
},
{
name: "Lisa"
}
],
likes: [
{
name: "Johny"
}
]
},
{
name: "1984",
readers: [
{
name: "Fred"
},
{
name: "Johny"
},
{
name: "Johny",
type: "bot"
}
],
likes: [
{
name: "Fred"
}
]
}
]
How do I retrieve all readers and likes that match name "Johny", with end result something like this:
[
{
name: "Animal Farm",
readers: [
{
name: "Johny"
}
],
likes: [
{
name: "Johny"
}
]
},
{
name: "1984",
readers: [
{
name: "Johny"
},
{
name: "Johny",
type: "bot"
}
],
likes: []
}
]
The following query is not possible:
db.books.find(
{ $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }] },
{ name: 1, "readers.$": 1, "likes.$": 1 })
MongoDB complains with the following error: Cannot specify more than one positional array element per query (currently unsupported).
I have tried to use aggregation framework but did not succeed. So is this possible with MongoDB or do I have to run two queries to retrieve needed results?

As Sammaye has pointed already, specifying more than one positional array element is currently not supported.
However, you can use the $elemMatch projection operator to get close to the results you want. The $elemMatch projection operator limits the contents of each array to the first element that matches the $elemMatch condition:
db.books.find(
{ $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }] },
{
readers : { $elemMatch : { name : "Johny" }},
likes : { $elemMatch : { name : "Johny" }}
}
);
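To make the behaviour of that projection concrete, here is a plain JavaScript sketch (not MongoDB code) that mimics it on the sample data from the question. The helper name elemMatchProject is made up for illustration, and the name field is kept only for readability. It keeps at most the first matching element of each array and leaves the field out when nothing matches, so the second "Johny" reader of "1984" would not be returned:

```javascript
// Sample data from the question.
const books = [
  { name: "Animal Farm",
    readers: [{ name: "Johny" }, { name: "Lisa" }],
    likes: [{ name: "Johny" }] },
  { name: "1984",
    readers: [{ name: "Fred" }, { name: "Johny" }, { name: "Johny", type: "bot" }],
    likes: [{ name: "Fred" }] },
];

// Hypothetical helper: emulates an $elemMatch projection on the given array fields.
function elemMatchProject(doc, fields, name) {
  const out = { name: doc.name };
  for (const field of fields) {
    // Only the FIRST matching element survives the projection.
    const first = (doc[field] || []).find((el) => el.name === name);
    if (first !== undefined) out[field] = [first]; // field is omitted when nothing matches
  }
  return out;
}

const projected = books.map((b) => elemMatchProject(b, ["readers", "likes"], "Johny"));
console.log(JSON.stringify(projected, null, 2));
```

If every matching element is needed, a first-match projection like this is not enough, which is what the aggregation approach addresses.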
Edit
Although MongoDB doesn't have a built-in operator that does exactly what you want, you can achieve it by combining existing operators. But brace yourself, this is going to be a long one:
db.books.aggregate([
// find only documents that have correct "name"
{ $match: { $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }]}},
// unwind the likes array so its elements can be conditionally pushed into a new array
{ $unwind: '$likes' },
// do a group to conditionally push the values into the array
{ $group : {
_id : '$_id',
likes : {
$push : {
$cond : [
{ $eq : ["$likes.name", "Johny"]},
"$likes",
null
]
}
},
readers : { $first : "$readers" },
name : { $first : "$name" }
}},
// the process is repeated for the readers array
{ $unwind: '$readers' },
{ $group : {
_id : '$_id',
readers : {
$push : {
$cond : [
{ $eq : ["$readers.name", "Johny"]},
"$readers",
null
]
}
},
likes : { $first : "$likes" },
name : { $first : "$name" }
}},
// final step: remove the null values from the arrays
{ $project : {
name : 1,
readers : { $setDifference : [ "$readers", [null] ] },
likes : { $setDifference : [ "$likes", [null] ] },
}}
]);
As you can see, you can do a "conditional" $push by using the $cond operator inside the $push. But after the group stage, your array will contain null values. You have to filter them out by using $setDifference.
Also note that you need to do unwind/group stages for each array you're building, otherwise a double unwind will duplicate the documents and you will end up with duplicate values in your arrays.
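For comparison, and as an assumption on my part rather than something from the answer above: on MongoDB 3.2 or later the $filter aggregation operator can do the per-array filtering in a single $project stage. The sketch below keeps such a pipeline in a plain variable and emulates the two stages in ordinary JavaScript, so the expected shape can be checked without a server:

```javascript
// Sample data from the question.
const books = [
  { name: "Animal Farm",
    readers: [{ name: "Johny" }, { name: "Lisa" }],
    likes: [{ name: "Johny" }] },
  { name: "1984",
    readers: [{ name: "Fred" }, { name: "Johny" }, { name: "Johny", type: "bot" }],
    likes: [{ name: "Fred" }] },
];

// Stages one could pass to db.books.aggregate(...); assumes MongoDB 3.2+ for $filter.
const pipeline = [
  { $match: { $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }] } },
  { $project: {
      name: 1,
      readers: { $filter: { input: "$readers", as: "r", cond: { $eq: ["$$r.name", "Johny"] } } },
      likes: { $filter: { input: "$likes", as: "l", cond: { $eq: ["$$l.name", "Johny"] } } },
  } },
];

// Plain JavaScript emulation of the $match and $project stages above.
const isJohny = (el) => el.name === "Johny";
const filtered = books
  .filter((b) => b.readers.some(isJohny) || b.likes.some(isJohny))
  .map((b) => ({
    name: b.name,
    readers: b.readers.filter(isJohny),
    likes: b.likes.filter(isJohny),
  }));

console.log(JSON.stringify(filtered, null, 2));
```

In this emulation "1984" keeps both "Johny" readers and ends up with an empty likes array, which is the shape asked for in the question.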

Following on from @ChristianP's answer:
db.books.aggregate(
// Filter first so we don't unwind documents we don't need
{$match: { $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }] }},
{$unwind: '$readers'},
{$match: { "readers.name": "Johny" }},
{$unwind: '$likes'},
{$match: { "likes.name": "Johny" }},
{$group: {_id: '$_id', likes: {$push: '$likes'}, readers: {$push: '$readers'}}}
)
Something like that should be able to do what you want. The functionality to do this in a plain query was shunned in favour of doing it this way.
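One thing worth checking before relying on this pipeline: each $match that follows an $unwind discards every document whose unwound element is not "Johny", so a book with matching readers but no matching likes drops out entirely. Here is a plain JavaScript emulation on the sample data. The unwind helper and the numeric _id values are made up for illustration:

```javascript
// Sample data from the question, with illustrative _id values added.
const books = [
  { _id: 1, name: "Animal Farm",
    readers: [{ name: "Johny" }, { name: "Lisa" }],
    likes: [{ name: "Johny" }] },
  { _id: 2, name: "1984",
    readers: [{ name: "Fred" }, { name: "Johny" }, { name: "Johny", type: "bot" }],
    likes: [{ name: "Fred" }] },
];

// Hypothetical helper: one output document per array element, like $unwind.
const unwind = (docs, field) =>
  docs.flatMap((d) => (d[field] || []).map((el) => ({ ...d, [field]: el })));

const isJohny = (el) => el.name === "Johny";

// Initial $match on either array.
let docs = books.filter((b) => b.readers.some(isJohny) || b.likes.some(isJohny));
// $unwind readers, then $match readers.name.
docs = unwind(docs, "readers").filter((d) => isJohny(d.readers));
// $unwind likes, then $match likes.name.
docs = unwind(docs, "likes").filter((d) => isJohny(d.likes));

// $group by _id, pushing the surviving elements.
const groups = new Map();
for (const d of docs) {
  if (!groups.has(d._id)) groups.set(d._id, { _id: d._id, likes: [], readers: [] });
  groups.get(d._id).likes.push(d.likes);
  groups.get(d._id).readers.push(d.readers);
}
const result = [...groups.values()];

console.log(JSON.stringify(result, null, 2));
```

In this emulation only "Animal Farm" survives, because nobody named "Johny" liked "1984". If books with an empty likes array should still appear, the per-array approach from the previous answer is the safer choice.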

Related

How to group data by key without $unwind nor $group stage in mongodb aggregation

Can anyone help me to group the data by key without using $unwind nor $group stage ? The reason why I don't want to use $unwind nor $group stage is because I want to be able to use the pipeline in a updateMany(filter, pipeline) operation so the stages I can use are limited to: $addFields (= $set), $project (= $unset) and $replaceRoot (= $replaceWith).
The input data looks like this:
[
{
key: "alpha",
value: {
name: "foo",
},
},
{
key: "beta",
value: {
name: "bar",
},
},
{
key: "alpha",
value: {
name: "baz",
},
},
]
The result I would like to get:
[
{
key: "alpha",
values: [
{
name: "foo",
},
{
name: "baz",
},
],
},
{
key: "beta",
values: [
{
name: "bar",
},
],
},
]
I believe that it's doable by using $reduce, but I'm new to MongoDB aggregation, so I'm struggling to accumulate objects in an array conditionally.
Thanks
[
{
$set: {
keys: { $setUnion: [ "$array.key" ] }
}
},
{
$set: {
newarray : {
$map : {
input : "$keys",
in : {
key : "$$this",
values : {
$map : {
input : {
$filter : {
input : "$array",
as : "elem",
cond : {
$eq : [
"$$elem.key",
"$$this"
]
}
}},
as : "vals",
in : {
name : "$$vals.value.name"
}
}
}
}
}
}
}
}
]
Try it. https://mongoplayground.net/p/m1UUvt5QNgg
Or more compact version: https://mongoplayground.net/p/bFrSiyc5nSc
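To see what the two $set stages compute, here is the same logic as a plain JavaScript sketch. Like the pipeline, it assumes the items live in a field called array on a single document. The order of the keys is an artifact of how a JavaScript Set iterates and should not be read as a guarantee about $setUnion:

```javascript
// One document holding the input items in an "array" field, as the pipeline assumes.
const doc = {
  array: [
    { key: "alpha", value: { name: "foo" } },
    { key: "beta", value: { name: "bar" } },
    { key: "alpha", value: { name: "baz" } },
  ],
};

// First $set: collect the distinct keys (the role $setUnion plays in the pipeline).
const keys = [...new Set(doc.array.map((elem) => elem.key))];

// Second $set: for every key, $filter the matching elements and $map them to { name }.
const newarray = keys.map((key) => ({
  key,
  values: doc.array
    .filter((elem) => elem.key === key)
    .map((vals) => ({ name: vals.value.name })),
}));

console.log(JSON.stringify(newarray, null, 2));
```

No $unwind or $group is involved, which is why the same idea fits into an update pipeline.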

Fetch distinct values from Mongo DB nested array and output to a single array

Given below is my data in MongoDB. I want to fetch all the unique ids from the articles field, which is nested under the jnlc_subjects field. The result should contain only the articles array with distinct ObjectIds.
Mongo Data
{
"_id" : ObjectId("5c9216f1a21a4a31e0c7fa56"),
"jnlc_journal_category" : "Biology",
"jnlc_subjects" : [
{
"subject" : "Conservation Biology",
"views" : "123",
"articles" : [
ObjectId("5c4e93d0135edb6812200d5f"),
ObjectId("5c4e9365135edb6a12200d60"),
ObjectId("5c4e93a8135edb6912200d61")
]
},
{
"subject" : "Micro Biology",
"views" : "20",
"articles" : [
ObjectId("5c4e9365135edb6a12200d60"),
ObjectId("5c4e93d0135edb6812200d5f"),
ObjectId("5c76323fbaaccf5e0bae7600"),
ObjectId("5ca33ce19d677bf780fc4995")
]
},
{
"subject" : "Marine Biology",
"views" : "8",
"articles" : [
ObjectId("5c4e93d0135edb6812200d5f")
]
}
]
}
Required result
I want to get output in following format
articles : [
ObjectId("5c4e9365135edb6a12200d60"),
ObjectId("5c4e93a8135edb6912200d61"),
ObjectId("5c76323fbaaccf5e0bae7600"),
ObjectId("5ca33ce19d677bf780fc4995"),
ObjectId("5c4e93d0135edb6812200d5f")
]
Try as below:
db.collection.aggregate([
{
$unwind: "$jnlc_subjects"
},
{
$unwind: "$jnlc_subjects.articles"
},
{ $group: {_id: null, uniqueValues: { $addToSet: "$jnlc_subjects.articles"}} }
])
Result:
{
"_id" : null,
"uniqueValues" : [
ObjectId("5ca33ce19d677bf780fc4995"),
ObjectId("5c4e9365135edb6a12200d60"),
ObjectId("5c4e93a8135edb6912200d61"),
ObjectId("5c4e93d0135edb6812200d5f"),
ObjectId("5c76323fbaaccf5e0bae7600")
]
}
Try with this
db.collection.aggregate([
{
$unwind:{
path:"$jnlc_subjects",
preserveNullAndEmptyArrays:true
}
},
{
$unwind:{
path:"$jnlc_subjects.articles",
preserveNullAndEmptyArrays:true
}
},
{
$group:{
_id:"$_id",
articles:{
$addToSet:"$jnlc_subjects.articles"
}
}
}
])
If you don't want to $group by _id, you can use null instead of "$_id".
Based on the description in the question, please try the following aggregate operation as a solution.
db.collection.aggregate(
// Pipeline
[
// Stage 1
{
$match: {
"_id": ObjectId("5c9216f1a21a4a31e0c7fa56")
}
},
// Stage 2
{
$unwind: {
path: "$jnlc_subjects",
}
},
// Stage 3
{
$unwind: {
path: "$jnlc_subjects.articles"
}
},
// Stage 4
{
$group: {
_id: null,
articles: {
$addToSet: '$jnlc_subjects.articles'
}
}
},
// Stage 5
{
$project: {
articles: 1,
_id: 0
}
},
]
);
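As a quick sanity check of what the double $unwind plus $addToSet produces, here is the same idea in plain JavaScript. The ObjectIds are written as hex strings, since a JavaScript Set compares objects by reference, and the resulting order is just the insertion order here, whereas $addToSet does not promise any particular order:

```javascript
// The jnlc_subjects array from the question, with ObjectIds written as hex strings.
const doc = {
  jnlc_journal_category: "Biology",
  jnlc_subjects: [
    { subject: "Conservation Biology",
      articles: ["5c4e93d0135edb6812200d5f", "5c4e9365135edb6a12200d60", "5c4e93a8135edb6912200d61"] },
    { subject: "Micro Biology",
      articles: ["5c4e9365135edb6a12200d60", "5c4e93d0135edb6812200d5f",
                 "5c76323fbaaccf5e0bae7600", "5ca33ce19d677bf780fc4995"] },
    { subject: "Marine Biology",
      articles: ["5c4e93d0135edb6812200d5f"] },
  ],
};

// flatMap plays the role of the two $unwind stages, the Set that of $addToSet.
const articles = [...new Set(doc.jnlc_subjects.flatMap((s) => s.articles))];

console.log(articles);
```

Eight ids go in and five distinct ones come out, matching the five ids in the required result.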

Count and apply condition to slice the mongodb array document

My document structure looks like this:
{
"_id" : ObjectId("5aeeda07f3a664c55e830a08"),
"profileId" : ObjectId("5ad84c8c0e71892058b6a543"),
"list" : [
{
"content" : "answered your post",
"createdBy" : ObjectId("5ad84c8c0e71892058b6a540")
},
{
"content" : "answered your post",
"createdBy" : ObjectId("5ad84c8c0e71892058b6a540")
},
{
"content" : "answered your post",
"createdBy" : ObjectId("5ad84c8c0e71892058b6a540")
},
],
}
I want to count the elements of the list field and apply a condition before slicing: if the list has 10 or fewer elements, return all of them; otherwise return only the first 10.
P.S. I used this query, but it returns null.
db.getCollection('post').aggregate([
{
$match:{
profileId:ObjectId("5ada84c8c0e718s9258b6a543")}
},
{$project:{notifs:{$size:"$list"}}},
{$project:{notifications:
{$cond:[
{$gte:["$notifs",10]},
{$slice:["$list",10]},
{$slice:["$list","$notifs"]}
]}
}}
])
Your first $project stage effectively wipes out all result fields but the one(s) that it explicitly projects (only notifs in your case). That's why the second $project stage cannot $slice the list field anymore (it has been removed by the first $project stage).
Also, I think your $cond/$slice combination can be more elegantly expressed using the $min operator. So there's at least the following two fixes for your problem:
Using $addFields:
db.getCollection('post').aggregate([
{ $match: { profileId: ObjectId("5ad84c8c0e71892058b6a543") } },
{ $addFields: { notifs: { $size: "$list" } } },
{ $project: {
notifications: {
$slice: [ "$list", { $min: [ "$notifs", 10 ] } ]
}
}}
])
Using a calculation inside the $project - this avoids a stage so should be preferable.
db.getCollection('post').aggregate([
{ $match: { profileId: ObjectId("5ad84c8c0e71892058b6a543") } },
{ $project: {
notifications: {
$slice: [ "$list", { $min: [ { $size: "$list" }, 10 ] } ]
}
}}
])
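The $min trick is the usual "cap at the array length" idiom. A plain JavaScript sketch of the same calculation follows; the function name firstTen is made up for illustration:

```javascript
// Three notifications, shaped like the list field in the question.
const list = Array.from({ length: 3 }, () => ({ content: "answered your post" }));

// Equivalent of { $slice: [ "$list", { $min: [ { $size: "$list" }, 10 ] } ] }.
const firstTen = (arr) => arr.slice(0, Math.min(arr.length, 10));

console.log(firstTen(list).length);                   // short list: everything is kept
console.log(firstTen(new Array(25).fill(0)).length);  // long list: capped at 10
```

As far as I know, $slice with a positive count already returns at most that many elements, so { $slice: [ "$list", 10 ] } on its own should give the same result; the $min mainly makes the intent explicit.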

Empty array prevents document to appear in query

I have documents that have a few fields, and in particular they have a field called attrs that is an array. I am using the aggregation pipeline.
In my query I am interested in the attrs (attributes) field if there are any elements in it. Otherwise I still want to get the result. In this case I am after the field type of the document.
The problem is that if a document does not contain any element in the attrs field it will be filtered away and I won't get its _id.type field, which is what I really want from this query.
{
aggregate: "entities",
pipeline: [
{
$match: {
_id.servicePath: {
$in: [
/^/.*/,
null
]
}
}
},
{
$project: {
_id: 1,
"attrs.name": 1,
"attrs.type": 1
}
},
{
$unwind: "$attrs"
},
{
$group: {
_id: "$_id.type",
attrs: {
$addToSet: "$attrs"
}
}
},
{
$sort: {
_id: 1
}
}
]
}
So the question is: how can I get a result containing all document types regardless of whether they have attrs, but including the attributes in case they have them?
I hope it makes sense.
You can use the $cond operator in a $project stage to replace the empty attrs array with one that contains a placeholder like null, which can be used as a marker to indicate that this doc doesn't contain any attrs elements.
So you'd insert an additional $project stage like this right before the $unwind:
{
$project: {
attrs: {$cond: {
if: {$eq: ['$attrs', [] ]},
then: [null],
else: '$attrs'
}}
}
},
The only caveat is that you'll end up with a null value in the final attrs array for those groups that contain at least one doc without any attrs elements, so you need to ignore those client-side.
Example
The example uses an altered $match stage because the one in your example isn't valid.
Input Docs
[
{_id: {type: 1, id: 2}, attrs: []},
{_id: {type: 2, id: 1}, attrs: []},
{_id: {type: 2, id: 2}, attrs: [{name: 'john', type: 22}, {name: 'bob', type: 44}]}
]
Output
{
"result" : [
{
"_id" : 1,
"attrs" : [
null
]
},
{
"_id" : 2,
"attrs" : [
{
"name" : "bob",
"type" : 44
},
{
"name" : "john",
"type" : 22
},
null
]
}
],
"ok" : 1
}
Aggregate Command
db.test.aggregate([
{
$match: {
'_id.servicePath': {
$in: [
null
]
}
}
},
{
$project: {
_id: 1,
"attrs.name": 1,
"attrs.type": 1
}
},
{
$project: {
attrs: {$cond: {
if: {$eq: ['$attrs', [] ]},
then: [null],
else: '$attrs'
}}
}
},
{
$unwind: "$attrs"
},
{
$group: {
_id: "$_id.type",
attrs: {
$addToSet: "$attrs"
}
}
},
{
$sort: {
_id: 1
}
}
])
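To see why the placeholder matters, here is a plain JavaScript emulation of the $unwind and $group stages on the input docs above, run once without and once with the [null] replacement. The function name groupAttrsByType is made up, and the element order inside attrs is not meaningful, just as with $addToSet:

```javascript
// Input docs from the example above.
const docs = [
  { _id: { type: 1, id: 2 }, attrs: [] },
  { _id: { type: 2, id: 1 }, attrs: [] },
  { _id: { type: 2, id: 2 }, attrs: [{ name: "john", type: 22 }, { name: "bob", type: 44 }] },
];

// Hypothetical helper emulating: (optional $cond placeholder) -> $unwind -> $group/$addToSet -> $sort.
function groupAttrsByType(input, usePlaceholder) {
  const groups = new Map();
  for (const doc of input) {
    const attrs = usePlaceholder && doc.attrs.length === 0 ? [null] : doc.attrs;
    // $unwind: a document with an empty array produces no output at all.
    for (const attr of attrs) {
      const type = doc._id.type;
      if (!groups.has(type)) groups.set(type, new Map());
      // $addToSet: keep each distinct value once (compared by JSON text here).
      groups.get(type).set(JSON.stringify(attr), attr);
    }
  }
  return [...groups.entries()]
    .sort(([a], [b]) => a - b)
    .map(([type, set]) => ({ _id: type, attrs: [...set.values()] }));
}

const withoutPlaceholder = groupAttrsByType(docs, false);
const withPlaceholder = groupAttrsByType(docs, true);

console.log(JSON.stringify(withoutPlaceholder));
console.log(JSON.stringify(withPlaceholder));
```

Without the placeholder, type 1 never shows up because its only document has an empty attrs array. With it, type 1 appears with a single null marker, and type 2 carries an extra null that has to be ignored client-side.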
Use some if statements and loops.
First, your query should select all documents.
Loop through all of them.
Then, if the number of attributes is greater than 0, loop through the attributes and collect them into whatever array or output you find useful.
Use if statements to sanitize your results if you like.
You should use the $or operator with two separate query clauses: one to select the documents whose attr value equals the required value, and another to match documents where attr is null or the attr key does not exist (using the $exists operator).

Using MongoDB's positional operator $ in a deeply nested document query

Is it possible to use positional operator '$' in combination with a query on a deeply-nested document array?
Consider the following nested document defining a 'user':
{
username: 'test',
kingdoms: [
{
buildings: [
{
type: 'castle'
},
{
type: 'treasury'
},
...
]
},
...
]
}
We'd like to return the 'castles' for a particular user e.g. in a form:
{
kingdoms: [{
buildings: [{
type: 'castle'
}]
}]
}
Because you cannot use the $ operator twice (https://jira.mongodb.org/browse/server-831) I know that I can't also query for a particular kingdom, so I'm trying to write a find statement for the nth kingdom.
This seems to make sense when updating a deeply-nested sub-document (Mongodb update deeply nested subdocument) but I'm having less success with the find query.
I can return the first kingdom's buildings with the query:
db.users.findOne(
{ username: 'test' },
{ kingdoms: {$slice: [0, 1]}, 'kingdom.buildings': 1 }
);
But this returns all the buildings of that kingdom.
Following the single-level examples of position operator I'm trying a query like this:
db.users.findOne(
{ username: 'test', 'kingdoms.buildings.type': 'castle' },
{ kingdoms: {$slice: [n, 1]}, 'kingdom.buildings.$': 1 }
);
so as to be in the form:
db.collection.find( { <array.field>: <value> ...}, { "<array>.$": 1 } )
as described in the documentation http://docs.mongodb.org/manual/reference/operator/projection/positional/#proj.S
However this fails with the error:
Positional operator does not match the query specifier
Presumably because kingdoms.buildings isn't considered an array. I've also tried kingdoms.0.buildings
It is confusing because this appears to work for updates (according to Mongodb update deeply nested subdocument)
Have I just got the syntax wrong or is this not supported? If so is there a way to achieve something similar?
You get an error from
db.users.findOne(
{ username: 'test', 'kingdoms.buildings.type': 'castle' },
{ kingdoms: {$slice: [n, 1]}, 'kingdom.buildings.$': 1 }
);
because there is a spelling mistake ("kingdom.buildings.$" should be "kingdoms.buildings.$").
However, this approach cannot accomplish what you expect.
$ always targets kingdoms, the first array in the path kingdoms.buildings.
The following approach should be able to solve the problem
(MongoDB 2.6+ required):
db.c.aggregate([ {
$match : {
username : 'test',
'kingdoms.buildings.type' : 'castle'
}
}, {
$project : {
_id : 0,
kingdoms : 1
}
}, {
$redact : {
$cond : {
"if" : {
$or : [ {
$gt : [ "$kingdoms", [] ]
}, {
$gt : [ "$buildings", [] ]
}, {
$eq : [ "$type", "castle" ]
} ]
},
"then" : "$$DESCEND",
"else" : "$$PRUNE"
}
}
} ]).pretty();
To keep only the first element of kingdoms:
db.c.aggregate([ {
$match : {
username : 'test',
'kingdoms.buildings.type' : 'castle'
}
}, {
$redact : {
$cond : {
"if" : {
$or : [ {
$gt : [ "$kingdoms", [] ]
}, {
$gt : [ "$buildings", [] ]
}, {
$eq : [ "$type", "castle" ]
} ]
},
"then" : "$$DESCEND",
"else" : "$$PRUNE"
}
}
}, {
$unwind : "$kingdoms"
}, {
$group : {
_id : "$_id",
kingdom : {
$first : "$kingdoms"
}
}
}, {
$group : {
_id : "$_id",
kingdoms : {
$push : "$kingdom"
}
}
}, {
$project : {
_id : 0,
kingdoms : 1
}
} ]).pretty();
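For reference, the shape these pipelines are aiming for can be written down in plain JavaScript. This is only a sketch of the desired result and not a faithful emulation of $redact. The second kingdom and the 'farm' building are invented sample data, since the question elides the rest of the document:

```javascript
// A user document shaped like the one in the question (second kingdom is made up).
const user = {
  username: "test",
  kingdoms: [
    { buildings: [{ type: "castle" }, { type: "treasury" }] },
    { buildings: [{ type: "farm" }, { type: "castle" }] },
  ],
};

// Keep every kingdom, but only the castles among its buildings.
const castlesOnly = {
  kingdoms: user.kingdoms.map((kingdom) => ({
    buildings: kingdom.buildings.filter((building) => building.type === "castle"),
  })),
};

// Keep only the first kingdom, as the second pipeline does.
const firstKingdom = { kingdoms: castlesOnly.kingdoms.slice(0, 1) };

console.log(JSON.stringify(castlesOnly));
console.log(JSON.stringify(firstKingdom));
```

Comparing the pipeline output against a small reference like this is an easy way to confirm that the pruning condition keeps exactly the sub-documents you expect.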