Empty array prevents document from appearing in query - mongodb

I have documents with a few fields; in particular, they have a field called attrs that is an array. I am using the aggregation pipeline.
In my query I am interested in the attrs (attributes) field if it has any elements, but I still want a result when it is empty. What I am really after is the type field of the document.
The problem is that a document with no elements in attrs is filtered out, so I never get its _id.type field, which is what I really want from this query.
{
aggregate: "entities",
pipeline: [
{
$match: {
_id.servicePath: {
$in: [
/^/.*/,
null
]
}
}
},
{
$project: {
_id: 1,
"attrs.name": 1,
"attrs.type": 1
}
},
{
$unwind: "$attrs"
},
{
$group: {
_id: "$_id.type",
attrs: {
$addToSet: "$attrs"
}
}
},
{
$sort: {
_id: 1
}
}
]
}
So the question is: how can I get a result containing all document types, whether or not they have attrs, while including the attributes when they exist?
I hope it makes sense.

You can use the $cond operator in a $project stage to replace an empty attrs array with one that contains a placeholder such as null. The placeholder marks the doc as having no attrs elements and, because the array is no longer empty, $unwind no longer drops the doc.
So you'd insert an additional $project stage like this right before the $unwind:
{
$project: {
attrs: {$cond: {
if: {$eq: ['$attrs', [] ]},
then: [null],
else: '$attrs'
}}
}
},
The only caveat is that you'll end up with a null value in the final attrs array for those groups that contain at least one doc without any attrs elements, so you need to ignore those client-side.
Example
The example uses an altered $match stage because the one in your example isn't valid.
Input Docs
[
{_id: {type: 1, id: 2}, attrs: []},
{_id: {type: 2, id: 1}, attrs: []},
{_id: {type: 2, id: 2}, attrs: [{name: 'john', type: 22}, {name: 'bob', type: 44}]}
]
Output
{
"result" : [
{
"_id" : 1,
"attrs" : [
null
]
},
{
"_id" : 2,
"attrs" : [
{
"name" : "bob",
"type" : 44
},
{
"name" : "john",
"type" : 22
},
null
]
}
],
"ok" : 1
}
Aggregate Command
db.test.aggregate([
{
$match: {
'_id.servicePath': {
$in: [
null
]
}
}
},
{
$project: {
_id: 1,
"attrs.name": 1,
"attrs.type": 1
}
},
{
$project: {
attrs: {$cond: {
if: {$eq: ['$attrs', [] ]},
then: [null],
else: '$attrs'
}}
}
},
{
$unwind: "$attrs"
},
{
$group: {
_id: "$_id.type",
attrs: {
$addToSet: "$attrs"
}
}
},
{
$sort: {
_id: 1
}
}
])
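The null placeholders mentioned above can be stripped once the results reach the client. Here is a minimal sketch in plain JavaScript, assuming the result array has the shape shown in the Output section; the helper name stripPlaceholders is hypothetical and not part of any driver:

```javascript
// Hypothetical helper: drop the null placeholder from each group's attrs array.
function stripPlaceholders(groups) {
  return groups.map(function (group) {
    return {
      _id: group._id,
      attrs: group.attrs.filter(function (attr) { return attr !== null; })
    };
  });
}

// Result documents in the shape shown in the Output section above.
var result = [
  { _id: 1, attrs: [null] },
  { _id: 2, attrs: [{ name: "bob", type: 44 }, { name: "john", type: 22 }, null] }
];

var cleaned = stripPlaceholders(result);
console.log(JSON.stringify(cleaned));
```

Types whose documents have no attributes then carry an empty attrs array instead of disappearing from the result. On MongoDB 3.2 or later, $unwind also accepts preserveNullAndEmptyArrays: true, which keeps documents whose array is empty or missing and may make the placeholder unnecessary.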

Use some if statements and loops.
First and foremost, your query should select all documents.
Loop through all of them.
Then, if the number of attributes is greater than 0, loop through the attributes and collect them into whatever array or output you find useful.
Use if statements to sanitize your results if you like.

You should use the $or operator with two separate conditions: one that selects the documents whose attrs value equals the required value, and another that matches documents where attrs is null or the attrs key does not exist (using the $exists operator).

Related

Customize existing document and add new fields in mongo aggregation

I have two documents with the following structure:
{
"CollegeName" : "Hi-Tech College",
"StudentName" : "John",
"Age" : 25
},
{
"CollegeName" : "Hi-Tech College",
"StudentName" : "Tom",
"Age" : 24
}
CollegeName is the common field in those two documents. Using it, I want to generate a single document in the following format:
{
"CollegeName" : "Hi-Tech College",
"JohnAge" : 25,
"TomAge" : 24
}
You can try below aggregation:
db.col.aggregate([
{
$group: {
_id: null,
CollegeName: { $first: "$CollegeName" },
Students: { $push: { k: { $concat: [ "$StudentName", "Age" ] }, v: "$Age" } }
}
},
{
$replaceRoot: {
newRoot: { $mergeObjects: [ { CollegeName: "$CollegeName" }, { $arrayToObject: "$Students" } ] }
}
}
])
To create key names dynamically you can use the $arrayToObject operator, which takes an array of key-value pairs (k and v properties) and returns an object. To build the custom keys you can use $concat. You then have to merge the dynamically created object with CollegeName, which is what $mergeObjects and $replaceRoot are for.
Because grouping by null returns a single document for the entire collection, keep in mind MongoDB's BSON document size limit: the result can't exceed 16MB.
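To make the k/v mechanics concrete, here is a rough in-memory equivalent in plain JavaScript (it assumes ES2019 or later for Object.fromEntries). It only mirrors what the $group, $arrayToObject and $mergeObjects steps compute for the two sample documents; it is not how MongoDB implements them, and the variable names are illustrative.

```javascript
// Sample documents from the question.
const docs = [
  { CollegeName: "Hi-Tech College", StudentName: "John", Age: 25 },
  { CollegeName: "Hi-Tech College", StudentName: "Tom", Age: 24 }
];

// $group with $push: collect one { k, v } pair per student,
// where k is built the way $concat builds it.
const students = docs.map((doc) => ({ k: doc.StudentName + "Age", v: doc.Age }));

// $arrayToObject: turn the k/v pairs into a single object.
const dynamicFields = Object.fromEntries(students.map((pair) => [pair.k, pair.v]));

// $mergeObjects + $replaceRoot: combine with CollegeName taken from the first document.
const merged = Object.assign({ CollegeName: docs[0].CollegeName }, dynamicFields);

console.log(merged);
```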
db.temp.aggregate([
{"$group":{"_id":{"CollegeName":"$CollegeName"},
"Students":{"$push":{"StudentName":"$StudentName","Age":"$Age"}}}}
,{"$unwind":"$Students"}
,{"$group":{"_id":"$_id",
'JohnAge': { $max: {$cond: [ {$or: [
{$eq:['$Students.StudentName', 'John']}
]}
, '$Students.Age', null] } },
'TomAge': { $max: {$cond: [ {$or: [
{$eq:['$Students.StudentName', 'Tom']}
]}
, '$Students.Age', null] } }
}}
])

Select an array of subdocuments and include fields from the parent in each

Suppose I have a document like this:
{
_id: "a",
title: "Hello",
info: [
{
_id: "a1",
item: "b"
},
{
_id: "a2",
item: "c"
},
]
}
I want to query this document so that I get a result like:
[
{
title: "Hello",
_id: "a1",
item: "b"
},
{
title: "Hello",
_id: "a2",
item: "c"
}
]
If I only want to get a single one of those items, say the item where _id: "a1" I can do a query like
findOne ( { "info._id": "a1" }, { title: 1, "info._id": 1, "info.item": 1, _id: 0 } );
I will get the correct result for a single sub-document. My question is how do I expand this to work for each item?
Thanks in advance.
You can use the aggregation framework's $unwind operator to turn the one document with a nested array into two documents, each with a single nested info field. Then you can use $project to get rid of the nesting. In the last stage you can filter the reshaped documents using $match.
db.yourCollection.aggregate([
{ $unwind: "$info" },
{ $project: { title: 1, _id: "$info._id", item: "$info.item"} },
{ $match: {_id: "a1"}}
])
This query returns one document like this:
{ "title" : "Hello", "_id" : "a1", "item" : "b" }
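Since the question asks for every item rather than just a1, the same pipeline without the final $match should return one flattened document per element of info. The plain JavaScript below is only an in-memory illustration of what $unwind followed by $project produces for the sample document; the function name flattenInfo is hypothetical.

```javascript
// Hypothetical in-memory equivalent of:
//   { $unwind: "$info" },
//   { $project: { title: 1, _id: "$info._id", item: "$info.item" } }
function flattenInfo(docs) {
  return docs.flatMap((doc) =>
    // A missing info array yields no output, as a plain $unwind would.
    (doc.info || []).map((sub) => ({ title: doc.title, _id: sub._id, item: sub.item }))
  );
}

// Sample document from the question.
const docs = [
  { _id: "a", title: "Hello", info: [{ _id: "a1", item: "b" }, { _id: "a2", item: "c" }] }
];

const flattened = flattenInfo(docs);
console.log(flattened);
```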
The aggregation framework lets you filter embedded documents by combining the $unwind and $match stages.
Based on the description in the question, try running the following aggregate query in the MongoDB shell.
db.myCollection.aggregate(
// Pipeline
[
// Stage 1
{
$match: {
"_id": "a"
}
},
// Stage 2
{
$unwind: {
path: "$info",
preserveNullAndEmptyArrays: true
}
},
// Stage 3
{
$match: {
'info._id': 'a1'
}
},
// Stage 4
{
$project: {
title: 1,
_id: '$info._id',
item: '$info.item'
}
},
]
);

mongodb aggregation query for field value length's sum

Say, I have following documents:
{name: 'A', fav_fruits: ['apple', 'mango', 'orange'], 'type':'test'}
{name: 'B', fav_fruits: ['apple', 'orange'], 'type':'test'}
{name: 'C', fav_fruits: ['cherry'], 'type':'test'}
I am trying to find the total count of fav_fruits entries across all documents returned by:
cursor = db.collection.find({'type': 'test'})
I am expecting output like:
cursor.count() = 3 // Getting
I don't know much about aggregation. Can the MongoDB aggregation framework help me achieve either of these:
1. sum up the lengths of all 'fav_fruits' field: 6
and/or
2. unique 'fav_fruit' field values = ['apple', 'mango', 'orange', 'cherry']
You need to $project your documents after the $match stage and use the $size operator, which returns the number of items in each array. Then in the $group stage you use the $sum accumulator operator to return the total count.
db.collection.aggregate([
{ "$match": { "type": "test" } },
{ "$project": { "count": { "$size": "$fav_fruits" } } },
{ "$group": { "_id": null, "total": { "$sum": "$count" } } }
])
Which returns:
{ "_id" : null, "total" : 6 }
To get unique fav_fruits simply use .distinct()
> db.collection.distinct("fav_fruits", { "type": "test" } )
[ "apple", "mango", "orange", "cherry" ]
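As a sanity check on the expected numbers, the same two computations can be mirrored in plain JavaScript over the sample documents. This is an illustrative in-memory sketch with made-up variable names, not a substitute for running the pipeline on the server:

```javascript
// Sample documents from the question.
const docs = [
  { name: "A", fav_fruits: ["apple", "mango", "orange"], type: "test" },
  { name: "B", fav_fruits: ["apple", "orange"], type: "test" },
  { name: "C", fav_fruits: ["cherry"], type: "test" }
];

// $match on type
const matched = docs.filter((doc) => doc.type === "test");

// $size per document, then $sum across the single group
const total = matched.reduce((sum, doc) => sum + doc.fav_fruits.length, 0);

// distinct("fav_fruits"): unique values across all arrays
const unique = [...new Set(matched.flatMap((doc) => doc.fav_fruits))];

console.log(total, unique);
```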
Do this to get just the number of fruits in the fav_fruits array:
db.fruits.aggregate([
{ $match: { type: 'test' } },
{ $unwind: "$fav_fruits" },
{ $group: { _id: "$type", count: { $sum: 1 } } }
]);
This will return the total number of fruits.
But if you want to get the array of unique fav_fruits along with the total number of elements in the fav_fruits field of each document, do this:
db.fruits.aggregate([
{ $match: { type: 'test' } },
{ $unwind: "$fav_fruits" },
{ $group: { _id: "$type", count: { $sum: 1 }, fav_fruits: { $addToSet: "$fav_fruits" } } }
])
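You can try this. It may be helpful to you. Note that it counts the matching documents (3 for the sample data), not the elements of fav_fruits.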
db.collection.aggregate([{ $match : { type: "test" } }, {$group : { _id : null, count:{$sum:1} } }])

Get original document field as part of aggregate result

I want to get all of the document fields in my aggregate results, but as soon as I use $group they are gone. Using $project lets me re-add whatever fields I defined in $group, but I have had no luck getting the other fields:
var doc = {
_id: '123',
name: 'Bob',
comments: [],
attendances: [{
answer: 'yes'
}, {
answer: 'no'
}]
}
aggregate({
$unwind: '$attendances'
}, {
$match: {
"attendances.answer": { $ne:"no" }
}
}, {
$group: {
_id: '$_id',
attendances: { $sum: 1 },
comments: { $sum: { $size: { $ifNull: [ "$comments", [] ] }}}
}
}, {
$project: {
comments: 1,
}
})
This results in:
[{
_id: 5317b771b6504bd4a32395be,
comments: 12
},{
_id: 53349213cb41af00009a94d0,
comments: 0
}]
How do I get 'name' in there? I have tried adding to $group as:
name: '$name'
as well as in $project:
name: 1
But neither works.
You can't project fields that are removed during the $group operation.
Since you are grouping by the original document _id and there will only be one name value, you can preserve the name field using $first:
db.sample.aggregate(
{ $group: {
_id: '$_id',
comments: { $sum: { $size: { $ifNull: [ "$comments", [] ] }}},
name: { $first: "$name" }
}}
)
Example output would be:
{ "_id" : "123", "comments" : 0, "name" : "Bob" }
If you are grouping by criteria where there could be multiple values to preserve, you should either $push to an array in the $group or use $addToSet if you only want unique names.
Projecting all the fields
If you are using MongoDB 2.6 or later and want to get all of the original document fields (not just name) without listing them individually, you can use the aggregation variable $$ROOT in place of a specific field name.
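A minimal sketch of how $$ROOT might be used in the $group stage shown earlier. The output field name doc is an arbitrary choice for illustration, and the collection name sample is taken from the example above:

```javascript
// Pipeline sketch; the output field name "doc" is an assumption, not a MongoDB convention.
const pipeline = [
  {
    $group: {
      _id: "$_id",
      comments: { $sum: { $size: { $ifNull: ["$comments", []] } } },
      // $$ROOT refers to the whole document entering this stage.
      doc: { $first: "$$ROOT" }
    }
  }
];

// In the mongo shell this would be run as: db.sample.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Each result then carries the original fields under doc (for example doc.name). If the $group follows an $unwind, $$ROOT is the unwound document rather than the one stored in the collection.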

How to match multiple subdocuments in MongoDB?

Assuming that I have the following data in my books collection:
[
{
name: "Animal Farm",
readers: [
{
name: "Johny"
},
{
name: "Lisa"
}
],
likes: [
{
name: "Johny"
}
]
},
{
name: "1984",
readers: [
{
name: "Fred"
},
{
name: "Johny"
},
{
name: "Johny",
type: "bot"
}
],
likes: [
{
name: "Fred"
}
]
}
]
How do I retrieve all readers and likes that match name "Johny", with end result something like this:
[
{
name: "Animal Farm",
readers: [
{
name: "Johny"
}
],
likes: [
{
name: "Johny"
}
]
},
{
name: "1984",
readers: [
{
name: "Johny"
},
{
name: "Johny",
type: "bot"
}
],
likes: []
}
]
The following query is not possible:
db.books.find(
{ $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }] },
{ name: 1, "readers.$": 1, "likes.$": 1 })
MongoDB complains with the following error: Cannot specify more than one positional array element per query (currently unsupported).
I have tried to use aggregation framework but did not succeed. So is this possible with MongoDB or do I have to run two queries to retrieve needed results?
As Sammaye has already pointed out, specifying more than one positional array element is currently not supported.
However, you can use the $elemMatch projection operator to get close to the results you want. It limits the contents of each array to the first element that matches the $elemMatch condition:
db.books.find(
{ $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }] },
{
readers : { $elemMatch : { name : "Johny" }},
likes : { $elemMatch : { name : "Johny" }}
}
);
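One thing to keep in mind is that the $elemMatch projection returns at most the first matching element of each array, so for 1984 only one of the two Johny readers would come back. The plain JavaScript below is an in-memory illustration of that behaviour; the function name elemMatchFirst is hypothetical.

```javascript
// In-memory illustration of $elemMatch projection semantics:
// keep at most the FIRST element that satisfies the condition.
function elemMatchFirst(array, predicate) {
  const first = array.find(predicate);
  // When nothing matches, MongoDB omits the field; undefined models that here.
  return first === undefined ? undefined : [first];
}

// readers array of the "1984" document from the question.
const readers = [{ name: "Fred" }, { name: "Johny" }, { name: "Johny", type: "bot" }];

const projected = elemMatchFirst(readers, (reader) => reader.name === "Johny");
console.log(JSON.stringify(projected));
```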
Edit
Although MongoDB doesn't have a built-in operator to do what you want, you can achieve it by combining existing operators. But brace yourself, this is going to be a long one:
db.books.aggregate([
// find only documents that have correct "name"
{ $match: { $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }]}},
// unwind the documents so we can push them to a array
{ $unwind: '$likes' },
// do a group to conditionally push the values into the array
{ $group : {
_id : '$_id',
likes : {
$push : {
$cond : [
{ $eq : ["$likes.name", "Johny"]},
"$likes",
null
]
}
},
readers : { $first : "$readers" },
name : { $first : "$name" }
}},
// the process is repeated for the readers array
{ $unwind: '$readers' },
{ $group : {
_id : '$_id',
readers : {
$push : {
$cond : [
{ $eq : ["$readers.name", "Johny"]},
"$readers",
null
]
}
},
likes : { $first : "$likes" },
name : { $first : "$name" }
}},
// final step: remove the null values from the arrays
{ $project : {
name : 1,
readers : { $setDifference : [ "$readers", [null] ] },
likes : { $setDifference : [ "$likes", [null] ] },
}}
]);
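As you can see, you can do a "conditional" $push by using the $cond operator inside the $push. But after the group stage, your array will contain null values. You have to filter them out by using $setDifference. Be aware that $setDifference treats the arrays as sets, so duplicate entries are collapsed as well.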
Also note that you need to do unwind/group stages for each array you're building, otherwise a double unwind will duplicate the documents and you will end up with duplicate values in your arrays.
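If MongoDB 3.2 or later is available, the $filter operator may offer a shorter route, since it filters each array in a single $project stage with no unwinding or null placeholders. The pipeline below is a sketch under that version assumption. The keepJohny function is a hypothetical in-memory illustration of what the $project stage computes per document, not MongoDB code.

```javascript
// Sketch of a single-stage alternative (assumes MongoDB 3.2+ for $filter).
const pipeline = [
  { $match: { $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }] } },
  {
    $project: {
      name: 1,
      readers: {
        $filter: { input: "$readers", as: "reader", cond: { $eq: ["$$reader.name", "Johny"] } }
      },
      likes: {
        $filter: { input: "$likes", as: "like", cond: { $eq: ["$$like.name", "Johny"] } }
      }
    }
  }
];

// In-memory illustration of what the $project stage computes per document.
function keepJohny(book) {
  const isJohny = (person) => person.name === "Johny";
  return {
    name: book.name,
    readers: book.readers.filter(isJohny),
    likes: book.likes.filter(isJohny)
  };
}

// The "1984" document from the question.
const sample = {
  name: "1984",
  readers: [{ name: "Fred" }, { name: "Johny" }, { name: "Johny", type: "bot" }],
  likes: [{ name: "Fred" }]
};

const filtered = keepJohny(sample);
console.log(JSON.stringify(filtered));
```

In the mongo shell the pipeline would be passed as db.books.aggregate(pipeline).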
Following on from @ChristianP's answer:
db.books.aggregate(
// Match first so we don't unwind docs we don't need to
{$match: { $or: [{ "readers.name": "Johny" }, { "likes.name": "Johny" }] }},
{$unwind: '$readers'},
{$match: { "readers.name": "Johny" }},
{$unwind: '$likes'},
{$match: { "likes.name": "Johny" }},
{$group: {_id: '$_id', likes: {$push: '$likes'}, readers: {$push: '$readers'}}}
)
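Something like that should be able to do what you want; the functionality to do this in a query was shunned in favour of doing it this way. Note that because both $match stages after the unwinds must pass, a book is returned only if Johny appears in both readers and likes.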