How to find match in documents in Mongo and Mongo aggregation? - mongodb

I have the following JSON structure in a Mongo collection:
{
"students":[
{
"name":"ABC",
"fee":1233
},
{
"name":"PQR",
"fee":345
}
],
"studentDept":[
{
"name":"ABC",
"dept":"A"
},
{
"name":"XYZ",
"dept":"X"
}
]
},
{
"students":[
{
"name":"XYZ",
"fee":133
},
{
"name":"LMN",
"fee":56
}
],
"studentDept":[
{
"name":"XYZ",
"dept":"X"
},
{
"name":"LMN",
"dept":"Y"
},
{
"name":"ABC",
"dept":"P"
}
]
}
Now I want to calculate the following output.
Where students.name = studentDept.name,
my result should be as below:
{
"name":"ABC",
"fee":1233,
"dept":"A"
},
{
"name":"XYZ",
"fee":133,
"dept":"X"
},
{
"name":"LMN",
"fee":56,
"dept":"Y"
}
Do I need to use Mongo aggregation, or is it possible to get the output given above without using aggregation?

What you are really asking here is how to make MongoDB return something that is quite different from the form in which you store it in your collection. The standard query operations do allow a limited form of "projection", but this is really only about "limiting" the fields displayed in results based on what is already present in your document.
So any form of "alteration" requires some form of aggregation: both the aggregate and mapReduce operations allow you to "re-shape" the document results into a form that differs from the input. Perhaps the main thing people miss about the aggregation framework in particular is that it is not all about "aggregating"; in fact the "re-shaping" concept is core to its implementation.
So in order to get results how you want, you can take an approach like this, which should be suitable for most cases:
db.collection.aggregate([
{ "$unwind": "$students" },
{ "$unwind": "$studentDept" },
{ "$group": {
"_id": "$students.name",
"tfee": { "$first": "$students.fee" },
"tdept": {
"$min": {
"$cond": [
{ "$eq": [
"$students.name",
"$studentDept.name"
]},
"$studentDept.dept",
false
]
}
}
}},
{ "$match": { "tdept": { "$ne": false } } },
{ "$sort": { "_id": 1 } },
{ "$project": {
"_id": 0,
"name": "$_id",
"fee": "$tfee",
"dept": "$tdept"
}}
])
Or alternately just "filter out" the cases where the two "name" fields do not match and then just project the content with the fields you want, if crossing content between documents is not important to you:
db.collection.aggregate([
{ "$unwind": "$students" },
{ "$unwind": "$studentDept" },
{ "$project": {
"_id": 0,
"name": "$students.name",
"fee": "$students.fee",
"dept": "$studentDept.dept",
"same": { "$eq": [ "$students.name", "$studentDept.name" ] }
}},
{ "$match": { "same": true } },
{ "$project": {
"name": 1,
"fee": 1,
"dept": 1
}}
])
From MongoDB 2.6 and upwards you can even do the same thing "inline" within the document, between the two arrays. You still want to reshape that array content in your final output, but it is possibly done a little faster:
db.collection.aggregate([
// Compares entries in each array within the document
{ "$project": {
"students": {
"$map": {
"input": "$students",
"as": "stu",
"in": {
"$setDifference": [
{ "$map": {
"input": "$studentDept",
"as": "dept",
"in": {
"$cond": [
{ "$eq": [ "$$stu.name", "$$dept.name" ] },
{
"name": "$$stu.name",
"fee": "$$stu.fee",
"dept": "$$dept.dept"
},
false
]
}
}},
[false]
]
}
}
}
}},
// Students is now an array of arrays. So unwind it twice
{ "$unwind": "$students" },
{ "$unwind": "$students" },
// Rename the fields and exclude
{ "$project": {
"_id": 0,
"name": "$students.name",
"fee": "$students.fee",
"dept": "$students.dept"
}},
])
So where you want to essentially "alter" the structure of the output, you need to use one of the aggregation tools to do so. And you can, even if you are not really aggregating anything.
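To make the intent of the pipelines above concrete, here is a minimal plain-JavaScript sketch of the same matching, run in memory rather than inside MongoDB. The helper name `joinStudents` and the variable `docs` are hypothetical; the data is copied from the question. It only pairs entries within the same document, which corresponds to the second and third pipelines.

```javascript
// Sample documents copied from the question.
const docs = [
  {
    students: [{ name: "ABC", fee: 1233 }, { name: "PQR", fee: 345 }],
    studentDept: [{ name: "ABC", dept: "A" }, { name: "XYZ", dept: "X" }]
  },
  {
    students: [{ name: "XYZ", fee: 133 }, { name: "LMN", fee: 56 }],
    studentDept: [
      { name: "XYZ", dept: "X" },
      { name: "LMN", dept: "Y" },
      { name: "ABC", dept: "P" }
    ]
  }
];

// Hypothetical helper: pair every student with the department entries
// that share its name inside the same document.
function joinStudents(documents) {
  const result = [];
  for (const doc of documents) {
    for (const stu of doc.students) {        // like $unwind on "students"
      for (const dep of doc.studentDept) {   // like $unwind on "studentDept"
        if (stu.name === dep.name) {         // like the $eq comparison
          result.push({ name: stu.name, fee: stu.fee, dept: dep.dept });
        }
      }
    }
  }
  return result;
}

console.log(joinStudents(docs));
```

With the question's data this yields the three rows for ABC, XYZ and LMN; PQR has no department entry in its document and is dropped.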

Related

Mongo DB: How to aggregate on self Collection to pull Value(s) associated to the IDs in Hierarchical Dataset

I'm trying to write an aggregation in Mongo which would result in something similar to SQL.
I'm now trying to achieve the same in Mongo with the above collection.
Please Suggest me how to build Mongo Aggregation in order to achieve my output.
Unwind both module_details and module_child then match them.
[
{
"$unwind": "$module.module_details.data"
},
{
"$unwind": "$module.module_child.data"
},
{
"$match": {
"$expr": {
"$eq": [
"$module.module_details.data.module_child_id",
"$module.module_child.data.module_child_id"
]
}
}
},
{
"$project": {
"_id": 0,
"module_id": "$module.module_details.data.module_id",
"name": "$module.module_child.data.name",
"value": "$module.module_details.data.value"
}
}
]
You probably need to match on module_id as well. However, it was not a part of the question.
[
{
"$match": {
"module_id": "9898"
}
},
{
"$unwind": "$module.module_details.data"
},
{
"$unwind": "$module.module_child.data"
},
{
"$match": {
"$expr": {
"$eq": [
"$module.module_details.data.module_child_id",
"$module.module_child.data.module_child_id"
]
}
}
},
{
"$project": {
"_id": 0,
"module_id": "$module.module_details.data.module_id",
"name": "$module.module_child.data.name",
"value": "$module.module_details.data.value"
}
}
]

Mongo Group and sum with two fields

I have documents like:
{
"from":"abc#sss.ddd",
"to" :"ssd#dff.dff",
"email": "Hi hello"
}
How can we calculate the combined count for "from and to" and "to and from", i.e. the number of communications between two people?
I am able to calculate a one-way sum, but I want the sum both ways.
db.test.aggregate([
{ $group: {
"_id":{ "from": "$from", "to":"$to"},
"count":{$sum:1}
}
},
{
"$sort" :{"count":-1}
}
])
Since you need to calculate the number of emails exchanged between two addresses, it would be fair to project a unified "between" field as follows:
db.a.aggregate([
{ $match: {
to: { $exists: true },
from: { $exists: true },
email: { $exists: true }
}},
{ $project: {
between: { $cond: {
if: { $lte: [ { $strcasecmp: [ "$to", "$from" ] }, 0 ] },
then: [ { $toLower: "$to" }, { $toLower: "$from" } ],
else: [ { $toLower: "$from" }, { $toLower: "$to" } ] }
}
}},
{ $group: {
"_id": "$between",
"count": { $sum: 1 }
}},
{ $sort :{ count: -1 } }
])
Unification logic should be quite clear from the example: it is an alphabetically sorted array of both emails. The $match and $toLower parts are optional if you trust your data.
Documentation for operators used in the example:
$match
$exists
$project
$cond
$lte
$strcasecmp
$toLower
$group
$sum
$sort
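The unification logic can also be sketched in plain JavaScript, outside MongoDB. This is only an in-memory illustration under assumptions: `betweenKey` and `countConversations` are hypothetical names, and all sample messages except the first are made up for the example.

```javascript
// Hypothetical helper: build an order-independent key for a sender/recipient pair,
// mirroring the $cond / $strcasecmp / $toLower projection.
function betweenKey(from, to) {
  const a = from.toLowerCase();
  const b = to.toLowerCase();
  return a <= b ? [a, b] : [b, a];
}

// Hypothetical helper: count messages per unordered pair, highest count first.
function countConversations(messages) {
  const counts = new Map();
  for (const msg of messages) {
    const key = JSON.stringify(betweenKey(msg.from, msg.to));
    counts.set(key, (counts.get(key) || 0) + 1);   // like $group with $sum: 1
  }
  return [...counts.entries()]
    .map(([key, count]) => ({ _id: JSON.parse(key), count }))
    .sort((x, y) => y.count - x.count);            // like $sort: { count: -1 }
}

const messages = [
  { from: "abc#sss.ddd", to: "ssd#dff.dff", email: "Hi hello" },
  { from: "ssd#dff.dff", to: "abc#sss.ddd", email: "Hello back" },
  { from: "abc#sss.ddd", to: "xyz#ggg.hhh", email: "Other thread" }
];

console.log(countConversations(messages));
```

The first two messages go in opposite directions but produce the same key, so they are counted together.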
You basically need to consider the _id for grouping as an "array" of the possible "to" and "from" values, and then of course "sort" them, so that in every document the combination is always in the same order.
Just as a side note, I want to add that "typically" when I am dealing with messaging systems like this, the "to" and "from" sender/recipients are usually both arrays to begin with anyway, so that usually forms the base from which different variations on this statement come.
First, the most optimal MongoDB 3.2 statement, for single addresses:
db.collection.aggregate([
// Join in array
{ "$project": {
"people": [ "$to", "$from" ],
}},
// Unwind array
{ "$unwind": "$people" },
// Sort array
{ "$sort": { "_id": 1, "people": 1 } },
// Group document
{ "$group": {
"_id": "$_id",
"people": { "$push": "$people" }
}},
// Group people and count
{ "$group": {
"_id": "$people",
"count": { "$sum": 1 }
}}
]);
That's the basics, and now the only variations are in the construction of the "people" array (stage 1 only above).
MongoDB 3.x and 2.6.x - Arrays
{ "$project": {
"people": { "$setUnion": [ "$to", "$from" ] }
}}
MongoDB 3.x and 2.6.x - Fields to array
{ "$project": {
"people": {
"$map": {
"input": ["A","B"],
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "A", "$$el" ] },
"$to",
"$from"
]
}
}
}
}}
MongoDB 2.4.x and 2.2.x - from fields
{ "$project": {
"to": 1,
"from": 1,
"type": { "$const": [ "A", "B" ] }
}},
{ "$unwind": "$type" },
{ "$group": {
"_id": "$_id",
"people": {
"$addToSet": {
"$cond": [
{ "$eq": [ "$type", "A" ] },
"$to",
"$from"
]
}
}
}}
But in all cases:
Get all recipients into a distinct array.
Sort the array into a consistent order.
Group on the "always in the same order" list of recipients.
Follow that and you cannot go wrong.
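The three steps above can be sketched in plain JavaScript for the case where "to" and "from" may themselves be arrays. This is an in-memory illustration only; the helper names and the sample data are hypothetical.

```javascript
// Hypothetical helper: normalise a message into a sorted, distinct list of participants.
function participants(msg) {
  const asArray = (value) => (Array.isArray(value) ? value : [value]);
  // Step 1: get all senders and recipients into one distinct array.
  const distinct = new Set([...asArray(msg.from), ...asArray(msg.to)]);
  // Step 2: sort the array into a consistent order.
  return [...distinct].sort();
}

// Step 3: group on the "always in the same order" list and count.
function countByParticipants(messages) {
  const counts = new Map();
  for (const msg of messages) {
    const key = JSON.stringify(participants(msg));
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, count]) => ({ _id: JSON.parse(key), count }));
}

// Made-up sample data for illustration.
const sample = [
  { from: "a", to: ["b", "c"] },
  { from: "c", to: ["a", "b"] },
  { from: "a", to: "b" }
];

console.log(countByParticipants(sample));
```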

Get documents with nested objects matching count condition

I am a mongo noob and am working with a mongo collection with records that look like so:
{
"cats": [
{
"name": "fluffy",
"color": "red"
},
{
"name": "snowball",
"color": "white"
}
]
}
I would like to perform a query that gets all records that have more than one white cat. MapReduce looks promising, but seems like overkill. Any help is appreciated.
You can use the aggregation framework to do this. You don't need to use the $where operator.
db.collection.aggregate([
{ "$match": { "cats.color": "white" }},
{ "$project": {
"nwhite": { "$map": {
"input": "$cats",
"as": "c",
"in": { "$cond": [
{ "$eq": [ "$$c.color", "white" ] },
1,
0
]}
}},
"cats": 1
}},
{ "$unwind": "$nwhite" },
{ "$group": {
"_id": "$_id",
"cats": { "$first": "$cats" },
"nwhite": { "$sum": "$nwhite" }
}},
{ "$match": { "nwhite": { "$gte" :2 } } }
])
Use $where. It is an especially powerful operator, as it allows you to execute arbitrary JavaScript against each document.
For your specific case, try this:
db.collection.find({$where: function() {
return this.cats.filter(function(cat){
// Filter only white cats
return cat.color === 'white';
}).length >= 2;
}});
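Both answers boil down to counting the array elements that satisfy a condition and keeping documents where that count is at least two. A minimal in-memory sketch of that logic, mirroring the stages of the aggregation answer, might look like this. The helper names are hypothetical, and the second sample document is made up.

```javascript
// Hypothetical helper: count white cats in one document.
function countWhite(doc) {
  return doc.cats
    .map((c) => (c.color === "white" ? 1 : 0))   // like $map with $cond
    .reduce((sum, n) => sum + n, 0);             // like $unwind + $group with $sum
}

// Hypothetical helper: keep documents with two or more white cats.
function withSeveralWhiteCats(documents) {
  return documents.filter((doc) => countWhite(doc) >= 2);  // like the final $match
}

const catDocs = [
  // From the question: only one white cat.
  { _id: 1, cats: [{ name: "fluffy", color: "red" }, { name: "snowball", color: "white" }] },
  // Made-up document with two white cats.
  { _id: 2, cats: [{ name: "casper", color: "white" }, { name: "ghost", color: "white" }] }
];

console.log(withSeveralWhiteCats(catDocs).map((doc) => doc._id));
```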

Mongodb array concatenation

When querying mongodb, is it possible to process ("project") the result so as to perform array concatenation?
I actually have 2 different scenarios:
(1) Arrays from different fields, e.g.:
Given:
{companyName:'microsoft', managers:['ariel', 'bella'], employees:['charlie', 'don']}
{companyName:'oracle', managers:['elena', 'frank'], employees:['george', 'hugh']}
I'd like my query to return each company with its 'managers' and 'employees' concatenated:
{companyName:'microsoft', allPersonnel:['ariel', 'bella','charlie', 'don']}
{companyName:'oracle', allPersonnel:['elena', 'frank','george', 'hugh']}
(2) Nested arrays, e.g.:
Given the following docs, where employees are separated into nested arrays (never mind why, it's a long story):
{companyName:'microsoft', personnel:[ ['ariel', 'bella'], ['charlie', 'don'] ]}
{companyName:'oracle', personnel:[ ['elena', 'frank'], ['george', 'hugh'] ]}
I'd like my query to return each company with a flattened 'personnel' array:
{companyName:'microsoft', allPersonnel:['ariel', 'bella','charlie', 'don']}
{companyName:'oracle', allPersonnel:['elena', 'frank','george', 'hugh']}
I'd appreciate any ideas, using either 'find' or 'aggregate'
Thanks a lot :)
Of course, in modern MongoDB releases we can simply use $concatArrays here:
db.collection.aggregate([
{ "$project": {
"companyName": 1,
"allPersonnel": { "$concatArrays": [ "$managers", "$employees" ] }
}}
])
Or for the second form with nested arrays, using $reduce in combination:
db.collection.aggregate([
{ "$project": {
"companyName": 1,
"allPersonnel": {
"$reduce": {
"input": "$personnel",
"initialValue": [],
"in": { "$concatArrays": [ "$$value", "$$this" ] }
}
}
}}
])
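The $reduce form reads much like JavaScript's own Array.prototype.reduce, with "$$value" as the accumulator and "$$this" as the current element. A minimal in-memory sketch of the same flattening follows; the helper name `flattenPersonnel` is hypothetical and the document is taken from the question.

```javascript
// Document from the question (nested arrays).
const company = {
  companyName: "microsoft",
  personnel: [["ariel", "bella"], ["charlie", "don"]]
};

// Hypothetical helper: flatten one level of nesting.
function flattenPersonnel(doc) {
  return {
    companyName: doc.companyName,
    // value ~ "$$value", current ~ "$$this", [] ~ "initialValue"
    allPersonnel: doc.personnel.reduce(
      (value, current) => value.concat(current),
      []
    )
  };
}

console.log(flattenPersonnel(company));
```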
There is the $setUnion operator available to the aggregation framework. The constraint here is that these are "sets" and all the members are actually "unique" as a "set" requires:
db.collection.aggregate([
{ "$project": {
"companyName": 1,
"allPersonnel": { "$setUnion": [ "$managers", "$employees" ] }
}}
])
So that is cool, as long as all are "unique" and you are in singular arrays.
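The difference between concatenation and a set union only shows up when the arrays overlap. A small plain-JavaScript sketch of the two behaviours is below. The function names are hypothetical stand-ins, the overlapping data is made up, and this sketch keeps first-seen order, whereas MongoDB does not specify the order of elements in a $setUnion result.

```javascript
// Hypothetical stand-in for $concatArrays: keeps duplicates.
function concatArrays(a, b) {
  return a.concat(b);
}

// Hypothetical stand-in for $setUnion: each distinct value appears once.
function setUnion(a, b) {
  return [...new Set([...a, ...b])];
}

// Made-up data where "bella" is both a manager and an employee.
const managers = ["ariel", "bella"];
const employees = ["bella", "charlie"];

console.log(concatArrays(managers, employees));
console.log(setUnion(managers, employees));
```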
In the alternate case you can always process with $unwind and $group. The personnel nested array is a simple double unwind:
db.collection.aggregate([
{ "$unwind": "$personnel" },
{ "$unwind": "$personnel" },
{ "$group": {
"_id": "$_id",
"companyName": { "$first": "$companyName" },
"allPersonnel": { "$push": "$personnel" }
}}
])
Or the same thing as the first one for versions earlier than MongoDB 2.6 where the "set operators" did not exist:
db.collection.aggregate([
{ "$project": {
"type": { "$const": [ "M", "E" ] },
"companyName": 1,
"managers": 1,
"employees": 1
}},
{ "$unwind": "$type" },
{ "$unwind": "$managers" },
{ "$unwind": "$employees" },
{ "$group": {
"_id": "$_id",
"companyName": { "$first": "$companyName" },
"allPersonnel": {
"$addToSet": {
"$cond": [
{ "$eq": [ "$type", "M" ] },
"$managers",
"$employees"
]
}
}
}}
])

Selecting all objects from complex model

I have aggregation pipeline stage:
$project: {
'school': {
'id': '$_id',
'name': '$name',
'manager': '$manager'
},
'students': '$groups.students',
'teachers': '$groups.teachers'
}
Need something like this:
{
'users': // manager + students + teachers
}
Tried:
{
'users': {
$push: {
$each: ['$school.manager', '$students', '$teachers']
}
}
}
I'm presuming that "students" and "teachers" are both arrays here and located under a common sub-document heading like so:
{
"_id": 123,
"name": "This school",
"manager": "Bill",
"groups": {
"teachers": ["Ted"],
"students": ["Missy"]
}
}
So in order to get all of those in a singular array such as "users", it depends on your MongoDB version and the "uniqueness" of your data. For true "sets" and where you have MongoDB 2.6 or greater available, there is the $setUnion operator, albeit with an additional level of $group to make "manager" an array:
db.collection.aggregate([
{ "$group": {
"_id": { "_id": "$_id", "name": "$name" },
"manager": { "$push": "$manager" },
"groups": { "$first": "$groups" }
}},
{ "$project": {
"users": {
"$setUnion": [ "$manager", "$groups.teachers", "$groups.students" ]
}
}}
])
Or otherwise where that operator is not available or there is a "unique" problem then there is this way to handle "combining":
db.collection.aggregate([
{ "$group": {
"_id": { "_id": "$_id", "name": "$name" },
"manager": { "$push": "$manager" },
"teachers": { "$first": "$groups.teachers" },
"students": { "$first": "$groups.students" },
"type": { "$first": { "$const": ["M","T","S"] } }
}},
{ "$unwind": "$type" },
{ "$project": {
"users": {
"$cond": [
{ "$eq": [ "$type", "M" ] },
"$manager",
{ "$cond": [
{ "$eq": [ "$type", "T" ] },
"$teachers",
"$students"
]}
]
}
}},
{ "$unwind": "$users" },
{ "$group": {
"_id": "$_id",
"users": { "$push": "$users" }
}}
])
This essentially "tags" each field by a "type", for which the document is copied in the pipeline. Each copy is then placed into a single "users" field depending on which "type" matched. The arrays in the resulting three documents from each original can then be safely "unwound" and combined in a final $group operation.
So "sets" are your fastest option where available; where they are not available, or the data is not unique, you can use the latter technique to combine these into a single list.
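The "tagging" technique can be followed step by step in a plain-JavaScript sketch run in memory. The helper name `combineUsers` is hypothetical; the sample document is the one assumed in the answer.

```javascript
// Sample document as assumed in the answer.
const school = {
  _id: 123,
  name: "This school",
  manager: "Bill",
  groups: { teachers: ["Ted"], students: ["Missy"] }
};

// Hypothetical helper: combine manager, teachers and students into one list.
function combineUsers(doc) {
  const types = ["M", "T", "S"];
  // One copy of the document per tag, like $unwind on "type".
  const tagged = types.map((type) => ({ type, doc }));
  // Pick the field that matches the tag, like the nested $cond.
  const picked = tagged.map((entry) => {
    if (entry.type === "M") return [entry.doc.manager];   // manager wrapped as an array
    if (entry.type === "T") return entry.doc.groups.teachers;
    return entry.doc.groups.students;
  });
  // Flatten and collect, like the final $unwind and $group with $push.
  return picked.reduce((all, part) => all.concat(part), []);
}

console.log(combineUsers(school));
```

Unlike a set union, this keeps duplicates, so a person who is both a teacher and a student would appear twice.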