MongoDB - is this query possible with a denormalized model?

I have this simple MongoDB document:
{
"_id" : ObjectId("55663d9361cfa81a5c48d54f"),
"name" : "Oliver",
"surname" : "Queen",
"age" : 25,
"friends" : [
{
"name" : "Jhon",
"surname" : "Diggle",
"age" : 30
},
{
"name" : "Barry",
"surname" : "Allen",
"age" : 24
}
]
}
Is it possible, using the denormalized model above, to find all of Oliver's friends who are 24 years old?
I think it's really simple with a normalized model; two queries are enough.
For example, the following query:
db.collection.find({name:"Oliver", "friends.age":24}, {_id:0, friends:1})
returns the whole array of Oliver's friends. Is it possible to select only the matching embedded documents?

Using aggregation, you can unwind the friends array and keep only the matching elements:
db.collection.aggregate(
[
{ $match: { "name": "Oliver" }},
{ $unwind: "$friends" },
{ $match: { "friends.age": 24 }},
{ $group: { "_id": "$_id", friends: { "$push": "$friends" }}},
{ $project: { "_id": 0, "friends": 1 }}
]
)
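To see what each stage of the pipeline above contributes, here is a minimal plain-JavaScript sketch that mimics the $match, $unwind, $match, $group sequence on an in-memory array. It is not MongoDB code: the `people` array and the `friendsAged` helper are hypothetical names, and the ages are assumed to be stored as numbers so that they compare equal to 24.

```javascript
// Hypothetical in-memory stand-in for the collection (ages stored as numbers).
const people = [
  {
    name: "Oliver", surname: "Queen", age: 25,
    friends: [
      { name: "Jhon", surname: "Diggle", age: 30 },
      { name: "Barry", surname: "Allen", age: 24 },
    ],
  },
];

// Mimics: $match on name, then $unwind + $match on friends.age + $group/$push.
function friendsAged(docs, name, age) {
  return docs
    .filter((doc) => doc.name === name) // first $match
    .map((doc) => ({
      // unwinding, matching and re-grouping amounts to filtering the array
      friends: doc.friends.filter((friend) => friend.age === age),
    }))
    .filter((doc) => doc.friends.length > 0); // documents with no matching friend drop out
}

const matchingFriends = friendsAged(people, "Oliver", 24);
console.log(JSON.stringify(matchingFriends));
```

If the ages were stored as strings such as "24", neither this sketch nor the pipeline's numeric match would find them. On MongoDB 3.2 and later, the $filter operator inside a $project stage can do the same array filtering without an $unwind.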

Related

MongoDB - Grouping by inner-documents and retrieving top results

I'm trying to find the most common (and least common) skills stored in a MongoDB database. I'm using Mongoose to retrieve the results.
User is the root document, and each User has an inner Profile document. The profile has a 'skills' attribute containing an array of ProfileSkillEntry documents, each of which has a title (the skill name).
return User.aggregate([{
$group: {
'_id': '$profile.skills.title',
'count': {
$sum: 1
}
}
}, {
$sort: {
'count': -1
}
}, {
$limit: 5
}]);
I expect it to combine the skills of all registered users, find the 5 most frequent ones, and return them. Instead it seems to group per user and gives invalid results.
Example User document structure:
{
"_id" : ObjectId("..."),
"firstName" : "Harry",
"lastName" : "Potter",
"profile" : {
"_id" : ObjectId("..."),
"skills" : [
{
"_id" : ObjectId("..."),
"title" : "Java",
"description" : "Master",
"dateFrom" : "31/07/2019",
"coreSkill" : true
},
{
"_id" : ObjectId("..."),
"title" : "JavaScript",
"description" : "Proficient",
"dateFrom" : "31/07/2019",
"coreSkill" : false
}
]
}
}
Use the query below, then add $sort and $limit stages as required:
db.test.aggregate(
[{ $unwind: { path: "$profile.skills"} },
{ $group: { _id: "$profile.skills.title",
"count": { $sum: 1 }} }] )

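The reason the original pipeline grouped per user is that, without $unwind, "$profile.skills.title" resolves to the whole array of titles in each document, so every distinct array becomes its own group key. The following plain-JavaScript sketch mimics the unwind, group, sort and limit steps on an in-memory array. The `users` data (including the second user "Ron") and the `topSkills` helper are hypothetical and only illustrate the counting logic.

```javascript
// Hypothetical in-memory stand-in for the User collection.
const users = [
  { firstName: "Harry", profile: { skills: [{ title: "Java" }, { title: "JavaScript" }] } },
  { firstName: "Ron", profile: { skills: [{ title: "Java" }] } },
];

function topSkills(docs, limit) {
  const counts = new Map();
  for (const user of docs) {
    // $unwind: visit every skill entry on its own
    for (const skill of user.profile.skills) {
      // $group on the title with { $sum: 1 }
      counts.set(skill.title, (counts.get(skill.title) || 0) + 1);
    }
  }
  return [...counts]
    .map(([title, count]) => ({ _id: title, count }))
    .sort((a, b) => b.count - a.count) // $sort: { count: -1 }
    .slice(0, limit);                  // $limit
}

const top = topSkills(users, 5);
console.log(JSON.stringify(top));
```

Sorting ascending instead (a.count - b.count) would give the least common skills.
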
Merge Multiple Document from same collection MongoDB

I have JSON data like this, and I want to apply an aggregation that groups by the from date:
{
"series": [
{
"id": "1",
"element": "111",
"data": [
{
"timeFrame": {
"from": "2016-01-01T00:00:00Z",
"to": "2016-01-31T23:59:59Z"
},
"value": 1
},
{
"timeFrame": {
"from": "2016-02-01T00:00:00Z",
"to": "2016-02-29T23:59:59Z"
},
"value": 2
}
]
}
]
}
and I have achieved this with the aggregation below:
db.getCollection('col1').aggregate([
{$unwind: "$data"},
{$group :{
element: {$first:"$relatedElement"},
_id : {
day : {$dayOfMonth: "$values.timeFrame.from"},
month:{$month: "$values.timeFrame.from"},
year:{$year: "$values.timeFrame.from"}
},
fromDate : { $first : "$values.timeFrame.from" },
total : {$sum : "$values.value"},
count : {$sum : 1},
}
},
{
$project: {
_id : 0,
element:1,
fromDate : '$fromDate',
avgValue : { $divide: [ "$total", "$count" ] }
}
}])
Output:
{
"id" : "1",
"element" : "3",
"fromDate" : ISODate("2017-05-01T00:00:00.000Z"),
"avgValue" : 0.0378787878787879
}
{
"id" : "1",
"element" : "3",
"fromDate" : ISODate("2017-04-30T22:00:00.000Z"),
"avgValue" : 0.416666666666667
}
But I am getting two documents, and I want to merge them into a single document like:
{
"id" : "1",
"element" : "3",
"average" : [
{
"fromDate" : ISODate("2017-05-01T00:00:00.000Z"),
"avgValue" : 0.0378787878787879
},
{
"fromDate" : ISODate("2017-04-30T22:00:00.000Z"),
"avgValue" : 0.416666666666667
}
]
}
Can anyone help me with this?
Add the following $group stage at the end of your aggregation pipeline to merge the current output documents into a single document:
{$group:{
_id:"$_id",
element: {$first: "$element"},
average:{$push:{
"fromDate": "$fromDate",
"avgValue": "$avgValue"
}}
}}
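As an illustration of what that final $group with $push does, here is a minimal plain-JavaScript sketch that folds the two output documents into one. It is an emulation, not MongoDB code: the `rows` array and the `mergeByElement` helper are hypothetical, the dates are kept as plain strings, and the sketch groups on `element` (an assumption made for the example), whereas the stage above groups on "$_id".

```javascript
// Hypothetical copy of the two documents produced by the earlier stages.
const rows = [
  { element: "3", fromDate: "2017-05-01T00:00:00.000Z", avgValue: 0.0378787878787879 },
  { element: "3", fromDate: "2017-04-30T22:00:00.000Z", avgValue: 0.416666666666667 },
];

// Emulates a $group that pushes { fromDate, avgValue } into an "average" array.
function mergeByElement(docs) {
  const merged = new Map();
  for (const { element, fromDate, avgValue } of docs) {
    if (!merged.has(element)) {
      merged.set(element, { element, average: [] });
    }
    merged.get(element).average.push({ fromDate, avgValue }); // $push
  }
  return [...merged.values()];
}

const mergedRows = mergeByElement(rows);
console.log(JSON.stringify(mergedRows, null, 2));
```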

mongo query/aggregation with equality inside array

I am trying to formulate a query over the sample bios collection http://docs.mongodb.org/manual/reference/bios-example-collection/:
Retrieve all the persons who received two awards in the same year.
The expected answers are "Ole-Johan Dahl" and "Kristen Nygaard"; for instance, the document for Ole-Johan Dahl is
{
"_id" : 5,
"name" : {
"first" : "Ole-Johan",
"last" : "Dahl"
},
"birth" : ISODate("1931-10-12T04:00:00Z"),
"death" : ISODate("2002-06-29T04:00:00Z"),
"contribs" : [
"OOP",
"Simula"
],
"awards" : [
{
"award" : "Rosing Prize",
"year" : 1999,
"by" : "Norwegian Data Association"
},
{
"award" : "Turing Award",
"year" : 2001,
"by" : "ACM"
},
{
"award" : "IEEE John von Neumann Medal",
"year" : 2001,
"by" : "IEEE"
}
]
}
So far, the best query that I could come up with is the following query using aggregation framework:
db.bios.aggregate([
{$project : { "first_name": "$name.first", "last_name": "$name.last" , "award1" :"$awards", "award2" :"$awards" } },
{$unwind : "$award1"},
{$unwind : "$award2"},
{$project : { "first_name": 1, "last_name": 1, "award1" : 1, "award2" : 1,
"super" : { $and : [ {$eq : ["$award1.year", "$award2.year"]},
{$lt: ["$award1.award", "$award2.award"]}
]
}}
},
{$match : {"super": true}}
])
However, I am not happy with this solution because:
the query projects awards twice and unwinds both copies in the following steps, which generates quadratically many intermediate documents;
the query computes an auxiliary field "super" that is only used for filtering afterwards.
Is there a better way to formulate this query?
Try the following aggregation pipeline:
db.bios.aggregate([
{
"$unwind": "$awards"
},
{
"$group": {
"_id": {
"year": "$awards.year",
"firstName": "$name.first",
"lastName": "$name.last"
},
"count": { "$sum": 1 },
"award_recepients": { "$push": "$name" }
}
},
{
"$match": { "count": 2 }
},
{
"$project": {
"_id": 0,
"year": "$_id.year",
"award_recepients": 1,
"count": 1
}
}
])
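The idea behind this pipeline is to count awards per (person, year) pair instead of comparing every award with every other award. A minimal plain-JavaScript sketch of that counting step follows. The `bios` array is a trimmed, partly hypothetical stand-in for the collection (the second person is invented), and `multiAwardYears` is a hypothetical helper name.

```javascript
// Trimmed stand-in for the bios collection; "Example Person" is invented.
const bios = [
  {
    name: { first: "Ole-Johan", last: "Dahl" },
    awards: [
      { award: "Rosing Prize", year: 1999 },
      { award: "Turing Award", year: 2001 },
      { award: "IEEE John von Neumann Medal", year: 2001 },
    ],
  },
  {
    name: { first: "Example", last: "Person" },
    awards: [{ award: "A", year: 1990 }, { award: "B", year: 1991 }],
  },
];

// Counts awards per (person, year) and keeps pairs with at least minCount awards.
function multiAwardYears(docs, minCount) {
  const found = [];
  for (const doc of docs) {
    const perYear = new Map();
    for (const award of doc.awards || []) { // $unwind
      perYear.set(award.year, (perYear.get(award.year) || 0) + 1); // $group + $sum
    }
    for (const [year, count] of perYear) {
      if (count >= minCount) { // comparable to the $match on count
        found.push({ first: doc.name.first, last: doc.name.last, year, count });
      }
    }
  }
  return found;
}

const sameYearWinners = multiAwardYears(bios, 2);
console.log(JSON.stringify(sameYearWinners));
```

This visits each award once, so the work grows linearly with the number of awards. Note that the sketch keeps years with two or more awards, while the $match above keeps only an exact count of 2; a match on { "$gte": 2 } would correspond to the sketch.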

MongoDB Aggregation using nested element

I have a collection with documents like this:
{
"_id" : "15",
"name" : "empty",
"location" : "5th Ave",
"owner" : "machine",
"visitors" : [
{
"type" : "M",
"color" : "blue",
"owner" : "Steve Cooper"
},
{
"type" : "K",
"color" : "red",
"owner" : "Luis Martinez"
},
// A lot more of these
]
}
I want to group by visitors.owner to find which owner has the most visits. I tried this:
db.mycol.aggregate(
[
{$group: {
_id: {owner: "$visitors.owner"},
visits: {$addToSet: "$visits"},
count: {$sum: "comments"}
}},
{$sort: {count: -1}},
{$limit: 1}
]
)
But I always get count = 0, and visits does not correspond to a single owner.
Please help
Try the following aggregation pipeline:
db.mycol.aggregate([
{
"$unwind": "$visitors"
},
{
"$group": {
"_id": "$visitors.owner",
"count": { "$sum": 1}
}
},
{
"$project": {
"_id": 0,
"owner": "$_id",
"visits": "$count"
}
}
]);
Using the sample document you provided in your question, the result is:
/* 0 */
{
"result" : [
{
"owner" : "Luis Martinez",
"visits" : 1
},
{
"owner" : "Steve Cooper",
"visits" : 1
}
],
"ok" : 1
}
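The pipeline above returns a count for every owner; to answer "which owner has the most visits" the counts still have to be sorted in descending order and cut to one entry, which in the pipeline would be the $sort and $limit stages from the question. A minimal plain-JavaScript sketch of the whole sequence is shown here. The `places` array is a hypothetical stand-in for the collection, with an extra invented visit by Steve Cooper so that there is a clear winner, and `visitsPerOwner` is a hypothetical helper name.

```javascript
// Hypothetical stand-in for the collection; the third visitor entry is invented.
const places = [
  {
    _id: "15",
    visitors: [
      { type: "M", color: "blue", owner: "Steve Cooper" },
      { type: "K", color: "red", owner: "Luis Martinez" },
      { type: "M", color: "green", owner: "Steve Cooper" },
    ],
  },
];

// Emulates $unwind on visitors, $group by owner with $sum: 1, then a descending sort.
function visitsPerOwner(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const visitor of doc.visitors) {
      counts.set(visitor.owner, (counts.get(visitor.owner) || 0) + 1);
    }
  }
  return [...counts]
    .map(([owner, visits]) => ({ owner, visits }))
    .sort((a, b) => b.visits - a.visits);
}

const ranking = visitsPerOwner(places);
const topOwner = ranking[0]; // equivalent to a limit of 1 after the sort
console.log(JSON.stringify(topOwner));
```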

How to query a mongo collection to return the full document with virtual fields containing calculated values from the sub-document?

I'm trying to query a collection for a specific document that contains a sub-document. From the values in that sub-document, I'd like to obtain the highest and lowest scores and return them as virtual fields on the original document.
I have the following dataset:
{
"_id" : "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e",
"name" : "Addison Hunt",
"tests" : [
{
"name" : "lorem",
"score" : 79
},
{
"name" : "vallum",
"score" : 100
},
{
"name" : "ipsum",
"score" : 65
}
],
"created_at" : 1401488865684,
"class" : "dolor sit amit",
"user_id" : "005G5635231325O4VIAU"
}
In mongo 2.4, how can I query mongo once to return the following result:
{
"_id" : "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e",
"name" : "Addison Hunt",
"tests" : [
{
"name" : "lorem",
"score" : 79
},
{
"name" : "vallum",
"score" : 100
},
{
"name" : "ipsum",
"score" : 65
}
],
"created_at" : 1401488865684,
"class" : "dolor sit amit",
"user_id" : "005G5635231325O4VIAU",
"worst_test": {
"name" : "ipsum",
"score" : 65
},
"best_test": {
"name" : "vallum",
"score" : 100
}
}
Where "best_test" and "worst_test" are virtual fields representing the tests with the highest and lowest scores, respectively.
I've tried many different approaches, and the closest I've gotten is this query:
db.students.aggregate([
{ $match: {
'_id': 'd0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e'
}},
{ $unwind: '$tests' },
{ $sort: {'tests.score': 1} },
{ $group: {
_id: '$_id',
student_tests: {$push: "$$ROOT"},
worst_test: {$first: '$tests'},
best_test: { $last: '$tests' }
}}
]);
Which yields this result:
{
"_id" : "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e",
"student_tests" : [
{
"name" : "Addison Hunt",
"tests" : [
{
"name" : "ipsum",
"score" : 65
}
],
"created_at" : 1401488865684,
"class" : "dolor sit amit",
"user_id" : "005G5635231325O4VIAU",
},
{
"name" : "Addison Hunt",
"tests" : [
{
"name" : "lorem",
"score" : 79
}
],
"created_at" : 1401488865684,
"class" : "dolor sit amit",
"user_id" : "005G5635231325O4VIAU",
},
{
"name" : "Addison Hunt",
"tests" : [
{
"name" : "vallum",
"score" : 100
}
],
"created_at" : 1401488865684,
"class" : "dolor sit amit",
"user_id" : "005G5635231325O4VIAU",
},
],
"worst_test": {
"name" : "ipsum",
"score" : 65
},
"best_test": {
"name" : "vallum",
"score" : 100
}
}
If you are using $$ROOT, then you are in fact using MongoDB 2.6, as this aggregation variable was only introduced in that version.
While handy for various things, all it does is represent the entire document at the pipeline stage where it is used. To return the original document unmodified but with additional fields, you could assign $$ROOT to the _id field in a $project stage before the $unwind. You would still not get exactly the same document back, though, because a final $project is needed to restore the correct document shape from those elements.
Your best bet is to project the fields explicitly while keeping an unaltered copy of the array before any $sort is applied:
db.students.aggregate([
{ "$match": {
"_id": "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e"
}},
{ "$project": {
"name": 1,
"tests": 1,
"created_at": 1,
"class": 1,
"user_id": 1,
"testCopy": "$tests"
}},
{ "$unwind": "$testCopy" },
{ "$sort": { "testCopy.score": 1 } },
{ "$group": {
"_id": "$_id",
"name": { "$first": "$name" },
"tests": { "$first": "$tests" },
"created_at": { "$first": "$created_at" },
"class": { "$first": "$class" },
"user_id": { "$first": "$user_id" },
"worst_test": { "$first": "$testCopy" },
"best_test": { "$last": "$testCopy" }
}}
]);
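The key point of this pipeline is that the original tests array is carried along untouched while a separate copy is unwound and sorted to find the first and last entries. A minimal plain-JavaScript sketch of that idea on a single in-memory document is given here. It is an emulation rather than MongoDB code, and `withBestAndWorst` is a hypothetical helper name.

```javascript
// Shortened copy of the sample document from the question.
const student = {
  _id: "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e",
  name: "Addison Hunt",
  tests: [
    { name: "lorem", score: 79 },
    { name: "vallum", score: 100 },
    { name: "ipsum", score: 65 },
  ],
};

// Sorts a copy of tests (like testCopy) and leaves the original array as it was.
function withBestAndWorst(doc) {
  const testCopy = [...doc.tests].sort((a, b) => a.score - b.score); // $sort ascending
  return {
    ...doc,                                  // all original fields, in their original order
    worst_test: testCopy[0],                 // $first after the sort
    best_test: testCopy[testCopy.length - 1] // $last after the sort
  };
}

const enriched = withBestAndWorst(student);
console.log(JSON.stringify(enriched, null, 2));
```

For a document with an empty tests array, both added fields would be undefined in this sketch.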
Or, using $$ROOT as mentioned before (alternatively, place the fields under _id individually in the $project):
db.students.aggregate([
{ "$match": {
"_id": "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e"
}},
{ "$project": {
"_id": "$$ROOT",
"tests": 1
}},
{ "$unwind": "$tests" },
{ "$sort": { "tests.score": 1 } },
{ "$group": {
"_id": "$_id",
"aworst_test": { "$first": "$tests" },
"abest_test": { "$last": "$tests" }
}},
{ "$project": {
"_id": "$_id._id",
"name": "$_id.name",
"tests": "$_id.tests",
"created_at": "$_id.created_at",
"class": "$_id.class",
"user_id": "$_id.user_id",
"worst_test": "$aworst_test",
"best_test": "$abest_test"
}}
]);
But as you can see, you still have to do the $project work somewhere to get the structure you want. The temporarily renamed fields ("aworst_test", "abest_test") are there to preserve the field order: $project otherwise "optimizes" by keeping any fields that have not been renamed in place and appending new fields after them.
There is really no simple way to "get all fields" back exactly as you originally found them. Operations like $project and $group are an "all or nothing" affair: they only produce what you explicitly tell them to.