I have a MongoDB collection full of documents like this:
{
_id : xxxxxx,
category : 1,
tech : [
{key:"size",value:5},
{key:"color",value:"red"},
{key:"weight",value:27.4}
]
}
My question is: how can I aggregate (average, sum, or whatever) the value of each item with key = "size" across this collection?
Thank you for your help.
When you have documents that contain an array you use the $unwind operator in order to access the array elements.
db.tech.aggregate([
{ "$unwind": "$tech" },
{ "$match": { "tech.key": "size" } },
{ "$group": {
"_id": null,
"totalSize": { "$sum": "$tech.value" }
}}
])
Once the array is "un-wound", you can $group on whatever you want to use as the key in the _id field; to aggregate over all documents in the collection, use null. Any of the group accumulator operators ($sum, $avg, and so on) can be applied.
The array elements in the "de-normalized" documents will be available through "dot notation" as shown above.
Also see the full list of aggregation operators in the manual.
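To see what each stage does to the data, here is a minimal plain-JavaScript sketch that mimics $unwind, $match and $group in memory. It is not MongoDB code: the sample documents and the names docs, unwound and sizes are made up for illustration, and it only approximates the pipeline's semantics.

```javascript
// Hypothetical sample documents shaped like the ones in the question.
const docs = [
  { _id: 1, category: 1, tech: [{ key: "size", value: 5 }, { key: "color", value: "red" }] },
  { _id: 2, category: 1, tech: [{ key: "size", value: 7 }, { key: "weight", value: 27.4 }] },
];

// $unwind: emit one copy of the document per array element.
const unwound = docs.flatMap((doc) => doc.tech.map((item) => ({ ...doc, tech: item })));

// $match: keep only the elements whose key is "size".
const sizes = unwound.filter((doc) => doc.tech.key === "size");

// $group with _id: null: fold everything into a single result document.
const totalSize = sizes.reduce((sum, doc) => sum + doc.tech.value, 0);
const result = {
  _id: null,
  totalSize, // plays the role of { $sum: "$tech.value" }
  avgSize: sizes.length ? totalSize / sizes.length : null, // plays the role of { $avg: "$tech.value" }
};

console.log(result);
```

In the real pipeline, swapping "$sum" for "$avg" in the $group stage gives the average instead of the total.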
I have a MongoDB collection of documents formatted as shown below:
{
"_id" : ...,
"username" : "foo",
"challengeDetails" : [
{
"ID" : ...,
"pb" : 30081,
},
{
"ID" : ...,
"pb" : 23995,
},
...
]
}
How can I write a find query for records that have a challengeDetails entry with a matching ID, and sort them by the corresponding pb?
I have tried (this is using the NodeJS driver, which is why the projection syntax is weird)
const result = await collection
.find(
{ "challengeDetails.ID": challengeObjectID},
{
projection: {"challengeDetails.$": 1},
sort: {"challengeDetails.0.pb": 1}
}
)
.toArray();
This returns the correct records (documents with challengeDetails for only the matching ID) but they're not sorted.
I think this doesn't work because as the docs say:
When the find() method includes a sort(), the find() method applies the sort() to order the matching documents before it applies the positional $ projection operator.
But they don't explain how to sort after projecting. How would I write a query to do this? (I have a feeling aggregation may be required but am not familiar enough with MongoDB to write that myself)
You need to use aggregation to sort by an array element:
$unwind to deconstruct the array
$match to match the value
$sort for sorting
$group to reconstruct the array
Here is the code
db.collection.aggregate([
{ "$unwind": "$challengeDetails" },
{ "$match": { "challengeDetails.ID": 2 } },
{ "$sort": { "challengeDetails.pb": 1 } },
{
"$group": {
"_id": "$_id",
"username": { "$first": "$username" },
"challengeDetails": { $push: "$challengeDetails" }
}
}
])
Working Mongo playground
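If the documents themselves should come back ordered by the matching pb, one alternative is to narrow the array with $filter and then sort the documents, so no $group is needed afterwards. This is only a sketch, assuming MongoDB 3.4 or later (for $addFields). The helper name buildChallengePipeline is hypothetical, and challengeId stands for whatever ID value you are matching.

```javascript
// Hypothetical helper: builds a pipeline for a given challenge ID.
function buildChallengePipeline(challengeId) {
  return [
    // Keep only documents that contain the challenge at all.
    { $match: { "challengeDetails.ID": challengeId } },
    // Reduce the array to the matching element(s).
    {
      $addFields: {
        challengeDetails: {
          $filter: {
            input: "$challengeDetails",
            as: "detail",
            cond: { $eq: ["$$detail.ID", challengeId] },
          },
        },
      },
    },
    // Sort the documents by the pb of the remaining element.
    { $sort: { "challengeDetails.pb": 1 } },
  ];
}

const pipeline = buildChallengePipeline(2);
console.log(JSON.stringify(pipeline, null, 2));
```

With the Node.js driver this would be passed as collection.aggregate(buildChallengePipeline(challengeObjectID)).toArray().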
I have a collection objects.
{
"_id" : ObjectId("55fa65046db58e7d0c8b456a"),
"object_id" : "1651419",
"user" : {
"id" : "65593",
"cookie" : "9jgkm7ME1HDFD4K6j8WWvg",
},
"createddate" : ISODate("2015-09-17T10:00:20.945+03:00")
}
Every time a user visits an object's page, the visit is stored as a separate record in the collection. Now I need to get an array of the last N visited objects. It should be distinct, so the array should have N unique records. It should also be sorted by createddate.
So if the user visited object_id = 1, then object_id = 2 two times, then object_id = 3, and then object_id = 1 again, the array should contain:
{
visits : [1, 3, 2]
}
(distinct and sorted by time of last visit).
I tried a pipeline like this:
db.objects.aggregate([
{$match: {'user.id' : '65593'}},
{$sort: { 'createddate':-1 }},
{$project: {'id': '$user.id', 'obj' : '$object_id'}},
{$group: {_id:'$id', 'obj': {$addToSet: '$obj'}}},
{$project:{_id:0, 'obj':'$obj'}}
])
but it returns an array that is not sorted, and I also can't limit the array size.
The $addToSet operator, and "sets" in general in MongoDB, are not ordered in any way. Instead, get the "distinct" values by grouping on them first, then sort and build the array with $push:
db.objects.aggregate([
{ "$match": { "user.id": "65593" } },
{ "$sort": { "user.id": 1, "createddate": -1 } },
{ "$group": {
"_id": {
"_id": "$user.id",
"object_id": "$object_id"
},
"createddate": { "$first": "$createddate" }
}},
{ "$sort": { "_id._id": 1, "createddate": -1 } },
{ "$group": {
"_id": "$_id._id",
"obj": { "$push": "$_id.object_id" }
}}
])
So if you want the discovery order by date, you $sort first, but since $group does not guarantee any order of results, you need to $sort again before you group with the $push operation to build the array.
Note that you need to reduce "createddate" somehow, since the "distinct" items here are the combination of the "user.id" and "object_id" fields, so it needs some sort of accumulator and must be carried along for your ordering.
Then the array items will be in the order you expect.
If you need to $limit, then you must $unwind the array and limit the results. Alternatively, apply a $limit after the first $group and the $sort that follows it.
But of course this is only practical for a single main grouping _id, being "user.id". Future MongoDB releases will support $slice, which will make this practical for multiple grouping ids and a bit simpler in general. But it still won't be possible to "limit" the array items before that initial group over multiple primary grouping ids.
I found the solution I was looking for.
db.objects.aggregate([
{$match: {'user.id' : '65593'}},
{$group : {
_id : '$object_id',
dt : {$max: '$createddate'}
}
},
{$sort: {'dt':-1}},
{$limit:5},
{$group : {
_id :null,
'objects' : {$push:'$_id'}
}
},
{$project: {_id:0, 'objects':'$objects'}}
])
It returns an array of at most N distinct objects, sorted by createddate in descending order.
Thanks, everyone, for the help!
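As a rough check of the logic, the same steps can be mimicked in plain JavaScript. This is only an in-memory sketch, not MongoDB code: the function name lastDistinctVisits and the sample records are made up, with the visit order taken from the example in the question.

```javascript
// Hypothetical in-memory version of the pipeline above.
function lastDistinctVisits(records, userId, n) {
  // $match + first $group: remember the latest createddate per object_id.
  const latest = new Map();
  for (const rec of records) {
    if (rec.user.id !== userId) continue;
    const seen = latest.get(rec.object_id);
    if (seen === undefined || rec.createddate > seen) {
      latest.set(rec.object_id, rec.createddate);
    }
  }
  // $sort (newest first), $limit, then $push the ids into one array.
  return [...latest.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([objectId]) => objectId);
}

// Visit order from the question: 1, 2, 2, 3, 1 (one minute apart).
const records = ["1", "2", "2", "3", "1"].map((objectId, i) => ({
  object_id: objectId,
  user: { id: "65593" },
  createddate: new Date(2015, 8, 17, 10, i),
}));

console.log(lastDistinctVisits(records, "65593", 5)); // [ '1', '3', '2' ]
```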
In mongo, I can do this:
db.HI.aggregate({$project: {new_val: '$tags.first'}})
However, this doesn't work:
db.HI.aggregate({$project: {new_val: '$my_array.0'}})
Does it mean that aggregation doesn't support array in this way? Is there any alternative?
At present the aggregation framework doesn't support this; there are in-progress JIRA tickets for it.
An alternative is to first $unwind the array, then $group the deconstructed array documents by the _id key. In the grouped documents, retrieve the first array element with the $first group accumulator operator:
db.HI.aggregate([
{
"$unwind": "$my_array"
},
{
"$group": {
"_id": "$_id",
"new_val": { "$first": "$my_array" }
}
}
])
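On newer servers there is a more direct route. As a sketch, assuming MongoDB 3.2 or later, the $arrayElemAt expression picks an element by index inside $project without unwinding. The variable name firstElementPipeline is only for illustration.

```javascript
// Sketch: take element 0 of my_array directly in $project (MongoDB 3.2+ assumed).
const firstElementPipeline = [
  { $project: { new_val: { $arrayElemAt: ["$my_array", 0] } } },
];

// In the mongo shell this would be run as:
//   db.HI.aggregate(firstElementPipeline)
console.log(JSON.stringify(firstElementPipeline));
```

A negative index counts from the end of the array, so -1 would select the last element.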
I want to find the accepted body parts that have an active status.
I tried this:
db.patients.find({
"injury.injurydata.injuryinformation.dateofinjury": {
"$gte": ISODate("2014-05-21T08:00:00Z"),
"$lt": ISODate("2014-06-03T08:00:00Z")
}
},
{
"injury.injurydata.acceptedbodyparts": 1,
"injury.injurydata.injuryinformation.dateofinjury": 1,
"injury": {
$elemMatch: {
"injury.injurydata.acceptedbodyparts.status": "current"
}
}
})
but I still get both array elements back.
If acceptedbodyparts is an array, you can't query acceptedbodyparts.status. If status is a field on the documents contained in the array, you would need to use another $elemMatch clause in your query. So the last part would look something like this:
{"injury":{ "$elemMatch": { "injurydata.acceptedbodyparts": {"$elemMatch": {"status":"current"} }} }}
I also removed the injury. prefix in the first $elemMatch because you're querying data within the injury array.
Note that this will return the entire document with the full array, as long as it contains the document you're searching for. If your intention is to retrieve a particular element in an array, $elemMatch is the wrong approach.
Standard projection will not work with nested arrays or limiting any fields inside arrays. For that you need the aggregation framework:
db.patients.aggregate([
// First match, Matches documents
{ "$match": {
"injury.injurydata.injuryinformation.dateofinjury": {
"$gte": ISODate("2014-05-21T08:00:00Z"),
"$lt": ISODate("2014-06-03T08:00:00Z")
}
}},
// Un-wind the arrays
{ "$unwind": "$injury" },
{ "$unwind": "$injury.injurydata" },
{ "$unwind": "$injury.injurydata.acceptedbodyparts" },
// Now match the required data in the array
{ "$match": {
"injury.injurydata.acceptedbodyparts.status": "current"
}},
// Group only wanted fields
{ "$group": {
"_id": "$_id",
"acceptedbodyparts": {
"$push": "$injury.injurydata.acceptedbodyparts"
}
}}
])
You can add in other fields from outside the array, either by using $first or by making them part of the _id in the grouping.
This kind of filtering is outside the scope of standard projection; the aggregation framework, with its extended manipulation capabilities, is what solves it.
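To make the flattening easier to picture, here is a plain-JavaScript sketch of what the three $unwind stages, the second $match and the $group do to one document. It is not MongoDB code. The sample document is hypothetical: it assumes injury, injurydata and acceptedbodyparts are all arrays, and the bodypart field is invented for the example.

```javascript
// Hypothetical patient document; the real schema may differ.
const patient = {
  _id: 1,
  injury: [
    {
      injurydata: [
        {
          acceptedbodyparts: [
            { bodypart: "knee", status: "current" },
            { bodypart: "wrist", status: "closed" },
          ],
        },
      ],
    },
  ],
};

// The three $unwind stages flatten the nesting level by level,
// the second $match keeps "current" parts, and $group/$push collects them.
const acceptedbodyparts = patient.injury
  .flatMap((injury) => injury.injurydata)
  .flatMap((data) => data.acceptedbodyparts)
  .filter((part) => part.status === "current");

const grouped = { _id: patient._id, acceptedbodyparts };
console.log(JSON.stringify(grouped));
```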
This is a very easy question, just having a really bad brain freeze. In my aggregation, I just want to remove the '_id' field by using $project but return everything else. However, I'm getting
$projection requires at least one output field
I would think it's like :
db.coll.aggregate( [ { $match .... }, { $project: { _id: 0 }}])
From v4.2, you can use the $unset aggregation stage to remove one or more fields. You can also exclude fields from an embedded document using dot notation.
To remove a single field:
db.coll.aggregate([ { $unset: "_id" } ])
To remove multiple fields:
db.coll.aggregate([ { $unset: [ "_id", "name" ] } ])
To remove embedded fields:
db.coll.aggregate([
{ $unset: [ "_id", "author.name" ] }
])
You need to explicitly include fields when using aggregation either via various pipeline operations or via $project. There currently isn't a way to return all fields unless explicitly defined by field name:
$project : {
_id : 0,
"Name" : 1,
"Address" : 1
}
You can exclude the _id using the technique you used and as shown above.
You can do this just with the exact syntax that you wrote in your question.
Example document:
Person
{
_id: ObjectId('6023a13b756e30fec9f77b26'),
name: 'Pablo',
lastname: 'Presidente',
}
If you run an aggregation, for example one with $lookup, you can remove, let's say, the _id field like this:
db.person.aggregate( [ { task1 }, { ... }, { taskN }, { $project: { _id: 0 }}])
Also this way you can exclude fields for other related documents to your aggregation; you would do that like this:
db.person.aggregate( [ { task1 }, { ... }, { taskN }, { $project: { _id: 0, 'otherDocument._id': 0 }}])
Performance wise I don't know if this is any good, but leaving that aside, this works like a charm!
More info: https://docs.mongodb.com/manual/reference/operator/aggregation/project/#exclude-fields-from-output-documents
Also, you can use $unset but this I haven't tried out.
From the docs:
$unset and $project
The $unset is an alias for the $project stage that removes/excludes fields
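To illustrate that equivalence, here is a small sketch that turns a list of field names into an exclusion-only $project stage, which per the quote above should behave like passing the same list to $unset. The helper name excludeFields is hypothetical.

```javascript
// Hypothetical helper: turn a list of field names into a $project exclusion stage.
function excludeFields(fields) {
  const projection = {};
  for (const field of fields) {
    projection[field] = 0; // 0 means "leave this field out"
  }
  return { $project: projection };
}

const stage = excludeFields(["_id", "author.name"]);
console.log(JSON.stringify(stage)); // {"$project":{"_id":0,"author.name":0}}
```

The resulting stage can be appended to the end of any pipeline, for example db.coll.aggregate([ ...otherStages, stage ]), where otherStages is a placeholder for your own stages.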
If you are doing a simple find, it is mostly the same; here are the docs with some examples:
https://docs.mongodb.com/manual/tutorial/project-fields-from-query-results/#return-all-but-the-excluded-fields
Hope this is useful, best regards!
PR