How can I do a correct push into my aggregated list?
db.getCollection('rty').aggregate(
{ $match: {'id': 110451}},
{ $unwind: '$matches'},
{ $match: {'matches.majority.uuid': {'$exists': true}}},
{ $group: {_id: '$id', list: {$push: {'$matches.majority.uuid' , 'matches.majority.confidence'}}}})
When I push only the uuid it works, but how can I use two fields here?
Refer to $push in aggregation and try it as below:
db.getCollection('rty').aggregate(
{ $match: {'id': 110451}},
{ $unwind: '$matches'},
{ $match: {'matches.majority.uuid': {'$exists': true}}},
{ $group: {_id: '$id', list:
{$push:
{uid: '$matches.majority.uuid' ,
conf: '$matches.majority.confidence'}}}});
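With the field path corrected (note the missing $ on the confidence path in the original), each group comes out as one document per id with a list of small objects, roughly like this (the uid/conf values are illustrative, not taken from real data):
// Illustrative result shape (values made up)
{
  _id: 110451,
  list: [
    {uid: 'aaaa-1111', conf: 0.97},
    {uid: 'bbbb-2222', conf: 0.81}
  ]
}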
Related
I have two collections where I'm trying to do an aggregation query with filter options. I have looked online but I couldn't find a solution for this.
Col 1
[
{
_id: ObjectId('st_123'),
stud_num: 123,
school: ObjectId('sc_123'),
gender: 'M'
},
{
_id: ObjectId('st_234'),
stud_num: 123,
school: ObjectId('sc_234'),
gender: 'F'
},
{
_id: ObjectId('st_345'),
stud_num: 123,
school: ObjectId('sc_345'),
gender: 'M'
}
]
Col 2
[
{
_id: ObjectId('f_123'),
stud_health_id: ObjectId('st_123'),
schoolYear: ObjectId('sy123')
},
{
_id: ObjectId('f_234'),
stud_health_id: ObjectId('st_234'),
schoolYear: ObjectId('sy234')
},
{
_id: ObjectId('f_345'),
stud_health_id: ObjectId('st_890'),
schoolYear: ObjectId('sy234')
},
{
_id: ObjectId('f_456'),
stud_health_id: ObjectId('st_345'),
schoolYear: ObjectId('sy345')
}
]
I am trying to filter the records from collection 1 which don't have an entry in collection 2 matching some extra params.
If I send {schoolYear: ObjectId('sy234')} then it should return the first and third documents of collection 1, because those two students don't have a record for that year.
One option is using $lookup and $match:
db.col1.aggregate([
{$lookup: {
from: "col2",
as: "col2",
let: {schoolYear: "sy234", stud_id: "$_id"},
pipeline: [
{$match: {$expr: {
$and: [
{$eq: ["$schoolYear", "$$schoolYear"]},
{$eq: ["$stud_health_id", "$$stud_id"]}
]
}
}
}
]
}
},
{$match: {"col2.0": {$exists: false}}},
{$unset: "col2"}
])
See how it works on the playground example
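Assuming the sample data above (the playground represents the ObjectIds as plain strings) and a filter of schoolYear 'sy234', the expected result is the first and third student, since neither has a col2 entry for that year:
// Expected output for schoolYear 'sy234'
[
  {_id: ObjectId('st_123'), stud_num: 123, school: ObjectId('sc_123'), gender: 'M'},
  {_id: ObjectId('st_345'), stud_num: 123, school: ObjectId('sc_345'), gender: 'M'}
]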
Following the examples, I have two types of data in the same time series collection:
db.weather.insertMany( [
{
"metadata": { "sensorId": 5578, "type": "temperature" },
"timestamp": ISODate("2021-05-18T00:00:00.000Z"),
"temp": 72
},//....
and..
db.weather.insertMany([
{
"metadata": {"sensorId": 5578, "type": "humidity" },
"timestamp": ISODate("2021-05018T00:00:001Z"),
"humpercent": 78
},//...
and I want to be able to serve simple requests by aggregating the data as:
{
sensorId: 5578,
humidityData: [78, 77, 75 ...],
tempData: [72, 72, 71...]
}
which seems like the obvious use case, but the
db.foo.aggregate([{$group: {_id: "$sensorId"}}])
function on sensorId only returns the ids with no other fields. Am I missing a simple identity aggregation function, or a way to collect the values into an array?
What you are looking for is the $addToSet operator:
db.foo.aggregate([{
$group: {
_id: "$metadata.sensorId",
temp: {
$addToSet: "$temp"
},
humidity: {
$addToSet: "$humpercent"
}
}
}])
Note that the order of elements in the returned array is not specified.
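For the two sample documents above, the grouped result looks roughly like this; keep in mind that $addToSet also collapses repeated values (two identical temperature readings are stored only once):
// Rough result for the sample inserts (more values accumulate over time)
{
  _id: 5578,
  temp: [72],
  humidity: [78]
}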
If all you have is two categories, you can simply $push them:
db.collection.aggregate([
{$sort: {timestamp: 1}},
{$group: {
_id: {sensorId: "$metadata.sensorId"},
temp: {$push: "$temp"},
humidity: {$push: "$humpercent"}
}
}
])
See how it works on the playground example - small
But if you want the generic solution for multiple measurements you need something like:
db.collection.aggregate([
{$sort: {timestamp: 1}},
{$set: {m: "$$ROOT"}},
{$unset: ["m.metadata", "m.timestamp", "m._id"]},
{$set: {m: {$first: {$objectToArray: "$m"}}}},
{$group: {
_id: {type: "$metadata.type", sensorId: "$metadata.sensorId"},
data: {$push: "$m.v"}}
},
{$project: {_id: 0, data: 1, type: {k: "type", v: "$_id.type"}, sensorId: "$_id.sensorId"}},
{$group: {
_id: "$sensorId",
data: {$push: {k: "$type.v", v: "$data"}}
}},
{$project: {_id: 0, data: {"$mergeObjects": [{$arrayToObject: "$data"}, {sensorId: "$_id"}]}
}},
{$replaceRoot: {newRoot: "$data"}}
])
See how it works on the playground example - generic
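For sensor 5578 from the question, the generic pipeline ends up with one document per sensor, keyed by the metadata.type values. A sketch of the final shape, assuming only the two types shown above:
// Sketch of the final shape for sensorId 5578
{
  temperature: [72, ...],
  humidity: [78, ...],
  sensorId: 5578
}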
Context: I have a MongoDB database with some duplicated documents.
Problem: I want to remove all duplicated documents. (For each duplicated document, I only want to save one, which can be arbitrarily chosen.)
Minimal illustrative example:
The documents all have the following fields (there are also other fields, but those are of no relevance here):
{
"_id": {"$oid":"..."},
"name": "string",
"user": {"$oid":"..."},
}
Duplicated documents: A document is considered duplicated if there are two or more documents with the same "name" and "user" (i.e. the document id is of no relevance here).
How can I remove the duplicated documents?
EDIT:
Since MongoDB version 4.2, one option is to use $group and $merge in order to move all unique documents to a new collection:
db.collection.aggregate([
{
$group: {
_id: {name: "$name", user: "$user"},
doc: {$first: "$$ROOT"}
}
},
{$replaceRoot: {newRoot: "$doc"}},
{$merge: {into: "newCollection"}}
])
See how it works on the playground example
For older versions, you can do the same using $out.
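A minimal sketch of the $out variant (same grouping, with $out writing the unique documents to newCollection; note that $out replaces the target collection if it already exists):
db.collection.aggregate([
  {$group: {_id: {name: "$name", user: "$user"}, doc: {$first: "$$ROOT"}}},
  {$replaceRoot: {newRoot: "$doc"}},
  {$out: "newCollection"}
])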
Another option is to get a list of all documents to remove and remove them with another query:
removeList = db.collection.aggregate([
{
$group: {
_id: {name: "$name", user: "$user"},
doc: {$first: "$$ROOT"},
remove: {$push: "$_id"}
}
},
{
$set: {
remove: {
$filter: {
input: "$remove",
cond: {$ne: ["$$this", "$doc._id"]}
}
}
}
},
{$group: {_id: 0, remove: { $push: "$remove"}}},
{$set: { _id: "$$REMOVE",
remove: {
$reduce: {
input: "$remove",
initialValue: [],
in: {$concatArrays: ["$$value", "$$this"]}
}
}
}
}
]).toArray()[0].remove
db.collection.deleteMany({_id: {$in: removeList}})
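As a simpler alternative (a minimal sketch, not part of the answer above), you can iterate the duplicate groups directly in the shell and delete everything except the one kept _id per group:
db.collection.aggregate([
  // one group per (name, user) pair: keep the first _id, collect them all
  {$group: {_id: {name: "$name", user: "$user"}, keep: {$first: "$_id"}, ids: {$push: "$_id"}}},
  // only groups that actually contain duplicates
  {$match: {"ids.1": {$exists: true}}}
]).forEach(g => {
  db.collection.deleteMany({_id: {$in: g.ids.filter(id => !id.equals(g.keep))}})
})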
I am unwinding an array using the MongoDB aggregation framework. The array has duplicates, and I need to ignore those duplicates when grouping further.
How can I achieve that?
You can use $addToSet to do this:
db.users.aggregate([
{ $unwind: '$data' },
{ $group: { _id: '$_id', data: { $addToSet: '$data' } } }
]);
It's hard to give you more specific answer without seeing your actual query.
You have to use $addToSet, but first you have to group by _id, because if you don't you'll get one element per item in the list.
Imagine a collection posts with documents like this:
{
body: "Lorem Ipsum...",
tags: ["stuff", "lorem", "lorem"],
author: "Enrique Coslado"
}
Imagine you want to calculate the most usual tag per author. You'd make an aggregate query like that:
db.posts.aggregate([
{$project: {
author: "$author",
tags: "$tags",
post_id: "$_id"
}},
{$unwind: "$tags"},
{$group: {
_id: "$post_id",
author: {$first: "$author"},
tags: {$addToSet: "$tags"}
}},
{$unwind: "$tags"},
{$group: {
_id: {
author: "$author",
tags: "$tags"
},
count: {$sum: 1}
}}
])
That way you'll get documents like this:
{
_id: {
author: "Enrique Coslado",
tags: "lorem"
},
count: 1
}
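If you then want the single most frequent tag per author (the original goal of the example), you could append a sort and one more group. A sketch of the extra stages, not part of the answer above (topTag is a name chosen here for illustration):
db.posts.aggregate([
  // ...same stages as above, then:
  {$sort: {count: -1}},
  {$group: {_id: "$_id.author", topTag: {$first: "$_id.tags"}, count: {$first: "$count"}}}
])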
Previous answers are correct, but the procedure of doing $unwind -> $group -> $unwind could be simplified.
You could use $addFields + $reduce to pass to the pipeline the filtered array which already contains unique entries and then $unwind only once.
Example document:
{
body: "Lorem Ipsum...",
tags: [{title: 'test1'}, {title: 'test2'}, {title: 'test1'}, ],
author: "First Last name"
}
Query:
db.posts.aggregate([
{$addFields: {
"uniqueTag": {
$reduce: {
input: "$tags",
initialValue: [],
in: {$setUnion: ["$$value", ["$$this.title"]]}
}
}
}},
{$unwind: "$uniqueTag"},
{$group: {
_id: {
author: "$author",
tags: "$uniqueTag"
},
count: {$sum: 1}
}}
])
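With the example document above, the $reduce stage reduces tags to the unique titles 'test1' and 'test2', so after the unwind and group the counts come out as (illustrative):
{_id: {author: "First Last name", tags: "test1"}, count: 1}
{_id: {author: "First Last name", tags: "test2"}, count: 1}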
Say I have the following:
this.aggregate(
{$unwind: "$tags"},
{$match: {tags: {$in: pip.activity.tags}}},
{$group : {_id : '$_id',matches:{$sum:1}}},
{$project: { _id: 0,matches:1}},
{$sort: {matches:-1 }},
callback
);
How would I go about including an additional 'external' ObjectId field in the results? E.g. if I have the following:
var otherField = new ObjectId('xxxxxxx');
this.aggregate(
{$unwind: "$tags"},
{$match: {tags: {$in: pip.activity.tags}}},
{$group : {_id : '$_id',matches:{$sum:1}}},
{$project: { _id: 0, matches: 1, otherField: otherField}}, // <-- include otherField
{$sort: {matches:-1 }},
callback
);
Is this possible, or should I be using a for loop or MapReduce for this particular step? I'm looking for something really efficient.
The $project pipeline operator would not let you inject the object, but you can probably insert the object id earlier in the $group operator. If you have a collection:
db.foo.save({_id:1,tags:['a','b']})
db.foo.save({_id:2,tags:['b','c']})
db.foo.save({_id:3,tags:['c','d']})
You can then write:
db.foo.aggregate({
$unwind: "$tags"},{
$match: { tags: {$in: ['b','c'] } }},{
$group: { _id: "$_id", matches: {$sum: 1 }, otherField: {$min: new ObjectId()} }},{
$project: { _id: 0, matches: 1, otherField: 1 }},{
$sort: { matches: -1 }})
The $min or $max can be used here, but it expects an expression or a reference to a field, so you have to give it one.
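On recent MongoDB versions you can also inject the constant directly in a $project (or $addFields) stage with $literal, which keeps the value from being interpreted as an expression. A sketch, assuming the same otherField variable as above:
var otherField = new ObjectId();
db.foo.aggregate([
  {$unwind: "$tags"},
  {$match: {tags: {$in: ['b', 'c']}}},
  {$group: {_id: "$_id", matches: {$sum: 1}}},
  // $literal injects the constant as-is into every result document
  {$project: {_id: 0, matches: 1, otherField: {$literal: otherField}}},
  {$sort: {matches: -1}}
])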