MongoDB - $group with $addToSet and then $lookup

I've got the following query
db.getCollection('transportations').aggregate(
{
$group: {
_id: null,
departure_city_id: { $addToSet: "$departure.city_id" },
departure_station_id: { $addToSet: "$departure.station_id" }
}
}
);
and the result is
{
"_id" : null,
"departure_city_id" : [
ObjectId("5a2f5378334c4442ab5a63ea"),
ObjectId("59dae1efe408157cc1585fea"),
ObjectId("5a5bbfdc35628410f9fdcde9")
],
"departure_station_id" : [
ObjectId("5a2f53d1334c4442ab5a63ee"),
ObjectId("5a2f53c5334c4442ab5a63ed"),
ObjectId("5a5bc13435628410f9fdcdea")
]
}
Now I want to look up each departure_city_id in the "areas" collection to get the "name" of the area, and each departure_station_id in the "stations" collection to get the "name" of the station.
The result could be something like this:
{
"_id" : null,
"departure_city_id" : [
{
_id: ObjectId("5a2f5378334c4442ab5a63ea"),
name: "City 1"
},
{
_id: ObjectId("59dae1efe408157cc1585fea"),
name: "City 2"
},
{
_id: ObjectId("5a5bbfdc35628410f9fdcde9"),
name: "City 3"
}
],
"departure_station_id" : [
{
_id: ObjectId("5a2f53d1334c4442ab5a63ee"),
name: "Station 1"
},
{
_id: ObjectId("5a2f53c5334c4442ab5a63ed"),
name: "Station 2"
},
{
_id: ObjectId("5a5bc13435628410f9fdcdea"),
name: "Station 3"
}
]
}

As of version 3.3.4, the $lookup aggregation pipeline stage works directly with an array as the localField.
See: lookup between local (multiple) array of values and foreign (single) value
The answer to the question is simply:
db.getCollection('transportations').aggregate(
{
$group: {
_id: null,
departure_city_id: { $addToSet: "$departure.city_id" },
departure_station_id: { $addToSet: "$departure.station_id" }
}
},
{
$lookup: {
from: "areas",
localField: "departure_city_id",
foreignField: "_id",
as: "departure_city_id"
}
},
{
$lookup: {
from: "stations",
localField: "departure_station_id",
foreignField: "_id",
as: "departure_station_id"
}
}
)
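To make the array behaviour concrete, here is a minimal plain-JavaScript sketch that imitates what each $lookup stage above does when localField holds an array. It runs without a database. The helper name lookupByIds, the short string ids and the sample areas are assumptions made up for illustration; this is not MongoDB's implementation.

```javascript
// In-memory imitation of $lookup with an array localField (illustration only).
function lookupByIds(localIds, foreignDocs) {
  const wanted = new Set(localIds);
  // Keep every foreign document whose _id appears in the local array.
  return foreignDocs.filter((doc) => wanted.has(doc._id));
}

// Hypothetical stand-ins for the "areas" collection and the $group result.
const areas = [
  { _id: "a1", name: "City 1" },
  { _id: "a2", name: "City 2" },
  { _id: "a3", name: "City 3" },
  { _id: "a4", name: "City 4" },
];
const grouped = { _id: null, departure_city_id: ["a3", "a1"] };

// Like `as: "departure_city_id"`, the joined documents replace the id array.
const joined = {
  ...grouped,
  departure_city_id: lookupByIds(grouped.departure_city_id, areas),
};

console.log(JSON.stringify(joined));
```

In this sketch the joined documents come back in the order of the foreign array, not in the order of the local ids. As far as I know, $lookup does not promise to follow the local array's order either, so it is probably safer to sort explicitly if the order matters.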

Related

MongoDB multiple $lookup and $group output

I'm quite a newbie with MongoDB and I'm trying to retrieve a kind-of leaderboard based on two related collections and a third one, referencing one of the two, based on its different property.
Consider a schema like the following one:
tree: { _id, company_id: string, company_name }
link: { _id, company_id: string, url: string }
analytics: { _id, tree_id: string, link_id: string, views: number, clicks: number, date: string }
An analytics document has either tree_id and views, or link_id and clicks, never both at once.
What I'm trying to achieve right now is a kind of "leaderboard" of the total clicks + views: starting from the analytics collection, joining it with both tree and link, and finally retrieving the sum of clicks and views.
I have already managed to retrieve the sums for a specific company_id with the following code:
db.analytics.aggregate([{
$lookup: {
from: "trees",
as: "trees",
localField: "tree_id",
foreignField: "_id"
}
}, {
$lookup: {
from: "links",
as: "links",
localField: "link_id",
foreignField: "_id"
}
}, {
$match: {
$or: [
{"trees.company_id": "1"},
{"links.company_id": "1"}
]
}
}, {
$group: {
_id: null,
views_count: {
$sum: "$views"
},
clicks_count: {
$sum: "$clicks"
}
}
}])
But I can't find a way to get a list of results like
{ company_id: 1, company_name: "foo", clicks: 100, views: 200 },
{ company_id: 2, company_name: "bar", clicks: 200, views: 200 }
and so on.
What I've tried so far is grouping by a different _id, which does not work as I expected:
db.analytics.aggregate([{
$lookup: {
from: "trees",
as: "trees",
localField: "tree_id",
foreignField: "_id"
}
}, {
$lookup: {
from: "links",
as: "links",
localField: "link_id",
foreignField: "_id"
}
}, {
$group: {
_id: "$trees.company_id",
views_count: {
$sum: "$views"
},
clicks_count: {
$sum: "$clicks"
}
}
}])
Which does not assign clicks_count to a specific entry, but outputs something like
{ "_id" : [ "1" ], "views_count" : 6, "clicks_count" : 0 }
{ "_id" : [ ], "views_count" : 0, "clicks_count" : 48 }
{ "_id" : [ "2" ], "views_count" : 10, "clicks_count" : 0 }
I'm not even sure that this schema is the best solution, so I would also appreciate any design suggestions.
Based on the comment below, I tried to deconstruct trees before grouping the results, but it ended up outputting only company_id and views_count, without counting clicks, as follows:
{ "_id" : "2", "views_count" : 10, "clicks_count" : 0 }
{ "_id" : "1", "views_count" : 6, "clicks_count" : 0 }
$addFields to add a company field: if trees.company_id is not an empty array [], use trees, otherwise use links
$arrayElemAt to get the first element of that array
$group by company_id and sum your counts
db.analytics.aggregate([
{ $lookup: { /* ... */ } },
{ $lookup: { /* ... */ } },
{
$addFields: {
company: {
$arrayElemAt: [
{ $cond: [{ $ne: ["$trees.company_id", []] }, "$trees", "$links"] },
0
]
}
}
},
{
$group: {
_id: "$company.company_id",
company_name: { $first: "$company.company_name" },
views_count: { $sum: "$views" },
clicks_count: { $sum: "$clicks" }
}
}
])
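For anyone who wants to see what the last two stages do without running a database, here is a rough plain-JavaScript imitation. The input documents are hypothetical: they only show what analytics entries might look like after both $lookup stages, and attributing the 48 clicks to company "1" is my assumption. The function name leaderboard is made up as well.

```javascript
// Plain-JavaScript imitation of the $addFields + $group stages (illustration only).
// Hypothetical analytics documents as they might look after both $lookup stages.
const joinedAnalytics = [
  { views: 4, trees: [{ company_id: "1", company_name: "foo" }], links: [] },
  { views: 2, trees: [{ company_id: "1", company_name: "foo" }], links: [] },
  { clicks: 48, trees: [], links: [{ company_id: "1" }] },
  { views: 10, trees: [{ company_id: "2", company_name: "bar" }], links: [] },
];

function leaderboard(docs) {
  const byCompany = new Map();
  for (const doc of docs) {
    // $cond + $arrayElemAt: take the joined tree if there is one, otherwise the link.
    const company = doc.trees.length > 0 ? doc.trees[0] : doc.links[0];
    const id = company ? company.company_id : null;
    if (!byCompany.has(id)) {
      // $first: the name comes from the first document seen for this company.
      byCompany.set(id, {
        _id: id,
        company_name:
          company && company.company_name !== undefined ? company.company_name : null,
        views_count: 0,
        clicks_count: 0,
      });
    }
    const row = byCompany.get(id);
    // Treat a missing views/clicks field as 0, the way $sum skips it.
    row.views_count += doc.views || 0;
    row.clicks_count += doc.clicks || 0;
  }
  return [...byCompany.values()];
}

const board = leaderboard(joinedAnalytics);
console.log(board);
```

One thing the sketch makes visible: link documents carry no company_name in the schema above, so if the first document of a group was matched through links, $first may leave the name empty. Sorting before the $group, or picking the name with an accumulator that skips missing values, might be worth trying in that case.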

Count articles grouping by tags mongodb

I have a lot of articles with a field called tags, which is an array of tag _ids, and for statistics purposes I want to count how many articles we have for each tag. If tags were a single tag _id it would be easy, because I could group by tag, but it is an array of tags, and I can't group by that field.
First I tried this:
db.note.aggregate([{$match: {
publishedAt: {
$gte: ISODate('2018-01-01'),
$lte: ISODate('2019-01-01')
}
}}, {$group: {
_id: "$tags",
"total": {
"$sum": 1
}
}}, {$lookup: {
from: 'tags',
localField: '_id',
foreignField: '_id',
as: 'tag'
}}, {$unwind: {
path: "$tag"
}}, {$project: {
total: 1,
"tag.name": 1
}}, {$sort: {
total: -1
}}])
But that doesn't work: the query groups by the whole tags array rather than by individual tags. So I tried this:
{
'$match': {
'publishedAt': {
'$gte': new Date(req.body.gte),
'$lte': new Date(req.body.lte)
}
}
},
{
'$unwind': {
'path': '$tags'
}
}, {
'$group': {
'_id': '$tags',
'total': {
'$sum': 1
}
}
}, {
'$lookup': {
'from': 'tags',
'localField': '_id',
'foreignField': '_id',
'as': 'tag'
}
}, {
'$project': {
'total': 1,
'tag.name': 1
}
}, {
'$sort': {
'total': -1
}
},
{
'$unwind': {
'path': '$tag'
}
}
)
But the problem with this is that it groups only by the first tag in the array, and I miss all the other tags in that array.
What do you think will be the solution?
I had a lot of articles with a field called tags, and is an array of
tags _ids, and for statistics purpose I want to count how many
articles we had by each tag.
You can try this (I am assuming the following input documents):
notes:
{ _id: 1, name: "art-1", author: "ab", tags: [ "t1", "t2" ] },
{ _id: 2, name: "art-2", author: "cd", tags: [ "t1", "t3" ] },
{ _id: 3, name: "art-3", author: "wx", tags: [ "t4", "t3" ] },
{ _id: 4, name: "art-4", author: "yx", tags: [ "t1" ] }
tags:
{ _id: 1, id: "t1", name: "t1's name" },
{ _id: 2, id: "t2", name: "t2's name" },
{ _id: 3, id: "t3", name: "t3's name" },
{ _id: 4, id: "t4", name: "t4's name" }
The Query:
db.tags.aggregate( [
{
$lookup: {
from: "notes",
localField: "id",
foreignField: "tags",
as: "tag_matches"
}
},
{ $project: { id: 1, name: 1, _id: 0, count: { $size: "$tag_matches" } } }
] )
The Output:
{ "id" : "t1", "name" : "t1's name", "count" : 3 }
{ "id" : "t2", "name" : "t2's name", "count" : 1 }
{ "id" : "t3", "name" : "t3's name", "count" : 2 }
{ "id" : "t4", "name" : "t4's name", "count" : 1 }
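As a sanity check on those numbers, the same counting can be imitated in plain JavaScript over the sample documents above. This is only a sketch of the logic, not of how MongoDB executes the stages; the variable names counts and unwoundCounts are mine.

```javascript
// Sample documents copied from the answer above.
const notes = [
  { _id: 1, name: "art-1", author: "ab", tags: ["t1", "t2"] },
  { _id: 2, name: "art-2", author: "cd", tags: ["t1", "t3"] },
  { _id: 3, name: "art-3", author: "wx", tags: ["t4", "t3"] },
  { _id: 4, name: "art-4", author: "yx", tags: ["t1"] },
];
const tags = [
  { _id: 1, id: "t1", name: "t1's name" },
  { _id: 2, id: "t2", name: "t2's name" },
  { _id: 3, id: "t3", name: "t3's name" },
  { _id: 4, id: "t4", name: "t4's name" },
];

// Imitates $lookup (localField: "id", foreignField: "tags") followed by $size:
// a note matches a tag when its tags array contains the tag's id.
const counts = tags.map((tag) => ({
  id: tag.id,
  name: tag.name,
  count: notes.filter((note) => note.tags.includes(tag.id)).length,
}));

// Cross-check with the $unwind + $group idea: one increment per (note, tag) pair.
const unwoundCounts = {};
for (const note of notes) {
  for (const t of note.tags) {
    unwoundCounts[t] = (unwoundCounts[t] || 0) + 1;
  }
}

console.log(counts);
console.log(unwoundCounts);
```

The two approaches agree on this data but differ at the edges: starting from tags also lists tags with zero articles, while unwinding the notes only reports tags that are actually used and would count a tag twice if a note repeated it.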

Mongodb aggregate lookup for many collections to one nested output

I've 3 Collections:
School
{ "id" : { "$numberLong" : "100000" },
"name" : "School1" }
Faculty
{ "id" : { "$numberLong" : "100000" },
"schoolId" : { "$numberLong" : "100000" },
"name" : "Faculty1" }
Subject
{ "id" : { "$numberLong" : "100000" },
"name" : "Subject1" }
Assume there are many of these in each collection. I want to be able to serve an endpoint that takes in an ID and returns the full three-layer hierarchy (School->Faculty->Subject). How would I return all this data?
Something like:
{
id: 1,
name: "school1",
faculties: [{
id:1000,
name: "faculty1",
subjects: [
{id: 1, name: "sub1"},
{id: 2, name: "sub2"},
{id: 3, name: "sub3"}
]
}]
}
OK, after ages I actually got the solution, which is a lot simpler than the rabbit hole I'd gone down.
{ $match: {id: 100001}},
{ $lookup:
{
from: 'faculties',
localField: 'id',
foreignField: 'schoolId',
as: 'faculties',
}
},
{ $unwind: {
path: "$faculties",
preserveNullAndEmptyArrays: true
}
},
{ $lookup:
{
from: 'subjects',
localField: 'faculties.id',
foreignField: 'facultyId',
as: 'faculties.subjects',
}
}
Which returns the exact output I wanted. The key is the final lookup using as: 'faculties.subjects', which puts subjects inside faculties, which in turn is the first child of schools.
If you need further nesting, you just extend the as path each time you go a level deeper, for instance as: 'faculties.subjects.students.names'.
db.School.aggregate(
// Pipeline
[
// Stage 1
{
$lookup: // Equality Match
{
from: "Faculty",
localField: "id",
foreignField: "schoolId",
as: "faculties"
}
},
// Stage 2
{
$unwind: {
path: "$faculties",
preserveNullAndEmptyArrays: false // optional
}
},
// Stage 3
{
$lookup: // Equality Match
{
from: "Subject",
localField: "faculties.id",
foreignField: "facultyId",
as: "faculties.subjects"
}
},
// Stage 4
{
$group: {
_id: {
id: '$id',
name: '$name'
},
faculties: {
$addToSet: '$faculties'
}
}
},
// Stage 5
{
$project: {
id: '$_id.id',
name: '$_id.name',
faculties: 1
}
},
]
);
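To show the shape these pipelines are aiming for, here is a small in-memory sketch in plain JavaScript. The sample data is hypothetical, and the facultyId field on subjects is an assumption taken from the foreignField used in the answers, since the Subject document in the question does not show it. The function name schoolHierarchy is made up.

```javascript
// Hypothetical sample data (not from the question).
const schools = [{ id: 1, name: "school1" }];
const faculties = [
  { id: 1000, schoolId: 1, name: "faculty1" },
  { id: 1001, schoolId: 1, name: "faculty2" },
];
const subjects = [
  { id: 1, facultyId: 1000, name: "sub1" }, // facultyId is an assumed field
  { id: 2, facultyId: 1000, name: "sub2" },
  { id: 3, facultyId: 1001, name: "sub3" },
];

// Builds School -> Faculty -> Subject for one school id, or null if not found.
function schoolHierarchy(schoolId) {
  const school = schools.find((s) => s.id === schoolId);
  if (!school) return null;
  return {
    id: school.id,
    name: school.name,
    // First $lookup: faculties whose schoolId matches the school's id.
    faculties: faculties
      .filter((f) => f.schoolId === school.id)
      // Second $lookup (as: "faculties.subjects"): attach subjects to each faculty.
      .map((f) => ({
        id: f.id,
        name: f.name,
        subjects: subjects
          .filter((s) => s.facultyId === f.id)
          .map((s) => ({ id: s.id, name: s.name })),
      })),
  };
}

console.log(JSON.stringify(schoolHierarchy(1), null, 2));
```

In memory there is no need for the $unwind and $group round trip. In the pipeline, $unwind turns faculties into a single embedded document so that the dotted as path lands inside it, and $group puts the array back together. Since $addToSet does not guarantee element order, $push may be the better accumulator if the faculty order matters.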

How can I aggregate in MongoDB?

I have two collections, points and users, and I want to do an aggregation based on userid.
points collection
{
"userpoint": "2",
"purchaseid":"dj04944",
"store":"001",
"date":ISODate("2017-11-10T08:15:39.736Z"),
"userid": [
ObjectId("5a7565ug8945rt67"),
ObjectId("8a35553d3446rt78")
]
},
{
"userpoint": "4",
"purchaseid":"5678sd",
"store":"004",
"date":ISODate("2017-11-11T08:15:39.736Z"),
"userid": [
ObjectId("9a85653d3890rt09")
]
}
users collection
{
"_id": ObjectId("5a7565ug8945rt67"),
"name":"asdf",
"mobinumber":"12345"
},
{
"_id": ObjectId("8a35553d3446rt78"),
"name":"qwr",
"mobinumber":"11111"
},
{
"_id": ObjectId("9a85653d3890rt09"),
"name":"juir",
"mobinumber":"9611"
}
How can I do the aggregation?
db.points.aggregate([
{
$lookup:
{
from: "users",
localField: "",
foreignField: "",
as: "inventory_docs"
}
}
])
I want to combine both collections.
Please help me move forward.
If your expected output is like the one below:
{
"_id" : ObjectId("5a164fa5400096bfa0b3422c"),
"date" : ISODate("2017-11-10T08:15:39.736Z"),
"name" : "asdf",
"mobile" : "12345"
}
Then you can try this query:
db.points.aggregate([
{
$match: {
store: "001",
date: {$lte: ISODate("2017-11-10T08:15:39.736Z"), $gte: ISODate("2017-11-10T08:15:39.736Z")}
}
},
{$unwind: "$userid"},
{
$lookup: {
from: "users",
localField: "userid",
foreignField: "_id",
as: "user"
}
},
{
$project: {
userpoint: 1,
purchaseid: 1,
date: 1,
user: {$arrayElemAt: ["$user", 0]}
}
},
{$match: {"user.name": "asdf"}},
{
$project: {
date: 1,
name: "$user.name",
mobile: "$user.mobinumber"
}
}
])
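If it helps to see the core of that pipeline in isolation, here is a plain-JavaScript imitation of the $unwind, $lookup and $arrayElemAt steps, leaving out the two $match filters. The ids are shortened to plain strings (u1, u2, u3) as an assumption so that the sketch runs without a database.

```javascript
// Hypothetical in-memory copies of the two collections, with shortened ids.
const users = [
  { _id: "u1", name: "asdf", mobinumber: "12345" },
  { _id: "u2", name: "qwr", mobinumber: "11111" },
  { _id: "u3", name: "juir", mobinumber: "9611" },
];
const points = [
  { userpoint: "2", purchaseid: "dj04944", store: "001", userid: ["u1", "u2"] },
  { userpoint: "4", purchaseid: "5678sd", store: "004", userid: ["u3"] },
];

// $unwind: one output row per element of the userid array.
const unwound = points.flatMap((p) =>
  p.userid.map((id) => ({ ...p, userid: id }))
);

// $lookup + $arrayElemAt: attach the single matching user, if there is one.
const rows = unwound.map((p) => {
  const user = users.find((u) => u._id === p.userid);
  return {
    purchaseid: p.purchaseid,
    name: user ? user.name : undefined,
    mobile: user ? user.mobinumber : undefined,
  };
});

console.log(rows);
```

Each purchase ends up repeated once per user it references, which is why the first purchase appears twice in the result.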

MongoDB's aggregation from nested key returns nothing

(Edit : this question was edited to better reflect the issue, which might be a little more complicated than the proposed related question.)
Let's say I have these two collections
products
{
_id: 'AAAA',
components: [
{ type: 'foo', items: [
{ itemId: 'item1', qty: 2 },
{ itemId: 'item2', qty: 1 }
] },
{ type: 'bar', items: [
{ itemId: 'item3', qty: 8 }
] }
]
}
items
{
_id: 'item1',
name: 'Foo Item'
}
{
_id: 'item2',
name: 'Bar Item'
}
{
_id: 'item3',
name: 'Buz Item'
}
And that I perform this query
db['products'].aggregate([
{ $lookup: {
from: 'items',
localField: 'components.items.itemId',
foreignField: '_id',
as: 'componentItems'
} }
]);
I get this
{
_id: 'AAAA',
components: [
{ type: 'foo', items: [
{ itemId: 'item1', qty: 2 },
{ itemId: 'item2', qty: 1 }
] }
{ type: 'bar', items: [
{ itemId: 'item3', qty: 8 }
] }
],
componentItems: [ ]
}
Why doesn't the aggregation read the local field value? How can I retrieve the foreign document without losing my original document structure?
Edit
I have read the jira issue and seen the proposed answer, however I don't know how this applies. This is not merely an array, but values from an object, inside an array. I am not sure how I can unwind this, and how to put it back together without losing the document structure.
Edit 2
The problem that I have is that I'm not sure how to group the results back together. With this query :
db['products'].aggregate([
{ $unwind: '$components' },
{ $unwind: '$components.items' },
{ $lookup: {
from: 'items',
localField: 'components.items.itemId',
foreignField: '_id',
as: 'componentsItems'
} }
]);
I get the "correct" result of
{ "_id" : "AAAA", "components" : { "type" : "foo", "items" : { "itemId" : "item1", "qty" : 2 } }, "componentsItems" : [ { "_id" : "item1", "name" : "Foo Item" } ] }
{ "_id" : "AAAA", "components" : { "type" : "foo", "items" : { "itemId" : "item2", "qty" : 1 } }, "componentsItems" : [ { "_id" : "item2", "name" : "Bar Item" } ] }
{ "_id" : "AAAA", "components" : { "type" : "bar", "items" : { "itemId" : "item3", "qty" : 8 } }, "componentsItems" : [ { "_id" : "item3", "name" : "Buz Item" } ] }
But, while I can unwind components.items, I cannot seem to undo this, as $group complains that
"the group aggregate field name 'components.items' cannot be used because $group's field names cannot contain '.'"
db['products'].aggregate([
{ $unwind: '$components' },
{ $unwind: '$components.items' },
{ $lookup: {
from: 'items',
localField: 'components.items.itemId',
foreignField: '_id',
as: 'componentsItems'
} },
{ "$group": {
"components.type": "$components.type",
"components.items": { $push: "$components.items" },
"componentsItems": { $push: "$componentsItems" }
} },
{ "$group": {
"_id": "$_id",
"components": { $push: "$components" },
"componentsItems": { $push: "$componentsItems" }
} }
]);
Edit 3
This query is, thus far, the closest that I found, except that components are not grouped back by type.
db['products'].aggregate([
{ $unwind: '$components' },
{ $unwind: '$components.items' },
{ $lookup: {
from: 'items',
localField: 'components.items.itemId',
foreignField: '_id',
as: 'componentsItems'
} },
{ $unwind: '$componentsItems' },
{ $group: {
"_id": "$_id",
"components": {
$push: {
"type": "$components.type",
"items": "$components.items"
}
},
"componentsItems": { $addToSet: "$componentsItems" }
} }
]);
Also: I am concerned that using $unwind and $group may affect the order of the components, which should be preserved. AFAIK, MongoDB preserves array order when storing documents. I'd hate for this functionality to be broken by the awkwardness of $lookup.
Here is my long and awkward solution :
db['products'].aggregate([
// unwind all... because $lookup cannot work with multi-values
{ $unwind: '$components' },
{ $unwind: '$components.items' },
// lookup... This is a 1:1 relationship but who cares, right?
{ $lookup: {
from: 'items',
localField: 'components.items.itemId',
foreignField: '_id',
as: 'componentsItems'
} },
// our 1:1 relationship is now an array, so this is required
// before grouping, so we don't end up with array of arrays
{ $unwind: '$componentsItems' },
// Group 1: put "components.items" in a temporary array
// and filter duplicates from "componentsItems"
{ $group: {
"_id": {
"i": "$_id",
"t": "$components.type"
},
"items": {
$push: "$components.items"
},
"componentsItems": { $addToSet: "$componentsItems" }
} },
// undo $push...
{ $unwind: "$componentsItems" },
// Group 2: put everything back together
{ $group: {
"_id": "$_id.i",
"items": {
$push: {
"type": "$_id.t",
"items": "$items"
}
},
"componentsItems": { $push: "$componentsItems" }
} }
]);
Edit
A better solution :
db['products'].aggregate([
// Return document, added a collection of "itemId"
{ $project: {
"_id": 1,
"components": 1,
"componentItemId": "$components.items.itemId"
} },
// Since there were two arrays, the field is an array of arrays...
{ $unwind: "$componentItemId" },
{ $unwind: "$componentItemId" },
// make 1:1 lookup...
{ $lookup: {
from: 'items',
localField: 'componentItemId',
foreignField: '_id',
as: 'componentsItems'
} },
// ... extract the 1:1 reference...
{ $unwind: "$componentsItems" },
// group back, ignoring the "componentItemId" field
{ $group: {
"_id": "$_id",
"components": { $first: "$components" },
"componentItems": { $addToSet: "$componentsItems" }
}}
]);
I'm not sure whether there is a better solution yet, and I am concerned about performance, but this seems to be the only solution I can think of.
The downside is that documents cannot be dynamic, and this query will need to be modified whenever the schema changes.
Update
This seems to be resolved in MongoDB 3.3.4 (not released at the time of writing this answer).
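For reference, the result the question was after (foreign documents attached, original structure untouched) can be sketched in plain JavaScript on the sample data from the question. This only imitates what a $lookup on the dotted path components.items.itemId is expected to produce on versions that accept array values; it is not MongoDB code, and the variable names are mine.

```javascript
// Sample documents copied from the question.
const product = {
  _id: "AAAA",
  components: [
    { type: "foo", items: [{ itemId: "item1", qty: 2 }, { itemId: "item2", qty: 1 }] },
    { type: "bar", items: [{ itemId: "item3", qty: 8 }] },
  ],
};
const items = [
  { _id: "item1", name: "Foo Item" },
  { _id: "item2", name: "Bar Item" },
  { _id: "item3", name: "Buz Item" },
];

// Collect every itemId found under components[].items[].
const itemIds = product.components.flatMap((component) =>
  component.items.map((item) => item.itemId)
);

// Join: keep the foreign documents whose _id is among the collected ids.
const wanted = new Set(itemIds);
const result = {
  ...product, // components stay exactly as stored, order included
  componentItems: items.filter((doc) => wanted.has(doc._id)),
};

console.log(JSON.stringify(result, null, 2));
```

Because nothing is unwound or regrouped here, the order of components and of their items is left alone, which was the main worry with the $unwind and $group variants.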