So, I have two collections User and Book, and I want to aggregate them to receive an output (shown at the end of the post) for a specific User ID.
below are the collections:
User
Each document contains User ID, Name of the User and an array containing ID of a document from Book collection and a Boolean property read.
[
{
_id: ObjectId('adfa2sd5'),
name: "Mia",
books: [
{
bid: "154854",
read: true
},
{
bid: "5475786",
read: false
}
]
},
{
_id: ObjectId('uai5as5a'),
name: "Jack",
books: [
{
bid: "5475786",
read: true
}
]
}
]
Book
Each document possesses a book ID and name of the book.
[
{
_id: ObjectId('154854'),
name: "The Great Gatsby"
},
{
_id: ObjectId('5475786'),
name: "Frankenstein"
},
]
Output:
The output contains the User ID, along an array book_list which contains detail of each book (id, name) from the documents of Book collection based on the books.bid from User document and read field which was along books.bid.
[
{
_id: ObjectId('adfa2sd5'),
book_list: [
{
_id: ObjectId('154854'),
name: "The Great Gatsby",
read: true
},
{
_id: ObjectId('5475786'),
name: "Frankenstein",
read: false
}
]
}
]
Here's one way you could do it.
db.users.aggregate([
{
"$match": {
"_id": ObjectId("fedcba9876543210fedcba98")
}
},
{
"$lookup": {
"from": "books",
"localField": "books.bid",
"foreignField": "_id",
"as": "bookList"
}
},
{
"$set": {
"books": {
"$map": {
"input": "$books",
"as": "book",
"in": {
"$mergeObjects": [
"$$book",
{
"name": {
"$getField": {
"field": "name",
"input": {
"$first": {
"$filter": {
"input": "$bookList",
"cond": {"$eq": ["$$this._id", "$$book.bid"]}
}
}
}
}
}
}
]
}
}
}
}
},
{"$unset": ["bookList", "name"]}
])
Try it on mongoplayground.net.
Related
How can i aggregate two collections if i have the index in multidimensional array?
//collection 1 item example:
{
_id: ObjectId('000'),
trips: [ // first array
{ // trip 1
details: [ // array in first array
{ // detail 1
key: ObjectId('111') // ==> _id i want aggregate
}
]
}
]
}
//collection 2 item exampe:
{
_id: ObjectId('111'),
description: 'any info here...'
}
what is the procedure to do in order to obtain a valid result?
I don't want to reprocess every data after doing a "find".
This is the result I would like to get
//collection 1 after aggregate:
{
_id: ObjectId('000'),
trips: [
{ // trip 1
details: [
{ // detail 1
key: ObjectId('111') // id i want aggregate
_description_obj: {
_id: ObjectId('111'),
description: 'any info here...'
}
}
]
}
]
}
Here's one way you could do it.
db.coll1.aggregate([
{
"$match": {
_id: ObjectId("000000000000000000000000")
}
},
{
"$lookup": {
"from": "coll2",
"localField": "trips.details.key",
"foreignField": "_id",
"as": "coll2"
}
},
{
"$set": {
"trips": {
"$map": {
"input": "$trips",
"as": "trip",
"in": {
"$mergeObjects": [
"$$trip",
{
"details": {
"$map": {
"input": "$$trip.details",
"as": "detail",
"in": {
"$mergeObjects": [
"$$detail",
{
"_description_obj": {
"$first": {
"$filter": {
"input": "$coll2",
"cond": {"$eq": ["$$this._id", "$$detail.key"]}
}
}
}
}
]
}
}
}
}
]
}
}
},
"coll2": "$$REMOVE"
}
}
])
Try it on mongoplayground.net.
I'm trying to get aggregations with same aggregation pipeline including $match and $group operations from multiple collections.
For example,
with a users collection and collections of questions, answers and comments where every document has authorId and created_at field,
db = [
'users': [{ _id: 123 }, { _id: 456} ],
'questions': [
{ authorId: ObjectId('123'), createdAt: ISODate('2022-09-01T00:00:00Z') },
{ authorId: ObjectId('456'), createdAt: ISODate('2022-09-05T00:00:00Z') },
],
'answers': [
{ authorId: ObjectId('123'), createdAt: ISODate('2022-09-05T08:00:00Z') },
{ authorId: ObjectId('456'), createdAt: ISODate('2022-09-01T08:00:00Z') },
],
'comments': [
{ authorId: ObjectId('123'), createdAt: ISODate('2022-09-01T16:00:00Z') },
{ authorId: ObjectId('456'), createdAt: ISODate('2022-09-05T16:00:00Z') },
],
]
I want to get counts of documents from each collections with created_at between a given range and grouped by authorId.
A desired aggregation result may look like below. The _ids here are ObjectIds of documents in users collection.
\\ match: { createdAt: { $gt: ISODate('2022-09-03T00:00:00Z) } }
[
{ _id: ObjectId('123'), questionCount: 0, answerCount: 1, commentCount: 0 },
{ _id: ObjectId('456'), questionCount: 1, answerCount: 0, commentCount: 1 }
]
Currently, I am running aggregation below for each collection, combining the results in the backend service. (I am using Spring Data MongoDB Reactive.) This seems very inefficient.
db.collection.aggregate([
{ $match: {
created_at: { $gt: ISODate('2022-09-03T00:00:00Z') }
}},
{ $group : {
_id: '$authorId',
count: {$sum: 1}
}}
])
How can I get the desired result with one aggregation?
I thought $unionWith or $lookup may help but I'm stuck here.
You can try something like this, using $lookup, here we join users, with all the three collections one-by-one, and then calculate the count:
db.users.aggregate([
{
"$lookup": {
"from": "questions",
"let": {
id: "$_id"
},
"pipeline": [
{
"$match": {
$expr: {
"$and": [
{
"$gt": [
"$createdAt",
ISODate("2022-09-03T00:00:00Z")
]
},
{
"$eq": [
"$$id",
"$authorId"
]
}
]
}
}
}
],
"as": "questions"
}
},
{
"$lookup": {
"from": "answers",
"let": {
id: "$_id"
},
"pipeline": [
{
"$match": {
$expr: {
"$and": [
{
"$gt": [
"$createdAt",
ISODate("2022-09-03T00:00:00Z")
]
},
{
"$eq": [
"$$id",
"$authorId"
]
}
]
}
}
}
],
"as": "answers"
}
},
{
"$lookup": {
"from": "comments",
"let": {
id: "$_id"
},
"pipeline": [
{
"$match": {
$expr: {
"$and": [
{
"$gt": [
"$createdAt",
ISODate("2022-09-03T00:00:00Z")
]
},
{
"$eq": [
"$$id",
"$authorId"
]
}
]
}
}
}
],
"as": "comments"
}
},
{
"$project": {
"questionCount": {
"$size": "$questions"
},
"answersCount": {
"$size": "$answers"
},
"commentsCount": {
"$size": "$comments"
}
}
}
])
Playground link. In the above query, we use pipelined form of $lookup, to perform join on some custom logic. Learn more about $lookup here.
Another way is this, perform normal lookup and then filter out the elements:
db.users.aggregate([
{
"$lookup": {
"from": "questions",
"localField": "_id",
"foreignField": "authorId",
"as": "questions"
}
},
{
"$lookup": {
"from": "answers",
"localField": "_id",
"foreignField": "authorId",
"as": "answers"
}
},
{
"$lookup": {
"from": "comments",
"localField": "_id",
"foreignField": "authorId",
"as": "comments"
}
},
{
"$project": {
questionCount: {
"$size": {
"$filter": {
"input": "$questions",
"as": "item",
"cond": {
"$gt": [
"$$item.createdAt",
ISODate("2022-09-03T00:00:00Z")
]
}
}
}
},
answerCount: {
"$size": {
"$filter": {
"input": "$answers",
"as": "item",
"cond": {
"$gt": [
"$$item.createdAt",
ISODate("2022-09-03T00:00:00Z")
]
}
}
}
},
commentsCount: {
"$size": {
"$filter": {
"input": "$comments",
"as": "item",
"cond": {
"$gt": [
"$$item.createdAt",
ISODate("2022-09-03T00:00:00Z")
]
}
}
}
}
}
}
])
Playground link.
This is my Product Schema
const product = mongoose.Schema({
name: {
type: String,
},
category:{
type: ObjectId,
ref: Category
}
}
I want to return the product based on filters coming from the front end.
For example: Lets consider there are 2 categories Summer and Winter. When the user wants to filter product by Summer Category an api call would me made to http://localhost:8000/api/products?category=summer
Now my question is, how do I filter because from the frontend I am getting category name and there is ObjectId in Product Schema.
My attempt:
Project.find({category:categoryName})
If I've understood correctly you can try one of these queries:
First one is using pipeline into $lookup to match by category and name in one stage like this:
yourModel.aggregate([
{
"$lookup": {
"from": "category",
"let": {
"category": "$category",
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{
"$eq": [
"$id",
"$$category"
]
},
{
"$eq": [
"$name",
"Summer"
]
}
]
}
}
}
],
"as": "categories"
}
},
{
"$match": {
"categories": {
"$ne": []
}
}
}
])
Example here
Other way is using normal $lookup and then $filter the array returned by join.
yourModel.aggregate([
{
"$lookup": {
"from": "category",
"localField": "category",
"foreignField": "id",
"as": "categories"
}
},
{
"$match": {
"categories": {
"$ne": []
}
}
},
{
"$set": {
"categories": {
"$filter": {
"input": "$categories",
"as": "c",
"cond": {
"$eq": [
"$$c.name",
"Summer"
]
}
}
}
}
}
])
Example here
Note how both queries uses $match: { categories: { $ne : []}}, this is because if the lookup doesn't find any result it returns an empty array. By the way, this stage is optional.
Also you can add $unwind or $arrayElemAt index 0 or whatever to get only one value from the array.
Also, to do it in mongoose you only need to replace "Summer" with your variable.
I have these collections:
users
{
_id: "userId1",
// ...
tracks: ["trackId1", "trackId2"],
};
tracks
{
_id: "trackId1",
// ...
creatorId: "userId1",
categoryId: "categoryId1"
}
categories
{
_id: "categoryId1",
// ...
tracks: ["trackId1", "trackId15", "trackId20"],
};
by using the following code, I am able to get a track by its ID and add the creator
tracks.aggregate([
{
$match: { _id: ObjectId(trackId) },
},
{
$lookup: {
let: { userId: { $toObjectId: "$creatorId" } },
from: "users",
pipeline: [{ $match: { $expr: { $eq: ["$_id", "$$userId"] } } }],
as: "creator",
},
},
{ $limit: 1 },
])
.toArray();
Response:
"track": {
"_id": "trackId1",
// ...
"categoryId": "categoryId1",
"creatorId": "userId1",
"creator": {
"_id": "userId1",
// ...
"tracks": [
"trackId5",
"trackId10",
"trackId65"
]
}
}
but what I am struggling with is that I want the creator.tracks to aggregate also returning the tracks by their ID (e.g up to last 5), and also to get the last 5 tracks from the categoryId
expected result:
"track": {
"_id": "trackId1",
// ...
"categoryId": "categoryId1",
"creatorId": "userId1",
"creator": {
"_id": "userId1",
"tracks": [
{
"_id": "trackId5",
// the rest object without the creator
},
{
"_id": "trackId10",
// the rest object without the creator
},
{
"_id": "trackId65",
// the rest object without the creator
},
]
},
// without trackId1 which is the one that is being viewed
"relatedTracks": [
{
"_id": "trackId15",
// the rest object without the creator
},
{
"_id": "trackId20",
// the rest object without the creator
},
]
}
I would appreciate any explanation/help to understand what is the best one to do it and still keep the good performance
Query
start from a track
join with users using the trackId get all the tracks of the creator
(creator-tracks)
join with categories using the categoryId to get all the tracks of the category (related tracks)
remove from related-tracks the tracks of the creator
take the last 5 from both using $slice (creator-tracks and related-tracks)
*i added 2 extra lookups to get all info of the tracks, its empty arrays because i dont have enough data(i have only trackId1), with all the data it will work
PlayMongo
db.tracks.aggregate([
{
"$match": {
"_id": "trackId1"
}
},
{
"$lookup": {
"from": "users",
"localField": "creatorId",
"foreignField": "_id",
"as": "creator-tracks"
}
},
{
"$set": {
"creator-tracks": {
"$arrayElemAt": [
"$creator-tracks.tracks",
0
]
}
}
},
{
"$lookup": {
"from": "categories",
"localField": "categoryId",
"foreignField": "_id",
"as": "related-tracks"
}
},
{
"$set": {
"related-tracks": {
"$arrayElemAt": [
"$related-tracks.tracks",
0
]
}
}
},
{
"$set": {
"related-tracks": {
"$filter": {
"input": "$related-tracks",
"cond": {
"$not": [
{
"$in": [
"$$this",
"$creator-tracks"
]
}
]
}
}
}
}
},
{
"$set": {
"creator-tracks": {
"$slice": [
{
"$filter": {
"input": "$creator-tracks",
"cond": {
"$ne": [
"$$this",
"$_id"
]
}
}
},
-5
]
}
}
},
{
"$set": {
"related-tracks": {
"$slice": [
"$related-tracks",
-5
]
}
}
},
{
"$lookup": {
"from": "tracks",
"localField": "creator-tracks",
"foreignField": "_id",
"as": "creator-tracks-all-info"
}
},
{
"$lookup": {
"from": "tracks",
"localField": "related-tracks",
"foreignField": "_id",
"as": "related-tracks-all-info"
}
}
])
I have three collections users, products and orders , orders type has two possible values "Cash" and "Online". One users can have single/multiple products and products have none/single/multiple orders. I want to text search on users collection on name. Now I want to write a query which will return all matching users on text search highest text score first, it might be possible one user's name is returning top score but don't have any products and orders.
I have written a query but it's not returning users who has text score highest but don't have any products/orders. It's only returning users who has record present in all three collections. And also performance of this query is not great taking long time if a user has lot of products for example more than 3000 products. Any help appreciated.
db.users.aggregate(
[
{
"$match": {
"$text": {
"$search": "john"
}
}
},
{
"$addFields": {
"score": {
"$meta": "textScore"
}
}
},
{
"$sort": {
"Score": {
"$meta": "textScore"
}
}
},
{
"$skip": 0
},
{
"$limit": 6
},
{
"$lookup": {
"from": "products",
"localField": "userId",
"foreignField": "userId",
"as": "products"
}
},
{ $unwind: '$products' },
{
"$lookup": {
"from": "orders",
"let": {
"products": "$products"
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{
"$in": [
"$productId",
["$$products.productId"]
]
},
{
"$eq": [
"$orderType",
"Cash"
]
}
]
}
}
}
],
"as": "orders"
}
},
{ $unwind: 'orders' },
{
$group: {
_id: "$_id",
name: { $first: "$name" },
userId: { $first: "$userId" },
products: { $addToSet: "$products" },
orders: { $addToSet: "$orders" },
score: { $first: "$score" },
}
},
{ $sort: { "score": -1 } }
]
);
Issue:
Every lookup produces an array which holds the matched documents. When no documents are found, the array would be empty. Unwinding that empty array would break the pipeline immediately. That's the reason, we are not getting user records with no products/orders. We would need to preserve such arrays so that the pipeline execution can continue.
Improvements:
In orders lookup, the $eq can be used instead of $in, as we already
unwinded the products array and each document now contains only
single productId
Create an index on userId in products collection to make the query more efficient
Following is the updated query:
db.users.aggregate([
{
"$match": {
"$text": {
"$search": "john"
}
}
},
{
"$addFields": {
"score": {
"$meta": "textScore"
}
}
},
{
"$skip": 0
},
{
"$limit": 6
},
{
"$lookup": {
"from": "products",
"localField": "userId",
"foreignField": "userId",
"as": "products"
}
},
{
$unwind: {
"path":"$products",
"preserveNullAndEmptyArrays":true
}
},
{
"$lookup": {
"from": "orders",
"let": {
"products": "$products"
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{
"$eq": [
"$productId",
"$$products.productId"
]
},
{
"$eq": [
"$orderType",
"Cash"
]
}
]
}
}
}
],
"as": "orders"
}
},
{
$unwind: {
"path":"$orders"
"preserveNullAndEmptyArrays":true
}
},
{
$group: {
_id: "$_id",
name: {
$first: "$name"
},
userId: {
$first: "$userId"
},
products: {
$addToSet: "$products"
},
orders: {
$addToSet: "$orders"
},
score: {
$first: "$score"
}
}
},
{
$sort: {
"score": -1
}
}
]);
To get more information on unwind, please check https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/