MongoDB Sorting Nested Array Fields

I have a collection of items whose price depends on the user's membership level (silver, gold, platinum):
{
..
memberships: [
{level: "silver", price: 100},
{level: "gold", price: 90},
{level: "platinum", price: 80}
]
}
When silver users browse for items, they need to see the items sorted by the memberships price where level is silver.
How do I sort this?

You can use $unwind and then $sort in an aggregation pipeline, e.g.:
db.collection.aggregate([
  { $unwind: "$memberships" },
  { $sort: { "memberships.price": 1 } },
  { $group: {
    _id: "$_id",
    memberships: { $push: "$memberships" }
  } }
])
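Note that the pipeline above orders the memberships inside each item rather than the items themselves. If the goal is to list items by their silver price, one possible variation is to keep only the silver entry after unwinding and sort on it. This is a sketch under assumptions, not a tested solution: the collection name items and the variable names are made up for illustration.

```javascript
// Sketch: order items by the price that applies to one membership level.
// "items" is a hypothetical collection name; adjust it to your own.
const level = "silver";

const sortByLevelPrice = [
  // One output document per (item, membership) pair.
  { $unwind: "$memberships" },
  // Keep only the pair for the level being browsed.
  { $match: { "memberships.level": level } },
  // Cheapest first; use -1 for most expensive first.
  { $sort: { "memberships.price": 1 } }
];

// In the mongo shell: db.items.aggregate(sortByLevelPrice)
```

After these stages each result document carries memberships as a single object (the silver entry) instead of the full array.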

Related

MongoDB aggregation grouping fill missing values

I'm using MongoDB aggregation framework. I have a Mongo collection with documents like this:
{
'step': 1,
'name': 'house',
'score': 2
}
{
'step': 1,
'name': 'car',
'score': 3
}
{
'step': 2,
'name': 'house',
'score': 4
}
I'm grouping the documents with same 'step' and pushing 'name' and 'score' into an array of objects. What I get is:
{
'step': 1,
'scores':
[
{'name':'house','score':2},
{'name':'car','score':3}
]
}
{
'step': 2,
'scores':
[
{'name':'house','score':4}
]
}
For each 'step' I need to copy the value from the previous 'step' whenever a 'name' does not exist. I should have something like this:
{
'step': 1,
'scores':
[
{'name':'house','score':2},
{'name':'car','score':3}
]
}
{
'step': 2,
'scores':
[
{'name':'house','score':4},
{'name':'car','score':3}
]
}
In the second document the element {'name':'car','score':3} has been copied from the previous document, because at 'step': 2 there are no documents with a 'score' for 'car'.
I'm not able to figure out how to do this operation with MongoDB aggregation. Any help would be much appreciated.
This requires $lookup with a pipeline. Step by step:
$group by step and push all scores into one array, scores.
Also push every name of that step into names; it is used in the $match condition inside the lookup.
db.collection.aggregate([
{
$group: {
_id: "$step",
scores: {
$push: {
name: "$name",
score: "$score"
}
},
names: { $push: "$name" }
}
},
$unwind scores, because it is an array and the lookup runs per element.
{ $unwind: "$scores" },
$lookup with let variables step_id (the previous step, _id minus 1) and names, made available at the pipeline level.
$match with an $expr expression that has three conditions:
check the size of names; it should be one (1), i.e. the step has either car or house but not both,
match the step number; it should equal step_id,
match names that are not already present, e.g. if car is already available then the lookup searches for house; this needs a separate $not wrapped around $in.
$project to show only the required fields.
The lookup result is stored in clone_score.
{
$lookup: {
from: "collection",
let: {
step_id: { $subtract: ["$_id", 1] },
names: "$names"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: [{ $size: "$$names" }, 1] },
{ $eq: ["$$step_id", "$step"] },
{ $not: [{ $in: ["$name", "$$names"] }] }
]
}
}
},
{
$project: {
_id: 0,
name: 1,
score: 1
}
}
],
as: "clone_score"
}
},
$group by step (_id) again: push all scores back into one array scores and keep the first clone_score.
{
$group: {
_id: "$_id",
scores: { $push: "$scores" },
clone_score: { $first: "$clone_score" }
}
},
After the stages above we have two separate arrays, scores and clone_score.
$project concatenates both of them into scores.
{
$project: {
_id: 0,
step: "$_id",
scores: {
$concatArrays: ["$scores", "$clone_score"]
}
}
}
])
Playground: https://mongoplayground.net/p/Fytf7NEU7uG
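
To make the intended result concrete, here is the same carry-forward idea written as plain JavaScript over an in-memory array. It only illustrates the logic and does not replace the pipeline. The helper name fillMissing is made up for this example, and unlike the $lookup above it carries values forward across any number of steps, which is an assumption about what is wanted.

```javascript
// Plain JavaScript illustration of "copy the previous step's score
// when a name is missing". fillMissing is a hypothetical helper.
function fillMissing(docs) {
  // Group scores by step: Map<step, Map<name, score>>.
  const byStep = new Map();
  for (const { step, name, score } of docs) {
    if (!byStep.has(step)) byStep.set(step, new Map());
    byStep.get(step).set(name, score);
  }

  const result = [];
  let previous = new Map(); // scores known as of the previous step
  const steps = [...byStep.keys()].sort((a, b) => a - b);
  for (const step of steps) {
    // Start from the previous step, then overwrite with this step's values.
    const merged = new Map([...previous, ...byStep.get(step)]);
    result.push({
      step,
      scores: [...merged].map(([name, score]) => ({ name, score }))
    });
    previous = merged;
  }
  return result;
}

const filled = fillMissing([
  { step: 1, name: "house", score: 2 },
  { step: 1, name: "car", score: 3 },
  { step: 2, name: "house", score: 4 }
]);
```

For the sample data, filled holds two entries, and the one for step 2 contains house with score 4 plus car with score 3 copied from step 1.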

MongoDB Aggregation SUM Array of Arrays by object key

Okay, so I've been searching for a while but couldn't find an answer to this, and I am desperate :P
I have some documents with this syntax
{
"period": ISODate("2018-05-29T22:00:00.000+0000"),
"totalHits": 13982,
"hits": [
{
// some fields...
users: [
{
// some fields...
userId: 1,
products: [
{ productId: 1, price: 30 },
{ productId: 2, price: 30 },
{ productId: 3, price: 30 },
{ productId: 4, price: 30 },
]
},
]
}
]
}
I want to retrieve a count of how many products (regardless of which user has them) we have in each period. An example output would look like this:
[
{
"period": ISODate("2018-05-27T22:00:00.000+0000"),
"count": 432
},
{
"period": ISODate("2018-05-28T22:00:00.000+0000"),
"count": 442
},
{
"period": ISODate("2018-05-29T22:00:00.000+0000"),
"count": 519
}
]
What is driving me crazy is the "object inside an array inside an array" I've done many aggregations but I think they were simpler than this one, so I am a bit lost.
I am thinking about changing our document structure to a better one, but we have ~6M documents which we would need to transform to the new one and that's just a mess... but Maybe it's the only solution.
We are using MongoDB 3.2, we can't update our systems atm (I wish, but not possible).
You can use $unwind to expand your array, then use $group to sum:
db.test.aggregate([
{$match: {}},
{$unwind: "$hits"},
{$project: {_id: "$_id", period: "$period", users: "$hits.users"}},
{$unwind: "$users"},
{$project: {_id: "$_id", period: "$period", count: {$size: "$users.products"}}},
{$group: {"_id": "$period", "count": {$sum: "$count"}}}
])
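The idea behind the pipeline (two $unwind levels, $size per user, $sum per period) can be checked against a small sample in plain JavaScript. This is only a model of the logic for verifying expected counts, not something to run against 6M documents; countProductsByPeriod is a made-up helper name.

```javascript
// Plain JavaScript model of: $unwind hits, $unwind users,
// $size of products, then $group by period with $sum.
// countProductsByPeriod is a hypothetical helper.
function countProductsByPeriod(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc.period.toISOString();
    for (const hit of doc.hits) {        // first $unwind
      for (const user of hit.users) {    // second $unwind
        // $size of products, accumulated per period like $sum
        counts.set(key, (counts.get(key) || 0) + user.products.length);
      }
    }
  }
  return [...counts].map(([period, count]) => ({ period, count }));
}

const sample = [{
  period: new Date("2018-05-29T22:00:00.000Z"),
  hits: [{
    users: [{
      userId: 1,
      products: [
        { productId: 1, price: 30 },
        { productId: 2, price: 30 },
        { productId: 3, price: 30 },
        { productId: 4, price: 30 }
      ]
    }]
  }]
}];

const productCounts = countProductsByPeriod(sample);
```

With the single sample document from the question, productCounts has one entry with a count of 4.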

$group after $lookup is taking way too long

I have the following mongo collection:
{
"_id" : "22pTvYLd7azAAPL5T",
"plate" : "ABC-123",
"company": "AMZ",
"_portfolioType" : "account"
},
{
"_id" : "22pTvYLd7azAAPL5T",
"plate" : "ABC-123",
"_portfolioType" : "sale",
"price": 87.3
},
{
"_id" : "22pTvYLd7azAAPL5T",
"plate" : "ABC-123",
"_portfolioType" : "sale",
"price": 88.9
}
And I am trying to aggregate all documents which have same value in plate field. Below is the query I have written so far:
db.getCollection('temp').aggregate([
{
$lookup: {
from: 'temp',
let: { 'p': '$plate', 't': '$_portfolioType' },
pipeline: [{
'$match': {
'_portfolioType': 'sale',
'$expr': { '$and': [
{ '$eq': [ '$plate', '$$p' ] },
{ '$eq': [ '$$t', 'account' ] }
]}
}
}],
as: 'revenues'
},
},
{
$project: {
plate: 1,
company: 1,
totalTrades: { $arrayElemAt: ['$revenues', 0] },
},
},
{
$addFields: {
revenue: { $add: [{ $multiply: ['$totalTrades.price', 100] }, 99] },
},
},
{
$group: {
_id: '$company',
revenue: { $sum: '$revenue' },
}
}
])
The query works fine if I remove the $group stage; however, as soon as I add the $group stage, mongo seems to keep processing forever. I tried adding $match as the first stage to limit the number of documents to process, but without any luck. E.g.:
{
$match: { $or: [{ _portfolioType: 'account' }, { _portfolioType: 'sale' }] }
},
I also tried using { explain: true } but it doesn't return anything helpful.
As Neil Lunn noticed, you very likely don't need the lookup to reach your "end goal", which is still quite vague.
Please read comments and adjust as needed:
db.temp.aggregate([
{$group:{
// Get unique plates
_id: "$plate",
// Not clear what you expect if there are documents with
// different company, and the same plate.
// Assuming "it never happens"
// You may need to $cond it here with {$eq: ["$_portfolioType", "account"]}
// but you never voiced it.
company: {$first:"$company"},
// Not exactly all documents with _portfolioType: sale,
// but rather price from all documents for this plate.
// Assuming price field is available only in documents
// with "_portfolioType" : "sale". Otherwise add a $cond here.
// If you really need "all documents", push $$ROOT instead.
prices: {$push: "$price"}
}},
{$project: {
company: 1,
// Apply your math here, or on the previous stage
// to calculate revenue per plate
revenue: "$prices"
}},
{$group: {
// Get document for each "company"
_id: "$company",
// Revenue associated with plate
revenuePerPlate: {$push: {"k":"$_id", "v":"$revenue"}}
}},
{$project:{
_id: 0,
company: "$_id",
// Count of unique plate
platesCnt: {$size: "$revenuePerPlate"},
// arrayToObject if you wish plate names as properties
revenuePerPlate: {$arrayToObject: "$revenuePerPlate"}
}}
])

Why is $match not used in the Mongo Aggregation query?

As described in the mongo documentation:
https://docs.mongodb.com/manual/reference/sql-aggregation-comparison/
There is a query for the following SQL query:
SELECT cust_id,
SUM(li.qty) as qty
FROM orders o,
order_lineitem li
WHERE li.order_id = o.id
GROUP BY cust_id
And the equivalent mongo aggregation query is as follows:
db.orders.aggregate( [
{ $unwind: "$items" },
{
$group: {
_id: "$cust_id",
qty: { $sum: "$items.qty" }
}
}
] )
The query works fine, as expected. My question is: why is there no $match stage for the corresponding WHERE clause in SQL? And how does $unwind compensate for the missing $match?
The comment by @Veeram is correct. The WHERE clause in the SQL is unnecessary here because the items list is embedded in the orders collection, whereas in a relational database you would have both an orders table and an order_lineitem table (names taken from the description at https://docs.mongodb.com/manual/reference/sql-aggregation-comparison/).
Per the example data, you start with documents like this:
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: [ { sku: "xxx", qty: 25, price: 1 },
{ sku: "yyy", qty: 25, price: 1 } ]
}
When you $unwind, each element of items gets its own output document, and the rest of the fields are copied into every one of them. If you run a query like
db.orders.aggregate([ {"$unwind": "$items"} ])
you get the output
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: { sku: "xxx", qty: 25, price: 1 }
},
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: { sku: "yyy", qty: 25, price: 1 }
}
That has flattened the items array, allowing the $group to add the items.qty field:
db.orders.aggregate([
{"$unwind": "$items"},
{"$group": {
"_id": "$cust_id",
"qty": {"$sum": "$items.qty"}
}
}])
With the output:
{ "_id": "abc123",
"qty": 50
}
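The behaviour described above can be modelled in a few lines of plain JavaScript. This is an illustrative sketch, not how MongoDB implements it; unwind and sumQtyByCustomer are made-up helper names, and the sample order omits ord_date for brevity.

```javascript
// Plain JavaScript model of $unwind followed by $group with $sum.
// unwind and sumQtyByCustomer are hypothetical helpers, not MongoDB APIs.
function unwind(docs, field) {
  // Emit one copy of the document per array element, with the array
  // field replaced by that single element.
  return docs.flatMap(doc =>
    (doc[field] || []).map(element => ({ ...doc, [field]: element }))
  );
}

function sumQtyByCustomer(docs) {
  const totals = {};
  for (const doc of unwind(docs, "items")) {
    totals[doc.cust_id] = (totals[doc.cust_id] || 0) + doc.items.qty;
  }
  return totals;
}

const orders = [{
  cust_id: "abc123",
  status: "A",
  price: 50,
  items: [
    { sku: "xxx", qty: 25, price: 1 },
    { sku: "yyy", qty: 25, price: 1 }
  ]
}];

const unwound = unwind(orders, "items");   // two documents, one per item
const totals = sumQtyByCustomer(orders);   // quantity summed per cust_id
```

The join that SQL needs between orders and order_lineitem is what the embedded array already provides, so unwinding plays the role of the join and no filter is required.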

MongoDB - Unwind array using aggregation and remove duplicates

I am unwinding an array using MongoDB aggregation framework and the array has duplicates and I need to ignore those duplicates while doing a grouping further.
How can I achieve that?
You can use $addToSet to do this:
db.users.aggregate([
{ $unwind: '$data' },
{ $group: { _id: '$_id', data: { $addToSet: '$data' } } }
]);
It's hard to give you a more specific answer without seeing your actual query.
You have to use $addToSet, but at first you have to group by _id, because if you don't you'll get an element per item in the list.
Imagine a collection posts with documents like this:
{
body: "Lorem Ipsum...",
tags: ["stuff", "lorem", "lorem"],
author: "Enrique Coslado"
}
Imagine you want to calculate the most common tag per author. You'd write an aggregate query like this:
db.posts.aggregate([
{$project: {
author: "$author",
tags: "$tags",
post_id: "$_id"
}},
{$unwind: "$tags"},
{$group: {
_id: "$post_id",
author: {$first: "$author"},
tags: {$addToSet: "$tags"}
}},
{$unwind: "$tags"},
{$group: {
_id: {
author: "$author",
tags: "$tags"
},
count: {$sum: 1}
}}
])
That way you'll get documents like this:
{
_id: {
author: "Enrique Coslado",
tags: "lorem"
},
count: 1
}
Previous answers are correct, but the procedure of doing $unwind -> $group -> $unwind could be simplified.
You could use $addFields + $reduce to pass to the pipeline the filtered array which already contains unique entries and then $unwind only once.
Example document:
{
body: "Lorem Ipsum...",
tags: [{title: 'test1'}, {title: 'test2'}, {title: 'test1'}],
author: "First Last name"
}
Query:
db.posts.aggregate([
{$addFields: {
"uniqueTag": {
$reduce: {
input: "$tags",
initialValue: [],
in: {$setUnion: ["$$value", ["$$this.title"]]}
}
}
}},
{$unwind: "$uniqueTag"},
{$group: {
_id: {
author: "$author",
tags: "$uniqueTag"
},
count: {$sum: 1}
}}
])
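What the $reduce + $setUnion expression computes can be sketched in plain JavaScript. This is a model for illustration only; uniqueTitles is a made-up helper name. One difference to keep in mind: MongoDB does not specify the order of the array that $setUnion returns, while this sketch keeps first-seen order.

```javascript
// Plain JavaScript model of the $reduce + $setUnion expression.
// uniqueTitles is a hypothetical helper name.
function uniqueTitles(tags) {
  return tags.reduce(
    // "value" is the accumulator ($$value), "current" is $$this.
    // Add the title only if the accumulator does not contain it yet.
    (value, current) =>
      value.includes(current.title) ? value : [...value, current.title],
    [] // initialValue
  );
}

const post = {
  body: "Lorem Ipsum...",
  tags: [{ title: "test1" }, { title: "test2" }, { title: "test1" }],
  author: "First Last name"
};

const uniqueTag = uniqueTitles(post.tags);
```

For the example document, uniqueTag ends up with the two distinct titles, so the single $unwind that follows emits one document per distinct tag.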