MongoDB aggregation: $unwind after grouping by date - mongodb

I have this model for purchases:
{
purchase_date: 2018-03-11 00:00:00.000,
total_cost: 400,
items: [
{
title: 'Pringles',
price: 200,
quantity: 2,
category: 'Snacks'
}
]
}
First of all, I'm trying to group the purchases by date, like so:
{$group: {
_id: {
date: '$purchase_date',
items: '$items'
}
}}
However, what I now want to do is group each day's purchases by items[].category and calculate how much was spent on each category that day. I was able to do that for a single day, but once I grouped the purchases by date I was no longer able to $unwind the items.
I tried passing the path $items, and it isn't found at all. If I try $_id.$items or _id.$items, in both cases I get an error stating that it is not a valid path for $unwind.

You can use purchase_date and items.category as the grouping _id, but you need to $unwind items first. Then you can add a second $group to collect all the category totals for each day:
db.col.aggregate([
{ $unwind: "$items" },
{
$group: {
_id: {
purchase_date: "$purchase_date",
category: "$items.category",
},
total: { $sum: { $multiply: [ "$items.price", "$items.quantity" ] } }
}
},
{
$group: {
_id: "$_id.purchase_date",
categories: { $push: { name: "$_id.category", total: "$total" } }
}
}
])
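To check what the two $group stages produce, here is a minimal plain-JavaScript sketch that mirrors the pipeline in memory. Only the Pringles item comes from the question; the other items and the function name totalsPerDay are made up for illustration. It reproduces the arithmetic only and is not how MongoDB executes the pipeline.

```javascript
// Sample purchases: the Pringles item is from the question, the rest are hypothetical.
const purchases = [
  {
    purchase_date: '2018-03-11',
    items: [
      { title: 'Pringles', price: 200, quantity: 2, category: 'Snacks' },
      { title: 'Cola', price: 50, quantity: 1, category: 'Drinks' }, // hypothetical
    ],
  },
  {
    purchase_date: '2018-03-11',
    items: [{ title: 'Chips', price: 100, quantity: 1, category: 'Snacks' }], // hypothetical
  },
];

// Mirrors: $unwind items -> $group by (date, category) -> $group by date.
function totalsPerDay(docs) {
  const byDay = new Map(); // date -> Map(category -> total)
  for (const doc of docs) {
    for (const item of doc.items) { // the $unwind step
      if (!byDay.has(doc.purchase_date)) byDay.set(doc.purchase_date, new Map());
      const byCategory = byDay.get(doc.purchase_date);
      const spent = item.price * item.quantity; // the $multiply step
      byCategory.set(item.category, (byCategory.get(item.category) || 0) + spent);
    }
  }
  // Shape the result like the output of the second $group stage.
  return [...byDay].map(([date, byCategory]) => ({
    _id: date,
    categories: [...byCategory].map(([name, total]) => ({ name, total })),
  }));
}

console.log(JSON.stringify(totalsPerDay(purchases), null, 2));
```

Both purchases fall on the same day, so the result contains a single entry for that date, with one total per category.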

Related

Categorize data by using MongoDB aggregation

The payload is an Excel sheet that consists of 4 columns, i.e. Date, status, amount, orderId. You need to structure the data by categorizing it by month, and within each month the orders are categorized by status.
Umbrella Status:
INTRANSIT - ‘intransit’, ‘at hub’, ‘out for delivery’
RTO - ‘RTO Intransit’, ‘RTO Delivered’
PROCESSING - ‘processing’
For example, the response should look like:
May:
1. INTRANSIT
2. RTO
3. PROCESSING
June:
1. INTRANSIT
2. RTO
3. PROCESSING
You can use the different aggregation operators provided by MongoDB, for example: group, facet, match, unwind, bucket, project, lookup, etc.
I tried it with this:
const pipeline = [{
$facet: {
"INTRANSIT": [
{ $match: { Status: { $in: ['INTRANSIT', 'AT HUB', 'OUT FOR DELIVERY'] } } },
{ $group: { _id: "$Date", numberofbookings: { $sum: 1 } } }
],
"RTO": [
{ $match: { Status: { $in: ['RTO INTRANSIT', 'RTO DELIVERED'] } } },
{ $group: { _id: "$Date", numberofbookings: { $sum: 1 } } }
],
"PROCESSING": [
{ $match: { Status: { $in: ['PROCESSING'] } } },
// date.getMonth("$Date") cannot run inside a pipeline; $month assumes Date is stored as a BSON date
{ $group: { _id: { $month: "$Date" }, numberofbookings: { $sum: 1 } } }
]
}
}];
const aggCursor = coll.aggregate(pipeline);
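One possible approach, sketched under assumptions: map each raw status to its umbrella status with $switch, then group by month and umbrella status. The field names Date and Status follow the attempt above. The assumption that Date is stored as a BSON date (so that $month works), the upper-case status values, and the 'OTHER' fallback label are not from the question. The helper umbrellaOf is a hypothetical plain-JavaScript mirror of the $switch, included only so the mapping can be checked without a server.

```javascript
// Umbrella statuses from the question; raw values are assumed to be stored upper-case.
const UMBRELLAS = {
  INTRANSIT: ['INTRANSIT', 'AT HUB', 'OUT FOR DELIVERY'],
  RTO: ['RTO INTRANSIT', 'RTO DELIVERED'],
  PROCESSING: ['PROCESSING'],
};

// Hypothetical helper: plain-JavaScript mirror of the $switch below.
function umbrellaOf(status) {
  for (const [umbrella, raw] of Object.entries(UMBRELLAS)) {
    if (raw.includes(status)) return umbrella;
  }
  return 'OTHER'; // assumed fallback label
}

const monthlyPipeline = [
  // 1. Tag each order with its month number and umbrella status.
  { $project: {
    month: { $month: '$Date' }, // assumes Date is a BSON date
    umbrella: { $switch: {
      branches: Object.entries(UMBRELLAS).map(([umbrella, raw]) => ({
        case: { $in: ['$Status', raw] },
        then: umbrella,
      })),
      default: 'OTHER',
    } },
  } },
  // 2. Count orders per (month, umbrella status).
  { $group: { _id: { month: '$month', status: '$umbrella' }, count: { $sum: 1 } } },
  // 3. Collect the per-status counts under each month.
  { $group: { _id: '$_id.month', statuses: { $push: { status: '$_id.status', count: '$count' } } } },
  { $sort: { _id: 1 } },
];

console.log(umbrellaOf('AT HUB'));
console.log(JSON.stringify(monthlyPipeline, null, 2));
```

The pipeline would be passed to coll.aggregate(monthlyPipeline) in the same way as the attempt above. Unlike $facet, it reads each document once and returns one result document per month.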

Add number field in $project mongodb

I need to add an index number to each document when I fetch data. For example, I have this data:
[
{
_id : 616efd7e56c9530018e318ac
student : {
name: "Alpha"
email: null
nisn: "0408210001"
gender : "female"
}
},
{
_id : 616efd7e56c9530018e318af
student : {
name: "Beta"
email: null
nisn: "0408210002"
gender : "male"
}
}
]
and I need output like this:
[
{
no:1,
id:616efd7e56c9530018e318ac,
name: "Alpha",
nisn: "0408210001"
},
{
no:2,
id:616efd7e56c9530018e318af,
name: "Beta",
nisn: "0408210002"
}
]
I have tried this code, which almost gives what I expect:
{
'$project': {
'_id': 0,
'id': '$_id',
'name': '$student.name',
'nisn': '$student.nisn'
}
}
but I am still confused about how to add the index number. Is it possible to do this in $project, or do I have to do it another way? Thank you for taking the time to answer.
You can use $unwind which can return an index, like this:
db.collection.aggregate([
{
$group: {
_id: 0,
data: {
$push: {
_id: "$_id",
student: "$student"
}
}
}
},
{
$unwind: {path: "$data", includeArrayIndex: "no"}
},
{
"$project": {
"_id": 0,
"id": "$data._id",
"name": "$data.student.name",
"nisn": "$data.student.nisn",
"no": {"$add": ["$no", 1] }
}
}
])
I strongly suggest adding a $match stage before these stages; otherwise the first $group pushes your entire collection into a single document, which can exceed the 16 MB BSON document size limit.
You can also run a pipeline with a $setWindowFields stage (available from MongoDB 5.0), which lets you add a field holding the position of each document (its document number) within a partition. The position is produced by the $documentNumber operator, which is only available in the $setWindowFields stage.
The partition can be an extra constant field that acts as the window partition.
The final stage is $replaceWith, which replaces each input document with a new one that merges id and no with the fields of the embedded student document.
Running the following aggregation yields the desired fields (plus the remaining student fields, email and gender):
db.collection.aggregate([
{ $addFields: { _partition: 'students' }},
{ $setWindowFields: {
partitionBy: '$_partition',
sortBy: { _id: 1 }, // ascending, so the first document gets no: 1 as in the desired output
output: { no: { $documentNumber: {} } }
} },
{ $replaceWith: {
$mergeObjects: [
{ id: '$_id', no: '$no' },
'$student'
]
} }
])

MongoDB aggregation: How to get the index of a document in a collection sorted by a document property

Assume I have a collection with millions of documents. Below is a sample of what the documents look like:
[
{ _id:"1a1", points:[2,3,5,6] },
{ _id:"1a2", points:[2,6] },
{ _id:"1a3", points:[3,5,6] },
{ _id:"1b1", points:[1,5,6] },
{ _id:"1c1", points:[5,6] },
// ... more documents
]
I want to query a document by _id and return a document that looks like below:
{
_id:"1a1",
totalPoints: 16,
rank: 29
}
I know I can query the whole collection, sort it in descending order, find the index of the document I want by _id, and add one to get its rank. But I have worries about this method.
If there are millions of documents, won't this be overdoing it: querying a whole collection just to get one document? Is there a way to achieve this without querying the whole collection, or does the whole collection have to be involved because of the ranking?
I cannot store the documents ranked because the points keep changing. The actual code is more complex, but the takeaway is that I cannot store them ranked.
totalPoints is the sum of the values in the points array. The rank is calculated by sorting all documents by totalPoints in descending order; the first document has rank 1, and so on.
An aggregation pipeline like the following (shown here for _id '1a3') can get the result you want, but how it performs on a collection of millions of documents remains to be seen.
db.collection.aggregate(
[
{
$group: {
_id: null,
docs: {
$push: { _id: '$_id', totalPoints: { $sum: '$points' } }
}
}
},
{
$unwind: '$docs'
},
{
$replaceWith: '$docs'
},
{
$sort: { totalPoints: -1 }
},
{
$group: {
_id: null,
docs: { $push: '$$ROOT' }
}
},
{
$set: {
docs: {
$map: {
input: {
$filter: {
input: '$docs',
as: 'x',
cond: { $eq: ['$$x._id', '1a3'] }
}
},
as: 'xx',
in: {
_id: '$$xx._id',
totalPoints: '$$xx.totalPoints',
rank: {
$add: [{ $indexOfArray: ['$docs._id', '1a3'] }, 1]
}
}
}
}
}
},
{
$unwind: '$docs'
},
{
$replaceWith: '$docs'
}
])
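On MongoDB 5.0 or later, a shorter alternative is to number the documents with $setWindowFields and then match the one you need. This is a sketch rather than a benchmarked solution: it still has to compute and sort totalPoints for every document, so it does not avoid processing the whole collection. The names rankPipeline, targetId, and rankOf are hypothetical; rankOf is a plain-JavaScript mirror of the pipeline, included only so the logic can be checked in memory.

```javascript
const targetId = '1a1';

const rankPipeline = [
  // Sum the points array of each document.
  { $project: { totalPoints: { $sum: '$points' } } },
  // Number the documents from highest to lowest total (requires MongoDB 5.0+).
  { $setWindowFields: {
    sortBy: { totalPoints: -1 },
    output: { rank: { $documentNumber: {} } },
  } },
  // Keep only the requested document.
  { $match: { _id: targetId } },
];

// Hypothetical in-memory mirror of the pipeline, for checking the logic only.
function rankOf(docs, id) {
  const totals = docs
    .map((d) => ({ _id: d._id, totalPoints: d.points.reduce((a, b) => a + b, 0) }))
    .sort((a, b) => b.totalPoints - a.totalPoints);
  const index = totals.findIndex((d) => d._id === id);
  return index === -1 ? null : { ...totals[index], rank: index + 1 };
}

// The five sample documents from the question.
const sample = [
  { _id: '1a1', points: [2, 3, 5, 6] },
  { _id: '1a2', points: [2, 6] },
  { _id: '1a3', points: [3, 5, 6] },
  { _id: '1b1', points: [1, 5, 6] },
  { _id: '1c1', points: [5, 6] },
];

console.log(rankOf(sample, targetId));
```

The pipeline would be run as db.collection.aggregate(rankPipeline). Documents with equal totals receive different consecutive numbers from $documentNumber.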

MongoDB sum with match

I have a collection with the following data structure:
{
_id: ObjectId,
text: 'This contains some text',
type: 'one',
category: {
name: 'Testing',
slug: 'test'
},
state: 'active'
}
What I'm ultimately trying to do is get a list of categories and counts. I'm using the following:
const query = [
{
$match: {
state: 'active'
}
},
{
$project: {
_id: 0,
categories: 1
}
},
{
$unwind: '$categories'
},
{
$group: {
_id: { category: '$categories.name', slug: '$categories.slug' },
count: { $sum: 1 }
}
}
]
This returns all categories (that are active) and the total counts for documents matching each category.
The problem is that I want to introduce two additional $match that should still return all the unique categories, but only affect the counts. For example, I'm trying to add a text search (which is indexed on the text field) and also a match for type.
I can't do this at the top of the pipeline because it would then only return categories that match, not only affect the $sum. So basically it would be like being able to add a $match within the $group only for the $sum. Haven't been able to find a solution for this and any help would be greatly appreciated. Thank you!
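You can use $cond inside your $group stage so that the condition only affects the count. Because type is a top-level field in your documents, reference it as "$type" and keep it in the earlier $project stage (add type: 1); otherwise it is no longer available when $group runs:
{
$group: {
_id: { category: '$categories.name', slug: '$categories.slug' },
count: { $sum: { $cond: [ { $eq: [ "$type", "one" ] }, 1, 0 ] } }
}
}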
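To see why a condition inside $sum keeps every category while changing only the counts, here is a minimal in-memory sketch in plain JavaScript. The sample documents and the function countByCategory are hypothetical. The documents are assumed to carry a categories array, as the question's $unwind implies, and type is read from the top level of each document, as in the question's data structure.

```javascript
// Hypothetical documents; `categories` is assumed to be an array.
const docs = [
  { type: 'one', state: 'active', categories: [{ name: 'Testing', slug: 'test' }] },
  {
    type: 'two',
    state: 'active',
    categories: [{ name: 'Testing', slug: 'test' }, { name: 'Other', slug: 'other' }],
  },
];

// Mirrors $match (state) -> $unwind (categories) -> $group with a $cond inside $sum.
function countByCategory(allDocs, type) {
  const counts = new Map();
  for (const doc of allDocs.filter((d) => d.state === 'active')) {
    for (const category of doc.categories) {
      const current = counts.get(category.slug)
        || { name: category.name, slug: category.slug, count: 0 };
      // Every category gets an entry; only matching documents add to the count.
      current.count += doc.type === type ? 1 : 0;
      counts.set(category.slug, current);
    }
  }
  return [...counts.values()];
}

console.log(countByCategory(docs, 'one'));
```

A category that appears only in non-matching documents is still listed, with a count of 0, which is the behaviour a leading $match could not provide.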

MongoDB get full doc after match, group, and sort

Order:
{
order_id: 1,
order_time: ISODate(...),
customer_id: 456,
products: [
{
product_id: 1,
product_name: "Pencil"
},
{
product_id: 2,
product_name: "Scissors"
},
{
product_id: 3,
product_name: "Tape"
}
]
}
I have a collection with a whole bunch of documents like the above. I would like to query for the latest order for each customer who ordered Scissors.
That is, where there exists a "products.product_name" which equals "Scissors", group by customer_id, give me the full document where the "order_time" is the "max" for that group.
To find the documents, I could do something like find({ 'products.product_name' : "Scissors" }), but then I get all of the orders with Scissors; I only want the most recent.
So, I am looking at aggregation... Mongo's "$group" aggregation stage seems to require that you do some kind of actual aggregation inside like sum or max or whatever. I am guessing there's some combination of $match, $group, and $sort to use here but I can't seem to quite get it working.
Something close:
db.storcap.aggregate(
[
{
$match: { 'products.product_name' : "Scissors" }
},
{
$sort: { created_at:-1 }
},
{
$group: {
_id: "$customer_id",
}
}]
)
But this doesn't return the full doc and I am not sure that it's doing the sorting and grouping right.
You can use the $first operator to get the most recent order (the documents are sorted in descending order) and the special variable $$ROOT to get the whole document in the final result:
db.storcap.aggregate([
{
$match: { 'products.product_name' : "Scissors" }
},
{
$sort: { order_time: -1 } // order_time is the timestamp field in the sample document
},
{
$group: {
_id: "$customer_id",
lastOrder: { $first: "$$ROOT" }
}
}
])
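The effect of sorting newest first and then taking $first per group can be checked with a small in-memory sketch in plain JavaScript. The sample orders and the function latestOrderPerCustomer are hypothetical; the field names follow the sample document in the question.

```javascript
// Hypothetical orders; field names follow the sample document in the question.
const orders = [
  { order_id: 1, order_time: new Date('2018-01-01'), customer_id: 456,
    products: [{ product_id: 2, product_name: 'Scissors' }] },
  { order_id: 2, order_time: new Date('2018-02-01'), customer_id: 456,
    products: [{ product_id: 2, product_name: 'Scissors' }] },
  { order_id: 3, order_time: new Date('2018-03-01'), customer_id: 456,
    products: [{ product_id: 1, product_name: 'Pencil' }] },
  { order_id: 4, order_time: new Date('2018-01-15'), customer_id: 789,
    products: [{ product_id: 2, product_name: 'Scissors' }] },
];

// Mirrors $match -> $sort (newest first) -> $group with $first: "$$ROOT".
function latestOrderPerCustomer(allOrders, productName) {
  const matching = allOrders
    .filter((o) => o.products.some((p) => p.product_name === productName))
    .sort((a, b) => b.order_time - a.order_time);
  const latest = new Map();
  for (const order of matching) {
    // The first order seen for a customer is the newest one,
    // like $first after a descending $sort.
    if (!latest.has(order.customer_id)) latest.set(order.customer_id, order);
  }
  return [...latest].map(([customerId, order]) => ({ _id: customerId, lastOrder: order }));
}

console.log(latestOrderPerCustomer(orders, 'Scissors'));
```

Customer 456's newest order overall (order 3) contains no Scissors, so it is filtered out before grouping and order 2 is returned instead.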