Aggregation at each document level mongodb - mongodb

I have a list of documents like this
[{
"_id": "5dbc95f921d7625303fe2369",
"name": "John",
"itemsPurchased": [{
"offer": "o1",
"items": ["p1"]
},{
"offer": "o1",
"items": ["p1"]
},
{
"offer": "o1",
"items": ["p2"]
},
{
"offer": "o2",
"items": ["p1"]
}, {
"offer": "o7",
"items": ["p1"]
}
]
},
{
"_id": "zbc95f921d7625303fe2363",
"name": "Doe",
"itemsPurchased": [{
"offer": "o1",
"items": ["p11"]
},{
"offer": "o1",
"items": ["p11"]
},
{
"offer": "o2",
"items": ["p13"]
},
{
"offer": "o1",
"items": ["p22"]
},
{
"offer": "o2",
"items": ["p11"]
}, {
"offer": "o3",
"items": ["p11"]
}
]
}
]
And i am trying to compute unique offers on unique products by each customer, expecting the resultant to be like:
[
{
"_id": "5dbc95f921d7625303fe2369",
"name": "John",
"offersAndProducts": {
"o1":2,
"o2":2,
"o3":1
},
{
"_id": "zbc95f921d7625303fe2363",
"name": "Doe",
"offersAndProducts": {
"o1":2,
"o2":1,
"o7":1
}
]
I want to apply aggregations per document, After performing $unwind on itemsPurchased, applied $group on items and then on offer to eliminate the duplication:
{
"$group" : {
"_id" : {
"item" : {
"$arrayElemAt" : [
"$itemsPurchased.item",
0.0
]
},
"count" : {
"$sum" : 1.0
},
"offer" : "$itemsPurchased.offer"
}
}
}
then,
{
"$group" : {
"_id" : "$_id.offer",
"count" : {
"$sum" : 1.0
}
}
}
this gives the array of products and offers for all documents:
[
{o1:4,o2:3,o3:1,o7:1}
]
But i need it at document level.
tried $addFeild, but $unwind and $match operators gives invalid error.
Any other way of achieving this?

Generally speaking, it's an anti-pattern to $unwind an array and then to $group on the original _id since most operations can be done on the array directly, in a single stage. Here is what such a stage would look like:
{$addFields:{
offers:{$arrayToObject:{
$map:{
input:{$setUnion:"$itemsPurchased.offer"},
as:"o",
in:[
"$$o",
{$size:{$setUnion:{$let:{
vars:{items:{$filter:{
input:"$itemsPurchased",
cond:{$eq:["$$this.offer","$$o"]}
}}},
in:{$reduce:{
input:"$$items",
initialValue:[],
in:{$concatArrays:["$$value","$$items.items"]}
}}
}}}
}]
}
}}
}}
What this does is create an array where each element is a two element array (which is a syntax that $arrayToObject can convert to an object where first element is key name and second is value) and the input is a unique set of offers and for each we accumulate an array of products, get rid of duplicates (with $setUnion) and then get the size of the result. What this produces on your input is this:
"offers" : {
"o1" : 2,
"o2" : 2,
"o3" : 1
}

You need to run $unwind and $group twice. To count only unique items you can use $addToSet. To build your keys dynamically you need to use $arrayToObject:
db.collection.aggregate([
{
$unwind: "$itemsPurchased"
},
{
$unwind: "$itemsPurchased.items"
},
{
$group: {
_id: {
_id: "$_id",
offer: "$itemsPurchased.offer"
},
name: { $first: "$name" },
items: { $addToSet: "$itemsPurchased.items" }
}
},
{
$group: {
_id: "$_id._id",
name: { $first: "$name" },
offersAndProducts: { $push: { k: "$_id.offer", v: { $size: "$items" } } }
}
},
{
$project: {
_id: 1,
name: 1,
offersAndProducts: { $arrayToObject: "$offersAndProducts" }
}
}
])
Mongo Playground

Related

MongoDB - aggregation - array of array data selection

I'm want to create an aggregation for the following contents of a collection:
{ "_id": ObjectId("574ffe9bda461e4b4b0043ab"),
"list1": [
"_id": "54",
"list2": [
{
"lang": "EN",
"value": "val1"
},
{
"lang": "ES",
"value": "val2"
},
{
"lang": "FR",
"value": "val3"
},
{
"lang": "IT",
"value": "val3"
}
]
]
}
From this collection i want to get as Object ("id": "54", "value": "val3") the returned Object is based on condition : list1.id = "54" and list2.lang = "IT"
You can try a simple combination of $match and $unwind to traverse your nested arrays:
db.collection.aggregate([
{
$unwind: "$list1"
},
{
$match: { "list1._id": "54" }
},
{
$unwind: "$list1.list2"
},
{
$match: { "list1.list2.lang": "IT" }
},
{
$project: {
_id: "$list1._id",
val: "$list1.list2.value"
}
}
])
Mongo Playground.
If the list._id field is unique you can index it and swap first first two pipeline stages to filter out other documents before running $unwind:
db.collection.aggregate([
{
$match: { "list1._id": "54" }
},
{
$unwind: "$list1"
},
{
$unwind: "$list1.list2"
},
{
$match: { "list1.list2.lang": "IT" }
},
{
$project: {
_id: "$list1._id",
val: "$list1.list2.value"
}
}
])

Get last and minimal values from grouped documents

My document model looks like:
{
"model": "ABC123",
"date": "2018-12-24T23:00:00.000+0000",
"price": "2000" ,
}
I would like to retrive collection to get array of documents:
[
{ "_id" : "ABC123", "newestDate" : ISODate("2018-12-26T23:00:00Z"), "newestPrice" : 2801.33, "lowestPriceAtAll": 1300 },
{ "_id" : "ABC124", "newestDate" : ISODate("2018-12-26T23:00:00Z"), "newestPrice" : 2801.33, "lowestPriceAtAll": 990}
]
where _id is model field, newestPrice is price of newest document (grouped by model) and lowestPriceAtAll is lowest price in all documents with the same model.
I grilled two queries.
First is to find lowest price documents:
offers.aggregate([
{ $sort: { "model": 1, "price": 1 }},
{
$group: {
_id: "$model",
lowestPrice: { "$first": "$price" },
lowestPriceDate: { "$first": "$date"},
}
}
])
the second is to find newest documents:
offers.aggregate([
{ $sort: { "model": 1, "date": -1 }},
{
$group: {
_id: "$model",
newestDate: { "$first": "$date" },
newestPrice: { "$first": "$price"},
}
}
])
Is it possible to merge these two queries into one? (the most important thing is that documents have to be grouped by model field).
you can use $facet
db.offers.aggregate([
{$facet :{
lowest: [
{ $sort: { "model": 1, "price": 1 }},
{
$group: {
_id: "$model",
lowestPrice: { "$first": "$price" },
lowestPriceDate: { "$first": "$date"},
}
}
],
newest: [
{ $sort: { "model": 1, "date": -1 }},
{
$group: {
_id: "$model",
newestDate: { "$first": "$date" },
newestPrice: { "$first": "$price"},
}
}
]
}}
])

MongoDB filter for specific data in Array and return only specific fields in the output

I have a below structure maintained in a sample collection.
{
"_id": "1",
"name": "Stock1",
"description": "Test Stock",
"lines": [
{
"lineNumber": "1",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10001",
"name": "CricketBat",
"description": "Cricket bat"
},
"quantity": 10
},
{
"lineNumber": "2",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10002",
"name": "CricketBall",
"description": "Cricket ball"
},
"quantity": 10
},
{
"lineNumber": "3",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10003",
"name": "CricketStumps",
"description": "Cricket stumps"
},
"quantity": 10
}
]
}
I have a scenario where i will be given lineNumber and item.id, i need to filter the above collection based on lineNumber and item.id and i need to project only selected fields.
Expected output below:
{
"_id": "1",
"lines": [
{
"lineNumber": "1",
"item": {
"id": "BAT10001",
"name": "CricketBat",
"description": "Cricket bat"
},
"quantity": 10
}
]
}
Note: I may not get lineNumber all the times, if lineNumber is null then i should filter for item.id alone and get the above mentioned output.The main purpose is to reduce the number of fields in the output, as the collection is expected to hold huge number of fields.
I tried the below query,
db.sample.aggregate([
{ "$match" : { "_id" : "1"} ,
{ "$project" : { "lines" : { "$filter" : { "input" : "$lines" , "as" : "line" , "cond" :
{ "$and" : [ { "$eq" : [ "$$line.lineNumber" , "3"]} , { "$eq" : [ "$$line.item.id" , "BAT10001"]}]}}}}}
])
But i got all the fields, i'm not able to exclude or include the required fields.
I tried the below query and it worked for me,
db.Collection.aggregate([
{ $match: { _id: '1' } },
{
$project: {
lines: {
$map: {
input: {
$filter: {
input: '$lines',
as: 'line',
cond: {
$and: [
{ $eq: ['$$line.lineNumber', '3'] },
{ $eq: ['$$line.item.id', 'BAT10001'] },
],
},
},
},
as: 'line',
in: {
lineNumber: '$$line.lineNumber',
item: '$$line.item',
quantity: '$$line.quantity',
},
},
},
},
},
])
You can achieve it with $unwind and $group aggregation stages:
db.collection.aggregate([
{$match: {"_id": "1"}},
{$unwind: "$lines"},
{$match: {
$or: [
{"lines.lineNumber":{$exists: true, $eq: "1"}},
{"item.id": "BAT10001"}
]
}},
{$group: {
_id: "$_id",
lines: { $push: {
"lineNumber": "$lines.lineNumber",
"item": "$lines.item",
"quantity": "$lines.quantity"
}}
}}
])
$match - sets the criterias for the documents filter. The first stage is takes document with _id = "1", the second takes only documents which have lines.lineNumber equal to "1" or item.id equal to "BAT10001".
$unwind - splits the lines array into seperated documents.
$group - merges the documents by the _id element and puts the generated object with lineNumber, item and quantity elements into the lines array.

Distinct array element with condition

My documents look like this:
{
"_id": "1",
"tags": [
{ "code": "01-01", "type": "machine" },
{ "code": "04-06", "type": "gearbox" },
{ "code": "07-01", "type": "machine" }
]
},
{
"_id": "2",
"tags": [
{ "code": "03-04","type": "gearbox" },
{ "code": "01-01", "type": "machine" },
{ "code": "04-11", "type": "machine" }
]
}
I want to get distinct codes only for tags whose type is "machine". so, for the example above, the result should be ["01-01", "07-01", "04-11"].
How do I do this?
Using $unwind and then $group with the tag as the key will give you each tag in a separate document in your result set:
db.collection_name.aggregate([
{
$unwind: "$tags"
},
{
$match: {
"tags.type": "machine"
}
},
{
$group: {
_id: "$tags.code"
}
},
{
$project:{
_id:false
code: "$_id"
}
}
]);
Or, if you want them put into an array within a single document, you can use $push within a second $group stage:
db.collection_name.aggregate([
{
$unwind: "$tags"
},
{
$match: {
"tags.type": "machine"
}
},
{
$group: {
_id: "$tags.code"
}
},
{
$group:{
_id: null,
codes: {$push: "$_id"}
}
}
]);
Another user suggested including an initial stage of { $match: { "tags.type": "machine" } }. This is a good idea if your data is likely to contain a significant number of documents that do not include "machine" tags. That way you will eliminate unnecessary processing of those documents. Your pipeline would look like this:
db.collection_name.aggregate([
{
$match: {
"tags.type": "machine"
}
},
{
$unwind: "$tags"
},
{
$match: {
"tags.type": "machine"
}
},
{
$group: {
_id: "$tags.code"
}
},
{
$group:{
_id: null,
codes: {$push: "$_id"}
}
}
]);
> db.foo.aggregate( [
... { $unwind : "$tags" },
... { $match : { "tags.type" : "machine" } },
... { $group : { "_id" : "$tags.code" } },
... { $group : { _id : null , "codes" : {$push : "$_id"} }}
... ] )
{ "_id" : null, "codes" : [ "04-11", "07-01", "01-01" ] }
A better way would be to group directly on tags.type and use addToSet on tags.code.
Here's how we can achieve the same output in 3 stages of aggregation :
db.name.aggregate([
{$unwind:"$tags"},
{$match:{"tags.type":"machine"}},
{$group:{_id:"$tags.type","codes":{$addToSet:"$tags.code"}}}
])
Output : { "_id" : "machine", "codes" : [ "04-11", "07-01", "01-01" ] }
Also, if you wish to filter out tag.type codes, we just need to replace "machine" in match stage with desired tag.type.

MongoDB query with conditional group by statement

I need to export customer records from database of mongoDB. Exported customer records should not have duplicated values. "firstName+lastName+code" is the key to DE-duped the record and If there are two records present in database with same key then I need to give preference to source field with value other than email.
customer (id,firstName,lastName,code,source) collection is this.
If there are record 3 records with same unique key and 3 different sources then i need to choose only one record between 2 sources(TV,internet){or if there are n number of sources i need the one record only}not with the 'email'(as email will be choosen when only one record is present with the unique key and source is email)
query using:
db.customer.aggregate([
{
"$match": {
"active": true,
"dealerCode": { "$in": ["111391"] },
"source": { "$in": ["email", "TV", "internet"] }
}
},
{
$group: {
"_id": {
"firstName": "$personalInfo.firstName",
"lastName": "$personalInfo.lastName",
"code": "$vehicle.code"
},
"source": {
$addToSet: { "source": "$source" }
}
}
},
{
$redact:
{
$cond: [
{ $eq: [{ $ifNull: ["$source", "other"] }, "email"] },
"$$PRUNE",
"$$DESCEND"
]
}
},
{
$project:
{
"source":
{
$map:
{
"input": {
$cond: [
{ $eq: [{ $size: "$source" }, 0] },
[{ "source": "email" }],
"$source"
]
},
"as": "inp",
"in": "$$inp.source"
}
},
"record": { "_id": 1 }
}
}
])
sample output:
{ "_id" : { "firstName" : "sGI6YaJ36WRfI4xuJQzI7A==", "lastName" : "99eQ7i+uTOqO8X+IPW+NOA==", "code" : "1GTHK23688F113955" }, "source" : ["internet"] }
{ "_id" : { "firstName" : "WYDROTF/9vs9O7XhdIKd5Q==", "lastName" : "BM18Uq/ltcbdx0UJOXh7Sw==", "code" : "1G4GE5GV5AF180133" }, "source" : ["internet"] }
{ "_id" : { "firstName" : "id+U2gYNHQaNQRWXpe34MA==", "lastName" : "AIs1G33QnH9RB0nupJEvjw==", "code" : "1G4GE5EV0AF177966" }, "source" : ["internet"] }
{ "_id" : { "firstName" : "qhreJVuUA5l8lnBPVhMAdw==", "lastName" : "petb0Qx3YPfebSioY0wL9w==", "code" : "1G1AL55F277253143" }, "source" : ["TV"] }
{ "_id" : { "firstName" : "qhreJVuUA5l8lnBPVhMAdw==", "lastName" : "6LB/NmhbfqTagbOnHFGoog==", "code" : "1GCVKREC0EZ168134" }, "source" : ["TV", "internet"] }
This is a problem with this query please suggest :(
Your code doesn't work, because $cond is not an accumulator operator. Only these accumulator operators, can be used in a $group stage.
Assuming your records contain not more than two possible values of source as you mention in your question, you could add a conditional $project stage and modify the $group stage as,
Code:
db.customer.aggregate([
{
$group: {
"_id": {
"id": "$id",
"firstName": "$firstName",
"lastName": "$lastName",
"code": "$code"
},
"sourceA": { $first: "$source" },
"sourceB": { $last: "$source" }
}
},
{
$project: {
"source": {
$cond: [
{ $eq: ["$sourceA", "email"] },
"$sourceB",
"$sourceA"
]
}
}
}
])
In case there can be more that two possible values for source, then you could do the following:
Group by the id, firstName, lastName and code. Accumulate
the unique values of source, using the $addToSet operator.
Use $redact to keep only the values other than email.
Project the required fields, if the source array is empty(all the elements have been removed), add a
value email to it.
Unwind the source field to list it as a field and not an array.
(optional)
Code:
db.customer.aggregate([
{
$group: {
"_id": {
"id": "$id",
"firstName": "$firstName",
"lastName": "$lastName",
"code": "$code"
},
"sourceArr": { $addToSet: { "source": "$source" } }
}
},
{
$redact: {
$cond: [
{ $eq: [{ $ifNull: ["$source", "other"] }, "email"] },
"$$PRUNE",
"$$DESCEND"
]
}
},
{
$project: {
"source": {
$map: {
"input":
{
$cond: [
{ $eq: [{ $size: "$sourceArr" }, 0] },
[{ "source": "item" }],
"$sourceArr"]
},
"as": "inp",
"in": "$$inp.source"
}
}
}
}
])