MongoDB aggregation pipeline - convert array entries to subdocument

I'm working on a mongodb aggregation pipeline. I currently have the following document:
{
"data": [
{ "type": "abc", "price": 25000, "inventory": 15 },
{ "type": "def", "price": 8000, "inventory": 150 }
]
}
And I would like to turn it into:
{
"abc": { "price": 25000, "inventory": 15 },
"def": { "price": 8000, "inventory": 150 }
}
I could do it field by field with a $project stage, but my real documents have far more fields than this simple example, and I cannot know in advance which values will appear in type.

Since data is an array, you could use an aggregation pipeline similar to:
$unwind to split the array into separate documents, each containing a single item
$project to change each item from {type:x, price:y, inventory:z} to the pair [x, {price:y, inventory:z}]
$group to collect the pairs back into a single array of arrays
$arrayToObject to convert the array of arrays to {x:{price:y,inventory:z}, ...}
If you need more detail, I can see about working up a sample when I have a bit more time.
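The data shapes in the steps above can be pictured with a small plain-JavaScript stand-in that runs without MongoDB. It is only a sketch: Object.fromEntries plays the role of $arrayToObject, and the names sampleDoc and toKeyedObject are made up for illustration.

```javascript
// Plain-JavaScript stand-in for the pipeline steps (not MongoDB code).
const sampleDoc = {
  data: [
    { type: "abc", price: 25000, inventory: 15 },
    { type: "def", price: 8000, inventory: 150 }
  ]
};

// Hypothetical helper: turns each array entry into a [key, value] pair
// and then folds the pairs into one object.
function toKeyedObject(doc) {
  // Like $unwind + $project: one [type, everythingElse] pair per entry.
  const pairs = doc.data.map(({ type, ...rest }) => [type, rest]);
  // Like $group + $arrayToObject: the pairs become the keys of one object.
  return Object.fromEntries(pairs);
}

console.log(toKeyedObject(sampleDoc));
```

The rest spread keeps every field other than type, so the sketch does not need to list the fields one by one.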

Thanks to @Joe, I managed to create a solution:
db.collection.aggregate([
{
$unwind: "$data"
},
{
$project: {
data: [
"$data.type",
{
price: "$data.price",
inventory: "$data.inventory"
}
],
}
},
{
$group: {
_id: "$_id",
doc: {
$push: "$$ROOT"
}
}
},
{
$replaceRoot: {
newRoot: {
$arrayToObject: "$doc.data"
}
}
}
])
Result:
[
{
"abc": {
"inventory": 15,
"price": 25000
},
"def": {
"inventory": 150,
"price": 8000
}
}
]
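A possible variation avoids $unwind and $group by building {k, v} pairs with $map and handing them to $arrayToObject (available from MongoDB 3.4.4). Treat it as a sketch rather than a tested replacement; the constant name keyedPipeline and the $map variable item are illustrative, and it assumes every type value is a string.

```javascript
// Sketch of a single-stage alternative.
// Intended use: db.collection.aggregate(keyedPipeline)
const keyedPipeline = [
  {
    $replaceRoot: {
      newRoot: {
        // $arrayToObject accepts an array of { k, v } documents.
        $arrayToObject: {
          $map: {
            input: "$data",
            as: "item",
            in: {
              k: "$$item.type", // becomes the field name
              v: { price: "$$item.price", inventory: "$$item.inventory" }
            }
          }
        }
      }
    }
  }
];

console.log(JSON.stringify(keyedPipeline, null, 2));
```

Like the pipeline above, this drops _id from the output, because $replaceRoot replaces the whole document.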

Related

MongoDB Aggregate Query to find the documents with missing values

I have a large collection of objects in which data is stored for different employees.
{
"employee": "Joe",
"areAllAttributesMatched": false,
"characteristics": [
{
"step": "A",
"name": "house",
"score": "1"
},
{
"step": "B",
"name": "car"
},
{
"step": "C",
"name": "job",
"score": "3"
}
]
}
There are cases where the score for an object is missing entirely, and I want to find all of these cases in the database.
To do this, I have written the following query, but I seem to be going wrong somewhere, because it does not display any output.
I want the data in the following format, so that it is easy to see which employee is missing the score for which step and which name.
db.collection.aggregate([
{
"$unwind": "$characteristics"
},
{
"$match": {
"characteristics.score": {
"$exists": false
}
}
},
{
"$project": {
"employee": 1,
"name": "$characteristics.name",
"step": "$characteristics.step",
_id: 0
}
}
])
You need to use $exists to check whether the score field exists.
Alternatively, you can use $ifNull to handle both cases: 1. the score field is missing, and 2. the score field is present but null.
db.collection.aggregate([
{
"$unwind": "$characteristics"
},
{
"$match": {
$expr: {
$eq: [
{
"$ifNull": [
"$characteristics.score",
null
]
},
null
]
}
}
},
{
"$group": {
_id: null,
documents: {
$push: {
"employee": "$employee",
"name": "$characteristics.name",
"step": "$characteristics.step",
}
}
}
},
{
$project: {
_id: false
}
}
])
Here is the Mongo playground for your reference.
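The difference between the two match conditions can be sketched in plain JavaScript, without MongoDB. This is only an illustration: the third characteristic is changed to carry an explicit null score, and the names employeeDoc, missingOnly, missingOrNull and report are made up.

```javascript
// Plain-JavaScript stand-in (not MongoDB code) for the two match conditions.
const employeeDoc = {
  employee: "Joe",
  characteristics: [
    { step: "A", name: "house", score: "1" },
    { step: "B", name: "car" },              // score is missing
    { step: "C", name: "job", score: null }  // score is present but null (changed from the sample)
  ]
};

// Mirrors { "characteristics.score": { $exists: false } } after $unwind:
// only entries without the field at all.
const missingOnly = employeeDoc.characteristics.filter(c => !("score" in c));

// Mirrors comparing $ifNull: ["$characteristics.score", null] with null:
// entries where the field is missing or null.
const missingOrNull = employeeDoc.characteristics.filter(c => c.score == null);

// Shape each hit like the projected output.
const report = rows =>
  rows.map(c => ({ employee: employeeDoc.employee, name: c.name, step: c.step }));

console.log(report(missingOnly));
console.log(report(missingOrNull));
```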

mongodb - find previous and next document in aggregation framework

After applying a long pipeline to my collection I can obtain something like this:
[
{
"_id": "main1",
"title": "First",
"code": "C1",
"subDoc": {
"active": true,
"sub_id": "main1sub1",
"order": 1
}
},
{
"_id": "main2",
"title": "Second",
"code": "C2",
"subDoc": {
"active": true,
"sub_id": "main2sub1",
"order": 1
}
},
{
"_id": "main3",
"title": "Third",
"code": "C3",
"subDoc": {
"active": false,
"sub_id": "main3sub1",
"order": 1
}
}
]
The documents are already in the correct order. Now I have to find the document immediately preceding or following the one corresponding to a given parameter. For example, if I know { "code" : "C2" } I have to retrieve the previous document (example document with "code" : "C1").
I only need to get that document, not the others.
I know how to do it using the find() method and applying sort() and limit() in sequence, but I want to get the document directly in the aggregation pipeline, adding the necessary stages to do it.
I've tried some combinations of $indexOfArray and $arrayElemAt, but the first problem I encounter is that I don't have an array, just documents.
The second problem is that the parameter I know might sometimes be inside the subdocument, for example {"sub_id": "main3sub1"}, and again I should always get the previous or next parent document as a response (in the example, the pipeline should return document "main2" as the previous document).
I inserted the collection in mongoplayground to be able to perform the tests quickly:
mongoplayground
Any idea?
If you want to retrieve only the previous document, use the following query:
First Approach:
Using $match, $sort, $limit
db.collection.aggregate([
{
$match: {
code: {
"$lt": "C2"
}
}
},
{
"$sort": {
code: -1
}
},
{
$limit: 1
}
])
Second Approach:
As suggested by @Wernfried Domscheit,
convert the documents to an array and then use $arrayElemAt
db.collection.aggregate([
{
$group: {
_id: null,
data: {
$push: "$$ROOT"
}
}
},
{
$addFields: {
"index": {
$subtract: [
{
$indexOfArray: [
"$data.code",
"C2"
]
},
1
]
}
}
},
{
$project: {
_id: 0,
data: {
$arrayElemAt: [
"$data",
"$index"
]
}
}
},
{
$replaceRoot: {
newRoot: "$data"
}
}
])
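One thing to watch in the second approach: $indexOfArray returns -1 when nothing matches, and $arrayElemAt counts from the end of the array for a negative index, so asking for the document before "C1" (or before a code that does not exist) returns an unrelated document. A possible guard is sketched below. The helper names buildPreviousPipeline and previousOf are hypothetical, and the second function is only a plain-JavaScript mirror for checking the idea locally, not MongoDB code.

```javascript
// Sketch: build the "previous document" pipeline for any field path,
// e.g. "code" or "subDoc.sub_id".
// Intended use: db.collection.aggregate(buildPreviousPipeline("code", "C2"))
function buildPreviousPipeline(fieldPath, value) {
  return [
    { $group: { _id: null, data: { $push: "$$ROOT" } } },
    { $addFields: { index: { $indexOfArray: ["$data." + fieldPath, value] } } },
    // index is -1 when nothing matches and 0 for the first document;
    // in both cases there is no previous document, so return nothing.
    { $match: { index: { $gt: 0 } } },
    {
      $replaceRoot: {
        newRoot: { $arrayElemAt: ["$data", { $subtract: ["$index", 1] }] }
      }
    }
  ];
}

// Plain-JavaScript mirror of the same logic.
function previousOf(docs, fieldPath, value) {
  // Follow a dotted path such as "subDoc.sub_id" into the document.
  const read = doc =>
    fieldPath.split(".").reduce((o, key) => (o == null ? undefined : o[key]), doc);
  const index = docs.findIndex(doc => read(doc) === value);
  return index > 0 ? docs[index - 1] : null;
}

const orderedDocs = [
  { _id: "main1", code: "C1", subDoc: { sub_id: "main1sub1" } },
  { _id: "main2", code: "C2", subDoc: { sub_id: "main2sub1" } },
  { _id: "main3", code: "C3", subDoc: { sub_id: "main3sub1" } }
];

console.log(previousOf(orderedDocs, "code", "C2"));
console.log(previousOf(orderedDocs, "subDoc.sub_id", "main3sub1"));
console.log(previousOf(orderedDocs, "code", "C1"));
```

For the next document instead of the previous one, the same idea applies with index + 1 and a check against the array size.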

Mongodb find maximum based on nested object key

I have the schema below, and I need to identify the object that has the highest rank.
{ "team" : {
"member1" : [ { "rank": 2, "goal": 50 } ],
"member2" : [ { "rank": 5, "goal": 30 } ],
"member3" : [ { "rank": 1, "goal": 80 } ]
}}
$unwind will not work on the nested objects. I tried to convert this object to an array and then find the max of the rank key. Any help would be appreciated.
If the intent is only to find the maximum rank: use a two-stage aggregation in which the first $project applies $objectToArray, so that every member shares the common keys k and v, and the second applies $max to the required attribute.
Query: playground link
db.collection.aggregate([
{
$project: {
teamsData: {
$objectToArray: "$team"
}
}
},
{
$project: {
maxRank: {
$max: "$teamsData.v.rank"
}
}
}
]);
To get the details of the object that has the maximum rank: use $unwind on the array projected in the previous stage, $sort by rank, and then pick the first item with $first in the $group stage.
Query: playground link
db.collection.aggregate([
{
$project: {
team: {
$objectToArray: "$team"
}
}
},
{
$unwind: "$team"
},
{
$sort: {
"team.v.rank": -1
}
},
{
$group: {
_id: null,
maxRankObj: {
$first: "$$ROOT"
}
}
}
]);
Sample output:
[
{
"_id": null,
"maxRankObj": {
"_id": ObjectId("5a934e000102030405000000"),
"team": {
"k": "member2",
"v": [
{
"goal": 30,
"rank": 5
}
]
}
}
}
]
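What $objectToArray does to the team object, and how the top entry is then picked, can be sketched in plain JavaScript without MongoDB. This is an illustration only; the names teamDoc, teamEntries, topRank and maxRankEntry are made up.

```javascript
// Plain-JavaScript stand-in (not MongoDB code) for $objectToArray + sort + first.
const teamDoc = {
  team: {
    member1: [{ rank: 2, goal: 50 }],
    member2: [{ rank: 5, goal: 30 }],
    member3: [{ rank: 1, goal: 80 }]
  }
};

// Like $objectToArray: one { k, v } entry per key of the "team" object.
const teamEntries = Object.entries(teamDoc.team).map(([k, v]) => ({ k, v }));

// Highest rank inside one member's array (each array holds one item in the sample).
const topRank = entry => Math.max(...entry.v.map(item => item.rank));

// Like sorting by rank in descending order and keeping the first entry.
const maxRankEntry = teamEntries.reduce((best, entry) =>
  topRank(entry) > topRank(best) ? entry : best
);

console.log(maxRankEntry);
```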

MongoDB: single find request to return data from different documents with different fields

I have this collection:
{
"name": "Leonardo",
"height": "180",
"weapon": "sword",
"favorite_pizza": "Hawai"
},
{
"name": "Donatello",
"height": "181",
"weapon": "stick",
"favorite_pizza": "Pepperoni"
},
{
"name": "Michelangelo",
"height": "182",
"weapon": "nunchucks",
"favorite_pizza": "Bacon"
},
{
"name": "Raphael",
"height": "183",
"weapon": "sai",
"favorite_pizza": "Margherita"
}
Using a single query, I want this result (ordered by height):
{
"name": "Leonardo",
"height": "180",
"weapon": "sword",
"favorite_pizza": "Hawai"
},
{
"name": "Donatello"
},
{
"name": "Michelangelo"
},
{
"name": "Raphael"
}
So the query needs to first get the document which has smallest height field and then get all contents of that document, then it needs to get all other documents and return only name field of those documents, while ordering those documents by height.
Change your height field to a numeric type for correct sorting; then you can try the aggregation pipeline below in version 3.4.
The query $sorts the document by "height" ascending followed by $group to create two fields, "first" field which has the smallest height record ($$ROOT to access the whole document) and "allnames" to record all names.
$project with $slice + $concatArrays to replace the "allnames" array first element with the smallest height document and get the updated array.
$unwind with $replaceRoot to promote all the docs to top level.
db.colname.aggregate([
{"$sort":{
"height":1
}},
{"$group":{
"_id":null,
"first":{"$first":"$$ROOT"},
"allnames":{"$push":{"name":"$name"}}
}},
{"$project":{
"data":{"$concatArrays":[["$first"],{"$slice":["$allnames",1,{"$size":"$allnames"}] } ]}
}},
{"$unwind":"$data"},
{"$replaceRoot":{"newRoot":"$data"}}
])
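The intended result of these stages can be checked with a plain-JavaScript stand-in that runs without MongoDB. It is a sketch only: the input order is shuffled on purpose, heights are converted with Number because the sample stores them as strings, and the names turtles, byHeight and shaped are made up.

```javascript
// Plain-JavaScript stand-in (not MongoDB code) for the $sort/$group/$project steps.
const turtles = [
  { name: "Raphael", height: "183", weapon: "sai", favorite_pizza: "Margherita" },
  { name: "Leonardo", height: "180", weapon: "sword", favorite_pizza: "Hawai" },
  { name: "Michelangelo", height: "182", weapon: "nunchucks", favorite_pizza: "Bacon" },
  { name: "Donatello", height: "181", weapon: "stick", favorite_pizza: "Pepperoni" }
];

// Like $sort: order by height, compared as numbers.
const byHeight = [...turtles].sort((a, b) => Number(a.height) - Number(b.height));

// Like $first plus the sliced "allnames" array: keep the first document
// whole and only the name of every later one.
const shaped = byHeight.map((doc, i) => (i === 0 ? doc : { name: doc.name }));

console.log(shaped);
```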
Just for completeness reasons...
@Veeram's answer is probably the better choice (I have a feeling it should be faster and easier to understand), but you can achieve the same result using a slightly simpler $group stage followed by a slightly more complex $project stage using $reduce:
collection.aggregate([{
$sort: {
"height": 1
}
}, {
$group: {
"_id":null,
"allnames": {
$push: "$$ROOT"
}
}
}, {
$project: {
"data": {
$reduce: {
input: "$allnames",
initialValue: null,
in: {
$cond: [{
$eq: [ "$$value", null ] // if it's the first time we come here
},
[ "$$this" ], // we include the entire document
{
$concatArrays: [ // else we concat
"$$value", // the already concatenated values
[ { "name": "$$this.name" } ] // with the "name" of the currently looked at document
]
}]
}
}
}
}
}, {
$unwind: "$data"
}, {
$replaceRoot: {
"newRoot": "$data"
}
}])
Alternatively, as pointed out by @Veeram in the comment below, it's possible to write the $reduce in this way:
$project: {
"data": {
$reduce: {
input: { "$slice": [ "$allnames", 1, { $size: "$allnames" } ] }, // process everything in the "allnames" array except for the first item
initialValue: { "$slice": [ "$allnames", 1 ] }, // start with the first item
in: { $concatArrays: [ "$$value", [ { "name": "$$this.name" } ] ]} // and keep appending the "name" field of all other items only
}
}
}

How to access the fields from arrays of a object in two different collections?

This is locations collection data.
{
_id: "1",
location: "loc1",
sublocations: [
{
_id: 2,
sublocation: "subloc1",
},
{
_id: 3,
sublocation: "subloc2",
}
]
},
{
_id: "4",
location: "loc2",
sublocations: [
{
_id: 5,
sublocation: "subloc1",
},
{
_id: 6,
sublocation: "subloc2",
}
]
}
This is products collection data
{
_id: "1",
product: "product1",
prices: [
{
_id: 2,
sublocationid: 2, //ObjectId of object in sublocations array
price: 500
},
{
_id: 3,
sublocationid: 5, //ObjectId of object in sublocations array
price: 200
}
]
}
Now I need to add the sublocation name to each entry of the prices array in the product document. The expected result is below.
{
_id: "1",
product: "product1",
prices: [
{
_id: 2,
sublocationid: 2,
sublocation: "subloc1",
price: 500
},
{
_id: 3,
sublocationid: 5,
sublocation: "subloc1",
price: 200
}
]
}
To achieve this, I did the following.
First, run an aggregation on the locations collection: $unwind the sublocations array and write the result to a new collection with $out.
Second, run an aggregation on the products collection: $unwind the prices, $lookup the sublocationid in the new collection, and $group the results.
Third, after getting the data, delete the contents of the new collection.
Is there a simpler way? Please let me know if there is.
If you want to stick with version 3.4, you can try this query:
db.products.aggregate([
{
$unwind: {
"path": "$prices"
}
},
{
$lookup: {
"from": "locations",
"localField": "prices.sublocationid",
"foreignField": "sublocations._id",
"as": "locations"
}
},
{
$unwind: {
"path": "$locations"
}
},
{
$unwind: {
"path": "$locations.sublocations"
}
},
{
$addFields: {
"keep": {
"$eq": [
"$prices.sublocationid",
"$locations.sublocations._id"
]
}
}
},
{
$match: {
"keep": true
}
},
{
$addFields: {
"price": {
"_id": "$prices._id",
"sublocationid": "$prices.sublocationid",
"sublocation": "$locations.sublocations.sublocation",
"price": "$prices.price"
}
}
},
{
$group: {
"_id": "$_id",
"product": { "$first": "$product" },
"prices": { "$addToSet": "$price" }
}
}
]);
It's not as nice as the 3.6 version though, because of its higher memory consumption.
You can try the aggregation query below in version 3.6.
Since both the local field and the foreign field are arrays, you have to $unwind both to do an equality comparison.
For this you will have to use the new $lookup syntax.
$match with $expr provides comparison between document fields, to look up the location's sublocation document for each product's sublocation id.
$project to project the matching sublocation doc.
$addFields with $arrayElemAt to convert the looked up sublocation array into a document.
$group to push all prices with matching sublocation's document for each product.
db.products.aggregate([
{
"$unwind": "$prices"
},
{
"$lookup": {
"from": "locations",
"let": {
"prices": "$prices"
},
"pipeline": [
{
"$unwind": "$sublocations"
},
{
"$match": {
"$expr": {
"$eq": [
"$$prices.sublocationid",
"$sublocations._id"
]
}
}
},
{
"$project": {
"sublocations": 1,
"_id": 0
}
}
],
"as": "prices.sublocations"
}
},
{
"$addFields": {
"prices.sublocations": {
"$arrayElemAt": [
"$prices.sublocations",
0
]
}
}
},
{
"$group": {
"_id": "$_id",
"product": {
"$first": "$product"
},
"prices": {
"$push": "$prices"
}
}
}
])
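The join that both pipelines perform can be sketched in plain JavaScript, which may help when checking the expected output by hand. This is not MongoDB code, and the names locations, product, sublocationNames and enriched are made up for the illustration.

```javascript
// Plain-JavaScript stand-in (not MongoDB code) for the $unwind/$lookup/$group join.
const locations = [
  { _id: "1", location: "loc1", sublocations: [
    { _id: 2, sublocation: "subloc1" },
    { _id: 3, sublocation: "subloc2" }
  ] },
  { _id: "4", location: "loc2", sublocations: [
    { _id: 5, sublocation: "subloc1" },
    { _id: 6, sublocation: "subloc2" }
  ] }
];

const product = {
  _id: "1",
  product: "product1",
  prices: [
    { _id: 2, sublocationid: 2, price: 500 },
    { _id: 3, sublocationid: 5, price: 200 }
  ]
};

// Like unwinding "sublocations" across all locations: index each one by _id.
const sublocationNames = new Map(
  locations.flatMap(loc => loc.sublocations.map(s => [s._id, s.sublocation]))
);

// Like the lookup plus regrouping: attach the matching name to every price entry.
const enriched = {
  ...product,
  prices: product.prices.map(p => ({
    ...p,
    sublocation: sublocationNames.get(p.sublocationid)
  }))
};

console.log(JSON.stringify(enriched, null, 2));
```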