Order results from MongoDB Query

I'm executing a MongoDB query in ExpressJS via Mongoose.
I have documents that look like this:
{
"name": "first",
"spellings": [
"here",
"is",
"you"
],
"keyStageLevel": 4,
"spellingLevel": 2,
"__v": 0
},
{
"name": "second",
"spellings": [
"her",
"is",
"another"
],
"keyStageLevel": 2,
"spellingLevel": 3,
"__v": 0
},
{
"name": "third",
"spellings": [
"snother",
"list"
],
"keyStageLevel": 2,
"spellingLevel": 4,
"__v": 0
}
I would like to have the result of my query returned so that
1) the keyStageLevel are in order and
2) within each keyStageLevel the spellingLevel are shown in order with the details of the document.
keyStageLevel 2
spellingLevel 3
name: "second",
"spellings": [
"her",
"is",
"another"
]
spellingLevel 4
name: "third",
"spellings": [
"snother",
"list"
]
keyStageLevel 4
spellingLevel 2
etc
My code currently runs
var spellings = await Spelling.aggregate([{"$group" : {_id:{keyStageLevel:"$keyStageLevel",spellingLevel:"$spellingLevel"}}} ]);
which returns
[
{
"_id": {
"keyStageLevel": 2,
"spellingLevel": 4
}
},
{
"_id": {
"keyStageLevel": 2,
"spellingLevel": 3
}
},
{
"_id": {
"keyStageLevel": 5,
"spellingLevel": 1
}
},
{
"_id": {
"keyStageLevel": 4,
"spellingLevel": 2
}
}
]
Many thanks for any help.

What you are after is a $group that accumulates the rest of each document's data under its "keyStageLevel"; that is done with $push. If you want results in a specific order, you always need to $sort, in this case both before and after the $group stage:
var spellings = await Spelling.aggregate([
{ "$sort": { "keyStageLevel": 1, "spellingLevel": 1 } },
{ "$group" : {
"_id": { "keyStageLevel": "$keyStageLevel" },
"data": {
"$push": {
"spellingLevel": "$spellingLevel",
"name": "$name",
"spellings": "$spellings"
}
}
}},
{ "$sort": { "_id": 1 } }
])
The first $sort ensures the items added via $push are accumulated in that order. The final $sort ensures that the output itself is in the desired order, since $group does not return the grouped keys in any guaranteed order unless you add such a stage.
This will give you output like:
{
"_id" : {
"keyStageLevel" : 2
},
"data" : [
{
"spellingLevel" : 3,
"name" : "second",
"spellings" : [
"her",
"is",
"another"
]
},
{
"spellingLevel" : 4,
"name" : "third",
"spellings" : [
"snother",
"list"
]
}
]
}
{
"_id" : {
"keyStageLevel" : 4
},
"data" : [
{
"spellingLevel" : 2,
"name" : "first",
"spellings" : [
"here",
"is",
"you"
]
}
]
}
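To see why both sorts matter, the same sort, group, sort sequence can be mirrored in plain JavaScript on the sample documents from the question. This is only an illustrative in-memory sketch, not what MongoDB runs internally, and the helper name groupByKeyStage is made up for this example.

```javascript
// Sample documents from the question (only the fields used here).
const docs = [
  { name: "first", spellings: ["here", "is", "you"], keyStageLevel: 4, spellingLevel: 2 },
  { name: "second", spellings: ["her", "is", "another"], keyStageLevel: 2, spellingLevel: 3 },
  { name: "third", spellings: ["snother", "list"], keyStageLevel: 2, spellingLevel: 4 }
];

// Hypothetical helper: an in-memory analogue of $sort -> $group/$push -> $sort.
function groupByKeyStage(input) {
  // Like the first $sort: order by keyStageLevel, then by spellingLevel.
  const sorted = [...input].sort(
    (a, b) => a.keyStageLevel - b.keyStageLevel || a.spellingLevel - b.spellingLevel
  );

  // Like $group with $push: items are appended in the order they arrive.
  const groups = new Map();
  for (const { keyStageLevel, spellingLevel, name, spellings } of sorted) {
    if (!groups.has(keyStageLevel)) groups.set(keyStageLevel, []);
    groups.get(keyStageLevel).push({ spellingLevel, name, spellings });
  }

  // Like the final $sort: order the groups themselves by their key.
  return [...groups.entries()]
    .sort(([a], [b]) => a - b)
    .map(([keyStageLevel, data]) => ({ _id: { keyStageLevel }, data }));
}

const grouped = groupByKeyStage(docs);
console.log(JSON.stringify(grouped, null, 2));
```

Dropping the first sort here would leave each data array in input order, and dropping the last would leave the groups in first-seen order, which is the same reasoning as for the two $sort stages.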

Related

Retrieve highest score for each game using aggregate in MongoDB

I am working on a database of various games and I want to design a query that returns the top scorer for each game, along with that player's details.
The document structure is as follows:
db.gaming_system.insertMany(
[
{
"_id": "01",
"name": "GTA 5",
"high_scores": [
{
"hs_id": 1,
"name": "Harry",
"score": 6969
},
{
"hs_id": 2,
"name": "Simon",
"score": 8574
},
{
"hs_id": 3,
"name": "Ethan",
"score": 4261
}
]
},
{
"_id": "02",
"name": "Among Us",
"high_scores": [
{
"hs_id": 1,
"name": "Harry",
"score": 926
},
{
"hs_id": 2,
"name": "Simon",
"score": 741
},
{
"hs_id": 3,
"name": "Ethan",
"score": 841
}
]
}
]
)
I have created a query using aggregate which returns the name of the game and the highest score for that game, as follows:
db.gaming_system.aggregate(
{ "$project": { "maximumscore": { "$max": "$high_scores.score" }, name:1 } },
{ "$group": { "_id": "$_id", Name: { $first: "$name" }, "Highest_Score": { "$max": "$maximumscore" } } },
{ "$sort" : { "_id":1 } }
)
The output from my query is as follows:
{ "_id" : "01", "Name" : "GTA 5", "Highest_Score" : 8574 }
{ "_id" : "02", "Name" : "Among Us", "Highest_Score" : 926 }
I want to generate output which also provides the name of player and "hs_id" of that player who has the highest score for each game as follows:
{ "_id" : "01", "Name" : "GTA 5", "Top_Scorer" : "Simon", "hs_id": 2, "Highest_Score" : 8574 }
{ "_id" : "02", "Name" : "Among Us", "Top_Scorer" : "Harry", "hs_id": 1, "Highest_Score" : 926 }
What should be added to my query using aggregate pipeline?
[
  {
    // Unwind high_scores so that each score becomes its own document and can be sorted.
    $unwind: "$high_scores"
  },
  {
    // Sort all scores in descending order, regardless of game; the grouping happens in the next stage.
    $sort: {
      "high_scores.score": -1
    }
  },
  {
    // Group by _id. Because of the descending sort, $first yields the game's name
    // and the top-scoring entry for that game.
    $group: {
      _id: "$_id",
      name: {
        $first: "$name"
      },
      Top_Scorer: {
        $first: "$high_scores"
      }
    }
  }
]
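The pipeline above returns the whole top entry as an embedded Top_Scorer document. To get the flat shape asked for in the question, a $project stage can be appended. What follows is a sketch, assuming the gaming_system collection and field names from the question; the intermediate field name top and the variable name topScorerPipeline are made up for this example.

```javascript
// Sketch only: the same unwind/sort/group stages, plus a flattening $project.
const topScorerPipeline = [
  { $unwind: "$high_scores" },
  { $sort: { "high_scores.score": -1 } },
  {
    $group: {
      _id: "$_id",
      Name: { $first: "$name" },
      top: { $first: "$high_scores" } // highest-scoring entry for this game
    }
  },
  {
    // Lift the embedded fields of the top entry to the top level.
    $project: {
      Name: 1,
      Top_Scorer: "$top.name",
      hs_id: "$top.hs_id",
      Highest_Score: "$top.score"
    }
  },
  // $group does not guarantee output order, so sort the games explicitly.
  { $sort: { _id: 1 } }
];

// In the mongo shell: db.gaming_system.aggregate(topScorerPipeline)
```

If two players tie for the highest score in a game, $first simply keeps whichever entry the sort happened to place first.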

group first, make bucketauto second in mongodb aggregation

I have a dataset structured like this:
{
"id": 1230239,
"group_name": "A",
"confidence": 0.14333882876354542,
},
{
"id": 1230240,
"group_name": "B",
"confidence": 0.4434535,
},
Etc.
It is pretty simple to calculate the buckets and the number of items in each bucket of confidence level using $bucketAuto, like this:
{
"$bucketAuto": {
"groupBy": "$confidence",
"buckets": 4
}
}
But how can I do the same for each group, separately?
I tried this one:
{"$group": {
"_id": "group",
"data": {
"$push": {
"confidence": "$confidence",
}
}
}
},
{
"$bucketAuto": {
"groupBy": "$data.confidence",
"buckets": 4
}
}
But that does not work.
What I need roughly is this as an output:
{ 'groupA':
{
"_id": {
"min": 0.0005225352581638143,
"max": 0.2905137273072962
},
"count": 67
},
{"_id": {
"min": 0.2905137273072962,
"max":0.5531611756507283,
},
"count": 43
},
},
{ 'groupB':
{
"_id": {
"min": 0.0005225352581638143,
"max": 0.2905137273072962
},
"count": 67
},
{"_id": {
"min": 0.2905137273072962,
"max":0.5531611756507283,
},
"count": 43
},
}
Any advice or hint would be appreciated
$facet to the rescue -- the "multigroup" operator. This pipeline:
db.foo.aggregate([
{$facet: {
"groupA": [
{$match: {"group_name": "A"}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
]
,"groupB": [
{$match: {"group_name": "B"}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
]
}}
]);
yields the output you seek:
{
"groupA" : [
{
"_id" : {
"min" : 0.14333882876354542,
"max" : 0.34333882876354543
},
"count" : 2
},
{
"_id" : {
"min" : 0.34333882876354543,
"max" : 0.5433388287635454
},
"count" : 2
},
{
"_id" : {
"min" : 0.5433388287635454,
"max" : 0.5433388287635454
},
"count" : 1
}
],
"groupB" : [
{
"_id" : {
"min" : 0.5433388287635454,
"max" : 0.7433388287635454
// etc. etc.
If you want to go totally dynamic, you'll need to do it in two passes: first get the distinct group names, then build the $facet expression from those names:
var fct_stage = {}; // collects one sub-pipeline per group name
db.foo.distinct("group_name").forEach(function(name) {
fct_stage["group" + name] = [
{$match: {"group_name": name}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
];
});
db.foo.aggregate([ {$facet: fct_stage} ]);
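The second pass can also be wrapped in a small function that builds the $facet stage from any list of group names. This is a sketch only: buildFacetStage is a hypothetical helper name, and the field names group_name and confidence are taken from the question.

```javascript
// Hypothetical helper: builds a $facet stage with one $bucketAuto
// sub-pipeline per group name.
function buildFacetStage(groupNames, buckets) {
  const facets = {};
  for (const name of groupNames) {
    facets["group" + name] = [
      { $match: { group_name: name } },
      { $bucketAuto: { groupBy: "$confidence", buckets: buckets } }
    ];
  }
  return { $facet: facets };
}

const facetStage = buildFacetStage(["A", "B"], 4);
console.log(Object.keys(facetStage.$facet).join(", ")); // groupA, groupB

// In the mongo shell, the names would come from the first pass:
//   db.foo.aggregate([ buildFacetStage(db.foo.distinct("group_name"), 4) ])
```

Keeping the builder free of database calls makes it easy to inspect the generated stage before running it.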

how to unwind more than one array in MongoDB Aggregation

This is what my documents look like
{
"_id" : ObjectId("584149cafda90a8b18cdfcc1"),
"uid" : "583eaa7df4def0ec5a520d19",
"surid" : "58414631ec5ed099538929b8",
"createdat" : ISODate("2016-12-02T10:15:38.382Z"),
"response" : [
{
"qid" : "649975800",
"que" : "Which is your favourite color ?",
"ans" : [
"red",
"yellow"
]
},
{
"qid" : "309541969",
"que" : "which is your favourite fruits ? ",
"ans" : [
"apple",
"orange"
]
}
]
}
/* 2 */
{
"_id" : ObjectId("58414a28fda90a8b18cdfcc7"),
"uid" : "57ff2141b893ba1a2e89ef57",
"surid" : "58414631ec5ed099538929b8",
"createdat" : ISODate("2016-12-02T10:17:12.800Z"),
"response" : [
{
"qid" : "649975800",
"que" : "Which is your favourite color ?",
"ans" : "red"
},
{
"qid" : "309541969",
"que" : "which is your favourite fruits ? ",
"ans" : "banana"
}
]
}
/* 3 */
{
"_id" : ObjectId("58414a52fda90a8b18cdfcd1"),
"uid" : "57b300678c9f14d7555b668e",
"surid" : "58414631ec5ed099538929b8",
"createdat" : ISODate("2016-12-02T10:17:54.869Z"),
"response" : [
{
"qid" : "649975800",
"que" : "Which is your favourite color ?",
"ans" : "red"
},
{
"qid" : "309541969",
"que" : "which is your favourite fruits ? ",
"ans" : "banana"
}
]
}
This is what I need:
{
"que" : "Which is your favourite color ?",
"ans" :{red:3, yellow:1}
},
{
"que" : "which is your favourite fruits ? ",
"ans":{apple:1, orange:1, banana:3}
}
I want to get this result with MongoDB aggregation, grouped by unique surid and with a separate count for each answer.
It is for reporting on the feedback users have submitted.
Because the values in the embedded ans array aren't known in advance, the desired output is hard to produce directly, since it uses those values as field names. A more practical approach is to return the counts as an array of embedded documents, like:
{
"ques": "Which is your favourite color ?",
"counts": [
{ "value": "red", "count": 3 },
{ "value": "yellow", "count": 1 }
]
},
{
"ques": "which is your favourite fruits ?",
"counts": [
{ "value": "apple", "count": 1 },
{ "value": "orange", "count": 1 },
{ "value": "banana", "count": 3 }
]
}
which can be achieved by running this aggregate operation:
db.collection.aggregate([
{ "$unwind": "$response" },
{ "$unwind": "$response.ans" },
{
"$group": {
"_id": {
"surid": "$surid",
"ans": "$response.ans"
},
"ques": { "$first": "$response.que" },
"count": { "$sum": 1 }
}
},
{
"$group": {
"_id": "$_id.surid",
"ques": { "$first": "$ques" },
"counts": {
"$push": {
"value": "$_id.ans",
"count": "$count"
}
}
}
}
])
However, if the values are static and known in advance, then take advantage of the $cond operator in the $group stage to evaluate the counts based on the "response.ans" field, something like the following:
db.collection.aggregate([
{ "$unwind": "$response" },
{ "$unwind": "$response.ans" },
{
"$group": {
"_id": "$surid",
"ques": { "$first": "$response.que" },
"red": {
"$sum": {
"$cond": [ { "$eq": [ "$response.ans", "red" ] }, 1, 0 ]
}
},
"yellow": {
"$sum": {
"$cond": [ { "$eq": [ "$response.ans", "yellow" ] }, 1, 0 ]
}
},
"apple": {
"$sum": {
"$cond": [ { "$eq": [ "$response.ans", "apple" ] }, 1, 0 ]
}
},
"orange": {
"$sum": {
"$cond": [ { "$eq": [ "$response.ans", "orange" ] }, 1, 0 ]
}
},
"banana": {
"$sum": {
"$cond": [ { "$eq": [ "$response.ans", "banana" ] }, 1, 0 ]
}
}
}
}
])
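Note that the first pipeline in this answer ends by grouping on surid alone, so answers to different questions in the same survey end up in one document. A variant that also keys on the question text might look like the sketch below. It additionally uses $arrayToObject to turn the counts into an object keyed by answer, which assumes MongoDB 3.4.4 or later and string answer values; the variable name answerCountsPipeline is made up for this example.

```javascript
// Sketch only: count per (survey, question, answer), then one document per question.
const answerCountsPipeline = [
  { $unwind: "$response" },
  // A plain string "ans" is treated as a one-element array (MongoDB 3.2+).
  { $unwind: "$response.ans" },
  {
    $group: {
      _id: { surid: "$surid", que: "$response.que", ans: "$response.ans" },
      count: { $sum: 1 }
    }
  },
  {
    // Collect the per-answer counts as { k, v } pairs for $arrayToObject.
    $group: {
      _id: { surid: "$_id.surid", que: "$_id.que" },
      counts: { $push: { k: "$_id.ans", v: "$count" } }
    }
  },
  {
    $project: {
      _id: 0,
      surid: "$_id.surid",
      que: "$_id.que",
      ans: { $arrayToObject: "$counts" } // assumption: MongoDB 3.4.4 or later
    }
  }
];

// In the mongo shell: db.collection.aggregate(answerCountsPipeline)
```

On older servers the last stage can be dropped, which leaves the counts as an array of pairs much like the counts array shown earlier.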

Aggregate array of subdocuments into single document

My document looks like the following (ignore timepoints for this question):
{
"_id": "xyz-800",
"site": "xyz",
"user": 800,
"timepoints": [
{"timepoint": 0, "a": 1500, "b": 700},
{"timepoint": 2, "a": 1000, "b": 200},
{"timepoint": 4, "a": 3500, "b": 1500}
],
"groupings": [
{"type": "MNO", "group": "<10%", "raw": "1"},
{"type": "IJK", "group": "Moderate", "raw": "23"}
]
}
Can I flatten (maybe not the right term) the document so that the groupings end up as fields of a single document? I would like the result to look like:
{
"id": "xyz-800",
"site": "xyz",
"user": 800,
"mnoGroup": "<10%",
"mnoRaw": "1",
"ijkGroup": "Moderate",
"ijkRaw": "23"
}
In reality I would like the mnoGroup and mnoRaw attributes to be created no matter if the attribute groupings.type = "MNO" exists or not. Same with the ijk attributes.
You can use $arrayElemAt to read the groupings array by index in the first project stage and $ifNull to project optional values in the final project stage. It is a little verbose, and note that it relies on the MNO entry being first in the array and the IJK entry being last.
db.groupmore.aggregate({
"$project": {
_id: 1,
site: 1,
user: 1,
mnoGroup: {
$arrayElemAt: ["$groupings", 0]
},
ijkGroup: {
$arrayElemAt: ["$groupings", -1]
}
}
}, {
"$project": {
_id: 1,
site: 1,
user: 1,
mnoGroup: {
$ifNull: ["$mnoGroup.group", "Unspecified"]
},
mnoRaw: {
$ifNull: ["$mnoGroup.raw", "Unspecified"]
},
ijkGroup: {
$ifNull: ["$ijkGroup.group", "Unspecified"]
},
ijkRaw: {
$ifNull: ["$ijkGroup.raw", "Unspecified"]
}
}
})
Sample Output
{ "_id" : "xyz-800", "site" : "xyz", "user" : 800, "mnoGroup" : "<10%", "mnoRaw" : "1", "ijkGroup" : "Moderate", "ijkRaw" : "23" }
{ "_id" : "ert-600", "site" : "ert", "user" : 8600, "mnoGroup" : "Unspecified", "mnoRaw" : "Unspecified", "ijkGroup" : "Unspecified", "ijkRaw" : "Unspecified" }
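If the position of each entry in groupings is not guaranteed, the entries can be selected by type instead of by index. The sketch below uses $filter for that, assuming MongoDB 3.2 or later; firstOfType is a hypothetical helper, and the intermediate field names mno and ijk are made up for this example.

```javascript
// Hypothetical helper: expression for the first groupings entry of a given type.
// If no entry matches, the projected field is simply absent.
function firstOfType(type) {
  return {
    $arrayElemAt: [
      {
        $filter: {
          input: { $ifNull: ["$groupings", []] }, // tolerate a missing array
          as: "g",
          cond: { $eq: ["$$g.type", type] }
        }
      },
      0
    ]
  };
}

const flattenPipeline = [
  {
    $project: {
      site: 1,
      user: 1,
      mno: firstOfType("MNO"),
      ijk: firstOfType("IJK")
    }
  },
  {
    // $ifNull supplies a default when the entry (or one of its fields) is missing.
    $project: {
      site: 1,
      user: 1,
      mnoGroup: { $ifNull: ["$mno.group", "Unspecified"] },
      mnoRaw: { $ifNull: ["$mno.raw", "Unspecified"] },
      ijkGroup: { $ifNull: ["$ijk.group", "Unspecified"] },
      ijkRaw: { $ifNull: ["$ijk.raw", "Unspecified"] }
    }
  }
];

// In the mongo shell: db.groupmore.aggregate(flattenPipeline)
```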

MongoDB select distinct and count

I have a product collection which looks like that:
products = [
{
"ref": "1",
"facets": [
{
"type":"category",
"val":"kitchen"
},
{
"type":"category",
"val":"bedroom"
},
{
"type":"material",
"val":"wood"
}
]
},
{
"ref": "2",
"facets": [
{
"type":"category",
"val":"kitchen"
},
{
"type":"category",
"val":"livingroom"
},
{
"type":"material",
"val":"plastic"
}
]
}
]
I would like to select and count the distinct categories and the number of products that have the category (Note that a product can have more than one category). Something like that:
[
{
"category": "kitchen",
"numberOfProducts": 2
},
{
"category": "bedroom",
"numberOfProducts": 1
},
{
"category": "livingroom",
"numberOfProducts": 1
}
]
And it would be better if I could get the same result for each different facet type, something like that:
[
{
"facetType": "category",
"distinctValues":
[
{
"val": "kitchen",
"numberOfProducts": 2
},
{
"val": "livingroom",
"numberOfProducts": 1
},
{
"val": "bedroom",
"numberOfProducts": 1
}
]
},
{
"facetType": "material",
"distinctValues":
[
{
"val": "wood",
"numberOfProducts": 1
},
{
"val": "plastic",
"numberOfProducts": 1
}
]
}
]
I have been testing with distinct, aggregate and mapReduce, but can't get the results I need. Can anybody point me in the right direction?
UPDATE:
With aggregate, this gives me the different facet types that a product has, but neither the values nor the count of each value:
db.products.aggregate([
{$match:{'content.facets.type':'category'}},
{$group:{ _id: '$content.facets.type'} }
]).pretty();
The following aggregation pipeline will give you the desired result. The first stage is an $unwind on the facets array, which deconstructs it to output one document per element. Next comes the first $group stage, which groups those documents by facet type and value and calculates the number of products in each group using $sum. The second $group stage then groups by facet type and builds the array of per-value counts with the $addToSet operator. The final $project stage reshapes each document, exposing _id as facetType:
var pipeline = [
{ "$unwind": "$facets" },
{
"$group": {
"_id": {
"facetType": "$facets.type",
"value": "$facets.val"
},
"count": { "$sum": 1 }
}
},
{
"$group": {
"_id": "$_id.facetType",
"distinctValues": {
"$addToSet": {
"val": "$_id.value",
"numberOfProducts": "$count"
}
}
}
},
{
"$project": {
"_id": 0,
"facetType": "$_id",
"distinctValues": 1
}
}
];
db.products.aggregate(pipeline);
Output
/* 0 */
{
"result" : [
{
"distinctValues" : [
{
"val" : "kitchen",
"numberOfProducts" : 2
},
{
"val" : "bedroom",
"numberOfProducts" : 1
},
{
"val" : "livingroom",
"numberOfProducts" : 1
}
],
"facetType" : "category"
},
{
"distinctValues" : [
{
"val" : "wood",
"numberOfProducts" : 1
},
{
"val" : "plastic",
"numberOfProducts" : 1
}
],
"facetType" : "material"
}
],
"ok" : 1
}
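One caveat: $sum: 1 counts entries in the facets array, so a product that lists the same type and value twice would be counted twice. If that can happen in your data, one option is to collect the distinct ref values per group and take the size of that set. This is a sketch under the assumption that ref identifies a product, as in the question's sample; the field name refs and the variable name facetCountsPipeline are made up for this example.

```javascript
// Sketch only: count distinct products (by ref) per facet type and value.
const facetCountsPipeline = [
  { $unwind: "$facets" },
  {
    $group: {
      _id: { facetType: "$facets.type", value: "$facets.val" },
      refs: { $addToSet: "$ref" } // each product is kept at most once per group
    }
  },
  {
    $group: {
      _id: "$_id.facetType",
      distinctValues: {
        $push: { val: "$_id.value", numberOfProducts: { $size: "$refs" } }
      }
    }
  },
  { $project: { _id: 0, facetType: "$_id", distinctValues: 1 } }
];

// In the mongo shell: db.products.aggregate(facetCountsPipeline)
```

For the sample data in the question both approaches give the same counts, because no product repeats a facet.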