Is it possible to aggregate $$ROOT in mongo db? - mongodb

I have the following Mongo collection.
[
{
"query": "a",
"page": "p1",
"clicks": 10,
"date": "x"
},
{
"query": "b",
"page": "p1",
"clicks": 5,
"date": "x"
},
{
"query": "a",
"page": "p1",
"clicks": 5,
"date": "y"
},
{
"query": "c",
"page": "p2",
"clicks": 2,
"date": "y"
},
]
The output should look like this:
[
{
"page" : "p1",
"most_clicks_query" : "a",
"sum_of_clicks_for_query" : 15
},
{
"page" : "p2",
"most_clicks_query" : "c",
"sum_of_clicks_for_query" : 2
},
]
Logic to get this output:
I need the query name that has the most clicks for each page, together with the sum of clicks for that query.
What I ask:
I am hoping to get this result in a single aggregation query, so I am experimenting with $$ROOT.
Right now I am stuck on grouping the $$ROOT documents to get the sum of clicks per query.
Can someone point me to a better way to do this?

Here is the aggregation you're looking for:
db.collection.aggregate([
{
"$group": {
"_id": {
"page": "$page",
"query": "$query"
},
"sum_of_clicks_for_query": {
"$sum": "$clicks"
}
}
},
{
"$project": {
"_id": false,
"page": "$_id.page",
"most_clicks_query": "$_id.query",
"sum_of_clicks_for_query": true
}
},
{
$sort: {
"sum_of_clicks_for_query": -1
}
},
{
$group: {
_id: "$page",
group: {
$first: "$$ROOT"
}
}
},
{
$replaceRoot: {
newRoot: "$group"
}
}
])
Playground: https://mongoplayground.net/p/Uzk3CuSwVRM
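To see what the stages compute, here is a minimal plain-JavaScript sketch of the same logic run in memory against the sample documents from the question. It is only an illustration, not MongoDB code, and the helper name topQueryPerPage is made up for this example.

```javascript
// Sample documents from the question.
const docs = [
  { query: "a", page: "p1", clicks: 10, date: "x" },
  { query: "b", page: "p1", clicks: 5, date: "x" },
  { query: "a", page: "p1", clicks: 5, date: "y" },
  { query: "c", page: "p2", clicks: 2, date: "y" },
];

// Hypothetical helper mirroring the pipeline stages in memory.
function topQueryPerPage(documents) {
  // First $group: sum clicks per (page, query) pair.
  const sums = new Map();
  for (const { page, query, clicks } of documents) {
    const key = JSON.stringify([page, query]);
    sums.set(key, (sums.get(key) || 0) + clicks);
  }
  // $sort + second $group with $first: keep the highest sum per page.
  const best = new Map();
  for (const [key, total] of sums) {
    const [page, query] = JSON.parse(key);
    const current = best.get(page);
    if (!current || total > current.sum_of_clicks_for_query) {
      best.set(page, {
        page,
        most_clicks_query: query,
        sum_of_clicks_for_query: total,
      });
    }
  }
  // Sort by page only to make the printed order deterministic.
  return [...best.values()].sort((a, b) => a.page.localeCompare(b.page));
}

const result = topQueryPerPage(docs);
console.log(result);
```

If two queries tie for a page, this sketch keeps the first one it encounters. In the pipeline the winner among ties is not pinned down by the single sort key, so add a secondary sort key if that matters.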

Related

Retrieve highest score for each game using aggregate in MongoDB

I am working on a database of various games and I want to design a query that returns the top scorer from each game with specific player details.
The document structure is as follows:
db.gaming_system.insertMany(
[
{
"_id": "01",
"name": "GTA 5",
"high_scores": [
{
"hs_id": 1,
"name": "Harry",
"score": 6969
},
{
"hs_id": 2,
"name": "Simon",
"score": 8574
},
{
"hs_id": 3,
"name": "Ethan",
"score": 4261
}
]
},
{
"_id": "02",
"name": "Among Us",
"high_scores": [
{
"hs_id": 1,
"name": "Harry",
"score": 926
},
{
"hs_id": 2,
"name": "Simon",
"score": 741
},
{
"hs_id": 3,
"name": "Ethan",
"score": 841
}
]
}
]
)
I have created an aggregate query that returns the name of each game and the highest score for that game, as follows:
db.gaming_system.aggregate(
{ "$project": { "maximumscore": { "$max": "$high_scores.score" }, name:1 } },
{ "$group": { "_id": "$_id", Name: { $first: "$name" }, "Highest_Score": { "$max": "$maximumscore" } } },
{ "$sort" : { "_id":1 } }
)
The output from my query is as follows:
{ "_id" : "01", "Name" : "GTA 5", "Highest_Score" : 8574 }
{ "_id" : "02", "Name" : "Among Us", "Highest_Score" : 926 }
I want to generate output that also provides the name and "hs_id" of the player who has the highest score for each game, as follows:
{ "_id" : "01", "Name" : "GTA 5", "Top_Scorer" : "Simon", "hs_id": 2, "Highest_Score" : 8574 }
{ "_id" : "02", "Name" : "Among Us", "Top_Scorer" : "Harry", "hs_id": 1, "Highest_Score" : 926 }
What should be added to my query using aggregate pipeline?
[
{
$unwind: "$high_scores" //unwind the high_scores array so each score can be sorted
},
{
$sort: {
"high_scores.score": -1 //sort all scores in descending order, regardless of game, because we group in the next stage
}
},
{
//now group by _id and take the name and top scorer from $first (the first document in each group, since they are sorted by score in descending order)
$group: {
_id: "$_id",
name: {
$first: "$name"
},
Top_Scorer: {
$first: "$high_scores"
}
}
}
]
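The pipeline above returns the top scorer as an embedded object. If the flat shape from the question is wanted, one possible variation is to pick the individual fields with $first in the $group stage. This is a sketch under that assumption; the variable name topScorerPipeline is made up, and the trailing $sort is only there because $group does not guarantee output order.

```javascript
// Sketch of a pipeline producing the flat output shape the question asks for.
const topScorerPipeline = [
  // One document per high score entry.
  { $unwind: "$high_scores" },
  // Highest scores first, so $first below picks the top scorer of each game.
  { $sort: { "high_scores.score": -1 } },
  {
    $group: {
      _id: "$_id",
      Name: { $first: "$name" },
      Top_Scorer: { $first: "$high_scores.name" },
      hs_id: { $first: "$high_scores.hs_id" },
      Highest_Score: { $first: "$high_scores.score" },
    },
  },
  // Restore a stable order by game id.
  { $sort: { _id: 1 } },
];

// In the mongo shell: db.gaming_system.aggregate(topScorerPipeline)
console.log(JSON.stringify(topScorerPipeline, null, 2));
```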

mongo aggregation framework group by quarter/half year/year

I have a database with this schema structure :
{
"name" : "Carl",
"city" : "paris",
"time" : "1-2018",
"notes" : [
"A",
"A",
"B",
"C",
"D"
]
}
And this query using the aggregation framework :
db.getCollection('collection').aggregate(
[{
"$match": {
"$and": [{
"$or": [ {
"time": "1-2018"
}, {
"time": "2-2018"
} ]
}, {
"name": "Carl"
}, {
"city": "paris"
}]
}
}, {
"$unwind": "$notes"
}, {
"$group": {
"_id": {
"notes": "$notes",
"time": "$time"
},
"count": {
"$sum": 1
}
}
}
, {
"$group": {
"_id": "$_id.time",
"count": {
"$sum": 1
}
}
}, {
"$project": {
"_id": 0,
"time": "$_id",
"count": 1
}
}])
It works correctly and I'm getting these results:
{
"count" : 4.0,
"time" : "2-2018"
}
{
"count" : 4.0,
"time" : "1-2018"
}
My issue is that I'd like to keep the same match stage but group by quarter.
Here is the result I'd like to have:
{
"count" : 8.0,
"time" : "1-2018" // here quarter 1
}
Thanks
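One way to think about the regrouping, sketched client-side in plain JavaScript: map each "month-year" string to a "quarter-year" key and sum the per-month counts. This assumes "time" is always of the form "month-year" with a numeric month, and it runs on the pipeline's results rather than inside MongoDB. The function names toQuarterKey and groupByQuarter are made up for this example.

```javascript
// Map a "month-year" string such as "2-2018" to "quarter-year", e.g. "1-2018".
function toQuarterKey(time) {
  const [month, year] = time.split("-").map(Number);
  const quarter = Math.ceil(month / 3); // months 1-3 -> 1, 4-6 -> 2, ...
  return `${quarter}-${year}`;
}

// Regroup per-month results into per-quarter totals.
function groupByQuarter(monthlyResults) {
  const totals = new Map();
  for (const { time, count } of monthlyResults) {
    const key = toQuarterKey(time);
    totals.set(key, (totals.get(key) || 0) + count);
  }
  return [...totals].map(([time, count]) => ({ count, time }));
}

// Per-month results as returned by the pipeline in the question.
const monthly = [
  { count: 4, time: "2-2018" },
  { count: 4, time: "1-2018" },
];
const quarterly = groupByQuarter(monthly);
console.log(quarterly);
```

For half years the same idea applies with Math.ceil(month / 6), and for years the month part can simply be dropped.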

MongoDB filter for specific data in Array and return only specific fields in the output

I have the below structure maintained in a sample collection.
{
"_id": "1",
"name": "Stock1",
"description": "Test Stock",
"lines": [
{
"lineNumber": "1",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10001",
"name": "CricketBat",
"description": "Cricket bat"
},
"quantity": 10
},
{
"lineNumber": "2",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10002",
"name": "CricketBall",
"description": "Cricket ball"
},
"quantity": 10
},
{
"lineNumber": "3",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10003",
"name": "CricketStumps",
"description": "Cricket stumps"
},
"quantity": 10
}
]
}
I have a scenario where I will be given lineNumber and item.id; I need to filter the above collection based on lineNumber and item.id and project only selected fields.
Expected output below:
{
"_id": "1",
"lines": [
{
"lineNumber": "1",
"item": {
"id": "BAT10001",
"name": "CricketBat",
"description": "Cricket bat"
},
"quantity": 10
}
]
}
Note: I may not always get lineNumber; if lineNumber is null, I should filter on item.id alone and get the above-mentioned output. The main purpose is to reduce the number of fields in the output, as the collection is expected to hold a huge number of fields.
I tried the below query:
db.sample.aggregate([
{ "$match" : { "_id" : "1" } },
{ "$project" : { "lines" : { "$filter" : { "input" : "$lines" , "as" : "line" , "cond" :
{ "$and" : [ { "$eq" : [ "$$line.lineNumber" , "3"]} , { "$eq" : [ "$$line.item.id" , "BAT10001"]}]}}}}}
])
But I got all the fields; I'm not able to exclude or include the required fields.
I tried the below query and it worked for me:
db.Collection.aggregate([
{ $match: { _id: '1' } },
{
$project: {
lines: {
$map: {
input: {
$filter: {
input: '$lines',
as: 'line',
cond: {
$and: [
{ $eq: ['$$line.lineNumber', '3'] },
{ $eq: ['$$line.item.id', 'BAT10001'] },
],
},
},
},
as: 'line',
in: {
lineNumber: '$$line.lineNumber',
item: '$$line.item',
quantity: '$$line.quantity',
},
},
},
},
},
])
You can achieve it with $unwind and $group aggregation stages:
db.collection.aggregate([
{$match: {"_id": "1"}},
{$unwind: "$lines"},
{$match: {
$or: [
{"lines.lineNumber": "1"},
{"lines.item.id": "BAT10001"}
]
}},
{$group: {
_id: "$_id",
lines: { $push: {
"lineNumber": "$lines.lineNumber",
"item": "$lines.item",
"quantity": "$lines.quantity"
}}
}}
])
$match - sets the criteria for filtering documents. The first $match stage selects the document with _id = "1"; the second, applied after $unwind, keeps only the unwound documents whose line number equals "1" or whose item id equals "BAT10001".
$unwind - splits the lines array into separate documents.
$group - merges the documents back by _id and pushes an object with the lineNumber, item and quantity fields into the lines array.
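For the optional lineNumber mentioned in the question, one possible approach is to build the filter condition in application code and add the lineNumber test only when a value is given. The sketch below does this for the $filter/$map variant; the helper name buildLinesPipeline and its parameters are made up for illustration.

```javascript
// Hypothetical helper: builds the pipeline, adding the lineNumber test
// only when a lineNumber is actually supplied.
function buildLinesPipeline(docId, itemId, lineNumber) {
  const conditions = [{ $eq: ["$$line.item.id", itemId] }];
  if (lineNumber !== null && lineNumber !== undefined) {
    conditions.push({ $eq: ["$$line.lineNumber", lineNumber] });
  }
  return [
    { $match: { _id: docId } },
    {
      $project: {
        lines: {
          $map: {
            // Keep only the matching lines...
            input: {
              $filter: {
                input: "$lines",
                as: "line",
                cond: { $and: conditions },
              },
            },
            as: "line",
            // ...and project only the wanted fields of each line.
            in: {
              lineNumber: "$$line.lineNumber",
              item: "$$line.item",
              quantity: "$$line.quantity",
            },
          },
        },
      },
    },
  ];
}

const withLine = buildLinesPipeline("1", "BAT10001", "1");
const withoutLine = buildLinesPipeline("1", "BAT10001", null);
console.log(JSON.stringify(withoutLine, null, 2));
```

Either result can be passed to db.sample.aggregate(...). Unlike the $or version, this treats lineNumber and item.id as a combined condition when both are present.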

Aggregate query with document containing array of objects

I am facing an issue with an aggregation query on MongoDB.
I have documents in the following structure:
[{
"_id": ObjectId("19a5070b808028108101"),
"arr_vs": [
{
"arr_id": "one",
"val": 5
},
{
"arr_id": "two",
"val": 5
}]
},
{
"_id": ObjectId("19a5070b80802810810"),
"arr_vs": [
{
"arr_id": "one",
"val": 5
},
{
"arr_id": "two",
"val": 2
},{
"arr_id": "three",
"val": 1
}]
}]
I want the count for each value associated with arr_vs items.
Expected output:
{
"arr_vs":{
"one":[
{
"val":5,
"total_count":2
},{
"val":2,
"total_count":
}
}],
"two":[
{
"val":5,
"total_count":2
},{
"val":2,
"total_count":
}
}]
}
}
Any help will be appreciated.
Outputting to named keys is never really the fantastic thing some people seem to think it is. Realistically I usually want to work with the results returned, and therefore a "list/array" makes a lot more sense.
This is basically why every new person gets told to abandon their "named keys" concept and realize they are working with a database, with all the inherent problems named keys bring. It is also kind of why collections are essentially "lists" as well.
So you would be better off getting used to the concept:
db.collection.aggregate([
{ "$unwind": "$arr_vs" },
{ "$group": {
"_id": { "id": "$arr_vs.arr_id", "val": "$arr_vs.val" },
"total_count": { "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.id",
"v": {
"$push": {
"val": "$_id.val",
"total_count": "$total_count"
}
}
}}
])
Which is basically going to give you:
/* 1 */
{
"_id" : "two",
"v" : [
{
"val" : 2.0,
"total_count" : 1.0
},
{
"val" : 5.0,
"total_count" : 1.0
}
]
}
/* 2 */
{
"_id" : "one",
"v" : [
{
"val" : 5.0,
"total_count" : 2.0
}
]
}
/* 3 */
{
"_id" : "three",
"v" : [
{
"val" : 1.0,
"total_count" : 1.0
}
]
}
That is the aggregated data in an iterable, easy-to-use form.
If you are intent on your output format and have at least a MongoDB 3.4.4 version, you can take that further by compacting the documents and using $arrayToObject:
db.collection.aggregate([
{ "$unwind": "$arr_vs" },
{ "$group": {
"_id": { "id": "$arr_vs.arr_id", "val": "$arr_vs.val" },
"total_count": { "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.id",
"v": {
"$push": {
"val": "$_id.val",
"total_count": "$total_count"
}
}
}},
{ "$group": {
"_id": null,
"arr_vs": {
"$push": {
"k": "$_id",
"v": "$v"
}
}
}},
{ "$project": {
"_id": 0,
"arr_vs": { "$arrayToObject": "$arr_vs" }
}}
])
Or even just apply the final "reshape" client side, if your MongoDB version does not support the new operator:
db.collection.aggregate([
{ "$unwind": "$arr_vs" },
{ "$group": {
"_id": { "id": "$arr_vs.arr_id", "val": "$arr_vs.val" },
"total_count": { "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.id",
"v": {
"$push": {
"val": "$_id.val",
"total_count": "$total_count"
}
}
}},
{ "$group": {
"_id": null,
"arr_vs": {
"$push": {
"k": "$_id",
"v": "$v"
}
}
}},
/*
{ "$project": {
"_id": 0,
"arr_vs": { "$arrayToObject": "$arr_vs" }
}}
*/
]).map( d => ({
"arr_vs": d.arr_vs.reduce((acc,curr) =>
Object.assign(acc,({ [curr.k]: curr.v })),{})
}))
And both produce the same output:
{
"arr_vs" : {
"two" : [
{
"val" : 2.0,
"total_count" : 1.0
},
{
"val" : 5.0,
"total_count" : 1.0
}
],
"one" : [
{
"val" : 5.0,
"total_count" : 2.0
}
],
"three" : [
{
"val" : 1.0,
"total_count" : 1.0
}
]
}
}
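To make the client-side reshape concrete, here is a small standalone sketch that applies the same key/value conversion to a hard-coded copy of the grouped document. The variable names are made up, and Object.fromEntries is used here as an alternative to the reduce/Object.assign form shown earlier.

```javascript
// Shape of the single document produced by the final $group stage,
// with values copied from the output above.
const grouped = {
  _id: null,
  arr_vs: [
    { k: "two", v: [{ val: 2, total_count: 1 }, { val: 5, total_count: 1 }] },
    { k: "one", v: [{ val: 5, total_count: 2 }] },
    { k: "three", v: [{ val: 1, total_count: 1 }] },
  ],
};

// Turn the { k, v } pairs into named keys, as $arrayToObject would.
const reshaped = {
  arr_vs: Object.fromEntries(grouped.arr_vs.map(({ k, v }) => [k, v])),
};
console.log(JSON.stringify(reshaped, null, 2));
```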

How to compare and count each value of element with condition in mongoDB pipeline after unwinding?

This is the command I ran in tools->command:
{
aggregate : "hashtags",
pipeline:
[
{$unwind:"$time"},
{$match:{"$time":{$gte:NumberInt(1450854385), $lte:NumberInt(1450854385)}}},
{$group:{"_id":"$word","count":{$sum:1}}}
]
}
which gave this result:
Response from server:
{
"result": [
{
"_id": "dear",
"count": NumberInt(1)
},
{
"_id": "ghost",
"count": NumberInt(1)
},
{
"_id": "rat",
"count": NumberInt(1)
},
{
"_id": "police",
"count": NumberInt(1)
},
{
"_id": "bugs",
"count": NumberInt(3)
},
{
"_id": "dog",
"count": NumberInt(2)
},
{
"_id": "batman",
"count": NumberInt(9)
},
{
"_id": "ear",
"count": NumberInt(1)
}
],
"ok": 1
}
The documents are in the collection 'hashtags'.
The inserted documents are shown below:
1.
{
"_id": ObjectId("567a483bf0058ed6755ab3de"),
"hash_count": NumberInt(1),
"msgids": [
"1583"
],
"time": [
NumberInt(1450854385)
],
"word": "ghost"
}
2.
{
"_id": ObjectId("5679485ff0058ed6755ab3dd"),
"hash_count": NumberInt(1),
"msgids": [
"1563"
],
"time": [
NumberInt(1450788886)
],
"word": "dear"
}
3.
{
"_id": ObjectId("567941aaf0058ed6755ab3dc"),
"hash_count": NumberInt(9),
"msgids": [
"1555",
"1556",
"1557",
"1558",
"1559",
"1561",
"1562",
"1584",
"1585"
],
"time": [
NumberInt(1450787170),
NumberInt(1450787292),
NumberInt(1450787307),
NumberInt(1450787333),
NumberInt(1450787354),
NumberInt(1450787526),
NumberInt(1450787615),
NumberInt(1450855148),
NumberInt(1450855155)
],
"word": "batman"
}
4.
{
"_id": ObjectId("567939cdf0058ed6755ab3d9"),
"hash_count": NumberInt(3),
"msgids": [
"1551",
"1552",
"1586"
],
"time": [
NumberInt(1450785157),
NumberInt(1450785194),
NumberInt(1450856188)
],
"word": "bugs"
}
So I want to count the number of values in the 'time' field that fall between two limits,
like this:
foreach word
{
foreach time
{
if((a<time)&&(time<b))
word[count]++
}
}
But my query just outputs the total size of the 'time' array.
What is the correct query?
For example,
if the lower bound is 1450787615 and the upper bound is 1450855155,
there are 3 matching values in 'time' for the word 'batman'.
The answer should be
{
"_id": "batman",
"count": NumberInt(3)
},
for batman. Thank you.
Use the following aggregation pipeline:
db.hashtags.aggregate([
{
"$match": {
"time": {
"$gte": 1450787615, "$lte": 1450855155
}
}
},
{ "$unwind": "$time" },
{
"$match": {
"time": {
"$gte": 1450787615, "$lte": 1450855155
}
}
},
{
"$group": {
"_id": "$word",
"count": {
"$sum": 1
}
}
}
])
For the given sample documents, this will yield:
/* 0 */
{
"result" : [
{
"_id" : "batman",
"count" : 3
},
{
"_id" : "dear",
"count" : 1
},
{
"_id" : "ghost",
"count" : 1
}
],
"ok" : 1
}
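As a possible alternative that avoids $unwind, the in-range values can be counted per document with $filter and $size (both available from MongoDB 3.2), then summed per word. This is a sketch, not the answer's method; the helper name countInRangePipeline is made up, and the $elemMatch in the first stage is an assumption-free tightening so that only documents with at least one in-range timestamp pass through.

```javascript
// Hypothetical helper: counts in-range timestamps per word without $unwind.
function countInRangePipeline(lower, upper) {
  return [
    // Keep only documents with at least one timestamp inside the range.
    { $match: { time: { $elemMatch: { $gte: lower, $lte: upper } } } },
    {
      $project: {
        word: 1,
        // Number of array elements that fall inside the range.
        count: {
          $size: {
            $filter: {
              input: "$time",
              as: "t",
              cond: {
                $and: [{ $gte: ["$$t", lower] }, { $lte: ["$$t", upper] }],
              },
            },
          },
        },
      },
    },
    // Merge documents that share the same word.
    { $group: { _id: "$word", count: { $sum: "$count" } } },
  ];
}

const pipeline = countInRangePipeline(1450787615, 1450855155);
// In the mongo shell: db.hashtags.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```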