Querying aggregates on subdocuments then grouping by field in parent document - mongodb

I'm new to Mongo and I've been struggling to work out how to fetch data in the following way. I have a collection of order documents that contain fields such as event_id and an embedded array (a subcollection, if that's the term) of issued_tickets. issued_tickets contains one or more subdocuments with fields such as name, date, etc. What I am trying to do is fetch the number of issued tickets of each type for each event_id in the parent document. In other words, I want a count of issued_tickets grouped by issued_tickets.name, which is then summed and grouped by the parent's event_id.
Can anyone help me accomplish this? I keep getting lost trying different groupings and projections.
Here is a sample document:
{
"_id" : ObjectId("5ce7335c1c666f000414f74a"),
"event_id" : ObjectId("5cb54f966668a9719ef6a103"),
"subtotal" : 3000,
"service_fee" : 760,
"processing_fee" : 143,
"total" : 3903,
"customer_id" : ObjectId("5ce7666c1c335f000414f747"),
"updated_at" : ISODate("2019-05-23T23:57:17.524Z"),
"created_at" : ISODate("2019-05-23T23:57:17.524Z"),
"ref" : "60d5fcf9-86c6-469b-b86b-315a9b55caca",
"issued_tickets" : [
{
"_id" : ObjectId("5ce7335c1c335f000414f666"),
"name" : "Tier 1",
"stub_name" : "Tier 1",
"price" : 1500,
"base_fee" : 200,
"perc_fee" : "0.12",
"access_code" : "163a1b9ee98338a8a4288a1c87446665",
"redeemed" : false
},
{
"_id" : ObjectId("5ce7335c1c335f0004146669"),
"name" : "Tier 2",
"stub_name" : "Tier 2",
"price" : 1500,
"base_fee" : 200,
"perc_fee" : "0.12",
"access_code" : "f50f262cd0bf1ec4ab36667c2a762446",
"redeemed" : true
}
]
}

We can do this with an aggregation pipeline:
$unwind deconstructs the issued_tickets array, producing one document per issued ticket.
$group then groups those documents by event_id and issued_tickets.name, and counts the members of each group using $sum.
Mongo script:
db.collection.aggregate([
  {
    $unwind: "$issued_tickets"
  },
  {
    $group: {
      _id: {
        _id: "$event_id",
        ticketName: "$issued_tickets.name"
      },
      count: {
        $sum: 1
      }
    }
  },
  {
    $project: {
      event_id: "$_id._id",
      ticketName: "$_id.ticketName",
      count: 1,
      _id: 0
    }
  }
])
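To make the result shape concrete, here is a minimal plain JavaScript sketch of what the $unwind and $group stages compute. It runs in memory without a MongoDB connection; the orders array, its string event ids, and the countTicketsPerEvent helper are hypothetical stand-ins, not part of the original answer.

```javascript
// Hypothetical in-memory stand-in for the orders collection (fields trimmed).
const orders = [
  { event_id: "E1", issued_tickets: [{ name: "Tier 1" }, { name: "Tier 2" }] },
  { event_id: "E1", issued_tickets: [{ name: "Tier 1" }] },
  { event_id: "E2", issued_tickets: [{ name: "Tier 2" }] },
];

function countTicketsPerEvent(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // Counterpart of $unwind: visit each issued ticket on its own.
    for (const ticket of doc.issued_tickets || []) {
      // Counterpart of the compound $group _id (event_id + ticket name).
      const key = JSON.stringify([doc.event_id, ticket.name]);
      // Counterpart of { $sum: 1 }.
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  // Counterpart of $project: flatten the compound key into plain fields.
  return [...counts].map(([key, count]) => {
    const [event_id, ticketName] = JSON.parse(key);
    return { event_id, ticketName, count };
  });
}

const ticketCounts = countTicketsPerEvent(orders);
console.log(ticketCounts);
```

Each output row pairs one event_id with one ticket name, so an event that sold two ticket types appears twice.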

Indexing MongoDB for sort consistency

The MongoDB documentation says that MongoDB doesn't store documents in a collection in a particular order. So if you have this collection:
db.restaurants.insertMany( [
{ "_id" : 1, "name" : "Central Park Cafe", "borough" : "Manhattan"},
{ "_id" : 2, "name" : "Rock A Feller Bar and Grill", "borough" : "Queens"},
{ "_id" : 3, "name" : "Empire State Pub", "borough" : "Brooklyn"},
{ "_id" : 4, "name" : "Stan's Pizzaria", "borough" : "Manhattan"},
{ "_id" : 5, "name" : "Jane's Deli", "borough" : "Brooklyn"},
] );
and sorting like this:
db.restaurants.aggregate(
[
{ $sort : { borough : 1 } }
]
)
Then the sort order can be inconsistent since:
the borough field contains duplicate values for both Manhattan and Brooklyn. Documents are returned in alphabetical order by borough, but the order of those documents with duplicate values for borough might not be the same across multiple executions of the same sort.
To return a consistent result it's recommended to modify the query to:
db.restaurants.aggregate(
[
{ $sort : { borough : 1, _id: 1 } }
]
)
My question relates to the efficiency of such a query. Let's say you have millions of documents, should you create a compound index, something like { borough: 1, _id: -1 }, to make it efficient? Or is it enough to index { borough: 1 } due to the, potentially, special nature of the _id field?
I'm using MongoDB 4.4.
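If you need a stable sort, you have to sort on both fields, and for the query to perform well you need a compound index on both fields. An index on { borough: 1 } alone is not enough. The key directions in the index must also match the sort (or be its exact inverse), so { borough: 1, _id: -1 } cannot serve a sort on { borough: 1, _id: 1 }. For that sort, create:
{ borough: 1, _id: 1 }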
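The direction rule can be sketched as a small check in plain JavaScript. This is a simplified illustration, not MongoDB's planner: the indexSupportsSort helper is hypothetical, and it only covers the case where the sort keys form a prefix of the index keys, with directions either all matching or all inverted.

```javascript
// Simplified rule: the sort keys must be a prefix of the index keys, in the
// same order, and their directions must either all match the index or all
// be the exact inverse of it.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  const sameFields = sortKeys.every(([field], i) => indexKeys[i][0] === field);
  if (!sameFields) return false;

  const allMatch = sortKeys.every(([, dir], i) => indexKeys[i][1] === dir);
  const allInverse = sortKeys.every(([, dir], i) => indexKeys[i][1] === -dir);
  return allMatch || allInverse;
}

const sortSpec = { borough: 1, _id: 1 };
console.log(indexSupportsSort({ borough: 1, _id: 1 }, sortSpec));   // matching directions
console.log(indexSupportsSort({ borough: -1, _id: -1 }, sortSpec)); // exact inverse
console.log(indexSupportsSort({ borough: 1, _id: -1 }, sortSpec));  // mixed directions
console.log(indexSupportsSort({ borough: 1 }, sortSpec));           // index too short
```

In practice, running the sort with explain() on the real collection is the reliable way to confirm whether the index is used.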

Limit distinct values only if a subelement exists

I have searched here but could not find a clear answer to the following question. In the sample collection mycollection below, how would one select distinct vin values only from documents where the status field exists and the status is UNLOCKED?
I have tried
db.getCollection('mycollection').distinct("vin", {$and: [{"decoded_payload.status": {$exists: true}}, {"decoded_payload.status":"UNLOCKED"}]})
but this query hangs indefinitely.
Because the database is large and such a query takes a long time, I would like to limit the output just to check whether it runs at all, but it seems limit() is not an option with .distinct().
In MongoDB, how would one select the distinct vin in the data below, set the limit = 1 and only select based on the status condition (status exists and is equal to "UNLOCKED")?
Would aggregate() be the right choice? How does one use the above conditions with aggregate() and limit() ?
The output in this case would be 34567
{
"_id" : ObjectId("1"),
"vin" : "12345",
"class_name" : "foo",
"decoded_payload" : {
"timestamp" : 1547329250,
"status" : "LOCKED"
}
}
{
"_id" : ObjectId("2"),
"vin" : "23456",
"class_name" : "foo",
"decoded_payload" : {
"timestamp" : 1547329260,
"status" : "LOCKED"
}
}
{
"_id" : ObjectId("3"),
"vin" : "34567",
"class_name" : "bar",
"decoded_payload" : {
"timestamp" : 1547329270,
"status" : "UNLOCKED",
"reservation_id" : "71"
}
}
{
"_id" : ObjectId("4"),
"vin" : "45678",
"class_name" : "baz",
"decoded_payload" : {
"timestamp" : 1547329280,
"reservation_id" : "71"
}
}
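You can use this aggregation query to filter the data and return the distinct "vin" values:
db.mycollection.aggregate([
  {
    $match: {
      // The equality test alone already excludes documents without a status;
      // the $exists clause is kept from the question but is redundant.
      $and: [{
        "decoded_payload.status": { $exists: true }
      }, {
        "decoded_payload.status": "UNLOCKED"
      }]
    }
  },
  { $limit : 5 }, // Here it caps the matched documents; it can also go after $group
  {
    $group: { _id: "$vin" }
  }
])
Place the $limit stage before or after the $group stage as required. Before $group it caps how many matching documents are read, which is useful for checking that the query runs at all. After $group it caps the number of distinct vin values returned.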
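As a rough illustration of the "limit after $group" variant, here is a plain JavaScript sketch that runs in memory on the sample data from the question. The vehicles array and the distinctUnlockedVins helper are hypothetical names used only for this example.

```javascript
// Sample documents from the question, trimmed to the relevant fields.
const vehicles = [
  { vin: "12345", decoded_payload: { status: "LOCKED" } },
  { vin: "23456", decoded_payload: { status: "LOCKED" } },
  { vin: "34567", decoded_payload: { status: "UNLOCKED", reservation_id: "71" } },
  { vin: "45678", decoded_payload: { reservation_id: "71" } },
];

function distinctUnlockedVins(docs, limit) {
  const seen = new Set();
  for (const doc of docs) {
    // Counterpart of $limit placed after $group: stop once enough
    // distinct values have been collected.
    if (seen.size >= limit) break;
    // Counterpart of $match: a missing status never equals "UNLOCKED".
    if ((doc.decoded_payload || {}).status !== "UNLOCKED") continue;
    // Counterpart of $group by vin: a Set keeps each value once.
    seen.add(doc.vin);
  }
  return [...seen];
}

const firstUnlocked = distinctUnlockedVins(vehicles, 1);
console.log(firstUnlocked);
```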

Building a pipeline and aggregate in Mongo

How do I aggregate documents of the type below to sum the quantity of all product_id sold for each district_id and city_id within a period of time?
I tried using the $match and $group aggregation stages but haven't been successful.
{
"_id" : ObjectId("5b115e00a186ae19062b0714"),
"id" : 86164014,
"cost" : 3,
"created_date" : "2017-04-04 21:44:14",
"quantity" : 12,
"bill_id" : 46736603,
"product_id" : 24,
"bill_date" : "2017-04-04",
"district_id" : 75,
"city_id": 21
}
You should be more specific about the "within a period of time" and which field we should consider, but the query for the first part could be this one:
db.getCollection("your collection").aggregate([
  {
    $group: {
      _id: {
        city_id: "$city_id",
        district_id: "$district_id"
      },
      quantities: { $sum: "$quantity" }
    }
  }
])
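For the "period of time" part, one possible reading is a range on bill_date. Since bill_date is stored as a "YYYY-MM-DD" string, plain string comparison orders the values chronologically, so a stage such as { $match: { bill_date: { $gte: "2017-04-01", $lte: "2017-04-30" } } } placed before $group would restrict the period. The plain JavaScript sketch below shows the same logic in memory; the sample rows, the date bounds, and the sumQuantities helper are assumptions made for illustration.

```javascript
// Hypothetical sample rows shaped like the document in the question.
const sales = [
  { product_id: 24, quantity: 12, bill_date: "2017-04-04", district_id: 75, city_id: 21 },
  { product_id: 30, quantity: 5,  bill_date: "2017-04-10", district_id: 75, city_id: 21 },
  { product_id: 24, quantity: 7,  bill_date: "2017-05-02", district_id: 75, city_id: 21 },
  { product_id: 24, quantity: 3,  bill_date: "2017-04-20", district_id: 80, city_id: 21 },
];

function sumQuantities(docs, from, to) {
  const totals = new Map();
  for (const doc of docs) {
    // Counterpart of $match on bill_date: "YYYY-MM-DD" strings compare
    // in chronological order.
    if (doc.bill_date < from || doc.bill_date > to) continue;
    // Counterpart of the compound $group _id (city_id + district_id).
    const key = `${doc.city_id}/${doc.district_id}`;
    const entry = totals.get(key) ||
      { city_id: doc.city_id, district_id: doc.district_id, quantities: 0 };
    entry.quantities += doc.quantity; // counterpart of { $sum: "$quantity" }
    totals.set(key, entry);
  }
  return [...totals.values()];
}

const aprilTotals = sumQuantities(sales, "2017-04-01", "2017-04-30");
console.log(aprilTotals);
```

If totals are needed per product as well, product_id would have to become part of the grouping key.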

Summing a value of a key over multiple documents in MongoDB

I have a collection named users with the following structure to its documents
{
"_id" : <user_id>,
"NAME" : "ABC",
"TIME" : 53.0,
"OBJECTS" : 1
},
{
"_id" : <user_id>,
"NAME" : "ABCD",
"TIME" : 353.0,
"OBJECTS" : 70
}
Now, I want to sum the value of OBJECTS over the entire collection and return the value along with the objects.
Something like this
{
{
"_id" : <user_id>,
"NAME" : "ABC",
"TIME" : 53.0,
"OBJECTS" : 1
},
{
"_id" : <user_id>,
"NAME" : "ABCD",
"TIME" : 353.0,
"OBJECTS" : 70
},
"TOTAL_OBJECTS": 71
}
Or any way wherein I don't have to compute on the received object and can directly access from it. Now, I've tried looking this up but I found none where the hierarchy of the existing documents isn't destroyed.
You can use $group with null as the grouping id. This gathers all documents into one array (using the $$ROOT variable), and another field can hold the sum of OBJECTS, as shown below:
db.users.aggregate([
  {
    $group: {
      _id: null,
      documents: { $push: "$$ROOT" },
      TOTAL_OBJECTS: { $sum: "$OBJECTS" }
    }
  }
])
db.users.aggregate(
  // Pipeline
  [
    // Stage 1
    {
      $group: {
        _id: null,
        TOTAL_OBJECTS: {
          $sum: '$OBJECTS'
        },
        documents: {
          $addToSet: '$$CURRENT'
        }
      }
    },
  ]
);
In the aggregate query above, all documents are collected into an array using the $addToSet operator in the $group stage. Unlike $push, $addToSet does not guarantee the order of the elements in the resulting array.
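The shape of the single document produced by grouping on null can be sketched in plain JavaScript. This runs in memory only; the users array with placeholder ids and the withTotal helper are hypothetical.

```javascript
// Hypothetical sample data; the _id values are placeholders.
const users = [
  { _id: "u1", NAME: "ABC",  TIME: 53.0,  OBJECTS: 1 },
  { _id: "u2", NAME: "ABCD", TIME: 353.0, OBJECTS: 70 },
];

function withTotal(docs) {
  return {
    _id: null,               // a single group for the whole collection
    documents: docs.slice(), // counterpart of { $push: "$$ROOT" }
    // Counterpart of { $sum: "$OBJECTS" }: non-numeric values are ignored.
    TOTAL_OBJECTS: docs.reduce(
      (sum, d) => sum + (typeof d.OBJECTS === "number" ? d.OBJECTS : 0),
      0
    ),
  };
}

const summary = withTotal(users);
console.log(summary.TOTAL_OBJECTS, summary.documents.length);
```

The original documents stay intact inside the documents array, so the caller can read both the rows and the total from one result.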

Aggregation query returning array of all objects for mongodb

I'm using mongo for the first time. I'm trying to aggregate some documents in a collection using the query below. Instead the query returns an object with a key "result" that contains an array of all the documents that fit with $match.
Below is the query.
db.events_2015_04_10.aggregate([
{$group:{
_id: "$uid",
count: {$sum: 1},
},
$match : {promo:"bc40100abc8d4eb6a0c68f81f4a756c7", evt:"login"}
}
]
);
Below is a sample document in the collection:
{
"_id" : ObjectId("552712c3f92ea17426000ace"),
"product" : "Mobile Safari",
"venue_id" : NumberLong(71540),
"uid" : "dd542fea6b4443469ff7bf1f56472eac",
"ag" : 0,
"promo" : "bc40100abc8d4eb6a0c68f81f4a756c7",
"promo_f" : NumberLong(1),
"brand" : NumberLong(17),
"venue" : "ovation_2480",
"lt" : 0,
"ts" : ISODate("2015-04-10T00:01:07.734Z"),
"evt" : "login",
"mac" : "00:00:00:00:00:00",
"__ns__" : "wifipromo",
"pvdr" : NumberLong(42),
"os" : "iPhone",
"cmpgn" : "fc6de34aef8b4f57af0b8fda98d8c530",
"ip" : "192.119.43.250",
"lng" : 0,
"product_ver" : "8"
}
I'm trying to get the documents grouped by uid, with the total count for each group. What is the correct way to achieve this?
In your query, $match is a second key inside the same object as $group, but each pipeline stage must be its own document. Try the following aggregation pipeline, which has the $match stage first and the $group stage after it:
db.events_2015_04_10.aggregate([
  {
    $match: {
      promo: "bc40100abc8d4eb6a0c68f81f4a756c7",
      evt: "login"
    }
  },
  {
    $group: {
      _id: "$uid",
      count: {
        $sum: 1
      }
    }
  }
])
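The effect of filtering first and grouping second can be sketched in plain JavaScript. This is an in-memory illustration only; the events array, its short uid and promo values, and the countLoginsByUid helper are hypothetical.

```javascript
// Hypothetical events, trimmed to the fields the pipeline uses.
const events = [
  { uid: "a", promo: "p1", evt: "login" },
  { uid: "a", promo: "p1", evt: "login" },
  { uid: "b", promo: "p1", evt: "login" },
  { uid: "b", promo: "p1", evt: "logout" },
  { uid: "c", promo: "p2", evt: "login" },
];

function countLoginsByUid(docs, promo) {
  const counts = new Map();
  for (const doc of docs) {
    // Counterpart of $match: only login events for the given promo pass.
    if (doc.promo !== promo || doc.evt !== "login") continue;
    // Counterpart of $group by uid with { $sum: 1 }.
    counts.set(doc.uid, (counts.get(doc.uid) || 0) + 1);
  }
  return [...counts].map(([uid, count]) => ({ _id: uid, count }));
}

const loginCounts = countLoginsByUid(events, "p1");
console.log(loginCounts);
```

Filtering before grouping also means fewer documents reach the grouping step.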