Group different field by quarter - mongodb

I've got an aggregation:
{
$group: {
_id: "$_id",
cuid: {$first: "$cuid"},
uniqueConnexion: {
$addToSet: "$uniqueConnexion"
},
uniqueFundraisings: {
$addToSet: "$uniqueFundraisings"
}
}
},
that results in:
{
"cuid" : "cjcqe7qdo00nl0ltitkxdw8r6",
"uniqueConnexion" : [
"09.2019",
"06.2019",
"07.2019",
"08.2019",
"05.2019"
],
"uniqueFundraisings" : [
"06.2019",
"02.2019",
"01.2019",
"03.2019",
"09.2018",
"10.2018"
],
}
Now I want to merge the uniqueConnexion and uniqueFundraisings fields into a new field (named uniqueAction) and convert the values to a quarter format.
So the output should look like this:
{
"cuid" : "cjcqe7qdo00nl0ltitkxdw8r6",
"uniqueAction" : [
"Q4-2018",
"Q1-2019",
"Q2-2019",
"Q3-2019",
],
}

The previous answer shows the power of $setUnion operating on two lists. I have taken that and expanded a little more to get the OP target state. Given an input that more clearly shows some quarterly grouping (hint!):
var r =
{
"cuid" : "cjcqe7qdo00nl0ltitkxdw8r6",
"uniqueConnexion" : [
"01.2018",
"02.2018",
"08.2018",
"09.2018",
"10.2018",
"11.2018"
],
"uniqueFundraisings" : [
"01.2018",
"02.2018",
"05.2018",
"06.2018",
"12.2018"
],
};
this agg:
db.foo.aggregate([
// Unique-ify the two lists:
{ $project: {
cuid:1,
X: { $setUnion: [ "$uniqueConnexion", "$uniqueFundraisings" ] }
}}
// Now need to get to quarters....
// The input date is "MM.YYYY". Need to turn it into "Qn-YYYY":
,{ $project: {
X: {$map: {
input: "$X",
as: "z",
in: {$let: {
vars: { q: {$toInt: {$substr: ["$$z",0,2] }}},
in: {$concat: [{$cond: [
{$lte: ["$$q", 3]}, "Q1", {$cond: [
{$lte: ["$$q", 6]}, "Q2", {$cond: [
{$lte: ["$$q", 9]}, "Q3", "Q4"] }
]}
]} ,
"-", {$substr:["$$z",3,4]},
]}
}}}}}}
,{ $unwind: "$X"}
,{ $group: {_id: "$X", n: {$sum:1} }}
]);
produces the output below. The OP did not ask for the count of entries in each quarter, but that request very often follows quickly on the heels of the original ask.
{ "_id" : "Q4-2018", "n" : 3 }
{ "_id" : "Q3-2018", "n" : 2 }
{ "_id" : "Q2-2018", "n" : 2 }
{ "_id" : "Q1-2018", "n" : 2 }
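For the array the OP originally asked for (one uniqueAction list per document), the same month-to-quarter logic can be sketched client-side in plain JavaScript. This is only an illustration under assumptions: toQuarter and uniqueAction are hypothetical helper names, and the input is assumed to always be "MM.YYYY" strings as in the sample above.

```javascript
// Convert "MM.YYYY" to "Qn-YYYY". toQuarter is a hypothetical helper name.
function toQuarter(monthYear) {
  const [month, year] = monthYear.split(".");
  const quarter = Math.ceil(Number(month) / 3); // 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4
  return `Q${quarter}-${year}`;
}

// Merge the two lists, convert each entry, and drop duplicate quarters.
function uniqueAction(doc) {
  const merged = [...doc.uniqueConnexion, ...doc.uniqueFundraisings];
  const quarters = new Set(merged.map(toQuarter));
  // Sort chronologically: by year first, then by quarter number.
  return [...quarters].sort((a, b) => {
    const [qa, ya] = a.slice(1).split("-").map(Number);
    const [qb, yb] = b.slice(1).split("-").map(Number);
    return ya - yb || qa - qb;
  });
}

// Same sample document as in the answer above.
const sample = {
  cuid: "cjcqe7qdo00nl0ltitkxdw8r6",
  uniqueConnexion: ["01.2018", "02.2018", "08.2018", "09.2018", "10.2018", "11.2018"],
  uniqueFundraisings: ["01.2018", "02.2018", "05.2018", "06.2018", "12.2018"],
};

console.log(uniqueAction(sample));
// [ 'Q1-2018', 'Q2-2018', 'Q3-2018', 'Q4-2018' ]
```

Math.ceil(month / 3) plays the role of the nested $cond chain in the pipeline, and the Set plays the role of $setUnion.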

I think this will help you:
{ $project: {
cuid: 1,
uniqueAction: { $setUnion: [ "$uniqueConnexion", "$uniqueFundraisings" ] }, _id: 0
}
}

Related

Mongodb - Get sales per hours

Good people! I am in need of your help.
I am trying to create a line graph using apexcharts with data imported from Mongodb.
I am trying to graph hourly sales, so I need the number of sales for each hour of the day.
Example Mongodb document.
{
"_id" : ObjectId("5dbee4eed6f04aaf191abc59"),
"seller_id" : "5aa1c2c35ef7a4e97b5e995a",
"temp" : "4.3",
"sale_type" : "coins",
"createdAt" : ISODate("2020-05-10T00:10:00.000Z"),
"updatedAt" : ISODate("2019-11-10T14:32:14.650Z")
}
Up to now I have a query like this:
db.getCollection('sales').aggregate([
{ "$facet": {
"00:00": [
{ "$match" : {createdAt: {$gte: ISODate("2020-05-10T00:00:00.000Z"),$lt: ISODate("2020-05-10T00:59:00.001Z")},seller_id: "5aa1c2c35ef7a4e97b5e995a",
}},
{ "$count": "sales" },
],
"01:00": [
{ "$match" : {createdAt: {$gte: ISODate("2020-05-10T01:00:00.000Z"),$lt: ISODate("2020-05-10T01:59:00.001Z")},seller_id: "5aa1c2c35ef7a4e97b5e995a",
}},
{ "$count": "sales" },
],
"02:00": [
{ "$match" : {createdAt: {$gte: ISODate("2020-05-10T02:00:00.000Z"),$lt: ISODate("2020-05-10T02:59:00.001Z")},seller_id: "5aa1c2c35ef7a4e97b5e995a",
}},
{ "$count": "sales" },
],
"03:00": [
{ "$match" : {createdAt: {$gte: ISODate("2020-05-10T03:00:00.000Z"),$lt: ISODate("2020-05-10T03:59:00.001Z")},seller_id: "5aa1c2c35ef7a4e97b5e995a",
}},
{ "$count": "sales" },
],
}},
{ "$project": {
"ventas0": { "$arrayElemAt": ["$01:00.sales", 0] },
"ventas1": { "$arrayElemAt": ["$02:00.sales", 0] },
"ventas3": { "$arrayElemAt": ["$03:00.sales", 0] },
}}
])
But I am sure there is a more efficient way to do this.
My expected output looks like this:
[countsale(00:00),countsale(01:00),countsale(02:00),countsale(03:00), etc to 24 hs]
You are correct, there is a more efficient way to do this. Use the date expression operators: match the whole day once, then group on the $hour of createdAt.
db.getCollection('sales').aggregate([
{
$match: {
createdAt: {$gte: ISODate("2020-05-10T00:00:00.000Z"), $lt: ISODate("2020-05-11T00:00:00.000Z")}
}
},
{
$group: {
_id: {$hour: "$createdAt"},
count: {$sum: 1}
}
},
{
$sort: {
_id: 1
}
}
]);
This will give you this result:
[
{
_id: 0,
count: x
},
{
_id: 1,
count: y
},
...
{
_id: 23,
count: z
}
]
From here you can restructure the data easily as you wish.
One problem I foresee: hours without any matches (i.e. count = 0) will not exist in the result set, so you'll have to fill in those gaps manually.
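Filling the missing hours can be done on the client before the data is handed to the chart. A minimal sketch, assuming the result rows have the { _id, count } shape shown above; fillHourlyCounts is a hypothetical helper name and the sample counts are made up for illustration:

```javascript
// Turn the sparse $group output into a dense 24-element array of counts.
// fillHourlyCounts is a hypothetical helper; rows look like { _id: <hour>, count: <n> }.
function fillHourlyCounts(rows) {
  const counts = new Array(24).fill(0); // index = hour of day, default 0 sales
  for (const { _id, count } of rows) {
    counts[_id] = count;
  }
  return counts;
}

// Illustrative rows only: hours 1 and 3-22 are missing from the result set.
const rows = [
  { _id: 0, count: 3 },
  { _id: 2, count: 1 },
  { _id: 23, count: 5 },
];

const series = fillHourlyCounts(rows);
console.log(series.length);      // 24
console.log(series.slice(0, 4)); // [ 3, 0, 1, 0 ]
```

The resulting array is in the [countsale(00:00), countsale(01:00), ...] order the question asks for. Keep in mind that $hour works in UTC unless a timezone is passed to it.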

MongoDB : not able to get the field 'name' which has the max value in the two similar sub-documents

I have a test collection:
{
"_id" : ObjectId("5exxxxxx03"),
"username" : "abc",
"col1" : [
{
"colId" : 1,
"col2" : [
{
"name" : "a",
"value" : 10
},
{
"name" : "b",
"value" : 20
},
{
"name" : "c",
"value" : 30
}
],
"col3" : [
{
"name" : "d",
"value" : 15
},
{
"name" : "e",
"value" : 25
},
{
"name" : "f",
"value" : 35
}
]
}
]
}
col1 holds a list of sub-documents, each containing the arrays col2 and col3, which are similar but convey different meanings. The elements of both arrays have name and value fields.
Now, I need to find the max value from col2 or col3 and its corresponding name.
I tried the below query:
db.test.aggregate([
{$unwind: '$col1'},
{$unwind: '$col1.col2'},
{$unwind: '$col1.col3'},
{$group:
{_id: '$col1.colId',
maxCol2: {$max: '$col1.col2.value'},
maxCol3: {$max: '$col1.col3.value'}}},
{$project:
{maxValue: {$max: ['$maxCol2', '$maxCol3']},
name: {$cond: [
{$eq: ['$maxValue', '$maxCol2']},
'$col1.col2.name',
'$col1.col3.name']}}}]).pretty()
But, it resulted in the following, without name field in it:
{ "_id" : 1, "maxValue" : 35 }
So, just to check whether my condition is correct, I tried the following query ($col1.col2.name and $col1.col3.name replaced with the strings 111 and 222):
db.test.aggregate([
{$unwind: '$col1'},
{$unwind: '$col1.col2'},
{$unwind: '$col1.col3'},
{$group:
{_id: '$col1.colId',
maxCol2: {$max: '$col1.col2.value'},
maxCol3: {$max: '$col1.col3.value'}}},
{$project:
{maxValue: {$max: ['$maxCol2', '$maxCol3']},
name: {$cond: [
{$eq: ['$maxValue', '$maxCol2']},
'111',
'222']}}}]).pretty()
Which gives me the expected output:
{ "_id" : 1, "maxValue" : 35, "name" : "222" }
Could anyone explain why I am not getting the correct answer, and how I should query this to get the correct output?
The correct output should be:
{ "_id" : 1, "maxValue" : 35, "name" : "f" }
P.S. - I'm a beginner.
You can use the aggregation below:
db.collection.aggregate([
{ "$project": {
"col1": {
"$max": {
"$reduce": {
"input": "$col1",
"initialValue": [],
"in": {
"$concatArrays": [
"$$this.col2",
"$$value",
"$$this.col3"
]
}
}
}
}
}}
])
MongoPlayground
Try this one:
Explanation
We need to add extra fields with col2 and col3 values. Once we calculate max value, we retrieve name based on max value.
db.collection.aggregate([
{
$unwind: "$col1"
},
{
$unwind: "$col1.col2"
},
{
$unwind: "$col1.col3"
},
{
$group: {
_id: "$col1.colId",
maxCol2: {
$max: "$col1.col2.value"
},
maxCol3: {
$max: "$col1.col3.value"
},
col2: {
$addToSet: "$col1.col2"
},
col3: {
$addToSet: "$col1.col3"
}
}
},
{
$project: {
maxValue: {
$filter: {
input: {
$cond: [
{
$gt: [
"$maxCol2",
"$maxCol3"
]
},
"$col2",
"$col3"
]
},
cond: {
$eq: [
"$$this.value",
{
$cond: [
{
$gt: [
"$maxCol2",
"$maxCol3"
]
},
"$maxCol2",
"$maxCol3"
]
}
]
}
}
}
}
},
{
$unwind: "$maxValue"
},
{
$project: {
_id: 1,
maxValue: "$maxValue.value",
name: "$maxValue.name"
}
}
])
MongoPlayground | Merging col2 / col3 | Per document
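The underlying idea in both answers is to keep name and value together until the maximum has been picked, instead of taking $max of the values alone and trying to look the name up afterwards. A rough plain-JavaScript sketch of that idea, assuming col2 and col3 are never both empty; maxEntryPerCol1 is a hypothetical helper name:

```javascript
// For each col1 element, merge col2 and col3 and keep the entry with the
// largest value, so that name and value stay together.
// maxEntryPerCol1 is a hypothetical helper; it assumes the merged list is non-empty.
function maxEntryPerCol1(doc) {
  return doc.col1.map((group) => {
    const merged = [...group.col2, ...group.col3];
    const best = merged.reduce((a, b) => (b.value > a.value ? b : a));
    return { _id: group.colId, maxValue: best.value, name: best.name };
  });
}

// Sample document from the question (without _id and username).
const testDoc = {
  col1: [
    {
      colId: 1,
      col2: [
        { name: "a", value: 10 },
        { name: "b", value: 20 },
        { name: "c", value: 30 },
      ],
      col3: [
        { name: "d", value: 15 },
        { name: "e", value: 25 },
        { name: "f", value: 35 },
      ],
    },
  ],
};

console.log(maxEntryPerCol1(testDoc));
// [ { _id: 1, maxValue: 35, name: 'f' } ]
```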

mongodb aggregate multiple arrays

I am using MongoDB v3.4. I have a documents collection, and the sample data looks like this:
{
"mlVoters" : [
{"email" : "a#b.com", "isApproved" : false}
],
"egVoters" : [
{"email" : "a#b.com", "isApproved" : false},
{"email" : "c#d.com", "isApproved" : true}
]
},{
"mlVoters" : [
{"email" : "a#b.com", "isApproved" : false},
{"email" : "e#f.com", "isApproved" : true}
],
"egVoters" : [
{"email" : "e#f.com", "isApproved" : true}
]
}
Now, if I want the count of distinct email addresses for mlVoters:
db.documents.aggregate([
{$project: { mlVoters: 1 } },
{$unwind: "$mlVoters" },
{$group: { _id: "$mlVoters.email", mlCount: { $sum: 1 } }},
{$project: { _id: 0, email: "$_id", mlCount: 1 } },
{$sort: { mlCount: -1 } }
])
Result of the query is:
{"mlCount" : 2.0,"email" : "a#b.com"}
{"mlCount" : 1.0,"email" : "e#f.com"}
And if I want the count of distinct email addresses for egVoters, I do the same for the egVoters field. The result of that query would be:
{"egCount" : 1.0,"email" : "a#b.com"}
{"egCount" : 1.0,"email" : "c#d.com"}
{"egCount" : 1.0,"email" : "e#f.com"}
So, I want to combine these two aggregations and get the following result (sorted by totalCount):
{"email" : "a#b.com", "mlCount" : 2, "egCount" : 1, "totalCount":3}
{"email" : "e#f.com", "mlCount" : 1, "egCount" : 1, "totalCount":2}
{"email" : "c#d.com", "mlCount" : 0, "egCount" : 1, "totalCount":1}
How can I do this? What should the query look like? Thanks.
First you add a field voteType in each vote. This field indicates its type. Having this field, you don't need to keep the votes in two separate arrays mlVoters and egVoters; you can instead concatenate those arrays into a single array per document, and unwind afterwards.
At this point you have one document per vote, with a field that indicates which type it is. Now you simply need to group by email and, in the group stage, perform two conditional sums to count how many votes of each type there are for every email.
Finally you add a field totalCount as the sum of the other two counts.
db.documents.aggregate([
{
$addFields: {
mlVoters: {
$ifNull: [ "$mlVoters", []]
},
egVoters: {
$ifNull: [ "$egVoters", []]
}
}
},
{
$addFields: {
"mlVoters.voteType": "ml",
"egVoters.voteType": "eg"
}
},
{
$project: {
voters: { $concatArrays: ["$mlVoters", "$egVoters"] }
}
},
{
$unwind: "$voters"
},
{
$project: {
email: "$voters.email",
voteType: "$voters.voteType"
}
},
{
$group: {
_id: "$email",
mlCount: {
$sum: {
$cond: {
"if": { $eq: ["$voteType", "ml"] },
"then": 1,
"else": 0
}
}
},
egCount: {
$sum: {
$cond: {
"if": { $eq: ["$voteType", "eg"] },
"then": 1,
"else": 0
}
}
}
}
},
{
$addFields: {
totalCount: {
$sum: ["$mlCount", "$egCount"]
}
}
},
// sort by total votes, highest first, as the question asks
{
$sort: { totalCount: -1 }
}
])
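The tag, concatenate, and count steps described above can also be sketched in plain JavaScript, which may help when checking the pipeline's output against a few documents by hand. This is only an illustration; countVotes is a hypothetical helper name, and the emails are kept exactly as written in the question:

```javascript
// Tag every vote with its type, merge both arrays, then count per email.
// countVotes is a hypothetical helper mirroring the pipeline stages.
function countVotes(docs) {
  const byEmail = new Map();
  for (const doc of docs) {
    // Like $ifNull + $addFields + $concatArrays: missing arrays become [].
    const tagged = [
      ...(doc.mlVoters ?? []).map((v) => ({ email: v.email, voteType: "ml" })),
      ...(doc.egVoters ?? []).map((v) => ({ email: v.email, voteType: "eg" })),
    ];
    // Like $unwind + $group with two conditional sums.
    for (const { email, voteType } of tagged) {
      const row = byEmail.get(email) ?? { email, mlCount: 0, egCount: 0 };
      if (voteType === "ml") row.mlCount += 1;
      else row.egCount += 1;
      byEmail.set(email, row);
    }
  }
  // Like the final $addFields, followed by a descending sort on totalCount.
  return [...byEmail.values()]
    .map((row) => ({ ...row, totalCount: row.mlCount + row.egCount }))
    .sort((a, b) => b.totalCount - a.totalCount);
}

const voteDocs = [
  {
    mlVoters: [{ email: "a#b.com", isApproved: false }],
    egVoters: [
      { email: "a#b.com", isApproved: false },
      { email: "c#d.com", isApproved: true },
    ],
  },
  {
    mlVoters: [
      { email: "a#b.com", isApproved: false },
      { email: "e#f.com", isApproved: true },
    ],
    egVoters: [{ email: "e#f.com", isApproved: true }],
  },
];

console.log(countVotes(voteDocs));
// [
//   { email: 'a#b.com', mlCount: 2, egCount: 1, totalCount: 3 },
//   { email: 'e#f.com', mlCount: 1, egCount: 1, totalCount: 2 },
//   { email: 'c#d.com', mlCount: 0, egCount: 1, totalCount: 1 }
// ]
```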

Mongodb aggregate ifNull against array elements

I have the following dataset:
{
patientId: 228,
medication: {
atHome : [
{
"drug" : "tylenol",
"start" : "3",
"stop" : "7"
},
{
"drug" : "advil",
"start" : "0",
"stop" : "2"
},
{
"drug" : "vitaminK",
"start" : "0",
"stop" : "11"
}
],
}
}
When I execute the following aggregate everything looks great.
db.test01.aggregate(
[
{$match: {patientId: 228}},
{$project: {
patientId: 1,
"medication.atHome.drug": 1
}
},
]);
Results (Exactly what I wanted):
{
"_id" : ObjectId("5a57b7d17af6772ebf647939"),
"patientId" : NumberInt(228),
"medication" : {
"atHome" : [
{"drug" : "tylenol"},
{"drug" : "advil"},
{"drug" : "vitaminK"}
]}
}
We then wanted to add ifNull to change nulls to a default value, but this bungled the results.
db.test01.aggregate(
[
{$match: {patientId: 228}},
{$project: {
patientId: {$ifNull: ["$patientId", NumberInt(-1)]},
"medication.atHome.drug": {$ifNull: ["$medication.atHome.drug", "Unknown"]}
}
},
]);
Results from ifNull (Not what I was hoping for):
{
"_id" : ObjectId("5a57b7d17af6772ebf647939"),
"patientId" : NumberInt(228),
"medication" : {
"atHome" : [
{"drug" : ["tylenol", "advil", "vitaminK"]},
{"drug" : ["tylenol", "advil", "vitaminK"]},
{"drug" : ["tylenol", "advil", "vitaminK"]},
]}
}
What am I missing or not understanding?
To give default values to attributes of documents that are array elements, you need to $unwind the array, check each attribute for null, and then group everything back up. Here is the query:
db.test01.aggregate([
// unwind to evaluate the array elements
{$unwind: "$medication.atHome"},
{$project: {
patientId: {$ifNull: ["$patientId", -1]},
"medication.atHome.drug": {$ifNull: ["$medication.atHome.drug", "Unknown"]}
}
},
// group to put atHome documents to an array again
{$group: {
_id: {_id: "$_id", patientId: "$patientId"},
"atHome": {$push: "$medication.atHome" }
}
},
// project to get a document of required format
{$project: {
_id: "$_id._id",
patientId: "$_id.patientId",
"medication.atHome": "$atHome"
}
}
])
UPDATE:
There is a neater query that achieves the same thing. It uses the $map operator to evaluate each array element, so it does not require unwinding.
db.test01.aggregate([
{$project:
{
patientId: {$ifNull: ["$patientId", -1]},
"medication.atHome": {
$map: {
input: "$medication.atHome",
as: "e",
// $ifNull covers both null and missing drug fields
in: { drug: { $ifNull: ["$$e.drug", "Unknown"] } }
}
}
}
}
])
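The reason the first attempt went wrong is that "$medication.atHome.drug" resolves to the array of all drug values, not to the drug of the current element, so the default has to be applied per element. The same per-element logic can be sketched in plain JavaScript; withDefaults is a hypothetical helper name, and the second atHome entry below is made up to show a missing drug:

```javascript
// Apply defaults element by element, like the $map version of the pipeline.
// withDefaults is a hypothetical helper; ?? treats null and undefined alike.
function withDefaults(doc) {
  return {
    patientId: doc.patientId ?? -1,
    medication: {
      atHome: (doc.medication?.atHome ?? []).map((e) => ({
        drug: e.drug ?? "Unknown",
      })),
    },
  };
}

// Illustrative input: the second entry has no drug field at all.
const patient = {
  patientId: 228,
  medication: {
    atHome: [
      { drug: "tylenol", start: "3", stop: "7" },
      { start: "0", stop: "2" },
      { drug: "vitaminK", start: "0", stop: "11" },
    ],
  },
};

console.log(JSON.stringify(withDefaults(patient)));
// {"patientId":228,"medication":{"atHome":[{"drug":"tylenol"},{"drug":"Unknown"},{"drug":"vitaminK"}]}}
```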

Querying the total number of elements in nested arrays - embed documents MongoDB

I have documents in my collection like this:
{
_id: 1,
activities: [
{
activity_id: 1,
travel: [
{
point_id: 1,
location: [-76.0,19.1]
},
{
point_id: 2,
location: [-77.0,19.3]
}
]
},
{
activity_id: 2,
travel: [
{
point_id: 3,
location: [-99.3,18.2]
}
]
}
]
},
{
_id: 2,
activities: [
{
activity_id: 3,
travel: [
{
point_id: 4,
location: [-75.0,11.1]
}
]
}
]
}
I can get the total number of activities, as follows:
db.mycollection.aggregate(
{$unwind: "$activities"},
{$project: {count:{$add:1}}},
{$group: {_id: null, number: {$sum: "$count" }}}
)
I get (3 activities):
{ "result" : [ { "_id" : null, "number" : 3 } ], "ok" : 1 }
question: How can I get the total number of elements in all travels?
expected result: 4 elements
these are:
{
point_id: 1,
location: [-76.0,19.1]
},
{
point_id: 2,
location: [-77.0,19.3]
},
{
point_id: 3,
location: [-99.3,18.2]
},
{
point_id: 4,
location: [-75.0,11.1]
}
You can easily flatten the documents with a double $unwind, e.g.:
db.collection.aggregate([
{$unwind: "$activities"},
{$unwind: "$activities.travel"},
{$group:{
_id:null,
travel: {$push: {
point_id:"$activities.travel.point_id",
location:"$activities.travel.location"}}
}},
{$project:{_id:0, travel:"$travel"}}
])
This emits output that is very close to your desired format:
{
"travel" : [
{
"point_id" : 1.0,
"location" : [
-76.0,
19.1
]
},
{
"point_id" : 2.0,
"location" : [
-77.0,
19.3
]
},
{
"point_id" : 3.0,
"location" : [
-99.3,
18.2
]
},
{
"point_id" : 4.0,
"location" : [
-75.0,
11.1
]
}
]
}
Update:
If you just want to know the total number of travel documents in the whole collection, try:
db.collection.aggregate([
{$unwind: "$activities"},
{$unwind: "$activities.travel"},
{$group: {_id:0, total:{$sum:1}}}
])
It will print:
{
"_id" : NumberInt(0),
"total" : NumberInt(4)
}
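To sanity-check that number, the same count can be reproduced on the sample documents in plain JavaScript. A small sketch, assuming every document has an activities array and every activity has a travel array; countTravelPoints is a hypothetical helper name:

```javascript
// Sum the lengths of all travel arrays across all activities of all documents.
// countTravelPoints is a hypothetical helper mirroring the double $unwind + $sum.
function countTravelPoints(docs) {
  return docs.reduce(
    (total, doc) =>
      total +
      doc.activities.reduce((n, activity) => n + activity.travel.length, 0),
    0
  );
}

// Sample documents from the question.
const travelDocs = [
  {
    _id: 1,
    activities: [
      {
        activity_id: 1,
        travel: [
          { point_id: 1, location: [-76.0, 19.1] },
          { point_id: 2, location: [-77.0, 19.3] },
        ],
      },
      { activity_id: 2, travel: [{ point_id: 3, location: [-99.3, 18.2] }] },
    ],
  },
  {
    _id: 2,
    activities: [
      { activity_id: 3, travel: [{ point_id: 4, location: [-75.0, 11.1] }] },
    ],
  },
];

console.log(countTravelPoints(travelDocs)); // 4
```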
Update 2:
The OP wants to filter documents on some property within the aggregation framework. Here is a way to do so:
db.collection.aggregate([
{$unwind: "$activities"},
{$match:{"activities.activity_id":1}},
{$unwind: "$activities.travel"},
{$group: {_id:0, total:{$sum:1}}}
])
It will print (based on sample document):
{ "_id" : 0, "total" : 2 }