MongoDB aggregation pipeline: count the total size across multiple fields

`"ActivityScores" : {
"Spring" : [
{
"ActivityId" : "8fd38724-7e7d-4518-bd49-d38a8b4b3435",
"ActivityTime" : "2017-05-25T16:07:02.000-06:00"
}
],
"Winter" : [
{
"ActivityId" : "90d2a976-19d9-4ce0-aa88-d32c122d173b",
"ActivityTime" : "2017-02-14T22:50:00.000-06:00"
}
],
"Fall" : [
{
"ActivityId" : "84b8c41e-788f-4acd-abec-dc455285972b",
"ActivityTime" : "2016-11-15T22:37:02.000-06:00"
},
{
"ActivityId" : "157af880-d47b-42fc-8ecf-ecfc1bbb56b1",
"ActivityTime" : "2016-09-01T22:50:05.000-06:00"
}
]
},
"Grade" : "2",
"GradeTag" : "GRADE_2", `
I am looking for an aggregation query to get the total number of ActivityIds. I tried various combinations of $group, $unwind, $size and $addToSet, but none of them seems to work. I need to find the total number of activities using the aggregation framework only; I don't want to go through each document with JavaScript or Python to get the total count. Is there an easy way to do this?

Thanks. We are on version 3.2. Finally, the combination below worked. In our schema, ActivityScores is a field of entity.SchoolYears. This is the working aggregation pipeline for me:
db.studentcontentareadocument.aggregate(
[
{
$project: {
"SpringActivitiesPerDoc" : {
"$size" : "$entity.SchoolYears.ActivityScores.Spring"
},
"WinterActivitiesPerDoc" : {
"$size" : "$entity.SchoolYears.ActivityScores.Winter"
},
"FallActivitiesPerDoc" : {
"$size" : "$entity.SchoolYears.ActivityScores.Fall"
}
}
},
{
$project: {
"TotalActivitiesPerDoc" : {
"$add" : [
"$SpringActivitiesPerDoc",
"$WinterActivitiesPerDoc",
"$FallActivitiesPerDoc"
]
}
}
},
{
$group: {
"_id" : null,
"TotalActivities" : {
"$sum" : "$TotalActivitiesPerDoc"
}
}
},
{
$project: {
"_id" : 0,
"TotalSGPActivities" : "$TotalActivities"
}
}
],
{
cursor: {
batchSize: 50
},
allowDiskUse: true
}
);
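One thing to watch with the pipeline above: the $size expression raises an error when its argument is missing or is not an array, so a document without one of the seasons would abort the aggregation. A minimal sketch of a more defensive variant follows, assuming a missing season should count as zero. The helper name sizeOrZero is hypothetical, and the three $project stages are folded into a single $group; $ifNull, $size, $add and $sum are all available on 3.2.

```javascript
// Hypothetical helper: size of an array field, treating a missing or null field as [].
const sizeOrZero = (path) => ({ $size: { $ifNull: [path, []] } });

// Field prefix taken from the working pipeline above.
const base = "$entity.SchoolYears.ActivityScores";

const pipeline = [
  {
    // One group over the whole collection, summing the per-document totals.
    $group: {
      _id: null,
      TotalSGPActivities: {
        $sum: {
          $add: [
            sizeOrZero(base + ".Spring"),
            sizeOrZero(base + ".Winter"),
            sizeOrZero(base + ".Fall"),
          ],
        },
      },
    },
  },
  // Drop the null _id from the single result document.
  { $project: { _id: 0, TotalSGPActivities: 1 } },
];

// In the mongo shell this would be run as:
// db.studentcontentareadocument.aggregate(pipeline, { allowDiskUse: true });
```

The result should have the same shape as before, a single document with TotalSGPActivities, but I have only checked the pipeline structure, not run it against the real collection.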

Related

MongoDB Match DATE between FAILS

I have a match like:
{
$and: [
{ $nor: [ { Meetings: { $exists: false } }, { Meetings: { $size: 0 } }, { Meetings: { $eq: null } } ] },
{ 'Meetings.MeetingDate': { $gte: ISODate("2020-12-23T00:00:01.000Z") } },
{ 'Meetings.MeetingDate': { $lte: ISODate("2020-12-23T23:59:59.999Z") } }
]
}
and in Mongo I have meetings from 2020-01-01 to 2020-12-31.
If I want to get only the ones on the 23rd, this match returns them but also meetings with later dates, like the 25th, 26th, 30th, etc.
What is the correct way to match a date BETWEEN two bounds to get a specific date? (It could be one day or a range.)
I have a Mongo Playground with a small example, but there it works fine: I get everything from the 29th.
I guess my problem is in my aggregation. In the example I put MeetingDate at the root, while in real life it is inside a child array; maybe this is the problem.
db.getCollection("ClientProject").aggregate(
[
{
"$match" : {
"$and" : [
{
"$nor" : [
{
"Meetings" : {
"$exists" : false
}
},
{
"Meetings" : {
"$size" : 0.0
}
},
{
"Meetings" : {
"$eq" : null
}
}
]
},
{
"Meetings.MeetingDate" : {
"$gte" : ISODate("2020-12-30T00:00:01.000+0000")
}
},
{
"Meetings.MeetingDate" : {
"$lte" : ISODate("2020-12-31T23:59:59.999+0000")
}
}
]
}
},
{
"$project" : {
"ProjectName" : 1.0,
"ClientName" : 1.0,
"ClientResponsableName" : "$CreatedByName",
"ProjectType" : 1.0,
"ProjectSKU" : 1.0,
"Meetings" : 1.0
}
},
{
"$unwind" : {
"path" : "$Meetings",
"preserveNullAndEmptyArrays" : false
}
},
{
"$unwind" : {
"path" : "$Meetings.Invites",
"preserveNullAndEmptyArrays" : false
}
},
{
"$addFields" : {
"Meetings.Invites.MeetingDate" : "$Meetings.MeetingDate",
"Meetings.Invites.MeetingStartTime" : "$Meetings.StartTime",
"Meetings.Invites.MeetingEndTime" : "$Meetings.EndTime",
"Meetings.Invites.MeetingStatus" : "$Meetings.MeetingStatus",
"Meetings.Invites.ProjectId" : {
"$toString" : "$_id"
},
"Meetings.Invites.ProjectType" : "$ProjectType",
"Meetings.Invites.ProjectSKU" : "$ProjectSKU",
"Meetings.Invites.ProjectName" : "$ProjectName",
"Meetings.Invites.ClientId" : "$ClientId",
"Meetings.Invites.ClientName" : "$ClientName",
"Meetings.Invites.ClientResponsableName" : "$ClientResponsableName"
}
},
{
"$replaceRoot" : {
"newRoot" : "$Meetings.Invites"
}
},
{
"$sort" : {
"MeetingDate" : 1.0,
"MeetingStartTime" : 1.0,
"InviteStatus" : 1.0
}
}
],
{
"allowDiskUse" : false
}
);
cheers
The problem here is not related to the match on the date; that works. The problem is that each record has an array of meetings, each with a different date, so even when the record matches the correct date, the rest of the aggregation ($unwind, etc.) uses the full record with all of its meetings. That is why I get results with different dates.
I added an extra $match after $replaceRoot and it worked. $filter could also be used at some point in the aggregation.
cheers
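A minimal sketch of that fix, for anyone landing here. It assumes the same ClientProject schema as above and places the second $match right after the first $unwind instead of after $replaceRoot, so it filters on Meetings.MeetingDate. The day bounds are illustrative. The first stage also uses $elemMatch, because two separate dotted-path conditions on an array field can be satisfied by two different meetings.

```javascript
// Illustrative bounds; in the mongo shell ISODate("...") can be used instead of new Date("...").
const dayStart = new Date("2020-12-23T00:00:00.000Z");
const dayEnd = new Date("2020-12-23T23:59:59.999Z");
const inRange = { $gte: dayStart, $lte: dayEnd };

const pipeline = [
  // Keep only projects where a single meeting satisfies both bounds.
  { $match: { Meetings: { $elemMatch: { MeetingDate: inRange } } } },
  // One output document per meeting; this still includes the project's other meetings.
  { $unwind: "$Meetings" },
  // Drop the unwound meetings that fall outside the range.
  { $match: { "Meetings.MeetingDate": inRange } },
];

// In the mongo shell, with the remaining stages appended as needed:
// db.getCollection("ClientProject").aggregate(pipeline);
```

The $elemMatch stage also makes the $exists / $size / null checks unnecessary, since a missing, null or empty Meetings array cannot contain a matching element.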

Multiple grouping in mongodb

Sample collection data:
{
"_id" : ObjectId("5f30df23243ffsdfwer3d14568bf"),
"value" : {
"busId" : 200.0,
"status" : {
"code" : {
"id" : 1.0,
"key" : "2100",
"value" : "Complete"
}
}
}
}
My query does provide the right result, but I would like to condense the output further by using multiple grouping, $project, or any other aggregation operators.
Mongo query:
db.suraj_coll.aggregate([
{
$addFields: {
"value.available": {
$cond: [
{
$in: [
"$value.status.code.value",
[
"Accept",
"Complete"
]
]
},
"Approved",
"Rejected"
]
}
}
},
{
"$group": {
"_id": {
busID: "$value.busId",
status: "$value.available"
},
"subtotal": {
$sum: 1
}
}
}
])
Output:
/* 1 */
{
"_id" : {
"busID" : 200.0,
"status" : "Approved"
},
"subtotal" : 3.0
}
/* 2 */
{
"_id" : {
"busID" : 200.0,
"status" : "Rejected"
},
"subtotal" : 1.0
}
Is it possible to condense the output further with any additional grouping?
The output should look like this:
{
"_id" : {
"busID" : 200.0,
"Approved" : 3.0,
"Rejected" : 1.0
}
}
I tried $project, keeping the count in a document, but couldn't place the count against Approved or Rejected.
Any suggestions would be great.
You can add two more stages after your query:
$group by busID and push the status and count into a status array
$project to convert the status array to an object using $arrayToObject and merge it with busID using $mergeObjects
{
$group: {
_id: "$_id.busID",
status: {
$push: {
k: "$_id.status",
v: "$subtotal"
}
}
}
},
{
$project: {
_id: {
$mergeObjects: [
{ busID: "$_id" },
{ $arrayToObject: "$status" }
]
}
}
}
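If the set of statuses is fixed (only Approved and Rejected, as in the question), a possible alternative is to count both in a single $group with conditional sums, which avoids $arrayToObject and $mergeObjects. This is only a sketch under that assumption. Note that the output is flat ({ busID, Approved, Rejected }) rather than nested under _id, and a status with no documents shows up as 0 instead of being absent.

```javascript
// Status values treated as "Approved", copied from the question's $cond.
const approvedValues = ["Accept", "Complete"];
const isApproved = { $in: ["$value.status.code.value", approvedValues] };

const pipeline = [
  {
    // One group per bus; each document adds 1 to exactly one of the two counters.
    $group: {
      _id: "$value.busId",
      Approved: { $sum: { $cond: [isApproved, 1, 0] } },
      Rejected: { $sum: { $cond: [isApproved, 0, 1] } },
    },
  },
  // Rename _id to busID and keep the two counters.
  { $project: { _id: 0, busID: "$_id", Approved: 1, Rejected: 1 } },
];

// In the mongo shell: db.suraj_coll.aggregate(pipeline);
```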

Partition data around a match query during aggregation

What I have been trying to get my head around is how to perform some kind of partitioning (split by predicate) in a Mongo query. My current query looks like:
db.posts.aggregate([
{"$match": { $and:[ {$or:[{"toggled":false},{"toggled":true, "status":"INACTIVE"}]} , {"updatedAt":{$gte:1549786260000}} ] }},
{"$unwind" :"$interests"},
{"$group" : {"_id": {"iid": "$interests", "pid":"$publisher"}, "count": {"$sum" : 1}}},
{"$project":{ _id: 0, "iid": "$_id.iid", "pid": "$_id.pid", "count": 1 }}
])
This results in the following output:
{
"count" : 3.0,
"iid" : "INT456",
"pid" : "P789"
}
{
"count" : 2.0,
"iid" : "INT789",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P123"
}
All good so far, but then I realized that for the documents that match the specific filter {"toggled":true, "status":"INACTIVE"}, I would rather decrement the count (-1). (The eventual value can be negative as well.)
Is there a way to partition the data after the match so that different grouping operations are performed on the two sets of documents?
Something that sounds similar to what I am looking for is $mergeObjects, or maybe $reduce, but I can't relate much from the documentation examples.
Note: I sense that one straightforward way to deal with this would be to perform two queries, but I am looking for a single query to perform the operation.
Sample documents for the above output would be:
/* 1 */
{
"_id" : ObjectId("5d1f7******"),
"id" : "CON123",
"title" : "Game",
"content" : {},
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P789",
"interests" : [
"INT456"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 2 */
{
"_id" : ObjectId("5d1f8******"),
"id" : "CON456",
"title" : "Home",
"content" : {},
"status" : "INACTIVE",
"toggle":true,
"publisher" : "P789",
"interests" : [
"INT456",
"INT789"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 3 */
{
"_id" : ObjectId("5d0e9******"),
"id" : "CON654",
"title" : "School",
"content" : {},
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P789",
"interests" : [
"INT123",
"INT456",
"INT789"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 4 */
{
"_id" : ObjectId("5d207*******"),
"id" : "CON789",
"title":"Stack",
"content" : { },
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P123",
"interests" : [
"INT123"
],
"updatedAt" : NumberLong(1582078628264)
}
The result I am looking for, though, is:
{
"count" : 1.0, (2-1)
"iid" : "INT456",
"pid" : "P789"
}
{
"count" : 0.0, (1-1)
"iid" : "INT789",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P123"
}
This aggregation gives the desired result.
db.posts.aggregate( [
{ $match: { updatedAt: { $gte: 1549786260000 } } },
{ $facet: {
FALSE: [
{ $match: { toggle: false } },
{ $unwind : "$interests" },
{ $group : { _id : { iid: "$interests", pid: "$publisher" }, count: { $sum : 1 } } },
],
TRUE: [
{ $match: { toggle: true, status: "INACTIVE" } },
{ $unwind : "$interests" },
{ $group : { _id : { iid: "$interests", pid: "$publisher" }, count: { $sum : -1 } } },
]
} },
{ $project: { result: { $concatArrays: [ "$FALSE", "$TRUE" ] } } },
{ $unwind: "$result" },
{ $replaceRoot: { newRoot: "$result" } },
{ $group : { _id : "$_id", count: { $sum : "$count" } } },
{ $project:{ _id: 0, iid: "$_id.iid", pid: "$_id.pid", count: 1 } }
] )
[ EDIT ADD ]
The output from the query using the input data from the question post:
{ "count" : 1, "iid" : "INT123", "pid" : "P789" }
{ "count" : 1, "iid" : "INT123", "pid" : "P123" }
{ "count" : 0, "iid" : "INT789", "pid" : "P789" }
{ "count" : 1, "iid" : "INT456", "pid" : "P789" }
[ EDIT ADD 2 ]
This query gets the same result with a different approach:
db.posts.aggregate( [
{
$match: { updatedAt: { $gte: 1549786260000 } }
},
{
$unwind : "$interests"
},
{
$group : {
_id : {
iid: "$interests",
pid: "$publisher"
},
count: {
$sum: {
$switch: {
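branches: [
{ case: { $eq: [ "$toggle", false ] },
then: 1 },
{ case: { $and: [ { $eq: [ "$toggle", true] }, { $eq: [ "$status", "INACTIVE" ] } ] },
then: -1 }
],
// Without a default, $switch raises an error when no branch matches
// (for example toggle: true with status: "ACTIVE"); such documents count as 0.
default: 0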
}
}
}
}
},
{
$project:{
_id: 0,
iid: "$_id.iid",
pid: "$_id.pid",
count: 1
}
}
] )
[ EDIT ADD 3 ]
NOTE:
The facet query runs the two facets (TRUE and FALSE) on the same set of documents; it is like two queries running in parallel. However, there is some duplication of code, as well as additional stages for reshaping the documents further down the pipeline to get the desired output.
The second query avoids the code duplication and has far fewer stages in the aggregation pipeline. This makes a difference in performance when the input dataset has a large number of documents to process. In general, fewer stages means fewer passes over the documents (as each stage has to process the documents output by the previous stage).
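To check the arithmetic of the second query outside the database, the same weighting can be replayed in plain JavaScript over the four sample documents from the question (trimmed to the fields that matter). The function names weight and signedCounts are hypothetical, and treating any other document as 0 is an assumption, not something the question specifies.

```javascript
// The four sample documents from the question, reduced to the relevant fields.
const posts = [
  { status: "ACTIVE", toggle: false, publisher: "P789", interests: ["INT456"] },
  { status: "INACTIVE", toggle: true, publisher: "P789", interests: ["INT456", "INT789"] },
  { status: "ACTIVE", toggle: false, publisher: "P789", interests: ["INT123", "INT456", "INT789"] },
  { status: "ACTIVE", toggle: false, publisher: "P123", interests: ["INT123"] },
];

// Mirrors the $switch branches: +1 for toggle false, -1 for toggled INACTIVE, else 0.
function weight(doc) {
  if (doc.toggle === false) return 1;
  if (doc.toggle === true && doc.status === "INACTIVE") return -1;
  return 0;
}

// Mirrors $unwind on interests followed by $group on (iid, pid) with a signed $sum.
function signedCounts(docs) {
  const totals = new Map();
  for (const doc of docs) {
    for (const iid of doc.interests) {
      const key = iid + "|" + doc.publisher;
      totals.set(key, (totals.get(key) || 0) + weight(doc));
    }
  }
  return totals;
}

const totals = signedCounts(posts);
for (const [key, count] of totals) {
  console.log(key, count);
}
```

This gives 1 for INT456|P789 (two active posts minus one toggled INACTIVE post), 0 for INT789|P789, and 1 for each of the INT123 pairs, which agrees with the expected result in the question.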

mongodb nested array items exact match

I have a collection as shown below. I want to fetch only the items that exactly match Tag="dolore". I tried different ways, but I get all the elements whenever any of the embedded elements has the tag dolore.
{
"_id" : 123,
"vendor" : "ut",
"boxes" : [
{
"boxRef" : 321,
"items" : [
{
"Tag" : "dolore",
},
{
"Tag" : "irure",
},
{
"Tag" : "labore",
}
]
},
{
"boxRef" : 789,
"items" : [
{
"Tag" : "incididunt",
},
{
"Tag" : "magna",
},
{
"Tag" : "laboris",
}
]
},
{
"boxRef" : 456,
"items" : [
{
"Tag" : "reprehenderit",
},
{
"Tag" : "reprehenderit",
},
{
"Tag" : "enim",
}
]
}
]
}
If you expect to get only the matching embedded documents, you have to $unwind, $match, and then $group to reverse the $unwind. Like this:
db.getCollection('collectionName').aggregate([
{
$unwind:"$boxes"
},
{
$unwind:"$boxes.items"
},
{
$match:{
"boxes.items.Tag":"dolore"
}
},
{
$group:{
_id:{
boxRef:"$boxes.boxRef",
_id:"$_id"
},
vendor:{
"$first":"$vendor"
},
boxRef:{
"$first":"$boxes.boxRef"
},
items:{
$push:"$boxes.items"
}
}
},
{
$group:{
_id:"$_id._id",
vendor:{
"$first":"$vendor"
},
boxes:{
$push:{
boxRef:"$boxRef",
items:"$items"
}
}
}
},
])
Output:
{
"_id" : 123.0,
"vendor" : "ut",
"boxes" : [
{
"boxRef" : 321.0,
"items" : [
{
"Tag" : "dolore"
}
]
}
]
}
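On MongoDB 3.2 or later the same result can probably be reached without unwinding and regrouping, by filtering the nested arrays in place with $map and $filter. This is a sketch rather than a tested replacement. It assumes every box has an items array (a box without one would make $size fail), and the collection name is the same placeholder as above.

```javascript
const tag = "dolore";

const pipeline = [
  // Cheap pre-filter: only documents with at least one matching item.
  { $match: { "boxes.items.Tag": tag } },
  {
    $project: {
      vendor: 1,
      boxes: {
        // Outer $filter: drop boxes left with no items after the inner filter.
        $filter: {
          input: {
            // $map: rebuild each box with only its matching items.
            $map: {
              input: "$boxes",
              as: "box",
              in: {
                boxRef: "$$box.boxRef",
                items: {
                  $filter: {
                    input: "$$box.items",
                    as: "item",
                    cond: { $eq: ["$$item.Tag", tag] },
                  },
                },
              },
            },
          },
          as: "box",
          cond: { $gt: [{ $size: "$$box.items" }, 0] },
        },
      },
    },
  },
];

// In the mongo shell: db.getCollection('collectionName').aggregate(pipeline);
```

Because nothing is unwound, the document count never grows, and the order of boxes and items is kept as stored.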

Change field name in result Mongo Query

I have this Mongo query:
db.getCollection('Catalogos').aggregate(
{ $match: {Items: {$elemMatch: {'MarIclase': '04'} } } },
{ $unwind : "$Items" },
{ $match: { "Items.MarIclase" : "04" } },
{ $group : {
_id : "$_id",
Items : { $push : { 'MarIclase': "$Items.MarIclase", 'MarCdescrip' : '$Items.MarCdescrip' } }
}}
);
The result of this query is:
{
"result" : [
{
"_id" : "CAT_MARCAS_VU",
"Items" : [
{
"MarIclase" : "04",
"MarCdescrip" : "5500 LARSON"
},
{
"MarIclase" : "04",
"MarCdescrip" : "A LINER"
}
]
}
],
"ok" : 1.0000000000000000
}
I'd like to have this result:
{
"result" : [
{
"_id" : "CAT_MARCAS_VU",
"Items" : [
{
"04" : "5500 LARSON"
},
{
"04" : "A LINER"
}
]
}
],
"ok" : 1.0000000000000000
}
Do you know if I can do something in the $push to use the field values as field names?
I'd like to have something like this:
{ "04" : "A LINER" }
{ "04" : "5500 LARSON" }
Thank you!
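One possible approach, assuming a server with $arrayToObject (MongoDB 3.4.4 or later, which is newer than the output format shown above suggests), is to build each pushed object from a one-element array of k/v pairs. The following is a sketch based on the query in the question, not a tested answer. The value of k must be a string, which holds for "04".

```javascript
const clase = "04";

const pipeline = [
  { $match: { Items: { $elemMatch: { MarIclase: clase } } } },
  { $unwind: "$Items" },
  { $match: { "Items.MarIclase": clase } },
  {
    $group: {
      _id: "$_id",
      Items: {
        // Each pushed element becomes { <MarIclase value>: <MarCdescrip value> }.
        $push: {
          $arrayToObject: [[{ k: "$Items.MarIclase", v: "$Items.MarCdescrip" }]],
        },
      },
    },
  },
];

// In the mongo shell: db.getCollection('Catalogos').aggregate(pipeline);
```

On older servers without $arrayToObject, the renaming would have to be done in application code after the query returns.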