In a MongoDB collection, there is data nested in an absence array.
{
"_id" : ObjectId("5c6c62f3d0e85e6ae3a8c842"),
"absence" : [
{
"date" : ISODate("2017-05-10T17:00:00.000-07:00"),
"code" : "E",
"type" : "E",
"isPartial" : false
},
{
"date" : ISODate("2018-02-24T16:00:00.000-08:00"),
"code" : "W",
"type" : "E",
"isPartial" : false
},
{
"date" : ISODate("2018-02-23T16:00:00.000-08:00"),
"code" : "E",
"type" : "E",
"isPartial" : false
},
{
"date" : ISODate("2018-02-21T16:00:00.000-08:00"),
"code" : "U",
"type" : "U",
"isPartial" : false
},
{
"date" : ISODate("2018-02-20T16:00:00.000-08:00"),
"code" : "R",
"type" : "E",
"isPartial" : false
}
]
}
I'd like to aggregate by absence.type to return a count of every type and the total number of absence children. The results might look like:
{
"_id" : ObjectId("5c6c62f3d0e85e6ae3a8c842"),
"U" : 1,
"E" : 4,
"total" : 5
}
There are several similar questions posted here, but I have yet to successfully adapt the answers to my schema. Any help is greatly appreciated.
Also, are there GUI modeling tools to help with MongoDB query building? The transition from RDBMS queries to the Mongo aggregation pipeline has been quite difficult.
You can use the aggregation below:
db.col.aggregate([
{
$unwind: "$absence"
},
{
$group: {
_id: { _id: "$_id", type: "$absence.type" },
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id._id",
types: { $push: { k: "$_id.type", v: "$count" } },
total: { $sum: "$count" }
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [ "$$ROOT", { $arrayToObject: "$types" } ]
}
}
},
{
$project: {
types: 0
}
}
])
$unwind gives you a single document per absence. You then need two $group stages: the first counts by _id and type, and the second aggregates those counts per _id. With one document per _id, $replaceRoot with $mergeObjects promotes the dynamically created keys and values (built by $arrayToObject) to the root level.
output:
{ "_id" : ObjectId("5c6c62f3d0e85e6ae3a8c842"), "total" : 5, "U" : 1, "E" : 4 }
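To make the stages above easier to follow, here is a minimal plain-JavaScript sketch of what they compute for a single document. It is an illustration only, not MongoDB code: the helper name countAbsenceTypes is hypothetical, and the sample keeps only the code and type fields from the question's document.

```javascript
// Plain-JavaScript emulation of the pipeline stages (illustrative only).
// countAbsenceTypes is a hypothetical helper name, not a MongoDB API.
function countAbsenceTypes(doc) {
  // $unwind + first $group: add 1 per absence element, keyed by its type
  const counts = {};
  for (const absence of doc.absence) {
    counts[absence.type] = (counts[absence.type] || 0) + 1;
  }
  // second $group: total is the sum of the per-type counts
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  // $replaceRoot + $mergeObjects + $arrayToObject: promote the dynamic keys to the root
  return { _id: doc._id, total, ...counts };
}

// Trimmed copy of the document from the question (dates and isPartial omitted)
const sampleDoc = {
  _id: "5c6c62f3d0e85e6ae3a8c842",
  absence: [
    { code: "E", type: "E" },
    { code: "W", type: "E" },
    { code: "E", type: "E" },
    { code: "U", type: "U" },
    { code: "R", type: "E" },
  ],
};

const absenceSummary = countAbsenceTypes(sampleDoc);
console.log(absenceSummary); // total: 5, E: 4, U: 1
```

The type keys are not fixed anywhere in the sketch, which mirrors why the pipeline builds them dynamically with $arrayToObject.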
If you know all the possible values of "absence.type", you can $filter the array on each value and compute the $size of the filtered array. This approach does not work when the possible values are not known in advance, because each one has to be listed in the pipeline.
db.col.aggregate([
{ $project: { U: { $size: { $filter: { input: "$absence", as: "a", cond: { $eq: [ "$$a.type", "U"]} }}},
E: { $size: { $filter: { input: "$absence", as: "a", cond: { $eq: [ "$$a.type", "E"]} }}} }},
{ $project: { total: { $add: [ "$U", "$E" ]}, U: 1, E: 1}},
])
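The limitation of this second approach can be seen in a small plain-JavaScript sketch of the same idea. The helper name countKnownTypes and the extra type "X" in the sample are assumptions made for illustration.

```javascript
// Plain-JavaScript emulation of the $filter + $size approach (illustrative only).
// countKnownTypes is a hypothetical helper name.
function countKnownTypes(doc, knownTypes) {
  const result = { _id: doc._id, total: 0 };
  for (const type of knownTypes) {
    // $filter keeps the matching elements, $size counts them
    result[type] = doc.absence.filter((a) => a.type === type).length;
    // $add: the total only covers the types that were listed
    result.total += result[type];
  }
  return result;
}

// Hypothetical document with one type ("X") that is not in the known list
const docWithUnknownType = {
  _id: 1,
  absence: [{ type: "E" }, { type: "U" }, { type: "E" }, { type: "X" }],
};

const knownCounts = countKnownTypes(docWithUnknownType, ["U", "E"]);
console.log(knownCounts); // U: 1, E: 2, total: 3 (the "X" element is not counted)
```

Because the total is the sum of the listed types, any absence with an unlisted type is silently left out.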
I read the documentation and found that $addToSet doesn't guarantee order.
Is there any way I can preserve the order of the original document?
My query is:
aggregate([{$match: {
$or:[{"Name.No":"119"},{"Name.No":"120"}]
}}, {$project: {
x:{$objectToArray:"$Results"}
}},{$unwind: "$x"},{$group: {_id: "$x.k", distinctVals: {$addToSet: "$x.v.TCR"}}}])
Sample Data:
{"Name" : {"No." : "119","Time" : "t"},
"Results":{"K1" : {"Counters" : x, "TCR" : [{"Name" : "K11", "Result" : "PASSED"},
{"Name" : "K12","Result" : "FAILED"},
{"Name" : "K13","Result" : "PASSED"}]
},
"K2" : {"Counters": y, "TCR" : [{"Name" : "K21","Result" : "PASSED"},
{"Name" : "K22","Result" : "PASSED"}]
}
}
}
Job2:
{"Name" : {"No." : "120","Time" : "t1"},
"Results":{"K1" : {"Counters" : x, "TCR" : [{"Name" : "K11", "Result" : "PASSED"},
{"Name" : "K12","Result" : "PASSED"},
{"Name" : "K13","Result" : "FAILED"}]
},
"K3" : {"Counters": y, "TCR" : [{"Name" : "K31","Result" : "PASSED"},
{"Name" : "K32","Result" : "PASSED"}]
}
}
}
Expected:
{"Name" : {"No." : "119-120","Time" : "lowest(t,t1)"},
"Results":{"K1" : {"Counters" : x, "TCR" : [{"Name" : "K11", "Result" : "PASSED"},
{"Name" : "K12","Result" : "PASSED"},
{"Name" : "K13","Result" : "PASSED"}]
},
"K2" : {"Counters": y, "TCR" : [{"Name" : "K21","Result" : "PASSED"},
{"Name" : "K22","Result" : "PASSED"}]
},
"K3" : {"Counters": y, "TCR" : [{"Name" : "K31","Result" : "PASSED"},
{"Name" : "K32","Result" : "PASSED"}]
}
}
}
I want to keep the same order as in the original document. The documents also change every time, so I can't sort on any particular parameter.
Convert the Results object to an array using $objectToArray.
$unwind to deconstruct the Results array.
$unwind to deconstruct the Results.v.TCR array.
$match to keep only the entries whose Result is PASSED.
$group by Results.k, taking the first Name and the first Counters, and pushing each Results.v.TCR into an array.
$group by null, taking the minimum Time, collecting the unique No values, and building the Results array as key-value pairs; $reduce iterates over TCR and drops duplicate documents.
$project to show the required fields, convert the Results array back to an object using $arrayToObject, and join the No array into a string with "-".
db.collection.aggregate([
{ $addFields: { Results: { $objectToArray: "$Results" } } },
{ $unwind: "$Results" },
{ $unwind: "$Results.v.TCR" },
{ $match: { "Results.v.TCR.Result": "PASSED" } },
{
$group: {
_id: "$Results.k",
Name: { $first: "$Name" },
Counters: { $first: "$Results.v.Counters" },
TCR: { $push: "$Results.v.TCR" }
}
},
{
$group: {
_id: null,
Time: { $min: "$Name.Time" },
No: { $addToSet: "$Name.No" },
Results: {
$push: {
k: "$_id",
v: {
Counters: "$Counters",
TCR: {
$reduce: {
input: "$TCR",
initialValue: [],
in: {
$cond: [
{
$in: [
{
Name: "$$this.Name",
Result: "$$this.Result"
},
"$$value"
]
},
"$$value",
{
$concatArrays: [
"$$value",
[
{
Name: "$$this.Name",
Result: "$$this.Result"
}
]
]
}
]
}
}
}
}
}
}
}
},
{
$project: {
_id: 0,
Results: { $arrayToObject: "$Results" },
Name: {
Time: "$Time",
No: {
$reduce: {
input: "$No",
initialValue: "",
in: {
$concat: [
"$$value",
{ $cond: [{ $eq: ["$$value", ""]}, "", "-"] },
"$$this"
]
}
}
}
}
}
}
])
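The $reduce step with $in and $concatArrays is the part that replaces $addToSet. A minimal plain-JavaScript sketch of that step is shown below; the helper name dedupePreservingOrder is hypothetical, and the input assumes the PASSED entries of K1 reach $push in the order job 119 first, then job 120.

```javascript
// Plain-JavaScript emulation of the $reduce / $in / $concatArrays step (illustrative only).
// dedupePreservingOrder is a hypothetical helper name.
function dedupePreservingOrder(tcr) {
  return tcr.reduce((value, current) => {
    // $in: has an identical { Name, Result } pair already been collected?
    const seen = value.some(
      (x) => x.Name === current.Name && x.Result === current.Result
    );
    // $cond: keep the accumulator unchanged, or append with $concatArrays
    return seen
      ? value
      : value.concat([{ Name: current.Name, Result: current.Result }]);
  }, []);
}

// PASSED entries of K1: job 119 contributes K11 and K13, job 120 contributes K11 and K12
// (assumed arrival order)
const pushedTcr = [
  { Name: "K11", Result: "PASSED" },
  { Name: "K13", Result: "PASSED" },
  { Name: "K11", Result: "PASSED" },
  { Name: "K12", Result: "PASSED" },
];

const dedupedTcr = dedupePreservingOrder(pushedTcr);
console.log(dedupedTcr.map((x) => x.Name)); // [ 'K11', 'K13', 'K12' ]
```

Each entry is kept at the position of its first occurrence, so the result follows the order in which entries were pushed rather than the order of names.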
The "." (dot) in the "No." field name is problematic: MongoDB uses the dot as the path separator for embedded fields, so it can cause issues in query operations. I would suggest not using "." in field names.
I have the following document in the collection "user" that contains two nested arrays:
{
"person" : {
"personId" : 78,
"firstName" : "Mario",
"surname1" : "LOPEZ",
"surname2" : "SEGOVIA"
},
"accounts" : [
{
"accountId" : 42,
"accountRegisterDate" : "2018-01-04",
"banks" : [
{
"bankId" : 1,
"name" : "Bank LTD",
},
{
"bankId" : 2,
"name" : "Bank 2 Corp",
}
]
},
{
"accountId" : 43,
"accountRegisterDate" : "2018-01-04",
"banks" : [
{
"bankId" : 3,
"name" : "Another Bank",
},
{
"bankId" : 4,
"name" : "BCT bank",
}
]
}
]
}
I'm trying to get a query that will find this document and get only this subdocument at output:
{
"bankId" : 3,
"name" : "Another Bank",
}
I'm really stuck. If I run this query:
{ "accounts.banks.bankId": "3" }
it returns the whole document. I've tried combinations of projections with no success:
{"accounts.banks.0.$": 1} //returns two elements of array "banks"
{"accounts.banks.0": 1} //empty bank array
Maybe that's not the way to query for this and I'm going in the wrong direction.
Can you please help me?
You can try the following solution:
db.user.aggregate([
{ $unwind: "$accounts" },
{ $match: { "accounts.banks.bankId": 3 } },
{
$project: {
items: {
$filter: {
input: "$accounts.banks",
as: "bank",
cond: { $eq: [ "$$bank.bankId", 3 ] }
}
}
}
},
{
$replaceRoot : {
newRoot: { $arrayElemAt: [ "$items", 0 ] }
}
}
])
To be able to filter accounts by bankId, you need to $unwind them. You can then $match the account that contains a bank with bankId equal to 3. Since banks is another nested array, you can filter it using the $filter operator. This gives you one element nested in the items array. To get rid of the nesting, use $replaceRoot with $arrayElemAt.
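As a rough illustration of the steps just described, the same lookup can be sketched in plain JavaScript. The helper name findBank is hypothetical; the sample is the document from the question.

```javascript
// Plain-JavaScript emulation of $unwind / $match / $filter / $arrayElemAt (illustrative only).
// findBank is a hypothetical helper name.
function findBank(user, bankId) {
  // $unwind: look at one account at a time
  for (const account of user.accounts) {
    // $filter: keep only the banks with the requested bankId
    const items = account.banks.filter((bank) => bank.bankId === bankId);
    // $match + $arrayElemAt: return the first hit from the matching account
    if (items.length > 0) return items[0];
  }
  return null;
}

const userDoc = {
  person: { personId: 78, firstName: "Mario", surname1: "LOPEZ", surname2: "SEGOVIA" },
  accounts: [
    {
      accountId: 42,
      accountRegisterDate: "2018-01-04",
      banks: [
        { bankId: 1, name: "Bank LTD" },
        { bankId: 2, name: "Bank 2 Corp" },
      ],
    },
    {
      accountId: 43,
      accountRegisterDate: "2018-01-04",
      banks: [
        { bankId: 3, name: "Another Bank" },
        { bankId: 4, name: "BCT bank" },
      ],
    },
  ],
};

const foundBank = findBank(userDoc, 3);
console.log(foundBank); // { bankId: 3, name: 'Another Bank' }
```

The comparison is type-sensitive: the stored bankId is the number 3, so the sketch finds nothing for the string "3", and a MongoDB equality match likewise treats a number and a string as different values.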
I want to count the correct, incorrect and unattempted questions, but I am getting zero for all three.
Query:
db.studentreports.aggregate([
{ $match: { 'groupId': 314 } },
{ $unwind: '$questions' },
{ $group:
{
_id: {
dateTimeStamp: '$dateTimeStamp',
customerId: '$customerId'
},
questions : { $push: '$questions' },
unttempted : { $sum : { $eq: ['$questions.status',0]}},
correct : { $sum : { $eq: ['$questions.status',1]}},
incorrect : { $sum : { $eq: ['$questions.status',2]}},
Total: { $sum: 1 }
}
}
])
Schema structure -
{
"_id" : ObjectId("59fb46ed560e1a2fd5b6fbf4"),
"customerId" : 2863318,
"groupId" : 309,
"questions" : [
{
"questionId" : 567,
"status" : 0,
"_id" : ObjectId("59fb46ee560e1a2fd5b700a4"),
},
{
"questionId" : 711,
"status" : 0,
"_id" : ObjectId("59fb46ee560e1a2fd5b700a3")
},
....
The values of unttempted, correct and incorrect come out wrong:
"unttempted" : 0,
"correct" : 0,
"incorrect" : 0,
"Total" : 7558.0
The grouping has to be by dateTimeStamp and customerId.
Can someone correct the query?
Thanks.
You want to sum these fields only if a certain condition is met.
You just have to rewrite your group statement like this:
{ $group:
{
_id: {
dateTimeStamp: '$dateTimeStamp',
customerId: '$customerId'
},
questions : { $push: '$questions' },
unttempted : { $sum : {$cond:[{ $eq: ['$questions.status',0]}, 1, 0]}},
correct : { $sum : {$cond:[{ $eq: ['$questions.status',1]}, 1, 0]}},
incorrect : { $sum : {$cond:[{ $eq: ['$questions.status',2]}, 1, 0]}},
Total: { $sum: 1 }
}
}
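See the documentation for $eq: it compares two values and returns true or false. $sum ignores non-numeric values, so summing those booleans always gives 0. Wrapping the comparison in $cond turns each match into a 1 and everything else into a 0, which $sum can add up.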
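The effect of the $cond wrapper can be sketched in plain JavaScript as a conditional increment per question. The helper name countStatuses and the sample statuses are assumptions for illustration; the field names, including the spelling unttempted, follow the query above.

```javascript
// Plain-JavaScript emulation of $sum with $cond (illustrative only).
// countStatuses is a hypothetical helper name.
function countStatuses(questions) {
  const counts = { unttempted: 0, correct: 0, incorrect: 0, Total: 0 };
  for (const q of questions) {
    // $cond: [ { $eq: [status, n] }, 1, 0 ] contributes 1 on a match, otherwise 0
    counts.unttempted += q.status === 0 ? 1 : 0;
    counts.correct += q.status === 1 ? 1 : 0;
    counts.incorrect += q.status === 2 ? 1 : 0;
    // $sum: 1 counts every unwound question
    counts.Total += 1;
  }
  return counts;
}

// Hypothetical unwound questions for one (dateTimeStamp, customerId) group
const sampleQuestions = [
  { questionId: 567, status: 0 },
  { questionId: 711, status: 0 },
  { questionId: 712, status: 1 },
  { questionId: 713, status: 2 },
  { questionId: 714, status: 1 },
];

const statusCounts = countStatuses(sampleQuestions);
console.log(statusCounts); // { unttempted: 2, correct: 2, incorrect: 1, Total: 5 }
```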
What I have :
{ "_id" : ObjectId("577dc9d61a0b7e0a40499f90"), "equ" : 123456, "key" : "p" }
{ "_id" : ObjectId("577c789b1a0b7e0a403f1b52"), "equ" : 123456, "key" : "r" }
{ "_id" : ObjectId("577b27481a0b7e0a4033965a"), "equ" : 123456, "key" : "r" }
{ "_id" : ObjectId("5779d6111a0b7e0a40282dc7"), "equ" : 123456, "key" : "o" }
What I want :
{ "_id" : ObjectId("5779d6111a0b7e0a40282dc7"), "equ" : 123456, "keys" : "prro" }
What I tried :
db.table.aggregate([{"$group":{"_id":0, "keys":{"$push":"$key"}}}])
returns an array and not a string:
{"_id":0, "keys":["p","r","r","o"]}
Do you have any idea?
TL;DR
Use this aggregation pipeline:
db.col.aggregate([
{$group: {_id: "$equ", last: {$last: "$_id"}, keys: {$push: "$key"}}},
{
$project: {
equ: "$_id",
_id: "$last",
keys: {
$reduce: {
input: "$keys",
initialValue: "",
in: {$concat: ["$$value", "$$this"]}
}
}
}
}
])
More details
First you should group the documents based on the equ value and also maintain an array of keys along with the _id of the last group member:
var groupByEqu = {
$group: {
_id: "$equ",
last: {$last: "$_id"},
keys: {$push: "$key"}
}
}
Just applying this pipeline operation would result in:
{
"_id" : 123456,
"last" : ObjectId("5779d6111a0b7e0a40282dc7"),
"keys" : [ "p", "r", "r", "o" ]
}
Use a $project stage to transform the preceding document into the desired one. The first two transformations are trivial. To join the array elements, you can use the $reduce operator (added in MongoDB 3.4):
var project = {
$project: {
equ: "$_id",
_id: "$last",
keys: {
$reduce: {
input: "$keys",
initialValue: "",
in: {$concat: ["$$value", "$$this"]}
}
}
}
}
Applying these two pipeline operations would give you the desired result:
db.col.aggregate([groupByEqu, project])
Which is:
{
"equ" : 123456,
"_id" : ObjectId("5779d6111a0b7e0a40282dc7"),
"keys" : "prro"
}
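For readers more familiar with array methods than with pipeline operators, the two stages correspond roughly to the plain-JavaScript sketch below. The helper name joinKeysByEqu is hypothetical, and the sketch assumes the documents arrive in the order shown in the question.

```javascript
// Plain-JavaScript emulation of the $group + $project/$reduce stages (illustrative only).
// joinKeysByEqu is a hypothetical helper name.
function joinKeysByEqu(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.equ)) groups.set(doc.equ, { last: null, keys: [] });
    const group = groups.get(doc.equ);
    group.last = doc._id;     // $last: _id of the last member seen
    group.keys.push(doc.key); // $push: collect the keys in arrival order
  }
  return [...groups.entries()].map(([equ, group]) => ({
    equ,
    _id: group.last,
    // $reduce with $concat: "$$value" is the accumulator, "$$this" the current element
    keys: group.keys.reduce((value, current) => value + current, ""),
  }));
}

const keyDocs = [
  { _id: "577dc9d61a0b7e0a40499f90", equ: 123456, key: "p" },
  { _id: "577c789b1a0b7e0a403f1b52", equ: 123456, key: "r" },
  { _id: "577b27481a0b7e0a4033965a", equ: 123456, key: "r" },
  { _id: "5779d6111a0b7e0a40282dc7", equ: 123456, key: "o" },
];

const joinedKeys = joinKeysByEqu(keyDocs);
console.log(joinedKeys);
// [ { equ: 123456, _id: '5779d6111a0b7e0a40282dc7', keys: 'prro' } ]
```

Both the last _id and the joined string depend on the order in which the documents are processed, so a different input order gives a different result.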
Here's an example of the documents I use:
{
"_id" : ObjectId("554a1f5fe36a768b362ea5c0"),
"store_state" : 1,
"services" : [
{
"id" : "XXX",
"state" : 1,
"active": true
},
{
"id" : "YYY",
"state" : 1,
"active": true
},
...
]
}
I want to output a new field with "Y" if the id is "XXX" and active is true, and "N" in all other cases. The service element with "XXX" as id is not present in every document (output "N" in that case).
Here's my query for the moment:
db.stores.aggregate({
$match : {"store_state":1}
},
{ $project : {
"XXX_active": {
$cond: [ {
$and:[
{$eq:["services.$id","XXX"]},
{$eq:["services.$active",true]}
]},"Y","N"
] }
}
}).pretty()
But it always outputs "N" for the "XXX_active" field.
The expected output I need is:
{
"_id" : ObjectId("554a1f5de36a768b362e7e6f"),
"XXX_active" : "Y"
},
{
"_id" : ObjectId("554a1f5ee36a768b362e9d25"),
"XXX_active" : "N"
},
{
"_id" : ObjectId("554a1f5de36a768b362e73a5"),
"XXX_active" : "Y"
}
Another example of a possible result:
{
"_id" : ObjectId("554a1f5de36a768b362e7e6f"),
"XXX_active" : "Y",
"YYY_active" : "N"
},
{
"_id" : ObjectId("554a1f5ee36a768b362e9d25"),
"XXX_active" : "N",
"YYY_active" : "N"
},
{
"_id" : ObjectId("554a1f5de36a768b362e73a5"),
"XXX_active" : "Y",
"YYY_active" : "Y"
}
There should be only one XXX_active per object and no duplicate objects, but I need every object to have an XXX_active field even if the services element with id "XXX" is not present. Could someone help, please?
First $unwind the services array and then use $cond as below:
db.stores.aggregate({
"$match": {
"store_state": 1
}
}, {
"$unwind": "$services"
}, {
"$project": {
"XXX_active": {
"$cond": [{
"$and": [{
"$eq": ["$services.id", "XXX"]
}, {
"$eq": ["$services.active", true]
}]
}, "Y", "N"]
}
}
},{"$group":{"_id":"$_id","XXX_active":{"$max":"$XXX_active"}}}) // group by _id; $max gives "Y" if any element matched, since "Y" sorts after "N"
The following aggregation pipeline will give the desired result. As the first step after $match, apply the $unwind operator to the services array field. This deconstructs the array so that one document is output per element, with the array replaced by that element's value.
db.stores.aggregate([
{
"$match" : {"store_state": 1}
},
{
"$unwind": "$services"
},
{
"$project": {
"store_state" : 1,
"services": 1,
"XXX_active": {
"$cond": [
{
"$and": [
{"$eq":["$services.id", "XXX"]},
{"$eq":["$services.active",true]}
]
},"Y","N"
]
}
}
},
{
"$match": {
"services.id": "XXX"
}
},
{
"$group": {
"_id": {
"_id": "$_id",
"store_state": "$store_state",
"XXX_active": "$XXX_active"
},
"services": {
"$push": "$services"
}
}
},
{
"$project": {
"_id": "$_id._id",
"store_state" : "$_id.store_state",
"services": 1,
"XXX_active": "$_id.XXX_active"
}
}
])
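For reference, the rule stated in the question ("Y" only when a service with the given id exists and is active, "N" in every other case) can be written down in plain JavaScript as a check to compare pipeline output against. This is a statement of the intended rule, not a translation of either pipeline; the helper name serviceFlag and the sample stores are hypothetical.

```javascript
// Plain-JavaScript statement of the rule from the question (illustrative only).
// serviceFlag is a hypothetical helper name.
function serviceFlag(store, serviceId) {
  // A store with no services array is treated like one with an empty array
  const services = store.services || [];
  // "Y" only if some element has the requested id and is active
  const matched = services.some(
    (service) => service.id === serviceId && service.active === true
  );
  return matched ? "Y" : "N";
}

// Hypothetical stores covering the cases described in the question
const sampleStores = [
  { _id: "a", store_state: 1, services: [{ id: "YYY", active: true }, { id: "XXX", active: true }] },
  { _id: "b", store_state: 1, services: [{ id: "XXX", active: false }] },
  { _id: "c", store_state: 1, services: [{ id: "YYY", active: true }] },
  { _id: "d", store_state: 1 },
];

const storeFlags = sampleStores.map((store) => ({
  _id: store._id,
  XXX_active: serviceFlag(store, "XXX"),
}));
console.log(storeFlags);
// a: "Y", b: "N", c: "N", d: "N"
```

Every store appears exactly once in the output, including those without an "XXX" element, which is the behavior the question asks for.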