I am using MongoDB 3.6. I have many documents in my collection. Originally none of them had a Domain field; I have since added Domain to some of them.
Now I want to use aggregate to filter this collection, that is, I want the documents that do not have a Domain field.
db.Events.aggregate([
{$project : {
Domain : {$filter: {
input: "$Domain",
cond:{if: {Domain : {$exists: false}}, then: {"$BusinessCode": 1} }}}
}
}
],{
allowDiskUse: true
})
When I execute this script I get this error:
Assert: command failed: {
"ok" : 0,
"errmsg" : "Unrecognized expression '$exists'",
"code" : 168,
"codeName" : "InvalidPipelineOperator"
} : aggregate failed
It seems $exists is not supported inside a $filter expression.
How can I do that?
Another question: can I use two $project stages like this:
db.Events.aggregate([
{$project : {
Domain : {$filter: {
input: "$Domain",
cond:{if: {Domain : {$exists: false}}, then: {"$BusinessCode": 1} }}}
}
},
{
$match : {BusinessCode: /(([1-2]?[0-9])-([0-9]*)-([0-9]*)-([0-9]*)-([0-9]*)-([0-9]*)-([0-9]*))/}
},
{
$project : {BusinessCode : {$arrayElemAt:[{$split : ["$BusinessCode", "-"]},0]}}
},
{
$addFields: {"Domain": "$BusinessCode"}
},
],{
allowDiskUse: true
})
I want to check whether a specific field exists in the document. If it does not exist, BusinessCode is projected and the remaining stages run.
***************************Edit****************
This is a sample of my documents:
"DeviceId" : "xxxxxxx",
"UserId" : UUID(""),
"UserFullName" : "test-user",
"SystemId" : "com.messaging",
"SystemTitle" : "message",
"EventId" : "messaging.message",
"EventTitle" : "test",
"EventData" : [],
"BusinessCode" : "1-2-4-4-5-6-9",...
After executing this script, I expect "Domain" to be appended to my document like this:
"EventTitle" : "test",
"EventData" : [],
"BusinessCode" : "1-2-4-4-5-6-9",
"Domain": "1" // 1 is the first number of the split BusinessCode
But if Domain already exists, the script should move on to the next document and check again.
So you're looking for something like COALESCE in SQL; in MongoDB it is called $ifNull. For instance:
db.Events.save({Domain: "4"})
db.Events.save({BusinessCode: "1-2-4-4-5-6-9"})
db.Events.aggregate([
{
$project: {
Domain: {
$ifNull: [ "$Domain", { $arrayElemAt: [ { $split : ["$BusinessCode", "-"] },0] } ] }
}
}
])
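The $project above returns only _id and Domain. If the goal is the one described in the question (select the documents without Domain and append Domain next to their existing fields), a minimal sketch could look like the following. It assumes the Events collection and field names from the question, puts $exists in a $match stage (where query operators are accepted), and uses $addFields so the other fields are kept. The names fillDomainPipeline and fillDomain are made up for this illustration, and fillDomain is only a plain-JavaScript mirror of the two stages, not something the server runs.

```javascript
// Sketch only: collection and field names are taken from the question.
var fillDomainPipeline = [
  // $exists is a query operator, so it belongs in $match, not in an expression.
  { $match: { Domain: { $exists: false } } },
  // $addFields keeps every existing field and adds Domain next to them.
  {
    $addFields: {
      Domain: { $arrayElemAt: [{ $split: ["$BusinessCode", "-"] }, 0] }
    }
  }
];

// Plain-JavaScript mirror of the two stages, for illustration only.
function fillDomain(docs) {
  return docs
    .filter(function (doc) { return !("Domain" in doc); })
    .map(function (doc) {
      var copy = Object.assign({}, doc);
      // Using null when BusinessCode is not a string is a choice made for this mirror.
      copy.Domain = typeof doc.BusinessCode === "string"
        ? doc.BusinessCode.split("-")[0]
        : null;
      return copy;
    });
}

fillDomain([{ Domain: "4" }, { BusinessCode: "1-2-4-4-5-6-9" }]);
// → [{ BusinessCode: "1-2-4-4-5-6-9", Domain: "1" }]

// Run against the server only when a shell `db` handle is available.
if (typeof db !== "undefined") {
  db.Events.aggregate(fillDomainPipeline, { allowDiskUse: true });
}
```

Note that an aggregation only returns the reshaped documents; it does not write Domain back to the collection.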
Related
I'm running the following aggregation query on a test database
db.restaurants.explain().aggregate([
{$match: {"address.zipcode": {$in: ["10314", "11208", "11219"]}}},
{$match: {"grades": {$elemMatch: {score: {$gte: 1}}}}},
{$group: {_id: "$borough", count: {$sum: 1} }},
{$sort: {count: -1} }
]);
As per the MongoDB documentation, it should return a cursor that I can iterate to see data about all pipeline stages:
The operation returns a cursor with the document that contains detailed information regarding the processing of the aggregation pipeline.
However, the aggregation command returns explain info only about the first two $match stages:
{
"stages" : [
{
"$cursor" : {
"query" : {
"$and" : [
{
"address.zipcode" : {
"$in" : [
"10314",
"11208",
"11219"
]
}
},
{
"grades" : {
"$elemMatch" : {
"score" : {
"$gte" : 1.0
}
}
}
}
]
},
"fields" : {
"borough" : 1,
"_id" : 0
},
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.restaurants",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"grades" : {
"$elemMatch" : {
"score" : {
"$gte" : 1.0
}
}
}
},
{
"address.zipcode" : {
"$in" : [
"10314",
"11208",
"11219"
]
}
}
]
},
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [
{
"grades" : {
"$elemMatch" : {
"score" : {
"$gte" : 1.0
}
}
}
},
{
"address.zipcode" : {
"$in" : [
"10314",
"11208",
"11219"
]
}
}
]
},
"direction" : "forward"
},
"rejectedPlans" : []
}
}
},
{
"$group" : {
"_id" : "$borough",
"count" : {
"$sum" : {
"$const" : 1.0
}
}
}
},
{
"$sort" : {
"sortKey" : {
"count" : -1
}
}
}
],
"ok" : 1.0
}
And the returned object does not look like a cursor at all.
If I save the aggregation result to a variable and then try to iterate through it using cursor methods (hasNext(), next(), etc.), I get the following:
TypeError: result.next is not a function : #(shell):1:1
How can I see info on all pipeline steps?
Thanks
1. Explain info
explain() returns the winning plan of a query, i.e. how the database fetches the documents before processing them in the pipeline.
Here, because address.zipcode and grades aren't indexed, the db performs a COLLSCAN, i.e. it iterates over all documents in the collection and checks whether they match.
After that, you group the documents and sort the results. Those operations are done "in memory", on the previously fetched documents. The fields aren't indexed, so no special plan can be used here.
More info here: explain results
2. explain() on an aggregation query does not return a cursor
For some reason, explain() on an aggregation query does not return a cursor, but a BSON document directly (unlike explain() on a find() query).
It might be a bug, but there's nothing about this in the docs.
Anyway, you can do:
var explain = db.restaurants.explain().aggregate([
{$match: {"address.zipcode": {$in: ["10314", "11208", "11219"]}}},
{$match: {"grades": {$elemMatch: {score: {$gte: 1}}}}},
{$group: {_id: "$borough", count: {$sum: 1} }},
{$sort: {count: -1} }
]);
printjson(explain)
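Since the result is a plain document, it can be walked with ordinary JavaScript instead of cursor methods. The sketch below assumes the output has the shape shown in the question (a top-level "stages" array with one key per entry); other server versions may shape it differently. The helper name stageNames and the trimmed-down sampleExplain are made up for this illustration.

```javascript
// Returns the operator name of each entry in an explain document's `stages` array.
function stageNames(explainDoc) {
  return (explainDoc.stages || []).map(function (stage) {
    return Object.keys(stage)[0];
  });
}

// Trimmed-down copy of the output shown in the question.
var sampleExplain = {
  stages: [
    { $cursor: { queryPlanner: { winningPlan: { stage: "COLLSCAN" } } } },
    { $group: { _id: "$borough", count: { $sum: { $const: 1 } } } },
    { $sort: { sortKey: { count: -1 } } }
  ],
  ok: 1
};

stageNames(sampleExplain); // → ["$cursor", "$group", "$sort"]
```

In that output the two $match stages do not appear under their own names: their conditions show up merged inside the "query" of the $cursor entry, followed by $group and $sort.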
I have a collection in which documents look like this:
{
_id: ObjectId(),
user: ObjectId(),
studentName: String,
createdAt: Date,
isAbandoned: boolean
}
Examples of documents:
1-
{
"_id" : ObjectId("56cd2d36a489a5b875902f0e"),
"user" : ObjectId("56c4cafabd5f92cd78ae49d4"),
"studentName" : "Aman",
"createdAt" : ISODate("2016-02-24T04:10:30.486+0000"),
"isAbandoned" : true
}
2-
{
"_id" : ObjectId("56cd2dcda489a5b875902fcd"),
"user" : ObjectId("56c4cafabd5f92cd78ae49d4"),
"studentName" : "Aman",
"createdAt" : ISODate("2016-02-24T04:13:01.932+0000"),
"isAbandoned" : false
}
3-
{
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281+0000"),
"isAbandoned" : true
}
Now I want to find the list of students whose 'isAbandoned' is true in their latest document by 'createdAt'.
The required output for the above example is:
{
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev"
}
because for studentName "Aman" the document with max(createdAt) is the 2nd one, and 'isAbandoned' is false for it.
The best way to do this is with the aggregation framework. You need to $group your documents by "user" and return the last document for each user using the $last accumulator operator. For this to work, you need a preliminary $sort stage, because $last only returns a meaningful value when the documents arrive in a defined order. Sort on both the "user" field and the "createdAt" field.
The last stage in the pipeline is the $match stage, where you select only those last documents where "isAbandoned" equals true.
db.students.aggregate([
{ "$sort": { "user": 1, "createdAt": 1 } },
{ "$group": {
"_id": "$user",
"last": { "$last": "$$ROOT" }
}},
{ "$match": { "last.isAbandoned": true } }
])
which returns something like this:
{
"_id" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"last" : {
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
"isAbandoned" : true
}
}
To get the expected result, we can use the $replaceRoot pipeline stage, available starting from version 3.4, to promote the embedded document to the top level:
{
$replaceRoot: { newRoot: "$last" }
}
In older versions, you need to use the $project aggregation pipeline stage to reshape the documents. So if we extend our pipeline with the following stage:
{
"$project": {
"_id": "$last._id",
"user": "$last.user",
"studentName": "$last.studentName",
"createdAt": "$last.createdAt",
"isAbandoned": "$last.isAbandoned"
}}
it produces the expected output:
{
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
"isAbandoned" : true
}
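Putting the stages together, a minimal sketch of the whole pipeline could look like this. It assumes MongoDB 3.4+ for $replaceRoot and ends with a $project that keeps only the two fields listed in the question's required output. The names lastAbandonedPipeline, lastAbandoned and sample are made up for this illustration, and the plain-JavaScript mirror uses string ids in place of ObjectId values so that it can be run outside the shell.

```javascript
// Sketch: the stages from this answer assembled into one pipeline.
var lastAbandonedPipeline = [
  { $sort: { user: 1, createdAt: 1 } },
  { $group: { _id: "$user", last: { $last: "$$ROOT" } } },
  { $match: { "last.isAbandoned": true } },
  { $replaceRoot: { newRoot: "$last" } }, // needs MongoDB 3.4+
  { $project: { _id: 0, user: 1, studentName: 1 } }
];

// Plain-JavaScript mirror of the same logic, for illustration only.
function lastAbandoned(docs) {
  var latestByUser = {};
  docs.forEach(function (doc) {
    var seen = latestByUser[doc.user];
    if (!seen || doc.createdAt > seen.createdAt) latestByUser[doc.user] = doc;
  });
  return Object.keys(latestByUser)
    .map(function (user) { return latestByUser[user]; })
    .filter(function (doc) { return doc.isAbandoned === true; })
    .map(function (doc) { return { user: doc.user, studentName: doc.studentName }; });
}

// The three documents from the question, with hypothetical string ids.
var sample = [
  { user: "u-aman", studentName: "Aman", createdAt: new Date("2016-02-24T04:10:30.486Z"), isAbandoned: true },
  { user: "u-aman", studentName: "Aman", createdAt: new Date("2016-02-24T04:13:01.932Z"), isAbandoned: false },
  { user: "u-rajeev", studentName: "Rajeev", createdAt: new Date("2016-02-25T11:27:17.281Z"), isAbandoned: true }
];

lastAbandoned(sample); // → [{ user: "u-rajeev", studentName: "Rajeev" }]

// Run against the server only when a shell `db` handle is available.
if (typeof db !== "undefined") {
  db.students.aggregate(lastAbandonedPipeline);
}
```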
This is a good example of needing to group data, find the maximum of a specific field (createdAt) per group, and then check the result against the match criteria.
find max createdAt by student id,
keep only entries whose createdAt equals that max,
check whether they pass the criteria,
reshape the document
Code:
db.student.aggregate([{
$group : {
_id : "$user",
created : {
$max : "$createdAt"
},
documents : {
$push : "$$ROOT"
}
}
}, {
$project : {
_id : 0,
documents : {
$filter : {
input : "$documents",
as : "item",
cond : {
$eq : ["$$item.createdAt", "$created"]
}
}
}}
}, {
$match : {
"documents.isAbandoned" : true
}},
{ $unwind : "$documents" },
{
$project : {
_id : "$documents._id",
user : "$documents.user",
studentName : "$documents.studentName",
createdAt : "$documents.createdAt",
isAbandoned : "$documents.isAbandoned",
}}
])
I'm using MongoDB for the first time. I'm trying to aggregate some documents in a collection using the query below, but instead the query returns an object with a key "result" that contains an array of all the documents that satisfy the $match.
Below is the query.
db.events_2015_04_10.aggregate([
{$group:{
_id: "$uid",
count: {$sum: 1},
},
$match : {promo:"bc40100abc8d4eb6a0c68f81f4a756c7", evt:"login"}
}
]
);
Below is a sample document in the collection:
{
"_id" : ObjectId("552712c3f92ea17426000ace"),
"product" : "Mobile Safari",
"venue_id" : NumberLong(71540),
"uid" : "dd542fea6b4443469ff7bf1f56472eac",
"ag" : 0,
"promo" : "bc40100abc8d4eb6a0c68f81f4a756c7",
"promo_f" : NumberLong(1),
"brand" : NumberLong(17),
"venue" : "ovation_2480",
"lt" : 0,
"ts" : ISODate("2015-04-10T00:01:07.734Z"),
"evt" : "login",
"mac" : "00:00:00:00:00:00",
"__ns__" : "wifipromo",
"pvdr" : NumberLong(42),
"os" : "iPhone",
"cmpgn" : "fc6de34aef8b4f57af0b8fda98d8c530",
"ip" : "192.119.43.250",
"lng" : 0,
"product_ver" : "8"
}
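I'm trying to get it all grouped by uid, with the total count for each group. What is the correct way to achieve this?
Try the following aggregation pipeline, which has the $match stage first and the $group stage after it. Each stage must be its own object in the array; in your query, $group and $match are keys of the same object. $match also has to run before $group, because promo and evt no longer exist on the grouped documents: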
db.events_2015_04_10.aggregate([
{
$match: {
promo: "bc40100abc8d4eb6a0c68f81f4a756c7",
evt: "login"
}
},
{
$group: {
_id: "$uid",
count: {
$sum: 1
}
}
}
])
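To make the order of operations concrete, here is a minimal plain-JavaScript sketch of what the two stages compute: filter first, then count per uid. The names loginCountPipeline, countLoginsByUid and sampleEvents, as well as the uid values, are made up for this illustration; only the promo value and field names come from the question.

```javascript
// The same two stages as above, kept in a variable so they can be reused.
var loginCountPipeline = [
  { $match: { promo: "bc40100abc8d4eb6a0c68f81f4a756c7", evt: "login" } },
  { $group: { _id: "$uid", count: { $sum: 1 } } }
];

// Plain-JavaScript mirror: keep matching events, then count them per uid.
function countLoginsByUid(docs, promo) {
  var counts = {};
  docs.forEach(function (doc) {
    if (doc.promo === promo && doc.evt === "login") {
      counts[doc.uid] = (counts[doc.uid] || 0) + 1;
    }
  });
  return Object.keys(counts).map(function (uid) {
    return { _id: uid, count: counts[uid] };
  });
}

// Hypothetical events; only the fields used by the pipeline are included.
var promo = "bc40100abc8d4eb6a0c68f81f4a756c7";
var sampleEvents = [
  { uid: "uid-a", promo: promo, evt: "login" },
  { uid: "uid-b", promo: promo, evt: "login" },
  { uid: "uid-a", promo: promo, evt: "login" },
  { uid: "uid-a", promo: promo, evt: "logout" },  // wrong event, skipped
  { uid: "uid-c", promo: "other", evt: "login" }  // wrong promo, skipped
];

countLoginsByUid(sampleEvents, promo);
// → [{ _id: "uid-a", count: 2 }, { _id: "uid-b", count: 1 }]

// Run against the server only when a shell `db` handle is available.
if (typeof db !== "undefined") {
  db.events_2015_04_10.aggregate(loginCountPipeline);
}
```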