Get the last element for a specific field in mongo [duplicate] - mongodb

I have a collection whose documents look like this:
{
_id: ObjectId(),
user: ObjectId(),
studentName: String,
createdAt: Date,
isAbandoned: Boolean
}
Example documents:
1-
{
"_id" : ObjectId("56cd2d36a489a5b875902f0e"),
"user" : ObjectId("56c4cafabd5f92cd78ae49d4"),
"studentName" : "Aman",
"createdAt" : ISODate("2016-02-24T04:10:30.486+0000"),
"isAbandoned" : true
}
2-
{
"_id" : ObjectId("56cd2dcda489a5b875902fcd"),
"user" : ObjectId("56c4cafabd5f92cd78ae49d4"),
"studentName" : "Aman",
"createdAt" : ISODate("2016-02-24T04:13:01.932+0000"),
"isAbandoned" : false
}
3-
{
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281+0000"),
"isAbandoned" : true
}
Now I want to find the students whose most recent document (by 'createdAt') has 'isAbandoned' set to true.
Required output for above example is:
{
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev"
}
because for studentName "Aman" the document with max(createdAt) is the 2nd one, and 'isAbandoned' is false there.

The best way to do this is with the aggregation framework. $group your documents by "user" and return the last document for each user with the $last accumulator operator. For $last to be meaningful, the documents must first be ordered by a preliminary $sort stage that sorts on both the "user" and the "createdAt" fields.
The final stage in the pipeline is a $match stage that keeps only those last documents where "isAbandoned" equals true.
db.students.aggregate([
    { "$sort": { "user": 1, "createdAt": 1 } },
    { "$group": {
        "_id": "$user",
        "last": { "$last": "$$ROOT" }
    }},
    { "$match": { "last.isAbandoned": true } }
])
which returns something like this:
{
"_id" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"last" : {
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
"isAbandoned" : true
}
}
To get the expected result, promote the embedded document to the top level with the $replaceRoot pipeline stage, which is available starting from version 3.4:
{
    $replaceRoot: { newRoot: "$last" }
}
In older versions, you need to use the $project pipeline stage to reshape the documents. So if we extend our pipeline with the following stage:
{
"$project": {
"_id": "$last._id",
"user": "$last.user",
"studentName": "$last.studentName",
"createdAt": "$last.createdAt",
"isAbandoned": "$last.isAbandoned"
}}
it produces the expected output:
{
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
"isAbandoned" : true
}
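
To see why the $sort has to come before $group/$last, here is a rough in-memory sketch of the same logic in plain JavaScript. It is only an illustration, not how the server executes the pipeline: the helper name lastAbandoned is made up, and the ObjectIds are shortened to the placeholder strings "u1" and "u2".

```javascript
// In-memory stand-ins for the three sample documents.
// Assumption: ObjectIds are replaced by short placeholder strings.
const students = [
  { user: "u1", studentName: "Aman", createdAt: new Date("2016-02-24T04:10:30.486Z"), isAbandoned: true },
  { user: "u1", studentName: "Aman", createdAt: new Date("2016-02-24T04:13:01.932Z"), isAbandoned: false },
  { user: "u2", studentName: "Rajeev", createdAt: new Date("2016-02-25T11:27:17.281Z"), isAbandoned: true },
];

// Hypothetical helper mirroring $sort, then $group with $last, then $match.
function lastAbandoned(docs) {
  // $sort: ascending by createdAt, so the newest document of each user comes last.
  const sorted = [...docs].sort((a, b) => a.createdAt - b.createdAt);

  // $group with $last: a later document overwrites an earlier one for the same user.
  const lastByUser = new Map();
  for (const doc of sorted) {
    lastByUser.set(doc.user, doc);
  }

  // $match: keep only users whose latest document is abandoned.
  return [...lastByUser.values()].filter((doc) => doc.isAbandoned === true);
}

const result = lastAbandoned(students);
console.log(result.map((doc) => doc.studentName)); // [ 'Rajeev' ]
```

Without the sort step, "last" would simply mean "last in whatever order the documents arrive", which is not guaranteed to be the newest one.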

This is a good example of needing to group data, find the maximum of a specific field (createdAt) per group, and then check the result against the match criteria. The steps are:
find the max createdAt per user,
keep only the entries whose createdAt equals that max,
check whether they pass the criteria,
reshape the document.
Code:
db.student.aggregate([{
    $group : {
        _id : "$user",
        created : { $max : "$createdAt" },
        documents : { $push : "$$ROOT" }
    }
}, {
    $project : {
        _id : 0,
        documents : {
            $filter : {
                input : "$documents",
                as : "item",
                cond : { $eq : ["$$item.createdAt", "$created"] }
            }
        }
    }
}, {
    $match : { "documents.isAbandoned" : true }
}, {
    $unwind : "$documents"
}, {
    $project : {
        _id : "$documents._id",
        user : "$documents.user",
        studentName : "$documents.studentName",
        createdAt : "$documents.createdAt",
        isAbandoned : "$documents.isAbandoned"
    }
}])
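
One thing worth checking with this approach is what happens when two documents of the same user share the maximum createdAt. The plain JavaScript sketch below mimics the stages above on an in-memory array. It is an approximation under assumptions: the helper name latestAbandonedByMax is invented, users are placeholder strings, and createdAt is a plain number instead of a Date.

```javascript
// Assumption: numeric createdAt values and string user ids stand in for Dates and ObjectIds.
const docs = [
  { user: "u1", studentName: "Aman", createdAt: 1, isAbandoned: true },
  { user: "u1", studentName: "Aman", createdAt: 2, isAbandoned: false },
  { user: "u2", studentName: "Rajeev", createdAt: 3, isAbandoned: true },
];

// Hypothetical helper mirroring $group ($max + $push), $filter, $match and $unwind.
function latestAbandonedByMax(input) {
  // $group: collect each user's documents and track the maximum createdAt.
  const groups = new Map();
  for (const doc of input) {
    const group = groups.get(doc.user) || { created: -Infinity, documents: [] };
    group.created = Math.max(group.created, doc.createdAt);
    group.documents.push(doc);
    groups.set(doc.user, group);
  }

  const out = [];
  for (const group of groups.values()) {
    // $filter: keep only the documents whose createdAt equals the group maximum.
    const latest = group.documents.filter((d) => d.createdAt === group.created);
    // $match on "documents.isAbandoned": passes if ANY array element is true.
    if (latest.some((d) => d.isAbandoned === true)) {
      // $unwind: emits every element of the array, not only the matching ones.
      out.push(...latest);
    }
  }
  return out;
}

const latest = latestAbandonedByMax(docs);
console.log(latest.map((d) => d.studentName)); // [ 'Rajeev' ]
```

If a user has several documents tied on the maximum createdAt, all of them are emitted as soon as one of them is abandoned, including tied documents where isAbandoned is false. The $sort/$last variant returns exactly one document per user instead.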

Related

Limit distinct values only if a subelement exists

I have searched here but could not find a clear answer to the following question. In the sample collection mycollection below, how would one select distinct vin numbers only from objects where the status field exists and the status is UNLOCKED?
I have tried
db.getCollection('mycollection').distinct("vin", {$and: [{"decoded_payload.status": {$exists: true}}, {"decoded_payload.status":"UNLOCKED"}]})
but this query hangs indefinitely.
Because the database is large and such a query takes a long time, I would like to limit the output to check whether it runs at all, but it seems limit() is not an option with .distinct().
In MongoDB, how would one select the distinct vin in the data below, set the limit to 1, and select only on the status condition (status exists and is equal to "UNLOCKED")?
Would aggregate() be the right choice? How does one use the above conditions with aggregate() and limit()?
The output in this case would be 34567
{
"_id" : ObjectId("1"),
"vin" : "12345",
"class_name" : "foo",
"decoded_payload" : {
"timestamp" : 1547329250,
"status" : "LOCKED"
}
}
{
"_id" : ObjectId("2"),
"vin" : "23456",
"class_name" : "foo",
"decoded_payload" : {
"timestamp" : 1547329260,
"status" : "LOCKED"
}
}
{
"_id" : ObjectId("3"),
"vin" : "34567",
"class_name" : "bar",
"decoded_payload" : {
"timestamp" : 1547329270,
"status" : "UNLOCKED",
"reservation_id" : "71"
}
}
{
"_id" : ObjectId("4"),
"vin" : "45678",
"class_name" : "baz",
"decoded_payload" : {
"timestamp" : 1547329280,
"reservation_id" : "71"
}
}
You can use this aggregation query to filter the data and return the distinct "vin" values:
db.mycollection.aggregate([
    {
        $match: {
            $and: [{
                "decoded_payload.status": { $exists: true }
            }, {
                "decoded_payload.status": "UNLOCKED"
            }]
        }
    },
    { $limit : 5 }, // You can use this stage after group too
    {
        $group: { _id: "$vin" }
    }
])
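Place the $limit stage before or after the $group stage as required: before $group it caps how many matched documents are grouped, after $group it caps how many distinct vin values are returned. The $exists check is redundant, because an equality match on "UNLOCKED" already requires the field to exist.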
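
The position of $limit changes the result, which is easy to see on a small in-memory array. The sketch below is plain JavaScript, not a MongoDB call; the sample rows (including the vin "99999") and the helper name distinctVins are made up for illustration.

```javascript
// Hypothetical sample rows: two matching documents share the same vin.
const vehicles = [
  { vin: "34567", status: "UNLOCKED" },
  { vin: "34567", status: "UNLOCKED" },
  { vin: "99999", status: "UNLOCKED" },
  { vin: "12345", status: "LOCKED" },
];

// $match: keep only the UNLOCKED documents.
const matched = vehicles.filter((v) => v.status === "UNLOCKED");

// $group by vin: reduce the documents to their distinct vin values.
const distinctVins = (rows) => [...new Set(rows.map((row) => row.vin))];

// $limit 2 BEFORE $group: only the first two matched documents are grouped.
const limitBeforeGroup = distinctVins(matched.slice(0, 2));

// $limit 2 AFTER $group: all matches are grouped, then two distinct values are kept.
const limitAfterGroup = distinctVins(matched).slice(0, 2);

console.log(limitBeforeGroup); // [ '34567' ]
console.log(limitAfterGroup);  // [ '34567', '99999' ]
```

So for a quick "does it run at all" check, limiting before the group is cheaper, while limiting after the group is what gives a fixed number of distinct values.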

How to filter and aggregate data from one collection into another format using MongoDB

I have one collection named 'ctrlcharts'.
e.g.
{
"_id" : ObjectId("57fc695492af567031246736"),
"deviceId" : "A001",
"sensorId" : "S003",
"time" : "2016/10/11 12:23:50",
"charts" : [
{
"sensor" : "ch_11",
"value" : 120
},
{
"sensor" : "ch_12",
"value" : 150
}
]
}
How can I filter on "sensor" : "ch_11" and reshape the data from this collection into another format using MongoDB?
e.g.
{
"time" : "2016/10/11 12:23:50",
"sensor" : "ch_12",
"value" : 150
}
I tried the code below:
db.ctrlcharts.aggregate([
    { $match: { "deviceId" : "A001", "sensorId" : "S003", "time" : "2016/10/11 12:23:50" } },
    { $project: {
        _id: 0,
        time : 1,
        sensor : "$charts.sensor",
        value : "$charts.value"
    } }
])
But I got this result:
{
"time" : "2016/10/11 12:23:50",
"sensor" : ["ch_11","ch_12"],
"value" : [120,150]
}
Thanks
You were close; just add $unwind:
db.ctrlcharts.aggregate([
    { $unwind: "$charts" },
    { $match: { "deviceId": "A001", "charts.sensor": "ch_12", "time": "2016/10/11 12:23:50" } },
    { $project: { _id: 0, time: 1, sensor: "$charts.sensor", value: "$charts.value" } }
]).pretty()
You can use $unwind (aggregation) to split the charts array into one document per element.
db.ctrlcharts.aggregate( [ { $unwind : "$charts" } ] )
This produces a result like:
{ "_id" : ObjectId("57ff397a007c43ecacf10512"), "deviceId" : "A001", "sensorId" : "S003", "time" : "2016/10/11 12:23:50", "charts" : { "sensor" : "ch_11", "value" : 120 } }
{ "_id" : ObjectId("57ff397a007c43ecacf10512"), "deviceId" : "A001", "sensorId" : "S003", "time" : "2016/10/11 12:23:50", "charts" : { "sensor" : "ch_12", "value" : 150 } }
and then apply your $match stage to the unwound documents.
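
Conceptually, $unwind followed by $match and $project behaves like the plain JavaScript below. This is a sketch that assumes every document has a charts array; the helper name unwindCharts is invented for illustration.

```javascript
// The sample document from the question (without _id).
const doc = {
  deviceId: "A001",
  sensorId: "S003",
  time: "2016/10/11 12:23:50",
  charts: [
    { sensor: "ch_11", value: 120 },
    { sensor: "ch_12", value: 150 },
  ],
};

// Hypothetical helper mirroring { $unwind: "$charts" }:
// one output document per array element, with charts replaced by that element.
function unwindCharts(docs) {
  return docs.flatMap((d) => d.charts.map((chart) => ({ ...d, charts: chart })));
}

const rows = unwindCharts([doc])
  // $match on "charts.sensor"
  .filter((d) => d.charts.sensor === "ch_12")
  // $project: flatten the fields that are needed
  .map((d) => ({ time: d.time, sensor: d.charts.sensor, value: d.charts.value }));

console.log(rows); // [ { time: '2016/10/11 12:23:50', sensor: 'ch_12', value: 150 } ]
```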
Use the $arrayElemAt and $filter operators to query the array without needing $unwind, which is usually more efficient. $unwind emits one copy of the document per array element, so it uses more memory (each aggregation pipeline stage is limited to 100 MB of RAM unless allowDiskUse is enabled) and takes longer, because every copy has to be produced and then processed by the following stages.
The $filter will return a subset of the array that only contains the elements that match the filter condition. The $arrayElemAt operator then returns the element from the filtered array at the specified array index to give you the subdocument you need.
A further $project is necessary to flatten the fields to give you the desired result:
db.ctrlcharts.aggregate([
    { "$match": {
        "deviceId": "A001",
        "sensorId": "S003",
        "time": "2016/10/11 12:23:50",
        "charts.sensor": "ch_11"
    } },
    {
        "$project": {
            "time": 1,
            "chart": {
                "$arrayElemAt": [
                    { "$filter": {
                        "input": "$charts",
                        "as": "item",
                        "cond": { "$eq": ["$$item.sensor", "ch_11"] }
                    } },
                    0
                ]
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "time": 1,
            "sensor": "$chart.sensor",
            "value": "$chart.value"
        }
    }
])
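
For comparison, the $filter plus $arrayElemAt combination corresponds roughly to Array.prototype.filter followed by taking index 0. The plain JavaScript sketch below works on one in-memory document; the helper name pickSensor is hypothetical, and the handling of the "no match" case is an approximation of how a missing field is simply left out of the projection.

```javascript
// The sample document from the question (without _id).
const doc = {
  deviceId: "A001",
  sensorId: "S003",
  time: "2016/10/11 12:23:50",
  charts: [
    { sensor: "ch_11", value: 120 },
    { sensor: "ch_12", value: 150 },
  ],
};

// Hypothetical helper mirroring the two $project stages.
function pickSensor(d, sensorName) {
  // $filter: subset of the array whose elements satisfy the condition.
  const filtered = d.charts.filter((item) => item.sensor === sensorName);
  // $arrayElemAt with index 0: first element of the filtered array, if any.
  const chart = filtered[0];
  if (chart === undefined) {
    // No element matched: only time is left in the output.
    return { time: d.time };
  }
  // Second $project: flatten the sub-document's fields to the top level.
  return { time: d.time, sensor: chart.sensor, value: chart.value };
}

console.log(pickSensor(doc, "ch_11")); // { time: '2016/10/11 12:23:50', sensor: 'ch_11', value: 120 }
```

No copies of the document are created here, which is the point of avoiding $unwind.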

MongoDB select documents where field1 equals nested.field2 in aggregate pipeline

I have joined two collections on one field using '$lookup', while actually I needed two fields to have a unique match. My next step would be to unwind the array containing different values of the second field I need for a unique match and then compare these to the value of the second field it needs to match higher up. However, the second line in the snippet below returns no results.
// Request only the page that has been viewed
{ '$unwind' : '$DSpub.PublicationPages'},
{ '$match' : {'pageId' : '$DSpub.PublicationPages.PublicationPageId' } }
Is there a more appropriate way to do this? Or can I avoid doing this altogether by unwinding the "from" collection before performing the '$lookup', and then match both fields?
This is not as easy as it looks.
$match with ordinary query syntax compares a field against a constant value; it cannot compare two fields of the same document. To work around that, use a $project stage to compute a boolean flag that $match can then test. (MongoDB 3.6 and later can also do the comparison directly in $match with $expr.)
Please see the example below.
Given an input collection like this:
[{
"_id" : ObjectId("56be1b51a0f4c8591f37f62b"),
"name" : "Alice",
"sub_users" : [{
"_id" : ObjectId("56be1b51a0f4c8591f37f62a")
}
]
}, {
"_id" : ObjectId("56be1b51a0f4c8591f37f62a"),
"name" : "Bob",
"sub_users" : [{
"_id" : ObjectId("56be1b51a0f4c8591f37f62a")
}
]
}
]
We want to get only the documents where _id and "$docs.sub_users._id" are the same, where docs is the $lookup output.
db.collection.aggregate([{
    $lookup : {
        from : "collection",
        localField : "_id",
        foreignField : "_id",
        as : "docs"
    }
}, {
    $unwind : "$docs"
}, {
    $unwind : "$docs.sub_users"
}, {
    $project : {
        _id : 0,
        fields : "$$ROOT",
        matched : {
            $eq : ["$_id", "$docs.sub_users._id"]
        }
    }
}, {
    $match : {
        matched : true
    }
}])
that gives output:
{
"fields" : {
"_id" : ObjectId("56be1b51a0f4c8591f37f62a"),
"name" : "Bob",
"sub_users" : [
{
"_id" : ObjectId("56be1b51a0f4c8591f37f62a")
}
],
"docs" : {
"_id" : ObjectId("56be1b51a0f4c8591f37f62a"),
"name" : "Bob",
"sub_users" : {
"_id" : ObjectId("56be1b51a0f4c8591f37f62a")
}
}
},
"matched" : true
}
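
On MongoDB 3.6 or later, the same field-to-field comparison can be written inside $match with the $expr operator, which avoids the extra flag. The pipeline below is a sketch for the two stages from the question; the field names are taken from the question's snippet, the variable name pipeline is arbitrary, and the stages have not been run against the asker's data.

```javascript
// Assumes MongoDB 3.6+ (for $expr) and the field names from the question.
const pipeline = [
  // Request only the page that has been viewed.
  { $unwind: "$DSpub.PublicationPages" },
  // $expr lets $match evaluate an aggregation expression,
  // so both operands of $eq can be field paths.
  {
    $match: {
      $expr: {
        $eq: ["$pageId", "$DSpub.PublicationPages.PublicationPageId"],
      },
    },
  },
];

// In the shell these stages would go after the $lookup stage, e.g.:
// db.<yourCollection>.aggregate([ /* $lookup stage */ ...pipeline ])
console.log(JSON.stringify(pipeline, null, 2));
```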
