MongoDB sort by value in embedded document array - mongodb

I have a MongoDB collection of documents formatted as shown below:
{
"_id" : ...,
"username" : "foo",
"challengeDetails" : [
{
"ID" : ...,
"pb" : 30081,
},
{
"ID" : ...,
"pb" : 23995,
},
...
]
}
How can I write a find query for records that have a challengeDetails document with a matching ID, and sort them by the corresponding pb?
I have tried (this is using the NodeJS driver, which is why the projection syntax is weird)
const result = await collection
.find(
{ "challengeDetails.ID": challengeObjectID},
{
projection: {"challengeDetails.$": 1},
sort: {"challengeDetails.0.pb": 1}
}
)
This returns the correct records (documents with challengeDetails for only the matching ID) but they're not sorted.
I think this doesn't work because as the docs say:
When the find() method includes a sort(), the find() method applies the sort() to order the matching documents before it applies the positional $ projection operator.
But they don't explain how to sort after projecting. How would I write a query to do this? (I have a feeling aggregation may be required but am not familiar enough with MongoDB to write that myself)

You need to use aggregation to sort by a value inside an array:
$unwind to deconstruct the array
$match to match the value
$sort for sorting
$group to reconstruct the array
Here is the code
db.collection.aggregate([
{ "$unwind": "$challengeDetails" },
{ "$match": { "challengeDetails.ID": 2 } },
{ "$sort": { "challengeDetails.pb": 1 } },
{
"$group": {
"_id": "$_id",
"username": { "$first": "$username" },
"challengeDetails": { $push: "$challengeDetails" }
}
}
])
Working Mongo playground
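As a rough illustration of what the $unwind, $match and $sort stages produce, here is a plain-JavaScript sketch that mimics them on an in-memory array. It is not MongoDB code; the sample data and the function name sortByChallengePb are made up for the example. Note that the MongoDB documentation says $group does not order its output documents, so if the final document order matters it may be safer to end the pipeline at $sort (as the sketch does) or to add another $sort after $group.

```javascript
// Hypothetical sample data shaped like the documents in the question.
const users = [
  { _id: 1, username: "foo", challengeDetails: [{ ID: 1, pb: 30081 }, { ID: 2, pb: 23995 }] },
  { _id: 2, username: "bar", challengeDetails: [{ ID: 2, pb: 19000 }] },
  { _id: 3, username: "baz", challengeDetails: [{ ID: 1, pb: 500 }] },
];

// Mimics $unwind + $match + $sort in memory: one row per matching
// array element, ordered by that element's pb (ascending).
function sortByChallengePb(docs, challengeId) {
  return docs
    .flatMap((doc) =>
      doc.challengeDetails
        .filter((c) => c.ID === challengeId)
        .map((c) => ({ _id: doc._id, username: doc.username, challengeDetails: c }))
    )
    .sort((a, b) => a.challengeDetails.pb - b.challengeDetails.pb);
}

const ranked = sortByChallengePb(users, 2);
console.log(ranked.map((r) => r.username)); // [ 'bar', 'foo' ]
```

Documents without the requested challenge ID (here "baz") simply drop out, which matches what the $match stage does after $unwind.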

Related

match operation for array size gt 0 does not work in aggregation MongoDB

I have an mongo collection called Book.
{
"_id" : "00000000",
"name" : "Book1",
"similarBooks" : [],
"genre" : ""
}
similarBooks is an array in the Book collection which contains other books which are similar to Book1.
I want to find all the books that have similar books, which means I need to match documents whose similarBooks array has a size greater than 0 in my aggregation.
I was using this aggregation:
db.Book.aggregate([{
"$match": {
"similarBooks": {
"$gt": {
"$size": 0
}
}
}
}
])
But it is not working.
There is another option of using $expr in the match condition,
db.Book.aggregate([{
$match: {
$expr: {
$gt: [{
$size: "$similarBooks"
}, 0]
}
}
}
])
but we cannot use $expr when creating a partial index, so I cannot use the second option in my aggregation. Is there any other way to run the aggregation to find documents whose array size is greater than 0?
I am using MongoDB shell version v4.2.3.
You can use Mongo's dot notation combined with $exists.
db.Book.aggregate(
[
{
"$match": {
"similarBooks.0": {"$exists": true}
}
}
])
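To see why checking for index 0 is equivalent to "size greater than 0", here is a small plain-JavaScript sketch of the same test on in-memory objects. It assumes the field is either an array or missing; the helper name hasFirstElement and the extra sample books are hypothetical.

```javascript
// Hypothetical helper: true only when doc[field] is an array
// that has an element at index 0, i.e. a non-empty array.
function hasFirstElement(doc, field) {
  const value = doc[field];
  return Array.isArray(value) && value.length > 0;
}

// Sample data: the first entry mirrors the question, the others are made up.
const books = [
  { _id: "00000000", name: "Book1", similarBooks: [], genre: "" },
  { _id: "00000001", name: "Book2", similarBooks: ["00000000"], genre: "" },
  { _id: "00000002", name: "Book3", genre: "" }, // field missing entirely
];

const withSimilar = books.filter((b) => hasFirstElement(b, "similarBooks"));
console.log(withSimilar.map((b) => b.name)); // [ 'Book2' ]
```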

mongo: find non-superseded documents

I have a collection with documents like:
{
"_id" : "ThisIsASampleId_rand12345",
"timestamp" : ISODate("2019-04-30T10:53:34.515Z"),
"mySpecialId" : "specialId_12345",
"status" : "error",
}
My goal is to find all documents with {status: 'error'}, so long as no subsequent documents exist with the same mySpecialId and status 'success'.
Clearly I can do db.jobs.find({status: 'error'}), but after that, I get lost.
Do I need to do a $lookup in an aggregation pipeline into the same collection, using "mySpecialId" as both local and foreign fields, with a $match that includes something like {$gt: {timestamp: $PREVIOUS_TIMESTAMP}}? That feels wrong, somehow.
Is there a simpler/better/more elegant way to do this?
You can $sort your collection by the timestamp field and then run $group with the $last operator to get the most recent document for each mySpecialId. Then you can simply check whether that last document's status is error. If it is not, then either all documents in the group were successes or an error appeared but was superseded by a success. To get back the original shape of your documents you can use $replaceRoot.
db.col.aggregate([
{
$sort: { timestamp: 1 }
},
{
$group: {
_id: "$mySpecialId",
lastDoc: { $last: "$$ROOT" }
}
},
{
$match: {
"lastDoc.status": "error"
}
},
{
$replaceRoot: {
newRoot: "$lastDoc"
}
}
])
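The logic of that pipeline can be sketched in plain JavaScript on an in-memory array, which may make the "keep only the latest document per ID" idea easier to follow. This is an illustration only, not MongoDB code: the sample jobs, the use of ISO strings for timestamps, and the function name unsupersededErrors are all assumptions.

```javascript
// Hypothetical sample jobs; timestamps are ISO strings for brevity.
const jobs = [
  { _id: "a1", timestamp: "2019-04-30T10:00:00Z", mySpecialId: "A", status: "error" },
  { _id: "a2", timestamp: "2019-04-30T11:00:00Z", mySpecialId: "A", status: "success" },
  { _id: "b1", timestamp: "2019-04-30T10:30:00Z", mySpecialId: "B", status: "success" },
  { _id: "b2", timestamp: "2019-04-30T12:00:00Z", mySpecialId: "B", status: "error" },
];

function unsupersededErrors(docs) {
  const latest = new Map();
  // Like $sort + $group with $last: walk the documents oldest-first and
  // let each one overwrite the previous entry for its mySpecialId.
  [...docs]
    .sort((x, y) => new Date(x.timestamp) - new Date(y.timestamp))
    .forEach((doc) => latest.set(doc.mySpecialId, doc));
  // Like $match on lastDoc.status; the map values are already the
  // original documents, which is what $replaceRoot restores.
  return [...latest.values()].filter((doc) => doc.status === "error");
}

const pending = unsupersededErrors(jobs);
console.log(pending.map((d) => d._id)); // [ 'b2' ]
```

Job "A" is dropped because its error was followed by a success, while job "B" is kept because its most recent document is still an error.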

How to filter array in a mongodb query

In mongodb, I have a collection that contains a single document that looks like the following:
{
"_id" : ObjectId("5552b7fd9e8c7572e36e39df"),
"StackSummaries" : [
{
"StackId" : "arn:aws:cloudformation:ap-southeast-2:406119630047:stack/XXXX-30fb22a-285-439ee279-c7c8d36/4ebd8770-f8f4-11e4-bf36-503f2370240f",
"TemplateDescription" : "XXXX",
"StackStatusReason" : "",
"CreationTime" : "2015-05-12T22:14:50.535Z",
"StackName" : "XXXX",
"StackStatus" : "CREATE_COMPLETE"
},
{
"TemplateDescription" : "XXXX",
"StackStatusReason" : "",
"CreationTime" : "2015-05-11T04:02:05.543Z",
"StackName" : "XXXX",
"StackStatus" : "DELETE_COMPLETE",
"StackId" : "arn:aws:cloudformation:ap-southeast-2:406119630047:stack/XXXXX/7c8d04e0-f792-11e4-bb12-506726f15f9a"
},
{ ... },
{ many others }
]
}
i.e. the imported results of the AWS CLI command aws cloudformation list-stacks
I'm trying to find the items of the StackSummaries array that have a StackStatus of CREATE_COMPLETE or UPDATE_COMPLETE. After much experimenting and reading other SO posts I arrived at the following:
db.cf_list_stacks.aggregate( {$match: {"StackSummaries.StackStatus": "CREATE_COMPLETE"}})
However this still returns the whole document (and I haven't even worried about UPDATE_COMPLETE).
I'm coming from an SQL background and struggling with simple queries like this. Any ideas on how to get the information I'm looking for?
SO posts I've looked at:
MongoDB query with elemMatch for nested array data
MongoDB: multiple $elemMatch
$projection vs $elemMatch
Make $elemMatch (projection) return all objects that match criteria
Update
Notes on things I learned while understanding this topic:
aggregate() is just a pipeline (like a Unix shell pipeline) where each $ operator is just another step. And like shell pipelines they can look complex, but you just build them up step by step until you get the results you want
Mongo has a great webinar: Exploring the Aggregation Framework
RoboMongo is a good tool (GPL3) for working with Mongo data and queries
If you only want the object inside the StackSummaries array, you should use the $unwind clause to expand the array, filter the documents you want and then project only the parts of the document that you actually want.
The query would look something like this:
db.cf_list_stacks.aggregate([
{ '$unwind' : '$StackSummaries' },
{ '$match' : { 'StackSummaries.StackStatus' : 'CREATE_COMPLETE' } },
{ '$project' : {
'TemplateDescription' : '$StackSummaries.TemplateDescription',
'StackStatusReason' : '$StackSummaries.StackStatusReason',
...
} }
])
Useful links:
Aggregation pipeline documentation
$unwind Documentation
$project Documentation
With MongoDB 3.4 and newer, you can leverage the $addFields and $filter operators with the aggregation framework to get the desired result.
Consider running the following pipeline:
db.cf_list_stacks.aggregate([
{
"$addFields": {
"StackSummaries": {
"$filter": {
"input": "$StackSummaries",
"as": "el",
"cond": {
"$in": [
"$$el.StackStatus",
["CREATE_COMPLETE", "UPDATE_COMPLETE"]
]
}
}
}
}
}
]);
For MongoDB 3.2
db.cf_list_stacks.aggregate([
{
"$project": {
"StackSummaries": {
"$filter": {
"input": "$StackSummaries",
"as": "el",
"cond": {
"$or": [
{ "$eq": ["$$el.StackStatus", "CREATE_COMPLETE"] },
{ "$eq": ["$$el.StackStatus", "UPDATE_COMPLETE"] }
]
}
}
}
}
}
]);
For MongoDB 3.0 and below
db.cf_list_stacks.aggregate([
{ "$unwind": "$StackSummaries" },
{
"$match": {
"StackSummaries.StackStatus": {
"$in": ["CREATE_COMPLETE", "UPDATE_COMPLETE"]
}
}
},
{
"$group": {
"_id": "$_id",
"StackSummaries": {
"$addToSet": "$StackSummaries"
}
}
}
])
The above pipeline has the $unwind operator which deconstructs the StackSummaries array field from the input documents to output a document for each element. Each output document replaces the array with an element value.
Further filtering is required after the $unwind to keep only the documents that pass the given criteria, so a $match pipeline stage follows.
To get the original array field back after the $unwind, you need to group the documents with the $group operator, and within the group you can use the $addToSet operator to collect the elements back into an array. Note that $addToSet drops duplicate elements and does not guarantee their order; use $push if you need to keep both.
Since you are trying to find the items of the StackSummaries array that have a StackStatus of CREATE_COMPLETE or UPDATE_COMPLETE, you could use an $elemMatch projection, but at this time it won't work with the $in operator as required to get every element with either status. There is a JIRA issue for this:
db.cf_list_stacks.find(
{
"StackSummaries.StackStatus": {
"$in": ["CREATE_COMPLETE", "UPDATE_COMPLETE"]
}
},
{
"StackSummaries": {
"$elemMatch": {
"StackStatus": {
"$in": ["CREATE_COMPLETE", "UPDATE_COMPLETE"]
}
}
}
})
This will only give you documents where the StackStatus has the "CREATE_COMPLETE" value.
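The difference between the $filter approach and the $elemMatch projection can be sketched in plain JavaScript: $filter behaves roughly like Array.prototype.filter (all matching elements), whereas an $elemMatch projection returns only the first matching element, roughly like Array.prototype.find. The document below is a hypothetical, trimmed-down version of the one in the question, and the StackName values are made up.

```javascript
// Hypothetical, trimmed-down version of the document in the question.
const stackDoc = {
  _id: "5552b7fd9e8c7572e36e39df",
  StackSummaries: [
    { StackName: "one", StackStatus: "CREATE_COMPLETE" },
    { StackName: "two", StackStatus: "DELETE_COMPLETE" },
    { StackName: "three", StackStatus: "UPDATE_COMPLETE" },
  ],
};

const wanted = ["CREATE_COMPLETE", "UPDATE_COMPLETE"];
const isWanted = (el) => wanted.includes(el.StackStatus);

// Like $filter: every matching element is kept.
const allMatches = stackDoc.StackSummaries.filter(isWanted);

// Like an $elemMatch projection: only the first matching element is kept.
const firstMatch = stackDoc.StackSummaries.find(isWanted);

console.log(allMatches.map((el) => el.StackName)); // [ 'one', 'three' ]
console.log(firstMatch.StackName); // one
```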

Query number of sub collections Mongodb

I am new to MongoDB and I am trying to figure out how to count all the matching entries inside an array of documents like the one below:
"impression_details" : [
{
"date" : ISODate("2014-04-24T16:35:46.051Z"),
"ip" : "::1"
},
{
"date" : ISODate("2014-04-24T16:35:53.396Z"),
"ip" : "::1"
},
{
"date" : ISODate("2014-04-25T16:22:20.314Z"),
"ip" : "::1"
}
]
What I would like to do is count how many 2014-04-24 there are (which is 2). At the moment my query is like this and it is not working:
db.banners.find({
"impression_details.date":{
"$gte": ISODate("2014-04-24T00:00:00.000Z"),
"$lte": ISODate("2014-04-24T23:59:59.000Z")
}
}).count()
Not sure what is going on please help!
Thank you.
The concept here is that there is a distinct difference between selecting documents and selecting elements of a sub-document array. So what is happening currently in your query is exactly what should be happening. As the document contains at least one sub-document entry that matches your condition, then that document is found.
In order to "filter" the content of the sub-documents themselves for more than one match, you need to apply the .aggregate() method. And since you are expecting a count, this is what you want:
db.banners.aggregate([
// Matching documents still makes sense
{ "$match": {
"impression_details.date":{
"$gte": ISODate("2014-04-24T00:00:00.000Z"),
"$lte": ISODate("2014-04-24T23:59:59.000Z")
}
}},
// Unwind the array
{ "$unwind": "$impression_details" },
// Actually filter the array contents
{ "$match": {
"impression_details.date":{
"$gte": ISODate("2014-04-24T00:00:00.000Z"),
"$lte": ISODate("2014-04-24T23:59:59.000Z")
}
}},
// Group back to the normal document form and get a count
{ "$group": {
"_id": "$_id",
"impression_details": { "$push": "$impression_details" },
"count": { "$sum": 1 }
}}
])
And that will give you a form that only has the elements that match your query in the array, as well as providing the count of those entries that were matched.
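For intuition, the same "filter the array, then count" logic can be written in plain JavaScript against an in-memory copy of the data from the question. This is only a sketch of what the pipeline computes, not MongoDB code; the _id value and the function name countImpressions are assumptions.

```javascript
// The impression_details values come from the question; _id is made up.
const banners = [
  {
    _id: 1,
    impression_details: [
      { date: new Date("2014-04-24T16:35:46.051Z"), ip: "::1" },
      { date: new Date("2014-04-24T16:35:53.396Z"), ip: "::1" },
      { date: new Date("2014-04-25T16:22:20.314Z"), ip: "::1" },
    ],
  },
];

const from = new Date("2014-04-24T00:00:00.000Z");
const to = new Date("2014-04-24T23:59:59.000Z");

// Keeps only the in-range array elements of each document and counts them,
// dropping documents that end up with no matching elements.
function countImpressions(docs, start, end) {
  return docs
    .map((doc) => {
      const kept = doc.impression_details.filter((d) => d.date >= start && d.date <= end);
      return { _id: doc._id, impression_details: kept, count: kept.length };
    })
    .filter((doc) => doc.count > 0);
}

const perBanner = countImpressions(banners, from, to);
console.log(perBanner[0].count); // 2
```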
Using the $elemMatch operator would do what you want.
Your query means: find all the documents whose impression_details field contains a date between ISODate("2014-04-24T00:00:00.000Z") and ISODate("2014-04-24T23:59:59.000Z"). The point is that it returns the whole document, which is not what you want. So if you want only the subdocuments that satisfy your condition:
var docs = db.banners.find({
"impression_details": {
$elemMatch: {
date: {
$gte: ISODate("2014-04-24T00:00:00.000Z"),
$lte: ISODate("2014-04-24T23:59:59.000Z")
}
}
}
});
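var start = ISODate("2014-04-24T00:00:00.000Z");
var end = ISODate("2014-04-24T23:59:59.000Z");
var count = 0;
docs.forEach(function(doc) {
// find() returns whole documents, so count only the elements in range
doc.impression_details.forEach(function(d) {
if (d.date >= start && d.date <= end) count += 1;
});
});
print(count);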

MongoDB aggregation : $sum values of a key in a collection

I have a mongodb collection full of documents like this :
{
_id : xxxxxx,
category : 1,
tech : [
{key:"size",value:5},
{key:"color",value:"red"},
{key:"weight",value:27.4}
]
}
My question is: how can I aggregate (average, sum or whatever) each item with key = "size" in this collection?
thank you for your help
When you have documents that contain an array you use the $unwind operator in order to access the array elements.
db.tech.aggregate([
{ "$unwind": "$tech" },
{ "$match": { "tech.key": "size" } },
{ "$group": {
"_id": null,
"totalSize": { "$sum": "$tech.value" }
}}
])
So once the array is "un-wound" you can then $group on whatever you want to use as a key under the _id field; to aggregate across all documents in the collection, use null. Any of the group aggregation operators can be applied.
The array elements in the "de-normalized" documents will be available through "dot notation" as shown above.
Also see the full list of aggregation operators in the manual.
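As a rough plain-JavaScript picture of what $unwind, $match and $group compute here, the sketch below flattens the tech arrays, keeps the entries for one key, and reduces them to a sum and an average. It is not MongoDB code; the second sample document and the function name aggregateKey are made up for illustration.

```javascript
// First document mirrors the question; the second one is hypothetical.
const products = [
  {
    _id: 1,
    category: 1,
    tech: [
      { key: "size", value: 5 },
      { key: "color", value: "red" },
      { key: "weight", value: 27.4 },
    ],
  },
  {
    _id: 2,
    category: 1,
    tech: [
      { key: "size", value: 7 },
      { key: "color", value: "blue" },
    ],
  },
];

function aggregateKey(docs, key) {
  const values = docs
    .flatMap((doc) => doc.tech)      // like $unwind: one entry per array element
    .filter((t) => t.key === key)    // like $match on tech.key
    .map((t) => t.value);
  // Like $group with _id: null, using $sum and $avg over all entries.
  const total = values.reduce((sum, v) => sum + v, 0);
  return { total, average: values.length ? total / values.length : null };
}

const sizeStats = aggregateKey(products, "size");
console.log(sizeStats); // { total: 12, average: 6 }
```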