match operation for array size gt 0 does not work in aggregation MongoDB - mongodb

I have a MongoDB collection called Book.
{
"_id" : "00000000",
"name" : "Book1",
"similarBooks" : [],
"genre" : ""
}
similarBooks is an array in each Book document that contains the other books similar to Book1.
I want to find all the books that have similar books, which means I need to match documents whose similarBooks array has a size greater than 0 in my aggregation.
I was using this aggregation:
db.Book.aggregate([{
"$match": {
"similarBooks": {
"$gt": {
"$size": 0
}
}
}
}
])
But it does not work.
Another option is to use $expr in the match condition:
db.Book.aggregate([{
$match: {
$expr: {
$gt: [{
$size: "$similarBooks"
}, 0]
}
}
}
])
However, $expr cannot be used when creating a partial index, so I cannot use the second option in my aggregation. Is there any other way to run the aggregation to find documents whose array size is greater than 0?
I am using MongoDB shell version v4.2.3.

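You can use MongoDB's dot notation combined with $exists. The path similarBooks.0 refers to the first array element, which exists only when the array is not empty.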
db.Book.aggregate(
[
{
"$match": {
"similarBooks.0": {"$exists": true}
}
}
])
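Because this predicate is an ordinary query operator rather than $expr, it should also be acceptable as a partialFilterExpression. The sketch below rests on assumptions: the indexed key name is only an example, hasSimilarBooks is a plain-JavaScript stand-in for what the predicate checks (it is not a MongoDB function), and whether the planner actually picks the partial index is something to confirm with explain().

```javascript
// Filter shared by the $match stage and the (assumed) partial index.
const nonEmptySimilarBooks = { "similarBooks.0": { $exists: true } };

// Pipeline as it would be passed to db.Book.aggregate(pipeline) in the shell.
const pipeline = [{ $match: nonEmptySimilarBooks }];

// Options as they would be passed to db.Book.createIndex({ name: 1 }, indexOptions).
// The indexed key "name" is a hypothetical choice for illustration.
const indexOptions = { partialFilterExpression: nonEmptySimilarBooks };

// Plain-JavaScript stand-in for the predicate on array fields:
// element 0 exists only when the array has at least one entry.
function hasSimilarBooks(doc) {
  return Array.isArray(doc.similarBooks) && doc.similarBooks.length > 0;
}

const books = [
  { _id: "00000000", name: "Book1", similarBooks: [] },
  { _id: "00000001", name: "Book2", similarBooks: ["00000000"] },
  { _id: "00000002", name: "Book3" }, // field missing entirely
];

const matched = books.filter(hasSimilarBooks).map((b) => b._id);
console.log(matched); // only Book2 has a non-empty array
```

Keeping the filter in one constant makes it harder for the query and the index definition to drift apart.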

Related

How can I filter document in mongodb?

I have a query collection in mongodb which contains document in the below format :
{
_id : ObjectId("61aced92ede..."),
query : "How to solve...?",
answer : [],
is_solved : false
}
Now I want to filter the documents with the following conditions:
all documents that are not solved (is_solved : false)
only "n" documents that are solved (is_solved : true)
So the result will contain all unsolved documents and, for example, only 10 solved documents in an array.
You can use this aggregation query:
First use $facet to create two branches: one for unsolved documents and one for solved documents.
In each branch, apply the necessary $match, and $limit the solved documents.
Then concatenate the two arrays using $concatArrays.
db.collection.aggregate([
{
"$facet": {
"not_solved": [
{
"$match": {
"is_solved": false
}
}
],
"solved": [
{
"$match": {
"is_solved": true
}
},
{
"$limit": 10
}
]
}
},
{
"$project": {
"result": {
"$concatArrays": [
"$not_solved",
"$solved"
]
}
}
}
])
Example here, where I've used $limit: 1 to make the result easier to read.
Also, if you want, you can add $unwind at the end of the aggregation to get the values at the top level, like this example.
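As a rough sketch of that last suggestion, the two stages below could be appended after the $project stage. The $replaceRoot stage (MongoDB 3.4+) is an addition of mine, not part of the answer, to lift each element out of the result field. The function unsolvedPlusSomeSolved is a hypothetical plain-JavaScript imitation of the whole pipeline, useful only for reasoning about the expected output.

```javascript
// Extra stages to append after the $project stage of the pipeline above.
const flattenStages = [
  { $unwind: "$result" },                   // one output document per array element
  { $replaceRoot: { newRoot: "$result" } }, // assumption: promote each element to the top level
];

// Plain-JavaScript imitation of $facet + $concatArrays + the stages above.
function unsolvedPlusSomeSolved(docs, n) {
  const notSolved = docs.filter((d) => d.is_solved === false);        // "not_solved" branch
  const solved = docs.filter((d) => d.is_solved === true).slice(0, n); // "solved" branch with $limit
  return notSolved.concat(solved);                                     // $concatArrays
}

const queries = [
  { _id: 1, is_solved: true },
  { _id: 2, is_solved: false },
  { _id: 3, is_solved: true },
  { _id: 4, is_solved: false },
];

const ids = unsolvedPlusSomeSolved(queries, 1).map((d) => d._id);
console.log(ids); // unsolved documents first, then at most one solved document
```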

if mongodb match inside aggregation returns nothing, how to make a new query?

I use match to select some documents from the collection, and then output all other documents except those found.
If match doesn't find any documents, then I need to display all available documents from the collection.
How can this be done?
Without an example I don't know if I've understood correctly, but you can try this aggregation query (or add these aggregation stages to your query).
The idea is to use $facet to create two branches:
First branch: match the value
Second branch: get everything
Then use $project to output one of these branches using $cond and $size.
In the $project, if the size of the array returned by the exists branch is 0 (no results), the result is no_exists (i.e. all documents); otherwise it is the exists value.
db.collection.aggregate([
{
"$facet": {
"exists": [
{
"$match": {
// your match
}
}
],
"no_exists": []
}
},
{
"$project": {
"result": {
"$cond": {
"if": {
"$eq": [
{
"$size": "$exists"
},
0
]
},
"then": "$no_exists",
"else": "$exists"
}
}
}
}
])
Example here where the value exists and only that value is output, and here where it does not exist and the whole collection is output.
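The selection logic of the $cond stage can be imitated in a few lines of plain JavaScript. This is only a sketch for reasoning about the result: matchOrEverything and the sample tag field are hypothetical names, and predicate stands in for the "// your match" placeholder.

```javascript
// Plain-JavaScript imitation of the $facet + $cond pipeline above.
function matchOrEverything(docs, predicate) {
  const exists = docs.filter(predicate); // "exists" branch
  const noExists = docs;                 // "no_exists" branch: keeps every document
  return exists.length === 0 ? noExists : exists; // $cond on $size
}

const items = [
  { _id: 1, tag: "a" },
  { _id: 2, tag: "b" },
];

const found = matchOrEverything(items, (d) => d.tag === "a");
const fallback = matchOrEverything(items, (d) => d.tag === "zzz");
console.log(found.length, fallback.length);
```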

MongoDB sort by value in embedded document array

I have a MongoDB collection of documents formatted as shown below:
{
"_id" : ...,
"username" : "foo",
"challengeDetails" : [
{
"ID" : ...,
"pb" : 30081,
},
{
"ID" : ...,
"pb" : 23995,
},
...
]
}
How can I write a find query for records that have a challengeDetails documents with a matching ID and sort them by the corresponding PB?
I have tried (this is using the NodeJS driver, which is why the projection syntax is weird)
const result = await collection
.find(
{ "challengeDetails.ID": challengeObjectID},
{
projection: {"challengeDetails.$": 1},
sort: {"challengeDetails.0.pb": 1}
}
)
This returns the correct records (documents with challengeDetails for only the matching ID) but they're not sorted.
I think this doesn't work because as the docs say:
When the find() method includes a sort(), the find() method applies the sort() to order the matching documents before it applies the positional $ projection operator.
But they don't explain how to sort after projecting. How would I write a query to do this? (I have a feeling aggregation may be required but am not familiar enough with MongoDB to write that myself)
You need to use aggregation to sort an array:
$unwind to deconstruct the array
$match to match the value
$sort for sorting
$group to reconstruct the array
Here is the code
db.collection.aggregate([
{ "$unwind": "$challengeDetails" },
{ "$match": { "challengeDetails.ID": 2 } },
{ "$sort": { "challengeDetails.pb": 1 } },
{
"$group": {
"_id": "$_id",
"username": { "$first": "$username" },
"challengeDetails": { $push: "$challengeDetails" }
}
}
])
Working Mongo playground
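One caveat: $group does not guarantee the order of its output documents, so if the documents themselves must come back ordered by pb, a $sort after the $group may be needed. The sketch below is an alternative under assumptions, not the answer's method: it keeps the array intact with $filter and $addFields (MongoDB 3.4+) and sorts on the remaining element, assuming each document holds at most one entry per ID. challengeId is a placeholder value and sortByChallengePb is a hypothetical plain-JavaScript imitation.

```javascript
const challengeId = 2; // placeholder for the real challenge ID

// Pipeline as it would be passed to db.collection.aggregate(pipeline).
const pipeline = [
  { $match: { "challengeDetails.ID": challengeId } },
  {
    $addFields: {
      challengeDetails: {
        $filter: {
          input: "$challengeDetails",
          as: "cd",
          cond: { $eq: ["$$cd.ID", challengeId] },
        },
      },
    },
  },
  { $sort: { "challengeDetails.pb": 1 } }, // sort documents after filtering the array
];

// Plain-JavaScript imitation of the three stages above.
function sortByChallengePb(docs, id) {
  return docs
    .map((d) => ({ ...d, challengeDetails: d.challengeDetails.filter((c) => c.ID === id) }))
    .filter((d) => d.challengeDetails.length > 0)
    .sort((a, b) => a.challengeDetails[0].pb - b.challengeDetails[0].pb);
}

const users = [
  { username: "foo", challengeDetails: [{ ID: 2, pb: 30081 }, { ID: 3, pb: 100 }] },
  { username: "bar", challengeDetails: [{ ID: 2, pb: 23995 }] },
  { username: "baz", challengeDetails: [{ ID: 3, pb: 1 }] },
];

const ordered = sortByChallengePb(users, challengeId).map((u) => u.username);
console.log(ordered); // users with the challenge, lowest pb first
```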

Using "$count" Within an "addField" Operation in MongoDB Aggregation

I am trying to find the correct combination of aggregation operators to add a field titled "totalCount" to my mongoDB view.
This will get me the count at this particular stage of the aggregation pipeline and output this as the result of a count on each of the documents:
{
$count: "count"
}
But I then end up with one document with this result, rather than what I'm trying to accomplish, which is to make this value print out as an addedField that is a field/value on all of the documents, or even better, a value that prints in addition to the returned documents.
I've tried this, but it gives me the error "Unrecognized expression '$count'":
{
$addFields: {
"totalCount" : { $count: "totalCount" }
}
}
What would the correct syntactical construction be for this? Is it possible to do it this way, or do I need to use $sum, or some other operator to make this work? I also tried this:
{
$addFields: {
"totalCount" : { $sum: { _id: 1 } }
}
},
... but while it doesn't give me any errors, it just prints 0 as the value for that field on every document rather than the total count of all documents.
The total count will always be a one-document result, so you need $facet to run multiple aggregation pipelines and then merge the results. Let's say your regular pipeline contains a simple $project and you want to merge its results with $count. You can run the aggregation below:
db.col.aggregate([
{
$facet: {
totalCount: [
{ $count: "value" }
],
pipelineResults: [
{
$project: { _id: 1 } // your regular aggregation pipeline here
}
]
}
},
{
$unwind: "$pipelineResults"
},
{
$unwind: "$totalCount"
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [ "$pipelineResults", { totalCount: "$totalCount.value" } ]
}
}
}
])
After $facet stage you'll get single document like this
{
"totalCount" : [
{
"value" : 3
}
],
"pipelineResults" : [
{
"_id" : ObjectId("5b313241120e4bc08ce87e46")
},
//....
]
}
Then you have to use $unwind to transform arrays into multiple documents and $replaceRoot with $mergeObjects to promote regular pipeline results into root level.
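To make the reshaping concrete, here is a plain-JavaScript imitation of what the stages after $facet do. It is only a sketch: attachTotalCount is a hypothetical helper, and the empty-input branch reflects my understanding that $count emits no document for empty input, so the second $unwind would leave nothing.

```javascript
// Plain-JavaScript imitation of $facet, the two $unwind stages and $replaceRoot/$mergeObjects.
function attachTotalCount(docs) {
  // Shape of the single document produced by $facet.
  const facet = {
    totalCount: docs.length > 0 ? [{ value: docs.length }] : [],
    pipelineResults: docs,
  };
  const output = [];
  for (const result of facet.pipelineResults) { // $unwind: "$pipelineResults"
    for (const count of facet.totalCount) {     // $unwind: "$totalCount"
      output.push({ ...result, totalCount: count.value }); // $mergeObjects
    }
  }
  return output;
}

const merged = attachTotalCount([{ _id: "a" }, { _id: "b" }, { _id: "c" }]);
console.log(merged); // every document carries the same totalCount
```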
Since MongoDB version 5.0 there is another option that avoids the main disadvantage of $facet, namely that all returned documents are grouped into one big document, which is subject to the 16 MB document size limit. Using $setWindowFields avoids this concern.
It can simply replace @micki's 4 steps:
db.col.aggregate([
{$setWindowFields: {output: {totalCount: {$count: {}}}}}
])

How to filter array in a mongodb query

In mongodb, I have a collection that contains a single document that looks like the following:
{
"_id" : ObjectId("5552b7fd9e8c7572e36e39df"),
"StackSummaries" : [
{
"StackId" : "arn:aws:cloudformation:ap-southeast-2:406119630047:stack/XXXX-30fb22a-285-439ee279-c7c8d36/4ebd8770-f8f4-11e4-bf36-503f2370240f",
"TemplateDescription" : "XXXX",
"StackStatusReason" : "",
"CreationTime" : "2015-05-12T22:14:50.535Z",
"StackName" : "XXXX",
"StackStatus" : "CREATE_COMPLETE"
},
{
"TemplateDescription" : "XXXX",
"StackStatusReason" : "",
"CreationTime" : "2015-05-11T04:02:05.543Z",
"StackName" : "XXXX",
"StackStatus" : "DELETE_COMPLETE",
"StackId" : "arn:aws:cloudformation:ap-southeast-2:406119630047:stack/XXXXX/7c8d04e0-f792-11e4-bb12-506726f15f9a"
},
{ ... },
{ many others }
]
}
i.e. the imported results of the AWS CLI command aws cloudformation list-stacks
I'm trying to find the items of the StackSummaries array that have a StackStatus of CREATE_COMPLETE or UPDATE_COMPLETE. After much experimenting and reading other SO posts I arrived at the following:
db.cf_list_stacks.aggregate( {$match: {"StackSummaries.StackStatus": "CREATE_COMPLETE"}})
However this still returns the whole document (and I haven't even worried about UPDATE_COMPLETE).
I'm coming from an SQL background and struggling with simple queries like this. Any ideas on how to get the information I'm looking for?
SO posts I've looked at:
MongoDB query with elemMatch for nested array data
MongoDB: multiple $elemMatch
$projection vs $elemMatch
Make $elemMatch (projection) return all objects that match criteria
Update
Notes on things I learned while understanding this topic:
aggregate() is just a pipeline (like a Unix shell pipeline) where each $ operator is just another step. And like shell pipelines they can look complex, but you just build them up step by step until you get the results you want
Mongo has a great webinar: Exploring the Aggregation Framework
RoboMongo is a good tool (GPL3) for working with Mongo data and queries
If you only want the object inside the StackSummaries array, you should use the $unwind clause to expand the array, filter the documents you want and then project only the parts of the document that you actually want.
The query would look something like this:
db.cf_list_stacks.aggregate([
{ '$unwind' : '$StackSummaries' },
{ '$match' : { 'StackSummaries.StackStatus' : 'CREATE_COMPLETE' } },
{ '$project' : {
'TemplateDescription' : '$StackSummaries.TemplateDescription',
'StackStatusReason' : '$StackSummaries.StackStatusReason',
...
} }
])
Useful links:
Aggregation pipeline documentation
$unwind Documentation
$project Documentation
With MongoDB 3.4 and newer, you can leverage the $addFields and $filter operators with the aggregation framework to get the desired result.
Consider running the following pipeline:
db.cf_list_stacks.aggregate([
{
"$addFields": {
"StackSummaries": {
"$filter": {
"input": "$StackSummaries",
"as": "el",
"cond": {
"$in": [
"$$el.StackStatus",
["CREATE_COMPLETE", "UPDATE_COMPLETE"]
]
}
}
}
}
}
]);
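The effect of the $filter expression can be imitated in plain JavaScript, which may help readers coming from other languages. This is a sketch only: filterStacks is a hypothetical helper and the sample document is shortened.

```javascript
const wanted = ["CREATE_COMPLETE", "UPDATE_COMPLETE"];

// Plain-JavaScript imitation of $addFields + $filter with the $in condition.
function filterStacks(doc) {
  return {
    ...doc, // all other fields are kept, as with $addFields
    StackSummaries: doc.StackSummaries.filter((el) => wanted.includes(el.StackStatus)),
  };
}

const sample = {
  _id: 1,
  StackSummaries: [
    { StackName: "one", StackStatus: "CREATE_COMPLETE" },
    { StackName: "two", StackStatus: "DELETE_COMPLETE" },
    { StackName: "three", StackStatus: "UPDATE_COMPLETE" },
  ],
};

const filtered = filterStacks(sample);
console.log(filtered.StackSummaries.map((s) => s.StackName));
```

Unlike an $unwind-based pipeline, this keeps one document per input document, even when no element matches (the array is then empty).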
For MongoDB 3.2
db.cf_list_stacks.aggregate([
{
"$project": {
"StackSummaries": {
"$filter": {
"input": "$StackSummaries",
"as": "el",
"cond": {
"$or": [
{ "$eq": ["$$el.StackStatus", "CREATE_COMPLETE"] },
{ "$eq": ["$$el.StackStatus", "UPDATE_COMPLETE"] }
]
}
}
}
}
}
]);
For MongoDB 3.0 and below
db.cf_list_stacks.aggregate([
{ "$unwind": "$StackSummaries" },
{
"$match": {
"StackSummaries.StackStatus": {
"$in": ["CREATE_COMPLETE", "UPDATE_COMPLETE"]
}
}
},
{
"$group": {
"_id": "$_id",
"StackSummaries": {
"$addToSet": "$StackSummaries"
}
}
}
])
The above pipeline uses the $unwind operator, which deconstructs the StackSummaries array field from the input documents and outputs one document per element. Each output document replaces the array with a single element value.
Further filtering is required after the $unwind to keep only the documents that pass the given criteria, so a $match stage follows.
To get the original array field back after the $unwind, group the documents with the $group operator, and within the group use the $addToSet accumulator to collect the elements into an array.
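Two details of this pipeline are easy to miss: $addToSet drops duplicate elements and does not promise any particular order, and a document with no matching element disappears from the output entirely. The plain-JavaScript imitation below is a sketch of that behaviour; unwindMatchGroup is a hypothetical helper and duplicates are detected here by comparing JSON strings, which is a simplification.

```javascript
// Plain-JavaScript imitation of $unwind -> $match -> $group with $addToSet.
function unwindMatchGroup(docs, statuses) {
  const groups = new Map();
  for (const doc of docs) {
    for (const stack of doc.StackSummaries) {              // $unwind
      if (!statuses.includes(stack.StackStatus)) continue; // $match
      if (!groups.has(doc._id)) groups.set(doc._id, []);
      const bucket = groups.get(doc._id);
      const key = JSON.stringify(stack);
      // $addToSet: add the element only if an equal one is not already present
      if (!bucket.some((s) => JSON.stringify(s) === key)) bucket.push(stack);
    }
  }
  return [...groups].map(([_id, StackSummaries]) => ({ _id, StackSummaries }));
}

const docs = [
  {
    _id: 1,
    StackSummaries: [
      { StackName: "A", StackStatus: "CREATE_COMPLETE" },
      { StackName: "A", StackStatus: "CREATE_COMPLETE" }, // duplicate element
      { StackName: "B", StackStatus: "DELETE_COMPLETE" },
      { StackName: "C", StackStatus: "UPDATE_COMPLETE" },
    ],
  },
  { _id: 2, StackSummaries: [{ StackName: "D", StackStatus: "DELETE_COMPLETE" }] },
];

const grouped = unwindMatchGroup(docs, ["CREATE_COMPLETE", "UPDATE_COMPLETE"]);
console.log(JSON.stringify(grouped));
```

If duplicates should be kept, $push is the accumulator to consider instead of $addToSet.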
Since you are trying to find the items of the StackSummaries array that have a StackStatus of CREATE_COMPLETE or UPDATE_COMPLETE, you could use an $elemMatch projection, but at this time it will not return every element matched by the $in operator. There is a JIRA issue for this:
db.cf_list_stacks.find(
{
"StackSummaries.StackStatus": {
"$in": ["CREATE_COMPLETE", "UPDATE_COMPLETE"]
}
},
{
"StackSummaries": {
"$elemMatch": {
"StackStatus": {
"$in": ["CREATE_COMPLETE", "UPDATE_COMPLETE"]
}
}
}
})
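Because the $elemMatch projection returns only the first matching array element, this gives you just one element per document (in the sample above, the one with the "CREATE_COMPLETE" status).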
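The first-match behaviour of the $elemMatch projection can be imitated with Array.prototype.find. This is a sketch for illustration: elemMatchProjection is a hypothetical helper, and it assumes the document already passed the query filter.

```javascript
// Plain-JavaScript imitation of the $elemMatch projection: keep only the FIRST matching element.
function elemMatchProjection(doc, statuses) {
  const first = doc.StackSummaries.find((s) => statuses.includes(s.StackStatus));
  // When nothing matches, the projected field is omitted.
  return first === undefined ? { _id: doc._id } : { _id: doc._id, StackSummaries: [first] };
}

const stackDoc = {
  _id: 1,
  StackSummaries: [
    { StackName: "one", StackStatus: "CREATE_COMPLETE" },
    { StackName: "two", StackStatus: "DELETE_COMPLETE" },
    { StackName: "three", StackStatus: "UPDATE_COMPLETE" },
  ],
};

const projected = elemMatchProjection(stackDoc, ["CREATE_COMPLETE", "UPDATE_COMPLETE"]);
console.log(projected.StackSummaries); // a single element, although two elements match
```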