MongoDB: count both matching documents and matching subdocuments, grouped by property of document - mongodb

Given a collection of documents each containing an array of subdocuments (among other properties):
{
"prop1": False,
"prop2": "unique_value",
"subdocuments": [
{
"subprop1": 1,
"subprop2": 10
},
{
"subprop1": 30,
"subprop2": 40
},
{
"subprop1": 10,
"subprop2": 1
}
]
}
And a $match query covering both documents and subdocuments:
{
"prop1": False,
"$or": [
{"subdocuments.subprop1": {"$lt": 3}},
{"subdocuments.subprop2": {"$lt": 5}}
]
}
How can I create an aggregate query that returns the number of matching subdocuments and matching documents, grouped by a specific property of the root documents?
Just counting total subdocuments and matching documents is simple, but I'm struggling to also get the right count of matching subdocuments.
Ideally I'd like to have a result like this (if we consider the sample document, only subdoc 1 and 3 match the $or conditions):
{
"unique_value": {
"documents": 1,
"subdocuments": 2
}
}
In this case the results are being grouped by the value of "prop2".

You can use $size and $filter to get the count for matching subdocuments first. Then do a $sum to get the documentCount and subdocumentCount.
db.collection.aggregate([
{
"$match": {
"prop1": false,
"$or": [
{
"subdocuments.subprop1": {
"$lt": 3
}
},
{
"subdocuments.subprop2": {
"$lt": 5
}
}
]
}
},
{
"$addFields": {
"subdocumentCount": {
$size: {
"$filter": {
"input": "$subdocuments",
"as": "s",
"cond": {
"$or": [
{
$lt: [
"$$s.subprop1",
3
]
},
{
$lt: [
"$$s.subprop2",
5
]
}
]
}
}
}
}
}
},
{
$group: {
_id: "$prop2",
documentCount: {
$sum: 1
},
subdocumentCount: {
$sum: "$subdocumentCount"
}
}
},
{
$project: {
_id: 0,
k: "$_id",
v: {
documentCount: "$documentCount",
subdocumentCount: "$subdocumentCount"
}
}
},
{
$group: {
_id: null,
docs: {
$push: "$$ROOT"
}
}
},
{
"$addFields": {
"docs": {
"$arrayToObject": "$docs"
}
}
},
{
"$replaceRoot": {
"newRoot": "$docs"
}
}
])
Here is the Mongo playground for your reference.
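To sanity-check the counts the pipeline should produce, the same logic can be sketched in plain JavaScript over an in-memory array. This only illustrates what the $match, $filter/$size and $group stages compute; it is not a replacement for the aggregation. The helper name countByProp2 and the sample data are assumptions made for this sketch.

```javascript
// Sample data shaped like the document in the question (assumed for illustration).
const docs = [
  {
    prop1: false,
    prop2: "unique_value",
    subdocuments: [
      { subprop1: 1, subprop2: 10 },
      { subprop1: 30, subprop2: 40 },
      { subprop1: 10, subprop2: 1 },
    ],
  },
];

// Same condition as the "$or" used inside "$filter".
const subdocMatches = (s) => s.subprop1 < 3 || s.subprop2 < 5;

// Hypothetical helper mirroring $match -> $addFields -> $group -> $arrayToObject.
function countByProp2(documents) {
  const result = {};
  for (const doc of documents) {
    const matching = doc.subdocuments.filter(subdocMatches);
    // $match: skip documents where prop1 is not false or no subdocument matches.
    if (doc.prop1 !== false || matching.length === 0) continue;
    if (!result[doc.prop2]) {
      result[doc.prop2] = { documentCount: 0, subdocumentCount: 0 };
    }
    result[doc.prop2].documentCount += 1;
    result[doc.prop2].subdocumentCount += matching.length;
  }
  return result;
}

console.log(countByProp2(docs));
// { unique_value: { documentCount: 1, subdocumentCount: 2 } }
```

Comparing this output with the result of the aggregation on a small sample is a quick way to confirm that the $filter condition and the $match condition stay in sync.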

Related

MongoDB: How to merge all documents into a single document in an aggregation pipeline

I have the current aggregation output as follows:
[
{
"courseCount": 14
},
{
"registeredStudentsCount": 1
}
]
The array has two documents. I would like to combine them into a single document that contains all of their fields, using MongoDB.
db.collection.aggregate([
{
$group: {
_id: 0,
merged: {
$push: "$$ROOT"
}
}
},
{
$replaceRoot: {
newRoot: {
"$mergeObjects": "$merged"
}
}
}
])
Explained:
Group the output documents into one array field with $push.
Replace the document root with the merged objects using $mergeObjects.
Playground
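As a rough mental model, collecting the documents with $push and folding them with $mergeObjects behaves like Object.assign over an array in plain JavaScript. The variable names below are assumptions for this sketch, and the input is the two-document output shown in the question.

```javascript
// Output of the earlier stages, as shown in the question.
const stageOutput = [{ courseCount: 14 }, { registeredStudentsCount: 1 }];

// $group + $push: all documents end up in one array.
const grouped = { _id: 0, merged: stageOutput };

// $mergeObjects: fold the array into one object.
// If the same field appears in several documents, the last one wins.
const mergedDoc = Object.assign({}, ...grouped.merged);

console.log(mergedDoc);
// { courseCount: 14, registeredStudentsCount: 1 }
```

Because later values overwrite earlier ones, this approach suits documents with distinct field names, as in the question.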
{
$group: {
"_id": "null",
data: {
$push: "$$ROOT"
}
}
}
When you add this as the last pipeline stage, it puts all the docs under data, but data will be an array of objects.
In your case it would be
{ "data":[
{
"courseCount": 14
},
{
"registeredStudentsCount": 1
}
] }
Another approach would be,
db.collection.aggregate([
{
$group: {
"_id": "null",
f: {
$first: "$$ROOT",
},
l: {
$last: "$$ROOT"
}
}
},
{
"$project": {
"output": {
"courseCount": "$f.courseCount",
"registeredStudentsCount": "$l.registeredStudentsCount"
},
"_id": 0
}
}
])
It's not as dynamic as the first one, but since you have only two docs you can use this approach. It outputs
[
{
"output": {
"courseCount": 14,
"registeredStudentsCount": 1
}
}
]
With an extra stage appended to the second approach:
{
"$replaceRoot": {
"newRoot": "$output"
}
}
You will get the output as
[
{
"courseCount": 14,
"registeredStudentsCount": 1
}
]

How do I use $unwind and then $group in the same mongodb query

I have the following mongodb structure...
[
{
track: 'Newcastle',
time: '17:30',
date: '22/04/2022',
bookmakers: [
{
bookmaker: 'Coral',
runners: [
{
runner: 'John',
running: true,
odds: 3.2
},
...
]
},
...
]
},
...
]
I'm trying to filter the bookmakers array for each document so that it only includes the objects that match the specified bookmaker values, for example:
{ 'bookmakers.bookmaker': { $in: ['Coral', 'Bet365'] } }
At the moment, I'm using the following MongoDB query to select only the specified bookmakers. However, I need to put the documents back together after they've been separated by the $unwind. Is there a way I can do this using $group?
await HorseRacingOdds.aggregate([
{ $unwind: "$bookmakers" },
{
$group: {
_id: "$_id",
bookmakers: "$bookmakers"
}
},
{
$project: {
"_id": 0,
"__v": 0,
"lastUpdate": 0
}
}
])
How about a plain $addFields with $filter?
db.collection.aggregate([
{
"$addFields": {
"bookmakers": {
"$filter": {
"input": "$bookmakers",
"as": "b",
"cond": {
"$in": [
"$$b.bookmaker",
[
"Coral",
"Bet365"
]
]
}
}
}
}
},
{
$project: {
"_id": 0,
"__v": 0,
"lastUpdate": 0
}
}
])
Here is the Mongo playground for your reference.
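The effect of $filter with $in can be pictured as an ordinary array filter applied per document, which is why no $unwind/$group round trip is needed. The sketch below is plain JavaScript over assumed sample data; the bookmaker name "Other" is made up for illustration.

```javascript
// Assumed sample data, shaped like the structure in the question.
const races = [
  {
    track: "Newcastle",
    time: "17:30",
    bookmakers: [
      { bookmaker: "Coral", runners: [] },
      { bookmaker: "Other", runners: [] }, // hypothetical bookmaker
      { bookmaker: "Bet365", runners: [] },
    ],
  },
];

const wanted = ["Coral", "Bet365"];

// $addFields + $filter: every document stays whole, only the array shrinks.
const filteredRaces = races.map((race) => ({
  ...race,
  bookmakers: race.bookmakers.filter((b) => wanted.includes(b.bookmaker)),
}));

console.log(filteredRaces[0].bookmakers.map((b) => b.bookmaker));
// [ 'Coral', 'Bet365' ]
```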

How to query an array and retrieve it from MongoDB

Updated:
I have a document on the database that looks like this:
My question is the following:
How can I retrieve the first 10 elements from the friendsArray from database and sort it descending or ascending based on the lastTimestamp value.
I don't want to download all values to my API and then sort them in Python because that is wasting my resources.
I have tried it using this code (Python):
listOfUsers = db.user_relations.find_one({'userId': '123'}, {'friendsArray' : {'$orderBy': {'lastTimestamp': 1}}}).limit(10)
but it just gives me this error pymongo.errors.OperationFailure: Unknown expression $orderBy
Any answer at this point would be really helpful! Thank You!
Use aggregate:
first $unwind the array,
then $sort by the timestamp,
$group by _id to rebuild the sorted array,
then use $addFields with $filter to keep the first 10 items of the array.
db.collection.aggregate([
{ $match:{userId:"123"}},
{
"$unwind": "$friendsArray"
},
{
$sort: {
"friendsArray.lastTimeStamp": 1
}
},
{
$group: {
_id: "$_id",
friendsArray: {
$push: "$friendsArray"
}
},
},
{
$addFields: {
friendsArray: {
$filter: {
input: "$friendsArray",
as: "z",
cond: {
$lt: [
{
$indexOfArray: [
"$friendsArray",
"$$z"
]
},
10
]
}// 10 is n first item
}
}
},
}
])
https://mongoplayground.net/p/2Usk5sRY2L2
For pagination, use this:
db.collection.aggregate([
{ $match:{userId:"123"}},
{
"$unwind": "$friendsArray"
},
{
$sort: {
"friendsArray.lastTimeStamp": 1
}
},
{
$group: {
_id: "$_id",
friendsArray: {
$push: "$friendsArray"
}
},
},
{
$addFields: {
friendsArray: {
$filter: {
input: "$friendsArray",
as: "z",
cond: {
$and: [
{
$gte: [
{
$indexOfArray: [
"$friendsArray",
"$$z"
]
},
10
]
},
{
$lt: [
{
$indexOfArray: [
"$friendsArray",
"$$z"
]
},
20
]
},
]
}// $gte 10 and $lt 20 keep indexes 10 to 19, i.e. the second page of 10 items
}
}
},
}
])
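As an alternative to filtering on $indexOfArray, the three-argument form of the $slice aggregation operator can cut a page out of the sorted array directly. This is a different technique from the one in the answer above, shown as a sketch only. The helper name friendsPagePipeline is hypothetical, and the field name lastTimeStamp is copied from the pipelines above (the question spells it lastTimestamp, so adjust it to match your documents).

```javascript
// Hypothetical helper: builds a pipeline returning one page of friendsArray.
function friendsPagePipeline(userId, skip, limit) {
  return [
    { $match: { userId: userId } },
    { $unwind: "$friendsArray" },
    { $sort: { "friendsArray.lastTimeStamp": 1 } }, // use -1 for descending
    { $group: { _id: "$_id", friendsArray: { $push: "$friendsArray" } } },
    // $slice: [array, position, n] keeps n elements starting at index position.
    { $project: { friendsArray: { $slice: ["$friendsArray", skip, limit] } } },
  ];
}

// Second page of 10 items: array indexes 10 to 19.
const pagePipeline = friendsPagePipeline("123", 10, 10);
console.log(JSON.stringify(pagePipeline, null, 2));
// In the shell: db.collection.aggregate(pagePipeline)
```

Slicing by position also avoids a quirk of $indexOfArray, which returns the index of the first equal element and can therefore mis-place duplicate entries.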
Translating your find into an aggregation (an $unwind is needed, which is why aggregation is used) gives the query below.
Test code here
Query (for descending replace 1 with -1)
db.collection.aggregate([
{
"$match": {
"userId": "123"
}
},
{
"$unwind": {
"path": "$friendsArray"
}
},
{
"$sort": {
"friendsArray.lastTimeStamp": 1
}
},
{
"$limit": 10
},
{
"$replaceRoot": {
"newRoot": "$friendsArray"
}
}
])
If you want to skip some elements first, add one more stage before the $limit:
{
"$skip" : 10
}
For example, this skips the first 10 elements so that the $limit returns the next 10.

total of all groups totals using mongodb

I wrote this aggregation pipeline, and I want to add a field that contains the global total of all the group totals.
{ "$match": query },
{ "$sort": cursor.sort },
{ "$group": {
_id: { key:"$paymentFromId"},
items: {
$push: {
_id:"$_id",
value:"$value",
transaction:"$transaction",
paymentMethod:"$paymentMethod",
createdAt:"$createdAt",
...
}
},
count:{$sum:1},
total:{$sum:"$value"}
}}
{
//i want to get
...project groups , goupsTotal , groupsCount
}
,{
"$skip":cursor.skip
},{
"$limit":cursor.limit
},
])
You need to use $facet (available from MongoDB 3.4) to apply multiple pipelines to the same set of docs:
first pipeline: skip and limit docs
second pipeline: calculate total of all groups
{ "$match": query },
{ "$sort": cursor.sort },
{ "$group": {
_id: { key:"$paymentFromId"},
items: {
$push: "$$CURRENT"
},
count:{$sum:1},
total:{$sum:"$value"}
}
},
{
$facet: {
docs: [
{ $skip:cursor.skip },
{ $limit:cursor.limit }
],
overall: [
{$group: {
_id: null,
groupsTotal: {$sum: '$total'},
groupsCount:{ $sum: '$count'}
}
}
]
}
}
the final output will be
{
docs: [ .... ], // array of {_id, items, count, total}
overall: [ { } ] // array with one object holding groupsTotal and groupsCount
}
PS: For the sake of simplicity, I've replaced the items in the third pipeline stage with $$CURRENT, which adds the whole document; if you need custom properties, specify them instead.
I did it this way: project the $group result into a new field doc, then $sum the subtotals.
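Each $facet sub-pipeline returns its output as an array, so the totals arrive as the first element of overall, and that array is empty when no groups exist. The sketch below shows one way to unpack such a result in plain JavaScript. The sample values are invented for illustration and only mirror the field names used in the pipeline above.

```javascript
// Assumed shape of what the $facet pipeline returns (values are made up).
const facetResult = [
  {
    docs: [
      { _id: { key: "a" }, items: [], count: 2, total: 300 },
      { _id: { key: "b" }, items: [], count: 1, total: 50 },
    ],
    overall: [{ _id: null, groupsTotal: 350, groupsCount: 3 }],
  },
];

// $facet always yields exactly one document.
const [{ docs: pageOfGroups, overall }] = facetResult;

// overall is an array; fall back to zeros when nothing matched.
const totals = overall[0] || { groupsTotal: 0, groupsCount: 0 };

console.log(pageOfGroups.length, totals.groupsTotal, totals.groupsCount);
// 2 350 3
```

Note that groupsCount, being the sum of each group's count, is the number of grouped documents rather than the number of groups.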
{
$project: {
"doc": {
"_id": "$_id",
"total": "$total",
"items":"$items",
"count":"$count"
}
}
},{
$group: {
"_id": null,
"globalTotal": {
$sum: "$doc.total"
},
"result": {
$push: "$doc"
}
}
},
{
$project: {
"result": 1,
//paging "result": {$slice: [ "$result", cursor.skip,cursor.limit ] },
"_id": 0,
"globalTotal": 1
}
}
the output
[
{
globalTotal: 121500,
result: [ [group1], [group2], [group3], ... ]
}
]

Mongo Query to Return only a subset of SubDocuments

Using the example from the Mongo docs:
{ _id: 1, results: [ { product: "abc", score: 10 }, { product: "xyz", score: 5 } ] }
{ _id: 2, results: [ { product: "abc", score: 8 }, { product: "xyz", score: 7 } ] }
{ _id: 3, results: [ { product: "abc", score: 7 }, { product: "xyz", score: 8 } ] }
db.survey.find(
{ id: 12345, results: { $elemMatch: { product: "xyz", score: { $gte: 6 } } } }
)
How do I return survey 12345 (regardless of whether it HAS surveys or not) but only return surveys with a score greater than 6? In other words, I don't want the document disqualified from the results based on the subdocument; I want the document, but with only a subset of its subdocuments.
What you are asking for is not so much a "query" but is basically just a filtering of content from the array in each document.
You do this with .aggregate() and $project:
db.survey.aggregate([
{ "$project": {
"results": {
"$setDifference": [
{ "$map": {
"input": "$results",
"as": "el",
"in": {
"$cond": [
{ "$and": [
{ "$eq": [ "$$el.product", "xyz" ] },
{ "$gte": [ "$$el.score", 6 ] }
]},
"$$el",
false
]
}
}},
[false]
]
}
}}
])
So rather than "constrain" results to documents that have an array member matching the condition, all this is doing is "filtering" out the array members that do not match the condition, but it returns the document with an empty array if need be.
The fastest present way to do this is with $map to inspect all elements and $setDifference to filter out any values of false returned from that inspection. The possible downside is a "set" must contain unique elements, so this is fine as long as the elements themselves are unique.
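The $map plus $setDifference idea, including the uniqueness caveat, can be sketched in plain JavaScript. The setDifference helper below is a hypothetical stand-in that compares elements by their JSON text; it is not how the server implements the operator, and the server does not guarantee the order of the resulting set.

```javascript
// Hypothetical stand-in for $setDifference: drops excluded values and duplicates.
function setDifference(a, b) {
  const exclude = new Set(b.map((v) => JSON.stringify(v)));
  const seen = new Set();
  const out = [];
  for (const v of a) {
    const key = JSON.stringify(v);
    if (exclude.has(key) || seen.has(key)) continue;
    seen.add(key);
    out.push(v);
  }
  return out;
}

// Sample array; the duplicate entry is added on purpose to show the caveat.
const results = [
  { product: "abc", score: 8 },
  { product: "xyz", score: 7 },
  { product: "xyz", score: 7 },
  { product: "xyz", score: 5 },
];

// $map + $cond: each element becomes itself when it matches, otherwise false.
const mapped = results.map((el) =>
  el.product === "xyz" && el.score >= 6 ? el : false
);

// $setDifference against [false] removes the placeholders.
const kept = setDifference(mapped, [false]);

console.log(kept);
// [ { product: 'xyz', score: 7 } ]
```

The two identical matching elements collapse into one, which is the downside mentioned above when array members are not unique.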
Future releases will have a $filter operator, which is similar to $map in structure but directly removes non-matching results, whereas $map just returns them (via the $cond, as either the matching element or false), and is therefore better suited.
Otherwise, if the elements are not unique or the MongoDB server version is less than 2.6, you are doing this using $unwind, in a non-performant way:
db.survey.aggregate([
{ "$unwind": "$results" },
{ "$group": {
"_id": "$_id",
"results": { "$push": "$results" },
"matched": {
"$sum": {
"$cond": [
{ "$and": [
{ "$eq": [ "$results.product", "xyz" ] },
{ "$gte": [ "$results.score", 6 ] }
]},
1,
0
]
}
}
}},
{ "$unwind": "$results" },
{ "$match": {
"$or": [
{
"results.product": "xyz",
"results.score": { "$gte": 6 }
},
{ "matched": 0 }
]
}},
{ "$group": {
"_id": "$_id",
"results": { "$push": "$results" },
"matched": { "$first": "$matched" }
}},
{ "$project": {
"results": {
"$cond": [
{ "$ne": [ "$matched", 0 ] },
"$results",
[]
]
}
}}
])
Which is pretty horrible in both design and performance. As such you are probably better off doing the filtering per document in client code instead.
You can use $filter in mongoDB 3.2
db.survey.aggregate([{
$match: {
id: 12345
}
}, {
$project: {
results: {
$filter: {
input: "$results",
as: "results",
cond:{$gt: ['$$results.score', 6]}
}
}
}
}]);
It will return all the subdocuments that have a score greater than 6. If you want to return only the first matched subdocument, then you can use the '$' operator.
You can use $redact in this way:
db.survey.aggregate( [
{ $match : { _id : 12345 }},
{ $redact: {
$cond: {
if: {
$or: [
{ $eq: [ "$_id", 12345 ] },
{ $and: [
{ $eq: [ "$product", "xyz" ] },
{ $gte: [ "$score", 6 ] }
]}
]
},
then: "$$DESCEND",
else: "$$PRUNE"
}
}
}
] );
It will $match by _id: 12345 first and then "$$PRUNE" every subdocument that does not have both "product": "xyz" and a score greater than or equal to 6. I added the condition { $eq: [ "$_id", 12345 ] } to the $cond so that it wouldn't prune the whole document before it reaches the subdocuments.
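To see why the extra _id condition matters, the descend-or-prune behaviour can be imitated with a small recursive function in plain JavaScript. This is only a simplified model of $redact, not its real implementation. The redact helper, the keep predicate and the sample document with _id 12345 are all assumptions made for this sketch.

```javascript
const isObject = (v) => v !== null && typeof v === "object" && !Array.isArray(v);

// Hypothetical model: keep(doc) === true acts like $$DESCEND, false like $$PRUNE.
function redact(doc, keep) {
  if (!keep(doc)) return undefined; // $$PRUNE: drop this (sub)document entirely
  const out = {};
  for (const [field, value] of Object.entries(doc)) {
    if (Array.isArray(value)) {
      // $$DESCEND: the condition is evaluated again for each embedded document.
      out[field] = value
        .map((v) => (isObject(v) ? redact(v, keep) : v))
        .filter((v) => v !== undefined);
    } else if (isObject(value)) {
      const child = redact(value, keep);
      if (child !== undefined) out[field] = child;
    } else {
      out[field] = value;
    }
  }
  return out;
}

// Assumed sample document.
const survey = {
  _id: 12345,
  results: [
    { product: "abc", score: 10 },
    { product: "xyz", score: 7 },
    { product: "xyz", score: 5 },
  ],
};

// Same condition as the $cond: the root passes via _id, subdocuments via product and score.
const keep = (d) => d._id === 12345 || (d.product === "xyz" && d.score >= 6);

console.log(JSON.stringify(redact(survey, keep)));
// {"_id":12345,"results":[{"product":"xyz","score":7}]}
```

Without the _id clause, the predicate would already fail at the root, which has no product or score, and the whole document would be pruned.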