MongoDB $divide on aggregate output

Is it possible to perform arithmetic on fields that were already computed by an aggregation?
I have something like this:
([
{
"$unwind" : {
"path" : "$users"
}
},
{
"$match" : {
"users.r" : {
"$exists" : true
}
}
},
{
"$group" : {
"_id" : "$users.r",
"count" : {
"$sum" : 1
}
}
},
])
Which gives an output as:
{ "_id" : "A", "count" : 7 }
{ "_id" : "B", "count" : 49 }
Now I want to divide 7 by 49 or vice versa.
Is there a possibility to do that? I tried $project and $divide but had no luck.
Any help would be really appreciated.
Thank you,

From your question, it looks like you expect exactly two result rows. In that case I assume users.r can take only two values (apart from null).
The simplest approach is to do this arithmetic in JavaScript (if you're working in the mongo console) or, if you're calling MongoDB from a program, in whatever language that program uses, e.g.
var results = db.collection.aggregate([theAggregatePipelineQuery]).toArray();
print(results[0].count/results[1].count);
EDIT: I am sharing an alternative to above approach because OP commented about the constraint of not using javascript code and the need to be done only via query. Here it is
([
{ /** your existing aggregation stages, which produce the two rows described in the question, each with a count field **/ },
{ $group: { "_id": 1, firstCount: { $first: "$count" }, lastCount: { $last: "$count" } } },
{ $project: { finalResult: { $divide: ['$firstCount', '$lastCount'] } } }
])
// The returned document has your answer under the `finalResult` field
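Both approaches above depend on the order in which the two grouped rows come back, and $group does not guarantee an output order unless a $sort stage follows it. Here is a minimal sketch of an order-independent variant in plain JavaScript, assuming the results have already been fetched into an array. The `results` literal just copies the sample output from the question, and `countsById` is a name made up for illustration:

```javascript
// Sample output copied from the question; in practice this would come from
// db.collection.aggregate([...]).toArray().
const results = [
  { _id: "A", count: 7 },
  { _id: "B", count: 49 }
];

// Index the counts by group key so the division does not depend on row order.
const countsById = {};
for (const row of results) {
  countsById[row._id] = row.count;
}

// Ratio of group A to group B.
const ratio = countsById["A"] / countsById["B"];
console.log(ratio);
```

If the division has to stay inside the pipeline, adding a $sort on _id before the final $group should give $first and $last a defined order.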

Related

How to improve aggregate pipeline

I have pipeline
[
{'$match':{templateId:ObjectId('blabla')}},
{
"$sort" : {
"_id" : 1
}
},
{
"$facet" : {
"paginatedResult" : [
{
"$skip" : 0
},
{
"$limit" : 100
}
],
"totalCount" : [
{
"$count" : "count"
}
]
}
}
]
Index:
"key" : {
"templateId" : 1,
"_id" : 1
}
The collection has 10.6M documents, 500k of which have the required templateId.
The aggregation uses the index:
"planSummary" : "IXSCAN { templateId: 1, _id: 1 }",
But the request takes 16 seconds. What did I do wrong? How can I speed it up?
To start, get rid of the $sort stage. After the equality match on templateId, the { templateId: 1, _id: 1 } index already returns the documents ordered by _id, so the stage asks for a sort of 500k documents that are already sorted.
Next, avoid the $skip approach. For high page numbers the server has to step over a large number of documents, up to almost 500k (index entries rather than documents, but still).
I suggest an alternative approach:
For the first page, pick an id that you know for sure sorts before every entry in the index. For example, if you know that you have no entries dated 2019 or earlier, you can build a lower bound like this:
var pageStart = ObjectId.fromDate(new Date("2020/01/01"))
Then, your match operator should look like this:
{'$match' : {templateId:ObjectId('blabla'), _id: {$gt: pageStart}}}
For the next pages, keep track of the last document of the previous page: if the rightmost document _id is x in a certain page, then pageStart should be x for the next page.
So your pipeline may look like this:
[
{'$match' : {templateId:ObjectId('blabla'), _id: {$gt: pageStart}}},
{
"$facet" : {
"paginatedResult" : [
{
"$limit" : 100
}
]
}
}
]
Note that the $skip stage is now gone from the $facet stage as well.
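The page-to-page bookkeeping described above can be sketched in plain JavaScript. This is only an in-memory illustration, not a MongoDB call: the `docs` array stands in for documents already ordered by _id, numeric ids replace ObjectIds, and `fetchPage` and `collectPages` are hypothetical helper names.

```javascript
// In-memory stand-in for documents that the index returns ordered by _id.
const docs = Array.from({ length: 10 }, (_, i) => ({ _id: i + 1 }));

// Hypothetical helper: return up to pageSize documents with _id > lastId,
// mirroring {$match: {_id: {$gt: pageStart}}} followed by {$limit: pageSize}.
function fetchPage(sortedDocs, lastId, pageSize) {
  const page = [];
  for (const doc of sortedDocs) {
    if (doc._id > lastId) {
      page.push(doc);
      if (page.length === pageSize) break;
    }
  }
  return page;
}

// Walk through all pages, carrying the last _id of each page forward.
function collectPages(sortedDocs, pageSize) {
  const pages = [];
  let lastId = -Infinity; // lower bound that sorts before every id
  while (true) {
    const page = fetchPage(sortedDocs, lastId, pageSize);
    if (page.length === 0) break;
    pages.push(page.map(d => d._id));
    lastId = page[page.length - 1]._id;
  }
  return pages;
}

const pages = collectPages(docs, 4);
console.log(JSON.stringify(pages));
```

The point of the design is that each request starts reading at the previous page's last _id, so the cost per page should stay roughly constant instead of growing with the page number.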

Count of a nested value of all entries in mongodb collection

I have a collection named outbox which has this kind of structure
"_id" :ObjectId("5a94e02bb0445b1cc742d795"),
"track" : {
"added" : {
"date" : ISODate("2020-12-03T08:48:51.000Z")
}
},
"provider_status" : {
"job_number" : "",
"count" : {
"total" : 1,
"sent" : 0,
"delivered" : 0,
"failed" : 0
},
"delivery" : []
}
I have two tasks. First, I want the sums of "total", "sent", "delivered" and "failed" across all entries in the collection, regardless of their ObjectId. Second, I want the same sums only for a given ObjectId between a start and an end date.
I am trying to find total using this query
db.outbox.aggregate(
{ $group: { _id : null, sum : { $sum: "$provider_status.count.total" } } });
But I am getting this error as shown
Since I do not have much experience with MongoDB, I don't know how to do these two tasks. I need help here.
It looks like you are running this in Robo 3T.
You need to wrap the pipeline stages in an array, like this:
db.test.aggregate([ //See here
{
$group: {
_id: null,
sum: {
$sum: "$provider_status.count.total"
}
}
}
])//See here
The playground does not have this problem, because it handles this before submitting the pipeline to the server.
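For the other counters and the date range, one possible shape is sketched below. It only builds the pipeline as a plain JavaScript array. It assumes the range applies to track.added.date and is half-open (start inclusive, end exclusive). `buildCountsPipeline` is a hypothetical helper name, and the dates are example values.

```javascript
// Hypothetical helper: build a pipeline that sums all four counters,
// optionally restricted to a date range on track.added.date.
function buildCountsPipeline(start, end) {
  const pipeline = [];
  if (start && end) {
    // Assumption: the range filter applies to track.added.date.
    pipeline.push({
      $match: { "track.added.date": { $gte: start, $lt: end } }
    });
  }
  pipeline.push({
    $group: {
      _id: null,
      total: { $sum: "$provider_status.count.total" },
      sent: { $sum: "$provider_status.count.sent" },
      delivered: { $sum: "$provider_status.count.delivered" },
      failed: { $sum: "$provider_status.count.failed" }
    }
  });
  return pipeline;
}

// Whole collection, and an example date range.
const allTime = buildCountsPipeline();
const ranged = buildCountsPipeline(
  new Date("2020-12-01T00:00:00Z"),
  new Date("2021-01-01T00:00:00Z")
);
console.log(JSON.stringify(ranged, null, 2));
```

In the shell, the result would be passed as db.outbox.aggregate(ranged). Because the helper already returns an array, the wrapping problem described above does not come up.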

Sort inside cond and if mongodb

I want to sort my aggregation only if a condition is met.
This is what I have so far:
{
$cond: {
if: { $gte: [sort, "like"] },
then: { $divide: { $sort : { total_likes : -1 } } },
else: { $divide: '' }
}
}
sort is a variable that comes from a query parameter.
I want to sort by total_likes, only if sort is "likes". If it's not, I want to leave it alone.
First of all, @schoenbl, if you want to filter documents by a condition in an aggregation, use the $match stage. It passes on only the documents that satisfy the condition.
if: { $gte: [sort, "like"] }
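$gte is an ordering comparison (greater than or equal), not an equality test, so it is the wrong operator for checking that sort is "like". For string comparison in aggregation expressions you can use $eq, or one of these two operators:
$cmp for a case-sensitive comparison.
$strcasecmp for a case-insensitive comparison.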
then: { $divide: { $sort : { total_likes : -1 } } },
Next, you are using the $divide operator. I don't know what you need it for, but the syntax is wrong;
refer to the $divide documentation for details.
Also, you are trying to sort inside $cond. $cond is an expression evaluated against a single document, whereas sorting has to compare documents with each other, so it is not possible there.
Based on what you need, I have prepared the following stages, which return the documents whose "sort" field equals "like", sorted by total_likes.
{$match:{"sort":"like"}},{$sort:{"total_likes":-1}}
Output:
{ "_id" : ObjectId("5d50569fbe39828b4a22fba2"), "name" : "kyle", "sort" : "like", "total_likes" : 5 }
{ "_id" : ObjectId("5d5056a6be39828b4a22fba3"), "name" : "jack", "sort" : "like", "total_likes" : 2 }
{ "_id" : ObjectId("5d5056abbe39828b4a22fba4"), "name" : "john", "sort" : "like", "total_likes" : 1 }
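The answer above treats sort as a document field. If sort is instead a variable in application code, as the question describes, the decision can be made while the pipeline is being built. Here is a minimal sketch, assuming the pipeline is assembled in JavaScript. `withOptionalSort` is a hypothetical helper, and the base stage is only a placeholder.

```javascript
// Hypothetical helper: append a $sort stage only when the query parameter asks for it.
function withOptionalSort(baseStages, sort) {
  const pipeline = baseStages.slice(); // copy, so the caller's array is untouched
  if (sort === "like") {
    pipeline.push({ $sort: { total_likes: -1 } });
  }
  return pipeline;
}

// Placeholder base stage; the real pipeline would go here.
const base = [{ $match: { name: { $exists: true } } }];

const sorted = withOptionalSort(base, "like");
const unsorted = withOptionalSort(base, "recent");
console.log(sorted.length, unsorted.length);
```

With this approach no conditional operator is needed inside the pipeline at all: the $sort stage is either present or absent.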

Mongodb regex in aggregation using reference to field value

note: I'm using Mongodb 4 and I must use aggregation, because this is a step of a bigger aggregation
Problem
How can I find documents in a collection where one field starts with the value of another field in the same document?
Let's start with this collection:
db.regextest.insert([
{"first":"Pizza", "second" : "Pizza"},
{"first":"Pizza", "second" : "not pizza"},
{"first":"Pizza", "second" : "not pizza"}
])
and an example query for exact match:
db.regextest.aggregate([
{
$match : { $expr: { $eq: [ "$first" ,"$second" ] } } }
])
I will get a single document
{
"_id" : ObjectId("5c49d44329ea754dc48b5ace"),
"first" : "Pizza", "second" : "Pizza"
}
And this is good.
But how can I do the same with startsWith? My plan was to use a regex, but that does not seem to be supported in aggregation so far.
With a find and a custom JavaScript function it works fine:
db.regextest.find().forEach(
function(obj){
if (obj.first.startsWith(obj.second)){
print(obj);
}
}
)
And returns correctly:
{
"_id" : ObjectId("5c49d44329ea754dc48b5ace"),
"first" : "Pizza",
"second" : "Pizza"
}
How can I get the same result with the aggregation framework?
One idea is to run the existing aggregation pipeline, $out to a temporary collection, and then run the find above to get the matches I'm looking for. This seems like a workaround; I hope someone has a better idea.
Edit: here is the solution. Note that $gt: -1 matches when second occurs anywhere in first; to require a prefix, test that the index equals 0.
db.regextest.aggregate([{
$project : {
"first" : 1,
"second" : 1,
fieldExists : {
$indexOfBytes : ['$first', '$second' , 0]
}
}
}, {
$match : {
fieldExists : {
$gt : -1
}
}
}
]);
The simplest way is to use $expr, first available in 3.6, like this:
{
$match: {
$expr: {
$eq: [
'$second',
{
$substrCP: ['$first', 0, { $strLenCP: '$second' }]
}
]
}
}
}
This compares the string in field second with the first N characters of first, where N is the length of second. If they are equal, then first starts with second.
MongoDB 4.2 adds regular-expression support to aggregation expressions ($regexMatch), but a starts-with check is much simpler and doesn't need regular expressions.
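The same prefix comparison can be tried outside MongoDB. The sketch below mirrors the expression in plain JavaScript, counting code points the way $strLenCP does. `startsWithField` is a hypothetical name, and the documents are the ones inserted in the question.

```javascript
// Hypothetical helper mirroring the $expr above: take the first N code points
// of `first`, where N is the code-point length of `second`, and compare.
function startsWithField(first, second) {
  const firstChars = Array.from(first);   // Array.from splits by code point
  const secondChars = Array.from(second);
  const prefix = firstChars.slice(0, secondChars.length).join("");
  return prefix === second;
}

// Documents from the question.
const docs = [
  { first: "Pizza", second: "Pizza" },
  { first: "Pizza", second: "not pizza" },
  { first: "Pizza", second: "not pizza" }
];

const matches = docs.filter(d => startsWithField(d.first, d.second));
console.log(matches);
```

Only the first document passes. Unlike a contains check, a value that appears later in the string (for example "izza" inside "Pizza") is rejected.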

MongoDB distinct problems

Here is one of my documents in JSON format:
{
"_id" : ObjectId("5bfdb412a80939b6ed682090"),
"accounts" : [
{
"_id" : ObjectId("5bf106eee639bd0df4bd8e05"),
"accountType" : "DDA",
"productName" : "DDA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df8"),
"accountType" : "VSA",
"productName" : "VSA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df9"),
"accountType" : "VSA",
"productName" : "VSA2"
}
]
}
I want to write a query that returns every productName (without duplicates) where accountType is VSA.
I wrote this mongo query:
db.Collection.distinct("accounts.productName", {"accounts.accountType": "VSA" })
I expect: ['VSA1', 'VSA2']
I get: ['DDA1', 'VSA1', 'VSA2']
Does anybody know why the query doesn't work with distinct?
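The second parameter of the distinct method represents:
A query that specifies the documents from which to retrieve the distinct values.
The filter selects whole documents, not individual array elements. Your document contains at least one account with accountType "VSA", so the whole document matches the condition "accounts.accountType": "VSA", and distinct then collects productName from every element of its accounts array, including DDA1.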
To fix that you have to use Aggregation Framework and $unwind nested array before you apply the filtering and then you can use $group with $addToSet to get unique values. Try:
db.col.aggregate([
{
$unwind: "$accounts"
},
{
$match: {
"accounts.accountType": "VSA"
}
},
{
$group: {
_id: null,
uniqueProductNames: { $addToSet: "$accounts.productName" }
}
}
])
which prints:
{ "_id" : null, "uniqueProductNames" : [ "VSA2", "VSA1" ] }
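The difference between the two queries can be reproduced without a database. The sketch below is a plain JavaScript imitation of each behaviour on the sample document, not the real distinct or aggregate implementation. The _id fields are left out for brevity, and the variable names are made up for illustration.

```javascript
// Sample document from the question (ids omitted).
const collection = [
  {
    accounts: [
      { accountType: "DDA", productName: "DDA1" },
      { accountType: "VSA", productName: "VSA1" },
      { accountType: "VSA", productName: "VSA2" }
    ]
  }
];

// distinct with a query filter: select whole documents that have at least one
// VSA account, then collect productName from every element of their arrays.
const distinctWithFilter = [...new Set(
  collection
    .filter(d => d.accounts.some(a => a.accountType === "VSA"))
    .flatMap(d => d.accounts.map(a => a.productName))
)];

// $unwind -> $match -> $addToSet: flatten the arrays first, then filter the
// individual accounts, then deduplicate.
const unwindThenMatch = [...new Set(
  collection
    .flatMap(d => d.accounts)
    .filter(a => a.accountType === "VSA")
    .map(a => a.productName)
)];

console.log(distinctWithFilter, unwindThenMatch);
```

The first result still contains DDA1 because the filtering happened at document level. In the second, the filter runs per account, which is what the $unwind stage makes possible.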