How to express complex $sum grouping expression in Spring Data MongoDB

I have the following MongoDB aggregation query, which works well in MongoDB:
[
{ $match: { "myfield": "X" } },
{ $group: {
_id: { myfield: "$myfield" },
count: { $sum: 1 },
lt5w: { $sum: { $cond:{ if: { $gte: [ "$myDate", new Date(ISODate().getTime() - 1000 * 60 * 60 * 24 * 7 * 5) ] }, then: 1, else: 0 } } },
gt12w: { $sum: { $cond:{ if: { $gte: [ new Date(ISODate().getTime() - 1000 * 60 * 60 * 24 * 7 * 12), "$myDate" ] }, then: 1, else: 0 } } }
}
}
]
How can I express this complex $sum operation using Spring Data MongoDB API?
group("myfield")
.sum("???").as("lt5w")
.sum("???").as("gt12w")
.count().as("count"),
The sum() method only accepts a plain string (a field reference).
According to this ticket (closed)
https://jira.spring.io/browse/DATAMONGO-784
the aggregation should support complex operations like $cmp and $cond
Update: It seems that the sum(AggregationExpression expr) overload was forgotten here. min(), max() and first() all have that overload.

Filed a ticket and the issue is fixed!
https://jira.spring.io/browse/DATAMONGO-1784
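For reference, the two $cond sums in the question's pipeline are plain conditional counts against two cutoff dates. The following is a minimal plain-JavaScript sketch of that logic only, not of the Spring Data API; the function name countByAge, the fixed reference date, and the sample documents are assumptions made up for illustration.

```javascript
const WEEK_MS = 1000 * 60 * 60 * 24 * 7;

// Mirrors the $group stage: count all docs, those newer than 5 weeks,
// and those older than 12 weeks, relative to `now`.
function countByAge(docs, now) {
  const fiveWeeksAgo = now.getTime() - 5 * WEEK_MS;
  const twelveWeeksAgo = now.getTime() - 12 * WEEK_MS;
  const result = { count: 0, lt5w: 0, gt12w: 0 };
  for (const doc of docs) {
    const t = doc.myDate.getTime();
    result.count += 1;                            // $sum: 1
    result.lt5w += t >= fiveWeeksAgo ? 1 : 0;     // $gte: ["$myDate", cutoff]
    result.gt12w += twelveWeeksAgo >= t ? 1 : 0;  // $gte: [cutoff, "$myDate"]
  }
  return result;
}

// Hypothetical sample data with a fixed "now" so the result is reproducible.
const referenceNow = new Date("2024-06-01T00:00:00Z");
const sampleDocs = [
  { myfield: "X", myDate: new Date("2024-05-20T00:00:00Z") }, // under 5 weeks old
  { myfield: "X", myDate: new Date("2024-04-01T00:00:00Z") }, // between 5 and 12 weeks old
  { myfield: "X", myDate: new Date("2024-01-01T00:00:00Z") }, // over 12 weeks old
];
const ageCounts = countByAge(sampleDocs, referenceNow);
console.log(ageCounts); // { count: 3, lt5w: 1, gt12w: 1 }
```

Computing the cutoffs once on the client, as done here, is also what the shell query does: the new Date(...) expressions are evaluated before the pipeline is sent to the server.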

Related

Convert mongoDB Aggregation Query to Spring data mongo template Aggregation

I am fairly new to MongoDb and Spring-data-Mongo.
After doing a lot of research I was able to get the desired result in MongoDB using the below query, but now I am finding it very difficult to implement the same logic in the Spring-data-Mongo template.
The logic is pretty simple:
I have 2 date fields and 2 integer fields in the document.
expireLastModifiedD (Integer, represents days), expireLastUsedD (Integer, represents days), lastModified (Date type), lastUsed (Date type).
I need to find Documents that satisfy the below expression.
lastModified+expireLastModifiedD < NOW && lastUsed + expireLastUsedD < NOW
I created the following MongoDB query:
[
{
$project: {
expireLastModifiedD: 1,
expireLastUsedD: 1,
lastModified: 1,
lastUsed: 1,
NowSubtractLastModified: {
$toInt: {
$divide: [
{
$subtract: [
new ISODate(),
"$lastModified"
]
},
1000 * 60 * 60 * 24
]
}
},
NowSubtractLastused: {
$toInt: {
$divide: [
{
$subtract: [
new ISODate(),
"$lastUsed"
]
},
1000 * 60 * 60 * 24
]
}
}
}
},
{
$project: {
expireLastModifiedD: 1,
expireLastUsedD: 1,
lastModified: 1,
lastUsed: 1,
NowSubtractLastModified: 1,
NowSubtractLastused: 1,
isExpireLastModifiedDLTNowSubtractLastModified: {
$lt: [
"$expireLastModifiedD",
"$NowSubtractLastModified"
]
},
isExpireLastUsedDLTNowSubtractLastused: {
$lt: [
"$expireLastUsedD",
"$NowSubtractLastused"
]
}
}
},
{
$match: {
isExpireLastModifiedDLTNowSubtractLastModified: true,
isExpireLastUsedDLTNowSubtractLastused: true
}
}
]
I need help creating the above MongoDB query with the Spring Data MongoTemplate using Aggregation.
After a lot of research, I figured it out.
Instant now = Instant.now();
Timestamp current = Timestamp.from(now);
ProjectionOperation projectionOperation = Aggregation.project("lastUsed", "expireLastUsedD", "lastModified", "expireLastModifiedD")
.andExpression("([0] - lastModified)", current).divide(24 * 60 * 60 * 1000).as("lastModifiedFromNowD")
.andExpression("([0] - lastUsed)", current).divide(24 * 60 * 60 * 1000).as("lastUsedFromNowD");
ProjectionOperation projectionOperation1 = Aggregation.project("lastUsed", "expireLastUsedD", "lastModified", "expireLastModifiedD", "lastModifiedFromNowD", "lastUsedFromNowD")
.andExpression("expireLastModifiedD < lastModifiedFromNowD").as("isExpLstModLTLastModFromNowD")
.andExpression("expireLastUsedD < lastUsedFromNowD").as("isExpLstUsdLtLstUsdFromNowD");
MatchOperation matchOperation = Aggregation.match(new Criteria().andOperator(
Criteria.where("isExpLstModLTLastModFromNowD").is(true), Criteria.where("isExpLstUsdLtLstUsdFromNowD").is(true)));
TypedAggregation<UrlMap> agg = Aggregation.newAggregation(UrlMap.class,
projectionOperation, projectionOperation1, matchOperation);
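As a cross-check of the rule the pipeline implements, the same expiry test can be written in a few lines of plain JavaScript. This is only a sketch of the logic; the helper names daysSince and isExpired and the sample documents are hypothetical, and Math.trunc stands in for the truncation done by $toInt in the shell query.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Whole days elapsed between `date` and `now`, truncated like $toInt.
function daysSince(date, now) {
  return Math.trunc((now.getTime() - date.getTime()) / DAY_MS);
}

// True when both expiry windows have elapsed, as in the final $match stage.
function isExpired(doc, now) {
  return (
    doc.expireLastModifiedD < daysSince(doc.lastModified, now) &&
    doc.expireLastUsedD < daysSince(doc.lastUsed, now)
  );
}

// Hypothetical sample documents, checked against a fixed "now".
const checkTime = new Date("2024-06-01T00:00:00Z");
const staleDoc = {
  expireLastModifiedD: 10,
  expireLastUsedD: 5,
  lastModified: new Date("2024-05-01T00:00:00Z"), // 31 days ago
  lastUsed: new Date("2024-05-15T00:00:00Z"),     // 17 days ago
};
const freshDoc = {
  expireLastModifiedD: 10,
  expireLastUsedD: 5,
  lastModified: new Date("2024-05-01T00:00:00Z"), // 31 days ago
  lastUsed: new Date("2024-05-30T00:00:00Z"),     // 2 days ago
};
console.log(isExpired(staleDoc, checkTime)); // true
console.log(isExpired(freshDoc, checkTime)); // false
```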

Is it possible to list in mongodb the list of elements whose value is less than 10% of another field?

I basically have a database where I record motorcycles and their mileage.
{
"motorcycle":"A",
"current_km":4600,
"review_km":5000
},
{
"motorcycle":"B",
"current_km":4000,
"review_km":5000
},
{
"motorcycle":"C",
"current_km":4900,
"review_km":5000
},
{
"motorcycle":"D",
"current_km":3000,
"review_km":5000
}
I have a field called current_km that holds the current mileage and another field called review_km that specifies the mileage at which the review should be done. A motorcycle should be listed once its current mileage (current_km) comes within 10% of the review mileage (review_km).
So I would like to list the elements where:
current_km is greater than:
(review_km - ( review_km * 0.10))
for example:
current_km = 4600;
review_km = 5000;
result = 5000 - (5000 * 0.10);
4600 (current_km) >= 4500 (result) // in this case it is shown
In my database it would show the results of motorcycles A and C
how can I do it? I don't know if it is possible to do it in mongodb directly.
You need an aggregation with $subtract and $multiply:
$addFields adds a new field; here it generates the result field from the equation (review_km - (review_km * 0.10)) using $subtract and $multiply
$match evaluates the condition current_km >= result inside $expr and returns the document if it holds
db.collection.aggregate([
{
$addFields: {
result: {
$subtract: [
"$review_km",
{
$multiply: [
"$review_km",
0.10
]
}
]
}
}
},
{
$match: {
$expr: {
$gte: [
"$current_km",
"$result"
]
}
}
}
])
Working Playground: https://mongoplayground.net/p/s2qenvuzLKF
Shorter version
If you don't want the result field in the response, put the whole condition in $match; $addFields is then no longer needed
db.collection.aggregate([
{
$match: {
$expr: {
$gte: [
"$current_km",
{
$subtract: [
"$review_km",
{
$multiply: [
"$review_km",
0.10
]
}
]
}
]
}
}
}
])
Working Playground: https://mongoplayground.net/p/fii__3tTika
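To sanity-check the threshold arithmetic outside the database, the same condition can be applied to the sample documents from the question in plain JavaScript. This is just a sketch of the comparison; the function name dueForReview is made up for illustration.

```javascript
// Sample documents copied from the question.
const motorcycles = [
  { motorcycle: "A", current_km: 4600, review_km: 5000 },
  { motorcycle: "B", current_km: 4000, review_km: 5000 },
  { motorcycle: "C", current_km: 4900, review_km: 5000 },
  { motorcycle: "D", current_km: 3000, review_km: 5000 },
];

// Same test as the $expr: current_km >= review_km - (review_km * 0.10)
function dueForReview(bike) {
  return bike.current_km >= bike.review_km - bike.review_km * 0.10;
}

const due = motorcycles.filter(dueForReview).map((bike) => bike.motorcycle);
console.log(due); // [ 'A', 'C' ]
```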

MongoDB aggregate query to SpringDataMongoDB

I have the MongoDB aggregate query below and would like to have its equivalent Spring Data MongoDB query.
MongoDB Aggregate Query :
db.response.aggregate(
// Pipeline
[
// Stage 1 : Group by Emotion & Category
{
$group: {
_id: {
emotion: "$emotion",
category: "$category"
},
count: {
$sum: 1
},
point: {
$first: '$point'
}
}
},
// Stage 2 : Total Points
{
$addFields: {
"totalPoint": {
$multiply: ["$point", "$count"]
}
}
},
// Stage3 : Group By Category - Overall Response Total & totalFeedbacks
{
$group: {
_id: '$_id.category',
totalFeedbacks: {
$sum: "$count"
},
overallResponseTotal: {
$sum: "$totalPoint"
}
}
},
// Stage4 - Overall Response Total & totalFeedbacks
{
$project: {
_id: 1,
overallResponseTotal: '$overallResponseTotal',
maxTotalFrom: {
"$multiply": ["$totalFeedbacks", 3.0]
},
percent: {
"$multiply": [{
"$divide": ["$overallResponseTotal", "$maxTotalFrom"]
}, 100.0]
}
}
},
// Stage5 - Percentage Monthwise
{
$project: {
_id: 1,
overallResponseTotal: 1,
maxTotalFrom: 1,
percent: {
"$multiply": [{
"$divide": ["$overallResponseTotal", "$maxTotalFrom"]
}, 100.0]
}
}
}
]
);
I have tried to write its equivalent in Spring Data but got stuck at Stage 2 on how to convert "$addFields" to Java code. I searched for it on multiple sites but couldn't find anything useful. Please see my equivalent Java code for Stage 1.
//Stage 1 - Group by Emotion and Category and return its count
GroupOperation groupEmotionAndCategory = Aggregation.group("emotion","category").count().as("count").first("point")
.as("point");
Aggregation aggregation = Aggregation.newAggregation(groupEmotionAndCategory);
AggregationResults<CategoryWiseEmotion> output = mongoTemplate.aggregate(aggregation, Response.class, CategoryWiseEmotion.class);
Any help would be highly appreciated.
$addFields is not yet supported by Spring Data MongoDB.
One workaround is to pass the raw aggregation pipeline to Spring.
But since you have a limited number of fields after stage 1, you could also downgrade stage 2 to a projection:
{
$project: {
// _id is included by default
"count" : 1, // include count
"point" : 1, // include point
"totalPoint": {
$multiply: ["$point", "$count"] // compute totalPoint
}
}
}
I haven't tested it myself, but this projection should translate to something like:
ProjectionOperation p = project("count", "point").and("point").multiply(Fields.field("count")).as("totalPoint");
Then you can translate stages 3, 4 and 5 similarly and pass the whole pipeline to Aggregation.newAggregation(), as in your stage 1 code.
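To see what stages 1 and 2 produce, here is a plain-JavaScript sketch that imitates the $group followed by the $project replacement for $addFields. It runs without a database; the function names and the sample responses are hypothetical and only serve to illustrate the totalPoint computation.

```javascript
// Stage 1: group by (emotion, category), counting docs and keeping the first point.
function groupByEmotionAndCategory(responses) {
  const groups = new Map();
  for (const r of responses) {
    const key = r.emotion + "|" + r.category;
    if (!groups.has(key)) {
      groups.set(key, {
        _id: { emotion: r.emotion, category: r.category },
        count: 0,
        point: r.point, // $first: "$point"
      });
    }
    groups.get(key).count += 1; // $sum: 1
  }
  return [...groups.values()];
}

// Stage 2: keep count and point, and compute totalPoint = point * count.
function withTotalPoint(group) {
  return {
    _id: group._id,
    count: group.count,
    point: group.point,
    totalPoint: group.point * group.count,
  };
}

// Hypothetical sample responses.
const responses = [
  { emotion: "happy", category: "food", point: 3 },
  { emotion: "happy", category: "food", point: 3 },
  { emotion: "sad", category: "food", point: 1 },
];
const stage2 = groupByEmotionAndCategory(responses).map(withTotalPoint);
console.log(stage2.map((g) => g.totalPoint)); // [ 6, 1 ]
```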

In Mongo, how to write a search query to find documents based on the time of day of a Date object?

We have a collection named Incident with a field StartTime (of type Date).
Every day, whenever an incident condition is met, a new document is created and inserted into the collection.
We need to get all incidents that fall between 10 PM and 6 AM (i.e. from late evening to early morning).
But I am having trouble writing a query for this use case.
Since we have a Date object, I am able to write a query to search for documents between two dates.
How do I write a query that searches based on the time of day of a Date object?
Sample Data:
"StartTime" : ISODate("2015-10-16T18:15:14.211Z")
It's just not a good idea. But basically you apply the date aggregation operators:
db.collection.aggregate([
{ "$redact": {
"$cond": {
"if": {
"$or": [
{ "$gte": [{ "$hour": "$StartTime" }, 22] },
{ "$lt": [{ "$hour": "$StartTime" }, 6 ] }
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
Using $redact, that will only return, or $$KEEP, the documents that meet either condition for the $hour extracted from the Date, and $$PRUNE, or "remove" from the results, those that do not.
A bit shorter with MongoDB 3.6 and onwards, but really no different:
db.collection.find({
"$expr": {
"$or": [
{ "$gte": [{ "$hour": "$StartTime" }, 22] },
{ "$lt": [{ "$hour": "$StartTime" }, 6 ] }
]
}
})
Overall, not a good idea because the statement needs to scan the whole collection and calculate that logical condition.
A better way is to actually "store" the "time" as a separate field:
var ops = [];
db.collection.find().forEach(doc => {
// Get milliseconds from start of day
let timeMillis = doc.StartTime.valueOf() % (1000 * 60 * 60 * 24);
ops.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": { "$set": { timeMillis } }
}
});
if ( ops.length > 1000 ) {
db.collection.bulkWrite(ops);
ops = [];
}
})
if ( ops.length > 0 ) {
db.collection.bulkWrite(ops);
ops = [];
}
Then you can simply query with something like:
var start = 22 * ( 1000 * 60 * 60 ), // 10PM
end = 6 * ( 1000 * 60 * 60 ); // 6AM
db.collection.find({
"$or": [
{ "timeMillis": { "$gte": start } },
{ "timeMillis": { "$lt": end } }
]
});
And that field can actually be indexed and so quickly and efficiently return results.
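The stored value and the range check can be verified in plain JavaScript before running the bulk update. The sketch below uses the sample StartTime from the question; the helper names timeOfDayMillis and isOvernight are made up for illustration, and the calculation is in UTC, just like the modulo on valueOf() above.

```javascript
const DAY_MS = 1000 * 60 * 60 * 24;
const HOUR_MS = 1000 * 60 * 60;

// Milliseconds since the start of the (UTC) day, as stored in timeMillis.
function timeOfDayMillis(date) {
  return date.valueOf() % DAY_MS;
}

// Same condition as the $or query: at or after 10PM, or before 6AM.
function isOvernight(date) {
  const t = timeOfDayMillis(date);
  return t >= 22 * HOUR_MS || t < 6 * HOUR_MS;
}

const sampleStart = new Date("2015-10-16T18:15:14.211Z"); // from the question
console.log(timeOfDayMillis(sampleStart)); // 65714211
console.log(isOvernight(sampleStart));     // false (18:15 UTC)
console.log(isOvernight(new Date("2015-10-16T23:30:00.000Z"))); // true
console.log(isOvernight(new Date("2015-10-17T02:00:00.000Z"))); // true
```

If the incidents should be classified by local time rather than UTC, the offset would have to be applied before taking the modulo.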

Mongo $subtract date doesn't work in aggregation $match block

I am creating a mongo aggregation query which uses a $subtract operator in my $match block, as shown in the code below.
This query doesn't work:
db.coll.aggregate(
[
{
$match: {
timestamp: {
$gte: {
$subtract: [new Date(), 24 * 60 * 60 * 1000]
}
}
}
},
{
$group: {
_id: {
timestamp: "$timestamp"
},
total: {
$sum: 1
}
}
},
{
$project: {
_id: 0,
timestamp: "$_id.timestamp",
total: "$total",
}
},
{
$sort: {
timestamp: -1
}
}
]
)
However, this second query works:
db.coll.aggregate(
[
{
$match: {
timestamp: {
$gte: new Date(new Date() - 24 * 60 * 60 * 1000)
}
}
},
{
$group: {
_id: {
timestamp: "$timestamp"
},
total: {
$sum: 1
}
}
},
{
$project: {
_id: 0,
timestamp: "$_id.timestamp",
total: "$total",
}
},
{
$sort: {
timestamp: -1
}
}
]
)
I need to use $subtract on my $match block so I can't use the last query.
As of mongodb 3.6 you can use $subtract in the $match stage via the $expr. Here's the docs: https://docs.mongodb.com/manual/reference/operator/query/expr/
I was able to get a query like what you're describing via this $expr and a new system variable in mongodb 4.2 called $$NOW. Here is my query, which gives me orders that have been created within the last 4 hours:
[
{ $match:
{ $expr:
{ $gt: [
"$_created_at",
{ $subtract: [ "$$NOW", 4 * 60 * 60 * 1000] } ]
}
}
}
]
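If the field name or the time window varies, the stage above can be built by a small helper. This is a sketch that only constructs the stage document in plain JavaScript; the function name recentMatchStage is hypothetical, and the resulting stage still needs MongoDB 4.2 or later on the server for $$NOW.

```javascript
// Builds a $match stage that keeps documents whose `field` lies within the
// last `hours` hours, evaluated on the server via $expr and $$NOW.
function recentMatchStage(field, hours) {
  return {
    $match: {
      $expr: {
        $gt: ["$" + field, { $subtract: ["$$NOW", hours * 60 * 60 * 1000] }],
      },
    },
  };
}

const recentStage = recentMatchStage("_created_at", 4);
console.log(JSON.stringify(recentStage));
```

The returned object can be placed directly in the array passed to aggregate().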
Well, you cannot do that, and you are not meant to either. You also say you "need" to do this, but in reality you do not.
Pretty much all of the general aggregation operators outside of the pipeline operators are really only valid within a $project or a $group pipeline stage. Mostly within $project but certainly not in others.
A $match pipeline is really the same as a general "query" operation, so the only things valid in there are the query operators.
As for the case for your "need", any "value" that is submitted within an aggregation pipeline and particularly within a $match needs to be evaluated outside of the actual pipeline before the BSON representation is sent to the server.
The only exception is the notation that defines variables in the document, particularly "fieldnames" such as "$fieldname", and then only really in $project or $group. So that means something that "refers" to an existing value of a document, and that is something that cannot be done within any type of "query" document expression.
If you need to work with the value of another field in the document then you work it out with $project first, as in:
db.collection.aggregate([
{ "$project": {
"fieldMath": { "$subtract": [ "$fieldOne", "$fieldTwo" ] }
}},
{ "$match": { "fieldMath": { "$gt": 2 } }}
])
For any other purpose you really want to evaluate the value "outside" the pipeline.
The above answers the question you asked, but this answers the question you didn't ask.
Your pipeline doesn't make any sense since grouping on the "timestamp" alone would be unlikely to group anything since the values are of millisecond accuracy and there is likely not to be more than just a few at best for very active systems.
It appears like you are looking for the math to group by "day", which you can do like this:
db.collection.aggregate([
{ "$group": {
"_id": {
"$subtract": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]
},
"total": { "$sum": "$total" }
}}
])
That "rounds" your timestamp value to a single day and has a much better chance of "aggregating" something than you would otherwise have.
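The rounding arithmetic is easy to check in plain JavaScript. The sketch below repeats the subtract-and-modulo step on a single sample date; the function name dayBucket and the sample value are assumptions for illustration.

```javascript
const DAY_MS = 1000 * 60 * 60 * 24;

// Same arithmetic as the $group key: millis since epoch minus (millis mod one day).
function dayBucket(date) {
  const millis = date.getTime() - new Date(0).getTime();
  return millis - (millis % DAY_MS);
}

const sampleTimestamp = new Date("2015-10-16T18:15:14.211Z"); // hypothetical value
const bucket = dayBucket(sampleTimestamp);
console.log(new Date(bucket).toISOString()); // 2015-10-16T00:00:00.000Z
```

Note that the grouping key is a number of milliseconds (the start of the UTC day), not a Date.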
Or you can use the "date aggregation operators" to do much the same thing with a composite key.
So if you want to "query" then it evaluates externally. If you want to work on a value "within the document" then you must do so in either a $project or $group pipeline stage.
The $subtract operator is a projection-operator. It is only available during a $project step. So your options are:
(not recommended) Add a $project-step before your $match-step to convert the timestamp field of all documents for the following match-step. I would not recommend you to do this because this operation needs to be performed on every single document on your database and prevents the database from using an index on the timestamp field, so it could cost you a lot of performance.
(recommended) Generate the Date you want to match against in the shell / in your application. Generate a new Date() object, store it in a variable, subtract 24 hours from it and perform your 2nd query using that variable.
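A minimal sketch of the recommended option: compute the cutoff date in the application and put the resulting Date into the $match stage. The function name sinceMatchStage and the fixed reference date are hypothetical; in real code `now` would simply be new Date().

```javascript
// Builds a $match stage for documents whose timestamp lies within the last
// `windowMs` milliseconds, with the cutoff computed on the client.
function sinceMatchStage(now, windowMs) {
  const cutoff = new Date(now.getTime() - windowMs);
  return { $match: { timestamp: { $gte: cutoff } } };
}

const fixedNow = new Date("2024-06-02T12:00:00.000Z"); // fixed for reproducibility
const matchStage = sinceMatchStage(fixedNow, 24 * 60 * 60 * 1000);
console.log(matchStage.$match.timestamp.$gte.toISOString()); // 2024-06-01T12:00:00.000Z
```

Because the stage is an ordinary query on timestamp, an index on that field can still be used.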