Convert MongoDB aggregation query to Spring Data MongoTemplate Aggregation

I am fairly new to MongoDB and Spring Data MongoDB.
After a lot of research I was able to get the desired result in MongoDB using the query below, but I am finding it very difficult to implement the same logic with the Spring Data MongoTemplate.
The logic is pretty simple:
I have 2 date fields and 2 integer fields in the document:
expireLastModifiedD (Integer, represents days), expireLastUsedD (Integer, represents days), lastModified (Date type), lastUsed (Date type).
I need to find documents that satisfy the expression below.
lastModified + expireLastModifiedD < NOW && lastUsed + expireLastUsedD < NOW
I have created the following MongoDB query.
[
{
$project: {
expireLastModifiedD: 1,
expireLastUsedD: 1,
lastModified: 1,
lastUsed: 1,
NowSubtractLastModified: {
$toInt: {
$divide: [
{
$subtract: [
new ISODate(),
"$lastModified"
]
},
1000 * 60 * 60 * 24
]
}
},
NowSubtractLastused: {
$toInt: {
$divide: [
{
$subtract: [
new ISODate(),
"$lastUsed"
]
},
1000 * 60 * 60 * 24
]
}
}
}
},
{
$project: {
expireLastModifiedD: 1,
expireLastUsedD: 1,
lastModified: 1,
lastUsed: 1,
NowSubtractLastModified: 1,
NowSubtractLastused: 1,
isExpireLastModifiedDLTNowSubtractLastModified: {
$lt: [
"$expireLastModifiedD",
"$NowSubtractLastModified"
]
},
isExpireLastUsedDLTNowSubtractLastused: {
$lt: [
"$expireLastUsedD",
"$NowSubtractLastused"
]
}
}
},
{
$match: {
isExpireLastModifiedDLTNowSubtractLastModified: true,
isExpireLastUsedDLTNowSubtractLastused: true
}
}
]
I need help creating the above MongoDb query in Spring-data Mongo Template using Aggregation.

After a lot of research, I figured it out.
Instant now = Instant.now();
Timestamp current = Timestamp.from(now);
ProjectionOperation projectionOperation = Aggregation.project("lastUsed", "expireLastUsedD", "lastModified", "expireLastModifiedD")
.andExpression("([0] - lastModified)", current).divide(24 * 60 * 60 * 1000).as("lastModifiedFromNowD")
.andExpression("([0] - lastUsed)", current).divide(24 * 60 * 60 * 1000).as("lastUsedFromNowD");
ProjectionOperation projectionOperation1 = Aggregation.project("lastUsed", "expireLastUsedD", "lastModified", "expireLastModifiedD", "lastModifiedFromNowD", "lastUsedFromNowD")
.andExpression("expireLastModifiedD < lastModifiedFromNowD").as("isExpLstModLTLastModFromNowD")
.andExpression("expireLastUsedD < lastUsedFromNowD").as("isExpLstUsdLtLstUsdFromNowD");
MatchOperation matchOperation = Aggregation.match(new Criteria().andOperator(
Criteria.where("isExpLstModLTLastModFromNowD").is(true), Criteria.where("isExpLstUsdLtLstUsdFromNowD").is(true)));
TypedAggregation<UrlMap> agg = Aggregation.newAggregation(UrlMap.class,
projectionOperation, projectionOperation1, matchOperation);
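For comparison, the condition from the question (lastModified + expireLastModifiedD < NOW, and the same for lastUsed) can also be written as a single $match stage with $expr, without the two intermediate $project stages. The following is only a minimal sketch in JavaScript, assuming MongoDB 3.6 or later (for $expr). The helper name buildExpiryMatch is hypothetical, and it compares exact timestamps rather than truncated whole days, so results can differ slightly from the pipeline above.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: builds a $match stage that keeps documents whose
// lastModified and lastUsed dates, each shifted by its expiry in days,
// are both earlier than `now`.
function buildExpiryMatch(now) {
  // $add of a date and a number treats the number as milliseconds.
  const expiresAt = (dateField, daysField) => ({
    $add: [`$${dateField}`, { $multiply: [`$${daysField}`, MS_PER_DAY] }],
  });
  return {
    $match: {
      $expr: {
        $and: [
          { $lt: [expiresAt("lastModified", "expireLastModifiedD"), now] },
          { $lt: [expiresAt("lastUsed", "expireLastUsedD"), now] },
        ],
      },
    },
  };
}

const pipeline = [buildExpiryMatch(new Date())];
console.log(JSON.stringify(pipeline, null, 2));
```

In the mongo shell or a driver, the array would be passed to aggregate() on the collection. Expressing the same stage through the Spring Data MongoDB API is not shown here.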

Related

mongodb find oldest date of three keys in each document

I have a document schema that looks like this:
{
status: String,
estimateDate: Date,
lostDate: Date,
soldDate: Date,
assignedDate: Date
}
With this schema, all three dates could exist, or none of them could exist. I need to check all three: if at least one exists, use the oldest date; if none exists, use today's date. With the "returned" date, get the difference in days from another key (assignedDate). I have figured out how to do what I want with one date but cannot figure out how to scale this up to include all three keys. Below is the working code I have for one key.
Within my aggregate pipeline $project stage I do the following:
days: {
$cond: {
if: {
$not: ["$date1"]
},
then: {
$floor: {
$divide: [
{
$subtract: [new Date(), "$assignedDate"]
},
1000 * 60 * 60 * 24
]
}
},
else: {
$floor: {
$divide: [
{
$subtract: [
"$estimateDate",
"$assignedDate"
]
},
1000 * 60 * 60 * 24
]
}
}
}
}
You can use the $min and $ifNull operators to get the oldest date, specifying new Date() as the default value for any date that does not exist:
db.col.aggregate([
{
$project: {
oldest: {
$min: [
{ $ifNull: [ "$lostDate", new Date() ] },
{ $ifNull: [ "$soldDate", new Date() ] },
{ $ifNull: [ "$estimateDate", new Date() ] }
]
}
}
}
])
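To get the day difference the question asks for, the $min expression can be combined with the $subtract / $divide / $floor arithmetic from the question in one $project stage. This is a sketch in JavaScript, assuming the three candidate dates are estimateDate, lostDate and soldDate as in the question's schema; buildDaysProjection is a hypothetical helper name.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Hypothetical helper: builds a $project stage computing `days`, the whole
// number of days between assignedDate and the oldest of the given date
// fields. Missing fields fall back to `now` through $ifNull.
function buildDaysProjection(dateFields, now) {
  const oldest = {
    $min: dateFields.map((field) => ({ $ifNull: [`$${field}`, now] })),
  };
  return {
    $project: {
      days: {
        $floor: {
          $divide: [{ $subtract: [oldest, "$assignedDate"] }, MS_PER_DAY],
        },
      },
    },
  };
}

const stage = buildDaysProjection(
  ["estimateDate", "lostDate", "soldDate"],
  new Date()
);
console.log(JSON.stringify(stage, null, 2));
```

If none of the three dates exists, every $ifNull falls back to the current date, so $min returns today's date, which matches the behaviour asked for.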

In Mongo, how to write a search query to find documents based on time, on a Date object?

We have a collection named Incident, which has a field StartTime (Date object type).
Every day, whenever an incident condition is met, a new document is created and inserted into the collection.
We have to get all the incidents that fall between 10PM and 6AM (i.e. from late night to early morning).
But I am having trouble writing the query for this use case.
Since we have a Date object, I am able to write a query that finds documents between two dates.
How do I write a query that searches based on the time of day of a Date object?
Sample Data:
"StartTime" : ISODate("2015-10-16T18:15:14.211Z")
It's just not a good idea. But basically you apply the date aggregation operators:
db.collection.aggregate([
{ "$redact": {
"$cond": {
"if": {
"$or": [
{ "$gte": [{ "$hour": "$StartTime" }, 22] },
{ "$lt": [{ "$hour": "$StartTime" }, 6 ] }
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
Using $redact, that will only return or $$KEEP the documents that meet either condition for the $hour extracted from the Date, and $$PRUNE or "remove" from the results those that do not.
A bit shorter with MongoDB 3.6 and onwards, but really no different:
db.collection.find({
"$expr": {
"$or": [
{ "$gte": [{ "$hour": "$StartTime" }, 22] },
{ "$lt": [{ "$hour": "$StartTime" }, 6 ] }
]
}
})
Overall, not a good idea because the statement needs to scan the whole collection and calculate that logical condition.
A better way is to actually "store" the "time" as a separate field:
var ops = [];
db.collection.find().forEach(doc => {
// Get milliseconds from start of day
let timeMillis = doc.StartTime.valueOf() % (1000 * 60 * 60 * 24);
ops.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": { "$set": { timeMillis } }
}
});
if ( ops.length > 1000 ) {
db.collection.bulkWrite(ops);
ops = [];
}
})
if ( ops.length > 0 ) {
db.collection.bulkWrite(ops);
ops = [];
}
Then you can simply query with something like:
var start = 22 * ( 1000 * 60 * 60 ), // 10PM
end = 6 * ( 1000 * 60 * 60 ); // 6AM
db.collection.find({
"$or": [
{ "timeMillis": { "$gte": start } },
{ "timeMillis": { "$lt": end } }
]
});
And that field can actually be indexed and so quickly and efficiently return results.
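The stored value and the range test can be checked outside the database. This sketch in plain JavaScript mirrors the timeMillis calculation and the 10PM to 6AM condition above. The function names are illustrative only, and the hours are in UTC because the modulo is applied to the raw epoch value.

```javascript
const MS_PER_HOUR = 1000 * 60 * 60;
const MS_PER_DAY = MS_PER_HOUR * 24;

// Milliseconds elapsed since the start of the (UTC) day.
function timeOfDayMillis(date) {
  return date.valueOf() % MS_PER_DAY;
}

// True when the time of day is at or after 22:00 or before 06:00 (UTC).
// The range wraps past midnight, so the two tests are joined with OR.
function isNightTime(date) {
  const t = timeOfDayMillis(date);
  return t >= 22 * MS_PER_HOUR || t < 6 * MS_PER_HOUR;
}

console.log(timeOfDayMillis(new Date("2015-10-16T18:15:14.211Z"))); // 65714211
console.log(isNightTime(new Date("2015-10-16T18:15:14.211Z"))); // false
console.log(isNightTime(new Date("2015-10-16T23:30:00.000Z"))); // true
```

If incidents should be matched in a local timezone instead, the offset would have to be applied before taking the modulo.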

How to express complex $sum grouping expression in Spring Data MongoDB

I have the following MongoDB aggregation query that works well in MongoDB
[
{ $match: { "myfield":"X" } },
{ $group: {
_id: { myfield: "$myfield" },
count: { $sum: 1 },
lt5w: { $sum: { $cond:{ if: { $gte: [ "$myDate", new Date(ISODate().getTime() - 1000 * 60 * 60 * 24 * 7 * 5) ] }, then: 1, else: 0 } } },
gt12w: { $sum: { $cond:{ if: { $gte: [ new Date(ISODate().getTime() - 1000 * 60 * 60 * 24 * 7 * 12), "$myDate" ] }, then: 1, else: 0 } } }
}
}
]
How can I express this complex $sum operation using Spring Data MongoDB API?
group("myfield")
.sum("???").as("lt5w")
.sum("???").as("gt12w")
.count().as("count"),
The sum() method only expects a simple string.
According to this ticket (closed)
https://jira.spring.io/browse/DATAMONGO-784
the aggregation should support complex operations like $cmp and $cond
Update: It seems that the sum(AggregationExpression expr) overload of the method was forgotten here; min(), max() and first() have that overload.
I filed a ticket and the issue is fixed!
https://jira.spring.io/browse/DATAMONGO-1784

How do I do a 'group by' for a datetime when I want to group by just the date using $group? [duplicate]

I am working on a project in which I am tracking the number of clicks on a topic.
I am using MongoDB and I have to group the number of clicks by date (I want to group data for 15 days).
I have data stored in the following format in MongoDB
{
"_id" : ObjectId("4d663451d1e7242c4b68e000"),
"date" : "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)",
"topic" : "abc",
"time" : "18:51:22"
}
{
"_id" : ObjectId("4d6634514cb5cb2c4b69e000"),
"date" : "Mon Dec 27 2010 18:51:23 GMT+0000 (UTC)",
"topic" : "bce",
"time" : "18:51:23"
}
I want to group the number of clicks on topic:abc by day (for 15 days). I know how to group that, but how can I group by the dates that are stored in my database?
I am looking for a result in the following format
[
{
"date" : "date in log",
"click" : 9
},
{
"date" : "date in log",
"click" : 19
},
]
I have written code, but it will work only if the dates are strings (code is here http://pastebin.com/2wm1n1ix).
Please guide me on how to group it.
New answer using Mongo aggregation framework
After this question was asked and answered, 10gen released Mongodb version 2.2 with an aggregation framework, which is now the better way to do this sort of query. This query is a little challenging because you want to group by date and the values stored are timestamps, so you have to do something to convert the timestamps to dates that match. For the purposes of example I will just write a query that gets the right counts.
db.col.aggregate(
{ $group: { _id: { $dayOfYear: "$date"},
click: { $sum: 1 } } }
)
This will return something like:
[
{
"_id" : 144,
"click" : 165
},
{
"_id" : 275,
"click" : 12
}
]
You need to use $match to limit the query to the date range you are interested in and $project to rename _id to date. How you convert the day of year back to a date is left as an exercise for the reader. :-)
10gen has a handy SQL to Mongo Aggregation conversion chart worth bookmarking. There is also a specific article on date aggregation operators.
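Converting a day-of-year value back to a calendar date is a small piece of date arithmetic. This is a sketch in JavaScript, assuming the year is known (for example because $year was also added to the group key); dayOfYearToDate is a hypothetical helper.

```javascript
// Hypothetical helper: turns a year and a 1-based day of year (as produced
// by $dayOfYear) into a Date at midnight UTC. Date.UTC normalises a day
// number larger than the month length into the following months.
function dayOfYearToDate(year, dayOfYear) {
  return new Date(Date.UTC(year, 0, dayOfYear));
}

console.log(dayOfYearToDate(2013, 144).toISOString()); // 2013-05-24T00:00:00.000Z
console.log(dayOfYearToDate(2016, 60).toISOString());  // 2016-02-29T00:00:00.000Z
```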
Getting a little fancier, you can use:
db.col.aggregate([
{ $group: {
_id: {
$add: [
{ $dayOfYear: "$date"},
{ $multiply:
[400, {$year: "$date"}]
}
]},
click: { $sum: 1 },
first: {$min: "$date"}
}
},
{ $sort: {_id: -1} },
{ $limit: 15 },
{ $project: { date: "$first", click: 1, _id: 0} }
])
which will get you the latest 15 days and return some datetime within each day in the date field. For example:
[
{
"click" : 431,
"date" : ISODate("2013-05-11T02:33:45.526Z")
},
{
"click" : 702,
"date" : ISODate("2013-05-08T02:11:00.503Z")
},
...
{
"click" : 814,
"date" : ISODate("2013-04-25T00:41:45.046Z")
}
]
There are already many answers to this question, but I wasn't happy with any of them. MongoDB has improved over the years, and there are now easier ways to do it. The answer by Jonas Tomanga gets it right, but is a bit too complex.
If you are using MongoDB 3.0 or later, here's how you can group by date. I start with the $match aggregation because the author also asked how to limit the results.
db.yourCollection.aggregate([
{ $match: { date: { $gte: ISODate("2019-05-01") } } },
{ $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$date"} }, count: { $sum: 1 } } },
{ $sort: { _id: 1} }
])
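What this $group stage computes can be illustrated without a database. The plain JavaScript sketch below buckets documents by the same %Y-%m-%d key in UTC; countByDay and the sample documents are made up for illustration.

```javascript
// Counts documents per calendar day (UTC), mimicking a $group on
// $dateToString with format "%Y-%m-%d" followed by a sort on _id.
function countByDay(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const day = doc.date.toISOString().slice(0, 10); // "YYYY-MM-DD"
    counts.set(day, (counts.get(day) || 0) + 1);
  }
  return [...counts.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([day, count]) => ({ _id: day, count }));
}

const sample = [
  { date: new Date("2019-05-02T08:00:00Z") },
  { date: new Date("2019-05-01T23:59:59Z") },
  { date: new Date("2019-05-02T17:30:00Z") },
];
// One entry for 2019-05-01 (count 1) and one for 2019-05-02 (count 2).
console.log(countByDay(sample));
```

Because the key is a string in year-month-day order, sorting on _id also sorts chronologically.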
To fetch data grouped by date in MongoDB:
db.getCollection('supportIssuesChat').aggregate([
{
$group : {
_id :{ $dateToString: { format: "%Y-%m-%d", date: "$createdAt"} },
list: { $push: "$$ROOT" },
count: { $sum: 1 }
}
}
])
Late answer, but for the record (for anyone else that comes to this page): You'll need to use the 'keyf' argument instead of 'key', since your key is actually going to be a function of the date on the event (i.e. the "day" extracted from the date) and not the date itself. This should do what you're looking for:
db.coll.group(
{
keyf: function(doc) {
var date = new Date(doc.date);
var dateKey = (date.getMonth()+1)+"/"+date.getDate()+"/"+date.getFullYear()+'';
return {'day':dateKey};
},
cond: {topic:"abc"},
initial: {count:0},
reduce: function(obj, prev) {prev.count++;}
});
For more information, take a look at MongoDB's doc page on aggregation and group: http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Group
This can help
return new Promise(function(resolve, reject) {
db.doc.aggregate(
[
{ $match: {} },
{ $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }, count: { $sum: 1 } } },
{ $sort: { _id: 1 } }
]
).then(doc => {
/* if you need a date object */
doc.forEach(function(value, index) {
doc[index]._id = new Date(value._id);
}, this);
resolve(doc);
}).catch(reject);
});
Haven't worked that much with MongoDB yet, so I am not completely sure. But aren't you able to use full Javascript?
So you could parse your date with Javascript Date class, create your date for the day out of it and set as key into an "out" property. And always add one if the key already exists, otherwise create it new with value = 1 (first click). Below is your code with adapted reduce function (untested code!):
db.coll.group(
{
key:{'date':true},
initial: {retVal: {}},
reduce: function(doc, prev){
var date = new Date(doc.date);
var dateKey = date.getFullYear()+''+date.getMonth()+''+date.getDate();
(typeof prev.retVal[dateKey] != 'undefined') ? prev.retVal[dateKey] += 1 : prev.retVal[dateKey] = 1;
},
cond: {topic:"abc"}
}
)
Thanks to @mindthief, your answer helped solve my problem today. The function below makes grouping by day a little easier; I hope it helps others.
/**
* group by day
* @param query document {key1:123,key2:456}
*/
var count_by_day = function(query){
return db.action.group(
{
keyf: function(doc) {
var date = new Date(doc.time);
var dateKey = (date.getMonth()+1)+"/"+date.getDate()+"/"+date.getFullYear();
return {'date': dateKey};
},
cond:query,
initial: {count:0},
reduce: function(obj, prev) {
prev.count++;
}
});
}
count_by_day({this:'is',the:'query'})
Another late answer, but still. So if you wanna do it in only one iteration and get the number of clicks grouped by date and topic you can use the following code:
db.coll.group(
{
$keyf : function(doc) {
return { "date" : doc.date.getDate()+"/"+doc.date.getMonth()+"/"+doc.date.getFullYear(),
"topic": doc.topic };
},
initial: {count:0},
reduce: function(obj, prev) { prev.count++; }
})
Also, if you would like to optimize the query as suggested, you can use an integer value for the date key instead of the String (hint: use valueOf()), though for my examples the speed was the same.
Furthermore it's always wise to check the MongoDB docs regularly, because they keep adding new features all the time. For example with the new Aggregation framework, which will be released in the 2.2 version you can achieve the same results much easier http://docs.mongodb.org/manual/applications/aggregation/
If you want a Date object returned directly
Then instead of applying the Date Aggregation Operators, apply "Date Math" to round the date object. This can often be desirable as all drivers represent a BSON Date in a form that is commonly used for Date manipulation for all languages where that is possible:
db.datetest.aggregate([
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]},
new Date(0)
]
},
"click": { "$sum": 1 }
}}
])
Or if as is implied in the question that the grouping interval required is "buckets" of 15 days, then simply apply that to the numeric value in $mod:
db.datetest.aggregate([
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$date", new Date(0) ] },
1000 * 60 * 60 * 24 * 15
]}
]},
new Date(0)
]
},
"click": { "$sum": 1 }
}}
])
The basic math applied is that when you $subtract two Date objects, the result returned is the difference in milliseconds as a number. So the epoch is represented by Date(0) as the base for conversion in whatever language constructor you have.
With a numeric value, the "modulo" ( $mod ) is applied to round the date ( subtract the remainder from the division ) to the required interval. Being either:
1000 milliseconds * 60 seconds * 60 minutes * 24 hours = 1 day
Or
1000 milliseconds * 60 seconds * 60 minutes * 24 hours * 15 days = 15 days
So it's flexible to whatever interval you require.
By the same token, an $add operation between a "numeric" value and a Date object will return a Date object equivalent to the milliseconds value of both objects combined ( epoch is 0, therefore 0 plus difference is the converted date ).
Easily represented and reproducible in the following listing:
var now = new Date();
var bulk = db.datetest.initializeOrderedBulkOp();
for ( var x = 0; x < 60; x++ ) {
bulk.insert({ "date": new Date( now.valueOf() + ( 1000 * 60 * 60 * 24 * x ))});
}
bulk.execute();
And running the second example with 15 day intervals:
{ "_id" : ISODate("2016-04-14T00:00:00Z"), "click" : 12 }
{ "_id" : ISODate("2016-03-30T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-03-15T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-02-29T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-02-14T00:00:00Z"), "click" : 3 }
Or similar distribution depending on the current date when the listing is run, and of course the 15 day intervals will be consistent since the epoch date.
Using the "Math" method is a bit easier to tune, especially if you want to adjust time periods for different timezones in aggregation output where you can similarly numerically adjust by adding/subtracting the numeric difference from UTC.
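The rounding described above is the same arithmetic in any language. Here is a sketch in JavaScript, where floorToInterval and its optional utcOffsetMs parameter are illustrative names; the offset shows one way the timezone adjustment mentioned above could be applied, assuming a fixed offset from UTC.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Rounds a date down to the start of its interval, counted from the epoch.
// utcOffsetMs shifts the bucket boundaries to a fixed-offset local time.
// Assumes dates after the epoch, where the remainder is never negative.
function floorToInterval(date, intervalMs, utcOffsetMs = 0) {
  const shifted = date.valueOf() + utcOffsetMs;
  const floored = shifted - (shifted % intervalMs);
  return new Date(floored - utcOffsetMs);
}

const d = new Date("2016-04-20T10:00:00Z");
console.log(floorToInterval(d, MS_PER_DAY).toISOString());      // 2016-04-20T00:00:00.000Z
console.log(floorToInterval(d, 15 * MS_PER_DAY).toISOString()); // 2016-04-14T00:00:00.000Z
```

The 15 day result lands on 2016-04-14, the same bucket boundary that appears in the sample output above.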
Of course, that is a good solution. Aside from that, you can group dates by days as strings (as that answer proposes), or you can get the beginning of each day by projecting the date field (in aggregation) like this:
{'$project': {
'start_of_day': {'$subtract': [
'$date',
{'$add': [
{'$multiply': [{'$hour': '$date'}, 3600000]},
{'$multiply': [{'$minute': '$date'}, 60000]},
{'$multiply': [{'$second': '$date'}, 1000]},
{'$millisecond': '$date'}
]}
]},
}}
It gives you this:
{
"start_of_day" : ISODate("2015-12-03T00:00:00.000Z")
},
{
"start_of_day" : ISODate("2015-12-04T00:00:00.000Z")
}
It has some pluses: you can manipulate your days as a date type (not a number or string), which lets you use all of the date aggregation operators in the following aggregation stages and gives you a date type on the output.

Query to get last X minutes data with Mongodb

I'm trying to query my db, which has this document format:
{
"_id" : ObjectId("520b8b3f8bd94741bf006033"),
"value" : 0.664,
"timestamp" : ISODate("2013-08-14T13:48:35Z"),
"cr" : ISODate("2013-08-14T13:50:55.834Z")
}
I can get the last records from a datetime with this query:
db.mycol.find({timestamp:{$gt: ISODate("2013-08-14T13:48:00Z")}}).sort({x:1});
But I'm trying to get a set with the value fields and timestamps from 18 minutes ago.
For the 18 minutes part, that's not really about MongoDB, but about JavaScript and what's available in the mongo shell:
query = {
timestamp: { // 18 minutes ago (from now)
$gt: new Date(ISODate().getTime() - 1000 * 60 * 18)
}
}
Works in the mongo shell, but using Mongo drivers for other languages would be really different.
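Outside the shell the ISODate() helper is not available, but the arithmetic is the same. This is a sketch for JavaScript (for example with the Node.js driver, where a filter is a plain object); sinceMinutesAgo is a hypothetical helper and the field name timestamp comes from the question.

```javascript
// Hypothetical helper: builds a filter matching documents whose `timestamp`
// is newer than `minutes` minutes before `now`.
function sinceMinutesAgo(minutes, now = new Date()) {
  return { timestamp: { $gt: new Date(now.getTime() - minutes * 60 * 1000) } };
}

const filter = sinceMinutesAgo(18, new Date("2013-08-14T14:00:00Z"));
console.log(filter.timestamp.$gt.toISOString()); // 2013-08-14T13:42:00.000Z
```

Passing `now` explicitly keeps the helper easy to test; in normal use it defaults to the current time.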
To "project" over a smaller schema with both values and timestamps:
projection = {
_id: 0,
value: 1,
timestamp: 1,
}
Applying both:
db.mycol.find(query, projection).sort({timestamp: 1});
Well, that's still not a "set" since there might be duplicates. To get rid of them you can use the $group from the aggregation framework:
db.mycol.aggregate([
{$match: query},
{$group: {
_id: {
value: "$value",
timestamp: "$timestamp",
}
}},
{$project: {
value: "$_id.value",
timestamp: "$_id.timestamp",
}},
{$sort: {timestamp: 1}},
])
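You could also do the following:
db.getCollection('collectionName').find({timestamp : {$gte: new Date(new Date().getTime()-(60*60*1000)) } } )
The above query will give you records with a timestamp between now and 60 minutes ago. Because timestamp is stored as a Date, the computed milliseconds value is wrapped in new Date() so that the comparison is between two dates. If you want more than 60 minutes, say 2 hours, you could change the expression to (2*60*60*1000);
for 30 minutes, (30*60*1000).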
Starting in Mongo 5, you can use $dateSubtract:
// { date: ISODate("2021-12-05T20:32:56Z") } <= 5 minutes ago
// { date: ISODate("2021-12-05T20:07:56Z") } <= 25 minutes ago (older than 18 minutes)
db.collection.aggregate([
{ $match: {
$expr: {
$gt: [
"$date",
{ $dateSubtract: { startDate: "$$NOW", unit: "minute", amount: 18 } }
]
}
}}
])
// { date: ISODate("2021-12-05T20:32:56Z") } <= 5 minutes ago
With $dateSubtract, we create the oldest date/time after which we keep documents, by subtracting 18 (amount) "minute" (unit) out of the current date $$NOW (startDate).
You can query the data relative to the current timestamp from MongoDB using Node.js:
const collection1 = dbo.collection('customers');
var dateq = new Date();
// 6000 ms, i.e. documents from the last 6 seconds
collection1.find({ "Timestamp" : { $gt: new Date(dateq.getTime() - 6000)}
}).toArray(function(err , docs){
console.log(docs);
});
Wow, thanks to @Alistair_Nelson I was able to get the data from n minutes ago, for example to get the last 18 minutes from ISODate("2013-08-14T14:00:00Z"):
db.mycol.find({timestamp:{$gt: new Date(ISODate("2013-08-14T14:00:00Z")-18*60000)}})
To get only the fields I need:
db.mycol.find({timestamp:{$gt: new Date(ISODate("2013-08-14T14:00:00Z")-18*60000)}},{value:1,timestamp:1, _id:0})
const xMins = 10;
db.mycol.find({ timestamp: { $gt: new Date(Date.now() - 1000 * 60 * xMins) } }).sort({ timestamp: 1 });