MongoDb How to group by month and year from string - mongodb

I have a field dateStr in my collection:
{ .... "dateStr" : "07/01/2020" .... }
{ .... "dateStr" : "07/01/1970" .... }
I want to group by month and year from the dateStr field.
I have tried:
db.collection.aggregate(
{$project : {
month : {$month : new Date("$dateStr")},
year : {$year : new Date("$dateStr")}
}},
{$group : {
_id : {month : "$month" ,year : "$year" },
count : {$sum : 1}
}})
Output :
{
"result" : [
{
"_id" : {
"month" : 1,
"year" : 1970
},
"count" : 2
}
],
"ok" : 1
}
But I have two years in the data, 1970 and 2020. Why am I getting a single record?

You cannot use the date aggregation operators on anything other than a Date object. Your best option is to convert these "strings" to proper Date objects so you can query correctly in this and future operations.
That said, if your "strings" always have a common structure, then there is a way to do this with the aggregation framework tools. It requires a lot of manipulation, though, which does not make this an "optimal" approach to dealing with the problem. But with a set structure of "double digits" and a consistent delimiter, this is possible with the $substr operator:
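db.collection.aggregate([
{ "$group": {
"_id": {
// $substr offsets are zero-based: in "dd/mm/yyyy" the year starts at offset 6
"year": { "$substr": [ "$dateStr", 6, 4 ] },
// and the month starts at offset 3
"month": { "$substr": [ "$dateStr", 3, 2 ] }
},
"count": { "$sum": 1 }
}}
])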
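Note that $substr counts offsets from zero, so for "dd/mm/yyyy" strings the year starts at offset 6 and the month at offset 3. As a quick sanity check, the same slicing can be reproduced in plain JavaScript. This is only a sketch: it assumes every string has exactly the "dd/mm/yyyy" layout, and the helper name yearMonthKey is made up for illustration.

```javascript
// Mirrors { $substr: [ "$dateStr", start, length ] } with zero-based offsets.
// Assumption: the input always looks like "dd/mm/yyyy".
function yearMonthKey(dateStr) {
  return {
    year: dateStr.substring(6, 6 + 4),  // 4 characters starting at offset 6
    month: dateStr.substring(3, 3 + 2)  // 2 characters starting at offset 3
  };
}

console.log(yearMonthKey("07/01/2020"));
console.log(yearMonthKey("07/01/1970"));
```

The two sample documents from the question end up with different year values, so they fall into separate groups.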
So JavaScript casting does not work inside the aggregation framework. You can always "feed" input to the pipeline based on "client code" evaluation, but the aggregation process itself does not evaluate any code. Just like the basic query engine, this is all based on a "data structure" implementation that uses "native operator" instructions to do the work.
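To make the point about client-side evaluation concrete, here is a small sketch that can be run in any JavaScript environment, without a database. It only builds the pipeline object from the question and inspects it; the variable names are illustrative.

```javascript
// JavaScript evaluates `new Date("$dateStr")` while the pipeline object is
// being built, i.e. before anything is sent to the server.
const pipeline = [
  { $project: { month: { $month: new Date("$dateStr") } } }
];

// What the server would receive as the $month operand:
const operand = pipeline[0].$project.month.$month;

// It is already a Date value, not a reference to the dateStr field.
console.log(operand instanceof Date);

// A field reference, by contrast, is nothing more than a string.
console.log(typeof "$dateStr");
```

Because the same constant reaches the server for every document, every document ends up in the same group.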
At the time of writing, you cannot convert strings to dates in the aggregation pipeline (later MongoDB releases added a $dateFromString operator for this). You should work with real BSON Date objects, but you can do it with strings if there is a consistent format that you can present in a "lexical order".
I still suggest that you convert these to BSON Dates ASAP. Also beware that an "ISODate" (UTC) value is constructed from a different string form, e.g.:
new Date("2020-01-07")
That is "yyyy-mm-dd" format, at least for the JavaScript invocation.
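A minimal sketch of that conversion in client code follows. It assumes the stored strings are always "dd/mm/yyyy" and that midnight UTC is the intended time of day; the function name parseDayMonthYear is hypothetical.

```javascript
// Convert a "dd/mm/yyyy" string into a Date at midnight UTC.
// Assumption: the string is well formed; no validation is done here.
function parseDayMonthYear(dateStr) {
  const [day, month, year] = dateStr.split("/").map(Number);
  // Date.UTC expects a zero-based month, hence month - 1.
  return new Date(Date.UTC(year, month - 1, day));
}

console.log(parseDayMonthYear("07/01/2020").toISOString());
```

The resulting Date could then be written back to each document in a new field, after which $month and $year work as the question originally intended.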

Related

Convert month from number to string question in Mongodb query

I am trying to get some avg number per month in the financial year. The collection is called test and the month data comes from CreateDate field. I want to get the avg price per month. The collection data is like below:
{
"_id" : ObjectId("5fd289a93f7cf02c36837ca7"),
"ClientName" : "John",
"OrderNumber" : "12345A",
"Price" : 10,
"CreateDate" : ISODate("2020-09-20T06:00:00.000Z"),
}
{
"_id" : ObjectId("5fd289a93f7cf02c36837cc7"),
"ClientName" : "John",
"OrderNumber" : "12345",
"Price" : 20,
"CreateDate" : ISODate("2020-09-12T06:00:00.000Z"),
}
So I wrote the following query to get the average price per month within the financial year (from Sep to Aug):
db.test.aggregate([
{
$match: {
"CreateDate": {
$lt: ISODate("2021-08-31T00:00:00.000Z"),
$gte: ISODate("2020-09-01T00:00:00.000Z")
}
}
},
{
$group: {
_id: {$month: "$CreateDate"},
"AvgPrice": {
"$avg": "$Price",
}
}
},
{ $project:{ _id : 0 , Month: '$_id' , "AvgPrice ": '$AvgPrice' } }
])
The result I am getting is with the following format:
{
"Month" : 9,
"AvgPrice " : 15.0
}
{
"Month" : 10,
"AvgPrice " : 18.6666666666667
}
How can I display the month as a string instead of a number? For example, the following is the ideal result:
{
"Month" : Sep,
"AvgPrice" : 15.0
}
{
"Month" : Oct,
"AvgPrice" : 18.6666666666667
}
I also have two more questions:
I am using MongoDB 3.6. Is there any way to round the average price to two digits after the decimal point? For example, the value above would be "18.67" instead of "18.66666". MongoDB 4.2 has something called $round, but 3.6 does not seem to have this function.
If I want to break down by client, the returned result would look like below:
{
"ClientName": "John",
"Month" : Sep,
"AvgPrice" : 15.0
}
{
"ClientName" : "Mary"
"Month" : Oct,
"AvgPrice" : 18.6666666666667
}
How do I add another level of grouping to break the result down by client and then by month?
Any help will be appreciated!
If I want to break down by client
You can add the ClientName field to _id:
{
$group: {
_id: {
ClientName: "$ClientName",
month: { $month: "$CreateDate" }
},
AvgPrice: { $avg: "$Price" }
}
},
How can I display of the month converting to a string instead of the number.
There is no direct way to get the month name in MongoDB, but you can prepare an array of month names and access it by index,
using $arrayElemAt to select the month by its number:
{
$project: {
_id: 0,
ClientName: "$_id.ClientName",
Month: {
$arrayElemAt: [
["","Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"],
"$_id.month"
]
},
AvgPrice: 1
}
}
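The empty string at index 0 is what lines the array up with $month, which returns 1 to 12. The lookup can be checked in plain JavaScript; this sketch only mirrors the indexing and is not part of the pipeline, and the names MONTHS and monthName are illustrative.

```javascript
// Same array as in the $project stage; index 0 is a placeholder because
// $month returns values from 1 to 12.
const MONTHS = ["", "Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Plain-JavaScript equivalent of { $arrayElemAt: [ MONTHS, "$_id.month" ] }.
function monthName(monthNumber) {
  return MONTHS[monthNumber];
}

console.log(monthName(9), monthName(10));
```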
I am using the Mongodb 3.6 version, is there any way to round up the avg price to two digit after the decimal point?
There is no built-in option for this in MongoDB 3.6 or below; as you already know, $round is available from MongoDB 4.2.
You can refer to the question "Rounding to 2 decimal places using MongoDB aggregation framework", which lists several tricks.
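One such trick, sketched here under assumptions, is to scale the value by 100, add 0.5, take the floor and divide by 100 again. The sketch assumes $multiply, $add, $floor and $divide are available on the server (I believe $floor has been there since 3.2) and that the values are not negative. The helper names roundToTwoDecimals and roundHalfUp are made up, and roundHalfUp is only a plain-JavaScript mirror of the arithmetic for checking it locally.

```javascript
// Builds an aggregation expression that rounds a numeric field to two
// decimals without $round. Assumption: values are non-negative.
function roundToTwoDecimals(fieldPath) {
  return {
    $divide: [
      { $floor: { $add: [ { $multiply: [ fieldPath, 100 ] }, 0.5 ] } },
      100
    ]
  };
}

// Plain-JavaScript mirror of the same arithmetic.
function roundHalfUp(value) {
  return Math.floor(value * 100 + 0.5) / 100;
}

console.log(JSON.stringify(roundToTwoDecimals("$AvgPrice")));
console.log(roundHalfUp(18.6666666666667));
```

The expression could replace `AvgPrice: 1` in the $project stage, for example as `AvgPrice: roundToTwoDecimals("$AvgPrice")`.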

How to group dates in mongoDB by first or second half of the month (fortnights)

With the following data structure, using mongoDB's (v3.4) aggregation framework how do you group information every 15 days?
{
"_id" : ObjectId("5cb10a201e20af7503305fea"),
"user" : ObjectId("5b21240c4e71161fdd40b27c"),
"version" : NumberLong(2),
"value" : 42,
"itemRef" : ObjectId("5cb10a201e20af7503305fe9"),
"status" : "ACCEPTED",
"date" : ISODate("2019-04-13T11:00:00.466Z")
}
the required output would be:
[date: 2019/01/01, totalValue:15],
[date: 2019/01/16, totalValue:5],
[date: 2019/02/01, totalValue:25],
[date: 2019/02/16, totalValue:30]
The way I found to resolve this problem with mongoDB 3.4 was using $cond + $dayOfMonth to define in which part of the month this date is.
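db.contract.aggregate(
  [
    // restrict the input first; the actual query is elided here
    {$match: { /* ...queryGoesHere... */ }},
    {$project:
      {dateText:
        {$cond:
          [
            // days 1-15 belong to the first half of the month
            {$lte: [{$dayOfMonth: '$date'}, 15]},
            {$dateToString: {format: '%Y-%m-01', date: '$date'}},
            {$dateToString: {format: '%Y-%m-16', date: '$date'}}
          ]
        },
        value: '$value'
      }
    },
    {$group:
      {
        _id: '$dateText',
        // sum the projected values; use {$sum: 1} instead to count documents
        total: {$sum: '$value'}
      }
    }
  ]
)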
The solution is in the projection of "dateText". It first uses $cond to determine whether the date is in the first or second part of the month, based on $dayOfMonth, which returns the day of the month. If the day is less than or equal to 15, $dateToString formats the date as year-month-01; otherwise it formats it as year-month-16.
Hope this can help someone in the future.
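For anyone who wants to check the bucketing rule outside the database, the same decision can be written in plain JavaScript. This is a sketch, not part of the pipeline: it uses UTC getters on the assumption that the date operators are applied in UTC, and the function name fortnightStart is made up.

```javascript
// Returns "YYYY-MM-01" for days 1-15 and "YYYY-MM-16" for the rest,
// mirroring the $cond / $dayOfMonth / $dateToString projection.
function fortnightStart(date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  const day = date.getUTCDate() <= 15 ? "01" : "16";
  return `${year}-${month}-${day}`;
}

console.log(fortnightStart(new Date("2019-04-13T11:00:00.466Z")));
```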

mongo query select only first of month

Is it possible to query only the first (or last, or any single) day of the month of a mongo date field?
I use the date aggregation operators regularly, but within a $group clause.
Basically I have a field that is already aggregated (averaged) for each day of the month. I want to select only one of these days, with its value as a representative of the entire month.
Following is a sample of a record set from Jan 1, 2014 to Feb 1, 2015, with price as the daily price and 28day_avg as the trailing monthly average for 28 days.
{ "date" : ISODate("2014-01-01T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 59.23, "28day_avg": 54.21}
{ "date" : ISODate("2014-01-02T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 58.75, "28day_avg": 54.15}
...
{ "date" : ISODate("2015-02-01T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 123.50, "28day_avg": 122.25}
Method 1.
I'm currently running an aggregation using $month (and summing the price), but one issue is that I want to retrieve the underlying date value ISODate("2015-02-01T00:00:00Z") rather than the 0, 1, 2 value that comes with several of the date aggregations (that loop at the first of the week, month, year). mod(28) on a date?
Method 2.
I'd like to simply pluck out a single record of the 28day_avg as representative of the period. The 1st of the month would be adequate.
The desired output is...
_id: ISODate("2015-02-01T00:00:00Z"), value: 122.25,
_id: ISODate("2015-01-01T00:00:00Z"), value: 120.78,
_id: ISODate("2014-12-01T00:00:00Z"), value: 118.71,
...
_id: ISODate("2014-01-01T00:00:00Z"), value: 53.21,
Of course, the value will vary from method 1 to method 2, but that is fine. One is 28 days trailing while the other will account for 28, 30 and 31 day months; I don't care about that so much.
A non-aggregation query would be OK too, but this doesn't work either: {"date": { "$mod": [ 28, 0 ]} }
To pick the first of the month for each month (method 2), use the following aggregation:
db.test.aggregate([
{ "$project" : { "_id" : "$date", "day" : { "$dayOfMonth" : "$date" }, "28day_avg" : 1 } },
{ "$match" : { "day" : 1 } }
])
You can't use an index for the match, so this is not efficient. I'd suggest adding another field to each document that holds the $dayOfMonth value, so you can index it and do a simple find:
{
"date" : ISODate("2014-01-01T00:00:00Z"),
"price" : 59.23,
"28day_avg" : 54.21,
"dayOfMonth" : 1
}
// createIndex replaces the older, deprecated ensureIndex helper
db.test.createIndex({ "dayOfMonth" : 1 })
db.test.find({ "dayOfMonth" : 1 }, { "_id" : 0, "date" : 1, "28day_avg" : 1 })
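The extra field has to be computed somewhere before it can be indexed. A minimal client-side sketch is shown below; it assumes the date field holds a JavaScript Date and that the day should be taken in UTC, as $dayOfMonth does by default. The helper name withDayOfMonth is hypothetical.

```javascript
// Returns a copy of the document with a dayOfMonth field derived from date.
// Assumption: doc.date is a Date and the UTC day is the one wanted.
function withDayOfMonth(doc) {
  return Object.assign({}, doc, { dayOfMonth: doc.date.getUTCDate() });
}

const sample = {
  date: new Date("2014-01-01T00:00:00Z"),
  price: 59.23,
  "28day_avg": 54.21
};
console.log(withDayOfMonth(sample));
```

The same derivation can be applied when inserting new documents, so the field stays in step with date.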

MongoDB: Missing fields after sort() when using projection

So I have a database filled with image information, and I want to retrieve a subset of the fields sorted by ascending date. I use the following query to retrieve the aggregated set:
db.images.find({}, {rel_path: 1, date: 1}).sort({'date.year': 1, 'date.month': 1})
I expect this query to return a set looking something like this:
{
"_id": ObjectId("530deb1060832c64291a11a7"),
"date": { "year": 2006, "month": 2 },
"rel_path": "/mnt/backup/Backup/Photos/asdfasdfasdf.jpg"
}
{
"_id": ObjectId("530de1db60832c64291a05ec"),
"date": { "year": 2006, "month": 5 },
"rel_path": "/mnt/backup/Backup/Photos/qweqweqwe.jpg"
}
... <more documents> ...
What I get, however, looks like this:
{
"_id": ObjectId("530deb1060832c64291a11a7"),
"rel_path": "/mnt/backup/Backup/Photos/asdfasdfasdf.jpg"
}
{
"_id": ObjectId("530de1db60832c64291a05ec"),
"rel_path": "/mnt/backup/Backup/Photos/qweqweqwe.jpg"
}
... <more documents> ...
If I skip the 'sort()' I get all fields from my projection, so it seems the 'date' field somehow is removed by the 'sort()' call.
Anyone have any idea what's going on here?
Edit: Here's a sample document by request:
{
"_id" : ObjectId("530de16860832c64291a0562"),
"orientation" : 1,
"camera_make" : "Apple",
"camera_model" : "iPhone 4",
"rel_path" : "Bröllopsbilder/IMG_0997.JPG",
"file_size" : 1827977,
"date" : { "month" : "10", "year" : "2011" },
"root" : "/mnt/backup/Backup/Bilder/",
"md5" : "fb26ebf24914d515144be5e53797744b"
}
The find() query looks fine and works as expected. I tested it by running it against a similar data set.
One reason this could be happening is that some documents in the collection do not have the "date" field. Try running the same query with a filter in find() that returns only the documents where the "date" field exists, using the $exists operator:
db.images.find({date:{$exists:true}}, {rel_path: 1, date: 1})
.sort({'date.year': 1, 'date.month': 1})

MongoDb aggregation Group by Date

I'm trying to group by timestamp for the collection named "foo" { _id, TimeStamp }
db.foos.aggregate(
[
{$group : { _id : new Date (Date.UTC({ $year : '$TimeStamp' },{ $month : '$TimeStamp' },{$dayOfMonth : '$TimeStamp'})) }}
])
I expect many dates, but the result is just one date. The data I'm using is correct (it has many foos with different dates, none of them in 1970). There is some problem in the date parsing that I have not been able to solve yet.
{
"result" : [
{
"_id" : ISODate("1970-01-01T00:00:00.000Z")
}
],
"ok" : 1
}
I tried this one:
db.foos.aggregate(
[
{$group : { _id : { year : { $year : '$TimeStamp' }, month : { $month : '$TimeStamp' }, day : {$dayOfMonth : '$TimeStamp'} }, count : { $sum : 1 } }},
{$project : { parsedDate : new Date('$_id.year', '$_id.month', '$_id.day') , count : 1, _id : 0} }
])
Result :
uncaught exception: aggregate failed: {
"errmsg" : "exception: disallowed field type Date in object expression (at 'parsedDate')",
"code" : 15992,
"ok" : 0
}
And that one:
db.foos.aggregate(
[
{$group : { _id : { year : { $year : '$TimeStamp' }, month : { $month : '$TimeStamp' }, day : {$dayOfMonth : '$TimeStamp'} }, count : { $sum : 1 } }},
{$project : { parsedDate : Date.UTC('$_id.year', '$_id.month', '$_id.day') , count : 1, _id : 0} }
])
I cannot see dates in the result:
{
"result" : [
{
"count" : 412
},
{
"count" : 1702
},
{
"count" : 422
}
],
"ok" : 1
}
db.foos.aggregate(
[
{ $project : { day : {$substr: ["$TimeStamp", 0, 10] }}},
{ $group : { _id : "$day", number : { $sum : 1 }}},
{ $sort : { _id : 1 }}
]
)
Grouping by date can be done in two steps in the aggregation framework; a third step is needed only if the result should be sorted:
$project in combination with $substr takes the first 10 characters (YYYY-MM-DD) of the ISODate value of each document (the result is a collection of documents with the fields "_id" and "day");
$group groups by day, adding (summing) the number 1 for each matching document;
$sort orders ascending by "_id", which is the day from the previous aggregation step. This is optional and only needed if a sorted result is desired.
This solution cannot take advantage of indexes like db.twitter.ensureIndex( { TimeStamp: 1 } ), because it transforms the ISODate object to a string on the fly. For large collections (millions of documents) this could be a performance bottleneck, and more sophisticated approaches should be used.
It depends on whether you want to have the date as ISODate type in the final output. If so, then you can do one of two things:
1. Extract $year, $month, $dayOfMonth from your timestamp and then reconstruct a new date out of them (you are already trying to do that, but you're using syntax that doesn't work in aggregation framework).
2. If the original Timestamp is of type ISODate() then you can do date arithmetic to subtract the hours, minutes, seconds and milliseconds from your timestamp to get a new date that's "rounded" to the day.
There is an example of 2 here.
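Here is how you would do 1. I'm making an assumption that all your dates are from this year (2013 in the code below) or shortly after, but you can adjust the math to accommodate your oldest date. Note that multiplying the year offset by 365 days ignores leap days, so the result drifts once a leap year is crossed.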
project1={$project:{_id:0,
y:{$subtract:[{$year:"$TimeStamp"}, 2013]},
d:{$subtract:[{$dayOfYear:"$TimeStamp"},1]},
TimeStamp:1,
jan1:{$literal:new ISODate("2013-01-01T00:00:00")}
} };
project2={$project:{tsDate:{$add:[
"$jan1",
{$multiply:["$y", 365*24*60*60*1000]},
{$multiply:["$d", 24*60*60*1000]}
] } } };
Sample data:
db.foos.find({},{_id:0,TimeStamp:1})
{ "TimeStamp" : ISODate("2013-11-13T19:15:05.600Z") }
{ "TimeStamp" : ISODate("2014-02-01T10:00:00Z") }
Aggregation result:
> db.foos.aggregate(project1, project2)
{ "tsDate" : ISODate("2013-11-13T00:00:00Z") }
{ "tsDate" : ISODate("2014-02-01T00:00:00Z") }
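The second option described above (date arithmetic) can be sketched as follows. This is not the linked example, which is not reproduced here, but one possible way to do it: subtract the milliseconds elapsed since midnight UTC. It assumes all timestamps are on or after 1970-01-01; the names MS_PER_DAY, truncateToUtcDay and truncateStage are illustrative.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Plain-JavaScript version: drop the time-of-day part of a Date (UTC).
// Assumption: the date is not earlier than the Unix epoch.
function truncateToUtcDay(date) {
  return new Date(date.getTime() - (date.getTime() % MS_PER_DAY));
}

// The same idea as a $project stage: subtracting two dates yields
// milliseconds, $mod keeps the part after midnight, and subtracting that
// number from the date yields a date again.
const truncateStage = {
  $project: {
    tsDate: {
      $subtract: [
        "$TimeStamp",
        { $mod: [ { $subtract: [ "$TimeStamp", new Date(0) ] }, MS_PER_DAY ] }
      ]
    }
  }
};

console.log(truncateToUtcDay(new Date("2013-11-13T19:15:05.600Z")).toISOString());
```

Unlike the year and day-of-year reconstruction, this variant does not depend on a base year or on the number of days in a year.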
This is what I use in one of my projects :
collection.aggregate([
// group results by date
{$group : {
_id : { date : "$date" }
// do whatever you want here, like $push, $sum...
}},
// _id holds the date; sort newest first
// ($orderby is a query modifier, not a pipeline stage, so $sort alone is used)
{$sort : { _id : -1}}
])
.toArray()
Where $date is a Date object in mongo. I get results indexed by date.