I am new to MongoDB and I've been struggling to get a specific query to work without any luck.
I have a collection with millions of documents, each having a date and an amount, and I want to get aggregations for specific periods of time.
For example, I want to get the count and the sum of the amounts for the periods 1/1/2015 - 15/1/2015 and 1/2/2015 - 15/2/2015.
A sample collection is:
{ "_id" : "148404972864202083547392254", "account" : "3600", "amount" : 50, "date" : ISODate("2017-01-01T12:02:08.642Z")}
{ "_id" : "148404972864202085437392254", "account" : "3600", "amount" : 50, "date" : ISODate("2017-01-03T12:02:08.642Z")}
{ "_id" : "148404372864202083547392254", "account" : "3600", "amount" : 70, "date" : ISODate("2017-01-09T12:02:08.642Z")}
{ "_id" : "148404972864202083547342254", "account" : "3600", "amount" : 150, "date" : ISODate("2017-01-22T12:02:08.642Z")}
{ "_id" : "148404922864202083547392254", "account" : "3600", "amount" : 200, "date" : ISODate("2017-02-02T12:02:08.642Z")}
{ "_id" : "148404972155502083547392254", "account" : "3600", "amount" : 30, "date" : ISODate("2017-02-07T12:02:08.642Z")}
{ "_id" : "148404972864202122254732254", "account" : "3600", "amount" : 10, "date" : ISODate("2017-02-10T12:02:08.642Z")}
For the date ranges 1/1/2017 - 10/1/2017 and 1/2/2017 - 10/2/2017, the output would be like this:
1/1/2017 - 10/1/2017 - count =3, amount summation: 170
10/2/2017 - 15/2/2017 - count =2, amount summation: 40
Is it possible to work with such different date ranges? The code would be in Java, but as an example in mongo, can someone please help me?
There must be a more elegant solution than this. Anyway, you can wrap it in a function and generalize the date-related arguments.
First, you need a projection that also decides which range each item falls into (note the large $switch expression). By default, an item goes into the null range.
Then you filter out the items that didn't fall into any of your ranges (i.e. keep only range != null).
The last step is to group the items by range and do the needed calculations.
db.items.aggregate([
    { $project : {
        amount : true,
        account : true,
        date : true,
        range : {
            $switch : {
                branches : [
                    {
                        case : {
                            $and : [
                                { $gte : [ "$date", ISODate("2017-01-01T00:00:00.000Z") ] },
                                { $lt : [ "$date", ISODate("2017-01-10T00:00:00.000Z") ] }
                            ]
                        },
                        then : { $concat : [
                            { $dateToString: { format: "%d/%m/%Y", date: ISODate("2017-01-01T00:00:00.000Z") } },
                            { $literal : " - " },
                            { $dateToString: { format: "%d/%m/%Y", date: ISODate("2017-01-10T00:00:00.000Z") } }
                        ] }
                    },
                    {
                        case : {
                            $and : [
                                { $gte : [ "$date", ISODate("2017-02-01T00:00:00.000Z") ] },
                                { $lt : [ "$date", ISODate("2017-02-10T00:00:00.000Z") ] }
                            ]
                        },
                        then : { $concat : [
                            { $dateToString: { format: "%d/%m/%Y", date: ISODate("2017-02-01T00:00:00.000Z") } },
                            { $literal : " - " },
                            { $dateToString: { format: "%d/%m/%Y", date: ISODate("2017-02-10T00:00:00.000Z") } }
                        ] }
                    }
                ],
                default : null
            }
        }
    } },
    { $match : { range : { $ne : null } } },
    { $group : {
        _id : "$range",
        count : { $sum : 1 },
        "amount summation" : { $sum : "$amount" }
    } }
])
Based on your data it will give the following results*:
{ "_id" : "01/02/2017 - 10/02/2017", "count" : 2, "amount summation" : 230 }
{ "_id" : "01/01/2017 - 10/01/2017", "count" : 3, "amount summation" : 170 }
*I believe you have a few typos in your question, which is why the results look different.
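The answer mentions wrapping the pipeline in a function and generalizing the date arguments. A minimal sketch of that idea follows. The helper name buildRangePipeline and the { from, to } shape of a range are assumptions made for illustration, not part of the original answer; the stages themselves are the same $project, $match and $group as above, with each range treated as from-inclusive and to-exclusive.

```javascript
// Hypothetical helper (not from the original answer): builds the
// $project / $match / $group pipeline for any list of [from, to) ranges.
function buildRangePipeline(ranges) {
    // Label used as the group key, e.g. "01/01/2017 - 10/01/2017".
    const label = (r) => ({
        $concat: [
            { $dateToString: { format: "%d/%m/%Y", date: r.from } },
            " - ",
            { $dateToString: { format: "%d/%m/%Y", date: r.to } }
        ]
    });
    // One $switch branch per range: from <= date < to.
    const branches = ranges.map((r) => ({
        case: { $and: [
            { $gte: ["$date", r.from] },
            { $lt: ["$date", r.to] }
        ] },
        then: label(r)
    }));
    return [
        { $project: { amount: true, range: { $switch: { branches: branches, default: null } } } },
        { $match: { range: { $ne: null } } },
        { $group: { _id: "$range", count: { $sum: 1 }, "amount summation": { $sum: "$amount" } } }
    ];
}

const pipeline = buildRangePipeline([
    { from: new Date("2017-01-01T00:00:00.000Z"), to: new Date("2017-01-10T00:00:00.000Z") },
    { from: new Date("2017-02-01T00:00:00.000Z"), to: new Date("2017-02-10T00:00:00.000Z") }
]);
// In the mongo shell: db.items.aggregate(pipeline)
```

With millions of documents it may also be worth putting a $match on the date field in front of these stages so that an index on date can be used. That is a suggestion to measure, not something the original answer includes.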
Related
Hello, I have an exercise: filter all countries where the gdp per person is greater than 0.05. I need to take the population from the latest year. Also, the country code should have at least 3 characters. My collection looks like this:
mondial.countries
{
"_id" : ObjectId("581cb5a519ec2deb4ba71c03"),
"name" : "Germany",
"code" : "GER",
"capital" : "RN-Niamey-Niamey",
"area" : 1267000,
"gdp" : 7304,
"inflation" : 1.9,
"unemployment" : null,
"independence" : ISODate("1960-08-03T00:00:00Z"),
"government" : "republic",
"population" : [
{
"year" : 1950,
"value" : 2559703
},
{
"year" : 1960,
"value" : 3337141
},
{
"year" : 1970,
"value" : 4412638
},
{
"year" : 1977,
"value" : 5102990
},
{
"year" : 1988,
"value" : 7251626
},
{
"year" : 1997,
"value" : 9113001
},
{
"year" : 2001,
"value" : 11060291
},
{
"year" : 2012,
"value" : 17138707
}
]
}
For this example I have to take the population from the year 2012, divide it by the gdp, and then display it if it's greater than 50000. I have been trying with a function in JS, but I don't know how to show only the results of my operation that are greater than 5000. What is the easiest way to do this?
var countries = db.mondial.countries.find({
"code": {$gte: 3},
});
while(countries.hasNext()) {
gdp = countries.next()
gdpresult = countries.population / gdp.gdp
print(gdpresult)
}
I'm not sure I understood correctly, but see if this helps:
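db.mondial.countries.aggregate([
    {
        // keep only countries whose code has at least 3 characters
        $match: {
            $expr: {
                $gte: [{ $strLenCP: '$code' }, 3]
            }
        }
    },
    {
        // divide every population entry by the country's gdp
        $project: {
            gdpresult: {
                $map: {
                    input: '$population',
                    as: 'p',
                    in: {
                        value: {
                            $divide: ['$$p.value', '$gdp']
                        },
                        year: '$$p.year'
                    }
                }
            }
        }
    }
])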
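Since the question tried a plain JavaScript loop, here is a small sketch in that style covering the two parts the pipeline above leaves open: picking the latest year and applying a threshold. The helper names populationPerGdp and passes are made up for illustration, the direction of the division follows the question's own example (population divided by gdp), and the threshold is left as a parameter because the question states several different values.

```javascript
// Hypothetical helper: returns the latest-year population divided by gdp,
// or null when the value cannot be computed.
function populationPerGdp(country) {
    const entries = country.population || [];
    if (entries.length === 0 || !country.gdp) {
        return null;
    }
    // Pick the entry with the highest year instead of relying on array order.
    const latest = entries.reduce((a, b) => (b.year > a.year ? b : a));
    return latest.value / country.gdp;
}

// Hypothetical filter: code has at least 3 characters and the ratio exceeds
// `threshold` (a placeholder, the question does not fix one value).
function passes(country, threshold) {
    const ratio = populationPerGdp(country);
    return typeof country.code === "string" && country.code.length >= 3 &&
        ratio !== null && ratio > threshold;
}

// Sample document shortened from the question.
const sample = {
    name: "Germany", code: "GER", gdp: 7304,
    population: [
        { year: 2001, value: 11060291 },
        { year: 2012, value: 17138707 }
    ]
};

// In the mongo shell, with a threshold of your choice:
// db.mondial.countries.find().forEach(function (c) {
//     if (passes(c, threshold)) { print(c.name, populationPerGdp(c)); }
// });
```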
I am a beginner in MongoDB. Here is my sample document:
{
"plan_id" : "100",
"schedule_plan_list" : [
{
"date" : "01-05-2020",
"time" : "9:00AM -10:00AM"
},
{
"date" : "02-05-2020",
"time" : "10:00AM -11:00AM"
},
{
"date" : "03-05-2020",
"time" : "9:00AM -10:00AM"
},
{
"date" : "04-05-2020",
"time" : "9:30AM -10:30AM"
},
{
"date" : "05-05-2020",
"time" : "9:00AM -10:00AM"
},
{
"date" : "06-05-2020",
"time" : "9:00AM -10:00AM"
},
{
"date" : "07-05-2020",
"time" : "9:30AM -10:30AM"
},
{
"date" : "08-05-2020",
"time" : "4:00PM -5:00PM"
}
]
}
I want to get the next 5 elements starting from a given date, "02-05-2020".
My query below returns only the element matching "02-05-2020", but I want "02-05-2020", "03-05-2020", ... "06-05-2020":
db.getCollection('schedule_plans').find({"plan_id" : "100"},{_id:0,"schedule_plan_list": { "$elemMatch": { "date" : "02-05-2020"}}})
Can anyone help me solve this?
You can try the aggregation query below:
db.collection.aggregate([
    { $match: { "plan_id": "100" } },
    /** Re-create the `schedule_plan_list` field with the condition applied, then slice the new array to keep the required number of elements */
    {
        $project: {
            _id: 0,
            "schedule_plan_list": {
                $slice: [
                    {
                        $filter: { input: "$schedule_plan_list", cond: { $gte: [ "$$this.date", "02-05-2020" ] } }
                    },
                    5
                ]
            }
        }
    }
])
Test : mongoplayground
Ref : aggregation
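One caveat about the query above: the dates are stored as "dd-mm-yyyy" strings, so $gte compares them character by character. That works inside a single month, but "01-06-2020" sorts before "02-05-2020" even though 1 June is later than 2 May. The sketch below converts both sides to real dates first. It assumes MongoDB 4.0 or later, where $dateFromString accepts a format option, and it assumes the array is already stored in date order; the helper names toDate and buildNextEntriesPipeline are made up for illustration.

```javascript
// Wraps a string expression (or literal) in a dd-mm-yyyy to date conversion.
function toDate(expr) {
    return { $dateFromString: { dateString: expr, format: "%d-%m-%Y" } };
}

// Hypothetical helper: same $filter + $slice idea as the answer above,
// but comparing parsed dates instead of raw strings.
function buildNextEntriesPipeline(planId, fromDate, count) {
    return [
        { $match: { plan_id: planId } },
        { $project: {
            _id: 0,
            schedule_plan_list: {
                $slice: [
                    { $filter: {
                        input: "$schedule_plan_list",
                        cond: { $gte: [toDate("$$this.date"), toDate(fromDate)] }
                    } },
                    count
                ]
            }
        } }
    ];
}

const nextFive = buildNextEntriesPipeline("100", "02-05-2020", 5);
// In the mongo shell: db.getCollection('schedule_plans').aggregate(nextFive)
```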
I've the following structure of docs:
{
"_id" : ObjectId("5786458371d24d924d8b4575"),
"uniqueNumber" : "3899822714",
"lastUpdatedAt" : ISODate("2016-07-13T20:11:11.000Z"),
"new" : [
{
"price" : 8.4,
"created" : ISODate("2016-07-13T13:11:28.000Z")
},
{
"price" : 10.0,
"created" : ISODate("2016-07-13T14:50:56.000Z")
}
],
"used" : [
{
"price" : 10.99,
"created" : ISODate("2016-07-08T13:46:31.000Z")
},
{
"price" : 8.59,
"created" : ISODate("2016-07-13T13:11:28.000Z")
}
]
}
Now I need to get a list that gives me the lowest price of each array per date.
So, as example:
{
"uniqueNumber" : 1234,
"prices" : {
"created" : 2016-07-08,
"minNew" : 123,
"minUsed" : 22
}
}
So far I've built the following query:
db.getCollection('col').aggregate([
{
$match : {
"uniqueNumber" : "3899822714"
}
},
{
$unwind : "$used"
},
{
$project : {
"uniqueNumber" : "$uniqueNumber",
"price" : "$used.price",
"ts" : "$used.created"
}
},
{
$sort : { "ts" : 1 }
},
{
$group : {_id: "$uniqueNumber", priceOfMaxTS : { $min: "$price" }, ts : { $last: "$ts" }}
}
]);
But this one only gives me the lowest price together with the latest date. I couldn't find anything that points me in the right direction to get the desired result.
UPDATE
I've found a way to get the lowest price of the used array grouped by day with this query:
db.getCollection('col').aggregate([
{
$match : {
"uniqueNumber" : "3899822714"
}
},
{
$unwind : "$used"
},
{
$project : {
"asin" : "$uniqueNumber",
"price" : "$used.price",
"ts" : "$used.created",
"y" : { "$year" : "$used.created" },
"m" : { "$month" : "$used.created" },
"d" : { "$dayOfMonth" : "$used.created" }
}
},
{
$group : { _id : { "year" : "$y", "month" : "$m", "day" : "$d" }, minPriceOfDay : { $min: "$price" }}
}
]);
Now I only need to find a way to do the same for the new array in the same query.
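One possible way to cover both arrays in a single query, building on the per-day grouping above, is to merge the two arrays into one tagged list before unwinding. This is only a sketch under assumptions: the helper names and the intermediate fields offers and kind are invented for illustration, and it relies on $concatArrays (MongoDB 3.2 or later) and on $min in $group skipping null values.

```javascript
// Tags every entry of one array with its kind so both arrays can be merged.
// $ifNull guards against a missing array, which would make $concatArrays null.
function tagged(field, kind) {
    return { $map: {
        input: { $ifNull: [field, []] },
        as: "o",
        in: { kind: kind, price: "$$o.price", created: "$$o.created" }
    } };
}

// Minimum price over the entries of one kind only; other entries yield null.
function minOf(kind) {
    return { $min: { $cond: [{ $eq: ["$offers.kind", kind] }, "$offers.price", null] } };
}

// Hypothetical helper: lowest new and used price per calendar day.
function buildMinPricePipeline(uniqueNumber) {
    return [
        { $match: { uniqueNumber: uniqueNumber } },
        { $project: {
            uniqueNumber: 1,
            offers: { $concatArrays: [tagged("$new", "new"), tagged("$used", "used")] }
        } },
        { $unwind: "$offers" },
        { $group: {
            _id: {
                uniqueNumber: "$uniqueNumber",
                created: { $dateToString: { format: "%Y-%m-%d", date: "$offers.created" } }
            },
            minNew: minOf("new"),
            minUsed: minOf("used")
        } },
        { $sort: { "_id.created": 1 } }
    ];
}

const minPrices = buildMinPricePipeline("3899822714");
// In the mongo shell: db.getCollection('col').aggregate(minPrices)
```

Days on which one of the arrays has no entry should come back with null for that field.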
Document looks like this:
{
"_id" : ObjectId("361de42f1938e89b179dda42"),
"user_id" : "u1",
"evaluator_id" : "e1",
"candidate_id" : ObjectId("54f65356294160421ead3ca1"),
"OVERALL_SCORE" : 150,
"SCORES" : [
{ "NAME" : "asd", "OBTAINED_SCORE" : 30}, { "NAME" : "acd", "OBTAINED_SCORE" : 36}
]
}
Aggregation function:
db.coll.aggregate([ {$unwind:"$SCORES"}, {$group : { _id : { user_id : "$user_id", evaluator_id : "$evaluator_id"}, AVG_SCORE : { $avg : "$SCORES.OBTAINED_SCORE" }}} ])
Suppose there are two documents with the same "user_id" (say u1) and different "evaluator_id" values (say e1 and e2).
For example:
1) Average will work like this ((30 + 20) / 2 = 25). This is working for me.
2) But for { evaluator_id : "e1" } document, score is 30 for { "NAME" : "asd" } and { evaluator_id : "e2" } document, score is 0 for { "NAME" : "asd" }. In this case, I want the AVG_SCORE to be 30 only (not (30 + 0) / 2 = 15).
Is this possible through aggregation?
Could anyone help me out?
It's possible by placing a $match stage between the $unwind and $group stages. The $match filters the unwound documents so that only scores whose obtained score is not equal to 0 are included in the average: "SCORES.OBTAINED_SCORE" : { $ne : 0 }
db.coll.aggregate([
{
$unwind: "$SCORES"
},
{
$match : {
"SCORES.OBTAINED_SCORE" : { $ne : 0 }
}
},
{
$group : {
_id : {
user_id : "$user_id",
evaluator_id : "$evaluator_id"
},
AVG_SCORE : {
$avg : "$SCORES.OBTAINED_SCORE"
}
}
}
])
For example, the aggregation result for this document:
{
"_id" : ObjectId("5500aaeaa7ef65c7460fa3d9"),
"user_id" : "u1",
"evaluator_id" : "e1",
"candidate_id" : ObjectId("54f65356294160421ead3ca1"),
"OVERALL_SCORE" : 150,
"SCORES" : [
{
"NAME" : "asd",
"OBTAINED_SCORE" : 0
},
{
"NAME" : "acd",
"OBTAINED_SCORE" : 36
}
]
}
will yield:
{
"result" : [
{
"_id" : {
"user_id" : "u1",
"evaluator_id" : "e1"
},
"AVG_SCORE" : 36
}
],
"ok" : 1
}
Let's say I have three students...
Alice, who is always there on Fridays.
{
"name" : "Alice",
"goes" : {
"mondays" : {
"fr" : 900,
"to" : 1400
},
"fridays" : {
"fr" : 700,
"to" : 1600
},
}
}
And Bob, who should be there on the first of January:
{
"_id" : ObjectId("5284a7085d60338b40b8f17d"),
"name" : "Bob",
"goes" : {
"mondays" : {
"fr" : 800,
"to" : 1200
},
"special" : [
{
"date" : "2010-01-01",
"fr" : 1000,
"to" : 1500
}
]
}
}
And Clair, who will not be attending on Mondays or at 10.00:
{
"_id" : ObjectId("5284c2785d60338b40b8f17f"),
"name" : "Clair",
"goes" : {
"wednesdays" : {
"fr" : 1100,
"to" : 1500
},
"special" : [
{
"date" : "2010-01-01",
"fr" : 1600,
"to" : 1900
},
{
"date" : "2010-01-02",
"fr" : 1000,
"to" : 1300
}
]
}
}
I want to find all students that should attend on Fridays at 7, or at 10 on the first of January 2010.
So I do this with the aggregation framework.
db.students.aggregate(
[
{
$unwind: "$goes.special"
},
{
$match: {
$or : [
{
'goes.fridays.fr': 700,
},
{
'goes.special.date' : '2010-01-01',
'goes.special.fr': 1000
}
]
}
}
]
)
But Alice does not show up. It clearly states why in the mongodb docs, http://docs.mongodb.org/manual/reference/operator/aggregation/unwind/ at the very bottom.
"If you specify a target field for $unwind that holds an empty array
([]) in an input document, the pipeline ignores the input document,
and will generates no result documents."
I could solve it by adding an array with a null value in it, but that does not seem like a nice solution.
Is there a way I could get $unwind NOT to ignore documents that do not have data in the unwound array?
You don't need $unwind at all. A simple $match in the pipeline is enough:
pipeline = [
{
"$match" : {
"$or" : [
{
"goes.fridays.fr" : 700
},
{
"goes.special" : {
"$elemMatch" : {
"date" : "2010-01-01",
"fr" : 1000
}
}
}
]
}
}
]
db.students.aggregate(pipeline)
It can be done easily even without the aggregation framework.
query = {
"$or" : [
{
"goes.fridays.fr" : 700
},
{
"goes.special" : {
"$elemMatch" : {
"date" : "2010-01-01",
"fr" : 1000
}
}
}
]
}
db.students.find(query)
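A note on why the answer uses $elemMatch instead of keeping the dot-notation conditions from the question: once $unwind is gone, 'goes.special.date' and 'goes.special.fr' written as separate conditions can each be satisfied by a different array element, while $elemMatch requires one element to satisfy both. The sketch below imitates the two rules in plain JavaScript for exact-value conditions only. It is an illustration with made-up function names, not MongoDB's implementation.

```javascript
// Plain JavaScript imitation of the two matching rules (illustration only).

// Dot notation: each condition may be satisfied by a different array element.
function matchesDotNotation(special, conditions) {
    return Object.keys(conditions).every((key) =>
        special.some((item) => item[key] === conditions[key]));
}

// $elemMatch: one single element has to satisfy all conditions.
function matchesElemMatch(special, conditions) {
    return special.some((item) =>
        Object.keys(conditions).every((key) => item[key] === conditions[key]));
}

// "special" arrays copied from the Bob and Clair documents in the question.
const bobSpecial = [{ date: "2010-01-01", fr: 1000, to: 1500 }];
const clairSpecial = [
    { date: "2010-01-01", fr: 1600, to: 1900 },
    { date: "2010-01-02", fr: 1000, to: 1300 }
];
const wanted = { date: "2010-01-01", fr: 1000 };

console.log(matchesDotNotation(bobSpecial, wanted), matchesElemMatch(bobSpecial, wanted));
// prints: true true
console.log(matchesDotNotation(clairSpecial, wanted), matchesElemMatch(clairSpecial, wanted));
// prints: true false
```

Clair has the right date in one element and the right start time in another, so only the $elemMatch rule keeps her out of the result.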