Querying for a Date Range - mongodb

I have the following data schema:
{
"Address" : "Test1",
"City" : "London",
"Country" : "UK",
"Currency" : "",
"Price_History" : {
"2014-07-04T02:42:58" : [
{
"value1" : 98,
"value2" : 98,
"value3" : 98
}
],
"2014-07-04T03:50:50" : [
{
"value1" : 91,
"value2" : 92,
"value3" : 93
}
]
},
"Location" : [
9.3435,
52.1014
],
"Postal_code" : "xxx"
}
How can I write a MongoDB query that returns all results between "2014-07-04T02:42:58" and "2014-07-04T03:50:50"? And how can I select only results with values from 91 to 93 without knowing the date?
Thanks

This is not a good way to model the data, because the dates are stored as keys rather than as values. A better model would be as follows:
{
"Address" : "Test1",
"City" : "London",
"Country" : "UK",
"Currency" : "",
"Price_History" : [
{ "dateEntry": 1, "date": ISODate("2014-07-04T02:42:58Z"), "value": 98 },
{ "dateEntry": 2, "date": ISODate("2014-07-04T02:42:58Z"), "value": 98 },
{ "dateEntry": 3, "date": ISODate("2014-07-04T02:42:58Z"), "value": 98 },
{ "dateEntry": 1, "date": ISODate("2014-07-04T03:50:50Z"), "value": 91 },
{ "dateEntry": 2, "date": ISODate("2014-07-04T03:50:50Z"), "value": 92 },
{ "dateEntry": 3, "date": ISODate("2014-07-04T03:50:50Z"), "value": 93 },
],
"Location" : [
9.3435,
52.1014
],
"Postal_code" : "xxx"
}
Or something along those lines that does not rely on the date being part of the key path. Your query would then be relatively simple, but keep in mind that MongoDB matches documents, not individual array elements. To return only the matching array entries, you can dissect the array with the aggregation framework:
db.collection.aggregate([
// Still match first to reduce the possible documents
{ "$match": {
"Price_History": {
"$elemMatch": {
"date": {
"$gte": ISODate("2014-07-04T02:42:58Z"),
"$lte": ISODate("2014-07-04T03:50:50Z")
},
"value": 98
}
}
}},
// Unwind to "de-normalize"
{ "$unwind": "$Price_History" },
// Match this time to "filter" the array which is now documents
{ "$match": {
"Price_History.date": {
"$gte": ISODate("2014-07-04T02:42:58Z"),
"$lte": ISODate("2014-07-04T03:50:50Z")
},
"Price_History.value": 98
}},
// Now group back each document with the matches
{ "$group": {
"_id": "$_id",
"Address": { "$first": "$Address" },
"City": { "$first": "$City" },
"Country": { "$first": "$Country" },
"Currency": { "$first": "$Currency" },
"Price_History": { "$push": "$Price_History" },
"Location": { "$first": "$Location" },
"Postal_code": { "$first": "$Postal_code" }
}}
])
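On MongoDB 3.4 or later, the $unwind/$group round trip above can usually be replaced with a single $filter expression. The following is a minimal sketch under those assumptions, not a tested drop-in: the helper name buildFilterPipeline is hypothetical, and it uses plain JavaScript Date objects where the shell examples use ISODate.

```javascript
// Hypothetical helper: builds a pipeline that keeps only the matching
// Price_History entries, without $unwind and $group.
function buildFilterPipeline(start, end, value) {
  return [
    // Same initial match as above, so fewer documents reach the next stage
    { $match: {
      Price_History: {
        $elemMatch: { date: { $gte: start, $lte: end }, value: value }
      }
    }},
    // Overwrite the array with only the entries that satisfy the condition
    { $addFields: {
      Price_History: {
        $filter: {
          input: "$Price_History",
          as: "entry",
          cond: { $and: [
            { $gte: ["$$entry.date", start] },
            { $lte: ["$$entry.date", end] },
            { $eq: ["$$entry.value", value] }
          ]}
        }
      }
    }}
  ];
}

const pipeline = buildFilterPipeline(
  new Date("2014-07-04T02:42:58Z"),
  new Date("2014-07-04T03:50:50Z"),
  98
);
// In the shell: db.collection.aggregate(pipeline)
```

Because $addFields replaces only Price_History, the other fields pass through untouched and do not need to be listed one by one as in the $group stage.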
Otherwise you are better off dropping the "normalization" and just going for discrete documents that you can simply process via a standard .find(). Much faster and simpler.
{
"Address" : "Test1",
"City" : "London",
"Country" : "UK",
"Currency" : "",
"date": ISODate("2014-07-04T02:42:58Z"),
"value": 98
}
Etc. So then just query:
db.collection.find({
"date": {
"$gte": ISODate("2014-07-04T02:42:58Z"),
"$lte": ISODate("2014-07-04T03:50:50Z")
},
"value": 98
})
I would really go with that as a "de-normalized" "Price History" collection as it is much more efficient and basically what the aggregation statement is emulating.
The query you ask for is possible against the original schema using something that evaluates JavaScript, like MongoDB mapReduce, but it would need to scan the entire collection without any index assistance, and that is bad.
Take your case to the boss to re-model and earn your bonus now.
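The second part of the question (values from 91 to 93, regardless of date) becomes an ordinary range query once each price entry is its own document, as suggested above. A minimal sketch, assuming the flattened collection with date and value fields; buildPriceFilter is a hypothetical helper name, and the Date objects stand in for ISODate:

```javascript
// Hypothetical helper: builds a .find() filter for the flattened documents.
// Either range may be omitted by passing null.
function buildPriceFilter(dateRange, valueRange) {
  const filter = {};
  if (dateRange) {
    filter.date = { $gte: dateRange.from, $lte: dateRange.to };
  }
  if (valueRange) {
    filter.value = { $gte: valueRange.min, $lte: valueRange.max };
  }
  return filter;
}

// Values from 91 to 93, without knowing the date
const byValue = buildPriceFilter(null, { min: 91, max: 93 });

// Both conditions at once
const byBoth = buildPriceFilter(
  { from: new Date("2014-07-04T02:42:58Z"), to: new Date("2014-07-04T03:50:50Z") },
  { min: 91, max: 93 }
);
// In the shell: db.collection.find(byValue)
```

An index on value, or a compound index on date and value, would likely let such queries avoid a full collection scan.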

Related

Retrieve highest score for each game using aggregate in MongoDB

I am working on a database of various games and I want to design a query that returns the top scorer from each game, with specific player details.
The document structure is as follows:
db.gaming_system.insertMany(
[
{
"_id": "01",
"name": "GTA 5",
"high_scores": [
{
"hs_id": 1,
"name": "Harry",
"score": 6969
},
{
"hs_id": 2,
"name": "Simon",
"score": 8574
},
{
"hs_id": 3,
"name": "Ethan",
"score": 4261
}
]
},
{
"_id": "02",
"name": "Among Us",
"high_scores": [
{
"hs_id": 1,
"name": "Harry",
"score": 926
},
{
"hs_id": 2,
"name": "Simon",
"score": 741
},
{
"hs_id": 3,
"name": "Ethan",
"score": 841
}
]
}
]
)
I have created a query using aggregate which returns the name of game and the highest score for that game as follows
db.gaming_system.aggregate(
{ "$project": { "maximumscore": { "$max": "$high_scores.score" }, name:1 } },
{ "$group": { "_id": "$_id", Name: { $first: "$name" }, "Highest_Score": { "$max": "$maximumscore" } } },
{ "$sort" : { "_id":1 } }
)
The output from my query is as follows:
{ "_id" : "01", "Name" : "GTA 5", "Highest_Score" : 8574 }
{ "_id" : "02", "Name" : "Among Us", "Highest_Score" : 926 }
I want to generate output which also provides the name of player and "hs_id" of that player who has the highest score for each game as follows:
{ "_id" : "01", "Name" : "GTA 5", "Top_Scorer" : "Simon", "hs_id": 2, "Highest_Score" : 8574 }
{ "_id" : "02", "Name" : "Among Us", "Top_Scorer" : "Harry", "hs_id": 1, "Highest_Score" : 926 }
What should be added to my query using aggregate pipeline?
[
{
$unwind: "$high_scores" //unwind the high_scores, so you can then sort
},
{
$sort: {
"high_scores.score": -1 //sort the high_scores, irrelevant of game, because we are going to group in next stage
}
},
{
//now group by _id; $first takes the name and top scorer from the first document in each group, which has the highest score after the descending sort
$group: {
_id: "$_id",
name: {
$first: "$name"
},
Top_Scorer: {
$first: "$high_scores"
}
}
}
]
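To get the flat shape requested in the question, one option is to append a $project stage that lifts the fields out of Top_Scorer. The sketch below shows such a pipeline, plus a plain JavaScript function (topScorers, a hypothetical name) that mirrors the same unwind, sort and group-first logic in memory so the result can be checked without a server.

```javascript
// The pipeline from above, with a $project and a final $sort appended
const pipeline = [
  { $unwind: "$high_scores" },
  { $sort: { "high_scores.score": -1 } },
  { $group: {
    _id: "$_id",
    Name: { $first: "$name" },
    Top_Scorer: { $first: "$high_scores" }
  }},
  { $project: {
    Name: 1,
    Top_Scorer: "$Top_Scorer.name",
    hs_id: "$Top_Scorer.hs_id",
    Highest_Score: "$Top_Scorer.score"
  }},
  { $sort: { _id: 1 } }
];

// In-memory mirror of the same steps, for illustration only
function topScorers(games) {
  return games
    // $unwind: one row per (game, high score) pair
    .flatMap(g => g.high_scores.map(hs => ({ _id: g._id, name: g.name, hs })))
    // $sort by score, descending
    .sort((a, b) => b.hs.score - a.hs.score)
    // $group with $first: keep the first row seen for each game
    .reduce((acc, row) => {
      if (!acc.some(r => r._id === row._id)) {
        acc.push({ _id: row._id, Name: row.name, Top_Scorer: row.hs.name,
                   hs_id: row.hs.hs_id, Highest_Score: row.hs.score });
      }
      return acc;
    }, [])
    // final $sort by _id
    .sort((a, b) => a._id.localeCompare(b._id));
}

const games = [
  { _id: "01", name: "GTA 5", high_scores: [
    { hs_id: 1, name: "Harry", score: 6969 },
    { hs_id: 2, name: "Simon", score: 8574 },
    { hs_id: 3, name: "Ethan", score: 4261 }
  ]},
  { _id: "02", name: "Among Us", high_scores: [
    { hs_id: 1, name: "Harry", score: 926 },
    { hs_id: 2, name: "Simon", score: 741 },
    { hs_id: 3, name: "Ethan", score: 841 }
  ]}
];
const result = topScorers(games);
// In the shell: db.gaming_system.aggregate(pipeline)
```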

group first, make bucketauto second in mongodb aggregation

I have a dataset structured like that:
{
"id": 1230239,
"group_name": "A",
"confidence": 0.14333882876354542,
},
{
"id": 1230240,
"group_name": "B",
"confidence": 0.4434535,
},
Etc.
It is pretty simple to calculate the buckets and the number of items in each bucket of confidence level, using $bucketAuto like this:
{
"$bucketAuto": {
"groupBy": "$confidence",
"buckets": 4
}
}
But how can I do the same for each group, separately?
I tried this one:
{"$group": {
"_id": "group",
"data": {
"$push": {
"confidence": "$confidence",
}
}
}
},
{
"$bucketAuto": {
"groupBy": "$data.confidence",
"buckets": 4
}
}
But that does not work.
What I need roughly is this as an output:
{ 'groupA':
{
"_id": {
"min": 0.0005225352581638143,
"max": 0.2905137273072962
},
"count": 67
},
{"_id": {
"min": 0.2905137273072962,
"max":0.5531611756507283,
},
"count": 43
},
},
{ 'groupB':
{
"_id": {
"min": 0.0005225352581638143,
"max": 0.2905137273072962
},
"count": 67
},
{"_id": {
"min": 0.2905137273072962,
"max":0.5531611756507283,
},
"count": 43
},
}
Any advice or hint would be appreciated
$facet to the rescue -- the "multigroup" operator. This pipeline:
db.foo.aggregate([
{$facet: {
"groupA": [
{$match: {"group_name": "A"}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
]
,"groupB": [
{$match: {"group_name": "B"}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
]
}}
]);
yields the output you seek:
{
"groupA" : [
{
"_id" : {
"min" : 0.14333882876354542,
"max" : 0.34333882876354543
},
"count" : 2
},
{
"_id" : {
"min" : 0.34333882876354543,
"max" : 0.5433388287635454
},
"count" : 2
},
{
"_id" : {
"min" : 0.5433388287635454,
"max" : 0.5433388287635454
},
"count" : 1
}
],
"groupB" : [
{
"_id" : {
"min" : 0.5433388287635454,
"max" : 0.7433388287635454
// etc. etc.
If you want to go totally dynamic, you'll need to do it in two passes: first get the distinct group names, then build the $facet expression from those names:
var fct_stage = {};
db.foo.distinct("group_name").forEach(function(name) {
fct_stage["group" + name] = [
{$match: {"group_name": name}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
];
});
db.foo.aggregate([ {$facet: fct_stage} ]);

querying date ranges without using loop in mongodb using aggregate

I have this array of date ranges. It's for a chart feature in my web app.
[{
"start": "7/01/2016",
"end": "7/31/2016"
},{
"start": "8/01/2016",
"end": "8/31/2016"
},{
"start": "9/01/2016",
"end": "9/30/2016"
}]
This is my sample data.
{
"_id": 68,
"charges": [
{
"id": "ch_1AD2wYHDsLEzoG2tjPo7uGnq",
"amount": 1200,
"created": "7/13/2016"
},{
"id": "ch_1ADPRPHDsLEzoG2t1k3o0qCz",
"amount": 2000,
"created": "8/1/2016"
},{
"id": "ch_1ADluFHDsLEzoG2t608Bppzn",
"amount": 900,
"created": "8/2/2016"
},{
"id": "ch_1AE8OWHDsLEzoG2tBmlm1A22",
"amount": 1800,
"created": "9/14/2016"
}
]
}
This is the result that I'm trying to achieve.
[
{
"created": "9/13/2016",
"amount": 1200
},{
"created": "9/14/2016",
"amount": 2900
},{
"created": "9/15/2016",
"amount": 1800
},
]
Can I achieve that without looping the date range and querying inside? I only manage to get this far
[
{
$match: { _id: 68 }
},{
$unwind: "$charges"
},{
// I don't know what to do here
}
]
NOTE: Never mind the date formatting
You can achieve this using the $bucket operator, introduced in MongoDB 3.4, like this:
db.collection.aggregate([
{
$match:{
_id:68
}
},
{
$unwind:"$charges"
},
{
$bucket:{
groupBy:"$charges.created",
boundaries:[
"7/01/2016",
"8/01/2016",
"9/01/2016",
"9/30/2016"
],
default:"others",
output:{
amount:{
$sum:"$charges.amount"
}
}
}
}
])
Explanation:
Match a specific document using $match.
Unwind the charges array.
Group by range (the ranges are given in boundaries).
This outputs:
{ "_id" : "7/01/2016", "amount" : 1200 }
{ "_id" : "8/01/2016", "amount" : 2900 }
{ "_id" : "9/01/2016", "amount" : 1800 }
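Note that the pipeline above compares dates as strings, which only works while the strings happen to sort in date order (for example, "10/01/2016" sorts before "7/01/2016"). The last boundary is also exclusive, so a charge dated 9/30/2016 would land in "others". Below is a hedged sketch of one way around both issues, assuming charges.created is stored as a real date rather than a string and that the ranges are contiguous and sorted; parseUsDate, boundariesFromRanges and buildBucketStage are hypothetical helper names.

```javascript
// Parse "M/D/YYYY" into a UTC Date
function parseUsDate(text) {
  const [month, day, year] = text.split("/").map(Number);
  return new Date(Date.UTC(year, month - 1, day));
}

// Turn contiguous, sorted ranges into $bucket boundaries: every start date,
// plus the day after the last end date (upper bounds are exclusive)
function boundariesFromRanges(ranges) {
  const starts = ranges.map(r => parseUsDate(r.start));
  const lastEnd = parseUsDate(ranges[ranges.length - 1].end);
  const dayMs = 24 * 60 * 60 * 1000;
  return [...starts, new Date(lastEnd.getTime() + dayMs)];
}

// Build the $bucket stage from the chart's date ranges
function buildBucketStage(ranges) {
  return {
    $bucket: {
      groupBy: "$charges.created",
      boundaries: boundariesFromRanges(ranges),
      default: "others",
      output: { amount: { $sum: "$charges.amount" } }
    }
  };
}

const ranges = [
  { start: "7/01/2016", end: "7/31/2016" },
  { start: "8/01/2016", end: "8/31/2016" },
  { start: "9/01/2016", end: "9/30/2016" }
];
const stage = buildBucketStage(ranges);
// In the shell:
// db.collection.aggregate([{ $match: { _id: 68 } }, { $unwind: "$charges" }, stage])
```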

Combining multiple sub-documents into a new doc in mongo

I am trying to query multiple sub-documents in MongoDB and return as a single doc.
I think the aggregation framework is the way to go, but I can't seem to get it exactly right.
Take the following docs:
{
"board_id": "1",
"hosts":
[{
"name": "bob",
"ip": "10.1.2.3"
},
{
"name": "tom",
"ip": "10.1.2.4"
}]
}
{
"board_id": "2",
"hosts":
[{
"name": "mickey",
"ip": "10.2.2.3"
},
{
"name": "mouse",
"ip": "10.2.2.4"
}]
}
{
"board_id": "3",
"hosts":
[{
"name": "pavel",
"ip": "10.3.2.3"
},
{
"name": "kenrick",
"ip": "10.3.2.4"
}]
}
Trying to get a query result like this:
{
"hosts":
[{
"name": "bob",
"ip": "10.1.2.3"
},
{
"name": "tom",
"ip": "10.1.2.4"
},
{
"name": "mickey",
"ip": "10.2.2.3"
},
{
"name": "mouse",
"ip": "10.2.2.4"
},
{
"name": "pavel",
"ip": "10.3.2.3"
},
{
"name": "kenrick",
"ip": "10.3.2.4"
}]
}
I've tried this:
db.collection.aggregate([ { $unwind: '$hosts' }, { $project : { name: 1, hosts: 1, _id: 0 }} ])
But it's not quite what I want.
You can definitely do this with aggregate. Let's assume your data is in collection named board, so please replace it with whatever your collection name is.
db.board.aggregate([
{$unwind:"$hosts"},
{$group:{_id:null, hosts:{$addToSet:"$hosts"}}},
{$project:{_id:0, hosts:1}}
]).pretty()
it will return
{
"hosts" : [
{
"name" : "kenrick",
"ip" : "10.3.2.4"
},
{
"name" : "pavel",
"ip" : "10.3.2.3"
},
{
"name" : "mouse",
"ip" : "10.2.2.4"
},
{
"name" : "mickey",
"ip" : "10.2.2.3"
},
{
"name" : "tom",
"ip" : "10.1.2.4"
},
{
"name" : "bob",
"ip" : "10.1.2.3"
}
]
}
So your basic problem here is that the arrays are contained in separate documents. So while you are correct to $unwind the array for processing, in order to bring the content into a single array you would need to $group the result across documents, and $push the content to the result array:
db.collection.aggregate([
{ "$unwind": "$hosts" },
{ "$group": {
"_id": null,
"hosts": { "$push": "$hosts" }
}}
])
So just as $unwind will "deconstruct" the array elements, the $push accumulator in $group "reconstructs" the array. And since there is no other key to "group" on, this brings all the elements into a single array.
Note that a null grouping key is only really practical when the resulting document would not exceed the BSON limit. Otherwise you are better off leaving the individual elements as documents in themselves.
Optionally remove the _id with an additional $project if required.
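For reference, the complete pipeline with the optional $project stage is sketched below, together with a plain JavaScript equivalent (collectHosts, a hypothetical name) that illustrates what $unwind followed by $group with $push produces. Unlike $addToSet, $push keeps duplicate entries.

```javascript
// $unwind, $group with $push, then drop the null _id
const pipeline = [
  { $unwind: "$hosts" },
  { $group: { _id: null, hosts: { $push: "$hosts" } } },
  { $project: { _id: 0, hosts: 1 } }
];

// In-memory illustration: concatenate every document's hosts array
function collectHosts(docs) {
  return { hosts: docs.flatMap(doc => doc.hosts) };
}

const boards = [
  { board_id: "1", hosts: [
    { name: "bob", ip: "10.1.2.3" },
    { name: "tom", ip: "10.1.2.4" }
  ]},
  { board_id: "2", hosts: [
    { name: "mickey", ip: "10.2.2.3" },
    { name: "mouse", ip: "10.2.2.4" }
  ]},
  { board_id: "3", hosts: [
    { name: "pavel", ip: "10.3.2.3" },
    { name: "kenrick", ip: "10.3.2.4" }
  ]}
];
const combined = collectHosts(boards);
// In the shell: db.collection.aggregate(pipeline)
```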

Sort working opposite

I have a mongo collection:
/* 0 */
{
"_id" : ObjectId("51f1fcc08188d3117c6da351"),
"cust_id" : "abc123",
"ord_date" : ISODate("2012-10-03T18:30:00Z"),
"status" : "A",
"price" : 25,
"items" : [{
"sku" : "ggg",
"qty" : 7,
"price" : 2.5
}, {
"sku" : "ppp",
"qty" : 5,
"price" : 2.5
}]
}
/* 1 */
{
"_id" : ObjectId("51fa1c318188d305fcbf9f9b"),
"cust_id" : "abc123",
"ord_date" : ISODate("2012-10-03T18:30:00Z"),
"status" : "A",
"price" : 27,
"items" : [{
"sku" : "ggg",
"qty" : 7,
"price" : 2.5
}, {
"sku" : "ppp",
"qty" : 5,
"price" : 2.5
}]
}
When I run the aggregate query to sort in descending order:
db.orders.aggregate([{
"$unwind": "$items"
}, {
"$sort": {
"price": -1
}
}, {
"$match": {}
}, {
"$group": {
"price": {
"$first": "$price"
},
"items": {
"$push": {
"sku": "$items.sku"
}
},
"_id": "$_id"
}
}, {
"$project": {
"_id": 0,
"price": 1,
"items": 1
}
}])
I get result:
{
"result": [{
"price": 25,
"items": [{
"sku": "ggg"
}, {
"sku": "ppp"
}]
}, {
"price": 27,
"items": [{
"sku": "ggg"
}, {
"sku": "ppp"
}]
}]
}
i.e. it sorts in ascending order instead, and vice versa.
Move the $sort after $group, since the previous sort will be lost after grouping.
db.orders.aggregate([{
"$unwind": "$items"
}, {
"$match": {}
}, {
"$group": {
"price": {
"$first": "$price"
},
"items": {
"$push": {
"sku": "$items.sku"
}
},
"_id": "$_id"
}
}, {
"$sort": {
"price": -1
}
}, {
"$project": {
"_id": 0,
"price": 1,
"items": 1
}
}])
As for the $natural operator, this is quoted from the docs:
The $natural operator uses the following syntax to return documents in
the order they exist on disk
Long story short, that means the order you see is not necessarily consistent with the order in which the documents are stored in the DB.