Query Time series data based on Date - mongodb

Given the document below, I would like to return the same document but with only the hourly entry that matches the minute of the date field (here hourly.1, because the minute is 01).
Does anyone know how to do this dynamically? If the date's minute were 0, I would want hourly.0 returned, and so on up to minute 59.
{
"_id" : "08062017/cpu",
"date" : ISODate("2018-04-11T02:01:00.000Z"),
"metadata" : {
"host" : "localhost",
"metric" : "cpu"
},
"hourly" : {
"0" : {
"total" : 234,
"used" : 123
},
"1" : {
"total" : 234,
"used" : 123
}
}
}
RESULT:
{
"_id" : "08062017/cpu",
"date" : ISODate("2018-04-11T02:01:00.000Z"),
"metadata" : {
"host" : "localhost",
"metric" : "cpu"
},
"hourly" : {
"1" : {
"total" : 234,
"used" : 123
}
}
}
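One way to do this on the server is to turn hourly into an array of key/value pairs, keep the pair whose key equals the minute of date, and turn the result back into an object. What follows is a minimal sketch, not a definitive answer. It assumes MongoDB 4.0 or later (for $toString) and a collection named events, which the question does not state. The function pickMinuteSlot is a hypothetical client-side equivalent, included so the selection logic can be checked without a server.

```javascript
// Server side: keep only the hourly entry whose key equals the minute of `date`.
const minuteSlotPipeline = [
  { $addFields: {
      hourly: {
        $arrayToObject: {
          $filter: {
            input: { $objectToArray: "$hourly" },
            as: "slot",
            // $minute yields 0..59; the keys are strings, so convert before comparing.
            cond: { $eq: ["$$slot.k", { $toString: { $minute: "$date" } }] }
          }
        }
      }
  } }
];
// In the mongo shell (the collection name `events` is an assumption):
// db.events.aggregate(minuteSlotPipeline)

// Hypothetical client-side equivalent of the same selection.
function pickMinuteSlot(doc) {
  const key = String(doc.date.getUTCMinutes()); // UTC, like $minute without a timezone
  const hourly = {};
  if (doc.hourly && Object.prototype.hasOwnProperty.call(doc.hourly, key)) {
    hourly[key] = doc.hourly[key];
  }
  return Object.assign({}, doc, { hourly: hourly });
}
```

If no key matches the minute, both versions return the document with an empty hourly object.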

Related

how to get the aggregate sum on a set of fields with the same values using mongo

I am trying to count the documents that share the same values on a set of fields, using the mongo shell. These are sample documents:
{
"id" : "1",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 697,
"name" : "vendor1"
}
{
"id" : "2",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 380,
"name" : "vendor2"
}
{
"id" : "2",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 380,
"name" : "vendor2"
}
{
"id" : "3",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 702,
"name" : "vendor3"
}
{
"id" : "3",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 702,
"name" : "vendor3"
}
The query I tried is:
db.results.aggregate([
{$group:{'_id':{name:'$name', id:'$id', date:'$date', amount:'$amount',
count:{'$sum':1}}}},
{$match:{'count':{'$gt':1}}}])
but it returned 0 records. I would also like to know how many such documents were found. How can I solve this?
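count is an accumulator, so it has to be a sibling of _id inside $group, not a key nested inside _id. In your query there is no top-level count field for $match to test, which is why nothing comes back. You can use this:
db.results.aggregate([
    { $group: { _id: { name: '$name', id: '$id', date: '$date', amount: '$amount' },
                count: { $sum: 1 } } }
])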
Result:
{ "_id" : { "name" : "vendor3", "id" : "3", "date" : ISODate("2017-04-29T00:00:00Z"), "amount" : 702 }, "count" : 2 }
{ "_id" : { "name" : "vendor2", "id" : "2", "date" : ISODate("2017-04-29T00:00:00Z"), "amount" : 380 }, "count" : 2 }
{ "_id" : { "name" : "vendor1", "id" : "1", "date" : ISODate("2017-04-29T00:00:00Z"), "amount" : 697 }, "count" : 1 }
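If only the groups that occur more than once are wanted, as the question asks, a $match on count can follow the $group. This is a minimal sketch in mongo shell JavaScript. The helper name duplicateGroupsPipeline is hypothetical, and the optional $count stage assumes MongoDB 3.4 or later.

```javascript
// Hypothetical helper: builds a pipeline that groups on the given fields
// and keeps only the groups that contain more than one document.
function duplicateGroupsPipeline(fields) {
  const id = {};
  for (const f of fields) {
    id[f] = "$" + f; // e.g. { name: "$name" }
  }
  return [
    { $group: { _id: id, count: { $sum: 1 } } },
    { $match: { count: { $gt: 1 } } }
  ];
}

const duplicatesPipeline = duplicateGroupsPipeline(["name", "id", "date", "amount"]);
// In the mongo shell:
// db.results.aggregate(duplicatesPipeline)
// To get only the number of such groups (assumes MongoDB 3.4+):
// db.results.aggregate(duplicatesPipeline.concat([{ $count: "duplicateGroups" }]))
```

On the sample documents this should keep the vendor2 and vendor3 groups and drop vendor1, whose count is 1.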

MongoDB Query for Time Series data

I am trying to write a find query that retrieves only the data for the first hour, i.e. hourly: "1", from the following events document. The following is the output of db.events.find().pretty().
In the real scenario, I would query by id and hour.
{
"_id" : "08062017/cpu",
"metadata" : {
"host" : "localhost",
"metric" : "cpu"
},
"hourly" : {
"0" : {
"total" : 234,
"used" : 123
},
"1" : {
"total" : 234,
"used" : 123
}
}
}
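A projection on the dotted path is usually enough for this. The sketch below uses a hypothetical helper named hourQuery and assumes the hour keys are stored as plain strings such as "1", as in the sample document.

```javascript
// Hypothetical helper: builds the filter and projection for one document id and one hour.
function hourQuery(id, hour) {
  const projection = { metadata: 1 };   // keep metadata alongside the selected hour
  projection["hourly." + hour] = 1;     // e.g. { "hourly.1": 1 }
  return { filter: { _id: id }, projection: projection };
}

const firstHour = hourQuery("08062017/cpu", 1);
// In the mongo shell:
// db.events.find(firstHour.filter, firstHour.projection).pretty()
```

Because _id is included by default, the result should contain _id, metadata, and only the requested key under hourly.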

MongoDB find subdocument

I am starting a collection for aggregated results, following MongoDB's 'Pre-aggregated reports' pattern.
I am inserting docs with this schema:
{
"_id" : "20160526/78220/59",
"metadata" : {
"date" : "2016-05-26",
"offer" : "78220",
"adv" : NumberLong(59),
"update" : ISODate("2016-05-26T10:49:25.597Z")
},
"daily" : NumberLong(6),
"hourly" : {
"12" : NumberLong(6)
},
"publisher" : {
"43" : {
"daily" : NumberLong(3),
"hourly" : {
"12" : NumberLong(3)
}
},
"738" : {
"daily" : NumberLong(3),
"hourly" : {
"12" : NumberLong(3)
}
}
}
}
The idea is to store aggregated counts for every hour and every publisher. The _id is date/offer/code, and each publisher can serve several offers.
But now I need to get, for example, the sum of the daily or hourly data across all of a publisher's offers.
My main question is: how can I get a report for a specific publisher, for example 738 or 43?
If I query:
db.getCollection('daily_aggregate').find({'publisher.738':{$exists:true}})
I get all documents that include publisher 738, but I also get the other publishers' data. I want to retrieve the data for 738 only.
I have tried different approaches, but I probably have to include a pub_id inside the publisher schema in some way.
Any clue?
Thanks in advance.
Probably, what I need to query is just this:
db.getCollection('daily_aggregate').find({'publisher.738':{$exists: true}}, {_id: 0, 'publisher' : 1})
It gives me:
/* 1 */
{
"publisher" : {
"738" : {
"daily" : NumberLong(4),
"hourly" : {
"18" : NumberLong(4)
}
}
}
}
/* 2 */
{
"publisher" : {
"738" : {
"daily" : NumberLong(6),
"hourly" : {
"18" : NumberLong(6)
}
}
}
}
/* 3 */
{
"publisher" : {
"43" : {
"daily" : NumberLong(9),
"hourly" : {
"12" : NumberLong(3),
"19" : NumberLong(6)
}
},
"738" : {
"daily" : NumberLong(3),
"hourly" : {
"12" : NumberLong(3)
}
}
}
}
/* 4 */
{
"publisher" : {
"43" : {
"daily" : NumberLong(1),
"hourly" : {
"12" : NumberLong(1)
}
},
"738" : {
"daily" : NumberLong(2),
"hourly" : {
"12" : NumberLong(2)
}
}
}
}
But I am wondering whether this is the best approach.
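Projecting the whole publisher field is why the other publishers still appear in the output. Projecting the dotted path publisher.738 instead should restrict it to that publisher. The sketch below uses hypothetical helper names and also shows one way to sum the daily counter across documents for a single publisher.

```javascript
// Hypothetical helper: filter and projection for a single publisher id.
function publisherFind(pubId) {
  const path = "publisher." + pubId;
  const filter = {};
  filter[path] = { $exists: true };
  const projection = { _id: 0 };
  projection[path] = 1;                 // e.g. { "publisher.738": 1 }
  return { filter: filter, projection: projection };
}

// Hypothetical helper: pipeline that sums `daily` for one publisher over all documents.
function publisherDailyTotalPipeline(pubId) {
  const path = "publisher." + pubId;
  const match = {};
  match[path] = { $exists: true };
  return [
    { $match: match },
    { $group: { _id: null, daily: { $sum: "$" + path + ".daily" } } }
  ];
}

const pub738 = publisherFind(738);
// In the mongo shell:
// db.getCollection('daily_aggregate').find(pub738.filter, pub738.projection)
// db.getCollection('daily_aggregate').aggregate(publisherDailyTotalPipeline(738))
```

Keeping the publisher id as a key works for lookups like these. Storing publishers as an array of subdocuments with an explicit pub_id would be the alternative the question hints at, at the cost of a different update pattern.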

How to group data by Date and compare it by Month MongoDB

I have a large amount of CDR (call detail record) data like this:
{
"_id" : ObjectId("54eecc9a6c6852b9f0575bbb"),
"msisdn" : "9818895866",
"callType" : "NA",
"duration" : 13.5,
"charges" : 200,
"traffic" : "Data",
"Date" : ISODate("2014-02-15T12:15:42.535Z")
}
{
"_id" : ObjectId("54eecc9a6c6852b9f0575bbc"),
"msisdn" : "9818356561",
"callType" : "STD",
"duration" : 20.100000381469727,
"charges" : 100,
"traffic" : "Voice",
"Date" : ISODate("2014-01-09T00:11:14.646Z")
}
{
"_id" : ObjectId("54eecc9a6c6852b9f0575bbd"),
"msisdn" : "9818173670",
"callType" : "NA",
"duration" : 19.399999618530273,
"charges" : 300,
"traffic" : "Data",
"Date" : ISODate("2014-01-13T19:48:47.789Z")
}
{
"_id" : ObjectId("54eecc9a6c6852b9f0575bbe"),
"msisdn" : "9818719936",
"callType" : "Local",
"duration" : 9,
"charges" : 350,
"traffic" : "SMS",
"Date" : ISODate("2014-03-02T10:51:29.846Z")
}
{
"_id" : ObjectId("54eecc9a6c6852b9f0575bbf"),
"msisdn" : "9818612562",
"callType" : "STD",
"duration" : 5.110000133514404,
"charges" : 450,
"traffic" : "Voice",
"Date" : ISODate("2014-01-08T16:41:30.327Z")
}
For traffic = "Data", I want to find the msisdns whose total duration for the previous month is greater than twice their total duration for the current month.
Only the msisdns that satisfy this condition should be displayed, along with their sums.
I tried this:
db.CDR.aggregate([
    { $match : { traffic : "Data" } },
    { "$group" : {
        "_id" : {
            "Msisdn" : "$msisdn",
            "Month" : "$date"
        },
        "Total Duration" : { "$sum" : "$duration" },
        "Count" : { "$sum" : 1 }
    } }
])
It displays the all-time sum of duration for each msisdn.
I want to group by month, compare the months with each other, and display only the msisdns where the previous month is the greater one.
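Two things stand out in the attempt above. The sample documents store the timestamp in a field named Date, while the query refers to $date, so the month part of the group key is likely missing, which would explain the all-time sums. Also, grouping on the raw timestamp would not group by month anyway; $year and $month are needed for that. The following is a sketch under those assumptions: the helper name monthOverMonthPipeline is hypothetical, and the two months are passed in explicitly rather than derived from the current date.

```javascript
// Hypothetical helper. `prev` and `curr` look like { year: 2014, month: 1 }.
function monthOverMonthPipeline(prev, curr) {
  // Expression that is true when the document's Date falls in the given month (UTC).
  const inMonth = (m) => ({
    $and: [
      { $eq: [{ $year: "$Date" }, m.year] },
      { $eq: [{ $month: "$Date" }, m.month] }
    ]
  });
  return [
    { $match: { traffic: "Data" } },
    { $group: {
        _id: "$msisdn",
        previous: { $sum: { $cond: [inMonth(prev), "$duration", 0] } },
        current: { $sum: { $cond: [inMonth(curr), "$duration", 0] } }
    } },
    // Flag the msisdns whose previous-month total exceeds twice the current-month total.
    { $project: {
        previous: 1,
        current: 1,
        exceeds: { $gt: ["$previous", { $multiply: ["$current", 2] }] }
    } },
    { $match: { exceeds: true } }
  ];
}

// In the mongo shell, comparing January 2014 against February 2014:
// db.CDR.aggregate(monthOverMonthPipeline({ year: 2014, month: 1 }, { year: 2014, month: 2 }))
```

On a large collection, adding a Date range to the first $match so that only the two months are scanned would probably help, provided Date is indexed.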

MongoDB flatten embedded array

I'd like to create a report on a collection. Its schema is below.
(I simplified the schema to focus on the problem.)
Mongoose Schema
var MobilHomeSchema = new Schema({
    id: Schema.Types.ObjectId,
    region: String,
    equipments: [
        { id: ObjectId, label: String }
    ]
});
The collection contains lots of mobilhomes. Each mobilhome is on a campsite, in a region (I chose region as the grouping; it could just as well be country, etc.). Each mobilhome has some equipment, and not always the same items.
I'd like to create a spreadsheet with the columns below, counting how many of each equipment item there are per region (this is just an example).
Expected generic result format
region | equipments.label 1 | equipments.label 2 | equipments.label 3 | ...
Example with "real" values:
region | terrace | pergola | shower
Spain  | 30      | 15      | 150
France | 55      | 32      | 540
...
In JSON format, it could be:
EDIT
[{
region: "Spain",
terrace: 30,
pergola: 15,
shower: 150
},
{
region: "France",
terrace: 55,
pergola: 32,
shower: 540
}]
/EDIT
How can I do this?
(Map-reduce? A more business-intelligence-oriented tool?)
Many thanks!
Don't use map/reduce. Use aggregation. In the mongo shell,
> db.mobile.aggregate([
{ "$unwind" : "$equipments" },
{ "$group" : { "_id" : { "region" : "$region", "label" : "$equipments.label" }, "count" : { "$sum" : 1 } } }
])
On the documents
{ "region" : "France", "equipments" : [ { "_id" : 0, "label" : "terrace" }, { "_id" : 1, "label" : "pergola" } ] },
{ "region" : "France", "equipments" : [ { "_id" : 0, "label" : "shower" }, { "_id" : 1, "label" : "pergola" } ] },
{ "region" : "Spain", "equipments" : [ { "_id" : 0, "label" : "terrace" }, { "_id" : 1, "label" : "shower" } ] },
{ "region" : "Spain", "equipments" : [ { "_id" : 0, "label" : "veranda" }, { "_id" : 1, "label" : "pergola" } ] }
the result is
{ "_id" : { "region" : "Spain", "label" : "veranda" }, "count" : 1 }
{ "_id" : { "region" : "Spain", "label" : "terrace" }, "count" : 1 }
{ "_id" : { "region" : "Spain", "label" : "shower" }, "count" : 1 }
{ "_id" : { "region" : "France", "label" : "shower" }, "count" : 1 }
{ "_id" : { "region" : "France", "label" : "pergola" }, "count" : 2 }
{ "_id" : { "region" : "Spain", "label" : "pergola" }, "count" : 1 }
{ "_id" : { "region" : "France", "label" : "terrace" }, "count" : 1 }
Since you're using an array, presumably you don't know all the possible types of equipment ahead of time, which makes shoving the above results back into one object per region in the aggregation an unwieldy thing to attempt. Better to work with these results in the client.
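That client-side step can be sketched as follows. It is a minimal example in plain JavaScript. The function name pivotByRegion is hypothetical, and it assumes the rows have exactly the shape produced by the aggregation above and that no equipment label is literally "region", since such a label would overwrite the region field.

```javascript
// Hypothetical helper: folds { _id: { region, label }, count } rows
// into one object per region, with one property per equipment label.
function pivotByRegion(rows) {
  const byRegion = new Map();
  for (const row of rows) {
    const region = row._id.region;
    if (!byRegion.has(region)) {
      byRegion.set(region, { region: region });
    }
    byRegion.get(region)[row._id.label] = row.count;
  }
  // Regions come out in the order they were first seen.
  return Array.from(byRegion.values());
}

// In the mongo shell, the rows could come from the pipeline above:
// var report = pivotByRegion(db.mobile.aggregate([ /* $unwind, $group */ ]).toArray());
```

Labels that never occur in a region are simply absent from that region's object, so a spreadsheet export would need to fill those cells with 0.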