Filling in documents with default values after find/aggregate - mongodb

I have a collection:
{ "name" : "A", "value" : 1, "date" : ISODate("2014-01-01T00:00:00.000Z") }
{ "name" : "B", "value" : 7, "date" : ISODate("2014-01-01T00:00:00.000Z") }
{ "name" : "A", "value" : 3, "date" : ISODate("2014-01-02T00:00:00.000Z") }
{ "name" : "B", "value" : 8, "date" : ISODate("2014-01-02T00:00:00.000Z") }
{ "name" : "B", "value" : 8, "date" : ISODate("2014-01-03T00:00:00.000Z") }
{ "name" : "A", "value" : 5, "date" : ISODate("2014-01-04T00:00:00.000Z") }
{ "name" : "A", "value" : 4, "date" : ISODate("2014-01-05T00:00:00.000Z") }
There is no document for A on 3 Jan 2014. When I run a find/aggregate on A, I would like that document to appear in my result set with a default value (or, better, with the value from the previous date). For example:
{ "name" : "A", "value" : 1, "date" : ISODate("2014-01-01T00:00:00.000Z") }
{ "name" : "A", "value" : 3, "date" : ISODate("2014-01-02T00:00:00.000Z") }
{ "name" : "A", "value" : 3 (or default value -1), "date" : ISODate("2014-01-03T00:00:00.000Z") }
{ "name" : "A", "value" : 5, "date" : ISODate("2014-01-04T00:00:00.000Z") }
{ "name" : "A", "value" : 4, "date" : ISODate("2014-01-05T00:00:00.000Z") }
How can this be done?

To do this in the aggregation framework, you first need an array of the dates that you want your report to cover. For the input that you show, you might have an array:
days = [ ISODate("2014-01-01T00:00:00Z"), ISODate("2014-01-02T00:00:00Z"),
ISODate("2014-01-03T00:00:00Z"), ISODate("2014-01-04T00:00:00Z"),
ISODate("2014-01-05T00:00:00Z"), ISODate("2014-01-06T00:00:00Z") ];
to indicate that you want every one of these six days represented.
Here is the aggregation that you would run:
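db.coll.aggregate( [
  // Sum values per (name, date) in case a name has several documents for one day.
  {$group : {_id:{name:"$name",date:"$date"},value:{$sum:"$value"}}},
  // Per name: collect the days that have data and keep the grouped documents.
  {$group : {_id:"$_id.name", days:{$addToSet:"$_id.date"},docs:{$push:"$$ROOT"}}},
  // Days from the shell variable `days` for which this name has no data.
  {$project : {missingDays:{$setDifference:[days,"$days"]},docs:1}},
  {$unwind : "$missingDays"},
  {$unwind : "$docs"},
  // Rebuild both sets with the same {date, value} shape; missing days get 0.
  {$group : {
    _id:"$_id",
    days:{$addToSet:{date:"$docs._id.date",value:"$docs.value"}},
    missingDays:{$addToSet:{date:"$missingDays",value:{$literal:0}}}
  } },
  {$project : {_id:0, name:"$_id", date:{$setUnion:["$days","$missingDays"]}}},
  {$unwind : "$date"},
  // Sort by name, then date, which is the order of the output shown below.
  {$sort : {name:1,date:1}}
] )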
On your sample input with days defined as above it outputs:
{ "name" : "A", "date" : { "date" : ISODate("2014-01-01T00:00:00Z"), "value" : 1 } }
{ "name" : "A", "date" : { "date" : ISODate("2014-01-02T00:00:00Z"), "value" : 3 } }
{ "name" : "A", "date" : { "date" : ISODate("2014-01-03T00:00:00Z"), "value" : 0 } }
{ "name" : "A", "date" : { "date" : ISODate("2014-01-04T00:00:00Z"), "value" : 5 } }
{ "name" : "A", "date" : { "date" : ISODate("2014-01-05T00:00:00Z"), "value" : 4 } }
{ "name" : "A", "date" : { "date" : ISODate("2014-01-06T00:00:00Z"), "value" : 0 } }
{ "name" : "B", "date" : { "date" : ISODate("2014-01-01T00:00:00Z"), "value" : 7 } }
{ "name" : "B", "date" : { "date" : ISODate("2014-01-02T00:00:00Z"), "value" : 8 } }
{ "name" : "B", "date" : { "date" : ISODate("2014-01-03T00:00:00Z"), "value" : 8 } }
{ "name" : "B", "date" : { "date" : ISODate("2014-01-04T00:00:00Z"), "value" : 0 } }
{ "name" : "B", "date" : { "date" : ISODate("2014-01-05T00:00:00Z"), "value" : 0 } }
{ "name" : "B", "date" : { "date" : ISODate("2014-01-06T00:00:00Z"), "value" : 0 } }
The first $group stage may not be necessary in your case. It is there in case there are multiple documents for the same name and date, in which case their values are summed. The second $group stage and the $project stage work out the difference between the days present for each name and the array of days you want covered, producing missingDays, which gets the value 0 in the next $group stage. That $group stage builds, for each name, an array of dates that have data and an array of missing dates. It structures both the same way so that the following $project stage can take their union using the $setUnion operator. After that, all that's left is to $unwind the array of dates and sort it whichever way you want. Note that $unwind drops a name whose missingDays array is empty, so the pipeline works as written only when every name is missing at least one day in the range.
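The pipeline above fills gaps with a fixed value. For the "same as previous date" variant that the question mentions, one option is to fill the gaps on the client after a plain find on one name. The following is only a minimal sketch in plain JavaScript, not an aggregation stage: fillMissingDays and its parameters are hypothetical names, and dates are simplified to ISO strings instead of ISODate values.

```javascript
// Hypothetical client-side helper; it is not part of the MongoDB API.
// `docs` are the documents found for one name, `days` is the ascending list of
// days the report should cover, and dates are plain ISO strings for brevity.
function fillMissingDays(name, docs, days, defaultValue) {
  // Index the existing values by date for quick lookup.
  const valueByDate = new Map(docs.map((doc) => [doc.date, doc.value]));
  let previous = defaultValue; // used until the first real value is seen
  return days.map((date) => {
    if (valueByDate.has(date)) {
      previous = valueByDate.get(date);
    }
    // A missing day inherits the most recent value (or the default).
    return { name, value: previous, date };
  });
}

const reportDays = [
  "2014-01-01", "2014-01-02", "2014-01-03", "2014-01-04", "2014-01-05",
];
const docsForA = [
  { name: "A", value: 1, date: "2014-01-01" },
  { name: "A", value: 3, date: "2014-01-02" },
  { name: "A", value: 5, date: "2014-01-04" },
  { name: "A", value: 4, date: "2014-01-05" },
];

const filledForA = fillMissingDays("A", docsForA, reportDays, -1);
console.log(filledForA);
```

The default value only applies to days before the first real document; every later gap repeats the last value seen.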

Related

MongoDB - how to optimise find query with regex search, with sort

I need to execute the following query:
db.S12_RU.find({"venue.raw":a,"title":/b|c|d|e/}).sort({"year":-1}).skip(X).limit(Y);
where X and Y are numbers.
The number of documents in my collection is:
208915369
Currently, this sort of query takes about 6 minutes to execute.
I have the following indexes:
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_"
},
{
"v" : 2,
"key" : {
"venue.raw" : 1
},
"name" : "venue.raw_1"
},
{
"v" : 2,
"key" : {
"venue.raw" : 1,
"title" : 1,
"year" : -1
},
"name" : "venue.raw_1_title_1_year_-1"
}
]
A standard document looks like this:
{ "_id" : ObjectId("5fc25fc091e3146fb10484af"), "id" : "1967181478", "title" : "Quality of Life of Swedish Women with Fibromyalgia Syndrome, Rheumatoid Arthritis or Systemic Lupus Erythematosus", "authors" : [ { "name" : "Carol S. Burckhardt", "id" : "2052326732" }, { "name" : "Birgitha Archenholtz", "id" : "2800742121" }, { "name" : "Kaisa Mannerkorpi", "id" : "240289002" }, { "name" : "Anders Bjelle", "id" : "2419758571" } ], "venue" : { "raw" : "Journal of Musculoskeletal Pain", "id" : "49327845" }, "year" : 1993, "n_citation" : 31, "page_start" : "199", "page_end" : "207", "doc_type" : "Journal", "publisher" : "Taylor & Francis", "volume" : "1", "issue" : "", "doi" : "10.1300/J094v01n03_20" }
Is there any way to make this query execute in a few seconds?

How to group by minimum value in nested arrays

I have a MongoDB collection with unique user traits, and I'm trying to combine it with the users' orders to produce a new collection that holds each user's first order date and the total amount of their orders.
Given this example collection:
{
"_id" : 1,
"name" : "bob",
"orders" : [ { "date" : "2019-01-01", "amount" : 10 }, { "date" : "2019-01-02", "amount" : 10 } ]
}
{
"_id" : 2,
"name" : "lisa",
"orders" : [ { "date" : "2019-01-02", "amount" : 10 }, { "date" : "2019-01-03", "amount" : 15 } ]
}
this would be my desired output:
{
"_id" : 1,
"name" : "bob",
"first_order" : "2019-01-01",
"total_amount" : 20
}
{
"_id" : 2,
"name" : "lisa",
"first_order" : "2019-01-02",
"total_amount" : 25
}
Thank you
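The desired output amounts to taking the minimum order date and the sum of the order amounts per user. As a minimal sketch of that logic in plain JavaScript (not a MongoDB API call; summarizeOrders is a hypothetical name, and the two users are assumed to have distinct _id values as in the desired output):

```javascript
// Hypothetical helper, not part of the MongoDB API: summarizes one user document.
function summarizeOrders(user) {
  const dates = user.orders.map((order) => order.date);
  return {
    _id: user._id,
    name: user.name,
    // "YYYY-MM-DD" strings sort chronologically, so string comparison is enough.
    first_order: dates.length
      ? dates.reduce((min, date) => (date < min ? date : min))
      : null,
    total_amount: user.orders.reduce((sum, order) => sum + order.amount, 0),
  };
}

const users = [
  { _id: 1, name: "bob", orders: [
    { date: "2019-01-01", amount: 10 }, { date: "2019-01-02", amount: 10 } ] },
  { _id: 2, name: "lisa", orders: [
    { date: "2019-01-02", amount: 10 }, { date: "2019-01-03", amount: 15 } ] },
];

const summaries = users.map(summarizeOrders);
console.log(summaries);
```

On the server side, the same result can probably be reached with a single $project stage that uses {$min: "$orders.date"} and {$sum: "$orders.amount"}, since both operators accept an array expression inside $project from MongoDB 3.2 on; check this against your server version.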

Can't convert from BSON type string to Date on aggregation pipeline

I am using MongoDB 3.4.9 and want a monthly report per customer (custInfo). The aggregation further down fails with this error:
can't convert from BSON type string to Date
Here are sample records with nested items:
{
"_id" : ObjectId("59da6a331c7a9ac0b6674fe8"),
"date" : ISODate("2017-10-08T18:10:59.899Z"),
"items" : [
{
"quantity" : 1,
"price" : 47.11,
"desc" : "Item #1"
},
{
"quantity" : 2,
"price" : 42.0,
"desc" : "Item #2"
}
],
"custInfo" : "Tobias Trelle, gold customer"
}
{
"_id" : ObjectId("59da6a511c7a9ac0b6674fed"),
"date" : ISODate("2017-10-08T18:11:28.961Z"),
"items" : [
{
"quantity" : 1,
"price" : 47.11,
"desc" : "Item #1"
},
{
"quantity" : 2,
"price" : 42.0,
"desc" : "Item #2"
}
],
"custInfo" : "Tobias Trelle, gold customer"
}
{
"_id" : ObjectId("59da6a511c7a9ac0b6674ff0"),
"date" : ISODate("2017-10-08T18:11:29.133Z"),
"items" : [
{
"quantity" : 1,
"price" : 47.11,
"desc" : "Item #1"
},
{
"quantity" : 2,
"price" : 42.0,
"desc" : "Item #2"
}
],
"custInfo" : "Tobias Trelle, gold customer"
}
Here is the MongoDB query that counts orders grouped by custInfo and month:
db.runCommand({aggregate:"order", pipeline :
[{$match : {$and : [{"date" : {$gte : ISODate("2016-10-08T18:10:59.899Z")}},
{"date" : {$lte : ISODate("2018-10-08T18:10:59.899Z")}}]}}
,
{ "$project" : { "custInfo" : 1 ,"count" : 1 , "date" : 1 ,
"duration" : {"$month" : [ "date"]}}},
{ "$group" : { "_id" :
{ "duration" : "$duration" , "custInfo" : "$custInfo"} ,"count" : { "$sum" : 1} }}
]}//,
//cursor:{batchSize:1000}
)
Please help me find where I went wrong.
Regards
Kris
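I'm not sure why $month is called "duration" here, but in any event you dropped the dollar sign from the field reference, so $month received the literal string "date" instead of the value of the date field. The array wrapper around the argument is not needed either. This should work: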
{ "$project" : { "custInfo" : 1 ,
"count" : 1 ,
"date" : 1 ,
"duration" : {"$month" : "$date" } }}
Not related to your specific case, but a generic way to avoid this BSON error is to check the field's type before converting it:
{"$project": { "y.dateFieldName":
{"$cond":[{ $eq: [{$type: "$data.dateFieldName"},'date']},{"$year":"$data.dateFieldName"},-1]}}}
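For reference, the "$date" fix from the first answer can be placed back into the full pipeline from the question. This is a sketch under a few assumptions: the collection is called order as in the question, new Date(...) stands in for the shell's ISODate(...) so the snippet also runs outside the shell, the two $and conditions are merged into one range condition, and the unused count field is left out of $project.

```javascript
// Pipeline from the question with the field reference corrected to "$date".
const pipeline = [
  // Keep orders inside the reporting window.
  { $match: { date: { $gte: new Date("2016-10-08T18:10:59.899Z"),
                      $lte: new Date("2018-10-08T18:10:59.899Z") } } },
  // "$date" (with the dollar sign) refers to the field; "date" would be a string.
  { $project: { custInfo: 1, duration: { $month: "$date" } } },
  // Count orders per month and customer.
  { $group: { _id: { duration: "$duration", custInfo: "$custInfo" },
              count: { $sum: 1 } } },
];

// In the mongo shell this would be run as: db.order.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```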

how to get the aggregate sum on a set of fields with the same values using mongo

I am trying to count the documents that have the same values for a set of fields, using the mongo shell. These are sample documents:
{
"id" : "1",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 697,
"name" : "vendor1"
}
{
"id" : "2",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 380,
"name" : "vendor2"
}
{
"id" : "2",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 380,
"name" : "vendor2"
}
{
"id" : "3",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 702,
"name" : "vendor3"
}
{
"id" : "3",
"date" : ISODate("2017-04-29T00:00:00.000Z"),
"amount" : 702,
"name" : "vendor3"
}
The query I have tried is:
db.results.aggregate([
{$group:{'_id':{name:'$name', id:'$id', date:'$date', amount:'$amount',
count:{'$sum':1}}}},
{$match:{'count':{'$gt':1}}}])
but it returned 0 records. I would also like to know how many such documents were found. How can I solve this?
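In your query, count is nested inside _id, so there is no top-level count field for $match to test. Move it out so that it is a sibling of _id in the $group stage: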
db.results.aggregate([
{ $group:{'_id': {name:'$name', id:'$id', date:'$date', amount:'$amount'}
, count: {$sum: 1} } }
])
Result:
{ "_id" : { "name" : "vendor3", "id" : "3", "date" : ISODate("2017-04-29T00:00:00Z"), "amount" : 702 }, "count" : 2 }
{ "_id" : { "name" : "vendor2", "id" : "2", "date" : ISODate("2017-04-29T00:00:00Z"), "amount" : 380 }, "count" : 2 }
{ "_id" : { "name" : "vendor1", "id" : "1", "date" : ISODate("2017-04-29T00:00:00Z"), "amount" : 697 }, "count" : 1 }
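The result above still includes vendor1, whose count is 1. To keep only the duplicated combinations and to answer "how many such documents", the grouping can be followed by a count > 1 filter. Here is a minimal sketch of that logic in plain JavaScript rather than as a pipeline; findDuplicateGroups is a hypothetical name and dates are simplified to ISO strings.

```javascript
// Hypothetical helper, not part of the MongoDB API: groups documents by the
// same four fields as the $group stage and keeps groups seen more than once.
function findDuplicateGroups(docs) {
  const groups = new Map();
  for (const { name, id, date, amount } of docs) {
    const key = JSON.stringify([name, id, date, amount]);
    const group = groups.get(key) || { _id: { name, id, date, amount }, count: 0 };
    group.count += 1;
    groups.set(key, group);
  }
  return [...groups.values()].filter((group) => group.count > 1);
}

const results = [
  { id: "1", date: "2017-04-29", amount: 697, name: "vendor1" },
  { id: "2", date: "2017-04-29", amount: 380, name: "vendor2" },
  { id: "2", date: "2017-04-29", amount: 380, name: "vendor2" },
  { id: "3", date: "2017-04-29", amount: 702, name: "vendor3" },
  { id: "3", date: "2017-04-29", amount: 702, name: "vendor3" },
];

const duplicateGroups = findDuplicateGroups(results);
// Total number of documents that belong to a duplicated combination.
const duplicateDocCount = duplicateGroups.reduce((sum, g) => sum + g.count, 0);
console.log(duplicateGroups, duplicateDocCount);
```

In the pipeline itself, the equivalent would be to keep the {$match:{'count':{'$gt':1}}} stage from the question after the corrected $group stage.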

Mongodb - Return all the associative documents with value for the key derived from another query

I have documents with the following structure:
{
"Type" : "Request",
"Cat" : "A",
"ID" : 10
}
{
"Type" : "Processed",
"Cat" : "A",
"ID" : 10
}
{
"Type" : "Receieved",
"Cat" : "A",
"ID" : 10
}
{
"Type" : "Receieved",
"Cat" : "B",
"ID" : 11
}
{
"Type" : "Processed",
"Cat" : "C",
"ID" : 12
}
I want to:
1. Find the documents with Type: "Processed" and get their ID.
2. Return all documents that share an ID found in step 1.
I need the results to be like this:
{
"Type" : "Request",
"Cat" : "A",
"ID" : 10
}
{
"Type" : "Processed",
"Cat" : "A",
"ID" : 10
}
{
"Type" : "Receieved",
"Cat" : "A",
"ID" : 10
}
{
"Type" : "Processed",
"Cat" : "C",
"ID" : 12
}
Can someone help me achieve this? I used $elemMatch under $match in aggregate, but it's not working as expected.
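You can try something like this. It groups the documents by ID, keeps the groups that contain a Processed document, and unwinds them again:
db.collection.aggregate([
  // Wrap the original fields in a sub-document so they can be pushed as one unit.
  {$project : {
    "ID":1,
    "doc.Type" : "$Type",
    "doc.Cat" : "$Cat",
    "doc.ID" : "$ID"
    }
  },
  {$group : {
    _id : "$ID",
    docs : {$push : "$doc"}
    }
  },
  // Keep only the IDs that have at least one Processed document.
  {$match : {
    "docs.Type":"Processed"
    }
  },
  {$unwind : "$docs"},
  // docs is not listed here, so it is dropped without an explicit exclusion.
  {$project : {
    _id : 0,
    "Type" : "$docs.Type",
    "Cat" : "$docs.Cat",
    "ID" : "$docs.ID"
    }
  }
])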
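The two steps listed in the question can also be written directly: collect the IDs of the Processed documents, then keep every document whose ID is in that set. The following is a minimal sketch of that logic in plain JavaScript, not a MongoDB call; docsSharingProcessedId is a hypothetical name.

```javascript
// Hypothetical helper, not part of the MongoDB API.
function docsSharingProcessedId(docs) {
  // Step 1: IDs of all documents with Type "Processed".
  const processedIds = new Set(
    docs.filter((doc) => doc.Type === "Processed").map((doc) => doc.ID)
  );
  // Step 2: every document that shares one of those IDs.
  return docs.filter((doc) => processedIds.has(doc.ID));
}

const sampleDocs = [
  { Type: "Request", Cat: "A", ID: 10 },
  { Type: "Processed", Cat: "A", ID: 10 },
  { Type: "Receieved", Cat: "A", ID: 10 },
  { Type: "Receieved", Cat: "B", ID: 11 },
  { Type: "Processed", Cat: "C", ID: 12 },
];

const matched = docsSharingProcessedId(sampleDocs);
console.log(matched);
```

Against a live collection, the same idea would take two queries (one for the IDs, one with $in on ID), which may be simpler to reason about than the single pipeline if the ID list stays small.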