Express MongoDB aggregation sum

I have many records in my MongoDB collection and I need to recompute some information.
Record format:
{
"_id" : someId,
"targetFrom" : ObjectId("603e0355e805140334e79438"),<-- this is ID for search
"targetTo" : null,
"operationPaid" : true,
"type" : "coming", <--- type
"moneyAccount" : someId,
"agent" : null,
"sum" : 5000, <--- sum
}
{
"_id" : someId,
"targetFrom" : null,
"targetTo" : null,
"operationPaid" : true,
"type" : "out", <--- type
"moneyAccount" : someId,
"agent" : ObjectId("603e0355e805140334e79438"),<-- this is ID for search
"sum" : 3000, <--- sum
}
So I need to group the records by type and compute the sum for the id ObjectId("603e0355e805140334e79438"). The id to search for can appear in the targetFrom, targetTo, or agent field.
For this example the result should be 2000:
the sum 5000 is "coming" and the sum 3000 is "out".

Query
Match the id in any of the three possible fields.
Group by null (the whole collection becomes one group); if type is "out", subtract the sum field, otherwise add it.
db.collection.aggregate([
  // keep records where any of the three fields holds the id
  {"$match": {
    "$expr": {
      "$or": [
        {"$eq": ["$targetFrom", ObjectId("603e0355e805140334e79438")]},
        {"$eq": ["$targetTo", ObjectId("603e0355e805140334e79438")]},
        {"$eq": ["$agent", ObjectId("603e0355e805140334e79438")]}
      ]
    }
  }},
  // one group for everything; "out" sums count as negative
  {"$group": {
    "_id": null,
    "sum": {
      "$sum": {
        "$cond": [{"$eq": ["$type", "out"]}, {"$subtract": [0, "$sum"]}, "$sum"]
      }
    }
  }}
])
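To check the arithmetic without a database, the same match-then-signed-sum logic can be sketched in plain JavaScript. This is only an in-memory illustration: the ids are plain strings instead of ObjectId values, the third record is made up to show that non-matching records are ignored, and the helper names matchesId and signedTotal are hypothetical.

```javascript
// In-memory copies of the two sample records; ids are plain strings here (assumption).
const TARGET_ID = "603e0355e805140334e79438";
const records = [
  { targetFrom: TARGET_ID, targetTo: null, agent: null, type: "coming", sum: 5000 },
  { targetFrom: null, targetTo: null, agent: TARGET_ID, type: "out", sum: 3000 },
  // Hypothetical unrelated record: it must not affect the result.
  { targetFrom: "someOtherId", targetTo: null, agent: null, type: "coming", sum: 999 },
];

// Mirrors the $match with $or: true if any of the three fields holds the id.
function matchesId(record, id) {
  return record.targetFrom === id || record.targetTo === id || record.agent === id;
}

// Mirrors $group with $sum and $cond: "out" sums are subtracted, all others added.
function signedTotal(allRecords, id) {
  return allRecords
    .filter((r) => matchesId(r, id))
    .reduce((total, r) => total + (r.type === "out" ? -r.sum : r.sum), 0);
}

const balance = signedTotal(records, TARGET_ID);
console.log(balance); // 2000
```

Note that, like the pipeline, this treats every type other than "out" as incoming.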

Related

MongoDB aggregation and paging

Each of my documents has an internal id field and the date when the document was added. Several documents can share the same id (different versions of the same document), but their dates always differ. In one query, I want to fetch only one document out of all versions with the same id, namely the one that was current at a specified date, and display the results with paging (50 rows per page). Is there a way to do this in MongoDB (query documents by some field, group them by the id field, sort by the date field and take only the first, all with paging)?
Please see the example. Some of these documents are distinct documents (A, B and C), and some are versions of the same document;
for example, _id 1, 2 and 3 are all versions of document A.
Document A {
_id : 1,
"id" : "A",
"author" : "value",
"date" : "2015-11-05"
}
Document A {
_id : 2,
"id" : "A",
"author" : "value",
"date" : "2015-11-06"
}
Document A {
_id : 3,
"id" : "A",
"author" : "value",
"date" : "2015-11-07"
}
Document B {
_id : 4,
"id" : "B",
"author" : "value",
"date" : "2015-11-06"
}
Document B {
_id : 5,
"id" : "B",
"author" : "value",
"date" : "2015-11-07"
}
Document C {
_id : 6,
"id" : "C",
"author" : "value",
"date" : "2015-11-07"
}
I want to query all documents that have "value" in the "author" field,
and from those fetch only one document per id: the one with the latest date
on or before the specified date, for example 2015-11-08. So I expect the result to be:
_id : 3, _id : 5, _id : 6
I also need paging, for example 10 documents per page.
Thanks!
Two documents can't have the same _id; there is a unique index on _id by default.
Because of this, you need a compound _id field that includes the date:
{
"_id":{
docId: yourFormerIdValue,
date: new ISODate()
}
// other fields
}
To get the version valid at a specified date, the query becomes rather easy:
db.yourColl.find({
// dot notation is required: a nested {"_id": {...}} filter would be an exact subdocument match
"_id.docId": idToFind,
// get only the versions valid up to a specific date...
"_id.date": { "$lte": someISODate }
})
// ...sort the results descending...
.sort({ "_id.date": -1 })
// ...and get only the first and therefore newest entry
.limit(1)
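The query above returns the version of a single document. For the full request (latest version of every document as of a date, with paging), the steps would roughly correspond to $match, $sort, $group with $first, then $skip and $limit. The following is a plain JavaScript sketch of that logic over the question's sample data, not a MongoDB query; the function name latestVersionsPage and its parameters are hypothetical, and dates are kept as ISO strings so that string comparison orders them correctly.

```javascript
// Sample documents from the question; dates are ISO strings, which compare lexicographically.
const docs = [
  { _id: 1, id: "A", author: "value", date: "2015-11-05" },
  { _id: 2, id: "A", author: "value", date: "2015-11-06" },
  { _id: 3, id: "A", author: "value", date: "2015-11-07" },
  { _id: 4, id: "B", author: "value", date: "2015-11-06" },
  { _id: 5, id: "B", author: "value", date: "2015-11-07" },
  { _id: 6, id: "C", author: "value", date: "2015-11-07" },
];

// Hypothetical helper: latest version of each document on or before `asOf`, one page at a time.
function latestVersionsPage(allDocs, author, asOf, pageNumber, pageSize) {
  const latestById = new Map();
  for (const doc of allDocs) {
    // Filter step: wrong author or newer than the requested date is skipped.
    if (doc.author !== author || doc.date > asOf) continue;
    // Group step: keep only the newest version seen so far for this id.
    const current = latestById.get(doc.id);
    if (!current || doc.date > current.date) latestById.set(doc.id, doc);
  }
  // A deterministic order is needed so that pages do not overlap.
  const ordered = [...latestById.values()].sort((a, b) => a.id.localeCompare(b.id));
  const start = (pageNumber - 1) * pageSize; // skip
  return ordered.slice(start, start + pageSize); // limit
}

const page1 = latestVersionsPage(docs, "value", "2015-11-08", 1, 10);
console.log(page1.map((d) => d._id)); // [ 3, 5, 6 ]
```

Sorting before slicing matters: without a fixed order, consecutive pages could repeat or skip documents.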

Find oldest/youngest post in mongodb collection

I have a mongodb collection with many fields. One field is 'date_time', which is in an ISO datetime format, Ex: ISODate("2014-06-11T19:16:46Z"), and another field is 'name'.
Given a name, how do I find out the oldest/youngest post in the collection?
Ex: If there are two posts in the collection 'data' :
[{'name' : 'John', 'date_time' : ISODate("2014-06-11T19:16:46Z")},
{'name' : 'John', 'date_time' : ISODate("2015-06-11T19:16:46Z")}]
Given the name 'John' how do I find out the oldest post in the collection i.e., the one with ISODate("2014-06-11T19:16:46Z")? Similarly for the youngest post.
Oldest:
db.posts.find({ "name" : "John" }).sort({ "date_time" : 1 }).limit(1)
Newest:
db.posts.find({ "name" : "John" }).sort({ "date_time" : -1 }).limit(1)
Index on { "name" : 1, "date_time" : 1 } to make the queries efficient.
You could aggregate it as below:
Create an index on the name and date_time fields, so that the $match and $sort stages can use it.
db.t.createIndex({"name":1,"date_time":1})
$match all the records for the desired name(s).
$sort by date_time in ascending order.
$group by the name field. Use the $first operator to get the first record of the group, which will also be the oldest. Use the $last operator to get the last record in the group, which will also be the newest.
To get the entire record, use the $$ROOT system variable.
Code:
db.t.aggregate([
{$match:{"name":"John"}},
{$sort:{"date_time":1}},
{$group:{"_id":"$name","oldest":{$first:"$$ROOT"},
"youngest":{$last:"$$ROOT"}}}
])
Output:
{
"_id" : "John",
"oldest" : {
"_id" : ObjectId("54da62dc7f9ac597d99c182d"),
"name" : "John",
"date_time" : ISODate("2014-06-11T19:16:46Z")
},
"youngest" : {
"_id" : ObjectId("54da62dc7f9ac597d99c182e"),
"name" : "John",
"date_time" : ISODate("2015-06-11T19:16:46Z")
}
}
db.t.find().sort({ "date_time" : 1 }).limit(1).pretty()
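The sort-then-take-first/last idea behind the $group stage can be illustrated outside MongoDB. This is a minimal in-memory sketch in plain JavaScript; the function name oldestAndYoungest and the extra "Jane" post are assumptions added for illustration.

```javascript
// Sample posts; the "Jane" entry is hypothetical and only shows that other names are ignored.
const posts = [
  { name: "John", date_time: new Date("2015-06-11T19:16:46Z") },
  { name: "John", date_time: new Date("2014-06-11T19:16:46Z") },
  { name: "Jane", date_time: new Date("2013-01-01T00:00:00Z") },
];

// Filter by name, sort ascending by date, then take the first and last entries.
function oldestAndYoungest(allPosts, name) {
  const sorted = allPosts
    .filter((p) => p.name === name) // filter() returns a copy, so `posts` is not reordered
    .sort((a, b) => a.date_time - b.date_time);
  if (sorted.length === 0) return null; // no posts for this name
  return { _id: name, oldest: sorted[0], youngest: sorted[sorted.length - 1] };
}

const result = oldestAndYoungest(posts, "John");
console.log(result.oldest.date_time.toISOString()); // 2014-06-11T19:16:46.000Z
console.log(result.youngest.date_time.toISOString()); // 2015-06-11T19:16:46.000Z
```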

MongoDB Group querying for Embedded Document

I have a mongo document which has structure like
{
"_id" : "THIS_IS_A_DHP_USER_ID+2014-11-26",
"_class" : "weight",
"items" : [
{
"dateTime" : ISODate("2014-11-26T08:08:38.716Z"),
"value" : 98.5
},
{
"dateTime" : ISODate("2014-11-26T08:18:38.716Z"),
"value" : 95.5
},
{
"dateTime" : ISODate("2014-11-26T08:28:38.663Z"),
"value" : 90.5
}
],
"source" : "MANUAL",
"to" : ISODate("2014-11-26T08:08:38.716Z"),
"from" : ISODate("2014-11-26T08:08:38.716Z"),
"userId" : "THIS_IS_A_DHP_USER_ID",
"createdDate" : ISODate("2014-11-26T08:38:38.776Z")
}
{
"_id" : "THIS_IS_A_DHP_USER_ID+2014-11-25",
"_class" : "weight",
"items" : [
{
"dateTime" : ISODate("2014-11-25T08:08:38.716Z"),
"value" : 198.5
},
{
"dateTime" : ISODate("2014-11-25T08:18:38.716Z"),
"value" : 195.5
},
{
"dateTime" : ISODate("2014-11-25T08:28:38.716Z"),
"value" : 190.5
}
],
"source" : "MANUAL",
"to" : ISODate("2014-11-25T08:08:38.716Z"),
"from" : ISODate("2014-11-25T08:08:38.716Z"),
"userId" : "THIS_IS_A_DHP_USER_ID",
"createdDate" : ISODate("2014-11-26T08:38:38.893Z")
}
The query I want to run on this document structure:
finding documents for a particular user id
unwinding the embedded array
grouping the documents by _id while
summing the items.value of the embedded array
getting the minimum of the items.dateTime of the embedded array
Note: I want the sum and min returned as an object, i.e. { value : sum, dateTime : min of the items.dateTime }, inside an array of items.
Can this be achieved in a single aggregation call using $push or some other technique?
When you group over a particular _id, and apply aggregation operators such as $min and $sum, there exists only one record per group(_id), that holds the sum and the minimum date for that group. So there is no way to obtain a different sum and a different minimum date for the same _id, which also logically makes no sense.
What you would want to do is:
db.collection.aggregate([
{$match:{"userId":"THIS_IS_A_DHP_USER_ID"}},
{$unwind:"$items"},
{$group:{"_id":"$_id",
"values":{$sum:"$items.value"},
"dateTime":{$min:"$items.dateTime"}}}
])
But if you do not query for a particular userId, you will have multiple groups, each with its own sum and min date. Then it makes sense to accumulate all these results together in an array using the $push operator.
db.collection.aggregate([
{$unwind:"$items"},
{$group:{"_id":"$_id",
"result":{$sum:"$items.value"},
"dateTime":{$min:"$items.dateTime"}}},
{$group:{"_id":null,"result":{$push:{"value":"$result",
"dateTime":"$dateTime",
"id":"$_id"}}}},
{$project:{"_id":0,"result":1}}
])
You could use the following aggregation; it may work:
db.collectionName.aggregate([
{"$unwind":"$items"},
{"$match":{"userId":"THIS_IS_A_DHP_USER_ID"}},
{"$group":{"_id":"$_id","sum":{"$sum":"$items.value"},
"minDate":{"$min":"$items.dateTime"}}}
])
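To see what the unwind-and-group pipelines compute for the sample data, here is a plain JavaScript sketch that reproduces the per-document sum and minimum date in memory. It is an illustration only: the documents are trimmed to the fields used, and the function name summarizeItems is hypothetical.

```javascript
// The two sample documents, reduced to the fields the aggregation uses.
const weightDocs = [
  {
    _id: "THIS_IS_A_DHP_USER_ID+2014-11-26",
    userId: "THIS_IS_A_DHP_USER_ID",
    items: [
      { dateTime: new Date("2014-11-26T08:08:38.716Z"), value: 98.5 },
      { dateTime: new Date("2014-11-26T08:18:38.716Z"), value: 95.5 },
      { dateTime: new Date("2014-11-26T08:28:38.663Z"), value: 90.5 },
    ],
  },
  {
    _id: "THIS_IS_A_DHP_USER_ID+2014-11-25",
    userId: "THIS_IS_A_DHP_USER_ID",
    items: [
      { dateTime: new Date("2014-11-25T08:08:38.716Z"), value: 198.5 },
      { dateTime: new Date("2014-11-25T08:18:38.716Z"), value: 195.5 },
      { dateTime: new Date("2014-11-25T08:28:38.716Z"), value: 190.5 },
    ],
  },
];

// One { id, value, dateTime } object per document: sum of values and earliest dateTime.
function summarizeItems(docs, userId) {
  return docs
    .filter((d) => d.userId === userId)
    // Unwinding an empty array yields no rows, so such documents are dropped here too.
    .filter((d) => d.items.length > 0)
    .map((d) => ({
      id: d._id,
      value: d.items.reduce((total, item) => total + item.value, 0),
      dateTime: new Date(Math.min(...d.items.map((item) => item.dateTime.getTime()))),
    }));
}

const summary = summarizeItems(weightDocs, "THIS_IS_A_DHP_USER_ID");
console.log(summary.map((s) => s.value)); // [ 284.5, 584.5 ]
```

The returned array has the same shape as the "result" array built with $push in the second pipeline above.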

How would I write this query in MongoDB

example document
{
"_id" : ObjectId("5338796453370917f05bb064"),
"Sigla" : "CE",
"Regiao" : "Nordeste",
"Codigo" : 2306009,
"Municipio" : "Iracema",
"1991" : 52.40499877929688,
"2000" : 108.7089996337891,
"IDHEducacao" : {
"1991" : 0.516,
"2000" : 0.735
}
}
{
"_id" : ObjectId("5338796453370917f05bb065"),
"Sigla" : "CE",
"Regiao" : "Nordeste",
"Codigo" : 2306108,
"Municipio" : "Irauçuba",
"1991" : 47.72299957275391,
"2000" : 62.65800094604492,
"IDHEducacao" : {
"1991" : 0.491,
"2000" : 0.692
}
}
---> Mongodb
I made the following query
{"$group":
{
"_id":{"Regiao":"$Regiao"},
"IDHEducao_max_2000" : {"$max" : "$2000"},
}
}
I want to show the region, the largest index of the field in 2000, and the municipality that owns this index. But I'm not getting that result.
Looks like 2000 is the name of one of the fields in your document, which I find strange.
The SQL below:
SELECT Regiao, MAX( 2000 ) AS Indice FROM table1 GROUP BY Regiao
can be written in MongoDB as
db.table1.aggregate([
{"$group": {
"_id":{"Regiao":"$Regiao"},
"IDHEducao_max_2000" : {"$max" : "$2000"}}
},
// _id is a subdocument here, so read the region from _id.Regiao
{"$project": {"_id":0, "Regiao":"$_id.Regiao", "Indice":"$IDHEducao_max_2000"}}
])
But this SQL:
SELECT Sigla, Regiao, Municipio, MAX( 2000 ) AS Indice FROM table1 GROUP BY Regiao
is NOT valid. When you use GROUP BY, you can only select fields used in the GROUP BY or aggregated values of other fields (i.e., SUM(), COUNT(), etc..). However, if all you need is some value for the other fields, you could use the $first or $last operators. Note that these operators are usually used only after a sort phase to get min/max:
db.table1.aggregate([
{"$group": {
"_id":{"Regiao":"$Regiao"},
"Sigla" : {"$first" : "$Sigla"},
"Municipio" : {"$first" : "$Municipio"},
"IDHEducao_max_2000" : {"$max" : "$2000"}}
},
// _id is a subdocument here, so read the region from _id.Regiao
{"$project": {"_id":0, "Sigla":1, "Regiao":"$_id.Regiao", "Municipio":1,
"Indice":"$IDHEducao_max_2000"}}
])
EDIT: OP was updated with the question below:
I want to show the region, the largest index of the field in 2000, and what is the municipality that owns this index.
If you use the $sort phase of the aggregation pipeline followed by $group phase and make use of the $first operator, you can get the results you want:
db.table1.aggregate([
// Sort by (Regiao ASC, "2000" index DESC)
{$sort:{"Regiao":1, "2000":-1}},
// Group by Regiao and get the max index and the corresponding Municipio
{$group:{
_id:"$Regiao",
Index:{$first:"$2000"},
Municipio:{$first:"$Municipio"}}
}
])
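The point of sorting first is that the maximum and the municipality come from the same document. A plain JavaScript sketch of that idea, keeping the best row per region in memory, might look as follows; the "Sul" row and the function name maxPerRegion are hypothetical additions for illustration.

```javascript
// The two sample municipalities plus one hypothetical row from another region.
const municipalities = [
  { Regiao: "Nordeste", Municipio: "Iracema", "2000": 108.7089996337891 },
  { Regiao: "Nordeste", Municipio: "Irauçuba", "2000": 62.65800094604492 },
  { Regiao: "Sul", Municipio: "HypotheticalTown", "2000": 75 }, // made-up data
];

// For each region, keep the row with the largest "2000" value together with its municipality.
function maxPerRegion(rows) {
  const best = new Map();
  for (const row of rows) {
    const current = best.get(row.Regiao);
    if (!current || row["2000"] > current.Index) {
      best.set(row.Regiao, { _id: row.Regiao, Index: row["2000"], Municipio: row.Municipio });
    }
  }
  return [...best.values()];
}

const perRegion = maxPerRegion(municipalities);
const nordeste = perRegion.find((r) => r._id === "Nordeste");
console.log(nordeste.Municipio, nordeste.Index); // Iracema 108.7089996337891
```

Unlike combining $max with an unsorted $first, this never pairs the maximum value with a municipality from a different row.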

MongoDb - How to search BSON composite key exactly?

I have a collection that stored information about devices like the following:
/* 1 */
{
"_id" : {
"startDate" : "2012-12-20",
"endDate" : "2012-12-30",
"dimensions" : ["manufacturer", "model"],
"metrics" : ["deviceCount"]
},
"data" : {
"results" : "1"
}
}
/* 2 */
{
"_id" : {
"startDate" : "2012-12-20",
"endDate" : "2012-12-30",
"dimensions" : ["manufacturer", "model"],
"metrics" : ["deviceCount", "noOfUsers"]
},
"data" : {
"results" : "2"
}
}
/* 3 */
{
"_id" : {
"dimensions" : ["manufacturer", "model"],
"metrics" : ["deviceCount", "noOfUsers"]
},
"data" : {
"results" : "3"
}
}
And I am trying to query the documents using the _id field which will be unique. The problem I am having is that when I query for all the different attributes as in:
db.collection.find({$and: [{"_id.dimensions":{ $all: ["manufacturer","model"], $size: 2}}, {"_id.metrics": { $all:["noOfUsers","deviceCount"], $size: 2}}]});
This matches documents 2 and 3 (I don't care about the order of the attribute values), but I would like to get only document 3 back. How can I specify that _id must not have any attributes other than those in the search query?
Please advise. Thanks.
Unfortunately, I think the closest you can get to narrowing your query results to just unordered _id.dimensions and unordered _id.metrics requires you to know the other possible fields in the _id subdocument field, e.g. startDate and endDate.
db.collection.find({$and: [
{"_id.dimensions":{ $all: ["manufacturer","model"], $size: 2}},
{"_id.metrics": { $all:["noOfUsers","deviceCount"], $size: 2}},
{"_id.startDate":{$exists:false}},
{"_id.endDate":{$exists:false}}
]});
If you don't know the set of possible fields in _id, then the other possible solution would be to specify the exact _id that you want, e.g.
db.collection.find({"_id" : {
"dimensions" : ["manufacturer", "model"],
"metrics" : ["deviceCount", "noOfUsers"]
}})
but this means that the order of _id.dimensions and _id.metrics is significant. This last query does a document match on exact BSON representation of _id.
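If neither option fits, one workaround is to fetch candidate documents with the $all/$size query and apply the "no other fields, any array order" rule in application code. The following plain JavaScript sketch shows such a check on in-memory copies of the three sample documents; the helper names sameMembers and idMatchesExactly are hypothetical, and the check assumes every field of _id in the wanted shape is an array of strings.

```javascript
// In-memory copies of the three sample documents.
const devices = [
  { _id: { startDate: "2012-12-20", endDate: "2012-12-30",
           dimensions: ["manufacturer", "model"], metrics: ["deviceCount"] },
    data: { results: "1" } },
  { _id: { startDate: "2012-12-20", endDate: "2012-12-30",
           dimensions: ["manufacturer", "model"], metrics: ["deviceCount", "noOfUsers"] },
    data: { results: "2" } },
  { _id: { dimensions: ["manufacturer", "model"], metrics: ["deviceCount", "noOfUsers"] },
    data: { results: "3" } },
];

// True if both values are arrays holding the same strings, in any order.
function sameMembers(a, b) {
  if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) return false;
  const sortedA = [...a].sort();
  const sortedB = [...b].sort();
  return sortedA.every((value, i) => value === sortedB[i]);
}

// True if `id` has exactly the keys of `wanted` (no extras) and matching array contents.
function idMatchesExactly(id, wanted) {
  if (!sameMembers(Object.keys(id), Object.keys(wanted))) return false;
  return Object.keys(wanted).every((key) => sameMembers(id[key], wanted[key]));
}

// Array order deliberately differs from the stored documents.
const wanted = { dimensions: ["model", "manufacturer"], metrics: ["noOfUsers", "deviceCount"] };
const matchedResults = devices
  .filter((d) => idMatchesExactly(d._id, wanted))
  .map((d) => d.data.results);
console.log(matchedResults); // [ '3' ]
```

This trades the server-side exact match for a client-side filter, so it is only reasonable when the candidate set is small.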