How to get the sum of an array with mapReduce in MongoDB?

Given the following document structure:
{
    '_id': 5079,
    'name': 'Lincoln County',
    'state': 'AR',
    'population': 13024,
    'cases': [{'date': '2020-03-16', 'count': 1}, {'date': '2020-03-22', 'count': 1},
              {'date': '2020-03-24', 'count': 1}, {'date': '2020-03-26', 'count': 2}],
    'deaths': [{'date': '2020-03-27', 'count': 1}, {'date': '2020-04-02', 'count': 1},
               {'date': '2020-05-28', 'count': 2}, {'date': '2020-05-30', 'count': 1}]
}
What MongoDB mapReduce function would generate a collection holding the total number of COVID-19 cases for each state? It should produce one record per state, containing the state's 2-letter abbreviation and its total case count.

Try this query:
db.collection.aggregate([
  {
    "$project": {
      "total": {
        "$sum": {
          "$map": {
            "input": "$cases",
            "as": "c",
            "in": "$$c.count"
          }
        }
      },
      "state": 1
    }
  }
])
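The query uses $map to build an array of the cases.count values and then applies $sum to that array.
The output fields are total, which holds the sum, and state, which is kept via state: 1. Note that $project works per document, so this returns one total per county; to get a single record per state, those totals still have to be grouped by state (for example with a $group stage).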
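Since the question asks about mapReduce specifically, here is a minimal sketch of what the map and reduce functions could look like. The runMapReduce driver and the two "Hypothetical County" documents are assumptions added only so the functions can be run outside MongoDB; on a server the two functions would be passed to db.collection.mapReduce instead. Keep in mind that mapReduce is deprecated as of MongoDB 5.0 in favour of the aggregation pipeline.

```javascript
// Map: emit one (state, county case total) pair per county document.
function mapFn() {
  const countyTotal = this.cases.reduce((sum, c) => sum + c.count, 0);
  emit(this.state, countyTotal);
}

// Reduce: add up the county totals emitted for one state.
function reduceFn(state, totals) {
  return totals.reduce((sum, t) => sum + t, 0);
}

// Hypothetical in-memory driver standing in for db.collection.mapReduce.
// Unlike the server, it calls reduce even for keys with a single value,
// which gives the same result for a plain sum.
function runMapReduce(docs, map, reduce) {
  const emitted = new Map();
  globalThis.emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  docs.forEach((doc) => map.call(doc));
  return [...emitted].map(([key, values]) => ({ _id: key, value: reduce(key, values) }));
}

// First document uses the counts from the question (dates omitted);
// the other two are hypothetical.
const counties = [
  { name: "Lincoln County", state: "AR", cases: [{ count: 1 }, { count: 1 }, { count: 1 }, { count: 2 }] },
  { name: "Hypothetical County A", state: "AR", cases: [{ count: 3 }, { count: 4 }] },
  { name: "Hypothetical County B", state: "TX", cases: [{ count: 2 }] },
];

const stateTotals = runMapReduce(counties, mapFn, reduceFn);
console.log(stateTotals);
```

The reduce function only adds numbers, so it stays correct if the server calls it several times for the same state with partial results.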

Related

MongoDB - Find array index of document in array field

I have an aggregation whose output contains an array field. The array holds documents (objects). I would like to create two new fields, lowestCheapIndex and highestExpensiveIndex, each holding the array index of an element that matches a criterion.
Matching criteria:
lowestCheapIndex - the lowest array index of any record whose price is below 20.
highestExpensiveIndex - the highest array index of any record whose price is over 30.
My current aggregation output:
{
    '_id': 'Egg shop',
    'records': [
        {'_id': 1, 'price': 22},
        {'_id': 2, 'price': 18},
        {'_id': 3, 'price': 34},
        {'_id': 4, 'price': 31},
        {'_id': 5, 'price': 13},
    ]
}
Desired output:
{
    '_id': 'Egg shop',
    'records': [
        {'_id': 1, 'price': 22},
        {'_id': 2, 'price': 18},
        {'_id': 3, 'price': 34},
        {'_id': 4, 'price': 31},
        {'_id': 5, 'price': 13},
    ],
    'lowestCheapIndex': 1,
    'highestExpensiveIndex': 3,
}
Question:
How can I retrieve an array index based on my criteria? I found $indexOfArray in the docs, but I am having a hard time seeing how to use it in my case.
You can do the following in an aggregation pipeline:
Use $map to augment the records array with booleans indicating whether each price is below 20 or over 30.
Use $indexOfArray to search for the first true value. For highestExpensiveIndex, reverse the boolean array first, find the index in the reversed array, and subtract it from (size of array - 1) to get the index in the original order.
db.collection.aggregate([
  {
    "$addFields": {
      "records": {
        "$map": {
          "input": "$records",
          "as": "r",
          "in": {
            "_id": "$$r._id",
            "price": "$$r.price",
            "below20": { "$lt": ["$$r.price", 20] },
            "over30": { "$gt": ["$$r.price", 30] }
          }
        }
      }
    }
  },
  {
    "$addFields": {
      "lowestCheapIndex": {
        "$indexOfArray": ["$records.below20", true]
      },
      "highestExpensiveIndex": {
        "$subtract": [
          {
            "$subtract": [
              { "$size": "$records" },
              { "$indexOfArray": [{ "$reverseArray": "$records.over30" }, true] }
            ]
          },
          1
        ]
      }
    }
  }
])
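To check the index arithmetic by hand, the same two steps can be sketched in plain JavaScript. The helper names lowestIndex and highestIndex are hypothetical, and the guard for "no match" is an addition: $indexOfArray returns -1 when nothing matches, so the subtraction in the pipeline above would otherwise produce the array size rather than -1.

```javascript
// Records taken from the question's "Egg shop" document.
const records = [
  { _id: 1, price: 22 },
  { _id: 2, price: 18 },
  { _id: 3, price: 34 },
  { _id: 4, price: 31 },
  { _id: 5, price: 13 },
];

// Index of the first element satisfying pred, or -1
// (mirrors $indexOfArray on the boolean array built by $map).
function lowestIndex(items, pred) {
  return items.map(pred).indexOf(true);
}

// Index of the last element satisfying pred: search the reversed
// boolean array, then convert back to an index in the original order.
function highestIndex(items, pred) {
  const reversedIdx = items.map(pred).reverse().indexOf(true);
  // Guard: without it, "no match" (-1) would turn into items.length.
  return reversedIdx === -1 ? -1 : items.length - 1 - reversedIdx;
}

const lowestCheapIndex = lowestIndex(records, (r) => r.price < 20);
const highestExpensiveIndex = highestIndex(records, (r) => r.price > 30);
console.log({ lowestCheapIndex, highestExpensiveIndex });
```

In the pipeline itself, a similar guard could be written with $cond around the $indexOfArray result if documents without any matching record are possible.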

How to properly use $group operator in MongoDB?

I am currently struggling with a MongoDB query in which I want to group data by two fields, myId and myType, but the results I get back don't look like what I need.
My goal is to have, for each myId, the counts grouped by myType, like:
myId : {myType1 : 5, myType2 : 3, myType3 : 1}
But when I run a query with the $group operator like the one below:
db.collection.aggregate([
  {
    "$project": {
      "myId": "$myId",
      "myType": "$eventType"
    }
  },
  {
    "$group": {
      "_id": {
        "myId": "$myId",
        "myType": "$myType"
      },
      "count": {
        "$sum": 1
      }
    }
  }
])
The results returned by this kind of grouping look like this:
[{'_id': {'myId': 'qwerty123', 'myType': 'created'}, 'count': 1},
 {'_id': {'myId': 'qwerty123', 'myType': 'removed'}, 'count': 3},
 {'_id': {'myId': 'qwerty123', 'myType': 'updated'}, 'count': 2},
 {'_id': {'myId': 'asd123', 'myType': 'created'}, 'count': 1},
 {'_id': {'myId': 'asd123', 'myType': 'removed'}, 'count': 2}]
But what I would like to achieve is a structure like the one below:
[{'_id': {'myId': 'qwerty123', 'myType': {'created': 1, 'removed': 3, 'updated': 2}}},
 {'_id': {'myId': 'asd123', 'myType': {'created': 1, 'removed': 2}}}]
Or maybe like this:
[{'qwerty123', 'myType': {'created': 1, 'removed': 3, 'updated': 2}},
 {'asd123', 'myType': {'created': 1, 'removed': 2}}]
Is it possible to get results in the above shape from the $group operator? If yes, how can I achieve it?
Thank you.
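Append the stage below to your existing pipeline, after your own $group stage:
db.collection.aggregate([
  // ... your $project and $group stages go here ...
  {
    $group: {
      _id: "$_id.myId",
      myType: {
        $push: {
          $arrayToObject: [
            [
              {
                k: "$_id.myType",
                // count is a top-level field in the output of the previous $group
                v: "$count"
              }
            ]
          ]
        }
      }
    }
  }
])
This yields myType as an array of single-key objects, one per type.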
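If a single merged object per myId is wanted rather than an array of one-key objects, one possible variant is to push {k, v} pairs and apply $arrayToObject in a following stage. The mergeStages pipeline below is an assumption, not part of the original answer, and mergeCounts is a hypothetical in-memory equivalent so the reshaping can be checked without a server. The input rows are the grouped results from the question, with count as a top-level field.

```javascript
// Assumed pipeline tail: collect {k, v} pairs per myId,
// then turn the pairs into one object.
const mergeStages = [
  { $group: { _id: "$_id.myId", pairs: { $push: { k: "$_id.myType", v: "$count" } } } },
  { $project: { myType: { $arrayToObject: "$pairs" } } },
];

// Output of the first $group stage, using the values from the question.
const grouped = [
  { _id: { myId: "qwerty123", myType: "created" }, count: 1 },
  { _id: { myId: "qwerty123", myType: "removed" }, count: 3 },
  { _id: { myId: "qwerty123", myType: "updated" }, count: 2 },
  { _id: { myId: "asd123", myType: "created" }, count: 1 },
  { _id: { myId: "asd123", myType: "removed" }, count: 2 },
];

// In-memory equivalent of mergeStages.
function mergeCounts(rows) {
  const byId = new Map();
  for (const { _id, count } of rows) {
    if (!byId.has(_id.myId)) byId.set(_id.myId, {});
    byId.get(_id.myId)[_id.myType] = count;
  }
  return [...byId].map(([myId, myType]) => ({ _id: myId, myType }));
}

const merged = mergeCounts(grouped);
console.log(JSON.stringify(merged));
```

The in-memory version keeps the input order; on a server, $group does not guarantee any output order, so add a $sort stage if the order matters.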

Group by several fields and custom sums with two conditions

I want to group rows with two conditions: the first to get the total (this works now), the second to get the number of unread messages. I cannot work out how to do the second. The inserts are:
db.messages.insert({'source_user': 'test1', 'destination_user': 'test2', 'is_read': true})
db.messages.insert({'source_user': 'test1', 'destination_user': 'test2', 'is_read': false})
db.messages.insert({'source_user': 'test1', 'destination_user': 'test3', 'is_read': true})
My code:
db.messages.aggregate([
    {'$match': {'source_user': user}},
    {'$group': {
        '_id': {
            'source_user': '$source_user',
            'destination_user': '$destination_user',
            'is_read': '$is_read'
        },
        'total': {'$sum': 1}
    }},
    {'$project': {
        'source_user': '$_id.source_user',
        'destination_user': '$_id.destination_user',
        #'unread': {'$sum': {'$_id.is_read': False}},
        'total': '$total',
        '_id': 0
    }}
])
As a result I want to get:
[
    {
        'source_user': 'test1',
        'destination_user': 'test2',
        'unread': 1,
        'total': 2
    },
    {
        'source_user': 'test1',
        'destination_user': 'test3',
        'unread': 0,
        'total': 1
    }
]
Should I add a new group, or can I use the $is_read flag in the same group?
Thank you!
You can count unread messages in the same $group stage as the total: use $cond so that $sum adds 0 for messages that are read and 1 for the others. Also remove is_read from the _id, otherwise read and unread messages end up in separate groups:
db.messages.aggregate([
    {'$match': {'source_user': user}},
    {'$group': {
        '_id': {
            'source_user': '$source_user',
            'destination_user': '$destination_user'
        },
        'total': {'$sum': 1},
        'unread': {'$sum': {'$cond': ['$is_read', 0, 1]}}
    }},
    {'$project': {
        'source_user': '$_id.source_user',
        'destination_user': '$_id.destination_user',
        'total': 1,
        'unread': 1,
        '_id': 0
    }}
])
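As a rough cross-check of what the conditional sum computes, here is a small in-memory sketch in plain JavaScript using the three messages inserted in the question. The summarize function is hypothetical and only mirrors the $match, $group and $project stages; it is not how MongoDB evaluates the pipeline.

```javascript
// The three documents inserted in the question.
const messages = [
  { source_user: "test1", destination_user: "test2", is_read: true },
  { source_user: "test1", destination_user: "test2", is_read: false },
  { source_user: "test1", destination_user: "test3", is_read: true },
];

function summarize(msgs, sourceUser) {
  const groups = new Map();
  for (const m of msgs) {
    if (m.source_user !== sourceUser) continue; // $match
    // $group _id: the (source_user, destination_user) pair, without is_read
    const key = JSON.stringify([m.source_user, m.destination_user]);
    if (!groups.has(key)) {
      groups.set(key, {
        source_user: m.source_user,
        destination_user: m.destination_user,
        total: 0,
        unread: 0,
      });
    }
    const group = groups.get(key);
    group.total += 1; // {'$sum': 1}
    group.unread += m.is_read ? 0 : 1; // {'$sum': {'$cond': ['$is_read', 0, 1]}}
  }
  return [...groups.values()];
}

const summary = summarize(messages, "test1");
console.log(summary);
```

The result matches the output the question asks for: one row per destination user, with both counters computed in a single pass.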

How to do HAVING COUNT in MongoDB?

My documents look like this:
{
    "_id": ObjectId("5698fcb5585b2de0120eba31"),
    "id": "26125242313",
    "parent_id": "26125241841",
    "link_id": "10024080",
    "name": "26125242313",
    "author": "gigaquack",
    "body": "blogging = creative writing",
    "subreddit_id": "6",
    "subreddit": "reddit.com",
    "score": "27",
    "created_utc": "2007-10-22 18:39:31"
}
What I'm trying to do is create a query that finds users who posted to only one subreddit. I did this in SQL with the query:
SELECT DISTINCT author, subreddit FROM reddit GROUP BY author HAVING COUNT(*) = 1;
I'm trying to do something similar in MongoDB but am having some trouble at the moment.
I managed to recreate SELECT DISTINCT by using an aggregate $group, but I can't figure out how to solve the HAVING COUNT part.
This is what my query looks like:
db.collection.aggregate([
  { "$group": { "_id": { author: "$author", subreddit: "$subreddit" } } },
  { $match: { count: 1 } } // This part is not working
])
Am I using $match wrong?
Your $group stage never computes a count field, so $match has nothing to compare against. Add a $sum accumulator and then match on it:
db.collection.aggregate([
  {
    '$group': {
      '_id': {'author': '$author', 'subreddit': '$subreddit'},
      'count': {'$sum': 1},
      'data': {'$addToSet': '$$ROOT'}
    }
  },
  {
    '$match': {
      'count': {'$eq': 1}
    }
  }
])
Here data is a one-element array holding the matched document.
If you only want a specific field, it should look like this:
db.collection.aggregate([
  {
    '$group': {
      '_id': {'author': '$author', 'subreddit': '$subreddit'},
      'count': {'$sum': 1},
      'author': {'$last': '$author'}
    }
  },
  {
    '$match': {
      'count': {'$eq': 1}
    }
  }
])
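Note that the queries above keep (author, subreddit) pairs that occur exactly once. If the goal is instead "authors whose posts all went to a single subreddit", one possible variant is to collect the distinct subreddits per author and keep the authors with exactly one. The pipeline below is an assumption rather than part of the original answer, and authorsWithOneSubreddit is a hypothetical in-memory equivalent. Only the gigaquack / reddit.com values come from the question; the other sample posts are made up.

```javascript
// Assumed pipeline variant: distinct subreddits per author,
// then keep authors whose set has exactly one element.
const singleSubredditPipeline = [
  { $group: { _id: "$author", subreddits: { $addToSet: "$subreddit" } } },
  { $match: { subreddits: { $size: 1 } } },
];

// Sample posts (hypothetical apart from the gigaquack / reddit.com values).
const posts = [
  { author: "gigaquack", subreddit: "reddit.com" },
  { author: "gigaquack", subreddit: "reddit.com" },
  { author: "alice", subreddit: "programming" },
  { author: "alice", subreddit: "science" },
  { author: "bob", subreddit: "science" },
];

// In-memory equivalent of singleSubredditPipeline.
function authorsWithOneSubreddit(docs) {
  const byAuthor = new Map();
  for (const { author, subreddit } of docs) {
    if (!byAuthor.has(author)) byAuthor.set(author, new Set());
    byAuthor.get(author).add(subreddit);
  }
  return [...byAuthor]
    .filter(([, subs]) => subs.size === 1)
    .map(([author, subs]) => ({ _id: author, subreddits: [...subs] }));
}

const singleSubredditAuthors = authorsWithOneSubreddit(posts);
console.log(singleSubredditAuthors);
```

In this sample, gigaquack posted twice to the same subreddit and is still reported, whereas a count of 1 on the (author, subreddit) pair would drop that author.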

Count on aggregate in MongoDB

This is my aggregate query in pymongo:
db.connection_log.aggregate([
    {'$match': {
        'login_time': {'$gte': datetime.datetime(2014, 5, 30, 6, 57)}
    }},
    {'$group': {
        '_id': {
            'username': '$username',
            'ras_id': '$ras_id',
            'user_id': '$user_id'
        },
        'total': {'$sum': '$type_details.in_bytes'},
        'total1': {'$sum': '$type_details.out_bytes'}
    }},
    {'$sort': {'total': 1, 'total1': 1}}
])
How do I count all the results of the aggregate?
Add this stage to the end of your aggregation pipeline:
{'$group': {
    '_id': None,
    'count': {'$sum': 1}
}}
See the SQL to Aggregation Mapping Chart in the MongoDB documentation.
Well, if you really want your results combined with a total count, you can always push the results into their own array:
result = db.connection_log.aggregate([
    {'$match': {
        'login_time': {'$gte': datetime.datetime(2014, 5, 30, 6, 57)}
    }},
    {'$group': {
        '_id': {
            'username': '$username',
            'ras_id': '$ras_id',
            'user_id': '$user_id'
        },
        'total': {'$sum': '$type_details.in_bytes'},
        'total1': {'$sum': '$type_details.out_bytes'}
    }},
    {'$sort': {'total': 1, 'total1': 1}},
    # '_id': None puts every document into one group
    {'$group': {
        '_id': None,
        'results': {
            '$push': {
                '_id': '$_id',
                'total': '$total',
                'total1': '$total1'
            }
        },
        'count': {'$sum': 1}
    }}
])
And if you are using MongoDB 2.6 or greater, you can use '$push': '$$ROOT' instead of listing all of the document fields there.
But really, unless you are using MongoDB 2.6 and are explicitly asking for a cursor as a result, the result is already returned as an array, so there is no need to add an inner array of results with a count. Just get the length of that array. With the PyMongo 2.x drivers, aggregate() returns the command response as a dict with the documents under the 'result' key, so in Python this is:
len(result['result'])
With PyMongo 3.0 or later, aggregate() always returns a cursor, so materialise it first, for example with len(list(result)).
If you are indeed using a cursor for a large result set, or are using $limit and $skip to "page" results, then you will need two queries, one of which just computes the total count. Otherwise you don't need to do this.
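The two-query approach for paging can be sketched as follows. The helper buildPagedPipelines is hypothetical, and the use of the $count stage assumes MongoDB 3.4 or later; on older servers the $group stage with _id null shown earlier does the same job. The sketch is written in JavaScript, but the pipelines are plain data and translate directly to Python dicts for pymongo. The $match stage from the question is left out here for brevity.

```javascript
// Hypothetical helper: derive a "page" pipeline and a "total count"
// pipeline from one shared base pipeline. page is zero-based.
function buildPagedPipelines(basePipeline, page, pageSize) {
  return {
    pagePipeline: [...basePipeline, { $skip: page * pageSize }, { $limit: pageSize }],
    // $count assumes MongoDB 3.4+; otherwise use
    // { $group: { _id: null, count: { $sum: 1 } } } instead.
    countPipeline: [...basePipeline, { $count: "count" }],
  };
}

// Grouping and sorting stages from the question.
const base = [
  {
    $group: {
      _id: { username: "$username", ras_id: "$ras_id", user_id: "$user_id" },
      total: { $sum: "$type_details.in_bytes" },
      total1: { $sum: "$type_details.out_bytes" },
    },
  },
  { $sort: { total: 1, total1: 1 } },
];

// Third page (zero-based page 2) with 25 results per page.
const { pagePipeline, countPipeline } = buildPagedPipelines(base, 2, 25);
console.log(JSON.stringify(pagePipeline.slice(-2)));
console.log(JSON.stringify(countPipeline.slice(-1)));
```

Each pipeline would then be passed to its own aggregate call. The $sort stage is not needed for the count, so it could be dropped from the count pipeline to save work.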