How to get the sum of an array with mapReduce in MongoDB?


Given the following document schema:
{
'_id': 5079,
'name': 'Lincoln County',
'state': 'AR',
'population': 13024,
'cases': [{'date': '2020-03-16', 'count': 1}, {'date': '2020-03-22', 'count': 1},
{'date': '2020-03-24', 'count': 1}, {'date': '2020-03-26', 'count': 2}],
'deaths': [{'date': '2020-03-27', 'count': 1}, {'date': '2020-04-02', 'count': 1},
{'date': '2020-05-28', 'count': 2}, {'date': '2020-05-30', 'count': 1}]
}
What MongoDB mapReduce function would generate a collection with the total number of COVID-19 cases for each state? It should produce one record per state, containing the state's 2-letter abbreviation and its total case count.

Try this query:
db.collection.aggregate([
{
"$project": {
"total": {
"$sum": {
"$map": {
"input": "$cases",
"as": "c",
"in": "$$c.count"
}
}
},
"state": 1
}
}
])
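The query uses $map to build an array of the cases.count values and then applies $sum to that array.
The output fields are total, which contains the sum, and state, which is included via "state": 1. Note that this yields one total per document (county); to get one record per state, follow it with a $group stage keyed on state that sums total.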
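Since the question asks for mapReduce specifically, here is a minimal sketch of what a map and a reduce function for per-state totals could look like. It is not taken from the answer above. The extra counties, the local emit() stand-in, and the names mapCases, reduceCases and totalsByState are assumptions made so the example runs outside MongoDB; on a server, emit() is provided and `this` is the current document.

```javascript
// Sample documents shaped like the schema in the question. Only Lincoln County
// comes from the question; the other two counties are made up for illustration.
const counties = [
  { _id: 5079, name: 'Lincoln County', state: 'AR',
    cases: [{ date: '2020-03-16', count: 1 }, { date: '2020-03-22', count: 1 },
            { date: '2020-03-24', count: 1 }, { date: '2020-03-26', count: 2 }] },
  { _id: 1, name: 'Hypothetical County A', state: 'AR',
    cases: [{ date: '2020-03-20', count: 4 }] },
  { _id: 2, name: 'Hypothetical County B', state: 'TX',
    cases: [{ date: '2020-03-18', count: 3 }, { date: '2020-03-19', count: 7 }] },
];

// Map: emit one (state, county total) pair per document.
function mapCases() {
  const countyTotal = this.cases.reduce((sum, c) => sum + c.count, 0);
  emit(this.state, countyTotal);
}

// Reduce: add up all values emitted for one state.
function reduceCases(state, totals) {
  return totals.reduce((sum, t) => sum + t, 0);
}

// Local stand-in for the server: collect emitted pairs, then reduce per key.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}
counties.forEach((doc) => mapCases.call(doc));

const totalsByState = [...emitted].map(([state, totals]) => ({
  _id: state,
  value: reduceCases(state, totals),
}));

console.log(totalsByState);
```

On a real deployment the two functions would be passed to db.collection.mapReduce(mapCases, reduceCases, { out: 'state_totals' }), where state_totals is a hypothetical output collection name. mapReduce has been deprecated since MongoDB 5.0, so the aggregation pipeline is usually the better choice.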

Related

MongoDB - Find array index of document in array field

I have an aggregation whose output contains an array field. This array field contains documents (objects). I have some matching criteria for these, and I would like to create two new fields named lowestCheapIndex and highestExpensiveIndex, each holding the array index of a matching element.
Matching criteria:
lowestCheapIndex - Should contain lowest array index number of any record item, that has price below 20.
highestExpensiveIndex - Should contain highest array index number of any record item, that has price over 30.
My current aggregation output:
{
'_id': 'Egg shop',
'records': [
{'_id': 1, 'price': 22},
{'_id': 2, 'price': 18},
{'_id': 3, 'price': 34},
{'_id': 4, 'price': 31},
{'_id': 5, 'price': 13},
]
}
Desired output:
{
'_id': 'Egg shop',
'records': [
{'_id': 1, 'price': 22},
{'_id': 2, 'price': 18},
{'_id': 3, 'price': 34},
{'_id': 4, 'price': 31},
{'_id': 5, 'price': 13},
],
'lowestCheapIndex': 1,
'highestExpensiveIndex': 3,
}
Question:
How can I retrieve an array index based on my criteria? I found $indexOfArray in the docs, but I am still having a hard time seeing how it would be used in my case.

You can do the following in an aggregation pipeline:
Use $map to augment your records array with booleans indicating whether each price is below 20 or over 30.
Use $indexOfArray to search for true among those booleans. For highestExpensiveIndex, reverse the array first, take the index found there, and subtract it from (size of the array - 1) to get the index in the original order.
db.collection.aggregate([
{
"$addFields": {
"records": {
"$map": {
"input": "$records",
"as": "r",
"in": {
"_id": "$$r._id",
"price": "$$r.price",
"below20": {
$lt: [
"$$r.price",
20
]
},
"over30": {
$gt: [
"$$r.price",
30
]
}
}
}
}
}
},
{
"$addFields": {
"lowestCheapIndex": {
"$indexOfArray": [
"$records.below20",
true
]
},
"highestExpensiveIndex": {
"$subtract": [
{
"$subtract": [
{
$size: "$records"
},
{
"$indexOfArray": [
{
"$reverseArray": "$records.over30"
},
true
]
}
]
},
1
]
}
}
}
])
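The index arithmetic in this pipeline can be checked with a plain JavaScript sketch of the same steps. This is not MongoDB code, and the function name findPriceIndexes is made up for illustration. It also guards the no-match case: $indexOfArray returns -1 when nothing matches, so the unguarded subtraction in the pipeline would yield the array size rather than -1 for highestExpensiveIndex.

```javascript
// The document from the question.
const shop = {
  _id: 'Egg shop',
  records: [
    { _id: 1, price: 22 },
    { _id: 2, price: 18 },
    { _id: 3, price: 34 },
    { _id: 4, price: 31 },
    { _id: 5, price: 13 },
  ],
};

function findPriceIndexes(records) {
  // First index whose price is below 20 (or -1 if there is none).
  const lowestCheapIndex = records.findIndex((r) => r.price < 20);

  // Search the reversed "over 30" flags, then convert the position back
  // to an index in the original order: size - reversedIndex - 1.
  const over30Reversed = records.map((r) => r.price > 30).reverse();
  const reversedIndex = over30Reversed.indexOf(true);
  const highestExpensiveIndex =
    reversedIndex === -1 ? -1 : records.length - reversedIndex - 1;

  return { lowestCheapIndex, highestExpensiveIndex };
}

const priceIndexes = findPriceIndexes(shop.records);
console.log(priceIndexes);
```

In the pipeline itself, the same guard could be expressed with a $cond around the subtraction, if documents without any price over 30 are possible.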

How to properly use $group operator in MongoDB?

I am currently struggling with the MongoDB query in which I want to group data by 2 fields myId and myType, but the results that I get in return don't look like what I need.
My goal is to get, for each myId, the counts grouped by myType, like this:
myId : {myType1 : 5, myType2 : 3, myType3 : 1}
But when I run a query with the $group operator like the one below:
db.collection.aggregate([{
"$project": {
"myId": "$myId",
"myType": "$eventType",
}
},
{
"$group":{
"_id":{
"myId":"$myId",
"myType":"$myType"
},
"count":{
"$sum":1
}
}
}
])
The results returned by this kind of grouping look like this:
[{'_id': {'myId': 'qwerty123', 'myType': 'created'}, 'count': 1},
{'_id': {'myId': 'qwerty123', 'myType': 'removed'}, 'count': 3},
{'_id': {'myId': 'qwerty123', 'myType': 'updated'}, 'count': 2},
{'_id': {'myId': 'asd123', 'myType': 'created'}, 'count': 1},
{'_id': {'myId': 'asd123', 'myType': 'removed'}, 'count': 2}]
But what I would like to achieve is a structure like below:
[{'_id': {'myId': 'qwerty123', 'myType': {'created': 1, 'removed': 3, 'updated': 2}}},
{'_id': {'myId': 'asd123', 'myType': {'created': 1, 'removed': 2}}}]
Or maybe like this:
[{'qwerty123', 'myType': {'created': 1, 'removed': 3, 'updated': 2}},
{'asd123', 'myType': {'created': 1, 'removed': 2}}]
Is it possible to achieve results from $group operator with the above schema? If yes, how can I achieve it?
Thank you.

Add the stage below after your existing $group stage:
db.collection.aggregate([
{ $group: {
_id: "$_id.myId",
myType: {
$push: {
$arrayToObject: [
[
{
k: "$_id.myType",
v: "$count"
}
]
]
}
}
}}
])
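Because the stage above uses $push, myType ends up as an array of single-key objects, one per type. If a single merged object per myId is wanted, as in the desired output, one option (an assumption, not part of the original answer) is to use $mergeObjects as the accumulator in place of $push, which requires MongoDB 3.6 or later. The plain JavaScript sketch below shows the merged shape that approach aims for; the input rows mimic the output of a two-key $group with a top-level count, and the variable names are made up.

```javascript
// Rows shaped like the output of grouping by { myId, myType } with a count.
const firstStageOutput = [
  { _id: { myId: 'qwerty123', myType: 'created' }, count: 1 },
  { _id: { myId: 'qwerty123', myType: 'removed' }, count: 3 },
  { _id: { myId: 'qwerty123', myType: 'updated' }, count: 2 },
  { _id: { myId: 'asd123', myType: 'created' }, count: 1 },
  { _id: { myId: 'asd123', myType: 'removed' }, count: 2 },
];

// Second grouping: one entry per myId, with every type merged into one object.
const countsById = new Map();
for (const { _id, count } of firstStageOutput) {
  if (!countsById.has(_id.myId)) countsById.set(_id.myId, {});
  countsById.get(_id.myId)[_id.myType] = count;
}

const mergedById = [...countsById].map(([myId, myType]) => ({ _id: myId, myType }));

console.log(JSON.stringify(mergedById, null, 2));
```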

Group by several fields and custom sums with two conditions

I want to group rows and compute two sums: the first is the total number of messages (this already works), the second is the number of unread messages, and I cannot figure out how to do that one. The inserts are:
db.messages.insert({'source_user': 'test1', 'destination_user': 'test2', 'is_read': true})
db.messages.insert({'source_user': 'test1', 'destination_user': 'test2', 'is_read': false})
db.messages.insert({'source_user': 'test1', 'destination_user': 'test3', 'is_read': true})
My code:
db.messages.aggregate([
{'$match': {'source_user': user}},
{'$group': {
'_id': {
'source_user': '$source_user',
'destination_user': '$destination_user',
'is_read': '$is_read'
},
'total': {'$sum': 1}}
},
{'$project': {
'source_user': '$_id.source_user',
'destination_user': '$_id.destination_user',
#'unread': {'$sum': {'$_id.is_read': False}},
'total': '$total',
'_id': 0
}}
])
As a result, I want to get:
[{
'source_user': 'test1',
'destination_user': 'test2',
'unread': 1,
'total': 2
}, {
'source_user': 'test1',
'destination_user': 'test3',
'unread': 0,
'total': 1
}
]
Should I add a new group stage, or can I use the is_read flag in the same group?
Thank you!

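You can count unread messages the same way you compute total, but use $cond inside $sum so that read messages add 0 and unread messages add 1. Note that is_read is no longer part of the group _id, so read and unread messages fall into the same group: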
db.messages.aggregate([
{'$match': {'source_user': user}},
{'$group': {
'_id': {
'source_user': '$source_user',
'destination_user': '$destination_user'
},
'total': {'$sum': 1},
'unread': {'$sum': { '$cond': [ '$is_read', 0, 1 ] }}
}
},
{'$project': {
'source_user': '$_id.source_user',
'destination_user': '$_id.destination_user',
'total': 1,
'unread': 1,
'_id': 0
}}
])
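To see why the conditional sum gives the requested numbers, here is a plain JavaScript sketch of the same grouping applied to the three inserted messages. It is an in-memory illustration, not MongoDB code, and the function name summarizeMessages is made up.

```javascript
// The three messages inserted in the question.
const messages = [
  { source_user: 'test1', destination_user: 'test2', is_read: true },
  { source_user: 'test1', destination_user: 'test2', is_read: false },
  { source_user: 'test1', destination_user: 'test3', is_read: true },
];

function summarizeMessages(allMessages, user) {
  const groups = new Map();
  for (const m of allMessages) {
    if (m.source_user !== user) continue; // $match on source_user
    // After the match, source_user is fixed, so destination_user identifies the group.
    const key = m.destination_user;
    if (!groups.has(key)) {
      groups.set(key, {
        source_user: m.source_user,
        destination_user: m.destination_user,
        unread: 0,
        total: 0,
      });
    }
    const group = groups.get(key);
    group.total += 1;                  // {'$sum': 1}
    group.unread += m.is_read ? 0 : 1; // {'$sum': {'$cond': ['$is_read', 0, 1]}}
  }
  return [...groups.values()];
}

const messageSummary = summarizeMessages(messages, 'test1');
console.log(messageSummary);
```

As with $cond, a message whose is_read field is missing is treated as unread in this sketch.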

How to do HAVING COUNT in MongoDB?

My documents look like this:
{
"_id": ObjectId("5698fcb5585b2de0120eba31"),
"id": "26125242313",
"parent_id": "26125241841",
"link_id": "10024080",
"name": "26125242313",
"author": "gigaquack",
"body": "blogging = creative writing",
"subreddit_id": "6",
"subreddit": "reddit.com",
"score": "27",
"created_utc": "2007-10-22 18:39:31"
}
What I'm trying to do is create a query that finds users who posted to only 1 subreddit. I did this in SQL by using the query:
Select distinct author, subreddit from reddit group by author having count(*) = 1;
I'm trying to do something similar in MongoDB but am having some trouble at the moment.
I managed to recreate select distinct by using aggregate group but I can't figure out how to solve the HAVING COUNT part.
This is what my query looks like:
db.collection.aggregate(
[{"$group":
{ "_id": { author: "$author", subreddit: "$subreddit" } } },
{$match:{count:1}} // This part is not working
])
Am I using $match wrong?

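Your query is missing a count accumulator in the $group stage, so there is no count field for $match to test. It should look like this: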
db.collection.aggregate([{
'$group': {
'_id': {'author': '$author', 'subreddit': '$subreddit'},
'count': {'$sum': 1},
'data': {'$addToSet': '$$ROOT'}}
}, {
'$match': {
'count': {'$eq': 1}
}}])
Here data is a one-element list containing the matched document.
If you only want a specific field, it should look like this:
db.collection.aggregate([{
'$group': {
'_id': {'author': '$author', 'subreddit': '$subreddit'},
'count': {'$sum': 1},
'author': {'$last': '$author'}}
}, {
'$match': {
'count': {'$eq': 1}
}}])
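The pipelines above keep author/subreddit pairs that occur exactly once. If the goal is instead the authors who posted to exactly one distinct subreddit, a different grouping is needed: group by author only, collect the distinct subreddits, and keep the groups with a single entry. The sketch below shows that reading in plain JavaScript; it is an assumption about the intent, not part of the original answer, and every post other than the gigaquack one is made up.

```javascript
// Sample posts; only the author/subreddit fields matter here.
const posts = [
  { author: 'gigaquack', subreddit: 'reddit.com' },
  { author: 'gigaquack', subreddit: 'reddit.com' },
  { author: 'hypothetical_user', subreddit: 'reddit.com' },
  { author: 'hypothetical_user', subreddit: 'programming' },
];

// Group by author and collect distinct subreddits (like $addToSet).
const subredditsByAuthor = new Map();
for (const { author, subreddit } of posts) {
  if (!subredditsByAuthor.has(author)) subredditsByAuthor.set(author, new Set());
  subredditsByAuthor.get(author).add(subreddit);
}

// HAVING COUNT(DISTINCT subreddit) = 1: keep authors with exactly one subreddit.
const singleSubredditAuthors = [...subredditsByAuthor]
  .filter(([, subreddits]) => subreddits.size === 1)
  .map(([author, subreddits]) => ({ author, subreddit: [...subreddits][0] }));

console.log(singleSubredditAuthors);
```

In a pipeline, this would correspond to a $group on author with $addToSet over subreddit, followed by a $match that uses $size: 1 on the collected array.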

count on aggregate in mongodb

This is my aggregate query in pymongo:
db.connection_log.aggregate([
{ '$match': {
'login_time': {'$gte': datetime.datetime(2014, 5, 30, 6, 57)}
}},
{ '$group': {
'_id': {
'username': '$username',
'ras_id': '$ras_id',
'user_id': '$user_id'
},
'total': { '$sum': '$type_details.in_bytes'},
'total1': {'$sum': '$type_details.out_bytes'}
}},
{ '$sort': {'total': 1, 'total1': 1}}
])
How can I count all the results of the aggregation?

Add to the end of your aggregation pipeline:
{ $group: {
_id: null,
count: {
$sum: 1
}
}}
See the SQL to Aggregation Mapping Chart in the MongoDB manual.

Well if you really want your results with a total count combined then you can always just push the results into their own array:
result = db.connection_log.aggregate([
{ '$match': {
'login_time': {'$gte': datetime.datetime(2014, 5, 30, 6, 57)}
}},
{ '$group': {
'_id': {
'username': '$username',
'ras_id': '$ras_id',
'user_id': '$user_id'
},
'total': { '$sum': '$type_details.in_bytes'},
'total1': {'$sum': '$type_details.out_bytes'}
}},
{ '$sort': {'total': 1, 'total1': 1}},
{ '$group': {
'_id': None,
'results': {
'$push': {
'_id': '$_id',
'total': '$total',
'total1': '$total1'
}
},
'count': { '$sum': 1 }
}}
])
And if you are using MongoDB 2.6 or greater you can just '$push': '$$ROOT' instead of actually specifying all of the document fields there.
But really, unless you are using MongoDB 2.6 and are explicitly asking for a cursor as a result, the result is already returned as an array, so there is no need to add an inner array of results with a count. Just get the length of the array, which in Python is:
len(result)
If you are indeed using a cursor for a large result-set or otherwise using $limit and $skip to "page" results then you will need to do two queries with one just summarizing the "total count", but otherwise you just don't need to do this.
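The "results plus count" shape produced by the final $group can be sketched in plain JavaScript as follows. This is an in-memory illustration only: the sample rows, the UTC timestamps, and the names connectionLog and summarizeConnections are all assumptions, with field names taken from the question.

```javascript
// Hypothetical connection_log rows; field names follow the question, values are made up.
const connectionLog = [
  { username: 'alice', ras_id: 1, user_id: 10, login_time: new Date('2014-06-01T08:00:00Z'),
    type_details: { in_bytes: 100, out_bytes: 50 } },
  { username: 'alice', ras_id: 1, user_id: 10, login_time: new Date('2014-06-02T08:00:00Z'),
    type_details: { in_bytes: 200, out_bytes: 25 } },
  { username: 'bob', ras_id: 2, user_id: 11, login_time: new Date('2014-06-01T09:00:00Z'),
    type_details: { in_bytes: 40, out_bytes: 10 } },
  { username: 'bob', ras_id: 2, user_id: 11, login_time: new Date('2014-05-01T09:00:00Z'),
    type_details: { in_bytes: 999, out_bytes: 999 } },
];

function summarizeConnections(rows, since) {
  const groups = new Map();
  for (const row of rows) {
    if (row.login_time < since) continue; // $match: login_time >= since
    const key = JSON.stringify([row.username, row.ras_id, row.user_id]);
    if (!groups.has(key)) {
      groups.set(key, {
        _id: { username: row.username, ras_id: row.ras_id, user_id: row.user_id },
        total: 0,
        total1: 0,
      });
    }
    const group = groups.get(key);
    group.total += row.type_details.in_bytes;   // $sum of in_bytes
    group.total1 += row.type_details.out_bytes; // $sum of out_bytes
  }
  // $sort: ascending by total, then by total1.
  const results = [...groups.values()].sort(
    (a, b) => a.total - b.total || a.total1 - b.total1
  );
  // Final $group with _id null: all rows pushed into one array, plus their count.
  return { results, count: results.length };
}

const connectionSummary = summarizeConnections(connectionLog, new Date('2014-05-30T06:57:00Z'));
console.log(connectionSummary.count, connectionSummary.results);
```

The count here is the number of groups after the sort, not the number of log rows, which matches what a trailing $group with $sum: 1 counts.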
