Mongodb sort and group by

I'm not sure my question is phrased correctly, but here goes:
I have a set of rows in my MongoDB collection, like:
[{'_id': '5b4c9aa7ddc752c1f5844315',
'ccode': 'RU',
'date': '2018-07-16T00:00:00.000Z',
'rates': {'reg_emails_confirmed': 4,
'registered': 1,
'regs_age1': 1,
'regs_male': 1}},
{'_id': '5b4cad0dddc752c1f5844322',
'ccode': 'US',
'date': '2018-07-16T00:00:00.000Z',
'rates': {'reg_emails_confirmed': 4,
'registered': 2,
'regs_age1': 2,
'regs_male': 2}},
{'_id': '5bd88204af4c814883a414b2',
'ccode': 'US',
'date': '2018-10-30T00:00:00.000Z',
'rates': {'reg_emails_confirmed': 2,
'registered': 1,
'regs_age1': 1,
'regs_male': 1}},
{'_id': '5bd88204af4c814883a414b3',
'ccode': 'RU',
'date': '2018-10-30T00:00:00.000Z',
'rates': {'reg_emails_confirmed': 2,
'registered': 1,
'regs_age1': 1,
'regs_male': 1}}]
I want to sort them by date and group them, because the same date has multiple rows from different countries.
So the result should look something like this:
[{'2018-07-16T00:00:00.000Z': [{'_id': '5b4c9aa7ddc752c1f5844315',
'ccode': 'RU',
'date': '2018-07-16T00:00:00.000Z',
'rates': {'reg_emails_confirmed': 4,
'registered': 1,
'regs_age1': 1,
'regs_male': 1}},
{'_id': '5b4cad0dddc752c1f5844322',
'ccode': 'US',
'date': '2018-07-16T00:00:00.000Z',
'rates': {'reg_emails_confirmed': 4,
'registered': 2,
'regs_age1': 2,
'regs_male': 2}}]},
{'2018-10-30T00:00:00.000Z': [{'_id': '5bd88204af4c814883a414b2',
'ccode': 'US',
'date': '2018-10-30T00:00:00.000Z',
'rates': {'reg_emails_confirmed': 2,
'registered': 1,
'regs_age1': 1,
'regs_male': 1}},
{'_id': '5bd88204af4c814883a414b3',
'ccode': 'RU',
'date': '2018-10-30T00:00:00.000Z',
'rates': {'reg_emails_confirmed': 2,
'registered': 1,
'regs_age1': 1,
'regs_male': 1}}]}]
I tried:
db.getCollection('daily_stats').aggregate([
{'$match': some_condition},
{'$group': {'ccode': 1}}, # ccode or date?
{'$sort': {"date": 1}},
])
But got an error
The field * must be an accumulator object
I googled the error. It is pretty clear, but it doesn't seem related to my case: I don't need sum, avg, or similar functions.

Query
sort by date (ascending here; use -1 if you need descending)
group by date and collect the $$ROOT documents
replace the root so that the date becomes the key
*This assumes your dates are stored as strings, which is a bad idea. If you convert them to date objects, you can still use the query, but build the key with
"k":{"$dateToString" : {"date" :"$_id"}}
Test code here
aggregate(
[{"$sort":{"date":1}},
{"$group":{"_id":"$date", "docs":{"$push":"$$ROOT"}}},
{"$replaceRoot":
{"newRoot":{"$arrayToObject":[[{"k":"$_id", "v":"$docs"}]]}}}])

When using $group, you need an _id
From the docs
{
$group:
{
_id: <expression>, // Group By Expression
<field1>: { <accumulator1> : <expression1> },
...
}
}
In your case...
db.getCollection('daily_stats').aggregate([
{'$match': some_condition},
{'$group': {
'_id': "$ccode",
'rates': { $addToSet: '$rates' },
'date': { $first: '$date' }
}},
{'$sort': {"date": 1}},
{'$project': { "_id": 0, "country": "$_id", "rates": 1, "date": 1 }}
])
Playground: https://mongoplayground.net/p/B31XLS9p-6W
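To make the role of _id and the accumulators concrete, here is a small pure-Python emulation of that $group stage. The function name group_by_ccode and the trimmed rows are illustrative assumptions, not part of any MongoDB API. It also shows that grouping on ccode yields one result per country rather than one per date, which may or may not be what the question is after.

```python
# Trimmed sample rows (hypothetical, modelled on the question's data).
rows = [
    {"ccode": "RU", "date": "2018-07-16T00:00:00.000Z", "rates": {"registered": 1}},
    {"ccode": "US", "date": "2018-07-16T00:00:00.000Z", "rates": {"registered": 2}},
    {"ccode": "US", "date": "2018-10-30T00:00:00.000Z", "rates": {"registered": 1}},
    {"ccode": "RU", "date": "2018-10-30T00:00:00.000Z", "rates": {"registered": 1}},
]


def group_by_ccode(docs):
    """Emulate $group with _id: '$ccode', $addToSet on rates, $first on date."""
    groups = {}
    for doc in docs:
        # The group key (_id) decides which bucket the document lands in.
        # 'date' is set only when the bucket is created, like $first.
        group = groups.setdefault(
            doc["ccode"],
            {"_id": doc["ccode"], "rates": [], "date": doc["date"]},
        )
        # Like $addToSet: keep distinct values only. MongoDB does not promise
        # an order for the set; this sketch keeps first-seen order.
        if doc["rates"] not in group["rates"]:
            group["rates"].append(doc["rates"])
    return list(groups.values())


by_country = group_by_ccode(rows)
print(by_country)
```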

Related

Group all elements with same name with their IDs Mongodb

I want to group all elements with same name and find their IDs and $push them in a list.
I have a dataset like
{
'id': 1,
'name': 'Refrigerator'
},
{
'id': 2,
'name': 'Refrigerator'
},
{
'id': 3,
'name': 'TV'
},
{
'id': 4,
'name': 'TV'
}
Expected Output
{
'equipment_name': 'Refrigerator',
'equipment_id': [1, 2]
},
{
'equipment_name': 'TV',
'equipment_id': [3, 4]
}
What I've tried
{'$group': {'_id': '$_id', 'equipmne_name': '$name'}}
{'$project': {'name': {'$push': {'$expr': ['$name', '$name']}}}
And a few more aggregation techniques with $cond

[
{'$group': {'_id': {'key': '$name', 'value': '$_id'}}},
{'$group': {'_id': '$_id.key', 'result': {'$push': {'$toString': '$$ROOT._id.value'}}}},
{'$project': {'_id': 0, 'equipment_name': '$_id', 'equipment_id': '$result'}}
]
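For comparison, the target shape can be sketched in plain Python. The function name ids_by_name is an assumption for illustration, and the sketch reads the id field exactly as the sample data spells it. In aggregation terms this corresponds to a single grouping on the name with the ids pushed into a list.

```python
from collections import defaultdict

# Sample data copied from the question.
equipment = [
    {"id": 1, "name": "Refrigerator"},
    {"id": 2, "name": "Refrigerator"},
    {"id": 3, "name": "TV"},
    {"id": 4, "name": "TV"},
]


def ids_by_name(docs):
    """Collect the ids of all documents that share the same name."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc["name"]].append(doc["id"])  # like $push within a group
    # Rename the fields the way the final $project stage does.
    return [
        {"equipment_name": name, "equipment_id": ids}
        for name, ids in grouped.items()
    ]


print(ids_by_name(equipment))
```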

How to properly use $group operator in MongoDB?

I am currently struggling with the MongoDB query in which I want to group data by 2 fields myId and myType, but the results that I get in return don't look like what I need.
My goal is to have for each myId results with myType grouping. Like:
myId : {myType1 : 5, myType2 : 3, myType3 : 1}
But when I am trying to provide query with group operator like below:
db.collection.aggregate([{
"$project": {
"myId": "$myId",
"myType": "$eventType",
}
},
{
"$group":{
"_id":{
"myId":"$myId",
"myType":"$Type"
},
"count":{
"$sum":1
}
}
}
])
Results returned by this kind of grouping looks like this
[{'_id': {'myId': 'qwerty123', 'myType': 'created', 'count': 1}},
{'_id': {'myId': 'qwerty123', 'myType': 'removed', 'count': 3}},
{'_id': {'myId': 'qwerty123', 'myType': 'updated', 'count': 2}},
{'_id': {'myId': 'asd123', 'myType': 'created', 'count': 1}},
{'_id': {'myId': 'asd123', 'myType': 'removed', 'count': 2}}]
But what I would like to achieve is a structure like below:
[{'_id': {'myId': 'qwerty123', 'myType': {'created': 1, 'removed': 3, 'updated': 2}}},
{'_id': {'myId': 'asd123', 'myType': {'created': 1, 'removed': 2}}}]
Or maybe like this:
[{'qwerty123', 'myType': {'created': 1, 'removed': 3, 'updated': 2}},
{'asd123', 'myType': {'created': 1, 'removed': 2}}]
Is it possible to achieve results from $group operator with the above schema? If yes, how can I achieve it?
Thank you.

Add the stage below after your existing $group stage:
db.collection.aggregate([
{ $group: {
_id: "$_id.myId",
myType: {
$push: {
$arrayToObject: [
[
{
k: "$_id.myType",
v: "$_id.count"
}
]
]
}
}
}}
])
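Note that $push combined with $arrayToObject gives a list of one-key objects per myId, not a single merged object. The merged shape the question asks for can be sketched in plain Python as below. The function name counts_by_id_and_type and the sample events are assumptions for illustration. Inside the pipeline, swapping $push for the $mergeObjects accumulator is one way to get a comparable merged object, assuming MongoDB 3.6 or later.

```python
from collections import defaultdict

# Hypothetical raw events that reproduce the counts shown in the question.
events = (
    [{"myId": "qwerty123", "eventType": "created"}]
    + [{"myId": "qwerty123", "eventType": "removed"}] * 3
    + [{"myId": "qwerty123", "eventType": "updated"}] * 2
    + [{"myId": "asd123", "eventType": "created"}]
    + [{"myId": "asd123", "eventType": "removed"}] * 2
)


def counts_by_id_and_type(docs):
    """Return {myId: {eventType: count}} for the given documents."""
    nested = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        nested[doc["myId"]][doc["eventType"]] += 1  # like $sum: 1 per (myId, type)
    # Convert the nested defaultdicts to plain dicts for a clean result.
    return {my_id: dict(types) for my_id, types in nested.items()}


print(counts_by_id_and_type(events))
```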

Getting last entry of the months from mongo collection

Say the collection stores data in the format below. A new entry is added to the collection every day. Dates are in ISO format.
|id|dt|data|
---
|1|2021-03-17|{key:"A", value:"B"}
...
|1|2021-03-14|{key:"A", value:"B"}
...
|1|2021-02-28|{key:"A", value:"B"}
|1|2021-02-27|{key:"A", value:"B"}
...
|1|2021-02-01|{key:"A", value:"B"}
|1|2021-01-31|{key:"A", value:"B"}
|1|2021-01-30|{key:"A", value:"B"}
...
|1|2021-01-01|{key:"A", value:"B"}
|1|2020-12-31|{key:"A", value:"B"}
...
|1|2020-11-30|{key:"A", value:"B"}
...
I need help with a query that gives me the last day of each month for a given period of time. Below is the query I came up with; it does not return the last day of the current month, because I am sorting by day, month, and year.
db.getCollection('data').aggregate([
{
$match: {dt: {$gt: ISODate("2020-01-01")}}
},
{
$project: {
dt: "$dt",
month: {
$month: "$dt"
},
day: {
$dayOfMonth: "$dt"
},
year: {
$year: "$dt"
},
data: "$data"
}
},
{
$sort: {day: -1, month: -1, year: -1}
},
{ $limit: 24},
{
$sort: {dt: -1}
},
])
The results I am after is:
|1|2021-03-17|{key:"A", value:"B"}
|1|2021-02-28|{key:"A", value:"B"}
|1|2021-01-31|{key:"A", value:"B"}
|1|2020-12-31|{key:"A", value:"B"}
|1|2020-11-30|{key:"A", value:"B"}
...
|1|2020-01-31|{key:"A", value:"B"}

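Sort by date descending, then group the records by year and month and keep the first (latest) document of each group. Without the sort, $first would pick an arbitrary document of the month.
db.getCollection('data').aggregate([
{ $match: { dt: { $gt: ISODate("2020-01-01") } } },
{ $sort: { dt: -1 } }, // newest first, so that $first below picks the latest document
{ $group: { // group by
_id: { $substr: ['$dt', 0, 7] }, // get year and month eg 2020-01
dt: { $max: "$dt" }, // find the max date
doc:{ "$first" : "$$ROOT" } } // first document in sort order, i.e. the latest of the month
},
{ "$replaceRoot": { "newRoot": "$doc"} }, // project the document
{ $sort: { dt: -1 } }
]);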
$substr
$group
$replaceRoot
$max
$first
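The idea of keeping the latest entry per month can be checked in plain Python without a database. This is only a sketch: the function name last_entry_per_month and the sample dates are assumptions, with dt held as datetime.date values.

```python
from datetime import date

# Hypothetical sample entries modelled on the question's table.
entries = [
    {"id": 1, "dt": date(2021, 3, 14)},
    {"id": 1, "dt": date(2021, 3, 17)},
    {"id": 1, "dt": date(2021, 2, 27)},
    {"id": 1, "dt": date(2021, 2, 28)},
    {"id": 1, "dt": date(2021, 1, 31)},
    {"id": 1, "dt": date(2020, 12, 31)},
]


def last_entry_per_month(docs):
    """Keep, for every (year, month), the document with the greatest dt."""
    latest = {}
    for doc in docs:
        key = (doc["dt"].year, doc["dt"].month)  # the group key
        if key not in latest or doc["dt"] > latest[key]["dt"]:
            latest[key] = doc
    # Newest month first, like the final $sort: {dt: -1}.
    return sorted(latest.values(), key=lambda d: d["dt"], reverse=True)


for doc in last_entry_per_month(entries):
    print(doc["dt"].isoformat())
```

Because the comparison is on the actual date, the month still in progress returns its latest entry (2021-03-17 in this sample) rather than a calendar month end.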

I monkey patched a possible solution for you in Python, but without your DB, I can't be positive that this works.
First there's a function that takes in an integer representing a month and returns the last day of that month.
import datetime as dt
def last_day_of_month(month):
    return dt.datetime(2021, month+1, 1) - dt.timedelta(days=1)
Next, I built the query with a separate function.
def build_query(last_month):
    return [
        {
            "$and": [
                {"date": {"$gte": last_day_of_month(i)}},
                {"date": {"$lt": last_day_of_month(i) + dt.timedelta(days=1)}}
            ]
        }
        for i in range(0, last_month)
    ]
Here's the output. It would be inside an $or operator in the $match stage.
{'$match': {'$or': [{'$and': [{'date': {'$gte': datetime.datetime(2020, 12, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 1, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 1, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 2, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 2, 28, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 3, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 3, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 4, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 4, 30, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 5, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 5, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 6, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 6, 30, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 7, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 7, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 8, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 8, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 9, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 9, 30, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 10, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 10, 31, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 11, 1, 0, 0)}}]},
{'$and': [{'date': {'$gte': datetime.datetime(2021, 11, 30, 0, 0)}},
{'date': {'$lt': datetime.datetime(2021, 12, 1, 0, 0)}}]}]}}
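A variant of the same idea that is not tied to the year 2021 can lean on calendar.monthrange from the standard library. This is a sketch under assumptions: the names month_end_filters and field are made up for illustration, and the default field name "dt" follows the question rather than the "date" used above. Keep in mind that matching calendar month ends would miss the latest entry of a month still in progress (2021-03-17 in the question).

```python
import calendar
import datetime as dt


def last_day_of_month(year, month):
    """Return midnight on the last calendar day of the given month (1-12)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return dt.datetime(year, month, days_in_month)


def month_end_filters(year, months, field="dt"):
    """Build one [month end, month end + 1 day) range per month, for use under $or."""
    one_day = dt.timedelta(days=1)
    return [
        {field: {"$gte": last_day_of_month(year, m),
                 "$lt": last_day_of_month(year, m) + one_day}}
        for m in months
    ]


# January to March 2021, wrapped the same way as the output shown above.
match_stage = {"$match": {"$or": month_end_filters(2021, range(1, 4))}}
print(last_day_of_month(2020, 2))
```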

How to get the sum of an array with mapReduce in MongoDB?

Given following database schema:
{
'_id': 5079,
'name': 'Lincoln County',
'state': 'AR',
'population': 13024,
'cases': [{'date': '2020-03-16', 'count': 1}, {'date': '2020-03-22', 'count': 1},
{'date': '2020-03-24', 'count': 1}, {'date': '2020-03-26', 'count': 2}],
'deaths': [{'date': '2020-03-27', 'count': 1}, {'date': '2020-04-02', 'count': 1},
{'date': '2020-05-28', 'count': 2}, {'date': '2020-05-30', 'count': 1}]
}
What MongoDB mapReduce function would generate a collection with the total number of COVID-19 cases for each state? It should produce one record per state, containing the state's 2-letter abbreviation and its total case count.

Try this query:
db.collection.aggregate([
{
"$project": {
"total": {
"$sum": {
"$map": {
"input": "$cases",
"as": "c",
"in": "$$c.count"
}
}
},
"state": 1
}
}
])
Example here
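The query uses $map to create an array with the values from cases.count and then applies $sum to these values.
The output fields are total, which contains the $sum, and state, which is kept using state: 1.
Note that this gives one total per document (that is, per county). To get one record per state, follow it with a $group on state that sums total.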
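Since the question asks for one record per state, here is what that extra grouping step amounts to, sketched in plain Python. The function name cases_per_state and the two extra counties are hypothetical; only Lincoln County comes from the question.

```python
from collections import Counter

counties = [
    # From the question (deaths and other fields omitted for brevity).
    {"name": "Lincoln County", "state": "AR",
     "cases": [{"date": "2020-03-16", "count": 1}, {"date": "2020-03-22", "count": 1},
               {"date": "2020-03-24", "count": 1}, {"date": "2020-03-26", "count": 2}]},
    # Hypothetical extra documents, added to show the per-state grouping.
    {"name": "Example County", "state": "AR",
     "cases": [{"date": "2020-03-20", "count": 3}]},
    {"name": "Sample County", "state": "TX",
     "cases": [{"date": "2020-03-18", "count": 2}, {"date": "2020-03-19", "count": 2}]},
]


def cases_per_state(docs):
    """Sum cases.count per document, then add the sums up per state."""
    totals = Counter()
    for doc in docs:
        # Inner sum mirrors $sum over $map; the Counter mirrors a $group on state.
        totals[doc["state"]] += sum(c["count"] for c in doc.get("cases", []))
    return dict(totals)


print(cases_per_state(counties))
```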

Group by several fields and custom sums with two conditions

I want to group rows with two conditions. The first one to get total (now it works), the second to get unread messages. I cannot imagine how to do it. Inserts are:
db.messages.insert({'source_user': 'test1', 'destination_user': 'test2', 'is_read': true})
db.messages.insert({'source_user': 'test1', 'destination_user': 'test2', 'is_read': false})
db.messages.insert({'source_user': 'test1', 'destination_user': 'test3', 'is_read': true})
my code:
db.messages.aggregate([
{'$match': {'source_user': user}},
{'$group': {
'_id': {
'source_user': '$source_user',
'destination_user': '$destination_user',
'is_read': '$is_read'
},
'total': {'$sum': 1}}
},
{'$project': {
'source_user': '$_id.source_user',
'destination_user': '$_id.destination_user',
#'unread': {'$sum': {'$_id.is_read': False}},
'total': '$total',
'_id': 0
}}
])
as a result I want to get:
[{
'source_user': 'test1',
'destination_user': 'test2',
'unread': 1,
'total': 2
}, {
'source_user': 'test1',
'destination_user': 'test3',
'unread': 0,
'total': 1
}
]
Should I add a new group, or can I use the $is_read flag in the same group?
Thank you!

You can count unread messages in the same $group as total: remove is_read from the group key and use $cond so that each read message adds 0 and each unread message adds 1:
db.messages.aggregate([
{'$match': {'source_user': user}},
{'$group': {
'_id': {
'source_user': '$source_user',
'destination_user': '$destination_user'
},
'total': {'$sum': 1},
'unread': {'$sum': { '$cond': [ '$is_read', 0, 1 ] }}
}
},
{'$project': {
'source_user': '$_id.source_user',
'destination_user': '$_id.destination_user',
'total': 1,
'unread': 1,
'_id': 0
}}
])
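The conditional sum can be traced by hand in plain Python. This is just a sketch of the logic: the function name message_stats is an assumption, and the sample messages are the three inserts from the question.

```python
# The three messages inserted in the question.
messages = [
    {"source_user": "test1", "destination_user": "test2", "is_read": True},
    {"source_user": "test1", "destination_user": "test2", "is_read": False},
    {"source_user": "test1", "destination_user": "test3", "is_read": True},
]


def message_stats(docs, user):
    """Count total and unread messages per (source_user, destination_user)."""
    stats = {}
    for msg in docs:
        if msg["source_user"] != user:  # like the $match stage
            continue
        key = (msg["source_user"], msg["destination_user"])  # the group _id
        entry = stats.setdefault(key, {
            "source_user": msg["source_user"],
            "destination_user": msg["destination_user"],
            "unread": 0,
            "total": 0,
        })
        entry["total"] += 1                           # like $sum: 1
        entry["unread"] += 0 if msg["is_read"] else 1  # like $sum of $cond
    return list(stats.values())


print(message_stats(messages, "test1"))
```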
