MongoDB group by date and another column

I need some help grouping by date and by another column. At the moment I have:
[
{
'$project': {
'date': 1,
'source': 1,
'callDirection': 1,
'status': 1
}
}, {
'$match': {
'$or': [
{
'source': '501'
}, {
'source': '555'
}
]
}
}, {
'$group': {
'_id': 0,
'total': {
'$sum': 1
},
'answered': {
'$sum': {
'$cond': [
{
'$eq': [
'$status', 'ANSWERED'
]
}, 1, 0
]
}
},
'no answer': {
'$sum': {
'$cond': [
{
'$eq': [
'$status', 'NO ANSWER'
]
}, 1, 0
]
}
}
}
}
]
The result I get now is just the totals:
_id:0
total:591
answered:443
no answer:129
What I need is to split the data by source and by date, so that the data is returned like this:
date => 2022-01-23 , source => 501, answered => 12, noanswer => 2
date => 2022-01-23 , source => 555, answered => 5, noanswer => 5
date => 2022-01-24 , source => 501, answered => 6, noanswer => 3
date => 2022-01-24 , source => 555, answered => 22, noanswer => 6
example data:
"date": "2021-12-23 10:25:59","source": "501","callDirection": "Outgoing","status": "ANSWERED"
"date": "2021-12-23 11:21:19","source": "501","callDirection": "Outgoing","status": "NO ANSWER"
"date": "2021-12-24 01:21:19","source": "501","callDirection": "Outgoing","status": "ANSWERED"
"date": "2021-12-24 10:25:59","source": "555","callDirection": "Outgoing","status": "ANSWERED"
"date": "2021-12-25 12:55:19","source": "555","callDirection": "Outgoing","status": "ANSWERED"
I'm new to MongoDB and need some help. Thanks a lot.

Perhaps something like this:
db.collection.aggregate([
{
$addFields: {
date: {
$substr: [
"$date",
0,
10
]
}
}
},
{
$group: {
_id: {
da: "$date",
so: "$source",
cd: "$callDirection"
},
answer: {
"$sum": {
"$cond": [
{
"$eq": [
"ANSWERED",
"$status"
]
},
1,
0
]
}
},
noanswer: {
"$sum": {
"$cond": [
{
"$eq": [
"NO ANSWER",
"$status"
]
},
1,
0
]
}
}
}
},
{
$project: {
date: "$_id.da",
source: "$_id.so",
callDirection: "$_id.cd",
answer: 1,
noanswer: 1
}
}
])
Explained:
Replace the datetime string with a date-only string (its first 10 characters).
Group by date, source and callDirection, generating two new counting fields, answer and noanswer, from the status field.
Project only the necessary fields.
playground
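To sanity-check the grouping without a database, the same logic can be sketched in plain JavaScript. This is only an illustration under a few assumptions: summarizeCalls is a hypothetical helper (not a driver or MongoDB function), it groups by day and source only (the question did not ask for callDirection), and it counts "NO ANSWER" explicitly rather than treating every other status as unanswered.

```javascript
// Plain-JavaScript sketch of "group by day and source"; no MongoDB needed.
// summarizeCalls is a hypothetical helper, not part of any driver.
function summarizeCalls(calls, sources) {
  const groups = new Map();
  for (const call of calls) {
    if (!sources.includes(call.source)) continue; // like the $match stage
    const day = call.date.slice(0, 10);           // like $substr: ["$date", 0, 10]
    const key = day + "|" + call.source;          // compound key, like _id in $group
    if (!groups.has(key)) {
      groups.set(key, { date: day, source: call.source, answered: 0, noanswer: 0 });
    }
    const row = groups.get(key);
    if (call.status === "ANSWERED") row.answered += 1;
    if (call.status === "NO ANSWER") row.noanswer += 1;
  }
  // Sort by date, then source, so the output order is deterministic.
  return [...groups.values()].sort(
    (a, b) => a.date.localeCompare(b.date) || a.source.localeCompare(b.source)
  );
}

// Sample rows taken from the question.
const sampleCalls = [
  { date: "2021-12-23 10:25:59", source: "501", callDirection: "Outgoing", status: "ANSWERED" },
  { date: "2021-12-23 11:21:19", source: "501", callDirection: "Outgoing", status: "NO ANSWER" },
  { date: "2021-12-24 01:21:19", source: "501", callDirection: "Outgoing", status: "ANSWERED" },
  { date: "2021-12-24 10:25:59", source: "555", callDirection: "Outgoing", status: "ANSWERED" },
  { date: "2021-12-25 12:55:19", source: "555", callDirection: "Outgoing", status: "ANSWERED" },
];

const summary = summarizeCalls(sampleCalls, ["501", "555"]);
console.log(summary);
```

With the five sample rows this yields four groups, one per distinct (day, source) pair, which is the shape the question asks for.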

Related

MongoDB - How to bring age group data

How can I get age-group data from a collection in MongoDB, i.e. how many people are 0-18, 19-24, 25-34, 35+?
[
{
"_id": ObjectId("608be7c608c7de2367c89638"),
"status": true,
"gender": "Male",
"first_name": "Vinter",
"last_name": "R",
"dob": "1-2-1999"
},
{
"_id": ObjectId("608be7c608c7de2267c89639"),
"status": true,
"gender": "Male",
"first_name": "Ray",
"last_name": "Morgan",
"dob": "1-2-2015"
}
....
]
See the Mongo Playground:
https://mongoplayground.net/p/4ydNg9Plh6P

Interesting question!
I would like to credit @Takis and @YuTing.
@Takis's comment gave a good hint about $bucket.
@YuTing's answer is good.
I think this answer is shorter, as it makes use of features provided by MongoDB:
$toDate - Converts the date string to a Date (supported in version 4.0 and above).
$dateDiff - Subtracts two dates and returns the difference in the given unit (supported in version 5.0 and above).
$$CURRENT - Variable referencing the document currently being processed; used to add it to the persons array field (via $push).
$switch - Displays the group label based on conditions (optional).
db.collection.aggregate([
{
"$addFields": {
"age": {
$dateDiff: {
startDate: {
$toDate: "$dob"
},
endDate: "$$NOW",
unit: "year"
}
}
}
},
{
$bucket: {
groupBy: "$age",
// Field to group by
boundaries: [
0,
19,
25,
35
],
// Boundaries for the buckets
default: "Other",
// Bucket id for documents which do not fall into a bucket
output: {
// Output for each bucket
"count": {
$sum: 1
},
"persons": {
$push: "$$CURRENT"
}
}
}
},
{
$project: {
_id: 0,
group: {
$switch: {
branches: [
{
case: {
$lt: [
"$_id",
19
]
},
then: "0-18"
},
{
case: {
$lt: [
"$_id",
25
]
},
then: "19-24"
},
{
case: {
$lt: [
"$_id",
35
]
},
then: "25-34"
}
],
default: "35+"
}
},
count: 1,
persons: 1
}
}
])
Sample Mongo Playground
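The bucket boundaries are easy to get off by one: each $bucket range includes its lower bound and excludes its upper bound, which is why [0, 19, 25, 35] produces the groups 0-18, 19-24 and 25-34. A minimal plain-JavaScript sketch of that assignment, where ageGroup and countByAgeGroup are hypothetical helpers written for illustration only:

```javascript
// Hypothetical helpers mirroring $bucket with boundaries [0, 19, 25, 35]
// plus a default bucket; this is not MongoDB code.
const boundaries = [0, 19, 25, 35];
const labels = ["0-18", "19-24", "25-34"];

function ageGroup(age) {
  for (let i = 0; i < boundaries.length - 1; i++) {
    // Lower bound inclusive, upper bound exclusive, as in $bucket.
    if (age >= boundaries[i] && age < boundaries[i + 1]) return labels[i];
  }
  // Anything outside the boundaries falls through, like the `default` bucket
  // that the $switch stage relabels as "35+".
  return "35+";
}

function countByAgeGroup(ages) {
  const counts = {};
  for (const age of ages) {
    const group = ageGroup(age);
    counts[group] = (counts[group] || 0) + 1;
  }
  return counts;
}

console.log(countByAgeGroup([7, 18, 19, 24, 25, 34, 35, 60]));
```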

Use $bucket:
db.collection.aggregate([
{
$bucket: {
groupBy: {
"$subtract": [
{
$year: new Date()
},
{
$toInt: {
$substr: [
"$dob",
{
$subtract: [
{
$strLenCP: "$dob"
},
4
]
},
4
]
}
}
]
},
// Field to group by
boundaries: [
0,
19,
25,
35,
100
],
// Boundaries for the buckets
default: "Other",
// Bucket id for documents which do not fall into a bucket
output: {
// Output for each bucket
"count": {
$sum: 1
},
"artists": {
$push: {
"name": {
$concat: [
"$first_name",
" ",
"$last_name"
]
},
"age": {
"$subtract": [
{
$year: new Date()
},
{
$toInt: {
$substr: [
"$dob",
{
$subtract: [
{
$strLenCP: "$dob"
},
4
]
},
4
]
}
}
]
}
}
}
}
}
}
])
mongoplayground

Groupby Array elements and categorize them based on date in Mongodb Query

I have an array field in my DB and I want to group by its elements and count how often each element occurs within different time windows. Assume the following collection:
{
_id: ObjectId(7df78ad8902c)
title: 'MongoDB Overview',
tags: ['SQL', 'database', 'NoSQL'],
created_at: 2021-10-03 10:05:51.755Z
},
{
_id: ObjectId(7df78ad8902d)
title: 'NoSQL Overview',
tags: ['mongodb', 'database', 'PHP'],
created_at: 2021-10-03 14:05:51.755Z
},
{
_id: ObjectId(7df78ad8902d)
title: 'Developing',
tags: ['java', 'btc/usdt', 'PHP'],
created_at: 2021-10-03 14:05:51.755Z
}
,
{
_id: ObjectId(7df78ad8902d)
title: 'databases for search',
tags: ['elasticsearch', 'database', 'PHP'],
created_at: 2021-10-03 12:05:51.755Z
}
I want to count how often the different elements of the tags field (such as mongodb, database, NoSQL) occur within a given time window (for example, hot hashtags in the last hour, today, or this month) in this collection. How can I solve this problem in MongoDB?
Expected answer, something like:
1 - hot hashtags in last hour ['a' , 'b' , 'c']
2 - hot hashtags in last 5 hours : ...
3 - today : ...
4 - this month : ...

Query
Not calendar based; it is based on the difference in milliseconds.
Keep only the last 30 days of data.
$facet with 4 group-bys.
It is exactly the same code 4x; the only difference is the $multiply:
last 30 days : (now_date-created_at) <= (* 30 24 60 60 1000)
last 24 hours : (now_date-created_at) <= (* 24 60 60 1000)
last 5 hours : (now_date-created_at) <= (* 5 60 60 1000)
last 60 min : (now_date-created_at) <= (* 60 60 1000)
*Subtraction also works on dates, and returns milliseconds.
Filter by date depending on the window we want (4 different filters).
Unwind.
Group by tag and count occurrences.
Sort by count.
Limit 2 to keep the 2 hottest tags; you can change it to any value, e.g. limit 1 to keep only the hottest tag.
*If you want calendar-based windows (e.g. "this month" on 3 October covering only 3 days of data), the query must be changed; the same goes for "today" (the query uses the last 24 hours), etc. In those cases we should use $month, $dayOfMonth, $hour, etc.
It's not a big change to the query.
//same month
{"$eq" : [ {"$month" : "$$NOW"}, {"$month" : "$created_at"} ]}
//same day (assuming same month from the previous filter)
{"$eq" : [ {"$dayOfMonth" : "$$NOW"}, {"$dayOfMonth" : "$created_at"} ]}
//same hour (assuming same day from the previous filter)
{"$eq" : [ {"$hour" : "$$NOW"}, {"$hour" : "$created_at"} ]}
*We could also use the new $dateTrunc to check for same month etc.
Test code here
(The query is big, but it is the same thing 4x.)
db.collection.aggregate([
{
"$set": {
"created_at": {
"$dateFromString": {
"dateString": "$created_at"
}
}
}
},
{
"$unwind": {
"path": "$tags"
}
},
{
"$match": {
"$expr": {
"$lte": [
{
"$subtract": [
"$$NOW",
"$created_at"
]
},
{
"$multiply": [
30,
24,
60,
60,
1000
]
}
]
}
}
},
{
"$facet": {
"month-tag": [
{
"$match": {
"$expr": {
"$lte": [
{
"$subtract": [
"$$NOW",
"$created_at"
]
},
{
"$multiply": [
30,
24,
60,
60,
1000
]
}
]
}
}
},
{
"$group": {
"_id": "$tags",
"count": {
"$sum": 1
}
}
},
{
"$sort": {
"count": -1
}
},
{
"$limit": 2
},
{
"$project": {
"_id": 0,
"tag": "$_id"
}
}
],
"day-tag": [
{
"$match": {
"$expr": {
"$lte": [
{
"$subtract": [
"$$NOW",
"$created_at"
]
},
{
"$multiply": [
24,
60,
60,
1000
]
}
]
}
}
},
{
"$group": {
"_id": "$tags",
"count": {
"$sum": 1
}
}
},
{
"$sort": {
"count": -1
}
},
{
"$limit": 2
},
{
"$project": {
"_id": 0,
"tag": "$_id"
}
}
],
"5hour-tag": [
{
"$match": {
"$expr": {
"$lte": [
{
"$subtract": [
"$$NOW",
"$created_at"
]
},
{
"$multiply": [
5,
60,
60,
1000
]
}
]
}
}
},
{
"$group": {
"_id": "$tags",
"count": {
"$sum": 1
}
}
},
{
"$sort": {
"count": -1
}
},
{
"$limit": 2
},
{
"$project": {
"_id": 0,
"tag": "$_id"
}
}
],
"hour-tag": [
{
"$match": {
"$expr": {
"$lte": [
{
"$subtract": [
"$$NOW",
"$created_at"
]
},
{
"$multiply": [
60,
60,
1000
]
}
]
}
}
},
{
"$group": {
"_id": "$tags",
"count": {
"$sum": 1
}
}
},
{
"$sort": {
"count": -1
}
},
{
"$limit": 2
},
{
"$project": {
"_id": 0,
"tag": "$_id"
}
}
]
}
}
])
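Since the four facets differ only in their window length, the logic can be sketched once in plain JavaScript and called per window. This is an illustration, not the pipeline itself: topTags is a hypothetical helper, the `now` value is an assumed clock for the demo, and ties are broken by tag name (code-unit order) only to make the output deterministic. Note that 5 * 60 * 60 * 1000 is five hours, matching the "5hour-tag" facet.

```javascript
// Window lengths in milliseconds, matching the $multiply arrays in the pipeline.
const HOUR_MS = 60 * 60 * 1000;
const windows = {
  "hour-tag": HOUR_MS,
  "5hour-tag": 5 * HOUR_MS,
  "day-tag": 24 * HOUR_MS,
  "month-tag": 30 * 24 * HOUR_MS,
};

// Hypothetical helper: top `limit` tags among posts created within `windowMs` of `now`.
function topTags(posts, now, windowMs, limit) {
  const counts = new Map();
  for (const post of posts) {
    if (now - post.created_at > windowMs) continue; // Date - Date yields milliseconds
    for (const tag of post.tags) {                  // like $unwind
      counts.set(tag, (counts.get(tag) || 0) + 1);  // like $group with $sum: 1
    }
  }
  return [...counts.entries()]
    // Count descending; ties broken by tag name so the result is deterministic.
    .sort((a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0))
    .slice(0, limit)                                // like $limit
    .map(([tag]) => tag);                           // like the final $project
}

const now = new Date("2021-10-03T15:00:00.000Z"); // assumed "current" time for the demo
const posts = [
  { tags: ["SQL", "database", "NoSQL"], created_at: new Date("2021-10-03T10:05:51.755Z") },
  { tags: ["mongodb", "database", "PHP"], created_at: new Date("2021-10-03T14:05:51.755Z") },
  { tags: ["java", "btc/usdt", "PHP"], created_at: new Date("2021-10-03T14:05:51.755Z") },
  { tags: ["elasticsearch", "database", "PHP"], created_at: new Date("2021-10-03T12:05:51.755Z") },
];

for (const [name, ms] of Object.entries(windows)) {
  console.log(name, topTags(posts, now, ms, 2));
}
```

The pipeline itself has no tie-breaker in its $sort, so when several tags share a count, which ones survive the $limit is not guaranteed there.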

Filter specific elements of ordered array of objects, based on values of previous objects (aggregation framework)

I have these documents:
[
{
'_id': 1,
'role': [
{ // keep this document
'plan': 'free',
'date': ISODate('2020-01-01')
},
{
'plan': 'free',
'date': ISODate('2020-01-02')
},
{
'plan': 'free',
'date': ISODate('2020-01-03')
},
{ // keep this document
'plan': 'pro',
'date': ISODate('2020-01-04')
},
{
'plan': 'pro',
'date': ISODate('2020-01-05')
},
{
'plan': 'pro',
'date': ISODate('2020-01-06')
},
{ // keep this document
'plan': 'free',
'date': ISODate('2020-01-08')
},
{
'plan': 'free',
'date': ISODate('2020-01-09')
}
]
},
{
'_id': 2,
'role': [
{ // keep this document
'plan': 'pro',
'date': ISODate('2020-02-05')
},
{
'plan': 'pro',
'date': ISODate('2020-02-06')
},
{ // keep this document
'plan': 'free',
'date': ISODate('2020-02-07')
},
{
'plan': 'free',
'date': ISODate('2020-02-08')
},
{
'plan': 'free',
'date': ISODate('2020-02-09')
},
{ // keep this document
'plan': 'pro',
'date': ISODate('2020-02-10')
},
{
'plan': 'pro',
'date': ISODate('2020-02-11')
},
{
'plan': 'pro',
'date': ISODate('2020-02-12')
}
]
}
]
So I have to filter the objects in the role array based on changes in the value of the plan field.
I always want to keep the first occurrence, but each following object should only be kept if the value of the plan field has changed (e.g. free changed to pro, or pro changed to free).
Note: I have more distinct values for the plan field (e.g. premium, admin, etc.), but I only used two in the example.

I believe this operation might be overkill on a huge dataset, or on documents whose role array contains many objects. You can try the aggregation query below:
db.collection.aggregate([
/** As `role` field already exists `$addFields` will overwrite with new value */
{
$addFields: {
role: {
$let: {
vars: {
data: {
$reduce: {
input: { $slice: [ "$role", 1, { $size: "$role" } ] }, /** array input without first object */
initialValue: { roleObjs: [ { $arrayElemAt: [ "$role", 0 ] } ], plan: { $arrayElemAt: [ "$role.plan", 0 ] } }, /** Pick first object & first object's plan as initial values */
in: {
roleObjs: { $cond: [ { $eq: [ "$$this.plan", "$$value.plan" ] }, "$$value.roleObjs", { $concatArrays: [ "$$value.roleObjs", [ "$$this" ] ] } ] }, /** Conditional check & merge new object to array or return holding array as is */
plan: { $cond: [ { $eq: [ "$$this.plan", "$$value.plan" ] }, "$$value.plan", "$$this.plan" ] }
}
}
}
},
in: "$$data.roleObjs" /** Return newly formed `roleObjs` array in local variable */
}
}
}
}
])
Test : mongoplayground
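The $reduce above amounts to "keep an element only if its plan differs from the plan of the element before it". For comparison, the same idea as a plain-JavaScript sketch, which could also be used client-side if running this in the database proves too heavy; keepPlanChanges is a hypothetical helper and the dates are shown as strings for brevity:

```javascript
// Hypothetical helper: keep the first element of every run of equal `plan` values.
function keepPlanChanges(roles) {
  return roles.reduce((kept, current) => {
    const last = kept[kept.length - 1];
    // The first element is always kept; later ones only when the plan changes.
    // Comparing with the last kept element is enough, because it has the same
    // plan as every element skipped since then.
    if (last === undefined || last.plan !== current.plan) kept.push(current);
    return kept;
  }, []);
}

// The role array of the document with _id 1 from the question.
const roles = [
  { plan: "free", date: "2020-01-01" },
  { plan: "free", date: "2020-01-02" },
  { plan: "free", date: "2020-01-03" },
  { plan: "pro", date: "2020-01-04" },
  { plan: "pro", date: "2020-01-05" },
  { plan: "pro", date: "2020-01-06" },
  { plan: "free", date: "2020-01-08" },
  { plan: "free", date: "2020-01-09" },
];

const keptRoles = keepPlanChanges(roles);
console.log(keptRoles.map((role) => role.date));
```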

Here is an aggregation with the desired result:
db.collection.aggregate( [
{
$addFields: {
plans: {
$reduce: {
input: "$role",
initialValue: [],
in: { $concatArrays: [ "$$value", [ "$$this.plan" ] ] }
}
}
}
},
{
$addFields: {
role: {
$reduce: {
input: { $range: [ 1, { $size: "$role" } ] }, /** $range excludes its end value, so this visits indexes 1 .. size-1, including the last element */
initialValue: { prevPlan: { $arrayElemAt: [ "$plans", 0 ] }, roles: [ { $arrayElemAt: [ "$role", 0 ] } ] },
in: {
$cond: [ { $eq: [ { $arrayElemAt: [ "$plans", "$$this"] }, "$$value.prevPlan" ] },
{ prevPlan: { $arrayElemAt: [ "$plans", "$$this"] },
roles: { $concatArrays: [ "$$value.roles", [ ] ] }
},
{ prevPlan: { $arrayElemAt: [ "$plans", "$$this" ] },
roles: { $concatArrays: [ "$$value.roles", [ { $arrayElemAt: [ "$role", "$$this" ] } ] ] }
}
]
}
}
}
}
},
{
$project: { role: "$role.roles" }
}
] )

MongoDB. Aggregate the sum of two arrays sizes

With MongoDB 3.4.10 and mongoose 4.13.6 I'm able to count sizes of two arrays on the User model:
User.aggregate()
.project({
'_id': 1,
'leftVotesCount': { '$size': '$leftVoted' },
'rightVotesCount': { '$size': '$rightVoted' }
})
where my Users are (per db.users.find())
{ "_id" : ObjectId("5a2b21e63023c6117085c240"), "rightVoted" : [ 2 ],
"leftVoted" : [ 1, 6 ] }
{ "_id" : ObjectId("5a2c0d68efde3416bc8b7020"), "rightVoted" : [ 2 ],
"leftVoted" : [ 1 ] }
Here I'm getting expected result:
[ { _id: '5a2b21e63023c6117085c240', leftVotesCount: 2, rightVotesCount: 1 },
{ _id: '5a2c0d68efde3416bc8b7020', leftVotesCount: 1, rightVotesCount: 1 } ]
Question: how can I get the combined total of leftVotesCount and rightVotesCount? I tried the following:
User.aggregate()
.project({
'_id': 1,
'leftVotesCount': { '$size': '$leftVoted' },
'rightVotesCount': { '$size': '$rightVoted' },
'votesCount': { '$add': [ '$leftVotesCount', '$rightVotesCount' ] },
'votesCount2': { '$sum': [ '$leftVotesCount', '$rightVotesCount' ] }
})
But votesCount is null and votesCount2 is 0 for both users. I'm expecting votesCount = 3 for User 1 and votesCount = 2 for User 2.

$leftVotesCount, $rightVotesCount become available only on the next stage. Try something like:
User.aggregate()
.project({
'_id': 1,
'leftVotesCount': { '$size': '$leftVoted' },
'rightVotesCount': { '$size': '$rightVoted' }
})
.project({
'_id': 1,
'leftVotesCount': 1,
'rightVotesCount': 1,
'votesCount': { '$add': [ '$leftVotesCount', '$rightVotesCount' ] },
'votesCount2': { '$sum': [ '$leftVotesCount', '$rightVotesCount' ] }
})

You can't reference the project variables created in the same project stage.
You can wrap the variables in a $let expression.
User.aggregate().project({
"$let": {
"vars": {
"leftVotesCount": {
"$size": "$leftVoted"
},
"rightVotesCount": {
"$size": "$rightVoted"
}
},
"in": {
"votesCount": {
"$add": [
"$$leftVotesCount",
"$$rightVotesCount"
]
},
"leftVotesCount": "$$leftVotesCount",
"rightVotesCount": "$$rightVotesCount"
}
}
})

It turned out that $add supports nested expressions, so I was able to solve the issue by excluding intermediate variables:
User.aggregate().project({
'_id': 1,
'votesCount': { '$add': [ { '$size': '$leftVoted' }, { '$size': '$rightVoted' } ] }
});
// [ {_id: '...', votesCount: 3}, {_id: '...', votesCount: 2} ]

How to ORDER BY FIELD VALUE in MongoDB

In MySQL I often use the FIELD() function in the ORDER BY clause:
ORDER BY FIELD(id, '1', '6', '3', ...);
How does one get the same results in MongoDB? I tried the following:
.find(...).sort({id: [1, 6, 3]})
This did not work.

We can use $indexOfArray
Console
db.collectionName.aggregate([{
$match: {
_id: {
$in: [249, 244]
}
}
}, {
$addFields: {
sort: {
$indexOfArray: [
[249, 244], "$_id"
]
}
}
},{
$sort: {
sort: 1
}
}])
PHP code
$data = $this->mongo->{$collectionName}->aggregate(
[
[
'$match' => ['_id' => ['$in' => $idList]]
],
[
'$addFields' => ['sort' => ['$indexOfArray' => [$idList, '$_id']]]
],
[
'$sort' => ['sort' => 1]
],
[
'$project' => [
'name' => 1
]
]
]
);
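The idea behind $indexOfArray here is that each document gets the position of its _id in the wanted list as a sort key. A plain-JavaScript sketch of the same three steps, useful for checking the expected order; orderByIdList is a hypothetical helper and the sample documents are made up for illustration:

```javascript
// Hypothetical helper mirroring the $match + $addFields + $sort stages.
function orderByIdList(docs, idList) {
  return docs
    .filter((doc) => idList.includes(doc._id))                 // like $match with $in
    .map((doc) => ({ ...doc, sort: idList.indexOf(doc._id) })) // like $indexOfArray
    .sort((a, b) => a.sort - b.sort);                          // like $sort: { sort: 1 }
}

// Made-up sample documents; _id 7 is not in the list and is filtered out.
const docs = [
  { _id: 3, name: "c" },
  { _id: 1, name: "a" },
  { _id: 7, name: "x" },
  { _id: 6, name: "b" },
];

const ordered = orderByIdList(docs, [1, 6, 3]);
console.log(ordered.map((doc) => doc._id));
```

The $match matters in the pipeline as well: $indexOfArray returns -1 for a value that is not in the array, so unmatched documents would otherwise sort to the front.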

So for the record:
Given the array [1,6,3] what you want in your query is this:
db.collection.aggregate([
{ "$project": {
"weight": { "$cond": [
{ "$eq": ["$_id", 1] },
3,
{ "$cond": [
{ "$eq": ["$_id", 6] },
2,
{ "$cond": [
{ "$eq": ["$_id", 3] },
1,
0
]}
]}
]}
}},
{ "$sort": { "weight": -1 } }
])
That projects a specific "weight" onto each result according to its position in your input array, and the results are then sorted by that weight.
