Convert to lowercase in group aggregation - mongodb

I want to return an aggregate of blog post tags and their total count. My blog posts are stored like so:
{
"_id" : ObjectId("532c323bb07ab5aace243c8e"),
"title" : "Fitframe.js - Responsive iframes made easy",
"tags" : [
"JavaScript",
"jQuery",
"RWD"
]
}
I'm then executing the following pipeline:
printjson(db.posts.aggregate(
{
$project: {
tags: 1,
count: { $add: 1 }
}
},
{
$unwind: '$tags'
},
{
$group: {
_id: '$tags',
count: {
$sum: '$count'
},
tags_lower: { $toLower: '$tags' }
}
},
{
$sort: {
_id: 1
}
}
));
So that the results are sorted correctly I need to sort on a lowercase version of each tag. However, when executing the above code I get the following error:
aggregate failed: {
"errmsg" : "exception: unknown group operator '$toLower'",
"code" : 15952,
"ok" : 0
}
Do I need to do another projection to add the lowercase tag?

Yes, add it in a projection. It will not work as a field of $group: only accumulator operators such as $sum ( http://docs.mongodb.org/manual/reference/operator/aggregation-group/ ) are accepted as field values at the top level of a $group stage, and $toLower is not one of them.

You don't need to add another projection ... you could fix it when you do the $group:
db.posts.aggregate(
{
$project: {
tags: 1,
count: { $add: 1 }
}
},
{
$unwind: '$tags'
},
{
$group: {
_id: { tag: '$tags', lower: { $toLower : '$tags' } },
count: {
$sum: '$count'
}
}
},
{
$sort: {
"_id.lower": 1
}
}
)
In the above example, I've preserved the original name and added the lower case version to the _id.

Add another projection step between $unwind and $group:
...
{$project: {
tags: {$toLower: '$tags'},
count: 1
}}
...
And remove tags_lower from $group
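The two suggestions above behave differently when the same tag appears with different capitalization. Here is a rough sketch in plain JavaScript (no MongoDB involved) that mimics both; the sample posts and the countBy helper are made up for illustration and are not part of the original question.

```javascript
// Hypothetical sample data: "JavaScript" and "javascript" differ only in case.
const posts = [
  { title: "A", tags: ["JavaScript", "jQuery", "RWD"] },
  { title: "B", tags: ["javascript", "CSS"] },
];

// Rough equivalent of $unwind: one item per tag.
const tags = posts.flatMap((post) => post.tags);

// Hypothetical helper: count items by the key that keyOf returns.
function countBy(items, keyOf) {
  const counts = new Map();
  for (const item of items) {
    const key = keyOf(item);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Variant 1: group on the original tag, keep a lowercase copy only for sorting.
const byOriginal = [...countBy(tags, (t) => t)]
  .map(([tag, count]) => ({ tag, lower: tag.toLowerCase(), count }))
  .sort((a, b) => (a.lower < b.lower ? -1 : a.lower > b.lower ? 1 : 0));

// Variant 2: lowercase first, then group, so case variants are merged.
const byLower = [...countBy(tags, (t) => t.toLowerCase())]
  .map(([tag, count]) => ({ tag, count }))
  .sort((a, b) => (a.tag < b.tag ? -1 : a.tag > b.tag ? 1 : 0));

console.log(byOriginal); // five groups, "JavaScript" and "javascript" stay separate
console.log(byLower);    // four groups, the two spellings are counted together
```

Which one you want depends on whether differently capitalized tags should be treated as the same tag.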

Related

About Mongo Group by where arr.length>0

Just assume following data:
{_id:1,hotelcode:"a",availdates:["2020-01-02","2020-02-03"]}
{_id:2,hotelcode:"a",availdates:["2020-02-03"]}
{_id:3,hotelcode:"b",availdates:[]}
{_id:4,hotelcode:"b",availdates:["2020-01-02"]}
{_id:5,hotelcode:"c",availdates:["2020-01-02","2020-02-03"]}
I want to achieve the equivalent of:
select hotelcode, count(hotelcode) from table where availdates.length > 0 group by hotelcode
What should I do?
I tried:
db.getCollection('spl_rate_27').aggregate([
{$project:{
adlength:{$size:"$avail_dates"}}
},
{$match:{adlength:{$gt:1}}},
{$group:{_id:{hotelcode:"$hotel_code"},total:{$sum:1}}}
])
But I got :
{
"_id" : {
"hotelcode" : null
},
"total" : 99999.0
}
Something seems to be wrong, but I can't figure out what.
You can do something like the following: first keep only the documents whose availdates array has more than 0 elements, then group by hotelcode.
[
{
$match: {
$expr: {
$gt: [
{
$size: "$availdates"
},
0
]
}
}
},
{
$group: {
_id: "$hotelcode",
total: {
$sum: 1
}
}
},
{
$project: {
_id: 0,
hotelcode: "$_id",
total: 1
}
}
]
There are two things you can change.
Use $addFields instead of $project: $project keeps only the fields you list, whereas $addFields adds a field to the existing document.
Then use $gte: 1 (or $gt: 0) in the $match, since you need a length greater than 0.
db.collection.aggregate([
{
$addFields: {
adlength: {
$size: "$availdates" // was misspelled as "$avail_dates" in the question
}
}
},
{
$match: {
adlength: {
$gte: 1
}
}
},
{
$group: {
_id: {
hotelcode: "$hotelcode" // was misspelled as "$hotel_code" in the question
},
total: {
$sum: 1
}
}
}
])
Well, I got inspiration from @Gibbs' answer and changed my script a bit:
db.getCollection('table').aggregate([
{$project:{
hotelcode:1, // I had omitted this field
adlength:{$size:"$availdates"}}
},
{$match:{"adlength":{$gt:0}}},
{$group:{_id:{hotelcode:"$hotelcode"},total:{$sum:1}}}
])
And it works perfectly!
I hope this is what you are expecting.
db.collection.aggregate([
{
$match: {
// "availdates.0" exists only when the array has at least one element
"availdates.0": {
"$exists": true
}
}
},
{
$group: {
_id: "$hotelcode",
"records": {
$push: "$$ROOT"
},
"dataCount": {
$sum: 1
}
}
}
])

Mongo aggregation pipeline, finding out the total number of entries in an array per user

I have a collection, let's call it 'users'. In this collection there is a property, entries, which holds a variably sized array of strings.
I want to find out the total number of these strings across my collection.
db.users.find()
> [{ entries: [] }, { entries: ['entry1','entry2']}, {entries: ['entry1']}]
So far I have made many attempts; here is one of my closest.
db.users.aggregate([
{ $project:
{ numberOfEntries:
{ $size: "$entries" } }
},
{ $group:
{_id: { total_entries: { $sum: "$entries"}
}
}
}
])
What this gives me is a list of the users with the total number of entries for each. What I want is all of those figures added up to get my total. Any ideas what I am doing wrong, or is there a better way to start?
A possible solution could be:
db.users.aggregate([{
$group: {
_id: 'some text here',
count: {$sum: {$size: '$entries'}}
}
}]);
This will give you the total count of all entries across all users, and the result looks like:
[
{
_id: 'some text here',
count: 3
}
]
I would use $unwind in the case that you want individual entry counts.
That would look like
db.users.aggregate([
{ $unwind: '$entries' },
{ $group: {
_id: '$entries',
count: { $sum: 1 }
} }
])
and this will give you something along the lines of:
[
{
_id: 'entry1',
count: 2
},
{
_id: 'entry2',
count: 1
}
]
In case you want the overall number of distinct entries:
> db.users.aggregate([
{ $unwind: "$entries" },
{ $group: { _id: "$entries" } },
{ $count: "total" }
])
{ "total" : 2 }
In case you want the overall number of entries:
> db.users.aggregate( [ { $unwind: "$entries" }, { $count: "total" } ] )
{ "total" : 3 }
This makes use of the $unwind stage, which outputs one document per array element:
> db.users.aggregate( [ { $unwind: "$entries" } ] )
{ "_id" : ObjectId("5a81a7a1318e1cfc10250430"), "entries" : "entry1" }
{ "_id" : ObjectId("5a81a7a1318e1cfc10250430"), "entries" : "entry2" }
{ "_id" : ObjectId("5a81a7a1318e1cfc10250431"), "entries" : "entry1" }
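If it helps to see the same three computations outside the database, here is a minimal plain-JavaScript sketch (no MongoDB involved) using the sample data from the question. It only mimics what the stages do; flatMap stands in for $unwind.

```javascript
// Sample data from the question.
const users = [
  { entries: [] },
  { entries: ["entry1", "entry2"] },
  { entries: ["entry1"] },
];

// Like $unwind: one item per array element; empty arrays contribute nothing.
const unwound = users.flatMap((u) => u.entries);

// Like $unwind followed by $count.
const total = unwound.length;

// Like $unwind, then $group on the entry, then $count.
const distinct = new Set(unwound).size;

// Like $unwind followed by $group with $sum: 1.
const perEntry = {};
for (const entry of unwound) {
  perEntry[entry] = (perEntry[entry] || 0) + 1;
}

console.log(total, distinct, perEntry);
```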
You were heading in the right direction. You just needed to specify an _id value of null in the $group stage to calculate accumulated values across all the input documents as a whole, i.e.:
db.users.aggregate([
{
"$project": {
"numberOfEntries": {
"$size": {
"$ifNull": ["$entries", []]
}
}
}
},
{
"$group": {
"_id": null, /* _id of null to get the accumulated values for all the docs */
"totalEntries": { "$sum": "$numberOfEntries" }
}
}
])
Or with a single pipeline stage:
db.users.aggregate([
{
"$group": {
"_id": null, /* _id of null to get the accumulated values for all the docs */
"totalEntries": {
"$sum": {
"$size": {
"$ifNull": ["$entries", []]
}
}
}
}
}
])
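The $ifNull wrapper matters when some documents have no entries field (or have it set to null), because $size raises an error when its argument is not an array. A minimal plain-JavaScript sketch of the same guard, using made-up documents that are not from the question:

```javascript
// Hypothetical data: u2 has no entries field, u3 has it set to null.
const users = [
  { name: "u1", entries: ["entry1", "entry2"] },
  { name: "u2" },
  { name: "u3", entries: null },
  { name: "u4", entries: ["entry1"] },
];

// `u.entries ?? []` plays the role of { $ifNull: ["$entries", []] }:
// missing or null values are replaced by an empty array before taking the length.
const totalEntries = users.reduce(
  (sum, u) => sum + (u.entries ?? []).length,
  0
);

console.log(totalEntries);
```

Without the fallback, `u.entries.length` would throw for u2 and u3, which is loosely analogous to the error $size would raise.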

How to use nested grouping in MongoDB

I need to find total count of duplicate profiles per organization level. I have documents as shown below:
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "75"
},
"_id" : "1"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "75"
},
"_id" : "2"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "77"
},
"_id" : "3"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "77"
},
"_id" : "4"
}
I have written a query that groups by ProfileId and OrganizationId. The results I am getting are shown below:
Organization Total
10 2
10 2
But I want to get the sum of the totals per organization, which means Org 10 should have one row with a sum of 4.
The query I am using is shown below:
db.getSiblingDB("dbName").OrgProfile.aggregate(
{ $project: { _id: 1, P: "$Profile._id", O: "$OrganizationId" } },
{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} },
{ $match: { c: { $gt: 1 } } });
Any ideas? Please help.
The following pipeline should give you the desired output. The last $project stage is purely cosmetic: it renames _id to Organization and is not needed for the computation itself, so you may omit it.
db.getCollection('yourCollection').aggregate([
{
$group: {
_id: { org: "$OrganizationId", profile: "$Profile._id" },
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.org",
Total: {
$sum: {
$cond: {
if: { $gte: ["$count", 2] },
then: "$count",
else: 0
}
}
}
}
},
{
$project: {
_id: 0,
Organization: "$_id",
Total: 1
}
}
])
This gives the following output:
{
"Total" : 4.0,
"Organization" : 10
}
To filter out organizations without duplicates, you can use $match, which also simplifies the second $group stage:
...aggregate([
{
$group: {
_id: { org: "$OrganizationId", profile: "$Profile._id" },
count: { $sum: 1 }
}
},
{
$match: {
count: { $gte: 2 }
}
},
{
$group: {
_id: "$_id.org",
Total: { $sum: "$count" }
}
},
{
$project: {
_id: 0,
Organization: "$_id",
Total: 1
}
}
])
I think I have a solution for you. In that last step there, instead of matching, I think you want another $group.
.aggregate([
{ $project: { _id: 1, P: "$Profile._id", O: "$OrganizationId" } }
,{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} }
,{ $group: { _id: "$_id.o" , c: { $sum: "$c" } }}
]);
You can probably read it and figure out what's happening in that last step, but just in case, I'll explain. The last step groups all documents that have the same organization id and sums the counts stored in the c field by the previous stage. After the first $group, you had two documents that both had a count c of 2 but different profile ids. The next $group ignores the profile id, groups them by organization id, and adds their counts.
When I ran this query, here is my result, which is what I think you're looking for:
{
"_id" : 10,
"c" : 4
}
Hope this helps. Let me know if you have any questions.
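One thing worth checking is how the answers above differ once a profile occurs only once. The following plain-JavaScript sketch (no MongoDB involved) mimics the two-stage grouping on the documents from the question plus one extra document with profile "80", which is an assumption added purely for illustration. The totalsPerOrg helper is hypothetical.

```javascript
// Documents from the question, plus a hypothetical fifth one whose profile is not duplicated.
const docs = [
  { _id: "1", OrganizationId: 10, Profile: { _id: "75" } },
  { _id: "2", OrganizationId: 10, Profile: { _id: "75" } },
  { _id: "3", OrganizationId: 10, Profile: { _id: "77" } },
  { _id: "4", OrganizationId: 10, Profile: { _id: "77" } },
  { _id: "5", OrganizationId: 10, Profile: { _id: "80" } },
];

// First $group: count documents per (organization, profile) pair.
const pairCounts = new Map();
for (const d of docs) {
  const key = JSON.stringify([d.OrganizationId, d.Profile._id]);
  pairCounts.set(key, (pairCounts.get(key) || 0) + 1);
}

// Second $group: sum the pair counts per organization.
// Pairs with a count below minCount are skipped, like $match { count: { $gte: minCount } }.
function totalsPerOrg(minCount) {
  const totals = {};
  for (const [key, count] of pairCounts) {
    if (count < minCount) continue;
    const [org] = JSON.parse(key);
    totals[org] = (totals[org] || 0) + count;
  }
  return totals;
}

const duplicatesOnly = totalsPerOrg(2); // like the $cond or $match variants
const everything = totalsPerOrg(1);     // like a second $group with no filter

console.log(duplicatesOnly, everything);
```

On the question's four documents both variants agree; they only diverge when non-duplicated profiles are present, so pick the one that matches what "total" should mean.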

MongoDB aggregate and count

A document in collection called 'myCollection' looks like this:
{
_id : 57b4b4e028108d801738a472,
updatedAt : 2016-08-17T19:03:01.831+0000,
createdAt : 2016-08-17T19:02:56.887+0000,
from : 57b1c2fc4bf55ba009b36c84,
to : 57b1c75e4bf55ba009b36c85,
}
I need to count the occurrences of 'from' and 'to' and end up with collection of documents like this:
{
"_id" : 57b1c2fc4bf55ba009b36c84,
"occurredInFrom" : 12,
"occurredInTo" : 16
}
where _id comes from either '$from' or '$to'.
The incorrect aggregate query I've written is this:
{
$group: {
_id: "$from",
occurredInFrom: { $sum: 1 },
occurredInTo: { $sum: 1}
}
}
I can definitely see that _id: "$from" is not sufficient. Can you please show me the correct way?
Note: The structure of 'myCollection' is not final, if you think there is a better structure, please suggest it.
Try this
db.myCollection.aggregate([
{ $project:
{ _id: 0,
dir: [
{id:"$from", from:{"$sum":1}, to:{"$sum":0}},
{id:"$to", from:{"$sum":0}, to:{"$sum":1}}
]
}
},
{ $unwind : "$dir" },
{ $group:
{
_id: "$dir.id",
occurredInFrom: { $sum: "$dir.from" },
occurredInTo: { $sum: "$dir.to" }
}
}
])
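The idea behind this pipeline is to turn each document into two small records, one for its from id and one for its to id, and then group on the id. A minimal plain-JavaScript sketch of that idea (no MongoDB involved), with short made-up ids instead of ObjectIds:

```javascript
// Hypothetical documents; the ids are placeholders, not real ObjectIds.
const messages = [
  { from: "alice", to: "bob" },
  { from: "alice", to: "carol" },
  { from: "bob", to: "alice" },
];

// Like the $project and $unwind steps: two records per document,
// each carrying a 1 in the column it should be counted in.
const directions = messages.flatMap((m) => [
  { id: m.from, from: 1, to: 0 },
  { id: m.to, from: 0, to: 1 },
]);

// Like the $group step: sum both columns per id.
const counts = {};
for (const d of directions) {
  if (!counts[d.id]) {
    counts[d.id] = { occurredInFrom: 0, occurredInTo: 0 };
  }
  counts[d.id].occurredInFrom += d.from;
  counts[d.id].occurredInTo += d.to;
}

console.log(counts);
```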

How to count number of inner documents in mongoDB

I am very new to MongoDB concepts.
outerob {
_id: 111,
name: "xxx",
dependents: [
{ name: "a", age: 11 },
{ name: "b", age: 12 },
{ name: "a", age: 11 }
]
}
I have a collection like this. I want to count the number of dependents. Please help me with this.
Thanks in advance.
You can find the number of items in the array by using the aggregation framework as follows:
db.myCollection.aggregate(
{ $unwind: "$dependents" },
{ $group: { _id: "$_id", count: { $sum: 1 }}}
);
You can find the number of items with a specific name as follows:
db.myCollection.aggregate(
{ $unwind: "$dependents" },
{ $match : {"dependents.name" : "a"}},
{ $group: { _id: "$_id", count: { $sum: 1 }}}
);
try
x=db.collection.find({_id:111}).toArray()[0].dependents.length
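Once the document has been fetched on the client, as in the last suggestion, both counts are ordinary array operations. A small plain-JavaScript sketch, assuming the document from the question with the string values quoted:

```javascript
// The document from the question, written as a plain object (quoting is an assumption).
const doc = {
  _id: 111,
  name: "xxx",
  dependents: [
    { name: "a", age: 11 },
    { name: "b", age: 12 },
    { name: "a", age: 11 },
  ],
};

// Total number of dependents.
const totalDependents = doc.dependents.length;

// Number of dependents with a specific name.
const namedA = doc.dependents.filter((d) => d.name === "a").length;

console.log(totalDependents, namedA);
```

This pulls the whole document to the client, so the aggregation versions above are likely the better fit when the arrays are large or many documents are involved.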