Count of unique items in MongoDB documents with an array of strings

I'm having a problem that seems like it can be solved by some aggregation samples I've seen, but I've not come up with an answer yet.
Basically I have documents like so:
{
date: '2015-01-14 00:00:00.000Z',
attendees: ['john', 'jane', 'james', 'joanne'],
groupName: '31'
}
And I need to find the unique attendees for a groupName and their attendance count. So for example, with the data:
{
date: '2015-01-13 00:00:00.000Z',
attendees: ['john', 'jane', 'james', 'joanne'],
groupName: '31'
},
{
date: '2015-01-14 00:00:00.000Z',
attendees: ['james', 'joanne'],
groupName: '31'
},
{
date: '2015-01-15 00:00:00.000Z',
attendees: ['joanne'],
groupName: '31'
}
I'd like to get something like:
[{
name: 'joanne',
count: 3
}, {
name: 'john',
count: 1
}, {
name: 'james',
count: 2
}]
I can't seem to find an aggregation to get this type of result. Any help is appreciated.

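You can do this with $unwind and $group. Add a $match stage first if you only want the attendees of one groupName:
db.collection.aggregate([
{$match: {groupName: '31'}},
{$unwind: '$attendees'},
{$group: {_id: '$attendees', count: {$sum: 1}}},
{$project: {_id: 0, name: '$_id', count: 1}}
])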
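To make it easier to see what each stage contributes, here is a minimal in-memory sketch in plain JavaScript. It is not a MongoDB API: the helper name countAttendees is made up for illustration, and it only mirrors the idea of filtering by groupName (as a $match stage would), flattening the arrays (as $unwind does) and tallying each name (as $group with $sum: 1 does), using the sample data from the question.

```javascript
// Sample documents from the question.
const docs = [
  { date: '2015-01-13 00:00:00.000Z', attendees: ['john', 'jane', 'james', 'joanne'], groupName: '31' },
  { date: '2015-01-14 00:00:00.000Z', attendees: ['james', 'joanne'], groupName: '31' },
  { date: '2015-01-15 00:00:00.000Z', attendees: ['joanne'], groupName: '31' },
];

// Hypothetical helper: an in-memory stand-in for the aggregation pipeline.
function countAttendees(documents, groupName) {
  const counts = new Map();
  documents
    .filter((doc) => doc.groupName === groupName) // like $match
    .flatMap((doc) => doc.attendees)              // like $unwind
    .forEach((name) => {
      counts.set(name, (counts.get(name) || 0) + 1); // like $group with $sum: 1
    });
  // Like $project: rename the grouping key to "name".
  return [...counts].map(([name, count]) => ({ name, count }));
}

const result = countAttendees(docs, '31');
console.log(result);
```

Note that 'jane' shows up with a count of 1 as well, since she attended the first date.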

Related

MongoDB Aggregate functions convert object array to string array

I have some documents in a collection. Every document has a challenge_id field. I want to map this array of documents to an array of strings. The final array should contain the challenge_id of each document.
input:
[
{
_id: ObjectId("62c3e31931e7df585c39e4e1"),
activity_id: ObjectId("62c3e31931e7df585c39e4df"),
challenge_id: ObjectId("62bd543c3a3937000958f2dd"),
status: "active",
createdAt: ISODate("2022-07-05T07:07:05.823Z"),
updatedAt: ISODate("2022-07-05T07:07:05.823Z")
},
{
_id: ObjectId("62c3e33f299750585cc70b23"),
activity_id: ObjectId("62c3e33e299750585cc70b21"),
challenge_id: ObjectId("62bd543c3a3937000958f2dd"),
status: "active",
createdAt: ISODate("2022-07-05T07:07:43.612Z"),
updatedAt: ISODate("2022-07-05T07:07:43.612Z")
},
{
_id: ObjectId("62c3e359341e86585c65c714"),
activity_id: ObjectId("62c3e359341e86585c65c712"),
challenge_id: ObjectId("62bd543c3a3937000958f2dd"),
status: "active",
createdAt: ISODate("2022-07-05T07:08:09.409Z"),
updatedAt: ISODate("2022-07-05T07:08:09.409Z")
}
]
The output should look like:
['62bd543c3a3937000958f2dd','62bd543c3a3937000958f2dd', '62bd543c3a3937000958f2dd' ]
Is it possible to do this with an aggregate function? How?
You can use $group like this:
db.collection.aggregate([
{$group: {_id: 0, res: {$push: {$toString: "$challenge_id"}}}},
{$project: {res: 1, _id: 0}}
])
See how it works on the playground example

How do you write a query that takes into account multiple numbers and orders them

My data looks something like this:
{_id: ObjectId("5e10c2d61a9201e439335816"), name: "Bob", redWins: 23, blueWins: 34}
{_id: ObjectId("5e10c34e1a9201e439335818"), name: "Alice", redWins: 41, blueWins: 52}
{_id: ObjectId("5e10c36f1a9201e439335819"), name: "John", redWins: 12, blueWins: 24}
The goal is to be able to sort the data from most to least total wins (redWins + blueWins) and have a result that has the name and the amount of total wins in it. Desired output:
{name: "Alice", totalWins: 93},
{name: "Bob", totalWins: 57},
{name: "John", totalWins: 36}
I tried to use aggregation, but I can't figure out how to add the numbers before sorting them.
Thanks!
Use $addFields with $sum.
db.collection.aggregate([
{
$addFields: {
totalWins: {$sum: ["$redWins", "$blueWins"]}
}
},
{
$sort: {
totalWins: -1
}
},
{
$project: {
name: 1,
totalWins: 1,
_id: 0
}
}
])
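As a quick sanity check of the arithmetic, the same steps can be sketched in plain JavaScript on the sample data. This is only an in-memory illustration, not MongoDB code. Missing values are treated as 0 here, on the assumption that this roughly matches how $sum skips non-numeric values.

```javascript
// Sample documents from the question (the _id fields are left out).
const players = [
  { name: 'Bob', redWins: 23, blueWins: 34 },
  { name: 'Alice', redWins: 41, blueWins: 52 },
  { name: 'John', redWins: 12, blueWins: 24 },
];

const ranked = players
  // Like $addFields + $project: compute totalWins and keep only name and totalWins.
  .map(({ name, redWins, blueWins }) => ({
    name,
    totalWins: (redWins || 0) + (blueWins || 0),
  }))
  // Like $sort: { totalWins: -1 } (descending).
  .sort((a, b) => b.totalWins - a.totalWins);

console.log(ranked);
```

This gives Alice (93), Bob (57) and John (36), in that order, which matches the desired output.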

MongoDB aggregation group by similar string

I'm starting to learn aggregations for Mongo, but in my project I found a lot of brands in my collection with very similar names, like 'BrandA' and 'BrandA tech'. Is there a way to group them at the end of my aggregation?
I have 2 collections in my database:
The first one is for brands:
{
_id: ObjectId(),
name: String
}
The second one is for products:
{
_id: ObjectId(),
name: String,
brand: ObjectId() // referring to _id of brands
}
Now let's say I have the following brands:
{_id: ObjectId('5a9fd2b8045b020013de2a47'), name: 'brand1'},
{_id: ObjectId('5a9fcf94d28420245451a39c'), name: 'brand2'},
{_id: ObjectId('5a9fcf94d28420245451a39a'), name: 'brand1 sub1'},
{_id: ObjectId('5a9fe8bf045b020013de2a6d'), name: 'sub2 brand2'}
And the following products:
{_id: ObjectId(''), name: 'item1', brand: ObjectId('5a9fd2b8045b020013de2a47')},
{_id: ObjectId(''), name: 'item2', brand: ObjectId('5a9fcf94d28420245451a39c')},
{_id: ObjectId(''), name: 'item3', brand: ObjectId('5a9fd2b8045b020013de2a47')},
{_id: ObjectId(''), name: 'item4', brand: ObjectId('5a9fcf94d28420245451a39a')},
{_id: ObjectId(''), name: 'item5', brand: ObjectId('5a9fe8bf045b020013de2a6d')},
{_id: ObjectId(''), name: 'item6', brand: ObjectId('5a9fd2b8045b020013de2a47')},
{_id: ObjectId(''), name: 'item7', brand: ObjectId('5a9fcf94d28420245451a39c')},
{_id: ObjectId(''), name: 'item8', brand: ObjectId('5a9fcf94d28420245451a39a')}
The query I have now:
db.getCollection('products').aggregate([
{$group: {
_id: '$brand',
amount: { $sum: 1 },
}},
{
$sort: { 'amount': -1 }
},{$lookup: {
from: 'brands',
localField: '_id',
foreignField: '_id',
as: 'lookup'
}},
{$unwind: {path: '$lookup'}},
{$project: {
_id: '$_id',
brandName: '$lookup.name',
amount: '$amount'
}}
]);
Result:
{_id: ObjectId('5a9fd2b8045b020013de2a47'), brandName: 'brand1', amount: 3}
{_id: ObjectId('5a9fcf94d28420245451a39c'), brandName: 'brand2', amount: 2}
{_id: ObjectId('5a9fcf94d28420245451a39a'), brandName: 'brand1 sub1', amount: 2}
{_id: ObjectId('5a9fe8bf045b020013de2a6d'), brandName: 'sub2 brand2', amount: 1}
Result I want:
{_id: ObjectId(null), brandName: 'brand1', amount: 5},
{_id: ObjectId(null), brandName: 'brand2', amount: 3}
Is it possible to group the result I have now by finding similar strings in brandName? Like grouping 'brand1' and 'brand1 sub1' or 'brand2' and 'sub2 brand2'?
I think you could do what you want by using $split and $unwind.
$split will turn the brand name into an array of words, and $unwind will create one entry per word in that array.
Then you can apply the pipeline you already prepared to count the occurrences. Keep in mind that every word becomes its own group, so words like 'sub1' will be counted too unless you filter them out.
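Here is a rough in-memory sketch of the $split and $unwind idea in plain JavaScript, using the sample data from the question. It is not MongoDB code. The ObjectIds are replaced by short made-up strings ('b1' to 'b4') to keep it readable, and a Map lookup stands in for the $lookup stage.

```javascript
// Sample data from the question; ids shortened to hypothetical strings.
const brands = [
  { _id: 'b1', name: 'brand1' },
  { _id: 'b2', name: 'brand2' },
  { _id: 'b3', name: 'brand1 sub1' },
  { _id: 'b4', name: 'sub2 brand2' },
];
const products = [
  { name: 'item1', brand: 'b1' }, { name: 'item2', brand: 'b2' },
  { name: 'item3', brand: 'b1' }, { name: 'item4', brand: 'b3' },
  { name: 'item5', brand: 'b4' }, { name: 'item6', brand: 'b1' },
  { name: 'item7', brand: 'b2' }, { name: 'item8', brand: 'b3' },
];

// Stands in for $lookup: resolve a brand id to its name.
const brandNameById = new Map(brands.map((b) => [b._id, b.name]));

const amounts = new Map();
for (const product of products) {
  const words = brandNameById.get(product.brand).split(' '); // like $split
  for (const word of words) {                                 // like $unwind
    amounts.set(word, (amounts.get(word) || 0) + 1);          // like $group with $sum: 1
  }
}

const grouped = [...amounts]
  .map(([brandName, amount]) => ({ brandName, amount }))
  .sort((a, b) => b.amount - a.amount); // like $sort: { amount: -1 }

console.log(grouped);
```

On this data, 'brand1' ends up with 5 and 'brand2' with 3, as wanted, but 'sub1' (2) and 'sub2' (1) also appear as groups of their own. Word splitting alone cannot tell which word is the "real" brand, so some extra filtering rule would still be needed.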
A change in the model could also achieve this: store the items in an array on the brand document.
Then you get the count directly from the array's length, and the query will likely be faster because no grouping or lookup is needed.

mongodb get distinct values with category

Suppose there is the following collection
People:
{
_id: 1,
name: 'john',
last_name: 'blah1',
job: 'lifeguard'
}
{
_id: 2,
name: 'john',
last_name: 'blah2',
job: 'lifeguard'
}
{
_id: 3,
name: 'alex',
last_name: 'blah3',
job: 'lifeguard'
}
{
_id: 4,
name: 'alex',
last_name: 'blah4',
job: 'lifeguard'
}
{
_id: 5,
name: 'alex',
last_name: 'blah5',
job: 'gardener'
}
I need to get the distinct jobs with an array of distinct names.
Trying to get the following result:
[
{
value: 'lifeguard',
names: [
'john',
'alex'
],
},
{
value: 'gardener',
names: [
'alex'
],
},
]
I understand how to get the unique jobs:
db.people.distinct('job')
However, I have not figured out how to do a distinct query with multiple properties.
It's better to use the aggregation framework here: a pipeline with a $group stage groups the documents by the job key, and the $addToSet accumulator builds the array of distinct names within each group.
Consider the following aggregate operation:
db.people.aggregate([
{
"$group": {
"_id": "$job",
"names": { "$addToSet": "$name" }
}
}
])
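To illustrate what $group with $addToSet produces, here is a small in-memory sketch in plain JavaScript on the sample collection. It is not MongoDB code. A Map of Sets plays the role of the accumulator, and the final step renames the group key to value to match the shape asked for in the question (in a real pipeline that would be an extra $project stage).

```javascript
// Sample documents from the question.
const people = [
  { _id: 1, name: 'john', last_name: 'blah1', job: 'lifeguard' },
  { _id: 2, name: 'john', last_name: 'blah2', job: 'lifeguard' },
  { _id: 3, name: 'alex', last_name: 'blah3', job: 'lifeguard' },
  { _id: 4, name: 'alex', last_name: 'blah4', job: 'lifeguard' },
  { _id: 5, name: 'alex', last_name: 'blah5', job: 'gardener' },
];

// Like $group by job with $addToSet on name: one Set of names per job.
const namesByJob = new Map();
for (const { job, name } of people) {
  if (!namesByJob.has(job)) namesByJob.set(job, new Set());
  namesByJob.get(job).add(name); // a Set silently drops duplicates
}

// Rename the grouping key to "value" and turn each Set into an array.
const distinctByJob = [...namesByJob].map(([value, names]) => ({
  value,
  names: [...names],
}));

console.log(distinctByJob);
```

One difference to keep in mind: a JavaScript Set keeps insertion order, while MongoDB does not guarantee any particular order for the elements that $addToSet returns.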
@chridam helped me find the right answer. In the real world, my object was more like:
{
_id: 1,
name: ['john', 'bah1', 'blah2', 'blah3'],
last_name: 'blah1',
job: 'lifeguard'
}
So I had to $unwind the names and then $group, just like in @chridam's answer.
model.aggregate([
{$unwind: "$name"},
{
$group: {
_id:"$name",
jobs: {
$addToSet: "$job"
}
}
}
])

MongoDB aggregation for distinct values

I have 2 collections:
user
{
_id: 'user_id1',
username: 'user1',
}
{
_id: 'user_id2',
username: 'user2',
}
{
_id: 'user_id3',
username: 'user3',
}
inbox
{
_id: 'inbox_id1',
from: {_id: 'user_id1', username: 'user1'},
to: {_id: 'user_id2', username: 'user2'},
text: 'Hello there',
timestamp: new Date(),
}
{
_id: 'inbox_id2',
from: {_id: 'user_id1', username: 'user1'},
to: {_id: 'user_id2', username: 'user2'},
text: 'Trying again...',
timestamp: new Date(),
}
{
_id: 'inbox_id3',
from: {_id: 'user_id3', username: 'user3'},
to: {_id: 'user_id2', username: 'user2'},
text: 'You there?',
timestamp: new Date(),
}
Whenever a user goes into his inbox, I would like to show him the thread list, which should include a list of latest messages from each user. So basically I would like to get distinct documents (based on the from._id field), and only the latest document (based on timestamp field).
So my results for user2 should include only 2 documents (inbox_id2 and inbox_id3).
I know I need to use aggregation for it, but not sure how exactly.
I was able to solve it based on a similar question: MongoDB : Aggregation framework : Get last dated document per grouping ID
My solution looks like this:
db.inbox.aggregate([
{$match : {'to.username': 'user2'}},
{'$sort': {'from._id': 1, 'timestamp': -1}},
{'$group': {
'_id': '$from._id',
'timestamp': {'$first': '$timestamp'},
'text': {'$first': '$text'},
'from': {'$first': '$from.username'},
}},
]);
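For anyone wondering why sorting before grouping works, here is a minimal in-memory sketch of the same idea in plain JavaScript. It is not MongoDB code. The helper name latestPerSender and the fixed timestamps are assumptions made for this example (the question only uses new Date()).

```javascript
// Sample inbox from the question; the timestamps are made-up values.
const user2 = { _id: 'user_id2', username: 'user2' };
const inbox = [
  { _id: 'inbox_id1', from: { _id: 'user_id1', username: 'user1' }, to: user2,
    text: 'Hello there', timestamp: new Date('2015-01-01T10:00:00Z') },
  { _id: 'inbox_id2', from: { _id: 'user_id1', username: 'user1' }, to: user2,
    text: 'Trying again...', timestamp: new Date('2015-01-01T11:00:00Z') },
  { _id: 'inbox_id3', from: { _id: 'user_id3', username: 'user3' }, to: user2,
    text: 'You there?', timestamp: new Date('2015-01-01T12:00:00Z') },
];

// Hypothetical helper: newest message per sender for one recipient.
function latestPerSender(messages, recipient) {
  const latest = new Map();
  messages
    .filter((m) => m.to.username === recipient)  // like $match (returns a new array)
    .sort((a, b) => b.timestamp - a.timestamp)   // newest first, like timestamp: -1
    .forEach((m) => {
      // Like $first inside $group: only the first message seen per sender is kept.
      if (!latest.has(m.from._id)) latest.set(m.from._id, m);
    });
  return [...latest.values()].map((m) => ({
    _id: m.from._id,
    timestamp: m.timestamp,
    text: m.text,
    from: m.from.username,
  }));
}

const threads = latestPerSender(inbox, 'user2');
console.log(threads);
```

Because the messages are sorted newest first, the first one seen for each sender is also the latest one, so user2 gets 'You there?' from user3 and 'Trying again...' from user1.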