I have a collection of book reviews and I am trying to find users who have written multiple reviews (let's say at least 5). I also want to return the number of reviews, the reviewer's unique ID and their name.
So far I have found a way of doing this through aggregation, but for the life of me I can't get the name field back. I assumed a simple $project would be enough, but I only see the ID and the number of reviews each person has made. What am I missing?
Current Code:
db.bookreviews.aggregate([
{"$group": {"_id": "$reviewerID","NumberOfReviews": { "$sum": 1 }}},
{"$match": {NumberOfReviews: {"$gte": 5}}},
{"$project":{_id:1,NumberOfReviews:1, reviewerName:1}},
])
Returned Values:
{IDXYZ, NumberofReviews 5},
{IDABC, NumberofReviews 5},
{ID123, NumberofReviews 5}
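You can use $first in your $group stage to keep the reviewerName from the first document of each group; the $project can then be removed. The $project could not return reviewerName because a $group stage outputs only _id and the accumulator fields it defines, so the field no longer existed at that point.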
db.bookreviews.aggregate([
{"$group": {"_id": "$reviewerID","NumberOfReviews": { "$sum": 1 }, "reviewerName": { "$first": "$reviewerName" } } },
{"$match": {"NumberOfReviews": {"$gte": 5}}},
])
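To see what the $group stage hands to the next stage, here is a plain JavaScript emulation of it. This is only a sketch that mimics the pipeline on an in-memory array, not MongoDB itself; the sample reviews and the function names are made up for illustration.

```javascript
// Hypothetical sample data; only the field names come from the question.
const reviews = [
  { reviewerID: "IDXYZ", reviewerName: "Ann", book: "b1" },
  { reviewerID: "IDABC", reviewerName: "Bob", book: "b1" },
  { reviewerID: "IDXYZ", reviewerName: "Ann", book: "b2" },
  { reviewerID: "IDXYZ", reviewerName: "Ann", book: "b3" }
];

// Emulates {$group: {_id: "$reviewerID", NumberOfReviews: {$sum: 1},
//                    reviewerName: {$first: "$reviewerName"}}}
function groupByReviewer(docs) {
  const groups = new Map();
  for (const doc of docs) {
    let group = groups.get(doc.reviewerID);
    if (!group) {
      // $first: take the value from the first document seen for this key.
      group = { _id: doc.reviewerID, NumberOfReviews: 0, reviewerName: doc.reviewerName };
      groups.set(doc.reviewerID, group);
    }
    group.NumberOfReviews += 1; // $sum: 1
  }
  // Only _id and the accumulator fields exist from here on.
  return [...groups.values()];
}

// Emulates {$match: {NumberOfReviews: {$gte: min}}}
function frequentReviewers(docs, min) {
  return groupByReviewer(docs).filter(group => group.NumberOfReviews >= min);
}
```

Calling frequentReviewers(reviews, 3) on this sample keeps only the group for IDXYZ, with the name carried along by the $first-style accumulator.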
I need a bit of help with a MongoDB aggregation.
First I have a $match to filter for some specific documents.
Then I group by the field I need them grouped by.
Within each group I need to select the document where a field has a specific value and use that document's data in the group result.
{"$match": {"$and": [
{chain: chain},
{dex: dex}
]}},
{$group: {
_id: "$pairAddress",
allChange: {"$push": "$$ROOT"},
baseToken: {$last: '$baseToken'},
// txCount: <txCount of the document in this group whose timeframe is 86400>
}},
{$sort: {txCount: -1}},
{$skip: 0},
{$limit: 100}
Each group consists of documents with different timeframes. I need to select one specific timeframe and add fields from that document to the group. For example, each timeframe has a different txCount; after grouping I want to sort by txCount, limit the number of results and use skip for pagination.
The problem is selecting the document with the specific timeframe from within the group.
If anyone could point me in the right direction, that would be awesome.
Here is an example of how the data is stored in the database and what I would like the result to be.
const document = {
_id: '3567356735672467',
pairAddress: '0x45jk6v34jy5634jkh5v6kj4h5v62j4h56', // group by pair address
baseToken: '0x456jn345k6hb4k5h6b3khb65k3hb56k3h4b6',
resolution: 86400, // each pair address has 6 documents, one per timeframe: 300, 900, 1800, 3600, 43200, 86400
base0: true,
txCount: 26,
buyCount: 10,
sellCount:16,
buyVolume: '2342354.345',
sellVolume: '1234.34',
volume: '1232352.345',
change: '12.34',
positive: true,
time: 1676865981,
chain: 'ETH',
dex: 'SUS',
price: '12.45',
};
const result = [
{
_id: "0x45jk6v34jy5634jkh5v6kj4h5v62j4h56",
allChange: {"$push": "$$ROOT"}, // array of all documents/timeframes for a pairAddress
selectedTxAmount: 26, // txCount of the document with the selected timeframe (e.g. 86400) within this pairAddress group
}
];
Maybe it is possible to change the aggregation to make it work and run faster:
match on timeframe, dex and chain,
sort by txCount,
skip X documents,
limit to 100,
and return those documents, each with a field containing all timeframes for its pairAddress.
Currently, thanks to #1sina1, I have this and it works:
{"$match": {"$and": [
{"chain": chain},
{"dex": dex}
]}},
{$group: {
_id: "$pairAddress",
allChange: {"$push": "$$ROOT"},
baseToken: {$last: '$baseToken'},
txCount: {
"$push": {
"$cond": {
"if": {
"$eq": [
"$resolution",
43200
]
},
"then": "$txCount",
"else": "$$REMOVE"
}
}
}
}},
{$sort: {txCount: -1}},
{$skip: parseInt(page) * 100},
{$limit: 100},
But I think there might be a way to do it a bit differently. Right now we first group everything (about 20k documents) while I am only interested in 100. So maybe first match on the timeframe/resolution, then sort, skip and limit, and only then fetch, for those 100 pairAddress values, all the corresponding timeframes/resolutions as a field allChange.
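One way the match-first idea could look is sketched below: filter down to a single resolution, page through the result, and only then pull in the other timeframes with a $lookup back into the same collection. The function name, the pageSize parameter and the collection name "pairs" are assumptions for illustration, and the sketch assumes a pairAddress is not shared between different chain/dex combinations.

```javascript
// Builds the pipeline as a plain array so it can be inspected before running it.
// "pairs" is a hypothetical collection name; pass the name of your own collection.
function buildTopPairsPipeline(chain, dex, page, resolution, pageSize = 100, collection = "pairs") {
  return [
    // Keep only the one timeframe used for ranking.
    { $match: { chain: chain, dex: dex, resolution: resolution } },
    { $sort: { txCount: -1 } },
    { $skip: page * pageSize },
    { $limit: pageSize },
    // For the remaining documents only, fetch every timeframe of the same pair.
    {
      $lookup: {
        from: collection,
        localField: "pairAddress",
        foreignField: "pairAddress",
        as: "allChange"
      }
    },
    { $project: { _id: "$pairAddress", baseToken: 1, txCount: 1, allChange: 1 } }
  ];
}
```

It would be used as db.pairs.aggregate(buildTopPairsPipeline("ETH", "SUS", 0, 86400)). Because the $match and $sort come first, they may be able to use an index on those fields, and the $lookup runs for at most pageSize documents instead of the whole collection.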
I have a collection of documents with id and contact. Contact is an array of subdocuments.
I am trying to get the count of contacts where isActive = Y. I also need to query the collection by id. The entire query would be something like
Select Count(contact.isActive=Y) where _id = '601ad0227b25254647823713'
I am using mongo and mongoose for the first time. Please edit the question if I was not able to explain it properly.
You can use an aggregation pipeline like this:
First $match to get only the documents with the desired _id.
Then $unwind to get one document per element of the contact array.
$match again to keep only the elements whose isActive value is Y.
Finally $group, adding one for each remaining document (i.e. counting the contacts with isActive = Y). The count is stored in the field total.
db.collection.aggregate([
{
"$match": {"id": 1}
},
{
"$unwind": "$contact"
},
{
"$match": {"contact.isActive": "Y"}
},
{
"$group": {
"_id": "$id",
"total": {"$sum": 1}
}
}
])
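The four stages can be traced step by step in plain JavaScript on a small in-memory array. This is an emulation for illustration only, not MongoDB itself; the sample documents and the function name are invented.

```javascript
// Hypothetical sample data shaped like the question's documents.
const docs = [
  { id: 1, contact: [
    { name: "a", isActive: "Y" },
    { name: "b", isActive: "N" },
    { name: "c", isActive: "Y" }
  ] },
  { id: 2, contact: [{ name: "d", isActive: "Y" }] }
];

function countActiveContacts(allDocs, id) {
  // $match: keep only the documents with the requested id.
  const matched = allDocs.filter(doc => doc.id === id);
  // $unwind: one output document per element of the contact array.
  const unwound = matched.flatMap(doc =>
    doc.contact.map(contact => ({ ...doc, contact: contact }))
  );
  // Second $match: keep only active contacts.
  const active = unwound.filter(doc => doc.contact.isActive === "Y");
  // $group: one result per id, adding 1 for every remaining document.
  const totals = new Map();
  for (const doc of active) {
    totals.set(doc.id, (totals.get(doc.id) || 0) + 1);
  }
  return [...totals].map(([groupId, total]) => ({ _id: groupId, total: total }));
}
```

As in the pipeline, an id with no active contacts produces an empty result rather than a document with total 0.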
I want to know how to use aggregate() to take all of the objects of a specific field (i.e. "user") and count them.
This is what I am doing:
I want to return a list of users with the total number of tweets they have made.
So I want output that looks like
Etc..
Also I don't want repeating users like
Etc..
which is what the above aggregate does.
So basically, how can I modify this aggregate to ensure the objects are unique?
I believe you will want to group by the user.id field instead of the user object. You can try doing that directly
{ $group: { _id: "$user.id", totalTweets: { $sum: 1 } } }
Or you might want to try projecting that field onto the document before grouping
{ $addFields: { userId: "$user.id" } },
{ $group: { _id: "$userId", totalTweets: { $sum: 1 } } }
If you want the whole inner user object in each document after the aggregation, you have to use the $push operator in the $group stage.
You also need to group on a unique id of the user, e.g. id or id_str, instead of the whole $user object as in your question.
db.tweets.aggregate([{ $group: {_id: "$user.id", totalTweets: { $sum: 1 }, user : { $push: "$user" } } }])
This should solve your problem. Note that $push adds one copy of the user object per tweet, so the user array will contain repeats. For details, see the official documentation for the $push operator.
I have a structure of...
{ _id: object_id,
user: name,
days: { "4/1/2010": {"checked": true},
"4/2/2011": {"checked": false}}
}
I want to get the total number of days across users. If days was an array, I would do something like...
db.collection.aggregate([{"$group": {"_id": null, "count": {"$sum": {"$size": "$days"}}}}])
but that won't work, since $size only accepts arrays and days is an object. Does anyone have suggestions?
Note: There may be a different number of days missing in the data structure for each user which is why I want to check the count within each user's days
You can use an aggregation pipeline with the $objectToArray operator (MongoDB 3.4.4 or later) to convert the days object into an array of key/value pairs, then apply $size and $sum in a $group stage.
db.collection.aggregate([
{"$group":{
"_id":null,
"count":{
"$sum":{
"$size":{"$objectToArray":"$days"}
}
}
}}
])
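For intuition, the same count can be written in plain JavaScript, where Object.entries plays the role of $objectToArray and .length the role of $size. This is only an in-memory sketch with invented sample documents. Unlike the pipeline, it treats a missing days field as empty; in the pipeline I would expect $size to fail on such a document unless the expression is wrapped in something like $ifNull.

```javascript
// Hypothetical sample documents shaped like the structure in the question.
const users = [
  { user: "ann", days: { "4/1/2010": { checked: true }, "4/2/2011": { checked: false } } },
  { user: "bob", days: { "4/1/2010": { checked: true } } },
  { user: "cy" } // no days field at all
];

// Sums the number of keys in each document's days object.
function totalDays(docs) {
  return docs.reduce(
    (sum, doc) => sum + Object.entries(doc.days || {}).length,
    0
  );
}
```

With the sample above, totalDays(users) counts two days for the first user, one for the second and none for the third.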
Say I have a users collection in MongoDB. A typical user document contains a name field, and an array of subdocuments, representing the user's characteristics. Say something like this:
{
"name": "Joey",
"characteristics": [
{
"name": "shy",
"score": 0.8
},
{
"name": "funny",
"score": 0.6
},
{
"name": "loving",
"score": 0.01
}
]
}
How can I find the top X funniest users, sorted by how funny they are?
The only way I've found so far, was to use the aggregation framework, in a query similar to this:
db.users.aggregate([
{$project: {"_id": 1, "name": 1, "characteristics": 1, "_characteristics": '$characteristics'}},
{$unwind: "$_characteristics"},
{$match: {"_characteristics.name": "funny"}},
{$sort: {"_characteristics.score": -1}},
{$limit: 10}
]);
Which seems to be exactly what I want, except for the fact that according to MongoDB's documentation on using indexes in pipelines, once I call $project or $unwind in an aggregation pipeline, I can no longer utilize indexes to match or sort the collection, which renders this solution somewhat unfeasible for a very large collection.
I think you are half way there. I would do
db.users.aggregate([
{$match: { 'characteristics.name': 'funny' }},
{$unwind: '$characteristics'},
{$match: {'characteristics.name': 'funny'}},
{$project: {_id: 0, name: 1, 'characteristics.score': 1}},
{$sort: { 'characteristics.score': -1 }},
{$limit: 10}
])
I add a first $match stage to drop users without the funny attribute (this stage can easily use an index).
Then $unwind and $match again to keep only the funny entry of each user.
Keep only the necessary data with $project.
Sort by the correct score.
And limit the results.
That way you can use an index for the first $match.
The way I see it, if the characteristics you are interested in are not too many, IMO it would be better to have your structure as
{
"name": "Joey",
"shy": 0.8,
"funny": 0.6,
"loving": 0.01
}
That way you can use an index (sparse or not) to make your life easier!
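With the flattened structure, the top X query could reduce to a find with a sort on the trait field. The helper below is a hypothetical sketch that only builds the index and query specifications as plain objects; the function name and the choice of a sparse descending index are assumptions, not something the original answer spells out.

```javascript
// Returns the pieces needed to index and query one trait field.
function topTraitQuery(trait, limit) {
  return {
    index: { [trait]: -1 },                 // key spec for createIndex
    options: { sparse: true },              // skip users without this trait
    filter: { [trait]: { $exists: true } }, // lets a sparse index cover the query
    sort: { [trait]: -1 },                  // highest score first
    limit: limit
  };
}

// Intended use in the mongo shell (not executed here):
//   const q = topTraitQuery("funny", 10);
//   db.users.createIndex(q.index, q.options);
//   db.users.find(q.filter).sort(q.sort).limit(q.limit);
```

The $exists filter is there because a sparse index leaves out documents that lack the field, so the query should state that it only wants documents that have it.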