How can I aggregate a collection and group by field count? - mongodb

I have a collection of users that looks something like this:
{
"_id": ObjectId("54380a817a4b612a38e87613"),
"email": "email@email.com",
"ogp": [BIG NESTED COLLECTION... {}, {}, {}],
"created": ISODate("2012-02-28T23:10:07Z"),
"o_id": ObjectId("5438096f7a4b612a38e445f4"),
"geo": {"country":"US", "city":"Seattle", "longitude": 123, "latitude":123}
}
I'd like to get all the users' locations, group them by country, and get a total for each country. Something like this:
[ {country:"US",total:250000}, {country:"GB",total:150000}, ... ]
Currently I'm just grabbing all of the documents and parsing them on the server:
db.users.find({'geo.country': {$ne: null},'geo.city': {$ne: null}}, {'geo.country':1}, function(err, doc) {
var data;
doc = _.groupBy(doc, function(par) { return par.geo.country; });
data = [];
return _.each(doc, function(item, key, obj) {
return data.push([key, obj[key].length]);
});
});
The problem with this is that there are 600,000+ documents and the query takes about 1 minute to execute. Would the "aggregate" function help speed up this query? If so, how would I do it?

This should do it:
db.myCollection.aggregate([
{"$group": {_id: "$geo.country", count:{$sum:1}}}
])
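If the output should match the shape asked for in the question (a country plus a total, skipping users without a country or city), the pipeline can be extended a little. The following is a minimal sketch rather than a measured solution; the helper name buildCountryTotalsPipeline is made up for illustration, and the collection name users is taken from the question.

```javascript
// Hypothetical helper: builds the stages as plain objects so the pipeline
// can be inspected without a running server.
function buildCountryTotalsPipeline() {
  return [
    // Same filter as the find() in the question: skip missing/null values.
    { $match: { "geo.country": { $ne: null }, "geo.city": { $ne: null } } },
    // One output document per country, counting its users.
    { $group: { _id: "$geo.country", total: { $sum: 1 } } },
    // Rename _id to country to match the desired output shape.
    { $project: { _id: 0, country: "$_id", total: 1 } },
    // Largest countries first.
    { $sort: { total: -1 } }
  ];
}

// In the mongo shell: db.users.aggregate(buildCountryTotalsPipeline())
```

Because the grouping happens on the server, only one small document per country is sent back instead of 600,000+ user documents.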

Related

MongoDB query to update fields with same name

I'm using mongodb version v3.4.2
I have a collection with one document.
The document has a complex (non-standard) structure that I don't know in advance.
The structure of the document is something like this
{
"time": 9281,
"object_1": [
{ "time": 8372 },
{ "time": 3234 }
],
"object_2": {
"time": 2928
}
}
I need to change the values of all fields with name "time".
Can I do this with a single/multiple mongo query?
I tried with :
db.collection.update({}, {$set: {time: 999}})
db.collection.updateMany({}, {$set: {time: 999}})
but only top-level occurrences are updated with the new value.
After the document has been updated with the query, this should be the result:
{
"time":999,
"object_1":[{"time":999},{"time":999}],
"object_2":{"time":999}
}
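One way to approach this, sketched here under the assumption that the document is small enough to load into the application: fetch it, rewrite every "time" key recursively in plain JavaScript, and write the whole document back. The helper name setAllTimeFields is hypothetical, and the database calls in the trailing comment are only an outline.

```javascript
// Recursively set every field named "time" to newValue, at any depth.
// Only plain objects and arrays are traversed; other values (numbers,
// strings, dates, ObjectIds, ...) are returned unchanged.
function setAllTimeFields(value, newValue) {
  if (Array.isArray(value)) {
    return value.map(function (item) {
      return setAllTimeFields(item, newValue);
    });
  }
  if (value !== null && typeof value === "object" && value.constructor === Object) {
    var result = {};
    Object.keys(value).forEach(function (key) {
      result[key] = key === "time"
        ? newValue
        : setAllTimeFields(value[key], newValue);
    });
    return result;
  }
  return value;
}

// Outline of use in the mongo shell (assumes a single document):
// var doc = db.collection.findOne({});
// db.collection.replaceOne({ _id: doc._id }, setAllTimeFields(doc, 999));
```

Note that this replaces the document rather than updating it in place, so concurrent writers to the same document would need to be considered.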

MongoDB aggregate query extremely slow

I have a MongoDB query here which runs extremely slowly without an index, but the query fields are too big to index, so I'm looking for advice on how to optimise this query or create a valid index for it:
collection.aggregate([{
$match: {
article_id: {
$nin: read_article_ids
},
author_id: {
$in: liked_authors,
$nin: disliked_authors
},
word_count: {
$gte: 1000,
$lte: 10000
},
article_sentiment: {
$elemMatch: {
sentiments: mood
}
}
}
}, {
$sample: {
size: 4
}
}])
The collection in this case is a collection of articles with article_id, author_id, word_count, and article_sentiment. There are around 1.6 million documents in the collection and a query like this takes upwards of 10 seconds without an index. The box has 56gb of memory and is all around pretty specced out.
The query's function is to retrieve a batch of 4 articles by authors the user likes, that they've not read, and that match a given sentiment (the article_sentiment key holds a nested array of key:value pairs).
So is this query incorrect for what I'm trying to achieve? Is there a way to improve it?
EDIT: Here is a sample document for this collection.
{
"_id": ObjectId("57f7dd597a1026d326fc02c4"),
"publication_name": "National News Inc",
"author_name": "John Hardwell",
"title": "How Shifting Policy Has Stunted Cultural Growth",
"article_id": "2f0896cd47c9423cb5a309c7277dd90d",
"author_id": "51b7f46f6c0f46f2949608c9ec2624d4",
"word_count": 1202,
"article_sentiment": [{
"sentiments": "happy",
"weight": 0.528596282005
}, {
"sentiments": "serious",
"weight": 0.569274544716
}, {
"sentiments": "relaxed",
"weight": 0.825395524502
}]
}
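One small simplification that can be made before any indexing work, sketched below: author_id holds a single value per document, so $in: liked_authors combined with $nin: disliked_authors selects the same documents as a single $in over the liked authors that are not also disliked. The helper name buildArticleMatch is made up for illustration. Whether a compound index such as { author_id: 1, word_count: 1 } then helps is an assumption to verify with explain(), not a measured result.

```javascript
// Hypothetical helper: builds the $match stage from the question, with the
// liked/disliked author lists merged into one $in list on the client.
function buildArticleMatch(readArticleIds, likedAuthors, dislikedAuthors, mood) {
  var disliked = new Set(dislikedAuthors);
  // Liked authors that are not also disliked.
  var authors = likedAuthors.filter(function (id) {
    return !disliked.has(id);
  });
  return {
    $match: {
      author_id: { $in: authors },
      word_count: { $gte: 1000, $lte: 10000 },
      article_sentiment: { $elemMatch: { sentiments: mood } },
      article_id: { $nin: readArticleIds }
    }
  };
}

// Usage outline:
// collection.aggregate([
//   buildArticleMatch(read_article_ids, liked_authors, disliked_authors, mood),
//   { $sample: { size: 4 } }
// ])
```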

Mongodb Aggregate function with reference to another collection

I have two collections
Persons
{
"name": "Tom",
"car_id": "55b73e3e8ead0e220d8b45f3"
}
...
Cars
{
"model": "BMW",
"car_id": "55b73e3e8ead0e220d8b45f3"
}
...
How can I write a query that gives the following results (for example)?
BMW : 2
Toyota: 3
Below is my aggregate function. I can get the data out, however it groups by the car_id instead of the model name.
db.persons.aggregate([
{
$group: {
_id: {
car_Id: "$car_id",
},
carsCount: {$sum: 1}
},
},
]);
Appreciate any assistance.
You can aggregate within one collection with one query. I would suggest keeping the aggregated data in a form like this, or similar:
{
car_id: "273645jhg2f52345hj3564jh6",
sum: 4
}
Then replace the ID with the name when you need to expose the data to the user.
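The suggestion above can be sketched in application code as follows. This assumes the per-car_id counts and the Cars documents have already been fetched into arrays; countsByModel is a hypothetical helper name, and cars that share a model name are summed together.

```javascript
// Turn rows like { car_id: "...", sum: 4 } into { modelName: total }.
function countsByModel(countRows, cars) {
  // Build a car_id -> model lookup table from the Cars documents.
  var modelById = {};
  cars.forEach(function (car) {
    modelById[car.car_id] = car.model;
  });
  var totals = {};
  countRows.forEach(function (row) {
    // Fall back to the raw ID if no matching car document was found.
    var model = modelById[row.car_id] || row.car_id;
    totals[model] = (totals[model] || 0) + row.sum;
  });
  return totals;
}
```

On MongoDB 3.2 and later, a $lookup stage after the $group could perform the same ID-to-model join on the server instead.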

Can I use populate before aggregate in mongoose?

I have two models, one is user
userSchema = new Schema({
userID: String,
age: Number
});
and the other is the score recorded several times everyday for all users
ScoreSchema = new Schema({
userID: {type: String, ref: 'User'},
score: Number,
created_date: Date,
....
})
I would like to do some query/calculation on the scores for users meeting a specific requirement; say, I would like to calculate the day-by-day average score for all users older than 20.
My thought is to first populate Scores with the users' ages and then do the aggregate after that.
Something like
Score.
populate('userID','age').
aggregate([
{$match: {'userID.age': {$gt: 20}}},
{$group: ...},
{$group: ...}
], function(err, data){});
Is it OK to use populate before aggregate? Or should I first find all the userIDs meeting the requirement, save them in an array, and then use $in to match the score documents?
No, you cannot call .populate() before .aggregate(), and there is a very good reason why you cannot. But there are different approaches you can take.
The .populate() method works "client side" where the underlying code actually performs additional queries ( or more accurately an $in query ) to "lookup" the specified element(s) from the referenced collection.
In contrast .aggregate() is a "server side" operation, so you basically cannot manipulate content "client side", and then have that data available to the aggregation pipeline stages later. It all needs to be present in the collection you are operating on.
A better approach here is available with MongoDB 3.2 and later, via the $lookup aggregation pipeline operation. Also probably best to handle from the User collection in this case in order to narrow down the selection:
User.aggregate(
[
// Filter first
{ "$match": {
"age": { "$gt": 20 }
}},
// Then join
{ "$lookup": {
"from": "scores",
"localField": "userID",
"foreignField": "userID",
"as": "score"
}},
// More stages
],
function(err,results) {
}
)
This is basically going to include a new field "score" within the User object as an "array" of items that matched on "lookup" to the other collection:
{
"userID": "abc",
"age": 21,
"score": [{
"userID": "abc",
"score": 42,
// other fields
}]
}
The result is always an array, as the general expected usage is a "left join" of a possible "one to many" relationship. If no result is matched then it is just an empty array.
To use the content, just work with an array in any way. For instance, you can use the $arrayElemAt operator in order to just get the single first element of the array in any future operations. And then you can just use the content like any normal embedded field:
{ "$project": {
"userID": 1,
"age": 1,
"score": { "$arrayElemAt": [ "$score", 0 ] }
}}
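For the calculation in the question (a daily average score for users older than 20), the "More stages" placeholder could be filled in roughly as below. This is a sketch under assumptions: the field names come from the schemas in the question, buildDailyAveragePipeline is a hypothetical helper name, and the $unwind and $group stages are one possible choice rather than the only one.

```javascript
// Hypothetical helper: returns the pipeline as plain objects.
function buildDailyAveragePipeline(minAge) {
  return [
    // Filter users first so the join only runs for matching users.
    { $match: { age: { $gt: minAge } } },
    // Join each user's scores in as an array field named "score".
    { $lookup: {
        from: "scores",
        localField: "userID",
        foreignField: "userID",
        as: "score"
    } },
    // One document per (user, score) pair; users without scores drop out.
    { $unwind: "$score" },
    // Group by calendar day and average the scores.
    { $group: {
        _id: { $dateToString: { format: "%Y-%m-%d", date: "$score.created_date" } },
        averageScore: { $avg: "$score.score" }
    } },
    // Oldest day first.
    { $sort: { _id: 1 } }
  ];
}

// Usage outline with Mongoose:
// User.aggregate(buildDailyAveragePipeline(20), function (err, results) {});
```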
If you don't have MongoDB 3.2 available, then your other option to process a query limited by the relations of another collection is to first get the results from that collection and then use $in to filter on the second:
// Match the user collection
User.find({ "age": { "$gt": 20 } },function(err,users) {
// Get id list
var userList = users.map(function(user) {
return user.userID;
});
Score.aggregate(
[
// use the id list to select items
{ "$match": {
"userID": { "$in": userList }
}},
// more stages
],
function(err,results) {
}
);
});
So getting the list of valid users from the other collection to the client, and then feeding that list to the second collection in a query, is the only way to get this to happen in earlier releases.

Mongoose query where value is not null

Looking to do the following query:
Entrant
.find
enterDate : oneMonthAgo
confirmed : true
.where('pincode.length > 0')
.exec (err,entrants)->
Am I doing the where clause properly? I want to select documents where pincode is not null.
You should be able to do this as follows (since you're using the query API):
Entrant.where("pincode").ne(null)
... which will result in a mongo query resembling:
entrants.find({ pincode: { $ne: null } })
A few links that might help:
The mongoose query api
The docs for mongo query operators
$ne
selects the documents where the value of the field is not equal to
the specified value. This includes documents that do not contain the
field.
User.find({ "username": { "$ne": 'admin' } })
$nin
$nin selects the documents where:
the field value is not in the specified array or the field does not exist.
User.find({ "groups": { "$nin": ['admin', 'user'] } })
I ended up here and my issue was that I was querying for
{$not: {email: /@domain.com/}}
instead of
{email: {$not: /@domain.com/}}
OK guys, I found a possible solution to this problem. I realized that joins do not exist in Mongo, so first you need to query the ids of the users with the role you want, and after that do another query on the profiles collection, something like this:
const exclude: string = '-_id -created_at -gallery -wallet -MaxRequestersPerBooking -active -__v';
// Get the users with the given role; only their _ids are needed below.
const users = await User.find({role: role}, {_id: 1, role: 1, name: 1});
// Map the docs into an array of just the _ids
const ids = users.map(function(doc) { return doc._id; });
// Get the profiles whose users are in that set.
const profiles = await Profile.find({user: {$in: ids}})
.select(exclude)
.populate({
path: 'user',
select: '-password -verified -_id -__v'
});
// profiles contains your answer
res.json({
code: 200,
profiles: profiles,
page: page
});
To get the total count of documents where the value of the field is not equal to the specified value:
async function getRegisterUser() {
return Login.count({"role": { $ne: 'Super Admin' }}, (err, totResUser) => {
if (err) {
return err;
}
return totResUser;
})
}
Hello guys, I am stuck with this. I have a Profile document that has a reference to User, and I've tried to list the profiles where the user ref is not null (because I already filtered by role during the population), but after googling for a few hours I cannot figure out how to do this. I have this query:
const profiles = await Profile.find({ user: {$exists: true, $ne: null }})
.select("-gallery")
.sort( {_id: -1} )
.skip( skip )
.limit(10)
.select(exclude)
.populate({
path: 'user',
match: { role: {$eq: customer}},
select: '-password -verified -_id -__v'
})
.exec();
And I get this result. How can I remove the profiles with user: null from the results? I mean, I don't want to get the profile when user is null (the role does not match).
{
"code": 200,
"profiles": [
{
"description": null,
"province": "West Midlands",
"country": "UK",
"postal_code": "83000",
"user": null
},
{
"description": null,
"province": "Madrid",
"country": "Spain",
"postal_code": "43000",
"user": {
"role": "customer",
"name": "pedrita",
"email": "myemail@gmail.com",
"created_at": "2020-06-05T11:05:36.450Z"
}
}
],
"page": 1
}
Thanks in advance.
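One way to drop those entries, sketched here: populate() with a match option leaves user: null on profiles whose user does not match, so the results can be filtered after the query has run. The helper name withPopulatedUser is hypothetical.

```javascript
// Keep only the profiles whose "user" reference was actually populated.
function withPopulatedUser(profiles) {
  return profiles.filter(function (profile) {
    return profile.user !== null && profile.user !== undefined;
  });
}

// Usage outline, after the query in the question:
// res.json({ code: 200, profiles: withPopulatedUser(profiles), page: page });
```

Note that filtering after .limit(10) can leave fewer than 10 profiles on a page. Querying the ids of the matching users first and then selecting profiles with $in avoids that, at the cost of a second query.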