I have a collection where multiple documents may have the same userId field. I would like to group by userId so that I get a list of unique userIds, and also sort by date so that each returned document is the latest document for that userId. I've done queries like this with SQL, and I'm really hoping it's possible with Mongo.
In this example collection:
{ userId: 456, date: 5/16/1988 },
{ userId: 456, date: 5/17/1988 },
{ userId: 789, date: 5/18/1988 },
{ userId: 789, date: 5/17/1988 }
I would want to return:
{ userId: 456, date: 5/17/1988 },
{ userId: 789, date: 5/18/1988 }
Here is how you would do it in Mongo. Note that I got this to work with a date format of yyyy-mm-dd.
db.collection.aggregate({
$group: {
_id : '$userId',
date: { $max: '$date'}
}
})
Sources: http://docs.mongodb.org/manual/
In your question you say you want the full document returned. You can do this by returning the full document as a field in the $group operator.
db.coll.aggregate([
{$sort:{date:-1}},
{$group: {_id: "$userId", doc:{$first: "$$CURRENT"}} }
])
I created four documents like those in your question, using some random values of type Date. This would give you the following result:
{
"_id" : 789,
"doc" : {
"_id" : ObjectId("53d80e246ebc37d0c33321ba"),
"userId" : 789,
"date" : ISODate("2014-07-05T04:00:00.000Z")
}
},
{
"_id" : 456,
"doc" : {
"_id" : ObjectId("53d80e246ebc37d0c33321b8"),
"userId" : 456,
"date" : ISODate("2014-07-05T04:00:00.000Z")
}
}
See http://docs.mongodb.org/manual/reference/operator/aggregation/group/#variables for more info on $$CURRENT.
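If you would rather work with the plain documents than with the { _id, doc } wrappers, the result can be unwrapped on the client. This is only a minimal sketch in plain JavaScript, assuming the result has already been fetched into an array shaped like the output above. The names latestByUser and unwrapDocs are hypothetical, and the sample values are simplified stand-ins for real ObjectId and ISODate values.

```javascript
// Hypothetical sample shaped like the aggregation output above
// (ObjectId and ISODate values replaced by plain strings for illustration).
const latestByUser = [
  { _id: 789, doc: { _id: "a1", userId: 789, date: "1988-05-18" } },
  { _id: 456, doc: { _id: "b2", userId: 456, date: "1988-05-17" } }
];

// Return the wrapped documents themselves, dropping the group key.
function unwrapDocs(groups) {
  return groups.map(function (group) {
    return group.doc;
  });
}

console.log(unwrapDocs(latestByUser));
```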
Although the aggregate answers are probably the right way to go, I wasn't able to use the db.collection.aggregate methods because I need to use other things like .populate() on a Model.find in this situation. So I came up with a workaround where I sort on userId and then on date in the find(options), so that all documents for a user are adjacent and the newest one comes first:
{ sort: { userId: -1, updateDate: -1 } }
Then I wrote a function on the front end to extract the latest record for each user:
filterLatest: function(docs) {
var lastUserId = null;
var latestDocs = [];
docs.forEach(function(doc) {
if(lastUserId != doc.userId) latestDocs.push(doc);
lastUserId = doc.userId;
});
return latestDocs;
}
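The function above depends on the documents arriving sorted so that each user's newest record comes first. As an alternative, here is a sketch that does not rely on the input order. It assumes updateDate values compare correctly with ">" (Date objects or ISO 8601 strings), and latestPerUser and sampleDocs are hypothetical names used only for illustration.

```javascript
// Keep, for each userId, the document with the greatest updateDate.
// Assumes updateDate values compare correctly with ">" (Date objects or ISO 8601 strings).
function latestPerUser(docs) {
  const latest = new Map(); // userId -> newest doc seen so far
  docs.forEach(function (doc) {
    const current = latest.get(doc.userId);
    if (!current || doc.updateDate > current.updateDate) {
      latest.set(doc.userId, doc);
    }
  });
  return Array.from(latest.values());
}

// Hypothetical, deliberately unsorted sample data.
const sampleDocs = [
  { userId: 456, updateDate: "1988-05-16" },
  { userId: 789, updateDate: "1988-05-18" },
  { userId: 456, updateDate: "1988-05-17" },
  { userId: 789, updateDate: "1988-05-17" }
];

console.log(latestPerUser(sampleDocs));
```

This makes one pass over the array, at the cost of holding one entry per user in memory.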
We have a collection which stores log documents.
Is it possible to have multiple aggregations on different attributes?
A document looks like this in its purest form:
{
_id : int,
agent : string,
username: string,
date : string,
type : int,
subType: int
}
With the following query I can easily count all documents and group them by subtype for a specific type during a specific time period:
db.logs.aggregate([
{
$match: {
$and : [
{"date" : { $gte : new ISODate("2020-11-27T00:00:00.000Z")}}
,{"date" : { $lte : new ISODate("2020-11-27T23:59:59.000Z")}}
,{"type" : 906}
]
}
},
{
$group: {
"_id" : '$subType',
count: { "$sum": 1 }
}
}
])
My output so far is perfect:
{
_id: 4,
count: 5
}
However, what I want to do is add another counter, a distinct count, as a third attribute.
Let's say I want to extend the result set above with a distinct count of usernames, so that each result contains the subType as _id, a count of the total number of documents, and a second counter for the number of distinct usernames that have entries. In my case, that is the number of people who have created documents.
A "pseudo resultset" would look like:
{
_id: 4,
countOfDocumentsOfSubstype4: 5
distinctCountOfUsernamesInDocumentsWithSubtype4: ?
}
Does this make any sense?
Please help me improve the question as well, since it's difficult to google it when you're not a MongoDB expert.
You can first group at the finest level, then perform a second grouping to achieve what you need:
db.logs.aggregate([
{
$match: {
$and : [
{"date" : { $gte : new ISODate("2020-11-27T00:00:00.000Z")}}
,{"date" : { $lte : new ISODate("2020-11-27T23:59:59.000Z")}}
,{"type" : 906}
]
}
},
{
$group: {
"_id" : {
subType : "$subType",
username : "$username"
},
count: { "$sum": 1 }
}
},
{
$group: {
"_id" : "$_id.subType",
"countOfDocumentsOfSubstype4" : {$sum : "$count"},
"distinctCountOfUsernamesInDocumentsWithSubtype4" : {$sum : 1}
}
}
])
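To make the logic of the two grouping stages easier to check, here is a sketch of the same computation done in memory in plain JavaScript. The sample entries and the names sampleLogs, countBySubType, and distinctUsernames are hypothetical. It assumes the entries have already been filtered by type and date, as the $match stage does.

```javascript
// Hypothetical log entries, already filtered to one type and date range.
const sampleLogs = [
  { username: "alice", subType: 4 },
  { username: "alice", subType: 4 },
  { username: "bob", subType: 4 },
  { username: "bob", subType: 5 },
  { username: "carol", subType: 4 }
];

// For each subType, count the documents and the distinct usernames.
function countBySubType(entries) {
  const groups = new Map(); // subType -> { count, usernames: Set }
  entries.forEach(function (entry) {
    if (!groups.has(entry.subType)) {
      groups.set(entry.subType, { count: 0, usernames: new Set() });
    }
    const group = groups.get(entry.subType);
    group.count += 1;
    group.usernames.add(entry.username);
  });
  // Convert each [subType, group] pair into a result document.
  return Array.from(groups, function (pair) {
    return {
      _id: pair[0],
      count: pair[1].count,
      distinctUsernames: pair[1].usernames.size
    };
  });
}

console.log(countBySubType(sampleLogs));
```

The Set plays the role of the first $group stage (one entry per subType and username pair), and its size corresponds to the {$sum : 1} in the second stage.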
I have a collection with elements like this:
{
_id: 585b...,
data: [
{
name: "John",
age: 30
},
{
name: "Jane",
age: 31
}
]
}
I know how to find the document that contains John:
db.People.find({"data.name": "John"})
But then I get the entire document. How can I get just the embedded document? I want to return this:
{
name: "John",
age: 30
}
For context: this is part of a larger dataset and I need to check if certain updates are made to this specific document. Due to the way the application is implemented, the embedded document won't always be at the same index.
So how can I query and return an embedded document?
Use a second parameter (a projection) to suppress the ID:
db.People.find({"data.name": "John"}, {_id : 0})
This will output
data: [
{
name: "John",
age: 30
},
{
name: "Jane",
age: 31
}
]
To get just the embedded documents, use aggregation.
db.test.aggregate([
{
$unwind : "$data"
},
{
$match : {"data.name" : "John"}
},
{
$project : {
_id : 0,
name : "$data.name",
age : "$data.age"
}
}
])
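If the whole document has already been fetched (for example with find()), the embedded document can also be picked out on the client, regardless of its index in the array. This is a minimal sketch in plain JavaScript. The names samplePerson and findEmbedded are hypothetical, and the _id is a placeholder string.

```javascript
// Hypothetical document, shaped like the one in the question.
const samplePerson = {
  _id: "example-id",
  data: [
    { name: "John", age: 30 },
    { name: "Jane", age: 31 }
  ]
};

// Return the first embedded document whose name matches,
// or undefined if no element matches.
function findEmbedded(doc, name) {
  return doc.data.find(function (item) {
    return item.name === name;
  });
}

console.log(findEmbedded(samplePerson, "John"));
```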
Help me please:
I have an order collection with this schema:
const OrderSchema = new Schema(
{
doctorId: { type : Schema.Types.ObjectId, ref: 'Users'},
patientId: { type : Schema.Types.ObjectId, ref: 'Users'},
orderTime: { type : String , default:''},
createdAt: { type : Date, default:Date.now },
approvedByDoctor:{ type :Boolean, default:false },
price:{type:Number,default:0}
},
);
I have 10 documents like this. What query do I need to get an array of the "orderTime" values from all of the documents? Thanks.
Assuming you have documents which look like this:
{
"_id" : ObjectId("578f73d17612ac41eb736641"),
"createdAt" : ISODate("2016-07-20T12:51:29.558Z")
}
{
"_id" : ObjectId("578f73e57612ac41eb736642"),
"createdAt" : ISODate("2016-07-20T12:51:49.701Z")
}
then you can generate a result document containing an array of createdAt dates which looks like this:
{ "_id" : null, "creationDates" : [ ISODate("2016-07-20T12:51:29.558Z"), ISODate("2016-07-20T12:51:49.701Z") ] }
by running the following aggregate query:
db.<your_collection>.aggregate([{$group:{"_id":null,"creationDates":{$push:"$createdAt"}}}])
This groups all documents in the collection ("_id":null) and pushes the values from the createdAt fields into an array ("creationDates":{$push:"$createdAt"}).
Use the aggregation framework to create the array. Essentially you'd want to group all the documents and use the $push accumulator operator to create the list. Follow this example to get the gist:
Order.aggregate([
{
"$group": {
"_id": 0,
"orderTimes": { "$push": "$orderTime" }
}
}
]).exec(function(err, result) {
console.log(result[0].orderTimes);
});
I have a schema that stores user attendance for events:
event: {
_id: ObjectId,
...
attendances: [{
user: {
type: ObjectId,
ref: 'User'
},
answer: {
type: String,
enum: ['yes', 'no', 'maybe']
}
}]
}
Sample data:
_id: '533483aecb41af00009a94c3',
attendances: [{
user: '531770ea14d1f0d0ec42ae57',
answer: 'yes',
}, {
user: '53177a2114d1f0d0ec42ae63',
answer: 'maybe',
}],
I would like to return this data in the following format when I query for all attendances for a user:
var attendances = {
yes: ['533497dfcb41af00009a94d8'], // These are the event IDs
no: [],
maybe: ['533497dfcb41af00009a94d6', '533497dfcb41af00009a94d2']
}
I am not sure whether the aggregation pipeline can return it in this format. So I was thinking I could return this and modify it easily:
var attendances = [{
answer: 'yes',
ids: ['533497dfcb41af00009a94d8'],
},{
answer: 'no',
ids: ['533497dfcb41af00009a94d8']
}, {
answer: 'maybe',
ids: ['533497dfcb41af00009a94d6', '533497dfcb41af00009a94d2']
}];
My attempt is not successful, however. It doesn't group by answer:
this.aggregate({
$match: {
'attendances.user': someUserId
}
}, {
$project: {
answer: '$attendances.answer'
}
}, {
$group: {
_id: '$answer',
ids: {
$addToSet: "$_id"
}
}
}, function(e, docs) {
});
Can I return the data I need in the first desired format and if not how can I correct the above code to achieve the desired result?
On that note - perhaps the map-reduce process would be better suited?
The query below gets you close to the answer you want. Although the result isn't in exactly the format you are expecting, you get a separate document for each answer option, each with an array of event IDs.
db.collection.aggregate([
// Unwind the 'attendances' array
{"$unwind" : "$attendances"},
// Match for user
{"$match" : {"attendances.user" : "53177a2114d1f0d0ec42ae63"}},
// Group by answer and push the event id's to an array
{"$group" : {_id : "$attendances.answer", eventids : {$push : "$_id"}}}
])
This produces the below output:
{
"result" : [
{
"_id" : "yes",
"eventids" : [
ObjectId("53d968a8c4840ac54443a9d6")
]
},
{
"_id" : "maybe",
"eventids" : [
ObjectId("53d968a8c4840ac54443a9d7"),
ObjectId("53d968a8c4840ac54443a9d8")
]
}
],
"ok" : 1
}
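To get from this output to the first format in the question, the grouped documents can be reshaped on the client. Below is a minimal sketch in plain JavaScript. It assumes the "result" array has already been extracted from the response, the names groupedAnswers and toAttendanceMap are hypothetical, and the event IDs are shortened to placeholder strings.

```javascript
// Hypothetical aggregation result (event ids shortened to plain strings).
const groupedAnswers = [
  { _id: "yes", eventids: ["e1"] },
  { _id: "maybe", eventids: ["e2", "e3"] }
];

// Build { yes: [...], no: [...], maybe: [...] }.
// Answers that are missing from the result default to an empty array.
function toAttendanceMap(groups) {
  const attendances = { yes: [], no: [], maybe: [] };
  groups.forEach(function (group) {
    attendances[group._id] = group.eventids;
  });
  return attendances;
}

console.log(toAttendanceMap(groupedAnswers));
```

Starting from an object with all three keys means an answer with no events (here "no") still appears as an empty array, which the aggregation alone would omit.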
This is the document structure:
{
'_id' : ObjectId('56be1b51a0f4c8591f37f62a'),
'name': 'Bob',
'sub_users': [{'_id' : ObjectId('56be1b51a0f4c8591f37f62a')}]
}
{
'_id' : ObjectId('56be1b51a0f4c8591f37f62b'),
'name': 'Alice',
'sub_users': [{'_id' : ObjectId('56be1b51a0f4c8591f37f62a')}]
}
The sub_users array is basically used to link accounts. In the example, Alice is Bob's manager since she has him as a sub_user. Bob has his own id in his sub_users array, and this is wrong (no one really is his own boss).
I want to find all the Bobs. It feels like a simple query, but I can't find a way to do it, or even how to google it properly. I tried this (probably knowing it wouldn't work):
db.users.aggregate([
{ $group: { _id: '_id' } },
{ $match: { sub_users: { $elemMatch: { _id: '$$ROOT._id' } } } }
])
And it didn't work, so the question is: how do I find a document whose nested documents have the same value as the root element (for a certain field)?
To get there I use a comparison expression ($cmp), which returns 0 when the two values are equal. See the example below:
db.users.aggregate([{
$unwind : "$sub_users"
}, //have all ids on same level
{
$project : {
_id : 1,
name : 1,
sameId : {
$cmp : ["$_id", "$sub_users._id"]
},
}
}, {
$match : {
sameId : 0
}
}
])
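The same check can be sketched in memory in plain JavaScript, which may help when verifying what the pipeline returns. This assumes the documents have already been loaded into an array and that ids can be compared as strings. The names sampleUsers and findSelfLinked are hypothetical, and the ids are written as plain strings rather than ObjectId values.

```javascript
// Hypothetical users, shaped like the documents in the question
// (ids written as plain strings instead of ObjectId values).
const sampleUsers = [
  {
    _id: "56be1b51a0f4c8591f37f62a",
    name: "Bob",
    sub_users: [{ _id: "56be1b51a0f4c8591f37f62a" }]
  },
  {
    _id: "56be1b51a0f4c8591f37f62b",
    name: "Alice",
    sub_users: [{ _id: "56be1b51a0f4c8591f37f62a" }]
  }
];

// Return the documents that list their own _id among their sub_users.
function findSelfLinked(docs) {
  return docs.filter(function (doc) {
    return doc.sub_users.some(function (sub) {
      return String(sub._id) === String(doc._id);
    });
  });
}

console.log(findSelfLinked(sampleUsers));
```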