MongoDB aggregate distinct with sort and limit

I have a collection objects.
{
"_id" : ObjectId("55fa65046db58e7d0c8b456a"),
"object_id" : "1651419",
"user" : {
"id" : "65593",
"cookie" : "9jgkm7ME1HDFD4K6j8WWvg",
},
"createddate" : ISODate("2015-09-17T10:00:20.945+03:00")
}
Every time a user visits an object's page, the visit is stored as a separate record in the collection. Now I need to get an array of the last N visited objects. It should be distinct, so the array should contain N unique records, and it should be sorted by createddate.
So if the user visited object_id = 1, then object_id = 2 twice, then object_id = 3, and then object_id = 1 again, the array should contain:
{
visits : [1, 3, 2]
}
(distinct and sorted by time of last visit).
I tried a pipeline like this:
db.objects.aggregate([
{$match: {'user.id' : '65593'}},
{$sort: { 'createddate':-1 }},
{$project: {'id': '$user.id', 'obj' : '$object_id'}},
{$group: {_id:'$id', 'obj': {$addToSet: '$obj'}}},
{$project:{_id:0, 'obj':'$obj'}}
])
but it returns an array that is not sorted, and I also can't limit the array size.

The $addToSet operator and "sets" in general in MongoDB are not ordered in any way. Instead, get the "distinct" values by grouping on them first, then sort them and push them onto the array:
db.objects.aggregate([
{ "$match": { "user.id": "65593" } },
{ "$sort": { "user.id": 1, "createddate": -1 } },
{ "$group": {
"_id": {
"_id": "$user.id",
"object_id": "$object_id"
},
"createddate": { "$first": "$createddate" }
}},
{ "$sort": { "_id._id": 1, "createddate": -1 } },
{ "$group": {
"_id": "$_id._id",
"obj": { "$push": "$_id.object_id" }
}}
])
So to get the most recent date for each object you $sort first, so that $first picks it up. But since $group does not guarantee any order of results, you need to $sort again before the second $group, where the $push operation builds the array.
Note that "createddate" has to be reduced somehow, since the "distinct" items are the combination of the "user.id" and "object_id" fields. So it needs some sort of accumulator, and it needs to be included in the output for your ordering.
Then the array items will be in the order you expect.
If you need to limit the array, you can $unwind it after the final $group, apply $limit, and group again. Alternatively, apply $limit after the first $group and the $sort that follows it.
But of course this is only practical for a single main grouping _id, being "user.id". MongoDB 3.2 and later support $slice as an aggregation operator, which makes this practical for multiple grouping ids and a bit simpler in general. But it still isn't possible to "limit" the array items before that initial group over multiple primary grouping ids.
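As a rough check of the logic above (keep the latest createddate per object_id, sort newest first, then limit), here is a minimal sketch in plain JavaScript that runs outside MongoDB. The function name lastDistinctVisits and the sample data are hypothetical, and it works on an in-memory array rather than on a collection.

```javascript
// Hypothetical helper: emulates "group by object_id keeping the latest date,
// sort by that date descending, then keep the first n ids" on a plain array.
function lastDistinctVisits(visits, n) {
  const latest = new Map(); // object_id -> most recent createddate (ms)
  for (const v of visits) {
    const t = v.createddate.getTime();
    if (!latest.has(v.object_id) || latest.get(v.object_id) < t) {
      latest.set(v.object_id, t);
    }
  }
  return [...latest.entries()]
    .sort((a, b) => b[1] - a[1]) // newest visit first
    .slice(0, n)                 // same role as $limit
    .map(([objectId]) => objectId);
}

// Made-up sample data: visits to 1, 2, 2, 3 and then 1 again.
const sampleVisits = [
  { object_id: "1", createddate: new Date("2015-09-17T10:00:00Z") },
  { object_id: "2", createddate: new Date("2015-09-17T10:01:00Z") },
  { object_id: "2", createddate: new Date("2015-09-17T10:02:00Z") },
  { object_id: "3", createddate: new Date("2015-09-17T10:03:00Z") },
  { object_id: "1", createddate: new Date("2015-09-17T10:04:00Z") },
];

console.log(lastDistinctVisits(sampleVisits, 3)); // [ '1', '3', '2' ]
```

The result matches the order asked for in the question: distinct ids, most recently visited first.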

I found the solution I was looking for:
db.objects.aggregate([
{$match: {'user.id' : '65593'}},
{$group : {
_id : '$object_id',
dt : {$max: '$createddate'}
}
},
{$sort: {'dt':-1}},
{$limit:5},
{$group : {
_id :null,
'objects' : {$push:'$_id'}
}
},
{$project: {_id:0, 'objects':'$objects'}}
])
It returns an array of at most N distinct objects, sorted by createddate in descending order.
Thanks, everyone, for the help!

Related

$facet of mongodb returning full sorted documents instead of count based on match

I have documents as below:
{
_id:1234,
userId:"90oi",
tag:"self"
},
{
_id:5678,
userId:"65yd",
tag:"other"
},
{
_id:9012,
userId:"78hy",
tag:"something"
},
{
_id:3456,
userId:"60oy",
tag:"self"
}
I need a response like the one below:
[{
tag : "self",
count : 2
},
{
tag : "something",
count : 1
},
{
tag : "other",
count : 1
}
]
I was using $facet to query the documents, but it returns the entire documents instead of the count. My query is as follows:
db.data.aggregate({
$facet: {
categorizedByGrade : [
{ $match: {userId:ObjectId(userId)}},
{$sortByCount: "$tag"}
]
}
})
Let me know what I am doing wrong. Thanks in advance for the help.
You don't need $facet for this one. $facet is for when you really need to run multiple aggregation pipelines in a single aggregation query. Please try this:
db.yourCollectionName.aggregate([
{ $project: { tag: 1, _id: 0 } },
{ $group: { _id: '$tag', count: { $sum: 1 } } },
{ $project: { tag: '$_id', _id: 0, count: 1 } }
])
Explanation:
The first $project retains only the needed fields in each document, so there is less data to process. $group then iterates through all documents and groups them by the specified field, while $sum counts the number of documents added to each group. Finally, $project is used again to shape the result as needed.
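To make the group-and-count step concrete, here is a minimal sketch in plain JavaScript that emulates it on an in-memory array. The helper name countByTag is hypothetical. Note that the pipeline above does not sort its output, so the final sort by count is an addition of this sketch, made to match the order shown in the question.

```javascript
// Hypothetical helper: emulates {$group: {_id: "$tag", count: {$sum: 1}}}
// followed by a sort on count, highest first.
function countByTag(docs) {
  const counts = new Map(); // tag -> number of documents seen with that tag
  for (const doc of docs) {
    counts.set(doc.tag, (counts.get(doc.tag) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([tag, count]) => ({ tag, count }))
    .sort((a, b) => b.count - a.count);
}

// Sample documents modeled on the ones in the question.
const sampleDocs = [
  { _id: 1234, userId: "90oi", tag: "self" },
  { _id: 5678, userId: "65yd", tag: "other" },
  { _id: 9012, userId: "78hy", tag: "something" },
  { _id: 3456, userId: "60oy", tag: "self" },
];

console.log(countByTag(sampleDocs));
```

Tags with equal counts may appear in either order, just as with a sort on count alone in MongoDB.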
You can also get the correct result with $facet. Please have a look at the query below:
db.data.aggregate({
$facet: {
categorizedByGrade : [
{
$sortByCount:"$tag"
},
{
$project:{
_id:0,
tag:"$_id",
count:1,
}
}]
}
})

Get the number of documents liked per document in MongoDB

I'm working on a project that uses MongoDB as its database and I've run into a problem: I can't find the right query to make a simple count of the likes of a document. The collection I use looks like this:
{ "username" : "example1",
"like" : [ { "document_id" : "doc1" },
{ "document_id" : "doc2" },
...]
}
So what I need is to compute the number of likes of each document, so that at the end I will have
{ "document_id" : "docA" , nbLikes : 30 }, {"document_id" : "docB", nbLikes : 1}
Can anyone help me with this? My own attempts have failed.
You can do this by unwinding the like array of each doc and then grouping by document_id to get a count for each value:
db.test.aggregate([
// Duplicate each doc, once per 'like' array element
{$unwind: '$like'},
// Group them by document_id and assemble a count
{$group: {_id: '$like.document_id', nbLikes: {$sum: 1}}},
// Reshape the docs to match the desired output
{$project: {_id: 0, document_id: '$_id', nbLikes: 1}}
])
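The unwind-then-group idea can be tried outside MongoDB with a small plain JavaScript sketch. The helper name likesPerDocument and the second sample user are hypothetical; flatMap stands in for $unwind, and users without a like array contribute nothing, which mirrors the default behaviour of $unwind for missing or empty arrays.

```javascript
// Hypothetical helper: emulates $unwind on "like" followed by
// {$group: {_id: "$like.document_id", nbLikes: {$sum: 1}}}.
function likesPerDocument(users) {
  const counts = new Map(); // document_id -> number of likes
  // flatMap plays the role of $unwind: one entry per element of "like".
  for (const like of users.flatMap((u) => u.like || [])) {
    counts.set(like.document_id, (counts.get(like.document_id) || 0) + 1);
  }
  return [...counts.entries()].map(([document_id, nbLikes]) => ({
    document_id,
    nbLikes,
  }));
}

// Made-up sample data in the shape shown in the question.
const sampleUsers = [
  { username: "example1", like: [{ document_id: "doc1" }, { document_id: "doc2" }] },
  { username: "example2", like: [{ document_id: "doc1" }] },
  { username: "example3" }, // no likes at all
];

console.log(likesPerDocument(sampleUsers));
// [ { document_id: 'doc1', nbLikes: 2 }, { document_id: 'doc2', nbLikes: 1 } ]
```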
Alternatively, add a "likeCount" field, increment it with every $push operation, and read the "likeCount" field when you need the count:
db.test.update(
{ _id: "..." },
{
$inc: { likeCount: 1 },
$push: { like: { "document_id" : "doc1" } }
}
)

mongodb query and sort by list item

How to sort documents by funnelSteps[0].count, that is how to sort by the count number of the first funnelSteps?
Thank you.
{
"funnelSteps" : [{
"title" : "step1",
"criteria" : ["1","2"],
"count" : 305
}, {
"title" : "step2",
"criteria" : ["1","2","3"],
"count" : 153
}]
}
MongoDB uses "dot notation" to refer to nested elements in a structure, so you can indeed specify an element by index:
db.collection.find().sort({ "funnelSteps.0.count": 1 })
A sort order of 1 is ascending and -1 is descending. See .sort() for more detail.
That is fine for a "known" position of an array element, but if you wanted to sort by something such as the "least" value within "funnelSteps" then you would do something like this using .aggregate():
db.collection.aggregate([
{ "$unwind": "$funnelSteps" },
{ "$group": {
"_id": "$_id",
"funnelSteps": { "$push": "$funnelSteps" },
"lowestCount": { "$min": "$funnelSteps.count" }
}},
{ "$sort": { "lowestCount": 1 } }
])
So in that case you would need to "pull apart" the array in order to get the value you wanted before sorting. But for a known position you can just use the basic arguments to sort as shown.
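The difference between the two sort keys can be illustrated with a minimal plain JavaScript sketch on in-memory data. The helper names and the second sample document are hypothetical; the first function corresponds to sorting on "funnelSteps.0.count", the second to sorting on the lowest count in the array.

```javascript
// Hypothetical helper: sort by the count of the first array element,
// like sort({ "funnelSteps.0.count": 1 }).
function sortByFirstStepCount(docs) {
  return [...docs].sort(
    (a, b) => a.funnelSteps[0].count - b.funnelSteps[0].count
  );
}

// Hypothetical helper: sort by the smallest count anywhere in the array,
// like the $unwind / $group with $min / $sort pipeline.
function sortByLowestCount(docs) {
  const lowest = (d) => Math.min(...d.funnelSteps.map((s) => s.count));
  return [...docs].sort((a, b) => lowest(a) - lowest(b));
}

// Made-up sample data: document "a" is the one from the question.
const funnels = [
  { _id: "a", funnelSteps: [{ title: "step1", count: 305 }, { title: "step2", count: 153 }] },
  { _id: "b", funnelSteps: [{ title: "step1", count: 200 }, { title: "step2", count: 180 }] },
];

console.log(sortByFirstStepCount(funnels).map((d) => d._id)); // [ 'b', 'a' ]
console.log(sortByLowestCount(funnels).map((d) => d._id));    // [ 'a', 'b' ]
```

The two orderings differ here because document "a" has the larger first count but the smaller minimum.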

Query number of sub collections Mongodb

I am new to MongoDB and I am trying to figure out how to count all the matching entries inside an array of documents like the one below:
"impression_details" : [
{
"date" : ISODate("2014-04-24T16:35:46.051Z"),
"ip" : "::1"
},
{
"date" : ISODate("2014-04-24T16:35:53.396Z"),
"ip" : "::1"
},
{
"date" : ISODate("2014-04-25T16:22:20.314Z"),
"ip" : "::1"
}
]
What I would like to do is count how many entries dated 2014-04-24 there are (which is 2). At the moment my query looks like this, and it is not working:
db.banners.find({
"impression_details.date":{
"$gte": ISODate("2014-04-24T00:00:00.000Z"),
"$lte": ISODate("2014-04-24T23:59:59.000Z")
}
}).count()
Not sure what is going on please help!
Thank you.
The concept here is that there is a distinct difference between selecting documents and selecting elements of a sub-document array. So what is happening currently in your query is exactly what should be happening. As the document contains at least one sub-document entry that matches your condition, then that document is found.
In order to "filter" the content of the sub-documents itself for more than one match, then you need to apply the .aggregate() method. And since you are expecting a count then this is what you want:
db.banners.aggregate([
// Matching documents still makes sense
{ "$match": {
"impression_details.date":{
"$gte": ISODate("2014-04-24T00:00:00.000Z"),
"$lte": ISODate("2014-04-24T23:59:59.000Z")
}
}},
// Unwind the array
{ "$unwind": "$impression_details" },
// Actually filter the array contents
{ "$match": {
"impression_details.date":{
"$gte": ISODate("2014-04-24T00:00:00.000Z"),
"$lte": ISODate("2014-04-24T23:59:59.000Z")
}
}},
// Group back to the normal document form and get a count
{ "$group": {
"_id": "$_id",
"impression_details": { "$push": "$impression_details" },
"count": { "$sum": 1 }
}}
])
And that will give you a form that only has the elements that match your query in the array, as well as providing the count of those entries that were matched.
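The distinction between matching a document and matching array elements can be seen in a small plain JavaScript sketch. The helper name countImpressionsInRange is hypothetical; it filters the array elements of one in-memory document, which is what the $unwind and second $match do in the pipeline above.

```javascript
// Hypothetical helper: count the array elements whose date falls in
// [start, end], rather than just testing whether any element matches.
function countImpressionsInRange(banner, start, end) {
  return banner.impression_details.filter(
    (d) => d.date >= start && d.date <= end
  ).length;
}

// The sample array from the question, as plain Date objects.
const banner = {
  impression_details: [
    { date: new Date("2014-04-24T16:35:46.051Z"), ip: "::1" },
    { date: new Date("2014-04-24T16:35:53.396Z"), ip: "::1" },
    { date: new Date("2014-04-25T16:22:20.314Z"), ip: "::1" },
  ],
};

const dayStart = new Date("2014-04-24T00:00:00.000Z");
const dayEnd = new Date("2014-04-24T23:59:59.000Z");

console.log(countImpressionsInRange(banner, dayStart, dayEnd)); // 2
```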
The $elemMatch operator can help here. Unlike the plain dot-notation query, it requires a single array element to satisfy both bounds.
Your query finds all the documents whose impression_details field contains a date between ISODate("2014-04-24T00:00:00.000Z") and ISODate("2014-04-24T23:59:59.000Z"). The point is that it returns the whole document, which is not what you want. So if you want to count only the subdocuments that satisfy your condition:
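var start = ISODate("2014-04-24T00:00:00.000Z");
var end = ISODate("2014-04-24T23:59:59.000Z");
var docs = db.banners.find({
  "impression_details": {
    $elemMatch: { date: { $gte: start, $lte: end } }
  }
});
var count = 0;
docs.forEach(function(doc) {
  // find() returns whole documents, so count only the matching elements
  doc.impression_details.forEach(function(detail) {
    if (detail.date >= start && detail.date <= end) {
      count += 1;
    }
  });
});
print(count);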

MongoDB: count the number of items in an array

I have a collection where every document in the collection has an array named foo that contains a set of embedded documents. Is there currently a trivial way in the MongoDB shell to count how many instances are within foo? something like:
db.mycollection.foos.count() or db.mycollection.foos.size()?
Each document in the array needs to have a unique foo_id, and I want to do a quick count to make sure that the right number of elements is inside the array for a random document in the collection.
In MongoDB 2.6, the Aggregation Framework has a new array $size operator you can use:
> db.mycollection.insert({'foo':[1,2,3,4]})
> db.mycollection.insert({'foo':[5,6,7]})
> db.mycollection.aggregate([{$project: { count: { $size:"$foo" }}}])
{ "_id" : ObjectId("5314b5c360477752b449eedf"), "count" : 4 }
{ "_id" : ObjectId("5314b5c860477752b449eee0"), "count" : 3 }
If you are on a recent version of MongoDB (2.2 and later), you can use the aggregation framework:
db.mycollection.aggregate([
{$unwind: '$foo'},
{$group: {_id: '$_id', 'sum': { $sum: 1}}},
{$group: {_id: null, total_sum: {'$sum': '$sum'}}}
])
which will give you the total number of foos in your collection.
Omitting the last $group gives the count per document instead.
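The per-document count and the collection-wide total can be compared with a minimal plain JavaScript sketch on in-memory data. The helper names fooCounts and totalFoos are hypothetical. Treating a missing or non-array foo as zero is a choice of this sketch, not something the pipelines above do.

```javascript
// Hypothetical helper: per-document array length, like {$size: "$foo"}.
// A missing or non-array "foo" is counted as 0 (an assumption of this sketch).
function fooCounts(docs) {
  return docs.map((d) => (Array.isArray(d.foo) ? d.foo.length : 0));
}

// Hypothetical helper: total over all documents, like the final $group
// with _id: null and a $sum over the per-document counts.
function totalFoos(docs) {
  return fooCounts(docs).reduce((sum, n) => sum + n, 0);
}

// The two documents inserted in the $size example above.
const fooDocs = [{ foo: [1, 2, 3, 4] }, { foo: [5, 6, 7] }];

console.log(fooCounts(fooDocs)); // [ 4, 3 ]
console.log(totalFoos(fooDocs)); // 7
```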
Using projections and groups:
db.mycollection.aggregate(
[
{
$project: {
_id:0,
foo_count:{$size:"$foo"},
}
},
{
$group: {
_id:null,
foo_total:{$sum:"$foo_count"},
}
}
]
)
Counts for multiple child arrays can also be calculated this way:
db.mycollection.aggregate(
[
{
$project: {
_id:0,
foo1_count:{$size:"$foo1"},
foo2_count:{$size:"$foo2"},
}
},
{
$group: {
_id:null,
foo1_total:{$sum:"$foo1_count"},
foo2_total:{$sum:"$foo2_count"},
}
}
]
)