MongoDB aggregate add all fields - mongodb

I have a collection recording impressions (views) of certain tags. I want to see the count of each tag value. In the response I also want to see the whole record, in the same way that MySQL would return it.
I'm doing a group using the aggregation pipeline, which looks like this:
db.tag_impressions.aggregate( [
{ $group : { _id : "$tag_value" , count:{$sum:1} } },
{ $sort : { count: 1 } }
] )
I want to return the whole of each matched document in tag_impressions,
and I've had some success using $first:
db.tag_impressions.aggregate( [
{ $group : { _id : "$tag_value" , "tag_type" : {$first : "$tag_type"} , count:{$sum:1} } },
{ $sort : { count: 1 } }
] )
But I would have to specify each field, which would take away the benefit of MongoDB being schema-less.
How can I return the whole document in the results?
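One way to avoid listing every field is the $$ROOT system variable, which refers to the whole document entering the stage. The sketch below keeps one full document per tag value under a field called doc; the names doc and pipeline are illustrative assumptions, not part of the question. Without a preceding $sort, which document counts as "first" is not guaranteed.

```javascript
// Group by tag_value, count the documents, and keep one full document per group.
// "$$ROOT" refers to the entire document entering the $group stage.
const pipeline = [
  {
    $group: {
      _id: "$tag_value",
      doc: { $first: "$$ROOT" }, // use $push instead to collect every matching document
      count: { $sum: 1 }
    }
  },
  { $sort: { count: 1 } }
];

// In the mongo shell (assumes the tag_impressions collection from the question):
// db.tag_impressions.aggregate(pipeline)
```

Using $push with $$ROOT returns every document in the group, but the grouped result must still fit within the 16 MB document size limit.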

Related

Is it possible to do multiple analyses in one aggregation function in MongoDB

I have a database containing books. It has the fields of "author", "title", "year" and "book type". Suppose I want to put a description on my website that describes the data currently in the database for instance the total number of books, the total number of different authors, the total number of books per author etc. etc.
Should I run several separate aggregations in MongoDB, or can I combine them? For now I would do something like:
db.books.aggregate([   // "books" is an assumed collection name
  {
    $group: {
      _id: "$author",
      details: {
        $push: {
          id: "$_id"
        }
      }
    }
  }
])
followed by
db.books.aggregate([   // "books" is an assumed collection name
  {
    $group: {
      _id: "$pubdate",
      details: {
        $push: {
          id: "$_id"
        }
      }
    }
  }
])
and so on. Is there a smarter solution?
You can combine the different aggregations in a single $facet stage, like this:
db.books.aggregate([   // "books" is an assumed collection name
  {
    $facet: {
      "authorCount": [
        {
          $group: {
            _id: "$author",
            details: { $push: { id: "$_id" } }
          }
        }
      ],
      "pubDateCount": [
        {
          $group: {
            _id: "$pubdate",
            details: { $push: { id: "$_id" } }
          }
        }
      ]
    }
  }
])
Explained:
The data is read once and sent to each facet sub-pipeline independently, so you don't need to run multiple aggregation commands.
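As a rough sketch of the statistics named in the question (total number of books, number of distinct authors, books per author), each facet can end in its own count. The facet names totalBooks, distinctAuthors and booksPerAuthor, and the collection name books, are assumptions made for illustration.

```javascript
// One $facet stage, three independent sub-pipelines over the same input documents.
const pipeline = [
  {
    $facet: {
      // Total number of documents in the collection.
      totalBooks: [{ $count: "count" }],
      // One group per author, then count the groups.
      distinctAuthors: [{ $group: { _id: "$author" } }, { $count: "count" }],
      // Number of books for each author, most prolific first.
      booksPerAuthor: [
        { $group: { _id: "$author", count: { $sum: 1 } } },
        { $sort: { count: -1 } }
      ]
    }
  }
];

// In the mongo shell (assumed collection name):
// db.books.aggregate(pipeline)
```

The $facet stage returns a single document with one array field per facet, so the combined output has to stay under the 16 MB document size limit.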

Double aggregation with distinct count in MongoDB

We have a collection which stores log documents.
Is it possible to have multiple aggregations on different attributes?
A document looks like this in its simplest form:
{
_id : int,
agent : string,
username: string,
date : string,
type : int,
subType: int
}
With the following query I can easily count all documents and group them by subtype for a specific type during a specific time period:
db.logs.aggregate([
  {
    $match: {
      $and: [
        { "date": { $gte: new ISODate("2020-11-27T00:00:00.000Z") } },
        { "date": { $lte: new ISODate("2020-11-27T23:59:59.000Z") } },
        { "type": 906 }
      ]
    }
  },
  {
    $group: {
      "_id": "$subType",
      count: { "$sum": 1 }
    }
  }
])
My output so far is perfect:
{
_id: 4,
count: 5
}
However, I want to add another counter that holds a distinct count as a third attribute.
Say I want to extend the result set above with a distinct count of usernames, so that each result contains the subType as _id, a count of the total number of documents, and a second counter for the number of distinct usernames that have entries. In my case, that is the number of people who have created documents.
A "pseudo resultset" would look like:
{
_id: 4,
countOfDocumentsOfSubstype4: 5,
distinctCountOfUsernamesInDocumentsWithSubtype4: ?
}
Does this make any sense?
Please help me improve the question as well, since it's difficult to google it when you're not a MongoDB expert.
You can first group at the finest level, then perform a second grouping to achieve what you need:
db.logs.aggregate([
  {
    $match: {
      $and: [
        { "date": { $gte: new ISODate("2020-11-27T00:00:00.000Z") } },
        { "date": { $lte: new ISODate("2020-11-27T23:59:59.000Z") } },
        { "type": 906 }
      ]
    }
  },
  {
    $group: {
      "_id": {
        subType: "$subType",
        username: "$username"
      },
      count: { "$sum": 1 }
    }
  },
  {
    $group: {
      "_id": "$_id.subType",
      "countOfDocumentsOfSubstype4": { $sum: "$count" },
      "distinctCountOfUsernamesInDocumentsWithSubtype4": { $sum: 1 }
    }
  }
])
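To see why the two groupings give both numbers, here is a small in-memory sketch in plain JavaScript that mirrors the two $group stages. The sample documents and the output field names total and distinctUsers are made up for illustration; it does not talk to MongoDB.

```javascript
// Made-up sample log documents (hypothetical data; only subType and username matter here).
const logs = [
  { username: "alice", type: 906, subType: 4 },
  { username: "alice", type: 906, subType: 4 },
  { username: "bob",   type: 906, subType: 4 },
  { username: "bob",   type: 906, subType: 4 },
  { username: "carol", type: 906, subType: 4 },
  { username: "alice", type: 906, subType: 7 }
];

// Stage 1: one bucket per (subType, username) pair, like the first $group.
const perPair = new Map();
for (const { subType, username } of logs) {
  const key = JSON.stringify([subType, username]);
  const entry = perPair.get(key) || { subType, username, count: 0 };
  entry.count += 1;
  perPair.set(key, entry);
}

// Stage 2: regroup by subType. Summing the counts gives the total number of
// documents; counting the buckets gives the number of distinct usernames.
const perSubType = new Map();
for (const { subType, count } of perPair.values()) {
  const entry = perSubType.get(subType) || { _id: subType, total: 0, distinctUsers: 0 };
  entry.total += count;
  entry.distinctUsers += 1;
  perSubType.set(subType, entry);
}

const result = [...perSubType.values()];
console.log(result);
// [ { _id: 4, total: 5, distinctUsers: 3 }, { _id: 7, total: 1, distinctUsers: 1 } ]
```

An alternative that stays in a single $group is to collect usernames with $addToSet and take the $size of that array in a later stage, at the cost of holding the whole set in memory per group.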

How to efficiently count filtered documents in MongoDB $group operator

I have a fairly small dataset of 63k documents (2.5GB total). Example of document:
{
_id : "[uniqueId]",
FormId : 10,
Name : "Name of form",
IsComplete : true,
Sections : [ many sections and can be large ]
}
I want to get the total count of documents by FormId. I get a fast result (0.15 s) with this query:
db.getCollection('collection').aggregate([
{ $sort : { FormId : 1 } }, //Index exists on FormId
{ $group : { _id : "$FormId", count : { $sum : 1 } } },
{ $sort : { "count" : -1 } }
])
My problem is that I need a count of the documents where { "IsComplete":true }. I have an index on each of the two properties, but I found that using the $match operator scans all documents. So how does one efficiently filter the $group count?
An efficient way would be:
Filter the documents with $match so that only matching documents are passed to the next stage. Placing $match at the very beginning of the pipeline lets the query take advantage of indexes.
Use $project to pass only the required fields to the next stage, which further reduces the data flowing through the pipeline.
db.getCollection('collection').aggregate([
{ $match: {"IsComplete":true} },
{ $project: {"IsComplete":1, "FormId":1}},
{ $group : { _id : "$FormId", count : { $sum : 1 } } },
{ $sort : { "count" : -1 } }
])
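If both the total count and the count of complete documents per FormId are wanted in one pass, a conditional sum is another option. This is only a sketch: the output field names total and complete are assumptions, and because there is no $match stage it reads every document, so it trades index use for a single pass.

```javascript
// Count all documents per FormId and, alongside, only those with IsComplete: true.
const pipeline = [
  // Project early so the large Sections array is not carried through the pipeline.
  { $project: { FormId: 1, IsComplete: 1 } },
  {
    $group: {
      _id: "$FormId",
      total: { $sum: 1 },
      // $cond adds 1 when IsComplete is true and 0 otherwise (including when it is missing).
      complete: { $sum: { $cond: ["$IsComplete", 1, 0] } }
    }
  },
  { $sort: { complete: -1 } }
];

// In the mongo shell (collection name taken from the question):
// db.getCollection('collection').aggregate(pipeline)
```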

MongoDB sum() data

I am new to MongoDB and NoSQL. What is the syntax to get a sum?
In MySQL, I would do something like this:
SELECT SUM(amount) from my_table WHERE member_id = 61;
How would I convert that to MongoDB? Here is what I have tried:
db.bigdata.aggregate({
$group: {
_id: {
memberId: 61,
total: {$sum: "$amount"}
}
}
})
Using http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/ for reference you want:
db.bigdata.aggregate(
{
$match: {
memberId: 61
}
},
{
$group: {
_id: "$memberId",
total : { $sum : "$amount" }
}
})
From the MongoDB docs:
The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated results.
It is better to match first and then group, so that the system performs the group operation only on the filtered records. If you group first, the system groups all records and only then selects the records with memberId = 61.
db.bigdata.aggregate(
{ $match : {memberId : 61 } },
{ $group : { _id: "$memberId" , total : { $sum : "$amount" } } }
)
The pipeline above works if the data you are summing is not part of an array. If you want to sum data held in an array within a document, use:
db.collectionName.aggregate(
{$unwind:"$arrayName"}, //unwinds the array element
{
$group:{_id: "$arrayName.arrayField", //id which you want to see in the result
total: { $sum: "$arrayName.value"}} //the field of array over which you want to sum
})
and you will get a result like this:
{
"result" : [
{
"_id" : "someFieldvalue",
"total" : someValue
},
{
"_id" : "someOtherFieldvalue",
"total" : someValue
}
],
"ok" : 1
}

Aggregation in MongoDB, using unwind

I need to aggregate all tags from records like this:
https://gist.github.com/sbassi/5642925
(there are 2 sample records in this snippet) and sort them by frequency (the most frequent tag first). But I don't want to take into account records that have specific "user_id" values (let's say 2, 3, 6 and 12).
Here is my attempt (just the aggregation, without filtering and sorting):
db.user_library.aggregate( { $unwind : "$annotations.data.tags" }, {
$group : { _id : "$annotations.data.tags" ,totalTag : { $sum : 1 } } }
)
And I got:
{ "result" : [ ], "ok" : 1 }
Right now you can't unwind an array that is nested inside another array. See SERVER-6436.
Consider structuring the data differently, for example with an array field holding all tags for the document. Alternatively, unwind annotations first and then annotations.data.tags in a stacked unwind like this:
db.user_library.aggregate([
{ $project: { 'annotations.data.tags': 1 } },
{ $unwind: '$annotations' },
{ $unwind: '$annotations.data.tags' },
{ $group: { _id: '$annotations.data.tags', totalTag: { $sum: 1 } } }
])
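The question also asked for filtering out certain users and sorting by frequency. A possible extension of the stacked unwind is sketched below; it assumes user_id is a top-level numeric field on each user_library document, which cannot be confirmed here because the linked gist is not reproduced.

```javascript
// Users whose records should be ignored (values taken from the question).
const excludedUsers = [2, 3, 6, 12];

const pipeline = [
  // Assumption: user_id is a top-level field; filter before unwinding to do less work.
  { $match: { user_id: { $nin: excludedUsers } } },
  { $project: { "annotations.data.tags": 1 } },
  { $unwind: "$annotations" },
  { $unwind: "$annotations.data.tags" },
  { $group: { _id: "$annotations.data.tags", totalTag: { $sum: 1 } } },
  // Most frequent tag first.
  { $sort: { totalTag: -1 } }
];

// In the mongo shell:
// db.user_library.aggregate(pipeline)
```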