Group documents by subdocument field - mongodb

I am trying to use MongoDB's aggregation framework to group a collection on a timestamp field and use $out to write the result to a new collection. Apologies, I am new to MongoDB.
I have the following JSON structure in my collection
{
    "_id" : "1",
    "parent" : [
        {
            "child" : {
                "child_id" : "1",
                "timestamp" : ISODate("2010-01-08T17:49:39.814Z")
            }
        }
    ]
}
Here is what I have been trying:
db.mycollection.aggregate([
    { $project: { child_id: '$parent.child.child_id', timestamp: '$parent.child.timestamp' } },
    { $group: { cid: '$child_id', ts: { $max: '$timestmap' } } },
    { $out: 'mycollectiongrouped' }
])
However, I am getting the error below. Any ideas? I assume I am probably using $project incorrectly.
[thread1] Error: command failed: {
"ok" : 0,
"errmsg" : "the group aggregate field 'cid' must be defined as an expression inside an object",
"code" : 15951
} : aggregate failed :
_getErrorWithCode#src/mongo/shell/utils.js:25:13

db.collection.aggregate([
    { $group: {
        _id: "$parent.child.child_id",
        timestamp: { $max: "$parent.child.timestamp" }
    }},
    { $project: {
        cid: { $arrayElemAt: ["$_id", 0] },
        ts: { $arrayElemAt: ["$timestamp", 0] },
        _id: 0
    }},
    { $out: "groupedCollection" }
])
You are missing the _id field, which is mandatory for the $group pipeline stage. That said, since the "parent" field in your document is a single-element array, the $group stage should be the first stage in the pipeline.
By making $group the first stage, you only need to project one document per group instead of every document in the collection.
Note that the resulting document fields are arrays, hence the use of the $arrayElemAt operator in the $project stage.

You need an _id field for the $group. This _id is what determines which documents are grouped together. For instance, if you want to group by child_id, then use _id: "$child_id". In that case you can drop the separate cid field and simply use _id in its place.
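To make the $group semantics concrete, here is a plain-JavaScript sketch of what grouping by a key with a $max accumulator does. This is an illustration only, not MongoDB code; the helper name and sample data are made up:

```javascript
// Plain-JS sketch of { $group: { _id: key, ts: { $max: field } } }.
// Documents sharing the same key collapse into one output document,
// keeping the maximum value seen for that group.
function groupMax(docs, keyFn, valFn) {
  const groups = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    const val = valFn(doc);
    const cur = groups.get(key);
    if (cur === undefined || val > cur) groups.set(key, val);
  }
  // One output document per distinct key, like $group's result.
  return [...groups.entries()].map(([k, v]) => ({ _id: k, ts: v }));
}

const docs = [
  { child_id: "1", timestamp: "2010-01-08" },
  { child_id: "1", timestamp: "2012-03-01" },
  { child_id: "2", timestamp: "2011-05-05" },
];
console.log(groupMax(docs, d => d.child_id, d => d.timestamp));
```

Note the key function plays exactly the role of the _id expression: without it there is no way to decide which documents belong together.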

$facet of mongodb returning full sorted documents instead of count based on match

I have documents as below:
{
    "_id" : 1234,
    "userId" : "90oi",
    "tag" : "self"
},
{
    "_id" : 5678,
    "userId" : "65yd",
    "tag" : "other"
},
{
    "_id" : 9012,
    "userId" : "78hy",
    "tag" : "something"
},
{
    "_id" : 3456,
    "userId" : "60oy",
    "tag" : "self"
}
I need a response like below:
[
    { "tag" : "self", "count" : 2 },
    { "tag" : "something", "count" : 1 },
    { "tag" : "other", "count" : 1 }
]
I was using $facet to query the documents, but it returns the entire documents, not the counts. My query is as follows:
db.data.aggregate([{
    $facet: {
        categorizedByGrade : [
            { $match: { userId: ObjectId(userId) } },
            { $sortByCount: "$tag" }
        ]
    }
}])
Let me know what I am doing wrong. Thanks in advance for the help.
You don't actually need $facet for this one; $facet is for when you need to run multiple aggregation pipelines within a single aggregation query (see the MongoDB $facet documentation). Please try this:
db.yourCollectionName.aggregate([
    { $project: { tag: 1, _id: 0 } },
    { $group: { _id: '$tag', count: { $sum: 1 } } },
    { $project: { tag: '$_id', _id: 0, count: 1 } }
])
Explanation:
The first $project retains only the fields we need, so there is less data to process. $group then collects documents into one group per distinct tag value, while $sum: 1 counts how many documents land in each group. The final $project reshapes the result into the form we want.
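The $group/$sum counting step is conceptually the same as a plain-JavaScript reduce. The sketch below only illustrates the semantics with made-up sample data; it is not MongoDB code:

```javascript
// Sketch of { $group: { _id: "$tag", count: { $sum: 1 } } } in plain JS.
function countByTag(docs) {
  const counts = docs.reduce((acc, doc) => {
    // Each distinct tag value becomes one group; $sum: 1 counts members.
    acc[doc.tag] = (acc[doc.tag] || 0) + 1;
    return acc;
  }, {});
  // Reshape like the final $project: { tag: "$_id", count: 1 }.
  return Object.entries(counts).map(([tag, count]) => ({ tag, count }));
}

const docs = [
  { tag: "self" }, { tag: "other" }, { tag: "something" }, { tag: "self" },
];
console.log(countByTag(docs));
```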
You can also retrieve the correct records using $facet; have a look at the query below:
db.data.aggregate([{
    $facet: {
        categorizedByGrade : [
            { $sortByCount: "$tag" },
            { $project: {
                _id: 0,
                tag: "$_id",
                count: 1
            }}
        ]
    }
}])

MongoDB: How to match on the elements of an array?

I have two collections as follows:
db.qnames.find()
{ "_id" : ObjectId("5a4da53f97a9ca769a15d49e"), "domain" : "mail.google.com", "tldOne" : "google.com", "clients" : 10, "date" : "2016-12-30" }
{ "_id" : ObjectId("5a4da55497a9ca769a15d49f"), "domain" : "mail.google.com", "tldOne" : "google.com", "clients" : 9, "date" : "2017-01-30" }
and
db.dropped.find()
{ "_id" : ObjectId("5a4da4ac97a9ca769a15d49c"), "domain" : "google.com", "dropDate" : "2017-01-01", "regStatus" : 1 }
I would like to join the two collections and choose the documents for which the 'dropDate' field (from the dropped collection) is larger than the 'date' field (from the qnames collection). So I used the following query:
db.dropped.aggregate([
    { $lookup: { from: "qnames", localField: "domain", foreignField: "tldOne", as: "droppedTraffic" } },
    { $match: { "droppedTraffic": { $ne: [] } } },
    { $unwind: "$droppedTraffic" },
    { $match: { dropDate: { $gt: "$droppedTraffic.date" } } }
])
but this query does not filter out the records where dropDate < date. Can anyone give me a clue why this happens?
The reason you are not getting the expected records:
Dates are stored as strings in your collections. To make proper use of the comparison operators, modify your documents to store real dates, e.g. new ISODate("your existing date string").
Note that even after modifying both collections, you still need to change your aggregate query: the final $match compares two values from the same document, which a plain $match query cannot do (the "$droppedTraffic.date" there is treated as a literal string, not a field reference).
A sample query to get the desired documents:
db.dropped.aggregate([
    { $lookup: {
        from: "qnames",
        localField: "domain",
        foreignField: "tldOne",
        as: "droppedTraffic"
    }},
    { $project: {
        _id: 1,
        domain: 1,
        regStatus: 1,
        droppedTraffic: {
            $filter: {
                input: "$droppedTraffic",
                as: "droppedTraffic",
                cond: { $gt: ["$$droppedTraffic.date", "$dropDate"] }
            }
        }
    }}
])
In the approach above we used $filter, which avoids the $unwind operation.
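$filter here behaves much like JavaScript's Array.prototype.filter over the joined array. The plain-JS sketch below illustrates the condition (field names taken from the question; the string comparison works here because the dates are ISO-formatted):

```javascript
// Sketch of $filter: keep only joined docs whose date is greater than
// the parent document's dropDate.
function filterDropped(doc) {
  return {
    ...doc,
    droppedTraffic: doc.droppedTraffic.filter(
      // cond: { $gt: ["$$droppedTraffic.date", "$dropDate"] }
      t => t.date > doc.dropDate
    ),
  };
}

const joined = {
  domain: "google.com",
  dropDate: "2017-01-01",
  droppedTraffic: [
    { domain: "mail.google.com", date: "2016-12-30" },
    { domain: "mail.google.com", date: "2017-01-30" },
  ],
};
console.log(filterDropped(joined).droppedTraffic);
```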
You should use $redact to compare two fields of the same document. Following example should work:
db.dropped.aggregate( [
{$lookup:{ from:"qnames", localField:"domain",foreignField:"tldOne",as:"droppedTraffic"}},
{$match: {"droppedTraffic":{$ne:[]} }},
{$unwind: "$droppedTraffic" },
{
"$redact": {
"$cond": [
{ "$lte": [ "$dropDate", "$droppedTraffic.date" ] },
"$$KEEP",
"$$PRUNE"
]
}
}
])

In mongodb aggregation where to apply sort before lookup or after lookup?

I am writing an aggregation query where I want to perform a join between two collections in MongoDB, and for that I am using $lookup. My question: does $lookup change the order of results produced by an earlier $sort? If it does, I need to put my $sort after the $lookup; if not, I can use it before the $lookup.
My code is given below:
brandmodel.aggregate([
    { $project: { '_id': 0, 'brand_id': 1, 'brand_name': 1, 'brand_icon': 1, 'banner_image': 1, 'weight': 1 } },
    { $lookup: { from: "student_coupons", localField: "brand_id", foreignField: "brand_id", as: "coupons" } },
    { $unwind: "$coupons" },
    { $sort: { weight: -1, "coupons.time_posted": -1 } } // SHOULD I WRITE THIS BEFORE LOOKUP OR AFTER LOOKUP
])
In MongoDB 3.6, $lookup has a more expressive form in which you can access fields of the source document and run further pipeline stages within the $lookup stage itself. See the documentation.
As an example,
db.movies.aggregate([
    { $match: { _id: ObjectId("573a1390f29313caabcd414c") } },
    { $lookup: {
        from: "comments",
        let: { 'id': '$_id' },
        pipeline: [
            { $match: { '$expr': { '$eq': ['$movie_id', '$$id'] } } },
            { $sort: { 'date': -1 } }
        ],
        as: "comments"
    }}
])
You declare any fields you need from the source collection in let, do the matching as required (this part is optional), and then apply whatever pipeline stages you need to the collection being looked up.
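Conceptually, that correlated $lookup does the following for each source document: filter the foreign collection on the joined key, sort the matches, and attach them as an array. A plain-JavaScript sketch with made-up sample data (not MongoDB code):

```javascript
// Sketch of $lookup with let/pipeline: for each movie, match comments on
// movie_id, sort them by date descending, and attach the result.
function lookupComments(movies, comments) {
  return movies.map(movie => ({
    ...movie,
    comments: comments
      .filter(c => c.movie_id === movie._id)        // $match with $expr/$eq
      .sort((a, b) => (a.date < b.date ? 1 : -1)),  // $sort: { date: -1 }
  }));
}

const movies = [{ _id: 1, title: "A" }];
const comments = [
  { movie_id: 1, date: "2020-01-01", text: "older" },
  { movie_id: 1, date: "2021-01-01", text: "newer" },
  { movie_id: 2, date: "2022-01-01", text: "other movie" },
];
console.log(lookupComments(movies, comments)[0].comments);
```

This also shows why a $sort inside the $lookup pipeline only orders each document's joined array; ordering the outer documents themselves still requires a $sort stage outside the $lookup.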

Get the number of documents liked per document in MongoDB

I'm working on a project using MongoDB as the database and I'm encountering a problem: I can't find the right query to make a simple count of the likes of a document. The collection that I use is this:
{
    "username" : "example1",
    "like" : [
        { "document_id" : "doc1" },
        { "document_id" : "doc2" },
        ...
    ]
}
What I need to compute is the number of likes of each document, so at the end I will have:
{ "document_id" : "docA", "nbLikes" : 30 }, { "document_id" : "docB", "nbLikes" : 1 }
Can anyone help me with this? My attempts so far have failed.
You can do this by unwinding the like array of each doc and then grouping by document_id to get a count for each value:
db.test.aggregate([
    // Duplicate each doc, once per 'like' array element
    { $unwind: '$like' },
    // Group them by document_id and assemble a count
    { $group: { _id: '$like.document_id', nbLikes: { $sum: 1 } } },
    // Reshape the docs to match the desired output
    { $project: { _id: 0, document_id: '$_id', nbLikes: 1 } }
])
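The $unwind/$group pair is equivalent to flattening every user's like array and counting occurrences per document_id. A plain-JavaScript sketch of the same logic (illustration only, with made-up sample data):

```javascript
// Sketch of { $unwind: '$like' } followed by a $group on like.document_id.
function countLikes(users) {
  const counts = new Map();
  for (const user of users) {
    for (const like of user.like) { // $unwind: one record per array element
      counts.set(like.document_id, (counts.get(like.document_id) || 0) + 1);
    }
  }
  // Like the $project reshape: { document_id, nbLikes }.
  return [...counts.entries()].map(
    ([document_id, nbLikes]) => ({ document_id, nbLikes })
  );
}

const users = [
  { username: "example1", like: [{ document_id: "doc1" }, { document_id: "doc2" }] },
  { username: "example2", like: [{ document_id: "doc1" }] },
];
console.log(countLikes(users));
```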
Alternatively, add a "likeCount" field, increment it on every $push operation, and then simply read "likeCount":
db.test.update(
{ _id: "..." },
{
$inc: { likeCount: 1 },
$push: { like: { "document_id" : "doc1" } }
}
)

count multiple distinct fields by group with Mongo

I have a data set that looks like this:
{"BrandId":"a","SessionId":100,"UserName":"tom"}
{"BrandId":"a","SessionId":200,"UserName":"tom"}
{"BrandId":"b","SessionId":300,"UserName":"mike"}
I would like to count distinct sessions and usernames grouped by BrandId; the equivalent SQL is:
select brandid, count(distinct sessionid), count(distinct username)
from data
group by brandid
I tried to write this in MongoDB. My current code is as follows, but it does not work. Is there any way to make it work?
db.logs.aggregate([
    { $group: {
        _id: { brand: "$BrandId", user: "$UserName", session: "$SessionId" },
        count: { $sum: 1 }
    }},
    { $group: {
        _id: "$_id.brand",
        users: { $sum: "$_id.user" },
        sessions: { $sum: "$_id.session" }
    }}
])
For the example above, the expected counts are:
{ "BrandId" : "a", "countSession" : 2, "countUser" : 1 }
{ "BrandId" : "b", "countSession" : 1, "countUser" : 1 }
If you know SQL, the expected result is the same as that of the SQL I mentioned.
You can do this by using $addToSet to accumulate the distinct set of SessionId and UserName values during the $group, and then adding a $project stage to your pipeline that uses the $size operator to get the size of each set:
db.logs.aggregate([
{$group: {
_id: '$BrandId',
sessionIds: {$addToSet: '$SessionId'},
userNames: {$addToSet: '$UserName'}
}},
{$project: {
_id: 0,
BrandId: '$_id',
countSession: {$size: '$sessionIds'},
countUser: {$size: '$userNames'}
}}
])
Result:
{
"BrandId" : "b",
"countSession" : 1,
"countUser" : 1
},
{
"BrandId" : "a",
"countSession" : 2,
"countUser" : 1
}
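The $addToSet/$size combination maps naturally onto JavaScript Sets: duplicates are discarded on insert, and the set's size is the distinct count. A plain-JS sketch of the same logic (illustration only, not MongoDB code):

```javascript
// Sketch of $addToSet + $size: distinct counts per BrandId.
function distinctCounts(logs) {
  const groups = new Map();
  for (const log of logs) {
    if (!groups.has(log.BrandId)) {
      groups.set(log.BrandId, { sessions: new Set(), users: new Set() });
    }
    const g = groups.get(log.BrandId);
    g.sessions.add(log.SessionId); // $addToSet: '$SessionId'
    g.users.add(log.UserName);     // $addToSet: '$UserName'
  }
  return [...groups.entries()].map(([BrandId, g]) => ({
    BrandId,
    countSession: g.sessions.size, // $size: '$sessionIds'
    countUser: g.users.size,       // $size: '$userNames'
  }));
}

const logs = [
  { BrandId: "a", SessionId: 100, UserName: "tom" },
  { BrandId: "a", SessionId: 200, UserName: "tom" },
  { BrandId: "b", SessionId: 300, UserName: "mike" },
];
console.log(distinctCounts(logs));
```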