Group by array element in Mongodb - mongodb

We have nested document and trying to group by array element. Our document structure looks like
/* 1 */
{
"_id" : ObjectId("5a690a4287e0e50010af1432"),
"slug" : [
"true-crime-the-10-most-infamous-american-murder-mysteries",
"10-most-infamous-american-murder-mysteries"
],
"tags" : [
{
"id" : "59244aa6b1be5055278e9b5b",
"name" : "true crime",
"_id" : "59244aa6b1be5055278e9b5b"
},
{
"id" : "5924524db1be5055278ebd6e",
"name" : "Occult Museum",
"_id" : "5924524db1be5055278ebd6e"
},
{
"id" : "5a690f0fc1a72100110c2656",
"_id" : "5a690f0fc1a72100110c2656",
"name" : "murder mysteries"
},
{
"id" : "59244d71b1be5055278ea654",
"name" : "unsolved murders",
"_id" : "59244d71b1be5055278ea654"
}
]
}
We want to find list of all slugs group by tag name. I am trying with following and it gets result but it isn't accurate. We have hundreds of records with each tag but i only get few with my query. I am not sure what i am doing wrong here.
Thanks in advance.
// Requires official MongoShell 3.6+
db.getCollection("test").aggregate(
[
{
"$match" : {
"item_type" : "Post",
"site_id" : NumberLong(2),
"status" : NumberLong(1)
}
},
{$unwind: "$tags" },
{
"$group" : {
"_id" : {
"tags᎐name" : "$tags.name",
"slug" : "$slug"
}
}
},
{
"$project" : {
"tags.name" : "$_id.tags᎐name",
"slug" : "$_id.slug",
"_id" : NumberInt(0)
}
}
],
{
"allowDiskUse" : true
}
);
Expected output is
TagName Slug
----------
true crime "true-crime-the-10-most-infamous-american-murder-mysteries",
"10-most-infamous-american-murder-mysteries"
"All records where tags true crime"

Instead of using slug as a part of _id you should use $push or $addToSet to accumulate them, try:
db.test.aggregate([
{
$unwind: "$tags"
},
{
$unwind: "$slug"
},
{
$group: {
_id: "$tags.name",
slugs: { $addToSet: "$slug" }
}
},
{
$project: {
_id: 1,
slugs: {
$reduce: {
input: "$slugs",
initialValue: "",
in: {
$concat: [ "$$value", ",", "$$this" ]
}
}
}
}
}
])
EDIT: to get comma separated string for slugs you can use $reduce with $concat
Output:
{ "_id" : "murder mysteries", "slugs" : ",10-most-infamous-american-murder-mysteries,true-crime-the-10-most-infamous-american-murder-mysteries" }
{ "_id" : "Occult Museum", "slugs" : ",10-most-infamous-american-murder-mysteries,true-crime-the-10-most-infamous-american-murder-mysteries" }
{ "_id" : "unsolved murders", "slugs" : ",10-most-infamous-american-murder-mysteries,true-crime-the-10-most-infamous-american-murder-mysteries" }
{ "_id" : "true crime", "slugs" : ",10-most-infamous-american-murder- mysteries,true-crime-the-10-most-infamous-american-murder-mysteries" }

Related

Mongodb Query to get the nth document

I need to create a query in mongodb that needs to return the SECOND TO THE LAST document. I am planning to use $group for this query but i dont know what aggregation function to use. I only know $first and $last.
I have an example collection below and also include the expected output. Thank you!
"_id" : ObjectId("60dc27ac54b7c46bfa1b84b4"),
"auditlogs" : [
{
"_id" : ObjectId("60dc27ac54b7c46bfa1b84be"),
"userid" : ObjectId("5ffe702d59a9205db81fcb69"),
"action" : "ADDTRANSACTION"
},
{
"_id" : ObjectId("60dc27ac54b7c46bfa1b84bd"),
"userid" : ObjectId("5ffe644f9493e05db9245192"),
"action" : "EDITPROFILE"
},
{
"_id" : ObjectId("60dc27ac54b7c46bfa1b84bc"),
"userid" : ObjectId("5ffe64949493e05db9245197"),
"action" : "DELETETRANSACTION"
} ]
"_id" : ObjectId("60dc27ac54b7c46bfa1b75ge2"),
"auditlogs" : [
{
"_id" : ObjectId("60dc27ac54b7c46bfa1b84bb"),
"userid" : ObjectId("5ffe64b69493e05db924519b"),
"action" : "ADDTRANSACTION"
},
{
"_id" : ObjectId("60dc27ac54b7c46bfa1b84ba"),
"userid" : ObjectId("5ffe65419493e05db92451d4"),
"action" : "ADDTRANSACTION"
},
{
"_id" : ObjectId("60dc27ac54b7c46bfa1b84b9"),
"userid" : ObjectId("5ffe65689493e05db92451d9"),
"action" : "CHANGEACCESS"
},
{
"_id" : ObjectId("60dc27ac54b7c46bfa1b84b8"),
"userid" : ObjectId("5ffe65819493e05db92451dd"),
"action" : "DELETETRANSACTION"
},
{
"_id" : ObjectId("60dc27ac54b7c46bfa1b84b7"),
"userid" : ObjectId("5ffe65df9493e05db92451f3"),
"action" : "EDITPROFILE",
]
OUTPUT:
{"_id" : ObjectId("60dc27ac54b7c46bfa1b84b4"),"_id" : ObjectId("60dc27ac54b7c46bfa1b84bd"),"userid" : ObjectId("5ffe644f9493e05db9245192"),"action" : "EDITPROFILE"},
{"_id" : ObjectId("60dc27ac54b7c46bfa1b75ge2"),"_id" : ObjectId("60dc27ac54b7c46bfa1b84b8"),"userid" : ObjectId("5ffe65819493e05db92451dd"),"action" : "DELETETRANSACTION"}
You can't have two _id keys in one single object.
I've made the parent object's id to _parentId you can give it's a name anything you want except _id
Aggregation:
db.collection.aggregate([
{
$unwind: "$auditlogs"
},
{
"$project": {
"_parentId": "$_id",
"_id": "$auditlogs._id",
"action": "$auditlogs.action",
"userid": "$auditlogs.userid",
}
}
])
Playground
You can slice the array by -2 to get the last two item, then by 1 to get first one. Therefore, the array will be left the second to the last. Finally, unwind auditlogs so it can be changed from array to object which is structure that you want.
db.collection.aggregate([
{
$project: { auditlogs : { $slice: [ "$auditlogs", -2 ] } }
},
{
$project: { auditlogs : { $slice: [ "$auditlogs", 1 ] } }
},
{
$unwind: "$auditlogs"
}
])

MongoDB: Filtering aggregation by field values and nested arrays

I built up a graph structure using MongoDB. I did some complex queries on this structure already, but I am struggling on selecting a subgraph of a given depth starting from a specific node via the aggregation pipeline.
I already did use the $graphLookup to get all the required nodes, which gives the following result:
{ "_id" : "O_4", "name" : "D", "type" : "Info", "links" : [ { "link" : "L_2", "objectId" : "O_1" }, { "link" : "L_4", "objectId" : "O_3" }, { "link" : "L_10", "objectId" : "O_6" } ] }
{ "_id" : "O_2", "name" : "B", "type" : "Info", "links" : [ { "link" : "L_1", "objectId" : "O_1" }, { "link" : "L_3", "objectId" : "O_3" } ] }
{ "_id" : "O_1", "name" : "A", "type" : "Info", "links" : [ { "link" : "L_1", "objectId" : "O_2" }, { "link" : "L_2", "objectId" : "O_4" } ] }
{ "_id" : "O_3", "name" : "C", "type" : "Info", "links" : [ { "link" : "L_3", "objectId" : "O_2" }, { "link" : "L_4", "objectId" : "O_4" }, { "link" : "L_5", "objectId" : "O_5" }, { "link" : "L_6", "objectId" : "O_7" } ] }
{ "_id" : "O_6", "name" : "F", "type" : "System", "links" : [ { "link" : "L_8", "objectId" : "O_7" }, { "link" : "L_9", "objectId" : "O_5" }, { "link" : "L_10", "objectId" : "O_4" } ] }
But now I want to remove the nested "link" objects (in array "links") where the "objectId" is not present in the above result, i.e. in "O_6" the link "L_8" should be removed, since the node "O_7" is not part of the subgraph.
I already tried playing around with $in, $facet and other stuff to get this problem solved, but it seems like I am unable ...
Maybe, you guys can help out?
Edit:
Just found a solution more or less - $filter does a decent job here:
{
$unwind: "$links"
}, {
$group: {
_id: null,
ids: {
$addToSet: "$_id"
},
links: {
$addToSet: "$links"
}
}
}, {
$project: {
links: {
$filter: {
input: "$links",
as: "link",
cond: {
$in: ["$$link.objectId", "$ids"]
}
}
}
}
}, {
$unwind: "$links"
}, {
$replaceRoot: {
newRoot: "$links"
}
}, {
$group: {
_id: "$link"
}
}
Returns what I needed - the list of Link-IDs:
{ "_id" : "L_1" }
{ "_id" : "L_10" }
{ "_id" : "L_3" }
{ "_id" : "L_2" }
{ "_id" : "L_4" }

How to regoup in subdocuments, multi document that have a same field in MongoDB?

I have a collection in mongoDB that looks like this :
db.mycollection.find({})
{
"_id" : ObjectId("5deb4ce4bbe1b67e6e5611e4"),
"site" : "MDC",
"label" : "407",
"status" : "removed"
}
{
"_id" : ObjectId("5def36379ca17632de773d7e"),
"site" : "MDC",
"label" : "407",
"status" : "new"
}
{
"_id" : ObjectId("5df4740eab0d76657c19a7d2"),
"site" : "MDC",
"label" : "408",
"status" : "new"
}
I would like to regroup my documents that have the same value for the field "label" in one document with subdocument of the status, to have something like this :
{
"_id" : ObjectId("5deb4ce4bbe1b67e6e5611e4"),
"site" : "MDC",
"label" : "407",
"status" : [
{
"label" : "new"
},
{
"label" : "removed"
}
]
}
I tried different ways (aggregate, update,..) to do this but it's a complete fail...
You need to $group by label or site in order to $push your statuses:
db.collection.aggregate([
{
$group: {
_id: "$label",
old_id: { $first: "$_id" },
site: { $first: "$site" },
status: { $push: { label: "$status" } }
}
},
{
$project: {
_id: "$old_id",
site: 1,
label: "$_id",
status: 1
}
}
])
Mongo Playground

Sort a match group by id in aggregate

(Mongo newbie here, sorry) I have a mongodb collection, result of a mapreduce with this schema :
{
"_id" : "John Snow",
"value" : {
"countTot" : 500,
"countCall" : 30,
"comment" : [
{
"text" : "this is a text",
"date" : 2016-11-17 00:00:00.000Z,
"type" : "call"
},
{
"text" : "this is a text",
"date" : 2016-11-12 00:00:00.000Z,
"type" : "visit"
},
...
]
}
}
My goal is to have a document containing all the comments of a certain type. For example, a document John snow with all the calls.
I manage to have all the comments for a certain type using this :
db.general_stats.aggregate(
{ $unwind: '$value.comment' },
{ $match: {
'value.comment.type': 'call'
}}
)
However, I can't find a way to group the data received by the ID (for example john snow) even using the $group property. Any idea ?
Thanks for reading.
Here is the solution for your query.
db.getCollection('calls').aggregate([
{ $unwind: '$value.comment' },
{ $match: {
'value.comment.type': 'call'
}},
{
$group : {
_id : "$_id",
comment : { $push : "$value.comment"},
countTot : {$first : "$value.countTot"},
countCall : {$first : "$value.countCall"},
}
},
{
$project : {
_id : 1,
value : {"countTot":"$countTot","countCall":"$countCall","comment":"$comment"}
}
}
])
or either you can go with $project with $filter option
db.getCollection('calls').aggregate([
{
$project: {
"value.comment": {
$filter: {
input: "$value.comment",
as: "comment",
cond: { $eq: [ "$$comment.type", 'call' ] }
}
},
"value.countTot":"$value.countTot",
"value.countCall":"$value.countCall",
}
}
])
In both case below is my output.
{
"_id" : "John Snow",
"value" : {
"countTot" : 500,
"countCall" : 30,
"comment" : [
{
"text" : "this is a text",
"date" : "2016-11-17 00:00:00.000Z",
"type" : "call"
},
{
"text" : "this is a text 2",
"date" : "2016-11-17 00:00:00.000Z",
"type" : "call"
}
]
}
}
Here is the query which is the extension of the one present in OP.
db.general_stats.aggregate(
{ $unwind: '$value.comment' },
{ $match: {
'value.comment.type': 'call'
}},
{$group : {_id : "$_id", allValues : {"$push" : "$$ROOT"}}},
{$project : {"allValues" : 1, _id : 0} },
{$unwind : "$allValues" }
);
Output:-
{
"allValues" : {
"_id" : "John Snow",
"value" : {
"countTot" : 500,
"countCall" : 30,
"comment" : {
"text" : "this is a text",
"date" : ISODate("2016-11-25T10:46:49.258Z"),
"type" : "call"
}
}
}
}
Got my answer looking at this :
How to retrieve all matching elements present inside array in Mongo DB?
using the $addToSet property in the $group one.

Mongodb find data and groupby for another column

{
"_id" : ObjectId("5763e4d6c0140edcb8731485"),
"_class" : "net.microservice.product.domain.Product",,
"createdAt" : ISODate("2016-06-17T11:53:58.228Z"),
"createdBy" : "user-0",
"modifiedAt" : ISODate("2016-06-21T06:21:47.524Z"),
"modifiedBy" : "user-0",
"merchant" : "a746f24safa5-e96f-4281-9759-a4a02b306d77",
"type" : DBRef("productTypes", ObjectId("575fd99236623f70c959247f")),
"fields" : {
"Image4" : {
"value" : "http://i.hizliresim.com/ZdELXa.jpg",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"Image3" : {
"value" : "http://i.hizliresim.com/l1WkqX.jpg",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"Image2" : {
"value" : "http://i.hizliresim.com/VYMl9n.jpg",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"Kur" : {
"value" : "TL",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"Image1" : {
"value" : "http://i.hizliresim.com/nrWAQ0.jpg",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"uploadDate" : ISODate("2016-06-17T11:53:00Z"),
"tasks" : [ ]
}
this is sample of database. I want to get data in which:
- modifiedAt is before "modifiedAt" : ISODate("2016-07-21T06:21:47.524Z"),
so i do this and this works:
db.products.find({
'modifiedAt':
{$lte: ISODate("2016-10-18T13:05:18.961Z"
)} }).
count()
14999
But i need to find for each merchant. Beause 14999 result is not true because a merchant have lots of product so 14999 includes multiple products.
I need to group by merchant and distinct. I couldnot do it.
i do this but
db.products.
aggregate([ {
$group: {
_id: '$merchant', } }, {
$match: {
modifiedAt:
{$lte: ISODate("2016-06-18T13:05:18.961Z")} }} ])
brings nothing and no error.
you can try something like this. This gives you the number of products by merchant.
db.products.aggregate([
{$match: {modifiedAt:{$lte: ISODate("2016-06-21T06:21:47.524Z")}}},
{$group: { _id: "$merchant",count: { $sum: 1 }}}
])
Output:
{ "_id" : "a89846f24safa5-e96f-4281-9759-a4a02b306d77", "count" : 1 }
Always place the $match as early in the aggregation pipeline as possible. Because $match limits the total number of documents in the aggregation pipeline, earlier $match operations minimize the amount of processing down the pipe.
So your query would be like
db.products.aggregate([
{
$match: {
modifiedAt: {
$lte: ISODate("2016-06-18T13:05:18.961Z")
}
}
},
{
$group: {
_id: '$merchant'
}
}
])