Create a unique list from multiple arrays inside a subdocument - mongodb

I would like to create a unique list of array values inside a subdocument.
Document:
{
"_id" : ObjectId("5aee0e3c059638093b69c8b3"),
"firstname" : "John",
"lastname" : "Doe",
"websites" : [
{
"_id" : ObjectId("123"),
"key" : "website2",
"url" : "www.xxx.com",
"tags" : [
"php",
"python",
"java"
]
},
{
"_id" : ObjectId("456"),
"key" : "website2",
"url" : "www.yyy.com",
"tags" : [
"java",
"php"
]
},
{
"_id" : ObjectId("789"),
"key" : "website3",
"url" : "www.zzz.com",
"tags" : [
"java",
"html",
"css"
]
}
]
}
Expected output:
{
"_id" : ObjectId("5aee0e3c059638093b69c8b3"),
"firstname" : "John",
"lastname" : "Doe",
"unique_tags": [
"java",
"php",
"python",
"html",
"css"
],
"websites" : [
{
"_id" : ObjectId("123"),
"key" : "website2",
"url" : "www.xxx.com",
"tags" : [
"php",
"python",
"java"
]
},
{
"_id" : ObjectId("456"),
"key" : "website2",
"url" : "www.yyy.com",
"tags" : [
"java",
"php"
]
},
{
"_id" : ObjectId("789"),
"key" : "website3",
"url" : "www.zzz.com",
"tags" : [
"java",
"html",
"css"
]
}
]
}
It looks like Mongo has a distinct command, but as far as I can tell it cannot be used inside an aggregate query.
I also tried to $unwind on websites.tags and use $addToSet, but that did not produce the right output either.
Any ideas?

You can try the aggregation below:
db.col.aggregate([
{
$addFields: {
unique_tags: {
$reduce: {
input: {
$concatArrays: {
$map: {
input: "$websites",
as: "website",
in: "$$website.tags"
}
}
},
initialValue: [],
in: { $setUnion : ["$$value", "$$this"]}
}
}
}
}
])
The $map stage expression turns websites into an array of tag arrays (one per website). $reduce then folds those arrays together with $setUnion, which merges them and drops duplicates, so the result contains each tag only once. Note that $setUnion treats arrays as sets, so the order of the elements in unique_tags is not guaranteed.
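To check the expected result without a running server, the same idea can be sketched in plain JavaScript. This is only an in-memory illustration, not part of the original answer: the names uniqueTags and sampleUser are hypothetical, and a Set keeps first-seen order whereas $setUnion makes no ordering promise.

```javascript
// Hypothetical helper mirroring the $map + $reduce/$setUnion idea in memory.
function uniqueTags(doc) {
  // Like $map: collect one tags array per website (missing tags -> empty array).
  const tagArrays = (doc.websites || []).map((website) => website.tags || []);
  // Like folding with $setUnion: flatten, then let a Set drop the duplicates.
  return [...new Set(tagArrays.flat())];
}

// Trimmed copy of the document from the question (assumed shape).
const sampleUser = {
  firstname: "John",
  lastname: "Doe",
  websites: [
    { key: "website2", url: "www.xxx.com", tags: ["php", "python", "java"] },
    { key: "website2", url: "www.yyy.com", tags: ["java", "php"] },
    { key: "website3", url: "www.zzz.com", tags: ["java", "html", "css"] },
  ],
};

console.log(uniqueTags(sampleUser)); // [ 'php', 'python', 'java', 'html', 'css' ]
```

The five tags match the unique_tags list in the expected output, apart from their order.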

Related

All Mongo documents with duplicated objects inside array

Find documents with duplicated objects inside an array.
Some answers work only with arrays of basic-type elements (e.g. an array of strings). Here I want to filter on certain fields of the objects.
For example:
{
"name": "1",
"arr": [{ "type": "fruit", "name":"pear"},{ "type": "fruit","name":"banana"}]
},
{
"name":"2",
"arr": [{"type":"fish"}]
}
Given the above two documents, I want to retrieve just document 1, because it has two elements in the array with the same type. (Of course I want all documents with this property, not just one.)
The following query can get us the expected output:
db.collection.find({
$expr:{
$ne:[
{
$size:"$arr"
},
{
$size:{
$setUnion:["$arr.type"]
}
}
]
}
}).pretty()
Data set:
{
"_id" : ObjectId("5d7b8546d76ccfa3cb0f133c"),
"name" : "1",
"arr" : [
{
"type" : "fruit",
"name" : "pear"
},
{
"type" : "fruit",
"name" : "banana"
}
]
}
{
"_id" : ObjectId("5d7b8546d76ccfa3cb0f133d"),
"name" : "2",
"arr" : [
{
"type" : "fish"
}
]
}
{
"_id" : ObjectId("5d7b8546d76ccfa3cb0f133e"),
"name" : "3",
"arr" : [
{
"type" : "product",
"name" : "watch"
},
{
"type" : "product",
"name" : "Pen"
}
]
}
Output:
{
"_id" : ObjectId("5d7b8546d76ccfa3cb0f133c"),
"name" : "1",
"arr" : [
{
"type" : "fruit",
"name" : "pear"
},
{
"type" : "fruit",
"name" : "banana"
}
]
}
{
"_id" : ObjectId("5d7b8546d76ccfa3cb0f133e"),
"name" : "3",
"arr" : [
{
"type" : "product",
"name" : "watch"
},
{
"type" : "product",
"name" : "Pen"
}
]
}
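Query analysis: "$arr.type" resolves to the array of type values in arr, and $setUnion removes the duplicates from it. The query therefore keeps the documents in which the size of arr is not equal to the number of unique type values present in arr.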
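The comparison behind that query can be sketched in plain JavaScript, which is handy for unit-testing the expected matches. The names hasDuplicateTypes and sampleDocs are hypothetical, and elements without a type field may be treated differently here than by the "$arr.type" path on the server.

```javascript
// Hypothetical in-memory version of: $size of arr != $size of unique types.
function hasDuplicateTypes(doc) {
  const types = (doc.arr || []).map((item) => item.type);
  // A Set keeps one entry per distinct type, like $setUnion on a single array.
  return types.length !== new Set(types).size;
}

// Trimmed copy of the data set from the answer.
const sampleDocs = [
  { name: "1", arr: [{ type: "fruit", name: "pear" }, { type: "fruit", name: "banana" }] },
  { name: "2", arr: [{ type: "fish" }] },
  { name: "3", arr: [{ type: "product", name: "watch" }, { type: "product", name: "Pen" }] },
];

const matchingNames = sampleDocs.filter(hasDuplicateTypes).map((doc) => doc.name);
console.log(matchingNames); // [ '1', '3' ]
```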

How to merge document in mongodb based on an Id

Is there a way in MongoDB to merge documents based on an id? For example:
/*1*/
{
"id" : "xxxxxx",
"pages" : {
"name" : "Page50-50",
"pageid" : "Page50-50",
"icon" : "/images/cmstemplates/2column.png",
"children" : []
},
"permissions" : [
{
"grpName" : "Admin",
"grants" : [
"View",
"Delete",
"Edit"
]
},
{
"grpName" : "Users",
"grants" : [
"View"
]
}
]
}
/*2*/
{
"id" : "xxxxxx",
"pages" : {
"name" : "AboutUs",
"pageid" : "AboutUs",
"icon" : "/images/cmstemplates/1column.png",
"children" : []
},
"permissions" : [
{
"grpName" : "Student",
"grants" : [
"View",
"Delete"
]
}
]
}
Expected Output:
{
"id" : "xxxxxx",
"pages" : [{
"name" : "Page50-50",
"pageid" : "Page50-50",
"icon" : "/images/2column.png",
"children" : [],
"permissions" : [
{
"grpName" : "Admin",
"grants" : [
"View",
"Delete",
"Edit"
]
},
{
"grpName" : "Users",
"grants" : [
"View"
]
}
]
},{
"name" : "AboutUs",
"pageid" : "AboutUs",
"icon" : "/images/1column.png",
"children" : [],
"permissions" : [
{
"grpName" : "Student",
"grants" : [
"View",
"Delete"
]
}
]
}]
}
The permissions field in the expected output should be nested inside pages, and pages should be grouped into an array. I used $group and was able to group the pages, but I can't figure out how to add the permissions field inside each page. I am using MongoDB 3.2.
Thanks.
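Use $group on the shared id field (not _id, which is unique per document), and push an object that combines each page's fields with that document's permissions:
db.getCollection('Collection').aggregate([{
$group: {
"_id": "$id",
"pages": {
$push: {
"name": "$pages.name",
"pageid": "$pages.pageid",
"icon": "$pages.icon",
"children": "$pages.children",
"permissions": "$permissions"
}
}
}
}])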
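The merge the question asks for (group by id, collect pages into an array, and nest each document's permissions inside its page) can also be sketched in plain JavaScript, for example to build fixtures for the expected output. The names mergeById and pageDocs are hypothetical and the sample data is a trimmed copy of the two documents from the question.

```javascript
// Hypothetical in-memory merge: one output document per distinct id.
function mergeById(docs) {
  const grouped = new Map();
  for (const doc of docs) {
    if (!grouped.has(doc.id)) {
      grouped.set(doc.id, { id: doc.id, pages: [] });
    }
    // Copy the page fields and attach this document's permissions to the page.
    grouped.get(doc.id).pages.push({ ...doc.pages, permissions: doc.permissions });
  }
  return [...grouped.values()];
}

const pageDocs = [
  {
    id: "xxxxxx",
    pages: { name: "Page50-50", pageid: "Page50-50", children: [] },
    permissions: [
      { grpName: "Admin", grants: ["View", "Delete", "Edit"] },
      { grpName: "Users", grants: ["View"] },
    ],
  },
  {
    id: "xxxxxx",
    pages: { name: "AboutUs", pageid: "AboutUs", children: [] },
    permissions: [{ grpName: "Student", grants: ["View", "Delete"] }],
  },
];

const mergedPages = mergeById(pageDocs);
console.log(JSON.stringify(mergedPages, null, 2));
```

Both input documents share the same id, so the result is a single document whose pages array has two entries, each carrying its own permissions.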

Embed root field in a subdocument within an aggregation pipeline

Maybe someone can help me with Mongo's aggregation pipeline. I am trying to put an object inside another object, but I'm new to Mongo and it's very difficult:
{
"_id" : ObjectId("5888a74f137ed66828367585"),
"name" : "Unis",
"tags" : [...],
"editable" : true,
"token" : "YfFzaoNvWPbvyUmSulXfMPq4a9QgGxN1ElIzAUmSJRX4cN7zCl",
"columns" : [...],
"description" : "...",
"sites" : {
"_id" : ObjectId("5888ae2f137ed668fb95a03d"),
"url" : "www.....de",
"column_values" : [
"University XXX",
"XXX",
"false"
],
"list_id" : ObjectId("5888a74f137ed66828367585")
},
"scan" : [
{
"_id" : ObjectId("5888b1074e2123c22ae7f4d3"),
"site_id" : ObjectId("5888ae2f137ed668fb95a03d"),
"scan_group_id" : ObjectId("5888a970a7f75fbd49052ed6"),
"date" : ISODate("2017-01-18T16:00:00Z"),
"score" : "B",
"https" : false,
"cookies" : 12
}
]
}
I want to put every object in the "scan" array into "sites", so that it looks like this:
{
"_id" : ObjectId("5888a74f137ed66828367585"),
"name" : "Unis",
"tags" : [...],
"editable" : true,
"token" : "YfFzaoNvWPbvyUmSulXfMPq4a9QgGxN1ElIzAUmSJRX4cN7zCl",
"columns" : [...],
"description" : "...",
"sites" : {
"_id" : ObjectId("5888ae2f137ed668fb95a03d"),
"url" : "www.....de",
"column_values" : [
"University XXX",
"XXX",
"false"
],
"list_id" : ObjectId("5888a74f137ed66828367585"),
"scan" : [
{
"_id" : ObjectId("5888b1074e2123c22ae7f4d3"),
"site_id" : ObjectId("5888ae2f137ed668fb95a03d"),
"scan_group_id" : ObjectId("5888a970a7f75fbd49052ed6"),
"date" : ISODate("2017-01-18T16:00:00Z"),
"score" : "B",
"https" : false,
"cookies" : 12
}
]
}
}
Is there a step in the aggregation pipeline to perform this task?
With a single pipeline stage, I don't see any other way than to specify each field individually:
db.collection.aggregate([
{
"$project": {
"name": 1, "tags": 1,
"editable": 1,
"token": 1, "columns": 1,
"description": 1,
"sites._id": "$sites._id",
"sites.url": "$sites.url" ,
"sites.column_values": "$sites.column_values" ,
"sites.list_id": "$sites.list_id",
"sites.scan": "$scan"
}
}
])
With MongoDB 3.4 and newer, you can use the $addFields pipeline stage instead of listing every field in $project. It outputs documents that contain all existing fields from the input documents plus the newly added ones, so only the new sites.scan field has to be specified. A following $project then drops the root-level scan field:
db.collection.aggregate([
{
"$addFields": {
"sites.scan": "$scan"
}
}, { "$project": { "scan": 0 } }
])
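The intended reshaping (move the root-level scan array into the sites subdocument and keep every other field) can be sketched in plain JavaScript to verify the target shape. The names embedScan and listDoc are hypothetical, and the sample is a shortened stand-in for the document in the question.

```javascript
// Hypothetical in-memory version of "add sites.scan, then drop root scan".
function embedScan(doc) {
  // Pull scan and sites out; everything else stays at the root unchanged.
  const { scan, sites, ...rest } = doc;
  return { ...rest, sites: { ...sites, scan } };
}

const listDoc = {
  name: "Unis",
  editable: true,
  sites: { url: "www.example.de", column_values: ["University XXX", "XXX", "false"] },
  scan: [{ score: "B", https: false, cookies: 12 }],
};

const reshaped = embedScan(listDoc);
console.log(JSON.stringify(reshaped, null, 2));
```

The function returns a new object and leaves the input untouched, which mirrors how an aggregation reshapes its output without modifying the stored documents.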

MongoDB finding all subdocuments where subDocument _ids like "string"

Sample document structure is like:
{
"_id" : "https://docs.mongodb.org/manual",
"collection" : {
"_id" : "collection",
"urls" : [
"https://docs.mongodb.org/manual/c1",
"https://docs.mongodb.org/manual/c2"
]
},
"collectionNew" : {
"_id" : "collection1",
"urls" : [
"https://docs.mongodb.org/manual/c1New",
"https://docs.mongodb.org/manual/c2New"
]
},
"log" : {
"_id" : "log",
"urls" : [
"https://docs.mongodb.org/manual/l1",
"https://docs.mongodb.org/manual/l2"
]
}
}
I have multiple such documents in my collection.
After finding a document with
db.AutoSearch.find({"_id": "https://docs.mongodb.org/manual"})
I want to find all subdocuments under it whose name is like "collection".
SOLUTION
I changed my document structure to:
{
"_id" : "https://docs.mongodb.org/manual",
"contents" : [
{
"_id" : "connect",
"urls" : [
"https://docs.mongodb.org/manual/search/?query=connect"
]
},
{
"_id" : "connection",
"urls" : [
"https://docs.mongodb.org/manual/search/?query=connection",
"https://docs.mongodb.org/manual/search/?query=connection%20pymongo"
]
},
{
"_id" : "list",
"urls" : [
"https://docs.mongodb.org/manual/search/?query=listfiles",
"https://docs.mongodb.org/manual/search/?query=listdatabases",
"https://docs.mongodb.org/manual/search/?query=listcommands"
]
},
{
"_id" : "log",
"urls" : [
"http://docs.mongodb.org/manual/tutorial/rotate-log-files/",
"http://docs.mongodb.org/manual/reference/log-messages/"
]
},
{
"_id" : "index",
"urls" : [
"https://docs.mongodb.org/manual/search/?query=index-related%20commands"
]
}
]}
Then, using a MongoDB aggregate query that unwinds contents and matches the subdocument _id against a regular expression:
db.AutoSearch.aggregate([{$match:{ "_id": "https://docs.mongodb.org/manual"}},
{$unwind: '$contents'},
{$match:{"contents._id":/connect/}}])
I got the desired output.
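What the $unwind followed by the regex $match produces can be sketched in plain JavaScript: one output document per matching entry of contents. The names matchContents and searchDoc are hypothetical and the sample is a trimmed copy of the restructured document (urls omitted).

```javascript
// Hypothetical in-memory version of: $unwind contents, then match _id by regex.
function matchContents(doc, pattern) {
  return doc.contents
    .filter((entry) => pattern.test(entry._id))
    // After $unwind, contents holds a single subdocument instead of an array.
    .map((entry) => ({ _id: doc._id, contents: entry }));
}

const searchDoc = {
  _id: "https://docs.mongodb.org/manual",
  contents: [
    { _id: "connect" },
    { _id: "connection" },
    { _id: "list" },
    { _id: "log" },
    { _id: "index" },
  ],
};

// The pattern has no "g" flag, so test() keeps no state between calls.
const contentHits = matchContents(searchDoc, /connect/);
console.log(contentHits.map((hit) => hit.contents._id)); // [ 'connect', 'connection' ]
```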

MongoDB: how to retrieve all the values in array-type fields?

Is there a way to retrieve all the values of an array-type field across documents? For example, given:
{ "slug" : "my-post", "status" : "publish", "published" : ISODate("2014-01-26T18:28:11Z"), "title" : "my post", "body" : "my body post", "_id" : ObjectId("52e553c937fb8bf218b8c624"), "tags" : [ "js", "php", "scala" ], "created" : ISODate("2014-01-26T18:28:25.298Z"), "author" : "whisher", "__v" : 0 }
{ "slug" : "my-post-2", "status" : "publish", "published" : ISODate("2014-01-26T18:28:27Z"), "title" : "my post 2", "body" : "spost body", "_id" : ObjectId("52e5540837fb8bf218b8c625"), "tags" : [ "android", "actionscript", "java" ], "created" : ISODate("2014-01-26T18:29:28.915Z"), "author" : "whisher", "__v" : 0 }
The result should look like:
"android", "actionscript", "java","js", "php", "scala"
You can $unwind the tags and then $group them back, passing the stages as an array:
db.collection.aggregate([{ $unwind : "$tags" }, {$group:{_id: "$tags"}}]);
The result would be
{ _id: "android"},
{ _id: "actionscript"},
{ _id: "java"},
{ _id: "js"},
{ _id: "php"},
{ _id: "scala"}
Use the distinct command:
> db.test.distinct("tags")
[ "js", "php", "scala", "actionscript", "android", "java" ]
You could use aggregation if you eventually needed something more complex:
> db.test.aggregate([
{ $project: { tags : 1 } },
{ $unwind : "$tags" },
{ $group : { _id: "$tags" } } ]);
Results:
[
{
"_id" : "java"
},
{
"_id" : "actionscript"
},
{
"_id" : "android"
},
{
"_id" : "scala"
},
{
"_id" : "php"
},
{
"_id" : "js"
}
]
I'd use $project to reduce the number of fields being passed through the pipeline, though. In the example above, $project includes only the tags field (plus _id, which is kept by default).
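For comparison, the result that distinct or the $unwind/$group pipeline yields can be sketched in plain JavaScript over documents already loaded in memory. The names distinctTags and posts are hypothetical, the sample keeps only the fields that matter, and the order shown is first-seen order, which the server-side commands do not promise.

```javascript
// Hypothetical in-memory equivalent of collecting distinct tag values.
function distinctTags(docs) {
  const seen = new Set();
  for (const doc of docs) {
    // Like $unwind: visit every element of the tags array (if present).
    for (const tag of doc.tags || []) {
      seen.add(tag); // like $group by tag: duplicates collapse into one entry
    }
  }
  return [...seen];
}

const posts = [
  { slug: "my-post", tags: ["js", "php", "scala"] },
  { slug: "my-post-2", tags: ["android", "actionscript", "java"] },
];

console.log(distinctTags(posts));
// [ 'js', 'php', 'scala', 'android', 'actionscript', 'java' ]
```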