Get distinct values from a MongoDB array with Spring Boot - mongodb

I want to query a MongoDB collection in which each document has a values field (a list) that contains duplicated elements. How can I get only the unique values for each document?
My Collection:
{
"_id" : ObjectId("5d4b3d44101cc8e202d30eaf"),
"name" : "motorManufacturer",
"values" : [
"LEAM",
"Bico",
"Bico",
"LEAM"
]
}
{
"_id" : ObjectId("5d4b3d44101cc8e202d30eaf"),
"name" : "motorManufacturer",
"values" : [
"NOV",
"NOV",
"SLB",
"SLB",
"SD"
]
}
I have this collection. I want only the unique data from values, using Spring Boot.
Expected result:
{
"name" : "motorManufacturer",
"values" : [
"Bico",
"LEAM"
]
}
{
"name" : "motorManufacturer",
"values" : [
"NOV",
"SLB",
"SD"
]
}

Use $setUnion to remove duplicates.
db.collection.aggregate([
{
$project: {
"name": 1,
"values": {
$setUnion: ["$values"]
}
}
}
])
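As a rough illustration of what the $setUnion projection above computes for each document, here is a minimal plain-JavaScript sketch that runs without a database. The helper name dedupeValues is hypothetical, and the sort is only there to make the printed result deterministic; MongoDB itself leaves the order of elements in a $setUnion result unspecified.

```javascript
// Sample documents copied from the question (without _id).
const manufacturerDocs = [
  { name: "motorManufacturer", values: ["LEAM", "Bico", "Bico", "LEAM"] },
  { name: "motorManufacturer", values: ["NOV", "NOV", "SLB", "SLB", "SD"] },
];

// Hypothetical helper: keeps "name" and removes duplicates from "values".
function dedupeValues(doc) {
  // A Set keeps one copy of each value; sort() gives a stable order for display.
  return { name: doc.name, values: [...new Set(doc.values)].sort() };
}

const dedupedDocs = manufacturerDocs.map(dedupeValues);
console.log(JSON.stringify(dedupedDocs, null, 2));
```

Each document is processed independently, which is why a single $project stage is enough and no $unwind or $group is needed.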
In spring-data-mongodb, this can be written as,
ProjectionOperation projectStage = Aggregation.project().andInclude("name")
.and(SetOperators.SetUnion.arrayAsSet("values")).as("values");
mongoTemplate.aggregate(Aggregation.newAggregation(projectStage),
<your collection class>.class,
<your collection class>.class).getMappedResults();

Related

Update inner array in multiple array document

I am trying to update an inner array in the document below, and I don't want to use the "_id" : "12" field of the outer array element.
{
"_id" : ObjectId("62622dd73905f04f59db2971"),
"array1" : [
{
"_id" : "12",
"array2" : [
{
"_id" : "123",
"answeredBy" : []
},
{
"_id" : "124",
"answeredBy" : []
}
]
}
]
}
I am trying to update using the following query
db.getCollection('nestedArray').updateMany(
{'array1.array2._id':'123'},
{$push:{'array1.array2.$[inner].answeredBy':'success'}},
{arrayFilters:[{'inner._id':'123'}]}
)
But I am getting the following error:
"errmsg" : "The path 'array1.array2' must exist in the document in
order to apply array updates.",
I am just trying to understand what is wrong with the code.
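Add $[] between array1 and array2. Because array1 is itself an array, the path needs a positional operator for it before it can reach array2.
See "Update Nested Arrays in Conjunction with $[]" in the MongoDB documentation.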
db.collection.update({
"array1.array2._id": "123"
},
{
$push: {
"array1.$[].array2.$[inner].answeredBy": "success"
}
},
{
arrayFilters: [
{
"inner._id": "123"
}
]
})
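To make the path "array1.$[].array2.$[inner].answeredBy" more concrete, here is a small plain-JavaScript sketch of the traversal it describes, applied to an in-memory copy of the document from the question. The function pushToMatchingInner is a hypothetical helper written for illustration; it is not a MongoDB API.

```javascript
// In-memory copy of the document from the question (top-level _id omitted).
const nestedDoc = {
  array1: [
    {
      _id: "12",
      array2: [
        { _id: "123", answeredBy: [] },
        { _id: "124", answeredBy: [] },
      ],
    },
  ],
};

// Hypothetical helper mirroring "array1.$[].array2.$[inner].answeredBy".
function pushToMatchingInner(document, innerId, value) {
  for (const outer of document.array1) {   // $[]      : every element of array1
    for (const inner of outer.array2) {    // $[inner] : elements that pass the array filter
      if (inner._id === innerId) {         // arrayFilters: [{ "inner._id": innerId }]
        inner.answeredBy.push(value);      // $push
      }
    }
  }
  return document;
}

pushToMatchingInner(nestedDoc, "123", "success");
console.log(JSON.stringify(nestedDoc.array1[0].array2));
```

The outer loop is the part the original query was missing: without $[], the update has no way to step through the elements of array1.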

MongoDB - Fetch an object from deep subdocuments

How to query an object from an Array inside an Array, and get it as a top-level object? For example, consider the following record.
{
"subjects": [
{
"name": "English",
"teachers": [
{
"name": "Mark" /* Trying to get this object*/
},
{
"name": "John"
}
]
}
]
}
I am trying to get the following object out as the top-level object.
{
"name": "Mark"
}
You need to use the aggregation framework to do exactly what you're asking for.
Here I entered the document you gave into collection: foo.
> db.foo.find().pretty()
{
"_id" : ObjectId("57ceed3d31484d5b491eaae9"),
"subjects" : [
{
"name" : "English",
"teachers" : [
{
"name" : "Mark"
},
{
"name" : "John"
}
]
}
]
}
We use $unwind to unravel the subjects array as the first stage of the aggregation pipeline:
> db.foo.aggregate([
... {$unwind: "$subjects"}
... ]).pretty()
{
"_id" : ObjectId("57ceed3d31484d5b491eaae9"),
"subjects" : {
"name" : "English",
"teachers" : [
{
"name" : "Mark"
},
{
"name" : "John"
}
]
}
}
Subjects was an array of length 1 so the only difference here is one less set of [] array brackets.
We need to unwind again.
> db.foo.aggregate([
... {$unwind: "$subjects"},
... {$unwind: "$subjects.teachers"}
... ]).pretty()
{
"_id" : ObjectId("57ceed3d31484d5b491eaae9"),
"subjects" : {
"name" : "English",
"teachers" : {
"name" : "Mark"
}
}
}
{
"_id" : ObjectId("57ceed3d31484d5b491eaae9"),
"subjects" : {
"name" : "English",
"teachers" : {
"name" : "John"
}
}
}
Now we turned our array of length '2' into two separate documents. The first one with subjects.teachers.name = Mark and the second with subjects.teachers.name = John.
We only want to return the case where name = Mark so we need to add a $match stage to our pipeline.
> db.foo.aggregate([
... {$unwind: "$subjects"},
... {$unwind: "$subjects.teachers"},
... {$match: {"subjects.teachers.name": "Mark"}}
... ]).pretty()
{
"_id" : ObjectId("57ceed3d31484d5b491eaae9"),
"subjects" : {
"name" : "English",
"teachers" : {
"name" : "Mark"
}
}
}
OK! Now we are only matching the case where name is Mark.
Let's add a $project stage to shape the output how we want.
> db.foo.aggregate([
... {$unwind: "$subjects"},
... {$unwind: "$subjects.teachers"},
... {$match: {"subjects.teachers.name": "Mark"}},
... {$project: {"name": "$subjects.teachers.name", "_id": 0}}
... ]).pretty()
{ "name" : "Mark" }
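The four stages above can be mirrored step by step in plain JavaScript, which may help when reasoning about what each stage passes on to the next. This is only a sketch on an in-memory copy of the sample document; the variable names are made up for the example.

```javascript
// In-memory copy of the sample document (without _id).
const fooDocs = [
  {
    subjects: [
      { name: "English", teachers: [{ name: "Mark" }, { name: "John" }] },
    ],
  },
];

const markOnly = fooDocs
  .flatMap((doc) => doc.subjects)                 // $unwind: "$subjects"
  .flatMap((subject) => subject.teachers)         // $unwind: "$subjects.teachers"
  .filter((teacher) => teacher.name === "Mark")   // $match on the teacher name
  .map((teacher) => ({ name: teacher.name }));    // $project: keep only "name"

console.log(JSON.stringify(markOnly));
```

Each flatMap removes one level of array nesting, which is the same effect the two $unwind stages have in the pipeline.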

Get unique values from arrays per record in Mongodb

I have a collection in MongoDB that looks like this:
{
"_id" : ObjectId("56d3e53b965b57e4d1eb3e71"),
"name" : "John",
"posts" : [
{
"topic" : "Harry Potter",
"obj_ids" : [
"1234"
],
"dates_posted" : [
"2014-12-24"
]
},
{
"topic" : "Daniel Radcliffe",
"obj_ids" : [
"1235",
"1236",
"1237"
],
"dates_posted" : [
"2014-12-22",
"2015-01-13",
"2014-12-24"
]
}
]
},
{
"_id" : ObjectId("56d3e53b965b57e4d1eb3e72"),
"name" : "Jane",
"posts" : [
{
"topic" : "Eragon",
"tweet_ids" : [
"1672",
"1673",
"1674"
],
"dates_posted" : [
"2014-12-27",
"2014-11-16"
]
}
]
}
How could I query to get a result like:
{
"name": "John",
"dates": ["2014-12-24", "2014-12-22", "2015-01-13"]
},
{
"name": "Jane",
"dates" : ["2014-12-27", "2014-11-16"]
}
I need the dates to be unique, as "2014-12-24" appears in both elements of "posts" but I need only the one.
I tried doing db.collection.aggregate([{$unwind: "$posts"}, {$group:{_id:"$posts.dates_posted"}}]) and that gave me results like this:
{ "_id" : [ "2014-12-24", "2014-12-22", "2015-01-13", "2014-12-24" ] }
{ "_id" : [ "2014-12-27", "2014-11-16" ] }
How can I remove the duplicates and also get the name corresponding to the dates?
You would need to use the $addToSet operator to keep only unique values. One way of doing it would be to:
Unwind posts.
Unwind "posts.dates_posted", so that the array gets flattened and each value can be aggregated in the group stage.
Then group by _id and accumulate the unique values of the date field, along with name.
Code:
db.collection.aggregate([
{
$unwind:"$posts"
},
{
$unwind:"$posts.dates_posted"
},
{
$group:
{
"_id":"$_id",
"dates":{$addToSet:"$posts.dates_posted"},
"name":{$first:"$name"}
}
},
{
$project:
{
"name":1,
"dates":1,
"_id":0
}
}
])
The drawback of this approach is that it uses two $unwind stages, which is quite costly: each one multiplies the number of documents passed to the subsequent stages by n, where n is the number of values in the array being flattened.
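For readers who want to see the intended result without running a server, the following plain-JavaScript sketch computes the same per-user set of dates on an in-memory copy of the sample data (the obj_ids and tweet_ids fields are left out because they do not affect the result). The helper uniqueDates is hypothetical. Note that a JavaScript Set keeps insertion order, whereas $addToSet does not guarantee any particular order.

```javascript
// Trimmed copy of the sample data from the question.
const users = [
  {
    name: "John",
    posts: [
      { topic: "Harry Potter", dates_posted: ["2014-12-24"] },
      { topic: "Daniel Radcliffe", dates_posted: ["2014-12-22", "2015-01-13", "2014-12-24"] },
    ],
  },
  {
    name: "Jane",
    posts: [{ topic: "Eragon", dates_posted: ["2014-12-27", "2014-11-16"] }],
  },
];

// Hypothetical helper: flattens all dates of one user and removes duplicates.
function uniqueDates(user) {
  const allDates = user.posts.flatMap((post) => post.dates_posted); // both $unwind stages
  return { name: user.name, dates: [...new Set(allDates)] };        // $group with $addToSet
}

const datesPerUser = users.map(uniqueDates);
console.log(JSON.stringify(datesPerUser, null, 2));
```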

List of userids without duplicates in mongodb [duplicate]

I'm trying to learn MongoDB and how it'd be useful for analytics for me. I'm simply playing around with the JavaScript console available on their website and have created the following items:
{"title": "Cool", "_id": {"$oid": "503e4dc0cc93742e0d0ccad3"}, "tags": ["twenty", "sixty"]}
{"title": "Other", "_id": {"$oid": "503e4e5bcc93742e0d0ccad4"}, "tags": ["ten", "thirty"]}
{"title": "Ouch", "_id": {"$oid": "503e4e72cc93742e0d0ccad5"}, "tags": ["twenty", "seventy"]}
{"title": "Final", "_id": {"$oid": "503e4e72cc93742e0d0ccad6"}, "tags": ["sixty", "seventy"]}
What I'd like to do is query so I get a list of unique tags for all of these objects. The result should look something like this:
["ten", "twenty", "thirty", "sixty", "seventy"]
How do I query for this? I'm trying to distinct() it, but the call always fails without even querying.
The code that fails on their website works on an actual MongoDB instance:
> db.posts.insert({title: "Hello", tags: ["one", "five"]});
> db.posts.insert({title: "World", tags: ["one", "three"]});
> db.posts.distinct("tags");
[ "one", "three", "five"]
Weird.
You can use the aggregation framework. Depending on how you'd like the results structured, you can use either
var pipeline = [
{"$unwind": "$tags" } ,
{ "$group": { _id: "$tags" } }
];
R = db.tb.aggregate( pipeline );
printjson(R);
{
"result" : [
{
"_id" : "seventy"
},
{
"_id" : "ten"
},
{
"_id" : "sixty"
},
{
"_id" : "thirty"
},
{
"_id" : "twenty"
}
],
"ok" : 1
}
or
var pipeline = [
{"$unwind": "$tags" } ,
{ "$group":
{ _id: null, tags: {"$addToSet": "$tags" } }
}
];
R = db.tb.aggregate( pipeline );
printjson(R);
{
"result" : [
{
"_id" : null,
"tags" : [
"seventy",
"ten",
"sixty",
"thirty",
"twenty"
]
}
],
"ok" : 1
}
You should be able to use this:
db.mycollection.distinct("tags").sort()
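As a rough picture of what distinct on an array field followed by sort() returns, here is a plain-JavaScript sketch over an in-memory copy of the four documents from the question. It only imitates the behaviour: when the field holds an array, distinct treats each array element as a separate value.

```javascript
// In-memory copy of the documents from the question (without _id).
const tagPosts = [
  { title: "Cool", tags: ["twenty", "sixty"] },
  { title: "Other", tags: ["ten", "thirty"] },
  { title: "Ouch", tags: ["twenty", "seventy"] },
  { title: "Final", tags: ["sixty", "seventy"] },
];

// Flatten every tags array, drop duplicates, then sort alphabetically.
const distinctTags = [...new Set(tagPosts.flatMap((post) => post.tags))].sort();

console.log(JSON.stringify(distinctTags));
```

The sort here is alphabetical, so the tags do not come out in numeric order as in the example result given in the question.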
Another way of getting unique array elements, using the aggregation pipeline:
db.blogs.aggregate(
[
{$group:{_id : null, uniqueTags : {$push : "$tags"}}},
{$project:{
_id : 0,
uniqueTags : {
$reduce : {
input : "$uniqueTags",
initialValue :[],
in : {$let : {
vars : {elem : { $concatArrays : ["$$this", "$$value"] }},
in : {$setUnion : "$$elem"}
}}
}
}
}}
]
)
collection
> db.blogs.find()
{ "_id" : ObjectId("5a6d53faca11d88f428a2999"), "name" : "sdfdef", "tags" : [ "abc", "def", "efg", "abc" ] }
{ "_id" : ObjectId("5a6d5434ca11d88f428a299a"), "name" : "abcdef", "tags" : [ "abc", "ijk", "lmo", "zyx" ] }
>
pipeline
> db.blogs.aggregate(
... [
... {$group:{_id : null, uniqueTags : {$push : "$tags"}}},
... {$project:{
... _id : 0,
... uniqueTags : {
... $reduce : {
... input : "$uniqueTags",
... initialValue :[],
... in : {$let : {
... vars : {elem : { $concatArrays : ["$$this", "$$value"] }},
... in : {$setUnion : "$$elem"}
... }}
... }
... }
... }}
... ]
... )
result
{ "uniqueTags" : [ "abc", "def", "efg", "ijk", "lmo", "zyx" ] }
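The $group and $reduce combination above can be followed more easily in plain JavaScript. The sketch below works on an in-memory copy of the two blog documents; the variable names are made up for the example, and the sort only stands in for the fact that the shell output above happens to be ordered.

```javascript
// In-memory copy of the two documents in the blogs collection (without _id).
const blogs = [
  { name: "sdfdef", tags: ["abc", "def", "efg", "abc"] },
  { name: "abcdef", tags: ["abc", "ijk", "lmo", "zyx"] },
];

// $group with $push: one array that contains every tags array.
const pushedTags = blogs.map((blog) => blog.tags);

// $reduce: concatenate the current array with the accumulated value,
// then remove duplicates (the $concatArrays + $setUnion step).
const uniqueTags = pushedTags.reduce(
  (value, current) => [...new Set([...current, ...value])].sort(),
  [] // initialValue
);

console.log(JSON.stringify(uniqueTags));
```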
There are a couple of web Mongo consoles available:
http://try.mongodb.org/
http://www.mongodb.org/#
But if you type help in them you will realise they only support a very small number of ops:
HELP
Note: Only a subset of MongoDB's features are provided here.
For everything else, download and install at mongodb.org.
db.foo.help() help on collection method
db.foo.find() list objects in collection foo
db.foo.save({a: 1}) save a document to collection foo
db.foo.update({a: 1}, {a: 2}) update document where a == 1
db.foo.find({a: 1}) list objects in foo where a == 1
it use to further iterate over a cursor
As such, distinct does not work in those consoles because it is not supported there.

MongoDB query result returns an array of arrays of dictionaries. I need only an array of dictionaries

I need the value of the "browser" key in my result to be an array of dictionaries. My current query returns an array of arrays of dictionaries which is not what I want.
How would I change/modify my query to get the value of the "browser" key to be just an array of dictionaries?
Is there a better way to pass the browser data from the first group to the second group in my aggregate query?
Below is my data, my query, and my current result:
Data Format:
{
"_id" : ObjectId("52f11293ed50ed92d0324755"),
"major" : "26",
"site_domain" : "www.google.com",
"user_id" : "34850348039485093455445434",
"timestamp" : "1390953411",
"browser_name" : "Firefox",
}
Query:
db.collection.aggregate({$group:{_id: {user_id:"$user_id", site_domain:"$site_domain"}, browser: {$addToSet:{name:"$browser_name", type:"$major"}}, browsing_history: {$addToSet:"$timestamp"}}},
{$group:{_id: {user_id:"$_id.user_id"}, browser:{$addToSet:"$browser"}, sites_visited:{$addToSet:{ site:"$_id.site_domain", times:"$browsing_history"}}}});
Result:
{
"_id" : {
"user_id" : "ab93680ffb1b9c2"
},
"browser" : [
[
{
"name" : "Firefox",
"type" : "20"
}
]
],
"sites_visited" : [
{
"site" : "google.com",
"times" : [
[
"20140201105126",
"1167637060"
]
]
}
]
}
My ideal result for the "browser" key would be:
"browser" : [
{
"name" : "Firefox",
"type" : "20"
}
]
You got most of the way there. Just add an $unwind between the two $group stages. Remember that otherwise you are adding a whole array to another array.
db.collection.aggregate([
{$group:{
_id: {user_id:"$user_id", site_domain:"$site_domain"},
browser: {$addToSet:{name:"$browser_name", type:"$major"}},
browsing_history: {$addToSet:"$timestamp"}
}},
{$unwind: "$browser"}, // de-normalize before next group
{$group:{
_id: {user_id:"$_id.user_id"},
browser:{$addToSet:"$browser"},
sites_visited:
{$addToSet:{ site:"$_id.site_domain", times:"$browsing_history"}}
}}
]);
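To see why the nesting appears and how $unwind removes it, here is a plain-JavaScript sketch. The firstStage array is hypothetical data standing in for the output of the first $group stage (it is not taken from the question), and the deduplication by JSON string only approximates how $addToSet compares embedded documents.

```javascript
// Hypothetical output of the first $group stage: one entry per (user, site) pair.
const firstStage = [
  { _id: { user_id: "u1", site_domain: "google.com" },
    browser: [{ name: "Firefox", type: "20" }] },
  { _id: { user_id: "u1", site_domain: "example.com" },
    browser: [{ name: "Firefox", type: "20" }, { name: "Chrome", type: "32" }] },
];

// Without $unwind: each browser array is added as a single element,
// so the result is an array of arrays.
const nestedBrowsers = firstStage.map((group) => group.browser);

// With $unwind: flatten first, then collect the unique browser documents.
const flatBrowsers = [];
const seenBrowsers = new Set();
for (const browser of firstStage.flatMap((group) => group.browser)) {
  const key = JSON.stringify(browser);
  if (!seenBrowsers.has(key)) {
    seenBrowsers.add(key);
    flatBrowsers.push(browser);
  }
}

console.log(JSON.stringify(nestedBrowsers));
console.log(JSON.stringify(flatBrowsers));
```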