How to iterate through a set to get field value in MongoDB

Can somebody please tell me whether it is possible to iterate through a set to create field values for a key in a MongoDB result? Suppose I have a $facet stage in the pipeline like:
'missing': [{'$group': {'_id': '$foo', 'woo': {'$addToSet': '$wwo'}}},
{'$project': {'missing_woo': {'$setDifference': [woo_set, '$woo']}}}]
I would like to get a result where each missing value becomes the value of the key, like
{'missing_woo': 'missing_woo1'}, {'missing_woo': 'missing_woo2'},... {'missing_woo': 'missing_wooN'}
In other words, I want to iterate through the set generated at $project and create one field value per element.

You can simply use $unwind:
db.collection.aggregate([
{
$facet: {
missing: [
{$group: {_id: "$foo", woo: {$addToSet: "$wwo"}}},
{$project: {_id: 0, missing_woo:
{$setDifference: [
[
"woo1",
"woo2",
"wooN",
"missing_woo1",
"missing_woo2",
"missing_wooN"
],
"$woo"
]
}
}
},
{$unwind: "$missing_woo"}
]
}
}
])
See how it works on the playground example
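To make the effect of the final $unwind stage concrete, here is a minimal sketch in plain JavaScript (no MongoDB involved) that mimics $setDifference followed by $unwind on in-memory arrays. The helpers setDifference and unwind, and the sample values, are hypothetical and only illustrate the semantics; note that MongoDB does not guarantee the element order of a $setDifference result.

```javascript
// Hypothetical helper: elements of `a` that are not in `b`, duplicates dropped
// (similar in spirit to $setDifference).
function setDifference(a, b) {
  const exclude = new Set(b);
  return [...new Set(a)].filter((x) => !exclude.has(x));
}

// Hypothetical helper: one output document per array element (similar to $unwind).
function unwind(doc, field) {
  return doc[field].map((value) => ({ ...doc, [field]: value }));
}

const wooSet = ["woo1", "woo2", "missing_woo1", "missing_woo2"];
const grouped = { woo: ["woo1", "woo2"] }; // stands in for the $group result

const projected = { missing_woo: setDifference(wooSet, grouped.woo) };
const unwound = unwind(projected, "missing_woo");

console.log(unwound);
```

Each element of the difference ends up in its own document under the same key, which is the shape asked for in the question.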

Related

find missing elements from the passed array to mongodb query

for example
animals = ['cat','mat','rat'];
collection contains only 'cat' and 'mat'
I want the query to return 'rat' which is not there in collection..
collection contains
[
{
_id:objectid,
animal:'cat'
},
{
_id:objectid,
animal:'mat'
}
]
db.collection.find({'animal':{$nin:animals}})
(or)
db.collection.find({'animal':{$nin:['cat','mat','rat']}})
EDIT:
One option is:
Use $facet to $group all existing values into a set. Using $facet allows the pipeline to continue even if the collection is empty, as @leoll2 mentioned.
$project with $cond to handle both cases: with or without data.
Find the set difference with $setDifference.
db.collection.aggregate([
{$facet: {data: [{$group: {_id: 0, animals: {$addToSet: "$animal"}}}]}},
{$project: {
data: {
$cond: [{$gt: [{$size: "$data"}, 0]}, {$first: "$data"}, {animals: []}]
}
}},
{$project: {data: "$data.animals"}},
{$project: {_id: 0, missing: {$setDifference: [animals, "$data"]}}}
])
See how it works on the playground example - with data or playground example - without data
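The logic of the steps above can be sketched in plain JavaScript, without a database. This is only an in-memory illustration of what the pipeline computes; the function name findMissing and the sample arrays are assumptions made for the example.

```javascript
// Hypothetical in-memory stand-in for the pipeline above.
function findMissing(wanted, docs) {
  // $group + $addToSet: collect the distinct `animal` values.
  // With no documents the set is simply empty, which is the case
  // the $facet/$cond combination handles in the real pipeline.
  const existing = new Set(docs.map((d) => d.animal));
  // $setDifference: keep the wanted values that are not in the collection.
  return wanted.filter((a) => !existing.has(a));
}

const animals = ["cat", "mat", "rat"];
const docs = [{ animal: "cat" }, { animal: "mat" }];

console.log(findMissing(animals, docs)); // [ 'rat' ]
console.log(findMissing(animals, []));   // [ 'cat', 'mat', 'rat' ]
```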

Scala / MongoDB - removing duplicate

I have seen very similar questions with solutions to this problem, but I am unsure how I would incorporate it in to my own query. I'm programming in Scala and using a MongoDB Aggregates "framework".
val getItems = Seq (
Aggregates.lookup(Store...)...
Aggregates.lookup(Store.STORE_NAME, "relationship.itemID", "uniqueID", "item"),
Aggregates.unwind("$item"),
// filter duplicates here ?
Aggregates.lookup(Store.STORE_NAME, "item.content", "ID", "content"),
Aggregates.unwind("$content"),
Aggregates.project(Projections.fields(Projections.include("store", "item", "content")))
)
The query returns duplicate objects which is undesirable. I would like to remove these. How could I go about incorporating Aggregates.group and "$addToSet" to do this? Or any other reasonable solution would be great too.
Note: I have to omit some details about the query, so the store lookup aggregate is not there. However, I want to remove the duplicates later in the query so it hopefully shouldn't matter.
Please let me know if I need to provide more information.
Thanks.
EDIT: 31/ 07/ 2019: 13:47
I have tried the following:
val getItems = Seq (
Aggregates.lookup(Store...)...
Aggregates.lookup(Store.STORE_NAME, "relationship.itemID", "uniqueID", "item"),
Aggregates.unwind("$item"),
Aggregates.group("$item.itemID",
Accumulators.first("ID", "$ID"),
Accumulators.first("itemName", "$itemName"),
Accumulators.addToSet("item", "$item")),
Aggregates.unwind("$items"),
Aggregates.lookup(Store.STORE_NAME, "item.content", "ID", "content"),
Aggregates.unwind("$content"),
Aggregates.project(Projections.fields(Projections.include("store", "items", "content")))
)
But my query now returns zero results instead of the duplicate result.
You can use $first to remove the duplicates.
Suppose I have the following data:
[
{"_id": 1,"item": "ABC","sizes": ["S","M","L"]},
{"_id": 2,"item": "EFG","sizes": []},
{"_id": 3, "item": "IJK","sizes": "M" },
{"_id": 4,"item": "LMN"},
{"_id": 5,"item": "XYZ","sizes": null}
]
Now, let's aggregate it using $first and $unwind and see the difference:
First let's aggregate it using $first
db.collection.aggregate([
{ $sort: {
item: 1
}
},
{ $group: {
_id: "$item",firstSize: {$first: "$sizes"}}}
])
Output
[
{"_id": "XYZ","firstSize": null},
{"_id": "ABC","firstSize": ["S","M","L" ]},
{"_id": "IJK","firstSize": "M"},
{"_id": "EFG","firstSize": []},
{"_id": "LMN","firstSize": null}
]
Now, Let's aggregate it using $unwind
db.collection.aggregate([
{
$unwind: "$sizes"
}
])
Output
[
{"_id": 1,"item": "ABC","sizes": "S"},
{"_id": 1,"item": "ABC","sizes": "M"},
{"_id": 1,"item": "ABC","sizes": "L"},
{"_id": 3,"item": "IJK","sizes": "M"}
]
You can see that grouping with $first returns one document per item, whereas $unwind on its own emits one document per array element.
Using $unwind and $first together:
db.collection.aggregate([
{ $unwind: "$sizes"},
{
$group: {
_id: "$item",firstSize: {$first: "$sizes"}}
}
])
Output
[
{"_id": "IJK", "firstSize": "M"},
{"_id": "ABC","firstSize": "S"}
]
Grouping and then using $addToSet is an effective way to deal with your problem.
In the mongo shell it looks like this:
db.sales.aggregate(
[
{
$group:
{
_id: { day: { $dayOfYear: "$date"}, year: { $year: "$date" } },
itemsSold: { $addToSet: "$item" }
}
}
]
)
In Scala you can do it like this:
Aggregates.group("$groupfield", Accumulators.addToSet("fieldName","$expression"))
If you have multiple fields to group by:
Aggregates.group(new BasicDBObject().append("fieldAname","$fieldA").append("fieldBname","$fieldB"), Accumulators.addToSet("fieldName","$expression"))
Then unwind the resulting set.
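As a rough illustration of why grouping with $addToSet and then unwinding removes duplicates, here is a small in-memory sketch in plain JavaScript. The function groupAddToSetThenUnwind and the sample rows are hypothetical, and JSON.stringify is used as a simplified stand-in for MongoDB's own value comparison.

```javascript
// Emulates $group with $addToSet, followed by $unwind, on in-memory documents.
function groupAddToSetThenUnwind(docs, groupField, setField) {
  const groups = new Map(); // group key -> Map(serialized value -> value)
  for (const doc of docs) {
    const key = doc[groupField];
    if (!groups.has(key)) groups.set(key, new Map());
    // Equal values serialize to the same string, so duplicates collapse here.
    groups.get(key).set(JSON.stringify(doc[setField]), doc[setField]);
  }
  const out = [];
  for (const [key, values] of groups) {
    // $unwind: one output document per distinct value in the set.
    for (const value of values.values()) {
      out.push({ _id: key, [setField]: value });
    }
  }
  return out;
}

const rows = [
  { store: "A", item: { itemID: 1 } },
  { store: "A", item: { itemID: 1 } }, // duplicate of the first row
  { store: "A", item: { itemID: 2 } },
];

const deduped = groupAddToSetThenUnwind(rows, "store", "item");
console.log(deduped.length); // 2
```

The field name passed to the unwind step has to match the name given to the accumulator, otherwise the unwind stage has nothing to expand.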

Mongodb 3.2 and 3.0 $unwind aggregation

I have created a query and checked it in Robomongo, and it works fine for me in MongoDB 3.2:
db.post.aggregate([
{$unwind: {path: "$page_groups", preserveNullAndEmptyArrays: true}},
{$group: {_id: "$page_groups",
page_names: {$addToSet: "$page_name"}}},
])
Unfortunately, I need to get the same data in MongoDB 3.0.
Can anyone tell me how to get the data with empty arrays in MongoDB 3.0 and get the results by array key?
Without $unwind I get objects where pages have two or more groups, and I don't need that.
Thank you for the answer. I wanted to use $project at first, but I think I have found an easier way using $match and the array $size to ignore results where the array has more than one element:
db.post_summary.aggregate([
{$match: {$or:
[{page_groups: {$size: 1}}, {page_groups: {$size: 0}}]}},
{$group: {
_id: "$page_groups",
page_names: { "$addToSet": "$page_name" }
}},
])
In my case "page_groups" have this structure:
page_groups:[
0 =>[_id, group_name]
1 =>[_id, group_name]
]
To mimic the preserveNullAndEmptyArrays $unwind option from 3.2 in a 3.0 aggregation pipeline, add an initial $project stage that substitutes a placeholder array when the field is null or missing (using the $ifNull operator), and then unwind the projected field. Note that $ifNull does not replace an empty array:
var pipeline = [
{
"$project": {
"pg": {
"$ifNull": [
"$page_groups",
["Unspecified"]
]
},
"page_name": 1
}
},
{ "$unwind": "$pg" },
{
"$group": {
"_id": "$pg",
"page_names": { "$addToSet": "$page_name" }
}
}
];
db.collection.aggregate(pipeline);
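The behaviour of this workaround can be sketched in plain JavaScript on in-memory data. The function groupPageNames and the sample posts are hypothetical; the sketch assumes simple string group values rather than the nested structure from the question.

```javascript
// In-memory stand-in for: $project with $ifNull, then $unwind, then $group with $addToSet.
function groupPageNames(posts) {
  const byGroup = new Map(); // group value -> Set of page names
  for (const post of posts) {
    // $ifNull: use a placeholder array when page_groups is null or missing.
    const groups = post.page_groups == null ? ["Unspecified"] : post.page_groups;
    // $unwind: one iteration per array element; an empty array contributes nothing.
    for (const group of groups) {
      if (!byGroup.has(group)) byGroup.set(group, new Set());
      byGroup.get(group).add(post.page_name); // $addToSet
    }
  }
  return byGroup;
}

const samplePosts = [
  { page_name: "a", page_groups: ["g1"] },
  { page_name: "b" },                    // missing field
  { page_name: "c", page_groups: null }, // null field
  { page_name: "d", page_groups: [] },   // empty array
];

const pageNamesByGroup = groupPageNames(samplePosts);
console.log([...pageNamesByGroup.keys()]);
```

In this sketch the post with an empty array still disappears, so a pipeline that must keep such documents would need an extra condition for the empty case.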

mongodb find matches based on count aggregation

I have a mongodb collection like this:
{"uid": "01370mask4",
"title": "hidden",
"post": "hidden",
"postTime": "01-23, 2016",
"unixPostTime": "1453538601",
"upvote": [2, 3]}
and I'd like to select post records from the users with more than 5 posts. The structure should stay the same; I just don't need the documents from users who don't have many posts.
db.collection.aggregate(
[
{ $group : { _id : "$uid", count: { $sum: 1 } } }
]
)
Now I'm stuck at how to use the count values to find. I searched but didn't find any methods to add the count values back to the same collection by uid. Saving the aggregation output and joining them together seems not supported by mongodb. Please advise, thanks!
Update:
Sorry that I didn't make it clear earlier. Thanks for your prompt answers! I want a subset of the original collection, with post text, post timestamp, etc. I don't want a subset of the aggregation output.
If there aren't millions of documents, you can take a shortcut and achieve this with one aggregate query followed by a find query.
Aggregate query:
var users = db.collection.aggregate(
[
{$group:{_id:'$uid', count:{$sum:1}}},
{$match:{count:{$gt:5}}},
{$group:{_id:null,users:{$push:'$_id'}}}
]
).toArray()[0]['users']
Then it's a straightforward query to find the posts of those users:
db.collection.find({uid: {$in: users}})
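The two-step approach above can be illustrated with a small in-memory sketch in plain JavaScript. The function postsFromActiveUsers, its threshold parameter, and the sample data are assumptions for the example, not MongoDB API.

```javascript
// Step 1 mirrors the aggregate ($group, $match); step 2 mirrors find with $in.
function postsFromActiveUsers(posts, threshold) {
  const counts = new Map();
  for (const p of posts) {
    counts.set(p.uid, (counts.get(p.uid) || 0) + 1); // $group with $sum: 1
  }
  const users = new Set(
    [...counts].filter(([, n]) => n > threshold).map(([uid]) => uid) // $match: count > threshold
  );
  return posts.filter((p) => users.has(p.uid)); // find({uid: {$in: users}})
}

// Hypothetical sample: u1 has six posts, u2 has two.
const samplePostsByUser = [
  ...Array.from({ length: 6 }, (_, i) => ({ uid: "u1", title: "post " + i })),
  { uid: "u2", title: "x" },
  { uid: "u2", title: "y" },
];

const selected = postsFromActiveUsers(samplePostsByUser, 5);
console.log(selected.length); // 6
```

The returned documents are the original posts, unchanged, which is what distinguishes this from filtering the aggregation output.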
Just add a $match stage after your $group with the right condition:
db.collection.aggregate(
[
{ $group : { _id : "$uid", count: { $sum: 1 } } },
{ $match : { count : { $gt : 5 } } }
]
)
Try this to select users with more than 5 posts. To keep the original fields, add each of them with $first as below; this is adequate only when every uid has a single value for those fields.
db.collection.aggregate([
{$group: {
_id: '$uid',
title: {$first: '$title'},
post: {$first:'$post'},
postTime:{$first: '$postTime'},
unixPostTime:{$first: '$unixPostTime'},
upvote:{$first: '$upvote'},
count: {$sum: 1}
}},
{$match: {count: {$gt: 5}}}
])
If there are multiple values for the same uid, you should $push them into an array in the $group.
If you want to save the result to the database, try the following:
var cur = db.collection.aggregate(
[
{$group: {
_id: '$uid',
title: {$first: '$title'},
post: {$first:'$post'},
postTime:{$first: '$postTime'},
unixPostTime:{$first: '$unixPostTime'},
upvote:{$first: '$upvote'},
count: {$sum: 1}
}},
{$match: {count: {$gt: 5}}}
]
)
cur.forEach(function(doc) {
db.collection.update({uid: doc._id}, {/* the fields to update; doc._id holds the uid after $group */});
});

How to aggregate queries in mongodb

I have a document collection that look like the following:
{
name : "tester"
, activity: [
{
gear: "glasses"
, where: "outside"
}
, {
gear: "hat"
, where: "inside"
}
, {
gear: "glasses"
, where: "car"
}
]
}
How do I query the collection to return only documents with multiple activities that contain the value of "gear":"glasses"?
Thanks!
I think it's possible to do this without the aggregation framework, if you need the full document filtered by your condition:
db.collection.find({
"activity": {$elemMatch: {gear:"glasses"}},
"activity.1" : {$exists: 1}
})
This is going to be ugly with aggregation framework, but it can be done:
db.collection.aggregate(
{$match: {"activity.gear": "glasses"}},
{$unwind: "$activity"},
{$group: {
_id: {_id: "$_id", name: "$name"},
_count: {$sum: {$cond: [{$eq: ["glasses", "$activity.gear"]}, 1, 0]}}
}},
{$match: {_count: {$gt: 1}}}
)
When analyzing the above query, I would recommend walking through it step by step. Start with just the "$match", then the "$match" and "$unwind", and so on. You will see how each step works.
The response is not the full document. If you are looking for the full document, include a $project step that passes through a dummy activity, and reconstruct the full document on the output.
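What the aggregation counts can be shown with a minimal plain JavaScript sketch on in-memory documents. The function docsWithRepeatedGear and the sample documents are hypothetical; unlike the pipeline, this sketch returns the full documents directly.

```javascript
// Keep documents whose activity array contains the given gear more than once.
// This mirrors the $unwind + $group + $cond counting, followed by $match on the count.
function docsWithRepeatedGear(docs, gear) {
  return docs.filter((doc) => {
    const activities = doc.activity || [];
    const count = activities.filter((a) => a.gear === gear).length;
    return count > 1;
  });
}

const people = [
  {
    name: "tester",
    activity: [
      { gear: "glasses", where: "outside" },
      { gear: "hat", where: "inside" },
      { gear: "glasses", where: "car" },
    ],
  },
  { name: "other", activity: [{ gear: "glasses", where: "home" }] },
];

const matches = docsWithRepeatedGear(people, "glasses");
console.log(matches.map((d) => d.name)); // [ 'tester' ]
```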
You can also try this:
db.collection.find( { activity: { $elemMatch: { gear: "glasses" } } } )