I have something like below:
{
"_id" : "1",
"firstArray" : [
{
"_id" : "11",
"secondArray" : [ ]
},
{
"_id" : "12",
"secondArray" : [ ]
},
{
"_id" : "13",
"secondArray" : [ { "type" : "somthing" } ]
}
]
},
{
"_id" : "2",
"firstArray" : [
{
"_id" : "21",
"secondArray" : [ ]
},
{
"_id" : "22",
"secondArray" : [ ]
}
]
}
I need a mongodb query to find documents which ALL of the nested secondArrays are empty? the query should return second document and not the first one.
to solve that, we need to check size of arr2, but to enable that we need first to unwind arr1.
Please find below aggregation framework snippet which solves this problem,
db.pmoubed.aggregate([{
$unwind : "$firstArray"
}, {
$project : {
_id : 1,
firstArray : 1,
isNotEmpty : {
$size : "$firstArray.secondArray"
}
}
}, {
$group : {
_id : "$_id",
isNotEmpty : {
$sum : "$isNotEmpty"
},
firstArray : {
$push : "$firstArray"
}
}
}, {
$match : {
"isNotEmpty" : 0
}
}
])
Any comments welcome
Related
I have two JSONs in a collection in mongodb and would like to write a bson.M filter to fetch the first JSON below.
I tried with the filter below to get the first JSON but got no result.
When the first JSON is in the collection, I get result when i use the filter
but when i have both JSONs, I do not get a result. Need help.
filter := bson.M{"type": "FPF", "status": "REGISTERED","fpfInfo.fpfInfoList.ssai.st": 1, "fpfInfo.fpfInfoList.infoList.dn": "sim"}
{
"_id" : "47f6ad68-d431-4b69-9899-f33d828f8f9c",
"type" : "FPF",
"status" : "REGISTERED",
"fpfInfo" : {
"fpfInfoList" : [
{
"ssai" : {
"st" : 1
},
"infoList" : [
{
"dn" : "sim"
}
]
}
]
}
},
{
"_id" : "347c8ed2-d9d1-4f1a-9672-7e8a232d2bf8",
"type" : "FPF",
"status" : "REGISTERED",
"fpfInfo" : {
"fpfInfoList" : [
{
"ssai" : {
"st" : 1,
"ds" : "000004"
},
"infoList" : [
{
"dn" : "sim"
}
]
}
]
}
}
db.collection.aggregate([
{
"$unwind": "$fpfInfo.fpfInfoList"
},
{
"$match": {
"fpfInfo.fpfInfoList.ssai.ds": {
"$exists": false
},
"fpfInfo.fpfInfoList.infoList.dn": "sim",
"fpfInfo.fpfInfoList.ssai.st": 1
}
}
])
Playground
Lets say I have this kind of collection
{
"_id" :"A",
"title" : "TITLE1",
"brand" : [
{
"brand_id" : "B",
"varients" : [
{
"name" : "RED ",
"price" : 5.0
}
]
},
{
"brand_id" : "C",
"varients" : [
{
"name" : "GREEN",
"price" : 5.0
}
]
},
{
"brand_id" : "D",
"varients" : [
{
"name" : "Others",
"price" : 0.0
}
]
}
],
}
I then want to select one and ONLY the nested data of variants. Have tried with the following statement without any success.
db.testing.findOne( {_id: "A", "brand.brand_id" : 'D'} )
Expected output
"varients" : [
{
"name" : "Others",
"price" : 0.0
}
]
Using findOne you can't get the subset or nest content of the document in the response, but yes using aggregation you can get it in a way you want it
Check out this aggregation pipeline:
[
{
'$match': {
'_id': 'A'
}
}, {
'$unwind': {
'path': '$brand'
}
}, {
'$match': {
'brand.brand_id': 'D'
}
}, {
'$project': {
'varients': '$brand.varients',
'_id': 0
}
}
]
I am trying to use Mongo Aggregation Framework to find out intersection between an array inside my document AND another user defined array.
I don't get a correct result and my guess is its because of the fact that I have array inside of an array.
Here is my data set.
My documents:
{
"_id" : 1,
"pendingEntries" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65f"),
"tags" : [
{
"tagKey" : "owner",
"tagValue" : "john"
},
{
"tagKey" : "ErrorCode",
"tagValue" : "7001"
},
{
"tagKey" : "ErrorDescription",
"tagValue" : "error123"
}
],
"entryTime" : ISODate("2016-04-04T00:26:43.167Z")
}
]
},
/* 1 */
{
"_id" : 2,
"pendingEntries" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65d"),
"tags" : [
{
"tagKey" : "owner",
"tagValue" : "peter"
},
{
"tagKey" : "ErrorCode",
"tagValue" : "6001"
},
{
"tagKey" : "JIRA",
"tagValue" : "Oabc-123"
}
],
"entryTime" : ISODate("2016-04-04T00:26:43.167Z")
}
]
},
/* 2 */
{
"_id" : 3,
"pendingEntries" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65c"),
"tags" : [
{
"tagKey" : "owner",
"tagValue" : "abc"
},
{
"tagKey" : "ErrorCode",
"tagValue" : "6001"
},
{
"tagKey" : "JIRA",
"tagValue" : "abc-123"
}
],
"entryTime" : ISODate("2016-04-04T00:26:43.167Z")
}
]
}
My Query:
db.entrylike.aggregate(
[
{ $project: { "pendingEntries.entryID": 1, "common": { $setIntersection: [ "$pendingEntries.tags", [{ "tagKey" : "ErrorCode", "tagValue" : "7001" }] ] } } }
]
)
Result:
{
"result" : [
{
"_id" : 1,
"pendingEntries" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65f")
}
],
"common" : []
},
{
"_id" : 2,
"pendingEntries" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65d")
}
],
"common" : []
},
{
"_id" : 3,
"pendingEntries" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65c")
}
],
"common" : []
}
],
"ok" : 1
}
I am not expecting first common field to be empty. Can someone let me know what is it that I am doing wrong? Or any work arounds that I can take.
I am using mongodb 3.0.8. I am aware of the fact that Mongodb 3.2 can offer some features which will fulfill my needs but 3.2 upgrade is not in our pipeline soon and I am looking to resolve this using Mongo3.0 if possible.
My goal is to either replace tags array with the common elements from the user defined list or add a new field with common elements. My am trying to to the later in my example.
The reason you the common field is empty is because your "pendingEntries" array and your user defined array have not element in common. What you really want is to return an array that contains the elements that appear in your "tags" array and your user defined array. To do that you can simply use the $map operator and apply the $setIntersection operator to each subdocument "tags" in the "pendingEntries" array.
db.entrylike.aggregate([
{ "$project": {
"common": {
"$map": {
"input": "$pendingEntries",
"as": "p",
"in": {
"entryID": "$$p.entryID",
"tags": {
"$setIntersection": [
"$$p.tags",
{ "$literal": [
{
"tagKey" : "ErrorCode",
"tagValue" : "7001"
}
]}
]
}
}
}
}
}}
])
Which returns:
{
"_id" : 1,
"common" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65f"),
"tags" : [
{
"tagKey" : "ErrorCode",
"tagValue" : "7001"
}
]
}
]
}
{
"_id" : 2,
"common" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65d"),
"tags" : [ ]
}
]
}
{
"_id" : 3,
"common" : [
{
"entryID" : ObjectId("5701b4c3c6b126083332e65c"),
"tags" : [ ]
}
]
}
I am trying to craft a query that will allow me to find duplicate keys in subdocument in MongoDB.
It needs to be able to query any number of documents and see what keys are duplicated across them in a subdocument. The key of my subdocument is called attributes and I need to be able to target a particular query of documents and pull out duplicate attribute keys that they all share.
EDIT:
I forgot to mention that I do not know the names of the attributes ahead of time. I need to be able to essentially select distinct attributes that they share and aggregate the values.
Collection Sample:
[
{
sku: '123',
attributes: {
size: 'L',
custom: 7
}
},
{
sku: '456',
attributes: {
size: 'M'
}
},
{
sku: 'abc',
attributes: {
material: 'cotton'
size: 'S'
}
}
]
Desired Result (if possible):
{
size: [' S', 'M', 'L']
}
If the desired result is not possible I would at least like to be able to get back [ 'size' ]
This process needs to be optimized as much as possible and I just cant seem to get a query just right to return what I need, any help is greatly appreciated =)
Here is what I have so far
db.getCollection('myCollection').aggregate([
{ $match: {
_id: { $in: [ObjectId("55158b0bd6076278295cf022"), ObjectId("55158b0bd6076278295cf021"), ObjectId("55158b0bd6076278295cf01f") ] }
}
},
{ $project: { attributes: 1 }},
{ $group: { _id: '$attributes' } }
])
Which products this output:
{
"result" : [
{
"_id" : {
"shirt_size" : "S",
"shirt_color" : "Blue",
"custom_attr" : "adsfasdf"
}
},
{
"_id" : {
"shirt_size" : "M",
"shirt_color" : "Green"
}
},
{
"_id" : {
"shirt_size" : "L",
"shirt_color" : "Red"
}
}
],
"ok" : 1.0000000000000000,
"$gleStats" : {
"lastOpTime" : Timestamp(1427475045, 1),
"electionId" : ObjectId("54f7c1edf8e5ff44cec194b6")
}
}
I feel like it is close and I am just missing the last step :(
I think you need to $unwind the array, and then $group it and use $sum to count the appearance, then everything with sum > 1 is a duplicate.
Links:
http://docs.mongodb.org/manual/reference/operator/aggregation/unwind/
http://docs.mongodb.org/manual/reference/operator/aggregation/group/
http://docs.mongodb.org/manual/reference/operator/aggregation/sum/
The $addToSet(aggregation) returns an array of unique values - http://docs.mongodb.org/manual/reference/operator/aggregation/addToSet/
Using the following aggregation (get unique sizes per Doc):
db.coll1.aggregate([
{$unwind : "$testdoc"},
{$group : {_id: "$_id", size: {$addToSet: "$testdoc.attributes.size"}}}
])
Gives the following result:
{
"result" : [
{
"_id" : ObjectId("551621fe6155a7741a0d328a"),
"size" : [
"M",
"L"
]
},
{
"_id" : ObjectId("551621fe6155a7741a0d328b"),
"size" : [
"L"
]
},
{
"_id" : ObjectId("551621fe6155a7741a0d3289"),
"size" : [
"S",
"M",
"L"
]
}
],
"ok" : 1
}
The following aggregation returns unique sizes across all docs:
db.coll1.aggregate([
{$unwind : "$testdoc"},
{$group :
{_id: "AllSizes", size: {$addToSet: "$testdoc.attributes.size"}}} ])
Result:
{
"result" : [
{
"_id" : "AllSizes",
"size" : [
"S",
"M",
"L"
]
}
],
"ok" : 1
}
Based on the following Docs:
> db.coll1.find().pretty()
{
"_id" : ObjectId("551621fe6155a7741a0d3289"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "M"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "S"
}
}
]
}
{
"_id" : ObjectId("551621fe6155a7741a0d328a"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "M"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "M"
}
}
]
}
{
"_id" : ObjectId("551621fe6155a7741a0d328b"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "L"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "L"
}
}
]
}
abstract document in collection md given:
{
vals : [{
uid : string,
val : string|array
}]
}
the following, partially correct aggregation is given:
db.md.aggregate(
{ $unwind : "$vals" },
{ $match : { "vals.uid" : { $in : ["x", "y"] } } },
{
$group : {
_id : { uid : "$vals.uid" },
vals : { $addToSet : "$vals.val" }
}
}
);
that may lead to the following result:
"result" : [
{
"_id" : {
"uid" : "x"
},
"vals" : [
[
"24ad52bc-c414-4349-8f3a-24fd5520428e",
"e29dec2f-57d2-43dc-818a-1a6a9ec1cc64"
],
[
"5879b7a4-b564-433e-9a3e-49998dd60b67",
"24ad52bc-c414-4349-8f3a-24fd5520428e"
]
]
},
{
"_id" : {
"uid" : "y"
},
"vals" : [
"0da5fcaa-8d7e-428b-8a84-77c375acea2b",
"1721cc92-c4ee-4a19-9b2f-8247aa53cfe1",
"5ac71a9e-70bd-49d7-a596-d317b17e4491"
]
}
]
as x is the result aggregated on documents containing an array rather than a string, the vals in the result is an array of arrays. what i look for in this case is to have a flattened array (like the result for y).
for me it seems like that what i want to achieve by one aggegration call only, is currently not supported by any given operation as e.g. a type conversion cannot be done or unwind expectes in every case an array as input type.
is map reduce the only option i have? if not ... any hints?
thanks!
You can use the aggregation to do the computation you want without changing your schema (though you might consider changing your schema simply to make queries and aggregations of this field easier to write).
I broke up the pipeline into multiple steps for readability. I also simplified your document slightly, again for readability.
Sample input:
> db.md.find().pretty()
{
"_id" : ObjectId("512f65c6a31a92aae2a214a3"),
"uid" : "x",
"val" : "string"
}
{
"_id" : ObjectId("512f65c6a31a92aae2a214a4"),
"uid" : "x",
"val" : "string"
}
{
"_id" : ObjectId("512f65c6a31a92aae2a214a5"),
"uid" : "y",
"val" : "string2"
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a6"),
"uid" : "y",
"val" : [
"string3",
"string4"
]
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a7"),
"uid" : "z",
"val" : [
"string"
]
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a8"),
"uid" : "y",
"val" : [
"string1",
"string2"
]
}
Pipeline stages:
> project1 = {
"$project" : {
"uid" : 1,
"val" : 1,
"isArray" : {
"$cond" : [
{
"$eq" : [
"$val.0",
[ ]
]
},
true,
false
]
}
}
}
> project2 = {
"$project" : {
"uid" : 1,
"valA" : {
"$cond" : [
"$isArray",
"$val",
[
null
]
]
},
"valS" : {
"$cond" : [
"$isArray",
null,
"$val"
]
},
"isArray" : 1
}
}
> unwind = { "$unwind" : "$valA" }
> project3 = {
"$project" : {
"_id" : 0,
"uid" : 1,
"val" : {
"$cond" : [
"$isArray",
"$valA",
"$valS"
]
}
}
}
Final aggregation:
> db.md.aggregate(project1, project2, unwind, project3, group)
{
"result" : [
{
"_id" : "z",
"vals" : [
"string"
]
},
{
"_id" : "y",
"vals" : [
"string1",
"string4",
"string3",
"string2"
]
},
{
"_id" : "x",
"vals" : [
"string"
]
}
],
"ok" : 1
}
If you modify your schema using always "vals.val" field as an array field (even when the record contains only one element) you can do it easily as follows:
db.test_col.insert({
vals : [
{
uid : "uuid1",
val : ["value1"]
},
{
uid : "uuid2",
val : ["value2", "value3"]
}]
});
db.test_col.insert(
{
vals : [{
uid : "uuid2",
val : ["value4", "value5"]
}]
});
Using this approach you only need to use two $unwind operations: one unwinds the "parent" array and the second unwinds every "vals.val" value. So, querying like
db.test_col.aggregate(
{ $unwind : "$vals" },
{ $unwind : "$vals.val" },
{
$group : {
_id : { uid : "$vals.uid" },
vals : { $addToSet : "$vals.val" }
}
}
);
You can obtain your expected value:
{
"result" : [
{
"_id" : {
"uid" : "uuid2"
},
"vals" : [
"value5",
"value4",
"value3",
"value2"
]
},
{
"_id" : {
"uid" : "uuid1"
},
"vals" : [
"value1"
]
}
],
"ok" : 1
}
And no, you can't execute this query using your current schema, since $unwind fails when the field isn't an array field.