In MongoDB, how to sort nested fields within a document? - mongodb

My document looks like this:
{
field1: "somevalue",
name: "xtz",
nested_documents: [ // array of nested documents
{ x:"1", y:"2" }, // first nested document
{ x:"2", y:"3" }, // second nested document
{ x:"-1", y:"3" }, // third nested document
// ...many more nested documents
]
}
How can one sort the data in nested_documents?
The expected result is shown below:
nested_documents: [ { x:"-1", y:"3" },{ x:"1", y:"2" },{ x:"2", y:"3" }]

To do this you would have to use the aggregation framework:
db.test.aggregate([{$unwind:'$nested_documents'},{$sort:{'nested_documents.x': 1}}])
This returns one document per array element, ordered by x:
"result" : [
{
"_id" : ObjectId("5139ba3dcd4e11c83f4cea12"),
"field1" : "somevalue",
"name" : "xtz",
"nested_documents" : {
"x" : "-1",
"y" : "3"
}
},
{
"_id" : ObjectId("5139ba3dcd4e11c83f4cea12"),
"field1" : "somevalue",
"name" : "xtz",
"nested_documents" : {
"x" : "1",
"y" : "2"
}
},
{
"_id" : ObjectId("5139ba3dcd4e11c83f4cea12"),
"field1" : "somevalue",
"name" : "xtz",
"nested_documents" : {
"x" : "2",
"y" : "3"
}
}
],
"ok" : 1
Hope this helps
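The pipeline above leaves one output document per array element. To get back a single document whose nested_documents array is sorted, as the question asks, a $group stage can be appended. The following is a minimal sketch, assuming the same test collection; the variable name sortNestedPipeline is made up for illustration. Note that x is stored as a string here, so the sort compares strings rather than numbers, which happens to give the expected order for these sample values.

```javascript
// Sketch: unwind, sort, then regroup so each document gets its array back in sorted order.
// `sortNestedPipeline` is an illustrative name, not part of the original answer.
const sortNestedPipeline = [
  { $unwind: "$nested_documents" },
  { $sort: { "nested_documents.x": 1 } },
  {
    $group: {
      _id: "$_id",
      // Carry the other top-level fields through the grouping.
      field1: { $first: "$field1" },
      name: { $first: "$name" },
      // Collect the nested documents in the order they arrive from $sort.
      nested_documents: { $push: "$nested_documents" }
    }
  }
];

// In the mongo shell: db.test.aggregate(sortNestedPipeline)
```

Any other top-level field that should survive has to be listed in the $group stage in the same way as field1 and name.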

Related

MongoDB - query for a nested item inside a collection

I have a MongoDB collection "result" with data like:
{ "_id" : {
"user" : "Howard",
"friend" : "Sita"
},
"value" : {
"mutualFriend" :[ "Hanks", "Bikash", "Shyam", "Bakshi" ]
}
}
{ "_id" : {
"user" : "Shiva",
"friend" : "Tom"
},
"value" : {
"friendList" :[ "Hanks", " Tom", " Karma", " Hari", " Dinesh" ]
}
}
{ "_id" : {
"user" : "Hari",
"friend" : "Shiva"
},
"value" : {
"mutualFriend" :[ "Tom", "Karma", "Dinesh" ]
}
}
Now I want to query the whole documents that have value.mutualFriend. How can I get this result?
Expected output:
{ "_id" : {
"user" : "Howard",
"friend" : "Sita"
},
"value" : {
"mutualFriend" :[ "Hanks", "Bikash", "Shyam", "Bakshi" ]
}
}
{ "_id" : {
"user" : "Hari",
"friend" : "Shiva"
},
"value" : {
"mutualFriend" :[ "Tom", "Karma", "Dinesh" ]
}
}
I have a large number of documents in the MongoDB collection, containing either value.friendList or value.mutualFriend, and I want to find only the documents with value.mutualFriend.
db.collection.find({"value.mutualFriend.0" : { $exists : true }})
This just makes sure that the 0th element exists. You can adapt the query for other array lengths.
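As a sketch of what adapting the query to other array lengths could look like: testing that the element at index n - 1 exists matches arrays with at least n elements. The helper name hasAtLeast is hypothetical and only builds the filter document; it is not a MongoDB API.

```javascript
// Builds a filter matching documents whose array `field` has at least `n` elements,
// by testing that the element at index n - 1 exists.
// `hasAtLeast` is a hypothetical helper for illustration.
function hasAtLeast(field, n) {
  return { [`${field}.${n - 1}`]: { $exists: true } };
}

const nonEmpty = hasAtLeast("value.mutualFriend", 1);    // same filter as in the answer
const threeOrMore = hasAtLeast("value.mutualFriend", 3); // tests index 2

// In the mongo shell: db.collection.find(threeOrMore)
```

If documents with an empty value.mutualFriend array should match as well, the plain filter { "value.mutualFriend" : { $exists : true } } would be enough.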

Sorting objects in array within mongodb

I've seen this question all over google/SO/mongo docs, and I've tried to implement the solution, but it's not working for me. I have the following test database:
> db.test.find().pretty()
{
"_id" : ObjectId("56b4ab167db9acd913ce6e07"),
"state" : "HelloWorld",
"items" : [
{
"guid" : "123"
},
{
"guid" : "124"
},
{
"guid" : "123"
}
]
}
And I want to sort by the "guid" element of items. Running the sort commands yields:
> db.test.find().sort( {"items.guid" : 1}).pretty()
{
"_id" : ObjectId("56b4ab167db9acd913ce6e07"),
"state" : "HelloWorld",
"items" : [
{
"guid" : "123"
},
{
"guid" : "124"
},
{
"guid" : "123"
}
]
}
How can I sort by the "guid" element, so that the returned output of "items" is the 123, 123, and 124 guids (essentially move the child elements of "items" so that they're sorted by "guid")?
EDIT: I've also tried to use the $orderby operator, but it doesn't accomplish what I want either:
> db.test.find({ $query : {}, $orderby: {'items.guid' : 1} }).pretty()
{
"_id" : ObjectId("56b4ab167db9acd913ce6e07"),
"state" : "HelloWorld",
"items" : [
{
"guid" : "123"
},
{
"guid" : "124"
},
{
"guid" : "123"
}
]
}
Here is how it can be done using aggregate:
db.test.aggregate([
{
$unwind : '$items'
},
{
$sort : {'items.guid' : 1}
},
{
$group : {
_id : '$_id',
state : {$first : '$state'},
items : {
$push : {'guid' : '$items.guid'}
}
}
}
]).pretty()
This is the output from this command.
{
"_id" : ObjectId("56b4ab167db9acd913ce6e07"),
"state" : "HelloWorld",
"items" : [
{
"guid" : "123"
},
{
"guid" : "123"
},
{
"guid" : "124"
}
]
}
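On newer servers the same result may be reachable without unwinding and regrouping. The sketch below assumes MongoDB 5.2 or later, where the $sortArray aggregation operator is available; sortItemsPipeline is an illustrative name.

```javascript
// Sketch: sort the embedded array in place with $sortArray (assumes MongoDB 5.2+).
// `sortItemsPipeline` is an illustrative name.
const sortItemsPipeline = [
  {
    $set: {
      // Replace items with a copy sorted ascending by each element's guid.
      items: { $sortArray: { input: "$items", sortBy: { guid: 1 } } }
    }
  }
];

// In mongosh: db.test.aggregate(sortItemsPipeline)
```

Unlike the $unwind/$group version, this keeps every field of each item and every other field of the document without listing them explicitly.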

Find duplicate key in embedded sub document in mongodb

I am trying to craft a query that will allow me to find duplicate keys in subdocument in MongoDB.
It needs to be able to query any number of documents and see what keys are duplicated across them in a subdocument. The key of my subdocument is called attributes and I need to be able to target a particular query of documents and pull out duplicate attribute keys that they all share.
EDIT:
I forgot to mention that I do not know the names of the attributes ahead of time. I need to be able to essentially select distinct attributes that they share and aggregate the values.
Collection Sample:
[
{
sku: '123',
attributes: {
size: 'L',
custom: 7
}
},
{
sku: '456',
attributes: {
size: 'M'
}
},
{
sku: 'abc',
attributes: {
material: 'cotton',
size: 'S'
}
}
]
Desired Result (if possible):
{
size: ['S', 'M', 'L']
}
If the desired result is not possible I would at least like to be able to get back [ 'size' ]
This process needs to be optimized as much as possible, and I just can't seem to get a query quite right to return what I need. Any help is greatly appreciated =)
Here is what I have so far:
db.getCollection('myCollection').aggregate([
{ $match: {
_id: { $in: [ObjectId("55158b0bd6076278295cf022"), ObjectId("55158b0bd6076278295cf021"), ObjectId("55158b0bd6076278295cf01f") ] }
}
},
{ $project: { attributes: 1 }},
{ $group: { _id: '$attributes' } }
])
Which produces this output:
{
"result" : [
{
"_id" : {
"shirt_size" : "S",
"shirt_color" : "Blue",
"custom_attr" : "adsfasdf"
}
},
{
"_id" : {
"shirt_size" : "M",
"shirt_color" : "Green"
}
},
{
"_id" : {
"shirt_size" : "L",
"shirt_color" : "Red"
}
}
],
"ok" : 1.0000000000000000,
"$gleStats" : {
"lastOpTime" : Timestamp(1427475045, 1),
"electionId" : ObjectId("54f7c1edf8e5ff44cec194b6")
}
}
I feel like it is close and I am just missing the last step :(
I think you need to $unwind the array, then $group it and use $sum to count the appearances; everything with a sum > 1 is a duplicate.
Links:
http://docs.mongodb.org/manual/reference/operator/aggregation/unwind/
http://docs.mongodb.org/manual/reference/operator/aggregation/group/
http://docs.mongodb.org/manual/reference/operator/aggregation/sum/
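Since attributes in the question is a sub-document rather than an array, it cannot be unwound directly. One possible way to apply the idea above, assuming MongoDB 3.4.4 or later where $objectToArray is available, is to turn the sub-document into an array of key/value pairs first. The names sharedAttributesPipeline, attrs, values and count are illustrative.

```javascript
// Sketch: find attribute keys that occur in more than one document (assumes MongoDB 3.4.4+).
// `sharedAttributesPipeline`, `attrs`, `values` and `count` are illustrative names.
const sharedAttributesPipeline = [
  // { size: 'L', custom: 7 } becomes [ { k: 'size', v: 'L' }, { k: 'custom', v: 7 } ]
  { $project: { attrs: { $objectToArray: "$attributes" } } },
  // One output document per key/value pair.
  { $unwind: "$attrs" },
  // Group by key name, collecting distinct values and counting occurrences.
  {
    $group: {
      _id: "$attrs.k",
      values: { $addToSet: "$attrs.v" },
      count: { $sum: 1 }
    }
  },
  // Keep only keys that appeared more than once.
  { $match: { count: { $gt: 1 } } }
];

// In the mongo shell: db.getCollection('myCollection').aggregate(sharedAttributesPipeline)
```

For the collection sample in the question this should leave only the size key, with its distinct values in no particular order. A $match on _id can be placed in front to restrict the set of documents, as in the question's own attempt.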
The $addToSet (aggregation) operator returns an array of unique values: http://docs.mongodb.org/manual/reference/operator/aggregation/addToSet/
Using the following aggregation (get unique sizes per Doc):
db.coll1.aggregate([
{$unwind : "$testdoc"},
{$group : {_id: "$_id", size: {$addToSet: "$testdoc.attributes.size"}}}
])
Gives the following result:
{
"result" : [
{
"_id" : ObjectId("551621fe6155a7741a0d328a"),
"size" : [
"M",
"L"
]
},
{
"_id" : ObjectId("551621fe6155a7741a0d328b"),
"size" : [
"L"
]
},
{
"_id" : ObjectId("551621fe6155a7741a0d3289"),
"size" : [
"S",
"M",
"L"
]
}
],
"ok" : 1
}
The following aggregation returns unique sizes across all docs:
db.coll1.aggregate([
{$unwind : "$testdoc"},
{$group :
{_id: "AllSizes", size: {$addToSet: "$testdoc.attributes.size"}}} ])
Result:
{
"result" : [
{
"_id" : "AllSizes",
"size" : [
"S",
"M",
"L"
]
}
],
"ok" : 1
}
Based on the following Docs:
> db.coll1.find().pretty()
{
"_id" : ObjectId("551621fe6155a7741a0d3289"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "M"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "S"
}
}
]
}
{
"_id" : ObjectId("551621fe6155a7741a0d328a"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "M"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "M"
}
}
]
}
{
"_id" : ObjectId("551621fe6155a7741a0d328b"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "L"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "L"
}
}
]
}

MongoDB $elemMatch display issue

The sample documents in my test collection are listed below.
My requirement is to extract only two fields,
"host.name" and "host.config.storageDevice.scsiLun.lunType", matching the condition that "host.config.storageDevice.scsiLun.lunType" : "cdrom1" and "host.name" : "on-xxx".
In other words, I only want the attribute "lunType" to be displayed in the array, not "a" or "b".
I attempted to use both $elemMatch and the $ projection operator, and it always seems to return all the attributes of the matching array element. Am I missing anything here?
Attempted in reference to documentation
http://docs.mongodb.org/manual/reference/projection/elemMatch/
Query
db.test.find({"host.config.storageDevice.scsiLun": { $elemMatch: { "lunType" : "cdrom1" } } },{ "host.name" : 1, "host.config.storageDevice.scsiLun.$" : 1})
db.test.find({"host.config.storageDevice.scsiLun.lunType" : "cdrom1" },{ "host.name" : 1, "host.config.storageDevice.scsiLun.$" : 1})
Documents in collection
{
"_id" : ObjectId("51d57f3ad4ebc6c87962d4c0"),
"host" : {
"name" : "on-xxx",
"config" : {
"storageDevice" : {
"scsiLun" : [
{
"a" : "1",
"lunType" : "cdrom1"
},
{
"a" : "2",
"lunType" : "disk2"
},
{
"a" : "3",
"lunType" : "disk3"
}
]
}
}
}
}
,
{
"_id" : ObjectId("51d57f59d4ebc6c87962d4c1"),
"host" : {
"name" : "on-yyy",
"config" : {
"storageDevice" : {
"scsiLun" : [
{
"a" : "4",
"lunType" : "cdrom4"
},
{
"a" : "5",
"lunType" : "disk5"
},
{
"a" : "6",
"lunType" : "disk6"
}
]
}
}
}
}
,
{
"_id" : ObjectId("51d57f74d4ebc6c87962d4c2"),
"host" : {
"name" : "on-zzz",
"config" : {
"storageDevice" : {
"scsiLun" : [
{
"a" : "7",
"lunType" : "cdrom11"
},
{
"a" : "8",
"lunType" : "disk22"
},
{
"a" : "9",
"lunType" : "disk32"
}
]
}
}
}
}
All $elemMatch does is return the first element of the array that matches the criteria, but it gives you the entire element, that is, both the "a" and "lunType" properties.
You might get the desired result with aggregation, using $unwind to break down the array, then $match to filter and $project to show only the "lunType" field.
I didn't test the query, but it should look like this:
db.test.aggregate([
{ $unwind : "$host.config.storageDevice.scsiLun" },
{ $match : { "host.name" : "on-xxx" ,
"host.config.storageDevice.scsiLun.lunType" : "cdrom1" } },
{ $project : {
_id : 0 ,
"host.config.storageDevice.scsiLun.lunType" : 1 } }
]);
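An alternative that avoids $unwind, sketched under the assumption of MongoDB 3.2 or later (where $filter is available): match the document first, then project only the lunType values of the matching array elements. The names lunTypePipeline, hostName and lunTypes are made up for illustration.

```javascript
// Sketch: project only the lunType of matching scsiLun elements (assumes MongoDB 3.2+).
// `lunTypePipeline`, `hostName` and `lunTypes` are illustrative names.
const lunTypePipeline = [
  {
    $match: {
      "host.name": "on-xxx",
      "host.config.storageDevice.scsiLun.lunType": "cdrom1"
    }
  },
  {
    $project: {
      _id: 0,
      hostName: "$host.name",
      lunTypes: {
        // Map each surviving element to just its lunType value.
        $map: {
          input: {
            // Keep only the array elements whose lunType is "cdrom1".
            $filter: {
              input: "$host.config.storageDevice.scsiLun",
              as: "lun",
              cond: { $eq: ["$$lun.lunType", "cdrom1"] }
            }
          },
          as: "lun",
          in: "$$lun.lunType"
        }
      }
    }
  }
];

// In the mongo shell: db.test.aggregate(lunTypePipeline)
```

This returns the host name and an array of bare lunType strings, so neither "a" nor any other property of the array elements appears in the output.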

Aggregate of different subtypes in document of a collection

Given the following abstract document structure in collection md:
{
vals : [{
uid : string,
val : string|array
}]
}
The following, partially correct aggregation is given:
db.md.aggregate(
{ $unwind : "$vals" },
{ $match : { "vals.uid" : { $in : ["x", "y"] } } },
{
$group : {
_id : { uid : "$vals.uid" },
vals : { $addToSet : "$vals.val" }
}
}
);
That may lead to the following result:
"result" : [
{
"_id" : {
"uid" : "x"
},
"vals" : [
[
"24ad52bc-c414-4349-8f3a-24fd5520428e",
"e29dec2f-57d2-43dc-818a-1a6a9ec1cc64"
],
[
"5879b7a4-b564-433e-9a3e-49998dd60b67",
"24ad52bc-c414-4349-8f3a-24fd5520428e"
]
]
},
{
"_id" : {
"uid" : "y"
},
"vals" : [
"0da5fcaa-8d7e-428b-8a84-77c375acea2b",
"1721cc92-c4ee-4a19-9b2f-8247aa53cfe1",
"5ac71a9e-70bd-49d7-a596-d317b17e4491"
]
}
]
As x is the result aggregated on documents containing an array rather than a string, the vals in the result is an array of arrays. What I am looking for in this case is a flattened array (like the result for y).
It seems to me that what I want to achieve with a single aggregation call is currently not supported by any available operation, as e.g. a type conversion cannot be done and $unwind expects an array as input in every case.
Is map-reduce the only option I have? If not, any hints?
Thanks!
You can use the aggregation to do the computation you want without changing your schema (though you might consider changing your schema simply to make queries and aggregations of this field easier to write).
I broke up the pipeline into multiple steps for readability. I also simplified your document slightly, again for readability.
Sample input:
> db.md.find().pretty()
{
"_id" : ObjectId("512f65c6a31a92aae2a214a3"),
"uid" : "x",
"val" : "string"
}
{
"_id" : ObjectId("512f65c6a31a92aae2a214a4"),
"uid" : "x",
"val" : "string"
}
{
"_id" : ObjectId("512f65c6a31a92aae2a214a5"),
"uid" : "y",
"val" : "string2"
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a6"),
"uid" : "y",
"val" : [
"string3",
"string4"
]
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a7"),
"uid" : "z",
"val" : [
"string"
]
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a8"),
"uid" : "y",
"val" : [
"string1",
"string2"
]
}
Pipeline stages:
> project1 = {
"$project" : {
"uid" : 1,
"val" : 1,
"isArray" : {
"$cond" : [
{
"$eq" : [
"$val.0",
[ ]
]
},
true,
false
]
}
}
}
> project2 = {
"$project" : {
"uid" : 1,
"valA" : {
"$cond" : [
"$isArray",
"$val",
[
null
]
]
},
"valS" : {
"$cond" : [
"$isArray",
null,
"$val"
]
},
"isArray" : 1
}
}
> unwind = { "$unwind" : "$valA" }
> project3 = {
"$project" : {
"_id" : 0,
"uid" : 1,
"val" : {
"$cond" : [
"$isArray",
"$valA",
"$valS"
]
}
}
}
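The group stage is not shown above; judging from the output below, it presumably looks like this:
> group = {
"$group" : {
"_id" : "$uid",
"vals" : {
"$addToSet" : "$val"
}
}
}
Final aggregation: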
> db.md.aggregate(project1, project2, unwind, project3, group)
{
"result" : [
{
"_id" : "z",
"vals" : [
"string"
]
},
{
"_id" : "y",
"vals" : [
"string1",
"string4",
"string3",
"string2"
]
},
{
"_id" : "x",
"vals" : [
"string"
]
}
],
"ok" : 1
}
If you modify your schema so that "vals.val" is always an array field (even when it holds only one element), you can do it easily as follows:
db.test_col.insert({
vals : [
{
uid : "uuid1",
val : ["value1"]
},
{
uid : "uuid2",
val : ["value2", "value3"]
}]
});
db.test_col.insert(
{
vals : [{
uid : "uuid2",
val : ["value4", "value5"]
}]
});
Using this approach you only need to use two $unwind operations: one unwinds the "parent" array and the second unwinds every "vals.val" value. So, querying like
db.test_col.aggregate(
{ $unwind : "$vals" },
{ $unwind : "$vals.val" },
{
$group : {
_id : { uid : "$vals.uid" },
vals : { $addToSet : "$vals.val" }
}
}
);
You can obtain your expected value:
{
"result" : [
{
"_id" : {
"uid" : "uuid2"
},
"vals" : [
"value5",
"value4",
"value3",
"value2"
]
},
{
"_id" : {
"uid" : "uuid1"
},
"vals" : [
"value1"
]
}
],
"ok" : 1
}
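And no, you can't execute this query using your current schema, since $unwind fails when the field isn't an array field. (That was the behavior before MongoDB 3.2; later versions treat a non-array value as a single-element array.)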
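On later versions the mixed string/array case may be handled explicitly without changing the schema. The sketch below assumes MongoDB 3.2 or later, where the $isArray operator exists, and wraps string values in a one-element array before the second $unwind. The name flattenPipeline is illustrative.

```javascript
// Sketch: normalize vals.val to an array, then unwind and group (assumes MongoDB 3.2+).
// `flattenPipeline` is an illustrative name.
const flattenPipeline = [
  { $unwind: "$vals" },
  { $match: { "vals.uid": { $in: ["x", "y"] } } },
  {
    $project: {
      uid: "$vals.uid",
      // Leave arrays as they are; wrap a plain string in a one-element array.
      val: { $cond: [{ $isArray: "$vals.val" }, "$vals.val", ["$vals.val"]] }
    }
  },
  // Every val is now an array, so unwinding yields one string per document.
  { $unwind: "$val" },
  { $group: { _id: { uid: "$uid" }, vals: { $addToSet: "$val" } } }
];

// In the mongo shell: db.md.aggregate(flattenPipeline)
```

The grouping key and the $addToSet accumulator mirror the aggregation from the question, so the result should have the same shape with vals flattened for every uid.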