The document represents one user who has images. Each image can have up to N images related to it. I would like to update the matches list only if:
The match does not exist yet.
There are fewer than N elements in the matches array.
If there are already N elements, only push if the "c" value is higher than the lowest one present.
{
"user_id" : 1,
"imgs" : [
{
"img_id" : 1,
"matches" : [
{
"c" : 0.3,
"img_id" : 2
},
{
"c" : 0.2,
"img_id" : 3
}
]
},
{
"img_id" : 5,
"matches" : [
{
"c" : 0.4,
"img_id" : 6
}
]
}
]
}
Basically, "matches" is a set, but $addToSet does not provide $slice and $sort, so I am trying to use $push instead.
db.stack.updateOne(
{ "user_id" : 1, "imgs.img_id" : 1, "imgs.matches.img_id" : { "$ne" : 2 } },
{ "$push" : { "imgs.$.matches" : { "$each" : [ { "c" : 0.7, "img_id" : 2} ], "$sort" : { "c" : -1 }, "$slice" : 3 } } }
);
This does not work: the same match gets inserted several times.
Your issue is with the filter part of the updateOne. You should use $elemMatch so that both conditions (the img_id and the "not already in matches" check) are applied to the same element of the "imgs" array.
db.stack.updateOne(
{ "user_id" : 1, "imgs" : { "$elemMatch" : { "img_id" : 1, "matches.img_id" : { "$ne" : 2 } } } },
{ "$push" : { "imgs.$.matches" : { "$each" : [ { "c" : 0.7, "img_id" : 2 } ], "$sort" : { "c" : -1 }, "$slice" : 3 } } }
);
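To make the intended behaviour concrete, here is a minimal plain-JavaScript sketch (it does not talk to MongoDB) of what the corrected update is meant to do on the sample document: find the one image whose own matches do not yet contain the new match, append it, sort by "c" descending and keep the top N. The helper name pushMatch and its limit parameter are assumptions made for illustration only.

```javascript
// Sample document from the question.
const doc = {
  user_id: 1,
  imgs: [
    { img_id: 1, matches: [{ c: 0.3, img_id: 2 }, { c: 0.2, img_id: 3 }] },
    { img_id: 5, matches: [{ c: 0.4, img_id: 6 }] }
  ]
};

// Hypothetical helper: emulates $elemMatch + $push with $each/$sort/$slice.
// Returns true when the filter matched an image, false otherwise.
function pushMatch(doc, imgId, match, limit) {
  // $elemMatch: both conditions must hold for the SAME element of "imgs".
  const img = doc.imgs.find(
    (i) => i.img_id === imgId && i.matches.every((m) => m.img_id !== match.img_id)
  );
  if (!img) return false;
  img.matches.push(match);                     // $each
  img.matches.sort((a, b) => b.c - a.c);       // $sort: { c: -1 }
  img.matches = img.matches.slice(0, limit);   // $slice: limit
  return true;
}

// Image 1 already has match 2, so nothing is pushed.
const first = pushMatch(doc, 1, { c: 0.7, img_id: 2 }, 3);
// Match 6 is only present under image 5, so it can be added to image 1.
const second = pushMatch(doc, 1, { c: 0.7, img_id: 6 }, 3);
```

With the dotted-path filter from the question, the $ne condition is, as far as I understand it, checked against the matches of every image rather than only those of image 1, so the second call would have been rejected because image 5 already lists match 6.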
I have a collection called constructora that has the following structure:
{
"_id" : A,
"edificio" : [
{
"_id": X,
"a" : 48,
"b" : 59
},
{
"_id": Y,
"a" : 8,
"b" : 5
},
{
"_id": Z,
"a" : 0,
"b" : -1
},
...
]
}
So, I want a query that returns each sub-document of edificio together with its parent's _id. For example:
{
"_id" : X,
"a" : 48,
"b" : 59,
"id_constructora" : A
}
{
"_id" : Y,
"a" : 8,
"b" : 5,
"id_constructora" : A
}
{
"_id" : Z,
"a" : 0,
"b" : -1,
"id_constructora" : A
}
How can I do that?
EDIT
Now I'm trying to use aggregate, grouping by "edificio._id", so that I get the desired output for each document in edificio:
db.constructora.aggregate(
[
{ $project : { "_id" : 1, "edificio._id" : 1 } },
{ $group : { _id : "$edificio._id" } }
]
).pretty();
But it doesn't work. The output is:
...
{
"_id" : [
ObjectId("613339376430333562373466"),
ObjectId("663736363935393066656236"),
ObjectId("313933613036363364633832"),
ObjectId("653135313831633638336436")
]
}
{
"_id" : [
ObjectId("643531326231663739626465"),
ObjectId("343231386237333365356461"),
ObjectId("373461303864636138393263"),
ObjectId("386433623966653737343962"),
ObjectId("303863633366376431363335"),
ObjectId("663833343161643639376161"),
ObjectId("383833363836663532633733"),
ObjectId("396330313961353137333166"),
ObjectId("646535366662363364613837"),
ObjectId("633937613032656436653965")
]
}
You can use $unwind to break the embedded array into separate documents, $addFields to copy the parent _id into the embedded document under a new name, and then $replaceRoot to promote the embedded document to the top level. This requires MongoDB server 3.4 or later.
db.constructora.aggregate([
{$unwind:"$edificio"},
{$addFields:{"edificio.id_constructora":"$_id"}},
{$replaceRoot: {newRoot: "$edificio"}}
])
More info here https://docs.mongodb.com/manual/reference/operator/aggregation/replaceRoot/#replaceroot-with-an-array-element
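For readers who want to see the shape of the transformation without a server, here is a rough plain-JavaScript equivalent of the three stages. The string ids "A", "X", "Y" and "Z" stand in for the real _id values, and the variable names are assumptions for illustration.

```javascript
// Sample parent document; the string ids are placeholders for real _id values.
const constructoras = [
  {
    _id: "A",
    edificio: [
      { _id: "X", a: 48, b: 59 },
      { _id: "Y", a: 8, b: 5 },
      { _id: "Z", a: 0, b: -1 }
    ]
  }
];

// $unwind + $addFields + $replaceRoot expressed as one flatMap:
// every embedded document becomes a top-level one carrying its parent's _id.
const edificios = constructoras.flatMap((parent) =>
  (parent.edificio || []).map((e) => ({ ...e, id_constructora: parent._id }))
);
```

Doing this client-side means pulling whole parent documents over the wire, so the aggregation pipeline is probably still the better choice on large collections.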
I am trying to craft a query that will allow me to find duplicate keys in a subdocument in MongoDB.
It needs to be able to query any number of documents and see what keys are duplicated across them in a subdocument. The key of my subdocument is called attributes and I need to be able to target a particular query of documents and pull out duplicate attribute keys that they all share.
EDIT:
I forgot to mention that I do not know the names of the attributes ahead of time. I need to be able to essentially select distinct attributes that they share and aggregate the values.
Collection Sample:
[
{
sku: '123',
attributes: {
size: 'L',
custom: 7
}
},
{
sku: '456',
attributes: {
size: 'M'
}
},
{
sku: 'abc',
attributes: {
material: 'cotton',
size: 'S'
}
}
]
Desired Result (if possible):
{
size: ['S', 'M', 'L']
}
If the desired result is not possible I would at least like to be able to get back [ 'size' ]
This process needs to be optimized as much as possible, and I just can't seem to get a query quite right to return what I need. Any help is greatly appreciated =)
Here is what I have so far
db.getCollection('myCollection').aggregate([
{ $match: {
_id: { $in: [ObjectId("55158b0bd6076278295cf022"), ObjectId("55158b0bd6076278295cf021"), ObjectId("55158b0bd6076278295cf01f") ] }
}
},
{ $project: { attributes: 1 }},
{ $group: { _id: '$attributes' } }
])
Which produces this output:
{
"result" : [
{
"_id" : {
"shirt_size" : "S",
"shirt_color" : "Blue",
"custom_attr" : "adsfasdf"
}
},
{
"_id" : {
"shirt_size" : "M",
"shirt_color" : "Green"
}
},
{
"_id" : {
"shirt_size" : "L",
"shirt_color" : "Red"
}
}
],
"ok" : 1.0000000000000000,
"$gleStats" : {
"lastOpTime" : Timestamp(1427475045, 1),
"electionId" : ObjectId("54f7c1edf8e5ff44cec194b6")
}
}
I feel like it is close and I am just missing the last step :(
I think you need to $unwind the array, then $group it and use $sum to count the appearances; everything with a sum > 1 is a duplicate.
Links:
http://docs.mongodb.org/manual/reference/operator/aggregation/unwind/
http://docs.mongodb.org/manual/reference/operator/aggregation/group/
http://docs.mongodb.org/manual/reference/operator/aggregation/sum/
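Since attributes in the question is a sub-document with unknown keys rather than an array, here is the same count-and-filter idea sketched in plain JavaScript on the sample collection, applied after the documents have been fetched. The helper name sharedAttributes is an assumption for illustration; this is not the aggregation pipeline itself.

```javascript
// Sample collection from the question.
const products = [
  { sku: "123", attributes: { size: "L", custom: 7 } },
  { sku: "456", attributes: { size: "M" } },
  { sku: "abc", attributes: { material: "cotton", size: "S" } }
];

// Hypothetical helper: returns { key: [distinct values] } for every
// attribute key that appears in more than one document.
function sharedAttributes(docs) {
  const seen = new Map(); // key -> { count, values }
  for (const doc of docs) {
    for (const [key, value] of Object.entries(doc.attributes || {})) {
      if (!seen.has(key)) seen.set(key, { count: 0, values: new Set() });
      const entry = seen.get(key);
      entry.count += 1;        // plays the role of $sum: 1
      entry.values.add(value); // plays the role of $addToSet
    }
  }
  const result = {};
  for (const [key, entry] of seen) {
    if (entry.count > 1) result[key] = [...entry.values];
  }
  return result;
}

const shared = sharedAttributes(products);
```

On the sample data only size occurs more than once, so it is the only key kept; custom and material are dropped.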
The $addToSet aggregation operator returns an array of unique values: http://docs.mongodb.org/manual/reference/operator/aggregation/addToSet/
Using the following aggregation (get unique sizes per Doc):
db.coll1.aggregate([
{$unwind : "$testdoc"},
{$group : {_id: "$_id", size: {$addToSet: "$testdoc.attributes.size"}}}
])
Gives the following result:
{
"result" : [
{
"_id" : ObjectId("551621fe6155a7741a0d328a"),
"size" : [
"M",
"L"
]
},
{
"_id" : ObjectId("551621fe6155a7741a0d328b"),
"size" : [
"L"
]
},
{
"_id" : ObjectId("551621fe6155a7741a0d3289"),
"size" : [
"S",
"M",
"L"
]
}
],
"ok" : 1
}
The following aggregation returns unique sizes across all docs:
db.coll1.aggregate([
{$unwind : "$testdoc"},
{$group :
{_id: "AllSizes", size: {$addToSet: "$testdoc.attributes.size"}}} ])
Result:
{
"result" : [
{
"_id" : "AllSizes",
"size" : [
"S",
"M",
"L"
]
}
],
"ok" : 1
}
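The only difference between the two aggregations is the group key. A small plain-JavaScript sketch of that idea, assuming reduced copies of the sample documents (only the sizes are kept, and the ids "d1" to "d3" are placeholders for the real ObjectIds):

```javascript
// Reduced copies of the three sample documents.
const docs = [
  { _id: "d1", testdoc: [{ attributes: { size: "L" } }, { attributes: { size: "M" } }, { attributes: { size: "S" } }] },
  { _id: "d2", testdoc: [{ attributes: { size: "L" } }, { attributes: { size: "M" } }, { attributes: { size: "M" } }] },
  { _id: "d3", testdoc: [{ attributes: { size: "L" } }, { attributes: { size: "L" } }, { attributes: { size: "L" } }] }
];

// Hypothetical helper mirroring $unwind + $group with $addToSet.
// groupKey decides the group _id: one per document, or a constant for all.
function uniqueSizes(docs, groupKey) {
  const groups = new Map();
  for (const doc of docs) {
    for (const item of doc.testdoc) {              // $unwind: "$testdoc"
      const key = groupKey(doc);
      if (!groups.has(key)) groups.set(key, new Set());
      groups.get(key).add(item.attributes.size);   // $addToSet
    }
  }
  return [...groups].map(([id, sizes]) => ({ _id: id, size: [...sizes] }));
}

const perDoc = uniqueSizes(docs, (doc) => doc._id); // group by document
const overall = uniqueSizes(docs, () => "AllSizes"); // one group for everything
```

The sketch keeps values in insertion order; the server makes no such promise for $addToSet, so the order of the real output may differ.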
Based on the following Docs:
> db.coll1.find().pretty()
{
"_id" : ObjectId("551621fe6155a7741a0d3289"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "M"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "S"
}
}
]
}
{
"_id" : ObjectId("551621fe6155a7741a0d328a"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "M"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "M"
}
}
]
}
{
"_id" : ObjectId("551621fe6155a7741a0d328b"),
"testdoc" : [
{
"sku" : "123",
"attributes" : {
"size" : "L",
"custom" : 7
}
},
{
"sku" : "456",
"attributes" : {
"size" : "L"
}
},
{
"sku" : "abc",
"attributes" : {
"material" : "cotton",
"size" : "L"
}
}
]
}
model:
{
"_id" : "a62107e10f388c90a3eb2d7634357c8b",
"_appid" : [
{
"_id" : "1815aaa7f581c838",
"events" : [
{
"_id" : "_TB_launch",
"boday" : [
{
"VERSIONSCODE" : "17",
"NETWORK" : "cmwap",
"VERSIONSNAME" : "2.4.0",
"IMSI" : "460026319223205",
"PACKAGENAME" : "com.androidbox.astjxmjmmshareMM",
"CHANNELID" : "xmjmm17",
"CHANNELNAME" : "浠..?.M寰.俊?.韩?.?1.x锛.",
"eventid" : "_TB_launch",
"uuid" : "a62107e10f388c90a3eb2d7634357c8b",
"creattime" : "1366300799766",
"ts" : ISODate("2013-04-25T06:28:36.403Z")
}
],
"size" : 1
}
],
"size" : 1
}
],
"size" : 1
}
> db.events.update(
{
"_id":"039e569770cec5ff3811e7410233ed27",
"_appid._id":"e880db04064b03bc534575c7f831a83a",
"_appid.events._id":"_TB_launch"
},
{
"$push":{
"_appid.$.events.$.boday":{"111":"123123"}
}
}
);
Cannot apply the positional operator without a corresponding query field containing an array.
Why?!!
You are trying to reference multiple levels of embedding - you can only have one positional $ operator. You won't be able to do something like this until this feature request has been implemented.
Response Here
The short answer is, "no", but working with nested arrays gets
tricky. Here is an example:
db.foo.save({_id: 1, a1:[{_a1id:1, a2:[{a2id:1, a3:[{a3id:1, a4:"data"}]}]}]})
db.foo.find()
{ "_id" : 1, "a1" : [
{ "_a1id" : 1, "a2" : [
{ "a2id" : 1, "a3" : [
{ "a3id" : 1, "a4" : "data" }
] }
] }
] }
db.foo.update({_id:1}, {$push:{"a1.0.a2.0.a3":{a3id:2, a4:"other data"}}})
db.foo.find()
{ "_id" : 1, "a1" : [
{ "_a1id" : 1, "a2" : [
{ "a2id" : 1, "a3" : [
{ "a3id" : 1, "a4" : "data" },
{ "a3id" : 2, "a4" : "other data" }
] }
] }
] }
If you are unsure where one of your sub-documents lies within an
array, you may use one positional operator, and Mongo will update the
first sub-document which matches. For example:
db.foo.update({_id:1, "a1.a2.a2id":1}, {$push:{"a1.0.a2.$.a3":{a3id:2, a4:"other data"}}})
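One way to get explicit indexes like "a1.0.a2.0.a3" without hard-coding them is to read the document first and work out the positions in application code. The following plain-JavaScript sketch shows that idea on the example document; the helper name nestedPath is an assumption for illustration.

```javascript
// Example document from the quoted response.
const doc = {
  _id: 1,
  a1: [{ _a1id: 1, a2: [{ a2id: 1, a3: [{ a3id: 1, a4: "data" }] }] }]
};

// Hypothetical helper: finds the indexes of the outer and inner elements and
// returns a dotted path such as "a1.0.a2.0.a3", or null when nothing matches.
function nestedPath(doc, a1id, a2id) {
  const i = doc.a1.findIndex((e) => e._a1id === a1id);
  if (i === -1) return null;
  const j = doc.a1[i].a2.findIndex((e) => e.a2id === a2id);
  if (j === -1) return null;
  return `a1.${i}.a2.${j}.a3`;
}

const path = nestedPath(doc, 1, 1);
// The path is used as a computed key in the update document.
const update = path ? { $push: { [path]: { a3id: 2, a4: "other data" } } } : null;
```

The resulting object can be passed as the second argument of db.foo.update. Note that reading first and updating afterwards is not atomic: if another writer reorders the arrays in between, the indexes may point at the wrong element.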
abstract document in collection md given:
{
vals : [{
uid : string,
val : string|array
}]
}
the following, partially correct aggregation is given:
db.md.aggregate(
{ $unwind : "$vals" },
{ $match : { "vals.uid" : { $in : ["x", "y"] } } },
{
$group : {
_id : { uid : "$vals.uid" },
vals : { $addToSet : "$vals.val" }
}
}
);
that may lead to the following result:
"result" : [
{
"_id" : {
"uid" : "x"
},
"vals" : [
[
"24ad52bc-c414-4349-8f3a-24fd5520428e",
"e29dec2f-57d2-43dc-818a-1a6a9ec1cc64"
],
[
"5879b7a4-b564-433e-9a3e-49998dd60b67",
"24ad52bc-c414-4349-8f3a-24fd5520428e"
]
]
},
{
"_id" : {
"uid" : "y"
},
"vals" : [
"0da5fcaa-8d7e-428b-8a84-77c375acea2b",
"1721cc92-c4ee-4a19-9b2f-8247aa53cfe1",
"5ac71a9e-70bd-49d7-a596-d317b17e4491"
]
}
]
as x is the result aggregated on documents containing an array rather than a string, the vals in the result is an array of arrays. what i look for in this case is to have a flattened array (like the result for y).
to me it seems that what i want to achieve with a single aggregation call is currently not supported by any available operation, e.g. a type conversion cannot be done and $unwind always expects an array as input.
is map reduce the only option i have? if not ... any hints?
thanks!
You can use the aggregation to do the computation you want without changing your schema (though you might consider changing your schema simply to make queries and aggregations of this field easier to write).
I broke up the pipeline into multiple steps for readability. I also simplified your document slightly, again for readability.
Sample input:
> db.md.find().pretty()
{
"_id" : ObjectId("512f65c6a31a92aae2a214a3"),
"uid" : "x",
"val" : "string"
}
{
"_id" : ObjectId("512f65c6a31a92aae2a214a4"),
"uid" : "x",
"val" : "string"
}
{
"_id" : ObjectId("512f65c6a31a92aae2a214a5"),
"uid" : "y",
"val" : "string2"
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a6"),
"uid" : "y",
"val" : [
"string3",
"string4"
]
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a7"),
"uid" : "z",
"val" : [
"string"
]
}
{
"_id" : ObjectId("512f65e8a31a92aae2a214a8"),
"uid" : "y",
"val" : [
"string1",
"string2"
]
}
Pipeline stages:
> project1 = {
"$project" : {
"uid" : 1,
"val" : 1,
"isArray" : {
"$cond" : [
{
"$eq" : [
"$val.0",
[ ]
]
},
true,
false
]
}
}
}
> project2 = {
"$project" : {
"uid" : 1,
"valA" : {
"$cond" : [
"$isArray",
"$val",
[
null
]
]
},
"valS" : {
"$cond" : [
"$isArray",
null,
"$val"
]
},
"isArray" : 1
}
}
> unwind = { "$unwind" : "$valA" }
> project3 = {
"$project" : {
"_id" : 0,
"uid" : 1,
"val" : {
"$cond" : [
"$isArray",
"$valA",
"$valS"
]
}
}
}
> group = {
"$group" : {
"_id" : "$uid",
"vals" : {
"$addToSet" : "$val"
}
}
}
Final aggregation:
> db.md.aggregate(project1, project2, unwind, project3, group)
{
"result" : [
{
"_id" : "z",
"vals" : [
"string"
]
},
{
"_id" : "y",
"vals" : [
"string1",
"string4",
"string3",
"string2"
]
},
{
"_id" : "x",
"vals" : [
"string"
]
}
],
"ok" : 1
}
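The pipeline boils down to "treat a string as a one-element array, then collect unique values per uid". Here is that logic as a minimal plain-JavaScript sketch on the sample input (with the _id fields omitted); the helper name flattenByUid is an assumption for illustration.

```javascript
// Sample input from above, without the _id fields.
const docs = [
  { uid: "x", val: "string" },
  { uid: "x", val: "string" },
  { uid: "y", val: "string2" },
  { uid: "y", val: ["string3", "string4"] },
  { uid: "z", val: ["string"] },
  { uid: "y", val: ["string1", "string2"] }
];

// Hypothetical helper: flattens string-or-array values into one set per uid.
function flattenByUid(docs) {
  const groups = new Map();
  for (const { uid, val } of docs) {
    // Treat a plain string as a one-element array.
    const values = Array.isArray(val) ? val : [val];
    if (!groups.has(uid)) groups.set(uid, new Set());
    for (const v of values) groups.get(uid).add(v);
  }
  return [...groups].map(([uid, vals]) => ({ _id: uid, vals: [...vals] }));
}

const flattened = flattenByUid(docs);
```

The groups contain the same values as the aggregation result above, although the order of groups and of values inside each group may differ from what the server returns.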
If you change your schema so that "vals.val" is always an array (even when it holds only one element), you can do it easily as follows:
db.test_col.insert({
vals : [
{
uid : "uuid1",
val : ["value1"]
},
{
uid : "uuid2",
val : ["value2", "value3"]
}]
});
db.test_col.insert(
{
vals : [{
uid : "uuid2",
val : ["value4", "value5"]
}]
});
Using this approach you only need to use two $unwind operations: one unwinds the "parent" array and the second unwinds every "vals.val" value. So, querying like
db.test_col.aggregate(
{ $unwind : "$vals" },
{ $unwind : "$vals.val" },
{
$group : {
_id : { uid : "$vals.uid" },
vals : { $addToSet : "$vals.val" }
}
}
);
You can obtain your expected value:
{
"result" : [
{
"_id" : {
"uid" : "uuid2"
},
"vals" : [
"value5",
"value4",
"value3",
"value2"
]
},
{
"_id" : {
"uid" : "uuid1"
},
"vals" : [
"value1"
]
}
],
"ok" : 1
}
And no, you can't execute this query using your current schema, since $unwind fails when the field isn't an array field.
I want to use MongoDB to run a simple query like MySQL's "select a-b from table", but the aggregation framework result is not right.
data:
{ "_id" : ObjectId("511223348a88785127a0d13f"), "a" : 1, "b" : 1, "name" : "xxxxx0" }
{ "_id" : ObjectId("511223348a88785127a0d13f"), "a" : 2, "b" : 2, "name" : "xxxxx1" }
mongodb cmd:
db.site.aggregate([
{ $match: {
"a" : {$exists:true},
"b" : {$exists:true},
}
},
{ $project: { _id : 0,name : 1,
r1: {$subtract:["$a", "$b"]} }
},
{ $limit: 100 },
]);
"result" : [
{
"name" : "xxxx1",
"r1" : -1
},
{
"name" : "xxxx0",
"r1" : -2
},
]
I cannot replicate your behaviour:
> db.tg.find()
{ "_id" : ObjectId("511223348a88785127a0d13f"), "a" : 1, "b" : 1, "name" : "xxxxx0" }
> db.tg.aggregate([{ $match: { "a" : {$exists:true}, "b" : {$exists:true} } }, { $project: { _id : 0,name : 1, r1: {$subtract:["$a", "$b"]} }}, { $limit: 100 }])
{ "result" : [ { "name" : "xxxxx0", "r1" : 0 } ], "ok" : 1 }
Can you give us a little more info like your MongoDB version?
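For reference, the arithmetic the pipeline is expected to perform on the two documents from the question can be checked outside MongoDB. This is only a plain-JavaScript sketch of the three stages, not a reproduction of the server's behaviour, and the variable names are assumptions.

```javascript
// The two documents from the question, without _id.
const rows = [
  { a: 1, b: 1, name: "xxxxx0" },
  { a: 2, b: 2, name: "xxxxx1" }
];

const result = rows
  .filter((row) => "a" in row && "b" in row)              // $match with $exists
  .map((row) => ({ name: row.name, r1: row.a - row.b }))  // $project with $subtract
  .slice(0, 100);                                         // $limit
```

Both rows give r1 = 0 here, which agrees with the output I get from the shell and not with the -1 and -2 reported in the question.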