C# MongoDB - how to update nested array elements

I have the following JSON structure that represents an item
{
    Id: "a",
    Array1: [{
        Id: "b",
        Array2: [{
            Id: "c",
            Array3: [
                {...}
            ]
        }]
    }]
}
I need to be able to either replace the array element in Array2 with a new item or to replace just Array3 with a new array.
Here is my code to replace the array item in Array2:
await Collection.UpdateOneAsync(
    item => item.Id.Equals("a") &&
            item.Array1.Any(a => a.Id.Equals("b")) &&
            item.Array1[-1].Array2.Any(b => b.Id.Equals("c")),
    Builders<Item>.Update.Set(s => s.Array1[-1].Array2[-1], newArray2Item)
);
When executing this code I'm getting this error:
"A write operation resulted in an error.
Too many positional (i.e. '$') elements found in path 'Array1.$.Array2.$'"
Here is my code to replace Array3 within Array2:
await Collection.UpdateOneAsync(
    item => item.Id.Equals("a") &&
            item.Array1.Any(a => a.Id.Equals("b")) &&
            item.Array1[-1].Array2.Any(b => b.Id.Equals("c")),
    Builders<Item>.Update.Set(s => s.Array1[-1].Array2[-1].Array3, newArray3)
);
And this is the error:
"A write operation resulted in an error.
Too many positional (i.e. '$') elements found in path 'Array1.$.Array2.$.Array3'"
I'm using C# MongoDB driver version 2.5.0 and MongoDB version 3.6.1.
I found this Jira ticket, Positional Operator Matching Nested Arrays, which says the problem was fixed, and they suggested this syntax for the update:
Update all matching documents in nested array:
db.coll.update({}, {$set: {"a.$[i].c.$[j].d": 2}}, {arrayFilters: [{"i.b": 0}, {"j.d": 0}]})
Input: {a: [{b: 0, c: [{d: 0}, {d: 1}]}, {b: 1, c: [{d: 0}, {d: 1}]}]}
Output: {a: [{b: 0, c: [{d: 2}, {d: 1}]}, {b: 1, c: [{d: 0}, {d: 1}]}]}
So I converted it to my elements:
db.getCollection('Items').update(
    {"Id": "a"},
    {$set: {"Array1.$[i].Array2.$[j].Array3": [newArray3]}},
    {arrayFilters: [
        {"i.Id": "b"},
        {"j.Id": "c"}
    ]}
)
And got this error:
cannot use the part (Array1 of Array.$[i].Array2.$[j].Array3) to traverse the element
Any ideas on how to solve this error?

Here's the C# version of what you need:
var filter = Builders<Item>.Filter.Eq("Id", "a");
var update = Builders<Item>.Update.Set("Array1.$[i].Array2.$[j].Array3", new[] { new Item { Id = "d" } });
var arrayFilters = new List<ArrayFilterDefinition>
{
    new JsonArrayFilterDefinition<Item>("{'i.Id': 'b'}"),
    new JsonArrayFilterDefinition<Item>("{'j.Id': 'c'}")
};
var updateOptions = new UpdateOptions { ArrayFilters = arrayFilters };
collection.UpdateOne(filter, update, updateOptions);
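If you want to confirm the server-side behaviour independently of the driver, here is a rough PyMongo sketch of the same arrayFilters update (MongoDB 3.6+; the database name, collection name, and new_array3 value are assumptions based on the question):
from pymongo import MongoClient

# Rough sketch: requires MongoDB 3.6+ and PyMongo 3.6+ for array_filters.
# Database/collection names and new_array3 contents are illustrative only.
collection = MongoClient()["test"]["Items"]
new_array3 = [{"Id": "d"}]

collection.update_one(
    {"Id": "a"},
    {"$set": {"Array1.$[i].Array2.$[j].Array3": new_array3}},
    array_filters=[{"i.Id": "b"}, {"j.Id": "c"}]
)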

Related

MongoDB query for two input arrays at same index?

I have two arrays, A and B, of length n defined by the input:
fruit_ids = [{id: "id1"}, {id: "id2"}, {id:"id3"}];
fruit_names = [{name: "Orange"},{name: "Kiwi"},{name: "Banana"}]
and MongoDB documents
{ farm_id: "3344", fruits: [{name: "Orange", id:"id1"}, {name: "Kiwi", id:"id67"}]}
Now I want to write a Mongo query that pulls, for a particular farm_id, the items whose id and name are specified at the same index of fruit_ids and fruit_names.
For example, for the above input, I want {name: "Orange", id: "id1"} to be deleted for farm_id: 3344.
Can anyone please help me.
You can use the $pullAll operator to remove all the matching elements and build your update statement dynamically using the code below:
var fruit_ids = [{id: "id1"}, {id: "id2"}, {id: "id3"}];
var fruit_names = [{name: "Orange"}, {name: "Apple"}, {name: "Banana"}];
var pullAll = {
    $pullAll: { fruits: fruit_ids.map((id, index) => Object.assign(fruit_names[index], id)) }
};
db.col.update({ farm_id: 3344 }, pullAll)
This will only update the document with farm_id: 3344.
I was trying $pullAll as suggested by @mickl in his answer, but my embedded documents have other fields as well, and $pullAll only works for exact matches. That's why I am currently using $pull with $or on the array of embedded docs. I found this solution in this answer: how-to-force-mongodb-pullall-to-disregard-document-order.
let arr = [{name: "Orange", id: "id1"}, {name: "Kiwi", id: "id67"}];
db.col.update(
    { farm_id: 3344 },
    { "$pull": { "fruits": { "$or": arr } } }
)
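For completeness, the two approaches can be combined; here is a rough PyMongo sketch that zips fruit_ids and fruit_names into the $pull/$or condition list (the client setup is illustrative, the field and collection names are taken from the question):
from pymongo import MongoClient

# Rough sketch: pair ids and names at the same index, then $pull any embedded
# document that matches one of those pairs, regardless of its other fields.
col = MongoClient()["test"]["col"]
fruit_ids = [{"id": "id1"}, {"id": "id2"}, {"id": "id3"}]
fruit_names = [{"name": "Orange"}, {"name": "Kiwi"}, {"name": "Banana"}]

conditions = [{**fid, **fname} for fid, fname in zip(fruit_ids, fruit_names)]
col.update_one({"farm_id": "3344"}, {"$pull": {"fruits": {"$or": conditions}}})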

Add a new field with a large number of rows to an existing collection in MongoDB

I have an existing collection with close to 1 million docs, and now I'd like to append new field data to this collection. (I'm using PyMongo.)
For example, my existing collection db.actions looks like:
...
{'_id':12345, 'A': 'apple', 'B': 'milk'}
{'_id':12346, 'A': 'pear', 'B': 'juice'}
...
Now I want to append new field data to this existing collection:
...
{'_id':12345, 'C': 'beef'}
{'_id':12346, 'C': 'chicken'}
...
such that the resulting collection should look like this:
...
{'_id':12345, 'A': 'apple', 'B': 'milk', 'C': 'beef'}
{'_id':12346, 'A': 'pear', 'B': 'juice', 'C': 'chicken'}
...
I know we can do this with update_one in a for loop, e.g.:
for doc in values:
    collection.update_one(
        {'_id': doc['_id']},
        {'$set': {k: doc[k] for k in fields}},
        upsert=True
    )
where values is a list of dictionaries, each containing two items: the _id key-value pair and the new field key-value pair. fields contains all the new fields I'd like to add.
However, the issue is that I have a million docs to update, and anything with a for loop is way too slow. Is there a way to append this new field faster, something similar to insert_many except that it appends to an existing collection?
===============================================
Update1:
So this is what I have for now,
bulk = self.get_collection().initialize_unordered_bulk_op()
for doc in values:
    bulk.find({'_id': doc['_id']}).update_one({'$set': {k: doc[k] for k in fields}})
bulk.execute()
I first wrote a sample dataframe into the db with insert_many; the performance was:
Time spent in insert_many: total: 0.0457min
Then I used update_one with a bulk operation to add two extra fields to the collection, and I got:
Time spent: for loop: 0.0283min | execute: 0.0713min | total: 0.0996min
Update2:
I added an extra column to both the existing collection and the new field data, for the purpose of using a left join to solve this. If you use a left join you can ignore the _id field.
For example, my existing collection db.actions looks like:
...
{'A': 'apple', 'B': 'milk', 'dateTime': '2017-10-12 15:20:00'}
{'A': 'pear', 'B': 'juice', 'dateTime': '2017-12-15 06:10:50'}
{'A': 'orange', 'B': 'pop', 'dateTime': '2017-12-15 16:09:10'}
...
Now I want to append new field data to this existing collection:
...
{'C': 'beef', 'dateTime': '2017-10-12 09:08:20'}
{'C': 'chicken', 'dateTime': '2017-12-15 22:40:00'}
...
such that the resulting collection should look like this:
...
{'A': 'apple', 'B': 'milk', 'C': 'beef', 'dateTime': '2017-10-12'}
{'A': 'pear', 'B': 'juice', 'C': 'chicken', 'dateTime': '2017-12-15'}
{'A': 'orange', 'B': 'pop', 'C': 'chicken', 'dateTime': '2017-12-15'}
...
If your updates are really unique per document, there is nothing faster than the bulk write API. Neither MongoDB nor the driver can guess what you want to update, so you will need to loop through your update definitions and then batch your bulk changes, which is pretty much described here:
Bulk update in Pymongo using multiple ObjectId
The "unordered" bulk writes can be slightly faster (although in my tests they weren't), but I'd still vote for the ordered approach, mainly for error-handling reasons.
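Note that the initialize_unordered_bulk_op API shown in Update1 is deprecated in newer PyMongo releases; here is a rough sketch of the same batching with the bulk_write API, assuming collection, values, and fields as defined in the question:
from pymongo import UpdateOne

# Rough sketch: one UpdateOne request per document, sent in a single bulk_write call.
requests = [
    UpdateOne({"_id": doc["_id"]}, {"$set": {k: doc[k] for k in fields}}, upsert=True)
    for doc in values
]
result = collection.bulk_write(requests)  # ordered by default; pass ordered=False for the unordered variant
print(result.modified_count, result.upserted_count)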
If, however, you can group your changes into specific recurring patterns, then you're certainly better off defining a bunch of update queries (effectively one update per unique value in your dictionary) and then issuing those, each targeting a number of documents. My Python is too poor at this point to write that entire code for you, but here's a pseudocode example of what I mean:
Let's say you've got the following update dictionary:
{
    key: "doc1",
    value:
    [
        { "field1", "value1" },
        { "field2", "value2" },
    ]
}, {
    key: "doc2",
    value:
    [
        // same fields again as for "doc1"
        { "field1", "value1" },
        { "field2", "value2" },
    ]
}, {
    key: "doc3",
    value:
    [
        { "someotherfield", "someothervalue" },
    ]
}
Then, instead of updating the three documents separately, you would send one update for the first two documents (since they require identical changes) and then one update for "doc3". The more knowledge you have upfront about the structure of your update patterns, the more you can optimize this, even by grouping updates of subsets of fields, but that's probably getting a little complicated at some point...
UPDATE:
As per your request below, let's give it a shot.
fields = ['C']
values = [
    {'_id': 'doc1a', 'C': 'v1'},
    {'_id': 'doc1b', 'C': 'v1'},
    {'_id': 'doc2a', 'C': 'v2'},
    {'_id': 'doc2b', 'C': 'v2'}
]

print('before transformation:')
for doc in values:
    print('_id ' + doc['_id'])
    for k in fields:
        print(doc[k])

# group the _ids by the value they should receive
transposed_values = {}
for doc in values:
    transposed_values[doc['C']] = transposed_values.get(doc['C'], [])
    transposed_values[doc['C']].append(doc['_id'])

print('after transformation:')
for k, v in transposed_values.items():
    print(k, v)

# one update_many per distinct value instead of one update per document
for k, v in transposed_values.items():
    collection.update_many({'_id': {'$in': v}}, {'$set': {'C': k}})
Since your join collection has fewer documents, you can convert the dateTime to a date:
db.new.find().forEach(function(d){
    d.date = d.dateTime.substring(0, 10);
    db.new.update({_id: d._id}, d);
})
then do a multiple-field lookup based on date (the substring of dateTime) and _id, and write the output to a new collection (enhanced):
db.old.aggregate([
    {$lookup: {
        from: "new",
        let: {id: "$_id", date: {$substr: ["$dateTime", 0, 10]}},
        pipeline: [
            {$match: {
                $expr: {
                    $and: [
                        {$eq: ["$$id", "$_id"]},
                        {$eq: ["$$date", "$date"]}
                    ]
                }
            }},
            {$project: {_id: 0, C: "$C"}}
        ],
        as: "newFields"
    }},
    {$project: {
        _id: 1,
        A: 1,
        B: 1,
        C: {$arrayElemAt: ["$newFields.C", 0]},
        date: {$substr: ["$dateTime", 0, 10]}
    }},
    {$out: "enhanced"}
]).pretty()
Result:
> db.enhanced.find()
{ "_id" : 12345, "A" : "apple", "B" : "milk", "C" : "beef", "date" : "2017-10-12" }
{ "_id" : 12346, "A" : "pear", "B" : "juice", "C" : "chicken", "date" : "2017-12-15" }
{ "_id" : 12347, "A" : "orange", "B" : "pop", "date" : "2017-12-15" }
>

Remove items from Array which match at least one condition from a set of conditions

I'm trying to execute a query which removes specific elements that match at least one condition from a set of conditions. For example, I want to take this document:
{
    id: 'myId',
    path2: [{
        a: '1'
    }, {
        a: '2'
    }, {
        a: '3'
    }]
}
and update it to:
{
    id: 'myId',
    path2: [{
        a: '1'
    }]
}
Here, I removed from path2 all elements where the value of the 'a' field is equal to either 2 or 3.
I tried the following with no success (I'm using mongoose):
let conditions = ['2', '3'];
myModel.findOneAndUpdate(
    {id: 'myId'},
    {$pull: {path2: {$elemMatch: {a: {$in: conditions}}}}}
);
Thank you in advance.
This should do it:
myModel.findOneAndUpdate({id: 'myId'}, {$pull: {path2: {a: {$in: conditions}}}});
You don't need to use $elemMatch.

Unique Index on Array in MongoDB

Say we have a collection of documents similar to this:
{
    foo: "Bar",
    foos: [1, 2, 3]
}
I would like to define a unique index such that no document identical to this one can be inserted into the database.
db.stuffs.ensureIndex({ foos: 1 }, { unique: true })
This seems to block any document containing a foos array with any intersection, e.g. if the document above was already in the database, then
{
    foo: "Bar",
    foos: [ 1 ]
}
would also be blocked:
> db.stuffs.ensureIndex({ foos: 1 }, { unique: true })
> db.stuffs.insert({ foo: "Bar", foos: [ 1, 2, 3 ]})
> db.stuffs.insert({ foo: "Bar", foos: [ 1 ]})
E11000 duplicate key error index: test.stuffs.$foos_1 dup key: { : 1.0 }
I would like to be able to make insertions of [ 1, 2 ], [ 2, 1 ], [ 1, 3 ], etc., but not two [ 1, 2 ].
The array index will not meet your requirement, but I think you can switch to another format to store your data.
If there is no need to use array features (such as the $addToSet and $push operators), you can simply hash/map your data to another format, e.g. [1,2,3] to the string "1,2,3".
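A rough PyMongo sketch of that mapping, assuming an extra helper field (here called foos_key, an illustrative name) that holds the joined string and carries the unique index; if you keep the array as well, the two fields have to be kept in sync on every update:
from pymongo import MongoClient, errors

# Rough sketch: enforce uniqueness on a string form of the array.
# "foos_key" is an illustrative field name, not from the question.
stuffs = MongoClient()["test"]["stuffs"]
stuffs.create_index("foos_key", unique=True)

def insert_stuff(foo, foos):
    stuffs.insert_one({"foo": foo, "foos": foos, "foos_key": ",".join(map(str, foos))})

insert_stuff("Bar", [1, 2, 3])
insert_stuff("Bar", [1])            # allowed: "1" is not "1,2,3"
try:
    insert_stuff("Bar", [1, 2, 3])  # rejected as a duplicate of "1,2,3"
except errors.DuplicateKeyError as exc:
    print(exc)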
But I assume that you want to retain the array operations in order to make updates. In that case you can try the subdocument approach below:
db.stuffs.ensureIndex({ foos: 1 }, { unique: true }) // build the index for the sub doc
db.stuffs.insert({ foo: "Bar", foos: {k:[ 1, 2, 3 ]}})
db.stuffs.insert({ foo: "Bar", foos: {k:[ 1 ]}})
db.stuffs.update({ "_id" : ObjectId("54081f544ea4d4e96bffd9ad")}, {$push:{"foos.k": 2}})
db.stuffs.insert({ foo: "Bar", foos: {k:[1, 2]}})
E11000 duplicate key error index: test.stuffs.$foos_1 dup key: { : { k: [ 1.0, 2.0 ] } }
Please refer to this question; it explains why a unique index won't work in this case:
Unique index in MongoDB
To index a field that holds an array value, MongoDB creates an index
key for each element in the array.
If the array that you want to index is of constant size, you can create a unique compound index on the array positions like this:
collection.createIndex({"coordinate.0": 1, "coordinate.1": 1}, {unique: true})
coordinate is an array of size 2.
When I try to insert a duplicate coordinate, it returns an error as expected.
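For reference, a rough PyMongo sketch of the same fixed-size approach (the collection name is illustrative):
from pymongo import MongoClient, errors

# Rough sketch: unique compound index on the two fixed array positions.
coords = MongoClient()["test"]["coordinates"]
coords.create_index([("coordinate.0", 1), ("coordinate.1", 1)], unique=True)

coords.insert_one({"coordinate": [1, 2]})
try:
    coords.insert_one({"coordinate": [1, 2]})   # same values at positions 0 and 1
except errors.DuplicateKeyError as exc:
    print(exc)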

Add subdocument array element to subdocument array element in mongoDB

Is this possible?
I have a collection C, with an array of attributes A1.
Each attribute has an array of subattributes A2.
How can I add a subdocument to a specific C.A1 subdocument?
Here is an example.
db.docs.insert({_id: 1, A1: [{A2: [1, 2, 3]}, {A2: [4, 5, 6]}]})
If you know the index of the subdocument you want to insert into, you can use dot notation with the index (starting from 0) in the middle:
db.docs.update({_id: 1}, {$addToSet: {'A1.0.A2': 9}})
This results in:
{
    "A1" : [
        {
            "A2" : [
                1,
                2,
                3,
                9
            ]
        },
        {
            "A2" : [
                4,
                5,
                6
            ]
        }
    ],
    "_id" : 1
}
Yes, this is possible. If you post an example I can show you more specifically what the update query would look like. But here's a shot:
db.c.update({ A1: value }, { $addToSet: { "A1.$.A2": "some value" }})
I haven't actually tried this (I'm not in front of a Mongo instance right now) and I'm going off memory, but that should get you pretty close.
Yes, $push can be used to do the same. Try the code given below.
db.c.update({ A1: value }, { $push: { "A1.$.A2": num }});