Can Mongo Bulk Operations like insertMany Partially Fail? - mongodb

After scouring the documentation and posts online, there is one thing I have never been clear about with Mongo.
When you are attempting to write documents in bulk to a collection like the example below, is it ever possible that you would get some documents that write successfully, but some that don't?
db.products.insertMany( [
{ item: "card", qty: 15 },
{ item: "envelope", qty: 20 },
{ item: "stamps" , qty: 30 }
] );
In other words, could you ever get into a situation where you would create documents for the card and envelope items, but not for the stamps item?
I am trying to improve the performance of some of my company's processes and there is some debate in my team as to what kind of error scenarios can really arise from bulk inserts or updates, so if anyone has a clear answer, that would be fantastic. I know that generally mongo queries are not transactional unless you explicitly state so, but this is one area where it just wasn't clear.

Have a look at this example:
db.products.insertMany([
{ _id: 1, item: "card", qty: 15 },
{ _id: 2, item: "envelope", qty: 20 },
{ _id: 1, item: "stamps", qty: 30 }
]);
uncaught exception: BulkWriteError({
"writeErrors" : [
{
"index" : 2,
"code" : 11000,
"errmsg" : "E11000 duplicate key error collection: so.products index: _id_ dup key: { _id: 1.0 }",
"op" : {
"_id" : 1,
"item" : "stamps",
"qty" : 30
}
},
],
"writeConcernErrors" : [ ],
"nInserted" : 2,
"nUpserted" : 0,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
}) :
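Yes, a bulk write can partially fail. In the output above, nInserted is 2: the card and envelope documents were written, and only the stamps document (index 2) was rejected with a duplicate key error. By default the documents are inserted in the same order as in your command, and processing stops at the first error. If you specify the option ordered: false, MongoDB continues with the remaining documents after an error, and the insertion order is no longer guaranteed.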
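To make the ordered versus unordered difference concrete, here is a minimal sketch in plain JavaScript that splits the submitted documents into written and not written, based on the writeErrors array of a BulkWriteError like the one above. partitionBulkInsert is a hypothetical helper, not a driver API, and the sketch assumes the only failures are per-document write errors (no write concern errors or network failures).

```javascript
// Hypothetical helper (not part of any MongoDB driver): given the documents
// passed to insertMany and the writeErrors array from a BulkWriteError,
// work out which documents were written and which were not.
function partitionBulkInsert(docs, writeErrors, ordered = true) {
  const failedIndexes = new Set(writeErrors.map((e) => e.index));
  // In ordered mode nothing at or after the first failing index is written.
  const firstFailure =
    writeErrors.length > 0 ? Math.min(...failedIndexes) : Infinity;

  const inserted = [];
  const notInserted = [];
  docs.forEach((doc, i) => {
    const skipped = ordered ? i >= firstFailure : failedIndexes.has(i);
    (skipped ? notInserted : inserted).push(doc);
  });
  return { inserted, notInserted };
}

// The documents and the error index from the example above.
const submitted = [
  { _id: 1, item: "card", qty: 15 },
  { _id: 2, item: "envelope", qty: 20 },
  { _id: 1, item: "stamps", qty: 30 },
];
const reportedErrors = [{ index: 2, code: 11000 }];

const outcome = partitionBulkInsert(submitted, reportedErrors);
console.log(outcome.inserted.map((d) => d.item)); // card and envelope
console.log(outcome.notInserted.map((d) => d.item)); // stamps
```

If you need all-or-nothing behavior instead, that requires an explicit transaction; the helper only describes what a plain bulk insert leaves behind.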

Related

Mongoose returns nModified: 1 when no data is updated

Trying to understand why/if mongoose is updating my documents even though no data is changed?
If I save a new document with the query below it will return this in the console.log(item)
{ n: 1,
nModified: 0,
upserted: [ { index: 0, _id: 5f3d35c386aeb6c6fb35fa79 } ],
ok: 1 }
Query
Product.updateOne(
{productName: product.productName},
{$set: newProduct},
{upsert: true}
).then((item) => {
console.log(item);
}).catch((e) => {
console.log('Insert error', e);
});
If I rerun the same query, I get this back. It indicates that the document has been modified, but the data is the same; no new data has been inserted.
{ n: 1, nModified: 1, ok: 1 }
I've noticed that if I remove the stores array, delete the document, insert it again and rerun the query, I get { n: 1, nModified: 0, ok: 1 } back in the console.log(item).
I run the same queries the same number of times, but with an array in the object I get { n: 1, nModified: 1, ok: 1 }, and without an array I get { n: 1, nModified: 0, ok: 1 }.
It seems that with an array the document gets modified regardless of whether the data has changed.
Example 1
Gives { n: 1, nModified: 1, ok: 1 }
const newProduct = {
ean: product.ean,
productName: product.productName,
lowestPrice: product.productPrice,
mainCategory: categories.mainCategory,
group: categories.group,
subCategory: categories.subCategory,
subSubCategory: subSubCat,
stores: [{
name: "foobar",
}],
};
Example 2
Gives { n: 1, nModified: 0, ok: 1 }
const newProduct = {
ean: product.ean,
productName: product.productName,
lowestPrice: product.productPrice,
mainCategory: categories.mainCategory,
group: categories.group,
subCategory: categories.subCategory,
subSubCategory: subSubCat,
};
Am I misunderstanding the operation below, or what is going on?
What I want to do is:
1. Insert the document if it doesn't exist, based on productName.
2. If something differs in the document stored in the database and the newProduct, update the document.
3. If nothing differs, do nothing.
Product model
const ProductSchema = new Schema({
ean: String,
productName: String,
mainCategory: String,
subCategory: String,
group: String,
subSubCategory: String,
lowestPrice: Number,
isPopular: Boolean,
description: String,
stores: [
{
name: String,
},
],
});
Edit: As it's pretty hard to explain, I created a small repo that shows the issue.
https://github.com/gameatrix/mongo_array
The database still performs the update.
For example, let's conditionally upsert a value:
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.update({a:42},{a:42},{upsert:true})
WriteResult({
"nMatched" : 0,
"nUpserted" : 1,
"nModified" : 0,
"_id" : ObjectId("5f3d4ed509fcd40c9f092690")
})
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.update({a:42},{a:42},{upsert:true})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
The first write was an insert, the second write was an update. The second write did not change any data but the database performed a write.
You can verify there was a write by using a change stream in another shell instance:
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.watch()
{ "_id" : { "_data" : "825F3D4ED5000000012B022C0100296E5A1004FAC29486D5A3459A8726349007F2E43E46645F696400645F3D4ED509FCD40C9F0926900004" }, "operationType" : "insert", "clusterTime" : Timestamp(1597853397, 1), "fullDocument" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690"), "a" : 42 }, "ns" : { "db" : "test", "coll" : "foo" }, "documentKey" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690") } }
{ "_id" : { "_data" : "825F3D4ED6000000012B022C0100296E5A1004FAC29486D5A3459A8726349007F2E43E46645F696400645F3D4ED509FCD40C9F0926900004" }, "operationType" : "replace", "clusterTime" : Timestamp(1597853398, 1), "fullDocument" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690"), "a" : 42 }, "ns" : { "db" : "test", "coll" : "foo" }, "documentKey" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690") } }
By definition an upsert either modifies documents that match the condition or inserts new documents. You are always going to have a write when upserting.
2. If something differs in the document stored in the database and the newProduct, update the document.
The "if something differs" condition is not how MongoDB (and most databases, as far as I know) works. Whether a write is performed does not depend on whether the data being written is the same as what is already in the database.
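If the goal is to skip the write when nothing differs, one option is to do the comparison in application code before issuing the update. The following is a minimal sketch in plain JavaScript, not the approach from this answer. deepEqual and changedFields are hypothetical helpers (not part of Mongoose or the MongoDB driver), the stored document is assumed to have been read beforehand, and values are assumed to be plain JSON-like data (no ObjectId or Date handling). Reading and then writing is also not atomic, so a concurrent writer could slip in between.

```javascript
// Hypothetical helper: structural equality for plain JSON-like values.
function deepEqual(a, b) {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (k) => Object.prototype.hasOwnProperty.call(b, k) && deepEqual(a[k], b[k])
  );
}

// Hypothetical helper: return only the fields of `incoming` whose values
// differ from `stored`. A missing stored document yields every field.
function changedFields(stored, incoming) {
  const base = stored ?? {};
  const changes = {};
  for (const [key, value] of Object.entries(incoming)) {
    if (!deepEqual(base[key], value)) changes[key] = value;
  }
  return changes;
}

const storedProduct = {
  productName: "pen",
  lowestPrice: 10,
  stores: [{ name: "foobar" }],
};
const incomingProduct = {
  productName: "pen",
  lowestPrice: 12,
  stores: [{ name: "foobar" }],
};

const changes = changedFields(storedProduct, incomingProduct);
console.log(changes); // { lowestPrice: 12 }

// Only issue the update when there is something to change, for example by
// passing { $set: changes } to updateOne.
const shouldWrite = Object.keys(changes).length > 0;
console.log(shouldWrite); // true
```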

MongoDB: ensure uniqueness on two fields both ways

Say I have the fields a and b. I want to have a compound uniqueness where if a: 1, b: 2, I would not be able to do a: 2, b: 1.
The reason I want this is because I'm making a "friends list" kind of collection, where if a is connected to b, then it's automatically the reverse as well.
Is this possible at the schema level, or do I need to run queries to check?
If you don't need to differentiate between requester and requestee, you could sort the values before saving or querying so that your two fields a and b have a predictable order for any pair of friend IDs (and you can take advantage of the unique index constraint).
For example, using the mongo shell:
Create a helper function to return friend pairs in predictable order:
function friendpair (friend1, friend2) {
if ( friend1 < friend2) {
return ({a: friend1, b: friend2})
} else {
return ({a: friend2, b: friend1})
}
}
Add a compound unique index:
> db.friends.createIndex({a:1, b:1}, {unique: true});
{
"createdCollectionAutomatically" : true,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
Insert unique pairs (should work)
> db.friends.insert(friendpair(1,2))
WriteResult({ "nInserted" : 1 })
> db.friends.insert(friendpair(1,3))
WriteResult({ "nInserted" : 1 })
Insert non-unique pair (should return duplicate key error):
> db.friends.insert(friendpair(2,1))
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 11000,
"errmsg" : "E11000 duplicate key error collection: test.friends index: a_1_b_1 dup key: { : 1.0, : 2.0 }"
}
})
Search should work in either order:
db.friends.find(friendpair(3,1)).pretty()
{ "_id" : ObjectId("5bc80ed11466009f3b56fa52"), "a" : 1, "b" : 3 }
db.friends.find(friendpair(1,3)).pretty()
{ "_id" : ObjectId("5bc80ed11466009f3b56fa52"), "a" : 1, "b" : 3 }
Instead of handling duplicate key errors or insert versus update, you could also use findAndModify with an upsert since this is expected to be a unique pair:
> var pair = friendpair(2,1)
> db.friends.findAndModify({
query: pair,
update: {
$set: {
a : pair.a,
b : pair.b
},
$setOnInsert: { status: 'pending' },
},
upsert: true
})
{
"_id" : ObjectId("5bc81722ce51da0e4118c92f"),
"a" : 1,
"b" : 2,
"status" : "pending"
}
Doesn't seem like you can do a unique on the entire array's values so I'm doing a kind of work around. I'm using the $jsonSchema as follows:
{
  $jsonSchema: {
    bsonType: "object",
    required: ["status", "users"],
    properties: {
      status: {
        enum: ["pending", "accepted"],
        bsonType: "string"
      },
      users: {
        bsonType: "array",
        description: "references two user_id",
        items: { bsonType: "objectId" },
        maxItems: 2,
        minItems: 2
      }
    }
  }
}
Then I will use $all to find the connected users, e.g.
db.collection.find( { users: { $all: [ ObjectId1, ObjectId2 ] } } )

MongoDB: using find() on an array of objects only returns the first match instead of all

Unlike the other question, where someone wanted only one item returned, I HAVE one item returned and I need ALL of the matching objects in the array returned. However, the second object that matches my query is being completely ignored.
This is what one of the items in the item collection looks like:
{
name: "soda",
cost: .50,
inventory: [
{ flavor: "Grape",
amount: 8 },
{ flavor: "Orange",
amount: 4 },
{ flavor: "Root Beer",
amount: 15 }
]
}
Here is the query I typed in to mongo shell:
Items.find({"inventory.amount" : { $lte : 10} } , { name : 1, "inventory.$.flavor" : 1})
And here is the result:
"_id" : ObjectId("59dbe33094b70e0b5851724c"),
"name": "soda"
"inventory" : [
{ "flavor" : "Grape",
"amount" : 8,
}
]
And here is what I want it to return to me:
"_id" : ObjectId("59dbe33094b70e0b5851724c"),
"name": "soda"
"inventory" : [
{ "flavor" : "Grape",
"amount" : 8
},
{ "flavor" : "Orange",
"amount" : 4
}
]
I'm new to mongo and am dabbling to get familiar with it. I've read through the docs but couldn't find a solution to this though it's quite possible I overlooked it. I'd really love some help. Thanks in advance.
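The positional $ operator in a projection returns only the first matching array element. You can drop it and project the fields directly with the query below. Note that this returns every inventory entry of each matching document, including entries with an amount above 10:
db.Items.find({"inventory.amount" : { $lte : 10} } , { name : 1, "inventory.flavor" : 1 , "inventory.amount" : 1})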
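To return only the array entries that satisfy the condition, the usual tool is the $filter aggregation operator. The sketch below defines such a pipeline as a plain object (it would be passed to db.Items.aggregate(pipeline) in the shell, using the collection name from the question) and, next to it, a plain JavaScript function that mimics what the $filter stage does to a single document. filterInventory is a hypothetical name used only for illustration.

```javascript
// Aggregation pipeline: keep documents with at least one matching entry,
// then keep only the inventory entries whose amount is <= 10.
// In the mongo shell this would be run as db.Items.aggregate(pipeline).
const pipeline = [
  { $match: { "inventory.amount": { $lte: 10 } } },
  {
    $project: {
      name: 1,
      inventory: {
        $filter: {
          input: "$inventory",
          as: "entry",
          cond: { $lte: ["$$entry.amount", 10] },
        },
      },
    },
  },
];

// Plain JavaScript model of the $project/$filter stage for one document
// (hypothetical helper, for illustration only).
function filterInventory(doc, maxAmount) {
  return {
    name: doc.name,
    inventory: doc.inventory.filter((entry) => entry.amount <= maxAmount),
  };
}

// The sample document from the question.
const soda = {
  name: "soda",
  cost: 0.5,
  inventory: [
    { flavor: "Grape", amount: 8 },
    { flavor: "Orange", amount: 4 },
    { flavor: "Root Beer", amount: 15 },
  ],
};

const filtered = filterInventory(soda, 10);
console.log(filtered.inventory.map((entry) => entry.flavor)); // Grape and Orange
```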

Complex Sort on multiple very large MongoDB Collections

I have a MongoDB database with currently about 30 collections ranging from 1.5 GB to 2.5 GB each, and I need to reformat and sort the data into nested groups and dump them to a new collection. This database will eventually have about 2000 collections of the same type and formatting of data.
Data is currently available like this:
{
"_id" : ObjectId("598392d6bab47ec75fd6aea6"),
"orderid" : NumberLong("4379116282"),
"regionid" : 10000068,
"systemid" : 30045305,
"stationid" : 60015036,
"typeid" : 7489,
"bid" : 0,
"price" : 119999.91,
"minvolume" : 1,
"volremain" : 6,
"volenter" : 8,
"issued" : "2015-12-31 09:12:29",
"duration" : "14 days, 0:00:00",
"range" : 65535,
"reportedby" : 0,
"reportedtime" : "2016-01-01 00:22:42.997926"} {...} {...}
I need to group these by regionid > typeid > bid like this:
{"regionid": 10000176,
"orders": [
{
"typeid": 34,
"buy": [document, document, document, ...],
"sell": [document, document, document, ...]
},
{
"typeid": 714,
"buy": [document, document, document, ...],
"sell": [document, document, document, ...]
}]
}
Here's a more verbose sample of my ideal output format: https://gist.github.com/BatBrain/cd3426c29ce8ca8152efd1fa06ca1392
I have been trying to use the db.collection.aggregate() to do this, running this command as an initial test step:
db.day_2016_01_01.aggregate( [{ $group : { _id : "$regionid", entries : { $push: "$$ROOT" } } },{ $out : "test_group" }], { allowDiskUse:true, cursor:{} })
But I have been getting this message, "errmsg" : "BufBuilder attempted to grow() to 134217728 bytes, past the 64MB limit."
I tried looking into how to use the cursor object, but I'm pretty confused about how to apply it in this situation, or even if that is a viable option. Any advice or solutions would be great.
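For reference, the target shape described in the question (regionid > typeid > buy/sell) can be sketched in plain JavaScript on a small in-memory array. This only illustrates the grouping logic and is not a solution for multi-gigabyte collections. groupOrders is a hypothetical helper, and the mapping of bid: 1 to "buy" and anything else to "sell" is an assumption, since the question does not state it.

```javascript
// Hypothetical helper: group order documents by regionid, then typeid,
// then into buy/sell buckets.
function groupOrders(docs) {
  // regionid -> (typeid -> { typeid, buy: [], sell: [] })
  const regions = new Map();
  for (const doc of docs) {
    if (!regions.has(doc.regionid)) regions.set(doc.regionid, new Map());
    const types = regions.get(doc.regionid);
    if (!types.has(doc.typeid)) {
      types.set(doc.typeid, { typeid: doc.typeid, buy: [], sell: [] });
    }
    const bucket = types.get(doc.typeid);
    // ASSUMPTION: bid === 1 marks a buy order, anything else a sell order.
    (doc.bid === 1 ? bucket.buy : bucket.sell).push(doc);
  }
  return [...regions].map(([regionid, types]) => ({
    regionid,
    orders: [...types.values()],
  }));
}

// Tiny made-up sample using the field names from the question.
const sampleOrders = [
  { orderid: 1, regionid: 10000068, typeid: 7489, bid: 0, price: 119999.91 },
  { orderid: 2, regionid: 10000068, typeid: 7489, bid: 1, price: 100000.0 },
  { orderid: 3, regionid: 10000068, typeid: 34, bid: 0, price: 5.5 },
  { orderid: 4, regionid: 10000176, typeid: 34, bid: 1, price: 5.1 },
];

const grouped = groupOrders(sampleOrders);
console.log(JSON.stringify(grouped, null, 2));
```

Whatever tool produces this shape, each output document still has to fit within MongoDB's 16 MB BSON document size limit, so pushing every order of a region into one document may not be feasible for collections of this size.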

Different upsert behavior on Cosmos DB vs MongoDB

I'm running into an issue with Cosmos DB: for a query with {upsert: true} and $setOnInsert, the insert values are applied every time, regardless of whether the operation was an insert or an update.
The results of the following example query, when run against Cosmos DB and MongoDB, show a difference in the final value of defaultQty.
db.products.remove({})
// WriteResult({ "nRemoved" : 1 })
db.products.insert({ _id: 1, item: "apple", price: 0.05, defaultQty: 50})
// WriteResult({ "nInserted" : 1 })
db.products.find({})
// { "_id" : 1, "item" : "apple", "price" : 0.05, "defaultQty" : 50 }
sleep(100)
db.products.update(
{ _id: 1 },
{ $set: { price: 0.10 }, $setOnInsert: { defaultQty: 100 }},
{ upsert: true }
)
// WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
db.products.find({})
// { "_id" : 1, "item" : "apple", "price" : 0.1, "defaultQty" : 100 }
Here is a screen shot of the comparison results side-by-side in Studio 3T.
Has anyone experienced this?
Thanks!
This issue is now fixed and awaiting deployment. You can track progress here: https://feedback.azure.com/forums/599059-azure-cosmos-db-mongodb-api/suggestions/20017141-bug-fix-during-upsert-operation-setoninsert-is-b
We will post an update once deployment is completed.
Thanks!
Yes, this is a known issue that will be fixed shortly in Azure Cosmos DB.
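For comparison, the behavior MongoDB documents for $setOnInsert can be modelled in a few lines of plain JavaScript: the $setOnInsert fields are applied only when the upsert results in an insert and are ignored when an existing document is updated. applyUpsert is a hypothetical, heavily simplified model (only $set and $setOnInsert, equality filters only), not driver or server code.

```javascript
// Simplified model of an upsert with $set and $setOnInsert.
// `existing` is the matched document, or null when nothing matched.
function applyUpsert(existing, filter, update) {
  if (existing === null) {
    // Insert path: start from the equality fields of the filter,
    // then apply both $set and $setOnInsert.
    return { ...filter, ...update.$set, ...update.$setOnInsert };
  }
  // Update path: only $set is applied; $setOnInsert is ignored.
  return { ...existing, ...update.$set };
}

const upsertFilter = { _id: 1 };
const upsertUpdate = { $set: { price: 0.1 }, $setOnInsert: { defaultQty: 100 } };

// Document already exists: defaultQty keeps its stored value of 50.
const existingApple = { _id: 1, item: "apple", price: 0.05, defaultQty: 50 };
const afterUpdate = applyUpsert(existingApple, upsertFilter, upsertUpdate);
console.log(afterUpdate);

// No document matched: defaultQty is initialized from $setOnInsert.
const afterInsert = applyUpsert(null, upsertFilter, upsertUpdate);
console.log(afterInsert);
```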