MongoDB Document Failed Validation [duplicate]

This question already has an answer here:
Mongo Json Schema Validator AnyOf not working
(1 answer)
Closed 3 years ago.
I'm trying to use JSON Schema validators on my test collection. It has an anyOf validation rule which, if my understanding is valid and correct, should accept foo OR bar.
Validation:
{
  $jsonSchema: {
    bsonType: 'object',
    additionalProperties: false,
    anyOf: [
      {
        bsonType: 'object',
        properties: {
          foo: {
            bsonType: 'string'
          }
        },
        additionalProperties: false
      },
      {
        bsonType: 'object',
        properties: {
          bar: {
            bsonType: 'string'
          }
        },
        additionalProperties: false
      }
    ],
    properties: {
      _id: {
        bsonType: 'objectId'
      }
    }
  }
}
Command to insert a document:
rs0:PRIMARY> db.myColl.insert([{foo:"123"}])
Error given:
BulkWriteResult({
  "writeErrors" : [
    {
      "index" : 0,
      "code" : 121,
      "errmsg" : "Document failed validation",
      "op" : {
        "_id" : ObjectId("6ee51b4766ba25a01fbcf8u9"),
        "foo" : "test123"
      }
    }
  ],
  "writeConcernErrors" : [ ],
  "nInserted" : 0,
  "nUpserted" : 0,
  "nMatched" : 0,
  "nModified" : 0,
  "nRemoved" : 0,
  "upserted" : [ ]
})
As far as I know, MongoDB supports draft 4 of JSON Schema, as specified here.
Why is it still giving me an error code of 121 (Document failed validation)?
Am I missing something?
Thanks in advance.

As mentioned by @Relequestual, the post linked here helped me solve my question. Thank you!
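For future readers, my reading of why the original validator rejects the insert: the top-level additionalProperties: false is checked against the top-level properties (which only lists _id), so foo is refused as an additional property before anyOf is even evaluated. A sketch of one way to restructure it (my own illustration in the spirit of the linked answer, not a quote from it) is to declare every allowed field at the top level and use anyOf only to require at least one of them:

db.runCommand({
  collMod: "myColl",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      additionalProperties: false,
      properties: {
        _id: { bsonType: "objectId" },
        foo: { bsonType: "string" },
        bar: { bsonType: "string" }
      },
      // anyOf now only adds "at least one of foo / bar must be present".
      anyOf: [
        { required: ["foo"] },
        { required: ["bar"] }
      ]
    }
  }
})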

Related

Mongoose returns nModified: 1 when no data is updated

I'm trying to understand why (and whether) Mongoose is updating my documents even though no data has changed.
If I save a new document with the query below, it returns this in console.log(item):
{ n: 1,
  nModified: 0,
  upserted: [ { index: 0, _id: 5f3d35c386aeb6c6fb35fa79 } ],
  ok: 1 }
Query
Product.updateOne(
  { productName: product.productName },
  { $set: newProduct },
  { upsert: true }
).then((item) => {
  console.log(item);
}).catch((e) => {
  console.log('Insert error', e);
});
If I rerun the same query again, I get this back. It indicates that the document has been modified, but the data is the same; no new data has been inserted.
{ n: 1, nModified: 1, ok: 1 }
I've noticed that if I remove the stores array, delete the document, insert it again and rerun the query, I get { n: 1, nModified: 0, ok: 1 } back in console.log(item).
I run the same queries, the same number of times, but when the object contains an array I get { n: 1, nModified: 1, ok: 1 }, and when it doesn't I get { n: 1, nModified: 0, ok: 1 }.
It seems that when there is an array, the document gets modified regardless of whether the data has changed.
Example 1
Gives { n: 1, nModified: 1, ok: 1 }
const newProduct = {
  ean: product.ean,
  productName: product.productName,
  lowestPrice: product.productPrice,
  mainCategory: categories.mainCategory,
  group: categories.group,
  subCategory: categories.subCategory,
  subSubCategory: subSubCat,
  stores: [{
    name: "foobar",
  }],
};
Example 2
Gives { n: 1, nModified: 0, ok: 1 }
const newProduct = {
  ean: product.ean,
  productName: product.productName,
  lowestPrice: product.productPrice,
  mainCategory: categories.mainCategory,
  group: categories.group,
  subCategory: categories.subCategory,
  subSubCategory: subSubCat,
};
Is it me who misunderstands the operation below, or what's going on?
What I want to do is:
1. Insert if the document doesn't exist, based on productName.
2. If something differs between the document stored in the database and newProduct, update the document.
3. If nothing differs, do nothing.
Product model
const ProductSchema = new Schema({
  ean: String,
  productName: String,
  mainCategory: String,
  subCategory: String,
  group: String,
  subSubCategory: String,
  lowestPrice: Number,
  isPopular: Boolean,
  description: String,
  stores: [
    {
      name: String,
    },
  ],
});
Edit: as it's pretty hard to explain, I created a small repo that shows the issue:
https://github.com/gameatrix/mongo_array
The database still performs the update.
For example, let's conditionally upsert a value:
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.update({a:42},{a:42},{upsert:true})
WriteResult({
  "nMatched" : 0,
  "nUpserted" : 1,
  "nModified" : 0,
  "_id" : ObjectId("5f3d4ed509fcd40c9f092690")
})
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.update({a:42},{a:42},{upsert:true})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
The first write was an insert, the second write was an update. The second write did not change any data but the database performed a write.
You can verify there was a write by using a change stream in another shell instance:
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.watch()
{ "_id" : { "_data" : "825F3D4ED5000000012B022C0100296E5A1004FAC29486D5A3459A8726349007F2E43E46645F696400645F3D4ED509FCD40C9F0926900004" }, "operationType" : "insert", "clusterTime" : Timestamp(1597853397, 1), "fullDocument" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690"), "a" : 42 }, "ns" : { "db" : "test", "coll" : "foo" }, "documentKey" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690") } }
{ "_id" : { "_data" : "825F3D4ED6000000012B022C0100296E5A1004FAC29486D5A3459A8726349007F2E43E46645F696400645F3D4ED509FCD40C9F0926900004" }, "operationType" : "replace", "clusterTime" : Timestamp(1597853398, 1), "fullDocument" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690"), "a" : 42 }, "ns" : { "db" : "test", "coll" : "foo" }, "documentKey" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690") } }
By definition an upsert either modifies documents that match the condition or inserts new documents. You are always going to have a write when upserting.
2. If something differs between the document stored in the database and newProduct, update the document.
The quoted condition ("if something differs ... update the document") is not how MongoDB (and, as far as I know, most databases) works. Whether a write is performed does not depend on whether the data being written is the same as what is already in the database.
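If the goal in points 2 and 3 really is to skip the write when nothing has changed, that comparison has to happen in application code before the update is issued. A minimal sketch (the helper name and the use of Node's util.isDeepStrictEqual are my own assumptions, and note that Mongoose adds an _id to each stores sub-document, which a strict comparison would also have to ignore):

const { isDeepStrictEqual } = require('util');

// Sketch: only touch the database when the stored product actually differs.
async function upsertIfChanged(Product, newProduct) {
  // lean() returns a plain object so it can be compared with newProduct directly;
  // top-level _id and __v are projected out (sub-document _id fields would still
  // need to be stripped before a strict comparison).
  const existing = await Product.findOne(
    { productName: newProduct.productName },
    { _id: 0, __v: 0 }
  ).lean();

  if (!existing) {
    return Product.create(newProduct);            // 1. insert when missing
  }
  if (!isDeepStrictEqual(existing, newProduct)) {
    return Product.updateOne(                     // 2. update only when different
      { productName: newProduct.productName },
      { $set: newProduct }
    );
  }
  return null;                                    // 3. otherwise do nothing
}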

MongoDB: ensure uniqueness on two fields both ways

Say I have the fields a and b. I want compound uniqueness such that if a: 1, b: 2 exists, I would not be able to insert a: 2, b: 1.
The reason I want this is because I'm making a "friends list" kind of collection, where if a is connected to b, then it's automatically the reverse as well.
Is this possible at the schema level, or do I need to run queries to check?
If you don't need to differentiate between requester and requestee, you could sort the values before saving or querying so that your two fields a and b have a predictable order for any pair of friend IDs (and you can take advantage of the unique index constraint).
For example, using the mongo shell:
Create a helper function to return friend pairs in predictable order:
function friendpair (friend1, friend2) {
  if (friend1 < friend2) {
    return ({a: friend1, b: friend2});
  } else {
    return ({a: friend2, b: friend1});
  }
}
Add a compound unique index:
> db.friends.createIndex({a:1, b:1}, {unique: true});
{
  "createdCollectionAutomatically" : true,
  "numIndexesBefore" : 1,
  "numIndexesAfter" : 2,
  "ok" : 1
}
Insert unique pairs (should work)
> db.friends.insert(friendpair(1,2))
WriteResult({ "nInserted" : 1 })
> db.friends.insert(friendpair(1,3))
WriteResult({ "nInserted" : 1 })
Insert non-unique pair (should return duplicate key error):
> db.friends.insert(friendpair(2,1))
WriteResult({
  "nInserted" : 0,
  "writeError" : {
    "code" : 11000,
    "errmsg" : "E11000 duplicate key error collection: test.friends index: a_1_b_1 dup key: { : 1.0, : 2.0 }"
  }
})
Search should work in either order:
db.friends.find(friendpair(3,1)).pretty()
{ "_id" : ObjectId("5bc80ed11466009f3b56fa52"), "a" : 1, "b" : 3 }
db.friends.find(friendpair(1,3)).pretty()
{ "_id" : ObjectId("5bc80ed11466009f3b56fa52"), "a" : 1, "b" : 3 }
Instead of handling duplicate key errors or insert versus update, you could also use findAndModify with an upsert since this is expected to be a unique pair:
> var pair = friendpair(2,1)
> db.friends.findAndModify({
    query: pair,
    update: {
      $set: {
        a: pair.a,
        b: pair.b
      },
      $setOnInsert: { status: 'pending' },
    },
    upsert: true
  })
{
  "_id" : ObjectId("5bc81722ce51da0e4118c92f"),
  "a" : 1,
  "b" : 2,
  "status" : "pending"
}
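On newer shells the same upsert can also be written with updateOne (a sketch reusing the friendpair helper above; $setOnInsert still only applies when the document is created):

var pair = friendpair(2, 1);
db.friends.updateOne(
  pair,                                                // matches either insertion order thanks to the helper
  { $set: pair, $setOnInsert: { status: 'pending' } },
  { upsert: true }
);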
It doesn't seem like you can enforce uniqueness on the entire array's values, so I'm doing a kind of workaround. I'm using $jsonSchema as follows:
{
  $jsonSchema: {
    bsonType: "object",
    required: [
      "status",
      "users"
    ],
    properties: {
      status: {
        enum: [
          "pending",
          "accepted"
        ],
        bsonType: "string"
      },
      users: {
        bsonType: "array",
        description: "references two user_id",
        items: {
          bsonType: "objectId"
        },
        maxItems: 2,
        minItems: 2,
      },
    }
  }
}
Then I will use $all to find the connected users, e.g.
db.collection.find( { users: { $all: [ ObjectId1, ObjectId2 ] } } )
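If you go the array route, a multikey index on users lets that $all lookup be served by an index rather than a collection scan (a sketch; ObjectId1 and ObjectId2 are placeholders as above):

db.collection.createIndex( { users: 1 } )
db.collection.find( { users: { $all: [ ObjectId1, ObjectId2 ] } } )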

Mongoose match element or empty array with $in statement

I'm trying to select any documents where privacy settings match the provided ones and any documents which do not have any privacy settings (i.e. public).
The current setup: I have a schema with an array of ObjectIds referencing another collection:
privacy: [{
  type: mongoose.Schema.Types.ObjectId,
  ref: 'Category',
  index: true,
  required: true,
  default: []
}],
And I want to filter all content for my categories plus the public ones, in our case content that does not have a privacy setting, i.e. an empty array [].
We currently query that with an $or query:
{"$or":[
{"privacy": {"$size": 0}},
{"privacy": {"$in":
["5745bdd4b896d4f4367558b4","5745bd9bb896d4f4367558b2"]}
}
]}
I would love to query it by only providing an empty array [] as one of the comparison options in the $in statement, which is possible in MongoDB:
db.emptyarray.insert({a:1})
db.emptyarray.insert({a:2, b:null})
db.emptyarray.insert({a:2, b:[]})
db.emptyarray.insert({a:3, b:["perm1"]})
db.emptyarray.insert({a:3, b:["perm1", "perm2"]})
db.emptyarray.insert({a:3, b:["perm1", "perm2", []]})
> db.emptyarray.find({b:[]})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[]}})
> db.emptyarray.find({b:{$in:[[], "perm1"]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce1"), "a" : 3, "b" : [ "perm1" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce2"), "a" : 3, "b" : [ "perm1", "perm2" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[[], "perm1", null]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629cde"), "a" : 1 }
{ "_id" : ObjectId("5a305f3dd89e8a887e629cdf"), "a" : 2, "b" : null }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce1"), "a" : 3, "b" : [ "perm1" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce2"), "a" : 3, "b" : [ "perm1", "perm2" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[[]]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
Maybe like this:
"privacy_locations":{
"$in": ["5745bdd4b896d4f4367558b4","5745bd9bb896d4f4367558b2",[]]
}
But this query works from the console (CLI), yet not in the code, where it throws a cast error:
{
  "message": "Error in retrieving records from db.",
  "error": {
    "message": "Cast to ObjectId failed for value \"[]\" at ...
  }
}
Now, I perfectly understand that the cast is happening because the schema is defined as ObjectId.
But I still find that this approach is missing two possible scenarios.
I believe it is possible to query (in MongoDB) null options or empty array within an $in statement.
array: { $in: [null, [], [option-1, option-2]] }
Is this correct?
I've been thinking that the best solution to my problem (cannot select by options or empty array) could be to make the empty array an explicit option, for example a privacy setting of ALL, instead of the current convention where "not set" means all.
But I don't want a major refactor of the existing code; I just need to see if I can write a better or more performant query.
Today we have the query working with an $or statement, which has issues with indexes. And even if it is fast, I wanted to bring attention to this issue, even if it is not considered a bug.
I will appreciate any comments or guidance.
The semi-short answer is that the schema is mixing types for the privacy property (ObjectId and Array) while declaring that it is strictly of type ObjectId in the schema.
Since MongoDB is schema-less it will allow any document shape per document and doesn't need to verify the query document to match a schema. Mongoose on the other hand is meant to apply a schema enforcement and so it will verify a query document against the schema before it attempts to query the DB. The query document for { privacy: { $in: [[]] } } will fail validation since an empty array is not a valid ObjectId as indicated by the error.
The schema would need to declare the type as Mixed (which doesn't support ref) to continue using an empty array as an acceptable type as well as ObjectId.
// Current
const FooSchema = new mongoose.Schema({
  privacy: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Category',
    index: true,
    required: true,
    default: []
  }]
});
const Foo = connection.model('Foo', FooSchema);

const foo1 = new Foo();
const foo2 = new Foo({privacy: [mongoose.Types.ObjectId()]});

Promise.all([
  foo1.save(),
  foo2.save()
]).then((results) => {
  console.log('Saved', results);
  /*
    [
      { __v: 0, _id: 5a36e36a01e1b77cba8bd12f, privacy: [] },
      { __v: 0, _id: 5a36e36a01e1b77cba8bd131, privacy: [ 5a36e36a01e1b77cba8bd130 ] }
    ]
  */
  return Foo.find({privacy: { $in: [[]] }}).exec();
}).then((results) => {
  // Never gets here
  console.log('Found', results);
}).catch((err) => {
  console.log(err);
  // { [CastError: Cast to ObjectId failed for value "[]" at path "privacy" for model "Foo"] }
});
And the working version. Also note the adjustment to properly apply the required flag, index flag and default value.
// Updated
const FooSchema = new mongoose.Schema({
  privacy: {
    type: [{
      type: mongoose.Schema.Types.Mixed
    }],
    index: true,
    required: true,
    default: [[]]
  }
});
const Foo = connection.model('Foo', FooSchema);

const foo1 = new Foo();
const foo2 = new Foo({
  privacy: [mongoose.Types.ObjectId()]
});

Promise.all([
  foo1.save(),
  foo2.save()
]).then((results) => {
  console.log(results);
  /*
    [
      { __v: 0, _id: 5a36f01733704f7e58c0bf9a, privacy: [ [] ] },
      { __v: 0, _id: 5a36f01733704f7e58c0bf9c, privacy: [ 5a36f01733704f7e58c0bf9b ] }
    ]
  */
  return Foo.find().where({
    privacy: { $in: [[]] }
  }).exec();
}).then((results) => {
  console.log(results);
  // [ { _id: 5a36f01733704f7e58c0bf9a, __v: 0, privacy: [ [] ] } ]
});
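With the Mixed-type schema above, the original "public or one of my categories" filter can then be expressed with a single $in instead of the $or from the question (a sketch; the category ids are the ones quoted earlier):

// Sketch: one $in covering both the empty-array ("public") case and specific categories.
Foo.find({
  privacy: {
    $in: [
      [],                                                   // no privacy settings, i.e. public
      mongoose.Types.ObjectId('5745bdd4b896d4f4367558b4'),
      mongoose.Types.ObjectId('5745bd9bb896d4f4367558b2')
    ]
  }
}).exec();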

How do I validate an array of objects using the MongoDB validator?

I have been trying to validate my data using the validators provided by MongoDB but I have run into a problem. Here is a simple user document which I am inserting.
{
  "name" : "foo",
  "surname" : "bar",
  "books" : [
    {
      "name" : "ABC",
      "no" : 19
    },
    {
      "name" : "DEF",
      "no" : 64
    },
    {
      "name" : "GHI",
      "no" : 245
    }
  ]
}
Now, this is the validator which has been applied to the users collection, but it is not working for the books array which I am inserting along with the document. I want to check the fields inside the objects that are members of the books array. The schema of the objects won't change.
db.runCommand({
  collMod: "users",
  validator: {
    $or: [
      { "name": { $type: "string" } },
      { "surname": { $type: "string" } },
      { "books.name": { $type: "string" } },
      { "books.no": { $type: "number" } }
    ]
  },
  validationLevel: "strict"
});
I know that this validator applies to the member fields and not to the array itself, but then how do I validate such an object?
It has been a very long time since this question was asked, but in case anyone else comes across it: for MongoDB 3.6 and later, this can be achieved using the validator.
db.createCollection("users", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name","surname","books"],
properties: {
name: {
bsonType: "string",
description: "must be a string and is required"
},
surname: {
bsonType: "string",
description: "must be a string and is required"
},
books: {
bsonType: [ "array" ],
items: {
bsonType: "object",
required:["name","no"],
properties:{
name:{
bsonType: "string",
description: "must be a string and is required"
},
no:{
bsonType: "number",
description: "must be a number and is required"
}
}
},
description: "must be a array of objects containing name and no"
}
}
}
}
})
This handles all your requirements.
For more information, refer to this link.
You can do it in 3.6 using a $jsonSchema expression.
JsonSchema allows defining a field as an array and specifying schema constraints for all elements as well as specific constraints for individual array elements.
This blog post has a number of examples which will help you figure out the syntax.
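As a small sketch of the per-element part mentioned above (the collection and field names here are purely illustrative): when items is given as an array of schemas, each schema constrains the element at that position, and additionalItems constrains anything beyond them.

db.createCollection("events", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      properties: {
        tags: {
          bsonType: "array",
          // Position-wise constraints: first element a string, second a number...
          items: [
            { bsonType: "string" },
            { bsonType: "number" }
          ],
          // ...and any further elements must be objects.
          additionalItems: { bsonType: "object" }
        }
      }
    }
  }
})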

MongoDB nested Document Validation for sub-documents

I have a document structured like the following. My question is: how do I do validation of the nested "roles" part on the database side? My requirements are:
1. The roles array size can be 0 or more than 1.
2. name and created_by must be present for each role if a role is created.
{
  "_id": "123456",
  "name": "User Name",
  "roles": [
    {
      "name": "mobiles_user",
      "last_usage_at": {
        "$date": 1457000592991
      },
      "created_by": "987654",
      "created_at": {
        "$date": 1457000592991
      }
    },
    {
      "name": "webs_user",
      "last_usage_at": {
        "$date": 1457000592991
      },
      "created_by": "987654",
      "created_at": {
        "$date": 1457000592991
      }
    }
  ]
}
At the moment, I am only doing the following for the non-nested attributes:
db.createCollection( "users",
{ "validator" : {
"_id" : {
"$type" : "string"
},
"email" : {
"$regex" : /#gmail\.com$/
},
"name" : {
"$type" : "string"
}
}
} )
Could anyone please advise how to do the nested document validation?
Yes, you can validate all sub-documents in a document by negating $elemMatch, and you can ensure that the size is not 1. It's sure not pretty though! And not exactly obvious either.
> db.createCollection('users', {
... validator: {
... name: {$type: 'string'},
... roles: {$exists: 'true'},
... $nor: [
... {roles: {$size: 1}},
... {roles: {$elemMatch: {
... $or: [
... {name: {$not: {$type: 'string'}}},
... {created_by: {$not: {$type: 'string'}}},
... ]
... }}}
... ],
... }
... })
{ "ok" : 1 }
This is confusing, but it works! What it means is: only accept documents where it is neither the case that roles has size 1, nor that roles has an element with a name that isn't a string or a created_by that isn't a string.
This is based upon the fact that in logic terms,
for all x: f(x) and g(x)
Is equivalent to
not exists x s.t.: not f(x) or not g(x)
We have to use the latter since MongoDB only gives us an exists operator.
Proof
Valid documents work:
> db.users.insert({
... name: 'hello',
... roles: [],
... })
WriteResult({ "nInserted" : 1 })
> db.users.insert({
... name: 'hello',
... roles: [
... {name: 'foo', created_by: '2222'},
... {name: 'bar', created_by: '3333'},
... ]
... })
WriteResult({ "nInserted" : 1 })
If a field is missing from roles, it fails:
> db.users.insert({
... name: 'hello',
... roles: [
... {name: 'foo', created_by: '2222'},
... {created_by: '3333'},
... ]
... })
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
If a field in roles has the wrong type, it fails:
> db.users.insert({
... name: 'hello',
... roles: [
... {name: 'foo', created_by: '2222'},
... {name: 'bar', created_by: 3333},
... ]
... })
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
If roles has size 1 it fails:
> db.users.insert({
... name: 'hello',
... roles: [
... {name: 'foo', created_by: '2222'},
... ]
... })
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
The only thing I can't figure out unfortunately is how to ensure that roles is an array. roles: {$type: 'array'} seems to fail everything, I presume because it's actually checking that the elements are of type 'array'?
Edit: this answer is not correct, it is possible to validate all sub-documents in the array. See answer: https://stackoverflow.com/a/43102783/200224
You can't really. You can do things like:
"roles.name": { "$type": "string" }
But all that really means is that "at least one" of those properties needs to match the specified type. That means this would actually be valid:
{
  "_id" : "123456",
  "name" : "User Name",
  "roles" : [
    {
      "name" : "mobiles_user",
      "last_usage_at" : ISODate("2016-03-03T10:23:12.991Z"),
      "created_by" : "987654",
      "created_at" : ISODate("2016-03-03T10:23:12.991Z")
    },
    {
      "name" : "webs_user",
      "last_usage_at" : ISODate("2016-03-03T10:23:12.991Z"),
      "created_by" : "987654",
      "created_at" : ISODate("2016-03-03T10:23:12.991Z")
    },
    {
      "name" : 1
    }
  ]
}
It is, after all, "document validation", and that is by nature not well suited to sub-documents in arrays, or any data in a contained array really.
The core of the implementation relies on expressions available to query operators, and since MongoDB's standard query expressions lack anything that equates to "all array entries must match this value" without being directly specific, it is not possible to express this as a validator condition.
The only possibility for checking array content like that in a "query" expression is $where, and that is noted as not being an available option with document validation.
Even the $size operator available for queries must match a specific "size" value and cannot use an inequality condition. So you "could" verify a strict size, but not a minimal size, unless:
"roles.0": { "$exists": true }
This is a feature in "infancy" and somewhat experimental, so there is the possibility that future releases may address this.
But for now, your better option is to do such "schema validation" in client-side code (where you will get much better exception reporting) instead. There are many existing libraries that take that approach.
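For completeness, on MongoDB 3.6+ the rules from the question are usually expressed more directly with $jsonSchema, along the lines of the answer linked in the edit above (a sketch, not a verbatim copy of that answer):

db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "roles"],
      properties: {
        name: { bsonType: "string" },
        roles: {
          bsonType: "array",
          // "size 0 or more than 1": reject arrays of exactly one element.
          not: { minItems: 1, maxItems: 1 },
          items: {
            bsonType: "object",
            required: ["name", "created_by"],
            properties: {
              name: { bsonType: "string" },
              created_by: { bsonType: "string" }
            }
          }
        }
      }
    }
  }
})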