mongdb ensure uniqueness on two fields both ways - mongodb

Say I have the fields a and b. I want to have a compound uniqueness where if a: 1, b: 2, I would not be able to do a: 2, b: 1.
The reason I want this is because I'm making a "friends list" kind of collection, where if a is connected to b, then it's automatically the reverse as well.
is this possible on a schema level or do I need to do queries to check.

If you don't need to differentiate between requester and requestee, you could sort the values before saving or querying so that your two fields a and b have a predictable order for any pair of friend IDs (and you can take advantage of the unique index constraint).
For example, using the mongo shell:
Create a helper function to return friend pairs in predictable order:
function friendpair (friend1, friend2) {
if ( friend1 < friend2) {
return ({a: friend1, b: friend2})
} else {
return ({a: friend2, b: friend1})
}
}
Add a compound unique index:
> db.friends.createIndex({a:1, b:1}, {unique: true});
{
"createdCollectionAutomatically" : true,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
Insert unique pairs (should work)
> db.friends.insert(friendpair(1,2))
WriteResult({ "nInserted" : 1 })
> db.friends.insert(friendpair(1,3))
WriteResult({ "nInserted" : 1 })
Insert non-unique pair (should return duplicate key error):
> db.friends.insert(friendpair(2,1))
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 11000,
"errmsg" : "E11000 duplicate key error collection: test.friends index: a_1_b_1 dup key: { : 1.0, : 2.0 }"
}
})
Search should work in either order:
db.friends.find(friendpair(3,1)).pretty()
{ "_id" : ObjectId("5bc80ed11466009f3b56fa52"), "a" : 1, "b" : 3 }
db.friends.find(friendpair(1,3)).pretty()
{ "_id" : ObjectId("5bc80ed11466009f3b56fa52"), "a" : 1, "b" : 3 }
Instead of handling duplicate key errors or insert versus update, you could also use findAndModify with an upsert since this is expected to be a unique pair:
> var pair = friendpair(2,1)
> db.friends.findAndModify({
query: pair,
update: {
$set: {
a : pair.a,
b : pair.b
},
$setOnInsert: { status: 'pending' },
},
upsert: true
})
{
"_id" : ObjectId("5bc81722ce51da0e4118c92f"),
"a" : 1,
"b" : 2,
"status" : "pending"
}

Doesn't seem like you can do a unique on the entire array's values so I'm doing a kind of work around. I'm using the $jsonSchema as follows:
{
$jsonSchema:
{
bsonType:
"object",
required:
[
"status",
"users"
],
properties:
{
status:
{
enum:
[
"pending",
"accepted"
],
bsonType:
"string"
},
users:
{
bsonType:
"array",
description:
"references two user_id",
items:
{
bsonType:
"objectId"
},
maxItems:
2,
minItems:
2,
},
}
}
}
then I will use $all to find the connected users, e.g.
db.collection.find( { users: { $all: [ ObjectId1, ObjectId2 ] } } )

Related

Mongoose retun nModified : 1 when no data is updated

Trying to understand why/if mongoose is updating my documents even though no data is changed?
If I save a new document with the query below it will return this in the console.log(item)
{ n: 1,
nModified: 0,
upserted: [ { index: 0, _id: 5f3d35c386aeb6c6fb35fa79 } ],
ok: 1 }
Query
Product.updateOne(
{productName: product.productName},
{$set: newProduct},
{upsert: true}
).then((item) => {
console.log(item);
}).catch((e) => {
console.log('Insert error', e);
});
If i rerun the same query again i get this back. This indicates that the document has been modified but the data is the same, there is no new data that has been inserted.
{ n: 1, nModified: 1, ok: 1 }
I've noticed if i remove the stores array, delete the document, insert it again and rerun the query I get { n: 1, nModified: 0, ok: 1 } back in the console.log(item)
I run the same querys, the same amout of time, but when having an array in the object i get this { n: 1, nModified: 1, ok: 1 } and when not having an array a get this { n: 1, nModified: 0, ok: 1 }
It seems that when having an array the document gets modified regardless if the data is changed.
Example 1
Gives { n: 1, nModified: 1, ok: 1 }
const newProduct = {
ean: product.ean,
productName: product.productName,
lowestPrice: product.productPrice,
mainCategory: categories.mainCategory,
group: categories.group,
subCategory: categories.subCategory,
subSubCategory: subSubCat,
stores: [{
name: "foobar",
}],
};
Example 2
Gives { n: 1, nModified: 0, ok: 1 }
const newProduct = {
ean: product.ean,
productName: product.productName,
lowestPrice: product.productPrice,
mainCategory: categories.mainCategory,
group: categories.group,
subCategory: categories.subCategory,
subSubCategory: subSubCat,
};
Is it me who misunderstands the operation below or whats going on?
What i want to do is:
1.insert if the document don't exists based on productName,
2.if something differs in the document stored in the database and the newProduct, update the document.
3.If nothing differs, do nothing
Product model
const ProductSchema = new Schema({
ean: String,
productName: String,
mainCategory: String,
subCategory: String,
group: String,
subSubCategory: String,
lowestPrice: Number,
isPopular: Boolean,
description: String,
stores: [
{
name: String,
},
],
});
Edit: As its pretty hard to explain I created a small repo that shows the issue.
https://github.com/gameatrix/mongo_array
The database still performs the update.
For example, let's conditionally upsert a value:
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.update({a:42},{a:42},{upsert:true})
WriteResult({
"nMatched" : 0,
"nUpserted" : 1,
"nModified" : 0,
"_id" : ObjectId("5f3d4ed509fcd40c9f092690")
})
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.update({a:42},{a:42},{upsert:true})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
The first write was an insert, the second write was an update. The second write did not change any data but the database performed a write.
You can verify there was a write by using a change stream in another shell instance:
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.watch()
{ "_id" : { "_data" : "825F3D4ED5000000012B022C0100296E5A1004FAC29486D5A3459A8726349007F2E43E46645F696400645F3D4ED509FCD40C9F0926900004" }, "operationType" : "insert", "clusterTime" : Timestamp(1597853397, 1), "fullDocument" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690"), "a" : 42 }, "ns" : { "db" : "test", "coll" : "foo" }, "documentKey" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690") } }
{ "_id" : { "_data" : "825F3D4ED6000000012B022C0100296E5A1004FAC29486D5A3459A8726349007F2E43E46645F696400645F3D4ED509FCD40C9F0926900004" }, "operationType" : "replace", "clusterTime" : Timestamp(1597853398, 1), "fullDocument" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690"), "a" : 42 }, "ns" : { "db" : "test", "coll" : "foo" }, "documentKey" : { "_id" : ObjectId("5f3d4ed509fcd40c9f092690") } }
By definition an upsert either modifies documents that match the condition or inserts new documents. You are always going to have a write when upserting.
2.if something differs in the document stored in the database and the newProduct, update the document.
The bolded part is not how MongoDB (and most databases, as far as I know) work. Whether a write is performed does not depend on whether the data being written is the same as what is already in the database.

Mongoose match element or empty array with $in statement

I'm trying to select any documents where privacy settings match the provided ones and any documents which do not have any privacy settings (i.e. public).
Current behavior is that if I have a schema with an array of object ids referenced to another collection:
privacy: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'Category',
index: true,
required: true,
default: []
}],
And I want to filter all content for my categories and the public ones, in our case content that does not have a privacy settings. i.e. an empty array []
We currently query that with an or query
{"$or":[
{"privacy": {"$size": 0}},
{"privacy": {"$in":
["5745bdd4b896d4f4367558b4","5745bd9bb896d4f4367558b2"]}
}
]}
I would love to query it by only providing an empty array [] as one the comparison options in the $in statement. Which is possible in mongodb:
db.emptyarray.insert({a:1})
db.emptyarray.insert({a:2, b:null})
db.emptyarray.insert({a:2, b:[]})
db.emptyarray.insert({a:3, b:["perm1"]})
db.emptyarray.insert({a:3, b:["perm1", "perm2"]})
db.emptyarray.insert({a:3, b:["perm1", "perm2", []]})
> db.emptyarray.find({b:[]})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[]}})
> db.emptyarray.find({b:{$in:[[], "perm1"]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce1"), "a" : 3, "b" : [ "perm1" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce2"), "a" : 3, "b" : [ "perm1", "perm2" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[[], "perm1", null]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629cde"), "a" : 1 }
{ "_id" : ObjectId("5a305f3dd89e8a887e629cdf"), "a" : 2, "b" : null }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce1"), "a" : 3, "b" : [ "perm1" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce2"), "a" : 3, "b" : [ "perm1", "perm2" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[[]]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
Maybe like this:
"privacy_locations":{
"$in": ["5745bdd4b896d4f4367558b4","5745bd9bb896d4f4367558b2",[]]
}
But this query, works from the console (CLI), but not in the code where it throws a cast error:
{
"message":"Error in retrieving records from db.",
"error":
{
"message":"Cast to ObjectId failed for value \"[]\" at ...
}
}
Now I perfectly understand the cast is happening because the Schema is defined as an ObjectId.
But I still find that this approach is missing two possible scenarios.
I believe it is possible to query (in MongoDB) null options or empty array within an $in statement.
array: {$in:[null, [], [option-1, option-2]}
Is this correct?
I've been thinking that the best solution to my problem (Cannot select in options or empty) could be to have empty arrays be an array with a fix option of ALL for example. A setting for privacy that means ALL instead of how it is now which is that if not set, that is considered all.
But I don't want a major refactor of the existing code, I just need to see if I can make a better query or more performant query.
Today we have the query working with an $OR statement that has issues with indexes. And even if it is fast, I wanted to bring attention to this issue even if is not considered a bug.
I will appreciate any comments or guidance.
The semi-short answer is that the schema is mixing types for the privacy property (ObjectId and Array) while declaring that it is strictly of type ObjectId in the schema.
Since MongoDB is schema-less it will allow any document shape per document and doesn't need to verify the query document to match a schema. Mongoose on the other hand is meant to apply a schema enforcement and so it will verify a query document against the schema before it attempts to query the DB. The query document for { privacy: { $in: [[]] } } will fail validation since an empty array is not a valid ObjectId as indicated by the error.
The schema would need to declare the type as Mixed (which doesn't support ref) to continue using an empty array as an acceptable type as well as ObjectId.
// Current
const FooSchema = new mongoose.Schema({
privacy: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'Category',
index: true,
required: true,
default: []
}]
});
const Foo = connection.model('Foo', FooSchema);
const foo1 = new Foo();
const foo2 = new Foo({privacy: [mongoose.Types.ObjectId()]});
Promise.all([
foo1.save(),
foo2.save()
]).then((results) => {
console.log('Saved', results);
/*
[
{ __v: 0, _id: 5a36e36a01e1b77cba8bd12f, privacy: [] },
{ __v: 0, _id: 5a36e36a01e1b77cba8bd131, privacy: [ 5a36e36a01e1b77cba8bd130 ] }
]
*/
return Foo.find({privacy: { $in: [[]] }}).exec();
}).then((results) => {
// Never gets here
console.log('Found', results);
}).catch((err) => {
console.log(err);
// { [CastError: Cast to ObjectId failed for value "[]" at path "privacy" for model "Foo"] }
});
And the working version. Also note the adjustment to properly apply the required flag, index flag and default value.
// Updated
const FooSchema = new mongoose.Schema({
privacy: {
type: [{
type: mongoose.Schema.Types.Mixed
}],
index: true,
required: true,
default: [[]]
}
});
const Foo = connection.model('Foo', FooSchema);
const foo1 = new Foo();
const foo2 = new Foo({
privacy: [mongoose.Types.ObjectId()]
});
Promise.all([
foo1.save(),
foo2.save()
]).then((results) => {
console.log(results);
/*
[
{ __v: 0, _id: 5a36f01733704f7e58c0bf9a, privacy: [ [] ] },
{ __v: 0, _id: 5a36f01733704f7e58c0bf9c, privacy: [ 5a36f01733704f7e58c0bf9b ] }
]
*/
return Foo.find().where({
privacy: { $in: [[]] }
}).exec();
}).then((results) => {
console.log(results);
// [ { _id: 5a36f01733704f7e58c0bf9a, __v: 0, privacy: [ [] ] } ]
});

MongoDB Conditional validation on arrays and embedded documents

I have a number of documents in my database where I am applying document validation. All of these documents may have embedded documents. I can apply simple validation along the lines of SQL non NULL checks (these are essentially enforcing the primary key constraints) but what I would like to do is apply some sort of conditional validation to the optional arrays and embedded documents. By example, lets say I have a document that looks like this:
{
"date": <<insertion date>>,
"name" : <<the portfolio name>>,
"assets" : << amount of money we have to trade with>>
}
Clearly I can put validation on this document to ensure that date name and assets all exist at insertion time. Lets say, however, that I'm managing a stock portfolio and the document can have future updates to show an array of stocks like this:
{
"date" : <<insertion date>>,
"name" : <<the portfolio name>>,
"assets" : << amount of money we have to trade with>>
"portfolio" : [
{ "stockName" : "IBM",
"pricePaid" : 155.39,
"sharesHeld" : 100
},
{ "stockName" : "Microsoft",
"pricePaid" : 57.22,
"sharesHeld" : 250
}
]
}
Is it possible to to apply a conditional validation to this array of sub documents? It's valid for the portfolio to not be there but if it is each document in the array must contain the three fields "stockName", "pricePaid" and "sharesHeld".
MongoShell
db.createCollection("collectionname",
{
validator: {
$or: [
{
"portfolio": {
$exists: false
}
},
{
$and: [
{
"portfolio": {
$exists: true
}
},
{
"portfolio.stockName": {
$type: "string",
$exists: true
}
},
{
"portfolio.pricePaid": {
$type: "double",
$exists: true
}
},
{
"portfolio.sharesHeld": {
$type: "double",
$exists: true
}
}
]
}
]
}
})
With this above validation in place you can insert documents with or without portfolio.
After executing the validator in shell, then you can insert data of following
db.collectionname.insert({
"_id" : ObjectId("58061aac8812662c9ae1b479"),
"date" : ISODate("2016-10-18T12:50:52.372Z"),
"name" : "B",
"assets" : 200
})
db.collectionname.insert({
"_id" : ObjectId("58061ab48812662c9ae1b47a"),
"date" : ISODate("2016-10-18T12:51:00.747Z"),
"name" : "A",
"assets" : 100,
"portfolio" : [
{
"stockName" : "Microsoft",
"pricePaid" : 57.22,
"sharesHeld" : 250
}
]
})
If we try to insert a document like this
db.collectionname.insert({
"date" : new Date(),
"name" : "A",
"assets" : 100,
"portfolio" : [
{ "stockName" : "IBM",
"sharesHeld" : 100
}
]
})
then we will get the below error message
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
Using Mongoose
Yes it can be done, Based on your scenario you may need to initialize the parent and the child schema.
Shown below would be a sample of child(portfolio) schema in mongoose.
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var portfolioSchema = new Schema({
"stockName" : { type : String, required : true },
"pricePaid" : { type : Number, required : true },
"sharesHeld" : { type : Number, required : true },
}
References:
http://mongoosejs.com/docs/guide.html
http://mongoosejs.com/docs/subdocs.html
Can I require an attribute to be set in a mongodb collection? (not null)
Hope it Helps!

mongodb combination of fields unique

I need to ensure combination of some fields to be unique , so i found out we can use indexes to achieve it , like :
db.user.ensureIndex({'fname':1,'lname':1},{unique:true})
However in my scenario i allow the user to add email or phone or even both . So how do i achieve the uniqueness in this case . I tried like setting 2 indexed like
db.user.ensureIndex({'fname':1,'lname':1,'email':1},{unique:true})
And
db.user.ensureIndex({'fname':1,'lname':1,'mobile':1},{unique:true})
this dint work when i try to add a document with email unique , like below
> db.user.insert({fname:'tony',lname:'stark',email:'aa#gmail.com'})
WriteResult({ "nInserted" : 1 })
> db.user.insert({fname:'tony',lname:'stark',email:'xyz#gmail.com'})
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 11000,
"errmsg" : "E11000 duplicate key error index: temp.user.$fname_1_lname_1_mobile_1 dup key: { : \"tony\", : \"stark\", : null }"
}
})
I guess the error is because of the second index ,since both the inserts have null value for the mobile field . I then tried this
db.user.ensureIndex({'fname':1,'lname':1,'email':1,'mobile':1},{unique:true})
And that dint help ,is there any way to do this with mongodb or check the uniqueness with code otherwise .
Update
I need to ensure uniqueness in this case too :
db.user.insert({fname:'tony',lname:'stark',email:'abc#gmail.com'})
<<insert>>
db.user.insert({fname:'tony',lname:'stark',email:'abc#gmail.com',mobile:554545435})
<<dont insert>>
Update 2
Even Unique partial Index didn't help
> db.users.createIndex({ "fname": 1, "lname": 1, "email":1 },
... { unique: true, partialFilterExpression:{
... email: { $exists: true, $gt : { $type : 10 } } } })
{
"createdCollectionAutomatically" : true,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
> db.users.createIndex({ "fname": 1, "lname": 1, "mobile":1 },
... { unique: true, partialFilterExpression:{
... mobile: { $exists: true, $gt : { $type : 10 } } } })
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 2,
"numIndexesAfter" : 3,
"ok" : 1
}
Here is the query :
>db.users.insert({fname:"tony",lname:"stark",mobile:"123",email:"123#gmail.com"})
WriteResult({ "nInserted" : 1 })
>db.users.insert({fname:"tony",lname:"stark",mobile:"123",email:"123#gmail.com"})
WriteResult({ "nInserted" : 1 })
Mongo version - v3.2.0
I have no experience with mongodb much neither have used indexes in other db's , i guess it's kind of noob question .Thanks in advance

MongoDB nested Document Validation for sub-documents

I got a document structured like the following. My question is how do I do the nested part "roles" validation on the database side. My requirements are:
the roles size could be 0 or more than 1.
the presence of name and created_by for a role if a role is created.
{
"_id": "123456",
"name": "User Name",
"roles": [
{
"name": "mobiles_user",
"last_usage_at": {
"$date": 1457000592991
},
"created_by": "987654",
"created_at": {
"$date": 1457000592991
}
},
{
"name": "webs_user",
"last_usage_at": {
"$date": 1457000592991
},
"created_by": "987654",
"created_at": {
"$date": 1457000592991
}
},
]
}
At the moment, I am only doing the following for those none nested attributes:
db.createCollection( "users",
{ "validator" : {
"_id" : {
"$type" : "string"
},
"email" : {
"$regex" : /#gmail\.com$/
},
"name" : {
"$type" : "string"
}
}
} )
Could anyone please advise that how to do the nested document validation?
Yes, you can validate all sub-documents in a document by negating $elemMatch, and you can ensure that the size is not 1. It's sure not pretty though! And not exactly obvious either.
> db.createCollection('users', {
... validator: {
... name: {$type: 'string'},
... roles: {$exists: 'true'},
... $nor: [
... {roles: {$size: 1}},
... {roles: {$elemMatch: {
... $or: [
... {name: {$not: {$type: 'string'}}},
... {created_by: {$not: {$type: 'string'}}},
... ]
... }}}
... ],
... }
... })
{ "ok" : 1 }
This is confusing, but it works! What it means is only accept documents where neither the size of roles is 1 nor roles has an element with a name that isn't a string or a created_by that isn't a string.
This is based upon the fact that in logic terms,
for all x: f(x) and g(x)
Is equivalent to
not exists x s.t.: not f(x) or not g(x)
We have to use the latter since MongoDB only gives us an exists operator.
Proof
Valid documents work:
> db.users.insert({
... name: 'hello',
... roles: [],
... })
WriteResult({ "nInserted" : 1 })
> db.users.insert({
... name: 'hello',
... roles: [
... {name: 'foo', created_by: '2222'},
... {name: 'bar', created_by: '3333'},
... ]
... })
WriteResult({ "nInserted" : 1 })
If a field is missing from roles, it fails:
> db.users.insert({
... name: 'hello',
... roles: [
... {name: 'foo', created_by: '2222'},
... {created_by: '3333'},
... ]
... })
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
If a field in roles has the wrong type, it fails:
> db.users.insert({
... name: 'hello',
... roles: [
... {name: 'foo', created_by: '2222'},
... {name: 'bar', created_by: 3333},
... ]
... })
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
If roles has size 1 it fails:
> db.users.insert({
... name: 'hello',
... roles: [
... {name: 'foo', created_by: '2222'},
... ]
... })
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
The only thing I can't figure out unfortunately is how to ensure that roles is an array. roles: {$type: 'array'} seems to fail everything, I presume because it's actually checking that the elements are of type 'array'?
Edit: this answer is not correct, it is possible to validate all sub-documents in the array. See answer: https://stackoverflow.com/a/43102783/200224
You can't really. You can do things like:
"roles.name": { "$type": "string" }
But all that really means is at "at least one" of those properties need match the specified type. That means this would actually be valid:
{
"_id" : "123456",
"name" : "User Name",
"roles" : [
{
"name" : "mobiles_user",
"last_usage_at" : ISODate("2016-03-03T10:23:12.991Z"),
"created_by" : "987654",
"created_at" : ISODate("2016-03-03T10:23:12.991Z")
},
{
"name" : "webs_user",
"last_usage_at" : ISODate("2016-03-03T10:23:12.991Z"),
"created_by" : "987654",
"created_at" : ISODate("2016-03-03T10:23:12.991Z")
},
{
"name" : 1
}
]
}
It is afterall "documement validation" and that is by nature not well suited to sub-documents in arrays, or any data in a contained array really.
The core of the implementation relies on expressions available to query operators, and since MongoDB lacks anythin in standard query expressions that equates to "All array entries must match this value" without being directly specific then it's not possible to express as a validator condition.
The only posibility to check array content like that in a "query" expression is using $where, and that is noted to not be an available option with document validation.
Even the $size operator available for queries must match a specific "size" value, and cannot use an in-equality condition. So you "could" verify a strict size, but not a minimal size, unless:
"roles.0": { "$exists": true }
This is a feature in "infancy" and somewhat experimental, so there is the possibility that future releases may address this.
But for now, your better option is to do such "schema validation" in client side code ( where you will get a lot better exception reporting ) instead. There are many libraries already existing that take that approach.