I'm trying to select any documents where privacy settings match the provided ones and any documents which do not have any privacy settings (i.e. public).
I have a schema with an array of ObjectIds that reference another collection:
privacy: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'Category',
index: true,
required: true,
default: []
}],
I want to fetch all content for my categories plus the public content, which in our case is content that has no privacy settings, i.e. an empty array [].
We currently do that with an $or query:
{"$or":[
{"privacy": {"$size": 0}},
{"privacy": {"$in":
["5745bdd4b896d4f4367558b4","5745bd9bb896d4f4367558b2"]}
}
]}
I would love to query it by simply providing an empty array [] as one of the comparison options in the $in statement, which is possible in MongoDB:
db.emptyarray.insert({a:1})
db.emptyarray.insert({a:2, b:null})
db.emptyarray.insert({a:2, b:[]})
db.emptyarray.insert({a:3, b:["perm1"]})
db.emptyarray.insert({a:3, b:["perm1", "perm2"]})
db.emptyarray.insert({a:3, b:["perm1", "perm2", []]})
> db.emptyarray.find({b:[]})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[]}})
> db.emptyarray.find({b:{$in:[[], "perm1"]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce1"), "a" : 3, "b" : [ "perm1" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce2"), "a" : 3, "b" : [ "perm1", "perm2" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[[], "perm1", null]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629cde"), "a" : 1 }
{ "_id" : ObjectId("5a305f3dd89e8a887e629cdf"), "a" : 2, "b" : null }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce1"), "a" : 3, "b" : [ "perm1" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce2"), "a" : 3, "b" : [ "perm1", "perm2" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[[]]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
Maybe like this:
"privacy_locations":{
"$in": ["5745bdd4b896d4f4367558b4","5745bd9bb896d4f4367558b2",[]]
}
This query works from the console (CLI), but not in the code, where it throws a cast error:
{
"message":"Error in retrieving records from db.",
"error":
{
"message":"Cast to ObjectId failed for value \"[]\" at ...
}
}
Now I perfectly understand the cast is happening because the Schema is defined as an ObjectId.
But I still find that this approach is missing two possible scenarios.
I believe it is possible to query (in MongoDB) null options or empty array within an $in statement.
array: {$in: [null, [], [option-1, option-2]]}
Is this correct?
I've been thinking that the best solution to my problem (cannot select by options or empty) could be to replace empty arrays with an array holding a fixed option such as ALL: an explicit privacy setting that means ALL, instead of the current convention where an unset value is treated as all.
But I don't want a major refactor of the existing code; I just need to see whether I can write a better or more performant query.
Today the query works with an $or statement, which has issues with indexes. Even if it is fast, I wanted to bring attention to this issue, even if it is not considered a bug.
I will appreciate any comments or guidance.
The semi-short answer is that the schema is mixing types for the privacy property (ObjectId and Array) while declaring that it is strictly of type ObjectId in the schema.
Since MongoDB is schema-less, it allows any shape per document and does not need to check a query document against a schema. Mongoose, on the other hand, is meant to enforce a schema, so it casts and validates a query document against the schema before sending it to the DB. The query document { privacy: { $in: [[]] } } fails that step because an empty array is not a valid ObjectId, as the error indicates.
The schema would need to declare the type as Mixed (which doesn't support ref) to continue using an empty array as an acceptable type as well as ObjectId.
// Current
const FooSchema = new mongoose.Schema({
privacy: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'Category',
index: true,
required: true,
default: []
}]
});
const Foo = connection.model('Foo', FooSchema);
const foo1 = new Foo();
const foo2 = new Foo({privacy: [mongoose.Types.ObjectId()]});
Promise.all([
foo1.save(),
foo2.save()
]).then((results) => {
console.log('Saved', results);
/*
[
{ __v: 0, _id: 5a36e36a01e1b77cba8bd12f, privacy: [] },
{ __v: 0, _id: 5a36e36a01e1b77cba8bd131, privacy: [ 5a36e36a01e1b77cba8bd130 ] }
]
*/
return Foo.find({privacy: { $in: [[]] }}).exec();
}).then((results) => {
// Never gets here
console.log('Found', results);
}).catch((err) => {
console.log(err);
// { [CastError: Cast to ObjectId failed for value "[]" at path "privacy" for model "Foo"] }
});
And the working version. Also note the adjustment to properly apply the required flag, index flag and default value.
// Updated
const FooSchema = new mongoose.Schema({
privacy: {
type: [{
type: mongoose.Schema.Types.Mixed
}],
index: true,
required: true,
default: [[]]
}
});
const Foo = connection.model('Foo', FooSchema);
const foo1 = new Foo();
const foo2 = new Foo({
privacy: [mongoose.Types.ObjectId()]
});
Promise.all([
foo1.save(),
foo2.save()
]).then((results) => {
console.log(results);
/*
[
{ __v: 0, _id: 5a36f01733704f7e58c0bf9a, privacy: [ [] ] },
{ __v: 0, _id: 5a36f01733704f7e58c0bf9c, privacy: [ 5a36f01733704f7e58c0bf9b ] }
]
*/
return Foo.find().where({
privacy: { $in: [[]] }
}).exec();
}).then((results) => {
console.log(results);
// [ { _id: 5a36f01733704f7e58c0bf9a, __v: 0, privacy: [ [] ] } ]
});
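If switching the field to Mixed is not an option, the $or form from the question stays compatible with the ObjectId schema, because every value inside $in is a valid ObjectId string and the empty-array case is handled by $size. A minimal sketch of a helper that builds that filter; the name buildPrivacyFilter is hypothetical and not part of Mongoose:

```javascript
// Hypothetical helper (not a Mongoose API): builds a filter that matches
// public documents (empty privacy array) or documents that share at least
// one of the given category ids.
function buildPrivacyFilter(categoryIds) {
  const clauses = [{ privacy: { $size: 0 } }];
  // An empty $in list matches nothing, so only add it when ids are present.
  if (categoryIds.length > 0) {
    clauses.push({ privacy: { $in: categoryIds } });
  }
  return { $or: clauses };
}

const filter = buildPrivacyFilter([
  "5745bdd4b896d4f4367558b4",
  "5745bd9bb896d4f4367558b2",
]);
console.log(JSON.stringify(filter));
```

The resulting object can be passed to Foo.find(filter) with the original ObjectId schema, since no empty array ever reaches the cast step.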
Related
Is there a way to update more than one key inside a subdocument at once (for a given subdocument _id) instead of writing the query like this:
articles.updateOne(
{
_id: 123,
'data._id': 5,
},
{
$set: {
'data.$.comments': 10,
'data.$.visible': true,
},
},
);
Sample document:
{
"_id" : 123,
"data" : [
{
"_id" : 5,
"comments" : 8,
"visible" : false,
"status" : null
}
]
}
I am looking for such a solution:
$set: {
'data.$': { visible: true, comments: 10 },
},
... in other words: is it possible to submit an object with a few keys so that only the given keys inside the subdocument are updated and the existing keys are left untouched? Like MySQL... UPDATE * ... SET foo = 'bar', test = 'hello', ...
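One way to get close to this is to generate the dotted positional paths from a plain object, so the caller only lists the keys that should change. A minimal sketch in plain JavaScript; the helper name toPositionalSet is hypothetical, and it only handles top-level keys of the subdocument:

```javascript
// Hypothetical helper: turns { visible: true, comments: 10 } into
// { $set: { 'data.$.visible': true, 'data.$.comments': 10 } }.
function toPositionalSet(arrayField, changes) {
  const set = {};
  for (const [key, value] of Object.entries(changes)) {
    // '$' is the positional operator for the array element matched by the query.
    set[`${arrayField}.$.${key}`] = value;
  }
  return { $set: set };
}

const update = toPositionalSet("data", { visible: true, comments: 10 });
console.log(update);
```

The result could then be used as the second argument of the updateOne call shown above. Because $set only touches the paths it names, keys such as status are left as they are.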
Just wondering what the best way to accomplish this is. I can think of some janky ways, but they don't seem right.
What I'm trying to do is remove all sub-sub-array objects from a document, as follows:
SCHEMA
schema {
person: Array<{
id: string;
posts: Array<{
id: string,
comments: Array<{
id: string
tagged_person_id: string;
}>
}>
}>
}
What I am looking for is some way to delete all comments in every post for each person where the comment has tagged_person_id == some_id. This isn't my actual use case, but it represents the same concept.
I know how to use $pull to remove from a subarray for one subdocument, but just not sure how to accomplish all of this in one query, or if it's even possible.
As per JIRA ticket SERVER-1243 and the documentation, starting with MongoDB v3.5.12 (a development release; the feature is generally available from 3.6), given the following document:
{
"posts" : [
{
"comments" : [
{
"tagged_person_id" : "x"
},
{
"tagged_person_id" : "y"
}
]
},
{
"comments" : [
{
"tagged_person_id" : "x"
}
]
},
{
"comments" : [
{
"tagged_person_id" : "y"
}
]
}
]
}
You can run this update:
db.collection.update({}, {
$pull : {
"posts.$[].comments" : {"tagged_person_id": "x"}
}
})
in order to remove all comments where tagged_person_id is equal to "x".
Result:
{
"posts" : [
{
"comments" : [
{
"tagged_person_id" : "y"
}
]
},
{
"comments" : []
},
{
"comments" : [
{
"tagged_person_id" : "y"
}
]
}
]
}
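To make the effect of the all-positional $[] operator concrete, the same transformation can be written in plain JavaScript. This is only an in-memory illustration of the result (the real update runs on the server), and pullTaggedComments is a hypothetical name:

```javascript
// Illustration only: mimics what $pull on "posts.$[].comments" produces,
// without mutating the input document.
function pullTaggedComments(doc, taggedPersonId) {
  return {
    ...doc,
    posts: doc.posts.map((post) => ({
      ...post,
      comments: post.comments.filter(
        (comment) => comment.tagged_person_id !== taggedPersonId
      ),
    })),
  };
}

const before = {
  posts: [
    { comments: [{ tagged_person_id: "x" }, { tagged_person_id: "y" }] },
    { comments: [{ tagged_person_id: "x" }] },
    { comments: [{ tagged_person_id: "y" }] },
  ],
};
const after = pullTaggedComments(before, "x");
console.log(JSON.stringify(after, null, 2));
```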
Say I have the fields a and b. I want to have a compound uniqueness where if a: 1, b: 2, I would not be able to do a: 2, b: 1.
The reason I want this is because I'm making a "friends list" kind of collection, where if a is connected to b, then it's automatically the reverse as well.
Is this possible at the schema level, or do I need to run queries to check?
If you don't need to differentiate between requester and requestee, you could sort the values before saving or querying so that your two fields a and b have a predictable order for any pair of friend IDs (and you can take advantage of the unique index constraint).
For example, using the mongo shell:
Create a helper function to return friend pairs in predictable order:
function friendpair (friend1, friend2) {
if ( friend1 < friend2) {
return ({a: friend1, b: friend2})
} else {
return ({a: friend2, b: friend1})
}
}
Add a compound unique index:
> db.friends.createIndex({a:1, b:1}, {unique: true});
{
"createdCollectionAutomatically" : true,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
Insert unique pairs (should work):
> db.friends.insert(friendpair(1,2))
WriteResult({ "nInserted" : 1 })
> db.friends.insert(friendpair(1,3))
WriteResult({ "nInserted" : 1 })
Insert non-unique pair (should return duplicate key error):
> db.friends.insert(friendpair(2,1))
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 11000,
"errmsg" : "E11000 duplicate key error collection: test.friends index: a_1_b_1 dup key: { : 1.0, : 2.0 }"
}
})
Search should work in either order:
db.friends.find(friendpair(3,1)).pretty()
{ "_id" : ObjectId("5bc80ed11466009f3b56fa52"), "a" : 1, "b" : 3 }
db.friends.find(friendpair(1,3)).pretty()
{ "_id" : ObjectId("5bc80ed11466009f3b56fa52"), "a" : 1, "b" : 3 }
Instead of handling duplicate key errors or insert versus update, you could also use findAndModify with an upsert since this is expected to be a unique pair:
> var pair = friendpair(2,1)
> db.friends.findAndModify({
query: pair,
update: {
$set: {
a : pair.a,
b : pair.b
},
$setOnInsert: { status: 'pending' },
},
upsert: true
})
{
"_id" : ObjectId("5bc81722ce51da0e4118c92f"),
"a" : 1,
"b" : 2,
"status" : "pending"
}
It doesn't seem like you can put a unique constraint on an entire array's values, so I'm using a kind of workaround. I'm using $jsonSchema as follows:
{
  $jsonSchema: {
    bsonType: "object",
    required: ["status", "users"],
    properties: {
      status: {
        enum: ["pending", "accepted"],
        bsonType: "string"
      },
      users: {
        bsonType: "array",
        description: "references two user_id",
        items: {
          bsonType: "objectId"
        },
        maxItems: 2,
        minItems: 2
      }
    }
  }
}
Then I will use $all to find the connected users, e.g.
db.collection.find( { users: { $all: [ ObjectId1, ObjectId2 ] } } )
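Note that $all matches the two ids in either order, but nothing in the validator stops both [A, B] and [B, A] from being stored. One option, borrowing the sorting idea from the friendpair helper above, is to normalize the order before inserting. A sketch, assuming the ids can be compared by their string form; sortedUsers is a hypothetical helper name:

```javascript
// Hypothetical helper: returns the two ids in a predictable order so that
// the same pair always produces the same `users` array.
function sortedUsers(id1, id2) {
  // Compare by string form; assumes ids stringify consistently (e.g. hex strings).
  return String(id1) < String(id2) ? [id1, id2] : [id2, id1];
}

const users = sortedUsers(
  "5745bdd4b896d4f4367558b4",
  "5745bd9bb896d4f4367558b2"
);
console.log(users);
```

With a normalized order, an exact array match such as { users: sortedUsers(a, b) } finds the pair regardless of which user initiated the connection.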
I am new to MongoDB so this is probably a basic question (hopefully). I currently have 10 million records with 410 fields loaded in a mongodb collection like so:
{
"_id" : ObjectId("........"),
"AddressID" : 123455,
"IndividualId" : 1,
"personfirstname" : "FirstName",
"personmiddleinitial" : "M",
"personlastname" : "LastName",
"etc": "....."
}
I need to wrap all of this data into an embedded document like so:
{
"_id" : ObjectId("........"),
"data" : {
"AddressID" : 123455,
"IndividualId" : 1,
"personfirstname" : "FirstName",
"personmiddleinitial" : "M",
"personlastname" : "LastName",
"etc": "....."
}
}
I don't necessarily need to update this data in-place but that would be nice. If I need to export this data somehow specifying the new format and then re-import the new, updated data that is fine. Performing this via the MongoDB shell would be ideal.
As suggested by chridam within comments you can execute the following aggregation pipeline:
db.collectionName.aggregate([
{ $project: { _id: "$_id", data: "$$ROOT" } },
{ $out: "newCollectionName" }
]);
This way you have the _id field both at root level and in the data object. Thus, you can execute a massive update to unset the second one:
db.newCollectionName.updateMany(
{},
{ $unset: { "data._id": "" } }
);
Finally, you can drop the first collection and rename the second to restore the original name on the updated collection:
db.collectionName.drop();
db.newCollectionName.rename("collectionName");
This approach runs entirely within the database and avoids fetching any of your 10 million documents to the client.
You can simply do this in the shell with the following:
db.test.find().forEach(function(doc){
doc = { _id: doc._id, data: doc };
delete doc.data._id;
db.test.save(doc);
});
For example, if we insert the following documents:
> db.test.insertMany([
... {
... _id: ObjectId("5a91af8908e17c5997e03b7e"),
... field1: false,
... field2: 0,
... field3: "No"
... },
... {
... _id: ObjectId("5a91afbc08e17c5997e03b7f"),
... field1: true,
... field2: 1,
... field3: "Yes"
... }])
{
"acknowledged" : true,
"insertedIds" : [
ObjectId("5a91af8908e17c5997e03b7e"),
ObjectId("5a91afbc08e17c5997e03b7f")
]
}
Then run:
db.test.find().forEach(function(doc){
doc = { _id: doc._id, data: doc };
delete doc.data._id;
db.test.save(doc);
});
Our documents now look like this:
> db.test.find().pretty()
{
"_id" : ObjectId("5a91af8908e17c5997e03b7e"),
"data" : {
"field1" : false,
"field2" : 0,
"field3" : "No"
}
}
{
"_id" : ObjectId("5a91afbc08e17c5997e03b7f"),
"data" : {
"field1" : true,
"field2" : 1,
"field3" : "Yes"
}
}
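The same reshaping can also be written as a small function that does not mutate its input, which may be easier to check in isolation before running it against the whole collection. A sketch in plain JavaScript; wrapInData is a hypothetical name, and the _id is shown as a string only to keep the example self-contained:

```javascript
// Hypothetical helper: moves every field except _id under a `data` key.
function wrapInData(doc) {
  // Rest destructuring copies all remaining fields into a new object.
  const { _id, ...rest } = doc;
  return { _id, data: rest };
}

const wrapped = wrapInData({
  _id: "5a91af8908e17c5997e03b7e",
  field1: false,
  field2: 0,
  field3: "No",
});
console.log(wrapped);
```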
I have the following documents inside a folders collection:
folders: [
{ _id : 1,
docs : [
{ foo : 1,
bar : undefined},
{ foo : 3,
bar : 3}
]
},
{ _id : 2,
docs : [
{ foo : 2,
bar : 2},
{ foo : 3,
bar : 3}
]
},
{ _id : 3,
docs : [
{ foo : 2},
{ foo : 3,
bar : 3}
]
},
{ _id : 4,
docs : [
{ foo : 1 }
]
},
{ _id : 5,
docs : [
{ foo : 1,
bar : null }
]
}
]
I need to be able to query the documents that do not have an undefined value, null value, or non-existent value for docs.bar. In the case above, the query should only return the document with _id: 2. I currently have a solution but I was wondering if there is a better way to query the documents.
My current solution:
db.folders.find({$nor: [{"docs.bar": { $exists: false }}]})
This ...
db.folders.find({"docs.bar": {$exists: true, $ne: null}})
... should return only those entries in which docs.bar exists and no subdocument in the docs array has a null or missing bar. Note: both conditions sit under a single "docs.bar" key so that they are ANDed; repeating the key in a JavaScript object literal would silently keep only the last condition. I think that matches your requirements, and it returns the document with _id: 2 from the set you supplied.
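To make the intended result concrete, the requirement can be restated as a plain JavaScript predicate over the sample data. This is only an in-memory illustration of which documents should come back, not a statement about how the server evaluates the query:

```javascript
// Sample data copied from the question.
const folders = [
  { _id: 1, docs: [{ foo: 1, bar: undefined }, { foo: 3, bar: 3 }] },
  { _id: 2, docs: [{ foo: 2, bar: 2 }, { foo: 3, bar: 3 }] },
  { _id: 3, docs: [{ foo: 2 }, { foo: 3, bar: 3 }] },
  { _id: 4, docs: [{ foo: 1 }] },
  { _id: 5, docs: [{ foo: 1, bar: null }] },
];

// `bar != null` is false for null, undefined and missing properties alike.
const matching = folders.filter((folder) =>
  folder.docs.every((doc) => doc.bar != null)
);
console.log(matching.map((folder) => folder._id)); // [ 2 ]
```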