Mongo - Count of empty double nested arrays - mongodb

Say I have this structure
{
"_id" : "4klhrj5hZ",
"name" : "asdf",
"startTime" : ISODate("2016-09-20T22:22:08.082Z"),
"columns" : [
{
"_id" : ObjectId("57e1b69087ceb4392ebdf7f4"),
"createdAt" : ISODate("2016-09-20T22:22:08.088Z"),
"rows" : [
{
"value" : "adf",
"_id" : ObjectId("57e1b7867598bd39a72876ef")
}
]
},
{
"_id" : ObjectId("57e1b69087ceb4392ebdf7f3"),
"createdAt" : ISODate("2016-09-20T22:22:08.088Z"),
"rows" : [
{
"value" : "we",
"_id" : ObjectId("57e1b69287ceb4392ebdf7f5")
}
]
},
{
"_id" : ObjectId("57e1b69087ceb4392ebdf7f2"),
"createdAt" : ISODate("2016-09-20T22:22:08.086Z"),
"rows" : [
{
"value" : "asdf",
"_id" : ObjectId("57e1b7be7598bd39a72876f0")
}
]
}
]
}
Here I first have an array of columns, and each column has an array of rows: an array of arrays. I'm trying to write a count query that tells me how many top-level documents contain only empty rows arrays. In this example that would mean columns[0-2].rows.length === 0; note that columns could be any length. What trips me up most in the examples I've seen is addressing the nested array dynamically rather than by a fixed index like
columns.0.rows
Thanks!
EDIT: Clarification
Here is the mongoose schema to help clarify
var RowSchema = new Schema({
value: String,
createdAt:{
type: Date,
'default': Date.now
}
});
var ColumnSchema = new Schema({
rows: [RowSchema],
createdAt:{
type: Date,
'default': Date.now
}
});
var ItemSchema = new Schema({
_id: {
type: String,
unique: true,
'default': shortid.generate
},
name: String,
columns: [ColumnSchema],
createdAt:{
type: Date,
'default': Date.now
}
})
I want to run a query to find all Items that contain zero rows in all columns. I know how to find an array that is empty:
Item.find({ columns: { $exists: true, $eq: [] } })
But I want something like
Item.find({ 'columns.rows': { $exists: true, $eq: [] } })
Sorry for the unclear explanation; I get so wrapped up in it sometimes that I forget to set the proper context. Thanks.
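One possible way to express "no column has any rows" is to negate an $elemMatch that looks for a column with at least one row. This is a minimal sketch rather than a tested answer from the thread: allRowsEmptyFilter and hasOnlyEmptyRows are hypothetical names, and the in-memory predicate is only there to illustrate what the filter is meant to select. Note that the filter would also match items whose columns array is empty or missing; whether that is wanted is an assumption.

```javascript
// Filter: no element of `columns` has a first row, i.e. every `rows` array is empty.
// 'rows.0' with $exists: true only holds for arrays with at least one element.
const allRowsEmptyFilter = {
  columns: { $not: { $elemMatch: { 'rows.0': { $exists: true } } } }
};

// Plain-JS mirror of the intended semantics, for illustration only.
function hasOnlyEmptyRows(item) {
  const columns = Array.isArray(item.columns) ? item.columns : [];
  return columns.every(function (column) {
    return !Array.isArray(column.rows) || column.rows.length === 0;
  });
}

// Possible usage with the Item model from the question (not run here):
// Item.count(allRowsEmptyFilter, function (err, n) { console.log(n); });
```

The filter avoids fixed indexes such as columns.0.rows because $elemMatch is evaluated against every element of columns.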

Related

Update existing mongodb data into an embedded document

I am new to MongoDB so this is probably a basic question (hopefully). I currently have 10 million records with 410 fields loaded in a mongodb collection like so:
{
"_id" : ObjectId("........"),
"AddressID" : 123455,
"IndividualId" : 1,
"personfirstname" : "FirstName",
"personmiddleinitial" : "M",
"personlastname" : "LastName",
"etc": "....."
}
I need to wrap all of this data into an embedded document like so:
{
"_id" : ObjectId("........"),
"data" : {
"AddressID" : 123455,
"IndividualId" : 1,
"personfirstname" : "FirstName",
"personmiddleinitial" : "M",
"personlastname" : "LastName",
"etc": "....."
}
}
I don't necessarily need to update this data in-place but that would be nice. If I need to export this data somehow specifying the new format and then re-import the new, updated data that is fine. Performing this via the MongoDB shell would be ideal.
As suggested by chridam within comments you can execute the following aggregation pipeline:
db.collectionName.aggregate([
{ $project: { _id: "$_id", data: "$$ROOT" } },
{ $out: "newCollectionName" }
]);
This way you have the _id field both at root level and in the data object. Thus, you can execute a massive update to unset the second one:
db.newCollectionName.updateMany(
{},
{ $unset: { "data._id": "" } }
);
Finally, you can drop the first collection and rename the second to restore the original name on the updated collection:
db.collectionName.drop();
db.newCollectionName.rename("collectionName");
This approach fully works within the database, avoiding fetching any of your 10 million documents.
You can simply do this in the shell with the following
db.test.find().forEach(function(doc){
doc = { _id: doc._id, data: doc };
delete doc.data._id;
db.test.save(doc);
});
For example, if we insert the following documents:
> db.test.insertMany([
... {
... _id: ObjectId("5a91af8908e17c5997e03b7e"),
... field1: false,
... field2: 0,
... field3: "No"
... },
... {
... _id: ObjectId("5a91afbc08e17c5997e03b7f"),
... field1: true,
... field2: 1,
... field3: "Yes"
... }])
{
"acknowledged" : true,
"insertedIds" : [
ObjectId("5a91af8908e17c5997e03b7e"),
ObjectId("5a91afbc08e17c5997e03b7f")
]
}
Then run:
db.test.find().forEach(function(doc){
doc = { _id: doc._id, data: doc };
delete doc.data._id;
db.test.save(doc);
});
Our documents now look like this:
> db.test.find().pretty()
{
"_id" : ObjectId("5a91af8908e17c5997e03b7e"),
"data" : {
"field1" : false,
"field2" : 0,
"field3" : "No"
}
}
{
"_id" : ObjectId("5a91afbc08e17c5997e03b7f"),
"data" : {
"field1" : true,
"field2" : 1,
"field3" : "Yes"
}
}
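The per-document transform in the loop above can also be isolated as a small pure function, which makes it easier to check before running it against millions of documents. This is a sketch under assumptions: wrapInData is a hypothetical helper name, and the commented shell usage swaps save for replaceOne because newer shells deprecate save.

```javascript
// Returns a new document of the form { _id, data: { ...all other fields } }.
// The input document is left untouched.
function wrapInData(doc) {
  const data = {};
  Object.keys(doc).forEach(function (key) {
    if (key !== '_id') {
      data[key] = doc[key];
    }
  });
  return { _id: doc._id, data: data };
}

// Possible shell usage (not run here):
// db.test.find().forEach(function (doc) {
//   db.test.replaceOne({ _id: doc._id }, wrapInData(doc));
// });
```

Like the loop above, this is not idempotent: running it a second time would nest data inside data.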

Mongoose match element or empty array with $in statement

I'm trying to select any documents where privacy settings match the provided ones and any documents which do not have any privacy settings (i.e. public).
Current behavior is that if I have a schema with an array of object ids referenced to another collection:
privacy: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'Category',
index: true,
required: true,
default: []
}],
I want to filter all content for my categories plus the public content, which in our case is content that does not have any privacy settings, i.e. an empty array [].
We currently query that with an $or query:
{"$or":[
{"privacy": {"$size": 0}},
{"privacy": {"$in":
["5745bdd4b896d4f4367558b4","5745bd9bb896d4f4367558b2"]}
}
]}
I would love to query it by simply providing an empty array [] as one of the comparison options in the $in statement, which is possible in MongoDB:
db.emptyarray.insert({a:1})
db.emptyarray.insert({a:2, b:null})
db.emptyarray.insert({a:2, b:[]})
db.emptyarray.insert({a:3, b:["perm1"]})
db.emptyarray.insert({a:3, b:["perm1", "perm2"]})
db.emptyarray.insert({a:3, b:["perm1", "perm2", []]})
> db.emptyarray.find({b:[]})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[]}})
> db.emptyarray.find({b:{$in:[[], "perm1"]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce1"), "a" : 3, "b" : [ "perm1" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce2"), "a" : 3, "b" : [ "perm1", "perm2" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[[], "perm1", null]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629cde"), "a" : 1 }
{ "_id" : ObjectId("5a305f3dd89e8a887e629cdf"), "a" : 2, "b" : null }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce1"), "a" : 3, "b" : [ "perm1" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce2"), "a" : 3, "b" : [ "perm1", "perm2" ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
> db.emptyarray.find({b:{$in:[[]]}})
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce0"), "a" : 2, "b" : [ ] }
{ "_id" : ObjectId("5a305f3dd89e8a887e629ce3"), "a" : 3, "b" : [ "perm1", "perm2", [ ] ] }
Maybe like this:
"privacy_locations":{
"$in": ["5745bdd4b896d4f4367558b4","5745bd9bb896d4f4367558b2",[]]
}
This query works from the console (CLI), but in the code it throws a cast error:
{
"message":"Error in retrieving records from db.",
"error":
{
"message":"Cast to ObjectId failed for value \"[]\" at ...
}
}
Now I perfectly understand the cast is happening because the Schema is defined as an ObjectId.
But I still find that this approach is missing two possible scenarios.
I believe it is possible to query (in MongoDB) null options or empty array within an $in statement.
array: {$in: [null, [], [option-1, option-2]]}
Is this correct?
I've been thinking that the best solution to my problem (cannot select by options or empty) could be to replace empty arrays with an array holding a fixed option such as ALL: an explicit privacy setting that means ALL, instead of the current convention where an unset value is treated as all.
But I don't want a major refactor of the existing code; I just need to see if I can write a better or more performant query.
Today we have the query working with an $or statement that has issues with indexes. Even if it is fast, I wanted to bring attention to this issue, even if it is not considered a bug.
I will appreciate any comments or guidance.
The semi-short answer is that the schema is mixing types for the privacy property (ObjectId and Array) while declaring that it is strictly of type ObjectId in the schema.
Since MongoDB is schema-less, it allows any shape per document and does not need to verify a query document against a schema. Mongoose, on the other hand, is meant to enforce a schema, so it verifies a query document against the schema before it queries the DB. The query document { privacy: { $in: [[]] } } fails that validation because an empty array is not a valid ObjectId, as the error indicates.
The schema would need to declare the type as Mixed (which doesn't support ref) to continue using an empty array as an acceptable type as well as ObjectId.
// Current
const FooSchema = new mongoose.Schema({
privacy: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'Category',
index: true,
required: true,
default: []
}]
});
const Foo = connection.model('Foo', FooSchema);
const foo1 = new Foo();
const foo2 = new Foo({privacy: [mongoose.Types.ObjectId()]});
Promise.all([
foo1.save(),
foo2.save()
]).then((results) => {
console.log('Saved', results);
/*
[
{ __v: 0, _id: 5a36e36a01e1b77cba8bd12f, privacy: [] },
{ __v: 0, _id: 5a36e36a01e1b77cba8bd131, privacy: [ 5a36e36a01e1b77cba8bd130 ] }
]
*/
return Foo.find({privacy: { $in: [[]] }}).exec();
}).then((results) => {
// Never gets here
console.log('Found', results);
}).catch((err) => {
console.log(err);
// { [CastError: Cast to ObjectId failed for value "[]" at path "privacy" for model "Foo"] }
});
And the working version. Also note the adjustment to properly apply the required flag, index flag and default value.
// Updated
const FooSchema = new mongoose.Schema({
privacy: {
type: [{
type: mongoose.Schema.Types.Mixed
}],
index: true,
required: true,
default: [[]]
}
});
const Foo = connection.model('Foo', FooSchema);
const foo1 = new Foo();
const foo2 = new Foo({
privacy: [mongoose.Types.ObjectId()]
});
Promise.all([
foo1.save(),
foo2.save()
]).then((results) => {
console.log(results);
/*
[
{ __v: 0, _id: 5a36f01733704f7e58c0bf9a, privacy: [ [] ] },
{ __v: 0, _id: 5a36f01733704f7e58c0bf9c, privacy: [ 5a36f01733704f7e58c0bf9b ] }
]
*/
return Foo.find().where({
privacy: { $in: [[]] }
}).exec();
}).then((results) => {
console.log(results);
// [ { _id: 5a36f01733704f7e58c0bf9a, __v: 0, privacy: [ [] ] } ]
});
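If switching to Mixed (and giving up ref) is too invasive, the $or query from the question can stay as it is, since every value it passes to $in can be cast to ObjectId. Below is a minimal sketch of a helper that builds that filter; buildPrivacyFilter is a hypothetical name and not part of Mongoose.

```javascript
// Builds the "public or one of my categories" filter from the question.
// `categoryIds` is an array of ObjectId hex strings (or ObjectIds).
function buildPrivacyFilter(categoryIds) {
  const publicOnly = { privacy: { $size: 0 } };
  if (!Array.isArray(categoryIds) || categoryIds.length === 0) {
    return publicOnly; // nothing to match with $in, so only public documents
  }
  return { $or: [publicOnly, { privacy: { $in: categoryIds.slice() } }] };
}

// Possible usage with the original ObjectId schema (not run here):
// Foo.find(buildPrivacyFilter(['5745bdd4b896d4f4367558b4'])).exec();
```

This keeps the empty array out of $in, so the schema can remain an array of ObjectId references.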

MongoDB Conditional validation on arrays and embedded documents

I have a number of documents in my database to which I am applying document validation. All of these documents may have embedded documents. I can apply simple validation along the lines of SQL NOT NULL checks (these essentially enforce the primary key constraints), but what I would like to do is apply some sort of conditional validation to the optional arrays and embedded documents. For example, let's say I have a document that looks like this:
{
"date": <<insertion date>>,
"name" : <<the portfolio name>>,
"assets" : << amount of money we have to trade with>>
}
Clearly I can put validation on this document to ensure that date, name and assets all exist at insertion time. Let's say, however, that I'm managing a stock portfolio and the document can have future updates to show an array of stocks like this:
{
"date" : <<insertion date>>,
"name" : <<the portfolio name>>,
"assets" : << amount of money we have to trade with>>,
"portfolio" : [
{ "stockName" : "IBM",
"pricePaid" : 155.39,
"sharesHeld" : 100
},
{ "stockName" : "Microsoft",
"pricePaid" : 57.22,
"sharesHeld" : 250
}
]
}
Is it possible to apply a conditional validation to this array of subdocuments? It's valid for the portfolio to be absent, but if it is present, each document in the array must contain the three fields "stockName", "pricePaid" and "sharesHeld".
MongoShell
db.createCollection("collectionname",
{
validator: {
$or: [
{
"portfolio": {
$exists: false
}
},
{
$and: [
{
"portfolio": {
$exists: true
}
},
{
"portfolio.stockName": {
$type: "string",
$exists: true
}
},
{
"portfolio.pricePaid": {
$type: "double",
$exists: true
}
},
{
"portfolio.sharesHeld": {
$type: "double",
$exists: true
}
}
]
}
]
}
})
With the above validation in place you can insert documents with or without portfolio.
After creating the collection with this validator in the shell, you can insert documents such as the following:
db.collectionname.insert({
"_id" : ObjectId("58061aac8812662c9ae1b479"),
"date" : ISODate("2016-10-18T12:50:52.372Z"),
"name" : "B",
"assets" : 200
})
db.collectionname.insert({
"_id" : ObjectId("58061ab48812662c9ae1b47a"),
"date" : ISODate("2016-10-18T12:51:00.747Z"),
"name" : "A",
"assets" : 100,
"portfolio" : [
{
"stockName" : "Microsoft",
"pricePaid" : 57.22,
"sharesHeld" : 250
}
]
})
If we try to insert a document like this
db.collectionname.insert({
"date" : new Date(),
"name" : "A",
"assets" : 100,
"portfolio" : [
{ "stockName" : "IBM",
"sharesHeld" : 100
}
]
})
then we will get the below error message
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 121,
"errmsg" : "Document failed validation"
}
})
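One caveat worth checking: a dotted path such as "portfolio.stockName" is normally satisfied when any element of the array matches, so the validator above may accept an array in which only some entries are complete. An application-side check can cover the "every entry" requirement. The following is a sketch under that assumption; isValidPortfolio is a hypothetical helper, not a MongoDB feature.

```javascript
// Returns true when `portfolio` is absent, or when it is an array whose every
// entry has a string stockName and numeric pricePaid and sharesHeld.
function isValidPortfolio(doc) {
  if (!Object.prototype.hasOwnProperty.call(doc, 'portfolio')) {
    return true; // the portfolio is optional
  }
  if (!Array.isArray(doc.portfolio)) {
    return false;
  }
  return doc.portfolio.every(function (entry) {
    return entry !== null &&
      typeof entry === 'object' &&
      typeof entry.stockName === 'string' &&
      typeof entry.pricePaid === 'number' &&
      typeof entry.sharesHeld === 'number';
  });
}
```

Calling it before insert or update would reject the IBM document above, which lacks pricePaid, while accepting documents without a portfolio.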
Using Mongoose
Yes, it can be done. Based on your scenario, you may need to define both the parent and the child schema.
Below is a sample child (portfolio) schema in Mongoose.
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var portfolioSchema = new Schema({
"stockName" : { type : String, required : true },
"pricePaid" : { type : Number, required : true },
"sharesHeld" : { type : Number, required : true },
});
References:
http://mongoosejs.com/docs/guide.html
http://mongoosejs.com/docs/subdocs.html
Can I require an attribute to be set in a mongodb collection? (not null)
Hope it Helps!

Mongoose Mongo 2dsphere geoWithin

After reading many questions that are SO close to mine, and reading the MongoDB docs and Mongoose docs, I still cannot answer my question.
Using express 4.13.4, mongoose 4.4.10, mongodb 2.1.14 on Node 4.4.0
My Mongoose Location schema:
var schema = new Schema({
type: {type: String},
coordinates: []
},{_id:false});
var model = mongoose.model('LocationModel',schema);
module.exports = {
model : model,
schema : schema
};
My CatalogModel schema (what I write to Mongo):
var locationSchema = require('./locationModel').schema;
var schema = new Schema({
title : String,
format: {type: String, maxlength: 4},
location: {type: locationSchema, required:true},
otherStuff: String
});
schema.index({location: '2dsphere'}); // Ensures 2dsphere index for location
var model = mongoose.model('CatalogModel',schema);
I create a concrete example and write to MongoDB (this works fine... in that I can query it in Mongo)
var polyEntry = new CatalogModel({
title:"I am just a Polygon",
otherStuff: "More stuff here",
location:{
type:'Polygon',
coordinates:[[[0,1],[0,2],[1,2],[0,1]]]
}
});
In Mongo, I asked the collection for the indexes:
db.catalogmodels.getIndexes()
And this is what it says (not entirely sure what this means)
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test.catalogmodels"
},
{
"v" : 1,
"key" : {
"location" : "2dsphere"
},
"name" : "location_2dsphere",
"ns" : "test.catalogmodels",
"background" : true,
"2dsphereIndexVersion" : 3
}
]
I can do a db.catalogmodels.find() and get my document back.
{
"_id" : ObjectId("12345678901234566778"),
"title" : "I am just a Polygon",
"location" : {
"type" : "Polygon",
"coordinates" : [ [ [ 0, 1 ], [ 0, 2 ], [ 1, 2 ], [ 0, 1 ] ] ]
},
"__v" : 0
}
I can even do a $geoWithin call in Mongo:
db.catalogmodels.find(
{
location:{
$geoWithin:{
$geometry:{
type:"Polygon",
"coordinates":[[[-1,0],[-1,3],[4,3],[4,0],[-1,0]]]
}
}
}
})
But here's the actual question:
Mongoose keeps telling me [Error: Can't use $geoWithin]
var geoJson = {
"type" : "Polygon",
"coordinates" : [[[-1,0],[-1,3],[4,3],[4,0],[-1,0]]]
};
CatalogModel
.find()
.where('location').within(geoJson)
.exec(function(err,data){
if ( err ) { console.log(err); }
else {console.log("Data: " + data);}
db.close()
});
I also replaced the .find().where().within() call to:
CatalogEntryModel.find({
location:{
$geoWithin:{
$geometry:{
type:"Polygon",
"coordinates":[[[-1,0],[-1,3],[4,3],[4,0],[-1,0]]]
}
}
}
})
.exec(function(err,data){
if ( err ) { console.log(err); }
else {console.log("Data: " + data);}
db.close();
});
Is there a reason Mongoose does not like the $geoWithin call? The latest API says this should work.
I wrote this up as an issue on Mongoose: https://github.com/Automattic/mongoose/issues/4044#
And it has been closed.
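This does not explain the Mongoose error, but when comparing shell and Mongoose behaviour it can help to build the filter in one place, so that both receive exactly the same document. Below is a minimal sketch; geoWithinPolygon is a hypothetical helper name, and the ring check reflects the GeoJSON rule that a linear ring is closed and has at least four positions.

```javascript
// Builds { <path>: { $geoWithin: { $geometry: { type: 'Polygon', coordinates } } } }
// after checking that each linear ring is closed and has at least 4 positions.
function geoWithinPolygon(path, rings) {
  rings.forEach(function (ring) {
    const first = ring[0];
    const last = ring[ring.length - 1];
    if (ring.length < 4 || first[0] !== last[0] || first[1] !== last[1]) {
      throw new Error('Each polygon ring needs 4+ positions and must be closed');
    }
  });
  const filter = {};
  filter[path] = {
    $geoWithin: { $geometry: { type: 'Polygon', coordinates: rings } }
  };
  return filter;
}

// Possible usage (not run here):
// CatalogModel.find(geoWithinPolygon('location', [[[-1,0],[-1,3],[4,3],[4,0],[-1,0]]]));
```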

How to update a subdocument in mongodb

I know the question has been asked many times, but I can't figure out how to update a subdocument in mongo.
Here's my Schema:
// Schemas
var ContactSchema = new mongoose.Schema({
first: String,
last: String,
mobile: String,
home: String,
office: String,
email: String,
company: String,
description: String,
keywords: []
});
var UserSchema = new mongoose.Schema({
email: {
type: String,
unique: true,
required: true
},
password: {
type: String,
required: true
},
contacts: [ContactSchema]
});
My collection looks like this:
db.users.find({}).pretty()
{
"_id" : ObjectId("5500b5b8908520754a8c2420"),
"email" : "test@random.org",
"password" : "$2a$08$iqSTgtW27TLeBSUkqIV1SeyMyXlnbj/qavRWhIKn3O2qfHOybN9uu",
"__v" : 8,
"contacts" : [
{
"first" : "Jessica",
"last" : "Vento",
"_id" : ObjectId("550199b1fe544adf50bc291d"),
"keywords" : [ ]
},
{
"first" : "Tintin",
"last" : "Milou",
"_id" : ObjectId("550199c6fe544adf50bc291e"),
"keywords" : [ ]
}
]
}
Say I want to update subdocument of id 550199c6fe544adf50bc291e by doing:
db.users.update({_id: ObjectId("5500b5b8908520754a8c2420"), "contacts._id": ObjectId("550199c6fe544adf50bc291e")}, myNewDocument)
with myNewDocument like:
{ "_id" : ObjectId("550199b1fe544adf50bc291d"), "first" : "test" }
It returns an error:
db.users.update({_id: ObjectId("5500b5b8908520754a8c2420"), "contacts._id": ObjectId("550199c6fe544adf50bc291e")}, myNewDocument)
WriteResult({
"nMatched" : 0,
"nUpserted" : 0,
"nModified" : 0,
"writeError" : {
"code" : 16837,
"errmsg" : "The _id field cannot be changed from {_id: ObjectId('5500b5b8908520754a8c2420')} to {_id: ObjectId('550199b1fe544adf50bc291d')}."
}
})
I understand that mongo tries to replace the parent document and not the subdocument, but in the end, I don't know how to update my subdocument.
You need to use the positional $ operator to update a subdocument in an array.
Using contacts.$ points MongoDB at the subdocument matched by the query.
db.users.update({_id: ObjectId("5500b5b8908520754a8c2420"),
"contacts._id": ObjectId("550199c6fe544adf50bc291e")},
{"$set":{"contacts.$":myNewDocument}})
I am not sure why you are changing the _id of the subdocument; that is not advisable.
If you want to change a particular field of the subdocument, use contacts.$.<field_name> to update just that field.
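The per-field form can be generated from a plain object of changes. This is a minimal sketch; buildPositionalSet is a hypothetical helper name, and skipping _id is a design choice made here so that the subdocument keeps its identity.

```javascript
// Turns { first: 'test' } into { $set: { 'contacts.$.first': 'test' } }.
// `_id` is skipped so the matched subdocument keeps its identity.
function buildPositionalSet(arrayPath, changes) {
  const set = {};
  Object.keys(changes).forEach(function (field) {
    if (field !== '_id') {
      set[arrayPath + '.$.' + field] = changes[field];
    }
  });
  return { $set: set };
}

// Possible shell usage (not run here):
// db.users.update(
//   { _id: ObjectId("5500b5b8908520754a8c2420"),
//     "contacts._id": ObjectId("550199c6fe544adf50bc291e") },
//   buildPositionalSet('contacts', { first: 'test' })
// );
```

Unlike setting contacts.$ to a whole new document, this leaves the other fields of the matched contact, such as last and keywords, in place.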