Pulling/deleting an item from a nested array - mongodb

Note: it's a Meteor project.
My schema looks like this:
{
_id: 'someid',
nlu: {
data: {
synonyms:[
{_id:'abc', value:'car', synonyms:['automobile']}
]
}
}
}
The schema is defined with simple-schema. Relevant parts:
'nlu.data.synonyms.$': Object,
'nlu.data.synonyms.$._id': {type: String, autoValue: ()=> uuidv4()},
'nlu.data.synonyms.$.value': {type:String, regEx:/.*\S.*/},
'nlu.data.synonyms.$.synonyms': {type: Array, minCount:1},
'nlu.data.synonyms.$.synonyms.$': {type:String, regEx:/.*\S.*/},
I am trying to remove {_id:'abc'}:
Projects.update({_id: 'someid'},
{$pull: {'nlu.data.synonyms' : {_id: 'abc'}}});
The query returns 1 (one doc was updated) but the item was not removed from the array. Any idea?

This is my insert query
db.test.insert({
"_id": "someid",
"nlu": {
"data": {
"synonyms": [
{
"_id": "abc"
},
{
"_id": "def"
},
10,
[ 5, { "_id": 5 } ]
]
}
}
})
And here is my update
db.test.update(
{
"_id": "someid",
"nlu.data.synonyms._id": "abc"
},
{
"$pull": {
"nlu.data.synonyms": {
"_id": "abc"
}
}
}
)
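To make the intended effect of that $pull concrete, here is a rough in-memory sketch in plain JavaScript. pullMatching is a hypothetical helper, not a MongoDB API, and it only models equality conditions on document elements; real MongoDB also accepts query operators in the condition.

```javascript
// Simplified in-memory model of $pull with a condition document.
// Assumption: only equality conditions are modelled.
function pullMatching(array, condition) {
  const isPlainObject = (v) =>
    v !== null && typeof v === 'object' && !Array.isArray(v);
  return array.filter((element) => {
    // Scalars and nested arrays are not documents, so they are kept.
    if (!isPlainObject(element)) return true;
    const matches = Object.entries(condition).every(
      ([key, value]) => element[key] === value
    );
    return !matches; // drop the elements that match every condition field
  });
}

// Same array as in the insert above.
const synonyms = [{ _id: 'abc' }, { _id: 'def' }, 10, [5, { _id: 5 }]];
const remaining = pullMatching(synonyms, { _id: 'abc' });
console.log(JSON.stringify(remaining));
```

Only the element whose _id equals 'abc' is removed; the other entries stay in place.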

The problem comes down to the autoValue function on your _id property.
autoValue is a very powerful feature for generating values in your schema automatically. Here, however, it prevented the pull because it always returned a value, which indicates that the field should be set.
To make it compatible with pulling, you can have it check whether an operator is present (as is the case for mongo updates).
Your autoValue would then look like:
'nlu.data.synonyms.$._id': {type: String, autoValue: function(){
if (this.operator) {
this.unset();
return;
}
return uuidv4();
}},
Edit: Note that the function here is not an arrow function; otherwise it loses the context that SimpleSchema binds to it.
It basically only returns a new uuidv4 when no operator is present (as in insert operations). You can extend this further to your needs using the functionality SimpleSchema provides (see the documentation).
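The behaviour can be tried outside Meteor with a small sketch. Everything here is an assumption made for illustration: makeId stands in for uuidv4, and fakeContext is a hypothetical, much reduced version of the context SimpleSchema binds to autoValue (the real one has more properties).

```javascript
// Stand-in for uuidv4() so the sketch has no dependencies (assumption).
const makeId = () => 'id-' + Math.random().toString(36).slice(2, 10);

// Same logic as the autoValue above. It is a regular function,
// so `this` is whatever context the caller binds.
function idAutoValue() {
  if (this.operator) {
    this.unset(); // tell the caller not to set the field
    return undefined;
  }
  return makeId();
}

// Hypothetical minimal context that records whether unset() was called.
function fakeContext(operator) {
  return {
    operator,
    unsetCalled: false,
    unset() {
      this.unsetCalled = true;
    },
  };
}

const insertCtx = fakeContext(null); // insert: no operator
const pullCtx = fakeContext('$pull'); // update with $pull
const insertValue = idAutoValue.call(insertCtx);
const pullValue = idAutoValue.call(pullCtx);
console.log(typeof insertValue, pullValue, pullCtx.unsetCalled);
```

With an arrow function, `this` would not be the bound context and the operator check could not work.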
I just summarized my code in a reproducible example:
import uuidv4 from 'uuid/v4';
const Projects = new Mongo.Collection('PROJECTS')
const ProjectSchema ={
nlu: Object,
'nlu.data': Object,
'nlu.data.synonyms': {
type: Array,
},
'nlu.data.synonyms.$': {
type: Object,
},
'nlu.data.synonyms.$._id': {type: String, autoValue: function(){
if (this.operator) {
this.unset();
return;
}
return uuidv4();
}},
'nlu.data.synonyms.$.value': {type:String, regEx:/.*\S.*/},
'nlu.data.synonyms.$.synonyms': {type: Array, minCount:1},
'nlu.data.synonyms.$.synonyms.$': {type:String, regEx:/.*\S.*/},
};
Projects.attachSchema(ProjectSchema);
Meteor.startup(() => {
const insertId = Projects.insert({
nlu: {
data: {
synonyms:[
{value:'car', synonyms:['automobile']},
]
}
}
});
Projects.update({_id: insertId}, {$pull: {'nlu.data.synonyms' : {value: 'car'}}});
const afterUpdate = Projects.findOne(insertId);
console.log(afterUpdate, afterUpdate.nlu.data.synonyms.length); // 0
});
Optional Alternative: Normalizing Collections
However, there is one additional note on optimization.
You can work around this auto-id generation issue by normalizing synonyms into their own collection, where the mongo insert provides the id for you. I am not sure how unique this id is compared to uuidv4, but I have never faced id issues with it.
A setup could look like this:
const Synonyms = new Mongo.Collection('SYNONYMS');
const SynonymsSchema = {
value: {type:String, regEx:/.*\S.*/},
synonyms: {type: Array, minCount:1},
'synonyms.$': {type:String, regEx:/.*\S.*/},
};
Synonyms.attachSchema(SynonymsSchema);
const Projects = new Mongo.Collection('PROJECTS')
const ProjectSchema ={
nlu: Object,
'nlu.data': Object,
'nlu.data.synonyms': {
type: Array,
},
'nlu.data.synonyms.$': {
type: String,
},
};
Projects.attachSchema(ProjectSchema);
Meteor.startup(() => {
// just add this entry once
if (Synonyms.find().count() === 0) {
Synonyms.insert({
value: 'car',
synonyms: ['automobile']
})
}
// get the id
const carId = Synonyms.findOne()._id;
const insertId = Projects.insert({
nlu: {
data: {
synonyms:[carId] // push the _id as reference
}
}
});
// ['MG464i9PgyniuGHpn'] => reference to Synonyms document
console.log(Projects.findOne(insertId).nlu.data.synonyms);
Projects.update({_id: insertId}, {$pull: {'nlu.data.synonyms' : carId }}); // pull the reference
const afterUpdate = Projects.findOne(insertId);
console.log(afterUpdate, afterUpdate.nlu.data.synonyms.length);
});
I know this was not part of the question but I just wanted to point out that there are many benefits of normalizing complex document structures into separate collections:
no duplicate data
decouple data that is not intended to be bound (here: Synonyms could be also used independently from Projects)
update referred documents once, and all Projects will point to the latest version (since it's a reference)
finer publication/subscription handling => more control about what data flows over the wire
reduces complex auto and default value generation
changes in the referred collection's schema may have only few consequences for UI and functions that make use of the referrer's schema.
Of course this also has disadvantages:
more collections to handle
more code to write (more code = more potential errors)
more tests to write (much more time to invest)
sometimes you need to denormalize back for this one case out of 100
you have to invest a lot of time in data schema design before starting to code

Related

Use GraphQL Query to get results from MongoDB after aggregation with mongoose

I have the following problem.
I have a MongoDB collection and a corresponding mongoose model, which looks like this:
export const ListItemSchema = new Schema<ListItemSchema>({
title: { type: String, required: true },
parentId: { type: Schema.Types.ObjectId, required: false },
});
export const TestSchema = new Schema<Test>(
{
title: { type: String, required: true },
list: { type: [ListItemSchema], required: false },
}
);
As you can see, my TestSchema holds an array of ListItems; TestSchema is also my collection in MongoDB.
Now I want to query only the ListItems from a Test with a specific ID.
That was not much of a problem, at least on the MongoDB side.
I use the MongoDB Aggregation Framework for this and call my aggregation inside a custom resolver.
Here is the code to get an array of only the list items from a specific TestModel:
const test = TestModel.aggregate([
{$match: {_id: id}},
{$unwind: "$list"},
{
$match: {
"list.parentId": {$eq: null},
},
},
{$replaceRoot: {newRoot: "$list"}},
]);
This is the result
[ { _id: randomId,
title: 't',
parentId: null },
{ _id: randomId,
title: 'x',
parentId: null
} ]
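For readers less familiar with the aggregation stages, the pipeline above corresponds roughly to the following in-memory sketch. The sample document and the helper name topLevelItems are hypothetical and only illustrate the shape of the data; this is not how MongoDB executes the pipeline.

```javascript
// Hypothetical document shaped like the Test model.
const testDocs = [
  {
    _id: 't1',
    title: 'my test',
    list: [
      { _id: 'a', title: 't', parentId: null },
      { _id: 'b', title: 'child', parentId: 'a' },
      { _id: 'c', title: 'x' }, // parentId missing
    ],
  },
];

function topLevelItems(docs, id) {
  return docs
    .filter((doc) => doc._id === id) // $match on _id
    .flatMap((doc) => doc.list || []) // $unwind "$list" + $replaceRoot
    .filter((item) => item.parentId == null); // {$eq: null}: null or missing
}

const items = topLevelItems(testDocs, 't1');
console.log(items.map((item) => item.title));
```

The result is a flat array of list items, not a Test document, which is the mismatch the question is about.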
The Query to trigger the resolver looks like this and is placed inside my Test Type Composer.
query getList {
test(testId:"2f334575196fe042ea83afbf", parentId: null) {
title
}
}
So far so good, but of course my query will fail or give a poor result, because GraphQL expects data based on the Test model but receives an array with a completely different shape.
So after a lot of typing, here is the question:
How do I have to change my query to receive the list array?
Do I have to adjust the query, or is it something with mongoose?
I'm really stuck at this point, so any help would be awesome!
Thanks in advance :)
I'm not sure if I understood your issue correctly.
In your GraphQL schema, try leaving out the exclamation mark (!) from the Query type,
something like:
type Query {
test: TestModel
}
instead of
type Query {
test: TestModel!
}
Then you'll get the error message in the console but still be able to receive any form of data.

How to define mongoose schema for nested documents

I need to define the mongoose schema for the nested documents given below.
Documents:
"Options":[{"Value":["28","30","32","34","36","38","40","42","44","46"],"_id":{"$oid":"5de8427af55716115dd43c8f"},"Name":"Size"},{"Value":["White"],"_id":{"$oid":"5de8427af55716115dd43c8e"},"Name":"Colour"}]
I declared it as below, but it's not working.
const Product = new Schema(
{
Options: [{ value: { _id: ObjectId, Name: String } }]
},
{
timestamps: {
createdAt: "createdAt",
updatedAt: "updatedAt"
},
collection: "products"
}
);
I need a schema such that if I add or update a document of this shape directly, it is stored as is.
You need to modify your schema like this:
{
Options: [ new Schema({ Value: [...], _id: Schema.Types.ObjectId, Name: String }) ]
}
This is the way to create an array of subdocuments with Mongoose. If you don't use the "new Schema" key words, you are actually creating a field with type "Mixed", which needs a different way to handle updates.
You can also omit the _id, it should be added automatically.
You can find more information on subdocuments on this page:
https://mongoosejs.com/docs/subdocs.html
...and on mixed type fields: https://mongoosejs.com/docs/schematypes.html#mixed
...which briefly explains the problem.
{
Options: [ new Schema({ _id: mongoose.Schema.Types.ObjectId, Value: [String], Name: String }) ]
}

Using async loop in mongodb shell for updating many documents

I have a problem with the following query in MongoDB shell ONLY when the size of the array gets bigger, for example, more than 100 elements.
newPointArray --> is an array with 500 elements
newPointArray.forEach(function(newDoc){
//update the mongodb properties for each doc
db.getCollection('me_all_test')
.update({ '_id': newDoc._id },
{ $set: { "properties": newDoc.properties } },
{ upsert: true });
})
Can someone guide me on how to run this query in the MongoDB shell for a larger array, using an async loop, a promise, or something similar?
Thanks in advance
Rather than doing individual .update()s, use a .bulkWrite() operation. This should reduce the overhead of asking mongo to do multiple individual operations. This is assuming that you are doing general operations. I'm not clear on if newPointArray is always new points that don't exist.
Given your example, I believe your script would mimic the following:
// I'm assuming this is your array (but truncated)
let newPointArray = [
{
_id: "1",
properties: {
foo: "bar"
}
},
{
_id: "2",
properties: {
foo: "buzz"
}
}
// Whatever other points you have in your array
];
db
.getCollection("me_all_test")
.bulkWrite(newPointArray
// Map your array to a query bulkWrite understands
.map(point => {
return {
updateOne: {
filter: {
_id: point._id
},
update: {
$set: {
properties: point.properties
}
},
upsert: true
}
};
}));
You may also want to consider setting ordered to false in the operation, which may also bring performance gains. That would look something like this:
db
.getCollection("me_all_test")
.bulkWrite([SOME_ARRAY_SIMILAR_TO_ABOVE_EXAMPLE], {
ordered: false
});
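The mapping step can be pulled out into a small function and checked on its own before sending anything to the database. This is only a sketch: buildUpsertOps is a hypothetical helper name, and the 500 generated points stand in for the real newPointArray.

```javascript
// Build one updateOne/upsert operation per point, in the shape bulkWrite expects.
function buildUpsertOps(points) {
  return points.map((point) => ({
    updateOne: {
      filter: { _id: point._id },
      update: { $set: { properties: point.properties } },
      upsert: true,
    },
  }));
}

// Hypothetical stand-in for newPointArray with 500 elements.
const samplePoints = Array.from({ length: 500 }, (_, i) => ({
  _id: String(i),
  properties: { foo: 'bar' + i },
}));

const ops = buildUpsertOps(samplePoints);
console.log(ops.length, JSON.stringify(ops[0]));

// In the mongo shell the operations would then be sent in one call:
// db.getCollection("me_all_test").bulkWrite(ops, { ordered: false });
```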

What's the strategy for creating index for this query?

I have a comment model for each thread,
const CommentSchema = new mongoose.Schema({
author: { type: ObjectID, required: true, ref: 'User' },
thread: { type: ObjectID, required: true, ref: 'Thread' },
parent: { type: ObjectID, required: true, ref: 'Comment' },
text: { type: String, required: true },
}, {
timestamps: true,
});
Besides a single query via _id, I want to query the database in this way:
Range query
const query = {
thread: req.query.threadID,
_id: { $gt: req.query.startFrom }
};
CommentModel.find(query).limit(req.query.limit);
My intention here is to find the comments related to a thread and then get part of the result. This query seems to work as expected. My questions are:
Is this the right way to fulfill my requirement?
How do I properly index the fields? Is this a compound index, or do I need to index each field separately? I checked the result of explain(); it seems that as long as one of the query fields has an index, inputStage.stage will always be IXSCAN rather than COLLSCAN. Is this the key information for checking the performance of the query?
Does it mean that every time I need to find based on one field, I need to make an index for these fields? Let's say that I want to search all the comments that are posted by an author to a specific thread.
Code like this:
const query = {
thread: req.query.threadID,
author: req.query.authorID,
};
Do I need to create a compound index for this requirement?
If you want to query by multiple fields, you should create a compound index.
For example:
const query = {
thread: req.query.threadID,
author: req.query.authorID,
};
To support this query, create a compound index like this:
db.comments.createIndex( { "thread": 1, "author": 1 } );
The index then supports queries on the thread field alone as well as on both the thread and author fields:
db.comments.find( { thread: "threadName" } )
db.comments.find( { thread: "threadName", author: "authorName" } )
The order of the fields inside the query document does not matter, so the second query can also be written with author first. What the index does not support is a query on author alone:
db.comments.find( { author: "authorName" } )
Say you create an index on { "thread": 1, "author": 1, "parent": 1 }. The index then supports queries on:
the thread field,
the thread field and the author field,
the thread field and the author field and the parent field.
But it does not support queries on:
the author field,
the parent field, or
the author and parent fields.
For more read this
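The prefix rule above can be sketched as a small function. usablePrefixLength is a hypothetical helper, not a MongoDB API, and it is a simplified model that only looks at which fields appear in the query; the real query planner considers more than this.

```javascript
// Count how many leading fields of a compound index are present in the query.
// 0 means the query does not start at the index prefix.
function usablePrefixLength(indexFields, queryFields) {
  const queried = new Set(queryFields); // order inside the query is irrelevant
  let length = 0;
  for (const field of indexFields) {
    if (!queried.has(field)) break; // the prefix ends at the first missing field
    length += 1;
  }
  return length;
}

const index = ['thread', 'author', 'parent'];
console.log(usablePrefixLength(index, ['thread']));
console.log(usablePrefixLength(index, ['author', 'thread']));
console.log(usablePrefixLength(index, ['author']));
console.log(usablePrefixLength(index, ['author', 'parent']));
```

In this model a query on thread and parent still gets a prefix length of 1, because the thread part of the index can narrow the search even though parent cannot.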

Resolving a ref inside of a MongoDB aggregate

I have a Product model object that has the following field in its schema:
category : { type: ObjectId, turnOn: false, ref: "category" }
It references a category model that has a title field in it:
var categorySchema = Schema({
title : { type: String }
});
I'm using the product.category property (which is of type ObjectId as shown above) in a MongoDB aggregate but really want the category.title property from the category model rather than _id in the final resultset.
The following code gets the job done, but you'll see that I'm having to do some looping at the end to "resolve" the title field for the given product.category (ObjectId). Is there any way to do all of that within the aggregate? In other words, is there a way to get the category model object's title field in the groups that are returned rather than having to do the extra looping work? Based on posts I've researched I don't see a built-in way but wanted to double-check.
getProductsGroupedByCategory = function(callback) {
Category.find(function(err, cats) {
var aggregate = [
{
$group: {
_id: "$category",
products: {
$push: {
title: "$title",
authors: "$authors",
publishDate: "$publishDate",
description: "$description"
}
}
}
},
{
$sort: {
"_id": 1
}
}
];
Product.aggregate(aggregate, function(err, catProducts) {
//Grab name of category and associate with each group
//since we only have the category_id at this point
for (var i = 0; i<catProducts.length;i++) {
var catProduct = catProducts[i];
for (var j=0;j<cats.length;j++) {
if (catProduct._id.toString() === cats[j]._id.toString()) {
catProduct.category = cats[j].title;
}
}
};
callback(err, catProducts);
});
});
}, //more code follows
An example document would be helpful, along with what you need out of it. From what I understand, you are looking to get the title into the grouping criteria, which can be done with a compound grouping key, i.e.
_id: {category: "$category", title: "$title"}
If the title is within an array, you should unwind, group, and then group again to rebuild the array to achieve the result.