Mongoose - can't insert subDocuments of a Dictionary Type - mongodb

I have a Mongoose schema for the document Company, that has several fields. One of these (documents_banks) is a "free" field, of dictionary type, because I don't know the names of the keys in advance.
The problem is that, when I save the document (company.save()) even if the resulting saved document has the new sub_docs, in the DB no new sub_docs are actually saved.
var Company = new Schema({
banks: [{ type: String }], // array of Strings
documents_banks: {} // free field
});
Even if documents_banks is not restricted by the Schema, it will have this structure (in my mind):
{
"bank_id1": {
"doc_type1": {
"url": { "type": "String" },
"custom_name": { "type": "String" }
},
"doc_type2": {
"url": { "type": "String" },
"custom_name": { "type": "String" }
}
},
"bank_id2": {
"doc_type1": {
"url": { "type": "String" },
"custom_name": { "type": "String" }
}
}
}
But I don't know the names of the bank_id and doc_type keys in advance, so I used the Dictionary type (documents_banks: {}).
Below is the function I use to save new sub_docs in documents_banks. It follows the same logic I always use to save new sub_docs. This time, however, the sub_doc appears to be saved, but it is not.
function addBankDocument(company_id, bank_id, doc_type, url, custom_name) {
// retrieve the company document
Company.findById(company_id)
.then(function(company) {
// create empty sub_docs if needed
if (!company.documents_banks) {
company.documents_banks = {};
}
if (!company.documents_banks[bank_id]) {
company.documents_banks[bank_id] = {};
}
// add the new sub_doc
company.documents_banks[bank_id][doc_type] = {
"url": url,
"custom_name": custom_name
};
return company.save();
})
.then(function(saved_company) {
// I try to check if the new obj has been saved
console.log(saved_company.documents_banks[bank_id][doc_type]);
// and it actually prints the new obj!!
});
}
The saved_company returned by .save() does contain the new sub_docs, but when I check the DB the new sub_doc is not there. Only the first one is saved; all the others are not stored.
So console.log() always prints the new sub_doc, but in the database only the first sub_doc is ever saved. In the end, the stored company always has exactly one sub_doc, the first one.
This seems very strange to me, since saved_company has the new sub_docs. What could be happening?
Below is a real extract from my DB. It only ever contains the sub_doc "doc_bank#1573807781414"; the others never appear in the DB.
{
"_id": "5c6eaf8efdc21500146e289c", // company_id
"banks": [ "MPS" ],
"documents_banks": {
"5c5ac3e025acd98596021a9a": // bank_id
{
"doc_bank#1573807781414": // doc_type
{
"url": "http://...",
"custom_name": "file1"
}
}
}
}
Versions:
$ npm -v
6.4.1
$ npm show mongoose version
5.7.11
$ node -v
v8.16.0

It seems that, since Mongoose doesn't know the exact model of the subdoc, it can't detect when it changes. So I have to call markModified to notify it of changes to the "free field" (also known as a dictionary or Mixed type), like this:
company_doc.documents_banks["bank_id2"]["doc_type3"] = obj; // modify
company_doc.markModified('documents_banks'); // <--- notify changes
company_doc.save(); // save changes
As I understand it, markModified forces the model to update that field during save().
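Applied to the function from the question, the fix might look like the sketch below. The helper name setBankDocument is hypothetical; it takes an already loaded company document, so it relies only on the markModified call shown above.

```javascript
// Hypothetical helper: adds one sub_doc to the Mixed field and flags the change.
// `company` is assumed to be a Mongoose document loaded with Company.findById().
function setBankDocument(company, bankId, docType, url, customName) {
  // create the empty containers if needed
  if (!company.documents_banks) {
    company.documents_banks = {};
  }
  if (!company.documents_banks[bankId]) {
    company.documents_banks[bankId] = {};
  }
  // add the new sub_doc
  company.documents_banks[bankId][docType] = {
    url: url,
    custom_name: customName
  };
  // changes inside a Mixed field are not tracked, so flag the path explicitly
  company.markModified('documents_banks');
  return company;
}

// Usage inside the promise chain from the question (not executed here):
// Company.findById(company_id)
//   .then(function (company) {
//     return setBankDocument(company, bank_id, doc_type, url, custom_name).save();
//   });
```

Keeping the mutation and the markModified call in one place makes it harder to forget the second step when more sub_docs are added later.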

Document AI Contract Processor - batchProcessDocuments ignores fieldMask

My aim is to reduce the json file size, which contains the base64 image sections of the documents by default.
I am using the Document AI - Contract Processor in US region, nodejs SDK.
It is my understanding that setting fieldMask attribute in batchProcessDocuments request filters out the properties that will be in the resulting json.
I want to keep only the entities property.
Here are my call parameters:
const documentai = require('@google-cloud/documentai').v1;
const client = new documentai.DocumentProcessorServiceClient(options);
let params = {
"name": "projects/XXX/locations/us/processors/3e85a4841d13ce5",
"region": "us",
"inputDocuments": {
"gcsDocuments": {
"documents": [{
"mimeType": "application/pdf",
"gcsUri": "gs://bubble-bucket-XXX/files/CymbalContract.pdf"
}]
}
},
"documentOutputConfig": {
"gcsOutputConfig": {
"gcsUri": "gs://bubble-bucket-XXXX/ocr/"
},
"fieldMask": {
"paths": [
"entities"
]
}
}
};
client.batchProcessDocuments(params, function(error, operation) {
if (error) {
return reject(error);
}
return resolve({
"operationName": operation.name
});
});
However, the resulting json is still containing the full set of data.
Am I missing something here?
The auto-generated documentation for the Node.JS Client Library is a little hard to follow, but it looks like the fieldMask should be a member of the gcsOutputConfig instead of the documentOutputConfig. (I'm surprised the API didn't throw an error)
https://cloud.google.com/nodejs/docs/reference/documentai/latest/documentai/protos.google.cloud.documentai.v1.documentoutputconfig.gcsoutputconfig
The REST Docs are a little more clear
https://cloud.google.com/document-ai/docs/reference/rest/v1/DocumentOutputConfig#gcsoutputconfig
Note: For a REST API call and for other client libraries, the fieldMask is structured as a string (e.g. text,entities,pages.pageNumber)
I haven't tried this with the Node Client libraries before, but I'd recommend trying this as well if moving the parameter doesn't work on its own.
https://cloud.google.com/document-ai/docs/send-request#async-processor
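As a sketch of the suggested change, the request from the question could be rebuilt with fieldMask nested inside gcsOutputConfig. The helper name buildBatchRequest is hypothetical, and whether the service then honors the mask is untested here, as the answer itself notes.

```javascript
// Hypothetical helper: builds a batchProcessDocuments request with fieldMask
// nested inside gcsOutputConfig, as the answer above suggests.
function buildBatchRequest(processorName, inputGcsUri, outputGcsUri, maskPaths) {
  return {
    name: processorName,
    inputDocuments: {
      gcsDocuments: {
        documents: [{ mimeType: 'application/pdf', gcsUri: inputGcsUri }]
      }
    },
    documentOutputConfig: {
      gcsOutputConfig: {
        gcsUri: outputGcsUri,
        // moved here from documentOutputConfig
        fieldMask: { paths: maskPaths }
      }
    }
  };
}

const params = buildBatchRequest(
  'projects/XXX/locations/us/processors/3e85a4841d13ce5',
  'gs://bubble-bucket-XXX/files/CymbalContract.pdf',
  'gs://bubble-bucket-XXXX/ocr/',
  ['entities']
);
// client.batchProcessDocuments(params, callback) would then be called as in the question.
```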

How can I set an ObjectId on a property other than _id when creating a document?

I'm trying to create an object that looks like this:
const userSettingsSchema = extendSchema(HistorySchema, {
user: //ObjectId here,
key:{type: String},
value:{type: String}
});
this is the post method declared in the router
app.post(
"/api/user/settings/:key",
userSettingsController.create
);
and this is the method "create":
async create(request, response) {
try {
const param = request.params.key;
const body = request.body;
console.log('body', body)
switch (param) {
case 'theme':
var userSettings = await UserSettings.create(body) // user:objecId -> missing...
response.status(201).send(userSettings);
break;
}
} catch (e) {
return response.status(400).send({ msg: e.message });
}
}
I don't know how to assign an ObjectId to the user property, because the ObjectId is generated when the doc is created. So I cannot do userSettings.user = userSettings._id, because the object has already been created. I only manage to get something like this created:
{
"_id": "60c77565f1ac494e445cccfe",
"key": "theme",
"value": "dark",
}
But it should be:
{
"user": "60c77565f1ac494e445cccfe",
"key": "theme",
"value": "dark",
}
_id is the only mandatory property of a document. It is the unique identifier of the document and you cannot remove it.
If you provide a document without _id, the driver will generate one.
You can generate an ObjectId with
let id = new mongoose.Types.ObjectId(); // `new` is required by recent Mongoose versions
and assign it to as many properties as you want.
You can even generate multiple different ObjectIds for different properties of the same document.
Now, you don't really need to assign ObjectId to "user" property. _id will do just fine. In fact it is most likely you don't need user's settings in separate collection and especially as multiple documents with single key-value pair.
You should be good by embedding "settings" property to your "user" collection as a key-value json.
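A minimal sketch of that embedded layout, assuming the user document carries a settings object. The helper buildSettingUpdate and the field name settings are hypothetical, and with Mongoose the schema would need to declare that field (for instance as a Mixed type) for the update to be kept.

```javascript
// Hypothetical helper: builds a MongoDB update that stores one key-value
// setting under an embedded "settings" object on the user document.
function buildSettingUpdate(key, value) {
  const update = { $set: {} };
  // dot notation targets a single key without replacing the other settings
  update.$set['settings.' + key] = value;
  return update;
}

const update = buildSettingUpdate('theme', 'dark');
console.log(JSON.stringify(update)); // {"$set":{"settings.theme":"dark"}}

// With a Mongoose model it could be applied like this (User and userId are assumed names):
// await User.updateOne({ _id: userId }, update);
```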

AppSync: pipeline resolver #return null result

I'm successfully using a pipeline resolver to persist a parent/child relationship, except when the list of child items is empty and I #return early.
I'm guessing the issue is around my response mappers and use of $ctx.prev vs $ctx.result but I can't figure it out.
The pipeline looks like this:
BEFORE template: {}
Function 1:
request = PutItem the parent
response = $utils.toJson($ctx.result)
Function 2:
request = TransactWriteItems (foreach UpdateItem) the children
response = $utils.toJson($ctx.prev.result)
AFTER template: $utils.toJson($ctx.prev.result)
When I call the mutation with
{"parentAttribute":"foo", "children": [{"childAttribute": "bar"}]}
I get a good response like:
{
"data": {
"createFoo": {
"parentAttribute": "foo",
"children": [
{
"childAttribute": "bar"
}
]
}
}
}
If no children, Function 2 request mapper does #return to avoid "TransactWriteItems must have at least one operation" error.
In this scenario I am hoping for the above response to the mutation, just with children: []
Instead, I get:
{
"data": {
"createFoo": null
}
}
The data has been written correctly; if I query it I get back the parent with empty list of children.
How do I get this pipeline to execute so that it returns the combined parent+child data whether the child array is populated or not?
Detail
The schema is something like:
type Foo {
id: String!
attr1: String
bars: [Bar]
}
type Bar {
id: String!
attr2: String
}
type Mutation {
createFoo(foo: Foo): Foo
}
And a dynamodb representation like this:
pk    | sk             | attr1 | attr2
FOO#1 | METADATA#FOO#1 | Lorem |
FOO#1 | BAR#1          |       | Ipsum
While the pipeline looks like:
before.vtl
{}
createParent-request.vtl
{
"version" : "2017-02-28",
"operation" : "PutItem",
"key" : {
"pk" : $util.dynamodb.toDynamoDBJson(...),
"sk" : $util.dynamodb.toDynamoDBJson(...)
},
"attributeValues" : {
"data" : $util.dynamodb.toDynamoDBJson(...)
}
}
createParent-response.vtl
#if($ctx.error)
$utils.error($ctx.error.message, $ctx.error.type)
#end
$utils.toJson($ctx.result)
createChildren-request.vtl
#if($ctx.args.fooInput.children.size() > 0)
{
"version": "2018-05-29",
"operation": "TransactWriteItems",
"transactItems": [
#foreach( $child in $ctx.args.fooInput.children )
{
"table": "${table}",
"operation": "UpdateItem",
"key": {
"pk" : $util.dynamodb.toDynamoDBJson(...),
"sk" : $util.dynamodb.toDynamoDBJson(...)
},
"update": {
"expression": "SET #data = :data",
"expressionNames": {
"#data": "data"
},
"expressionValues": {
":data":
$util.dynamodb.toDynamoDBJson(...)
}
}
}
#if( $foreach.hasNext ),#end
#end
]
}
#else
#return
#end
createChildren-response.vtl
#if($ctx.error)
$utils.error($ctx.error.message, $ctx.error.type)
#end
$utils.toJson($ctx.prev.result)
after.vtl
#if($ctx.error)
$utils.error($ctx.error.message, $ctx.error.type)
#end
$utils.toJson($ctx.prev.result)
I figured it out. For the expected behaviour, one needs the 'after' mapper to return the necessary JSON to populate the overall mutation response. In my example above, after.vtl needs to return a parent and nothing else matters (in particular, the result of the individual function response mappers).
I ended up putting the output of the 'create parent' operation into ctx.stash then returning ctx.stash in after.vtl, setting the other resolvers to {}.
Note that, if your response has subtypes (with their own resolvers) and you return it sparse, AppSync will call the resolver. In the context of my example, it's enough to return the parent without any children and then the normal query resolver for "get children of a parent" will execute to populate the final response.
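The stash approach can be pictured with the plain JavaScript model below. It is not AppSync resolver code, only an illustration of the data flow; runPipeline and the function shapes are made up for this sketch.

```javascript
// Toy model of a pipeline resolver: each function may skip its data source
// call (like #return), and only the 'after' step shapes the final response.
function runPipeline(args, functions, after) {
  const ctx = { args: args, stash: {}, prev: { result: null } };
  for (const fn of functions) {
    ctx.prev.result = fn(ctx);
  }
  return after(ctx);
}

// Function 1: "writes" the parent and keeps a copy in the stash.
function createParent(ctx) {
  ctx.stash.parent = { parentAttribute: ctx.args.parentAttribute };
  return {};
}

// Function 2: returns early with nothing when there are no children.
function createChildren(ctx) {
  if (!ctx.args.children || ctx.args.children.length === 0) {
    return null; // stands in for #return
  }
  return {};
}

// After step: reads the stash, so an early return in Function 2 no longer matters.
function after(ctx) {
  return ctx.stash.parent;
}

const result = runPipeline(
  { parentAttribute: 'foo', children: [] },
  [createParent, createChildren],
  after
);
console.log(result);
```

Because the after step no longer depends on the previous function's result, the empty-children case and the populated case return the same parent.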

How can I return the element I'm looking for inside a nested array?

I have a database like this:
[
{
"universe":"comics",
"saga":[
{
"name":"x-men",
"characters":[
{
"character":"wolverine",
"picture":"618035022351.png"
},
{
"character":"cyclops",
"picture":"618035022352.png"
}
]
}
]
},
{
"universe":"dc",
"saga":[
{
"name":"spiderman",
"characters":[
{
"character":"venom",
"picture":"618035022353.png"
}
]
}
]
}
]
and with this code I manage to update one of the objects in my array. specifically the object where character: wolverine
db.mydb.findOneAndUpdate({
"universe": "comics",
"saga.name": "x-men",
"saga.characters.character": "wolverine"
}, {
$set: {
"saga.$[].characters.$[].character": "lobezno",
"saga.$[].characters.$[].picture": "618035022354.png",
}
}, {
new: false
}
)
It returns the whole document, but I need ONLY the matched element.
I would like to return the object that I have updated without having to make more queries to the database.
Note
I have been told that my code does not work as it should and that my update query is apparently wrong. I would like to know how to fix it and get the object that matches these search criteria.
In other words how can I get this output:
{
"character":"wolverine",
"picture":"618035022351.png"
}
in a single query using filters
{
"universe": "comics",
"saga.name": "x-men",
"saga.characters.character": "wolverine"
}
My MongoDB knowledge prevents me from correcting this.
Use the shell method findAndModify to suit your needs.
But you cannot use the positional character $ more than once while projecting in MongoDb, so you may have to keep track of it yourself at client-side.
Use arrayFilters to update deeply nested sub-document, instead of positional all operator $[].
Below is a working query -
var query = {
universe: 'comics'
};
var update = {
$set: {
'saga.$[outer].characters.$[inner].character': 'lobezno',
'saga.$[outer].characters.$[inner].picture': '618035022354.png',
}
};
var fields = {
'saga.characters': 1
};
var updateFilter = {
arrayFilters: [
{
'outer.name': 'x-men'
},
{
'inner.character': 'wolverine'
}
]
};
db.collection.findAndModify({
query,
update,
fields,
arrayFilters: updateFilter.arrayFilters,
new: true
});
If I understand your question correctly, your update is working as expected; your issue is that it returns the whole document and you don't want to query the database again just to get these two fields.
Why don't you just extract the fields from the document returned from your update? You are not going to the database when doing that.
var extractElementFromResult = null;
if(result != null) {
extractElementFromResult = result.saga
.filter(item => item.name == "x-men")[0]
.characters
.filter(item => item.character == "wolverine")[0];
}
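The chained filter calls above throw if the saga is not found, because [0] is then undefined. A more defensive variant could look like the sketch below; extractCharacter is a hypothetical helper name. Which character name to look up depends on whether the returned document is the pre-update or the post-update version.

```javascript
// Hypothetical helper: pulls one character out of a returned document.
// Returns null instead of throwing when the saga or character is missing.
function extractCharacter(doc, sagaName, characterName) {
  if (!doc || !Array.isArray(doc.saga)) {
    return null;
  }
  const saga = doc.saga.find(function (item) {
    return item.name === sagaName;
  });
  if (!saga || !Array.isArray(saga.characters)) {
    return null;
  }
  const match = saga.characters.find(function (item) {
    return item.character === characterName;
  });
  return match || null;
}

// Sample shaped like the first document in the question.
const sample = {
  universe: 'comics',
  saga: [{
    name: 'x-men',
    characters: [
      { character: 'wolverine', picture: '618035022351.png' },
      { character: 'cyclops', picture: '618035022352.png' }
    ]
  }]
};
console.log(extractCharacter(sample, 'x-men', 'wolverine'));
```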

mongodb need to populate a new field with an old fields value, without destroying other data

I have a situation where a model changed at some point in time, and I am faced with (for argument's sake) half my data looking like this
{
_id: OID,
things: [{
_id:OID,
arm: string,
body: string
}],
other: string
}
and the other half of my data look like this
{
_id: OID,
things: [{
_id:OID,
upper_appendage: string,
body: string
}],
other: string
}
I would like to 'correct' half of the data - so that I DON'T have to accommodate both names for 'arm' in my application code.
I have tried a couple different things:
The first errors
db.getCollection('x')
.find({things:{$exists:true}})
.forEach(function (record) {
record.things.arm = record.things.upper_appendage;
db.users.save(record);
});
and this - which destroys all the other data in
db.getCollection('x')
.find({things:{$exists:true}})
.forEach(function (record) {
record.things = {
upper_appendage.arm = record.things.upper_appendage
};
db.users.save(record);
});
Keeping in mind that there is other data I want to maintain...
How can I do this???
The $rename operator should have worked for this job, but unfortunately it doesn't seem to support nested array fields (as of MongoDB server 4.2). Instead you'd need a forEach like the following:
db.items.find({
things: {
$elemMatch: {
arm: {
$exists: true
}
}
}
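}).forEach(function(item) {
for (var i = 0; i < item.things.length; ++i)
{
// only touch elements that still use the old field name
if (item.things[i].arm !== undefined) {
item.things[i].upper_appendage = item.things[i].arm;
delete item.things[i].arm;
}
}
// replace the stored document with the corrected copy
db.items.update({
_id: item._id
}, item);
})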
Note: I've assumed you want all records to have upper_appendage and to get rid of the 'arm' field. If you want it the other way around, just swap the two names.
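If the direction of the rename is still undecided, the loop body can be pulled into a small pure function and tried on sample data before touching the collection. The name renameThingsField is hypothetical.

```javascript
// Hypothetical helper: renames one key in every element of doc.things.
// Elements that do not have the old key are left untouched.
function renameThingsField(doc, from, to) {
  (doc.things || []).forEach(function (thing) {
    if (Object.prototype.hasOwnProperty.call(thing, from)) {
      thing[to] = thing[from];
      delete thing[from];
    }
  });
  return doc;
}

// A document mixing both shapes from the question.
const mixed = {
  things: [
    { arm: 'left', body: 'a' },
    { upper_appendage: 'right', body: 'b' }
  ],
  other: 'kept'
};
console.log(JSON.stringify(renameThingsField(mixed, 'arm', 'upper_appendage')));
```

Inside the forEach above, the call would be renameThingsField(item, 'arm', 'upper_appendage'), or with the two names swapped for the opposite direction.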