Document AI Contract Processor - batchProcessDocuments ignores fieldMask - google-api-nodejs-client

My aim is to reduce the size of the output JSON file, which by default contains the base64 image sections of the documents.
I am using the Document AI Contract Processor in the US region, with the Node.js SDK.
It is my understanding that setting the fieldMask attribute in the batchProcessDocuments request filters which properties end up in the resulting JSON.
I want to keep only the entities property.
Here are my call parameters:
const documentai = require('@google-cloud/documentai').v1;
const client = new documentai.DocumentProcessorServiceClient(options);
let params = {
"name": "projects/XXX/locations/us/processors/3e85a4841d13ce5",
"region": "us",
"inputDocuments": {
"gcsDocuments": {
"documents": [{
"mimeType": "application/pdf",
"gcsUri": "gs://bubble-bucket-XXX/files/CymbalContract.pdf"
}]
}
},
"documentOutputConfig": {
"gcsOutputConfig": {
"gcsUri": "gs://bubble-bucket-XXXX/ocr/"
},
"fieldMask": {
"paths": [
"entities"
]
}
}
};
client.batchProcessDocuments(params, function(error, operation) {
if (error) {
return reject(error);
}
return resolve({
"operationName": operation.name
});
});
However, the resulting JSON still contains the full set of data.
Am I missing something here?

The auto-generated documentation for the Node.js client library is a little hard to follow, but it looks like the fieldMask should be a member of the gcsOutputConfig instead of the documentOutputConfig. (I'm surprised the API didn't throw an error.)
https://cloud.google.com/nodejs/docs/reference/documentai/latest/documentai/protos.google.cloud.documentai.v1.documentoutputconfig.gcsoutputconfig
The REST docs are a little clearer:
https://cloud.google.com/document-ai/docs/reference/rest/v1/DocumentOutputConfig#gcsoutputconfig
Note: For a REST API call and for other client libraries, the fieldMask is structured as a string (e.g. text,entities,pages.pageNumber).
I haven't tried this with the Node client library before, but I'd recommend trying the string form as well if moving the parameter doesn't work on its own.
https://cloud.google.com/document-ai/docs/send-request#async-processor
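For reference, here is a minimal sketch of what the request might look like with fieldMask moved inside gcsOutputConfig, as suggested above. The helper name buildBatchRequest is hypothetical, it only builds the request object and does not call the API, and the { paths: [...] } shape is taken from the question. I have not verified this against the live service.

```javascript
// Hypothetical helper: assembles the batchProcessDocuments request object.
// It does not contact Document AI; pass the result to client.batchProcessDocuments.
function buildBatchRequest(processorName, inputUri, outputUri, maskPaths) {
  return {
    name: processorName,
    inputDocuments: {
      gcsDocuments: {
        documents: [{ mimeType: 'application/pdf', gcsUri: inputUri }],
      },
    },
    documentOutputConfig: {
      gcsOutputConfig: {
        gcsUri: outputUri,
        // fieldMask sits inside gcsOutputConfig, next to gcsUri,
        // not directly under documentOutputConfig.
        fieldMask: { paths: maskPaths },
      },
    },
  };
}

const request = buildBatchRequest(
  'projects/XXX/locations/us/processors/3e85a4841d13ce5',
  'gs://bubble-bucket-XXX/files/CymbalContract.pdf',
  'gs://bubble-bucket-XXXX/ocr/',
  ['entities']
);
console.log(JSON.stringify(request.documentOutputConfig, null, 2));
```

If the object form of the mask is not honored, the comma-separated string form mentioned in the note above would be the next thing to try.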

Related

How to generate automatic Id with Commit or Batch Document Firestore REST

Hi, I am creating documents with commit like this:
{
"writes": [
{
"update": {
"name": "projects/projectID/databases/(default)/documents/test/?documentId=",
"fields": {
"comment": {
"stringValue": "Hello World!"
}
}
}
},
{
"update": {
"name": "projects/projectID/databases/(default)/documents/test/?documentId=",
"fields": {
"comment": {
"stringValue": "Happy Birthday!"
}
}
}
}
]
}
The parameter ?documentId= doesn't work the way it does when creating a single document. If I leave it empty, I get an error saying that I must specify the name of the document. How can I generate an automatic ID for each document?
Unfortunately, batch commits with auto generated documentId are not possible in the Firestore REST API. As you can see in this documentation, the Document object should be provided with a full path, including the documentID:
“Name:string
The resource name of the document, for example projects/{project_id}/databases/{databaseId}/documents/{document_path}.”
And if it was possible to omit the documentID, it would be mentioned in this documentation.
If you would like to have this implemented in the Firestore REST API, you can create a feature request in Google’s Issue Tracker so that they can consider implementing it.
I just came across the same problem and discovered that it is still not implemented.
I created a feature request for it here: https://issuetracker.google.com/issues/227875470.
So please go star it if you want this to be added.
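Until that is implemented, one possible workaround (not mentioned in the answers above, so treat it as an assumption) is to generate the IDs yourself before building the commit body. As far as I know, the client SDKs also create auto IDs on the client as 20-character random alphanumeric strings. The function names below are hypothetical, and the sketch only builds the JSON body; it does not send the request.

```javascript
const { randomInt } = require('node:crypto');

// 62 characters: A-Z, a-z, 0-9.
const ID_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

// Hypothetical helper: returns a random alphanumeric ID of the given length.
function generateDocumentId(length = 20) {
  let id = '';
  for (let i = 0; i < length; i++) {
    // randomInt(max) returns an unbiased integer in [0, max).
    id += ID_ALPHABET[randomInt(ID_ALPHABET.length)];
  }
  return id;
}

// Hypothetical helper: builds a commit body with one "update" write per comment,
// each with a full document name that ends in a freshly generated ID.
function buildCommitBody(collectionPath, comments) {
  return {
    writes: comments.map((comment) => ({
      update: {
        name: `${collectionPath}/${generateDocumentId()}`,
        fields: { comment: { stringValue: comment } },
      },
    })),
  };
}

const body = buildCommitBody(
  'projects/projectID/databases/(default)/documents/test',
  ['Hello World!', 'Happy Birthday!']
);
console.log(JSON.stringify(body, null, 2));
```

Each write then carries a complete resource name, which is what the commit endpoint asks for.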

REST API Multiple PUT or DELETE in one time

Greetings everyone. I have a datatable in my HTML page that I populate using a REST API. I can create a new row, and also update or delete one by selecting a row and clicking the edit or delete button.
But currently I am unable to update or delete multiple rows at once due to a URL error,
e.g : PUT http://127.0.0.1:8000/dashboard/content_detail/5,7,9/ 404 (Not Found)
How can I split this into several separate URLs, one per id, when I update or delete?
e.g :
/dashboard/content_detail/5
/dashboard/content_detail/7
/dashboard/content_detail/9
Below is my code. Any help is much appreciated, thank you.
idSrc: 'id',
ajax: {
create: {
type: 'POST',
url: content_path,
data: function (content_data) {
var create_data = {};
$.each(content_data.data, function (id, value) {
create_data['name'] = value['name'];
create_data['description'] = value['description'];
create_data['category'] = value['category'];
});
return create_data;
},
success: function () {
content_table.api().ajax.reload();
}
},
edit: {
type: 'PUT',
url: '/dashboard/content_detail/_id_/',
data: function (content_data) {
var updated_data = {};
$.each(content_data.data, function (id, value) {
updated_data['description'] = value['description'];
updated_data['category'] = value['category'];
updated_data['name'] = value['name'];
});
return updated_data;
},
success: function () {
content_table.api().ajax.reload();
}
},
remove: {
type: 'DELETE',
url: '/dashboard/content_detail/_id_/',
data: function (content_data) {
var deleted_data = {};
$.each(content_data.data, function (id, value) {
deleted_data['id'] = id;
});
return deleted_data;
},
success: function () {
content_table.api().ajax.reload();
}
}
},
If you're going to allow the update of a large number of items at once, then PATCH might be your friend.
Looking at RFC 6902 (which defines the JSON Patch format), from the client's perspective the API could be called like this:
PATCH /authors/{authorId}/book
[
{ "op": "replace", "path": "/dashboard/content_detail/5", "value": "test"},
{ "op": "remove", "path": "/dashboard/content_detail", "value": [ "7", "9" ]}
]
From a design perspective you don't want several ids in your URL.
I would prefer single calls for each change. Thinking in resources, you only manipulate one at a time.
If this is a performance issue, I recommend a special URL marked with "action" or something similar, to make clear that this is not REST.
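A minimal sketch of the "single call per change" idea: split the combined id string into one URL per id. The function name buildDetailUrls is hypothetical, and the base path is taken from the question. How to hook this into the table editor's ajax option is not shown here.

```javascript
// Hypothetical helper: turns "5,7,9" (or [5, 7, 9]) into one detail URL per id.
function buildDetailUrls(ids, basePath = '/dashboard/content_detail/') {
  const list = Array.isArray(ids) ? ids : String(ids).split(',');
  return list
    .map((id) => String(id).trim())
    .filter((id) => id.length > 0) // skip empty entries such as a trailing comma
    .map((id) => `${basePath}${encodeURIComponent(id)}/`);
}

console.log(buildDetailUrls('5,7,9'));
// → [ '/dashboard/content_detail/5/',
//     '/dashboard/content_detail/7/',
//     '/dashboard/content_detail/9/' ]
```

Each resulting URL can then be used for its own PUT or DELETE request.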
In HTTP it is not required for information to only exist on a single resource. It is possible to have multiple resources that represent the same underlying data.
It's therefore not out of the question to create a resource that 'represents' a set of other resources that you wish to DELETE or PUT to.
I do agree that it might not be the most desirable. I think we tend to prefer having information only exist in a single part of the tree, and I think we like to avoid situations where updating a resource affects a secondary resource's state. However, if you are looking for a strictly RESTful solution to solve this problem, I think it's the right way.
Therefore a url design such as:
/dashboard/content_detail/5,7,9/
is not necessarily non-RESTful, nor does it go against the HTTP protocol. The fact that you're getting a 404 on that URL currently has to do with your application framework, not the protocol (HTTP) or architecture (REST) of your API.
However, for cases such as these I would personally be inclined to create a separate POST endpoint that acts outside of REST, like an RPC endpoint, specifically for these types of batch requests.

Is it possible to extract a set of database rows with RestTemplate?

I am having difficulties getting multiple datasets out of my database with RestTemplate. I have many routines that extract a single row, with a format like:
IndicatorModel indicatorModel = restTemplate.getForObject(URL + id,
IndicatorModel.class);
and they work fine. However, if I try to extract a set of data, such as:
Map<String, List<S_ServiceCoreTypeModel>> coreTypesMap =
restTemplate.getForObject(URL + id, Map.class);
this returns values in a
Map<String, LinkedHashMap<>>
format. Is there an easy way to return a List<> or Set<> in the desired format?
Fundamentally the issue is that your Java object model does not match the structure of your json document. You are attempting to deserialize a single json element into a java List. Your JSON document looks like:
{
"serviceCoreTypes":[
{
"serviceCoreType":{
"name":"ALL",
"description":"All",
"dateCreated":"2016-06-23 14:46:32.09",
"dateModified":"2016-06-23 14:46:32.09",
"deleted":false,
"id":1
}
},
{
"serviceCoreType":{
"name":"HSI",
"description":"High-speed Internet",
"dateCreated":"2016-06-23 14:47:31.317",
"dateModified":"2016-06-23 14:47:31.317",
"deleted":false,
"id":2
}
}
]
}
But you cannot turn the serviceCoreTypes wrapper object into a List; you can only turn a JSON array into a List. For instance, if you removed the unnecessary wrapper elements from your JSON and your input document looked like:
[
{
"name": "ALL",
"description": "All",
"dateCreated": "2016-06-23 14:46:32.09",
"dateModified": "2016-06-23 14:46:32.09",
"deleted": false,
"id": 1
},
{
"name": "HSI",
"description": "High-speed Internet",
"dateCreated": "2016-06-23 14:47:31.317",
"dateModified": "2016-06-23 14:47:31.317",
"deleted": false,
"id": 2
}
]
You should then be able to deserialize THAT into a List<S_ServiceCoreTypeModel>. Alternatively, if you cannot change the JSON structure, you could create a Java object model that mirrors the JSON document by creating some wrapper classes. Something like:
class ServiceCoreTypes {
// Each array element is an object that wraps a single "serviceCoreType".
List<ServiceCoreTypeWrapper> serviceCoreTypes;
...
}
class ServiceCoreTypeWrapper {
ServiceCoreType serviceCoreType;
...
}
class ServiceCoreType {
String name;
String description;
...
}
I'm assuming you don't actually mean a database, but instead a RESTful service, as you're using RestTemplate.
The problem you're facing is that you want to get a Collection back, but the getForObject method can only take in a single type parameter and cannot figure out what the type of the returned collection is.
I'd encourage you to consider using RestTemplate.exchange(...),
which should allow you to request and receive back a collection type.
I have a solution that works, for now at least. I would prefer a solution such as the one proposed by Ben, where I can get the HTTP response body as a list of items in the format I chose, but at least here I can extract each individual item from the JSON node. The code:
S_ServiceCoreTypeModel endModel;
RestTemplate restTemplate = new RestTemplate();
JsonNode node = restTemplate.getForObject(URL, JsonNode.class);
JsonNode allNodes = node.get("serviceCoreTypes");
JsonNode oneNode = allNodes.get(1);
ObjectMapper objectMapper = new ObjectMapper();
endModel = objectMapper.readValue(oneNode.toString(), S_ServiceCoreTypeModel.class);
If anyone has thoughts on how to make Ben's solution work, I would love to hear it.

Can't post node that requires a pre assigned value with services api

I have setup a content type with a subject field that has pre assigned values in a dropdown field.
I am using the services api to post new content from a polymer app.
When I POST to the API, I send the field structure and value in JSON but get an error:
"406 (Not Acceptable : An illegal choice has been detected. Please contact the site administrator.)"
Even though the object I am sending matches one of the required values in the field.
Do I need to prefix the value with something? I assume I'm posting to the right place to get that response but don't know why it would accept anything other than the string value.
Here is what I sent to the api which is picked up by my Charles proxy.
{
"node": {
"type": "case",
"title": "my case",
"language": "und",
"field_subject": {
"und": {
"0": {
"value": "subject1"
}
}
},
"body": {
"und": {
"0": {
"value": "my details of subject"
}
}
}
}
}
And here is an example of what I have setup in my Drupal field
subject1| first
subject2| second
subject3| third
subject4| forth
For anyone else with the same problem: this subject is poorly documented, but the answer is simple. My subject field did not need the value key, despite devel suggesting that's how it would be formatted.
"field_subject": {
"und": [
"subject1"
]
}
I could also shorten my code with "und" being an array.
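Putting that together, a small sketch of a payload builder for this content type might look like the following. The function name buildCaseNode is hypothetical, the field names come from the question, and the body field is left in the format from the original request since only field_subject needed to change.

```javascript
// Hypothetical helper: builds the node payload for the "case" content type.
function buildCaseNode(title, subjectKey, details) {
  return {
    node: {
      type: 'case',
      title: title,
      language: 'und',
      // List field: pass the option key directly, without a "value" wrapper.
      field_subject: { und: [subjectKey] },
      // Body keeps the structure used in the original request.
      body: { und: { 0: { value: details } } },
    },
  };
}

const payload = buildCaseNode('my case', 'subject1', 'my details of subject');
console.log(JSON.stringify(payload, null, 2));
```

The subjectKey has to be one of the keys on the left of the pipe in the field settings (subject1, subject2, and so on), not the label.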

How to use Restangular when the service for a collection returns a dictionary instead of an array?

I'm in the early phases of the development of a client app to an existing REST service and I'm trying to decide what to use for server communication. So far I'm loving Restangular documentation, it seems really solid, but I'm worried it's not going to work with the service because the responses look something like this:
{
"0": {
"name": "John",
"total": 230,
"score": 13
},
"1": {
"name": "Sally",
"total": 190,
"score": 12
},
"2": {
"name": "Harry",
"total": 3,
"score": 0
},
"...": "..."
}
I can't find in the docs if something like this is supported or how am I supposed to handle this type of response. Has anyone tried? Any ideas?
I'm the creator of Restangular :).
You can use that response with Restangular. You need to use the responseInterceptor. I guess that your response looks like that when you're getting an array. So you need to do:
RestangularProvider.setListTypeIsArray(false)
RestangularProvider.setResponseExtractor(function(response, operation) {
// Only for lists
if (operation === 'getList') {
// Gets all the values of the response object and creates an array from them
return _.values(response)
}
return response;
});
With this, it's going to work :)
Please let me know if this worked out for you.
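If you'd rather not depend on the `_` library for this, the conversion itself can be sketched in plain JavaScript. The function name dictionaryToList is hypothetical. It keeps only numeric keys (so a placeholder key like "..." is skipped) and orders the items by index; it could be used in place of _.values(response) in the getList branch above.

```javascript
// Hypothetical helper: converts { "0": {...}, "1": {...} } into [{...}, {...}].
function dictionaryToList(response) {
  return Object.keys(response)
    .filter((key) => /^\d+$/.test(key)) // keep only index-like keys
    .sort((a, b) => Number(a) - Number(b)) // order by numeric index
    .map((key) => response[key]);
}

const sample = {
  '1': { name: 'Sally', total: 190, score: 12 },
  '0': { name: 'John', total: 230, score: 13 },
  '2': { name: 'Harry', total: 3, score: 0 },
  '...': '...',
};

const players = dictionaryToList(sample);
console.log(players.map((p) => p.name)); // → [ 'John', 'Sally', 'Harry' ]
```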
You could do this if you attached a "then" to the promise that you return.
That is, instead of this:
return <RestangularPromise>;
you have this:
return <RestangularPromise>.then(function(response){return _.values(response);});
Essentially, it's the same as @mgonto's answer, but doing it this way means that you don't have to modify your whole Restangular service if you only need to do this for one endpoint. (Although another way to avoid this is to create multiple instances of the service, as outlined in the readme.)
Unfortunately, listTypeIsArray is not described in the readme, so I haven't tried this option,
but I resolved a similar problem as described here:
https://github.com/mgonto/restangular/issues/100#issuecomment-24649851
RestangularProvider.setListTypeIsArray() has been deprecated.
If you are expecting an object then handle it within addResponseInterceptor
app.config(function(RestangularProvider) {
// add a response interceptor
RestangularProvider.addResponseInterceptor(function(data, operation, what, url, response, deferred) {
var extractedData;
// .. to look for getList operations
if (operation === "getList") {
// .. and handle the data and meta data
extractedData = data.data.data;
extractedData.meta = data.data.meta;
} else {
extractedData = data.data;
}
return extractedData;
}); });
If you are expecting a string response, then convert it to an array.
app.config(function(RestangularProvider) {
RestangularProvider.addResponseInterceptor(function(data, operation, what, url, response, deferred) {
var newData = data;
if (angular.isString(data)) {
newData = [data]; // convert string to array
}
return newData;
}); });