Removing _embedded from a collection of REST resources

Maybe this goes against REST/HAL principles, but I thought that if I was viewing a list of items, they should not be contained in an _embedded tag. Below are the details returned when I navigate to /characters in my Spring Boot application.
I had expected _embedded not to be present for the characterDescriptions, since they are the main focus of the page. Is it possible to achieve this? Should I try to, or is _embedded the norm here?
On a related note, when I navigate to a particular resource using its link (characters/1, for instance), should I link back to the /characters parent page, or is it acceptable to include only a self link at these kinds of endpoints? (I will eventually be linking to the user here, but this is a general question about REST endpoints.)
The controller method that returns this JSON is shown after it.
{
  "_embedded": {
    "characterDescriptions": [
      {
        "characterName": "Adrak",
        "playerName": "Liam",
        "userName": "liam",
        "_links": {
          "self": {
            "href": "http://localhost:8080/characters/1"
          }
        }
      },
      {
        "characterName": "Thorny",
        "playerName": "Aedo",
        "userName": "aedo",
        "_links": {
          "self": {
            "href": "http://localhost:8080/characters/2"
          }
        }
      },
      {
        "characterName": "Anin",
        "playerName": "Saoirse",
        "userName": "saoirse",
        "_links": {
          "self": {
            "href": "http://localhost:8080/characters/3"
          }
        }
      }
    ]
  },
  "_links": {
    "self": {
      "href": "http://localhost:8080/characters"
    }
  }
}
Here's the relevant method
@GetMapping
public ResponseEntity<Resources<Resource<CharacterDescription>>> getAllCharacterDescriptions() {
    // Wrap each character in a Resource carrying its own self link
    List<Resource<CharacterDescription>> characters = repository.findAll()
            .stream().map(character -> {
                Link characterLink = linkTo(methodOn(CharacterDescriptionController.class)
                        .getCharacterDescription(character.getCharacterId()))
                        .withSelfRel();
                return new Resource<>(character, characterLink);
            }).collect(Collectors.toList());
    // Self link for the collection itself
    Link allCharacterLink = linkTo(methodOn(CharacterDescriptionController.class)
            .getAllCharacterDescriptions())
            .withSelfRel();
    Resources<Resource<CharacterDescription>> resources = new Resources<>(characters, allCharacterLink);
    return ResponseEntity.ok(resources);
}

According to the HAL spec you can either render a single resource, with its content and set of links, or you can render an aggregate resource, which has room for multiple resources-within-this-resource.
In your domain model, you clearly show multiple documents, each with a distinct self URI (/characters/1, /characters/2, etc.), hence, you aren't serving up a single item resource, but instead an aggregate root.
If you read the HAL spec, you'll find this definition underneath _embedded:
It is an object whose property names are link relation types (as defined by RFC5988) and values are either a Resource Object or an array of Resource Objects.
In fact, searching the HAL spec for the word "array" leads you only to the section quoted above and to the _links section.
Hence, _embedded is the proper place to render an array of resources in HAL.
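To make the shape concrete, here is a minimal sketch in plain JavaScript (not Spring HATEOAS) of how a collection ends up under _embedded, keyed by a relation name. The function name toHalCollection and its parameters are hypothetical and only serve this illustration.

```javascript
// Hypothetical helper: builds a HAL collection document from plain items.
// rel      - property name used under _embedded (the link relation)
// items    - array of plain objects
// selfHref - URI of the collection itself
// itemHref - function returning the self URI for one item
function toHalCollection(rel, items, selfHref, itemHref) {
  return {
    _embedded: {
      // The array of resource objects lives under _embedded
      [rel]: items.map((item) => ({
        ...item,
        _links: { self: { href: itemHref(item) } },
      })),
    },
    _links: { self: { href: selfHref } },
  };
}

const doc = toHalCollection(
  'characterDescriptions',
  [
    { id: 1, characterName: 'Adrak' },
    { id: 2, characterName: 'Thorny' },
  ],
  '/characters',
  (item) => `/characters/${item.id}`
);
console.log(JSON.stringify(doc, null, 2));
```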


Document AI Contract Processor - batchProcessDocuments ignores fieldMask

My aim is to reduce the json file size, which contains the base64 image sections of the documents by default.
I am using the Document AI - Contract Processor in US region, nodejs SDK.
It is my understanding that setting fieldMask attribute in batchProcessDocuments request filters out the properties that will be in the resulting json.
I want to keep only the entities property.
Here are my call parameters:
const documentai = require('@google-cloud/documentai').v1;
const client = new documentai.DocumentProcessorServiceClient(options);
let params = {
  "name": "projects/XXX/locations/us/processors/3e85a4841d13ce5",
  "region": "us",
  "inputDocuments": {
    "gcsDocuments": {
      "documents": [{
        "mimeType": "application/pdf",
        "gcsUri": "gs://bubble-bucket-XXX/files/CymbalContract.pdf"
      }]
    }
  },
  "documentOutputConfig": {
    "gcsOutputConfig": {
      "gcsUri": "gs://bubble-bucket-XXXX/ocr/"
    },
    "fieldMask": {
      "paths": [
        "entities"
      ]
    }
  }
};
client.batchProcessDocuments(params, function(error, operation) {
  if (error) {
    return reject(error);
  }
  return resolve({
    "operationName": operation.name
  });
});
However, the resulting JSON still contains the full set of data.
Am I missing something here?
The auto-generated documentation for the Node.JS Client Library is a little hard to follow, but it looks like the fieldMask should be a member of the gcsOutputConfig instead of the documentOutputConfig. (I'm surprised the API didn't throw an error)
https://cloud.google.com/nodejs/docs/reference/documentai/latest/documentai/protos.google.cloud.documentai.v1.documentoutputconfig.gcsoutputconfig
The REST Docs are a little more clear
https://cloud.google.com/document-ai/docs/reference/rest/v1/DocumentOutputConfig#gcsoutputconfig
Note: For a REST API call and for other client libraries, the fieldMask is structured as a string (e.g. text,entities,pages.pageNumber)
I haven't tried this with the Node Client libraries before, but I'd recommend trying this as well if moving the parameter doesn't work on its own.
https://cloud.google.com/document-ai/docs/send-request#async-processor
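As a sketch of the suggested change, the request object could be built with fieldMask nested inside gcsOutputConfig. The helper name buildBatchRequest is hypothetical, the snippet only constructs the object and does not call the API, and it has not been run against Document AI. It keeps the { paths: [...] } shape from the question; whether the Node client also accepts the comma-separated string form is not verified here.

```javascript
// Hypothetical helper: builds the request object only, no API call is made.
function buildBatchRequest(processorName, inputGcsUri, outputGcsUri, maskPaths) {
  return {
    name: processorName,
    inputDocuments: {
      gcsDocuments: {
        documents: [{ mimeType: 'application/pdf', gcsUri: inputGcsUri }],
      },
    },
    documentOutputConfig: {
      gcsOutputConfig: {
        gcsUri: outputGcsUri,
        // fieldMask moved here, inside gcsOutputConfig,
        // instead of being a sibling of it as in the question.
        fieldMask: { paths: maskPaths },
      },
    },
  };
}

// Values copied from the question (the "region" key is left out of this sketch).
const params = buildBatchRequest(
  'projects/XXX/locations/us/processors/3e85a4841d13ce5',
  'gs://bubble-bucket-XXX/files/CymbalContract.pdf',
  'gs://bubble-bucket-XXXX/ocr/',
  ['entities']
);
console.log(JSON.stringify(params, null, 2));
```

The resulting params object would then be passed to client.batchProcessDocuments as in the question.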

How to generate automatic Id with Commit or Batch Document Firestore REST

Hi, I am creating documents with commit like this:
{
  "writes": [
    {
      "update": {
        "name": "projects/projectID/databases/(default)/documents/test/?documentId=",
        "fields": {
          "comment": {
            "stringValue": "Hello World!"
          }
        }
      }
    },
    {
      "update": {
        "name": "projects/projectID/databases/(default)/documents/test/?documentId=",
        "fields": {
          "comment": {
            "stringValue": "Happy Birthday!"
          }
        }
      }
    }
  ]
}
The ?documentId= parameter doesn't work the way it does when creating a single document. If I leave it empty, I get an error saying that I must specify the name of the document. How can I generate an automatic ID for each document?
Unfortunately, batch commits with auto generated documentId are not possible in the Firestore REST API. As you can see in this documentation, the Document object should be provided with a full path, including the documentID:
“Name:string
The resource name of the document, for example projects/{project_id}/databases/{databaseId}/documents/{document_path}.”
And if it was possible to omit the documentID, it would be mentioned in this documentation.
If you would like to have this implemented in the Firestore REST API, you can create a feature request in Google’s Issue Tracker so that they can consider implementing it.
I just came across the same problem and discovered that it is still not implemented.
I created a feature request for it here: https://issuetracker.google.com/issues/227875470.
So please go star it if you want this to be added.
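Until that is implemented, one possible workaround is to generate the IDs on the client and put them into each document name before sending the commit. The sketch below assumes a 20-character alphanumeric ID, similar in spirit to what the client SDKs produce; the helper names generateDocumentId and buildCommitBody are hypothetical, and the snippet only builds the request body without sending it.

```javascript
const crypto = require('crypto');

const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

// Hypothetical helper: random alphanumeric ID generated on the client.
function generateDocumentId(length = 20) {
  // Reject bytes >= 248 so every character is equally likely (no modulo bias).
  const maxUnbiased = Math.floor(256 / ALPHABET.length) * ALPHABET.length;
  let id = '';
  while (id.length < length) {
    for (const byte of crypto.randomBytes(length * 2)) {
      if (byte < maxUnbiased && id.length < length) {
        id += ALPHABET[byte % ALPHABET.length];
      }
    }
  }
  return id;
}

// Hypothetical helper: builds a commit body with one write per document,
// each with a full document name that ends in a generated ID.
function buildCommitBody(projectId, collection, documents) {
  return {
    writes: documents.map((fields) => ({
      update: {
        name: `projects/${projectId}/databases/(default)/documents/${collection}/${generateDocumentId()}`,
        fields,
      },
      // Precondition: fail instead of overwriting if the ID already exists.
      currentDocument: { exists: false },
    })),
  };
}

const body = buildCommitBody('projectID', 'test', [
  { comment: { stringValue: 'Hello World!' } },
  { comment: { stringValue: 'Happy Birthday!' } },
]);
console.log(JSON.stringify(body, null, 2));
```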

Creating a flat single relationship in Loopback 3

Loopback has a way to make a light relationship using referencesMany where you can say something like:
{
  "name": "SomeModel",
  "plural": "SomeModel",
  // ...,
  "relations": {
    "images": {
      "type": "referencesMany",
      "model": "Images",
      "options": {
        "validate": true
      }
    }
  }
}
Which will allow you to store an array of ObjectId in MongoDB.
I can then do something like:
SomeModel.find({ include: 'images' }) or GET to /api/SomeModel/?filter[include]=images to include a response with nested image objects that are related to the SomeModel.
Is there a good way to do this in a singular case (not an array of values)? Relate one parent to a child? HasOne puts a someModelId on the child, and I don't really want to pollute the Image model with BelongsTo, as it's polymorphic and belongs to all sorts of stuff.

Return an array with a single element using Gson and HAL (Hypertext Application Language)

I'm having problems using Halarious (Java library for the HAL specification) and Gson to serialise a list of links in the _links section with just a single element. The array is serialized into an object instead of being an array with a single link.
EXAMPLE:
What I'm getting now is:
{
  "year": 2008,
  "_embedded": {
    "items": {
      "_links": {
        "self": {
          "href": "/first_item"
        }
      }
    }
  }
}
Instead of:
{
  "year": 2008,
  "_embedded": {
    "items": {
      "_links": {
        "self": [
          {
            "href": "/first_item"
          }
        ]
      }
    }
  }
}
I solved the same problem with the _embedded section but I can't solve it for the links section.
Thanks
I solved it using a workaround. I don't use @HalLink but a surrogate ad hoc class that contains the whole hierarchy and whose instance is named "_links".
So, using a list of custom Href objects, when there is a single link I get back the expected self attribute as a list with a single element.
After all, the HAL documentation (http://stateless.co/hal_specification.html) says: "If you're unsure whether the link should be singular, assume it will be multiple" and, from http://blog.stateless.co/post/13296666138/json-linking-with-hal, "Where a relation may potentially have multiple links sharing the same key the value should be an array of link objects".
This way I won't break consumers by making them deal with either a JSON array or an object.

Pagination issue in RESTful API design

I am designing a RESTful API for a mobile application I am working on. My problem is with large collections containing many items. I understand that a good practice is to paginate a large number of results in a collection.
I have read the Facebook Graph API doc (https://developers.facebook.com/docs/graph-api/using-graph-api/v2.2), Twitter cursors doc (https://dev.twitter.com/overview/api/cursoring), GitHub API doc (https://developer.github.com/v3/) and this post (API pagination best practices).
Consider an example collection /resources in my API that contains 100 items named resource1 to resource100 and sorted descending. This is the response you will get upon a GET request (GET http://api.path.com/resources?limit=5):
{
  "_links": {
    "self": { "href": "/resources?limit=5&page=1" },
    "last": { "href": "/resources?limit=5&page=7" },
    "next": { "href": "/resources?limit=5&page=2" }
  },
  "_embedded": {
    "records": [
      { resource 100 },
      { resource 99 },
      { resource 98 },
      { resource 97 },
      { resource 96 }
    ]
  }
}
Now my problem is a scenario like this:
1- I GET /resources with above contents.
2- After that, something is added to the resources collection (say another device adds a new resource for this account). So now I have 101 resources.
3- I GET /resources?limit=5&page=2, which the initial response suggests will contain the next page of my results. The response would look like this:
{
  "_links": {
    "self": { "href": "/resources?page=2&limit=5" },
    "last": { "href": "/resources?page=7&limit=5" },
    "next": { "href": "/resources?page=3&limit=5" }
  },
  "_embedded": {
    "records": [
      { resource 96 },
      { resource 95 },
      { resource 94 },
      { resource 93 },
      { resource 92 }
    ]
  }
}
As you can see, resource 96 is repeated in both pages (a similar problem may happen if a resource gets deleted in step 2; in that case one resource will be lost).
Since I want to use this in a mobile app and in one list, I have to append the resources of each API call to the ones before it so I can have a complete list. But this is troubling. Please let me know if you have a suggestion. Thank you in advance.
P.S.: I have considered timestamp-like query strings instead of cursor-based pagination, but that will cause problems somewhere else for me. (Let me know if you need more info about that.)
We just implemented something similar to this for a mobile app via a REST API. The mobile app passed an additional query parameter which represents a timestamp at which elements in the page should be "frozen".
So your first request would look something like GET /resources?limit=5&page=1&from=2015-01-25T05:10:31.000Z and then the second page request (some time later) would increment the page count but keep the same timestamp: GET /resources?limit=5&page=2&from=2015-01-25T05:10:31.000Z
This also gives the mobile app control if it wants to differentiate a "soft" page (preserving the timestamp of the request of page 1) from a "hard refresh" page (resetting the timestamp to the current time).
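A rough server-side sketch of that idea, using an in-memory array instead of a real database: only resources created at or before the from timestamp are considered, so later inserts cannot shift the pages. The function name getPage and the createdAt field are assumptions for this illustration. Note that this freezes inserts only; deletions would still shift the pages.

```javascript
// Hypothetical handler logic: paginate over a snapshot "frozen" at `from`.
function getPage(resources, { limit, page, from }) {
  const cutoff = new Date(from).getTime();
  const frozen = resources
    // Ignore anything created after the frozen timestamp
    .filter((r) => new Date(r.createdAt).getTime() <= cutoff)
    // Newest first, matching the descending order in the question
    .sort((a, b) => new Date(b.createdAt) - new Date(a.createdAt));
  const start = (page - 1) * limit;
  return frozen.slice(start, start + limit);
}

// Ten resources created one second apart
const base = Date.parse('2015-01-25T05:00:00.000Z');
const resources = [];
for (let i = 1; i <= 10; i++) {
  resources.push({ id: i, createdAt: new Date(base + i * 1000).toISOString() });
}

const from = '2015-01-25T05:10:31.000Z';
const firstPage = getPage(resources, { limit: 5, page: 1, from });

// Another device adds a resource after the first page was fetched
resources.push({ id: 11, createdAt: '2015-01-25T05:11:00.000Z' });

const secondPage = getPage(resources, { limit: 5, page: 2, from });
console.log(firstPage.map((r) => r.id), secondPage.map((r) => r.id));
```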
Why not just maintain a set of seen resources?
Then when you process each response you can check whether the resource is already being presented.
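On the client, that could look roughly like the sketch below. The name createPageMerger is hypothetical, and it assumes each resource carries a unique id field. This handles duplicates caused by inserts; it does not recover a resource skipped because of a deletion.

```javascript
// Hypothetical client-side helper: appends pages to one list and
// skips any resource whose key has already been seen.
function createPageMerger(getKey = (record) => record.id) {
  const seen = new Set();
  const items = [];
  return {
    append(records) {
      for (const record of records) {
        const key = getKey(record);
        if (!seen.has(key)) {
          seen.add(key);
          items.push(record);
        }
      }
      return items;
    },
  };
}

const merger = createPageMerger();
merger.append([{ id: 100 }, { id: 99 }, { id: 98 }, { id: 97 }, { id: 96 }]);
// Page 2 after an insert: resource 96 shows up again
const merged = merger.append([{ id: 96 }, { id: 95 }, { id: 94 }, { id: 93 }, { id: 92 }]);
console.log(merged.map((r) => r.id));
```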