RESTful design - how to model an entity's attachments

I am trying to model an entity's attachments in REST. Let's say a defect entity can have multiple attachments. Every attachment has a description and some other properties (last modified, file size, ...). The attachment itself is a file in any format (JPEG, DOC, ...).
I was wondering how I should model it RESTfully.
I thought about the following two options:
First approach (using same resource, different representations):
GET , content-type:XML on http://my-app/defects/{id}/attachments will return the defect's attachments metadata in XML format (description, last modified, file size...)
GET , content-type:gzip on http://my-app/defects/{id}/attachments will return the defect's attachments in a zip file
GET , content-type:mime multi-part on http://my-app/defects/{id}/attachments will return the defect's attachments in a multi-part message (binary data and XML metadata altogether)
POST, content-type:XML on http://my-app/defects/{id}/attachments will create new attachment, metadata only no file attached (then the user has to send PUT request with the binary data)
POST , content-type:mime multi-part on http://my-app/defects/{id}/attachments will create the attachment; the client can send both the metadata and the file itself in a single round trip
Second approach (separate the attachment's data from the metadata):
GET , content-type:XML on http://my-app/defects/{id}/attachments will return the defect's attachments metadata in XML format (description, last modified, file size...)
GET , content-type:gzip on http://my-app/defects/{id}/attachments/files will return the defect's attachments binary data in a single zip
Creating a new attachment, first call:
POST, content-type:XML on http://my-app/defects/{id}/attachments will create new attachment, metadata only no file attached (then the user has to send PUT request with the binary data)
Then add the binary data itself:
POST , content-type:mime multi-part on http://my-app/defects/{id}/attachments/{id}/file will create the attachment file
On one hand, the first approach is more robust and efficient, since the client can create or get the attachment's metadata and binary data in a single round trip. On the other hand, I am a bit reluctant to use the MIME multipart representation, as it's more cumbersome to consume and produce.
EDIT: I checked out the Flickr upload REST API. It seems they use multipart messages to include both the photo and the photo attributes.

Much of this problem has already been solved by the Atom Pub spec. See here
One thing to be careful about in your proposed solutions is that you are using content negotiation to deliver different content. I believe that is considered bad. Content negotiation should only deliver different representations of the same content.

Don't manage metadata separately. A two-part action defeats the point of REST.
One smooth GET/POST/PUT/DELETE with one -- relatively -- complex payload is what's typically done.
The fact that it's multiple underlying "objects" in "tables" is irrelevant to REST.
At the REST level, it's just one complex object's state transmitted with one message.

Related

Fetch only some parts of a large JSON in Dart

I wonder how to fetch only a part of a large JSON file.
In my example it's not that large, but in my project the file is sometimes around 7000 lines long.
Example JSON: https://statsapi.web.nhl.com/api/v1/schedule?expand=schedule.linescore
How do I fetch only the team name, for example?
Normally, from a network request you only get what the server serves you. You can't fetch only a portion of it. You can process the data after the response arrives from the server, referring to the portion you need by its keys, like this:
response['totalItems']
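As a rough illustration of the point above, the sketch below parses a complete response and only then picks out the team names. The inline sample is hand-written and merely assumes that the schedule payload nests teams under dates, games, teams, home/away, team, name; the function name team_names is hypothetical.

```python
import json

# Hand-written sample that mimics the assumed shape of the schedule response.
SAMPLE = """
{
  "totalItems": 1,
  "dates": [
    {
      "date": "2020-01-01",
      "games": [
        {
          "teams": {
            "away": {"team": {"id": 1, "name": "Team A"}},
            "home": {"team": {"id": 2, "name": "Team B"}}
          }
        }
      ]
    }
  ]
}
"""


def team_names(payload):
    """Parse the whole document, then keep only the team names."""
    data = json.loads(payload)
    names = []
    for day in data.get("dates", []):
        for game in day.get("games", []):
            for side in ("away", "home"):
                names.append(game["teams"][side]["team"]["name"])
    return names


print(team_names(SAMPLE))
```

The whole body is still downloaded and decoded; the filtering only reduces what the rest of the program has to hold on to.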

REST service return collection of XML documents

We are writing a REST service to query for PDF files. The service consumer wants the metadata for those PDFs, not the actual PDFs. The metadata happens to be stored as an XML document, one XML document for each PDF resource. The resource and the resource's metadata are completely different files.
What should the query response look like?
Typically we use JSON for request/response bodies. Should the response body be a JSON object that contains a collection of URLs, where each URL links to a metadata document? This seems pretty clean, but causes a lot of unnecessary network traffic because the consumer must send a GET request for each metadata document.
Should the XML of the metadata documents be embedded in the response body's JSON object? (yuck!)
Is there a solution that is both clean and efficient?
Based on some clarifying comments, I'm going to suggest that you don't write a "RESTful" API. You don't need one. You don't have objects that you need to interact with in any complex way. You don't have state that needs to be affected (REST means Representational State Transfer).
You just need an HTTP API. Just return the XML file. You can also provide an endpoint that returns multiple XML documents in a single ZIP archive, if you want.
So do something like this:
/api/host/123 - download the PDF file (Content-Type: application/pdf) - You didn't say if you already have an endpoint for PDFs, but if you did want one, this is how I would structure it.
/api/host/123/metadata - download the XML metadata (Content-Type: text/xml)
/api/host/bulk_metadata - download a ZIP of the metadata for file IDs listed in a POST parameter (Content-Type: application/zip)
Use Content-Disposition: attachment; filename="{filename}.{pdf|xml|zip}" to tell browsers to download the content to disk rather than displaying it inline.
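A minimal sketch of what the bulk endpoint could do internally, assuming the metadata is already available as XML strings keyed by file ID. The function names build_metadata_zip and download_headers are hypothetical, and the web framework wiring is left out on purpose.

```python
import io
import zipfile


def build_metadata_zip(documents):
    """Pack {file_id: xml_text} into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_id, xml_text in documents.items():
            archive.writestr(f"{file_id}.xml", xml_text)
    return buffer.getvalue()


def download_headers(filename, content_type):
    """Headers that ask a browser to save the body instead of displaying it."""
    return {
        "Content-Type": content_type,
        "Content-Disposition": f'attachment; filename="{filename}"',
    }


payload = build_metadata_zip({"123": "<metadata><title>Report</title></metadata>"})
headers = download_headers("metadata.zip", "application/zip")
```

Building the archive in memory keeps the example short; for large result sets a streamed response would probably be the better choice.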

RESTful url - getting new subentity

There are 2 models: Entity and Subentity. Entity can have many connected Subentities (one:many relation).
There is a method on server that returns new Subentity (let's call it GetEmptySubentity). Point is, when you want to create new Subentity, you press a button, and model comes from server with some fields pre-filled. Some of those Subentity pre-filled values depend on according Entity, so I need to pass an Entity id in this request.
So should the correct url to get the empty Subentity be like /Entity/{id}/Subentity/empty? Or I am getting something wrong?
Yes, you are. According to the uniform interface / HATEOAS constraint, you should send hyperlinks to your REST clients, and they should use the API by following those hyperlinks. To do this you need a hypermedia format, for example HTML, ATOM+XML, HAL+JSON, or JSON-LD with Hydra. With HTML, the result should contain an HTML form with input fields that have default values. You should add semantics to that form with RDFa, so that by processing the HTML your REST client will know that the link is about creating a new resource. Of course, the other hypermedia formats are easier to parse. With them you can apply the same concept using RDF (with JSON-LD or ATOM, for example), link relations with vendor-specific MIME types (with HAL or ATOM, for example), or a custom solution that describes those input fields. So you usually get the necessary information together with the hyperlink, and you don't have to send another request to get the default values.
If you want to make things complicated, you can send a request for the default values to the entity itself, so that you receive property values rather than a form with input fields. Optionally, you can send a request that returns the entire link; for example, GET /Entity/{id}/SubEntity?offset=0&count=0 can return an empty array of subentities together with the form for creation. You can use additional query or path parameters if that form is really big and you don't want to send it with every response related to the SubEntity collection. The URL specification says only that the path should contain the hierarchical part and the query should contain the non-hierarchical part of the URL.
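To make the idea of shipping the defaults together with the hyperlink more concrete, here is a small sketch of a HAL-like response for an empty subentity collection. Everything in it is an assumption made for illustration: the "create" link relation, the "method" hint and the "template" member are not defined by HAL, and subentity_collection is a hypothetical helper.

```python
import json


def subentity_collection(entity_id, defaults):
    """Build a HAL-like representation of an empty subentity collection."""
    base = f"/Entity/{entity_id}/SubEntity"
    return {
        "_links": {
            "self": {"href": f"{base}?offset=0&count=0"},
            # Hypothetical link relation telling the client where to POST.
            "create": {"href": base, "method": "POST"},
        },
        # Hypothetical template: pre-filled values the client edits and posts back.
        "template": defaults,
        "_embedded": {"items": []},
    }


doc = subentity_collection(42, {"name": "", "owner": "entity-42"})
print(json.dumps(doc, indent=2))
```

A client that understands the "create" relation can render the template as a form and POST the result, without a separate request for an "empty" subentity.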
Btw. REST is just a delivery method, you don't have to map it to your database entities. The REST resource and URL structure can be completely different from your database, since you can use any type of data storage mechanisms with REST, even the file system...

best approach to design a rest web service with binary data to be consumed from the browser

I'm developing a json rest web service that will be consumed from a single web page app built with backbone.js
This API will let the consumer upload files related to some entity, like pdf reports related to a project
Googling around and doing some research on Stack Overflow, I came up with these possible approaches:
First approach: base64 encoded data field
POST: /api/projects/234/reports
{
  "author": "xxxx",
  "abstract": "xxxx",
  "filename": "xxxx",
  "filesize": 222,
  "content": "<base64 encoded binary data>"
}
Second approach: multipart form post:
POST: /api/projects/234/reports
{
  "author": "xxxx",
  "abstract": "xxxx"
}
as a response I'll get a report id, and with that I shall issue another post
POST: /api/projects/234/reports/1/content
enctype=multipart/form-data
and then just send the binary data
(have a look at this: https://stackoverflow.com/a/3938816/47633)
Third approach: post the binary data to a separate resource and save the href
first I generate a random key at the client and post the binary content there
POST: /api/files/E4304205-29B7-48EE-A359-74250E19EFC4
enctype=multipart/form-data
and then
POST: /api/projects/234/reports
{
  "author": "xxxx",
  "abstract": "xxxx",
  "filename": "xxxx",
  "filesize": 222,
  "href": "/api/files/E4304205-29B7-48EE-A359-74250E19EFC4"
}
(see this: https://stackoverflow.com/a/4032079/47633)
I just wanted to know if there's any other approach I could use, the pros and cons of each, and whether there's an established way to deal with this kind of requirement.
The big con I see with the first approach is that I have to fully load and base64-encode the file on the client.
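To put a number on that cost, the sketch below builds the JSON body of the first approach and compares the raw size with the encoded size. The helper name build_report_payload and the 300-byte stand-in file are assumptions made for the example.

```python
import base64
import json


def build_report_payload(author, abstract, filename, data):
    """Build the JSON body of the first approach, with the file inlined as Base64."""
    return json.dumps({
        "author": author,
        "abstract": abstract,
        "filename": filename,
        "filesize": len(data),
        "content": base64.b64encode(data).decode("ascii"),
    })


raw = bytes(300)  # stand-in for the file contents, fully loaded in memory
body = build_report_payload("xxxx", "xxxx", "report.pdf", raw)
encoded_len = len(json.loads(body)["content"])
print(len(raw), encoded_len)  # Base64 emits 4 characters for every 3 input bytes
```

So besides holding the entire file in memory, the client sends roughly a third more bytes than the file itself.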
some useful resources:
Post binary data to a RESTful application
What is a good way to transfer binary data to a HTTP REST API service?
How do I upload a file with metadata using a REST web service?
Bad idea to transfer large payload using web services?
https://stackoverflow.com/a/5528267/47633
My research results:
Single request (data included)
The request contains the metadata. The data is a property of the metadata and is encoded (for example, as Base64).
Pros:
transactional
always valid (no missing metadata or data)
Cons:
encoding makes the request larger (Base64 adds roughly a third to the size of the data)
Examples:
Twitter
GitHub
Imgur
Single request (multipart)
The request contains one or more parts with metadata and data.
Content types:
multipart/form-data
multipart/mixed
multipart/related
Pros:
transactional
always valid (no missing metadata or data)
Cons:
content type negotiation is complex
content type for data is not visible in WADL
Examples:
Confluence (with parts for data and for metadata)
Jira (with one part for data; the only metadata is the part headers for file name and MIME type)
Bitbucket (with one part for data, no metadata)
Google Drive (with one part for metadata and one part for data)
Single request (metadata in HTTP header and URL)
The request body contains the data, and the HTTP headers and the URL contain the metadata.
Pros:
transactional
always valid (no missing metadata or data)
Cons:
no nested metadata possible
Examples:
S3 GetObject and PutObject
Two requests
One request for metadata and one or more requests for data.
Pros:
scalability (for example: data request could go to repository server)
resumable (see for example Google Drive)
Cons:
not transactional
not always valid (before the second request, one part is missing)
Examples:
Google Drive
YouTube
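For the single-request multipart variant listed above, a hand-rolled encoder might look like the sketch below. It assumes a JSON metadata part named "metadata" and a binary part named "file"; those part names, the function encode_multipart and the sample values are hypothetical, and real clients would normally let their HTTP library do this.

```python
import json
import uuid


def encode_multipart(metadata, filename, data, mime):
    """Return (content_type, body) for a two-part multipart/form-data request."""
    boundary = uuid.uuid4().hex
    delimiter = f"--{boundary}".encode("ascii")
    lines = [
        delimiter,
        b'Content-Disposition: form-data; name="metadata"',
        b"Content-Type: application/json",
        b"",
        json.dumps(metadata).encode("utf-8"),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode("utf-8"),
        f"Content-Type: {mime}".encode("ascii"),
        b"",
        data,
        delimiter + b"--",  # closing delimiter
        b"",
    ]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)


content_type, body = encode_multipart(
    {"author": "xxxx", "abstract": "xxxx"},
    "report.pdf",
    b"%PDF-1.4 sample bytes",
    "application/pdf",
)
```

The file bytes travel unencoded, so the Base64 overhead disappears, while metadata and data still arrive in one request.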
I can't think of any other approaches off the top of my head.
Of your 3 approaches, I've worked with method 3 the most. The biggest difference I see is between the first method and the other 2: Separating metadata and content into 2 resources
Pro: Scalability
While your solution involves posting to the same server, this can easily be changed to point the content upload to a separate server (e.g. Amazon S3).
In the first method, the same server that serves metadata to users will have a process blocked by a large upload.
Con: Orphaned Data/Added complexity
failed uploads (either metadata or content) will leave orphaned data in the server DB
Orphaned data can be cleaned up with a scheduled job, but this adds code complexity
Method 2 reduces the chance of orphans, at the cost of a longer client wait time, since you're blocking on the response to the first POST.
The first method seems the most straightforward to code. However, I'd only go with the first method if you anticipate this service being used infrequently and you can set a reasonable limit on the size of user file uploads.
I believe the ultimate method is number 3 (separate resource), mainly because it maximizes the value I get from the HTTP standard, which matches how I think of REST APIs. For example, assuming a well-behaved HTTP client is in use, you get the following benefits:
Content compression: servers can respond with a compressed result if clients indicate they support it. Your API is unchanged, existing clients continue to work, and future clients can make use of it.
Caching: If-Modified-Since, ETag, etc. Clients can avoid refetching the binary data altogether.
Content type abstraction: for example, you require an uploaded image, and it can be of type image/jpeg or image/png. The HTTP headers Accept and Content-Type give us some elegant semantics for negotiating this between clients and servers without having to hardcode it all as part of our schema and/or API.
On the other hand, I believe it's fair to conclude that this method is not the simplest if the binary data in question is not optional. In which case the Cons listed in Eric Hu's answer will come into play.
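The caching benefit can be sketched with a simplified conditional GET on the binary resource. This is only an illustration under stated assumptions: the ETag is taken to be a SHA-256 digest of the content, the If-None-Match header is compared as a single exact value (no lists or weak validators), and make_etag and conditional_get are hypothetical names.

```python
import hashlib


def make_etag(content):
    """Derive a quoted entity tag from the bytes of the representation."""
    return '"' + hashlib.sha256(content).hexdigest() + '"'


def conditional_get(content, request_headers):
    """Return (status, headers, body), answering 304 when the client's copy is current."""
    etag = make_etag(content)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Type": "application/pdf"}, content


pdf = b"binary report bytes"
status, headers, body = conditional_get(pdf, {})
status_again, _, body_again = conditional_get(pdf, {"If-None-Match": headers["ETag"]})
print(status, status_again)
```

On the second request the body is empty, so the client keeps using its cached copy of the file.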

Restful's principle

What is the real meaning of "resources with multiple representations" in REST? After reading InfoQ's "A Brief Introduction to REST", I am confused. What are representations?
A representation is a certain way to display and/or transfer data. The same resource can be represented in different ways:
As an HTML page
As an XML document
As a JSON data structure
As plain text
Even as a PDF file if that would be desired
...
You can exchange "representation" with "data format" to get a better understanding.
Examples for a "customer" resource:
HTML:
<h1>John Doe</h1>
XML:
<customer-name>John Doe</customer-name>
JSON:
{
"UserName" : "John Doe"
}
A metaphor:
Just think of a picture. It can be represented as a bitmap, PNG, JPEG, or in many other formats and data structures. All of them show the same picture, but they differ in their internal structure (their "representation").
Practical considerations:
In a web application environment the most common representation is (X)HTML as the standard output sent to the browser. Followed by XML and JSON when it comes to Ajax and automated access to the web application.
A Resource is basically a collection of data, in the example it is the associated data with a given customer.
When you retrieve a resource, you get a representation of it. Now for most data there are multiple representations available. Think of a table of data, or a chart, etc...
In the example you define which representation you would like to receive by setting the HTTP Accept header. In the first example in an xml format, in the second one in a vcard format.
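A toy sketch of that selection, reusing the customer examples quoted earlier in this thread: one resource, with the representation picked from the Accept header. It assumes an exact match on a single media type (no q-values or wildcards), and the render function is hypothetical.

```python
import json
from xml.sax.saxutils import escape

CUSTOMER = {"name": "John Doe"}


def render(customer, accept):
    """Return (content_type, body) for one resource in the requested format."""
    if accept == "application/json":
        return accept, json.dumps({"UserName": customer["name"]})
    if accept == "application/xml":
        return accept, f"<customer-name>{escape(customer['name'])}</customer-name>"
    # Fall back to HTML when nothing else was asked for.
    return "text/html", f"<h1>{escape(customer['name'])}</h1>"


for media_type in ("application/json", "application/xml", "text/html"):
    print(render(CUSTOMER, media_type))
```

The customer stays the same in all three cases; only the format on the wire changes.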
Take a look at this: REST Wikipedia article
A resource is something on the server, a "thing", and the article is just saying you can have multiple message formats returned about that "thing" that describe it in different ways...
Have a look at Roy Fielding's dissertation which defines REST.
Actually "representation" is more abstract than these answers suggest. "Representation" simply means what you get back is not necessarily the entire resource. For example, I have an employee record which is a resource in my corporate HR database. "Employee" is an obvious resource noun to expose through a RESTful architecture. But if you access my employee ID through the e-mail URI, the representation will be entirely different than the representation you see when accessing my employee ID through the HR benefits URI.
What DR's answer describes (JSON, XML, etc.) are actually called media-types in REST terminology. It is simply the data format of the response.