RESTful web services - best way to return result of an operation?

I am designing a RESTful API and I would like to know what the most RESTful way is to return details about an operation.
E.g. an operation on a resource occurs when some data is POSTed to a URL. HTTP status codes will indicate either success or failure for the operation. But apart from success/failure I need to indicate some other info to the client, such as an ID number.
So my question is, should the ID number be returned in an XML document in the response content, or should it be returned in some custom HTTP header fields? Which is more in line with the principles of REST? Or am I free to choose?

Returning an entity is a perfectly valid response to an HTTP POST.
You also do not need to return XML; you could just use the content type text/plain and simply return a string value.
Using a header would require you to define a new custom header which is not ideal. I would expect clients would have an easier time parsing a response body than extracting the information from a header.
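As a minimal sketch of the response-body approach, assuming a hypothetical handler named handle_create and an in-memory counter standing in for real ID assignment (neither comes from the question), a POST response that carries only the new ID as text/plain might look like this:

```python
import itertools

# Hypothetical ID source; a real service would get the ID from its data store.
_next_id = itertools.count(1)


def handle_create(payload: bytes):
    """Return (status, headers, body) for a successful POST.

    The new ID travels in the response body as plain text,
    so no custom header has to be defined.
    """
    new_id = next(_next_id)
    body = str(new_id).encode("ascii")
    headers = {
        "Content-Type": "text/plain",
        "Content-Length": str(len(body)),
    }
    return 200, headers, body
```

The client only has to read the body as a string; swapping text/plain for XML or JSON changes the body encoding but not the overall shape.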

XML document makes the most sense.

If it is just an ID number, it would save overhead to return it as an HTTP header. Building a correct XML document just for a single number would add much more overhead to the request.

Related

Response body when PATCHing a collection

In my REST API I have a very large collection; it contains millions of items. The path for this collection is /mycollection
Because this collection is so large it is not good practice to GET the whole collection so the API supports paging. Paging will be the primary way of getting the collection
GET /mycollection?page=1&page-size=100 HTTP/1.1
Say the original collection contains 1,000,000 items and I want to update 5,000, delete 3,000 and add 2,000 items. I could write my API to support updating the collection via either the PUT method or the PATCH method. While either method would require very different request bodies I believe both methods would require the exact same response body, i.e. the response body would have to contain the current representation of the entire updated resource, i.e. all 999,000 items in the collection.
As I mentioned earlier GETting the entire collection is just not realistic; it's too big. For the same reason I don't want PUTting or PATCHing to return the entire collection. Adding query parameters to a PUT or PATCH request wouldn't work either because neither PUT nor PATCH are safe methods.
So what would be the proper response in this large collection scenario?
I could respond with
HTTP/1.1 202 Accepted
Location: /mycollection?page=1&page-size=100
The 202 Accepted response code doesn't feel correct because the update would have been done synchronously. The Location header doesn't quite feel right either. I could maybe go with a Links header, but still it doesn't feel right.
Again I ask what would be the proper response in this large collection scenario?
This question is based on a misconception:
While either method would require very different request bodies I believe both methods would require the exact same response body, i.e. the response body would have to contain the current representation of the entire updated resource
Either can just return 204 No Content or 200 OK and no response body. There's no requirement that they include the full representation in the response.
You could optionally support this (perhaps along with the Prefer: return=representation header, or perhaps Content-Location header), but without this header I would say it's not even a convention that the current representation is returned. Generic clients shouldn't assume that the response body is the new representation unless these headers are used.
So, just return a 2xx and you're good to go.
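A rough sketch of that behaviour, assuming a hypothetical function respond_to_update and a plain dict of request headers keyed exactly as "Prefer" (both are illustrative assumptions, not part of any framework): return 204 with no body by default, and include the representation only when the client asks for it.

```python
def respond_to_update(request_headers, current_representation: bytes):
    """Return (status, headers, body) after a successful PUT or PATCH.

    By default nothing is sent back. The representation is included only
    when the client sent "Prefer: return=representation".
    """
    prefer = request_headers.get("Prefer", "")
    preferences = {p.strip().lower() for p in prefer.split(",")}
    if "return=representation" in preferences:
        headers = {
            "Content-Type": "application/json",
            # Tells the client that the preference was honoured.
            "Preference-Applied": "return=representation",
        }
        return 200, headers, current_representation
    return 204, {}, b""
```

For a collection with millions of items, the server could simply never honour the preference and always answer 204.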
So what would be the proper response in this large collection scenario?
Short version: you should probably treat a successful PUT as though it were a successful POST.
the intended meaning of the payload can be summarized as:
a representation of the status of, or results obtained from, the action
So the response could be as simple as
200 OK
Content-Type: text/plain
It worked!
Longer answer:
While either method would require very different request bodies I believe both methods would require the exact same response body, i.e. the response body would have to contain the current representation of the entire updated resource
This isn't right - If you review RFC 7231, you'll see that the response to PUT has this description
a representation of the status of the action
Returning the new representation of the resource is an edge case, not the default (see the specification of the Content-Location header).
For a state-changing request like PUT (Section 4.3.4) or POST (Section 4.3.3), it implies that the server's response contains the new representation of that resource, thereby distinguishing it from representations that might only report about the action (e.g., "It worked!"). This allows authoring applications to update their local copies without the need for a subsequent GET request.
That said, I'd suggest a review of your choice of method token. Both PUT and PATCH support remote authoring semantics - messages that ask a server to make its copy of a document look like your local copy. That's why, for example, the PUT specification has a bunch of constraints about adding validator header fields to the response. General purpose components are allowed to assume that they know what's going on, because all resources are supposed to understand these methods the same way.
But in your case, you can't really be said to be remote authoring the collection, because the client (and the general purpose components) don't have a representation of the collection, but instead only representations of pages of the collection.
If you were going to be consistent with the uniform interface, then you would either
allow remote authoring of the pages, or
abandon the method tokens that imply remote authoring
It is okay to use POST when the semantics of your request don't quite align with the standardized meanings
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”

Should I use GET or POST REST API Method?

I want to retrieve data about a bunch of resources. Let's say an Array of book id and the response is JSON Array of book objects. I want to send the request payload as JSON to the server.
Should I use the GET or the POST method?
Note:
I don't want to make multiple GET request for each book ID.
POST seems to be confusing as it is supposed to be used only when the request creates a resource or modifies the server state.
I want to retrieve data about a bunch of resources. Let's say an Array of book id and the response is JSON Array of book objects.
If you are thinking about passing the array of book id as the message body of the HTTP Request, then GET is a bad idea.
A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.
You should use POST instead
POST seems to be confusing as it is supposed to be used only when the request creates a resource or modifies the server state.
That's not quite right. POST can be used for anything -- see GraphQL or SOAP. But what you give up by using POST is the ability of intermediate components to participate in the conversation.
For example, for cases that are effectively read-only, you would like to use a safe method, because that allows pre-caching optimization, and automated retry of lost responses on an unreliable network. POST doesn't have extra semantic constraints, so you lose out.
What HTTP really wants is that you GET using the URI; this can be done in one of two relatively straightforward ways:
POST the ids to the server, to create a new resource (meaning that the server retains for itself a copy of the list of ids), and receive a new resource identifier back in exchange. Then GET using this new identifier any time you want to know the current representation of the results.
Encode the information you need into the URI itself. Most commonly, this is done using the query part of the URI, although that isn't strictly necessary. The downside here is that if the URI encoded representation of the array of ids is very long, you may have trouble with some implementations that enforce arbitrary URI limits.
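The second option can be sketched as follows, assuming a hypothetical /books collection path and a repeated id query parameter (both are illustrative choices; the question does not fix a URI layout):

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def books_uri(ids):
    """Build a GET-able URI that carries the book IDs in the query part."""
    # doseq=True repeats the key once per list element: id=1&id=2&...
    return "/books?" + urlencode({"id": ids}, doseq=True)


def ids_from_uri(uri):
    """Server side: recover the list of IDs from the query part."""
    query = urlsplit(uri).query
    return [int(value) for value in parse_qs(query).get("id", [])]
```

With a long list of IDs the resulting URI grows quickly, which is where the length limits mentioned above start to matter.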
There aren't always great answers:
The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction.
If I understand correctly, you want to get all of the items in a list in one pull. This would be possible using GET: the REST response is JSON, which by default can contain up to 100 items, and you can get more items if needed by specifying $top.
As far as writing back to the server, POST would be what you're looking for; to my understanding this would need to be one request per item.
You can use a GET request and put your request data (the book ID array) in the data section of your Ajax (or whatever you're going to use) request. See How to pass parameters in GET requests with jQuery

REST Check if resource exists, how to handle on server side?

how to handle resource checking on server side?
For example, my api looks like:
/books/{id}
After googling i found, that i should use HEAD method to check, if resource exists.
https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
I know, that i can use GET endpoint and use HEAD method to fetch information about resource and server does not return body in this case.
But what should i do on server side?
I have two options.
One endpoint marked as GET. In this endpoint I can use the GET method to fetch data and HEAD to check if the resource is available.
Two endpoints. One marked as GET, second as HEAD.
Why i'm considering second solution?
Let's assume, that GET request fetch some data from database and process them in some way which takes some time, eg. 10 ms
But what i actually need is only to check if data exists in database. So i can run query like
select count(*) from BOOK where id = :id
and immediately return status 200 if result of query is equal to 1. In this case i don't need to process data so i get a faster response time.
But... resource in REST is a object which is transmitted via HTTP, so maybe i should do processing data but not return them when i use HEAD method?
Thanks in advance for your answer!
You could simply delegate the HEAD handler to the existing GET handler and return the status code and headers only (ignoring the response payload).
That's what some frameworks such as Spring MVC and JAX-RS do.
See the following quote from the Spring MVC documentation:
@GetMapping — and also @RequestMapping(method=HttpMethod.GET), are implicitly mapped to and also support HTTP HEAD. An HTTP HEAD request is processed as if it were HTTP GET except but instead of writing the body, the number of bytes are counted and the "Content-Length" header set.
[...]
@RequestMapping method can be explicitly mapped to HTTP HEAD and HTTP OPTIONS, but that is not necessary in the common case.
And see the following quote from the JAX-RS documentation:
HEAD and OPTIONS requests receive additional automated support. On receipt of a HEAD request an implementation MUST either:
Call a method annotated with a request method designator for HEAD or, if none present,
Call a method annotated with a request method designator for GET and discard any returned entity.
Note that option 2 may result in reduced performance where entity creation is significant.
Note: Don't use the old RFC 2616 as reference anymore. It was obsoleted by a new set of RFCs: 7230-7235. For the semantics of the HTTP protocol, refer to the RFC 7231.
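The delegation idea from this answer can be sketched without any framework. The handler names get_book and head_book and the in-memory BOOKS dict are hypothetical, chosen only to illustrate the pattern:

```python
# Hypothetical in-memory data standing in for the database.
BOOKS = {1: b'{"id": 1, "title": "Example"}'}


def get_book(book_id):
    """GET handler: return (status, headers, body)."""
    body = BOOKS.get(book_id)
    if body is None:
        return 404, {}, b""
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return 200, headers, body


def head_book(book_id):
    """HEAD handler: reuse GET, keep status and headers, drop the body."""
    status, headers, _body = get_book(book_id)
    return status, headers, b""
```

This keeps HEAD and GET consistent by construction, at the cost of doing the full GET work. A dedicated HEAD handler that runs only a cheap existence query is the optimisation the JAX-RS note hints at.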
Endpoint should be the same and server side script should make decision what to do based on method. If method is HEAD, then just return suitable HTTP code:
204 if content exists but the server doesn't return it
404 if not exists
4xx or 5xx on other error
If method is GET, then process request and return content with HTTP code:
200 if content exists and the server returns it
404 if not exists
4xx or 5xx on other error
The important thing is that the URL should be the same; only the method should be different. If the URL is different, then we are talking about different resources in the REST context.
Your reference for HTTP methods is out of date; you should be referencing RFC 7231, section 4.3.2
The HEAD method is identical to GET except that the server MUST NOT send a message body in the response (i.e., the response terminates at the end of the header section).
This method can be used for obtaining metadata about the selected representation without transferring the representation data and is often used for testing hypertext links for validity, accessibility, and recent modification.
You asked:
resource in REST is a object which is transmitted via HTTP, so maybe i should do processing data but not return them when i use HEAD method?
That's right - the primary difference between GET and HEAD is whether the server returns a message-body as part of the response.
But what i actually need is only to check if data exists in database.
My suggestion would be to use a new resource for that. "Resources" are about making your database look like a web site. It's perfectly normal in REST to have many URIs that map to queries that use the same predicate.
Jim Webber put it this way:
The web is not your domain, it's a document management system. All the HTTP verbs apply to the document management domain. URIs do NOT map onto domain objects - that violates encapsulation. Work (ex: issuing commands to the domain model) is a side effect of managing resources. In other words, the resources are part of the anti-corruption layer. You should expect to have many many more resources in your integration domain than you do business objects in your business domain.

What is the correct REST method for performing server side validation?

If I don't want to update a resource, but I just want to check if something is valid (in my case, a SQL query), what's the correct REST method?
I'm not GETting a resource (yet). I'm not PUTting, POSTing, or PATCHing anything (yet). I'm simply sending part of a form back for validation that only the server can do. Another equivalent would be checking that a password conforms to complexity requirements that can only be known by the domain, or perhaps there are other use cases.
Send object, validate, return response, continue with form. Using REST. Any ideas? Am I missing something?
What is the correct REST method for performing server side validation?
Asking whether a representation is valid should have no side effects on the server; therefore it should be safe.
If the representation that you want to validate can be expressed within the URI, then you should prefer to use GET, as it is the simplest choice and gives you the best semantics for caching the answer. For example, if we were trying to use a web site to create a validation API for a text (an XML or JSON validator, for instance), then we would probably have a form with a text area control, and construct the identifier that we need by processing the form input.
If the representation that you want to validate cannot be expressed within the URI, then you are going to need to put it into the message body.
Of the methods defined by RFC 7231, only POST is suitable.
Additional methods, outside the scope of this specification, have been standardized for use in HTTP. All such methods ought to be registered within the "Hypertext Transfer Protocol (HTTP) Method Registry" maintained by IANA, as defined in Section 8.1.
The HTTP method registry gives you a lot of options. For this case, I wouldn't bother with them unless you find either a perfect match, or a safe method that accepts a body and is close enough.
So maybe REPORT, which is defined in RFC 3253; I tend to steer clear of WebDAV methods, as I'm not comfortable stretching specifications for "remote Web content authoring operations" outside of their remit.
TLDR; There's a duplicate question around the topic of creating validation endpoints via REST:
In your case a GET request would seem sufficient.
The HTTP GET method is used to read (or retrieve) a representation of a resource. In the “happy” (or non-error) path, GET returns a representation in XML or JSON and an HTTP response code of 200 (OK). In an error case, it most often returns a 404 (NOT FOUND) or 400 (BAD REQUEST).
restapitutorial.com
For validating your SQL query you could use a GET request to get the valid state of your query potentially using a query parameter to achieve this.
GET: api/validateQuery?query="SELECT * FROM TABLE"
Returning:
200 (OK): Valid Query
400 (BAD REQUEST): Invalid Query
404 (NOT FOUND): Query valid but returns no results (if you plan on executing the query)
I think this type of endpoint is best served as a POST request. As defined in the spec, POST requests can be used for
Providing a block of data, such as the fields entered into an HTML form, to a data-handling process
The use of GET as suggested in another post is, for me, misleading and impractical given the complexity and arbitrariness of SQL queries.
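A minimal sketch of such a POST validation handler follows. It assumes SQLite's dialect, a hypothetical book table, and a hypothetical function name validate_query, none of which come from the question. The query is prefixed with EXPLAIN so that it is compiled but not run:

```python
import sqlite3


def validate_query(method: str, body: str):
    """Return (status, message) for a validation request.

    The SQL text arrives in the request body, so POST is expected.
    """
    if method != "POST":
        return 405, "Method Not Allowed"
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, so table and column names can be checked.
        conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
        # EXPLAIN compiles the statement without executing it.
        conn.execute("EXPLAIN " + body)
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return 400, f"Invalid query: {exc}"
    finally:
        conn.close()
    return 200, "Valid query"
```

A real service would validate against its own database engine and schema; the point here is only that the query travels in the body and the answer comes back as a status code plus a short message.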

Which HTTP Verb for Read endpoint with request body

We are exposing an endpoint that will return a large data set. There is a background process which runs once per hour and generates the data. The data will be different after each run.
The requester can ask for either the full set of data or a subset. The subset is determined via a set of parameters, but the parameters are too long to fit into a URI, which has a max length of 2,083 characters. https://www.google.co.uk/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=uri%20max%20length
The parameters can easily be sent in the request body, but which is the correct HTTP verb to use?
GET would be ideal but use of a body 'has no semantic meaning to a GET request' HTTP GET with request body
PUT is not appropriate because there is no ID and no data is being updated or replaced.
POST is not appropriate because a new resource is not being created and, more importantly, the server is not generating an ID.
http://www.restapitutorial.com/lessons/httpmethods.html
GET (read) would seem to be the most appropriate but how can we include the complex set of parameters to determine the response?
Many thanks
John
POST is the correct method. POST should be used for any operation that's not standardized by HTTP, which is your case, since there's no standard for a GET operation with a body. The reference you linked is just directly mapping HTTP methods to CRUD, which is a REST anti-pattern.
You are right that GET with body is to be avoided. You can experiment with other safe methods that take a request body (such as REPORT or SEARCH), or you can indeed use POST. I see no reason why the latter is wrong; what you're citing is just an opinion, not the spec.
Assuming that the queries against that big dataset are not totally random, you should consider adding stored queries to your API. This way clients can add, remove, update queries (through request body) using POST DELETE PUT. Maybe you can call them "reports".
This way the GET requests need only a reference to these queries/reports as a query parameter; you don't have to send all the details with every request.
But only if not all the requests from clients are unique.
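The stored-query idea can be sketched as two handlers over an in-memory store. The names create_report and get_report_results, the /reports path, and the ids parameter are hypothetical and serve only as illustration:

```python
import itertools

# Hypothetical in-memory store of saved query parameters.
_reports = {}
_report_ids = itertools.count(1)


def create_report(params: dict):
    """POST /reports: store the (possibly large) parameters once."""
    report_id = next(_report_ids)
    _reports[report_id] = params
    # The client receives a short URI it can GET as often as it likes.
    return 201, {"Location": f"/reports/{report_id}"}, b""


def get_report_results(report_id: int, dataset: list):
    """GET /reports/{id}: apply the stored parameters to the current data."""
    params = _reports.get(report_id)
    if params is None:
        return 404, []
    wanted = set(params["ids"])
    return 200, [row for row in dataset if row["id"] in wanted]
```

Because the GET carries only the report identifier, it stays safe and cacheable, and the results reflect whatever the hourly background process last generated.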