Query parameter with http PUT method - REST API - rest

I am passing request object and path variables in my http PUT method to update a record.
Is it OK to pass additional data (such as a timestamp) as query parameters, which I want to save in the final record? The record has one additional field (say, timestamp) compared to the request object.

Is it OK to pass additional data (such as a timestamp) as query parameters, which I want to save in the final record? The record has one additional field (say, timestamp) compared to the request object.
Short answer: that probably doesn't mean what you think it does.
Is it OK to use query parameters in a PUT request? Absolutely. Query parameters are just another piece of the resource identifier.
/15f3221f-ee3b-4155-bc75-f80855a9187e/abc
/15f3221f-ee3b-4155-bc75-f80855a9187e?abc
Those are two different resource identifiers, and the machines won't assume that they identify the same resource, but all of the http methods that would apply to one would also apply to the other, and mean the same thing.
There's nothing magic about abc, of course; you could use a timestamp there:
/15f3221f-ee3b-4155-bc75-f80855a9187e?1970-01-01
Changing the timestamp changes the identifier; as far as general purpose components are concerned, these next two examples identify different resources
/15f3221f-ee3b-4155-bc75-f80855a9187e?1970-01-01
/15f3221f-ee3b-4155-bc75-f80855a9187e?1970-01-02
You might imagine them as two different pages of a desktop calendar. Modifying the list of appointments in your 1970-01-02 document shouldn't change your 1970-01-01 calendar at all.
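To make the point about identifiers concrete, here is a minimal sketch using Python's standard urllib.parse. It only illustrates that the query is one component of the identifier; the two URIs are the ones from the example above, and the variable names are made up for illustration.

```python
from urllib.parse import urlsplit

# The two identifiers from the example above; they differ only in the query.
a = urlsplit("/15f3221f-ee3b-4155-bc75-f80855a9187e?1970-01-01")
b = urlsplit("/15f3221f-ee3b-4155-bc75-f80855a9187e?1970-01-02")

# The path components match...
same_path = a.path == b.path

# ...but the identifiers as a whole do not, because the query is part of them.
same_identifier = a == b

print(same_path, same_identifier)
```

A general purpose component comparing these two identifiers has no reason to treat them as the same resource, even though the paths are equal.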
Metadata about a representation would normally be embedded within the representation itself (think HEAD element in an HTML document) or in the HTTP Headers. As far as I can tell, we don't have a standardized header that matches the semantics you want.
All that said: the server has a LOT of freedom in how it interprets a request to update the representation of /15f3221f-ee3b-4155-bc75-f80855a9187e?1970-01-02. For instance, the updating of that resource might also update the representations of many other resources.
(Do keep in mind caching, though - there are only a limited number of ways we can advise a general purpose client that some cached representations have been invalidated by a request.)

Related

Is using the If-Modified-Since header to filter a resource collection to only recent ones in a REST API considered a valid approach?

I'm designing a REST API where I have a need to provide the option to GET only the resources in a collection that were created or modified recently, based on a client-provided timestamp (which, in turn, will have been generated by the API in a previous response). I'm considering the use of the Last-Modified and If-Modified-Since headers for this purpose.
Earlier questions here (like Is it valid to modify a REST API representation based on a If-Modified-Since header?) seem to indicate that this is frowned upon, on the grounds that RFC2616 indicates that the purpose of these headers is related to caching. However, since then, RFC2616 has been superseded by RFC7232, which states that
If-Modified-Since is typically used for two distinct purposes: 1) to allow efficient updates of a cached representation that does not have an entity-tag and 2) to limit the scope of a web traversal to resources that have recently changed.
My interpretation is that my use case of allowing retrieval of all changes to the collection since the last retrieval is covered by the second purpose.
So I have two questions:
Is this interpretation correct, or am I missing something subtle here?
Even if my interpretation is correct, does that make it a good practice to use these headers in this way? In other words: what other reasons would there be to not use these headers after all and instead, for example, include a timestamp in the response and allow the client to provide that back in the query string for the next request?
Is this interpretation correct, or am I missing something subtle here?
I believe RFC 7234 contradicts your interpretation.
If an If-None-Match header field is not present, a request containing an If-Modified-Since header field (Section 3.3 of [RFC7232]) indicates that the client wants to validate one or more of its own stored responses by modification date. A cache recipient SHOULD generate a 304 (Not Modified) response (using the metadata of the selected stored response) if one of the following cases is true....
The broad problem here is that a general purpose cache isn't going to know that your resource / your server have a different understanding of what the standard headers mean, and therefore clients are not going to have the experience you want.
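As a rough sketch of what a general purpose component expects from If-Modified-Since, the function below returns either 304 or 200 for the whole representation, never a filtered subset. The function name and its arguments are hypothetical, and it assumes last_modified is a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: datetime,
                           if_modified_since: Optional[str]) -> int:
    """Return 304 if the representation is unchanged since the given date, else 200."""
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # An unparseable date is ignored, as if the header were absent.
        return 200
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200
```

Note that the answer is all or nothing: the standard semantics give the client either the current representation or confirmation that its stored copy is still good.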
Furthermore...
I'm designing a REST API where I have a need to provide the option to GET only the resources in a collection that were created or modified recently, based on a client-provided timestamp (which, in turn, will have been generated by the API in a previous response).
We already have a standardized mechanism for this - it's the URI. That may become clearer if you review Fielding's definition of resource.
I understand it this way: "resource", within the context of REST, is a generalization of "document" (see also Jim Webber, 2011). It's perfectly reasonable to have many different documents derived from the same (or overlapping) information.
Think "paging" - every page is a different document, with its own unique identifier, but all of the pages are being derived from the same common source, with items moving from one page to another over time.

Building a REST API: How to decide which params go to Headers, Body, URL or query?

How do you decide where do parameters go?
Suppose the API is for an object, which has an ID, a few fields and each request may or may not have a token. There has to be GET, PUT, POST and DELETE requests for the object.
As a rule, you want all of the parameters necessary to identify a resource to be directly encoded into the URI somewhere. That allows you to bookmark the URI for re-use later, and to share that bookmark with another person/process.
Example:
Building a REST API: How to decide which params go to Headers, Body, URL or query?
All of the context you need to GET this resource is right here. You can click it, save it, send it off in an email, and it is still useful, of itself.
So where in the URI does information go?
If the information is only needed by the client once the representation has been downloaded, then you might consider encoding it into the fragment.
The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information.
On the web, fragments were useful because they allowed you to call upon the user agent to focus at a particular element in a representation. The fragment is not sent over the network, but only used on the client side. Think Data Transfer Object - one big cacheable document (so we don't need a lot of round trips) with lots of URI that point to specific information within it.
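A small sketch of that split, using urldefrag from Python's standard library. The URL is a made-up example; the part before the fragment is what a client would put in the request, and the fragment stays on the client side.

```python
from urllib.parse import urldefrag

# Hypothetical URI pointing at one element inside a larger document.
reference = "https://example.org/report#section-3"

# urldefrag separates the part that identifies the primary resource
# from the fragment that the client applies after download.
url, fragment = urldefrag(reference)

print(url)       # the part used to make the request
print(fragment)  # the part kept by the client
```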
Other parameters can be encoded into path segments or the query string. The machines don't care (20 years ago, this was somewhat less true - we would sometimes have to work around caches that didn't handle the query part of a URI correctly).
URI with parameters configured via application/x-www-form-urlencoded query strings were convenient on the web because HTML had form support for creating those identifiers on the client.
These days, we can use URI templates to describe how to compute a new URI, which gives you more options.
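As an illustration only, here is a minimal sketch of the simplest kind of URI template expansion (plain {name} variables, in the spirit of RFC 6570 Level 1). The expand helper is hypothetical and does not implement the operators defined for the higher levels; a real client would more likely use a library.

```python
import re
from urllib.parse import quote


def expand(template: str, variables: dict) -> str:
    """Replace each {name} with its percent-encoded value (empty if undefined)."""
    def substitute(match):
        value = variables.get(match.group(1), "")
        # safe="" encodes everything except unreserved characters.
        return quote(str(value), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)


# The same variable can be directed into a path segment or into the query.
in_path = expand("/collection/{member}", {"member": "a b"})
in_query = expand("/collection?member={member}", {"member": "a b"})
print(in_path, in_query)
```

The server publishes the template, so it decides whether a value lands in the path or in the query; the client just fills in the variables.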
Relative resolution gives us a general purpose mechanism for computing a new URI from a given reference identifier. Think dot-references with symbolic links. That mechanism is primarily based on navigating the hierarchical part of the URI, which is to say the path.
The machines don't care if the hierarchy of resources and the hierarchy of identifiers are parallel
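Relative resolution can be tried out with urljoin from Python's standard library. This is just a sketch; the base URI is a made-up example.

```python
from urllib.parse import urljoin

# Hypothetical base identifier for a member of a collection.
base = "https://example.org/collection/member"

# "." resolves to the containing "directory" of the base path.
collection = urljoin(base, ".")

# A bare name resolves to a sibling of the base.
sibling = urljoin(base, "other")

# ".." walks one level further up the path hierarchy.
elsewhere = urljoin(base, "../elsewhere")

print(collection, sibling, elsewhere)
```

All three results are computed from the path hierarchy alone, which is why hierarchical path segments are convenient when representations carry relative references.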
# Here's an identifier for a collection
/collection
# Here's an identifier for a member of this collection
/collection/member
# Here's an identifier for a collection
/2c957fb6-ac92-4fdb-a086-02292c3b7c7c
# Here's an identifier for a member of this collection
/41d36a69-d10c-4503-8e5e-3b2d64e9c3a6
All of these samples are fine, as far as the machines are concerned; but human beings tend to have an easier time working with the top set.
Headers are metadata that belongs to the domain of "transporting documents over a network".
The body is the document itself - it is the message that is being transported over the network (the http request and the headers are, in a sense, the envelope that carries the message). Yes, this sometimes means that information that is in the message also gets copied into the headers, or copied via the template into the target-uri.

REST strategy for overloading GET verb

Consider a need to create a GET endpoint for fetching Member details using any one of the following options (this is common in legacy applications with RPC calls):
Get member by ID
Get member by SSN
Get member by a combination of Phone and LastName (both must be passed)
What's a recommended strategy to live the REST spirit and yet provide this flexibility?
Some options I could think of are:
Parameters Based
/user/{ID}
/user?ssn=?
/user?phone=?&lname=?
Separate Endpoints
/user/{ID}
/user/SSN/{SSNID}
/user/{lname}/{phone}
RPC for custom
/user/{ID}
/user/findBySSN/
/user/findbycontact/
REST doesn't care what spelling you use for your identifiers.
For example, think about how you would do this on the web. You would provide forms, one for each set of search criteria. The consumer would choose which form to use, and submit the form, without ever knowing what the URI is.
In the case of HTML forms, there are specific processing rules for describing how the form information will be copied into the URI. The form takes on the aspect of a URI Template.
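For a rough picture of what those form processing rules produce, the sketch below uses urlencode from Python's standard library, which applies application/x-www-form-urlencoded encoding. The /user path and the phone and lname fields come from the question; the values are invented.

```python
from urllib.parse import urlencode

# Invented form values for the "phone and last name" search from the question.
form_fields = {"phone": "555 0100", "lname": "O'Neil"}

# Form encoding copies the field names and values into the query string.
target = "/user?" + urlencode(form_fields)

print(target)
```

The client never had to know the URI in advance; it only filled in the fields the form offered.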
A URI Template provides both a structural description of a URI space and, when variable values are provided, machine-readable instructions on how to construct a URI corresponding to those values.
But there aren't any rules that restrict the server from providing a URI template that directs the client to copy the variable values into path segments rather than into the query string.
In other words, in REST, the server retains control of its own URI space.
You might sometimes prefer to use path segments because of their hierarchical nature, which may be convenient if you want the client to use relative resolution of relative references in your representations.
REST ≠ pretty URLs. The two are orthogonal.
Your question is about the latter, I feel.
Whilst the other answers have been good, if you want your API to work with HTML forms, go with query parameters on the collection /user resource for all fields, even for ID (assuming a human is typing these in based on information they are getting from sheets of paper on their desk, etc.)
If your server is able to produce links to each record, always produce canonical links such as /users/{id}; don't duplicate data under different URLs.

Is it valid to modify a REST API representation based on a If-Modified-Since header?

I want to implement a "get changed values" capability in my API. For example, say I have the following REST API call:
GET /ws/school/7/student
This gets all the students in school #7. Unfortunately, this may be a lot. So, I want to modify the API to return only the student records that have been modified since a certain time. (The use case is that a nightly process runs from another system to pull all the students from my system to theirs.)
I see http://blog.mugunthkumar.com/articles/restful-api-server-doing-it-the-right-way-part-2/ recommends using the if-modified-since header and returning a representation as follows:
Search all the students updated since the time requested in the if-modified-since header
If there are any, return those students with a 200 OK
If there are no students returned from that query, return a 304 Not Modified
I understand what he wants to do, but this seems the wrong way to go about it. The definition of the If-Modified-Since header (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24) says:
The If-Modified-Since request-header field is used with a method to make it conditional: if the requested variant has not been modified since the time specified in this field, an entity will not be returned from the server; instead, a 304 (not modified) response will be returned without any message-body.
This seems wrong to me. We would not be returning the representation or a 304 as indicated by the RFC, but some hybrid. It seems like client side code (or worse, a web cache between server and client) might misinterpret the meaning and replace the local cached value, when it should really just be updating it.
So, two questions:
Is this a correct use of the header?
If not (and I suspect not), what is the best practice? Query string parameter?
This is not the correct use of the header. The If-Modified-Since header is one which an HTTP client (browser or code) may optionally supply to the server when requesting a resource. If supplied the meaning is "I want resource X, but only if it's changed since time T." Its purpose is to allow client-side caching of resources.
The semantics of your proposed usage are "I want updates for collection X that happened since time T." It's a request for a subset of X. It does not seem like your motivation is to enable caching. Your client-side cached representation seemingly contains all of X, even though the typical request will only return you a small set of changes to X; that is, the response is not what you are directly caching, so the caching needs to happen in custom user logic client-side.
A query string parameter is a much more appropriate solution. Below {seq} would be something like a sequence number or timestamp.
GET /ws/schools/7/students/updates?since={seq}
Server-side I imagine you have a sequence of updates since the beginning of your system and a request of the above form would grab the first N updates that had a sequence value greater than {seq}. In this way, if a client ever got very far behind and needed to catch up, the results would be paged.
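A minimal sketch of that server-side lookup, assuming the updates are kept as dicts with a numeric "seq" key. The updates_since function, the record shape, and the limit parameter are all hypothetical.

```python
def updates_since(updates, seq, limit=100):
    """Return up to `limit` updates with a sequence value greater than `seq`,
    plus the sequence value the client should send on its next request."""
    newer = sorted(
        (u for u in updates if u["seq"] > seq),
        key=lambda u: u["seq"],
    )
    page = newer[:limit]
    # If nothing is newer, the client keeps using the value it already has.
    next_seq = page[-1]["seq"] if page else seq
    return page, next_seq


# Invented log of student updates.
log = [{"seq": n, "student": f"s{n}"} for n in range(1, 6)]
page, next_seq = updates_since(log, seq=2, limit=2)
print(page, next_seq)
```

A client that has fallen far behind simply repeats the request with the returned value until it gets an empty page.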

RESTful way to create multiple items in one request

I am working on a small client server program to collect orders. I want to do this in a "REST(ful) way".
What I want to do is:
Collect all orderlines (product and quantity) and send the complete order to the server
At the moment I see two options to do this:
Send each orderline to the server: POST qty and product_id
I actually don't want to do this because I want to limit the number of requests to the server so option 2:
Collect all the orderlines and send them to the server at once.
How should I implement option 2? A couple of ideas I have are:
Wrap all orderlines in a JSON object and send this to the server or use an array to post the orderlines.
Is it a good idea or good practice to implement option 2, and if so, how should I do it?
What is good practice?
I believe that another correct way to approach this would be to create another resource that represents your collection of resources.
Example, imagine that we have an endpoint like /api/sheep/{id} and we can POST to /api/sheep to create a sheep resource.
Now, if we want to support bulk creation, we should consider a new flock resource at /api/flock (or /api/<your-resource>-collection if you lack a better meaningful name). Remember that resources don't need to map to your database or app models. This is a common misconception.
Resources are a higher level representation, unrelated to your data. Operating on a resource can have significant side effects, like firing an alert to a user, updating other related data, initiating a long lived process, etc. For example, we could map a file system or even the unix ps command as a REST API.
I think it is safe to assume that operating a resource may also mean to create several other entities as a side effect.
Although bulk operations (e.g. batch create) are essential in many systems, they are not formally addressed by the REST architectural style.
I found that POSTing a collection as you suggested basically works, but problems arise when you need to report failures in response to such a request. Such problems are worse when multiple failures occur for different causes or when the server doesn't support transactions.
My suggestion to you is that if there is no performance problem, for example when the service provider is on the LAN (not WAN) or the data is relatively small, it's worth it to send 100 POST requests to the server. Keep it simple, start with separate requests and if you have a performance problem try to optimize.
Facebook explains how to do this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Simple batched requests
The batch API takes in an array of logical HTTP requests represented
as JSON arrays - each request has a method (corresponding to HTTP
method GET/PUT/POST/DELETE etc.), a relative_url (the portion of the
URL after graph.facebook.com), optional headers array (corresponding
to HTTP headers) and an optional body (for POST and PUT requests). The
Batch API returns an array of logical HTTP responses represented as
JSON arrays - each response has a status code, an optional headers
array and an optional body (which is a JSON encoded string).
Your idea seems valid to me. The implementation is a matter of your preference. You can use JSON or just parameters for this ("order_lines[]" array) and do
POST /orders
Since you are going to create more resources at once in a single action (the order and its lines), it's vital to validate each and every one of them and save them only if all of them pass validation, i.e. you should do it in a transaction.
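The all-or-nothing idea can be sketched as follows. This is only an illustration: the create_order function is hypothetical, the product_id and qty field names are taken from the question, and a plain list stands in for whatever transactional storage the server really uses.

```python
def create_order(order_lines, store):
    """Validate every order line first; save the order only if all lines pass.

    Returns (saved, errors). `store` is a list standing in for real storage.
    """
    errors = []
    for index, line in enumerate(order_lines):
        if not isinstance(line.get("product_id"), int):
            errors.append(f"line {index}: product_id must be an integer")
        qty = line.get("qty")
        if not isinstance(qty, int) or qty <= 0:
            errors.append(f"line {index}: qty must be a positive integer")
    if errors:
        # Nothing is written when any line fails validation.
        return False, errors
    store.append({"lines": list(order_lines)})
    return True, []


orders = []
ok, problems = create_order(
    [{"product_id": 1, "qty": 2}, {"product_id": 2, "qty": 0}], orders
)
print(ok, problems, len(orders))
```

Collecting every error before responding lets the client fix the whole order in one pass instead of discovering the problems one request at a time.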
I've actually been wrestling with this lately, and here's what I'm working towards.
If a POST that adds multiple resources succeeds, return a 200 OK (I was considering a 201, but the user ultimately doesn't land on a resource that was created) along with a page that displays all resources that were added, either in read-only or editable fashion. For instance, a user is able to select and POST multiple images to a gallery using a form comprising only a single file input. If the POST request succeeds in its entirety the user is presented with a set of forms for each image resource representation created that allows them to specify more details about each (name, description, etc).
In the event that one or more resources fail to be created, the POST handler aborts all processing and appends each individual error message to an array. Then, a 409 Conflict is returned and the user is routed to a 409 Conflict error page that presents the contents of the error array, as well as a way back to the form that was submitted.
I guess it's better to send separate requests within a single connection. Of course, your web server should support it.
You won't want to send the HTTP headers for 100 orderlines. Nor do you want to generate any more requests than necessary.
Send the whole order in one JSON object to the server, to: server/order or server/order/new.
Return something that points to: server/order/order_id
Also consider using PUT instead of POST to create the order.