Good Restful design: different payload for different accounts for same url - rest

Is it considered bad design if one url accepted different payloads depending on the basic authentication used? for instance:
http://localhost/userA PUT by userA is allowed up pass XML_A but
http://localhost/userA PUT by adminA is allowed up pass XML_B which is XML_A plus more.
in otherwords it is the same resource but what can be updated is determined based off of the credentials supplied.
I have seen conversations about return data but not too many about request payloads. (not sure if it would be considered different) thanks
UPDATE
Based off of Darrel Miller information, would the following be a better design?
GET /{Username} readonly resource returns different payload based off of rights
GET /{Username}/UpdInfo returns only updatable info (subset of GET /{Username})
PUT /{Username}/UpdInfo updates info 1 to 1 from the GET /{Username}/Info
GET /admin/{Username}/UpdInfo returns updatable info (larger subset of GET /{Username})
PUT /admin/{Username}/UpdInfo updates info 1 to 1 from the GET /admin/{Username}/Info

The problem I see is that PUT method replaces the entire contents of the resource that is targeted. e.g if the following sequence occured,
PUT /UserA with XML_B
PUT /UserA with XML_A
GET /UserA returns XML_A
UserA no longer contains the extra information contained in XML_B.
It think you would be better to just represent the two different sets of information as different resources:
GET /admin/UserA
PUT /admin/UserA with XML_B
GET /UserA
PUT /UserA with XML_A

Related

Rest API: path for accessing derived data

It is not clear to me that if I have a micro service that is in place to provide some derived data how the rest api should be designed for this. For instance :-
If I have a customer and I want to access the customer I would define the API as:
/customer/1234
this would return everything we know about the customer
however if I want to provide a microservice that simply tells me if the customer was previously known to the system with another account number what do I do. I want this logic to be in the microservice but how do I define the API
customer/1234/previouslyKnow
customerPreviouslyKnown/1234
Both don't seem correct. In the first case it implies
customer/1234
could be used to get all the customer information but the microservice doesn't offer this.
Confused!
Adding some extra details for clarification.
I suppose my issue is, I don't really want a massive service which handles everything customer related. It would be better if there were lighter weight services that handles customer orders, customer info, customer history, customer status (live, lost, dead....).
It strikes me all of these would start with
/customer/XXXX
so would all the services be expected to provide a customer object back if only customer/XXXX was given with no extra in the path such as /orders
Also some of the data as mentioned isn't actually persisted anywhere it is derived and I want the logic of this hidden in a service and not in the calling code. So how is this requested and returned.
Doing microservices doesn't mean to have a separate artifact for each method. The rules of coupling and cohesion also apply to the microservices world. So if you can query several data all related to a customer, the related resources should probably belong to the same service.
So your resource would be /customers/{id}/previous-customer-numbers whereas /customers (plural!) is the list of customers, /customers/{id} is a single customer and /customers/{id}/previous-customer-numbers the list of customer numbers the customer previously had.
Try to think in resources, not operations. So returning the list of previously used customer numbers is better than returning just a boolean value. /customer/{id}/previous-accounts would be even better, I think...
Back to topic: If the value of previous-accounts is directly derived from the same data, i.e. you don't need to query a second database, etc. I would even recommend just adding the value to the customer representation:
{
"id": "1234",
"firstName": "John",
"lastName": "Doe",
"previouslyKnown": true,
"previousAccounts": [
{
"id": "987",
...
}
]
}
Whether the data is stored or derived shouldn't matter so the service client to it should not be visible on the boundary.
Adding another resource or even another service is unnecessary complexity and complexity kills you in the long run.
You mention other examples:
customer orders, customer info, customer history, customer status (live, lost, dead....)
Orders is clearly different from customer data so it should reside in a separate service. An order typically also has an order id which is globally unique. So there is the resource /orders/{orderId}. Retrieving orders by customer id is also possible:
/orders;customer={customerId}
which reads give me the list of orders for which the customer is identified by the given customer id.
These parameters which filter a list-like rest resource are called matrix parameters. You can also use a query parameter: /orders?customer={customerId} This is also quite common but a matrix parameter has the advantage that it clearly belongs to a specific part of the URL. Consider the following:
/orders;customer=1234/notifications
This would return the list of notifications belonging to the orders of the customer with the id 1234.
With a query parameter it would look like this:
/orders/notifications?customer=1234
It is not clear from the URL that the orders are filtered and not the notifications.
The drawback is that framework support for matrix parameters is varying. Some support them, some don't.
I'd like matrix parameters best here but a query parameter is OK, too.
Going back to your list:
customer orders, customer info, customer history, customer status (live, lost, dead....)
Customer info and customer status most likely belong to the same service (customer core data or the like) or even the same resource. Customer history can also go there. I would place it there as long as there isn't a reason to think of it separately. Maybe customer history is such a complicated domain (and it surely can be) that it's worth a separate service: /customer-history/{id} or maybe just /customer/{id}.
It's no problem that different services use the same paths for providing different information about one customer. They are different services and they have different endpoints so there is no collision whatsoever. Ideally you even have a DNS alias pointing to the corresponding service:
https://customer-core-data.service.lan/customers/1234
https://customer-history.service.lan/customers/1234
I'm not sure if I really understand your question. However, let me show how you can check if a certain resource exist in your server.
Consider the server provides a URL that locates a certain resource (in this situation, the URL locates a customer with the identifier 1): http://example.org/api/customers/1.
When a client perform a GET request to this URL, the client can expect the following results (there may be other situation, like authentication/authorization problems, but let's keep it simple):
If a customer with the identifier 1 exists, the client is supposed to receive a response with the status code 200 and a representation of the resource (for example, a JSON or XML representing the customer) in the response payload.
If the customer with the identifier 1 do not exist, the client is supposed to receive a response with the status code 404.
To check whether a resource exists or not, the client doesn't need the resource representation (the JSON or XML that represents the customer). What's relevant here is the status code: 200 when the resource exists and 404 when the resource do not exist. Besides GET requests, the URL that locates a customer (http://example.org/api/customers/1) could also handle HEAD requests. The HEAD method is identical to the GET method, but the server won't send the resource representation in HEAD requests. Hence, it's useful to check whether a resource exists or not.
See more details regarding the HEAD method:
4.3.2. HEAD
The HEAD method is identical to GET except that the server MUST NOT
send a message body in the response (i.e., the response terminates at
the end of the header section). The server SHOULD send the same
header fields in response to a HEAD request as it would have sent if
the request had been a GET, except that the payload header fields MAY be omitted. This method can be used for obtaining
metadata about the selected representation without transferring the
representation data and is often used for testing hypertext links for
validity, accessibility, and recent modification. [...]
If the difference between resource and resource representation is not clear, please check this answer.
One thing I want to add to the already great answers is: URLS design doesn't really matter that much if you do REST correctly.
One of the important tenets of REST is that urls are discovered. A client that has the customers's information already, and wants to find out what the "previously known" information, should just be able to discover that url on the main customer resource. If it links from there to the "previously known" information, it doesn't matter if the url is on a different domain, path, or even protocol.
So if you application naturally makes more sense if "previouslyKnown" is on a separate base path, then maybe you should just go for that.

What should be the response of GET for multiple requested resources with some invalid ids?

what should be the response to a request to
http://localhost:8080/users/1,2,3 when the system doesn't have a user with id 3?
When all users are present I return a 200 response code with all user objects in the response body. When the user requests a single missing user I return a 404 with an error message in the body.
However, what should be the body and status code for a mix between valid and missing ids?
I assume that you want to follow REST API principles. In order to keep clear api design you should rather use query string for filtering
http://localhost:8080/users?id=1,2,3
Then you won't have such dilemmas - you can return just only users with id contained in provided value list and 200 status code (even if list is empty). This endpoint in general
http://localhost:8080/users/{id}
should be reserved for requesting single resource (user) by providing primary key.
What you are requesting there is a collection. The request essentially reads: "give me all users whose ID is in {1, 2, 3}." A subset of those users (let's say there is no user yet with the ID 3) would still be a successful operation, which is asking for a 200 (OK).
If you are overly concerned by this, there's still the possibility to redirect the client via 303 (See Other) to a resource representation without the offending elements.
If all of the IDs are invalid, things get a bit tricky. One may be tempted to simply return a 404 (Not Found), but strictly speaking that were not correct:
The 404 status code indicates that the origin server did not find a current representation for the target resource
Indeed, there is one: The empty set. From a programmatic standpoint, it may indeed be easier to just return that instead of throwing an error. This relies on clients being able to process empty sets/documents.
The RFC grants you the freedom to go either way:
[…] the origin server did not find a current representation for the target resource or is not willing to disclose that one exists.
So if you wish to hide the existence of an empty set, it's okay. It bears mentioning that a set containing nothing is not nothing itself ;)
I would recommend not to offer the method in the first place, but rather force the user of your APIto make three separate requests and return unambiguous responses (two 200s for users 1 and 2 and 404 for user 3). Additionally, the API could offer a get method that responds with all available user ids or such (depends on your system).
Alternatively, if that's not an option, I guess, you have two options:
Return 404 as soon as one user is not found, which technically is more accurate in my opinion. I mean, the request was for 1, 2 AND 3, which was not found.
Return 200 with users 1 and 2, and null, which probably is the most useful for your scenario.

REST Design: What Http verb should be used to retrieve a dynamic resource?

I have a scenario in which I have REST API which manages a Resource which we will call Group. A Group contains members and the group resource is dynamic - whenever you retrieve it, you get the latest data (so a query must run server side to update the number of members in a group - in other words, the result of the request is to modify the data, since the results of running the query are stored).
Given a *group_id* it should return a minimal amount of information like
{
group_id: "5t7yu8i9io0op",
group_name: "That's my name",
size: 34
}
So a GET to this resource causes the resource to change, since a subsequent GET could return a new value for 'size'. This tells me it is not idempotent and so you should use POST to retrieve this resource. Am I correct in this conclusion?
If I am correct, do you think it is advisable to also provide a GET method that only returns the currently stored data for the group (eg. so the size could be out of date, even the name too). I suppose in this case I should return a last-modified date as one of the fields so that the user knows how up-to-date the resource is and can then elect to use the POST method...but then I am left wondering why would anyone do that, so why not ONLY provide the POST method and forget about GET?
Confused I am!
Thanks in advance.
[EDIT]
#Satish posted a link in his/her answer to the HTTP specs. In section 9.1.1. it ends with this sentence:
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
So in my scenario, the requester does not really care about the side-effect that the value for 'size' is recomputed as a direct result of making the request. They want the group information and it just so happens that to provide accurate, up-to-date group data, the size query must be run in order to update that value. Whilst making the request causes data to change implies this should be a POST, the user did not request that side-effect and so therefore a GET request would be acceptable and more intuitive, would it not? And therefore still be restful according to this sentence.
[2nd EDIT]
#Satish asks a very important question in the comments. So for others who read this I'll explain further about this problem:
Normally you would not run the group query to update its size from a REST request. As members are added or removed from a group, you would update the computed size of that group, store it and then a simple GET request would always return the correct size. However, our situation is more complicated in that a group is only stored as a query definition in ElasticSearch (kind of like a view in an RDBMS). Members do not get added/removed to and from groups. They get added to a much larger set of data (a collection in MongoDB). There are hundreds, potentially thousands, of different 'group definitions' so it is not practical to recompute size for every group when the collection changes. We cannot know when an item is added/removed to/from the collection which groups might change size - you only know by running the group definition who is in that group and what the size is. I hope that clears things up. :)
You should use GET. Even if dynamic resource is changing, you did not request for that change through your request and you are not accountable for that change. Ref: http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
In your case when you do a GET you retrieve some information about the Group. You don't modify the group structure. Ok, the group can be changed by an external entity so your next GET may bring you another data. Am I right? Who modifies the structure of the group and when?
So you should use GETbecause the resource it will be modified from somewhere else and not by your call that tries to do a read operation.
EDIT
After your edited the question I just want to add that I agree about the side effects.
It matters if you sent data or a change command explicitly to the server or you just read something and you don't have to pay attention for what the server side is doing to gave you the response. More intuitively:
GET - Requests data from a specified resource
POST - Submits data to be processed to a specified resource
It is the combination of GET and POST. So you should use POST.
Refer : http://adarshdchaurasia.wordpress.com/2013/09/26/http-get-vs-post/
You should not use GET because if you use GET method then search engines may cache the responses. It may cause unintentional data update at your server side, which you do not want. GET method is meant to return content without updating anything on server. POST is meant to updated the things at server and return result against that operation.

REST interface for finding average

Suppose I want to create a REST interface to find the average of a list of numbers. Assume that the numbers are submitted one at a time. How would you do this?
POST a number to https://example.com/api/average
If this is the first number a hash will be returned
POST a number to https://example.com/api/average/hash
....
GET https://example.com/api/average/hash to find the average
DELETE https://example.com/api/average/hash since we don't need it any more
Is this the right way to do it? Any suggestions?
It makes more sense to think of the list of numbers as the resource. Suppose each list's resource URL is /list/{id} where {id} is a placeholder for the list's ID. Then:
POST /list creates a new list, the server generates a list ID (or 'hash') and specifies the /list/{id} URL in the response's Location header.
POST /list/{id} adds a number to the list
GET /list/{id}/average returns the average
DELETE /list/{id} deletes the list.
An alternative to GET /list/{id}/average would be for GET /list/{id} to return the list as structured data, e.g. XML, that includes the average as a generated property.
What you are talking about is doing a stateless transformation of a request representation (list of numbers) into a response representation (single number).
Lets categorize your resource:
Stateless -- The request is stateless, but so is the resource. It should be able to take your request, process it, and return a response without maintaining any internal state. Further discussion below.
Unlikely to be cacheable -- I am making an assumption here that your lists of numbers are never/seldom identical.
Idempotent -- Requests have no side effects. This is because the resource is stateless.
Now lets examine the different HTTP methods:
GET - Gets the state of a resource. Since your resource has no state, it is not appropriate for your situation. (idempotent, cacheable)
DELETE - Removes a resource or clears its state. Also not appropriate for your situation. (not idempotent, not cacheable)
PUT - Used to set the state of a resource (or create it if it does not exist). (idempotent, not cacheable)
POST - Used to process requests which may or may not modify the state of a resource. May create other resources. (no guarantee of idempotence -- depends on whether the resource is stateful or stateless, not cacheable)
As you see in the other answers, POST is most popularly used as a synonym for 'create'. While this is ok, POST is not limited to just 'create' in REST. Mark Baker does a good job of explaining this here: http://www.markbaker.ca/2001/09/draft-baker-http-resource-state-model-01.txt (Section 3.1.4).
While POST does not have a perfect semantic mapping to your problem, it is the best of all the HTTP methods for what you are trying to do. It also leads to a simple, stateless, and scalable solution, which is the point of REST.
In summary, the answer to your question is:
Method: POST
Request: A representation of a list of numbers
Response: A representation of a single number (average of the list)
While this may look like a SOAP-style web service invocation, it is not. Don't let your visceral reaction to SOAP cloud your use of the POST method and place unnecessary constraints on it.
KISS (Keep it simple, stupid).
You cannot just return a hash or an ID, you have to return URIs or a URI template plus the field values. The only URI that can be part of your API is the entry point, otherwise your API is not REST.
To maximize REST philosophies I would do the following
Do a PUT this to indicate a new structure that would generate a hash, that is not based on the number passed. Just a "random" hash. Then each subsequent post would include the id-hash with returned result hash of the numbers sent. Then when a get is presented on that you can cache the results.
1. PUT /api/average/{number} //return id-hash
2. POST /api/average/{id-hash}/{number} // return average-hash
3. GET /api/average/{average-hash}
4. DELETE /api/average/{id-hash}
Then you can cache the get of the average hash, even when you may get to the result in a different way, or different servers get that same average.

RESTful way to create multiple items in one request

I am working on a small client server program to collect orders. I want to do this in a "REST(ful) way".
What I want to do is:
Collect all orderlines (product and quantity) and send the complete order to the server
At the moment I see two options to do this:
Send each orderline to the server: POST qty and product_id
I actually don't want to do this because I want to limit the number of requests to the server so option 2:
Collect all the orderlines and send them to the server at once.
How should I implement option 2? a couple of ideas I have is:
Wrap all orderlines in a JSON object and send this to the server or use an array to post the orderlines.
Is it a good idea or good practice to implement option 2, and if so how should I do it.
What is good practice?
I believe that another correct way to approach this would be to create another resource that represents your collection of resources.
Example, imagine that we have an endpoint like /api/sheep/{id} and we can POST to /api/sheep to create a sheep resource.
Now, if we want to support bulk creation, we should consider a new flock resource at /api/flock (or /api/<your-resource>-collection if you lack a better meaningful name). Remember that resources don't need to map to your database or app models. This is a common misconception.
Resources are a higher level representation, unrelated with your data. Operating on a resource can have significant side effects, like firing an alert to a user, updating other related data, initiating a long lived process, etc. For example, we could map a file system or even the unix ps command as a REST API.
I think it is safe to assume that operating a resource may also mean to create several other entities as a side effect.
Although bulk operations (e.g. batch create) are essential in many systems, they are not formally addressed by the RESTful architecture style.
I found that POSTing a collection as you suggested basically works, but problems arise when you need to report failures in response to such a request. Such problems are worse when multiple failures occur for different causes or when the server doesn't support transactions.
My suggestion to you is that if there is no performance problem, for example when the service provider is on the LAN (not WAN) or the data is relatively small, it's worth it to send 100 POST requests to the server. Keep it simple, start with separate requests and if you have a performance problem try to optimize.
Facebook explains how to do this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Simple batched requests
The batch API takes in an array of logical HTTP requests represented
as JSON arrays - each request has a method (corresponding to HTTP
method GET/PUT/POST/DELETE etc.), a relative_url (the portion of the
URL after graph.facebook.com), optional headers array (corresponding
to HTTP headers) and an optional body (for POST and PUT requests). The
Batch API returns an array of logical HTTP responses represented as
JSON arrays - each response has a status code, an optional headers
array and an optional body (which is a JSON encoded string).
Your idea seems valid to me. The implementation is a matter of your preference. You can use JSON or just parameters for this ("order_lines[]" array) and do
POST /orders
Since you are going to create more resources at once in a single action (order and its lines) it's vital to validate each and every of them and save them only if all of them pass validation, ie. you should do it in a transaction.
I've actually been wrestling with this lately, and here's what I'm working towards.
If a POST that adds multiple resources succeeds, return a 200 OK (I was considering a 201, but the user ultimately doesn't land on a resource that was created) along with a page that displays all resources that were added, either in read-only or editable fashion. For instance, a user is able to select and POST multiple images to a gallery using a form comprising only a single file input. If the POST request succeeds in its entirety the user is presented with a set of forms for each image resource representation created that allows them to specify more details about each (name, description, etc).
In the event that one or more resources fails to be created, the POST handler aborts all processing and appends each individual error message to an array. Then, a 419 Conflict is returned and the user is routed to a 419 Conflict error page that presents the contents of the error array, as well as a way back to the form that was submitted.
I guess it's better to send separate requests within single connection. Of course, your web-server should support it
You won't want to send the HTTP headers for 100 orderlines. You neither want to generate any more requests than necessary.
Send the whole order in one JSON object to the server, to: server/order or server/order/new.
Return something that points to: server/order/order_id
Also consider using CREATE PUT instead of POST