find-or-create idiom in REST API design?

find-or-create idiom in REST API design? - rest

say we have a 'user' resource with unique constraint on 'name'. how would you design a REST API to handle a find-or-create (by name) use case? I see the following options:
option 1: multiple requests
client:
POST /user
{"name":"bob"}
server:
HTTP 409 //or something else
client:
GET /user?name=bob
server:
HTTP 200 //returns existing user
option 2: one request, two response codes
client:
POST /user
{"name":"bob"}
server:
HTTP 200 //returns existing user
(in case user is actually created, return HTTP 201 instead)
option 3: request errs but response data contains conflicting entity
client:
POST /user
{"name":"bob"}
server:
HTTP 409 //as in option1, since no CREATE took place
{"id": 1, "name":"bob"} //existing user returned

I believe the "correct" RESTful way to do this would be :
GET /user?name=bob
200: entity contains user
404: entity does not exist, so
POST /user { "name" : "bob" }
303: GET /user?name=bob
200: entity contains user
I'm also a big fan of the Post-Redirect-Get pattern, which would entail the server sending a redirect to the client with the uri of the newly created user. Your response in the POST case would then have the entity in its body with a status code of 200.
This does mean either 1 or 3 round-trips to the server. The big advantage of PRG is protecting the client from rePOSTing when a page reload occurs, but you should read more about it to decide if it's right for you.
If this is too much back-and-forth with the server, you can do option 2. This is not strictly RESTful by my reading of https://www.rfc-editor.org/rfc/rfc2616#section-9.5:
The action performed by the POST method might not result in a resource
that can be identified by a URI. In this case, either 200 (OK) or 204
(No Content) is the appropriate response status, depending on whether
or not the response includes an entity that describes the result.
If you're okay with veering away from the standard, and you're concerned about round-trips, then Option 2 is reasonable.

I am using a version of option 2. I return 201 when the resource is created, and 303 ("see other") when it is merely retrieved. I chose to do this, in part, because get_or_create doesn't seem to be a common REST idiom, and 303 is a slightly unusual response code.

I would go with Option 2 for two reasons:
First, HTTP response code, 2xx (e.g. 200 nd 201) refers to a successful operation unlike 4xx. So in both cases when find or create occurs you have a successful operation.
Second, Option 1 doubles the number of requests to the server which can be a performance hit for heavy load.

I believe a GET request which either:
returns an existing record; or
creates a new record and returns it
is the most efficient approach as discussed in my answer here: Creating user record / profile for first time sign in
It is irrelevant that the server needs to create the record before returning it in the response as explained by #Cormac Mulhall and #Blake Mitchell in REST Lazy Reference Create GET or POST?
This is also explained in Section 9.1.1 of the HTTP specification:
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.

Related

REST Check if resource exists, how to handle on server side?

how to handle resource checking on server side?
For example, my api looks like:
/books/{id}
After googling i found, that i should use HEAD method to check, if resource exists.
https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
I know, that i can use GET endpoint and use HEAD method to fetch information about resource and server does not return body in this case.
But what should i do on server side?
I have two options.
One endpoint marked as GET. I this endpoint i can use GET method to fetch data and HEAD to check if resource is available.
Two endpoints. One marked as GET, second as HEAD.
Why i'm considering second solution?
Let's assume, that GET request fetch some data from database and process them in some way which takes some time, eg. 10 ms
But what i actually need is only to check if data exists in database. So i can run query like
select count(*) from BOOK where id = :id
and immediately return status 200 if result of query is equal to 1. In this case i don't need to process data so i get a faster response time.
But... resource in REST is a object which is transmitted via HTTP, so maybe i should do processing data but not return them when i use HEAD method?
Thanks in advance for your answer!

You could simply delegate the HEAD handler to the existing GET handler and return the status code and headers only (ignoring the response payload).
That's what some frameworks such as Spring MVC and JAX-RS do.
See the following quote from the Spring MVC documentation:
#GetMapping — and also #RequestMapping(method=HttpMethod.GET), are implicitly mapped to and also support HTTP HEAD. An HTTP HEAD request is processed as if it were HTTP GET except but instead of writing the body, the number of bytes are counted and the "
Content-Length header set.
[...]
#RequestMapping method can be explicitly mapped to HTTP HEAD and HTTP OPTIONS, but that is not necessary in the common case.
And see the following quote from the JAX-RS documentation:
HEAD and OPTIONS requests receive additional automated support. On receipt of a HEAD request an implementation MUST either:
Call a method annotated with a request method designator for HEAD or, if none present,
Call a method annotated with a request method designator for GET and discard any returned entity.
Note that option 2 may result in reduced performance where entity creation is significant.
Note: Don't use the old RFC 2616 as reference anymore. It was obsoleted by a new set of RFCs: 7230-7235. For the semantics of the HTTP protocol, refer to the RFC 7231.

Endpoint should be the same and server side script should make decision what to do based on method. If method is HEAD, then just return suitable HTTP code:
204 if content exists but server don't return it
404 if not exists
4xx or 5xx on other error
If method is GET, then process request and return content with HTTP code:
200 if content exists and server return it
404 if not exists
4xx or 5xx on other error
The important thing is that URL should be the same, just method should be different. If URL will be different then we talking about different resources in REST context.

Your reference for HTTP methods is out of date; you should be referencing RFC 7231, section 4.3.2
The HEAD method is identical to GET except that the server MUST NOT send a message body in the response (i.e., the response terminates at the end of the header section).
This method can be used for obtaining metadata about the selected representation without transferring the representation data and is often used for testing hypertext links for validity, accessibility, and recent modification.
You asked:
resource in REST is a object which is transmitted via HTTP, so maybe i should do processing data but not return them when i use HEAD method?
That's right - the primary difference between GET and HEAD is whether the server returns a message-body as part of the response.
But what i actually need is only to check if data exists in database.
My suggestion would be to use a new resource for that. "Resources" are about making your database look like a web site. It's perfectly normal in REST to have many URI that map to a queries that use the same predicate.
Jim Webber put it this way:
The web is not your domain, it's a document management system. All the HTTP verbs apply to the document management domain. URIs do NOT map onto domain objects - that violates encapsulation. Work (ex: issuing commands to the domain model) is a side effect of managing resources. In other words, the resources are part of the anti-corruption layer. You should expect to have many many more resources in your integration domain than you do business objects in your business domain.

HTTP status code for GET request with non-existing query parameter value

Let's clarify three common scenarios when no item matches the request with a simple example:
GET /posts/{postId} and postId does not exist (status code 404, no question)
GET /posts?userId={userId} and the user with userId does not have any posts
GET /posts?userId={userId} and the user with userId does not exist itslef
I know there's no strict REST guideline for the appropriate status codes for cases 2 and 3, but it seems to be a common practice to return 200 for case 2 as it's considered a "search" request on the posts resource, so it is said that 404 might not be the best choice.
Now I wonder if there's a common practice to handle case 3. Based on a similar reasoning with case 2, 200 seems to be more relevant (and of course in the response body more info could be provided), although returning 404 to highlight the fact that userId itself does not exist is also tempting.
Any thoughts?

Ok, so first, REST doesn't say anything about what the endpoints should return. That's the HTTP spec. The HTTP spec says that if you make a request for a non-existent resource, the proper response code is 404.
Case 1 is a request for a single thing. That would return 404, as you said.
The resource being returned in case 2 is typically an envelope which contains metadata and a collection of things. It doesn't matter if the envelope has any things in it or not. So 200 is the correct response code because the envelope exists, it just so happens the envelope isn't holding any things. It would be allowable under the spec to say there's no envelope if there are no things and return 404, but that's usually not done because then the API can't send the metadata.
Case 3, then, is exactly the same thing as case 2. If expected response is an envelope, then the envelope exists whether or not the userId is valid. It would not be unreasonable to include metadata in the envelope pointing out that there is no user with userId, if the API designer thinks that information would be useful to clients.
Case 2 and Case 3 are really the same case, and should both either return 200 with an empty envelope or 404.

First piece, you need to recognize that /posts?userId={userId} identifies a resource, precisely in the same sense that /posts/{userId} or /index.html specifies a resource.
So GET /posts?userId={userId} "requests transfer of a current selected representation for the target resource."
The distinction between 200 and 404 is straight forward; if you are reporting to the consumer that "the origin server did not find a current representation for the target resource or is not willing to disclose that one exists", then you should be returning 404. If the response payload includes a current representation of the resource, then you should use the 200 status code.
404 is, of course, an response from the Client Error response class
the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition.
So a way of figuring out which of these status codes to use, is to just look at the message body of the response. If it is a current representation of the resource, then you use the 200 status code. If it is a representation of a message that explains no current representation is available, then you use the 404 status code.
Of course that ducks the big question: what should the representation of the resource be in each case? Once you know that, you can work out the rest.
If you you think that an unexpected identifier indicates an error on the client (for example, a corrupted link), then it will probably improve the consumer's experience to report that as an explicit error, rather than returning a representation of an empty list.
But that's a judgment call; different API are going to have different answers, and HTTP isn't particularly biased one way or the other; HTTP just asks that you ensure that the response code and headers are appropriate for the choice that you have made.

What should be the response of GET and DELETE methods in REST? [duplicate]

This question already has answers here:
HTTP status code for update and delete?
(10 answers)
Closed 5 years ago.
In REST, suppose the client invoked the findAll method (GET)- which would simply return the list of entities(DTOs) to the client with HTTP status code 200. Now, let's say client has invoked a DELETE method removeObject(Object object). The object in the parameter doesn't exist in the database, and typically it would return HTTP status code 400. I want the client to know this real reason of 400 in a more manageable way (less panic) . Now the need of a status code / description arises which wasn't necessary during the GET method.
I want the client to get a consistent RESPONSE for all the messages. Is there any guideline / best practices for what to return in REST based APIs?

What do you mean by REST? Let's suppose, we refer to Roy Fielding and Http 1.1 standard.
According to standard, DELETE is idempotent method. I.e. if you request DELETE more than once, side-effects would be the same. I.e. all the same record in DB would be marked as "deleted" or be absent.
First of all, to request DELETE, you request a resource. Say, http://some.url/to/resource. If it never was present - you should respond with 404.
Section "9.7 DELETE" of the standard says:
A successful response SHOULD be 200 (OK) if the response includes an
entity describing the status, 202 (Accepted) if the action has not
yet been enacted, or 204 (No Content) if the action has been enacted
but the response does not include an entity.
If you don't remove a record completely from DB and want to communicate on subsequent requests that resource had been deleted and is no longer available, then the standard, section "10.4.11 410 Gone", says:
The 410 response is primarily intended to assist the task of web
maintenance by notifying the recipient that the resource is
intentionally unavailable and that the server owners desire that
remote links to that resource be removed.
But using this response code or providing it for some time period is not necessary, and response could be also 404. So use it, if you want to differentiate, whether a resource had been deleted or had never been present.

Status code of 400 is fine but you then also return some object that the frontend can parse and comprehend (i.e. json). This way, your frontend gets more info about the error and optionally override it before presenting it to the user. Some pretty basic and barely functional pseudocode:
Backend:
return(status_code=400,
body={message: {type: "error", description: "some text here"})
Frontend:
if (status_code >= 400):
console.log(json.loads(response).message.description)

I would expect an Rest api to respond with a 200 when deleting an object. Anything else usually means something went wrong with my request.
If I requested an object after it had been deleted, I would then expect the API to return 404, or something of that nature.
Please refer to section 9.7 of RFC 2616
https://www.rfc-editor.org/rfc/rfc2616#section-9.7

REST PUT create or update

I want to create or update an item in one go with the following request:
PUT /items/{id}
Should I return 201 Created if the item has been created and 204 No Content when it has been updated? Or should I return the same status for both actions?

The answer is clearly stated in RFC2616, section 9.6:
If a new resource is created, the origin server MUST inform the user agent via the 201 (Created) response. If an existing resource is modified, either the 200 (OK) or 204 (No Content) response codes SHOULD be sent to indicate successful completion of the request.

TL;DR
Use 200 OK in both cases if the distinction isn't important to clients.
Use 201 Created upon creation, and 200 OK upon update, if the distinction is important.
Details
With RESTful design, there's no single, correct way of doing things, but I've often found that the original HTTP status code specification provides good guidance.
It's always fine to return 200 OK if you don't expect the client to take separate action based on the response, but you can provide more information if you think the client is going to need it. Just be aware that clients may come to expect that of your API, so once you start providing more details, you can't easily change your mind.
That said, if you want to return 201 Created, then according to the specification:
The response SHOULD include an entity containing a list of resource characteristics and location(s) from which the user or user agent can choose the one most appropriate.
As an example, you can put the location in the Location header, but you can also put it in the body of the response (but then you must document the structure and content-type of that body).
When it comes to 204 No Content responses, they create dead ends for link-following clients, so if you're creating a true level 3 REST API then you should avoid that response type as a courtesy to clients.

Can a RESTful POST method be implemented to be idempotent?

I am designing a RESTful API that is managing favorites. Each favorite resource can contain details of two items that are considered as part of the favorite.
HTTP POST /favorites
{ "item1" : "ball",
"item1-ID" : "1",
"item2" : "bat",
"item2-ID" : "2"
}
Please excuse the rudimentary JSON payload. The focus is however on the semantics of the POST
The above POST method creates a new favorite resource (that contains a ball (ID 1) and a bat (ID 2))
My question is with regard to the expected behavior when the same POST request is sent twice. The first request will create a favorite (as expected). What should happen when the second request is sent?
1) Signal an error with 409
2) Signal a success with 201
1) is not idempotent (as POST is), while 2) makes POST idempotent.
Which is the correct approach?

You are thinking about this in the wrong way. If a POST creates a new resource then the resource should have a URL that identifies it (e.g., http://.../favorite/1). When the second POST with the same payload happens, is a new resource created? If so, then you have two separate resources with unique URLs. If you application does not create a new resource, then the second POST would return the same URL as the first one.
Edit
The POST section of RFC7231 does not prohibit it from being implemented in an idempotent manner. It does provide some guidance though:
If a POST creates a new resource, then it SHOULD send a 201 (Created) response containing a Location header identifying the created resource
If the result of a POST is equivalent to an existing resource, then the server MAY redirect the UA to the existing resource by returning a 303 (See Other) with the appropriate Location header
POST is not required to change the state of a resource. However, GET and HEAD are required to be idempotent. I'm not aware of any method that is required to change state on the server.
Personally, I implement resource creating POST methods as returning a 303 redirect to the canonical URL for the resource as a matter of practice. I wasn't aware of the RFC distinguishing status codes for resource creation and re-use responses. The benefit of this approach is that it removes the need to include the created resource in the POST response and the GET to the canonical URL will cache the response in any intermediate caches that may be present. It also allows for the implementation of intelligent clients that do not retrieve the response and use the canonical URL (from the Location header) instead.
In your case, I think that I would adopt the 303 approach provided that a favorite has a URL of it's own; 201 would be strange if it did not. If there is very good reason for distinguishing the creation and re-use cases in the response, then a 201 when the new resource is created and a 303 when it is reused is appropriate. What I would not do is to return a 409 unless your application requires you to deny the creation of a resource that already exists.

To me, as you pictured it, it should create a new favorite. But this feels strange for almost any application. What if:
your favorites could be defined as non-repeatable if they are exact copies
you could send an ID with the favorite (not a good idea, as it'd be client-based), and insert or update based on that
you could send some other unique field and use that to define whether it is a "create" or an "update" (ie. "favorite name" or "position in menu")
As you see, it all depends on your application.
Maybe you can get some ideas from this article I happened to write some time ago: A look into various REST APIs . Don't miss the summary at the bottom.

What should happen when the second request is sent?
It all depends on your implementation . Say if you have some logic in your controller that checks if ID 1 and ID 2 exist in the data base then update the value else create the new record.
The whole thing is not depended on POST request.
AND FYI HTTP response code for success is 200 and not 201.
Your current approach is not correct and the question is not a JAVA question

In general, it depends on you, how you want to design the underlying system. In most cases, POST creates a new resource (Yes, there are instances when you use POST and not create anything new. For example, you have a use-case where in you want to search on a very long list of the parameters and GET might not be the way!. This is completely acceptable and not a violation of REST principles.)
Commenting on your specific case,
What should happen when the second request is sent?
You should ask the following questions to yourself.
Is having a duplicate request (a request with the same payload/headers/time-stamp etc.) a "different" request for your system ? If YES - treat it normally as if it's not duplicate. I assume that the next systems to which the request is going are also understanding this as a new request, otherwise they could fail and flag error. For example, MySQL could throw Duplicate Primary Key error (Refer: Error Code: 1062. Duplicate entry 'PRIMARY')
If the answer to (1) is NO - Is the duplicate occurrence of this request an expected behaviour of the clients ? (For example, you tend to get duplicate requests from mobile if it's intermittently getting the network.) If YES - return 200 OK. Note the idempotency here.
If the answer to (2) is NO - throw 409 conflict with a custom error-code which client can handle and react accordingly. I would also investigate the unwanted behaviour of client on throwing the duplicate requests and fix it if that's doable.

Duplicate idempotent POST requests should return the previously created resource in response if the status code of the response was 200. Idempotency Enforcement Scenarios

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse