Should HTTP PUT create a resource if it does not exist? - rest

Lets suppose that someone performs a PUT request on my endoint:
/resources/{id}
However there is not resource with the given id stored in my PostgreSQL database.
According to the RFC 2616, I should create the resource if I am capable to:
The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI.
Would be okay to create the resource with the provided id? As manually assigning ids on database insert is not the best practice.
Should I return a 404 error if the creation of the resource is not possible?

First of all, you are using an obsolete document: The RFC 2616 is no longer relevant nowadays and anyone using such document as reference should stop right away.
Quoting Mark Nottingham who, at the time of writing, co-chairs the IETF HTTP and QUIC Working Groups:
Don’t use RFC2616. Delete it from your hard drives, bookmarks, and burn (or responsibly recycle) any copies that are printed out.
The old RFC 2616 has been supplanted by the following documents that, together, define the HTTP/1.1 protocol:
RFC 7230: Message Syntax and Routing
RFC 7231: Semantics and Content
RFC 7232: Conditional Requests
RFC 7233: Range Requests
RFC 7234: Caching
RFC 7235: Authentication
If you are looking for methods, status codes and headers definitions, then the RFC 7231 is the document you should refer to.
Having said that, let's go back to your question.
Should HTTP PUT create a resource if it does not exist?
It depends.
But, if your application generates resource identifiers on behalf of the client, as you mentioned in your question, then you should use POST instead of PUT for creating resources.
Some parts of the PUT method definition are quoted below. The last sentence seems to be the most relevant to you (highlight is mine), supporting what I've just mentioned above:
4.3.4. PUT
The PUT method requests that the state of the target resource be created or replaced with the state defined by the representation enclosed in the request message payload. [...]
If the target resource does not have a current representation and the PUT successfully creates one, then the origin server MUST inform the user agent by sending a 201 (Created) response. If the target resource does have a current representation and that representation is successfully modified in accordance with the state of the enclosed representation, then the origin server MUST send either a 200 (OK) or a 204 (No Content) response to indicate successful completion of the request. [...]
Proper interpretation of a PUT request presumes that the user agent knows which target resource is desired. A service that selects a proper URI on behalf of the client, after receiving a state-changing request, SHOULD be implemented using the POST method rather than PUT. [...]
Should I return a 404 error if the creation of the resource is not possible?
That's seems to be an accurate status code to be returned, as no representation has been found for the requested resource:
6.5.4. 404 Not Found
The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource or is not willing to disclose that one exists. [...]
Now, for the sake of completeness, find below some relevant quotes on the POST method definition, which should be used to create resources in the scenario described in your question:
4.3.3. POST
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others):
[...]
Creating a new resource that has yet to be identified by the origin server;
[...]
If one or more resources has been created on the origin server as a result of successfully processing a POST request, the origin server SHOULD send a 201 (Created) response containing a Location header field that provides an identifier for the primary resource created and a representation that describes the status of the request while referring to the new resource(s).
While the 201 status code indicates that a new resource has been created, the Location header indicate where the newly created resource is located. If no Location header is provided, then the client should assume that the resource is identified by the effective request URI:
6.3.2. 201 Created
The 201 (Created) status code indicates that the request has been fulfilled and has resulted in one or more new resources being created. The primary resource created by the request is identified by either a Location header field in the response or, if no Location field is received, by the effective request URI. [...]

In short, it depends wheter the payload you want to store violates any constraint the server has for resources or not.
In general I'd say it should attempt it as the client explicitly expresses his intent to store that particular representation at the target URI. The server should though perform constraint checks before! Usually, in a real REST scenario though, the client should use URI that are provided by the server and not just chose any URI on its own. Thereby, a server should be in control of its namespace, as such using PUT to create resources is not recommended here by default.
With that being said, as PUT is idempotent while POST being not, some clients might want to benefit from this property. Here a POST-PUT creation pattern has evolved, where a client is attempting to create a new resource via POST until it receives a confirmation via a Location header in the response and afterwards attempts the update of that resource's state via PUT. This way the client can be sure that in case of transmission problems the representation was only created once. Depending on the stance, some people might consider the actual update of the resource as the actual resource creation, though as the client beforehand received the respective link, this is not quite the case.
Note that a server also has the right to transform the representation to something different if i.e. the server is configured to provide specific representations for certain URI endpoints. Think of uploading an image via PUT to a URI and the server embedds the image into a HTML page

There's two questions embedded here: 1) should PUT try to create the resource and 2) what happens if it cant.
1)
The RFC linked by #cass says https://www.rfc-editor.org/rfc/rfc7231#section-4.3.4:
The PUT method requests that the state of the target resource be
created or replaced with the state defined by the representation
enclosed in the request message payload. A successful PUT of a given
representation would suggest that a subsequent GET on that same
target resource will result in an equivalent representation being
sent in a 200 (OK) response.
Further, Mozilla's text https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/PUT
The HTTP PUT request method creates a new resource or replaces a representation of the target resource with the request payload.
Further, from the original RFC (that was replaced with the above test) https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI.
This is a bit anecdotal, but the Kubernetes API also carefully makes this distinction and informs it's users of PATCH if they really meant update: https://kubernetes.io/docs/reference/using-api/api-concepts/#api-verbs
For PUT requests, Kubernetes internally classifies these as either create or update based on the state of the existing object. An update is different from a patch; the HTTP verb for a patch is PATCH.
2
In terms of "what happens if it fails" I think the code depends on what went wrong:
400: it couldnt be created due to a bad payload
409: it couldnt be created due to a conflict - for example some field in the input JSON has some global uniqueness check on it
502/3 - it couldnt be created because it tried to call the database and it was dead
I'm not sure if 404 is the best code, becuase it doesnt tell the user anything about why.

Related

Status code when using PUT endpoint to create resource in REST api

When you use PUT endpoint to create resource in REST api, what should the endpoint return for subsequent calls after returning 201(created) for the first call? 403(cannot create since the resource already exist)? 200(updated to the same exact object?) if you change the status code after one call(201-> 200 or 403), isn't that a violation of idempotency? I looked everywhere but all I can find is you can use PUT to create but nowhere it said about status code change after resource creation.
In short my question is that PUT is an idempotent method, but when it is used in resource creation, can it still change it's return status code from the following calls?
p.s.
After first calls, it will be idempotent(constantly 403 or 200). And ideally I want to be able to tell the client that the resource is already created and you shouldn't call this again.(403)
I know using POST is an alternative but as ID is already known to client at the point of creation I wanna use PUT method but want to know the proper REST way in terms of idempotency.
===================================================================
References of Using PUT endpoints for creating resources
http://restcookbook.com/HTTP%20Methods/put-vs-post/
https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
The fundamental difference between the POST and PUT requests is
reflected in the different meaning of the Request-URI. The URI in a
POST request identifies the resource that will handle the enclosed
entity. That resource might be a data-accepting process, a gateway to
some other protocol, or a separate entity that accepts annotations. In
contrast, the URI in a PUT request identifies the entity enclosed with
the request -- the user agent knows what URI is intended
9.6. PUT If a new resource is created, the origin server MUST inform the user agent via the 201 (Created) response. If an existing resource
is modified, either the 200 (OK) or 204 (No Content) response codes
SHOULD be sent to indicate successful completion of the request.
http://zalando.github.io/restful-api-guidelines/http/Http.html
PUT requests are usually robust against non-existence of resources by
implicitly creating before updating
successful PUT requests will usually generate 200 or 204 (if the
resource was updated - with or without actual content returned), and
201 (if the resource was created)
Idempotency is about the server state - not about the responses. E.g. DELETE is idempotent, but after the 2nd try the resource will not be found and you may choose to respond with 404. But the state of the server is going to be the same - the resource is deleted.
Same with PUT - you can invoke it multiple times, but the state of the server will always be the same after the operation is finished.
Ideally though you could reuse PUT for updating the resources. So when the 2nd request is arrived you can use that for updating instead of returning errors. That will probably simplify implementation and the contract.

Why not use PUT for REST queries that require a payload?

REST recommends that queries (not resource creation) be done through the GET method. In some cases, the query data is too large or structured in a way that makes it difficult to put in a URL, and to address this, a RESTful API is modified to support queries with bodies.
It seems that the convention for RESTful queries that require bodies is to use POST. Here are a few examples:
Dropbox API
ElasticSearch
O'Reilley's Restful Web Services Cookbook
Queries don't modify the internal state of the system, but POST doesn't support idempotent operations. However, PUT is idempotent. Why don't RESTful APIs use PUT with a body instead of POST for queries that require a body?
NOTE: A popular question asks which (PUT vs POST) is preferred for creating a resource. This question asks why PUT is not used for queries that require bodies.
No. PUT might be idempotent, but it also has a specific meaning. The body of the request in PUT should be used to replace the resource in the URI.
With POST no such assumptions are being made. And note that using a POST request means that the request might not be idempotent, in specific cases it still might be.
However, you could do it with PUT, but it requires you to jump through an extra hoop. Basically, you could create a "query resource" with PUT, and then use GET immediately after to fetch the results of this query resource. Perhaps this was what you were after, but this is the most RESTful, because the resulting query results can still be linked to. (something which is completely missing if you use POST requests).
You should read the standard: http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
The definition of POST:
The POST method is used to request that the origin server accept the
entity enclosed in the request as a new subordinate of the resource
identified by the Request-URI in the Request-Line.
The action performed by the POST method might not result in a resource
that can be identified by a URI. In this case, either 200 (OK) or 204
(No Content) is the appropriate response status, depending on whether
or not the response includes an entity that describes the result.
The definition of PUT
The PUT method requests that the enclosed entity be stored under the
supplied Request-URI. If the Request-URI refers to an already existing
resource, the enclosed entity SHOULD be considered as a modified
version of the one residing on the origin server. If the Request-URI
does not point to an existing resource, and that URI is capable of
being defined as a new resource by the requesting user agent, the
origin server can create the resource with that URI.
If the resource could not be created or modified with the Request-URI, an appropriate error response SHOULD be given that reflects the nature of the problem.
Another thing that PUT is not cacheable while POST is.
Responses to this method are not cacheable, unless the response
includes appropriate Cache-Control or Expires header fields.
e.g. http://www.ebaytechblog.com/2012/08/20/caching-http-post-requests-and-responses/

should I use PUT to create a resource with a known id (e.g.: email)

I believe to have read some time ago that creating a resource when the id is known (e.g.: email) should be done using a PUT on that resource.
E.g.:
PUT /user/chris#example.com
is this correct?
Yes, it is correct to use
PUT resource/{id} --> 204 No Content
when the id of the resource is being specified by the client and the operation is idempotent. The operation is idempotent if doing it two or more times in a row has the same effect as doing it once.
If you use POST, you usually do not provide a client identifier. Instead the server chooses its own identifier and informs the client of the created resource's location by sending a 201 Created response with a Location header.
POST resource --> 201 Created
Location: /resource/7
Yes, it's correct to use PUT to create a resource with a known URI. PUT asks the server to replace the resource at the target URI with the resource representation in the payload, so you have to know the target URI. However, keep in mind that PUT requires a complete representation, so if you're creating or updating a resource with an incomplete representation, you should use POST.
I think what more correctly to do it throught POST on new url and create if thid id does not exists.
Because PUT method is idempotent (e. g. if you send one request many time - result would be same).

S3 REST API and POST method

I'm using AWS S3 REST API, and after solving some annoying problems with signing it seems to work. However, when I use correct REST verb for creating resource, namely POST, I get 405 method not allowed. Same request works fine with method PUT and creates resource.
Am I doing something wrong, or is AWS S3 REST API not fully REST-compliant?
Yes, you are wrong in mapping CRUD to HTTP methods.
Despite the popular usage and widespread misconception, including high-rated answers here on Stack Overflow, POST is not the "correct method for creating resource". The semantics of other methods are determined by the HTTP protocol, but the semantics of POST are determined by the target media type itself. POST is the method used for any operation that isn't standardized by HTTP, so it can be used for creation, but also can be used for updates, or anything else that isn't already done by some other method. For instance, it's wrong to use POST for retrieval, since you have GET standardized for that, but it's fine to use POST for creating a resource when the client can't use PUT for some reason.
In the same way, PUT is not the "correct method for updating resource". PUT is the method used to replace a resource completely, ignoring its current state. You can use PUT for creation if you have the whole representation the server expects, and you can use PUT for update if you provide a full representation, including the parts that you won't change, but it's not correct to use PUT for partial updates, because you're asking for the server to consider the current state of the resource. PATCH is the method to do that.
In informal language, what each method says to the server is:
POST: take this data and apply it to the resource identified by the given URI, following the rules you documented for the resource media type.
PUT: replace whatever is identified by the given URI with this data, ignoring whatever is in there already, if anything.
PATCH: if the resource identified by the given URI still has the same state it had the last time I looked, apply this diff to it.
Notice that create or update isn't mentioned and isn't part of the semantics of those methods. You can create with POST and PUT, but not PATCH, since it depends on a current state. You can update with any of them, but with PATCH you have an update conditional to the state you want to update from, with PUT you update by replacing the whole entity, so it's an idempotent operation, and with POST you ask the server to do it according to predefined rules.
By the way, I don't know if it makes sense to say that an API is or isn't REST-compliant, since REST is an architectural style, not a spec or a standard, but even considering that, very few APIs who claim to be REST are really RESTful, in most cases because they are not hypertext driven. AWS S3 is definitely not RESTful, although where it bears on your question, their usage of HTTP methods follows the HTTP standard most of the time.
+--------------------------------------+---------------------+
| POST | PUT |
+--------------------------------------+---------------------+
| Neither safe nor idempotent Ex: x++; | Idempotent Ex: x=1; |
+--------------------------------------+---------------------+
To add to #Nicholos
From the http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
POST:
The posted entity is subordinate to the URI in the same way that a
file is subordinate to a directory containing it, a news article is
subordinate to a newsgroup to which it is posted, or a record is
subordinate to a database
The action performed by the POST method might not result in a resource
that can be identified by a URI. In this case, either 200 (OK) or 204
(No Content) is the appropriate response status, depending on whether
or not the response includes an entity that describes the result
If a resource has been created on the origin server, the response
SHOULD be 201 (Created)
PUT:
The PUT method requests that the enclosed entity be stored under the
supplied Request-URI. If the Request-URI refers to an already existing
resource, the enclosed entity SHOULD be considered as a modified
version of the one residing on the origin server. If the Request-URI
does not point to an existing resource, and that URI is capable of
being defined as a new resource by the requesting user agent, the
origin server can create the resource with that URI. If a new resource
is created, the origin server MUST inform the user agent via the 201
(Created) response. If an existing resource is modified, either the
200 (OK) or 204 (No Content) response codes SHOULD be sent to indicate
successful completion of the request
IMO PUT can be used to create or modify/replace the enclosed entity.
In the original HTTP specification, the resource given in the payload of a POST request is "considered to be subordinate to the specified object" (i.e. the request URL). TimBL has said previously (can't find the reference) that it was modelled on the identically-named method in NNTP.

REST: Deleting non-canonical resources

If /users/{username} is an alias of /users/{userId} (one cannot exist without the other) then I am expecting that deleting the former means that you want to delete the latter.
If /lists/{listId}/users/{userId} is an reference to /users/{userId} (one can exist without the other) then I'm expecting to be able to delete the reference without affecting the canonical resource.
In both cases, invoking HTTP GET will result in HTTP 303 to the canonical resource.
I believe that it is confusing that both types behave the same for HTTP GET but not for HTTP DELETE.
This has led me to the following questions:
What should happen when HTTP DELETE is invoked on an alias?
What is the appropriate way to remove references from a list without removing the actual object they point to?
5.3.5. DELETE
The DELETE method requests that the origin server delete the target
resource. This method MAY be overridden by human intervention (or
other means) on the origin server. The client cannot be guaranteed
that the operation has been carried out, even if the status code
returned from the origin server indicates that the action has been
completed successfully. However, the server SHOULD NOT indicate
success unless, at the time the response is given, it intends to
delete the resource or move it to an inaccessible location.
**> A successful response SHOULD be 200 (OK) if the response includes a
representation describing the status, 202 (Accepted) if the action has
not yet been enacted, or 204 (No Content) if the action has been
enacted but the response does not include a representation.**
So:
If /users/{username} is an alias of /users/{userId} (one cannot exist without the other)
then > I am expecting that deleting the former means that you want to delete the latter
What should happen when HTTP DELETE is invoked on an alias?
"both" resources must to be deleted. as i understand if one can not exist without other they are just different identifiers of same resource, so after deleting one of them following requests must be answered with 404 Not Found for both.
What is the appropriate way to remove references from a list without removing the actual object they point to?
UPDATE:
My question is what HTTP methods should be used to indicate that a client would like to remove references from a list?
If user(client?) would like to remove reference from a list without removing actual resource there is only one way - client must send PUT request with modified list to the server which means that existing list must be replaced with a new one. If user doesn't have enough permissions "403 Forbidden" can be returned in response.
To do the reverse of this(i.e. inform user when list was changed) if resource(i.e. list in this case) is supposed to be cached on client side and removal of some other resource affects this list, only approach i can think to "inform" user is to use caches. For example:
REQUEST
GET /list HTTP/1.1
Host: service.org
RESPONSE
HTTP/1.1 200 OK
Content-Type: application/...
Content-Length: ...
Cache-Control: private, max-age=0
ETag: a32lasdf
In the response above Cache-Control header value "private, max-age=0" is an instruction for client that it can cache representation locally though must send following request to the server each time:
REQUEST
GET /list HTTP/1.1
Host: service.org
If-None-Match: a32lasdf
if /list resource is untouched then ETag value will match "If-None-Match" value, server can respond with "304 Not Modified" and client doesn't need to do anything. But if previous removal action affected the /list, hash code for ETag will be different from one which is provided with "If-None-Match" header and thus new representation of the /list resource will be returned by server.
URIs are references to resources, not the resources themselves. In a sense, they are all canonical. Different URIs may identify the same resource. They may also be pleceholders for something that doesn't even exist yet. The "one cannot exist without the other" concept is something specific to the application, it's not a REST constraint. Ideally, you won't have URIs depending on each other. If you delete the resource all URIs pointing to it will point to nothing. Or at least that's how things should be if the URIs were pointing to resources and not to one another.
UPDATE:
According to RFC2396:
1.1 Overview of URI:
...
Resource
A resource can be anything that has identity. Familiar
examples include an electronic document, an image, a service
(e.g., "today's weather report for Los Angeles"), and a
collection of other resources. Not all resources are network
"retrievable"; e.g., human beings, corporations, and bound
books in a library can also be considered resources.
The resource is the conceptual mapping to an entity or set of
entities, not necessarily the entity which corresponds to that
mapping at any particular instance in time. Thus, a resource
can remain constant even when its content---the entities to
which it currently corresponds---changes over time, provided
that the conceptual mapping is not changed in the process.
Identifier
An identifier is an object that can act as a reference to
something that has identity. In the case of URI, the object is
a sequence of characters with a restricted syntax.
UPDATE:
I guess, if the resource is accessible/modifiable through multiple URIs, it's up to the application to designate a canonical one (the URI that is going to be included in the hypertext of responses). But yes, it's the URIs that are non-canonical (from the application's point of view), not the resources.