how to handle resource checking on server side?
For example, my api looks like:
/books/{id}
After googling i found, that i should use HEAD method to check, if resource exists.
https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
I know, that i can use GET endpoint and use HEAD method to fetch information about resource and server does not return body in this case.
But what should i do on server side?
I have two options.
One endpoint marked as GET. I this endpoint i can use GET method to fetch data and HEAD to check if resource is available.
Two endpoints. One marked as GET, second as HEAD.
Why i'm considering second solution?
Let's assume, that GET request fetch some data from database and process them in some way which takes some time, eg. 10 ms
But what i actually need is only to check if data exists in database. So i can run query like
select count(*) from BOOK where id = :id
and immediately return status 200 if result of query is equal to 1. In this case i don't need to process data so i get a faster response time.
But... resource in REST is a object which is transmitted via HTTP, so maybe i should do processing data but not return them when i use HEAD method?
Thanks in advance for your answer!
You could simply delegate the HEAD handler to the existing GET handler and return the status code and headers only (ignoring the response payload).
That's what some frameworks such as Spring MVC and JAX-RS do.
See the following quote from the Spring MVC documentation:
#GetMapping — and also #RequestMapping(method=HttpMethod.GET), are implicitly mapped to and also support HTTP HEAD. An HTTP HEAD request is processed as if it were HTTP GET except but instead of writing the body, the number of bytes are counted and the "
Content-Length header set.
[...]
#RequestMapping method can be explicitly mapped to HTTP HEAD and HTTP OPTIONS, but that is not necessary in the common case.
And see the following quote from the JAX-RS documentation:
HEAD and OPTIONS requests receive additional automated support. On receipt of a HEAD request an implementation MUST either:
Call a method annotated with a request method designator for HEAD or, if none present,
Call a method annotated with a request method designator for GET and discard any returned entity.
Note that option 2 may result in reduced performance where entity creation is significant.
Note: Don't use the old RFC 2616 as reference anymore. It was obsoleted by a new set of RFCs: 7230-7235. For the semantics of the HTTP protocol, refer to the RFC 7231.
Endpoint should be the same and server side script should make decision what to do based on method. If method is HEAD, then just return suitable HTTP code:
204 if content exists but server don't return it
404 if not exists
4xx or 5xx on other error
If method is GET, then process request and return content with HTTP code:
200 if content exists and server return it
404 if not exists
4xx or 5xx on other error
The important thing is that URL should be the same, just method should be different. If URL will be different then we talking about different resources in REST context.
Your reference for HTTP methods is out of date; you should be referencing RFC 7231, section 4.3.2
The HEAD method is identical to GET except that the server MUST NOT send a message body in the response (i.e., the response terminates at the end of the header section).
This method can be used for obtaining metadata about the selected representation without transferring the representation data and is often used for testing hypertext links for validity, accessibility, and recent modification.
You asked:
resource in REST is a object which is transmitted via HTTP, so maybe i should do processing data but not return them when i use HEAD method?
That's right - the primary difference between GET and HEAD is whether the server returns a message-body as part of the response.
But what i actually need is only to check if data exists in database.
My suggestion would be to use a new resource for that. "Resources" are about making your database look like a web site. It's perfectly normal in REST to have many URI that map to a queries that use the same predicate.
Jim Webber put it this way:
The web is not your domain, it's a document management system. All the HTTP verbs apply to the document management domain. URIs do NOT map onto domain objects - that violates encapsulation. Work (ex: issuing commands to the domain model) is a side effect of managing resources. In other words, the resources are part of the anti-corruption layer. You should expect to have many many more resources in your integration domain than you do business objects in your business domain.
Related
I've read a lot of questions about 400 vs 422, but they are almost for HTTP POST requests for example this one: 400 vs 422 response to POST of data.
I am still not sure what should I use for GET when the required parameters are sent with wrong values.
Imagine this scenario:
I have the endpoint /searchDeclaration, with the parameter type.
My declarations have 2 types: TypeA and TypeB.
So, I can call this endpoint like this: /searchDeclaration?type=TypeA to get all TypeA declarations.
What error should I send when someone calls the endpoint with an invalid type? For example: /searchDeclaration?type=Type123
Should I send 400? I am not sure it is the best code, because the parameter is right, only the value is not valid.
The 422 code looks like it is more suitable for POST request.
EDIT:
After some responses I have another doubt.
The /searchDeclaration endpoints will return the declaration for the authenticated user. TypeA and TypeB are valid values, but some users don't have submitted a TypeB declaration, so when they call /searchDeclaration?type=TypeB which error should I send? Returning 404 does not seem right because the URI is correct, but that user does not have a declaration for that value yet. Am I overthinking this?
If the URI is wrong, use 404 Not Found.
The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource
The target resource is, as one might expect, identified by the target URI.
Aside from the semantics of the response, 404 indicates that the response is cacheable - meaning that general purpose caches will know that they can re-use this response to future requests
I am not sure it is the best code, because the parameter is right, only the value is not valid.
The fact that this URI includes parameters and values is an implementation detail; something that we don't care about at the HTTP level.
Casually: 404 means "the URI is spelled wrong"; but doesn't try to discriminate which part(s) of the URI have errors. That information is something that you can include in the body of the response, as part of the explanation of the error situation.
Am I overthinking this?
No, but I don't think you are thinking about the right things, yet.
One of the reasons that you are finding this challenging is that you have multiple logical resources sharing the same target URI. If each user declaration document had its own unique identifier, then the exercise of choosing the right response semantics would be a lot more straight forward.
Another way of handing it would be to redirect the client to the more specific URI, and then handle the response semantics there in the straight forward way.
It's trying to use a common URI for different logical resources AND respond without requiring an extra round trip that is making it hard. Bad news: this is one of the trade offs that ought to have been considered when designing your resource identifiers; if you don't want this to require harder thinking, don't use this kind of design.
The good news: 404 is still going to be fine - you are dealing with authorized requests, and the rules about sharing responses to authorized requests mean that the only possible confusion will be if different users are sharing the same private cache.
Remember: part of the point is that all resources share a common, general purpose message vocabulary. Everything is supposed to look like a document being served by a boring web server.
The fact that there's a bunch of complexity of meaning behind the resource is an implementation detail that is correctly hidden behind the uniform interface.
There are two options, it all depends on your 'type' variable.
If 'type' is an ENUM, which only allows 'typeA' and 'typeB', and your client sends 'type123', the service will respond with a '400 Bad Request' error, you don't need to check. In my opinion, this should be ideal, since if you need to add new 'type's in the future, you will only have to add them in the ENUM, instead of doing 'if-else' inside your code to check them all.
In case the 'type' variable is a String, the controller will admit a 'type123' and you should be the one to return an error, since the client request is not malformed, but rather it is trying to access a resource that does not exist.
In this case, you can return a 404 Not Found error, (that resource the client is filtering by cannot be found), or a 422 error as you say, since the server understands the request, but is not able to process it.
Let's assume for a moment that the resource you are querying is returning a set of entries that do contain certain properties. If you don't specify a filter you will basically get a (pageable) representation of those entries either as embedded objects or as links to those resources.
Now you want to filter the results based on some properties these entries have. Most programming languages nowadays provide some lambda functionality in the form of
List filteredList = list.filter(item => item.propertyX == ...)...;
The result of such a filter function is usually a list of items that fulfilled the specified conditions. If no items met the given condition then the result will be an empty list.
Applying certain filter conditions on the Web can be designed similarly. Is it really an error when a provided filter expression doesn't yield any entries? IMO it is not an error in terms of the actual message transport itself as the server was able to receive and parse the request without any issues. As such it has to be some kind of business rule that states that only admissible values are allowed as input.
If you or your company consider the case of a provided filter value for a property returning no results as an error or you perform some i.e. XML or JSON schemata validation on the received payload (for complex requests) then we should look at how those mentioned HTTP errors are defined:
The 400 (Bad Request) status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing). (Source: RFC 7230)
Here, it is clearly the case that you don't want to process requests that come with an invalid property value and as such decline the request as such.
The 422 (Unprocessable Entity) status code means the server understands the content type of the request entity (hence a 415(Unsupported Media Type) status code is inappropriate), and the syntax of the request entity is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process the contained instructions. For example, this error condition may occur if an XML request body contains well-formed (i.e., syntactically correct), but semantically erroneous, XML instructions. (Source: RFC 4918 (WebDAV))
In this case you basically say that the payload was actually syntactically correct but failed on a semmantical level.
Note that 422 Unprocessable Entity stems from WebDAV while 400 Bad Request is defined in the HTTP specification. This can have some impact if your API serves arbitrary HTTP clients. Ones that only know and support the HTTP error codes defined in the HTTP sepcification won't be able to really determine the semantics of the 422 response. They will still consider it as a user error, but won't be able to provide the client with any more help on that issue. As such, if your API needs to be as generic as possible, stick to 400 Bad Request. If you are sure all clients support 422 Unprocessable Entity go for that one.
General improvement hints
As you tagged your question with rest, let's see how we can improve this case.
REST is an architectural style with an intention of decoupling clients from servers to make the former one more failure tollerant while allowing the latter one to evolve freely over time. Servers in such architectures should therefore provide clients with all the things clients need to make valid requests. To avoid having clients to know upfront what a server expects as input, servers usually provide some kind of input mask clients can use to fill in stuff the server needs.
On the browsable Web this is usually accomplished by HTML Forms. The form not only teaches your client where to send the request to, which HTTP operation to use and which representation format the request should actually use (usually given implicitly as application/x-www-form-urlencoded) but also the sturcture and properties the server supports.
In HTML forms it is rather easy for the server to restrict the input choices of a client by using something along the lines of
<form action="/target">
<label for="cars">Choose a car:</label>
<select name="cars" id="cars">
<option value="volvo">Volvo</option>
<option value="saab">Saab</option>
<option value="opel">Opel</option>
<option value="audi">Audi</option>
</select>
<br/>
<input type="submit" value="Submit">
</form>
This doesn't really remove the needs to check and verify the correctness of the request on the server side, tough you make it much easier for the client to actually perform a valid request.
Unfortunately, HTML forms itself have their limits. I.e. they only allow POST and GET requests to be issues. While encType defaults to application/x-www-form-urlencoded, if you want to transfer files you should use multipart/form-data. Other than that, any valid content-type should be admissible.
If you prefer JSON-based payloads over HTML you might want to look into JSON Forms, HAL forms, Ion Forms among others.
Note though that you should adhere to the content type negotiation principles. Most often proactive content type negotiation is performed where a client sends its preferences within the Accept header and the server will select the best match somehow and return either the resource mapped to that representation format or respond with a 406 Not Acceptable response. While the standard doesn't prevent returning a default representation in such caes, it bears the danger that clients won't be able to process such responses then. A better alternative here would be to fall back to reactive negotiation where the server responds with a 300 Muliple Choice response where a client has to select one of the provided alternatives and then send a GET request to the selected alternative URIs to retrieve the content in the payload may be able to process.
If you want to provide a simple link a client can use to retrieve filtered results, the server should provide the client already with the full URI as well as a link relation name and/or extension relation type that the client can use to lookup the URI to retrieve the content for if interested in.
Both, forms and link-relation support, fall under the HATEOAS umbrella as they help to remove the need for any external documentation such as OpenAPI or Swagger documentation.
To sum things up, I would rethink whether a provided property value that does not exist should really end up as a business failure. I think returning an empty list is just fine here as you clearly state that way that for the given criterias no result was obtainable. If you though still want to stick to a business error check what clients actually make use of your API. If they support 422 go for that one. If you don't know, better stick to 400 as it should be understood by all HTTP clients equally.
In oder to remove the likelihood of ending up with requests that issue invalid property values, use forms to teach clients how requests should look like. Through certain elements or properties you can already teach a client that only a limited set of choices is valid for a certain property. Instead of a form you could also provide dedicated links a client can just use to obtain the filtered result. Just make sure to issue those links with meaningful link relatin names then.
As an example I have an order where the invoicing address can be modified. The change might trigger various additional actions (i.e. create a cancellation invoice and a new invoice with the updated address).
As recommended by various sources (see below) I don't want to have a PATCH on the order resource, because it has many other properties, but want to expose a dedicated endpoint, also called "intent" resource or subresource according to the web links below:
/orders/{orderId}/invoicing-address
Should I use a POST or a PATCH against this subresource?
The invoicing address itself has no ID. In the domain layer it is represented as a value object that is part of the order entity.
What ETag should be used for the subresource?
The address is part of the order and together with the items they form an aggregate in the domain layer. When the aggregate is updated it gets a new version number in the database. That version number is used as an ETag for optimistic locking.
Should a GET on invoicing-address respond with the order aggregate version number or a hash value of the address DTO in the ETag header?
What payload should be returned after updating the address?
Since the resource is the invoicing address it seems natural to return the updated address object (maybe with server side added fields). Should the body also include the ID/URI and the ETag of the order resource?
None of the examples I found with subresources showed any server responses or considered optimistic locking.
https://rclayton.silvrback.com/case-against-generic-use-of-patch-and-put
https://www.thoughtworks.com/insights/blog/rest-api-design-resource-modeling
https://softwareengineering.stackexchange.com/questions/371273/design-update-properties-on-an-entity-in-a-restful-resource-based-api (see provided answer)
https://www.youtube.com/watch?v=aQVSzMV8DWc&t=188s (Jim Webber at about about 31 mins)
As far as REST is concerned, "subresources" aren't a thing. /orders/12345/invoicing-address identifies a resource. The fact that this resource has a relationship with another resource identified by /orders/12345 is irrelevant.
Thus, the invoicing-address resource should understand HTTP methods exactly the same way as every other resource on the web.
Should I use a POST or a PATCH against this subresource?
Use PUT/PATCH if you are proposing a direct change to the representation of the resource. For example, these are the HTTP methods we would use if we were trying to fix a spelling error in an HTML document (PUT if we were sending a complete copy of the HTML document; PATCH if we were sending a diff).
PUT /orders/12345/invoicing-address
Content-Type: text/plain
1060 W Addison St.
Chicago, IL
60613
On the other hand, if you are proposing an indirect change to the representation of the resource (the request shows some information to the server, and the server is expected to compute a new representation itself)... well, we don't have a standardized method that means exactly that; therefore, we use POST
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.” -- Fielding, 2009
What ETag should be used for the subresource?
You should first give some thought to whether you want to use a strong-validator or a weak validator
A strong validator is representation metadata that changes value whenever a change occurs to the representation data that would be observable in the content of a 200 (OK) response to GET.
...
In contrast, a weak validator is representation metadata that might
not change for every change to the representation data.
...
a weak entity-tag ought to change whenever the origin server wants caches to invalidate old responses.
I might use a weak validator if the representation included volatile but insignificant information; I don't need clients to refresh their copy of a document because it doesn't have the latest timestamp metadata. But I probably wouldn't use an "aggregate version number" if I expected the aggregate to be changing more frequently than the invoicing-address itself changes.
What payload should be returned after updating the address?
See 200 OK.
In the case of a POST request, sending the current representation of the resource (after changes have been made to it) is nice because the response is cacheable (assuming you include the appropriate metadata signals in the response headers).
Responses to PATCH have similar rules to POST (see RFC 5789).
PUT is the odd man out, here
Responses to the PUT method are not cacheable.
Should the body also include the ID/URI and the ETag of the order resource?
Entirely up to you - HTTP components aren't going to be paying attention to the representation, so you can design that representation as makes sense to you. On the web, it's perfectly normal to return HTML documents with links to other HTML documents.
If I don't want to update a resource, but I just want to check if something is valid (in my case, a SQL query), what's the correct REST method?
I'm not GETting a resource (yet). I'm not PUTting, POSTing, or PATCHing anything (yet). I'm simply sending part of a form back for validation that only the server can do. Another equivalent would be checking that a password conforms to complexity requirements that can only be known by the domain, or perhaps there are other use cases.
Send object, validate, return response, continue with form. Using REST. Any ideas? Am I missing something?
What is the correct REST method for performing server side validation?
Asking whether a representation is valid should have no side effects on the server; therefore it should be safe.
If the representation that you want to validate can be expressed within the URI, then, you should prefer to use GET, as it is the simplest choice, and gives you the best semantics for caching the answer. For example, if we were trying to use a web site to create a validation api for a text (and XML or JSON validator, for instance), then we would probably have a form with a text area control, and construct the identifier that we need by processing the form input.
If the representation that you want to validate cannot be expressed within the URI, then you are going to need to put it into the message body.
Of the methods defined by RFC 7231, only POST is suitable.
Additional methods, outside the scope of this specification, have been standardized for use in HTTP. All such methods ought to be registered within the "Hypertext Transfer Protocol (HTTP) Method Registry" maintained by IANA, as defined in Section 8.1.
The HTTP method registry gives you a lot of options. For this case, I wouldn't bother with them unless you find either a perfect match, or a safe method that accepts a body and is close enough.
So maybe REPORT, which is defined in RFC 3253; I tend to steer clear of WebDAV methods, as I'm not comfortable stretching specifications for "remote Web content authoring operations" outside of their remit.
TLDR; There's a duplicate question around the topic of creating validation endpoints via REST:
In your case a GET request would seem sufficient.
The HTTP GET method is used to read (or retrieve) a representation of a resource. In the “happy” (or non-error) path, GET returns a representation in XML or JSON and an HTTP response code of 200 (OK). In an error case, it most often returns a 404 (NOT FOUND) or 400 (BAD REQUEST).
restapitutorial.com
For validating your SQL query you could use a GET request to get the valid state of your query potentially using a query parameter to achieve this.
GET: api/validateQuery?query="SELECT * FROM TABLE"
Returning:
200 (OK): Valid Query
400 (MALFORMED): Invalid Query
404 (NOT FOUND): Query valid but returns no results (if you plan on executing the query)
I think this type of endpoint is best served as a POST request. As defined in the spec, POST requests can be used for
Providing a block of data, such as the fields entered into an HTML form, to a data-handling process
The use of GET as suggested in another post, for me, is misleading and impractical based on the complexity & arbitrarity of SQL queries.
REST recommends that queries (not resource creation) be done through the GET method. In some cases, the query data is too large or structured in a way that makes it difficult to put in a URL, and to address this, a RESTful API is modified to support queries with bodies.
It seems that the convention for RESTful queries that require bodies is to use POST. Here are a few examples:
Dropbox API
ElasticSearch
O'Reilley's Restful Web Services Cookbook
Queries don't modify the internal state of the system, but POST doesn't support idempotent operations. However, PUT is idempotent. Why don't RESTful APIs use PUT with a body instead of POST for queries that require a body?
NOTE: A popular question asks which (PUT vs POST) is preferred for creating a resource. This question asks why PUT is not used for queries that require bodies.
No. PUT might be idempotent, but it also has a specific meaning. The body of the request in PUT should be used to replace the resource in the URI.
With POST no such assumptions are being made. And note that using a POST request means that the request might not be idempotent, in specific cases it still might be.
However, you could do it with PUT, but it requires you to jump through an extra hoop. Basically, you could create a "query resource" with PUT, and then use GET immediately after to fetch the results of this query resource. Perhaps this was what you were after, but this is the most RESTful, because the resulting query results can still be linked to. (something which is completely missing if you use POST requests).
You should read the standard: http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
The definition of POST:
The POST method is used to request that the origin server accept the
entity enclosed in the request as a new subordinate of the resource
identified by the Request-URI in the Request-Line.
The action performed by the POST method might not result in a resource
that can be identified by a URI. In this case, either 200 (OK) or 204
(No Content) is the appropriate response status, depending on whether
or not the response includes an entity that describes the result.
The definition of PUT
The PUT method requests that the enclosed entity be stored under the
supplied Request-URI. If the Request-URI refers to an already existing
resource, the enclosed entity SHOULD be considered as a modified
version of the one residing on the origin server. If the Request-URI
does not point to an existing resource, and that URI is capable of
being defined as a new resource by the requesting user agent, the
origin server can create the resource with that URI.
If the resource could not be created or modified with the Request-URI, an appropriate error response SHOULD be given that reflects the nature of the problem.
Another thing that PUT is not cacheable while POST is.
Responses to this method are not cacheable, unless the response
includes appropriate Cache-Control or Expires header fields.
e.g. http://www.ebaytechblog.com/2012/08/20/caching-http-post-requests-and-responses/
I'm using AWS S3 REST API, and after solving some annoying problems with signing it seems to work. However, when I use correct REST verb for creating resource, namely POST, I get 405 method not allowed. Same request works fine with method PUT and creates resource.
Am I doing something wrong, or is AWS S3 REST API not fully REST-compliant?
Yes, you are wrong in mapping CRUD to HTTP methods.
Despite the popular usage and widespread misconception, including high-rated answers here on Stack Overflow, POST is not the "correct method for creating resource". The semantics of other methods are determined by the HTTP protocol, but the semantics of POST are determined by the target media type itself. POST is the method used for any operation that isn't standardized by HTTP, so it can be used for creation, but also can be used for updates, or anything else that isn't already done by some other method. For instance, it's wrong to use POST for retrieval, since you have GET standardized for that, but it's fine to use POST for creating a resource when the client can't use PUT for some reason.
In the same way, PUT is not the "correct method for updating resource". PUT is the method used to replace a resource completely, ignoring its current state. You can use PUT for creation if you have the whole representation the server expects, and you can use PUT for update if you provide a full representation, including the parts that you won't change, but it's not correct to use PUT for partial updates, because you're asking for the server to consider the current state of the resource. PATCH is the method to do that.
In informal language, what each method says to the server is:
POST: take this data and apply it to the resource identified by the given URI, following the rules you documented for the resource media type.
PUT: replace whatever is identified by the given URI with this data, ignoring whatever is in there already, if anything.
PATCH: if the resource identified by the given URI still has the same state it had the last time I looked, apply this diff to it.
Notice that create or update isn't mentioned and isn't part of the semantics of those methods. You can create with POST and PUT, but not PATCH, since it depends on a current state. You can update with any of them, but with PATCH you have an update conditional to the state you want to update from, with PUT you update by replacing the whole entity, so it's an idempotent operation, and with POST you ask the server to do it according to predefined rules.
By the way, I don't know if it makes sense to say that an API is or isn't REST-compliant, since REST is an architectural style, not a spec or a standard, but even considering that, very few APIs who claim to be REST are really RESTful, in most cases because they are not hypertext driven. AWS S3 is definitely not RESTful, although where it bears on your question, their usage of HTTP methods follows the HTTP standard most of the time.
+--------------------------------------+---------------------+
| POST | PUT |
+--------------------------------------+---------------------+
| Neither safe nor idempotent Ex: x++; | Idempotent Ex: x=1; |
+--------------------------------------+---------------------+
To add to #Nicholos
From the http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
POST:
The posted entity is subordinate to the URI in the same way that a
file is subordinate to a directory containing it, a news article is
subordinate to a newsgroup to which it is posted, or a record is
subordinate to a database
The action performed by the POST method might not result in a resource
that can be identified by a URI. In this case, either 200 (OK) or 204
(No Content) is the appropriate response status, depending on whether
or not the response includes an entity that describes the result
If a resource has been created on the origin server, the response
SHOULD be 201 (Created)
PUT:
The PUT method requests that the enclosed entity be stored under the
supplied Request-URI. If the Request-URI refers to an already existing
resource, the enclosed entity SHOULD be considered as a modified
version of the one residing on the origin server. If the Request-URI
does not point to an existing resource, and that URI is capable of
being defined as a new resource by the requesting user agent, the
origin server can create the resource with that URI. If a new resource
is created, the origin server MUST inform the user agent via the 201
(Created) response. If an existing resource is modified, either the
200 (OK) or 204 (No Content) response codes SHOULD be sent to indicate
successful completion of the request
IMO PUT can be used to create or modify/replace the enclosed entity.
In the original HTTP specification, the resource given in the payload of a POST request is "considered to be subordinate to the specified object" (i.e. the request URL). TimBL has said previously (can't find the reference) that it was modelled on the identically-named method in NNTP.