Correct RESTful method for checking input data matches database record - rest

I have an application that takes user's input personal details (name, dob, etc.) and looks for a match in the database. I am not creating a record, nor editing one.
Now normally I'd use GET, but GET doesn't allow me a body, and all the data that needs to be checked has to be sent up. I run a database query against all the user's inputs to find a match. If the database returns a match then one is found, otherwise it returns an empty array and no match is found.
So can you recommend the correct RESTful method (HTTP) that I might use for this?
Thanks

As the semantics of the payload of a POST operation are defined by the service itself, it should be used in any case where the other operations do not fit. POST is therefore not only used to create new resources but also to trigger certain calculations and stuff like that.
The general question however is, why do you need such a method? If you fear overwriting any changes done by an other client to the resource between you fetching the state of a resource and requesting an update, i.e., you should consider using conditional requests as defined in RFC 7232 where either an ETag hash is calculated for the current state or the Last-Modified value is taken and added to response header. A client could then send the request including the payload to check first including a If-Match or If-Unmodified-Since header requesting the server to apply an update only if the precondition holds. If it fails the server would tell you that with a 412 Precondition Failed error response.

Related

version number vs ETag for optimistic concurrency

I'm implementing my first Web API with JSON responses and got confused handling concurrency control for optimistic locking. Right now I develop the server with ASP.NET Core 6 and the client is developed with Angular. The data store is a SQL database. In the future we want to open the API to third parties.
I see two options for handling optimistic locking:
A) ETag in header
B) version number in the body
With option A) I see those two problems:
Since ETag is a header "Precondition Fails 412" needs to be send for failures. The error needs to be specified in the body, so the client can understand that it is a concurrency error.
For concurrency errors I would expect to send HTTP error code 409:
"...Conflicts are most likely to occur in response to a PUT request. For example, if versioning were being used and the entity being PUT included changes to a resource which conflict with those made by an earlier (third-party) request, the server might use the 409 response to indicate that it can't complete the request. In this case, the response entity would likely contain a list of the differences between the two versions in a format defined by the response Content-Type."
2) When a collection/list of representations (e.g. all due orders) is returned from the server there is no possibility to send multiple ETags. The ETag in the header applies to the whole collection/list, but each single resource in the list can be individually modified by different users and needs to have a version number for concurrency detection. I don't see any other option than to send an individual version number/ETag as a property with each representation in the body.
I find it confusing when collections are treated differently than single resources. Since the client needs to store the ETags for each single resource it is natural to include the ETag as a property in its object model anyway.
Option B) avoids the issues of A): I'm free to send a 409 on failure and single resources and collections of resources are treated the same way. The problem is the DELETE verb.
There is a discussion about using a body in a DELETE request at [1]: Is an entity body allowed for an HTTP DELETE request?
The current version of the RFC states:
"A client SHOULD NOT generate a body in a DELETE request. A payload received in a DELETE request has no defined semantics, cannot alter the meaning or target of the request, and might lead some implementations to reject the request."
What does "has no defined semantics" mean? What are the implications? I'm aware that some implementations are ignoring the body or have not implemented the option to send a body with DELETE. I still wonder if I could send a version number in the body. Generally, I don't understand the reasons a body should not be generated for DELETE? Why is it evil?
I understand that If-Match with ETag is the recommended solution to handle concurrency, but it can't handle basic collections. Why is there a 409 error code for concurrency issues, when the recommended solution ETag can't use it to be compliant? This is really confusing.
Keeping the version number in the body for concurrency detection would be consistent, but it does not work for DELETE.
Edited:
The above text was edited for clarification reflecting Everts comments.
I see the following options (even GET can be conditional I'm here only concerned about concurrency update issues):
A) Client requests use If-Match header. Server responses put ETag in header and in body
The ETag in the body gives the client a consistent way of handling single resources and collections. In case the collection needs to have a version on its own the ETag in the header is available and can also be used for conditional GET.
I have to give up sending 409 for concurrency errors.
B) Version number is sent in the body except for DELETE.
In the case of DELETE the version number must be either sent with a query parameter or with an If-Match header. It is not pleasant to have this exception, but a more clear 409 can be sent and single resources and collections are handled consistently.
I'm still unsure about what implementation to choose. Are there any clear criteria that can help with the decision? How has this problem been solved by others?
Do I misunderstand the usage of 409?
Update:
This German article https://www.seoagentur.de/seo-handbuch/lexikon/s/statuscode-412-precondition-failed/ explains the usage of 409 vs 412:
412 should only be used if the precondition (e.g. If-Match) caused the failure, while 409 should be used if the entity would cause a conflict. If a database with version id is used this id can be passed to the ETag and from the If-Match back to the version id. However, it is not forbidden to do it in the body of the entity itself. It just requires that the concept and how it works be explained, whereas the ETag solution just lets you refer to the HTTP specification. (comment: it still leaves the no body in DELETE problem)
The ETag with collections problem has more nuances:
a) If the returned collection is just data for a list view it normally doesn't hold all data of the displayed objects. If one item in the list is edited the client needs to first GET the full entity from a single resource URI and will get the required ETag with that request. It also provides are current version of the resource at the time of editing and not when the list was requested.
b) If the returned collection holds the full entity data and performance issues are relevant or many items independently need to be changed at once, than the ETags of each item can be passed to the client in the body.
Update 2:
The "Zalando RESTfull API and Event Guidelines" compare various options for optimisitc locking at [https://opensource.zalando.com/restful-api-guidelines/#optimistic-locking][2]
They favour sending the ETag in the body for server responses.
I currently think that this is the best option. The client than has the freedom to use that ETag or send a new Get on the single resource.
To me "Conflict 409" is the right response for a concurrency error, but the RFC states that a "Precondition Fails 412" needs to be send, which is not clear enough. Of course an error description in the body can clarify the cause of the error, but I would rather send 409.
Use the HTTP status codes as they are defined, not what you think they should mean based on the name.
When a collection (e.g. all due orders) is returned from the server there is no possibility to send multiple ETags. Each single resource can be individually modified. Therefore the only option is to send the ETag in the body as a property of each representation in the JSON response. It is confusing that collections are treated differently than single resources. Also the client needs to include a property for the ETag in its object model in any case.
HTTP doesn't really have a concept of collections, but you can give the collection itself its own ETag as well.
I don't understand all the implications of the specific wording. What does "has no defined semantics" mean? What is the reasons a body should not be generated?
A delete request should only delete the resource located at the URI. If it's successful, we can assume that the URI that was used in the DELETE request will return a 404 or 410 after the request was successful.
If you want to conditionally parameterize deletions, delete multiple resources at once or delete something other than the resource specified in the URI, the DELETE request is simply not appropriate for that use-case.
If you want to use ETags, just use an If-Match header.

How to use additional save feature along with search feature HTTP API call

I have implemented an /GET HTTP endpoint to provide search feature. The user sends search terms in query parameters and receives JSON response containing all search results.
Now I have to add a new feature i.e. save search. It means the user sends same search parameters and can also send a boolean parameter say save=true. I have to save the search term in database in this case for future uses. However this parameter is not mandatory.
I am confused over the following points:
Modify same GET HTTP endpoint allowing additional save parameter in query parameters.
Modify same GET HTTP endpoint but passing save parameter in request body instead of query parameters as its backend state changing parameter.
Use separate endpoint for save the parameters using POST method.
What is the standard/acceptable way of doing this?
As far as I understood your question you try to store a search request and by storing it also retrieve the response in one go?
Usually GET is used to retrieve a resources' state though as this method is defined as safe it shouldn't be used if certain state is created for the invoked resource as persisting the search query would be. RFC 7231 further states that:
A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.
I therefore would refrain from option #1 or #2 as this might break interoperability by certain clients.
POST on the otherhand is defined in RFC 7231 as
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.
It therefore should be used in every situation the other HTTP operations don't fit. The HTTP spec further defines that creating a new resource a 201 Created HTTP status code should be returned including a HTTP response header named Location containing the URI of the created resource. This URI can later be used to retrieve it's state (i.e. the performed search result).
From a client's perspective you are basically storing some query definition on the server and don't care where or how the server is actually persisting it. All you care is to retrieve a handle you can later on invoke. This doesn't prevent the server from returning the current search result within the response payload. And this is what I'd do exactly.
Proposed steps:
Send search request via POST
Store query definition
Generate the URI for the stored query
Perform the search according to the query
Return a response with a 201 Created status code and Location header pointing to the URI of the stored query and add the query result within the response payload
A client can later on use the returned URI to retrieve the current state of the resource, which the server can interpret as: execute the query stored for that URI and return the search result.
How the URI has to look like is not defined by the REST architecture. You might generate UUIDs or generate a hash value based on the query generate. The latter approach has the benefit that multiple identical queries wouldn't result in additional queries created but in the reusage of such. In such cases a redirect to the existing query resource should be performed to tell the client that his query already existed which also teaches the client the actual URI of the query resource as a side effect.

Which HTTP Verb for Read endpoint with request body

We are exposing an endpoint that will return a large data set. There is a background process which runs once per hour and generates the data. The data will be different after each run.
The requester can ask for either the full set of data or a subset. The sub set is determined via a set of parameters but the parameters are too long to fit into a uri which has a max length of 2,083 characters. https://www.google.co.uk/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=uri%20max%20length
The parameters can easily be sent in the request body but which which is the correct HTTP verb to use?
GET would be ideal but use of a body 'has no semantic meaning to a GET request' HTTP GET with request body
PUT is not appropriate because there is no ID and no data is being updated or replaced.
POST is not appropriate because a new resource is not being replaced and more importantly the server is not generating and Id.
http://www.restapitutorial.com/lessons/httpmethods.html
GET (read) would seem to be the most appropriate but how can we include the complex set of parameters to determine the response?
Many thanks
John
POST is the correct method. POST should be used for any operation that's not standardized by HTTP, which is your case, since there's no standard for a GET operation with a body. The reference you linked is just directly mapping HTTP methods to CRUD, which is a REST anti-pattern.
You are right that GET with body is to be avoided. You can experiment with other safe methods that take a request body (such as REPORT or SEARCH), or you can indeed use POST. I see no reason why the latter is wrong; what you're citing is just an opinion, not the spec.
Assuming that the queries against that big dataset are not totally random, you should consider adding stored queries to your API. This way clients can add, remove, update queries (through request body) using POST DELETE PUT. Maybe you can call them "reports".
This way the GET requests need only a reference as query parameter to these queries/reports, you don't have to send all the details with every requests.
But only if not all the requests from clients are unique.

Pass complex object to delete method of REST Service

I have REST service that manage the resource EASYPAY.. In this moment this service exposes 3 different methods:
Get a EasyPay request (GET);
Insert a Easypay request (POST);
Update a Easypay request (PUT).
Whe I inserto or update a request I must insert also a row on a trace table on my database.
Now I have to delete a Easypay request and I must add also a row on the trace table.
I wanted to use the DELETE HTTP verb, but I saw that with delete I cannot pass a complex object but just the ID of the request to delete.
I cannot use the PUT HTTP verb because I have already used it, and in any case it would not be conceptually correct...
I do not want to do more that one call from client to server (one for delete the request, the other one to add a row in the trace table)..
So I do not know how to solve the problem.
EDIT
I try to explain better... I have a web site that is deployed on two different server. One for front-end and one for Back-end. Back-end expose some REST services just for front-end and it has no access to internet (just to intranet).
The customer that access the web site can do a payment via a system called XPAY and it works really similar to paypal (XPAY is just another virtual POS).
So when the customer try to do a payment, I save some information on the database + I trace the attempt of the payment, then he is redirected to XPAY. There, he can do the payment. At the endy XPAY return to the web-site (the front end) communicating us the result of the payment.
The result is in the URL of payment, so i must take all the information in the url and send them to the back-end.
According to the result, I must update (if result is ok) or delete (if result is ko) the information I saved before and write a row on the trace table.
What do you suggest?
Thank you
There are actually a couple of ways to solve your problem. First, REST is just an architectural style and not a protocol. Therefore REST does not dictate how an URI has to be made up or what parameters you pass. It only requires a unique resource identifier and probably that it should be self-descriptive, meaning that a client can take further actions based on the returned content (HATEOAS, included links even to itself and proper content type specification).
DELETE
As you want to keep a trace of the deleted resource in some other table, you can either pass data within the URI itself maybe as query parameter (even JSON can be encoded in order to be passed as query parameter) or use custom HTTP headers to pass (meta-)information to the backend.
Sending a complex object (it does not matter if it is XML or JSON) as query parameter may cause certain issues though as some HTTP frameworks limit the maximum URI size to roughly 2000 characters. So if the invoked URI exceeds this limit, the backend may have trouble to fulfill the request.
Although the hypertext transfer protocol does not define a maximum number (or size) of headers certain implementations may raise an error if the request is to large though.
POST
You of course also have the possibility to send a new temporary resource to the backend which may be used to remove the pending payment request and add a new entry to the trace table.
According to the spec:
The action performed by the POST method might not result in a resource that can be identified by a URI. In this case, either 200 (OK) or 204 (No Content) is the appropriate response status, depending on whether or not the response includes an entity that describes the result.
This makes POST request feasible for short living temporary resources which trigger some processing on the server side. This is useful if you want to design some queue-like or listener system where you place an action for execution into the system. As a POST request may contain a body, you can therefore send the POS response within the body of the POST request. This action request can then be used to remove the pending POS request and add a new entry to the trace table.
PATCH
Patch is a way a client can instruct a server to transform one or more resources from state 1 to state 2. Therefore, the client is responsible for breaking down the neccessary actions the server has to take to transform the resources to their desired state while the server tries to execute them. A client always works on a known state (which it gathered at some previous time). This allows the client to modify the current state to a desired state and therefore know the needed steps for the transition. As of its atomic requirements either all instruction succeed or none of them.
A JSON Patch for your scenario may look like this:
PATCH /path/to/resource HTTP/1.1
Host: backend.server.org
Content-lengt: 137
Content-Type: application/json-patch+json
If-Match: "abc123"
[
{ "op": "remove", "path": "/easyPayRequest/12345" }
{ "op": "add", "path": "/trace/12345", "value": [ "answer": "POSAnswerHere" ] }
]
where 12345 is the ID of the actual easypay request and POSAnswerHere should be replaced with the actual response of the POS service or what the backend furthermore expects to write as a trace.
The If-Match header in the example just gurantees that the patch request is executed on the latest known state. If in the meantime a further process changed the state (which also generates a new If-Match value) the request will fail with a 412 Precondition Failed failure.
Discussion
While DELETE may initially be the first choice, it is by far not the best solution in your situation in my opinion as this request is not really idempotent. The actual POS entity deletion is idempotent but the add of the trace is not as multiple sends of the same request will add an entry for each request (-> side-effect). This however contradicts the idempotency requirements of the DELETE operation to some degree.
POST on the otherhand is an all-purpose operation that does not guarantee idempotency (as PATCH does not gurantee either). While it is mainly used to create new resources on the server side, only the server (or the creators of that server-application) know what it actually does with the request (though this may be extended to all operations). As there are no transactional restrictions adding the trace might succeed while the deletion of the pending request entity might fail or vice versa. While this may be handled by the dev-team, the actual operation does not give any gurantees on that issue. This may be a major concern if the server is not in your own hands and thus can not be modified or checked easily.
The PATCH request, which is still in RFC, may contain therefore a bit more semantic then the POST request. It also specifies the ability to modify more then one resource per request explicitely and insist on atomicity which also requires a transaction-like handling. JSON Patch is quite intuitive and conveys a more semantics then just adding the POS response to a POST entity body.
In my opinion PATCH should therefore be prefered over POSTor DELETE.

Is it valid to modify a REST API representation based on a If-Modified-Since header?

I want to implement a "get changed values" capability in my API. For example, say I have the following REST API call:
GET /ws/school/7/student
This gets all the students in school #7. Unfortunately, this may be a lot. So, I want to modify the API to return only the student records that have been modified since a certain time. (The use case is that a nightly process runs from another system to pull all the students from my system to theirs.)
I see http://blog.mugunthkumar.com/articles/restful-api-server-doing-it-the-right-way-part-2/ recommends using the if-modified-since header and returning a representation as follows:
Search all the students updated since the time requested in the if-modified-since header
If there are any, return those students with a 200 OK
If there are no students returned from that query, return a 304 Not Modified
I understand what he wants to do, but this seems the wrong way to go about it. The definition of the If-Modified-Since header (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24) says:
The If-Modified-Since request-header field is used with a method to make it conditional: if the requested variant has not been modified since the time specified in this field, an entity will not be returned from the server; instead, a 304 (not modified) response will be returned without any message-body.
This seems wrong to me. We would not be returning the representation or a 304 as indicated by the RFC, but some hybrid. It seems like client side code (or worse, a web cache between server and client) might misinterpret the meaning and replace the local cached value, when it should really just be updating it.
So, two questions:
Is this a correct use of the header?
If not (and I suspect not), what is the best practice? Query string parameter?
This is not the correct use of the header. The If-Modified-Since header is one which an HTTP client (browser or code) may optionally supply to the server when requesting a resource. If supplied the meaning is "I want resource X, but only if it's changed since time T." Its purpose is to allow client-side caching of resources.
The semantics of your proposed usage are "I want updates for collection X that happened since time T." It's a request for a subset of X. It does not seem like your motivation is to enable caching. Your client-side cached representation seemingly contains all of X, even though the typical request will only return you a small set of changes to X; that is, the response is not what you are directly caching, so the caching needs to happen in custom user logic client-side.
A query string parameter is a much more appropriate solution. Below {seq} would be something like a sequence number or timestamp.
GET /ws/schools/7/students/updates?since={seq}
Server-side I imagine you have a sequence of updates since the beginning of your system and a request of the above form would grab the first N updates that had a sequence value greater than {seq}. In this way, if a client ever got very far behind and needed to catch up, the results would be paged.