I am working on an application that is primarily concerned with polling data from a third party, then mapping and persisting relevant information to a customer. It does this on a fixed interval, but the requirements also specify a way to manually fire off the process. This manual call will only receive a status on the execution of the process, but is not interested in the data created by the process.
I implemented the call using HTTP, but my implementation has been identified as not RESTful. Having now done the research into what that means, I completely concur, as I was using the URI to define verbs instead of nouns.
Is it possible to make this RESTful? Is it okay to make a resource that is extremely transient, like this?
POST /rpc/{process}?param1=....&....
Or should the process itself be considered permanent, and the triggering of it a temporary update? I doubt this, as it is not an idempotent action.
PUT /rpc/{process}?run=true&param1......
Is there a correct way to handle this and still be called RESTful? If there is not, do I use SOAP, or continue to use the simple HTTP structure and live with the fact that it is not RESTful?
Use querystrings for filtering
POST/PUT depends on what you want your process to do (PUT for idempotent actions, POST for non-idempotent actions).
Keep it simple; I do not see any issues with the way you have resolved this.
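A minimal sketch of the POST variant, assuming a hypothetical in-memory handler (`post_rpc`, `RUN_LOG`, the process name `sync`, and the 200 status are illustrative choices, not taken from the question). Every call starts a new run and returns only a status, which is what makes the action non-idempotent:

```python
from itertools import count

RUN_LOG = []            # hypothetical record of executions
_run_ids = count(1)


def post_rpc(process, params):
    """Handle POST /rpc/{process}?...; returns (http_status, body)."""
    run_id = next(_run_ids)
    # A real service would poll the third party and persist the data here.
    RUN_LOG.append({"run": run_id, "process": process, "params": dict(params)})
    # The caller only receives the execution status, not the data produced.
    return 200, {"run": run_id, "status": "completed"}


first = post_rpc("sync", {"param1": "a"})
second = post_rpc("sync", {"param1": "a"})
print(first[1]["run"], second[1]["run"], len(RUN_LOG))  # 1 2 2
```

Two identical requests produce two separate runs, so POST is the method that describes this honestly.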
Related
I am looking for a REST API to do the following:
Search based on parameters sent, if results found, return the results.
If no results found, create a record based on search parameters sent.
Can this be accomplished by creating one single API, or are two separate APIs required?
I would expect this to be handled by a single request to a single resource.
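As a rough illustration of "a single request to a single resource", here is a sketch with a hypothetical in-memory store. The names `RECORDS` and `search_or_create`, and the choice of 200 versus 201 as status codes, are assumptions for the example:

```python
RECORDS = []  # hypothetical in-memory store


def search_or_create(params):
    """Return (http_status, records): 200 with matches, or 201 with the new record."""
    matches = [r for r in RECORDS if all(r.get(k) == v for k, v in params.items())]
    if matches:
        return 200, matches
    # Nothing matched: create a record from the search parameters themselves.
    record = dict(params)
    RECORDS.append(record)
    return 201, [record]


print(search_or_create({"name": "widget"})[0])  # 201
print(search_or_create({"name": "widget"})[0])  # 200
```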
Which HTTP method to use
This depends on the semantics of what is going on - we care about what the messages mean, rather than how the message handlers are implemented.
The key idea is the uniform interface constraint in REST; because we have a common understanding of what HTTP methods mean, general purpose connectors in the HTTP application can do useful work (for example, returning cached responses to a request without forwarding them to the origin server).
Thus, when trying to choose which HTTP method is appropriate, we can consider the implications the choice has on general purpose components (like web caches, browsers, crawlers, and so on).
GET announces that the meaning of the request is effectively read only; because of this, general purpose components know that they can dispatch this request at any time (for instance, a user agent might dispatch a GET request before the user decides to follow the link, to make the experience faster).
That's fine when you intend the request to provide the client with a copy of your search results, and the fact that you might end up making changes to server local state is just an implementation detail.
On the other hand, if the client is trying to edit the results of a particular search (but sometimes the server doesn't need to change anything), then GET isn't appropriate, and you should use POST.
A way to think about the difference is to consider what action you want to be taken when an intermediate cache holds a response from an earlier copy of "the same" request. If you want the cache to reuse the response, GET is the best; on the other hand, if you want the cache to throw away the old response (and possibly store the new one), then you should be using POST.
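The cache behaviour described above can be sketched with a toy intermediary. This is a deliberate simplification: `ToyCache` and `origin` are hypothetical names, and freshness, `Cache-Control`, and error responses are ignored.

```python
class ToyCache:
    """Toy intermediary: reuses GET responses, drops them on unsafe methods."""

    def __init__(self, origin):
        self.origin = origin          # callable(method, uri) -> response
        self.stored = {}              # uri -> cached GET response

    def request(self, method, uri):
        if method == "GET":
            if uri not in self.stored:
                self.stored[uri] = self.origin(method, uri)
            return self.stored[uri]
        # POST (or any other unsafe method): forward, then throw away the old copy.
        response = self.origin(method, uri)
        self.stored.pop(uri, None)
        return response


calls = []


def origin(method, uri):
    calls.append((method, uri))
    return f"response #{len(calls)}"


cache = ToyCache(origin)
a = cache.request("GET", "/search?q=x")
b = cache.request("GET", "/search?q=x")   # served from the cache
cache.request("POST", "/search?q=x")      # reaches the origin, invalidates
c = cache.request("GET", "/search?q=x")   # fetched from the origin again
print(a, b, c, len(calls))
```

The second GET never reaches the origin, while the GET that follows the POST does.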
I know the use of HTTP verbs is based on a standard specification. But my question is: if I use "GET" for update operations and write code logic to update, does it create issues in any scenario? Apart from the standard, what else could be the reason to use these verbs for a specific purpose only?
my question is: if I use "GET" for update operations and write code logic to update, does it create issues in any scenario?
Yes.
A simple example - suppose the network between the client and the server is unreliable; specifically, for a time, HTTP responses are being lost. A general purpose component (like a web proxy) might time out, and then, noticing that the method token of the request is GET, resend the request a second/third/fourth time, with your server performing its update on every GET request.
Let us further assume that these multiple update operations lead to an undesirable outcome; where do we properly affix blame?
Second example: you send someone a copy of the link to the update operation, so that they can send you a request at the appropriate time. But suppose you send that link to them in an email, and the email client recognizes the uri and (as a performance optimization) pre-fetches the link, triggering your update operation too early. Where do we properly affix the blame?
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property -- Fielding, 2002
In these, and other examples, blame is correctly affixed to your server, because GET has a standardized meaning which includes the constraint that the semantics of the request are safe.
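The first example (a proxy resending a GET whose response was lost) can be simulated in a few lines. This is only a sketch under stated assumptions: `server`, `flaky_proxy`, the `/update` path, and the `lost_responses` knob are all hypothetical.

```python
counter = {"updates": 0}


def server(method, uri):
    # Misuse of the interface: this handler updates state on GET.
    if uri == "/update":
        counter["updates"] += 1
    return "ok"


def flaky_proxy(method, uri, lost_responses, max_attempts=4):
    """Forward a request; if the response is lost, retry only safe methods."""
    for attempt in range(max_attempts):
        response = server(method, uri)       # the server always processes it
        if attempt >= lost_responses:
            return response                  # this response made it back
        if method != "GET":
            raise TimeoutError("not retrying an unsafe method")
    raise TimeoutError("gave up")


flaky_proxy("GET", "/update", lost_responses=2)
print(counter["updates"])  # 3
```

The proxy behaved correctly according to the meaning of GET; the three updates are the server's fault.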
That's not to say that you can't have side effects when handling a GET request; "hit counters" are almost as old as the web itself. You have a lot of freedom in your implementation; so long as you respect the uniform interface, there won't be too much trouble.
Experience report: one of our internal tools uses GET requests to trigger scheduling; in our carefully controlled context (which is not web scale), we get away with it, and have for a very long time.
To borrow your language, there are certainly scenarios that would give us problems; but given our controls we manage to avoid them.
I wouldn't like our chances, though, if requests started coming in from outside of our carefully controlled context.
I think it's a decent question. You're asking a hypothetical: is there any value to doing the right thing other than that we agreed to use GET for fetching? That is, is there value beyond the fact that it's 'semantically nice'? A similar question in HTML might be: "Is it ok to use a <div> with an onclick instead of a <button>?" (the answer is no).
There certainly is. Clients, servers and intermediaries all change their behavior depending on what method is used. Even if your server can process GET for updates, and you build a client that uses this, your browser might still get confused.
If you are interested in this subject, don't ask on a forum; read the spec. The HTTP specification tells you what clients, servers and proxies should do when they encounter certain methods, statuses and headers.
Start at RFC 7231.
I understand (From the accepted answer What is the difference between HTTP and REST?)
that REST is just a set of rules about how to use HTTP
Accepted answer says:
No, REST is the way HTTP should be used.
Today we only use a tiny bit of the HTTP protocol's methods – namely
GET and POST. The REST way to do it is to use all of the protocol's
methods.
For example, REST dictates the usage of DELETE to erase a document (be
it a file, state, etc.) behind a URI, whereas, with HTTP, you would
misuse a GET or POST query like ...product/?delete_id=22
My question is: what is the disadvantage or drawback (technical or design) if I continue to use the POST method instead of DELETE/PUT for deleting/updating a resource in REST?
My question is what is the disadvantage/drawback(technical or design)
If I continue to use POST method instead of DELETE/PUT for
deleting/updating the resource in Rest ?
The POST request is not idempotent, but the DELETE request is.
An idempotent HTTP method is an HTTP method that can be called many times without different outcomes.
Idempotency is important in building a fault-tolerant API.
Let's suppose a client wants to update a resource through POST. Since POST is not an idempotent method, calling it multiple times can result in wrong updates. What would happen if you sent the POST request to the server, but got a timeout? Did the resource actually get updated? Did the timeout happen when sending the request to the server, or when responding to the client? Can we safely retry, or do we need to figure out first what happened with the resource? By using idempotent methods, we do not have to answer these questions; we can safely resend the request until we actually get a response back from the server.
So, if you use POST for deleting, there will be consequences.
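To make the retry argument concrete, here is a small simulation. The handlers `post_add_five` and `put_set_fifteen`, the `quantity` field, and the `lost_responses` parameter are hypothetical; the point is only what happens when a client resends after a lost response.

```python
def send_with_retry(handler, state, lost_responses):
    """Call handler until a response arrives; the first `lost_responses` replies are dropped."""
    attempts = 0
    while True:
        result = handler(state)      # the server applies the request every time
        attempts += 1
        if attempts > lost_responses:
            return result


def post_add_five(state):            # non-idempotent: each call changes the outcome
    state["quantity"] += 5
    return state["quantity"]


def put_set_fifteen(state):          # idempotent: repeating it changes nothing further
    state["quantity"] = 15
    return state["quantity"]


via_post = {"quantity": 10}
via_put = {"quantity": 10}
send_with_retry(post_add_five, via_post, lost_responses=1)
send_with_retry(put_set_fifteen, via_put, lost_responses=1)
print(via_post["quantity"], via_put["quantity"])  # 20 15
```

With one lost response, the blind retry applies the POST twice, while the retried PUT still leaves the intended value.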
From a purely technical viewpoint, I am not aware of any real drawbacks. Others mentioned idempotency, but that does not come just by using DELETE, you still have to implement it anyway.
Which leaves us with design considerations:
Your clients (or rather the programmers programming against your API) might reasonably expect the DELETE method to delete things and the POST method to add things. If you don't follow that convention, you confuse them.
If you use POST for both deleting and adding things, you have to invent another way of telling what actually to do. Surely this isn't very hard, but it makes your API more complicated for no good reason.
For both these reasons, you will need more and better documentation, since you're not following RESTful principles, which are already documented.
When we use POST instead of DELETE in our REST API, we take the power of idempotency away from the client. By using POST, we are telling our API users that this API can produce a different result when it is hit multiple times.
In case of a timeout, the API user has to query for the resource he asked to delete. Then, if it is found, he has to call the POST API again to delete it.
If the same request is made using the DELETE method, we are assuring our API user that multiple calls to the same method will give the same result. Hence he can send any number of requests until he gets a successful deletion instead of a timeout, without querying first.
Note: Maintaining idempotency is the duty of the API maker. Just using the DELETE method does not give idempotency.
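Since idempotency has to be implemented by the API maker, here is one way such a handler might look. It is a sketch over a hypothetical in-memory store (`RESOURCES`, `delete_resource`), and the 204/404 status codes are an assumed convention. Note that idempotency is about the effect on server state; the status code of a repeated call may differ.

```python
RESOURCES = {"22": {"name": "example"}}   # hypothetical store


def delete_resource(resource_id):
    """Idempotent DELETE handler: repeating the call leaves the same server state."""
    # pop() with a default never raises, so a repeat of the request is harmless.
    existed = RESOURCES.pop(resource_id, None) is not None
    return 204 if existed else 404


statuses = [delete_resource("22") for _ in range(3)]
print(statuses, RESOURCES)  # [204, 404, 404] {}
```

A client that timed out on the first call can simply resend: either answer tells it the resource is gone.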
In REST we generally know that POST is used to add something, PUT is used to edit existing data, and DELETE is used to delete something; a POST request is not idempotent, but a DELETE request is.
Although these are the definitions, in my view we use these methods because they make it clear what each method is for, and using them minimizes the gap between the UI developer and the backend developer.
If you want to use the POST method instead of DELETE/PUT, there will not be any impact, but it is not a good coding standard.
Take this design of an API:
/articles/{id} - Returns an article. Client provides a token in the header to identify them.
/updated-articles - Returns collection of articles that have been updated since the client's last call to this endpoint, and only includes articles that this client previously requested. Client provides a token in the header to identify them.
The second endpoint doesn't sit very well with me. The design motivation of that second endpoint is that the client does not need to track the time of their last requests. Is this breaking the "statelessness" constraint of RESTful APIs? An alternative approach would be /updated-articles?since=YYYY-MM-DD but this would require clients to remember
Your "token" is basically a client id, and remembering the date of each client's last access amounts to keeping client state on the server.
Think about it: if you had to scale up your service, could you simply plug in a new server, copy your service's files, and redirect requests round-robin to one or the other of the two servers (without having them share information)? Clearly not, because you would need your table of tokens <-> date of last consultation to be shared between the two servers. So no, it's definitely not stateless.
Plus, I don't understand your point:
An alternative approach would be /updated-articles?since=YYYY-MM-DD
but this would require clients to remember
Wouldn't a token require the client to remember something too? On the contrary, this way would be RESTful, since the client state (the date of last consultation) would be kept on the client side.
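A minimal sketch of the `since` variant, assuming made-up sample articles and hypothetical names (`updated_articles`, `Client`, `replica_a`, `replica_b`). It ignores the "only articles this client previously requested" rule from the question and just shows that the server needs no per-client table, so any replica can answer:

```python
from datetime import date

ARTICLES = [                                  # sample data for illustration only
    {"id": 1, "updated": date(2024, 1, 10)},
    {"id": 2, "updated": date(2024, 3, 5)},
]


def updated_articles(since):
    """Stateless handler: the answer depends only on the request, not on who asked before."""
    return [a["id"] for a in ARTICLES if a["updated"] > since]


class Client:
    """The client, not the server, remembers when it last asked."""

    def __init__(self):
        self.last_sync = date.min

    def poll(self, server, today):
        ids = server(self.last_sync)
        self.last_sync = today
        return ids


replica_a = replica_b = updated_articles      # any replica can serve any request
client = Client()
first = client.poll(replica_a, date(2024, 2, 1))
second = client.poll(replica_b, date(2024, 4, 1))
print(first, second)  # [1, 2] [2]
```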
Basically, no, I don't think your second resource would break statelessness.
I think it's okay to have your clients keep track of their own 'updated at' time stamp. Your API should be stateless. The client doesn't have to be stateless.
If anything the client should retain a lot of state. The client will be a device central to one user and their specific needs. It's responsible for keeping track of the user's needs and current state. In this situation someone will have to store that time stamp. I think it should be your clients, not your server.
This is just my opinion though.
I did find a write-up on the true meaning of statelessness that I think could benefit you as well here.
We should avoid creating endpoints with no related entity. So instead of /updated-articles?since=<timestamp> a better approach should be:
/articles?updated=true&since-last-request=true or
/articles?updated-since-last-request=true
If the intended result should affect all clients. Meaning every request time stamp must be kept on the server. Or
/articles?updated-since=<timestamp>
If the intended result depends on each client behavior. That seems to be your case.
The choice between the former or the latter (or both) depends on the use case. But the main point is to avoid creating endpoints with no related entity and having special cases being defined by parameters.
As a guideline:
Endpoints are substantives, adjectives are parameters and verbs are the HTTP request methods
This also means a simple 'GET /articles' means returning ALL articles. To avoid abuse you may issue proper 4xx codes depending on the case.
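The guideline can be sketched as a single `/articles` handler that treats `updated-since` as an optional filter. The sample data, the function name `get_articles`, the `YYYY-MM-DD` format, and the choice of 400 for a malformed value are assumptions for illustration:

```python
from datetime import datetime
from urllib.parse import urlsplit, parse_qs

ARTICLES = [                                  # sample data for illustration only
    {"id": 1, "updated": datetime(2024, 1, 10)},
    {"id": 2, "updated": datetime(2024, 3, 5)},
]


def get_articles(url):
    """Handle GET /articles[?updated-since=YYYY-MM-DD]; returns (status, article ids)."""
    query = parse_qs(urlsplit(url).query)
    if "updated-since" not in query:
        return 200, [a["id"] for a in ARTICLES]      # plain GET /articles: everything
    try:
        since = datetime.strptime(query["updated-since"][0], "%Y-%m-%d")
    except ValueError:
        return 400, []                                # malformed timestamp
    return 200, [a["id"] for a in ARTICLES if a["updated"] > since]


print(get_articles("/articles"))
print(get_articles("/articles?updated-since=2024-02-01"))
print(get_articles("/articles?updated-since=yesterday"))
```

The endpoint stays a noun (`/articles`) and the special case lives entirely in the parameter.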
The CMIS protocol specification does not even contain the words "synchronous" or "asynchronous".
I guess that, implicitly, every server-side operation is synchronous?
Are there counter-examples?
Scenario: I create a document, then immediately I list the directory, and the file does not appear yet.
Is this scenario illegal?
(not a client problem, the client waits for HTTP response before going to the next instruction)
CMIS is synchronous, and doesn't have any semantics for transactions spanning multiple requests. Thus, each successful mutation request (POST, PUT, DELETE in the AtomPub binding) has an immediate effect on subsequent requests.
The scenario you depicted is not illegal, it's simply impossible to achieve under CMIS. If you need asynchronous operations, you need to decouple the client from CMIS and introduce an intermediate layer that e.g.:
caches the pre-execution state of a CMIS mutation
returns the cached value until the mutation is successful.
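Such an intermediate layer might look roughly like the following. This is not a CMIS API: `PendingMutationLayer` and its methods are hypothetical, and a plain dict stands in for the repository.

```python
class PendingMutationLayer:
    """Hypothetical layer between a client and a repository (not a CMIS API)."""

    def __init__(self, repository):
        self.repository = repository      # dict standing in for the CMIS repository
        self.pending = {}                 # key -> (pre-execution value, new value)

    def submit(self, key, new_value):
        # Cache the state as it was before the mutation is executed.
        self.pending[key] = (self.repository.get(key), new_value)

    def read(self, key):
        if key in self.pending:
            return self.pending[key][0]   # still the pre-execution state
        return self.repository.get(key)

    def complete(self, key):
        # Called once the mutation has succeeded against the repository.
        _, new_value = self.pending.pop(key)
        self.repository[key] = new_value


layer = PendingMutationLayer({"doc": "v1"})
layer.submit("doc", "v2")
before = layer.read("doc")     # pre-execution value while the mutation is pending
layer.complete("doc")
after = layer.read("doc")      # new value once the mutation has succeeded
print(before, after)  # v1 v2
```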
In general, CMIS operations are supposed to be synchronous. I don't know of any counter-examples. Asynchronous operation on the server would make it almost impossible to create applications.
Assuming you are using AtomPub under the covers, you might want to check
http://bitworking.org/projects/atom/rfc5023.html . Creating a resource returns its URI.
So I guess what you are seeing is either a bug in the implementation, or the client is relying on implementation details not covered by the spec. The Alfresco repository, for example, indexes asynchronously by default (Solr). Hence, I think it should be possible to come up with code that demonstrates the behavior you observe by backing the listing with a search.