I am working on a distributed execution server. I have decided to use a REST API based on HTTP on the server. The clients will connect to the server and GET the next task to be accomplished. Obviously I need to "update" the task that is retrieved to ensure that it is only processed once. A GET is not supposed to have any side effects (like changing the state of the resource retrieved).

I could use a POST (to update the resource), but I also need to retrieve it. I am thinking that I could have a URL where a POST marks the task as "claimed", and then a GET marks the task as retrieved. Unfortunately that puts a side effect on GET again.

Is this just not going to work in REST? I am OK with having a "function" resource to do this, but I don't want to give up the paradigm without a little research.
Pat O
If nothing else fits, you're supposed to use a POST request. Nothing prevents you from returning the resource in the response to a POST. But it then becomes apparent that something will happen to that resource, which wouldn't be the case with a GET request.
REST is really just a concept, and you can implement it however you want. There is no one 'right way', as everyone's use cases are different. (Yes, I understand that there is a defined spec out there, but you can still do it however you want.) In this situation, if your GET needs to have a side effect, it will have a side effect. Just make sure to properly document what you did (and potentially why you did it).
However, it sounds like you're just trying to create a queue with multiple subscribers, and if the subscribers are automated (such as scripts or other machines) you may want to look at using an actual queue (http://www.rabbitmq.com/getstarted.html).
If you are using this to power a web UI or something where actual people process this, you could also use a queue, with your GET request simply pulling the next item from the queue.
Note that with most messaging systems you will not be able to guarantee the order in which messages are pulled from the queue, so if ordering matters you may not be able to use this approach.
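If you do stick with plain HTTP rather than a broker, a common compromise is to make "claim the next task" an explicit POST that both updates and returns the task, so GET stays side-effect free. Here is a minimal sketch in TypeScript/Express, with an in-memory list standing in for a real store (the route and field names are invented for illustration):

```typescript
import express from "express";

interface Task {
  id: number;
  payload: string;
  status: "pending" | "claimed";
}

const app = express();

// In-memory stand-in for a real task store.
const tasks: Task[] = [
  { id: 1, payload: "resize-images", status: "pending" },
  { id: 2, payload: "send-report", status: "pending" },
];

// POST /tasks/claim: mark the next pending task as claimed and return it
// in the response body. Using POST makes the state change explicit, so
// caches and other intermediaries will not replay or reuse it.
app.post("/tasks/claim", (_req, res) => {
  const next = tasks.find((t) => t.status === "pending");
  if (!next) {
    res.status(204).end(); // nothing to hand out right now
    return;
  }
  next.status = "claimed";
  res.status(200).json(next);
});

app.listen(3000);
```

In a real deployment the find-and-mark step would need to be atomic (a transaction or row lock) so two clients can't claim the same task, which is exactly the guarantee a proper message queue gives you for free.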
Related
I am looking for a REST API to do the following:
Search based on the parameters sent; if results are found, return them.
If no results are found, create a record based on the search parameters sent.
Can this be accomplished with one single API, or are two separate APIs required?
I would expect this to be handled by a single request to a single resource.
Which HTTP method to use
This depends on the semantics of what is going on - we care about what the messages mean, rather than how the message handlers are implemented.
The key idea is the uniform interface constraint in REST; because we have a common understanding of what HTTP methods mean, general purpose connectors in the HTTP application can do useful work (for example, returning cached responses to a request without forwarding them to the origin server).
Thus, when trying to choose which HTTP method is appropriate, we can consider the implications the choice has on general purpose components (like web caches, browsers, crawlers, and so on).
GET announces that the meaning of the request is effectively read only; because of this, general purpose components know that they can dispatch this request at any time (for instance, a user agent might dispatch a GET request before the user decides to follow the link, to make the experience faster).
That's fine when you intend the request to provide the client with a copy of your search results, and the fact that you might end up making changes to server local state is just an implementation detail.
On the other hand, if the client is trying to edit the results of a particular search (but sometimes the server doesn't need to change anything), then GET isn't appropriate, and you should use POST.
A way to think about the difference is to consider what action you want to be taken when an intermediate cache holds a response from an earlier copy of "the same" request. If you want the cache to reuse the response, GET is the best; on the other hand, if you want the cache to throw away the old response (and possibly store the new one), then you should be using POST.
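To make that concrete, here is a minimal sketch of the "implementation detail" reading, with invented resource and field names: the client asks to search, and the server quietly creates a record when nothing matches. If the client's intent were really to edit the collection, you would swap GET for POST.

```typescript
import express from "express";

interface SearchRecord {
  id: number;
  term: string;
}

const app = express();

const records: SearchRecord[] = [];
let nextId = 1;

// GET /records?term=...
// From the client's point of view this is just a search; creating a record
// when nothing matches is an implementation detail on the server, so GET
// (and cached responses) remain appropriate.
app.get("/records", (req, res) => {
  const term = String(req.query.term ?? "");
  let found = records.filter((r) => r.term === term);
  if (found.length === 0) {
    const created: SearchRecord = { id: nextId++, term };
    records.push(created);
    found = [created];
  }
  res.json(found);
});

app.listen(3000);
```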
A while ago I took part in an interview that included a question about REST modelling and the best way to implement it. The question was:
You have a REST API where you expose a method to consult the distance between two points, but you must save each request to this method in order to expose the request history.
I was asked which HTTP method should be used in this case, and for me the logical answer at that moment was GET (to perform both actions). The interviewer then asked me why, arguing that since we are also storing the request, this endpoint is not idempotent anymore, and I wasn't able to reply. Since this is still on my mind, I decided to ask here and see other opinions about which method should be used in this case (or how many, e.g. GET and POST).
You have a REST API where you expose a method to consult the distance between two points, but you must save each request to this method in order to expose the request history.
How would you do this on the web? You'd probably have a web page with a form, and that form would have input controls to collect the start and end point. When you submit the form, the browser would use the data in the controls, as well as the form metadata and standard HTML processing rules to create a request that would be sent to the server.
Technically, you could use POST as the method of the form. It's completely legal to do that. BUT, as the semantics of the request are "effectively read only", a better choice would be to use GET.
More precisely, this would mean having a family of similar resources, the representation of which includes information about the two points described in the query string.
That family of similar resources would probably be implemented on your origin server as a single operation/route, with a parser extracting the two points from the query string and passing them along to the function as arguments.
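A sketch of what that single route might look like (the query parameter names, and the placeholder distance function, are assumptions for the example): the origin server parses the two points out of the query string and appends to a history log as a pure implementation detail.

```typescript
import express from "express";

const app = express();

// Placeholder for the real distance calculation.
function distanceBetween(from: string, to: string): number {
  return Math.abs(from.length - to.length);
}

// History log: appended to on every lookup, purely as an implementation detail.
const history: { from: string; to: string; at: Date }[] = [];

// GET /distance?from=...&to=...
app.get("/distance", (req, res) => {
  const from = String(req.query.from ?? "");
  const to = String(req.query.to ?? "");
  if (!from || !to) {
    res.status(400).json({ error: "from and to are required" });
    return;
  }
  history.push({ from, to, at: new Date() });
  res.json({ from, to, distance: distanceBetween(from, to) });
});

app.listen(3000);
```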
the interviewer asked me why, arguing that since we are also storing the request, this endpoint is not idempotent anymore
This is probably the wrong objection - the semantics of GET requests are safe (effectively read only). So the interviewer might argue that saving the request history is not read only. However, this objection is invalid, because the semantic constraints apply to the request message, not the implementation.
For instance, you may have noticed that HTTP servers commonly add an entry to their access log for each request. Clearly that's not "read only" - but it is merely an implementation detail; the client's request did not say "and also log this".
GET is still fine here, even though the server is writing things down.
One possible objection would be that, if we use GET, then sometimes a cache will return a previous response rather than passing the request all the way through to the origin server to get logged. Which is GREAT - caches are a big part of the reason that the web can be web scale.
But if you don't want caching, the correct way to handle that is to add metadata to the response to inhibit caching, not to change the HTTP method.
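Concretely, that metadata is a single response header rather than a different method; for example, reusing the handler and helpers from the sketch above:

```typescript
// Same route as the sketch above, but explicitly opting out of caching.
app.get("/distance", (req, res) => {
  // Tell caches not to store or reuse this response, so every lookup
  // reaches the origin server and is recorded in the history there.
  res.set("Cache-Control", "no-store");
  const from = String(req.query.from ?? "");
  const to = String(req.query.to ?? "");
  history.push({ from, to, at: new Date() });
  res.json({ from, to, distance: distanceBetween(from, to) });
});
```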
Another possibility, which is more consistent with the interviewer's "idempotent" remark, is that they wanted this "request history" to be a resource that the client could edit, and that looking up distances would be a side effect of that editing process.
For instance, we might have some sort of an "itinerary" resource with one or more legs provided by the client. Each time the client modifies the itinerary (for example, by adding another leg), the distance lookup method is called automatically.
In this kind of a problem, where the client is (logically) editing a resource, the requests are no longer "effectively read only". So GET is off the table as an option, and we have to look into the other possibilities.
The TL;DR version is that POST would always be acceptable (and this is how we would do it on the web), but you might prefer an API style where the client edits the representation of the resource locally, in which case you would let the client choose between PUT and PATCH.
I know the use of HTTP verbs is based on a standard specification. But my question is: if I use GET for update operations and write code logic to perform the update, does it create issues in any scenario? Apart from the standard, what other reason is there to use these verbs only for their specific purposes?
my question is: if I use GET for update operations and write code logic to perform the update, does it create issues in any scenario?
Yes.
A simple example - suppose the network between the client and the server is unreliable; specifically, for a time, HTTP responses are being lost. A general purpose component (like a web proxy) might time out, and then, noticing that the method token of the request is GET, resend the request a second/third/fourth time, with your server performing its update on every GET request.
Let us further assume that these multiple update operations lead to an undesirable outcome; where do we properly affix blame?
Second example: you send someone a copy of the link to the update operation, so that they can send you a request at the appropriate time. But suppose you send that link to them in an email, and the email client recognizes the uri and (as a performance optimization) pre-fetches the link, triggering your update operation too early. Where do we properly affix the blame?
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property -- Fielding, 2002
In these, and other examples, blame is correctly affixed to your server, because GET has a standardized meaning which includes the constraint that the semantics of the request are safe.
That's not to say that you can't have side effects when handling a GET request; "hit counters" are almost as old as the web itself. You have a lot of freedom in your implementation; so long as you respect the uniform interface, there won't be too much trouble.
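A hit counter makes the point nicely: the handler writes something down, but the request still just means "show me this page", so GET remains the right method. A tiny sketch (the route name is invented):

```typescript
import express from "express";

const app = express();

let hits = 0;

// GET /welcome: incrementing the counter is an implementation detail;
// the request itself still just asks to read the page, so GET keeps the
// "safe" semantics the uniform interface cares about.
app.get("/welcome", (_req, res) => {
  hits += 1;
  res.send(`Welcome! This page has been viewed ${hits} times.`);
});

app.listen(3000);
```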
Experience report: one of our internal tools uses GET requests to trigger scheduling; in our carefully controlled context (which is not web scale), we get away with it, and have for a very long time.
To borrow your language, there are certainly scenarios that would give us problems; but given our controls we manage to avoid them.
I wouldn't like our chances, though, if requests started coming in from outside of our carefully controlled context.
I think it's a decent question. You're asking a hypothetical: is there any value in doing the right thing other than the fact that we've agreed to use GET for fetching? In other words, is there value beyond it being 'semantically nice'? A similar question in HTML might be: "Is it OK to use a <div> with an onclick instead of a <button>?" (The answer is no.)
There certainly is. Clients, servers, and intermediaries all change their behavior depending on what method is used. Even if your server can process GET for updates, and you build a client that uses this, your browser might still get confused.
If you are interested in this subject, don't ask on a forum; read the spec. The HTTP specification tells you what clients, servers and proxies should do when they encounter certain methods, statuses and headers.
Start at RFC7231
I'm building a REST API and I'm trying to keep it as RESTful as possible, but some things are still not quite clear to me. I've seen a lot of topics about similar questions, but they are all centered on the "simple" problem of updating data; my issue is more about the business logic around it.
My main issue is with business logic triggered by a partial update of a model. I see a lot of different opinions online about PATCH methods, creating new sub-resources, or adding actions, but they often seem counterproductive to the REST approach of keeping URIs simple and structured.
I have some records that need to be processed (refused, validated, partially validated, etc.), and each change triggers additional actions:
If it's refused, an email with the reason should be sent.
If it's partially validated, a link to fill in the missing data is sent.
If it's validated, some other resources must be created.
There are a few other changes that can be made to the status, but this is enough for the example.
What would be a RESTful way to do that?
My first idea would be to create actions:
POST /record/:id/refuse
POST /record/:id/validate, etc.
It seems RESTful to me but too complicated; moreover, this approach means having multiple routes performing essentially the same thing: updating one field in the record object.
I also see the possibility of a PATCH method like:
PATCH /record/:id, in which I check whether the field to update is status, and look at the new value to know which action to perform.
But I feel it can become too complex when I need to perform similar actions for other properties of the record.
My last option, and I think maybe the best (though I'm not sure if it's RESTful), would be to use a sub-resource status and to use PUT to update it:
PUT /record/:id/status, with a switch on the new value.
No matter what the previous value was, switching to accepted will always trigger the creation, switching to refused will always trigger the email, etc.
Are those ways of achieving that RESTful, and which one makes more sense? Are there other alternatives I didn't think about?
Thanks
What would be a RESTful way to do that?
In HTTP, your "uniform interface" is that of a document store. Your REST API is a facade that takes messages with remote authoring semantics (PUT/POST/PATCH), and your implementation produces useful work as a side effect of handling those messages.
See Jim Webber 2011.
I have some records that need to be processed (refused, validated, partially validated, etc.), and each change triggers additional actions.
So think about how we might do this on the web. We GET some resource, and what is returned is an html representation of the information of the record and a bunch of forms that describe actions we can do. So there's a refused form, and a validated form, and so on. The user chooses the correct form to use in the browser, fills in any supplementary information, and submits the form. The browser, using the HTML form processing rules, converts the form information into an HTTP request.
For unsafe operations, the form is configured to use POST, and the browsers therefore know that the form data should be part of the message-body of the request.
The target-uri of the request is just whatever was used as the form action -- which is to say, the representation of the form includes in it the information that describes where the form should be submitted.
As far as the browser and the user are concerned, the target-uri can be anything. So you could have separate resources to handle validate messages and refused messages and so on.
Caching is an important idea, both in REST and in HTTP; HTTP has specific rules baked into it for cache invalidation. Therefore, it is often the case that you will want to use a target-uri that identifies the document you want the client to reload if the command is successful.
So it might go something like this: we GET /record/123, and that gives us a bunch of information, and also some forms describing how we can change the record. So fill one out, submit it successfully, and now we expect the forms to be gone - or a new set of forms to be available. Therefore, it's the record document itself that we would expect to be reloading, and the target-uri of the forms should be /record/123.
(So the API implementation would be responsible for looking at the HTTP request, and figuring out the meaning of the message. They might all go to a single /record/:id POST handler, and that code looks through the message-body to figure out which internal function should do the work).
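As a sketch of that single handler, assuming the forms submit a status field plus any supplementary data (the field names and internal functions are invented for the example):

```typescript
import express from "express";

const app = express();
app.use(express.urlencoded({ extended: false })); // HTML form bodies
app.use(express.json());

// Hypothetical internal operations triggered by the status change.
function sendRefusalEmail(id: string, reason: string) {
  console.log(`emailing refusal reason "${reason}" for record ${id}`);
}
function sendMissingDataLink(id: string) {
  console.log(`sending missing-data link for record ${id}`);
}
function createFollowUpResources(id: string) {
  console.log(`creating follow-up resources for record ${id}`);
}

// Every form targets the record itself, so a successful POST invalidates
// any cached copy of /record/:id.
app.post("/record/:id", (req, res) => {
  const { id } = req.params;
  switch (req.body.status) {
    case "refused":
      sendRefusalEmail(id, String(req.body.reason ?? ""));
      break;
    case "partially-validated":
      sendMissingDataLink(id);
      break;
    case "validated":
      createFollowUpResources(id);
      break;
    default:
      res.status(422).send("unknown status");
      return;
  }
  res.redirect(303, `/record/${id}`); // send the client back to the updated record
});

app.listen(3000);
```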
PUT/PATCH are the same sort of idea, except that instead of submitting forms, we send edited representations of the resource itself. We GET /record/123, change the status (for example, to Rejected), and then send a copy of our new representation of the record to the server for processing. It would therefore be the responsibility of the server to examine the differences between its representation of the resource and the new provided copy, and calculate from them any necessary side effects.
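The PATCH variant of the same idea, reusing the placeholder functions and middleware from the previous sketch: the server compares the stored status with the one in the edited representation and derives the side effects from the difference.

```typescript
// Stored status per record id (a stand-in for the real persistence layer).
const statuses = new Map<string, string>();

// PATCH /record/:id: the client sends an edited representation; the server
// diffs it against what it has stored and calculates the side effects.
app.patch("/record/:id", (req, res) => {
  const { id } = req.params;
  const oldStatus = statuses.get(id) ?? "pending";
  const newStatus = String(req.body.status ?? oldStatus);

  if (newStatus !== oldStatus) {
    if (newStatus === "refused") sendRefusalEmail(id, String(req.body.reason ?? ""));
    if (newStatus === "partially-validated") sendMissingDataLink(id);
    if (newStatus === "validated") createFollowUpResources(id);
    statuses.set(id, newStatus);
  }
  res.status(200).json({ id, status: newStatus });
});
```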
My last option, and I think maybe the best (though I'm not sure if it's RESTful), would be to use a sub-resource status and to use PUT to update it
It's fine -- think of any web page you have ever seen where the source has a link to an image, or a link to JavaScript. The result is two resources instead of one, with separate cache entries for each -- which is great when you want fine-grained control over the caching of the resources.
But there's a trade-off: you also need to fetch more resources. (Server push mitigates some of this problem.)
Making things easier on the server may make things harder on the client - you're really trying to find the design with the best balance.
I'm trying to develop a REST API for the first time, and I'm looking at LoopBack references that use a change stream for resources, like /resources/change-stream with the GET and POST methods.
I have visited this post, which explains the differences between a REST API and a streaming API.
LoopBack provides this as part of its REST API, I think. What is it, and what does it do? Can you please explain it to me simply (as if to a six-year-old child)? Since I am developing a REST API on my own for the first time, I would like to understand step by step if possible, e.g. what I should have in Postman. Should I use a URL like /api/resources/change-stream?_format=event-stream along with an application/json content type, or would just /api/resources/change-stream be fine?
It would be great if you could provide a real example so that I can try developing it in my own application.
PS: It's perfectly fine with me whichever language (Node.js, Python, Ruby, PHP) you choose for your answer and examples.
If I had to guess, it sounds like one-way long polling, where you leave a long-running, open request to a server that will fulfill the request when an event happens. If the request times out, don't worry about it; send another and leave it open. When the request is fulfilled with an event, immediately fire another request so that you can receive the next event.
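A rough client-side sketch of that loop (the URL and the 30-second timeout are assumptions, not anything LoopBack-specific):

```typescript
// Long-polling loop: keep one request open; when it resolves or times out,
// immediately open the next one so events are picked up promptly.
async function pollChanges(url: string): Promise<void> {
  while (true) {
    try {
      // Node 18+ ships a global fetch; older runtimes need a polyfill.
      const res = await fetch(url, { signal: AbortSignal.timeout(30_000) });
      if (res.ok) {
        const event = await res.json();
        console.log("change event:", event);
      }
    } catch {
      // Timed out or network hiccup: just re-open the request.
    }
  }
}

void pollChanges("https://example.com/api/resources/change-stream");
```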
Since the document on the other end of the API is still (probably) a JSON document, you should keep that MIME type. However, you aren't limited in what you can send back as an event type; if you want to send back XML or YAML, do so and set that MIME type. The "stream" is just a convention mechanism.
As far as your application is concerned and from a REST perspective, it just takes a while for the event that you are trying to get to be provided to you and it has a high chance of failure. But I wouldn't look at this from a REST perspective, REST is just convention, don't let it tie you down.
Alternatively, long-polling should probably be replaced by something like a WebSocket as it provides a much easier API (in my opinion) and doesn't seem as hacky as long-polling.
If you're trying to ask, "how do I tell a RESTful consumer that my API is a 'stream' API", there is no point. Again, as far as REST is concerned, the https://example.com/api/events/ endpoint refers to a JSON-type document that changes a lot, takes a long time to receive, and "fails" often (if the events you generate don't fire a lot).