How to tell between a nonexistent service or a lack of a record? - rest

I'm making some REST services, and as I'm implementing a GET request, I was debating about what I should do if there is no record there, for example /profile/1234. Now, if you tried to use /profiles/1234 (note the s on profiles) it would definitely be a 404 code because that URL definitely isn't found. But when it comes to there just being no 1234 profile, I'm not sure what I should return. I don't want to return {} because that might give the impression there is a record with no data. But if I return a 404 error code, how is the consumer of the API supposed to tell the difference between the two 404s?
How can I be a responsible API designer by communicating programmatically what the difference between a service that doesn't exist and a record that doesn't exist?

You can send a custom status message which will help you differentiate between the "different" 404s.
On the other hand, I wouldn't worry about that distinction. People are used to having to type api endpoints carefully. As long as they manually test their app a tiny bit the issue becomes a non-issue.

Related

Modelling a endpoint for a REST API that need save data for every request

A few time ago I participate from a interview where had a question about REST modelling, and how the best way to implement it. The question was:
You have an REST API where you expose a method to consult the distance between two point, although you must save each request to this method to expose the request history.
And I was questioned about which HTTP method should be used on this case, for me the logic answer in that moment was the GET method (to execute the both actions). After this the interviewer asked me why, because since we are also storing the request, this endpoint is not idempotent anymore, after that I wasn't able to reply it. Since this stills on my mind, so I decided to verify here and see others opinions about which method should be used for this case (or how many, GET and POST for example).
You have an REST API where you expose a method to consult the distance between two point, although you must save each request to this method to expose the request history.
How would you do this on the web? You'd probably have a web page with a form, and that form would have input controls to collect the start and end point. When you submit the form, the browser would use the data in the controls, as well as the form metadata and standard HTML processing rules to create a request that would be sent to the server.
Technically, you could use POST as the method of the form. It's completely legal to do that. BUT, as the semantics of the request are "effectively read only", a better choice would be to use GET.
More precisely, this would mean having a family of similar resources, the representation of which includes information about the two points described in the query string.
That family of similar resources would probably be implemented on your origin server as a single operation/route, with a parser extracting the two points from the query string and passing them along to the function as arguments.
the interviewer asked me why, because since we are also storing the request, this endpoint is not idempotent anymore
This is probably the wrong objection - the semantics of GET requests are safe (effectively read only). So the interview might argue that saving the request history is not read only. However, this objection is invalid, because the semantic constraints apply to the request message, not the implementation.
For instance, you may have noticed that HTTP servers commonly add an entry to their access log for each request. Clearly that's not "read only" - but it is merely an implementation detail; the client's request did not say "and also log this".
GET is still fine here, even though the server is writing things down.
One possible objection would be that, if we use GET, then sometimes a cache will return an previous response rather than passing the request all the way through to the origin server to get logged. Which is GREAT - caches are a big part of the reason that the web can be web scale.
But if you don't want caching, the correct way to handle that is to add metadata to the response to inhibit caching, not to change the HTTP method.
Another possibility, which is more consistent with the interviewer's "idempotent" remark, is that they wanted this "request history" to be a resource that the client could edit, and that looking up distances would be a side effect of that editing process.
For instance, we might have some sort of an "itinerary" resource with one or more legs provided by the client. Each time the client modifies the itinerary (for example, by adding another leg), the distance lookup method is called automatically.
In this kind of a problem, where the client is (logically) editing a resource, the requests are no longer "effectively read only". So GET is off the table as an option, and we have to look into the other possibilities.
The TL;DR version is that POST would always be acceptable (and this is how we would do it on the web), but you might prefer an API style where the client edits the representation of the resource locally, in which case you would let the client choose between PUT and PATCH.

Restful business logic on property update

I'm building a REST API and I'm trying to keep it as RESTful as possible, but some things are still not quite clear for me. I saw a lot of topic about similar question but all too centered about the "simple" problem of updating data, my issue is more about the business logic around that.
My main issue is with business logic triggered by partial update of a model. I see a lot of different opinion online about PATCH methods, creating new sub-ressources or adding action, but it often seems counter productive with the REST approach of keeping URI simple and structured.
I have some record that need to be proceeded ( refused, validated, partially validated ..etc ), each change trigger additional actions.
If it's refused, an email with the reason should be sent
if it's partially validated, the link to fulfill the missing data is sent
if it's validated some other ressources must be created.
There is a few other change that can be made to the status but this is enough for the example.
What would be a RESTful way to do that ?
My first idea would be to create actions :
POST /record/:id/refuse
POST /record/:id/validate ..etc
It seems RESTful to me but too complicated, and moreover, this approach means having multiple route performing essentially the same thing : Update one field in the record object
I also see the possibility of a PATCH method like :
PATCH /record/:id in which I check if the field to update is status, and the new value to know which action to perform.
But I feel it can start to be too complex when I will have the need to perform similar action for other property of the record.
My last option, and I think maybe the best but I'm not sure if it's RESTful, would be to use a sub-ressource status and to use PUT to update it :
PUT /record/:id/status, with a switch on the new value.
No matter what the previous value was, switching to accepted will always trigger the creation, switching to refused will always trigger the email ...etc
Are those way of achieving that RESTful and which one make more sense ? Is there other alternative I didn't think about ?
Thanks
What would be a RESTful way to do that ?
In HTTP, your "uniform interface" is that of a document store. Your Rest API is a facade, that takes messages with remote authoring semantics (PUT/POST/PATCH), and your implementation produces useful work as a side effect of its handling of those messages.
See Jim Webber 2011.
I have some record that need to be proceeded ( refused, validated, partially validated ..etc ), each change trigger additional actions.
So think about how we might do this on the web. We GET some resource, and what is returned is an html representation of the information of the record and a bunch of forms that describe actions we can do. So there's a refused form, and a validated form, and so on. The user chooses the correct form to use in the browser, fills in any supplementary information, and submits the form. The browser, using the HTML form processing rules, converts the form information into an HTTP request.
For unsafe operations, the form is configured to use POST, and the browsers therefore know that the form data should be part of the message-body of the request.
The target-uri of the request is just whatever was used as the form action -- which is to say, the representation of the form includes in it the information that describes where the form should be submitted.
As far as the browser and the user are concerned, the target-uri can be anything. So you could have separate resources to handle validate messages and refused messages and so on.
Caching is an important idea, both in REST and in HTTP; HTTP has specific rules baked into it for cache invalidation. Therefore, it is often the case that you will want to use a target-uri that identifies the document you want the client to reload if the command is successful.
So it might go something like this: we GET /record/123, and that gives us a bunch of information, and also some forms describing how we can change the record. So fill one out, submit it successfully, and now we expect the forms to be gone - or a new set of forms to be available. Therefore, it's the record document itself that we would expect to be reloading, and the target-uri of the forms should be /record/123.
(So the API implementation would be responsible for looking at the HTTP request, and figuring out the meaning of the message. They might all go to a single /record/:id POST handler, and that code looks through the message-body to figure out which internal function should do the work).
PUT/PATCH are the same sort of idea, except that instead of submitting forms, we send edited representations of the resource itself. We GET /record/123, change the status (for example, to Rejected), and then send a copy of our new representation of the record to the server for processing. It would therefore be the responsibility of the server to examine the differences between its representation of the resource and the new provided copy, and calculate from them any necessary side effects.
My last option, and I think maybe the best but I'm not sure if it's RESTful, would be to use a sub-resource status and to use PUT to update it
It's fine -- think of any web page you have ever seen where the source has a link to an image, or a link to java script. The result is two resources instead of one, with separate cache entries for each -- which is great, when you want fine grained control over the caching of the resources.
But there's a trade - you also need to fetch more resources. (Server-push mitigates some of this problem).
Making things easier on the server may make things harder on the client - you're really trying to find the design with the best balance.

Confusion about HTTP verbs

While building my Web API, I have encountered some cases, where I'm not sure what HTTP verbs to use.
Downloading a file with a side effect
My first thought was to use GET, but later I did realize, when a client calls the API to download a file, the server also updates the counter in the DB indicating total number of downloads and the date of the last download.
Isn't this against the specification? The server state was changed, after all. Shouldn't this be a POST/PUT? But if the POST/PUT would be used, I wouldn't be able to share the link and use it from the browser.
Generating random list of values
In my case I need to call the API to generate random list of questions for a test (exam). The request doesn't change anything on the server, it just produces different response content each time the client calls it, so I guess using GET is alright. The indempotency applies only for the server state, not the result handed to the client, right? So is it allowed to request (GET) the same resource repeatedly with different outcome (as seen from the client)?
Generating list of values based on the user input
The last case is similar to the previous. I need the server to generate list of questions. This time based on the previous test's wrong answers. Again, the request doesn't alter server data, but I need to send to the server (relatively) long list of items, which wouldn't have to fit as a query string. That's why I would think a POST with a payload in the body could be used. But to be honest, it feels weird.
Is there a definitive answer which verbs to use for each case?
Downloading a file with a side effect
My first thought was to use GET
And that's the right answer. HTTP Methods are about semantics, not implementation.
HTTP does not attempt to require the results of a GET to be safe. What
it does is require that the semantics of the operation be safe, and
therefore it is a fault of the implementation, not the interface
or the user of that interface, if anything happens as a result that
causes loss of property -- Fielding (2002)
Generating random list of values
it just produces different response content each time the client calls it, so I guess using GET is alright.
Yup - again, as long as the semantics are safe, GET is a fine choice.
Generating list of values based on the user input
I need to send to the server (relatively) long list of items, which wouldn't have to fit as a query string. That's why I would think a POST with a payload in the body could be used. But to be honest, it feels weird.
So if you weren't worried about length of the identifier, GET would be the usual answer here, with all of the user input encoded into the URI.
At this point, you have a couple of options.
The simplest one is to simply use POST, with the user input in the message body, and the resulting list of values in the Response. That shouldn't feel weird -- POST is the method in HTTP with the fewest semantic constraints.
Alternatively, you can rethink your protocol such that the client is creating a "query resource", using the message body as the payload. So POST could work here again, or alternatively you could use PUT (with a somewhat different handling of the URI).
A third possibility is to look in the Hypertext Transfer Protocol Method Registry to see if there is an extension method with the semantics that you need, paying careful attention to whether or not the method is safe. SEARCH and REPORT might fit your needs.
If I decide later, I want to record each generated test to the DB, would you recommend to change the API to POST or keep it as it is? In case of changing the HTTP verb, the client wouldn't notice any functional change, but it would break the API, so semantics-wise, wouldn't it be more appropriate to use POST right from the start, after all? In both cases the meaning would be "create a new test".
No, but change things up a bit and things get interesting. The interesting bit isn't really "record to the database", but "be able to pull it out of the database later". When you start looking toward creating a new resource that can be retrieved later, GET stops being a good fit.
it would break the API
Only because you are ignoring an important REST constraint - REST api are hypertext driven. On the web, we can easily change from GET to POST by changing from a link to a form (or from a GET form to a POST form). The client isn't playing "guess the URI" or "guess the method" because the representation of state includes these details.
But yes, if you make a big enough change to the semantics, it's not going to be backwards compatible. So don't try to pretend that it is backwards compatible - just create a new protocol using new resources.

Server load difference between an http response with a single value or an object containing more data

I want to naI want to know if there is a real practical difference between different types of content in an HTTP response. Let me explain my self better.
Say I submit a POST request to a server with typical resource payload. Let's use a client with client_name, client_email, client_phone.
Would there be an actual difference if the server returns just an id:
{id:100}
Or if it returns the fully created resource without sensible data, like so:
{client_name: 'Some Client', client_email: 'email#sample.com', client_phone: '417-235-4622'}
Suppose that the application as a considerable amount of active users, creating resources at any given moment. Is there a significant cost in server resources associated with returning data from the server (just an ID or a full object)
Given the following scenarios when creating a resource:
Submit POST request, receive resource ID, complete all data visualization feedback with data in memory (info in form element).
Submit POST request, receive full object with id, email and phone. Continue with UI things.
If there is a difference in cost, and its significant, then the response ID is the way to go. But, I'm thinking that if I have lot's fields to submit, and most of them are required, and I'm only expecting an ID in return, then that'a a guarantee that te resource got created but it doesn't mean it was created completely. Suppose I submit the data, and one of those fields fails silently to submit to database (email for example), the server returns ID, the UI shows the user that the resource was created, the user reloads the page and the email is gone.
If the server returns the full object I get the feeling that the transaction is more atomic.
So, to wrap up. Is there a significante difference in terms of cost to the server ?
but it doesn't mean it was created completely. Suppose I submit the data, and one of those fields fails silently to submit to database (email for example)
Even if the email were to be saved in a different table than the rest of the data, it will still have to be done in a transactional manner (an indivisible operation that must succeed or fail as a complete unit; it can never be only partially complete). This could even mean rolling back changes if a failure is detected at any point during the save operation.
Now back to the main question, REST just says that you should conform to the uniform interface. In other words, it says you should do what POST is supposed to do as per the HTTP spec. Here is the quote from that spec that is relevant,
If a resource has been created on the
origin server, the response SHOULD
be 201 (Created) and contain an entity
which describes the status of the
request and refers to the new
resource, and a Location header
(see section 14.30).
I think it all depends on the use case scenarios. If the client immediately needs to display info regarding the newly created object, I really do not see any advantage to returning only the ID and doing a GET request after, to get the data you could have got with your initial POST.
Anyway as long as your API is consistent I think that you should choose the pattern that fits your needs the best. There is not any correct way of how to build a REST API, imo.
Is there a significante difference in terms of cost to the server ?
That's totally unanswerable by us. How powerful is the server? What software are you running on it? What's the breakdown of your expected traffic? What performance targets are you expected to hit? etc..
Performance problems should be solved through a combination of more better hardware and a sensible software architecture that still does everything you need it to. You don't even know if you have a problem yet and you're trying to fix it.
You're asking the wrong question. The question you should be asking is: when my clients create a user, are they likely to need server-created information beyond the URI immediately? Of course, we can't really answer that either. If the server isn't (and won't ever!) be creating anything, there's an obvious answer. If it is, or may, you may want to return a full representation even if the client doesn't need it now, so it's not a breaking change later if they decide they do. The pain there depends a lot on whether this is an internal- or external-facing API, and who owns the clients.
In addition to the other answers given, which are quite comprehensive, I would just like to add that it is contrary to the design of the web to provide object IDs and expect the client to know what to do with them. You should instead be providing URLs to the object in question. Clients can then do a GET request on the provided URL to fetch the full set of data for the object, should they want to. And I f the responses to these GET requests have already been cached, your server will not have to do any work at all to satisfy them!

Rest Services conventions

In Rest Services,we normally use 'GET' requests when we want to retrieve some data from the server, however we can also retrieve data using a 'POST' request.
We use 'POST' to create, 'PUT' to update, and 'DELETE' to delete, however we can even create new data using a 'DELETE' request.
So I was just wondering what is the real reason behind for that, why these conventions are used?
So I was just wondering what is the real reason behind for that, why these conventions are used?
So the world doesn't fall apart!
No but seriously, why are any protocols or standards created? Take this historical scenario. Back in the early days of Google, many developers (relative to nowadays) weren't too savvy on the HTTP protocol. What you might've caught was a bunch of sites who just made use of the well known (maybe only known) GET method. So there would be links that would be GET requests, but would perform operations that were meant to be POST request, that would change the state of the server (sometimes very important changes of state). Enter Google, who spends its days crawling the web. So now you have all these links that Google is crawling, all these links that are GET requests, but changing the state of the server. So all these companies are getting a bunch of hits on their servers changing state. They all think they're being attacked! But Google isn't doing anything wrong. HTTP semantics state that GET requests should not have state changing behaviors. It should be a "read only" method. So finally these companies smartened up, and started following HTTP semantics. True story.
The moral of the story: follow protocols, that's what they're there for - to follow.
You seem to be looking at it from the perspective of server implementation. Yeah you can implement your server to accept DELETE request to "get" something. That's not really the matter at hand. When implementing the server, you need to think about what the client expects. I mean ultimately, you are creating an API. Look at it from a code API perspective
public class Foo {
public Bar bar;
public Bar deleteBar() {
return bar; // Really?!
}
public void getBar() {
bar = null; // What the..??!
}
}
I don't know how long a developer would last in the game, writing code like that. Any callers expecting to "get" Bar (simply by naming semantics) has another thing coming. Same goes for your REST services. It is ultimately a WEB API, and should follow the semantics of the protocol (namely HTTP) on which it is built. Those who understand the protocol, will have an idea of what the API does (at least in the CRUD sense), simply based on the type of request they make.
My suggestion to you or anyone trying to learn REST, is to get a good handle on HTTP. I would keep the following document handy. Read it once, then keep it as a reference
Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
GET cached by proxies POST and DELETE not!
Yes you can create data with GET but now you have to destroy that cached.Why to do extra job.
Also maximum header sizes accepted are different because of purpose of usage.
I recommend reading the spec which clearly states how each each http-method should be used.
why these conventions are used?
They are conventions, that is, best practices that have been adopted as a standard. You do not have to adhere to the standard, but most consumers of a REST service assume you do. That way it is easier to understand the implementation / interface.