RESTful API structure: is userID expected in the route if already derived from the session token? - rest

I am currently planning a web API based on the principles of REST. I am using a session token to correctly identify which user is making a request (after authentication, of course), and then determining whether that user has access to the given resource.
Assuming the user making the request has a userID of 7, and I am wanting to retrieve a list of only the presentations that he can access, would best/proper practice be to:
1. Include my userID in the route, such as:
localhost:55555/api/users/7/presentations
or
2. Not include userID, such as:
localhost:55555/api/presentations
Each presentation can be accessed by any number of other users. For this reason I am leaning towards option 2 but would like to know what others think before I finalize the structure.

A very common pattern for REST APIs is to have both:
a list resource with optional parameters like /presentation/?by=Alice&since=2013-1-1,
object resources like /presentation/0AFF56E7.
For presentations, I wouldn't use a composite ID containing the user ID, since it doesn't seem really needed and it would prevent future features like changing the "owner" of the presentation (without changing its ID).
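As a rough illustration of the list-resource approach, the sketch below filters presentations by the user ID resolved from the session instead of one taken from the route. The names `Presentation`, `PresentationService`, and `listAccessibleTo` are hypothetical, and the in-memory list only stands in for a real data store.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model: a presentation plus the IDs of the users allowed to see it.
final class Presentation {
    final String id;
    final Set<Integer> allowedUserIds;

    Presentation(String id, Set<Integer> allowedUserIds) {
        this.id = id;
        this.allowedUserIds = allowedUserIds;
    }
}

final class PresentationService {
    private final List<Presentation> all;

    PresentationService(List<Presentation> all) {
        this.all = all;
    }

    // Would back GET /api/presentations: the caller passes the user ID it
    // derived from the session token, never one supplied in the URL.
    List<String> listAccessibleTo(int sessionUserId) {
        return all.stream()
                .filter(p -> p.allowedUserIds.contains(sessionUserId))
                .map(p -> p.id)
                .collect(Collectors.toList());
    }
}
```

Because ownership is not encoded in the presentation ID, sharing a presentation with more users or transferring it only changes `allowedUserIds`, not the URL.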

Related

Good URL syntax for a GET request with a composite key

Let's take the following resource in my REST API:
GET `http://api/v1/user/users/{id}`
In normal circumstances I would use this like so:
GET `http://api/v1/user/users/aabc`
Where aabc is the user id.
There are times, however, when I have had to design my REST API in a way that some extra information is passed with the ID. For example:
GET `http://api/v1/user/users/customer:1`
Where customer:1 denotes I am using an id from the customer domain to lookup the user and that id is 1.
I now have a scenario where the identifier is more than one key (a composite key). For example:
GET `http://api/v1/user/users/customer:1;type:agent`
My question: in the above URL, what should I use as the separator between customer:1 and type:agent?
According to https://www.ietf.org/rfc/rfc3986.txt I believe that the semi-colon is not allowed.
You should either:
Use parameters:
GET http://api/v1/user/users?customer=1
Or use a new URL:
GET http://api/v1/user/users/customer/1
But stick to standard conventions like this one:
("Paths tend to be cached, parameters tend not to be, as a general rule.")
Instead of trying to create a general structure for accessing records via multiple keys at once, I would suggest trying to think of this on more of a case-by-case basis.
To take your example, one way to interpret it is that you have multiple customers, and those customers each may have multiple user accounts. A natural hierarchy for this would be:
/customer/x/user/y
Often an elegant decision like this can be made, that not only solves the problem but also documents your data-model in a way that someone can easily see that users belong to customers via a 1-to-many relationship.
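A minimal sketch of how a server might recognise that hierarchical path is shown below. The class name `CustomerUserRoute` and the regular expression are assumptions made for illustration; a real web framework would normally do this matching for you.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for paths of the form /customer/{x}/user/{y}.
final class CustomerUserRoute {
    private static final Pattern ROUTE =
            Pattern.compile("^/customer/([^/]+)/user/([^/]+)$");

    final String customerId;
    final String userId;

    private CustomerUserRoute(String customerId, String userId) {
        this.customerId = customerId;
        this.userId = userId;
    }

    // Returns the two path variables, or empty if the path has another shape.
    static Optional<CustomerUserRoute> parse(String path) {
        Matcher m = ROUTE.matcher(path);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new CustomerUserRoute(m.group(1), m.group(2)));
    }
}
```

Each key gets its own path segment, so no separator character between composite-key parts is needed.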

Should an element in a REST API return its own ID?

What is the benefit of returning the ID of the element? Isn't it already part of the URL and therefore known? I am not talking about using the REST API with HAL or something similar.
api/employees/1
{
"Id" : 1,
"Name" : "Joe Bloggs",
"Department" : "IT"
}
api/employees/1
{
"Name" : "Joe Bloggs",
"Department" : "IT"
}
I guess it makes sense to add more information regarding the usage of the API:
The API in question is a public API in a closed network (not the internet). We provide sample clients, but our customers write their own clients for our API. The ID of an element is not sensitive information. The data is not about employees (as stated in the question) but about asset management.
The reason I am asking is that customers are complaining that if they use some kind of middleware (whatever this is), they only receive the content of an element but do not have access to the URL of the element (how?).
If you write your own client, is there any kind of situation where you can't get the ID based on the URL? Should we add the ID for people, who somehow do not have access to the url?
What is the client actually using the ID for? Presenting a product ID isn't that wrong IMO, but does a user have to know the ID under which you store the user entity in the DB when she uses an email to authenticate with the API anyway? So to answer the actual question: it depends. If the client, however, is using it to construct the next URI to invoke, I strongly recommend returning links with meaningful relation names instead, as this helps to decouple the client from the API: the client does not need a priori knowledge of the API itself.
Depending on the resource, it might not be beneficial to have an ascending ID, as this might favor guessing attacks and may also lead to strange situations if you remove an item in the middle of the collection. Are the IDs of the subsequent items updated? Is a gap exposed between items? Usually UUIDs or the like are a much safer way to expose such information.
One further aspect to consider is that clients in an ideal REST environment should not interpret URIs themselves, but should instead use the relation name the URI was returned for to determine whether to invoke that URI. A client which extracts an ID from a URI most likely has some a priori knowledge of the API, is thus tightly coupled to that API, and will almost certainly break if the API ever changes.
With that being said, there is the concept of URI patterns which should help a client in extracting things like IDs and names from URIs. Personally I'm not that keen on such things as they promote a misleading approach to the application of REST in an API design.
If you write your own client, is there any kind of situation where you can't get the ID based on the URL? Should we add the ID for people, who somehow do not have access to the url?
Extracting the ID from a URI requires knowledge of the URI structure. If you ever want to change the URI structure at some later time, for whatever reason, all clients that were built around that knowledge will break. URIs shouldn't carry content; that is what the body is for. As the ID seems to be content for some of the clients, include it in the response body. You are of course free to add some of the information to the URI, though you shouldn't require clients to parse that URI and extract the required information from it.
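To make the suggestion concrete, here is a small sketch of a response body that carries both the ID and a link keyed by relation name, so a client never has to parse the URL. The `links` key and the `EmployeeRepresentation` class are assumptions for illustration (the original API defines neither); `self` is used as the relation name for the resource's own URI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for the body returned by GET api/employees/{id}.
final class EmployeeRepresentation {

    // Returns an ordered map; a JSON library of your choice would serialize it.
    static Map<String, Object> of(int id, String name, String department) {
        Map<String, String> links = new LinkedHashMap<>();
        // The client follows the "self" relation instead of building the URI.
        links.put("self", "/api/employees/" + id);

        Map<String, Object> body = new LinkedHashMap<>();
        body.put("Id", id);               // the ID is content, so it lives in the body
        body.put("Name", name);
        body.put("Department", department);
        body.put("links", links);
        return body;
    }
}
```

A client behind middleware that hides the request URL can still read `Id` from the body, and a client that needs the URL can read it from `links` without knowing the URI structure.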

how to isolate data in Restful API

There are some RESTful APIs, as follows:
api/v1/billing/invoices/{invoiceNumber}
api/v1/billing/transactions/{transactionNumber}
Each invoice or transaction belongs to a specific account.
When implementing these APIs, we must satisfy one requirement: each account can only view its own invoices or transactions.
How should we isolate the data in RESTful APIs?
Of course, we can pass the account number to the API, such as:
api/v1/billing/invoices/{invoiceNumber}?accountNumber=XXX
api/v1/billing/{accountNumber}/invoices/{invoiceNumber}
But the invoice number already uniquely identifies a resource, so I do not want to complicate things.
Is there any other way to solve this problem?
You are mixing a lot of things here.
This is not a REST problem, this is a security problem. More precisely, it's an OWASP Top 10 2013 Insecure Direct Object References vulnerability.
Let's make it simple: you have a URL like this
.../superSensitiveStuff/1
and you want to prevent the owner of "1" from accessing ".../superSensitiveStuff/2"
To the best of my knowledge, there are three ways of dealing with this issue:
1. Enforcing integrity in request URLs. This strategy does not apply to all cases; it only works in scenarios where the client issues a request to a resource previously communicated by the server. In this case, the server may add a query param like this
.../superSensitiveStuff/1?sec=HMAC(.../superSensitiveStuff/1)
where HMAC is a keyed cryptographic hash (a message authentication code) computed with a secret that only the server knows. If the parameter is missing, the server will drop the request, and if it's there the server will be able to verify that it's exactly the authorized URL because the HMAC value guarantees its integrity (for additional info, hit the link above).
2. Using unpredictable references. The problem here is that a user can guess another ID: "uhmm... I have resource number 1, let me check whether resource number 2 exists". If you drop sequences and move to long random numbers, this is very hard to do. The resource will become
.../superSensitiveStuff/195A23FR3548...32OT465
This is good because it's effective and cheap.
3. Exploiting a mixed RBAC-ABAC approach. RBAC stands for Role Based Access Control and this is what you are using. The leading A of the second acronym stands for Attribute. This means that access is granted on the basis of a user role and an attribute. In this case the attribute is the userId, since the user must be authenticated to access private resources. In a few words, when a user requests a specific .../superSensitiveStuff resource, it is loaded from the repository where you have the ownership information for that resource. It could be a DB, for example, and your SuperSensitiveStuff Java business model could look like this:
public class SuperSensitiveStuff {
private String userId;
private String secretStuff;
...
}
Now, in your controller you can do the following:
String principal = getPrincipal(); // the ID of the logged-in user
SuperSensitiveStuff resource = myService.load(id); // load the resource using the {id} in the request path
if (resource.getUserId().equals(principal)) {
    return resource; // 200 OK, this is an authorized access
} else {
    throw new EvilAttemptException(); // 403 Forbidden: authenticated but not the owner, cheater detected
}
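Going back to the first strategy (integrity in request URLs), a minimal sketch using the JDK's `javax.crypto.Mac` with HMAC-SHA256 could look like the following. The class name `UrlSigner`, the `sec` parameter handling, and the URL-safe Base64 encoding are assumptions for illustration; key storage and rotation are out of scope here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: signs paths the server hands out and verifies them later.
final class UrlSigner {
    private final SecretKeySpec key;

    UrlSigner(byte[] serverSecret) {
        this.key = new SecretKeySpec(serverSecret, "HmacSHA256");
    }

    // Computes the value of the sec query parameter for a given path.
    String sign(String path) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            byte[] tag = mac.doFinal(path.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(tag);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }

    // Builds the URL the server would communicate to the client.
    String signedUrl(String path) {
        return path + "?sec=" + sign(path);
    }

    // Rejects a missing or mismatching sec value, using a constant-time compare.
    boolean verify(String path, String sec) {
        if (sec == null) {
            return false;
        }
        byte[] expected = sign(path).getBytes(StandardCharsets.UTF_8);
        byte[] actual = sec.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, actual);
    }
}
```

A signature issued for `.../superSensitiveStuff/1` does not verify for `.../superSensitiveStuff/2`, so a client cannot reuse it after editing the ID. Note that this only proves the server issued the URL; it does not replace the ownership check above if URLs can leak between users.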

How to design api to retrieve one resource by non-primary key in RESTful style?

Originally, we have an API GET /users/:id to retrieve a user by its primary key.
Now we need a new API to retrieve a user by email.
GET /users?email=xx@xx.com looks like it fetches a collection.
GET /users/byEmail/:email includes a non-noun word, byEmail.
Is there any other choice?
Both approaches you have suggested are valid on their own, but I probably wouldn't do both as it's best to stick with one URI per resource. Another common way to do this is:
/users/id/:id
or
/users/email/:email
I should point out that the choice of query params vs URL params, or /name/:value vs /:value, is not what makes a service "RESTful". Put another way, having "pretty" or "readable" URLs does not automatically mean your service is RESTful.
One of the key points of REST is resource identification through a URI i.e. a particular URI will always point to a particular resource. In the case of email, this could probably change (user may want to change their email address) so this url no longer identifies this user at all times.
A more RESTful approach would be to make explicit that this is really a search and not an identifier, and have a URI like this:
/search/users/email/:email
This is more RESTful because this URI always identifies the same resource, namely the search results for this email address. Note that the resource in this case is the search results, not the user resource itself.
I like the convention that URIs with / at the end of the path are the collections. So GET /users?email=xx@xx.com returns an item and GET /users/?email=xx@xx.com returns a collection with a single item. But of course you don't have to use this convention.
Other options are using /users/:email if you can solve the routing on the server, or /users-by-email/:email or /users/email-:email, etc. It is not important which URI structure you choose as long as your REST API meets the HATEOAS constraint.
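The distinction drawn in these answers, a stable identifier versus a search on a mutable attribute, can be sketched as follows. `UserDirectory` and its methods are hypothetical, and the in-memory map merely stands in for a user table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical store mapping a stable user ID to the user's current email.
final class UserDirectory {
    private final Map<String, String> emailById = new LinkedHashMap<>();

    // Creates the user or changes the email; the ID never changes.
    void put(String id, String email) {
        emailById.put(id, email);
    }

    // Would back GET /users/:id : the identifier addresses exactly one user.
    Optional<String> emailOf(String id) {
        return Optional.ofNullable(emailById.get(id));
    }

    // Would back GET /users?email=... : a filter over the collection,
    // returning the IDs of all users whose email matches (case-insensitive).
    List<String> findIdsByEmail(String email) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : emailById.entrySet()) {
            if (entry.getValue().equalsIgnoreCase(email)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }
}
```

When a user changes their email, the search for the old address simply returns an empty list, while the ID-based lookup keeps pointing at the same user.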

Informative vs unique generated ID in REST API

Designing a RESTful API. I have two ways of identifying resources (person data). Either by the unique ID generated by the database, or by a social security number (SSN), entered for each person. The SSN is supposedly unique, though can be changed.
Using the ID would be most convenient for me, since it is guaranteed to be unique, and does not change. Hence the URL for the resource, also always stays the same:
GET /persons/12
{
"name": "Morgan",
"ssn": "840212-3312"
}
The argument for using SSN, is that it is more informative and understandable by API clients. SSN is also used more in surrounding systems:
GET /persons/840212-3321
{
"name": "Morgan",
"id": "12"
}
So the question is: should I go with the first approach and avoid some implementation headaches where the SSN may change, and maybe provide some helper method that converts from SSN to ID?
Or go with the second approach, providing a more informative API, though having to deal with some not-so-RESTful strangeness where URLs might change due to SSN changes?
URL design is a personal choice. But to give you some more examples which differ from those Ray has already provided, I will give you some of my own.
I have a user account resource and allow access via both URIs:
/users/12
and
/users/morgan
where the numerical value is an auto-incremented ID, and the alphabetic value is a unique username on the system specified by the user. These resources are uncacheable, so I do not bother about canonicalisation; however, the /users page links to the alphabetic forms.
No other resources on my system have two unique fields, so are referred to by IDs, /jobs/123, /quotations/456.
As you can see, I prefer plural URI segments ;-)
I think of "job 123" as being from the "jobs" collection, so it seems logical to have a "jobs" resource, with subresources for each job.
You do not need to have a separate /search/ area for performing searches, I think it would be cleaner to apply your search criteria to the collection resource directly:
/people?ssn=123456-7890 (people with SSN matching/containing "123456-7890")
/people?name=morgan (people whose name is/contains "Morgan")
I have something similar, but use only the first letter as a filter:
/sites?alpha=f
Lists all sites beginning with F. You can think of it as a filter, or as a search criteria, those terms are just different sides of the same coin.
Good to see someone taking time to think about their Resource urls!
I would make a Url with the unique id to provide resource to a single user. Like:
http://api.mysite.com/person/12/
Where 12 is your unique ID. Note that I also prefer the singular 'person'....
Regardless, the url should return:
{
"ssn": "840212-3312",
"name": "Morgan",
"id": "12"
}
However, I would also create a general search URL that returns a list of users that match the parameters (either a json array or whatever format you need). You can specify search parameters as get params like this:
http://api.mysite.com/person/search/?ssn=840212-3312
Or
http://api.mysite.com/person/search/?name=Morgan
These would return something like this for a single search hit--note it's an array, not a single item like the unique id url that points directly to a single user.
[{
"ssn": "840212-3312",
"name": "Morgan",
"id": "12"
}]
This search could then be later augmented for other search criteria. You might only return the unique id's via the search Url--you could always make a request to the unique id url once you've got it from the search...
I would suggest that you use neither. Generate resource IDs that are unique both to a single user of your API and across all other resources (including other users' resources).
Using the unique database ID is not ideal for a couple of reasons. First, API resources and database records won't necessarily always be 1-to-1 even if you have designed it that way today. Second, you might change to a different data store that would generate different format unique ids.
Also, it is good practice to separate out the ID from other resource properties, such as SSN (as an aside I hope you are storing SSN in a very secure manner, but that's another topic). If for whatever reason an SSN changed, more than one API resource was associated with the same SSN, or you decide that piece of data is not needed someday, you don't want to have to change the ID.
One pattern is to prepend the unique ID with a few characters that indicate the resource type. For example, if User is a resource type in your API, a generated unique ID would be something like USR56382.
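A rough sketch of such a type-prefixed identifier is shown below. The `ResourceIds` class, the alphabet, and the length are all assumptions for illustration; the random part comes from `java.security.SecureRandom` so that IDs are hard to guess, which goes further than the numeric `USR56382` example.

```java
import java.security.SecureRandom;

// Hypothetical generator for IDs of the form <type prefix><random characters>.
final class ResourceIds {
    private static final SecureRandom RANDOM = new SecureRandom();
    // Digits and uppercase letters, minus a few easily confused ones (assumption).
    private static final char[] ALPHABET =
            "0123456789ABCDEFGHJKMNPQRSTVWXYZ".toCharArray();

    // Appends `length` random characters to the resource-type prefix.
    static String generate(String typePrefix, int length) {
        StringBuilder id = new StringBuilder(typePrefix);
        for (int i = 0; i < length; i++) {
            id.append(ALPHABET[RANDOM.nextInt(ALPHABET.length)]);
        }
        return id.toString();
    }
}
```

For example, `ResourceIds.generate("USR", 16)` yields a 19-character string starting with `USR`. Uniqueness is only probabilistic here, so a real system would still enforce it with a unique constraint in the data store.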
REST is an architectural style that emphasizes a resource-centric design approach.
In my opinion, resources should be named as plural nouns.
Every resource, for example customers, has the following uniform interface:
POST /customers - for creating a resource instance
PUT /customers/{customerId} - for updating a particular instance
GET /customers - for searching customers. So @Ray, search does not need to be part of the URI itself. Any filter or query parameters that need to be supported should go on this resource.
GET /customers/{customerId} - to retrieve a particular instance of customer
DELETE /customers/{customerId} - to delete a particular instance
The reason for the plural is that the collection behaves as a factory. For example, when you are trying to create a new instance of a resource, the instance does not exist yet, so the request cannot be addressed to the instance itself. Hence, the singular is not used.
The same goes for search/inquiry, where you do not know or hold the actual instance of the resource. Hence, the plural form is recommended.
Now, the question is what to use for a resource id - a database primary key, a generated identifier, or an encrypted token.
In my opinion, database primary keys should not be exposed, and resource identifiers should not be designed 1-to-1 with DB primary keys. But it happens a lot. A generated UUID-based key is preferable to avoid sequential enumeration attacks, but the world is not always ideal.
Coming back to tokens: an encrypted token is a recommended approach for sensitive APIs, and where data exchange is performed between two separate applications. If we use it, the encryption/decryption should happen solely at the API end. That means the encrypted keys for sub-resources should be returned as part of the parent API response; otherwise it defeats the purpose.
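The uniform interface listed above, combined with generated UUID keys in place of database primary keys, might be sketched like this. `CustomerStore` is a hypothetical in-memory stand-in; each method notes the HTTP operation it would back, and the HTTP layer itself is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical in-memory store: customer names keyed by a generated UUID.
final class CustomerStore {
    private final Map<String, String> namesById = new LinkedHashMap<>();

    // POST /customers : the collection acts as the factory and assigns the ID.
    String create(String name) {
        String id = UUID.randomUUID().toString();
        namesById.put(id, name);
        return id;
    }

    // PUT /customers/{customerId} : returns false if the instance is unknown.
    boolean update(String id, String name) {
        return namesById.replace(id, name) != null;
    }

    // GET /customers?name=... : returns IDs whose name contains the filter;
    // a null filter returns every ID.
    List<String> search(String nameContains) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : namesById.entrySet()) {
            if (nameContains == null
                    || entry.getValue().toLowerCase().contains(nameContains.toLowerCase())) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    // GET /customers/{customerId}
    Optional<String> get(String id) {
        return Optional.ofNullable(namesById.get(id));
    }

    // DELETE /customers/{customerId} : returns false if nothing was deleted.
    boolean delete(String id) {
        return namesById.remove(id) != null;
    }
}
```

Since the ID is generated independently of any database sequence, clients cannot enumerate neighbours by incrementing it, and the backing store can change without changing the URLs.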