Dangers of hashing known plain text - rest

I have easily guessable internal identifiers (auto-incrementing numbers) and I'd like to give my clients access to resources based on these identifiers.
Of course I cannot provide them with a URL like:
https://example.com/order/13
because they could easily guess how to access order #14 from this URL.
I therefore thought about providing them with a salted hash of the identifier like:
https://example.com/order/4643ef…
where
4643ef… = sha256(13 + 'supersecretsalt')
Is this a good approach from a security perspective?

First of all, you should not be granting access to any resource simply based on a URI. In other words, user A should not be able to access a resource that belongs to user B even if he knows the relevant URI. To mitigate this, you should add some form of authentication and authorization before allowing access to any (confidential?) resources.
That said, if you'd still like to obfuscate the URI, you can probably use a GUID for this instead of generating any kind of hash. For each order ID, simply store a GUID together with it, and then look that ID up whenever the GUID is used in a URL.
Sidenote: If you do want to let your customers look up some order details based simply on a URL (i.e. without requiring identification), you might at least make the availability of the resource temporary. You can do this by storing e.g. a valid-until date together with the GUID.
Now user A will be able to see info relating to his resource via a url with a guid, but perhaps only for e.g. 3 days. Other users would also be able to access it, but it would be less likely to happen, both because it would be hard to guess the GUID, and because they would only have a 3 day window to do so.
If user A needs to access his resource again later, perhaps you could provide a way to extend the validity of the GUID, or alternatively just provide a new GUID that points to the same resource, but with a different validity date.
Obviously you'll need to think through whether or not this is realistic / acceptable for your particular situation and security needs.
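As a rough illustration of the GUID-plus-expiry idea above, here is a minimal Python sketch. The in-memory dictionary and the function names `create_link` and `resolve_link` are assumptions made for the example; a real system would presumably keep this mapping in a database table next to the order.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Hypothetical in-memory store: token -> (order_id, valid_until).
# A real application would persist this alongside the order.
_links = {}

def create_link(order_id, days_valid=3, now=None):
    """Create a random token for an order and record when it expires."""
    now = now or datetime.now(timezone.utc)
    token = str(uuid.uuid4())  # version 4 UUIDs are randomly generated
    _links[token] = (order_id, now + timedelta(days=days_valid))
    return token

def resolve_link(token, now=None):
    """Return the order ID for a token, or None if unknown or expired."""
    now = now or datetime.now(timezone.utc)
    entry = _links.get(token)
    if entry is None:
        return None
    order_id, valid_until = entry
    if now > valid_until:
        return None
    return order_id
```

Issuing a second token for the same order simply means calling `create_link` again; both tokens then point to the same order ID with their own expiry dates.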

Related

REST API: Resource hierarchy and multiple URI

I work with a banking database, which is structured like this:
Table      Primary Key   Unique Keys           Foreign Keys
-------------------------------------------------------------
BANK       ID            BIC
CUSTOMER   ID            CUSTNO, PASS, CARD    BANK
ACCOUNT    ID            IBAN                  BANK, CUSTOMER
I want to design a clean REST API, but I run into the following problems:
Should I put resources in a hierarchy, or rather flat? The problem with the hierarchy might be that the client only knows the ACCOUNT ID, but does not know the CUSTOMER ID, so how is he supposed to get the resource?
/banks/{id}/customers/{id}/accounts/{id}
or
/banks/{id}
/customers/{id}
/accounts/{id}
The primary key in each table is the database ID. It is an internal ID and has no business meaning. Is it correct to use it as the default URI of the resource?
Each object has its own set of unique keys. For example, CUSTOMER can be identified by his CUSTNO, PASS or CARD. Each client only has a subset of these keys. Should I define a sub-resource per key or provide a lookup service that will give the proper URI back?
/customers/id/{id}
/customers/custno/{custno}
/customers/pass/{pass}
/customers/card/{card}
or
/lookup/customer?keyType=card&keyValue=AB-303555
(gives back customer {id})
I am asking what is the truly RESTful way, what is best practice. I haven't found proper answers yet.
I am asking what is the truly RESTful way, what is best practice.
REST doesn't care what spellings you use for your identifiers.
/ef726381-dd43-4017-9778-83cee2bbbd93
is a perfectly RESTful URI, suitable for any use case.
Outside of some purely mechanical concerns, general purpose consumers treat a URI as a single opaque unit. There's no notion of a consumer extracting semantic information from the URI -- which means that any information encoded into the identifier is done at the server's discretion and for its use alone.
For cases where information known to the client needs to be included in the target-uri of the request, we have URI Templates, which are a sort of generalization of a GET form in HTML. So a way to think about your problem is to consider what information the client has, and how they would put that information into a form.
HTML's form processing rules are pretty limiting -- general URI templates have fewer constraints.
/customers/id/{id}
/customers/custno/{custno}
/customers/pass/{pass}
/customers/card/{card}
Having multiple resources sharing common information is normal in REST -- your resource model is not your data model. So this could be fine. It's even OK to have multiple resources that share representations. You could have them stand alone, or you could have them share a Content-Location, or a canonical link relation, or you could simply have those resources redirect to the canonical resource.
It's all good.
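To make the "form filling" view of URI templates a bit more concrete, here is a small Python sketch of a client expanding the templates from the question. The `expand` helper is hypothetical and only handles a single plain `{name}` placeholder; it is not a full URI Template implementation.

```python
from urllib.parse import quote

# Templates copied from the question. The client picks one based on
# which key it happens to know.
TEMPLATES = {
    "id": "/customers/id/{id}",
    "custno": "/customers/custno/{custno}",
    "pass": "/customers/pass/{pass}",
    "card": "/customers/card/{card}",
}

def expand(key_type, value):
    """Fill the single placeholder of the chosen template, percent-encoding the value."""
    template = TEMPLATES[key_type]
    return template.replace("{" + key_type + "}", quote(str(value), safe=""))

print(expand("card", "AB-303555"))  # /customers/card/AB-303555
```

The client never has to understand the structure of the result; it just dereferences whatever URI comes out.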
So you mean if a UUID can be a valid URI, then a table autonumber key can be too?
Yes, exactly.
Note that if you want the lifetime of the URI to extend beyond the lifetime of your current implementation, then you need to design your identifiers with that constraint in mind. See Cool URIs Don't Change.
The clients don't care what the URI is, they just want the link to work again when they need it.

What is the best way to handle REST resource proposal generation?

I have this API: GET /travels/generate/{city-departure}/{city-arrival}
It generates a list of possible travel paths (with train changes, etc.).
Now these are not real resources because they don't have an ID (they are only generated as proposals).
What is the best way to select one and save it in a RESTful way? Should I create a temporary resource for each proposal like "GET /temporary-travel/{id}"?
A REST resource does not need to have an ID. It must be identifiable. Your URLs
/travels/generate/{city-departure}/{city-arrival}
are completely OK to identify a resource.
A REST resource does not need to have an ID. It must be identifiable.
One solution would be using a list index (e.g. GET /travels/generate/{city-departure}/{city-arrival}/{index} ). This requires you to remember the content and the order of the proposed travel paths.
To avoid having to store the possible travel paths temporarily, you may either store them permanently and give them a static identifier, or you may provide a domain-specific key that consists of multiple chained static identifiers and thereby gives your travel path an identity (e.g. chaining all route segment IDs or so).
I somewhat prefer the idea of storing all possible travel paths, even knowing it is technically close to impossible. I like it because the travel paths your system can produce are limited by the algorithm and the database you use.
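The chained-identifier variant could look roughly like the following Python sketch. It assumes that route segments have non-negative integer IDs and that "-" is an acceptable separator; both are assumptions for illustration, as are the function names.

```python
def path_key(segment_ids):
    """Join route segment IDs into one key, e.g. [12, 7, 301] -> "12-7-301"."""
    ids = [int(s) for s in segment_ids]
    if not ids or any(i < 0 for i in ids):
        raise ValueError("expected at least one non-negative segment ID")
    return "-".join(str(i) for i in ids)

def parse_path_key(key):
    """Recover the ordered list of segment IDs from a key."""
    return [int(part) for part in key.split("-")]

key = path_key([12, 7, 301])
print(key)                  # 12-7-301
print(parse_path_key(key))  # [12, 7, 301]
```

Because the key is derived from the path itself, the server can rebuild the proposal from the URI without remembering which proposals it generated earlier.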

RESTful url to GET resource by different fields

Simple question I'm having trouble finding an answer to.
If I have a REST web service, and my design is not using url parameters, how can I specify two different keys to return the same resource by?
Example
I want (and have already implemented)
/Person/{ID}
which returns a person as expected.
Now I also want
/Person/{Name}
which returns a person by name.
Is this the correct RESTful format? Or is it something like:
/Person/Name/{Name}
You should only use one URI to refer to a single resource. Having multiple URIs will only cause confusion. In your example, confusion would arise due to two people having the same name. Which person resource are they referring to then?
That said, you can have multiple URIs refer to a single resource, but for anything other than the "true" URI you should simply redirect the client to the right place using a status code of 301 - Moved Permanently.
Personally, I would never implement a multi-ID scheme or redirection to support it. Pick a single identification scheme and stick with it. The users of your API will thank you.
What you really need to build is a query API, so focus on how you would implement something like a /personFinder resource which could take a name as a parameter and return potentially multiple matching /person/{ID} URIs in the response.
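A minimal sketch of such a finder, in Python, might look like this. The `PEOPLE` data and the `person_finder` function are made up for the example, and exact name matching is an assumption; a real query resource would likely support looser matching.

```python
# Hypothetical data set: person ID -> name. Two people share a name on purpose.
PEOPLE = {1: "Alice Smith", 2: "Bob Jones", 3: "Alice Smith"}

def person_finder(name):
    """Return the /person/{ID} URIs of every person with exactly this name."""
    return [f"/person/{pid}" for pid, n in sorted(PEOPLE.items()) if n == name]

print(person_finder("Alice Smith"))  # ['/person/1', '/person/3']
print(person_finder("Nobody"))       # []
```

The response is a list of canonical URIs, so a name that matches several people is no longer ambiguous: the client picks one of the returned links and follows it.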
I guess technically you could have both URI's point to the same resource (perhaps with one of them as the canonical resource) but I think you wouldn't want to do this from an implementation perspective. What if there is an overlap between IDs and names?
It sure does seem like a good place to use query parameters, but if you insist on not doing so, perhaps you could do
person/{ID}
and
personByName/{Name}
I generally agree with this answer that for clarity and consistency it'd be best to avoid multiple ids pointing to the same entity.
Sometimes however, such a situation arises naturally. An example I work with is Polish companies, which can be identified by their tax id ('NIP' number) or by their national business registry id ('KRS' number).
In such a case, I think one should first add the secondary id as a criterion to the search endpoint. Thus users will be able to "translate" between secondary id and primary id.
However, if users still keep insisting on being able to retrieve an entity directly by the secondary id (as we experienced), one other possibility is to provide a "secret" URL, not described in the documentation, performing such an operation. This can be given to users who made the effort to ask for it, and the potential ambiguity and confusion is then on them, if they decide to use it, not on everyone reading the documentation.
In terms of ambiguity and confusion for the API maintainer, I think this can be kept reasonably minimal with a helper function to immediately detect and translate the secondary id to primary id at the beginning of each relevant API endpoint.
It obviously matters much less than normal what scheme is chosen for the secret URL.
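For what it's worth, the helper mentioned above can be quite small. The Python sketch below assumes the primary id is the NIP number and that callers mark a secondary id with a "krs:" prefix; that prefix, the lookup table, and the function name are all invented for illustration.

```python
# Hypothetical lookup table: KRS number -> NIP number (values are made up).
KRS_TO_NIP = {"0000123456": "1234567890"}

def to_primary_id(raw_id):
    """Translate 'krs:<number>' into the primary id; pass anything else through."""
    prefix = "krs:"
    if raw_id.startswith(prefix):
        krs = raw_id[len(prefix):]
        try:
            return KRS_TO_NIP[krs]
        except KeyError:
            raise LookupError(f"unknown KRS number: {krs}") from None
    return raw_id
```

Each relevant endpoint would call `to_primary_id` once on the incoming id and work with the primary id from then on.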

REST numeric or string resource identifiers?

I'm doing some research to help me develop a REST API and this is one topic I haven't seen discussed in depth anywhere.
If I have a user in the system, is it better to identify the user using a numeric identifier
/users/1
Or using a string identifier?
/users/RSmith
I can see potential pros and cons to each approach. String identifiers are more human readable, less discoverable (they can't be incremented to find valid users), and don't require storing another numeric id in the database (I wouldn't want to expose database ids through the API). Numeric identifiers have no inherent meaning and, because of that, can be guaranteed to be immutable, whereas with a string id the user might want to rename the resource, thus changing the resource URI.
Is there a REST best practice here or does the best approach vary from system to system? If the latter, are there any additional pros and cons associated with each method?
As you know, strictly speaking, neither approach has an inherent advantage over the other. Yes, string identifiers may be easier for people to remember, but apart from that, REST does not enforce "pretty" URLs (or IDs), because most of the time URLs are accessed by programs following hyperlinks.
Thus, human friendly URLs should only be used for bootstrapping resources that may be remembered by humans. Also, ID guessing should not be a problem because either:
You have to restrict access to URLs based on any authentication method, or:
You have to use randomized/unguessable URLs that are not "public".
So which one to use? Most of the time, it does not matter, as IDs are not accessed directly. If you have to ensure people remember their URLs for some reason, try to make them human-friendly, but try to avoid resource renames, and apply some other means of authentication so that even guessed URLs don't give access to unauthorized places.
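The point that a guessed URL should not grant access by itself can be sketched in a few lines of Python. The `OWNERS` table and the `status_for` function are hypothetical, and the rule "only the owner may read a user resource" is an assumption chosen to keep the example short.

```python
# Hypothetical ownership table: resource URI -> name of the owning user.
OWNERS = {"/users/RSmith": "RSmith", "/users/1": "jdoe"}

def status_for(uri, authenticated_user):
    """Return the HTTP status code a GET on `uri` would produce."""
    if authenticated_user is None:
        return 401  # caller has not authenticated at all
    owner = OWNERS.get(uri)
    if owner is None:
        return 404  # no such resource
    if owner != authenticated_user:
        return 403  # URI is valid, but this caller may not read it
    return 200
```

With a check like this in place, it makes no difference whether the identifier in the URI is a sequential number or a name: knowing the URI is not enough.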
The only advantage of /users/RSmith is that it's more human-friendly. From a RESTful perspective it doesn't matter, because both are valid resource identifiers. Everything else depends on your system requirements.

RESTful API creates a globally unique resource

In our system we have accounts which contain items. An item is always associated with a single account but also has a globally unique id in the system. Sometimes it is desirable to work with an item when only its id is known.
Is it incorrect to allow access to a subordinate resource (the item) from outside its owner (the account)? In other words, is it wrong to have 2 URIs to the same resource? This is a little tricky to explain so here is an example:
POST /inventory/accountId
#Request Body contains new item
#Response body contains new item's id
GET|PUT|DELETE /inventory/accountId/guid #obviously works and makes sense
GET|PUT|DELETE /inventory/guid #does this make sense?
Perhaps I should rethink my resource layout and not use accounts to create items but instead take the account as a query string parameter or field on the item?
POST /inventory
# Request body contains item w/ account name set on it
GET|POST|DELETE /inventory/uuid #makes sense
GET|POST|DELETE /inventory/accountId/uuid #not allowed
I think having two URIs point to the same item is asking for trouble. In my experience, these sorts of things lead to craziness as you scale out (caching, multiple nodes in a cluster going out of sync and so on). As long as the item's ID is indeed globally unique, there's no reason not to simply refer to it as /inventory/uid
POST /inventory/accountId
GET|PUT|DELETE /inventory/accountId/guid #obviously works and makes sense
GET|PUT|DELETE /inventory/guid #does this make sense?
It makes the most sense when /inventory/guid redirects to /inventory/accountId/guid (or, I'd argue, vice-versa). Having a single canonical entity, with multiple URI's redirecting to it, allows your caching scheme to remain the most straightforward. If the two URI's instead return the same data, then a user is inevitably going to PUT a new representation to one and then be confused when it GETs an old copy from the other because the cache was only invalidated for the former. A similar problem can occur for subsequent GETs on the two. Redirects keep that a lot cleaner (not perfectly synchronous, but cleaner).
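Here is a small Python sketch of the redirect arrangement, treating /inventory/accountId/guid as the canonical URI. The `ITEM_ACCOUNT` index, the `handle_get` function, and the choice of 301 as the redirect status are assumptions for the example; it only models GET and ignores the POST on /inventory/accountId.

```python
# Hypothetical index: item guid -> ID of the account that owns it.
ITEM_ACCOUNT = {"guid-1": "acct-42"}

def handle_get(path):
    """Return (status, headers) for a GET on either URI shape."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "inventory":
        # Short form /inventory/guid: redirect to the canonical URI.
        guid = parts[1]
        account = ITEM_ACCOUNT.get(guid)
        if account is None:
            return 404, {}
        return 301, {"Location": f"/inventory/{account}/{guid}"}
    if len(parts) == 3 and parts[0] == "inventory":
        # Canonical form /inventory/accountId/guid: serve the item itself.
        _, account, guid = parts
        if ITEM_ACCOUNT.get(guid) == account:
            return 200, {}
        return 404, {}
    return 404, {}
```

Only the canonical URI ever returns a representation, so there is just one cache entry to invalidate when the item changes.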
Whether to make items subordinate to accounts depends on whether items can exist without an account. If the data of an item is a subset of the data of an account, then go ahead and make it subordinate. If you find that an account is just one kind of container, or that some items exist without any container, then promote them to the top level.
In other words, is it wrong to have 2 URI's to the same resource?
No. It is not wrong to have multiple URIs identifying the same resource. I don't see anything wrong with your first approach either. Remember that URIs are unique identifiers and should be opaque to clients. If they uniquely identify a resource, you don't have to worry too much about making your URLs look pretty. I am not saying resource modeling is not important, but IMO we shouldn't spend too much time on it. If your business requires the guid to be reachable directly under inventory and also under individual accounts, so be it.
Are you concerned about this because of a potential security hole in letting data be available to unauthorized users? Or is your concern purely design driven?
If security is not your concern, I agree that it is perfectly fine to have 2 URIs pointing to the same resource.