Restrict updates to specific fields in RESTful API

Let's say I have an object, Widget, comprised of an Id and a Name. Let's say I expose an endpoint, /widget, where clients can POST new Widget objects. If I want the Id field to always be set by the server, not modifiable by the client, but still visible to clients, how can I declare that the Id field is not modifiable? I'm using RESTeasy if that makes any difference.

I can think of a few options.
First, are you sure you need to expose the ID as part of the representation? Or is it enough to respond with the location of the new posted resource?
Your client posts:
<Resource><Name>New Resource</Name></Resource>
And you respond:
HTTP/1.1 201 Created
...
Location: /resources/{new_resource_id}
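A minimal sketch of that exchange in Python, assuming a hypothetical in-memory store and a hypothetical create_resource helper (neither comes from RESTeasy). The server drops any Id the client sends, assigns its own, and reports where the new resource lives:

```python
import itertools

# Hypothetical in-memory store; a real service would use a database.
_next_id = itertools.count(1)
resources = {}

def create_resource(payload):
    """Handle POST /resources: ignore any client-supplied Id and assign one."""
    new_id = next(_next_id)
    # Copy only the field the client is allowed to set.
    stored = {"Id": new_id, "Name": payload.get("Name", "")}
    resources[new_id] = stored
    headers = {"Location": f"/resources/{new_id}"}
    return 201, headers, stored

# The client tries to pick its own Id; the server does not honour it.
status, headers, body = create_resource({"Id": 999, "Name": "New Resource"})
```

Returning the stored representation alongside the Location header lets the client see the server-assigned Id without being able to set it.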
Beyond that, I think it's OK to have some simple, well-understood conventions with your clients. I think most developers understand that an ID is likely to be system-generated (especially, since you're doing a POST and not a PUT). For less obvious cases, where you have arbitrary read-only fields (or other validation or display information), I think it may make sense to provide a link to metadata:
<NewPersonForm>
<atom:link href="/people/new/metadata" rel="/rels/metadata" />
<Name />
<Department>HR</Department>
</NewPersonForm>
What the metadata looks like is up to you, but something along these lines might work for you:
<Metadata>
<Element>
<Name>Department</Name>
<IsReadOnly>True</IsReadOnly>
</Element>
</Metadata>
That's a nice, format-neutral (it works well for both XML and JSON) way to provide information to the client, and if they really want to, they can program against it to build forms on the fly (I use it to provide validation information, language-specific labels, and data type information).
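On the server side, the same metadata can drive which submitted fields are accepted. A rough sketch, assuming the metadata has already been loaded into a plain dictionary (the METADATA structure and the writable_fields name are made up for illustration):

```python
# Hypothetical metadata, mirroring the <Metadata> document above.
METADATA = {
    "Name": {"read_only": False},
    "Department": {"read_only": True},
}

def writable_fields(payload, metadata=METADATA):
    """Return only the submitted fields that the metadata marks as writable."""
    return {
        field: value
        for field, value in payload.items()
        if field in metadata and not metadata[field]["read_only"]
    }

# The client tries to change a read-only field; it is silently dropped.
submitted = {"Name": "Alice", "Department": "Finance"}
accepted = writable_fields(submitted)
```

Whether to drop read-only fields silently or reject the request is a design choice; this sketch drops them.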
I hope this helps.
John

You write the code on the server, so it is free to do whatever it wants, and that includes adding or changing data as needed. See section 9.2 of the AtomPub protocol, which explicitly states:
Since the server is free to alter the POSTed Entry, for example, by changing the content of the atom:id element, returning the Entry can be useful to the client, enabling it to correlate the client and server views of the new Entry.

Related

REST API - PUT or GET?

I am designing and building a REST API. I understand the basic concept underlying the different request types. In particular PUT requests are intended for updating data.
I have a number of cases where an API call will modify the database, changing the values of a data object's attributes. However, the new values are not sent by the client but rather are implicit in the specific endpoint invoked. There are arguments needed to select the object to be modified, but not to supply attribute values for that object.
Originally I set these up to be PUT requests. However, I am now wondering whether they should be GET requests instead, because the body does not in fact contain update data.
Which would be recommended?
Just because the body doesn't contain update data doesn't mean it is not an update. Look at it from the user's, or at least your API user's, point of view. Is it an update from their point of view, or is it the retrieval of an object where the update is incidental? If it is an update from the user's point of view, use PUT.
Originally I set these up to be PUT requests. However, I am now wondering whether they should be GET requests instead, because the body does not in fact contain update data.
If the semantics of the request are a change to the representation(s) of a resource on the server, then GET is inappropriate.
If the payload/entity enclosed in the request is not a candidate representation of the target resource ("make your representation look like this one right here"), then PUT is inappropriate.
"Update yourself however you see fit, here is some information that will help" will normally use POST.
POST serves many useful purposes in HTTP, including the general purpose of "this action isn’t worth standardizing." -- Roy Fielding, 2009
POST is the general solution for requests that are intended to modify resource state; PUT (and PATCH) are specializations with narrower semantics (specifically, remote authoring).
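The rules in this answer can be condensed into a small decision function. This is only a sketch of the reasoning above, with a made-up function name and parameters, not a substitute for reading the method definitions:

```python
def choose_method(changes_state, body_is_full_representation):
    """Pick an HTTP method using the rules described above."""
    if not changes_state:
        return "GET"   # safe: the client is not asking for a change
    if body_is_full_representation:
        return "PUT"   # "make your representation look like this one"
    return "POST"      # general-purpose state change

# An endpoint that modifies a record but carries no replacement representation:
method = choose_method(changes_state=True, body_is_full_representation=False)
```

For the endpoints described in the question (state changes implied by the endpoint, no attribute values in the body), this lands on POST.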

How to specify data security constraints in REST APIs?

I'm designing a REST API and I'm a big defender of keeping my URL simple, avoiding more than two nested resources.
However, I've been having second thoughts because of data security restrictions that apply to my APIs, that have been trying to force me to nest more resources. I'll try to provide examples to be more specific, as I don't know the correct naming for this situation.
Consider a simple example where I want to get a given contact restriction for a customer, like during what period my customer accepts to be bothered with a phone call:
So, I believe it's simpler to have this:
- GET /customers/12345
- GET /customers/12345/contacts
- GET /contacts/9999
- GET /contacts/9999/restrictions
- GET /restrictions/1
than this:
- GET /customers/12345
- GET /customers/12345/contacts
- GET /customers/12345/contacts/9999
- GET /customers/12345/contacts/9999/restrictions
- GET /customers/12345/contacts/9999/restrictions/1
Note: If there are more related resources, who knows where this will go...
The first case is my favourite because, since all resources MUST have a unique identifier, as soon as I have that identifier I should be able to get the resource instance directly: GET /restrictions/1
The data security restriction in place in my company states that not everyone can see every customers' info (eg: only some managers can access private equity customers). So, to guarantee that, the architects are telling me I should use /customers/12345/contacts/9999/restrictions/1 or /customers/12345/contact-restrictions/1 so that our data access validator in our platform has the customerId to check if the caller has access to it.
I understand the requirement and I see its value. However, I think that this kind of custom security information, because that's what I believe it to be, should be in a custom header.
So, I believe I should stick to GET /restrictions/1 with a custom header "customerId" with the value 12345.
This custom header would only be needed for the apis that have this requirement.
Besides the simpler URL, another advantage of the header is that if an API didn't start with that security requirement and suddenly needs to comply with it, we could simply require the header to be passed instead of redefining paths.
I hope I made it clear for you and I'll be looking to learn more about great API design techniques.
Thank you all that reached the end of my post :)
TL;DR: you are fighting over URI design, and REST doesn't actually offer guidance there.
REST, and REST clients, don't distinguish between your "simpler" design and the nested version. A URI is just an opaque sequence of bytes with some little domain agnostic semantics.
/4290c3b2-134e-4647-867a-214d0c866f29
Is a perfectly "RESTFUL" URI. See Stefan Tilkov, REST: I don't Think it Means What You Think it Does.
Fundamentally, REST servers are document stores. You provide a key (the URI) and the server provides the document. Or you provide a key, and the server modifies the document.
How this is implemented is completely at the discretion of the server. It could be that /4290c3b2-134e-4647-867a-214d0c866f29 is used to look up the tuple (12345, 9999, 1), and then the server checks to see if the credentials described in the request header have permission to access that information, and if so the appropriate representation of the resource corresponding to that tuple is returned.
From the client's perspective, it's all the same thing: I provide an opaque identifier in a standard way, and credentials in a standard way, and I get access to the resource or I don't.
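A sketch of what that server-side lookup could look like. The lookup table, the access rules, and the caller names are all hypothetical; the point is only that the customer id needed for the access check can come from the server's own data rather than from the URI or a header:

```python
# Hypothetical lookup table: opaque URI path -> (customer, contact, restriction).
URI_INDEX = {
    "/4290c3b2-134e-4647-867a-214d0c866f29": (12345, 9999, 1),
}
# Hypothetical access rules: caller -> customers the caller may see.
ALLOWED_CUSTOMERS = {"manager-a": {12345}, "clerk-b": set()}

def get_resource(path, caller):
    """Resolve an opaque URI, then authorize against the customer it maps to."""
    key = URI_INDEX.get(path)
    if key is None:
        return 404, None
    customer_id, contact_id, restriction_id = key
    if customer_id not in ALLOWED_CUSTOMERS.get(caller, set()):
        return 403, None
    return 200, {"customer": customer_id, "contact": contact_id,
                 "restriction": restriction_id}

ok = get_resource("/4290c3b2-134e-4647-867a-214d0c866f29", "manager-a")
denied = get_resource("/4290c3b2-134e-4647-867a-214d0c866f29", "clerk-b")
```

The trade-off is an extra lookup before the access check, which is exactly the cost the architects avoid by putting the customer id in the path.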
the architects are telling me I should use /customers/12345/contacts/9999/restrictions/1 or /customers/12345/contact-restrictions/1 so that our data access validator in our platform has the customerId to check if the caller has access to it.
I understand the requirement and I see its value. However, I think that this kind of custom security information, because that's what I believe to be, should be in a custom header.
There's nothing in REST to back you up. In fact, the notion of introducing a custom header is something of a down check, because your custom header is not something that a generic component is going to know about.
When you need a new header, the "REST" way to go about it is to introduce a new standard. See RFC 5988 for an example.
Fielding, writing in 2008
Every protocol, every media type definition, every URI scheme, and every link relationship type constitutes prior knowledge that the client must know (or learn) in order to make use of that knowledge. REST doesn’t eliminate the need for a clue. What REST does is concentrate that need for prior knowledge into readily standardizable forms.
The architects have a good point - encoding into the uri the hints that make it easier/cheaper/more-reliable to use your data access validator is exactly the sort of thing that allowing the servers to control their own URI namespace is supposed to afford.
The reason that this works, in REST, is that clients don't depend on URI for semantics; instead, they rely on the definitions of the relations that are encoded into the links (or otherwise expressed by the definition of the media type itself).

REST resource path design

I'm developing a REST service, and I'm trying to adhere to Doctor Roy Fielding's conventions and guidelines.
I picture my service as an endpoint that exposes a set of resources. A resource is identified by a URI, and API clients can manipulate resources by using HTTP semantics (i.e., different HTTP verbs map to the corresponding operations over URIs).
Guidelines state that these URIs should be defined in a hierarchical way, reflecting the object hierarchy. This proves useful when creating resources, because on the backend we need that data to perform the create operation. However, on further manipulations, a lot of the information included in the URI is not even going to be used by the service, because generally the resource Id alone is enough to uniquely identify the operation target.
An example: Consider an Api that exposes the creation and management of products. Consider also that a product is associated with a brand. On creation it makes sense that the following action is performed:
HTTP POST /Brand/{brand_id}/Product
[Body containing the input necessary to the creation of the product]
The creation returns an HTTP 201 created with a location header that exposes the newly created product's location.
On further manipulations, clients could access the product by doing:
HTTP PUT /Brand/{brand_id}/Product/{product_id}
HTTP DELETE /Brand/{brand_id}/Product/{product_id}
etc
However, since the product Id is universal in the product scope, the following manipulations could be performed like this:
/Product/{product_id}
I am only keeping the /Brand/{brand_id} prefix for coherence reasons. In fact, the brand id is being ignored by the service. Do you think that this is a good practice, and is it reasonable for the sake of maintaining a clear, unambiguous service interface definition? What are the benefits of doing this, and is this the way to go at all?
Also any pointers on URI definition best practices would be appreciated.
Thanks in advance
You are saying:
Guidelines state that these URIs should be defined in an hierarchical way, reflecting object hierarchy.
While it is often done this way it is not really relevant for a RESTful API design. Roy Fielding has a nice article addressing common misconceptions about REST. There he even says:
A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server).
and
A REST API should be entered with no prior knowledge beyond the initial URI …
So don't encode information in your URL that should be passed inside the resource. A RESTful API should work even if you replace all your URLs with artificial, nonsensical URIs. (I like understandable URIs as much as anybody, but as a mental exercise to check your "RESTfulness" it's quite good.)
The problem with modelling a URI after an object "hierarchy" is that the hierarchy is often not as obvious as it seems. (What is the object hierarchy between teacher, course, and student?) Often objects are in a web of relations and do not clearly belong beneath another object. A product might belong to a brand, but you might have multiple suppliers (each covering a subset of products for multiple brands). And REST is wonderful for expressing complex nets of relations. The whole internet/web works this way.
Instead of encoding the relationship in the hierarchy just define a hyperlink in your resource pointing to the related objects.
For your specific example I would use POST /product/ to create a new product and have a link to your /brand/xzy in the resource representation when creating the product.
If you want to know which products are defined for a specific brand, just include a list of links in the returned representation for GET /brand/xzy. If you want to have an explicit resource representing this relationship, you could still define GET /brand/{id}/products as a URL (or /brandproducts/xzy or /34143453453) and return it as a link in your brand resource.
Don't think so much about the design of your URIs, think more about the information you provide in your resources. Make sure it provides links to all the resource representations your client might want to view or manipulate after receiving it from your API.
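As a rough illustration of putting the relationship into the representation rather than the URI, here is a sketch that builds a brand document carrying links to its products. The data, the function name, and the "product" relation name are all assumptions made for the example:

```python
# Hypothetical data: product URIs grouped by brand URI.
PRODUCTS_BY_BRAND = {
    "/brand/xzy": ["/product/1", "/product/2"],
}

def brand_representation(brand_uri):
    """Build a brand document whose relations are expressed as links."""
    links = [{"rel": "self", "href": brand_uri}]
    for product_uri in PRODUCTS_BY_BRAND.get(brand_uri, []):
        links.append({"rel": "product", "href": product_uri})
    return {"links": links}

brand = brand_representation("/brand/xzy")
# A client follows the links it is given instead of constructing URIs.
product_links = [l["href"] for l in brand["links"] if l["rel"] == "product"]
```

Because the client only reads hrefs, the server could later change the product URIs to anything else without breaking it.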
I think this is the key comment:
a product is associated with a brand.
The word associated tells me you need to link resources together. So, let's say that there is an association between brands and products. Each resource would have its own set of methods (GET, PUT, etc.) as you described, but the representations should have links to other resources that describe their associations. Where the links are depends on the type of association (one-to-one, one-to-many, many-to-one, many-to-many) and direction.
For example, let's say there is a canonical request for this product to api.example.com:
GET /product/12345
It returns some representation of that product. For simplicity, I am going to use XML for the representation, but it can be XHTML, JSON or whatever you need. So, a simple representation of product 12345:
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: ...
<?xml version="1.0"?>
<Product href="http://api.example.com/product/12345" rel="self" type="application/xml">
<Name>Macaroni and Cheese</Name>
<Brand href="http://api.example.com/brand/7329" type="application/xml"/>
<Manufacturer href="http://api.example.com/manufacturer/kraft" rel="parent" type="application/xml"/>
</Product>
As you can see, I am embedding links into the representation of product 12345 describing each relationship. I try to follow the HATEOAS constraint as much as I can:
- There is explicit linking between the current resource and associated resources.
- An optional relationship description ("rel"). "self" and "parent" describe the relationship between the current resource and the resource the link references.
- An optional preferred MIME type that should be requested. This describes the type of document that the client should expect if a follow-up request is made.
- Opaque URLs instead of raw identifiers. Clients can just "navigate" to the URL without building them using some convention. Note that URLs do not always need to contain a unique database identifier or key (like "/brands/kraft").
To expand on some advanced concepts, let's say that products have other relationships. Maybe products have hierarchical relationships, or products supersede other products. All of these complex relationships can be represented by links. So, an advanced representation of product 12345:
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: ...
<?xml version="1.0"?>
<Product href="http://api.example.com/product/12345" rel="self" type="application/xml">
<Name>Macaroni and Cheese</Name>
<Brand href="http://api.example.com/brand/7329" rel="parent" type="application/xml"/>
<Manufacturer href="http://api.example.com/manufacturer/kraft" type="application/xml"/>
<!-- Other product data -->
<Related>
<Product href="http://api.example.com/product/29180" rel="prev" type="application/xml"/>
<Product href="http://api.example.com/product/39201" rel="next" type="application/xml"/>
</Related>
</Product>
In this example, I am using "prev" and "next" to represent a chain of products. The "prev" can be interpreted as "superseded" and "next" be interpreted as "superseded-by". You can use "superseded" and "superseded-by" as rel values, but "prev" and "next" are commonly used. It is really up to you.
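To show how a client might consume such a document, here is a sketch using Python's standard xml.etree.ElementTree. The document is a trimmed copy of the advanced representation, with the outer Product element written as an open tag so that it can contain children; find_link is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the advanced representation above.
DOC = """<Product href="http://api.example.com/product/12345" rel="self">
  <Name>Macaroni and Cheese</Name>
  <Brand href="http://api.example.com/brand/7329" rel="parent"/>
  <Related>
    <Product href="http://api.example.com/product/29180" rel="prev"/>
    <Product href="http://api.example.com/product/39201" rel="next"/>
  </Related>
</Product>"""

def find_link(root, rel):
    """Return the href of the first element carrying the given rel, or None."""
    for element in root.iter():  # iter() visits the root and all descendants
        if element.get("rel") == rel:
            return element.get("href")
    return None

root = ET.fromstring(DOC)
next_uri = find_link(root, "next")
```

The client only needs to know the rel values, not how the URLs are built.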

post body in REST

I was referring to the O'Reilly book on REST API design, which clearly lays out the message format, specifically how links should be used to represent interrelated resources. But all the examples are for reading a resource (GET) and how the server structures the message. What about a create (POST)? Should the message structure for creating a similarly interconnected object also use links?
By way of an example, let us say we want to create a Person object with a Parent field. Should the JSON message sent to the server in the POST body look like this:
{
  "name": "test",
  "age": 12,
  "links": [
    {
      "rel": "parent",
      "href": "/people/john"
    }
  ]
}
Here is a media type you could look at
http://stateless.co/hal_specification.html
Yes, that is one way of doing it. GET information might be usefully made human-readable, but POST/PUT information targets the machine.
Adding information that reduces the work the server has to do (e.g. so that it only has to verify the information rather than recover it all from scratch) also makes a lot of sense, performance-wise. As long as you do verify: keep in mind that user data must be treated as suspect on general principles. You don't want the first ExtJS-savvy guy to be able to forge requests to your services.
You might also format data in XML or CSV, depending on what's best for the specific application. Keep in mind that you might want to refactor or reuse the code, so adhering to a single standard also makes sense. All things considered, JSON is probably the best option.
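A sketch of the "verify, don't trust" step for a POSTed body that carries links, using Python's standard json module. KNOWN_PEOPLE and parse_person are hypothetical names; a real service would check the link against its own data store:

```python
import json

# Hypothetical set of resources the server already knows about.
KNOWN_PEOPLE = {"/people/john"}

def parse_person(body):
    """Parse a POSTed person and verify its parent links before accepting it."""
    data = json.loads(body)
    parents = [l["href"] for l in data.get("links", []) if l.get("rel") == "parent"]
    for href in parents:
        if href not in KNOWN_PEOPLE:
            raise ValueError(f"unknown parent: {href}")
    return {"name": data["name"], "age": data["age"], "parents": parents}

body = '{"name": "test", "age": 12, "links": [{"rel": "parent", "href": "/people/john"}]}'
person = parse_person(body)
```

Note that json.loads requires double-quoted keys and strings, so the body must be strict JSON rather than a JavaScript object literal.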

Should I use PUT method for update, if I also update a timestamp attribute

According to REST style, it's generally assumed that HTTP POST, GET, PUT, and DELETE methods should be used for CREATE, READ, UPDATE and DELETE (CRUD) operations.
But if we stick to the HTTP method definitions, it might not be so clear.
In this article it's explained that:
In a nutshell: use PUT if and only if you know both the URL where the resource will live, and the entirety of the contents of the resource. Otherwise, use POST.
Mainly because
PUT is a much more restrictive verb. It takes a complete resource and stores it at the given URL. If there was a resource there previously, it is replaced; if not, a new one is created. These properties support idempotence, which a naive create or update operation might not. I suspect this may be why PUT is defined the way it is; it's an idempotent operation which allows the client to send information to the server.
In my case I usually issue updates passing all the resource data, so I could use PUT for updates, but every time I issue an update I save a LastUser and LastUpdate column, with the user id that made the modification and the time of the operation.
I'd like to know your opinion, because strictly speaking those two columns are not part of the resource, but they do prevent the operation from being idempotent.
Ignoring the comment about the REST style mapping CRUD to the HTTP methods, this is an excellent question.
The answer to your question is yes: you are free to use PUT in this scenario even though there are some elements of the resource that are updated by the server in a non-idempotent manner. Unfortunately, the reasoning behind the answer is quite vague. The important thing is to understand the intent of the client request. The client intended to completely replace the contents of the resource with the values passed. The client is not responsible for the server doing other operations, and therefore the semantics of the HTTP method are not violated.
This is the reasoning that is used to allow a server to update a page counter when you do a GET operation. The client didn't ask for an update therefore the GET is safe even though the server chose to make an update.
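A small sketch of that idea: the state the client asked for is replaced as a whole, while the server's bookkeeping is stamped separately. The store layout and function name are assumptions for illustration only:

```python
from datetime import datetime, timezone

store = {}

def put_resource(uri, representation, user_id, now=None):
    """Replace the client-owned state; stamp bookkeeping fields on the side."""
    now = now or datetime.now(timezone.utc)
    store[uri] = {
        "state": dict(representation),                      # what the client asked for
        "audit": {"lastUser": user_id, "lastUpdate": now},  # the server's own business
    }
    return store[uri]["state"]

# Sending the same PUT twice leaves the client-visible state unchanged.
first = put_resource("/example/123", {"lorem": "foobar"}, user_id=322)
second = put_resource("/example/123", {"lorem": "foobar"}, user_id=322)
```

Only the audit timestamp differs between the two calls, and the client never asked for that change.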
The whole, complete resource versus partial resource debate has finally been spelled out in an update to the HTTP spec:
An origin server SHOULD reject any PUT request that contains a Content-Range header field, since it might be misinterpreted as partial content (or might be partial content that is being mistakenly PUT as a full representation). Partial content updates are possible by targeting a separately identified resource with state that overlaps a portion of the larger resource, or by using a different method that has been specifically defined for partial updates (for example, the PATCH method defined in [RFC5789]).
So, what we are supposed to do is now clear. What is not so clear is why there exists this constraint on only being allowed to send full responses. That question has been asked and IMHO remains unanswered in this thread on rest-discuss.
As LastUser and LastUpdate are not modifiable by the client, I'd remove them from the representation of your resource altogether. Let me explain my reasoning with an example.
Let's say that our typical example API will return the following representation to the client when asked to provide a single resource:
GET /example/123
<?xml version="1.0" encoding="UTF-8" ?>
<example>
<id>123</id>
<lorem>ipsum</lorem>
<dolor>sit amet</dolor>
<lastUser uri="/user/321">321</lastUser>
<lastUpdate>2011-04-16 20:00:00 GMT</lastUpdate>
</example>
If a client wants to modify the resource, it would presumably take the whole representation and send it back to the API.
PUT /example/123
<?xml version="1.0" encoding="UTF-8" ?>
<example>
<id>123</id>
<lorem>foobar</lorem>
<dolor>foobaz</dolor>
<lastUser>322</lastUser>
<lastUpdate>2011-04-16 20:46:15 GMT+2</lastUpdate>
</example>
Since the API generates values for lastUser and lastUpdate automatically and cannot accept data provided by the client, the most appropriate response would be 400 Bad Request or 403 Forbidden (since the client cannot modify these values).
If we want to be compliant with REST and send a full representation of the resource when doing a PUT request, we need to remove lastUser and lastUpdate from the representation of the resource. This will allow clients to send the full entity via PUT:
PUT /example/123
<?xml version="1.0" encoding="UTF-8" ?>
<example>
<id>123</id>
<lorem>foobar</lorem>
<dolor>foobaz</dolor>
</example>
The server would accept a full representation now that it doesn't contain lastUpdate and lastUser.
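The check itself could be as simple as the following sketch, which assumes the request body has already been parsed into a dictionary. The names are hypothetical, and 400 is used here although 403 would be equally defensible per the reasoning above:

```python
SERVER_MANAGED = {"lastUser", "lastUpdate"}

def validate_put(representation):
    """Return (status, detail) for a PUT body given as a dict of fields."""
    forbidden = sorted(SERVER_MANAGED.intersection(representation))
    if forbidden:
        return 400, f"server-managed fields not allowed: {', '.join(forbidden)}"
    return 200, "ok"

rejected = validate_put({"id": 123, "lorem": "foobar", "lastUser": 322})
accepted = validate_put({"id": 123, "lorem": "foobar", "dolor": "foobaz"})
```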
The question that remains is how to provide clients with access to lastUpdate and lastUser. If they don't need it (and these fields are required just internally by the API), we are fine and our solution is perfectly RESTful. If however clients need access to this data, the cleanest approach would be to use HTTP headers:
GET /example/123
...
Last-Modified: Sat, 16 Apr 2011 18:46:15 GMT
X-Last-User: /user/322
...
<?xml version="1.0" encoding="UTF-8" ?>
<example>
<id>123</id>
<lorem>foobar</lorem>
<dolor>foobaz</dolor>
</example>
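For illustration, the two headers could be produced like this with Python's standard library. The audit_headers name is made up; email.utils.format_datetime with usegmt=True yields the date format HTTP expects:

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def audit_headers(last_update, last_user_uri):
    """Build response headers carrying the server-managed audit data."""
    return {
        # usegmt=True emits the "GMT" suffix used in HTTP dates; it requires
        # a timezone-aware datetime in UTC.
        "Last-Modified": format_datetime(last_update, usegmt=True),
        "X-Last-User": last_user_uri,  # custom header, as in the example above
    }

headers = audit_headers(
    datetime(2011, 4, 16, 18, 46, 15, tzinfo=timezone.utc), "/user/322"
)
# headers["Last-Modified"] is "Sat, 16 Apr 2011 18:46:15 GMT"
```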
Using a custom HTTP header is not ideal because user agents need to be taught how to read it. If we want to provide clients with access to the same data in an easier way, the only thing that we can do is to put the data into the representation, and we are facing the same problem as in your original question. I would at least try to mitigate it somehow. If the content type used by the API is XML, we can put the data into node attributes instead of exposing them directly as node values, i.e.:
GET /example/123
...
Last-Modified: Sat, 16 Apr 2011 18:46:15 GMT
...
<?xml version="1.0" encoding="UTF-8" ?>
<example last-update="2011-04-16 18:46:15 GMT" last-user="/user/322">
<id>123</id>
<lorem>foobar</lorem>
<dolor>foobaz</dolor>
</example>
This way we'll at least avoid the problem where a client would attempt to submit all XML nodes in a follow-up PUT request. This won't work with JSON, and the solution is still a bit on the edge of idempotency (since the API would still have to ignore the XML attributes when processing the request).
Even better, as Jonah pointed out in the comments, if clients need access to lastUser and lastUpdate, these can be exposed as a new resource, linked from the original one e.g. as follows:
GET /example/123
<?xml version="1.0" encoding="UTF-8" ?>
<example>
<id>123</id>
<lorem>foobar</lorem>
<dolor>foobaz</dolor>
<lastUpdateUri>/example/123/last-update</lastUpdateUri>
</example>
... and then:
GET /example/123/last-update
<?xml version="1.0" encoding="UTF-8" ?>
<lastUpdate>
<resourceUri>/example/123</resourceUri>
<updatedBy uri="/user/321">321</updatedBy>
<updatedAt>2011-04-16 20:00:00 GMT</updatedAt>
</lastUpdate>
(The above can also be nicely expanded to provide a full audit log with individual changes, provided a resource changelog is available.)
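One way to picture the split, as a sketch: a stored record is divided into the client-editable resource and a separate last-update resource that the first one links to. The function name and the dictionary shapes are assumptions that mirror the XML above:

```python
def split_record(uri, record):
    """Split a stored record into the main resource and its last-update resource."""
    audit_uri = f"{uri}/last-update"
    # The main resource keeps everything except the server-managed fields.
    resource = {k: v for k, v in record.items()
                if k not in ("lastUser", "lastUpdate")}
    resource["lastUpdateUri"] = audit_uri
    last_update = {
        "resourceUri": uri,
        "updatedBy": record["lastUser"],
        "updatedAt": record["lastUpdate"],
    }
    return resource, {audit_uri: last_update}

record = {"id": 123, "lorem": "foobar", "dolor": "foobaz",
          "lastUser": "/user/321", "lastUpdate": "2011-04-16 20:00:00 GMT"}
resource, audit = split_record("/example/123", record)
```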
Please note:
I agree with Darrel Miller's take on the question, but I wanted to provide a different approach on top of it. Note that this approach is not backed up by any standards/RFCs/etc.; it's just a different take on the problem.
The disadvantage of using PUT to create resources is that the client has to provide the unique ID that represents the object it is creating. While it is usually possible for the client to generate this unique ID, most application designers prefer that their servers (usually through their databases) create this ID. In most cases we want our server to control the generation of resource IDs. So what do we do? We can switch to using POST instead of PUT.
So:
PUT = UPDATE
POST = INSERT
Hopefully, this helps for your specific case.
The HTTP methods POST and PUT aren't the HTTP equivalent of the CRUD's create and update. They both serve a different purpose. It's quite possible, valid and even preferred in some occasions, to use PUT to create resources, or use POST to update resources.
Use PUT when you can update a resource completely through a specific resource. For instance, if you know that an article resides at http://example.org/article/1234, you can PUT a new resource representation of this article directly through a PUT on this URL.
If you do not know the actual resource location, for instance, when you add a new article but do not have any idea where to store it, you can POST it to a URL and let the server decide the actual URL.
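The difference can be sketched with an in-memory store. Everything here (the store, the ID sequence, the function names, the status codes chosen) is an assumption for illustration:

```python
import itertools

articles = {}
_ids = itertools.count(1235)  # hypothetical server-side ID sequence

def put_article(url, representation):
    """PUT: the client names the URL; create or fully replace what is there."""
    created = url not in articles
    articles[url] = dict(representation)
    return 201 if created else 200

def post_article(representation):
    """POST to the collection: the server decides the new URL."""
    url = f"/article/{next(_ids)}"
    articles[url] = dict(representation)
    return 201, url

s1 = put_article("/article/1234", {"title": "Draft"})   # creates
s2 = put_article("/article/1234", {"title": "Final"})   # replaces
s3, new_url = post_article({"title": "Another"})        # server picks the URL
```

Note that PUT both created and updated the article here, which is why mapping PUT strictly to "update" is misleading.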