We've recently started creating API endpoints. One of these endpoints is hardcoded to change 2 of our reference type codes (e.g. code "P" for mobile is being changed to "M") from their system value to a custom value (out of a configurable list that has approximately 12 records at the moment). I'm trying to convince them it's bad practice and a terrible idea to change this reference data because of all the issues it can cause for systems that use the API; however, they believe it increases the "independence" of the API from the system of truth. We work in an enterprise environment, and currently only our systems hit the API.
Is there any other data or information that suggests this is a bad idea (copious amounts of Google searching haven't revealed anyone discussing this sort of issue specifically)? Or am I wrong in thinking so?
Edit:
For reference, here are some examples:
What the data would look like in the source system the api pulls from
{
"phone_type": "P",
"phone_number": "1234567890",
"user_id":"username"
}
What that same data would look like coming from our API now
{
"phone_type": "M",
"phone_number": "1234567890",
"user_id":"username"
}
What the reference data would look like coming from our reference codes end point
[
{
"code": "P",
"description": "Mobile Number",
"active":"true"
}
]
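To make the concern concrete, here is a minimal Python sketch of what a consumer runs into when it joins the API payload against the reference codes endpoint. The helper name describe_phone_type is made up for illustration; the data is taken from the examples above.

```python
# Reference data as returned by the reference codes endpoint (from the example above).
REFERENCE_CODES = [
    {"code": "P", "description": "Mobile Number", "active": "true"},
]


def describe_phone_type(record, reference_codes):
    """Look up the description for a record's phone_type; None if the code is unknown."""
    by_code = {entry["code"]: entry["description"] for entry in reference_codes}
    return by_code.get(record["phone_type"])


source_record = {"phone_type": "P", "phone_number": "1234567890", "user_id": "username"}
api_record = {"phone_type": "M", "phone_number": "1234567890", "user_id": "username"}

# The source system's code resolves; the remapped code has no matching reference row.
print(describe_phone_type(source_record, REFERENCE_CODES))
print(describe_phone_type(api_record, REFERENCE_CODES))
```

Any client that resolves codes through the reference endpoint would need to know about the hardcoded remapping to avoid the failed lookup.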
I'm fairly new to REST APIs, and I have been puzzling over how to model a predetermined sequence of states in a REST API, such that you can only move forward through those states.
To give a toy example, consider a "steak" resource.
First you POST to /steaks to get a new steak. You get back a representation like the following.
{
"id": 4,
"cooking_state": "raw",
"next": "blue"
}
You want the user to be able to prompt a transition to the next cooking_state in the sequence, until the steak reaches the desired level of doneness. So you might want them to be able to make some request that makes the resource at /steaks/4/ go through something like the following sequence (assume they like their steak well done).
{
"id": 4,
"cooking_state": "blue",
"next": "rare"
}
{
"id": 4,
"cooking_state": "rare",
"next": "medium rare"
}
{
"id": 4,
"cooking_state": "medium rare",
"next": "medium"
}
{
"id": 4,
"cooking_state": "medium",
"next": "medium well"
}
{
"id": 4,
"cooking_state": "medium well",
"next": "well done"
}
{
"id": 4,
"cooking_state": "well done",
"next": null
}
(Let's say that eating the steak is modelled by a DELETE request.)
What you don't want to allow is jumping ahead or moving backwards.
{
"id": 4,
"cooking_state": "blue",
"next": "rare"
}
{
"id": 4,
"cooking_state": "well done",
"next": null
}
{
"id": 4,
"cooking_state": "medium rare",
"next": "medium"
}
None of the approaches I have thought of seem adequate, but I'll go through them.
PUT or PATCH request
A PUT or PATCH request seems ill-fitted to this sort of situation. Sure, you could start with...
{
"id": 4,
"cooking_state": "blue",
"next": "rare"
}
... and then PUT to /steaks/4/:
{
"id": 4,
"cooking_state": "rare",
"next": "medium rare"
}
Or maybe some sort of "merge patch":
PATCH /steaks/4/ HTTP/1.1
{
"cooking_state": "rare"
}
Then if someone tries to update the cooking state out of sequence, you could forbid it (409 status code?). But putting or patching seems to imply a level of freedom that doesn't exist.
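A minimal Python sketch of what the server-side check behind such a PATCH might look like. The function names and the (status, resource) return shape are assumptions for illustration, not part of any framework; the state names are the ones from the steak example.

```python
COOKING_STATES = ["raw", "blue", "rare", "medium rare", "medium", "medium well", "well done"]


def next_state(state):
    """Return the state that follows `state`, or None at the end of the sequence."""
    index = COOKING_STATES.index(state)
    return COOKING_STATES[index + 1] if index + 1 < len(COOKING_STATES) else None


def patch_steak(steak, body):
    """Apply a merge-patch style body and return (status_code, steak)."""
    requested = body.get("cooking_state")
    # Only the immediate successor of the current state is accepted.
    if requested is None or requested != next_state(steak["cooking_state"]):
        return 409, steak  # out-of-sequence update: resource left unchanged
    updated = {**steak, "cooking_state": requested, "next": next_state(requested)}
    return 200, updated
```

With the steak at "blue", patching to "rare" succeeds, while patching to "well done" or back to "raw" yields 409.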
Verbs, possibly thinly disguised
In some ways it feels more natural to me to move between states with a bodyless post request. Something like one of the following:
POST /steaks/4/?action=next
POST /steaks/4/cooking_state?action=next
But that feels very "verb-ish". It feels like a thin veneer over:
POST /steaks/4/fry/
POST-ing events
You could POST cooking "events".
First, you create your steak...
POST /steaks/
... and get back the following response body.
{
"id": 5,
"cooking_state": "raw",
"next": "blue"
}
You can then query /steaks/5.
Then to cook the steak by one stage you make a request like the following:
POST /cooking_event/
{
"steak_id": 5
}
After this, a previously raw steak at /steaks/5/ would now look like this:
GET /steaks/5/
{
"state": "blue",
"next": "rare"
}
You make an identical POST request again, and now steak 5 looks like:
GET /steaks/5/
{
"state": "rare",
"next": "medium rare"
}
But, since the database row probably looks like this...
id cooking_state
5 "rare"
... there probably wouldn't be any actual event resource created, so you wouldn't be able to query GET /cooking_event/<id>/ - another thing which I'm not sure is acceptable in RESTful terms.
Again, this feels like a (this time more complex) veneer over verbs. It's like saying "The event which is the cooking of the steak" when you mean "Please cook the steak".
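For what it's worth, here is a rough Python sketch of the handler behind POST /cooking_event/ as described above. The in-memory steaks dictionary and the status codes (204, 404, 409) are assumptions made for the sketch.

```python
COOKING_STATES = ["raw", "blue", "rare", "medium rare", "medium", "medium well", "well done"]


def post_cooking_event(steaks, event):
    """Advance the steak named in the event by one stage; return an HTTP-like status."""
    steak = steaks.get(event.get("steak_id"))
    if steak is None:
        return 404  # no such steak
    index = COOKING_STATES.index(steak["cooking_state"])
    if index + 1 >= len(COOKING_STATES):
        return 409  # already well done, nothing left to transition to
    steak["cooking_state"] = COOKING_STATES[index + 1]
    return 204  # no event resource is stored, so there is no Location to return


steaks = {5: {"id": 5, "cooking_state": "raw"}}
post_cooking_event(steaks, {"steak_id": 5})  # steak 5 moves from raw to blue
```

The client never names the target state, so jumping ahead or moving backwards cannot be expressed at all.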
Obviously, the steak example may seem silly, but there are situations where it would be important to allow certain fields to be updated only in particular sequences. Is there an elegant way of dealing with this which is still fully RESTful?
Your example is a bit unsuited for your problem, because you don't actually set the state of the steak, you check it and try to decide if that's what you want. Basically you would do polling with a bunch of GET.
To be more practical (which oddly enough in this case means more abstract), suppose you have wizard-like kind of form that you want the user to fill.
You POST the form, and the result contains two main values:
the validation result (are the values ok AND is the user in the correct step?)
the path for the next step (what should the user do now?). If there is no next step, the user is done. If there is a next step, the user POSTs to that step and gets the same kind of result.
If the user tries to go to the next step without having visited the previous ones, you return an out-of-sequence kind of error, which in your case I'd say is 412 Precondition Failed.
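A small Python sketch of that wizard flow, under stated assumptions: the step paths are hypothetical, the session is a plain dict counting completed steps, and a failed validation is reported as 400 (that status is my choice, not something the flow above prescribes).

```python
WIZARD_STEPS = ["/wizard/step1", "/wizard/step2", "/wizard/step3"]  # hypothetical paths


def post_step(session, path, form_is_valid):
    """Handle a POST to a wizard step; return (status, body).

    The body carries the validation result and the path the user should POST to next.
    """
    completed = session["completed"]
    expected = WIZARD_STEPS[completed] if completed < len(WIZARD_STEPS) else None
    if path != expected:
        # Out of sequence: tell the client which step it should be on.
        return 412, {"valid": False, "next": expected}
    if not form_is_valid:
        return 400, {"valid": False, "next": path}  # retry the same step
    session["completed"] = completed + 1
    following = WIZARD_STEPS[completed + 1] if completed + 1 < len(WIZARD_STEPS) else None
    return 200, {"valid": True, "next": following}  # next is None when the user is done
```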
Your example demonstrates the difficulty of modelling a ReSTful API well. Not because of problems with ReST itself, but because of domain complexity. However, I don't think the example really works. Either of your solutions may be appropriate in certain situations. It all depends on how you analyse the domain.
One analysis may suggest that if a steak is being cooked then its 'state' is not really a function of what your 'client' (diner?) does to it at all, it's a function of time and temperature. So your state change problem is really something which the client has no control over at all. In this model, the client would poll the resource until the 'state' changed (as if by magic) to the desired one and then 'consume' it (by calling DELETE). Any 'update' privileges which the client has would preclude them changing the 'state' field. The states are then managed by the server somehow. In essence, your API server is acting as a grill - the client creates an item to be 'cooked' and the server then cooks it until the client consumes it.
Another analysis may say that you've got two types of client - a 'diner' and a 'cooker'. When a diner requests a steak (POST to /steaks) a new resource is created to which they have no 'update' (PUT) privileges. A more privileged 'cooker' client may then periodically update the state of the resource whereas the diner client only has privileges to GET or DELETE it. In that model, it's the responsibility of the 'cooker' to enforce your state model and the API server itself doesn't enforce it. The API server isn't really anything more than a glorified order system - the cooker may for example decide to restart the state machine (because it dropped the steak). Either way the responsibility resides in one place, and the API would work with a PUT model because the cooker client would simply be updating the resource.
Yet another interpretation says that the API server has to enforce the state machine even though there are the two types of client. In this case, the 'event' model really is appropriate. The 'cooker' only knows what's happened (i.e. what events have occurred) and the API server is responsible for maintaining the state transitions. Again, it's all hidden from the 'diner' client anyway. They just get to DELETE the steak when appropriate. This model can be very much more appropriate if you've got an API server which coordinates events from multiple sources (hey, you want mushrooms with your steak, right?).
Your concern about POST-ing events without ending up creating a resource you can GET is valid, but there are always exceptions. For those it's perfectly acceptable to return a 204 (No Content) or 202 (Accepted) without returning a Location you can get anything from. In the real world, if you did have a state model being influenced by a stream of events then you probably would want to keep a record of the incoming events in some kind of data store, and so you could potentially layer a resource over that (if appropriate).
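As a rough illustration of layering a resource over recorded events, here is a Python sketch. The class name, the in-memory list standing in for the data store, and the /cooking_event/<id>/ location format are all assumptions; the current state is derived from how many events have been recorded for a steak.

```python
COOKING_STATES = ["raw", "blue", "rare", "medium rare", "medium", "medium well", "well done"]


class SteakEvents:
    """Append-only event log; each stored event becomes a resource you can GET."""

    def __init__(self):
        self.events = []  # position in the list + 1 serves as the event id

    def state_of(self, steak_id):
        """Derive the current state from the number of events recorded for the steak."""
        count = sum(1 for event in self.events if event["steak_id"] == steak_id)
        return COOKING_STATES[count]

    def post_event(self, steak_id):
        """Record one cooking event; return (status, location)."""
        if self.state_of(steak_id) == COOKING_STATES[-1]:
            return 409, None  # nothing left to transition to
        self.events.append({"steak_id": steak_id})
        return 201, f"/cooking_event/{len(self.events)}/"

    def get_event(self, event_id):
        """Return the stored event, or None if the id is unknown."""
        if 1 <= event_id <= len(self.events):
            return self.events[event_id - 1]
        return None
```

Because every event is kept, the POST can return 201 with a Location that actually resolves, which addresses the "no resource to GET" worry.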
Yet another interpretation might suggest that there's only 1 client, with total control over any state transitions. That could be totally acceptable too - maybe I actually cook my own steaks and I just need a record of what I've done...
Ultimately, it's all down to where responsibility lies and how your individual use case is modelled. One thing does seem certain about your example - there should be only one place which is responsible for enforcing the state transitions. Exactly where the responsibility lies, and how to model it ReSTfully, is another question.
Moving on from the steak example, if you find that you've got resources with a subset of fields which are updated according to some kind of state machine, independently of the other resource fields, then you might want to model that subset as a kind of sub-resource, work out who is actually in control of that part of the resource (i.e. who actually enforces the state transitions), and break responsibilities down accordingly.
I understand HATEOAS represents the application's state by sending all actions that can be performed at that point in time within the application as part of its response (HAL, JSON-LD, etc.).
For example, viewing an account resource of a bank may allow you to deposit, withdraw or close the account (OPTIONS which may return UPDATE and DELETE verbs).
In terms of runtime discoverability of these links (by the consuming client), how might one go about this?
If the purpose of sending these links is to decouple the client from the server and drive the state by the hypermedia in the response, there must still be some amount of knowledge the developer has to hardcode in the application in order to make any sense of the responses being returned.
I understand that sending OPTIONS requests is the way to determine the current state of the resource and what you can do next, but in order to discover the actual URIs to use, would these simply be hardcoded as Cool URIs?
Like @VoicesOfUnreason said, in HATEOAS URIs are discoverable (and not documented) so that they can be changed. That is, unless they are the very entry points into your system (Cool URIs, the only ones that can be hard-coded by clients) - and you shouldn't have too many of those if you want the ability to evolve the rest of your system's URI structure in the future. This is in fact one of the most useful features of REST.
For the remaining non-Cool URIs, they can be changed over time, and your API documentation should spell out the fact that they should be discovered at runtime through hypermedia traversal.
Looking at the Richardson Maturity Model (level 3), this would be where links come into play. For example, from the top level, say /api/version(/1), you would discover there's a link to the groups. Here's how this could look in a tool like HAL Browser:
Root:
{
"_links": {
"self": {
"href": "/api/root"
},
"api:group-add": {
"href": "http://apiname:port/api/group"
},
"api:group-search": {
"href": "http://apiname:port/api/group?pageNumber={pageNumber}&pageSize={pageSize}&sort={sort}"
},
"api:group-by-id": {
"href": "http://apiname:port/api/group/id" (OR "href": "http://apiname:port/api/group?id={id}")
}
}
}
The add would simply be a POST to that endpoint, and then you'd have 2 GET methods.
GET /api/group?pageNumber=0&pageSize=20&sort=asc
which could return something like this:
{
"groups": [
{
"id": 123,
"name": "Test Group"
},
{
"id": 134,
"name": "Tennis squad"
}
]
}
Then once you drill down to a particular group (say #123):
{
"Id" : 123,
"Name" : "test",
"_links": {
"self": {
"href": "/api/group/1" (OR "/api/group?id=1")
},
"edit": {
"href": "http://apiname:port/api/group/1"
},
"api:delete": {
"href": "http://apiname:port/api/group/1"
},
"api:items-query": {
"href": "http://apiname:port/api/bonus?groupId=1"
}
}
}
Here, the edit would simply be a PUT, and then you'll need a DELETE (see level 2 of REST in that same link). As for the items, you probably know best whether they are just a property or another endpoint; you could even embed them so that they are returned in the same call that retrieves a group.
The advantage here would be that the client would only need to know the relationship (link) name (well obviously besides the resource structure/properties), while the server would be mostly free to alter the relationship (and resource) url.
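A minimal Python sketch of a client that knows only the relation names. The helpers find_link and expand are hypothetical names, and expand handles only simple {name} placeholders like the ones in the root document above; it is not a full URI Template implementation.

```python
import re
from urllib.parse import quote


def find_link(document, rel):
    """Return the href registered under a link relation, or None if it is absent."""
    link = document.get("_links", {}).get(rel)
    return link["href"] if link else None


def expand(template, params):
    """Fill simple {name} placeholders in an href with URL-encoded values."""
    return re.sub(r"\{(\w+)\}", lambda m: quote(str(params[m.group(1)]), safe=""), template)


# Trimmed copy of the root document shown earlier.
root = {
    "_links": {
        "self": {"href": "/api/root"},
        "api:group-add": {"href": "http://apiname:port/api/group"},
        "api:group-search": {
            "href": "http://apiname:port/api/group?pageNumber={pageNumber}&pageSize={pageSize}&sort={sort}"
        },
    }
}

search_url = expand(find_link(root, "api:group-search"), {"pageNumber": 0, "pageSize": 20, "sort": "asc"})
print(search_url)
```

If the server later moves the search endpoint, only the href in the root document changes and this client keeps working.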
There's a bunch of prior art around on trying to create expressive, discoverable hypermedia. You might want to review:
http://json-ld.org/
http://www.markus-lanthaler.com/hydra/
I am thinking maybe a series of if statements that check for certain properties to determine the state, or maybe even switch statements. Is this the correct path, or is there a better means of hypermedia discovery?
My current thinking is that you want to be shaping your ideas more along the lines of negotiating and following a protocol; so think state machine rather than if statements.
For starters, review How To GET a Cup of Coffee.
The hyperlinks in the documents served by RESTBucks are designed to assist the client in negotiating the RESTBucks protocol; the assumption that the client already understands that protocol is baked into the model. In other words, the client already understands that negotiating the protocol will allow it to reach its goal.
Of course, there could be multiple protocols that serve the same goal. For instance RESTBucks could also support a "Give Away Day Old Coffee" protocol; announcing the presence of each, the client would be expected to choose which is the better expression of the goal, and follow that path.
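One way to picture this is a table-driven client rather than a chain of if statements. The Python sketch below assumes HAL-style "_links" and made-up relation names ("payment", "cancel"); they are illustrative only and not taken from the RESTBucks article.

```python
# What the client knows in advance: which method goes with which link relation.
PROTOCOL = {
    "payment": "PUT",     # hypothetical relation: pay for the order
    "cancel": "DELETE",   # hypothetical relation: abandon the order
}


def next_action(document, goal_rels):
    """Pick the first relation (in order of preference) that the server currently offers.

    Returns (method, href), or None when none of the preferred links is present.
    """
    links = document.get("_links", {})
    for rel in goal_rels:
        if rel in links and rel in PROTOCOL:
            return PROTOCOL[rel], links[rel]["href"]
    return None


order = {
    "_links": {
        "cancel": {"href": "/order/1234"},
        "payment": {"href": "/payment/order/1234"},
    }
}
print(next_action(order, ["payment", "cancel"]))
```

The server decides which transitions are available by including or omitting links; the client only chooses among what it is offered.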
Introduction
/me/books.reads returns books[1].
It includes an array of books and the following fields for each book:
title
type
id
url
Problem
I'd like to get the author name(s) at least. I know that written_by is an existing field for books.
I'd like to get ISBN, if possible.
Current situation
I tried this:
/me/books.reads?fields=data.fields(author)
or
/me/books.reads?fields=data.fields(book.fields(author))
But the error response is:
"Subfields are not supported by data"
The books.reads response looks like this (just one book included):
{
"data": [
{
"id": "00000",
"from": {
"name": "User name",
"id": "11111"
},
"start_time": "2013-07-18T23:50:37+0000",
"publish_time": "2013-07-18T23:50:37+0000",
"application": {
"name": "Books",
"id": "174275722710475"
},
"data": {
"book": {
"id": "192511337557794",
"url": "https://www.facebook.com/pages/A-Semantic-Web-Primer/192511337557794",
"type": "books.book",
"title": "A Semantic Web Primer"
}
},
"type": "books.reads",
"no_feed_story": false,
"likes": {
"count": 0,
"can_like": true,
"user_likes": false
},
"comments": {
"count": 0,
"can_comment": true,
"comment_order": "chronological"
}
}
]
}
If I take the id of a book, I can get its metadata from the open graph, for example http://graph.facebook.com/192511337557794 returns something like this:
{
"category": "Book",
"description": "\u003CP>The development of the Semantic Web...",
"genre": "Computers",
"is_community_page": true,
"is_published": true,
"talking_about_count": 0,
"were_here_count": 0,
"written_by": "Grigoris Antoniou, Paul Groth, Frank Van Harmelen",
"id": "192511337557794",
"name": "A Semantic Web Primer",
"link": "http://www.facebook.com/pages/A-Semantic-Web-Primer/192511337557794",
"likes": 1
}
The response includes ~10 fields, including written_by which has the authors of the book.
Curiously, the link field seems to map to url of the books.reads response. However, the field names are different, so I'm starting to lose hope that I will be able to ask for written_by in the books.reads request.
The only reference that I've found about /me/books is https://developers.facebook.com/docs/reference/opengraph/object-type/books.book/
This is essentially about user sharing that he/she has read a book, not the details of the book itself.
The data structure is focused on the occasion of reading a book: when reading was started, when this story was published, etc.
[1] I know this thanks to How to get "read books"
FQL does not look very promising – although you can request books from the user table, it seems to deliver just a string value with only the book titles comma-separated.
You can search page table by name – but I doubt it will work with name in (subquery) when what that subquery delivers is just one string of the format 'title 1,title 2,…'.
Can’t really test this right now, because I have read only one book so far (ahm, one that I have set as “books I read” on FB, not in general …) – but using that to search the page table by name already delivers a multitude of pages, and even if I narrow that selection down by AND is_community_page=1, I still get several, so no real way of telling which would be the right one, I guess.
So, using the Graph API and a batch request seems to be more promising.
Similar to an FQL multi-query, batch requests also allow you to refer to data from a previous “operation” in the batch, by giving operations a “name”, and then referring to data from the first operation using JSONPath expression format (see Specifying dependencies between operations in the request for details).
So a batch query for this could look like this,
[
{"method":"GET","name":"get-books","relative_url":"me\/books?fields=id"},
{"method":"GET","relative_url":"?ids={result=get-books:$.data.*.id}
&fields=description,name,written_by"}
]
Here all in one line, for easier copy&paste, so that line breaks don’t cause syntax errors:
[{"method":"GET","name":"get-books","relative_url":"me\/books?fields=id"},{"method":"GET","relative_url":"?ids={result=get-books:$.data.*.id}&fields=description,name,written_by"}]
So, to test this:
Go to Graph API Explorer.
Change method to POST via the dropdown, and clear whatever is in the field right next to it.
Click “Add a field”, and input name batch, and as value insert the line copy&pasted from above.
Since that will also get you a lot of “headers” you might not be interested in, you can add one more field, name include_headers and value false to get rid of those.
In the result, you will get a field named body, that contains the JSON-encoded data for the second query. If you want more fields, add them to the fields parameter of the second query, or leave that parameter out completely if you want all of them.
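If you would rather do this from code than from the Graph API Explorer, a Python sketch of the two pieces might look like the following. It only builds the form fields and decodes a result; actually sending the POST (and the access token it needs) is left out. The helper names are mine, and the response shape used in decode_bodies (a list whose entries carry a JSON-encoded body, with empty entries skipped) is an assumption based on the description above.

```python
import json


def build_batch_fields(fields="description,name,written_by"):
    """Build the form fields for the batch POST described in the steps above."""
    operations = [
        {"method": "GET", "name": "get-books", "relative_url": "me/books?fields=id"},
        {
            "method": "GET",
            "relative_url": "?ids={result=get-books:$.data.*.id}&fields=" + fields,
        },
    ]
    # json.dumps emits the batch on a single line, so line breaks cannot sneak in.
    return {"batch": json.dumps(operations), "include_headers": "false"}


def decode_bodies(batch_response):
    """Decode the JSON-encoded body of every non-empty entry in a batch result."""
    return [json.loads(item["body"]) for item in batch_response if item and "body" in item]


form_fields = build_batch_fields()
print(form_fields["batch"])
```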
OK, after some trial-and-error I managed to create a direct link to Graph API Explorer to test this – the right amount of URL-encoding to use is a little fiddly to figure out :-)
(I left out the fields parameter for the second operation here, so this will give you all the info for the book that there is.)
As I said, I only got one book on FB, but this should work for a user with multiple books the same way (since the second operation just takes however many IDs it is given from the first one).
But I can’t tell you off the top of my head how this will work for a lot of books – how slow the second operation might get with that, when you set a high limit for the first one. And I also don’t know how this will behave in regard to pagination, which you might run into when me/books delivers a lot of books for a user.
But I think this should be a good enough starting point for you to figure the rest out by trying it on users with more data. HTH.
Edit: ISBN does not seem to be part of the info for a book’s community page, at least not for the ones I checked. And also written_by is optional – my book doesn’t have it. So you’ll only get that info if it is actually provided.
I'm interested in building a web service with a REST API. I've been reading about HATEOAS and many of the examples explain the concept by comparing it to what humans do when they surf the web. This has me thinking, why not build the REST API in such a way that it can be easily used by both humans and machines?
For example, I have an internal model of a widget, and this widget has properties like part number, price, etc. When a machine asks for a list of widgets, I can return a JSON representation.
{
widgets: [
{
id: 1,
part_number: "FOO123",
price: 100,
url: "/widget/1"
},
{
id: 2,
part_number: "FOO456",
price: 150,
url: "/widget/2"
},
{
id: 3,
part_number: "FOO789",
price: 200,
url: "/widget/3"
},
...
]
}
When a human requests the same list through his/her web browser, it seems like I should be able to take the same internal model and apply a different view to it to generate an HTML response. (Of course, I would decorate the HTML response with other page elements, like a header, footer, etc.)
Is this a sensible design? Why or why not? Are there any popular sites actually doing it?
The biggest drawback that I see is there is no obvious way for a user to delete a resource. In my use case, I'm not going to let users modify or delete resources, so this is not a deal-breaker, but in general how might you handle that?
@mehaase
First of all, I'd suggest using one of the registered JSON hypermedia formats:
Collection+JSON: http://amundsen.com/media-types/collection/format/
Collection.next+JSON:
http://code.ge/media-types/collection-next-json/
HAL - Hypertext Application Language:
http://stateless.co/hal_specification.html
All of them offer explicit semantics for creating links with semantic link relations.
For example with Collection(.next)+JSON you can express your widgets like this:
{"collection": {
"version": 1.0,
"items": [{
"href": "/widget/1",
"data": [{
"name": "id",
"value": 1,
"prompt": "ID"
}, {
"name": "part_number",
"value": "FOO123",
"prompt": "Part number"
}, {
"name": "price",
"value": 100,
"prompt": "Price"
}],
"links": [{
"rel": "self",
"href": "http://..."
}, {
"rel": "edit",
"href": "http://..."
}]
}]
}}
This gives you several advantages:
You do not need to reinvent the wheel for specifying links
You can freely use all registered link relation types:
http://www.iana.org/assignments/link-relations/link-relations.xml
Based on your data structure, you can easily use collection/item semantics of mentioned format
If need be you can describe input forms as well
As you can see from the example, it has enough information for transforming to HTML (or other formats).
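To illustrate that last point, here is a small Python sketch that turns one item of the collection above into an HTML table, using each data entry's prompt as the label. The function name and the choice of a table with a data-href attribute are just assumptions for the example.

```python
from html import escape


def item_to_html(item):
    """Render one Collection+JSON item as an HTML table: one row per data entry."""
    rows = "".join(
        f"<tr><th>{escape(str(entry['prompt']))}</th><td>{escape(str(entry['value']))}</td></tr>"
        for entry in item["data"]
    )
    return f'<table data-href="{escape(item["href"])}">{rows}</table>'


item = {
    "href": "/widget/1",
    "data": [
        {"name": "part_number", "value": "FOO123", "prompt": "Part number"},
        {"name": "price", "value": 100, "prompt": "Price"},
    ],
}
print(item_to_html(item))
```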
The biggest drawback that I see is there is no obvious way for a user
to delete a resource. In my use case, I'm not going to let users
modify or delete resources, so this is not a deal-breaker, but in
general how might you handle that?
For this, read the "edit" link relation specification; it implies that the resource can be deleted.
There are a couple of things you can do, but the first premise is simply that the modern "generic" web browser is a really crummy REST client.
If most of your interaction is guarded and managed by JavaScript, if you write a "rich client" so to speak where you're relying more on JS generated requests than simply links, forms, and the back button, then it can be a better REST client.
If you're stuck with the generic browser experience of forms and links, you can route around the lack of the other HTTP verbs by overloading POST. You lose some guarantees from intermediaries: DELETE is idempotent and POST is not. This has repercussions, but it's not devastating, and you just have to work around it. You can do idempotent things with POST, but intermediaries won't "know" that they are, so they can't assume it's idempotent.
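One common way to overload POST is a hidden form field that names the intended verb. The Python sketch below assumes the field is called _method, which is a convention in several web frameworks rather than anything HTTP defines; the function name is made up.

```python
OVERRIDABLE = {"PUT", "PATCH", "DELETE"}


def effective_method(method, form):
    """Return the verb the server should act on for a request.

    Only POST may be overridden, and only to a verb that browser forms cannot send.
    """
    override = form.get("_method", "").upper()
    if method == "POST" and override in OVERRIDABLE:
        return override
    return method


# A browser form would carry: <input type="hidden" name="_method" value="DELETE">
print(effective_method("POST", {"_method": "delete"}))
```

Intermediaries still see a plain POST, so the idempotency caveat above continues to apply.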
If you end up having to go "POST uber alles" you will either restrict your machine clients to the same api, or you offer up parallel services -- those used by POST stupid clients, and those others that have the full gamut available to them.
That said, if you choose an XML based hypermedia format, then what you can do is add XSL transforms to the XML payloads. The browsers will run the XSL on the payloads creating as pretty a page as you like (headers, footers, enough JS to choke a horse, etc.), while machines will ignore that aspect of it and focus solely on data as given.
Gives you a "best of both worlds" in that respect.
You can always build a REST API and then build your own, human-friendly web app around it. This is a common practice because you have out-of-the-box functionality and an extendable system for developers.
It is possible to do so simply by using HTML with RDFa. So humans can read the HTML and machines can read the RDFa annotations. You need a REST vocab like hydra to annotate the links, and other vocabs, like schema.org to annotate the content.
Another option is to use different media types for different types of clients, like HTML for humans and JSON-LD for machines.
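A bare-bones Python sketch of serving the same internal model in two representations, chosen by the Accept header. It uses plain JSON rather than JSON-LD to stay close to the widget example in the question, and the Accept check is deliberately naive (it ignores q-values); render_widgets is a made-up name.

```python
import json
from html import escape

WIDGETS = [
    {"id": 1, "part_number": "FOO123", "price": 100, "url": "/widget/1"},
    {"id": 2, "part_number": "FOO456", "price": 150, "url": "/widget/2"},
]


def render_widgets(widgets, accept):
    """Return (content_type, body) for the widget list, based on the Accept header."""
    if "text/html" in accept:
        items = "".join(
            f'<li><a href="{escape(w["url"])}">{escape(w["part_number"])}</a>: {w["price"]}</li>'
            for w in widgets
        )
        return "text/html", f"<ul>{items}</ul>"
    # Anything else gets the machine-readable representation.
    return "application/json", json.dumps({"widgets": widgets})


print(render_widgets(WIDGETS, "text/html,application/xhtml+xml")[1])
print(render_widgets(WIDGETS, "application/json")[1])
```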