REST API - Create resource and nested resource best practices - rest

I have a question about REST API, especially about resource creation (and nested resources).
Suppose we have the following "GET" routes:
GET /recipes/1
{
"id": 1,
"name": "Crepes",
"ingredients": [
{"id": 1, "name": "Flour", "quantity": 100},
{"id": 2, "name": "Milk", "quantity": 15},
...
]
}
GET /recipes/1/ingredients/1
{
"name": "Flour",
"quantity": 100,
"details": "...",
...
}
My question is: what is the best practice/design for POST /recipes? (Suppose we want to create the previous recipe)
we make only 1 call:
POST /recipes
body = {
"name": "Crepes",
"ingredients": [
{"name": "Flour", "quantity": 100, ...},
{"name": "Milk", "quantity": 15, ...}
...
]
}
==> Recipe and ingredients are created at the same time
we make 1 call for recipe, and X for ingredients:
POST /recipes
body = {
"name": "Crepes"
}
POST /recipes/1/ingredients
body = {
"name": "Flour",
"quantity": 100,
...
}
...
==> recipe and ingredients are created one after the other
So, what is the best practice/design for resources and nested-resources?
Thanks !

TL; DR -- yes, you want to send a single request to the server, and permit the server to "create" as many resources as it needs to support future work.
The potentially complicating issue in HTTP is caching. One of the important ideas in REST is that it "allows references to be made to a concept before any realization of that concept exists".
Within the context of the web, that means that a client can potentially GET /recipes/1/ingredients/1 before that resource has a representation. When the server responds with 404 Not Found, that response is cacheable.
Here's an important idea: cache-invalidation has very precise semantics; a successful POST /recipes request will invalidate any locally cached copies of the /recipes resource, but it will have no effect on cached copies of /recipes/1 or /recipes/1/ingredients/1.
Which means that if you put a general purpose reverse proxy in front of your API, the copies of the different resources at the proxy won't all update together. Because the different resources aren't invalidating, there are various scenarios in which consumers of multiple resources will see inconsistent information.
The good news is that you, the origin server, control not only the representations of the resources but also the caching meta data. So you can tune the caching strategy to the best compromise among the conflicting design pressures.
In practice, you will probably find that bulk create of resources isn't a problem, because there's little reason for a client to fetch a resource before it has been created. Bulk updates are more problematic.

Related

REST: HTTP PUT - Parent with Child Entities

Is there a standard/convention in REST that dictates the expected behavior with respect to Child entities when I use an HTTP PUT on Parent record?
For example, the initial state of my Parent object is:
{
"id": 1,
"children": [
{"id": 1, ...},
{"id": 2, ...},
{"id": 3, ...}
],
...
}
And then I perform an HTTP PUT on /parents:
{
"id": 1,
"children": [
{"id": 2, ...}, // I changed a property in here
],
...
}
I would be inclined to update the Parent, and the Child with id 2, but are Children with id's 1 and 3 supposed to be deleted or not?
Is there a standard/convention in REST that dictates the expected behavior with respect to Child entities when I use an HTTP PUT on Parent record?
No
REST doesn't have "entities" or "records". It has "resources".
REST doesn't have "children". Common identifier spellings do not imply a relationship between two resources.
PUT /parents HTTP/1.1
Content-Type: application/json
{
"id": 1,
"children": [
{"id": 2, ...}, // I changed a property in here
],
...
}
What this message means is "make the representation of the resource /parents match the body of this message". In other words, save my copy of this document on top of your document.
In this case, it says that there should be exactly one entry in the children array, with id: 2.
How the server does that is an implementation detail hidden behind the REST facade. The message only describes what the client wants, not what the client gets. The server owns its own resources, and has a lot of freedom to choose how to modify them. That could include deleting the underlying entities, or marking them as end of life, or removing them from the list without changing them, or even none of those things.
The server does need to be a little careful with its response, to be sure not to imply that the new representation matches the body of the request unless that's actually what it has done.
HTTP and REST doesn't have a concept of 'children'. If you do a GET request on a resource and there's something called "children" there, then those children are basically just part of that resource.
A PUT request should replace the state of the resource. If you are replacing the list of children with a new list of children, then yes I would expect those changes to stick.

What's the RESTful way to return metadata on a collection of resources?

Say I have a REST API for accessing user notifications. I have an endpoint for getting all notifications:
GET https://server:443/api/notifications
Which returns the following response:
[
{
"status": "unread",
"_id": "5db8228d710ab4b1e33f19b2",
"title": "Some title",
"time": "2019-10-29T11:29:17.402Z",
"details": "Some details...",
"user": "user1"
},
{
"status": "unread",
"_id": "5db8228d710ab4b1e33f19b3",
"title": "Some title",
"time": "2019-10-29T11:29:17.411Z",
"details": "Some other details",
"user": "user2"
},
]
Now, I'd like to also be able to retrieve the amount of notifications for each user in a single request, for which the response will be something like:
[
{
"user": "user1",
"count": 1
},
{
"user": "user2",
"count": 1
},
]
What would be the best way, in terms of REST conventions, to do that?
What would be the best way, in terms of REST conventions, to do that?
REST really doesn't answer that. REST tells you that you have resources, but it doesn't actually offer any opinion on where the "boundaries" of your resources should be.
Again, think "web pages". You could add your summary to the web page that describes notifications, and that would be fine. Or you could decide that the notifications are described on one web page, and the summary on a different web page, and that would be fine.
What REST does tell you is that caching is important; so if you think the cache controls for summary data should be different from notification data, then you want to be thinking about separating that data into a different resource. If you think the summary data and the notification data needs to be synchronized, then its more likely that they belong as part of the same resource.
Of course, there's nothing in REST that says you can't have multiple resources that return the "same" data.
If you wanted the summary to be part of the notifications resource, and also wanted that information to be independently identifiable, then you would use a fragment to describe the summary "sub-resource"; perhaps https://server:443/api/notifications#summary.

Doesn't HATEOAS multiplicate HTTP requests?

I came across HATEOAS on my researches and was thinking : doesn't HATEOAS multiplicate HTTP requests ?
Let's take the basic customer and order example.
Let's say you want to retrieve an order, the endpoint would be /orders/2
with the following JSON response :
{
"id": 2,
"total": 50.00,
"links": [{
"rel": "customer",
"href": "http://api.domain.com/customer/1
}]
}
Now what if I also need the customer ? Do I have to make another request to /customer/1 ? Doesn't this overload the HTTP traffic ?
Couldn't I get the couple customer + order with a single endpoint like /customers/1/orders/2 ?
Or just send the customer in the /orders/2 JSON response ?
{
"id": 2,
"total": 50.00,
"customer": {
"id": 1,
"name": "Dylan Gauthier"
}
}
What's the benefit(s) of one solution or another ? When do I need one or the other ?
Thanks ! :-)
If the server only supplies the customer and order separately, then you have to make two requests regardless of whether they are following REST or not.
Nothing about REST or its HATEOAS constraint prevents the server from providing both customer and order in the same resource, exactly as you have suggested:
GET /orders/2
{
"id": 2,
"total": 50.00,
"customer": {
"name": "Dylan Gauthier"
}
}
But the customer in that response has no connection to the identifier /customers/1 — the server could combine the two ideas:
{
"id": 2,
"total": 50.00,
"links": [{
"rel": "customer",
"href": "http://api.domain.com/customer/1
}],
"resources": {
"http://api.domain.com/customer/1": {
"name": "Dylan Gauthier"
}
}
}
or better yet, group the links by their relation to the requested resource:
{
"id": 2,
"total": 50.00,
"links": {
"customer": [{
"href": "http://api.domain.com/customer/1"
}]
},
"resources": {
"http://api.domain.com/customer/1": {
"name": "Dylan Gauthier"
}
}
}
Whilst this would make it a bit more work for the client to print the name of the customer (nothing at all taxing, mind), it allows the client to fetch more information about the customer if they want to!
Just to add to Nicholas' answer:
Embedding related resources
Pros: saves you a trip to the server
Cons: While it saves you a trip the first time and may be a few lines of code, you are giving up on caching: if something changes in a related resource (that you embedded) client cache is no more valid, so the client has to make the request again. Of course, assuming you leverage HTTP caching. Which you should...
If you want to go this route, you are better off using something like GraphQL... but wait!
Going "pure" HATEOS
Pros: resources have independent life-cycles; easier to make each (type of) resource evolve without impacting the others. By fully leveraging the cache, overtime, the overall performance is far better.
Cons: more requests (at first access), this might be a little slower on first access; some more code to manage the HATEOS thing...
I personally tend to use the second approach whenever possible.
The classic web analogy:
If it can help, a classic website is just another api that serves html related resources, the client app being the browser itself. If you have ever done some html/css/js, you might want to approach it the same way:
For the given particular website, given its navigation architecture...etc would you rather inline all/part of the css/js (the related resources) in the html pages (the main resource) or not.

JSON API for non-resource responses

Currently, I'm working on new product and making REST API for both - public and internal needs. I started with {json:api} specification and I was pretty happy with it until I faced some questions I cannot find answers to.
According to JSON API specification, every resource MUST contain id.
http://jsonapi.org/format/
Every resource object MUST contain an id member and a type member. The values of the id and type members MUST be strings.
And that's fine in many cases but not all.
Most of our endpoints are about "resources"
If I ask for a "things" collection (http://example.com/things)
{
"data": [{
"type": "things",
"id": "1",
"attributes": {
"title": "first"
},
"links": {
"self": "http://example.com/things/1"
}
}, {
"type": "things",
"id": "1",
"attributes": {
"title": "second"
},
"links": {
"self": "http://example.com/things/2"
}
}]
}
If I ask for a single "things" resource (http://example.com/things/1)
{
"data": {
"type": "things",
"id": "1",
"attributes": {
"title": "first"
},
"links": {
"self": "http://example.com/things/1"
}
}
}
But what to do with endpoints which are not about resources and does not have ID?
For example, in our application, there is an endpoint http://example.com/stats which should return stats of current logged in user. Like
{
"active_things": 23,
"last_login": "2017"
}
There is no id for this "resource" (it's not actually a resource, is it?). Backend just collects some "stats" for logged in user and returns an object of stats. There many endpoints like this in this application, for example, we have Notification center page where the user can change email addresses for different notifications.
So frontend app (single-page-app) first has to get current values and it sends the request to GET http://example.com/notification-settings.
{
"notifications_about_new_thing": "arunas#example.com",
"notification_about_other_thing": "arunas#example.com"
}
And there are many more endpoints like this. The problem is - how to return these responses in JSONAPI format? There is no ID in these endpoints.
And the biggest question is - why nobody else is facing this issue (at least I cannot find any discussion about this)? :D All APIs I ever made has some endpoints which don't have "id".
I have two ideas, first is to fake id, like "id": "doesnt_matter", the second - do not use json-api for these endpoints. But I don't like both of them.
Think RESTfully and everything can (must) be a resource. There is no "logged in" user as there are no sessions in RESTful APIs as they are stateless. There's no session state maintained between REST API invocations, so you have to be explicit about who the user is.
In this case, the resource is the user who has some stats attributes (in the simple case) or perhaps a relationship to a separate stats relationship (more complicated, not shown):
GET /users/1234
{
"data": {
"type": "users",
"id": "1234",
"attributes": {
"name": "etc.",
"active_things": 23,
"last_login": "2017"
}
}
}
I'm no JSON API expert- but it's worth noting that while JSON API is a concrete specification, it is not the same thing as JSON, nor as a REST API. If you don't like its semantics, I agree with commenters who argue, "Don't use it." If you are going to use JSON API, do so in a compliant way, where every response is a resource; every resource has an ID and a type; and additional information is supplied as attributes of the resource.
Toward your question, I'm thinking about something similar where my application returns computation results. Now on the one hand, these are not strictly "resources" and so I've been toying with the idea of returning the raw result as an array (which I believe would be valid JSON, with a caveat), e.g:
[ 47 ]
On the other hand, there is the idea that the results are the results of a computation that the client specified RESTfully, in which case one of the following two cases is likely true:
The same request submitted later is likely to have the same result. This suggests that in fact the result really is a resource.
The same request submitted later is likely to have a different result. This suggests that the client may want to track how results change for various queries, and so at least the query parameters should be part of the response.
In both cases, the response really is a 'result' object, and even though it doesn't have an ID per se, it does have an identity. If nothing else fits, the ID could be the query that generated the response.
This seems RESTful to me. User #n2ygk suggests that this is not correct as regards the JSON API spec, that an ID should simply be a unique ID and not have another semantic interpretation.
I'd love to hear other perspectives.

How can I expose statistics about resources through a RESTful API?

I'm building an API for my web app and have got as far as exposing all the resources my app uses, e.g. /users, /roles, /posts etc with no problem.
I'm now stuck on how to expose statistics about some of these resources in a RESTful way. It doesn't seem right to have a statistics resource, as GET /statistics/1 could be anything, and the results will likely change each request, as the stats are real-time, so it will not be cacheable.
Background:
For each of the /users in the system, the app periodically queries Steam's API for the /games they are playing, and the /servers they are playing it on, and stores this information along with a timestamp in the /states resource.
This information is aggregated to show a tally of the most popular games and servers on the /statistics/games/current-usage and statistics/servers/current-usage standard HTML pages. Illustrative screenshots: servers, games (taken at different times).
EDIT: Sample data for the basic resources
"state": {
"id": 292002,
"user_id": 135,
"game_id": 24663,
"server_id": 135,
"created_at":"2014-06-22 21:12:03"
},
"user": {
"id": 112,
"username": "ilumos",
"steam_id_64": "76561197970613738"
},
"server": {
"id": 135,
"application_id": 24663,
"name": null,
"address": "192.168.241.65",
"port": "0"
},
"game": {
"id": 24663,
"name": "DEFCON",
"steam_app_id": 1520
}
EDIT 2: Does REST permit endpoints that use a timestamp as the resource identifier? e.g:
GET /statistics/1403681498/games to get a response like this:
[
"game": {
"id": 123,
"name": "DEFCON",
"users": [
{
"id": 7654,
"username": "daryl",
"server": {
"id": 127,
"ip": "123.123.123.123",
"port": "27960"
}
},
{
"id": 135,
"username": "ilumos"
},
]
}
]
You have a variety of not-wholly-unreasonable options.
You can
include statistics with each response. Will all clients want
statistics? Are there a lot of statistics? Maybe something like GET /games?orderBy=numPlayers-&offset=0&limit=10 would work if all you're tracking is number of players.
have a /statistics/{statisticId} endpoint. This is not inherently unRESTful.
have a /games/{gameId}/statistics endpoint.
have a /statistics/games/{gameId} endpoint.
Really, there's no way for us to tell you what the best way is to implement this because we don't have enough information.
I'm going to go with creating a usage resosuce as all of these statistics will be the usage of other resources, either "right now" or at a historic point in time.
My URIs will look like this:
GET /usage/{resource-name}/{resource-id}
GET /usage/games/ collection of games in use right now (with user totals)
GET /usage/servers/ collection of servers in use right now
GET /usage/games/?timestamp=1234567890 collection of games in use at {timestamp}
GET /usage/games/1 usage of game with id 1 right now
GET /usage/games/1?timestamp=1234567890 usage of game with id 1 at {timestamp}
GET /usage/games/?user_id=123 usage of game with id 1 filtered to show only user with id 123
And in future I can extend the resource to for example return usage for electricity usage
GET /usage/phases/ collection of phases in use right now (with power draw totals)
GET /usage/phases/1 usage of phase with id 1 right now
GET /usage/phases/?timestamp=1234567890 collection of phases in use at {timestsamp} (with power draw totals)
Unless there's something inhernatly un-RESTful about this it seems to be the most fitting way of exposing this info.