REST: HTTP PUT - Parent with Child Entities - rest

Is there a standard/convention in REST that dictates the expected behavior with respect to Child entities when I use an HTTP PUT on Parent record?
For example, the initial state of my Parent object is:
{
"id": 1,
"children": [
{"id": 1, ...},
{"id": 2, ...},
{"id": 3, ...}
],
...
}
And then I perform an HTTP PUT on /parents:
{
"id": 1,
"children": [
{"id": 2, ...}, // I changed a property in here
],
...
}
I would be inclined to update the Parent, and the Child with id 2, but are Children with id's 1 and 3 supposed to be deleted or not?

Is there a standard/convention in REST that dictates the expected behavior with respect to Child entities when I use an HTTP PUT on Parent record?
No
REST doesn't have "entities" or "records". It has "resources".
REST doesn't have "children". Common identifier spellings do not imply a relationship between two resources.
PUT /parents HTTP/1.1
Content-Type: application/json
{
"id": 1,
"children": [
{"id": 2, ...}, // I changed a property in here
],
...
}
What this message means is "make the representation of the resource /parents match the body of this message". In other words, save my copy of this document on top of your document.
In this case, it says that there should be exactly one entry in the children array, with id: 2.
How the server does that is an implementation detail hidden behind the REST facade. The message only describes what the client wants, not what the client gets. The server owns its own resources, and has a lot of freedom to choose how to modify them. That could include deleting the underlying entities, or marking them as end of life, or removing them from the list without changing them, or even none of those things.
The server does need to be a little careful with its response, to be sure not to imply that the new representation matches the body of the request unless that's actually what it has done.
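As a minimal sketch of the point above, assume a hypothetical in-memory store and a hypothetical `put_parent` handler (neither comes from the question). The server makes the parent's representation match the request body, but it decides for itself what happens to the underlying child entities. Here it only detaches children 1 and 3 instead of deleting them, which is one choice among several.

```python
# Hypothetical in-memory storage; all names are illustrative only.
children_table = {
    1: {"id": 1, "name": "a"},
    2: {"id": 2, "name": "b"},
    3: {"id": 3, "name": "c"},
}
parents_table = {1: {"id": 1, "child_ids": [1, 2, 3]}}


def put_parent(parent_id, body):
    """Make the parent's representation match the request body.

    Children omitted from the body are detached from the parent but
    left in children_table; deleting them instead would be an equally
    valid server-side choice.
    """
    for child in body["children"]:
        # Upsert each child that appears in the new representation.
        children_table[child["id"]] = dict(child)
    parents_table[parent_id] = {
        "id": parent_id,
        "child_ids": [child["id"] for child in body["children"]],
    }
    return get_parent(parent_id)


def get_parent(parent_id):
    """Build the representation a subsequent GET would return."""
    parent = parents_table[parent_id]
    return {
        "id": parent["id"],
        "children": [children_table[cid] for cid in parent["child_ids"]],
    }


result = put_parent(1, {"id": 1, "children": [{"id": 2, "name": "changed"}]})
```

After the call, a GET on the parent lists only child 2, while the entities for children 1 and 3 still exist behind the facade.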

HTTP and REST don't have a concept of 'children'. If you do a GET request on a resource and there's something called "children" there, then those children are basically just part of that resource.
A PUT request should replace the state of the resource. If you are replacing the list of children with a new list of children, then yes I would expect those changes to stick.

Related

REST API - Create resource and nested resource best practices

I have a question about REST API, especially about resource creation (and nested resources).
Suppose we have the following "GET" routes:
GET /recipes/1
{
"id": 1,
"name": "Crepes",
"ingredients": [
{"id": 1, "name": "Flour", "quantity": 100},
{"id": 2, "name": "Milk", "quantity": 15},
...
]
}
GET /recipes/1/ingredients/1
{
"name": "Flour",
"quantity": 100,
"details": "...",
...
}
My question is: what is the best practice/design for POST /recipes? (Suppose we want to create the previous recipe)
we make only 1 call:
POST /recipes
body = {
"name": "Crepes",
"ingredients": [
{"name": "Flour", "quantity": 100, ...},
{"name": "Milk", "quantity": 15, ...}
...
]
}
==> Recipe and ingredients are created at the same time
we make 1 call for recipe, and X for ingredients:
POST /recipes
body = {
"name": "Crepes"
}
POST /recipes/1/ingredients
body = {
"name": "Flour",
"quantity": 100,
...
}
...
==> recipe and ingredients are created one after the other
So, what is the best practice/design for resources and nested-resources?
Thanks !
TL; DR -- yes, you want to send a single request to the server, and permit the server to "create" as many resources as it needs to support future work.
The potentially complicating issue in HTTP is caching. One of the important ideas in REST is that it "allows references to be made to a concept before any realization of that concept exists".
Within the context of the web, that means that a client can potentially GET /recipes/1/ingredients/1 before that resource has a representation. When the server responds with 404 Not Found, that response is cacheable.
Here's an important idea: cache-invalidation has very precise semantics; a successful POST /recipes request will invalidate any locally cached copies of the /recipes resource, but it will have no effect on cached copies of /recipes/1 or /recipes/1/ingredients/1.
Which means that if you put a general purpose reverse proxy in front of your API, the copies of the different resources at the proxy won't all update together. Because the different resources aren't invalidating, there are various scenarios in which consumers of multiple resources will see inconsistent information.
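To illustrate the invalidation behaviour described above, here is a toy sketch. `TinyCache` is a hypothetical class, not a real HTTP cache. It assumes the simplified rule that a successful unsafe request invalidates only the request target. Real caches may also invalidate URIs named in `Location` or `Content-Location` response headers, which this sketch ignores.

```python
class TinyCache:
    """Toy cache keyed by URI; a simplification, not a real HTTP cache."""

    def __init__(self):
        self.entries = {}

    def store(self, uri, status):
        # Remember the status code of a cached response for this URI.
        self.entries[uri] = status

    def on_response(self, method, uri, status):
        # A non-error response to an unsafe method invalidates the
        # request target only; other URIs are left untouched here.
        if method in {"POST", "PUT", "DELETE", "PATCH"} and 200 <= status < 400:
            self.entries.pop(uri, None)


cache = TinyCache()
cache.store("/recipes", 200)
cache.store("/recipes/1", 404)                # cached "Not Found"
cache.store("/recipes/1/ingredients/1", 404)  # cached "Not Found"

cache.on_response("POST", "/recipes", 201)
remaining = sorted(cache.entries)
```

Under that assumption, the cached 404 responses for the nested URIs survive the POST, which is the inconsistency the answer warns about.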
The good news is that you, the origin server, control not only the representations of the resources but also the caching meta data. So you can tune the caching strategy to the best compromise among the conflicting design pressures.
In practice, you will probably find that bulk create of resources isn't a problem, because there's little reason for a client to fetch a resource before it has been created. Bulk updates are more problematic.
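The single-request approach could look roughly like this. It is only a sketch: the `post_recipe` handler, the in-memory dictionaries and the ID counters are assumptions for illustration and are not part of the question.

```python
import itertools

# Hypothetical in-memory storage and ID generators.
recipes = {}
ingredients = {}
_recipe_ids = itertools.count(1)
_ingredient_ids = itertools.count(1)


def post_recipe(body):
    """Create a recipe and one ingredient resource per nested entry."""
    recipe_id = next(_recipe_ids)
    created = [f"/recipes/{recipe_id}"]
    ingredient_ids = []
    for item in body.get("ingredients", []):
        ingredient_id = next(_ingredient_ids)
        ingredients[(recipe_id, ingredient_id)] = dict(item)
        ingredient_ids.append(ingredient_id)
        created.append(f"/recipes/{recipe_id}/ingredients/{ingredient_id}")
    recipes[recipe_id] = {"name": body["name"], "ingredient_ids": ingredient_ids}
    # 201 Created, plus the URIs of every resource the server made.
    return 201, created


status, uris = post_recipe({
    "name": "Crepes",
    "ingredients": [
        {"name": "Flour", "quantity": 100},
        {"name": "Milk", "quantity": 15},
    ],
})
```

One request from the client results in three resources on the server: the recipe and its two ingredients.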

API Design: Querying a sub-resource

We have a resource which could be modeled as a nested object
GET /A/
[
{
"name": "my_a",
"B": [
{"name": "my_b", "address": "0xbeef"}
]
}
]
or a sub resource, like
GET /A/my_a/B
[
{
"name":"my_b", "address": "0xbeef"
}
]
Our customers want a way to query for objects of type A based on properties of type B, e.g. "get me all the A objects who have B objects with name 'my_b'".
It seems preferable to write the API using the "B as a sub-resource" style because it lends itself to pagination if there are many B object types. Additionally, retrieving B objects can be expensive, so if only some clients are interested in B, it makes sense to require separate calls to retrieve sub-resource B. However, it also seems strange to allow users to query on a sub-resource if the sub-resource is not returned in the results.
For example, a query feels quite natural when in the form:
GET /A?query=B.address[equals]0xbeef
[
{
"name": "my_a",
"B": [
{"name": "my_b", "address": "0xbeef"}
]
}
]
but less so when the query looks like
GET /A?query=B.address[equals]0xbeef
[
{
"name": "my_a"
}
]
A compromise I'm considering is using the nested approach but not include the B objects by default. A query parameter can expose B. So,
GET /A?query=B.address[equals]0xbeef&include_b=true
[
{
"name": "my_a",
"B": [
{"name": "my_b", "address": "0xbeef"}
]
}
]
I researched "REST, nested objects, querying" and found examples. Most of these examples included the subresource as a nested object, the include_b parameter seems unique to my design.
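A rough sketch of that compromise, assuming a hypothetical `list_a` function over a made-up in-memory data set (the second A object is invented for contrast):

```python
# Hypothetical data set mirroring the example; not a real backend.
A_OBJECTS = [
    {"name": "my_a", "B": [{"name": "my_b", "address": "0xbeef"}]},
    {"name": "other_a", "B": [{"name": "other_b", "address": "0xcafe"}]},
]


def list_a(b_field, b_value, include_b=False):
    """Return A objects having at least one B with b_field == b_value."""
    results = []
    for a in A_OBJECTS:
        if any(b.get(b_field) == b_value for b in a["B"]):
            item = {"name": a["name"]}
            if include_b:
                # Expose the nested B objects only when asked for.
                item["B"] = [dict(b) for b in a["B"]]
            results.append(item)
    return results


without_b = list_a("address", "0xbeef")
with_b = list_a("address", "0xbeef", include_b=True)
```

The filter always runs against B, and `include_b` only controls whether the matching A objects carry their B list in the response.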
So, SO, I'm looking for general feedback on this approach, and to see if this is a common problem with a known solution. Curious to hear what comes back.
edit 1:
Updated the example to show that querying can be on arbitrary properties.
As @RomanVottner pointed out, I'm actually not designing a RESTful API. Instead, the API is closer to RPC translated to use HTTP/JSON. In fact, my team follows the Google API Design Guide, which itself dictates how to write gRPC APIs that are then (I presume) automatically translated into web endpoints.
So, at the end of the day, I have not had my style question answered, other than to learn that my question wasn't accurate. I will most likely use the solution I proposed in the question.

JSON API for non-resource responses

Currently, I'm working on a new product and building a REST API for both public and internal needs. I started with the {json:api} specification and I was pretty happy with it until I faced some questions I cannot find answers to.
According to JSON API specification, every resource MUST contain id.
http://jsonapi.org/format/
Every resource object MUST contain an id member and a type member. The values of the id and type members MUST be strings.
And that's fine in many cases but not all.
Most of our endpoints are about "resources"
If I ask for a "things" collection (http://example.com/things)
{
"data": [{
"type": "things",
"id": "1",
"attributes": {
"title": "first"
},
"links": {
"self": "http://example.com/things/1"
}
}, {
"type": "things",
"id": "2",
"attributes": {
"title": "second"
},
"links": {
"self": "http://example.com/things/2"
}
}]
}
If I ask for a single "things" resource (http://example.com/things/1)
{
"data": {
"type": "things",
"id": "1",
"attributes": {
"title": "first"
},
"links": {
"self": "http://example.com/things/1"
}
}
}
But what to do with endpoints which are not about resources and do not have an ID?
For example, in our application, there is an endpoint http://example.com/stats which should return stats of the currently logged-in user. Like
{
"active_things": 23,
"last_login": "2017"
}
There is no id for this "resource" (it's not actually a resource, is it?). The backend just collects some "stats" for the logged-in user and returns an object of stats. There are many endpoints like this in this application. For example, we have a Notification center page where the user can change email addresses for different notifications.
So frontend app (single-page-app) first has to get current values and it sends the request to GET http://example.com/notification-settings.
{
"notifications_about_new_thing": "arunas@example.com",
"notification_about_other_thing": "arunas@example.com"
}
And there are many more endpoints like this. The problem is - how to return these responses in JSONAPI format? There is no ID in these endpoints.
And the biggest question is: why is nobody else facing this issue (at least I cannot find any discussion about this)? :D All the APIs I have ever made have some endpoints which don't have an "id".
I have two ideas: the first is to fake the id, like "id": "doesnt_matter"; the second is to not use json-api for these endpoints. But I don't like either of them.
Think RESTfully and everything can (must) be a resource. There is no "logged in" user as there are no sessions in RESTful APIs as they are stateless. There's no session state maintained between REST API invocations, so you have to be explicit about who the user is.
In this case, the resource is the user who has some stats attributes (in the simple case) or perhaps a relationship to a separate stats resource (more complicated, not shown):
GET /users/1234
{
"data": {
"type": "users",
"id": "1234",
"attributes": {
"name": "etc.",
"active_things": 23,
"last_login": "2017"
}
}
}
I'm no JSON API expert, but it's worth noting that while JSON API is a concrete specification, it is not the same thing as JSON, nor as a REST API. If you don't like its semantics, I agree with commenters who argue, "Don't use it." If you are going to use JSON API, do so in a compliant way, where every response is a resource; every resource has an ID and a type; and additional information is supplied as attributes of the resource.
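A small sketch of that compliant shape, assuming a hypothetical helper named `to_resource_object` (it is not part of any JSON API library). It wraps plain stats as attributes of a user resource and stringifies the ID, since the spec quoted in the question requires `id` and `type` to be strings.

```python
def to_resource_object(resource_type, resource_id, attributes):
    """Wrap plain attributes in a JSON API style top-level document."""
    return {
        "data": {
            "type": resource_type,
            "id": str(resource_id),  # the spec requires id to be a string
            "attributes": dict(attributes),
        }
    }


# The stats from the question, folded into the user resource.
stats = {"active_things": 23, "last_login": "2017"}
document = to_resource_object("users", 1234, {"name": "etc.", **stats})
```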
Toward your question, I'm thinking about something similar where my application returns computation results. Now on the one hand, these are not strictly "resources" and so I've been toying with the idea of returning the raw result as an array (which I believe would be valid JSON, with a caveat), e.g:
[ 47 ]
On the other hand, there is the idea that the results are the results of a computation that the client specified RESTfully, in which case one of the following two cases is likely true:
The same request submitted later is likely to have the same result. This suggests that in fact the result really is a resource.
The same request submitted later is likely to have a different result. This suggests that the client may want to track how results change for various queries, and so at least the query parameters should be part of the response.
In both cases, the response really is a 'result' object, and even though it doesn't have an ID per se, it does have an identity. If nothing else fits, the ID could be the query that generated the response.
This seems RESTful to me. User @n2ygk suggests that this is not correct as regards the JSON API spec, that an ID should simply be a unique ID and not have another semantic interpretation.
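The "ID could be the query" idea might be sketched as follows. This is only an illustration: `result_document`, the "results" type and the parameter names `a` and `b` are made up, and, per the caveat above, such an ID may not be in the spirit of the JSON API spec.

```python
from urllib.parse import urlencode


def result_document(params, value):
    """Build a 'results' resource whose id is the canonicalised query."""
    # Sorting the parameters makes the id independent of argument order.
    query_id = urlencode(sorted(params.items()))
    return {
        "data": {
            "type": "results",
            "id": query_id,
            "attributes": {"query": dict(params), "value": value},
        }
    }


doc = result_document({"b": 5, "a": 42}, [47])
```

Echoing the query parameters in the attributes also covers the second case above, where the client wants to track how results change for a given query.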
I'd love to hear other perspectives.

REST convention for parent insert and parent + joinId insert using same endpoint

In the context of an event management system, speakers talk at different sessions. My entities are Speakers and Sessions.
Let's say the endpoint is
1) POST /speakers (to insert detail of a speaker ONLY)
2) POST /speakers (to insert detail of speaker + id of the session he's talking on)
Point 2 requires an additional insert in the join table.
How can I specify both kinds of operations within the same endpoint?
A speaker could be represented including the session he speaks on. For example:
{
"id": 1234,
"firstname": "Joe",
"lastname": "Doe",
"sessions": []
}
This representation means that the speaker is not speaking on any session. sessions is an empty array. Doing
POST /speakers
Content-Type: application/json
with the JSON body as shown above, would create the speaker.
If the client knows in advance all sessions the speaker will be speaking, the JSON could look like this:
{
"id": 1234,
"firstname": "Joe",
"lastname": "Doe",
"sessions": [
{
"id": 12,
"link": "/session/12"
},
{
"id": 34,
"link": "/session/34"
}
]
}
For each session the speaker is speaking on, a short object consisting only of the id and a link to the session are included. This should be enough for the server to know how to link speaker and sessions in the database.
Now let's consider the case, that the sessions a speaker will speak on are not known in advance by the client. The client would create the speaker using the first JSON representation above including an empty sessions array. Later, when all sessions the speaker will speak on are known to the client, he would make a PATCH request:
PATCH /speakers/1234
Content-Type: application/json
{
"sessions": [
{
"id": 12,
"link": "/session/12"
},
{
"id": 34,
"link": "/session/34"
}
]
}
Note that now we only send the sessions. All other attributes of the speaker shall be left as is on the server.
If the client wants to add sessions to the speaker one after the other, he could do this for every session:
POST /speakers/1234/sessions
Content-Type: application/json
{
"id": 43,
"link": "/sessions/43"
}
This would mean: "add session 43 to the list of sessions of speaker 1234". Here /speakers/1234/sessions is a sub-resource of /speakers/1234. Adding to it makes sense (of course the server would have to check for duplicates).
Note the different usage of POST to create a new resource (a speaker), to add to a sub resource (the list of sessions). Note also that changing only part of a resource (the speaker) uses PATCH.
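The three operations described so far could be sketched like this. The handler names and the in-memory `speakers` dictionary are assumptions for illustration; a real server would persist to the speaker, session and join tables.

```python
# Hypothetical in-memory storage keyed by speaker id.
speakers = {}


def post_speaker(body):
    """POST /speakers: create a speaker from a full representation."""
    speakers[body["id"]] = {**body, "sessions": list(body.get("sessions", []))}
    return speakers[body["id"]]


def patch_speaker(speaker_id, changes):
    """PATCH /speakers/{id}: update only the supplied attributes."""
    speakers[speaker_id].update(changes)
    return speakers[speaker_id]


def post_speaker_session(speaker_id, session):
    """POST /speakers/{id}/sessions: append one session, skipping duplicates."""
    sessions = speakers[speaker_id]["sessions"]
    if all(existing["id"] != session["id"] for existing in sessions):
        sessions.append(session)
    return sessions


post_speaker({"id": 1234, "firstname": "Joe", "lastname": "Doe", "sessions": []})
patch_speaker(1234, {"sessions": [{"id": 12, "link": "/session/12"}]})
post_speaker_session(1234, {"id": 43, "link": "/sessions/43"})
post_speaker_session(1234, {"id": 43, "link": "/sessions/43"})  # duplicate, ignored
```

The PATCH leaves the name attributes untouched, and the second POST of session 43 has no effect because of the duplicate check.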
Edit:
The HTTP verb PUT is usually used if the client wants to send the complete representation of a resource. When adding the list of sessions to an existing speaker, we use PATCH on the speaker because using PUT on him would require the client to send a complete representation of the speaker. In this use case the client does not want to do this, he wants to set the list of sessions.
An alternative way could be to
PUT /speakers/1234/sessions
Content-Type: application/json
[
{
"id": 12,
"link": "/session/12"
},
{
"id": 34,
"link": "/session/34"
}
]
with a complete list of sessions on the subresource sessions. Note the difference: here the client is sending a complete representation of the list of sessions for the speaker. Here PUT means: "Overwrite the complete list of sessions for this speaker with what I provide".
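A minimal sketch of that overwrite semantics, again with a hypothetical handler and a hypothetical in-memory `speaker_sessions` mapping:

```python
# Hypothetical state: speaker 1234 currently has session 43 only.
speaker_sessions = {1234: [{"id": 43, "link": "/sessions/43"}]}


def put_speaker_sessions(speaker_id, sessions):
    """PUT /speakers/{id}/sessions: overwrite the whole session list."""
    speaker_sessions[speaker_id] = [dict(s) for s in sessions]
    return speaker_sessions[speaker_id]


put_speaker_sessions(1234, [
    {"id": 12, "link": "/session/12"},
    {"id": 34, "link": "/session/34"},
])
```

Session 43 is gone afterwards because it was not part of the representation the client sent.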
Using /speakers?eventId=1 to get the list of speakers for event 1 is good practice. Here the client requests a filtered subset of the collection resource /speakers. Using query parameters to express filter conditions is very common.
As for good resources for this kind of knowledge, my general advice is to always think about resources. What are the resources your API provides? How are different types of resources related? Can they exist next to each other (a speaker can exist without a session, a session can exist without a speaker), or are they composites (a hotel room can only exist inside a hotel)? Usually this kind of question helps. But in REST there are no hard rules or "standards" for how URLs must be constructed. I think it is important to be consistent in an API.

design pattern for dependent resource in REST

I am developing specs doc for resource URIs. Most everything is fairly well discussed around on the netz, and is all very helpful. However, I am a bit stuck on the pattern for a dependent resource. So, a dependent resource is something that exists at the pleasure of its parent resource. And, if the parent ceases to exist then the dependent also goes away. So, if I have books, a dependent resource would be the count of books. For any given query, if there are no books then there will be no count. Which is different from, say, an author... you could have no books, but still have authors. Ok. So I have something like this URI and the returned data
http://example.com/books.json?author=Homer
{"books": [
{"id": 33, "title": "Iliad", "author": "Homer", "pubyear": "800 BC"},
{"id": 33, "title": "Odyssey", "author": "Homer", "pubyear": "750 BC"}
]}
The URI ends in the plural version of the common noun, and the QUERY_STRING is used to filter the return set. The root node in the return "hash" is the common noun that was queried, and its key is an array each element of which is a hash with key/value pairs.
For the count, my instinct is to do the following
http://example.com/books/count.json?author=Homer
{"books": [
{"count": 2}
]}
or even
http://example.com/books/stats.json?author=Homer
{"books": [
{"stats": {
"count": 2,
"units": 10,
"sold": 3
}}
]}
But, it seems the correct way really should be
http://example.com/books.json/count?author=Homer or
http://example.com/books.json?aggregate=count&author=Homer
any suggestions, thoughts?
The reason both seem to feel weird is that you are mixing the content type and the content identifier by putting ".json" on it. The content type should be in the request's "Accept" header. If you eliminate the ".json", the two possibilities you are considering reduce to the same thing.
That's a purist answer. If for some reason you must use the extension (framework or client limitations), then putting the extension on the last path element is more standard.
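As a rough sketch of selecting the representation from the Accept header instead of a ".json" suffix, consider the hypothetical function below. Its parsing is deliberately naive (it ignores q-values and wildcards), and the `text/plain` alternative is an assumption added for illustration.

```python
import json


def books_count_response(accept_header, count):
    """Pick a representation of /books/count from the Accept header."""
    # Deliberately naive parsing: ignores q-values and wildcards.
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    if "application/json" in accepted:
        return "application/json", json.dumps({"count": count})
    if "text/plain" in accepted:
        return "text/plain", str(count)
    return None, None  # a real server would answer 406 Not Acceptable


content_type, body = books_count_response("application/json", 2)
```

With the media type carried in the header, the URI identifies only the count resource, and the same URI can serve other formats later.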