API Design: Querying a sub-resource - rest

We have a resource which could be modeled as a nested object
GET /A/
[
{
"name": "my_a",
"B": [
{"name": "my_b", "address": "0xbeef"}
]
}
]
or a sub resource, like
GET /A/my_a/B
[
{
"name":"my_b", "address": "0xbeef"
}
]
Our customers want a way to query for objects of type A based on properties of type B, e.g. "get me all the A objects who have B objects with name 'my_b'".
It seems preferable to write the API using the "B as a sub-resource" style of writing because it lends itself to pagination if there are many B object types. Additionally, retrieving B objects can be expensive, so if only some clients are interested in B, it makes sense to required the seperate calls to retrieve subresource B. However, it also seems strange to allow users to query on a sub resource if the sub resource is not returned in the results.
For example, a query feels quite natural when in the form:
GET /A?query=B.address[equals]0xbeef
[
{
"name": "my_a",
"B": [
{"name": "my_b", "address": "0xbeef"}
]
}
]
but less so when the query looks like
GET /A?query=B.address[equals]0xbeef
[
{
"name": "my_a"
}
]
A compromise I'm considering is using the nested approach but not include the B objects by default. A query parameter can expose B. So,
GET /A?query=B.address[equals]0xbeef&include_b=true
[
{
"name": "my_a",
"B": [
{"name": "my_b", "address": "0xbeef"}
]
}
]
I researched "REST, nested objects, querying" and found examples. Most of these examples included the subresource as a nested object, the include_b parameter seems unique to my design.
So, SO, I'm looking for general feedback on this approach, and to see if this is a common problem with a known solution. Curious to hear what comes back.
edit 1:
Updated the example to show that querying can be on arbitrary properties.

As #RomanVottner pointed out, I'm actually not designing a RESTful API. Instead, the API is closer to an RPC translated to use HTTP/JSON. In fact, my team follows the Google API Design guide which itself is dictating how to write GRPC APIs which are then (I presume) automatically translated into web endpoints.
So, at the end of the day, I have not had my style question answered, other than to learn that my question wasn't accurate. I will most likely use the solution I purposed in the question.

Related

How to model a progress "resource" in a REST api?

I have the following data structure that contains an array of sectionIds. They are stored in the order in which they were completed:
applicationProgress: ["sectionG", "sectionZ", "sectionA"]
I’d like to be able to do something like:
GET /application-progress - expected: sectionG, sectionZ, sectionA
GET /application-progress?filter[first]=true - expected: sectionG
GET /application-progress?filter[current]=true - expected: sectionA
GET /application-progress?filter[previous]=sectionZ - expected: sectionG
I appreciated the above URLs are incorrect, but I’m not sure how to name/structure them to get the expected data e.g. Are the resources here "sectionids"?
I'd like to adhere to the JSON:API specification.
UPDATE
I'm looking to adhere to JSON:API v1.0
In terms of resources I believe I have "Section" and "ProgressEntry". Each ProgressEntry will have a one-to-one relationship with a Section.
I'd like to be able to query within the collection e.g.
Get the first item in the collection:
GET /progress-entries?filter[first]
Returns:
{
"data": {
"type": "progress-entries",
"id": "progressL",
"attributes": {
"sectionId": "sectionG"
},
"relationships": {
"section": {
"links": {
"related": "http://example.com/sections/sectionG"
}
}
}
},
"included": [
{
"links": {
"self": "http://example.com/sections/sectionG"
},
"type": "sections",
"id": "sectionG",
"attributes": {
"id": "sectionG",
"title": "Some title"
}
}
]
}
Get the previous ProgressEntry given a relative ProgressEntry. So in the following example find a ProgressEntry whose sectionId attribute equals "sectionZ" and then get the previous entry (sectionG). I wasn't clear before that the filtering of this is based on the ProgressEntry's attributes:
GET /progress-entries?filter[attributes][sectionId]=sectionZ&filterAction=getPreviousEntry
Returns:
{
"data": {
"type": "progress-entries",
"id": "progressL",
"attributes": {
"sectionId": "sectionG"
},
"relationships": {
"section": {
"links": {
"related": "http://example.com/sections/sectionG"
}
}
}
},
"included": [
{
"links": {
"self": "http://example.com/sections/sectionG"
},
"type": "sections",
"id": "sectionG",
"attributes": {
"id": "sectionG",
"title": "Some title"
}
}
]
}
I started to comment on jelhan's reply though my answer was just to long for a reasonable comment on his objection, hence I include it here as it more or less provides a good introduction into the answer anyways.
A resource is identified by a unique identifier (URI). A URI is in general independent from any representation format else content-type negotiation would be useless. json-api is a media-type that defines the structure and semantics of representations exchanged for a specific resource. A media-type SHOULD NOT force any constraints on the URI structure of a resource as it is independent from it. One can't deduce the media-type to use based on a given URI even if the URI contains something like vnd.api+json as this might just be a Web page talking about json:api. A client may as well request application/hal+json instead of application/vnd.api+json on the same URI and receive the same state information just packaged in a different representation syntax, if the server supports both representation formats.
Profiles, as mentioned by jelhan, are just extension mechanisms to the actual media-type that allow a general media-type to specialize through adding further constraints, conventions or extensions. Such profiles use URIs similar to XML namespaces, and those URIs NEED NOT but SHOULD BE de-referencable to allow access to further documentation. There is no talk about the URI of the actual resource other than given by Web Linking that URIs may hint a client on the media-type to use, which I would not recommend as this requires a client to have certain knowledge about that hint.
As mentioned in my initial comments, URIs shouldn't convey semantics as link relations are there for!
Link-relations
By that, your outlined resource seems to be a collection of some further resources, sections by your domain language. While pagination as defined in json:api does not directly map here perfectly, unless you have so many sections that you want to split these into multiple pages, the same concept can be used using standardized link relations defined by IANA.
Here, at one point a server may provide you a link to the collection resource which may look like this:
{
"links": {
"self": "https://api.acme.org/section-queue",
"collection": "https://api.acme.org/app-progression",
...
},
...
}
Due to the collection link relation standardized by IANA you know that this resource may hold a collection of entries which upon invoking may return a json:api representation such as:
{
"links": {
"self": "https://api.acme.org/app-progression",
"first": "https://api.acme.org/app-progression/sectionG",
"last": "https://api/acme.org/app-progression/sectionA",
"current": "https://api.acme.org/app-progression",
"up": "https://api.acme.org/section-queue",
"https://api/acme.org/rel/section": "https://api.acme.org/app-progression/sectionG",
"https://api/acme.org/rel/section": "https://api.acme.org/app-progression/sectionZ",
"https://api/acme.org/rel/section": "https://api.acme.org/app-progression/sectionA",
...
},
...
}
where you have further links to go up or down the hierarchy or select the first or last section that finished. Note the last 3 sample URIs that leverages the extension relation types mechanism defined by RFC 5988 (Web Linking).
On drilling down the hierarchy further you might find links such as
{
"links": {
"self": "https://api.acme.org/app-progression/sectionZ",
"first": "https://api.acme.org/app-progression/sectionG",
"prev": "https://api.acme.org/app-progression/sectionG",
"next": "https://api.acme.org/app-progression/sectionA",
"last": "https://api.acme.org/app-progression/sectionA",
"current": "https://api.acme.org/app-progression/sectionA",
"up": "https://api.acme.org/app-progression",
...
},
...
}
This example should just showcase how a server is providing you with all the options a client may need to progress through its task. It will simply follow the links it is interested in. Based on the link relation names provided a client can make informed choices on whether the provided link is of interest or not. If it i.e. knows that a resource is a collection it might to traverse through all the elements and processes them one by one (or by multiple threads concurrently).
This approach is quite common on the Internet and allows the server to easily change its URI scheme over time as clients will only act upon the link relation name and only invoke the URI without attempting to deduce any logic from it. This technique is also easily usable for other media-types such as application/hal+json or the like and allows each of the respective resources to be cached and reused by default, which might take away load from your server, depending on the amount of queries it has to deal with.
Note that no word on the actual content of that section was yet said. It might be a complex summary of things typical to sections or it might just be a word. We could classify it and give it a name, as such even a simple word is a valid target for a resource. Further, as Jim Webber mentioned, your resources you expose via HTTP (REST) and your domain model are not identical and usually do not map one-to-one.
Filtering
json:api allows to group parameters together semantically by defining a customized x-www-form-urlencoded parsing. If content-type negotiation is used to agree on json:api as representation format, the likelihood of interoperability issues is rather low, though if such a representation is sent unsolicitedly parsing of such query parameters might fail.
It is important to mention that in a REST architecture clients should only use links provided by the server and not generate one on their own. A client usually is not interested in the URI but in the content of the URI, hence the server needs to know how to act upon the URI.
The outlined suggestions can be used but also URIs of the form
.../application-progress?filter=first
.../application-progress?filter=current
.../application-progress?filter=previous&on=sectionZ
can be used instead. This approach should in addition to that also work on almost all clients without the need to change their url-encoded parsing mechanism. In addition to that he management overhead to return URIs for other media-types generated may be minimized as well. Note that each of the URIs in the example above represent their own resource and a cache will store responses to such resources based on the URI used to retrieve such results. Queries like .../application-progress?filter=next&on=sectionG and .../application-progress?filter=previous&on=sectionA which retrieve basically the same representations are two distinctive resources which will be processed two times by your API as the response of the first query can't be reused as the cache key (URI) is different. According to Fielding caching is one of the few constraints REST has which has to be respected otherwise you are violating it.
How you design such URIs is completely up to you here. The important thing is, how you teach a client when to invoke such URIs and when it should not. Here, again, link-relations can and should be used.
Summary
In summary, which approach you prefer is up to you as well as which URI style you choose. Clients, especially in a REST environment, do not care about the structure of the URI. They operate on link-relations and use the URI just for invoking it to progress on with their task. As such, a server API should help a client by teaching it what it needs to know like in a text-based computer game in the 70/80's as mentioned by Jim Webber. It is helpful to think of the interaction model to design as affordances and state machine as explained by Asbjørn Ulsberg .
While you could apply filtering on grouped parameters provided by json:api such links may only be usable within the `json:api´ representation. If you copy & paste such a link to a browser or to some other channel, it might not be processable by that client. Therefore this would not be my first choice, TBH. Whether or not you design sections to be their own resource or just properties you want to retrieve is your choice here as well. We don't know really what sections are in your domain model, IMO it sounds like a valid resource though that may or may not have further properties.

JSON API for non-resource responses

Currently, I'm working on new product and making REST API for both - public and internal needs. I started with {json:api} specification and I was pretty happy with it until I faced some questions I cannot find answers to.
According to JSON API specification, every resource MUST contain id.
http://jsonapi.org/format/
Every resource object MUST contain an id member and a type member. The values of the id and type members MUST be strings.
And that's fine in many cases but not all.
Most of our endpoints are about "resources"
If I ask for a "things" collection (http://example.com/things)
{
"data": [{
"type": "things",
"id": "1",
"attributes": {
"title": "first"
},
"links": {
"self": "http://example.com/things/1"
}
}, {
"type": "things",
"id": "1",
"attributes": {
"title": "second"
},
"links": {
"self": "http://example.com/things/2"
}
}]
}
If I ask for a single "things" resource (http://example.com/things/1)
{
"data": {
"type": "things",
"id": "1",
"attributes": {
"title": "first"
},
"links": {
"self": "http://example.com/things/1"
}
}
}
But what to do with endpoints which are not about resources and does not have ID?
For example, in our application, there is an endpoint http://example.com/stats which should return stats of current logged in user. Like
{
"active_things": 23,
"last_login": "2017"
}
There is no id for this "resource" (it's not actually a resource, is it?). Backend just collects some "stats" for logged in user and returns an object of stats. There many endpoints like this in this application, for example, we have Notification center page where the user can change email addresses for different notifications.
So frontend app (single-page-app) first has to get current values and it sends the request to GET http://example.com/notification-settings.
{
"notifications_about_new_thing": "arunas#example.com",
"notification_about_other_thing": "arunas#example.com"
}
And there are many more endpoints like this. The problem is - how to return these responses in JSONAPI format? There is no ID in these endpoints.
And the biggest question is - why nobody else is facing this issue (at least I cannot find any discussion about this)? :D All APIs I ever made has some endpoints which don't have "id".
I have two ideas, first is to fake id, like "id": "doesnt_matter", the second - do not use json-api for these endpoints. But I don't like both of them.
Think RESTfully and everything can (must) be a resource. There is no "logged in" user as there are no sessions in RESTful APIs as they are stateless. There's no session state maintained between REST API invocations, so you have to be explicit about who the user is.
In this case, the resource is the user who has some stats attributes (in the simple case) or perhaps a relationship to a separate stats relationship (more complicated, not shown):
GET /users/1234
{
"data": {
"type": "users",
"id": "1234",
"attributes": {
"name": "etc.",
"active_things": 23,
"last_login": "2017"
}
}
}
I'm no JSON API expert- but it's worth noting that while JSON API is a concrete specification, it is not the same thing as JSON, nor as a REST API. If you don't like its semantics, I agree with commenters who argue, "Don't use it." If you are going to use JSON API, do so in a compliant way, where every response is a resource; every resource has an ID and a type; and additional information is supplied as attributes of the resource.
Toward your question, I'm thinking about something similar where my application returns computation results. Now on the one hand, these are not strictly "resources" and so I've been toying with the idea of returning the raw result as an array (which I believe would be valid JSON, with a caveat), e.g:
[ 47 ]
On the other hand, there is the idea that the results are the results of a computation that the client specified RESTfully, in which case one of the following two cases is likely true:
The same request submitted later is likely to have the same result. This suggests that in fact the result really is a resource.
The same request submitted later is likely to have a different result. This suggests that the client may want to track how results change for various queries, and so at least the query parameters should be part of the response.
In both cases, the response really is a 'result' object, and even though it doesn't have an ID per se, it does have an identity. If nothing else fits, the ID could be the query that generated the response.
This seems RESTful to me. User #n2ygk suggests that this is not correct as regards the JSON API spec, that an ID should simply be a unique ID and not have another semantic interpretation.
I'd love to hear other perspectives.

Complex claims in JWT

The JWT RFC does not seem to have any problem containing complex arrays such as:
{
"email": "test#test.com",
"businesses": [
{
"businessId": "1",
"businessName": "One",
"roles": [
"admin",
"accountant"
]
},
{
"businessId": "2",
"businessName": "Two",
"roles": [
"support"
]
}
]
}
And this seems a desirable scenario for our needs, since as part of the token we'd like to have a list of businesses a user has access to and what roles does he have for each business (it's part of its identity). The authorization policies at the API would later understand those groups and apply the required authorization logic.
I have seen that with IdentityServer4 the claims are added to the ProfileDataRequestContext's IEnumerable<Claim> IssuedClaims property.
Is there any recommended alternative to this complex claim structure? If not, is there any way to build that structure with IdentityServer4 (maybe some extension?) or the only way would be to manually serialize the JSON since the Claim seems to accept only a string?
PS: I have seen this question and this other where one of the authors of Identity Server talks about something similar being an antipattern. Not sure if the antipattern would be to have complex claims' structure or "authorization implementation details" in the claims.
Any advice on this would be great!
UPDATE:
After giving some thoughts I agree having a complex hierarchy of claims is not desirable and I could go around this problem with a dirty solution of prefixing roles for each businessId. Something like this:
{
"email": "test#test.com",
"roles": [
"1_admin",
"1_accountant",
"2_support"
],
"businesses": [
"1_One",
"2_Two"
]
}
that way I keep a simple structure and later on, at the client or API I can read the claims and find out that 1 is the id for the business with name One and it has the roles admin and account.
Would this be a better solution?
Claims are about identity information - and not complex permission "objects". You are far better off with a dedicated permission service that returns your permissions in any format you want based on the identity of the user.
I also hope your permission data doesn't change while the token is being used, otherwise you end up with stale data.
That said - claims are always strings in .NET - but you can serialize JSON objects into it by setting the ClaimValueType to IdentityServerConstants.ClaimValueTypes.Json.

Should I use an array or an object for REST API collections?

My API contains a Book entity which represents a collection of several possible BookContent depending on the language. A BookContent is another entity with the attributes Title and Content, which both depend on the Language.
Should a Book look like:
1)
[
{
"language": "English",
"title": "First title",
"content": "First content"
},
{
"language": "French",
"title": "Premier titre",
"content": "Premier contenu"
}
]
or like:
2)
{
"English": {
"title": "First title",
"content": "First content"
},
"French": {
"title": "Premier titre",
"content": "Premier contenu"
}
}
The option 1):
produces "self-contained" elements, i.e. each object contains all the information.
The language attribute can contain any Unicode character (whereas as a key it cannot).
is the only available option if the content depends on multiple criteria, i.e. language and year.
creates a clearer separation between the two entities. For example, it makes it easier to replace each element by its ID, if we were to decide not to embed the BookContent entity anymore but only return the BookContent ID.
is probably more familiar to developers using my API, since I believe it is more common to find this kind of structure in other REST APIs.
The option 2):
produces smaller elements.
makes it faster to look for the elements according to the language without traversing through the whole collection.
This is a general question to know which one looks more like a "best practice", with no assumption on how the clients are querying/using the REST API, nor how much performance matters as opposed to flexibility, etc.
Which option is generally more often a "best practice"?
You state in your question that:
This is a general question to know which one looks more like a "best practice", with no assumption on how the clients are querying/using the REST API.
You are writing a REST service that is designed to help clients get information out - if that wasn't your aim, there'd be no point writing it. Because of this, the most important question you need to answer is "what information do clients want?". The answer to that will dictate the structure of your data, not "which one looks better?" - we're not your service's end users.
Personally, I'd opt for the first, simply because it would seem wrong to have an array called books.English in my object, it also allows for languages with characters outside of A-Z, and caters for books where the language is not known (or mixed). If simplicity of the individual book is key (and the list of languages is well-defined and finite), then consider:
[
{
"language": "English",
"books": [{
"title": "First title",
"content": "First content"
}]
},
{
"language": "French",
"books": [{
"title": "Premier titre",
"content": "Premier contenu"
}]
}
]
In essence, however, there's no single best practise for the data structures you're building other than "make them useful".

design pattern for dependent resource in REST

I am developing specs doc for resource URIs. Most everything is fairly well discussed around on the netz, and is all very helpful. However, I am a bit stuck on the pattern for a dependent resource. So, a dependent resource is something that exists at the pleasure of its parent resource. And, if the parent ceases to exist then the dependent also goes away. So, if I have books, a dependent resource would be the count of books. For any given query, if there are no books then there will be no count. Which is different from, say, an author... you could have no books, but still have authors. Ok. So I have something like this URI and the returned data
http://example.com/books.json?author=Homer
{"books": [
{"id": 33, "title": "Iliad", "author": "Homer", "pubyear": "800 BC"},
{"id": 33, "title": "Odyssey", "author": "Homer", "pubyear": "750 BC"}
]}
The URI ends in the plural version of the common noun, and the QUERY_STRING is used to filter the return set. The root node in the return "hash" is the common noun that was queried, and its key is an array each element of which is a hash with key/value pairs.
For the count, my instinct is to do the following
http://example.com/books/count.json?author=Homer
{"books": [
{"count": 2}
]}
or even
http://example.com/books/stats.json?author=Homer
{"books": [
{"stats": {
"count": 2,
"units": 10,
"sold": 3
}
]}
But, it seems the correct way really should be
http://example.com/books.json/count?author=Homer or
http://example.com/books.json?aggregate=count&author=Homer
any suggestions, thoughts?
The reason both seem to feel weird is that you are mixing the content type and the content identifier by putting ".json" on it. The content type should be in the request's "Accept" header. If you eliminate the ".json", the two possibilities you are considering reduce to the same thing.
That's a purist answer. If for some reason you must use the extension (framework or client limitations), then putting the extension on the last path element is more standard.