Looking for some input on a REST API architectural design. I often find that the desired data is the combination of a view across multiple resources. Would you expect the client to combine them, or provide an API that does the combination for the client?
For example, let's say we are writing a REST API for people to be notified about events. Someone will indicate interest in an event in one of two ways:
Join an organization that regularly puts on events that the person has interest in
Search for and then mark a particular event run by an organization they wouldn't normally subscribe to
I can retrieve all of the events for user 100 with the following long sequence of steps:
GET /user/100/organizations returns 123
GET /organizations/123/events returns [15,16,20]
GET /user/100/savedevents returns [35,36]
GET /events/15,16,20,35,36 returns all of the events
But that seems rather heavy for a client. I almost want a client to be able to say, "give me all of the interesting events for this user":
GET /user/100/events
...and then require the server to understand that it has to go through all of steps 1-4 and return them, or, at the very least, return [15,16,20,35,36] so it becomes 2 steps: get event IDs; get event details.
Does this even make sense, to make a view that cuts across multiple resources that way?
EDIT: To explain further. My hesitation is because I can see how /organizations/123/events is a clean resource; it is identical to saying /events?organizations=123, i.e. "give me resource events where organizations=123". Same for /user/100/organizations.
But /user/100/events is not "give me resource events where organizations=123". It is "give me organization registrations where user=100, retrieve those organization ids, then give me the events for each of those organizations, then give me savedevents where user=100", all merged together.
Each of the three steps itself is a clean resource mapping. Putting them together seems messy. But so does asking a client (especially a Web client) to figure out all that logic!
I was a bit confused by your question, so I'll try to be as comprehensive as possible and hopefully I'll have hit on an answer you need =P.
I often find that the desired data is the combination of a view across
multiple resources. Would you expect the client to combine them, or
provide an API that does the combination for the client?
In a true RESTful environment all cross-sectional views of data would be done by the server, not by the client.
The primary reason for a RESTful design is to allow access to the CRUD model (create, read, update, delete) by way of the standard HTTP verbs (GET, POST, PUT, DELETE). Storing the returns of these methods in some sort of session, cookie or other external mechanism (e.g. "give me data for bob", "give me data on businesses", "give me data from my first two queries") goes above and beyond the REST methodology.
The way you'll want to leverage RESTful development is to find ways of combining resources meaningfully so as to provide a RESTful environment where the method calls are consistent (GET reads data, POST creates data, PUT updates data, DELETE deletes data).
So if you wanted to do something like Steps 1 through 4 I'd recommend something like:
GET /user/{userID}/organizations --> {return all affiliated organizations}
GET /user/{userID}/events --> {return all events associated with userID}
GET /organizations/{organization}/events --> {returns all eventID's assoc. with organization}
GET /user/{userID}/savedevents --> {return all eventID's userID saved to their profile}
GET /events/?eventID=(15,16,20,35,36) --> {return all of the events details for those eventID's}
GET /events/{eventID} --> {return event details for {eventID}}
Whereas you might also have:
GET /events/ --> {return a complete listing of all event ID's}
GET /events/{userID} --> {return all events userID is associated with}
POST /event/ --> {create a new event - ID is assigned by the server}
POST /user/ --> {create a new user - ID is assigned by the server}
PUT /user/{userID} --> {update/modify user information}
Then if you want cross-sectional slices of information, you would have a named resource for the cross section (else pass it as arguments). Be explicit with your resources (Random FYI, name your resources as nouns only - not verbs).
You also asked:
To explain further. My hesitation is because I can see how
/organizations/123/events is a clean resource; it is identical to
saying /events?organizations=123, i.e. "give me resource events where
organizations=123". Same for /user/100/organizations.
Essentially both the named resource and the resource + argument method can provide the same information. Typically I have seen RESTful designs call for arguments only when an important delineation is required (range requests, date requests, some REALLY small unit of data, etc.). If you have some higher-order grouping of data that CAN BE parsed/introspected further, then it's a named resource. In your example, I'd offer both API calls, as the RESTful approach calls for providing data via multiple paths by way of the established HTTP methods. However, I'd also expand a bit...
/events?organizations=123 --> {return the eventID's associated with org=123}
/organizations/123/events --> {return event DETAILS for events associated with org=123}
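For what it's worth, both paths can be routed to the same lookups on the server, so offering them side by side is cheap. A rough sketch of that wiring (Express/TypeScript; the two find* helpers are hypothetical placeholders for whatever data access you have):

```ts
import express from "express";

const app = express();

// Hypothetical data-access helpers (placeholders, not part of the original answer).
async function findEventIdsByOrganization(orgId: string): Promise<number[]> {
  return []; // e.g. SELECT id FROM events WHERE organization_id = ?
}
async function findEventDetailsByOrganization(orgId: string): Promise<object[]> {
  return []; // e.g. SELECT * FROM events WHERE organization_id = ?
}

// Named sub-resource: event DETAILS for the organization.
app.get("/organizations/:orgId/events", async (req, res) => {
  res.json(await findEventDetailsByOrganization(req.params.orgId));
});

// Argument form: just the event IDs for the organization.
app.get("/events", async (req, res) => {
  const orgId = req.query.organizations;
  if (typeof orgId === "string") {
    res.json(await findEventIdsByOrganization(orgId));
    return;
  }
  res.json([]); // otherwise: a complete (or paginated) listing of all event IDs
});

app.listen(3000);
```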
Have a read/go at this, by Apigee
There may be several ways to solve this... however, I think that most of the time (if the service is managed by the same provider) it is better to have the logic on the server side and make REST calls as independent of each other as possible (i.e., the server performs the multiple operations required - normally reading data from the DBs that store the data handled by the API resources).
In the example you describe, this would mean your REST API would expose a "user" resource and a sub-resource "events" (which you call "savedevents") that the user is interested in. With this in mind you would have something like this:
POST /user/{username}/events stores a new event (or multiple events) the user is interested in
GET /user/{username}/events returns all the events the user is interested in
GET /user/{username}/events/{eventid} returns details of a specific event
To "filter" user events per organization (and other filtering operations) you can use "query parameters":
GET /user/{username}/events?organization=123
So the server (or API call) would perform the operations you describe in steps 1 to 4 inside GET /user/{username}/events. You can still keep the other resources ("organizations" and "events") in your API; however, they would be used in other contexts (like storing new events or organizations, etc.).
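To make that concrete, here is a rough sketch of what such a composed endpoint could look like (Express/TypeScript; the three lookup helpers are hypothetical stand-ins for steps 1-4, not part of the original answer):

```ts
import express from "express";

const app = express();

interface Event { id: number; organizationId: number; name: string; }

// Hypothetical lookups standing in for steps 1-4 (DB queries or internal calls).
async function organizationsForUser(username: string): Promise<number[]> { return []; }
async function eventsForOrganization(orgId: number): Promise<Event[]> { return []; }
async function savedEventsForUser(username: string): Promise<Event[]> { return []; }

// GET /user/{username}/events, optionally filtered: ?organization=123
app.get("/user/:username/events", async (req, res) => {
  const { username } = req.params;

  // Steps 1-3: events from every organization the user belongs to.
  const orgIds = await organizationsForUser(username);
  const orgEvents = (await Promise.all(orgIds.map(eventsForOrganization))).flat();

  // Step 4: individually saved events.
  const saved = await savedEventsForUser(username);

  // Merge and de-duplicate by event id.
  const merged = new Map([...orgEvents, ...saved].map((e): [number, Event] => [e.id, e]));
  let events = Array.from(merged.values());

  // Optional filtering via query parameter.
  const org = req.query.organization;
  if (typeof org === "string") {
    events = events.filter(e => e.organizationId === Number(org));
  }

  res.json(events);
});

app.listen(3000);
```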
HTH
I've seen many examples of HATEOAS where every resource has links to related resources. For an API that returns N items of a certain resource per page, the client would probably need N calls to fetch any nested resource by consuming HATEOAS. For example:
GET city/documents:
[{
  id: 1,
  city: {
    self: 'http://service.com/cities?filter=id==1'
  },
  document: { ... }
  ...
}, {
  id: 2,
  city: {
    self: 'http://service.com/cities?filter=id==2'
  },
  document: { ... }
  ...
}]
FYI, the query parameter uses the FIQL syntax to define the filters.
Now, if the client were to fetch the city details for each document (to show in the UI), it would probably need N additional calls. However, in my case the /cities API can additionally take multiple city ids, like this: /cities?filter=id=in=(1,2), which can reduce N calls to one. Is there a way to articulate something like this using HATEOAS? I've read about templates but I'm not sure what the template should look like and how the client would consume it.
I've seen many examples of HATEOAS where every resource has links to related resources. For an API that returns N items of a certain resource per page, the client would probably need N calls to fetch any nested resource by consuming HATEOAS.
Yes. Less true in a world with server push, where the server can proactively provide multiple resources in response to a query. If you imagine asking for a web page and getting the HTML, and then also the images and the JavaScript resources too, then you've got the right sort of idea.
API can additionally take multiple city ids like this: /cities?filter=id=in=(1,2) that can reduce N calls to one. Is there a way to articulate something like this using HATEOAS?
Yes.
Let's walk through it carefully. What you've done here is introduced a new resource, with identifier /cities?filter=id=in=(1,2). You might have another resource /cities?filter=id=in=(1,20) and another resource /cities?filter=id=in=(1,2000). In your implementation, these might be a "single endpoint" that extracts parameters from the identifier and uses them to generate the correct representation.
So what you get is something like a data transfer object - a large grained resource fetched in a single go.
I've read about templates but I'm not sure what the template should look like and how the client would consume it?
The simplest example, which you have likely seen already, is a web form. You allow the client to provide the start and end elements, and the form processing takes that information and creates the specified URI from it.
/filtered-cities?start=1&end=2000
So the client needs to understand what the form is for, and how to identify the semantics of the different elements in the form. The agent needs to understand the processing rules that transform the form data into the URI.
URI Templates are the same basic idea; they give you a domain-agnostic language with which to describe where the parameters go in a resource identifier. The basic pattern is the same - there needs to be agreement about the semantics of the parameters, the server provides a URI template, the client provides a parameter map, and generic code can take care of the merge:
uri = template.apply(parameterMap)
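As a minimal sketch of that merge step, assuming the simple /filtered-cities example from above and hand-rolling the expansion instead of using a full RFC 6570 library:

```ts
// Expand simple {name} placeholders in a URI template from a parameter map.
// A real implementation would use an RFC 6570 library; this sketch only
// handles the basic case to show the shape of the merge.
function expand(template: string, params: Record<string, string | number>): string {
  return template.replace(/\{(\w+)\}/g, (_, name: string) => {
    const value = params[name];
    if (value === undefined) {
      throw new Error(`missing value for template parameter "${name}"`);
    }
    return encodeURIComponent(String(value));
  });
}

// The server advertises the template; the client supplies the parameter map.
const uri = expand("/filtered-cities?start={start}&end={end}", { start: 1, end: 2000 });
// uri === "/filtered-cities?start=1&end=2000"
```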
URI Templates aren't quite as powerful as forms; with a form, you can introduce a default value for a parameter, but there is no analogous capability in URI templates.
HAL-Forms may give you a better sense of how a form based approach might work in JSON.
Let's say there are two (or more) RESTful microservices serving JSON. Service (A) stores user information (name, login, password, etc) and service (B) stores messages to/from that user (e.g. sender_id, subject, body, rcpt_ids).
Service (A) on /profile/{user_id} may respond with:
{id: 1, name:'Bob'}
{id: 2, name:'Alice'}
{id: 3, name:'Sue'}
and so on
Service (B) responding at /user/{user_id}/messages returns a list of messages destined for that {user_id} like so:
{id: 1, subj:'Hey', body:'Lorem ipsum', sender_id: 2, rcpt_ids: [1,3]},
{id: 2, subj:'Test', body:'blah blah', sender_id: 3, rcpt_ids: [1]}
How does the client application consuming these services handle putting the message listing together such that names are shown instead of sender/rcpt ids?
Method 1: Pull the list of messages, then start pulling profile info for each id listed in sender_id and rcpt_ids? That may require hundreds of requests and could take a while. Rather naive and inefficient, and it may not scale with complex apps?
Method 2: Pull the list of messages, extract all user ids and make a bulk request for all relevant users separately... this assumes such a service endpoint exists. There is still a delay between getting the message listing, extracting the user ids, sending the request for bulk user info, and then awaiting the bulk user info response.
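To make Method 2 concrete, the client-side code would look roughly like this (TypeScript sketch; the bulk /profile?ids=... endpoint is assumed to exist):

```ts
interface Message { id: number; subj: string; body: string; sender_id: number; rcpt_ids: number[]; }
interface Profile { id: number; name: string; }

// Method 2, client side: one call for messages, one bulk call for all profiles.
async function messagesWithNames(userId: number) {
  const messages: Message[] = await (await fetch(`/user/${userId}/messages`)).json();

  // Collect every user id referenced by the messages.
  const ids = Array.from(new Set(messages.flatMap(m => [m.sender_id, ...m.rcpt_ids])));

  // Bulk profile lookup (assumes Service (A) offers such an endpoint).
  const profiles: Profile[] = await (await fetch(`/profile?ids=${ids.join(",")}`)).json();
  const nameById = new Map(profiles.map((p): [number, string] => [p.id, p.name]));

  // Merge: attach names for display instead of raw ids.
  return messages.map(m => ({
    ...m,
    sender: nameById.get(m.sender_id),
    recipients: m.rcpt_ids.map(id => nameById.get(id)),
  }));
}
```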
Ideally I want to serve out a complete response set in one go (messages and user info). My research brings me to merging of responses at the service layer... a.k.a. Method 3: the API Gateway technique.
But how does one even implement this?
I can obtain the list of messages, extract the user ids, make a call behind the scenes to obtain the user data, merge the result sets, then serve this final result up... This works OK with 2 services behind the scenes... But what if the message listing depends on more services? What if I needed to query multiple services behind the scenes, further parse their responses, query more services based on secondary (tertiary?) results, and then finally merge... where does this madness stop? How does this affect response times?
And I've now effectively created another "client" that combines all microservice responses into one mega-response... which is no different than Method 1 above... except at the server level.
Is that how it's done in the "real world"? Any insights? Are there any open source projects that are built on such API Gateway architecture I could examine?
The solution we used for this problem was denormalization of data, with events for updating it.
Basically, a microservice keeps a subset of the data it requires from other microservices beforehand, so that it doesn't have to call them at run time. This data is managed through events: when another microservice is updated, it fires an event with the id as context, which can be consumed by any microservice that has an interest in it. This way the data stays in sync (of course it requires some form of failure handling for the events). It seems like a lot of work, but it helps with any future decisions regarding consolidation of data from different microservices. Our microservice always has all the data available locally to process any request, without a synchronous dependency on other services.
In your case, i.e. for showing names with a message, you can keep an extra property for names in Service (B). Whenever a name is updated in Service (A), it fires an update event with the id of the updated name. Service (B) then consumes the event, fetches the relevant data from Service (A) and updates its own database. This way, even if Service (A) is down, Service (B) keeps functioning, albeit with some stale data that will eventually become consistent when Service (A) comes back up, and you will always have some name to show on the UI.
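As a rough sketch of that flow (TypeScript; the in-memory subscribe/publish, the topic name and the Service (A) URL are all hypothetical stand-ins for a real message broker):

```ts
// Minimal in-memory stand-in for a real message broker's subscribe/publish.
type Handler<T> = (event: T) => Promise<void>;
const handlers = new Map<string, Handler<any>[]>();

function subscribe<T>(topic: string, handler: Handler<T>): void {
  handlers.set(topic, [...(handlers.get(topic) ?? []), handler]);
}
async function publish<T>(topic: string, event: T): Promise<void> {
  for (const h of handlers.get(topic) ?? []) await h(event);
}

// Hypothetical event fired by Service (A) whenever a user's name changes.
interface UserNameUpdated { userId: number; }

// Service (B) keeps a denormalized copy of the names it needs.
const localNames = new Map<number, string>();

subscribe<UserNameUpdated>("user.name-updated", async (event) => {
  try {
    // Fetch the fresh name from Service (A) and update the local copy.
    const res = await fetch(`http://service-a/profile/${event.userId}`);
    const profile: { id: number; name: string } = await res.json();
    localNames.set(profile.id, profile.name);
  } catch {
    // If Service (A) is down, keep the stale name; rely on the broker's
    // redelivery/failure handling so the data becomes eventually consistent.
  }
});

// Somewhere in Service (A), after a successful name change:
publish("user.name-updated", { userId: 2 });
```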
https://enterprisecraftsmanship.com/2017/07/05/how-to-request-information-from-multiple-microservices/
You might want to perform response aggregation strategies on your API gateway. I've written an article on how to do this with ASP.NET Core and Ocelot, but there should be a counterpart for other API gateway technologies:
https://www.pogsdotnet.com/2018/09/api-gateway-response-aggregation-with.html
You need to write another service, an Aggregator, which internally calls both services, merges/filters their responses and returns the desired result. This can easily be done in a non-blocking way using Mono/Flux in Spring Reactive.
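As a framework-neutral illustration of that aggregator (the answer suggests Mono/Flux; here Promise.all plays the same role, and the downstream service URLs are hypothetical), a TypeScript sketch might look like:

```ts
import express from "express";

const app = express();

// Aggregator endpoint: fan out to both services concurrently, merge, return one response.
// Promise.all stands in for Mono.zip; the URLs are hypothetical.
app.get("/user/:userId/summary", async (req, res) => {
  const { userId } = req.params;
  try {
    const [profile, messages] = await Promise.all([
      fetch(`http://service-a/profile/${userId}`).then(r => r.json()),
      fetch(`http://service-b/user/${userId}/messages`).then(r => r.json()),
    ]);
    res.json({ profile, messages });
  } catch {
    res.status(502).json({ error: "a downstream service failed" });
  }
});

app.listen(3000);
```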
An API Gateway often does API composition.
But this is a typical engineering problem when you have microservices implementing the database-per-service pattern.
The API Composition and Command Query Responsibility Segregation (CQRS) patterns are useful ways to implement such queries.
Ideally I want to serve out a complete response set in one go
(messages and user info).
The problem you've described is one Facebook recognized years ago, and they decided to tackle it by creating an open-source specification called GraphQL.
But how does one even implement this?
It is already implemented in various popular programming languages, so maybe you can give it a try in the language of your choice.
I have a question on how I should design my REST API(s), given that I need to return counts of different objects in my application. There are multiple approaches that could be thought of:
Define a single API that accepts the object identifiers in the request body (JSON) and returns the count for each of the object identifiers in the response. The drawback is that the API is too generic and possibly not RESTful, since there is no resource. The URL would look like GET /counts.
Define individual APIs for each of the resources for which a count is needed. Assuming I need to return counts for the fields defined in the software, tables, processes, tasks, jobs, etc., I would define an individual API for each of these resources. It would look like GET /field/count or GET /table/count. A side effect is that there would be many web APIs, one for each resource we need the count for. Is that bad?
Kindly share your thoughts on the above approaches and any new ways in which I could design such an API that adheres to the REST standards.
Thanks
It totally depends on the client that is consuming the APIs.
Case 1. If it's a web app that needs counts for many tables on a single page, then both will lead to a bad design where you will have to make hundreds of calls just for count data. You can club the counts into a single API and send that as the response (see the sketch after Case 2).
Case 2. If the clients are individually using the counts, then I would recommend the 2nd approach, where the resource is clearly defined. For the 2nd approach you are making the client intelligent, which is not recommended.
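As a sketch of the "clubbed" counts API from Case 1 (Express/TypeScript; the counters map is a hypothetical stand-in for real count queries):

```ts
import express from "express";

const app = express();

// Hypothetical count lookups, one per resource type.
const counters: Record<string, () => Promise<number>> = {
  fields: async () => 0,
  tables: async () => 0,
  processes: async () => 0,
  tasks: async () => 0,
  jobs: async () => 0,
};

// GET /counts?types=fields,tables  ->  { "fields": 0, "tables": 0 }
app.get("/counts", async (req, res) => {
  const requested =
    typeof req.query.types === "string"
      ? req.query.types.split(",")
      : Object.keys(counters); // default: return every count

  const result: Record<string, number> = {};
  for (const type of requested) {
    const counter = counters[type];
    if (counter) result[type] = await counter();
  }
  res.json(result);
});

app.listen(3000);
```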
As mentioned in the comments, REST is a totally subjective topic, so there can be multiple viewpoints on every design.
Let's say I have three resources that are related like so:
Grandparent (collection) -> Parent (collection) -> and Child (collection)
The above depicts the relationship among these resources like so: Each grandparent can map to one or several parents. Each parent can map to one or several children. I want the ability to support searching against the child resource but with the filter criteria:
If my clients pass me an id reference to a grandparent, I want to only search against children who are direct descendants of that grandparent.
If my clients pass me an id reference to a parent, I want to only search against children who are direct descendants of my parent.
I have thought of something like so:
GET /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}
and
GET /myservice/api/v1/parents/{parentID}/children?search={text}
for the above requirements, respectively.
But I could also do something like this:
GET /myservice/api/v1/children?search={text}&grandparentID={id}&parentID=${id}
In this design, I could allow my client to pass me one or the other in the query string: either grandparentID or parentID, but not both.
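To illustrate that second design, the handler would enforce the either/or rule along these lines (Express/TypeScript sketch; searchChildren is a hypothetical helper):

```ts
import express from "express";

const app = express();

// Hypothetical search over the children collection, scoped by an ancestor id.
async function searchChildren(
  text: string,
  scope: { grandparentID?: string; parentID?: string }
): Promise<object[]> {
  return [];
}

// GET /myservice/api/v1/children?search={text}&grandparentID={id}
// GET /myservice/api/v1/children?search={text}&parentID={id}
app.get("/myservice/api/v1/children", async (req, res) => {
  const { search, grandparentID, parentID } =
    req.query as Record<string, string | undefined>;

  // Accept one scope or the other, but not both.
  if (grandparentID && parentID) {
    res.status(400).json({ error: "pass either grandparentID or parentID, not both" });
    return;
  }

  res.json(await searchChildren(search ?? "", { grandparentID, parentID }));
});

app.listen(3000);
```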
My questions are:
1) Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.
2) What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.
3) What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.
Thanks!
Something to note is that since you use GET in your examples above, then depending on the end user's browser settings, proxy settings, or other calling applications' internal settings, the response is likely to be cached somewhere along the message route, and the original request may never actually reach your API layer where the most recent data is accessed.
So first, you may want to review your requirement of using the verb GET.
If you want your API to return the most recent data every time, then don't use GET; or, if you still want to use GET, require the caller to append a random number to the end of the URL to decrease the likelihood of caching,
or get the client to send the verb PURGE after every GET.
This proxy caching behaviour, present across the internet, in browsers and in server architectures, is where REST fails for me, since caching of GET can be very useful, but only sometimes.
Stick to basic query strings if you want to make it really simple for the end developer and caching of responses poses no problem for you,
or
just use POST and forget about this tiresome REST argument.
The task: I have multiple resources that need to be updated in one HTTP call.
The resource type, field and value to update are the same for all resources.
Example: I have a set of cars identified by their IDs, and I need to update the "status" of all of them to "sold".
Classic RESTful approach: use a request URL something like
PUT /cars
with JSON body like
[{id:1,status:sold},{id:2,status:sold},...]
However, this seems like overkill: status:sold is repeated too many times.
I'm looking for a RESTful way (I mean a way that is as close to "standard" REST as possible) to send status:sold just once for all cars, along with the list of car IDs to update. This is what I would do:
PUT /cars
With JSON
{ids:[1,2,...],status:sold} but I am not sure if this is a truly RESTful approach.
Any ideas?
Also, as an added benefit, I would like to be able to avoid JSON for a small number of cars by simply setting up a URL with parameters, something like this:
PUT /cars?ids=1,2,3&status=sold
Is this RESTful enough?
An even simpler way would just be:
{sold:[1,2,...]}
There's no need to have multiple methods for larger or smaller numbers of requests - it wastes development time and has no notable impact on performance or bandwidth.
As far as it being RESTful goes, as long as it's easily decipherable by the recipient and contains all the information you need, then there's no problem with it.
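A rough sketch of a handler accepting that body (Express/TypeScript; markSold is a hypothetical persistence call):

```ts
import express from "express";

const app = express();
app.use(express.json());

// Hypothetical persistence call that flips one car's status to "sold".
async function markSold(carId: number): Promise<void> {}

// PUT /cars with body {"sold": [1, 2, ...]}
app.put("/cars", async (req, res) => {
  const ids: unknown = req.body.sold;
  if (!Array.isArray(ids) || !ids.every(id => typeof id === "number")) {
    res.status(400).json({ error: 'body must look like {"sold": [1, 2, ...]}' });
    return;
  }

  await Promise.all(ids.map(markSold));
  res.json({ updated: ids.length });
});

app.listen(3000);
```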
As I understand it, using PUT is not well suited to writing a single property of a resource. One idea is to simply expose the property as a resource itself:
Therefore: PUT /car/{carId}/status with body content 'Sold'.
Updating more than one car should result in multiple PUTs, since a request should only target a single resource.
Another idea is to expose a certain protocol where you build a 'batch' resource.
POST /daily-deals-report/ body content {"sold" : [1, 2, 3, 4, 5]}
Then the system can simply acknowledge the deals being made and update the cars' status itself. This way you create a whole new point of view and enable a more REST-like API than you actually intended.
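A rough sketch of that batch resource, acknowledging the report and then applying the status changes itself (Express/TypeScript; markSold is again a hypothetical persistence call):

```ts
import express from "express";

const app = express();
app.use(express.json());

// Hypothetical persistence call that flips one car's status to "sold".
async function markSold(carId: number): Promise<void> {}

// POST /daily-deals-report/ with body {"sold": [1, 2, 3, 4, 5]}
app.post("/daily-deals-report", async (req, res) => {
  const sold: number[] = Array.isArray(req.body.sold) ? req.body.sold : [];

  // Acknowledge the report right away...
  res.status(202).json({ accepted: sold.length });

  // ...and let the system apply the status updates itself afterwards.
  for (const id of sold) {
    await markSold(id);
  }
});

app.listen(3000);
```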
Also, you should think about exposing a resource listing all cars that are available and therefore ready to be sold (i.e. not sold, not needing repairs, and not rented).
GET /cars/pricelist?city=* -> List of all cars ready to be sold including car id and price.
This way a car does not have a status regarding who owns it. A resource is usually independent of its owner (the owner aggregates cars; a car is not a composite part of its owner).
Whenever you have difficulty mapping an operation to a resource, your model tends to be flawed by object-oriented thinking. With resources, many relations (parent properties) and status properties tend to be misplaced, since designing resources is even more abstract than thinking in services.
If you need to manipulate many similar objects, you need to identify the business process that triggers those changes and expose this process as a single resource describing its input (just like the daily deals report).