OpenAPI as a single source of truth - limitations - REST

One of the promoted benefits of API-first design with OpenAPI is its use as a single source of truth. To my mind, these schemas only serve as a contract - the actual source of truth for your API lies in your microservices implementation (typically an HTTP endpoint).
How can OpenAPI claim to be a single source of truth when the contract cannot be enforced until the implementation on the API side is complete? I realise there is tooling available to assist with this, such as validation middleware that matches your requests and responses against your schema; however, this validation typically happens only at the point a network request is made, not at compile time.
Of course you could write API tests to validate conformance, but that depends heavily on good test coverage and isn't something you get out of the box.
TL;DR - OpenAPI markets itself as a single source of truth for APIs, but this simply isn't true until your API implementation matches the spec. What tools/techniques (if any) can be used to mitigate this?

I did a bit of additional research into available tooling and found a solution that helps mitigate this issue:
openapi-backend (and presumably other such libraries) can map your API routes/handlers to a specific OpenAPI operation or operationId. You can then enforce schema validation such that only routes defined in the spec can be implemented; otherwise a fail-fast error is thrown.
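For illustration, here is a minimal sketch using openapi-backend in a Node/Express setup (the spec path petstore.yml and the getPets operation are placeholders, and the exact option names are worth checking against the library's docs): handlers are registered against operationIds from the spec, requests are validated against the schema before a handler runs, and strict mode is meant to fail fast rather than just warn when something doesn't line up with the document.

import OpenAPIBackend from 'openapi-backend';
import express from 'express';

const api = new OpenAPIBackend({
  definition: './petstore.yml', // placeholder path to the OpenAPI document
  strict: true,                 // fail fast instead of warning on spec problems
});

api.register({
  // Keys are operationIds from the spec; a handler that matches nothing in the
  // spec gets flagged, so routes can't quietly exist outside the contract.
  getPets: async (_c, _req, res) => res.json([{ id: 1, name: 'Garfield' }]),
  validationFail: async (c, _req, res) =>
    res.status(400).json({ err: c.validation.errors }), // request didn't match the schema
  notFound: async (_c, _req, res) => res.status(404).json({ err: 'not found' }),
});

const app = express();
app.use(express.json());
// Route every request through the spec-driven router.
app.use((req, res) => api.handleRequest(req as any, req, res));

api.init().then(() => app.listen(9000));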

Related

Versioning related media types individually or in lockstep in a RESTful API

I'm developing a REST API around an ecommerce site and one of my resources is an Order, which contains information like when it was made, the ID, the status, when it will be shipped, etc.
I have defined a media type for my Order resource like so:
application/vnd.myapp.order.v1+json
I also have defined another resource which is the status of an order, like so:
application/vnd.myapp.order-status.v1+json
My question is around the versioning of these media types. Seeing as they're related, would it make sense to version them in lockstep? For example, if the representation of the order resource changes and I create an application/vnd.myapp.order.v2+json, would it be wise to also bump the version of the order-status media type to v2 as well? I'm also wondering whether there is any RESTful guidance on this. I did have a look around online and couldn't really find anything about best practice here, so any advice/opinions are appreciated.
I don't think mixing version and media type is a good idea.
In my opinion, you should keep them separate, in line with the separation of concerns and single responsibility principles:
https://en.wikipedia.org/wiki/Separation_of_concerns
https://en.wikipedia.org/wiki/Single-responsibility_principle
Many teams use a header or the URL for versioning.
For example:
/api/v1/
/api/v2/
header {version:'v1'}
header {version:'v2'}
Then we can easily map the request to what we need:
#RequestMapping(value="api/v1/books", consumes="application/json")
#RequestMapping(value="api/v2/books", consumes="application/json")
or
#RequestMapping(value="api/books",headers="version=v1", consumes="application/json")
#RequestMapping(value="api/books",headers="version=v2", consumes="application/json")
Although a versioned media type seems useful, it would be a violation of SoC and cause additional issues down the line as your API evolves.
Versioning your URLs is the better choice here, as the URL is a natural place to signal what type of resource is being dealt with and how it relates to the data it handles.
(To me, introducing a custom header sounds much better design-wise than a custom versioned media type.)
Custom media types are generally supposed to tell a consumer about the type of the data and its encoding scheme (e.g. XML vs JSON vs plain text and so on), not how your fields are arranged from version to version while the encoding scheme is literally unchanged.
By choosing this path you would:
Force consumers of your API to couple tightly to that specific “representation”, which creates maintenance issues on both sides.
Introduce ambiguity whenever multiple “versions” of your API co-exist: when bodiless HTTP methods such as DELETE or HEAD are used, the request information alone may simply be insufficient to route the request correctly, let alone reach the right backend code.
Render rels (link relation types) less usable in their normal form (if you’d ever want to introduce them).

Single endpoint instead of API - what are the disadvantages?

I have a service which is exposed over HTTP. Most of the traffic reaches it via a single HTTP GET endpoint, in which the payload is serialized and encrypted (RSA). The client systems share common code which ensures that serialization and deserialization will succeed. One of the encoded parameters is the operation type; in my service there is a huge switch (almost 100 cases) that checks which operation is requested and executes the proper code.
switch (operationType) {
    case OPERATION_1: {
        operation = new Operation1Class(basicRequestData, serviceInjected);
        break;
    }
    case OPERATION_2: {
        operation = new Operation2Class(basicRequestData, anotherServiceInjected);
        break;
    }
    // ... almost 100 more cases
}
The operations are of a few types: some are typical resource operations (GET_something, UPDATE_something), some are method-based (VALIDATE_something, CHECK_something).
I am thinking about refactoring the API of the service so that it is more RESTful, especially in the resource-based part of the system. To do so I would probably split the single endpoint into proper endpoints (e.g. /resource/{id}/subresource) or RPC-like endpoints (/validateSomething). I feel it would be better; however, I cannot come up with any concrete argument for it.
The question is: what are the advantages of the refactored solution and, conversely, what are the disadvantages of the current one?
The current solution separates client from server, it's extensible (adding a new operation requires adding a new operation type to the common code), and it's quite clear; two clients use it from two different programming languages. I know that the API sits at level 0 of the Richardson maturity model, but I cannot come up with a reason why I should move it to level 3 (or at least level 2 - resources and methods).
Most of the traffic reaches it via a single HTTP GET endpoint, in which the payload is serialized and encrypted (RSA)
This is potentially a problem here, because the HTTP specification is quite clear that GET requests with a payload are out of bounds.
A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.
It's probably worth taking some time to review this. That said, your existing implementation seems to work, so what's the problem?
The problem here is interop - can processes controlled by other people communicate successfully with the processes that you control? The HTTP standard gives us shared semantics for our "self descriptive messages"; when you violate that standard, you lose interop with things that you don't directly control.
And that in turn means that you can't freely leverage the wide array of solutions we already have in support of HTTP, because you've introduced this inconsistency in your case.
The appropriate HTTP method to use for what you are currently doing? POST
REST (aka Richardson Level 3) is the architectural style of the world wide web.
Your "everything is a message to a single resource" approach gives up many of the advantages that made the world wide web catastrophically successful.
The most obvious of these is caching. "Web scale" is possible in part because the standardized caching support greatly reduces the number of round trips we need to make. However, the grain of caching in HTTP is the resource -- everything keys off of the target URI of a request. Thus, by sharing all information via a single target URI, you lose fine-grained caching control.
You also lose safe request semantics - with every message buried in a single method type, general purpose components can't distinguish between "effectively read only" messages and messages that request that the origin server modify its own resources. This in turn means that you lose pre-fetching, and automatic retry of safe requests when the network is unstable.
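To make that concrete, here is a minimal sketch (Express-style, with hypothetical order routes; none of this is from the original service) of what a resource-oriented split buys you: once reads and writes live on their own URIs and methods, general purpose HTTP machinery can tell them apart.

import express from 'express';

const app = express();
app.use(express.json());

// Hypothetical in-memory store standing in for the real backend.
const orders = new Map<string, object>([['42', { id: '42', status: 'SHIPPED' }]]);

app.get('/orders/:id', (req, res) => {
  // A safe, idempotent read on its own target URI: intermediaries may cache it
  // (per Cache-Control) and retry it automatically when the network is flaky.
  res.set('Cache-Control', 'public, max-age=60');
  const order = orders.get(req.params.id);
  if (order) res.json(order);
  else res.sendStatus(404);
});

app.post('/orders/:id/validation', (req, res) => {
  // The method-like operation stays a POST, so general purpose components
  // know not to cache, prefetch or blindly retry it.
  res.json({ id: req.params.id, valid: orders.has(req.params.id) });
});

app.listen(8080);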
In all, you've taken a rather intelligent application protocol and crippled it, leaving yourself with a transport protocol.
That's not necessarily the wrong choice for your circumstances - SOAP is a thing, after all - and again, your service does seem to be working as is, which implies that you don't currently need the capabilities you've given up.
It would make me a little bit suspicious, in the sense that if you don't need these things, why are you using HTTP rather than some messaging protocol?

RESTful API endpoint for custom response data

I am building a RESTful API where I have a resource named solar_systems. solar_systems has id (int), system_size (int) and system_cost (int) columns, along with many other columns.
I understand that the API endpoints will be:
/v1/solar-systems - for all systems
/v1/solar-systems/{id} - for a single system
And I have to pass query params for filtering, searching, sorting, etc.
But what is the best practice for API endpoints if I need some kind of derived data, such as the average system_cost for each system_size?
Would it be silly if I used /v1/solar-systems/average-system-cost?
I'd appreciate your opinions based on experience.
It is not silly at all to use /v1/solar-systems/average-system-cost
It is easy to get caught up in the fact that technically the average-system-cost is not a resource. But it is a piece of data that is useful to retrieve. Ultimately the goal of REST is to make APIs that are understandable and readable. A specific endpoint that gets a useful piece of data definitely falls inside that.
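As a rough illustration (not from the answer; an Express-style sketch with a small hypothetical in-memory dataset standing in for a real GROUP BY query), the dedicated endpoint simply returns the derived figures grouped by system_size:

import express from 'express';

// Hypothetical data; in practice this would be something like
// SELECT system_size, AVG(system_cost) FROM solar_systems GROUP BY system_size.
const systems = [
  { id: 1, system_size: 5, system_cost: 12000 },
  { id: 2, system_size: 5, system_cost: 14000 },
  { id: 3, system_size: 10, system_cost: 25000 },
];

const app = express();

app.get('/v1/solar-systems/average-system-cost', (_req, res) => {
  const totals = new Map<number, { sum: number; count: number }>();
  for (const s of systems) {
    const t = totals.get(s.system_size) ?? { sum: 0, count: 0 };
    totals.set(s.system_size, { sum: t.sum + s.system_cost, count: t.count + 1 });
  }
  // One entry per system_size with its average cost.
  res.json([...totals].map(([system_size, t]) => ({
    system_size,
    average_system_cost: t.sum / t.count,
  })));
});

app.listen(3000);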
Will it be silly, if I use /v1/solar-systems/average-system-cost?
The REST architecture doesn't enforce any URI design (see notes below). It's totally up to you to pick the URIs that better identify your resources.
However, I would probably use query parameters to select the fields to be returned in the response - something like /v1/solar-systems?fields=average-system-cost.
Note 1: The REST architectural style, described in chapter 5 of Roy T. Fielding's dissertation, defines a set of constraints that applications following this architecture must respect. However, it says nothing about what the URLs must look like.
Note 2: On the other hand, the examples in a popular article written by Martin Fowler, explaining a model defined by Leonard Richardson, suggest a URL structure that looks friendly and easy to read.
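A rough sketch of that variant, reusing the hypothetical app and systems names from the sketch above: the collection endpoint inspects the fields parameter and returns only the requested derived figure.

app.get('/v1/solar-systems', (req, res) => {
  const fields = String(req.query.fields ?? '').split(',').filter(Boolean);
  if (fields.includes('average-system-cost')) {
    // Only the requested derived field is computed and returned.
    const avg = systems.reduce((sum, s) => sum + s.system_cost, 0) / systems.length;
    return res.json({ 'average-system-cost': avg });
  }
  res.json(systems); // default: the plain collection
});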

RESTful model for get-or-create route

What's the most RESTful way to model an API which acts mostly as a GET, except that if the resource doesn't exist it creates it before returning it?
I can imagine using GET, although GET isn't supposed to change server state. I can also imagine using PUT, but in this case the resource should be immutable and PUT implies that the resource should be updated if it already exists. It can certainly be POST, but I feel like POST is the overused hammer to all impedance mismatch nails between API modeling and RESTful modeling.
Or should it be split into two separate routes outright? But that seems unnecessarily inefficient.
What's the consensus?
Typically, it is implemented as follows:
GET: reads a resource
POST: creates a resource
PUT: updates a resource
DELETE: deletes a resource
The common issue seems to be that GET is limited in the size of its query string. If you run up against this limit, you may want to consider using custom headers. In either case, I would recommend that you follow the verb translations above.
Also, you don't mention the language. There is probably a framework that you can leverage that would abstract a lot of this from you.
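For what it's worth, here is a minimal get-or-create sketch under those verb translations (Express-style; the /widgets resource, the natural key and the in-memory store are all made up for illustration): POST creates the resource when it is missing and simply returns the existing one otherwise, so the client needs only a single round trip.

import express from 'express';

const app = express();
app.use(express.json());

// Hypothetical in-memory store standing in for a real database.
const widgets = new Map<string, Record<string, unknown>>();

app.post('/widgets', (req, res) => {
  const key = String(req.body.key);        // hypothetical natural key sent by the client
  const existing = widgets.get(key);
  if (existing) {
    return res.status(200).json(existing); // already exists: just return it
  }
  const created = { key, ...req.body };
  widgets.set(key, created);
  res.status(201).json(created);           // 201 signals it was newly created
});

app.listen(3000);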

Law of Demeter vs. REST

The Law of Demeter (really it should be the Suggestion of Demeter) says that you shouldn't "reach through" an object to get at its child objects. If you, as a client, need to perform some non-trivial operation, most of the time the domain model you're working with should support that operation.
REST is in principle a dumb hierarchy of objects. It is like a filesystem of resources/objects, where each object could have child objects. You almost always reach through with REST. You can optionally build up by-convention composite object types using REST techniques, and as long as the provider and the client agree on higher-level operations, you can avoid the reach-through.
So, how do you balance REST and Demeter? It seems to me that they are not in conflict, because REST is all about super-loose coupling to the point where it is pointless for the provider to try to anticipate all the needs of the clients, whereas Demeter assumes that logic can migrate to its most natural place through refactoring.
However, you could argue that REST is just a stop-gap until you understand your clients better. Is REST just a hack? Is Demeter unrealistic in any server/client scenario?
Is Demeter unrealistic in any server/client scenario?
I think you answered your own question here. How is REST any different from SOAP or XML-RPC in this regard?
The point of REST is not to provide super-loose coupling, but the following:
Provide an identifier which accurately describes the resource being requested.
Provide services which behave as expected: GET requests are idempotent, PUT updates records, POST creates them, DELETE deletes them.
Minimize the state stored on the server.
Tear down unnecessary complexity.
There are cases where REST isn't the best answer, but REST works remarkably well in general.
I pay that law/suggestion no mind whatsoever. It defeats half the benefit of aggregation and composition, and I won't have it.
A link that is provided by a representation, exposed by a RESTful interface, can be completely opaque without violating any of REST's constraints. Therefore I would suggest that REST is completely consistent with the Law of Demeter. There is no requirement that the link expose the structure of the URL space in its URL.
e.g. in an object oriented scenario, you might replace the call a.b.c with a.bc
In a RESTful representation you could create the following:
<a>
<link href="bc"/>
</a>
instead of doing
GET a
<a>
<link href="b"/>
</a>
GET b
<b>
<link href="c"/>
</b>
GET c
I would have to disagree with altCognito and say that one of the main goals of REST is loose coupling. The uniform interface, standard media types and HATEOAS all combine to produce an extremely loosely coupled interface.
In response to David's comment:
REST is all about super-loose coupling to the point where it is pointless for the provider to try to anticipate all the needs of the clients
Actually, REST is about limiting the client's options by providing only valid links in the representations. Within those constraints, the client can attempt to satisfy its own needs. It is by removing from the client the knowledge of when certain requests can be made that you achieve loose coupling. Loose coupling is not achieved by listing a set of resources and saying "ok, you can GET, PUT, POST and DELETE all you want."
I think they're really orthogonal. REST describes a collection of resources addressed by URIs with a set of canonical methods: GET, POST, etc. If a REST routine returns a URI, that's not "reaching through"; it's identifying a different object with the same method names.