How to manage the logic behind API versioning?

I want to identify what might be considered a best practice for URI versioning of APIs, with respect to the logic of the back-end implementation.
Let's say we have a Java application with the following API:
http://.../api/v1/user
Request:
{
  "first name": "John",
  "last name": "Doe"
}
After a while, we need to add 2 more mandatory fields to the user API:
http://.../api/v2/user
Request:
{
  "first name": "John",
  "last name": "Doe",
  "age": 20,
  "address": "Some address"
}
We are using separate DTOs for each version, one having 2 fields, and another having 4 fields.
We have only one entity in the application, but my question is: how should we handle the logic, as a best practice? Is it OK to handle this in only one service?
If those 2 new fields, "age" and "address", were not mandatory, this would not be considered a breaking change, but since they are, I see a few options:
use only one manager/service in the business layer for all user API versions (but the complexity of the code in that single manager will grow considerably over time and will be hard to maintain)
use only one manager for all user API versions, plus a translator class that makes older API versions compatible with the new ones
a new manager/service in the business layer for each user API version
If I use only one manager for all user API versions and add constraints/validations there, V2 will work, but V1 will throw an exception because those fields are missing.
I know that versioning is a big topic, but I could not find a specific answer on the web until now.
My intuition says that having a single manager for all user API versions will result in a method that has nothing to do with clean code. I also think that any change introduced in a new version should be as loosely coupled as possible, because that makes it easier to deprecate older methods and remove them over time.
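The translator option (the second one above) can be sketched in plain Java. This is only an illustration of the idea, with made-up class and field names; the key design decision it surfaces is what defaults to use for fields a V1 client cannot supply:

```java
// Sketch of option 2: a translator in front of a single service, so the
// business layer only ever validates the latest (V2) shape. All names
// here are illustrative, not taken from a real codebase.
public class UserV1ToV2Translator {

    public static class UserV1Request {
        public String firstName;
        public String lastName;
    }

    public static class UserV2Request {
        public String firstName;
        public String lastName;
        public Integer age;      // mandatory in V2
        public String address;   // mandatory in V2
    }

    // V1 clients cannot supply the new mandatory fields, so the translator
    // fills them with documented defaults before handing the request to the
    // single V2 service.
    public static UserV2Request translate(UserV1Request v1) {
        UserV2Request v2 = new UserV2Request();
        v2.firstName = v1.firstName;
        v2.lastName = v1.lastName;
        v2.age = -1;                // sentinel meaning "not provided"
        v2.address = "unknown";     // documented placeholder
        return v2;
    }
}
```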

You are correct in your belief that versioning with APIs is a contentious issue.
You are also making a breaking change and so incrementing the version of your API is the correct decision (w.r.t. semver).
Ideally your backend code will be under version control (eg GitHub). In this case you can safely consider V1 to be a specific commit in your repository. This is the code that has been deployed and is serving traffic for V1. You can then continue making changes to your code as you see fit. At some point you will have added some new breaking changes and will decide to mark a specific commit as V2. You can then deploy V2 alongside V1. When you decide to deprecate V1 you can simply stop serving traffic to it.
You'll need some method of ensuring that only V1 traffic goes to the V1 backend and V2 traffic to the V2 backend. Generally this is done with a Reverse Proxy; popular choices include NGINX and Apache. Any capable reverse proxy will let you direct requests based on the path, so that requests prefixed with /api/v1 are routed to Backend1 and those prefixed with /api/v2 to Backend2.
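As a minimal sketch, that path-based routing might look as follows in NGINX (the upstream host names backend-v1 and backend-v2 are placeholders for wherever your two deployments live):

```nginx
server {
    listen 80;

    # Route each major API version to its own backend deployment.
    location /api/v1/ {
        proxy_pass http://backend-v1:8080;
    }
    location /api/v2/ {
        proxy_pass http://backend-v2:8080;
    }
}
```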
Hopefully this model will help keep your code clean: the master branch in your repository only needs to deal with the most recent API. If you need to make changes to older API versions this can be done with relative ease: branch off the V1 commit, make your changes, and then define the HEAD of that modified branch as the 'new' V1.
A couple of assumptions about your backend have been made in this answer that you should be aware of. Firstly, that your backend can be scaled horizontally; for example, if you interact with a database, then the multiple versions of your API can all safely access it concurrently. Secondly, that you have the resources to deploy replica backends.
Hopefully that explanation makes sense; if not, send any questions my way!

If you're able to entertain code changes to your existing API, then you can refer to this link. The links mentioned at the bottom of that post direct you to the respective GitHub source code, which can be helpful in case you decide to introduce the code changes after some trial and error.
The mentioned approach (using @JsonView) basically prevents one from introducing multiple DTOs for a single entity for the same or multiple clients. Eventually, one can also refrain from introducing a new API version each and every time you introduce new fields in your existing API.
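A minimal sketch of that @JsonView idea, assuming Jackson (jackson-databind) is on the classpath; the view and field names are illustrative. One DTO carries all fields, and per-version views decide what each API version serializes:

```java
import com.fasterxml.jackson.annotation.JsonView;
import com.fasterxml.jackson.databind.ObjectMapper;

// One DTO with per-version views, instead of one DTO class per API version.
public class JsonViewExample {

    public static class Views {
        public static class V1 {}
        public static class V2 extends V1 {} // V2 sees everything V1 sees
    }

    public static class UserDto {
        @JsonView(Views.V1.class) public String firstName = "John";
        @JsonView(Views.V1.class) public String lastName = "Doe";
        @JsonView(Views.V2.class) public Integer age = 20;
        @JsonView(Views.V2.class) public String address = "Some address";
    }

    // Serialize the same DTO through a given view.
    public static String serialize(Class<?> view) {
        try {
            return new ObjectMapper().writerWithView(view)
                                     .writeValueAsString(new UserDto());
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Serializing with Views.V1 omits age and address; Views.V2, being a subclass, includes all four fields.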

Related

What is the purpose of the extra /v1 in the route api/v1?

Why not just make your backend api route start with /api?
Why do we want to have the /v1 bit? Why not just api/? Can you give a concrete example? What are the benefits of either?
One of the major challenges surrounding exposing services is handling updates to the API contract. Clients may not want to update their applications when the API changes, so a versioning strategy becomes crucial. A versioning strategy allows clients to continue using the existing REST API and migrate their applications to the newer API when they are ready.
There are four common ways to version a REST API.
Versioning through URI Path
http://www.example.com/api/1/products
REST API versioning through the URI path
One way to version a REST API is to include the version number in the URI path.
xMatters uses this strategy, and so do DevOps teams at Facebook, Twitter, Airbnb, and many more.
The internal version of the API uses the 1.2.3 format, so it looks as follows:
MAJOR.MINOR.PATCH
Major version: the version used in the URI; it denotes breaking changes to the API. Internally, a new major version implies creating a new API, and the version number is used to route to the correct host.
Minor and Patch versions: these are transparent to the client and used internally for backward-compatible updates. They are usually communicated in change logs to inform clients about new functionality or bug fixes.
This solution often uses URI routing to point to a specific version of the API. Because cache keys (in this situation URIs) are changed by version, clients can easily cache resources. When a new version of the REST API is released, it is perceived as a new entry in the cache.
Pros: Clients can cache resources easily
Cons: This solution has a pretty big footprint in the code base as introducing breaking changes implies branching the entire API
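As an illustrative sketch (not from the xMatters post), the routing step only needs to pull the MAJOR version out of the request path; MINOR and PATCH never appear in the URI:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A router only cares about the major version embedded in the path.
public class VersionRouter {

    // Accepts both /api/1/... and /api/v1/... style paths.
    private static final Pattern VERSION = Pattern.compile("^/api/v?(\\d+)/");

    // Returns the major version found in the path, or -1 if the path
    // is not versioned.
    public static int majorVersion(String path) {
        Matcher m = VERSION.matcher(path);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }
}
```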
Ref: https://www.xmatters.com/blog/blog-four-rest-api-versioning-strategies/

REST API design for a "reset to defaults" operation

I'm surprised to find so little mention of this dilemma online, and it makes me wonder if I'm totally missing something.
Assume I have a singleton resource called Settings. It is created on init/install of my web server, but certain users can modify it via a REST API; let's say /settings is my URI. I have a GET operation to retrieve the settings (as JSON), and a PATCH operation to set one or more of its values.
Now, I would like to let the user reset this resource (or maybe individual properties of it) to default - the default being "whatever value was used on init", before any PATCH calls were done. I can't seem to find any "best practice" approach for this, but here are the ones I have come up with:
Use a DELETE operation on the resource. It is, after all, idempotent, and it's pretty clear (to me). But since the URI will still exist after DELETE, meaning the resource was neither removed nor moved to an inaccessible location, this contradicts the RESTful definition of DELETE.
Use a POST to a dedicated endpoint such as /settings/reset - I really dislike this one because it's the most blatantly non-RESTful option, as the verb is in the URI.
Use the same PATCH operation, passing some stand-in for "default" such as a null value. The issue I have with this one is that the outcome of the operation differs from the input (I set a property to null, then I get it and it has a string value).
Create a separate endpoint to GET the defaults, such as /settings/defaults, and then use the response in a PATCH to set those values. This doesn't seem to contradict REST in any way, but it does require 2 API calls for what seems like one simple operation.
If one of the above is considered the best practice, or if there is one I haven't listed above, I'd love to hear about it.
Edit:
My specific project has some attributes that simplify this question, but I didn't mention them originally because my aim was for this thread to be used as a reference for anyone in the future trying to solve the same problem. I'd like to make sure this discussion is generic enough to be useful to others, but specific enough to also be useful to me. For that, I will append the following.
In my case, I am designing APIs for an existing product. It has a web interface for the average user, but also a REST(ish) API intended to meet the needs of developers who need to automate certain tasks with said product. In this oversimplified example, I might have the product deployed to a test environment on which I run various automated tests that modify /settings, and I would like to run a cleanup script that resets /settings back to normal when I'm done.
The product is not SaaS (yet), and the APIs are not public (as in, anyone on the web can access them freely) - so the audience and thus the potential types of "clients" I may encounter is rather small - developers who use my product, that is deployed in their private data center or AWS EC2 machines, and need to write a script in whatever language to automate some task rather than doing it via UI.
What that means is that some technical considerations, like caching, are relevant. Human user considerations, like how consistent the API design is across various resources, and how easy it is to learn, are also relevant. But "can some 3rd party crawler identify the next actions it can perform from a given state" isn't so relevant (which is why we don't implement HATEOAS, or the OPTIONS method at all).
Let's discuss your mentioned options first:
1: DELETE does not necessarily need to delete or remove the state contained in the resource targeted by the URI. It just requires that the mapping of the target URI to the resource is removed, which means that a subsequent request on the same URI should no longer return the state of the resource, unless some other operation was performed on that URI in the meantime. As you want to reuse the URI pointing to the client's settings resource, this is probably not the correct approach.
2: REST doesn't care about the spelling of the URI as long as it is valid according to RFC 3986. There is no such thing as a RESTful or RESTless URI. The URI as a whole is a pointer to a resource, and a client should refrain from extracting knowledge from it by parsing and interpreting it. Client and server should, though, make use of the link relation names that URIs are attached to. This way URIs can be changed at any time and the client will still be able to interact with the service. The presented URI, however, has an RPC kind of smell to it, which an automated client is totally unaware of.
3: PATCH is actually pretty similar to patching as done by code versioning tools. Here a client should precalculate the steps needed to transform a source document into its desired form and collect these instructions in a so-called patch document. If this patch document is applied to a document whose state matches the version the patch was computed against, the changes will be applied correctly; in any other case the outcome is uncertain. While application/json-patch+json follows this philosophy of a patch document containing separate instructions, application/merge-patch+json takes a slightly different approach by defining default rules: nulling out a property leads to its removal, including a property leads to its addition or update, and leaving out properties means they are ignored in the original document.
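A hypothetical application/merge-patch+json request illustrating those default rules ("theme" and "fontSize" are made-up settings properties):

```http
PATCH /settings HTTP/1.1
Content-Type: application/merge-patch+json

{
  "theme": null,
  "fontSize": 14
}
```

Here "theme" is removed, "fontSize" is added or updated, and every property not mentioned is left untouched.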
4: In this sense, first retrieving the latest state of a resource, locally updating it/calculating the changes, and then sending the outcome to the server is probably the best of the listed approaches. Here you should make use of conditional requests to guarantee that the changes are only applied to the version you recently downloaded, ruling out problems caused by intermediary changes to that resource.
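A sketch of such a conditional update (the ETag value is made up): the client echoes back the validator it received, and the server only applies the patch if the resource is still in that state.

```http
GET /settings HTTP/1.1

HTTP/1.1 200 OK
ETag: "a1b2c3"
...

PATCH /settings HTTP/1.1
If-Match: "a1b2c3"
Content-Type: application/merge-patch+json
...
```

If the settings changed in the meantime, the server answers 412 Precondition Failed instead of applying the patch.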
Basically, in a REST architecture the server offers a bunch of choices to a client, which, based on its task, chooses one of the options and issues a request to the attached URI. Usually, the client is taught everything it needs to know by the server via form representations such as HTML forms, HAL forms or ION.
In such an environment settings is, as you mentioned, a valid resource on its own, and so is a default settings resource. So, in order to allow a client to reset its settings, it is just a matter of "copying" the content of the default settings resource to the target settings resource. If you want to be WebDAV compliant, which is just an extension of HTTP, you could use the COPY HTTP operation (also see the other registered HTTP operations at IANA). For plain HTTP clients, though, you might need a different approach so that arbitrary HTTP clients will be able to reset settings to the desired defaults.
How a server wants a client to perform that request can be taught via the above-mentioned form support. A very simplistic approach on the Web would be to send the client an HTML page with the settings pre-filled into an HTML form, perhaps also letting the user tweak the settings beforehand, and then have a submit button send the request to the URI present in the action attribute of the form, which can be any URI the server wants. As HTML only supports POST and GET in forms, on the Web you are restricted to POST.
One might think that just sending a payload containing the URI of the settings resource to reset, and optionally the URI of the default settings, to a dedicated endpoint via POST is enough, then letting it perform its magic to reset the state to the default one. However, this approach bypasses caches and might let them believe that the old state is still valid. Caching in HTTP works such that the de-facto URI of a resource is used as the key, and any non-safe operation performed on that URI leads to an eviction of the stored content, so that subsequent requests go directly to the server instead of being served by the cache. As you send the unsafe POST request to a dedicated resource (or endpoint, in RPC terms), you miss out on the ability to inform the cache about the modification of the actual settings resource.
As REST is just a generalization of the interaction model used on the human Web, it is no miracle that the same concepts used on the Web also apply at the application domain level. While you can use HTML here as well, JSON-based formats such as application/hal+json or the above-mentioned HAL forms or ION formats are probably more popular. In general, the more media types your service is able to support, the more likely it will be able to serve a multitude of clients.
In contrast to the human Web, where images, buttons and the like provide an affordance of the respective control to a user, arbitrary clients, especially automated ones, usually don't cope well with such affordances. As such, other ways to hint a client at the purpose of a URI or control element need to be provided, such as link relation names. While <<, <, >, >> may be used on an HTML page to indicate the first, previous, next and last elements in a collection, link relations provide first, prev, next and last as alternatives. Such link relations should of course either be registered with IANA or at least follow the Web Linking extension approach. A client looking up the URI of a prev relation will know the purpose of the URI, and will still be able to interact with the server if the URI ever changes. This is in essence also what HATEOAS is all about: using the given controls to navigate the application through the state machine offered by the server.
Some general rules of thumb in designing applications for REST architectures are:
Design the interaction as if you'd interact with a Web page on the human Web, or more formally as a state machine or domain application protocol, as Jim Webber termed it, that a client can run through
Let servers teach clients on how requests need to look like via support of different form types
APIs shouldn't use typed resources but instead rely on content type negotiation
The more media types your API or client supports, the more likely it will be able to interact with other peers
Long story short: a very basic approach is to offer the client a pre-filled form with all the data that makes up the default settings. The target URI of the form's action targets the actual resource and thus also informs caches about the modification. On top of that, this approach is future-proof, as clients will automatically be served the new structure and properties a resource supports.
... so the audience and thus the potential types of "clients" I may encounter is rather small - developers who use my product, that is deployed in their private data center or AWS EC2 machines, and need to write a script in whatever language to automate some task rather than doing it via UI.
REST in the sense of Fielding's architectural style shines when there is a multitude of different clients interacting with your application and when support for future evolution needs to be inherently integrated into the design. REST just gives you the flexibility to add new features down the road, and well-behaved REST clients will just pick them up and continue. If you are only interacting with a very limited set of clients, especially ones under your control, or if the likelihood of future changes is very small, REST might be overkill and not justify the additional overhead caused by the careful design and implementation.
... some technical considerations like caching are relevant. Human user considerations, like how consistent the API design is across various resources, and how easy it is to learn, are also relevant. But "can some 3rd party crawler identify the next actions it can perform from a given state" isn't so relevant ...
The term API design already indicates that a more RPC-like approach is desired, where certain operations are exposed that a user can invoke to perform some task. This is all fine as long as you don't call it a REST API from Fielding's standpoint. The plain truth is that there are hardly any applications/systems out there that really follow the REST architectural style, but there are tons of "bad examples" that misuse the term REST and therefore paint a wrong picture of the REST architecture, its purpose, and its benefits and weaknesses. This is partly a problem of people not reading Fielding's thesis (carefully) and partly due to the overall preference for pragmatism and using/implementing shortcuts to get the job done ASAP.
In regards to the pragmatic take on "REST" it is hard to give an exact answer, as everyone seems to understand something different by it. Most of those APIs rely on external documentation anyway, such as Swagger, OpenAPI and whatnot, and here the URI seems to be the thing that gives developers a clue about the purpose. So a URI ending with .../settings/reset should be clear to most developers. Whether the URI has an RPC smell to it, or whether or not to follow the semantics of the respective HTTP operations (i.e. partial PUT or payloads within GET), is your design choice, which you should document.
It is okay to use POST
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”
POST /settings HTTP/x.y
Content-Type: text/plain
Please restore the default settings
On the web, you'd be most likely to see this as a result of submitting a form; that form might be embedded within the representation of the /settings resource, or it might live in a separate document (that would depend on considerations like caching). In that setting, the payload of the request might change:
POST /settings HTTP/x.y
Content-Type: application/x-www-form-urlencoded
action=restoreDefaults
On the other hand: if the semantics of this message were worth standardizing (i.e. if many resources on the web should be expected to understand "restore defaults" the same way), then you would instead register a definition for a new method token, pushing it through the standardization process and promoting adoption.
So it would be in this definition that we would specify, for instance, that the semantics of the method are idempotent but not safe, and also define any new headers that we might need.
there is a bit in it that conflicts with this idea of using POST to reset: "The only thing REST requires of methods is that they be uniformly defined for all resources". If most of my resources are typical CRUD collections, where it is universally accepted that POST will create a new resource of a given type
There's a tension here that you should pay attention to:
The reference application for the REST architectural style is the world wide web.
The only unsafe method supported by HTML forms was POST
The Web was catastrophically successful
One of the ideas that powered this is that the interface was uniform -- a browser doesn't have to know if some identifier refers to a "collection resource" or a "member resource" or a document or an image or whatever. Neither do intermediate components like caches and reverse proxies. Everybody shares the same understanding of the self descriptive messages... even the deliberately vague ones like POST.
If you want a message with more specific semantics than POST, you register a definition for it. This is, for instance, precisely what happened in the case of PATCH -- somebody made the case that defining a new method with additional constraints on the semantics of the payload would allow richer, more powerful general-purpose components.
The same thing could happen with the semantics of CREATE, if someone were clever enough to sit down and make the case (again: how can general purpose components take advantage of the additional constraints on the semantics?)
But until then, those messages should use POST, and general-purpose components should not assume that POST has create semantics, because RFC 7231 doesn't provide those additional constraints.

How to do versioning in the REST API URI address?

I'm looking at ways to version the URIs of a Spring REST API because, in the case of launching a new application, the REST API has to handle the new application and support the old application's requests for a certain period. But I am in doubt about what happens to the database, the system entities, and URIs in general whenever a version is added to a URI.
Example:
No version, a request to fetch all users:
http://host:8080/api/users
With versioning:
http://host:8080/v1/users
http://host:8080/v2/users
http://host:8080/v1/products
I was thinking of making a User entity whose attributes are all marked as not mandatory, with the entity always reflecting the most current version. For the 'v2' version of the URI I create a UserV2DTO that maps onto the User entity and does the validations with the required annotations. For the 'v1' version of the URI, let's say the user does not have the 'dateBirth' attribute: it receives a UserV1DTO without the dateBirth attribute, and when converting the DTO to the entity, Entity.dateBirth ends up null even though v2 considers it required.
What I want to know is whether this is a correct way of versioning, given that in the database the attributes are not mandatory and the mandatory-field validation lives in the DTOs. Also, do the URIs of all resources need to be moved to the latest version, or can 'products' stay at V1 until the day it becomes necessary to change only it?
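A plain-Java sketch of the approach described above (class and field names are illustrative): the entity keeps every attribute optional, while each version-specific DTO enforces what is mandatory for its version.

```java
// The entity stays nullable; version-specific DTOs own the validation.
public class UserVersioning {

    public static class UserEntity {        // all attributes nullable
        public String name;
        public String dateBirth;
    }

    public static class UserV1DTO {
        public String name;                 // no dateBirth in V1
        public UserEntity toEntity() {
            UserEntity e = new UserEntity();
            e.name = name;                  // dateBirth stays null
            return e;
        }
    }

    public static class UserV2DTO {
        public String name;
        public String dateBirth;            // mandatory in V2
        public UserEntity toEntity() {
            if (dateBirth == null)
                throw new IllegalArgumentException("dateBirth is required in v2");
            UserEntity e = new UserEntity();
            e.name = name;
            e.dateBirth = dateBirth;
            return e;
        }
    }
}
```

In a Spring application the same effect is usually achieved with bean-validation annotations on the DTOs instead of hand-written checks.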
How to do versioning in the REST API URI address?
Real answer? Any way you want, consistent with your local conventions.
In REST, URI are just identifiers; effectively they are keys used to look up information in a cache. Embedding information in the URI is done at the discretion of the server, and for its own exclusive use.
Roy Fielding, who defined REST while leading the specification work for HTTP/1.1, wrote
The reason to make a real REST API is to get evolvability … a "v1" is a middle finger to your API customers, indicating RPC/HTTP (not REST)
As an example, consider Google -- how many different implementations of web search do you suppose that they have had, over the years? But the API itself has been stable: navigate to the bookmarked home page, enter data in the search form, submit -- dispatching the data to whatever URI was described in the form meta data.
Of course, even if you aren't doing REST, you might still want a URI design that is sane. Using path segments to partition your resource hierarchy has advantages when it comes to relative URI, because you can then use dot-segments to produce other identifiers in the hierarchy.
/v1/users/1 + ../products -> /v1/products
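This resolution behavior can be checked with the standard library; java.net.URI implements the dot-segment rules of RFC 3986. Note that ".." is resolved against the parent of the last path segment, so the base must contain a segment below the version for the version to survive:

```java
import java.net.URI;

// Resolve a relative reference against a base URI and return the
// resulting path, per the RFC 3986 dot-segment rules.
public class RelativeUriDemo {
    public static String resolvePath(String base, String relative) {
        return URI.create(base).resolve(relative).getPath();
    }
}
```

For example, resolving ../products against http://example.com/v1/users/1 yields the path /v1/products, while resolving it against http://example.com/v1/users (no segment below users) yields /products.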
Your life down the road will likely be considerably easier if you are clear as to whether you really have a new resource in each version, or a shared resource with different representations. If someone changes /v2/users, should that also change /v1/users? What are the correct cache invalidation semantics?

REST API versioning. What to include in a new version?

So let's say that I have two endpoints:
example.com/v1/send
example.com/v1/read
and now I have to change something in /send without losing backward compatibility. So I'm creating:
example.com/v2/send
But what should I do then? Do I need to create example.com/v2/read, which will do the same as /v1/read? And let's imagine that there are lots of controllers with hundreds of endpoints. Will I be creating a new version like that, changing every small endpoint? Or should my frontend use the API like that?
example.com/v1/send
example.com/v2/read
What is the best practice?
Over time, new endpoints may be added, some endpoints may be removed, the model may change, etc. That's what versioning is for: tracking the changes.
It's likely that you will support both version 1 and 2 for a certain period, but you hardly will support both versions forever. At some point you may drop version 1, and want to keep only version 2 fully up and running.
So, consider the new version of the API as an API that can be used independently from the previous versions. In other words, a particular client should target one version of the API instead of multiple. And, of course, it's desirable to have backwards compatibility if possible.
As a side note: instead of adding the version in the URL, have you considered a media type to handle the versioning?
For instance, have a look at the GitHub API. All requests are handled by a default version of the API, but the target version can be defined (and the clients are encouraged to define the target version) in the Accept header, using a media type:
Accept: application/vnd.github.v3+json

WF4 workflow versions where service contract changes

I just successfully implemented a WF4 "versioning" system using WCF's Routing Service. I had a version1 workflow service to which I added a new Decision activity and saved it as a version2 service. So now I have 2 endpoints (with identical service contracts, i.e. all Receive activities are the same for both service) and a router that checks the content of a message (a "versionId" string on the object that all of my Receive's accept as an argument) to decide which endpoint to hit.
My question is: while this works fine when no changes are made to the service contract, how do I handle the need to add or remove methods from my service contract and create a version3 service? My original thought was, when I add the service reference to my client, I use the latest workflow service's endpoint to get the latest service contract. Then, in the config file, I change the endpoint I connect to so it points at the router's endpoint. But this won't work if v1 and v2 have a different contract than v3. My proxy will have v3's methods and forget all about v1 and v2.
Any ideas of how to handle this? Should I create an actual service contract interface in my workflow solution (instead of just supplying a ServiceContractName in my Receive activities)?
If the WCF contract changes, your client will need to be aware of the additional operations and when to call them. In some applications I have used the active bookmarks (which contain the WCF operation) from the persistence store to have the client app adapt to the workflow dynamically, checking the enabled bookmarks and enabling/disabling UI controls based on that. The client will still have to be updated when new operations are added to a new version of the workflow.
While WCF was young I heard a few voices arguing that endpoint versioning (for web services, that is) should be accomplished by using a folder structure. I never got to the point of trying it out myself, but just analyzing the consequences of such a strategy, it seems to me a splendid solution. I have no production experience of WCF, but I am about to launch a rather comprehensive solution using version 4.0 of .NET (ASP.NET, WCF, WF...), and at this stage I would argue that using a folder structure to separate versions of endpoints would be a good solution.
The essence of such a strategy would be to never change or remove the contract of an endpoint (a specific version) until you are 100% sure that it is not used any more. As your services evolve, you would just add new contracts and endpoints. This could lead to code duplication if one is not as structured a developer as one should be, but by introducing a service facade the duplication would be insignificant.
I have been through the same situation. You can maintain the versions with the help of a custom implementation: save the workflow service URLs in a database and invoke them as desired.
You can find information on calling the WF service by URL from the client here:
http://rajeevkumarsrivastava.blogspot.in/2014/08/consume-workflow-service-45-from-client.html
Hope this helps