I've been reading up on versioning strategies for ReST APIs, and something none of them appear to address is how you manage the underlying codebase.
Let's say we're making a bunch of breaking changes to an API - for example, changing our Customer resource so that it returns separate forename and surname fields instead of a single name field. (For this example, I'll use the URL versioning solution since it's easy to understand the concepts involved, but the question is equally applicable to content negotiation or custom HTTP headers)
We now have an endpoint at http://api.mycompany.com/v1/customers/{id}, and another incompatible endpoint at http://api.mycompany.com/v2/customers/{id}. We are still releasing bugfixes and security updates to the v1 API, but new feature development is now all focusing on v2. How do we write, test and deploy changes to our API server? I can see at least two solutions:
Use a source control branch/tag for the v1 codebase. v1 and v2 are developed, and deployed independently, with revision control merges used as necessary to apply the same bugfix to both versions - similar to how you'd manage codebases for native apps when developing a major new version whilst still supporting the previous version.
Make the codebase itself aware of the API versions, so you end up with a single codebase that includes both the v1 customer representation and the v2 customer representation. Treat versioning as part of your solution architecture instead of a deployment issue - probably using some combination of namespaces and routing to make sure requests are handled by the correct version.
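For instance, here is a minimal sketch of that second model using JAX-RS (all class and package names are illustrative, not prescribed): both representations live in one codebase, separated purely by namespace and route, and return stubbed data.

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;

// In practice each class would live in a version-named package,
// e.g. com.mycompany.api.v1 and com.mycompany.api.v2.
@Path("/v1/customers")
@Produces("application/json")
class CustomerResourceV1 {
    @GET
    @Path("{id}")
    public String get(@PathParam("id") String id) {
        // v1 contract: a single "name" field (stubbed data)
        return "{\"id\":\"" + id + "\",\"name\":\"Jane Doe\"}";
    }
}

@Path("/v2/customers")
@Produces("application/json")
class CustomerResourceV2 {
    @GET
    @Path("{id}")
    public String get(@PathParam("id") String id) {
        // v2 contract: separate "forename" and "surname" fields (stubbed data)
        return "{\"id\":\"" + id + "\",\"forename\":\"Jane\",\"surname\":\"Doe\"}";
    }
}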
The obvious advantage of the branch model is that it's trivial to delete old API versions - just stop deploying the appropriate branch/tag - but if you're running several versions, you could end up with a really convoluted branch structure and deployment pipeline. The "unified codebase" model avoids this problem, but (I think?) would make it much harder to remove deprecated resources and endpoints from the codebase when they're no longer required. I know this is probably subjective since there's unlikely to be a simple correct answer, but I'm curious to understand how organisations who maintain complex APIs across multiple versions are solving this problem.
I've used both of the strategies you mention. Of the two, I favor the second approach, since it's simpler, in use cases that support it. That is, if the versioning needs are simple, go with a simpler software design:
A low number of changes, low complexity changes, or low frequency change schedule
Changes that are largely orthogonal to the rest of the codebase: the public API can exist peacefully with the rest of the stack without requiring "excessive" (for whatever definition of that term you choose to adopt) branching in code
I did not find it overly difficult to remove deprecated versions using this model:
Good test coverage meant that ripping out a retired API and the associated backing code ensured no (well, minimal) regressions
Good naming strategy (API-versioned package names, or, somewhat uglier, API versions in method names) made it easy to locate the relevant code; see the layout sketch after this list
Cross-cutting concerns are harder; modifications to core backend systems to support multiple APIs have to be very carefully weighed. At some point, the cost of versioning the backend (see the comment on "excessive" above) outweighs the benefit of a single codebase.
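To illustrate the naming point: with a version-scoped package layout (hypothetical paths), retiring v1 is mostly a matter of deleting one subtree and fixing whatever the compiler and the test suite complain about:

src/main/java/com/mycompany/api/
    v1/CustomerResource.java    <-- delete this subtree to retire v1
    v2/CustomerResource.java
    core/CustomerService.java   <-- shared, version-agnostic backing code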
The first approach is certainly simpler from the standpoint of reducing conflict between co-existing versions, but the overhead of maintaining separate systems tended to outweigh the benefit of reducing version conflict. That said, it was dead simple to stand up a new public API stack and start iterating on a separate API branch. Of course, generational loss set in almost immediately, and the branches turned into a mess of merges, merge conflict resolutions, and other such fun.
A third approach is at the architectural layer: adopt a variant of the Facade pattern, and abstract your APIs into public-facing, versioned layers that talk to the appropriate Facade instance, which in turn talks to the backend via its own set of APIs. Your Facade (I used an Adapter in my previous project) becomes its own package, self-contained and testable, and allows you to migrate frontend APIs independently of the backend, and of each other.
This will work if your API versions tend to expose the same kinds of resources, but with different structural representations, as in your fullname/forename/surname example. It gets slightly harder if they start relying on different backend computations, as in, "My backend service has returned incorrectly calculated compound interest that has been exposed in public API v1. Our customers have already patched this incorrect behavior. Therefore, I cannot update that computation in the backend and have it apply until v2. Therefore we now need to fork our interest calculation code." Luckily, those tend to be infrequent: practically speaking, consumers of RESTful APIs favor accurate resource representations over bug-for-bug backwards compatibility, even amongst non-breaking changes on a theoretically idempotent GETted resource.
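To make the shape concrete, here is a rough sketch of such a versioned facade layer (all names are mine, not from any particular framework); each public version owns its mapping to the backend model:

// The backend model knows nothing about API versions.
record Customer(String forename, String surname) {}

// One facade per public API version; only the facades know the wire format.
interface CustomerFacade {
    String represent(Customer c);
}

class CustomerFacadeV1 implements CustomerFacade {
    public String represent(Customer c) {
        // v1 collapses the pair into the legacy single "name" field
        return "{\"name\":\"" + c.forename() + " " + c.surname() + "\"}";
    }
}

class CustomerFacadeV2 implements CustomerFacade {
    public String represent(Customer c) {
        return "{\"forename\":\"" + c.forename()
             + "\",\"surname\":\"" + c.surname() + "\"}";
    }
}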
I'll be interested to hear your eventual decision.
For me the second approach is better. I have used it for SOAP web services and plan to use it for REST as well.
As you write, the codebase should be version-aware, but the compatibility layer can be used as a separate layer. In your example, the codebase can produce a resource representation (JSON or XML) with first and last name, but the compatibility layer will change it to have only name instead.
The codebase should implement only the latest version, let's say v3. The compatibility layer should convert requests and responses between the newest version, v3, and the supported versions, e.g. v1 and v2.
The compatibility layer can have a separate adapter for each supported version, and these can be connected as a chain.
For example:
Client v1 request: v1 adapt to v2 ---> v2 adapt to v3 ---> codebase
Client v2 request: v1 adapt to v2 (skipped) ---> v2 adapt to v3 ---> codebase
For responses, the adapters simply work in the opposite direction. If you are using Java EE, you can use the servlet filter chain as the adapter chain, for example.
Removing a version is easy: delete the corresponding adapter and its test code.
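A sketch of what such a chain could look like in plain Java (types and names are illustrative; with Java EE the same shape falls out of servlet filters):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One adapter per version step; the codebase behind the chain only ever sees
// the latest version. Requests flow up the chain, responses flow back down.
interface VersionAdapter {
    Map<String, Object> upgradeRequest(Map<String, Object> body);
    Map<String, Object> downgradeResponse(Map<String, Object> body);
}

class V1ToV2Adapter implements VersionAdapter {
    public Map<String, Object> upgradeRequest(Map<String, Object> body) {
        // v1 -> v2: split the single "name" field into forename/surname
        // (assumes a v1 body that actually carries a "name" field)
        Map<String, Object> out = new HashMap<>(body);
        String[] parts = ((String) out.remove("name")).split(" ", 2);
        out.put("forename", parts[0]);
        out.put("surname", parts.length > 1 ? parts[1] : "");
        return out;
    }
    public Map<String, Object> downgradeResponse(Map<String, Object> body) {
        // v2 -> v1: join forename/surname back into "name"
        Map<String, Object> out = new HashMap<>(body);
        out.put("name", out.remove("forename") + " " + out.remove("surname"));
        return out;
    }
}

class AdapterChain {
    private final List<VersionAdapter> steps; // index 0 = v1->v2, 1 = v2->v3, ...
    AdapterChain(List<VersionAdapter> steps) { this.steps = steps; }

    Map<String, Object> toLatest(Map<String, Object> request, int clientVersion) {
        // A v2 client skips the v1->v2 adapter, exactly as in the diagram above.
        for (VersionAdapter step : steps.subList(clientVersion - 1, steps.size()))
            request = step.upgradeRequest(request);
        return request;
    }
}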
Branching seems much better to me, and I used this approach in my case.
Yes, as you already mentioned, backporting bug fixes will require some effort, but at the same time supporting multiple versions under one source base (with routing and all the other machinery) will require at least the same effort, while making the system more complicated and monstrous, with different branches of logic inside (at some point of versioning you will definitely end up with a huge switch pointing to version modules with duplicated code, or, even worse, if (version == 2) then ... scattered about).
Also, don't forget that for regression purposes you still have to keep the tests branched.
Regarding versioning policy: I would support at most two versions behind the current one, deprecating support for older ones; that gives users some motivation to move.
Usually, the introduction of a major API version that leaves you maintaining multiple versions is an event which does not (or should not) occur very frequently. However, it cannot be avoided completely. I think it is overall a safe assumption that a major version, once introduced, will stay the latest version for a relatively long period of time. Based on this, I would prefer to achieve simplicity in the code at the expense of duplication, as it gives me better confidence of not breaking the previous version when I introduce changes in the latest one.
Related
I have an internal service that exposes a few APIs, and a few clients using these APIs. I have to make some breaking changes and redesign this service's API.
What are some of the best ways to maintain backward compatibility for these clients while making these changes? (I know it's not ideal, but most things in the world aren't, right?)
One solution I can think of is a config based on which the clients talk either to the old API or to the new one. This allows me to merge the client code immediately and then enable the new API through the config when the time is right for me.
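A minimal sketch of that idea (the property name and URLs are made up for illustration): client code for both paths ships together, and the config decides which API is actually hit.

import java.util.Properties;

// Hypothetical client-side switch: one config property picks the API version,
// so the new client code can be merged long before it is enabled.
class ApiConfig {
    private final Properties props;

    ApiConfig(Properties props) { this.props = props; }

    String customersEndpoint() {
        boolean useNewApi = Boolean.parseBoolean(props.getProperty("api.useNew", "false"));
        return useNewApi ? "http://api.example.com/v2/customers"
                         : "http://api.example.com/v1/customers";
    }
}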
I want to find out if there are other solutions in practice for making such breaking changes.
The most common way is to introduce versioning in your API, e.g.:
http://api.example.com/ (can default to an older version for backwards compatibility)
http://api.example.com/v1
etc...
See more information and examples here: https://restfulapi.net/versioning/
I'm trying to figure out a scenario, but I can't find any relevant information on the web.
Let's say I am deploying an Android Application (v1.0.0) with the backend (v1.0.0).
At some point, I will make some changes and update the app to v1.0.1 and the backend to v1.0.1 and they will work perfectly.
But how can I also support the previous version of the application (maybe the new server version provides another format of response for one specific request)?
I thought of having separate deployments for every version of the server, but across many updates that would mean a big resource impact.
Also, forcing the users to update doesn't seem a good option in my opinion.
Essentially you can go about this in multiple ways. It really depends on your requirements, but usually the reality is a mixture of the things below.
As a rule of thumb, think in terms of a data model that will be held compatible. If the contract cannot be kept, or you realize major changes are needed, introduce a new version of the API. If old clients cannot be supported, force an update. You can also agree on a policy for how long to support each previous version and then force an update; this will make your life much easier and simpler than maintaining tens of versions of APIs.
Backwards compatible data model
You must think backwards with each release:
Think of incremental modelling with each release cycle, so that old data and old clients keep working; the sketch below shows the additive style this implies.
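For example (borrowing the forename/surname split from earlier in the thread; the field names are purely illustrative), an additive change keeps every released parser working:

// Additive evolution: keep the old field, add new ones, never repurpose any.
// Serialized, this yields {"name": ..., "forename": ..., "surname": ...},
// so pre-1.1.0 clients keep reading "name" while newer clients prefer the pair.
class CustomerDto {
    public String name;      // kept for old clients (redundant but compatible)
    public String forename;  // added in 1.1.0
    public String surname;   // added in 1.1.0
}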
When you forget about it and you need to switch behavior based on the data:
This happened to me in my trainee years: if you forget to add a protocol version, you simply have to decode which version the data might be from the data itself. From the beginning you can always have a version field on the data. Moreover, you can also add a set of compatible parser versions.
Include protocol version in data:
{
"data": [ arbitrary data model],
"protocolVersion": "v1"
}
Based on the protocol version, you can decide how to process the data on the server side. You don't need to keep the client version in mind, only the protocol's, since it can happen that you release 1.0.0, 1.0.1, and 1.1.0, and only change the protocol in 1.2.0.
But the catch is that as the data changes over time, so does the server-side processing behavior.
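Sketched on the server side (the envelope is the JSON above, already parsed into a map; the handler names are invented):

import java.util.Map;

// Dispatch on the protocol version carried in the payload, not on the client
// app version: several app releases can share one protocol version.
class ProtocolDispatcher {
    Object handle(Map<String, Object> envelope) {
        String protocol = (String) envelope.getOrDefault("protocolVersion", "v1");
        Object data = envelope.get("data");
        switch (protocol) {
            case "v1": return handleV1(data);
            case "v2": return handleV2(data);
            default:   throw new IllegalArgumentException("unknown protocol: " + protocol);
        }
    }

    private Object handleV1(Object data) { /* legacy processing path */ return data; }
    private Object handleV2(Object data) { /* current processing path */ return data; }
}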
Versioned API
Sometimes you see different paths for major versions:
/api/v1/query
/api/v2/query
Used when backwards compatibility is broken, or after a total reconsideration; therefore not every client release will come with an API version increment.
Use a query parameter with the client version:
/api/query?v=1.0
/api/query?v=1.1
Similar to the previous one, just written differently, although I don't think this is the road you want to go down.
Track client releases and used service versions
In reality there are multiple requests and multiple versions of services in use at any time for each client version. This makes things even harder to maintain.
Document and track, all the time. Version and release management is very important. Know the git hash from which each version was built; it can easily get confusing when you change just one parameter after a release as a quick fix, and the day after nothing is compatible anymore.
Map all service versions to the client version at each release, know which commit you really built and use tagging and release management.
Test everything before each release
Have clear requirements for your backwards compatibility. Maybe you only support the two previous versions; then test with both of those clients, plus the new client, against the server that is about to be released. Document everything. And when you meet your criteria for release, go with it.
Summary
Reality is a mixture of solutions. Have clear requirements on backward compatibility, so that you can test before each release. When it is necessary, force an update. Keep track of releases, and document each client version along with all the services it uses and their versions.
Use a switch case at the server for each different version of the app.
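For example (the header name and version strings are invented; the response shapes reuse the customer example from earlier in the thread):

// One handler, branching per app version reported by the client.
class CustomerEndpoint {
    String getCustomer(String appVersion, String forename, String surname) {
        switch (appVersion == null ? "1.0.0" : appVersion) {
            case "1.0.0": // old app expects a single "name" field
                return "{\"name\":\"" + forename + " " + surname + "\"}";
            default:      // 1.0.1 and later expect the split fields
                return "{\"forename\":\"" + forename + "\",\"surname\":\"" + surname + "\"}";
        }
    }
}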
I'm testing out Azure Service Fabric and started adding a lot of actors and services to the same project. Is this okay to do, or will I lose any Service Fabric features such as failover, scalability, etc.?
My preference here is clearly 1 actor/1 service = 1 project. The big win with a platform like this is that it allows you to write proper microservice-oriented applications at close to no cost, at least compared to the implementation overhead you have when doing similar implementations on other, somewhat similar platforms.
I think it defies the point of an architecture like this to build services or actors that span multiple concerns. It makes sense (to me at least) to use these imaginary constraints to force you to keep the area of responsibility of these services as small as possible - and rather depend on/call other services in order to provide functionality outside of the responsibility of the project you are currently implementing.
With regard to scaling, it seems you'll still be able to scale your services/actors independently even though they are part of the same project; at least, that's implied by the application manifest format. What you will not be able to do, though, is independent updates of services/actors within your project. As an example: if your project has two different actors and you make a change to one of them, you will still need to deploy an update to both, since they are part of the same code package and share a version number.
I have an internal REST API that has the following properties:
It has a relatively large surface (many entities, most of which support all standard verbs)
It will evolve slowly and locally (mostly minor changes to individual entities)
I have read some REST versioning strategies, but I do not clearly see how they apply here.
If I use version-in-URL, then when v2 first appears, it would have just one entity/some verbs in it. The clients will be using v2 for one entity, v1 for other, and later maybe v3 for yet another. That is unless I copy/redirect the whole surface on each change, but that can get me to v10 quickly just because of 10 unrelated changes in different locations.
If I use version-in-content-type the problems are pretty similar (but I do not think version-in-content-type covers all potential changes so I wasn't planning to do that anyway).
Is there a common approach to evolving the API in this case?
We are creating hospital information system software. The project will differ from hospital to hospital and contain different use cases, but many parts will be the same, so we plan to use the branching mechanism of source control. If we find a bug in one hospital's branch, how can we know whether the other branches have the same bug or not?
(Image: http://img85.imageshack.us/img85/5074/version.png)
The numbers in the attached picture identify each hospital's software.
Do you have a solution to this problem?
Which source control system (SVN, Git, Hg) would be suitable for it?
Thank you!
OK, this is not really a VCS question; this is first and foremost an architectural problem: how do you structure and build your application(s) in such a way that you can deliver the specific use cases to each hospital as required, whilst being able, as you suggest, to fix bugs in common code?
I think one can state with some certainty that if you follow the model suggested in your image you aren't going to be able to do so consistently and effectively. What you will end up with is a number of different, discrete, applications that have to be maintained separately even though they have at some point come from a common set of code.
It's hard to offer more than broad generalisations, but I would think about this along the following lines:
Firstly, you need a core application (or set of applications, or set of application libraries); these will form the basis of any delivered system, and therefore a single maintained set of code (although this core may itself include external libraries).
You then have a number of options for your customised applications (the per-hospital instances); you can define the available functionality by a number of means:
At one extreme, by configuration - having one application containing all the code and effectively switching things on and off on a per instance basis.
At the other extreme by having an application per hospital that substantially comprises the core code with customisation.
However, the likelihood is that whilst the sum of the use cases for each hospital differs, individual use cases will be common across a number of instances, so you need to aim for a modular system, i.e. something that starts with the common core and can be extended and configured by composition as much as by any other means.
This means that you probably want to make extensive use of Inversion of Control and Dependency Injection to give you flexibility within a common framework. You want to look at extensibility frameworks (I do .NET, so I'd be looking at the Managed Extensibility Framework, MEF) that allow you to go some way towards "assembling" an application at runtime rather than at compile time.
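This answer is .NET-flavoured (hence MEF), but the runtime-composition idea looks much the same elsewhere; for instance, a rough Java analogue using java.util.ServiceLoader (the interface and module names are invented):

import java.util.ServiceLoader;

// The core defines an extension point; each hospital's build ships only the
// modules it needs on the classpath, listed in META-INF/services/HospitalModule.
interface HospitalModule {
    String name();
    void register(); // hook this module's use cases into the core application
}

class ModuleLoader {
    public static void main(String[] args) {
        for (HospitalModule module : ServiceLoader.load(HospitalModule.class)) {
            System.out.println("loading module: " + module.name());
            module.register();
        }
    }
}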
You also need to pay attention to how you're going to deploy, and especially how you're going to update, your applications, and at this point you're right: you're going to need to get both your version control and your build environment right.
Once you know how you're going build your application then you can look at your version control system - #VonC is spot on when he says that the key feature is to be able to include code from shared projects into multiple deliverable projects.
If it were me, now, I'd probably have a core (which will probably itself be multiple projects/solutions) and then one project/solution per hospital, but I would aim to have as little code as possible in the per-hospital projects - ideally just enough framework to define the instance-specific configuration and UI customisation.
As to which to use... if you're a Microsoft shop then take a good long hard look at TFS, the productivity gains from having a well integrated environment can be considerable.
Otherwise (and in any case), DVCSs (Mercurial, Git, Bazaar, etc.) seem to me to be gaining an edge on the more traditional systems as they mature. I think SVN is an excellent tool (I use it and it works), and I think you need a central repository for this kind of development, not least because you need somewhere to trigger your Continuous Integration server. However, you can achieve the same thing with a DVCS, and the ability to make frequent local, incremental commits without "breaking the build", together with the flexibility a DVCS gives you, means that if you have a choice now then that is almost certainly the way to go (but you do need to establish good practices to ensure that code is pushed to your core repositories early).
I think there is still a lot to address purely from the VCS question - but you can't get to that in useful detail 'til you know how you're going to structure your delivered solution.
All of the VCSs (Version Control Systems) you mention are compatible with the notion of a "shared component", which allows you to define a common shared and deployed code base, plus some specialization in each branch:
CVCS (Centralized)
Subversion externals
DVCS (Distributed)
Git submodules (see true nature of submodules)
Hg SubRepos
Considering the distributed aspect of the release management process, a DVCS would be more appropriate.
If the bug is located in the common code base, you can quickly check in the other branches:
which exact version of the common component they are referring to;
whether they refer to the same or an older version of that common component than the one in which the bug was found (in which case chances are they also have the bug).