I have an internal REST API that has the following properties:
It has a relatively large surface (many entities, most of which support all standard verbs)
It will evolve slowly and locally (mostly minor changes to individual entities)
I have read some REST versioning strategies, but I do not clearly see how they apply here.
If I use version-in-URL, then when v2 first appears, it would cover just one entity and some of its verbs. The clients would be using v2 for one entity, v1 for others, and later maybe v3 for yet another. That is, unless I copy/redirect the whole surface on each change, but that can get me to v10 quickly just because of 10 unrelated changes in different locations.
If I use version-in-content-type, the problems are pretty similar (but I do not think version-in-content-type covers all potential changes, so I wasn't planning to do that anyway).
Is there a common approach to evolving the API in this case?
Related
I have an internal service that exposes a few APIs, and a few clients use those APIs. I have to make some breaking changes and redesign this service's API.
What are some of the best ways to maintain backward compatibility for these clients while making these changes? (I know it's not ideal, but most things in the world aren't, right?)
One solution I can think of is having a config based on which the clients talk either to the old API or to the new one. This allows me to merge the client code immediately and then enable the new API through the config when the time is right for me.
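Roughly, I imagine the switch looking something like this (a sketch only; all names and URLs here are made up):

```java
// A made-up sketch: one config property decides which API base URL the
// client uses, so the client code for both versions ships together and
// the switch is flipped via config. All names here are hypothetical.
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Properties;

public class OrderApiClient {
    private final String baseUrl;

    public OrderApiClient(Properties config) {
        boolean useNewApi = Boolean.parseBoolean(config.getProperty("orders.useNewApi", "false"));
        this.baseUrl = useNewApi
                ? "https://service.internal/api/v2"   // redesigned API
                : "https://service.internal/api/v1";  // legacy API
    }

    public String fetchOrder(String id) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/orders/" + id))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }
}
```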
I want to find out if there are other solutions that are used in practice when making such breaking changes.
The most common way is to introduce versioning in your API, e.g.:
http://api.example.com/ (can default to an older version for backwards compatibility)
http://api.example.com/v1
etc...
See more information and examples here: https://restfulapi.net/versioning/
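As a rough illustration of URL versioning, versioned endpoints might look like this (JAX-RS is used here purely as an example; any framework works the same way):

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;

// Two resource classes (shown together; normally one file each), one per
// version prefix. Old clients keep hitting /v1 while new ones use /v2.
@Path("/v1/customers")
public class CustomerResourceV1 {
    @GET
    @Path("{id}")
    public String get(@PathParam("id") String id) {
        return "{\"name\": \"Jane Doe\"}";                       // old shape
    }
}

@Path("/v2/customers")
class CustomerResourceV2 {
    @GET
    @Path("{id}")
    public String get(@PathParam("id") String id) {
        return "{\"forename\": \"Jane\", \"surname\": \"Doe\"}"; // new shape
    }
}
```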
I'm testing out Azure Service Fabric and started adding a lot of actors and services to the same project - is this okay to do, or will I lose any of Service Fabric's features such as failover, scalability, etc.?
My preference here is clearly 1 actor/1 service = 1 project. The big win with a platform like this is that it allows you to write proper microservice-oriented applications at close to no cost, at least compared to the implementation overhead of doing something similar on other, somewhat comparable platforms.
I think it defies the point of an architecture like this to build services or actors that span multiple concerns. It makes sense (to me at least) to use these imaginary constraints to force you to keep the area of responsibility of these services as small as possible - and rather depend on/call other services in order to provide functionality outside of the responsibility of the project you are currently implementing.
Regarding scaling, it seems you'll still be able to scale your services/actors independently even though they are part of the same project - at least that's implied by the application manifest format. What you will not be able to do, though, is update services/actors within your project independently. As an example: if your project has two different actors and you make a change to one of them, you will still need to deploy an update to both of them, since they are part of the same code package and share a version number.
I've been reading up on versioning strategies for REST APIs, and something none of them appear to address is how you manage the underlying codebase.
Let's say we're making a bunch of breaking changes to an API - for example, changing our Customer resource so that it returns separate forename and surname fields instead of a single name field. (For this example, I'll use the URL versioning solution since it's easy to understand the concepts involved, but the question is equally applicable to content negotiation or custom HTTP headers)
We now have an endpoint at http://api.mycompany.com/v1/customers/{id}, and another incompatible endpoint at http://api.mycompany.com/v2/customers/{id}. We are still releasing bugfixes and security updates to the v1 API, but new feature development is now all focusing on v2. How do we write, test and deploy changes to our API server? I can see at least two solutions:
Use a source control branch/tag for the v1 codebase. v1 and v2 are developed and deployed independently, with revision control merges used as necessary to apply the same bugfix to both versions - similar to how you'd manage codebases for native apps when developing a major new version whilst still supporting the previous version.
Make the codebase itself aware of the API versions, so you end up with a single codebase that includes both the v1 customer representation and the v2 customer representation. Treat versioning as part of your solution architecture instead of a deployment issue - probably using some combination of namespaces and routing to make sure requests are handled by the correct version.
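For concreteness, a minimal sketch of what option 2 could look like, with both representations living in one codebase and dispatch keyed on the version segment of the URL (names are hypothetical):

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: one deployable holds every supported representation,
// and routing picks the right one from the URL's version segment.
class CustomerRepresentations {
    record Customer(String forename, String surname) {}   // shared domain model

    static final Map<String, Function<Customer, Map<String, String>>> BY_VERSION = Map.of(
        "v1", c -> Map.of("name", c.forename() + " " + c.surname()), // old shape
        "v2", c -> Map.of("forename", c.forename(), "surname", c.surname())
    );

    // e.g. called from the handler for /{version}/customers/{id}
    static Map<String, String> represent(String version, Customer c) {
        return BY_VERSION.get(version).apply(c);
    }
}
```

Deleting v1 later would then be a matter of removing its entry and the tests that cover it.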
The obvious advantage of the branch model is that it's trivial to delete old API versions - just stop deploying the appropriate branch/tag - but if you're running several versions, you could end up with a really convoluted branch structure and deployment pipeline. The "unified codebase" model avoids this problem, but (I think?) would make it much harder to remove deprecated resources and endpoints from the codebase when they're no longer required. I know this is probably subjective since there's unlikely to be a simple correct answer, but I'm curious to understand how organisations who maintain complex APIs across multiple versions are solving this problem.
I've used both of the strategies you mention. Of the two, I favor the second approach, being simpler, in use cases that support it. That is, if the versioning needs are simple, go with a simpler software design:
A low number of changes, low-complexity changes, or a low-frequency change schedule
Changes that are largely orthogonal to the rest of the codebase: the public API can exist peacefully with the rest of the stack without requiring "excessive" (for whatever definition of that term you choose to adopt) branching in code
I did not find it overly difficult to remove deprecated versions using this model:
Good test coverage meant that ripping out a retired API and the associated backing code ensured no (well, minimal) regressions
Good naming strategy (API-versioned package names, or somewhat uglier, API versions in method names) made it easy to locate the relevant code
Cross-cutting concerns are harder; modifications to core backend systems to support multiple APIs have to be weighed very carefully. At some point, the cost of versioning the backend (see the comment on "excessive" above) outweighs the benefit of a single codebase.
The first approach is certainly simpler from the standpoint of reducing conflict between co-existing versions, but the overhead of maintaining separate systems tended to outweigh the benefit of reducing version conflict. That said, it was dead simple to stand up a new public API stack and start iterating on a separate API branch. Of course, generational loss set in almost immediately, and the branches turned into a mess of merges, merge conflict resolutions, and other such fun.
A third approach is at the architectural layer: adopt a variant of the Facade pattern and abstract your APIs into public-facing, versioned layers, each of which talks to the appropriate Facade instance, which in turn talks to the backend via its own set of APIs. Your Facade (I used an Adapter in my previous project) becomes its own package, self-contained and testable, and allows you to migrate frontend APIs independently of the backend, and of each other.
This will work if your API versions tend to expose the same kinds of resources, but with different structural representations, as in your fullname/forename/surname example. It gets slightly harder if they start relying on different backend computations, as in, "My backend service has returned incorrectly calculated compound interest that has been exposed in public API v1. Our customers have already patched this incorrect behavior. Therefore, I cannot update that computation in the backend and have it apply until v2. Therefore we now need to fork our interest calculation code." Luckily, those tend to be infrequent: practically speaking, consumers of RESTful APIs favor accurate resource representations over bug-for-bug backwards compatibility, even amongst non-breaking changes on a theoretically idempotent GETted resource.
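To make the facade variant concrete, a bare-bones sketch (all names invented; not the poster's actual code):

```java
import java.util.Map;

// Each public API version talks to its own facade, and only the facades
// talk to the backend, so frontend versions can migrate independently.
// All names here are invented for illustration.
class Backend {                                  // backend's own API
    Map<String, String> loadCustomer(String id) {
        return Map.of("forename", "Jane", "surname", "Doe");
    }
}

interface CustomerFacade {
    Map<String, String> customer(String id);
}

class CustomerFacadeV1 implements CustomerFacade {
    private final Backend backend = new Backend();
    public Map<String, String> customer(String id) {
        Map<String, String> c = backend.loadCustomer(id);
        return Map.of("name", c.get("forename") + " " + c.get("surname"));
    }
}

class CustomerFacadeV2 implements CustomerFacade {
    private final Backend backend = new Backend();
    public Map<String, String> customer(String id) {
        return backend.loadCustomer(id);         // v2 exposes the backend shape directly
    }
}
```

Each facade is its own unit: testable in isolation and deletable when its API version is retired.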
I'll be interested to hear your eventual decision.
For me the second approach is better. I have used it for SOAP web services and plan to use it for REST as well.
As you write, the codebase should be version-aware, but a compatibility layer can be used as a separate layer. In your example, the codebase can produce a resource representation (JSON or XML) with forename and surname, but the compatibility layer will change it to have only name instead.
The codebase should implement only the latest version, let's say v3. The compatibility layer should convert requests and responses between the newest version, v3, and the supported versions, e.g. v1 and v2.
The compatibility layer can have a separate adapter for each supported version, and the adapters can be connected as a chain.
For example:
Client v1 request: v1-to-v2 adapter ---> v2-to-v3 adapter ---> codebase
Client v2 request: (v1-to-v2 adapter skipped) ---> v2-to-v3 adapter ---> codebase
For the response, the adapters simply work in the opposite direction. If you are using Java EE, you can use the servlet filter chain as the adapter chain, for example.
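Reduced to plain code, such an adapter chain might look like this (hypothetical names; a servlet-filter implementation would follow the same shape):

```java
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Each adapter translates between two adjacent versions; chaining them
// means the codebase only ever implements the latest version (v3 here).
// All names are hypothetical; maps are assumed mutable (e.g. HashMap).
interface VersionAdapter {
    Map<String, Object> adaptRequest(Map<String, Object> request);
    Map<String, Object> adaptResponse(Map<String, Object> response);
}

class V1ToV2Adapter implements VersionAdapter {
    public Map<String, Object> adaptRequest(Map<String, Object> req) {
        // v1 sent a single "name"; v2 expects forename/surname.
        String[] parts = ((String) req.remove("name")).split(" ", 2);
        req.put("forename", parts[0]);
        req.put("surname", parts.length > 1 ? parts[1] : "");
        return req;
    }
    public Map<String, Object> adaptResponse(Map<String, Object> resp) {
        resp.put("name", resp.remove("forename") + " " + resp.remove("surname"));
        return resp;
    }
}

class AdapterChain {
    // For a v1 client pass [v1-to-v2, v2-to-v3]; for a v2 client just [v2-to-v3].
    static Map<String, Object> handle(List<VersionAdapter> chain,
                                      Map<String, Object> request,
                                      UnaryOperator<Map<String, Object>> codebase) {
        for (VersionAdapter a : chain) request = a.adaptRequest(request);
        Map<String, Object> response = codebase.apply(request);
        for (int i = chain.size() - 1; i >= 0; i--) {
            response = chain.get(i).adaptResponse(response);   // reverse direction
        }
        return response;
    }
}
```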
Removing one version is easy: delete the corresponding adapter and its test code.
Branching seems much better to me, and I used this approach in my case.
Yes, as you already mentioned, backporting bug fixes will require some effort. But at the same time, supporting multiple versions under one source base (with routing and all the other machinery) will require if not less then at least the same effort, while making the system more complicated and monstrous, with different branches of logic inside (at some point of versioning you will definitely end up with a huge switch dispatching to version modules with duplicated code, or, even worse, scattered if (version == 2) checks).
Also, don't forget that for regression purposes you still have to keep your tests branched.
Regarding versioning policy: I would keep at most two versions behind the current one, deprecating support for older ones - that would give users some motivation to move.
Usually, the introduction of a major API version that leaves you maintaining multiple versions is an event which does not (or should not) occur very frequently. However, it cannot be avoided completely. I think it is overall a safe assumption that a major version, once introduced, will stay the latest version for a relatively long period of time. Based on this, I would prefer to achieve simplicity in the code at the expense of duplication, as it gives me better confidence of not breaking the previous version when I introduce changes in the latest one.
My current project involves using LDAP (Active Directory), and I use issue tracking for all of my projects, so the idea of combining the two crossed my mind. To fit the requirements of StackOverflow I'll try to formulate this as a question, but I admit this is more about getting some opinions; please forgive me :)
I think that issue-tracking and SCM (software configuration management) in general would be a good application for LDAP because of the following reasons:
Easy to integrate into existing infrastructure (no need for additional user management)
Fine-grained access control for projects/issues etc.
Ready-to-use hierarchical, property-oriented storage (which is typically needed for SCM/issue trackers)
A standard API with bindings for almost all languages/technologies
Searching/Indexing, Backup/replication functionality already present in most LDAP solutions
Extensible schema already part of the LDAP technology (it would be easy to add properties to issues/projects etc.)
So my questions are:
Are you aware of any existing attempts to define a (standard) schema for issue tracking or SCM (i.e., class definitions for issues, projects, versions, releases, revisions, etc.)?
LDAP usually manages relatively slowly changing data. How well would current implementations (OpenLDAP, Active Directory) handle data that changes very frequently, mainly in terms of performance and amount of data?
Are there any other drawbacks of such a solution you can think of?
and of course
Who would like to try to start such a project :) ...
The OP clarifies:
The question is not about using an existing issue tracker with LDAP authentication (redmine can do this for example),
but about storing tickets/issues/etc. directly within the LDAP tree...
Currently, each issue tracker has its own API for accessing data; having all data accessible via LDAP could make writing tools (e.g., integration into IDEs) much easier.
To which the answer is easy.
Don't.
LDAP is not (repeat, not) made for that, and there is much more to an SCM or an Issue Tracker than just a bunch of hierarchical data.
An SCM has to come up with a way to efficiently store/reference deltas, entire trees, branches, and labels.
An issue tracker is all about multiple relationships between one item and several others (several parents/children, related, duplicated, ...), plus it has to somehow manage a tight reference to the code (or rather to the changeset, the set of versions modified).
While it is true that by adding a lot of new objectClass types you could end up with a similar structure, you would essentially take what is a Lightweight Directory (i.e., optimized for reads) and transform it into a huge repository (with lots of read/write operations and complex data structures).
If you are looking for a unifying API, a generic one (not just for SCM or bug tracking) is OSLC (Open Services for Lifecycle Collaboration), an open protocol currently used for change management by RTC (Rational Team Concert).
We have written a software package for a particular niche industry. This package has been pretty successful, to the extent that we have signed up several different clients in the industry, who use us as a hosted solution provider, and many others are knocking on our doors. If we achieve the kind of success that we're aiming for, we will have literally hundreds of clients, each with their own web site hosted on our servers.
Trouble is, each client comes in with their own little customizations and tweaks that they need for their own local circumstances and conditions, often (but not always) based on local state or even county legislation or bureaucracy. So while probably 90-95% of the system is the same across all clients, we're going to have to build and support these little customizations.
Moreover, the system is still very much a work in progress. There are enhancements and bug fixes happening continually on the core system that need to be applied across all clients.
We are writing code in .NET (ASP, C#), MS-SQL 2005 is our DB server, and we're using SourceGear Vault as our source control system. I have worked with branching in Vault before, and it's great if you only need to keep 2 or 3 branches synchronized - but we're looking at maintaining hundreds of branches, which is just unthinkable.
My question is: How do you recommend we manage all this?
I expect answers will be addressing things like object architecture, web server architecture, source control management, developer teams etc. I have a few ideas of my own, but I have no real experience in managing something like this, and I'd really appreciate hearing from people who have done this sort of thing before.
Thanks!
I would recommend against maintaining separate code branches per customer. Keeping working code in sync with your core quickly becomes a nightmare.
I do recommend you implement the Strategy pattern and cover your customer customizations with automated tests (e.g., unit and functional) whenever you change your core.
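A minimal sketch of that Strategy setup, with invented names (the tax domain here is only illustrative, echoing the county-legislation example above):

```java
import java.math.BigDecimal;

// The core defines a seam, and each customer's local rule plugs in behind
// it, so the core never branches on customer identity. Names are invented.
interface TaxStrategy {
    BigDecimal tax(BigDecimal amount);
}

class DefaultTaxStrategy implements TaxStrategy {
    public BigDecimal tax(BigDecimal amount) {
        return amount.multiply(new BigDecimal("0.07"));
    }
}

class CountyXTaxStrategy implements TaxStrategy {      // one customer's county rule
    public BigDecimal tax(BigDecimal amount) {
        return amount.multiply(new BigDecimal("0.065"));
    }
}

class InvoiceService {                                 // core code
    private final TaxStrategy taxStrategy;
    InvoiceService(TaxStrategy taxStrategy) { this.taxStrategy = taxStrategy; }
    BigDecimal total(BigDecimal net) { return net.add(taxStrategy.tax(net)); }
}
```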
UPDATE:
I recommend that before you get too many customers, you need to establish a system of creating and updating each of their websites. How involved you get is going to be balanced by your current revenue stream of course, but you should have an end in mind.
For example, when you sign up customer X (hopefully all via the web), their website should be created within XX minutes, with an email sent to the customer stating it's ready.
You definitely want to setup a Continuous Integration (CI) environment. TeamCity is a great tool, and free.
With this in place, you'll be able to verify your updates in a staging environment and then apply those patches across your production instances.
Bottom line: Once you get past a handful of customers, you need to start treating the automation of your operations and deployment as yet another application in itself.
UPDATE: This post highlights the negative effects of branching per customer.
Our software has very similar requirements and I've picked up a few things over the years.
First of all, such customizations will cost you in both the short and the long term. If you have control over it, put some checks and balances in place so that sales & marketing do not over-zealously sell customizations.
I agree with the other posters that say NOT to use source control to manage this. It should be built into the project architecture wherever possible. When I first began working for my current employer, source control was being used for this and it quickly became a nightmare.
We use a separate database for each client, mainly because for many of our clients, the law or the client themselves require it due to privacy concerns, etc...
I would say that the business logic differences have probably been the least difficult part of the experience for us (your mileage may vary depending on the nature of the customizations required). For us, most variations in business logic can be broken down into a set of configuration values, which we store in an XML file that is modified upon deployment (if machine-specific) or stored in a client-specific folder and kept in source control (explained below). The business logic obtains these values at runtime and adjusts its execution appropriately. You can use this in concert with various strategy and factory patterns as well -- config fields can contain the names of strategies, and so on. Also, unit testing can be used to verify that you haven't broken things for other clients when you make changes. Currently, adding most new clients to the system involves simply mixing and matching the appropriate config values (as far as business logic is concerned).
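As a sketch of that config-driven selection (hypothetical names; it assumes the strategy classes are resolvable by the configured class name):

```java
import java.util.Properties;

// The client's config names a strategy, and a factory instantiates it, so
// adding a client is (mostly) a config change rather than a code change.
// All names here are hypothetical.
interface PricingStrategy {
    double price(double base);
}

class StandardPricing implements PricingStrategy {
    public double price(double base) { return base; }
}

class StateYPricing implements PricingStrategy {       // a client-specific rule
    public double price(double base) { return base * 0.9; }
}

class PricingFactory {
    static PricingStrategy fromConfig(Properties clientConfig) throws Exception {
        // e.g. a line in the client's config file: pricing.strategy=StateYPricing
        String className = clientConfig.getProperty("pricing.strategy", "StandardPricing");
        return (PricingStrategy) Class.forName(className)
                .getDeclaredConstructor().newInstance();
    }
}
```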
More of a problem for us is managing the content of the site itself, including the pages, style sheets, text strings, and images, all of which our clients often want customized. The current approach I've taken for this is to create a folder tree for each client that mirrors the main site. This tree is rooted at a folder named "custom" that is located in the main site folder and deployed with the site. Content placed in the client-specific set of folders either overrides or merges with the default content (depending on file type). At runtime the correct file is chosen based on the current context (user, language, etc.). The site can be made to serve multiple clients this way. Efficiency may also be a concern; you can use caching and the like to make it faster (I use a custom VirtualPathProvider).
The largest problem we run into is the burden of visually testing all of these pages when we need to make changes. Basically, to be 100% sure you haven't broken something in a client's custom setup when you change a shared stylesheet, image, etc., you would have to visually inspect every single page after any significant design change. I've developed some "feel" over time as to which changes can be comfortably made without breaking things, but it's still not a foolproof system by any means.
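Reduced to a sketch, the override lookup described above amounts to something like this (Java used purely for illustration; the original is an ASP.NET VirtualPathProvider, and the paths are made up):

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Resolve a requested site file: prefer the client's file under the
// "custom" tree, otherwise fall back to the default content.
class ContentResolver {
    private final Path siteRoot;

    ContentResolver(Path siteRoot) { this.siteRoot = siteRoot; }

    Path resolve(String clientId, String relativePath) {
        Path custom = siteRoot.resolve("custom").resolve(clientId).resolve(relativePath);
        return Files.exists(custom) ? custom : siteRoot.resolve(relativePath);
    }
}
```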
In my case I also have no control, beyond offering my opinion, over which visual/code customizations are sold, so many more of them have been sold and implemented than I would like.
This is not something that you want to solve with source control management, but within the architecture of your application.
I would come up with some sort of plugin-like architecture. Which plugins to use for which website then becomes a configuration issue, not a source control issue.
This allows you to use branches, etc. for the things they are intended for: parallel development of code between (or maybe even across) releases. Each plugin becomes a separate project (or subproject) within your source code system. This also allows you to combine all plugins and your main application into one Visual Studio solution to help with dependency analysis, etc.
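A hedged sketch of that plugin wiring (invented names; ServiceLoader is just one possible discovery mechanism):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Plugins implement a shared interface and are discovered at runtime; which
// ones a given website activates comes from that site's configuration, not
// from a source control branch. Names are invented for illustration.
interface SitePlugin {
    String name();
    void register();
}

class PluginLoader {
    static List<SitePlugin> loadFor(List<String> enabledNames) {
        List<SitePlugin> active = new ArrayList<>();
        for (SitePlugin plugin : ServiceLoader.load(SitePlugin.class)) {
            if (enabledNames.contains(plugin.name())) {
                active.add(plugin);
            }
        }
        return active;
    }
}
```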
Loosely coupling the various components in your application is the best way to go.
As mentioned before, source control does not sound like a good solution for your problem. To me it sounds better to have a single code base using a multi-tenant architecture. This way you get a lot of benefits in terms of managing your application, load on the service, scalability, etc.
Our product uses this approach: we have a lot of core functionality that is the same for all clients, custom modules that are used by one or more clients, and, at the core, the "customization" is a simple workflow engine that uses different workflows for different clients. Each client gets the core functionality, its own workflow(s), and an extended set of modules that are either client-specific or generalized for more than one client.
Here's something to get you started on multi-tenancy architecture:
Multi-Tenant Data Architecture
SaaS database tenancy patterns
Without more info, such as the types of client-specific customization, one can only guess how deep or superficial the changes are. Some simple/standard approaches to consider:
If you can keep a central config specifying the uniqueness from client to client
If you can centralize the business rules in one class or group of classes
If you can store the business rules in the database and pull them out based on the client (see the sketch below)
If the business rules can all be DB/SQL-based (each client having their own DB)
Overall, hard-coding differences based on client name/ID is very problematic, and keeping different code bases per client is costly (think of the complete testing/retesting time required for the 90% that doesn't change). I think more info is required to answer properly (give some specifics).
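For the database-driven variant in the list above, a sketch might look like this (hypothetical schema and names; SQL in a generic dialect):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// A rule is looked up for the specific client first, falling back to a
// DEFAULT row. Schema and names are hypothetical; adjust SQL per dialect.
class RuleRepository {
    private final Connection db;

    RuleRepository(Connection db) { this.db = db; }

    String ruleFor(String clientId, String ruleKey) throws SQLException {
        String sql = "SELECT rule_value FROM business_rules " +
                     "WHERE rule_key = ? AND client_id IN (?, 'DEFAULT') " +
                     "ORDER BY CASE client_id WHEN 'DEFAULT' THEN 1 ELSE 0 END";
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setString(1, ruleKey);
            ps.setString(2, clientId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;   // first row wins
            }
        }
    }
}
```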
Layer the application. One of those layers contains customizations and should be able to be pulled out at any time without affecting the rest of the system. Application- and DB-level "triggers" (quoted because they may or may not employ actual DB triggers) that call customer-specific code, or that are parameterized with customer keys, are very helpful.
Core should never be customized, but you must layer it in somewhere, even if it is simplistic web filtering.
What we have is a core database that holds the functionality that all clients get. Then each client has a separate database that contains the customizations for that client. This is expensive in terms of maintenance. The other problem is that when two clients ask for similar functionality, it is often implemented differently by the two separate teams. There is currently little done to share customizations between clients or to fold common ones back into the core application. Each client has their own application portal, so we don't have to worry about a change to one client affecting some other client.
Right now we are looking at moving to a process using a rules engine, but there is some concern that the performance won't be there for the number of records we need to be able to process. However, in your circumstances, this might be a viable alternative.
I've used some applications that offered the following customizations:
Web pages were configurable - we could drag fields out of view, position them where we wanted with our own name for the field label.
Add our own views or stored procedures and use them in data grids (along with an update proc) and reports. Each client would need their own database.
Custom mapping of Excel files to import data into system.
Add our own calculated fields.
Ability to run custom scripts on forms during various events.
Identify our own custom fields.
If your clients are larger companies, you're almost certainly going to need your own SDK, APIs, etc.