Our development departments grows and I want to force a stable master/trunk.
Up to now every developer can commit into the master/trunk. In future developers should commit into a staging area, and if all tests pass the code gets moved to the trunk automatically. If the test fails, the developer gets a mail with the failed tests.
We have several repositories: One for the core product, several plugins and a repository for every customer.
Up to now we run SVN and git, but switching all repos to git could be done, if necessary.
Which software could help us to get this done?
There a some articles on the web which explain how to use gerrit and jenkins to force a stable branch.
I am unsure if I need both, or if it is better to use something else.
Environment: We are 10 developers, and use python and django.
Question: Which tool can help me to force a stable master branch?
Update
I was on holiday, and now the bounty has expired. I am sorry. Thank you for your answers.
Question: Which tool can help me to force a stable master branch?
Having been researching this particular aspect of CI quasi-pathologically since our ~20 person PHP/ZF1-based dev team made the switch from SVN to Git over the winter (and I became the de-facto git mess-fixer), I can't help but share my experience with this particular aspect of continuous integration.
While obviously, having a "critical mass of unit tests running" in combination with a slew of conditionally parameterized Jenkins jobs, triggering infinitely more conditionally parameterized jobs, covering every imaginable circumstance would (perhaps) be the best and most proper way to move towards a Continuous Integration/Delivery/Deployment model, the meatspace resources required for such a migration are not insignificant.
Some questions:
Does your team have some kind of VCS workflow or, minimally, rules defined?
What percentage would you say, roughly, of your codebase is under some kind of behavioral (eg. selenium), functional or unit testing?
Does your team ( / senior devs ) actually have the time / interest to get the most out of gerrit's peer-based code review functionality?
On average, how many times do you deploy to production in any given day / week / month?
If the answers to more than one of these questions are 'no', 'none', or 'very little/few', then I'd perhaps consider investing in some whiteboard time to think through your team's general workflow before throwing Jenkins into the mix.
Also, git-hooks. Seriously.
However, if you're super keen on having a CI/Jenkins server, or you have all those basics covered already, then I'd point you to this truly remarkable gem of a blog post:
http://twasink.net/2011/09/16/making-parallel-branches-meet-regularly-with-git-and-jenkins/
And it's equally savvy cousin:
http://twasink.net/2011/09/20/git-feature-branches-and-jenkins-or-how-i-learned-to-stop-worrying-about-broken-builds/
Oh, and of course, the very necessary devopsreactions tumblr.
There a some articles on the web which explain how to use gerrit and jenkins to force a stable branch.
I am unsure if I need both, or if it is better to use something else.
gerrit is for coding review
Jenkins is a job scheduler that can run any job you want, including one:
compiling everything
launching sole unit test.
In each case, the idea is to do some guarded commit, ie pushing to an intermediate repo (gerrit, or one monitored by Jenkins), and only push to the final repo if the intermediate process (review or automatic build/test) passed successfully.
By adding intermediate repos, you can easily force one unique branch on the final "blessed" repo to which those intermediate referential will push to if the commits are deemed worthy.
It sounds like you are looking to establish a standard CI capability. You will need the following essential tools:
Source Version Control : SVN, git (You are already covered here)
CI server : Jenkins (you will need to build and run tests with each
check in, and report results. Jenkins is the defacto standard tool
used for this)
Testing : PyUnit
Artifact Repository : you will need a mechanism for organizing and
archiving the increments created with each build. This could be a
simple home grown directory based system. I have also used Archiva,
but there are other tools.
There are many additional tools that might be useful depending on your development process:
Code review : If you want to make code review a formal gate in your
process, Gerrit is a good tool.
Code coverage analysis : I've used EMMA in the past for Java. I am
sure that are some good tools for Python coverage.
Many others : a library of Jenkin's plugins that provide a variety of
useful tools is available to you. Taking some time to review
available plugins will definitely be time well spent.
In my experience, establishing the right cultural is as important as finding the right tooling.
Testing : one of the 10 principles of CI is "self testing builds". In
other words, you must have a critical mass of unit tests running.
Developers must become test infected. Unit testing must become a
natural, highly value part of each developers individual development
process. In my experience, establishing a culture of test infection
is the hardest part of deploying CI.
Frequent check-in : Developers and managers must organize there work
in a way that allows for frequent small check-ins. CI calls for daily
checkins. This is sometimes a difficult habit to establish.
Responsiveness to feedback : CI is about immediate feedback. The
developers must be conditioned to response to the immediate feedback.
If unit tests fail, the build it broken. Within 15 minutes of a CI
build breaking, the developer responsible should either have a fix
checked in, or have the original, bad check-in backed out.
Related
I'm trying to learn CICD concepts on my own, I don't understand how it helps me while I can easily push and pull my code from github and make my code into production
Continuous Integration is mainly a culture than being a tool. So, you need to understand why it's necessary that every developer on a team should integrate their code with the repository are least once a day.
Continuous Delivery also indicates challenges and best practices of delivering high-quality software as soon as possible. So, teams that want to decrease the risk and problems of integrating features and increase the speed of delivering new features should adopt the CI/CD culture.
To ensure that every code added to the repository will work and integrate with other parts, you need to check. For instance, you need to make sure that the project will be built successfully, the tests will be passed, and the new changes will not break any other parts, your code will pass some required code quality checks, and so on.
After that, you have to deploy somehow/publish the version of your software. This process usually has some steps and can be done manually in small teams/projects.
Based on the first rule of Continuous Integration, every team member should integrate the code with the repository multiple times a day. Since the frequency of this integration is high, it's not a good idea to do this process manually. There are always chances that somebody forgets to run the operation. That's the main reason why it's necessary to have an automatic CI/CD pipeline.
I am looking for some advice on the best practice for versioning software.
Background
Build automation with gradle.
Continuous integration with Jenkins
CVS as SCM
Semantic Versioning
Sonatype Nexus inhouse repo
Question
Lets say I make a change to come code. An automated CI job will pull it in and run some tests against it. If these tests should pass, should Jenkins update the version of the code and push it to nexus? Should it be pushed up as a "SNAPSHOT"? Should it be pushed up to nexus at all, or instead just left in the repository until I want to do a release?
Thanks in advance
I know you said you are using CVS, but first of all, have you checked the git-flow methodology?
http://nvie.com/posts/a-successful-git-branching-model/
I have little experience with CVS, but it can be applied to it, and a good versioning and CI procedure begins with having well defined branches, basically at least one for the latest release, and one for the latest in-development version.
With this you can tell the CI application what it is working with.
Didn't have time for a more detailed answer before, so I will extend now, trying to give a general answer.
Branches
Once you have clearly defined branches you can control your work flow. For example, it is usual having a 'master' and a 'develop' branches, where the master will contain the latest release, and develop will contain the next release.
This means you can always point to the latest release of the code, it is in the master branch, while the next version is in the develop branch. Of course, this can be more detailed, such as tagging the master branch for the various releases, or having an additional branch for each main feature, but it is enough having these two.
Following this, if you need to change something to the code, you edit the develop branch, and make sure it is all correct, then keep making changes until you are happy with the current version, and move this code to the master.
Tests
Now, how to make sure all is correct and valid? By including tests in your project. There is a lot which can be read about testing, but let's keep it simple. There are two main types of tests:
White box tests, where you know the insides of the code, and prepare the tests for the specific implementation, making sure it is built as you want
Black box tests, where you don't know how the code is implemented (or at least, you act as if you didn't), and prepare more generic tests, meant to make sure it works as expected
Now, going to the next step, you won't hear much about these two tests, and instead people will talk about the following ones:
Unit tests, where you test the smallest piece of code possible
Integration test, where you connect several pieces of code and test them
"The smallest piece of code possible" has a lot of different meanings, depending on the person and project. But keeping with the simplification, if you can't make a white box test of it, then you are creating an integration test.
Integration tests include things like database access, running servers, and take a long time. At least much longer than unit testing. Also, integration tests due to their complexity may require setting up a specific environment.
This means that while unit tests can be run locally with ease, integration tests may be so slow that people dislike running them, or may just be impossible to run in your machine.
So what do you do? Easy, separate the tests, so unit tests can be run locally after each change, while integration test are run (after unit tests) by the CI server after each commit.
Additional tests
Just as a comment, don't stop at this simplified vision of tests. There are several ways of handling tests, and some tests I wouldn't fit into unit or integration tests. For example it is always a good idea validating code style rules, or you can make a test which just deploys the project into a server, to make sure it doesn't break.
CI
Your CI server should be listening to commits, and if correctly configured it will know when this commit comes from a development version, a release or anything else. Allowing you to customise the process as you wish.
First of all it should run all the tests. No excuses, and don't worry if it takes two hours, it should run all the tests, as this is your shield against future problems.
If there are errors, then the CI server will stop and send a warning. Fix the code and start again. If all tests passed, then congratulations.
Now it is the time to deploy.
Deploying should be taken with care, always. The latest version available in the dependencies repository should, always, be the most current one.
It is nice having a script to deploy the releases into the repository after a commit, but unless you have some short of final validation, a manual human-controlled one at it, you may end releasing a bad version.
Of course, you may ignore this for development versions, as long as they are segregated from the actual releases, but otherwise it is best handling the final deployment by hand.
Well, it may be with a script or whatever you prefer, but it should be you who begins the deployment of releases, not the machine.
CI customisation
Having a CI server allows for much more than just testing and building.
Do you like reports? Then generate a test coverage report, a quality metrics one, or whatever you prefer. You are using a external service for this? Then send him the files and let it work.
Your project contains files to generate documentation? Build it and store it somewhere.
Recently my fledgling team (just two devs) attempted to implement continuous delivery practices as described by Jez Humble.
That is we ditched feature branches and pull requests (in git) and aimed to commit to the mainline branch every day at least.
We have a comprehensive unit and functional test suite for both the front and back end which is triggered automatically by Jenkins, when pushing to git.
We configured a feature switching app and resolved to use it for longer running features.
However, we encountered several problems and I'm curious to get a perspective from people who are successfully using this approach.
Delays due to Vetting/ Manual QA process
often tasks were small enough that we didn't think they warranted configuring feature switching, e.g. adding an extra field to a form, or changing some field labels. However, for various reasons that ticket would become blocked (e.g. some unforeseen aspect of the task needing UX input).
This would mean mainline ended up in a compromised state whilst we waited for external dependencies to unblock the task. Often we'd be saying "we can't deploy anything until Thursday, as that's when we can get an IA review"
The answer here is probably a much tighter vetting of which tasks are started. However, it was often difficult to completely anticipate every potential blocker. Maybe if a task becomes blocked additional dev should be done to add a feature switch, or revert the commits? Tricky situation.
Issues with code review during integration on mainline branch
Branches and pull requests give a nice breakdown of changes made on a single task. However, attempting CD we ended up with a mish-mash of unrelated commits on mainline, and the code reviewer having to somehow piece together commits that related to the task he was reviewing. And often there'd be a number of additional minor bug fixes, changes in response to review type commits at the end of a task. Essentially we couldn't figure out a clean way to code review work with this set up.
Generic code review issues
We used phabricator for a bit to do post-commit code reviews, but found it was flagging every single commit (some very minor) for code review, rather than showing us a list of changes per individual dev task. So it made reviewing the code onerous compared to git pull requests. Is there a better way?
We've now reverted back to short lived feature branching in git and raising pull requests to initiate code review and it's a nice set up, but if we could fix the issues we're having with non-feature branching CD, then we'd like to re-attempt that approach.
Could you automate the vetting process and/or run it before you integrate. If you automate the vetting process, for ex adding a form/button etc, you just need a suite of test to run post integration to validate that your mainline is not broken
You need to code review before integration i.e on the pull request . If issues are caught during a code review and fixed, the pull request is updated and the mainline is not messed.
Code review tools are very specific to a group of developers and the team needs.I suggest you play with a few code review tools to see which one suits your needs
Based on most of your questions, I would recommend running all your Vetting/code review etc before you merge.(You can do it in increments if the process is too cumbersome) and running an automated suite of tests for all the stuff that you want to do post integration.
If the process setup that you have in your team is too complicated to be finished in a day and can have multiple iterations , then it is worthwilefor you to evaluate a modified version of gitflow than a fork based CI model
If you use feature branches to work on tasks when you finish with a task you can either merge it back to the integration branch or create a pull request for the merge back to the integration branch.
In both case you get a merge commit, a summary of every change you made on the feature branch.
Do you need something more than this?
I'm tasked with helping to set up the process templates and check-in policies for my company's TFS 2008 installation.
Aside from three check-in policies (a check-in action must have comments against it, a code file must be peer-reviewed, there must be a work item associated with a check-in), I have been asked to consider and implement any others.
What are some of the most important or useful policies to enforce for version control?
The fewer the better.
Usually in an organization you want to ease the friction of check-in to ensure that you are encouraging developers to make frequent small discrete check-ins rather than checking out a load of stuff at once. Then again you want to ensure that you have a working codebase for everyone who needs it and are capturing the data that you need to improve your software delivery process.
Personally, a policy to enforce changeset comments and a work item association policy are ok - as they capture meta-data that is very easy to remember at the time but hard to find afterwards. It also encourages developers to get into the habit of having a work item to track all pieces of work - even experimental development or spikes.
The peer review process might be better performed using branching or another process rather than forcing a peer review on every check-in - however that depends on your process. Remember as well that you can have mandatory check-in notes in TFS to capture meta-data such as code reviewer. A check-in note is slightly different to a check-in policy and is often confused.
If you want read more discussion about check-in policies, take a look at a blog post I did on the balancing act a while ago. Also to hear some more discussion about check-in policies, I recorded a podcast recently with a fellow Team System MVP talking about their use of TFS and it might be interesting (Radio TFS, Using TFS with Ed Blankenship). Finally we also did a Radio TFS episode all about check-in policies in 2008 that might be of interest.
Don't break the build! Of course, finding an automated way to check on that and reject the check-in are the challenge.
Some rules that we follow in our company:
Commit all changes related to the same task at once (that will help review the changes and future rollbacks or merges if needed).
template based comments (eg: prefix all comments with a code that represents what was done, + for adds, - for removes, * for updates, ! for important modifications, etc).
Obviously always check-in code that compiles, and finished work to the main-line.
check-in daily unfinished work to branches.
The ones we use where I work on TFS are:
Code Analysis
This ensures that all the code was compiled on the devs machine before it was checked in
Work Item Association
If you've done a change there should have been an assigned task!
Last Build Successful
Using the TFS Build Server to check that the current code in source control compiled on an independant machine
Check In Comments (part of the TFS Powertools - http://msdn.microsoft.com/en-us/teamsystem/bb980963.aspx)
It's good to be able to see a summary of the check in without having to go to the work item(s)
Try to keep the number of developers working on the same branch small. That way the branch stays stable with respect to compilation, the unit tests, and regressions. It's a nightmare if a developer does a check in which compiles but his code breaks a key area of the application (such as login).
If you really have to have more than 10 developers checking code into the same branch, we've started an email policy where the developer checking in warns everyone that they're checking in, so that no one attempts to update their copy of the branch in the midst of a check in. Sometimes, we've had to have the converse, where we set aside an time in the date to prohibit check ins, so that updates are safe.
Frankly, the less policies, the better. The more policies you have, the greater the incentive for NOT using version control. What happens then is:
Code is developed on parallel, uncontrolled source control systems, and just the final revision goes to the official one.
People delay committing as much as possible, decreasing visibility of what they are doing to other developers.
People will actually avoid committing something if they can get away with it, and some will find a way to get away with it.
In fact, I think your three check-in policies are already too much. For instance:
Having code being peer-reviewed before check-in makes it much more difficult to have work-in-progress stored there. Instead, if the source control system allows it (and many do), control whether the source is peer reviewed or not. With some systems you can create a life cycle for a revision, with others you might create branches, and still others you might use tags.
Having a work-item associated with a check-in makes it impossible for developers to do exploratory programming, or having initiative on possible improvements. It stifles the developers. Instead, make sure that any revision going into integration tests or user acceptance tests, not to mention production itself, is associated with a work item.
This might sound anti-Enterprise, but it's just some things we have learned in a few decades of software development. Most enterprise organizations haven't been clued in to this, but, eventually, they will. So, you might go the very opposite way, but don't say no one ever told you.
I recommend the Agile Manifest, and, particularly, Lean Software Development for general principles.
Or, taking Stack Overflow design philosophy into account, make the system reward the behavior you want.
In my current job the supervisor's practice is to only check in production ready code. Most recently the project I was on involved work by 3 different developers with some file overlap. This meant manually integrating changes despite the fact that some changes took a day and then it was done. I wanted to see if this was a common practice and get suggestions on how to change this practice with the knowledge that many times my opinion means little in the grand scheme of things.
You can use various ways to handle this situation, depending on your source control system.
Private branches: Allow you to check in and work on code while you go, merging back and forth at appropriate times.
Shelvesets/pacakaged changesets: Allow you to store changesets and send them around for review - ensuring they're production ready before check in.
As to whether this is an appropriate way to work, we don't allow check-in to main branches without prior review. To pass review your code must pass various automated tools, and then must be acceptable to your peer reviewer. For some definitions of "production ready" - this is it. Therefore, we do something like what you do. However, we use private branches to ensure that check-ins can still be made while this is in progress, and that other check-ins don't have to interfere.
If production ready means tested in an integration environment, then it sounds like you may need staging branches or something similar.
Code that is checked in should be unit tested, but, to me, "production ready" implies that it's gone through integration and system testing. You can't do that until a code freeze, so I don't see how you can do that before every check in.
Start by switching away from VSS to something more reliable & feature-rich. See How to convince a company to switch their Source Control
Then apply known-good practices:
Check in often
Pick up others' changes often, to simplify merging
Use fast unit tests to make sure each change meets a minimum bar
Require that that the checked-in code always builds, and always passes tests.
Now you won't be "production ready" at this point: you will still need a couple weeks to test & fix before you can deploy. Getting that time down is awesome for you, and awesome for your customer, so invest in:
High quality automated acceptance tests.
wouldn't it be a good idea to have a testing branch of the repo that can have the non "production ready code" checked in after the changes are done and tested?
the main trunk should never have code checked in that breaks the build and doesn't pass unit tests, but branches don't have to have all those restrictions in place.
I would personally not approve of this because sometimes that's the best way to catch problem code with less experienced developers (by seeing it as they are working on it) and when you "check in early and often" you can rollback to earlier changes you made (as you were developing) if you decide that some changes you made earlier was actually a better idea.
I think it may be the version control we user, VSS in combination with a lack of time to learn the branching. I really like the idea of nightly check ins to help with development and avoid 'Going Dark'. I can see him being resistant to the trunks but perhaps building a development SS and when the code is production ready move it to production SS.
From the practices I have seen the term production quality is used as a 'frightener' to ensure that people are scared of breaking top of tree, not a bad thing to be honest because top of tree should always work if possible.
I would say that best practice is that you should only be merging distinct (i.e. seperate) functional components on the top of tree. If you have a significant overlap on deltas to the same source files I think this 'might' indicate that somewhere along the line the project management has broken down, and that those developers should have merged their changes to seperate integration branch before going in to the main line sources. An individual developer saying that they unit tested their stuff is irrelevant, because the thing they tested has changed!
Trying to solve integration problems on your main line codeline will inevitably stall other unrelated submissions.
Assuming that you are working in a centralized version control system (such as Subversion), and assuming that you have a concept of "the trunk" (where the latest well-working code lives):
If you work on new features in "features branches"/"experimental branches", then it's OK to commit code which is far from finished. (When the feature is done, you commit the well-behaving result into the "trunk".)
But you will not win a popularity contest if committing non-compiling/obviously non-working code into the "trunk" or a "release branch".
The Pragmatic Programmers have a book called Pragmatic Version Control using Subversion which includes a section with advice about branches.
Check in early and check in often for two main reasons -
1 - it might make it easier to integrate code
2 - in case your computer explodes your weeks of work isn't gone
An approach I particularly like is to have different life cycle versions in the depot. That is,for example, have a dev version of the code that is where the developers check in code that is in being worked on; then you could have a beta version, where you could add beta fixes to your code; and then a production version.
There is obvious overhead in this approach, such as the fact that you will have a larger workspace on you local machine, the fact that you will need need to have a migration process into place to move code from one stage to the next (which means a code freeze when doing the integration testing that goes with the migration), and that depending on the complexity of the project(s) you might need to have tools that change settings, environment variables, registry entries, etc.
All of this is a pain to set up, but you only do it once, and once you have it all in place, makes working on different stages of the code a breeze.
#bpapa
Nightly backups of work folders to servers will prevent losing more than a days work.
#tonyo
Let's see the requirement documents were completed the day after we finished coding. Does that tell you about our project management?
We are a small shop so while you would think change is easy there are some here that are unbending to the old ways.