What exactly is considered "breaking the build?"

In a CI environment, what exactly is considered a broken build?
There are several answers I can imagine (any combination of compiles, tests pass, metrics are in range, documentation exists, etc.), but I am not sure which of these are canonical.
For example, just today it happened to me that I checked in all my code changes but forgot to commit the Visual Studio project file, thus breaking the unit tests (even though I literally triple-checked my commit, as it's a public OSS project on Google Code).
I was easily able to fix this in under a minute after my first commit, but should I consider myself a buildbreaker now?
How do you configure your CI environment: is every revision built, or only the newest revision after each completed build, or do you use time-based checks for new revisions?

Ideally, you have:
An automated script that is scheduled to run each night to build the application from source code.
Scripts to copy the binaries to a directory or set of directories, from which another script can either deploy the application if it is running in your environment, or create deliverables for a customer.
An automated suite of tests that runs and verifies that all components pass.
An automated script that verifies the build was built correctly.
An automated script or monitoring system that sends out an alert if the verification script fails.
When an alert is generated by the above process, then that is considered "breaking the build".
But since procedures and processes vary from company to company, there may be alternate definitions. In some places it may be breaking the unit tests; in others, it may be checking in source code that leaves the code unable to compile.
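For illustration, the whole nightly chain could be driven by one script along these lines (a rough sketch only; every command, path, and stage name is a placeholder, not a canonical implementation):

    // Hypothetical nightly driver: build, copy, test, verify, alert.
    def run(String cmd) {
        def proc = cmd.execute()
        proc.waitForProcessOutput(System.out, System.err)
        return proc.exitValue() == 0
    }

    // Ordered map: each stage runs only if the previous ones succeeded.
    def stages = [
        'build'  : { run('ant -f build.xml dist') },
        'copy'   : { run('cp -r dist/ /srv/deliverables/nightly') },
        'tests'  : { run('ant -f build.xml test') },
        'verify' : { run('sh verify-build.sh /srv/deliverables/nightly') },
    ]

    def failed = stages.find { name, step -> !step() }
    if (failed != null) {
        // This is where the alerting hook goes (mail, chat, monitoring, ...).
        println "BUILD BROKEN at stage: ${failed.key}"
        System.exit(1)
    }
    println 'Build OK'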

Breaking the build is committing any change that makes it impossible (or possible but unwise) to deploy the project.
Fixing a broken build does not repair the broken build, but makes it possible to create a new non-broken build.
I configure my CI server to create a minimal build on every commit, and a maximal build at regular intervals. The interval depends on the number of people working on the project (more people means more commits) and on the build duration (you can run your unit test suite every time, but the 30-minute acceptance test suite only once or twice a day).
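As a concrete illustration, assuming a Jenkins-style server with the Pipeline plugin (cron expressions and commands are placeholders), the minimal per-commit job could look like this; the maximal job would differ only in its trigger and test targets:

    pipeline {
        agent any
        // Minimal per-commit build: poll for new commits every few minutes.
        // The "maximal" job would instead use something like cron('H 2 * * *').
        triggers { pollSCM('H/5 * * * *') }
        stages {
            stage('Fast checks') {
                steps {
                    sh 'ant compile test-unit'   // placeholder commands
                }
            }
        }
    }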

Breaking a build is preventing any user who relies on the same standard tools and codebase as your CI environment from getting a compiling and running system.
If your coworkers cannot compile the system when they're up to date, because some config is missing, the build is broken, isn't it?
If your coworkers cannot be confident that the unit tests pass, because one of them is flaky, the build is broken, isn't it?
If you have automated performance tests and your project has to be optimized, I would go as far as to say that if your code doesn't run fast enough, you've broken the build (but that is arguable).
I would not be so strict about code coverage or other metrics, for example.
Breaking the build can happen. CI is just there to make sure it doesn't happen too much on the day you're supposed to ship ;)

For us, "breaking the build" is any time the test suite fails after a new commit.
So, in your instance, yes, you would have broken the build (according to our company, at least).

As long as you broke the build because of a simple human error (like forgetting to commit something), and as long as this is an exception and not the rule, I would say it's OK, provided you take care to fix it quickly :)
On the other hand, if somebody breaks builds on a regular basis by not executing the complete build locally on his/her machine before committing, then this is a sign of an ill-disciplined team member who doesn't really care for other team members and for the development process.
My experience is that one effective way to make people more careful is to set up your CI server to send an email when (and only when) the status of the build changes, with the "culprit" on the To: list and the rest of the team on the Cc:. I guess you could call it a "shame factor" ;)
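In a Jenkins Pipeline job, for example, this roughly corresponds to a post { changed { ... } } block (a sketch; addresses and build commands are placeholders, and a freestyle job can achieve the same with the Email-ext plugin's triggers):

    pipeline {
        agent any
        stages {
            stage('Build') { steps { sh 'ant clean dist test' } }  // placeholder
        }
        post {
            // "changed" runs only when the result differs from the previous
            // build, i.e. on breakage and on recovery, not on every failure.
            changed {
                mail to: 'culprit@example.com',
                     cc: 'team@example.com',
                     subject: "Build ${currentBuild.currentResult}: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
                     body: "Build status changed. Details: ${env.BUILD_URL}"
            }
        }
    }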

A broken build is anything that doesn't pass the automated test suite.
You're a build breaker. ;) That's not really a big deal as long as you fix it fast. The whole point of a CI environment is to catch bugs, not make people afraid to commit.
My company builds the tip of each branch that is either in production or next up for production. We do this on every commit and each day at 4am. We're using Mercurial, so commit here means pushing changes to the integration repo.

Related

Build Flow vs Build Pipeline

I'm trying to split up a few Jenkins jobs using the Build Flow plugin so that instead of three monolithic jobs, we have three "starting points" that then use the DSL to trigger downstream jobs. I chose Build Flow over the Build Pipeline plugin because with the latter it seemed a lot harder to share jobs between different pipelines ( ie, sharing the workspace of the multiple starting jobs with a single compile job ).
Previously, I had three jobs set up: Project-PR, Project-DEV, and Project-PROD.
Project-PR would build whenever a pull request happened in GitHub, and would just run a smaller subset of our unit tests, so that we could get quick verification that the PR is okay to merge.
Project-DEV would build whenever a feature branch was merged in GitHub into the main development branch, as well as having the ability to be manually triggered and given a different branch to pull. It would run the full suite of unit tests -- basically a sanity check that everything is still good. Then it would compile and minify, push to a QA environment for testing, and then run the full suite of integration tests against that QA environment. This step was configured as a parametrized build, with the parameter being the name of the branch to pull, test, and push. It would push to and set up a QA environment specific to that branch, so that we could QA multiple features without having to merge to development ( ie, feature-one.qa.example.com, feature-two.qa.example.com ).
Project-PROD would only ever be manually triggered, and would do the full unit and integration test suite, compile and minify the front-end code ( Less, JS, and CSS ), and push the built code into a special "release branch" in GitHub that can then be deployed -- we haven't quite reached the point of Jenkins being in charge of deployment.
Now, what I wanted to set up was to split the subtasks into their own jobs, so that it'd be easy to set up new jobs without having to copy and paste all the build steps ( or copying the job and changing all the things that need to be unique ). This would let us do things like create a copy of Project-DEV, but switch out the last job for one that deploys to a staging environment set up in the cloud. Or easily create a job that could report test results to a third-party source, ie copy the results to a shared network folder or something. Or any number of things. The goal is basically to use these subtask jobs as building blocks to let us build more complicated jobs, while also making it easier to update how one portion of the build works ( for example, maybe we switch to a different technology for compiling, which might change how Jenkins would compile the code ).
For example, the Project-PR would be split into the following:
Project-PULL -> Project-SetupBuildEnv -> Project-PartialUnitTests
(Build Flow)    (Normal Job)             (Normal Job)
The SetupBuildEnv job would just pull down any NPM or Composer requirements, and set up the directories required for testing and building. PartialUnitTests then runs, and reports its results back up to the Build Flow job.
The Project-DEV could be split up like so:
Project-DEV -> Project-SetupBuildEnv -> Project-FullUnitTests -> Project-Compile -> Project-Minify -> Project-DeployQA -> Project-FullIntegrationTests
This way, the parts of the build process that are shared ( in this case, Project-SetupBuildEnv ) can be easily shared between jobs, reducing duplication, and making it easier to update a step in the build process without having to remember EVERY job that uses that step.
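For concreteness, the flow definition for the Project-DEV starting point might look something like this in the Build Flow plugin's Groovy DSL (a sketch; the BRANCH parameter wiring is my assumption, based on the parametrized build described above):

    // Flow DSL for the Project-DEV starting point: each build() call triggers
    // the named downstream job and waits for it to complete before moving on.
    build("Project-SetupBuildEnv")
    build("Project-FullUnitTests")
    build("Project-Compile")
    build("Project-Minify")
    // Pass the branch parameter through to the QA deployment job.
    build("Project-DeployQA", BRANCH: params["BRANCH"])
    build("Project-FullIntegrationTests")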
Right now, I'm using the Shared Workspace plugin so that all the steps use the same workspace. However, I'm running into an issue with that: it's not actually using one workspace. What's happening is that the Build Flow job will get a directory ( eg: /sharedspace/shared_one ), and download the code from GitHub into there. Then it will trigger the DSL, which starts up the 'SetupBuildEnv' job. But instead of working inside the same directory, it will get a directory with a name like "/sharedspace/shared_one#2", and run the build setup task in there. Then when it goes to do the third step ( unit testing ), it fails, because now it's got a third directory ( /sharedspace/shared_one#3 ), but that directory didn't have the setup run, so the required node and composer modules are missing. What's weird is that it looks like the Shared Workspace plugin is copying the first shared workspace to another directory and incrementing a counter ( the #N part of the directory name ) and giving that to the other jobs to work in.
So, question time:
is there a way to fix the Shared Workspace plugin so that it's actually only using one directory for each job?
if not, is it possible to have the Clone Workspace plugin take an argument, so I can specify what archived workspace to use instead of using the dropdown?
another possibility: would using the Shared Workspace plugin, but with the "Local subdirectory for repo (optional)" option in the advanced git job options to specify the directory, work?
failing all that, is there some other way to set up a build pipeline that can share jobs with other pipelines that I've missed out on?
In my experience, even if you do get this working, it might not be a scalable way to go longer term. We've found the Shared Workspace plugin to be a thoroughly bad idea for long/complex builds (for reasons similar to yours - but also: scaling across dozens of slaves suddenly becomes hard). Arguably the idea is slightly against the spirit of modern scalable CI.
I'd instead delegate more to your build tools, be they Maven / Gradle, Ant, even Grunt, whatever. If you want to keep these builds truly modular, but can't afford to rebuild at each step (we decided full independence was worth wasting a few minutes per build), then perhaps look at creating useful artefacts at key stages - in your case, minified-asset TARs, library JARs, maybe webjars, or whatever - and deploy them to a (Maven?) repo.
Later build steps in your pipeline can quickly, easily, and repeatably pull the latest (or named version) assets from this centralised repo, and continue with the build process.
An alternative (with similarities) is to build one or more assets, but only promote them after increasing numbers of tests are run, which can be done in separate builds coordinated by your build flow, using the Promoted Builds plugin etc.
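As a sketch of the deploy-to-a-repo idea, assuming a Gradle build with the maven-publish plugin (coordinates, versioning scheme, paths, and the repository URL are all placeholders):

    plugins { id 'maven-publish' }

    publishing {
        publications {
            // Publish the minified front-end assets as a versioned TAR so
            // that later pipeline stages can pull exactly this artefact.
            assets(MavenPublication) {
                groupId    = 'com.example.project'   // placeholder coordinates
                artifactId = 'frontend-assets'
                version    = '1.0.' + (System.getenv('BUILD_NUMBER') ?: '0')
                artifact('build/dist/assets.tar.gz')
            }
        }
        repositories {
            maven { url = 'https://repo.example.com/releases' }  // placeholder
        }
    }

Later steps then depend on com.example.project:frontend-assets at a specific version rather than on a shared workspace.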

jenkins continuous delivery with shared workspace

Background:
We have one Jenkins job (Production) to build a deliverable every night. We have another job (ProductionPush) that pushes out the deliverable over a proprietary protocol to production machines the next day. This is because some production machines are only available during certain hours during the day (It also gives us a chance to fix any last-minute build breaks). ProductionPush needs access to the deliverable built by the Production job (so it needs access to the same workspace). We have multiple nodes and concurrent builds (and thus unpredictable workspaces) and prefer not to tie the jobs to a fixed node/workspace since resources are somewhat limited.
Questions:
How to make sure both jobs share the same workspace and ensure that ProductionPush runs at a fixed time the next day only if Production succeeds -- without fixing both jobs to run out of the same node/workspace? I know the Parameterized Trigger Plugin might help with some of this but it does not seem to have time delay capability and 12 hours seems too long for a quiet period.
Is sharing the workspace a bad idea?
Answer 2: Yes, sharing a workspace is a bad idea. There is the possibility of file locks. There is the issue of the workspace being wiped out. Just don't do it...
Answer 1: What you need is to Archive the artifacts of the build. This way, the artifacts for a particular build (by build number) will always be available, regardless of whether another build is running or not, or what state the workspaces are in.
To Archive the artifacts
In your build job, under Post-build Actions, select Archive the artifacts
Specify what artifacts to archive (you can use a combination of below)
a) You can archive all: *.*
b) You can archive a particular file with wildcards: /path/to/file_version*.zip
c) You can ignore the intermediate directories like: **/file_version*.zip
To avoid storage problems with many artifacts, at the top of the configuration you can select Discard Old Builds, click the Advanced button, and play around with Days to keep artifacts and Max # of builds to keep with artifacts. Note that these two settings do not control how long the actual builds are kept (other settings control that).
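If you use Pipeline jobs rather than freestyle ones, roughly the same configuration might look like this (a sketch; the file pattern and retention numbers are just examples):

    pipeline {
        agent any
        options {
            // Counterpart of "Discard Old Builds" with artifact retention limits.
            buildDiscarder(logRotator(artifactDaysToKeepStr: '14',
                                      artifactNumToKeepStr: '20'))
        }
        stages {
            stage('Build') {
                steps {
                    sh 'ant dist'                                    // placeholder
                    archiveArtifacts artifacts: '**/file_version*.zip'
                }
            }
        }
    }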
To access artifacts from Jenkins
In the build history, select any previous build you want.
In addition to SCM changes and revisions data, you will now have a Build Artifacts link, under which you will find all the artifacts for that particular build.
You can also access them with Jenkins' permalinks, for example
http://JENKINS_URL/job/JOB_NAME/lastSuccessfulBuild/artifact/ and then the name of the artifact.
To access artifacts from another job
I've extensively explained how to access previous artifacts from another deploy job (in your example, ProductionPush) over here:
How to promote a specific build number from another job in Jenkins?
If your requirements are to always deploy latest build to Production, you can skip the configuration of promotion in the above link. Just follow the steps for configuration of the deploy job. Once you have your deploy job, if it is always run at the same time, just configure its Build periodically parameters. Alternatively, you can have yet another job that will trigger the deploy job based on whatever conditions you want.
In either case above, if your Default Selector is set to Latest successful build (as explained in the link above), the latest build will be pushed to Production.
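Sketched with the Copy Artifact plugin's Pipeline step (job names, file patterns, and the deploy script are placeholders):

    // Inside the ProductionPush job: fetch the artifacts archived by the most
    // recent successful Production build, then hand them to the deploy script.
    copyArtifacts projectName: 'Production',
                  selector: lastSuccessful(),
                  filter: '**/file_version*.zip'
    sh './push-to-production.sh file_version*.zip'   // placeholder deploy step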
Not sure archiving artifacts is really a good idea. A staging repository might be better, as it enables cross-functional teams to share artifacts across different builds when required, by tweaking the Maven settings.xml file.
You really want a deployable (ear/war) as the thing that gets built, tested, then promoted to production once confidence is high with the build.
Use a build number on your deployable (major.minor.buildnumber). This is the thing you promote to production, providing your tests can be relied upon. Don't use a hyphen to separate the minor version from the build number, as that forces Maven to perform a lexical comparison... a decimal point will force a numeric comparison, which will give you far fewer headaches.
Also, you didn't mention your target platform, but using the Maven APT/RPM plugin to push an APT/RPM package to an APT/YUM repo that's available to a production box (AFTER successful testing!) would be a good fit, in line with industry practice.

Where to store automated tests in source control

My company recently started getting up to date on the usage of TFS, source control, and branching strategies. Our current branching strategy is the basic 'Dev > Main > Release' method, which works well for our small team. However, the issue lies with our automated tests. All of our integration tests and UI tests are written in C# and executed in a nightly build process. In an effort to keep the source tree clean and well kept, where exactly should we place the automated test code?
You could place the automated tests inside each branch. Your automated tests can be merged and treated just like regular code, since they will change along with new development.
The other option could be to place them where your project's build types and build files reside.
It is important to make sure it is checked into source control.

Deployment: Build on production machine or not?

I have an automated deployment process for a Java app where currently I'm building the app on a build machine, checking the build into SCM, and having the production machine pull the build artifact (which is a zip) and, through Ant, move the class and config files to where they're supposed to be.
I've seen other strategies where the production machine pulls the source from scm and builds it itself.
The thing I don't like about the former approach is that if I'm building for production instead of staging or dev or whatever, I have to manually specify the environment in the build. If the target server were in charge of this, though, there would be less thought and friction involved in the build. However, I also like using the exact same build as was tested on staging.
So, I guess my question is: is it preferable to copy the already-built, already-tested app to production, or to have production build the app again once it's been tested?
If you already have an automated build system that creates a testing build, how hard is it to extend it so that it builds both a testing build and a production build at the same time? This way you get the security of knowing they were built from the exact same checked-out source, and you have less manual labor. I really cringe at the idea of checking built artifacts into SCM!
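One way this might look, sketched as a Jenkins Pipeline around an Ant build that takes an env property (all names and commands are assumptions):

    pipeline {
        agent any
        stages {
            stage('Checkout') { steps { checkout scm } }
            stage('Package') {
                steps {
                    // Both packages come from the same checked-out revision,
                    // so the production artifact matches what was tested.
                    sh 'ant -Denv=staging dist'
                    sh 'ant -Denv=production dist'
                }
            }
            stage('Archive') {
                steps { archiveArtifacts artifacts: 'dist/**/*.zip' }
            }
        }
    }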
I always prefer keeping as little as humanly possible on a production server - less to update, less to go wrong.

Controlled Integration of Changes with Continuous Integration

I have an NSIS installer that we previously built using NAnt scripts that copy some files around and run makensis.exe via an exec task to build the installer exe. After the NAnt script completes, I have the complete structure for our CD and also our download.
I was just doing a get from SourceSafe onto an unused desktop and using it as a build box, compiling there. Sometimes we would have a couple of files checked in that fix something critical. In those cases I would go to the build box and very selectively get only those files, to avoid getting other changed files that we aren't ready to release yet. Basically, I am able to allow development to continue and selectively include certain changed files in the installer for release.
Now we no longer have a free box, and need to build from our server. So I am setting up CI Factory so that the developer can kick the build off without remoting into the server. The one issue I am struggling with is the best way to continue to allow this selective change control to occur. The default concept of CI that CI Factory implements is fine for internal development "head". However, I also want to set up a CCNet project that is run only on demand, via a Force Build, for this "public release" type of build.
This is what I've brainstormed so far, without being sure how well this will work, if at all (I'm still figuring out what CCNet and CI Factory are all about). The "public release" CCNet project config/build would be set up such that it would not get latest, and modifications would not trigger a build. Since the other CCNet project (we'll call it the "CI project") uses the default CI methodology of getting latest when changes are detected, these two projects can't share the same working directory. So the "public release" would need a different working copy, so that its files won't get updated when the CI project's build is triggered. The developer would need to remote into the server, open VSS, selectively do a get into the "public release" working copy, and then force a build through CI Factory.
The disadvantages I see with this are:
1) Having to remote in to selectively do gets.
2) I have no idea how to allow a single CI Factory project to have two different working copies of the Product folder, so that each project configuration block has its own.
3) I'm afraid of what kind of strangeness this might cause. I'm not quite sure yet how to specify a source control block in a CCNet project config block but prevent it from doing a get latest when it builds. I'm still gradually figuring out which things are in scripts and can be easily taken out without breaking other things, versus what is not meant to be mucked around with and/or is not configurable.
I would really like to hear about how others deal with this issue of selectively releasing changes, if you have a similar situation. I am constrained to VSS, so my immediate need is to solve this with that in mind, but at the same time I'd be interested in hearing how you manage this with other source control systems. I guess you would probably have a branch that is your latest development branch, and then merge changes into the trunk whenever you want to release them? I really don't trust VSS for branching/merging, and I think the branching concepts might be a little too much overhead and learning curve for this shop. Like I said though, stories with other source control systems would be useful future knowledge for me.
Thanks in advance.
You need a branching structure in your repository to facilitate this. Something like the release branch method. Only select individuals can commit to this branch (or have a release/stable branch for that). Set up your manual CI launches to pull from the release branch and build release nightlies, promoting to milestone or final from there. I don't like the idea of manually modifying things on your build machine. Set up the changes in version control, in a safe place to prepare your release, and let CI build from there, but manually triggered.
Check out these branching patterns. I suggest C3, codeline-per-release, often called release branching.
Here's an article on VSS branching that includes a link on merging.
This question looks similar.
Maybe you could move to another source control system with better support for this kind of thing. Any suggestions from MS people out there?