Best build process solution to manage build versions - version-control

I run a rather complex project with several independent applications. These, however, use a couple of shared components, so I have a source tree looking something like this:
My Project
    Application A
    Shared1
    Shared2
    Application B
    Application C
All applications have their own MSBuild script that builds the project and all the shared resources it needs. I also run these builds on a CruiseControl-managed continuous integration build server.
When the applications are deployed they are deployed on several servers to distribute load. This means that it’s extremely important to keep track of what build/revision is deployed on each of the different servers (we need to have the current version in the DLL version, for example “1.0.0.68”).
It’s equally important to be able to recreate a revision/build that has been built, so we can roll back if something didn’t work out as intended (oh yes, that happens...). Today we’re using SourceSafe for source control, but it’s possible to change that if we can present good reasons (SourceSafe is actually working OK for us so far).
Another principle we try to follow is that only code that has been built and tested by the integration server is deployed further.
"CruiseControl Build Labels" solution
We had several ideas on how to solve the above. The first was to have the continuous integration server build and locally deploy the project and test it (it does that now). As you probably know, a successful build in CruiseControl generates a build label, and I guess we could somehow use that to set the DLL version of our executables (so build label 35 would create a DLL version like “1.0.0.35”)? The idea was also to use this build label to label the complete source tree. We could then check out by that label and recreate the build later on.
The reason for labeling the complete tree is to include not only the actual application code (which is in one place in the source tree) but also all the shared items (which are in different places in the tree). So a successful build of “Application A” would, for example, label the whole tree with “ApplicationA35”.
There might, however, be an issue when trying to recreate this build and setting the DLL version before deploying, as we then no longer have access to the CruiseControl-generated build label. If all CruiseControl build labels were unique across all the projects we could use just the number for labeling, but that’s not the case (both Application A and B could be on build 35 at the same time), so we have to include the application name in the label; hence the SourceSafe label “ApplicationA35”. How can I then recreate build 34 and set the DLL version numbers to 1.0.0.34 once we’ve built build 35?
"Revision number" solution
Someone told me that Subversion, for example, creates a revision number for the entire source tree on every check-in. Is this the case? Does SourceSafe have something similar? If this is correct, the idea is to grab that revision number when getting the latest sources to build on the CruiseControl server. The revision number could then be used to set the DLL version number (to, for example, “1.0.0.5678”). I guess we could then get that specific revision from Subversion if needed, and that would include the application and all the shared items, making it possible to recreate a specific version from the past. Would that work, and could this also be achieved using SourceSafe?
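For illustration, Subversion does keep a single repository-wide revision number that increments on every commit. A minimal sketch of grabbing it during an MSBuild build, assuming the MSBuild Community Tasks library and its SvnVersion task (the import path and the "1.0.0." prefix are illustrative):

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="$(MSBuildExtensionsPath)\MSBuildCommunityTasks\MSBuild.Community.Tasks.Targets" />

  <Target Name="GetRevision">
    <!-- Read the revision number of the checked-out working copy -->
    <SvnVersion LocalPath=".">
      <Output TaskParameter="Revision" PropertyName="Revision" />
    </SvnVersion>
    <!-- The revision would then go into the DLL version, e.g. 1.0.0.5678 -->
    <Message Text="Version would be 1.0.0.$(Revision)" />
  </Target>
</Project>
```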
Summary
So the two main requirements are:
Be able to track build/revision number of the build and deployed DLL.
Be able to rebuild a past revision/build, set the old build/revision number on the executables of that build (to comply with requirement 1).
So how would you solve this? What would be your preferred approach, and how would you implement it (or do you have a totally different idea)? **Please give detailed answers.**
Bonus question: What is the difference between a revision number and a build number, and when would one really need both?

Your scheme is sound and achievable in VSS (although I would suggest you consider an alternative; VSS is really an outdated product).
For your "CI" Build - you would do the Versioning take a look at MSBuild Community Tasks Project which has a "Version" tasks. Typically you will have a "Version.txt" in your source tree and the MSBuild task will increment the "Release" number while the developers control the Major.Minor.Release.Revision numbers (that's how a client of mine wanted it). You can use revision if you prefer.
You then would have a "FileUpdate" tasks to edit the AssemblyInfo.cs file with that version, and your EXE's and "DLL's" will have the desired version.
Finally the VSSLabel task will label all your files appropriately.
For your "Rebuild" Build - you would modify your "Get" to get files from that Label, obviously not execute the "Version" task (as you are SELECTING a version to build) and then the FileUpdate tasks would use that version number.
Bonus question:
These are all "how you want to use them". I would use the build number for, well, the build number; that is what I'd increment. If you are using CI you'll have very many builds, the vast majority of which you'll never intend to deploy anywhere.
The major and minor numbers are self-evident, but I've always used the revision as a "hotfix" indicator. Say I intend to have a "1.3" release, which would in reality be a product with, say, version 1.3.1234.0. While working on 1.4 I find a bug and need a hotfix, which goes out as 1.3.2400.1. Then when 1.4 is ready, it would be, say, 1.4.3500.0.

I need more space than responding in comments directly allows...
Thanks! Good answer! What would be the difference, and what would be better about solving this using Subversion, for example? - Richard Hallgren
The problems with VSS have nothing to do with this example (although I believe the "Labeling" feature is implemented inefficiently...).
Here are a few of the issues with VSS:
1) Branching is basically impossible
2) Shared checkout is generally not used (I know of a few people who have had success with it)
3) Performance is very poor - it is extremely "chatty"
4) Unless you have a very small repository, it is completely unreliable, to the point that for most shops it's a ticking time bomb.
For 4, the problem is that VSS stores the entire repository as "flat files" in the file system. When the repository gets over a certain size (I believe 4 GB, but I'm not confident in that figure) there is a chance of "corruption". As the size increases, the chance of corruption grows until it becomes an almost certainty.
So take a look at your repository size, and if you are getting into the gigabytes, I'd strongly recommend you begin planning to replace VSS.
Regardless, a Google search for "VSS Sucks" gives 30K hits... I think if you did start using an alternative, you'd realize it's well worth the effort.

Have CC.NET label the successful builds.
Have each project in the solution link to a common SolutionInfo.cs file which contains the assembly and file version attributes (remove them from each project's AssemblyInfo.cs).
Before the build, have CruiseControl run an MSBuild regex replace (from MSBuild Community Tasks) to update the version information using the CC.NET build label (passed in as a parameter to the MSBuild task).
Build the solution, run tests, FxCop, etc.
Optionally, revert the solution info file.
The result is that all assemblies in the CC.NET-published build have the same version number, which corresponds to a label in the source code repository.
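A minimal sketch of that regex-replace step, assuming the FileUpdate task from MSBuild Community Tasks and the $(CCNetLabel) property that CC.NET passes to its MSBuild task (the "1.0.0." major/minor prefix is an assumption):

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="$(MSBuildExtensionsPath)\MSBuildCommunityTasks\MSBuild.Community.Tasks.Targets" />

  <Target Name="StampSolutionInfo">
    <!-- Rewrite both AssemblyVersion and AssemblyFileVersion in the shared file -->
    <FileUpdate Files="SolutionInfo.cs"
                Regex='Assembly(File)?Version\("[^"]*"\)'
                ReplacementText='Assembly$1Version("1.0.0.$(CCNetLabel)")' />
  </Target>
</Project>
```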

UppercuT can do all of this with a custom packaging task to split the applications up. And to get the version number of the source, you might think about Subversion.
It's also insanely easy to get started.
http://code.google.com/p/uppercut/
Some good explanations here: UppercuT

Related

How should Eclipse RCP target platforms be configured for multiple products?

In our Eclipse RCP (Kepler with Tycho/Nexus) project we create a custom application with two sub-parts, organized in the features Planning and Production. They are delivered as three different products: standalone Planning, standalone Production, and a combined Planning-Production.
The question now is what the target platform (TP) for the builds should look like.
Should the TP be set up per build? Meaning we would have four TPs:
Build TP for creating plugins
Planning Release TP
Production Release TP
Planning-Production Release TP
Or should the TP be treated like a repository? Meaning there is just one for the whole project, and depending on the context, the build gets its dependencies either from the TP or from the local source code.
(To be honest, we have the first solution at the moment and my gut tells me it is a bad idea. Although my gut is good enough for me, solution architects tend to ignore such input, so I am also looking for arguments as to why one or the other solution is better or worse.)
Or should the TP be looked at like a repository?
That's what we did in a similar situation (though we were not using Tycho, I don't think that changes things).
As for the first option:
Primarily, I don't see any point. Since you have a combined release, the Production and Planning dependencies need to be compatible anyway, and you'd certainly want to have the same dependencies in development and release.
When one of the dependencies is updated or removed, you only want to change it in one place (though this could be worked around using features).
No need to switch between target platforms to build a release (though this may be irrelevant depending on how exactly you build releases).

What to put under version control?

Almost any IDE creates lots of files that have nothing to do with the application being developed; they are generated and maintained by the IDE so it knows how to build the application, where the version control repository is, and so on.
Should those files be kept under version control along with the files that really have something to do with the application (source code, the application's configuration files, ...)?
The thing is: in some IDEs, if you create a new project and then import it into the version control repository using the version control client/commands embedded in the IDE, all those files are sent to the repository. And I'm not sure that's right: what if two different developers working on the same project want to use two different IDEs?
I want to keep this question agnostic avoiding references to any particular IDE, programming language or version control system. So this question is not exactly the same as these:
SVN and binaries - but this talks about binaries and SVN
Do you keep your build tools in version control? - but this talks about build tools (e.g. putting the JDK under version control)
What project files shouldn’t be checked into SVN - but this talks about SVN and DLLs
Do you keep your project files under version control? - very similar (haven't found it before), thanks VonC
Rules of thumb:
Include everything which has an influence on the build result (compiler options, file encodings, ASCII/binary settings, etc.)
Include everything to make it possible to open the project from a clean checkout and being able to compile/run/test/debug/deploy it without any further manual intervention
Don't include files which contain absolute paths
Avoid including personal preferences (tab size, colors, window positions)
Follow the rules in this order.
[Update] There is always the question of what should happen with generated code. As a rule of thumb, I always put it under version control. As always, take this rule with a grain of salt.
My reasons:
Versioning generated code seems like a waste of time. It's generated, right? I can get it back at the push of a button!
Really?
If you had to bite the bullet and generate the exact same version of some previous release without fail, how much effort would it be? When generating code, you not only have to get all the input files right, you also have to turn back time for the code generator itself. Can you do that? Always? As easily as it would be to check out a certain version of the generated code if you had put it under version control?
And even if you could, could you ever be sure that you didn't miss something?
So on one hand, putting generated code under version control makes sense, since it makes it dead easy to do what VCSs are meant for: go back in time.
Also it makes it easy to see the differences. Code generators are buggy, too. If I fix a bug and have 150'000 files generated, it helps a lot when I can compare them to the previous version to see that a) the bug is gone and b) nothing else changed unexpectedly. It's the unexpected part which you should worry about. If you don't, let me know and I'll make sure you never work for my company ever :-)
The major pain point of code generators is stability. It won't do for your code generator to just spit out a random mess of bytes every time it runs (well, unless you don't care about quality). Code generators need to be stable and deterministic. You run them twice with the same input and the output must be identical down to the least significant bit.
So if you can't check in generated code because every run of the generator creates differences that aren't really there, then your code generator has a bug. Fix it. Sort the code when you have to. Use hash maps that preserve order. Do everything necessary to make the output non-random. Just like you do everywhere else in your code.
Generated code that I might not put under version control would be documentation. Documentation is somewhat of a soft target. It doesn't matter as much when I regenerate the wrong version of the docs (say, it has a few typos more or less). But for releases, I might do that anyway so I can see the differences between releases. Might be useful, for example, to make sure the release notes are complete.
I also don't check in JAR files. As I have full control over the whole build and full confidence that I can get back any version of the sources in a minute, plus I know that I have everything necessary to build it without any further manual intervention, what would I need the executables for? Again, it might make sense to put them into a special release repo, but then it's better to keep a copy of the last three years on your company's web server for download. Think: comparing binaries is hard and doesn't tell you much.
I think it's best to put anything under version control that helps developers get started quickly, ignoring anything that may be auto-generated by an IDE or build tools (e.g. Maven's Eclipse plugin generates .project and .classpath; no need to check these in). Especially avoid files that change often, that contain nothing but user preferences, or that conflict between IDEs (e.g. another IDE that uses .project just like Eclipse does).
For Eclipse users, I find it especially handy to check in the code style settings (.settings/org.eclipse.jdt.core.prefs, with auto-formatting on save turned on) to get consistently formatted code.
Everything that can be automatically generated from the source and configuration files should not be under version control! It only causes problems and limitations (like the one you stated: two different programmers wanting to use two different project files).
That's true not only for IDE "junk files" but also for intermediate files (like .pyc in Python, .o in C, etc.).
This is where build automation and build files come in.
For example, you can still build the project (the two developers will obviously need the same build software), but they can then use two different IDEs.
As for the 'junk' that gets generated, I tend to ignore most of it. I know this is meant to be language-agnostic, but consider Visual Studio: it generates user files (user settings, etc.) that should not be under source control.
On the other hand, project files (used by the build process) most certainly should be. I should add that if you are on a team and have all agreed on an IDE, then checking in IDE-specific files is fine, provided they are global and not user-specific.
Those other questions do a good job of explaining what should and shouldn't be checked into source control, so I won't repeat them.
In my opinion it depends on the project and environment. In a company environment where everybody is using the same IDE, it can make sense to add the IDE files to the repository, though this depends a bit on the IDE, as some include absolute paths to things.
For a project which is developed in different environments it doesn't make sense, and will be a pain in the long run, as the project files aren't maintained by all developers and it becomes harder to find the "relevant" things.
Anything that would be devastating if it were lost, should be under version control.
In my opinion, anything needed to build the project (code, make files, media, databases with required program info, etc.) should be in the repositories. I realise that especially for media/database files this is controversial, but to me, if you can't branch and then hit build, the source control isn't doing its job. This goes double for distributed systems with cheap branch creation/merging.
Anything else? Store it somewhere different. Developers should choose their own working environment as much as possible.
From what I have been looking at with version control, it seems that most things should go into it, e.g. source code and so on. However, the problem that many VCSs run into is handling large files, typically binaries, and at times things like audio and graphics files. Therefore, my personal way to do it is to put the source code under version control, along with generally small-sized graphics, and leave any binaries to other systems of management. If it is a binary that I created myself using the build system of the IDE, then it can definitely be ignored, because it is going to be regenerated on every build. For dependency libraries, well, this is where dependency package managers come in.
As for IDE-generated files (I am assuming these are ones that aren't generated during the build process, such as the solution files for Visual Studio), well, I think it would depend on whether or not you are working alone. If you are working alone, then go ahead and add them; they will allow you to revert settings in the solution or whatever you make. The same goes for other non-solution-like files as well. However, if you are collaborating, then my recommendation is no: most IDE-generated files tend to be, well, user-specific, i.e. they work on your machine, but not necessarily on others'. Hence, you may be better off not including IDE-generated files in that case.
tl;dr: you should put most things that relate to your program into version control, excluding dependencies (things like libraries, graphics, and audio should be handled by some other dependency management system). As for things directly generated by the IDE, it depends on whether you are working alone or with other people.

Version control of deliverables

We need to regularly synchronize many dozens of binary files (project executables and DLLs) between many developers at several different locations, so that every developer has an up-to-date environment to build and test in. Due to the nature of the project, updates must be done often and on demand (overnight updates are not sufficient). This is not pretty, but we are stuck with it for a time.
We settled on using a regular version (source) control system: put everything into it as binary files, get the latest before testing, and check in updated DLLs after testing.
It works fine, but a version control client has a lot of features which don't make sense for us, and people occasionally get confused.
Are there any tools better suited for the task? Or may be a completely different approach?
Update:
I need to clarify that it's not a tightly integrated project; it's more like an extensible system with a heap of "plugins", including third-party ones. We need to make sure those plugin modules work nicely with recent versions of each other and of the core. A centralised build, as was suggested, was considered initially, but it's not an option.
I'd probably take a look at rsync.
Just create a .CMD file that contains the call to rsync with all the correct parameters and let people call that. rsync is very smart about deciding which parts of files need to be transferred, so it'll be very fast even when large files are involved.
What rsync doesn't do, though, is conflict resolution (or even detection), but in the scenario you described it's more like reading from a central place, which is what rsync is designed to handle.
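If you already drive builds with MSBuild rather than a bare .CMD file, the same call could live in a target. A sketch under the assumption that an rsync port (e.g. cwRsync) is on the PATH, with illustrative server and destination paths and standard rsync flags:

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="SyncBinaries">
    <!-- -a archive mode, -v verbose, -z compress; the delete flag mirrors deletions -->
    <Exec Command="rsync -avz --delete rsync://buildserver/binaries/ C:\dev\binaries\" />
  </Target>
</Project>
```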
Another option is unison
You should look into continuous integration and having some kind of centralised build process. I can only imagine the kind of hell you're going through with your current approach.
Obviously that doesn't help with the keeping your local files in sync, but I think you have bigger problems with your process.
Building the project should be a centralized process in order to allow for better control; otherwise your solution will be chaos in the long run. Anyway, here is what I'd do:
Create the usual repositories for source files, resources, documentation, etc. for each project.
Create a repository for resources. It will hold the latest binary versions of each project as well as any required resources, files, etc. Keep a good folder structure for each project so developers can "reference" the files directly.
Create a repository for final builds, which will hold the actual stable releases. It will get the stable files, produced in an automatic way (if possible) from the checked-in sources. This will hold the real product, the real version for integration testing and so on.
While far from perfect, you'll be able to define well-established protocols: check in your latest DLL here, generate the "real" version from the latest source there.
What about embedding a 'what' string in the executables and libraries? Then you can synchronise the desired list of versions with a manifest.
We tend to use CVS id strings as part of the what string:
const char cvsid[] = "@(#)INETOPS_filter_ip_$Revision: 1.9 $";
Entering the command
what filter_ip | grep INETOPS
returns
INETOPS_filter_ip_$Revision: 1.9 $
We do this for all deliverables so we can see whether the versions in a bundle of libraries and executables match the list in an associated manifest.
HTH.
cheers,
Rob
Subversion handles binary files really well, is pretty fast, and is scriptable. VisualSVN and TortoiseSVN make dealing with Subversion very easy too.
You could set up a folder that's checked out from Subversion with all your binary files (that all developers can commit to and update from), then just type "svn update" at the command line, or use TortoiseSVN: right-click on the folder, click "SVN Update", and it'll update all the files and tell you what's changed.

Do you version "derived" files?

Using online interfaces to a version control system is a nice way to have a published location for the most recent versions of code. For example, I have a LaTeX package here (which is released to CTAN whenever changes are verified to actually work):
http://github.com/wspr/pstool/tree/master
The package itself is derived from a single file (in this case, pstool.tex) which, when processed, produces the documentation, the readme, the installer file, and the actual files that make up the package as it is used by LaTeX.
In order to make it easy for users who want to download this stuff, I include all of the derived files mentioned above in the repository itself as well as the master file pstool.tex. This means that I'll have double the number of changes every time I commit because the package file pstool.sty is a generated subset of the master file.
Is this a perversion of version control?
@Jon Limjap raised a good point:
Is there another way for you to publish your generated files elsewhere for download, instead of relying on your version control to be your download server?
That's really the crux of the matter in this case. Yes, released versions of the package can be obtained from elsewhere. So it does really make more sense to only version the non-generated files.
On the other hand, @Madir's comment that:
the convenience, which is real and repeated, outweighs cost, which is borne behind the scenes
is also rather pertinent in that if a user finds a bug and I fix it immediately, they can then head over to the repository and grab the file that's necessary for them to continue working without having to run any "installation" steps.
And this, I think, is the more important use case for my particular set of projects.
We don't version files that can be automatically generated using scripts included in the repository itself. The reason is that after a checkout, these files can be rebuilt with a single click or command. In our projects we always try to make this as easy as possible, thus avoiding the need to version these files.
One scenario I can imagine where this could be useful is when 'tagging' specific releases of a product for use in a production environment (or any non-development environment) where the tools required for generating the output might not be available.
We also use targets in our build scripts that can create and upload archives with a released version of our products. These can be uploaded to a production server, or to an HTTP server for download by users of your products.
I am using TortoiseSVN for small-system ASP.NET development. Most code is interpreted ASPX, but there are around a dozen binary DLLs generated by a manual compile step. While it doesn't make a lot of sense in theory to have these under source control, it certainly makes it convenient to ensure they are correctly mirrored from the development environment onto the production system (one click). Also, in case of disaster, rolling back to the previous step is again one click in SVN.
So I bit the bullet and included them in the SVN archive - the convenience, which is real and repeated, outweighs cost, which is borne behind the scenes.
Not necessarily, although best practices for source control advise that you do not include generated files, for obvious reasons.
Is there another way for you to publish your generated files elsewhere for download, instead of relying on your version control to be your download server?
Normally, derived files should not be stored in version control. In your case, you could build a release procedure that created a tarball that includes the derived files.
As you say, keeping the derived files in version control only increases the amount of noise you have to deal with.
In some cases we do, but it's more of a sysadmin type of use case, where the generated files (say, DNS zone files built from a script) have intrinsic interest in their own right, and the revision control is more a linear audit trail than branching-and-tagging source control.

Solution deployment, CM, InstallShield

People,
We have 4 or 5 utilities that work in conjunction with our application. These utilities are either .bat files, VB apps, PowerBuilder, etc. I am trying to manage these utils in source control and am trying to figure out a better way to assign versions to them. Right now, the developers use the version control system's metadata, specifically labels, to store the version number of each tool.
My goal is to have individual InstallShield packages for each utility, and an easy means to manage and assign version numbers to these packages.
Would you recommend a separate .ini file with the info, storing the info in the InstallShield .ism file itself, or just using the metadata from the version control tool?
UPDATE:
I like the idea, Orion. I have one concern, though: the script that increments the version number can't be intelligent enough to increment the major number, etc., right? E.g. if one of the utils has version 1.2.3 and we are at a point where the new version should be 2.0.0, the script may not be able to handle this.
I think this has a lot to do with our branching techniques: we don't have any. The folks thought that since the utils are so small, the source may not need branches.
PowerBuilder in particular has a nice trick you can do to incorporate the build number from an ini file into the compiled application.
Details here: http://www.pbdr.com/pbtips/ex/autorev.htm
We have an ini file inside source control that stores the build number, and its value is used in our build scripts to determine what label to apply to the source tree after a successful build. It works very nicely for our needs. When we branch, we do have to manually kick the file to increment the proper number, though.
I managed our build system at my last job, which seemed to have some parallels to what you're asking.
There were ~30 C++ projects which needed compiling, and various .NET/Java things, and the odd perl script.
This was all built on our build machine using NAnt; if I were doing it today I'd use rake, but the idea is the same.
We basically had an auto-incrementing build number which was stored in a version.txt file in the root of the repository.
Each time we did a build (automatically each night, or on demand if necessary) the script would increment this number and check the file back into source control.
All the other apps referenced this file for their version number; for things which didn't support working like this, the script would set environment variables or perform other workarounds.
I'm pretty sure that our InstallShield programs referenced an environment variable for their version number, but we deprecated them in favour of WiX, as InstallShield really did suck.
In the case of Visual Studio, grep/replace the number within the .csproj files and check them back in.
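In MSBuild terms (the original setup used NAnt), the version.txt read plus grep/replace could look roughly like the sketch below; ReadLinesFromFile is a stock MSBuild task, FileUpdate comes from the MSBuild Community Tasks, and the ApplicationVersion element is just an example of a version string to rewrite:

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="$(MSBuildExtensionsPath)\MSBuildCommunityTasks\MSBuild.Community.Tasks.Targets" />

  <Target Name="StampProjects">
    <!-- version.txt in the repository root holds the auto-incremented build number -->
    <ReadLinesFromFile File="version.txt">
      <Output TaskParameter="Lines" PropertyName="BuildNumber" />
    </ReadLinesFromFile>

    <ItemGroup>
      <ProjectFiles Include="**\*.csproj" />
    </ItemGroup>

    <!-- Grep/replace the version in every project file; check them back in afterwards -->
    <FileUpdate Files="@(ProjectFiles)"
                Regex='&lt;ApplicationVersion&gt;[^&lt;]*&lt;/ApplicationVersion&gt;'
                ReplacementText='&lt;ApplicationVersion&gt;1.0.0.$(BuildNumber)&lt;/ApplicationVersion&gt;' />
  </Target>
</Project>
```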
Hope this gives you some ideas
Using the metadata from your version control system should keep things simpler. It's how your developers already use the system, and there is no additional file to maintain. My personal experience has taught me to version the satellite applications with the same version as the main app. K.I.S.S.