Typical best-practice ClearCase project structure - version-control

During a development project, the delivered code can pass through several stages and environments before it reaches production (e.g. a Development Environment for testing deployment processes, Internal Testing for QC, Pre-Production, and finally Production).
This development effort produces many candidate releases, of which a certain release can be nominated to move upwards through the development process until it reaches production. There might also be cases where the code deployed in production requires hot-fixes in parallel with the current internal development lines (i.e. parallel development).
For a certain UCM project maintained by IBM Rational ClearCase (CC), what is the recommended project structure to be created in "Project Explorer" to accommodate the following:
The developers should mainly connect and deliver their work on the internal development line (or in CC terminology the development stream).
Once the code delivered to this development stream is considered acceptable, the Technical Team Lead (TTL) can create a baseline. This baseline can later be retrieved by the Deployment Engineer to be deployed on the local Development Environment.
If this baseline is found acceptable, it can be delivered as a whole to the Internal Testing stream to be deployed for further Quality Control (QC) tests.
If that passes as well, the baseline can be delivered as a whole to Pre-Production, and so forth to Production, similar to what was described above.
Of course, if any of these baselines is not accepted by its receiving party, it can be rejected, and the receiving party will wait for another baseline to be recommended for their stream.
Note: The Deployment Engineer will always use a dedicated stream for each environment to get the files required to carry out the build/deployment activities.
My apologies to everybody here, since I understand that answering this can take long, but my question concentrates on the exact types of streams and/or views that need to be created in "Project Explorer" to satisfy the above objectives.
I am really trying to come up with a best-practice approach for release management using CC, and how CC can best be used for this purpose.
I would appreciate your help guys and many thanks to all in advance ...

The rule of thumb is simple:
The fewer branches, the better.
I mean, if you have ever done a deliver and rebase with ClearCase before, you know:
how painful it is
how poorly it scales with the number of files (merging 1000 files is awfully long, merging 5000 files is murder)
So the real rule of thumb is:
if you don't have to modify any file for a given development stage, don't create a branch.
For instance, for promoting code to QA, where you will only read it (and launch some tests in order to accept that code if they pass, or reject it if they fail), don't create a QA stream where you would deliver the code: it takes far too long for a non-existent added value.
Use baseline promotion levels whenever you can, and recommend your promoted baselines.
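A minimal sketch of that approach in cleartool commands (the PVOB, baseline, and stream names are invented, and mkbl assumes a view attached to the development stream):

    # ClearCase ships with an ordered set of promotion levels (REJECTED, INITIAL,
    # BUILT, TESTED, RELEASED); cleartool setplevel can redefine them per PVOB.
    # The TTL creates a baseline on the dev stream after an acceptable delivery:
    cleartool mkbl -incremental REL_1.2_CAND
    # Each stage then promotes (or rejects) that same baseline, instead of
    # receiving a fresh delivery on a stream of its own:
    cleartool chbl -level TESTED baseline:REL_1.2_CAND@/vobs/myPVOB
    # ...and the promoted baseline is recommended, so rebases pick it up:
    cleartool chstream -recommended baseline:REL_1.2_CAND@/vobs/myPVOB stream:dev_int@/vobs/myPVOB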
The Deployment Engineer will always use a dedicated stream for each environment to get the files required to carry out the build/deployment activities.
Err... no, not if you don't have any changes to make.
The Deployment Engineer doesn't care at all where the baseline is coming from, only whether the code deploys and runs successfully.

Related

What is the correct/best/most efficient way to use centralised version control?

I work as part of a small development team (4 people).
None of us are incredibly experienced with version control, but we are required to use Perforce under our company's policies. For the most part it has been great, but we have kept to a simple process agreed between ourselves that is starting to become less than ideal. I was wondering if people could share their experiences of version control working smoothly and efficiently.
Our original setup is this:
We have a trunk, which holds production code as it is now.
Each user creates a development branch for their work, as we have always worked on separate areas that don't really affect each other.
We develop on Red Hat Linux boxes and the code is run from /var/www/html. So we sync to a workspace, copy those files to this path, change the permissions, and then perform our changes there. When we want to check in, we check out the files in the workspace, overwrite them with what we have changed, and submit (I think this might be our weakest part).
Any changes to trunk will be incorporated if they affect the functionality in question. The code is then deployed for testing.
When testing is complete, we merge the branch into trunk, and then create a release branch from the current trunk; this is tested again and then released into production.
This worked fine previously because our projects were small and very separated. Now, however, we are all working on the same big dev branch. Changes have been released since the creation of the dev branch, and more will be made before it is finished.
We are also required to deploy the code for testing at various stages of its development, and this code needs to be up to date with both the development changes and any changes that have been made to production.
We have decided at this stage that we will create the release branch at the same time as the dev branch, into which we will merge current trunk (production) and the current dev branch each time we need a testing version, so that it is completely up to date. However, this merge takes a lot of time from the whole team and isn't really working out too well.
I've been told that different teams have different ways of going about things, so I'm not looking for a fix for my process, but I would love to hear what setup you use, if you're willing to share.
If you are not particularly familiar with version control and best practices, I would suggest utilizing Streams in Perforce. Functionally, Streams and Branches are very similar. The difference with Streams is that Perforce utilizes pre-built relationships based on the stream type and gives basic governance (i.e. you can't copy files to another stream until you merge).
All the commands CAN be overridden by an admin.
Once you are utilizing streams you can do things a few different ways. You have three types of streams: Release (most stable), Main (stable), and Development (least stable). You can create any hierarchy you like.
I suppose in your case I would have a Mainline, an integration development stream, and then a development stream for each developer to utilize. That way you each have your own playground and can move your changes to the integration stream once they are complete. Those completed changes can then be merged down to the other developer streams.
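A minimal sketch of that hierarchy in p4 commands (depot and stream names are invented):

    # assumes //Streams is a depot of type 'stream'; p4 stream opens the spec in an editor
    p4 stream -t mainline //Streams/main                             # stable mainline
    p4 stream -t development -P //Streams/main //Streams/integ      # shared integration stream
    p4 stream -t development -P //Streams/integ //Streams/dev-alice # per-developer playground

    # Day to day, following the "merge down, copy up" convention
    # (your client workspace must map the destination stream; see p4 help merge/copy):
    p4 merge -S //Streams/dev-alice    # merge newer integ changes down into dev-alice
    p4 resolve -am                     # auto-resolve what can be resolved
    p4 submit -d "Merge down from integ"
    p4 copy -S //Streams/dev-alice     # copy finished, tested work up to integ
    p4 submit -d "Promote to integ"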

force stable trunk/master branch

Our development department is growing and I want to enforce a stable master/trunk.
Up to now every developer can commit to master/trunk. In the future, developers should commit into a staging area, and if all tests pass, the code gets moved to the trunk automatically. If the tests fail, the developer gets a mail with the failed tests.
We have several repositories: One for the core product, several plugins and a repository for every customer.
Up to now we run SVN and git, but switching all repos to git could be done, if necessary.
Which software could help us to get this done?
There are some articles on the web which explain how to use Gerrit and Jenkins to enforce a stable branch.
I am unsure if I need both, or if it is better to use something else.
Environment: We are 10 developers, and use python and django.
Question: Which tool can help me to enforce a stable master branch?
Update
I was on holiday, and now the bounty has expired. I am sorry. Thank you for your answers.
Question: Which tool can help me to enforce a stable master branch?
Having been researching this particular aspect of CI quasi-pathologically since our ~20 person PHP/ZF1-based dev team made the switch from SVN to Git over the winter (and I became the de facto git mess-fixer), I can't help but share my experience with this particular aspect of continuous integration.
While, obviously, having a "critical mass of unit tests running", in combination with a slew of conditionally parameterized Jenkins jobs triggering infinitely more conditionally parameterized jobs covering every imaginable circumstance, would (perhaps) be the best and most proper way to move towards a Continuous Integration/Delivery/Deployment model, the meatspace resources required for such a migration are not insignificant.
Some questions:
Does your team have some kind of VCS workflow or, minimally, rules defined?
What percentage of your codebase, roughly, would you say is under some kind of behavioral (e.g. Selenium), functional, or unit testing?
Does your team (or its senior devs) actually have the time/interest to get the most out of Gerrit's peer-based code review functionality?
On average, how many times do you deploy to production in any given day / week / month?
If the answers to more than one of these questions are 'no', 'none', or 'very little/few', then I'd perhaps consider investing in some whiteboard time to think through your team's general workflow before throwing Jenkins into the mix.
Also, git-hooks. Seriously.
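To make that concrete: a server-side pre-receive hook can refuse any push to master whose tests fail. A minimal sketch, assuming a Django-style project where python manage.py test runs the suite (adapt paths and the test command to your setup):

    #!/bin/sh
    # pre-receive hook on the central repo; runs once per push.
    while read oldrev newrev refname; do
        [ "$refname" = "refs/heads/master" ] || continue  # only guard master
        tmpdir=$(mktemp -d)
        git archive "$newrev" | tar -x -C "$tmpdir"       # export the pushed tree
        if ! (cd "$tmpdir" && python manage.py test); then
            echo "Tests failed - push to master rejected." >&2
            rm -rf "$tmpdir"
            exit 1                                        # non-zero exit aborts the push
        fi
        rm -rf "$tmpdir"
    done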
However, if you're super keen on having a CI/Jenkins server, or you have all those basics covered already, then I'd point you to this truly remarkable gem of a blog post:
http://twasink.net/2011/09/16/making-parallel-branches-meet-regularly-with-git-and-jenkins/
And its equally savvy cousin:
http://twasink.net/2011/09/20/git-feature-branches-and-jenkins-or-how-i-learned-to-stop-worrying-about-broken-builds/
Oh, and of course, the very necessary devopsreactions tumblr.
There are some articles on the web which explain how to use Gerrit and Jenkins to enforce a stable branch.
I am unsure if I need both, or if it is better to use something else.
Gerrit is for code review.
Jenkins is a job scheduler that can run any job you want, including one that:
compiles everything
launches the unit tests
In each case, the idea is to do a guarded commit, i.e. pushing to an intermediate repo (Gerrit's, or one monitored by Jenkins), and only pushing to the final repo if the intermediate process (review or automatic build/test) passes successfully.
By adding intermediate repos, you can easily enforce one unique branch on the final "blessed" repo, to which those intermediate repositories will push if the commits are deemed worthy.
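With Gerrit, that intermediate repo is built in: developers push to a magic ref, and only reviewed changes with a passing build reach the real branch. A sketch (the host and project names are invented):

    git remote add gerrit ssh://dev@gerrit.example.com:29418/core.git
    git push gerrit HEAD:refs/for/master   # opens a review instead of updating master
    # A Jenkins job (e.g. via the Gerrit Trigger plugin) builds and tests each
    # candidate; only a change with a passing build and an approving review is
    # submitted to the real master -- nobody pushes to it directly.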
It sounds like you are looking to establish a standard CI capability. You will need the following essential tools:
Source Version Control : SVN, git (You are already covered here)
CI server : Jenkins (you will need to build and run tests with each check-in and report results; Jenkins is the de facto standard tool used for this -- see the sketch after this list)
Testing : PyUnit
Artifact Repository : you will need a mechanism for organizing and archiving the increments created with each build. This could be a simple home-grown directory-based system. I have also used Archiva, but there are other tools.
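As a concrete example, the shell build step of a Jenkins job for a Python project might look like this minimal sketch (the requirements file, test command, and artifact directory are assumptions):

    #!/bin/sh
    set -e                                    # any failing step fails the build
    pip install -r requirements.txt           # reproduce the environment
    python manage.py test                     # unit tests; a non-zero exit breaks the build
    tar czf "myapp-${BUILD_NUMBER}.tar.gz" .  # package the increment (BUILD_NUMBER is set by Jenkins)
    cp "myapp-${BUILD_NUMBER}.tar.gz" /srv/artifacts/   # simple directory-based artifact repository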
There are many additional tools that might be useful depending on your development process:
Code review : If you want to make code review a formal gate in your process, Gerrit is a good tool.
Code coverage analysis : I've used EMMA in the past for Java. I am sure there are some good tools for Python coverage.
Many others : a library of Jenkins plugins providing a variety of useful tools is available to you. Taking some time to review the available plugins will definitely be time well spent.
In my experience, establishing the right culture is as important as finding the right tooling.
Testing : one of the 10 principles of CI is "self-testing builds". In other words, you must have a critical mass of unit tests running. Developers must become test-infected. Unit testing must become a natural, highly valued part of each developer's individual development process. In my experience, establishing a culture of test infection is the hardest part of deploying CI.
Frequent check-in : Developers and managers must organize their work in a way that allows for frequent small check-ins. CI calls for daily check-ins. This is sometimes a difficult habit to establish.
Responsiveness to feedback : CI is about immediate feedback. The developers must be conditioned to respond to that immediate feedback. If unit tests fail, the build is broken. Within 15 minutes of a CI build breaking, the developer responsible should either have a fix checked in, or have the original, bad check-in backed out.

What are common practices for deployment of large scale systems?

Given a large scale software project with several components written in different languages, configuration files, configuration scripts, environment settings and database migration scripts - what are the common practices for deployment to production?
What are the difficulties to consider? Can the process be simplified with tools like Ant or Maven? How can rollback and database management be handled? Is it advisable to use version control for the production environment?
As I see it, you're mostly asking about best practices and tools for release engineering AKA releng -- it's important to know the "term of art" for a subject, because it makes it much easier to search for more information.
A configuration management system (CMS -- aka revision control system or version control system) is indispensable for today's software development; if you use one or more IDEs, it's also nice to have good integration between them and the CMS, though that's more of an issue for purposes of development than for purposes of deployment / releng.
From a releng viewpoint, the key thing about a CMS is that it must have good support for "branching" (under whatever name), because releases must be made from a "release branch" where all the code under development and all of its dependencies (code and data) are in a stable "snapshot" from which the exact, identical configuration can be reproduced at will.
The need for good branching support may be more obvious if you have to maintain multiple branches (customized for different uses, platforms, whatever), but even if your releases are always, strictly in a single linear sequence, releng best practices still dictate making a release branch. "Good branching support" includes ease of merging (and "conflict resolution" when different changes are made to a file), "cherry-picking" (taking one patch or changeset from one branch, or the head/trunk, and applying it to another branch), and the like.
In practice, you start off the release process by making a release branch; then, you run exhaustive testing on that branch (typically MUCH more than what you run everyday in your continuous build -- including extensive regression testing, integration testing, load testing, performance verification, etc, and possibly even more costly quality assurance processes, depending). If and when the exhaustive testing and QA reveal defects in the release-candidate (including regressions, performance degradation, etc), they must be fixed; in a large team, development on head/trunk may be continuing while the QA is being done, whence the need for ease of cherry-picking / merging / etc (whether your practice is to perform the fix on head or on the release branch, it still needs to be merged to the other side;-).
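In git terms (the advice above is deliberately tool-agnostic; the branch names and commit id here are invented), that flow looks roughly like:

    git checkout -b release/1.4 master   # snapshot: the release branch
    # ...exhaustive testing finds a regression; the fix lands on master as abc1234...
    git checkout release/1.4
    git cherry-pick abc1234              # carry just that fix onto the release branch
    git tag -a v1.4.0 -m "release 1.4.0" # exact, reproducible release point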
Last but not least, you're NOT getting full releng value from your CMS unless you're somehow tracking with it "everything" that your releases depend on -- simplest would be to have copies or hard links to all the binaries for the tools you need to build your release, etc, but that may often be impractical; so at the very least track the exact release, version, bugfix &c numbers of those tools that are used (operating system, compilers, system libraries, tools that preprocess image, sound or video files into final form, etc, etc). The key is being able, at need, to exactly reproduce the environment required to rebuild the exact version that's proposed for release (otherwise you'll go mad tracking down subtle bugs that may depend on third party tools' changes as their versions change;-).
After a CMS, the second most important tool for releng is a good issue tracking system -- ideally one that's well integrated with the CMS. That's also important for the development process (and other aspects of product management), but in terms of the release process the importance of the issue tracker is the ability to easily document exactly what bugs have been fixed, what features have been added, removed, or changed, and what modifications in performance (or other user-observable characteristics) are expected in the new forthcoming release. For the purpose, a key "best practice" in development is that every changeset that gets committed to the CMS must be connected to one (or more) issue in the issue tracking system: after all, there's gotta be some purpose for that change (fix a bug, change a feature, optimize something, or some internal refactor that's supposed to be invisible to the software's user); similarly, every tracked issue that's marked as "closed" must be connected to one (or more) changesets (unless the closing is of the "won't fix / working as intended" kind; issues related to bugs &c in third-party components, which have been fixed by the third-party supplier, are easy to treat similarly if you do manage to keep track of all third-party components in the CMS too, see above; if you don't, at least there should be text files under CMS documenting third-party components and their evolution, again see above, and they need to be changed when some tracked issue on a 3rd party component gets closed).
Automating the various releng processes (including building, automated testing, and deployment tasks) is the third top priority -- automated processes are much more productive and repeatable than asking some poor individual to manually go through a list of steps (for sufficiently complex tasks, of course, the workflow of the automation may need to "get a human being in the loop"). As you surmise, tools such as Ant (and SCons, etc, etc) can help here, but inevitably (unless you're lucky enough to get away with very simple and straightforward processes) you'll find yourself enriching them with ad-hoc scripts &c (some powerful and flexible scripting language such as perl, python, ruby, &c, will help). A "workflow engine" can also be precious when your release workflow is sufficiently complex (e.g. involving specific human beings or groups thereof "signing off" on QA compliance, legal compliance, UI guidelines compliance, and so forth).
Some other specific issues you're asking about vary enormously depending on specifics of your environment. If you can afford programmed downtime, your life is relatively easy, even with a large database in play, as you can operate sequentially and deterministically: you shut the existing system down gracefully, ensure the current database is saved and backed up (easing rollback, in the hopefully VERY rare case it's needed), run the one-off scripts for schema migration or other "irreversible" environment changes, fire the system back up again in a mode that's still inaccessible to general users, run another extensive suite of automated tests -- and finally, if everything's gone smoothly (including the saving and backup of the DB in its new state, if relevant), the system's opened up to general use again.
If you need to update a "live" system, without downtime, this may range anywhere from a slight inconvenience to a systematic nightmare. In the best case, transactions are reasonably short, and synchronization between the state set by transactions may be delayed a bit without damage... and you have a reasonable abundance of resources (CPUs, storage, &c). In this case, you run two systems in parallel -- the old one and the new one -- and just make sure all new transactions target the new system, while letting old ones complete on the old system. A separate task periodically syncs "new data in the old system" to the new system, as transactions on the old system terminate. Eventually you can determine that no transactions are running on the old system and all changes that happened there are synced up to the new one -- and at that time you can finally shut down the old system. (You need to be prepared to "reverse sync" too of course, in case a rollback of the change is needed).
This is the "simple, sweet" extreme for live system updating; at the other extreme, you can find yourself in such an overconstrained situation that you can prove the task is impossible (you just cannot logically meet all the stated requirements with the given resources). Long sessions opened on the old system which just cannot be terminated -- scarce resources that make it impossible to run two systems in parallel -- core requirements for hard-real-time sync of every transaction -- etc, etc, can all make your life miserable (and as I noticed, at the extreme, can make the stated task absolutely impossible). The two best things you can do about this: (1) ensure you have abundant resources (this will also save your skin when some server unexpected goes belly-up... you'll have another one to fire up to meet the emergency!-); (2) consider this predicament from the start, when initially defining the architecture of the overall system (e.g.: prefer short-lived transactions to long-lived sessions that just can't be "snapshot, closed down, and restarted seamlessly from the snapshot", is one good arcitectural pointer;-).
disclaimer: where I work we use something I wrote that I'm going to mention
I'll tell you how I do it.
For configuration settings, and general deployment of code and content, I use a combination of NAnt, a CI server, and dashy (an automatic deployment tool). You can replace dashy with any other 'thing', that automates the uploads to your server (perhaps capistrano).
For DB scripts, we use RedGate SQL Compare to get the scripts, and then for most changes, I actually make them manually, where appropriate. This is because some of the changes are slightly complex, and I feel more comfortable doing it by hand. You can actually use this tool to do it for you, (or at least generate the scripts).
Depending on your language, there are some tools that can script the DB updates for you (I think someone on this forum wrote one; hopefully he'll reply), but I have no experience with those. It's something I'd like to add though.
Difficulties
I forgot to answer one of your questions.
The biggest problem in updating any significantly complicated/distributed site is DB synchronisation. You need to consider if you will have any downtime, and if you do, what will happen to the DBs. Will you shut down everything, so that no transactions can be processed? Or will you shift everything to one server, update DB B, and then sync DB A & B, then update DB B? Or something else?
Whatever you choose, you need to choose it, and either say "Okay, for each update there will be X downtime", or whatever. Just have it documented.
The worst thing you can do is have someone's transaction fail mid-processing because you were updating that server, or to somehow leave only part of your system operational.
I don't think you have any option about whether to use versioning.
You cannot go without versioning (as mentioned in a comment).
I am talking from experience since I am currently working on a website where we have several items that have to work together:
The software itself, its functionalities at a given point in time
The external systems (6 of them) we are linked with (messages are versioned)
A database, which holds the configuration (and translations)
The static resources (images, css, javascripts) hosted on an Apache server (several in fact)
A Java applet, which has to be synchronized with the JavaScript and software
This works because we use versioning, although I must admit that our database is pretty simple, and because the deployment is automated.
Versioning means that at any point in time we actually have several concurrent versions of the messages, the database, the static resources and the Java applet.
It is especially important in the event of a fallback. If you discover a flaw when loading your new software and suddenly cannot afford to go ahead with the load, you'll have a crisis if you have not versioned; if you have, you can just load the old software again.
Now as I said, the deployment is scripted (sketched below):
- the static resources are deployed first (+ the Java applet)
- the database goes next, several configurations cohabit since they are versioned
- the software is queued for load in a 'cool' time window (when our traffic is at its lowest point, so at night)
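A condensed sketch of that order (the host names, paths, and load script are invented; the real thing is more involved):

    #!/bin/sh
    set -e
    VERSION=$1                        # e.g. ./deploy.sh 2.7.0
    # 1. static resources and the applet go out first, under a versioned path:
    rsync -a "static/$VERSION/" web01:/var/www/static/"$VERSION"/
    # 2. the versioned configuration is loaded; older versions keep cohabiting:
    psql -f "db/config_$VERSION.sql" maindb
    # 3. the software load itself is queued for the low-traffic window:
    echo "/opt/deploy/load_soft.sh $VERSION" | at 03:00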
And of course we take care of backward / forward compatibility issues with the external servers, which should not be underestimated.

Ideas on setting up a version control system

I've been tasked with setting up a version control for our web developers. The software, which was chosen for me because we already have other non-web developers using it, is Serena PVCS.
I'm having a hard time trying to decide how to set it up so I'm going to describe how development happens in our system, and hopefully it will generate some discussion on how best to do it.
We have 3 servers, Development, UAT/Staging, and Production. The web developers only have access to write and test their code on the Development server. Once they write the code, they must go through a certification process to get the code moved to UAT/Staging, then after the code is tested thoroughly there, it gets moved to Production.
It seems like making the developers use version control for their code on Development, which they are constantly changing and testing, would be an annoyance. Normally only one developer works on a module at a time, so there isn't much, if any, risk of overwriting other people's work.
My thought was to have them only use version control when they are ready to go to UAT/Staging. This allows them to develop and test without constantly checking in their code.
The certification group could then use the version control to help see what changes had been made to the module, and to make sure they were always getting the latest revision from the developer to put up on UAT/Staging (right now we rely on the developer zipping up their changed files and uploading them via a web request system).
This would take care of the file side of development, but leaves the whole database side out of version control. That's something else that I need to consider...
Any thoughts or ideas would be greatly appreciated. Thanks.
I would not treat source control as an annoyance. See Nick's answer for the reasons.
If I were you, I would not decide this on my own, because it is not a matter of setting up version control software on some server but a matter of changing and improving development procedures.
In your case, it might be worth explaining and discussing release branches with your developers and with quality assurance.
This means that your developers decide which features to include in a release, and while the staging crew is busy testing the "staging" branch of the source, your developers can already work on the next release without interfering with the staging team.
You can also think about feature branches, which means that there is a new branch for every specific new feature of the web site. Those branches are merged back once the feature is implemented.
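For illustration only (in git syntax, since PVCS terminology differs), a feature branch lives and dies like this:

    git checkout -b feature/search-box main   # a new branch for one specific feature
    # ...implement and test the feature on its own branch...
    git checkout main
    git merge --no-ff feature/search-box      # merge back once the feature is done
    git branch -d feature/search-box          # the branch has served its purpose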
But again: make sure that your teams have agreed to the new development process. Otherwise, you waste your time by setting up a version control system.
The process should at least include:
When to commit.
When to branch/merge.
What/When to tag.
The overall workflow.
I have used Serena, and it is indeed an annoyance. In addition to the unpleasantness of the workflow overhead Serena puts on top of the check-in/check-out process, it is a real pain with regard to doing anything besides the simplest of tasks.
In Serena ChangeMan, all code on local machines is managed through a central server. This is a really bad design. It means a lot of day-to-day branch maintenance work that would ordinarily be done by developers has to go through whoever has administrator privileges, making that person 1) a bottleneck and 2) embittered because they have a soul-sucking job.
The centralized management also strictly limits what developers are able to do with the code on their own machine. For example, if you want to create a second copy of the code locally on your box, just to do a quick test or whatever, you have to get the administrator to set up a second repository on your box. When you limit developers like this, you limit the productivity and creativity of your team.
Also, the tools are bad and the user interface is horrendous. And you will never be able to find developers who are already trained to use it, because it's too obscure.
So, if another team says you have to use Serena, push back. That product is terrible.
Using source control isn't an annoyance, it's a tool. Having the benefits of branching and tagging is invaluable when working with new APIs and libraries.
And just a side note: a couple of months back one of the devs' machines failed and he lost all his newest source. We asked when he had last committed code to source control, and it was two months earlier. Sometimes just having it to back up stuff when you reach milestones is nice.
I usually commit to source control a couple of times a week, depending if I've hit a good stopping point and I'm about to move on to something different or bigger.
Following on from the last two good points, I would also ask your other non-web developers what development process they are using, so you won't have to create a new one. They will also have encountered many of the problems that occur in your environment, both technical (using the same OS and setup) and managerial.

What should we do when the buildserver is treated like a goldmine?

A year ago I started to create some automated builds on our build machine (TFS2008). Not so much for combining with full scale TDD (we still have a lot of old legacy code), but for being able to detect at an early stage if builds got broken. Another objective was also to minimize the packaging/deployment work.
This has been working quite well so far, but lately some coworkers have started to treat the build server as a goldmine of quick releases, and the testing process seems to get less priority more and more often. A 2-3 day refactoring of some of our code proved that builds from our build server could potentially reach our customers. :)
I guess our build server has over time shifted from being a 'consistency tool' for the developers into being a server producing packages that are expected to be release quality 24/7.
This is clearly a communication problem, and there should be a set of rules on this. Only problem is that I don't know where to begin. Does anyone have similar experiences with this?
You're correct, it is a communications problem. If your developers and management are expecting release-quality builds all the time, they're not understanding the process of build/test/release.
The only thing you can do is clarify the purpose of a build server: a single, centralized location for builds. You need to clarify the distinction between a build and a release. Builds should always succeed (no one should break the build) but the ability to create a build does not have any bearing whatsoever on build quality or the suitability of a given build for release.
Build quality is measured by unit, functional, and user acceptance testing. There is no replacement for these tests in preparing a build for release. The long-term costs of not doing these tests far outweigh the short-term benefits of getting a release out the door.
Our unit test server runs the tests and tags CVS. Then we go to a build server which has a script to create a release that is ready for customer installation. This release is then installed on a test server as if it were the customer's server, and then tested.
Judging from your story, you are hoping to find some script or setting which will prevent the build server from being used as a "quick release" server. The only real way to do this is process.
Rules in our company:
Developers check into CVS; they get mails from the unit test server if a test fails, and have to fix that in code. No access to the build/test server for devs.
There is 1 specific developer who can create a release which he can send to the test department.
The test department installs the release on their test server and tests it.
The testers, and only the testers, can give a "Go" for release.
The release is done by a designated person who is also the customer contact.
As you can see, the developers are separated from the testers and the customer (formally speaking). In practice it is not all that rigid of course, but people need to understand that if this process is not in place, the customer will get inferior-quality software.
The customer has to be educated that "fast" means "low quality". We can do it Fast, Good, or Cheap. Pick two.
http://www.sixside.com/fast_good_cheap.asp
I suggest that all builds created by the internal build server state in the splash screen "INTERNAL BUILD - NOT FOR CUSTOMERS", and the release build server plops in the official splash screen.
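A trivial sketch of that stamping step as a build script fragment (the variable and file names are invented; in reality the text would be baked into the installer or splash resource):

    #!/bin/sh
    # The release build server sets BUILD_KIND=release; everything else is internal.
    if [ "${BUILD_KIND:-internal}" = "internal" ]; then
        echo "INTERNAL BUILD - NOT FOR CUSTOMERS" > splash.txt
    else
        echo "Official release splash goes here" > splash.txt
    fi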