Checking in of "commented out" code [closed] - version-control

Ok, here is something that has caused some friction at my current job and I really didn't expect it to. Organized in-house software development is a new concept here and I have drawn up a first draft of some coding guidelines.
I have proposed that "commented out" code should never be checked into the repository. The reason I have stated this is that the repository maintains a full history of the files. If you are removing the functional code then remove it altogether. The repository keeps your changes so it is easy to see what was changed.
This has caused some friction in that another developer believes that taking this route is too restrictive. This developer would like to be able to comment out some code that he is working on but is incomplete. That code would never have been checked in before, so otherwise it would not be saved anywhere. We are going to be using TFS so I suggested that shelving the changes would be the most correct solution. It was not accepted however because he would like to be able to check in partial changes that may or may not be deployed.
We want to eventually get to a point where we are taking full advantage of Continuous Integration and automatically deploying to a development web server. Currently there is no development version of web servers or database servers but that will all be changed soon.
Anyway, what are your thoughts? Do you believe that "commented out" code is useful to have in the repository?
I'm very interested to hear from others on this topic.
Edit:
For clarity's sake, we don't use private branches. If we did then I'd say do what you want with your private branch but don't ever merge commented out code with the trunk or any shared branches.
Edit:
There is no valid reason we don't use private or per user branches. It's not a concept I disagree with. We just haven't set it up that way yet. Perhaps that is the eventual middle ground. For now we use TFS shelving.

There may be others with different experiences, but in mine checking in half-finished code is a horrible idea, period.
Here are the principles I have learned and try to follow:
1. Check in often - at least once, but preferably many times per day
2. Only check in complete functionality
3. If the first and second conflict (e.g. it takes more than a day to make the functionality work) then the task is too large - break it into smaller completable tasks.
This means:
Commented-out code should never be checked in since it is not functional
Commenting is not a valid archival strategy, so whether it's code yet-to-be-finished or code that's being retired, commenting and checking in doesn't make any sense.
So in summary, NO! If the code is not ready to go to the next stage (whichever that is for you: IntTest/QA/UAT/PreProd/Prod), it should not be committed to a trunk or multi-developer branch. Period.
Edit: After reading the other answers and comments, I'll add that I don't think it's necessarily a good idea to ban commented code (not sure how you'd enforce that anyway). What I will say is that you should get everyone on your team to buy in to the philosophy I described above. The team I work on embraces it wholeheartedly. As a result, source control is a frictionless team-member, one that helps us get our job done.
People who don't embrace that philosophy usually cause broken windows and are often frustrated by source control. They see it as a necessary evil at best, and something to avoid at worst; which leads to infrequent checkins, which means changesets are huge and hard to merge, which compounds frustration, makes checkins something to avoid even more, etc. This is ultimately an attitude thing, not really a process thing. It's easy to put up mental barriers against it; it's easy to find reasons why it won't work, just like it's easy to find reasons not to diet if you don't really want to. But when people do want to do it and are committed to changing their habits, the results are dramatic. The burden is on you to sell it effectively.

"Never" is rarely a good word to use in guidelines.
Your colleague has a great example of when it might be appropriate to check in code that is commented out: When it is incomplete and might break the application if checked in while active.
For the most part, commenting out dead code is unnecessary in a well-managed change-controlled system. But, not all commented code is "dead."

One case where I leave commented out code:
// This approach doesn't work
// Blah, blah, blah
when that is the obvious approach to the problem but it contains some subtle flaw. Sure, the repository would have it but the repository wouldn't warn anyone in the future not to go down that road.

Commented out code should never be checked-in for the purpose of maintaining history. That is the point of source control.
People are talking a lot of ideals here. Maybe unlike everyone else, I have to work on multiple projects with multiple interruptions, with the "real world" occasionally interrupting my workday.
Sometimes, the reality is, I have to check in partially complete code. It's either risk losing the code or check in incomplete code. I can't always afford to "finish" a task, no matter how small. But I will not disconnect my laptop from the network without checking in all code.
If necessary, I will create my own working branch to commit partial changes.

I would certainly discourage, strongly, ever checking in commented-out code. I would not, however, absolutely ban it. Sometimes (if rarely) it is appropriate to check commented-out code into source control. Saying "never do that" is too restrictive.
I think we all agree with these points:
Never check dead code into source control
Never check broken (non-functioning) code into source control, at least never to trunk and only very rarely to a private branch, YMMV
If you have temporarily commented something out or broken something for debugging purposes, don't check the code in until you restore the code to its correct form
Some of us are saying there are other categories, such as temporarily removed code, or an incremental but incomplete improvement that includes a small amount of commented-out code as documentation of what comes next, or a very short (ideally 1 line) snippet of commented out code showing something that should never be re-added. Commented-out code should ALWAYS be accompanied by a comment that says why it is commented out (and not just deleted) and gives the expected lifetime of the commented-out code. For example, "The following code does more harm than good, so is commented out, but needs to be replaced before release XXX."
A comment like the above is appropriate if you are delivering a hotfix to stop a customer's bleeding and you don't have the immediate opportunity to find the ultimate fix. After delivering the hotfix, the commented-out code is a reminder that you still have something that needs fixing.
When do I check in commented-out code? One example is when I am tentatively removing something that I think there's a high probability will have to be re-added in the near future, in some form. The commented-out code is there to serve as a direct reminder that this is incomplete. Sure, the old version is in source control and you could just use a FIXME comment as a flag that something more is needed. However, sometimes (if not often) code is the better comment.
Also, when a bug is fixed by removing one line (or more rarely, two lines) of code, I'll sometimes just comment out the line with a comment to never re-enable that code with a reason why. This sort of comment is clear, direct, and concise.
Rex M said: 1) Only check in complete functionality, 2) [If] the task is too large - break it into smaller completable tasks.
In response: Yes, this is the ideal. Sometimes neither option is achievable when you are working on production code and have an immediate, critical problem to fix. Sometimes to complete a task, you need to put a version of code in the field for a while. This is especially true for data gathering code changes when you're trying to find the root cause of a problem.
For the specific instance being asked about in the more general question ... as long as the developer is checking commented-out code into a private branch that no-one will see but that developer (and perhaps someone the developer is collaborating with), it does little harm. But that developer should (almost) never deliver such code into trunk or an equivalent. Trunk should always build and should always function. Delivering unfinished code to trunk is almost always a very bad idea. If you let a developer check unfinished or temporary code into a private branch, then you have to rely on the developer to not forget to scrub the code before delivering into trunk.
To clarify in response to comments to other answers, if code is commented out and checked in, my expectation that the code will function if uncommented drops with the length of time the code has been commented out. Obviously, refactoring tools will not always include comments in their refactoring. Almost always, if I put commented-out code into production, the code is there to serve as a refined comment, something more specific than prose, that something needs to be done there. It is not something that should have a long life.
Finally, if you can find commented-out code in every source file, then something is wrong. Delivering commented-out code into trunk for any reason should be a rare event. If this occurs often, then it becomes clutter and loses its value.

I think never is too strong a condition.
I tend to comment out, check in, run the tests, have a think and then remove comments after the next release.

In general, checking in commented-out code is wrong as it creates confusion amongst those who are not the original author and need to read or amend the code. In any event, the original author often ends up confused about the code after 3 months have passed.
I espouse the belief that the code belongs to the company, or team, and that it is your responsibility to make things easy for your peers. Checking in commented-out code without also adding a comment about why it is being retained is tantamount to saying:
I don't mind if you end up all
confused over why this stuff is here.
My needs are more important than yours
which is why I have done this. I do
not feel any need to justify
to you, or anyone else, why I have
done this.
For me commented-out code is normally seen as a sign of disrespect from a less than thoughtful co-worker.

When you need to add a small feature or bug fix like NOW, within the next 3 minutes, and you have to fix a file in which you have some half-developed code, I'd say it's OK. Practical needs rule over pragmatic ideals on the battlefield.

I broadly agree with the principle that commented out code shouldn't be checked in. The source control system is a shared resource, and your colleague is, to an extent, using it as his personal scratch pad. This isn't very considerate to the other users, especially if you subscribe to the idea of shared ownership of the codebase.
The next developer to see that commented-out code would have no idea that it's a work in progress. Is he free to change it? Is it dead code? He doesn't know.
If your colleague's change isn't in a state where it can be checked in, he needs to finish it, and/or learn to make smaller, incremental changes.
"Checking in partial changes that may or may not be deployed" - presumably that also means may or may not be tested? That's a slippery slope to a very ropey code base.

Another reason for checked in commented out code:
You're modifying existing code, and found a subtle bug, one that is easy to overlook, and perhaps might even look correct at first glance. Comment it out, put the fix in its place, and add comments for what is going on, and why it was modified. Check that in, so that your comments on the fix are in the repository.

This shows a fundamental difference in two schools of thought: Those who only check in working code that they are satisfied with and feel is worthy of saving, and those who check in their work so the revision control is there to backstop them against data loss.
I'd characterize the latter as "those who like to use their revision control system as a poor-man's tape backup", but that would be tipping my hand as to which camp I'm in. :-)
My guess is that you are of the "good code" camp, and he is of the "working code" camp.
[EDIT]
From the comments, yes I guessed right.
As I said, I'm with you, but as near as I can tell this is a minority opinion, both here on stackoverflow and where I work. As such, I don't think you can really enshrine it in your development standards as the only way to operate. Not if you want the standards followed anyway. One thing a good leader knows is to never give an order they know won't be followed.
btw: Good editors will help with keeping old versions. For example, in Emacs I set kept-new-versions and kept-old-versions to 10, which has it keep around the last 10 saves of my files. You might look into that as a way to help your argument against the revision-control-as-backup crowd. However, you won't ever win the argument.

It depends. If it's being left there for purposes of illustration, maybe. It could possibly be useful during refactoring. Otherwise, and generally, no. Also, commenting out unfinished code is bound to be both failure-prone and a time sink. Better he break the code into smaller pieces and check them in when they work.

In my experience, developer switches are commented out code.
Sometimes, new back-ends are built in parallel, with the activating switches commented out in source control.
Some bizarre feature we need once in a blue moon but no customer will ever need is often implemented that way. These things usually carry a high risk of security or data integrity bypass so we don't want them active outside of development. Requiring a developer who would use it to uncomment the code first seems to be the easiest way of getting it.

I think "Never" is too strong a rule. I'd vote to leave some personal leeway around whether people check commented code in to the repository. The ultimate goal should be coder productivity, not "a pristine repository."
To balance that laxness out, make sure everyone knows that commented out code has an expiration date. Anyone is allowed to delete the commented code if it's been around for a full week and never been active. (Replace "a week" with whatever feels right to you.) That way, you reserve the right to kill clutter when you see it, without interfering too directly with people's personal styles.
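A rough sketch of how such an expiration rule could be checked automatically. This assumes the team agrees on a marker comment above each commented-out block; the `// DISABLED YYYY-MM-DD (author): reason` format and the `expired_markers` helper are made up for illustration, not something any tool provides out of the box.

```python
import re
from datetime import date, timedelta

# Hypothetical team convention, e.g.:
#   // DISABLED 2024-01-05 (alice): old retry loop
MARKER = re.compile(
    r"//\s*DISABLED\s+(\d{4})-(\d{2})-(\d{2})\s+\((\w+)\):\s*(.*)"
)


def expired_markers(source, today, max_age_days=7):
    """Return (line_number, author, reason) for markers older than max_age_days."""
    expired = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if not match:
            continue
        year, month, day, author, reason = match.groups()
        disabled_on = date(int(year), int(month), int(day))
        # Anything past the agreed lifetime is fair game for deletion.
        if today - disabled_on > timedelta(days=max_age_days):
            expired.append((number, author, reason))
    return expired
```

Passing `today` in explicitly (rather than calling `date.today()` inside) keeps the check easy to test; the one-week default mirrors the "full week" above and can be replaced with whatever feels right to you.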

Perhaps the real question here is whether developers should be allowed to check in incomplete code?
This practice would seem to be contradictory to your stated goal of implementing continuous integration.

My view: if developers are working on their own branches, or in their own sandbox area, then they should be able to check in whatever they want. It's when they check code into a shared branch (a feature branch, or a team's branch, or of course MAIN/trunk) that the code should be as pristine as possible (no commented out code, no more FIXMEs, etc).

The idea of allowing source-control history to illustrate the "old way" of doing something rather than commenting it out and checking in the commenting-out along with an explanation is a good idea in theory.
In the real world, however, nobody ever looks at source control history on the files they are working on unless it is part of an official review process of some sort (done only periodically), or if something doesn't work, and the developer can't figure out why.
Even then, looking back more than about 3 versions basically never happens.
Partially, this is because source-control systems don't make this sort of casual review easy. Usually you have to check out an old version or diff against an old version, you see just two versions at a time, and there's no good concise view that gives you an at-a-glance idea of what changed.
Partially, it is the combination of human nature and the needs of the team. If I have to fix something, and I can fix it in a few hours, I'm not likely to spend an hour investigating old versions of the code that haven't been "live" in a month (which, with each developer checking in often, means back many revisions), unless I happen to know that there's something in there (such as if I remember a discussion about changing something related to what I'm doing now).
If the code is deleted and checked back in, then, for all intents and purposes (except for the limited purpose of a complete roll-back) it ceases to exist. Yes, it is there for backup purposes, but without a person in the role of code librarian, it is going to get lost.
My source control tree on my current project is about 10 weeks old, on a team of only about 4 engineers, and there are about 200 committed change lists. I know that my team does not do as good of a job as it should of checking in as soon as there is something solid and ready to go. That makes it pretty rough to rely on reading the code history for every part of the code to catch every important change.
Right now, I'm working on a project in initial development mode, which is very different from a project in a maintenance mode. Many of the same tools are used in both environments, but the needs differ quite a bit. For example, often there is a task that requires two or more engineers to work somewhat closely together to build something (say a client and a server of some sort).
If I'm writing the server, I might write up the code for the draft interface that the client will use and check it in completely non-functional, so that the engineer writing the client can update. This is because we have the policy that says that the only way to send code from one engineer to another is through the source control system.
If the task is going to take long enough, it would be worth creating a branch perhaps for the two of us to work on (though that is against policy in my organization -- engineers and individual team leads don't have the necessary permissions on the source-control server). Ultimately, it's a trade-off, which is why we try not to institute too many "always" or "never" policies.
I would probably respond to such a no-commented-code-ever policy by saying that it was a bit naive. Well-intentioned, perhaps, but ultimately unlikely to achieve its purpose.
Though seeing this post is going to make me go back through the code I checked in last week and remove the commented-out portion that was both never final (though it worked) and also never likely to be desired again.

I absolutely agree that commented out code shouldn't be checked into the repository, that is what source code control is for.
In my experience when a programmer checks in commented out code, it is because he/she is not sure what the right solution is and is happier leaving the alternate solution in the source in the hope that someone else will make that decision.
I find it complicates the code and makes it difficult to read.
I have no problem with checking in half finished code (so you get the benefit of source control) that isn't called by the live system. My problem is with finding sections of commented code with no explanation of what the dilemma was that resulted in the code being excluded.

I think checking-in commented code in a source code control system should be done with extreme caution, especially if the language tags used to comment the code are written by blocks, i.e.:
/* My commented code start here
My commented code line 1
My commented code line 2
*/
Rather than on an individual line basis, like:
// My commented code start here
// My commented code line 1
// My commented code line 2
(you get the idea)
The reason I would use extreme caution is that, depending on the technology, you should be very careful about the diff/merge tool you are using. With certain source code control systems and certain languages, the diff/merge tool can be easily confused. The standard diff/merge of ClearCase for example is notoriously bad for merging .xml files.
If it happens that the comment block lines are not merged properly, presto, your code will become active in the system when it shouldn't be. If the code is incomplete and breaks the build, that is probably the least evil, as you will spot it immediately.
But if the code passes the build, it may become active when it shouldn't be there, and from a CM perspective, that could be a nightmare scenario. QA generally tests what should be there, they don't test for code that is not supposed to be there, so your code may end up in production before you know it, and by the time anyone realizes the code is there when it shouldn't be, the cost of maintenance will have multiplied manyfold (as the "bug" will be discovered in production or by the customer, the worst place or time ever).
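To make the difference between the two comment styles concrete, here is a toy model of the hazard. It is only a sketch under heavy assumptions: `active_lines` is a hypothetical, much-simplified stand-in for a C-style parser (it only understands whole-line `//` comments and `/* ... */` blocks whose delimiters sit at the start or end of a line, and it ignores strings), and the "bad merge" is simulated by simply dropping lines.

```python
def active_lines(lines):
    """Return the lines a simplified C-style parser would treat as live code."""
    active = []
    in_block = False
    for line in lines:
        stripped = line.strip()
        if in_block:
            # Inside /* ... */: everything is ignored until the closing delimiter.
            if stripped.endswith("*/"):
                in_block = False
            continue
        if stripped.startswith("/*"):
            if not stripped.endswith("*/"):
                in_block = True
            continue
        if stripped.startswith("//") or not stripped:
            continue
        active.append(stripped)
    return active


def without(lines, dropped):
    """Simulate a bad merge by dropping the lines at the given indexes."""
    return [line for index, line in enumerate(lines) if index not in dropped]


block_style = ["/* temporarily disabled", "deleteAllRows();", "sendEmail();", "*/"]
line_style = ["// temporarily disabled", "// deleteAllRows();", "// sendEmail();"]

# Intact, both variants contain no live code.
print(active_lines(block_style))  # []
print(active_lines(line_style))   # []

# Lose the two delimiter lines of the block comment: the body goes live.
print(active_lines(without(block_style, {0, 3})))  # ['deleteAllRows();', 'sendEmail();']

# Lose a line of the line-comment variant: the rest stays commented out.
print(active_lines(without(line_style, {0})))  # []
```

With line comments every line carries its own marker, so a mangled merge can only lose individual lines. With block comments the state of the whole body depends on two delimiter lines surviving the merge.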

I think commented out code is considered to be "waste".
I am assuming you are working in a team environment. If you are working on your own, and you comment out code with a 'todo' and you will come back to it then that is different. But in a team environment you can safely assume once commented out code is checked in it is there to stay and it will most likely cause more pain than satisfaction.
If you are doing peer code reviews then it might answer your question. If another developer reviews your code and says "why is there this commented out code that is trying to do 'blah'" then your code has failed the code review and you shouldn't be checking it in anyway.
Commented out code will just raise questions with other developers - therefore wasting time and energy.
You need to ask the question "why" the code is commented out. Some suggestions:
If you are commenting out code because you are "unsure of business rules" then you probably have an issue with "scope creep" - best not to dirty your code base with requirements that "would be nice to have but we don't have time to implement" - keep it clean with clear code and tests around what is actually there.
If you are commenting out code because you are "not sure if it is the best way to do it" then have your code peer reviewed! Times are changing, you will look at code you write today in 2 years and think it's horrible! But you can't go around commenting out bits that you 'know' can be done better but you just can't find a way right now. Let whoever maintains the codebase long term determine whether there is a better way - just get the code written, tested and working and move on.
If you are commenting out code because "something doesn't work" then FIX IT! A common scenario is "broken tests" or "todo's". If you have these, you will save yourself a lot of time by either fixing them or just getting rid of them. If they can be "broken" for a period of time, they can most likely be broken forever.
All of these potential scenarios (and ones I haven't mentioned here) are wasted time and effort. Commented out code might seem like a small issue but could be an indicator of a bigger issue in your team.

Repositories are a backup of code. If I am working on code but it is not completed, why not comment it out and check it in at the end of the day? That way if my hard drive dies overnight I will not have lost any work. I can check out the code in the morning, uncomment it and keep on going.
The only reason I would comment it out is because I would not want to break the overnight build.

There is clearly a tension between 1) checking in early, and 2) always maintaining the repository in a working state. If you have more than a few developers, the latter is going to take increasing precedence, because you can't have one developer crapping all over everyone else for his own personal workflow. That said, you shouldn't underestimate the value of the first guideline. Developers use all different kinds of mental fenceposts, and individualized workflows are one way that great developers squeeze out those extra Xs. As a manager your job is not to try to understand all these nuances (which you will fail at unless you are a genius and all your developers are idiots) but rather to enable your developers to be the best they can be through their own decision-making.
You mention in the comment that you don't use private branches. My question for you is why not? Okay, I don't know anything about TFS, so maybe there are good reasons. However after using git for a year, I've gotta say that a good DVCS totally defuses this tension. There are cases where I find commenting out code to be useful as I'm building a replacement, but I will lose sleep over it if I'm inflicting it on others. Being able to branch locally means I can keep meaningful commits for my individual process without having to worry about (or even notify) downstream developers of temporary messes.

Just echoing the chorus. Discourage this at all costs. It makes the code harder to read and leaves folks wondering what's good/bad about that code that isn't even part of the application at the present time. You can always find changes by comparing revisions. If there was some major surgery and code was commented out en masse, the dev should have noted it in the revision notes on checkin/merge.
Incomplete/experimental code should be in a branch to be developed to completion. The head/trunk should be the mainline that always compiles and is what's shipping. Once the experimental branch is complete/accepted it should be merged into the head/mainline. There is even an IEEE standard (IEEE 1042) describing this if you need support documentation.

I would prefer to see possibly broken, accessible code that isn't being used yet checked in over the same code being completely unavailable. Since all version control software allows some sort of 'working copy' separate from the trunk, it's really a much better idea to use those features instead.
New, non-functional code is fine in the trunk, because it is new. It probably doesn't break anything that already works. If it does break working code, then it should just go in a branch, so that other developers can (if they need to) check that branch out, and see what's broken.

"Scar Tissue" is what I call commented-out code. In the days before the widespread use of version-control systems, Code Monkeys would leave commented out code in the file in case they needed to revert functionality.
The only time it is acceptable to check in "scar tissue" is
1. If you have a private branch and
2. You don't have time to make the code compile without errors and
3. You are going on a long vacation and
4. You don't trust your VCS, like if you use Visual Source Safe, OR
[EDIT]
You have a subtle bug that might be reintroduced if the incorrect code isn't left in as a reminder. (good point from other answers).
There is almost no excuse for #4 because there are plenty of freely available and robust VCS systems around, Git being the best example.
Otherwise, just let the VCS be your archive and code distributor. If another developer wants to look at your code, email him the diffs and let him apply the stuff he wants directly. In any event, the merge doesn't care why or how the coding of two files diverged.
Because it is code, scar-tissue can be more distracting even than a well-written comment. By its very nature as code, you make the maintenance programmer expend mental cpu-cycles figuring out if the scar-tissue has anything to do with his changes. It doesn't matter whether the scar is a week old or 10 years old, leaving scar-tissue in code imposes a burden upon those who must decipher the code afterwards.
[EDIT]
I'd add that there are two major scenarios that need to be distinguished:
Private development, either coding a personal project or checking in to a private branch
Maintenance development, where the code being checked in is intended to be put into production.
Just say "NO" to scar-tissue!

I don't know - I always comment out the original lines before I make changes - it helps me revert them back if I change my mind. And yes I do check them in.
I do however strip out any old commented code from the previous check-in.
I know I could look at the diff logs to see what's changed but it's a pain - it's nice to see the last changes right there in the code.

A nice compromise is to write a little tool that dumps your checked out/modified files to a network backup drive. That way, you can modify to your heart's content and have your work backed up, but you never have to check in experimental or unfinished code.
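A minimal sketch of what such a tool might look like. The `back_up_modified` name and its arguments are hypothetical, and the list of modified paths is assumed to come from whatever your VCS reports as changed (for example the output of `git status --porcelain`, or the pending-changes list in your client); collecting that list is left out here.

```python
import shutil
from pathlib import Path


def back_up_modified(working_copy, backup_root, modified_paths):
    """Copy each modified file into backup_root, preserving relative paths."""
    working_copy = Path(working_copy)
    backup_root = Path(backup_root)
    copied = []
    for relative in modified_paths:
        source = working_copy / relative
        if not source.is_file():
            continue  # deleted or not a regular file, so nothing to copy
        target = backup_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also keeps timestamps
        copied.append(target)
    return copied
```

Pointing `backup_root` at a mounted network share and running this from a scheduled job at the end of the day would cover the "my hard drive dies overnight" case without anything unfinished reaching the repository.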

I think that checking in commented out code should be valid: even though the new change passed its tests, it may be helpful to see what was there before and judge whether the new change is really an improvement.
If I have to go back several versions to see an earlier change that now leads to a performance hit then that would be very annoying.
Sometimes the commented out code is good history, but put dates on it showing when the code was commented out. Later, someone who is working near there can just delete the commented out code once it has been proven not to be needed.
It would also be good to know who commented out that code so that if some rationale is needed then they can be asked.
I prefer to write new functionality, ensure the unit tests pass, check it in, then allow others to use it and see how it works.

If the developer has commented out some code because it is not yet complete, then the correct "source control" way to deal with this would be for that developer to keep it in a separate branch of his own, until that code is ready to check in.
With a DVCS (like git, bazaar, or mercurial) this is dead easy as it requires no changes in the central repository. Otherwise, perhaps you could talk about giving developers their own branches on the server if they are working on specific features that will take them a long enough time (ie, days).
There is nothing wrong with checking in commented out code in some situations, it's just that this is one situation where there may be a better way to do it, so the developer can track changes to his source even though it isn't ready to be checked in to the main repository.

Clearly, the developer who is checking in commented-out code should be working in a separate branch, merging in changes from the trunk branch as necessary.
It is up to the VCS system to assist the developer in this workflow (git is one excellent VCS system that works with this very nicely).

Related

Implementing Source Control

My company has 3 developers. Me, another guy, and a VP dev. I really want to implement source control, especially since our code seems to randomly change on its own. We tend to develop on the server, live, etc.
I'm fine with having a copy of our database on my machine to work against, if necessary, as is the other guy. The VP dev doesn't want it. How can I work with him to change his mind, or make it work for him?
You have to make him think it's his idea.
Point out that with source control you not only have a built-in backup of everything, but you also have the previous versions - let him realize how much of a good thing that is.
Install SVN and tell the one that opposes it that "everybody does it" :)
And seriously - source control is a MUST even for a single developer, let alone for three.
As for the DB server - you can use one development server (it can be a regular machine). It is of course no problem if you each use a local copy, but you must have strong database schema generation/synchronization tools.
You should have source control. There isn't much excuse for not having it. Source control will protect you against changes that will cause problems in your code. I would recommend putting the db schema and data (sample set) in your version control. This will allow independent changes to the db without screwing up what your users see live on the site.
Note that you're not really asking about source control here, but about where your development dataset resides. Local databases per-developer are best, if possible, but failing that, a reasonable alternative is to just have a virtual machine containing your source control server and a development database.
Putting things under source control is really easy - literally, 10 minutes from now you could have your source under source control. Rather than try to persuade him of the benefits, I would just go ahead and do it anyway.
Start simply by putting a copy of your source under source control - even if he doesn't use it, just merge the changes from live into your source control repository on a regular basis. At least that way you have a revision history (and if you and he are the only people changing the source, it means that any changes you didn't make, he must have made).
With luck, slowly over time he will begin to see the benefits (him: oh no - everything just broke! you: Don't worry, I'll just look and see what has changed since the last working copy...)
It sounds like you need to convince him that it is
1. necessary to solve a problem,
2. an appropriate solution (does exactly what you need) and
3. easy to use.
It sounds like you have the information to demonstrate #1: the last time the code "changed on its own" on the server and you lost someone's work or mixed results poorly. Bam, there's your "problem." #3 is the next most difficult: you need to pick an SCM with a good set of tools and do a demo. The TortoiseX line of products (TortoiseHg, TortoiseSVN) is great for this, because they make it non-scary.
Item 2 is the hardest: to demonstrate that this is the appropriate solution. Perhaps, to convince him of this, you might refer to anecdotes of other programmers or look at GitHub, where you can look back at previous versions of a product. I'm clutching at straws here, because I feel like his argument will be, "Ach, and that's when it's a huge headache, is when things break. It won't be worth it."
Obviously there are a large number of ways to deal with people (and for the most part you have a "people" problem).
The first thing I'd do is find out why he's so against source control. Often, people who don't like source control either:
Don't like the extra work of committing
Don't always work next to an internet connection
See no extra value in it
There are different solutions to each of these problems. Obviously the third one is tricky, so I'll handle it last.
If he doesn't like the extra work of committing, some cron scripts will help (or the Windows Task Scheduler): something that regularly commits in the background, or recursively goes through his files and adds them for the next commit. This will mean you'll do a little more work on your end to clean up extra files and deal with broken builds, but it's a step. Alternatively, if he's emailing you the code, a script that commits the emails works as well.
If he's not always working next to internet access, consider a system like Git. The advantage of Git (over something like SVN) is that it is distributed: commits go into a local repository, and you exchange changes with other Git users later by pulling or pushing. If you are working on a plane and don't have internet access, this is a valuable feature.
Finally, demonstrating the importance of the system is tough. The best example is almost always: "My machine burned down." I suppose you could nuke his box, but for the moment let's look at ways that don't piss off your boss.
A good way to demonstrate the importance of a repository is a Daily Build. Having a daily build means you can readily integrate features and find bugs faster. Setting up a repository with a daily build will significantly improve your work conditions, and it's likely to make a good impression.
These are just a few of the reasons that people don't like source control, but the key idea is finding what his reason is and adapting to it.

How can I convince my department to implement a version control system?

I recently joined the IT department of a big insurance company. Although the department's title is "IT", a lot of code gets written here: Java, JSP, JavaScript, COBOL and even some C++ from what I've heard. All the programs that allow insurers, brokers and the rest of the tie-wearing, white-collar workers to issue new contracts and deal with clients run on the code produced by this department. I've been told that the department is pretty good by the parent company's standards and that we've even received an internal award or two. We're 17 people in the department, split in smaller groups of 2 or 3. As you might've guessed from the COBOL part further up, the average age is over 40 years (as a point of reference I'm 29 yo).
Right now, there is no version control system in place (there exists a general backup scheme though). When needed, files are passed around through shared folders. Usually there's one person in every group responsible for copying the "final" version of the files back to the production server. I find this absurd and even a bit dangerous.
How may I try to convince management that we should implement a VCS scheme in our department? I've never deployed a VCS myself but every other place I've worked at had one. I think I'll hit a "we've been OK until now, why bother" wall from the first step, coupled with the age of most of my co-workers that will feel this step is an unnecessary hurdle.
I know the basic advantages of VCS (traceability, granular backups, accountability etc). I'm looking to back my case with realistic cases and examples of real added value over the implementation costs, not just a "but-but-but, we must have a VCS you fools!" :-)
You don't necessarily need their permission.
Install svn on your machine, start using it, and then start convincing your fellow team members to use it too.
Then watch and see what happens.
Edits
The basic idea of this is that it's easier to show than to tell.
It's a great idea to support your ideas with a working implementation/solution.
Of course, if you succeed, and they want the system used department/company wide, you must be prepared to support the transition, know how the software is to be installed and used.
Going ahead and using something accepted in the industry is faster than having discussions on what system should be used.
There is a good chance that this will get you noticed. You may also get your peers' respect and support.
As suggested, the same approach can be made on other areas:
issue/bug tracking systems
quality tools
time tracking
continuous integration
a wiki for knowledge base, HOWTO's, guidelines, tutorials, presentations, screencasts
different IDEs and tools
build tools
automated deployment
various scripts that would save your team time
.. any item that will visibly add quality to your work, but doesn't (yet) disrupt existing methodologies and practices.
Joel Spolsky has an excellent article: Getting Things Done When You're Only a Grunt
Quote
Nobody on your team wants to use source control? Create your own CVS repository, on your own hard drive if necessary. Even without cooperation, you can check your code in independently from everybody else's. Then when they have problems that source control can solve (someone accidentally types rm * ~ instead of rm *~), they'll come to you for help. Eventually, people will realize that they can have their own checkouts, too.
Management? I will put in bold the expressions and words you should use:
You should show some examples of how a VCS will keep the company from losing money when errors, bugs, or a disaster happen. Problems will be solved faster, so maintaining the systems won't be so slow and people will become more productive.
You should also mention that implementing a VCS has no costs.
A VCS also helps with backing up all the existing code, so all the code will be safe.
My opinion on how to go about doing this, is that you should try to convince your fellow developers first. The way I see it, there are two ways this might go about:
You give the right arguments to the other developers (possibly only the head developers will suffice), they like the idea, and they suggest it to management. Management is easy to convince at that point, so everyone is happy.
You give the right arguments to management, who get all excited (great!) and mandate that version control has to be installed and used by everyone. Here's the thing: if at this point the other developers are not already sold on the idea, then (a) they might be hostile to an idea that management is forcing upon them, and (b) they might not like you for being the cause of it all.
So what are good arguments to convince fellow developers? As someone who uses subversion (which is the one I recommend in this case, by the way) even for his solo projects, here's a few advantages I can think of:
Using version control forces people to think of code modification in terms of a series of small, self-contained changes. This is an extremely beneficial way of working: where otherwise people would be inclined to make lots of changes all over the place, leaving the code in a mess, version control kinda forces them to change the code in bite-size, easy-to-swallow bits, keeping the code compiling at all times, easing the cost of integration with other modules, etc.
Version control makes it very easy to see what has changed in the code each time. This might sound trivial, but when you start modifying code it's easy to lose track after a while. But with VC it's all an "svn diff" (or equivalent) away, always.
Version control makes it very easy to see who has changed the code each time. So that, for example, when something breaks, you know who to blame. (It's not an accident that the subversion command which shows who last changed each line is called "svn blame".)
Version control makes it very easy to see why a piece of code was changed. Commit messages, if used properly, essentially provide continual documentation of the ongoing development process. Documentation that otherwise wouldn't be written.
Version control makes it very easy to track down regressions and see where they appeared. In the easy case, you just track down commit messages and spot the culprit. In the average case, you have to consult the diffs too. In the hard case, you have to do regression testing of previous versions using what amounts to binary search, which is still better than the no-VC case, where you simply have no clue.
This list is not exhaustive, of course, but these are the main benefits that come to mind right now. Obviously, as others have already mentioned, it's easier to show all this to your colleagues than to describe it to them, and setting it up for yourself first (but importing everyone's code, mind) sounds like a great idea.
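The "binary search" over previous versions mentioned above can be illustrated with a small sketch. It is not tied to any particular VCS: the revision names and the `is_good` check are hypothetical stand-ins for checking out a revision and running a test against it, and the sketch assumes the oldest revision is known to be good and the newest known to be bad.

```python
from typing import Callable, Sequence


def first_bad_revision(revisions: Sequence[str],
                       is_good: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_good() is False.

    Assumes revisions are ordered oldest to newest, the first one is good,
    the last one is bad, and every revision after the first bad one is bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(revisions[mid]):
            lo = mid  # regression was introduced after mid
        else:
            hi = mid  # regression was introduced at or before mid
    return revisions[hi]


# Hypothetical history r1..r8 where the regression first appears in r6.
history = [f"r{n}" for n in range(1, 9)]
culprit = first_bad_revision(history, lambda rev: int(rev[1:]) < 6)
print(culprit)
```

With eight revisions this needs three checks instead of up to six, which is why the hard case is still manageable when the history is long.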
As Joel points out in one of his articles, start using your own one-man version control system and market its benefits at every opportunity you get. Show them the benefits of traceability, granular backups etc. from your single instance. People will start realizing its benefits irrespective of their age.
I agree with the answers that refer to Joel's guerrilla article.
Install and use something with a low overhead. Hg (Mercurial) is easy in a mixed environment and is good because you can easily bail out and use something else.
You must share your things without making a fuss about it. When someone needs your code, export it and use the "standard" corporate method (shared folder or whatever).
When you get code, always import it into a repo. If you think it is a new commit to a repo you already have, try to get it into that one.
Sooner or later you will have code for several projects and hopefully some commits in some repos. Then you can expose those with Mercurial's web server interface (hg serve -p XXXX).
When the time comes that someone doesn't know why something suddenly doesn't work as it should, and is trying to figure out why because it was working last Monday... and you know that you have that code in a repo, step up and ask if you can be of any assistance. Get the faulty code, commit it into your repo and expose it with hg serve. Look at it in the browser.
My point is that you must prove to your colleagues, with real cases, that this stuff has value.
If they haven't figured it out by themselves after so many years, you have a mountain to climb, but it can be done. You must be patient, though. It could very well take a year to convert one man (old dog). If you have any younger coworkers, try to do this together; the more code you can get hold of, the better.
I would point out the hazards of not having one - lost code, developers overwriting each other's changes, no ability to roll back problems, etc.
Also, since Subversion and some others are free, point out there is no real cost to purchase, just the time to implement.
The biggest issue you will have as the new guy is that you will be seen as rocking the boat; if they have had no issues to date, they will be hard to convince. Perhaps start using it locally just for yourself, and maybe they will like what they see and start to adopt it.
I would try small steps: maybe ask the others if they have ever used one, point out the benefits, and when an issue arises that such a system would have prevented or helped with, point it out delicately.
From a purely business perspective, and depending on the size and nature of your parent company, an IT auditor may consider your lack of a VCS a finding (i.e. something that needs fixing). I believe you could improve your pitch to management by telling them that any VCS is a great way of showing that your department respects its resources and works in a structured and efficient way, something auditors always like to see.
I don't know how your corporate culture works, but I'd be careful about rolling out your own VCS, since if it does see use it suddenly becomes your responsibility when things go wrong, even if you were not at fault. To cover your ass (and keep the aforementioned auditors even happier) I'd roll the system out with a full set of written procedures for its use and maintenance.
Finally while I myself am a big fan of initiative at any level of the enterprise don't expect people to remember to say thanks when they figure out how great it is. Some might, but for the most part you're doing this to make your own life easier and for your own karma.
Remember, there are plenty of version control systems that are absolutely free. And the amount of time spent installing and maintaining a version control system should be somewhere near 0 (they shouldn't require any maintenance). There isn't even a space penalty for most systems, as they can compress things internally.
You have listed some advantages, and there are others. But more importantly, I can't think of a single disadvantage.
I would also recommend starting by implementing a VCS (Version Control System) for yourself first. I'd recommend using one of the distributed VCSs (Git, Mercurial, Bazaar) rather than centralized Subversion, because it would be easier to create a central repository (or repositories) by cloning than to move your Subversion repository to a central place. A distributed SCM can also be used in a smaller group to exchange ideas.
A few advantages of (modern) version control systems:
You can always revert (go back) to the last working version of your code (provided that you follow some sane version control conventions, like at least tagging only tested code). With code shared via folders it might turn out that no single version works, backup copies were deleted to save space, and recovering code from backup is tedious / was never tested.
You can switch between working on some new feature (some experimental work) and working on an urgent fix in the currently deployed version (maintenance work) thanks to branches (and stash / shelve for uncommitted work).
If you follow good practices for version control (small, frequent commits, each change being about one single thing, good commit messages describing the change and the reasons for it) you will have a much, much easier time finding bugs, be it by bisecting history to find which change introduced a bug, or by using the version control system to look up who was responsible for a given area of code (annotate / blame).
Start talking to the other developers about problems they have had in the past as a way to get to know the system and how it evolved (sneaky, sneaky, sneaky, but hey, this information will probably come in handy at some point even aside from the version control issue). You are bound to sooner or later find some wonderful examples of things that have already happened which would have been far less painful if you had version control. Use these examples when you present the idea to management.
I agree with the idea that you can probably start using your own version control and eventually will be able to help them out of a fix, but I'd bet money they have been in some of those fixes already, and if they already remember how painful the problem was before, it will help sell the new idea.
Look for another job.
Seriously.
There are way better jobs out there that don't require you to teach the existing staff.
Ones where you could go into work and just, y'know, work.
Also, keep in mind that 30 isn't far off. That's the age at which most people stop suffering fools gladly.
Just a heads up.
EDIT
It's been suggested that quitting a bad job is for quitters.
Maybe so, but keep in mind that you're supposed to put your employer to the Joel test before you accept the job, not after.

Log messages for revision control by yourself

I use version control extensively. When I'm working by myself, I still use it, and find many good things about it. I know I'm 'supposed' to put in good messages etc, but find that usually the date of a commit and all the tools for checking diffs etc are enough. I often end up putting in junk messages like 'changes'.
I guess this is a weird question, but, what do others use as their log messages when they're making commits in repositories that only they are using? Is there any problem with not leaving messages?
I happen to use git, but this question is more general.
For me it depends on the nature of the fix. Sometimes, it's just one word: "Backup", or "copy changes". However, if it is something that caused me a lot of grief, I'll document my changes a lot more extensively. If it is open source and I won't be there all that long, I document my changes very extensively (I run svn diff and then document all my changes that way... :)
Bug fixes that are identified by a number in another system need to be in the change log.
I'll grant you that "Fixed bug" isn't very good in the change log, but if it is a simple bug then maybe that will do.
I don't think there is a hard and fast rule, but your entry should be proportional to the amount of time you spent on the code. A copy change? Spelling mistake? Not that much of a message needed.
Did you spend 2 hours fixing a bug? Yep! Long commit message.
I'm a solo developer using version control as well. I recently started using an issue tracking system that monitors the messages, so mine have gotten better and at least reference an issue number when there is one. The rest of the time, I try to at least generally state what areas changed in a short sentence or two.
But every once in a while, I still get lazy (or am half asleep) and type in things like "fixed a bug".
You should put in messages as meaningful as those you would put into code that has multiple developers. There's usually little difference between someone else looking at your changes in a couple of days, and you looking at them 12 months down the track. There's a good chance, in both those situations, that the person looking will have no idea why the change was made :-)
I even go so far as to use proper change control, even for the stuff I do solo. That means every change to the code base has to have either a change request or a bug report (with full documentation).
That makes my life a lot easier when I need to understand why something was done. I've got better uses for my "wetware" than trying to remember every little change and why it was done. Far better to let the machine remember it - its memory is so much better.
And, in my opinion, if you can't be bothered doing it right, don't do it at all. Just revert to the cowboy-coder mentality and save yourself some effort.
Doing it right doesn't take that much extra effort and the rewards are substantial. It all comes down to a cost/benefit analysis.

What is an appropriate level of detail for check-in notes?

When I check in code, I sometimes write very long, detailed check-in notes; other times I write very short ones (or no note at all). The longer notes tend to include information about why the change was made (business reasons, customer interactions, etc). However, I'm not sure if check-in notes are the right place for such detail. Most check-in notes I've seen tend to be short and simply reference a bug.
What is an appropriate level of detail for check-in notes?
Whatever your manager or company documentation tells you ;)
That being said, shorter is better. It's not the correct tool for lengthy documentation - your bug/feature tracking software is built for this and in most cases, can integrate with your source control.
Just enough that, when following the log a few weeks later, you have an idea of what happened.
I use these logs to check what has been done in the last day (or days) in the project I'm leading.
Shorter messages aren't necessarily better, and neither are longer ones. Just keep in mind the goal of those comments: to give an overview of the activity in the versioning system.
The right answer, I've found, is dependent on the needs of your organization. It sounds fuzzy, but the primary reason to provide detail for a code check-in is for context and understanding if that check-in needs to be reviewed or revisited. It might be incredibly verbose, or it may be remarkably simple.
In one company, our code check-ins would reference #+ticket-number. This mapped our SVN commits against a Trac ticket number, which held all of our details about a given issue or feature we were implementing. We referenced everything through Trac, so keeping our details in that form worked best for us.
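As a rough illustration of why a "#" plus ticket number convention is convenient, the references can be pulled out of a commit message mechanically and handed to the tracker. This is only a sketch: the pattern, the function name, and the sample messages are assumptions, not how Trac or SVN do it internally.

```python
import re
from typing import List

# Assumed convention: a ticket is referenced as '#' followed by digits.
TICKET_REF = re.compile(r"#(\d+)")


def ticket_numbers(message: str) -> List[int]:
    """Return every ticket number referenced in a commit message, in order."""
    return [int(number) for number in TICKET_REF.findall(message)]


# Hypothetical commit messages.
print(ticket_numbers("Fix rounding in premium calculation, refs #142 and #7"))
print(ticket_numbers("changes"))
```

A message with no reference yields an empty list, which a commit hook could use to flag check-ins that are missing their context.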
For you, it depends on how you and your team work. I would base what info you keep in your check-ins on the need for the data, how often it's referenced, and what happens if you lose context (i.e., have no idea why a change was implemented).
Another consideration may be accessing those notes outside your code repository, which may not be the most effective mechanism for storing that information. Nonetheless, I find it's personal preference.
In my version-control experience, I tend to curse the ones that left no note at all, or a note that takes 5 minutes to dig through.
If you use your version control system to browse the history of a file to find a specific change, it's best to include a short comment on the why, and the what. The how is to go in the source code documentation.
Whenever I write a comment or a commit log message I ask myself "what will the next guy need to know? what are they likely to ask me about?"
Answering a question seems to be the easy way to keep comments brief and useful. It also avoids anti-documentation (rephrasing code, often in unintentionally ironic ways) or re-phrasing the metadata the vcs will be tracking anyway (added foo.java, tuesday change, new tag "bar-1-1-4")

How often to commit changes to source control? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
How often should I commit changes to source control? After every small feature, or only for large features?
I'm working on a project and have a long-term feature to implement. Currently, I'm committing after every chunk of work, i.e. every sub-feature implemented and bug fixed. I even commit after I've added a new chunk of tests for some feature after discovering a bug.
However, I'm concerned about this pattern. In a productive day of work I might make 10 commits. Given that I'm using Subversion, these commits affect the whole repository, so I wonder if it is indeed good practice to make so many?
Anytime I complete a "full thought" of code that compiles and runs, I check in. This usually ends up being anywhere between 15 and 60 minutes. Sometimes it could be longer, but I always try to check in if I have a lot of code changes that I wouldn't want to rewrite in case of failure. I also usually make sure my code compiles and I check in at the end of the work day before I go home.
I wouldn't worry about making "too many" commits/check-ins. It really sucks when you have to rewrite something, and it's nice to be able to rollback in small increments just in case.
When you say you are concerned that your "commits affect the whole repository" --- are you referring to the fact that the whole repository's revision number increases? I don't know how many bits Subversion uses to store it, but I'm pretty sure you're not going to run out of revision numbers! Many commits are not a problem. You can commit ten times as often as the guy next door and you won't increase your carbon footprint at all.
A single function or method should be named for what it does, and if the name is too long, it is doing too much. I try to apply the same rule to check-ins: the check-in comment should describe exactly what the change accomplishes, and if the comment is too long, I'm probably changing too much at once.
I like this small article from Jeff Atwood: Check In Early, Check In Often
I personally commit every logical group of code that is finished/stable/compiles and try not to leave the day without committing what I did that day.
If you are making major changes and are concerned about affecting others working on the code, you can create a new branch, and then merge back into the trunk after your changes are complete.
If your version control comment is longer than one or two sentences, you probably aren't committing often enough.
I follow the open-source mantra (paraphrased) - commit early, commit often.
Basically whenever I think I've added useful functionality (however small) without introducing problems for other team members.
This commit-often strategy is particularly useful in continuous integration environments as it allows integration testing against other development efforts, giving early detection of problems.
I commit every time I'm done with a task. That usually takes 30 mins to 1 hr.
Don't commit code that doesn't actually work. Don't use your repository as a backup solution.
Instead, back up your incomplete code locally in an automated way. Time Machine takes care of me, and there are plenty of free programs for other platforms.
The rule of thumb, that I use, is check-in when the group of files being checked-in can be covered by a single check-in comment.
This is generally to ensure that check-ins are atomic and that the comments can be easily digested by other developers.
It is especially true when your changes affect a configuration file (such as a spring context file or a struts config file) that has application wide scope. If you make several 'groups' of changes before checking in, their impact overlaps in the configuration file, causing the 2 groups to become merged with each other.
I don't think you should worry so much about how often. The important thing here is what, when and why. Saying that you have to commit every 3 hours or every 24 hours really makes no sense. Commit when you have something to commit, don't if you don't.
Here's an extract from my recommended best practices for version control:
[...] If you are doing many changes to a project at the same time, split them up into logical parts and commit them in multiple sessions. This makes it much easier to track the history of individual changes, which will save you a lot of time when trying to find and fix bugs later on. For example, if you are implementing feature A, B and C and fixing bug 1, 2 and 3, that should result in a total of at least six commits, one for each feature and one for each bug. If you are working on a big feature or doing extensive refactoring, consider splitting your work up into even smaller parts, and make a commit after each part is completed. Also, when implementing independent changes to multiple logical modules, commit changes to each module separately, even if they are part of a bigger change.
Ideally, you should never leave your office with uncommitted changes on your hard drive. If you are working on projects where changes will affect other people, consider using a branch to implement your changes and merge them back into the trunk when you are done. When committing changes to libraries or projects that other projects—and thus, other people—depend on, make sure you don’t break their builds by committing code that won’t compile. However, having code that doesn’t compile is not an excuse to avoid committing. Use branches instead. [...]
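The advice about committing each module separately can be sketched as a small helper that groups a list of changed files so each group can become its own commit. Treating the top-level directory as the "module" is an assumption made for this illustration, and the file names are made up; a real project may draw module boundaries differently.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List


def group_by_module(changed_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group changed file paths by their top-level directory."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        # Files that sit directly in the project root get their own group.
        module = parts[0] if len(parts) > 1 else "(root)"
        groups[module].append(path)
    return dict(groups)


# Hypothetical working-copy changes.
changes = ["billing/invoice.py", "ui/form.py", "billing/tax.py", "README"]
for module, paths in group_by_module(changes).items():
    print(module, paths)
```

Each group would then be reviewed and committed on its own, with a message that describes only that module's change.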
Your current pattern makes sense. Keep in mind how you use this source control: what if you have to rollback, or if you want to do a diff? The chunks you describe seem like exactly the right differential in those cases: the diff will show you exactly what changed in implementing bug #(specified in checkin log), or exactly what the new code was for implementing a feature. The rollback, similarly, will only touch one thing at a time.
I also like to commit after I finish a chunk of work, which is often several times a day. I think it's easier to see what's happening in small commits than big ones. If you're worried about too many commits, you may consider creating a branch and merging it back to the trunk when the whole feature is finished.
Here's a related blog post: Coding Horror: Check In Early, Check In Often
As others have stated, try to commit one logical chunk that is "complete" enough that it does not get in other devs' way (e.g., it builds and passes automated tests).
Each dev team / company must define what is "complete enough" for each branch. For example, you may have feature branches that require the code only to build, a Trunk that also requires code to pass automated tests, and labels indicating something has passed QA testing... or something like that.
I'm not saying that this is a good pattern to follow; I'm only pointing out that how done is "done" depends on your team's / company's policies.
I also like to check in regularly. That is, every time I have completed a step towards my goal.
This is typically every couple of hours.
My difficulty is finding someone willing and able to perform so many code reviews.
Our company policy is that we need to have a code review before we can check anything in, which makes sense, but there is not always someone in the department who has time to immediately perform a code review. Possible Solutions:
More work per check-in; fewer check-ins == fewer reviews.
Change the company check-in policy. If I have just done some refactoring and the unit tests all run green, maybe I can relax the rule?
Shelve the change until someone can perform the review, and continue working. This can be problematic if the reviewer does not like your code and you have to redesign. Juggling different stages of a task by 'shelving' changes can become messy.
The moment you think about it.
(as long as what you check in is safe)
Depends on your source code system and what else you have in place. If you're using Git, then commit whenever you finish a step. I use SVN and I like to commit when I finish a whole feature, so, every one to five hours. If I were using CVS I'd do the same.
I agree with several of the responses: do not check in code that will not compile; use a personal branch or repository if your concern is having a "backup" of the code or its changes; check in when logical units are complete.
One other thing that I would add is that depending on your environment, the check-in rate may vary with time. For example, early in a project checking in after each functional piece of a component is complete makes sense for both safety and having a revision history (I am thinking of cases where earlier bits get refactored as later ones are being developed). Later in the project, on the other hand, entirely complete functionality becomes more important, especially during integration development/testing. A half-integration or half-fix does not help anyone.
As for checking in after each bug fix: unless the fix is trivial, absolutely! Nothing is more of a pain than finding that one check-in contained three fixes and one of them needs to be rolled back. More often than not, in that situation the developer fixed three bugs in one area, and unwinding which change belongs to which bug fix is a nightmare.
I like to commit changes every 30-60 minutes, as long as it compiles cleanly and there are no regressions in unit tests.
Well, you could have your own branch to which you can commit as often as you like, and when you are done with your feature, you could merge it to the main trunk.
On the frequency of commits, I think of it this way: how much pain would it cause me if my hard disk crashed and I hadn't committed something? The quantum of this something for me is about 2 hours of work.
Of course, I never commit something that doesn't compile.
At least once a day.
I don't have a specific time limit per commit; I tend to commit once a test has passed and I'm happy with the code. I wouldn't commit code that does not compile or is otherwise in a state that I would not feel good about reverting to in case of failure.
You have to balance the compromise between safety and recoverability on the one hand and ease of change management for the entire project on the other.
The best scheme that I've used has had two answers to that question.
We used two completely separate repositories: one was the project-wide repository and the other was our own personal repository (we were using RCS at the time).
We would check into our personal repository very regularly, pretty much each time we saved our open files. As such, the personal repository was basically a big, long-ranging undo buffer.
Once we had a chunk of code that would compile, tested ok and was accepted as being ready for general use it was checked into the project repository.
Unfortunately, this system relied on the use of different VCS technologies to be workable. I've not found any satisfactory method of achieving the same results while using two VCSs of the same type (e.g. two Subversion repositories).
However, I have had acceptable results by creating "personal" development branches in a Subversion repository: checking into the branch regularly and then merging into the trunk upon completion.
If you're working on a branch which won't be released, a commit is always safe.
However, if you are sharing it with other developers, committing non-working code is likely to be a bit annoying (particularly if it's in an important place). Normally I only commit code which is effectively "working": not that it's been fully tested, but that I've ascertained that it does actually compile and does not fail immediately.
If you're using an integrated bug tracker, it may be helpful to do separate commits if you've fixed two bugs, so that each commit log entry can be filed against the right bug. But then again, sometimes one code change fixes two bugs, so then you just have to choose which one to put it against (unless your system allows one commit to be associated with multiple bugs).
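As a rough illustration of associating one commit with multiple bugs, the sketch below pulls bug IDs out of a commit message. The "#123" reference style and the function name are assumptions made for this example, not the syntax of any particular tracker.

```python
import re

# Assumed convention: bugs are referenced in the message as "#<number>".
BUG_REF = re.compile(r"#(\d+)")


def bugs_referenced(message):
    """Return the distinct bug IDs mentioned in a commit message, in order of appearance."""
    seen = []
    for match in BUG_REF.finditer(message):
        bug_id = int(match.group(1))
        if bug_id not in seen:
            seen.append(bug_id)
    return seen
```

A commit whose message mentions two IDs could then be linked to both bugs, while a commit that fixes a single bug keeps a one-to-one mapping that is easy to roll back.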
I still believe in the phrase 'commit often, commit early'. I prefer a decentralized VCS like Mercurial, where there's no problem committing several things and pushing them upstream later.
This is really a common question, but the real question is: Can you commit unfinished code?
Whenever you finish some code that works and won't screw anyone else up if they get it in an update.
And please make sure you comment properly.