What is the use of commit messages? - version-control

I struggled asking that question but here it is.
I have been using source control for several years on multiple projects with different systems (svn, hg, git), and I have learned how to improve my messages by following guidelines and so on.
But as far as I can remember I never ever had a look at them afterwards.
So ... how do you profit from your own commit messages? When I need to go back because I smashed something and need a fresh start, I usually just go back to the latest "node" (where I started or merged a branch). Do I write those messages just for people monitoring the project who are curious what is going on?
Regards

You write them as an aid to your future self, and others on the team. To give you some background of when I have found them useful:
I used to work on a project where commit messages were invaluable - on more than one occasion I used them to track down code that was years old. On that project our bug tracking system was also integrated with our VCS (ClearCase). So when you checked in a change, it would record the bug number in the commit comments. This was very helpful to allow you to trace back exactly what was changed and why.
So to sum it up, although commit messages may seem pointless if you are just starting out (especially if you are the only one working on the project), they become invaluable once you have a successful product that is supported in production by multiple developers.
Update
Another useful feature of commit messages is that they require you to review and summarize the changes you just made. Even if I remember what I have changed, I will often do a quick diff of a file before checking it in. I will briefly read it all over again to make sure there are no typos, that I changed everything I meant to, etc. This is a simple way to review your code for those small bugs that would otherwise find their way into your code. Anyway, after doing this I have a clear picture of what changed, so I use this to write a concise summary of the change when checking in the file. This is a simple habit that helps increase code quality with little effort on your part.

"Send me a list of the things you did in the past two weeks" - Boss

Your messages are more for other users than for yourself, although I make sure to write good commit messages even on personal repos. It helps when you get sidetracked, come back to a project months down the line, and need to get a handle on the most recent work.

One thing I've found is that the commit messages are a good way to keep myself from not committing often enough. If I can't put the changes into a short commit message I probably should have committed the changes earlier.

In the best case, a commit is bound to a work item in a feature/bug tracker. That way you will be able to easily see which feature/bug has been implemented/fixed. This is not only useful to know if a certain revision contains a feature or bug fix but also to easily create a release note.
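As a rough illustration of how commits bound to work items can feed a release note, here is a minimal sketch. It assumes a hypothetical ticket format (`BUG-123` or `FEAT-45` appearing somewhere in the message); real trackers use their own formats, and the function name is made up for this example.

```python
import re
from collections import defaultdict

# Assumption: tickets look like "BUG-123" or "FEAT-45". Adjust to your tracker.
TICKET_RE = re.compile(r"\b(BUG|FEAT)-(\d+)\b")


def draft_release_notes(messages):
    """Group commit summary lines by the work items the messages reference."""
    notes = defaultdict(list)
    for message in messages:
        lines = message.strip().splitlines()
        summary = lines[0] if lines else ""
        # A message may mention several tickets; file the summary under each.
        for kind, number in TICKET_RE.findall(message):
            notes[f"{kind}-{number}"].append(summary)
    return dict(notes)


commits = [
    "BUG-12: handle empty cart at checkout",
    "FEAT-7: add CSV export\n\nAlso touches BUG-12 validation.",
    "tidy whitespace",
]
release_notes = draft_release_notes(commits)
for ticket, summaries in sorted(release_notes.items()):
    print(ticket, summaries)
```

Commits that reference no ticket (like the last one) simply do not show up, which is one reason to put the reference in every message.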

What would be the point of a commit without a note to tell you what it is? It's like asking 'Why do books have titles on the sides?', or perhaps 'Why do books have indexes and page numbers?'. It seems to me that a source control log that didn't have a description for each change wouldn't be very useful.
Reasons you may need to refer to the commit message include
A bug has surfaced and you want to find when that part of the code was changed last
You decide to undo some changes and need to decide which revision to revert to
For either of these possibilities, without good commit messages, you would be left looking through the diffs for every single commit until you found what you were looking for in the code.

Related

Rules for Committing code on SVN with Multiple Developers

We are working on a single project and commit our code to SVN at the end of each day so that everyone can get the updated project. The issue is that very often we get errors while committing, and a project ends up empty if someone updates at that moment. So my question is: is there a set of rules we should follow when committing, so that everyone stays on the same path, nobody gets into trouble, and we save the time lost to these errors?
Thanks in advance. Cheers
Search Google, something similar to "source control best practices".
Top result has several tips. Sounds like the biggest problem you're facing is integrating with others' changes. Perhaps look into the following sections:
Incorporate others' changes frequently
Share your changes frequently
Coordinate with your co-workers
Investigate why you get errors. Blind application of rules is not good.
For example:
person A committed code that produces a compilation error
why?
he finished his task, but didn't check the build before committing to the trunk
why?
the entire build is too slow
solution: speed up the build, and set up a continuous build system which will check every commit and notify developers about problems as soon as possible
Another example:
person B committed code which breaks the build
why?
he wanted to store his changes, but the task is not finished
solution: advise him to create a branch; when the task is finished it can be merged to the trunk (if the branch lives for a long time, merge changes from trunk into it periodically, then the final merge will not be a problem)
There are other possible scenarios. With more details you will be able to ask a more precise question on StackOverflow and get better answers.
When using SVN, coordination within the team is a must.
Whenever you commit code, make sure you know exactly what you are committing.
The golden rule is "One developer One class".
If two different developers do work on the same class, ask them to maintain a document describing the changes they have made and, most importantly, to mark those changes with comments in the class itself.
There are some important things to follow while committing code. Whenever you see a conflict between your local files and the server files, make sure you go through every conflict and select the appropriate action.
Understand one important thing: whenever SVN is used, one person's mistake can affect everyone.
Whenever possible, don't edit the same line of code as someone else
Leave meaningful comments on your commit messages
Comment your code
Make sure that all the code you commit compiles and runs as it should
Commit when appropriate, i.e. when you are passing off part of your code to be used by others or are done working on a feature
Make sure to communicate with other team members
If you find code that you don't know what it does, ask the author
If you're making major changes that will take a while to implement, use a branch and then merge it back in

File history: in the source or let scm handle it?

I'm learning mercurial as my solo scm software. With other management software, you can put change comments into the file header through tags. With hg you comment the change set, and that doesn't get into the source. I'm more used to central control like VSS.
Why should I put the file history into the header of the source file? Should I let mercurial manage the history with my changeset comments?
Let the source control system handle it.
If you put change details in the header it will soon become unwieldy and overwhelm the actual code.
Additionally if the scm has the concept of changelists (where many files are grouped into a single change) then you'll be able to write the comment so that it applies to the whole change and not just the edits in the one file (if that makes sense), giving you a clearer picture of why the edit was required.
Yes; let the source control system handle your changeset comments. The rationale for this is that it makes considerably more sense when you're viewing the change log later, trying to work out what's going on between two versions of a file - the source control system can present the change comment to try and enlighten the situation.
There's no reason to manually maintain a file history when SCM software is much better suited to solve this problem. All too often I see partially-completed file histories in the source, which actually hurts, because people incorrectly assume it is accurate.
The difference is not whether it's a centralized or distributed VCS, it's more about what's being changed.
When I moved to .Net, the number of files updated for any individual change seemed to skyrocket. If I had to log the change in each file, I'd never get any real work done. By commenting on the set of changes, it doesn't matter how many files I had to update.
If I ever needed to identify all of the changes for a particular change, I can diff between the two versions of the project.
The biggest difference (and advantage) I saw when switching away from SourceSafe was the switch from file based to project based commits. As soon as I got used to that, I stopped adding change-log type comments to all of my files.
(As a side effect, I've found that my process description comments have gotten better)
I'm not a big proponent of littering the code with change comments. In the event they are needed they can be looked up in the SCM (at least for the SCM variants I have used). If you do want them in the file, consider putting them at the end instead of the beginning. That way you won't have to scroll down past the (uninteresting, to me at least) comments before you get to the actual code.
Another vote for letting the SCM system handle the checkin comments, but I do have one thing to add.
Some systems allow you to use RCS tags in your source code where the SCM can insert the change history directly into the source file being committed automatically. Sounds like a nice balance because the history is then in the SCM system and then automatically put into the source code itself.
The problem is that this process changes the source file. I think that's a bad idea because the file cannot be changed on disk until after your comment is inserted. If you are a good engineer, you will have built and tested your changes before the commit. If your source changes after the commit, then you've essentially got a build that could be broken - but most engineers won't build after a commit - why should they?
But it's just a comment you say! True, but I did have a case where there was code in my source file that strangely enough had reason to look like an RCS header tag and that section of the code got replaced on checkin, thereby munging my code. Easy enough to fix, but bad that a build got broken for 20+ users
It is much easier to forget to maintain history in the source. Since one should (imo) always comment commits to the source control system anyway, that problem disappears. Also, if you change lots of files before a commit, updating the history in every file is annoying work. This is really one of the points of having an scm.
I have experience with this. I've had the file history in the comments, it was awful. Nothing but garbage, sometimes you would have to scroll down almost 1k lines of code changes before you finally got to what you wanted. Not to mention, you're slowing down other aspects of your build process by adding more kb to your source code tree.

Why are check-in/commit comments a required field in some source control systems?

In Perforce (at least the GUI) a check-in/commit comment is required. (I don't believe they are required in Git or Subversion.) Most developers that work with me just fill it in with latest/updated/etc. I used to write meaningful descriptions, but at about 20 comments a day, writing stuff like 'replace an image' or 'changed spelling of franhcise' gets really annoying. Furthermore, most changes can be quickly seen in a Diff.
At first I thought I was just being lazy, but I tend not to even look at them when reviewing other peoples code. I'd rather go right to the Diff. Am I alone? Are required comments a good idea?
You should always leave good comments. Not necessarily describing what you changed, unless it is a large changeset with too many distracting little details... but always, always, describe why you made the change (maybe link to a bug tracker item if there is one).
When I'm looking at your diff a year later, after realizing that it introduced a subtle bug, I need to know why the change was made - if I can't find a good reason, I'm just going to roll it back and curse your lazy ways... ;-)
Meaningful comments serve several purposes:
If you're looking for a particular change in a version history, they let you quickly scan through the file's history (eg: "Hey, I know we fixed a bug about the flicker of this widget sometime in March last year. Do you remember what was the fix for that?").
They encourage you to make atomized commits. If you end up making check-ins with generic comments, that probably means you're doing too many things at once.
As mentioned earlier, they let you know why things changed. Sure, a diff can tell you, for instance, how the tax computation changed for item such and such. But it won't tell you that it's because law XYZ for taxation changed.
They make it easier to write release notes, or equivalent documentation.
Perhaps a bit of a different perspective:
If you want to review ALL the changes for a year or since the last release - do you want to look at all the diffs, or would you like to see a good commit comment and a link to a defect/issue item?
If you're making 20 check-ins per day, you're probably checking in too frequently. Group all the minor typo fixes into a single checkin with a comment of "fixed various typos".
Writing a meaningful comment takes about 30 seconds, so just get over it and do it.
As has been discussed in the comments to Shog9's answer, enforcing the comment on the tool level does not necessarily help keep the lazy people in line, because the requirement is too easy to circumvent (as was already mentioned in the question: just type "latest"/"updated"/etc, or even "sfakjs;d", which is probably more harmful than an empty line).
However, the fact that the tool requires it may serve as a reminder for a normally diligent developer who is accidentally going to commit without any explanation. If it does this even once, then we are on the plus side (i.e., the requirement is beneficial), because normally the functionality does not make any difference – the good guys write the comments anyway, whereas the bad gals can always get around the requirement, no matter what technical barriers you set up. (Whether you want to keep them employed is another question, of course.)
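To make the point about tool-level enforcement concrete, here is a minimal sketch of the kind of check a commit hook could run. It is not the API of Perforce or any other system; the placeholder list, the word-count threshold, and the function name are all assumptions chosen for illustration.

```python
# Assumption: a hand-picked list of filler words; extend it for your team.
PLACEHOLDERS = {"latest", "updated", "update", "fix", "changes", "wip"}


def is_meaningful(message, min_words=3):
    """Heuristic: reject very short messages and messages made only of filler."""
    words = message.strip().lower().split()
    if len(words) < min_words:
        return False
    # Reject if every word, ignoring trailing punctuation, is a known filler.
    return not all(word.strip(".,!") in PLACEHOLDERS for word in words)


print(is_meaningful("latest"))
print(is_meaningful("Fix flicker in widget redraw"))
```

A check like this catches the accidental empty or one-word message, but a few words of gibberish still pass, which is exactly the limitation described above.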
Mostly because they are not meant to be used for making commits after changing a css attribute etc., but rather after making a more meaningful change/bugfix. But comments are very useful anyways.

Best Practices for Comments on Code Commit

What template do you use for comments on code commit?
One example of a template is:
(change 1): (source file 1.1, 1.2): (details of change made), (why)
(change 2): (source file 2.1): (details of change made), (why)
Ideally each change should be mapped to an issue in the issue tracker. Is this template alright?
Here are my thoughts... all these will be open to interpretation depending on your particular development methodologies.
You should be committing fairly often, with a single focus for every commit, so based on that, comments should be short and detail what the focus of the commit was.
I'm a fan of posting the what in your comment, with the why and the how being detailed elsewhere (ideally in your bug tracking). The why should be the ticket, and upon closing the ticket, you should have some kind of note about how that particular issue was addressed.
A reference to your bug tracking system is good if it isn't handled otherwise (TRAC/SVN interaction, for example). The reason for this is to point other developers in the right direction if they're looking for more information on the commit.
Don't include specific file names unless the fix is really complex and detail is needed. Even then, complex details probably belong in bug tracking with your implementation notes, not in version control. Files edited, diff details, etc, should hopefully be included with version control, and we don't want to spend time duplicating this.
Given these ideas, an example commit comment for me would be something like
Req3845: Updated validation to use the new RegEx validation developed in Req3831.
Short, communicates what was changed, and provides some kind of reference for others to get more info without hunting you down.
I prefix each paragraph with + - * or !
+ means it's a new feature
- means feature is removed
* means feature is changed
! means bugfix
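One benefit of a prefix scheme like this is that it is easy to process mechanically. The sketch below assumes paragraphs are separated by blank lines and that the prefix is the first character of each paragraph; the function name is hypothetical.

```python
# The four prefixes described above and what they stand for.
PREFIX_MEANINGS = {
    "+": "new feature",
    "-": "feature removed",
    "*": "feature changed",
    "!": "bugfix",
}


def classify(message):
    """Return (kind, text) pairs for each prefixed paragraph in a message."""
    result = []
    for paragraph in message.split("\n\n"):
        text = paragraph.strip()
        # Paragraphs without a known prefix are skipped.
        if text and text[0] in PREFIX_MEANINGS:
            result.append((PREFIX_MEANINGS[text[0]], text[1:].strip()))
    return result


example = "+ export reports as CSV\n\n! fix crash on empty input"
print(classify(example))
```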
I don't think you should commit detailed description about what parts of the code are changed, because that's why every VC has diff :)
If you use a bug tracking system, include relevant ticket numbers.
You do not need to mention changed files, or your name. The source repository can figure that out by itself. Describing the changes also only makes sense if they are not obvious from the diff.
Make sure you have a good first line, because this frequently appears in the change history view, and people need to find things by this (the bug tracking ticket number should go there, for example).
Try to commit related changes in a single changeset (and split unrelated changes into two commits, even if to the same file).
I try to follow the same rule as for code comments:
Explain the WHY, not the HOW.
IMO a comment should contain a reference to the issue (task tracker, or requirement). Which files are affected is already available from the version control system. Apart from that, it should be as short as possible, but still readable.
I try to keep my fixes in separate check-ins.
I don't use an actual template, but a mental one, and it's like this.
Issue - dev level summary of issue.
The issue tracker has all the management details, and the changes/diffs can be reviewed for code changes, so the comment is for devs to understand the why/what of the issue.
Here's what I've seen used successfully:
Reference to bug number or feature ID
Brief description of the change. What was changed.
Code reviewer (to ensure you have one) unless handled by the checkin system.
Name of tester or description of which tests were run (if late in the process and you are being extra careful)
I use the simple technique described by Chaosben on the JEDI Windows API blog.
In order to get a fast view on the changes made to a repository, we suggest to write brief concise comments starting each line with one of these chars:
+ if you added a feature/function/…
- if you removed a feature/function/bug/…
# if you changed something
Doing it this way, other developers may find the desired revision much better.
First, commits should solve one single problem (separate commits for logically separate changes). If you don't know what to write in the commit message, or the commit message is too long, it might mean that you have multiple independent changes in your commit, and you should split it into smaller items.
I think that the commit message conventions expected and used by git make a lot of sense:
The first line of the commit message should be a short description
If appropriate, prefix the summary line mentioned above with a subsystem prefix, e.g. "docs:" or "contrib:"
In the next paragraph or paragraphs, describe the change, explaining the whys and hows
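The layout above (short summary, optional subsystem prefix, then separate paragraphs) can be checked with a few lines of code. This is only a sketch: the 72-character limit is an assumption for illustration, not something stated in the answer, and both function names are hypothetical.

```python
def check_message(message, max_summary=72):
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    lines = message.rstrip("\n").split("\n")
    summary = lines[0].strip()
    if not summary:
        problems.append("summary line is empty")
    elif len(summary) > max_summary:  # assumed limit, tune to taste
        problems.append("summary line is too long")
    # The description starts a new paragraph, so line two must be blank.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    return problems


def subsystem(summary):
    """Return the subsystem prefix such as 'docs', or None if there is none."""
    prefix, sep, _ = summary.partition(":")
    return prefix if sep and " " not in prefix else None


good = "docs: explain the release process\n\nThe old page described a script we removed."
print(check_message(good), subsystem(good.split("\n")[0]))
```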
Keep in mind that if someone needs details of what changed, they can get a diff. That said, I just usually write a sentence or two for each major change, and then lump any minor fixes at the end.
There is no hard and fast rule, as it's plain English. I try to explain the work done in as few words as possible. Anybody looking through the history of changes just wants to know what happened in a particular change. If anybody is after more details, they are there in the code.
The second thing I follow: if there is a bug associated with the change, reference it, and if it's related to a dev task, associate that with the change.
If two files were changed for different reasons, they should be in different commits. The only time you should commit more than one code file at a time is when they all belong to the same fix/change.

Best practices for version control comments

There is a lot of conversation about commenting code, but how about commenting on check-ins?
I found this blog post:
http://redbitbluebit.com/subversion-check-in-comment-great-practices/
As the guy who is putting together the release notes, I am looking for ways to make that job easier.
Currently we defined our own scheme with <Begin_Doc>...<End_Doc> for anything that should be published to our software customers. But even for the internal stuff, I'd like to know the "why" for every change.
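For illustration, pulling the customer-facing parts out of such comments could look roughly like this. The marker names come from the scheme described above; how they are actually used in practice (one block per comment, nesting, and so on) is an assumption, as is the function name.

```python
import re

# Non-greedy match so that several marked blocks in one comment stay separate.
DOC_RE = re.compile(r"<Begin_Doc>(.*?)<End_Doc>", re.DOTALL)


def customer_notes(message):
    """Return the text of every <Begin_Doc>...<End_Doc> block in a comment."""
    return [part.strip() for part in DOC_RE.findall(message)]


comment = (
    "Req100: rework login\n"
    "<Begin_Doc>Login now remembers the last user.<End_Doc>\n"
    "Internal: refactored session cache."
)
print(customer_notes(comment))
```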
Every feature has a ticket/issue/bugreport/task/whatever-you-call-it, and the ticket number is always referenced in the check-in comment. This gives context.
I would advocate NOT using/overloading your version control system for this. I would suggest the issue tracking software as a better fit.
For one, it does not seem appropriate to have developers add all the context and duplicated information in a commit message that is already in a requirements doc or issue/defect system.
You can use a tool to gather the relevant fixes/issue numbers that are in the commit comments and then go collect those from your other repository, but I think it is a mistake to basically make your revision tool an external facing thing.
You need to define what the source/version repository/SVN is for: is it for managing your source files, or is it also for writing release notes? I think it should not be overloaded.
We try to keep it simple: write one sentence describing the change that you are committing. If a developer needs two or more sentences to describe the commit, then perhaps the commit is two unrelated pieces of work. When commits like this end up in version control, then it is difficult to revert fixes in isolation.
Another piece of information that we like to include in our commit comment is the defect / feature number that the commit fixes / implements. Not all work that we do is related to a defect in our issue tracking system, so this is not compulsory.
One last piece of information that we put in our commit comments is the name of one code reviewer. This is the person who did a sanity check on the changes before the commit takes place.
I recommend functional comments. The comments should give a summary of what was changed and why. Every commit should be explainable; if you can't explain it clearly, you probably shouldn't be checking it in.
The most important thing to remember when using source control logs is they are there to determine when and what was changed. The more functional and detailed, the better. Commits should be made in bite size pieces that can be explained with bite size comments.
My personal preference is this style:
UPDATED the error logging system.
Added a legacy error parsing routine using regex to get the legacy error codes.
Changed the text in the database error messages, to correct misspellings.
Removed commented out sections of code, because they were not used any more.
The key is what are you going to do with the comments. If you're creating release notes, then you can do as you suggested. However I would recommend instead you keep track of release notes somewhere else, such as in a project management or bug tracking tool.
As for developer related comments, we've generally asked people to explain what they're doing, a one sentence explanation. It doesn't need to be too formal, mainly because if it is people will push back against it. Plus, if you know who did it, and you have a quick comment, you can generally trace back the issue and find the person.
As well, if you use a tool like FogBugz, you can link an SVN checkin to a case number. Which means that you can look up the case to get the full discussion, comments, screenshots, etc. Which is much more information than you could ever enter in a checkin comment.
Agree with Remembrance, but you should also write a little bit about why you implemented the change /bug fix the way you did.
If you believe in checking in often, you should also include TODOs in order to make it possible for one of your co-workers to complete the task.
Making my changes small helps: I can provide detailed descriptions of my changes this way.
The checkin comments should be the information that a developer wants: this includes refactorings, motivation behind the code, etc.
On our projects we always advocate providing some detail about what a commit is for. To avoid duplicating information such as the problem description, we use Trac and have our repository integrated with it. The advantage is that you can then reference the issue ticket in the comment and only state the resolution or steps of work carried out. Trac then automatically links the reference number to the original issue number and applies your commit message as a comment to the issue. Then when you want to see what has been done you can simply read the issues within Trac and have full context.
With regards to release notes we have found that taking the list of issues within a release and using the commit information as a basis for the comments has worked fine. Generally you will not have release notes that have the raw commit messages in them as your clients do not really care about every little change or even the level of detail that may be included in the comment. So you would normally need to do a fair amount of editing to highlight the main changes and bug fixes implemented in that release.
I would say to try to follow a changelog style. The first line should be a short summary, and include the issue/ticket number (if any). This should possibly be followed by a blank line depending on how your VCS handles multi-line commit messages, then a fuller multiline description. I would say it's unreasonable to impose any strict formatting since it will discourage frequent commits, but so long as the important commits (the ones closing issues, or major changes) are done this way you should be ok.
If you use something like Trac or roundup + svn integration, it can pick out issue numbers from the commit messages. I would always put these in the first line since they're so useful.
Edit: Given that this is by far my most downvoted answer, I think it's worth emphasizing what's hidden in the last paragraph: I'm a sole proprietor. I have 100% ownership of these projects and do not work with other developers. In a shop with more than one developer, everything I'm saying in this answer may be completely inapposite.
I subscribe to DRY here as in all things.
I almost never add a comment to my commits. A comment is almost always repeating myself. The answer to the question "what changed in this commit"? is almost always in the diff.
When I'm looking at a file and I ask "what the hell happened here?", the first thing I do is look at the diff with the previous rev. 90% of the time the answer is immediately apparent, either because the code's self-evident or because there was something not self-evident that I commented in the code. If it's not, I correlate the rev dates of the file with the bug-tracking system and the answer is there.
This always works. It sometimes requires a little investigation to figure something out, because I didn't comment my code adequately. But I've never been unable to find the answer fairly quickly.
The only time I add a comment to the commit log is when I know that a diff isn't going to help me. For instance, when I sort a class's members: the only thing that a diff is going to tell me in that case is that something very big happened. When I do that, I commit the file as soon as I've fixed it. There's no appropriate place to comment a change of that scope in the file, so I add a comment to the effect that the only change in this rev is reordering the members.
("Why wouldn't you comment a change like that in the revision history at the top of the file?" you might ask. I don't keep a revision history at the top of my files. That was a scary, break-the-habit-of-a-lifetime change to make, and I've never regretted it for a moment. The revision history is Subversion.)
If I didn't have 100% ownership of the project, it might be different. It might be too hard to correlate commits with bug fixes. It might be too hard to train other developers to code to a style that makes it possible to rely on version control effectively. I'd have to see.