How does version control work? - version-control

How does version control usually work? Does it save diff files as a trail, with hashes to validate the trail?

Check out Eric Sink's blog series on version control.
Also, Joel Spolsky wrote Hg Init, a Mercurial tutorial that finally made me "get" what distributed source control is all about.
There is more than one way to skin a cat...

Different VCS use different approaches. CVS, for example, will create a file on the server for each file which you commit. This is essentially a file in RCS format; CVS is only a wrapper around RCS which runs the RCS commands over many files in a directory subtree (RCS can only work on single files).
The RCS file contains a list of changes (version number, checkin message and how much was changed). After that comes a copy of the current HEAD version. The rest of the file consists of the diffs between the versions, stored so that older versions can be rebuilt by working backwards from HEAD (long explanation).
This way, CVS can quickly return the HEAD version (which is most often requested) and it can compute the other versions.
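To make the storage layout concrete, here is a minimal Python sketch of the idea: keep the newest version in full and store only backward diffs to the older ones. It is not the real RCS file format or its diff algorithm. The names ReverseDeltaStore, make_delta and apply_delta are made up for illustration, and Python's difflib stands in for the real diff tool.

```python
from difflib import SequenceMatcher


def make_delta(src, dst):
    """Return instructions that rebuild dst (a list of lines) from src."""
    delta = []
    matcher = SequenceMatcher(a=src, b=dst, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))      # reuse src[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("data", dst[j1:j2]))  # lines that only exist in dst
        # "delete": nothing to store, those src lines are simply skipped
    return delta


def apply_delta(src, delta):
    """Apply the instructions produced by make_delta to src."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(src[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class ReverseDeltaStore:
    """Hypothetical single-file store: full HEAD plus backward deltas."""

    def __init__(self):
        self.head = None  # newest version, stored in full
        self.back = []    # back[i] rebuilds version i from version i + 1

    def commit(self, lines):
        if self.head is not None:
            # Remember how to get from the new version back to the old HEAD.
            self.back.append(make_delta(lines, self.head))
        self.head = list(lines)

    def checkout(self, version):
        """Return version number `version` (0 is the oldest)."""
        lines = self.head
        # Walk backwards from HEAD, one delta per step.
        for delta in reversed(self.back[version:]):
            lines = apply_delta(lines, delta)
        return list(lines)
```

Checking out the newest version applies no deltas at all, which is why HEAD is cheap to return; each older version costs one more delta application.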
CVS doesn't do any validation; if one of your files becomes corrupt, you need a backup. Since CVS is based on RCS, it can't version directories nor can it track renames. CVS and RCS use the standard diff(1) command to create the diffs.
Subversion (SVN) works similarly but adds versioning of directories and renames. Moreover, SVN uses a better diff algorithm (xdelta) which gives a smaller repository.
For an explanation of how Git works, see here.

Darcs is very different and, IMHO, more intuitive than other SCMs, even distributed ones.
There's an excellent guide for beginners about how it works: Understanding Darcs.

Related

CVS virtual modules & directory mapping to mercurial repositories

My question is similar to this one but for Mercurial (converting using cvs2hg). However, there are some differences. This is part of our CVSROOT/modules file and shows the problem nicely:
PD1 -a PROD/PD1/Drivers Drivers/PD1/Firmware KernelHeaders Shared IppLibs
PD2 -a PROD/PD2/ Drivers/PD1 KernelHeaders Shared IppLibs
#PD2Linux Driver
PD2Linux PROD/PD2/Drivers/Linux/BuildFiles &PD2LinuxSource
PD2LinuxSource -d src &PD2 &PD2LibUSB
PD2LibUSB -d ThirdParty/libusb libusb
As you can see, the driver structure is complicated. We're definitely looking to rationalise the driver structure, rather than including the entire older driver (PD1) in the newer one.
As I understand it, in Mercurial you can use the share extension to do the sub directory mapping.
My questions are:
Is there a way in Mercurial to bring files located further down in the directory tree (in this case the autoconf files) up to the root, as is done in the first line of the PD2Linux Driver?
Is there a way to create directories, as per the -d flag?
How to merge changesets that span PD1 & PD2?
E.g. if changes were made in PD2 that spanned both drivers and checked in to PD in CVS. This is a bit of a long shot, as CVS doesn't have changesets.
I wonder whether cvs2hg takes the CVS modules file into account?
At the moment I am converting each PD directory individually (creating a cvsroot in each subdir). Would it be better to convert them all together and then split them up into separate hg repos?
You write:
As I understand it, in Mercurial you can use the share extension to do the sub directory mapping.
Not quite. The share extension lets you associate several working copies with a single repository; it's not about remapping (sub-)directories.
Is there a way in Mercurial to bring files located further down in the directory tree (in this case the autoconf files) up to the root, as is done in the first line of the PD2Linux Driver?
The answer to this and your other questions is: no. The core problem is that Mercurial (and other distributed version control tools) require you to check out the full repository every time. You cannot just clone repo/some/dir/, you must always clone repo/.
At the moment I am converting each PD directory individually (creating a cvsroot in each subdir). Would it be better to convert them all together and then split them up into separate hg repos?
The end result should be separate Mercurial repositories — precisely because you need to clone the full repository. So make sure to make a 1–1 mapping between repositories and your drivers.
One tool that you might find useful is subrepositories. A subrepository is a nested repository that Mercurial will check out when you clone the outer repository. They come with a number of caveats, but big companies are using them today (I've helped a number of companies with setting up subrepos).

Which version control system should I use for my small personal code files?

I have some general scripts that I use and they keep getting modified over time. Right now, I do not use any version control software for them so basically the old files are lost unless I explicitly save them.
I need a good minimal version control system that I can use on a single machine. Which one do you use for such projects?
Git or Mercurial both work great. No server required.
I've used Subversion for this in the past. Mostly this is because I'm on Windows, and TortoiseSVN is a dead simple UI for my repo.
For a scenario like yours, which is relatively simple, I'd recommend using either what you're familiar with, or what is easy to use on your platform.
Git is actually really easy to use in such a setting, and it scales just as well to really small repositories with a few commits a month as it does to huge ones with a hundred a day. Here's how you would set up such a repository:
$ cd ~/your-scripts
$ git init
$ git add .
$ git commit -m 'Start script repository'
Ta-da!
As a hosting solution we make use of http://codesion.com/free_cvs_svn; you will note they also support Git hosting. They also host a bunch of other services that go hand in hand with versioning.
Check out some of the personal version control systems. Here is a short list:
FileHamster
History Explorer
FolderTrack
Oops! Backup
They are super easy to use and "automatically check in" whenever you modify your files.
Note: I am the author of FolderTrack and recommend it for software because it can treat a bunch of files as one big project. Therefore, if you need to revert your project to where it was yesterday, it will revert the 8, 10, or however many other files you modified since that time.
Free code: BOS

Half-ignored files in VCS - is this supported?

I am using Eclipse and Subversion for Java development, and I find myself wishing for a feature in version control systems (one that is not available in SVN, to the best of my knowledge).
I would like my project settings files to be half-ignored. To be more precise, I want them to be available in VCS, and I want a merge to occur when someone checks in changes, but I want my own changes ignored unless I very explicitly tell the system to take them.
This would allow me to have my local paths (and other settings) in my local configuration without screwing up other people's configuration. But when I have a substantial change, I can still check it in (very, very carefully, maybe temporarily removing my other local changes) and have it delivered to other people.
Now, the actual question: is there any VCS that supports this feature? Or maybe I am missing something in SVN? How do other people solve this problem in Eclipse?
Yes, Git supports that feature through a filter driver (a clean script can run when the file is staged for commit, allowing you to clean the content of any of your changes if you want).
But another way would be to never version that setting file, and only version:
a template file
a value file
a script able to replace variables in the template files with the values from the value file, in order to generate the actual (and "private", as in "not versioned") setting file.
That way, you can modify it to your heart's content without ever committing your changes.
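The generating script from the list above could look roughly like this in Python. This is only a sketch under assumptions: the value file is taken to hold simple key=value lines, the template is taken to use the $name placeholders of Python's string.Template, and the function name render_settings is made up.

```python
from pathlib import Path
from string import Template


def render_settings(template_path, values_path, output_path):
    """Fill the versioned template with values and write the private file."""
    values = {}
    for line in Path(values_path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()

    # substitute() raises KeyError if the template uses a name that has
    # no value, so a missing setting is noticed instead of ignored.
    text = Template(Path(template_path).read_text()).substitute(values)
    Path(output_path).write_text(text)
    return text
```

The generated output file is the one to keep out of version control (for example through the ignore mechanism of your VCS), while the template and the script are committed.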
Use .gitignore for Git or .hgignore for Mercurial: file paths and patterns added there will not be committed. There is something similar in SVN, but I never worked out how to use it myself; my sysop set it up for me.
git supports this with
git update-index --assume-unchanged <file>
and the complementary
git update-index --no-assume-unchanged <file>
See http://blog.pagebakers.nl/2009/01/29/git-ignoring-changes-in-tracked-files and http://kernel.org/pub/software/scm/git/docs/git-update-index.html#_options for more details.

Basic Subversion questions

I've just started using Subversion, and have read the official documentation (svn book), a cheat sheet and a couple of guides. I know how to install Subversion (on Linux), create a repository (svnadmin create), import my Eclipse project into the repository (svn import), and view the repository files (using svn list).
But I am unable to understand some of the other terminology. For example, after importing my Eclipse project into the newly created repository, I have made changes to my Eclipse project (in more than one file). Now, how should I update the repository with these added files/changes made to my Eclipse project?
The svn update command brings the changes from the repository into your working copy, which is the opposite of what I want, i.e. to bring the changes I made in my Eclipse project into the previously imported project in the repository. If I am correct, you update the repository more often (as you keep extending your project implementation) than your current project (with update).
Also, I do not understand when you would use svn merge. The svn book states it applies the differences between two sources to a working copy. Is there a scenario which would explain this?
Finally, can I have more than 1 project checked into the repository? Or is it better to create a new repository for each project?
The term you are looking for is "commit".
Subversion does not exclusively lock a file for editing (though there is a command to do this if you really, really want to). So it is possible that you will need to merge two different users' sets of edits on a file, or even edits from two different working copies in two different locations on your machine.
Multiple projects are fine. The best approach IMHO is repository/project/trunk etc. rather than repository/trunk/project.
Three things about SVN you should know:
Trunk - The main version of your code
Tags - 'Tagged' Versions of your code (i.e. v1.2.5-release)
Branches - Forks of the code for divergent development paths. We typically fork new branches to work on different versions, so if the current version is 1.2.4, you'd branch for 1.3's development. Then if emergency changes to 1.2 need to be made (i.e. 1.2.5), you can work on them without worrying about what you broke by refactoring or adding features in your 1.3 branch. The merge operation is designed so you can merge 1.3's branch back into trunk when you're ready to release 1.3, or a similar operation. You can also merge individual files (if two or more developers edited the same file at the same time and now you need to 'merge' the changes into the same file).
Each project in your repository should have 3 folders in it:
/trunk
/branches
/tags
These house the three points above. You don't have to have these folders, but you should. Other, more mature VCSs like Mercurial/Git have the concepts of tags and branches baked into the system. In SVN these are more of a convention/recommendation.
Terminology
Working Copy - The copy on your hard-drive, that contains all your edits, etc...
Add - Registers a file for tracking in version control
Update - Updates the working copy with changes from the server repository
Commit - Updates the server repository with changes from the working copy
Switch - Replaces the working copy with another folder within the server repository
Diff - Does a differential analysis of two files / versions of a file to see the changes between them.
Merge - Attempts to apply the changes from one or more files into another, highlighting conflicts.
Patch - A set of differences that can be used to update a file.
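As a small illustration of the Diff and Patch entries, the sketch below builds a unified diff between two versions of a file. It uses Python's difflib rather than Subversion itself; the helper name make_patch and the file names are invented for the example.

```python
import difflib


def make_patch(old_lines, new_lines, old_name="old.txt", new_name="new.txt"):
    """Return a unified diff (a patch) describing how to turn old into new."""
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_name, tofile=new_name))


old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "BETA\n", "gamma\n"]
patch = make_patch(old, new)

# Lines starting with "-" were removed and lines starting with "+" were added.
print("".join(patch), end="")
```

The printed text is what the list calls a patch: it records only the differences, so it can be sent to someone else and applied to their copy of the old file.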
You commit changes to the repository.
Merge is useful when you need to maintain two branches of a repository, for example v1.x with the most recent security fixes and the alpha of version 2. That allows you to make the fixes in the 1.x code, with the resulting binary going to existing customers, and then merge the changes into version 2, fixing bugs there that hadn't already been caught.
I suggest you look around for 'typical svn workflows'. They will give you the big picture of the 'most common tasks'.
What you want to do is 'commit' the changes made to your files to the repository.
You need to merge in case of a conflict (when two or more people are working on a project and commit to the same repo, conflicts might arise).
Check the available articles on SVN, and remember to read about the sample/typical workflows or working scenarios with SVN.
Fully agree with David, but as far as question 3 is concerned, personally, I would distinguish between use cases:
Production: One project per repository. And do get comfortable with the mentioned tag/trunk/branch concept; it really helps a lot.
Testing: I have one single repository where I have put virtually all my experimental codes (approx. 10 languages with x codes per language). Reason is: One experimental code takes me 1-2 minutes, creating a repository on a remote host, using ssh-security sometimes takes longer ;-)
Cheers
EL

How to merge project differences using visual source safe?

We have two different projects in our SourceSafe database. One of them is a copy of the other: there was a problem with our branching operation, which didn't pin our branched files, so I had to get a label and add it as a different project.
I know how to see the differences between two projects, and I know that there is a mechanism that lets us merge differences into one file (I think "reconcile all" will do the trick, but I am not sure).
So here's my question: how can I merge a file in one project with a file from another project?
VSS (or as I call it, source destruction system) will destroy your code if you try to merge it using the built-in tools. Why does it do that? Because it's a lame tool.
This is what I recommend:
Get latest on both branches.
Get the last version of the code before you branched (just see the date and guess if you have to).
Do a 3-way merge, because you have a base.
Add the merged files into Subversion (or something better than SourceSafe).
I have many old projects stored in SourceSafe. It's hopeless trying to use the built-in tools to do anything other than get latest, checkin and checkout.
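To show why having the common base matters, here is a simplified, line-based 3-way merge sketch in Python. It is not what SourceSafe, Subversion or any particular merge tool does internally; the function names and the conflict marker labels are assumptions made for the example, and difflib is used to line the files up.

```python
from difflib import SequenceMatcher


def _matches(base, other):
    """Map each base line index to the index of its match in other."""
    mapping = {}
    matcher = SequenceMatcher(a=base, b=other, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping


def merge3(base, ours, theirs):
    """Merge two edited copies of base. Returns (lines, conflict_count)."""
    in_ours, in_theirs = _matches(base, ours), _matches(base, theirs)
    # Anchors are base lines that survive unchanged on both sides.
    anchors = [(i, in_ours[i], in_theirs[i])
               for i in range(len(base)) if i in in_ours and i in in_theirs]
    anchors.append((len(base), len(ours), len(theirs)))  # end sentinel

    out, conflicts = [], 0
    pb = po = pt = 0
    for b, o, t in anchors:
        base_chunk, our_chunk, their_chunk = base[pb:b], ours[po:o], theirs[pt:t]
        if our_chunk == their_chunk or their_chunk == base_chunk:
            out.extend(our_chunk)    # same on both sides, or only we changed it
        elif our_chunk == base_chunk:
            out.extend(their_chunk)  # only they changed it
        else:
            conflicts += 1           # both changed it differently
            out += (["<<<<<<< ours"] + our_chunk + ["======="]
                    + their_chunk + [">>>>>>> theirs"])
        if b < len(base):
            out.append(base[b])      # the unchanged anchor line itself
        pb, po, pt = b + 1, o + 1, t + 1
    return out, conflicts
```

Without the base version, a tool cannot tell which side changed a line and which side left it alone, so every difference between the two branches would have to be resolved by hand.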
Check out the latest version of the first VSS project somewhere.
Create a repository using a different VCS tool (Subversion should be the simplest choice).
Import the project version into the new Subversion repo as a branch.
Check out the latest version of the second VSS project somewhere else.
Import the project version into the new Subversion repo in a different branch.
Use any Subversion tools to merge the two branches.