Can RCS be used to version control entire directories?

I would like RCS to version control all the files in /var/spool/cron/crontabs/ and any new files that get created there. But it appears from the documentation that you have to supply it a filename and not a directory name.
Can RCS be used to do this? Is there a better tool to use?

TL;DR
If you need RCS-compatible files, but also need to work with directories, use CVS. Otherwise, use a more modern source code management (SCM) system that handles directories the way you want. In some cases, conversion scripts can ease the pain of migrating, if you choose to do so. Examples include svn import and rcs-fast-export piped to git-fast-import.
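For illustration, the migration paths mentioned above look roughly like this (a sketch only: rcs-fast-export.rb is a third-party script whose exact invocation varies by version, and the paths and repository URLs here are made up):

    # Convert a tree of RCS ,v files into a fresh git repository:
    git init converted && cd converted
    rcs-fast-export.rb /path/to/rcs/files | git fast-import

    # Or import a plain directory into a new Subversion repository:
    svn import /var/spool/cron/crontabs file:///srv/svnrepo/crontabs -m "Initial import"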
RCS for Files; CVS for "RCS with Directories"
RCS works on individual files. CVS uses the RCS file format, but also allows you to add directories. However, directories aren't really first-class objects within CVS, and RCS doesn't have the concept at all. Proceed with caution if you migrate to CVS just for directory support when more modern alternatives exist.
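As a sketch, importing an existing directory tree into CVS looks something like this (the repository path and module name are made up):

    # Create a CVS repository, then import the directory as a module:
    cvs -d /srv/cvsroot init
    cd /var/spool/cron/crontabs
    cvs -d /srv/cvsroot import -m "Initial import" crontabs vendor start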
Directories as First-Class Objects
If directories are really important to you, you may want to consider moving to an SCM that treats directories as first-class objects. Subversion (svn) and Bazaar (bzr) both treat directories as versioned objects, and may be good choices for you.
Other systems (notably Git) handle directories as part of tree objects, but don't really version them directly. Instead, for practical purposes you can think of directory versioning in Git as something computed from tree and file objects. This usually works fine, but again it may not be what you need for your project.
Your mileage will vary with other SCMs. When in doubt, consult their documentation.

I was able to do this with a program called etckeeper; I just had to make some modifications, because by default it's used to version control /etc.
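For what it's worth, etckeeper commands accept an optional target directory (defaulting to /etc), so something like the following may need little or no modification — treat it as a sketch, since behavior varies between etckeeper versions:

    # Initialize version control in a directory other than /etc:
    etckeeper init /var/spool/cron/crontabs

    # etckeeper uses git by default, so the directory is now a normal repo:
    cd /var/spool/cron/crontabs
    git add -A && git commit -m "Initial import"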

Related

How can I do a > 5GB commit to Mercurial?

I'm trying to import an existing project into Mercurial. The project is a bit over 5GB.
When I try to do an hg push I always get an error about being out of buffer space.
Does anyone know of a good way of doing the initial commit?
If you are not tied down to using Mercurial, another possibility would be to use boar. It is not a DVCS like Mercurial; instead, you have a central repository in which you store your data and "check out" versions of files, in much the same way as with Subversion.
The important part is that it is written with the express purpose of storing large, binary files.
I have not used it, so I cannot comment on how good it is at its job, or how stable it is, but it is a possible alternative that may well suit your needs.
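From memory of boar's documentation, basic usage looks something like this (the command names are taken from the project docs, but the paths and session name are made up — double-check against the current docs before relying on it):

    # Create a central repository and a named session within it:
    boar mkrepo /srv/boarrepo
    export BOAR_REPO=/srv/boarrepo
    boar mksession BigFiles

    # Import a directory of large files, then check it out elsewhere:
    boar import ~/big-files BigFiles
    boar co BigFiles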
For a brief explanation of why storing binary files in mercurial is discouraged please read
https://www.mercurial-scm.org/wiki/BinaryFiles and http://kiln.stackexchange.com/questions/1074/why-is-it-bad-to-store-binary-files-in-mercurial
In our case we handle binary files using Dropbox. It allows you to both keep the history of files and sync the folder between team members. If you don't need to keep history of files, you can use rsync to keep binaries sync'ed.
Assuming you do actually need to put such a large commit into Mercurial, I would guess that rather than a few million tiny files, the size of your commit is primarily due to a handful of biiiig files. In this case you could investigate the Large Files Extension, which should suit your needs. When you add a large file, it is tracked by checksum rather than content, so what Mercurial itself tracks is relatively small. The extension will take care of the versions for you.
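If you go that route, enabling the extension and marking files as large looks roughly like this (the filename is an example):

    # In ~/.hgrc or the repository's .hg/hgrc:
    [extensions]
    largefiles =

    # Then add big files with the --large flag; Mercurial tracks them by checksum:
    hg add --large big-dataset.iso
    hg commit -m "Add ISO via largefiles"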
However, as Alex Stuckey mentions, you shouldn't normally be committing things such as compiled binaries (object code, resulting executables, ...), which are the most likely reason you have such a big commit. You would do well to create a decent .hgignore file (one that removes the usual suspects - *.o, *.pdb, whatever, ...), which will help eliminate accidentally adding files like that in the future. I have a standard .hgignore which gets put into nearly all my repositories as the first commit, and has served me well.
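A minimal .hgignore along those lines might look like this (the patterns are just the usual suspects; adjust for your toolchain):

    # .hgignore
    syntax: glob
    *.o
    *.obj
    *.pdb
    *.exe
    *.dll
    build/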

Is it possible to put meta-information into a file using Mercurial and if so how?

In CVS (and RCS, I gather) the classic $...$ mechanism was used to insert meta-information into a file. Is there a mechanism or extension for doing something similar in Mercurial (or, for curiosity, other distributed versioning software)? I'm really only interested in tracking the date of the most recent change incorporated into the file.
That mechanism is usually known as "RCS Keywords". The Keyword Extension, distributed with Mercurial, appears to do exactly what you want.
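A minimal configuration might look like the following (the file pattern is illustrative; `hg help keyword` has the authoritative details). With this in place, a literal $Date$ in a matching file expands to the date of the last change:

    # In .hg/hgrc or ~/.hgrc:
    [extensions]
    keyword =

    [keyword]
    # Expand keywords only in Python sources (example pattern):
    **.py =

    [keywordmaps]
    Date = {date|utcdate}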

Working with folders in RCS

I have been following the tutorial http://www.burlingtontelecom.net/~ashawley/rcs/tutorial.html on how to work with files using RCS. This works well but only with one file. Is there a way to create an RCS file with directories as well?
I have a project folder called myproject, and in this directory I have all my files for that project. I want to create a revision control system for the myproject folder and all its files that are inside.
As William's comment says, RCS only works with single files. (It also doesn't seem to be particularly suitable for multiple-user stuff.)
Of course, nothing stops you from putting each (source) file in a directory under RCS control; in fact, this is essentially what CVS does (though in recent versions it handles the RCS data itself, rather than invoking RCS to do it as it used to do). Unfortunately, this fragments the change history rather badly; a commit affecting many files ends up as separate commits to each file, which just happen to have the same commit message (and timestamp?), and in general every file will have a different revision in what the user might like to think of as the "same" revision. (This makes tags quite essential.) CVS also has issues with the atomicity of commits: you could end up with commit A and commit B getting tangled up, such that in file foo commit A precedes commit B, but in file bar commit B precedes commit A!
SVN (Subversion) is an attempt to rectify some of the problems in CVS, though it also brings some new limitations, and keeps many of the existing ones; it is probably wiser (as William implies) to just use a distributed version control system (DVCS) for your multi-file projects. There are many choices:
Darcs uses a unique patch-based model: a repository is treated as a sequence of patches, which can be applied to an empty tree to build the current revision; patches can often be reordered by "commuting" pairs of patches, and cherry-picking patches from other repositories is quite easy. The downside is that the change history is a bit less clear than in most DVCSes. See http://wiki.darcs.net/Using/Model, http://en.wikibooks.org/wiki/Understanding_Darcs/Patch_theory.
Directed-acyclic-graph (DAG) based DVCSes model a repository as a directed acyclic graph of revisions, where each revision can have one parent, two parents, or perhaps more. Each revision has an associated file tree state; sometimes renames are also tracked somehow.
Git, as already mentioned. Has a very simple model, but a very complicated interface: there are many commands, some of which are not really intended for humans to use (owing to many parts of it having been prototyped in shell script, probably), so it can be hard to find the ones you want. Also, its model might be a bit too simple: it doesn't track renames at all.
Bazaar (a.k.a. bzr) has a more complicated model, including support for file/directory renames. It's difficult to say how much more complicated, though, because whatever documentation may exist is not nearly as accessible as Git's. It does, however, have a rather simpler user interface, and there are a number of useful plugins, including a distributed-development-friendly SVN plugin: committing from a branch back to SVN need not interfere with the validity of others' branches of your branches, and bzr metadata is even committed back to SVN. Can make things much less painful if you want to start hacking on an SVN-based project without having commit access, but hope to get your changes committed eventually. Bazaar is my personal favorite DAG-based DVCS.
Mercurial (a.k.a. hg) seems fairly similar to Bazaar, though I think it tracks renames only for individual files, not for directories. It also supports plugins, though its SVN plugin isn't as nice as Bazaar's: it doesn't support lossless commits, so branching from other peoples' branches is unwise. I don't have much experience with it, so I can't really evaluate it in-depth.
As the comments already mention, if you are starting out with version control, you would be well advised to choose a newer system than RCS (git, mercurial, fossil, subversion, ...). That said, RCS still works fine for a single developer working primarily on a single machine - I still use it for my own code because I've not yet worked out how to get the (20+ years of) history I want into git in the way I want it.
Anyway, to use RCS, make sure you have an RCS sub-directory in each directory where you have working source code under RCS management. The RCS files will be placed in the sub-directory automatically, and retrieved automatically. If your version of make is not already aware of RCS, then you can train it so that it is - or get a version of make that does (GNU Make, for example).
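A typical per-directory RCS workflow then looks like this (the filename is an example):

    # One-time setup in each source directory:
    mkdir RCS

    # Check a file in (creates RCS/foo.c,v) and keep an unlocked working copy:
    ci -u foo.c

    # Lock and check out for editing, then check the changes back in:
    co -l foo.c
    ci -u -m"Describe the change" foo.c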
TL;DR - Look into DCVS as an alternative to RCS. It builds on CVS, which uses RCS, but is better suited to working with a distributed repository as well as a hierarchy of directories.
I'm currently going through a similar issue and may have found something worth noting, especially for people who are being forced to use a light, command-line-based revision control system with multiple team members.
My manager will not get off this idea of using RCS as our version control. But per the specifications, he wants developers to be able to create and edit their own repositories on a localized server within our company. Two issues with this:
RCS does not create, nor hold, any sort of 'repository'. It is software that keeps track of file edits on a per-file basis, meaning that the 'repository' is nothing more than another directory with RCS checked-in files. This is sub-par for team-geared projects, to say the least.
On a large project with multiple directories and tens of individual working files, even the prospect of creating a top-level RCS directory with a symbolic link in the working directories gives rise to complications such as naming conventions, as well as forgetting which file came from which bottom-level / working directory.
As SamB posted, even CVS introduces additional problems on top of RCS that we now have to account for, though it does give us some additional hierarchy. But one suggestion he left out was DCVS.
It's nothing more than an extension of CVS and CVSup, and it:
contains functionality to distribute CVS repositories with local lines of development and automatically handles synchronization of the distributed repositories in the background.

Shared directories in Mercurial

I have a development project in Mercurial. In the project I have multiple directories scattered about which all should contain the same basic files (CSS, images, etc).
I'd like to have all of the directories point to the same underlying directory, so that if I edit a file in one place, it is updated everywhere else. Basically a UNIX soft link to the directory, but I want this to work within Mercurial (and I'm on Windows).
I've looked at subrepos, but they seem to point either to an existing directory or to a remote one. I'd rather not have the network involved. In my case I just want to point the subrepos to a relative location in the same project.
What's the best way to accomplish this (with the least amount of pain)?
Going back to Windows 2000 (but still on the NT line), you can use directory junctions with NTFS. As far as Mercurial is concerned, however, they would be different files that just happen to get updated simultaneously, and the junction points would not be stored in the repo. (I'd create a script to set them up when cloning.)
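For example, with the built-in mklink (Vista and later; on older systems the Sysinternals junction.exe tool does the same job), the clone-time setup script might contain something like this (the paths are hypothetical):

    REM Create a junction so assets\css points at the shared directory:
    mklink /J assets\css ..\shared\css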
This would also be incompatible with cloning the repo to a *nix environment. (Or at least I don't see how you could do it without conversion, which changes the repo, which means you can't push/pull between them.)
This is a poor hack, but, barring more details from you, it was all I came up with.

Are there version control systems that allow you to permanently delete files?

I need to keep some large files (several gigabytes) under version control.
I don't need, and can't afford, to keep every version of the files.
I want to be able to remove old versions of the large files from my VCS at some point.
The files that I want to keep under version control are big .zip files or ISO images.
These files may contain executable software or data (seismic data, SAR images, GNSS data), and they are provided by my company's software supplier.
What version control system could I use?
In CVS you can do that by removing the files from the repo. Subversion allows it by dumping the contents of the repo and filtering the dump to remove the files (which is a bit cumbersome). Perforce has an obliterate command for that. Many of the newer distributed VCSes make it rather difficult through their use of hashes all over the place, and the fact that your repo may have been replicated elsewhere also complicates things. Hg has a strip command (part of the mq extension); Git can also do that, I think.
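As rough sketches (the revision and file names are hypothetical, and both operations rewrite history, so any replicated clones must be re-cloned afterwards):

    # Mercurial: remove a changeset and all its descendants
    # (strip lives in the mq extension in older versions, later in its own extension):
    hg strip -r REV

    # Git: rewrite history to drop a file everywhere, then garbage-collect:
    git filter-branch --index-filter \
        'git rm --cached --ignore-unmatch big-file.iso' -- --all
    git reflog expire --expire=now --all
    git gc --prune=now --aggressive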
I don’t think there’s any version control system that allows you to do that regularly, because that goes against everything version control systems stand for.
Perforce generally allows files to be stored in two ways: as head revision only (so you'd only ever have one copy) or with all revisions. Perforce does have the admin-level obliterate command that can be used to delete revisions. It's up to you to query for a list of files, possibly by date or number of revisions, and to specify the revisions to the obliterate command. As the name suggests, obliterate deletes the revisions permanently from the database, so I always generate scripts to do this and review them before running them. If the obliterate command is NOT run with the -y flag, it will just generate a list of what would be obliterated, which is also very useful.
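A sketch of that preview-then-run workflow (the depot path is made up):

    # Without -y, obliterate only reports what would be removed:
    p4 obliterate //depot/big/data.iso#1,#5

    # With -y, it permanently deletes those revisions from the database:
    p4 obliterate -y //depot/big/data.iso#1,#5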
Somehow I get the impression that you should not use a version control system at all. As said before, what you're trying to do goes against everything you would need a version control system for in the first place.
I suggest you create a file system directory structure that makes sense for what you're trying to accomplish, so that you can structure your data. And just make backups of those files.
TFS has a destroy command that you can use to permanently delete files or revisions as you see fit.
There is more information at this MSDN article.
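For example (the server path is hypothetical):

    REM Permanently delete an item and all its history from TFS version control:
    tf destroy $/MyProject/big/data.iso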
Many version control systems can be configured to store only the differences between versions of a file, saving space that way.
For example, if you commit a 1 GB file, change part of it, and commit it again, only the changed part is stored.
The repository then doesn't use 2 GB (initial plus new file) but only 1 GB + sizeOfChanges.
There's just one downside: if you store files whose whole content changes from revision to revision, this can be counter-productive, since the changes take almost as much space as the original version. Archive files are an example of such files, where a small change in the (real) content can lead to a completely changed archive file.
I'd suggest testing several version control systems yourself, with your specific needs and environment, and monitoring on the server side how each system's storage requirements change.
Some distributed version control systems let you create "checkpoints" that act as a kind of base revision and save you from pulling all the history before the checkpoint on every checkout. So you can remove the big files, create a checkpoint, and check out/clone the repository from that checkpoint into a new directory. You then have a new, small repository, but without the history before the checkpoint. If you don't need that history, you can burn the old repository to CD and use the new, partial one from now on.
I've only tested it in darcs, and there it works, but YMMV depending on version control system and use cases.
It sounds to me like you need an intelligent backup system, rather than version control.
I use SyncBackSE; it allows you to keep a number of previous versions, and can also do things like "ignore all files changed more than 30 days ago".
It's one of the few bits of paid-for software I use. I think it's worth checking out.
I think you're talking about something like AlienBrain's "bucket" system, aren't you? That is, the ability to remove some revisions from version control.
If you want to destroy an item, it's normally called "obliterate" and it's supported by a number of systems out there.
Buckets, AFAIK are supported by:
AlienBrain
Accurev
PlasticSCM
I would save such files outside the VCS under a unique name (datestamped, perhaps), and additionally make a textual reference to the external file in the version control system.
Fossil allows you to do this via the "shun" mechanism. Fossil being a distributed SCM, however, means that this does not affect all repositories (for obvious reasons).