Related
I am new at working for a large company with various people working on the same files. Sadly we don’t have version control and I often find myself cross-eyed. For lack of better terminology, we have a dev site, a quality-assurance site, and the live site. We have most files in two languages. Since the network-connected drives have an average transfer rate of 15kb/sec, we often copy the files locally before working on them. Also, contractors send us new versions of files, but we may have made changes on our side and everything gets screwed up.
Basically I’m working with 6-10 files with the same name and same purpose. Does anyone have any tips on how I can keep them straight? I use Beyond Compare 2 to see the differences, but a program that compares all the files' timestamps to see which is most current would help.
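A timestamp comparison like the one asked about can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not an existing tool: the function name `copies_by_age` and the idea of passing one root folder per site (dev, QA, live, local copy) are hypothetical. It groups files by their path relative to each root and sorts the copies newest first.

```python
from collections import defaultdict
from pathlib import Path


def copies_by_age(roots):
    """Map each relative file path to its copies, newest first.

    `roots` is a list of folders that mirror each other (hypothetical
    example: the dev, QA and live copies of the same site).
    Each value is a list of (modification_time, full_path) tuples.
    """
    copies = defaultdict(list)
    for root in roots:
        root_path = Path(root)
        for path in root_path.rglob("*"):
            if path.is_file():
                relative = path.relative_to(root_path).as_posix()
                copies[relative].append((path.stat().st_mtime, path))
    # Sort on the modification time only, newest copy first.
    return {
        relative: sorted(found, key=lambda item: item[0], reverse=True)
        for relative, found in copies.items()
    }
```

The first entry in each list is the most recently modified copy. Modification times are only a hint, since copying a file between drives may reset them, so a diff tool is still needed to see what actually changed.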
Thoughts:
1) Get version control system (Git), otherwise you will continue to have more and more pain.
2) Create an includes/lib folder and reduce those 6-10 files down (to 1).
I'd suggest taking the lead: put your code in version control and push your team to move to the new repository. It'll make everybody's life easier and, most importantly, reduce the chance of merge errors.
Assuming you cannot convince the powers that be to actually use source code control, why not try using Mercurial purely locally. Hopefully you can insulate yourself from some of the noise. You could even make fake users for the contractors and commit & push those changes as though they were actually doing it.
It shouldn't be too hard to get a bureaucrat to see how nice a good gatekeeper like Mercurial or Git would be. It's kind of like helpful red tape!
We have 2 kinds of people at my shop:
The ones who start checking in code from the first successful compilation.
The others, who only check in code when the project is almost done.
I am part of group 1, and trying to convince people of group 2 to act like me. Their arguments are like the following:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
If I am right, please help me to raise arguments to convince them. If you agree with them tell me why.
When someone asked for good excuses not to use version control, they got 75 answers and 45 upvotes.
And when they asked Why should my team adopt source control, they got 26 answers.
Maybe you'll find something helpful there.
You don't need "arguments to convince them." Discourse is not a game, and you should not use your work as a debating platform. That's what your spouse is for :) Seriously, though, you need to explain why you care how other devs work on solo projects in which other people are not involved. What are you missing because they don't use source control? Do you need to see their early ideas to understand their later code? If you can successfully do that, you may be able to convince them.
I personally use version control at all times, but only because I don't walk a tightrope without a net. Other people have more courage, less time to spend on infrastructure, etc. Note that in 2009, in my opinion, hard disks rarely fail and rewritten code is often better than the code that it replaces.
While I'm answering a question with a question, let me ask another one: does your code need to compile/work/not-break-the-build to be checked in? I like my branches to get good and broken, then fixed, working, debugged, etc. At the same time, I like other devs to use source control however they want. Branches were invented for just that reason: so that people who can't get along do not have to cohabitate.
Here's my view to your points.
1) Even solo developers need somewhere to keep their code when their PC fails. What happens if they accidentally delete a file without source control?
2/3) Prototypes belong in source control so other team members can look at the code. We put our prototype code in a separate location from the mainline branch. We call it Spike. Here's a great article on why you should keep Spike code: http://odetocode.com/Blogs/scott/archive/2008/11/17/12344.aspx
If I'm the sole developer on a project (in other words, the repository, or part of it, is under my complete control), then I start committing source code as soon as it's written, and I tend to check in after every incremental change, whether or not it works or represents any kind of milestone.
If I'm working in a repository on a project with others, then I tend to try and make my commits such that they don't break the mainline development, pass any tests, etc.
Whether or not it's a prototype, it deserves to go into source control; prototypes represent a lot of work, and lessons learned from them are valuable. Plus, prototypes have an awful habit of becoming production code, which you'll want in source control.
I try to only write code that compiles (everything else is commented out with a TODO/FIXME tag)... and also add everything to source control.
Argument 1: Even as a single dev it's nice to roll back to a running version, to track your progress, etc.
Argument 2: Who cares if it's just a prototype? You might stumble upon a similar problem in six months or so, and then just start looking for this other code...
Argument 3: Why not use more than one repo? I like to file misc stuff to my personal repo.
Start using source control about 20 minutes before you write your first line of your first artifact. There is never a good time to start after you've begun writing things.
some people can only learn from experience.
like a hard drive failure. or coding yourself into a dead-end after deleting code that actually worked
now, i'm not saying that you should erase their hard drive and then taunt them with "if only you had used source control"...but if something like that were to happen, hopefully there would be a backup done first ;-)
Early and Often. As the Pragmatic Programmers say, source control is like a time machine, and you never know when you'll want to go back.
I would say to them...
I'm the solo developer of this project.
And when you leave or hand it off we'll have 0 developers. All the more reason to use source control.
The code belongs to the company not you and the company would like some accountability. Checking in code doesn't require too much effort:
svn ci <files> -m "implement ajax support for grid control"
Next time someone new wants to make some changes on the grid control or do something related, they will have a great starting point. All projects start off with one or two people. Source control is easier now than it ever was. Have you arranged a 30-minute demo of TortoiseSVN with them?
It's just a prototype, maybe I'll have to rewrite from scratch again.
Are they concerned about storage? Storage is cheap. Are they concerned about time wasted on versioning? It takes less time than the cursory email checks. If they are rewriting bits, then source control is even more important, so they can refer back to the old bits.
I don't want to pollute the Source Control with incomplete versions.
That's actually a good concern. I used to think the same thing at one point and avoided checking in code until it was nice and clean, which is not a bad thing in and of itself, but many times I just wanted to goof around. At this point, learning about branching helps. Though I wish SVN had full support for purging folders like Perforce.
Let's see their arguments:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
First, the 3rd one. I can see the reasoning, but it is based on a bad assumption.
At work, we use Perforce, a centralized VCS, and indeed we only check in source that compiles successfully and doesn't break anything (in theory, of course!), after peer review.
So when I start a non-trivial change, I feel the need for intermediate commits. For example, I recently started to make some changes (working solo on this particular task, which addresses point 1) to Java code using JDom (XML parsing). Then I got stuck and wanted to use Java 1.6's built-in XML parsing instead. It was obviously time to keep a trace of the current work, in case my attempt failed and I wanted to go back. Note that this case also addresses point 2.
The solution I chose is simple: I use an alternative SCM! Although some centralized VCSs like SVN are usable locally (on the developer's computer), I was seduced by distributed VCSs, and after briefly testing Mercurial (which is good), I found Bazaar better suited to my needs and taste.
DVCSs are well suited for this task because they are lightweight and flexible, allow alternative branches, don't "pollute" the source directory (all data is in one directory at the root of the project), etc.
By running a parallel source control system, you don't pollute other developers' source, while keeping the ability to go back or quickly try alternative solutions.
In the end, committing the final version to the official SCM gives the same result, but with added safety for the developer.
I'd like to add two things. With version control you can:
Revert to the last version that worked, or at least check what it looked like. For that you need an SCM that supports changesets / uses whole-tree commits.
Use it to find bugs through so-called 'diff debugging': finding the commit in the history that introduced the bug. You would want an SCM that supports this in an automated or semi-automated fashion.
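The 'diff debugging' idea can be illustrated with a binary search over the history; commands such as `git bisect` and `hg bisect` automate this kind of search. The sketch below is a simplified model, not how any particular tool is implemented: `first_bad_commit` and the `is_bad` callback are hypothetical names, and it assumes the oldest commit is known good, the newest is known bad, and the bug never disappears once introduced.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumptions: `commits` is ordered oldest first, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    """
    good, bad = 0, len(commits) - 1  # indexes of a known-good and a known-bad commit
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_bad(commits[middle]):
            bad = middle   # the bug was introduced at or before `middle`
        else:
            good = middle  # the bug was introduced after `middle`
    return commits[bad]
```

In practice `is_bad` would check out the given revision and run a build or a test; each call halves the range, so even a long history needs only a handful of checks.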
Personally, I often start version control after the first successful compile.
I just wonder why nobody mentioned distributed version control systems in this context: if you could manage to switch over to a distributed system (Git, Bazaar, Mercurial), most arguments of your second group would become pointless, since they can just start their repository locally and push it to the server when they want (and they can also just remove it if they want to restart from scratch).
For me, it's about having a consistent process. If you are writing code, it should follow the same source control process that your production code does. That helps build and enforce good development practices across the development team.
Categorizing the code as a prototype or other non-production type of project should just be used to determine where in the source control tree you put it.
We use both CVS (for non .NET projects) and TFS (for .NET projects) where I work, and the TFS repository has a Developer Sandbox folder where developers can check in personal experimental projects (prototypes).
If and when a project starts to get used in production, the code is moved out of the Developer Sandbox folder into its own folder in the main tree.
I would say you should start adding the source and checking in before you even build the first time. It is then much easier to avoid checking in generated artifacts. I always use some source control, even for my small hobby hacks, just because it automatically filters the relevant from the noise.
So when I start prototyping I might create a project and then before building it I do "git init, git add ., git commit -a -m ..." just so that when I want to move the interesting parts I just clone over using git and then I can add it to the subversion repository or whatever is used where I am working at the moment.
It's called branching, people. Try to get with the program :p Prototyping? Work in a branch. Experimenting? Work in a branch. New feature? Work in a branch.
Merge your branches into the main trunk when it makes sense.
I guess people tend to be laid back when it comes to setting up source control initially if the code may never be used. I have projects I coded belonging to both groups and the ones outside source control are not less important. It is one of those things that gets postponed everyday when it really should not.
On the other hand, I sometimes commit too seldom, which complicates a revert once I screw up some CSS code and don't know what I changed, e.g. to make the footer of the site end up behind the header.
I check-in the project in source control before I start coding.
The first thing I do is create and organize the projects and support files (such as .sln files in .NET development) with the necessary support libraries and tools (usually in a lib folder) I know I will use in my project.
If I already have some code written, then I add it too, even if it is an incomplete application. Then I check in everything. From there, everything is as usual: write some code, compile it, test it, check it in...
You probably won't need to branch from this point or revert your changes, but I think it is a good practice to have everything under source control since the beginning, even if you don't have anything to compile.
I create a directory in source control before I start writing code for a project. I do the first commit after creating the project skeleton.
i'm drunk and i do first git init and then vim foo.cpp.
Any decent modern source control platform (of which VSS is not one) should not in any way be polluted by putting source code files into it. I am of the opinion that anything that has a life expectancy of more than about 1/2 an hour should be in source control as early as possible. Solo development is no longer a valid excuse for not using source control. It is not about security; it is about bugs and long-term history.
I've never really worked with a lot of people where we had to check out code and have repositories of old code, etc. I'm not sure I even know what these terms mean. If I want to start a new project that involves more than myself, tracks all the code changes, and does "check out" (again, I don't know what that means), how do I get started? Is that what SVN is for? Something else? Do I download a program that keeps up with the code?
What do I do?
It will all be in house. No Internet for storing code.
I don't even know if what I am asking for is called source control. I see things about checking out, SVN, source control, and so on. I don't know if it is all talking about the same thing or not. I was hoping to use something open source.
So, a long time ago, in the bad old days of yore, source control used a library metaphor. If you wanted to edit a file, the only way to avoid conflicts was to make sure that you were the ONLY one editing the file. What you'd do is ask the source control system to "check out" that file, indicating that you were editing it and nobody else was allowed to edit it until you made your changes and the file was "checked in". If you needed to make a change to a checked out file, you had to go find that freakin' developer who'd had everythingImportant.conf checked out since last Tuesday..freakin' Bill...
Anyway, source control doesn't really work like that anymore, but the language stuck with us. Nowadays, "checking out" code means downloading a copy of the code from the code repository. The files will appear in a local directory, allowing you to use them, compile the code, and even make changes to the source that you could perhaps upload back to the repository later, should you need to. Even better, with just a single command, you can get all the changes that have been made by other developers since the last time you downloaded the code. Good stuff.
There are several major source control systems, of which SVN (also called Subversion) is one (CVS, Git, HG, Perforce, ClearCase, etc. are others). I recommend starting with SVN, Git, or HG, since they're all free and all have excellent documentation.
You might want to start using source control even if you're the only developer. There's nothing worse than realizing that last night the thousand lines of code you deleted as useless were actually critically important and are now lost forever. Source control allows you to zoom forwards and backwards in the history of your files, letting you easily recover stuff that you should not have removed, and giving you a lot more confidence about deleting useless stuff. Plus, fiddling around with it on your own is good practice.
Being comfortable with source/revision control software is a critical job skill of any serious software engineer. Mastering it will effectively level you up as a professional developer. Coming onto a project and finding that the team keeps all their source in a folder somewhere is an awful experience. Good luck! You're already on the right path just by being interested!
Check out Eric Sink's excellent series of articles:
Source Control HOWTO
I recommend Git and Subversion (SVN) both as free, open-source version control systems that work very well. Git has some nice features given that it can be easier to work decentralized.
Checkout means retrieving a file from a source control system. A source control system is a database (some, like CVS, use just specially marked up text files, but a file system is also a database) that holds all versions of your code (that are checked in after you make modifications).
Microsoft Visual SourceSafe uses a very proprietary database which is prone to corruption if it is not regularly maintained and uses reserved checkouts exclusively. Don't use it, for all those reasons.
The difference between a reserved checkout and an unreserved checkout is that, with an unreserved checkout, two people can be modifying the same file at once. The first one to check in gets in with no problem, and the second one has to update their code to the latest version and merge the changes into theirs (which usually happens automatically, but if the same area of the file was changed, then there is a conflict, which has to be resolved before it can be checked in).
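The "first one in wins, second one must update" rule can be shown with a toy model. This is only a sketch under simplifying assumptions, not how any real system is built: `ToyRepository` and `StaleWorkingCopy` are hypothetical names, the whole repository is a single text, and the merge step is left to the caller.

```python
from __future__ import annotations


class StaleWorkingCopy(Exception):
    """Raised when a check-in is based on an outdated revision."""


class ToyRepository:
    """A toy unreserved-checkout repository holding one text per revision."""

    def __init__(self, content: str) -> None:
        self.revisions = [content]

    @property
    def head(self) -> int:
        """Number of the latest revision."""
        return len(self.revisions) - 1

    def check_out(self) -> tuple[int, str]:
        """Return the latest revision number and its content; nothing is locked."""
        return self.head, self.revisions[self.head]

    def check_in(self, base_revision: int, content: str) -> int:
        """Store new content if it was based on the latest revision."""
        if base_revision != self.head:
            # Someone else checked in first: update and merge, then retry.
            raise StaleWorkingCopy(
                f"based on revision {base_revision}, but head is {self.head}"
            )
        self.revisions.append(content)
        return self.head
```

If two people check out revision 0 and both try to check in, the second call raises `StaleWorkingCopy` until that person checks out the latest revision, merges their change into it, and checks in again.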
For some arguments for unreserved checkouts, see here.
Following this, you will be looking at a build process that independently checks out the code and builds the source code, so that everyone's changes are built and distributed together.
Are you creating the project that requires source control? If so, choose a source control system that meets your needs, and read the documentation for how to get it set up. If you are simply using a previously set up source control system for an existing project, ask a coworker who has been using it, or ask the person who set up the source control system.
For choosing a source control system that meets your needs, most source control systems have extensive descriptions of their features online, many provide evaluation or even completely free products, and there are many many many anecdotal descriptions of what working with each individual source control system is like, which can help.
Just don't use Microsoft Visual SourceSafe if you value your sanity and your code.
What is the best solution for maintaining backup and revision control on live websites?
As part of my job I work with several live websites. We need an efficient means of maintaining backups of the live folders over time. Additionally, updating these sites can be a pain, especially if a change happens to break in the live environment for whatever reason.
What would be ideal would be hassle-free source control. I implemented SVN for a while which was great as a semi-solution for backup as well as revision control (easy reversion of temporary or breaking changes) etc.
Unfortunately SVN places hidden .svn directories everywhere, which cause problems, especially when other developers make folder structure changes or copy/move website directories. I've heard the argument that this is a matter of education etc., but the approach taken by SVN is simply not a practical solution for us.
I am thinking that maybe an incremental backup solution may be better.
Other possibilities include:
SVK, which is command-line only which becomes a problem. Besides, I am unsure on how appropriate this would be.
Mercurial, perhaps with some triggers to hide the distributed component which is not required in this case and would be unnecessarily complicated for other developers.
I experimented briefly with Mercurial but couldn't find a nice way to have the repository separate and kept constantly in sync with the live folder working copy. Maybe as a source control solution (making repository and live folder the same place) combined with another backup solution this could be the way to go.
One downside of Mercurial is that it doesn't place empty folders under source control which is problematic for websites which often have empty folders as placeholder locations for file uploads etc.
Rsync, which I haven't really investigated.
I'd really appreciate your advice on the best way to maintain backups of live websites, ideally with an easy means of retrieving past versions quickly.
Answer replies:
#Kibbee:
It's not so much about education as no familiarity with anything but VSS and a lack of time/effort to learn anything else.
The xcopy/7-zip approach sounds reasonable I guess but it would quickly take up a lot of room right?
As far as source control, I think I'd like the source control to just say that "this is the state of the folder now, I'll deal with that and if I can't match stuff up that's your fault, I'll just start new histories" rather than fail hard.
#Steve M:
Yeah that's a nicer way of doing it but would require a significant cultural change. Having said that I very much like this approach.
#mk:
Nice, I didn't think about using Rsync to deploy. Does this only upload the differences? Overwriting the entire live directory every time we make a change would be problematic due to site downtime.
I am still curious to see if there are any more traditional options.
You can still use SVN, but instead of doing a checkout on your live environment, do an export, that way no .svn directories will be created. The downside, of course, is that no code changes on your live environment can take place. This is a good thing.
As a general rule, code changes on production systems should never be allowed. The change should be made and tested in a development/test/UAT environment, then once confirmed as OK, you can tag that code in SVN with something like RELEASE-x-x-x. Then, on the live system, export the code with that tag.
We use option 3. Rsync. I wrote a bash script to do this along with some extra checking, but here are the basics of what it does.
Make a tag for pushing to live.
Run svn export on that tag.
rsync to live.
So far it has been working out. We don't have to worry about user conflicts or have a separate user for running svn up on the production machine.
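The three steps above (tag, export, rsync) can be sketched as follows. This is not the poster's bash script, only an illustration under stated assumptions: the function names, the trunk/tags repository layout, the example URL, and the rsync flags are all assumptions, and the commands are only printed unless `dry_run` is turned off.

```python
import subprocess


def release_commands(repo_url, tag, export_dir, live_target):
    """Build the tag, export and rsync commands for one release.

    Assumes a conventional <repo_url>/trunk and <repo_url>/tags layout.
    """
    tag_url = f"{repo_url}/tags/{tag}"
    return [
        # 1. Make a tag for pushing to live.
        ["svn", "copy", f"{repo_url}/trunk", tag_url, "-m", f"Tag {tag} for release"],
        # 2. Export the tag: a clean tree without .svn directories.
        #    (svn export expects export_dir not to exist yet.)
        ["svn", "export", tag_url, export_dir],
        # 3. Send only the differences to the live folder.
        #    -a preserves attributes, -z compresses during transfer.
        ["rsync", "-az", f"{export_dir}/", live_target],
    ]


def deploy(repo_url, tag, export_dir, live_target, dry_run=True):
    """Print the commands, or run them in order when dry_run is False."""
    for command in release_commands(repo_url, tag, export_dir, live_target):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

A call such as `deploy("http://svn.example.com/site", "RELEASE-1-0-0", "/tmp/site-export", "deploy@live:/var/www/site/")` prints the three commands for review. The sketch deliberately leaves out rsync's `--delete` option, since that would remove files that exist only on the live side, such as user uploads.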
Any source control solution you pick is going to have problems if people are moving, deleting, or adding files and not telling the source control system about it. I'm not aware of any source control system that could solve this problem.
In the case where you just can't educate the people working on the project[1], then you may just have to go with daily snapshots. Something as simple as a batch file using xcopy to a network drive, and possibly 7-zip on the command line to compress it so it doesn't take up too much space, would probably be the simplest solution.
[1] I would highly disbelieve this; it's probably more a case of people being too stubborn and not willing to learn, or do "extra work". Never mind how much time source control could save them when they have to go back to previous versions, or when 2 people have edited the same file.
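The daily-snapshot idea can also be sketched in Python instead of a batch file. This is a stand-in for the xcopy/7-zip approach, not the same tools: it uses the standard library's zip support, and the function name `snapshot`, the folder arguments, and the archive naming scheme are hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(site_dir, backup_dir, now=None):
    """Zip the whole site folder into backup_dir under a timestamped name.

    backup_dir should live outside site_dir (for example on a network
    drive), so that snapshots do not end up inside later snapshots.
    """
    now = now or datetime.now()
    backup_path = Path(backup_dir)
    backup_path.mkdir(parents=True, exist_ok=True)
    base_name = backup_path / f"{Path(site_dir).name}-{now:%Y%m%d-%H%M%S}"
    # make_archive appends ".zip" and returns the archive's file name.
    return Path(shutil.make_archive(str(base_name), "zip", root_dir=site_dir))
```

Run from a daily scheduled task, this leaves one compressed archive per day. Each archive is a full copy, so old snapshots still need pruning to keep disk use under control.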
rsync will only upload the differences. I haven't personally used it, but Mark Pilgrim wrote a long time ago about how it even handles binary diffs brilliantly.
svn+rsync sounds like a fantastic solution. I'll have to try that in the future.
After being told by at least 10 people on SO that version control was a good thing even if it's just me I now have a followup question.
What is the difference between all the different types of version control and is there a guide that anybody knows of for version control that's very simple and easy to understand?
We seem to be in the golden age of version control, with a ton of choices, all of which have their pros and cons.
Here are the ones I see most used:
svn - currently the most popular open source?
git - very hot since Linus switched to it
mercurial - some smart people I know swear by it
cvs - the one everybody is switching from
perforce - imho, the best features, but it's not open source. The two-user license is free, though.
visual sourcesafe - I'm not much in the Microsoft world, so I have no idea about this one, other than people like to rag on it as they rag on everything from Microsoft.
sccs - for historical interest we mention this, the great-granddaddy of many of the above
rcs - and the granddaddy of many of the above
My recommendation: you're safest with either git, svn or perforce, since a lot of people use them, they are cross platform, have good guis, you can buy books about them, etc.
Don't consider cvs, sccs, or rcs; they are antique.
The nice thing is that, since your projects will be relatively small, you will be able to move your code to a new system once you're more experienced and decide you want to work with another system.
Eric Sink has a good overview of source control. There are also some existing questions here on SO.
To everyone just starting using version control:
Please do not use git (or hg or bzr) because of the hype
Use git (or hg or bzr) because they are better tools for managing source code than SVN.
I used SVN for a few years at work, and switched over to git 6 months ago. Without learning SVN first I would be totally lost when it comes to using a DVCS.
For people just starting out with version control:
Start by downloading SVN
Learn why you need version control
Learn how to commit, checkout, branch
Learn why merging in SVN is such a pain
Then switch over to a DVCS and learn:
How to clone/branch/commit
How easy it is to merge your branches back (go branch crazy!)
How easy it is to rewrite commit history and keep your branches up to date with the main line (git rebase -i)
How to publish your changes so others can benefit
tl;dr crowd:
Start with SVN and learn the basics, then graduate to a DVCS.
I would start with:
A Visual Guide to Version Control
Wikipedia
Then once you have read up on it, download and install SVN, TortoiseSVN and skim the first few chapters of the book and get started.
Version Control is essential to development, even if you're working by yourself because it protects you from yourself. If you make a mistake, it's a simple matter to rollback to a previous version of your code that you know works. This also frees you to explore and experiment with your code because you're free of having to worry about whether what you're doing is reversible or not. There are two major branches of Version Control Systems (VCS), Centralized and Distributed.
Centralized VCS are based on using a central server, where everyone "checks out" a project, works on it, and "commits" their changes back to the server for anybody else to use. The major Centralized VCS are CVS and SVN. Both have been heavily criticized because "merging" "branches" is extremely painful with them. [TODO: write explanation on what branches are and why merging is hard with CVS & SVN]
Distributed VCS let everyone have their own server, where you can "pull" changes from other people and "push" changes to a server. The most common Distributed VCS are Git and Mercurial. [TODO: write more on Distributed VCS]
If you're working on a project, I strongly recommend using a distributed VCS. I recommend Git because it's blazingly fast, but it has been criticized as being too hard to use. If you don't mind using a commercial product, BitKeeper is supposedly easy to use.
The answer to another question also applies here, most importantly
Jon Works said:
The most important thing about version control is:
JUST START USING IT
His answer goes into more detail, and I don't want to be accused of plagiarism, so take a look.
The simple answer is, do you like Undo buttons? The answer is of course yes, because we as human beings make mistakes all the time.
As programmers, though, it's often the case that it can take several hours of testing, code changes, overwrites, deletions, file moves and renames before we work out that the method we are trying to use to fix a problem is entirely the wrong one and the code is more broken than when we started.
As such, Source Control is a massive Undo button to revert the code to an earlier time when the grass was green and the food plentiful. And not only that, because of how source control works, you can still keep a copy of your broken code, in case a few weeks down the line you want to refer to it again and cherry pick any good ideas that did come out of it.
I personally (though it could be called overkill) use a free single-user license version of SourceGear Fortress (which is their Vault source control product with bug tracking features). I find the UI really simple to use, and it supports both the checkout > edit > checkin model and the edit > merge > commit model. It can be a little tricky to set up though, requiring you to run a local copy of IIS and SQL Server. You might want to try a smaller program, like those recommended by other answers here. See what you like and what you can afford.
Mark said:
git - very hot since Linus switched to it
I just want to point out that Linus didn't switch to it, Linus wrote it.
If you are working by yourself in a Windows environment, then the single user license for SourceGear's Vault is free.
We use and like Mercurial. It follows a distributed model - it eliminates some of the sense of having to "check in" work. Mozilla has moved to Mercurial, which is a good sign that it's not going to go away any time soon. One con, in my opinion, is that there isn't a very good GUI for it. If you're comfortable with the command line, though, it's pretty handy.
Mercurial Documentation
Unofficial Manual
Just start using source control, no matter what type you use. What you use doesn't matter; it's the use of it that is important
As everyone else has said, the choice of SC really depends on your needs, your budget, your environment, etc.
At its root, source control is designed to provide a central repository of all your code, and track who did what to it when. There should be a complete history, and you can get products that do full changelogs, auditing, access control, and on and on...
Each product that is out there starts to shine (so to speak) when you start to look at how you want or need to incorporate SC into your environment (whether it's your personal code and documents or a large corporation's). And as people use them, they discover that the tool has limitations, so people write new ones. SVN was born out of limitations that the creators saw with CVS. Linus wanted something better for the Linux kernel, so now we have git.
I would say start using one (something like SVN which is very popular and pretty easy to use) and see how it goes. As time progresses you may find that you need some other functionality, or need to interface with other systems, so you may need SourceSafe or another tool.
Source control is always important, and while you can get away with manually re-numbering versions of PSD files or something as you work on them, you're going to forget to run that batch script once or twice, or likely forget which number went with which change. That's where most of these SC tools can help (as long as you check-in/check-out).
See also this SO question:
Difference between GIT and CVS