Version control personally and simply? - version-control

Requirement
make history for web text/code source files.
login-worker is only me, i.e personal usage.
automatically save history for each updated files(no require at once but at least once per week)
It must be a simple way to start and work.
I have 3 work places so need to do async files.
(not must but hopefully for future working environment) Any other non-engineer can also understand the location of history file and can see it easily.
Current way:
I made history folder the day, download files in there for edit, copy files when I edit/creat new one.
Advantage of the current way:
Very quick and simple, no need to do additional task to make history
Disadvantage of the current way:
Messy. Whenever day I work, I create a new history folder to keep downloaded files, so that it is messy in Finder(or windows explore).
Also, I don't have a way to Doing Async files for sure with in other places.
I tested to use GIT before, I had Thought GIT automatically save files I edit and save with a editor, but that was not the case. Also GIT is too complicated to use/start. If you recommend GIT, you need to show me ways to deal with the problem I had, for instance, simple GIT GUI with limited options without merging/project/branch etc because of personal usage for maintaining just one website.
Do you know any way to do version control personally and simply?
Thanks.

Suppose you entered <form ...> in your HTML—without the closing tag—and saved the file; do you really think the commit created by our imaginary VCS picked up that file's update event would have any sense?
What I mean, is that as with writing programs¹,
the history of source code changes are there for humans to read,
and for that matter, a good history graph should really read like a prose:
each commit should be atomic in the sense it comprises one (small) but
internally integral feature or fixes a bug, and had to be properly annotated
so that the intent of the change captured by that commit is clear.
What you want instead is just some dumb stream of changes purely for backup purposes.
Well, if you're fully aware of the repercussions (the most glaring one is that the generated history is completely useless for doing development on
the project and can only be used for rollbacks in case of "oopsies"),
there are two ways to go:
Some IDEs (namely, Eclipse) save a backup copy of each file they manage
on each save—thus providing your with such a rollback functionality w/o
using any VCS.
Script around any VCS you like: say, on Linux,
you start something like inotifywait telling it to watch your
project's root directory, recurvively, for write events on files,
read whatever the tool prints to its stdout when these events happen,
and for each event, call to your VCS of choice to record a new commit
with these changes.
¹ «Programs must be written for people to read, and only incidentally for machines to execute.» — Abelson & Sussman, "Structure and Interpretation of Computer Programs", preface to the first edition.

I strongly suggest you to have a deeper look at git.
It may looks difficult at the beginning, but you should spend some time learning it, that's all. All the problems above could be easily solved if you spend some time to learn the basics. There is also a nice "tutorial" on github on how to use git, no need to install anything: https://try.github.io/levels/1/challenges/1.

Related

Is there a version control system that is completely invisible to use while writing code?

I would like to use version control but I don't want to continuously commit several times an hour. Is there a version control system that records everything while you program so you don't have to commit, but still lets you go back to a previous state of your code?
Dropbox can do it. It records every change that you make.
You can do something like this if you're using the ZFS filesystem.
However, from a version-control point of view, I really don't think it's a good idea to store every changes. The size of you repository will become huge really fast.
And FYI, you don't have to commit several times an hour, I rarely do more than 4-5 commits a day.
Some operating systems make mounting WebDAV shares into the filesystem very easy; you could configure an SVN server to export WebDAV, mount the export into your filesystem, and get to work.
Don't forget to configure your editor to store temporary files or backup files somewhere other than your source tree or current working directory. Otherwise you'll have a ton of useless files cluttering up your source control system, making it harder to use in the future.
But finding which version to revert to can be pretty difficult without check in comments or changesets linking related changes together; it might not be worth the effort of configuring the entire system if it is too difficult to use to undo specific changes.

File history: in the source or let scm handle it?

I'm learning mercurial as my solo scm software. With other management software, you can put change comments into the file header through tags. With hg you comment the change set, and that doesn't get into the source. I'm more used to central control like VSS.
Why should I put the file history into the header of the source file? Should I let mercurial manage the history with my changeset comments?
Let the source control system handle it.
If you put change details in the header it will soon become unwieldy and overwhelm the actual code.
Additionally if the scm has the concept of changelists (where many files are grouped into a single change) then you'll be able to write the comment so that it applies to the whole change and not just the edits in the one file (if that makes sense), giving you a clearer picture of why the edit was required.
Yes; let the source control system handle your changeset comments. The rationale for this is that it makes considerably more sense when you're viewing the change log later, trying to work out what's going on between two versions of a file - the source control system can present the change comment to try and enlighten the situation.
There's no reason to manually maintain a file history when SCM software is much better suited to solve this problem. All too often I see partially-completed file histories in the source, which actually hurts, because people incorrectly assume it is accurate.
The difference is not whether it's a centralized or distributed VCS, it's more about what's being changed.
When I moved to .Net, the number of files updated for any individual change seemed to skyrocket. If I had to log the change in each file, I'd never get any real work done. By commenting on the set of changes, it doesn't matter how many files I had to update.
If I ever needed to identify all of the changes for a particular change, I can diff between the two versions of the project.
The biggest difference (and advantage) I saw when switching away from SourceSafe was the switch from file based to project based commits. As soon as I got used to that, I stopped adding change-log type comments to all of my files.
(As a side effect, I've found that my process description comments have gotten better)
I'm not a big proponent of littering the code with change comments. In the event they are needed they can be looked up in the SCM (at least for the SCM variants I have used). If you do want them in the file, consider putting them at the end instead of the beginning. That way you won't have to scroll down past the (uninteresting, to me at least) comments before you get to the actual code.
Another vote for letting the SCM system handle the checkin comments, but I do have one thing to add.
Some systems allow you to use RCS tags in your source code where the SCM can insert the change history directly into the source file being committed automatically. Sounds like a nice balance because the history is then in the SCM system and then automatically put into the source code itself.
The problem is that this process changes the source file. I think that's a bad idea because the file cannot be changed on disk until after you comment is inserted. If you were a good engineer, you should have built and tested changes before the commit. If your source changes after the commit, then you've essentially got a build that could be broken - but most engineers won't build after a commit - why should they?
But it's just a comment you say! True, but I did have a case where there was code in my source file that strangely enough had reason to look like an RCS header tag and that section of the code got replaced on checkin, thereby munging my code. Easy enough to fix, but bad that a build got broken for 20+ users
Much easier to forget to maintain history in the source, as one always (imo) should comment commits to source control system that problem dissappers. Also if changing lots of files before commit, changing history in every file will be annoying work. This is really one of the points with having scm.
I have experience with this. I've had the file history in the comments, it was awful. Nothing but garbage, sometimes you would have to scroll down almost 1k lines of code changes before you finally got to what you wanted. Not to mention, you're slowing down other aspects of your build process by adding more kb to your source code tree.

When to start to use source control in early stages of development?

We have 2 kinds of people at my shop:
The ones that starts to check-in the code since the first successful compilation.
The others that only checks-in the code when the project is almost done.
I am part of group 1, and trying to convince people of group 2 to act like me. Their arguments are like the following:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
If I am right, please help me to raise arguments to convince them. If you agree with them tell me why.
When someone asked for good excuses not to use version control, they got 75 answers and 45 upvotes.
And when they asked Why should my team adopt source control, they got 26 answers.
Maybe you'll find something helpful there.
You don't need "arguments to convince them." Discourse is not a game, and you should not use your work as a debating platform. That's what your spouse is for :) Seriously, though, you need to explain why you care how other devs work on solo projects in which other people are not involved. What are you missing because they don't use source control? Do you need to see their early ideas to understand their later code? If you can sucessfully do that, you may be able to convince them.
I personally use version control at all times, but only because I don't walk a tightrope without a net. Other people have more courage, less time to spend on infrastructure, etc. Note that in 2009, in my opinion, hard disks rarely fail and rewritten code is often better than the code that it replaces.
While I'm answering a question with a question, let me ask another one: does your code need to compile/work/not-break-the-build to be checked in? I like my branches to get good and broken, then fixed, working, debugged, etc. At the same time, I like other devs to use source control however they want. Branches were invented for just that reason: so that people who can't get along do not have to cohabitate.
Here's my view to your points.
1) Even solo developers need somewhere to keep their code when their PC fails. What happens if they accidentally delete a file without source control?
2/3) Prototypes belong in source control so other team members can look at the code. We put our prototype code in a seperate location to the mainline branch. We call it Spike. Here's a great article on why you should keep Spike code- http://odetocode.com/Blogs/scott/archive/2008/11/17/12344.aspx
If I'm the sole developer on a project (in other words, the repository, or part of it, is under my complete control), then I start committing source code as soon as it's written, and I tend to check in after every incremental change, whether or not it works or represents any kind of milestone.
If I'm working in a repository on a project with others, then I tend to try and make my commits such that they don't break the mainline development, pass any tests, etc.
Whether or not it's a prototype, it deserves to go into source control; prototypes represent a lot of work, and lessons learned from them are valuable. Plus, prototypes have an awful habit of becoming production code, which you'll want in source control.
I try to only write code that compiles (everything else is commented out with a TODO/FIXME tag)... and also add everything to source control.
Argument 1: Even as a single dev it's nice to roll back to a running version, to track your progress, etc.
Argument 2: Who cares if it's just a prototype? You might stumble upon a similar problem in six months or so, and then just start looking for this other code...
Argument 3: Why not use more than one repo? I like to file misc stuff to my personal repo.
Start using source control about 20 minutes before you write your first line of your first artifact. There is never a good time to start after you're begun writing things.
some people can only learn from experience.
like a hard drive failure. or coding yourself into a dead-end after deleting code that actually worked
now, i'm not saying that you should erase their hard drive and then taunt them with "if only you had used source control"...but if something like were to happen, hopefully there would be a backup done first ;-)
Early and Often. As the Pragmatic Programmers say, source control is like a time machine, and you never know when you'll want to go back.
I would say to them...
I'm the solo developer of this project.
And when you leave or hand it off we'll have 0 developers. All the more reason to use source control.
The code belongs to the company not you and the company would like some accountability. Checking in code doesn't require too much effort:
svn ci <files> -m " implement ajax support for grid control
Next time someone new wants to make some changes on the grid control or do something related, they will have a great starting point. All projects start off with one or two people. Source control is easier now than it ever was--have they arranged a 30 minute demo of Tortoise SVN with them?
It's just a prototype, maybe I'll have to rewrite from scratch again.
Are they concerned about storage? Storage is cheap. Are they concerned about time wasted on versioning? It takes less time then the cursory email checks. If they are re-writing bits then source control is even more important to be able to reference old bits.
I don't want to pollute the Source Control with incomplete versions.
That's actually a good concern. I used to think the same thing at one point and avoided checking in code until it was nice and clean which is not a bad thing in and of itself but many times I just wanted to goof around. At this point learning about branching helps. Though I wish wish SVN had full support for purging folders like Perforce.
Let see their arguments:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
First, the 3rd one. I can see the reasoning, but it is based on a bad assumption.
At work, we use Perforce, a centralized VCS, and indeed we only check in source that compile successfully and doesn't break anything (in theory, of course!), after peer review.
So when I start a non trivial change, I feel the need to intermediary commits. For example, recently I started to make some changes (somehow, in solo for this particular task, so I address point 1) on a Java code using JDom (XML parsing). Then I was stuck and wanted to use Java 1.6's built in XML parsing. It was obviously time to keep a trace of the current work, in case my attempt was failed and wanted to go back. Note this case somehow addresses the point 2.
The solution I chose is simple: I use an alternative SCM! Although some centralized VCS like SVN are usable in local (on the developer's computer), I was seduced by distributed VCS and after briefly testing Mercurial (which is good), I found Bazaar better suited to my needs and taste.
DVCS are well suited for this task because they are lightweight, flexible, allow alternative branches, doesn't "pollute" the source directory (all data is in one directory at the root of the project), etc.
By making a parallel source management, you don't pollute the source of other developers, while keeping the possibility to go back or quickly try alternative solutions.
At the end, by committing the final version to the official SCM, the result is the same, but with added security at the level of the developer.
I'd like to add two things. With version control you can:
Revert to last version that worked, or at least check how it looked like. For that you would need SCM which supports changesets / uses whole-tree commits.
Use it to find bugs, by using so called 'diff debugging' by finding commit in history that introduced the bug. You would want SCM which support it in automated or semi-automated fashion.
Personally, I often start version control after the first sucessful compile.
I just wonder why nobody mentioned distributed version control systems in this context: If you could manage to switch over to a distributed system (git, bazaar, mercury), most arguments of your second group would become pointless since they can just start their repository locally and push it to the server when they want (and they can also just remove it, if they want to restart from scratch).
For me, it's about having a consistent process. If you are writing code, it should follow the same source control process that your production code does. That helps build and enforce good development practices across the development team.
Categorizing the code as a prototype or other non-production type of project should just be used to determine where in the source control tree you put it.
We use both CVS (for non .NET projects) and TFS (for .NET projects) where I work, and the TFS repository has a Developer Sandbox folder where developers can check in personal experimental projects (prototypes).
If and when a project starts to get used in production, the code is moved out of the Developer Sandbox folder into it's own folder in the main tree.
I would say you should start adding the source and checking in before you even build the first time. It is then much easier to avoid checking in generated artifacts. I always use some source control, even for my small hobby hacks, just because it automatically filters the relevant from the noise.
So when I start prototyping I might create a project and then before building it I do "git init, git add ., git commit -a -m ..." just so that when I want to move the interesting parts I just clone over using git and then I can add it to the subversion repository or whatever is used where I am working at the moment.
It's called branching people try to get with the program :p Prototyping? Work in a branch. Experimenting? Work in a branch. New feature? Work in a branch.
Merge your branches into the main trunk when it makes sense.
I guess people tend to be laid back when it comes to setting up source control initially if the code may never be used. I have projects I coded belonging to both groups and the ones outside source control are not less important. It is one of those things that gets postponed everyday when it really should not.
On the other hand I sometimes commit too seldom complicating a revert once I screw up some CSS code and not knowing what I changed e.g. to make the footer of the site end up behind the header.
I check-in the project in source control before I start coding.
The first thing I do is create and organize the projects and support files (such as .sln files in .NET development) with the necessary support libraries and tools (usually in a lib folder) I know I will use in my project.
If I already have some code written, then I add it too, even if it is an incomplete application. Then I check-in everything. From there, everything is as usual, write some code, compile it, test it, check-in it...
You probably won't need to branch from this point or revert your changes, but I think it is a good practice to have everything under source control since the beginning, even if you don't have anything to compile.
I create a directory in source control before I start writing code for a project. I do the first commit after creating the project skeleton.
i'm drunk and and i do first git -init and then vim foo.cpp.
Any decent modern source control platform (of which VSS is not one) should not in any way be polluted by putting source code files into it. I am of the opinion that anything that has a life expectancy of more than about 1/2 an hour should be in source control as early as possible. Solo develpment is no longer a valid excuse for not using source control. It is not about security it is about bugs and long term history.

How do you "check out" code?

I've never really worked with a lot of people where we had to check out code and have repositories of old code, etc. I'm not sure I even know what these terms mean. If I want to to start a new project that involves more than myself that tracks all the code changes, does "check out" (again, don't know what that means), how do I get started? Is that what SVN is for? Something else? Do I download a program that keeps up with the code?
What do I do?
It will all be in house. No Internet for storing code.
I don't even know if what I am asking for is called source control. I see things about checking out, SVN, source control, and so on. I don't know if it is all talking about the same thing or not. I was hoping to use something open source.
So, a long time ago, in the bad old days of yore, source control used a library metaphor. If you wanted to edit a file, the only way to avoid conflicts was to make sure that you were the ONLY one editing the file. What you'd do is ask the source control system to "check out" that file, indicating that you were editing it and nobody else was allowed to edit it until you made your changes and the file was "checked in". If you needed to make a change to a checked out file, you had to go find that freakin' developer who'd had everythingImportant.conf checked out since last Tuesday..freakin' Bill...
Anyway, source control doesn't really work like that anymore, but the language stuck with us. Nowadays, "checking out" code means downloading a copy of the code from the code repository. The files will appear in a local directory, allowing you to use them, compile the code, and even make changes to the source that you could perhaps upload back to the repository later, should you need to. Even better, with just a single command, you can get all the changes that have been made by other developers since the last time you downloaded the code. Good stuff.
There are several major source control libraries, of which SVN (also called Subversion) is one (CVS, Git, HG, Perforce, ClearCase, etc are others). I recommend starting with SVN, Git, or HG, since they're all free and all have excellent documentation.
You might want to start using source control even if you're the only developer. There's nothing worse than realizing that last night the thousand lines of code you deleted as useless were actually critically important and are now lost forever. Source control allows you to zoom forwards and backwards in the history of your files, letting you easily recover stuff that you should not have removed, and giving you a lot more confidence about deleting useless stuff. Plus, fiddling around with it on your own is good practice.
Being comfortable with source/revision control software is a critical job skill of any serious software engineer. Mastering it will effectively level you up as a professional developer. Coming onto a project and finding that the team keeps all their source in a folder somewhere is an awful experience. Good luck! You're already on the right path just by being interested!
Check out Eric Sink's excellent series of articles:
Source Control HOWTO
I recommend Git and Subversion (SVN) both as free, open-source version control systems that work very well. Git has some nice features given that it can be easier to work decentralized.
Checkout means retrieving a file from a source control system. A source control system is a database (some, like CVS, use just specially marked up text files, but a file system is also a database) that holds all versions of your code (that are checked in after you make modifications).
Microsoft Visual SourceSafe uses a very proprietary database which is prone to corruption if it is not regularly maintained and uses reserved checkouts exclusively. Don't use it, for all those reasons.
The difference between a reserved checkout and an unreserved checkout is in an unreserved checkout; two people can be modifying the same file at once. The first one to check in gets in no problem, and the second one has to update their code to the latest version and merge the changes into theirs (which usually happens automatically, but if the same area of the file was changed, then there is a conflict, which has to be resolved before it can be checked in).
For some arguments for unreserved checkouts, see here.
Following this, you will be looking at a build process that independently checks out the code and builds the source code, so that everyone's changes are built and distributed together.
Are you creating the project that requires source control? If so, choose a source control system that meets your needs, and read the documentation for how to get it set up. If you are simply using a previously set up source control system for an existing project, ask a coworker who has been using it, or ask the person who set up the source control system.
For choosing a source control system that meets your needs, most source control systems have extensive descriptions of their features online, many provide evaluation or even completely free products, and there are many many many anecdotal descriptions of what working with each individual source control system is like, which can help.
Just don't use Microsoft Visual SourceSafe if you value your sanity and your code.

What is the best solution for maintaining backup and revision control on live websites?

What is the best solution for maintaining backup and revision control on live websites?
As part of my job I work with several live websites. We need an efficient means of maintaining backups of the live folders over time. Additionally, updating these sites can be a pain, especially if a change happens to break in the live environment for whatever reason.
What would be ideal would be hassle-free source control. I implemented SVN for a while which was great as a semi-solution for backup as well as revision control (easy reversion of temporary or breaking changes) etc.
Unfortunately SVN places .SVN hidden directories everywhere which cause problems, especially when other developers make folder structure changes or copy/move website directories. I've heard the argument that this is a matter of education etc. but the approach taken by SVN is simply not a practical solution for us.
I am thinking that maybe an incremental backup solution may be better.
Other possibilities include:
SVK, which is command-line only which becomes a problem. Besides, I am unsure on how appropriate this would be.
Mercurial, perhaps with some triggers to hide the distributed component which is not required in this case and would be unnecessarily complicated for other developers.
I experimented briefly with Mercurial but couldn't find a nice way to have the repository seperate and kept constantly in-sync with the live folder working copy. Maybe as a source control solution (making repository and live folder the same place) combined with another backup solution this could be the way to go.
One downside of Mercurial is that it doesn't place empty folders under source control which is problematic for websites which often have empty folders as placeholder locations for file uploads etc.
Rsync, which I haven't really investigated.
I'd really appreciate your advice on the best way to maintain backups of live websites, ideally with an easy means of retrieving past versions quickly.
Answer replies:
#Kibbee:
It's not so much about education as no familiarity with anything but VSS and a lack of time/effort to learn anything else.
The xcopy/7-zip approach sounds reasonable I guess but it would quickly take up a lot of room right?
As far as source control, I think I'd like the source control to just say that "this is the state of the folder now, I'll deal with that and if I can't match stuff up that's your fault, I'll just start new histories" rather than fail hard.
#Steve M:
Yeah that's a nicer way of doing it but would require a significant cultural change. Having said that I very much like this approach.
#mk:
Nice, I didn't think about using Rsync to deploy. Does this only upload the differences? Overwriting the entire live directory everytime we make a change would be problematic due to site downtime.
I am still curious to see if there are any more traditional options
You can still use SVN, but instead of doing a checkout on your live environment, do an export, that way no .svn directories will be created. The downside, of course, is that no code changes on your live environment can take place. This is a good thing.
As a general rule, code changes on production systems should never be allowed. The change should be made and tested in a development/test/UAT environment, then once confirmed as OK, you can tag that code in SVN with something like RELEASE-x-x-x. Then, on the live system, export the code with that tag.
We use option 3. Rsync. I wrote a bash script to do this along with some extra checking, but here are the basics of what it does.
Make a tag for pushing to live.
Run svn export on that tag.
rsync to live.
So far it has been working out. We don't have to worry about user conflicts or have a separate user for running svn up on the production machine.
Any source control solution you pick is going to have problems if people are moving, deleting, or adding files and not telling the source control system about it. I'm not aware of any source control item that could solve this problem.
In the case where you just can't educate the people working on the project[1], then you may just have to go with daily snapshots. Something as simple as batch file using xcopy to a network drive, and possibly 7-zip on the command line to compress it so it doesn't take up too much space would probably be the simplest solution.
[1] I would highly disbelieve this, probably just more a case of people being too stubborn and not willing to learn, or do "extra work". Nevermind how much time source control could save them when they have to go back to previous versions, or 2 people have edited the same file.
rsync will only upload the differences. I haven't personally used it, but Mark Pilgrim wrote a long time ago about how it even handles binary diffs brilliantly.
svn+rsync sounds like a fantastic solution. I'll have to try that in the future.