Website may have up to 10 files with same name/purpose, no version control - version-control

I am new at working for a large company with various people working on the same files. Sadly we don’t have version control and I often find myself cross eyed. For lack of better terminology, we have a dev site, quality-assurance site, and the live site. We have most files in two languages. Since the network connected drives have an average transfer rate of 15kb/sec we often copy the files locally before working on them. Also contractors send us new versions of files, but we may have made changes on our side and everything gets screwed up.
Basically I’m working with 6-10 files with the same name and same purpose. Does anyone have any tips on how I can keep them straight? I use Beyond Compare 2 to see the differences but if there’s a program that compares all files time stamps to see which is most current may help.

Thoughts:
1) Get version control system (Git), otherwise you will continue to have more and more pain.
2) Create a includes/lib folder and reduce that 6-10 files down (to 1).

I'll suggest, take a lead and put your code in version control and push your team to move to new repository. It'll make everybody's life easier and most important reduce chances of any merge error.

Assuming you cannot convince the powers that be to actually use source code control, why not try using Mercurial purely locally. Hopefully you can insulate yourself from some of the noise. You could even make fake users for the contractors and commit & push those changes as though they were actually doing it.
It shouldn't be too hard to get a bureaucrat to see how nice a good gatekeeper like Mercurial or Git would be. Its kind of like helpful red tape!

Related

How should I start with tracking file changes/versions?

I've been working with a lot of my files on the go recently, and in the process often times accumulated several copies of files in different stages of completion/revision. I'm working on any number of projects at a given time, so it's not always easy to remember or figure out quickly which version I should continue working on.
What type of options would you recommend that allow me to track changes locally and if possible with files I work on while at a remote location? I've never worked with file versioning or tracking systems, so not sure what direction I should be looking in. I work mostly with HTML, CSS, and PHP.
Any help is awesomely appreciated! Thanks.
PS. Don't know if I should have this in a separate question but what options are available for the same type of thing, change tracking/logging for files on server? Preferably something that not only vaguely notes a file has been changed, but that tracks specific changes that have occurred in files.
It's seems to me that github is prefect choice for your requirement. You can create repository for maintain the history, it's easy to use and it is free
https://github.com/

best practice in storing non-source files under version control

I have a project under version control, but the project has some images, videos and zip files that change every so often. I don't want to store these files under version control because they take up a lot of space and make updates and commits very slow.
What's a good way of dealing with this issue and still commit non-source files that have changed? is there a better way?
I'm currently using subversion, if there is another version control client that is better for dealing with this issue, please recommend it!
I have lots of non-source files in SVN and the only time it slows down the commit is when I change them. I don't see how this is an issue if they're only changing "every so often". Also the size really shouldn't be a concern. If your repository is on a server and you're worried about how much space it's taking up you need to upgrade. Hard drives are cheap. Buy them.
Some people feel strongly that non-source files don't belong in source control, I say an entire project should be stored in source control. That way if my development system goes down I can switch to another and after a couple minutes of downloading the project I'm back to coding.
You added in a comment:
The problem I have with this is that I don't care about the version history of those zip/video files, as long as they're the newest ones, there is no problem.
That means you have a linear workflow of development, only working on the LATEST of one main branch.
You do not seem to deal with the phase "after release", where you have to:
maintain what runs in production
develop small evolutions...
..while doing massive refactoring for experimenting some big evolutions
In the last three cases, the question of "what were the exact images, videos and zip files "I have to use" or "I was using at the time" might become important.
Anyhow, if you feel SVN do not handle them appropriately, I would still recommend having some way to remember that, for SVN revision xxx to yyy, you were using the version 'z' of your set of binaries.
For that, you could setup an external repository like Maven. See question "Is it acceptable/good to store binaries in SVN?" (my answer in that question is near the top of the page, but I link directly to Evan's answer as he mentions Maven).
While I find it a huge pain to deal with non-text files in version control, especially ones that change a lot, I've accepted the practice of "if it is needed for the build/installer it should go in version control". This of course is not a hard rule. I don't keep 3rd party libraries under version control (though know people who do).
I came under this opinion after setting up a Continuous Integration server at my shop. Having everything needed for the build that may change make's it much easier. As previously mentioned I don't keep libs under version control, but that is due to the fact that we rarely upgrade/add new libraries. If this is not the case for your shop then you may consider doing so. Also, if your images/videos/zips change more then once a year then I'd recommend keeping them under version control.

When to start to use source control in early stages of development?

We have 2 kinds of people at my shop:
The ones that starts to check-in the code since the first successful compilation.
The others that only checks-in the code when the project is almost done.
I am part of group 1, and trying to convince people of group 2 to act like me. Their arguments are like the following:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
If I am right, please help me to raise arguments to convince them. If you agree with them tell me why.
When someone asked for good excuses not to use version control, they got 75 answers and 45 upvotes.
And when they asked Why should my team adopt source control, they got 26 answers.
Maybe you'll find something helpful there.
You don't need "arguments to convince them." Discourse is not a game, and you should not use your work as a debating platform. That's what your spouse is for :) Seriously, though, you need to explain why you care how other devs work on solo projects in which other people are not involved. What are you missing because they don't use source control? Do you need to see their early ideas to understand their later code? If you can sucessfully do that, you may be able to convince them.
I personally use version control at all times, but only because I don't walk a tightrope without a net. Other people have more courage, less time to spend on infrastructure, etc. Note that in 2009, in my opinion, hard disks rarely fail and rewritten code is often better than the code that it replaces.
While I'm answering a question with a question, let me ask another one: does your code need to compile/work/not-break-the-build to be checked in? I like my branches to get good and broken, then fixed, working, debugged, etc. At the same time, I like other devs to use source control however they want. Branches were invented for just that reason: so that people who can't get along do not have to cohabitate.
Here's my view to your points.
1) Even solo developers need somewhere to keep their code when their PC fails. What happens if they accidentally delete a file without source control?
2/3) Prototypes belong in source control so other team members can look at the code. We put our prototype code in a seperate location to the mainline branch. We call it Spike. Here's a great article on why you should keep Spike code- http://odetocode.com/Blogs/scott/archive/2008/11/17/12344.aspx
If I'm the sole developer on a project (in other words, the repository, or part of it, is under my complete control), then I start committing source code as soon as it's written, and I tend to check in after every incremental change, whether or not it works or represents any kind of milestone.
If I'm working in a repository on a project with others, then I tend to try and make my commits such that they don't break the mainline development, pass any tests, etc.
Whether or not it's a prototype, it deserves to go into source control; prototypes represent a lot of work, and lessons learned from them are valuable. Plus, prototypes have an awful habit of becoming production code, which you'll want in source control.
I try to only write code that compiles (everything else is commented out with a TODO/FIXME tag)... and also add everything to source control.
Argument 1: Even as a single dev it's nice to roll back to a running version, to track your progress, etc.
Argument 2: Who cares if it's just a prototype? You might stumble upon a similar problem in six months or so, and then just start looking for this other code...
Argument 3: Why not use more than one repo? I like to file misc stuff to my personal repo.
Start using source control about 20 minutes before you write your first line of your first artifact. There is never a good time to start after you're begun writing things.
some people can only learn from experience.
like a hard drive failure. or coding yourself into a dead-end after deleting code that actually worked
now, i'm not saying that you should erase their hard drive and then taunt them with "if only you had used source control"...but if something like were to happen, hopefully there would be a backup done first ;-)
Early and Often. As the Pragmatic Programmers say, source control is like a time machine, and you never know when you'll want to go back.
I would say to them...
I'm the solo developer of this project.
And when you leave or hand it off we'll have 0 developers. All the more reason to use source control.
The code belongs to the company not you and the company would like some accountability. Checking in code doesn't require too much effort:
svn ci <files> -m " implement ajax support for grid control
Next time someone new wants to make some changes on the grid control or do something related, they will have a great starting point. All projects start off with one or two people. Source control is easier now than it ever was--have they arranged a 30 minute demo of Tortoise SVN with them?
It's just a prototype, maybe I'll have to rewrite from scratch again.
Are they concerned about storage? Storage is cheap. Are they concerned about time wasted on versioning? It takes less time then the cursory email checks. If they are re-writing bits then source control is even more important to be able to reference old bits.
I don't want to pollute the Source Control with incomplete versions.
That's actually a good concern. I used to think the same thing at one point and avoided checking in code until it was nice and clean which is not a bad thing in and of itself but many times I just wanted to goof around. At this point learning about branching helps. Though I wish wish SVN had full support for purging folders like Perforce.
Let see their arguments:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
First, the 3rd one. I can see the reasoning, but it is based on a bad assumption.
At work, we use Perforce, a centralized VCS, and indeed we only check in source that compile successfully and doesn't break anything (in theory, of course!), after peer review.
So when I start a non trivial change, I feel the need to intermediary commits. For example, recently I started to make some changes (somehow, in solo for this particular task, so I address point 1) on a Java code using JDom (XML parsing). Then I was stuck and wanted to use Java 1.6's built in XML parsing. It was obviously time to keep a trace of the current work, in case my attempt was failed and wanted to go back. Note this case somehow addresses the point 2.
The solution I chose is simple: I use an alternative SCM! Although some centralized VCS like SVN are usable in local (on the developer's computer), I was seduced by distributed VCS and after briefly testing Mercurial (which is good), I found Bazaar better suited to my needs and taste.
DVCS are well suited for this task because they are lightweight, flexible, allow alternative branches, doesn't "pollute" the source directory (all data is in one directory at the root of the project), etc.
By making a parallel source management, you don't pollute the source of other developers, while keeping the possibility to go back or quickly try alternative solutions.
At the end, by committing the final version to the official SCM, the result is the same, but with added security at the level of the developer.
I'd like to add two things. With version control you can:
Revert to last version that worked, or at least check how it looked like. For that you would need SCM which supports changesets / uses whole-tree commits.
Use it to find bugs, by using so called 'diff debugging' by finding commit in history that introduced the bug. You would want SCM which support it in automated or semi-automated fashion.
Personally, I often start version control after the first sucessful compile.
I just wonder why nobody mentioned distributed version control systems in this context: If you could manage to switch over to a distributed system (git, bazaar, mercury), most arguments of your second group would become pointless since they can just start their repository locally and push it to the server when they want (and they can also just remove it, if they want to restart from scratch).
For me, it's about having a consistent process. If you are writing code, it should follow the same source control process that your production code does. That helps build and enforce good development practices across the development team.
Categorizing the code as a prototype or other non-production type of project should just be used to determine where in the source control tree you put it.
We use both CVS (for non .NET projects) and TFS (for .NET projects) where I work, and the TFS repository has a Developer Sandbox folder where developers can check in personal experimental projects (prototypes).
If and when a project starts to get used in production, the code is moved out of the Developer Sandbox folder into it's own folder in the main tree.
I would say you should start adding the source and checking in before you even build the first time. It is then much easier to avoid checking in generated artifacts. I always use some source control, even for my small hobby hacks, just because it automatically filters the relevant from the noise.
So when I start prototyping I might create a project and then before building it I do "git init, git add ., git commit -a -m ..." just so that when I want to move the interesting parts I just clone over using git and then I can add it to the subversion repository or whatever is used where I am working at the moment.
It's called branching people try to get with the program :p Prototyping? Work in a branch. Experimenting? Work in a branch. New feature? Work in a branch.
Merge your branches into the main trunk when it makes sense.
I guess people tend to be laid back when it comes to setting up source control initially if the code may never be used. I have projects I coded belonging to both groups and the ones outside source control are not less important. It is one of those things that gets postponed everyday when it really should not.
On the other hand I sometimes commit too seldom complicating a revert once I screw up some CSS code and not knowing what I changed e.g. to make the footer of the site end up behind the header.
I check-in the project in source control before I start coding.
The first thing I do is create and organize the projects and support files (such as .sln files in .NET development) with the necessary support libraries and tools (usually in a lib folder) I know I will use in my project.
If I already have some code written, then I add it too, even if it is an incomplete application. Then I check-in everything. From there, everything is as usual, write some code, compile it, test it, check-in it...
You probably won't need to branch from this point or revert your changes, but I think it is a good practice to have everything under source control since the beginning, even if you don't have anything to compile.
I create a directory in source control before I start writing code for a project. I do the first commit after creating the project skeleton.
i'm drunk and and i do first git -init and then vim foo.cpp.
Any decent modern source control platform (of which VSS is not one) should not in any way be polluted by putting source code files into it. I am of the opinion that anything that has a life expectancy of more than about 1/2 an hour should be in source control as early as possible. Solo develpment is no longer a valid excuse for not using source control. It is not about security it is about bugs and long term history.

How do you "check out" code?

I've never really worked with a lot of people where we had to check out code and have repositories of old code, etc. I'm not sure I even know what these terms mean. If I want to to start a new project that involves more than myself that tracks all the code changes, does "check out" (again, don't know what that means), how do I get started? Is that what SVN is for? Something else? Do I download a program that keeps up with the code?
What do I do?
It will all be in house. No Internet for storing code.
I don't even know if what I am asking for is called source control. I see things about checking out, SVN, source control, and so on. I don't know if it is all talking about the same thing or not. I was hoping to use something open source.
So, a long time ago, in the bad old days of yore, source control used a library metaphor. If you wanted to edit a file, the only way to avoid conflicts was to make sure that you were the ONLY one editing the file. What you'd do is ask the source control system to "check out" that file, indicating that you were editing it and nobody else was allowed to edit it until you made your changes and the file was "checked in". If you needed to make a change to a checked out file, you had to go find that freakin' developer who'd had everythingImportant.conf checked out since last Tuesday..freakin' Bill...
Anyway, source control doesn't really work like that anymore, but the language stuck with us. Nowadays, "checking out" code means downloading a copy of the code from the code repository. The files will appear in a local directory, allowing you to use them, compile the code, and even make changes to the source that you could perhaps upload back to the repository later, should you need to. Even better, with just a single command, you can get all the changes that have been made by other developers since the last time you downloaded the code. Good stuff.
There are several major source control libraries, of which SVN (also called Subversion) is one (CVS, Git, HG, Perforce, ClearCase, etc are others). I recommend starting with SVN, Git, or HG, since they're all free and all have excellent documentation.
You might want to start using source control even if you're the only developer. There's nothing worse than realizing that last night the thousand lines of code you deleted as useless were actually critically important and are now lost forever. Source control allows you to zoom forwards and backwards in the history of your files, letting you easily recover stuff that you should not have removed, and giving you a lot more confidence about deleting useless stuff. Plus, fiddling around with it on your own is good practice.
Being comfortable with source/revision control software is a critical job skill of any serious software engineer. Mastering it will effectively level you up as a professional developer. Coming onto a project and finding that the team keeps all their source in a folder somewhere is an awful experience. Good luck! You're already on the right path just by being interested!
Check out Eric Sink's excellent series of articles:
Source Control HOWTO
I recommend Git and Subversion (SVN) both as free, open-source version control systems that work very well. Git has some nice features given that it can be easier to work decentralized.
Checkout means retrieving a file from a source control system. A source control system is a database (some, like CVS, use just specially marked up text files, but a file system is also a database) that holds all versions of your code (that are checked in after you make modifications).
Microsoft Visual SourceSafe uses a very proprietary database which is prone to corruption if it is not regularly maintained and uses reserved checkouts exclusively. Don't use it, for all those reasons.
The difference between a reserved checkout and an unreserved checkout is in an unreserved checkout; two people can be modifying the same file at once. The first one to check in gets in no problem, and the second one has to update their code to the latest version and merge the changes into theirs (which usually happens automatically, but if the same area of the file was changed, then there is a conflict, which has to be resolved before it can be checked in).
For some arguments for unreserved checkouts, see here.
Following this, you will be looking at a build process that independently checks out the code and builds the source code, so that everyone's changes are built and distributed together.
Are you creating the project that requires source control? If so, choose a source control system that meets your needs, and read the documentation for how to get it set up. If you are simply using a previously set up source control system for an existing project, ask a coworker who has been using it, or ask the person who set up the source control system.
For choosing a source control system that meets your needs, most source control systems have extensive descriptions of their features online, many provide evaluation or even completely free products, and there are many many many anecdotal descriptions of what working with each individual source control system is like, which can help.
Just don't use Microsoft Visual SourceSafe if you value your sanity and your code.

What is the best solution for maintaining backup and revision control on live websites?

What is the best solution for maintaining backup and revision control on live websites?
As part of my job I work with several live websites. We need an efficient means of maintaining backups of the live folders over time. Additionally, updating these sites can be a pain, especially if a change happens to break in the live environment for whatever reason.
What would be ideal would be hassle-free source control. I implemented SVN for a while which was great as a semi-solution for backup as well as revision control (easy reversion of temporary or breaking changes) etc.
Unfortunately SVN places .SVN hidden directories everywhere which cause problems, especially when other developers make folder structure changes or copy/move website directories. I've heard the argument that this is a matter of education etc. but the approach taken by SVN is simply not a practical solution for us.
I am thinking that maybe an incremental backup solution may be better.
Other possibilities include:
SVK, which is command-line only which becomes a problem. Besides, I am unsure on how appropriate this would be.
Mercurial, perhaps with some triggers to hide the distributed component which is not required in this case and would be unnecessarily complicated for other developers.
I experimented briefly with Mercurial but couldn't find a nice way to have the repository seperate and kept constantly in-sync with the live folder working copy. Maybe as a source control solution (making repository and live folder the same place) combined with another backup solution this could be the way to go.
One downside of Mercurial is that it doesn't place empty folders under source control which is problematic for websites which often have empty folders as placeholder locations for file uploads etc.
Rsync, which I haven't really investigated.
I'd really appreciate your advice on the best way to maintain backups of live websites, ideally with an easy means of retrieving past versions quickly.
Answer replies:
#Kibbee:
It's not so much about education as no familiarity with anything but VSS and a lack of time/effort to learn anything else.
The xcopy/7-zip approach sounds reasonable I guess but it would quickly take up a lot of room right?
As far as source control, I think I'd like the source control to just say that "this is the state of the folder now, I'll deal with that and if I can't match stuff up that's your fault, I'll just start new histories" rather than fail hard.
#Steve M:
Yeah that's a nicer way of doing it but would require a significant cultural change. Having said that I very much like this approach.
#mk:
Nice, I didn't think about using Rsync to deploy. Does this only upload the differences? Overwriting the entire live directory everytime we make a change would be problematic due to site downtime.
I am still curious to see if there are any more traditional options
You can still use SVN, but instead of doing a checkout on your live environment, do an export, that way no .svn directories will be created. The downside, of course, is that no code changes on your live environment can take place. This is a good thing.
As a general rule, code changes on production systems should never be allowed. The change should be made and tested in a development/test/UAT environment, then once confirmed as OK, you can tag that code in SVN with something like RELEASE-x-x-x. Then, on the live system, export the code with that tag.
We use option 3. Rsync. I wrote a bash script to do this along with some extra checking, but here are the basics of what it does.
Make a tag for pushing to live.
Run svn export on that tag.
rsync to live.
So far it has been working out. We don't have to worry about user conflicts or have a separate user for running svn up on the production machine.
Any source control solution you pick is going to have problems if people are moving, deleting, or adding files and not telling the source control system about it. I'm not aware of any source control item that could solve this problem.
In the case where you just can't educate the people working on the project[1], then you may just have to go with daily snapshots. Something as simple as batch file using xcopy to a network drive, and possibly 7-zip on the command line to compress it so it doesn't take up too much space would probably be the simplest solution.
[1] I would highly disbelieve this, probably just more a case of people being too stubborn and not willing to learn, or do "extra work". Nevermind how much time source control could save them when they have to go back to previous versions, or 2 people have edited the same file.
rsync will only upload the differences. I haven't personally used it, but Mark Pilgrim wrote a long time ago about how it even handles binary diffs brilliantly.
svn+rsync sounds like a fantastic solution. I'll have to try that in the future.