Keep sources from external repositories up-to-date - version-control

After you start tracking the source of a bunch of open source software, how do you keep your code in sync? Run svn update every time you want to look at or play with the code?
It strikes me that it would be better to essentially start mirroring the code with (say) a cron job every night. Have people set up workflows to do this sort of thing? (With alerts when/if any changes you make to the code end up conflicting with the latest update?)
Or am I on my own? (I'm running Mac OS X but interested in general as well as specific solutions.)

The general workflow recommended by the Subversion book is to update your working copy often; at the start of every work-day is a good time. But you don't have to. Just update whenever you feel like seeing the latest changes.
I have a number of open source repositories checked out under a src/ directory. Every couple of days, I remember to run 'svn up *' from that directory, and it updates all the working copies contained there.

If your repository sends you an email every time someone checks in, why not have a program that checks for those emails and then updates the working copy at that time? This way you're always up to date. Caveats include needlessly burning bandwidth and the possibility of getting odd conflicts when a file you're working on gets updated.
Just updating once a day, or once every few days, is only useful when there are a limited number of people working on a project, all in disparate areas of it. When you've got more than five people, and the possibility that they are working in similar parts of the code, updating once an hour, or more frequently, is much better.

I will update often really only when I use an open source library in my own application, the external repository will actually be part of my project tree, when I update my project it also updates the external repository. I think when you only look at code for research it will only make sense if you want to look at a new feature they released and then update.

You might want to look into using svn:externals: http://svnbook.red-bean.com/en/1.0/ch07s03.html

Related

Version control personally and simply?

Requirement
make history for web text/code source files.
login-worker is only me, i.e personal usage.
automatically save history for each updated files(no require at once but at least once per week)
It must be a simple way to start and work.
I have 3 work places so need to do async files.
(not must but hopefully for future working environment) Any other non-engineer can also understand the location of history file and can see it easily.
Current way:
I made history folder the day, download files in there for edit, copy files when I edit/creat new one.
Advantage of the current way:
Very quick and simple, no need to do additional task to make history
Disadvantage of the current way:
Messy. Whenever day I work, I create a new history folder to keep downloaded files, so that it is messy in Finder(or windows explore).
Also, I don't have a way to Doing Async files for sure with in other places.
I tested to use GIT before, I had Thought GIT automatically save files I edit and save with a editor, but that was not the case. Also GIT is too complicated to use/start. If you recommend GIT, you need to show me ways to deal with the problem I had, for instance, simple GIT GUI with limited options without merging/project/branch etc because of personal usage for maintaining just one website.
Do you know any way to do version control personally and simply?
Thanks.
Suppose you entered <form ...> in your HTML—without the closing tag—and saved the file; do you really think the commit created by our imaginary VCS picked up that file's update event would have any sense?
What I mean, is that as with writing programs¹,
the history of source code changes are there for humans to read,
and for that matter, a good history graph should really read like a prose:
each commit should be atomic in the sense it comprises one (small) but
internally integral feature or fixes a bug, and had to be properly annotated
so that the intent of the change captured by that commit is clear.
What you want instead is just some dumb stream of changes purely for backup purposes.
Well, if you're fully aware of the repercussions (the most glaring one is that the generated history is completely useless for doing development on
the project and can only be used for rollbacks in case of "oopsies"),
there are two ways to go:
Some IDEs (namely, Eclipse) save a backup copy of each file they manage
on each save—thus providing your with such a rollback functionality w/o
using any VCS.
Script around any VCS you like: say, on Linux,
you start something like inotifywait telling it to watch your
project's root directory, recurvively, for write events on files,
read whatever the tool prints to its stdout when these events happen,
and for each event, call to your VCS of choice to record a new commit
with these changes.
¹ «Programs must be written for people to read, and only incidentally for machines to execute.» — Abelson & Sussman, "Structure and Interpretation of Computer Programs", preface to the first edition.
I strongly suggest you to have a deeper look at git.
It may looks difficult at the beginning, but you should spend some time learning it, that's all. All the problems above could be easily solved if you spend some time to learn the basics. There is also a nice "tutorial" on github on how to use git, no need to install anything: https://try.github.io/levels/1/challenges/1.

Website may have up to 10 files with same name/purpose, no version control

I am new at working for a large company with various people working on the same files. Sadly we don’t have version control and I often find myself cross eyed. For lack of better terminology, we have a dev site, quality-assurance site, and the live site. We have most files in two languages. Since the network connected drives have an average transfer rate of 15kb/sec we often copy the files locally before working on them. Also contractors send us new versions of files, but we may have made changes on our side and everything gets screwed up.
Basically I’m working with 6-10 files with the same name and same purpose. Does anyone have any tips on how I can keep them straight? I use Beyond Compare 2 to see the differences but if there’s a program that compares all files time stamps to see which is most current may help.
Thoughts:
1) Get version control system (Git), otherwise you will continue to have more and more pain.
2) Create a includes/lib folder and reduce that 6-10 files down (to 1).
I'll suggest, take a lead and put your code in version control and push your team to move to new repository. It'll make everybody's life easier and most important reduce chances of any merge error.
Assuming you cannot convince the powers that be to actually use source code control, why not try using Mercurial purely locally. Hopefully you can insulate yourself from some of the noise. You could even make fake users for the contractors and commit & push those changes as though they were actually doing it.
It shouldn't be too hard to get a bureaucrat to see how nice a good gatekeeper like Mercurial or Git would be. Its kind of like helpful red tape!

How should I start with tracking file changes/versions?

I've been working with a lot of my files on the go recently, and in the process often times accumulated several copies of files in different stages of completion/revision. I'm working on any number of projects at a given time, so it's not always easy to remember or figure out quickly which version I should continue working on.
What type of options would you recommend that allow me to track changes locally and if possible with files I work on while at a remote location? I've never worked with file versioning or tracking systems, so not sure what direction I should be looking in. I work mostly with HTML, CSS, and PHP.
Any help is awesomely appreciated! Thanks.
PS. Don't know if I should have this in a separate question but what options are available for the same type of thing, change tracking/logging for files on server? Preferably something that not only vaguely notes a file has been changed, but that tracks specific changes that have occurred in files.
It's seems to me that github is prefect choice for your requirement. You can create repository for maintain the history, it's easy to use and it is free
https://github.com/

Procedures before checking in to source control?

I am starting to get a reputation at work as the "guy who breaks the builds".
The problem is not that I am writing dodgy code, but when it comes to checking my fixes back into source control, it all goes wrong.
I am regularly doing stupid things like :
forgetting to add new files
accidentally checking in code for a half fixed bug along with another bug fix
forgetting to save the files in VS before checking them in
I need to develop some habits / tools to stop this.
What do you regularly do to ensure the code you check in is correct and is what needs to go in?
Edit
I forgot to mention that things can get pretty chaotic in this place. I quite often have two or three things that Im working on in the same code base at any one time. When I check in I will only really want to check in one of those things.
A few suggestions:
try work on one issue at a time. It's easy to make unrelated changes to the codebase that then end up being committed as one big chunk with a poor log message. Git is excels here since you can so easily move switch branches, and stash and cherry pick changes.
run the status command before a commit to see which files you've touched and if you've created new files that need to be added to version control.
run the diff command to see what you've actually changed. Often times you find that you've left in some debug logging that should be taken out or made some unnecessary change that is just cluttering up the diff. Try to make your diffs as small and clean as possible.
make sure your working copy builds with your changes in it
update before checking in and make sure that your working copy builds with other peoples changes in it
run what ever smoke test suite you might have to make sure that your changes work correctly
make small and frequent commits. It's a lot easier to figure out what has broken the build when the breaking commit is small.
Other things that the team can do is setup a continuous integration server like David M suggested so that the broken build is discovered as soon as possibly and automatically.
I usually always do a Get Latest before, then build. If build is good then I check in my code.
Here is what I have been doing. I have used ClearCase and CVS in the past for source control, and most recently I have been using Subversion and Visual Studio 2008 as my IDE.
Make my code changes and build on the local machine.
Make sure they do, in fact, fix the bug in question.
Run an SVN update on the local machine and repeat steps 1 and 2.
Run through the automated unit tests to verify that they pass.
If an automated smoke test is available, which automatically tests a lot of the system's capabilities, run it. Verify that the results are correct.
Then go to the build machine and run the build script.
If the project's configuration has changed, this could definitely break a build. Perform an SVN update on the build machine, whether the build script does that or not. Open the build machine's copy of the IDE, and do a complete rebuild. This will show you whether the build box has any problems that you have taken care of on your machine but not on the build box.
The suggestions to keep separate branches for each issue are also very good, if you can keep track of all of the issues you are working on.
First, use multiple working copies (a.k.a sandboxes) - one per issue. So, if you've been working on some complex feature for a while, and you need to deal with a quick bug fix on the same project, check out a new clean working copy and do the bug fix there. With independent working copies for each issue there is no confusion about which changes to commit from the working copy to the reposistory.
Second, before committing changes, always perform the following three steps:
Buld the software.
Run a smoke test (does it start and run without crashing).
Inspect the changes you're checking in by diffing your changes against the baseline.
These should be repeated after any merge operations (e.g. after an SVN update).
At my workplace, the safety net for this is peer review. That is, get someone else to build, run, and reproduce your solution on their machine, on their view.
I cannot recommend this enough. It has caught so many omissions, would-be problems, and other accidental pieces of junk to make it a valuable part of the process. Not to mention that the mere knowledge that you have to place your work in front of someone else before having it go on to the main branch means that you raise your own quality standards.
In the past I have used branching in Clear Case to help with this issue. The process I used is below. I've never used SorceDepot so I do not know how this can be adapted to work with it.
Create a branch for the bug fix
Code all changes on the branch
Code Review
Merge to stable branch in a different view (the different view is important)
BEFORE checking in: compile, test, and run
Check in code to stable branch
By creating the branch and then merging the changes to a different view (I use Merge Manager to do the merge) any files that were not included or checked in immediately cause issues. This way everything gets tested as it will be when checked in on the stable branch.
The best thing to avoid your problems, is to use hooks, that are provided in most SCMs (they are for sure in SVN and Mercurial, and I believe they must be in other advanced SCMs). Attach unit tests to the hook and make it run every time someone checks code in - exactly before it is checked in. This way you will achieve two things:
code in SCM repo will always pass the tests,
you won't make most simple mistakes, because they should be easily detectable, if you have decent test suite.
I like having Tortoise plugins for Windows Explorer. The file icons are all badged with committed, modified or not added icons making it very easy to see what status the files are in. I also enable the meta data for Modified so I can sort changed files in the list (Details) view, where they bubble to the top so I can see them.
I bet there is a Tortoise* plugin for your SCM, I saw one for Mercurial and SVN (and CVS, ugh). I really wish Mac OS X's Finder would accept plugins like Tortoise, its so much easier than having to pop open a dedicated app most of the time.
Get someone else to go through "every" change "before" you check in the code.

When to start to use source control in early stages of development?

We have 2 kinds of people at my shop:
The ones that starts to check-in the code since the first successful compilation.
The others that only checks-in the code when the project is almost done.
I am part of group 1, and trying to convince people of group 2 to act like me. Their arguments are like the following:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
If I am right, please help me to raise arguments to convince them. If you agree with them tell me why.
When someone asked for good excuses not to use version control, they got 75 answers and 45 upvotes.
And when they asked Why should my team adopt source control, they got 26 answers.
Maybe you'll find something helpful there.
You don't need "arguments to convince them." Discourse is not a game, and you should not use your work as a debating platform. That's what your spouse is for :) Seriously, though, you need to explain why you care how other devs work on solo projects in which other people are not involved. What are you missing because they don't use source control? Do you need to see their early ideas to understand their later code? If you can sucessfully do that, you may be able to convince them.
I personally use version control at all times, but only because I don't walk a tightrope without a net. Other people have more courage, less time to spend on infrastructure, etc. Note that in 2009, in my opinion, hard disks rarely fail and rewritten code is often better than the code that it replaces.
While I'm answering a question with a question, let me ask another one: does your code need to compile/work/not-break-the-build to be checked in? I like my branches to get good and broken, then fixed, working, debugged, etc. At the same time, I like other devs to use source control however they want. Branches were invented for just that reason: so that people who can't get along do not have to cohabitate.
Here's my view to your points.
1) Even solo developers need somewhere to keep their code when their PC fails. What happens if they accidentally delete a file without source control?
2/3) Prototypes belong in source control so other team members can look at the code. We put our prototype code in a seperate location to the mainline branch. We call it Spike. Here's a great article on why you should keep Spike code- http://odetocode.com/Blogs/scott/archive/2008/11/17/12344.aspx
If I'm the sole developer on a project (in other words, the repository, or part of it, is under my complete control), then I start committing source code as soon as it's written, and I tend to check in after every incremental change, whether or not it works or represents any kind of milestone.
If I'm working in a repository on a project with others, then I tend to try and make my commits such that they don't break the mainline development, pass any tests, etc.
Whether or not it's a prototype, it deserves to go into source control; prototypes represent a lot of work, and lessons learned from them are valuable. Plus, prototypes have an awful habit of becoming production code, which you'll want in source control.
I try to only write code that compiles (everything else is commented out with a TODO/FIXME tag)... and also add everything to source control.
Argument 1: Even as a single dev it's nice to roll back to a running version, to track your progress, etc.
Argument 2: Who cares if it's just a prototype? You might stumble upon a similar problem in six months or so, and then just start looking for this other code...
Argument 3: Why not use more than one repo? I like to file misc stuff to my personal repo.
Start using source control about 20 minutes before you write your first line of your first artifact. There is never a good time to start after you're begun writing things.
some people can only learn from experience.
like a hard drive failure. or coding yourself into a dead-end after deleting code that actually worked
now, i'm not saying that you should erase their hard drive and then taunt them with "if only you had used source control"...but if something like were to happen, hopefully there would be a backup done first ;-)
Early and Often. As the Pragmatic Programmers say, source control is like a time machine, and you never know when you'll want to go back.
I would say to them...
I'm the solo developer of this project.
And when you leave or hand it off we'll have 0 developers. All the more reason to use source control.
The code belongs to the company not you and the company would like some accountability. Checking in code doesn't require too much effort:
svn ci <files> -m " implement ajax support for grid control
Next time someone new wants to make some changes on the grid control or do something related, they will have a great starting point. All projects start off with one or two people. Source control is easier now than it ever was--have they arranged a 30 minute demo of Tortoise SVN with them?
It's just a prototype, maybe I'll have to rewrite from scratch again.
Are they concerned about storage? Storage is cheap. Are they concerned about time wasted on versioning? It takes less time then the cursory email checks. If they are re-writing bits then source control is even more important to be able to reference old bits.
I don't want to pollute the Source Control with incomplete versions.
That's actually a good concern. I used to think the same thing at one point and avoided checking in code until it was nice and clean which is not a bad thing in and of itself but many times I just wanted to goof around. At this point learning about branching helps. Though I wish wish SVN had full support for purging folders like Perforce.
Let see their arguments:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
First, the 3rd one. I can see the reasoning, but it is based on a bad assumption.
At work, we use Perforce, a centralized VCS, and indeed we only check in source that compile successfully and doesn't break anything (in theory, of course!), after peer review.
So when I start a non trivial change, I feel the need to intermediary commits. For example, recently I started to make some changes (somehow, in solo for this particular task, so I address point 1) on a Java code using JDom (XML parsing). Then I was stuck and wanted to use Java 1.6's built in XML parsing. It was obviously time to keep a trace of the current work, in case my attempt was failed and wanted to go back. Note this case somehow addresses the point 2.
The solution I chose is simple: I use an alternative SCM! Although some centralized VCS like SVN are usable in local (on the developer's computer), I was seduced by distributed VCS and after briefly testing Mercurial (which is good), I found Bazaar better suited to my needs and taste.
DVCS are well suited for this task because they are lightweight, flexible, allow alternative branches, doesn't "pollute" the source directory (all data is in one directory at the root of the project), etc.
By making a parallel source management, you don't pollute the source of other developers, while keeping the possibility to go back or quickly try alternative solutions.
At the end, by committing the final version to the official SCM, the result is the same, but with added security at the level of the developer.
I'd like to add two things. With version control you can:
Revert to last version that worked, or at least check how it looked like. For that you would need SCM which supports changesets / uses whole-tree commits.
Use it to find bugs, by using so called 'diff debugging' by finding commit in history that introduced the bug. You would want SCM which support it in automated or semi-automated fashion.
Personally, I often start version control after the first sucessful compile.
I just wonder why nobody mentioned distributed version control systems in this context: If you could manage to switch over to a distributed system (git, bazaar, mercury), most arguments of your second group would become pointless since they can just start their repository locally and push it to the server when they want (and they can also just remove it, if they want to restart from scratch).
For me, it's about having a consistent process. If you are writing code, it should follow the same source control process that your production code does. That helps build and enforce good development practices across the development team.
Categorizing the code as a prototype or other non-production type of project should just be used to determine where in the source control tree you put it.
We use both CVS (for non .NET projects) and TFS (for .NET projects) where I work, and the TFS repository has a Developer Sandbox folder where developers can check in personal experimental projects (prototypes).
If and when a project starts to get used in production, the code is moved out of the Developer Sandbox folder into it's own folder in the main tree.
I would say you should start adding the source and checking in before you even build the first time. It is then much easier to avoid checking in generated artifacts. I always use some source control, even for my small hobby hacks, just because it automatically filters the relevant from the noise.
So when I start prototyping I might create a project and then before building it I do "git init, git add ., git commit -a -m ..." just so that when I want to move the interesting parts I just clone over using git and then I can add it to the subversion repository or whatever is used where I am working at the moment.
It's called branching people try to get with the program :p Prototyping? Work in a branch. Experimenting? Work in a branch. New feature? Work in a branch.
Merge your branches into the main trunk when it makes sense.
I guess people tend to be laid back when it comes to setting up source control initially if the code may never be used. I have projects I coded belonging to both groups and the ones outside source control are not less important. It is one of those things that gets postponed everyday when it really should not.
On the other hand I sometimes commit too seldom complicating a revert once I screw up some CSS code and not knowing what I changed e.g. to make the footer of the site end up behind the header.
I check-in the project in source control before I start coding.
The first thing I do is create and organize the projects and support files (such as .sln files in .NET development) with the necessary support libraries and tools (usually in a lib folder) I know I will use in my project.
If I already have some code written, then I add it too, even if it is an incomplete application. Then I check-in everything. From there, everything is as usual, write some code, compile it, test it, check-in it...
You probably won't need to branch from this point or revert your changes, but I think it is a good practice to have everything under source control since the beginning, even if you don't have anything to compile.
I create a directory in source control before I start writing code for a project. I do the first commit after creating the project skeleton.
i'm drunk and and i do first git -init and then vim foo.cpp.
Any decent modern source control platform (of which VSS is not one) should not in any way be polluted by putting source code files into it. I am of the opinion that anything that has a life expectancy of more than about 1/2 an hour should be in source control as early as possible. Solo develpment is no longer a valid excuse for not using source control. It is not about security it is about bugs and long term history.