Bazaar newbie question about repository structures - version-control

I want to use Bazaar on Windows XP for web-development and related tasks. Most of the files are edited locally and then transferred via FTP to the server. Just now the repository sits on my local workstation. Later on it should be shared locally with some co-workers. Perhaps we will use a local Linux server as a centralized repository, but this structure is not decided for now. But first I need to understand the impacts of the different repository setups, which I do not at all.
Using Bazaar-Explorer on Windows XP I’ve created a ‘shared tree repository’ from the option list of the init-dialogue in some location dev-filter/. Bazaar Explorer tells me:
Created repository with treeless branches at F:/bzr.local/dev-filter
Created branch at F:/bzr.local/dev-filter/trunk
Created working tree at F:/bzr.local/dev-filter/work
OK so far. Now I move a bunch of files into the work directory and add and commit them as Rev 1 ‘Start Revision’. Then I work on some of these files and commit them again as Rev 2. Here my confusion starts. Shouldn’t both revisions go into the trunk? The trunk is still empty, beside the .bzr directory which only holds some management information. If I delete my working directory, which I have tried during these first experiments, everything is gone. There’s obviously no hidden storage of those files.
OK. Perhaps I need to push it into the trunk? This does not work either. Entering the work/ directory and initializing the ‘push’ to the trunk, Bazaar-Explorer tells me
No new revisions to push.
So what? This looks like a severe conceptual misunderstanding about what should happen on my side.
Edit, 2010-02-03: Some conclusions
What I learned meanwhile is this:
I think I should switch to the command line until I really understand what’s going on, at least for creating the repositories and branches. Bazaar Explorer introduces a new level of abstraction which I only can handle if I understand the level beneath
One of the secrets of working with Bazaar at least for me is to understand those .bzr directories, their particular properties and states when created with ‘bzr init’, ‘bzr init-repository’, ‘bzr branch’ etc. in all their variants and how they are plumped together.
While there’s a whole chapter of ‘Organizing your workspace’ in the Bazaar User Guide, it’s more or less workflow oriented. The manual contains a lot of directory structures for the given examples. What I would prefer beside this and have not (or only rudimentary) found so far is some graphical representation of those ‘Lego like’ .bzr building blocks which create the linking of all the parts. So I started to invent some simple notation while working through the examples and looking into the .bzr directories to document what information is stored there, where does it come from, how and to what is it linked, is it complete or shared, etc.
Erich Schreiber

Created repository with treeless
branches at F:/bzr.local/dev-filter
This part of the output looks suspicious to me. Are you sure you chose 'Shared repository' and not 'Shared repository with treeless branches' from the init dialog?
Treeless Branches are branches without the working tree, if you indeed created a treeless branch for trunk then it makes sense that there are no files there.

Your changes are still saved in the F:/bzr.local/dev-filter/trunk/.bzr, and have indeed been committed there. You don't see those changes reflected in the file system because Bazaar has created trunk as a treeless branch, with `` as a lightweight checkout. See checkouts in the Bazaar User Reference.
If you open F:/bzr.local/dev-filter/trunk in Bazaar Explorer, you should see your revisions. If you create a new branch with a working tree or checkout based on trunk, Bazaar will create the files with your changes for you.

typically it goes like this.
bzr init-repo --no-trees F:/bzr.local/dev-filter
cd F:/bzr.local/dev-filter
bzr init trunk
bzr branch trunk work
---all above will not create any tree
Now in new directory say F:\temp
cd F:\temp
bzr checkout F:/bzr.local/dev-filter/work
bzr add
bzr commit
---back to F:/bzr.local/dev-filter/work
cd F:/bzr.local/dev-filter/work
bzr push F:/bzr.local/dev-filter/trunk

Related

Bazaar manage multiple branches at once

I'm using Bazaar quite some time now, but at the moment I'm searching a solution to the following problem:
Assuming you've got several developers with everyone developing in its own branch, like this:
Project
|
|----Branch 1
|
|----Branch 2
|
...
Now, we've got a project manager who wants to have an overview over all branches.
Is there any possibility (using only bzr functions) that he can manage those branches at once?
With "manage", I mean update, commit and perhaps even checkout (last one could perhaps be done with multi-pull but I think this would overwrite existing local data)
Greetings Florian
P.S. I know that this use-case could easily be achieved with SVN (by simply using subdirectories - but without the features of a dvcs) or more or less easily with shell-scripts (something like bzr list-branches|xargs bzr update), but I'd prefer a built-in bzr function
You can see all the branches in a directory tree with:
bzr branches -R /path/to/base/dir
However this works only on the local filesystem. If you need to find branches in a remote system, you need to run the command through ssh or something.
Once you have the list of branches, the manager should branch from them into his local shared repository, preferably configured with the --no-trees option for space efficiency. Existing branches should be pulled instead (using multi-pull for example), removed branches should be removed.
Once he has the branches, the easiest way to overview is using Bazaar Explorer. Open the shared repository location. I especially like the Log button, which will show the tree of logs.
When you say commit... The manager should not commit to the developer branches. If fixes are needed it's better to ask the developer to fix it, otherwise the manager will always have to clean up their mess for them. The manager should only merge other branches to the trunk/main/master. In other words use the gatekeeper workflow.
You can try the bzr-externals plugin or the bzr-scmproj plugin.

GitHub wiki managed by the main repository

I'd like to manage the GitHub wiki for my project at the same time as I'm developing the code. For example:
Branches
master (stable versions)
develop (development of next version)
Others... (Possible other dev / feature branches)
Ideally, I'd like the wiki to be contained in a subfolder (e.g. /wiki) of the project. Then when I'm making changes to the code I can also update the wiki as the same time (code + documentation change). It'd also mean that all my development code and documentation would be self-contained in the "develop" branch until I merge with the "master" branch. Hopefully, even if via a manual process, the GitHub wiki would then be updated after the merge with master to reflect the changes.
I've taken a look at Git's submodule feature, but from what I understand that usually points at a single revision. I'd like to somehow follow my code development so branching and merging would work as normal.
As explained in "True nature of submodules", you can make modifications and updates within a submodule, as long as you commit also the parent repo in order to record the new state of your "wiki" sub-repo.
If you intend to use Gollum to display and work on your GitHub wiki while it's on your local machine (you probably should), then you will have a trouble if you use submodules.
Gollum wants to do local commits to your local Git repository (but not pushes), but in a submodule .git is actually a file containing the local repository, not a true Git repository. This causes Gollum to break.
Submodules also have the problems that the versions aren't coupled to the parent repository, and they aren't completely de-coupled. It a nuisance to have the source code repository to want to push the new wiki version number (but not the wiki contents) every time you make a documentation change.
The solution I use is simply to clone the wiki repository into a directory inside the main project directory and add it to .gitignore. By using a consistent name for the directory across projects (e.g. github-wiki), the chance is minimized that the wiki won't be in .gitignore and gets accidentally uploaded into the main repository.
For consistency, his approach also works well for GitHub pages, although it's unnecessary as they don't experience the problem with Gollum.

Version control on an external project

I am working on an enormous project ("the project") which is open-source, and I am changing the project but don't have a permission to commit. I'm looking for strategies for maintaining my own branch of the project. Some issues I am contemplating:
How to put my own work in a version control system, given that I'm altering the project's source code, adding new files and so on.
How to keep in sync with the project without having to manually merge my own changes over and over again.
I've never been in this situation - I've always maintained my complete project in some version control system. My plan right now is something like that:
Creating a directory tree in my SVN, similar to the one in the project.
Keeping all the changed files (and only them) in my svn.
Every time I decide to sync with the new baseline of the project, I'll do a checkout, merge my svn tree into the new version, test, then commit my changes to my svn and distribute them along with the latest project baseline.
The problems here are ENDLESS. Way too many manual steps, more and more work over time, and so on. The correct way to go would be, of course, to be a part of the original project, but this seems to be quite irrelevant right now for various reasons and is out of the question.
Ideas?
I'd use git or mercurial for this; simply import the project into git or mercurial, and merge the upstream changes into a branch in your project for easy merging into your trunk.
If the upstream project has a repository of their own, the import is even easier. Both git and mercurial have support for directly importing other version control systems. I did this recently to adapt an existing project that lives in SVN: https://github.com/mjpieters/rod.recipe.rabbitmq
Note that that project has an 'upstream' branch. That particular project has now accepted my proposed changes after reviewing the changes in github.com.
There are a few questions here on SO on the subject:
Fork and synchronize Google Code Subversion repository into GitHub
Tracking upstream svn changes with git-svn and github?
Best way to fork SVN project with Git
It should be trivial to create a similar setup with mercurial.
You can use git to maintain your source control on your local system. In fact Git can be used to maintain just about any directory under version control. There is no need to sync to anything, git maintains all changes locally.
If you need to commit to SVN check out the documentation http://git-scm.com/docs/git-svn

Mercurial "vendor branches" from external repositories?

I want to store a project in Mercurial that contains external code (which can be modified by me) coming from Git and SVN repositories. In SVN I would solve this with vendor branches and copy the code around, but I understood that in Mercurial it's better to have different repositories for different projects, and pull between them when needed.
The project layout will be like this:
- externalLibraryA [comes from a SVN repo]
- ...with some extra files from me
- externalLibraryB [comes from a SVN repo]
- ...with some extra files from me
- externalPluginForExternalLibraryB [comes from a Git repo]
In Subversion I would create vendor dir and a trunk dir, copy all external libraries first in vendor, and then in the right place in trunk. (I think) I can do this in Mercurial too, with subrepositories, but is this the best way to do this?
I tried setting up different repositories for the external libraries, but then it seems I can't pull the externalLibraryARepo into the externalLibraryA directory of my main repository? It goes in the main directory, which is not what I want. I can also create a Mercurial mirror repository and include it as a subrepo in my main repository, but then the changes in this subdirectory go to the mirror repository, while I want them to stay in the main repository.
I'd probably just store this in one repository - note that in the link you give they are using their build system in the end to bring together the binary output from the different repos. I'm not clear on their rationale there.
If the underlying problem you're trying to solve is how to update the externals in a clean way, I'd probably use anonymous branching for that.
I.e. add the external lib to your project, and your modifications. Make sure it works. Tag with ExternalA-v1.0. Hack away on your actual project. Now ExternalA, Inc. has a new version of their stuff. Update your repo to ExternalA-v1.0 tag. Import their new version and apply your modifications on top. Commit. Now you have two heads: one with the latest version of your code (that works with ExternalA-v1.0) and one with the latest version of ExternalA (that does not work with your code, maybe). So then you merge and reconcile the two. Tag again, now with ExternalA-v2.0. Repeat as needed.
You can still keep your externals in separate repositories, but I assume that the project that is using those does not need to be up to date with changes there all the time - looks like the whole point of vendor branches is to have some point of isolation between dependee and dependants. Of course, moving the changes from the externalA project to the project that is using that will then be a manual affair (well, a copy, much like in SVN really).
It depends on whether your vendor code is going to be customized by your team or not. Our teams have had a great deal of success maintaining a named "vendor" branch on repositories with our own customizations on branches named by project name. This vendor code is then easily included in a project as a subrepository.
A caveat to this approach: if active development is going on in the subrepository, best keep it to directly editing the subrepository as a separate clone, otherwise it becomes necessary to pay close attention to the top-level repository so you don't inadvertantly bump your .hgsubstate forward to the wrong revision and break your build.
Watch out for merges of the top-level repository (your project) between versions which point to different named branches of your subrepository, as this can result in a merge between the "vendor" and "project" branches in the subrepository as it recurses, which may not be desirable.
Note that this functionality may change in the future as well, as some "warm" discussions have been taking place in recent months on the mercurial-devel mailing lists about the future of subrepository recursion.
edit:
I just saw this discussion in the related links as well, which seems relevant: https://stackoverflow.com/a/3998791/1186771

How to actually use a source control system?

So I get that most of you are frowning at me for not currently using any source control. I want to, I really do, now that I've spent some time reading the questions / answers here. I am a hobby programmer and really don't do much more than tinker, but I've been bitten a couple of times now not having the 'time machine' handy...
I still have to decide which product I'll go with, but that's not relevant to this question.
I'm really struggling with the flow of files under source control, so much so I'm not even sure how to pose the question sensibly.
Currently I have a directory hierarchy where all my PHP files live in a Linux Environment. I edit them there and can hit refresh on my browser to see what happens.
As I understand it, my files now live in a different place. When I want to edit, I check it out and edit away. But what is my substitute for F5? How do I test it? Do I have to check it back in, then hit F5? I admit to a good bit of trial and error in my work. I suspect I'm going to get tired of checking in and out real quick for the frequent small changes I tend to make. I have to be missing something, right?
Can anyone step me through where everything lives and how I test along the way, while keeping true to the goal of having a 'time machine' handy?
Eric Sink has a great series of posts on source control basics. His company (Sourcegear) makes a source control tool called Vault, but the how-to is generally pretty system agnostic.
Don't edit your code on production.
Create a development environment, with the appropriate services (apache w/mod_php).
The application directory within your dev environment is where you do your work.
Put your current production app in there.
Commit this directory to the source control tool. (now you have populated source control with your application)
Make changes in your new development environment, hitting F5 when you want to see/test what you've changed.
Merge/Commit your changes to source control.
Actually, your files, while stored in a source repository (big word for another place on your hard drive, or a hard drive somewhere else), can also exist on your local machine, too, just where they exist now.
So, all files that aren't checked out would be marked as "read only", if you are using VSS (not sure about SVN, CVS, etc). So, you could still run your website by hitting "F5" and it will reload the files where they currently are. If you check one out and are editing it, it becomes NOT read only, and you can change it.
Regardless, the web server that you are running will load readonly/writable files with the same effect.
You still have all the files on your hard drive, ready for F5!
The difference is that you can "checkpoint" your files into the repository. Your daily life doesn't have to change at all.
You can do a "checkout" to the same directory where you currently work so that doesn't have to change. Basically your working directory doesn't need to change.
This is a wildly open ended question because how you use a SCM depends heavily on which SCM you choose. A distributed SCM like git works very differently from a centralized one like Subversion.
svn is way easier to digest for the "new user", but git can be a little more powerful and improve your workflow. Subversion also has really great docs and tool support (like trac), and an online book that you should read:
http://svnbook.red-bean.com/
It will cover the basics of source control management which will help you in some way no matter which SCM you ultimately choose, so I recommend skimming the first few chapters.
edit: Let me point out why people are frowning on you, by the way: SCM is more than simply a "backup of your code". Having "timemachine" is nothing like an SCM. With an SCM you can go back in your change history and see what you actually changed and when which is something you'll never get with blobs of code. I'm sure you've asked yourself on more than one occasion: "how did this code get here?" or "I thought I fixed that bug"-- if you did, thats why you need SCM.
You don't "have" to change your workflow in a drastic way. You could, and in some cases you should, but that's not something version control dictates.
You just use the files as you would normally. Only under version control, once you reach a certain state of "finished" or at least "working" (solved an issue in your issue tracker, finished a certain method, tweaked something, etc), you check it in.
If you have more than one developer working on your codebase, be sure to update regularly, so you're always working against a recent (merged) version of the code.
Here is the general workflow that you'd use with a non-centralized source control system like CVS or Subversion: At first you import your current project into the so-called repository, a versioned storage of all your files. Take care only to import hand-generated files (source, data files, makefiles, project files). Generated files (object files, executables, generated documentation) should not be put into the repository.
Then you have to check out your working copy. As the name implies, this is where you will do all your local edits, where you will compile and where you will point your test server at. It's basically the replacement to where you worked at before. You only need to do these steps once per project (although you could check out multiple working copies, of course.)
This is the basic work cycle: At first you check out all changes made in the repository into your local working copy. When working in a team, this would bring in any changes other team members made since your last check out. Then you do your work. When you've finished with a set of work, you should check out the current version again and resolve possible conflicts due to changes by other team members. (In a disciplined team, this is usually not a problem.) Test, and when everything works as expected you commit (check in) your changes. Then you can continue working, and once you've finished again, check out, resolve conflicts, and check in again. Please note that you should only commit changes that were tested and work. How often you check in is a matter of taste, but a general rule says that you should commit your changes at least once at the end of your day. Personally, I commit my changes much more often than that, basically whenever I made a set of related changes that pass all tests.
Great question. With source control you can still do your "F5" refresh process. But after each edit (or a few minor edits) you want to check your code in so you have a copy backed up.
Depending on the source control system, you don't have to explicitly check out the file each time. Just editing the file will check it out. I've written a visual guide to source control that many people have found useful when grokking the basics.
I would recommend a distributed version control system (mercurial, git, bazaar, darcs) rather than a centralized version control system (cvs, svn). They're much easier to setup and work with.
Try mercurial (which is the VCS that I used to understand how version control works) and then if you like you can even move to git.
There's a really nice introductory tutorial on Mercurial's homepage: Understanding Mercurial. That will introduce you to the basic concepts on VCS and how things work. It's really great. After that I suggest you move on to the Mercurial tutorials: Mercurial tutorial page, which will teach you how to actually use Mercurial. Finally, you have a free ebook that is a really great reference on how to use Mercurial: Distributed Revision Control with Mercurial
If you're feeling more adventurous and want to start off with Git straight away, then this free ebook is a great place to start: Git Magic (Very easy read)
In the end, no matter what VCS tool you choose, what you'll end up doing is the following:
Have a repository that you don't manually edit, it only for the VCS
Have a working directory, where you make your changes as usual.
Change what you like, press F5 as many times as you wish. When you like what you've done and think you would like to save the project the way it is at that very moment (much like you would do when you're, for example, writing something in Word) you can then commit your changes to the repository.
If you ever need to go back to a certain state in your project you now have the power to do so.
And that's pretty much it.
If you are using Subversion, you check out your files once . Then, whenever you have made big changes (or are going to lunch or whatever), you commit them to the server. That way you can keep your old work flow by pressing F5, but every time you commit you save a copy of all the files in their current state in your SVN-repository.
Depends on the source control system you use. For example, for subversion and cvs your files can reside in a remote location, but you always check out your own copy of them locally. This local copy (often referred to as the working copy) are just regular files on the filesystem with some meta-data to let you upload your changes back to the server.
If you are using Subversion here's a good tutorial.
Depending on the source control system, 'checkout' may mean different things. In the SVN world, it just means retrieving (could be an update, could be a new file) the latest copy from the repository. In the source-safe world, that generally means updating the existing file and locking it. The text below uses the SVN meaning:
Using PHP, what you want to do is checkout your entire project/site to a working folder on a test apache site. You should have the repository set up so this can happen with a single checkout, including any necessary sub folders. You checkout your project to set this up one time.
Now you can make your changes and hit F5 to refresh as normal. When you're happy with a set of changes to support a particular fix or feature, you can commit in as a unit (with appropriate comments, of course). This puts the latest version in the repository.
Checking out/committing one file at a time would be a hassle.
A source control system is generally a storage place for your files and their history and usually separate from the files you're currently working on. It depends a bit on the type of version control system but suppose you're using something CVS-like (like subversion), then all your files will live in two (or more) places. You have the files in your local directory, the so called "working copy" and one in the repository, which can be located in another local folder, or on another machine, usually accessed over the network. Usually, after the first import of your files into the repository you check them out under a working folder where you continue working on them. I assume that would be the folder where your PHP files now live.
Now what happens when you've checked out a copy and you made some non-trivial changes that you want to "save"? You simply commit those changes in your working copy to the version control system. Now you have a history of your changes. Should you at any point wish to go back to the version at which you committed those changes, then you can simply revert your working copy to an older revision (the name given to the set of changes that you commit at once).
Note that this is all very CVS/SVN-specific, as GIT would work slightly different. I'd recommend starting with subversion and reading the first few chapters of the very excellent SVN Book to get you started.
This is all very subjective depending on the the source control solution that you decide to use. One that you will definitely want to look into is Subversion.
You mentioned that you're doing PHP, but are you doing it in a Linux environment or Windows? It's not really important, but what I typically did when I worked in a PHP environment was to have a production branch and a development branch. This allowed me to configure a cron job (a scheduled task in Windows) for automatically pulling from the production-ready branch for the production server, while pulling from the development branch for my dev server.
Once you decide on a tool, you should really spend some time learning how it works. The concepts of checking in and checking out don't apply to all source control solutions, for example. Either way, I'd highly recommend that you pick one that permits branching. This article goes over a great (in my opinion) source control model to follow in a production environment.
Of course, I state all this having not "tinkered" in years. I've been doing professional development for some time and my techniques might be overkill for somebody in your position. Not to say that there's anything wrong with that, however.
I just want to add that the system that I think was easiest to set up and work with was Mercurial. If you work alone and not in a team you just initialize it in your normal work folder and then go on from there. The normal flow is to edit any file using your favourite editor and then to a checkin (commit).
I havn't tried GIT but I assume it is very similar. Monotone was a little bit harder to get started with. These are all distributed source control systems.
It sounds like you're asking about how to use source control to manage releases.
Here's some general guidance that's not specific to websites:
Use a local copy for developing changes
Compile (if applicable) and test your changes before checking in
Run automated builds and tests as often as possible (at least daily)
Version your daily builds (have some way of specifying the exact bits of code corresponding to a particular build and test run)
If possible, use separate branches for major releases (or have a development and a release branch)
When necessary, stabilize your code base (define a set of tests such that passing all of those tests means you are confident enough in the quality of your product to release it, then drive toward 0 test failures, i.e. ban any checkins to the release branch other than fixes for the outstanding issues)
When you have a build which has the features you want and has passed all of the necessary tests, deploy it.
If you have a small team, a stable product, a fast build, and efficient, high-quality tests then this entire process might be 100% automated and could take place in minutes.
I recommend Subversion. Setting up a repository and using it is actually fairly trivial, even from the command line. Here's how it would go:
if you haven't setup your repo (repository)
1) Make sure you've got Subversion installed on your server
$ which svn
/usr/bin/svn
which is a tool that tells you the path to another tool. if it returns nothing that tool is not installed on your system
1b) If not, get it
$ apt-get install subversion
apt-get is a tool that installs other tools onto your system
If that's not the right name for subversion in apt, try this
$ apt-cache search subversion
or this
$ apt-cache search svn
Find the right package name and install it using apt-get install packagename
2) Create a new repository on your server
$ cd /path/to/directory/of/repositories
$ svnadmin create my_repository
svnadmin create reponame creates a new repository in the present working directory (pwd) with the name reponame
You are officially done creating your repository
if you have an existing repo, or have finished setting it up
1) Make sure you've got Subversion installed on your local machine per the instructions above
2) Check out the repository to your local machine
$ cd /repos/on/your/local/machine
$ svn co svn+ssh://www.myserver.com/path/to/directory/of/repositories/my_repository
svn co is the command you use to check out a repository
3) Create your initial directory structure (optional)
$ cd /repos/on/your/local/machine
$ cd my_repository
$ svn mkdir branches
$ svn mkdir tags
$ svn mkdir trunk
$ svn commit -m "Initial structure"
svn mkdir runs a regular mkdir and creates a directory in the present working directory with the name you supply after typing svn mkdir and then adds it to the repository.
svn commit -m "" sends your changes to the repository and updates it. Whatever you place in the quotes after -m is the comment for this commit (make it count!).
The "working copy" of your code would go in the trunk directory. branches is used for working on individual projects outside of trunk; each directory in branches is a copy of trunk for a different sub project. tags is used more releases. I suggest just focusing on trunk for a while and getting used to Subversion.
working with your repo
1) Add code to your repository
$ cd /repos/on/your/local/machine
$ svn add my_new_file.ext
$ svn add some/new/directory
$ svn add some/directory/*
$ svn add some/directory/*.ext
The second to last line adds every file in that directory. The last line adds every file with the extension .ext.
2) Check the status of your repository
$ cd /repos/on/your/local/machine
$ svn status
That will tell you if there are any new files, and updated files, and files with conflicts (differences between your local version and the version on the server), etc.
3) Update your local copy of your repository
$ cd /repos/on/your/local/machine
$ svn up
Updating pulls any new changes from the server you don't already have
svn up does care what directory you're in. If you want to update your entire repository, makre sure you're in the root directory of the repository (above trunk)
That's all you really need to know to get started. For more information I recommend you check out the Subversion Book.