Source Control for multiple projects/solutions with shared libraries - version-control

I am currently working on a project to convert a number of Excel VBA powered workbooks to VSTO solutions. All of the workbooks will share a number of class libraries and third party assemblies, in fact most of the work is done in the class libraries. I currently have my folder structure laid out like this.
Base
    Libraries
    Assemblies
    Workbooks
        Workbook1
        Workbook2
Each of the workbooks will be its own solution, and the workbook solutions just reference the assemblies in the folder structure. My question is how would you lay out the source control? Would you start the repository at the base? Or would you create a repository for each workbook solution? Would you rearrange the folders?
Now that we have the initial development done, we're about to have a bunch of outside developers come onto the project to help us convert the rest of the workbooks, and I really like the idea of them being able to check out from the base directory and have all of the dependencies ready to go. I also worry that there are other concerns that come with having 20+ solutions/projects under one source control repository.
I want everything to be as simple as possible for people joining the project, but I don't want to sacrifice long-term usability. In my mind I've been going back and forth: what's simpler, one repository or one repository per solution?
I'd appreciate any insight you have, because I'm fresh out.
Additional Information: Currently, I am using Mercurial personally, but the project will probably get moved to StarTeam unless I can make some convincing arguments for something else.

You don't mention in your question which source control system you are using. As it doesn't sound like you need to limit your outside developers' access to the rest of the repository, I would not bother setting up multiple repositories. I would assume that unless your code runs into millions of lines, repository size is not an issue.
It all depends on what functionality your revision control system supports. In Subversion you can declare other folders as externals and provide a URL for the contents of each; Subversion will then treat that folder as a separate repository even though it sits within your folder structure.
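As a sketch of the externals approach, the shared folders could be pulled in via the svn:externals property on the Base directory. The repository paths below are hypothetical; substitute your own layout:

```
# Value of the svn:externals property on Base (SVN 1.5+ syntax).
# Each line is: <repo-relative URL> <local directory name>
^/shared/Libraries    Libraries
^/shared/Assemblies   Assemblies
```

You would set this with something like `svn propset svn:externals -F externals.txt .` and commit; from then on, every checkout of Base also fetches those folders from their own locations.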

Related

How to manage multiple small parts of a library with version control

I'm new to source version control, so I don't want to make a mistake in choosing the wrong setup for my project.
I have a kind of "library" made up of many small "procedures" (written in a pseudo-language specific to a third-party piece of software). Each procedure is a small stand-alone "package" of 2-3 files (just the procedure itself, its documentation, and maybe one or two sub-procedures needed only by the main one).
So I have hundreds of these procedure-packages, archived in subfolders by area of application, and some of the more complex ones may use the more basic ones.
I modify these procedures pretty often in the early stages to improve them, but of course the modifications sometimes break compatibility, since they involve adding/removing input/output parameters, so I suppose I must somehow "tag" versions of each procedure as if it were a single piece of software...
So I'm wondering what's the best way to manage them with version control (I'm using Mercurial): am I supposed to make hundreds of repositories? o_O Or keep everything in one big repository and tag it every time a procedure is revised? Or maybe learn and use subrepositories?
Thanks for your help.
Simone
It can be done with subrepositories (or GuestRepo, which has had no updates since 2013) plus tags.
Each changeset in the "main" repo has the changesets of all subrepositories linked to it, i.e. when you update to an old changeset in the master, all subrepos are updated accordingly.
Tags in the main repository will let you mark stable/functional combinations for re-use.
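For the subrepository route, the mapping lives in a .hgsub file at the root of the main repository. A minimal sketch, with hypothetical paths and URLs:

```
# .hgsub in the main repo: <local path> = <source of the subrepository>
procedures/filtering = https://example.com/hg/procedures-filtering
procedures/plotting  = https://example.com/hg/procedures-plotting
```

Once .hgsub is committed, Mercurial records the exact subrepo changesets in .hgsubstate alongside each commit of the main repo, which is what makes tagging a known-good combination of procedures possible.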
Subrepos make sense if each package can stand on its own without any connection to the master software. If this condition is not met, I would stay with a single repository. Especially since you stated that your packages contain few files with few changes, the subrepo approach does not seem to make sense here.

How should I start with tracking file changes/versions?

I've been working with a lot of my files on the go recently, and in the process have often accumulated several copies of files in different stages of completion/revision. I'm working on any number of projects at a given time, so it's not always easy to remember or figure out quickly which version I should continue working on.
What type of options would you recommend that allow me to track changes locally and if possible with files I work on while at a remote location? I've never worked with file versioning or tracking systems, so not sure what direction I should be looking in. I work mostly with HTML, CSS, and PHP.
Any help is awesomely appreciated! Thanks.
PS. Don't know if I should have this in a separate question but what options are available for the same type of thing, change tracking/logging for files on server? Preferably something that not only vaguely notes a file has been changed, but that tracks specific changes that have occurred in files.
It seems to me that GitHub is a perfect choice for your requirements. You can create a repository to maintain the history; it's easy to use, and it's free.
https://github.com/

How does one handle big library dependencies in source control?

My C++ application depends on Boost. I'd like someone to just be able to check out my repository and build the whole thing in one step. But the boost distribution is some 100MB and thousands of files, and it seems to bog down source control -- plus I really don't need it to be versioned.
What's the best way to handle this kind of problem?
Most version control tools/systems provide mechanics to add references to other repositories into your repository.
This way you can keep your repository clean from files of other repositories but still be able to just point to the correct library and version.
In Git it’s called submodules. In SVN it’s called externals.
In the end you'll have to decide whether you want to include the files in your repo so others won't have to check out the other repos as well, even though the references (submodule/external) make just that pretty easy. I'd prefer a clean repo, though, and reference other repositories where available. This will also make maintaining and upgrading those libraries a lot easier.
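With Git submodules, for example, the reference to Boost would be recorded in a .gitmodules file like the one below (the URL is an assumption; point it at whatever mirror you actually build against):

```
# .gitmodules as written by `git submodule add <url> extern/boost`
[submodule "extern/boost"]
	path = extern/boost
	url = https://github.com/boostorg/boost.git
```

A fresh checkout then fetches the pinned Boost revision with `git submodule update --init`, so Boost's thousands of files never enter your own repository's history.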
I've found the book "Large-Scale C++ Software Design" by John Lakos very useful as far as organising large C++ projects is concerned. Recommended.

Project files under version control?

I work on a large project where all the source files are stored in a version control except the project files. This was the lead developer's decision. His reasoning was:
It's too time-consuming to reconcile the differences among developers' working directories.
It allows developers to work independently until their changes are stable
Instead, a developer initially gets a copy of a fellow developer's project files. Then, when new files are added, each developer notifies all the rest about the change. This strikes me as far more time-consuming in the long run.
In my opinion the supposed benefits of not tracking changes to the project files are outweighed by the danger. In addition to references to its needed source files each project file has configuration settings that would be very time consuming and error prone to reproduce if it became corrupted or there was a hardware failure. Some of them have source code embedded in them that would be nearly impossible to recover.
I tried to convince the lead that both of his reasons can be accomplished by:
Agreeing on a standard folder structure
Using relative paths in the project files
Using the version control system more effectively
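The relative-path suggestion can be sketched as a fragment of a project file. The assembly and folder names below are illustrative, assuming the shared-folder layout from the question:

```xml
<!-- Hypothetical fragment of Workbook1.csproj: the HintPath climbs from
     Workbooks\Workbook1 back up to the shared Assemblies folder -->
<Reference Include="SharedCalcLibrary">
  <HintPath>..\..\Assemblies\SharedCalcLibrary.dll</HintPath>
</Reference>
```

Because the path is relative, the reference resolves identically on every developer's machine as long as everyone checks out the agreed folder structure.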
But so far he's unwilling to heed my suggestions. I checked the svn log and discovered that each major version's history begins with an Add. I have a feeling he doesn't know how to use the branching feature at all.
Am I worrying about nothing or are my concerns valid?
Your concerns are valid. There's no good reason to exclude project files from the repository. They should absolutely be under version control. You'll need to standardize on a directory structure for automated builds as well, so your lead is just postponing the inevitable.
Here are some reasons to check project (*.*proj) files into version control:
Avoid unnecessary build breaks. Relying on individual developers to notify the rest of the team every time they add, remove, or rename a source file is not a sustainable practice. There will be mistakes, and you will end up with broken builds and your team wasting valuable time trying to determine why the build broke.
Maintain an authoritative source configuration. If there are no project files in the repository, you don't have enough information there to reliably build the solution. Is your team planning to deliver a build from one of your developer's machines? If so, which one? The whole point of having a source control repository is to maintain an authoritative source configuration from which you build and deliver releases.
Simplify management of your projects. Having each team member independently updating their individual copies of your various project files gets more complicated when you introduce project types that not everyone is familiar with. What happens if you need to introduce a WiX project to generate an MSI package or a Database project?
I'd also argue that the two points made in defense of this strategy of not checking in project files are easily refuted. Let's take a look at each:
It's too time-consuming to reconcile the differences among developers' working directories.
Source configurations should always be set up with relative paths. If you have hard-coded paths in your source configuration (project files, resource files, etc.) then you're doing it wrong. Choosing to ignore the problem is not going to make it go away.
It allows developers to work independently until their changes are stable
No, using version control lets developers work in isolation until their changes are stable. If you each continue to maintain your own separate copies of the project files, as soon as someone checks in a change that references a class in a new source file, you've broken everyone on the team until they stop what they're doing and carefully update their project files. Compare that experience with just "getting latest" from source control.
Generally, a project checked out of SVN should be working, or there should be tools included to make it work (e.g. autogen.sh). If the project file is missing or you need knowledge about which files should be in the project, there is something missing.
Automatically generated files should not be in SVN, as it is pointless to track the changes to these.
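One way to keep generated files out in Subversion is the svn:ignore property on the directories that receive build output. The patterns below are typical .NET ones, shown as an assumption:

```
# Value of the svn:ignore property for a project directory
bin
obj
*.user
*.suo
```

Set with `svn propset svn:ignore -F ignore.txt .`; the listed patterns then stop showing up as unversioned noise in `svn status`.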
Project files with relative path belong under source control.
Files that don't: for example, in .NET I would not put the .suo (user options), web.config, or app.config under source control. You may have developers using different connection strings, etc.
In the case of web.config, I like to check in a web.config.example. That way you copy the file to web.config upon initial checkout and tweak whatever settings you'd like. If you add something that needs to be in every web.config, you merge those lines into the .example version and notify the team to merge them into their local copies.
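That copy-and-tweak step can be sketched in shell; the file contents here are hypothetical:

```shell
# The .example file is the only copy under version control; each
# developer copies it once and then edits local values such as
# connection strings, which never get committed.
cat > web.config.example <<'EOF'
<configuration>
  <connectionStrings>
    <!-- Placeholder: each developer sets their own value locally -->
    <add name="Main" connectionString="CHANGE_ME" />
  </connectionStrings>
</configuration>
EOF

# Initial checkout step: create the untracked local copy to tweak.
cp web.config.example web.config
```

The untracked web.config is then listed in the ignore rules so it can diverge per machine without ever showing up as a pending change.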
I think it depends on the IDE and configuration of the project. Some IDEs have hard-coded absolute paths and that's a real problem with multiple developers working on the same code with different local copies and configurations. Avoid absolute path references to libraries, for example, if you can.
In Eclipse (and Java), it's fine to commit .project and .classpath files (so long as the classpath doesn't have absolute references). However, you may find that using tools like Maven can help having some independence from the IDE and individual settings (in which case you wouldn't need to commit .project, .settings and .classpath in Eclipse since m2eclipse would re-create them for you automatically). This might not apply as well to other languages/environments.
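If you go the Maven/m2eclipse route, the regenerated IDE files can simply be ignored. A minimal .gitignore sketch, with the usual Eclipse/Maven entries shown as an assumption:

```
# .gitignore when m2eclipse regenerates IDE metadata from the POM
.project
.classpath
.settings/
target/
```

The pom.xml stays under version control as the authoritative project definition, and each developer's IDE files are rebuilt locally from it.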
In addition, if I need to reference something really specific to my machine (either configuration or file location), I tend to have my own local branch in Git, which I rebase when necessary, committing only the common parts to the remote repository. Git diff/rebase works well: it tends to work out the diffs even when the local changes affect files that have been modified remotely, except when those changes conflict, in which case you get the opportunity to merge them manually.
That's just absurd. With a setup like that, I can have a perfectly working project containing files that are subtly different from everyone else's. Imagine the havoc this would cause if someone accidentally propagated this mess into QA and everyone was trying to figure out what's going on. Imagine the catastrophe that would ensue if it ever got released to the production environment...!

Source Control : Should local source tree mirror server source tree?

Is it a best practice to have my local source tree mirror the server source tree? It seems to me that it should, however my department at work does not do it that way and I find it very confusing. If not, what are scenarios where it makes sense to deviate from the server source tree?
EDIT: To clarify what I mean - say the source directory we want to map on the local machine is here on the server:
\\TeamServer\Project\Releases\2008
On our local machine, that directory would be mapped like this:
D:\2008_Releases
instead of:
D:\Project\Releases\2008
I don't know if it's a best practice but I prefer mirroring the source tree too. It's one less "gotcha" in terms of getting a new developer up and running. Not mirroring the source tree can eventually come back to bite you when it comes to relative paths.
Someone probably made a mistake when they set things up originally and it never got corrected. IMHO it's a minor annoyance; just one of the side effects of not living in a perfect world ;)
If your local tree needs to be at the same path as it is on the server, then you can't have multiple copies checked out. At the last two places I've worked, it was common for me to have several copies of (parts of) the tree checked out at any given time, depending on how many different bugs or features I was working on, and how many branches of the product the bugs or features were in.
Personally, I have no idea where the source trees were stored on the servers, and I didn't need to. I just ran cvs co or svn co to get a copy of the tree in my working directory. As long as I ran make or ant somewhere in the source tree, everything below it would compile.
I prefer to have individual project/release pairs in their own directories, as close to the root as reasonably possible. It's a horrible PITA to click through the directory tree, or to type the "c:\project\releases\2008" part, for the umpteenth time.
I also think checking out sources to a different path tends to flush out buggy assumptions about project locations (we have post-build events that do some nasty things).
There is one reason not to do this: When you have access to the production machine. In this case, I prefer to have different paths. This makes it more likely that I notice my "rm -rf" is on the wrong box...
I think mirroring is good practice and I like to do it too. However, I keep an extra temp folder for cases where I need to check out a fresh copy, e.g. to test something against the current version or to fix a bug with a higher priority than what I'm working on at the moment.
The most important thing is to understand why the choice was made and use the team to determine if this is something that you would want to change. Team buy-in is important for these kinds of decisions, and the team is more than just the developers. Source control trees contain things such as documentation, tests, resources, etc., so there is a legitimate chance that the structure was determined by multiple parties to find common ground.
We use TFS where I work, as well. We do have a common root folder called C:\Source Control that is the main folder that all Team Projects live under. This choice was made because of the extreme dislike, by all parties, that the drive would get cluttered with various folders.
With TFS, you have the option to map multiple workspaces, so what you do on your local machine is not dictated by the structure on the server. My personal preference, and my team's preference, as well, is to use a single workspace mapping to the source control root. Given the shelving functionality in TFS, there is not a concern about having multiple copies checked out, since they can be shelved if something else needs to be worked on.
The build server has the same mapping, as well. This in no way, however, matches the deployment structure. Builds are dropped based upon the criteria of the project. The only time this presents a problem is when absolute paths are used, in configuration files, for example. Since we are not using absolute paths (by definition in our developer guidelines) this is a non-issue.
People have different opinions on how to organize their hard disks, why should you enforce a particular layout?
One should be able to check out multiple working copies of the same project in the repository, which is impossible if you insist on using the same hierarchy.
In one large project I worked on, literally thousands of files assumed that the working copy was at a fixed, absolute, hard-coded path (\projectname). To work with several branches of the project, we had to resort to using different drive letters (on Windows), meaning that we had to divide our hard disks into many partitions (6 or more were common). This was very inconvenient.
Eventually we had to integrate the project as a sub-project of a larger project. This meant changing all those absolute paths, a tedious and time-consuming task.
Using relative paths gives you much more flexibility; however, if everybody has the project root at the exact same location, you won't notice if someone adds an absolute path somewhere by accident.