I work on a large project where all the source files are stored in a version control except the project files. This was the lead developer's decision. His reasoning was:
Its to time consuming to reconcile the differences among developers' working directories.
It allows developers to work independently until their changes are stable
Instead, a developer initially gets a copy of a fellow developer's project files. Then when new files are added each developer notifies all the rest about the change. This strikes me as far more time consuming in the long run.
In my opinion the supposed benefits of not tracking changes to the project files are outweighed by the danger. In addition to references to its needed source files each project file has configuration settings that would be very time consuming and error prone to reproduce if it became corrupted or there was a hardware failure. Some of them have source code embedded in them that would be nearly impossible to recover.
I tried to convince the lead that both of his reasons can be accomplished by:
Agreeing on a standard folder structure
Using relative paths in the project files
Using the version control system more effectively
But so far he's unwilling to heed my suggestions. I checked the svn log and discovered that each major version's history begins with an Add. I have a feeling he doesn't know how to use the branching feature at all.
Am I worrying about nothing or are my concerns valid?
Your concerns are valid. There's no good reason to exclude project files from the repository. They should absolutely be under version control. You'll need to standardize on a directory structure for automated builds as well, so your lead is just postponing the inevitable.
Here are some reasons to check project (*.*proj) files into version control:
Avoid unnecessary build breaks. Relying on individual developers to notify the rest of the team every time the add, remove or rename a source file is not a sustainable practice. There will be mistakes and you will end up with broken builds and your team will waste valuable time trying to determine why the build broke.
Maintain an authoritative source configuration. If there are no project files in the repository, you don't have enough information there to reliably build the solution. Is your team planning to deliver a build from one of your developer's machines? If so, which one? The whole point of having a source control repository is to maintain an authoritative source configuration from which you build and deliver releases.
Simplify management of your projects. Having each team member independently updating their individual copies of your various project files gets more complicated when you introduce project types that not everyone is familiar with. What happens if you need to introduce a WiX project to generate an MSI package or a Database project?
I'd also argue that the two points made in defense of this strategy of not checking in project files are easily refuted. Let's take a look at each:
Its to time consuming to reconcile the differences among developers' working directories.
Source configurations should always be setup with relative paths. If you have hard coded paths in your source configuration (project files, resource files, etc.) then you're doing it wrong. Choosing to ignore the problem is not going to make it go away.
It allows developers to work independently until their changes are stable
No, using version control lets developers work in isolation until their changes are stable. If you each continue to maintain your own separate copies of the project files, as soon as someone checks in a change that references a class in a new source file, you've broken everyone on the team until they stop what they're doing and carefully update their project files. Compare that experience with just "getting latest" from source control.
Generally, a project checked out of SVN should be working, or there should be tools included to make it work (e.g. autogen.sh). If the project file is missing or you need knowledge about which files should be in the project, there is something missing.
Automatically generated files should not be in SVN, as it is pointless to track the changes to these.
Project files with relative path belong under source control.
Files that don't: For example in .Net, I would not put the .suo (user options) web.config (or app.config under source control. You may have developers using different connection strings, etc.
In the case of web.config, I like to put a web.config.example in. That way you copy the file to web.config upon initial checkout and tweak what settings you'd like. If you add something that needs to be added to all web.config, you merge those lines into the .example version and notify the team to merge that into their local version.
I think it depends on the IDE and configuration of the project. Some IDEs have hard-coded absolute paths and that's a real problem with multiple developers working on the same code with different local copies and configurations. Avoid absolute path references to libraries, for example, if you can.
In Eclipse (and Java), it's fine to commit .project and .classpath files (so long as the classpath doesn't have absolute references). However, you may find that using tools like Maven can help having some independence from the IDE and individual settings (in which case you wouldn't need to commit .project, .settings and .classpath in Eclipse since m2eclipse would re-create them for you automatically). This might not apply as well to other languages/environments.
In addition, if I need to reference something really specific to my machine (either configuration or file location), it tend to have my own local branch in Git which I rebase when necessary, committing only the common parts to the remote repository. Git diff/rebase works well: it tends to be able to work out the diffs even if the local changes affect files that have been modified remotely, except when those changes conflict, in which case you get the opportunity to merge the changes manually.
That's just retarded. With a set up like that, I can have a perfectly working project containing files that are subtly different from everyone else. Imagine the havoc this would cause if someone accidentally propagates this mess into QA and everyone is trying to figure out what's going on. Imagine the catastrophe that would ensue if it ever got released to the production environment...!
Related
How do I have a Project Explorer's Working Set be built automatically from the contents of .gitignore, and then kept in sync with .gitignore?
I am working on a C++ AutoTools project which, as it is common for AutoTools projects, generates quite a lot of files during the build stage. I do have them .gitignored already. Now I'm trying Eclipse on that project, and found that I'd have to carefully pick files to ignore again.
You cannot. This feature does not exist.
The working sets functionality was implemented a long time before GIT appeared, and it was a method for removing clutter in large projects, and what is important, it was was a method that resided in the UI domain.
In fact, A working set extension point documentation shows it is possible to create a self-updating working set, and the search over the egit codebase returns no results.
As I have said, this feature is not implemented.
However, you can create your own plugin that will do what you want. It is not very complicated, and should not take more than a day or two. Or just open a feature request in the Eclipse bugzilla.
As for your underlying problem, you could try using the derived resources mechanism. It was added to make possible to prevent team providers (CVS/GIT) from managing files that are a result of a build.
Just a word of warning - GIT won't allow you to ignore further changes to any resource already under its control.
The issue that I am trying to figure out is whether it is best to keep the buildfile together with the source code, i.e. trunk and branches, or in some separate location (obviously still under SCM).
The question: (to keep in mind while reading the rest of the text) Buildfile that ensures branch re-buildability at any time (at expense of maintenance), or one that ensures latest bugfixes/improvements (to build process) are quickly used by multiple branches and different projects (at expense of backward compatibility with older branches)
It doesn't matter what we are building, or what technologies are used, but just for the sake of completeness: making a mobile application, building/packaging with ANT, using SVN for SCM.
Buildfile = instructions for your builder/compiler/packager to compile and package an application from source.
Buildfile with code
This is what we have right now. Ant's build.xml is stored alongside the main code in SVN. A number of other supporting "packaging" files (Apple's provisioning profiles and certs) are stored there as well.
Pros:
Single checkout from DEV perspective. When developers checkout the trunk or one of its branches, the buildfile is right there. They don't need to search for it elsewhere. A simple ant build after checkout is all they need.
When changes to the build/packaging process are done on trunk that require some reorganization in code (different file locations, support for compiler constants, etc), I need not worry about breaking existing branches, since each branch gets its own revision.
Cons:
When changes to the build/packaging process are done on trunk that improve the process and fix bugs, I now need to worry about merging those changes to all active branches, which means having to keep track of all dev/feature branches in addition to release branches.
No reusability. A technologically identical project, that only requires a few switches/property changes to the buildfile should be able to use identical buildfile. But because they are spread across multiple project locations (in addition to multiple branches, as from the point above), it becomes a nightmare to do a generic improvement that affects all those locations. Mainly due to the fact that no matter what, these files end up with little "patch-works" here and there, and eventually with conflicting merges and ever-so-slightly different processes that cannot be resolved without putting one of the projects on hold and modifying that process to "catch up" with the other.
Buildfile separate from code
To address the cons of the previous scenario in regards to re-using a single file and avoid a plethora of small fixes all over the place, I was thinking of keeping the build file separate. Shareable between the trunk, branches and other similar projects.
Pros:
Single file to modify, improve and bugfix, reusable by multiple other projects.
Cons:
No "single checkout" for DEVs (but it can be solved with svn externals or other linking solution
Breaking old/existing builds. Since there is only one version of the file now, introducing an improvement that requires code restructuring would make it incompatible with older branches. When that older branch needs to be rebuild (urgent fix to already released software), the build file will no longer work. Yes, it's solvable by getting a previous revision of the file, however:
It is not directly obvious which previous revision had worked with this branch
The older revision may be missing some other critical bugfixes to the build process.
Toss up question
So for me it is a toss up between making my life easier and only maintaining one file for bugfixes and improvements, and thus ensuring that projects use identical processes, latest bugfixes to the build process, etc. Or making developer's life easier by providing a single point of checkout, and ensuring branch "stability/re-buildability" because the buildfile that's checked in with the branch is guaranteed to work with that branch.
Is there a proper way for this? What is the proper way for this? Am I approaching this wrong?
I would suggest tagging your code with a version number every time you build. IF you ever want to roll back to a revision , you can always create a new branch from the tag of that revision and use the build.xml from that revision
You can also publish all your artefacts into a revision named directory. If you want to rollback you can just use the artefacts from this directory instead of going through the build process again .
Your build file should always be part of your code . That way a developer can check out his/her code into an IDE and start building from there for his local testing.
If you have env related configurations that are different per environment you should separate it out into a deployment configuration that is used when the code is deployed.
If you want to reuse your build files across projects , you can create a master build file with macros that is fetched prior to starting a build. The only thing you need to do is override the macros in your local project if you want to override the default behaviour
I am about to start participating in the development of a medium-sized project (~50k lines) that was until now written by a single person, and not versioned; as a result folders are cluttered with different versions of the same file (named file1, file2, file3, etc.).
I proposed to start using a VCS for it (a priori Mercurial, which is the only one I've ever used -- for my personal projects --, but I'm open to suggestions), so I'm taking any good ideas as to how to "start" the repository. E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed? Or something else?
(constructive remarks on mercurial vs bazaar vs git vs whatever are also welcome.)
Thanks for your tips.
E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed?
If the size of the repository is not a concern, then yes, that is a good starting point. Otherwise you can just commit what's actually used, and go from there.
As for which system, all DVCSes stick to the same core principles. Which one you pick is entirely subjective — the only way to truly know which one you like is to try each one.
I would say use what you are the most comfortable with and meets your needs. As far as where to start, I personally would seed the repo with the current source as is, that way you can verify that everything builds and runs as expected. you can make this initial seed a branch. That way you can always go back to your starting point before refactoring.
My approach to this was:
create a Mercurial repository in the existing project folder ("existing")
commit all project files to "existing"
create an empty repository in what a different location ("new")
As files are tested and QA'd (this was necessary because there was so much dross in "existing") pull them from "everything" to "new".
Once files had been pulled into "new"; delete the corresponding files from "existing". If access is needed to these files while the migration is under way, push them back from "new" to "existing".
This gave me the advantage of putting everything under some sort of control for recovery purposes, control over introducing the project to the DVCS. Eventually the existing project folder became completely tested and approved for the project moving forward. At this point the "everything" directory could be deleted or changed into a working folder; and "new" became the actual project folder.
I think Mercurial is a good choice. Lightweight, fast, very simple to use and well-integrated with Windows (if that's the platform you're dealing with).
I would probably get rid of all the clutter before the first commit. Delete everything you don't care about, run all the necessary tests and only then do the commit.
Yes, I'm dead set against the 0-day cluttering of repos.
Granted, a 50K SLOC project isn't very big, but if you commit files you already know you won't need, they will make your repo slightly bigger.
Also, remember to check that the tree doesn't contain large binary files. If it does, get rid of them if at all possible.
In VS, it's simple. Everything the project needs is stored in the project folder and all VS settings are stored in one place. Eclipse, however, stores Eclipse settings with the project and keeps a .metadata at the workspace level which is needed to detect the projects in the workspace. Thus, I can't simply branch a project and then open it in Eclipse. I need to set up a workspace, branch it into that workspace, copy over all my workspace settings (settings import/export doesn't even work right in Eclipse) so I have the same Eclipse settings, then do some kind of import to get the project in the workspace. This is what I generally refer to as a pain in the freaking neck, and it causes me to not branch any Java projects and to keep them all in one folder. This is also a pain.
Is there any way I can get a setup where I can just branch a project and open it in Eclipse, while maintaining the same Eclipse settings?
UPDATE: The current state of the question is expressed by the comment to soru's post.
Pretty sure you want to:
Keep the same workspace for all projects (or maybe a few, at the level of say 'hobby' and 'work').
switch between different branches in the same project by using the features of your version control tool/plugin
if you want to work on multiple branches at the same time, just create two projects, and manage them both as above.
if you want to temporarily hide the inactive version, use the 'working set' feature.
The main limitation is that you might want to have projects with the same name, but you can't. So sometimes you have to make up a project name different from the underlying folder name.
In general, mapping between VS and Eclipse:
Installation <-> workspace
Solution <-> working set
Project <-> project or folder or VC system branch or working set node
Refs:
VS object model
using working sets in Eclipse
working with branches in subclipe
Well, I'm not a fan of keeping any IDE specific settings in the repo, but when I do I keep only .project, .classpath and .settings.
You can also keep you settings at the workspace level (Windows->Preferences),and not on the project level (Project->Properties).
Also why do you create a seperate workspace for branches? You can keep it in one workspace, no need to create another one.
You could also use "switch" in subversion (I don't know if that's what you are using, but other revision systems should have something similar) and go to the branch you have created.
(of course if you wan to work concurrently on more than one branch then it doesn't help)
I can't speak to the Eclispe problem, as i'm only a n00b user, but I can speak to the secondary question.
I've been working in systems for a number of years that ended up needing to have various branches of the same code done for a variety of reasons.
One of the best reasons for keeping specific settings in project-specific locations is that so the various compiler / sdk / etc. settings & files can be specific per-branch and not collide between branches.
This allows, for example, for the work to upgrade a code set to a newer sdk/compiler to be done without impacting the ability to work on the existing "main line" code set with the previous sdk/compiler should the need arise.
In my experience in the computer game industry as a core technology wog, this happens a LOT.
I'm sure the same situations occur outside the computer game industry, maybe just not at the same pace.
Is it a best practice to have my local source tree mirror the server source tree? It seems to me that it should, however my department at work does not do it that way and I find it very confusing. If not, what are scenarios where it makes sense to deviate from the server source tree?
EDIT: To clarify what I mean - say the source directory we want to map on the local machine is here on the server:
\\TeamServer\Project\Releases\2008
On our local machine, that directory would be mapped like this:
D:\2008_Releases
instead of:
D:\Project\Releases\2008
I don't know if it's a best practice but I prefer mirroring the source tree too. It's one less "gotcha" in terms of getting a new developer up and running. Not mirroring the source tree can eventually come back to bite you when it comes to relative paths.
Someone probably made a mistake when they set things up originally and it never got corrected. IMHO it's a minor annoyance; just one of the side effects of not living in a perfect world ;)
If your local tree needs to be at the same path as it is on the server, then you can't have multiple copies checked out. At the last two places I've worked, it was common for me to have several copies of (parts of) the tree checked out at any given time, depending on how many different bugs or features I was working on, and how many branches of the product the bugs or features were in.
Personally, I have no idea where the source trees were stored on the servers, and I didn't need to. I just ran cvs co or svn co to get a copy of the tree in my working directory. As long as I ran make or ant somewhere in the source tree, everything below it would compile.
I prefer to have individual project/release pairs in their own directories as close to root as reasonably possible. It's a horrible PITA to go clickety-click on directory tree or having to type the "c:\project\releases\2008" -part for the umpteenth time.
I also think checking out sources to different path tends to flush out bugged assumptions about project locations (we have postbuild events that do some nasty things).
There is one reason not to do this: When you have access to the production machine. In this case, I prefer to have different paths. This makes it more likely that I notice my "rm -rf" is on the wrong box...
i think mirrowing is a good practise and i like to do it so too, however i have an extra temp folder for the case i need to checkout a fresh version for e.g. to test something with the current version or to fix a bug which has a high priority then what im doing atm.
The most important thing is to understand why the choice was made and use the team to determine if this is something that you would want to change. Team buying is important for these kind of decisions - and the team is more than just the developers. Source control trees contain things such as documentation, tests, resources, etc., so there is a legitimate chance that the structure was determined by multiple parties to find a common ground.
We use TFS where I work, as well. We do have a common root folder called C:\Source Control that is the main folder that all Team Projects live under. This choice was made because of the extreme dislike, by all parties, that the drive would get cluttered with various folders.
With TFS, you have the option to map multiple workspaces, so what you do on your local machine is not dictated by the structure on the server. My personal preference, and my team's preference, as well, is to use a single workspace mapping to the source control root. Given the shelving functionality in TFS, there is not a concern about having multiple copies checked out, since they can be shelved if something else needs to be worked on.
The build server has the same mapping, as well. This in no way, however, matches the deployment structure. Builds are dropped based upon the criteria of the project. The only time this presents a problem is when absolute paths are used, in configuration files, for example. Since we are not using absolute paths (by definition in our developer guidelines) this is a non-issue.
People have different opinions on how to organize their hard disks, why should you enforce a particular layout?
One should be able to check out multiple working copies of the same project in the repository - impossible if you insist using the same hierarchy.
In one large project I worked on, literally thousands of files assumed that the working copy was in a fixed absolute hard-coded path (\projectname). To work with several branches of the project, we had to resort to using different drive letters (on Windows), meaning that we had to divide our hard disks into many partitons (6 or more were common). This was very inconvenient.
Eventually we had to integrate the project as a sub-project of a larger project. This meant changing all those absolute paths, a tedious and time-consuming task.
Using relative paths gives you much more flexibility; however, if everybody has the project root at the exact same location, you won't notice if someone adds an absolute path somewhere by accident.