Deploying a version control system for the company: how to use it with binary files - version-control

I am tasked with setting up a Mercurial version control system for our small team of developers (2-3 people). There was no version control system before, just shared folders and multiple copies. I don't have much experience setting up version control systems beyond personal projects; I just happen to be the most experienced person on our team in this respect. The code repository lives in a shared folder on a central server; the top-level directory is the client name, and one level down is the project name for that client.
The problem is that I haven't figured out how to deal with binary files in our code repository. From what I've read, binary files shouldn't be version tracked. But since the code repository is centralized on the server, shouldn't the binaries be in there as well? Otherwise, for things like image files and third-party DLLs, the project wouldn't build or run properly when cloned from the central server. Also, the Mercurial web interface has a nice feature where you can download the whole source package as a ZIP or BZ2 compressed file; without the necessary binary files, the downloaded project wouldn't run or compile.
I guess the solution is to include everything in the version control system except temporary files and files used only for debugging, but other than that, most binary files should be included? Due to the limitations of version control systems, I don't think there is a way for them to track changesets for binary files, so I guess we just have to live with that.
Edit: After more research into how to set up a version control repository, the more commonly recommended approach is to "store everything which is created manually, and nothing else" (quoting Eric Sink).
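For instance, a minimal .hgignore along these lines (the patterns are assumptions to tune per project) keeps generated and temporary files out of the repository while still tracking images and third-party DLLs:

    syntax: glob

    # Build output -- regenerated on every build, so not tracked.
    bin/**
    obj/**
    # Temporary and debug-only files.
    *.tmp
    *.pdb
    *.user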

You want to version control anything that you can't generate from other stuff in version control. That means your source files, plus the instances of third-party libraries, tools, etc. that your project relies on.
The binaries built from your project are something else entirely and should be treated as a different kind of artifact. If you want an easy-to-test downloadable archive, adapt your build process to provide that as a target: it should build the code, then compress the source and the built binaries into the desired single file.
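For example, a release step along these lines (a sketch only; the project name, the use of MSBuild, and the paths are assumptions, and any build runner would do) produces that single downloadable file:

    # Export a clean copy of the tracked sources (no repository metadata).
    hg archive dist/myproject-src
    # Build the project; substitute your real build command here.
    msbuild MyProject.sln /p:Configuration=Release
    # Bundle sources and built binaries into one testable archive.
    zip -r dist/myproject-full.zip dist/myproject-src bin/Release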

Binary files that are related to or required by the project should be included in version control; they can be tracked. The only things version control can't do with binary files are comparing and merging.

Related

Updating binary files in TFS Source Control

So I decided to add the 3rd-party DLLs my project references to source control, in a separate folder called lib, and then reference them from that directory.
This works just fine, but when I want to update the files, TFS seems completely oblivious to the fact that the files have actually changed. Even if I copy the new files over the old ones, there seems to be no way of checking in the newer versions. If I choose Check-in pending changes from Source Control Explorer, I get an info box saying there are no changes. But if I run a compare on a single DLL between the latest and workspace versions, TFS does tell me the files are indeed different.
So is the only solution to delete the files from source control and then re-add them as the newer versions, or can I somehow just update them?
Team Foundation Server (through 2010, and with 2012's "Server Workspaces") uses a "Checkout/Edit/Checkin" model for version control that differs from many other types of version control systems (e.g., "Edit/Merge/Commit" systems).
In order to update your binaries, you need to explicitly check them out and update the contents; you can then check them in. This type of system is tuned for dealing with large repositories and large files like binaries, since it does not need to scan your disk to determine whether files have changed.
If you prefer to work with an Edit/Merge/Commit type system, which scans your disk for changes so that you don't need to explicitly check files out, this is available in TFS 2012 (as "Local Workspaces").
Have you tried checking the file out for edit before replacing it? It works here...
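Concretely, with the tf.exe command-line client the update might look like this (a sketch; the local path and drop share are assumptions):

    rem Pend an edit so TFS expects the file to change.
    tf checkout lib\ThirdParty.dll
    rem Overwrite the working-copy file with the new binary.
    copy /Y \\buildshare\drops\ThirdParty.dll lib\ThirdParty.dll
    rem Check the updated binary in.
    tf checkin lib\ThirdParty.dll /comment:"Update ThirdParty.dll to latest vendor drop"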

Keeping SSIS packages under the source control

I store all SSIS packages in a Subversion repository, along with their configuration files. The configuration file is almost always stored in the same folder as the package.
The problem is that SSIS always seems to store the path to the configuration file (the one saved in the package itself) as an absolute path.
When someone else checks out the folder with the package to a location different from the one on my development PC, the configuration file is not detected (because my absolute path is stored, and it doesn't exist on the other developer's PC). So the other developer has to remove this configuration and re-add it from wherever it now lives on his local hard drive. The changed package is then saved, which causes a new version to be committed. When I get that version from SVN, it no longer matches the local path on my PC.
On a related note: another developer may want to change values in the configuration file as well. If I later get the latest version of everything from SVN, the package will no longer work on my PC.
How do you work around these inconveniences?
Another solution is to save your configuration in a database, with an environment variable as the first configuration to tell the package which database to look in; that's what we do. We have scripts in our source control to populate the SSIS config table for each server, but the package uses the actual table data from the database named in the environment variable.
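For reference, the table that the SSIS package configuration wizard generates for the "SQL Server" configuration type looks roughly like this (the default table name is shown; your database and schema will differ):

    -- Default table used by SSIS "SQL Server" package configurations.
    CREATE TABLE [dbo].[SSIS Configurations] (
        ConfigurationFilter  NVARCHAR(255) NOT NULL,  -- groups the settings for one configuration
        ConfiguredValue      NVARCHAR(255) NULL,      -- the value injected into the package
        PackagePath          NVARCHAR(255) NOT NULL,  -- property path inside the package
        ConfiguredValueType  NVARCHAR(20)  NOT NULL   -- data type of the configured value
    );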
Anyone who has heard my SQL Saturday presentations knows I don't much care for XML, and this is one of the reasons. A trick to using XML configuration with varying locations is to use an environment variable (indirect configuration) to tell SSIS where to look for that resource. The big, big downside to this approach is that you generally need to create an environment variable for each set of configuration files, or else have one massive, honking .dtsconfig file, which becomes painful to version.
The option I prefer, if XML configuration is a must, is to remove the "variableness": developers and admins get together and agree "there will be a folder everywhere SSIS is done to hold configuration files, and that location is X", and then it's just a matter of solving for X. At a previous job, we used D:\ssisdata\configs.
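For the indirect-configuration route, each machine then needs a one-time setup step along these lines (the variable name and path are assumptions; SSIS reads the variable's value as the path to the .dtsConfig file):

    rem Point the indirect configuration at this machine's copy of the file.
    setx MYPKG_CONFIG "D:\ssisdata\configs\MyPackage.dtsConfig" /M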
@HLGEM's approach of a table for configurations is hands down my favorite approach to SSIS configuration (until you get to 2012 and its project deployment model, where configuration is an entirely different animal).
I add a folder called "config" under my project's folder, add it to source control, and maintain the config file in this folder. You can also add it to the SSIS project if you like.
I think it's a good solution because everybody can have this folder and download the config file.
When the package is deployed, it will read the config file from wherever you specify in the deployment manifest, so this solution won't impact your development.

Is it common for a developer to keep their NAnt.exe.config file in version control?

Is it common for a developer to keep their NAnt global configuration file (NAnt.exe.config) in version control?
And should the rest of the files in the NAnt installation be added to the version control system's ignore file, or not?
One use of version control is as a backup. If the only copy of NAnt.exe.config is on a hard disk that dies, it will take some effort to reconstruct it (along with everything else that disappeared and wasn't backed up).
From the corporate perspective, having all of the work in progress backed up is a method for preserving assets. The corporate owner of the source code asset is assured that the asset will not be destroyed.
When there is another backup strategy, the rule of thumb is sometimes not to put anything into version control that should not be shared with other developers, such as customized data relevant only to one user and/or machine, or confidential information.
I keep a copy of the NAnt code for the version I'm using. This includes the .config file. This is so my build system is safe from "it disappeared from the internet" events (unlikely, but still).
Beyond that, I see no reason to keep it in your code repository, unless for some reason you've modified it somehow. Almost everything in NAnt can be overridden in build files, like the target framework and so on.
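For example, instead of versioning a tweaked NAnt.exe.config, the target framework can be pinned in the build file itself (the framework value shown is only an example):

    <!-- Override the target framework per build file rather than in NAnt.exe.config. -->
    <property name="nant.settings.currentframework" value="net-4.0" />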

Keep Attributes of Version Controlled Files Unchanged

Is it possible to keep the attributes of a version controlled file unchanged? I have a directory structure which I'd like my installer to recreate on the client machine. I was hoping the entire directory could be placed under version control without affecting the file attributes.
I'm using TFS but would also like to hear about other version control systems.
Edit: I'm talking about Windows file system attributes such as Hidden/Archive/System/Read-only, but any other information such as creation/modification dates is also welcome. I have a directory structure in which some files are read-only, and I need those files installed as such on the client's machine. TFS tends to set/unset the read-only attribute depending on whether the file is checked in or checked out.
TFS does not store file attribute data (such as created date or modified date) in the current versions of TFS. The values of those attributes will be the time on the local computer when the file is first downloaded or modified.
TFS 2010 has the ability to attach arbitrary metadata to version control objects. You'd have to write your own tool, however.
API specification (prerelease): http://blogs.msdn.com/mrod/archive/2008/05/09/team-foundation-server-properties.aspx
Usually version control systems do not store full metadata information about the files under their control in the repository. In typical usage of a version control system this is not needed, and it might even cause problems; version control systems store a "sane" subset of metadata (such as executable permissions and symbolic links).
A possible solution is to use hooks to save the required parts of file metadata on commit to some file (usually a plain text file), keep this file under version control so that it is distributed automatically to all clients, and use hooks to restore the metadata on checkout (a sketch follows the tool lists below).
Example tools that save and restore metadata include (unfortunately the examples are for Git, not TFS, but it is the idea that matters):
metastore
git-cache-meta
Example tools that keep configuration files under version control (again, all of them using Git as a backend) include:
IsiSetup
etckeeper
giterback
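As a rough sketch of the hook idea described above (using Git to match the tools listed; the .file-attributes name and the reliance on attrib.exe under Git Bash on Windows are assumptions):

    #!/bin/sh
    # .git/hooks/pre-commit -- record each tracked file's Windows attributes
    # in a versioned text file. A matching post-checkout hook would parse
    # this file and re-apply the attributes (e.g. "attrib +R <file>").
    git ls-files | while IFS= read -r f; do
        attrib "$f"
    done > .file-attributes
    git add .file-attributes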

How to manage external dependencies which are constantly being modified

Our development uses lots of open-source code, and I'm trying to figure out the best way to manage these external dependencies.
Our current configuration:
We are developing for both Linux and Windows
We use svn for our own code
External dependencies (boost, log4cpp, etc.) are not stored in svn. Instead I put them under ./extern (or C:\extern on Windows). I don't want to put them in our repository because I would not be able to update them that way. Some of these are constantly being updated.
My questions:
1. What do I do if I need to modify external code?
Currently I have created a folder in my svn repository called extern_hacks, and that is where I put modified external code. I then link (or, on Windows, copy) the files into the external directory structure. This solution is problematic, since it is hard to keep track of the copied files, and very hard to update from svn when files are sitting in two repositories (mine for the modified files, and the original project's, say on SourceForge).
2. How do I manage versions of the external dependencies?
I'm interested to hear how others deal with these issues. Thanks.
I keep them in svn and manage them as vendor branches. Keeping them loose externally makes it very hard to go back to a previous build, or to fix bugs in a previous build (especially if the bug comes from a change to the external dependency).
Keeping them in svn has saved me lots of headaches, and it also allows you to get a new workstation able to work on your codebase quickly.
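The vendor-branch workflow from the Subversion book looks roughly like this (a sketch; the repository URL and version numbers are assumptions):

    # Import the pristine vendor drop into a "current" area.
    svn import boost-1.50.0/ http://svn.example.com/repos/vendor/boost/current \
        -m "Import Boost 1.50.0 vendor drop"
    # Tag the drop so this exact version can always be recovered.
    svn copy http://svn.example.com/repos/vendor/boost/current \
             http://svn.example.com/repos/vendor/boost/1.50.0 \
             -m "Tag Boost 1.50.0"
    # Copy the tagged drop into the project tree; local fixes are made there.
    svn copy http://svn.example.com/repos/vendor/boost/1.50.0 \
             http://svn.example.com/repos/trunk/extern/boost \
             -m "Bring Boost 1.50.0 into trunk"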
I do not understand why you say
I don't want to put them in our repository because I will not be able to update them that way. Some of these are constantly being updated.
You really need to:
1. Include external dependencies in your source control, periodically update them, and then test, test, test.
2. Coordinate your build process with the updates of the external dependencies.
If your code depends upon something, then you really need to have control over when it gets updated/modified. Coding in a space where these dependencies can be updated at any time is too painful, as you're no doubt finding out. I personally prefer option 1.
When I had to do something like this, I added the external source as an svn external and then applied a patch to it. The patch contains my modifications to the external source, so I actually only version control my patches. This works most of the time, as long as there are no "dramatic" changes in the external code.
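A sketch of that workflow (the paths are assumptions):

    # From inside the external source tree: capture the local modifications.
    svn diff > ../patches/log4cpp-local-fixes.patch
    # After updating to a new upstream drop, re-apply the modifications;
    # any rejected hunks must be resolved by hand.
    patch -p0 < ../patches/log4cpp-local-fixes.patch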
Have you considered Maven? It's a build system with excellent support for managing dependencies. For each project you specify the required dependencies in an XML file that is part of that project. The external libraries are held in a dependency repository (in our case Artifactory); this is separate from your version control system and can simply be a network drive. It also allows managing different versions of projects.
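A dependency declaration in a project's pom.xml looks like this (Maven is JVM-centric, so the artifact shown is only an illustration; native libraries like boost would need a different packaging approach):

    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>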
I would be careful considering Maven because:
it adds another repository to a system where you already have one (your current version control system);
it (Maven) is based on the only "common version control" every developer has, the file system (which means no metadata or properties attached to the files, and no proper history in terms of who modified what and when).
Now, when dealing with third parties, you can consider keeping them in your version control system, but in a packaged way: that is, in a very compact form, with sources and documentation zipped, in order to have the smallest possible number of files.
That way, you can manage the deployment of those (many) third-party libraries easily, since the number of files to deploy is low.
Plus, having them under source control allows you to make a branch (say, a 'hack' branch) in which you store the packaged (or zipped) version of the hacked library.
What you can store externally is the unzipped, complete set of files representing those libraries, since there is no real development on them, or only an occasional hack: normally, your job is not to develop existing libraries, but to use them (even slightly modified) to implement features of your project faster.
If at some point you need to compare a hacked version with an official one, you just pull the appropriate 'hacked' version number from svn, unzip it, and compare it with the official (and externally stored) version (with WinMerge, for instance).
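A sketch of the packaged approach (the names are assumptions):

    # Zip the (possibly hacked) library -- sources plus docs, one compact file.
    zip -r extern/log4cpp-1.0-hacked.zip log4cpp-1.0/
    # Version the package, optionally on a dedicated 'hack' branch.
    svn add extern/log4cpp-1.0-hacked.zip
    svn commit -m "Store packaged, hacked log4cpp 1.0"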