Automatically renaming files during a TFS Merge - version-control

My development team is in the process of implementing branching and merging in TFS. Our software is contained in one solution, which many projects embedded within. One of those projects contains all the SQL scripts to keep our databases versioned.
We are having trouble figuring out how to handle the Database project during merging. For example, whenever a developer needs to make any changes to the database (add/remove a stored proc, create table, add indexes, etc) we create a script file. Each file gets named with a version number. For instance, if the last script file that was checked in is named 4.1.0.1, I would name my new script 4.1.0.2. We use these file names to match a software version with the corresponding database version it needs to run.
Say I create a branch off our main code branch to do a new feature. I do all the coding, and I put all my SQL changes in one script file and add it to the DB project. Obviously I could just merge from the main code branch into my new branch to make sure I have the latest list of SQL scripts so I could name it correctly. The problem is new script files are added several times throughout the day and we think we're going to have a ton of merge conflicts with the script naming.
I'd like to somehow automate this, so a developer can just add a script with some random name and during the merge there is a hook or event that will figure out which files are new to the main code base and also how to correctly name those files so the developers don't have to worry about it. For example, I can just create a new file called "new_script.sql", and when I merge that back to the main code branch it will get renamed to 4.1.0.2
Previously I've used tools such as RoundHouse to take care of all the SQL versioning, however at my current job we use Sybase SQL Anywhere 10, and I haven't been able to track down something similar to RoundHouse that will work with Sybase.
Can anyone point me in the right direction on how to automate a task during a TFS merge? I'm assuming this can be done using PowerShell, but my concern is that most of the development team is unfamiliar with PowerShell and I was hoping to be able to automate this task without the developer needing to leave the team explorer window.
Any help is greatly appreciated!
Thanks!

I think you are absolute right that you will get naming conflicts if you continue to use this strategy and introduce branches.
I think you have two options:
When adding these files on a branch, add them prefixed with the branch name. Eg. branch_0.1.sql and branch_0.2.sql. Then when you merge your branch back into the trunk, you would have to rename these to the correct name. You could write a program to do this for you, and would be fairly simple.
To make this simpler, you could only create one SQL file per branch too and just keep adding to that one SQL file. Then even you merge it's a very simply process.
Ditch the self versioned files. Tfs is version control. There isn't much need to version the files yourself. We use Visual Studio database projects for our database versioning and this works absolute fine. I guess you would need to change something in your deployment strategy if you changed this.
I hope this helps.

Related

How can I do a stream copy/merge of only a subdirectory in Perforce?

I'm using P4V. I work in a subdirectory (eg code/jorge) and other people work in another subdirectory (eg art/) that I never deal with. Additionally I have a stream where I do my personal work. Every so often I need to merge changes from the main line to my stream, and copy them back up. However, the files in art/ are large binaries and Perforce spends a long time thinking about them even though I've not touched them. Is there any way to have perforce merge/copy my directory (code/jorge) without it spending time trying to merge art/? Can I tell P4V to merge/copy only the code directory?
Related but not identical question: Perforce streams, exclude files from merge/copy
If you don't touch those files, it might be easier to not include them in your stream at all rather than manually exclude them every time you do a merge.
I.e. if your stream Paths currently says:
share ...
maybe it should instead be:
share code/jorge/...
or, if you need the art for builds but never need to modify it, you might consider doing something like:
import art/...
share code/...
I am not sure this is the recommended option but you can actually merge without using the "Stream to Stream" option but the standard "Specify source and target file" options, even if you are in a stream depot.
So you can select any subdirectory as your source like 'dev/code/jorge' and the same subdirectory as destination like "main/code/jorge' and it will only consider that directory. We do it routinely in my team because we have a big mono repo and have not taken the time to setup multiple depots when we migrated to Perforce.

Starting to version an already medium size project

I am about to start participating in the development of a medium-sized project (~50k lines) that was until now written by a single person, and not versioned; as a result folders are cluttered with different versions of the same file (named file1, file2, file3, etc.).
I proposed to start using a VCS for it (a priori Mercurial, which is the only one I've ever used -- for my personal projects --, but I'm open to suggestions), so I'm taking any good ideas as to how to "start" the repository. E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed? Or something else?
(constructive remarks on mercurial vs bazaar vs git vs whatever are also welcome.)
Thanks for your tips.
E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed?
If the size of the repository is not a concern, then yes, that is a good starting point. Otherwise you can just commit what's actually used, and go from there.
As for which system, all DVCSes stick to the same core principles. Which one you pick is entirely subjective — the only way to truly know which one you like is to try each one.
I would say use what you are the most comfortable with and meets your needs. As far as where to start, I personally would seed the repo with the current source as is, that way you can verify that everything builds and runs as expected. you can make this initial seed a branch. That way you can always go back to your starting point before refactoring.
My approach to this was:
create a Mercurial repository in the existing project folder ("existing")
commit all project files to "existing"
create an empty repository in what a different location ("new")
As files are tested and QA'd (this was necessary because there was so much dross in "existing") pull them from "everything" to "new".
Once files had been pulled into "new"; delete the corresponding files from "existing". If access is needed to these files while the migration is under way, push them back from "new" to "existing".
This gave me the advantage of putting everything under some sort of control for recovery purposes, control over introducing the project to the DVCS. Eventually the existing project folder became completely tested and approved for the project moving forward. At this point the "everything" directory could be deleted or changed into a working folder; and "new" became the actual project folder.
I think Mercurial is a good choice. Lightweight, fast, very simple to use and well-integrated with Windows (if that's the platform you're dealing with).
I would probably get rid of all the clutter before the first commit. Delete everything you don't care about, run all the necessary tests and only then do the commit.
Yes, I'm dead set against the 0-day cluttering of repos.
Granted, a 50K SLOC project isn't very big, but if you commit files you already know you won't need, they will make your repo slightly bigger.
Also, remember to check that the tree doesn't contain large binary files. If it does, get rid of them if at all possible.

Project files under version control?

I work on a large project where all the source files are stored in a version control except the project files. This was the lead developer's decision. His reasoning was:
Its to time consuming to reconcile the differences among developers' working directories.
It allows developers to work independently until their changes are stable
Instead, a developer initially gets a copy of a fellow developer's project files. Then when new files are added each developer notifies all the rest about the change. This strikes me as far more time consuming in the long run.
In my opinion the supposed benefits of not tracking changes to the project files are outweighed by the danger. In addition to references to its needed source files each project file has configuration settings that would be very time consuming and error prone to reproduce if it became corrupted or there was a hardware failure. Some of them have source code embedded in them that would be nearly impossible to recover.
I tried to convince the lead that both of his reasons can be accomplished by:
Agreeing on a standard folder structure
Using relative paths in the project files
Using the version control system more effectively
But so far he's unwilling to heed my suggestions. I checked the svn log and discovered that each major version's history begins with an Add. I have a feeling he doesn't know how to use the branching feature at all.
Am I worrying about nothing or are my concerns valid?
Your concerns are valid. There's no good reason to exclude project files from the repository. They should absolutely be under version control. You'll need to standardize on a directory structure for automated builds as well, so your lead is just postponing the inevitable.
Here are some reasons to check project (*.*proj) files into version control:
Avoid unnecessary build breaks. Relying on individual developers to notify the rest of the team every time the add, remove or rename a source file is not a sustainable practice. There will be mistakes and you will end up with broken builds and your team will waste valuable time trying to determine why the build broke.
Maintain an authoritative source configuration. If there are no project files in the repository, you don't have enough information there to reliably build the solution. Is your team planning to deliver a build from one of your developer's machines? If so, which one? The whole point of having a source control repository is to maintain an authoritative source configuration from which you build and deliver releases.
Simplify management of your projects. Having each team member independently updating their individual copies of your various project files gets more complicated when you introduce project types that not everyone is familiar with. What happens if you need to introduce a WiX project to generate an MSI package or a Database project?
I'd also argue that the two points made in defense of this strategy of not checking in project files are easily refuted. Let's take a look at each:
Its to time consuming to reconcile the differences among developers' working directories.
Source configurations should always be setup with relative paths. If you have hard coded paths in your source configuration (project files, resource files, etc.) then you're doing it wrong. Choosing to ignore the problem is not going to make it go away.
It allows developers to work independently until their changes are stable
No, using version control lets developers work in isolation until their changes are stable. If you each continue to maintain your own separate copies of the project files, as soon as someone checks in a change that references a class in a new source file, you've broken everyone on the team until they stop what they're doing and carefully update their project files. Compare that experience with just "getting latest" from source control.
Generally, a project checked out of SVN should be working, or there should be tools included to make it work (e.g. autogen.sh). If the project file is missing or you need knowledge about which files should be in the project, there is something missing.
Automatically generated files should not be in SVN, as it is pointless to track the changes to these.
Project files with relative path belong under source control.
Files that don't: For example in .Net, I would not put the .suo (user options) web.config (or app.config under source control. You may have developers using different connection strings, etc.
In the case of web.config, I like to put a web.config.example in. That way you copy the file to web.config upon initial checkout and tweak what settings you'd like. If you add something that needs to be added to all web.config, you merge those lines into the .example version and notify the team to merge that into their local version.
I think it depends on the IDE and configuration of the project. Some IDEs have hard-coded absolute paths and that's a real problem with multiple developers working on the same code with different local copies and configurations. Avoid absolute path references to libraries, for example, if you can.
In Eclipse (and Java), it's fine to commit .project and .classpath files (so long as the classpath doesn't have absolute references). However, you may find that using tools like Maven can help having some independence from the IDE and individual settings (in which case you wouldn't need to commit .project, .settings and .classpath in Eclipse since m2eclipse would re-create them for you automatically). This might not apply as well to other languages/environments.
In addition, if I need to reference something really specific to my machine (either configuration or file location), it tend to have my own local branch in Git which I rebase when necessary, committing only the common parts to the remote repository. Git diff/rebase works well: it tends to be able to work out the diffs even if the local changes affect files that have been modified remotely, except when those changes conflict, in which case you get the opportunity to merge the changes manually.
That's just retarded. With a set up like that, I can have a perfectly working project containing files that are subtly different from everyone else. Imagine the havoc this would cause if someone accidentally propagates this mess into QA and everyone is trying to figure out what's going on. Imagine the catastrophe that would ensue if it ever got released to the production environment...!

How do I add programmatically-generated new files to source control?

This is something I've never really understood about source control, specifically Subversion (the only source control I've ever used, which isn't saying much). I'm considering moving to git or Mercurial, so if that affects the answer to my question, please indicate as such.
Ok. As I understand it, every time I create a new file, I have to tell SVN about it, so that it knows to add it to the repository and place it under control. Something like:
svn add newfile
That's fine if I'm the one creating the file: I know I created it, I know its name, I know where it lives, so it's easy to tell SVN about it.
But now suppose I'm using a framework of some kind, like Rails, Django, Symfony, etc., and suppose I've already done the initial commit. All of these frameworks create new files programmatically, often many at once, in different directories, etc. etc. How do I tell the source control about these new files? Do I have to hunt each one of them down individually and add them? Is there an easier way? (Or am I possibly misunderstanding something fundamental about source control?)
Generally speaking, you shouldn't add files to source control if they can be generated from other files in your project. It's true that in some cases, a file is initially generated, but must be modified manually. In that case you will have to add it to source control. However, you should almost never automatically add files.
I agree with Matthew in general, if it can be generated it shouldn't be added but remain dynamically created.
For the practical question of adding multiple files, I don't remember in svn (though I think it should be possible), but to do this in git:
Using git bash (command line) you can add all "loose" files under the directory or subdirectory by not specifying a file after the add command. You can also set git to ignore certain files, so they wont be added in that case.
Another way is using git gui, it displays all un-tracked files and you can select them all (or groups of it) and add them in one click.

Daily Build and SQL Server Changes

I am about to try and automate a daily build, which will involve database changes, code generation, and of course a build, commit, and later on a deployment. At the moment, each developer on the team includes their structure and data changes for the DB in two files respectively, e.g. 6.029_Brady_Data.sql. Each structure and data file includes all changes for a version, but all changes are repeatable, i.e. with EXISTS checks etc, so they can be run every day if needed.
What can I do to bring more order to this process, which is currently basically concatenate all structure change files, run them repeatedly until all dependencies are resolved, then repeat with the data change files.
Create a database project using Visual studio database edition, put it into source control and let the developers check in their code. I have done this and it works good with daily builds and offer a lot of support for structuring your database code. See this blog post for features
http://www.vitalygorn.com/blog/post/2008/01/Handling-Database-easily-with-Visual-Studio-2008.aspx