Ignore re-generated but unchanged files in a Subversion commit - entity-framework

We use Entity Framework 5. Sometimes, when we make a change to the model, all class files are deleted and re-created, although most of them contain exactly the same code as before.
Subversion marks all these files as deleted/new and, when committed, uploads a new version of all of them, regardless of whether their contents have actually changed. This is annoying as it makes it difficult to track which files have actually changed.
Is there any way to make subversion include in the commit only those files that have actually changed?
We are using TortoiseSVN 1.7.11, Build 23600; and AnkhSVN 2.3.10838.1211 with Visual Studio Professional 2012

As a workaround, you can use Subversion hooks to prohibit removing and adding files with identical names in the same revision. The user is then responsible for converting each 'replace' into a 'modify'. This is also annoying, but the history will look better, and merges and updates will behave better.
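To make the hook idea concrete, here is a minimal sketch of the check such a hook might run. It assumes the change listing looks like `svnlook changed` output (a two-character status, two spaces, then the path) and that a replaced file shows up as both a `D` and an `A` line for the same path; how your Subversion version actually reports a replacement may differ, so verify that first. The function name is made up for illustration.

```python
from collections import defaultdict


def find_replaced_paths(changed_lines):
    """Return paths that are both deleted and added in one change listing.

    Assumption: each line has a two-character status, two spaces,
    and then the path, as in `svnlook changed` output.
    """
    actions = defaultdict(set)
    for line in changed_lines:
        if len(line) < 5:
            continue  # skip blank or malformed lines
        status = line[:2].strip()
        path = line[4:].rstrip("\n")
        actions[path].add(status)
    # A path counts as "replaced" when it has both a D and an A entry.
    return sorted(p for p, seen in actions.items() if {"A", "D"} <= seen)
```

A real pre-commit hook would feed this the output of `svnlook changed -t TXN REPOS` and exit with a non-zero status when the returned list is not empty, which rejects the commit.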

Related

Perforce: submit files with "Version in workspace is not latest version"?

(I work with perforce from eclipse by the perforce plugin).
After associating my workspace with a perforce depot, all the files got status "Version in workspace is not latest version" (yellow triangle)....
When a file has this status, submit is disabled for this file.
When I do "Sync with depot" on a project, all those files show the conflict icon (even when there's no conflict...).
Conflict? Does that mean I need to resolve?
Here is how to do resolve:
http://www.perforce.com/perforce/doc.current/manuals/p4guide/05_resolve.html
The problem is... by "resolving" perforce overrides all the files in my workspace with the files in the depot. So every change I made to the workspace before associating it to perforce is gone.
What I actually want to achieve is the other way around: submit all the files in my workspace to the depot. i.e. override the depot.
How can I do that?
If perforce says that you cannot submit because "Version in workspace is not latest version", then this means that you have a file open for edit that was already changed and submitted by someone else, i.e. you're working on an old version of the file.
You definitely should not try to force your (old) version on top of the newer one in the depot.
You really need to resolve. Perforce will not "override" all the files in your workspace and discard your changes.
For merging (resolving) you can use the eclipse built-in merge tool or the p4merge (from Perforce).
As other answers say, most of the time your best bet is to work with Perforce's workflow and check out the workspace in advance and make your changes there, rather than make changes first and create the workspace later. Sometimes, though, you really do need to break Perforce's workflow and override the changes in the depot. If you're going to do that, you need to be extra careful that you're not reverting something important. (Even on a one-person project you might have forgotten you checked something in, so look carefully at the diffs before submitting.)
The easiest thing to do is, when Perforce tells you that you have a conflict, resolve but keep your changes. In the Perforce documentation link, that's 'resolve, accept yours' rather than accepting what Perforce thinks is the sensible merge. From the command line, that's p4 resolve -ay. It's worded a little differently in the p4v GUI, and may be worded differently still in the Eclipse plugin (which I haven't actually worked with).
The other option, which you might use if you have files checked out from an earlier revision, and you want to update to the tip revision without making any changes, is to tell Perforce to update the metadata, so that it thinks you have a newer version of the file, without actually altering any of your files. From the command line, that's p4 sync -k (whatever you want to sync). This, too, can be dangerous if used inappropriately.
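If you want to script either of the two options above, a small sketch could look like this. It only builds the command lines; the strategy names and the depot path in the usage note are hypothetical, and the only Perforce flags used are the two quoted above (`p4 resolve -ay` and `p4 sync -k`).

```python
# Hypothetical strategy names mapped to the p4 flags quoted above.
P4_STRATEGIES = {
    "keep_mine": ["resolve", "-ay"],   # resolve, accepting your own version
    "mark_synced": ["sync", "-k"],     # update metadata only, leave files alone
}


def p4_command(strategy, filespec):
    """Build the argument list for one of the two approaches."""
    return ["p4", *P4_STRATEGIES[strategy], filespec]
```

For example, `p4_command("keep_mine", "//depot/project/...")` gives a list you could print for review or pass to `subprocess.run(..., check=True)`. Given the warnings above, look at the diffs before running either one for real.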
I don't think you can achieve what you want in Perforce. You will need to copy the files you've changed to a safe place and then resolve/revert all of the files to remove the conflict/out of date flag. Once done, copy your changes back and submit. It's a pain, but you should have connected with Perforce before making your changes.

Starting to version an already medium size project

I am about to start participating in the development of a medium-sized project (~50k lines) that was until now written by a single person, and not versioned; as a result folders are cluttered with different versions of the same file (named file1, file2, file3, etc.).
I proposed to start using a VCS for it (a priori Mercurial, which is the only one I've ever used -- for my personal projects --, but I'm open to suggestions), so I'm taking any good ideas as to how to "start" the repository. E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed? Or something else?
(constructive remarks on mercurial vs bazaar vs git vs whatever are also welcome.)
Thanks for your tips.
E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed?
If the size of the repository is not a concern, then yes, that is a good starting point. Otherwise you can just commit what's actually used, and go from there.
As for which system, all DVCSes stick to the same core principles. Which one you pick is entirely subjective — the only way to truly know which one you like is to try each one.
I would say use whatever you are most comfortable with that meets your needs. As for where to start, I personally would seed the repo with the current source as is; that way you can verify that everything builds and runs as expected. You can make this initial seed a branch. That way you can always go back to your starting point before refactoring.
My approach to this was:
Create a Mercurial repository in the existing project folder ("existing").
Commit all project files to "existing".
Create an empty repository in a different location ("new").
As files are tested and QA'd (this was necessary because there was so much dross in "existing"), pull them from "existing" to "new".
Once files have been pulled into "new", delete the corresponding files from "existing". If access is needed to these files while the migration is under way, push them back from "new" to "existing".
This gave me two advantages: everything was under some sort of control for recovery purposes, and I had control over how the project was introduced to the DVCS. Eventually everything in the existing project folder was tested and approved for the project moving forward. At this point the "existing" directory could be deleted or changed into a working folder, and "new" became the actual project folder.
I think Mercurial is a good choice. Lightweight, fast, very simple to use and well-integrated with Windows (if that's the platform you're dealing with).
I would probably get rid of all the clutter before the first commit. Delete everything you don't care about, run all the necessary tests and only then do the commit.
Yes, I'm dead set against the 0-day cluttering of repos.
Granted, a 50K SLOC project isn't very big, but if you commit files you already know you won't need, they will make your repo slightly bigger.
Also, remember to check that the tree doesn't contain large binary files. If it does, get rid of them if at all possible.
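A quick way to do that last check before the first commit is to walk the tree and list anything that is both large and binary. This is only a sketch: the function names, the 1 MB default limit, and the "contains a NUL byte in the first 8 KB" test for binary content are my own assumptions, not anything Mercurial prescribes.

```python
import os


def looks_binary(path, probe_bytes=8192):
    """Heuristic: treat a file as binary if its first chunk has a NUL byte."""
    with open(path, "rb") as handle:
        return b"\0" in handle.read(probe_bytes)


def find_large_binaries(root, limit_bytes=1_000_000):
    """Return (size, relative path) pairs for big binary files under root."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata directories.
        dirnames[:] = [d for d in dirnames if d not in {".hg", ".git", ".svn"}]
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
                if size > limit_bytes and looks_binary(full):
                    hits.append((size, os.path.relpath(full, root)))
            except OSError:
                continue  # unreadable file or broken link: ignore it
    return sorted(hits, reverse=True)
```

Running `find_large_binaries(".")` in the project folder gives you a list, largest first, to review before committing.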

Working with folders in RCS

I have been following the tutorial http://www.burlingtontelecom.net/~ashawley/rcs/tutorial.html on how to work with files using RCS. This works well but only with one file. Is there a way to create an RCS file with directories as well?
I have a project folder called myproject, and in this directory I have all my files for that project. I want to create a revision control system for the myproject folder and all its files that are inside.
As William's comment says, RCS only works with single files. (It also doesn't seem to be particularly suitable for multiple-user stuff.)
Of course, nothing stops you from putting each (source) file in a directory under RCS control; in fact, this is essentially what CVS does (though in recent versions it handles the RCS data itself, rather than invoking RCS to do it as it used to do). Unfortunately, this fragments the change history rather badly; a commit affecting many files ends up as separate commits to each file, which just happen to have the same commit message (and timestamp?), and in general every file will have a different revision in what the user might like to think of as the "same" revision. (This makes tags quite essential.) CVS also has issues with the atomicity of commits: you could end up with commit A and commit B getting tangled up, such that in file foo commit A precedes commit B, but in file bar commit B precedes commit A!
SVN (Subversion) is an attempt to rectify some of the problems in CVS, though it also brings some new limitations, and keeps many of the existing ones; it is probably wiser (as William implies) to just use a distributed version control system (DVCS) for your multi-file projects. There are many choices:
Darcs uses a unique patch-based model: a repository is treated as a sequence of patches, which can be applied to an empty tree to build the current revision; patches can often be reordered by "commuting" pairs of patches, and cherry-picking patches from other repositories is quite easy. The downside is that the change history is a bit less clear than in most DVCSes. See http://wiki.darcs.net/Using/Model, http://en.wikibooks.org/wiki/Understanding_Darcs/Patch_theory.
Directed-acyclic-graph (DAG) based DVCSes model a repository as a directed acyclic graph of revisions, where each revision can have one parent, two parents, or perhaps more. Each revision has an associated file tree state; sometimes renames are also tracked somehow.
Git, as already mentioned. Has a very simple model, but a very complicated interface: there are many commands, some of which are not really intended for humans to use (owing to many parts of it having been prototyped in shell script, probably), so it can be hard to find the ones you want. Also, its model might be a bit too simple: it doesn't track renames at all.
Bazaar (a.k.a. bzr) has a more complicated model, including support for file/directory renames. It's difficult to say how much more complicated, though, because whatever documentation may exist is not nearly as accessible as Git's. It does, however, have a rather simpler user interface, and there are a number of useful plugins, including a distributed-development-friendly SVN plugin: committing from a branch back to SVN need not interfere with the validity of others' branches of your branches, and bzr metadata is even committed back to SVN. Can make things much less painful if you want to start hacking on an SVN-based project without having commit access, but hope to get your changes committed eventually. Bazaar is my personal favorite DAG-based DVCS.
Mercurial (a.k.a. hg) seems fairly similar to Bazaar, though I think it tracks renames only for individual files, not for directories. It also supports plugins, though its SVN plugin isn't as nice as Bazaar's: it doesn't support lossless commits, so branching from other peoples' branches is unwise. I don't have much experience with it, so I can't really evaluate it in-depth.
As the comments already mention, if you are starting out with version control, you would be well advised to choose a newer system than RCS (git, mercurial, fossil, subversion, ...). That said, RCS still works fine for a single developer working primarily on a single machine - I still use it for my own code because I've not yet worked out how to get the (20+ years of) history I want into git in the way I want it.
Anyway, to use RCS, make sure you have an RCS sub-directory in each directory where you have working source code under RCS management. The RCS files will be placed in the sub-directory automatically, and retrieved automatically. If your version of make is not already aware of RCS, then you can train it so that it is - or get a version of make that is (GNU Make, for example).
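For a project like myproject with several nested directories, creating those RCS sub-directories by hand gets tedious. Here is a minimal sketch that creates one in every directory under a root; the function name is made up, and it assumes you want an RCS directory everywhere, which may be more than you need if only some directories hold source code.

```python
import os


def ensure_rcs_dirs(root):
    """Create an RCS sub-directory in root and every directory below it.

    Returns the relative paths of the directories that were created.
    """
    created = []
    for dirpath, dirnames, _filenames in os.walk(root):
        # Do not descend into existing RCS directories themselves.
        dirnames[:] = [d for d in dirnames if d != "RCS"]
        rcs_dir = os.path.join(dirpath, "RCS")
        if not os.path.isdir(rcs_dir):
            os.mkdir(rcs_dir)
            created.append(os.path.relpath(rcs_dir, root))
    return sorted(created)
```

Calling `ensure_rcs_dirs("myproject")` once is enough; running it again creates nothing new and returns an empty list.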
TL;DR - Look into DCVS as an alternative to RCS. It uses CVS, which uses RCS, but it's more modular for working in a repository that is distributed, as well as having a hierarchy of directories.
I'm currently going through a similar issue, and may have found something worthy of note, especially for people who are being forced to use a light, command-line based revision control system with multiple team members.
My manager will not get off this idea of using RCS as our version control. But for the specifications, he wants developers to be able to create and edit on their own repository on a localized server within our company. Two issues with this:
RCS does not create, nor hold any sort of 'repository'. It is software that keeps track of file edits, on a Per File Basis. Meaning that the 'repository' is nothing more than another directory with RCS checked-in files. This is sub-par for team-geared projects, to say the least.
On a large project with multiple directories and tens of individual working files, even the prospect of creating a top-level RCS directory with a symbolic link in the working directories gives rise to complications such as naming conventions, as well as forgetting which file came from which bottom-level / working directory.
With what SamB posted, even CVS gives additional problems with RCS that we now have to account for, but gives us a slight ability for some additional hierarchy. But one suggestion he forgot was DCVS.
It's nothing more than an extension of CVS, CVSup, and:
contains functionality to distribute CVS repositories with local lines of development and automatically handles synchronization of the distributed repositories in the background.

Using Subversion in Eclipse

I come from a Microsoft background in coding and thus have been used to Team Foundation Server and such for source control. Under TFS the files would check out by themselves in Eclipse and I would check them in when I was finished.
I have installed Subversion and the connector into Eclipse and have created my project with a local server.
In Subversion, do I have to check out a file when I need to change it? It doesn't change the RW permissions, so I am not sure what the procedure is.
So basically if I am using Subversion in Eclipse what is the procedure for checking out a file and checking it in? What buttons are clicked?
Thanks for any help!
No, you don't need to "check out" to enable editing a file in Subversion. Subversion does not use the same type of locking VSS does (and TFS, by the sound of it - though I haven't used TFS myself). The locking that svn uses is sometimes called optimistic locking. Here is the svn manual page on file sharing and locking with a lot more specific details.
In Subversion, you update your working copy as usual, then simply start editing any file in it, without locking out other users, and commit when ready. If no one has modified the file since you updated, the commit just goes through. If someone has, Subversion asks you to update first; the update merges their changes into your working copy automatically, provided the same lines were not modified, and you can then commit. If, however, someone else modified the same lines of the file as you, a conflict occurs, and a commit attempted in that state fails with a message along the lines of "one or more files are in conflict". The conflict must then be manually reviewed, eliminated, and marked as resolved, after which you retry the commit and it goes through (provided nothing else is in conflict).
Conflicts during every day work on a single branch are rare, which is why a lot of versioning systems use optimistic locking. Only when dealing with merging back and forth between branches do things sometimes get more involved.
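The "same lines" rule can be pictured with a toy model. This is only an illustration of the idea, not how Subversion is implemented: real merges work on diff hunks with surrounding context rather than bare line numbers, and the function name is made up.

```python
def merge_outcome(my_lines, their_lines):
    """Classify two sets of edited line numbers made against the same base.

    Returns ("merged", []) when the edits touch different lines, or
    ("conflict", [line numbers]) when both sides edited the same lines.
    """
    overlap = set(my_lines) & set(their_lines)
    if overlap:
        return ("conflict", sorted(overlap))
    return ("merged", [])
```

So if you edited lines 3 and 4 and a colleague edited line 10, the two changes combine cleanly; if the colleague edited lines 4 and 5 instead, line 4 needs a manual decision.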
Typically I would check out the entire project, make my changes, and then use the team sync view to review my code changes and commit from there. Right-clicking is the key (see screenshots).
A great walk through on the basics can be viewed here.
Most of your actions will reside under the Team menu, where you can commit, add, etc.
I used to use Subversion with Eclipse. Now I use Subversion with VSS. In both situations I've found I prefer, most of the time, to use TortoiseSVN for all my operations with the repository. Not so much an answer as an opinion.

Project files under version control?

I work on a large project where all the source files are stored in version control except the project files. This was the lead developer's decision. His reasoning was:
It's too time consuming to reconcile the differences among developers' working directories.
It allows developers to work independently until their changes are stable.
Instead, a developer initially gets a copy of a fellow developer's project files. Then when new files are added each developer notifies all the rest about the change. This strikes me as far more time consuming in the long run.
In my opinion the supposed benefits of not tracking changes to the project files are outweighed by the danger. In addition to references to its needed source files each project file has configuration settings that would be very time consuming and error prone to reproduce if it became corrupted or there was a hardware failure. Some of them have source code embedded in them that would be nearly impossible to recover.
I tried to convince the lead that both of his reasons can be accomplished by:
Agreeing on a standard folder structure
Using relative paths in the project files
Using the version control system more effectively
But so far he's unwilling to heed my suggestions. I checked the svn log and discovered that each major version's history begins with an Add. I have a feeling he doesn't know how to use the branching feature at all.
Am I worrying about nothing or are my concerns valid?
Your concerns are valid. There's no good reason to exclude project files from the repository. They should absolutely be under version control. You'll need to standardize on a directory structure for automated builds as well, so your lead is just postponing the inevitable.
Here are some reasons to check project (*.*proj) files into version control:
Avoid unnecessary build breaks. Relying on individual developers to notify the rest of the team every time they add, remove or rename a source file is not a sustainable practice. There will be mistakes and you will end up with broken builds and your team will waste valuable time trying to determine why the build broke.
Maintain an authoritative source configuration. If there are no project files in the repository, you don't have enough information there to reliably build the solution. Is your team planning to deliver a build from one of your developer's machines? If so, which one? The whole point of having a source control repository is to maintain an authoritative source configuration from which you build and deliver releases.
Simplify management of your projects. Having each team member independently updating their individual copies of your various project files gets more complicated when you introduce project types that not everyone is familiar with. What happens if you need to introduce a WiX project to generate an MSI package or a Database project?
I'd also argue that the two points made in defense of this strategy of not checking in project files are easily refuted. Let's take a look at each:
It's too time consuming to reconcile the differences among developers' working directories.
Source configurations should always be set up with relative paths. If you have hard-coded paths in your source configuration (project files, resource files, etc.) then you're doing it wrong. Choosing to ignore the problem is not going to make it go away.
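If you want to find out how many hard-coded paths you actually have, a rough scan of the project file text is enough to start with. The sketch below is a heuristic of my own, not a feature of any build tool: it looks for values that begin right after a quote or a closing angle bracket and start with a drive letter, a UNC prefix, or a forward slash.

```python
import re

# Matches Windows drive paths (C:\...), UNC paths (\\server\...) and
# Unix-style absolute paths that start an attribute value or element text.
ABSOLUTE_PATH = re.compile(r'(?<=[">])(?:[A-Za-z]:[\\/]|\\\\|/)[^"<]*')


def find_absolute_paths(project_text):
    """Return every absolute-looking path found in the project file text."""
    return ABSOLUTE_PATH.findall(project_text)
```

Anything it reports is a candidate for rewriting as a path relative to the project file; an empty list suggests the file is already safe to share between working directories.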
It allows developers to work independently until their changes are stable
No, using version control lets developers work in isolation until their changes are stable. If you each continue to maintain your own separate copies of the project files, as soon as someone checks in a change that references a class in a new source file, you've broken everyone on the team until they stop what they're doing and carefully update their project files. Compare that experience with just "getting latest" from source control.
Generally, a project checked out of SVN should be working, or there should be tools included to make it work (e.g. autogen.sh). If the project file is missing or you need knowledge about which files should be in the project, there is something missing.
Automatically generated files should not be in SVN, as it is pointless to track the changes to these.
Project files with relative paths belong under source control.
Files that don't: for example, in .NET, I would not put the .suo (user options) file or web.config (or app.config) under source control. You may have developers using different connection strings, etc.
In the case of web.config, I like to put a web.config.example in. That way you copy the file to web.config upon initial checkout and tweak what settings you'd like. If you add something that needs to be added to all web.config, you merge those lines into the .example version and notify the team to merge that into their local version.
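The "copy on initial checkout" step is easy to automate. A minimal sketch, assuming the template sits next to the real file and is named by appending `.example` as described above; the function name is made up.

```python
import os
import shutil


def ensure_local_config(project_dir, name="web.config"):
    """Copy <name>.example to <name> unless a local copy already exists.

    Returns True if the file was created, False if it was left alone.
    """
    target = os.path.join(project_dir, name)
    template = target + ".example"
    if os.path.exists(target):
        return False  # keep the developer's existing local settings
    shutil.copyfile(template, target)
    return True
```

It never overwrites an existing web.config, so merging new lines from the .example version into a local copy stays a manual step, as described above.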
I think it depends on the IDE and configuration of the project. Some IDEs have hard-coded absolute paths and that's a real problem with multiple developers working on the same code with different local copies and configurations. Avoid absolute path references to libraries, for example, if you can.
In Eclipse (and Java), it's fine to commit .project and .classpath files (so long as the classpath doesn't have absolute references). However, you may find that using tools like Maven can help having some independence from the IDE and individual settings (in which case you wouldn't need to commit .project, .settings and .classpath in Eclipse since m2eclipse would re-create them for you automatically). This might not apply as well to other languages/environments.
In addition, if I need to reference something really specific to my machine (either configuration or file location), I tend to have my own local branch in Git which I rebase when necessary, committing only the common parts to the remote repository. Git diff/rebase works well: it tends to be able to work out the diffs even if the local changes affect files that have been modified remotely, except when those changes conflict, in which case you get the opportunity to merge the changes manually.
That's just a terrible idea. With a setup like that, I can have a perfectly working project containing files that are subtly different from everyone else's. Imagine the havoc this would cause if someone accidentally propagates this mess into QA and everyone is trying to figure out what's going on. Imagine the catastrophe that would ensue if it ever got released to the production environment...!