Should I keep binary assets under TFS? How? - version-control

Our product is game-like, and is very rich (~40M - 100M) in binary supporting files - textures, meshes, movies etc. Like kai1968, I'd like to be able to sync-in these assets, and not just code, with a single click.
Strictly speaking, however, that is different than version control: I have no desire to burden our TFS with irrelevant history of these files. Can I somehow upload stuff without keeping history to TFS? It would be even better if I could opt to keep history at specific points (say, label points), and not in every checkin.
More generally, how do you manage sync of binary assets?
(I'm aware of other tools, perhaps better suited for such tasks, but diverging - or altogether migrating - from TFS is not an option right now.)

We've always kept binary assets in TFS when we need to, and just dealt with the side-effects of that choice (extra storage, longer check-ins because you can't diff on binaries, etc). I don't believe there's a way to selectively destroy the history of certain files, except manually. If you want to do this periodically, by hand, you could do the following:
Get a current copy of the binary files
Destroy (delete with history) the binary copies in TFS
Manually add the files back to TFS
You'd have only the most recent copy, but this has a side-effect - you'd break any previous builds, since an attempt to retrieve source history wouldn't return these new copies of the files. TFS would check for a copy that matches the checkout you're attempting, and finding none, it wouldn't retrieve a copy of those files. You'd need to update your build scripts to pull the most recent binaries, as well as the historical code, if you wanted to build an old version, but even then, it won't be a true history.
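For anyone who wants to script the three manual steps above, here is a rough sketch that only builds the tf.exe command lines and prints them by default. The server path, local folder and comment are hypothetical, and the switches should be checked against your TFS version before running anything for real, since a destroy cannot be undone.

```python
import subprocess
from typing import List


def build_tf_commands(server_path: str, local_dir: str, comment: str) -> List[List[str]]:
    """Return the tf.exe calls for: get latest, destroy, add back, check in."""
    return [
        ["tf", "get", server_path, "/recursive"],
        # Destroy removes the items AND their history from the server.
        ["tf", "destroy", server_path, "/noprompt"],
        ["tf", "add", local_dir, "/recursive"],
        ["tf", "checkin", local_dir, "/recursive", f"/comment:{comment}", "/noprompt"],
    ]


def reset_history(server_path: str, local_dir: str, comment: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order."""
    for command in build_tf_commands(server_path, local_dir, comment):
        if dry_run:
            print(" ".join(command))
        else:
            # Assumption: a copy of local_dir is kept outside the workspace
            # before this runs, in case the destroy affects the local files.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    reset_history("$/Game/Assets", r"C:\ws\Game\Assets", "Re-add assets without history")
```

Leaving dry_run at its default lets you review the exact commands before committing to the destroy.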
The second option is to only check them in periodically - not with every single minor change. For example, keep these files somewhere safe (a file share with daily backups), and then only check in the changed binaries every week or so, or before every label, or whatever - this way, you don't have incremental history, but you'd still have your label history. You might even consider writing some kind of automated routine to apply labels, where it would check in any changes in that folder first, then apply the label.
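The "check in any changes in that folder first" part of such a routine could start from something like the sketch below. It is only an illustration: it hashes every file in the asset folder and compares against a manifest saved at the previous label, so you know which binaries to check in. The manifest file and function names are made up for this example, and the manifest has to live outside the asset folder.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def hash_tree(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads the whole file into memory; fine for a sketch, but
            # very large assets would be better hashed in chunks.
            digests[path.relative_to(root).as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_since(root: Path, manifest_file: Path) -> List[str]:
    """List files that are new or modified since the manifest was saved."""
    old = json.loads(manifest_file.read_text()) if manifest_file.exists() else {}
    new = hash_tree(root)
    return sorted(name for name, digest in new.items() if old.get(name) != digest)


def save_manifest(root: Path, manifest_file: Path) -> None:
    """Record the current state, e.g. right after applying a label."""
    manifest_file.write_text(json.dumps(hash_tree(root), indent=2))
```

Deleted files are not reported here; a real routine would need to handle those as well before labeling.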
Please post back what you end up doing - I'm curious to know!

Here are a few thoughts:
Consider using a separate VSTS project, so you don't mix the binaries and code in the same project. This makes it a bit easier to manage (e.g. you can keep the assets separate, and also any work items relating to them are more easily queried by filtering on the project). On the down side, this would mean 2 clicks to get latest.
Why don't you want to keep histories? The point of source control is to keep history so you can go back to a particular build for a particular day. Otherwise, you might as well just use a backup program on a network drive (and you really don't want to do that!)
If you're only worried about disk space usage, then don't. 100MB is tiny, and hard drives are cheap. My last game project had hundreds of gigabytes of assets and we kept the history of every change for over 3 years.
The assets won't slow anything down. They only take time to process if you check them in or Get them, which are both activities you will need to do even if you don't use source control. Indeed, source control makes things faster because you have a "one click does it all" solution.
The many other benefits of source control are really useful on assets, and vastly outweigh the negatives.

Related

Version control personally and simply?

Requirement
Keep history for web text/code source files.
I am the only user, i.e. personal usage.
Automatically save history for each updated file (not necessarily immediately, but at least once per week).
It must be a simple way to start and work.
I have 3 workplaces, so I need to sync files between them.
(Not a must, but hopefully for a future working environment) Any other non-engineer can also understand where the history files are and view them easily.
Current way:
Each day I create a history folder, download the files into it for editing, and copy files when I edit or create one.
Advantage of the current way:
Very quick and simple, no need to do additional task to make history
Disadvantage of the current way:
Messy. Every day I work, I create a new history folder to keep the downloaded files, so it gets messy in Finder (or Windows Explorer).
Also, I don't have a reliable way to sync files with my other workplaces.
I tried Git before. I had thought Git would automatically record the files I edit and save in an editor, but that was not the case. Also, Git is too complicated to start using. If you recommend Git, please show me how to deal with the problems I had, for instance a simple Git GUI with limited options, without merging/projects/branches etc., since this is personal usage for maintaining just one website.
Do you know any way to do version control personally and simply?
Thanks.
Suppose you entered <form ...> in your HTML (without the closing tag) and saved the file; do you really think the commit created by our imaginary VCS, having picked up that file's update event, would make any sense?
What I mean is that, as with writing programs¹,
the history of source code changes is there for humans to read,
and for that matter, a good history graph should really read like prose:
each commit should be atomic in the sense that it comprises one (small) but
internally integral feature or fixes a bug, and has to be properly annotated
so that the intent of the change captured by that commit is clear.
What you want instead is just some dumb stream of changes purely for backup purposes.
Well, if you're fully aware of the repercussions (the most glaring one is that the generated history is completely useless for doing development on
the project and can only be used for rollbacks in case of "oopsies"),
there are two ways to go:
Some IDEs (namely, Eclipse) save a backup copy of each file they manage
on each save, thus providing you with such a rollback functionality w/o
using any VCS.
Script around any VCS you like: say, on Linux,
you start something like inotifywait telling it to watch your
project's root directory, recursively, for write events on files,
read whatever the tool prints to its stdout when these events happen,
and for each event, call your VCS of choice to record a new commit
with these changes.
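To give an idea of the second option, here is a minimal sketch. Note that it swaps inotifywait for plain polling (comparing file modification times and sizes every few seconds) so that it is not tied to Linux, and it assumes Git as the VCS and an already initialized repository in the project root. The function names and the commit message are invented for this example.

```python
import subprocess
import time
from pathlib import Path
from typing import Dict, Tuple

Snapshot = Dict[str, Tuple[int, int]]


def take_snapshot(root: Path) -> Snapshot:
    """Record (mtime, size) for every file under root, skipping .git."""
    snapshot = {}
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if path.is_file() and ".git" not in relative.parts:
            stat = path.stat()
            snapshot[relative.as_posix()] = (stat.st_mtime_ns, stat.st_size)
    return snapshot


def auto_commit(root: Path, interval_seconds: float = 30.0) -> None:
    """Poll forever and record a commit whenever something changed."""
    previous = take_snapshot(root)
    while True:
        time.sleep(interval_seconds)
        current = take_snapshot(root)
        if current != previous:
            subprocess.run(["git", "add", "-A"], cwd=root, check=True)
            # No check=True: git exits non-zero when there is nothing to
            # commit, e.g. if only a timestamp changed.
            subprocess.run(["git", "commit", "-m", "Automatic snapshot"], cwd=root)
            previous = current
```

As said above, the history this produces is good for rollbacks only; every commit carries the same meaningless message.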
¹ «Programs must be written for people to read, and only incidentally for machines to execute.» — Abelson & Sussman, "Structure and Interpretation of Computer Programs", preface to the first edition.
I strongly suggest you take a deeper look at git.
It may look difficult at the beginning, but you should spend some time learning it, that's all. All the problems above could be easily solved if you spend some time learning the basics. There is also a nice "tutorial" on github on how to use git, no need to install anything: https://try.github.io/levels/1/challenges/1.

Website may have up to 10 files with same name/purpose, no version control

I am new at working for a large company with various people working on the same files. Sadly we don’t have version control and I often find myself cross eyed. For lack of better terminology, we have a dev site, quality-assurance site, and the live site. We have most files in two languages. Since the network connected drives have an average transfer rate of 15kb/sec we often copy the files locally before working on them. Also contractors send us new versions of files, but we may have made changes on our side and everything gets screwed up.
Basically I’m working with 6-10 files with the same name and same purpose. Does anyone have any tips on how I can keep them straight? I use Beyond Compare 2 to see the differences, but a program that compares the time stamps of all the files to see which is most current might help.
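For the time stamp comparison, a few lines of script may already be enough. The sketch below is just one way to do it, assuming each site or local copy lives under its own root folder with the same relative layout; the folder names in the usage note are made up. Keep in mind that copying a file can reset its modification time, so treat the result as a hint and confirm with Beyond Compare.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def copies_by_age(roots: Iterable[Path]) -> Dict[str, List[Path]]:
    """Group files by relative path across roots, newest copy first."""
    found: Dict[str, List[Path]] = {}
    for root in roots:
        for path in root.rglob("*"):
            if path.is_file():
                found.setdefault(path.relative_to(root).as_posix(), []).append(path)
    for copies in found.values():
        # Sort by last-modified time, most recent first.
        copies.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return found
```

Calling it with something like copies_by_age([Path("dev"), Path("qa"), Path("live"), Path("local")]) gives, for each file name, the list of copies with the most recently modified one at index 0.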
Thoughts:
1) Get version control system (Git), otherwise you will continue to have more and more pain.
2) Create a includes/lib folder and reduce that 6-10 files down (to 1).
I'd suggest you take the lead: put your code in version control and push your team to move to the new repository. It'll make everybody's life easier and, most importantly, reduce the chance of merge errors.
Assuming you cannot convince the powers that be to actually use source code control, why not try using Mercurial purely locally. Hopefully you can insulate yourself from some of the noise. You could even make fake users for the contractors and commit & push those changes as though they were actually doing it.
It shouldn't be too hard to get a bureaucrat to see how nice a good gatekeeper like Mercurial or Git would be. It's kind of like helpful red tape!

Starting to version an already medium size project

I am about to start participating in the development of a medium-sized project (~50k lines) that was until now written by a single person, and not versioned; as a result folders are cluttered with different versions of the same file (named file1, file2, file3, etc.).
I proposed to start using a VCS for it (a priori Mercurial, which is the only one I've ever used -- for my personal projects --, but I'm open to suggestions), so I'm taking any good ideas as to how to "start" the repository. E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed? Or something else?
(constructive remarks on mercurial vs bazaar vs git vs whatever are also welcome.)
Thanks for your tips.
E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed?
If the size of the repository is not a concern, then yes, that is a good starting point. Otherwise you can just commit what's actually used, and go from there.
As for which system, all DVCSes stick to the same core principles. Which one you pick is entirely subjective — the only way to truly know which one you like is to try each one.
I would say use what you are most comfortable with and what meets your needs. As far as where to start, I personally would seed the repo with the current source as is, so that you can verify that everything builds and runs as expected. You can make this initial seed a branch. That way you can always go back to your starting point before refactoring.
My approach to this was:
create a Mercurial repository in the existing project folder ("existing")
commit all project files to "existing"
create an empty repository in a different location ("new")
As files are tested and QA'd (this was necessary because there was so much dross in "existing") pull them from "existing" to "new".
Once files have been pulled into "new", delete the corresponding files from "existing". If access is needed to these files while the migration is under way, push them back from "new" to "existing".
This gave me the advantage of putting everything under some sort of control for recovery purposes, and control over introducing the project to the DVCS. Eventually the existing project folder became completely tested and approved for the project moving forward. At this point the "existing" directory could be deleted or changed into a working folder, and "new" became the actual project folder.
I think Mercurial is a good choice. Lightweight, fast, very simple to use and well-integrated with Windows (if that's the platform you're dealing with).
I would probably get rid of all the clutter before the first commit. Delete everything you don't care about, run all the necessary tests and only then do the commit.
Yes, I'm dead set against the 0-day cluttering of repos.
Granted, a 50K SLOC project isn't very big, but if you commit files you already know you won't need, they will make your repo slightly bigger.
Also, remember to check that the tree doesn't contain large binary files. If it does, get rid of them if at all possible.
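A quick way to do that check before the first commit is to list the biggest files in the tree. This is a small sketch; the 1 MB default threshold is an arbitrary assumption, so pick whatever counts as "large" for your project.

```python
from pathlib import Path
from typing import List, Tuple


def large_files(root: Path, min_bytes: int = 1_000_000) -> List[Tuple[int, str]]:
    """Return (size, relative path) for files of at least min_bytes, largest first."""
    hits = []
    for path in root.rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size >= min_bytes:
                hits.append((size, path.relative_to(root).as_posix()))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    for size, name in large_files(Path(".")):
        print(f"{size:>12}  {name}")
```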

Is there a version control system that is completely invisible to use while writing code?

I would like to use version control but I don't want to continuously commit several times an hour. Is there a version control system that records everything while you program so you don't have to commit, but still lets you go back to a previous state of your code?
Dropbox can do it. It records every change that you make.
You can do something like this if you're using the ZFS filesystem.
However, from a version-control point of view, I really don't think it's a good idea to store every change. The size of your repository will become huge really fast.
And FYI, you don't have to commit several times an hour, I rarely do more than 4-5 commits a day.
Some operating systems make mounting WebDAV shares into the filesystem very easy; you could configure an SVN server to export WebDAV, mount the export into your filesystem, and get to work.
Don't forget to configure your editor to store temporary files or backup files somewhere other than your source tree or current working directory. Otherwise you'll have a ton of useless files cluttering up your source control system, making it harder to use in the future.
But finding which version to revert to can be pretty difficult without check in comments or changesets linking related changes together; it might not be worth the effort of configuring the entire system if it is too difficult to use to undo specific changes.

The theory (and terminology) behind Source Control

I've tried using source control for a couple of projects but still don't really understand it. For these projects, we've used TortoiseSVN and have only had one line of revisions. (No trunk, branch, or any of that.) If there is a recommended way to set up source control systems, what is it? What are the reasons and benefits for setting it up that way? What are the underlying differences between the workings of a centralized and a distributed source control system?
Think of source control as a giant "Undo" button for your source code. Every time you check in, you're adding a point to which you can roll back. Even if you don't use branching/merging, this feature alone can be very valuable.
Additionally, by having one 'authoritative' version of the source control, it becomes much easier to back up.
Centralized vs. distributed... the difference is really that in distributed, there isn't necessarily one 'authoritative' version of the source control, although in practice people usually still do have the master tree.
The big advantage to distributed source control is two-fold:
When you use distributed source control, you have the whole source tree on your local machine. You can commit, create branches, and work pretty much as though you were all alone, and then when you're ready to push up your changes, you can promote them from your machine to the master copy. If you're working "offline" a lot, this can be a huge benefit.
You don't have to ask anybody's permission to become a distributor of the source control. If person A is running the project, but person B and C want to make changes, and share those changes with each other, it becomes much easier with distributed source control.
I recommend checking out the following from Eric Sink:
http://www.ericsink.com/scm/source_control.html
Having some sort of revision control system in place is probably the most important tool a programmer has for reviewing code changes and understanding who did what to whom. Even for single person projects, it is invaluable to be able to diff current code against previous known working version to understand what might have gone wrong due to a change.
Here are two articles that are very helpful for understanding the basics. Beyond being informative, Sink's company sells a great source control product called Vault that is free for single users (I am not affiliated in any way with that company).
http://www.ericsink.com/scm/source_control.html
http://betterexplained.com/articles/a-visual-guide-to-version-control/
Vault info at www.vault.com.
Even if you don't branch, you may find it useful to use tags to mark releases.
Imagine that you rolled out a new version of your software yesterday and have started making major changes for the next version. A user calls you to report a serious bug in yesterday's release. You can't just fix it and copy over the changes from your development trunk because the changes you've just made leave the whole thing unstable.
If you had tagged the release, you could check out a working copy of it and use it to fix the bug.
Then, you might choose to create a branch at the tag and check the bug fix into it. That way, you can fix more bugs on that release while you continue to upgrade the trunk. You can also merge those fixes into the trunk so that they'll be present in the next release.
The common standard for setting up Subversion is to have three folders under the root of your repository: trunk, branches and tags. The trunk folder holds your current "main" line of development. For many shops and situations, this is all they ever use... just a single working repository of code.
The tags folder takes it one step further and allows you to "checkpoint" your code at certain points in time. For example, when you release a new build or sometimes even when you simply make a new build, you "tag" a copy into this folder. This just allows you to know exactly what your code looked like at that point in time.
The branches folder holds different kinds of branches that you might need in special situations. Sometimes a branch is a place to work on an experimental feature or on features that might take a long time to get stable (therefore you don't want to introduce them into your main line just yet). Other times, a branch might represent the "production" copy of your code, which can be edited and deployed independently from your main line of code, which contains changes intended for a future release.
Anyway, this is just one aspect of how to set up your system, but I think giving some thought to this structure is important.
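With the trunk/branches/tags layout described above, creating a tag or a branch in Subversion is a server-side copy of trunk. The helper below is only a sketch that builds the svn copy command line; the repository URL and the tag name in the usage note are hypothetical.

```python
from typing import List


def svn_copy_command(repo_url: str, kind: str, name: str, message: str) -> List[str]:
    """Build the svn command that copies trunk into tags/ or branches/."""
    if kind not in ("tags", "branches"):
        raise ValueError("kind must be 'tags' or 'branches'")
    base = repo_url.rstrip("/")
    return ["svn", "copy", f"{base}/trunk", f"{base}/{kind}/{name}", "-m", message]
```

For example, svn_copy_command("http://svn.example.com/repo", "tags", "release-1.0", "Tag release 1.0") yields the arguments for tagging a release, which you could pass to subprocess.run or simply type at the command line.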