How to completely restore a repository history that uses LFS?

I am very confused about how this all works, so I am going to ask a series of questions.
So I am almost at the end of my final degree project and I have been using GitHub for version control. At some point, I had to store large files (>100 MB) and got a warning that they exceed GitHub's file size limit, with an option to "commit anyway".
My first question is: what actually happens if I click "commit anyway"? Does it mean that I can't commit anymore?
Anyway, I did some research about LFS and eventually installed it in my repo (by the way, this is a Unity project). I followed this video: https://www.youtube.com/watch?v=09McJ2NL7YM&t=615s. The author suggests using this custom .gitattributes: https://gist.github.com/nemotoo/b8a1c3a0f1225bb9231979f389fd4f3. It automatically tracks all files with certain extensions and pushes them to LFS. At the time I thought this was cool, until I realised that this file made me push all tracked files no matter how big or small they are. What I should have done was track files with LFS from the command line only when I got the message above (so only for files >100 MB). Since my whole degree depends on this project, I did not want to mess with GitHub and spend time trying to "fix it", but I want to know: is there any way to restore the whole history and make it as if I had never used LFS?
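For reference, the narrower setup I am describing would have looked something like this (the file names are made up, just to show the idea):

    # instead of tracking whole extensions, track only the specific
    # oversized assets (hypothetical paths):
    git lfs install
    git lfs track "Assets/Scenes/BigLevel.unity"
    git lfs track "Assets/Textures/Skybox8K.psd"

    # git lfs track writes the patterns into .gitattributes,
    # which has to be committed as well:
    git add .gitattributes
    git commit -m "Track only the >100 MB files with LFS"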
Lastly, since I have loads of files stored in LFS, I understand that whenever someone re-clones my repo, they can just do git lfs fetch --all and then git lfs pull (and this uses up my bandwidth allowance, right?). But what happens if someone decides to just download the project ("Download ZIP")? Well, I have tried it, and all those files are missing completely. Is there a way to download the project with the original files instead of pointers?
Also, if you exceed the free 1 GB data pack that GitHub provides and stop paying for additional storage, do you lose all those files?
At some point in the future, I would like to remove LFS and, if I have to, only store files >100 MB in it (I think there are just 2 in total). But would that still mean that the only way to get a complete version of the project is to clone the repo instead of downloading it?
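From what I have read (this is me guessing from the docs, not something I have tried yet), git lfs migrate export is the tool for undoing LFS, at the cost of rewriting history:

    # make sure every LFS object exists locally first
    git lfs fetch --all

    # rewrite all branches so pointers become ordinary files again;
    # WARNING: this rewrites history, so back the repo up first
    git lfs migrate export --everything --include="*"
    git push --force --all origin

    # caveat: any file still over 100 MB would hit GitHub's hard limit
    # again on push, so the two big files would have to stay in LFS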
Sorry for the long question but I really need to understand these things.

Related

Upload a complete project vs. completing it while uploaded (GitHub)

For those who have already uploaded their repositories and projects to GitHub: is it easier to upload first and then finalize the project, or to upload a "final" version only (not taking into account future bug fixes, if any)?
I am thinking of how easy it is to later substitute the files in an existing repository if the project is already on GitHub: it seems like a hassle to delete all the files and re-upload them again. Or is it a case of "the more commits, the better"?
Upload first: you can then update your files locally, commit and push.
Git will detect any change, addition or deletion locally; you make a new commit, which is then pushed to your repository.
So no "hassle" involved.
The "more commit, the better" is because you can follow evolution in your code, possibly get back to a previous state, or add fixes.
Plus, it is a good way to save an intermediate state of your project.

Which VCS do 3D modellers use?

Which VCS do 3D modellers use? For instance, in Blender or 3ds Max.
Like any project, it is the choice of the person starting it. Subversion and git are two popular choices, and each has its strong points. It would be hard to say one is more popular than the other.
There are two points I would highlight in making your decision -
Disk Usage - multimedia projects often use large files. git is a distributed version control system, which means every user checking out a copy gets the entire repository; this can lead to a lot of extra disk usage for large projects. svn keeps all the revisions on the server, and each user gets two copies of each file, one pristine and one working, so that comparisons can be made locally.
This also extends to svn being able to check out a subdirectory of a project, while git needs to copy the entire repo. While recent git versions can check out a subset of the working files (see the sketch below), the whole revision history is still copied locally.
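Roughly, the difference looks like this on the command line (the repository URL and subdirectory name are invented for the example):

    # svn: check out just one subtree of the project
    svn checkout https://example.com/repo/trunk/assets

    # git: partial clone plus sparse checkout; file contents are fetched
    # on demand, but the full commit history still comes along
    git clone --filter=blob:none https://example.com/repo.git
    cd repo
    git sparse-checkout set assets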
Checking out previous revisions - git uses a unique hash string to identify each revision, while svn uses a numerical sequence. This makes it easier in svn to check out the previous revision, or the one five revisions earlier. To get an earlier revision from git, you need to list the history and copy a seemingly random hash string. At least, that is the case when using the CLI; GUI apps can make this easier for both.
This extends to discussions as well: svn users can say "I have revision 125" and "I have 122" and know immediately that someone is way behind or just missing one update.
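A rough side-by-side of the two, with made-up revision numbers and hash:

    # svn: revisions are sequential integers
    svn update -r 122               # jump the working copy to revision 122
    svn log -r 120:125              # inspect a numeric range

    # git: revisions are hashes, so you usually list history first
    git log --oneline               # find the commit you want
    git checkout 3f2a9c1            # check it out by (abbreviated) hash
    git checkout HEAD~5             # or address it relatively: five commits back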

Unexpected overwriting CVS/SVN repository files in Mac Eclipse

I worked with Eclipse CVS on Windows, and CVS did not allow me to overwrite the latest revisions – I needed to update first. At the same time, a developer working on a Mac constantly overwrote my files. We looked into this problem and found that his Eclipse CVS plugin allowed overwriting the latest revisions without any warning.
Now I work on a Mac myself, using the Eclipse SVN plugin, and I accidentally overwrote the latest revisions from my co-developer. How can I prevent this overwriting? And if it does happen, what is the graceful way of reverting to the previous revisions and committing them back to the repository?
Wait, something is not right here...
CVS and Subversion will never let you overwrite someone else's changes. The whole purpose of version control is to allow multiple people to work on the same files at the same time.
There are two ways version control systems do this:
Checkout and Lock: The oldest systems used a checkout-and-lock model. You checked out the code to make changes, and no one else was allowed to check it out until you checked your changes back in. The problem is that someone could check out files for a week and forget to check them back in, or go on vacation. Then everyone else is stuck, unable to work.
Checkout, and the first person who commits wins: In this model, two people can check out the same file and do their work. However, the first person who finishes their changes and commits wins. The other person must do an update, which incorporates the first person's changes into their working copy, before they can commit their own. This is what Subversion and CVS do.
So, how in the world are you losing your changes? Or, how are you overwriting the other person's changes?
Sometimes this happens if you are sharing your checked-out working copy with other people. This is wrong and should never be done. Instead, each user should have their own separate, independent copy of the project (heck, you can even have multiple copies if you want). When your partner checks in their changes, it shouldn't affect your files.
What will happen is that when you try to commit your changes, you will be told that your working copy is out of date. You'll have to update your working copy, which will incorporate your partner's changes into it. You should then verify that everything is okay, and commit your working copy, which will now include both your changes and your partner's.
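Concretely, the sequence at the command line looks something like this (the commit messages are invented):

    svn commit -m "My change"       # rejected: working copy is out of date
    svn update                      # merges your partner's revision into your
                                    # copy; resolve any conflicts it reports
    svn status                      # verify the merged state
    svn commit -m "My change, merged with my partner's"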
Does this answer your question? Are you all sharing the same directory, or do you each have your own working copy? Is there something else going on?

GitHub and Dropbox conflict risk?

I have a dev folder with all my projects. Some of these are on GitHub and some are not. I also use Dropbox (with symlinks) to keep my data synchronised across several computers.
For example if I add something to my Documents folder on one PC I can then see it in the corresponding folder on another PC.
My question is: if I do the same with my dev folder (so the dev folder is synced by Dropbox on both PCs), will it cause problems with my pushing to GitHub?
You don't ever want to mix code versioning strategies. Either all of your code lives in git (which is a good idea), or it all lives in Dropbox (which doesn't give you any history, hence a very bad idea).
When you add a source file to git, you should then push it to GitHub so it can be pulled at a later date.
I get the feeling that you will run into issues when pushing the code - you'd be adding new files through one channel but pulling them through another - and it would turn into more of a headache than a benefit.
I'm not sure exactly how you could "prove" that it is OK, but I have used exactly this development model with no issues. Personally, I don't use symlinks in my Dropbox, but that shouldn't affect anything. All of my git repos are in my Dropbox, and I've been working this way for over a year across OS X, Windows, and Ubuntu. All of my commits and pushes have worked just fine.
Also, this may be a repeat of this question: Using Git and Dropbox together effectively?
[edit:]
Actually, one thing that was recently brought to my attention is that you might run into issues with line endings across systems. This post from GitHub (with a link back to an SO question) explains how to deal with line endings.
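The commonly recommended settings are along these lines (a sketch of the usual advice, not a quote from that post):

    # macOS/Linux: convert CRLF to LF on commit, leave files alone on checkout
    git config --global core.autocrlf input

    # Windows: store LF in the repo, check out CRLF in the working tree
    git config --global core.autocrlf true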
I had the same question, and now my answer is: simply move your repository out of Dropbox.
As you can see, Using Git and Dropbox together effectively? is not quite the same question, but if you search for the keyword "GitHub" there, you will see the debate around this confusion, and you can then make your own decision.
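If you do go that route, the move itself is simple (the paths here are hypothetical):

    # move the working copy out of the synced folder; the history and
    # remotes travel with the .git directory, so nothing else changes
    mv ~/Dropbox/dev/myproject ~/dev/myproject
    cd ~/dev/myproject
    git remote -v                   # confirm the GitHub remote is still set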

Starting to version an already medium size project

I am about to start participating in the development of a medium-sized project (~50k lines) that was, until now, written by a single person and not versioned; as a result, the folders are cluttered with different versions of the same file (named file1, file2, file3, etc.).
I proposed to start using a VCS for it (a priori Mercurial, which is the only one I've ever used, and only for my personal projects, but I'm open to suggestions), so I'm taking any good ideas on how to "start" the repository. E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed? Or something else?
(constructive remarks on mercurial vs bazaar vs git vs whatever are also welcome.)
Thanks for your tips.
E.g., should I make an initial commit with all the existing files, and immediately make a new commit with the unused files removed?
If the size of the repository is not a concern, then yes, that is a good starting point. Otherwise you can just commit what's actually used, and go from there.
As for which system, all DVCSes stick to the same core principles. Which one you pick is entirely subjective — the only way to truly know which one you like is to try each one.
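In Mercurial, that two-commit start would look roughly like this (the file names are placeholders):

    cd project/
    hg init
    hg addremove                    # schedule every existing file for addition
    hg commit -m "Initial import: everything as-is"

    hg remove file1 file2 file3     # drop the stale copies (placeholder names)
    hg commit -m "Remove unused duplicate versions"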
I would say use whatever you are most comfortable with that also meets your needs. As for where to start, I would personally seed the repo with the current source as-is; that way you can verify that everything builds and runs as expected. You can make this initial seed a branch, so that you can always go back to your starting point before refactoring.
My approach to this was:
create a Mercurial repository in the existing project folder ("existing")
commit all project files to "existing"
create an empty repository in a different location ("new")
As files are tested and QA'd (this was necessary because there was so much dross in "existing"), pull them from "existing" to "new".
Once files had been pulled into "new", delete the corresponding files from "existing". If access is needed to those files while the migration is under way, push them back from "new" to "existing".
This gave me the advantage of putting everything under some sort of control for recovery purposes, and control over introducing the project to the DVCS. Eventually the existing project folder became completely tested and approved for the project moving forward. At that point the "existing" directory could be deleted or turned into a working folder, and "new" became the actual project folder.
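A minimal sketch of that flow; since hg pull moves whole changesets rather than single files, the per-file "pull" between the repos is modelled here as an ordinary copy plus commit (the file name is invented):

    # put the untouched project under version control first
    cd existing/
    hg init
    hg addremove
    hg commit -m "Snapshot before cleanup"

    # separate repository that becomes the real project
    hg init ../new

    # as each file is reviewed, copy it over and commit it in "new"
    cp player.py ../new/player.py
    cd ../new
    hg add player.py
    hg commit -m "Migrate reviewed file: player.py"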
I think Mercurial is a good choice. Lightweight, fast, very simple to use and well-integrated with Windows (if that's the platform you're dealing with).
I would probably get rid of all the clutter before the first commit. Delete everything you don't care about, run all the necessary tests and only then do the commit.
Yes, I'm dead set against the 0-day cluttering of repos.
Granted, a 50K SLOC project isn't very big, but if you commit files you already know you won't need, they will make your repo slightly bigger.
Also, remember to check that the tree doesn't contain large binary files. If it does, get rid of them if at all possible.