Clone huge 16 GB Git repo with Eclipse Neon

Is there any way I can clone a huge Git repository (16+ GB) using the Git integration of the latest Eclipse Neon?
I'm cloning over an HTTP connection.
First, I ran into timeouts, but then I increased the Remote connection timeout to 1800 seconds in the Eclipse configuration.
After that the clone almost completed, but at the very end it always fails with "Premature EOF".
I have also increased http.postBuffer to 524288000 (as many users suggested on StackOverflow), but this did not help much.
I also tried cloning the master branch only, but again, I was stuck with the same error message.
Is EGit not capable of handling such a big repo over HTTP?

The only Git-related way to clone such a huge Git repo would be through the recent (February 2017) GVFS (Git Virtual File System).
As tweeted, for a 270GB repo:
“The Windows codebase has over 3.5M files. With GVFS (Git Virtual File System), cloning now takes a few minutes instead of 12+ hours.”
See github.com/Microsoft/GVFS.
GVFS is based on a Git fork: github.com/Microsoft/git.
It also relies on a protocol whose specification is described here.
This is not yet supported by EGit, or even regular Git for now.

Depending on what you want to do with the repo, a shallow clone may be the solution (it won't bring the full Git history): https://www.perforce.com/blog/141218/git-beyond-basics-using-shallow-clones
Also, for such a big repo, consider using Git LFS in the future: https://git-lfs.github.com/
Finally, I've seen many huge Git repos that became so big because they contained files that weren't supposed to be stored in Git (executables, binaries, videos, audio, and so on). If something like that happened by mistake, you can remove it from the history using filter-branch. Check this SO answer: How to remove/delete a large file from commit history in Git repository? or this GitHub article: https://help.github.com/articles/remove-sensitive-data/
EDIT:
Microsoft has been developing GVFS, which may be a solution in the near future (I think it's still not ready, but I haven't tested it).
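The shallow clone mentioned above can be sketched as follows. This is only an illustration run against a throwaway local repository, so it needs no network; the directory names, commit messages and demo identity are made up. Against a real server you would pass its URL to git clone --depth 1 instead of the file:// path.

```shell
#!/bin/sh
set -e
# Scratch directory for the demo (hypothetical data, safe to delete afterwards).
demo=$(mktemp -d)
cd "$demo"

# Build a small source repository with three commits.
git init -q source
for i in 1 2 3; do
    echo "revision $i" > source/file.txt
    git -C source add file.txt
    git -C source -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "commit $i"
done

# Shallow clone: only the most recent commit is transferred.
# The file:// scheme matters here, because --depth is ignored for plain local paths.
git clone -q --depth 1 "file://$demo/source" shallow

full_count=$(git -C source rev-list --count HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)
echo "source has $full_count commits, shallow clone has $shallow_count"
```

The working tree of the shallow clone is complete, but commands that need older history (log, blame, bisect) only see what was fetched.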

Do you really have a code project that's 16GB? That's pretty crazy, man!
I think the least painful way to go about this is to open your shell and just type git clone http://my-url/project.git. Then see if you can make the repository somewhat smaller.

Eventually, I ended up cloning the repository over an SSH connection.
This works fine, even from within Eclipse (using EGit).
I had to create an SSH key in the Eclipse preferences, since PuTTY's PPK format is not compatible with Eclipse. Then I managed to clone the entire repository.
Seems like HTTP is not suited to download a chunk of 16+ GB. :)

Related

Newbie Unable to clone repo

I've never used a VCS before and I'm attempting to set one up now.
I'm doing some Game Development with Unity3d. At first I googled how to set up VCS for Unity; and I found this: http://www.gamasutra.com/blogs/BurkayOzdemir/20130303/187697/Using_Unity3D_with_external_VCS_Mercurial__Bitbucket_step_to_step_guide.php
I followed it until it came time to clone the repository from within the TortoiseHg Workbench. When I hit the clone button after copying and pasting the URL of my repo from the BitBucket.org website, I received an error: "Repository Git clone https:://username@bitbucket.org/username/projectName.git not found code: 255". I do understand what an HTTP 404 error is.
Anyone who has used the internet knows it means the page could not be found.
I created this repo as private; is that why it could not find my repo?
Then I proceeded to follow the instructions on BitBucket's "BitBucket 101" help page. I installed Git (I had already created a BitBucket account and repo) and followed the instructions.
I stopped at the point where the help page said to enter some commands into Git Shell. I'm running Windows 8.1, and searches have shown me that this particular program doesn't exist on this PC.
Am I doing this correctly? What am I doing wrong? All I need is to set up a VCS.

Git and Mercurial are two different distributed version control systems. Both have a command-line interface. TortoiseHg (the package referred to in the step-by-step guide you linked to) is a GUI that is only used with Mercurial. (Hg is the chemical symbol for mercury, get it?!)
Bitbucket is an online repository that can host either Git or Mercurial repositories.
It looks like you created your repository on Bitbucket as a Git repo and not a Mercurial repo. Just delete the repo on Bitbucket (make sure you have a good copy of your source code) and recreate it as a Mercurial repo. Then work with TortoiseHG as instructed in the step-by-step.
The fact that it's a private repo doesn't matter. That just means it will only be visible to you (vs. everyone) and will require a password to push and pull changes via https or ssh.

Well, first of all, the tutorial you are using is based on Mercurial, not Git.
If you're comfortable with diving into the command line, you can download TortoiseHg, which is a Windows shell extension: http://tortoisehg.bitbucket.io/
However, there's nothing wrong with using SourceTree either, which is a GUI for working with both Git and Mercurial repositories: https://www.sourcetreeapp.com/
FYI, if you downloaded Git for Windows, it should've provided you with a terminal called Git Bash that you can use for Git commands.

Git reset gives me "still trying to merge"

I've googled this in many different ways and can't find anyone else talking about it (at least as far as I understand).
On my office pc I was trying to find a solution to a problem I was having (so I was ahead of my remote git repo, but without committing).
That night at home I figured out the solution and pushed it to my remote repo from my home pc.
Now I'm back at work and I want to reset my local repo on my office PC to match the remote (and discard all my local changes).
I ran:
git reset --hard origin/branch1
I got:
HEAD is now at 1501f25 **Still trying to merge**
What does this mean?
'Still trying to merge' seems to indicate it didn't complete somehow, but I can't see how (and I'm having no luck finding a clear answer in the git docs).

If git merge --abort (Git 1.7.4+, January 2011) doesn't do it, check whether you still have a .git/MERGE_HEAD file (and delete it).
Then the git reset should proceed (or, since it already completed, the state of the Git repo should be coherent).
Make sure you are on the branch you want to reset to origin/branch1.
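The sequence described above can be tried out safely in a scratch repository. The following is a minimal sketch, not the asker's actual setup: the branch name other, the file contents and the demo identity are invented, and the local branch stands in for origin/branch1.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q repo
cd repo
git config user.name demo
git config user.email demo@example.com

# Two branches that change the same line, so merging them conflicts.
echo "base" > file.txt
git add file.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on configuration
git checkout -q -b other
echo "theirs" > file.txt
git commit -q -am "change on other"
git checkout -q "$main"
echo "ours" > file.txt
git commit -q -am "change on $main"

# Start a merge that stops on the conflict; this leaves .git/MERGE_HEAD behind.
git merge other >/dev/null 2>&1 || true
[ -f .git/MERGE_HEAD ] && before="present" || before="absent"

# Abort the unfinished merge, then discard local state and match the target commit.
git merge --abort
reset_msg=$(git reset --hard "$main")
[ -f .git/MERGE_HEAD ] && after="present" || after="absent"

echo "MERGE_HEAD before abort: $before, after abort: $after"
echo "$reset_msg"
```

Note that git reset --hard reports its result as "HEAD is now at", followed by the abbreviated hash and the subject line of that commit. The text after the hash is therefore a commit message, so it is worth checking whether commit 1501f25 was simply committed with that subject.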
As the OP Roy Collings suggests, recloning should get rid of the warning, but that means having one's project config files versioned in order to minimize the time spent to configure everything again in a new cloned repo.
Since relative paths are supported in an Eclipse config, having .project and .classpath in a git repo is possible.

Eclipse: How to export local history to a real SCM system like Git or SVN

I have an Eclipse project which started out as a smallish, quick'n'dirty, private hack. I did not bother to use a real SCM (source code management) system like Git or SVN, not even locally. What I have instead is a few days' worth of Local History, an out-of-the-box Eclipse feature. As so often, the project grew and I want to share it including history, because the history shows a lot of refactoring steps which come in handy as a showcase in order to teach someone else about refactoring, clean code etc.
I already know that I can manually retrieve old versions file by file and manually migrate them to e.g. a Git repository, committing changes one by one and file by file. But what I am really interested in is:
Can I reset the whole project (not just a single file) to a certain date using Local History?
Is there a way to export certain (or all) snapshots of the local project history, so I can commit them to Git snapshot by snapshot?
Is there even an option (or an external tool, script etc.) by means of which I can automatically migrate a project's local history to a real SCM system like Git (preferred) or SVN? It would also be okay if the tool just created lots of full project snapshots in subfolders named by timestamps.
Disclaimer: Yes, I do know that I should have used Git right from the start. It would have cost me just three minutes to set up a local repository etc. But... BUT. You know. ;-)

I don't think there is, but keep in mind that the task shouldn't be too tedious.
Make a copy of your project before starting, just for safety's sake and then:
git init
(revert to snapshot 1)
git add .
git commit -m "First snapshot commit"
(revert to snapshot 2)
git add .
git commit -m "Second snapshot commit"
Wash, rinse, repeat.
If you've only got a few dozen snapshots, it shouldn't take more than an hour or so to do, which is probably a lot less than it would take to figure out a programmatic solution.
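If the snapshots are first restored by hand into folders named by timestamp (the layout the question mentions as acceptable), the add-and-commit loop above can be scripted. This is a rough sketch under that assumption; the folder names, file names and contents below are made-up demo data, and it expects every snapshot to differ from the previous one.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"

# Hypothetical demo data: one folder per manually restored snapshot,
# named so that alphabetical order equals chronological order.
mkdir -p snapshots/2017-03-01_1000 snapshots/2017-03-01_1130 snapshots/2017-03-02_0900
echo "v1" > snapshots/2017-03-01_1000/Main.java
echo "v2" > snapshots/2017-03-01_1130/Main.java
echo "v3" > snapshots/2017-03-02_0900/Main.java
echo "helper" > snapshots/2017-03-02_0900/Util.java

git init -q project
cd project
git config user.name demo
git config user.email demo@example.com

for snap in "$demo"/snapshots/*/; do
    # Empty the working tree (everything except .git) so deletions are recorded too.
    find . -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
    # Copy the snapshot contents into the working tree and commit them.
    cp -R "$snap". .
    git add -A
    git commit -q -m "Snapshot $(basename "$snap")"
done

commit_count=$(git rev-list --count HEAD)
echo "imported $commit_count snapshots"
git log --reverse --format=%s
```

Restoring each snapshot from Local History remains manual work; the script only removes the repetitive Git part.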

Unfortunately, the answer is "no" to all of your questions, at least with standard built-in Eclipse functionality. There's always a chance that someone has written a plugin that meets your needs, but in this case I'd be surprised. Check the Eclipse Marketplace (found under the Help menu).

How do you use Netbeans to work with a Github project?

From what I can tell, nbGit doesn't talk to Github. The best idea I've had so far is to install msysgit, use it to clone the repository to the local drive, then point nbGit at the local clone (creating a second repository). Then I would use nbGit to talk to the repository on disk, and msysgit to sync the on-disk repository with Github. Is there a better way?

Support for the git pull and push commands does not appear to be implemented in the nbGit plugin yet - see this bug report for details... So I think using the command line to sync with GitHub might be your best option for a little while still.
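The command-line part of that workflow might look like the sketch below. To keep it runnable without a network connection, a local bare repository stands in for the GitHub repository; remote.git, work and README.txt are invented names, and with a real project you would clone the GitHub URL instead.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"

# Stand-in for the GitHub repository (hypothetical; normally an https or ssh URL).
git init -q --bare remote.git

# Clone it once from the command line; the IDE is then pointed at "work".
git clone -q remote.git work 2>/dev/null
cd work
git config user.name demo
git config user.email demo@example.com

# Local commits can come from the IDE or, as here, from the shell.
echo "hello" > README.txt
git add README.txt
git commit -q -m "Add README"

# Publish the local commits, and later fetch and merge what others have pushed.
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"
git pull -q origin "$branch"

remote_count=$(git --git-dir="$demo/remote.git" rev-list --count --all)
echo "commits on the remote: $remote_count"
```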

Any way to do a local CVS proxy/server?

I have an online CVS repository that I need to check code into. However, the server is outside my control and is often down.
So, is there a way to set up some sort of local CVS server/proxy such that I can check my code into the local CVS server regularly and have the local CVS server batch commit the changes to the online CVS repo periodically?
The local repository could possibly run some other SCM system, if that was necessary to prevent conflict with CVS. Online commits could possibly be done manually, or via cron. I'm open to suggestions.
I guess that my main concern would be the problems faced in trying to set up some sort of repository 'hierarchy'.
PS: I'm running Linux all along the 'hierarchy'.
Edit: Found a similar item here.

Use git locally, and then git-cvsexportcommit would be my suggestion. There's a blog post that talks about this at http://issaris.blogspot.com/2005/11/cvs-to-git-and-back.html although I'll be the first to admit that the export process isn't as easy to use as perhaps it could be.

I'd recommend running git locally while continuing to use your CVS server when you have a connection to it. Here's a nicely-written article that explains how:
http://www.kernel.org/pub/software/scm/git/docs/v1.4.4.4/cvs-migration.html

You can use git as a "frontend" to CVS, which will allow you to check in your changes locally (offline) and then sync them up to the CVS server when your connection is available. It takes a bit of work to set up the environment, but once you get it going the workflow is pretty nice.
See How to export revision history from mercurial or git to cvs? for the setup & workflow.

This doesn't really answer the question, but it sounds like you need a distributed VCS.

I think you should consider using a distributed source management system such as git or mercurial, which support this kind of decentralized source control.

I have never used it, but CVSup may do what you need. As others have mentioned, though, a distributed VCS like git or mercurial would probably be better.