Migrating to GitHub from SVN (VisualSVN Server) keeping SVN version numbers - github

Ok so we are trying to move from SVN (VisualSVN Server in house) to GitHub and we want to keep version numbers, all the posts so far I have tried to follow don't seem to find the version numbers. And I end up getting this 'Unable to determine upstream SVN information from HEAD history'
So we used the GitHub import utility to bring over the SVN repository but everything I have tried after that doesn't seem to work to allow us to access version numbering from SVN.
If anyone can help me please, I am not really great in either SVN or GitHub and all advice would be very welcome.
What do you mean by "we want to keep version numbers"? - The version info from SVN (revision numbers I think they call it) we need for historic records (we are getting rid of SVN) it would be great to see them as a message in the commit comments for historic SVN items. I may be using the wrong terms sorry.

I am not really great in either SVN or GitHub
Hire anybody proficient in at least one theme: SVN or Git (Github is service on top of Git and it's irrelevant to task "repository migration")
Another way is:
read and understand git-svn topic from Git Book
convert SVN-repository to local Git-repo
using fresh Git, discover and use find-rev subcommand in git-svn for your task:
When given an SVN revision number of the form rN, returns the
corresponding Git commit hash (this can optionally be followed by a
tree-ish to specify which branch should be searched). When given a
tree-ish, returns the corresponding SVN revision number.

Related

svn2git broken pipe error

I use svn2git but it fails always on the same line:
My command:
svn2git https://my-svn/proj/ --revision 1000:1001 --username xxx
After one houre this command stoped running and it fails with:
Broken pipe at /usr/lib64/perl5/vendor_perl/SVN/Ra.pm line
623.
What would cause this issue?
For a one-time migration git-svn is not the right tool for conversions of repositories or repository parts. It is a great tool if you want to use Git as frontend for an existing SVN server, but for one-time conversions you should not use git-svn, but svn2git which is much more suited for this use-case.
There are plenty tools called svn2git, the probably best one is the KDE one from https://github.com/svn-all-fast-export/svn2git. I strongly recommend using that svn2git tool. It is the best I know available out there and it is very flexible in what you can do with its rules files.
The svn2git tool you use is based on git-svn and thus suffers from most of the same drawbacks.
You will be easily able to configure svn2gits rule file to produce the result you want from your current SVN layout and you can also tell it to keep empty directories by giving it a commandline option that causes it to put empty .gitignore files in the directories to keep them.
If you are not 100% about the history of your repository, svneverever from http://blog.hartwork.org/?p=763 is a great tool to investigate the history of an SVN repository when migrating it to Git.
Even though git-svn (or the svn2git you used) is easier to start with, here are some further reasons why using the KDE svn2git instead of git-svn is superior, besides its flexibility:
the history is rebuilt much better and cleaner by svn2git (if the correct one is used), this is especially the case for more complex histories with branches and merges and so on
the tags are real tags and not branches in Git
with git-svn the tags contain an extra empty commit which also makes them not part of the branches, so a normal fetch will not get them until you give --tags to the command as by default only tags pointing to fetched branches are fetched also. With the proper svn2git tags are where they belong
if you changed layout in SVN you can easily configure this with svn2git, with git-svn you will loose history eventually
with svn2git you can also split one SVN repository into multiple Git repositories easily
or combine multiple SVN repositories in the same SVN root into one Git repository easily
the conversion is a gazillion times faster with the correct svn2git than with git-svn
You see, there are many reasons why git-svn is worse and the KDE svn2git is superior. :-)
From what I can tell, the "broken pipe" error is due to a problem receiving a file. It might be really big or there may be something else odd about it.
If you run into the error, fortunately in most cases you can simply re-run the git svn clone command or other related command and usually it can seamlessly resume from where it left off.
If you are using git-svn to do a one time migration of an SVN repo to Git, just be sure to follow Atlassian's guide. They have a tool for properly converting the SVN tags to Git tags, but it is a destructive process and you won't be able to use the git repo to sync with SVN anymore, so only do this when you're finally ready to abandon the SVN repo.

Clone huge 16 GB Git repo with Eclipse Neon

Is there any way I can clone a huge Git repository (16+ GB) using the Git integration of latest Eclipse Neon?
I'm cloning by HTTP connection.
First, I ran into timeouts, but then increased the Remote connection timeout to 1800 seconds in Eclipse config.
Then the cloning almost completed, but at the very end it always fails telling me Premature EOF.
I have increased the http.postBuffer to 524288000 also (as many users suggested on StackOverflow), but this was not much of a help.
I also tried cloning the master branch only, but again, I was stuck with the same error message.
Is EGit not capable of handling such a big repo over HTTP?
The only Git-related way to clone such a huge Git repo would be through the recent (February 2017) GVFS (Git Virtual File System).
As tweeted, for a 270GB repo:
“The Windows codebase has over 3.5M files. With GVFS (Git Virtual File System), cloning now takes a few minutes instead of 12+ hours.”
See github.com/Microsoft/GVFS.
GVFS is based on Git fork: github.com/Microsoft/git.
And based on a protocol whose specifications are described here.
This is not yet supported by EGit, or even regular Git for now.
Depending on what you want to do with the repo, a shallow clone may be the solution (it won't bring the full git history): https://www.perforce.com/blog/141218/git-beyond-basics-using-shallow-clones
also, for such big repo, consider using git lfs in the future: https://git-lfs.github.com/
finally, I've seen many huge git repos that became so big because had files that wasn't supposed to be saved on git (executable files, binaries, videos, audio, and so on). If by mistake something like that happen, you can remove it from history using filter-branch. Check this SO ans: How to remove/delete a large file from commit history in Git repository? or this github article https://help.github.com/articles/remove-sensitive-data/
EDIT:
Microsoft has been developing GVFS that may be a solution in a near future (i think it's still not ready, but I haven't tested)
Do you really have a code project that's 16GB? That's pretty crazy, man!
I think the least painful way to go about this, is to open your shell and just type git clone http://my-url/project.git. And then try to see if you can make the repository somewhat smaller.
Eventually, I ended up cloning the repository using a SSH connection.
This works fine, even from within Eclipse (using EGit).
I had to create a SSH key in Eclipse properties, since Putty's PPK format is not compatible with Eclipse. Then, I managed to clone the entire repository.
Seems like HTTP is not suited to download a chunk of 16+ GB. :)

How do you use Git within Eclipse as it was intended?

I've recently been looking at using Git to eventually replace the CVS repository we have at work. However after watching Linus Torvalds' video on YouTube about Git it seems that every tutorial I find suggests using Git in the same way CVS is used except that you have a local repository which I agree is very useful for speed and distribution.
However the tutorials suggest that what you do is each clone the repository you want to develop on from a remote location and that when changes are made you commit locally building up a history to help with merge control. When you are ready to commit your changes you then push them to the remote location, but first you fetch changes to check for merge conflicts (just like CVS).
However in Linus' video he describe the functionality of Git as a group of developers working on some code pushing and fetching from each other as needed, not using a remote location i.e. a centralized location. He also describes people pushing their changes out to verifiers who fetch and push code also. So you can see it's possible to create scalable structure within a company also.
My question is can anybody point me in the direction of some tutorials that actually explain how to do this distributed development of code using Git so that developers push and fetch code from each other with out committing to the remote repository and if possible it would be very nice to have this tutorials Eclipsed based.
Thanks in advance,
Alexei Blue.
I don't know any specific tutorial about this. In general, for connecting to a repository, you have to be running a git server that listens (and authenticates) to git requests.
To have a specific repository for each developer is possible - but each repository needs that server component. It is possible to store multiple repositories on the same computer, that allows reducing the number of servers required.
However, for other reasons it is beneficial to have some kind of central structure (e.g. a repository for stuff to be released; or a repository for stuff not verified yet). This structure is not required to be a single central repository, but multiple ones with well-defined workflows regarding the data move between repositories (e.g. if code from the verification repository is validated, it should be pushed to the release repository).
In other words, you should be ready to create Git servers (e.g. see http://tumblr.intranation.com/post/766290565/how-set-up-your-own-private-git-server-linux for details; but there are other tutorials for this as well), and define workflows for your own company to use it.
Additionally, I recommend looking at AlBlue's blog series called Git Tip of the Week.
Finally, to ease the introduction I suggest to first introduce Git as a direct replacement for CVS, and then present the other changes one by one.
Take a look at alblue's blog entry on Gerrit
This shows a workflow different from the classic centralized server such as CVS or SVN. Superficially it looks similar as you pull the source from a central Git server, but you push your commits to the Gerrit server that can compile and test the code to make sure it works before eventually pushing the changes to the central Git server.
Now instead of pushing the changes to Gerrit, you could have pushed the code to your pair programming buddy and he could have manually reviewed and tested the code.
Or maybe you're going on holiday and a colleague will finish the task you've started. No problem, just push your changes to their Git repo.
Git doesn't treat any of these other Git instances any different from each other. From Git's perspective, none of them are special.

How can I do a partial update (i.e., get isolated changesets) from subversion with subclipse?

If a file is committed several times with various changes, how can I fetch one change at a time, i.e., one changeset at a time?
I use eclipse, subversion, and subclipse, and I can't change the former two for the time being (or the MS platform..).
In my Team/Synchronization view in eclipse (using subclipse), choosing the changeset model, a file seems to be listed only in the latest relevant changeset even if all changesets are listed. So an earlier changeset doesn't necessarily show the full set of files in the original commit, nor the original diff for a file in a commit.
Update: I'm thinking about using changesets for simplified code review, so I'd like the partial update represented for all the files commited in one changeset. It's easy to get diffs and specific revisions for specific files in eclipse, but I'd like to step through all the changes in one specific commit/ changeset in a practical manner.
As I'm sure you know, svn up will by default grab the latest revision of the file.
However, you can use the -r parameter to svn up to grab a particular revision of a file. So if you know a file was committed in revisions 5, 7, and 9, you could do this:
svn up -r5 myfile
svn up -r7 myfile
svn up -r9 myfile
I believe (but I don't have an installation of it in front of me) that Subclipse has a similar option, labeled something like "Update to Revision..."
Subversion does not support atomic changesets.
(Note: If anyone can prove me wrong, I'll happily switch accepted answer.)
I've compared Git and Subversion using TortoiseGit and TortoiseSVN (and looked at what is possible on the command line).
With both Svn and Git I can update to a certain revision, or see and update to different versions of only one file at a time.
With both Tortoise clients can I see individual commits (revisions) from the repository and look at changes between a revision and the previous revision. (Note that I can't seem to do this in Eclipse, ref the question.)
Only with Git, however, can I update to or cherry-pick an isolated commit. The closest I've seen to this functionality in Subversion is to update to head and then revert a certain revision with a "subtractive merge"...
Test setup: make a project, check out or clone the project, make 2 separate commits to repository from elsewhere, including at least one file that is modified in both commits.
Then, with Git: fetch remote changes.
Then, with both Git and Subversion: look at the log.

How to merge project differences using visual source safe?

We have two different projects on our source safe database (one of them is a copy of another one for some reasons there was a problem with our branching operation that didn't pin our branched files therefore I had to get a label and add it as a different project)
I know how I can see the differences between two projects and I know that there is a mechanism that let us merge differences into one file (I think "reconcile all" will do the trick but i am not sure)
So here's my question how can I merge a file in a project with another file from another project?
VSS (or as i call it, source destruction system) will destroy your code if you try to merge it using the built-in tools. Why does it do that ? .. because its a lame tool.
This is what i recommend
Get latest both branches.
Get the last version of the code
before you branched. (just see the
date and guess if you have to)
Do a 3-way merge because you have a
base.
add the merged files into subversion
(or something better than
sourcesafe).
I have many old projects stored in sourcesafe. Its hopeless trying to use the built-in tools to do anything other than get latest, checkin and checkout.
Checkout the latest version of the first VSS somewhere.
Create a repository using a different VCS tool (Subversion should be the most simple choice).
Import the project version into the new Subversion repo as a branch.
Checkout the latest version of the second VSS somewhere else.
Import the project version into the new Subversion repo in a different branch.
Use any Subversion tools to merge the two branches.