How to get source code from GitHub data export? - github

I decided to back up all my GitHub data and found this: https://help.github.com/en/github/understanding-how-github-uses-and-protects-your-data/requesting-an-archive-of-your-personal-accounts-data
I managed to get the .tar.gz file, and it seems to contain all my repositories, but there is no source code in there. Judging by the size, it looks like some kind of archive in objects/pack/*.pack.
Is there any way to access the original source code?

it looks like some kind of archive in objects/pack/*.pack
According to Download a user migration archive:
The archive will also contain an attachments directory that includes all attachment files uploaded to GitHub.com and a repositories directory that contains the repository's Git data.
Those might be bare repositories or bundles.
Once the archive is uncompressed, try to git clone one of those folders into a new, empty folder.
The OP johnymachine confirms in the comments:
git clone .\repositories\username\repository.git\ .\repository\
This means repository.git is a bare repo (Git database only, no files checked out).
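To repeat that clone for every repository in the export, something like the following Python sketch might work. It assumes the archive has already been extracted, that git is on the PATH, and that a bare repository can be recognized by a HEAD file sitting next to objects and refs directories. The function names find_bare_repos and clone_all are made up for this illustration.

```python
import subprocess
from pathlib import Path


def find_bare_repos(export_root):
    """Yield directories under export_root that look like bare Git repos.

    Heuristic (an assumption, not an official rule): a bare repo has a
    HEAD file with "objects" and "refs" directories right next to it.
    """
    for head in sorted(Path(export_root).rglob("HEAD")):
        repo = head.parent
        if (repo / "objects").is_dir() and (repo / "refs").is_dir():
            yield repo


def clone_all(export_root, dest_root):
    """Clone every bare repo found in the export into dest_root."""
    for repo in find_bare_repos(export_root):
        # repository.git -> repository
        name = repo.name[:-4] if repo.name.endswith(".git") else repo.name
        target = Path(dest_root) / name
        target.parent.mkdir(parents=True, exist_ok=True)
        # Same as: git clone <bare repo> <new empty folder>
        subprocess.run(["git", "clone", str(repo), str(target)], check=True)
```

Calling clone_all("export", "checkouts") would then leave one working tree per repository under checkouts, assuming no two repositories share a name.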

Related

How do I add a file in a subfolder to a new repository?

I have a repository for a website and it has two separate remotes. One is for the website files and one for datasets and R scripts to make some data in my blog posts reproducible and archived for the future.
My local file structure looks like this.
Website
|-- website-files/posts/
    |-- blog-post1
    |-- blog-post2
    |-- r_script.R
The folder Website has two remotes: one, origin, for the website, and one, blog-post, as the dumping ground for my replication files.
So, because I have cleanly added a second remote, I tried to add the file r_script.R and push it to the remote blog-post.
git add website-files/posts/r_script.R
When I then check the status, though, git status lists the file as untracked, shown as
../../r_script.R
The precise question: How do I add a file in a subfolder so that it is tracked, and then push it to its own unique remote? Note that when I copy r_script.R to the folder Website and run git add r_script.R, it shows up as a staged file ready for committing.
But I would really rather keep it in the subfolder to keep things clean.
Should I maybe add the repo blog-post as a submodule in the subfolder website-files/posts/, or something like that?

Artifactory github Repository - downloadBranch

I have GitHub set up as a VCS repository in Artifactory. When using the downloadBranch API through Artifactory (similar to downloading files via git clone), the download appears to include everything except the dot hidden files (.gitignore is an example).
Is there a way to include all files (including the dot hidden files) when downloading a branch from an Artifactory VCS repository?
This is what I've tried:
curl -XGET "https://artifactory.domainname.com/artifactory/api/vcs/downloadBranch/github-remote-vcs/jquery/jquery/master" -o jqueryMaster.tar.gz
This results in a gzipped tarball that contains all files in the repo except for the dot hidden files, but I need all files in the repo.
Update #1
Slight correction - the dot hidden files are getting downloaded with the exception of the .git subdirectory containing information about the Repo itself. Does anyone know if there is a way to get the .git directory as well as the Repo metadata included?
Does anyone know if there is a way to get the .git directory as well as the Repo metadata included?
No: downloadBranch is for downloading a tarball (tar.gz/zip, default tar.gz) of a complete branch.
It is an archive, not a full repo history.
The Artifactory VCS API exposes a repository only through listing and archive-download operations, not as a full repo:
List all tags
List all branches
Download a specific tag
Download a file within a tag
Download a specific branch
Download a file within a branch
Download a release
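As a rough illustration, the Python sketch below builds the same downloadBranch URL used in the question and checks whether a downloaded tarball contains a .git directory. The helper names download_branch_url and contains_git_dir are hypothetical, and the URL layout is taken only from the curl command above, so treat it as an assumption rather than a full description of the API.

```python
import tarfile
from pathlib import PurePosixPath


def download_branch_url(base_url, repo_key, org, repo, branch):
    """Build a downloadBranch URL shaped like the one in the question."""
    path = "/".join([repo_key, org, repo, branch])
    return f"{base_url.rstrip('/')}/api/vcs/downloadBranch/{path}"


def contains_git_dir(tarball_path):
    """Return True if any entry in the .tar.gz lives under a .git directory."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        return any(
            ".git" in PurePosixPath(name).parts for name in tar.getnames()
        )
```

Running contains_git_dir on a downloaded branch archive is a quick way to confirm that dot files such as .gitignore are present while the .git directory is not.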

Is there a way to download the zip file for a GitHub repository that includes links to other repositories?

There is a GitHub repository that I would like to download as a zip file. However, some of the code I want to download is located in separate repositories, and the main repository contains only links to those repositories, not the actual code. If I want a zip file containing the whole codebase, do I need to download the separate repositories and assemble it myself, or is there a way to do it in one step?
Not directly from GitHub: the tarball or archive representing a repo does not include its submodules, assuming the repo has a .gitmodules file.
Only a git clone --recursive (maybe with --depth=1) would get you the complete content.
Note: if the repo does not have a .gitmodules file, those references are just gitlinks to other repo SHA1, without their URL, in which case even a regular clone would not get their content.
If I want a zip file containing the whole codebase, do I need to download the separate repositories and assemble it?
Yes, ... except you might end up with the wrong content: the parent repo references a specific SHA1 of each of the other repos, so you need to make sure to request an archive for the right reference.
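To see which repositories would have to be assembled by hand, one option is to read the parent repo's .gitmodules file. The Python sketch below is only an approximation: it assumes the file is simple enough for the standard configparser module to parse (Git's own config parser is more permissive), and list_submodules is a made-up helper name. It reports paths and URLs only; the pinned SHA1 of each submodule is stored in the parent repo's tree, not in .gitmodules.

```python
import configparser


def list_submodules(gitmodules_text):
    """Return (path, url) pairs declared in the text of a .gitmodules file."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(gitmodules_text)
    submodules = []
    for section in parser.sections():
        # Sections look like: [submodule "libs/foo"]
        if section.startswith("submodule "):
            entry = parser[section]
            submodules.append((entry.get("path"), entry.get("url")))
    return submodules
```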

CVS branch actual location on server

The CVS repository in my project has the HEAD code and 8 other branches. The server location given as '/local/cvs/srcjboss' contains only the projects in the HEAD branch.
Is there a physical location on the server where all the branch code can be accessed? I need the server location for a CVS to SVN migration.
If it helps, we are using a Linux server.
To convert a CVS history to Subversion using cvs2svn, you need filesystem-level access to the data from the central CVS repository. It is not enough to have access to a working copy where the code is checked out. It's not really clear from your question which of these you have under /local/cvs/srcjboss.
A CVS repository is recognizable from its CVSROOT subdirectory and lots of files named like your project files, but with ",v" appended, like maybe "Makefile,v" or "index.html,v" or "build.xml,v". Each of these files contains the entire history (including branches) of the corresponding file from your project, in rcsfile(5) format. The repository probably also contains "Attic" subdirectories that hold the histories of files that are not present in HEAD.
A CVS working copy, on the other hand, contains one particular version of each file (with no ",v" suffix), plus a CVS subdirectory in each of your project directories. A CVS working copy doesn't contain any of the project history.
So is your /local/cvs/srcjboss the CVS repository or is it a working copy?
If you have a working copy and are trying to find the central repository, look in one of the files named CVS/Root. It will tell you the location of the repository from which the working copy was checked out.
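The checks described above could be scripted roughly as follows. This Python sketch is a heuristic under the assumptions stated in the answer (CVSROOT or ",v" files indicate a repository; a CVS/Root file indicates a working copy). The function names classify_cvs_dir and repository_location are invented for the example.

```python
from pathlib import Path


def classify_cvs_dir(path):
    """Guess whether path is a CVS repository, a working copy, or neither."""
    root = Path(path)
    # A working copy keeps a CVS/ bookkeeping directory with a Root file.
    if (root / "CVS" / "Root").is_file():
        return "working copy"
    # A repository has CVSROOT at its top and RCS history files ending in ",v".
    if (root / "CVSROOT").is_dir() or any(root.rglob("*,v")):
        return "repository"
    return "unknown"


def repository_location(working_copy):
    """Read CVS/Root to find where the working copy was checked out from."""
    return (Path(working_copy) / "CVS" / "Root").read_text().strip()
```

For example, classify_cvs_dir("/local/cvs/srcjboss") would indicate which of the two cases applies to the directory from the question.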

How to cleanup Mercurial repository?

I need to delete my "uploads" folder from the repository with all its history because it contains only junk testing data.
Please help.
You'll want to use the convert extension that ships with Mercurial. Since you want to scrub a directory from the history, you'll have to filter your entire existing repository, converting it into a new one.
Assume the following made-up structure for your repo:
/
  src
  doc
  images
  upload
Create a simple text file with the following content:
exclude upload
You can do more with this file, but keep it simple to get to your goal. The path to be excluded is relative to the repository root.
Now run Mercurial's convert command:
hg convert --filemap path/to/the/textfile old-repo new-repo
Change to the directory of the new repo. Notice that Mercurial created a repo at the null revision (no content but the .hg directory). Run the following to update to your latest changeset. Notice the upload directory is gone!
cd path/to/new/repo
hg update
WARNING: I do not know how this handles named branches or tags. You're on your own. At least you're not modifying the original repo. Make as many copies as you need to get it right.
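The steps above can also be scripted. The Python sketch below assumes hg is on the PATH and enables the convert extension for the single command via --config, in case it is not already enabled in your hgrc. The helper names write_filemap and convert_without are hypothetical, and paths containing spaces would likely need quoting in the filemap, which this sketch does not handle.

```python
import subprocess
from pathlib import Path


def write_filemap(filemap_path, excluded_paths):
    """Write one 'exclude <path>' line per path (relative to the repo root)."""
    lines = [f"exclude {p}" for p in excluded_paths]
    Path(filemap_path).write_text("\n".join(lines) + "\n")


def convert_without(old_repo, new_repo, filemap_path):
    """Convert old_repo into new_repo with the filemap, then update."""
    subprocess.run(
        ["hg", "--config", "extensions.convert=", "convert",
         "--filemap", str(filemap_path), str(old_repo), str(new_repo)],
        check=True,
    )
    # The new repo starts with an empty working directory; check out the tip.
    subprocess.run(["hg", "update"], cwd=str(new_repo), check=True)
```

A call such as write_filemap("filemap.txt", ["upload"]) followed by convert_without("old-repo", "new-repo", "filemap.txt") mirrors the manual commands and, like them, leaves the original repo untouched.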