How to search all GitHub repositories for a SHA?

Is there a way to find all repositories on github that contain a given commit (given its SHA)?
I can easily check whether a given commit exists in a known repository by checking for the existence of
https://github.com/${USER}/${REPO}/commit/${SHA}
...but what if I don't know the repo-slug?
Doing a simple search for SHAs (via the web interface) doesn't return anything.
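The per-repository check described in the question could be sketched as follows, assuming you already have a list of candidate slugs to try. The function names are hypothetical, and `commit_exists` needs network access; it only tells you whether the commit page answers with HTTP 200, which is the same signal as opening the URL in a browser.

```python
import urllib.error
import urllib.request


def commit_url(user: str, repo: str, sha: str) -> str:
    """Build the commit page URL for a known repository slug."""
    return f"https://github.com/{user}/{repo}/commit/{sha}"


def commit_exists(user: str, repo: str, sha: str, timeout: float = 10.0) -> bool:
    """Return True if the commit page answers with HTTP 200 (needs network)."""
    request = urllib.request.Request(commit_url(user, repo, sha), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # 404 and other error statuses are treated as "not found here".
        return False
```

This does not solve the unknown-slug case; it only automates the known-repository check for a list of candidates.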

Related

How do I search the readme of repositories using Github Search API?

My objective is to count the number of repositories that use PyTorch. I came up with the following request, sent with the Thunder Client extension in VS Code:
https://api.github.com/search/repositories?q=language:python + readme:PyTorch
However, this gives me just 7 search results. I am confident the result should be in the range of thousands. Could someone suggest where I am going wrong?
The GitHub search API for repositories checks the name, description and README of all repositories.
Therefore, all that was needed was:
https://api.github.com/search/repositories?q=PyTorch
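A minimal sketch of building such a request URL with proper encoding, so that spaces and colons in the query are not sent raw. The helper names are hypothetical. The `in:readme` and `language:` qualifiers in the second call are taken from GitHub's search syntax and are an assumption about what the asker wanted; the count is read from the `total_count` field of the JSON response.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def repo_search_url(*terms: str) -> str:
    """Join search terms with spaces and encode them into the q parameter."""
    query = urlencode({"q": " ".join(terms)})
    return f"{SEARCH_ENDPOINT}?{query}"


def total_count(payload: dict) -> int:
    """Read the number of matches from a decoded search response."""
    return payload["total_count"]


# The plain query from the answer.
plain = repo_search_url("PyTorch")
# A narrower variant (assumed qualifiers): README matches in Python repositories.
narrow = repo_search_url("PyTorch", "in:readme", "language:python")
```

Passing the terms separately and letting `urlencode` join them avoids hand-written `+` signs and stray spaces in the URL.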

How to reset a file to a particular commit with JGit?

My local repository contains more than one file. When I check out a particular commit of one file, the other files in the repository get deleted.
I am using the following API call (git is the instance of the Git repository):
git.checkout().setName(commitId).call()
Is this the correct way to check out a particular commit of a particular file?
The JavaDoc of setName() says
When only checking out paths and not switching branches, use setStartPoint() to specify from which branch or commit to check out files.
And for addPath() it states:
If this option is set, neither the setCreateBranch() nor setName() option is considered. In other words, these options are exclusive.
Therefore I think you should use
git.checkout().addPath( ... ).setStartPoint( ... ).call();
Your call resets the index (and can remove files that are no longer present in the commit you check out).
You can look for a more precise example in jgit/porcelain/RevertChanges.java
// revert the changes
git.checkout().addPath(fileName).call();
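In your case, because addPath() causes setName() to be ignored (per the JavaDoc for addPath()), pass the commit through setStartPoint():
git.checkout().setStartPoint(commitId).addPath(fileName).call();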

GitHub Repository search using partial or begins-with

I would like to do a code search in a collection of repositories. I would like to restrict my search using a pattern or naming convention in the repository name.
1) I can limit my search by the organization.
Example: "user:google"
2) I can search a single repository directly.
Example: "repo:google/devtoolsExtended test"
So, is it possible to search all repositories that begin with a pattern?
Equivalent to: "repo:google/dev* test"
In order to search for repositories all you need to do is just type in your query, like the following:
https://github.com/search?q=foo
This finds repos such as FooTable and foodme as well, as if we searched for foo*. If you want to limit your search to the repository name only, use in:name, which results in:
https://github.com/search?q=foo+in%3Aname&type=Repositories
There's no explicit method of searching by begins with or using wildcards, but results for those kinds of searches appear as well.
Sources:
GitHub Help: Searching repositories
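Since there is no explicit begins-with operator, one workaround is to search with in:name and then filter the returned names on the client side. A minimal sketch, with hypothetical helper names; the list of names is assumed to come from whatever search results you collected.

```python
from urllib.parse import urlencode


def name_search_url(term: str) -> str:
    """Build the web search URL that restricts matching to repository names."""
    query = urlencode({"q": f"{term} in:name", "type": "Repositories"})
    return f"https://github.com/search?{query}"


def names_with_prefix(names, prefix):
    """Keep only the names that begin with the prefix (case-insensitive)."""
    wanted = prefix.lower()
    return [name for name in names if name.lower().startswith(wanted)]


url = name_search_url("foo")
matches = names_with_prefix(["FooTable", "foodme", "barfoo"], "foo")
```

The filter step is what turns the broader substring-style results into a true begins-with match.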

GitHub: comparing across forks?

Short version
When I compare two forks on GitHub, it does not compare the latest states, but the current state of the head fork with the last common commit (or am I wrong?). How can I compare the latest states/heads on GitHub?
Longer version
I am trying to compare two repositories on GitHub.
It does not seem to compare the latest states of both repositories. Instead, it compares:
the base fork as it was when both repositories were identical (last common commit?)
with
the head fork as it is now.
You can see this in GitHub's fork comparison example: it says there are no changes between those two repositories, but they are now very different.
How can I compare the latest states/heads on Github?
https://github.com/github/linguist/compare/master...gjtorikian:master
github:master is up to date with all commits from gjtorikian:master.
Try switching the base for your comparison.
It means that all commits from gjtorikian/linguist are part of github/linguist.
The reverse is not true:
https://github.com/gjtorikian/linguist/compare/master...github:master
That would give all (1866) commits from github/linguist which are not part of gjtorikian/linguist.
This is a triple-dot '...' diff between the common ancestor of two branches and the second branch (see "What are the differences between double-dot “..” and triple-dot “…” in Git diff commit ranges?"):
In the first case github/linguist:master...gjtorikian/linguist:master, the common ancestor and gjtorikian/linguist:master are the same! 0 commits.
In the second case gjtorikian/linguist:master...github/linguist:master, github/linguist:master has 1866 commits since the common ancestor (here, since gjtorikian/linguist:master).
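The asymmetry between the two directions can be illustrated with a toy sketch. The helper names and the four-commit history are made up for illustration; the URL shape follows the two compare links above, and the commit listing is modelled as "commits reachable from the head that are not reachable from the base".

```python
def compare_url(base_owner, repo, base_branch, head_owner, head_branch):
    """Build a cross-fork compare URL in the base...head form."""
    return (
        f"https://github.com/{base_owner}/{repo}/compare/"
        f"{base_branch}...{head_owner}:{head_branch}"
    )


def commits_only_in_head(base_history, head_history):
    """Commits in the head's history that the base's history does not contain."""
    base = set(base_history)
    return [commit for commit in head_history if commit not in base]


# Toy histories: the fork stopped at "b", upstream continued with "c" and "d".
fork = ["a", "b"]
upstream = ["a", "b", "c", "d"]

forward = commits_only_in_head(upstream, fork)   # base = upstream, head = fork
reverse = commits_only_in_head(fork, upstream)   # base = fork, head = upstream
```

With the stale fork as head there is nothing to list; swapping base and head lists the commits the fork is missing.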
As a side note, the compare of forks can be reached from the compare page.
Say your project is Zipios:
https://github.com/Zipios/Zipios
What you want to do is add the .../compare to that URL:
https://github.com/Zipios/Zipios/compare
On that page, you can select two branches but if you look closely, at the top there is a link that says: compare across forks.
Once you click that link, it shows two extra dropdowns with your main branch and the list of forks.
What I have yet to discover is how to go from the main page of a project to the Compare page. Maybe someone could shed light on that part?
From @somerandomdev49:
To go to the compare page, go to the "Pull Requests" tab and click the "Create Pull Request" button.

Count number of empty repositories on GitHub

I was just wondering if it's possible to count the total number of empty repositories on GitHub.
If not for all users, can it be done for yourself?
Edit
I have tried the size:0 search, but it seems to return a lot of repositories which do contain data. Taking something like size:0..1 didn't help either.
I also tried searching for the keyword empty, but it doesn't cover all cases.
Update
I got a response from Brian Levine (GitHub)
That would be an interesting statistic. We don't have a simple way to do that right now. However, you might be able to use the GitHub API to get close. You could look through public repositories and compare "pushed_at" and "created_at" dates to see if there has been any activity. Additionally, you could find repositories with a "size" of 0. There's more information on how to find this information, and much more, right here:
http://developer.github.com/v3/repos/
You could:
list all public repos through the API, and,
for each repo, check whether its size is equal to 0.
(The size seems to be in KB)
GET /repos/:owner/:repo
Note that an "empty" repo could still have at least one commit, when created with the default README.md description file.
Actually, as the OP Aniket comments:
I explained the meaning of empty as: 0-1 commits, max 3 files:
.gitignore
README.md
LICENSE
(Note: README is different from README.md)
Another way is, for each repo, to look at the number of commits.
0 or 1 commit means probably an empty repo.
Update: GitHub confirms there is no current way to determine if a repo is "empty".
The closest way to do that would be:
You could look through public repositories and compare "pushed_at" and "created_at" dates to see if there has been any activity
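A minimal sketch of that date comparison, applied to one repository object as returned by the API. The function names are hypothetical. Treating a missing `pushed_at` as "no activity" is a defensive assumption, not documented behaviour, and a match only suggests inactivity; it does not prove the repository is empty.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_timestamp(value: str) -> datetime:
    """Parse an API timestamp such as 2011-01-26T19:01:12Z."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def looks_untouched(repo: dict) -> bool:
    """True when nothing was pushed after the repository was created."""
    pushed = repo.get("pushed_at")
    if pushed is None:
        # Assumption: no push timestamp means no push happened.
        return True
    return parse_timestamp(pushed) <= parse_timestamp(repo["created_at"])
```

This could be combined with the size check above to narrow down candidates before inspecting them individually.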
To check if a repository is empty, look to see if it has any commits.
https://api.github.com/repos/:owner/:repo/commits?per_page=1
An empty repository will have a non-successful HTTP status and the content...
{
"message": "Git Repository is empty.",
"documentation_url": "https://developer.github.com/v3"
}
If it doesn't exist, you'll get a 404 and...
{
"message": "Not Found",
"documentation_url": "https://developer.github.com/v3"
}
If it does exist, you'll get an HTTP 200 and one commit.
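The three outcomes above could be told apart with a small classifier like the one below. It is a sketch with hypothetical names; because the exact status code for the empty case is not stated here, it matches on the message text instead and falls back to "unknown" for anything else.

```python
import json

EMPTY_MESSAGE = "Git Repository is empty."


def commits_probe_url(owner: str, repo: str) -> str:
    """Build the one-commit probe URL for a repository."""
    return f"https://api.github.com/repos/{owner}/{repo}/commits?per_page=1"


def classify_commits_response(status: int, body: str) -> str:
    """Map an HTTP status and response body to one of four labels."""
    if status == 200:
        return "has commits"
    if status == 404:
        return "not found"
    try:
        message = json.loads(body).get("message", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object.
        message = ""
    return "empty" if message == EMPTY_MESSAGE else "unknown"
```

Fetching the URL is left to whichever HTTP client you prefer; only the status code and the body text are needed.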
Using the attribute "size" from the API will not help as mentioned by other posters here.
An example is this repository:
https://api.github.com/repos/errfree/test
Note that it reports the size as 48 despite being empty.
Disclaimer: This approach is a hack. It is neither efficient nor officially supported by GitHub, but it works well enough for me.
Basically, I download the zip version of the repository. When the repository is empty, the download does not return a zip file but an HTML page saying "This repository is empty.".
After downloading the file, I check whether its size is smaller than 30 KB and, if so, I look inside the file contents for the string "This repository is empty." to confirm that the repository is empty.
Here is a practical example of a direct zip download that in this case will display an empty page:
https://github.com/errfree/test/zipball/master/
My pseudo-code in Java:
// we might have reached an empty repository
if (fileZip.length() < 30000) {
    // read the contents
    final String content = utils.files.readAsString(fileZip);
    // is this an HTML file with the repository empty message?
    if (content.contains("This repository is empty.")) {
        return null;
    }
}
Hope this helps.
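For comparison, the same size-and-marker check could be sketched in Python as below. It works on the raw downloaded bytes, so a real zip archive is never decoded as text. The names are hypothetical; the 30000-byte threshold and the marker string are taken from the Java snippet.

```python
EMPTY_MARKER = b"This repository is empty."
SIZE_LIMIT = 30000  # bytes, same threshold as the Java snippet


def download_says_empty(data: bytes) -> bool:
    """True when a small download contains the empty-repository message."""
    return len(data) < SIZE_LIMIT and EMPTY_MARKER in data
```

The same caveat applies: this relies on the wording of an HTML page and may stop working if that page changes.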