GitHub API: Check if a branch or repository contains a commit

Can I use the GitHub API to check whether a certain repository contains a certain commit?
At first glance, it seems that the "get a single commit" API call should work, returning 404 if there is no such commit in the repository. But that is not true: it seems that this call also succeeds for commits that exist only in forks of the repository (possibly due to a pull request). (This effect can also be observed in the regular web interface; this particular commit has not been pulled into that repository yet.)

GitHub search API
To search repositories for a commit, one can use the commit search API, which finds commits by various criteria. (This method returns up to 100 results per page.)
https://developer.github.com/v3/search/#search-commits
Only the default branch is searched, which is usually master.
API usage
GET /search/commits
q can be any kind of search term combination: https://help.github.com/articles/searching-commits/
Example parameters for q
hash:124a9a0ee1d8f1e15e833aff432fbb3b02632105
Matches commits with the hash 124a9a0ee1d8f1e15e833aff432fbb3b02632105.
parent:124a9a0ee1d8f1e15e833aff432fbb3b02632105
Matches children of 124a9a0ee1d8f1e15e833aff432fbb3b02632105.
Further parameters, such as sorting and ordering, can be found in the documentation above.
Usage example per hash:
Example call: https://api.github.com/search/commits?q=<searchterm>+<searchterm2>
Specific call: https://api.github.com/search/commits?q=repo:adejoux/kitchen-wpar+hash:0a3a228e5b250daf06f933b35b3f0eafc715be4f
You need to add a special header, because the API is only available to developers as a preview.
Header to add: Accept: application/vnd.github.cloak-preview
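The call above can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive implementation: the function names build_commit_search_request and default_branch_contains are made up for illustration, and reading total_count from the response assumes the usual shape of GitHub search results.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.github.com/search/commits"
# Preview media type named in the answer above.
PREVIEW_ACCEPT = "application/vnd.github.cloak-preview"


def build_commit_search_request(repo, sha):
    """Build a GET request that searches one repository for one commit hash."""
    # urlencode turns the space between the two search terms into '+'.
    query = urllib.parse.urlencode({"q": "repo:{} hash:{}".format(repo, sha)})
    return urllib.request.Request(
        SEARCH_URL + "?" + query,
        headers={"Accept": PREVIEW_ACCEPT},
    )


def default_branch_contains(repo, sha):
    """Return True if the search reports at least one matching commit.

    Only the default branch is searched, as noted above.
    Assumption: the JSON response carries a 'total_count' field.
    """
    request = build_commit_search_request(repo, sha)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["total_count"] > 0
```

Calling default_branch_contains("adejoux/kitchen-wpar", "0a3a228e5b250daf06f933b35b3f0eafc715be4f") would issue the specific call shown above; unauthenticated search requests are rate limited, so a real script would probably add an Authorization header.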

Related

How to find the exact contributor count of a GitHub repository using GitHub API?

I am trying to count the total number of contributors of a GitHub repository using the GitHub API, but I do not get the exact number of contributors shown on the repository page. For example, in the azure-sdk-for-go repository, the total number of contributors is shown as 188.
Now, if I run the query below, I get 157 as the result.
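import re
import requests
def contributorCount(u, r):
    # With per_page=1, the page number of the "last" link equals the number of contributors.
    return re.search(r'\d+$', requests.get('https://api.github.com/repos/{}/{}/contributors?per_page=1'.format(u, r)).links['last']['url']).group()
print(contributorCount("Azure", "azure-sdk-for-go"))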
If I add anon=true to the URL, I get 169 contributors.
https://api.github.com/repos/{}/{}/contributors?per_page=1&anon=true
What am I missing here?
As noted in this discussion, it might not be trivial to find the same number:
A user can be “anonymous” if there is no GitHub user associated with a given email address.
And the reason that your number still may not match the one given by the UI is because the same GitHub user may have contributed using multiple email addresses. This is why I said above:
on larger repos, you may not be able to replicate the exact figure that we show on the website.
The API simply doesn’t return the information you need in order to be able to replicate the number we show.
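For what it is worth, the page-counting trick from the question can be made a little less fragile. The regular expression there relies on the page number being the very last characters of the "last" link. A minimal sketch that reads the page parameter by name instead is shown here; the helper name last_page_number is hypothetical. It does not remove the mismatch described above, it only extracts the same number more defensively.

```python
from urllib.parse import parse_qs, urlparse


def last_page_number(last_url):
    """Return the 'page' query parameter of a pagination URL as an int.

    With per_page=1, the page number of the "last" link equals the
    number of contributors the API is willing to report.
    """
    query = parse_qs(urlparse(last_url).query)
    return int(query["page"][0])
```

With requests, the URL to pass in would be response.links['last']['url'], as in the question.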

Merge pull request by Github API: SHA parameter

I was reading the documentation of the GitHub API and I'm not sure what to do with the Merge Pull Request method.
https://developer.github.com/v3/pulls/#merge-a-pull-request-merge-button
Specifically, I don't understand the sha parameter and what exactly I should provide to the API.
The INPUT section says I must provide
commit_title Title for the automatic commit message.
commit_message Extra detail to append to automatic commit message.
sha SHA that pull request head must match to allow merge.
merge_method Merge method to use.
Where exactly do I get the sha value that I need to pass to the API?
Thanks!
Consider the following diagram, which shows a feature branch derived from some base branch:
base:    ... A -- B -- C
                        \
feature:                 D -- E
Let's suppose that we created a pull request from feature going back to base. GitHub would execute this pull request by merging feature into base. The pull request HEAD, at the time we created the pull request, would be commit E in feature. But, the HEAD of the feature branch could change before the pull request is completed.
The API call you mention includes the SHA-1 hash of the pull request HEAD, as a requirement for the pull request to complete. This would avoid the possibility of feature being merged back to base while containing additional commits beyond commit E.
Regarding how you would find the SHA-1 hash for E, the pull request HEAD, you may simply try using git log, e.g.
# from feature
git log
Then, check the output for what should be the latest entry from commit E, and find the hash.
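The same hash is also available from the API itself: the pull request object returned by GET /repos/:owner/:repo/pulls/:number has a head.sha field. A minimal sketch of reading it and building the merge call follows; the helper names, the "octo/demo" style arguments and the token handling are assumptions made for illustration, not part of the original answer.

```python
import json
import urllib.request

API = "https://api.github.com"


def head_sha(pull_request):
    """Extract the head commit SHA from a parsed pull request object."""
    return pull_request["head"]["sha"]


def build_merge_request(owner, repo, number, sha, token, merge_method="merge"):
    """Build the PUT request for the merge endpoint, pinned to `sha`.

    GitHub should refuse the merge if the pull request head has moved
    past `sha` in the meantime, which is the purpose of the parameter.
    """
    body = json.dumps({"sha": sha, "merge_method": merge_method}).encode("utf-8")
    return urllib.request.Request(
        "{}/repos/{}/{}/pulls/{}/merge".format(API, owner, repo, number),
        data=body,
        method="PUT",
        headers={
            "Authorization": "token " + token,
            "Content-Type": "application/json",
        },
    )
```

A script would first fetch the pull request, pass the parsed JSON to head_sha, and then send the request from build_merge_request with urllib.request.urlopen.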

GitHub API - latest public repositories

I would like to list public GitHub repositories with the latest create/update/push timestamps (for me any of these is acceptable). Can I achieve this with the GitHub API?
I have tried the following:
Tried using /repositories endpoint, and use the link header to navigate to the last page. However, the link header I receive only has first and next links, whereas I need a last link.
Tried using /search/repositories endpoint. This will work as long as I have a keyword or filter in the q parameter, but it will not accept an empty q parameter.
I got in touch with GitHub support, and there are two solutions to this:
Use binary search on the since parameter of the /repositories endpoint to find the last page.
Cons: may quickly exhaust the API rate limit.
Use the /search/repositories endpoint with an always-true predicate such as stars:>=0.
Cons: likely to cause a query timeout or incomplete results.
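The second solution could look roughly like this. It is a sketch under assumptions: the predicate is written in search-qualifier form as stars:>=0, sorting by "updated" is my choice of timestamp, and the function name latest_repos_url is invented for the example.

```python
from urllib.parse import urlencode


def latest_repos_url(sort="updated", per_page=30):
    """Build a /search/repositories URL with an always-true predicate."""
    params = {
        "q": "stars:>=0",   # always true, so q is never empty
        "sort": sort,       # most recently updated first
        "order": "desc",
        "per_page": per_page,
    }
    return "https://api.github.com/search/repositories?" + urlencode(params)
```

As the answer warns, such a broad query may time out; the search response includes an incomplete_results flag that is worth checking before trusting the list.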

In GitHub, is there a way to search for pull requests created by any author from a provided list?

For my team's weekly builds, I go through all pull requests from the company GitHub and pull out the PRs associated with my team. This requires an annoying sieving step: a walk-through of the company's code contributions from the previous week.
I looked at the official GitHub search documentation and found that the "author" field could be used to narrow down the search in the way I want, but when I try this at https://github.com/pulls it only works on one author at a time.
Is there a way to search across a list of authors?
For a little extra context, my team operates across a large list of repos, all of which are under a blanket organization which houses all repos across the company.
Make sure that you are using the full search at https://github.com/search.
Then simply add extra author:<name> fields to your query. The search engine will OR these fields together. For example:
is:pr author:username1 author:username2
(Note that this only works on https://github.com/search. The search syntax on other pages, like https://github.com/pulls, is severely limited and does not support searching by multiple authors. If you try the same search on https://github.com/pulls, GitHub will simply ignore all but one author that you list.)
To limit it to repositories by a specific owner, add the user:<owner> field to the query.
Using the route github.com/search instead of github.com/pulls is the "right" answer in some sense, but I like the format of the /pulls page better. When working in a small team my approach is to use /pulls but substitute "involves" for "author", like this (for reference, the same query using /search and "author").
You will get "extra" hits where the author is someone outside the list, but it's another trick to know. (Names in the examples picked at random from recent public PRs)
You could simply use the advanced search for that: https://github.com/search/advanced 🤗
Option 1: Using GitHub's Search Query Language
Go to https://github.com/search
Type in a query following the format of this example (replacing author:* with your usernames).
Example: is:pr repo:zino-hofmann/graphql-flutter author:apackin author:kvenn
Explained
is:pr - only PRs (since GitHub treats Issues and PRs both as "Issues")
repo: - only show PRs in that repo
author: - only show PRs for these authors
It shows as "Issues", but the list will only include PRs.
Option 2: Fancy Bookmark/Alfred/Spotlight Search
You can modify the query params in the following URL to list the people on your team:
Replace <username1,2,3,4> with your teammates' GitHub usernames.
Replace <your_company> with your company's URL (or remove it entirely if not on Enterprise).
https://github.<your_company>.com/search?q=author%3A<username1>+author%3A<username2>+author%3A<username3>+author%3A<username4>+is%3Apr&type=Issues
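If the team list changes often, the bookmark URL can be generated rather than edited by hand. A minimal sketch follows, assuming a hypothetical helper pr_search_url and a host argument that stands in for the github.<your_company>.com placeholder.

```python
from urllib.parse import urlencode


def pr_search_url(authors, host="github.com"):
    """Build a search URL matching PRs opened by any of the given authors."""
    # Repeated author: terms are OR-ed by the search engine; is:pr drops issues.
    terms = ["author:" + name for name in authors] + ["is:pr"]
    query = urlencode({"q": " ".join(terms), "type": "Issues"})
    return "https://{}/search?{}".format(host, query)
```

For example, pr_search_url(["apackin", "kvenn"]) produces a URL of the same shape as the one above.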
Option 3: Using GitHub's Advanced Search UI
You can use GitHub's "Advanced Search" to achieve what you're looking for without needing to learn GitHub's query language.
For public repos: http://github.com/search/advanced
For internal/enterprise repos: http://github.<your_company>.com/search/advanced
You can use the fields below for filtering:
To filter for specific repos, use "Advanced options" -> "In these repositories"
To filter for specific authors, use "Issues options" -> "Opened by the author"
It uses query params under the hood, so you can generate the search with the UI and copy and paste the resulting URL (to use for Option 2).
Note: You'll need to add "is:pr" to the resulting search query; there is no way to do that in the UI.

List of branches a commit appears on

Using the GitHub API (v3) I'd like to figure out which branches a commit appears on. I didn't find a way to directly query this, either through repo commits or the commit data objects. An alternate solution would be to list all the branches, and compare with their HEAD; I guess the comparison would fail if the commit is not on the given branch.
Is this supported via the current API, and I just missed it? If not, do you have a (better) workaround?
That's not possible directly via the GitHub API.
Workaround 1:
get a list of all branches
for each branch, get a list of commits on that branch
check if the commit is in the list of commits for each branch
Workaround 2 (I think this will work, but not 100% sure if I missed a case):
get a list of all branches
for each branch, compare the branch with the SHA:
https://api.github.com/repos/:user/:repo/compare/:branch...:sha_of_commit
If the value of the status attribute in the response is diverged or ahead, then the commit is not in the branch. If the value of the status attribute is behind or identical, then the commit is in the branch.
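Workaround 2 can be sketched as follows. The function names are hypothetical, the request is unauthenticated, and branch names containing unusual characters may need URL quoting that this sketch does not attempt.

```python
import json
import urllib.request


def compare_url(owner, repo, branch, sha):
    """URL of the compare endpoint with the branch as base and the commit as head."""
    return "https://api.github.com/repos/{}/{}/compare/{}...{}".format(
        owner, repo, branch, sha
    )


def status_means_contained(status):
    """Apply the rule above: behind or identical means the commit is in the branch."""
    return status in ("behind", "identical")


def branch_contains(owner, repo, branch, sha):
    """Ask the compare endpoint whether `branch` contains commit `sha`."""
    with urllib.request.urlopen(compare_url(owner, repo, branch, sha), timeout=10) as response:
        return status_means_contained(json.load(response)["status"])
```

Running branch_contains once per branch from the branch list gives the set of branches that contain the commit, at the cost of one API call per branch.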
I haven't checked if this is directly supported by the GitHub API, but this is trivial to do using plain Git:
git branch --all --contains <commit>
That will list all branches (local and remote) in a local repository that contain the given commit.
There is no direct endpoint to list the branches by a commit ID. But you can use a combination of other endpoints to solve this.
Condition 1
Use branches-where-head endpoint to get the name of the branches where the given commit ID is the head.
Condition 2
If condition 1 returns an empty list, use the pulls info endpoint to get the pull request objects associated with the commit ID, and extract the branch information from them.
Tip: The returned object contains the properties base.ref and head.ref, which hold branch names. If base.ref is the name of the master branch, use head.ref, which should give you the name of the branch that contains the commit ID.
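The two conditions can be combined along these lines. This is only a sketch: the URL paths are my reading of the endpoint names the answer mentions, the helper names are invented, and the function works on already-parsed JSON so that the fetching is left to whatever HTTP client is in use.

```python
API = "https://api.github.com"


def branches_where_head_url(owner, repo, sha):
    """Endpoint for Condition 1: branches whose head is the given commit."""
    return "{}/repos/{}/{}/commits/{}/branches-where-head".format(API, owner, repo, sha)


def commit_pulls_url(owner, repo, sha):
    """Endpoint for Condition 2: pull requests associated with the commit."""
    return "{}/repos/{}/{}/commits/{}/pulls".format(API, owner, repo, sha)


def candidate_branches(head_branches, pulls, main_branch="master"):
    """Apply Condition 1, then fall back to Condition 2.

    head_branches: parsed JSON list from the branches-where-head endpoint.
    pulls: parsed JSON list of pull request objects for the commit.
    """
    names = [branch["name"] for branch in head_branches]
    if names:
        return names
    # Per the tip above: when base.ref is the main branch, head.ref names the source branch.
    return [pr["head"]["ref"] for pr in pulls if pr["base"]["ref"] == main_branch]
```

Note that this only finds branches where the commit is the head or that are the source of a pull request, so it can miss branches that merely contain the commit somewhere in their history.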
In case anyone is using the PyGithub (https://pygithub.readthedocs.io/en/latest/introduction.html) library for API calls:
First get the list of all branches and then compare each one with the commit:
from github import Github
g = Github("<accesskey>")
repo = g.get_repo("<repository>")
# Compare the branch (base) with the commit (head).
status = repo.compare('stable-2.11', "<commit_id>").status
# If the status is behind or identical, then the commit is in the branch.
is_commit_in_branch = status in ('behind', 'identical')
GitLab API
Following Ivan Zuzak's solution number 2, to know if a commit is on a branch:
Use GitLab's repository compare API, and compare from the branch, to the commit
GET /projects/:id/repository/compare?from=<branch>&to=<sha_of_commit>
If the commits list is empty, then yes, the commit is on that branch.
In Python, using python-gitlab:
def is_commit_on_branch(project, commit, branch):
    c = project.repository_compare(branch, commit)
    return not c['commits']