Get if pull request passed all required status checks using GitHub API - github

I need to check via GitHub API if a pull request passed all required status checks. I use GitHub Enterprise 2.8 at this moment.
I know that I can get all status checks for last commit (following statuses_url in pull request). However, I don't know which status checks are set up to be required in a given repository. This is my main problem.
I also need to aggregate these status checks, group them by context and take latest in each context. It's ok, but seems to be reimplementation of logic, that GitHub performs internally when decides if a pull request can be merged.
For my case, it would be ideal to have something like can_be_merged in pull request fields, which meaning is mergeable && all required status checks passed && approved, but as far as I know there is no such field.

Finally solved this! You actually need to get the information off the protected branch, not off the check itself. Here are some API details: https://developer.github.com/v3/repos/branches/#list-required-status-checks-contexts-of-protected-branch.
So the flow to solve this is:
Check if the base branch for the PR is protected, and if so;
Use the above endpoint to determine which checks are required;
Compare the checks on the latest PR commit to the required checks determined in step 2.

Based on #moustachio's answer, if you are using v4 graphql API, you may use
pullRequest(...) {
baseRef {
refUpdateRule {
requiredStatusCheckContexts
}
}
}
to get the same info without additional API request

Related

How to trigger azure pipeline via API in a way it does not report it was manually triggered

We have an Azure pipeline building a static site. When there is a change in a content repository the site needs to be rebuilt. For that, we're using webhooks and Azure DevOps API. The request to queue the build is very simple and is illustrated for example here.
What I don't like about this is that int the build listing it says "Manually triggered for person XY", where the person XY is the one who generated the credentials used in the API request. It seems quite confusing because any API request seems strange to be labeled as "manually requested". What would be the best way how to achieve more semantically correct message?
I've found there is a reason property which can be sent in the request. But none of the values seems to represent what I want and some of them do not work (probably need additional properties and there is no documentation for that).
Based on my test, when you use the Rest API to queue a build and set the build reason, the reason could be shown in the Build(except:batchedCI and buildCompletion).
Here is the Rest API example:
Post https://dev.azure.com/Organization/Project/_apis/build/builds?api-version=4.1
Request Body:
{
"definition": {
"id": 372
},
"reason":"pullRequest"
}
The value : checkInShelveset individualCI pullRequest schedule could show their own names.
The value: manual and none could show manually trigger.
The other value(e.g. All, userCreated) will show Other Build Reason.
For the value: batchedCI and buildCompletion.
BatchedCI: Continuous integration (CI) triggered by a Git push or a TFVC check-in, and the Batch changes was selected.
This means that batch changes are required to achieve this trigger. So it doesn't support to queue build in Rest API .
buildCompletion: you could refer to this ticket This reason doesn't support in Rest API-queue Build.
Note: If you enter a custom value or misspelling, it will always display manual trigger.
In the end, I went with all value and also overriding the person via requestedFor property. This leads to the message "Other build reason", which seems usable to me.
{
"definition": {
"id": 17
},
"reason":"all",
"requestedFor": {
"id": "4f9ff423-0e0d-4bfb-9f6b-e76d2e9cd3ae"
}
}
However, I'm not sure if there aren't any unwanted consequences of this "All reasons" value.

Unable to flag / trigger "Merge when pipeline succeeds" via Gitlab Api (v3/v4)

So as a part of some tests to automatically accept / merge successful pipelines in our git repository i was running some tests to flag the "merge when pipeline succeeds" feature when the pipeline is still running:
So this button is available when the pipeline is still running and will convert to a green 'Accept merge' button when the pipeline succeeds:
(note that this picture was taken afterwards not to confuse the use-case)
additionally i have set these general settings:
So when checking the Gitlab API Documentation it says i should use the following endpoint:
PUT /projects/:id/merge_requests/:merge_request_iid/merge
when using the parameter ?merge_when_pipleline_succeeds=true it should flag the button.
However when i call this endpoint when the pipeline is still running (i built in a wait for 10 mins while testing this) i get the following result:
i am getting a Method Not Allowed. My assumption is that the endpoint i am using is correct because otherwise i would've gotten a bad request / not found return code.
when checking the gitlab merge request i am seeing that indeed the flag is not set to true:
However, when i manually click the blue button the mergerequest looks like this:
Also if i let the pipeline finish and then proceed to call the merge api (w/ or w/o the merge when pipeline succeeds flag) it will accept the merge. It just does not work when the pipeline is running (which is odd because even the button itself only shows when the pipeline is running)
so i am wondering what I am doing wrong here.
I am using a Powershell module to call the GitLab API. The Accept part of the module is not official and was made by myself because i found this feature missing.
I am using the same credentials for the API w/ a personal access token to authenticate to the API. Other functionality of the API work with this token like creating merge requests, retrieving status of a current MR and accepting a MR when the pipeline is finished.
I have tried the following variants :
Use the V3 api with merge_when_build_succeeds=true --> nets the same
result
Uncheck the "Only allow merge request to be merged if the
pipeline succeeds" --> nets the same result
Use ID of the merge request instead of IID
use /merge_when_pipeline_succeeds instead of ?merge_when_pipeline_succeeds=true
use True instead of true --> nets the same result
I get a similar issue with the python-gitlab library on v4. It works sometimes when I use:
mr.merge(merge_when_pipeline_succeeds=True)
Where mr is a ProjectMergeRequest object. However, if the MR has a merge conflict in it I get that 405 Method Not Allowed error back.
My best guess is to see if I can apply logic before calling mr.merge() to check for problems. Will update this if that works.
UPDATE: Looks like there is no feature to check for conflicts as of today. https://gitlab.com/gitlab-org/gitlab-ce/issues/41762
UPDATE 2: You can check merge_status when looking at the MR information, so either that attribute or an exception then mr.merge() fails would let you identify when it won't work.

Recommended way to list all repos/commits for a given user using github3.py

I'm building a GitHub application to pull commit information from our internal repos. I'm using the following code to iterate over all commits:
gh = login(token=gc.ACCESS_TOKEN)
for repo in gh.iter_repos():
for commit in repo.iter_commits():
print(commit.__dict__)
print(commit.additions)
print(commit.author)
print(commit.commit)
print(commit.committer)
print(commit.deletions)
print(commit.files)
print(commit.total)
The additions/deletions/total values are all coming back as 0, and the files attribute is always []. When I click on the url, I can see that this is not the case. I've verified through curl calls that the API indeed has record of these attributes.
Reading more in the documentation, it seems that iter_commits is deprecated in favor of iter_user_commits. Might this be the case why it is not returning all information about the commits? However, this method does not return any repositories for me when I use it like this:
gh = login(token=gc.ACCESS_TOKEN)
user = gh.user()
for repo in gh.iter_user_repos(user):
In short, I'm wondering what the recommended method is to get all commits for all the repositories a user has access to.
There's nothing wrong with iter_repos with a logged in GitHub instance.
In short here's what's happening (this is described in github3.py's documentation): When listing a resource from GitHub's API, not all of the attributes are actually returned. If you want all of the information, you have to request the information for each commit. In short your code should look like this:
gh = login(token=gc.ACCESS_TOKEN)
for repo in gh.iter_repos():
for commit in repo.iter_commits():
commit.refresh()
print(commit.additions)
print(commit.deletions)
# etc.

How do I determine branch name or id in webhook push event?

I was ecstatic when I got a simple webhook event listener working with GitHub push events on my Azure site, but I realize now I'm not seeing the branch name or id in the json payload (example here https://developer.github.com/v3/activity/events/types/#pushevent)
I thought maybe "tree_id" would be it, but it doesn't seem to be. I couldn't find any info about this in GitHubs's doc. Maybe I need to take one of the id's from the event and make another api call to get the branch? The reason for this is I want to be able to link GitHub push events with my app portfolio, which has branches defined. So, the push events are a way to see code change activity on my different apps -- and knowing the branch is therefore important.
I wrote to GitHub support, and they told me that the branch name is part of the "ref" element in the root of the json payload. When parsing from a JToken object called jsonBody, the C# looks like this
var branchName = jsonBody["ref"].ToString().Split('/').Last();
For example in "refs/heads/master", the branch name is "master"
You need to pay closer look on WEBHOOK response mainly. Here is the trick for JSONPATH ( at-least what I did with my jenkins job):
first read your webhook whole response with character "$". You can catch it is some variable like:
$webhookres='$'
echo $webhookres
Once you have response printed, copy it and paste here: https://jsonpath.com/
Now create your pattern. For example if you want branch name (if event is push):
$.ref
Once you have the branch name( it will have extra string with /), simply trim the unwanted part using awk or cut (linux commands).
You are not limited to this only. All you need to work on pattern and you can make use of this approach for getting other values as well like, author, git repo url etc. and then these can be used in your automation further.
even if you are using any other platform like Azure, JSONPATH concept will be same. because as suggested in accepted answer, "jsonBody["ref"]", it is equivalent to $.ref, as altogether you have to identify the PATTERN ( as here PATTERN is 'ref')

How can I get a list of all pull requests for a repo through the github API?

I want to obtain a list of all pull requests on a repo through the github API.
I've followed the instructions at http://developer.github.com/v3/pulls/ but when I query /repos/:owner/:repo/pulls it's consistently returning fewer pull requests than displayed on the website.
For example, when I query the torvalds/linux repo I get 9 open pull requests (there are 14 on the website). If I add ?state=closed I get a different set of 11 closed pull requests (the website shows around 20).
Does anyone know where this discrepancy arises, and if there's any way to get a complete list of pull requests for a repo through the API?
You can get all pull requests (closed, opened, merged) through the variable state.
Just set state=all in the GET query, like this->
https://api.github.com/repos/:owner/:repo/pulls?state=all
For more info: check the Parameters table at https://developer.github.com/v3/pulls/#list-pull-requests
Edit: As per Tomáš Votruba's comment:
the default value for, "per_page=30". The maximum is per_page=100. To get more than 100 results, you need to call it multiple itmes: "&page=1", "&page=2"...
PyGithub (https://github.com/PyGithub/PyGithub), a Python library to access the GitHub API v3, enables you to get paginated resources.
For example,
g = Github(login_or_token=$YOUR_TOKEN, per_page=100)
r = g.get_repo($REPO_NUMBER)
for pull in r.get_pulls('all'):
# You can access pulls
See the documentation (http://pygithub.readthedocs.io/en/latest/index.html).
With Github's new official CLI (command line interface):
gh pr list --repo OWNER/REPO
which would produce something like:
Showing 2 of 2 pull requests in OWNER/REPO
#62 Doing something that-weird-branch-name
#58 My PR title wasnt-inspired-branch
See additional details and options and installation instructions.
There is a way to get a complete list and you're doing it. What are you using to communicate with the API? I suspect you may not be doing something correctly. For example (there are only 13 open pull requests currently) using my API wrapper (github3.py) I get all of the open pull requests. An example of how to do it without my wrapper in python is:
import requests
r = requests.get('https://api.github.com/repos/torvalds/linux/pulls')
len(r.json()) == 13
and I can also get that result (vaguely) in cURL by counting the results myself: curl https://api.github.com/repos/torvalds/linux/pulls.
If you, however, run into a repository with more than 25 (or 30) pull requests that's an entirely different issue but most certainly it is not what you're encountering now.
If you want to retrieve all pull requests (commits, comments, issues etc) you have to use pagination.
https://developer.github.com/v3/#pagination
The GET request "pulls" will only return open pull-requests.
If you want to get all pull-requests either you do set the parameter state to all, or you use issues.
Extra information
If you need other data from Github, such as issues, then you can identify pull-requests from issues, and you can then retrieve each pull-request no matter if it is closed or open. It will also give you a couple of more attributes (mergeable, merged, merge-commit-sha, nr of commits etc)
If an issue is a pull-request, then it will contain that attribute. Otherwise, it is just an issue.
From the API: https://developer.github.com/v3/pulls/#labels-assignees-and-milestones
"Every pull request is an issue, but not every issue is a pull request. For this reason, “shared” actions for both features, like manipulating assignees, labels and milestones, are provided within the Issues API."
Edit I just found that issues behaves similar to pull-requests, so one would need to do retrieve all by setting the state parameter to all
You can also use GraphQL API v4 to request all pull requests for a repo. It requests all the pull requests by default if you don't specify the states field :
{
repository(name: "material-ui", owner: "mui-org") {
pullRequests(first: 100, orderBy: {field: CREATED_AT, direction: DESC}) {
totalCount
nodes {
title
state
author {
login
}
createdAt
}
}
}
}
Try it in the explorer
The search API shoul help: https://help.github.com/enterprise/2.2/user/articles/searching-issues/
q = repo:org/name is:pr ...
GitHub provides a "Link" header which specifies the previous, next and last URL to fetch the values.Eg, Link Header response,
<https://api.github.com/repos/:owner/:repo/pulls?state=all&page=2>; rel="next", <https://api.github.com/repos/:owner/:repo/pulls?state=all&page=15>; rel="last"
rel="next" suggests the next set of values.
Here's a snippet of Python code that retrieves information of all pull requests from a specific GitHub repository and parses it into a nice DataFrame:
import pandas as pd
organization = 'pvlib'
repository = 'pvlib-python'
state = 'all' # other options include 'closed' or 'open'
page = 1 # initialize page number to 1 (first page)
dfs = [] # create empty list to hold individual dataframes
# Note it is necessary to loop as each request retrieves maximum 30 entries
while True:
url = f"https://api.github.com/repos/{organization}/{repository}/pulls?" \
f"state={state}&page={page}"
dfi = pd.read_json(url)
if dfi.empty:
break
dfs.append(dfi) # add dataframe to list of dataframes
page += 1 # Advance onto the next page
df = pd.concat(dfs, axis='rows', ignore_index=True)
# Create a new column with usernames
df['username'] = pd.json_normalize(df['user'])['login']