Retrieve ALL GitHub issues of a specific Project using the GraphQL API

I've been trying to retrieve all GitHub issues of a specific project using their GraphQL API.
The problem is that the items connection requires a first or last argument; without one the query doesn't work. But when I specify one of them, I only get a subset of the issues.
I thought I could get the first 100, then use pagination to get the next 100, and so on until the response is an empty list. From what I've read, though, I cannot find a parameter on items that defines a page.
What are your thoughts on this? Is there a workaround?
Thanks a lot for your time.

This query seems to work fine and returns page info under items:
query {
  organization(login: "microsoft") {
    projectV2(number: 559) {
      title
      items(first: 100) {
        pageInfo {
          endCursor
          hasNextPage
        }
      }
    }
  }
}
Output:
{
  "data": {
    "organization": {
      "projectV2": {
        "title": "Azure TRE - Engineering",
        "items": {
          "pageInfo": {
            "endCursor": "Njc",
            "hasNextPage": false
          }
        }
      }
    }
  }
}
Tested the query using GitHub Explorer.
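The endCursor and hasNextPage fields are what make fetching everything possible: pass endCursor back as the after argument until hasNextPage is false. Here is a minimal Python sketch of that loop. The run_query callable is a hypothetical stand-in for whatever sends the query to the GraphQL endpoint and returns the decoded JSON, and the content / Issue selection is only an assumption about which fields you want back.

```python
QUERY = """
query($cursor: String) {
  organization(login: "microsoft") {
    projectV2(number: 559) {
      items(first: 100, after: $cursor) {
        nodes {
          content {
            ... on Issue { number title }
          }
        }
        pageInfo { endCursor hasNextPage }
      }
    }
  }
}
"""


def fetch_all_items(run_query):
    """Collect every project item by following endCursor until hasNextPage is false.

    run_query(query, variables) is a hypothetical callable that returns the
    decoded JSON response as a dict.
    """
    items = []
    cursor = None  # a null cursor asks for the first page
    while True:
        response = run_query(QUERY, {"cursor": cursor})
        page = response["data"]["organization"]["projectV2"]["items"]
        items.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return items
        cursor = page["pageInfo"]["endCursor"]
```

Project items that are not issues (pull requests, draft issues) come back with an empty content object under this selection, so you may want to filter them out afterwards.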

Related

Pagination on reactionGroups in the GitHub GraphQL API

I’m trying to extract the username of all the users that have reacted to an issue (and how they reacted) with the GitHub GraphQL API. I’ve only been able to extract a maximum of 11 users per reaction group per query, and I haven’t found a way to successfully paginate the queries - the same users are returned each time.
Here’s an example of my query using an issue with many reactions:
{
  repository(owner: "mapbox", name: "mapbox-gl-js") {
    issue(number: 3184) {
      reactionGroups {
        content
        reactors(first: 30) {
          totalCount
          pageInfo {
            hasNextPage
            endCursor
          }
          edges {
            node {
              ... on User {
                login
              }
            }
          }
        }
      }
    }
  }
}
For THUMBS_UP reactions this correctly returns totalCount: 77. However, there are only 11 usernames returned (not the 30 requested). The value of hasNextPage in pageInfo is false, and using the returned cursor value or modifying the reactors query to last:30 instead of first:30 has no impact on which 11 users are returned.
Is there a way I can modify my query to get this working (I’m new to GraphQL) or is this a current limitation of the API? Thanks!
(I've also asked this on the GitHub community forums, but no reply yet - see here)

How to query all languages from GitHub's GraphQL API

I am trying to query GitHub for information about repositories using their v4 GraphQL API. One of the things I want to query is the breakdown of all the languages used in a repo or, if possible, the breakdown of languages across all of a user's repos. I have tried the following snippet, but it returns null, whereas primaryLanguage returns the primary language:
languages: {
edges: {
node: {
name
}
}
}
The only thing I can find relating to languages is the primary language. But I would like to show stats for a user and all the languages they use, either in a single repo or across all of their repos.
You are missing the slicing argument: here you can put first: 100 to get the first 100 languages for each repository:
{
  user(login: "torvalds") {
    repositories(first: 100) {
      nodes {
        primaryLanguage {
          name
        }
        languages(first: 100) {
          nodes {
            name
          }
        }
      }
    }
  }
}
If you want stats per language (e.g. to know which is the second or third language), I'm afraid this is not currently possible with the GraphQL API, but you can use the REST API's List Languages endpoint, for instance https://api.github.com/repos/torvalds/linux/languages
I wanted to point out something else that may help.
You can get more details about a language (i.e. primary, secondary, etc.) by looking at the language size: compare the totalSize for the whole repo to the size of each language it contains.
The following query (an example for pytorch) will get the data you need. Put it into GitHub's GraphQL Explorer to check it out.
{
  repository(name: "pytorch", owner: "pytorch") {
    languages(first: 100) {
      totalSize
      edges {
        size
        node {
          name
          id
        }
      }
    }
  }
}
You will get an output of the form
{
  "data": {
    "repository": {
      "languages": {
        "totalSize": 78666590,
        "edges": [
          {
            "size": 826272,
            "node": {
              "name": "CMake",
              "id": "MDg6TGFuZ3VhZ2U0NDA="
            }
          },
          {
            "size": 29256797,
            "node": {
              "name": "Python",
              "id": "MDg6TGFuZ3VhZ2UxNDU="
            }
          }, ...
To get % for each language just do size / totalSize * 100
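As a rough illustration of that arithmetic, here is a small Python sketch that turns the languages object from such a response into percentages. The language_percentages name is just an illustrative helper, and the sample data is trimmed to the two edges visible in the output above.

```python
def language_percentages(languages):
    """Return {language name: percent of totalSize}, largest share first."""
    total = languages["totalSize"]
    if total == 0:
        return {}  # empty repository: nothing to divide by
    shares = {
        edge["node"]["name"]: edge["size"] / total * 100
        for edge in languages["edges"]
    }
    return dict(sorted(shares.items(), key=lambda kv: kv[1], reverse=True))


# Trimmed to the two edges shown in the sample output.
sample = {
    "totalSize": 78666590,
    "edges": [
        {"size": 826272, "node": {"name": "CMake"}},
        {"size": 29256797, "node": {"name": "Python"}},
    ],
}

for name, percent in language_percentages(sample).items():
    print(f"{name}: {percent:.2f}%")
```

Because only two of the edges are included here, the percentages do not add up to 100; with the full response they would.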

Get GitHub Repository Insights via GitHub GraphQL API (v4)

I want to obtain information about the number of times my projects have been viewed, cloned and where the traffic came from (individually).
I can currently view this Traffic information by clicking on the Insights button of the repository (via the web interface).
Is there a schema in the GitHub v4 GraphQL API to retrieve this information?
The closest I got was the following; nodes didn't contain any sort of statistical data:
{
  viewer {
    repositories(first: 100) {
      totalCount
      nodes {
        name
        description
      }
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
// response
{
  "data": {
    "viewer": {
      "repositories": {
        "totalCount": 55,
        "nodes": [
          {
            "name": "Repo Name",
            "description": "Repo Description"
          },
          {
            ...
          }
        ]
      }
    }
  }
}
Currently it is not possible to retrieve traffic information using the GraphQL API.
Alternatively, you can use the REST API v3, as described in the GitHub documentation.
Please note that viewing traffic requires a token with push access to the repositories in question.
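For reference, the REST traffic data lives under /repos/{owner}/{repo}/traffic/ (views, clones and popular/referrers). Here is a minimal sketch using only the Python standard library. The function names are illustrative, and the owner, repo and token values you pass in are your own; the token is assumed to have push access as noted above.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"
TRAFFIC_PATHS = {
    "views": "traffic/views",
    "clones": "traffic/clones",
    "referrers": "traffic/popular/referrers",
}


def traffic_request(owner, repo, kind, token):
    """Build an authenticated request for one of the REST traffic endpoints."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/{TRAFFIC_PATHS[kind]}"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def fetch_traffic(owner, repo, kind, token):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(traffic_request(owner, repo, kind, token)) as resp:
        return json.load(resp)
```

Calling fetch_traffic("your-user", "your-repo", "views", token) would then return the decoded views payload for that repository.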

How can I get branch count on a repository via GitHub API?

I'm working on a UI which lists all repositories of a given user or organization. This is using a tree format, where the first level is the repositories, and the second level of hierarchy (child nodes) are to be each branch, if expanded.
I'm using a mechanism that deliberately doesn't require me to pull a list of all branches of a given repo, because the API has rate limits on API calls. Instead, all I have to do is instruct it how many child nodes it contains, without actually assigning values to them (until the moment the user expands it). I was almost sure that fetching a list of repos includes branch count in the result, but to my disappointment, I don't see it. I can only see count of forks, stargazers, watchers, issues, etc. Everything except branch count.
The intention is that the UI will know in advance the number of branches to populate the child nodes, but not actually fetch them until the user has expanded the parent node, thus immediately showing empty placeholders for each branch, followed by asynchronous loading of the actual branches. Again, this is because I need to avoid too many API calls. As the user scrolls, it will use pagination to fetch only the page(s) it needs to show, and keep them cached for later display.
Specifically, I'm using the Virtual TreeView for Delphi:
procedure TfrmMain.LstInitChildren(Sender: TBaseVirtualTree; Node: PVirtualNode;
  var ChildCount: Cardinal);
var
  L: Integer;
  R: TGitHubRepo;
begin
  L:= Lst.GetNodeLevel(Node);
  case L of
    0: begin
      //TODO: Return number of branches...
      R:= TGitHubRepo(Lst.GetNodeData(Node));
      ChildCount:= R.I['branch_count']; //TODO: There is no such thing!!!
    end;
    1: ChildCount:= 0; //Branches have no further child nodes
  end;
end;
Is there something I'm missing that allows me to get repo branch count without having to fetch a complete list of all of them up-front?
You can use the new GraphQL API instead. This allows you to tailor your queries and results to just what you need. Rather than grabbing the count and then later filling in the branches, you can do both in one query.
Try out the Query Explorer.
query {
  repository(owner: "octocat", name: "Hello-World") {
    refs(first: 100, refPrefix:"refs/heads/") {
      totalCount
      nodes {
        name
      }
    },
    pullRequests(states:[OPEN]) {
      totalCount
    }
  }
}
{
  "data": {
    "repository": {
      "refs": {
        "totalCount": 3,
        "nodes": [
          {
            "name": "master"
          },
          {
            "name": "octocat-patch-1"
          },
          {
            "name": "test"
          }
        ]
      },
      "pullRequests": {
        "totalCount": 192
      }
    }
  }
}
Pagination is done with cursors. First you get the first page: up to 100 items at a time, but we're using just 2 here for brevity. Each edge in the response carries a unique cursor.
{
  repository(owner: "octocat", name: "Hello-World") {
    pullRequests(first:2, states: [OPEN]) {
      edges {
        node {
          title
        }
        cursor
      }
    }
  }
}
{
  "data": {
    "repository": {
      "pullRequests": {
        "edges": [
          {
            "node": {
              "title": "Update README"
            },
            "cursor": "Y3Vyc29yOnYyOpHOABRYHg=="
          },
          {
            "node": {
              "title": "Just a pull request test"
            },
            "cursor": "Y3Vyc29yOnYyOpHOABR2bQ=="
          }
        ]
      }
    }
  }
}
You can then ask for more elements after the cursor. This will get the next 2 elements.
{
  repository(owner: "octocat", name: "Hello-World") {
    pullRequests(first:2, after: "Y3Vyc29yOnYyOpHOABR2bQ==", states: [OPEN]) {
      edges {
        node {
          title
        }
        cursor
      }
    }
  }
}
Queries can be written like functions and passed arguments. The arguments are sent in a separate bit of JSON. This allows the query to be a simple unchanging string.
This query does the same thing as before.
query NextPullRequestPage($pullRequestCursor:String) {
  repository(owner: "octocat", name: "Hello-World") {
    pullRequests(first:2, after: $pullRequestCursor, states: [OPEN]) {
      edges {
        node {
          title
        }
        cursor
      }
    }
  }
}
{
  "pullRequestCursor": "Y3Vyc29yOnYyOpHOABR2bQ=="
}
{ "pullRequestCursor": null } will fetch the first page.
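On the wire, the query string and its variables travel together in one JSON body, under the query and variables keys, which is POSTed to https://api.github.com/graphql. Here is a sketch of building that body in Python and picking the next cursor out of a response. Both build_payload and last_cursor are illustrative helper names, not part of any library.

```python
import json

NEXT_PAGE_QUERY = """
query NextPullRequestPage($pullRequestCursor: String) {
  repository(owner: "octocat", name: "Hello-World") {
    pullRequests(first: 2, after: $pullRequestCursor, states: [OPEN]) {
      edges {
        node { title }
        cursor
      }
    }
  }
}
"""


def build_payload(cursor=None):
    """Serialize the unchanging query plus the per-request cursor variable."""
    return json.dumps(
        {"query": NEXT_PAGE_QUERY, "variables": {"pullRequestCursor": cursor}}
    )


def last_cursor(response):
    """Return the cursor of the last edge, or None when the page is empty."""
    edges = response["data"]["repository"]["pullRequests"]["edges"]
    return edges[-1]["cursor"] if edges else None
```

The loop is then: send build_payload(None), read last_cursor from the response, send build_payload with that value, and stop once a page comes back empty.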
The GraphQL API's rate limit calculation is more complex than the REST API's. Instead of calls per hour, you get 5000 points per hour. Each query costs a certain number of points, which roughly corresponds to how much it costs GitHub to compute the results. You can find out how much a query costs by asking for its rateLimit information. If you pass dryRun: true, it will just tell you the cost without running the query.
{
  rateLimit(dryRun:true) {
    limit
    cost
    remaining
    resetAt
  }
  repository(owner: "octocat", name: "Hello-World") {
    refs(first: 100, refPrefix: "refs/heads/") {
      totalCount
      nodes {
        name
      }
    }
    pullRequests(states: [OPEN]) {
      totalCount
    }
  }
}
{
  "data": {
    "rateLimit": {
      "limit": 5000,
      "cost": 1,
      "remaining": 4979,
      "resetAt": "2019-08-21T05:13:56Z"
    }
  }
}
This query costs just one point. I have 4979 points remaining and I'll get my rate limit reset at 05:13 UTC.
The GraphQL API is extremely flexible. You should be able to do more with it while using fewer GitHub resources and less code to work around rate limits.
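If you want a client to act on the rateLimit object, something along these lines could work. This is only a sketch: can_afford and seconds_until_reset are made-up helper names, and the budget check assumes every planned run costs the same as the one you measured.

```python
from datetime import datetime, timezone


def seconds_until_reset(rate_limit, now=None):
    """Seconds until the point budget resets, given a rateLimit object."""
    # resetAt is an ISO 8601 UTC timestamp such as "2019-08-21T05:13:56Z".
    reset_at = datetime.strptime(rate_limit["resetAt"], "%Y-%m-%dT%H:%M:%SZ")
    reset_at = reset_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (reset_at - now).total_seconds())


def can_afford(rate_limit, planned_queries=1):
    """True when the remaining points cover the planned number of runs."""
    return rate_limit["remaining"] >= rate_limit["cost"] * planned_queries
```

A UI like the one in the question could check can_afford before expanding a node and, if the budget is exhausted, wait for seconds_until_reset before trying again.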

How to get total number of commits using GitHub API

I am trying to collect some statistics about our project repositories on GitHub. I am able to get the total number of commits for each contributor, but only for the default branch.
curl https://api.github.com/repos/cms-sw/cmssw/stats/contributors
The problem is: how can I get the same info for non-default branches, where I can specify a branch name? Is any such operation possible using the GitHub API?
Thanks.
You should be able to use GitHub's GraphQL API to get at this data, although it won't be aggregated for you.
Try the following query in their GraphQL Explorer:
query($owner:String!, $name:String!) {
  repository(owner:$owner,name:$name) {
    refs(first:30, refPrefix:"refs/heads/") {
      edges {
        cursor
        node {
          name
          target {
            ... on Commit {
              history(first:30) {
                edges {
                  cursor
                  node {
                    author {
                      email
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
With these variables:
{
  "owner": "rails",
  "name": "rails"
}
That will list the author email for each commit on each branch in a given repository. It is up to you to paginate over the data (by adding an after: "<cursor>" argument alongside first:30, using a cursor value returned in the previous page). You'd also have to aggregate the counts on your end.
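The aggregation step could look roughly like this in Python. It is a sketch that assumes the response has exactly the shape produced by the query above; commits_per_author is an illustrative name, and the counts only cover the commits on the page you fetched (30 per branch here) until you paginate.

```python
from collections import Counter


def commits_per_author(response):
    """Map each branch name to a Counter of author emails on the fetched page."""
    counts = {}
    for ref_edge in response["data"]["repository"]["refs"]["edges"]:
        branch = ref_edge["node"]
        # target only has "history" when the ref points at a commit
        history = branch["target"].get("history", {"edges": []})
        counts[branch["name"]] = Counter(
            commit_edge["node"]["author"]["email"]
            for commit_edge in history["edges"]
        )
    return counts
```

To cover whole branches, merge the Counter objects from successive pages with update() as you follow the cursors.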
Hope this helps.