Getting the list of all branches in a GitHub organisation without triggering the rate limit, using Bash?

While trying to establish a list of incoming GitHub commits, I stumbled across the GitHub API rate limit of 60 calls per hour (for unauthenticated requests). As explained in this answer, one can get the list of branches with an API call using:
https://api.github.com/repos/{username}/{repo-name}/branches
However, that triggers the rate limit for the average GitHub organisation/user. So I thought I'd try a different approach, using RSS/atom format. However, as that same answer explains, the atom format/rss feed seems to depend on the user having a list of all branches in a repository. This question asks for an overview of all commits in a repository, yet instead it is given an answer for all commits in the default branch of the repository. And this question receives a working answer that triggers the rate limit, as it relies on at least 1 API call per repository.
Hence, I would like to ask: How could one get a list of all branches of a GitHub user, using at most 1 GitHub API call?
Note, using atom views would be perfectly fine, however, I have not found an atom view like: https://github.com/:owner/:repo/commits.atom or https://github.com/:owner/:repo/branches.atom that displays all branches in a repository. I would strongly prefer a solution that does not rely on a third party like: https://rsshub.app/github/repos/yanglr as I imagine, they too will at some point start rate-limiting.
My current approach is to scrape the source code of https://github.com/:user/:repo/branches using bash. However, I imagine there might exist a more efficient solution to this.
MWE
Thanks to the comments, I was able to find a bash MWE that performs a GraphQL query from the terminal. It is given in this answer, where bearer is not a variable but the authentication scheme, and the ...... should be your personal GitHub access token. I am currently looking into how to get the repositories beyond the first hundred. Then I'll look at how to get the branches of those repositories.
Attempt I
The following query yields a json with the repositories and first 4 branches in each repository of a user!
File name: examplequery1.gql
query {
  repositoryOwner(login: "somegithubuser") {
    repositories(first: 40) {
      edges {
        node {
          nameWithOwner
          refs(
            refPrefix: "refs/heads/"
            orderBy: { direction: DESC, field: TAG_COMMIT_DATE }
            first: 4
          ) {
            edges {
              node {
                ... on Ref {
                  name
                }
              }
            }
          }
        }
      }
    }
  }
}
Next, a bash script is made that runs the query:
#!/usr/bin/env bash
# Runs a GraphQL query on GitHub. Execute with:
# ./run_graphql_query.sh examplequery1.gql
GITHUB_PERSONAL_ACCESS_TOKEN_GLOBAL="your_github_personal_access_token"
# Require exactly one argument: the path to the query file.
if [ $# -ne 1 ]; then
  echo "Usage: $0 <query-file.gql>" >&2
  exit 1
fi
if [ ! -f "$1" ]; then
  echo "Query file not found: $1" >&2
  exit 1
fi
# Form the query JSON; newlines become spaces so adjacent tokens stay separated.
QUERY=$(jq -n \
  --arg q "$(tr '\n' ' ' < "$1")" \
  '{ query: $q }')
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: bearer $GITHUB_PERSONAL_ACCESS_TOKEN_GLOBAL" \
  --data "$QUERY" \
  https://api.github.com/graphql
It can be run with:
./run_graphql_query.sh examplequery1.gql
There are two more issues to resolve before I can answer the question: how to iterate over all repositories instead of only the first 100, and how to parse the JSON into a list of branches per repository.
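The two open points above (paging past the first 100 repositories and turning the JSON into a branch list per repository) could be approached along the following lines. This is a minimal sketch, not something verified against the live API: it assumes the query is extended with `pageInfo { hasNextPage endCursor }` and an `$after` cursor variable (the usual GraphQL connection pagination), and `run_query` is a hypothetical callable that POSTs the query plus variables (for instance by wrapping the bash script above) and returns the decoded JSON.

```python
from typing import Callable, Dict, List, Optional

# Assumed variant of the query above: $login and $after variables plus pageInfo.
QUERY = """
query($login: String!, $after: String) {
  repositoryOwner(login: $login) {
    repositories(first: 100, after: $after) {
      pageInfo { hasNextPage endCursor }
      edges {
        node {
          nameWithOwner
          refs(refPrefix: "refs/heads/", first: 100) {
            edges { node { name } }
          }
        }
      }
    }
  }
}
"""


def branches_per_repo(response: dict) -> Dict[str, List[str]]:
    """Flatten one response page into {"owner/repo": [branch, ...]}."""
    repos = response["data"]["repositoryOwner"]["repositories"]["edges"]
    return {
        repo["node"]["nameWithOwner"]: [
            ref["node"]["name"] for ref in repo["node"]["refs"]["edges"]
        ]
        for repo in repos
    }


def all_branches(
    run_query: Callable[[str, dict], dict], login: str
) -> Dict[str, List[str]]:
    """Follow endCursor until hasNextPage is false, merging every page."""
    result: Dict[str, List[str]] = {}
    after: Optional[str] = None
    while True:
        page = run_query(QUERY, {"login": login, "after": after})
        result.update(branches_per_repo(page))
        info = page["data"]["repositoryOwner"]["repositories"]["pageInfo"]
        if not info["hasNextPage"]:
            return result
        after = info["endCursor"]
```

Each page costs one request, so a user with 250 repositories would need three calls. Repositories with more than 100 branches would additionally need paging on `refs`, which this sketch leaves out.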

Related

How can I add in-code annotations in PR reviews using GitHub's `gh` tool?

On the Github web UI, I can click on a line and say something like:
Good architecture, but please pass the std::vector<std::uint8_t> hugedata as const &, to avoid a copy.
and bundle such comments as one review with a final verdict.
So far, I've only found gh pr review, which only allows me to generally approve/comment/reject a PR that I'm reviewing.
Is there a way to do detailed in-code reviews using the gh CLI?
If not, how can I use the GitHub API to do that myself?
gh doesn't seem to have built in support for this, but you can still use gh api to call the API:
Note the repository owner, repository name, and pull request ID
Get a diff of the pull request so you can get the right files and positions
gh api \
-H "Accept: application/vnd.github.v3.diff" \
/repos/OWNER/REPO/pulls/ID
Note any files you want to comment on after +++
Note any positions you want to comment on after @@ (counted as the number of lines below that hunk header)
Create a pull request review with your comments (using the file as path, the line offset from the start of the hunk as position, and your comment as body)
gh api \
-X POST \
/repos/OWNER/REPO/pulls/ID/reviews \
--input - <<< '{ "comments": [{"path": ...,"position": ...,"body": ...}, ...] }'
Submit the pull request review on GitHub (alternatively if you want to automate this, add the body and event properties to your review's body)
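Counting positions by hand is error-prone, so the lookup described in the steps above could be scripted. The sketch below is an illustration rather than part of `gh`: `added_line_positions` is a hypothetical helper that reads the unified diff text and reports, for every added line, the file path and its position counted from the first @@ header of that file. It assumes that later hunk headers in the same file each count as one line, which is how GitHub's documentation describes the position value.

```python
from typing import List, Optional, Tuple


def added_line_positions(diff_text: str) -> List[Tuple[Optional[str], int, str]]:
    """Return (path, position, text) for every added line in a unified diff."""
    results: List[Tuple[Optional[str], int, str]] = []
    path: Optional[str] = None
    position: Optional[int] = None  # None until the first @@ header of a file
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # A new file starts: reset the path and the position counter.
            path, position = None, None
        elif position is None and line.startswith("+++ "):
            # File header, only recognised before the first hunk of the file.
            path = line[4:]
            if path.startswith("b/"):
                path = path[2:]
        elif line.startswith("@@"):
            # The first header is position 0; later headers count as a line.
            position = 0 if position is None else position + 1
        elif position is not None:
            position += 1
            if line.startswith("+"):
                results.append((path, position, line[1:]))
    return results
```

Each tuple maps directly onto the path and position fields of one entry in the comments array. Lines such as "\ No newline at end of file" are counted like any other diff line here, which has not been checked against the API.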

jfrog cli artifact search by filename pattern

I want to search for a filename pattern across entire JFrog ARM
without knowing the explicit repository name in the JFrog cli.
jfrog rt s "reponame/*pattern*"
is giving the results as expected in a specific repo.
But I have repo1, repo2, repo3, ... so on.
How do I search using wildcard for reponame, below is not working.
jfrog rt s "*/*pattern*"
Basically I want the jfrog cli equivalent of the curl GET request search
"https://server/artifactory/api/search/artifact?name=*pattern*"
This is not for cli client, but an alternative way to get desired feature. Spent some time looking at API here:
https://www.jfrog.com/confluence/display/RTF/Artifactory+REST+API
I recommend scrolling down that page slowly and reading it in its entirety, as it lists a lot of possible commands and the syntax is excellent. I executed a few searches and they searched all local repositories, so there is no need to search the repositories one by one. Command syntax:
export url="http://url/to/artifactory"
curl --noproxy '*' -X GET "$url/api/search/artifact?name=log4j*"
Read link above for more granular search options/syntax.
How I set it up:
alias artpost='curl -X POST "http://url/artifactory/api/search/aql" -T - -u admin:password'
Some example usage:
echo 'items.find({"name": {"$match" : "log4j*"}})' | artpost
echo 'items.find({"$and" : [{"created" : {"$gt" : "2017-06-12"}},{"name": {"$nmatch" : "*surefire*"}}]})' | artpost

GitHub api to obtain last N number of commits

Is it possible to obtain the last N commits on a particular branch of a GitHub repository using the GitHub API?
I found a few GitHub API details regarding commits here, but none of them explain how to get the last N commits.
Can anyone suggest a better approach?
Also, is it possible to identify the changed file types in the last commit from a user?
You can try this Github API to get the last N number of commits,
INPUT:
GIT_REPO="https://api.github.com/repos/kubernetes/kubernetes" # Input Git Repo
BRANCH_NAME="master" # Input Branch Name
COMMITS_NUM="5" # Input to get last "N" number of commits
# per_page is capped at 100 by the API; head trims the output to the last N commits
curl --silent --request GET --header "Accept: application/vnd.github+json" "$GIT_REPO/commits?sha=$BRANCH_NAME&page=1&per_page=100" | jq --raw-output '.[] | "\(.sha)|\(.commit.author.date)|\(.commit.message)|\(.commit.author.name)|\(.commit.author.email)" | gsub("[\n\t]"; "")' | awk 'NF' | awk '{$1=$1;print}' | head -n "$COMMITS_NUM"
OUTPUT:
COMMIT_ID|DATE/TIME|COMMIT_MESSAGE|AUTHOR_NAME|AUTHOR_EMAIL
5ed4b76a03b5eddc62939a1569b61532b4a06a72|2020-11-26T15:24:19Z|Merge pull request #96421 from dgrisonnet/fix-apiservice-availabilityFix aggregator_unavailable_apiservice gauge|Kubernetes Prow Robot|k8s-ci-robot@users.noreply.github.com
c1f36fa6f28d3618c03b65799bc3f58007624e5f|2020-11-25T06:32:41Z|Merge pull request #96829 from songjiaxun/azuredisk_api_versionfix: change disk client API version for Azure Stack|Kubernetes Prow Robot|k8s-ci-robot@users.noreply.github.com
c678434623be4957d892a9865e5649f887a40c49|2020-11-24T21:20:39Z|Merge pull request #96831 from bobbypage/vendor-cadvisor-v0_38_5vendor: update cAdvisor to v0.38.5|Kubernetes Prow Robot|k8s-ci-robot@users.noreply.github.com
c652ffbe4a29143623a1aaec39f745575f7e43ad|2020-11-24T14:59:01Z|Merge pull request #96636 from Nordix/disable-nodeport-2service.spec.AllocateLoadBalancerNodePorts followup|Kubernetes Prow Robot|k8s-ci-robot@users.noreply.github.com
4a46efb70701ee00028723ecb137e401d83be4f4|2020-11-24T07:45:19Z|vendor: update cAdvisor to v0.38.5|David Porter|david@porter.me
Note:
1. Make sure jq is installed, so that the output can be formatted and the JSON keys parsed as required.
2. Make sure to update the Git repo URL (GIT_REPO) used by the curl command.
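For readers who would rather not chain jq, awk and head, the same formatting step can be sketched in Python. This is an illustrative equivalent rather than the answer's own method: `commits_url` and `format_commits` are hypothetical helper names, and `format_commits` expects the already decoded JSON list returned by the commits endpoint, so the HTTP request itself is left out.

```python
import re
from typing import List


def commits_url(repo_api_url: str, branch: str, n: int) -> str:
    """Build the commits URL; the API caps per_page at 100."""
    return f"{repo_api_url}/commits?sha={branch}&per_page={min(n, 100)}"


def format_commits(commits: List[dict], n: int) -> List[str]:
    """Render the newest n commits as sha|date|message|name|email lines."""
    lines = []
    for commit in commits[:n]:
        author = commit["commit"]["author"]
        fields = [
            commit["sha"],
            author["date"],
            commit["commit"]["message"],
            author["name"],
            author["email"],
        ]
        # Like the gsub in the jq filter: drop newlines and tabs entirely.
        lines.append(re.sub(r"[\n\t]", "", "|".join(fields)))
    return lines
```

Because newlines are removed rather than replaced, a multi-line commit message has its subject and body joined without a space, exactly as in the sample output above.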

How to extract the list of all repositories in Stash or Bitbucket?

I need to extract the list of all repos under all projects in Bitbucket. Is there a REST API for the same? I couldn't find one.
I have both on-premise and cloud Bitbucket.
Clone ALL Projects & Repositories for a given stash url
#!/usr/bin/python
#
# @author Jason LeMonier
#
# Clone ALL Projects & Repositories for a given stash url
#
# Loop through all projects: [P1, P2, ...]
# P1 > for each project make a directory with the key "P1"
# Then clone every repository inside of directory P1
# Backup a directory, create P2, ...
#
# Added ACTION_FLAG bit so the same logic can run fetch --all on every repository and/or clone.
import sys
import os
import stashy
ACTION_FLAG = 1 # Bit: +1=Clone, +2=fetch --all
url = os.environ["STASH_URL"] # "https://mystash.com/stash"
user = os.environ["STASH_USER"] # "joedoe"
pwd = os.environ["STASH_PWD"] # Yay123
stash = stashy.connect(url, user, pwd)
def mkdir(xdir):
    if not os.path.exists(xdir):
        os.makedirs(xdir)
def run_cmd(cmd):
    print ("Directory cwd: %s "%(os.getcwd() ))
    print ("Running Command: \n %s " %(cmd))
    os.system(cmd)
start_dir = os.getcwd()
for project in stash.projects:
    pk = project_key = project["key"]
    mkdir(pk)
    os.chdir(pk)
    for repo in stash.projects[project_key].repos.list():
        for url in repo["links"]["clone"]:
            href = url["href"]
            repo_dir = href.split("/")[-1].split(".")[0]
            if (url["name"] == "http"):
                print (" url.href: %s"% href) # https://joedoe@mystash.com/stash/scm/app/ae.git
                print ("Directory cwd: %s Project: %s"%(os.getcwd(), pk))
                if ACTION_FLAG & 1 > 0:
                    if not os.path.exists(repo_dir):
                        run_cmd("git clone %s" % url["href"])
                    else:
                        print ("Directory: %s/%s exists already. Skipping clone. "%(os.getcwd(), repo_dir))
                if ACTION_FLAG & 2 > 0:
                    # chdir into directory "ae" based on url of this repo, fetch, chdir back
                    cur_dir = os.getcwd()
                    os.chdir(repo_dir)
                    run_cmd("git fetch --all ")
                    os.chdir(cur_dir)
                break
    os.chdir(start_dir) # avoiding ".." in case of incorrect git directories
Once logged in: on the top right, click on your profile pic and then 'View profile'
Take note of your user (in the example below 'YourEmail@domain.com', but keep in mind it's case sensitive)
Click on profile pic > Manage account > Personal access token > Create a token (choosing 'Read' access type is enough for this functionality)
For all repos in all projects:
Open a CLI and use the command below (remember to fill in your server domain!):
curl -u "YourEmail@domain.com" -X GET "https://<my_server_domain>/rest/api/1.0/repos?limit=1000"
It will ask you for your personal access token, you comply and you get a JSON file with all repos requested
For all repos in a given project:
Pick the project you want to get repos from. In my case, the project URL is: <your_server_domain>/projects/TECH/ and therefore my {projectKey} is 'TECH', which you'll need for the command below.
Open a CLI and use this command (remember to fill in your server domain and projectKey!):
curl -u "YourEmail@domain.com" -X GET "https://<my_server_domain>/rest/api/1.0/projects/{projectKey}/repos?limit=50"
Final touches
(optional) If you want just the titles of the repos requested and you have jq installed (for Windows, downloading the exe and adding it to PATH should be enough, but you need to restart your CLI for that new addition to be detected), you can use the command below:
curl -s -u "$BBUSER" -X GET "https://<my_server_domain>/rest/api/1.0/projects/TECH/repos?limit=50" | jq '.values|.[]|.name'
(tested with Data Center/Atlassian Bitbucket v7.9.0 and powershell CLI)
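A large limit value works as long as everything fits on one page, but the server may cap the page size. Paged responses from the Bitbucket Server REST API carry `values`, `isLastPage` and `nextPageStart` fields, so the pages can be walked explicitly. The sketch below assumes a hypothetical `get_page(start)` callable that performs the request (for example the curl call above with `&start=<start>` appended) and returns the decoded JSON.

```python
from typing import Callable, Iterator


def iter_server_values(get_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every item across all pages of a Bitbucket Server listing."""
    start = 0
    while True:
        page = get_page(start)
        yield from page["values"]
        # Stop when the server marks the last page (or omits the marker).
        if page.get("isLastPage", True):
            return
        start = page["nextPageStart"]
```

Repository names can then be collected with something like `[repo["name"] for repo in iter_server_values(get_page)]`.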
For Bitbucket Cloud
You can use their REST API to access and perform queries on your server.
Specifically, you can use this documentation page, provided by Atlassian, to learn how to list your repositories.
For Bitbucket Server
Edit: As of receiving this tweet from Dan Bennett, I've learnt there is an API/plugin system for Bitbucket Server that could possibly cater for your needs. For docs: See here.
Edit2: Found this reference to listing personal repositories that may serve as a solution.
AFAIK there isn't a solution for you unless you built a little API for yourself that interacted with your Bitbucket Server instance.
Atlassian Documentation does indicate that to list all currently configured repositories you can do git remote -v. However I'm dubious of this as this isn't normally how git remote -v is used; I think it's more likely that Atlassian's documentation is being unclear rather than Atlassian building in this functionality to Bitbucket Server.
I ended up having to do this myself with an on-prem install of Bitbucket which didn't seem to have the REST APIs discussed above accessible, so I came up with a short script to scrape it out of the web page. This workaround has the advantage that there's nothing you need to install, and you don't need to worry about dependencies, certs or logins other than just logging into your Bitbucket server. You can also set this up as a bookmark if you urlencode the script and prefix it with javascript:.
To use this:
Open your bitbucket server project page, where you should see a list of repos.
Open your browser's devtools console. This is usually F12 or ctrl-shift-i.
Paste the following into the command prompt there.
JSON.stringify(Array.from(document.querySelectorAll('[data-repository-id]')).map(aTag => {
  const href = aTag.getAttribute('href');
  let projName = href.match(/\/projects\/(.+)\/repos/)[1].toLowerCase();
  let repoName = href.match(/\/repos\/(.+)\/browse/)[1];
  repoName = repoName.replace(' ', '-');
  const templ = `https://${location.host}/scm/${projName}/${repoName}.git`;
  return {
    href,
    name: aTag.innerText,
    clone: templ
  }
}));
The result is a JSON string containing an array with the repo's URL, name, and clone URL.
[{
  "href": "/projects/FOO/repos/some-repo-here/browse",
  "name": "some-repo-here",
  "clone": "https://mybitbucket.company.com/scm/foo/some-repo-here.git"
}]
This ruby script isn't the greatest code, which makes sense, because I'm not the greatest coder. But it is clear, tested, and it works.
The script filters the output of a Bitbucket API call to create a complete report of all repos on a Bitbucket server. Report is arranged by project, and includes totals and subtotals, a link to each repo, and whether the repos are public or personal. I could have simplified it for general use, but it's pretty useful as it is.
There are no command line arguments. Just run it.
#!/usr/bin/ruby
#
# @author Bill Cernansky
#
# List and count all repos on a Bitbucket server, arranged by project, to STDOUT.
#
require 'json'
bbserver = 'http(s)://server.domain.com'
bbuser = 'username'
bbpassword = 'password'
bbmaxrepos = 2000 # Increase if you have more than 2000 repos
reposRaw = JSON.parse(`curl -s -u '#{bbuser}':'#{bbpassword}' -X GET #{bbserver}/rest/api/1.0/repos?limit=#{bbmaxrepos}`)
projects = {}
repoCount = reposRaw['values'].count
reposRaw['values'].each do |r|
  projID = r['project']['key']
  if projects[projID].nil?
    projects[projID] = {}
    projects[projID]['name'] = r['project']['name']
    projects[projID]['repos'] = {}
  end
  repoName = r['name']
  projects[projID]['repos'][repoName] = r['links']['clone'][0]['href']
end
privateProjCount = projects.keys.grep(/^\~/).count
publicProjCount = projects.keys.count - privateProjCount
reportText = ''
privateRepoCount = 0
projects.keys.sort.each do |p|
  # Personal project slugs always start with tilde
  isPrivate = p[0] == '~'
  projRepoCount = projects[p]['repos'].keys.count
  privateRepoCount += projRepoCount if isPrivate
  reportText += "\nProject: #{p} : #{projects[p]['name']}\n #{projRepoCount} #{isPrivate ? 'PERSONAL' : 'Public'} repositories\n"
  projects[p]['repos'].keys.each do |r|
    reportText += sprintf(" %-30s : %s\n", r, projects[p]['repos'][r])
  end
end
puts "BITBUCKET REPO REPORT\n\n"
puts sprintf(" Total Projects: %5d Public: %5d Personal: %5d", projects.keys.count, publicProjCount, privateProjCount)
puts sprintf(" Total Repos: %5d Public: %5d Personal: %5d", repoCount, repoCount - privateRepoCount, privateRepoCount)
puts reportText
The way I solved this was to fetch the HTML page with a ridiculously high limit, like this (in Python):
import re
import subprocess
from bs4 import BeautifulSoup
# username is assumed to be defined earlier and to hold your Bitbucket login
cmd = "curl -s -k --user " + username + " 'https://URL/projects/<KEY_PROJECT_NAME>/?limit=10000'"
Then I parsed it with BeautifulSoup:
make_list = str((subprocess.check_output(cmd, shell=True)).rstrip().decode("utf-8"))
html = make_list
parsed_html = BeautifulSoup(html,'html.parser')
list1 = []
# Collect the link text of every anchor that points at a repository
for a in parsed_html.find_all("a", href=re.compile("/<projects>/<KEY_PROJECT_NAME>/repos/")):
    list1.append(a.string)
print(list1)
To use this, make sure you replace the placeholders (such as <KEY_PROJECT_NAME>) with the values for the Bitbucket project you are targeting. All I am doing is parsing an HTML file.
Here's how I pulled the list of repos from Bitbucket Cloud.
Set up an OAuth consumer
Go to your workspace settings and set up an OAuth consumer; you should be able to go there directly using this link: https://bitbucket.org/{your_workspace}/workspace/settings/api
The only setting that matters is the callback URL, which can be anything, but I chose http://localhost
Once set up, this will display a key and secret pair for your OAuth consumer. I will refer to these as {oauth_key} and {oauth_secret} below
Authenticate with the API
Go to https://bitbucket.org/site/oauth2/authorize?client_id={oauth_key}&response_type=code ensuring you replace {oauth_key}
This will redirect you to something like http://localhost/?code=xxxxxxxxxxxxxxxxxx, make a note of that code, I'll refer to that as {oauth_code} below
In your terminal, run curl -X POST -u "{oauth_key}:{oauth_secret}" https://bitbucket.org/site/oauth2/access_token -d grant_type=authorization_code -d code={oauth_code} replacing the placeholders.
This should return json including the access_token, I’ll refer to that access token as {oauth_token}
Get the list of repos
You can now run the following to get the list of repos. Bear in mind that your {oauth_token} lasts 2hrs by default.
curl --request GET \
--url 'https://api.bitbucket.org/2.0/repositories/pageant?page=1' \
--header 'Authorization: Bearer {oauth_token}' \
--header 'Accept: application/json'
This response is paginated so you'll need to page through the responses, 10 repositories at a time.
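Paging through the responses can be automated by following the `next` URL that Bitbucket Cloud includes in each paginated response until it is absent. The sketch below assumes a hypothetical `get_json(url)` callable that sends the request with the Authorization header shown above and returns the decoded JSON.

```python
from typing import Callable, Iterator


def iter_cloud_values(get_json: Callable[[str], dict], first_url: str) -> Iterator[dict]:
    """Yield every item of a paginated Bitbucket Cloud listing."""
    url = first_url
    while url:
        page = get_json(url)
        yield from page.get("values", [])
        # The last page has no "next" key, which ends the loop.
        url = page.get("next")
```

Starting from the repositories URL used in the curl command, this yields one dictionary per repository across all pages.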

Getting the issues from a certain milestone in Github

All I'm looking for is a way to get a list of issues for a given milestone. It looks like Github treats milestones a bit like labels in that you can ask for the labels for an issue, but not the issues for a given label.
I know that I can filter my issues by milestone on the Github website, but this traverses multiple pages and I wanted an easy way to see all of the issues for a milestone in a more printer friendly version.
Any tips?
You could use GitHub's API for this. See here on how to get the list of issues for a repo and notice the milestone parameter. The response you will get is a big JSON document, so you would have to create a small script to pull only the titles of the issues, or use grep, or something like jq.
Notice also that API responses are also paged, but you can set the paging to be 100 entries per page, which is usually enough. If not, you would again have to create a small script to fetch all the pages (or do it manually).
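For the case where 100 entries per page is not enough, the script that fetches all pages mainly needs to find the next page. GitHub's REST API announces it in the `Link` response header with `rel="next"`. A minimal sketch of that lookup follows; `next_page_url` is a hypothetical helper name, and performing the HTTP request is left to whatever client is in use.

```python
import re
from typing import Optional


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the rel="next" URL from a Link header, or None on the last page."""
    if not link_header:
        return None
    # Each entry has the form: <url>; rel="name"
    for url, rel in re.findall(r'<([^>]+)>\s*;\s*rel="([^"]+)"', link_header):
        if rel == "next":
            return url
    return None
```

A fetch loop would request the first page, collect the issues, and repeat with the returned URL until the function yields `None`.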
You can use the GraphQL API (v4) and do something like:
{
  repository(owner: "X", name: "X") {
    milestone(number: X) {
      id
      issues(first: 100) {
        edges {
          node {
            id,
            title
          }
        }
      }
    }
  }
}
I was not able to find any easy methods. This worked a treat for me:
brew install hub (on OSX). Hub is created by GitHub
cd to the local repo you want to access the origin for.
hub issue -M 21 -f "%I,%t,%L,%b,%au,%as" > save_here.csv
profit.
Find the milestone number (21 in the example above) in the URL on GitHub when you are viewing the milestone.
Docs for hub and in particular the format (-f) flag can be found here: https://hub.github.com/hub-issue.1.html
First find the list of milestones using this
Then query this api by milestone number for each milestone
Given a milestone $title in $owner/$repo, we can list the issues in this milestone using curl and jq:
api_url="https://api.github.com/repos/$owner/$repo"
# Look the milestone up by its title instead of a hard-coded name
MS=$(curl -s "$api_url/milestones" | jq --arg title "$title" '.[] | select(.title == $title)')
MS_number=$(echo "$MS" | jq .number -r)
MS_state=$(echo "$MS" | jq .state -r)
echo "Found $title milestone with state=$MS_state"
echo ""
issues=$(curl -s "$api_url/issues?milestone=$MS_number" | jq '.[].number' -r)
echo "The following issues are in the $title milestone:"
for i in $issues; do
  issue_title=$(curl -s "$api_url/issues/$i" | jq '.title' -r)
  echo " https://github.com/$owner/$repo/issues/$i - $issue_title"
done
echo ""