Background
We are writing some documentation for our support team.
We want to include links to files that are stored in private GitHub repositories.
We do not want the documentation to become stale if somebody refactors the code in the private GitHub repositories, so instead I am setting up a CI job that parses the documentation (with jsoup if you are interested) and finds all the links.
Once we have all the links we start checking them.
NOTE: we have written a custom link checker, because one of our critical sets of links is for our monitoring solution, and sadly (also understandably) the SaaS we are using returns 404s for any unauthenticated requests on the URLs of the alerts.
The SaaS itself uses 2FA to access the Web UI, so what we have ended up doing is parsing the URLs and then constructing an equivalent call to the SaaS API to validate the link.
For the monitoring system we use, this is easy: all the URLs are the same format.
Question
Can we validate a random GitHub URL as valid (ideally using only curl - I can translate to my chosen HTTP client from there, and curl gives a more generic answer) using a Personal Access Token? And if so, how?
The URLs could be:
simple direct to repo URLs: https://github.com/<org>/<repo>
direct to branch URLs: https://github.com/<org>/<repo>/tree/<branch>
file URLs: https://github.com/<org>/<repo>/blob/<branch>/<path/to/file>
diff URLs: https://github.com/<org>/<repo>/compare/[<branch>...]<branch>
other URLs that are based on the presence of the repo and do not vary in child path, e.g. https://github.com/<org>/<repo>/pulls, https://github.com/<org>/<repo>/settings/collaboration, etc
plus who knows what other URLs people will add within the docs...
Things I have tried that didn't work
HTTP Basic authentication with the Personal Access Token as the password, e.g.
curl -I -u stephenc:2....token.redacted....b https://github.com/stephenc/<repo-name>
HTTP/1.1 404 Not Found
HTTP Bearer authentication, e.g.
curl -I -H "Authorization: bearer 2....token.redacted....b" https://github.com/stephenc/<repo-name>
HTTP/1.1 404 Not Found
It looks like it works for some URLs (no idea which ones).
I can access curl -u agentgonzo:$TOKEN https://raw.githubusercontent.com/agentgonzo/repo/path/to/file using the API Token as my username, but the same doesn't work on https://github.com URLs. Not sure if this will help you or not.
I got an answer from GitHub Support: No
Since a personal access token won't work for GitHub web UI URLs, no, there isn't a way to verify all possible GitHub private repo URLs without making API calls in some cases.
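Since the answer is to fall back to API calls, here is a minimal sketch of how the URL shapes listed in the question could be translated into REST API URLs that a token can check. The function name github_web_to_api is hypothetical, and the sketch assumes that file links have the form /blob/<branch>/<path> and that branch names contain no "/" (otherwise the split between branch and path is ambiguous). Anything it does not recognise falls back to checking that the repository itself exists.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original post): map a github.com web
# URL to a REST API URL that can be requested with a personal access token.
# Assumption: branch names contain no "/".
github_web_to_api() {
  path=${1#https://github.com/}   # drop the scheme and host
  path=${path%/}                  # drop one trailing slash, if any
  org=${path%%/*}
  rest=${path#*/}
  case $rest in
    */*) repo=${rest%%/*}; rest=${rest#*/} ;;
    *)   repo=$rest; rest= ;;
  esac
  api="https://api.github.com/repos/$org/$repo"
  case $rest in
    tree/*)        echo "$api/branches/${rest#tree/}" ;;
    blob/*/*)      ref=${rest#blob/}; ref=${ref%%/*}   # first segment after blob/
                   file=${rest#blob/*/}                # everything after the ref
                   echo "$api/contents/$file?ref=$ref" ;;
    compare/*...*) echo "$api/compare/${rest#compare/}" ;;
    compare/*)     echo "$api/branches/${rest#compare/}" ;;
    *)             echo "$api" ;;   # pulls, settings, etc.: repo existence only
  esac
}

# Example use (needs network access and a token in GITHUB_TOKEN):
# curl -s -o /dev/null -w '%{http_code}\n' \
#   -H "Authorization: Bearer $GITHUB_TOKEN" \
#   "$(github_web_to_api "https://github.com/<org>/<repo>/tree/<branch>")"
```

A 200 from the API suggests the link target exists. Pages such as /pulls or /settings/collaboration are only verified as far as the repository existing.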
Related
I am trying to follow these docs to download an artifact from GitHub using GitHub's API:
https://docs.github.com/en/rest/actions/artifacts#download-an-artifact
I ran the curl command given in the docs, and it gave me the following url from which to download the artifact (I have replaced the specifics with ...)
https://pipelines.actions.githubusercontent.com/serviceHosts/..../_apis/pipelines/1/runs/16/signedartifactscontent?artifactName=my-artifact&urlExpires=....&urlSigningMethod=HMACV2&urlSignature=....
I am able to download the artifact by putting the URL into my browser (it automatically downloads when the URL is visited) however I tried to use wget to download it via console and got this error:
wget https://pipelines.actions.githubusercontent.com/... # the command I ran
HTTP request sent, awaiting response... 400 Bad Request # the error I got
How can I download a zip file to console? Should I use something other than wget?
I'd like to clarify that viewing this link in the browser is possible even when not logged in to github (or when in private browsing). Also, I can download the zip file at the link as many times as I would like before the link expires after 1 minute. Also my repo is private, which is necessary for my work. I need to use an access token when doing the curl command as described in the docs, however the link that is returned to me does not require any authentication when accessed via a browser.
The API docs seem a bit ambiguous here. It is possible that the redirect can only be accessed a single time, in which case you should try generating a new redirect and fetching it with wget first. You can then unzip the file using the unzip command.
If that is not the case I believe this statement in the api docs is key:
Anyone with read access to the repository can use this endpoint. If the repository is private you must use an access token with the repo scope. GitHub Apps must have the actions:read permission to use this endpoint.
My guess is that your repository is private and you are logged in to GitHub in the browser, which authenticates you and is why you are able to download from the redirect link. I would suggest testing this from incognito mode.
Migrating the repository to public would allow you to bypass this issue. Alternatively you can pass the authentication token as a header to wget like so in order to authenticate with the server to pull the file.
header='--header=Authorization: token <TOKEN>'
wget "$header" https://pipelines.actions.githubusercontent.com/... -O output_file
The problem was that I didn't put quotes around my url. I needed to do this:
wget "https://pipelines.actions.githubusercontent.com/serviceHosts/..../_apis/pipelines/1/runs/16/signedartifactscontent?artifactName=my-artifact&urlExpires=....&urlSigningMethod=HMACV2&urlSignature=...."
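A minimal sketch of why the quotes matter. The URL on example.invalid and its query values are made up, and echo stands in for wget so that nothing is actually requested.

```shell
#!/bin/sh
# Illustration only: example.invalid and the query values are made up,
# and echo stands in for wget so that no request is sent.

# Unquoted: the shell ends the command at the first "&" (running it in the
# background), so the command only ever sees the URL up to that point.
unquoted=$(sh -c 'echo https://example.invalid/a?artifactName=x&urlExpires=1&urlSignature=s')

# Quoted: the whole URL, query string included, is passed as one argument.
quoted=$(sh -c 'echo "https://example.invalid/a?artifactName=x&urlExpires=1&urlSignature=s"')

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"
```

With the signed artifact URL, the unquoted form drops urlExpires and urlSignature from the request, which is consistent with the server answering 400 Bad Request.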
I have a repo with shell script and want to put single command to run it in readme file, like:
bash <(curl -L <path_to_raw_script_file>)
Raw file URLs for GitHub Enterprise look like this: https://raw.github.ibm.com/<user>/<repo>/<branch>/<path_to_file>?token=<token>, where <token> is unique to the file and generated when accessing it via the Raw button in the repository or with the ?raw=true suffix in the URL.
The problem is that tokens get invalidated after a few days or when the file is updated, and I wouldn't like to update the command above each time the token becomes invalid. Is there a way to deal with it?
I know there is a way for the user to create a personal token and use it to log in to GitHub from the machine he's running the script from, but I wanted to keep it as simple as possible.
I was thinking of something like auto-generating that raw file url (since user reading the readme file on github surely does have access to the script located in the same repo), but I am not sure if that's possible.
No input, one-liner.
You can get this link by clicking the raw button in the GHE UI, just remove the token query param at the end.
curl -sfSO https://${USER}:${TOKEN}@${GHE_DOMAIN}/raw/${REPO_OWNER}/${REPO_NAME}/${REF}/${FILE}
I believe you'll always need the tokens - however if you'd like to automate the process you can dynamically request tokens associated with a github Oauth app and not associated with any user profile.
https://developer.github.com/enterprise/2.13/apps/building-oauth-apps/authorizing-oauth-apps/
I know there is a way for user to create personal token and use it to login to GitHub from machine he's running script from, but I wanted to keep it as simple as possible.
Actually, using GCM (Git Credential Manager); the PAT will be provided when accessing the raw.xxx URL.
But only with GCM v2.0.692 which supports those URLs. See PR 599.
Fix GitHub Enterprise API URL for raw source code links
This is a simple fix of #598 for GitHub Enterprise instances that use a raw. hostname prefix for raw source code links.
I've verified this fix locally by swapping out the GitHub.dll that is used by Visual Studio.
So it now checks for 'raw.' in the hostname and removes it to get the correct GHE API URL.
I configured my Bitbucket repo to be read-only accessible via REST API publicly. There are some JSON configuration files that I need to read the content using GET HTTP method.
https://<bitbucket-repo-url>/config.json?raw
I want to switch to a secure method using Access Keys.
I want to test this using curl, but I don't know which arguments to use to include the access keys. Can anyone help?
Access keys are for SSH only. They will not work with any HTTP-based utilities (like curl) or endpoints (like the one you list in your example).
Is this Bitbucket Server (the on-premise version)? If so, https://confluence.atlassian.com/bitbucketserver/permanently-authenticating-with-git-repositories-776639846.html?_ga=2.188793826.854670382.1505151098-758028192.1431549295 may be helpful for you.
We have a build system on our network that frequently hits the github api limit for our company's IP address. This, of course, also blocks local developers.
The readme indicates that we should be able to authenticate for more requests, but I can't see how.
The Github API has a 60-requests-per-hour rate-limit for
non-authenticated use. You'll likely never hit this as TSD uses local
caching and the definition files are downloaded from Github RAW urls.
If you need some more then a scope-limited Github OAuth token can be
used to boost the limit to 5000.
From the tsd page on npm:
.tsdrc
This is an optional JSON encoded file to define global settings. TSD looks for it in the user's home directory (eg: %USERPROFILE% on Windows, $HOME / ~ on Linux), and in the current working directory.
"proxy" - Use a http proxy
Any standard http-proxy as supported by the request package.
{
"proxy": "http://proxy.example.com:88"
}
"token" - Github OAuth token:
The OAuth token can be used to boost the Github API rate-limit from 60 to 5000 (non-cached) requests per hour. The token needs just 'read-only access to public information' so no additional OAuth scopes are necessary.
{
"token": "0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33"
}
You can create this token on Github.com:
Go to https://github.com/settings/tokens/new
Deselect all scopes to create a token with just basic authentication.
(verify you really deselected all scopes)
(wonder why these presets were set??)
Enter an identifying name, something like "TSD Turbo 5000"
Create the token.
Copy the hex-string to the token element in the .tsdrc file.
Verify enhanced rate-limit using $ tsd rate
Change or revoke the token at any time on https://github.com/settings/applications
Note: keep in mind the .tsdrc file is not secured. Don't use a token with additional scope unless you know what you are doing.
The bare 'no scope' token is relatively harmless as it gives 'read-only access to public information', same as any non-authenticated access. But it does identify any requests done with it as being yours, so it is still your responsibility to keep the token private.
I want to programmatically get a list of open pull requests for a specific private github repository - ours, as it turns out. I assume I can only do this via the github api (http://developer.github.com/) - feel free to tell me there's another way - but I can't figure out whether the API allows this, either. The given API calls seem to assume the target repository is public, which ours is not. I would have thought there would be a way to authenticate as a user of the given repository via ssh key (the same way committing works), but I don't see anything to that effect. All in all I'm puzzled and not at all sure I can actually do this. Am I missing a crucial part of the documentation, or is there possibly some alternative I can leverage?
Yes, the GitHub Pull Requests API supports private repos also. You just need to authenticate or you will get an error saying that the repository does not exist.
Example using curl and basic authentication:
curl -u "username" https://api.github.com/repos/:user/:repo/pulls
This will then prompt you for your password and return a list of pull requests as described in the API docs.
Also check out the docs on authentication: http://developer.github.com/v3/#authentication
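GitHub's REST API has since stopped accepting account passwords, so the same request is nowadays made with a token. Below is a minimal sketch of that variant. The function name list_open_pulls, the GITHUB_TOKEN variable and the CURL override (for dry runs) are assumptions made for illustration, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: list open pull requests for a (private) repository.
# Assumptions: GITHUB_TOKEN holds a personal access token that can read the
# repository; setting CURL=echo prints the request instead of sending it.
list_open_pulls() {
  # $1 = owner, $2 = repository name
  ${CURL:-curl} -sS \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer ${GITHUB_TOKEN}" \
    "https://api.github.com/repos/$1/$2/pulls?state=open"
}

# Example use (needs network access):
# GITHUB_TOKEN=<token> list_open_pulls <owner> <repo>
```

As with the basic authentication example, an unauthenticated or under-privileged request against a private repository is reported as not found rather than forbidden.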