GitHub API: list of all repos with a given language

Yes, there is already this question:
Github API: How to get all repositories written in a given language
However, the answer provided there only returns 100 results.
So how can I get the list of ALL repositories for a given language?
For example, for Mathematica,
curl 'https://api.github.com/search/repositories?q=language:mathematica'
reports 8000+ matching items, but the response contains only the top 30.
I have tried since

As suggested by @Bertrand Martel, adding
&page=<page>&per_page=100
works.
You just have to request page 1 with 1 result per page to read the total number of results, and then iterate over the pages as needed.
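A minimal sketch of that two-step loop in Python, using only the standard library. The names `search_all`, `fetch_json` and `MAX_SEARCH_RESULTS` are made up for this example. Note that the Search API serves at most 1,000 results per query, so the sketch stops there even when total_count is larger; reaching more repositories would need several narrower queries (for example split by creation date), which is not shown here.

```python
import json
import math
import urllib.request

SEARCH_URL = "https://api.github.com/search/repositories"
MAX_SEARCH_RESULTS = 1000  # the Search API serves at most 1,000 results per query


def fetch_json(url):
    """Default fetcher: GET a URL and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def search_all(query, per_page=100, fetch=fetch_json):
    """Yield every repository item the search endpoint will serve for `query`."""
    # Step 1: a one-item request is enough to read total_count.
    first = fetch(f"{SEARCH_URL}?q={query}&page=1&per_page=1")
    reachable = min(first["total_count"], MAX_SEARCH_RESULTS)
    # Step 2: walk the pages (page numbering is 1-based).
    for page in range(1, math.ceil(reachable / per_page) + 1):
        data = fetch(f"{SEARCH_URL}?q={query}&page={page}&per_page={per_page}")
        yield from data["items"]
```

Calling `list(search_all("language:mathematica"))` would perform the real requests. Unauthenticated search requests are rate limited, so a pause between pages may be needed.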

Related

How to find the exact contributor count of a GitHub repository using GitHub API?

I am trying to count the total number of contributors of a GitHub repository using the GitHub API, but I do not get the exact number shown on the repository page. For example, the azure-sdk-for-go repository shows 188 contributors.
However, if I run the code below, I get 157.
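import re
import requests
def contributorCount(u, r):
    # With per_page=1, the page number in the rel="last" link equals the number of contributors.
    return re.search(r'\d+$', requests.get('https://api.github.com/repos/{}/{}/contributors?per_page=1'.format(u, r)).links['last']['url']).group()
print(contributorCount("Azure", "azure-sdk-for-go"))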
If I add anon=true to the URL, I get 169 contributors:
https://api.github.com/repos/{}/{}/contributors?per_page=1&anon=true
What am I missing here?
As noted in this discussion, it might not be trivial to find the same number:
A user can be “anonymous” if there is no GitHub user associated with a given email address.
And the reason that your number still may not match the one given by the UI is because the same GitHub user may have contributed using multiple email addresses. This is why I said above:
on larger repos, you may not be able to replicate the exact figure that we show on the website.
The API simply doesn’t return the information you need in order to be able to replicate the number we show.

Why do I receive different results when searching repositories?

I am trying to find the most recently updated repos on GitHub.
I use these two methods:
https://api.github.com/search/repositories?q=user:github+sort:updated+&per_page=5&type=all
https://api.github.com/users/github/repos?type=all&sort=updated&per_page=5
Why do I get different repos? Which method is correct?
On the GitHub website I see results like those from the first link:
https://github.com/github
I went through the results of both requests. It looks like in the first case sort:updated uses the pushed_at field to sort the results. In the second case, sort=updated uses the updated_at field. So, depending on which field you want to sort by, you could use either. Strangely, I could not find any documentation of this difference.
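The effect is easy to reproduce offline. The sketch below sorts three made-up records by each field in turn. The repository names and timestamps are hypothetical; only the pushed_at and updated_at field names come from the API responses.

```python
# Hypothetical records: only the field names mirror the API response.
repos = [
    {"name": "alpha", "pushed_at": "2015-03-01T10:00:00Z", "updated_at": "2015-03-05T08:00:00Z"},
    {"name": "beta", "pushed_at": "2015-03-04T12:00:00Z", "updated_at": "2015-03-04T12:30:00Z"},
    {"name": "gamma", "pushed_at": "2015-02-20T09:00:00Z", "updated_at": "2015-03-06T07:00:00Z"},
]


def newest_first(items, field):
    """Return repository names ordered by a timestamp field, newest first."""
    # Timestamps in one fixed UTC ISO 8601 format sort correctly as plain strings.
    return [r["name"] for r in sorted(items, key=lambda r: r[field], reverse=True)]


by_push = newest_first(repos, "pushed_at")      # ['beta', 'alpha', 'gamma']
by_update = newest_first(repos, "updated_at")   # ['gamma', 'alpha', 'beta']
```

The same three repositories come back in a different order, which is the kind of mismatch seen between the two endpoints.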

GitHub API - latest public repositories

I would like to list public GitHub repositories with the latest create/update/push timestamps (for me any of these is acceptable). Can I achieve this with the GitHub API?
I have tried the following:
Tried using the /repositories endpoint and following the Link header to the last page. However, the Link header I receive only has first and next links, whereas I need a last link.
Tried using the /search/repositories endpoint. This works as long as I have a keyword or filter in the q parameter, but it does not accept an empty q parameter.
I got in touch with GitHub support, and there are two solutions to this:
Use binary search on the since parameter of the /repositories endpoint to find the last page.
Cons: may quickly exhaust the API rate limit.
Use the /search/repositories endpoint with an always-true predicate such as stars>=0.
Cons: likely to cause a query timeout or incomplete results.
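The first option can be sketched as follows. This is only an illustration: `find_last_since` and `page_is_nonempty` are hypothetical names, and the callback stands in for a real request to /repositories?since=<id> that reports whether any repositories came back. Since `since` means "only return repositories with an ID greater than this", the sketch assumes that once a page is empty, every larger value is empty too.

```python
def find_last_since(page_is_nonempty, start=1):
    """Return the largest `since` value for which the listing is still non-empty.

    `page_is_nonempty(since)` is a caller-supplied function (one API request
    per call in real use). Assumes since=0 returns at least one repository.
    """
    # Phase 1: double the candidate until the listing comes back empty.
    low, high = 0, start
    while page_is_nonempty(high):
        low, high = high, high * 2
    # Invariant from here on: `low` is non-empty, `high` is empty.
    # Phase 2: classic bisection between the two bounds.
    while high - low > 1:
        mid = (low + high) // 2
        if page_is_nonempty(mid):
            low = mid
        else:
            high = mid
    return low
```

The number of requests grows with the logarithm of the highest repository ID, so a single search costs a few dozen calls. Repeating it often is what could exhaust the rate limit mentioned above.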

How can I list all users of the same location using the GitHub API?

It is my first time using the GitHub API, so sorry if this is a stupid question. I did a short search for location:Germany and got 39,063 users. I want to create a list of all 39,063 usernames and tried this command:
curl -i 'https://api.github.com/search/users?q=location%3AGermany' | grep login
However, this returns only 30 hits. Could anyone give me some advice, or point me to the right resources?
You will have to make additional requests for other pages:
Pagination
Requests that return multiple items will be paginated to 30 items by default. You can specify further pages with the ?page parameter. For some resources, you can also set a custom page size up to 100 with the ?per_page parameter. Note that for technical reasons not all endpoints respect the ?per_page parameter, see events for example.
$ curl 'https://api.github.com/user/repos?page=2&per_page=100'
Note that page numbering is 1-based and that omitting the ?page parameter will return the first page.
For more information on pagination, check out our guide on Traversing with Pagination.
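As a rough illustration of traversing pages, the Python sketch below follows the rel="next" URL from the Link response header until there is none left. `parse_link_header` and `iterate_pages` are names invented for this example, and `fetch` is a caller-supplied function (an assumption, so that the loop can be tried without network access) returning the items of one page together with the raw Link header.

```python
import re

# Matches entries such as: <https://api.github.com/...&page=2>; rel="next"
LINK_PATTERN = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')


def parse_link_header(header):
    """Map each rel value (for example next, last, first, prev) to its URL."""
    return {rel: url for url, rel in LINK_PATTERN.findall(header or "")}


def iterate_pages(first_url, fetch):
    """Yield items from every page; `fetch(url)` must return (items, link_header)."""
    url = first_url
    while url:
        items, link_header = fetch(url)
        yield from items
        # No rel="next" entry means the last page has been reached.
        url = parse_link_header(link_header).get("next")
```

For a search request, `fetch` would return the "items" list of the JSON body and the value of the Link header of the same response.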

Is there a way to get the number of repositories per language using GitHub's API?

I would like to use the GitHub API to retrieve the number of repositories for each language. For example,
C++ 200,134
Java 175,432
C# 123,453
...
The only API with a filter parameter would be the repository search one:
GET /legacy/repos/search/:keyword
with the optional parameter language.
But that would return a list of repositories spread over multiple pages, so you would still need to compute the sum yourself.
Note that very recently (as in early March, 2013), the API might limit the result to 1000 results only.
Following up on VonC's answer, the search API will now give you the total number of results matched by your query. So you can use this to get the total number of repositories for one particular language:
GET /search/repositories?q=language:languagename
The language name is case-insensitive, must be URL-encoded, and spaces must be replaced with dashes. For example (Objective C++):
GET /search/repositories?q=language:objective-c%2B%2B
{
"total_count": 2090,
...
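The encoding rules described above can be sketched in Python as follows. `language_search_url` is a hypothetical helper name. Lowercasing is optional (the language name is case-insensitive) and is done here only to get a uniform URL.

```python
from urllib.parse import quote

SEARCH_URL = "https://api.github.com/search/repositories"


def language_search_url(language):
    """Build the search URL whose response carries total_count for one language."""
    # Spaces become dashes; the result is then percent-encoded ("+" -> "%2B").
    slug = language.strip().lower().replace(" ", "-")
    return f"{SEARCH_URL}?q=" + quote(f"language:{slug}", safe=":")


url = language_search_url("Objective C++")
# url == "https://api.github.com/search/repositories?q=language:objective-c%2B%2B"
```

Requesting that URL and reading the total_count field of the JSON response gives the count for one language. Repeating this for each language of interest builds the table asked for in the question.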