Is the GitHub traffic insight reliable?

I have a repo that was public initially, made private later, and turned public again only a while ago.
My issue is, I had about 250 views with only 2 unique viewers before it turned private (I don't know how THAT's possible).
Once I made it public again, I checked the traffic and had 727 views with still only 2 unique viewers, and the graph even shows views for the few days when the repo was private.
How did the view count jump so high and how did my repo have views even when it was private?
Any form of help is appreciated. Thanks in advance.

Yes, the numbers include everyone's views, including those of repository owners and contributors. There's no way to filter this information at the moment, but we can definitely add that as a feature request for the team to consider. Your own views are counted on repository traffic graphs, and there is no way to filter out your own page views or those of other repository contributors.
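If you want to look at the raw numbers behind the graph, the REST API exposes the same data at /repos/{owner}/{repo}/traffic/views (it covers roughly the last 14 days and needs push access to the repo). Here is a minimal Python sketch, assuming you have a personal access token; the function names are my own and purely illustrative:

```python
import json
import urllib.request


def fetch_traffic_views(owner, repo, token):
    """Fetch the traffic view data for one repository (needs push access)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/traffic/views"
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_views(payload):
    """Return (total views, unique visitors, list of (day, views, uniques))."""
    per_day = [
        # Timestamps look like "2016-10-10T00:00:00Z"; keep only the date part.
        (entry["timestamp"][:10], entry["count"], entry["uniques"])
        for entry in payload.get("views", [])
    ]
    return payload.get("count", 0), payload.get("uniques", 0), per_day
```

Calling summarize_views(fetch_traffic_views(owner, repo, token)) gives you the per-day breakdown, which makes it easier to see on which days the views were recorded. It does not tell you who the viewers were.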

Related

What is a good way to allow the wider discoverability of private GitHub repositories?

If you are in an organisation, there may be GitHub repositories that are private (i.e. you don't have access to them), but it would be useful to know that they exist, and then you could arrange access where appropriate.
In other words, we are trying to enable discoverability in a way that can lead to access. This could be done by sharing readmes (noting that people need some discipline to write sensible readmes).
This blog post, Solving the innersource discoverability problem, looks like a potential solution, but may require that the user has access to see all the repos in the portal? I'd like the user to be able to view readmes for all repos; if they don't have access, they can contact whoever is listed on the readme.
I see another option for making a file public from a private repo (using gitexporter to create a public repo with only the readme, example here). This makes it public, which is not my first preference, and would require every repo to do some work, which is far from ideal. While it doesn't give a neat portal, it should allow GitHub search functionality to find it by topic or keyword?
A related, perhaps simpler option is proposed here, where a student shares a readme from a private repo as a public GitHub page. Again, it requires a little work from every repo and gives no neat portal, but can be found with GitHub search? While public GitHub pages can be made private, they would then only be visible to those with repo access?
So, if I'm summarising basic requirements:
All org repos (public, private or team) have a readme that can be accessed by search by someone in the org (preferably not requiring each individual to modify their repo).
Additional nice to have features:
All readmes can be viewed in a portal with search
Bonus for being able to make super private (only collaborators can see readme - flag in readme?), org private (only people in the org - default) and public (flag in readme?).
Simple to implement!
Suggestions?
I think you have already provided a suitable solution within your question. Alternatively, you can use the APIs (GET repos, GET README of a repo) to fetch each repository's README, save it to a database or JSON file on a cron schedule, and build a web interface on top of that data.
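As a rough sketch of the API approach: the REST API has GET /orgs/{org}/repos to list repositories and GET /repos/{owner}/{repo}/readme to fetch a README (the body comes back base64-encoded). The Python below assumes a token that can see the private repos; the function names and the output path are made up for illustration, and you would run it from cron or similar:

```python
import base64
import json
import urllib.error
import urllib.request

API = "https://api.github.com"


def get_json(url, token):
    """GET a GitHub API URL and decode the JSON body."""
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def decode_readme(readme_payload):
    """The README endpoint returns the file body base64-encoded."""
    raw = base64.b64decode(readme_payload["content"])
    return raw.decode("utf-8", errors="replace")


def collect_readmes(org, token, fetch=get_json):
    """Return {full_name: readme_text} for every repo the token can see."""
    readmes = {}
    page = 1
    while True:
        repos = fetch(f"{API}/orgs/{org}/repos?per_page=100&page={page}", token)
        if not repos:
            break  # an empty page means we have walked past the last repo
        for repo in repos:
            try:
                payload = fetch(f"{API}/repos/{repo['full_name']}/readme", token)
            except urllib.error.HTTPError:
                continue  # e.g. 404 when the repo has no README
            readmes[repo["full_name"]] = decode_readme(payload)
        page += 1
    return readmes


def save_index(readmes, path):
    """Write the collected READMEs to a JSON file for a web front end."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(readmes, handle, indent=2)
```

The fetch parameter is only there so the collection logic can be exercised without network access. Note that whoever can read the resulting JSON file can read every README in it, so access to the portal still has to be restricted to the org.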
But I'm going to elaborate on a few areas of improvement. The problem I see with this is the nature of the search. We aren't always looking for keywords; sometimes we are trying to find a fuzzy match for our problem, especially in a larger organization with more than a couple of thousand repositories. In those cases, a search engine implementation will provide much better results. In my opinion, we should collect the READMEs and FAQs, put them into Elasticsearch, and expose a search API for queries. The collection of READMEs and FAQs should be part of the CI/CD pipeline, and when pushing new versions to Artifactory it must publish the metadata as well.
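To illustrate what I mean by a fuzzy match as opposed to a keyword match, here is a toy Python sketch. To be clear, this is not Elasticsearch: it uses difflib from the standard library as a stand-in, the function names are made up, and it would be far too slow for thousands of repositories. It only shows the idea of ranking READMEs by approximate similarity:

```python
from difflib import SequenceMatcher


def score(query, text):
    """Best similarity between the query and any word window of the text."""
    query_words = query.lower().split()
    words = text.lower().split()
    size = max(len(query_words), 1)
    target = " ".join(query_words)
    best = 0.0
    # Slide a window of the query's length (in words) across the text.
    for start in range(max(len(words) - size + 1, 1)):
        window = " ".join(words[start:start + size])
        best = max(best, SequenceMatcher(None, target, window).ratio())
    return best


def search(query, readmes, limit=5):
    """Rank repository names by how closely their README matches the query."""
    ranked = sorted(
        readmes,
        key=lambda name: score(query, readmes[name]),
        reverse=True,
    )
    return ranked[:limit]
```

With this, a misspelled query such as "authentification" still ranks a README that mentions "authentication" first, which a plain keyword search would miss. A real search engine does the same kind of thing with proper indexing and relevance scoring.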
This looks like a use case for internal repositories to me. You can find more about internal repositories here.
Whether you can use internal repositories or not highly depends on your company's policies.
Another thing to consider is that this will expose the entire repositories, not just the README.

Get a list of all your activities on GitHub

On my GitHub account, I have public and private repositories and push rights to one or two external repositories. Is there a sensible way to get all my activities in chronological order? It should not only cover commits but also things like issues, comments, pushes, etc.
I know that basically, the GitHub API would provide such information but I'm not sure if it is possible to get all things at once and maybe there is already a tool for it.
Can someone give me a hint?

Managing benefits/drawbacks of private repos

As someone who is just beginning to think about using private repositories, if I understand correctly, they basically let you make commits in private until you are ready to open-source your app/program to the world and then, once you do, your entire GitHub/Bitbucket commit history becomes visible to everyone (as if you had been developing out in the open the entire time).
Now what happens if someone open-sources something before you do and claims provenance in the field/area/app/etc.? Can you basically open-source your software in return (or contact the authors directly) and "counter-claim" provenance? Obviously, the open-source person wouldn't have known about your existence since you're developing in private mode, so whose "right-of-way" would it be in such a hypothetical situation?
I can clearly see the utility of private repos for preventing forking by competitors who have many more resources than you do and can hypothetically out-code you to the finish line and/or refactor your code significantly (potentially without attribution), but beyond that I don't really see much of a direct benefit to software development in private repos. Can anyone clarify the above points for me? For the record, I have investigated related posts like: https://softwareengineering.stackexchange.com/questions/87577/whats-the-benefit-of-having-a-private-repository-for-personal-projects
A private repository is about visibility: visible only to you, or to everyone.
It is not about content: you can store anything (not too big) in a Git repo (public or private): a project, or just a collection of files. It is not limited to "software development". You can keep simple text files private, for instance notes you want to remember.
Typically, the three ways of claiming ownership of an open-source project, as described in "Ownership and Open Source" by Eric S. Raymond, have nothing to do with private/public repo.
One, the most obvious, is to found the project. When a project has had only one maintainer since its inception and the maintainer is still active, custom does not even permit a question as to who owns the project.
(See also "How do I navigate to the earliest commit in a Github repository?")
The second way is to have ownership of the project handed to you by the previous owner.
The third way to acquire ownership of a project is to observe that it needs work and the owner has disappeared or lost interest.
So this is more about communication, and less about repository management.

Does the GitHub traffic graph include your own views?

I have several projects on GitHub, and they all have the traffic graph where I can view how much traffic my repository is getting.
The blog post I had linked is very vague about visitors. It states:
..how many unique visitors it's had..
I just find it odd that some of my repositories have daily activity, but I'm not sure if most of those views are me, and if they are, why does it say "unique visitors" when I would be the only unique visitor?
Question:
Does the traffic graph on GitHub include yourself when navigating through your own source? It's very minor, but I'm genuinely curious whether the views I'm getting are me navigating through the source, or whether people are actually browsing through my source.
Specifically, I mean the line that shows "Views", not "Unique visitors", because unique visitors will obviously mean new people browsing the repository.
For those who think this is offtopic, re-read the on-topic post.
Most notably:
but if your question generally covers… software tools commonly used by programmers
OK I just contacted support and received a response:
Hello -
> Do the numbers in the traffic graphs include your own views? What about the view of contributors?
Thanks for getting in touch! Yes, the numbers include everyone's
views including repository owners and contributors. There's no way to
filter this information at the moment, but I can definitely add that
as a feature request for the team to consider.
Hope that answers your question - thanks!
So it does include your own views, but they might add the option to filter it later.
It looks like this behavior has changed, and the repository owner's views no longer count when the owner is logged in.
A recent support question asked this, among others, and received the following reply from a member of staff:
My visit to my repository also count as a visit?
No, viewing your own repository while signed in doesn’t count towards this data.
I have checked that this holds by checking one of my repositories: the graph shows no views even on days when I visited the repository several times.
The latest update from GitHub support, as of 22 December 2021, is that it still records the owner's own views on the owner's repository.
From what I have experienced, as of November 2021, it does count the owner as a viewer. It is unclear as to why it counts as a unique viewer.
To be specific, I have seven repositories on GitHub, all of which have been inactive. On the 8th of November, 2021, I decided to check the traffic for all of them. Besides the portfolio, none of them had gained traffic. The next day, all of the ones I had checked the day before had gained traffic. Coincidence? No.
Yep, it appears that GitHub counts your own visits to your repositories too. In this image, the "Traffic" page has 8 views from 1 visitor. Given that the Traffic page is available only to the owner of the repository, you can deduce that GitHub counts your own visits too.

How public is a GitHub public repo?

I am finally moving to GitHub for source control. We can only use a public repo for a project we're doing, but how public is public? Is it safe to assume that if I do not publicize the project at all, no one will really find it among the 3 million repos they already have?
I cannot really have people seeing the source code right now, but $7/mo is a little steep for needing just 1 private repo.
No. That's not a safe assumption. GitHub has a search engine for public repos, and people use it (including myself). So there's always a decent possibility that someone will see your source code. If you want a free private repo, I suggest using Bitbucket instead, or another service that offers free private repos. Note that Bitbucket is only free if you have 5 or fewer users working in your repo.
When asking yourself such questions, you should consider that a lot of website visitors are other computers, not humans. Googlebot indexes every page it can find, ohloh.net creates statistics on open source projects on GitHub, and so on and so forth. So if something is on the Internet, people will find it. :)