Counting commits per developer using webhooks - github

Using GitHub webhooks, what is the best way to count the number of commits by a developer for a particular piece of work?
I was thinking that I could do it like this:
Listen for webhooks pertaining to merges of any PR into the main branch.
Discard the webhook event unless it pertains to the main branch.
If the PR merge is into the main branch, find all the commits (and the developer related to each commit) that belong to that merge and calculate a count.
Are all these commits (and developer usernames) even listed in the webhook event?
Will the commits related to the merge definitely comprise only the commits made after the branch was created up until its merge, or will they go all the way back to the creation of the repository?
I had also considered listening for webhooks related to "Tag" type pushes, but would a new tag event be able to tell me about all the commits between that tag and a previous tag? Probably not, right? I'm guessing that whatever the tag is, it would always represent all commits since the repository was created.

You can use roughly the approach you've outlined. You can extract the base and head revisions from the desired webhook events and then run git rev-list --no-commit-header --format=%aE $base...$head to find the email address associated with each commit. (You can use git log instead of git rev-list --no-commit-header if you're using an older Git).
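As a rough sketch of the "extract the base and head revisions" step, assuming you receive GitHub's pull_request webhook event already parsed into a dict: the field names used here (action, pull_request.merged, pull_request.base.ref, base.sha, head.sha) are the ones I'd expect in that payload, so check them against the webhook documentation. The function name and the target_branch parameter are made up for illustration.

```python
from typing import Optional, Tuple


def merged_pr_range(payload: dict, target_branch: str = "main") -> Optional[Tuple[str, str]]:
    """Return (base_sha, head_sha) for a PR merged into target_branch, else None.

    Hypothetical helper: `payload` is assumed to be the decoded JSON body
    of a `pull_request` webhook delivery.
    """
    # Merged PRs arrive as action == "closed"; other actions are ignored.
    if payload.get("action") != "closed":
        return None
    pr = payload.get("pull_request") or {}
    # "closed" is also sent for PRs that were closed without merging.
    if not pr.get("merged"):
        return None
    # Discard events that do not target the branch we care about.
    if pr.get("base", {}).get("ref") != target_branch:
        return None
    return pr["base"]["sha"], pr["head"]["sha"]
```

Anything that returns None can be dropped; the returned pair is what you would feed into the git command above.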
That will count only the commits that were created in the branch and are not otherwise merged into main. Note that if you're using squash merges, this doesn't work because the merge base doesn't update; I recommend not using squash merges anyway. In the case of a squash merge, the number of commits created is always 1. You can tell whether a squash merge has taken place because git merge-base --is-ancestor $head main will exit 1; that is, the head of the PR branch is not an ancestor of the most recent main.
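A minimal sketch of the counting step, assuming a local clone that already contains both revisions. It shells out to git log --format=%aE (the older-Git variant mentioned above) and tallies the emails. Note that it uses the two-dot range base..head, i.e. commits reachable from head but not from base, which is my assumption about what "commits in the branch" should mean; swap in the three-dot form if that is what you want. The function names and the repo_dir parameter are hypothetical.

```python
import subprocess
from collections import Counter
from typing import Iterable


def count_by_author(emails: Iterable[str]) -> Counter:
    """Tally author emails, one per commit, skipping blank lines."""
    return Counter(line.strip() for line in emails if line.strip())


def commits_per_author(repo_dir: str, base: str, head: str) -> Counter:
    """Count commits per author email in base..head (two-dot range, an assumption)."""
    # --format=%aE prints one author email per commit.
    out = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%aE", f"{base}..{head}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return count_by_author(out.splitlines())


def looks_squashed(repo_dir: str, head: str, main: str = "main") -> bool:
    """True if the PR head is not an ancestor of main (exit status 1)."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "merge-base", "--is-ancestor", head, main]
    )
    return result.returncode == 1
```

If looks_squashed returns True, the range count is not meaningful and the merge should be treated as a single commit, as described above.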
While this will work to count the commits per developer, I should point out that counting a developer's commits is not a good measure of productivity and shouldn't be used as a performance metric. A developer may be highly skilled and write only a single commit after investigating a difficult problem, but counting commits would rank that person lower than someone who fixes many easy problems. There are also other skills, like communication and collaboration, which cannot be measured in commits.

Related

What constitutes a "branch with too many changes" for Github PR rebase merging?

What criteria does Github use to determine whether or not a given PR "has too many changes" for rebase-merging?
I've worked on a refactoring effort recently and after submitting and getting approval in the PR, I got the following message from Github's UI, preventing me from rebase-merging the change:
This branch cannot be rebased due to too many changes
At the time, I had no idea what exactly the limit was, so I split my ~30-commit PR into 2 ~15-commit changes, which made the restriction go away.
I found a question around the same limitation, but it doesn't focus on what these limits are, asking for workarounds instead:
This branch cannot be rebased due to too many changes
I'm now working on another change with a similar number of commits and want to know exactly what rules GitHub uses to determine whether a given PR "has too many changes", so that I can split my PRs accordingly up front and avoid rework both on my end and on the reviewers' end.
I tried finding official documentation on this to no avail. Unfortunately, the only way I'm aware of to test it is to submit the PR and check if the message comes up. This however becomes unfeasible since the message only shows after all checks have succeeded, meaning I have to also get actual reviews on the change to check whether they are mergeable or not. If I then find out it is not mergeable, I have to create a separate PR and split the change, wasting everybody's time on re-reviews.
Many different aspects (or a combination of multiple aspects) could be the cause for this:
Number of commits
Number of affected files
Number of affected lines
Number of conflicts with base branch
etc
I've been using GH for a long time and had never seen this message until a couple weeks ago, so I assume it is either some new feature, or maybe a restriction depending on the type of plan used by my company.
This is almost certainly a timeout-based situation. GitHub uses libgit2 to perform rebases and how long a rebase takes depends on the number of commits, the complexity of the changes in each of those commits, and the complexity related to the repository (size of objects that must be looked up, etc.). libgit2 doesn't have all the optimizations of regular Git, but regular Git cannot rebase in a bare repository, so the use of libgit2 is required.
So there's no fixed limit on the number of commits or the complexity, only that if the rebase takes longer than whatever the timeout is, the operation will fail and you'll be left with that message. One way to avoid this is to use regular merge commits, which, while not immune to timing out, are less likely to do so because the merge involves only three commits, not however many are in your branch.

How to merge several branches into master with one pull request, while being able to undo just one later in GitHub

I am working on a project with a legacy codebase that has basically no automated testing and no testing environment, so the only testing possible is in my local environment with no testing input data and no output test cases.
During the week we merge the branches we finish into one week-merge-day branch using pull requests, and that branch is merged into master once a week, also using a pull request. The problem is that, given the poor testing scenario, we commonly need to undo the weekly merge because one of the branches makes the production version unusable. Of course, undoing this merge restores the master branch to its previous state, which undoes the merge of all the branches, not only the faulty one.
How can I handle this situation when I have to merge several branches at once and, later, undo the merge of only one of them?

Detect direct pushes to master w/GitHub Actions

We currently have a GitHub repository where our master branch is protected for everyone except admins, who are able to commit and push directly to the branch without first opening a pull request. We're looking to find a way to send a Slack notification anytime an admin commits directly to master in order to call attention to the fact that there was an override of the branch protections. This may happen intentionally due to extreme circumstances or, worst case, by mistake (which will need to be addressed).
This seems like it'd be possible with a combination of the GitHub Slack action, the if key on the job/step definition, and ideally some piece of information from the push event JSON.
The last part is where I'm stuck: I don't see an obvious way to use the data contained in the push event to differentiate between one-off commits that would violate our branch protection policy and a normal/compliant pull request.
Does anyone have any ideas as to whether or not this is possible? Perhaps there's another event that I should be attaching this workflow to that would give me the information I'd need to tell the difference and launch the Slack notification?
In general, using GitHub Actions to do this kind of notification is problematic because the user can simply remove or neutralize the code that reports this and then push to the main branch. The Actions workflow that's used will be the one pushed into the repo as part of that commit, so this won't be an effective control.
You'd probably want to use a webhook instead to notify a service of the push, then look at the HEAD commit, parse the commit message to extract the PR number, and verify that the second parent of the commit is the same as the head of the PR. Note that this won't work if you're using squash merges, because there's no easy way to verify that the commit created by a squash merge corresponds to the branch from which it was created.
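Here is a rough sketch of that check, under a few assumptions: the merge commit uses GitHub's default message of the form "Merge pull request #N from owner/branch" (repositories can change this default), you have already obtained the commit's parents (for example with git rev-list --parents -n 1 on the pushed commit), and lookup_pr_head is a hypothetical callable you supply that returns the head SHA of a PR given its number.

```python
import re
from typing import Callable, List, Optional

# Assumed default merge-commit subject: "Merge pull request #123 from owner/branch"
_MERGE_RE = re.compile(r"^Merge pull request #(\d+) from ")


def pr_number_from_message(message: str) -> Optional[int]:
    """Extract the PR number from a default-style merge commit message."""
    match = _MERGE_RE.match(message)
    return int(match.group(1)) if match else None


def is_compliant_merge(
    message: str,
    parents: List[str],
    lookup_pr_head: Callable[[int], Optional[str]],
) -> bool:
    """True if the commit is a two-parent merge whose second parent is the PR head."""
    number = pr_number_from_message(message)
    if number is None or len(parents) != 2:
        return False  # not a PR merge commit: treat as a direct push
    return parents[1] == lookup_pr_head(number)
```

The service would send the Slack notification whenever is_compliant_merge returns False for the new HEAD of the protected branch.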

Mercurial hook to list outdated branches which are behind their parents

We use a Mercurial workflow with one stable branch (default), one unstable branch (develop), and feature branches. We want feature branches to always contain all changesets from their parent branch(es) to simplify merging them back. Are there any example hooks that prevent adding commits to feature branches which are behind their parent branch? GitHub shows a similar message when your branch is behind master.
Generally there are two scenarios:
Force feature-branch owners to sync with the upstream branch before pushing new changes (the pushed changegroup should not be behind its parent)
Periodically check for child branches which have become outdated due to recent commits in their parent branches and nudge the branch owners to sync or close them
You're basically asking two separate things:
As far as I know, there is no pre-made hook for denying a push when the branch is not merged with default, but some Google-fu might reveal one. However, it should be quite straightforward to create one:
As I understand your requirements, you want to verify that an incoming transaction adds commits to a non-default branch only if the latest commit on that branch is a merge of default into it. This is a task for a server-side hook of the pretxnchangegroup kind; there you can analyse the incoming transaction and the changesets it contains and reject the transaction if it does not meet your requirements. In the hook, check that the latest commit to a feature branch (or better, to any non-default and non-develop(?) branch) is a merge commit with default or develop.
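To make the rule concrete, here is a sketch of the decision logic only, written against plain data rather than Mercurial's own objects. Wiring it into an actual pretxnchangegroup hook (collecting the incoming changesets, their branches and their parents) is left out. The Changeset class, TRUNKS and branches_to_reject are all hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Changeset:
    """Hypothetical stand-in for one incoming changeset."""
    node: str
    branch: str
    parent_branches: Tuple[str, ...]  # branch name of each parent (1 or 2)


TRUNKS = {"default", "develop"}


def branches_to_reject(incoming: Iterable[Changeset]) -> List[str]:
    """Return feature branches whose last incoming changeset is not a trunk merge.

    `incoming` is assumed to be in push order, oldest first.
    """
    latest: Dict[str, Changeset] = {}
    for cs in incoming:
        if cs.branch not in TRUNKS:
            latest[cs.branch] = cs  # later changesets overwrite earlier ones
    rejected = []
    for branch, cs in latest.items():
        is_merge = len(cs.parent_branches) == 2
        merges_trunk = any(p in TRUNKS for p in cs.parent_branches)
        if not (is_merge and merges_trunk):
            rejected.append(branch)
    return sorted(rejected)
```

A hook built on this would reject the transaction whenever the returned list is non-empty. Note that the rule as stated is strict: a trunk merge followed by one more ordinary commit in the same push is rejected.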
As to periodic checking with the central repository:
The periodic check for outdated child branches is a client-side action. You could, for instance, create a commit hook that queries hg incoming each time a commit is made. Or simply add a client-side crontab entry and trim the output as desired.

How do I notify all forks of my code of a critical change?

Suppose I have the following situation. Long ago I published some useful code on GitHub, and a lot of people have forked it since then. Now I have found a really serious error (like a buffer overrun) in my code and fixed it, and I realize that all forks had better have that fix, otherwise Bad Things™ might happen.
How do I notify owners of all forks that there's this critical change they'd better pull?
An upstream repo doesn't really know about its downstream repo (see "Definition of “downstream” and “upstream”").
And you cannot make a pull request to a fork (that wouldn't scale well anyway).
So the easiest way is to count on the other developers to update their local clones with your latest changes, which will include your latest fixes.
You can update your README.md for all to see, but you cannot really "broadcast" to all the forks (not to mention all the direct clones you have no knowledge about).
Anyway, if they want to contribute back, you will reject any pull request which isn't fast-forward.
That means they will have to rebase their work on top of the latest from "upstream" (your repo), before pushing to their fork and making said pull request.