How to query advanced issue handling on github (use of milestones and projects)? - github

I'd like to get the repositories that make the most active use of milestones and/or projects. By "most active" I mean something like most cards moved on a project board or most issues added to a milestone.
I tried GH Archive, which has yearly datasets on Google BigQuery. I ran this query:
SELECT
JSON_EXTRACT(payload, '$.action')
FROM
[githubarchive:year.2017]
WHERE
type in ("IssuesEvent")
and JSON_EXTRACT(payload, '$.action') in ("milestoned", "labeled", "assigned")
LIMIT
20
and this query
SELECT
type
FROM
[githubarchive:year.2017]
WHERE
type IN ("MilestoneEvent",
"ProjectEvent",
"ProjectCardEvent")
LIMIT
20
Both return zero results. Does GH Archive not import all events? Am I making a mistake in the queries? Is there another source where I can get this information?
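One possible cause of the first query's empty result, offered as an assumption rather than a confirmed diagnosis: BigQuery's JSON_EXTRACT returns the value in JSON string format, so a string action comes back with its double quotes and never equals the unquoted literals in the IN list, whereas JSON_EXTRACT_SCALAR returns the bare value. The Python sketch below only emulates that difference locally; the helper names and the sample payload are hypothetical and nothing here talks to BigQuery.

```python
import json

def json_extract(payload: str, key: str) -> str:
    """Local emulation (assumption): value re-encoded as JSON text, quotes kept."""
    return json.dumps(json.loads(payload)[key])

def json_extract_scalar(payload: str, key: str) -> str:
    """Local emulation (assumption): bare scalar value, quotes stripped."""
    return str(json.loads(payload)[key])

WANTED = ("milestoned", "labeled", "assigned")
sample_payload = '{"action": "labeled"}'  # hypothetical event payload

quoted = json_extract(sample_payload, "action")       # keeps the JSON quotes
bare = json_extract_scalar(sample_payload, "action")  # plain string

matches_quoted = quoted in WANTED  # comparison as written in the question
matches_bare = bare in WANTED      # comparison after switching to the scalar form
```

If this is indeed the cause, replacing JSON_EXTRACT with JSON_EXTRACT_SCALAR in the WHERE clause would be the corresponding change. It would not explain the second query, which filters on the event type rather than on the payload.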

Related

OData Query to Only Return State Changes from Azure Devops

I have an OData Query that I am using to pull data into PowerBI that I am trying to make more efficient. I am doing a report from Azure DevOps and pulling data in from the WorkItemRevisions resource. Currently, I am pulling all the data for a Work Item and then filtering in PowerBI to only get when the State has changed. I would like to move this filtering to the Odata query so that I can minimize the data that I am pulling into the report.
Currently, I have a query like the following (simplified example used for this question)
https://analytics.dev.azure.com/{Organization}/{Project}/_odata/v3.0-preview/WorkItemRevisions?
$select=Revision,WorkItemId,WorkItemType,Title,State,ChangedDate,LeadTimeDays,ParentWorkItemId
How can this be updated so that only Revisions where the State has changed (from New to Active, Active to Done, etc) are returned?
I am afraid an OData query cannot do exactly what you need.
There is a filter, Revisions/any(r:r/State eq '{state}'), that returns work items that were in a given state at some point in the past.
For example:
https://analytics.dev.azure.com/<Organization>/<Project>/_odata/v2.0/WorkItems?
$filter=State eq 'Closed' and Revisions/any(r:r/State eq 'Active')
This query is similar to a Work Item query that uses the Was Ever operator.
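As a minimal sketch of how such a URL could be assembled in code, assuming the v2.0 endpoint from the example above: the function name and the sample organization and project are hypothetical, and doubling single quotes follows the general OData rule for string literals.

```python
from urllib.parse import quote

def build_was_ever_url(organization: str, project: str,
                       current_state: str, past_state: str) -> str:
    """Build a WorkItems URL: items now in current_state that were ever in past_state."""
    # OData string literals escape a single quote by doubling it.
    current = current_state.replace("'", "''")
    past = past_state.replace("'", "''")
    odata_filter = (
        f"State eq '{current}' and "
        f"Revisions/any(r:r/State eq '{past}')"
    )
    base = (
        f"https://analytics.dev.azure.com/{quote(organization)}/"
        f"{quote(project)}/_odata/v2.0/WorkItems"
    )
    # Percent-encode spaces but keep the OData punctuation readable.
    encoded_filter = quote(odata_filter, safe="()/:'")
    return f"{base}?$filter={encoded_filter}"

# Hypothetical organization and project names.
url = build_was_ever_url("contoso", "Fabrikam", "Closed", "Active")
```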
As I said, this is not a perfect solution, because it can only tell whether the work item has ever been in a specified state; it cannot establish that the state changed specifically from New to Active or from Active to Done. For example, a work item that went from Active to Resolved and then from Resolved to Closed still appears in the results of the query above.
In addition, even a query built in the UI cannot accurately find work items whose state changed from A to B. To achieve that, you need to use the REST API.
So you can use Revisions/any(r:r/State eq '{state}') to reduce, to a certain extent, the data pulled into the report.
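Since the filter cannot express a transition, the comparison between consecutive revisions still has to happen on the client. A minimal sketch of that step, assuming the revisions have already been fetched as dictionaries carrying the WorkItemId, Revision and State fields from the question's $select (the function name and the PreviousState key are hypothetical):

```python
from typing import Dict, Iterable, List

def state_changes(revisions: Iterable[dict]) -> List[dict]:
    """Keep only revisions whose State differs from the previous revision
    of the same work item. The first revision of an item is not reported,
    because there is no earlier state to compare it with."""
    ordered = sorted(revisions, key=lambda r: (r["WorkItemId"], r["Revision"]))
    last_state: Dict[int, str] = {}
    changes: List[dict] = []
    for rev in ordered:
        work_item_id = rev["WorkItemId"]
        previous = last_state.get(work_item_id)
        if previous is not None and previous != rev["State"]:
            # Record the transition alongside the revision's own fields.
            changes.append({**rev, "PreviousState": previous})
        last_state[work_item_id] = rev["State"]
    return changes

# Hypothetical sample data: revision 2 and 4 change other fields only.
sample = [
    {"WorkItemId": 1, "Revision": 1, "State": "New"},
    {"WorkItemId": 1, "Revision": 2, "State": "New"},
    {"WorkItemId": 1, "Revision": 3, "State": "Active"},
    {"WorkItemId": 1, "Revision": 4, "State": "Active"},
    {"WorkItemId": 1, "Revision": 5, "State": "Done"},
    {"WorkItemId": 2, "Revision": 1, "State": "New"},
]
transitions = [(c["Revision"], c["PreviousState"], c["State"])
               for c in state_changes(sample)]
```

This does not reduce the amount of data transferred; it only moves the "state changed" test out of the report and into one reusable function.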
You can use the query below to achieve what you want.
/_odata/v4.0-preview/WorkItemRevisions?&$apply=filter(((WorkItem/WorkItemType
eq 'Bug' or WorkItem/WorkItemType eq 'Product Backlog Item') and
((WorkItem/ChangedDate ge 2022-10-05Z and WorkItem/ChangedDate le
2022-11-04Z) or (WorkItem/CreatedDate ge 2022-10-05Z and
WorkItem/CreatedDate le 2022-11-04Z))))/groupby((WorkItemId,State),
aggregate(ChangedDate with min as MinChangedDate))
To filter the data that is grouped, wrap the filter inside $apply, as shown above.
The URL above returns every state and its earliest changed date for Bug and Product Backlog Item work items that were created or updated within the given date range.
Hope it helps!
If you are able to use Analytics Views instead of OData, there is a dedicated field called "State Changed Date" available in the Analytics Views Fields settings.

Azure DevOps Boards - display query result on a board

How can I develop an extension that displays a query result on a board? Unfortunately, this is not possible in Azure DevOps out of the box. I've found two extensions on the marketplace that do what I need:
AA Query Board
Query based boards
but these extensions have not been updated for a long time and I couldn't contact the authors (I need to change a few things to be able to use one internally at my company).
I've also found the topic Add tabs on query result pages, so it looks quite easy to add a new tab to the query result menu, but I can't find any information on how to get the data (work items) from the query result in order to display it.
The rest of the extension just displays this data in a grid, which should also be quite easy, but getting the query result data is what is blocking me.
There is a Query Results Widget that you can use to display the query results on the Dashboards under Overview.
1. First, create a shared query if one does not exist yet and save it to the Shared Queries folder. (You can click Column options on the Editor page to add or remove the columns shown in the results.)
Alternatively, drag and drop the query from the My Queries folder to the Shared Queries folder.
2. Go to Dashboards under Overview, click Edit, then search for and add the Query Results widget.
3. Click the gear icon on the Query Results widget to configure it and select the query you want to display. The query result is then displayed on the dashboard.
Update:
There are some other ways to show query results on a dashboard, for example:
You can select your shared query, click More actions (the three dots), and click Add to dashboard. This displays only the total number of query results.
You can also create charts for the query results and add them to a dashboard.
Select your shared query, go to the Charts tab, and choose New chart. Select a chart type and configure the chart, then click the three dots on the chart and add it to a dashboard.
Eventually I managed to contact the author of the "AA Query Board" extension, and it turns out that he has a public repository on GitHub with the extension's source code, so anyone can look up how it's done or build on it.
Link to the repository: https://github.com/staticnz/aa_query_board

BigQuery github dataset returns wrong results

So, I'm trying to run some queries against bigquery-public-data:github_repos.files, which was updated on May 25, 2018, 2:07:03 AM. In theory, it contains data for all files on GitHub, as the description of the table says:
File metadata for all files at HEAD.
Join with [bigquery-public-data:github_repos.contents] on id columns
to search text.
So, I have this tool called goreleaser. To use it, users create a file named .goreleaser.yaml. To get an idea of how many repositories are using it, I was using the GitHub search with a query like filename:goreleaser extension:yaml extension:yml path:/; you can see the results on this link.
This shows 1k+ results, and gets results for all these possible names:
goreleaser.yml
goreleaser.yaml
.goreleaser.yml
.goreleaser.yaml
The problem is that GitHub shows the 1k+ result count, but you can only paginate up to 1k results or so. I wrote some code in Go using the API; you can see it here.
Anyway, I tried to do something similar with bigquery, here is my foolish attempt:
SELECT repo_name, path
FROM [bigquery-public-data:github_repos.files]
WHERE REGEXP_MATCH(path, r'\.?goreleaser.ya?ml')
This will include the vendored tools, which is not OK, but that's not the problem. The problem is that even with the vendored tools, it only shows ~500 results, not 1k.
PS: I also tried a simplified version matching path with LIKE, with the same results.
So, either I'm doing something horribly wrong, this table does not include all data as it says it does, or GitHub search is lying to me.
Any advice?
Thanks!
Not every project in GitHub is mirrored on BigQuery's repo dataset.
Let's look at all projects that got more than 40 stars in April, vs what we can find mirrored in BigQuery's repos:
SELECT COUNT(name) april_projects_gt_stars, COUNT(repo_name) projects_mirrored
FROM (
SELECT DISTINCT repo_name, name, c
FROM `bigquery-public-data.github_repos.files` a
RIGHT JOIN (
SELECT repo.name, COUNT(*) c
FROM `githubarchive.month.201804`
WHERE type='WatchEvent'
GROUP BY 1
HAVING c>40
) b
ON repo_name=name
)
9522 vs 3995. Why?
Only open source projects are mirrored, based on the detected open source license: if GitHub can't tell what license a project is using, the project can't be mirrored.
New projects: the pipeline might miss some new projects. Please report them.
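Separately from the coverage gap, the pattern in the question leaves the dot before ya?ml unescaped and has no anchors, and the question notes that vendored copies are unwanted. A stricter variant can be checked locally before running it against the table; this is only a sketch, the function name is hypothetical, and treating any directory named vendor as vendored is an assumption.

```python
import re

# Matches the four file names from the question, at the repository root
# or at the end of any directory path.
GORELEASER_CONFIG = re.compile(r"(^|/)\.?goreleaser\.ya?ml$")

def is_goreleaser_config(path: str, include_vendored: bool = False) -> bool:
    """Return True if path names a goreleaser config file."""
    if not GORELEASER_CONFIG.search(path):
        return False
    if include_vendored:
        return True
    # Assumption: a directory component called "vendor" marks vendored code.
    directories = path.split("/")[:-1]
    return "vendor" not in directories

candidates = [
    ".goreleaser.yml",
    "goreleaser.yaml",
    "sub/dir/.goreleaser.yaml",
    "vendor/github.com/x/y/.goreleaser.yml",
    "docs/goreleaser-yaml.md",
    "mygoreleaser.yml",
]
kept = [p for p in candidates if is_goreleaser_config(p)]
```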

Creating a VSTS Extension, using WIQL query to grab work item data, can I grab Activity field data?

I'm creating a Visual Studio Team Services extension that, in its current iteration, is supposed to display the child tasks (development, testing, etc.) that were added to a work item. I built a WIQL query to get these tasks and some data about them.
In VSTS (and TFS), tasks have an Activity field, which I want in order to differentiate between the types of tasks (development, testing, etc.). However, the WIQL query below gives me the following error: TF51005: The query references a field that does not exist. The error is caused by «[System.Activity]». Is there a way I can get access to the Activity field for those tasks? Or is it just not supported currently?
SELECT [System.Id], [System.WorkItemType], [System.Title],
[System.Activity], [System.State]
FROM WorkItemLinks
WHERE (Source.[System.TeamProject] = 'someProjectID'
AND Source.[System.Id] = someWorkItemID
AND Source.[System.State] <> 'Removed')
AND ([System.Links.LinkType] = 'System.LinkTypes.Hierarchy-Forward')
AND (Target.[System.WorkItemType] = 'Task')
MODE(Recursive)
Working through this I discovered https://marketplace.visualstudio.com/items?itemName=ottostreifel.wiql-editor, which has made debugging my WIQL query a lot easier. I highly recommend it to anyone who is new to working with WIQL.
You can create a query with the necessary fields in web access, then get its WIQL text by using the Get a query or folder REST API (add the $expand=wiql parameter).
I looked some more and found my answer: apparently Microsoft.VSTS.Common.Activity is the field to reference to get the activity for a task. I found it here: https://www.visualstudio.com/en-us/docs/work/track/query-numeric. That page also has information about other data you can grab, such as Microsoft.VSTS.Scheduling.StoryPoints. However, it's definitely not a complete list, and I wasn't able to find one. Feel free to comment if you know of a complete list of field references for work items!
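For reference, here is the question's query with the field name swapped, wrapped in a small Python helper that also produces a JSON body of the form {"query": ...} as the WIQL REST endpoint is understood to accept. The function names are hypothetical, nothing is sent over the network, and rejecting project names that contain a single quote is a simplification made for this sketch.

```python
import json

def build_child_task_wiql(project: str, work_item_id: int) -> str:
    """Return the question's link query using Microsoft.VSTS.Common.Activity."""
    if "'" in project:
        # Simplification: avoid guessing at quote escaping rules.
        raise ValueError("project name must not contain a single quote")
    return (
        "SELECT [System.Id], [System.WorkItemType], [System.Title], "
        "[Microsoft.VSTS.Common.Activity], [System.State] "
        "FROM WorkItemLinks "
        f"WHERE (Source.[System.TeamProject] = '{project}' "
        f"AND Source.[System.Id] = {int(work_item_id)} "
        "AND Source.[System.State] <> 'Removed') "
        "AND ([System.Links.LinkType] = 'System.LinkTypes.Hierarchy-Forward') "
        "AND (Target.[System.WorkItemType] = 'Task') "
        "MODE (Recursive)"
    )

def build_wiql_request_body(project: str, work_item_id: int) -> str:
    """Serialize the query as a JSON request body."""
    return json.dumps({"query": build_child_task_wiql(project, work_item_id)})

# Hypothetical project name and work item ID.
body = json.loads(build_wiql_request_body("MyProject", 42))
```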

Get list of all files with the user who checked in the latest version in TFS

Is there a way in TFS to get a list of files under source control together with the user who checked in the latest version (or the version you have locally)?
The closest functionality I can find is in the Source Control Explorer window, where you can see each file with its latest check-in date, but not the user who checked it in.
There is currently no way to do this from the VS Source Control Explorer. The best you can get is the web version of TFS, where you will see the name of the user in the comments section, along with the changeset number and any comment.
If that doesn't work for you, you can use either the TFS API or a SQL query against the TFS database. The following SQL should give you the result.
SELECT TOP 10
V.ChildItem AS [FileName],
I.DisplayName AS [ChangedBy],
CS.CreationDate AS [ChangeDate]
FROM tbl_Changeset CS
INNER JOIN tbl_Identity I
ON I.IdentityID = CS.OwnerID
INNER JOIN tbl_Version V
ON V.VersionFrom = CS.ChangesetID