Using GitHub in a classroom - github

I'm going to be teaching a data journalism class next year to students with fairly weak coding and computer skills. I'm prepared to do the work necessary to guide them through using R to learn how to scrape data, make plots and maps and such.
However, I am thinking about how to enable them to work in groups.
Obviously GitHub is the place for collaborative work on projects, but, wow, it has a learning curve.
I am wondering if it would be possible for me to set up one repo for the entire class and then somehow have each group in the class have their own branches for their own projects so that I could kind of oversee the merges as they work on their projects.
I can see the merits of GitHub for this, but I am trying to make it as simple as possible.
Please note, I see that GitHub Classroom is a thing, but that really seems to facilitate grading and marking for large classes. That's not really what I need.

I'd suggest having each student create a fork of your base repository so that they can do work on their own copy of your code. This way, they'll each have their own workspace that they can contribute to alone. You could even have them create a new fork for each group collaboration.
See GitHub's documentation for information on forking repositories.
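As a rough sketch of what this looks like for a student once they have clicked "Fork" on GitHub, the snippet below only builds and prints the git commands; it does not run them. The account names, the repository name, and the default branch name `main` are all placeholders I made up for illustration.

```python
# Hypothetical names throughout: "student-a", "instructor" and
# "data-journalism" are placeholders, not taken from the question.
def fork_workflow(student, instructor, repo, branch="main"):
    """Return the git commands a student runs after forking on GitHub."""
    fork_url = f"https://github.com/{student}/{repo}.git"
    upstream_url = f"https://github.com/{instructor}/{repo}.git"
    return [
        # Copy the student's own fork to their machine.
        ["git", "clone", fork_url],
        # Remember where the instructor's original repository lives.
        ["git", "-C", repo, "remote", "add", "upstream", upstream_url],
        # Later on: pull in whatever the instructor has added since.
        ["git", "-C", repo, "fetch", "upstream"],
        ["git", "-C", repo, "merge", f"upstream/{branch}"],
    ]


if __name__ == "__main__":
    for cmd in fork_workflow("student-a", "instructor", "data-journalism"):
        print(" ".join(cmd))
```

The first two commands are a one-time setup; the last two are what a student would repeat whenever the base repository changes.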

Related

Are there best-practice guidelines for maintaining a repository?

Are there best-practice guidelines for maintaining a GitHub repository? I've contributed to many open source projects and used GitHub for projects that I work on solo, but now I'm working with a team of six developers, including myself, to build a system, and I've been placed in charge of maintaining the repository. Nothing is to get merged into our main branch without my approval. As little as I know about maintaining a GitHub repository, of those within the organization (two team members are consultants) I've the most experience with the process.
But I've never maintained a GitHub repository, and while I'm doing OK, I know that there must be a body of knowledge out there of how to handle this correctly. I just haven't been able to find it.
One hurdle I've been jumping over repeatedly, for example, is merge conflicts. Usually they're minor, but not always. Is there some known system available that allows me to enforce who has the ability to edit which files at any given time, for example?
And yes, I realize this may not be the best Stack Exchange forum, but none of the others seemed more suited to the topic.
The Cloud Native Computing Foundation (CNCF) serves as the vendor-neutral home for many of the fastest-growing open source projects, including Kubernetes, Prometheus, and Envoy.
As such, it can be used as a starting point for your own project: see contribute.cncf.io/maintainers/github/, which offers:
template, to be sure you have your README, LICENSE and other important files.
labels, to better classify your issues
Also add a clear "release and maintenance policy", and you should be in good shape.

Azure DevOps - organizing projects and repositories

(Posting the question here as this is the 'community' that Microsoft redirects to with a 'Need advice? Ask community' button. Hope it won't get closed as 'primarily opinion based' or 'too broad')
Hello,
I want to start using Azure DevOps in my department for organizing code & work. We're a small team that creates a large number of applications & plugins.
Some of these applications have a very short lifecycle, i.e. we deliver them, and they work for years without changes. Other apps are larger and are updated/fixed across several months or years.
These applications are completely separate from each other in all aspects.
As far as I understand Azure DevOps structure, my department should become an 'Organization' (we can/need to be separate from the rest of the corporation).
I am a bit puzzled about the 'Project' part. Documentation says
In general, we recommend that you use a single project to support your organization or enterprise.
So, let's say we do have one project called Our Apps - where do we then put all the individual application-projects?
As far as I understand, each product (application) that we deliver should have its own repository (or a set of applications, if they are logically connected).
This is in order to allow a developer to simply clone the repo on their machine and contribute to that product only - without downloading other projects etc.
I need to be able to:
easily navigate/see all the tens/(hundreds?) of applications that we create,
view their separate kanban boards (for those projects that do have one, not all of them will)
to see their repositories (Git or TFS), commits etc
see & manage their pipelines
At the moment it seems to me that the only place where I can see a 'list' of what products we have is the drop-down below:
And the only way to see what is going on in the big-enough-to-get-own-board products is by creating a new separate 'SomeApp Team' in the Project (even though same people are in it), so that I can have a board for the SomeApp - and view the boards from here:
Is that the intended way to organize the structure?
Any alternative approaches?
Is there any way to have a 'cross-repository' or 'cross-team' overview?
What about creating documentation for each 'product'?
The "one project to rule them all" was coined by Martin Hinshelwood and his blog post from way-back-when explains the reasons and limitations.
With the introduction of Tagging and filtering on the backlog there is an alternative approach within the one-project setup.
Create a team for each of the real teams you have in your organisation.
Create an area path for each major project/product in the org.
Assign the area paths of the projects to the teams who are working on them. This can change over time.
Optionally tag work items with the major project/product for additional filtering.
This way each team sees a complete view of all the work they can pull from. And they can quickly filter the work by tags to remove items from view when discussing specific projects/products.
Also, when teams change their focus from one product/project to another, you can simply change the assigned areas for that team to update their view.
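To make the relationship between teams, area paths and tags concrete, here is a small plain-Python model of the idea. It is only an illustration of the filtering logic and does not call the Azure DevOps API; the class names, the `backlog` helper and the path strings are all made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class WorkItem:
    title: str
    area_path: str                      # one area path per product
    tags: set = field(default_factory=set)


@dataclass
class Team:
    name: str
    area_paths: set = field(default_factory=set)  # can change over time


def backlog(team, items, tag=None):
    """Items in the team's area paths, optionally narrowed to one tag."""
    return [
        item for item in items
        if item.area_path in team.area_paths
        and (tag is None or tag in item.tags)
    ]


# Illustrative data only.
items = [
    WorkItem("Fix login", "OurApps/SomeApp", {"SomeApp"}),
    WorkItem("Add export", "OurApps/OtherApp", {"OtherApp"}),
    WorkItem("Update plugin", "OurApps/Plugin", {"Plugin"}),
]
team = Team("Team A", {"OurApps/SomeApp", "OurApps/OtherApp"})

print([i.title for i in backlog(team, items)])
print([i.title for i in backlog(team, items, tag="SomeApp")])
```

Reassigning `team.area_paths` is the model's equivalent of moving a team to a different product: the same `backlog` call then returns a different set of work.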
The Plan View extension provides an additional cross-team view across all the work. And the Dependency Tracker extension can visualize dependencies over time.
You can also use the Epic/Feature/PBI|UserStory tree structure to create additional grouping in your work items. You can customize the process template to introduce a Product level, though for the planning features to work, you'd also have to create full traceability from Product down to PBI|UserStory.
The main recommendation is to try a few of these approaches in a light-weight manner to see how they work and find your own ideal setup.
Another option for cross project visualization is to enable the Analytics Extension and connect it to PowerBI.
As you'll soon figure out, naming guidelines for your Tags, Repositories and Pipelines are going to be very important. Being able to quickly filter to the right level requires this.

For a github repository that is no longer maintained, how do developers agree on which fork to use to focus all their work on?

So I'm looking at the michiya/django-pyodbc-azure repository. It's a great project, but it seems to be unmaintained. There are lots of forks of the project, and I'm not sure which one to follow. The nice thing about github is that it makes it easy to find the repository that serves a particular purpose, and it also makes it easy to combine efforts contributing to this repository which is common to everyone. But this idea seems to fall apart when a project is unmaintained and forks just spiral with nobody benefiting from combining efforts with anybody else. What's the solution?
Edit: I just filed an issue and mentioned all the people who submitted recent issues, PRs, forks. It was a pain to gather all the names manually and type them up, I'm sure I missed a lot of good names who would have contributed, and I may have mistyped some names.
Edit: Just found this article from 2011 which discusses the same issue in my question. While I replied to one of the comments, I got the idea of using Machine Learning to sort forks by usefulness:
"True. With all the Machine Learning buzz these days, I wish there was
a "github-forks-helper.net" that was powered by some trained ML
network that trained on all github forks identifying which ones ended
in upstream. It would then sort unmerged forks by likeliness of ending
upstream"
Posted this datascience.stackexchange.com question to gather some thoughts about ML to help with sorting these forks

Sharing routines within a user community

I'm building a toolbox for a certain branch of biology. One of the reasons Julia was chosen is its simplicity, as biologists won't be assumed to be able to write complex C code.
What I'd like to add is a way for users to share their own custom methods for others to review/verify/use, both to promote collaboration and to add a bit of sense of community
What I'm sure of is that this specific demographic of (mostly) biologists won't be able or have the patience to fork a GitHub project or do anything that could be considered remotely complex, especially when it won't benefit them explicitly to do so.
So, what I'd like to do is provide the simplest of interfaces, with the add/view options to either add a routine or view routines (along with descriptions, ratings etc)
I can only think of two ways to accomplish storing the scripts pushed by users, by having them on a server, or, more simply, using SQL
tl;dr can postgresql store scripts or is that a terrible idea
I ask mainly because there will be 'raw data' available on a PostgreSQL server, and I'd like to be able to keep that and the 'community methods' both in the same place for convenience's sake.
To summarize the discussion in the comments to this question:
Version control is an excellent solution for sharing code, but from a scientist's perspective, it can be difficult and complicated. Luckily, GitHub now offers a GUI that is easy to learn and yet retains a lot of the power of Git. For instance, GitHub allows one to edit files directly from the web UI.

Good github structure when dealing with many small projects that have a common code base?

I'm working for a web development company and we're thinking about using GitHub for version control. We work with several different .NET-based CMS-platforms and of course with a lot of different customers.
We have a standard code base for each CMS which we start from when building a new site. We of course would like to maintain that and make it possible to merge some changes from a developed site (when the standard code base has been improved in some way).
We often need to make small changes to a published site at a later date and would like to be able to do this with minimal effort (i.e. the customer gladly pays for us to fix his problem in 2 hours, but doesn't want to pay for a 2 hour set up first).
How should we set this up to be able to work in an efficient fashion? I'm not very used to distributed version control (I've worked with CVS, Subversion and ClearCase before), but as I imagine it, we could:
Set up one repository for each customer, start with a copy of the standard code base and go from there. Lots of repositories of course.
Set up one repository for each CMS and then branch off one branch for each customer. This is probably (?) the best way, but what happens when we have 100 customers (=branches) in the same repository? It also doesn't feel entirely nice that we create a lot of branches that we don't really have any intention of ever merging back to the main branch.
I don't know, perhaps lots of branches is only a problem in my imagination or perhaps there are better ways to do this that I haven't thought about. I would be interested in any experience with similar problems.
Thank you for your time and help.
With Git, several repos make sense when they are used as submodules (sharing a common component; see the nature of Git submodules, in the third part of the answer).
But in your case, one repo with a branch per customer can work, provided you are using the branches to:
isolate some client-specific changes (and long-lived branches with no merging back to master are OK),
while rebasing those same branches on top of master (which contains the common code, with common evolutions needed by all the branches).
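A minimal sketch of the second point, assuming one local branch per customer and a common branch called `master`: the helper below builds one `git rebase <base> <branch>` command per customer branch and, by default, only prints them. The branch names and the two helper functions are hypothetical.

```python
import subprocess


def rebase_commands(customer_branches, base="master"):
    """One `git rebase <base> <branch>` command per customer branch.

    Giving the branch as the second argument makes git switch to it
    before replaying its commits on top of the base branch.
    """
    return [["git", "rebase", base, branch] for branch in customer_branches]


def run_all(commands, repo_dir, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first rebase that hits a conflict,
            # so it can be resolved by hand before continuing.
            subprocess.run(cmd, cwd=repo_dir, check=True)


# Hypothetical branch names, for illustration only.
commands = rebase_commands(["customer-a", "customer-b"])
run_all(commands, repo_dir=".")
```

Rebasing rewrites the customer branch's history, so this fits best when each customer branch is not being worked on by several people at the same time.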