Are there best-practice guidelines for maintaining a repository? - github

Are there best-practice guidelines for maintaining a GitHub repository? I've contributed to many open source projects and used GitHub for projects that I work on solo, but now I'm working with a team of six developers, including myself, to build a system, and I've been placed in charge of maintaining the repository. Nothing is to get merged into our main branch without my approval. As little as I know about maintaining a GitHub repository, of those within the organization (two team members are consultants) I've the most experience with the process.
But I've never maintained a GitHub repository, and while I'm doing OK, I know that there must be a body of knowledge out there of how to handle this correctly. I just haven't been able to find it.
One hurdle I've been jumping over repeatedly, for example, is merge conflicts. Usually they're minor, but not always. Is there some known system available that allows me to enforce who has the ability to edit which files at any given time, for example?
And yes, I realize this may not be the best Stack Exchange forum, but none of the others seemed more suited to the topic.

The Cloud Native Computing Foundation (CNCF) serves as the vendor-neutral home for many of the fastest-growing open source projects, including Kubernetes, Prometheus, and Envoy.
As such, it can be used as a starting point for your own project: see contribute.cncf.io/maintainers/github/, which offers:
template, to be usre you have your README, LICENSE and other important files.
labels, to better classify your issues
Add also a clear "release and maintenance policy", and you should be in good shape.

Related

GitHub Multiple Repositories vs. Branching for multiple environments

This might be a very beginner question, but I'm working on a large production website in a startup environment. I just recently started using Heroku, Github, and Ruby on Rails because I'm looking for much more flexibility and version control as compared to just locally making changes and uploading to a server.
My question, which might be very obvious, is if I should use a different repository for each environment (development, testing, staging, production, etc.) or just a main repository and branches to handle new features.
My initial thought is to create multiple repositories. For instance, if I add a new feature, like an image uploader, I would use the code from the development repository. Make the changes, and upload it with commits along the way to keep track of the small changes. Once I had tested it locally I would want to upload it to the test repository with a single commit that lists the feature added (e.g. "Added Image Uploader to account page").
My thought is this would allow micro-managing of commits within the development environment, while the testing environment commits would be more focused on bug fixes, etc.
This makes sense in my mind because as you move up in environments you remove the extraneous commits and focus on what is needed for each environment. I could also see how this could be achieved with branches though, so I was looking for some advice on how this is handled. Pros and cons, personal examples, etc.
I have seen a couple other related questions, but none of them seemed to touch on the same concerns I had.
Thanks in advance!
-Matt
Using different repos makes sense with a Distributed VCS, and I mention that publication aspect (push/pull) in:
"How do you keep changes separate and isolated across multiple deployment environments in git?"
"Reasons for not working on the master branch in Git"
The one difficult aspect of managing different environments is the configuration files which can contain different values per environment.
For that, I recommend content fiter driver:
That helps generating the actual config files with the current values in them, depending on the current deployment environment.

Can't Decide On a Suitable Open Source Project Host

I need a little bit of help deciding which project host (if any) to move our currently existing project to.
We currently have an SVN (but willing to migrate if necessary) closed-source project existing on Assembla. We're wondering about moving because we want to open source our existing project and:
We don't have the resources to actively promote our now open source project, and suspect that the project hosting we select might influence how our project is publicized. If we move from assembla to github, how likely is it that our project will get more attention?
We want it to be as easy as possible for new developers to pickup and start running.
Our project is also going to need very extensive wiki documentation, as it is a very complex enterprise web application framework (somewhat similar to spring). Does it make sense to put that documentation at the same place as our repo host? Or should we have a separate website for that? We also would like to have a blog as well as a forum. Same question for those.
Help?
We don't have the resources to actively promote our now open source project, and suspect that the project hosting we select might influence how our project is publicized.
Might - and might not. You can see a lot of solo-projects on any source-hosting
We want it to be as easy as possible for new developers to pickup and start running
Assembla is very good choice in this case. Do not be in a common disease Git-mania. Really "big community" in case of Github is just common marketing cheating, no more - it's not your community
Assembla have most needed (for big complex project) tools, compared to competitors. Pull requests on Github implemented better, yes. But I can't recall any other advantages. Support of almost all modern widely-used SCM (except Bazaar) is big plus also.
Around community size: Assembla have big plans of expansion to million users in nearest years (two, AFAICR)
NB: You can think about changing SCM to (some) DVCS - forking|merging are more natural in these systems and it will give one more level of freedom to contributors without big headache for any side
I think your project would get more attention being on github or bitbucket and developers would be able to easily clone from either host using git. With that said, you could try to promote it on sites like Hacker News.
Since your wiki would be very complex, I think a seperate website would be more suitable.

Best practice to maintain source code under version control with multiple companies?

I'm wondering if there is any best practice for maintaining your source code under version control among different companies. In Open Source there is a maintainer, who receives patches, decides on them and applies them. But what about closed sourced projects where different companies get different workloads and just commit them to the trunk and branches? Is this maintainer concept applicable to a project on which multiple companies work on?
You can choose from a wide range of version control systems. (Not only subversion)
With the "versioning" concept you are safe that no one damages the project permanently.
So there is no need for a manual approval process, especially when there are contracts for example between the participating companies.
I'd also set up a commit mailinglist so you have some kind of peer review of changes. So no changes can be done without anyone noticing them.
If applicable set up some kind of continous integration environment to keep the quality up.
I don't understand the question about the branches. The decision whether to use them or not is IMHO not depending on the fact that the commiters are employed in the same company or not.
Its really up to you to decide which workflow works best for the companies involved. Subversion has the ability to add permissions to your trunk and branches allowing you to lock down certain parts of your repository to people who are "trusted" with merge access to trunk. You'll need good communication amongst the companies. Using the open source Trac provides a wiki, integrated RSS feeds of the commits to the project and code browser.
Usually, each site works on its dedicated branch and can import the other remote site branch, to decide what to integrate in its own work.
But if a site need to work directly on the other site branch, one possible practice is the concept of branch membership which allows only one site at a time to work on a given branch.
(not sure it is possible with SVN though)
That allows for two remote site (with a large time shift) to work on the same task in a tightly integrated manner.
My recommendation : subversion, with that configured you give away a url and then checkout, update, get things done and when you guess that the project is ready, snapshot and deliver.

Source Control - Distributed Systems vs. Non Distributed - What's the difference?

I just read Spolsky's last piece about Distributed vs. Non-Distributed version control systems http://www.joelonsoftware.com/items/2010/03/17.html. What's the difference between the two? Our company uses TFS. What camp does this fall in?
The difference is in the publication process:
a CVCS (Centralized) means: to see the work of your colleague, you must wait for them to publish (commit) to the central repository. Then you can update your workspace.
You are an active producer: if you don't publish anything, nobody sees anything.
You are a passive consumer: you discover new updates when you refresh your workspace, and have to deal with those changes whether you want it or not.
.
a DVCS means: there is no "one central repository", but every workspace is a repository, and to see the work of your colleague, you can refer to his/her repo and simply pulled its history into your local repo.
You are a passive producer: anyone can "plug in" into your repo and pull local commits that you did into their own local repo.
You are an active consumer: any update you are pulling from other repo is not immediately integrated into your active branch unless you explicitly make it so (through merge or rebase).
Version Control System is about mastering the complexity of the changes in data (because of parallel tasks and/or parallel works on one task), and the way you collaborate with others (other tasks and/or other people) is quite different between a CVCS and a DVCS.
TFS (Team Foundation Server) is a project management system which includes a CVCS: Team Foundation Version Control (TFVC), centered around the notion of "work item".
Its centralized aspect enforces a consistency (of other elements than just sources)
See also this VSS to TFS document, which illustrates how it is adapted to a team having access to one referential.
One referential means it is easier to maintain (no synchronization or data refresh to perform), hence the greater number of elements (tasks lists, project plans, issues, and requirements) managed in it.
Simply speaking, a centralized VCS (including TFS) system has a central storage and each users gets and commits to this one location.
In distributed VCS, each user has the full repository and can make changes that are then synchronized to other repositories, a server is usually not really necessary.
Check out http://hginit.com. Joel wrote a nice tutorial for Mercurial, which is a DVCS. I hadn't done any reading about DVCS before (I've always used SVN) and I found it easy to understand.
A centralized VCS (CVCS) involves a central server that is interacted with. A distributed VCS (DVCS) doesn't need a centralized server.
DVCS checkouts are complete and self-contained, including repository history. This is not the case with CVCS.
With a CVCS, most activities require interacting with the server. Not so with DVCS, since they are "complete" checkouts, repo history and all.
You need write access to commit to a CVCS; users of DVCS "pull" changes from each other. This leads to more social coding facilitated by the likes of GitHub and BitBucket.
Those are a few relevant items, no doubt there are others.
The difference is huge.
In distributed systems, each developer works in his own sandbox; he has the freedom to experiment as much as he want, and only push to the "main" repository when his code is ready.
In central systems, everyone works in the same sandbox. This means that if your code is not stable, you can't check it in, because you will break everyone else's code.
If you're working on a feature, it will naturally take a while before it stabilizes, and because you can't afford to commit any unstable code, you would sit on changes until they're stable. This makes development really really slow, specially when you have lots of people working on the project. You just can't add new features easily because you have this stabilization issue where you want the code in the trunk to be stable but you can't!
with distributed systems, because each developer works on his own sandbox, he doesn't need to worry about messing up anyone else's code. And because these systems tend to be really good at merging, you can still have your codebase be up to date with the main repository while still maintaining your changes in your local repository.
I would recommend reading Martin Fowler's review of Version Control Tools
In short the key difference between CVCS and DVCS is that the former (of which TFS is an example) have one central repository of code and in the latter case, there are multiple repositories and no one is 'by default' the central one - they are all equal.

Merging and branching shared code between projects in TFS

I'm currently in charge of migrating our asp.net applications from source safe to TFS. We have three or four very similar apps (let us say e-commerce) that currently share a core library (services, business logic, entities, data access etc).
The applications are similar but not identical so one app might get a feature set the others won't get etc.
I want to stop the sharing of code and instead set up branches (if that fits) so if I change something in Application A:s core library I will need to merge the changes with the other branches instead of them getting the changes automatically. This to avoid surprises when you update from your trunk and suddenly the core has changed for another project and this project breaks in some way.
Any suggestions on how I should set this up in TFS? Should I have a "main" Core that is not directly used in any project that is the parent of all the other cores so I can push changes up to that one from one core and then distribute it to the other cores? Does that make sense and would it be easy to set up in TFS?
In response to your comment, I'd suggest you to read up on Feature branches on the CodePlex website.
Scenario 4 – Branch for Feature
In this scenario, you create a
development branch, perform work in
that branch, and then merge your work
back into your main source tree. You
organize your development branches
based on product features. The
following is a physical view showing
branching for feature development:
My Team Project
Development -> Isolated development branch container
Feature A -> Feature branch
Source
Feature B -> Feature branch
Source
Feature C -> Feature branch
Source
Main -> Main Integration branch
Source
We are alos moving from SS to TFS in the near future.
As I perceive it, we are going to keep our SS repository online and start fresh over in TFS. Our framework probably will get its own project in TFS. Project specific shared units will need to get merged from time to time.
The way you structure your repository depends on your specific situation. Every branch scenario has its specific advantages and drawbacks.
How many projects
How many developers
Are the developers dedicated
Do you need concurrent hot fixes
Do you need service packs
Take a look at the CodePlex branching guide for all the information you need to make an informed decision about your TFS structure. Print out the cheat sheets and pin them to your wall for quick reference.
Before executing on your branch plan,
pay attention to this cautionary
message - every branch you create does
have a cost so make sure you get some
value from it. The mechanics of
branching in TFS are simplified to a
single right click branch command.
However, the total cost of branching
is paid by reduced code velocity to
main, merge conflicts and additional
testing can be expensive.
I am assuming you have already investigated whether you truly need to make your "copies" seperate team projects. Remember the TFS concept of a "Team Project" is a VERY LARGE high level container. It is not the same thing as what most IT shops consider a "Project". Think of "Microsoft Vista" or "Office 2007" as a project, not, say "A new release of Company XYZ's Accounts Receivable System" as a project in the Team Project sense.
I have a client that decided on one single Team Project for TFS. There is nothing wrong with this - and it is truly the best scenario in many circumstances.
If you truly need a very strong isolation between your copies of the application (perhaps they are seperate clients and you need very strong security seperation) and must have seperate team projects.
That said - you still - as you've stated need to share code between instances of your application. The first thing I would strongly recommend is to get away from "Cut and Paste" sharing. I would truly try to isolate the shared code into a seperate Solution and generate binaries for that (perhaps you've already done this!)
This is covered in the Codeplex TFS: http://tfsguide.codeplex.com/
Another approach I've done for several clients - is to have a Team Project that contains the shared code. The "Build" creates the binaries for the shared code - and the "Deploy" simply copies those to a "known location" (ie UNC share on the build machine)
For the applications that are "Consumers" of the "Framework" we simply used the "AdditionalReferencesPath" Item group to include the location of that known location.
Furthermore - this tool: http://tfsdepreplicator.codeplex.com/ can be helpful. This would allow you to have builds automatically triggered for your "Consumer" Projects whenever the "Framework" solution is built.
My brief answer is that you should only setup one 'TFS project' and simply organize your different projects, i.e. your individual applications, and each shared library, as separate folders under that one TFS project. The alternative is to include specific (binary) builds of the shared libraries in each individual application – if you do that then you can organize each application into it's own TFS project, tho you can't merge changes or branch those projects without using the TFS command line (and some non-obvious commands to boot).
I was trying to determine the same information, this guide on codeplex is perfect
http://vsarbranchingguide.codeplex.com/releases
Includes terminology and different branching workflow approaches as well as cheat sheets.