Are projects using GitHub checks having continuous integration? - github

I am doing data analysis on GitHub projects and I want to filter projects having continuous integration (on GitHub).
There are two types of checks and statuses on GitHub: Checks and Statuses! Projects can use GitHub Apps to run checks or mark their commits with external services (CI or other) [source]. My question here is: does having GitHub Checks (or statuses) results available for a project mean that the project is using CI for sure? If not what other factors should be presented to say a project is having continuous integration?

Possibly. But you can't be sure. It means that some check runs and some status is updated. But without looking at the automation, there is no way to conclude that Continuous integration takes place.
Maybe it checks whether the contributor has signed a contribution agreement.
Maybe it checks for the presence of a Issue id or an attachment.
Maybe it updated some external system (like Service Now) so the issue can be tracked there as well...
Checks and statuses are used in many different ways.
And Continuous Integration looks different for different technologies. Some languages need to be compiled, others won't need that. They'd hopefully have some kind of tests to validate nothing broke during integration, but there is no surefire way to know as it may simply be running a script or using a test framework or something else.
You can probably easily conclude that the absence of checks and statuses likely means that CI isn't being performed (even that can't be said with 100% certainty as an external system may be performing the CI, just not reporting back the status). The presence if checks and statuses means that something happens. But you likely need to dig a bit deeper to classify whether the thing that happens constitutes to CI.

Related

what is the use of CICD and how it saves my time while I can simply push and pull my code from github and make my code into production too easily?

I'm trying to learn CICD concepts on my own, I don't understand how it helps me while I can easily push and pull my code from github and make my code into production
Continuous Integration is mainly a culture than being a tool. So, you need to understand why it's necessary that every developer on a team should integrate their code with the repository are least once a day.
Continuous Delivery also indicates challenges and best practices of delivering high-quality software as soon as possible. So, teams that want to decrease the risk and problems of integrating features and increase the speed of delivering new features should adopt the CI/CD culture.
To ensure that every code added to the repository will work and integrate with other parts, you need to check. For instance, you need to make sure that the project will be built successfully, the tests will be passed, and the new changes will not break any other parts, your code will pass some required code quality checks, and so on.
After that, you have to deploy somehow/publish the version of your software. This process usually has some steps and can be done manually in small teams/projects.
Based on the first rule of Continuous Integration, every team member should integrate the code with the repository multiple times a day. Since the frequency of this integration is high, it's not a good idea to do this process manually. There are always chances that somebody forgets to run the operation. That's the main reason why it's necessary to have an automatic CI/CD pipeline.

force stable trunk/master branch

Our development departments grows and I want to force a stable master/trunk.
Up to now every developer can commit into the master/trunk. In future developers should commit into a staging area, and if all tests pass the code gets moved to the trunk automatically. If the test fails, the developer gets a mail with the failed tests.
We have several repositories: One for the core product, several plugins and a repository for every customer.
Up to now we run SVN and git, but switching all repos to git could be done, if necessary.
Which software could help us to get this done?
There a some articles on the web which explain how to use gerrit and jenkins to force a stable branch.
I am unsure if I need both, or if it is better to use something else.
Environment: We are 10 developers, and use python and django.
Question: Which tool can help me to force a stable master branch?
Update
I was on holiday, and now the bounty has expired. I am sorry. Thank you for your answers.
Question: Which tool can help me to force a stable master branch?
Having been researching this particular aspect of CI quasi-pathologically since our ~20 person PHP/ZF1-based dev team made the switch from SVN to Git over the winter (and I became the de-facto git mess-fixer), I can't help but share my experience with this particular aspect of continuous integration.
While obviously, having a "critical mass of unit tests running" in combination with a slew of conditionally parameterized Jenkins jobs, triggering infinitely more conditionally parameterized jobs, covering every imaginable circumstance would (perhaps) be the best and most proper way to move towards a Continuous Integration/Delivery/Deployment model, the meatspace resources required for such a migration are not insignificant.
Some questions:
Does your team have some kind of VCS workflow or, minimally, rules defined?
What percentage would you say, roughly, of your codebase is under some kind of behavioral (eg. selenium), functional or unit testing?
Does your team ( / senior devs ) actually have the time / interest to get the most out of gerrit's peer-based code review functionality?
On average, how many times do you deploy to production in any given day / week / month?
If the answers to more than one of these questions are 'no', 'none', or 'very little/few', then I'd perhaps consider investing in some whiteboard time to think through your team's general workflow before throwing Jenkins into the mix.
Also, git-hooks. Seriously.
However, if you're super keen on having a CI/Jenkins server, or you have all those basics covered already, then I'd point you to this truly remarkable gem of a blog post:
http://twasink.net/2011/09/16/making-parallel-branches-meet-regularly-with-git-and-jenkins/
And it's equally savvy cousin:
http://twasink.net/2011/09/20/git-feature-branches-and-jenkins-or-how-i-learned-to-stop-worrying-about-broken-builds/
Oh, and of course, the very necessary devopsreactions tumblr.
There a some articles on the web which explain how to use gerrit and jenkins to force a stable branch.
I am unsure if I need both, or if it is better to use something else.
gerrit is for coding review
Jenkins is a job scheduler that can run any job you want, including one:
compiling everything
launching sole unit test.
In each case, the idea is to do some guarded commit, ie pushing to an intermediate repo (gerrit, or one monitored by Jenkins), and only push to the final repo if the intermediate process (review or automatic build/test) passed successfully.
By adding intermediate repos, you can easily force one unique branch on the final "blessed" repo to which those intermediate referential will push to if the commits are deemed worthy.
It sounds like you are looking to establish a standard CI capability. You will need the following essential tools:
Source Version Control : SVN, git (You are already covered here)
CI server : Jenkins (you will need to build and run tests with each
check in, and report results. Jenkins is the defacto standard tool
used for this)
Testing : PyUnit
Artifact Repository : you will need a mechanism for organizing and
archiving the increments created with each build. This could be a
simple home grown directory based system. I have also used Archiva,
but there are other tools.
There are many additional tools that might be useful depending on your development process:
Code review : If you want to make code review a formal gate in your
process, Gerrit is a good tool.
Code coverage analysis : I've used EMMA in the past for Java. I am
sure that are some good tools for Python coverage.
Many others : a library of Jenkin's plugins that provide a variety of
useful tools is available to you. Taking some time to review
available plugins will definitely be time well spent.
In my experience, establishing the right cultural is as important as finding the right tooling.
Testing : one of the 10 principles of CI is "self testing builds". In
other words, you must have a critical mass of unit tests running.
Developers must become test infected. Unit testing must become a
natural, highly value part of each developers individual development
process. In my experience, establishing a culture of test infection
is the hardest part of deploying CI.
Frequent check-in : Developers and managers must organize there work
in a way that allows for frequent small check-ins. CI calls for daily
checkins. This is sometimes a difficult habit to establish.
Responsiveness to feedback : CI is about immediate feedback. The
developers must be conditioned to response to the immediate feedback.
If unit tests fail, the build it broken. Within 15 minutes of a CI
build breaking, the developer responsible should either have a fix
checked in, or have the original, bad check-in backed out.

Ideas on setting up a version control system

I've been tasked with setting up a version control for our web developers. The software, which was chosen for me because we already have other non-web developers using it, is Serena PVCS.
I'm having a hard time trying to decide how to set it up so I'm going to describe how development happens in our system, and hopefully it will generate some discussion on how best to do it.
We have 3 servers, Development, UAT/Staging, and Production. The web developers only have access to write and test their code on the Development server. Once they write the code, they must go through a certification process to get the code moved to UAT/Staging, then after the code is tested thoroughly there, it gets moved to Production.
It seems like making the Developers use version control for their code on Development which they are constantly changing and testing would be an annoyance. Normally only one developer works on a module at a time so there isn't much, if any, risk of over-writing other people's work.
My thought was to have them only use version control when they are ready to go to UAT/Staging. This allows them to develop and test without constantly checking in their code.
The certification group could then use the version control to help see what changes had been made to the module and to make sure they were always getting the latest revision from the developer to put up on UAT/Staging (now we rely on the developer zip'ing up their changed files and uploading them via a web request system).
This would take care of the file side of development, but leaves the whole database side out of version control. That's something else that I need to consider...
Any thoughts or ideas would be greatly appreciated. Thanks.
I would not treat source control as annoyance. See Nicks answer for the reasons.
If I were You, I would not decide this on my own, because it is not a
matter of setting up a version control software on some server but
a matter of changing and improving development procedures.
In Your case, it might be worth explaining and discussing release branches
with Your developers and with quality assurance.
This means that Your developers decide which feature to include into a release
and while the staging crew is busy on testing the "staging" branch of the source,
Your developers can already work on the next release without interfering with the staging team.
You can also think about feature branches, which means that there is a new branch for every specific new feature of the web site. Those branches are merged back, if the feature is implemented.
But again: Make sure, that Your teams agreed to the new development process. Otherwise, You waste Your time by setting up a version control system.
The process should at least include:
When to commit.
When to branch/merge.
What/When to tag.
The overall work flow.
I have used Serena, and it is indeed an annoyance. In addition to the unpleasantness of the workflow overhead Serena puts on top of the check in-check out process, it is a real pain with regard to doing anything besides the simplest of tasks.
In Serena ChangeMan, all code on local machines is managed through a central server. This is a really bad design. This means a lot of day-to-day branch maintenance work that would ordinarily be done by developers has to go through whomever has administrator privileges, making that person 1) a bottleneck and 2) embittered because they have a soul-sucking job.
The centralized management also strictly limits what developers are able to do with the code on their own machine. For example, if you want to create a second copy of the code locally on your box, just to do a quick test or whatever, you have to get the administrator to set up a second repository on your box. When you limit developers like this, you limit the productivity and creativity of your team.
Also, the tools are bad and the user interface is horrendous. And you will never be able to find developers who are already trained to use it, because its too obscure.
So, if another team says you have to use Serena, push back. That product is terrible.
Using source control isn't any annoyance, it's a tool. Having the benefits of branching and tagging is invaluable when working with new APIs and libraries.
And just a side note, a couple of months back one of the dev's machine's failed and lost all his newest source, we asked when the last time he committed code to the source control and it was 2 months. Sometimes just having it to back up stuff when you reach milestones is nice.
I usually commit to source control a couple of times a week, depending if I've hit a good stopping point and I'm about to move on to something different or bigger.
Following on from the last two good points I would also ask your other non-web developers what developmet process they are using so you won't have to create a new one. They would also have encountered many of he problems that occur in your environment, both technical using the same OS and setup and managerial.

What Check-In Policies should be considered for version control?

I'm tasked with helping to set up the process templates and check-in policies for my company's TFS 2008 installation.
Aside from three check-in policies (a check-in action must have comments against it, a code file must be peer-reviewed, there must be a work item associated with a check-in), I have been asked to consider and implement any others.
What are some of the most important or useful policies to enforce for version control?
The fewer the better.
Usually in an organization you want to ease the friction of check-in to ensure that you are encouraging developers to make frequent small discrete check-ins rather than checking out a load of stuff at once. Then again you want to ensure that you have a working codebase for everyone who needs it and are capturing the data that you need to improve your software delivery process.
Personally, a policy to enforce changeset comments and a work item association policy are ok - as they capture meta-data that is very easy to remember at the time but hard to find afterwards. It also encourages developers to get into the habit of having a work item to track all pieces of work - even experimental development or spikes.
The peer review process might be better performed using branching or another process rather than forcing a peer review on every check-in - however that depends on your process. Remember as well that you can have mandatory check-in notes in TFS to capture meta-data such as code reviewer. A check-in note is slightly different to a check-in policy and is often confused.
If you want read more discussion about check-in policies, take a look at a blog post I did on the balancing act a while ago. Also to hear some more discussion about check-in policies, I recorded a podcast recently with a fellow Team System MVP talking about their use of TFS and it might be interesting (Radio TFS, Using TFS with Ed Blankenship). Finally we also did a Radio TFS episode all about check-in policies in 2008 that might be of interest.
Don't break the build! Of course, finding an automated way to check on that and reject the check-in are the challenge.
Some rules that we follow in our company:
Commit all changes related to the same task at once (that will help review the changes and future rollbacks or merges if needed).
template based comments (eg: prefix all comments with a code that represents what was done, + for adds, - for removes, * for updates, ! for important modifications, etc).
Obviously always check-in code that compiles, and finished work to the main-line.
check-in daily unfinished work to branches.
The ones we use where I work on TFS are:
Code Analysis
This ensures that all the code was compiled on the devs machine before it was checked in
Work Item Association
If you've done a change there should have been an assigned task!
Last Build Successful
Using the TFS Build Server to check that the current code in source control compiled on an independant machine
Check In Comments (part of the TFS Powertools - http://msdn.microsoft.com/en-us/teamsystem/bb980963.aspx)
It's good to be able to see a summary of the check in without having to go to the work item(s)
Try to keep the number of developers working on the same branch small. That way the branch stays stable with respect to compilation, the unit tests, and regressions. It's a nightmare if a developer does a check in which compiles but his code breaks a key area of the application (such as login).
If you really have to have more than 10 developers checking code into the same branch, we've started an email policy where the developer checking in warns everyone that they're checking in, so that no one attempts to update their copy of the branch in the midst of a check in. Sometimes, we've had to have the converse, where we set aside an time in the date to prohibit check ins, so that updates are safe.
Frankly, the less policies, the better. The more policies you have, the greater the incentive for NOT using version control. What happens then is:
Code is developed on parallel, uncontrolled source control systems, and just the final revision goes to the official one.
People delay committing as much as possible, decreasing visibility of what they are doing to other developers.
People will actually avoid committing something if they can get away with it, and some will find a way to get away with it.
In fact, I think your three check-in policies are already too much. For instance:
Having code being peer-reviewed before check-in makes it much more difficult to have work-in-progress stored there. Instead, if the source control system allows it (and many do), control whether the source is peer reviewed or not. With some systems you can create a life cycle for a revision, with others you might create branches, and still others you might use tags.
Having a work-item associated with a check-in makes it impossible for developers to do exploratory programming, or having initiative on possible improvements. It stifles the developers. Instead, make sure that any revision going into integration tests or user acceptance tests, not to mention production itself, is associated with a work item.
This might sound anti-Enterprise, but it's just some things we have learned in a few decades of software development. Most enterprise organizations haven't been clued in to this, but, eventually, they will. So, you might go the very opposite way, but don't say no one ever told you.
I recommend the Agile Manifest, and, particularly, Lean Software Development for general principles.
Or, taking Stack Overflow design philosophy into account, make the system reward the behavior you want.

version control practice

In my current job the supervisor's practice is to only check in production ready code. Most recently the project I was on involved work by 3 different developers with some file overlap. This meant manually integrating changes despite the fact that some changes took a day and then it was done. I wanted to see if this was a common practice and get suggestions on how to change this practice with the knowledge that many times my opinion means little in the grand scheme of things.
You can use various ways to handle this situation, depending on your source control system.
Private branches: Allow you to check in and work on code while you go, merging back and forth at appropriate times.
Shelvesets/pacakaged changesets: Allow you to store changesets and send them around for review - ensuring they're production ready before check in.
As to whether this is an appropriate way to work, we don't allow check-in to main branches without prior review. To pass review your code must pass various automated tools, and then must be acceptable to your peer reviewer. For some definitions of "production ready" - this is it. Therefore, we do something like what you do. However, we use private branches to ensure that check-ins can still be made while this is in progress, and that other check-ins don't have to interfere.
If production ready means tested in an integration environment, then it sounds like you may need staging branches or something similar.
Code that is checked in should be unit tested, but, to me, "production ready" implies that it's gone through integration and system testing. You can't do that until a code freeze, so I don't see how you can do that before every check in.
Start by switching away from VSS to something more reliable & feature-rich. See How to convince a company to switch their Source Control
Then apply known-good practices:
Check in often
Pick up others' changes often, to simplify merging
Use fast unit tests to make sure each change meets a minimum bar
Require that that the checked-in code always builds, and always passes tests.
Now you won't be "production ready" at this point: you will still need a couple weeks to test & fix before you can deploy. Getting that time down is awesome for you, and awesome for your customer, so invest in:
High quality automated acceptance tests.
wouldn't it be a good idea to have a testing branch of the repo that can have the non "production ready code" checked in after the changes are done and tested?
the main trunk should never have code checked in that breaks the build and doesn't pass unit tests, but branches don't have to have all those restrictions in place.
I would personally not approve of this because sometimes that's the best way to catch problem code with less experienced developers (by seeing it as they are working on it) and when you "check in early and often" you can rollback to earlier changes you made (as you were developing) if you decide that some changes you made earlier was actually a better idea.
I think it may be the version control we user, VSS in combination with a lack of time to learn the branching. I really like the idea of nightly check ins to help with development and avoid 'Going Dark'. I can see him being resistant to the trunks but perhaps building a development SS and when the code is production ready move it to production SS.
From the practices I have seen the term production quality is used as a 'frightener' to ensure that people are scared of breaking top of tree, not a bad thing to be honest because top of tree should always work if possible.
I would say that best practice is that you should only be merging distinct (i.e. seperate) functional components on the top of tree. If you have a significant overlap on deltas to the same source files I think this 'might' indicate that somewhere along the line the project management has broken down, and that those developers should have merged their changes to seperate integration branch before going in to the main line sources. An individual developer saying that they unit tested their stuff is irrelevant, because the thing they tested has changed!
Trying to solve integration problems on your main line codeline will inevitably stall other unrelated submissions.
Assuming that you are working in a centralized version control system (such as Subversion), and assuming that you have a concept of "the trunk" (where the latest well-working code lives):
If you work on new features in "features branches"/"experimental branches", then it's OK to commit code which is far from finished. (When the feature is done, you commit the well-behaving result into the "trunk".)
But you will not win a popularity contest if committing non-compiling/obviously non-working code into the "trunk" or a "release branch".
The Pragmatic Programmers have a book called Pragmatic Version Control using Subversion which includes a section with advice about branches.
Check in early and check in often for two main reasons -
1 - it might make it easier to integrate code
2 - in case your computer explodes your weeks of work isn't gone
An approach I particularly like is to have different life cycle versions in the depot. That is,for example, have a dev version of the code that is where the developers check in code that is in being worked on; then you could have a beta version, where you could add beta fixes to your code; and then a production version.
There is obvious overhead in this approach, such as the fact that you will have a larger workspace on you local machine, the fact that you will need need to have a migration process into place to move code from one stage to the next (which means a code freeze when doing the integration testing that goes with the migration), and that depending on the complexity of the project(s) you might need to have tools that change settings, environment variables, registry entries, etc.
All of this is a pain to set up, but you only do it once, and once you have it all in place, makes working on different stages of the code a breeze.
#bpapa
Nightly backups of work folders to servers will prevent losing more than a days work.
#tonyo
Let's see the requirement documents were completed the day after we finished coding. Does that tell you about our project management?
We are a small shop so while you would think change is easy there are some here that are unbending to the old ways.