Avoid rebuilding artifacts in Jenkins multibranch pipelines on merge

TL;DR
How do I avoid rebuilding artifacts on master when a feature is merged without creating multiple pipelines per project? Where do I access the information about which branch was merged?
More Info
I run Jenkins to build many projects stored in two different VCSs (GitLab, Bitbucket). Auto-discovery works for both VCSs and creates multibranch pipelines for every project/branch/PR containing a Jenkinsfile (GitLab Branch Source Plugin, Bitbucket Branch Source Plugin).
Build artifacts are produced and stored on every build (e.g. Docker images pushed to a registry).
As I follow a feature-branch workflow, these features eventually get merged into master; master is then deployed at irregular intervals.
When the merge happens, an artifact has already been built and stored for this code (see appendix 1). It was built for the feature branch the code originated from (e.g. the container mysuperapp:feat-add-better-things-3). I would like to take this artifact and promote it as the new master artifact (e.g. mysuperapp:master), avoiding a rebuild (and re-running all the unit and integration tests).
But merging a feature branch just kicks off a new build pipeline on branch master, without any information about the merged branch (see appendix 2). This is correct behavior as far as master is concerned (new commits were pushed), but it prevents me from reacting to the merged branch (e.g. the aforementioned promotion, or even just deleting unused artifacts). Is there any way to find out which branch was merged?
I am aware that I can create a new pipeline listening for PR webhooks from my VCSs, run a pipeline to do the promotion, and ignore builds on master completely. But this moves the visibility of this process to a different pipeline and requires additional pipelines per project, halving the advantage of auto-discovery (I would have to create these merge pipelines for each project).
How can I keep the advantages of auto-discovery and visibility of executed steps while also executing something on a merge?
Ideas:
Tag artifacts differently, but how (it needs to be possible to clean up correctly)?
Parameterize pipelines and set up a single merge pipeline which re-triggers the 'push on master' pipeline with the merged branch as a parameter. But can this be done without having to set up the webhooks for every project?
Ask the VCSs via REST which branch a commit belonged to?
Greetings and thanks for the help, everyone! This may be a complicated one, but it would be so cool to get this to work. It's the last barrier for me to enable continuous delivery for a lot of projects!
Appendix:
1: I am also aware that, to have consistent builds, I have to enforce --ff-only merges. This question is not about the pitfalls of git but rather about the way to go with Jenkins.
2: Git provides me with the parent commits, so I can easily find out which commit was merged. But using "Delete branch after merge" in particular leaves me without the branch ref in git. Tagging my Docker images with commits instead of branches leaves me backtracking the last commit of each build in order to delete the old, obsolete build.
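For illustration, the promotion could look like the following shell commands run by the master build. This is a sketch only: it assumes images are tagged by commit hash (one of the ideas above), and the registry prefix is a placeholder.

```bash
# Find the merged commit: the second parent of a merge commit,
# or HEAD itself after a --ff-only merge (see appendix 1).
MERGED_SHA=$(git rev-parse --verify -q HEAD^2 || git rev-parse HEAD)
# Promote the existing feature-branch image instead of rebuilding.
docker pull "registry.example.com/mysuperapp:${MERGED_SHA}"
docker tag  "registry.example.com/mysuperapp:${MERGED_SHA}" "registry.example.com/mysuperapp:master"
docker push "registry.example.com/mysuperapp:master"
```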

Related

Azure DevOps build pipeline from PR trigger: get the source branch

So I've been building a build pipeline that is triggered whenever a pull request is made against master; we have a branch policy such that the only way to change the master branch is through pull requests.
I want the build pipeline to check out the source branch of the PR and make some commits to it as part of the build. I thought I could just use the Build.SourceBranchName variable, but when the pipeline is triggered, SourceBranchName is master, so I could not use it.
Are there any easy ways of doing this?
I want the build pipeline to checkout the source branch of the PR
To check out the source branch of the PR, you could use the predefined system variables for pull requests:
System.PullRequest.SourceBranch and System.PullRequest.TargetBranch
To get the branch that is being reviewed in a pull request, use the variable System.PullRequest.SourceBranch.
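For example, a script step can read that variable, which Azure Pipelines exposes to scripts as an uppercased environment variable, and check out the PR's source branch. This is a sketch, not the only way to do it:

```bash
# System.PullRequest.SourceBranch is available to scripts as
# SYSTEM_PULLREQUEST_SOURCEBRANCH and has the form refs/heads/<branch>.
SOURCE_BRANCH="${SYSTEM_PULLREQUEST_SOURCEBRANCH#refs/heads/}"
git fetch origin "$SOURCE_BRANCH"
git checkout -B "$SOURCE_BRANCH" "origin/$SOURCE_BRANCH"
# Commits made here land on the PR's source branch, not on the merge ref.
```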
Now the issue becomes that a new commit to the PR runs the pipeline again; this should not happen, since I have [skip ci] in the commit message.
As we know, [skip ci] or [ci skip] is used to skip running CI, like the Enable continuous integration option in the UI.
However, our current scenario is a branch policy with build validation, not CI. This is very different from CI, even though they appear to run the same build task. A branch policy protects our branches from being corrupted by incorrect submissions; it is a validation operation, not continuous integration.
Check the document Skipping CI for individual commits for some more details.
So these are two different scenarios; we cannot apply the CI settings to the branch policy.
Second, a branch policy is used to protect our branches: any commit requires validation by the branch policy. Although sometimes we know that our modifications don't require build validation, we cannot be sure there is nothing we have overlooked that would break the target branch. Skipping unnecessary verification brings some convenience, but measured against the risk it introduces, that convenience is negligible, so we don't recommend skipping the branch policy validation.
If you insist on skipping build validation, you can try LJ's suggestion.

Development and Production Environments with GitHub flow

At work, we're now using GitHub, and with that, GitHub flow. My understanding of GitHub flow is that there is a master branch and feature branches. Unlike git flow, there is no develop branch.
This works quite well on projects that we've done, and simplifies things.
However, for our products we have a development and a production environment. For the production environment we use the master branch, but we're not sure how to handle the development environment.
The only idea I can think of is:
When a branch is merged into master, redeploy master using GitHub Actions.
When another branch is pushed, set up a GitHub Action so that any branch other than master is deployed to this environment.
Currently, for projects that require a development environment, we're essentially using git flow (features -> develop -> master).
Do you think my idea is sensible, and if not what would you recommend?
Edit:
Just to clarify, I'm asking the best way to implement development with GitHub Flow and not git flow.
In my experience, GitHub Flow with multiple environments works like this. Merging to master does not automatically deploy to production. Instead, merging to master creates a build artifact that can be promoted through environments using ChatOps tooling.
For example, pushing to master creates a build artifact named something like my-service-47cbd6c, a combination of the service name and the short commit hash. This is pushed to an artifact repository of some kind. The artifact can then be deployed to various environments using tooling such as ChatOps-style slash commands to trigger the deploy. This tooling could also include checks to make sure test environments are not skipped, for example. Finally, the artifact is promoted to production.
So for your use case with GitHub Actions, what I would suggest is this:
Pushing to master creates the build artifact and automatically deploys it to the development environment (a minimal sketch follows after this list).
Test in development
Promote the artifact by deploying to production using a slash command. The action slash-command-dispatch would help you with this.
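A minimal sketch of step 1, assuming a Docker registry; the service name, registry URL, and deploy script are placeholders:

```bash
# Build once on master, tag with the short commit hash, and push.
SHORT_SHA=$(git rev-parse --short HEAD)
ARTIFACT="my-service-${SHORT_SHA}"                   # e.g. my-service-47cbd6c
docker build -t "registry.example.com/${ARTIFACT}" .
docker push "registry.example.com/${ARTIFACT}"
# Deploy the same immutable artifact to development; promotion to
# production later reuses this tag instead of rebuilding.
./deploy.sh development "registry.example.com/${ARTIFACT}"  # hypothetical script
```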
You might also consider the notion of environments (as illustrated here)
Recently (Feb. 2021), you can:
Limit which branches can deploy to an environment
You can now limit which branches can deploy to an environment using Environment protection rules.
When a job tries to deploy to an environment with Deployment branches configured, Actions will check the value of github.ref against the configuration; if it does not match, the job will fail and the run will stop.
The Deployment branches rule can be configured to allow:
All branches – Any branch in the repository can deploy
Protected branches – Only branches with protection rules
Selected branches – Branches matching a set of name patterns
That means you can define a job that deploys to the dev environment, and that job, as a condition, will only run if triggered by a commit pushed to a given branch (master in your case).
For anyone facing the same question or wanting to simplify their process away from gitflow, I'd recommend taking a look at this article. While it doesn't talk about GitHub flow explicitly, it does effectively provide one solution to the OP.
Purists may consider this not to be strictly Gitflow, but to my mind it's a simple tweak that makes the deployment and CI/CD strategy more explicit in git. I prefer this approach to adding magic to the tooling, which can make a process harder for devs to follow and understand.
I think the Gitflow intro is written fairly pragmatically as well:
Different teams may have different deployment strategies. For some, it may be best to deploy to a specially provisioned testing environment. For others, deploying directly to production may be the better choice...
The diagram in the article sums it up well.
So here we have master == Gitflow main, and the useful addition is the temporary release branch, from which you can deploy to other environments such as development. It is worth considering what you choose to call this temporary branch: in the article it is considered a release branch; in your process it may be a test branch, etc.
You can take or leave the squashing and tagging and the tooling will change between teams. Equally you may or may not care about actual version numbers.
This isn't a million miles away from VonC's answer; the difference is that the process is more tightly defined and geared more towards having multiple developers merge into a single branch and apply fixes in order to get a new version ready for production. It may well be that you configure the deployment of this temporary branch via a naming convention, as in his answer.
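To make the temporary release branch concrete, here is a rough sketch of the git commands involved; the branch name, tag name, and deploy script are assumptions:

```bash
git checkout -b release-1.4 master    # cut a short-lived release branch
./deploy.sh development release-1.4   # deploy it to dev/test (hypothetical script)
# ...apply fixes on release-1.4 and re-deploy until accepted, then:
git checkout master
git merge --no-ff release-1.4
git tag v1.4.0
git push origin master v1.4.0
git branch -d release-1.4             # the branch is temporary; the tag remains
```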
The way I've implemented this flow is using PRs. I did it with Azure DevOps, but I'd say that the same can be achieved with GitHub Actions.
When you have a branch that you intend to test and eventually merge to master and release to production, you create a PR from that branch to master. The PR triggers a pipeline which runs your build, static analysis, and tests. If that passes, the PR is deployed to a test environment where further automated and manual testing can happen. The PR can be reviewed and approved by other developers and, if you need it, by QA after manual testing. You can configure GitHub PR rules to enforce the approvals. Once approved, you can merge the PR to master.
What happens once in master is independent of the workflow above, but most likely a new pipeline will be triggered, which will build a release candidate and run the whole path to production (with or without manual intervention).
One of the tricks is how the PR pipeline decides which environment to deploy the PR to. I can think of three options:
Create an environment on the fly, which will be killed once the PR is merged or closed. This is the most advanced and flexible option (sketched after this list). It would require the system to publish the environment location to the PR.
Have a pool of environments and have the automation figure out which ones are free, choosing one automatically. The environments could be stopped, so you find an environment that is stopped, start it up, and deploy there. Once the PR is closed/merged, stop the environment again. You can publish the environment location to the PR.
Add a label to the PR indicating the environment (e.g. env-1, env-2, etc.). This is the simplest option, but it requires developers to look at the open PRs to see which environments are already in use by other PRs, to avoid overwriting other people's code.
With all these options, once the PR is created, you can just push new commits to the branch and the environment will be updated.
You also need to decide what you want to do when a new commit is pushed to master. You most likely want to trigger a new PR build to update the environments with the latest master, but you can do this automatically or manually, depending on how busy your master is.
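As an illustration of option 1 above, the PR pipeline could derive an ephemeral environment name from the PR number. This is a sketch only: the variables and helper scripts are hypothetical and would map onto your CI's PR trigger and tooling.

```bash
ENV_NAME="pr-${PR_NUMBER}"                    # PR number from your CI's variables
case "$PR_ACTION" in
  opened|synchronize)
    ./provision-env.sh "$ENV_NAME"            # create or update the environment
    ./deploy.sh "$ENV_NAME" "$ARTIFACT"       # deploy the PR build
    echo "Deployed to https://${ENV_NAME}.dev.example.com"  # publish location to the PR
    ;;
  closed)
    ./teardown-env.sh "$ENV_NAME"             # kill the environment on merge/close
    ;;
esac
```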
Nathan, adding a development branch is a good idea: you can work on changes in the new branch and test them in the dev environment, and after getting sign-off to move to the production environment, you can merge your changes into the master branch.
Don't forget to perform regression testing on the merged master branch, to verify that both old and new features work correctly before releasing your code for installation in production.

How to implement Git tag and merge on release?

The final stage of our release pipeline is a manual stage used to confirm the deployed release got its final acceptance. Among the tasks we would like to run in this stage:
Tag the develop branch with the release label. Say "1.2.3".
Merge the develop branch into the master branch.
(We're using Azure Git repositories)
Although it looks like the right moment to make these changes in Git, I'm not quite certain this is the intended usage of Azure release pipelines. I confess to being a bit new to Azure Pipelines, and there seems to be no evident pipeline task for making such changes.
However, I believe this kind of post-release SCM change is quite common.
My question is therefore: where and how should these SCM changes be applied in Azure DevOps?
EDIT: I managed to make it work using a command line task to run git commands. Configuration was passed by means of a variable group.
You can just run the regular git commands in a Command Line script (first clone the repo, or add it as an artifact, then tag and merge), as sketched below.
Or install the Tag Git on Release & Git Merge extensions and use them.
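A minimal sketch of the command-line approach, assuming the pipeline has already cloned the repo and its build identity is allowed to push; the version number would come from the variable group:

```bash
RELEASE_VERSION="1.2.3"   # e.g. from the variable group
git fetch origin develop master
# Tag the tip of develop with the release label.
git tag -a "$RELEASE_VERSION" -m "Release $RELEASE_VERSION" origin/develop
# Merge develop into master and push branch and tag together.
git checkout master
git merge --no-ff origin/develop -m "Merge develop into master for $RELEASE_VERSION"
git push origin master "refs/tags/$RELEASE_VERSION"
```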

Azure datafactory deployment automation from multiple branches

I want to create an automated deployment pipeline for Azure Data Factory.
For a single stream of development, we can configure it using this doc:
https://learn.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment
But when it comes to deploying to two different test data factories for parallel feature development (in two different branches), it does not work, because the adf_publish branch that gets generated is specific to one data factory.
Currently we are doing deployment using PowerShell scripts, passing the list of objects that need to be deployed.
Our repo is in Azure devops.
I tried
Linking the repo to multiple data factories, but this causes issues, perhaps when finding the deltas to publish.
Creating forks of the repo instead of branches, so that adf_publish can be separate for every data factory. But this approach does not work when there is a conflict that needs a manual merge, so testing would be required again instead of moving to prod.
The adf_publish branch gets generated whenever you publish. Publishing takes whatever you have in your repo and updates the data factory with it.
To develop multiple features in parallel, you just need to use "Save". Save commits your changes to the branch you are actually working on; other branches do the same. Whenever you want to publish, first make a pull request from your branch to master, then publish. Any merge conflicts should be resolved when merging everything into the master branch. After that, just publish: there shouldn't be any conflicts, and adf_publish will be generated.
Hope this helped!
A GitHub repository can be associated with only one data factory, and you are only allowed to publish to the Data Factory service from your collaboration branch. Check this.
It seems there is no direct and easy way to accomplish this. If you fork the repo as a workaround, you may have to resolve the conflicts before merging, as Martin suggested.

Azure Pipelines: Store git submodules as artifacts and only build as needed

We have a project written in C that depends on several libraries as git submodules. We built an Azure Pipeline to build it, using multiple containers targeting multiple environments.
The challenge is that the build takes more time than we'd like, partly because the submodules are recompiled every time, even though they do not change.
What I'm looking for is a way to build the submodules only when needed, store them as artifacts, and have the main build consume them.
As far as I understand, I can set up builds for the submodules' repos that poll for changes, but I want my product to depend on specific commits of the submodules, i.e. I'm not always taking the latest submodule version.
So I'm looking to trigger a submodule build whenever we switch to a new commit. Can this be achieved in Azure Pipelines? What would be the best way to manage the artifacts (e.g. store the commit ID as part of the artifact name)?
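One way to handle the artifact naming (a sketch; the artifact-store helper commands are hypothetical): since the superproject pins each submodule to a specific commit, key the artifact on that commit, so a rebuild only happens when the pin changes.

```bash
SUB_PATH="libs/mylib"                          # hypothetical submodule path
SUB_SHA=$(git rev-parse "HEAD:${SUB_PATH}")    # the commit the superproject pins
ARTIFACT="mylib-${SUB_SHA:0:12}"               # commit ID as part of the artifact name
if ! artifact-exists "$ARTIFACT"; then         # hypothetical artifact-store check
  (cd "$SUB_PATH" && make)                     # build the submodule only when needed
  artifact-publish "$ARTIFACT" "$SUB_PATH/build"
fi
artifact-download "$ARTIFACT" prebuilt/mylib   # main build consumes the prebuilt artifact
```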