Concourse: how to pass a job's output to a different job

It's not clear to me from the documentation whether it's even possible to pass one job's output to another job (not from task to task, but from job to job).
I don't know if I'm conceptually doing the right thing, maybe it should be modeled differently in Concourse, but what I'm trying to achieve is a pipeline for a Java project split into several granular jobs, which can be executed in parallel and triggered independently if I need to re-run some job.
How I see the pipeline:
First job:
pulls the code from github repo
builds the project with maven
deploys artifacts to the maven repository (mvn deploy)
updates SNAPSHOT versions of the Maven project submodules
copies artifacts (jar files) to the output directory (output of the task)
Second job:
picks up the jars from the output
builds Docker containers for all of them (in parallel)
The pipeline goes on from there.
I was unable to pass the output from job 1 to job 2.
Also, I am curious if any changes I introduce to the original git repo resource will be present in the next job (from job 1 to job 2).
So the questions are:
What is the proper way to pass build state from job to job (I know jobs might get scheduled on different nodes, and definitely run in different containers)?
Is it necessary to store the state in a resource (say, S3/git)?
Is Concourse stateless by design (in this context)?
Where's the best place to get more info? I've tried the manual, but it's just not that detailed.
What I've found so far:
outputs are not passed from job to job
Any changes to the resource (a put to the GitHub repo) are fetched in the next job, but changes to the working copy are not.
Minimal example (it fails with the error missing inputs: gist-upd, gist-out if the commented lines are uncommented):
---
resources:
- name: gist
  type: git
  source:
    uri: "git@bitbucket.org:snippets/foo/bar.git"
    branch: master
    private_key: {{private_git_key}}

jobs:
- name: update
  plan:
  - get: gist
    trigger: true
  - task: update-gist
    config:
      platform: linux
      image_resource:
        type: docker-image
        source: {repository: concourse/bosh-cli}
      inputs:
      - name: gist
      outputs:
      - name: gist-upd
      - name: gist-out
      run:
        path: sh
        args:
        - -exc
        - |
          git config --global user.email "nobody@concourse.ci"
          git config --global user.name "Concourse"
          git clone gist gist-upd
          cd gist-upd
          echo `date` > test
          git commit -am "upd"
          cd ../gist
          echo "foo" > test
          cd ../gist-out
          echo "out" > test
  - put: gist
    params: {repository: gist-upd}

- name: fetch-updated
  plan:
  - get: gist
    passed: [update]
    trigger: true
  - task: check-gist
    config:
      platform: linux
      image_resource:
        type: docker-image
        source: {repository: alpine}
      inputs:
      - name: gist
      #- name: gist-upd
      #- name: gist-out
      run:
        path: sh
        args:
        - -exc
        - |
          ls -l gist
          cat gist/test
          #ls -l gist-upd
          #cat gist-upd/test
          #ls -l gist-out
          #cat gist-out/test

To answer your questions one by one.
All build state needs to be passed from job to job in the form of a resource, which must be stored in some sort of external store.
It is necessary to store the state in some sort of external store. Each resource type handles this upload and download itself, so for your specific case I would check out this Maven custom resource type, which seems to do what you want.
Yes, this statelessness is the defining trait of Concourse. The only stateful element in Concourse is a resource, which must be strictly versioned and stored in an external data store. When you combine the containerization of tasks with the external storage of resources, you get the guaranteed reproducibility that Concourse provides. Each version of a resource is backed up on some sort of data store, so even if the data center your CI runs in completely falls down, you still have strict reproducibility of each of your CI builds.
To get more info I would recommend doing a tutorial of some kind to get your hands dirty and build a pipeline yourself. Stark & Wayne have a tutorial that could be useful. To help you understand resources there is also a resources tutorial, which might be helpful for you specifically.
Also, to get to your specific error: the reason you are seeing missing inputs is that Concourse looks for directories (made by resource gets) named after each of those inputs. So you would need to get resource instances named gist-upd and gist-out prior to starting the task.
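For illustration, here is a minimal sketch (not your exact setup; the bucket, regexp, credentials and task file are made up) of how the jar hand-off between two jobs could be modeled with the S3 resource, with passed: ensuring the second job only runs against artifacts produced by the first:
resources:
- name: app-jar
  type: s3
  source:
    bucket: my-artifact-bucket            # hypothetical bucket
    regexp: app/app-(.*).jar
    access_key_id: ((aws_access_key))
    secret_access_key: ((aws_secret_key))

jobs:
- name: build
  plan:
  - get: source-code
    trigger: true
  - task: mvn-package
    file: source-code/ci/package.yml      # hypothetical task that writes build-output/app-<version>.jar
  - put: app-jar
    params: {file: build-output/app-*.jar}

- name: dockerize
  plan:
  - get: app-jar
    passed: [build]                       # only versions that went through the build job
    trigger: true
  # ... build the container image from the fetched jar here
The same shape works with the Maven resource type mentioned above in place of S3.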

Related

How to retain docker-image after release-candidate build for future push to prod registry

In our release-candidate pipeline to our QA (think "stage") environment, we are using a vmImage to build our Docker container and then push it to our Non-Prod registry.
pool:
  vmImage: "ubuntu-latest"
steps:
- task: pseudo-code ## get everything prepped for buildAndPush
- task: Docker@2
  inputs:
    containerRegistry: "Our Non-Prod Registry"
    repository: "apps/app-name"
    command: "buildAndPush"
    Dockerfile: "**/Dockerfile"
    tags: |
      $(Build.SourceBranchName)
These are release-candidates. Once the code is approved for release, we want to push the same container to our Production registry; however, we can't figure out how to do it in this framework. Seems like we have only two options:
rebuild it in a different run of the pipeline later which will push it to our Production registry
push every release-candidate container to our Production registry
I don't like either of these options. What I want is a way to retain the container somewhere, and then to push it to our Production registry when we decide that we want to release it.
How would we do this?
You can actually:
Push the Docker image to your non-production registry.
Once it is approved for release (I don't know what exactly you mean by that in terms of your pipeline), you can (please check this topic) pull the image, tag it and push it to the new registry:
docker pull old-registry/app
docker tag old-registry/app new-registry/app
docker push new-registry/app
You can also take a look here, where the above case is described.
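In Azure Pipelines terms, a rough sketch of that promotion could be a later stage or pipeline that logs in to both registries and re-tags the image; the service connection names and registry hosts below are made up:
steps:
- task: Docker@2
  displayName: Login to Non-Prod registry
  inputs:
    command: login
    containerRegistry: "Our Non-Prod Registry"   # hypothetical service connection
- task: Docker@2
  displayName: Login to Prod registry
  inputs:
    command: login
    containerRegistry: "Our Prod Registry"       # hypothetical service connection
- script: |
    docker pull nonprod.example.com/apps/app-name:$(Build.SourceBranchName)
    docker tag nonprod.example.com/apps/app-name:$(Build.SourceBranchName) prod.example.com/apps/app-name:$(Build.SourceBranchName)
    docker push prod.example.com/apps/app-name:$(Build.SourceBranchName)
  displayName: Promote release-candidate image to Prod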

How to write CI/CD pipeline to run integration testing of java micro services on Google kubernetes cluster?

Background:
I have 8-9 private ClusterIP Spring-based microservices in a GKE cluster. All of the microservices have integration tests bundled with them. I am using Bitbucket and Maven as the build tool.
All of the microservices talk to each other via REST calls with a URL like: http://:8080/rest/api/fetch
Requirement: I have a testing environment ready with all the Docker images up on the GKE test cluster. I want that, as soon as I merge the code to master for service-A, the pipeline should deploy the image to test-env and run the integration test cases. If the test cases pass, it should deploy to the QA environment; otherwise it should roll back the image of service-A to the previous one.
Issue: On every code merge to master, I am able to run the JUnit test cases of service-A, build its Docker image, push it to GCR and deploy it on the test-env cluster. But how can I trigger the integration test cases after the deployment, and roll back to the previously deployed image if the integration tests fail? Is there any way?
TIA
You can create different steps for each part:
pipelines:
  branches:
    BRANCH_NAME:
      - step:
          script:
            - BUILD
      - step:
          script:
            - DEPLOY
      - step:
          script:
            - First set of JUnit tests
      - step:
          script:
            - Run integration tests (here you can add the rollback if they fail)
      - step:
          script:
            - Upload to QA
There are many ways you can do it. From the above information it's not clear which build tool you are using.
Let's say you are using Bamboo: you can create a task for this and include it in the SDLC process. The task can mostly be a Bamboo script or an Ansible script.
You could also create a separate shell script to run the integration test suite after deployment, something like the sketch below.
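As an illustration only (the deployment name, namespace and Maven profile are hypothetical), such a script could run the integration tests against test-env and roll the deployment back with kubectl if they fail:
#!/usr/bin/env bash
set -euo pipefail

DEPLOYMENT=service-a     # hypothetical deployment name
NAMESPACE=test-env       # hypothetical namespace

# Wait for the newly pushed image to finish rolling out on the test cluster
kubectl rollout status deployment/"$DEPLOYMENT" -n "$NAMESPACE" --timeout=300s

# Run the integration test suite against the test environment
if mvn verify -Pintegration-tests; then
  echo "Integration tests passed; the image can be promoted to QA."
else
  echo "Integration tests failed; rolling back $DEPLOYMENT."
  kubectl rollout undo deployment/"$DEPLOYMENT" -n "$NAMESPACE"
  exit 1
fi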
You should probably check what Tekton is offering.
The Tekton Pipelines project provides k8s-style resources for declaring CI/CD-style pipelines.
If you use GitLab CI/CD, you can break the pipeline into stages as follows:
stages:
  - compile
  - build
  - test
  - push
  - review
  - deploy
where you should compile the code in the first stage, then build the Docker images from it in the next, and then pull the images and run them to do all your tests (including the integration tests).
Here is a mock-up of how it will look:
compile-stage:
  stage: compile
  script:
    - echo 'Compiling Application'
    # - bash my compile script
  # Compiled artifacts can be used in the build stage.
  artifacts:
    paths:
      - out/dist/dir
    expire_in: 1 week

build-stage:
  stage: build
  script:
    - docker build . -t "${CI_REGISTRY_IMAGE}:testversion" ## Dockerfile should make use of out/dist/dir
    - docker push "${CI_REGISTRY_IMAGE}:testversion"

test-stage1:
  stage: test
  script:
    - docker run -it ${CI_REGISTRY_IMAGE}:testversion bash unit_test.sh

test-stage2:
  stage: test
  script:
    - docker run -d ${CI_REGISTRY_IMAGE}:testversion
    - ./integration_test.sh

## You will only push the latest image if the build passes all the tests.
push-stage:
  stage: push
  script:
    - docker pull ${CI_REGISTRY_IMAGE}:testversion
    - docker tag ${CI_REGISTRY_IMAGE}:testversion ${CI_REGISTRY_IMAGE}:latest
    - docker push ${CI_REGISTRY_IMAGE}:latest

## The app will be deployed on staging if it has passed all the tests.
## The concept of CI/CD is generally that you should do all the automated tests before even deploying on staging.
## Staging can be used for User Acceptance and Quality Assurance tests etc.
deploy-staging:
  stage: review
  environment:
    name: review/$CI_COMMIT_REF_NAME
    url: https://$CI_ENVIRONMENT_SLUG.$KUBE_INGRESS_BASE_DOMAIN
    on_stop: stop_review
  only:
    - branches
  script:
    - kubectl apply -f deployments.yml

## The deployment to the production environment is manual and happens only when a version tag is committed.
deploy-production:
  stage: deploy
  environment:
    name: prod
    url: https://$CI_ENVIRONMENT_SLUG.$KUBE_INGRESS_BASE_DOMAIN
  only:
    - tags
  script:
    - kubectl apply -f deployments.yml
  when: manual
I hope the above snippet will help you. If you want to learn more about deploying microservices using GitLab CI/CD on GKE, read this.

Reuse portion of github action across jobs

I have a workflow for CI in a monorepo; two projects end up being built for this workflow. The jobs run fine; however, I'm wondering if there is a way to remove the duplication in this workflow.yml file around setting up the runner for the job. I have the jobs split so they run in parallel, as they do not rely on one another, and so the CI finishes faster. It's a big time difference: 5 minutes vs. 10+ when waiting for the CI to finish.
jobs:
  job1:
    name: PT.W Build
    runs-on: macos-latest
    steps:
      - name: Checkout Repo
        uses: actions/checkout@v1
      - name: Setup SSH-Agent
        uses: webfactory/ssh-agent@v0.2.0
        with:
          ssh-private-key: |
            ${{ secrets.SSH_PRIVATE_KEY }}
      - name: Setup JDK 1.8
        uses: actions/setup-java@v1
        with:
          java-version: 1.8
      - name: Setup Permobil-Client
        run: |
          echo no | npm i -g nativescript
          tns usage-reporting disable
          tns error-reporting disable
          npm run setup.all
      - name: Build PT.W Android
        run: |
          cd apps/wear/pushtracker
          tns build android --env.uglify
  job2:
    name: SD.W Build
    runs-on: macos-latest
    steps:
      - name: Checkout Repo
        uses: actions/checkout@v1
      - name: Setup SSH-Agent
        uses: webfactory/ssh-agent@v0.2.0
        with:
          ssh-private-key: |
            ${{ secrets.SSH_PRIVATE_KEY }}
      - name: Setup JDK 1.8
        uses: actions/setup-java@v1
        with:
          java-version: 1.8
      - name: Setup Permobil-Client
        run: |
          echo no | npm i -g nativescript
          tns usage-reporting disable
          tns error-reporting disable
          npm run setup.all
      - name: Build SD.W Android
        run: |
          cd apps/wear/smartdrive
          tns build android --env.uglify
You can see that the jobs follow an almost identical process; it's just the building of the different apps themselves that differs. I'm wondering if there is a way to take the duplicate blocks in the jobs, write that part only once, and reuse it in both jobs.
There are three main approaches to code reuse in GitHub Actions:
Reusable Workflows
Dispatched workflows
Composite Actions <-- it's the best one in your case
The following details are from my article describing their pros and cons:
🔸 Reusing workflows
The obvious option is using the "Reusable workflows" feature that allows you to extract some steps into a separate "reusable" workflow and call this workflow as a job in other workflows.
🥡 Takeaways:
Nested reusable workflow calls are allowed (up to 4 levels) while loops are not permitted.
Env variables are not inherited. Secrets can be inherited by using the special secrets: inherit job param.
It's not convenient if you need to extract and reuse several steps inside one job.
Since it runs as a separate job, you have to use build artifacts to share files between a reusable workflow and your main workflow.
You can call a reusable workflow in a synchronous or asynchronous manner (managing this through job ordering using needs keys).
A reusable workflow can define outputs that extract outputs/outcomes from executed steps. They can be easily used to pass data to the "main" workflow.
🔸 Dispatched workflows
Another possibility that GitHub gives us is the workflow_dispatch event, which can trigger a workflow run. Simply put, you can trigger a workflow manually or through the GitHub API and provide its inputs.
There are actions available on the Marketplace which allow you to trigger a "dispatched" workflow as a step of the "main" workflow.
Some of them also allow doing it in a synchronous manner (waiting until the dispatched workflow is finished). It is worth saying that this feature is implemented by polling the statuses of the repo's workflows, which is not very reliable, especially in a concurrent environment. Also, it is bound by GitHub API usage limits and therefore has a delay in finding out the status of a dispatched workflow.
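For example, a workflow that declares on: workflow_dispatch with inputs can be triggered from a step or from your machine with the GitHub CLI; the workflow file name and input here are made up:
gh workflow run deploy.yml --ref main -f environment=staging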
🥡 Takeaways
You can have multiple nested calls, triggering a workflow from another triggered workflow. If done carelessly, this can lead to an infinite loop.
You need a special token with "workflows" permission; your usual secrets.GITHUB_TOKEN doesn't allow you to dispatch a workflow.
You can trigger multiple dispatched workflows inside one job.
There is no easy way to get some data back from dispatched workflows to the main one.
Works better in a "fire and forget" scenario. Waiting for a dispatched workflow to finish has some limitations.
You can observe dispatched workflows runs and cancel them manually.
🔸 Composite Actions
In this approach we extract steps into a distinct composite action, which can be located in the same or a separate repository.
From your "main" workflow it looks like a usual action (a single step), but internally it consists of multiple steps, each of which can call its own actions (see the sketch after the takeaways below).
🥡 Takeaways:
Supports nesting: each step of a composite action can use another composite action.
Poor visualisation of the internal steps' run: in the "main" workflow it is displayed as a usual step run. In the raw logs you can find the details of the internal steps' execution, but it doesn't look very friendly.
Shares environment variables with a parent job, but doesn't share secrets, which should be passed explicitly via inputs.
Supports inputs and outputs. Outputs are prepared from outputs/outcomes of internal steps and can be easily used to pass data from composite action to the "main" workflow.
A composite action runs inside the job of the "main" workflow. Since they share a common file system, there is no need to use build artifacts to transfer files from the composite action to the "main" workflow.
You can't use continue-on-error option inside a composite action.
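As a sketch for your case (the action path, input name and descriptions are my own invention, not something from your repo), the shared steps could live in a local composite action, e.g. .github/actions/setup-permobil/action.yml:
name: Setup Permobil build environment
description: SSH agent, JDK and NativeScript setup shared by the build jobs
inputs:
  ssh-private-key:
    description: SSH key used to fetch private dependencies
    required: true
runs:
  using: "composite"
  steps:
    - uses: webfactory/ssh-agent@v0.2.0
      with:
        ssh-private-key: ${{ inputs.ssh-private-key }}
    - uses: actions/setup-java@v1
      with:
        java-version: 1.8
    - name: Setup Permobil-Client
      shell: bash
      run: |
        echo no | npm i -g nativescript
        tns usage-reporting disable
        tns error-reporting disable
        npm run setup.all
Each job then keeps only its checkout step, a uses: ./.github/actions/setup-permobil step (passing the secret as an input), and its own build step.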
As far as I know there is currently no way to reuse steps,
but in this case you can use a strategy matrix for parallel builds of the different variations:
jobs:
  build:
    name: Build
    runs-on: macos-latest
    strategy:
      matrix:
        build-dir: ['apps/wear/pushtracker', 'apps/wear/smartdrive']
    steps:
      - name: Checkout Repo
        uses: actions/checkout@v1
      - name: Setup SSH-Agent
        uses: webfactory/ssh-agent@v0.2.0
        with:
          ssh-private-key: |
            ${{ secrets.SSH_PRIVATE_KEY }}
      - name: Setup JDK 1.8
        uses: actions/setup-java@v1
        with:
          java-version: 1.8
      - name: Setup Permobil-Client
        run: |
          echo no | npm i -g nativescript
          tns usage-reporting disable
          tns error-reporting disable
          npm run setup.all
      - name: Build Android
        run: |
          cd ${{ matrix.build-dir }}
          tns build android --env.uglify
For more information please visit https://help.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idstrategy
Since Oct. 2021, "Reusable workflows are generally available"
Reusable workflows are now generally available.
Reusable workflows help you reduce duplication by enabling you to reuse an entire workflow as if it were an action. A number of improvements have been made since the beta was released in October:
You can utilize outputs to pass data from reusable workflows to other jobs in the caller workflow
You can pass environment secrets to reusable workflows
The audit log includes information about which reusable workflows are used
See "Reusing workflows" for more.
A workflow that uses another workflow is referred to as a "caller" workflow.
The reusable workflow is a "called" workflow.
One caller workflow can use multiple called workflows.
Each called workflow is referenced in a single line.
The result is that the caller workflow file may contain just a few lines of YAML, but may perform a large number of tasks when it's run. When you reuse a workflow, the entire called workflow is used, just as if it was part of the caller workflow.
Example:
In the reusable workflow, use the inputs and secrets keywords to define inputs or secrets that will be passed from a caller workflow.
# .github/workflows/my-workflow.yml (a reusable workflow must live under .github/workflows/)
# Note the special trigger 'on: workflow_call:'
on:
  workflow_call:
    inputs:
      username:
        required: true
        type: string
    secrets:
      envPAT:
        required: true
Reference the input or secret in the reusable workflow.
jobs:
  reusable_workflow_job:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: ./.github/actions/my-action
        with:
          username: ${{ inputs.username }}
          token: ${{ secrets.envPAT }}
Here ./.github/actions/my-action points to a local action (its own action.yml) in your own repository.
A reusable workflow does not have to be in the same repository, and can be in another public one.
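For instance, a caller workflow references the called workflow at the job level (the repository, ref and input value here are placeholders):
jobs:
  call-my-workflow:
    uses: your-org/your-repo/.github/workflows/my-workflow.yml@main
    with:
      username: mona
    secrets:
      envPAT: ${{ secrets.envPAT }}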
Davide Benvegnù aka CoderDave illustrates that in "Avoid Duplication! GitHub Actions Reusable Workflows" where:
n3wt0n/ActionsTest/.github/workflows/reusableWorkflowsUser.yml references
n3wt0n/ReusableWorkflow/.github/workflows/buildAndPublishDockerImage.yml@main

Travis CI: How to conditionally run provider deployment jobs?

I have a travis script deploying to different S3 buckets based on 2 conditions:
1. the branch name
2. the $TRAVIS_BRANCH env variable
... travis stuff
deploy:
  - provider: s3
    ... other config
    bucket: my-staging-bucket
    on:
      repo: MyOrg/my-repo
      branch: staging
      condition: $TRAVIS_BRANCH = staging
  - provider: s3
    ... other config
    bucket: my-prod-bucket
    on:
      repo: MyOrg/my-repo
      branch: production
      condition: $TRAVIS_BRANCH = production
It's working as expected:
When I deploy to staging, the first config successfully builds and deploys and I'm given appropriate messaging in Travis' job log.
It also tries to deploy to production and is stopped by the on: conditions, again providing messaging that indicates as much. The resulting log messages look like so, the first two lines indicating a successful deployment to staging and no deployment to production.
-Preparing deploy
-Deploying application
-Skipping a deployment with the s3 provider because a custom condition was not met
This is consistent when the situation is reversed:
-Skipping a deployment with the s3 provider because this branch is not permitted: production
-Skipping a deployment with the s3 provider because a custom condition was not met
...
-Preparing deploy
-Deploying application
This has led to some confusion amongst the team, as the messaging appears to be a false negative, indicating the deployment failed when it's actually functioning as intended. What I would like to do is set up Travis so that it only runs the deployment script appropriate for that branch and env variable combo.
Is there a way to do that? I was under the impression this was the method for conditional deployment.
If there's no way to prevent both deploy jobs from running, is there a way to at least suppress the messaging in the job log?
The best way to do this would be to use Travis' stages and jobs features. Stages are groups of jobs. Jobs inside stages run in parallel. Stages run in sequence, one after the other. Entire stages can be conditional, and stages can also contain conditional jobs. Jobs in a stage can be deploy jobs too (i.e. the entire deploy: in your travis.yml can be nested inside a conditional stage). Most importantly for your goals, conditional stages and their included jobs are silently skipped if the condition is not met.
This is very different from the standard deploy: matrix that you already have, i.e. your current deploy step contains two deployments, and so you get the message that it is skipping a deployment.
Instead, you can change that into separate deploy stages with conditional jobs.
The downside to using stages like this is that each stage runs in its own VM and so you can't share data from one stage to the next. (i.e build artifacts from previous stages do not propagate to subsequent stages). You can get around this by sharing the build results of a lengthy compile stage via S3, for example.
More information can be found here:
https://docs.travis-ci.com/user/build-stages
I have a working example here in my github: https://github.com/brianonn/travis-test
jobs:
  include:
    - stage: compile
      script: bash scripts/compile.sh
    - stage: test
      script: bash scripts/test.sh
    - stage: deploy-staging
      if: branch = staging
      name: "Deploy to staging S3"
      script: skip
      deploy:
        provider: script
        script: bash scripts/deploy.sh staging
        on:
          branch: staging
          condition: $TRAVIS_BRANCH = staging
    - stage: deploy-prod
      if: branch = production
      name: "Deploy to production S3"
      script: skip
      deploy:
        provider: script
        script: bash scripts/deploy.sh production
        on:
          branch: production
          condition: $TRAVIS_BRANCH = production
This produces a Travis job log that is specific to each of staging and production.

Is it possible to build a docker image without pushing it?

I want to build a docker image in my pipeline and then run a job inside it, without pushing or pulling the image.
Is this possible?
It's by design that you can't pass artifacts between jobs in a pipeline without using some kind of external resource to store them. However, you can pass artifacts between tasks in a single job. Also, you specify images at a per-task level rather than a per-job level. Ergo, the simplest way to do what you want may be to have a single job with a first task that generates the Docker image and a second task which consumes it as the container image.
In your case, you would build the Docker image in the build task and use docker export to export the image's filesystem to a rootfs, which you can put into the output (my-task-image). Keep in mind the particular schema that the rootfs output needs to match: you will need rootfs/... (the extracted 'docker export') and a metadata.json, which can just contain an empty JSON object. You can look at the in script within the docker-image-resource for more information on how to match the schema: https://github.com/concourse/docker-image-resource/blob/master/assets/in. Then, in the subsequent task, you can add the image parameter in your pipeline YAML as such:
- task: use-task-image
  image: my-task-image
  file: my-project/ci/tasks/my-task.yml
in order to use the built image in the task.
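As a rough sketch of what the first (build) task's script could do (this assumes a Docker daemon is available inside the task container, e.g. a privileged docker-in-docker style task, and my-app is just a placeholder image name):
# build the image inside the build task (a Docker daemon must already be running)
docker build -t my-app .

# export the image's filesystem into the task output in the expected schema
mkdir -p my-task-image/rootfs
docker export "$(docker create my-app)" | tar -x -C my-task-image/rootfs
echo '{}' > my-task-image/metadata.json
The my-task-image output can then be handed to the second task as its image, as shown above.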
UPDATE: the PR was rejected
This answer doesn't currently work, as the "dry_run" PR was rejected. See https://github.com/concourse/docker-image-resource/pull/185
I will update here if I find an approach which does work.
The "dry_run" parameter which was added to the docker resource in Oct 2017 now allows this (github pr)
You need to add a dummy docker resource like:
resources:
- name: dummy-docker-image
  type: docker-image
  icon: docker
  source:
    repository: example.com
    tag: latest

- name: my-source
  type: git
  source:
    uri: git@github.com:me/my-source.git
Then add a build step which pushes to that docker resource but with "dry_run" set so that nothing actually gets pushed:
jobs:
- name: My Job
  plan:
  - get: my-source
    trigger: true
  - put: dummy-docker-image
    params:
      dry_run: true
      build: path/to/build/scope
      dockerfile: path/to/build/scope/path/to/Dockerfile