Buildah vs Kaniko - argo-workflows

I'm using Argo Workflows to automate our CI/CD chains.
To build images and push them to our private registry, we have to choose between buildah and kaniko, but I can't put my finger on the main differences between the two: their pros and cons, and how each tool handles parallel builds and cache management. Can anyone clarify these points, or suggest another tool that does the job more simply?
Some clarifications on the subject would be really helpful.
Thanks in advance.

Buildah requires either a privileged container with more than one UID, or a container running with CAP_SETUID and CAP_SETGID, to build container images.
It does not hack the file system the way kaniko does to get around these requirements; it runs full containers when building.
Passing --isolation chroot makes it a little easier to get buildah working within Kubernetes.

Kaniko is very simple to set up and has some magic that lets it work with no special requirements in Kubernetes :)
I also tried buildah, but I was unable to configure it and found it too complex to set up in a Kubernetes environment.
You can use an internal Docker registry as the cache for kaniko; local storage can be configured instead (I have not tried that yet). Use kaniko v1.7.0 or later (the latest version at the time of writing), which fixes an important bug in the management of cached layers.
These are some functions (declared in the file ci/libkaniko.sh) that I use in my GitLab CI pipelines, executed by a GitLab Kubernetes runner. They should hopefully clarify the setup and usage of kaniko.
function kaniko_config
{
    local docker_auth="$(echo -n "$CI_REGISTRY_USER:$CI_REGISTRY_PASSWORD" | base64)"
    mkdir -p $DOCKER_CONFIG
    [ -e $DOCKER_CONFIG/config.json ] || \
    cat <<JSON > $DOCKER_CONFIG/config.json
{
    "auths": {
        "$CI_REGISTRY": {
            "auth": "$docker_auth"
        }
    }
}
JSON
}
# Usage example (.gitlab-ci.yml)
#
# build php:
#   extends: .build
#   variables:
#     DOCKER_CONFIG: "$CI_PROJECT_DIR/php/.docker"
#     DOCKER_IMAGE_PHP_DEVEL_BRANCH: &php-devel-image "${CI_REGISTRY_IMAGE}/php:${CI_COMMIT_REF_SLUG}-build"
#   script:
#     - kaniko_build
#       --destination $DOCKER_IMAGE_PHP_DEVEL_BRANCH
#       --dockerfile $CI_PROJECT_DIR/docker/images/php/Dockerfile
#       --target devel
function kaniko_build
{
    kaniko_config
    echo "Kaniko cache enabled ($CI_REGISTRY_IMAGE/cache)"
    /kaniko/executor \
        --build-arg http_proxy="${HTTP_PROXY}" \
        --build-arg https_proxy="${HTTPS_PROXY}" \
        --build-arg no_proxy="${NO_PROXY}" \
        --cache --cache-repo $CI_REGISTRY_IMAGE/cache \
        --context "$CI_PROJECT_DIR" \
        --digest-file=/dev/termination-log \
        --label "ci.job.id=${CI_JOB_ID}" \
        --label "ci.pipeline.id=${CI_PIPELINE_ID}" \
        --verbosity info \
        "$@"    # forward the caller's flags (--destination, --dockerfile, ...)
    [ -r /dev/termination-log ] && \
        echo "Manifest digest: $(cat /dev/termination-log)"
}
With these functions a new image can be built with:
stages:
  - build

build app:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:v1.7.0-debug
    entrypoint: [""]
  variables:
    DOCKER_CONFIG: "$CI_PROJECT_DIR/app/.docker"
    DOCKER_IMAGE_APP_RELEASE_BRANCH: &app-devel-image "${CI_REGISTRY_IMAGE}/phelps:${CI_COMMIT_REF_SLUG}"
    GIT_SUBMODULE_STRATEGY: recursive
  before_script:
    - source ci/libkaniko.sh
  script:
    - kaniko_build
      --destination $DOCKER_IMAGE_APP_RELEASE_BRANCH
      --digest-file $CI_PROJECT_DIR/docker-content-digest-app
      --dockerfile $CI_PROJECT_DIR/docker/Dockerfile
  artifacts:
    paths:
      - docker-content-digest-app
  tags:
    - k8s-runner
Note that you have to use the debug version of kaniko executor because this image tag provides a shell (and other busybox based binaries).
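For readers who generate the registry credentials outside the kaniko image, here is a minimal Python sketch of the same config.json that kaniko_config writes. The function name docker_auth_config and the sample registry, user and password are made up for illustration. Building the document with json.dumps avoids hand-escaping special characters in the values, and the token is emitted on a single line.

```python
import base64
import json


def docker_auth_config(registry: str, user: str, password: str) -> str:
    """Return the text of a Docker config.json with one basic-auth entry."""
    # Same encoding as `echo -n "user:password" | base64` in kaniko_config.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return json.dumps({"auths": {registry: {"auth": token}}}, indent=2)


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(docker_auth_config("registry.example.com", "ci-user", "s3cret"))
```

The output would be written to $DOCKER_CONFIG/config.json before /kaniko/executor runs.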

Related

How to decide Quarkus application arguments in Kubernetes at run-time?

I've built a Quarkus 2.7.1 console application using picocli that includes several subcommands. I'd like to be able to run this application within a Kubernetes cluster and decide its arguments at run-time. This is so that I can use the same container image to run the application in different modes within the cluster.
To get things started I added the JIB extension and tried setting the arguments using a configuration value quarkus.jib.jvm-arguments. Unfortunately it seems like this configuration value is locked at build-time so I'm unable to update this at run-time.
Next I tried setting quarkus.args while using default settings for JIB. The documentation for this configuration value makes it sound general enough for the job, but it doesn't seem to have any effect when the application is run in the container. Since most references to this configuration value in the documentation are in the context of Dev Mode, I'm wondering whether it is disabled outside of that.
How can I get this application running in a container image with its arguments decided at run-time?

You can set quarkus.jib.jvm-entrypoint to any container entrypoint command you want, including scripts. An example in the doc is quarkus.jib.jvm-entrypoint=/deployments/run-java.sh. You could make use of $CLI_ARGUMENTS in such a script. Even something like quarkus.jib.jvm-entrypoint=/bin/sh,-c,'/deployments/run-java.sh $CLI_ARGUMENTS' should work too, as long as you place the script run-java.sh at /deployments in the image. The possibilities are endless.
Also see this SO answer if there's an issue. (The OP in the link put a custom script at src/main/jib/docker/run-java.sh, where src/main/jib is Jib's default "extra files directory", so that Jib places the script in the image at /docker/run-java.sh.)

I was able to find a solution to the problem with a bit of experimenting this morning.
With the quarkus-container-image-docker extension (instead of quarkus.jib.jvm-arguments) I was able to take the template Dockerfile.jvm and extend it to pass through arguments to the CLI. The only line that needed changing was the ENTRYPOINT (details included in the snippet below). I changed the ENTRYPOINT form (from exec to shell) and added an environment variable as an argument to pass-through program arguments.
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.3
ARG JAVA_PACKAGE=java-11-openjdk-headless
ARG RUN_JAVA_VERSION=1.3.8
ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en'
# Install java and the run-java script
# Also set up permissions for user `1001`
RUN microdnf install curl ca-certificates ${JAVA_PACKAGE} \
    && microdnf update \
    && microdnf clean all \
    && mkdir /deployments \
    && chown 1001 /deployments \
    && chmod "g+rwX" /deployments \
    && chown 1001:root /deployments \
    && curl https://repo1.maven.org/maven2/io/fabric8/run-java-sh/${RUN_JAVA_VERSION}/run-java-sh-${RUN_JAVA_VERSION}-sh.sh -o /deployments/run-java.sh \
    && chown 1001 /deployments/run-java.sh \
    && chmod 540 /deployments/run-java.sh \
    && echo "securerandom.source=file:/dev/urandom" >> /etc/alternatives/jre/lib/security/java.security
# Configure the JAVA_OPTIONS, you can add -XshowSettings:vm to also display the heap size.
ENV JAVA_OPTIONS="-Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager"
# We make four distinct layers so if there are application changes the library layers can be re-used
COPY --chown=1001 target/quarkus-app/lib/ /deployments/lib/
COPY --chown=1001 target/quarkus-app/*.jar /deployments/
COPY --chown=1001 target/quarkus-app/app/ /deployments/app/
COPY --chown=1001 target/quarkus-app/quarkus/ /deployments/quarkus/
EXPOSE 8080
USER 1001
# [== BEFORE ==]
# ENTRYPOINT [ "/deployments/run-java.sh" ]
# [== AFTER ==]
ENTRYPOINT "/deployments/run-java.sh" $CLI_ARGUMENTS
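One caveat with the shell-form ENTRYPOINT: the unquoted $CLI_ARGUMENTS is split on whitespace by /bin/sh, and quote characters inside the value are not interpreted. The Python sketch below only approximates that splitting (default IFS, no glob expansion) to show what the application would receive. The function name and the sample subcommands are hypothetical.

```python
import re
from typing import List


def shell_word_split(value: str) -> List[str]:
    """Approximate how /bin/sh splits an unquoted $VAR with the default IFS."""
    # Default IFS is space, tab and newline; runs of them act as one separator.
    # Quotes inside the value stay literal. Glob expansion is ignored here.
    return [word for word in re.split(r"[ \t\n]+", value) if word]


if __name__ == "__main__":
    # Simple case: each word becomes one program argument.
    print(shell_word_split("report --verbose"))
    # Embedded quotes do not group words; they end up inside the arguments.
    print(shell_word_split('import --file "my data.csv"'))
```

So an argument that contains spaces cannot be passed through a single CLI_ARGUMENTS variable this way; it would need a wrapper script that parses the value itself.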

I tried the above approaches, but they didn't work with the ubi8/openjdk-17-runtime base image that Quarkus JIB uses by default. This is because this base image uses /home/jboss as its WORKDIR rather than /work.
Therefore, I created a custom start-up script and referenced it in the properties file as follows. This approach works better if there's a need to set application parameters using environment variables:
File: application.properties
quarkus.jib.jvm-entrypoint=/bin/sh,run-java.sh
File: src/main/jib/home/jboss/run-java.sh
java \
    -Djavax.net.ssl.trustStore=/deployments/truststore \
    -Djavax.net.ssl.trustStorePassword="$TRUST_STORE_PASSWORD" \
    -jar quarkus-run.jar

Set Variables for Airflow during Helm install

I'm installing Airflow on kind with the following command:
export RELEASE_NAME=first-release
export NAMESPACE=airflow
helm install $RELEASE_NAME apache-airflow/airflow --namespace $NAMESPACE \
    --set images.airflow.repository=my-dags \
    --set images.airflow.tag=0.0.1 \
    --values env.yaml
And the file env.yaml looks like this:
env:
  - name: "AIRFLOW_VAR_KEY"
    value: "value_1"
But in the Web UI (under Admin --> Variables), these credentials don't appear.
How do I pass these credentials during helm install? Thanks!
UPDATE: It turns out that the environment variable was set successfully. However, it doesn't show up in the Web UI.

I'm not sure what your full env.yaml file looks like, but this is how the chart's values file documents setting environment variables for Airflow:
## environment variables for the web/scheduler/worker Pods (for airflow configs)
##
## WARNING:
## - don't include sensitive variables in here, instead make use of `airflow.extraEnv` with Secrets
## - don't specify `AIRFLOW__CORE__SQL_ALCHEMY_CONN`, `AIRFLOW__CELERY__RESULT_BACKEND`,
## or `AIRFLOW__CELERY__BROKER_URL`, they are dynamically created from chart values
##
## NOTE:
## - airflow allows environment configs to be set as environment variables
## - they take the form: AIRFLOW__<section>__<key>
## - see the Airflow documentation: https://airflow.apache.org/docs/stable/howto/set-config.html
##
## EXAMPLE:
##   config:
##     ## Security
##     AIRFLOW__CORE__SECURE_MODE: "True"
##     AIRFLOW__API__AUTH_BACKEND: "airflow.api.auth.backend.deny_all"
Reference file
After that, run your install command again, and your DAGs will be able to access the variables.
Helm documentation: https://github.com/helm/charts/tree/master/stable/airflow#docs-airflow---configs
Make sure you are configuring the airflow.config section.

Okay, so I've figured this one out: the environment variables are set just fine on the pods. However, they will not appear in the Web UI.
Workaround: to make them appear in the Web UI, I have to go into the scheduler pod and import the variables. This can be done with a bash script.
# Get the name of scheduler pod
export SCHEDULER_POD_NAME="$(kubectl get pods --no-headers -o custom-columns=":metadata.name" -n airflow | grep scheduler)"
# Copy variables to the scheduler pod
kubectl cp ./variables.json airflow/$SCHEDULER_POD_NAME:./
# Import variables to scheduler with airflow command
kubectl -n $NAMESPACE exec $SCHEDULER_POD_NAME -- airflow variables import variables.json
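The variables.json file used above is a flat JSON object mapping variable names to values. As a rough sketch, it could be generated from the same AIRFLOW_VAR_* environment variables rather than maintained by hand. The helper name collect_airflow_vars is hypothetical, and lower-casing the key is an assumption: use whatever case your DAGs pass to Variable.get().

```python
import json
import os
from typing import Dict, Mapping

PREFIX = "AIRFLOW_VAR_"


def collect_airflow_vars(environ: Mapping[str, str]) -> Dict[str, str]:
    """Map AIRFLOW_VAR_<NAME> entries to {<name>: value}."""
    # Lower-casing the key is an assumption, not an Airflow requirement.
    return {
        name[len(PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }


if __name__ == "__main__":
    # Print the JSON document; redirect it into variables.json before importing.
    print(json.dumps(collect_airflow_vars(os.environ), indent=2))
```

Running this inside a pod that has the environment variables set and redirecting the output to variables.json would give a file suitable for airflow variables import.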

DevSpace hook for running tests in container after an update to the container

My ultimate goal is to have tests run automatically any time a container is updated. For example, if I update /api, the changes should sync between my local machine and the container, and then the tests should run automatically.
I'm starting out with Hello World! though per the example:
# DevSpace --version = 5.16.0
version: v1beta11
...
hooks:
  - command: |
      echo Hello World!
    container:
      imageSelector: ${APP-NAME}/${API-DEV}
    events: ["after:initialSync:${API}"]
...
I've tried all of the following and don't get the desired behavior:
stop:sync:${API}
restart:sync:${name}
after:initialSync:${API}
devCommand:after:sync
At best I can just get Hello World! to print on the initial run of devspace dev -b, but nothing after I make changes to the files for /api which causes files to sync.
Suggestions?

You will need a post-sync hook for this, which is separate from the DevSpace lifecycle hooks. You define it directly under dev.sync, and it looks like this:
dev:
  sync:
  - imageSelector: john/devbackend
    onUpload:
      execRemote:
        onBatch:
          command: bash
          args:
          - -c
          - "echo 'Hello World!' && other commands..."
More information in the docs: https://devspace.sh/cli/docs/configuration/development/file-synchronization#onupload

Is there a way to automatically create a container when starting Azurite?

For test purposes I create and run an Azurite docker image, in a test pipeline.
I would like the blob container to be created automatically once Azurite has started, as it would simplify things.
Is there any good way to achieve this?
For the Postgres image we use, we can specify an init.sql which is run on startup. If something similar is available for Azurite, that would be awesome.

You can use the following Dockerfile to install the azure-storage-blob Python package on the Alpine based azurite image. The resulting image size is ~400MB compared to the ~1.2GB azure-cli image.
ARG AZURITE_VERSION="3.17.0"
FROM mcr.microsoft.com/azure-storage/azurite:${AZURITE_VERSION}
# Install azure-storage-blob python package
RUN apk update && \
    apk --no-cache add py3-pip && \
    apk add --virtual=build gcc libffi-dev musl-dev python3-dev && \
    pip3 install --upgrade pip && \
    pip3 install azure-storage-blob==12.12.0
# Copy init_azurite.py script
COPY ./init_azurite.py init_azurite.py
# Copy local blobs to azurite
COPY ./init_containers init_containers
# Run the blob emulator and initialize the blob containers
CMD python3 init_azurite.py --directory=init_containers & \
    azurite-blob --blobHost 0.0.0.0 --blobPort 10000
The init_azurite.py script is a local Python script that uses the azure-storage-blob package to batch upload files and directories to the azurite blob storage emulator.
import argparse
import os
from time import sleep

from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient, ContainerClient


def upload_file(container_client: ContainerClient, source: str, dest: str) -> None:
    """
    Upload a single file to a path inside the container.
    """
    print(f"Uploading {source} to {dest}")
    with open(source, "rb") as data:
        try:
            container_client.upload_blob(name=dest, data=data)
        except ResourceExistsError:
            pass


def upload_dir(container_client: ContainerClient, source: str, dest: str) -> None:
    """
    Upload a directory to a path inside the container.
    """
    prefix = "" if dest == "" else dest + "/"
    prefix += os.path.basename(source) + "/"
    for root, dirs, files in os.walk(source):
        for name in files:
            dir_part = os.path.relpath(root, source)
            dir_part = "" if dir_part == "." else dir_part + "/"
            file_path = os.path.join(root, name)
            blob_path = prefix + dir_part + name
            upload_file(container_client, file_path, blob_path)


def init_containers(
    service_client: BlobServiceClient, containers_directory: str
) -> None:
    """
    Iterate on the containers directory and do the following:
    1- create the container.
    2- upload all folders and files to the container.
    """
    for container_name in os.listdir(containers_directory):
        container_path = os.path.join(containers_directory, container_name)
        if os.path.isdir(container_path):
            container_client = service_client.get_container_client(container_name)
            try:
                container_client.create_container()
            except ResourceExistsError:
                pass
            for blob in os.listdir(container_path):
                blob_path = os.path.join(container_path, blob)
                if os.path.isdir(blob_path):
                    upload_dir(container_client, blob_path, "")
                else:
                    upload_file(container_client, blob_path, blob)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Initialize azurite emulator containers."
    )
    parser.add_argument(
        "--directory",
        required=True,
        help="""
        Directory that contains subdirectories named after the
        containers that we should create. Each subdirectory will contain the files
        and directories of its container.
        """
    )
    args = parser.parse_args()
    # Connect to the localhost emulator (after 5 secs to make sure it's up).
    sleep(5)
    blob_service_client = BlobServiceClient(
        account_url="http://localhost:10000/devstoreaccount1",
        credential={
            "account_name": "devstoreaccount1",
            "account_key": (
                "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq"
                "/K1SZFPTOtr/KBHBeksoGMGw=="
            )
        }
    )
    # Only initialize if not already initialized.
    if next(blob_service_client.list_containers(), None):
        print("Emulator already has containers, will skip initialization.")
    else:
        init_containers(blob_service_client, args.directory)
This script will be copied to the azurite container and will populate the initial blob containers every time the azurite container is started unless some containers were already persisted using docker volumes. In that case, nothing will happen.
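The script waits a fixed five seconds before connecting, which may be too short on a slow CI runner and is wasted time on a fast one. A possible alternative, sketched here with only the standard library, is to poll the blob port until it accepts TCP connections. The helper name wait_for_port and its default timings are assumptions, not part of the original script.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # 10000 is the blob port passed to azurite-blob in the Dockerfile above.
    print("ready" if wait_for_port("localhost", 10000) else "timed out")
```

In init_azurite.py this check would take the place of sleep(5), with the script exiting non-zero if it returns False. An open port does not prove the emulator is fully initialized, so the container creation may still deserve a retry.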
Following is an example docker-compose.yml file:
azurite:
  build:
    context: ./
    dockerfile: Dockerfile
    args:
      AZURITE_VERSION: 3.17.0
  restart: on-failure
  ports:
    - 10000:10000
  volumes:
    - azurite-data:/opt/azurite
volumes:
  azurite-data:
Using such volumes will persist the emulator data until you destroy them (e.g. by using docker-compose down -v).
Finally, init_containers is a local directory that contains the containers and their folders/files. It will be copied to the azurite container when the image is built.
For example:
init_containers:
    container-name-1:
        dir-1:
            file.txt
            img.png
        dir-2:
            file.txt
    container-name-2:
        dir-1:
            file.txt
            img.png

I've solved the issue by creating a custom docker image and executing azure-cli tools from a health check. There could certainly be better solutions, and I will update the accepted answer if someone posts a better solution.
In more detail
A solution to create the required data on startup is to run my own script. I chose to trigger the script from a health check I defined in docker-compose. What it does is use azure cli tools to create a container and then verify that it exists.
The script:
AZURE_STORAGE_CONNECTION_STRING="UseDevelopmentStorage=true"
export AZURE_STORAGE_CONNECTION_STRING
az storage container create -n images
az storage container show -n images
exit $?
However, the azurite image is based on alpine, which doesn't have apt, so installing azure cli was a bit tricky. So I did it the other way around, and based my image on mcr.microsoft.com/azure-cli:latest. With that done I installed Azurite like this:
RUN apk add npm
RUN npm install -g azurite --silent
All that's left is to actually run azurite, see the official azurite dockerfile for details.
It is possible to do this without azure-cli by using curl instead (and so avoid the azure-cli Docker image). However, getting the authentication header to work properly was a bit complicated, so using azure-cli was easier.

How to force Devel::Cover to ignore a folder when using perl-helpers via Travis CI

The MetaCPAN Travis CI coverage builds are quite slow. See https://travis-ci.org/metacpan/metacpan-web/builds/238884497. This is likely in part because we're not successfully ignoring the /local folder that gets created by Carton as part of our build. See https://coveralls.io/builds/11809290
We're using perl-helpers to help with our Travis configuration. I thought I should be able to use the DEVEL_COVER_OPTIONS environment variable in order to fix this, but I guess I don't have the correct incantation. I've included the entire config below because a few snippets out of context seemed misleading.
language: perl
perl:
  - "5.22"
matrix:
  fast_finish: true
  allow_failures:
    - env: COVERAGE=1 USE_CPANFILE_SNAPSHOT=true
    - env: USE_CPANFILE_SNAPSHOT=false HARNESS_VERBOSE=1
env:
  global:
    # Carton --deployment only works on the same version of perl
    # that the snapshot was built from.
    - DEPLOYMENT_PERL_VERSION=5.22
    - DEVEL_COVER_OPTIONS="-ignore ^local/"
  matrix:
    # Get one passing run with coverage and one passing run with Test::Vars
    # checks. If run together they more than double the build time.
    - COVERAGE=1 USE_CPANFILE_SNAPSHOT=true
    - USE_CPANFILE_SNAPSHOT=false HARNESS_VERBOSE=1
    - USE_CPANFILE_SNAPSHOT=true
before_install:
  - git clone git://github.com/travis-perl/helpers ~/travis-perl-helpers
  - source ~/travis-perl-helpers/init
  - npm install -g less js-beautify
  # Pre-install from backpan to avoid upgrade breakage.
  - cpanm -n http://cpan.metacpan.org/authors/id/M/ML/MLEHMANN/common-sense-3.6.tar.gz
  - cpanm -n App::cpm Carton
install:
  - cpan-install --coverage # installs coverage prereqs, if enabled
  - 'cpm install `test "${USE_CPANFILE_SNAPSHOT}" = "false" && echo " --resolver metadb" || echo " --resolver snapshot"`'
before_script:
  - coverage-setup
script:
  # Devel::Cover isn't in the cpanfile
  # but if it's installed into the global dirs this should work.
  - carton exec prove -lr -j$(test-jobs) t
after_success:
  - coverage-report
notifications:
  email:
    recipients:
      - olaf@seekrit.com
    on_success: change
    on_failure: always
  irc: "irc.perl.org#metacpan-travis"
# Use newer travis infrastructure.
sudo: false
cache:
  directories:
    - local

The syntax for Devel::Cover options on the command line is weird: the options and their values have to be comma-separated, at least when you pass them through PERL5OPT.
DEVEL_COVER_OPTIONS="-ignore,^local/"
See for example https://github.com/simbabque/AWS-S3/blob/master/.travis.yml#L26, where it's a whole lot of stuff with commas.
PERL5OPT=-MDevel::Cover=-ignore,"t/",+ignore,"prove",-coverage,statement,branch,condition,path,subroutine prove -lrs t
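The commas come from perl's -M switch, where -MModule=a,b is shorthand for importing Module with the list (a, b). A small Python sketch of assembling such a switch from a list of options follows. The function name devel_cover_switch is made up for illustration, and whether perl-helpers builds its PERL5OPT exactly this way is an assumption.

```python
from typing import Sequence


def devel_cover_switch(options: Sequence[str]) -> str:
    """Join Devel::Cover import options into a -MDevel::Cover=... switch."""
    if not options:
        return "-MDevel::Cover"
    # Each option and each of its values becomes one comma-separated item.
    return "-MDevel::Cover=" + ",".join(options)


if __name__ == "__main__":
    # The space-separated form from the question, split into items first.
    print(devel_cover_switch("-ignore ^local/".split()))
```

The result would be used as the value of PERL5OPT, in the same form as the working example above.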