Airflow KubernetesPodOperator image pull policy pull if different SHA digest - kubernetes

Currently using KubernetesPodOperator with the default image pull policy (IfNotPresent). We will be using static tag IDs for the different environments: in the dev environment the tag will be dev, in the qa environment the tag will be qa, and so on. The issue is when there is actually a new version (different SHA digest) of the image but the same tag ID. I can change the image pull policy to Always, but then it will download every time. The Airflow DAG contains several tasks that use KubernetesPodOperator with the same image, and I don't want the image downloaded anew for each task run.
Is there an image pull policy that checks the SHA digest (instead of the tag ID) and only downloads the image if that digest isn't already present?

In the container's image definition you can pin the SHA digest to be sure it pulls/uses the correct one. For example:
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
spec:
  containers:
    - name: ubuntu
      image: ubuntu@sha256:bc2f7250f69267c9c6b66d7b6a81a54d3878bb85f1ebb5f951c896d13e6ba537
      command: [ "/bin/bash", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
or for the Airflow KubernetesPodOperator you can use:
example = KubernetesPodOperator(
    image="ubuntu@sha256:bc2f7250f69267c9c6b66d7b6a81a54d3878bb85f1ebb5f951c896d13e6ba537",
    task_id="example_task",
    ...
)
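A convenience note, not from the original answer: assuming the tagged image has already been pulled from the registry, the Docker CLI can show the digest to pin:
$ docker images --digests ubuntu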

Related

Use one Kustomize patch to set environment variables for a deployment and a cron job

Is there an easy way to share a set of environment variables (coming in from various ConfigMaps and Secrets) between the same container in a Deployment and a CronJob?
I'm using Kustomize, but I can't figure out how to approach this, since the patch itself would differ slightly depending on whether it targets a Deployment or a CronJob.
For example, in my deployment I have something like this:
spec:
  containers:
    - name: my_app
      env:
        - name: SOME_SECRET
          valueFrom:
            secretKeyRef:
              name: my_secret
              key: my_key
      envFrom:
        - configMapRef:
            name: some_config_map
I'd like to also have this applied to a CronJob, but since the YAML schema is different between a Deployment and a CronJob, I'm not sure if this is something that is possible/supported.
Thanks!
Don't know if it can be done with Kustomize or not...
But have you considered using Helm instead? You can have dedicated templates for the Deployment and the CronJob and use the same values for both of their container env fields...
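For illustration, a minimal sketch of that approach (chart and template names here are hypothetical): define the env block once as a named template in _helpers.tpl and include it from both the Deployment and the CronJob templates.
{{/* _helpers.tpl */}}
{{- define "my_app.env" -}}
env:
  - name: SOME_SECRET
    valueFrom:
      secretKeyRef:
        name: my_secret
        key: my_key
envFrom:
  - configMapRef:
      name: some_config_map
{{- end -}}
Then, inside the container spec of both templates (adjust the nindent value to the actual nesting depth):
containers:
  - name: my_app
    image: my_app:latest
    {{- include "my_app.env" . | nindent 4 }}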

Jenkins - Kubernetes - Consolidate Agent Configuration

We are running Jenkins in Kubernetes via the official Helm chart.
Every pipeline has the same agent definition in place.
pipeline {
  agent {
    kubernetes {
      inheritFrom 'default'
      yamlFile 'automation/Jenkins/KubernetesPod.yaml'
    }
  }
The KubernetesPod.yaml looks like this.
metadata:
  labels:
    job-name: cicd_application
spec:
  containers:
    - name: operations
      image: xxxxx.dkr.ecr.us-west-1.amazonaws.com/operations:0.1.3
      command:
        - sleep
      args:
        - 99d
This works fine. Our job DSL looks like this and everything just works.
steps {
  container('operations') {
The problem comes in when that operations container bumps from 0.1.3 to 0.1.4: I now have to create a merge request against 40 pipelines.
Is there a way to:
Pull this file in from another repo?
Define and refer to this in JCasC? (sketched below)
Ideally, when we bump the image (it's things like TF, Ansible, etc.) we can just do it all at once.
Thanks.
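Not from the original thread, but as a sketch of the JCasC route (assuming the Kubernetes plugin's configuration-as-code support; the template name and label here are hypothetical), you could define the pod template once in the controller configuration and have every pipeline inherit from it:
jenkins:
  clouds:
    - kubernetes:
        name: "kubernetes"
        templates:
          - name: "operations"
            label: "operations"
            containers:
              - name: "operations"
                image: "xxxxx.dkr.ecr.us-west-1.amazonaws.com/operations:0.1.3"
                command: "sleep"
                args: "99d"
Pipelines would then use inheritFrom 'operations', so bumping the image tag becomes a single change in the JCasC YAML (with the official Helm chart, typically under the controller.JCasC values).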

Kubernetes Podspec to download only container image but it should not install

I want to download the container image, but I don't want to deploy/install it.
How can I write a PodSpec that only downloads the images but does not create containers?
Any PodSpec snippet for this?
As far as I know there is no direct Kubernetes resource that only downloads an image of your choosing. To have the images of your applications on your Nodes you can consider the following solutions/workarounds:
Use a DaemonSet with initContainers
Use tools like Ansible to pull the images with a playbook
Use a DaemonSet with initContainers
Assuming the following situation:
You've created 2 images that you want to have on all of the Nodes.
You can use a DaemonSet (which spawns a Pod on each Node) with initContainers (one per image) that will run on all Nodes and ensure that the images are present on the machine.
An example of such setup could be following:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pull-images
  labels:
    k8s-app: pull-images
spec:
  # AS THIS DAEMONSET IS NOT SUPPOSED TO SERVE TRAFFIC I WOULD CONSIDER USING THIS UPDATE STRATEGY FOR SPEEDING UP THE DOWNLOAD PROCESS
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 100
  selector:
    matchLabels:
      name: pull-images
  template:
    metadata:
      labels:
        name: pull-images
    spec:
      initContainers:
        # PUT HERE IMAGES THAT YOU WANT TO PULL AND OVERRIDE THEIR ENTRYPOINT
        - name: ubuntu
          image: ubuntu:20.04 # <-- IMAGE #1
          imagePullPolicy: Always # SPECIFY THE POLICY FOR SPECIFIC IMAGE
          command: ["/bin/sh", "-c", "exit 0"]
        - name: nginx
          image: nginx:1.19.10 # <-- IMAGE #2
          imagePullPolicy: IfNotPresent # SPECIFY THE POLICY FOR SPECIFIC IMAGE
          command: ["/bin/sh", "-c", "exit 0"]
      containers:
        # MAIN CONTAINER WITH AS SMALL AS POSSIBLE IMAGE SLEEPING
        - name: alpine
          image: alpine
          command: [sleep]
          args:
            - "infinity"
The Kubernetes DaemonSet controller will ensure that the Pod runs on each Node. The initContainers act as placeholders for the images: each image you want to have on the Node is pulled, and its ENTRYPOINT is overridden so the container exits instead of running continuously. After that, the main container (alpine) runs with a sleep infinity command.
This setup will also work when new Nodes are added.
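Rolling it out is standard kubectl usage (the file name is just an example):
$ kubectl apply -f pull-images.yaml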
Following up on that topic, I would also consider checking the following documentation on imagePullPolicy:
Kubernetes.io: Docs: Concepts: Containers: Images: Updating images
A side note!
I've set the imagePullPolicy differently for each image in the initContainers to show that you can specify the policy independently for each container. Please use the policy that suits your use case the most.
Use tools like Ansible to pull the images with a playbook
Assuming that you have SSH access to the Nodes, you can consider using Ansible with its community module (assuming that you are using Docker):
community.docker.docker_image
Citing the documentation for this module:
This plugin is part of the community.docker collection (version 1.3.0).
To install it use: ansible-galaxy collection install community.docker.
Synopsis
Build, load or pull an image, making the image available for creating containers. Also supports tagging an image into a repository and archiving an image to a .tar file.
-- Docs.ansible.com: Ansible: Collections: Community: Docker: Docker image module
You can use it as in the following example:
hosts.yaml
all:
  hosts:
    node-1:
      ansible_port: 22
      ansible_host: X.Y.Z.Q
    node-2:
      ansible_port: 22
      ansible_host: A.B.C.D
playbook.yaml
- name: Playbook to download images
  hosts: all
  user: ENTER_USER
  tasks:
    - name: Pull an image
      community.docker.docker_image:
        name: "{{ item }}"
        source: pull
      with_items:
        - "nginx"
        - "ubuntu"
A side note!
For the Ansible approach, I needed to install the docker Python package:
$ pip3 install docker
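The playbook can then be run against the inventory above with the standard invocation:
$ ansible-playbook -i hosts.yaml playbook.yaml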
Additional resources:
Kubernetes.io: Docs: Concepts: Workloads: Controllers: Daemonset
Kubernetes.io: Docs: Concepts: Workloads: Pods: initContainers

Is there a way to load a private image using a skaffold config without building it?

I have created a mock.Dockerfile which just contains one line.
FROM eu.gcr.io/some-org/mock-service:0.2.0
With that config and a reference to it in the build section, skaffold builds that Dockerfile using the private GCR registry. However, if I remove that Dockerfile, skaffold does not build the image, and on startup it only loads the images which are referenced in the build section (public images, like postgres, work as well). So in a local Kubernetes setup, like minikube, this results in an
ImagePullBackOff
Failed to pull image "eu.gcr.io/some-org/mock-service:0.2.0": rpc error: code = Unknown desc = Error response from daemon: unauthorized: You don't have the needed permissions to perform this operation, and you may have invalid credentials
So basically, when I create a one-line Dockerfile and include it, skaffold builds that image and loads it into minikube. Now it is possible to change the minikube config so that the request to GCR succeeds, but the goal is that developers don't have to change their minikube config...
Is there any other way to get that image loaded into Minikube, without changing the config and without that one-line Dockerfile?
skaffold.yaml:
apiVersion: skaffold/v2beta8
kind: Config
metadata:
  name: some-service
build:
  artifacts:
    - image: eu.gcr.io/some-org/some-service
      docker:
        dockerfile: Dockerfile
    - image: eu.gcr.io/some-org/mock-service
      docker:
        dockerfile: mock.Dockerfile
  local: { }
profiles:
  - name: mock
    activation:
      - kubeContext: (minikube|kind-.*|k3d-(.*))
    deploy:
      helm:
        releases:
          - name: postgres
            chartPath: test/postgres
          - name: mock-service
            chartPath: test/mock-service
          - name: skaffold-some-service
            chartPath: helm/some-service
            artifactOverrides:
              image: eu.gcr.io/some-org/some-service
            setValues:
              serviceAccount.create: true
Although GKE comes pre-configured to pull from registries within the same project, Kubernetes clusters generally require special configuration at the pod level to pull from private registries. It's a bit involved.
Fortunately minikube introduced a registry-creds add-on that will configure the minikube instance with appropriate credentials to pull images.
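A sketch of enabling it (the add-on prompts interactively for registry credentials, including GCR):
$ minikube addons configure registry-creds
$ minikube addons enable registry-creds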

run kubernetes job in cloud builder

I want to create and remove a job using Google Cloud Build. Here's my configuration, which builds my Docker image and pushes it to GCR.
# cloudbuild.yaml
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/xyz/abc:latest', '-f', 'Dockerfile.ng-unit', '.']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/xyz/abc:latest']
Now I want to create a job. I want to run something like
kubectl create -R -f ./kubernetes
which creates the jobs defined in the kubernetes folder.
I know Cloud Build has a 'gcr.io/cloud-builders/kubectl' builder, but I can't figure out how to use it. Also, how can I authenticate it to run kubectl commands? Can I use service_key.json?
At first I wasn't able to connect and get cluster credentials. Here's what I did:
Go to IAM and add another role to xyz@cloudbuild.gserviceaccount.com. I used Project Editor.
Add this step to cloudbuild.yaml:
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['create', '-R', '-f', './dockertests/unit-tests/kubernetes']
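For context, the gcr.io/cloud-builders/kubectl builder fetches cluster credentials from two environment variables, so the step usually also needs them set (the zone and cluster name below are placeholders):
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['create', '-R', '-f', './dockertests/unit-tests/kubernetes']
  env:
    - 'CLOUDSDK_COMPUTE_ZONE=us-central1-a'
    - 'CLOUDSDK_CONTAINER_CLUSTER=my-cluster'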