What is the SkyWalking agent image address for 7.0.0? - kubernetes

I am using SkyWalking as my APM, and I am now configuring the image address of my SkyWalking agent init container like this:
"initContainers": [
{
"name": "init-agent",
"image": "apache/skywalking-agent:7.0.0",
"command": [
"sh",
"-c",
"set -ex;mkdir -p /skywalking/agent;cp -r /opt/skywalking/agent/* /skywalking/agent;"
],
"resources": {},
"volumeMounts": [
{
"name": "agent",
"mountPath": "/skywalking/agent"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent"
}
],
but it tells me this image address is not correct. Does the SkyWalking agent have a Docker image? What is the Docker image address to use in a Kubernetes v1.16.0 cluster? I have searched the internet and only found a SkyWalking base image.

We (the Apache SkyWalking team) officially provide the Java agent Docker image here: https://github.com/apache/skywalking-docker/tree/master/java-agent#use-this-image-as-sidecar-of-kubernetes-service; its usage is documented there as well.
As for this specific question, you can always pass environment variables to override the default configuration. For the OAP backend address, the env var is SW_AGENT_COLLECTOR_BACKEND_SERVICES, so you can simply set SW_AGENT_COLLECTOR_BACKEND_SERVICES=your-oap-address:11800 to point the agent to your real OAP address.
apiVersion: v1
kind: Pod
metadata:
name: agent-as-sidecar
spec:
restartPolicy: Never
volumes:
- name: skywalking-agent
emptyDir: { }
containers:
- name: agent-container
image: apache/skywalking-java-agent:8.4.0-alpine
volumeMounts:
- name: skywalking-agent
mountPath: /agent
command: [ "/bin/sh" ]
args: [ "-c", "cp -R /skywalking/agent /agent/" ]
- name: app-container
image: springio/gs-spring-boot-docker
volumeMounts:
- name: skywalking-agent
mountPath: /skywalking
env:
- name: JAVA_TOOL_OPTIONS
value: "-javaagent:/skywalking/agent/skywalking-agent.jar"
- name: SW_AGENT_COLLECTOR_BACKEND_SERVICES
value: "your-oap-address-accessible-inside-docker" # <<=== THIS

Finally, I built the sidecar image myself:
wget https://www.apache.org/dyn/closer.cgi/skywalking/7.0.0/apache-skywalking-apm-7.0.0.tar.gz && tar -zxvf apache-skywalking-apm-7.0.0.tar.gz
This is the Dockerfile:
FROM busybox:latest
ENV LANG=C.UTF-8
RUN set -eux && mkdir -p /usr/skywalking/agent/
ADD apache-skywalking-apm-bin/agent/ /usr/skywalking/agent/
WORKDIR /
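For completeness, here is a minimal sketch (not from the original answer) of using that self-built image as an init container, mirroring the spec from the question; the image tag my-registry/skywalking-agent-sidecar:7.0.0 is a placeholder:
initContainers:
- name: init-agent
  # placeholder tag for the image built from the Dockerfile above
  image: my-registry/skywalking-agent-sidecar:7.0.0
  imagePullPolicy: IfNotPresent
  command: ["sh", "-c", "set -ex; cp -r /usr/skywalking/agent/* /skywalking/agent/"]
  volumeMounts:
  - name: agent                      # shared emptyDir volume, also mounted by the app container
    mountPath: /skywalking/agent
The application container then mounts the same volume and adds -javaagent:/skywalking/agent/skywalking-agent.jar, as in the sidecar example above.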

Related

Using k8s to deploy Jenkins with slave Pods performing tasks, how to put files or installation packages generated in the pod into the Jenkins PVC

I deployed Jenkins and run my jobs in slave pods. I use the OpenEBS local PV mode, with Jenkins and the volume deployed on the same node, and I use the volume to pass along the data generated by the pod, which is shared through the Jenkins volume on the host machine. My job downloads the code and adds some installation packages to it, but downloading the packages takes a long time. I would like the slave pod not to re-download them every time it is deployed.
#!/usr/bin/env groovy
// Groovy shared variables
def PROJECT = "CI-code"
def WORKDIR_PATH = "/opt/status"
def DOWNLOAD_KUBE_DOWNLOAD_URL = "xxx/kube-1.19.0-v2.2.0-amd64.tar.gz"
def PVC_PATH = "/var/openebs/local/pvc-8e8f9830-9bdc-494d-ac45-19310cbda035/cloudybase"
pipeline {
agent {
kubernetes {
yaml """
apiVersion: v1
kind: Pod
metadata:
name: jenkins-slave
namespace: devops-tools
spec:
containers:
- name: jnlp
image: "xxx/google_containers/jenkins-slave-jdk11-wget:latest"
imagePullPolicy: Always
securityContext:
privileged: true
runAsUser: 0
volumeMounts:
- name: docker-cmd
mountPath: /usr/bin/docker
- name: docker-sock
mountPath: /var/run/docker.sock
- name: code
mountPath: /home/jenkins/agent/workspace/${PROJECT}
volumes:
- name: docker-cmd
hostPath:
path: /usr/bin/docker
- name: docker-sock
hostPath:
path: /var/run/docker.sock
- name: code
hostPath:
path: ${PVC_PATH}
"""
}
}
stages {
stage('Pull code') {
steps {
git branch: 'release-2.2.0', credentialsId: 'e17ba069-aa8b-4bfd-9c9e-3f3956914f09', url: 'xxx/deployworker.git'
}
}
stage('Download dependency packages to the component package directory') {
steps {
sh """
logging() {
echo -e "\033[32m $(/bin/date)\033[0m" - $#
}
main () {
logging Check ${DOWNLOAD_KUBE_DOWNLOAD_URL} installation package if download ...
if [ ! -f "${PVC_PATH}/kube_status_code" ];then
download_kube
fi
}
download_kube () {
DOWNLOAD_KUBE_NAME=$(echo ${DOWNLOAD_KUBE_DOWNLOAD_URL} | /bin/sed 's|.*/||')
cd / && { /bin/curl -O ${DOWNLOAD_KUBE_DOWNLOAD_URL} ; cd -; }
if [ "$?" -ne "0" ]; then
echo "Failed"
exit 1
fi
echo 'true' > ${W_PATH}/kube_status_code
mkdir -p /home/jenkins/agent/workspace/deploywork/Middleware-choreography/kubeQ/kubeQ/
tar xf /\${DOWNLOAD_KUBE_NAME} -C /home/jenkins/agent/workspace/deploywork/Middleware-choreography/kubeQ/kubeQ/
if [ "$?" -ne "0" ]; then
echo "Failed"
exit 1
fi
}
"""
}
}
}
}
Hi zccharts, from the above explanation I gather that you are creating your container from scratch every time, which is what consumes the time. This can be solved by building a base container image with all the packages pre-installed, which you can then use in your pipeline to deploy your application.
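As a rough sketch of that suggestion (not part of the original answer), the pipeline's pod template could then point the jnlp container at such a pre-built image; the image name below is a placeholder:
spec:
  containers:
  - name: jnlp
    # placeholder: an image rebuilt from the current one with kube-1.19.0-v2.2.0-amd64.tar.gz
    # (and any other large packages) already unpacked into the expected directory
    image: "xxx/google_containers/jenkins-slave-jdk11-wget-with-packages:latest"
    imagePullPolicy: IfNotPresent   # reuse the locally cached image instead of pulling on every build
With the packages baked into the image, the download step in the pipeline can be skipped entirely.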

Kaniko Image Cache in Jenkins Kubernetes Agents

Here's the Jenkinsfile I'm spinning up:
pipeline {
agent {
kubernetes {
yaml '''
apiVersion: v1
kind: Pod
metadata:
name: kaniko
namespace: jenkins
spec:
containers:
- name: kaniko
image: gcr.io/kaniko-project/executor:v1.8.1-debug
imagePullPolicy: IfNotPresent
command:
- /busybox/cat
tty: true
volumeMounts:
- name: jenkins-docker-cfg
mountPath: /kaniko/.docker
- name: image-cache
mountPath: /cache
imagePullSecrets:
- name: regcred
volumes:
- name: image-cache
persistentVolumeClaim:
claimName: kaniko-cache-pvc
- name: jenkins-docker-cfg
projected:
sources:
- secret:
name: regcred
items:
- key: .dockerconfigjson
path: config.json
'''
}
}
stages {
stage('Build & Cache Image'){
steps{
container(name: 'kaniko', shell: '/busybox/sh') {
withEnv(['PATH+EXTRA=/busybox']) {
sh '''#!/busybox/sh -xe
/kaniko/executor \
--cache \
--cache-dir=/cache \
--dockerfile Dockerfile \
--context `pwd`/Dockerfile \
--insecure \
--skip-tls-verify \
--destination testrepo/kaniko-test:0.0.1'''
}
}
}
}
}
}
The problem is that the executor doesn't dump the cache anywhere I can find. If I rerun the pod and stage, the executor logs say that there's no cache. As you can see, I want to retain the cache using a PVC. Any thoughts? Am I missing something?
Thanks in advance.
You should use a separate kaniko-warmer pod, which will pre-download the specific images you need:
- name: kaniko-warmer
image: gcr.io/kaniko-project/warmer:latest
args: ["--cache-dir=/cache",
"--image=nginx:1.17.1-alpine",
"--image=node:17"]
volumeMounts:
- name: kaniko-cache
mountPath: /cache
volumes:
- name: kaniko-cache
hostPath:
path: /opt/volumes/database/qazexam-front-cache
type: DirectoryOrCreate
Then the kaniko-cache volume can be mounted into the kaniko executor.
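For illustration (a sketch, not from the original answer), the relevant parts of the executor pod from the question could reuse the same kaniko-cache hostPath volume at /cache, so that --cache-dir=/cache finds the layers the warmer downloaded:
containers:
- name: kaniko
  image: gcr.io/kaniko-project/executor:v1.8.1-debug
  imagePullPolicy: IfNotPresent
  command:
  - /busybox/cat
  tty: true
  volumeMounts:
  - name: kaniko-cache          # same volume the kaniko-warmer populated
    mountPath: /cache
volumes:
- name: kaniko-cache
  hostPath:
    path: /opt/volumes/database/qazexam-front-cache
    type: DirectoryOrCreate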

Why are podman pods not reproducible using kubernetes yaml file?

I created a pod following a Red Hat blog post, and then created a second pod from the YAML file generated for it.
Post: https://www.redhat.com/sysadmin/compose-podman-pods
When creating the pod using the commands, the pod works fine (can access localhost:8080)
When creating the pod using the YAML file, I get error 403 forbidden
I have tried this on two different hosts (both creating pod from scratch and using YAML), deleting all images and pod each time to make sure nothing was influencing the process
I'm using podman 2.0.4 on Ubuntu 20.04
Commands:
podman pod create --name wptestpod -p 8080:80
podman run \
-d --restart=always --pod=wptestpod \
-e MYSQL_ROOT_PASSWORD="myrootpass" \
-e MYSQL_DATABASE="wp" \
-e MYSQL_USER="wordpress" \
-e MYSQL_PASSWORD="w0rdpr3ss" \
--name=wptest-db mariadb
podman run \
-d --restart=always --pod=wptestpod \
-e WORDPRESS_DB_NAME="wp" \
-e WORDPRESS_DB_USER="wordpress" \
-e WORDPRESS_DB_PASSWORD="w0rdpr3ss" \
-e WORDPRESS_DB_HOST="127.0.0.1" \
--name wptest-web wordpress
Original YAML file from podman generate kube wptestpod > wptestpod.yaml:
# Generation of Kubernetes YAML is still under development!
#
# Save the output of this file and use kubectl create -f to import
# it into Kubernetes.
#
# Created with podman-2.0.4
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: '2020-08-26T17:02:56Z'
labels:
app: wptestpod
name: wptestpod
spec:
containers:
- command:
- apache2-foreground
env:
- name: PATH
value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
- name: TERM
value: xterm
- name: container
value: podman
- name: WORDPRESS_DB_NAME
value: wp
- name: WORDPRESS_DB_USER
value: wordpress
- name: APACHE_CONFDIR
value: /etc/apache2
- name: PHP_LDFLAGS
value: -Wl,-O1 -pie
- name: PHP_VERSION
value: 7.4.9
- name: PHP_EXTRA_CONFIGURE_ARGS
value: --with-apxs2 --disable-cgi
- name: GPG_KEYS
value: 42670A7FE4D0441C8E4632349E4FDC074A4EF02D 5A52880781F755608BF815FC910DEB46F53EA312
- name: WORDPRESS_DB_PASSWORD
value: t3stp4ssw0rd
- name: APACHE_ENVVARS
value: /etc/apache2/envvars
- name: PHP_ASC_URL
value: https://www.php.net/distributions/php-7.4.9.tar.xz.asc
- name: PHP_SHA256
value: 23733f4a608ad1bebdcecf0138ebc5fd57cf20d6e0915f98a9444c3f747dc57b
- name: PHP_URL
value: https://www.php.net/distributions/php-7.4.9.tar.xz
- name: WORDPRESS_DB_HOST
value: 127.0.0.1
- name: PHP_CPPFLAGS
value: -fstack-protector-strong -fpic -fpie -O2 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
- name: PHP_MD5
- name: PHP_EXTRA_BUILD_DEPS
value: apache2-dev
- name: PHP_CFLAGS
value: -fstack-protector-strong -fpic -fpie -O2 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
- name: WORDPRESS_SHA1
value: 03fe1a139b3cd987cc588ba95fab2460cba2a89e
- name: PHPIZE_DEPS
value: "autoconf \t\tdpkg-dev \t\tfile \t\tg++ \t\tgcc \t\tlibc-dev \t\tmake \t\tpkg-config \t\tre2c"
- name: WORDPRESS_VERSION
value: '5.5'
- name: PHP_INI_DIR
value: /usr/local/etc/php
- name: HOSTNAME
value: wptestpod
image: docker.io/library/wordpress:latest
name: wptest-web
ports:
- containerPort: 80
hostPort: 8080
protocol: TCP
resources: {}
securityContext:
allowPrivilegeEscalation: true
capabilities: {}
privileged: false
readOnlyRootFilesystem: false
seLinuxOptions: {}
workingDir: /var/www/html
- command:
- mysqld
env:
- name: PATH
value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
- name: TERM
value: xterm
- name: container
value: podman
- name: MYSQL_PASSWORD
value: t3stp4ssw0rd
- name: GOSU_VERSION
value: '1.12'
- name: GPG_KEYS
value: 177F4010FE56CA3336300305F1656F24C74CD1D8
- name: MARIADB_MAJOR
value: '10.5'
- name: MYSQL_ROOT_PASSWORD
value: t3stp4ssw0rd
- name: MARIADB_VERSION
value: 1:10.5.5+maria~focal
- name: MYSQL_DATABASE
value: wp
- name: MYSQL_USER
value: wordpress
- name: HOSTNAME
value: wptestpod
image: docker.io/library/mariadb:latest
name: wptest-db
resources: {}
securityContext:
allowPrivilegeEscalation: true
capabilities: {}
privileged: false
readOnlyRootFilesystem: false
seLinuxOptions: {}
workingDir: /
status: {}
---
metadata:
creationTimestamp: null
spec: {}
status:
loadBalancer: {}
YAML file with certain envs removed (taken from blog post):
# Generation of Kubernetes YAML is still under development!
#
# Save the output of this file and use kubectl create -f to import
# it into Kubernetes.
#
# Created with podman-1.9.3
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: "2020-07-01T20:17:42Z"
labels:
app: wptestpod
name: wptestpod
spec:
containers:
- name: wptest-web
env:
- name: WORDPRESS_DB_NAME
value: wp
- name: WORDPRESS_DB_HOST
value: 127.0.0.1
- name: WORDPRESS_DB_USER
value: wordpress
- name: WORDPRESS_DB_PASSWORD
value: w0rdpr3ss
image: docker.io/library/wordpress:latest
ports:
- containerPort: 80
hostPort: 8080
protocol: TCP
resources: {}
securityContext:
allowPrivilegeEscalation: true
capabilities: {}
privileged: false
readOnlyRootFilesystem: false
seLinuxOptions: {}
workingDir: /var/www/html
- name: wptest-db
env:
- name: MYSQL_ROOT_PASSWORD
value: myrootpass
- name: MYSQL_USER
value: wordpress
- name: MYSQL_PASSWORD
value: w0rdpr3ss
- name: MYSQL_DATABASE
value: wp
image: docker.io/library/mariadb:latest
resources: {}
securityContext:
allowPrivilegeEscalation: true
capabilities: {}
privileged: false
readOnlyRootFilesystem: false
seLinuxOptions: {}
workingDir: /
status: {}
Can anyone see why this pod would not work when created using the YAML file, but works fine when created using the commands? It seems like a good workflow, but it's useless if the pods produced with the YAML are non-functional.
I found the same article and ran into the same problem as you. None of the following tests worked for me:
Add and remove environment variables
Add and remove restartPolicy part
Play with the capabilities part
As soon as you put the command part back, everything fires up again.
Check it with the following wordpress.yaml:
# Generation of Kubernetes YAML is still under development!
#
# Save the output of this file and use kubectl create -f to import
# it into Kubernetes.
#
# Created with podman-2.2.1
apiVersion: v1
kind: Pod
metadata:
labels:
app: wordpress-pod
name: wordpress-pod
spec:
containers:
- command:
- apache2-foreground
name: wptest-web
env:
- name: WORDPRESS_DB_NAME
value: wp
- name: WORDPRESS_DB_HOST
value: 127.0.0.1
- name: WORDPRESS_DB_USER
value: wordpress
- name: WORDPRESS_DB_PASSWORD
value: w0rdpr3ss
image: docker.io/library/wordpress:latest
ports:
- containerPort: 80
hostPort: 8080
protocol: TCP
resources: {}
securityContext:
allowPrivilegeEscalation: true
capabilities: {}
privileged: false
readOnlyRootFilesystem: false
seLinuxOptions: {}
workingDir: /var/www/html
- command:
- mysqld
name: wptest-db
env:
- name: MYSQL_ROOT_PASSWORD
value: myrootpass
- name: MYSQL_USER
value: wordpress
- name: MYSQL_PASSWORD
value: w0rdpr3ss
- name: MYSQL_DATABASE
value: wp
image: docker.io/library/mariadb:latest
resources: {}
securityContext:
allowPrivilegeEscalation: true
capabilities: {}
privileged: false
readOnlyRootFilesystem: false
seLinuxOptions: {}
workingDir: /
status: {}
Play & checks:
# Create containers, pod and run everything
$ podman play kube wordpress.yaml
# Output
Pod:
5a211c35419b4fcf0deda718e47eec2dd10653a5c5bacc275c312ae75326e746
Containers:
bfd087b5649f8d1b3c62ef86f28f4bcce880653881bcda21823c09e0cca1c85b
5aceb11500db0a91b4db2cc4145879764e16ed0e8f95a2f85d9a55672f65c34b
# Check running state
$ podman container ls; podman pod ls
# Output
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5aceb11500db docker.io/library/mariadb:latest mysqld 13 seconds ago Up 10 seconds ago 0.0.0.0:8080->80/tcp wordpress-pod-wptest-db
bfd087b5649f docker.io/library/wordpress:latest apache2-foregroun... 16 seconds ago Up 10 seconds ago 0.0.0.0:8080->80/tcp wordpress-pod-wptest-web
d8bf33eede43 k8s.gcr.io/pause:3.2 19 seconds ago Up 11 seconds ago 0.0.0.0:8080->80/tcp 5a211c35419b-infra
POD ID NAME STATUS CREATED INFRA ID # OF CONTAINERS
5a211c35419b wordpress-pod Running 20 seconds ago d8bf33eede43 3
A bit more explanation about the bug:
The problem is that the image's ENTRYPOINT and CMD are not parsed correctly, as they should be and as you would expect. It was working in previous versions, and it has already been identified and fixed for future ones.
For complete reference:
Comment found at podman#8710-comment.748672710 breaks this problem into two pieces:
"make podman play use ENVs from image" (podman#8654 already fixed in mainstream)
"podman play should honour both ENTRYPOINT and CMD from image" (podman#8666)
This one is replaced by "play kube: fix args/command handling" (podman#8807 the one already merged to mainstream)

Pass json string to environment variable in a k8s deployment for Envoy

I have a K8s deployment with one pod running, among other containers, a container with the Envoy software. I have defined the image in such a way that if an environment variable EXTRA_OPTS is defined, it will be appended to the command line used to start Envoy.
I want to use that variable to override default configuration as explained in
https://www.envoyproxy.io/docs/envoy/latest/operations/cli#cmdoption-config-yaml
The environment variable works fine for other command-line options, such as "-l debug", for example.
Also, I have tested the expected final command line directly and it works.
The Dockerfile sets Envoy to run this way:
CMD ["/bin/bash", "-c", "envoy -c envoy.yaml $EXTRA_OPTS"]
What I want is to set this:
...
- image: envoy-proxy:1.10.0
imagePullPolicy: IfNotPresent
name: envoy-proxy
env:
- name: EXTRA_OPTS
value: ' --config-yaml "admin: { address: { socket_address: { address: 0.0.0.0, port_value: 9902 } } }"'
...
I have successfully tested running Envoy with the final command line:
envoy -c /etc/envoy/envoy.yaml --config-yaml "admin: { address: { socket_address: { address: 0.0.0.0, port_value: 9902 } } }"
And I have also tested a "simpler" option in EXTRA_OPTS and it works:
...
- image: envoy-proxy:1.10.0
imagePullPolicy: IfNotPresent
name: envoy-proxy
env:
- name: EXTRA_OPTS
value: ' -l debug'
...
I would expect Envoy to run with this new admin port; instead, I'm getting parameter errors:
PARSE ERROR: Argument: {
Couldn't find match for argument
It looks like the quotes are not being passed through to the actual environment variable inside the container...
Any clue?
Thanks to all.
You should set ["/bin/bash", "-c", "envoy -c envoy.yaml"] as an ENTRYPOINT in your Dockerfile, or use command in Kubernetes and then use args to add additional arguments.
You can find more information in the Docker documentation.
Let me explain by example:
$ docker build -t fl3sh/test:bash .
$ cat Dockerfile
FROM ubuntu
RUN echo '#!/bin/bash' > args.sh && \
echo 'echo "$#"' >> args.sh && \
chmod +x args.sh
CMD ["args","from","docker","cmd"]
ENTRYPOINT ["/bin/bash", "args.sh", "$ENV_ARG"]
cat args.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: args
name: args
spec:
containers:
- args:
- args
- from
- k8s
image: fl3sh/test:bash
name: args
imagePullPolicy: Always
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Never
status: {}
Output:
pod/args $ENV_ARG args from k8s
cat command-args.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: command-args
name: command-args
spec:
containers:
- command:
- /bin/bash
- -c
args:
- 'echo args'
image: fl3sh/test:bash
imagePullPolicy: Always
name: args
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Never
status: {}
Output:
pod/command-args args
cat command-env-args.yaml
apiVersion: v1
kind: Pod
metadata:
labels:
run: command-env-args
name: command-env-args
spec:
containers:
- env:
- name: ENV_ARG
value: "arg from env"
command:
- /bin/bash
- -c
- exec echo "$ENV_ARG"
image: fl3sh/test:bash
imagePullPolicy: Always
name: args
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Never
status: {}
Output:
pod/command-env-args arg from env
cat command-no-args.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: command-no-args
name: command-no-args
spec:
containers:
- command:
- /bin/bash
- -c
- 'echo "no args";echo "$#"'
image: fl3sh/test:bash
name: args
imagePullPolicy: Always
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Never
status: {}
Output:
pod/command-no-args no args
#notice ^ empty line above
cat no-args.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: no-args
name: no-args
spec:
containers:
- image: fl3sh/test:bash
name: no-args
imagePullPolicy: Always
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Never
status: {}
Output:
pod/no-args $ENV_ARG args from docker cmd
If you want to recreate my example, you can use this loop to get output like the above:
for p in `kubectl get po -oname`; do echo cat ${p#*/}.yaml; echo ""; \
cat ${p#*/}.yaml; echo -e "\nOutput:"; printf "$p "; \
kubectl logs $p;echo "";done
Conclusion: if you need to pass an env variable as arguments, use:
command:
- /bin/bash
- -c
- exec echo "$ENV_ARG"
I hope it is clear now.
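Applying that conclusion to the Envoy case from the question, one way to avoid the shell-quoting problem entirely is to drop $EXTRA_OPTS and pass the extra option through args, since Kubernetes hands args to the process without any shell word splitting. This is only a sketch and assumes the envoy binary is on the PATH of the envoy-proxy:1.10.0 image:
- image: envoy-proxy:1.10.0
  imagePullPolicy: IfNotPresent
  name: envoy-proxy
  command: ["envoy"]              # overrides the image ENTRYPOINT/CMD; EXTRA_OPTS is no longer needed
  args:
  - "-c"
  - "/etc/envoy/envoy.yaml"
  - "--config-yaml"
  - "admin: { address: { socket_address: { address: 0.0.0.0, port_value: 9902 } } }"
Because each args entry reaches Envoy as a single argv element, the embedded quotes and spaces in the admin block survive intact.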

Unable to execute Airflow KubernetesExecutor

Following the project from here, I am trying to integrate the Airflow KubernetesExecutor using an NFS server as the backing storage PV. I have a PV airflow-pv which is linked to the NFS server. The Airflow webserver and scheduler use a PVC airflow-pvc which is bound to airflow-pv. I have placed my DAG files on the NFS server under /var/nfs/airflow/development/<dags/logs>, and I can see newly added DAGs in the webserver UI as well. However, when I execute a DAG from the UI, the scheduler fires a new pod for the task, BUT the new worker pod fails to run, saying:
Unable to mount volumes for pod "tutorialprintdate-3e1a4443363e4c9f81fd63438cdb9873_development(976b1e64-b46d-11e9-92af-025000000001)": timeout expired waiting for volumes to attach or mount for pod "development"/"tutorialprintdate-3e1a4443363e4c9f81fd63438cdb9873". list of unmounted volumes=[airflow-dags]. list of unattached volumes=[airflow-dags airflow-logs airflow-config default-token-hjwth]
Here are my webserver and scheduler deployment files:
apiVersion: v1
kind: Service
metadata:
name: airflow-webserver-svc
namespace: development
spec:
type: NodePort
ports:
- name: web
protocol: TCP
port: 8080
selector:
app: airflow-webserver-app
namespace: development
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: airflow-webserver-dep
namespace: development
spec:
replicas: 1
selector:
matchLabels:
app: airflow-webserver-app
namespace: development
template:
metadata:
labels:
app: airflow-webserver-app
namespace: development
spec:
restartPolicy: Always
containers:
- name: airflow-webserver-app
image: airflow:externalConfigs
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
args: ["-webserver"]
env:
- name: AIRFLOW_KUBE_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: AIRFLOW__CORE__FERNET_KEY
valueFrom:
secretKeyRef:
name: airflow-secrets
key: AIRFLOW__CORE__FERNET_KEY
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: airflow-secrets
key: MYSQL_PASSWORD
- name: MYSQL_PASSWORD
valueFrom:
secretKeyRef:
name: airflow-secrets
key: MYSQL_PASSWORD
- name: DB_HOST
value: mysql-svc.development.svc.cluster.local
- name: DB_PORT
value: "3306"
- name: MYSQL_DATABASE
value: airflow
- name: MYSQL_USER
value: airflow
- name: MYSQL_PASSWORD
value: airflow
- name: AIRFLOW__CORE__EXECUTOR
value: "KubernetesExecutor"
volumeMounts:
- name: airflow-config
mountPath: /usr/local/airflow/airflow.cfg
subPath: airflow.cfg
- name: airflow-files
mountPath: /usr/local/airflow/dags
subPath: airflow/development/dags
- name: airflow-files
mountPath: /usr/local/airflow/plugins
subPath: airflow/development/plugins
- name: airflow-files
mountPath: /usr/local/airflow/logs
subPath: airflow/development/logs
- name: airflow-files
mountPath: /usr/local/airflow/temp
subPath: airflow/development/temp
volumes:
- name: airflow-files
persistentVolumeClaim:
claimName: airflow-pvc
- name: airflow-config
configMap:
name: airflow-config
The scheduler YAML file is exactly the same, except that the container args are args: ["-scheduler"]. Here is my airflow.cfg file:
apiVersion: v1
kind: ConfigMap
metadata:
name: "airflow-config"
namespace: development
data:
airflow.cfg: |
[core]
airflow_home = /usr/local/airflow
dags_folder = /usr/local/airflow/dags
base_log_folder = /usr/local/airflow/logs
executor = KubernetesExecutor
plugins_folder = /usr/local/airflow/plugins
load_examples = false
[scheduler]
child_process_log_directory = /usr/local/airflow/logs/scheduler
[webserver]
rbac = false
[kubernetes]
airflow_configmap =
worker_container_repository = airflow
worker_container_tag = externalConfigs
worker_container_image_pull_policy = IfNotPresent
delete_worker_pods = true
dags_volume_claim = airflow-pvc
dags_volume_subpath =
logs_volume_claim = airflow-pvc
logs_volume_subpath =
env_from_configmap_ref = airflow-config
env_from_secret_ref = airflow-secrets
in_cluster = true
namespace = development
[kubernetes_node_selectors]
# the key-value pairs to be given to worker pods.
# the worker pods will be scheduled to the nodes of the specified key-value pairs.
# should be supplied in the format: key = value
[kubernetes_environment_variables]
# the configs below get overwritten by the [kubernetes] configs above
AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM = airflow-pvc
AIRFLOW__KUBERNETES__DAGS_VOLUME_SUBPATH = var/nfs/airflow/development/dags
AIRFLOW__KUBERNETES__LOGS_VOLUME_CLAIM = airflow-pvc
AIRFLOW__KUBERNETES__LOGS_VOLUME_SUBPATH = var/nfs/airflow/development/logs
[kubernetes_secrets]
AIRFLOW__CORE__SQL_ALCHEMY_CONN = airflow-secrets=AIRFLOW__CORE__SQL_ALCHEMY_CONN
AIRFLOW_HOME = airflow-secrets=AIRFLOW_HOME
[cli]
api_client = airflow.api.client.json_client
endpoint_url = https://airflow.crunchanalytics.cloud
[api]
auth_backend = airflow.api.auth.backend.default
[admin]
# ui to hide sensitive variable fields when set to true
hide_sensitive_variable_fields = true
After firing a manual task, the scheduler logs tell me that KubernetesExecutorConfig() executed with all values as None. It seems like it didn't pick up the configs? I've tried almost everything I know of but cannot manage to make it work. Could someone tell me what I am missing?
[2019-08-01 14:44:22,944] {jobs.py:1341} INFO - Sending ('kubernetes_sample', 'run_this_first', datetime.datetime(2019, 8, 1, 13, 45, 51, 874679, tzinfo=<Timezone [UTC]>), 1) to executor with priority 3 and queue default
[2019-08-01 14:44:22,944] {base_executor.py:56} INFO - Adding to queue: airflow run kubernetes_sample run_this_first 2019-08-01T13:45:51.874679+00:00 --local -sd /usr/local/airflow/dags/airflow/development/dags/k8s_dag.py
[2019-08-01 14:44:22,948] {kubernetes_executor.py:629} INFO - Add task ('kubernetes_sample', 'run_this_first', datetime.datetime(2019, 8, 1, 13, 45, 51, 874679, tzinfo=<Timezone [UTC]>), 1) with command airflow run kubernetes_sample run_this_first 2019-08-01T13:45:51.874679+00:00 --local -sd /usr/local/airflow/dags/airflow/development/dags/k8s_dag.py with executor_config {}
[2019-08-01 14:44:22,949] {kubernetes_executor.py:379} INFO - Kubernetes job is (('kubernetes_sample', 'run_this_first', datetime.datetime(2019, 8, 1, 13, 45, 51, 874679, tzinfo=<Timezone [UTC]>), 1), 'airflow run kubernetes_sample run_this_first 2019-08-01T13:45:51.874679+00:00 --local -sd /usr/local/airflow/dags/airflow/development/dags/k8s_dag.py', KubernetesExecutorConfig(image=None, image_pull_policy=None, request_memory=None, request_cpu=None, limit_memory=None, limit_cpu=None, gcp_service_account_key=None, node_selectors=None, affinity=None, annotations={}, volumes=[], volume_mounts=[], tolerations=None))
[2019-08-01 14:44:23,042] {kubernetes_executor.py:292} INFO - Event: kubernetessamplerunthisfirst-7fe05ddb34aa4cb9a5604e420d5b60a3 had an event of type ADDED
[2019-08-01 14:44:23,046] {kubernetes_executor.py:324} INFO - Event: kubernetessamplerunthisfirst-7fe05ddb34aa4cb9a5604e420d5b60a3 Pending
[2019-08-01 14:44:23,049] {kubernetes_executor.py:292} INFO - Event: kubernetessamplerunthisfirst-7fe05ddb34aa4cb9a5604e420d5b60a3 had an event of type MODIFIED
[2019-08-01 14:44:23,049] {kubernetes_executor.py:324} INFO - Event: kubernetessamplerunthisfirst-7fe05ddb34aa4cb9a5604e420d5b60a3 Pending
For reference, here are my PV and PVC:
kind: PersistentVolume
apiVersion: v1
metadata:
name: airflow-pv
labels:
mode: local
environment: development
spec:
persistentVolumeReclaimPolicy: Retain
storageClassName: airflow-pv
capacity:
storage: 4Gi
accessModes:
- ReadWriteMany
nfs:
server: 10.105.225.217
path: "/"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: airflow-pvc
namespace: development
spec:
storageClassName: airflow-pv
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
selector:
matchLabels:
mode: local
environment: development
Using Airflow version: 1.10.3
Since there is no answer yet, I'll share my findings so far. In my airflow.cfg, under the [kubernetes] section, we are to pass the following values:
dags_volume_claim = airflow-pvc
dags_volume_subpath = airflow/development/dags
logs_volume_claim = airflow-pvc
logs_volume_subpath = airflow/development/logs
The way the scheduler creates a new pod from the above configs is as follows (showing only the volumes and volumeMounts):
"volumes": [
{
"name": "airflow-dags",
"persistentVolumeClaim": {
"claimName": "airflow-pvc"
}
},
{
"name": "airflow-logs",
"persistentVolumeClaim": {
"claimName": "airflow-pvc"
}
}],
"containers": [
{ ...
"volumeMounts": [
{
"name": "airflow-dags",
"readOnly": true,
"mountPath": "/usr/local/airflow/dags",
"subPath": "airflow/development/dags"
},
{
"name": "airflow-logs",
"mountPath": "/usr/local/airflow/logs",
"subPath": "airflow/development/logs"
}]
...}]
K8s DOESN'T like multiple volumes pointing to the same PVC (airflow-pvc). To fix this, I had to create two PVCs (and PVs), one for dags and one for logs (dags_volume_claim = airflow-dags-pvc and logs_volume_claim = airflow-log-pvc), which works fine.
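For illustration, the two claims could look roughly like this (a sketch; the names are taken from the paragraph above, while the sizes, access modes and the matching NFS-backed PVs are assumptions that have to match the setup shown in the question):
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: airflow-dags-pvc
  namespace: development
spec:
  # storageClassName / selector must match the corresponding new dags PV
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi          # size is an assumption
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: airflow-log-pvc
  namespace: development
spec:
  # storageClassName / selector must match the corresponding new logs PV
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi          # size is an assumption
The [kubernetes] section of airflow.cfg then references these two claim names instead of the single airflow-pvc.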
I don't know if this has already been addressed in a newer version of Airflow (I am using 1.10.3). The Airflow scheduler should handle the case where people use the same PVC by creating a pod with a single volume and two volumeMounts referring to that volume, e.g.:
"volumes": [
{
"name": "airflow-dags-logs", <--just an example name
"persistentVolumeClaim": {
"claimName": "airflow-pvc"
}
}
"containers": [
{ ...
"volumeMounts": [
{
"name": "airflow-dags-logs",
"readOnly": true,
"mountPath": "/usr/local/airflow/dags",
"subPath": "airflow/development/dags" <--taken from configs
},
{
"name": "airflow-dags-logs",
"mountPath": "/usr/local/airflow/logs",
"subPath": "airflow/development/logs" <--taken from configs
}]
...}]
I deployed a pod with the above configuration and it works!