I'm experimenting with the new Flink Kubernetes operator and I've been able to do pretty much everything that I need besides one thing: getting a JAR file from the S3 file system.
Context
I have a Flink application running in an EKS cluster on AWS, and all of its data is stored in an S3 bucket: savepoints, checkpoints, high-availability metadata, and JAR files.
I've been able to save the savepoints, checkpoints and high availability information in the bucket, but when trying to get the JAR file from the same bucket I get the error:
Could not find a file system implementation for scheme 's3'. The scheme is directly supported by Flink through the following plugins: flink-s3-fs-hadoop, flink-s3-fs-presto.
I was able to get to this thread, but I wasn't able to get the resource fetcher to work correctly. Also the solution is not ideal and I was searching for a more direct approach.
Deployment files
Here are the files that I'm deploying in the cluster:
deployment.yml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: flink-deployment
spec:
  podTemplate:
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-template
    spec:
      containers:
        - name: flink-main-container
          env:
            - name: ENABLE_BUILT_IN_PLUGINS
              value: flink-s3-fs-presto-1.15.3.jar;flink-s3-fs-hadoop-1.15.3.jar
          volumeMounts:
            - mountPath: /flink-data
              name: flink-volume
      volumes:
        - name: flink-volume
          hostPath:
            path: /tmp
            type: Directory
  image: flink:1.15
  flinkVersion: v1_15
  flinkConfiguration:
    state.checkpoints.dir: s3://kubernetes-operator/checkpoints
    state.savepoints.dir: s3://kubernetes-operator/savepoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3://kubernetes-operator/ha
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  serviceAccount: flink
session-job.yml
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: flink-session-job
spec:
  deploymentName: flink-deployment
  job:
    jarURI: s3://kubernetes-operator/savepoints/flink.jar
    parallelism: 3
    upgradeMode: savepoint
    savepointTriggerNonce: 0
The Flink Kubernetes operator version that I'm using is 1.3.1
Is there anything that I'm missing or doing wrong?
The JAR download happens in the flink-kubernetes-operator pod. When you apply a FlinkSessionJob, the operator recognizes the custom resource, downloads the JAR from the jarURI location, builds a JobGraph, and submits the session job to the FlinkDeployment. To build the JobGraph, the operator runs Flink code inside its own pod.
So you have to add flink-s3-fs-hadoop-1.15.3.jar under /opt/flink/plugins/s3-fs-hadoop/ inside the flink-kubernetes-operator pod.
You can add the JAR either by extending the ghcr.io/apache/flink-kubernetes-operator image (curl the JAR and copy it to the plugins location)
or
by writing an initContainer that downloads the JAR to a volume, and mounting that volume in the operator container:
volumes:
  - name: s3-plugin
    emptyDir: { }
initContainers:
  - name: busybox
    image: busybox:latest
    volumeMounts:
      - mountPath: /opt/flink/plugins/s3-fs-hadoop
        name: s3-plugin
containers:
  - image: 'ghcr.io/apache/flink-kubernetes-operator:95128bf'
    name: flink-kubernetes-operator
    volumeMounts:
      - mountPath: /opt/flink/plugins/s3-fs-hadoop
        name: s3-plugin
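The snippet above does not show the download step itself. A minimal sketch of what the initContainer's command could look like follows; the container name, the Maven Central URL, and the use of busybox's wget are assumptions, not part of the original answer. If the image's wget cannot handle HTTPS in your environment, an image that ships curl may be a better fit.

```yaml
# Sketch only: the container name, the download URL and the wget call are assumptions.
initContainers:
  - name: fetch-s3-plugin            # hypothetical name
    image: busybox:latest
    command:
      - sh
      - -c
      # The folded scalar below is joined into a single shell command line.
      - >
        wget -O /opt/flink/plugins/s3-fs-hadoop/flink-s3-fs-hadoop-1.15.3.jar
        https://repo1.maven.org/maven2/org/apache/flink/flink-s3-fs-hadoop/1.15.3/flink-s3-fs-hadoop-1.15.3.jar
    volumeMounts:
      # Same emptyDir volume that the operator container mounts at the plugins path.
      - mountPath: /opt/flink/plugins/s3-fs-hadoop
        name: s3-plugin
```

Because the init container and the operator container mount the same emptyDir, the JAR written here should be visible in the operator's plugins directory when it starts.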
Also, if you are using a serviceAccount for S3 authentication, add the config below to flinkConfiguration:
fs.s3a.aws.credentials.provider: com.amazonaws.auth.WebIdentityTokenCredentialsProvider
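As a rough illustration of where that key could go, here is a fragment that assumes it is set in the FlinkDeployment's spec.flinkConfiguration, next to the keys from the question's deployment.yml. Whether the operator itself also needs this setting in its own configuration is not covered by the original answer.

```yaml
# Fragment of deployment.yml; placement under spec.flinkConfiguration is an assumption.
spec:
  flinkConfiguration:
    state.checkpoints.dir: s3://kubernetes-operator/checkpoints
    state.savepoints.dir: s3://kubernetes-operator/savepoints
    # Use the web identity token of the pod's service account for S3 access.
    fs.s3a.aws.credentials.provider: com.amazonaws.auth.WebIdentityTokenCredentialsProvider
```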
Related
I have created a Kubernetes cluster on DigitalOcean and deployed k6 as a Job on it.
apiVersion: batch/v1
kind: Job
metadata:
  name: benchmark
spec:
  template:
    spec:
      containers:
        - name: benchmark
          image: loadimpact/k6:0.29.0
          command: ["k6", "run", "--vus", "2", "--duration", "5m", "--out", "json=./test.json", "/etc/k6-config/script.js"]
          volumeMounts:
            - name: config-volume
              mountPath: /etc/k6-config
      restartPolicy: Never
      volumes:
        - name: config-volume
          configMap:
            name: k6-config
This is what my k6-job.yaml file looks like. After deploying it to the cluster, I checked the pod's logs. They show a permission denied error:
level=error msg="open ./test.json: permission denied"
How can I solve this issue?
The k6 Docker image runs as an unprivileged user, but unfortunately the default work directory is set to /, so it has no permission to write there.
To work around this, consider changing the JSON output path to /home/k6/test.json, i.e.:
command: ["k6", "run", "--vus", "2", "--duration", "5m", "--out", "json=/home/k6/test.json", "/etc/k6-config/script.js"]
I'm one of the maintainers on the team, so will propose a change to the Dockerfile to set the WORKDIR to /home/k6 to make the default behavior a bit more intuitive.
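Until the image changes, another option is to set the working directory in the Job itself. The sketch below assumes that /home/k6 exists and is writable by the image's user, as the workaround above implies; workingDir is a standard container field in Kubernetes.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: benchmark
spec:
  template:
    spec:
      containers:
        - name: benchmark
          image: loadimpact/k6:0.29.0
          # Assumption: /home/k6 is writable by the unprivileged k6 user.
          workingDir: /home/k6
          # The relative path ./test.json now resolves to /home/k6/test.json.
          command: ["k6", "run", "--vus", "2", "--duration", "5m", "--out", "json=./test.json", "/etc/k6-config/script.js"]
          volumeMounts:
            - name: config-volume
              mountPath: /etc/k6-config
      restartPolicy: Never
      volumes:
        - name: config-volume
          configMap:
            name: k6-config
```

Note that the output file lives in the container's filesystem, so it is gone once the pod is deleted.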
I have a deployment that runs two containers. One of the containers attempts to build (during deployment) a javascript bundle that the other container, nginx, tries to serve.
I want to use a shared volume to place the javascript bundle after it's built.
So far, I have the following deployment file (with irrelevant pieces removed):
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    ...
    spec:
      hostNetwork: true
      containers:
        - name: personal-site
          image: wheresmycookie/personal-site:3.1
          volumeMounts:
            - name: build-volume
              mountPath: /var/app/dist
        - name: nginx-server
          image: nginx:1.19.0
          volumeMounts:
            - name: build-volume
              mountPath: /var/app/dist
      volumes:
        - name: build-volume
          emptyDir: {}
To the best of my ability, I have followed these guides:
https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/
One other thing to point out is that I'm currently trying to run this locally using minikube.
EDIT: The Dockerfile I used to build this image is:
FROM node:alpine
WORKDIR /var/app
COPY . .
RUN npm install
RUN npm install -g @vue/cli@latest
CMD ["npm", "run", "build"]
I realize that I do not need to build this when I actually run the image, but my next goal is to insert pod instance information as environment variables, so with javascript unfortunately I can only build once that information is available to me.
Problem
The logs from the personal-site container reveal:
- Building for production...
ERROR Error: EBUSY: resource busy or locked, rmdir '/var/app/dist'
Error: EBUSY: resource busy or locked, rmdir '/var/app/dist'
I'm not sure why the build is trying to remove /dist, but also have a feeling that this is irrelevant. I could be wrong?
I thought that maybe this could be related to the lifecycle of containers/volumes, but the docs suggest that "An emptyDir volume is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node".
Question
What are some reasons that a volume might not be available to me after the containers are already running? Given that you probably have much more experience than I do with Kubernetes, what would you look into next?
The best way is to customize your image's entrypoint as follows:
Once the build has finished producing /var/app/dist, copy (or move) its contents to another, empty path (e.g. /opt/dist):
cp -r /var/app/dist/* /opt/dist
Pay attention: this step must run in the ENTRYPOINT script, not in a RUN layer, because the volume is only mounted when the container starts.
Now mount the volume at /opt/dist instead:
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    ...
    spec:
      hostNetwork: true
      containers:
        - name: personal-site
          image: wheresmycookie/personal-site:3.1
          volumeMounts:
            - name: build-volume
              mountPath: /opt/dist # <--- make it consistent with image's entrypoint algorithm
        - name: nginx-server
          image: nginx:1.19.0
          volumeMounts:
            - name: build-volume
              mountPath: /var/app/dist
      volumes:
        - name: build-volume
          emptyDir: {}
Good luck!
If it's not clear how to customize the entrypoint, share with us your entrypoint of the image and we will implement it.
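If rebuilding the image is inconvenient, the same idea can be sketched by overriding the container command in the Deployment. This is only a sketch: it assumes the image provides sh and that npm run build writes to /var/app/dist, as the question's Dockerfile suggests.

```yaml
# Fragment of the Deployment's pod spec (containers and volumes only).
containers:
  - name: personal-site
    image: wheresmycookie/personal-site:3.1
    # Replaces the image's default command: build first, then copy the
    # result into the shared volume, which is mounted at /opt/dist.
    command: ["sh", "-c", "npm run build && cp -r /var/app/dist/. /opt/dist/"]
    volumeMounts:
      - name: build-volume
        mountPath: /opt/dist
  - name: nginx-server
    image: nginx:1.19.0
    volumeMounts:
      - name: build-volume
        mountPath: /var/app/dist
volumes:
  - name: build-volume
    emptyDir: {}
```

Since the build container exits once the copy is done, it may fit better as an initContainer than as a long-running container; that variation is left out here.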
I have used ConfigMaps with files, but I am now experimenting with portable services such as supervisord and other internal tools.
We have a Go binary that can run in any image. What I am trying to do is run such binaries from a ConfigMap.
Example:
We have an internal tool written in Go (less than 7 MB in size) that could be stored in a ConfigMap. We want to mount that ConfigMap inside a Kubernetes pod and run the tool there.
Question: does anyone do this? Is it a good approach? What is the best practice?
I don't believe you can put 7MB of content in a ConfigMap. See here for example. What you're trying to do sounds like a very unusual practice. The standard practice to run binaries in Pods in Kubernetes is to build a container image that includes the binary and configure the image or the Pod to run that binary.
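As an illustration of that standard practice, a Pod that runs a binary baked into its image might look like the sketch below. The image name and the binary path are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: internal-tool                                  # hypothetical name
spec:
  containers:
    - name: internal-tool
      # Hypothetical image that was built with the Go binary copied into it.
      image: registry.example.com/internal-tool:1.0
      # Hypothetical path of the binary inside that image.
      command: ["/usr/local/bin/internal-tool"]
```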
I faced a similar issue while storing an elastic.jks keystore binary file in a k8s pod.
AFAIK there are two options:
Use a ConfigMap to store the binary data. Check this out.
OR
Store the binary file remotely, for example in an S3 bucket, and pull it before the actual container starts, using initContainers:
apiVersion: v1
kind: Pod
metadata:
  name: alpine
  namespace: default
spec:
  containers:
    - name: myapp-container
      image: alpine:3.1
      command: ['sh', '-c', 'if [ -f /jks/elastic.jks ]; then sleep 99999; fi']
      volumeMounts:
        - name: jksdata
          mountPath: /jks
  initContainers:
    - name: init-container
      image: atlassian/pipelines-awscli
      command: ["/bin/sh","-c"]
      args: ['aws s3 sync s3://my-artifacts/$CLUSTER /jks/']
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: jksdata
          mountPath: /jks
      env:
        - name: CLUSTER
          value: dev-elastic
  volumes:
    - name: jksdata
      emptyDir: {}
  restartPolicy: Always
As @amit-kumar-gupta mentioned, ConfigMaps have a size constraint, so I recommend the second way.
Hope this helps.
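For completeness, the first option (a ConfigMap holding binary data) could be sketched as follows, using the binaryData field. The resource names are hypothetical and the base64 value is a four-byte placeholder, not a real keystore. A ConfigMap is limited to roughly 1 MiB, so this only suits small files.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: elastic-keystore          # hypothetical name
binaryData:
  # Placeholder value; the base64 encoding of the real file would go here.
  elastic.jks: AAECAw==
---
apiVersion: v1
kind: Pod
metadata:
  name: keystore-consumer         # hypothetical name
spec:
  containers:
    - name: app
      image: alpine:3.18
      command: ["sh", "-c", "ls -l /jks && sleep 3600"]
      volumeMounts:
        - name: jksdata
          mountPath: /jks
          readOnly: true
  volumes:
    - name: jksdata
      configMap:
        # Each key of the ConfigMap appears as a file under /jks.
        name: elastic-keystore
```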
Is there a way to load a kernel module ("modprobe nfsd" in my case) automatically after nodes start or are upgraded in GKE? We are running an NFS server pod on our Kubernetes cluster, and it dies after every GKE upgrade.
I tried both the cos and ubuntu images; neither seems to have nfsd loaded by default.
I also tried something like this, but it does not seem to do what it is supposed to do:
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: nfsd-modprobe
  labels:
    app: nfsd-modprobe
spec:
  template:
    metadata:
      labels:
        app: nfsd-modprobe
    spec:
      hostPID: true
      containers:
        - name: nfsd-modprobe
          image: gcr.io/google-containers/startup-script:v1
          imagePullPolicy: Always
          securityContext:
            privileged: true
          env:
            - name: STARTUP_SCRIPT
              value: |
                #! /bin/bash
                modprobe nfs
                modprobe nfsd
                while true; do sleep 1; done
I faced the same issue. The existing answer is correct; I want to expand on it with a working example of an NFS pod within a Kubernetes cluster that has the capabilities and libraries needed to load the required modules.
It has two important parts:
privileged mode
the host's /lib/modules directory mounted into the container
nfs-server.yaml
kind: Pod
apiVersion: v1
metadata:
  name: nfs-server-pod
spec:
  containers:
    - name: nfs-server-container
      image: erichough/nfs-server
      securityContext:
        privileged: true
      env:
        - name: NFS_EXPORT_0
          value: "/test *(rw,no_subtree_check,insecure,fsid=0)"
      volumeMounts:
        - mountPath: /lib/modules # mounting modules into container
          name: lib-modules
          readOnly: true # make sure it's readonly
        - mountPath: /test
          name: export-dir
  volumes:
    - hostPath: # using hostpath to get modules from the host
        path: /lib/modules
        type: Directory
      name: lib-modules
    - name: export-dir
      emptyDir: {}
Reference which helped as well - Automatically load required kernel modules.
By default, you cannot load modules from inside a container, because excluding kernel components is one of the main reasons containers are lightweight and portable. You need to load the module from the host OS in order to make it available inside the container. This means you could simply launch a script that enables the kernel modules you want after each GKE upgrade.
However, there exists a somewhat hacky way to load kernel modules from inside a docker container. It all boils down to launching your container with escalated privileges and with access to certain host directories. You should try that if you really want to mount your kernel modules while inside a container.
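A rough sketch of that approach as a DaemonSet is shown below, so that it runs again on every new or upgraded node. The images, names, and the init-container layout are assumptions; it also assumes the modprobe shipped in the chosen image can read the host's module files, which depends on how the node image packages them.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nfsd-modprobe
spec:
  selector:
    matchLabels:
      app: nfsd-modprobe
  template:
    metadata:
      labels:
        app: nfsd-modprobe
    spec:
      initContainers:
        - name: modprobe
          image: alpine:3.18              # assumption: any image that ships modprobe
          command: ["sh", "-c", "modprobe nfs && modprobe nfsd"]
          securityContext:
            privileged: true              # escalated privileges, as described above
          volumeMounts:
            - name: lib-modules
              mountPath: /lib/modules     # host modules made visible to the container
              readOnly: true
      containers:
        # Keeps the pod alive after the modules are loaded.
        - name: pause
          image: registry.k8s.io/pause:3.9
      volumes:
        - name: lib-modules
          hostPath:
            path: /lib/modules
            type: Directory
```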
I have created a pod with two containers. I know that the containers in a pod share the same network namespace (i.e., the same IP and port space) and can also share a storage volume between them via ConfigMaps. My question is: do the containers in a pod also share the same filesystem? For instance, in my case I have one container 'C1' that generates a dynamic file every 10 minutes at /var/targets.yml, and I want the other container 'C2' to read this file and perform its own independent action.
Is there a way to do this, maybe some workaround via ConfigMaps? Or do I have to access the file over the network? (That may not be a good idea when it comes to pod restarts.) Any suggestions or references, please?
You can use an emptyDir for this:
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
    - image: gcr.io/google_containers/test-webserver
      name: generating-container
      volumeMounts:
        - mountPath: /cache
          name: cache-volume
    - image: gcr.io/google_containers/test-webserver
      name: consuming-container
      volumeMounts:
        - mountPath: /cache
          name: cache-volume
  volumes:
    - name: cache-volume
      emptyDir: {}
But be aware that the data isn't persistent: an emptyDir survives container restarts within the Pod, but it is deleted when the Pod is removed from the node.
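Applied to the scenario in the question, a sketch could look like the following. The busybox image and the shell loops are stand-ins for the real C1 and C2 workloads, and the file is placed in a dedicated directory (/var/targets) because mounting a volume directly at /var would hide that directory's existing contents.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-targets            # hypothetical name
spec:
  containers:
    - name: c1
      image: busybox:1.36         # stand-in for the generating container
      # Rewrites the shared file every 10 minutes.
      command: ["sh", "-c", "while true; do date > /var/targets/targets.yml; sleep 600; done"]
      volumeMounts:
        - name: targets
          mountPath: /var/targets
    - name: c2
      image: busybox:1.36         # stand-in for the consuming container
      # Reads the same file through its own mount of the volume.
      command: ["sh", "-c", "while true; do cat /var/targets/targets.yml; sleep 600; done"]
      volumeMounts:
        - name: targets
          mountPath: /var/targets
          readOnly: true
  volumes:
    - name: targets
      emptyDir: {}
```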