How to use Skaffold with kubernetes volumes? - kubernetes

I have an python application whose docker build takes about 15-20 minutes.
Here is how my Dockerfile looks like more or less
FROM ubuntu:18.04
...
COPY . /usr/local/app
RUN pip install -r /usr/local/app/requirements.txt
...
CMD ...
Now if I use skaffold, any code change triggers a rebuild and it is going to do a reinstall of all requirements(as from the COPY step everything else is going to be rebuild) regardless of whether they were already installed. iIn docker-compose this issue would be solved using volumes. In kubernetes, if we use volumes in the following way:
apiVersion: v1
kind: Pod
metadata:
name: test
spec:
containers:
- image: test:test
name: test-container
volumeMounts:
- mountPath: /usr/local/venv # this is the directory of the
# virtualenv of python
name: test-volume
volumes:
- name: test-volume
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
will this extra requirements build be resolved with skaffold?

I can't speak for skaffold specifically but the container image build can be improved. If there is layer caching available then only reinstall the dependencies when your requirements.txt changes. This is documented in the "ADD or COPY" Best Practices.
FROM ubuntu:18.04
...
COPY requirements.txt /usr/local/app/
RUN pip install -r /usr/local/app/requirements.txt
COPY . /usr/local/app
...
CMD ...
You may need to trigger updates some times if the module versions are loosely defined and say you want a new patch version. I've found requirements should be specific so versions don't slide underneath your application without your knowledge/testing.
Kaniko in-cluster builds
For kaniko builds to make use of a cache in a cluster where there is no persistent storage by default, kaniko needs either a persistent volume mounted (--cache-dir) or a container image repo (--cache-repo) with the layers available.

If your goal is to speed up the dev process: Instead of triggering an entirely new deployment process every time you change a line of code, you can switch to a sync-based dev process to deploy once and then update the files within the running containers when editing code.
Skaffold supports file sync to directly update files inside the deployed containers if you change them on your local machine. However, the docs state "File sync is alpha" (https://skaffold.dev/docs/how-tos/filesync/) and I can completely agree from working with it a while ago: The sync mechanism is only one-directional (no sync from container back to local) and pretty buggy, i.e. it crashes frequently when switching git branches, installing dependencies etc. which can be pretty annoying.
If you want a more stable alternative for sync-based Kubernetes development which is very easy to get started with, take a look at DevSpace: https://github.com/devspace-cloud/devspace
I am one of the maintainers of DevSpace and started the project because Skaffold was much too slow for our team and it did not have a file sync back then.

#Matt's answer is a great best practice (+1) - skaffold in and of itself won't solve the underlying layer cache invalidation issues which results in having to re-install the requirements during every build.
For additional performance, you can cache all the python packages in a volume mounted in your pod for example:
apiVersion: v1
kind: Pod
metadata:
name: test
spec:
containers:
- image: test:test
name: test-container
volumeMounts:
- mountPath: /usr/local/venv
name: test-volume
- mountPath: /root/.cache/pip
name: pip-cache
volumes:
- name: test-volume
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
- name: pip-cache
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
That way if the build cache is ever invalidated and you have to re-install your requirements.txt you'd be saving some time by fetching them from cache.
If you're building with kaniko you can also cache base images to a persistent disk using the kaniko-warmer, for example:
...
volumeMounts:
...
- mountPath: /cache
name: kaniko-warmer
volumes:
...
- name: kaniko-warmer
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
Running the kaniko-warmer inside the pod: docker run --rm -it -v /cache:/cache --entrypoint /kaniko/warmer gcr.io/kaniko-project/warmer --cache-dir=/cache --image=python:3.7-slim --image=nginx:1.17.3. Your skaffold.yaml might look something like:
apiVersion: skaffold/v1beta13
kind: Config
build:
artifacts:
- image: test:test
kaniko:
buildContext:
localDir: {}
cache:
hostPath: /cache
cluster:
namespace: jx
dockerConfig:
secretName: jenkins-docker-cfg
tagPolicy:
envTemplate:
template: '{{.DOCKER_REGISTRY}}/{{.IMAGE_NAME}}'
deploy:
kubectl: {}

Related

Copy file inside Kubernetes pod from another container

I need to copy a file inside my pod during the time of creation. I don't want to use ConfigMap and Secrets. I am trying to create a volumeMounts and copy the source file using the kubectl cp command—my manifest looks like this.
apiVersion: v1
kind: Pod
metadata:
name: copy
labels:
app: hello
spec:
containers:
- name: init-myservice
image: bitnami/kubectl
command: ['kubectl','cp','./test.json','init-myservice:./data']
volumeMounts:
- name: my-storage
mountPath: data
- name: init-myservices
image: nginx
volumeMounts:
- name: my-storage
mountPath: data
volumes:
- name: my-storage
emptyDir: {}
But I am getting a CrashLoopBackOff error. Any help or suggestion is highly appreciated.
it's not possible.
let me explain : you need to think of it like two different machine. here your local machine is the one where the file exist and you want to copy it in another machine with cp. but it's not possible. and this is what you are trying to do here. you are trying to copy file from your machine to pod's machine.
here you can do one thing just create your own docker image for init-container. and copy the file you want to store before building the docker image. then you can copy that file in shared volume where you want to store the file.
I do agree with an answer provided by H.R. Emon, it explains why you can't just run kubectl cp inside of the container. I do also think there are some resources that could be added to show you how you can tackle this particular setup.
For this particular use case it is recommended to use an initContainer.
initContainers - specialized containers that run before app containers in a Pod. Init containers can contain utilities or setup scripts not present in an app image.
Kubernetes.io: Docs: Concepts: Workloads: Pods: Init-containers
You could use the example from the official Kubernetes documentation (assuming that downloading your test.json is feasible):
apiVersion: v1
kind: Pod
metadata:
name: init-demo
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
volumeMounts:
- name: workdir
mountPath: /usr/share/nginx/html
# These containers are run during pod initialization
initContainers:
- name: install
image: busybox
command:
- wget
- "-O"
- "/work-dir/index.html"
- http://info.cern.ch
volumeMounts:
- name: workdir
mountPath: "/work-dir"
dnsPolicy: Default
volumes:
- name: workdir
emptyDir: {}
-- Kubernetes.io: Docs: Tasks: Configure Pod Initalization: Create a pod that has an initContainer
You can also modify above example to your specific needs.
Also, referring to your particular example, there are some things that you will need to be aware of:
To use kubectl inside of a Pod you will need to have required permissions to access the Kubernetes API. You can do it by using serviceAccount with some permissions. More can be found in this links:
Kubernetes.io: Docs: Reference: Access authn authz: Authentication: Service account tokens
Kubernetes.io: Docs: Reference: Access authn authz: RBAC
Your bitnami/kubectl container will run into CrashLoopBackOff errors because of the fact that you're passing a single command that will run to completion. After that Pod would report status Completed and it would be restarted due to this fact resulting in before mentioned CrashLoopBackOff. To avoid that you would need to use initContainer.
You can read more about what is happening in your setup by following this answer (connected with previous point):
Stackoverflow.com: Questions: What happens one of the container process crashes in multiple container POD?
Additional resources:
Kubernetes.io: Pod lifecycle
A side note!
I also do consider including the reason why Secrets and ConfigMaps cannot be used to be important in this particular setup.

Volume shared between two containers "is busy or locked"

I have a deployment that runs two containers. One of the containers attempts to build (during deployment) a javascript bundle that the other container, nginx, tries to serve.
I want to use a shared volume to place the javascript bundle after it's built.
So far, I have the following deployment file (with irrelevant pieces removed):
apiVersion: apps/v1
kind: Deployment
metadata:
...
spec:
...
template:
...
spec:
hostNetwork: true
containers:
- name: personal-site
image: wheresmycookie/personal-site:3.1
volumeMounts:
- name: build-volume
mountPath: /var/app/dist
- name: nginx-server
image: nginx:1.19.0
volumeMounts:
- name: build-volume
mountPath: /var/app/dist
volumes:
- name: build-volume
emptyDir: {}
To the best of my ability, I have followed these guides:
https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/
One other things to point out is that I'm trying to run this locally atm using minikube.
EDIT: The Dockerfile I used to build this image is:
FROM node:alpine
WORKDIR /var/app
COPY . .
RUN npm install
RUN npm install -g #vue/cli#latest
CMD ["npm", "run", "build"]
I realize that I do not need to build this when I actually run the image, but my next goal is to insert pod instance information as environment variables, so with javascript unfortunately I can only build once that information is available to me.
Problem
The logs from the personal-site container reveal:
- Building for production...
ERROR Error: EBUSY: resource busy or locked, rmdir '/var/app/dist'
Error: EBUSY: resource busy or locked, rmdir '/var/app/dist'
I'm not sure why the build is trying to remove /dist, but also have a feeling that this is irrelevant. I could be wrong?
I thought that maybe this could be related to the lifecycle of containers/volumes, but the docs suggest that "An emptyDir volume is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node".
Question
What are some reasons that a volume might not be available to me after the containers are already running? Given that you probably have much more experience than I do with Kubernetes, what would you look into next?
The best way is to customize your image's entrypoint as following:
Once you finish building the /var/app/dist folder, copy(or move) this folder to another empty path (.e.g: /opt/dist)
cp -r /var/app/dist/* /opt/dist
PAY ATTENTION: this Step must be done in the script of ENTRYPOINT not in the RUN layer.
Now use /opt/dist instead..:
apiVersion: apps/v1
kind: Deployment
metadata:
...
spec:
...
template:
...
spec:
hostNetwork: true
containers:
- name: personal-site
image: wheresmycookie/personal-site:3.1
volumeMounts:
- name: build-volume
mountPath: /opt/dist # <--- make it consistent with image's entrypoint algorithm
- name: nginx-server
image: nginx:1.19.0
volumeMounts:
- name: build-volume
mountPath: /var/app/dist
volumes:
- name: build-volume
emptyDir: {}
Good luck!
If it's not clear how to customize the entrypoint, share with us your entrypoint of the image and we will implement it.

Unable to access volume content using initContainers

I have a simple image (mdw:1.0.0) with some content in it:
FROM alpine:3.9
COPY /role /mdw
WORKDIR /mdw
I was expecting that my container 'nginx' would see the content of /mdw folder, but there is no file.
apiVersion: v1
kind: Pod
metadata:
name: init-demo
spec:
initContainers:
- name: install
image: mdw:1.0.0
imagePullPolicy: Never
volumeMounts:
- name: workdir
mountPath: "/mdw"
containers:
- name: nginx
image: nginx
volumeMounts:
- name: workdir
mountPath: /mdw
command: ["ls", "-l", "/mdw"]
volumes:
- name: workdir
emptyDir: {}
Do you know what is the reason and how to fix it ?
Thank you very much
When mounting volume if directory already exists will get wiped. It's intentional and no fix really.
Only way would be to populate the directory after mounting is done.
Your init container doesn't do anything: the Dockerfile doesn't have a CMD and the Kubernetes deployment spec doesn't set a command: either. It starts and immediately exits. (The base Linux distribution images generally have a default command to launch an interactive shell, but absent a tty this will also immediately exit.)
Meanwhile, your Kubernetes setup is also mounting an empty directory over the only content you've put into the image, which prevents the init container from having an effect.
You can build a custom nginx image that directly copies the content in:
FROM nginx
COPY /role /usr/share/nginx/html
Don't use initContainers:, and use that image as the main containers: image.
There is a Docker-specific feature, using Docker named volumes, that can populate a named volume on first use, and you're probably thinking of this feature. This comes with a couple of important caveats (it only takes effect the very first time you run a container, and ignores updates to the image; it doesn't work with bind mounts). This is a plain-Docker-specific feature: Kubernetes will never auto-populate a volume for you.

How to run binary using kuberneates config-map

I used config map with files but i am experimenting with portable services like supervisor d and other internal tools.
we have golang binary that can be run in any image. what i am trying is to run these binary using configmap.
Example :-
We have a internal tool written in Go(size is less than 7MB) can be store in config map and we want to mount that config map inside kuberneates pod and want to run it inside pod
Question :- does anyone use it ? Is it a good approach ? What is the best practice ?
I don't believe you can put 7MB of content in a ConfigMap. See here for example. What you're trying to do sounds like a very unusual practice. The standard practice to run binaries in Pods in Kubernetes is to build a container image that includes the binary and configure the image or the Pod to run that binary.
I too faced similar issue while storing elastic.jks keystore binary file in k8s pod.
AFAIK there are two options:
Make use of configmap to store binary data. Check this out.
OR
Store your binary file remotely somewhere like in s3 bucket and pull that binary before running actual pod using initContainers concept.
apiVersion: v1
kind: Pod
metadata:
name: alpine
namespace: default
spec:
containers:
- name: myapp-container
image: alpine:3.1
command: ['sh', '-c', 'if [ -f /jks/elastic.jks ]; then sleep 99999; fi']
volumeMounts:
- name: jksdata
mountPath: /jks
initContainers:
- name: init-container
image: atlassian/pipelines-awscli
command: ["/bin/sh","-c"]
args: ['aws s3 sync s3://my-artifacts/$CLUSTER /jks/']
imagePullPolicy: IfNotPresent
volumeMounts:
- name: jksdata
mountPath: /jks
env:
- name: CLUSTER
value: dev-elastic
volumes:
- name: jksdata
emptyDir: {}
restartPolicy: Always
As #amit-kumar-gupta mentioned the configmap size constraint.
I recommend the second way.
Hope this helps.

how to pass a configuration file thought yaml on kubernetes to create new replication controller

i am trying to pass a configuration file(which is located on master) on nginx container at the time of replication controller creation through kubernetes.. ex. as we are using ADD command in Dockerfile...
There isn't a way to dynamically add file to a pod specification when instantiating it in Kubernetes.
Here are a couple of alternatives (that may solve your problem):
Build the configuration file into your container (using the docker ADD command). This has the advantage that it works in the way which you are already familiar but the disadvantage that you can no longer parameterize your container without rebuilding it.
Use environment variables instead of a configuration file. This may require some refactoring of your code (or creating a side-car container to turn environment variables into the configuration file that your application expects).
Put the configuration file into a volume. Mount this volume into your pod and read the configuration file from the volume.
Use a secret. This isn't the intended use for secrets, but secrets manifest themselves as files inside your container, so you can base64 encode your configuration file, store it as a secret in the apiserver, and then point your application to the location of the secret file that is created inside your pod.
I believe you can also download config during container initialization.
See example below, you may download config instead index.html but I would not use it for sensetive info like passwords.
apiVersion: v1
kind: Pod
metadata:
name: init-demo
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
volumeMounts:
- name: workdir
mountPath: /usr/share/nginx/html
# These containers are run during pod initialization
initContainers:
- name: install
image: busybox
command:
- wget
- "-O"
- "/work-dir/index.html"
- http://kubernetes.io
volumeMounts:
- name: workdir
mountPath: "/work-dir"
dnsPolicy: Default
volumes:
- name: workdir
emptyDir: {}