Kubernetes Podspec to download only container image but it should not install - kubernetes

I want to download the container image, but dont want to deploy/install the image.
How can i deploy podspec to download only images but it should not create container.
Any podspec snapshot for this?

As far as I know there is no direct Kubernetes resources to only download an image of your choosing. To have the images of your applications on your Nodes you can consider using following solutions/workarounds:
Use a Daemonset with an initContainer('s)
Use tools like Ansible to pull the images with a playbook
Use a Daemonset with InitContainers
Assuming the following situation:
You've created 2 images that you want to have on all of the Nodes.
You can use a Daemonset (spawn a Pod on each Node) with initContainers (with images as source) that will run on all Nodes and ensure that the images will be present on the machine.
An example of such setup could be following:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: pull-images
labels:
k8s-app: pull-images
spec:
# AS THIS DAEMONSET IS NOT SUPPOSED TO SERVE TRAFFIC I WOULD CONSIDER USING THIS UPDATE STRATEGY FOR SPEEDING UP THE DOWNLOAD PROCESS
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 100
selector:
matchLabels:
name: pull-images
template:
metadata:
labels:
name: pull-images
spec:
initContainers:
# PUT HERE IMAGES THAT YOU WANT TO PULL AND OVERRIDE THEIR ENTRYPOINT
- name: ubuntu
image: ubuntu:20.04 # <-- IMAGE #1
imagePullPolicy: Always # SPECIFY THE POLICY FOR SPECIFIC IMAGE
command: ["/bin/sh", "-c", "exit 0"]
- name: nginx
image: nginx:1.19.10 # <-- IMAGE #2
imagePullPolicy: IfNotPresent # SPECIFY THE POLICY FOR SPECIFIC IMAGE
command: ["/bin/sh", "-c", "exit 0"]
containers:
# MAIN CONTAINER WITH AS SMALL AS POSSIBLE IMAGE SLEEPING
- name: alpine
image: alpine
command: [sleep]
args:
- "infinity"
Kubernetes Daemonset controller will ensure that the Pod will run on each Node. Before the image is run, the initContainers will act as a placeholders for the images. The images that you want to have on the Nodes will be pulled. The ENTRYPOINT will be overridden to not run the image continuously. After that the main container (alpine) will be run with a sleep infinity command.
This setup will also work when the new Nodes are added.
Following on that topic I would also consider checking following documentation on imagePullPolicy:
Kubernetes.io: Docs: Concepts: Containers: Images: Updating images
A side note!
I've set the imagePullPolicy for the images in initContainers differently to show you that you can specify the imagePullPolicy independently for each container. Please use the policy that suits your use case the most.
Use tools like Ansible to pull the images with a playbook
Assuming that you have SSH access to the Nodes you can consider using Ansible with it's community module (assuming that you are using Docker):
community.docker.docker_image
Citing the documentation for this module:
This plugin is part of the community.docker collection (version 1.3.0).
To install it use: ansible-galaxy collection install community.docker.
Synopsis
Build, load or pull an image, making the image available for creating containers. Also supports tagging an image into a repository and archiving an image to a .tar file.
-- Docs.ansible.com: Ansible: Collections: Community: Docker: Docker image module
You can use it with a following example:
hosts.yaml
all:
hosts:
node-1:
ansible_port: 22
ansible_host: X.Y.Z.Q
node-2:
ansible_port: 22
ansible_host: A.B.C.D
playbook.yaml
- name: Playbook to download images
hosts: all
user: ENTER_USER
tasks:
- name: Pull an image
community.docker.docker_image:
name: "{{ item }}"
source: pull
with_items:
- "nginx"
- "ubuntu"
A side note!
In the ansible way, I needed to install docker python package:
$ pip3 install docker
Additional resources:
Kubernetes.io: Docs: Concepts: Workloads: Controllers: Daemonset
Kubernetes.io: Docs: Concepts: Workloads: Pods: initContainers

Related

Copy file inside Kubernetes pod from another container

I need to copy a file inside my pod during the time of creation. I don't want to use ConfigMap and Secrets. I am trying to create a volumeMounts and copy the source file using the kubectl cp command—my manifest looks like this.
apiVersion: v1
kind: Pod
metadata:
name: copy
labels:
app: hello
spec:
containers:
- name: init-myservice
image: bitnami/kubectl
command: ['kubectl','cp','./test.json','init-myservice:./data']
volumeMounts:
- name: my-storage
mountPath: data
- name: init-myservices
image: nginx
volumeMounts:
- name: my-storage
mountPath: data
volumes:
- name: my-storage
emptyDir: {}
But I am getting a CrashLoopBackOff error. Any help or suggestion is highly appreciated.
it's not possible.
let me explain : you need to think of it like two different machine. here your local machine is the one where the file exist and you want to copy it in another machine with cp. but it's not possible. and this is what you are trying to do here. you are trying to copy file from your machine to pod's machine.
here you can do one thing just create your own docker image for init-container. and copy the file you want to store before building the docker image. then you can copy that file in shared volume where you want to store the file.
I do agree with an answer provided by H.R. Emon, it explains why you can't just run kubectl cp inside of the container. I do also think there are some resources that could be added to show you how you can tackle this particular setup.
For this particular use case it is recommended to use an initContainer.
initContainers - specialized containers that run before app containers in a Pod. Init containers can contain utilities or setup scripts not present in an app image.
Kubernetes.io: Docs: Concepts: Workloads: Pods: Init-containers
You could use the example from the official Kubernetes documentation (assuming that downloading your test.json is feasible):
apiVersion: v1
kind: Pod
metadata:
name: init-demo
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
volumeMounts:
- name: workdir
mountPath: /usr/share/nginx/html
# These containers are run during pod initialization
initContainers:
- name: install
image: busybox
command:
- wget
- "-O"
- "/work-dir/index.html"
- http://info.cern.ch
volumeMounts:
- name: workdir
mountPath: "/work-dir"
dnsPolicy: Default
volumes:
- name: workdir
emptyDir: {}
-- Kubernetes.io: Docs: Tasks: Configure Pod Initalization: Create a pod that has an initContainer
You can also modify above example to your specific needs.
Also, referring to your particular example, there are some things that you will need to be aware of:
To use kubectl inside of a Pod you will need to have required permissions to access the Kubernetes API. You can do it by using serviceAccount with some permissions. More can be found in this links:
Kubernetes.io: Docs: Reference: Access authn authz: Authentication: Service account tokens
Kubernetes.io: Docs: Reference: Access authn authz: RBAC
Your bitnami/kubectl container will run into CrashLoopBackOff errors because of the fact that you're passing a single command that will run to completion. After that Pod would report status Completed and it would be restarted due to this fact resulting in before mentioned CrashLoopBackOff. To avoid that you would need to use initContainer.
You can read more about what is happening in your setup by following this answer (connected with previous point):
Stackoverflow.com: Questions: What happens one of the container process crashes in multiple container POD?
Additional resources:
Kubernetes.io: Pod lifecycle
A side note!
I also do consider including the reason why Secrets and ConfigMaps cannot be used to be important in this particular setup.

Volume shared between two containers "is busy or locked"

I have a deployment that runs two containers. One of the containers attempts to build (during deployment) a javascript bundle that the other container, nginx, tries to serve.
I want to use a shared volume to place the javascript bundle after it's built.
So far, I have the following deployment file (with irrelevant pieces removed):
apiVersion: apps/v1
kind: Deployment
metadata:
...
spec:
...
template:
...
spec:
hostNetwork: true
containers:
- name: personal-site
image: wheresmycookie/personal-site:3.1
volumeMounts:
- name: build-volume
mountPath: /var/app/dist
- name: nginx-server
image: nginx:1.19.0
volumeMounts:
- name: build-volume
mountPath: /var/app/dist
volumes:
- name: build-volume
emptyDir: {}
To the best of my ability, I have followed these guides:
https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/
One other things to point out is that I'm trying to run this locally atm using minikube.
EDIT: The Dockerfile I used to build this image is:
FROM node:alpine
WORKDIR /var/app
COPY . .
RUN npm install
RUN npm install -g #vue/cli#latest
CMD ["npm", "run", "build"]
I realize that I do not need to build this when I actually run the image, but my next goal is to insert pod instance information as environment variables, so with javascript unfortunately I can only build once that information is available to me.
Problem
The logs from the personal-site container reveal:
- Building for production...
ERROR Error: EBUSY: resource busy or locked, rmdir '/var/app/dist'
Error: EBUSY: resource busy or locked, rmdir '/var/app/dist'
I'm not sure why the build is trying to remove /dist, but also have a feeling that this is irrelevant. I could be wrong?
I thought that maybe this could be related to the lifecycle of containers/volumes, but the docs suggest that "An emptyDir volume is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node".
Question
What are some reasons that a volume might not be available to me after the containers are already running? Given that you probably have much more experience than I do with Kubernetes, what would you look into next?
The best way is to customize your image's entrypoint as following:
Once you finish building the /var/app/dist folder, copy(or move) this folder to another empty path (.e.g: /opt/dist)
cp -r /var/app/dist/* /opt/dist
PAY ATTENTION: this Step must be done in the script of ENTRYPOINT not in the RUN layer.
Now use /opt/dist instead..:
apiVersion: apps/v1
kind: Deployment
metadata:
...
spec:
...
template:
...
spec:
hostNetwork: true
containers:
- name: personal-site
image: wheresmycookie/personal-site:3.1
volumeMounts:
- name: build-volume
mountPath: /opt/dist # <--- make it consistent with image's entrypoint algorithm
- name: nginx-server
image: nginx:1.19.0
volumeMounts:
- name: build-volume
mountPath: /var/app/dist
volumes:
- name: build-volume
emptyDir: {}
Good luck!
If it's not clear how to customize the entrypoint, share with us your entrypoint of the image and we will implement it.

Invoking a script file in helm k8s job template

I am trying to invoke a script inside a helm k8s job template. When I run helm with
helm install ./mychartname/ --generate-name
The job runs however, it couldn't find the script file (run.sh). Is this possible with helm?
apiVersion: batch/v1
kind: Job
metadata:
name: pre-install-job
spec:
template:
spec:
containers:
- name: pre-install
image: busybox
imagePullPolicy: IfNotPresent
command: ['sh', '-c', '../run.sh']
restartPolicy: OnFailure
terminationGracePeriodSeconds: 0
backoffLimit: 3
completions: 1
parallelism: 1
Here is my directory structure
├── mychartname
│ ├── templates
│ │ ├── test.job
│ │──run.sh
Below is the way to achieve running of script inside helm template
apiVersion: batch/v1
kind: Job
metadata:
name: pre-install-job-v05
spec:
template:
spec:
containers:
- name: pre-install-v05
image: busybox
imagePullPolicy: IfNotPresent
command: ["/bin/sh", "-c", {{.Files.Get "scripts/run.sh" }}]
restartPolicy: OnFailure
terminationGracePeriodSeconds: 0
backoffLimit: 3
completions: 1
parallelism: 1
In general, Kubernetes only runs software that's packaged in Docker images. Kubernetes will never run things off of your local system. In your example, the cluster will create a new unmodified busybox container, then from that container's root directory, try to run sh -c ../run.sh; since that script isn't part of the stock busybox image, it won't run.
The best approach here is to build an image out of your script and push it to a Docker registry. This is the standard way to run any custom software in Kubernetes, so you probably already have a workflow to do it. (For a test setup in Minikube you can point your local Docker at the Minikube environment and build a local image, but this doesn't scale to things hosted in the cloud or otherwise running on a multi-host environment.)
In principle you could upload the script in a config map in a separate Helm template file, mount it into your job spec, and run it (you may need to explicitly sh run.sh to get around file-permission issues). Depending on your environment this may work as well as actually having an image, but if you need to update and redeploy the Helm chart every time the script changes, it's "more normal" to do the same work by having your CI system build and upload a new image (and it'll be the same approach as for your application deployments).

How to use Skaffold with kubernetes volumes?

I have an python application whose docker build takes about 15-20 minutes.
Here is how my Dockerfile looks like more or less
FROM ubuntu:18.04
...
COPY . /usr/local/app
RUN pip install -r /usr/local/app/requirements.txt
...
CMD ...
Now if I use skaffold, any code change triggers a rebuild and it is going to do a reinstall of all requirements(as from the COPY step everything else is going to be rebuild) regardless of whether they were already installed. iIn docker-compose this issue would be solved using volumes. In kubernetes, if we use volumes in the following way:
apiVersion: v1
kind: Pod
metadata:
name: test
spec:
containers:
- image: test:test
name: test-container
volumeMounts:
- mountPath: /usr/local/venv # this is the directory of the
# virtualenv of python
name: test-volume
volumes:
- name: test-volume
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
will this extra requirements build be resolved with skaffold?
I can't speak for skaffold specifically but the container image build can be improved. If there is layer caching available then only reinstall the dependencies when your requirements.txt changes. This is documented in the "ADD or COPY" Best Practices.
FROM ubuntu:18.04
...
COPY requirements.txt /usr/local/app/
RUN pip install -r /usr/local/app/requirements.txt
COPY . /usr/local/app
...
CMD ...
You may need to trigger updates some times if the module versions are loosely defined and say you want a new patch version. I've found requirements should be specific so versions don't slide underneath your application without your knowledge/testing.
Kaniko in-cluster builds
For kaniko builds to make use of a cache in a cluster where there is no persistent storage by default, kaniko needs either a persistent volume mounted (--cache-dir) or a container image repo (--cache-repo) with the layers available.
If your goal is to speed up the dev process: Instead of triggering an entirely new deployment process every time you change a line of code, you can switch to a sync-based dev process to deploy once and then update the files within the running containers when editing code.
Skaffold supports file sync to directly update files inside the deployed containers if you change them on your local machine. However, the docs state "File sync is alpha" (https://skaffold.dev/docs/how-tos/filesync/) and I can completely agree from working with it a while ago: The sync mechanism is only one-directional (no sync from container back to local) and pretty buggy, i.e. it crashes frequently when switching git branches, installing dependencies etc. which can be pretty annoying.
If you want a more stable alternative for sync-based Kubernetes development which is very easy to get started with, take a look at DevSpace: https://github.com/devspace-cloud/devspace
I am one of the maintainers of DevSpace and started the project because Skaffold was much too slow for our team and it did not have a file sync back then.
#Matt's answer is a great best practice (+1) - skaffold in and of itself won't solve the underlying layer cache invalidation issues which results in having to re-install the requirements during every build.
For additional performance, you can cache all the python packages in a volume mounted in your pod for example:
apiVersion: v1
kind: Pod
metadata:
name: test
spec:
containers:
- image: test:test
name: test-container
volumeMounts:
- mountPath: /usr/local/venv
name: test-volume
- mountPath: /root/.cache/pip
name: pip-cache
volumes:
- name: test-volume
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
- name: pip-cache
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
That way if the build cache is ever invalidated and you have to re-install your requirements.txt you'd be saving some time by fetching them from cache.
If you're building with kaniko you can also cache base images to a persistent disk using the kaniko-warmer, for example:
...
volumeMounts:
...
- mountPath: /cache
name: kaniko-warmer
volumes:
...
- name: kaniko-warmer
awsElasticBlockStore:
volumeID: <volume-id>
fsType: ext4
Running the kaniko-warmer inside the pod: docker run --rm -it -v /cache:/cache --entrypoint /kaniko/warmer gcr.io/kaniko-project/warmer --cache-dir=/cache --image=python:3.7-slim --image=nginx:1.17.3. Your skaffold.yaml might look something like:
apiVersion: skaffold/v1beta13
kind: Config
build:
artifacts:
- image: test:test
kaniko:
buildContext:
localDir: {}
cache:
hostPath: /cache
cluster:
namespace: jx
dockerConfig:
secretName: jenkins-docker-cfg
tagPolicy:
envTemplate:
template: '{{.DOCKER_REGISTRY}}/{{.IMAGE_NAME}}'
deploy:
kubectl: {}

OpenShift's YAML execution precedence regarding volume mounting and commands

As a beginner in container administration, I can't find a clear description of OpenShift's deployment stages and related YAML statements, specifically when persistent volume mounting and shell commands execution are involved. For example, in the RedHat documentation there is a lot of examples. A simple one is 16.4. Pod Object Definition:
apiVersion: v1
kind: Pod
metadata:
name: busybox-nfs-pod
labels:
name: busybox-nfs-pod
spec:
containers:
- name: busybox-nfs-pod
image: busybox
command: ["sleep", "60000"]
volumeMounts:
- name: nfsvol-2
mountPath: /usr/share/busybox
readOnly: false
securityContext:
supplementalGroups: [100003]
privileged: false
volumes:
- name: nfsvol-2
persistentVolumeClaim:
claimName: nfs-pvc
Now the question is: does the command sleep (or any other) execute before or after the mount of nfsvol-2 is finished? In other words, is it possible to use the volume's resources in such commands? And if it's not possible in this config, then which event handlers to use instead? I don't see any mention about an event like volume mounted.
does the command sleep (or any other) execute before or after the
mount of nfsvol-2 is finished?
To understand this, lets dig into the underlying concepts for Openshift.
OpenShift is a container application platform that brings docker and Kubernetes to the enterprise. So Openshift is nothing but an abstraction layer on top of docker and kubernetes along with additional features.
Regarding the volumes and commands lets consider the following example:
Let's run the docker container by mounting a volume, which is the home directory of host machine to the root path of the container(-v is option to attach volume).
$ docker run -it -v /home:/root ubuntu /bin/bash
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
50aff78429b1: Pull complete
f6d82e297bce: Pull complete
275abb2c8a6f: Pull complete
9f15a39356d6: Pull complete
fc0342a94c89: Pull complete
Digest: sha256:f871d0805ee3ce1c52b0608108dbdf1b447a34d22d5c7278a3a9dd78fc12c663
Status: Downloaded newer image for ubuntu:latest
root#1f07f083ba79:/# cd /root/
root#1f07f083ba79:~# ls
lost+found raghavendralokineni raghu user1
root#1f07f083ba79:~/raghavendralokineni# pwd
/root/raghavendralokineni
Now execute the sleep command in the container and exit.
root#1f07f083ba79:~/raghavendralokineni# sleep 10
root#1f07f083ba79:~/raghavendralokineni#
root#1f07f083ba79:~/raghavendralokineni# exit
Check the files available in the /home path which we have mounted to the container. This content is same as that of /root path in the container.
raghavendralokineni#iconic-glider-186709:/home$ ls
lost+found raghavendralokineni raghu user1
So when a volume is mounted to the container, any changes in the volume will be effected in the host machine as well.
Hence the volume will be mounted along with the container and commands will be executed after container is started.
Coming back to the your YAML file,
volumeMounts:
- name: nfsvol-2
mountPath: /usr/share/busybox
It says ,mount the volume nfsvol-2 to the container and the information regarding the volume is mentioned in volumes:
volumes:
- name: nfsvol-2
persistentVolumeClaim:
claimName: nfs-pvc
So mount the volume to the container and execute the command which is specifed:
containers:
- name: busybox-nfs-pod
image: busybox
command: ["sleep", "60000"]
Hope this helps.