Is there a way to automatically create a container when starting Azurite? - azurite

For test purposes I create and run an Azurite docker image, in a test pipeline.
I would like the blob container to be created automatically once Azurite has started, as it would simplify things.
Is there any good way to achieve this?
For the Postgres image we use, we can specify an init.sql which is run on startup. If something similar is available for Azurite, that would be awesome.

You can use the following Dockerfile to install the azure-storage-blob Python package on the Alpine-based azurite image. The resulting image is about 400MB, compared to about 1.2GB for the azure-cli image.
ARG AZURITE_VERSION="3.17.0"
FROM mcr.microsoft.com/azure-storage/azurite:${AZURITE_VERSION}
# Install azure-storage-blob python package
RUN apk update && \
    apk --no-cache add py3-pip && \
    apk add --virtual=build gcc libffi-dev musl-dev python3-dev && \
    pip3 install --upgrade pip && \
    pip3 install azure-storage-blob==12.12.0
# Copy init_azurite.py script
COPY ./init_azurite.py init_azurite.py
# Copy local blobs to azurite
COPY ./init_containers init_containers
# Run the blob emulator and initialize the blob containers
CMD python3 init_azurite.py --directory=init_containers & \
    azurite-blob --blobHost 0.0.0.0 --blobPort 10000
The init_azurite.py script is a local Python script that uses the azure-storage-blob package to batch upload files and directories to the azurite blob storage emulator.
import argparse
import os
from time import sleep

from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient, ContainerClient


def upload_file(container_client: ContainerClient, source: str, dest: str) -> None:
    """
    Upload a single file to a path inside the container.
    """
    print(f"Uploading {source} to {dest}")
    with open(source, "rb") as data:
        try:
            container_client.upload_blob(name=dest, data=data)
        except ResourceExistsError:
            pass


def upload_dir(container_client: ContainerClient, source: str, dest: str) -> None:
    """
    Upload a directory to a path inside the container.
    """
    prefix = "" if dest == "" else dest + "/"
    prefix += os.path.basename(source) + "/"
    for root, dirs, files in os.walk(source):
        for name in files:
            dir_part = os.path.relpath(root, source)
            dir_part = "" if dir_part == "." else dir_part + "/"
            file_path = os.path.join(root, name)
            blob_path = prefix + dir_part + name
            upload_file(container_client, file_path, blob_path)


def init_containers(
    service_client: BlobServiceClient, containers_directory: str
) -> None:
    """
    Iterate on the containers directory and do the following:
    1- create the container.
    2- upload all folders and files to the container.
    """
    for container_name in os.listdir(containers_directory):
        container_path = os.path.join(containers_directory, container_name)
        if os.path.isdir(container_path):
            container_client = service_client.get_container_client(container_name)
            try:
                container_client.create_container()
            except ResourceExistsError:
                pass
            for blob in os.listdir(container_path):
                blob_path = os.path.join(container_path, blob)
                if os.path.isdir(blob_path):
                    upload_dir(container_client, blob_path, "")
                else:
                    upload_file(container_client, blob_path, blob)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Initialize azurite emulator containers."
    )
    parser.add_argument(
        "--directory",
        required=True,
        help="""
        Directory that contains subdirectories named after the
        containers that we should create. Each subdirectory will contain the files
        and directories of its container.
        """
    )
    args = parser.parse_args()
    # Connect to the localhost emulator (after 5 secs to make sure it's up).
    sleep(5)
    blob_service_client = BlobServiceClient(
        account_url="http://localhost:10000/devstoreaccount1",
        credential={
            "account_name": "devstoreaccount1",
            "account_key": (
                "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq"
                "/K1SZFPTOtr/KBHBeksoGMGw=="
            )
        }
    )
    # Only initialize if not already initialized.
    if next(blob_service_client.list_containers(), None):
        print("Emulator already has containers, will skip initialization.")
    else:
        init_containers(blob_service_client, args.directory)
This script will be copied to the azurite container and will populate the initial blob containers every time the azurite container is started unless some containers were already persisted using docker volumes. In that case, nothing will happen.
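The script waits a fixed 5 seconds before connecting. If that ever proves too short or too long, one possible alternative is to poll the emulator's port until it accepts connections. The following is only a sketch using the standard library; wait_for_port is a hypothetical helper that is not part of the original script, and the timeout values are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet: wait a little and try again.
            time.sleep(interval)
    return False
```

In init_azurite.py, a call such as wait_for_port("localhost", 10000) could stand in for sleep(5). Note that an open port only shows that the process is listening, not that every API call will already succeed.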
Following is an example docker-compose.yml file:
services:
  azurite:
    build:
      context: ./
      dockerfile: Dockerfile
      args:
        AZURITE_VERSION: 3.17.0
    restart: on-failure
    ports:
      - 10000:10000
    volumes:
      - azurite-data:/opt/azurite
volumes:
  azurite-data:
Using such volumes will persist the emulator data until you destroy them (e.g. by using docker-compose down -v).
Finally, init_containers is a local directory that contains the containers and their folders/files. It will be copied to the azurite container when the image is built.
For example:
init_containers:
  container-name-1:
    dir-1:
      file.txt
      img.png
    dir-2:
      file.txt
  container-name-2:
    dir-1:
      file.txt
      img.png
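To check which containers and blob names a given init_containers directory would produce before building the image, a dry run along these lines may help. It is a sketch that mirrors the naming logic of init_azurite.py without talking to Azurite; plan_uploads is a hypothetical helper that is not part of the original answer.

```python
import os
from typing import Dict, List


def plan_uploads(containers_directory: str) -> Dict[str, List[str]]:
    """Map each container name to the sorted blob names its subdirectory would produce."""
    plan: Dict[str, List[str]] = {}
    for container_name in sorted(os.listdir(containers_directory)):
        container_path = os.path.join(containers_directory, container_name)
        if not os.path.isdir(container_path):
            continue  # only directories become containers
        blobs: List[str] = []
        for root, _dirs, files in os.walk(container_path):
            for name in files:
                # The blob name is the file path relative to the container directory.
                rel = os.path.relpath(os.path.join(root, name), container_path)
                blobs.append(rel.replace(os.sep, "/"))
        plan[container_name] = sorted(blobs)
    return plan
```

For the layout shown above, container-name-1 would map to dir-1/file.txt, dir-1/img.png and dir-2/file.txt.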

I've solved the issue by creating a custom docker image and executing azure-cli tools from a health check. There could certainly be better solutions, and I will update the accepted answer if someone posts a better solution.
In more detail
A solution to create the required data on startup is to run my own script. I chose to trigger the script from a health check I defined in docker-compose. What it does is use azure cli tools to create a container and then verify that it exists.
The script:
AZURE_STORAGE_CONNECTION_STRING="UseDevelopmentStorage=true"
export AZURE_STORAGE_CONNECTION_STRING
az storage container create -n images
az storage container show -n images
exit $?
However, the azurite image is based on alpine, which doesn't have apt, so installing azure cli was a bit tricky. So I did it the other way around, and based my image on mcr.microsoft.com/azure-cli:latest. With that done I installed Azurite like this:
RUN apk add npm
RUN npm install -g azurite --silent
All that's left is to actually run azurite, see the official azurite dockerfile for details.
It is possible to do this without azure-cli by using curl instead (which also avoids the azure-cli docker image). However, getting the authentication header to work properly was a bit complicated, so using azure-cli was easier.

Related

FileSystemException when running a Dart Shelf Docker Container

I generated a dart project with dart create -t server-shelf . --force.
On the top folder I created a json file (my_data.json) with some mock data.
In my code I am using the data from the json file like:
final _data = json.decode(File('my_data.json').readAsStringSync()) as List<dynamic>;
But if I try to start my server with docker run -it -p 8080:8080 myserver I am getting:
FileSystemException: Cannot open file, path = 'my_data.json' (OS
Error: No such file or directory, errno = 2)
My Dockerfile:
# Use latest stable channel SDK.
FROM dart:stable AS build
# Resolve app dependencies.
WORKDIR /app
COPY pubspec.* ./
RUN dart pub get
# Copy app source code (except anything in .dockerignore) and AOT compile app.
COPY . .
RUN dart compile exe bin/server.dart -o bin/server
# Build minimal serving image from AOT-compiled `/server`
# and the pre-built AOT-runtime in the `/runtime/` directory of the base image.
FROM scratch
COPY --from=build /runtime/ /
COPY --from=build /app/bin/server /app/bin/
COPY my_data.json /app/my_data.json
# Start server.
EXPOSE 8080
CMD ["/app/bin/server"]
I think this happens because you didn't set the WORKDIR for the new image that you started building FROM scratch, so the relative path my_data.json is not resolved against /app. You can fix this simply by adding WORKDIR /app again to the specification of the new image, which is the one used to run your application. It will look like this:
...
# Start server.
WORKDIR /app
EXPOSE 8080
CMD ["/app/bin/server"]
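The underlying behaviour is not specific to Dart: a relative file name is resolved against the process's current working directory, which is what WORKDIR sets for the container. A small Python sketch of the same effect follows; the temporary directory layout is invented for illustration, and load_data and demo are hypothetical names.

```python
import json
import os
import tempfile
from pathlib import Path


def load_data(relative_name: str = "my_data.json"):
    """Open a file by relative name; it is resolved against the current working directory."""
    with open(relative_name, "r", encoding="utf-8") as fh:
        return json.load(fh)


def demo() -> tuple:
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as root:
        app_dir = Path(root) / "app"
        app_dir.mkdir()
        (app_dir / "my_data.json").write_text('[{"id": 1}]', encoding="utf-8")
        try:
            os.chdir(root)  # like running without WORKDIR: the file is not here
            try:
                load_data()
                found_without_workdir = True
            except FileNotFoundError:
                found_without_workdir = False
            os.chdir(app_dir)  # like WORKDIR /app
            data = load_data()
        finally:
            os.chdir(original_cwd)
    return found_without_workdir, data
```

Here demo() fails to find the file from the parent directory and succeeds once the working directory is the app directory.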
Replace
COPY my_data.json /app/my_data.json
with
COPY --from=build app/my_data.json app/

How to decide Quarkus application arguments in Kubernetes at run-time?

I've built a Quarkus 2.7.1 console application using picocli that includes several subcommands. I'd like to be able to run this application within a Kubernetes cluster and decide its arguments at run-time. This is so that I can use the same container image to run the application in different modes within the cluster.
To get things started I added the JIB extension and tried setting the arguments using a configuration value quarkus.jib.jvm-arguments. Unfortunately it seems like this configuration value is locked at build-time so I'm unable to update this at run-time.
Next I tried setting quarkus.args while using default settings for JIB. The configuration value documentation makes it sound general enough for the job, but it doesn't seem to have an effect when the application is run in the container. Since most references to this configuration value in the documentation are in the context of Dev Mode, I'm wondering if it may be disabled outside of that.
How can I get this application running in a container image with its arguments decided at run-time?
You can set quarkus.jib.jvm-entrypoint to any container entrypoint command you want, including scripts. An example in the docs is quarkus.jib.jvm-entrypoint=/deployments/run-java.sh. You could make use of $CLI_ARGUMENTS in such a script. Even something like quarkus.jib.jvm-entrypoint=/bin/sh,-c,'/deployments/run-java.sh $CLI_ARGUMENTS' should work, as long as you place the script run-java.sh at /deployments in the image. The possibilities are endless.
Also see this SO answer if there's an issue. (The OP in the link put a custom script at src/main/jib/docker/run-java.sh (src/main/jib is Jib's default "extra files directory") so that Jib places the script in the image at /docker/run-java.sh.)
I was able to find a solution to the problem with a bit of experimenting this morning.
With the quarkus-container-image-docker extension (instead of quarkus.jib.jvm-arguments) I was able to take the template Dockerfile.jvm and extend it to pass through arguments to the CLI. The only line that needed changing was the ENTRYPOINT (details included in the snippet below). I changed the ENTRYPOINT form (from exec to shell) and added an environment variable as an argument to pass-through program arguments.
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.3
ARG JAVA_PACKAGE=java-11-openjdk-headless
ARG RUN_JAVA_VERSION=1.3.8
ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en'
# Install java and the run-java script
# Also set up permissions for user `1001`
RUN microdnf install curl ca-certificates ${JAVA_PACKAGE} \
&& microdnf update \
&& microdnf clean all \
&& mkdir /deployments \
&& chown 1001 /deployments \
&& chmod "g+rwX" /deployments \
&& chown 1001:root /deployments \
&& curl https://repo1.maven.org/maven2/io/fabric8/run-java-sh/${RUN_JAVA_VERSION}/run-java-sh-${RUN_JAVA_VERSION}-sh.sh -o /deployments/run-java.sh \
&& chown 1001 /deployments/run-java.sh \
&& chmod 540 /deployments/run-java.sh \
&& echo "securerandom.source=file:/dev/urandom" >> /etc/alternatives/jre/lib/security/java.security
# Configure the JAVA_OPTIONS, you can add -XshowSettings:vm to also display the heap size.
ENV JAVA_OPTIONS="-Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager"
# We make four distinct layers so if there are application changes the library layers can be re-used
COPY --chown=1001 target/quarkus-app/lib/ /deployments/lib/
COPY --chown=1001 target/quarkus-app/*.jar /deployments/
COPY --chown=1001 target/quarkus-app/app/ /deployments/app/
COPY --chown=1001 target/quarkus-app/quarkus/ /deployments/quarkus/
EXPOSE 8080
USER 1001
# [== BEFORE ==]
# ENTRYPOINT [ "/deployments/run-java.sh" ]
# [== AFTER ==]
ENTRYPOINT "/deployments/run-java.sh" $CLI_ARGUMENTS
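With the shell form, the shell expands the unquoted $CLI_ARGUMENTS and splits it on whitespace before the script is invoked. The sketch below models that splitting in Python to show what argument list the script would receive. It is an approximation only: build_argv is a hypothetical helper, and a real shell would additionally apply pathname (glob) expansion.

```python
from typing import List, Mapping


def build_argv(env: Mapping[str, str]) -> List[str]:
    """Approximate the argv produced by: /deployments/run-java.sh $CLI_ARGUMENTS"""
    # Unquoted expansion splits on whitespace; quote characters inside the
    # variable's value are NOT interpreted, they stay part of the words.
    return ["/deployments/run-java.sh"] + env.get("CLI_ARGUMENTS", "").split()


if __name__ == "__main__":
    print(build_argv({"CLI_ARGUMENTS": "import --file data.csv"}))
    print(build_argv({}))  # an unset variable simply adds no arguments
```

One consequence of this design is that a single argument containing spaces cannot be passed through the variable as-is, since embedded quotes do not group words.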
I tried the above approaches, but they didn't work with the default ubi8/openjdk-17-runtime base image used by Quarkus JIB. This is because this base image uses /home/jboss as its WORKDIR instead of /work.
Therefore, I created a custom start-up script and referenced it in the properties file as follows. This approach works better if there's a need to set application params using environment variables:
File: application.properties
quarkus.jib.jvm-entrypoint=/bin/sh,run-java.sh
File: src/main/jib/home/jboss/run-java.sh
java \
-Djavax.net.ssl.trustStore=/deployments/truststore \
-Djavax.net.ssl.trustStorePassword="$TRUST_STORE_PASSWORD" \
-jar quarkus-run.jar

How to run a python function or script somescript.py on the KubernetesPodOperator in airflow?

I am running a Celery Executor and I'm trying to run some python script in the KubernetesPodOperator. Below are examples of what I have tried that didn't work. What am I doing wrong?
Running script
org_node = KubernetesPodOperator(
    namespace='default',
    image="python",
    cmds=["python", "somescript.py" "-c"],
    arguments=["print('HELLO')"],
    labels={"foo": "bar"},
    image_pull_policy="Always",
    name=task,
    task_id=task,
    is_delete_operator_pod=False,
    get_logs=True,
    dag=dag
)
Running function load_users_into_table()
def load_users_into_table(postgres_hook, schema, path):
    gdf = read_csv(path)
    gdf.to_sql('users', con=postgres_hook.get_sqlalchemy_engine(), schema=schema)

org_node = KubernetesPodOperator(
    namespace='default',
    image="python",
    cmds=["python", "somescript.py" "-c"],
    arguments=[load_users_into_table],
    labels={"foo": "bar"},
    image_pull_policy="Always",
    name=task,
    task_id=task,
    is_delete_operator_pod=False,
    get_logs=True,
    dag=dag
)
The script somescript.py must be in the Docker image.
Step-1: Let's create an image (see https://docs.docker.com/develop/develop-images/dockerfile_best-practices/).
FROM python:3.8
# copy requirement.txt from local to container
COPY requirements.txt requirements.txt
# install dependencies into container (geopandas, sqlalchemy)
RUN pip install -r requirements.txt
# copy the python script from local to container
COPY somescript.py somescript.py
ENTRYPOINT [ "python", "somescript.py"]
Step-2: Build the image and push it to a public Docker repository such as https://hub.docker.com.
NB: unless told otherwise, KubernetesPodOperator looks for the image in the public Docker repository.
# build image
docker build -t my-python-img:latest .
# test if your image works perfectly
docker run my-python-img:latest
# push image.
docker tag my-python-img username/my-python-img
docker push username/my-python-img
docker pull username/my-python-img
Step-3: Let's create the k8s task.
p = KubernetesPodOperator(
    namespace='default',
    image='username/my-python-img:latest',
    labels={'dag-id': dag.dag_id},
    name='airflow-my-image-pod',
    task_id='load-users',
    in_cluster=False,  # False: local, True: cluster
    cluster_context='microk8s',
    config_file='/usr/local/airflow/include/.kube/config',
    is_delete_operator_pod=True,
    get_logs=True,
    dag=dag
)
If you don't understand where the configuration file comes from, look here: https://www.astronomer.io/docs/cloud/stable/develop/kubepodoperator-local.
Finally: I want to mention something important when working with databases (credentials). Kubernetes offers Secrets to protect sensitive information. See https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/operators.html
KubernetesPodOperator launches a Kubernetes pod that runs a container as specified in the operator's arguments.
First Example
In the first example, the following happens:
KubernetesPodOperator instructs K8s to launch a pod and prepare to run a container in it using the python image (the image parameter) from hub.docker.com (the default image registry)
ENTRYPOINT of the python image is replaced by ["python", "somescript.py" "-c"] (the cmds parameter)
CMD of the python image is replaced by ["print('HELLO')"] (the arguments parameter)
...
The container is run
So, the complete command that is run in the container is
python somescript.py -c print('HELLO')
Obviously, the official Python image from Docker Hub does not have somescript.py in its working directory. Even if it did, it probably would not be the one that you wrote. That is why the command fails with something like:
python: can't open file 'somescript.py': [Errno 2] No such file or directory
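As a side note on the first example, the way cmds and arguments end up as one command line can be modelled in plain Python. This is only a sketch of the concatenation, not Airflow code, and container_command is a hypothetical helper. It also shows a detail that is easy to miss: as literally typed in the question, the adjacent string literals "somescript.py" "-c" have no comma between them, so Python joins them into a single list element.

```python
from typing import List, Sequence

# Values as written in the question (note the missing comma in cmds).
cmds = ["python", "somescript.py" "-c"]
arguments = ["print('HELLO')"]


def container_command(cmds: Sequence[str], arguments: Sequence[str]) -> List[str]:
    """Model the container's argv: the entrypoint (cmds) followed by its args (arguments)."""
    return list(cmds) + list(arguments)


if __name__ == "__main__":
    # Adjacent string literals concatenate, so cmds has two elements, not three.
    print(len(cmds))
    print(container_command(cmds, arguments))
```

Either way, the container would be asked to run a script file that does not exist in the stock python image, which matches the failure described above.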
Second Example
In the second example, pretty much the same happens as in the first example, but the command that is run in the container (again based on the cmds and arguments parameters) is
python somescript.py -c None
(None is the string representation of the load_users_into_table()'s return value)
This command fails, because of the same reasons as in the first example.
How It Could be Done (a Sketch)
You could build a Docker image with somescript.py and all its dependencies. Push the image to an image registry. Specify the image, ENTRYPOINT, and CMD in the corresponding parameters of KubernetesPodOperator.

Docker COPY error when copying files from host to container

In the following Dockerfile I'm trying to copy a jar file from a location on the host into the container, but Docker rejects it, so I guess I'm missing something. Here is my Dockerfile:
FROM anapsix/alpine-java:jdk8
MAINTAINER joesan
ENV SBT_VERSION 0.13.15
ENV CHECKSUM 18b106d09b2874f2a538c6e1f6b20c565885b2a8051428bd6d630fb92c1c0f96
ENV APP_NAME my-app
ENV PROJECT_HOME /opt/apps
RUN mkdir -p $PROJECT_HOME/$APP_NAME
# Copy the jar file
COPY ./target/scala-*/my-app-*.jar $PROJECT_HOME/$APP_NAME
# Copy the database file
COPY .my-db.mv.db $PROJECT_HOME/$APP_NAME
# Run the application
CMD ["$PROJECT_HOME/$APP_NAME java -Denv=dev -jar my-app-*.jar"]
In my build pipeline, I could see the following error message:
Step 8/10 : COPY ./target/scala-*/my-app-*.jar $PROJECT_HOME/$APP_NAME
COPY failed: no source files were specified
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 4a240742a379 Less than a second ago 171MB
anapsix/alpine-java jdk8 ed55c27d366d 3 years ago 171MB
Error response from daemon: No such image: [secure]
Pushing image [secure] to repository hub.docker.com
The push refers to repository [docker.io/[secure]/my-app]
An image does not exist locally with the tag: [secure]/my-app
What is it that I'm missing, and how could I debug this? I mean, I could add some echo statements to print out the path, but I'm not sure why I face this error!
This is probably because the target folder is not in the "./" folder of the build context. That can happen because it's ignored by the .dockerignore file, or because the build context does not point to the parent folder of the target folder.
In case you are not familiar with build context, it's explained here
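One way to debug this before running docker build is to check, from the directory you pass as the build context, whether the COPY pattern matches anything at all. The sketch below uses Python's glob module as an approximation; Docker itself matches with Go's filepath.Match rules and also applies .dockerignore, which this sketch does not model. matching_sources is a hypothetical helper.

```python
import glob
import os
from typing import List


def matching_sources(context_dir: str, pattern: str) -> List[str]:
    """Return the paths under context_dir that match pattern, relative to context_dir."""
    matches = glob.glob(os.path.join(context_dir, pattern))
    return sorted(os.path.relpath(m, context_dir).replace(os.sep, "/") for m in matches)


if __name__ == "__main__":
    # An empty list here corresponds to "COPY failed: no source files were specified".
    print(matching_sources(".", "./target/scala-*/my-app-*.jar"))
```

If the list is empty in the pipeline but not locally, the jar was probably not built (or not present) in the workspace that the pipeline uses as the build context.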

Cypress and Docker - Can't run because no spec files were found

I'm trying to run cypress tests inside a docker container. I've simplified my setup so I can just try to get a simple container instance running and a few tests executed.
I'm using docker-compose
version: '2.1'
services:
  e2e:
    image: test/e2e:v1
    command: ["./node_modules/.bin/cypress", "run", "--spec", "integration/mobile/all.js"]
    build:
      context: .
      dockerfile: Dockerfile-cypress
    container_name: cypress
    network_mode: host
and my Dockerfile-cypress
FROM cypress/browsers:chrome69
RUN mkdir /usr/src/app
WORKDIR /usr/src/app
RUN npm install cypress@3.1.0
COPY cypress /usr/src/app/cypress
COPY cypress.json /usr/src/app/cypress
RUN ./node_modules/.bin/cypress verify
when I run docker-compose up after I build my image I see
cypress | name_to_handle_at on /dev: Operation not permitted
cypress | Can't run because no spec files were found.
cypress |
cypress | We searched for any files matching this glob pattern:
cypress |
cypress | integration/mobile/all-control.js
cypress exited with code 1
I've verified that my cypress files and folders have been copied over and I can verify that my test files exist. I've been stuck on this for a while and I'm unsure what to do besides giving up.
Any guidance appreciated.
Turns out cypress automatically checks for /cypress/integration folder. Moving all my cypress files inside this folder got it working.
The problem: No cypress specs files were found on your automation suite.
Solution: Cypress Test files are located in cypress/integration by default, but can be configured to another directory.
In this folder insert your suites per section:
for example:
- cypress/integration/billing-scenarios-suite
- cypress/integration/user-management-suite
- cypress/integration/proccesses-and-handlers-suite
I assume that these suite directories contain subdirectories (each representing some small piece of logic), so you need to match recursively to gather all the files:
cypress run --spec \
cypress/integration/<your-suite>/**/* , \
cypress/integration/<your-suite>/**/**/*
If you run Cypress in Docker, verify that the volume containing the tests is mapped and mounted into the container (in the volumes section of the Dockerfile / docker-compose.yml file), and then run it:
docker exec -it <container-id> cypress run --spec \
cypress/integration/<your-suite>/**/* , \
cypress/integration/<your-suite>/**/**/*
I noticed that if you use the CLICK AND DRAG method to get the file path in VS Code, it generates the path with a SMALL c drive letter, and this causes the error: "Can't run because no spec files were found. + We searched for specs matching this glob pattern:"
e.g. by click and drag I get:
cypress run --spec c:\Users\dmitr\Desktop\cno-dma-replica-for-cy-test\cypress\integration\dma-playground.spec.js
Notice the SMALL c in the path above.
BUT if I use right click 'get path', I get a BIG C, and it works for some reason:
cypress run --spec C:\Users\dmitr\Desktop\cno-dma-replica-for-cy-test\cypress\integration\dma-playground.spec.js
It's strange, I know, but there you go.
but if you just use: