For years we have built base PHP-FPM container images locally with code like this to include Oracle DB support:
ARG PHP_VERSION=7.4
ARG PHP_TYPE=fpm
FROM php:${PHP_VERSION}-${PHP_TYPE}
ENV LD_LIBRARY_PATH /usr/local/instantclient
ENV ORACLE_BASE /usr/local/instantclient
ENV ORACLE_HOME /usr/local/instantclient
ENV TNS_ADMIN /etc/oracle
COPY oracle /etc/oracle
RUN echo 'instantclient,/usr/local/instantclient' | pecl install oci8-${OCI8_VERSION} \
&& docker-php-ext-configure oci8 --with-oci8=instantclient,/usr/local/instantclient \
&& docker-php-ext-install oci8 \
&& docker-php-ext-configure pdo_oci --with-pdo-oci=instantclient,/usr/local/instantclient \
&& docker-php-ext-install pdo_oci \
&& rm -rf /tmp/pear
From this image we build application-specific images that are deployed to a Kubernetes cluster, and the TNS_ADMIN variable and its value have persisted without issue.
We recently changed how the images are built (using Kaniko and GitLab CI instead of building them locally) and found that when the image is deployed to the Kubernetes cluster (via Helm), the TNS_ADMIN variable is missing entirely (the variable itself is absent, not just blank). We also changed how the Oracle pieces are installed (using docker-php-extension-installer), so the pertinent Dockerfile code now looks like this:
ADD https://github.com/mlocati/docker-php-extension-installer/releases/latest/download/install-php-extensions /usr/local/bin/
RUN chmod +x /usr/local/bin/install-php-extensions && \
install-php-extensions oci8 pdo_oci
# Oracle client config
ENV TNS_ADMIN=/etc/oracle
COPY php.cli/oracle /etc/oracle
And, here is the GitLab CI Kaniko related code to build the application specific images (only the $PHP_TYPE applies to the image in question):
- |
LOCAL_REPOSITORY=${CI_REGISTRY}/<internal namespace path>/$REPOSITORY
# Build config.json for credentials
echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":\"$(printf "%s:%s" "${CI_REGISTRY_USER}" "${CI_REGISTRY_PASSWORD}" | base64 | tr -d '\n')\"}}}" > /kaniko/.docker/config.json
/kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/$DOCKER_FILE_PATH/Dockerfile --build-arg PHP_VERSION=$PHP_VERSION --build-arg PHP_TYPE=$PHP_TYPE --build-arg PHPUNIT_VERSION=$PHPUNIT_VERSION --build-arg PHPCS_VERSION=$PHPCS_VERSION --build-arg PHPCSFIXER_VERSION=$PHPCSFIXER_VERSION --destination $LOCAL_REPOSITORY:$PHP_VERSION-$TAG_NAME
Thinking this was possibly due to how Kaniko works, or the changes to the Oracle install process, we pulled the base image and application image separately and ran them with a bash shell. When pulled locally, the TNS_ADMIN variable is present. This suggests whatever is occurring is happening once Helm deploys it to the cluster.
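The local check described above can also be done without opening a shell in the container. The following is a minimal sketch, assuming the image's configured environment is printed one KEY=VALUE pair per line; the helper name has_env and the image name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Reads KEY=VALUE lines on stdin; succeeds if the named variable is defined.
has_env() {
  grep -q "^$1="
}

# Hypothetical usage (needs Docker; the image name is a placeholder):
#   docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' \
#     registry.example.com/app:7.4-fpm | has_env TNS_ADMIN \
#     || echo "TNS_ADMIN is not baked into this image"
```

The same helper can be fed the output of printenv from a running pod, which makes it easy to compare what the image declares with what the cluster actually runs.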
What is vexing is that, on the surface, neither change should affect an environment variable set in the image, yet they are the only changes that coincide with the issue. The problem therefore seems to arise when the image is deployed to our cluster, even though that process has not changed at all. The Helm chart is unchanged, which suggests it is not the cause; that being said, the issue only appears once Helm deploys the chart that uses the image.
Has anyone else seen something like this, or have any ideas where to center our search for answers?
Well, our issue was one that probably bites many people running applications in Kubernetes: the image pull policy for our Helm deployment was set to IfNotPresent, so a cached image without the ENV value was being used (that image had been built from a Dockerfile that did not set TNS_ADMIN). We have a lot of moving parts in our process, and the stale image hid several of the changes we had made.
I am of course chastened by this explanation and so I will offer the advice to always make sure you are pulling a fresh image as the first step in troubleshooting issues with Kubernetes/Helm deployments.
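One way to confirm that a pod is running a stale cached image is to compare the digest the pod reports with the digest of the image that was just built. This is a rough sketch, not the exact procedure we used; the helper names, pod name, and image name are all hypothetical.

```shell
#!/bin/sh
# Strip everything up to and including the last '@', leaving "sha256:...".
digest_of() {
  printf '%s\n' "${1##*@}"
}

# Succeeds (exit 0) when both image references point at the same digest.
same_image() {
  [ "$(digest_of "$1")" = "$(digest_of "$2")" ]
}

# Hypothetical usage against a live cluster (not run here):
#   running=$(kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[0].imageID}')
#   built=$(docker inspect --format '{{index .RepoDigests 0}}' registry.example.com/app:7.4-fpm)
#   same_image "$running" "$built" || echo "pod is running a stale image"
```

If the digests differ, setting the pull policy to Always (or deploying a unique tag per build) forces the node to fetch the new image.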
Is it possible with Docker to combine two images into one?
Like this here:
genericA --
           \
            ---> specificAB
           /
genericB --
For example there's an image for Java and an image for MySQL.
I'd like to have an image with Java and MySQL.
No, you can only inherit from one image.
You probably don't want Java and MySQL in the same image as it's more idiomatic to have a single component in a container i.e. create a separate MySQL container and link it to the Java container rather than put both into the same container.
However, if you really must have them in the same image, write a Dockerfile with Java as the base image (FROM statement) and install MySQL in the Dockerfile. You should be able to largely copy the statements from the official MySQL Dockerfile.
Docker doesn't directly support this, but you can use DockerMake (full disclosure: I wrote it) to manage this sort of "inheritance". It uses a YAML file to set up the individual pieces of the image, then drives the build by generating the appropriate Dockerfiles.
Here's how you would build this slightly more complicated example:
                             --> genericA --
                            /                \
debian:jessie --> customBase                  ---> specificAB
                            \                /
                             --> genericB --
You would use this DockerMake.yml file:
specificAB:
  requires:
    - genericA
    - genericB

genericA:
  requires:
    - customBase
  build_directory: [some local directory]
  build: |
    #Dockerfile commands go here, such as
    ADD installA.sh .
    RUN ./installA.sh

genericB:
  requires:
    - customBase
  build: |
    #Here are some other commands you could run
    RUN apt-get install -y genericB
    ENV PATH=$PATH:something

customBase:
  FROM: debian:jessie
  build: |
    RUN apt-get update && apt-get install -y build-essential
After installing the docker-make CLI tool (pip install dockermake), you can then build the specificAB image just by running
docker-make specificAB
If you use docker commit, it is hard to see which commands were used to build your image; you have to run docker history on the image.
If you have a Dockerfile, you can just look at it to see how the image was built and what it contains.
docker commit is done by hand and is therefore prone to errors; docker build with a working Dockerfile is much better.
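You can put multiple FROM commands in a single Dockerfile, but note that each FROM starts a new build stage rather than merging the images: only the last stage becomes the final image, and files from earlier stages have to be copied in explicitly with COPY --from.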
https://docs.docker.com/reference/builder/#from
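To illustrate what multiple FROM lines do in practice, here is a small sketch that writes a two-stage Dockerfile to a scratch directory. The image tags (eclipse-temurin:17-jre, mysql:8.0) and the Java path /opt/java/openjdk are assumptions chosen for the example, and whether the copied runtime actually works depends on the two images having compatible system libraries.

```shell
#!/bin/sh
# Write an illustrative two-stage Dockerfile into a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/Dockerfile" <<'EOF'
# Stage 1: only used as a source of files.
FROM eclipse-temurin:17-jre AS java

# Stage 2: the final image is based on this FROM alone.
FROM mysql:8.0
ENV JAVA_HOME=/opt/java/openjdk
# Copy the Java runtime out of the first stage into the final image.
COPY --from=java $JAVA_HOME $JAVA_HOME
ENV PATH="${JAVA_HOME}/bin:${PATH}"
EOF
echo "Dockerfile written to $workdir/Dockerfile"
# To build it (requires Docker): docker build -t java-mysql "$workdir"
```

Nothing from the first stage ends up in the result except what COPY --from brings across, which is why this is not a true merge of two images.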
I don't want to have to deploy a whole other ECS service just to enable X-Ray. I'm hoping I can run X-Ray in the same Docker container as my app; I would have thought that was the preferred way of running it. I know there might be some data loss if my container dies, but I don't much care about that. I'm trying to stop this proliferation of extra services that serve only analytical/logging functions. I already have a logstash container I'm not happy about, and my feeling is that apps themselves should be able to do this sort of stuff.
While we have the Dockerhub image of the X-Ray Daemon, you can absolutely run the daemon in the same docker container as your application - that shouldn't be an issue.
Here's the typical setup with the daemon dockerfile and task definition instructions:
https://docs.aws.amazon.com/xray/latest/devguide/xray-daemon-ecs.html
I imagine you can simply omit the task definition attributes around the daemon, since it would be running locally beside your application - those wouldn't be used at all.
So I think the proper way to do this is using supervisord, see link for an example of that, but I ended up just making a very simple script:
# start.sh
/usr/bin/xray &
$CATALINA_HOME/bin/catalina.sh run
And then having a Dockerfile:
FROM tomcat:9-jdk11-openjdk
RUN apt-get update && apt-get install -y unzip
RUN curl -o daemon.zip https://s3.dualstack.us-east-2.amazonaws.com/aws-xray-assets.us-east-2/xray-daemon/aws-xray-daemon-linux-3.x.zip
RUN unzip daemon.zip && cp xray /usr/bin/xray
# COPY APPLICATION
# TODO
COPY start.sh /usr/bin/start.sh
RUN chmod +x /usr/bin/start.sh
CMD ["/bin/bash", "/usr/bin/start.sh"]
I think I will look at using supervisord next time.
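For anyone who wants to stay with a plain script for now, here is a slightly more careful variant of start.sh as a sketch. It assumes a POSIX shell; the function name run_with_sidecar is made up for this example. It starts the helper in the background, runs the main process in the foreground, and stops the helper when the main process exits.

```shell
#!/bin/sh
# run_with_sidecar SIDECAR_CMD MAIN_CMD [ARGS...]
# SIDECAR_CMD is a single string that is word-split into a command line.
run_with_sidecar() {
  sidecar=$1
  shift
  $sidecar &                                # start the helper in the background
  sidecar_pid=$!
  status=0
  "$@" || status=$?                         # run the main process in the foreground
  kill "$sidecar_pid" 2>/dev/null || true   # stop the helper once the main process ends
  wait "$sidecar_pid" 2>/dev/null || true
  return "$status"
}

# Hypothetical usage inside start.sh:
#   run_with_sidecar /usr/bin/xray "$CATALINA_HOME/bin/catalina.sh" run
```

This still does not forward signals such as the SIGTERM sent by docker stop to the main process, which is one of the things a real process manager like supervisord takes care of.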
I am trying to debug a Java app on a GKE cluster through Stackdriver.
I have created a GKE cluster with "Allow full access to all Cloud APIs".
I am following the documentation: https://cloud.google.com/debugger/docs/setup/java
Here is my Dockerfile:
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG JAR_FILE
COPY ${JAR_FILE} alnt-watchlist-microservice.jar
ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/alnt-watchlist-microservice.jar"]
The documentation says to add the following lines to the Dockerfile:
RUN mkdir /opt/cdbg && \
wget -qO- https://storage.googleapis.com/cloud-debugger/compute-java/debian-wheezy/cdbg_java_agent_gce.tar.gz | \
tar xvz -C /opt/cdbg
RUN java -agentpath:/opt/cdbg/cdbg_java_agent.so
-Dcom.google.cdbg.module=tpm-watchlist
-Dcom.google.cdbg.version=v1
-jar /alnt-watchlist-microservice.jar
When I build the Dockerfile, it fails with tar: invalid magic, tar: short read.
In the Stackdriver Debug console, it always shows 'No deployed application found'. Which application will it show? I already have 2 services deployed on my Kubernetes cluster.
I have already executed
gcloud debug source gen-repo-info-file --output-directory="WEB-INF/classes/"
in my project's directory.
It generated source-context.json. After creating it, I tried building the Docker image and it's failing.
The debugger will be ready for use when you deploy your containerized app. You are getting the No deployed application found error because your debugger agent is failing to download or unpack in the Dockerfile.
Please check this discussion to resolve the tar: invalid magic, tar: short read error.
Unfortunately it looks like Alpine isn't regularly tested with Debugger. There's a sample setup here that might help you: https://github.com/GoogleCloudPlatform/cloud-debug-java#alpine-linux
I resolved the issue.
Firstly, you will have to use the Java image "gcr.io/google-appengine/openjdk" instead of the Alpine one.
Secondly, I had been writing the ENTRYPOINT arguments without separating them with commas (basically in the wrong format):
ENTRYPOINT ["java","-agentpath:/opt/cdbg/cdbg_java_agent.so", "-Djava.security.egd=file:/dev/./urandom" ,"-Dcom.google.cdbg.module=watchlist"]
I'm aware that if I change my Dockerfile or build directory, I'm supposed to run docker-compose build. This surely implies that docker-compose has some cache somewhere of its already-built images.
Where is it? How do I purge it?
I'd like to get back to a state where docker-compose up is forced to do the initial build steps, without me needing to remember to run docker-compose build.
I've run docker stop $(docker ps -aq) and docker X prune (for X in container, image, volume, network), but docker-compose up still refuses to run the build steps in my Dockerfile.
Or am I completely misunderstanding how docker-compose works?
You can pass an additional argument (--no-cache) to skip the cache during the build process.
docker@default:~$ docker-compose build --help
Build or rebuild services.
Services are built once and then tagged as `project_service`,
e.g. `composetest_db`. If you change a service's `Dockerfile` or the
contents of its build directory, you can run `docker-compose build` to rebuild it.
Usage: build [options] [--build-arg key=val...] [SERVICE...]
Options:
--compress Compress the build context using gzip.
--force-rm Always remove intermediate containers.
--no-cache Do not use cache when building the image.
--pull Always attempt to pull a newer version of the image.
-m, --memory MEM Sets memory limit for the build container.
--build-arg key=val Set build-time variables for services.
docker@default:~$
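Putting the flags from the help text together, a small wrapper can force a clean rebuild before starting the services. This is only a sketch: the function names run and fresh_up and the DRY_RUN switch are inventions for this example, while the docker-compose flags are the ones listed above.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Rebuild the given services (or all of them) without the cache, then start them.
fresh_up() {
  run docker-compose build --no-cache --pull "$@" &&
    run docker-compose up "$@"
}

# Usage: fresh_up            # all services
#        fresh_up web        # a single service (name is a placeholder)
```

If ignoring the cache is not required and the goal is only to avoid a separate build step, docker-compose up --build rebuilds the images before starting the containers.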
docker-compose uses images, which you can see with docker images:
$ docker images
REPOSITORY      TAG       IMAGE ID       CREATED        SIZE
docker_ubuntu   latest    a7dc4f9bbdfb   19 hours ago   158MB
ubuntu          16.04     0b1edfbffd27   3 weeks ago    113MB
hello-world     latest    f2a91732366c   6 months ago   264MB
The docker-compose images are prefixed with (usually) the name of the directory you're running docker-compose in. So, for me, the docker_ubuntu image.
docker image prune only removes dangling (untagged) images by default, so it doesn't prune these tagged images.
To get rid of the docker-compose image, you need to delete it explicitly:
docker image rm docker_ubuntu
I have an ASP.NET Core 2.0 application whose Docker image runs fine locally, but when that same image is deployed to an AKS cluster, the pods have a status of CrashLoopBackOff and the pod log shows:
Did you mean to run dotnet SDK commands? Please install dotnet SDK from:
http://go.microsoft.com/fwlink/?LinkID=798306&clcid=0x409.
And since you can't ssh to AKS clusters, it's pretty difficult to figure out what is going on.
Dockerfile:
FROM microsoft/aspnetcore:2.0
WORKDIR /app
COPY . .
EXPOSE 80
ENTRYPOINT ["dotnet", "myapi.dll"]
Turned out that our build system wasn't putting the app code into the container as we thought. Since the container wasn't runnable, I didn't know how to inspect its contents until I found this command which is a lifesaver for these kinds of situations:
docker run --rm -it --entrypoint=/bin/bash [image_id]
... at which point you can freely inspect and verify the contents of the container.
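If the image has no shell at all, its file listing can still be checked without running it. The sketch below assumes the listing comes from docker export piped through tar; the helper name listing_has and the image name are hypothetical, and the path is written without a leading slash because that is how tar lists the exported filesystem.

```shell
#!/bin/sh
# Reads a file listing on stdin (one path per line);
# succeeds if the given path appears exactly.
listing_has() {
  grep -Fqx "$1"
}

# Hypothetical usage (needs Docker; image name is a placeholder):
#   cid=$(docker create myregistry/myapi:latest)
#   docker export "$cid" | tar -tf - | listing_has app/myapi.dll \
#     || echo "myapi.dll is missing from the image"
#   docker rm "$cid"
```

A check like this can run in the build pipeline right after the image is built, so an empty /app folder is caught before the image reaches the cluster.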
I just ran into the same issue and it's because I was missing a key piece to the puzzle.
docker-compose -f docker-compose.ci.build.yml run ci-build
VS2017 Docker Tools will create that docker-compose.ci.build.yml file. After that command is run, the publish folder is populated and docker build -t <tag> will build a populated image (rather than one with an empty /app folder).