I have set up a local workstation with CentOS 8 and installed Docker CE and everything needed for a remote development environment (Samba, firewall rules, etc.).
My client machine is a Mac, and the application is based on a docker-compose file with several containers; with the Docker for Mac application it works correctly.
After configuring PyCharm Professional with the remote interpreter and starting the project, two containers start perfectly (postgres:11.2-alpine, redis:alpine), but the application container, a Django REST service with its own Dockerfile, throws the error attached below as soon as it reaches the entrypoint. However, if I SSH into the server and run docker-compose up --build myself, everything starts fine, so it seems to be related to launching it from PyCharm.
Has anyone had this problem and been able to solve it?
Successfully tagged energy_energy:latest
Creating energy_redis-celery-energy_1 ...
Creating energy_redis-cache-energy_1 ...
Creating energy_postgres_1 ...
Creating energy_celery-worker-energy_1 ...
Creating energy_celery-beat-energy_1 ...
Creating energy_celery-beat-energy_1 ... error
ERROR: for energy_celery-beat-energy_1 Cannot start service celery-beat-energy: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"./start.sh\": stat ./start.sh: no such file or directory": unknown
Creating energy_celery-worker-energy_1 ... error
ERROR: for energy_celery-worker-energy_1 Cannot start service celery-worker-energy: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"./start.sh\": stat ./start.sh: no such file or directory": unknown
ERROR: for celery-beat-energy Cannot start service celery-beat-energy: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"./start.sh\": stat ./start.sh: no such file or directory": unknown
ERROR: for celery-worker-energy Cannot start service celery-worker-energy: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"./start.sh\": stat ./start.sh: no such file or directory": unknown
ERROR: Encountered errors while bringing up the project
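A note on reading this error: the runtime resolves the entrypoint "./start.sh" relative to the container's working directory (/app in the Dockerfile below), so the file really is not visible inside the container at the moment it starts. A quick way to see what the failed container was actually given (the invocation is an assumption; the container name is taken from the output above):
docker inspect energy_celery-beat-energy_1 --format '{{json .Mounts}}'
# shows which host directory, if any, ended up bind-mounted over /app for that run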
This is my entire Dockerfile:
FROM python:3.7-alpine
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV CERTS_DIR /var/certs
ENV NGINX_VERSION 1.15.3
EXPOSE 1234
RUN apk update \
&& GPG_KEYS=B0F4253373F8F6F510D42178520A9993A1C052F8 \
&& CONFIG="\
--prefix=/etc/nginx \
--sbin-path=/usr/sbin/nginx \
--modules-path=/usr/lib/nginx/modules \
--conf-path=/etc/nginx/nginx.conf \
--error-log-path=/var/log/nginx/error.log \
--http-log-path=/var/log/nginx/access.log \
--pid-path=/var/run/nginx.pid \
--lock-path=/var/run/nginx.lock \
--http-client-body-temp-path=/var/cache/nginx/client_temp \
--http-proxy-temp-path=/var/cache/nginx/proxy_temp \
--http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp \
--http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp \
--http-scgi-temp-path=/var/cache/nginx/scgi_temp \
--user=nginx \
--group=nginx \
--with-http_ssl_module \
--with-http_realip_module \
--with-http_addition_module \
--with-http_sub_module \
--with-http_dav_module \
--with-http_flv_module \
--with-http_mp4_module \
--with-http_gunzip_module \
--with-http_gzip_static_module \
--with-http_random_index_module \
--with-http_secure_link_module \
--with-http_stub_status_module \
--with-http_auth_request_module \
--with-http_xslt_module=dynamic \
--with-http_image_filter_module=dynamic \
--with-http_geoip_module=dynamic \
--with-threads \
--with-stream \
--with-stream_ssl_module \
--with-stream_ssl_preread_module \
--with-stream_realip_module \
--with-stream_geoip_module=dynamic \
--with-http_slice_module \
--with-mail \
--with-mail_ssl_module \
--with-compat \
--with-file-aio \
--with-http_v2_module \
" \
&& addgroup -S nginx \
&& adduser -D -S -h /var/cache/nginx -s /sbin/nologin -G nginx nginx \
&& apk add tzdata \
gcc \
libc-dev \
linux-headers \
mariadb-dev \
postgresql-dev \
netcat-openbsd \
curl \
libffi-dev \
supervisor \
&& apk add --no-cache --virtual .build-deps \
make \
openssl-dev \
pcre-dev \
zlib-dev \
gnupg1 \
libxslt-dev \
gd-dev \
geoip-dev \
&& curl -fSL https://nginx.org/download/nginx-$NGINX_VERSION.tar.gz -o nginx.tar.gz \
&& curl -fSL https://nginx.org/download/nginx-$NGINX_VERSION.tar.gz.asc -o nginx.tar.gz.asc \
&& export GNUPGHOME="$(mktemp -d)" \
&& found=''; \
for server in \
ha.pool.sks-keyservers.net \
hkp://keyserver.ubuntu.com:80 \
hkp://p80.pool.sks-keyservers.net:80 \
pgp.mit.edu \
; do \
echo "Fetching GPG key $GPG_KEYS from $server"; \
gpg --keyserver "$server" --keyserver-options timeout=10 --recv-keys "$GPG_KEYS" && found=yes && break; \
done; \
test -z "$found" && echo >&2 "error: failed to fetch GPG key $GPG_KEYS" && exit 1; \
gpg --batch --verify nginx.tar.gz.asc nginx.tar.gz \
&& rm -rf "$GNUPGHOME" nginx.tar.gz.asc \
&& mkdir -p /usr/src \
&& tar -zxC /usr/src -f nginx.tar.gz \
&& rm nginx.tar.gz \
&& cd /usr/src/nginx-$NGINX_VERSION \
&& ./configure $CONFIG --with-debug \
&& make -j$(getconf _NPROCESSORS_ONLN) \
&& mv objs/nginx objs/nginx-debug \
&& mv objs/ngx_http_xslt_filter_module.so objs/ngx_http_xslt_filter_module-debug.so \
&& mv objs/ngx_http_image_filter_module.so objs/ngx_http_image_filter_module-debug.so \
&& mv objs/ngx_http_geoip_module.so objs/ngx_http_geoip_module-debug.so \
&& mv objs/ngx_stream_geoip_module.so objs/ngx_stream_geoip_module-debug.so \
&& ./configure $CONFIG \
&& make -j$(getconf _NPROCESSORS_ONLN) \
&& make install \
&& rm -rf /etc/nginx/html/ \
&& mkdir /etc/nginx/conf.d/ \
&& mkdir -p /usr/share/nginx/html/ \
&& install -m644 html/index.html /usr/share/nginx/html/ \
&& install -m644 html/50x.html /usr/share/nginx/html/ \
&& install -m755 objs/nginx-debug /usr/sbin/nginx-debug \
&& install -m755 objs/ngx_http_xslt_filter_module-debug.so /usr/lib/nginx/modules/ngx_http_xslt_filter_module-debug.so \
&& install -m755 objs/ngx_http_image_filter_module-debug.so /usr/lib/nginx/modules/ngx_http_image_filter_module-debug.so \
&& install -m755 objs/ngx_http_geoip_module-debug.so /usr/lib/nginx/modules/ngx_http_geoip_module-debug.so \
&& install -m755 objs/ngx_stream_geoip_module-debug.so /usr/lib/nginx/modules/ngx_stream_geoip_module-debug.so \
&& ln -s ../../usr/lib/nginx/modules /etc/nginx/modules \
&& strip /usr/sbin/nginx* \
&& strip /usr/lib/nginx/modules/*.so \
&& rm -rf /usr/src/nginx-$NGINX_VERSION \
\
&& apk add --no-cache --virtual .gettext gettext \
&& mv /usr/bin/envsubst /tmp/ \
\
&& runDeps="$( \
scanelf --needed --nobanner --format '%n#p' /usr/sbin/nginx /usr/lib/nginx/modules/*.so /tmp/envsubst \
| tr ',' '\n' \
| sort -u \
| awk 'system("[ -e /usr/local/lib/" $1 " ]") == 0 { next } { print "so:" $1 }' \
)" \
&& apk add --no-cache --virtual .nginx-rundeps $runDeps \
&& apk del .build-deps \
&& apk del .gettext \
&& mv /tmp/envsubst /usr/local/bin/ \
\
&& ln -sf /dev/stdout /var/log/nginx/access.log \
&& ln -sf /dev/stderr /var/log/nginx/error.log
COPY nginx.conf /etc/nginx/nginx.conf
COPY conf/certs /var/certs
COPY ./src /app
COPY supervisord.ini /etc/supervisor.d/supervisord.ini
WORKDIR /app
RUN pip install --upgrade pip
RUN pip install pipenv
RUN pipenv install --system
RUN chmod +x ./start.sh
ENTRYPOINT ["./start.sh"]
This is my entire docker-compose:
# SERVICE               ENDPOINT         USER   PASSWORD
# ==========================================================
# postgres-11            localhost:2323   root   root
# redis-cache-energy     not exposed      no     no
# redis-celery           not exposed      no     no
# energy                 localhost:1337   no     no      Nginx => Gunicorn => Django
# celery-beat            not exposed      no     no
# celery-worker          not exposed      no     no
version: '3'
services:
  ####################################################################################################
  #                                            DATABASES                                             #
  ####################################################################################################
  postgres:
    image: postgres:11.2-alpine
    restart: always
    volumes:
      - postgres_data_energy:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=root
      - POSTGRES_DB=local
    ports:
      - "2323:5432"
    networks:
      - energy-network
  ####################################################################################################
  #                                              REDIS                                               #
  ####################################################################################################
  redis-cache-energy:
    image: redis:alpine
    command: redis-server --requirepass root
    restart: always
    networks:
      - energy-network
  redis-celery-energy:
    image: redis:alpine
    command: redis-server --requirepass root
    restart: always
    networks:
      - energy-network
  ####################################################################################################
  #                                              energy                                              #
  ####################################################################################################
  celery-beat-energy:
    build: .
    command: celery-beat-energy
    restart: always
    volumes:
      - ./src:/app
      - ./logs:/var/logs
    env_file:
      - conf/local.env
    depends_on:
      - redis-celery-energy
    networks:
      - energy-network
  celery-worker-energy:
    build: .
    command: celery-worker-energy
    restart: always
    volumes:
      - ./src:/app
      - ./logs:/var/logs
    env_file:
      - conf/local.env
    depends_on:
      - redis-celery-energy
    networks:
      - energy-network
  energy:
    build: .
    command: nginx
    restart: always
    volumes:
      - ./src:/app
      - ./logs:/var/logs
    ports:
      - 1234:1234
    env_file:
      - conf/local.env
    depends_on:
      - postgres
      - redis-cache-energy
      - redis-celery-energy
      - celery-worker-energy
      - celery-beat-energy
    networks:
      - energy-network
####################################################################################################
#                                              VOLUMES                                              #
####################################################################################################
volumes:
  postgres_data_energy:
####################################################################################################
#                                              networks                                             #
####################################################################################################
networks:
  energy-network:
    driver: bridge
More info:
If I open an SSH terminal to the server from PyCharm and run an ls inside the entrypoint, I see the files.
[Screenshot: SSH terminal from PyCharm]
[Screenshot: entrypoint output - the files are there]
If I run the project from PyCharm instead, the same ls comes back empty.
[Screenshot: entrypoint output - empty]
[Screenshot: run from PyCharm]
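For anyone comparing the two cases, a rough equivalent of that check from the server shell (the invocation is an assumption; the service name comes from the compose file above):
docker-compose run --rm --entrypoint ls celery-worker-energy /app
# From an SSH shell in the project directory this lists start.sh under /app;
# if ./src resolves to an empty or different directory, the bind mount hides
# the image's /app and the listing comes back empty.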
Related
docker container exits immediately after start with either docker run -t -d postgresql_ha:pg13 or docker-compose up
[root@vistradpdb01] /opt/apps/Postgresql_HA_test # docker-compose up
Starting postgresql_ha_test_postgresql_ha_1 ... done
Attaching to postgresql_ha_test_postgresql_ha_1
postgresql_ha_1 | + su-exec postgres:postgres /usr/bin/pg_ctl start -D /var/lib/postgresql/data
postgresql_ha_1 | pg_ctl: another server might be running; trying to start server anyway
postgresql_ha_1 | waiting for server to start....2023-02-16 13:33:14.885 UTC [9] LOG: starting PostgreSQL 13.10 on x86_64-alpine-linux-musl, compiled by gcc (Alpine 12.2.1_git20220924-r4) 12.2.1 20220924, 64-bit
postgresql_ha_1 | 2023-02-16 13:33:14.885 UTC [9] LOG: listening on IPv4 address "0.0.0.0", port 5432
postgresql_ha_1 | 2023-02-16 13:33:14.885 UTC [9] LOG: listening on IPv6 address "::", port 5432
postgresql_ha_1 | 2023-02-16 13:33:14.885 UTC [9] LOG: listening on Unix socket "/run/postgresql/.s.PGSQL.5432"
postgresql_ha_1 | 2023-02-16 13:33:14.887 UTC [10] LOG: database system was interrupted; last known up at 2023-02-16 13:11:17 UTC
postgresql_ha_1 | 2023-02-16 13:33:15.176 UTC [10] LOG: database system was not properly shut down; automatic recovery in progress
postgresql_ha_1 | 2023-02-16 13:33:15.178 UTC [10] LOG: invalid record length at 0/15AA198: wanted 24, got 0
postgresql_ha_1 | 2023-02-16 13:33:15.178 UTC [10] LOG: redo is not required
postgresql_ha_1 | 2023-02-16 13:33:15.183 UTC [9] LOG: database system is ready to accept connections
postgresql_ha_1 | done
postgresql_ha_1 | server started
postgresql_ha_test_postgresql_ha_1 exited with code 0
I am trying to create a PostgreSQL high-availability image with Patroni, etcd and HAProxy included.
I have read a lot of articles on how to force a container to stay running but have not found anything that works.
I tried starting a pseudo-TTY.
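A note for context (not from the original post): pg_ctl start returns as soon as the server has daemonized, so an entrypoint built around it leaves PID 1 with nothing left to do and the container exits cleanly, which matches the exit code 0 in the log above. The usual alternative is to keep the server itself in the foreground; a minimal sketch, assuming the postgres binary from the Alpine postgresql13 package is on PATH:
#!/bin/bash
set -e
# Sketch: run the server in the foreground as PID 1 so the container stays up.
exec su-exec postgres:postgres postgres -D /var/lib/postgresql/data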
Dockerfile:
FROM alpine:latest AS build
USER root
ENV LANG en_US.UTF-8
ENV LC_ALL en_US.UTF-8
ENV POSTGRES_USER="postgres"
ENV POSTGRES_PASSWORD="postgres"
ENV PGDATA="/var/lib/postgresql/data/pgdata"
ENV POSTGRES_INITDB_WALDIR="/var/lib/postgresql/log/pgwal"
COPY ./docker-entrypoint.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/docker-entrypoint.sh \
&& sed -i -e 's/\r$//' /usr/local/bin/docker-entrypoint.sh \
&& mkdir -p /var/lib/postgresql \
&& addgroup -g 70 -S postgres \
&& adduser -u 70 -S -D -G postgres -H -h /var/lib/postgresql -s /bin/sh postgres \
&& chown -R postgres:postgres /var/lib/postgresql \
&& apk update && apk add --no-cache \
bash \
su-exec \
curl \
gcc \
linux-headers \
python3 \
python3-dev \
musl-dev \
postgresql13 \
haproxy \
py3-pip \
libgcc \
&& rm -rf /var/lib/apt/lists/* \
&& curl -L https://github.com/etcd-io/etcd/releases/download/v3.5.7/etcd-v3.5.7-linux-amd64.tar.gz -o /opt/etcd-v3.5.7-linux-amd64.tar.gz \
&& mkdir /opt/etcd \
&& tar xzvf /opt/etcd-v3.5.7-linux-amd64.tar.gz -C /opt/etcd --strip-components=1 \
&& rm -f /opt/etcd-v3.5.7-linux-amd64.tar.gz \
&& pip install --trusted-host pypi.python.org --trusted-host pypi.org --trusted-host files.pythonhosted.org patroni[etcd] \
&& su-exec postgres:postgres mkdir /var/lib/postgresql/data \
&& su-exec postgres:postgres chmod 0700 /var/lib/postgresql/data \
&& su-exec postgres:postgres initdb -D /var/lib/postgresql/data \
&& echo "host all all 0.0.0.0/0 md5" >> /var/lib/postgresql/data/pg_hba.conf \
&& echo "listen_addresses='*'" >> /var/lib/postgresql/data/postgresql.conf \
&& mkdir /run/postgresql \
&& chown postgres:postgres /run/postgresql \
&& chmod 2777 /var/run/postgresql \
&& curl -L https://github.com/tianon/gosu/releases/download/1.16/gosu-amd64 -o /usr/local/bin/gosu \
&& curl -L https://github.com/tianon/gosu/releases/download/1.16/gosu-amd64.asc -o /usr/local/bin/gosu.asc \
&& chmod +x /usr/local/bin/gosu \
&& gosu nobody true
FROM scratch
COPY --from=build / /
USER postgres
ENTRYPOINT ["bash","-c","/usr/local/bin/docker-entrypoint.sh"]
docker-compose.yml:
version: '3.1'
services:
  postgresql_ha:
    image: postgresql_ha:pg13
    ports:
      - '5432:5432'
    volumes:
      - /opt/store/pgdata/data:/var/lib/postgresql/data/pgdata
      - /opt/store/pgdata/tblspc:/var/lib/postgresql/data/tsdata
      - /opt/store/pglog/wal:/var/lib/postgresql/log/pgwal
      - /opt/store/pglog/log:/var/lib/postgresql/log/pglog
      - ./postgres-init.sh:/docker-entrypoint-initdb.d/postgres-init.sh
    tty: true
docker-entrypoint.sh:
#!/bin/bash
set -x
su-exec postgres:postgres /usr/bin/pg_ctl start -D /var/lib/postgresql/data
"$@"
I have three Azure Pipeline agents built on Ubuntu 18.04 images and deployed to a Kubernetes cluster. Agents are running the latest version, 2.182.1, but this problem also happened using 2.181.0.
Executing build pipelines individually works just fine. Build completes successfully every time. But whenever a second pipeline starts while another pipeline is already running, it fails - every time - on the "Checkout" job with the following error:
The working folder U:\azp\agent\_work\1\s is already in use by the workspace ws_1_34;Project Collection Build Service (myaccount) on computer linux-agent-deployment-78bfb76d.
These are three separate and distinct agents running as separate containers. Why would a job from one container be impacting a job running on a different container? Concurrent builds work all day long on my non-container Windows servers.
The container agents are deployed as a standard Kubernetes "deployment" object:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: linux-agent
  name: linux-agent-deployment
  namespace: pipelines
  annotations:
    kubernetes.io/change-cause: "update agent image to 20210304 - change from OpenJDK to Oracle Java JDK 11"
spec:
  replicas: 3
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: linux-agent
  strategy:
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: linux-agent
    spec:
      serviceAccountName: sa-aws-azp-pipelineagent
      containers:
      - name: linux-agent
        image: 999999999999.dkr.ecr.us-east-2.amazonaws.com/mgmt/my-linux-agent:20210304
        imagePullPolicy: IfNotPresent
        env:
        - name: AZP_URL
          value: https://dev.azure.com/myaccount
        - name: AZP_POOL
          value: EKS-Linux
        - name: AZP_TOKEN
          valueFrom:
            secretKeyRef:
              name: azure-devops
              key: agent-token
My build agent containers are pretty straightforward...
FROM ubuntu:18.04
ENV ACCEPT_EULA=y
ENV DEBIAN_FRONTEND=noninteractive
RUN echo "APT::Get::Assume-Yes \"true\";" > /etc/apt/apt.conf.d/90assumeyes
RUN ln -fs /usr/share/zoneinfo/America/Chicago /etc/localtime
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
apt-transport-https \
ca-certificates \
curl \
jq \
git \
iputils-ping \
libcurl4 \
libicu60 \
libunwind8 \
netcat \
dnsutils \
wget \
zip \
unzip \
telnet \
ftp \
file \
time \
tzdata \
build-essential \
libc6 \
libgcc1 \
libgssapi-krb5-2 \
liblttng-ust0 \
libssl1.0 \
libstdc++6 \
zlib1g \
apt-utils \
bison \
brotli \
bzip2 \
dbus \
dpkg \
fakeroot \
flex \
gnupg2 \
iproute2 \
lib32z1 \
libc++-dev \
libc++abi-dev \
libgbm-dev \
libgconf-2-4 \
libgtk-3-0 \
libsecret-1-dev \
libsqlite3-dev \
libxkbfile-dev \
libxss1 \
locales \
m4 \
openssh-client \
parallel \
patchelf \
pkg-config \
rpm \
rsync \
shellcheck \
sqlite3 \
ssh \
sudo \
texinfo \
tk \
upx \
xorriso \
xvfb \
xz-utils \
zstd \
zsync \
software-properties-common
### REQUIRED APPLICATIONS
# Amazon Web Services - CLI
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" \
&& unzip awscliv2.zip \
&& sudo ./aws/install
# MS SQL Tools (ONE-TIME SETUP OF MICROSOFT REPOSITORY INCLUDED)
RUN curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add - \
&& curl https://packages.microsoft.com/config/ubuntu/18.04/prod.list | sudo tee /etc/apt/sources.list.d/msprod.list \
&& sudo apt-get update && sudo ACCEPT_EULA=Y apt-get install -y mssql-tools unixodbc-dev
# Powershell Global Tool (https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell-core-on-linux?view=powershell-7.1)
RUN sudo apt-get install -y powershell
# .NET Core SDKs (https://learn.microsoft.com/en-us/dotnet/core/install/linux-ubuntu)
# see also (https://packages.microsoft.com/ubuntu/18.04/prod/dists/bionic/main/binary-amd64/) "Packages"
# SDKs Included: 2.1, 2.2, 3.0, 3.1, 5.0
RUN sudo apt-get install -y dotnet-host \
aspnetcore-store-2.0.0 \
aspnetcore-store-2.0.3 \
aspnetcore-store-2.0.5 \
aspnetcore-store-2.0.6 \
aspnetcore-store-2.0.7 \
aspnetcore-store-2.0.8 \
aspnetcore-store-2.0.9 \
dotnet-hostfxr-2.0.7 \
dotnet-hostfxr-2.0.9 \
dotnet-hostfxr-2.1 \
dotnet-hostfxr-2.2 \
dotnet-hostfxr-3.0 \
dotnet-hostfxr-3.1 \
dotnet-hostfxr-5.0 \
dotnet-runtime-deps-2.1 \
dotnet-runtime-deps-2.2 \
dotnet-runtime-deps-3.0 \
dotnet-runtime-deps-3.1 \
dotnet-runtime-deps-5.0 \
dotnet-targeting-pack-3.0 \
dotnet-targeting-pack-3.1 \
dotnet-targeting-pack-5.0 \
netstandard-targeting-pack-2.1 \
aspnetcore-targeting-pack-3.0 \
aspnetcore-targeting-pack-3.1 \
aspnetcore-targeting-pack-5.0 \
dotnet-runtime-2.1 \
dotnet-runtime-2.2 \
dotnet-runtime-3.0 \
dotnet-runtime-3.1 \
dotnet-runtime-5.0 \
aspnetcore-runtime-2.1 \
aspnetcore-runtime-2.2 \
aspnetcore-runtime-3.0 \
aspnetcore-runtime-3.1 \
aspnetcore-runtime-5.0 \
dotnet-sdk-2.1 \
dotnet-sdk-2.2 \
dotnet-sdk-3.0 \
dotnet-sdk-3.1 \
dotnet-sdk-5.0
# Initialize dotnet
RUN dotnet help
RUN dotnet --info
# Node.js (https://github.com/nodesource/distributions/blob/master/README.md)
RUN curl -sL https://deb.nodesource.com/setup_14.x | sudo -E bash - \
&& sudo apt-get install -y nodejs \
&& node --version \
&& npm --version
# Java JDK 11
COPY JDK/ /var/cache/oracle-jdk11-installer-local/
RUN add-apt-repository -y ppa:linuxuprising/java && \
apt-get update && \
echo oracle-java11-installer shared/accepted-oracle-license-v1-2 select true | sudo /usr/bin/debconf-set-selections && \
apt-get install -y oracle-java11-installer-local
ENV JAVA_HOME=/usr/lib/jvm/java-11-oracle \
JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8
# Clean package cache
RUN rm -rf /var/lib/apt/lists/* \
&& rm -rf /etc/apt/sources.list.d/*
WORKDIR /azp
COPY ./start.sh .
RUN chmod +x start.sh
CMD ["./start.sh"]
What am I doing wrong?
A solution has been found. Here's how I resolved this, for anyone coming across this post:
I discovered a Helm chart for Azure Pipeline agents - emberstack/docker-azure-pipelines-agent - and after poking around in its contents, I discovered what had been staring me in the face for the last couple of days: StatefulSets.
Simple, easy to test, and working well so far. I refactored my k8s manifest as a StatefulSet object and the agents are up and able to run builds concurrently. There is still more testing to do, but it is looking very positive at this point.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: linux-agent
  name: linux-pipeline-agent
  namespace: pipelines
  annotations:
    kubernetes.io/change-cause: "Init 20210304 - Oracle Java JDK 11"
spec:
  podManagementPolicy: Parallel
  replicas: 3
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: linux-agent
  serviceName: agent-svc
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: linux-agent
    spec:
      serviceAccountName: sa-aws-azp-pipelineagent
      containers:
      - name: linux-agent
        image: 999999999999.dkr.ecr.us-east-2.amazonaws.com/mgmt/my-linux-agent:20210304
        imagePullPolicy: IfNotPresent
        env:
        - name: AZP_AGENT_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: AZP_URL
          value: https://dev.azure.com/myaccount
        - name: AZP_POOL
          value: EKS-Linux
        - name: AZP_TOKEN
          valueFrom:
            secretKeyRef:
              name: azure-devops
              key: agent-token
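The detail that appears to do the work here: a StatefulSet gives every pod a stable, unique name, and that name is passed to the agent as AZP_AGENT_NAME via the fieldRef above, so each agent registers with its own identity and workspace instead of colliding. A quick way to sanity-check the rollout (a sketch; the names follow the usual StatefulSet ordinal convention):
kubectl -n pipelines get pods -l app=linux-agent
# expected, roughly:
#   linux-pipeline-agent-0   Running
#   linux-pipeline-agent-1   Running
#   linux-pipeline-agent-2   Running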
I created a Dockerfile and changed USER to the non-root user nobody. Locally this works perfectly. When deployed on Kubernetes, however, I get the error
java.nio.file.AccessDeniedException: ./xxxxxx_2.12-2.6.3.jar
When I dug in more, I realised this jar file is downloaded after the Spark dependencies used in the Dockerfile. Any permissions given to the Spark folder are therefore not present for this newly downloaded jar file, which is downloaded at runtime into /opt/spark/.ivy2/xxx and is owned by root. This causes the pod in Kubernetes to fail.
I am wondering if there is a way to grant permission to execute this jar file, since it seems this is not possible in the Dockerfile. Any suggestions on how to solve this issue?
As proposed by @mario, here is my Dockerfile:
ARG SPARK_OPERATOR_BASE_IMAGE_VERSION=v2.4.5
FROM xxx/alpine:3.12 as preparator
ARG SCALA_VERSION=2.12
ARG SPARK_VERSION=2.4.7
ARG HADOOP_VERSION=3.2.1
ARG AWS_SDK_VERSION=1.11.375
ARG MAVEN_VERSION=3.6.2
RUN apk add --no-cache \
bash \
curl && \
mkdir /target
COPY hashes /tmp/
COPY prepare /tmp/
WORKDIR /tmp
# Download Hadoop
RUN curl -L -O https://downloads.apache.org/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz && \
sha256sum -c hadoop-${HADOOP_VERSION}.sha256 && \
tar -xzf hadoop-${HADOOP_VERSION}.tar.gz && \
mv hadoop-${HADOOP_VERSION} /target/hadoop
# Download Spark
RUN curl -L -O https://downloads.apache.org/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-without-hadoop-scala-${SCALA_VERSION}.tgz && \
sha512sum -c spark-${SPARK_VERSION}-bin-without-hadoop-scala-${SCALA_VERSION}.sha512 && \
tar -xzf spark-${SPARK_VERSION}-bin-without-hadoop-scala-${SCALA_VERSION}.tgz && \
mv spark-${SPARK_VERSION}-bin-without-hadoop-scala-${SCALA_VERSION} /target/spark && \
# Download Spark 3.0.0 entrypoint script from GitHub, bugfixing for 2.4.7
curl -L -O https://raw.githubusercontent.com/apache/spark/v3.0.0/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh && \
mv entrypoint.sh /target/entrypoint.sh && \
chmod +x /target/entrypoint.sh
# Download AWS Jars
RUN curl -L -O https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/${HADOOP_VERSION}/hadoop-aws-${HADOOP_VERSION}.jar && \
sha1sum -c hadoop-aws-${HADOOP_VERSION}.jar.sha1 && \
mv hadoop-aws-${HADOOP_VERSION}.jar /target/spark/jars/ && \
curl -L -O https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/${AWS_SDK_VERSION}/aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar && \
sha1sum -c aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar.sha1 && \
mv aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar /target/spark/jars/
# Directory needed for saving built jars
RUN mkdir /target/spark/custom-jars/
#### Download Prometheus + Metric dependencies ####
# install java, maven and prometheus fat jar using maven (pom.xml)
RUN apk add --update openjdk8 && \
curl -L -O https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.tar.gz && \
tar -xzf apache-maven-${MAVEN_VERSION}-bin.tar.gz && export PATH=./apache-maven-${MAVEN_VERSION}/bin:$PATH && \
mv prometheus-pom.xml pom.xml && mvn clean package && mv target/prometheusMetricLibs-jar-with-dependencies.jar /target/spark/custom-jars/
RUN \
chown -R nobody:99 /target/spark \
&& chown -R nobody:99 /target/hadoop \
&& chmod -R ugo+rw /target/spark \
&& chmod -R ugo+rw /target/hadoop
ARG SPARK_OPERATOR_BASE_IMAGE_VERSION
FROM gcr.io/spark-operator/spark:${SPARK_OPERATOR_BASE_IMAGE_VERSION}
RUN rm -rf /opt/spark/
COPY --from=preparator /target/ /opt/
ENV SPARK_HOME=/opt/spark \
HADOOP_HOME=/opt/hadoop
ENV HADOOP_OPTS="-Djava.library.path=/opt/hadoop/lib/native" \
LD_LIBRARY_PATH=${HADOOP_HOME}/lib/native \
PATH=${HADOOP_HOME}/bin:${SPARK_HOME}/bin:${PATH}
COPY conf /opt/spark/conf/
RUN echo "export JAVA_HOME=${JAVA_HOME}" >> ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh && \
echo "export JAVA_HOME=${JAVA_HOME}" > ${SPARK_HOME}/conf/spark-env.sh && \
echo "export SPARK_DIST_CLASSPATH=\$(hadoop classpath)" >> ${SPARK_HOME}/conf/spark-env.sh
# 99 used instead of nobody because in the alpine image the Group Id of nobody is different than in the spark operator image.
RUN \
addgroup --gid 99 nobody \
&& echo "nobody:x:99:99:nobody:/nonexistent:/usr/sbin/nologin" >> /etc/passwd \
&& usermod -a -G users nobody \
&& chmod -R ugo+rw /var/lib/
USER nobody
# if we want local storage
# "spark.eventLog.dir": "tmp/spark-events"
RUN mkdir -p /tmp/spark-events
In my pod, the jar file is referenced like this:
sparkConf:
  "spark.ui.port": "4045"
  "spark.eventLog.enabled": {{ .Values.spark.eventLogEnabled | quote }}
  "spark.eventLog.dir": "xx//"
  "spark.jars.ivySettings": "/vault/secrets/xx-ivysettings.xml"
  "spark.jars.ivy": "/opt/spark/.ivy2"
  "spark.jars.packages": "xxxx_2.12:{{ .Values.appVersion }}"
  "spark.blacklist.enabled": "false"
  "spark.driver.supervise": "true"
  "spark.app.name": {{ .Values.name | quote }}
  "spark.submit.deployMode": {{ .Values.spark.deployMode | quote }}
  "spark.driver.extraJavaOptions": "-Dlog4j.configurationFile=log4j.properties"
  "spark.executor.extraJavaOptions": "-Dlog4j.configurationFile=log4j.properties"
It would be easier to answer your question if you shared your Dockerfile and your Kubernetes Pod template YAML manifest, but in short: you can manipulate your Pod's permissions using the securityContext, as in the example from the docs below:
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  volumes:
  - name: sec-ctx-vol
    emptyDir: {}
  containers:
  - name: sec-ctx-demo
    image: busybox
    command: [ "sh", "-c", "sleep 1h" ]
    volumeMounts:
    - name: sec-ctx-vol
      mountPath: /data/demo
    securityContext:
      allowPrivilegeEscalation: false
For debugging purposes you can start by setting allowPrivilegeEscalation to true or runAsUser: 0 (root), but keep in mind this is not a solution that can be used in production. Running containers as root is generally a bad idea, and in most cases it can be avoided.
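If it helps while debugging, a minimal container-level override would look roughly like this (a sketch only; revert it once the permission problem is confirmed):
securityContext:
  runAsUser: 0
  allowPrivilegeEscalation: true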
Any permissions given to the Spark folder are therefore not present for this newly downloaded jar file, which is downloaded at runtime into /opt/spark/.ivy2/xxx and is owned by root.

Is it really necessary for it to have root permissions? Most likely this can be fixed in your Dockerfile.
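For example, a minimal sketch of such a Dockerfile fix (an assumption on my side, not something taken from the image above): since your spark.jars.ivy setting points at /opt/spark/.ivy2, pre-creating that directory with the nobody/99 ownership already used elsewhere in your Dockerfile, before the existing USER nobody line, gives the runtime-resolved jars a writable home:
# Sketch: pre-create the ivy cache so jars resolved at runtime land in a
# directory owned by the non-root user instead of a root-owned one.
RUN mkdir -p /opt/spark/.ivy2 \
    && chown -R nobody:99 /opt/spark/.ivy2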
I'm trying to mount a volume in docker-compose, but it seems my user does not have permission to use the volume. :/
My Dockerfile is:
FROM openjdk:8u181-jdk-slim
ENV HOME /app
ENV CONFIG_PATH $HOME/config
ENV DATA_PATH $HOME/data
ENV LOG_PATH $HOME/log
RUN addgroup --gid 1001 myuser \
&& adduser --uid 1001 --gid 1001 --home $HOME --shell /bin/bash \
--gecos "" --no-create-home --disabled-password myuser \
&& mkdir -p $CONFIG_PATH $DATA_PATH $LOG_PATH \
&& chown -R myuser:myuser $HOME \
&& chmod -R g=u $HOME \
&& chmod +x $HOME/*
RUN apt-get update \
&& apt-get install -y curl \
&& apt-get clean
VOLUME $CONFIG_PATH $DATA_PATH $LOG_PATH
USER myuser:myuser
EXPOSE 7777
EXPOSE 8080
HEALTHCHECK --interval=1m --timeout=10s --start-period=2m \
CMD curl -f http://localhost:7777/health || exit 1
COPY --chown=myuser my-service-*.jar $HOME/my-service.jar
ENTRYPOINT ["/bin/bash", "-c", "java $JAVA_OPTS -jar $HOME/my-service.jar $0 $@"]
my docker-compose file is:
volumes:
  my-service_stream:

my-service:
  image: my-service-image
  networks:
    - internal
  env_file:
    - config/common.env
  volumes:
    - my-service_stream:/app/data/state
I am not able to mount the volume as myuser: the user does not have permission to write to it.
I have tried adding a user to my docker-compose file as
user: "1001:1001"
but nothing changed.
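A note that may explain this (an assumption, not from the post above): when a named volume is mounted at a path that does not exist in the image, Docker creates that directory as root, so myuser cannot write to it even with user: "1001:1001". A sketch of the usual Dockerfile fix is to create the exact mount point ahead of time, in the root-owned RUN step before USER myuser, so a fresh volume inherits the right ownership:
# Sketch: create the mount point /app/data/state in the image so a new named
# volume picks up myuser's ownership when it is first mounted there.
RUN mkdir -p $DATA_PATH/state \
    && chown -R myuser:myuser $DATA_PATH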
I'm trying to set up NextCloud with Postgres using Docker, but I'm unable to access/reach the postgres container from the NextCloud setup page.
Here is my setup:
docker network create --driver bridge nextcloud
docker run -p 127.0.0.1:5432:5432 \
--name postgres \
--link cloud.mydomain.com \
--net=nextcloud \
-e POSTGRES_PASSWORD=supersecretpass123 \
-e POSTGRES_USER=nextcloud \
-e POSTGRES_DB=nextcloud \
-v postgres-data:/var/lib/postgresql/data \
-d postgres
docker run -d -p 127.0.0.1:8080:80 \
--name="cloud.mydomain.com" \
-e VIRTUAL_HOST=cloud.mydomain.com \
-v nextcloud:/var/www/html \
--net=nextcloud \
nextcloud
docker run -d -p 80:80 -p 443:443 --name="cloud.mydomain.com-proxy" \
--net=nextcloud \
-v /srv/gitlab:/etc/nginx/vhost.d:ro \
-v /root/certs:/etc/nginx/certs \
-v /var/run/docker.sock:/tmp/docker.sock:ro \
--restart always \
jwilder/nginx-proxy:latest
Any suggestions?
You need to invert the link: add --link postgres to the cloud.mydomain.com container:
docker run -d -p 127.0.0.1:8080:80 \
--name="cloud.mydomain.com" \
--link postgres \
-e VIRTUAL_HOST=cloud.mydomain.com \
-v nextcloud:/var/www/html \
--net=nextcloud \
nextcloud
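A side note (not part of the answer above): because both containers are attached to the user-defined bridge network "nextcloud", Docker's embedded DNS already lets them reach each other by container name, so --link is mostly a legacy convenience here. In the NextCloud setup page the database host can simply be the postgres container's name. A quick check, assuming getent is available in the nextcloud image:
docker exec cloud.mydomain.com getent hosts postgres
# should print the postgres container's IP on the "nextcloud" network;
# then use "postgres" (port 5432) as the database host in the setup wizard.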