I have three Azure Pipeline agents built on Ubuntu 18.04 images and deployed to a Kubernetes cluster. Agents are running the latest version, 2.182.1, but this problem also happened using 2.181.0.
Executing build pipelines individually works just fine. Build completes successfully every time. But whenever a second pipeline starts while another pipeline is already running, it fails - every time - on the "Checkout" job with the following error:
The working folder U:\azp\agent\_work\1\s is already in use by the workspace ws_1_34;Project Collection Build Service (myaccount) on computer linux-agent-deployment-78bfb76d.
These are three separate and distinct agents running as separate containers. Why would a job from one container be impacting a job running on a different container? Concurrent builds work all day long on my non-container Windows servers.
The container agents are deployed as a standard Kubernetes "deployment" object:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: linux-agent
  name: linux-agent-deployment
  namespace: pipelines
  annotations:
    kubernetes.io/change-cause: "update agent image to 20210304 - change from OpenJDK to Oracle Java JDK 11"
spec:
  replicas: 3
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: linux-agent
  strategy:
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: linux-agent
    spec:
      serviceAccountName: sa-aws-azp-pipelineagent
      containers:
        - name: linux-agent
          image: 999999999999.dkr.ecr.us-east-2.amazonaws.com/mgmt/my-linux-agent:20210304
          imagePullPolicy: IfNotPresent
          env:
            - name: AZP_URL
              value: https://dev.azure.com/myaccount
            - name: AZP_POOL
              value: EKS-Linux
            - name: AZP_TOKEN
              valueFrom:
                secretKeyRef:
                  name: azure-devops
                  key: agent-token
My build agent containers are pretty straightforward...
FROM ubuntu:18.04
ENV ACCEPT_EULA=y
ENV DEBIAN_FRONTEND=noninteractive
RUN echo "APT::Get::Assume-Yes \"true\";" > /etc/apt/apt.conf.d/90assumeyes
RUN ln -fs /usr/share/zoneinfo/America/Chicago /etc/localtime
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
apt-transport-https \
ca-certificates \
curl \
jq \
git \
iputils-ping \
libcurl4 \
libicu60 \
libunwind8 \
netcat \
dnsutils \
wget \
zip \
unzip \
telnet \
ftp \
file \
time \
tzdata \
build-essential \
libc6 \
libgcc1 \
libgssapi-krb5-2 \
liblttng-ust0 \
libssl1.0 \
libstdc++6 \
zlib1g \
apt-utils \
bison \
brotli \
bzip2 \
dbus \
dpkg \
fakeroot \
flex \
gnupg2 \
iproute2 \
lib32z1 \
libc++-dev \
libc++abi-dev \
libgbm-dev \
libgconf-2-4 \
libgtk-3-0 \
libsecret-1-dev \
libsqlite3-dev \
libxkbfile-dev \
libxss1 \
locales \
m4 \
openssh-client \
parallel \
patchelf \
pkg-config \
rpm \
rsync \
shellcheck \
sqlite3 \
ssh \
sudo \
texinfo \
tk \
upx \
xorriso \
xvfb \
xz-utils \
zstd \
zsync \
software-properties-common
### REQUIRED APPLICATIONS
# Amazon Web Services - CLI
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" \
&& unzip awscliv2.zip \
&& sudo ./aws/install
# MS SQL Tools (ONE-TIME SETUP OF MICROSOFT REPOSITORY INCLUDED)
RUN curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add - \
&& curl https://packages.microsoft.com/config/ubuntu/18.04/prod.list | sudo tee /etc/apt/sources.list.d/msprod.list \
&& sudo apt-get update && sudo ACCEPT_EULA=Y apt-get install -y mssql-tools unixodbc-dev
# Powershell Global Tool (https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell-core-on-linux?view=powershell-7.1)
RUN sudo apt-get install -y powershell
# .NET Core SDKs (https://learn.microsoft.com/en-us/dotnet/core/install/linux-ubuntu)
# see also (https://packages.microsoft.com/ubuntu/18.04/prod/dists/bionic/main/binary-amd64/) "Packages"
# SDKs Included: 2.1, 2.2, 3.0, 3.1, 5.0
RUN sudo apt-get install -y dotnet-host \
aspnetcore-store-2.0.0 \
aspnetcore-store-2.0.3 \
aspnetcore-store-2.0.5 \
aspnetcore-store-2.0.6 \
aspnetcore-store-2.0.7 \
aspnetcore-store-2.0.8 \
aspnetcore-store-2.0.9 \
dotnet-hostfxr-2.0.7 \
dotnet-hostfxr-2.0.9 \
dotnet-hostfxr-2.1 \
dotnet-hostfxr-2.2 \
dotnet-hostfxr-3.0 \
dotnet-hostfxr-3.1 \
dotnet-hostfxr-5.0 \
dotnet-runtime-deps-2.1 \
dotnet-runtime-deps-2.2 \
dotnet-runtime-deps-3.0 \
dotnet-runtime-deps-3.1 \
dotnet-runtime-deps-5.0 \
dotnet-targeting-pack-3.0 \
dotnet-targeting-pack-3.1 \
dotnet-targeting-pack-5.0 \
netstandard-targeting-pack-2.1 \
aspnetcore-targeting-pack-3.0 \
aspnetcore-targeting-pack-3.1 \
aspnetcore-targeting-pack-5.0 \
dotnet-runtime-2.1 \
dotnet-runtime-2.2 \
dotnet-runtime-3.0 \
dotnet-runtime-3.1 \
dotnet-runtime-5.0 \
aspnetcore-runtime-2.1 \
aspnetcore-runtime-2.2 \
aspnetcore-runtime-3.0 \
aspnetcore-runtime-3.1 \
aspnetcore-runtime-5.0 \
dotnet-sdk-2.1 \
dotnet-sdk-2.2 \
dotnet-sdk-3.0 \
dotnet-sdk-3.1 \
dotnet-sdk-5.0
# Initialize dotnet
RUN dotnet help
RUN dotnet --info
# Node.js (https://github.com/nodesource/distributions/blob/master/README.md)
RUN curl -sL https://deb.nodesource.com/setup_14.x | sudo -E bash - \
&& sudo apt-get install -y nodejs \
&& node --version \
&& npm --version
# Java JDK 11
COPY JDK/ /var/cache/oracle-jdk11-installer-local/
RUN add-apt-repository -y ppa:linuxuprising/java && \
apt-get update && \
echo oracle-java11-installer shared/accepted-oracle-license-v1-2 select true | sudo /usr/bin/debconf-set-selections && \
apt-get install -y oracle-java11-installer-local
ENV JAVA_HOME=/usr/lib/jvm/java-11-oracle \
JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8
# Clean package cache
RUN rm -rf /var/lib/apt/lists/* \
&& rm -rf /etc/apt/sources.list.d/*
WORKDIR /azp
COPY ./start.sh .
RUN chmod +x start.sh
CMD ["./start.sh"]
What am I doing wrong?
Solution has been found. Here's how I resolved this for anyone coming across this post:
I discovered a helm chart for Azure Pipeline agents - emberstack/docker-azure-pipelines-agent - and after poking around in the contents, discovered what was staring me in the face the last couple of days: "StatefulSets"
Simple, easy to test, and working well so far. I refactored my k8s manifest as a StatefulSet object and the agents are up and able to run builds concurrently. Still more testing to do, but looking very positive at this point.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: linux-agent
  name: linux-pipeline-agent
  namespace: pipelines
  annotations:
    kubernetes.io/change-cause: "Init 20210304 - Oracle Java JDK 11"
spec:
  podManagementPolicy: Parallel
  replicas: 3
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: linux-agent
  serviceName: agent-svc
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: linux-agent
    spec:
      serviceAccountName: sa-aws-azp-pipelineagent
      containers:
        - name: linux-agent
          image: 999999999999.dkr.ecr.us-east-2.amazonaws.com/mgmt/my-linux-agent:20210304
          imagePullPolicy: IfNotPresent
          env:
            - name: AZP_AGENT_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: AZP_URL
              value: https://dev.azure.com/myaccount
            - name: AZP_POOL
              value: EKS-Linux
            - name: AZP_TOKEN
              valueFrom:
                secretKeyRef:
                  name: azure-devops
                  key: agent-token
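The StatefulSet above references serviceName: agent-svc, but that Service is not shown. A minimal sketch of what it might look like, assuming a headless Service is all that is needed (the agents only make outbound connections to Azure DevOps, so no ports are declared here). The manifest below is an illustration, not the one actually deployed:

```yaml
# Hypothetical headless Service backing the StatefulSet's serviceName.
# Only the name, namespace and label come from the manifest above;
# everything else is an assumption.
apiVersion: v1
kind: Service
metadata:
  name: agent-svc
  namespace: pipelines
  labels:
    app: linux-agent
spec:
  clusterIP: None      # headless: no virtual IP, only stable per-pod DNS identities
  selector:
    app: linux-agent
```

With this in place, each pod gets a stable name (linux-pipeline-agent-0, -1, -2), and AZP_AGENT_NAME picks that name up through metadata.name.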
This is my first time using GitLab for EKS and I feel so lost. I've been following the docs, and so far I have:
Created a project on GitLab that contains my kubernetes manifest files
Created a config.yaml in that project in the directory .gitlab/agents/stockagent
Here's the config.yaml. My project name is "Stock-Market-API-K8s" and my k8s manifests are in the root directory of that project:
ci_access:
  projects:
    - id: "root/Stock-Market-API-K8s"
In the root directory of my project, I also have a .gitlab-ci.yml file. Here are its contents:
deploy:
  image:
    name: mpriv32/stock-api:latest
    entrypoint: ['']
  script:
    - kubectl config get-contexts
    - kubectl config use-context .gitlab/agents/stockagent
    - kubectl get pods
Using the default example from the docs, it seems that the get-contexts script is the one that failed. Here's the full error from my logs
Executing "step_script" stage of the job script
00:01
Using docker image sha256:58ddf823e9d7ee4c0e75779b7e01dab9b11ac0d985d1b2d2fe6c6b95a849573d for mpriv32/stock-api:latest with digest mpriv32/stock-api#sha256:a2e79a2c3a57327f93e36ec55297a606626e4dc8d72e469dd4dc2f3c1f589bac ...
$ kubectl config get-contexts
/bin/bash: line 123: kubectl: command not found
Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit code 1
Here's my job.yaml file for my kubernetes pod, just in case it plays a factor at all
apiVersion: v1
kind: Pod
metadata:
  name: stock-api
  labels:
    app: stock-api
spec:
  containers:
    - name: stock-api
      image: mpriv32/stock-api:latest
      envFrom:
        - secretRef:
            name: api-credentials
  restartPolicy: Never
In your case, I suspect the image you are using (mpriv32/stock-api:latest) doesn't have kubectl installed as a global executable. Use an image that includes kubectl instead, for example bitnami/kubectl:
deploy:
  image:
    name: bitnami/kubectl
The image keyword is the name of the Docker image that the Docker executor uses to run CI/CD jobs.
For more information, see https://docs.gitlab.com/ee/ci/docker/using_docker_images.html
Or you can build your own Docker image on top of bitnami/kubectl:
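Put together, the whole job might look like the sketch below. It assumes the agent context follows GitLab's path/to/agent/project:agent-name format, which would make it root/Stock-Market-API-K8s:stockagent for the project and agent directory named in the question. The :latest tag is also an assumption.

```yaml
deploy:
  image:
    name: bitnami/kubectl:latest
    # Override the image's default entrypoint so the runner can start a shell
    entrypoint: ['']
  script:
    - kubectl config get-contexts
    # Assumed context name: <agent project path>:<agent name>
    - kubectl config use-context root/Stock-Market-API-K8s:stockagent
    - kubectl get pods
```

The output of kubectl config get-contexts should list the exact context name, so check it before relying on the use-context line.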
FROM bitnami/kubectl:1.20.9 as kubectl
FROM ubuntu-or-whatever-image:tag
# Do whatever you need to with the
# ubuntu-or-whatever-image:tag image, then:
COPY --from=kubectl /opt/bitnami/kubectl/bin/kubectl /usr/local/bin/
Or you can build an image from scratch and install the dependencies you need, something like:
FROM ubuntu:18.10
WORKDIR /root
COPY bootstrap.sh ./
RUN apt-get update && apt-get -y install --no-install-recommends \
gnupg \
curl \
wget \
git \
apt-transport-https \
ca-certificates \
zsh \
&& rm -rf /var/lib/apt/lists/*
ENV SHELL /usr/bin/zsh
# Install kubectl
RUN curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | tee -a /etc/apt/sources.list.d/kubernetes.list && \
apt-get update && apt-get -y install --no-install-recommends kubectl
I created a Dockerfile and changed the user to the non-root user nobody. Locally this works perfectly. When deployed on Kubernetes, however, I get this error:
java.nio.file.AccessDeniedException: ./xxxxxx_2.12-2.6.3.jar
When I dug in more, I realised this jar file is downloaded after the Spark dependencies used in the Dockerfile are downloaded. Therefore any permissions given to the spark folder do not apply to this newly downloaded jar file, which is downloaded at runtime into /opt/spark/.ivy2/xxx, which has root permissions. This causes the pod in Kubernetes to fail.
I am wondering if there is a way to give permissions to execute this jar file, since it seems this is not possible in the Dockerfile. Any suggestions as to how to solve this issue?
As proposed by @mario, here is my Dockerfile:
ARG SPARK_OPERATOR_BASE_IMAGE_VERSION=v2.4.5
FROM xxx/alpine:3.12 as preparator
ARG SCALA_VERSION=2.12
ARG SPARK_VERSION=2.4.7
ARG HADOOP_VERSION=3.2.1
ARG AWS_SDK_VERSION=1.11.375
ARG MAVEN_VERSION=3.6.2
RUN apk add --no-cache \
bash \
curl && \
mkdir /target
COPY hashes /tmp/
COPY prepare /tmp/
WORKDIR /tmp
# Download Hadoop
RUN curl -L -O https://downloads.apache.org/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz && \
sha256sum -c hadoop-${HADOOP_VERSION}.sha256 && \
tar -xzf hadoop-${HADOOP_VERSION}.tar.gz && \
mv hadoop-${HADOOP_VERSION} /target/hadoop
# Download Spark
RUN curl -L -O https://downloads.apache.org/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-without-hadoop-scala-${SCALA_VERSION}.tgz && \
sha512sum -c spark-${SPARK_VERSION}-bin-without-hadoop-scala-${SCALA_VERSION}.sha512 && \
tar -xzf spark-${SPARK_VERSION}-bin-without-hadoop-scala-${SCALA_VERSION}.tgz && \
mv spark-${SPARK_VERSION}-bin-without-hadoop-scala-${SCALA_VERSION} /target/spark && \
# Download Spark 3.0.0 entrypoint script from GitHub, bugfixing for 2.4.7
curl -L -O https://raw.githubusercontent.com/apache/spark/v3.0.0/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh && \
mv entrypoint.sh /target/entrypoint.sh && \
chmod +x /target/entrypoint.sh
# Download AWS Jars
RUN curl -L -O https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/${HADOOP_VERSION}/hadoop-aws-${HADOOP_VERSION}.jar && \
sha1sum -c hadoop-aws-${HADOOP_VERSION}.jar.sha1 && \
mv hadoop-aws-${HADOOP_VERSION}.jar /target/spark/jars/ && \
curl -L -O https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/${AWS_SDK_VERSION}/aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar && \
sha1sum -c aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar.sha1 && \
mv aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar /target/spark/jars/
# Directory needed for saving built jars
RUN mkdir /target/spark/custom-jars/
#### Download Prometheus + Metric dependencies ####
# install java, maven and prometheus fat jar using maven (pom.xml)
RUN apk add --update openjdk8 && \
curl -L -O https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.tar.gz && \
tar -xzf apache-maven-${MAVEN_VERSION}-bin.tar.gz && export PATH=./apache-maven-${MAVEN_VERSION}/bin:$PATH && \
mv prometheus-pom.xml pom.xml && mvn clean package && mv target/prometheusMetricLibs-jar-with-dependencies.jar /target/spark/custom-jars/
RUN \
chown -R nobody:99 /target/spark \
&& chown -R nobody:99 /target/hadoop \
&& chmod -R ugo+rw /target/spark \
&& chmod -R ugo+rw /target/hadoop
ARG SPARK_OPERATOR_BASE_IMAGE_VERSION
FROM gcr.io/spark-operator/spark:${SPARK_OPERATOR_BASE_IMAGE_VERSION}
RUN rm -rf /opt/spark/
COPY --from=preparator /target/ /opt/
ENV SPARK_HOME=/opt/spark \
HADOOP_HOME=/opt/hadoop
ENV HADOOP_OPTS="-Djava.library.path=/opt/hadoop/lib/native" \
LD_LIBRARY_PATH=${HADOOP_HOME}/lib/native \
PATH=${HADOOP_HOME}/bin:${SPARK_HOME}/bin:${PATH}
COPY conf /opt/spark/conf/
RUN echo "export JAVA_HOME=${JAVA_HOME}" >> ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh && \
echo "export JAVA_HOME=${JAVA_HOME}" > ${SPARK_HOME}/conf/spark-env.sh && \
echo "export SPARK_DIST_CLASSPATH=\$(hadoop classpath)" >> ${SPARK_HOME}/conf/spark-env.sh
# 99 used instead of nobody because in the alpine image the Group Id of nobody is different than in the spark operator image.
RUN \
addgroup --gid 99 nobody \
&& echo "nobody:x:99:99:nobody:/nonexistent:/usr/sbin/nologin" >> /etc/passwd \
&& usermod -a -G users nobody \
&& chmod -R ugo+rw /var/lib/
USER nobody
# if we want local storage
# "spark.eventLog.dir": "tmp/spark-events"
RUN mkdir -p /tmp/spark-events
In my pod, the jar file is configured like this:
sparkConf:
  "spark.ui.port": "4045"
  "spark.eventLog.enabled": {{ .Values.spark.eventLogEnabled | quote }}
  "spark.eventLog.dir": "xx//"
  "spark.jars.ivySettings": "/vault/secrets/xx-ivysettings.xml"
  "spark.jars.ivy": "/opt/spark/.ivy2"
  "spark.jars.packages": "xxxx_2.12:{{ .Values.appVersion }}"
  "spark.blacklist.enabled": "false"
  "spark.driver.supervise": "true"
  "spark.app.name": {{ .Values.name | quote }}
  "spark.submit.deployMode": {{ .Values.spark.deployMode | quote }}
  "spark.driver.extraJavaOptions": "-Dlog4j.configurationFile=log4j.properties"
  "spark.executor.extraJavaOptions": "-Dlog4j.configurationFile=log4j.properties"
It would be easier to answer your question if you share your Dockerfile and your kubernetes Pod template yaml manifest, but in short: you can manipulate your Pod's permissions using the securityContext like in the example from the docs below:
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  volumes:
    - name: sec-ctx-vol
      emptyDir: {}
  containers:
    - name: sec-ctx-demo
      image: busybox
      command: [ "sh", "-c", "sleep 1h" ]
      volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      securityContext:
        allowPrivilegeEscalation: false
For debugging purposes you can start by setting allowPrivilegeEscalation to true or runAsUser: 0 (root), but keep in mind this is not a solution that can be used in production. Running containers as root is generally a bad idea, and in most cases it can be avoided.
"Therefore any permissions given to the spark folder are not present for this newly downloaded jar file which is downloaded at runtime into /opt/spark/.ivy2/xxx which has root permissions."
Is it really necessary for it to have root permissions? Most likely it can be fixed in your Dockerfile.
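Another option that avoids root entirely, sketched here under the assumption that /tmp is writable by the nobody user in your image (your Dockerfile already creates /tmp/spark-events as that user), is to point spark.jars.ivy at a directory the non-root user can write to. The path /tmp/.ivy2 is only an illustrative choice:

```yaml
sparkConf:
  # Assumption: /tmp is writable by the non-root user at runtime.
  # /tmp/.ivy2 is a hypothetical location, replacing /opt/spark/.ivy2.
  "spark.jars.ivy": "/tmp/.ivy2"
```

The other sparkConf entries stay as they are. Only the Ivy cache location changes, so jars resolved through spark.jars.packages are downloaded into a directory owned by the user that runs the driver.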
I have set up a local workstation with CentOS 8 and installed Docker CE and everything necessary for a remote development environment (SMB, firewall, etc.).
My client computer is a Mac, and the application is a docker-compose project with several containers, which works correctly with the Docker application for Mac.
After configuring PyCharm Professional with the remote interpreter and starting the project, two containers start perfectly (postgres:11.2-alpine, redis:alpine). However, the application, a Django REST service with its own Dockerfile, throws the error below when it is about to execute the entrypoint. If I log in to the server and run docker-compose up --build myself, it starts perfectly, so the problem is related to starting it from PyCharm.
Has anyone had this problem and been able to solve it?
Successfully tagged energy_energy:latest
Creating energy_redis-celery-energy_1 ...
Creating energy_redis-cache-energy_1 ...
Creating energy_postgres_1 ...
Creating energy_celery-worker-energy_1 ...
Creating energy_celery-beat-energy_1 ...
Creating energy_celery-beat-energy_1 ... error
ERROR: for energy_celery-beat-energy_1 Cannot start service celery-beat-energy: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"./start.sh\": stat ./start.sh: no such file or directory": unknown
Creating energy_celery-worker-energy_1 ... error
ERROR: for energy_celery-worker-energy_1 Cannot start service celery-worker-energy: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"./start.sh\": stat ./start.sh: no such file or directory": unknown
ERROR: for celery-beat-energy Cannot start service celery-beat-energy: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"./start.sh\": stat ./start.sh: no such file or directory": unknown
ERROR: for celery-worker-energy Cannot start service celery-worker-energy: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"./start.sh\": stat ./start.sh: no such file or directory": unknown
ERROR: Encountered errors while bringing up the project
This is my entire Dockerfile:
FROM python:3.7-alpine
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV CERTS_DIR /var/certs
ENV NGINX_VERSION 1.15.3
EXPOSE 1234
RUN apk update \
&& GPG_KEYS=B0F4253373F8F6F510D42178520A9993A1C052F8 \
&& CONFIG="\
--prefix=/etc/nginx \
--sbin-path=/usr/sbin/nginx \
--modules-path=/usr/lib/nginx/modules \
--conf-path=/etc/nginx/nginx.conf \
--error-log-path=/var/log/nginx/error.log \
--http-log-path=/var/log/nginx/access.log \
--pid-path=/var/run/nginx.pid \
--lock-path=/var/run/nginx.lock \
--http-client-body-temp-path=/var/cache/nginx/client_temp \
--http-proxy-temp-path=/var/cache/nginx/proxy_temp \
--http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp \
--http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp \
--http-scgi-temp-path=/var/cache/nginx/scgi_temp \
--user=nginx \
--group=nginx \
--with-http_ssl_module \
--with-http_realip_module \
--with-http_addition_module \
--with-http_sub_module \
--with-http_dav_module \
--with-http_flv_module \
--with-http_mp4_module \
--with-http_gunzip_module \
--with-http_gzip_static_module \
--with-http_random_index_module \
--with-http_secure_link_module \
--with-http_stub_status_module \
--with-http_auth_request_module \
--with-http_xslt_module=dynamic \
--with-http_image_filter_module=dynamic \
--with-http_geoip_module=dynamic \
--with-threads \
--with-stream \
--with-stream_ssl_module \
--with-stream_ssl_preread_module \
--with-stream_realip_module \
--with-stream_geoip_module=dynamic \
--with-http_slice_module \
--with-mail \
--with-mail_ssl_module \
--with-compat \
--with-file-aio \
--with-http_v2_module \
" \
&& addgroup -S nginx \
&& adduser -D -S -h /var/cache/nginx -s /sbin/nologin -G nginx nginx \
&& apk add tzdata \
gcc \
libc-dev \
linux-headers \
mariadb-dev \
postgresql-dev \
netcat-openbsd \
curl \
libffi-dev \
supervisor \
&& apk add --no-cache --virtual .build-deps \
make \
openssl-dev \
pcre-dev \
zlib-dev \
gnupg1 \
libxslt-dev \
gd-dev \
geoip-dev \
&& curl -fSL https://nginx.org/download/nginx-$NGINX_VERSION.tar.gz -o nginx.tar.gz \
&& curl -fSL https://nginx.org/download/nginx-$NGINX_VERSION.tar.gz.asc -o nginx.tar.gz.asc \
&& export GNUPGHOME="$(mktemp -d)" \
&& found=''; \
for server in \
ha.pool.sks-keyservers.net \
hkp://keyserver.ubuntu.com:80 \
hkp://p80.pool.sks-keyservers.net:80 \
pgp.mit.edu \
; do \
echo "Fetching GPG key $GPG_KEYS from $server"; \
gpg --keyserver "$server" --keyserver-options timeout=10 --recv-keys "$GPG_KEYS" && found=yes && break; \
done; \
test -z "$found" && echo >&2 "error: failed to fetch GPG key $GPG_KEYS" && exit 1; \
gpg --batch --verify nginx.tar.gz.asc nginx.tar.gz \
&& rm -rf "$GNUPGHOME" nginx.tar.gz.asc \
&& mkdir -p /usr/src \
&& tar -zxC /usr/src -f nginx.tar.gz \
&& rm nginx.tar.gz \
&& cd /usr/src/nginx-$NGINX_VERSION \
&& ./configure $CONFIG --with-debug \
&& make -j$(getconf _NPROCESSORS_ONLN) \
&& mv objs/nginx objs/nginx-debug \
&& mv objs/ngx_http_xslt_filter_module.so objs/ngx_http_xslt_filter_module-debug.so \
&& mv objs/ngx_http_image_filter_module.so objs/ngx_http_image_filter_module-debug.so \
&& mv objs/ngx_http_geoip_module.so objs/ngx_http_geoip_module-debug.so \
&& mv objs/ngx_stream_geoip_module.so objs/ngx_stream_geoip_module-debug.so \
&& ./configure $CONFIG \
&& make -j$(getconf _NPROCESSORS_ONLN) \
&& make install \
&& rm -rf /etc/nginx/html/ \
&& mkdir /etc/nginx/conf.d/ \
&& mkdir -p /usr/share/nginx/html/ \
&& install -m644 html/index.html /usr/share/nginx/html/ \
&& install -m644 html/50x.html /usr/share/nginx/html/ \
&& install -m755 objs/nginx-debug /usr/sbin/nginx-debug \
&& install -m755 objs/ngx_http_xslt_filter_module-debug.so /usr/lib/nginx/modules/ngx_http_xslt_filter_module-debug.so \
&& install -m755 objs/ngx_http_image_filter_module-debug.so /usr/lib/nginx/modules/ngx_http_image_filter_module-debug.so \
&& install -m755 objs/ngx_http_geoip_module-debug.so /usr/lib/nginx/modules/ngx_http_geoip_module-debug.so \
&& install -m755 objs/ngx_stream_geoip_module-debug.so /usr/lib/nginx/modules/ngx_stream_geoip_module-debug.so \
&& ln -s ../../usr/lib/nginx/modules /etc/nginx/modules \
&& strip /usr/sbin/nginx* \
&& strip /usr/lib/nginx/modules/*.so \
&& rm -rf /usr/src/nginx-$NGINX_VERSION \
\
&& apk add --no-cache --virtual .gettext gettext \
&& mv /usr/bin/envsubst /tmp/ \
\
&& runDeps="$( \
scanelf --needed --nobanner --format '%n#p' /usr/sbin/nginx /usr/lib/nginx/modules/*.so /tmp/envsubst \
| tr ',' '\n' \
| sort -u \
| awk 'system("[ -e /usr/local/lib/" $1 " ]") == 0 { next } { print "so:" $1 }' \
)" \
&& apk add --no-cache --virtual .nginx-rundeps $runDeps \
&& apk del .build-deps \
&& apk del .gettext \
&& mv /tmp/envsubst /usr/local/bin/ \
\
&& ln -sf /dev/stdout /var/log/nginx/access.log \
&& ln -sf /dev/stderr /var/log/nginx/error.log
COPY nginx.conf /etc/nginx/nginx.conf
COPY conf/certs /var/certs
COPY ./src /app
COPY supervisord.ini /etc/supervisor.d/supervisord.ini
WORKDIR /app
RUN pip install --upgrade pip
RUN pip install pipenv
RUN pipenv install --system
RUN chmod +x ./start.sh
ENTRYPOINT ["./start.sh"]
This is my entire docker-compose:
# SERVICE               ENDPOINT          USER  PASSWORD
# ==========================================================
# postgres-11           localhost:2323    root  root
# redis-cache-energy    not exposed       no    no
# redis-celery          not exposed       no    no
# energy                localhost:1337    no    no        Nginx => Gunicorn => Django
# celery-beat           not exposed       no    no
# celery-worker         not exposed       no    no
version: '3'
services:
####################################################################################################
# DATABASES #
####################################################################################################
  postgres:
    image: postgres:11.2-alpine
    restart: always
    volumes:
      - postgres_data_energy:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=root
      - POSTGRES_DB=local
    ports:
      - "2323:5432"
    networks:
      - energy-network
####################################################################################################
# REDIS #
####################################################################################################
  redis-cache-energy:
    image: redis:alpine
    command: redis-server --requirepass root
    restart: always
    networks:
      - energy-network
  redis-celery-energy:
    image: redis:alpine
    command: redis-server --requirepass root
    restart: always
    networks:
      - energy-network
####################################################################################################
# energy #
####################################################################################################
  celery-beat-energy:
    build: .
    command: celery-beat-energy
    restart: always
    volumes:
      - ./src:/app
      - ./logs:/var/logs
    env_file:
      - conf/local.env
    depends_on:
      - redis-celery-energy
    networks:
      - energy-network
  celery-worker-energy:
    build: .
    command: celery-worker-energy
    restart: always
    volumes:
      - ./src:/app
      - ./logs:/var/logs
    env_file:
      - conf/local.env
    depends_on:
      - redis-celery-energy
    networks:
      - energy-network
  energy:
    build: .
    command: nginx
    restart: always
    volumes:
      - ./src:/app
      - ./logs:/var/logs
    ports:
      - 1234:1234
    env_file:
      - conf/local.env
    depends_on:
      - postgres
      - redis-cache-energy
      - redis-celery-energy
      - celery-worker-energy
      - celery-beat-energy
    networks:
      - energy-network
####################################################################################################
# VOLUMES #
####################################################################################################
volumes:
  postgres_data_energy:
####################################################################################################
# networks #
####################################################################################################
networks:
  energy-network:
    driver: bridge
More info:
If I run the project from the SSH terminal in PyCharm and run an ls command in the entrypoint, I see the files.
[Screenshots: SSH terminal from PyCharm; entrypoint result showing the files]
If I run it from PyCharm, the ls output is empty.
[Screenshots: entrypoint result, empty; run from PyCharm]
More info about my problem
Run Daemonset
kubectl create -f test-daemon.yaml --validate=false
Error
Error from server: error when creating "test-daemon.yaml": the server could not find the requested resource (post daemonsets.extensions)
Config
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
Requires=network-online.target etcd2.service generate-serviceaccount-key.service
After=network-online.target etcd2.service generate-serviceaccount-key.service
[Service]
EnvironmentFile=/etc/environment
ExecStartPre=-/usr/bin/mkdir -p /opt/bin
ExecStartPre=/usr/bin/curl -L -o /opt/bin/kube-apiserver -z /opt/bin/kube-apiserver https://storage.googleapis.com/kubernetes-release/release/v1.0.1/bin/linux/amd64/kube-apiserver
ExecStartPre=/usr/bin/chmod +x /opt/bin/kube-apiserver
ExecStartPre=/opt/bin/wupiao 127.0.0.1:2379/v2/machines
ExecStart=/opt/bin/kube-apiserver \
--service_account_key_file=/opt/bin/kube-serviceaccount.key \
--service_account_lookup=false \
--admission_control=NamespaceLifecycle,NamespaceAutoProvision,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota \
--runtime_config=api/v1,extensions/v1beta1=true,extensions/v1beta1/daemonsets=true \
--allow_privileged=true \
--insecure_bind_address=0.0.0.0 \
--insecure_port=3001 \
--kubelet_https=true \
--secure_port=6443 \
--service-cluster-ip-range=10.100.0.0/16 \
--etcd_servers=http://127.0.0.1:2379 \
--public_address_override=${COREOS_PRIVATE_IPV4} \
--logtostderr=true
Restart=always
RestartSec=10
Added config
--runtime_config=api/v1,extensions/v1beta1=true,extensions/v1beta1/daemonsets=true
ReplicationController
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    app: test
  name: test
spec:
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
        - name: test
          image: 192.168.1.3:4000/test
          ports:
            - containerPort: 80
Try removing the schema cache: rm -rf /tmp/kubectl.schema
I am trying to mount a persistent disk on my container which runs a Postgres custom image. I am using Kubernetes and following this tutorial.
This is my db_pod.yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: lp-db
  labels:
    name: lp-db
spec:
  containers:
    - image: my_username/my-db
      name: my-db
      ports:
        - containerPort: 5432
          name: my-db
      volumeMounts:
        - name: pg-data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: pg-data
      gcePersistentDisk:
        pdName: my-db-disk
        fsType: ext4
I create the disk using the command gcloud compute disks create --size 200GB my-db-disk.
However, when I run the pod, delete it, and then run it again (like in the tutorial) my data is not persisted.
I tried multiple versions of this file, including with PersistentVolumes and PersistentVolumeClaims, and I tried changing the mountPath, but without success.
Edit
Dockerfile for creating the Postgres image:
FROM ubuntu:trusty
RUN rm /bin/sh && \
ln -s /bin/bash /bin/sh
# Get Postgres
RUN echo "deb http://apt.postgresql.org/pub/repos/apt/ trusty-pgdg main" >> /etc/apt/sources.list.d/pgdg.list
RUN apt-get update && \
apt-get install -y wget
RUN wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
# Install virtualenv (will be needed later)
RUN apt-get update && \
apt-get install -y \
libjpeg-dev \
libpq-dev \
postgresql-9.4 \
python-dev \
python-pip \
python-virtualenv \
strace \
supervisor
# Grab gosu for easy step-down from root
RUN gpg --keyserver pool.sks-keyservers.net --recv-keys B42F6819007F00F88E364FD4036A9C25BF357DD4
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates wget && rm -rf /var/lib/apt/lists/* \
&& wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/1.2/gosu-$(dpkg --print-architecture)" \
&& wget -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/1.2/gosu-$(dpkg --print-architecture).asc" \
&& gpg --verify /usr/local/bin/gosu.asc \
&& rm /usr/local/bin/gosu.asc \
&& chmod +x /usr/local/bin/gosu \
&& apt-get purge -y --auto-remove ca-certificates wget
# make the "en_US.UTF-8" locale so postgres will be utf-8 enabled by default
RUN apt-get update && apt-get install -y locales && rm -rf /var/lib/apt/lists/* \
&& localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8
ENV LANG en_US.utf8
# Adjust PostgreSQL configuration so that remote connections to the database are possible.
RUN echo "host all all 0.0.0.0/0 md5" >> /etc/postgresql/9.4/main/pg_hba.conf
# And add ``listen_addresses`` to ``/etc/postgresql/9.4/main/postgresql.conf``
RUN echo "listen_addresses='*'" >> /etc/postgresql/9.4/main/postgresql.conf
RUN echo "log_directory='/var/log/postgresql'" >> /etc/postgresql/9.4/main/postgresql.conf
# Add all code from the project and all config files
WORKDIR /home/projects/my-project
COPY . .
# Add VOLUMEs to allow backup of config, logs and databases
ENV PGDATA /var/lib/postgresql/data
VOLUME /var/lib/postgresql/data
# Expose an entrypoint and a port
RUN chmod +x scripts/sh/*
EXPOSE 5432
ENTRYPOINT ["scripts/sh/entrypoint-postgres.sh"]
And entrypoint script:
echo " I am " && gosu postgres whoami
gosu postgres /etc/init.d/postgresql start && echo 'Started postgres'
gosu postgres psql --command "CREATE USER myuser WITH SUPERUSER PASSWORD 'mypassword';" && echo 'Created user'
gosu postgres createdb -O myuser mydb && echo 'Created db'
# This just keeps the container alive.
tail -F /var/log/postgresql/postgresql-9.4-main.log
In the end, it seems that the real problem was the fact that I was trying to create the database from my entrypoint script.
Things such as creating a db or a user should be done at container creation time, so I ended up using the standard Postgres image, which provides a simple way to create a user and a db.
This is the fully functional configuration file for Postgres.
apiVersion: v1
kind: Pod
metadata:
  name: postgres
  labels:
    name: postgres
spec:
  containers:
    - name: postgres
      image: postgres
      env:
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        - name: POSTGRES_USER
          value: myuser
        - name: POSTGRES_PASSWORD
          value: mypassword
        - name: POSTGRES_DB
          value: mydb
      ports:
        - containerPort: 5432
      volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: pg-data
  volumes:
    - name: pg-data
      persistentVolumeClaim:
        claimName: pg-data-claim
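The pod above refers to a claim named pg-data-claim, which is not shown. A rough sketch of a PersistentVolume and PersistentVolumeClaim that could back it, assuming the my-db-disk GCE disk from the question is reused through the in-tree gcePersistentDisk plugin. The volume name pg-data-disk and the 200Gi size are assumptions for illustration:

```yaml
# Hypothetical PersistentVolume wrapping the existing GCE disk.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pg-data-disk
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ""        # empty class: bind statically, no dynamic provisioning
  gcePersistentDisk:
    pdName: my-db-disk
    fsType: ext4
---
# Claim bound explicitly to the volume above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  volumeName: pg-data-disk
  resources:
    requests:
      storage: 200Gi
```

Setting PGDATA to a pgdata subdirectory of the mount, as the pod does, keeps Postgres from initializing directly in the root of the disk, which on an ext4 volume contains a lost+found directory.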
Thanks to all those who helped me :)
Does your custom PostgreSQL image persist data at /var/lib/postgresql/data?
Are you able to get logs from your PostgreSQL container and spot anything interesting?
When your pod is running, can you see the mount points inside your container and check that the persistent disk is there?
I followed this scenario and I was able to persist my data by changing the mountPath to /var/lib/postgresql and also reproduced using cassandra (i.e. /var/lib/cassandra for mountPath)
I was able to delete/restart pods from different nodes/hosts and still see my "users" table and the data I previously entered. However, I was not using a custom image, I just used standard docker images.