I have a Docker Swarm cluster, and this is the output of docker info:
[dannil#ozcluster01 ozms]$ docker info
Containers: 15
Running: 10
Paused: 0
Stopped: 5
Images: 32
Server Version: swarm/1.2.5
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 2
ozcluster01: 192.168.168.41:2375
└ ID: CKCO:JGAA:PIOM:F4PL:6TIH:EQFY:KZ6X:B64Q:HRFH:FSTT:MLJT:BJUY
└ Status: Healthy
└ Containers: 8 (6 Running, 0 Paused, 2 Stopped)
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 192 MiB / 3.79 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.10.0-327.13.1.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper
└ UpdatedAt: 2016-11-04T09:24:29Z
└ ServerVersion: 1.10.3
ozcluster02: 192.168.168.42:2375
└ ID: 73GR:6M7W:GMWD:D3DO:UASW:YHJ2:BTH6:DCO5:NJM6:SXPN:PXTY:3NHI
└ Status: Healthy
└ Containers: 7 (4 Running, 0 Paused, 3 Stopped)
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 192 MiB / 3.79 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.10.0-327.10.1.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper
└ UpdatedAt: 2016-11-04T09:24:14Z
└ ServerVersion: 1.10.3
Then I run docker-compose up -d to start my containers with the label
constraint:node==ozcluster02
but the container still starts up on ozcluster01.
This is my docker-compose.yml file:
version: '2'
services:
  rabbitmq:
    image: rabbitmq
    ports:
      - "5672:5672"
      - "15672:15672"
  config-service:
    image: ozms/config-service
    ports:
      - "8888:8888"
    volumes:
      - ~/ozms/configs:/var/tmp/
      - ~/ozms/log:/log
    labels:
      - "affinity:image==ozms/config-service"
  eureka-service:
    image: ozms/eureka-service
    ports:
      - "8761:8761"
    volumes:
      - ~/ozms/log:/log
    labels:
      - "constraint:node==ozcluster02"
    environment:
      - SPRING_RABBITMQ_HOST=rabbitmq
In Compose v3 you should not put the constraint in the labels section; it belongs in the deploy section:
services:
  ...
  eureka-service:
    ...
    deploy:
      placement:
        constraints:
          - node.hostname==ozcluster02
See more at https://docs.docker.com/compose/compose-file/
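For completeness, a fuller sketch of how the eureka-service could look in a v3 file (image, ports, volumes and environment copied from the question; the deploy section is only honoured when the file is deployed with docker stack deploy on a swarm-mode cluster, and this assumes the node's hostname really is ozcluster02):

version: '3'
services:
  eureka-service:
    image: ozms/eureka-service
    ports:
      - "8761:8761"
    volumes:
      - ~/ozms/log:/log
    environment:
      - SPRING_RABBITMQ_HOST=rabbitmq
    deploy:
      placement:
        constraints:
          - node.hostname == ozcluster02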
I hope I am missing something, but I could not find an answer anywhere. I am using a Rook StorageClass resource to provision PVs that are later attached to a Pod. In the example I am adding 3 volumes (1 emptyDir, and 2 volumes from the Rook cluster).
test.yaml
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
  namespace: misc
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    # runAsNonRoot: true
  volumes:
    - name: keymanager-keys
      persistentVolumeClaim:
        claimName: keymanager-pvc
        readOnly: false
    - name: keymanager-keyblock
      persistentVolumeClaim:
        claimName: keymanager-block-pvc
        readOnly: false
    - name: keymanager-key-n
      persistentVolumeClaim:
        claimName: keymanager-ext4x-pvc
        readOnly: false
    - name: local-keys
      emptyDir: {}
    - name: ephemeral-storage
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteMany"]
            storageClassName: "ceph-filesystem"
            resources:
              requests:
                storage: 10Mi
  containers:
    - name: sec-ctx-demo
      image: busybox:1.28
      command: [ "sh", "-c", "sleep 1h" ]
      # securityContext:
      #   privileged: true
      volumeMounts:
        - name: keymanager-keys
          mountPath: /data/pvc-filesystem-preconfigured
        - name: local-keys
          mountPath: /data/emptydir
        - name: ephemeral-storage
          mountPath: /data/pvc-filesystem-ephemeral-storage
        - name: keymanager-keyblock
          mountPath: /data/pvc-block-storage
        - name: keymanager-key-n
          mountPath: /data/testing-with-fstypes
POD information:
kubectl get pods -n misc
NAME READY STATUS RESTARTS AGE
security-context-demo 1/1 Running 15 (30m ago) 15h
kubectl exec -it -n misc security-context-demo -- sh
/ $ ls -l /data/
total 5
drwxrwsrwx 2 root 2000 4096 Feb 13 17:31 emptydir
drwxrwsr-x 3 root 2000 1024 Feb 13 07:08 pvc-block-storage
drwxr-xr-x 2 root root 0 Feb 13 17:31 pvc-filesystem-ephemeral-storage
drwxr-xr-x 2 root root 0 Feb 9 11:29 pvc-filesystem-preconfigured
drwxr-xr-x 2 root root 0 Feb 13 08:28 testing-with-fstypes
As the listing shows, only the /data/emptydir and /data/pvc-block-storage directories end up with group 2000; the CephFS-backed directories are not mounted with group 2000 as expected.
Volumes that use Rook CephFS ignore the fsGroup field in the PodSecurityContext. If the Rook filesystem supported fsGroup, this should not be happening.
Expected behavior:
I would like to see the mounted volumes owned by a specific group (which will be used by unprivileged users). I don't know whether I am doing something wrong or what is happening; I expect to be able to write to the mounted volumes with the app user.
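Not from the original report, but one assumption worth checking is how kubelet decides whether to apply fsGroup to CSI volumes at all: the fsGroupPolicy field on the CephFS CSIDriver object. With the default ReadWriteOnceWithFSType policy, kubelet skips the fsGroup ownership change for shared CephFS volumes, which would match the listing above. A sketch of a permissive driver object (the driver name is assumed to follow the usual <operator-namespace>.cephfs.csi.ceph.com pattern):

apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: rook-ceph.cephfs.csi.ceph.com   # assumed name, check with kubectl get csidriver
spec:
  # 'File' tells kubelet to always apply the pod's fsGroup to volumes of this driver
  fsGroupPolicy: File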
Environment:
OS (e.g. from /etc/os-release): Debian GNU/Linux 11 (bullseye)
Kernel (e.g. uname -a): Linux worker1 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 GNU/Linux
Rook version (use rook version inside of a Rook Pod): v1.10.6
Storage backend version (e.g. for ceph do ceph -v): ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)
Kubernetes version (use kubectl version): v1.24.1
cluster:
  id: 25e0134a-515e-4f62-8eec-ae5d8cb3e650
  health: HEALTH_OK
services:
  mon: 3 daemons, quorum a,c,d (age 7w)
  mgr: a(active, since 10w)
  mds: 1/1 daemons up, 1 hot standby
  osd: 3 osds: 3 up (since 11w), 3 in (since 11w)
  rgw: 1 daemon active (1 hosts, 1 zones)
data:
  volumes: 1/1 healthy
  pools: 12 pools, 185 pgs
  objects: 421 objects, 1.2 MiB
  usage: 1.5 GiB used, 249 GiB / 250 GiB avail
  pgs: 185 active+clean
If more information is needed, let me know and I will update this post. If it helps, I can also add the PV and PVC configs that are created after Pod creation.
My YAML file looks like this:
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: m1
spec:
  domain:
    cpu:
      cores: 4
    devices:
      disks:
        - name: harddrive
          disk: {}
        - name: cloudinitdisk
          disk: {}
      interfaces:
        - name: ovs-net
          bridge: {}
        - name: default
          masquerade: {}
    resources:
      requests:
        memory: 8G
  volumes:
    - name: harddrive
      containerDisk:
        image: 1.1.1.1:8888/redhat/redhat79:latest
    - name: cloudinitdisk
      cloudInitNoCloud:
        userData: |
          #!/bin/bash
          echo 1 > /opt/1.txt
  networks:
    - name: ovs-net
      multus:
        networkName: ovs-vlan-100
    - name: default
      pod: {}
The VMI is running and I can log in to the VM, but there is nothing in the /opt directory. I did find a disk sdb; after mounting sdb to /mnt I can see the file 'userdata', and its content is correct.
I don't know where I went wrong.
K8S 1.22.10
I also tried the other two methods:
1)
cloudInitNoCloud:
  userData: |
    bootcmd:
      - touch /opt/1.txt
    runcmd:
      - touch /opt/2.txt
2)
cloudInitNoCloud:
  secretRef:
    name: my-vmi-secret
I hoped cloudInitNoCloud would work and run my commands.
I found the problem: the Docker image I used doesn't have the cloud-init packages installed.
The KubeVirt documentation doesn't mention this, so I thought I could use it directly.
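For reference, once cloud-init is actually installed in the guest image, the bootcmd/runcmd variant also needs the #cloud-config header on the first line to be recognised as cloud-config data; a minimal sketch (same commands as above) would be:

cloudInitNoCloud:
  userData: |
    #cloud-config
    bootcmd:
      - touch /opt/1.txt
    runcmd:
      - touch /opt/2.txt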
I have a custom kubernetes cluster with local disk PersistentVolumes. I am trying to deploy spring-cloud-dataflow using this guide.
However, none of the pods are able to write to the mounted persistent volumes. Here are the errors:
mariadb 12:55:19.88 INFO ==> Validating settings in MYSQL_*/MARIADB_* env vars
mariadb 12:55:19.89 INFO ==> Initializing mariadb database
mkdir: cannot create directory '/bitnami/mariadb/data': Permission denied

zookeeper 12:55:47.87 INFO ==> ** Starting ZooKeeper setup **
mkdir: cannot create directory '/bitnami/zookeeper/data': Permission denied
Stream closed EOF for default/spring-cdf-release-zookeeper-0 (zookeeper)
I have tried adding initContainers, but it did not help:
rabbitmq:
  enabled: false
mariadb:
  initContainers:
    - name: take-data-dir-ownership
      image: docker.io/bitnami/minideb:stretch
      command:
        - chown
        - -R
        - 777:777
        - /bitnami/mariadb
      securityContext:
        runAsUser: 0
      volumeMounts:
        - name: data-spring-cdf-release-mariadb-0
          mountPath: /bitnami/mariadb
kafka:
  enabled: true
  initContainers:
    - name: take-data-dir-ownership
      image: docker.io/bitnami/minideb:stretch
      command:
        - chown
        - -R
        - 777:777
        - /bitnami/kafka
      securityContext:
        runAsUser: 0
      volumeMounts:
        - name: data-spring-cdf-release-kafka-0
          mountPath: /bitnami/kafka
zookeeper:
  enabled: true
  initContainers:
    - name: take-data-dir-ownership
      image: docker.io/bitnami/minideb:stretch
      command:
        - chown
        - -R
        - 777:777
        - /bitnami/zookeeper
      securityContext:
        runAsUser: 0
      volumeMounts:
        - name: data-spring-cdf-release-zookeeper-0
          mountPath: /bitnami/zookeeper
Any suggestions on how I can make this volume writable by the pod?
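Not tested against this exact chart version, but the Bitnami sub-charts normally expose a volumePermissions switch that runs their own chown init container before the main container starts; assuming the spring-cloud-dataflow chart passes these values through, a sketch would be:

mariadb:
  volumePermissions:
    enabled: true
kafka:
  volumePermissions:
    enabled: true
zookeeper:
  volumePermissions:
    enabled: true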
I created a Docker volume as such:
sudo docker volume create --driver=local --name=es-data1 --opt type=none --opt o=bind --opt device=/usr/local/contoso/data1/elasticsearch/data1
/usr/local/contoso/data1/elasticsearch/data1 is a symlink.
And I'm instantiating three Elasticsearch Docker containers in my docker-compose.yml file as such:
version: '3.7'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.6.0
    logging:
      driver: none
    container_name: elasticsearch1
    environment:
      - node.name=elasticsearch1
      - cluster.name=docker-cluster
      - cluster.initial_master_nodes=elasticsearch1
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - network.host=_eth0_
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    cap_add:
      - ALL
    # privileged: true
    deploy:
      replicas: 1
      update_config:
        parallelism: 1
        delay: 10s
      resources:
        limits:
          cpus: '1'
          memory: 1G
        reservations:
          cpus: '1'
          memory: 1G
      restart_policy:
        condition: unless-stopped
        delay: 5s
        max_attempts: 3
        window: 10s
    volumes:
      - es-logs:/var/log
      - es-data1:/usr/share/elasticsearch/data
    networks:
      - elastic
      - ingress
    ports:
      - 9200:9200
      - 9300:9300
    healthcheck:
      test: wget -q -O - http://127.0.0.1:9200/_cat/health
  elasticsearch2:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.6.0
    logging:
      driver: none
    container_name: elasticsearch2
    environment:
      - node.name=elasticsearch2
      - cluster.name=docker-cluster
      - cluster.initial_master_nodes=elasticsearch1
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
      - "discovery.zen.ping.unicast.hosts=elasticsearch1"
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - network.host=_eth0_
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    cap_add:
      - ALL
    # privileged: true
    deploy:
      replicas: 1
      update_config:
        parallelism: 1
        delay: 10s
      resources:
        limits:
          cpus: '1'
          memory: 1G
        reservations:
          cpus: '1'
          memory: 1G
      restart_policy:
        condition: unless-stopped
        delay: 5s
        max_attempts: 3
        window: 10s
    volumes:
      - es-logs:/var/log
      - es-data2:/usr/share/elasticsearch/data
    networks:
      - elastic
      - ingress
    ports:
      - 9201:9200
    healthcheck:
      test: wget -q -O - http://127.0.0.1:9200/_cat/health
  elasticsearch3:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.6.0
    logging:
      driver: none
    container_name: elasticsearch3
    environment:
      - node.name=elasticsearch3
      - cluster.name=docker-cluster
      - cluster.initial_master_nodes=elasticsearch1
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
      - "discovery.zen.ping.unicast.hosts=elasticsearch1"
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - network.host=_eth0_
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    cap_add:
      - ALL
    # privileged: true
    deploy:
      replicas: 1
      update_config:
        parallelism: 1
        delay: 10s
      resources:
        limits:
          cpus: '1'
          memory: 1G
        reservations:
          cpus: '1'
          memory: 1G
      restart_policy:
        condition: unless-stopped
        delay: 5s
        max_attempts: 3
        window: 10s
    volumes:
      - es-logs:/var/log
      - es-data3:/usr/share/elasticsearch/data
    networks:
      - elastic
      - ingress
    ports:
      - 9202:9200
    healthcheck:
      test: wget -q -O - http://127.0.0.1:9200/_cat/health
volumes:
  es-data1:
    driver: local
    external: true
  es-data2:
    driver: local
    external: true
  es-data3:
    driver: local
    external: true
  es-logs:
    driver: local
    external: true
networks:
  elastic:
    external: true
  ingress:
    external: true
My Problem:
The Elasticsearch containers are persisting index data to both the host filesystem and the mounted symlink.
My Question:
How do I modify my configuration so that the Elasticsearch containers are only persisting index data to the mounted symlink?
It seems to be the default behavior of the local volume driver that the files are additionally stored on the host machine. You can change the volume settings in your docker-compose.yml to prevent Docker from persisting (copying) files on the host file system (see nocopy: true), like so:
version: '3.7'
services:
  elasticsearch:
    ....
    volumes:
      - type: volume
        source: es-data1
        target: /usr/share/elasticsearch/data
        volume:
          nocopy: true
    ....
volumes:
  es-data1:
    driver: local
    external: true
You may also want to check this question: Docker-compose - volumes driver local meaning. There seem to be some Docker volume plugins made specifically for portability reasons, such as Flocker or Hedvig, but I haven't used a plugin for that purpose, so I can't really recommend one yet.
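As a side note, the pre-created bind volume from the question can also be declared directly in the compose file instead of with docker volume create; a sketch mirroring the same flags (drop external: true in that case) would be:

volumes:
  es-data1:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /usr/local/contoso/data1/elasticsearch/data1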
I'm trying to run an app using Docker Swarm. The app is designed to run completely locally on a single computer using Docker Swarm.
If I SSH into the server and run docker stack deploy, everything works, as seen here running docker service ls:
When this deployment works, the services generally go live in this order:
Registry (a private registry)
Main (an Nginx service) and Postgres
All other services in random order (all Node apps)
The problem I am having is on reboot. When I reboot the server, the services pretty consistently fail with this result:
I am getting some errors that could be helpful.
In Postgres: docker service logs APP_NAME_postgres -f:
In Docker logs: sudo journalctl -fu docker.service
Update: June 5th, 2019
Also, by request from a GitHub issue, here is the docker version output:
Client:
  Version: 18.09.5
  API version: 1.39
  Go version: go1.10.8
  Git commit: e8ff056
  Built: Thu Apr 11 04:43:57 2019
  OS/Arch: linux/amd64
  Experimental: false

Server: Docker Engine - Community
  Engine:
    Version: 18.09.5
    API version: 1.39 (minimum version 1.12)
    Go version: go1.10.8
    Git commit: e8ff056
    Built: Thu Apr 11 04:10:53 2019
    OS/Arch: linux/amd64
    Experimental: false
And docker info output:
Containers: 28
Running: 9
Paused: 0
Stopped: 19
Images: 14
Server Version: 18.09.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
NodeID: pbouae9n1qnezcq2y09m7yn43
Is Manager: true
ClusterID: nq9095ldyeq5ydbsqvwpgdw1z
Managers: 1
Nodes: 1
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 1
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 192.168.0.47
Manager Addresses:
192.168.0.47:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.15.0-50-generic
Operating System: Ubuntu 18.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 3.68GiB
Name: oeemaster
ID: 76LH:BH65:CFLT:FJOZ:NCZT:VJBM:2T57:UMAL:3PVC:OOXO:EBSZ:OIVH
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
WARNING: No swap limit support
And finally, my Docker Swarm stack/compose file:
secrets:
  jwt-secret:
    external: true
  pg-db:
    external: true
  pg-host:
    external: true
  pg-pass:
    external: true
  pg-user:
    external: true
  ssl_dhparam:
    external: true
services:
  accounts:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      JWT_SECRET_FILE: /run/secrets/jwt-secret
      PG_DB_FILE: /run/secrets/pg-db
      PG_HOST_FILE: /run/secrets/pg-host
      PG_PASS_FILE: /run/secrets/pg-pass
      PG_USER_FILE: /run/secrets/pg-user
    image: 127.0.0.1:5000/local-oee-master-accounts:v0.8.0
    secrets:
      - source: jwt-secret
      - source: pg-db
      - source: pg-host
      - source: pg-pass
      - source: pg-user
  graphs:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      PG_DB_FILE: /run/secrets/pg-db
      PG_HOST_FILE: /run/secrets/pg-host
      PG_PASS_FILE: /run/secrets/pg-pass
      PG_USER_FILE: /run/secrets/pg-user
    image: 127.0.0.1:5000/local-oee-master-graphs:v0.8.0
    secrets:
      - source: pg-db
      - source: pg-host
      - source: pg-pass
      - source: pg-user
  health:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      PG_DB_FILE: /run/secrets/pg-db
      PG_HOST_FILE: /run/secrets/pg-host
      PG_PASS_FILE: /run/secrets/pg-pass
      PG_USER_FILE: /run/secrets/pg-user
    image: 127.0.0.1:5000/local-oee-master-health:v0.8.0
    secrets:
      - source: pg-db
      - source: pg-host
      - source: pg-pass
      - source: pg-user
  live-data:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    image: 127.0.0.1:5000/local-oee-master-live-data:v0.8.0
    ports:
      - published: 32000
        target: 80
  main:
    depends_on:
      - accounts
      - graphs
      - health
      - live-data
      - point-logs
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      MAIN_CONFIG_FILE: nginx.local.conf
    image: 127.0.0.1:5000/local-oee-master-nginx:v0.8.0
    ports:
      - published: 80
        target: 80
      - published: 443
        target: 443
  modbus-logger:
    depends_on:
      - point-logs
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      CONTROLLER_ADDRESS: 192.168.2.100
      SERVER_ADDRESS: http://point-logs
    image: 127.0.0.1:5000/local-oee-master-modbus-logger:v0.8.0
  point-logs:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      ENV_TYPE: local
      PG_DB_FILE: /run/secrets/pg-db
      PG_HOST_FILE: /run/secrets/pg-host
      PG_PASS_FILE: /run/secrets/pg-pass
      PG_USER_FILE: /run/secrets/pg-user
    image: 127.0.0.1:5000/local-oee-master-point-logs:v0.8.0
    secrets:
      - source: pg-db
      - source: pg-host
      - source: pg-pass
      - source: pg-user
  postgres:
    depends_on:
      - registry
    deploy:
      restart_policy:
        condition: on-failure
        window: 120s
    environment:
      POSTGRES_PASSWORD: password
    image: 127.0.0.1:5000/local-oee-master-postgres:v0.8.0
    ports:
      - published: 5432
        target: 5432
    volumes:
      - /media/db_main/postgres_oee_master:/var/lib/postgresql/data:rw
  registry:
    deploy:
      restart_policy:
        condition: on-failure
    image: registry:2
    ports:
      - mode: host
        published: 5000
        target: 5000
    volumes:
      - /mnt/registry:/var/lib/registry:rw
version: '3.2'
Things I've tried
Action: Added restart_policy > window: 120s
Result: No Effect
Action: Postgres restart_policy > condition: none & crontab @reboot redeploy
Result: No Effect
Action: Set all containers stop_grace_period: 2m
Result: No Effect
Current Workaround
Currently, I have hacked together a solution that works just so I can move on to the next thing. I wrote a shell script called recreate.sh that kills the failed first-boot version of the server, waits for it to break down, and then "manually" runs docker stack deploy again. I then set the script to run at boot with a crontab @reboot entry. This works for shutdowns and reboots, but I don't accept it as the proper answer, so I won't add it as one.
It looks to me like you need to check who or what kills the postgres service. From the logs you posted it seems that postgres receives a smart shutdown signal and then stops gracefully. Your stack file has the restart policy set to "on-failure", and since the postgres process stops gracefully (exit code 0), Docker does not consider this a failure and, as instructed, does not restart it.
In conclusion, I'd recommend changing the restart policy from "on-failure" to "any".
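A minimal sketch of that change, shown only for the postgres service from your stack file:

services:
  postgres:
    deploy:
      restart_policy:
        # 'any' restarts the task even after a clean exit (code 0)
        condition: any
        window: 120s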
Also, keep in mind that the "depends_on" settings you use are ignored in swarm mode; your services/images need their own way of ensuring proper startup order, or they must be able to cope when dependent services are not up yet.
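One common pattern (purely illustrative; the tool and start command below are assumptions about your images, not taken from your stack) is to have each dependent service poll Postgres before launching its real process:

services:
  accounts:
    # hypothetical override; assumes the image ships nc and that the app
    # is normally started with npm start
    command: ["sh", "-c", "until nc -z postgres 5432; do sleep 1; done; npm start"]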
There's one more thing you could try: healthchecks. Perhaps your postgres base image has a healthcheck defined and it's terminating the container by sending a kill signal to it. As written earlier, postgres then shuts down gracefully, there's no error exit code, and the restart policy does not trigger. Try disabling the healthcheck in the YAML, or look for the HEALTHCHECK directive in the Dockerfiles and figure out why it triggers.
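If you want to rule the healthcheck out quickly, compose can disable one inherited from the image; shown again only for postgres:

services:
  postgres:
    healthcheck:
      # overrides any HEALTHCHECK baked into the image
      disable: true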