Postgres not starting on swarm server reboot - postgresql

I'm trying to run an app using Docker Swarm. The app is designed to run entirely locally on a single computer.
If I SSH into the server and run docker stack deploy, everything works, as seen in the docker service ls output:
When this deployment works, the services generally go live in this order:
Registry (a private registry)
Main (an Nginx service) and Postgres
All other services in random order (all Node apps)
The problem I am having is on reboot. When I reboot the server, the services pretty consistently fail with this result:
I am getting some errors that could be helpful.
In Postgres: docker service logs APP_NAME_postgres -f:
In Docker logs: sudo journalctl -fu docker.service
Update: June 5th, 2019
Also, by request from a GitHub issue, the docker version output:
Client:
 Version:           18.09.5
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        e8ff056
 Built:             Thu Apr 11 04:43:57 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.5
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       e8ff056
  Built:            Thu Apr 11 04:10:53 2019
  OS/Arch:          linux/amd64
  Experimental:     false
And docker info output:
Containers: 28
 Running: 9
 Paused: 0
 Stopped: 19
Images: 14
Server Version: 18.09.5
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
 NodeID: pbouae9n1qnezcq2y09m7yn43
 Is Manager: true
 ClusterID: nq9095ldyeq5ydbsqvwpgdw1z
 Managers: 1
 Nodes: 1
 Default Address Pool: 10.0.0.0/8
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 1
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 192.168.0.47
 Manager Addresses:
  192.168.0.47:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-50-generic
Operating System: Ubuntu 18.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 3.68GiB
Name: oeemaster
ID: 76LH:BH65:CFLT:FJOZ:NCZT:VJBM:2T57:UMAL:3PVC:OOXO:EBSZ:OIVH
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
WARNING: No swap limit support
And finally, my docker swarm stack/compose file:
secrets:
  jwt-secret:
    external: true
  pg-db:
    external: true
  pg-host:
    external: true
  pg-pass:
    external: true
  pg-user:
    external: true
services:
  accounts:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      JWT_SECRET_FILE: /run/secrets/jwt-secret
      PG_DB_FILE: /run/secrets/pg-db
      PG_HOST_FILE: /run/secrets/pg-host
      PG_PASS_FILE: /run/secrets/pg-pass
      PG_USER_FILE: /run/secrets/pg-user
    image: 127.0.0.1:5000/local-oee-master-accounts:v0.8.0
    secrets:
      - source: jwt-secret
      - source: pg-db
      - source: pg-host
      - source: pg-pass
      - source: pg-user
  graphs:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      PG_DB_FILE: /run/secrets/pg-db
      PG_HOST_FILE: /run/secrets/pg-host
      PG_PASS_FILE: /run/secrets/pg-pass
      PG_USER_FILE: /run/secrets/pg-user
    image: 127.0.0.1:5000/local-oee-master-graphs:v0.8.0
    secrets:
      - source: pg-db
      - source: pg-host
      - source: pg-pass
      - source: pg-user
  health:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      PG_DB_FILE: /run/secrets/pg-db
      PG_HOST_FILE: /run/secrets/pg-host
      PG_PASS_FILE: /run/secrets/pg-pass
      PG_USER_FILE: /run/secrets/pg-user
    image: 127.0.0.1:5000/local-oee-master-health:v0.8.0
    secrets:
      - source: pg-db
      - source: pg-host
      - source: pg-pass
      - source: pg-user
  live-data:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    image: 127.0.0.1:5000/local-oee-master-live-data:v0.8.0
    ports:
      - published: 32000
        target: 80
  main:
    depends_on:
      - accounts
      - graphs
      - health
      - live-data
      - point-logs
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      MAIN_CONFIG_FILE: nginx.local.conf
    image: 127.0.0.1:5000/local-oee-master-nginx:v0.8.0
    ports:
      - published: 80
        target: 80
      - published: 443
        target: 443
  modbus-logger:
    depends_on:
      - point-logs
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      CONTROLLER_ADDRESS: 192.168.2.100
      SERVER_ADDRESS: http://point-logs
    image: 127.0.0.1:5000/local-oee-master-modbus-logger:v0.8.0
  point-logs:
    depends_on:
      - postgres
      - registry
    deploy:
      restart_policy:
        condition: on-failure
    environment:
      ENV_TYPE: local
      PG_DB_FILE: /run/secrets/pg-db
      PG_HOST_FILE: /run/secrets/pg-host
      PG_PASS_FILE: /run/secrets/pg-pass
      PG_USER_FILE: /run/secrets/pg-user
    image: 127.0.0.1:5000/local-oee-master-point-logs:v0.8.0
    secrets:
      - source: pg-db
      - source: pg-host
      - source: pg-pass
      - source: pg-user
  postgres:
    depends_on:
      - registry
    deploy:
      restart_policy:
        condition: on-failure
        window: 120s
    environment:
      POSTGRES_PASSWORD: password
    image: 127.0.0.1:5000/local-oee-master-postgres:v0.8.0
    ports:
      - published: 5432
        target: 5432
    volumes:
      - /media/db_main/postgres_oee_master:/var/lib/postgresql/data:rw
  registry:
    deploy:
      restart_policy:
        condition: on-failure
    image: registry:2
    ports:
      - mode: host
        published: 5000
        target: 5000
    volumes:
      - /mnt/registry:/var/lib/registry:rw
version: '3.2'
Things I've tried
Action: Added restart_policy > window: 120s
Result: No Effect
Action: Postgres restart_policy > condition: none & crontab @reboot redeploy
Result: No Effect
Action: Set all containers stop_grace_period: 2m
Result: No Effect
Current Workaround
Currently, I have hacked together a solution that works just so I can move on to the next thing. I wrote a shell script called recreate.sh that kills the failed first-boot version of the stack, waits for it to tear down, and then "manually" runs docker stack deploy again. I then set the script to run at boot with crontab @reboot. This works for shutdowns and reboots, but I don't accept this as the proper answer, so I won't add it as one.
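For reference, a sketch of what such a recreate.sh could look like; the stack name, compose file path, and sleep duration below are placeholders, not the exact script:
```
#!/bin/bash
# Hacky workaround: tear down the failed first-boot stack and redeploy it.
# STACK and the compose file path are placeholders.
STACK=APP_NAME
docker stack rm "$STACK"
# Give swarm time to remove the old services, networks and containers.
sleep 60
docker stack deploy -c /path/to/docker-compose.yml "$STACK"
```
It would then be registered as a crontab entry such as `@reboot /usr/local/bin/recreate.sh`.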

It looks to me like you need to check who/what kills the postgres service. From the logs you posted it seems that postgres receives a smart shutdown signal and then stops gracefully. Your stack file has the restart policy set to "on-failure", and since the postgres process stops gracefully (exit code 0), Docker does not consider this a failure and, as instructed, does not restart it.
In conclusion, I'd recommend changing the restart policy from "on-failure" to "any".
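A minimal sketch of that change, touching only the relevant part of the postgres service from the stack file above:
```
postgres:
  deploy:
    restart_policy:
      condition: any    # restart even after a clean shutdown (exit code 0)
      window: 120s
```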
Also, keep in mind that the "depends_on" settings you use are ignored in swarm mode; your services/images need their own way of ensuring proper startup order, or must be able to work when dependent services are not up yet.
There's one more thing you could try: healthchecks. Perhaps your postgres base image has a healthcheck defined and it's terminating the container by sending a kill signal to it. As written earlier, postgres then shuts down gracefully, there's no error exit code, and the restart policy does not trigger. Try disabling the healthcheck in the YAML, or look at the Dockerfiles for the HEALTHCHECK directive and figure out why it triggers.
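Compose v3 lets you override an image-defined healthcheck from the stack file; a minimal sketch, assuming the healthcheck does come from the postgres base image:
```
postgres:
  healthcheck:
    disable: true    # override any HEALTHCHECK baked into the image
```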

Related

Docker Swarm - don't restart service on entrypoint success

When trying to deploy my app on Docker Swarm I have two services: NGINX, to serve static files, and a client app, to compile some static files. To run the static file compilation I'm using an entrypoint in the Compose file.
docker-compose.yml:
version: "3.9"
services:
nginx:
image: nginx
healthcheck:
test: curl --fail -s http://localhost:80/lib/tether/examples/viewport/index.html || exit 1
interval: 1m
timeout: 5s
retries: 3
volumes:
- /www:/usr/share/nginx/html/
deploy:
mode: replicated
replicas: 1
placement:
constraints:
- node.role == manager
ports:
- "8000:80"
depends_on:
- client
client:
image: my-client-image:latest
restart: "no"
deploy:
mode: replicated
replicas: 1
placement:
constraints:
- node.role == manager
volumes:
- /www:/app/www
entrypoint: /entrypoint.sh
entrypoint.sh
#!/bin/sh
./node_modules/.bin/gulp compilescss
I tried adding restart: "no" to my service, but the service is restarted on entrypoint completion anyway.
Docker 23.0.0 is now out. As such you have two options:
Stack files now support swarm jobs. Swarm understands that these run to completion, i.e. mode: replicated-job.
The Docker Compose V3 reference makes it clear that "restart:" applies to compose, and that "deploy.restart_policy.condition: on-failure" is the equivalent swarm statement.
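A minimal sketch of the swarm-job variant for the client service from the question, assuming Docker 23+ on the cluster:
```
client:
  image: my-client-image:latest
  entrypoint: /entrypoint.sh
  deploy:
    mode: replicated-job   # swarm runs this task to completion and does not restart it on success
    replicas: 1
```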

Unable to see services with traefik

I'm a beginner and I'm a bit confused about how Traefik works...
I want to use the app freqtrade (a trading bot) as a Docker service and replicate it with different configurations; if you have 5 minutes you can go check this guy, I want to do the same thing...
But I don't understand why I can't see my app running with Traefik.
What I did:
I configured my domain to point to my server, like this:
[screenshot: server config]
And on this machine I created a Docker swarm and the Traefik service following this tutorial, and then my docker compose file looks like this:
```
version: '3.3'
services:
  traefik:
    # Use the latest v2.2.x Traefik image available
    image: traefik:v2.2
    ports:
      # Listen on port 80, default for HTTP, necessary to redirect to HTTPS
      - 80:80
      # Listen on port 443, default for HTTPS
      - 443:443
    networks:
      - traefik-public
    deploy:
      placement:
        constraints:
          # Make the traefik service run only on the node with this label
          # as the node with it has the volume for the certificates
          - node.labels.traefik-public.traefik-public-certificates == true
      labels:
        # Enable Traefik for this service, to make it available in the public network
        - traefik.enable=true
        # Use the traefik-public network (declared below)
        - traefik.docker.network=traefik-public
        # Use the custom label "traefik.constraint-label=traefik-public"
        # This public Traefik will only use services with this label
        # That way you can add other internal Traefik instances per stack if needed
        - traefik.constraint-label=traefik-public
        # admin-auth middleware with HTTP Basic auth
        # Using the environment variables USERNAME and HASHED_PASSWORD
        - traefik.http.middlewares.admin-auth.basicauth.users=${USERNAME?Variable not set}:${HASHED_PASSWORD?Variable not set}
        # https-redirect middleware to redirect HTTP to HTTPS
        # It can be re-used by other stacks in other Docker Compose files
        - traefik.http.middlewares.https-redirect.redirectscheme.scheme=https
        - traefik.http.middlewares.https-redirect.redirectscheme.permanent=true
        # traefik-http set up only to use the middleware to redirect to https
        # Uses the environment variable DOMAIN
        - traefik.http.routers.traefik-public-http.rule=Host(`${DOMAIN?Variable not set}`)
        - traefik.http.routers.traefik-public-http.entrypoints=http
        - traefik.http.routers.traefik-public-http.middlewares=https-redirect
        # traefik-https the actual router using HTTPS
        # Uses the environment variable DOMAIN
        - traefik.http.routers.traefik-public-https.rule=Host(`${DOMAIN?Variable not set}`)
        - traefik.http.routers.traefik-public-https.entrypoints=https
        - traefik.http.routers.traefik-public-https.tls=true
        # Use the special Traefik service api@internal with the web UI/Dashboard
        - traefik.http.routers.traefik-public-https.service=api@internal
        # Use the "le" (Let's Encrypt) resolver created below
        - traefik.http.routers.traefik-public-https.tls.certresolver=le
        # Enable HTTP Basic auth, using the middleware created above
        - traefik.http.routers.traefik-public-https.middlewares=admin-auth
        # Define the port inside of the Docker service to use
        - traefik.http.services.traefik-public.loadbalancer.server.port=8080
    volumes:
      # Add Docker as a mounted volume, so that Traefik can read the labels of other services
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # Mount the volume to store the certificates
      - traefik-public-certificates:/certificates
    command:
      # Enable Docker in Traefik, so that it reads labels from Docker services
      - --providers.docker
      # Add a constraint to only use services with the label "traefik.constraint-label=traefik-public"
      - --providers.docker.constraints=Label(`traefik.constraint-label`, `traefik-public`)
      # Do not expose all Docker services, only the ones explicitly exposed
      - --providers.docker.exposedbydefault=false
      # Enable Docker Swarm mode
      - --providers.docker.swarmmode
      # Create an entrypoint "http" listening on port 80
      - --entrypoints.http.address=:80
      # Create an entrypoint "https" listening on port 443
      - --entrypoints.https.address=:443
      # Create the certificate resolver "le" for Let's Encrypt, uses the environment variable EMAIL
      - --certificatesresolvers.le.acme.email=${EMAIL?Variable not set}
      # Store the Let's Encrypt certificates in the mounted volume
      - --certificatesresolvers.le.acme.storage=/certificates/acme.json
      # Use the TLS Challenge for Let's Encrypt
      - --certificatesresolvers.le.acme.tlschallenge=true
      # Enable the access log, with HTTP requests
      - --accesslog
      # Enable the Traefik log, for configurations and errors
      - --log
      # Enable the Dashboard and API
      - --api

volumes:
  # Create a volume to store the certificates, there is a constraint to make sure
  # Traefik is always deployed to the same Docker node with the same volume containing
  # the HTTPS certificates
  traefik-public-certificates:

networks:
  traefik-public:
    driver: overlay
    attachable: true
```
And deployed it:
docker stack deploy -c traefik.yml traefik
After that Traefik works fine. But why can't I see port 8080 in my entrypoints? Is it important for the other services?
[screenshot: Traefik entrypoints]
I tried disabling the firewall in the server configuration and also ran ufw allow 8080, but nothing changed...
I created my application the same way I created the Traefik service, with this docker-compose file:
---
version: '3'
networks:
  traefik_traefik-public:
    external: true
services:
  freqtrade:
    image: freqtradeorg/freqtrade:stable
    # image: freqtradeorg/freqtrade:develop
    # Use plotting image
    # image: freqtradeorg/freqtrade:develop_plot
    # Build step - only needed when additional dependencies are needed
    # build:
    #   context: .
    #   dockerfile: "./docker/Dockerfile.custom"
    restart: unless-stopped
    container_name: freqtrade
    volumes:
      - "./user_data:/freqtrade/user_data"
    # Expose api on port 8080 (localhost only)
    # Please read the https://www.freqtrade.io/en/stable/rest-api/ documentation
    # before enabling this.
    networks:
      - traefik_traefik-public
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      restart_policy:
        condition: on-failure
        delay: 5s
    command: >
      trade
      --logfile /freqtrade/user_data/logs/freqtrade.log
      --db-url sqlite:////freqtrade/user_data/tradesv3.sqlite
      --config /freqtrade/user_data/config.json
      --strategy SampleStrategy
    labels:
      - 'traefik.http.routers.bot001.tls=true'
      - 'traefik.http.routers.bot001.rule=Host(`bot001.bots.lordgoliath.com`)'
      - 'traefik.http.services.bot001.loadbalancer.server.port=8080'
and this is part of the bot's configuration file (for access to the UI):
"api_server": {
"enabled": true,
"enable_openapi": true,
"listen_ip_address": "0.0.0.0",
"listen_port": 8080,
"verbosity": "info",
"jwt_secret_key": "somethingrandom",
"CORS_origins": ["https://bots.lordgoliath.com"],
"username": "api",
"password": "api"
},
then :
docker stack deploy -c docker-compose.yml freqtrade
So I have that :
goliath@localhost:~/freqtrade_test/user_data$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
nkvpjjztjibg freqtrade_freqtrade replicated 1/1 freqtradeorg/freqtrade:stable
6qryu28ute9i traefik_traefik replicated 1/1 traefik:v2.2 *:80->80/tcp, *:443->443/tcp
I can see the bot running with the command docker service logs freqtrade_freqtrade, but when I go to my domain I only see the Traefik dashboard and nothing else running.
[screenshots: Traefik HTTP and HTTPS routers]
How can I see my app freqtrade running? How can I access the bot UI via my domain?
Thanks!
Sorry for my bad English, I hope this is clear enough to understand my problem.
UPDATE
docker service inspect --pretty freqtrade_freqtrade
ID: o6bpaso69i9n6etybtj09xsqi
Name: ft1_freqtrade
Labels:
com.docker.stack.image=freqtradeorg/freqtrade:stable
com.docker.stack.namespace=ft1
Service Mode: Replicated
Replicas: 1
Placement:
Constraints: [node.role == manager]
UpdateConfig:
Parallelism: 1
On failure: pause
Monitoring Period: 5s
Max failure ratio: 0
Update order: stop-first
RollbackConfig:
Parallelism: 1
On failure: pause
Monitoring Period: 5s
Max failure ratio: 0
Rollback order: stop-first
ContainerSpec:
Image: freqtradeorg/freqtrade:stable@sha256:3b2f2acb5b9cfedaa7b07cf56af01d1a750bce4c3054bdbaf40ac27935c984eb
Args: trade --logfile /freqtrade/user_data/logs/freqtrade.log --db-url sqlite:////freqtrade/user_data/tradesv3.sqlite --config /freqtrade/user_data/config.json --strategy SampleStrategy
Mounts:
Target: /freqtrade/user_data
Source: /home/goliath/freqtrade_test/user_data
ReadOnly: false
Type: bind
Resources:
Networks: traefik_traefik-public
Endpoint Mode: vip
UPDATE: new docker-compose.yml
---
version: '3'
networks:
  traefik_traefik-public:
    external: true
services:
  freqtrade:
    image: freqtradeorg/freqtrade:stable
    # image: freqtradeorg/freqtrade:develop
    # Use plotting image
    # image: freqtradeorg/freqtrade:develop_plot
    # Build step - only needed when additional dependencies are needed
    # build:
    #   context: .
    #   dockerfile: "./docker/Dockerfile.custom"
    restart: unless-stopped
    container_name: freqtrade
    volumes:
      - "./user_data:/freqtrade/user_data"
    # Expose api on port 8080 (localhost only)
    # Please read the https://www.freqtrade.io/en/stable/rest-api/ documentation
    # before enabling this.
    networks:
      - traefik_traefik-public
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      restart_policy:
        condition: on-failure
        delay: 5s
      labels:
        - 'traefik.enabled=true'
        - 'traefik.http.routers.bot001.tls=true'
        - 'traefik.http.routers.bot001.rule=Host(`bot001.bots.lordgoliath.com`)'
        - 'traefik.http.services.bot001.loadbalancer.server.port=8080'
    command: >
      trade
      --logfile /freqtrade/user_data/logs/freqtrade.log
      --db-url sqlite:////freqtrade/user_data/tradesv3.sqlite
      --config /freqtrade/user_data/config.json
      --strategy SampleStrategy
UPDATE: docker network ls
goliath@localhost:~/freqtrade_test$ docker network ls
NETWORK ID NAME DRIVER SCOPE
003e00401b5d bridge bridge local
9f3d9a222928 docker_gwbridge bridge local
09a33afad0c9 host host local
r4u268yenm5u ingress overlay swarm
bed40e4a5c62 none null local
qo9w45gitke5 traefik_traefik-public overlay swarm
This is the minimal config you need to integrate in order to see the traefik dashboard on localhost:8080
version: "3.9"
services:
traefik:
image: traefik:latest
command: |
--api.insecure=true
ports:
- 8080:8080
Then, your minimal configuration to get traefik to route example.com to itself:
version: "3.9"
networks:
public:
attachable: true
name: traefik
services:
traefik:
image: traefik:latest
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
command: |
--api.insecure=true
--providers.docker.exposedbydefault=false
--providers.docker.swarmmode
--providers.docker.network=traefik
ports:
- 80:80
networks:
- public
deploy:
labels:
traefik.enable: "true"
traefik.http.routers.traefik.rule: Host(`example.com`)
traefik.http.services.traefik.loadbalancer.server.port: 8080
Now, minimal https support, using Traefik self-signed certs to start with. Note that we configure tls on the https entrypoint, which means traefik implicitly creates http and https variants for each router.
version: "3.9"
networks:
public:
attachable: true
name: traefik
services:
traefik:
image: traefik:latest
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
command: |
--api.insecure=true
--providers.docker.exposedbydefault=false
--providers.docker.swarmmode
--providers.docker.network=traefik
--entrypoints.http.address=:80
--entrypoints.https.address=:443
--entrypoints.https.http.tls=true
deploy:
placement:
constraints:
- node.role == manager
ports:
# - 8080:8080
- 80:80
- 443:443
networks:
- public
deploy:
labels:
traefik.enable: "true"
traefik.http.routers.traefik.rule: Host(`example.com`)
traefik.http.services.traefik.loadbalancer.server.port: 8080
At this point, gluing in your le config should be simple.
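For reference, a hedged sketch of that gluing: the acme flags below are taken verbatim from the question's original config and appended to the minimal command block above (the /certificates volume from the question is still needed for storage, and routers can then reference the resolver via a tls.certresolver=le label, as in the question):
```
command: |
  --api.insecure=true
  --providers.docker.exposedbydefault=false
  --providers.docker.swarmmode
  --providers.docker.network=traefik
  --entrypoints.http.address=:80
  --entrypoints.https.address=:443
  --entrypoints.https.http.tls=true
  --certificatesresolvers.le.acme.email=${EMAIL?Variable not set}
  --certificatesresolvers.le.acme.storage=/certificates/acme.json
  --certificatesresolvers.le.acme.tlschallenge=true
```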
Your freqtrade stack compose would need to be this. If this is a single-node swarm, just omit the placement constraints; but when the swarm is large enough to have workers, tasks that don't need to be on managers should explicitly be kept on workers.
Traefik needs to talk to the swarm API over the docker socket, which is available on manager nodes only, which is why it must be node.role == manager.
version: "3.9"
networks:
traefik:
external: true
services:
freqtrade:
image: freqtradeorg/freqtrade:stable
command: ...
volumes: ...
networks:
- traefik
deploy:
placement:
constraints:
- node.role == worker
restart_policy:
max_attempts: 5
labels:
traefik.enabled: "true"
traefik.http.routers.bot001.rule: Host(`bot001.bots.lordgoliath.com`)
traefik.http.services.bot001.loadbalancer.server.port: 8080

Disable caddy ssl to enable a deploy to Cloud Run through Gitlab CI

I am trying to deploy api-platform docker-compose images to Cloud Run. I have that working up to the point where the image starts to run. The system listens on port 8080, but I cannot turn off the HTTPS redirect, so it shuts down. Here is the message I receive:
ERROR: (gcloud.run.deploy) Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.
run: loading initial config: loading new config: loading http app module: provision http: server srv0: setting up route handlers: route 0: loading handler modules: position 0: loading module 'subroute': provision http.handlers.subroute: setting up subroutes: route 0: loading handler modules: position 0: loading module 'subroute': provision http.handlers.subroute: setting up subroutes: route 1: loading handler modules: position 0: loading module 'mercure': provision http.handlers.mercure: a JWT key for publishers must be provided
I am using Cloud Build to build the container and deploy to Run. This is kicked off through Gitlab CI.
This is my cloudbuild.yaml file.
steps:
  - name: docker/compose
    args: ['build']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['tag', 'workspace_caddy:latest', 'gcr.io/$PROJECT_ID/apiplatform']
  - name: docker
    args: ['push', 'gcr.io/$PROJECT_ID/apiplatform']
  - name: 'gcr.io/cloud-builders/gcloud'
    args: ['run', 'deploy', 'erp-ui', '--image', 'gcr.io/$PROJECT_ID/apiplatform', '--region', 'us-west1', '--platform', 'managed', '--allow-unauthenticated']
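For context, a minimal hedged sketch of the GitLab CI job that could kick this build off; the stage name, image tag, and authentication setup are assumptions, not taken from the original pipeline:
```
deploy:
  stage: deploy
  image: google/cloud-sdk:slim
  script:
    # Assumes the runner is already authenticated against the GCP project.
    - gcloud builds submit --config cloudbuild.yaml .
```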
docker-compose.yml
version: "3.4"
services:
php:
build:
context: ./api
target: api_platform_php
depends_on:
- database
restart: unless-stopped
volumes:
- php_socket:/var/run/php
healthcheck:
interval: 10s
timeout: 3s
retries: 3
start_period: 30s
pwa:
build:
context: ./pwa
target: api_platform_pwa_prod
environment:
API_PLATFORM_CLIENT_GENERATOR_ENTRYPOINT: http://caddy
caddy:
build:
context: api/
target: api_platform_caddy
depends_on:
- php
- pwa
environment:
PWA_UPSTREAM: pwa:3000
SERVER_NAME: ${SERVER_NAME:-localhost, caddy:8080}
MERCURE_PUBLISHER_JWT_KEY: ${MERCURE_PUBLISHER_JWT_KEY:-!ChangeMe!}
MERCURE_SUBSCRIBER_JWT_KEY: ${MERCURE_SUBSCRIBER_JWT_KEY:-!ChangeMe!}
restart: unless-stopped
volumes:
- php_socket:/var/run/php
- caddy_data:/data
- caddy_config:/config
ports:
# HTTP
- target: 8080
published: 8080
protocol: tcp
database:
image: postgres:13-alpine
environment:
- POSTGRES_DB=api
- POSTGRES_PASSWORD=!ChangeMe!
- POSTGRES_USER=api-platform
volumes:
- db_data:/var/lib/postgresql/data:rw
# you may use a bind-mounted host directory instead, so that it is harder to accidentally remove the volume and lose all your data!
# - ./api/docker/db/data:/var/lib/postgresql/data:rw
volumes:
php_socket:
db_data:
caddy_data:
caddy_config:

Filebeat failed connect to backoff?

I have 2 servers, A and B. When I run filebeat via docker-compose on server A it works well, but on server B I get the following error:
pipeline/output.go:154 Failed to connect to backoff(async(tcp://logstash_ip:5044)): dial tcp logstash_ip:5044: connect: no route to host
So I think I missed some config on server B. How can I figure out the problem and fix it?
[Edited] Added filebeat.yml and docker-compose
Note: the same filebeat setup that fails on one server still works on the other, so I guess the problem is with the server config.
filebeat.yml
logging.level: error
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /usr/share/filebeat/mylog/**/*.log
    processors:
      - decode_json_fields:
          fields: ['message']
          target: 'json'
output.logstash:
  hosts: ['logstash_ip:5044']
console.pretty: true
processors:
  - add_docker_metadata:
      host: 'unix:///host_docker/docker.sock'
docker-compose
version: '3.3'
services:
  filebeat:
    user: root
    container_name: filebeat
    image: docker.elastic.co/beats/filebeat:7.9.3
    volumes:
      - /var/run/docker.sock:/host_docker/docker.sock
      - /var/lib/docker:/host_docker/var/lib/docker
      - ./logs/progress:/usr/share/filebeat/mylog
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml:z
    command: ['--strict.perms=false']
    ulimits:
      memlock:
        soft: -1
        hard: -1
    stdin_open: true
    tty: true
    network_mode: bridge
    deploy:
      mode: global
    logging:
      driver: 'json-file'
      options:
        max-size: '10m'
        max-file: '50'
Thanks in advance
Assumptions:
The docker-compose file shown is for the filebeat "concentration" server, which is running in Docker on server B.
Both servers are running in the same network space and/or are reachable from each other.
Server B, as the filebeat server, has correct firewall settings to accept connections on port 5044 (check with telnet from server A after starting the container; see the check below).
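A quick way to test that last assumption from server A; <SERVER-B-IP> is a placeholder, as in the filebeat.yml below:
```
# From server A: check that server B accepts connections on port 5044.
telnet <SERVER-B-IP> 5044
# or, with netcat:
nc -vz <SERVER-B-IP> 5044
```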
docker-compose (assuming server B)
version: '3.3'
services:
  filebeat:
    user: root
    container_name: filebeat
    ports:               ##################
      - 5044:5044        # <- see open port
    image: docker.elastic.co/beats/filebeat:7.9.3
    volumes:
      - /var/run/docker.sock:/host_docker/docker.sock
      - /var/lib/docker:/host_docker/var/lib/docker
      - ./logs/progress:/usr/share/filebeat/mylog
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml:z
    command: ['--strict.perms=false']
    ulimits:
      memlock:
        soft: -1
        hard: -1
    stdin_open: true
    tty: true
    network_mode: bridge
    deploy:
      mode: global
    logging:
      driver: 'json-file'
      options:
        max-size: '10m'
        max-file: '50'
filebeat.yml (assuming both servers)
logging.level: error
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /usr/share/filebeat/mylog/**/*.log
    processors:
      - decode_json_fields:
          fields: ['message']
          target: 'json'
output.logstash:
  hosts: ['<SERVER-B-IP>:5044'] ## <- see server IP
console.pretty: true
processors:
  - add_docker_metadata:
      host: 'unix:///host_docker/docker.sock'

Docker containers with volume mounting exits immediately on using docker-compose up

I am using the docker-compose up command to spin up a few containers on an AWS AMI RHEL 7.6 instance. I observe that whichever containers have a volume mount exit with status Exited (1) immediately after starting, while the remaining containers stay up. I tried using tty: true and stdin_open: true, but it didn't help. Surprisingly, the set-up works fine on another instance, which is basically what I am trying to replicate in this new one.
The stopped containers are Fabric v1.2 peers, CAs and the orderer.
The docker-compose.yml file in the root folder where I run the docker-compose up command:
version: '2.1'
networks:
  gcsbc:
    name: gcsbc
services:
  ca.org1.example.com:
    extends:
      file: fabric/docker-compose.yml
      service: ca.org1.example.com
fabric/docker-compose.yml
version: '2.1'
networks:
  gcsbc:
services:
  ca.org1.example.com:
    image: hyperledger/fabric-ca
    environment:
      - FABRIC_CA_HOME=/etc/hyperledger/fabric-ca-server
      - FABRIC_CA_SERVER_CA_NAME=ca-org1
      - FABRIC_CA_SERVER_CA_CERTFILE=/etc/hyperledger/fabric-ca-server-config/ca.org1.example.com-cert.pem
      - FABRIC_CA_SERVER_TLS_ENABLED=true
      - FABRIC_CA_SERVER_TLS_CERTFILE=/etc/hyperledger/fabric-ca-server-config/ca.org1.example.com-cert.pem
    ports:
      - '7054:7054'
    command: sh -c 'fabric-ca-server start -b admin:adminpw -d'
    volumes:
      - ./artifacts/channel/crypto-config/peerOrganizations/org1.example.com/ca/:/etc/hyperledger/fabric-ca-server-config
    container_name: ca_peerorg1
    networks:
      - gcsbc
    hostname: ca.org1.example.com