container started from docker compose is unable to connect to another container - postgresql

I have a docker-compose file that looks like this:
version: '3.7'
networks:
  iam_network:
    external:
      name: foundation_iam
  rdc_network:
    name: rdcstu3_net
services:
  rdcdeploy:
    restart: "no"
    container_name: rdcdeploy
    build:
      context: ./rdcdeploy
      args:
        - build_version
        - build_type
    image: rdcdeploy:$build_version
    volumes:
      - ./cfg:/cfg
    networks:
      - rdc_network
  rdcrabbitmq:
    restart: "no"
    container_name: rdcrabbitmq
    build:
      context: ./rabbitmq
      args:
        - build_version
        - build_type
    image: rdcrabbitmq:$build_version
    ports:
      - "5772:5672"
      - "15772:15672"
    depends_on:
      - rdcdeploy
    volumes:
      - ./cfg:/cfg
    networks:
      - rdc_network
  rdcdb:
    restart: "no"
    container_name: rdcdb
    build:
      context: ./postgres
      args:
        - build_version
        - build_type
    image: rdcpostgres:$build_version
    ports:
      - "5532:5432"
    depends_on:
      - rdcdeploy
    volumes:
      - ./cfg:/cfg
    networks:
      - rdc_network
  rdcdbdeploy:
    restart: "no"
    container_name: rdcdbdeploy
    build:
      context: ./rdcdbdeploy
      args:
        - build_version
        - build_type
    image: rdcdbdeploy:$build_version
    depends_on:
      - rdcdb
    volumes:
      - ./cfg:/cfg
    networks:
      - rdc_network
  rihapp:
    restart: "no"
    container_name: rihapp
    build:
      context: ./rihserver
      args:
        - build_version
        - build_type
    image: rihapp:$build_version
    ports:
      - "9090:8080"
    depends_on:
      - rdcrabbitmq
      - rdcdb
    volumes:
      - ./cfg:/cfg
    networks:
      - iam_network
      - rdc_network
  subscription_scheduler:
    restart: "no"
    container_name: subscription_scheduler
    build:
      context: ./subscription
      args:
        - build_version
        - build_type
    image: subscription_scheduler:$build_version
    depends_on:
      - rdcrabbitmq
      - rdcdb
      - rihapp
    volumes:
      - ./cfg:/cfg
    networks:
      - iam_network
      - rdc_network
    environment:
      - rdc.subscription.instanceNumber=0
  subscription_processor:
    restart: "no"
    container_name: subscription_processor
    build:
      context: ./subscription
      args:
        - build_version
        - build_type
    image: subscription_processor:$build_version
    depends_on:
      - rdcrabbitmq
      - rdcdb
      - rihapp
    volumes:
      - ./cfg:/cfg
    networks:
      - iam_network
      - rdc_network
    environment:
      - rdc.subscription.instanceNumber=1
  rdcsmoketest:
    restart: "no"
    container_name: rdcsmoketests
    build:
      context: ./rdcdeploy
    image: rdcdeploy:$build_version
    volumes:
      - ./cfg:/cfg
    depends_on:
      - rihapp
    networks:
      - iam_network
      - rdc_network
    entrypoint:
      - wait-for-rihapp.sh
      - rdcdeploy
    command: ["-x", "-z", "/cfg", "-c", "/cfg/config.yml", "docker"]
I start it using docker-compose up and it shows that the containers are started.
eedaa5e11a0e rdicdeploy:3.3.0.1 "wait-for-rihapp.sh…" 2 minutes ago Up 38 seconds rdicsmoketests
9178355cbca7 subscription_scheduler:3.3.0.1 "./wait-for-env.sh /…" 2 minutes ago Up 38 seconds subscription_scheduler
ae24a4b76f3e subscription_processor:3.3.0.1 "./wait-for-env.sh /…" 2 minutes ago Up 38 seconds subscription_processor
5f789ae74ef2 rihapp:3.3.0.1 "./wait_for_rdic_db.s…" 2 minutes ago Up 39 seconds 0.0.0.0:9090->8080/tcp rihapp
698b26d0ca37 rdicdbdeploy:3.3.0.1 "wait-for-env-db.sh …" 2 minutes ago Up 39 seconds rdicdbdeploy
592cb850f5b9 rdicrabbitmq:3.3.0.1 "wait-for-env.sh /cf…" 2 minutes ago Up 39 seconds 4369/tcp, 5671/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:5772->5672/tcp, 0.0.0.0:15772->15672/tcp rdicrabbitmq
505a0f36528f rdicpostgres:3.3.0.1 "wait-for-env.sh /cf…" 2 minutes ago Up 39 seconds 0.0.0.0:5532->5432/tcp
But for some reason no containers are able to connect to either rabbitmq or postgres.
The rabbitmq logs show that it has started:
2020-07-24 10:32:13.226 [info] <0.370.0> Running boot step direct_client defined by app rabbit
2020-07-24 10:32:13.226 [info] <0.370.0> Running boot step os_signal_handler defined by app rabbit
2020-07-24 10:32:13.226 [info] <0.489.0> Swapping OS signal event handler (erl_signal_server) for our own
2020-07-24 10:32:13.262 [info] <0.539.0> Management plugin: HTTP (non-TLS) listener started on port 15672
2020-07-24 10:32:13.262 [info] <0.645.0> Statistics database started.
2020-07-24 10:32:13.262 [info] <0.644.0> Starting worker pool 'management_worker_pool' with 3 processes in it
2020-07-24 10:32:13.480 [info] <0.8.0> Server startup complete; 3 plugins started.
* rabbitmq_management
* rabbitmq_web_dispatch
* rabbitmq_management_agent
completed with 3 plugins.
The same goes for postgres:
server started
CREATE DATABASE
CREATE ROLE
/usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*
waiting for server to shut down...LOG: received fast shutdown request
.LOG: aborting any active transactions
LOG: autovacuum launcher shutting down
LOG: shutting down
LOG: database system is shut down
done
server stopped
PostgreSQL init process complete; ready for start up.
LOG: database system was shut down at 2020-07-24 10:30:59 UTC
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
Environment Available - proceeding with startup docker-entrypoint.sh postgres
LOG: database system was interrupted; last known up at 2020-07-24 10:31:00 UTC
LOG: database system was not properly shut down; automatic recovery in progress
LOG: invalid record length at 0/14EEEA0: wanted 24, got 0
LOG: redo is not required
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
But the applications trying to connect to port 5772 get connection refused for rabbitmq, and for postgres they report:
psql: error: could not connect to server: could not connect to server: Connection refused
rihapp | Is the server running on host "localhost" (127.0.0.1) and accepting
rihapp | TCP/IP connections on port 5532?
It also generates .env files that contain environment variables for the apps, like:
DATABASE_URL=postgres://rdc:rdc#localhost:5532/pg_db
spring.datasource.url=jdbc:postgresql://localhost:5532/pg_db
spring.rabbitmq.host=localhost
spring.rabbitmq.port=5772
What might be the problem? It feels like some kind of network problem.

It seems that you've configured the clients to contact the servers on localhost:X, am I getting this right?
In that case, be aware that containers started by docker-compose run in separate network namespaces and reach each other over a bridge network; inside a container, localhost refers to that container itself, not to the host. This means that in the container you should use rdcrabbitmq:5672 instead of localhost:5772.
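Applied to the generated settings shown above, the values would then look roughly like this (a sketch; the service names rdcrabbitmq/rdcdb and the internal ports 5672/5432 come from the compose file, while the rdc:rdc credentials and pg_db database are assumed from the original URLs, with the usual user:password@host separator):
DATABASE_URL=postgres://rdc:rdc@rdcdb:5432/pg_db
spring.datasource.url=jdbc:postgresql://rdcdb:5432/pg_db
spring.rabbitmq.host=rdcrabbitmq
spring.rabbitmq.port=5672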

Related

docker-compose wait for startup script of PostgreSQL to finish before starting local Dockerfile's container

I have the following docker-compose.yml file, which specifies that the backend service should wait until the postgres service is healthy before starting. Apparently, the postgres service is reported as healthy even while it is still running its startup script.
This is my docker-compose.yml file.
version: "3.7"
services:
backend:
build: .
ports:
- "8080:8080"
env_file:
- .env
depends_on:
postgres:
condition: service_healthy
postgres:
image: postgres:13
ports:
- "${DB_PORT}:${DB_PORT}"
env_file:
- .env
volumes:
- ./initdb.d:/docker-entrypoint-initdb.d
- data:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
interval: 5s
timeout: 10s
retries: 5
volumes:
data:
I am mounting a startup script, which runs for quite a while, into the ./initdb.d directory. The startup script populates the DB with 1M rows. It seems that while the startup scripts are running, the backend service can't connect to postgres. Right now, my best solution is simply to add restart: on-failure:5 to the backend and let it retry until the startup scripts finish. Is there a more robust way to achieve this, though?
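One common workaround (not from the original post, and assuming the official postgres image's init behaviour) is to force the healthcheck over TCP: while the init scripts run, the entrypoint's temporary server listens only on the Unix socket, so a TCP pg_isready keeps failing until the final server comes up. A sketch:
healthcheck:
  test: ["CMD-SHELL", "pg_isready -h 127.0.0.1 -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
  interval: 5s
  timeout: 10s
  retries: 120
The retries value (or a start_period) needs to be generous here, since the long-running init scripts now count against the healthcheck.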

What does "Configuring authentication for SERVER mode" mean?

I am running postgresql and pgadmin4 with docker swarm on my ubuntu 18.04 server, but pgadmin gives me errors, and after a while the application crashes and I can't access postgres. This is the error:
PermissionError: [Errno 1] Operation not permitted: '/var/lib/pgadmin/sessions'
WARNING: Failed to set ACL on the directory containing the configuration database:
[Errno 1] Operation not permitted: '/var/lib/pgadmin'
HINT : You may need to manually set the permissions on
/var/lib/pgadmin to allow pgadmin to write to it.
/usr/local/lib/python3.8/os.py:1023: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, *args, **kwargs)
[2020-09-04 21:12:24 +0000] [1] [INFO] Shutting down: Master
[2020-09-04 21:12:24 +0000] [1] [INFO] Reason: Worker failed to boot.
WARNING: Failed to set ACL on the directory containing the configuration database:
[Errno 1] Operation not permitted: '/var/lib/pgadmin'
HINT : You may need to manually set the permissions on
/var/lib/pgadmin to allow pgadmin to write to it.
NOTE: Configuring authentication for SERVER mode.
sudo: setrlimit(RLIMIT_CORE): Operation not permitted
[2020-09-04 21:14:26 +0000] [1] [INFO] Starting gunicorn 19.9.0
[2020-09-04 21:14:26 +0000] [1] [INFO] Listening at: http://[::]:80 (1)
[2020-09-04 21:14:26 +0000] [1] [INFO] Using worker: threads
/usr/local/lib/python3.8/os.py:1023: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, *args, **kwargs)
[2020-09-04 21:14:26 +0000] [89] [INFO] Booting worker with pid: 89
In the error output I see that it says NOTE: Configuring authentication for SERVER mode, but I don't know how to configure what it indicates. Could someone help me solve this problem?
Thank you.
Edit:
docker-compose.yml
version: '3'
services:
  ssl:
    image: danieldent/nginx-ssl-proxy
    restart: always
    environment:
      UPSTREAM: myApp:8086
      SERVERNAME: dominio.com
    ports:
      - 80:80/tcp
      - 443:443/tcp
    depends_on:
      - myApp
    volumes:
      - ./nginxAPP:/etc/letsencrypt
      - ./nginxAPP:/etc/nginx/user.conf.d:ro
  bdd:
    restart: always
    image: postgres:12
    ports:
      - 5432:5432/tcp
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: 12345
      POSTGRES_DB: miBDD
    volumes:
      - ./pgdata:/var/lib/postgresql/data
  pgadmin:
    image: dpage/pgadmin4
    ports:
      - 9095:80/tcp
    environment:
      PGADMIN_DEFAULT_EMAIL: user
      PGADMIN_DEFAULT_PASSWORD: 12345
      PROXY_X_FOR_COUNT: 3
      PROXY_X_PROTO_COUNT: 3
      PROXY_X_HOST_COUNT: 3
      PROXY_X_PORT_COUNT: 3
    volumes:
      - ./pgadminAplicattion:/var/lib/pgadmin
  myApp:
    restart: always
    image: appImage
    ports:
      - 8086:8086
    depends_on:
      - bdd
    working_dir: /usr/myApp
    environment:
      CONFIG_PATH: ../configuation
    command: "node server.js"
It's generally a bad idea to use bind mounts in non-development environments, and doubly so with Docker Swarm (as opposed to regular Docker). This applies especially to images like postgres or dpage/pgadmin4, which require the mounted directories to have specific ownership and/or read/write privileges.
In your case, you need to run:
sudo chown 999:999 pgdata
sudo chown 5050:5050 pgadminAplicattion
to give those directories the correct ownership.
That being said, it's a much better idea to avoid bind mounts entirely and use named volumes instead (irrelevant parts of the Compose file skipped):
version: "3"
services:
bdd:
restart: always
image: postgres:12
ports:
- 5432:5432/tcp
environment:
POSTGRES_USER: user
POSTGRES_PASSWORD: 12345
POSTGRES_DB: miBDD
volumes:
- pgdata:/var/lib/postgresql/data
pgadmin:
restart: always
image: dpage/pgadmin4
ports:
- 9095:80/tcp
environment:
PGADMIN_DEFAULT_EMAIL: user
PGADMIN_DEFAULT_PASSWORD: 12345
PROXY_X_FOR_COUNT: 3
PROXY_X_PROTO_COUNT: 3
PROXY_X_HOST_COUNT: 3
PROXY_X_PORT_COUNT: 3
volumes:
- pgadmin:/var/lib/pgadmin
volumes:
pgdata:
pgadmin:
In your case, you need to use the following command:
sudo chown -R 5050:5050 /var/lib/pgadmin

Docker: Wait for the Mongodb ReplicaSet

Hi everyone. I have an odd problem (who hasn't?).
I have this docker-compose file:
version: '3.4'
services:
  ludustack-web:
    container_name: ludustack-web
    image: ${DOCKER_REGISTRY-}ludustack-web
    build:
      context: .
      dockerfile: LuduStack.Web/Dockerfile
    networks:
      - ludustack-network
    ports:
      - '80:80'
      - '443:443'
    depends_on:
      - 'ludustack-db'
  ludustack-db:
    container_name: ludustack-db
    command: mongod --auth
    image: mongo:latest
    hostname: mongodb
    networks:
      - ludustack-network
    ports:
      - '27017:27017'
    env_file:
      - .env
    environment:
      - MONGO_INITDB_ROOT_USERNAME=${MONGO_INITDB_ROOT_USERNAME}
      - MONGO_INITDB_ROOT_PASSWORD=${MONGO_INITDB_ROOT_PASSWORD}
      - MONGO_INITDB_DATABASE=${MONGO_INITDB_DATABASE}
      - MONGO_REPLICA_SET_NAME=${MONGO_REPLICA_SET_NAME}
    healthcheck:
      test: test $$(echo "rs.initiate().ok || rs.status().ok" | mongo -u $${MONGO_INITDB_ROOT_USERNAME} -p $${MONGO_INITDB_ROOT_PASSWORD} --quiet) -eq 1
      interval: 60s
      start_period: 60s
    command: ["--replSet", "${MONGO_REPLICA_SET_NAME}", "--bind_ip_all"]
networks:
  ludustack-network:
    driver: bridge
The problem is that the web application only waits for the mongodb container to be ready, not for the replica set itself. So when the application starts, it crashes because the replica set is not ready yet. Right after the crash, the mongodb logs show the replica set continuing its initialization.
Any tips on how to make the web application wait for the replica set to be ready?
The application did wait, for 30 seconds. You can increase the timeout by adjusting the serverSelectionTimeoutMS URI option, or through language-specific means.
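In a MongoDB connection string that would look roughly like this (a sketch; the host ludustack-db comes from the compose file above, while the credentials, database name and replica set name rs0 are assumed placeholders):
mongodb://user:password@ludustack-db:27017/ludustack?replicaSet=rs0&serverSelectionTimeoutMS=120000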

Celery task hangs when .apply_async() or .delay() is used within request processed by Django, but works fine when invoked in shell

Invocation of the following task:
task__determine_order_details_processing_or_created_status.apply_async(
    args=[order_record.Order_ID],
    eta=datetime.now(GMT_timezone) + timedelta(minutes=1)
)
Ends up in a worker timeout. It looks like the method never releases the worker to continue its job:
web_1 | [2019-11-21 05:43:43 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:1559)
web_1 | [2019-11-21 05:43:43 +0000] [1559] [INFO] Worker exiting (pid: 1559)
web_1 | [2019-11-21 05:43:43 +0000] [1636] [INFO] Booting worker with pid: 1636
Whereas the same call invoked from the Django shell creates a perfectly working celery task:
celery_1 | [2019-11-21 05:47:06,500: INFO/MainProcess] Received task: task__determine_order_details_processing_or_created_status[f94708be-a0ab-4853-8785-a11c8c7ca9f1] ETA:[2019-11-21 05:48:06.304924+00:00]
docker-compose.yml:
web:
  build: ./server
  command: gunicorn server.wsgi:application --reload --limit-request-line 16376 --bind 0.0.0.0:8001
  volumes:
    - ./server:/usr/src
  expose:
    - 8001
  env_file: .env.dev
  links:
    - memcached
  depends_on:
    - db_development_2
    - redis
db_development_2:
  restart: always
  image: postgres:latest
  volumes:
    - postgres_development3:/var/lib/postgresql/volume/
  env_file: .env.dev
  logging:
    driver: none
redis:
  image: "redis:alpine"
  restart: always
  logging:
    driver: none
celery:
  build: ./server
  command: celery -A server.celery worker -l info
  env_file: .env.dev
  volumes:
    - ./server:/usr/src
  depends_on:
    - db_development_2
    - redis
  restart: always
celery-beat:
  build: ./server
  command: celery -A server.celery beat -l info
  env_file: .env.dev
  volumes:
    - ./server:/usr/src
  depends_on:
    - db_development_2
    - redis
  restart: always
  logging:
    driver: none
Can you please share more details? Is the error from gunicorn? Are you running this in a docker environment, with Celery in a different container?
What does the WSGI <-> YOUR_APP command look like? For example:
gunicorn app.wsgi:tour_application -w 6 -b :8000 --timeout 120
Can you try with a longer timeout, like 120 as in the example above?
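Applied to the web service in the compose file above, that would mean something like the following (a sketch; the 120-second value is just the one from the comment's example):
web:
  command: gunicorn server.wsgi:application --reload --limit-request-line 16376 --bind 0.0.0.0:8001 --timeout 120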

why can't the first container "see" the later containers

I have a series of containers that are started by docker-compose. Specifically they are multiple zookeeper containers:
zk1:
  image: seven10/zookeeper:3.4.6
  container_name: zk1
  hostname: zk1
  restart: always
  ports:
    - "2181:2181"
    - "2888:2888"
    - "3888:3888"
  environment:
    - ZOOKEEPER_ID=1
  net: ${MY_NETWORK_NAME}
  volumes:
    - /seven10/zk/zk1/data:/opt/zookeeper-3.4.6/data
zk2:
  image: seven10/zookeeper:3.4.6
  container_name: zk2
  restart: always
  hostname: zk2
  ports:
    - "2182:2181"
    - "2889:2888"
    - "3889:3888"
  environment:
    - ZOOKEEPER_ID=2
  net: ${MY_NETWORK_NAME}
  volumes:
    - /seven10/zk/zk2/data:/opt/zookeeper-3.4.6/data
zk3:
  image: seven10/zookeeper:3.4.6
  container_name: zk3
  hostname: zk3
  restart: always
  ports:
    - "2183:2181"
    - "2890:2888"
    - "3890:3888"
  environment:
    - ZOOKEEPER_ID=3
  net: ${MY_NETWORK_NAME}
  volumes:
    - /seven10/zk/zk3/data:/opt/zookeeper-3.4.6/data
So when I go to start the containers, zk1 gives me this warning at the start:
WARN [WorkerSender[myid=1]:QuorumCnxManager#382] - Cannot open channel to 3 at
election address zk3:3888
java.net.UnknownHostException: zk3
but then doesn't say anything else about zk3 after a couple of seconds.
However, zk1 gives the following error for zk2 continuously:
2016-02-18 15:28:57,384 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner#233] - Unexpected exception, tries=0, connecting to zk2:2888
java.net.UnknownHostException: zk2
zk2 doesn't say ANYTHING ever about zk1, but briefly complains with the "cannot open channel" error for zk3.
zk3 doesn't ever mention zk1 or zk2.
So the big problem is that zk1 can't find zk2 ever. It just spams the logs and refuses connections from kafka. Why is this so and how should I go about solving this problem?
My dev box is using docker version 1.9.1 and docker-compose version 1.5.1 on ubuntu 14.04 (Mint Rafello I think?), although the target environment will be ubuntu 15.10.
Does your host system know how to resolve zk1/2/3 to an IP address? If you're launching all three servers on the same node, you should use localhost as the host name (the server names should still be unique).
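With the host-side ports published in the compose file above, the ZooKeeper server list would then look roughly like this (a sketch assuming the standard zoo.cfg server.N syntax):
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890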