Failed to run patroni - postgresql

I followed this tutorial: https://linode.com/docs/databases/postgresql/create-a-highly-available-postgresql-cluster-using-patroni-and-haproxy/ to set up a highly available PostgreSQL cluster using Patroni and HAProxy.
But when I try to start patroni I get this error:
ubuntu#sudo patroni /etc/patroni.yml
2018-05-31 09:49:37,159 INFO: Failed to import patroni.dcs.consul
2018-05-31 09:49:37,166 INFO: Selected new etcd server http://privateetcdIP:2379
2018-05-31 09:49:37,173 INFO: Lock owner: None; I am postgresqlm
2018-05-31 09:49:37,175 INFO: trying to bootstrap a new cluster
pg_ctl: cannot be run as root
Please log in (using, e.g., "su") as the (unprivileged) user that will
own the server process.
2018-05-31 09:49:37,185 INFO: removing initialize key after failed attempt to bootstrap the cluster
2018-05-31 09:49:37,673 INFO: Lock owner: None; I am postgresqlm
Traceback (most recent call last):
File "/usr/local/bin/patroni", line 9, in <module>
load_entry_point('patroni==1.4.4', 'console_scripts', 'patroni')()
File "/usr/local/lib/python2.7/dist-packages/patroni/__init__.py", line 176, in main
return patroni_main()
File "/usr/local/lib/python2.7/dist-packages/patroni/__init__.py", line 145, in patroni_main
patroni.run()
File "/usr/local/lib/python2.7/dist-packages/patroni/__init__.py", line 114, in run
logger.info(self.ha.run_cycle())
File "/usr/local/lib/python2.7/dist-packages/patroni/ha.py", line 1164, in run_cycle
info = self._run_cycle()
File "/usr/local/lib/python2.7/dist-packages/patroni/ha.py", line 1077, in _run_cycle
return self.post_bootstrap()
File "/usr/local/lib/python2.7/dist-packages/patroni/ha.py", line 976, in post_bootstrap
self.cancel_initialization()
File "/usr/local/lib/python2.7/dist-packages/patroni/ha.py", line 971, in cancel_initialization
raise PatroniException('Failed to bootstrap cluster')
The configuration of /etc/patroni.yml is:
scope: postgres
namespace: /db/
name: postgresqlm
restapi:
listen: privateIPoffirstnode:8008
connect_address: privateIPoffirstnode:8008
etcd:
host: privateIPofetcd:2379
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
max_connections: 100
initdb:
- encoding: UTF8
- data-checksums
pg_hba:
- host replication replicator 127.0.0.1/32 md5
- host replication replicator privateIPoffirstnode/0 md5
- host replication replicator privateIPofsecondnode/0 md5
- host replication replicator privateIPofthirdnode/0 md5
- host all all 0.0.0.0/0 md5
users:
admin:
password: admin
options:
- createrole
- createdb
postgresql:
listen: privateIPoffirstnode:5432
connect_address: privateIPoffirstnode:5432
data_dir: /data/patroni
pgpass: /tmp/pgpass
bin_dir: /usr/lib/postgresql/9.5/bin
authentication:
replication:
username: replicator
password: rep-pass
superuser:
username: postgres
password: '12345'
parameters:
unix_socket_directories: '.'
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
The configuration of /etc/systemd/system/patroni.service is:
[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target
[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no
[Install]
WantedBy=multi-user.targ
etcd configuration:
ETCD_LISTEN_PEER_URLS="http://privateIPofetcd:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://privateIPofetcd:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://privateIPofetcd:2380"
ETCD_INITIAL_CLUSTER="etcd0=http://privateIPofetcd:2380,"
ETCD_ADVERTISE_CLIENT_URLS="http://privateIPofetcd:2379"
ETCD_INITIAL_CLUSTER_TOKEN="cluster1"
ETCD_INITIAL_CLUSTER_STATE="new"
Of course, I have the real IPs in place of privateIPoffirstnode, privateIPofsecondnode, etc.
So, does anyone know what this error means?

I think the answer is obvious. If you start patroni with sudo, it runs as root, and pg_ctl refuses to run as root. That is exactly the error message you get.
Why don't you start it via systemctl? Your /etc/systemd/system/patroni.service has correctly configured a User that is not root.
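To make the point about the unit file concrete, here is a minimal Python sketch (not part of Patroni or systemd) that reads the unit text from the question and reports which account the service would run as. It assumes the unit is simple enough to parse with configparser, which only approximates systemd's own parser; service_user and runs_as_root are hypothetical helper names.

```python
import configparser

# Unit text copied from the question's /etc/systemd/system/patroni.service
# (the [Install] section is left out; it is not needed for this check).
UNIT_TEXT = """\
[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target
[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no
"""


def service_user(unit_text):
    """Return the User= value of the [Service] section, or 'root' if unset."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the case of systemd key names
    parser.read_string(unit_text)
    # A system service without User= is started as root.
    return parser.get("Service", "User", fallback="root")


def runs_as_root(unit_text):
    """True if the unit would start its main process as root."""
    return service_user(unit_text) in ("root", "0")


print(service_user(UNIT_TEXT))
print(runs_as_root(UNIT_TEXT))
```

If this reports postgres, then starting the service with sudo systemctl start patroni (after sudo systemctl daemon-reload) runs Patroni, and therefore pg_ctl, as postgres rather than root.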

Follow this guide to configure a highly available PostgreSQL cluster.
It's fully tested and working.

Related

Can't connect PostGIS to database and server

I followed these steps to set up QWC services https://github.com/qwc-services/qwc-services-core#quick-start and I can run the demo. But if I load my own QGIS project, I receive the following error message:
qwc-qgis-server_1 | 07:50:07 WARNING Server[99]: <ServerException>Layer(s) not valid</ServerException>
qwc-qgis-server_1 |
qwc-qgis-server_1 | 07:50:07 WARNING ClearCapabilities[99]: Cached cleared : /data/MeasurementDemo.qgs
qwc-qgis-server_1 | 07:50:07 WARNING PostGIS[99]: Connection to database failed
qwc-qgis-server_1 | could not connect to server: No such file or directory
qwc-qgis-server_1 | Is the server running locally and accepting
qwc-qgis-server_1 | connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
qwc-qgis-server_1 |
qwc-qgis-server_1 | 07:50:07 CRITICAL Server[99]: Error, Layer(s) measurement_b46e976f_2d0f_4bf0_942a_9d9462b40c3e not valid in project /data/MeasurementDemo.qgs
qwc-qgis-server_1 | 07:50:07 WARNING Server[99]: <ServerException>Layer(s) not valid</ServerException>
qwc-qgis-server_1 |
qwc-config-service_1 | [2022-01-04 07:50:09,360] WARNING in config_generator: Skipping theme item '': Could not get capabilities for /ows/MeasurementDemo
qwc-config-service_1 | [2022-01-04 07:50:19,468] CRITICAL in config_generator: The generation of the configuration files resulted in a failure
qwc-config-service_1 | [2022-01-04 07:50:19,468] CRITICAL in config_generator: The configuration files were not updated!
qwc-config-service_1 | [2022-01-04 07:50:20,856] CRITICAL in config_generator: The generation of the permission files resulted in a failure.
qwc-config-service_1 | [2022-01-04 07:50:20,857] CRITICAL in config_generator: The permission files were not updated!
qwc-config-service_1 | [pid: 15|app: 0|req: 18/18] 172.18.0.11 () {30 vars in 408 bytes} [Tue Jan 4 07:50:05 2022] POST /generate_configs?tenant=default => generated 2881 bytes in 15083 msecs (HTTP/1.1 200) 2 headers in 81 bytes (1 switches on core 0)
As the error is quite similar to this question: PostgreSQL: Why psql can't connect to server?, I followed the answers but with no result.
ps -ef | grep postgres gives me the following result:
postgres 203911 1 0 07:35 ? 00:00:00 /usr/lib/postgresql/13/bin/postgres -D /var/lib/postgresql/13/main -c config_file=/etc/postgresql/13/main/postgresql.conf
Also I found the socket in
/var/run/postgresql/.s.PGSQL.5432
And I run the command
psql -h /var/run/postgresql/ GeoDB
But without result. After that I checked the pg_hba.conf file:
# "local" is for Unix domain socket connections only
local all all peer
Running the command pg_lsclusters gives me:
Ver Cluster Port Status Owner Data directory Log file
13 main 5432 online postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log
Also after restarting the pg_ctlcluster and PostgreSQL the error remained the same.
Edit 1
After the answer from cnaimi I checked the postgresql.conf file:
# - Connection Settings -
#listen_addresses = '*' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
port = 5432 # (change requires restart)
max_connections = 100 # (change requires restart)
#superuser_reserved_connections = 3 # (change requires restart)
unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories
# (change requires restart)
#unix_socket_group = '*' # (change requires restart)
#unix_socket_permissions = 0777 # begin with 0 to use octal notation
# (change requires restart)
#bonjour = off # advertise server via Bonjour
# (change requires restart)
#bonjour_name = '' # defaults to the computer name
# (change requires restart)
But I can't find an error there, as the port is 5432 and it listens on all addresses.
Edit 2
During my search I found several pg_service.conf files:
./qwc-services/qwc-docker/wsgi-service/pg_service.conf
./qwc-services/qwc-docker/qgis-server/pg_service.conf
./qwc-services/qwc-docker/postgis/pg_service.conf
./qwc-services/qwc-docker/pg_service.conf
Each of them contains one or more sets of database credentials like the one below:
[qwc_geodb]
host=qwc-postgis
port=5432
dbname=qwc_demo
user=qwc_service
password=qwc_service
sslmode=disable
The port is correct in all files, as far as I can see. But of course the database name and user/password are wrong. Could this cause the error? Or does QWC get the credentials from the .qgs file?
Edit 3
Thanks to the hints from Devdatta Tengshe I set the host for PostgreSQL to 127.0.0.1. Running sudo docker-compose ps shows the containers in use and their ports:
Name Command State Ports
-------------------------------------------------------------------------------------------------------------------------------
qwc-docker_qwc-admin-gui_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5031->9090/tcp
qwc-docker_qwc-api-gateway_1 /docker-entrypoint.sh ngin ... Up 0.0.0.0:8088->80/tcp,:::8088->80/tcp
qwc-docker_qwc-auth-service_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5017->9090/tcp
qwc-docker_qwc-config-service_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5010->9090/tcp
qwc-docker_qwc-data-service_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5012->9090/tcp
qwc-docker_qwc-elevation-service_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5002->9090/tcp
qwc-docker_qwc-fulltext-search-service_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5011->9090/tcp
qwc-docker_qwc-map-viewer_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5030->9090/tcp
qwc-docker_qwc-mapinfo-service_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5016->9090/tcp
qwc-docker_qwc-ogc-service_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5013->9090/tcp
qwc-docker_qwc-permalink-service_1 /bin/sh -c uwsgi --http-so ... Up 127.0.0.1:5001->9090/tcp
qwc-docker_qwc-postgis_1 docker-entrypoint.sh postgres Up (healthy) 127.0.0.1:5439->5432/tcp
qwc-docker_qwc-qgis-server_1 /sbin/my_init Up 127.0.0.1:8001->80/tcp
qwc-docker_qwc-solr_1 docker-entrypoint.sh solr- ... Up 127.0.0.1:8983->8983/tcp
Can you check the postgresql.conf file located in
/etc/postgresql/13/main/postgresql.conf
especially the listen_addresses parameter?
Maybe you have to specify which addresses the server listens on.
But if the demo example is working, the database configuration should be OK.
You can also check the port in postgresql.conf and confirm that it is 5432.
There are a couple of things that need to be fixed to get this working.
I'm assuming that you have the Postgres Server running on the host machine, and not within any Docker container.
When you configured your QGIS Map file, you probably connected to localhost, and this information got saved in the .qgs file.
This is why your first error message says that it is trying to connect to localhost and that no server was found. This error was thrown within the qwc docker container.
This error occurs because QGIS Server (within the docker container) cannot reach the Postgres server running on the host by using 'localhost' as the hostname: inside a container, localhost refers to the container itself.
To solve this, you need to do the following:
1. In QGIS, connect to the Postgres server using 127.0.0.1 and not localhost.
2. Save your .qgs file using this new connection.
3. When you run the docker container for qwc, use --network="host" as the command-line parameter.
See: From inside of a Docker container, how do I connect to the localhost of the machine?
After this, the qgis server (within docker container) should be able to connect to the Postgres Server running on your host, using 127.0.0.1 as IP address.
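A quick way to check whether the database is reachable from where QGIS Server actually runs is a plain TCP probe. The following is a minimal Python sketch, assuming Python is available inside the container and that Postgres listens on port 5432; can_connect is a hypothetical helper name, not part of QGIS or QWC.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Hosts to try; run this inside the QGIS Server container.
    for host in ("localhost", "127.0.0.1"):
        print(host, can_connect(host, 5432))
```

Run on the default bridge network, both probes are expected to fail when Postgres runs on the host, because the loopback address belongs to the container. With --network="host" the container shares the host's network stack, so the probe to 127.0.0.1 should succeed if the server accepts TCP connections there.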

Cannot connect Barman to PostgreSQL 12

I have two Ubuntu 20.04 VMs on VMware, with PostgreSQL 12 installed on each:
pgprimary on ip 192.168.1.131
pgbackup on ip 192.168.1.130
barman CLI tools are installed on pgprimary
barman is installed on pgbackup
I want to back up data from pgprimary to pgbackup.
On each machine I created two Linux sudoer users:
useradd barman
useradd streaming_barman
I also created the same two users as PostgreSQL roles:
createuser --superuser --replication -P barman
createuser --superuser --replication -P streaming_barman
Here are the relevant parts of the configuration files.
On pgprimary
postgresql.conf
listen_addresses = '*' # what IP address(es) to listen on;
port = 5432
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/12/arc/%f'
wal_level = replica
restore_command = 'cp /var/lib/postgresql/12/arc/%f %p'
recovery_target_time = '2021-03-24 16:18:11.319298+05:30'
recovery_target_inclusive = false
pg_hba.conf
local all postgres peer
# TYPE DATABASE USER ADDRESS METHOD
local all all peer
host all all 127.0.0.1/32 md5
host all all ::1/128 md5
#local replication all peer
#host replication all 127.0.0.1/32 md5
#host replication all ::1/128 md5
# FOR TESTING
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
I also ran:
firewall-cmd --permanent --add-port=5432/tcp
firewall-cmd --reload
========================
On pgbackup
sudo cat <<'EOF' >> /etc/barman.d/pgprimary.conf
[pgprimary]
description = "Example of PostgreSQL Database (Streaming-Only)"
conninfo = host=192.168.1.131 user=barman dbname=training
streaming_conninfo = host=192.168.1.131 user=streaming_barman dbname=training
backup_method = postgres
streaming_archiver = on
slot_name = barman
create_slot = auto
EOF
pg_hba.conf
cat <<'EOF' >>~/.pgpass
pgprimary:*:*:barman:barman
pgprimary:*:*:streaming_barman:barman
EOF
Then I did
barman cron
Output
Starting WAL archiving for server pgprimary
Starting streaming archiver for server pgprimary
barman check pgprimary
Then I get this error
[13643] barman.utils WARNING: Failed opening the requested log file. Using standard error instead.
Server pgprimary:
2021-10-30 21:39:15,982 [13643] barman.server ERROR: Check 'WAL archive' failed for server 'pgprimary'
WAL archive: FAILED (please make sure WAL shipping is setup)
2021-10-30 21:39:37,006 [13643] barman.postgres WARNING: Error retrieving PostgreSQL status: connection to server at "192.168.131" (192.168.0.131), port 5432 failed: Connection refused
2021-10-30 21:39:58,021 [13643] barman.server ERROR: Check 'check timeout' failed for server 'pgprimary'
check timeout: FAILED (barman check command timed out)
Why can't Barman connect to the server?
UPDATE:
psql -h 192.168.1.131 -U barman -d training
Password for user barman:
psql (12.8 (Ubuntu 12.8-0ubuntu0.20.04.1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.
I also can connect to server via netstat
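Separately from the "Connection refused" error, one detail in the setup above is worth checking: the .pgpass entries use the host name pgprimary, while conninfo and the psql test use the IP address 192.168.1.131. libpq compares the host field of .pgpass textually with the host given in the connection, without resolving names. The Python sketch below illustrates that matching rule in simplified form (it ignores backslash escapes and comment lines); pgpass_matches is a hypothetical helper, not a Barman or libpq function.

```python
def pgpass_matches(entry, host, port, dbname, user):
    """Return True if a .pgpass line applies to the given connection.

    Simplified: backslash escapes and comment lines are not handled.
    """
    fields = entry.strip().split(":")
    if len(fields) != 5:
        return False
    wanted = (host, str(port), dbname, user)
    # Each of the first four fields must equal the connection value or be "*".
    return all(f == "*" or f == w for f, w in zip(fields[:4], wanted))


# Entries copied from the question's ~/.pgpass
entries = [
    "pgprimary:*:*:barman:barman",
    "pgprimary:*:*:streaming_barman:barman",
]

# conninfo in the question uses the IP address, not the name "pgprimary".
by_ip = any(pgpass_matches(e, "192.168.1.131", 5432, "training", "barman") for e in entries)
by_name = any(pgpass_matches(e, "pgprimary", 5432, "training", "barman") for e in entries)
print(by_ip, by_name)
```

Under this rule the entries apply only when the connection names the host as pgprimary, which would be consistent with psql prompting for a password in the update above.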

PostgreSQL error: FATAL: password authentication failed for user "pgadmin"

I am making a Telegram bot with a PostgreSQL database. Everything works correctly on the local computer, but when I upload the code to the server using docker-compose, the following errors appear in the console from time to time:
2021-10-10 02:08:58.865 UTC [3458] FATAL: password authentication failed for user "pgadmin"
2021-10-10 02:08:58.865 UTC [3458] DETAIL: Role "pgadmin" does not exist.
Connection matched pg_hba.conf line 94: "host all all 0.0.0.0/0 md5"
2021-10-09 16:27:22.198 UTC [2259] FATAL: password authentication failed for user "pgsql"
2021-10-09 16:27:22.198 UTC [2259] DETAIL: Role "pgsql" does not exist.
Connection matched pg_hba.conf line 94: "host all all 0.0.0.0/0 md5"
2021-10-09 16:25:16.113 UTC [2250] FATAL: password authentication failed for user "postgres"
2021-10-09 16:25:18.992 UTC [2252] DETAIL: Password does not match for user "postgres".
Connection matched pg_hba.conf line 94: "host all all 0.0.0.0/0 md5"
2021-10-09 21:06:32.169 UTC [2831] FATAL: expected password response, got message type 0
2021-10-09 21:07:39.610 UTC [2834] FATAL: unsupported frontend protocol 0.0: server supports 2.0 to 3.0
2021-10-09 21:07:39.819 UTC [2835] FATAL: unsupported frontend protocol 255.255: server supports 2.0 to 3.0
2021-10-09 21:07:40.029 UTC [2836] FATAL: no PostgreSQL user name specified in startup packet
I don't use any of these users, so it seems that someone else is trying to access the database. If so, is it possible to restrict the IP addresses from which connections are allowed? My bot scrapes websites, so any IP restriction must not interfere with that. Thanks!
UPD:
As I was prompted in the comments, maybe the contents of the docker files will help
Dockerfile:
FROM python:3.9.5
WORKDIR /src
COPY requirements.txt /src
RUN pip install -r requirements.txt
COPY . /src
docker-compose.yml:
version: "3.1"
services:
steamtrader_db:
container_name: steamtrader_db
image: sameersbn/postgresql:10-2
environment:
PG_PASSWORD: $PGPASSWORD
DB_USER: $PGUSER
DB_PASS: $PGPASSWORD
DB_NAME: $DATABASE
restart: always
ports:
- 5432:5432
networks:
- steamtrader_botnet
volumes:
- ./pgdata:/var/lib/postgresql
steamtrader_bot:
container_name: steamtrader
build:
context: .
command: python bot.py
restart: always
networks:
- steamtrader_botnet
env_file:
- ".env"
volumes:
- .:/src
depends_on:
- steamtrader_db
networks:
steamtrader_botnet:
driver: bridge

Gitlab CI services are stopped after bizarre psql call

I am trying to use services in GitLab CI, namely postgres. However, the postgres service doesn't seem to be running, although I just copied what is in the GitLab CI docs. In the logs, after the service reports that it has started, some psql command (I don't know where it comes from) fails with a name resolution error. If I am doing something wrong here, what is the right way to run the postgres service in gitlab-ci?
Below are .gitlab-ci.yml file and logs:
.gitlab-ci.yml
image: ubuntu
services:
- name: postgres:12.2-alpine
alias: postgres
variables:
POSTGRES_DB: badr
POSTGRES_USER: badr
POSTGRES_PASSWORD: badr
PGHOST: postgres
POSTGRES_HOST_AUTH_METHOD: trust
stages:
- test
test db:
stage: test
before_script:
- until (echo > /dev/tcp/postgres/5432) >/dev/null 2>&1;do >&2 echo "service not ready...sleeping";sleep 5;done
script:
- echo "connected to...$PGHOST"
- sleep 10
logs
Running with gitlab-runner 13.2.0-rc2 (45f2b4ec)
on docker-auto-scale fa6cab46
Preparing the "docker+machine" executor
00:55
Using Docker executor with image ubuntu ...
Starting service postgres:12.2-alpine ...
Pulling docker image postgres:12.2-alpine ...
Using docker image sha256:ae192c4d3adaebbbf2f023e1e50eaadfabccb6b08c855ac13d6ce2232381a58a for postgres:12.2-alpine ...
WARNING: Service postgres:12.2-alpine is already created. Ignoring.
Waiting for services to be up and running...
*** WARNING: Service runner-fa6cab46-project-14794655-concurrent-0-f52b350b86ad38db-postgres-0 probably didn't start properly.
Health check error:
service "runner-fa6cab46-project-14794655-concurrent-0-f52b350b86ad38db-postgres-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2020-07-31T09:15:00.377204181Z ********************************************************************************
2020-07-31T09:15:00.377254629Z WARNING: POSTGRES_HOST_AUTH_METHOD has been set to "trust". This will allow
2020-07-31T09:15:00.377259167Z anyone with access to the Postgres port to access your database without
2020-07-31T09:15:00.377262471Z a password, even if POSTGRES_PASSWORD is set. See PostgreSQL
2020-07-31T09:15:00.377265670Z documentation about "trust":
2020-07-31T09:15:00.377269181Z https://www.postgresql.org/docs/current/auth-trust.html
2020-07-31T09:15:00.377272282Z In Docker's default configuration, this is effectively any other
2020-07-31T09:15:00.377276152Z container on the same system.
2020-07-31T09:15:00.377295876Z
2020-07-31T09:15:00.377299453Z It is not recommended to use POSTGRES_HOST_AUTH_METHOD=trust. Replace
2020-07-31T09:15:00.377302412Z it with "-e POSTGRES_PASSWORD=password" instead to set a password in
2020-07-31T09:15:00.377305641Z "docker run".
2020-07-31T09:15:00.377308656Z ********************************************************************************
2020-07-31T09:15:00.404620899Z The files belonging to this database system will be owned by user "postgres".
2020-07-31T09:15:00.406021814Z This user must also own the server process.
2020-07-31T09:15:00.406074886Z
2020-07-31T09:15:00.406083517Z The database cluster will be initialized with locale "en_US.utf8".
2020-07-31T09:15:00.406087263Z The default database encoding has accordingly been set to "UTF8".
2020-07-31T09:15:00.406090884Z The default text search configuration will be set to "english".
2020-07-31T09:15:00.406094281Z
2020-07-31T09:15:00.406097490Z Data page checksums are disabled.
2020-07-31T09:15:00.406101511Z
2020-07-31T09:15:00.406197662Z fixing permissions on existing directory /var/lib/postgresql/data ... ok
2020-07-31T09:15:00.406858429Z creating subdirectories ... ok
2020-07-31T09:15:00.407274720Z selecting dynamic shared memory implementation ... posix
2020-07-31T09:15:00.428414929Z selecting default max_connections ... 100
2020-07-31T09:15:00.506801199Z selecting default shared_buffers ... 128MB
2020-07-31T09:15:00.689382376Z selecting default time zone ... UTC
2020-07-31T09:15:00.695744690Z creating configuration files ... ok
2020-07-31T09:15:01.009439741Z running bootstrap script ... ok
2020-07-31T09:15:01.355673765Z sh: locale: not found
2020-07-31T09:15:01.355836607Z 2020-07-31 09:15:01.355 UTC [30] WARNING: no usable system locales were found
2020-07-31T09:15:01.784080826Z performing post-bootstrap initialization ... ok
2020-07-31T09:15:02.416545146Z syncing data to disk ... ok
2020-07-31T09:15:02.416652656Z
2020-07-31T09:15:02.416854775Z initdb: warning: enabling "trust" authentication for local connections
2020-07-31T09:15:02.416911707Z You can change this by editing pg_hba.conf or using the option -A, or
2020-07-31T09:15:02.416917642Z --auth-local and --auth-host, the next time you run initdb.
2020-07-31T09:15:02.416962149Z
2020-07-31T09:15:02.416967325Z Success. You can now start the database server using:
2020-07-31T09:15:02.416970415Z
2020-07-31T09:15:02.416990907Z pg_ctl -D /var/lib/postgresql/data -l logfile start
2020-07-31T09:15:02.416995097Z
2020-07-31T09:15:02.440378884Z waiting for server to start....2020-07-31 09:15:02.440 UTC [35] LOG: starting PostgreSQL 12.2 on x86_64-pc-linux-musl, compiled by gcc (Alpine 9.2.0) 9.2.0, 64-bit
2020-07-31T09:15:02.442773414Z 2020-07-31 09:15:02.442 UTC [35] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-07-31T09:15:02.461804500Z 2020-07-31 09:15:02.461 UTC [36] LOG: database system was shut down at 2020-07-31 09:15:01 UTC
2020-07-31T09:15:02.465323529Z 2020-07-31 09:15:02.465 UTC [35] LOG: database system is ready to accept connections
2020-07-31T09:15:02.524643142Z done
2020-07-31T09:15:02.524766601Z server started
2020-07-31T09:15:02.537508874Z psql: error: could not connect to server: could not translate host name "postgres" to address: Name does not resolve
*********
Pulling docker image ubuntu ...
Using docker image sha256:1e4467b07108685c38297025797890f0492c4ec509212e2e4b4822d367fe6bc8 for ubuntu ...
Preparing environment
00:02
Getting source from Git repository
00:01
$ eval "$CI_PRE_CLONE_SCRIPT"
Fetching changes with git depth set to 50...
Initialized empty Git repository in /builds/badrmoh/cicdtest/.git/
Created fresh repository.
Checking out 604433de as master...
Skipping Git submodules setup
Executing "step_script" stage of the job script
$ until (echo > /dev/tcp/postgres/5432) >/dev/null 2>&1;do >&2 echo "service not ready...sleeping";sleep 5;done
service not ready...sleeping
service not ready...sleeping
service not ready...sleeping
service not ready...sleeping
service not ready...sleeping
service not ready...sleeping
The problem was the PGHOST variable. It seems to be used internally by the postgres container, which is why the service fails to start.
The solution is to set PGHOST in the job's script section (before_script below) instead:
image: ubuntu
services:
- name: postgres:9
alias: postgres
variables:
POSTGRES_DB: badr
POSTGRES_USER: badr
POSTGRES_PASSWORD: badr
POSTGRES_HOST_AUTH_METHOD: trust
stages:
- test
test db:
stage: test
before_script:
- export PGHOST=postgres
- until (echo > /dev/tcp/$PGHOST/5432) >/dev/null 2>&1;do >&2 echo "service $PGHOST not ready...sleeping";sleep 5;done
script:
- echo "connected to...$PGHOST"
- sleep 10
Note: You can't use the variables directive within the job in this case, since it seems to be populated before the services themselves are started.
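Why PGHOST has this effect can be illustrated with the order in which libpq-based clients such as psql choose a host: an explicit host option first, then the PGHOST environment variable, then the default Unix socket. The Python sketch below models only that precedence. It assumes the socket directory /var/run/postgresql seen in the service log, and effective_host is a hypothetical helper name. As far as I know, GitLab passes top-level variables to service containers as well, which is how PGHOST reaches the psql call inside the postgres image.

```python
def effective_host(explicit_host=None, env=None,
                   default_socket_dir="/var/run/postgresql"):
    """Model which host a libpq client connects to.

    Precedence: explicit host option, then PGHOST, then the Unix socket.
    """
    env = env or {}
    if explicit_host:
        return explicit_host
    if env.get("PGHOST"):
        return env["PGHOST"]
    return default_socket_dir


# Inside the service container, with the global variables from the question:
print(effective_host(env={"PGHOST": "postgres"}))
# Inside the service container, without PGHOST in its environment:
print(effective_host(env={}))
```

In the first case the client inside the service container tries to resolve the name postgres, which does not exist from its own point of view; in the second it uses the local socket, which is what the image's initialization expects.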

PostgreSQL - Fail to start delegate ip address on pgpool 2

I am trying to set up a pgpool server on Ubuntu Server, following this tutorial: pgpool-II Tutorial [ Watchdog ].
But when I start the pgpool service, the delegate IP does not come up.
I checked syslog and found errors like these:
Oct 25 08:46:25 pgpool-1 pgpool[1647]: [8-2] 2017-10-25 08:46:25: pid 1647: DETAIL: Host:"172.16.0.42" WD Port:9000 pgpool-II port:5432
Oct 25 08:46:25 pgpool-1 pgpool: SIOCSIFADDR: Operation not permitted
Oct 25 08:46:25 pgpool-1 pgpool: SIOCSIFFLAGS: Operation not permitted
Oct 25 08:46:25 pgpool-1 pgpool: SIOCSIFNETMASK: Operation not permitted
Oct 25 08:46:25 pgpool-1 pgpool[1648]: [18-1] 2017-10-25 08:46:25: pid 1648: LOG: failed to acquire the delegate IP address
Oct 25 08:46:25 pgpool-1 pgpool[1648]: [18-2] 2017-10-25 08:46:25: pid 1648: DETAIL: 'if_up_cmd' failed
Oct 25 08:46:25 pgpool-1 pgpool[1648]: [19-1] 2017-10-25 08:46:25: pid 1648: WARNING: watchdog escalation failed to acquire delegate IP
I use Ubuntu 14.04 with pgpool2 version 3.6.6-1 and watchdog version 5.31-1.
I have configured the virtual IP settings in pgpool.conf like this:
# - Virtual IP control Setting -
delegate_IP = '172.16.0.201'
if_cmd_path = '/sbin'
if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.0.0'
if_down_cmd = 'ifconfig eth0:0 down'
arping_path = '/usr/sbin'
arping_cmd = 'arping -U $_IP_$ -w 1'
Any suggestion for this? Thank you for any help.
It looks like the user that runs pgpool doesn't have permission to run ifconfig.
Did you follow these steps from the tutorial?
setuid configuration
In watchdog process, root privilege is required to control virtual IP.
You could start pgpool-II as root user. However in this tutorial,
Apache needs to start pgpool as apache user and control virtual IP
because we are using pgpoolAdmin. For this purpose, we setuid
ifconfig and arping. Also we don't want any user other than apache
to access the commands for security reasons. Execute the following
commands on each of osspc19 and osspc20 (it requires root privilege).
First, make a directory to hold the setuid copies of ifconfig and
arping. The path is specified by ifconfig_path (if_cmd_path in the
configuration shown in the question) and arping_path; in this
tutorial, this is /home/apache/sbin. Then give execute privilege
to only apache user.
$ su -
# mkdir -p /home/apache/sbin
# chown apache:apache /home/apache/sbin
# chmod 700 /home/apache/sbin
Next, copy the original ifconfig and arping to the directory and then
set setuid to these.
# cp /sbin/ifconfig /home/apache/sbin
# cp /usr/sbin/arping /home/apache/sbin
# chmod 4755 /home/apache/sbin/ifconfig
# chmod 4755 /home/apache/sbin/arping
Note that the setup explained above should be used for tutorial
purposes only. In the real world you'd better create setuid wrapper
programs to execute ifconfig and arping. This is left as an exercise.
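To verify that the copied binaries really carry the setuid bit after chmod 4755, the file mode can be inspected directly. This is a minimal Python sketch using only the standard library; mode_has_setuid and file_has_setuid are hypothetical helper names, and the path in the comment is the one used in the tutorial text above.

```python
import os
import stat


def mode_has_setuid(mode):
    """True if the mode bits include the setuid bit (octal 4000)."""
    return bool(mode & stat.S_ISUID)


def file_has_setuid(path):
    """True if the file at path has the setuid bit set."""
    return mode_has_setuid(os.stat(path).st_mode)


# chmod 4755 sets the bit, a plain 755 does not.
print(mode_has_setuid(0o4755))
print(mode_has_setuid(0o755))

# On the pgpool host one could then check, for example:
# file_has_setuid("/home/apache/sbin/ifconfig")
```

The setuid bit only grants root privilege if the file is owned by root, so the owner of the copies is worth checking too (os.stat(path).st_uid is 0 for root).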
(Note: this answer may help in case you run Pgpool-II servers with Watchdog in Docker containers)
I tried to setup Pgpool-II servers with Watchdog in Docker containers today, and I got almost the same error (though I did set the SUID bit and even tried running Pgpool-II as the root user):
SIOCSIFADDR: Operation not permitted
SIOCSIFFLAGS: Operation not permitted
SIOCSIFNETMASK: Operation not permitted
pid 88: LOG: failed to acquire the delegate IP address
pid 88: DETAIL: 'if_up_cmd' failed
pid 88: WARNING: watchdog escalation failed to acquire delegate IP
Later I found that this was because, by design, a container does not by default have the privilege to change its network configuration.
I then ran my Pgpool-II Docker containers in the privileged mode as shown below:
pgpool1:
privileged: true
image: postdock/pgpool:latest-pgpool36
...
The error is gone and the virtual IP is set up correctly.
My problem was solved by the following method.
Make a directory to hold copies of ip and arping, then give execute privilege only to the non-root user.
$ mkdir /var/lib/pgsql/sbin
$ chown postgres:postgres /var/lib/pgsql/sbin
$ cp /sbin/ip /var/lib/pgsql/sbin
$ cp /sbin/arping /var/lib/pgsql/sbin
Run visudo, which safely edits the sudoers file:
$ visudo
Then add an entry like this in sudoers file:
postgres ALL = NOPASSWD: /var/lib/pgsql/sbin/ip *, /var/lib/pgsql/sbin/arping *
Next, create bash scripts (ipadd.sh, ipdel.sh, arping.sh) that run the ip and arping commands with sudo.
$ cat /var/lib/pgsql/sbin/ipadd.sh
#!/bin/bash
sudo /var/lib/pgsql/sbin/ip addr add $1/24 dev eth1 label eth1:0
$ cat /var/lib/pgsql/sbin/ipdel.sh
#!/bin/bash
sudo /var/lib/pgsql/sbin/ip addr del $1/24 dev eth1
$ cat /var/lib/pgsql/sbin/arping.sh
#!/bin/bash
sudo /var/lib/pgsql/sbin/arping -U $1 -w 1 -I eth1
$ chmod 755 /var/lib/pgsql/sbin/*
$ chown postgres:postgres /var/lib/pgsql/sbin/*
Add an entry like this in pgpool.conf:
delegate_IP = '10.10.10.62'
if_up_cmd = 'ipadd.sh $_IP_$'
if_down_cmd = 'ipdel.sh $_IP_$'
arping_cmd = 'arping.sh $_IP_$'
if_cmd_path = '/var/lib/pgsql/sbin'
arping_path = '/var/lib/pgsql/sbin'
Then restart the pgpool service. You can ignore warnings like the following:
WARNING: checking setuid bit of if_up_cmd
DETAIL: ifup[/var/lib/pgsql/sbin/ipadd.sh] doesn't have setuid bit
WARNING: checking setuid bit of if_down_cmd
DETAIL: ifdown[/var/lib/pgsql/sbin/ipdel.sh] doesn't have setuid bit
WARNING: checking setuid bit of arping command
DETAIL: arping[/var/lib/pgsql/sbin/arping.sh] doesn't have setuid bit
Finally, stop one of your two pgpool services and check that the delegate IP comes up on the other.