Postgres Register Standby fails - postgresql

I am trying to set up a primary and a standby using repmgr. I think I have successfully set up the master, but the standby setup keeps failing.
On the standby node:
/usr/pgsql-12/bin/repmgr -h master_ip standby clone
NOTICE: destination directory "/var/lib/pgsql/12/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=master_ip
DETAIL: current installation size is 32 MB
ERROR: repmgr extension is available but not installed in database "(null)"
HINT: check that you are cloning from the database where "repmgr" is installed
On Master Node:
/usr/pgsql-12/bin/repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
1 | hostname | primary | * running | | default | 100 | 1 | host=master_ip dbname=repmgr user=repmgr connect_timeout=2
postgres=# SELECT * FROM pg_available_extensions WHERE name='repmgr';
name | default_version | installed_version | comment
--------+-----------------+-------------------+------------------------------------
repmgr | 5.3 | | Replication manager for PostgreSQL

Resolved after adding -U repmgr -d repmgr to the clone command.
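For reference, a minimal sketch of the working clone command, assuming the repmgr extension lives in the repmgr database on the primary and that repmgr.conf sits at /etc/repmgr.conf (neither path is shown in the question):
# run on the standby node; -U/-d must point at the database where the repmgr extension is installed
/usr/pgsql-12/bin/repmgr -h master_ip -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run
# if the dry run looks sane, repeat without --dry-run and then register the standby
/usr/pgsql-12/bin/repmgr -h master_ip -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
/usr/pgsql-12/bin/repmgr -f /etc/repmgr.conf standby register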

Related

Using pgpool, I got an empty value in replication state

I'm trying to use pgpool for Postgres HA.
 node_id | hostname | port | status | pg_status | lb_weight | role    | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | master   | 5432 | up     | up        | 0.500000  | primary | primary | 1          | false             | 0                 |                   |                        | 2022-05-30 10:33:21
 1       | slave    | 5432 | up     | up        | 0.500000  | standby | primary | 0          | true              | 419431440         |                   |                        | 2022-05-30 10:33:21
Everything else in this process is working well, but I get empty values for replication_state and replication_sync_state, and a very high value for replication_delay.
Why are those values empty, and why is replication_delay so high?
Should I change values in postgresql.conf or pgpool.conf for replication?
In this case, I used 'pg_basebackup -h host -U Repuser -p port -D dir -X stream' to build the slave.
This is pcp_node_info's result:
master 5432 2 0.500000 up up primary primary 0 none none 2022-05-30 10:42:40
slave 5432 2 0.500000 up up standby primary 419431848 none none 2022-05-30 10:42:40
Sorry for my English level, and thank you for your help.
My versions:
postgres 14.2
pgpool 4.3.1
You need to provide application_name in both configuration files - myrecovery.conf (the primary_conninfo variable) and pgpool.conf - for each node.
You should also check the recovery_1st_stage and follow_primary.sh files, as they contain a block with application_name too. Those scripts are used by pgpool to recover a replica (with pcp_recovery_node) or promote a new master.
After all that, you can check the current value with "select * from pg_stat_replication;" (on the master) or "select * from pg_stat_wal_receiver;" (on the replica).
More information: https://www.pgpool.net/docs/pgpool-II-4.3.1/en/html/example-cluster.html
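As a rough sketch (the node names, port and user below are assumptions, not taken from the question), the pieces that have to agree are the application_name in the standby's primary_conninfo and the backend_application_name entries in pgpool.conf:
# myrecovery.conf / postgresql.auto.conf on the standby ("slave")
primary_conninfo = 'host=master port=5432 user=Repuser application_name=slave'

# pgpool.conf - backend_application_nameN should match the application_name used by that backend
backend_hostname0 = 'master'
backend_port0 = 5432
backend_application_name0 = 'master'
backend_hostname1 = 'slave'
backend_port1 = 5432
backend_application_name1 = 'slave'
Once those match and the standby is actually streaming, show pool_nodes can correlate pg_stat_replication with its backends and fill in replication_state, replication_sync_state and a sensible replication_delay.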

pgpool/postgres - replication_delay is too high, how to reset?

In our setup, show pool_nodes shows a very high replication_delay and it keeps increasing, because of which new queries are not replicated to the slave.
Following is the output of the show pool_nodes command. Is there a way to reset this? Data loss is fine as this is not a live/production system.
[root@DB2 ~]# psql -h DB-HA-Hostname -U postgres -p 5432 -c 'show pool_nodes'
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+--------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | DB1-hostname | 5432 | up | 0.500000 | primary | 0 | true | 0 | 2021-01-11 19:32:00
1 | DB2-hostname | 5432 | up | 0.500000 | standby | 0 | false | 54986528 | 2021-01-11 19:32:00
(2 rows)
I have tried restarting the nodes, restarting pgpool, restarting PostgreSQL, deleting the database, etc., but no luck. As soon as the slave gets attached, the replication_delay is high again.
You can run this command to check status of replication:
psql -h DB-HA-Hostname -U postgres -p 5432 -c "select * from pg_stat_replication" -x
If it shows an active row for the standby, streaming replication is working; if not, the configuration has failed.
Can you show your configuration?
Check whether replication is running or not; if it is not running, re-configure the standby and then attach the node:
select * from pg_stat_replication;
After taking a base backup, start the PostgreSQL server and then run pcp_attach_node on pgpool.
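A minimal sketch of that re-seeding, assuming a replication user named repuser, a data directory of /var/lib/pgsql/data and pgpool's PCP port on the default 9898 (none of these values appear in the question):
# on the standby: stop PostgreSQL, clear the old data directory, take a fresh base backup from the primary
# -R writes primary_conninfo and standby.signal (PostgreSQL 12+) so the node comes back up as a standby
pg_basebackup -h DB1-hostname -U repuser -p 5432 -D /var/lib/pgsql/data -X stream -R
pg_ctl start -D /var/lib/pgsql/data

# once the standby is streaming, re-attach it as node 1 in pgpool (PCP user/port come from pcp.conf)
pcp_attach_node -h DB-HA-Hostname -p 9898 -U postgres -n 1
With the standby actually following the primary, replication_delay in show pool_nodes should drop back to a small number on its own.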

repmgr - how to make the previous primary become a standby after failover

After performing a failover, the previous primary was down, and the old standby became the primary, as expected.
$ repmgr -f /etc/repmgr.conf cluster show --compact
ID | Name | Role | Status | Upstream | Location | Prio. | TLI
----+-----------------+---------+-----------+----------+----------+-------+-----
1 | server1 | primary | - failed | | default | 100 | ?
2 | server2 | primary | * running | | default | 100 | 2
3 | PG-Node-Witness | witness | * running | server2 | default | 0 | 1
I would like to make the old Primary join the cluster as a standby.
I gather the rejoin command should do that.
However, when I try to rejoin it as the new standby, I get this (I run this on the old primary, which is down):
repmgr -f /etc/repmgr.conf -d 'host=10.9.7.97 user=repmgr dbname=repmgr' node rejoin
(where 10.9.7.97 is the IP of the node I am running from)
I get this error:
$ repmgr -f /etc/repmgr.conf -d 'host=10.97.7.97 user=repmgr dbname=repmgr' node rejoin --verbose -
NOTICE: using provided configuration file "/etc/repmgr.conf"
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "10.97.7.97" and accepting
TCP/IP connections on port 5432?
Of course postgres is down on 10.9.7.97 - the old primary.
If I start it however, it starts as another primary:
$ repmgr -f /etc/repmgr.conf cluster show --compact
ID | Name | Role | Status | Upstream | Location | Prio. | TLI
----+-----------------+---------+-----------+----------+----------+-------+-----
1 | server1 | primary | ! running | | default | 100 | 1
2 | server2 | primary | * running | | default | 100 | 2
3 | PG-Node-Witness | witness | * running | server2 | default | 0 | 1
So what is the way to make the old primary the new standby...?
Thanks
Apparently the -d 'host=...' in the rejoin command should specify the current primary (the previous standby), not the node you are rejoining.
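So, run on the old primary (server1) while its PostgreSQL instance is stopped, something like the sketch below - the 10.9.7.98 address for the current primary (server2) is made up for illustration, and --force-rewind is only needed if the old primary's timeline has diverged:
# on server1 (the failed primary, PostgreSQL stopped): -d points at the *current* primary, server2
repmgr -f /etc/repmgr.conf -d 'host=10.9.7.98 user=repmgr dbname=repmgr' node rejoin --force-rewind --verbose
# afterwards the cluster should show server1 as a standby following server2
repmgr -f /etc/repmgr.conf cluster show --compact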

Can't get new postgres config file settings to take effect

I have a somewhat large table in my database and I am inserting new records into it. As the number of records grows, I started having issues and can no longer insert.
My postgresql log files suggest I increase WAL size:
[700] LOG: checkpoints are occurring too frequently (6 seconds apart)
[700] HINT: Consider increasing the configuration parameter "max_wal_size".
I got the path to my config file with =# show config_file; and made some modifications with vim:
max_wal_senders = 0
wal_level = minimal
max_wal_size = 4GB
When I check the file I see the changes I made.
I then tried reloading and restarting the database:
(I get the data directory with =# show data_directory ;)
I tried reload:
pg_ctl reload -D path
server signaled
I tried restart
pg_ctl restart -D path
waiting for server to shut down.... done
server stopped
waiting for server to start....
2020-01-17 13:08:19.063 EST [16913] LOG: listening on IPv4 address
2020-01-17 13:08:19.063 EST [16913] LOG: listening on IPv6 address
2020-01-17 13:08:19.079 EST [16913] LOG: listening on Unix socket
2020-01-17 13:08:19.117 EST [16914] LOG: database system was shut down at 2020-01-17 13:08:18 EST
2020-01-17 13:08:19.126 EST [16913] LOG: database system is ready to accept connections
done
server started
But when I connect to the database and check for my settings:
name | setting | unit | category | short_desc | extra_desc | context | vartype | source | min_val | max_val | enumvals | boot_val | reset_val | sourcefile | sourceline | pending_restart
-----------------+---------+------+-------------------------------+-------------------------------------------------------------------------+------------+------------+---------+---------+---------+------------+---------------------------+----------+-----------+------------+------------+-----------------
max_wal_senders | 10 | | Replication / Sending Servers | Sets the maximum number of simultaneously running WAL sender processes. | | postmaster | integer | default | 0 | 262143 | | 10 | 10 | | | f
max_wal_size | 1024 | MB | Write-Ahead Log / Checkpoints | Sets the WAL size that triggers a checkpoint. | | sighup | integer | default | 2 | 2147483647 | | 1024 | 1024 | | | f
wal_level | replica | | Write-Ahead Log / Settings | Set the level of information written to the WAL. | | postmaster | enum | default | | | {minimal,replica,logical} | replica | replica | | | f
(3 rows)
I still see the old default settings.
What am I missing here? How can I get these settings to take effect?
Configuration settings can come from several sources:
postgresql.conf
postgresql.auto.conf (set with ALTER SYSTEM)
command line arguments at server start
set with ALTER DATABASE or ALTER USER
Moreover, if a parameter occurs twice in a configuration file, the second entry wins.
To figure out from where in this mess your setting originates, run
SELECT name, source, sourcefile, sourceline, pending_restart
FROM pg_settings
WHERE name IN ('wal_level', 'max_wal_size', 'max_wal_senders');
If the source is database or user, you can use the psql command \drds to figure out the details.
The result of the query shows that all three parameters still come from the built-in defaults (their source is default), so the values you put in the configuration file are not being picked up - most likely the server is reading a different file than the one you edited. You'd have to override these defaults with one of the methods shown above.
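For example, a minimal way to override them from psql is ALTER SYSTEM, which writes postgresql.auto.conf and takes precedence over postgresql.conf. Per the pg_settings output above, max_wal_size (context sighup) only needs a reload, while wal_level and max_wal_senders (context postmaster) only change after a restart:
ALTER SYSTEM SET max_wal_size = '4GB';
ALTER SYSTEM SET wal_level = 'minimal';
ALTER SYSTEM SET max_wal_senders = 0;
SELECT pg_reload_conf();  -- picks up max_wal_size; the other two still require pg_ctl restart -D path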
Locations of the config files, ordered by priority:
/var/lib/postgresql/12/main/postgresql.auto.conf
/etc/postgresql/12/main/postgresql.conf

Users and GRANT EXECUTE permission get automatically removed only in Cloud SQL

Cloud SQL version & DB engine: our Cloud SQL MySQL version is 5.6.21 and the DB engine is InnoDB.
1. Create a user in MySQL:
CREATE USER 'USERNAME'@'HOSTNAME' IDENTIFIED BY 'PASSWORD';
But this user is not permanently stored in the mysql.user table. The user gets removed from the table if any issue comes up on the script side or the server restarts... and also, sometimes the created user's password ends up empty.
2. Likewise, granting EXECUTE permission on a procedure is not working properly either:
GRANT EXECUTE ON PROCEDURE schemaname.spname TO 'USERNAME'@'%';
This EXECUTE permission works for some time, but the privilege then disappears for the granted user.
Other solutions we tried are:
1. FLUSH TABLES after creating the user
2. FLUSH PRIVILEGES after granting/revoking any access
But these two solutions are also not working in Google Cloud SQL; the issue remains the same.
We don't have this issue with a local MySQL installation; it is reproducible only on Google Cloud SQL.
We are stuck with this issue in our front-end app.
Does anyone know how to resolve this issue in Google Cloud SQL?
I'm not able to reproduce a created user failing to survive a Cloud SQL instance restart.
Here is how I tested (I replaced some sensitive information with (edited)).
First I connected to an existing instance, created a user called xxx, and checked that it shows up in the mysql.user table.
$ mysql -uroot -proot -h (edited)
mysql> SELECT host,user,password FROM mysql.user;
+-----------+------+-------------------------------------------+
| host | user | password |
+-----------+------+-------------------------------------------+
| localhost | root | |
| 127.0.0.1 | root | |
| ::1 | root | |
| localhost | | |
| % | root | *81F5E21E35407D884A6CD4A731AEBFB6AF209E1B |
+-----------+------+-------------------------------------------+
5 rows in set (0.07 sec)
mysql> CREATE USER xxx@'%' IDENTIFIED BY 'xxx';
Query OK, 0 rows affected (0.61 sec)
mysql> SELECT host,user,password FROM mysql.user;
+-----------+------+-------------------------------------------+
| host | user | password |
+-----------+------+-------------------------------------------+
| localhost | root | |
| 127.0.0.1 | root | |
| ::1 | root | |
| localhost | | |
| % | root | *81F5E21E35407D884A6CD4A731AEBFB6AF209E1B |
| % | xxx | *3D56A309CD04FA2EEF181462E59011F075C89548 |
+-----------+------+-------------------------------------------+
6 rows in set (0.06 sec)
mysql> Bye
Then I restarted the Cloud SQL instance.
$ gcloud sql instances restart (edited) --project (edited)
Restarting Cloud SQL instance...done.
Restarted [https://www.googleapis.com/sql/v1beta3/projects/(edited)/instances/(edited)].
$
Then I connected again and checked the mysql.user table.
$ mysql -uroot -proot -h (edited)
mysql> SELECT host,user,password FROM mysql.user;
+-----------+------+-------------------------------------------+
| host | user | password |
+-----------+------+-------------------------------------------+
| localhost | root | |
| 127.0.0.1 | root | |
| ::1 | root | |
| localhost | | |
| % | root | *81F5E21E35407D884A6CD4A731AEBFB6AF209E1B |
| % | xxx | *3D56A309CD04FA2EEF181462E59011F075C89548 |
+-----------+------+-------------------------------------------+
6 rows in set (0.07 sec)
mysql> Bye
$