Is it possible to check (as a non-superuser, from SQL) whether my connection from client to server uses SSL? (My destination server can accept both secured and unsecured connections.)
As of 9.5:
https://www.postgresql.org/docs/current/static/monitoring-stats.html#PG-STAT-SSL-VIEW
The pg_stat_ssl view will contain one row per backend or WAL sender
process, showing statistics about SSL usage on this connection. It can
be joined to pg_stat_activity or pg_stat_replication on the pid column
to get more details about the connection.
t=# set role notsu ;
SET
Time: 9.289 ms
t=> select * from pg_stat_ssl where pid = pg_backend_pid();
pid | ssl | version | cipher | bits | compression | clientdn
-------+-----+---------+--------+------+-------------+----------
43767 | f | | | | |
(1 row)
Time: 10.846 ms
t=> \du+ notsu
List of roles
Role name | Attributes | Member of | Description
-----------+------------+-----------+-------------
notsu | | {} |
The above shows that my connection does not use SSL (the ssl column is f).
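The same check can be wrapped in a small helper for application code. The following is a minimal sketch in Python, assuming a DB-API 2.0 cursor on an open connection (for example from a driver such as psycopg2, which is an assumption and not part of the answer above). The function name `connection_uses_ssl` is hypothetical.

```python
# Same query as in the psql session above, limited to the columns of interest.
SSL_QUERY = (
    "SELECT ssl, version, cipher "
    "FROM pg_stat_ssl WHERE pid = pg_backend_pid()"
)


def connection_uses_ssl(cursor):
    """Return True if the current session is encrypted according to pg_stat_ssl.

    `cursor` is assumed to be any DB-API 2.0 cursor (hypothetical caller-supplied).
    """
    cursor.execute(SSL_QUERY)
    row = cursor.fetchone()
    if row is None:
        # No row was returned for this backend, so nothing can be concluded.
        raise RuntimeError("no pg_stat_ssl row found for this backend")
    # First column is the boolean `ssl` flag shown in the output above.
    return bool(row[0])
```

Because the query filters on pg_backend_pid(), it only ever inspects the caller's own session, which is why an unprivileged role is sufficient here.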
I am trying to create a PostgreSQL - Etcd - Patroni (PEP) cluster. There are lots of examples on the internet, and I have created one that runs perfectly. However, this architecture has to comply with my company's backup solution, which is NetApp. We put the database into backup mode with "SELECT pg_start_backup('test_backup', true);" and then copy all the data files to the backup directory.
The PEP cluster has a small problem with this solution. Taking the backup works fine, but restoring does not go as smoothly. To restore the leader of the PEP cluster, I need to stop the database, move the backup files into the data directory, and finally start the restoration. At this point Patroni treats the restored node as a new cluster. Here is the error:
raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
2022-04-11 12:49:29,930 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-04-11 12:49:29,942 INFO: Lock owner: None; I am pgsql_node1
2022-04-11 12:49:29,962 INFO: trying to bootstrap a new cluster
The files belonging to this database system will be owned by user "postgres".
Also, when I checked the Patroni cluster status, I saw this:
root@4cddca032454:/data/backup# patronictl -c /etc/patroni/config.yml list
+ Cluster: pgsql (7085327534197401486) --------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+-------------+------------+---------+---------+----+-----------+
| pgsql_node1 | 172.17.0.6 | Replica | stopped | | unknown |
| pgsql_node2 | 172.17.0.7 | Replica | running | 11 | 0 |
| pgsql_node3 | 172.17.0.8 | Replica | running | 11 | 0 |
+-------------+------------+---------+---------+----+-----------+
At this point I have a PEP cluster without a leader. So, how can I solve this issue?
(Note: The restored node attempted to join the right cluster, because before starting the restoration I checked the cluster status and got this result:
root@4cddca032454:/data/backup# patronictl -c /etc/patroni/config.yml list
+ Cluster: pgsql (7085327534197401486) --------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+-------------+------------+---------+---------+----+-----------+
| pgsql_node2 | 172.17.0.7 | Replica | running | 11 | 0 |
| pgsql_node3 | 172.17.0.8 | Replica | running | 11 | 0 |
+-------------+------------+---------+---------+----+-----------+
pgsql_node1 is not there.
)
As explained here, "https://patroni.readthedocs.io/en/latest/existing_data.html#existing-data", I can create a new cluster after the restoration, but my priority is saving the existing cluster. Or am I thinking about this the wrong way, and are all these steps the same as converting a standalone PostgreSQL database to a PEP cluster?
Please let me know if you need any data or something is not clear.
Here is my leader node patroni config file:
scope: "cluster"
namespace: "/cluster/"
name: 8d454a228d251
restapi:
  listen: 172.17.0.2:8008
  connect_address: 172.17.0.2:8008
etcd:
  host: 172.17.0.2:2379
dcs:
  ttl: 30
  loop_wait: 10
  retry_timeout: 10
  maximum_lag_on_failover: 1048576
  check_timeline: true
  postgresql:
    use_pg_rewind: true
    remove_data_directory_on_rewind_failure: true
    remove_data_directory_on_diverged_timelines: true
    use_slots: true
postgresql:
  listen: 0.0.0.0:5433
  connect_address: 172.17.0.2:5432
  use_unix_socket: true
  data_dir: /data/postgresql/
  bin_dir: /usr/lib/postgresql/14/bin
  config_dir: /etc/postgresql/14/main
  authentication:
    replication:
      username: "patroni_replication"
      password: "123123"
    superuser:
      username: "patroni_superuser"
      password: "123123"
  parameters:
    unix_socket_directories: '/var/run/postgresql/'
    logging_collector: 'on'
    log_directory: '/var/log/postgresql'
    log_filename: 'postgresql-14-8d454a228d25.log'
    restore_command: 'cp /data/backup/%f %p'
    recovery_target_timeline: 'latest'
    promote_trigger_file: '/tmp/promote'
Thanks!
If I understand correctly, you want to restore the primary server (leader) by replacing its data directory with a set of backup files.
After restoring the leader's data directory, you need to recreate the Patroni cluster (that is, remove its keys from the DCS) with the patronictl remove command.
Example:
stop pgsql_node2
stop pgsql_node3
stop pgsql_node1
on pgsql_node1:
patronictl -c /etc/patroni/config.yml remove <clustername>
start pgsql_node1
start pgsql_node2
start pgsql_node3
I'm testing failover using RDS Aurora PostgreSQL.
First, I created an RDS Aurora PostgreSQL cluster and connected to the writer to create a users table.
$ CREATE TABLE users (
id SERIAL PRIMARY KEY NOT NULL,
name varchar(10) NOT NULL,
createdAt TIMESTAMP DEFAULT Now() );
And I added one row and checked the table.
$ INSERT INTO users(name) VALUES ('test');
$ SELECT * FROM users;
+----+--------+----------------------------+
| id | name | createdAt |
+----+--------+----------------------------+
| 1 | test | 2022-02-02 23:09:57.047981 |
+----+--------+----------------------------+
After failover of RDS Aurora Cluster, I added another row and checked the table.
$ INSERT INTO users(name) VALUES ('temp');
$ SELECT * FROM users;
+-----+--------+----------------------------+
| id | name | createdAt |
+-----+--------+----------------------------+
| 1 | test | 2022-02-01 11:09:57.047981 |
| 32 | temp | 2022-02-01 11:25:57.047981 |
+-----+--------+----------------------------+
After failover, the id value that should be 2 became 32.
Why is this happening?
Is there any way to solve this problem?
That is to be expected. Sequence modifications are not WAL-logged every time nextval is called, because that could become a performance bottleneck. Rather, a WAL record is written every 32 calls. That means that the sequence can skip some values after a crash or a failover to the standby.
You may want to read my ruminations about gaps in sequences.
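To make the mechanism concrete, here is a toy model in Python of a sequence that logs ahead rather than on every call. It is only a sketch of the idea described above, not PostgreSQL's actual implementation; the class `ToySequence` and its attribute names are hypothetical, and the constant 32 is taken from the answer.

```python
SEQ_LOG_VALS = 32  # values covered ahead by one "WAL record", per the answer above


class ToySequence:
    """Toy model of a sequence that is WAL-logged ahead, not on every nextval."""

    def __init__(self):
        self.last_value = 0    # in-memory state, lost in a crash or failover
        self.logged_value = 0  # durable state, the only thing that survives

    def nextval(self):
        self.last_value += 1
        if self.last_value > self.logged_value:
            # One log record covers this value plus the next SEQ_LOG_VALS values,
            # so the following calls need no further logging.
            self.logged_value = self.last_value + SEQ_LOG_VALS
        return self.last_value

    def crash_and_recover(self):
        # Recovery only knows the logged value, so the sequence resumes from
        # there and every unused value before it is skipped.
        self.last_value = self.logged_value
```

In this model, calling nextval() once, then crash_and_recover(), then nextval() again returns a value far beyond 2. In a real cluster the exact value after failover depends on how far into the 32-call window the sequence was, so the size of the gap varies.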
For the second day in a row, I cannot get past a connection error through pgbouncer when I use auth_type = hba:
postgres=# create user monitoring with password 'monitoring';
postgres=# create database monitoring owner monitoring;
postgres=# \du+ monitoring
List of roles
Role name | Attributes | Member of | Description
------------+------------+-----------+-------------
monitoring | | {} |
postgres=# \l+ monitoring
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges | Size | Tablespace | Description
------------+------------+----------+-------------+-------------+-------------------+---------+------------+-------------
monitoring | monitoring | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 7861 kB | pg_default |
/var/lib/pgsql/10/data/pg_hba.conf:
# TYPE DATABASE USER ADDRESS METHOD
host monitoring monitoring 0.0.0.0/0 trust
local monitoring monitoring trust
/etc/pgbouncer/pgbouncer.ini:
pidfile = /var/run/pgbouncer/pgbouncer.pid
reserve_pool_size = 5
reserve_pool_timeout = 2
listen_port = 6432
listen_addr = *
auth_type = hba
auth_hba_file = /etc/pgbouncer/hba_bouncer.conf
auth_file = /etc/pgbouncer/userlist.txt
logfile = /var/log/pgbouncer/pgbouncer.log
log_connections = 0
log_disconnections = 0
log_pooler_errors = 1
max_client_conn = 5000
server_idle_timeout = 30
pool_mode = transaction
server_reset_query =
admin_users = root
stats_users = root,monitoring
[databases]
* = client_encoding=UTF8 host=localhost port=5432 pool_size=1000
In pgbouncer's HBA file I also tried specifying the specific addresses of the server's interfaces with a /32 mask, as well as with /8 and /16 masks (the latter being the real mask of my network segment).
The result is always the same: login rejected!
/etc/pgbouncer/hba_bouncer.conf:
host monitoring monitoring 0.0.0.0/0 trust
host monitoring monitoring 127.0.0.1/32 trust
/etc/pgbouncer/userlist.txt:
"monitoring" "monitoring"
Connection attempt:
# psql -U monitoring -p 5432 -h 127.0.0.1
psql (10.1)
Type "help" for help.
monitoring=>
# psql -U monitoring -p 6432 -h 127.0.0.1
psql: ERROR: login rejected
We have a use case similar to yours. We are running version 1.12.0 and ran into the same issue, where we also got the "ERROR: login rejected" message.
It turned out after investigation that the permissions on the pg_hba.conf file used by pgbouncer were incorrect. Once we gave pgbouncer read permission on it, everything worked as expected. Unfortunately, nothing in the more verbose logging we turned on revealed this, and we only stumbled across the solution through our own testing.
P.S. We had left the password hash in the pgbouncer config as "" because we're using trust on our connection. Otherwise, I don't think there is anything different in our config from what you posted.
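Since the logs did not reveal the permission problem, a quick readability check can help rule it out. The following is a minimal sketch in Python, assuming it is run as the same OS user that pgbouncer runs as (otherwise the result says nothing about pgbouncer). The helper name `unreadable_files` is hypothetical; the paths are the ones from the pgbouncer.ini in the question.

```python
import os


def unreadable_files(paths):
    """Return the subset of `paths` that the current OS user cannot read."""
    return [p for p in paths if not os.access(p, os.R_OK)]


if __name__ == "__main__":
    # Paths taken from the question's pgbouncer.ini; adjust to your setup.
    candidates = [
        "/etc/pgbouncer/hba_bouncer.conf",
        "/etc/pgbouncer/userlist.txt",
    ]
    for path in unreadable_files(candidates):
        print(f"not readable by uid {os.getuid()}: {path}")
```

A missing file is reported the same way as an unreadable one, which is acceptable here because either condition would keep pgbouncer from loading its HBA rules.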
Our Postgres BDR database system stopped replicating data between the nodes.
When I checked with pg_xlog_location_diff, I noticed that the amount of WAL retained by the replication slot keeps growing.
SELECT slot_name, database, active, pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
WHERE plugin = 'bdr';
slot_name | database | active | retained_bytes
-----------------------------------------+--------------+--------+----------------
bdr_26702_6275336279642079463_1_20305__ | ourdatabase | f | 32253352
I also noticed that the slot is marked as active=false.
SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]+----------------------------------------
slot_name | bdr_26702_6275336279642079463_1_20305__
plugin | bdr
slot_type | logical
datoid | 26702
database | ourdatabase
active | f
xmin |
catalog_xmin | 8041
restart_lsn | 0/5F0C6C8
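For reference, the retained_bytes figure is just the distance in bytes between two WAL positions. An LSN such as 0/5F0C6C8 is written as two hexadecimal numbers, the high and low 32 bits of a 64-bit byte position. The following Python sketch reproduces that arithmetic; the function names `lsn_to_int` and `lsn_diff` are hypothetical helpers, not part of PostgreSQL.

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '0/5F0C6C8' to its 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff(lsn_a, lsn_b):
    """Bytes between two LSNs, mirroring pg_xlog_location_diff(lsn_a, lsn_b)."""
    return lsn_to_int(lsn_a) - lsn_to_int(lsn_b)
```

Passing the current insert location as the first argument and the slot's restart_lsn as the second gives the number of bytes of WAL the slot is holding back.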
I increased the Postgres logging level, but the only messages I see in the log are:
LOCATION: LogicalIncreaseRestartDecodingForSlot, logical.c:886
DEBUG: 00000: updated xmin: 1 restart: 0
LOCATION: LogicalConfirmReceivedLocation, logical.c:958
DEBUG: 00000: failed to increase restart lsn: proposed 0/7DCE6F8, after 0/7DCE6F8, current candidate 0/7DCE6F8, current after 0/7DCE6F8, flushed up to 0/7DCE6F8
Please let me know if you have an idea how I can re-activate the replication slot and allow the replication to resume.
Unless you have a really huge amount of data, I cannot see any reason not to recreate the replication from scratch. Stop the slave, delete the slot on the master, delete the data directory on the slave, create a new slot (with the same name, to avoid further changes on the slave), and run pg_basebackup.
You can find a good tutorial here.
I am using Postgres for one of my applications, and sometimes (not very frequently) one of the connections goes into the <IDLE> in transaction state and keeps the locks it has acquired. Other connections then wait on those locks, which ultimately causes my application to hang.
The following is the output from pg_stat_activity for that process:
select * from pg_stat_activity
24081 | db | 798 | 16384 | db | | 10.112.61.218 | | 59034 | 2013-09-12 23:46:05.132267+00 | 2013-09-12 23:47:31.763084+00 | 2013-09-12 23:47:31.763534+00 | f | <IDLE> in transaction
This indicates that PID=798 is in the <IDLE> in transaction state. The client process on the web server can be found as follows, using the client_port (59034) from the output above.
sudo netstat -apl | grep 59034
tcp 0 0 ip-10-112-61-218.:59034 db-server:postgresql ESTABLISHED 23843/pgbouncer
I know that something in my application code is causing the connection to hang (I killed one of the application's running cron jobs and that freed the locks), but I am not able to trace it.
This does not happen very often, and I can't find any definite reproduction steps either, as it only occurs on the production server.
I would like input on how to trace such idle connections, e.g. by getting the last executed query or some kind of traceback that identifies which part of the code is causing the issue.
If you upgrade to 9.2 or higher, the pg_stat_activity view will show you what the most recent query executed was for idle in transaction connections.
select * from pg_stat_activity \x\g\x
...
waiting | f
state | idle in transaction
query | select count(*) from pg_class ;
You can also (even in 9.1) look in pg_locks to see which locks are being held by the idle in transaction process. If it only holds locks on very commonly used objects, this might not narrow things down much, but if it holds an unusual lock, that could tell you exactly where in your code to look.
If you are stuck with 9.1, you can perhaps use the debugger to get all but the first 22 characters of the query (the first 22 are overwritten by the <IDLE> in transaction\0 message). For example:
(gdb) printf "%s\n", ((MyBEEntry->st_activity)+22)
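The offset of 22 follows from the length of the marker string: 21 characters plus the terminating NUL byte. The following Python sketch models the effect described above on a plain byte buffer. It is an illustration only, not PostgreSQL code, and the names `IDLE_MARKER` and `visible_query_tail` are hypothetical.

```python
IDLE_MARKER = b"<IDLE> in transaction\0"  # 21 characters plus the NUL terminator


def visible_query_tail(original_query):
    """Model the activity buffer after the idle marker overwrites its start.

    Returns the part of the original query that is still recoverable, i.e.
    everything past the first len(IDLE_MARKER) bytes, up to the next NUL.
    """
    buf = bytearray(original_query)
    buf[: len(IDLE_MARKER)] = IDLE_MARKER  # marker clobbers the head of the buffer
    tail = bytes(buf[len(IDLE_MARKER):])
    return tail.split(b"\0", 1)[0]
```

For short statements this leaves little or nothing to recover, so the technique is mainly useful when the last query was long enough that its tail identifies it.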