I have 3 VM nodes running Master-Slave Postgres-11. They are being managed by Pacemaker.
Node Attributes:
* Node node04:
+ master-pgsqlins : 1000
+ pgsqlins-data-status : LATEST
+ pgsqlins-master-baseline : 000000C0D8000098
+ pgsqlins-status : PRI
* Node node05:
+ master-pgsqlins : -INFINITY
+ pgsqlins-data-status : STREAMING|ASYNC
+ pgsqlins-status : HS:async
* Node node06:
+ master-pgsqlins : 100
+ pgsqlins-data-status : STREAMING|SYNC
+ pgsqlins-status : HS:sync
The async node occasionally throws an error saying that a required WAL file is missing. It then stops replication and starts it again.
On the master node, WAL archiving is enabled and the segments are synced to another folder named wal_archive. Another process keeps removing files from that wal_archive folder. So I understand why the slave node would throw that error, but what I want to understand is how it is able to start up again without the missing file.
The postgresql.conf
# Connection settings
# -------------------
listen_addresses = '*'
port = 5432
max_connections = 600
tcp_keepalives_idle = 0
tcp_keepalives_interval = 0
tcp_keepalives_count = 0
# Memory-related settings
# -----------------------
shared_buffers = 2GB # Physical memory 1/4
##DEBUG: mmap(1652555776) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
#huge_pages = try # on, off, or try
#temp_buffers = 16MB # depends on DB checklist
work_mem = 8MB # Need tuning
effective_cache_size = 4GB # Physical memory 1/2
maintenance_work_mem = 512MB
wal_buffers = 64MB
# WAL/Replication/HA settings
# --------------------
wal_level = logical
synchronous_commit = remote_write
archive_mode = on
archive_command = 'rsync -a %p /xxxxx/wal_archive/%f'
#archive_command = ':'
max_wal_senders=5
hot_standby = on
restart_after_crash = off
wal_sender_timeout = 60000
wal_receiver_status_interval = 2
max_standby_streaming_delay = -1
max_standby_archive_delay = -1
hot_standby_feedback = on
random_page_cost = 1.5
max_wal_size = 5GB
min_wal_size = 200MB
checkpoint_completion_target = 0.9
checkpoint_timeout = 30min
# Logging settings
# ----------------
log_destination = 'csvlog,syslog'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql_%Y%m%d.log'
log_truncate_on_rotation = off
log_rotation_age = 1h
log_rotation_size = 0
log_timezone = 'Japan'
log_line_prefix = '%t [%p]: [%l-1] %h:%u#%d:[XXXPG]:CODE:%e '
log_statement = ddl
log_min_messages = info # DEBUG5
log_min_error_statement = info # DEBUG5
log_error_verbosity = default
log_checkpoints = on
log_lock_waits = on
log_temp_files = 0
log_connections = on
log_disconnections = on
log_duration = off
log_min_duration_statement = 1000
log_autovacuum_min_duration = 3000ms
track_functions = pl
track_activity_query_size = 8192
# Locale/display settings
# -----------------------
lc_messages = 'C'
lc_monetary = 'en_US.UTF-8' # ja_JP.eucJP
lc_numeric = 'en_US.UTF-8' # ja_JP.eucJP
lc_time = 'en_US.UTF-8' # ja_JP.eucJP
timezone = 'Asia/Tokyo'
bytea_output = 'escape'
# Auto vacuum settings
# -----------------------
autovacuum = on
autovacuum_max_workers = 3
autovacuum_vacuum_cost_limit = 200
#shared_preload_libraries = 'pg_stat_statements,auto_explain' <------------------check this
auto_explain.log_min_duration = 10000
auto_explain.log_analyze = on
include '/var/lib/pgsql/tmp/rep_mode.conf' # added by pgsql RA
On the async slave node, this is the recovery.conf
primary_conninfo = 'host=1xx.xx.xx.xx port=5432 user=replica application_name=node05 keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = 'rsync -a /xxxxx/wal_archive/%f %p'
recovery_target_timeline = 'latest'
standby_mode = 'on'
The logs about the error from master
2021-07-05 23:35:02.321 JST,,,28926,,60e16b42.70fe,122,,2021-07-04 17:03:14 JST,,0,LOG,00000,"checkpoint complete: wrote 2897 buffers (1.1%); 0 WAL file(s) added, 0 removed, 2 recycled; write=106.770 s, sync=0.050 s, total=106.827 s; sync files=251, longest=0.017 s, average=0.001 s; distance=20262 kB, estimate=46658 kB",,,,,,,,,""
2021-07-05 23:35:02.322 JST,,,28926,,60e16b42.70fe,123,,2021-07-04 17:03:14 JST,,0,LOG,00000,"checkpoint starting: immediate force wait",,,,,,,,,""
2021-07-05 23:35:02.347 JST,,,28926,,60e16b42.70fe,124,,2021-07-04 17:03:14 JST,,0,LOG,00000,"checkpoint complete: wrote 173 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.007 s, sync=0.012 s, total=0.026 s; sync files=43, longest=0.005 s, average=0.001 s; distance=14410 kB, estimate=43434 kB",,,,,,,,,""
2021-07-05 23:35:02.348 JST,"replica","",3451,"1xx.xx.xx.xxx:45120",60e16bfc.d7b,3,"streaming C1/97C3E000",2021-07-04 17:06:20 JST,116/0,0,ERROR,XX000,"requested WAL segment 00000001000000C100000097 has already been removed",,,,,,,,,"node05"
2021-07-05 23:35:02.361 JST,"replica","",3451,"1xx.xx.xx.xxx:45120",60e16bfc.d7b,4,"idle",2021-07-04 17:06:20 JST,,0,LOG,00000,"disconnection: session time: 30:28:41.550 user=replica database= host=172.17.48.141 port=45120",,,,,,,,,"node05"
2021-07-05 23:35:02.399 JST,,,24896,"1xx.xx.xx.xxx:49278",60e31896.6140,1,"",2021-07-05 23:35:02 JST,,0,LOG,00000,"connection received: host=1xx.xx.xx.xxx port=49278",,,,,,,,,""
2021-07-05 23:35:02.401 JST,"postgres","postgres",24851,"[local]",60e31896.6113,3,"idle",2021-07-05 23:35:02 JST,,0,LOG,00000,"disconnection: session time: 0:00:00.251 user=postgres database=postgres host=[local]",,,,,,,,,"postgres#node04"
2021-07-05 23:35:02.403 JST,"replica","",24896,"1xx.xx.xx.xxx:49278",60e31896.6140,2,"authentication",2021-07-05 23:35:02 JST,116/72,0,LOG,00000,"replication connection authorized: user=replica",,,,,,,,,""
The logs about the error from async slave node
2021-07-05 23:35:02.359 JST,,,2541,,60e16bfc.9ed,2,,2021-07-04 17:06:20 JST,,0,FATAL,XX000,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000000C100000097 has already been removed",,,,,,,,,""
2021-07-05 23:35:02.408 JST,,,4703,,60e31896.125f,1,,2021-07-05 23:35:02 JST,,0,LOG,00000,"started streaming WAL from primary at C1/98000000 on timeline 1",,,,,,,,,""
2021-07-05 23:35:03.318 JST,,,4835,"[local]",60e31897.12e3,1,"",2021-07-05 23:35:03 JST,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
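To relate the LSNs in these logs to the segment name in the error, here is a minimal Python sketch of the standard LSN-to-file-name mapping. It assumes the default 16MB WAL segment size, which is not confirmed for this cluster, and the function name wal_segment_name is made up for illustration.

```python
def wal_segment_name(lsn, timeline=1, segment_size=16 * 1024 * 1024):
    """Return the WAL segment file name that contains the given LSN.

    Assumes the default 16MB segment size unless told otherwise.
    """
    high_hex, low_hex = lsn.split("/")
    # An LSN is a 64-bit byte position printed as two 32-bit hex halves.
    position = (int(high_hex, 16) << 32) | int(low_hex, 16)
    segment_number = position // segment_size
    # Number of segments that fit in one 4GB "log id" range.
    segments_per_4gb = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_4gb,
        segment_number % segments_per_4gb,
    )


if __name__ == "__main__":
    # Position the walsender was streaming from when the error was raised.
    print(wal_segment_name("C1/97C3E000"))
    # Position at which the standby restarted streaming.
    print(wal_segment_name("C1/98000000"))
```

Under that assumption, C1/97C3E000 falls in segment 00000001000000C100000097, the one reported as removed, while the restart position C1/98000000 is the first byte of the following segment, 00000001000000C100000098.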
The sync slave node doesn't throw this error, only the async slave node, and even that recovers without any manual intervention. Is there a way to avoid this error other than no longer removing the archived WAL files from the wal_archive folder every 2 minutes?
I have installed Postgres in a containerized environment using docker-compose, with the Docker image crunchydata/crunchy-postgres-gis:centos7-11.5-2.4.2. Everything was running fine until I realized that PG_DIR/pg_wal is taking a lot of disk space. I don't want to run pg_archivecleanup by hand every time, nor in a cron job; I want to configure Postgres to do that automatically. What is the correct configuration for that?
This is my postgresql.conf file.
listen_addresses = '*' # what IP address(es) to listen on;
port = 5432 # (change requires restart)
unix_socket_directories = '/tmp' # comma-separated list of directories
unix_socket_permissions = 0777 # begin with 0 to use octal notation
temp_buffers = 8MB # min 800kB
max_connections = 400
shared_buffers = 1536MB
effective_cache_size = 4608MB
maintenance_work_mem = 384MB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 4MB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
unix_socket_directories = '/tmp' # comma-separated list of directories
unix_socket_permissions = 0777 # begin with 0 to use octal notation
shared_preload_libraries = 'pg_stat_statements.so' # (change requires restart)
#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------
wal_level = hot_standby # minimal, archive, or hot_standby
max_wal_senders = 6 # max number of walsender processes
wal_keep_segments = 400 # in logfile segments, 16MB each; 0 disables
hot_standby = on # "on" allows queries during recovery
max_standby_archive_delay = 30s # max delay before canceling queries
max_standby_streaming_delay = 30s # max delay before canceling queries
wal_receiver_status_interval = 10s # send replies at least this often
archive_mode = on # enables archiving; off, on, or always
# (change requires restart)
archive_command = 'pgbackrest archive-push %p' # command to use to archive a logfile segment
# placeholders: %p = path of file to archive
# %f = file name only
# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
archive_timeout = 60 # force a logfile segment switch after this
# number of seconds; 0 disables
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------
log_destination = 'stderr' # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
log_directory = 'pg_log' # directory where log files are written,
log_filename = 'postgresql-%a.log' # log file name pattern,
log_truncate_on_rotation = on # If on, an existing log file with the
log_rotation_age = 1d # Automatic rotation of logfiles will
log_rotation_size = 0 # Automatic rotation of logfiles will
log_min_duration_statement = 0 # -1 is disabled, 0 logs all statements
log_checkpoints = on
log_connections = on
log_disconnections = on
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h'
log_lock_waits = on # log lock waits >= deadlock_timeout
log_timezone = 'US/Eastern'
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
datestyle = 'iso, mdy'
timezone = 'US/Eastern'
lc_messages = 'C' # locale for system error message
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
Thanks
You haven't shown us any evidence that pgbackrest has anything to do with this. If it is failing, you should see messages about that in the server's log file. If it is succeeding, then it should be taking up space in the archive, wherever that is, not in pg_wal.
But wal_keep_segments = 400 will lead to over 6.25GB of pg_wal being retained. I don't know if that constitutes "a lot" or not.
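As a quick check of that figure, here is a small Python sketch of the arithmetic. It assumes the 16MB segment size mentioned in the config comment; the helper name retained_wal_gib is hypothetical.

```python
SEGMENT_SIZE_MB = 16  # segment size stated in the postgresql.conf comment


def retained_wal_gib(wal_keep_segments, segment_size_mb=SEGMENT_SIZE_MB):
    """Minimum amount of WAL (in GiB) kept in pg_wal by wal_keep_segments."""
    return wal_keep_segments * segment_size_mb / 1024


print(retained_wal_gib(400))  # 400 segments * 16 MB = 6400 MB
```

This is only the floor set by wal_keep_segments; segments that are waiting to be archived or recycled come on top of it.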
pg_archivecleanup isn't for cleaning up pg_wal, it is for cleaning up the archive.
I tried to deploy HAWQ 2.0 but could not get the HAWQ Master to run. Below is the error log:
[gpadmin@hdps31hwxworker2 hawqAdminLogs]$ cat ~/hawqAdminLogs/hawq_init_20160805.log
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Prepare to do 'hawq init'
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-You can find log in:
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_init_20160805.log
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-GPHOME is set to:
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-/usr/local/hawq/.
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[DEBUG]:-Current user is 'gpadmin'
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[DEBUG]:-Parsing config file:
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[DEBUG]:-/usr/local/hawq/./etc/hawq-site.xml
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Init hawq with args: ['init', 'master']
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_master_address_host is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_master_address_port is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_master_directory is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_segment_directory is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_segment_address_port is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_dfs_url is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_master_temp_directory is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_segment_temp_directory is set
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check if hdfs path is available
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[DEBUG]:-Check hdfs: /usr/local/hawq/./bin/gpcheckhdfs hdfs hdpsm2demo4.demo.local:8020/hawq_default off
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[WARNING]:-2016-08-05 23:00:11.338621, p50546, th139769637427168, WARNING the number of nodes in pipeline is 1 [172.17.15.31(172.17.15.31)], is less than the expected number of replica 3 for block [block pool ID: isi_hdfs_pool block ID 4341187780_1000] file /hawq_default/testFile
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-1 segment hosts defined
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Set default_hash_table_bucket_number as: 6
20160805:23:00:17:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Start to init master
The files belonging to this database system will be owned by user "gpadmin".
This user must also own the server process.
The database cluster will be initialized with locale en_US.utf8.
fixing permissions on existing directory /data/hawq/master ... ok
creating subdirectories ... ok
selecting default max_connections ... 1280
selecting default shared_buffers/max_fsm_pages ... 125MB/200000
creating configuration files ... ok
creating template1 database in /data/hawq/master/base/1 ... 2016-08-05 22:00:18.554441 GMT,,,p50803,th-1212598144,,,,0,,,seg-10000,,,,,"WARNING","01000","""fsync"": can not be set by the user and will be ignored.",,,,,,,,"set_config_option","guc.c",10023,
ok
loading file-system persistent tables for template1 ...
2016-08-05 22:00:20.023594 GMT,,,p50835,th38852736,,,,0,,,seg-10000,,,,,"WARNING","01000","""fsync"": can not be set by the user and will be ignored.",,,,,,,,"set_config_option","guc.c",10023,
2016-08-05 23:00:20.126221 BST,,,p50835,th38852736,,,,0,,,seg-10000,,,,,"FATAL","XX000","could not create shared memory segment: Invalid argument (pg_shmem.c:183)","Failed system call was shmget(key=1, size=506213024, 03600).","This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX. To reduce the request size (currently 506213024 bytes), reduce PostgreSQL's shared_buffers parameter (currently 4000) and/or its max_connections parameter (currently 3000).
If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
The PostgreSQL documentation contains more information about shared memory configuration.",,,,,,"InternalIpcMemoryCreate","pg_shmem.c",183,1 0x87463a postgres errstart + 0x22a
2 0x74c5e6 postgres <symbol not found> + 0x74c5e6
3 0x74c7cd postgres PGSharedMemoryCreate + 0x3d
4 0x7976b6 postgres CreateSharedMemoryAndSemaphores + 0x336
5 0x880489 postgres BaseInit + 0x19
6 0x7b03bc postgres PostgresMain + 0xdbc
7 0x6c07d5 postgres main + 0x535
8 0x3c0861ed1d libc.so.6 __libc_start_main + 0xfd
9 0x4a14e9 postgres <symbol not found> + 0x4a14e9
child process exited with exit code 1
initdb: removing contents of data directory "/data/hawq/master"
Master postgres initdb failed
20160805:23:00:20:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Master postgres initdb failed
20160805:23:00:20:050348 hawq_init:hdps31hwxworker2:gpadmin-[ERROR]:-Master init failed, exit
These are the settings in Advanced gpcheck:
[global]
configfile_version = 4
[linux.mount]
mount.points = /
[linux.sysctl]
sysctl.kernel.shmmax = 500000000
sysctl.kernel.shmmni = 4096
sysctl.kernel.shmall = 400000000
sysctl.kernel.sem = 250 512000 100 2048
sysctl.kernel.sysrq = 1
sysctl.kernel.core_uses_pid = 1
sysctl.kernel.msgmnb = 65536
sysctl.kernel.msgmax = 65536
sysctl.kernel.msgmni = 2048
sysctl.net.ipv4.tcp_syncookies = 0
sysctl.net.ipv4.ip_forward = 0
sysctl.net.ipv4.conf.default.accept_source_route = 0
sysctl.net.ipv4.tcp_tw_recycle = 1
sysctl.net.ipv4.tcp_max_syn_backlog = 200000
sysctl.net.ipv4.conf.all.arp_filter = 1
sysctl.net.ipv4.ip_local_port_range = 1281 65535
sysctl.net.core.netdev_max_backlog = 200000
sysctl.vm.overcommit_memory = 2
sysctl.fs.nr_open = 2000000
sysctl.kernel.threads-max = 798720
sysctl.kernel.pid_max = 798720
# increase network
sysctl.net.core.rmem_max = 2097152
sysctl.net.core.wmem_max = 2097152
[linux.limits]
soft.nofile = 2900000
hard.nofile = 2900000
soft.nproc = 131072
hard.nproc = 131072
[linux.diskusage]
diskusage.monitor.mounts = /
diskusage.monitor.usagemax = 90%
[hdfs]
dfs.mem.namenode.heap = 40960
dfs.mem.datanode.heap = 6144
# in hdfs-site.xml
dfs.support.append = true
dfs.client.enable.read.from.local = true
dfs.block.local-path-access.user = gpadmin
dfs.datanode.max.transfer.threads = 40960
dfs.client.socket-timeout = 300000000
dfs.datanode.socket.write.timeout = 7200000
dfs.namenode.handler.count = 60
ipc.server.handler.queue.size = 3300
dfs.datanode.handler.count = 60
ipc.client.connection.maxidletime = 3600000
dfs.namenode.accesstime.precision = -1
It looks like it is complaining about memory, but I can't seem to find the parameters to change. Where are shared_buffers and max_connections set?
How do I fix this error in general? Thanks.
Your memory settings are too low to initialize the database. Don't bother with shared_buffers or max_connections.
You have:
kernel.shmmax = 500000000
kernel.shmall = 400000000
and it should be:
kernel.shmmax = 1000000000
kernel.shmall = 4000000000
Reference: http://hdb.docs.pivotal.io/hdb/install/install-cli.html
I would also make sure you have enough swap configured on your nodes based on the amount of RAM you have.
Reference: http://hdb.docs.pivotal.io/20/requirements/system-requirements.html
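To see why the current limit fails, the request size in the FATAL message can be compared against kernel.shmmax. The Python sketch below is illustrative only: the helper names are made up, and the regular expression is written against the one log line shown in the question.

```python
import re


def shmget_request_size(log_line):
    """Extract the requested size in bytes from a 'shmget(key=..., size=...' message."""
    match = re.search(r"shmget\(key=\d+, size=(\d+),", log_line)
    return int(match.group(1)) if match else None


def fits_in_shmmax(request_size, shmmax):
    """A single shared memory segment cannot be larger than kernel.shmmax."""
    return request_size <= shmmax


detail = "Failed system call was shmget(key=1, size=506213024, 03600)."
request = shmget_request_size(detail)

print(request)
print(fits_in_shmmax(request, 500000000))   # current kernel.shmmax
print(fits_in_shmmax(request, 1000000000))  # suggested kernel.shmmax
```

The request of 506213024 bytes is slightly above the configured 500000000, which is why shmget is rejected, and it fits comfortably under the suggested 1000000000.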
shared_buffers sets the amount of memory a HAWQ segment instance uses for shared memory buffers. This setting must be at least 128KB and at least 16KB times max_connections.
When setting shared_buffers, the values of the operating system parameters SHMMAX or SHMALL might also need to be adjusted.
The value of SHMMAX must be greater than this value:
shared_buffers + other_seg_shmem
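The lower bound on shared_buffers described above can be sketched in Python as follows. The function name is hypothetical, and the rule encoded is just the one stated in this answer (at least 128KB and at least 16KB per connection), not something read from HAWQ itself.

```python
def min_shared_buffers_kb(max_connections):
    """Smallest shared_buffers value (in KB) allowed for a given max_connections."""
    return max(128, 16 * max_connections)


# max_connections values that appear in the initdb output and the error message.
for connections in (1280, 3000):
    print(connections, min_shared_buffers_kb(connections), "KB")
```

Whatever value is chosen, shared_buffers plus the other segment shared memory still has to stay below SHMMAX.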
You can view and set parameter values using the "hawq config" utility:
hawq config -s shared_buffers (shows the current value)
hawq config -c shared_buffers -v value (sets it to value)
Please let me know how that goes!