Postgres performance degrading as cache gets consumed

I migrated a Postgres 9.1 database doing 300K transactions/hour from a server running Red Hat, with an Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz / 16 cores, 64 GB RAM, and 4 x 240 GB Intel SSD, to one with an Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz / 56 cores, 128 GB RAM, and a 2 TB NVMe PCIe SSD (450,000 random read IOPS, 56,000 random write IOPS), running CentOS 6.9.
Over time the server slows down and the amount of data processed drops. If I clear the cache manually (sync; echo 3 > /proc/sys/vm/drop_caches), data processing resumes at the maximum level. After some time under load, performance deteriorates again in terms of the amount of data processed. The cache memory shows as fully consumed.
pg configuration:
datestyle = 'redwood,show_time'
db_dialect = 'redwood'
default_text_search_config = 'pg_catalog.english'
edb_dynatune = 90
edb_redwood_date = on
edb_redwood_strings = on
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
shared_preload_libraries = '$libdir/dbms_pipe,$libdir/edb_gen,$libdir/plugins/plugin_debugger,$libdir/plugins/plugin_spl_debugger'
timed_statistics = off
archive_command = 'rsync -a %p slave:/opt/PostgresPlus/9.1AS/wals/%f'
archive_mode = on
listen_addresses = '*'
log_destination = 'syslog'
syslog_facility = 'LOCAL0'
logging_collector = on
log_line_prefix = '%t'
max_wal_senders = 4
port = 6432
wal_keep_segments = 128
wal_level = hot_standby
temp_buffers='50MB'
constraint_exclusion = on
autovacuum = on
enable_bitmapscan = off
max_connections = 200
shared_buffers = 32GB
effective_cache_size = 96GB
work_mem = 167772kB
maintenance_work_mem = 2GB
checkpoint_segments = 64
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
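For a sense of the memory budget these settings imply, here is a rough back-of-the-envelope sketch (each sort or hash node can use up to work_mem, and one query may run several such nodes, so this is only a lower bound on the worst case):

```python
# Rough worst-case memory budget implied by the settings above.
KB = 1024
GB = 1024 ** 3

shared_buffers = 32 * GB
work_mem = 167772 * KB
max_connections = 200

worst_case_work_mem = work_mem * max_connections
print(f"shared_buffers: {shared_buffers / GB:.1f} GiB")
print(f"work_mem x max_connections: {worst_case_work_mem / GB:.1f} GiB")
# Together these already approach the 128 GB of RAM before the OS page
# cache (which effective_cache_size = 96GB assumes is available) is counted.
```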

Related

PostgreSQL 12: too many WAL files

We have a PostgreSQL 12 DB in production. Today we realized disk usage on the master server has grown compared to last month (last month: 4.4 TB out of 14 TB; now: 9.8 TB out of 14 TB). When I run ncdu, our actual PostgreSQL data is just 3.4 TB; the other 6.4 TB is used by WAL files alone. We have a standby server as well. WAL archiving is enabled, and we continuously store WAL files on another backup server alongside the base backup of our DB. Are all these WAL files still necessary even after we back them up? If not, what should we do to free our disk space from unnecessary WAL files? Here is our postgresql.conf:
Our server's specs:
CentOS 7
48-core Intel Xeon Gold CPU
256 GB memory
4 x 7.6 TB SSD in RAID 1+0 (total ~14 TB)
listen_addresses = '*'
max_connections = 500
superuser_reserved_connections = 10
password_encryption = md5
shared_buffers = 64GB
max_prepared_transactions = 100
work_mem = 83886kB
maintenance_work_mem = 2GB
max_stack_depth = 2MB
dynamic_shared_memory_type = posix
bgwriter_delay = 100ms
bgwriter_lru_maxpages = 10000
bgwriter_lru_multiplier = 10.0
effective_io_concurrency = 200
max_worker_processes = 48
max_parallel_maintenance_workers = 4
max_parallel_workers_per_gather = 4
max_parallel_workers = 48
wal_level = replica
wal_sync_method = fdatasync
wal_compression = on
wal_log_hints = on
wal_buffers = 32MB
commit_delay = 0
max_wal_size = 16GB
min_wal_size = 4GB
checkpoint_completion_target = 0.9
archive_mode = on
archive_command = 'test ! -f /pgdata/wal_backup/%f && cp %p /pgdata/wal_backup/%f && /var/lib/pgsql/backup_wal.sh'
archive_timeout = 3600
random_page_cost = 1.1
effective_cache_size = 192GB
default_statistics_target = 100
log_destination = 'stderr'
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_duration_statement = 4000
log_checkpoints = on
log_line_prefix = '<user=%u db=%d host=%h pid=%p app=%a time=%m > '
log_lock_waits = on
log_temp_files = 0
log_timezone = 'Europe/Istanbul'
cluster_name = 'pg12/primary'
track_io_timing = on
track_functions = all
log_autovacuum_min_duration = 0
statement_timeout = 3600000
datestyle = 'iso, mdy'
timezone = 'Europe/Istanbul'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
shared_preload_libraries = 'pg_stat_statements'
max_locks_per_transaction = 128
pg_stat_statements.max = 10000
pg_stat_statements.track = all
pg_stat_statements.track_utility = on
pg_stat_statements.save = on
One possibility is an unused replication slot (an in-use slot would be named in primary_slot_name on the standby). Check pg_replication_slots.
The other is that your archiver is failing. Check pg_stat_archiver.
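To put the reported numbers in perspective, a quick back-of-the-envelope calculation (assuming the default 16 MB segment size, and using max_wal_size = 16GB from the posted config):

```python
MB = 1024 ** 2
TB = 1024 ** 4

segment_size = 16 * MB           # default WAL segment size
wal_on_disk = 6.4 * TB           # WAL usage reported by ncdu
max_wal_size = 16 * 1024 * MB    # max_wal_size = 16GB from postgresql.conf

print(f"segments on disk: ~{wal_on_disk / segment_size:,.0f}")
print(f"segments checkpoints would normally keep: ~{max_wal_size / segment_size:,.0f}")
# Hundreds of thousands of segments vs ~1,024 is far beyond normal
# checkpoint retention, which points at a slot or a failing archiver
# pinning old WAL.
```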

Async slave node missing WAL files on Postgres 11

I have 3 VM nodes running master-slave Postgres 11, managed by Pacemaker.
Node Attributes:
* Node node04:
+ master-pgsqlins : 1000
+ pgsqlins-data-status : LATEST
+ pgsqlins-master-baseline : 000000C0D8000098
+ pgsqlins-status : PRI
* Node node05:
+ master-pgsqlins : -INFINITY
+ pgsqlins-data-status : STREAMING|ASYNC
+ pgsqlins-status : HS:async
* Node node06:
+ master-pgsqlins : 100
+ pgsqlins-data-status : STREAMING|SYNC
+ pgsqlins-status : HS:sync
The async node at times throws an error that a required WAL file is missing. It then stops replication and starts it again.
On the master node, WAL archiving is enabled and the files are synced to a folder named wal_archive. Another process keeps removing files from that wal_archive folder. So I understand why the slave node would throw that error; what I want to understand is how it is able to start back up again without that missing file.
The postgresql.conf
# Connection settings
# -------------------
listen_addresses = '*'
port = 5432
max_connections = 600
tcp_keepalives_idle = 0
tcp_keepalives_interval = 0
tcp_keepalives_count = 0
# Memory-related settings
# -----------------------
shared_buffers = 2GB # Physical memory 1/4
##DEBUG: mmap(1652555776) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
#huge_pages = try # on, off, or try
#temp_buffers = 16MB # depends on DB checklist
work_mem = 8MB # Need tuning
effective_cache_size = 4GB # Physical memory 1/2
maintenance_work_mem = 512MB
wal_buffers = 64MB
# WAL/Replication/HA settings
# --------------------
wal_level = logical
synchronous_commit = remote_write
archive_mode = on
archive_command = 'rsync -a %p /xxxxx/wal_archive/%f'
#archive_command = ':'
max_wal_senders=5
hot_standby = on
restart_after_crash = off
wal_sender_timeout = 60000
wal_receiver_status_interval = 2
max_standby_streaming_delay = -1
max_standby_archive_delay = -1
hot_standby_feedback = on
random_page_cost = 1.5
max_wal_size = 5GB
min_wal_size = 200MB
checkpoint_completion_target = 0.9
checkpoint_timeout = 30min
# Logging settings
# ----------------
log_destination = 'csvlog,syslog'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql_%Y%m%d.log'
log_truncate_on_rotation = off
log_rotation_age = 1h
log_rotation_size = 0
log_timezone = 'Japan'
log_line_prefix = '%t [%p]: [%l-1] %h:%u#%d:[XXXPG]:CODE:%e '
log_statement = ddl
log_min_messages = info # DEBUG5
log_min_error_statement = info # DEBUG5
log_error_verbosity = default
log_checkpoints = on
log_lock_waits = on
log_temp_files = 0
log_connections = on
log_disconnections = on
log_duration = off
log_min_duration_statement = 1000
log_autovacuum_min_duration = 3000ms
track_functions = pl
track_activity_query_size = 8192
# Locale/display settings
# -----------------------
lc_messages = 'C'
lc_monetary = 'en_US.UTF-8' # ja_JP.eucJP
lc_numeric = 'en_US.UTF-8' # ja_JP.eucJP
lc_time = 'en_US.UTF-8' # ja_JP.eucJP
timezone = 'Asia/Tokyo'
bytea_output = 'escape'
# Auto vacuum settings
# -----------------------
autovacuum = on
autovacuum_max_workers = 3
autovacuum_vacuum_cost_limit = 200
#shared_preload_libraries = 'pg_stat_statements,auto_explain' <------------------check this
auto_explain.log_min_duration = 10000
auto_explain.log_analyze = on
include '/var/lib/pgsql/tmp/rep_mode.conf' # added by pgsql RA
On the async slave node, this is the recovery.conf
primary_conninfo = 'host=1xx.xx.xx.xx port=5432 user=replica application_name=node05 keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = 'rsync -a /xxxxx/wal_archive/%f %p'
recovery_target_timeline = 'latest'
standby_mode = 'on'
The logs about the error from master
2021-07-05 23:35:02.321 JST,,,28926,,60e16b42.70fe,122,,2021-07-04 17:03:14 JST,,0,LOG,00000,"checkpoint complete: wrote 2897 buffers (1.1%); 0 WAL file(s) added, 0 removed
, 2 recycled; write=106.770 s, sync=0.050 s, total=106.827 s; sync files=251, longest=0.017 s, average=0.001 s; distance=20262 kB, estimate=46658 kB",,,,,,,,,""
2021-07-05 23:35:02.322 JST,,,28926,,60e16b42.70fe,123,,2021-07-04 17:03:14 JST,,0,LOG,00000,"checkpoint starting: immediate force wait",,,,,,,,,""
2021-07-05 23:35:02.347 JST,,,28926,,60e16b42.70fe,124,,2021-07-04 17:03:14 JST,,0,LOG,00000,"checkpoint complete: wrote 173 buffers (0.1%); 0 WAL file(s) added, 0 removed,
1 recycled; write=0.007 s, sync=0.012 s, total=0.026 s; sync files=43, longest=0.005 s, average=0.001 s; distance=14410 kB, estimate=43434 kB",,,,,,,,,""
2021-07-05 23:35:02.348 JST,"replica","",3451,"1xx.xx.xx.xxx:45120",60e16bfc.d7b,3,"streaming C1/97C3E000",2021-07-04 17:06:20 JST,116/0,0,ERROR,XX000,"requested WAL segment 00000001000000C100000097 has already been removed",,,,,,,,,"node05"
2021-07-05 23:35:02.361 JST,"replica","",3451,"1xx.xx.xx.xxx:45120",60e16bfc.d7b,4,"idle",2021-07-04 17:06:20 JST,,0,LOG,00000,"disconnection: session time: 30:28:41.550 user=replica database= host=172.17.48.141 port=45120",,,,,,,,,"node05"
2021-07-05 23:35:02.399 JST,,,24896,"1xx.xx.xx.xxx:49278",60e31896.6140,1,"",2021-07-05 23:35:02 JST,,0,LOG,00000,"connection received: host=1xx.xx.xx.xxx port=49278",,,,,,,,,""
2021-07-05 23:35:02.401 JST,"postgres","postgres",24851,"[local]",60e31896.6113,3,"idle",2021-07-05 23:35:02 JST,,0,LOG,00000,"disconnection: session time: 0:00:00.251 user=postgres database=postgres host=[local]",,,,,,,,,"postgres#node04"
2021-07-05 23:35:02.403 JST,"replica","",24896,"1xx.xx.xx.xxx:49278",60e31896.6140,2,"authentication",2021-07-05 23:35:02 JST,116/72,0,LOG,00000,"replication connection authorized: user=replica",,,,,,,,,""
The logs about the error from async slave node
2021-07-05 23:35:02.359 JST,,,2541,,60e16bfc.9ed,2,,2021-07-04 17:06:20 JST,,0,FATAL,XX000,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000000C100000097 has already been removed",,,,,,,,,""
2021-07-05 23:35:02.408 JST,,,4703,,60e31896.125f,1,,2021-07-05 23:35:02 JST,,0,LOG,00000,"started streaming WAL from primary at C1/98000000 on timeline 1",,,,,,,,,""
2021-07-05 23:35:03.318 JST,,,4835,"[local]",60e31897.12e3,1,"",2021-07-05 23:35:03 JST,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
The sync slave node doesn't throw this error, only the async slave node, and even that recovers without any manual intervention. Is there a way to avoid this error other than not removing the archived WAL files from the wal_archive folder every 2 minutes?
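The log messages are easier to follow once you see how a WAL segment file name relates to an LSN. A sketch of the mapping (assuming the default 16 MB segment size and timeline 1, as in these logs):

```python
def wal_segment_name(timeline: int, lsn: str,
                     segment_size: int = 16 * 1024 * 1024) -> str:
    """Map an LSN like 'C1/97C3E000' to its WAL segment file name."""
    hi, lo = (int(part, 16) for part in lsn.split("/"))
    segno = ((hi << 32) | lo) // segment_size
    segments_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (timeline,
                             segno // segments_per_xlogid,
                             segno % segments_per_xlogid)

# The walsender was at 'streaming C1/97C3E000' when the primary reported
# segment 00000001000000C100000097 as removed:
print(wal_segment_name(1, "C1/97C3E000"))  # -> 00000001000000C100000097
# After reconnecting, the standby resumed at C1/98000000, i.e. the very
# next segment:
print(wal_segment_name(1, "C1/98000000"))  # -> 00000001000000C100000098
```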

pg_wal is taking a lot of disk space

I have installed Postgres in a containerized environment using docker-compose, with the image crunchydata/crunchy-postgres-gis:centos7-11.5-2.4.2. All was running fine until I realized that PG_DIR/pg_wal is taking a lot of disk space. I don't want to run pg_archivecleanup by hand or in a cron job; I want to configure Postgres to do the cleanup automatically. What is the correct configuration for that?
This is my postgresql.conf file.
listen_addresses = '*' # what IP address(es) to listen on;
port = 5432 # (change requires restart)
unix_socket_directories = '/tmp' # comma-separated list of directories
unix_socket_permissions = 0777 # begin with 0 to use octal notation
temp_buffers = 8MB # min 800kB
max_connections = 400
shared_buffers = 1536MB
effective_cache_size = 4608MB
maintenance_work_mem = 384MB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 4MB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
unix_socket_directories = '/tmp' # comma-separated list of directories
unix_socket_permissions = 0777 # begin with 0 to use octal notation
shared_preload_libraries = 'pg_stat_statements.so' # (change requires restart)
#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------
wal_level = hot_standby # minimal, archive, or hot_standby
max_wal_senders = 6 # max number of walsender processes
wal_keep_segments = 400 # in logfile segments, 16MB each; 0 disables
hot_standby = on # "on" allows queries during recovery
max_standby_archive_delay = 30s # max delay before canceling queries
max_standby_streaming_delay = 30s # max delay before canceling queries
wal_receiver_status_interval = 10s # send replies at least this often
archive_mode = on # enables archiving; off, on, or always
# (change requires restart)
archive_command = 'pgbackrest archive-push %p' # command to use to archive a logfile segment
# placeholders: %p = path of file to archive
# %f = file name only
# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
archive_timeout = 60 # force a logfile segment switch after this
# number of seconds; 0 disables
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------
log_destination = 'stderr' # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
log_directory = 'pg_log' # directory where log files are written,
log_filename = 'postgresql-%a.log' # log file name pattern,
log_truncate_on_rotation = on # If on, an existing log file with the
log_rotation_age = 1d # Automatic rotation of logfiles will
log_rotation_size = 0 # Automatic rotation of logfiles will
log_min_duration_statement = 0 # -1 is disabled, 0 logs all statements
log_checkpoints = on
log_connections = on
log_disconnections = on
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h'
log_lock_waits = on # log lock waits >= deadlock_timeout
log_timezone = 'US/Eastern'
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
datestyle = 'iso, mdy'
timezone = 'US/Eastern'
lc_messages = 'C' # locale for system error message
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
Thanks
You haven't shown us any evidence that pgbackrest has anything to do with this. If it is failing, you should see messages about that in the server's log file. If it is succeeding, then it should be taking up space in the archive, wherever that is, not in pg_wal.
But wal_keep_segments = 400 will lead to over 6.25GB of pg_wal being retained. I don't know if that constitutes "a lot" or not.
pg_archivecleanup isn't for cleaning up pg_wal, it is for cleaning up the archive.
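The 6.25 GB figure follows directly from the posted settings; a quick check (assuming the default 16 MB segment size):

```python
MB = 1024 ** 2

wal_keep_segments = 400   # from the posted postgresql.conf
segment_size = 16 * MB    # default WAL segment size

retained = wal_keep_segments * segment_size
print(f"pg_wal floor: {retained / 1024**3:.2f} GiB")
# This is a floor, not a ceiling: checkpoints will not remove segments
# below this count, so pg_wal can't shrink past ~6.25 GiB on its own.
```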

PostgreSQL 9.5 Replication Lag running on EC2

I have a series of PostgreSQL 9.5 servers running on r4.16xlarge instances and Amazon Linux 1 that started experiencing replication lag of several seconds this week. The configurations were changed, but the old configs weren't saved, so I'm not sure what the previous settings were. Here are the custom values:
max_connections = 1500
shared_buffers = 128GB
effective_cache_size = 132GB
maintenance_work_mem = 128MB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
#effective_io_concurrency = 10
work_mem = 128MB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 64
synchronous_commit = off
The drive layout is as follows: 4 disks for the xlog drive and 10 for the regular partition, all gp2 disk type.
Personalities : [raid0]
md126 : active raid0 xvdo[3] xvdn[2] xvdm[1] xvdl[0]
419428352 blocks super 1.2 512k chunks
md127 : active raid0 xvdk[9] xvdj[8] xvdi[7] xvdh[6] xvdg[5] xvdf[4] xvde[3] xvdd[2] xvdc[1] xvdb[0]
2097146880 blocks super 1.2 512k chunks
The master server is a smaller c4.8xlarge instance with this setup:
max_connections = 1500
shared_buffers = 15GB
effective_cache_size = 45GB
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 16
work_mem = 26MB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 36
With this drive layout:
Personalities : [raid0]
md126 : active raid0 xvdd[2] xvdc[1] xvdb[0] xvde[3]
419428352 blocks super 1.2 512k chunks
md127 : active raid0 xvdr[12] xvdg[1] xvdo[9] xvdl[6] xvdh[2] xvdf[0] xvdp[10] xvdu[15] xvdm[7] xvdj[4] xvdn[8] xvdk[5] xvdi[3] xvds[13] xvdt[14] xvdq[11]
3355435008 blocks super 1.2 512k chunks
I guess I'm looking for optimal settings for these two instance types so I can eliminate the replication lag. None of the servers are what I would call heavily loaded.
With further digging I found that the following setting fixed the replication lag:
hot_standby_feedback = on
This may cause some WAL bloating on the master but now the backlog is gone.
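If the lag comes back, it helps to measure it in bytes rather than seconds. On 9.5 you would compare pg_current_xlog_location() on the master with pg_last_xlog_replay_location() on the standby; the byte lag between two LSNs is a simple subtraction, sketched here with hypothetical sample values:

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN of the form 'X/Y' (hex) to an absolute byte position."""
    hi, lo = (int(part, 16) for part in lsn.split("/"))
    return (hi << 32) | lo

def replication_lag_bytes(master_lsn: str, standby_lsn: str) -> int:
    return lsn_to_bytes(master_lsn) - lsn_to_bytes(standby_lsn)

# Hypothetical sample values:
print(replication_lag_bytes("C1/98000000", "C1/97C3E000"))  # -> 3940352 (~3.8 MiB)
```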

Master postgres initdb failed while deploying HAWQ 2.0 on Hortonworks

I tried to deploy HAWQ 2.0 but could not get the HAWQ Master to run. Below is the error log:
[gpadmin@hdps31hwxworker2 hawqAdminLogs]$ cat ~/hawqAdminLogs/hawq_init_20160805.log
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Prepare to do 'hawq init'
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-You can find log in:
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_init_20160805.log
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-GPHOME is set to:
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-/usr/local/hawq/.
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[DEBUG]:-Current user is 'gpadmin'
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[DEBUG]:-Parsing config file:
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[DEBUG]:-/usr/local/hawq/./etc/hawq-site.xml
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Init hawq with args: ['init', 'master']
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_master_address_host is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_master_address_port is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_master_directory is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_segment_directory is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_segment_address_port is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_dfs_url is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_master_temp_directory is set
20160805:23:00:10:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check: hawq_segment_temp_directory is set
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Check if hdfs path is available
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[DEBUG]:-Check hdfs: /usr/local/hawq/./bin/gpcheckhdfs hdfs hdpsm2demo4.demo.local:8020/hawq_default off
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[WARNING]:-2016-08-05 23:00:11.338621, p50546, th139769637427168, WARNING the number of nodes in pipeline is 1 [172.17.15.31(172.17.15.31)], is less than the expected number of replica 3 for block [block pool ID: isi_hdfs_pool block ID 4341187780_1000] file /hawq_default/testFile
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-1 segment hosts defined
20160805:23:00:11:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Set default_hash_table_bucket_number as: 6
20160805:23:00:17:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Start to init master
The files belonging to this database system will be owned by user "gpadmin".
This user must also own the server process.
The database cluster will be initialized with locale en_US.utf8.
fixing permissions on existing directory /data/hawq/master ... ok
creating subdirectories ... ok
selecting default max_connections ... 1280
selecting default shared_buffers/max_fsm_pages ... 125MB/200000
creating configuration files ... ok
creating template1 database in /data/hawq/master/base/1 ... 2016-08-05 22:00:18.554441 GMT,,,p50803,th-1212598144,,,,0,,,seg-10000,,,,,"WARNING","01000","""fsync"": can not be set by the user and will be ignored.",,,,,,,,"set_config_option","guc.c",10023,
ok
loading file-system persistent tables for template1 ...
2016-08-05 22:00:20.023594 GMT,,,p50835,th38852736,,,,0,,,seg-10000,,,,,"WARNING","01000","""fsync"": can not be set by the user and will be ignored.",,,,,,,,"set_config_option","guc.c",10023,
2016-08-05 23:00:20.126221 BST,,,p50835,th38852736,,,,0,,,seg-10000,,,,,"FATAL","XX000","could not create shared memory segment: Invalid argument (pg_shmem.c:183)","Failed system call was shmget(key=1, size=506213024, 03600).","This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX. To reduce the request size (currently 506213024 bytes), reduce PostgreSQL's shared_buffers parameter (currently 4000) and/or its max_connections parameter (currently 3000).
If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
The PostgreSQL documentation contains more information about shared memory configuration.",,,,,,"InternalIpcMemoryCreate","pg_shmem.c",183,1 0x87463a postgres errstart + 0x22a
2 0x74c5e6 postgres <symbol not found> + 0x74c5e6
3 0x74c7cd postgres PGSharedMemoryCreate + 0x3d
4 0x7976b6 postgres CreateSharedMemoryAndSemaphores + 0x336
5 0x880489 postgres BaseInit + 0x19
6 0x7b03bc postgres PostgresMain + 0xdbc
7 0x6c07d5 postgres main + 0x535
8 0x3c0861ed1d libc.so.6 __libc_start_main + 0xfd
9 0x4a14e9 postgres <symbol not found> + 0x4a14e9
child process exited with exit code 1
initdb: removing contents of data directory "/data/hawq/master"
Master postgres initdb failed
20160805:23:00:20:050348 hawq_init:hdps31hwxworker2:gpadmin-[INFO]:-Master postgres initdb failed
20160805:23:00:20:050348 hawq_init:hdps31hwxworker2:gpadmin-[ERROR]:-Master init failed, exit
This is in Advanced gpcheck
[global]
configfile_version = 4
[linux.mount]
mount.points = /
[linux.sysctl]
sysctl.kernel.shmmax = 500000000
sysctl.kernel.shmmni = 4096
sysctl.kernel.shmall = 400000000
sysctl.kernel.sem = 250 512000 100 2048
sysctl.kernel.sysrq = 1
sysctl.kernel.core_uses_pid = 1
sysctl.kernel.msgmnb = 65536
sysctl.kernel.msgmax = 65536
sysctl.kernel.msgmni = 2048
sysctl.net.ipv4.tcp_syncookies = 0
sysctl.net.ipv4.ip_forward = 0
sysctl.net.ipv4.conf.default.accept_source_route = 0
sysctl.net.ipv4.tcp_tw_recycle = 1
sysctl.net.ipv4.tcp_max_syn_backlog = 200000
sysctl.net.ipv4.conf.all.arp_filter = 1
sysctl.net.ipv4.ip_local_port_range = 1281 65535
sysctl.net.core.netdev_max_backlog = 200000
sysctl.vm.overcommit_memory = 2
sysctl.fs.nr_open = 2000000
sysctl.kernel.threads-max = 798720
sysctl.kernel.pid_max = 798720
# increase network
sysctl.net.core.rmem_max = 2097152
sysctl.net.core.wmem_max = 2097152
[linux.limits]
soft.nofile = 2900000
hard.nofile = 2900000
soft.nproc = 131072
hard.nproc = 131072
[linux.diskusage]
diskusage.monitor.mounts = /
diskusage.monitor.usagemax = 90%
[hdfs]
dfs.mem.namenode.heap = 40960
dfs.mem.datanode.heap = 6144
# in hdfs-site.xml
dfs.support.append = true
dfs.client.enable.read.from.local = true
dfs.block.local-path-access.user = gpadmin
dfs.datanode.max.transfer.threads = 40960
dfs.client.socket-timeout = 300000000
dfs.datanode.socket.write.timeout = 7200000
dfs.namenode.handler.count = 60
ipc.server.handler.queue.size = 3300
dfs.datanode.handler.count = 60
ipc.client.connection.maxidletime = 3600000
dfs.namenode.accesstime.precision = -1
It looks like it is complaining about memory, but I can't seem to find the parameters to change. Where are shared_buffers and max_connections?
How do I fix this error in general? Thanks.
Your memory settings are too low to initialize the database. Don't bother with shared_buffers or max_connections.
You have:
kernel.shmmax = 500000000
kernel.shmall = 400000000
and it should be:
kernel.shmmax = 1000000000
kernel.shmall = 4000000000
Reference: http://hdb.docs.pivotal.io/hdb/install/install-cli.html
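The failed shmget call in your log asked for 506,213,024 bytes, which is exactly the problem; a quick check against the old and suggested limits:

```python
request = 506_213_024            # size from the shmget failure in the log
old_shmmax = 500_000_000         # kernel.shmmax from the gpcheck config
suggested_shmmax = 1_000_000_000

print(request > old_shmmax)        # True  -> shmget fails ("Invalid argument")
print(request <= suggested_shmmax) # True  -> initdb can get its segment
```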
I would also make sure you have enough swap configured on your nodes based on the amount of RAM you have.
Reference: http://hdb.docs.pivotal.io/20/requirements/system-requirements.html
shared_buffers sets the amount of memory a HAWQ segment instance uses for shared memory buffers. This setting must be at least 128kB, and at least 16kB times max_connections.
When setting shared_buffers, the operating system parameters SHMMAX and SHMALL might also need to be adjusted.
The value of SHMMAX must be greater than:
shared_buffers + other_seg_shmem
You can view and set the parameter with the "hawq config" utility:
hawq config -s shared_buffers (shows the current value)
hawq config -c shared_buffers -v <value>
Please let me know how that goes!