I am verifying some of the configuration in my production Postgres instance. Our DB server has 32 GB RAM. From pg_settings, I see that effective_cache_size is set to:
postgres=> select name, setting, unit from pg_settings where name like 'effective_cache_size';
         name         | setting | unit
----------------------+---------+------
 effective_cache_size | 7851762 | 8kB
(1 row)
As per my understanding, this value comes to 7851762 × 8 kB ≈ 62.8 GB. If my calculation is right, we are basically telling the optimizer that we have about 62 GB for this parameter, whereas we have only 32 GB of physical RAM.
Please correct me if I am calculating this parameter wrong. I always get confused with calculating parameter allocations for units with 8 KB.
7851762 times 8 kB is approximately 60 GB.
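Worked out: 7851762 × 8 kB = 62,814,096 kB, and 62,814,096 ÷ 1024 ÷ 1024 ≈ 59.9 GB. The 62.8 GB figure comes from dividing by 1,000,000 instead of by 1024²; either way, the setting claims roughly twice the physical RAM.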
I would configure the setting to 30 GB if the machine is dedicated to the PostgreSQL database.
This parameter tells PostgreSQL how much memory is available for caching its data files. If the value is high, PostgreSQL will estimate nested loop joins with an index scan on the inner side as cheaper, because it assumes that the index will probably be cached.
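Since effective_cache_size takes effect without a restart, you could adjust it with something like this (a minimal sketch; the 30GB assumes the dedicated 32 GB machine from the question):

ALTER SYSTEM SET effective_cache_size = '30GB';
SELECT pg_reload_conf();

ALTER SYSTEM writes the value to postgresql.auto.conf, and pg_reload_conf() makes running backends pick it up.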
When I run a query in Postgres, it loads all the data it needs for execution into memory, but when the query finishes, the memory consumed by Postgres is not released.
I run Postgres with the TimescaleDB extension installed, which already stores 93 million records; a request that fetches all 93 million records takes 13 GB of RAM and 21 minutes. After the request, RAM usage drops to 9 GB and stays there.
How can I release all this memory?
Here's my postgresql.conf memory section.
shared_buffers = 8GB
huge_pages = try
huge_page_size = 0
temp_buffers = 8MB
max_prepared_transactions = 0
work_mem = 5118kB
hash_mem_multiplier = 1.0
maintenance_work_mem = 2047MB
autovacuum_work_mem = -1
logical_decoding_work_mem = 64MB
max_stack_depth = 2MB
shared_memory_type = mmap
effective_cache_size = 4GB
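For reference, the values the server actually resolved can be cross-checked the same way as in the first question:

select name, setting, unit from pg_settings
where name in ('shared_buffers', 'work_mem', 'effective_cache_size');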
UPD 15.12.2022
@jjanes Yes, you were right. As a colleague and I found out, memory usage is not exactly what we thought it was at the beginning.
As we also saw from free -m -h, when a request uses more memory than configured, the remaining data gets loaded into the OS cache/buffers.
When we clear the cache/buffers after a big query, memory usage for this service drops to 8.53 GB. That roughly coincides with shared_buffers = 8GB and practically does not decrease further. For large requests, Postgres still keeps requesting large chunks of memory that show up as cache/buffers.
Does it turn out that we need to somehow clear Postgres's own cache, since it does not release it by itself?
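One way to see what is actually held in shared_buffers is the pg_buffercache contrib extension (a sketch, assuming the extension is available on your server):

create extension if not exists pg_buffercache;

-- top 10 relations by number of 8 kB buffers currently cached
select c.relname,
       count(*) as buffers,
       pg_size_pretty(count(*) * 8192) as size
from pg_buffercache b
join pg_class c on b.relfilenode = pg_relation_filenode(c.oid)
group by c.relname
order by buffers desc
limit 10;

Note that the memory free reports under cache/buffers is the OS page cache; the kernel reclaims it on demand, so it normally does not need to be cleaned manually.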
I have 2 Postgres instances in Azure.
I see every hour:
1 temporary file created in one Postgres instance (about 8 MB max size)
4 temporary files created in the other Postgres instance (about 11 MB max size)
I thought this was frequent enough for me to increase work_mem...
I increased work_mem to 8MB and 12MB respectively in the 2 Postgres instances, but then I saw that temporary files were still being created.
This time each instance had one temporary file of 16 MB... this behavior confuses me.
I expected that temporary file creation would stop..
I tried to refer to: https://pganalyze.com/docs/log-insights/server/S7
Are a few temporary files every hour not a big deal?
Should I not tune work_mem?
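For context, cumulative temp-file activity per database can be checked in pg_stat_database (counters since the last statistics reset):

select datname,
       temp_files,
       pg_size_pretty(temp_bytes) as total_temp
from pg_stat_database
where datname is not null;

Setting log_temp_files = 0 additionally logs every temporary file with its size, which helps attribute them to specific queries.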
I have a query to be executed in PostgreSQL which returns nearly 1,000,000 records and is taking too much time to execute.
The server runs on Ubuntu with 16 GB RAM. We have added indexes for the query to improve performance, but it is still slow.
I also tried changing some parameters like work_mem and effective_cache_size, but there was no noticeable change.
Can anyone suggest suitable values for the performance-related parameters in PostgreSQL on a 16 GB RAM Ubuntu machine?
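Not values you can copy blindly, but commonly cited rule-of-thumb starting points for a dedicated 16 GB machine look something like this (a sketch; tune against your actual workload and concurrency):

shared_buffers = 4GB            # ~25% of RAM
effective_cache_size = 12GB     # ~75% of RAM
work_mem = 64MB                 # per sort/hash operation, per backend; keep conservative
maintenance_work_mem = 1GB

Also note that returning ~1,000,000 rows is often dominated by transfer and client-side processing rather than planning, so EXPLAIN (ANALYZE, BUFFERS) output for the query is the first thing to look at.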
The command VACUUM my_table has already been running for 24 hours on Postgres (v11.5).
The table has around:
112 million rows
Table Space: 193 GB
6 indexes on 6 different fields + Primary Key index
Is this normal?
More information if it helps...
AWS RDS instance
16GB memory + 4 vCPU (db.m5.xlarge)
800GB allocated storage (Database is taking 495GB of that so far)
Provisioned IOPS - 10000
Adding more info here -
SELECT relname, n_dead_tup FROM pg_stat_user_tables; shows n_dead_tup = 163441017 for this table.
We are not running any application queries against the DB; we wanted to let it finish the vacuum process.
It can be. Maybe your 16 GB RAM is too low to work effectively with big tables (193 GB). A very generic rule of thumb says RAM should be about 1/10 of the database size.
What you can check:
a) look at pg_stat_activity for the related process and check whether the vacuum is waiting on a lock.
b) if you can, check I/O-related metrics. High I/O waits there are a signal that your I/O is overloaded, and then vacuum can be very slow. A 193 GB table is really big.
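For (a), v11 can also report the vacuum's progress directly. A sketch combining pg_stat_activity (for lock waits) with pg_stat_progress_vacuum:

select a.pid,
       a.wait_event_type,
       a.wait_event,
       p.phase,
       p.heap_blks_scanned,
       p.heap_blks_total
from pg_stat_activity a
join pg_stat_progress_vacuum p on p.pid = a.pid;

If heap_blks_scanned keeps increasing between runs, the vacuum is slow but making progress, not stuck.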
I am building a large Postgres 9.1 database on Ubuntu 12.04, with one table that holds about 80 million rows or so. Whenever I run a SELECT statement:
SELECT * FROM db WHERE ID=1;
It takes almost 2.5 minutes to execute the query, which returns only a few thousand rows. After running a few diagnostics on the disk I/O, I think that is not the problem, but just in case, below is the output from one diagnostic. (I have 2 GB of RAM.) I am not exactly sure what a good result looks like here, but it seems in the ballpark of stats reported for other servers on the internet.
time sh -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"
500000+0 records in
500000+0 records out
4096000000 bytes (4.1 GB) copied, 106.969 s, 38.3 MB/s
real 1m49.091s
user 0m0.248s
sys 0m9.369s
I have modified postgresql.conf considerably, boosting effective_cache_size to 75% of RAM, shared_buffers to 25%, checkpoint_segments to 15, work_mem to 256MB, tuning autovacuum, raising SHMMAX in the kernel, etc. I have had some performance increases, but not more than 5%. Networking shouldn't be an issue, since it still takes a long time even when running on localhost. I am planning to add even more data, and the query time seems to grow quickly with the number of rows.
It seems like I should be able to run these SELECT statements in a few seconds, not a few minutes. Any suggestions on where this bottleneck could be?
Sorry if this is inexcusably obvious, but do you have an index on the ID column?
Also, though I'm not blaming the disk: you merely tested sequential bandwidth, which tells you very little about latency. I have to say, though, that 38 MB/s is underwhelming even for that measure...
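If the index turns out to be missing, a minimal sketch (table and column names taken from the question; the index name is made up):

CREATE INDEX db_id_idx ON db (id);
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM db WHERE id = 1;

With ~80 million rows and only a few thousand matches, a sequential scan on a slow disk easily explains minutes; an index scan should bring it down to seconds or less.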