postgres not releasing memory after big query - postgresql

When I run a query in Postgres, it loads into memory all the data it needs for query execution, but when the query finishes executing, the memory consumed by Postgres is not released.
I run Postgres with the TimescaleDB extension installed, which already stores 93 million records. The query that fetches all 93 million records takes 13GB of RAM and 21 minutes. After the query finishes, RAM usage drops to 9GB and stays there.
How can I release all this memory?
Here's my postgresql.conf memory section.
shared_buffers = 8GB
huge_pages = try
huge_page_size = 0
temp_buffers = 8MB
max_prepared_transactions = 0
work_mem = 5118kB
hash_mem_multiplier = 1.0
maintenance_work_mem = 2047MB
autovacuum_work_mem = -1
logical_decoding_work_mem = 64MB
max_stack_depth = 2MB
shared_memory_type = mmap
effective_cache_size = 4GB
Update 15.12.2022
@jjanes Yes, you were right. As a colleague and I found out, memory usage is not quite what we thought it was from the very beginning.
As we also saw from free -m -h, when a query uses more memory than is configured, the remaining data starts to be loaded into the OS cache/buffer.
When we drop the cache/buffer after the big query, memory usage for this service falls to 8.53GB. That occupied memory roughly coincides with shared_buffers = 8GB and practically does not decrease any further. For large queries, Postgres still keeps requesting large chunks of memory from the cache/buffer.
So it turns out that we need to somehow clear the cache of Postgres itself, since it does not release it on its own?
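One way to confirm that the memory Postgres holds on to is its shared_buffers cache (and not leaked query memory) is the contrib pg_buffercache extension, assuming it can be installed; a rough sketch:
-- contrib view over the shared buffer cache (requires superuser to install)
CREATE EXTENSION IF NOT EXISTS pg_buffercache;
-- total buffers vs. buffers that actually hold a page, in MB (block_size is normally 8kB)
SELECT count(*)           AS total_buffers,
       count(relfilenode) AS used_buffers,
       count(relfilenode) * current_setting('block_size')::bigint / 1024 / 1024 AS used_mb
FROM pg_buffercache;
Pages cached in shared_buffers are deliberately kept there until they are evicted or the server restarts; they are not handed back to the OS when a query finishes.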

Related

Why does the number of WAL files decrease?

I monitor the number of WAL files in pg_wal. Over time, it decreases by itself. I don't have clustering, just a single server with logical replication.
My parameters:
archive_timeout = 3600
min_wal_size = 2GB
max_wal_size = 16GB
wal_keep_segments = 4000
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
wal_level = logical
What are the reasons the number of WAL files decreases? I tried to look for articles but never found one. Please point me to one, or maybe answer this.
Thanks
WAL segments are automatically deleted if they are no longer needed. Also, PostgreSQL automatically creates new WAL segments for future use, and the number of such segments depends on the amount of data modification activity. So it is totally normal for the size of pg_wal to vary with the amount of data modification activity.
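If you want to watch this behaviour, a small sketch (pg_ls_waldir() exists since PostgreSQL 10 and needs superuser or the pg_monitor role):
SELECT count(*)                        AS wal_files,
       pg_size_pretty(sum(size))       AS wal_dir_size,
       current_setting('min_wal_size') AS min_wal_size,
       current_setting('max_wal_size') AS max_wal_size
FROM pg_ls_waldir();
Run it before and after a checkpoint or a burst of writes; the file count should drift between roughly min_wal_size and max_wal_size, plus whatever wal_keep_segments and the archiver still hold back.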

Understanding Postgres work_mem, maintenance_work_mem and temp_buffers allocation

I want to understand the allocation of some postgresql.conf parameters. I want to know when maintenance_work_mem, work_mem and temp_buffers get allocated.
As per my understanding:
maintenance_work_mem gets allocated at server start, and this memory cannot be used by any other process.
work_mem gets allocated at query planning time: the planner checks the number of sort or hash operations and allocates memory accordingly. A sort operation may not use all of the allocated memory, but it is still reserved for that particular operation and cannot be used by any other process.
temp_buffers gets allocated at the start of each session.
I have gone through the docs but didn't get a proper answer.
Is this understanding correct?
maintenance_work_mem is allocated per session for VACUUM, CREATE INDEX and ADD FOREIGN KEY, and it scales with parallel workers too: with autovacuum_max_workers = 3 and maintenance_work_mem = 1GB, autovacuum can consume 1 * 3 = 3GB of memory, and similarly while creating an index.
work_mem is also allocated per session, depending on your sort/hash operations; however, Postgres does not reserve anything for future use. For tuning it, you should always consider your number of parallel connections, because this parameter can consume roughly work_mem * (number of sort/hash operations) per connection running in your cluster.
Yes, it's true that temp_buffers can be changed within individual sessions, but only before the first use of temporary tables within the session.
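For example, within a single session (purely illustrative):
SET temp_buffers = '64MB';      -- fine: no temporary table has been touched yet in this session
CREATE TEMP TABLE scratch(x int);
INSERT INTO scratch VALUES (1);
-- SET temp_buffers = '128MB';  -- would now fail, because a temporary table
--                                 has already been accessed in this session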
http://rhaas.blogspot.com/2019/01/how-much-maintenanceworkmem-do-i-need.html by Robert Haas
https://www.depesz.com/2011/07/03/understanding-postgresql-conf-work_mem/ by depesz
https://www.interdb.jp/pg/pgsql02.html and https://severalnines.com/database-blog/architecture-and-tuning-memory-postgresql-databases were very helpful for understanding the memory architecture.
work_mem is the maximum memory that a single step in an execution plan can use. It is allocated by the executor and freed when query execution is done.
maintenance_work_mem is the same, but for maintenance operations like CREATE INDEX and VACUUM.
temp_buffers is used to cache temporary tables and remains in use for the duration of the database session.
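You can see this per-operation behaviour directly; a small sketch (the table and row count are made up for illustration):
-- build a throwaway table big enough to force a noticeable sort
CREATE TEMP TABLE demo AS
SELECT g AS id, md5(g::text) AS payload
FROM generate_series(1, 1000000) AS g;
SET work_mem = '4MB';
EXPLAIN ANALYZE SELECT * FROM demo ORDER BY payload;
-- plan shows "Sort Method: external merge  Disk: ...": the sort spilled to temp files
SET work_mem = '256MB';
EXPLAIN ANALYZE SELECT * FROM demo ORDER BY payload;
-- plan shows "Sort Method: quicksort  Memory: ...": the sort fit in work_mem,
-- and that memory is freed as soon as the query finishes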

how does memory allocation for postgres work?

shared_buffers - In a regular PostgreSQL installation, say I allocate 25% of my memory to shared_buffers; that leaves 75% for the rest, such as the OS, the page cache, work_mem, etc. Is my understanding correct?
If so, since AWS Aurora for PostgreSQL uses 75% of memory for shared_buffers, would that leave just 25% for everything else?
Does the memory specified by work_mem get fully allocated to all sessions, irrespective of whether they do any sorting or hashing operations?
Your first statement is necessarily true:
If 75% of the RAM are used for shared buffers, then only 25% are available for other things like process private memory.
work_mem is the upper limit of memory that one operation (“node”) in an execution plan is ready to use for operations like creating a hash or a bitmap or sorting. That does not mean that every operation allocates that much memory; it is just a limit.
Without any claim for absolute reliability, here is my personal rule of thumb:
shared_buffers + max_connections * work_mem should be less than or equal to the available RAM. Then you are unlikely to run out of memory.
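To make that rule concrete with made-up numbers (an illustration, not a recommendation): with 16GB of RAM, shared_buffers = 4GB and max_connections = 100, the rule gives 4GB + 100 * work_mem <= 16GB, i.e. work_mem of at most roughly 120MB; picking a value well below that leaves headroom for the OS cache:
ALTER SYSTEM SET work_mem = '64MB';  -- well under the ~120MB ceiling from the rule of thumb
SELECT pg_reload_conf();             -- work_mem is reloadable; no restart needed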

How to reliably memory constrain a postgres database

I run Postgres 10.4 on a very small machine with strict memory constraints (e.g. 200MB) on Debian. System swap space must be disabled in my case, but SSD disk space is plentiful (e.g. > 500GB). I am using a waterfall approach to distribute all available memory to the different uses in Postgres, following this logic:
The available memory is 200MB
---
max_connections = 10
max_worker_processes = 2
shared_buffers = 50MB
work_mem = (200MB - shared_buffers) * 0.8 / max_connections
maintenance_work_mem = (200MB - shared_buffers) * 0.1 / max_worker_processes
temp_buffers = (200MB - shared_buffers) * 0.05
wal_buffers = (200MB - shared_buffers) * 0.05
temp_file_limit = -1 (i.e. unlimited)
effective_cache_size = 200MB / 2
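Plugging the numbers into that waterfall gives roughly (rounded; this is just the arithmetic above spelled out):
work_mem = 12MB                # (200MB - 50MB) * 0.8 / 10
maintenance_work_mem = 7MB     # (200MB - 50MB) * 0.1 / 2, i.e. 7.5MB
temp_buffers = 7MB             # (200MB - 50MB) * 0.05, i.e. 7.5MB
wal_buffers = 7MB              # (200MB - 50MB) * 0.05, i.e. 7.5MB
effective_cache_size = 100MB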
It is crucial for me that sessions, or even the postmaster, are never killed due to memory restrictions, so that Postgres keeps operating stably. In low-memory situations Postgres should work with temp files instead of memory.
I still get out-of-memory errors in some situations (e.g. when I have a large insert into a table).
How do I need to set these parameters to guarantee that Postgres will not try to use more memory than is available?
You can refer to this official document for an in-depth study of the memory configuration of a PostgreSQL server: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
It covers the limits and suggested values for each memory parameter; these can be derived from server attributes such as the number of CPUs, RAM capacity, etc.
Alternatively, use this online tool to try out different configurations and make sure that the server doesn't require more memory than is available to it: https://pgtune.leopard.in.ua/#/

SELECT performance issues on postgresql 9.1

I am building a large Postgres 9.1 database on Ubuntu 12.04, with one table that holds about 80 million rows. Whenever I run a SELECT statement:
SELECT * FROM db WHERE ID=1;
It takes almost 2.5 minutes to execute the query, which returns only a few thousand rows. After running a few diagnostics on the disk I/O, I think that is not the problem, but just in case, below is the output from one diagnostic (I have 2GB of RAM). I am not exactly sure what a good result looks like here, but it seems in the right ballpark given stats I found for other servers on the internet.
time sh -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"
500000+0 records in
500000+0 records out
4096000000 bytes (4.1 GB) copied, 106.969 s, 38.3 MB/s
real 1m49.091s
user 0m0.248s
sys 0m9.369s
I have modified postgresql.conf considerably, boosting effective_cache_size to 75% of RAM, shared_buffers to 25%, checkpoint_segments to 15, work_mem to 256MB, tuning autovacuum, raising SHMMAX on the kernel, etc. I have had some performance increases, but not more than about 5%. Networking shouldn't be an issue, since it still takes a long time even when running on localhost. I am planning to add even more data, and the query time seems to be growing quickly with the number of rows.
It seems like I should be able to run these SELECT statements in a few seconds, not a few minutes. Any suggestions on where this bottleneck could be?
Sorry if this is inexcusably obvious, but do you have an index on the ID column?
Also, though I'm not blaming the disk, you merely tested sequential write bandwidth, which tells you very little about latency. Though I have to say that 38 MB/s is underwhelming even for that measure...
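If the index turns out to be missing, a sketch along these lines (the index name is made up; the table and column come from the question) should turn the sequential scan into an index scan:
-- check the current plan first
EXPLAIN ANALYZE SELECT * FROM db WHERE ID = 1;
-- if it shows a Seq Scan on db, index the filter column
CREATE INDEX db_id_idx ON db (ID);
ANALYZE db;
-- re-check: the plan should now use an Index Scan or Bitmap Heap Scan
EXPLAIN ANALYZE SELECT * FROM db WHERE ID = 1;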