vacuum_cost_page_miss set to zero - postgresql

On RDS PostgreSQL instance, below are the vacuum parameters set.
autovacuum_vacuum_cost_delay 5
autovacuum_vacuum_cost_limit 3000
vacuum_cost_page_dirty 20
vacuum_cost_page_hit 1
vacuum_cost_page_miss 0
From what I understand from this blog - https://www.2ndquadrant.com/en/blog/autovacuum-tuning-basics/
There is a cost associated with if the page being vacuumed is in shared buffer or not. If the vacuum_cost_page_miss is set to zero, I am thinking its going to assume the cost of reading from disk is free and since the cost limit is set to 3000, the autovacuum will be performing lots of IO until it reaches the cost limit. Is my understanding correct? Would it mean something else by setting this parameter to 0?

0 is not a special value here, it means the same thing it does in ordinary arithmetic. So, no throttling will get applied on the basis of page misses.
This is my preferred setting for vacuum_cost_page_miss. Page misses are inherently self-limiting. Once a page is needed and not found, then the process stalls until the page is read. No more read requests will get issued by that process while it is waiting. This is in contrast to page dirtying. There is nothing other than vacuum_cost_page_dirty-driven throttling to prevent the vacuum process from dirtying pages far faster than they can be written to disk, leading to IO constipation which will then disturb everyone else on the system.
If you are going to reduce vacuum_cost_page_miss to zero, you should also set vacuum_cost_page_hit to zero. Having the latter high than the former is weird. Maybe whoever came up with those setting just figured that 1 was already low enough, so there was no point in changing yet another setting.
vacuum_cost_page_miss throttling could be particularly bad before v9.6 (when freeze map was introduced) when freezing large tables which have hit autovacuum_freeze_max_age, but have seen few changes since the last freeze. PostgreSQL will charge vacuum_cost_page_miss for every page, even though most of them will be found already in the kernel page cache (but not in shared_buffers) through the magic of readahead. So it will slow-walk the table as if it were doing random reads, while doing no useful work and holding the table lock hostage. This might be the exact thing that lead your predecessor to make the changes he made.
the autovacuum will be performing lots of IO until it reaches the cost limit.
Autovacuum once begun has a mostly fixed task to do, and will do the amount of IO it needs to do to get it done. At stake is not how much IO it will do, but over how much time it does it.

These settings are silly in my opinion.
Setting the cost for a page found in shared buffers higher than the cost for a page read from disk does not make any sense. Also, if you want to discount I/O costs, why leave vacuum_cost_page_dirty at 20?
Finally, increasing the cost limit and leaving the cost delay at a high value like 5 (the default from v12 on is 2) can only be explained if RDS is based on a PostgreSQL version older than v12.
It feels like whoever dabbled with the autovacuum settings had a vague idea to make autovacuum more aggressive, but didn't understand it well enough to do it right. I think the default settings of v12 are better.

Related

Postgres auto-vacuum doesn't reclaim the dead tuples space causes disk full issue

I have a use case to insert 100 000 rows per min at the same time in another end few threads will take the rows and delete them from my table. So definitely it will create lot of dead tuples in my table.
My auto-vacuum configurations are
autovacuum_max_workers = 3
autovacuum_naptime = 1min
utovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = -1
From "pg_stat_user_tables" I can find auto-vacuum is running on my table but within a few hours my disk will be full (500 GB) and I can't able to insert any new row.
on the second try, I changed the following configuration
autovacuum_naptime = 60min
autovacuum_vacuum_cost_delay = 0
This time my simulation and auto-vacuum are running well and max disk size is 180 GB.
Here my doubt is, if I change the "autovacuum_vacuum_cost_delay" to zero ms, how auto-vacuum freeing the dead tuples space and PG reuse it? why it is not working as intended if I set the value is 20 ms?
Here my doubt is, if I change the "autovacuum_vacuum_cost_delay" to zero ms, how auto-vacuum freeing the dead tuples space and PG reuse it?
The space freed up by vacuum is recorded in the free space map, from where it gets handed out for re-use by future INSERTs.
Another detail to add, in 9.6 the free space map is only vacuumed once the entire table itself is completely vacuumed, and so the freed up space is not findable until then. If the VACUUM never makes it to the very end, because it is too slow or gets interupted, then the space it is freeing up will not be reused for INSERTs. This was improved in v11.
why it is not working as intended if I set the value is 20 ms?
Because vacuum can't keep up at that value. The default values for PostgreSQL are often suitable only for smaller servers, which yours doesn't seem to be. It is appropriate and advisable to change the defaults in this situation. Note that in v12, the default was lowered from 20 to 2 (and its type was correspondingly changed from int to float, so you can now specify the value with more precision)
To summarize, your app creates tons of dead tuples and autovacuum can't keep up. Possible solutions
This sounds more like a task queue than a regular table. Perhaps a PostgreSQL table is not ideal for your this specific use case. Use a solution such as RabbitMQ/Redis instead.
Create time-based range partitions and purge old partitions once they're empty, while disabling autovacuum on this table alone. Consider not deleting rows at all and just purging old partitions if you can identify handled partitions.
Tweak with the autovacuum settings so that it works constantly, without any naps or interference. Increasing maintenance_work_mem could help speed autovacuum too. Perhaps you'll find out that you've reached your hard-drive's limits. In that case, you will have to optimize the storage so that it can accommodate those expensive INSERT+DELETE+autovacuum operations.
Well the default value is 2 ms Autovacuum. So your 20ms value is high:
autovacuum_vacuum_cost_delay (floating point)
"Specifies the cost delay value that will be used in automatic VACUUM operations. If -1 is specified, the regular vacuum_cost_delay value will be used. If this value is specified without units, it is taken as milliseconds. The default value is 2 milliseconds. This parameter can only be set in the postgresql.conf file or on the server command line; but the setting can be overridden for individual tables by changing table storage parameters."
As explained here Vacuum:
"
vacuum_cost_delay (floating point)
The amount of time that the process will sleep when the cost limit has been exceeded. If this value is specified without units, it is taken as milliseconds. The default value is zero, which disables the cost-based vacuum delay feature. Positive values enable cost-based vacuuming.
When using cost-based vacuuming, appropriate values for vacuum_cost_delay are usually quite small, perhaps less than 1 millisecond. While vacuum_cost_delay can be set to fractional-millisecond values, such delays may not be measured accurately on older platforms. On such platforms, increasing VACUUM's throttled resource consumption above what you get at 1ms will require changing the other vacuum cost parameters. You should, nonetheless, keep vacuum_cost_delay as small as your platform will consistently measure; large delays are not helpful.
"

Postgresql auto-vacuuming taking too long

I have db table which has around 5-6 Mn entries and it is taking around 20 minutes to perform vacuuming. Since, one field of this table is updated very frequently, thereare a lot of dead rows to deal with.
For an estimate, with our current user base it can have 2 Million dead tuples on daily basis. So, vacuuming of this table requires both:
Read IO: as the whole table is not present in shared memory.
Write IO: as there are a lot of entries to update.
What should be an ideal way to vacuum this table? Should I increase the autovacuum_cost_limit to allow more operations per autovacuum run? But as i can see, it will increase IOPS, which again might hinder the performance. Currently, I have autovacuum_scale_factor = 0.2. Should I decrease it? If I decrease it it will run more often, although write IO will decrease but it will lead to more number of time period with high read IO.
Also, as the user base will increase it will take more and more time as the size of table with increase and vacuum will have to read a lot from disk. So, what should I do?
One of the solution I have thought of:
Separate the highly updated column and make a separate table.
Tweaking the parameter to make it run more often to decrease write IO(as discussed above). How to handle more Read IO, as vacuum will now run more often?
Combine point 2 along with increasing RAM to reduce Read IO as well.
In general what is the approach that people takes, because I assume people must have very big table 10GB or more, that needs to be vacuumed.
Separating the column is a viable strategy but would be a last resort to me. PostgreSQL already has a high per-row overhead, and doing this would double it (which might also remove most of the benefit). Plus, it would make your queries uglier, harder to read, harder to maintain, easier to introduce bugs. Where splitting it would be most attractive is if index-only-scans on a set of columns not including this is are important to you, and splitting it out lets you keep the visibility map for those remaining columns in a better state.
Why do you care that it takes 20 minutes? Is that causing something bad to happen? At that rate, you could vacuum this table 72 times a day, which seems to be way more often than it actually needs to be vacuumed. In v12, the default value for autovacuum_vacuum_cost_delay was dropped 10 fold, to 2ms. This change in default was not driven by changes in the code in v12, but rather by the realization that the old default was just out of date with modern hardware in most cases. I would have no trouble pushing that change into v11 config; but I don't think doing so would address your main concern, either.
Do you actually have a problem with the amount of IO you are generating, or is it just conjecture? The IO done is mostly sequential, but how important that is would depend on your storage hardware. Do you see latency spikes while the vacuum is happening? Are you charged per IO and your bill is too high? High IO is not inherently a problem, it is only a problem if it causes a problem.
Currently, I have autovacuum_scale_factor = 0.2. Should I decrease it?
If I decrease it it will run more often, although write IO will
decrease but it will lead to more number of time period with high read
IO.
Running more often probably won't decrease your write IO by much if any. Every table/index page with at least one obsolete tuple needs to get written, during every vacuum. Writing one page just to remove one obsolete tuple will cause more writing than waiting until there are a lot of obsolete tuples that can all be removed by one write. You might be writing somewhat less per vacuum, but doing more vacuums will make up for that, and probably far more than make up for it.
There are two approaches:
Reduce autovacuum_vacuum_cost_delay for that table so that autovacuum becomes faster. It will still consume I/O, CPU and RAM.
Set the fillfactor for the table to a value less than 100 and make sure that the column you update frequently is not indexed. Then you could get HOT updates which don't require VACUUM.

Impact of log_duration in PostgreSQL performance

All,
I'm trying to understand the impact of PostgreSQL server setting log_duration. Would it cause any performance issues setting it to ON. I was trying to setup a confluence server with Postgres backend. When this value is set as ON, the response of the service is slow, whereas when I set it to OFF, it is fast.
log_min_duration_statement is set to -1 => No query texts should be logged.
Queries:
Why it is slow when it is set to ON
Logging => does it log in any log file? If yes, where to find the same in server
There is another feature pg_stat_statements.track, which can be set to TOP, ALL. Why setting this is not giving performance issue. It is also tracking queries. But, it gives total_time and calls, but not individual.
Could some one help me understand the impact.
refer to this, let me know if you need more info:
log min duration
This allows logging a sample of statements, without incurring excessive
log traffic (which may impact performance). This can be useful when
analyzing workloads with lots of short queries.
The sampling is configured using two new GUC parameters:
log_min_duration_sample - minimum required statement duration
log_statement_sample_rate - sample rate (0.0 - 1.0)
Only statements with duration exceeding log_min_duration_sample are
considered for sampling. To enable sampling, both those GUCs have to
be set correctly.
The existing log_min_duration_statement GUC has a higher priority, i.e.
statements with duration exceeding log_min_duration_statement will be
always logged, irrespectedly of how the sampling is configured. This
means only configurations
log_min_duration_sample < log_min_duration_statement
do actually sample the statements, instead of logging everything.
The best way to know is to measure it.
For simple select-only queries, such as pgbench -S, the impact is quite high. I get 22,000 selects per second with it off, and 13,000 with it on. For more complex select queries or for DML, the percentage impact will be much less.
Formatting and printing log messages, and the IPC to the log collector, take time. pg_stat_statements doesn't do that for every query, it just increments some counters.
Of course log_duration logs to a log file (or pipe), otherwise there wouldn't be any point in turning it on. The location is highly configurable, look in postgresql.conf for things like log_destination, log_directory, and log_filename. (But if you don't know where to find the logs, why would you even bother to turn this logging on in the first place?)

Postgresql explicit VACUUM vs. auto-VACUUM: Differences? Recommendations?

Quick question from a PostgreSQL (relative) newb:
We run a batch process that, as its final step, deletes most of the previous batches.
Disk space is a concern, so we need to ensure that PostgreSQL cleans up after itself.
Other than forcing PostgreSQL to garbage-collect faster, is there any difference between explicitly calling VACUUM at the end of the batch vs. letting the auto-VACUUM daemon handle it? Is there any reason to recommend one approach vs. the other?
Thanks!
Way back when there was one vacuum, and it was full and blocking. Then PostgreSQL guys added non-blocking vacuum. But you still had to schedule it yourself.
Then some genius made a daemon that ran vacuum automatically for you when the tables needed it. It uses the exact same vacuum command you or I would use, but has a lot of settings, especially default ones, that make it run slower and less intrusively. Primarily these settings are for worker threads (default 3), delay cost (20ms for autovac, 0ms for regular vac) and autovacuum cost delay limit (-1, i.e. use system setting which is 200).
Therefore, regular vacuum is VERY aggressive with no cost delay, and will run as hard and fast as your IO subsystem will let it. It basically competes with your regular workload for IO bandwidth.
Generally you can do one of two things in your situation:
One: Make autovacuum more aggressive. By lowering the autovacuum_vacuum_cost_delay from 20 to something in the 2 to 5 range it will run much faster but still not get in the way too much.
Two: Run regular vacuums by hand. Since regular vacuums, by default, have no cost_delay, this will be the fastest but also the most distruptive.
Decision is yours based on usage patterns etc.

Postgres causing swapping on CentOS

All,
I am running CentOS 6.0 with Postgresql 8.4 and can't seem to figure out how to prevent so much disc swap from occurring. I have 12 gigs of RAM and 4 processors and I am doing some simple updates (1 table at a time). I thought for a minute that the inserts happening in parallel from a script I wrong was causing the large memory usage but when I saw the simple update causing it too I basically threw in the towel and decided to ask for help.
I pasted the conf file here. http://pastebin.com/e0jdBu0J
You can see that I set the buffers relatively low and the connection amounts high. The DB service will not start if I set the shared buffers any higher than 64 megs. Anyone have an idea what may be causing this for me?
Thanks,
Adam
If you're going into swap, increasing shared_buffers will make the problem worse; you'll be taking RAM away from the part that's running out and swapping, instead dedicating memory to the database caching. It's worth fixing SHMMAX etc. just on general principle and for later tuning work, but that's not going to help with this problem.
Guessing at the identify of your memory gobbling source is a crapshoot. Far better to look at data from "top -c" and ps to find which processes are using a lot of it. It's possible for a really bad query to consume way more memory than it should. If you see memory use spike up for a PostgreSQL process running something, check the process ID against the information in pg_stat_tables to see what it's doing.
There are a couple of things that can cause this sort of issue that often surprise people. If you are doing a large number of row updates in a single transaction, and there are foreign key checks or triggers involved, that can run out of memory. The queue of things to check in each of those cases is kept in RAM, and can be surprisingly big.
There are two problems with your PostgreSQL settings that might be related. Databases don't actually work very well if you have a lot more active connections than cores in the server; best performance is normally 2 to 3 active clients per core. And all sorts of things go wrong once you've got more than a few hundred connection. There is some connections^2 behavior that gets ugly there performance wise, and there are some memory issues too. If you really need 1250 connections, you should be using a connection pooler such as pgBouncer or pgpool-II.
And effective_io_concurrency = 1000 is way too high for any hardware on the planet. Useful values for that in a small multiple of how many disks you have in the server. I have no idea what happens as far as memory usage goes when you set it that high, but it's not been tested very well at that range. Normal settings more like 1 to 25. The parameters outlined at Tuning Your PostgreSQL Server are much more important than it is; the concurrency value only impacts one particular type of table scan.
Centos 6 seems to have a very conservative shmmax as a default
Set your shared buffers to that recommended by postgres tuning resources
see for explanation and how to set.
To experiment you can (as root) use sysctl -w kernel.shmmax = n
where n is the value that the startup error message that postgres is trying to allocate on startup. When you identify the value you wish to use permanently then set that in /etc/sysctl.conf