How to explain tbl_rows smaller than estimated_visible_rows - amazon-redshift

While analyzing the SVV_TABLE_INFO view on my cluster,
I noticed that for some tables the tbl_rows value was smaller than the estimated_visible_rows value; sometimes the gap is very large.
I did some tests, like running an ANALYZE or a VACUUM, but it didn't change anything.
How can tbl_rows be smaller than estimated_visible_rows?

I see a similar issue where tbl_rows is 36,862,261 whereas estimated_visible_rows is 229,512,224.
Even after a VACUUM to 100 percent followed by an ANALYZE, tbl_rows is still 36,862,261 and estimated_visible_rows is still 229,512,224.
However, it says in https://aws.amazon.com/premiumsupport/knowledge-center/redshift-vacuum-performance/#:~:text=The%20estimated_visible_rows%20is%20the%20number,and%20unsorted%20should%20reach%200. that "After a complete vacuum (delete and sort), the value for tbl_rows and estimated_visible_rows should resemble each other..."
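For anyone checking the same thing, a query along these lines (just a sketch against SVV_TABLE_INFO, using the two columns discussed above) lists the tables where the gap appears and how large it is:
select "table", tbl_rows, estimated_visible_rows,
       estimated_visible_rows - tbl_rows as gap
from svv_table_info
where estimated_visible_rows > tbl_rows
order by gap desc;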

Related

Understanding auto-vacuum and when it is triggered

We've noticed one of our tables growing considerably on PG 12. This table is the target of very frequent updates, with a mix of column types, including a very large text column (often with over 50 kB of data). We run a local cron job that looks for rows older than X time and sets the text column to null (as we no longer need the data in that column after that amount of time).
We understand this does not actually free up disk space due to the MVCC model, but we were hoping that auto-vacuum would take care of this. To our surprise, the table continues to grow (now over 40 GB worth) without auto-vacuum running. Running a vacuum manually has addressed the issue and we no longer see growth.
This has led me to investigate other tables, and I'm realising that I don't understand how auto-vacuum is triggered at all.
Here is my understanding of how it works, which hopefully someone can pick apart:
I look for tables that have a large amount of dead tuples in them:
select * from pg_stat_all_tables ORDER BY n_dead_tup desc;
I identify tableX with 33169557 dead tuples (n_dead_tup column).
I run a select * from pg_class ORDER BY reltuples desc; to check how many estimated rows there are on table tableX
I identify 1725253 rows via the reltuples column.
I confirm my autovacuum settings: autovacuum_vacuum_threshold = 50 and autovacuum_vacuum_scale_factor = 0.2
I apply the formula threshold + pg_class.reltuples * scale_factor, so, 50 + 1725253 * 0.2 which returns 345100.6
It is my understanding that auto-vacuum will start on this table once ~345100 dead tuples are found. But tableX is already at a whopping 33169557 dead tuples! The last_autovacuum on this table was back in February.
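For reference, those checks can be combined into a single catalog query (a sketch; it reads the global autovacuum settings and ignores any per-table reloptions overrides):
select s.relname,
       s.n_dead_tup,
       c.reltuples,
       current_setting('autovacuum_vacuum_threshold')::bigint
         + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples as autovacuum_threshold,
       s.last_autovacuum
from pg_stat_all_tables s
join pg_class c on c.oid = s.relid
order by s.n_dead_tup desc;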
Any clarification would be welcome.
Your algorithm is absolutely correct.
Here are some reasons why things could go wrong:
autovacuum runs, but is so slow that it never gets done
If you see no running autovacuum, that is not your problem.
autovacuum runs, but a long running open transaction prevents it from removing dead tuples
other tables need to be vacuumed more urgently (to avoid transaction ID wraparound), so the three workers are busy with other things
autovacuum runs, but conflicts with high concurrent locks on the table (LOCK TABLE, ALTER TABLE, ...)
This makes autovacuum give up and try again later.
autovacuum is disabled, perhaps only for that table
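Two quick checks for the most common of those cases (a sketch; pg_stat_activity and pg_class.reloptions are standard catalogs):
-- long-running transactions that keep autovacuum from removing dead tuples
select pid, xact_start, state, left(query, 60) as query
from pg_stat_activity
where xact_start is not null
order by xact_start;
-- per-table autovacuum overrides (e.g. autovacuum_enabled=false)
select relname, reloptions
from pg_class
where reloptions is not null;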

Postgres Vacuum doesn't free up space

I have a table in my database which is occupying 161 GB of hard disk space. Only 5 GB of free space is left out of the 200 GB hard disk.
The following command shows that my table is consuming 161 GB of hard disk space:
select pg_size_pretty(pg_total_relation_size('Employee'));
There are close to 527 rows in the table. I then deleted 250 rows and checked the pg_total_relation_size of Employee again. The size is still 161 GB.
After seeing the output of the above query, I ran the vacuum command:
VACUUM VERBOSE ANALYZE Employee;
I checked if the VACUUM actually happened using,
SELECT relname, last_vacuum, last_autovacuum FROM pg_stat_user_tables;
I can see the last vacuum time matching the time I ran the VACUUM command.
I also ran the below command to see if there are any dead tuples:
SELECT relname, n_dead_tup FROM pg_stat_user_tables;
n_dead_tup count for Employee table is 0.
Still, after all of the above commands, if I run
select pg_size_pretty(pg_total_relation_size('Employee'));
it still shows 161 GB.
May I please know the reason behind this? Also, please correct me on how to actually free up the space.
VACUUM doesn't physically "free" space. It only marks space that is no longer used as re-usable, so subsequent UPDATE or INSERT statements can use that space instead of appending to the table.
Quote from the manual
The standard form of VACUUM removes dead row versions in tables and indexes and marks the space available for future reuse. However, it will not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be easily obtained
(emphasis mine)
If you re-insert the 250 deleted rows, you will see that the table doesn't grow again, as the newly inserted rows simply use the space that was marked free by vacuum.
If you actually want to physically reduce the size of the table to the size that is "needed", you need to run VACUUM FULL.
Quote from the manual
VACUUM FULL actively compacts tables by writing a complete new version of the table file with no dead space. This minimizes the size of the table, but can take a long time. It also requires extra disk space for the new copy of the table, until the operation completes
(emphasis mine)
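A rough way to see the difference between the two commands on the Employee table from the question (a sketch; note that with only 5 GB free, VACUUM FULL may not have enough room for the temporary copy it writes):
-- plain VACUUM: the size stays the same, the space is only marked reusable
vacuum verbose analyze Employee;
select pg_size_pretty(pg_total_relation_size('Employee'));
-- VACUUM FULL: rewrites the table, needs an exclusive lock and extra disk
-- space for the copy, but returns the freed space to the operating system
vacuum full Employee;
select pg_size_pretty(pg_total_relation_size('Employee'));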

Select * from table_name is running slow

The table contains around 700,000 rows. Is there any way to make the query run faster?
This table is stored on a server.
I have tried running the query selecting only specific columns.
If select * from table_name is unusually slow, check for these things:
Network speed. How large is the data and how fast is your network? For large queries you may want to think about your data in bytes instead of rows. Run select bytes/1024/1024/1024 gb from dba_segments where segment_name = 'TABLE_NAME'; and compare that with your network speed.
Row fetch size. If the application or IDE is fetching one-row-at-a-time, each row has a large overhead with network lag. You may need to increase that setting.
Empty segment. In a few weird cases the table's segment size can increase and never shrink. For example, if the table used to have billions of rows, and they were deleted but not truncated, the space would not be released. Then a select * from table_name may need to read a lot of empty extents to get to the real data. If the GB size from the above query seems too large, run alter table table_name move; to rebuild the table and possibly save space.
Recursive query. A query that simple almost couldn't have a bad execution plan. It's possible, but rare, for a recursive query to have a bad execution plan. While the query is running, look at select * from gv$sql where users_executing > 0;. There might be a data dictionary query that's really slow and needs to be tuned.
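Put together, the checks above look roughly like this (a sketch; TABLE_NAME, table_name and index_name are placeholders, and note that ALTER TABLE ... MOVE marks the table's indexes unusable):
-- compare the stored size of the segment with your network speed
select bytes/1024/1024/1024 gb from dba_segments where segment_name = 'TABLE_NAME';
-- rebuild the table if the segment is mostly empty space
alter table table_name move;
alter index index_name rebuild;  -- MOVE leaves indexes UNUSABLE, so rebuild each one
-- while the SELECT is running, look for a slow recursive query
select * from gv$sql where users_executing > 0;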

fast growing table in postgresql

We run postgresql 9.5.2 in an RDS instance. One thing we noticed was that a certain table sometimes grows very rapidly in size.
The table in question has only 33k rows and ~600 columns. All columns are numeric (decimal(25, 6)). After vacuum full, the "total_bytes" as reported in the following query
select c.relname, pg_total_relation_size(c.oid) AS total_bytes
from pg_class c;
is about 150 MB. However, we observed this grow to 71 GB at one point. In a recent episode, total_bytes grew by 10 GB in a 30-minute period.
During the episode mentioned above, we had a batch update query that runs ~4 times per minute and updates every record in the table. However, during other times the table size remained constant despite similar update activity.
I understand this is probably caused by "dead records" being left over from the updates. Indeed, when this table grows too big, simply running vacuum full will shrink it to its normal size (150 MB). My questions are:
have other people experienced similar rapid growth in table size in postgresql and is this normal?
if our batch update queries are causing the rapid growth in table size, why doesn't it happen every time? In fact, I tried to reproduce it manually by running something like
update my_table set x = x * 2
but couldn't -- table size remained the same before and after the query.
The problem is having 600 columns in a single table, which is never a good idea. This is going to cause a lot of problems; table size is just one of them.
From the PostgreSQL docs...
The actual storage requirement [for numeric values] is two bytes for each group of four decimal digits, plus three to eight bytes overhead.
So decimal(25, 6) is something like 8 + (31 / 4 * 2) or about 24 bytes per column. At 600 columns per row that's about 14,400 bytes per row or 14k per row. At 33,000 rows that's about 450 megs.
If you're updating every row 4 times per minute, that's going to leave about 1.8 gigs per minute of dead rows.
You should fix your schema design.
You shouldn't need to touch every row of a table 4 times a minute.
You should ask a question about redesigning that table and process.
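In the meantime, the growth can be watched while the batch updates run with a query along these lines (a sketch; my_table is the table from the question):
select relname,
       n_live_tup,
       n_dead_tup,
       pg_size_pretty(pg_total_relation_size(relid)) as total_size,
       last_autovacuum
from pg_stat_user_tables
where relname = 'my_table';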

Evaluate how much space will be freed by VACUUM in Redshift

According to AWS doc:
Amazon Redshift does not automatically reclaim and reuse space that is freed when you delete rows and update rows.
Before running VACUUM, is there a way to know or evaluate how much space will be free from disk by the VACUUM?
Thanks.
References:
http://docs.aws.amazon.com/redshift/latest/dg/t_Reclaiming_storage_space202.html
http://docs.aws.amazon.com/redshift/latest/dg/r_VACUUM_command.html
You can calculate the amount of storage that will be freed up by a vacuum command by looking up the tbl_rows column in the svv_table_info view. This includes rows that are marked for deletion. Compare that to a select count(*) from the same table and you'll have a ratio. Something like this on a theoretical table named factsales:
select (select cast(count(*) as numeric(12,0)) from factsales) /
       cast(tbl_rows as numeric(12,0)) as "percentage of non deleted rows"
from svv_table_info
where "table" = 'factsales';
There doesn't appear to be a straightforward way to execute dynamic SQL and cursors, so to get this same ratio across all tables you'd have to execute the code from an external source or programming language, e.g. Python.
It's not an extremely accurate way, but you can query svv_table_info and look at the deleted_pct column. This will give you a rough idea, in percentage terms, of what fraction of the table needs to be rebuilt using vacuum.
You can run it for all the tables in your system to get this estimate for the whole system.
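If your cluster exposes the tbl_rows and estimated_visible_rows columns discussed at the top of this page, a sketch like the following estimates the number of deleted (vacuum-reclaimable) rows for every table at once (size is reported in 1 MB blocks):
select "table",
       tbl_rows,
       estimated_visible_rows,
       tbl_rows - estimated_visible_rows as estimated_deleted_rows,
       size as size_in_1mb_blocks
from svv_table_info
order by estimated_deleted_rows desc;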