Is it possible to view the history of all VACUUM and ANALYZE commands executed for a specific table in Amazon Redshift?
You can check the history of VACUUM runs using the query history system tables, such as SVL_STATEMENTTEXT. See the documentation for more details.
Fair warning: these history tables only store data for the last 15 days, so in the long run you might want to take backups.
You can also check the current sorted and unsorted blocks for a given table in Redshift, which gives a good idea of how the table data is stored. See the documentation for more details.
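A minimal sketch of both checks, with my_table as a placeholder table name and only a few illustrative columns selected:

-- VACUUM statements recorded for the table in SVL_STATEMENTTEXT (last ~15 days)
SELECT starttime, xid, pid, TRIM(text) AS statement
FROM svl_statementtext
WHERE text ILIKE 'vacuum%my_table%'
ORDER BY starttime DESC;

-- Current sorted/unsorted state of the table from SVV_TABLE_INFO
SELECT "table", unsorted, stats_off, tbl_rows
FROM svv_table_info
WHERE "table" = 'my_table';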
I am looking to find the data source of a couple of tables in Redshift. I have gone through all the stored procedures in the Redshift instance and couldn't find any that populate these tables. I have also checked the Data Migration Service and didn't see these tables being migrated from an RDS instance. However, the tables are updated regularly each day.
What would be the way to find how data is populated in those 2 tables? Are there any logs or system tables I can look into?
One place I'd look is svl_statementtext. That will pull any queries and utility statements that may be inserting into or running COPY jobs against that table. Just use a WHERE text LIKE '%yourtablenamehere%' and see what comes back.
https://docs.aws.amazon.com/redshift/latest/dg/r_SVL_STATEMENTTEXT.html
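A minimal sketch of that kind of search, with yourtablenamehere as a placeholder:

SELECT starttime, pid, type, TRIM(text) AS statement
FROM svl_statementtext
WHERE text LIKE '%yourtablenamehere%'
ORDER BY starttime DESC;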
Also check scheduled queries in the Redshift UI console.
In an effort to do some basic housekeeping on our Amazon RDS (PostgreSQL) instance, my team hopes to drop unused or rarely used tables from our database. In Redshift, I used the stl_query table to determine which tables were accessed frequently enough to remain.
The problem is, I can't seem to figure out an equivalent strategy for Postgres. I tried checking the log files in the console, but these don't appear to have the correct info.
Aside from searching our code base for references to used tables, is there a good strategy to find unused / infrequently used tables in Postgres? If sufficient logs exist, I am willing to write some sort of parsing script to get the necessary data - I just need to find a good source.
It turns out the statistics I need live in the statistics collector views, specifically pg_stat_user_tables.
This is the query I used to find infrequently accessed tables:
SELECT
relname,
schemaname
FROM
pg_stat_user_tables
WHERE
(idx_tup_fetch + seq_tup_read) < 5; --access threshold
Reading the release notes of the recent PostgreSQL 9.6, I found this interesting new feature:
Add a generic command progress reporting facility (Vinayak Pokale,
Rahila Syed, Amit Langote, Robert Haas)
Further reading gave me no more information on this, apart from an exploratory write-up at depesz.
Of course, the first thing I wondered: is there any history of what has already been processed (and, ideally, a list of objects still to be processed) somewhere as well? Or does pg_stat_get_progress_info only show the current state, knowing nothing about planned or past VACUUM runs?
And another question: is there an interface to use that facility for one's own processes (reports, data loads, etc.)?
The view is called pg_stat_progress_vacuum; depesz must have used an older version of the patch for his article.
Currently, progress reporting is only available for VACUUM (and autovacuum) operations.
This feature offers no historical data, but there are other ways to get it:
If you set log_autovacuum_min_duration to 0, all autovacuum operations will be reported in the server log (normally, you don't have to run VACUUM manually).
The pg_stat_all_tables system view contains columns last_vacuum and last_autovacuum that indicate when the respective operation last ran on the table (see the sketch after this list).
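A rough sketch of both approaches, with the join to pg_class only there to resolve the table name:

-- Progress of any VACUUM currently running (PostgreSQL 9.6+); empty otherwise
SELECT p.pid, p.datname, c.relname, p.phase,
       p.heap_blks_scanned, p.heap_blks_total
FROM pg_stat_progress_vacuum p
LEFT JOIN pg_class c ON c.oid = p.relid;

-- When VACUUM and autovacuum last ran for each table
SELECT schemaname, relname, last_vacuum, last_autovacuum
FROM pg_stat_all_tables
ORDER BY GREATEST(last_vacuum, last_autovacuum) DESC NULLS LAST;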
How do I find, in PostgreSQL 9.5, what is causing the deadlock error/failure when running vacuumdb --full over a database with the --jobs option to run the full vacuum in parallel?
I just get some process numbers and table names... How can I prevent this so that I can successfully do a full vacuum over the database in parallel?
Completing a VACUUM FULL under load is a pretty hard task. The problem is that Postgres is compacting the space taken by the table, so any concurrent data manipulation interferes with that.
To achieve a full vacuum you have these options:
Lock access to the vacuumed table. Not sure if acquiring some exclusive lock will help, though. You may need to prevent access to the table at the application level.
Use a create-new-table, swap (rename tables), move-data, drop-original technique (see the sketch after this list). This way you do not compact space under the original table; you free it by simply dropping the old table. Of course, you are rebuilding all indexes, redirecting FKs, etc.
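A rough sketch of that rebuild-and-swap approach, with my_table as a placeholder and one possible ordering of the steps (foreign keys, views, and permissions still have to be re-pointed by hand):

BEGIN;
CREATE TABLE my_table_new (LIKE my_table INCLUDING ALL);
INSERT INTO my_table_new SELECT * FROM my_table;
ALTER TABLE my_table RENAME TO my_table_old;
ALTER TABLE my_table_new RENAME TO my_table;
COMMIT;
-- once everything points at the new table:
DROP TABLE my_table_old;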
Another question is: do you need VACUUM FULL at all? The only thing it does that VACUUM ANALYZE does not is contracting the table on the file system. If you are not severely limited by disk space, you do not need to run a full vacuum that often.
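For comparison, a plain (non-FULL) vacuum with analyze, which does not take an ACCESS EXCLUSIVE lock and marks dead space for reuse without shrinking the file on disk (my_table is a placeholder):

VACUUM (ANALYZE, VERBOSE) my_table;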
Hope that helps.
Sometimes I want to monitor the performance of a PostgreSQL database, and I suspect that the plans of some SQL statements have changed in the past. Are there any views which show current and historical plan information for SQL statements in PostgreSQL?
Use the auto_explain extension. It can write the plans of all queries to the server log.
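A minimal sketch of enabling it for the current session; to cover all sessions, the module would instead go into shared_preload_libraries (or session_preload_libraries) in postgresql.conf:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;  -- log the plan of every statement
SET auto_explain.log_analyze = true;    -- include actual timings (adds overhead)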
Plan information is dynamic, based on the current state of the DB: what the latest ANALYZE shows, statistics, etc. These stats are accessible in the pg_stats view (see http://www.postgresql.org/docs/8.2/static/planner-stats.html), which you could back up for later analysis.
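One way to snapshot those statistics for later comparison, as a rough sketch (pg_stats_snapshot is just an illustrative name, and the anyarray columns are cast to text so they can be stored in a regular table):

CREATE TABLE pg_stats_snapshot AS
SELECT now() AS captured_at,
       schemaname, tablename, attname, n_distinct, correlation,
       most_common_vals::text AS most_common_vals,
       histogram_bounds::text AS histogram_bounds
FROM pg_stats;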