Trying to vacuum a table in our Redshift cluster returns:
Error : ERROR: Assert
For other tables, vacuum works just fine. The table is fairly large, but queries run against it seem to work.
Any idea how to get more information on what's wrong and how to fix it? Should I try making a copy of the table (probably an overnight job, given the table size)?
Thanks for your help.
Updating your cluster will probably fix the issue.
https://forums.aws.amazon.com/thread.jspa?threadID=245468
I'm facing an annoying problem on a Postgres 11.13 database when trying to get data from a big table.
The first 6 million rows can be fetched, then I get a "MultiXactId **** has not been created yet -- apparent wraparound" message.
I've already tried the following on this table:
various "select ..." queries (even in functions with exception management to ignore possible errors)
pg_dump
REINDEX TABLE
VACUUM FULL, with and without zero_damaged_pages enabled
VACUUM FREEZE, with and without zero_damaged_pages enabled
Nothing works: every time I get the same "MultiXactId **** has not been created yet -- apparent wraparound" error.
Is there a solution to fix that kind of problem, or is the "broken/corrupted" table permanently lost?
Thanks in advance for any advice
We have an insert-only table for which we often get bad results because the query plan uses nested loops instead of hash joins. To solve this we have to run ANALYZE manually (vacuum sometimes doesn't run on insert-only tables; long story, not the point here). When I try to run ANALYZE on the replica machine, I get an "ERROR: cannot execute ANALYZE during recovery" error. This made me think that maybe we don't need to execute ANALYZE on the replica at all.
My question is: are statistics propagated to replica when executing analyze on master node?
The question in the link below is similar to this one, but it asks about VACUUM. We are only using ANALYZE.
https://serverfault.com/questions/212219/postgresql-9-does-vacuuming-a-table-on-the-primary-replicate-on-the-mirror
Statistics are stored in a system catalog table (pg_statistic), and that table is replicated from the primary server to the replica like any other table. So you don't need to run ANALYZE on the replica, and with physical replication you cannot.
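If you want to check this yourself, here is a minimal psycopg2 sketch (host names, database name, and table name are placeholders): it runs ANALYZE on the primary and then reads the replicated planner statistics from the pg_stats view on the replica.

import psycopg2

# Run ANALYZE on the primary; the resulting rows in pg_statistic
# reach the replica through normal WAL replay.
with psycopg2.connect("host=primary-host dbname=mydb") as primary:
    with primary.cursor() as cur:
        cur.execute("ANALYZE my_insert_only_table")

# On the replica, the statistics are visible via pg_stats (backed by pg_statistic).
with psycopg2.connect("host=replica-host dbname=mydb") as replica:
    with replica.cursor() as cur:
        cur.execute(
            "SELECT attname, n_distinct FROM pg_stats WHERE tablename = %s",
            ("my_insert_only_table",),
        )
        print(cur.fetchall())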
I want to add a partition of data to my external table, but I'm receiving the error: ALTER EXTERNAL TABLE cannot run inside a transaction block.
I removed the BEGIN/END transaction, but the same error persists. I read on some forums that setting an isolation level might solve the problem, but I wanted to get others' opinions, in case someone has experienced this before.
A standard statement like this works for me. If you are getting the error from this as well, please share your exact statement.
ALTER TABLE spectrum_schema.spect_test
ADD PARTITION (column_part='2019-07-23')
LOCATION 's3://bucketname/folder1/column_part=2019-07-23/';
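If the error shows up even without an explicit BEGIN/END, it is often the client or driver wrapping every statement in a transaction when autocommit is off. A hedged psycopg2 sketch (connection parameters are placeholders; the DDL is the statement from above) that runs it with autocommit enabled:

import psycopg2

conn = psycopg2.connect("host=my-redshift-cluster port=5439 dbname=mydb user=me")
conn.autocommit = True  # make sure the DDL is not wrapped in a transaction block
try:
    with conn.cursor() as cur:
        cur.execute("""
            ALTER TABLE spectrum_schema.spect_test
            ADD PARTITION (column_part='2019-07-23')
            LOCATION 's3://bucketname/folder1/column_part=2019-07-23/'
        """)
finally:
    conn.close()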
I was trying to truncate a table using SQL Workbench. Suddenly SQL Workbench froze while the truncate was in progress, and I had to kill it from Task Manager. Now none of my queries work on the table whose truncate was aborted, while queries on other tables work fine. I need help, as I have to upload fresh data to the same table, and currently I am not even able to drop it. What can be done to resolve this issue?
This looks like the TRUNCATE got stuck behind a lock, and then you killed the front end while the TRUNCATE kept running.
Connect to the database as superuser and examine the pg_stat_activity view; you should see some long running transactions.
Use the function pg_terminate_backend to kill these sessions by their pid.
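If you prefer to script that from Python rather than psql, a minimal psycopg2 sketch could look like the following (the connection string and the 10-minute threshold are assumptions):

import psycopg2

with psycopg2.connect("host=db-host dbname=mydb user=postgres") as conn:
    with conn.cursor() as cur:
        # List transactions that have been open for a long time.
        cur.execute("""
            SELECT pid, state, now() - xact_start AS xact_age, query
            FROM pg_stat_activity
            WHERE xact_start < now() - interval '10 minutes'
            ORDER BY xact_start
        """)
        for pid, state, xact_age, query in cur.fetchall():
            print(pid, state, xact_age, query[:80])

        # Once the stuck session(s) are identified, terminate them by pid:
        # cur.execute("SELECT pg_terminate_backend(%s)", (stuck_pid,))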
I'm creating a reporting engine that makes a couple of long queries against a standby server and processes the results with pandas. Everything works fine, but sometimes I have issues with the execution of those queries using a psycopg2 cursor: the query is canceled with the following message:
ERROR: cancelling statement due to conflict with recovery
Detail: User query might have needed to see row versions that must be removed
I was investigating this issue:
PostgreSQL ERROR: canceling statement due to conflict with recovery
https://www.postgresql.org/docs/9.0/static/hot-standby.html#HOT-STANDBY-CONFLICT
but all the solutions suggest fixing the issue by modifying the server's configuration. I can't make those modifications (we won the last football game against the IT guys :) ), so I want to know how I can deal with this situation from a developer's perspective. Can I resolve this issue in Python code? My temporary solution is simple: catch the exception and retry all the failed queries. Maybe it could be done better (I hope so).
Thanks in advance
There is nothing you can do to avoid that error without changing the PostgreSQL configuration (from PostgreSQL 9.1 on, you could e.g. set hot_standby_feedback to on).
You are dealing with the error in the correct fashion – simply retry the failed transaction.
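A minimal retry sketch with psycopg2 (the DSN, the query, and the retry limits are placeholders; matching on the error message is just one simple way to recognize the cancellation):

import time
import psycopg2

def run_with_retry(dsn, query, params=None, attempts=3, wait_seconds=5):
    """Run a SELECT, retrying when it is canceled due to a recovery conflict."""
    for attempt in range(1, attempts + 1):
        conn = psycopg2.connect(dsn)
        try:
            with conn, conn.cursor() as cur:
                cur.execute(query, params)
                return cur.fetchall()
        except psycopg2.Error as exc:
            # Retry only the recovery-conflict cancellation; re-raise anything else.
            if "conflict with recovery" not in str(exc) or attempt == attempts:
                raise
            time.sleep(wait_seconds)
        finally:
            conn.close()

rows = run_with_retry("host=standby-host dbname=reports",
                      "SELECT * FROM big_table WHERE day = %s",
                      ("2021-01-01",))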
The table data on the hot standby server is modified by WAL replay while your long-running query is still using it. A solution (PostgreSQL 9.1+) to make sure the table data is not modified is to suspend replay on the standby and resume it after the query finishes; on PostgreSQL 10 and later the functions are named pg_wal_replay_pause and pg_wal_replay_resume.
select pg_xlog_replay_pause();   -- suspend
select * from foo;               -- your query
select pg_xlog_replay_resume();  -- resume
I recently encountered a similar error and was also in the position of not being a dba/devops person with access to the underlying database settings.
My solution was to reduce the query's execution time wherever possible. Obviously this requires deep knowledge of your tables and data, but I was able to solve my problem with a combination of a more efficient WHERE filter, a GROUP BY aggregation, and more extensive use of indexes.
By reducing the amount of server-side execution time and data, you reduce the chance of a rollback error occurring.
However, a rollback can still occur during your shortened window, so a comprehensive solution would also make use of some retry logic for when a rollback error occurs.
Update: A colleague implemented said retry logic as well as batching the query to make the data volumes smaller. These three solutions have made the problem go away entirely.
I got the same error. What you CAN do (if the query is simple enough) is divide the data into smaller chunks as a workaround.
I did this within a Python loop that calls the query multiple times with the LIMIT and OFFSET parameters, like:
query_chunk = f"""
SELECT *
FROM {database}.{datatable}
LIMIT {chunk_size} OFFSET {i_chunk * chunk_size}
"""
where database and datatable are the names of your sources.
The chunk_size is up to you; keeping it small enough that each query finishes quickly is crucial.
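Put together, the loop could look like the sketch below. The connection string, the schema/table names, and the "id" ordering column are placeholders; autocommit keeps every chunk in its own short transaction, so a recovery conflict only costs you one chunk instead of the whole result set.

import psycopg2

chunk_size = 100_000                          # keep small enough that each chunk finishes quickly
database, datatable = "public", "big_table"   # schema and table, as in the snippet above
results = []

conn = psycopg2.connect("host=standby-host dbname=reports")
conn.autocommit = True                        # each chunk runs in its own short transaction
try:
    i_chunk = 0
    while True:
        with conn.cursor() as cur:
            # "id" is an assumed column; without a stable ORDER BY,
            # LIMIT/OFFSET paging can skip or repeat rows between chunks.
            cur.execute(f"""
                SELECT *
                FROM {database}.{datatable}
                ORDER BY id
                LIMIT {chunk_size} OFFSET {i_chunk * chunk_size}
            """)
            rows = cur.fetchall()
        if not rows:
            break
        results.extend(rows)
        i_chunk += 1
finally:
    conn.close()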