Rollback transactions in Redshift (Postgres 8.0)

update db.numbers_daily
set flag = 'excluded'
where product_name ~~* '%toy%'
and flag is null or flag not in ('deleted', 'excluded')
I forgot to put parentheses around the "flag" conditions in the WHERE clause.
The intended query is:
update db.numbers_daily
set flag = 'excluded'
where product_name ~~* '%toy%'
and (flag is null or flag not in ('deleted', 'excluded') )
Could you tell me how I can roll back the transaction?
I tried running
rollback;
but nothing happened.
I am using Redshift (Postgres 8.0).
Checking Redshift, I got my transaction ID, PID, and query, so surely there must be a way to roll back based on the transaction ID?
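The two WHERE clauses differ because AND binds more tightly than OR, so the first version is read as (product_name ~~* '%toy%' AND flag IS NULL) OR flag NOT IN ('deleted', 'excluded'). Here is a minimal sketch of the effect, assuming Python's built-in sqlite3 module as a stand-in for Redshift. The sample rows are made up for illustration, and LIKE replaces ~~* (SQLite's LIKE is case-insensitive for ASCII text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers_daily (product_name TEXT, flag TEXT)")
# Hypothetical sample data, not taken from the real table.
conn.executemany(
    "INSERT INTO numbers_daily VALUES (?, ?)",
    [
        ("Toy car", None),          # toy, no flag yet
        ("Toy robot", "deleted"),   # toy, already flagged
        ("Blender", "active"),      # not a toy, some other flag
        ("Kettle", None),           # not a toy, no flag
    ],
)

def matched(where):
    """Return the product names selected by the given WHERE clause."""
    sql = (
        "SELECT product_name FROM numbers_daily WHERE "
        + where
        + " ORDER BY product_name"
    )
    return [row[0] for row in conn.execute(sql)]

# Without parentheses: (toy AND flag IS NULL) OR flag NOT IN (...)
without_parens = matched(
    "product_name LIKE '%toy%' "
    "AND flag IS NULL OR flag NOT IN ('deleted', 'excluded')"
)
# With parentheses: toy AND (flag IS NULL OR flag NOT IN (...))
with_parens = matched(
    "product_name LIKE '%toy%' "
    "AND (flag IS NULL OR flag NOT IN ('deleted', 'excluded'))"
)

print(without_parens)
print(with_parens)
```

In this sketch the unparenthesized clause also selects a non-toy row whose flag is neither 'deleted' nor 'excluded', which is the kind of row the original UPDATE would have touched by mistake.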

Related

When will a select query acquire ExclusiveLock and RowExclusiveLock in PostgreSQL?

According to the official documentation, a SELECT query only needs a share lock, but I found that my SELECT query acquired an exclusive lock. How did that happen? Here is my query:
select ga.id
from group_access_strategy ga
left outer join person_group pg on ga.person_group_id=pg.id
where ga.id=3
The only difference from the official documentation is that I added a left join.
Most likely you ran another command like ALTER TABLE person_group ... (Access Exclusive) or an UPDATE/INSERT/DELETE (Row exclusive) in the same transaction. Locks will persist until a transaction is completed or aborted.
So if you ran:
BEGIN; -- BEGIN starts the transaction
UPDATE group_access_strategy SET some_column = 'some data' WHERE id = 1;
SELECT
ga.id
FROM
group_access_strategy ga
LEFT OUTER JOIN person_group pg ON (ga.person_group_id = pg.id)
WHERE
pg.id = 3;
The UPDATE statement would have created a Row Exclusive Lock that will not be released until you end the transaction by:
Saving all of the changes made since BEGIN:
COMMIT;
OR
nullifying any of the effects of statements since BEGIN with
ROLLBACK;
If you're new to Postgres and typically run your queries in an IDE like pgAdmin or DataGrip, the BEGIN / COMMIT / ROLLBACK commands are issued behind the scenes for you when you click the corresponding UI buttons.
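To see the BEGIN / ROLLBACK / COMMIT behaviour without a Postgres server, here is a rough sketch using Python's built-in sqlite3 module. It only illustrates that changes made since BEGIN are discarded by ROLLBACK and kept by COMMIT. SQLite does not model Postgres's row-level locks, and some_column is a placeholder column name.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so
# transactions are opened and closed only by the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE group_access_strategy (id INTEGER PRIMARY KEY, some_column TEXT)"
)
conn.execute("INSERT INTO group_access_strategy VALUES (1, 'original')")

def current_value():
    """Read the value of the row that the UPDATE statements modify."""
    return conn.execute(
        "SELECT some_column FROM group_access_strategy WHERE id = 1"
    ).fetchone()[0]

# Transaction 1: change the row, then undo everything since BEGIN.
conn.execute("BEGIN")
conn.execute("UPDATE group_access_strategy SET some_column = 'some data' WHERE id = 1")
inside_tx = current_value()        # the open transaction sees its own change
conn.execute("ROLLBACK")
after_rollback = current_value()   # the change has been discarded

# Transaction 2: same change, but saved this time.
conn.execute("BEGIN")
conn.execute("UPDATE group_access_strategy SET some_column = 'some data' WHERE id = 1")
conn.execute("COMMIT")
after_commit = current_value()     # the change is now permanent

print(inside_tx, after_rollback, after_commit)
```

In Postgres, the Row Exclusive lock taken by the UPDATE would be held for the whole span between BEGIN and the matching ROLLBACK or COMMIT.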

Feedback on whether index was created on materialized views in postgresql

I created a unique index for a materialized view as :
create unique index if not exists matview_key on
matview (some_group_id, some_description);
I can't tell if it has been created
How do I see the index?
Thank you!
Two ways to verify index creation:
--In psql
\d matview
--Using SQL
select
*
from
pg_indexes
where
indexname = 'matview_key'
and
tablename = 'matview';
More information on pg_indexes.
As has been noted in the comments, if the command finishes successfully and you don't get an error message, the index was created. Possible caveat: while the transaction is not committed, nobody else can see it (except that the unique name is reserved now), and it still might get rolled back. Check in a separate transaction to be sure.
To be absolutely sure:
SELECT pg_get_indexdef(oid)
FROM pg_catalog.pg_class
WHERE relname = 'matview_key'
AND relkind = 'i'
-- AND relnamespace = 'public'::regnamespace -- optional, to make sure of the schema, too
This way you see whether an index of the given name exists, and also its exact definition to rule out a different index with the same name. Pure SQL, works from any client. (There is nothing special about an index on materialized views.)
Also filter for the schema to be absolutely sure. Would be the "default" schema (a.k.a. "current" schema) in your case, since you did not specify in the creation. See:
How does the search_path influence identifier resolution and the "current schema"
Related:
Create index if it does not exist
How to check if a table exists in a given schema
In psql:
\di public.matview_key
To only find indexes. Again, the schema is optional to narrow down.
Progress Reporting
If creating an index takes a long time, you can look up progress in pg_stat_progress_create_index since Postgres 12:
SELECT * FROM pg_stat_progress_create_index
-- WHERE relid = 'public.matview'::regclass -- optionally narrow down
An alternative to looking into pg_indexes is pg_matviews (for materialized views only); its hasindexes column shows whether the view has any index:
select *
from pg_matviews
where matviewname = 'my_matview_name';

Is update by select atomic for READ COMMITTED in Postgres?

Is the following query atomic within a READ COMMITTED transaction?
update my_table
set
owner = ?
where id = (
select id from my_table
where owner is null
limit 1
) returning *
I ran tests on a local Postgres instance and it seems to be atomic, but is this always the case?
Each SQL statement in READ COMMITTED isolation level takes a snapshot of the database, so it and all its subqueries see a consistent version of the database.
But you are not safe from “lost updates”: it is possible that a concurrent transaction modifies a row between the start of the statement and the time the row is updated, so it could be that the row that is actually updated does not have owner set to NULL any more.
If you need to avoid that, add a FOR UPDATE clause in the subquery.

Postgres: SELECT FOR UPDATE does not see new rows after lock release

Trying to support PostgreSQL DB in my application, found this strange behaviour.
Preparation:
CREATE TABLE test(id INTEGER, flag BOOLEAN);
INSERT INTO test(id, flag) VALUES (1, true);
Assume two concurrent transactions (Autocommit=false, READ_COMMITTED) TX1 and TX2:
TX1:
UPDATE test SET flag = FALSE WHERE id = 1;
INSERT INTO test(id, flag) VALUES (2, TRUE);
-- (wait, no COMMIT yet)
TX2:
SELECT id FROM test WHERE flag=true FOR UPDATE;
-- waits for TX1 to release lock
Now, if I COMMIT in TX1, the SELECT in TX2 returns empty cursor.
It is strange to me, because same experiment in Oracle and MariaDB results in selecting newly created row (id=2).
I could not find anything about this behaviour in PG documentation.
Am I missing something?
Is there any way to force PG server to "refresh" statement visibility after acquiring lock?
PS: PostgreSQL version 11.1
TX2 scans the table and tries to lock the results.
The scan sees the snapshot of the database from the start of the query, so it cannot see any rows that were inserted (or made eligible in some other way) by concurrent modifications that started after that snapshot was taken.
That is why you cannot see the row with the id 2.
For id 1, that is also true, so the scan finds that row. But the query has to wait until the lock is released. When that finally happens, it fetches the latest committed version of the row and performs the check again, so that row is excluded as well.
This “EvalPlanQual” recheck (to use PostgreSQL jargon) is only performed for rows that were found during the scan, but were locked. The second row isn't even found during the scan, so no such processing happens there.
This is a bit odd, admittedly. But it is not a bug, it is just the way PostgreSQL works.
If you want to avoid such anomalies, use the REPEATABLE READ isolation level. Then you will get a serialization error in such a case and can retry the transaction, thus avoiding inconsistencies like that.
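The "retry the transaction" part can be sketched on the client side roughly as follows. This is only an outline, not driver code: SerializationFailure is a hypothetical stand-in for whatever exception your driver raises for SQLSTATE 40001 (serialization_failure), and run_with_retry and flaky_transaction are names made up for this example.

```python
import time

class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver error raised on SQLSTATE 40001."""

def run_with_retry(transaction, max_attempts=5, base_delay=0.0):
    """Call transaction() until it succeeds or max_attempts is reached.

    transaction is expected to run BEGIN ... COMMIT itself and to raise
    SerializationFailure when the server reports a serialization error.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and let the caller see the error
            time.sleep(base_delay * attempt)  # simple linear back-off

# Fake transaction that fails twice before succeeding, to exercise the loop.
calls = {"count": 0}

def flaky_transaction():
    calls["count"] += 1
    if calls["count"] < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"

result = run_with_retry(flaky_transaction)
print(result, calls["count"])
```

The whole transaction has to be re-run from the start on each attempt, since the failed one has been rolled back and needs a fresh snapshot.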

How can two DELETE queries deadlock in Postgres?

Among the many things we do with Postgres at work, we use it as a cache for certain kinds of remote requests. Our schema is:
CREATE TABLE IF NOT EXISTS cache (
key VARCHAR(256) PRIMARY KEY,
value TEXT NOT NULL,
ttl TIMESTAMP DEFAULT NULL
);
CREATE INDEX IF NOT EXISTS idx_cache_ttl ON cache(ttl);
This table does not have triggers or foreign keys. Updates are typically:
INSERT INTO cache (key, value, ttl)
VALUES ('Ethan is testing8393645', '"hi6286166"', sec2ttl(300))
ON CONFLICT (key) DO UPDATE
SET value = '"hi6286166"', ttl = sec2ttl(300);
(Where sec2ttl is defined as:)
CREATE OR REPLACE FUNCTION sec2ttl(seconds FLOAT)
RETURNS TIMESTAMP AS $$
BEGIN
IF seconds IS NULL THEN
RETURN NULL;
END IF;
RETURN now() + (seconds || ' SECOND')::INTERVAL;
END;
$$ LANGUAGE plpgsql;
Querying the cache is done in a transaction like this:
BEGIN;
DELETE FROM cache WHERE ttl IS NOT NULL AND now() > ttl;
SELECT value FROM cache WHERE key = 'Ethan is testing6460437';
COMMIT;
There are a few things not to like about this design -- the DELETE happening in cache "reads", the index on cache.ttl is not ascending which makes it kind of useless, (edit: ASC is the default, thanks wargre!) plus the fact that we're using Postgres as a cache at all. But all of that would have been acceptable except that we've started getting deadlocks in production, which tend to look like this:
ERROR: deadlock detected
DETAIL: Process 12750 waits for ShareLock on transaction 632693475; blocked by process 10080.
Process 10080 waits for ShareLock on transaction 632693479; blocked by process 12750.
HINT: See server log for query details.
CONTEXT: while deleting tuple (426,1) in relation "cache"
[SQL: 'DELETE FROM cache WHERE ttl IS NOT NULL AND now() > ttl;']
Investigating the logs more thoroughly indicates that both transactions were performing this DELETE operation.
As far as I can tell:
My transactions are in READ COMMITTED isolation mode.
ShareLocks are grabbed by one transaction to indicate that it wants to mutate rows that another transaction has mutated (i.e. locked).
Based on the output of an EXPLAIN query, the ShareLocks should be grabbed by both DELETE transactions in physical order.
The deadlock indicates that both queries locked rows in a different order.
If all that is correct, then somehow some simultaneous transaction has changed the physical order of rows. I see that an UPDATE can move a row to an earlier or later physical position, but in my application, the UPDATEs always remove rows from consideration by the DELETEs (because they're always extending a row's TTL). If the rows were previously in physical order, and you remove one, then you're still left with physical order. Similarly for DELETE. We're not doing any VACUUM or any other operation which you might expect to reorder rows.
Based on Avoiding PostgreSQL deadlocks when performing bulk update and delete operations, I tried to change the DELETE queries to:
DELETE FROM cache c
USING (
SELECT key
FROM cache
WHERE ttl IS NOT NULL AND now() > ttl
ORDER BY ttl ASC
FOR UPDATE
) del
WHERE del.key = c.key;
However, I'm still able to get deadlocks locally. So generally, how can two DELETE queries deadlock? Is it because they're locking in an undefined order, and if so, how do I enforce a specific order?
You should instead ignore expired cache entries, so you will not depend on a frequent delete operation for cache expiration:
SELECT value
FROM cache
WHERE
key = 'Ethan is testing6460437'
and (ttl is null or ttl >= now());
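A read that ignores expired entries should keep rows whose ttl is NULL or still in the future, and drop rows whose ttl has passed. Below is a small sketch of that filter, again using Python's built-in sqlite3 module as a stand-in for Postgres. The keys, the text timestamps and the explicit now parameter are assumptions made so the example is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cache (key TEXT PRIMARY KEY, value TEXT NOT NULL, ttl TEXT)"
)
# Hypothetical entries; ISO-8601 text timestamps compare correctly as strings.
conn.executemany(
    "INSERT INTO cache VALUES (?, ?, ?)",
    [
        ("fresh", "a", "2030-01-01 00:00:00"),    # expires in the future
        ("expired", "b", "2020-01-01 00:00:00"),  # already expired
        ("forever", "c", None),                   # no expiry at all
    ],
)

def cache_get(key, now):
    """Return the cached value, or None if the key is missing or expired."""
    row = conn.execute(
        "SELECT value FROM cache WHERE key = ? AND (ttl IS NULL OR ttl >= ?)",
        (key, now),
    ).fetchone()
    return row[0] if row else None

now = "2025-01-01 00:00:00"  # fixed "current time" for the example
print(cache_get("fresh", now), cache_get("expired", now), cache_get("forever", now))
```

With this filter the read path never needs to delete anything, so expired rows can stay in the table until a separate cleanup removes them.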
And have a separate job that periodically deletes expired keys. To avoid deadlocks it must either delete rows in a well-defined order or, better, skip rows that are already locked (FOR UPDATE SKIP LOCKED):
with delete_keys as (
select key from cache
where
ttl is not null
and now()>ttl
for update skip locked
)
delete from cache
where key in (select key from delete_keys);
If you can't schedule this job, run the cleanup at random, roughly once every 1000 runs of your select query, like this:
create or replace function delete_expired_cache()
returns void
language sql
as $$
with delete_keys as (
select key from cache
where
ttl is not null
and now()>ttl
for update skip locked
)
delete from cache
where key in (select key from delete_keys);
$$;
SELECT value
FROM cache
WHERE
key = 'Ethan is testing6460437'
and (ttl is null or ttl >= now());
select delete_expired_cache() where random()<0.001;
Avoid unnecessary writes, as they are expensive: don't delete cache entries so often.
Also, use timestamp with time zone (timestamptz for short) instead of plain timestamp, especially if you are not sure why you would need the latter. A plain timestamp is not what most people think it is; blame the SQL standard.