Delete rows from a table if table exists in Redshift otherwise ignore deletion - amazon-redshift

I am using Redshift. I want a query to delete selected rows from a redshift table if the table exists otherwise just ignore the statement.

Redshift's SQL dialect doesn't contain control-of-flow statements like IF.. THEN so you are not going to be able to do this in a single SQL statement.
Your application or process will need to first query the Redshift table metadata to determine if a table exists e.g.
select 1 from pg_tables where schemaname = 'myschema' and tablename = 'myschema';
If data is returned (i.e. the table exists) then the application or process will execute the delete statement, if no data is returned the application or process does nothing. Basically you need to handle the "if this then do this" logic externally to Redshift.

I recommend #Nathan's answer. I would use python/psycopg2 to set up this logic. The first query would check for the table's existence in pg_tables (eg SELECT count(1) FROM pg_tables WHERE tablename='foo'), and store the result in a variable. Then you'd check the results of that variable to decide whether to kick off a second query (your delete).
But, maybe you don't want to do it in Python. You're just all about Redshift (it's pretty sweet). You could just run the DELETE query in Redshift. If the table is not present, the query fails and nothing happens. If the table is, you delete your data. There's no harm in generating an error here.

Related

Why can't we view deleted table in SQL Server

What is meant by a "logical table?" i.e. Deleted and Inserted. That is, why can't we do something like:
Delete
From exampletable
Select *
From deleted
In the same session and see the results from the deleted table?
You probably search for OUPUT clause:
Delete From exampletable
OUTPUT deleted.*
The DELETED and INSERTED tables are created by the SQL engine to handle the data manipulation in your data manipulation statement. Think of them as being similar to a temp table that you would create in a stored procedure to hold interim results.
Once your DML statement has completed, SQL Server doesn't need them anymore, so it "drops" the "temp tables" and they aren't there to query any longer. You can, though, access them using the OUTPUT clause in your DML statement, as #LucaszSzozda explains, because at that point, the engine hasn't dropped them yet.

Move truncated records to another table in Postgresql 9.5

Problem is following: remove all records from one table, and insert them to another.
I have a table that is partitioned by date criteria. To avoid partitioning each record one by one, I'm collecting the data in one table, and periodically move them to another table. Copied records have to be removed from first table. I'm using DELETE query with RETURNING, but the side effect is that autovacuum is having a lot of work to do to clean up the mess from original table.
I'm trying to achieve the same effect (copy and remove records), but without creating additional work for vacuum mechanism.
As I'm removing all rows (by delete without where conditions), I was thinking about TRUNCATE, but it does not support RETURNING clause. Another idea was to somehow configure the table, to automatically remove tuple from page on delete operation, without waiting for vacuum, but I did not found if it is possible.
Can you suggest something, that I could use to solve my problem?
You need to use something like:
--Open your transaction
BEGIN;
--Prevent concurrent writes, but allow concurrent data access
LOCK TABLE table_a IN SHARE MODE;
--Copy the data from table_a to table_b, you can also use CREATE TABLE AS to do this
INSERT INTO table_b AS SELECT * FROM table_a;
--Zeroying table_a
TRUNCATE TABLE table_a;
--Commits and release the lock
COMMIT;

Workaround in Redshift for "ADD COLUMN IF NOT EXISTS"

I'm trying to execute an S3 copy operation via Spark-Redshift and I'm looking to modify the Redshift table structure before running the copy command in order to add any missing columns (they should be all VARCHAR).
What I'm able to do is send an SQL query before running the copy, so ideally I would have liked to ALTER TABLE ADD COLUMN IF NOT EXISTS column_name VARCHAR(256). Unfortunately, Redshift does not offer support for ADD COLUMN IF NOT EXISTS, so I'm currently looking for a workaround.
I've tried to query the pg_table_def table to check for the existence of the column, and that works, but I'm not sure how to chain that with an ALTER TABLE statement. Here's the current state of my query, I'm open to any suggestions for accomplishing the above.
select
case when count(*) < 1 then ALTER TABLE tbl { ADD COLUMN 'test_col' VARCHAR(256) }
else 'ok'
end
from pg_table_def where schemaname = 'schema' and tablename = 'tbl' and pg_table_def.column = 'test_col'
Also, I've seen this question: Redshift: add column if not exists, however the accepted answer isn't mentioning how to actually achieve this.

Why doesn't this rule prevent duplicate key violations?

(postgresql) I was trying to COPY csv data into a table but I was getting duplicate key violation errors, and there's no way to tell COPY to ignore those, so following internet wisdom I tried adding this rule:
CREATE OR REPLACE RULE ignore_duplicate_inserts AS
ON INSERT TO mytable
WHERE (EXISTS ( SELECT mytable.id
FROM mytable
WHERE mytable.id = new.id)) DO NOTHING;
to circumvent the problem, but I still get those errors - any ideas why ?
Rules by default add things to the current action:
Roughly speaking, a rule causes additional commands to be executed when a given command on a given table is executed.
But an INSTEAD rule allows you to replace the action:
Alternatively, an INSTEAD rule can replace a given command by another, or cause a command not to be executed at all.
So, I think you want to specify INSTEAD:
CREATE OR REPLACE RULE ignore_duplicate_inserts AS
ON INSERT TO mytable
WHERE (EXISTS ( SELECT mytable.id
FROM mytable
WHERE mytable.id = new.id)) DO INSTEAD NOTHING;
Without the INSTEAD, your rule is essentially saying "do the INSERT and then do nothing" when you want to say "instead of the INSERT, do nothing" and, AFAIK, the DO INSTEAD NOTHING will do that.
I'm not an expert on PostgreSQL rules but I think adding the "INSTEAD" should work.
UPDATE: Thanks to araqnid we know that:
COPY FROM will invoke any triggers and check constraints on the destination table. However, it will not invoke rules
So a rule isn't going to work in this situation. However, triggers are fired during COPY FROM so you could write a BEFORE INSERT trigger that would return NULL when it detected duplicate rows:
It can return NULL to skip the operation for the current row. This instructs the executor to not perform the row-level operation that invoked the trigger (the insertion or modification of a particular table row).
That said, I think you'd be better off with araqnid's "load it all into a temporary table, clean it up, and copy it to the final destination" would be a more sensible solution for a bulk loading operation like you have.
COPY FROM will not invoke rules (http://www.postgresql.org/docs/9.0/interactive/sql-copy.html#AEN58860)
My approach would be to load the CSV data into a temp table, then use an INSERT...SELECT statement to copy the data into the target table where it doesn't already exist. (If there are duplicates in the CSV data itself, remove those from the temp table first). Something like:
BEGIN;
CREATE TEMP TABLE stage_data(key_column, data_columns...) ON COMMIT DROP;
\copy stage_data from data.csv with csv header
-- prevent any other updates while we are merging input (omit this if you don't need it)
LOCK target_data IN SHARE ROW EXCLUSIVE MODE;
-- insert into target table
INSERT INTO target_data(key_column, data_columns...)
SELECT key_column, data_columns...
FROM stage_data
WHERE NOT EXISTS (SELECT 1 FROM target_data
WHERE target_data.key_column = stage_data.key_column)
END;

Select query in sqlite

hey what i want to do is select everything from the table and insert data into it if its not already present in the table or update the data if present in the table.
The problem here is if my table is empty there is nothing to select and hence the query is not executed and therefore it neither inserts nor updates my table.
NOTE: My insert and update query are written in the select query in if else condition.
You can use the INSERT OR REPLACE statement. Also, see the following links.
SQLite "INSERT OR REPLACE INTO" vs. "UPDATE ... WHERE"
http://www.sqlite.org/lang_conflict.html