Variable containing the number of rows affected by previous DELETE? (in a function) - postgresql

I have a function that is used as an INSERT trigger. This function deletes rows that would conflict with [the serial number in] the row being inserted. It works beautifully, so I'd really rather not debate the merits of the concept.
DECLARE
re1 feeds_item.shareurl%TYPE;
BEGIN
SELECT regexp_replace(NEW.shareurl, '/[^/]+(-[0-9]+\.html)$','/[^/]+\\1') INTO re1;
RAISE NOTICE 'DELETEing rows from feeds_item where shareurl ~ ''%''', re1;
DELETE FROM feeds_item where shareurl ~ re1;
RETURN NEW;
END;
I would like to add to the NOTICE an indication of how many rows are affected (aka: deleted). How can I do that (using LANGUAGE 'plpgsql')?
UPDATE:
Based on some excellent guidance from "Chicken in the kitchen", I have changed it to this:
DECLARE
re1 feeds_item.shareurl%TYPE;
num_rows int;
BEGIN
SELECT regexp_replace(NEW.shareurl, '/[^/]+(-[0-9]+\.html)$','/[^/]+\\1') INTO re1;
DELETE FROM feeds_item where shareurl ~ re1;
IF FOUND THEN
GET DIAGNOSTICS num_rows = ROW_COUNT;
RAISE NOTICE 'DELETEd % row(s) from feeds_item where shareurl ~ ''%''', num_rows, re1;
END IF;
RETURN NEW;
END;

For a very robust solution that is part of PostgreSQL's SQL and not just PL/pgSQL, you could also do the following:
with a as (DELETE FROM feeds_item WHERE shareurl ~ re1 returning 1)
select count(*) from a;
You can actually get lots more information such as:
with a as (delete from sales returning amount)
select sum(amount) from a;
This shows totals; in the same way you could compute any aggregate, and even group and filter it.

In Oracle PL/SQL, the system variable to store the number of deleted / inserted / updated rows is:
SQL%ROWCOUNT
After a DELETE / INSERT / UPDATE statement, and before committing, you can store SQL%ROWCOUNT in a variable of type NUMBER. Remember that COMMIT or ROLLBACK resets SQL%ROWCOUNT to zero, so you have to copy its value into a variable before COMMIT or ROLLBACK.
Example:
BEGIN
DECLARE
affected_rows NUMBER DEFAULT 0;
BEGIN
DELETE FROM feeds_item
WHERE shareurl = re1;
affected_rows := SQL%ROWCOUNT;
DBMS_OUTPUT.
put_line (
'This DELETE would affect '
|| affected_rows
|| ' records in FEEDS_ITEM table.');
ROLLBACK;
END;
END;
I have also found this interesting solution (source: http://markmail.org/message/grqap2pncqd6w3sp )
On 4/7/07, Karthikeyan Sundaram wrote:
Hi,
I am using 8.1.0 postgres and trying to write a plpgsql block. In that I am inserting a row. I want to check to see if the row has been
inserted or not.
In oracle we can say like this
begin
insert into table_a values (1);
if sql%rowcount > 0
then
dbms.output.put_line('rows inserted');
else
dbms.output.put_line('rows not inserted');
end if; end;
Is there something equal to sql%rowcount in postgres? Please help.
Regards skarthi
Maybe:
http://www.postgresql.org/docs/8.2/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROW
Click on the link above, you'll see this content:
37.6.6. Obtaining the Result Status
There are several ways to determine the effect of a command. The first method is to use the GET DIAGNOSTICS command, which has the form:
GET DIAGNOSTICS variable = item [ , ... ];
This command allows retrieval of system status indicators. Each item is a key word identifying a state value to be assigned to the specified variable (which should be of the right data type to receive it). The currently available status items are ROW_COUNT, the number of rows processed by the last SQL command sent down to the SQL engine, and RESULT_OID, the OID of the last row inserted by the most recent SQL command. Note that RESULT_OID is only useful after an INSERT command into a table containing OIDs.
An example:
GET DIAGNOSTICS integer_var = ROW_COUNT;
The second method to determine the effects of a command is to check the special variable named FOUND, which is of type boolean. FOUND starts out false within each PL/pgSQL function call. It is set by each of the following types of statements:
A SELECT INTO statement sets FOUND true if a row is assigned, false if no row is returned.
A PERFORM statement sets FOUND true if it produces (and discards) a row, false if no row is produced.
UPDATE, INSERT, and DELETE statements set FOUND true if at least one row is affected, false if no row is affected.
A FETCH statement sets FOUND true if it returns a row, false if no row is returned.
A FOR statement sets FOUND true if it iterates one or more times, else false. This applies to all three variants of the FOR statement (integer FOR loops, record-set FOR loops, and dynamic record-set FOR loops). FOUND is set this way when the FOR loop exits; inside the execution of the loop, FOUND is not modified by the FOR statement, although it may be changed by the execution of other statements within the loop body.
FOUND is a local variable within each PL/pgSQL function; any changes to it affect only the current function.

I would like to share my code (I got this idea from Roelof Rossouw):
CREATE OR REPLACE FUNCTION my_schema.sp_delete_mytable(_id integer)
RETURNS integer AS
$BODY$
DECLARE
AFFECTEDROWS integer;
BEGIN
WITH a AS (DELETE FROM mytable WHERE id = _id RETURNING 1)
SELECT count(*) INTO AFFECTEDROWS FROM a;
IF AFFECTEDROWS = 1 THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
EXCEPTION WHEN OTHERS THEN
RETURN 0;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;

Related

Call postgresql record's field by name

I have a function that uses RECORD to temporarily store the data. I can use it - it's fine. My problem is that I can't hardcode the columns I need to get from the RECORD. I must do it dynamically. Something like:
DECLARE
r1 RECORD;
r2 RECORD;
BEGIN
for r1 in Select column_name
from columns_to_process
where process_now = True
loop
for r2 in Select *
from my_data_table
where whatever
loop
-----------------------------
here I must call column by its name that is unknown at design time
-----------------------------
... do something with
r2.(r1.column_name)
end loop;
end loop;
END;
Does anyone know how to do it?
best regards
M
There is no need to select all the qualifying rows and compute the total in a loop. Actually, when working with SQL, try to drop the word "loop" from your vocabulary; instead just use sum(column_name) in the select. The issue here is that you do not know which column to sum when the query is written, and all structural components (table names, column names, operators, etc.) must be known before the statement is submitted. You cannot use a variable for a structural component, in this case a column name. To do that you must use dynamic SQL, i.e. an SQL statement built by the process. The following accomplishes that:
create or replace function sum_something(
the_something text -- column name
, for_id my_table.id%type -- my_table.id
)
returns numeric
language plpgsql
as $$
declare
k_query_base constant text :=
$STMT$ Select sum(%I) from my_table where id = %s; $STMT$;
l_query text;
l_sum numeric;
begin
l_query = format(k_query_base, the_something, for_id);
raise notice E'Running statement:\n %', l_query; -- for prod raise Log
execute l_query into l_sum;
return l_sum;
end;
$$;
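For readers who want to try the idea without a PostgreSQL server, here is a minimal Python sketch of the same pattern: the column name is a structural component, so it is spliced into the statement text, while the id value is passed as an ordinary parameter. It uses the standard-library sqlite3 module as a stand-in for PostgreSQL; the price and qty columns, the sample rows, and the check against the table's known columns are assumptions made for illustration, not part of the answer above.

```python
import sqlite3


def sum_something(conn, column, for_id):
    """Sum a column chosen at run time for one id (illustrative sketch)."""
    # Hypothetical safeguard: accept only columns that really exist in my_table.
    known = {row[1] for row in conn.execute("PRAGMA table_info(my_table)")}
    if column not in known:
        raise ValueError(f"unknown column: {column!r}")
    # The column name cannot be a bind parameter, so quote it as an
    # identifier and splice it into the statement text.
    quoted = '"' + column.replace('"', '""') + '"'
    query = f"SELECT sum({quoted}) FROM my_table WHERE id = ?"
    # The id is a value, so it is passed as a normal parameter.
    return conn.execute(query, (for_id,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, price INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 10, 2), (1, 5, 3), (2, 7, 1)],
)
conn.commit()

print(sum_something(conn, "price", 1))  # sum of price for id 1
print(sum_something(conn, "qty", 1))    # sum of qty for id 1
```

Quoting the identifier plays the role that %I plays in format() in the PL/pgSQL version.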
Well, after some time I figured out that I could use temporary table instead of RECORD. Doing so gives me all advantages of using dynamic queries so I can call any column by its name.
DECLARE
_my_var bigint;
BEGIN
create temporary table _my_temp_table as
Select _any, _column, _you, _need
from _my_table
where whatever = something;
execute 'Select ' || _any || ' from _my_temp_table' into _my_var;
... do whatever
END;
However, I still believe that there should be a way to call a record's field by its name.

How to get a function to work within a DO statement in Postgresql

I have a long and complex plpgsql function that creates a bunch of temporary tables nested within a while statement to get the optimal result. When the condition has been met I insert the result into an existing table. The function is far too long to post here, but this is an example:
CREATE OR REPLACE FUNCTION public.test_function(id_input integer, val_input numeric)
RETURNS VOID AS
$BODY$
DECLARE
id_input numeric = $1;
val_input numeric = $2;
BEGIN
WHILE test_val < 0
LOOP
CREATE TEMP TABLE temp_table AS
SELECT a.existing_val - val_input AS new_val
FROM existing_table a
WHERE a.id = id_input;
test_val := (SELECT new_val FROM temp_table);
val_input := val_input + 1;
END LOOP;
INSERT INTO output_table (id, new_val)
SELECT a.id, a.new_val
FROM temp_table a;
END;
$BODY$
LANGUAGE plpgsql;
The function works if I call it like this: SELECT test_function(1, 1000). However, I would like to run this function on a table with 60,000+ rows, like this:
SELECT test_function(a.id, a.val_input)
FROM data_table a;
It works when I use a subset of data_table, say 1000 rows. However, when I run it on the full table (60,000+ rows) I get the following error: "AbortTransaction while in COMMIT state". After some reading I found out that a function only COMMITs once it has finished, so in my case the inserts do not occur until the function has finished running, which takes about 4 hours. So does anyone know what is going on?
As a workaround I tried nesting the function in a DO statement so the inserts are committed straight away:
DO
$do$
DECLARE
r data_table%rowtype;
BEGIN
FOR r IN
SELECT * FROM data_table
LOOP
SELECT public.test_function(r.id, r.val_input);
END LOOP;
END
$do$;
However then I get the following error "ERROR: query has no destination for result data", which I guess means I need to rewrite the function to use PERFORM instead of SELECT. However I have not had any luck with this as yet.
Any ideas?
Since you are not interested in the function result, you should use
PERFORM public.test_function(r.id, r.val_input);
instead of
SELECT public.test_function(r.id, r.val_input);
The latter syntax would only work if you add INTO some_variable as a destination for the query result.
Thank you all for your suggestions. I ended up using Jim Jones's suggestion and converting the function to a procedure, which allowed me to use COMMIT after I did the INSERT. I also followed Jeremy's suggestion and moved from temp tables to CTEs. This solved the problem for me.

How to clone a RECORD in PostgreSQL

I want to loop through a query, but also retain the actual record for the next loop, so I can compare two adjacent rows.
CREATE OR REPLACE FUNCTION public.test ()
RETURNS void AS
$body$
DECLARE
previous RECORD;
actual RECORD;
query TEXT;
isdistinct BOOLEAN;
tablename VARCHAR;
columnname VARCHAR;
firstrow BOOLEAN DEFAULT TRUE;
BEGIN
tablename = 'naplo.esemeny';
columnname = 'esemeny_id';
query = 'SELECT * FROM ' || tablename || ' LIMIT 2';
FOR actual IN EXECUTE query LOOP
--do stuff
--save previous record
IF NOT firstrow THEN
EXECUTE 'SELECT ($1).' || columnname || ' IS DISTINCT FROM ($2).' || columnname
INTO isdistinct USING previous, actual;
RAISE NOTICE 'previous: %', previous.esemeny_id;
RAISE NOTICE 'actual: %', actual.esemeny_id;
RAISE NOTICE 'isdistinct: %', isdistinct;
ELSE
firstrow = false;
END IF;
previous = actual;
END LOOP;
RETURN;
END;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;
The table:
CREATE TABLE naplo.esemeny (
esemeny_id SERIAL,
felhasznalo_id VARCHAR DEFAULT "current_user"() NOT NULL,
kotesszam VARCHAR(10),
idegen_azonosito INTEGER,
esemenytipus_id VARCHAR(10),
letrehozva TIMESTAMP WITHOUT TIME ZONE DEFAULT now() NOT NULL,
szoveg VARCHAR,
munkalap_id VARCHAR(13),
ajanlat_id INTEGER,
CONSTRAINT esemeny_pkey PRIMARY KEY(esemeny_id),
CONSTRAINT esemeny_fk_esemenytipus FOREIGN KEY (esemenytipus_id)
REFERENCES naplo.esemenytipus(esemenytipus_id)
ON DELETE RESTRICT
ON UPDATE RESTRICT
NOT DEFERRABLE
)
WITH (oids = true);
The code above doesn't work, the following error message is thrown:
ERROR: could not identify column "esemeny_id" in record data type
LINE 1: SELECT ($1).esemeny_id IS DISTINCT FROM ($2).esemeny_id
^
QUERY: SELECT ($1).esemeny_id IS DISTINCT FROM ($2).esemeny_id
CONTEXT: PL/pgSQL function "test" line 18 at EXECUTE statement
LOG: duration: 0.000 ms statement: SET DateStyle TO 'ISO'
What am I missing?
Disclaimer: I know the code doesn't make too much sense, I only created so I can demonstrate the problem.
This does not directly answer your question, and may be of no use at all, since you did not really describe your end goal.
If the end goal is to be able to compare the value of a column in the current row with the value of the same column in the previous row, then you might be much better off using a windowing query:
SELECT actual, previous
FROM (
SELECT mycolumn AS actual,
-- the ordering goes inside OVER so that lag() sees the rows in a defined order
lag(mycolumn) OVER (ORDER BY somecriteria) AS previous
FROM mytable
) as q
WHERE previous IS NOT NULL
AND actual IS DISTINCT FROM previous
This example prints the rows where the current row is different from the previous row.
Note that I added an ORDER BY clause - it does not make sense to talk about "the previous row" without specifying ordering, otherwise you would get random results.
This is plain SQL, not PL/pgSQL, but you can wrap it in a function if you want to generate the query dynamically.
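To see the adjacent-row comparison in action without a PostgreSQL server, the following sketch runs an equivalent query through Python's standard-library sqlite3 module. It assumes the bundled SQLite is version 3.25 or later (needed for window functions), uses made-up sample data, and writes SQLite's null-safe IS NOT in place of IS DISTINCT FROM. The ordering is given inside the OVER clause so that lag() sees the rows in a defined order, and the pos alias is introduced here only to sort the final result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (somecriteria INTEGER, mycolumn TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "b"), (5, "c")],
)

# Keep only the rows whose value differs from the value in the previous row.
rows = conn.execute(
    """
    SELECT actual, previous
    FROM (
        SELECT somecriteria AS pos,
               mycolumn AS actual,
               lag(mycolumn) OVER (ORDER BY somecriteria) AS previous
        FROM mytable
    ) AS q
    WHERE previous IS NOT NULL
      AND actual IS NOT previous   -- SQLite spelling of IS DISTINCT FROM
    ORDER BY pos
    """
).fetchall()

print(rows)  # [('b', 'a'), ('c', 'b')]
```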
I am pretty sure there is a better solution for your actual problem. But to answer the question asked, here is a solution with polymorphic types:
The main problem is that you need well-known composite types to work with. The structure of anonymous records is undefined until assigned.
CREATE OR REPLACE FUNCTION public.test (actual anyelement, _col text
, OUT previous anyelement) AS
$func$
DECLARE
isdistinct bool;
BEGIN
FOR actual IN
EXECUTE format('SELECT * FROM %s LIMIT 3', pg_typeof(actual))
LOOP
EXECUTE format('SELECT ($1).%1$I IS DISTINCT FROM ($2).%1$I', _col)
INTO isdistinct
USING previous, actual;
RAISE NOTICE 'previous: %; actual: %; isdistinct: %'
, previous, actual, isdistinct;
previous := actual;
END LOOP;
previous := NULL; -- reset dummy output (optional)
END
$func$ LANGUAGE plpgsql;
Call:
SELECT public.test(NULL::naplo.esemeny, 'esemeny_id')
I am abusing an OUT parameter, since it's not possible to declare additional variables with a polymorphic composite type (at least I have failed repeatedly).
If your column name is stable you can replace the second EXECUTE with a simple expression.
I am running out of time, explanation in these related answers:
Declare variable of composite type in PostgreSQL using %TYPE
Refactor a PL/pgSQL function to return the output of various SELECT queries
Asides:
Don't quote the language name, it's an identifier, not a string.
Do you really need WITH (oids = true) in your table? This is still allowed, but largely deprecated in modern Postgres.

count number of rows to be affected before update in trigger

I want to know the number of rows that will be affected by an UPDATE query in a BEFORE per-statement trigger. Is that possible?
The problem is that I want to allow only queries that will update up to 4 rows. If the affected row count is 5 or more, I want to raise an error.
I don't want to do this in code because I need this check at the database level.
Is this at all possible?
Thanks in advance for any clues on that
Write a function that updates the rows for you or performs a rollback. Sorry for poor style formatting.
create function update_max(varchar, int)
RETURNS void AS
$BODY$
DECLARE
sql ALIAS FOR $1;
max ALIAS FOR $2;
rcount INT;
BEGIN
EXECUTE sql;
GET DIAGNOSTICS rcount = ROW_COUNT;
IF rcount > max THEN
--ROLLBACK;
RAISE EXCEPTION 'Too many rows affected (%).', rcount;
END IF;
--COMMIT;
END;
$BODY$ LANGUAGE plpgsql;
Then call it like
select update_max('update t1 set id=id+10 where id < 4', 3);
where the first parameter is your SQL statement and the second is the maximum number of rows.
Simon had a good idea but his implementation is unnecessarily complicated. This is my proposition:
create or replace function trg_check_max_4()
returns trigger as $$
begin
perform true from pg_class
where relname='check_max_4' and relnamespace=pg_my_temp_schema();
if not FOUND then
create temporary table check_max_4
(value int check (value<=4))
on commit drop;
insert into check_max_4 values (0);
end if;
update check_max_4 set value=value+1;
return new;
end; $$ language plpgsql;
I've created something like this:
begin;
create table test (
id integer
);
insert into test(id) select generate_series(1,100);
create or replace function trg_check_max_4_updated_records()
returns trigger as $$
declare
counter_ integer := 0;
tablename_ text := 'temptable';
begin
raise notice 'trigger fired';
select count(42) into counter_
from pg_catalog.pg_tables where tablename = tablename_;
if counter_ = 0 then
raise notice 'Creating table %', tablename_;
execute 'create temporary table ' || tablename_ || ' (counter integer) on commit drop';
execute 'insert into ' || tablename_ || ' (counter) values(1)';
execute 'select counter from ' || tablename_ into counter_;
raise notice 'Actual value for counter= [%]', counter_;
else
execute 'select counter from ' || tablename_ into counter_;
execute 'update ' || tablename_ || ' set counter = counter + 1';
raise notice 'updating';
execute 'select counter from ' || tablename_ into counter_;
raise notice 'Actual value for counter= [%]', counter_;
if counter_ > 4 then
raise exception 'Cannot change more than 4 rows in one transaction';
end if;
end if;
return new;
end; $$ language plpgsql;
create trigger trg_bu_test before
update on test
for each row
execute procedure trg_check_max_4_updated_records();
update test set id = 10 where id <= 1;
update test set id = 10 where id <= 2;
update test set id = 10 where id <= 3;
update test set id = 10 where id <= 4;
update test set id = 10 where id <= 5;
rollback;
The main idea is to have a trigger on 'before update for each row' that creates (if necessary) a temporary table (that is dropped at the end of transaction). In this table there is just one row with one value, that is the number of updated rows in current transaction. For each update the value is incremented. If the value is bigger than 4, the transaction is stopped.
But I think that this is the wrong solution for your problem. What stops someone from running the kind of query you describe twice, so that 8 rows are changed? And what about deleting rows or truncating the table?
PostgreSQL has two types of triggers: row and statement triggers. Row triggers only work within the context of a row so you can't use those. Unfortunately, "before" statement triggers don't see what kind of change is about to take place so I don't believe you can use those, either.
Based on that, I would say it's unlikely you'll be able to build that kind of protection into the database using triggers, not unless you don't mind using an "after" trigger and rolling back the transaction if the condition isn't satisfied. Wouldn't mind being proved wrong. :)
Have a look at using Serializable Isolation Level. I believe this will give you a consistent view of the database data within your transaction. Then you can use option #1 that MusiGenesis mentioned, without the timing vulnerability. Test it of course to validate.
I've never worked with postgresql, so my answer may not apply. In SQL Server, your trigger can call a stored procedure which would do one of two things:
Perform a SELECT COUNT(*) to determine the number of records that will be affected by the UPDATE, and then only execute the UPDATE if the count is 4 or less
Perform the UPDATE within a transaction, and only commit the transaction if the returned number of rows affected is 4 or less
No. 1 is timing vulnerable (the number of records affected by the UPDATE may change between the COUNT(*) check and the actual UPDATE). No. 2 is pretty inefficient if there are many cases where the number of rows updated is greater than 4.
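Option No. 2 can be sketched on the client side as follows. This is only an illustration in Python using the standard-library sqlite3 module as a stand-in for the real database; the guarded_update helper and the MAX_ROWS constant are hypothetical names, and the test table mirrors the one used in an earlier answer. The update runs inside a transaction and is rolled back if it touched more than four rows.

```python
import sqlite3

MAX_ROWS = 4  # the limit from the question


def guarded_update(conn, sql, params=()):
    """Run an UPDATE and keep it only if it touched at most MAX_ROWS rows."""
    cur = conn.cursor()
    cur.execute(sql, params)      # sqlite3 opens a transaction implicitly
    affected = cur.rowcount       # rows changed by the UPDATE
    if affected > MAX_ROWS:
        conn.rollback()           # undo the oversized update
        return False, affected
    conn.commit()
    return True, affected


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER)")
conn.executemany("INSERT INTO test (id) VALUES (?)", [(i,) for i in range(1, 101)])
conn.commit()

ok_small = guarded_update(conn, "UPDATE test SET id = id + 1000 WHERE id <= ?", (3,))
ok_large = guarded_update(conn, "UPDATE test SET id = id + 1000 WHERE id <= ?", (50,))
changed = conn.execute("SELECT count(*) FROM test WHERE id > 1000").fetchone()[0]

print(ok_small)  # (True, 3)
print(ok_large)  # (False, 47)
print(changed)   # 3
```

As the answer notes, this does the work first and throws it away afterwards, so it is wasteful when oversized updates are common.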

How to get the number of deleted rows in PostgreSQL?

I am looking for a way to return the number of rows affected by a DELETE statement in PostgreSQL. The documentation states:
On successful completion, a DELETE command returns a command tag of the form
DELETE count
The count is the number of rows deleted. If count is 0, no rows matched the condition (this is not considered an error).
If the DELETE command contains a RETURNING clause, the result will be similar to that of a SELECT statement containing the columns and values defined in the RETURNING list, computed over the row(s) deleted by the command.
But I am having trouble finding a good example of it. Can anyone help me with this, how can I find out how many rows were deleted?
EDIT:
I wanted to present an alternative that I found later. It is explained in the documentation under section 38.5.5, "Obtaining the Result Status".
You can use RETURNING clause:
DELETE FROM table WHERE condition IS TRUE RETURNING *;
After that you just have to check number of rows returned. You can streamline it with CTE:
WITH deleted AS (DELETE FROM table WHERE condition IS TRUE RETURNING *) SELECT count(*) FROM deleted;
This should return just the number of deleted rows.
GET DIAGNOSTICS can be used to obtain the number of modified or deleted records.
Sample code
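CREATE OR REPLACE FUNCTION fnName()
RETURNS void AS
$BODY$
declare
count numeric;
begin
count := 0;
LOOP
-- condition here update or delete;
GET DIAGNOSTICS count = ROW_COUNT; -- rows affected by the statement above
raise notice 'Value: %', count;
exit when count = 0; -- assumed exit condition: stop once a pass affects no rows
end loop;
end;
$BODY$ LANGUAGE plpgsql;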
This should be simple in Java.
Statement stmt = connection.createStatement();
int rowsAffected = stmt.executeUpdate("delete from your_table");
System.out.println("deleted: " + rowsAffected);
See java.sql.Statement.
In Python, using psycopg2, the rowcount attribute can be used. Here is an example to find out how many rows were deleted:
cur = connection.cursor()
try:
    cur.execute("DELETE FROM table WHERE col1 = %s", (value,))
    connection.commit()
    count = cur.rowcount
    cur.close()
    print("A total of %s rows were deleted." % count)
except Exception:
    connection.rollback()
    print("An error has occurred. No rows were deleted.")
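The rowcount idea can also be tried without a PostgreSQL server. The sketch below is a self-contained variant that swaps in Python's standard-library sqlite3 module, whose cursors expose the same DB-API rowcount attribute; the sales table, its sample rows, and the delete_and_count helper are made up for illustration. Note that sqlite3 uses ? placeholders where psycopg2 uses %s.

```python
import sqlite3


def delete_and_count(conn, threshold):
    """Delete rows with amount below threshold; return how many were removed."""
    cur = conn.cursor()
    try:
        cur.execute("DELETE FROM sales WHERE amount < ?", (threshold,))
        deleted = cur.rowcount  # rows affected by the DELETE just executed
        conn.commit()
        return deleted
    except sqlite3.Error:
        conn.rollback()         # leave the table unchanged on failure
        raise
    finally:
        cur.close()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO sales (amount) VALUES (?)", [(5,), (10,), (15,), (20,)])
conn.commit()

deleted = delete_and_count(conn, 12)
remaining = conn.execute("SELECT count(*) FROM sales").fetchone()[0]
print("deleted:", deleted, "remaining:", remaining)  # deleted: 2 remaining: 2
```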
This works in functions. It works with other operations like INSERT as well.
DECLARE _result INTEGER;
...
DELETE FROM mytable WHERE amount = 0; -- or whatever other operation you desire
GET DIAGNOSTICS _result = ROW_COUNT;
IF _result > 0 THEN
RAISE NOTICE 'Removed % rows with amount = 0', _result;
END IF;
You need the PQcmdTuples function from libpq, which in PHP, for example, is wrapped as pg_affected_rows.