Calculate number of rows affected by batch query in PostgreSQL - postgresql

First of all, yes I've read documentation for DO statement :)
http://www.postgresql.org/docs/9.1/static/sql-do.html
So my question:
I need to execute some dynamic block of code that contains UPDATE statements and calculate the number of all affected rows. I'm using Ado.Net provider.
In Oracle the solution would have 4 steps:
add InputOutput parameter "N" to command
add BEGIN ... END; to command
add :N := :N + sql%rowcount after each statement.
It's done! We can read N parameter from command, after execute it.
How can I do it with PostgreSQL? I'm using npgsql provider but can migrate to devard if it helps.

DO statement blocks are good to execute dynamic SQL. They are no good to return values. Use a plpgsql function for that.
The key statement you need is:
GET DIAGNOSTICS integer_var = ROW_COUNT;
Details in the manual.
Example code:
CREATE OR REPLACE FUNCTION f_upd_some()
RETURNS integer AS
$func$
DECLARE
ct int;
i int;
BEGIN
EXECUTE 'UPDATE tbl1 ...'; -- something dynamic here
GET DIAGNOSTICS ct = ROW_COUNT; -- initialize with 1st count
UPDATE tbl2 ...; -- nothing dynamic here
GET DIAGNOSTICS i = ROW_COUNT;
ct := ct + i; -- add up
RETURN ct;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_upd_some();

My solution is quite simple. In Oracle I need to use variables to calculate the sum of updated rows because command.ExecuteNonQuery() returns only the count of rows affected by the last UPDATE in the batch.
However, npgsql returns the sum of all rows updated by all UPDATE queries. So I only need to call command.ExecuteNonQuery() and get the result without any variables. Much easier than with Oracle.

Related

Query Execution in Postgresql

How to execute query within format() function in Postgres? can any one please guide me.
IF NOT EXISTS ( SELECT format ('select id
from '||some_table||' where emp_id
='||QUOTE_LITERAL(m_emp_id)||' )
) ;
You may combine EXECUTE FORMAT and an IF condition using GET DIAGNOSTICS
Here's an example which you can reuse with minimal changes.I have passed table name using %I identifier and parameterised argument ($1) for employee_id. This is safe against SQL Injection.LIMIT 1 is used since we are interested if at least one row exists. This will improve query performance and is equivalent(or efficient) to using EXISTS for huge datasets with multiple matching rows.
DO $$
DECLARE
some_table TEXT := 'employees';
m_emp_id INT := 100;
cnt INT;
BEGIN
EXECUTE format ('select 1
from %I where employee_id
= $1 LIMIT 1',some_table ) USING m_emp_id ;
GET DIAGNOSTICS cnt = ROW_COUNT;
IF cnt > 0
THEN
RAISE NOTICE 'FOUND';
ELSE
RAISE NOTICE 'NOT FOUND';
END IF;
END
$$;
Result
NOTICE: FOUND
DO
The #Kaushik Nayak reply is correct - I will try just little bit more explain this issue.
PLpgSQL knows two types of queries to database:
static (embedded) SQL - SQL is written directly to plpgsql code. Static SQL can be parametrized (can use a varible), but the parameters should be used as table or column identifier. The semantic (and execution plan) should be same every time.
dynamic SQL - this style of queries is similar to classic client side SQL - SQL is written as string expression (that is evaluated) in running time, and result of this string expression is evaluated as SQL query. There are not limits for parametrization, but there is a risk of SQL injections (parameters must be sanitized against SQL injection (good examples are in #Kaushik Nayak reply)). More, there is a overhead with every time replanning - so dynamic SQL should be used only when it is necessary. Dynamic SQL is processed by EXECUTE command. The syntax of IF commands is:
IF expression THEN
statements; ...
END IF;
You cannot to place into expression PLpgSQL statements. So this task should be divided to more steps:
EXECUTE format(...) USING m_emp_id;
GET DIAGNOSTICS rc = ROW_COUNT;
IF cnt > 0 THEN
...
END IF;

How to migrate oracle's pipelined function into PostgreSQL

As I am newbie to plpgSQL,
I stuck while migrating a Oracle query into PostgreSQL.
Oracle query:
create or replace FUNCTION employee_all_case(
p_ugr_id IN integer,
p_case_type_id IN integer
)
RETURN number_tab_t PIPELINED
-- LANGUAGE 'plpgsql'
-- COST 100
-- VOLATILE
-- AS $$
-- DECLARE
is
l_user_id NUMBER;
l_account_id NUMBER;
BEGIN
l_user_id := p_ugr_id;
l_account_id := p_case_type_id;
FOR cases IN
(SELECT ccase.case_id, ccase.employee_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype
ON (ccase.case_type_id=ctype.case_type_id)
WHERE ccase.employee_id = l_user_id)
LOOP
IF cases.employee_id IS NOT NULL THEN
PIPE ROW (cases.case_id);
END IF;
END LOOP;
RETURN;
END;
--$$
When I execute this function then I get the following result
select * from table(select employee_all_case(14533,1190) from dual)
My question here is: I really do not understand the pipelined function and how can I obtain the same result in PostgreSQL as Oracle query ?
Please help.
Thank you guys, your solution was very helpful.
I found the desire result:
-- select * from employee_all_case(14533,1190);
-- drop function employee_all_case
create or replace FUNCTION employee_all_case(p_ugr_id IN integer ,p_case_type_id IN integer)
returns table (case_id double precision)
-- PIPELINED
LANGUAGE 'plpgsql'
COST 100
VOLATILE
AS $$
DECLARE
-- is
l_user_id integer;
l_account_id integer;
BEGIN
l_user_id := cp_lookup$get_user_id_from_ugr_id(p_ugr_id);
l_account_id := cp_lookup$acctid_from_ugr(p_ugr_id);
RETURN QUERY SELECT ccase.case_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype ON ccase.case_type_id = ctype.case_type_id
WHERE ccase.employee_id = p_ugr_id
and ccase.employee_id IS NOT NULL;
--return NEXT;
END;
$$
You would rewrite that to a set returning function:
Change the return type to
RETURNS SETOF integer
and do away with the PIPELINED.
Change the PIPE ROW statement to
RETURN NEXT cases.case_id;
Of course, you will have to do the obvious syntactic changes, like using integer instead of NUMBER and putting the IN before the parameter name.
But actually, it is quite unnecessary to write a function for that. Doing it in a single SELECT statement would be both simpler and faster.
Pipelined functions are best translated to a simple SQL function returning a table.
Something like this:
create or replace function employee_all_case(p_ugr_id integer, p_case_type_IN integer)
returns table (case_id integer)
as
$$
SELECT ccase.case_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype ON ccase.case_type_id = ctype.case_type_id
WHERE ccase.employee_id = p_ugr_id
and cases.employee_id IS NOT NULL;
$$
language sql;
Note that your sample code did not use the second parameter p_case_type_id.
Usage is also more straightforward:
select *
from employee_all_case(14533,1190);
Before diving into the solution, I will provide some information which will help you to understand better.
So basically PIPELINED came into picture for improving memory allocation at run time.
As you all know collections will occupy space when ever they got created. So the more you use, the more memory will get allocated.
Pipelining negates the need to build huge collections by piping rows out of the function.
saving memory and allowing subsequent processing to start before all the rows are generated.
Pipelined table functions include the PIPELINED clause and use the PIPE ROW call to push rows out of the function as soon as they are created, rather than building up a table collection.
By using Pipelined how memory usage will be optimized?
Well, it's very simple. instead of storing data into an array, just process the data by using pipe row(desired type). This actually returns the row and process the next row.
coming to solution in plpgsql
simple but not preferred while storing large data.
Remove PIPELINED from return declaration and return an array of desired type. something like RETURNS typrec2[].
Where ever you are using pipe row(), add that entry to array and finally return that array.
create a temp table like
CREATE TEMPORARY TABLE temp_table (required fields) ON COMMIT DROP;
and insert data into it. Replace pipe row with insert statement and finally return statement like
return query select * from temp_table
**The best link for understanding PIPELINED in oracle [https://oracle-base.com/articles/misc/pipelined-table-functions]
pretty ordinary for postgres reference [http://manojadinesh.blogspot.com/2011/11/pipelined-in-oracle-as-well-in.html]
Hope this helps some one conceptually.

No Implicit variable as an alternative to ##ROWCOUNT/SQL%ROWCOUNT [duplicate]

I want to returns the number of rows affected by the last statement.
Using Microsoft SQL Server 2008 R2 I do it this way:
SELECT * FROM Test_table;
SELECT ##ROWCOUNT AS [Number Of Rows Affected];
Will gives:
Number Of Rows Affected
-----------------------
10
How about in PostgreSQL 9.3?
DO $$
DECLARE
total_rows integer;
BEGIN
UPDATE emp_salary
SET salary = salary+1;
IF NOT FOUND THEN
RAISE NOTICE 'No rows found';
ELSIF FOUND THEN
GET DIAGNOSTICS total_rows := ROW_COUNT;
-- the above line used to get row_count
RAISE NOTICE 'Rows Found : total_rows: %', total_rows;
END IF;
END $$;
AFAIK there is no such construct in postgresql however the number of rows is part of the result you get from postgresql.
CORRECTION: as a_horse_with_no_name states in his comment there is something similar which can be used within PL/pgSQL. Also see example in answer posted by
Achilles Ram Nakirekanti
From within programs however my original suggestion is in most cases easier then having to resort to the usage of PL/pgSQL.
When using libpq:
On the result of a select you can use PQntuples to determine the number of rows returned. For update, insert and delete you can use PQcmdTuples with the result to get the number of rows affected.
Other client libraries often have similar functionality.
For REF from referred article: GET DIAGNOSTICS integer_var = ROW_COUNT;

Benchmarking/Performance of Postgresql Query

I want to measure the performance of postgresql code I wrote. In the code tables get created, selfwritten functions get called etc.
Looking around, I found EXPLAIN ANALYSE is the way to go.
However, as far as I understand it, the code only gets executed once. For a more realistic analysis I want to execute the code many many times and have the results of each iteration written somewhere, ideally in a table (for statistics later).
Is there a way to do this with a native postgresql function? If there is no native postgresql function, would I accomplish this with a simple loop? Further, how would I write out the information of every EXPLAIN ANALYZE iteration?
One way to do this is to write a function that runs an explain and then spool the output of that into a file (or insert that into a table).
E.g.:
create or replace function show_plan(to_explain text)
returns table (line_nr integer, line text)
as
$$
declare
l_plan_line record;
l_line integer;
begin
l_line := 1;
for l_plan_line in execute 'explain (analyze, verbose)'||to_explain loop
return query select l_line, l_plan_line."QUERY PLAN";
l_line := l_line + 1;
end loop;
end;
$$
language plpgsql;
Then you can use generate_series() to run a statement multiple times:
select g.i as run_nr, e.*
from show_plan('select * from foo') e
cross join generate_series(1,10) as g(i)
order by g.i, e.line_nr;
This will run the function 10 times with the passed SQL statement. The result can either be spooled to a file (how you do that depends on the SQL client you are using) or inserted into a table.
For an automatic analysis it's probably easer to use a more "parseable" explain format, e.g. XML or JSON. This is also easier to handle in the output as the plan is a single XML (or JSON) value instead of multiple text lines:
create or replace function show_plan_xml(to_explain text)
returns xml
as
$$
begin
return execut 'explain (analyze, verbose, format xml)'||to_explain;
end;
$$
language plpgsql;
Then use:
select g.i as run_nr, show_plan_xml('select * from foo')
from join generate_series(1,10) as g(i)
order by g.i;

##ROWCOUNT in PostgreSQL 9.3

I want to returns the number of rows affected by the last statement.
Using Microsoft SQL Server 2008 R2 I do it this way:
SELECT * FROM Test_table;
SELECT ##ROWCOUNT AS [Number Of Rows Affected];
Will gives:
Number Of Rows Affected
-----------------------
10
How about in PostgreSQL 9.3?
DO $$
DECLARE
total_rows integer;
BEGIN
UPDATE emp_salary
SET salary = salary+1;
IF NOT FOUND THEN
RAISE NOTICE 'No rows found';
ELSIF FOUND THEN
GET DIAGNOSTICS total_rows := ROW_COUNT;
-- the above line used to get row_count
RAISE NOTICE 'Rows Found : total_rows: %', total_rows;
END IF;
END $$;
AFAIK there is no such construct in postgresql however the number of rows is part of the result you get from postgresql.
CORRECTION: as a_horse_with_no_name states in his comment there is something similar which can be used within PL/pgSQL. Also see example in answer posted by
Achilles Ram Nakirekanti
From within programs however my original suggestion is in most cases easier then having to resort to the usage of PL/pgSQL.
When using libpq:
On the result of a select you can use PQntuples to determine the number of rows returned. For update, insert and delete you can use PQcmdTuples with the result to get the number of rows affected.
Other client libraries often have similar functionality.
For REF from referred article: GET DIAGNOSTICS integer_var = ROW_COUNT;