Use prepared statement or dynamic command in Postgresql stored procedure? - postgresql

I am learning Postgreslq and I have a question about this code sample:
CREATE OR REPLACE PROCEDURE CreateProject(
IN projectName VARCHAR(45),
IN projectYear SMALLINT)
LANGUAGE plpgsql
AS $$
BEGIN
-- PREPARE addProject (VARCHAR(45),SMALLINT) AS
-- INSERT INTO projects (projectName, year) VALUES ($1, $2);
-- EXECUTE addProject(projectName, projectYear );
EXECUTE 'INSERT INTO projects (projectName, year) VALUES ($1, $2)'
USING projectName,projectYear;
END $$;
I am trying to write stored procedure that would be safe against SQL injection. Coming from Mysql, I know that I would have to use prepared statement with parameters. Here in Postgresql it doesn't allow to(commented code), on the other hand if I use dynamic command it works. Could someone explain, why it is not possible to use prepared statement in such a situation?

Execution plans are prepared and cached for PL/PgSQL, it happens automatically. So, there's no need to use PREPARE inside PL/PGSQL ( As you've seen, you can't) for the sake of optimisation.
SQL injection is possible if you use dynamic arguments and use concatenation to append them instead of parameterising. However, since you are running a simple insert without dynamic columns/tables, EXECUTE is not needed.
CREATE OR REPLACE procedure CreateProject(
IN projectName VARCHAR(45),
IN projectYear SMALLINT )
LANGUAGE plpgsql
AS $$
BEGIN
INSERT INTO projects (projectName, year) VALUES ($1, $2);
END $$;
DEMO

Related

Execute prepare statement in a sql function

CREATE OR REPLACE FUNCTION my_function2()
RETURNS VOID
LANGUAGE plpgsql AS
$func$
BEGIN
PREPARE "my-statement7" (text, text) AS
INSERT INTO "table" (key, a_id)
VALUES ($1, $2);
EXECUTE format('EXECUTE "my-statement7" (%1, %2)', 'value1', 'value2');
END;
$func$;
SELECT my_function2();
However this will return error.
I have run successfully when I do execute prepare statement in the command line but when I do it in a function it will prompt error. Somebody said the execute statement will be ambiguous in execute statement or function, so it need put execute into execute format. However, it doesn't work too.
I want to execute the prepare statement for insert because I got a lot of data got to insert in same format, but the data may need to insert to different table (also data are different), so I cannot just do insert multi or COPY for it. I think the prepare statement is the solution for me, but it cannot work in a function. This will bother me if I want to do a lot of "execute prepare statement" in a function.
In PL/SQL, queries are already prepared and cached, so you don't have to.
As each expression and SQL command is first executed in the function, the PL/pgSQL interpreter parses and analyzes the command to create a prepared statement, using the SPI manager's SPI_prepare function. Subsequent visits to that expression or command reuse the prepared statement.
Here's how you'd write it if you tried. Note the use of _ instead of - in the name to avoid quoting issues; no need to format.
CREATE OR REPLACE FUNCTION my_function2()
RETURNS VOID
LANGUAGE plpgsql AS
$func$
BEGIN
PREPARE my_statement7 (text, text) AS
INSERT INTO example (key, a_id)
VALUES ($1, $2);
EXECUTE my_statement7( 'value1', 'value2');
END;
$func$;
However, the first time it will fail (for reasons I don't entirely understand): ERROR: function my_statement7(unknown, unknown) does not exist. The second time it is called it will also fail: ERROR: prepared statement "my_statement7" already exists.

Declare and return value for DELETE and INSERT

I am trying to remove duplicated data from some of our databases based upon unique id's. All deleted data should be stored in a separate table for auditing purposes. Since it concerns quite some databases and different schemas and tables I wanted to start using variables to reduce chance of errors and the amount of work it will take me.
This is the best example query I could think off, but it doesn't work:
do $$
declare #source_schema varchar := 'my_source_schema';
declare #source_table varchar := 'my_source_table';
declare #target_table varchar := 'my_target_schema' || source_table || '_duplicates'; --target schema and appendix are always the same, source_table is a variable input.
declare #unique_keys varchar := ('1', '2', '3')
begin
select into #target_table
from #source_schema.#source_table
where id in (#unique_keys);
delete from #source_schema.#source_table where export_id in (#unique_keys);
end ;
$$;
The query syntax works with hard-coded values.
Most of the times my variables are perceived as columns or not recognized at all. :(
You need to create and then call a plpgsql procedure with input parameters :
CREATE OR REPLACE PROCEDURE duplicates_suppress
(my_target_schema text, my_source_schema text, my_source_table text, unique_keys text[])
LANGUAGE plpgsql AS
$$
BEGIN
EXECUTE FORMAT(
'WITH list AS (INSERT INTO %1$I.%3$I_duplicates SELECT * FROM %2$I.%3$I WHERE array[id] <# %4$L :: integer[] RETURNING id)
DELETE FROM %2$I.%3$I AS t USING list AS l WHERE t.id = l.id', my_target_schema, my_source_schema, my_source_table, unique_keys :: text) ;
END ;
$$ ;
The procedure duplicates_suppress inserts into my_target_schema.my_source_table || '_duplicates' the rows from my_source_schema.my_source_table whose id is in the array unique_keys and then deletes these rows from the table my_source_schema.my_source_table .
See the test result in dbfiddle.
As has been commented, you need some kind of dynamic SQL. In a FUNCTION, PROCEDURE or a DO statement to do it on the server.
You should be comfortable with PL/pgSQL. Dynamic SQL is no beginners' toy.
Example with a PROCEDURE, like Edouard already suggested. You'll need a FUNCTION instead to wrap it in an outer transaction (like you very well might). See:
When to use stored procedure / user-defined function?
CREATE OR REPLACE PROCEDURE pg_temp.f_archive_dupes(_source_schema text, _source_table text, _unique_keys int[], OUT _row_count int)
LANGUAGE plpgsql AS
$proc$
-- target schema and appendix are always the same, source_table is a variable input
DECLARE
_target_schema CONSTANT text := 's2'; -- hardcoded
_target_table text := _source_table || '_duplicates';
_sql text := format(
'WITH del AS (
DELETE FROM %I.%I
WHERE id = ANY($1)
RETURNING *
)
INSERT INTO %I.%I TABLE del', _source_schema, _source_table
, _target_schema, _target_table);
BEGIN
RAISE NOTICE '%', _sql; -- debug
EXECUTE _sql USING _unique_keys; -- execute
GET DIAGNOSTICS _row_count = ROW_COUNT;
END
$proc$;
Call:
CALL pg_temp.f_archive_dupes('s1', 't1', '{1, 3}', 0);
db<>fiddle here
I made the procedure temporary, since I assume you don't need to keep it permanently. Create it once per database. See:
How to create a temporary function in PostgreSQL?
Passed schema and table names are case-sensitive strings! (Unlike unquoted identifiers in plain SQL.) Either way, be wary of SQL-injection when concatenating SQL dynamically. See:
Are PostgreSQL column names case-sensitive?
Table name as a PostgreSQL function parameter
Made _unique_keys type int[] (array of integer) since your sample values look like integers. Use a the actual data type of your id columns!
The variable _sql holds the query string, so it can easily be debugged before actually executing. Using RAISE NOTICE '%', _sql; for that purpose.
I suggest to comment the EXECUTE line until you are sure.
I made the PROCEDURE return the number of processed rows. You didn't ask for that, but it's typically convenient. At hardly any cost. See:
Dynamic SQL (EXECUTE) as condition for IF statement
Best way to get result count before LIMIT was applied
Last, but not least, use DELETE ... RETURNING * in a data-modifying CTE. Since that has to find rows only once it comes at about half the cost of separate SELECT and DELETE. And it's perfectly safe. If anything goes wrong, the whole transaction is rolled back anyway.
Two separate commands can also run into concurrency issues or race conditions which are ruled out this way, as DELETE implicitly locks the rows to delete. Example:
Replicating data between Postgres DBs
Or you can build the statements in a client program. Like psql, and use \gexec. Example:
Filter column names from existing table for SQL DDL statement
Based on Erwin's answer, minor optimization...
create or replace procedure pg_temp.p_archive_dump
(_source_schema text, _source_table text,
_unique_key int[],_target_schema text)
language plpgsql as
$$
declare
_row_count bigint;
_target_table text := '';
BEGIN
select quote_ident(_source_table) ||'_'|| array_to_string(_unique_key,'_') into _target_table from quote_ident(_source_table);
raise notice 'the deleted table records will store in %.%',_target_schema, _target_table;
execute format('create table %I.%I as select * from %I.%I limit 0',_target_schema, _target_table,_source_schema,_source_table );
execute format('with mm as ( delete from %I.%I where id = any (%L) returning * ) insert into %I.%I table mm'
,_source_schema,_source_table,_unique_key, _target_schema, _target_table);
GET DIAGNOSTICS _row_count = ROW_COUNT;
RAISE notice 'rows influenced, %',_row_count;
end
$$;
--
if your _unique_key is not that much, this solution also create a table for you. Obviously you need to create the target schema yourself.
If your unique_key is too much, you can customize to properly rename the dumped table.
Let's call it.
call pg_temp.p_archive_dump('s1','t1', '{1,2}','s2');
s1 is the source schema, t1 is source table, {1,2} is the unique key you want to extract to the new table. s2 is the target schema

SELECT usage with the new CREATE PROCEDURE method

I'm trying to store a simple SELECT query with the new CREATE PROCEDURE method in PostgreSQL 11. My idea is to store the queries in the DB, because I can have a much simple code in my API server and maybe I don't need to develop a query builder if I can use if/else in an sql function with enforced type safety. I have this minimal example:
First I tried this plpgsql function:
CREATE OR REPLACE PROCEDURE test_proc() AS $$
BEGIN
SELECT * FROM my_db
LIMIT 1;
END;
$$ LANGUAGE plpgsql;
CALL test_proc();
However throws this error:
ERROR: query has no destination for result data
HINT: If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT: PL/pgSQL function test_proc() line 3 at SQL statement SQL state: 42601
If I trying to use RETURN QUERY:
CREATE OR REPLACE PROCEDURE test_proc() AS $$
BEGIN
RETURN QUERY;
SELECT * FROM my_db
LIMIT 1;
END;
$$ LANGUAGE plpgsql;
I'm getting this error:
ERROR: cannot use RETURN QUERY in a non-SETOF function
LINE 17: RETURN QUERY; ^
SQL state: 42804
Character: 310
I'm also getting error when I try to use RETURNS void AS $$ or RETURNS table(...) AS $$. Seems like RETURNS not supported in CREATE PROCEDURE? So, is it possible to return a table with the new stored procedure method? Or if it's not, maybe JSON?
Procedures in PostgreSQL (Oracle, DB2) are not same like procedures in MS-SQL. It has different target, and you cannot use it. Usually, the best what you can do, forgot all what you know from MSSQL. The procedural part is really different.
Only functions can returns some data - so you need to use functions. Functions can returns scalar value, composite value or array value, or table. You want function that returns table.
CREATE OR REPLACE FUNCTION fx()
RETURNS SETOF mytab AS $$
BEGIN
RETURN QUERY SELECT * FROM mytab;
END
$$ LANGUAGE plpgsql;
SELECT * FROM fx();
For record:
You can use SQL function, that can have better (or worse) performance (depends on context). These functions are sometimes named as parametrized views.
CREATE OR REPLACE FUNCTION fx()
RETURNS SETOF mytab AS $$
SELECT * FROM mytab;
$$ LANGUAGE sql;
Attention: this technique is antipattern!!! Don't do it. It is really not good idea. The functions should not to wrap queries. If you want to hide some complexity of queries, then use a views. Don't use a functions. Functions are effective barier for query optimizer, and when you use this antipattern, then optimizer cannot to well optimize any non trivial queries that use in this form evaluated subqueries.
Use it - if you want very very slow applications - or if your data model or queries are primitive. In other cases, don't do it.
Don't afraid of SQL - it is great language designed for manual usage. It is good to place all data access to one module (model), to don't access database everywhere in your code, but it is bad too hide SQL in your code.
First of all Procedure was introduced in PostgreSQL 11, If you are using below 11th version, you cannot use Procedures. Instead to Procedure you can use functions.
Syntax to create function
CREATE or replace function function_name(_parameter varchar)
returns table(col1 varchar, col2 varchar, col3 varchar)
language 'plpgsql'
as $BODY$
BEGIN
return query select a.col1, a.col2, b.col3 from table a
join table2 as b on a.col1 = b.col1;
END;
$BODY$;
you can call a function same a like table
select * From function_name('sample data');
syntax to create Procedure.
CREATE OR REPLACE PROCEDURE procedure_name(_parameter varcar,INOUT result refcursor)
LANGUAGE 'plpgsql'
AS $BODY$
BEGIN
open result for SELECT , * from sampletable where a = _parameter;
END;
$BODY$;
you can execute a Procedure using call keyword, within a transaction
BEGIN;
CALL public.procedure_name( 'sample data', 'test');
fetch all in "test";
COMMIT;
The postgreSql 11. we have to create a stored procedure
there is the solution :
Create procedure to execute query in PostgreSQL

How to create a stored procedure including the "SELECT" in Oracle SQL Developer?

I am novice user in oracle and well I am creating a stored procedure to display data from a table, because my teaching process requires it. At first I ran my query follows.
Create or replace procedure p_ mostrar
Is
Begin
Select ID_MODULO, NOMBRE, URL, ESTADO, ICONO FROM MODULO WHERE ESTADO=1 ;
Commit;
End p_mostrar;
And he throws me the following error:
The judgment was expected INTO" After some research changed the syntax and run it as follows:
Create or replace procedure p_ mostrar (C1 out sys_refcursor)
Is
Begin Open C1 for Select ID_MODULO, NOMBRE, URL, ESTADO, ICONO
FROM MODULO
WHERE ESTADO=1 ;
Commit;
End p_mostrar;
And I think runs correctly. But now it does not know how to run the procedure. I thank you in advance and expect a prompt response. Remember, I'm learning with Oracle SQL Developer.
When you are dealing with select statement inside a stored procedure, you need to include INTO clause to the select statement to store the values in a variable. Try this
CREATE OR REPLACE PROCEDURE p_mostrar
IS
v_id_modulo modulo.id_modulo%TYPE;
v_nombre modulo.nombre%TYPE;
v_url modulo.url%TYPE;
v_estado modulo.estado%TYPE;
v_icono modulo.icono%TYPE;
BEGIN
SELECT id_modulo, nombre, url, estado, icono
INTO v_id_modulo, v_nombre, v_url, v_estado, v_icono --needed to catch the values selected and store it to declared variables
FROM modulo
WHERE estado=1 ;
Commit; -- i dont think this is necessary, you use commit statement only when you use DML statements to manage the changes made
dbms_output.put_line(v_id_modulo||' '|| v_nombre||' '||v_url||' '||v_estado||' '||v_icono); --used to display the values stored in the variables
END;
This should work:
var result refcursor
execute p_ mostrar(:result)

Use function variable in dynamic COPY statement

According to docs of PostgreSQL it is possible to copy data to csv file right from a query without using an intermediate table. I am curious how to do that.
CREATE OR REPLACE FUNCTION m_tbl(my_var integer)
RETURNS void AS
$BODY$
DECLARE
BEGIN
COPY (
select my_var
)
TO 'c:/temp/out.csv';
END;
$$ LANGUAGE plpgsql;
I get an error: no such column 'my_var'.
Yes, it is possible to COPY from any query, whether or not it refers to a table.
However, COPY is a non-plannable statement, a utility statement. It doesn't support query parameters - and query parameters are how PL/PgSQL implements the insertion of variables into statements.
So you can't use PL/PgSQL variables with COPY.
You must instead use dynamic SQL with EXECUTE. See the Pl/PgSQL documentation for examples. There are lots of examples here on Stack Overflow and on https://dba.stackexchange.com/ too.
Something like:
EXECUTE format('
COPY (
select %L
)
TO ''c:/temp/out.csv'';
', my_var);
The same applies if you want the file path to be dynamic - you'd use:
EXECUTE format('
COPY (
select %L
)
TO %L;
', my_var, 'file_name.csv');
It also works for dynamic column names but you would use %I (for identifier, like "my_name") instead of %L for literal like 'my_value'. For details on %I and %L, see the documentation for format.