How to apply a LIMIT in postgresql conditionally? - postgresql

I'd like to apply a LIMIT condition but only if the variable I am limiting by is populated. Otherwise no limit should be applied. The following syntax causes an error but provides an idea of what I'm trying to achieve
Example
SELECT
myTable."Id",
FROM reference.myTable myTable
CASE
WHEN #limit IS NOT NULL THEN LIMIT #limit
END;

Best you can do is probably this without dynamic SQL
SELECT *
FROM mytable
ORDER BY id
LIMIT
CASE WHEN :lim IS NULL THEN 99999999 ELSE :lim END;

Related

PostgreSQL. Is such function vulnerable to SQL injection or is it safe?

Functions that looks problematic
I'm exploring postgresql database and I see a recurring pattern:
CREATE OR REPLACE FUNCTION paginated_class(_orderby text DEFAULT NULL, _limit int DEFAULT 10, _offset int DEFAULT 0)
RETURNS SETOF pg_class
LANGUAGE PLPGSQL
AS $$
BEGIN
return query execute'
select * from pg_class
order by '|| coalesce (_orderby, 'relname ASC') ||'
limit $1 offset $1
'
USING _limit, _offset;
END;
$$;
Sample usage:
SELECT * FROM paginated_class(_orderby:='reltype DESC, relowner ASC ')
Repeating is:
_orderby is passed as text. It could be any combination of fields of returned SETOF type. E.g. 'relname ASC, reltype DESC'
_orderby parameter is not sanitized or checked in any way
_limit and _offset are integers
DB Fiddle for that: https://www.db-fiddle.com/f/vF6bCN37yDrjBiTEsdEwX6/1
Question: is such function vulnerable to SQL injection or not?
By external signs it's possible to suspect that such function is vulnerable to sql injection.
But all my attempts to find combination of params failed.
E.g.
CREATE TABLE T(id int);
SELECT * FROM paginated_class(_orderby:='reltype; DROP TABLE T; SELECT * FROM pg_class');
will return "Query Error: error: cannot open multi-query plan as cursor".
I did not found a way to exploit vulnerability if it exists with UPDATE/INSERT/DELETE.
So can we conclude that such function is actually safe?
If so: then why?
UPDATE. Possible plan for attack
Maybe I was not clear: I'm not asking about general guidelines but for experimental exploit of vulnerability or proof that such exploit is not possible.
DB Fiddle for this: https://www.db-fiddle.com/f/vF6bCN37yDrjBiTEsdEwX6/4 (or you can provide other of course)
My conclusions so far
A. Such attack could be possible if _orderby will have parts:
sql code that suppresses output of first SELECT
do something harmful
select * from pg_class so that it satisfies RETURNS SETOF pg_class
E.g.
SELECT * FROM paginated_class(_orderby:='relname; DELETE FROM my_table; SELECT * FROM pg_class')
It's easy for 2 and 3. I don't know a way to do 1st part.
This will generate: "error: cannot open multi-query plan as cursor"
B. If it's not possible to suppress first SELECT
Then:
every postgresql function works in separate transaction
because of error this transaction will be rollbacked
there is no autonomous transactions as in Oracle
for non-transactional ops: I know only about sequence-related ops
everything else be that DML or DDL is transactional
So? Can we conclude that such function is actually safe?
Or I'm missing something?
UPDATE 2. Attack using prepared function
From answer https://stackoverflow.com/a/69189090/1168212
A. It's possible to implement Denial-of-service attack putting expensive calculation
B. Side-effects:
If you put a function with side effects into the ORDER BY clause, you may also be able to modify data.
Let's try the latter:
CREATE FUNCTION harmful_fn()
RETURNS bool
LANGUAGE SQL
AS '
DELETE FROM my_table;
SELECT true;
';
SELECT * FROM paginated_class(_orderby:='harmful_fn()', _limit:=1);
https://www.db-fiddle.com/f/vF6bCN37yDrjBiTEsdEwX6/8
Yes.
So if an attacker has right to create functions: non-DOS attack is possible too.
I accept Laurenz Albe answer but: is it possible to do non-DOS attack without function?
Ideas?
No, that is not safe. An attacker could put any code into your ORDER BY clause via the _orderby parameter.
For example, you can pass an arbitrary subquery, as long as it returns only a single row: (SELECT max(i) FROM generate_series(1, 100000000000000) AS i). That can easily be used for a denial-of-service attack, if the query is expensive enough. Or, like with this example, you can cause a (brief) out-of-space condition with temporary files.
If you put a function with side effects into the ORDER BY clause, you may also be able to modify data.

Query Execution in Postgresql

How to execute query within format() function in Postgres? can any one please guide me.
IF NOT EXISTS ( SELECT format ('select id
from '||some_table||' where emp_id
='||QUOTE_LITERAL(m_emp_id)||' )
) ;
You may combine EXECUTE FORMAT and an IF condition using GET DIAGNOSTICS
Here's an example which you can reuse with minimal changes.I have passed table name using %I identifier and parameterised argument ($1) for employee_id. This is safe against SQL Injection.LIMIT 1 is used since we are interested if at least one row exists. This will improve query performance and is equivalent(or efficient) to using EXISTS for huge datasets with multiple matching rows.
DO $$
DECLARE
some_table TEXT := 'employees';
m_emp_id INT := 100;
cnt INT;
BEGIN
EXECUTE format ('select 1
from %I where employee_id
= $1 LIMIT 1',some_table ) USING m_emp_id ;
GET DIAGNOSTICS cnt = ROW_COUNT;
IF cnt > 0
THEN
RAISE NOTICE 'FOUND';
ELSE
RAISE NOTICE 'NOT FOUND';
END IF;
END
$$;
Result
NOTICE: FOUND
DO
The #Kaushik Nayak reply is correct - I will try just little bit more explain this issue.
PLpgSQL knows two types of queries to database:
static (embedded) SQL - SQL is written directly to plpgsql code. Static SQL can be parametrized (can use a varible), but the parameters should be used as table or column identifier. The semantic (and execution plan) should be same every time.
dynamic SQL - this style of queries is similar to classic client side SQL - SQL is written as string expression (that is evaluated) in running time, and result of this string expression is evaluated as SQL query. There are not limits for parametrization, but there is a risk of SQL injections (parameters must be sanitized against SQL injection (good examples are in #Kaushik Nayak reply)). More, there is a overhead with every time replanning - so dynamic SQL should be used only when it is necessary. Dynamic SQL is processed by EXECUTE command. The syntax of IF commands is:
IF expression THEN
statements; ...
END IF;
You cannot to place into expression PLpgSQL statements. So this task should be divided to more steps:
EXECUTE format(...) USING m_emp_id;
GET DIAGNOSTICS rc = ROW_COUNT;
IF cnt > 0 THEN
...
END IF;

Execute a SELECT with dynamic ORDER BY expression inside a function

I'm trying to EXECUTE some SELECTs to use inside a function, my code is something like this:
DECLARE
result_one record;
BEGIN
EXECUTE 'WITH Q1 AS
(
SELECT id
FROM table_two
INNER JOINs, WHERE, etc, ORDER BY... DESC
)
SELECT Q1.id
FROM Q1
WHERE, ORDER BY...DESC';
RETURN final_result;
END;
I know how to do it in MySQL, but in PostgreSQL I'm failing. What should I change or how should I do it?
For a function to be able to return multiple rows it has to be declared as returns table() (or returns setof)
And to actually return a result from within a PL/pgSQL function you need to use return query (as documented in the manual)
To build dynamic SQL in Postgres it is highly recommended to use the format() function to properly deal with identifiers (and to make the source easier to read).
So you need something like:
create or replace function get_data(p_sort_column text)
returns table (id integer)
as
$$
begin
return query execute
format(
'with q1 as (
select id
from table_two
join table_three on ...
)
select q1.id
from q1
order by %I desc', p_sort_column);
end;
$$
language plpgsql;
Note that the order by inside the CTE is pretty much useless if you are sorting the final query unless you use a LIMIT or distinct on () inside the query.
You can make your life even easier if you use another level of dollar quoting for the dynamic SQL:
create or replace function get_data(p_sort_column text)
returns table (id integer)
as
$$
begin
return query execute
format(
$query$
with q1 as (
select id
from table_two
join table_three on ...
)
select q1.id
from q1
order by %I desc
$query$, p_sort_column);
end;
$$
language plpgsql;
What a_horse said. And:
How to return result of a SELECT inside a function in PostgreSQL?
Plus, to pick a column for ORDER BY dynamically, you have to add that column to the SELECT list of your CTE, which leads to complications if the column can be duplicated (like with passing 'id') ...
Better yet, remove the CTE entirely. There is nothing in your question to warrant its use anyway. (Only use CTEs when needed in Postgres, they are typically slower than equivalent subqueries or simple queries.)
CREATE OR REPLACE FUNCTION get_data(p_sort_column text)
RETURNS TABLE (id integer) AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$q$
SELECT t2.id -- assuming you meant t2?
FROM table_two t2
JOIN table_three t3 on ...
ORDER BY t2.%I DESC NULL LAST -- see below!
$q$, $1);
END
$func$ LANGUAGE plpgsql;
I appended NULLS LAST - you'll probably want that, too:
PostgreSQL sort by datetime asc, null first?
If p_sort_column is from the same table all the time, hard-code that table name / alias in the ORDER BY clause. Else, pass the table name / alias separately and auto-quote them separately to be safe:
Define table and column names as arguments in a plpgsql function?
I suggest to table-qualify all column names in a bigger query with multiple joins (t2.id not just id). Avoids various kinds of surprising results / confusion / abuse.
And you may want to schema-qualify your table names (myschema.table_two) to avoid similar troubles when calling the function with a different search_path:
How does the search_path influence identifier resolution and the "current schema"

IF EXISTS-error

Could someone tell me the right use of IF EXISTS. I have this query, but I have problem with IF EXISTS function.
insert into master.dbo.turnover3(shop,somefield)
select '301',Curr_Turnover
from [S301].vpm.dbo.BO_POS_SAP_Turnover
where datediff(day,left(Sale_Date,16),getdate())= '0' if not exists (select NULL)
The right use of if exists is easy. It is not valid syntax for a select, update, or SQL statement. Your query would work fine without it:
insert into master.dbo.turnover3(shop,somefield)
select '301',Curr_Turnover
from [S301].vpm.dbo.BO_POS_SAP_Turnover
where datediff(day,left(Sale_Date,16),getdate())= '0' ;
if exists is part of T-SQL. That is, you can write scripts with it. With this syntax, it is just another variation on the if statement:
if not exists (select *
from master.dbo.turnover3(shop,somefield)
where shop = '301'
)
begin
insert into master.dbo.turnover3(shop,somefield)
select '301',Curr_Turnover
from [S301].vpm.dbo.BO_POS_SAP_Turnover
where datediff(day,left(Sale_Date,16),getdate())= '0' ;
end;
Since you are not doing an update here (or since there is no else part), you don't need to check for the existence of records. No records found means no records inserted.
insert into master.dbo.turnover3(shop,somefield)
select '301',Curr_Turnover
from [S301].vpm.dbo.BO_POS_SAP_Turnover
where datediff(day,left(Sale_Date,16),getdate()) = 0
Also as a side note, not sure what you do with left(Sale_Date,16) in the datediff function in the where clause. If Scale_Date is a date/datetime type you could use it as datediff(day,Sale_Date,getdate()= 0

Is using the RETURNING clause from an UPDATE as the query clause for an INSERT's query clause possible?

I'm trying to use PostgreSQL's RETURNING clause on an UPDATE within in UPDATE statement, and running into trouble.
Postgres allows a query clause in an INSERT, for example:
INSERT INTO films
SELECT * FROM tmp_films WHERE date_prod < '2004-05-07';
I would like to use the RETURNING clause from an UPDATE as the query clause for an INSERT, for example:
INSERT INTO user_status_history(status)
UPDATE user_status SET status = 'ACTIVE' WHERE status = 'DISABLED' RETURNING status
All of the Postgres references I can find suggest that a RETURNING clause behaved exactly like a SELECT clause,
however when I run something like the above, I get the following:
ERROR: syntax error at or near "UPDATE"
LINE 2: UPDATE user_statuses
Despite being able to execute the UPDATE portion of the above query without error.
Is using the RETURNING clause from an UPDATE as the query clause for an INSERT's query clause possible?
The goal is to update one table and insert into another with a single query, if possible.
With PostgreSQL 9.1 (or higher) you may use the new functionality that allows data-modification commands (INSERT/UPDATE/DELETE) in WITH clauses, such as:
WITH updated_rows AS
(
UPDATE products
SET ...
WHERE ...
RETURNING *
)
INSERT INTO products_log
SELECT * FROM updated_rows;
With PostgreSQL 9.0 (or lower) you may embed the UPDATE command inside one function, and then use that function from another function which performs the INSERT command, such as:
FUNCTION update_rows()
RETURNS TABLE (id integer, descrip varchar)
AS $$
BEGIN
RETURN QUERY
UPDATE products
SET ...
WHERE ...
RETURNING *;
END;
$$ LANGUAGE plpgsql;
FUNCTION insert_rows()
RETURNS void
AS $$
BEGIN
INSERT INTO products_log
SELECT * FROM update_rows() AS x;
END;
$$ LANGUAGE plpgsql;
Right now, no.
There was a feature that almost made it into PostgreSQL 9.0 known as Writeable CTE's that does what you're thinking (although the syntax is different).
Currently, you could either do this via a trigger or as two separate statements.
I think this is not possible the way you are trying to do.
I'd suggest you to write an AFTER UPDATE trigger, which could perform the insert, then.