We have a function that builds some XML and uses EXECUTE/USING to try to prevent SQL injection. It's along the lines of:
CREATE OR REPLACE FUNCTION public.f1(t TEXT)
RETURNS XML AS
$BODY$
DECLARE
ret_val XML;
BEGIN
EXECUTE 'SELECT ''<'' || $1 || ''>'''
--rest of xml building
INTO ret_val
USING t;
RETURN ret_val;
END
$BODY$
LANGUAGE plpgsql IMMUTABLE;
I don't like this much because of the concatenation. It would be much better if we could just write EXECUTE 'SELECT ''<$1>''', but that ends up with a literal $1 in the string rather than replacing it with the value.
The concatenation makes me wonder whether we even need SQL injection prevention here. It's not reading from a table, just building an XML string which is returned.
Is there any actual risk from not using USING in this case? Does concatenating $1 negate the effects of USING, or does USING even have any effect on a statement that doesn't use a table?
There are a few things to unpack here.
Firstly, the SQL you have there is actually a fixed string:
'SELECT ''<'' || $1 || ''>'''
So nothing can be directly injected here, because there's no dynamic SQL. As Laurenz Albe pointed out, the SQL in this particular example could be written as a non-dynamic statement:
SELECT '<' || t || '>'
There is still no SQL injection here, because you're not evaluating the contents of t, just manipulating it as a string, just as SELECT a + 1 would manipulate a as a number.
The key point is that the actual SQL is hard-coded, and the concatenation is just the instruction in that SQL.
Note that this similar-looking query would be dangerous (the syntax highlighting gives a clue to the difference):
EXECUTE 'SELECT ''<' || t || '>''' -- DON'T DO THIS!
Here, the value of t is being used as part of the SQL string - the concatenation happens first, and then the result is executed. So a value of 1'; DROP TABLE users; -- would result in the query SELECT '<1'; DROP TABLE users; -->', which is clearly undesirable (everything after the -- is just a comment).
Secondly, as explained in the docs, the $1 is a parameter, supplied by the USING clause as data, so it too is safe from SQL injection. This is similar to using a parameterised query in a programming language outside the database - you build up the query as a string, carefully whitelisting the tables and columns referenced, then provide the variable data separately, where it cannot be reinterpreted as SQL. Or to put it another way, it's like another function "one level deeper", with the parameters specified by the USING clause acting just like the parameters to an actual function.
Finally, though, a note of caution: you are vulnerable to XML injection. If nothing else has validated or escaped t, you could generate invalid or dangerous XML. For instance, consider what would happen if the value of t was 'foo><script>alert("Hello, world!");</script></foo' and the result ended up parsed as HTML.
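If t is really meant to be data (element content) rather than a tag name, one way to get the escaping for free is PostgreSQL's xmlelement() - a minimal sketch, where the element name foo is just a placeholder:
SELECT xmlelement(name foo, 'a <script> & "b"');
-- xmlelement() escapes the content: <foo>a &lt;script&gt; &amp; "b"</foo>
Note that xmlelement() cannot take a dynamic element name, so a user-supplied tag name would still need to be validated separately.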
There is no danger of SQL injection here, because you are using USING.
Note that you could have been using static SQL to achieve the same thing:
SELECT '<' || t || '>' INTO ret_val;
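For reference, a sketch of the whole function with the dynamic SQL removed might look like this (assuming the elided XML-building can also be expressed as static SQL):
CREATE OR REPLACE FUNCTION public.f1(t TEXT)
RETURNS XML AS
$BODY$
DECLARE
ret_val XML;
BEGIN
SELECT '<' || t || '>'
--rest of xml building
INTO ret_val;
RETURN ret_val;
END
$BODY$
LANGUAGE plpgsql IMMUTABLE;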
I've got a bit of complex sql code that I'm converting from MSSql to Postgres (using Entity Framework Core 2.1), to deal with potential race conditions when inserting to a table with a unique index. Here's the dumbed-down version:
const string QUERY = #"
DO
$$
BEGIN
insert into Foo (Field1,Field2,Field3)
values (@value1,@value2,@value3);
EXCEPTION WHEN others THEN
-- do nothing; it's a race condition
END;
$$ LANGUAGE plpgsql;
select *
from Foo
where Field1 = @value1
and Field2 = @value2;
";
return DbContext.Foos
.FromSql(QUERY,
new NpgsqlParameter("value1", value1),
new NpgsqlParameter("value2", value2),
new NpgsqlParameter("value3", value3))
.First();
In other words, try to insert the record, but don't throw an exception if the attempt to insert it results in a unique index violation (index is on Field1+Field2), and return the record, whether it was created by me or another thread.
This concept worked fine in MSSql, using a TRY..CATCH block. As far as I can tell, the way to handle Postgres exceptions is as I've done, in a plpgsql block.
BUT...
It appears that variable substitution in plpgsql blocks doesn't work. The code above fails on the .First() (no elements in sequence), and when I comment out the EXCEPTION line, I see the real problem, which is:
Npgsql.PostgresException : 42703: column "value1" does not exist
When I test using regular Sql, i.e. doing the insert without using a plpgsql block, this works fine.
So, what is the correct way to do variable substitution in a plpgsql block?
The reason this doesn't work is that the body of the DO statement is actually a string (a text literal) - see the DO command reference in the documentation.
$$ is just another way to delimit text in PostgreSQL. It could just as well be replaced with ' or $somestuff$.
As it is a string, Npgsql and PostgreSQL have no reason to touch @value1 inside it.
Solutions? Only a very ugly one, so don't use this construction, as you're not able to pass it any values. Building the string with concatenation on the server is no different from doing the concatenation in C# in the first place.
Alternatives? Yes!
You don't need to handle exceptions in plpgsql blocks. Simply insert with ON CONFLICT DO NOTHING and be on your way.
INSERT INTO Foo (Field1,Field2,Field3)
VALUES (@value1,@value2,@value3)
ON CONFLICT DO NOTHING;
select *
from Foo
where Field1 = @value1
and Field2 = @value2;
Or, if you really want to keep using plpgsql, you can create a temporary table with the ON COMMIT DROP option, fill it with these parameters as one row, and then read it from inside the DO statement, as sketched below. For that to work, all your code must execute as part of one transaction; you can open one explicitly just in case.
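A minimal sketch of that workaround, assuming the same Foo table and parameter names as above (foo_params is just a placeholder name):
CREATE TEMP TABLE foo_params (value1 text, value2 text, value3 text) ON COMMIT DROP;

-- ordinary parameter binding works here, because this is plain SQL
INSERT INTO foo_params VALUES (@value1, @value2, @value3);

DO $$
BEGIN
    INSERT INTO Foo (Field1, Field2, Field3)
    SELECT value1, value2, value3 FROM foo_params;
EXCEPTION WHEN others THEN
    NULL;  -- do nothing; it's a race condition
END;
$$;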
The only ways to pass parameters to plpgsql code are these two:
Declaring a function, then calling it with arguments (see the sketch after this list)
When already inside a plpgsql block you can call:
EXECUTE $$ INSERT ... VALUES ($1, $2, $3); $$ USING 3, 'text value', 5.234;
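For the first option, a minimal sketch might look like this (upsert_foo is just a hypothetical name; the client then binds the arguments as ordinary parameters):
CREATE OR REPLACE FUNCTION upsert_foo(v1 text, v2 text, v3 text)
RETURNS void AS $$
BEGIN
    INSERT INTO Foo (Field1, Field2, Field3) VALUES (v1, v2, v3);
EXCEPTION WHEN unique_violation THEN
    NULL;  -- ignore the duplicate; another thread got there first
END;
$$ LANGUAGE plpgsql;

-- called from the client as, e.g.:  SELECT upsert_foo(@value1, @value2, @value3);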
End notes:
As a fellow T-SQL developer who loved its freedom but transitioned to PostgreSQL, I have to say that the BIG difference is where the power lies: in T-SQL it is in the procedural code, while on the other side it is in the very powerful PostgreSQL-flavored SQL itself. plpgsql is very rarely warranted. In fact, in a code base of megabytes of complex SQL, I can rewrite pretty much all of the plpgsql code in plain SQL. That's how powerful it really is compared to MSSQL-flavored SQL. It just takes some getting used to, and befriending the very ample documentation. Good luck!
I am facing an issue while storing a JSON value which has some text with single quotes.
Below is the scenario:
My Stored Procedure:
CREATE OR REPLACE FUNCTION JsonParse(inputdata json)
RETURNS void AS $$
DECLARE
BEGIN
UPDATE
MyTable
SET
settings_details= inputdata
WHERE
settings_key='my-list';
END;
$$
LANGUAGE PLPGSQL;
select * from JsonParse('[{
"myArray": ["New Text1 'abcd'", "New Text1 'abcd'"]
},
{"myArray": ["New Text1 'abcd'", "New Text1 'abcd'"]}]');
I get the below error :
ERROR: syntax error at or near "abcd"
I can add an extra single quote (doubling it) around abcd, and that solves the issue. But the problem is that I don't have control over the input text passed to the JsonParse procedure.
The stored procedure should be able to handle this on its own.
How can I tackle this, please?
Your stored procedure is not rejecting this; the SQL you are running is malformed.
Try a simpler example:
SELECT 'test o'reilly' AS FOO;
As the syntax highlighting shows, the SQL parser will think the string is 'test o', and the reilly is a syntax error.
This is a sign that you are not correctly handling data input into your SQL queries, and are probably vulnerable to SQL injection attacks (what if I supply you with the "JSON" string {}'); DROP TABLE users; --).
Everywhere that you add data to an SQL query, you need to do one of two things:
Escape the input using an appropriate function provided by your language or framework. This will mostly just be doubling apostrophes, but if implemented properly will ensure there aren't other characters that can break the input based on the database's character set etc.
Or, use parametrised prepared queries, where you write the SQL with placeholders, and then provide the data as a separate argument, which is transmitted separately, so cannot be mistaken for SQL (and, consequently, doesn't need quotes around strings). In your case, a parametrised query might look like this: select * from JsonParse(?); with the execute function taking one parameter. (Both options are sketched below.)
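For illustration, a sketch of what the two options can look like when calling the function directly from SQL (the JSON value is shortened):
-- Option 1: escape the data by doubling every single quote inside the literal
SELECT * FROM JsonParse('[{"myArray": ["New Text1 ''abcd''"]}]');

-- Option 2: a placeholder plus separately supplied data; client libraries
-- (Npgsql, JDBC, psycopg, ...) do the equivalent of this PREPARE/EXECUTE,
-- except that the value travels out-of-band instead of as a quoted literal
PREPARE parse_json(json) AS SELECT * FROM JsonParse($1);
EXECUTE parse_json('[{"myArray": ["New Text1 ''abcd''"]}]');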
I was reading PostgreSql documentation here, and came across the following code snippet:
EXECUTE 'SELECT count(*) FROM mytable WHERE inserted_by = $1 AND inserted <= $2'
INTO c
USING checked_user, checked_date;
The documentation states that "This method is often preferable to inserting data values into the command string as text: it avoids run-time overhead of converting the values to text and back, and it is much less prone to SQL-injection attacks since there is no need for quoting or escaping".
Can you show me how this code is prone to SQL injection at all?
Edit: in all other RDBMS I have worked with this would completely prevent SQL injection. What is implemented differently in PostgreSql?
The quick answer is that it isn't, by itself, prone to SQL injection; as I understand your question, you are asking why we don't just say so. Since you are looking for scenarios where this might lead to SQL injection, consider that mytable might be a view, and so could have additional functions behind it. Those functions might be vulnerable to SQL injection.
So you can't look at a query and conclude that it is definitely not susceptible to SQL injection. The best you can do is say that, at the level shown, this specific part of your application does not raise SQL injection concerns.
Here is an example of a case where SQL injection might very well happen:
CREATE OR REPLACE FUNCTION ban_user() returns trigger
language plpgsql security definer as
$$
begin
insert into banned_users (username) values (new.username);
execute 'alter user ' || new.username || ' WITH VALID UNTIL ''YESTERDAY''';
return new;
end;
$$;
Note that utility commands such as ALTER USER cannot take parameters the way you indicate, and we forgot quote_ident() around new.username, thus making the field vulnerable.
CREATE OR REPLACE VIEW banned_users_today AS
SELECT username FROM banned_users where banned_date = 'today';
CREATE TRIGGER i_banned_users_today INSTEAD OF INSERT ON banned_users_today
FOR EACH ROW EXECUTE PROCEDURE ban_user();
EXECUTE 'insert into banned_users_today (username) values ($1)'
USING 'postgres with password ''boo''; drop function ban_user() cascade; --';
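For reference, the quoting that ban_user() forgot could be sketched with format() and its %I specifier (available since PostgreSQL 9.1), which applies identifier quoting to the username:
-- %I quotes new.username as an identifier, so it cannot terminate the
-- ALTER USER statement and smuggle in extra commands
EXECUTE format('ALTER USER %I WITH VALID UNTIL ''YESTERDAY''', new.username);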
So no it doesn't completely solve the problem even if used everywhere it can be used. And proper use of quote_literal() and quote_ident() don't always solve the problem either.
The thing is that the problem can always be at a lower level than the query you are executing.
The bound parameters prevent garbage from manipulating the statement into doing anything other than what it's intended to do.
This guarantees no possibility for SQL-injection attacks short of a Postgres bug. (See H2C03's link for examples of what could go wrong.)
I imagine the "much less prone to SQL-injection attacks" amounts to CYA verbiage in case such a thing were ever to arise.
SQL injection is usually associated with large data dumps on pastebin.com, and that scenario won't work here even if the example used concatenation instead of variables, because the COUNT(*) will aggregate away all the data you'd be trying to steal.
But I can imagine scenarios where the count of arbitrary records would be sufficiently valuable information - e.g. the number of a competitor's clients, the number of products sold, etc. And recalling some of the really tricky blind SQL injection methods, it might be possible to build a query that, using COUNT alone, would allow an attacker to iteratively recover actual text from the database.
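For illustration only, a sketch of what such a probe could look like if the value were concatenated into the SQL text instead of bound (accounts and secret are made-up names):
-- Suppose the attacker supplies, as the "user name":
--   x' OR (SELECT substr(secret, 1, 1) FROM accounts LIMIT 1) = 'a
-- The server would then effectively run:
SELECT count(*) FROM mytable
WHERE inserted_by = 'x' OR (SELECT substr(secret, 1, 1) FROM accounts LIMIT 1) = 'a';
-- A count of 0 vs. non-zero answers "is the first character 'a'?"; repeating this
-- character by character recovers the whole value.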
It would also be much easier to exploit on a database sufficiently old and misconfigured to allow the ; separator, in which case the attacker might just append a completely separate query.
I would love to exploit the SQL output formatting of PostgreSQL inside my PL/pgSQL functions, but I'm starting to feel I have to give up the idea.
I have my PL/pgSQL function query_result:
CREATE OR REPLACE FUNCTION query_result(
this_query text
) RETURNS SETOF record AS
$$
BEGIN
RETURN QUERY EXECUTE this_query;
END;
$$ LANGUAGE plpgsql;
...merrily returning a SETOF record from an input text query, which I can use for my SQL scripting with dynamic queries:
mydb=# SELECT * FROM query_result('SELECT ' || :MYVAR || ' FROM Alice') AS t (id int);
id
----
1
2
3
So my hope was to find a way to deliver this same nicely formatted output from inside a PL/pgSQL function instead, but RAISE does not support SETOF types, and there's no magic predefined cast from SETOF records to text (I know I could create my own CAST..)
If I create a dummy print_result function:
CREATE OR REPLACE FUNCTION print_result(
this_query text
) RETURNS void AS
$$
BEGIN
SELECT query_result(this_query);
END;
$$ LANGUAGE plpgsql;
...I cannot print the formatted output:
mydb=# SELECT print_result('SELECT ' || :MYVAR || ' FROM Alice');
ERROR: set-valued function called in context that cannot accept a set
...
Thanks for any suggestion (which preferably works with PostgreSQL 8.4).
Ok, to do anything with your result set in print_result you'll have to loop over it. That'll look something like this -
Here result_record is defined as a record variable. For the sake of explanation, we'll also assume that you have a formatted_results variable that is defined as text and defaulted to a blank string to hold the formatted results.
FOR result_record IN SELECT * FROM query_result(this_query) AS t (id int) LOOP
-- With all this, you can do something like this
formatted_results := formatted_results ||','|| result_record.id;
END LOOP;
RETURN formatted_results;
So, if you change print_result to return text, declare the variables as I've described and add this in, your function will return a comma-separated list of all your results (with an extra leading comma, which I'm sure you can trim with PostgreSQL's string functions). I'm not sure this is exactly what you want, but this should give you a good idea about how to manipulate your result set. You can get more information here about control structures, which should let you do pretty much whatever you want.
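Putting that together, a complete sketch of print_result along those lines (still assuming the hard-coded AS t (id int) column list) could be:
CREATE OR REPLACE FUNCTION print_result(
this_query text
) RETURNS text AS
$$
DECLARE
result_record record;
formatted_results text := '';
BEGIN
FOR result_record IN SELECT * FROM query_result(this_query) AS t (id int) LOOP
formatted_results := formatted_results || ',' || result_record.id;
END LOOP;
-- drop the leading comma before returning
RETURN ltrim(formatted_results, ',');
END;
$$ LANGUAGE plpgsql;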
EDIT TO ANSWER THE REAL QUESTION:
The ability to format data tuples as readable text is a feature of the psql client, not the PostgreSQL server. To make this feature available in the server would require extracting relevant code or modules from the psql utility and recompiling them as a database function. This seems possible (and it is also possible that someone has already done this), but I am not familiar enough with the process to provide a good description of how to do that. Most likely, the best solution for formatting query results as text will be to make use of PostgreSQL's string formatting functions to implement the features you need for your application.
Is the following SQL susceptible to SQL injection via the @SearchWord parameter?
I want to use parameters with the FormsOf function, but the only guide to doing so I've found is in this Stack Overflow question: How to pass parameter to FormsOf function in sql server
However the solution seems to be to use a bit of dynamic SQL, and I was wondering if that would be susceptible to SQL injection. What would happen in the following example if @SearchWord contained a SQL injection type string? Is it not a problem because it's still within a parameter, passed as an argument to FREETEXTTABLE?
The solution given is:
DECLARE @SearchWord nvarchar(max)
SET @SearchWord = 'tax'
DECLARE @SearchString nvarchar(max)
SET @SearchString = 'FormsOf(INFLECTIONAL, "' + @SearchWord + '")'
SELECT listing_id, RANK, name, address, city, zip, heading, phone
FROM listings a,
FREETEXTTABLE(listings, *, @SearchString)
WHERE [KEY] = a.listing_id
ORDER BY RANK DESC, name
No, it's not susceptible. There's no dynamic SQL here (that would require either using EXEC or sp_executesql), so there's no vector for SQL injection.
In order for a SQL injection vulnerability to exist, the user-supplied string (in this case @SearchWord) must actually be inserted directly into the text of the SQL statement. Here, it's only being used to construct another string variable, which is subsequently used as a parameter to another SQL statement.
This statement can, however, fail if the user inputs an "invalid" search word - in particular one containing double quotes, which would break the FormsOf(...) syntax since the word is wrapped in double quotes - so you should probably still sanitise whatever value is passed to @SearchWord. But it cannot be used to execute arbitrary SQL.
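For example, one simple way to sanitise it - a sketch in T-SQL, assuming double quotes are the only character that can break the FormsOf(...) syntax - is to strip them before building the search string:
-- remove double quotes so the search word cannot terminate the quoted phrase early
SET @SearchWord = REPLACE(@SearchWord, '"', '');
SET @SearchString = 'FormsOf(INFLECTIONAL, "' + @SearchWord + '")';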
I haven't tested this, but I don't think the interpreter is simply pasting the value of @SearchString into the statement. It should parse @SearchString using the rules that FREETEXTTABLE expects - that's the way other parameters work.