Issue while storing json value with single quotes - postgresql

I am facing an issue while storing a JSON value that contains text with single quotes.
Below is the scenario,
My Stored Procedure:
CREATE OR REPLACE FUNCTION JsonParse(inputdata json)
RETURNS void AS $$
DECLARE
BEGIN
UPDATE
MyTable
SET
settings_details= inputdata
WHERE
settings_key='my-list';
END;
$$
LANGUAGE PLPGSQL;
select * from JsonParse('[{
"myArray": ["New Text1 ''abcd''", "New Text1 ''abcd''"]
},
{"myArray": ["New Text1 ''abcd''", "New Text1 ''abcd''"]}]');
I get the below error :
ERROR: syntax error at or near "abcd"
I can add an extra single quote around abcd and efgh, which solves the issue. But the problem is that I don't have control over the input text passed to the JsonParse procedure.
The stored procedure should be capable of handling this.
How can I tackle this?

Your stored procedure is not rejecting this; the SQL you are running is malformed.
Try a simpler example:
SELECT 'test o'reilly' AS FOO;
As the syntax highlighting shows, the SQL parser will think the string is 'test o', and the reilly is a syntax error.
This is a sign that you are not correctly handling data input into your SQL queries, and are probably vulnerable to SQL injection attacks (what if I supply you with the "JSON" string {}'); DROP TABLE users; --).
Everywhere that you add data to an SQL query, you need to do one of two things:
Escape the input using an appropriate function provided by your language or framework. This will mostly just be doubling apostrophes, but if implemented properly will ensure there aren't other characters that can break the input based on the database's character set etc.
Or, use parametrised prepared queries, where you write the SQL with placeholders, and then provide the data as a separate argument, which is transmitted separately, so cannot be mistaken for SQL (and, consequently, doesn't need quotes around strings). In your case, a parametrised query might look like this: select * from JsonParse(?); with the execute function taking one parameter.
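As a minimal sketch of the second option, the Python snippet below uses the standard-library sqlite3 module as a stand-in for a PostgreSQL driver (an assumption, made only so the example runs without a server). The table and column names follow the question; store_settings is a hypothetical helper. The JSON text contains raw single quotes but is passed as a bound parameter, so nothing needs to be doubled.

```python
import json
import sqlite3


def store_settings(conn, key, payload):
    """Store a JSON document under `key` using bound parameters."""
    # The SQL text is fixed; the payload travels separately from it.
    conn.execute(
        "UPDATE MyTable SET settings_details = ? WHERE settings_key = ?",
        (payload, key),
    )


# In-memory database standing in for the real PostgreSQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (settings_key TEXT PRIMARY KEY, settings_details TEXT)"
)
conn.execute("INSERT INTO MyTable VALUES ('my-list', NULL)")

# JSON whose strings contain raw single quotes, plus a mock injection attempt.
payload = json.dumps(
    [{"myArray": ["New Text1 'abcd'", "{}'); DROP TABLE MyTable; --"]}]
)
store_settings(conn, "my-list", payload)

stored = conn.execute(
    "SELECT settings_details FROM MyTable WHERE settings_key = ?", ("my-list",)
).fetchone()[0]
print(stored == payload)
```

With a PostgreSQL driver the placeholder syntax differs (psycopg, for example, uses %s rather than ?), but the idea is the same: the query text never contains the data.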

Variable column name on function

I'm new to pgsql but have 8 years of experience with MSSQL. What I'm trying to achieve is a function that removes invalid data from names: it should remove all special characters, numbers and accents, keeping only spaces and a-Z characters. I want to use it on columns of different tables, but I can't find what I'm doing wrong.
Here is my code:
CREATE OR REPLACE FUNCTION f_validaNome (VARCHAR(255))
RETURNS VARCHAR(255) AS
SELECT regexp_replace(unaccent($1), '[^[:alpha:]\s]', '', 'g')
COMMIT
If I run
SELECT regexp_replace(unaccent(column_name), '[^[:alpha:]\s]', '', 'g')
from TableA
my code runs fine. I don't know exactly what is wrong with the function code.
That's not how functions are written in Postgres.
As documented in the manual the function's body must be passed as a string and you need to specify which language the function is written in. Functions can be written in SQL, PL/pgSQL, PL/python, PL/perl and many others. There is also no need to reference parameters by position. Passing a dollar quoted string makes writing the function body easier.
For what you are doing, a simple SQL function is enough. It's also unnecessary to use an arbitrary character limit like 255 (which does not have any performance or storage advantages over any other defined max length). So just use text.
CREATE OR REPLACE FUNCTION f_validanome (p_input text)
RETURNS text
AS
$body$ --<< string starts here.
SELECT regexp_replace(unaccent(p_input), '[^[:alpha:]\s]', '', 'g'); --<< required ; at the end
$body$ --<< string ends here
language sql
immutable; --<< required ; at the end
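For readers who want to check what the function is expected to return without a database at hand, here is a rough Python approximation of the same logic. It is only a sketch: valida_nome is a hypothetical name, and Unicode decomposition is assumed to be close enough to what unaccent() does for common accented letters, which is not guaranteed for every character.

```python
import unicodedata


def valida_nome(name: str) -> str:
    """Strip accents, then keep only letters and whitespace."""
    # Decompose accented characters and drop the combining marks,
    # a rough approximation of unaccent().
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Roughly the effect of removing everything that is not a letter
    # or whitespace, as the regexp_replace call does.
    return "".join(ch for ch in unaccented if ch.isalpha() or ch.isspace())


print(valida_nome("João D'Ávila-123"))
```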

How to do variable substitution in plpgsql?

I've got a bit of complex sql code that I'm converting from MSSql to Postgres (using Entity Framework Core 2.1), to deal with potential race conditions when inserting to a table with a unique index. Here's the dumbed-down version:
const string QUERY = @"
DO
$$
BEGIN
insert into Foo (Field1,Field2,Field3)
values (@value1,@value2,@value3);
EXCEPTION WHEN others THEN
-- do nothing; it's a race condition
END;
$$ LANGUAGE plpgsql;
select *
from Foo
where Field1 = @value1
and Field2 = @value2;
";
return DbContext.Foos
.FromSql(QUERY,
new NpgsqlParameter("value1", value1),
new NpgsqlParameter("value2", value2),
new NpgsqlParameter("value3", value3))
.First();
In other words, try to insert the record, but don't throw an exception if the attempt to insert it results in a unique index violation (index is on Field1+Field2), and return the record, whether it was created by me or another thread.
This concept worked fine in MSSql, using a TRY..CATCH block. As far as I can tell, the way to handle Postgres exceptions is as I've done, in a plpgsql block.
BUT...
It appears that variable substitution in plpgsql blocks doesn't work. The code above fails on the .First() (no elements in sequence), and when I comment out the EXCEPTION line, I see the real problem, which is:
Npgsql.PostgresException : 42703: column "value1" does not exist
When I test using regular Sql, i.e. doing the insert without using a plpgsql block, this works fine.
So, what is the correct way to do variable substitution in a plpgsql block?
The reason this doesn't work is that the body of the DO statement is actually a string (a text value). See the documentation for DO.
$$ is just another way to delimit text in Postgresql. It can just as well be replaced with ' or $somestuff$.
As it is a string, Npgsql and Postgresql have no reason to touch @value1 inside it.
Solutions? Only very ugly ones, so it's better not to use this construction, since you can't pass it any values. And messing with string concatenation is no different from doing the concatenation in C# in the first place.
Alternatives? Yes!
You don't need to handle exceptions in plpgsql blocks. Simply insert with ON CONFLICT DO NOTHING, and be on your way.
INSERT INTO Foo (Field1,Field2,Field3)
VALUES (@value1,@value2,@value3)
ON CONFLICT DO NOTHING;
select *
from Foo
where Field1 = @value1
and Field2 = @value2;
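The insert-then-select pattern can be tried out with a small sketch. It uses Python's sqlite3 module rather than Npgsql and PostgreSQL (an assumption, made so the example is self-contained; SQLite accepts the same ON CONFLICT DO NOTHING clause from version 3.24 on). get_or_create is a hypothetical helper name.

```python
import sqlite3


def get_or_create(conn, value1, value2, value3):
    """Insert the row if absent, then return the row that exists."""
    # A conflicting row (same Field1 + Field2) turns the insert into a
    # no-op instead of raising a unique-violation error.
    conn.execute(
        "INSERT INTO Foo (Field1, Field2, Field3) VALUES (?, ?, ?) "
        "ON CONFLICT DO NOTHING",
        (value1, value2, value3),
    )
    return conn.execute(
        "SELECT Field1, Field2, Field3 FROM Foo "
        "WHERE Field1 = ? AND Field2 = ?",
        (value1, value2),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Foo (Field1 INTEGER, Field2 INTEGER, Field3 TEXT)")
conn.execute("CREATE UNIQUE INDEX foo_f1_f2 ON Foo (Field1, Field2)")

first = get_or_create(conn, 1, 2, "mine")
# The second caller "loses the race" and simply gets the existing row.
second = get_or_create(conn, 1, 2, "theirs")
print(first, second)
```

Both calls return the row written by the first caller, which is the behaviour the question asks for.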
Or if you really want to keep using plpgsql, you can simply create a temporary table, using the ON COMMIT DROP option, fill it up with these parameters as one row, then use it in the DO statement. For that to work all your code must execute as part of one transaction. You can use one explicitly just in case.
The only ways to pass parameters to plpgsql code are these 2 methods:
Declaring a function, then calling it with arguments
When already inside a plpgsql block you can call:
EXECUTE $$ INSERT ... VALUES ($1, $2, $3); $$ USING 3, 'text value', 5.234;
End notes:
As a fellow T-SQL developer who loved its freedom, but transitioned to Postgresql, I have to say that the BIG difference is that on one side there's T-SQL which gives the power, and on the other side it's a very powerful Postgresql-flavored SQL. plpgsql is very rarely warranted. In fact, in a code base of megabytes of complex SQL stuff, I can rewrite pretty much every plpgsql code in SQL. That's how powerful it really is compared to MSSQL-flavored SQL. It just takes some getting used to, and befriending the very ample documentation. Good luck!

Is this actually open to SQL injection?

We have a function that builds some XML and uses EXECUTE/USING to try to prevent SQL injection. It's along the lines of:
CREATE OR REPLACE FUNCTION public.f1(t TEXT)
RETURNS XML AS
$BODY$
DECLARE
ret_val XML;
BEGIN
EXECUTE 'SELECT ''<'' || $1 || ''>'''
--rest of xml building
INTO ret_val
USING t;
RETURN ret_val;
END
$BODY$
LANGUAGE plpgsql IMMUTABLE;
I don't like this much due to the concatenation. It would be much better if we could just do SELECT '''<$1>'' but that ends up with a literal $1 rather than replacing it with the value.
Due to the concatenation, it's making me wonder whether we even need SQL injection prevention here. It's not reading from a table, just building an XML string which is returned.
Is there any actual risk from not using USING in this case? Does concatenating $1 negate the effects of USING, or does USING even have any effect on a statement that doesn't use a table?
There are a few things to unpack here.
Firstly, the SQL you have there is actually a fixed string:
'SELECT ''<'' || $1 || ''>'''
So nothing can be directly injected here, because there's no dynamic SQL. As Laurenz Albe pointed out, the SQL in this particular example could be written as a non-dynamic statement:
SELECT '<' || t || '>'
There is still no SQL injection here, because you're not evaluating the contents of t, just manipulating it as a string, just as SELECT a + 1 would manipulate a as a number.
The key point is that the actual SQL is hard-coded, and the concatenation is just the instruction in that SQL.
Note that this similar-looking query would be dangerous (the syntax highlighting gives a clue to the difference):
EXECUTE 'SELECT ''<' || t || '>''' -- DON'T DO THIS!
Here, the value of t is being used as part of the SQL string - the concatenation is happening first, and then the result is executed. So a value of 1'; DROP TABLE users; -- would result in the query SELECT '<1'; DROP TABLE users; -->' (with the trailing >' commented out), which is clearly undesirable.
Secondly, as explained in the docs, the $1 is a parameter, supplied by the USING clause as data, so it too is safe from SQL injection. This is similar to using a parameterised query in a programming language outside the database - you build up the query as a string, carefully whitelisting the tables and columns referenced, then provide the variable data separately, where it cannot be reinterpreted as SQL. Or to put it another way, it's like another function "one level deeper", with the parameters specified by the USING clause acting just like the parameters to an actual function.
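The difference between the two patterns can be seen in a short sketch. It uses Python with the standard-library sqlite3 module as a stand-in for PL/pgSQL's EXECUTE ... USING (an assumption for illustration; only the binding principle carries over, not the syntax).

```python
import sqlite3

t = "1'; DROP TABLE users; --"

# Unsafe pattern: the value becomes part of the SQL text before parsing.
unsafe_sql = "SELECT '<" + t + ">'"
print(unsafe_sql)

# Safe pattern: the SQL text is fixed and the value is bound as data,
# so the concatenation runs inside the query on a plain string.
conn = sqlite3.connect(":memory:")
safe_result = conn.execute("SELECT '<' || ? || '>'", (t,)).fetchone()[0]
print(safe_result)
```

The first print shows a query text that now contains a second statement; the second shows the value coming back untouched as part of an ordinary string.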
Finally, though, a note of caution: you are vulnerable to XML injection: if nothing else has validated or escaped t, you could generate invalid or dangerous XML. For instance, consider what would happen if the value of t was 'foo><script>alert("Hello, world!");</script></foo' and the result ended up parsed as HTML.
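One way to guard against this, sketched here in Python, is to escape the value before it is placed in the document. The sketch assumes the value ends up as element text inside a hypothetical <name> element; if it is really used as a tag name, as in the question, escaping is not enough and the value should be validated against an allowed pattern instead.

```python
from xml.sax.saxutils import escape


def wrap_text(t: str) -> str:
    """Return t as the text content of a hypothetical <name> element."""
    # escape() replaces &, < and > so the value cannot open or close tags.
    return "<name>" + escape(t) + "</name>"


malicious = 'foo><script>alert("Hello, world!");</script></foo'
print(wrap_text(malicious))
```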
There is no danger for SQL injection here, because you are using USING.
Note that you could have been using static SQL to achieve the same thing:
SELECT '<' || t || '>' INTO ret_val;

Print ASCII-art formatted SETOF records from inside a PL/pgSQL function

I would love to exploit the SQL output formatting of PostgreSQL inside my PL/pgSQL functions, but I'm starting to feel I have to give up the idea.
I have my PL/pgSQL function query_result:
CREATE OR REPLACE FUNCTION query_result(
this_query text
) RETURNS SETOF record AS
$$
BEGIN
RETURN QUERY EXECUTE this_query;
END;
$$ LANGUAGE plpgsql;
..merrily returning a SETOF records from an input text query, and which I can use for my SQL scripting with dynamic queries:
mydb=# SELECT * FROM query_result('SELECT ' || :MYVAR || ' FROM Alice') AS t (id int);
id
----
1
2
3
So my hope was to find a way to deliver this same nicely formatted output from inside a PL/pgSQL function instead, but RAISE does not support SETOF types, and there's no magic predefined cast from SETOF records to text (I know I could create my own CAST..)
If I create a dummy print_result function:
CREATE OR REPLACE FUNCTION print_result(
this_query text
) RETURNS void AS
$$
BEGIN
SELECT query_result(this_query);
END;
$$ LANGUAGE plpgsql;
..I cannot print the formatted output:
mydb=# SELECT print_result('SELECT ' || :MYVAR || ' FROM Alice');
ERROR: set-valued function called in context that cannot accept a set
...
Thanks for any suggestion (which preferably works with PostgreSQL 8.4).
To do anything with your result set in print_result, you'll have to loop over it.
In the snippet that follows, result_record is declared as a record variable, and formatted_results is a text variable, initialized to an empty string, that accumulates the formatted results:
FOR result_record IN SELECT * FROM query_result(this_query) AS t (id int) LOOP
-- With all this, you can do something like this
formatted_results := formatted_results ||','|| result_record.id;
END LOOP;
RETURN formatted_results;
So, if you change print_result to return text, declare the variables as I've described and add this in, your function will return a comma-separated list of all your results (with an extra comma at the start, which you can trim with PostgreSQL's string functions). I'm not sure this is exactly what you want, but it should give you a good idea of how to manipulate your result set. The PostgreSQL documentation on control structures has more information, which should let you do pretty much whatever you want.
EDIT TO ANSWER THE REAL QUESTION:
The ability to format data tuples as readable text is a feature of the psql client, not the PostgreSQL server. To make this feature available in the server would require extracting relevant code or modules from the psql utility and recompiling them as a database function. This seems possible (and it is also possible that someone has already done this), but I am not familiar enough with the process to provide a good description of how to do that. Most likely, the best solution for formatting query results as text will be to make use of PostgreSQL's string formatting functions to implement the features you need for your application.
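Since the formatting lives in the client, a client-side sketch may be the most practical route. The Python function below renders rows as an aligned text table loosely modelled on psql's output. It is a simplified illustration, not psql's actual algorithm: every column is left-aligned and there is no row-count footer, and format_table is a hypothetical name.

```python
def format_table(headers, rows):
    """Render rows as a simple aligned text table."""
    cells = [[str(v) for v in row] for row in rows]
    # Each column is as wide as its longest header or value.
    widths = [
        max([len(h)] + [len(r[i]) for r in cells])
        for i, h in enumerate(headers)
    ]

    def line(values):
        # Pad every cell to its column width, then drop trailing spaces.
        return " | ".join(v.ljust(w) for v, w in zip(values, widths)).rstrip()

    separator = "-+-".join("-" * w for w in widths)
    return "\n".join([line(headers), separator] + [line(r) for r in cells])


print(format_table(["id", "name"], [(1, "Alice"), (2, "Bob")]))
```

The rows would come from whatever driver the application uses; the function only needs the column names and an iterable of tuples.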

Parameters in the FormsOf function and SQL injection

Is the following SQL susceptible to SQL injection via the @SearchWord parameter?
I want to use parameters with the FormsOf function, but the only guide to doing so I've found is in this Stack Overflow question: How to pass parameter to FormsOf function in sql server
However the solution seems to be to use a bit of dynamic SQL, and I was wondering if that would be susceptible to SQL injection. What would happen in the following example if @SearchWord contained a SQL injection type string? Is it not a problem because it's still within a parameter, passed as an argument to FREETEXTTABLE?
The solution given is:
DECLARE @SearchWord nvarchar(max)
SET @SearchWord = 'tax'
DECLARE @SearchString nvarchar(max)
SET @SearchString = 'FormsOf(INFLECTIONAL, "' + @SearchWord + '")'
SELECT listing_id, RANK, name, address, city, zip, heading, phone
FROM listings a,
FREETEXTTABLE(listings, *, @SearchString)
WHERE [KEY] = a.listing_id
ORDER BY RANK DESC, name
No, it's not susceptible. There's no dynamic SQL here (that would require either using EXEC or sp_executesql), so there's no vector for SQL injection.
In order for a SQL injection vulnerability to exist, the user-supplied string (in this case @SearchWord) must actually be inserted directly into the text of the SQL statement. Here, it's only being used to construct another string variable, which is subsequently used as a parameter to another SQL statement.
This statement can, however, fail if the user inputs an "invalid" search word, i.e. one containing single quotes, so you should probably still escape whatever value is passed to @SearchWord. But it cannot be used to execute arbitrary SQL.
I haven't tested this, but I don't think the interpreter is simply pasting the value of @SearchString into the statement. It should parse @SearchString using the rules that FREETEXTTABLE expects--that's the way other parameters work.