Recognizing invalid dates in postgresql - postgresql

I am trying to parse dirty input into postgres tables. I have a problem with a 'date' field occasionally containing non-dates such as '00000000' or '20100100'. pg refuses to accept these, and rightly so.
Is there a way to have postgres recognize invalid dates (or only valid dates, if that works better), so I can substitute a sensible default?
(I've considered building a table listing the dates I'm willing to accept, and use that in a sub-select, but that seems awfully inelegant.)
Cheers,
Jurgen

http://www.tek-tips.com/viewthread.cfm?qid=1280050&page=9
A more generic approach than the above:
create function safe_cast(text,anyelement)
returns anyelement
language plpgsql as $$
begin
$0 := $1;
return $0;
exception when others then
return $2;
end; $$;
Used like this:
select safe_cast('Jan 10, 2009', '2011-01-01'::timestamp)
select safe_cast('Jan 10, 2009', null::timestamp)
Credited to the friendly dudes at the #postgresql irc channel. :)

You could write a pgsql function with an exception handling block.

Related

PL/PostgreSQL how to convert a variable into a table name

So I have a function in PostgreSQL that dynamically selects columns from a dynamic table. I got this solution from this post and it works great other than one thing.
This is inside of a file that is connected to a Node server, and so the $1 and $2 in the second SELECT * FROM represent values passed from there. The issue right now is that I am getting a syntax error that I don't understand (I am newer to SQL so that may be why).
$2 represents the name of the table to be selected from as a string, so for example it could be 'goals'. The error is syntax error at or near "'goals'". I realize that it cannot be a string with single quotes (I believe) and so I am wondering how to convert that variable to be a table name? using "goals" there as well as goals, for example works as expected, though I'm not sure how to do that outside of a function.
CREATE OR REPLACE FUNCTION get_data(user_id INT, table_name anyelement)
RETURNS SETOF ANYELEMENT AS $$
BEGIN
RETURN QUERY EXECUTE
format('SELECT * FROM %s WHERE user_id = $1', pg_typeof(table_name)) USING user_id;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM get_data($1, NULL::$2);
$1 is 5 and $2 is 'goals' for example
After many hours of trying to figure it out, thanks to Adrian's comment, I found MassiveJS (how I'm connecting to my PostgreSQL server) has inline functions to do queries. In my controller file in my server I was able to create a one line function as such:
const data = await db[tableName].where("user_id=$1", [userId])
Didn't know inline SQL existed in MassiveJS, so that was great to find out!

How to do variable substitution in plpgsql?

I've got a bit of complex sql code that I'm converting from MSSql to Postgres (using Entity Framework Core 2.1), to deal with potential race conditions when inserting to a table with a unique index. Here's the dumbed-down version:
const string QUERY = #"
DO
$$
BEGIN
insert into Foo (Field1,Field2,Field3)
values (#value1,#value2,#value3);
EXCEPTION WHEN others THEN
-- do nothing; it's a race condition
END;
$$ LANGUAGE plpgsql;
select *
from Foo
where Field1 = #value1
and Field2 = #value2;
";
return DbContext.Foos
.FromSql(QUERY,
new NpgsqlParameter("value1", value1),
new NpgsqlParameter("value2", value2),
new NpgsqlParameter("value3", value3))
.First();
In other words, try to insert the record, but don't throw an exception if the attempt to insert it results in a unique index violation (index is on Field1+Field2), and return the record, whether it was created by me or another thread.
This concept worked fine in MSSql, using a TRY..CATCH block. As far as I can tell, the way to handle Postgres exceptions is as I've done, in a plpgsql block.
BUT...
It appears that variable substitution in plpgsql blocks doesn't work. The code above fails on the .First() (no elements in sequence), and when I comment out the EXCEPTION line, I see the real problem, which is:
Npgsql.PostgresException : 42703: column "value1" does not exist
When I test using regular Sql, i.e. doing the insert without using a plpgsql block, this works fine.
So, what is the correct way to do variable substitution in a plpgsql block?
The reason this doesn't work is that the body of the DO statement is actually a string, a text. See reference
$$ is just another way to delimit text in postgresql. It can be just as well be replaced with ' or $somestuff$.
As it is a string, Npgsql and Postgresql have no reason to mess with #value1 in it.
Solutions? Only a very ugly one, so not using this construction, as you're not able to pass it any values. And messing with string concatenation is no different than doing concatenation in C# in the first place.
Alternatives? Yes!
You don't need to handle exceptions in plpgsql blocks. Simply insert, use the ON CONFLICT DO NOTHING, and be on your way.
INSERT INTO Foo (Field1,Field2,Field3)
VALUES (#value1,#value2,#value3)
ON CONFLICT DO NOTHING;
select *
from Foo
where Field1 = #value1
and Field2 = #value2;
Or if you really want to keep using plpgsql, you can simply create a temporary table, using the ON COMMIT DROP option, fill it up with these parameters as one row, then use it in the DO statement. For that to work all your code must execute as part of one transaction. You can use one explicitly just in case.
The only ways to pass parameters to plpgsql code is via these 2 methods:
Declaring a function, then calling it with arguments
When already inside a plpgsql block you can call:
EXECUTE $$ INSERT ... VALUES ($1, $2, $3); $$ USING 3, 'text value', 5.234;
End notes:
As a fellow T-SQL developer who loved its freedom, but transitioned to Postgresql, I have to say that the BIG difference is that on one side there's T-SQL which gives the power, and on the other side it's a very powerful Postgresql-flavored SQL. plpgsql is very rarely warranted. In fact, in a code base of megabytes of complex SQL stuff, I can rewrite pretty much every plpgsql code in SQL. That's how powerful it really is compared to MSSQL-flavored SQL. It just takes some getting used to, and befriending the very ample documentation. Good luck!

Is this actually open to SQL injection?

We have a function that builds some XML and uses EXECUTE/USING to try to prevent SQL injection. It's along the lines of:
CREATE OR REPLACE FUNCTION public.f1(t TEXT)
RETURNS XML AS
$BODY$
DECLARE
ret_val XML;
BEGIN
EXECUTE 'SELECT ''<'' || $1 || ''>'''
--rest of xml building
INTO ret_val
USING t;
RETURN ret_val;
END
$BODY$
LANGUAGE plpgsql IMMUTABLE;
I don't like this much due to the concatenation. It would be much better if we could just do SELECT '''<$1>'' but that ends up with a literal $1 rather than replacing it with the value.
Due to the concatenation, it's making me wonder whether we even need SQL injection prevention here. It's not reading from a table, just building an XML string which is returned.
Is there any actual risk from not using USING in this case? Does concatenating $1 negate the effects of USING, or does USING even have any effect on a statement that doesn't use a table?
There are a few things to unpack here.
Firstly, the SQL you have there is actually a fixed string:
'SELECT ''<'' || $1 || ''>'''
So nothing can be directly injected here, because there's no dynamic SQL. As Laurenz Albe pointed out, the SQL in this particular example could be written as a non-dynamic statement:
SELECT '<' || t || '>'
There is still no SQL injection here, because you're not evaluating the contents of t, just manipulating it as a string, just as SELECT a + 1 would manipulate a as a number.
The key point is that the actual SQL is hard-coded, and the concatenation is just the instruction in that SQL.
Note that this similar-looking query would be dangerous (the syntax highlighting gives a clue to the difference):
EXECUTE 'SELECT ''<' || t || '>''' -- DON'T DO THIS!
Here, the value of t is being used as part of the SQL string - the concatenation is happening first, and then the result is executed. So a value of '1'; DROP TABLE users; --' would result in the query SELECT '<1'; DROP TABLE users; -- which is clearly undesirable.
Secondly, as explained in the docs, the $1 is a parameter, supplied by the USING clause as data, so it too is safe from SQL injection. This is similar to using a parameterised query in a programming language outside the database - you build up the query as a string, carefully whitelisting the tables and columns referenced, then provide the variable data separately, where it cannot be reinterpreted as SQL. Or to put it another way, it's like another function "one level deeper", with the parameters specified by the USING clause acting just like the parameters to an actual function.
Finally, though, a note of caution: you are vulnerable to XML injection: if nothing else has validated or escaped t, you could generate invalid or dangerous XML. For instance, consider what would happen if the value of t was 'foo><script>alert("Hello, world!");</script></foo' and the result ended up parsed as HTML.
There is no danger for SQL injection here, because you are using USING.
Note that you could have been using static SQL to achieve the same thing:
SELECT '<' || t || '>' INTO ret_val;

How can I send column values as the payload in a postgresql NOTIFY message?

If an entry in a table satisfies certain conditions, a NOTIFY is sent out. I want the payload to include the ID number and several other columns of information. Is there a postgres method to convert variables (OLD.ColumnID, etc) to strings?
using postgres 9.3
#klin is correct that NOTIFY doesn't support anything other than string literals. However there is a function pg_notify() which takes normal arguments to deal with exactly this situation. It's been around since at least 9.0 and that link is to the official documentation - always worth reading it carefully, there is a wealth of information there.
My guess is that the notify has to be done within a trigger function. Use a dynamic query, e.g.
execute format('notify channel, ''id: %s''', old.id);
The solution was to upgrade Postgres to a version that supported JSON.
Even postgresql 9.3 supports json. You could have just used row_to_json(payload)::text
Sorry for the long answer, i just cant walk away without reacting to the other answers too.
The format version fails in many ways. Before EXECUTE, you shoud prepare the plan. The "pseudo command" does not fits the syntax of execute which is
EXECUTE somepreparedplanname (parameter1, ...)
The %s in format is again too bad, this way you can summon sql injection attacks. When constructing a query with format, you need to use %L for literals %I for column/table/function/etc ids, and use %s almost never.
The other solution with the pg_notify function is correct. Try
LISTEN channel;
SELECT pg_notify('channel','Id: '|| pg_backend_pid ());
in psql command line.
So back to the original question: sdemurjian,
Its not clarified in the question, if you wants to use this notification thing in some trigger function. So here is an example (maybe not) for you (because im a little late. sorry for that too):
CREATE TABLE columns("columnID" oid, "columnData" text);
CREATE FUNCTION column_trigger_func() RETURNS TRIGGER AS
$$ BEGIN PERFORM pg_notify('columnchannel', 'Id: '||OLD."columnID");
RETURN NEW; END; $$ LANGUAGE plpgsql;
CREATE TRIGGER column_notify BEFORE UPDATE ON columns FOR EACH ROW
EXECUTE PROCEDURE column_trigger_func();
LISTEN columnchannel;
INSERT INTO columns VALUES(1,'testdata');
BEGIN; UPDATE columns SET "columnData" = 'success'; END;
BEGIN; UPDATE columns SET "columnData" = 'fail'; ROLLBACK;
Please note that in early postgres versions (any before 9), the notify command does not accepts any payload and there is no pg_notify function.
In 8.1 the trigger function stil works if you define it like
CREATE FUNCTION column_trigger_func() RETURNS TRIGGER AS
$$ BEGIN NOTIFY columnchannel; RETURN NEW; END; $$ LANGUAGE plpgsql;

Print ASCII-art formatted SETOF records from inside a PL/pgSQL function

I would love to exploit the SQL output formatting of PostgreSQL inside my PL/pgSQL functions, but I'm starting to feel I have to give up the idea.
I have my PL/pgSQL function query_result:
CREATE OR REPLACE FUNCTION query_result(
this_query text
) RETURNS SETOF record AS
$$
BEGIN
RETURN QUERY EXECUTE this_query;
END;
$$ LANGUAGE plpgsql;
..merrily returning a SETOF records from an input text query, and which I can use for my SQL scripting with dynamic queries:
mydb=# SELECT * FROM query_result('SELECT ' || :MYVAR || ' FROM Alice') AS t (id int);
id
----
1
2
3
So my hope was to find a way to deliver this same nicely formatted output from inside a PL/pgSQL function instead, but RAISE does not support SETOF types, and there's no magic predefined cast from SETOF records to text (I know I could create my own CAST..)
If I create a dummy print_result function:
CREATE OR REPLACE FUNCTION print_result(
this_query text
) RETURNS void AS
$$
BEGIN
SELECT query_result(this_query);
END;
$$ LANGUAGE plpgsql;
..I cannot print the formatted output:
mydb=# SELECT print_result('SELECT ' || :MYVAR || ' FROM Alice');
ERROR: set-valued function called in context that cannot accept a set
...
Thanks for any suggestion (which preferably works with PostgreSQL 8.4).
Ok, to do anything with your result set in print_result you'll have to loop over it. That'll look something like this -
Here result_record is defined as a record variable. For the sake of explanation, we'll also assume that you have a formatted_results variable that is defined as text and defaulted to a blank string to hold the formatted results.
FOR result_record IN SELECT * FROM query_result(this_query) AS t (id int) LOOP
-- With all this, you can do something like this
formatted_results := formatted_results ||','|| result_record.id;
END LOOP;
RETURN formatted_results;
So, if you change print_results to return text, declare the variables as I've described and add this in, your function will return a comma-separated list of all your results (with an extra comma at the end, I'm sure you can make use of PostgreSQL's string functions to trim that). I'm not sure this is exactly what you want, but this should give you a good idea about how to manipulate your result set. You can get more information here about control structures, which should let you do pretty much whatever you want.
EDIT TO ANSWER THE REAL QUESTION:
The ability to format data tuples as readable text is a feature of the psql client, not the PostgreSQL server. To make this feature available in the server would require extracting relevant code or modules from the psql utility and recompiling them as a database function. This seems possible (and it is also possible that someone has already done this), but I am not familiar enough with the process to provide a good description of how to do that. Most likely, the best solution for formatting query results as text will be to make use of PostgreSQL's string formatting functions to implement the features you need for your application.