postgres function 101: returning text - postgresql

I have a function like this:
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS 'select foo;' LANGUAGE sql;
Except it doesn't work. Neither does RETURN TEXT "SELECT 'foo';"
How can I keep it written in SQL, but still return text?

I think this is the least change you need to make it work.
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS
'select ''foo''::text;'
LANGUAGE sql;
You'll see that the SQL statement--the body of the function--is a string. Strings have to be quoted, and single quotes within a quoted string have to be escaped. Soon you have more quotation marks than actual text.
The usual way to write something like this is to use a dollar quoted string constant.
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS
$$
select 'foo'::text;
$$
LANGUAGE sql;

Related

Postgres replacing 'text' with e'text'

I inserted a bunch of rows with a text field like content='...\n...\n...'.
I didn't use e in front, like conent=e'...\n...\n..., so now \n is not actually displayed as a newline - it's printed as text.
How do I fix this, i.e. how to change every row's content field from '...' to e'...'?
The syntax variant E'string' makes Postgres interpret the given string as Posix escape string. \n encoding a newline is only one of many interpreted escape sequences (even if the most common one). See:
Insert text with single quotes in PostgreSQL
To "re-evaluate" your Posix escape string, you could use a simple function with dynamic SQL like this:
CREATE OR REPLACE FUNCTION f_eval_posix_escapes(INOUT _string text)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'SELECT E''' || _string || '''' INTO _string;
END
$func$;
WARNING 1: This is inherently unsafe! We have to evaluate input strings dynamically without quoting and escaping, which allows SQL injection. Only use this in a safe environment.
WARNING 2: Don't apply repeatedly. Or it will misinterpret your actual string with genuine \ characters, etc.
WARNING 3: This simple function is imperfect as it cannot cope with nested single quotes properly. If you have some of those, consider instead:
Unescape a string with escaped newlines and carriage returns
Apply:
UPDATE tbl
SET content = f_eval_posix_escapes(content)
WHERE content IS DISTINCT FROM f_eval_posix_escapes(content);
db<>fiddle here
Note the added WHERE clause to skip updates that would not change anything. See:
How do I (or can I) SELECT DISTINCT on multiple columns?
Use REPLACE in an update query. Something like this: (I'm on mobile so please ignore any typo or syntax erro)
UPDATE table
SET
column = REPLACE(column, '\n', e'\n')

Is there a mechanism in SQL to escape a variable?

I will write a stored procedure in PostgreSQL which accepts a variable (my knowledge of SQL is close to zero, so I apologize if the question is obvious). Since this variable will be used verbatim in the call, I wanted to ensure that it is properly escaped to avoid injection.
Is there a function I can wrap the variable in, which would properly do the escaping?
I specifically would like to do that in SQL, as opposed to sanitizing the input (that variable) in the code which calls the SQL query (which would have arguably been easier).
I am surprised not to find any prominent documentation about such a functionality, which leads me to believe that this is not a standard practice. The closest I could get to was with the lexer source code of Postgresql but this is beyond my capacities to understand whether this is the right escaping that is mentioned (and which would lead to string being used as u&’stringuescape’’’, which looks quite barbaric)
There are several quoting functions in PostgreSQL, documented at https://www.postgresql.org/docs/current/functions-string.html
quote_ident(string text) text Return the given string suitably quoted to be used as an identifier in an SQL statement string. Quotes are added only if necessary (i.e., if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled. See also Example 40-1. quote_ident('Foo bar') "Foo bar"
quote_literal(string text) text Return the given string suitably quoted to be used as a string literal in an SQL statement string. Embedded single-quotes and backslashes are properly doubled. Note that quote_literal returns null on null input; if the argument might be null, quote_nullable is often more suitable. See also Example 40-1. quote_literal(E'O\'Reilly') 'O''Reilly'
quote_literal(value anyelement) text Coerce the given value to text and then quote it as a literal. Embedded single-quotes and backslashes are properly doubled. quote_literal(42.5) '42.5'
quote_nullable(string text) text Return the given string suitably quoted to be used as a string literal in an SQL statement string; or, if the argument is null, return NULL. Embedded single-quotes and backslashes are properly doubled. See also Example 40-1. quote_nullable(NULL) NULL
quote_nullable(value anyelement) text Coerce the given value to text and then quote it as a literal; or, if the argument is null, return NULL. Embedded single-quotes and backslashes are properly doubled. quote_nullable(42.5) '42.5'
But if you're designing procedures that prepare SQL from a string, you should use query parameters instead.
PREPARE fooplan (int, text, bool, numeric) AS
INSERT INTO foo VALUES($1, $2, $3, $4);
EXECUTE fooplan(1, 'Hunter Valley', 't', 200.00);
Read more in https://www.postgresql.org/docs/current/sql-prepare.html

Convert all hex in a string to its char value in Redshift

In Redshift, I'm trying to convert strings like this:
http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob
To look like this:
http://www.amazon.com/Test?name=Gary&Bob
Basically I need to convert all of the hex in a string to its char value. The only way I can think of is to use a regex function. I tried to do it in two different ways and received error messages for both:
SELECT REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])', CHR(x'\\1'::int))
ERROR: 22P02: "\" is not a valid hexadecimal digit
SELECT REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])',CHR(STRTOL('0x'||'\\1', 16)::int))
ERROR: 22023: The input 0x\1 is not valid to be converted to base 16
The CHR and STRTOL functions works by itself. For example:
SELECT CHR(x'3A'::int)
SELECT CHR(STRTOL('0x3A', 16)::int)
both returns
:
And if I run the same pattern using a different function (other than CHR and STRTOL), it works:
REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])', LOWER('{H}'||'\\1'||'{/H}'))
returns
http{h}3A{/h}{h}2F{/h}{h}2F{/h}www.amazon.com{h}2F{/h}Test{h}3F{/h}name{h}3D{/h}Gary{h}26{/h}Bob
But for some reason those functions won't recognize the regex matching group.
Any tips on how I can do this?
I guess the other solution is to use nested REPLACE() functions for all of the special hex characters, but that's probably a very last resort.
What you want to do is called "URL decode".
Currently there is no built-in function for doing this, but you can create a custom User-Defined Function (make sure you have the required privileges):
CREATE FUNCTION urldecode(url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urllib
return urllib.unquote(url).decode('utf8') # or 'latin-1', depending on how the text is encoded
$$ LANGUAGE plpythonu;
Example query:
SELECT urldecode('http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob');
Result:
http://www.amazon.com/Test?name=Gary&Bob
I tried #hiddenbit's answer in REDSHIFT, but Python 3 isn't supported. The following Py2 code did work for me, however:
DROP FUNCTION urldecode(varchar);
CREATE FUNCTION urldecode(url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urllib
return urllib.unquote(url)
$$ LANGUAGE plpythonu;

Why the DELIMITER $$ in plpgsql? [duplicate]

Being completely new to PL/pgSQL , what is the meaning of double dollar signs in this function:
CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean AS $$
BEGIN
IF NOT $1 ~ e'^\\+\\d{3}\\ \\d{3} \\d{3} \\d{3}$' THEN
RAISE EXCEPTION 'Wrong formated string "%". Expected format is +999 999';
END IF;
RETURN true;
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;
I'm guessing that, in RETURNS boolean AS $$, $$ is a placeholder.
The last line is a bit of a mystery: $$ LANGUAGE plpgsql STRICT IMMUTABLE;
By the way, what does the last line mean?
These dollar signs ($$) are used for dollar quoting, which is in no way specific to function definitions. It can be used to replace single quotes enclosing string literals (constants) anywhere in SQL scripts.
The body of a function happens to be such a string literal. Dollar-quoting is a PostgreSQL-specific substitute for single quotes to avoid escaping of nested single quotes (recursively). You could enclose the function body in single-quotes just as well. But then you'd have to escape all single-quotes in the body:
CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean
LANGUAGE plpgsql STRICT IMMUTABLE AS
'
BEGIN
IF NOT $1 ~ e''^\\+\\d{3}\\ \\d{3} \\d{3} \\d{3}$'' THEN
RAISE EXCEPTION ''Malformed string "%". Expected format is +999 999'';
END IF;
RETURN true;
END
';
This isn't such a good idea. Use dollar-quoting instead. More specifically, also put a token between the $$ to make each pair unique - you might want to use nested dollar-quotes inside the function body. I do that a lot, actually.
CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean
LANGUAGE plpgsql STRICT IMMUTABLE AS
$func$
BEGIN
...
END
$func$;
See:
Insert text with single quotes in PostgreSQL
As to your second question:
Read the most excellent manual on CREATE FUNCTION to understand the last line of your example.
The $$ is a delimiter you use to indicate where the function definition starts and ends. Consider the following,
CREATE TABLE <name> <definition goes here> <options go here, eg: WITH OIDS>
The create function syntax is similar, but because you are going to use all sorts of SQL in your function (especially the end of statement ; character), the parser would trip if you didn't delimit it. So you should read your statement as:
CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean AS <code delimited by $$> LANGUAGE plpgsql STRICT IMMUTABLE;
The stuff after the actual definition are options to give the database more information about your function, so it can optimize its usage.
In fact, if you look under "4.1.2.4. Dollar-Quoted String Constants" in the manual, you will see that you can even use characters in between the dollar symbols and it will all count as one delimiter.

What are '$$' used for in PL/pgSQL

Being completely new to PL/pgSQL , what is the meaning of double dollar signs in this function:
CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean AS $$
BEGIN
IF NOT $1 ~ e'^\\+\\d{3}\\ \\d{3} \\d{3} \\d{3}$' THEN
RAISE EXCEPTION 'Wrong formated string "%". Expected format is +999 999';
END IF;
RETURN true;
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;
I'm guessing that, in RETURNS boolean AS $$, $$ is a placeholder.
The last line is a bit of a mystery: $$ LANGUAGE plpgsql STRICT IMMUTABLE;
By the way, what does the last line mean?
These dollar signs ($$) are used for dollar quoting, which is in no way specific to function definitions. It can be used to replace single quotes enclosing string literals (constants) anywhere in SQL scripts.
The body of a function happens to be such a string literal. Dollar-quoting is a PostgreSQL-specific substitute for single quotes to avoid escaping of nested single quotes (recursively). You could enclose the function body in single-quotes just as well. But then you'd have to escape all single-quotes in the body:
CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean
LANGUAGE plpgsql STRICT IMMUTABLE AS
'
BEGIN
IF NOT $1 ~ e''^\\+\\d{3}\\ \\d{3} \\d{3} \\d{3}$'' THEN
RAISE EXCEPTION ''Malformed string "%". Expected format is +999 999'';
END IF;
RETURN true;
END
';
This isn't such a good idea. Use dollar-quoting instead. More specifically, also put a token between the $$ to make each pair unique - you might want to use nested dollar-quotes inside the function body. I do that a lot, actually.
CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean
LANGUAGE plpgsql STRICT IMMUTABLE AS
$func$
BEGIN
...
END
$func$;
See:
Insert text with single quotes in PostgreSQL
As to your second question:
Read the most excellent manual on CREATE FUNCTION to understand the last line of your example.
The $$ is a delimiter you use to indicate where the function definition starts and ends. Consider the following,
CREATE TABLE <name> <definition goes here> <options go here, eg: WITH OIDS>
The create function syntax is similar, but because you are going to use all sorts of SQL in your function (especially the end of statement ; character), the parser would trip if you didn't delimit it. So you should read your statement as:
CREATE OR REPLACE FUNCTION check_phone_number(text)
RETURNS boolean AS <code delimited by $$> LANGUAGE plpgsql STRICT IMMUTABLE;
The stuff after the actual definition are options to give the database more information about your function, so it can optimize its usage.
In fact, if you look under "4.1.2.4. Dollar-Quoted String Constants" in the manual, you will see that you can even use characters in between the dollar symbols and it will all count as one delimiter.