How to XML-encode a string in PostgreSQL?

How to XML-encode a string in PostgreSQL? - postgresql

Question: I can create a XML-encoded string in Postgres like this:
SELECT xmlelement(name name, 'AT&T', null )
now I want to get the xml encoded value, that is to say AT&T.
But if I do:
SELECT unnest(xpath('/name/text()', xmlelement(name name, 'AT&T', null )))
then I get AT&T, not AT&T.
How can I get the XML-encoded value?
Furthermore, isn't it possible to supply an empty name to xmlelement, and just cast to varchar?

I would suggest to use a simple function.
create or replace function xml_escape(s text) returns text as
$$
select replace(replace(replace(s, '&', '&'), '>', '>'), '<', '<');
$$
language sql immutable strict;

If you are writing to an HTML client then you will have to HTML escape that for it to show the raw HTML.
As I see you are mainly a C# developer then the static method HttpUtility.HtmlEncode() will do it.

Related

Postgres replacing 'text' with e'text'

I inserted a bunch of rows with a text field like content='...\n...\n...'.
I didn't use e in front, like conent=e'...\n...\n..., so now \n is not actually displayed as a newline - it's printed as text.
How do I fix this, i.e. how to change every row's content field from '...' to e'...'?

The syntax variant E'string' makes Postgres interpret the given string as Posix escape string. \n encoding a newline is only one of many interpreted escape sequences (even if the most common one). See:
Insert text with single quotes in PostgreSQL
To "re-evaluate" your Posix escape string, you could use a simple function with dynamic SQL like this:
CREATE OR REPLACE FUNCTION f_eval_posix_escapes(INOUT _string text)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'SELECT E''' || _string || '''' INTO _string;
END
$func$;
WARNING 1: This is inherently unsafe! We have to evaluate input strings dynamically without quoting and escaping, which allows SQL injection. Only use this in a safe environment.
WARNING 2: Don't apply repeatedly. Or it will misinterpret your actual string with genuine \ characters, etc.
WARNING 3: This simple function is imperfect as it cannot cope with nested single quotes properly. If you have some of those, consider instead:
Unescape a string with escaped newlines and carriage returns
Apply:
UPDATE tbl
SET content = f_eval_posix_escapes(content)
WHERE content IS DISTINCT FROM f_eval_posix_escapes(content);
db<>fiddle here
Note the added WHERE clause to skip updates that would not change anything. See:
How do I (or can I) SELECT DISTINCT on multiple columns?

Use REPLACE in an update query. Something like this: (I'm on mobile so please ignore any typo or syntax erro)
UPDATE table
SET
column = REPLACE(column, '\n', e'\n')

Convert all hex in a string to its char value in Redshift

In Redshift, I'm trying to convert strings like this:
http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob
To look like this:
http://www.amazon.com/Test?name=Gary&Bob
Basically I need to convert all of the hex in a string to its char value. The only way I can think of is to use a regex function. I tried to do it in two different ways and received error messages for both:
SELECT REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])', CHR(x'\\1'::int))
ERROR: 22P02: "\" is not a valid hexadecimal digit
SELECT REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])',CHR(STRTOL('0x'||'\\1', 16)::int))
ERROR: 22023: The input 0x\1 is not valid to be converted to base 16
The CHR and STRTOL functions works by itself. For example:
SELECT CHR(x'3A'::int)
SELECT CHR(STRTOL('0x3A', 16)::int)
both returns
:
And if I run the same pattern using a different function (other than CHR and STRTOL), it works:
REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])', LOWER('{H}'||'\\1'||'{/H}'))
returns
http{h}3A{/h}{h}2F{/h}{h}2F{/h}www.amazon.com{h}2F{/h}Test{h}3F{/h}name{h}3D{/h}Gary{h}26{/h}Bob
But for some reason those functions won't recognize the regex matching group.
Any tips on how I can do this?
I guess the other solution is to use nested REPLACE() functions for all of the special hex characters, but that's probably a very last resort.

What you want to do is called "URL decode".
Currently there is no built-in function for doing this, but you can create a custom User-Defined Function (make sure you have the required privileges):
CREATE FUNCTION urldecode(url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urllib
return urllib.unquote(url).decode('utf8') # or 'latin-1', depending on how the text is encoded
$$ LANGUAGE plpythonu;
Example query:
SELECT urldecode('http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob');
Result:
http://www.amazon.com/Test?name=Gary&Bob

I tried #hiddenbit's answer in REDSHIFT, but Python 3 isn't supported. The following Py2 code did work for me, however:
DROP FUNCTION urldecode(varchar);
CREATE FUNCTION urldecode(url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urllib
return urllib.unquote(url)
$$ LANGUAGE plpythonu;

postgres function 101: returning text

I have a function like this:
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS 'select foo;' LANGUAGE sql;
Except it doesn't work. Neither does RETURN TEXT "SELECT 'foo';"
How can I keep it written in SQL, but still return text?

I think this is the least change you need to make it work.
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS
'select ''foo''::text;'
LANGUAGE sql;
You'll see that the SQL statement--the body of the function--is a string. Strings have to be quoted, and single quotes within a quoted string have to be escaped. Soon you have more quotation marks than actual text.
The usual way to write something like this is to use a dollar quoted string constant.
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS
$$
select 'foo'::text;
$$
LANGUAGE sql;

PostgreSQL Trim excessive trailing zeroes: type numeric but expression is of type text

I'm trying to clean out excessive trailing zeros, I used the following query...
UPDATE _table_ SET _column_=trim(trailing '00' FROM '_column_');
...and I received the following error:
ERROR: column "_column_" is of
expression is of type text.
I've played around with the quotes since that usually is what it barrels down to for text versus numeric though without any luck.
The CREATE TABLE syntax:
CREATE TABLE _table_ (
id bigint NOT NULL,
x bigint,
y bigint,
_column_ numeric
);

You can cast the arguments from and the result back to numeric:
UPDATE _table_ SET _column_=trim(trailing '00' FROM _column_::text)::numeric;
Also note that you don't quote column names with single quotes as you did.

Postgres version 13 now comes with the trim_scale() function:
UPDATE _table_ SET _column_ = trim_scale(_column_);

trim takes string parameters, so _column_ has to be cast to a string (varchar for example). Then, the result of trim has to be cast back to numeric.
UPDATE _table_ SET _column_=trim(trailing '00' FROM _column_::varchar)::numeric;

Another (arguably more consistent) way to clean out the trailing zeroes from a NUMERIC field would be to use something like the following:
UPDATE _table_ SET _column_ = CAST(to_char(_column_, 'FM999999999990.999999') AS NUMERIC);
Note that you would have to modify the FM pattern to match the maximum expected precision and scale of your _column_ field. For more details on the FM pattern modifier and/or the to_char(..) function see the PostgreSQL docs here and here.
Edit: Also, see the following post on the gnumed-devel mailing list for a longer and more thorough explanation on this approach.

Be careful with all the answers here. Although this looks like a simple problem, it's not.
If you have pg 13 or higher, you should use trim_scale (there is an answer about that already). If not, here is my "Polyfill":
DO $x$
BEGIN
IF count(*)=0 FROM pg_proc where proname='trim_scale' THEN
CREATE FUNCTION trim_scale(numeric) RETURNS numeric AS $$
SELECT CASE WHEN trim($1::text, '0')::numeric = $1 THEN trim($1::text, '0')::numeric ELSE $1 END $$
LANGUAGE SQL;
END IF;
END;
$x$;
And here is a query for testing the answers:
WITH test as (SELECT unnest(string_to_array('1|2.0|0030.00|4.123456000|300000','|'))::numeric _column_)
SELECT _column_ original,
trim(trailing '00' FROM _column_::text)::numeric accepted_answer,
CAST(to_char(_column_, 'FM999999999990.999') AS NUMERIC) another_fancy_one,
CASE WHEN trim(_column_::text, '0')::numeric = _column_ THEN trim(_column_::text, '0')::numeric ELSE _column_ END my FROM test;
Well... it looks like, I'm trying to show the flaws of the earlier answers, while just can't come up with other testcases. Maybe you should write more, if you can.
I'm like short syntax instead of fancy sql keywords, so I always go with :: over CAST and function call with comma separated args over constructs like trim(trailing '00' FROM _column_). But it's a personal taste only, you should check your company or team standards (and fight for change them XD)

How can I mimic the php urldecode function in postgresql?

I have a column url encoded with urlencode in php. I wish to make a select like this
SELECT some_mix_of_functions(...) AS Decoded FROM table
Replace is not a good solution because I will have to add all the decoding by hand. Any other solution to get the desire result ?

Yes you can:
CREATE OR REPLACE FUNCTION decode_url_part(p varchar) RETURNS varchar AS $$
SELECT convert_from(CAST(E'\\x' || string_agg(CASE WHEN length(r.m[1]) = 1 THEN encode(convert_to(r.m[1], 'SQL_ASCII'), 'hex') ELSE substring(r.m[1] from 2 for 2) END, '') AS bytea), 'UTF8')
FROM regexp_matches($1, '%[0-9a-f][0-9a-f]|.', 'gi') AS r(m);
$$ LANGUAGE SQL IMMUTABLE STRICT;
This creates a function decode_url_part, then you can use it like that:
SELECT decode_url_part('your%20urlencoded%20string')
Or you can just use the mix of functions and subqueries from the body of the above function.
This doesn't handle '+' characters (representing whitespace), but I guess adding this is quite easy (if you ever need it).
Also, this assumes utf-8 encoding for non-ascii characters, but you can replace 'UTF8' with your own encoding if you want.
It should be noted that the above code relies on undocumented postgresql feature, namely that the results of regexp_matches function are processed in the order they occur in the original string (which is natural, but not specified in docs).
As Pablo Santa Cruz notes, string_agg is a PostgreSQL 9.0 aggregate function. The equivalent code below doesn't use it (I hope it works for 8.x):
SELECT convert_from(CAST(E'\\x' || array_to_string(ARRAY(
SELECT CASE WHEN length(r.m[1]) = 1 THEN encode(convert_to(r.m[1], 'SQL_ASCII'), 'hex') ELSE substring(r.m[1] from 2 for 2) END
FROM regexp_matches($1, '%[0-9a-f][0-9a-f]|.', 'gi') AS r(m)
), '') AS bytea), 'UTF8');

Not out of the box. But you could create a pl/perl function that wraps the perl equivalent. (Or a pl/php function).

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

How to XML-encode a string in PostgreSQL? - postgresql

I would suggest to use a simple function. create or replace function xml_escape(s text) returns text as $$ select replace(replace(replace(s, '&', '&'), '>', '>'), '<', '<'); $$ language sql immutable strict;

If you are writing to an HTML client then you will have to HTML escape that for it to show the raw HTML. As I see you are mainly a C# developer then the static method HttpUtility.HtmlEncode() will do it.

Related

Postgres replacing 'text' with e'text'

Convert all hex in a string to its char value in Redshift

postgres function 101: returning text

PostgreSQL Trim excessive trailing zeroes: type numeric but expression is of type text

How can I mimic the php urldecode function in postgresql?

Categories

Resources