Convert all hex in a string to its char value in Redshift - amazon-redshift

In Redshift, I'm trying to convert strings like this:
http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob
To look like this:
http://www.amazon.com/Test?name=Gary&Bob
Basically I need to convert all of the hex in a string to its char value. The only way I can think of is to use a regex function. I tried to do it in two different ways and received error messages for both:
SELECT REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])', CHR(x'\\1'::int))
ERROR: 22P02: "\" is not a valid hexadecimal digit
SELECT REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])',CHR(STRTOL('0x'||'\\1', 16)::int))
ERROR: 22023: The input 0x\1 is not valid to be converted to base 16
The CHR and STRTOL functions work on their own. For example:
SELECT CHR(x'3A'::int)
SELECT CHR(STRTOL('0x3A', 16)::int)
both return
:
And if I run the same pattern using a different function (other than CHR and STRTOL), it works:
REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])', LOWER('{H}'||'\\1'||'{/H}'))
returns
http{h}3A{/h}{h}2F{/h}{h}2F{/h}www.amazon.com{h}2F{/h}Test{h}3F{/h}name{h}3D{/h}Gary{h}26{/h}Bob
But for some reason those functions won't recognize the regex matching group.
Any tips on how I can do this?
I guess the other solution is to use nested REPLACE() functions for all of the special hex characters, but that's probably a very last resort.

What you want to do is called "URL decode".
Currently there is no built-in function for doing this, but you can create a custom User-Defined Function (make sure you have the required privileges):
CREATE FUNCTION urldecode(url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urllib
return urllib.unquote(url).decode('utf8') # or 'latin-1', depending on how the text is encoded
$$ LANGUAGE plpythonu;
Example query:
SELECT urldecode('http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob');
Result:
http://www.amazon.com/Test?name=Gary&Bob
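As for why the REGEXP_REPLACE attempts in the question fail: the error messages suggest that the replacement argument, such as CHR(STRTOL('0x'||'\\1', 16)::int), is evaluated before the regex runs, so STRTOL receives the literal text \1 instead of the matched digits. Decoding needs a replacement that is computed once per match. Here is a minimal Python 3 sketch of that idea, for illustration only. percent_decode is a hypothetical name, and Redshift's plpythonu runs Python 2, where the library call is urllib.unquote rather than urllib.parse.unquote.

```python
import re
from urllib.parse import unquote  # Python 3 location of unquote


def percent_decode(text, encoding="utf-8"):
    """Replace every %XX escape with the byte it encodes, then decode the bytes."""
    raw = text.encode(encoding)
    # The callback runs once per match, so the captured hex digits are
    # available at the moment the replacement is computed.
    decoded = re.sub(
        rb"%([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        raw,
    )
    return decoded.decode(encoding)


url = "http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob"
print(percent_decode(url))
# Compare the hand-rolled version with the standard library for this input.
print(percent_decode(url) == unquote(url))
```

Working on bytes rather than characters keeps multi-byte UTF-8 sequences such as %C3%A9 intact. In a real UDF the library function is the simpler choice.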

I tried @hiddenbit's answer in Redshift, but Python 3 isn't supported. The following Py2 code did work for me, however:
DROP FUNCTION urldecode(varchar);
CREATE FUNCTION urldecode(url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urllib
return urllib.unquote(url)
$$ LANGUAGE plpythonu;

Related

How to deal with input parameter containing single quote in between value?

I have a function that takes a character varying value from the frontend and returns certain computed values. The issue I'm facing is that when the input value for that parameter contains a single quote, it throws an error like "procedure does not exist".
CREATE OR REPLACE PROCEDURE compute(p_company_name character varying DEFAULT NULL::character, INOUT response double precision DEFAULT NULL::double precision)
LANGUAGE plpgsql
AS $procedure$
begin
select estimate into response from tableA
where comp = p_company_name;
exception
when others then select -1 into response;---other error
end
$procedure$
;
It works fine for every input value without a quote in it, but when the parameter value is like p_company_name = samsung's, it throws an error.
Please help, thanks.
Your code is broken: you are escaping the parameter incorrectly (or not at all). Every input should be sanitized by doubling its single quotes:
Input: "Pavel's book" -> Output: "Pavel''s book"
select foo('samsung's'); -- syntax error
select foo('samsung''s'); -- ok
or you can use a dollar-quoted string:
select foo($$samsung's$$); -- ok
You should read up on SQL injection, because if you are seeing the problem described, your application is vulnerable to SQL injection.
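A common alternative to escaping by hand is to let the database driver bind the value as a parameter, so the quote never becomes part of the SQL text. The sketch below shows the idea in Python with the standard-library sqlite3 module and an in-memory database standing in for PostgreSQL. The table contents and the estimate_for helper are made up for illustration, and a PostgreSQL driver would use its own placeholder syntax.

```python
import sqlite3

# In-memory stand-in for the real database (assumption: illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (comp TEXT, estimate REAL)")
conn.execute("INSERT INTO tableA VALUES (?, ?)", ("samsung's", 42.5))


def estimate_for(company_name):
    """Look up an estimate, passing the name as a bound parameter."""
    row = conn.execute(
        "SELECT estimate FROM tableA WHERE comp = ?",
        (company_name,),  # the value travels separately from the SQL text
    ).fetchone()
    return row[0] if row is not None else None


print(estimate_for("samsung's"))
```

Because the value is never spliced into the statement, an embedded single quote needs no escaping and cannot change the meaning of the query.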

postgres function 101: returning text

I have a function like this:
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS 'select foo;' LANGUAGE sql;
Except it doesn't work. Neither does RETURN TEXT "SELECT 'foo';"
How can I keep it written in SQL, but still return text?
I think this is the least change you need to make it work.
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS
'select ''foo''::text;'
LANGUAGE sql;
You'll see that the SQL statement--the body of the function--is a string. Strings have to be quoted, and single quotes within a quoted string have to be escaped. Soon you have more quotation marks than actual text.
The usual way to write something like this is to use a dollar quoted string constant.
CREATE OR REPLACE FUNCTION current_name()
RETURNS text AS
$$
select 'foo'::text;
$$
LANGUAGE sql;
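To make the quoting rules concrete, here is a small Python sketch that wraps a function body both ways. single_quoted and dollar_quoted are hypothetical helpers written for this illustration. They only model the doubling of single quotes and the choice of a dollar-quote delimiter, and they ignore other details such as backslash handling.

```python
def single_quoted(body):
    """Wrap body in single quotes, doubling every embedded single quote."""
    return "'" + body.replace("'", "''") + "'"


def dollar_quoted(body, tag=""):
    """Wrap body in $tag$ ... $tag$; the body itself is left untouched."""
    delimiter = "$" + tag + "$"
    if delimiter in body:
        raise ValueError("body contains the delimiter; pick another tag")
    return delimiter + body + delimiter


body = "select 'foo'::text;"
print(single_quoted(body))
print(dollar_quoted(body))
```

The dollar-quoted form leaves the body exactly as written, which is why it stays readable as the function grows.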

PostgreSQL Trim excessive trailing zeroes: type numeric but expression is of type text

I'm trying to clean out excessive trailing zeros, I used the following query...
UPDATE _table_ SET _column_=trim(trailing '00' FROM '_column_');
...and I received the following error:
ERROR: column "_column_" is of type numeric but expression is of type text
I've played around with the quotes since that usually is what it boils down to for text versus numeric though without any luck.
The CREATE TABLE syntax:
CREATE TABLE _table_ (
id bigint NOT NULL,
x bigint,
y bigint,
_column_ numeric
);
You can cast the argument to text and the result back to numeric:
UPDATE _table_ SET _column_=trim(trailing '00' FROM _column_::text)::numeric;
Also note that you don't quote column names with single quotes as you did.
Postgres version 13 now comes with the trim_scale() function:
UPDATE _table_ SET _column_ = trim_scale(_column_);
trim takes string parameters, so _column_ has to be cast to a string (varchar for example). Then, the result of trim has to be cast back to numeric.
UPDATE _table_ SET _column_=trim(trailing '00' FROM _column_::varchar)::numeric;
Another (arguably more consistent) way to clean out the trailing zeroes from a NUMERIC field would be to use something like the following:
UPDATE _table_ SET _column_ = CAST(to_char(_column_, 'FM999999999990.999999') AS NUMERIC);
Note that you would have to modify the FM pattern to match the maximum expected precision and scale of your _column_ field. For more details on the FM pattern modifier and/or the to_char(..) function see the PostgreSQL docs here and here.
Edit: Also, see the following post on the gnumed-devel mailing list for a longer and more thorough explanation on this approach.
Be careful with all the answers here. Although this looks like a simple problem, it's not.
If you have pg 13 or higher, you should use trim_scale (there is an answer about that already). If not, here is my "Polyfill":
DO $x$
BEGIN
IF count(*)=0 FROM pg_proc where proname='trim_scale' THEN
CREATE FUNCTION trim_scale(numeric) RETURNS numeric AS $$
SELECT CASE WHEN trim($1::text, '0')::numeric = $1 THEN trim($1::text, '0')::numeric ELSE $1 END $$
LANGUAGE SQL;
END IF;
END;
$x$;
And here is a query for testing the answers:
WITH test as (SELECT unnest(string_to_array('1|2.0|0030.00|4.123456000|300000','|'))::numeric _column_)
SELECT _column_ original,
trim(trailing '00' FROM _column_::text)::numeric accepted_answer,
CAST(to_char(_column_, 'FM999999999990.999') AS NUMERIC) another_fancy_one,
CASE WHEN trim(_column_::text, '0')::numeric = _column_ THEN trim(_column_::text, '0')::numeric ELSE _column_ END my FROM test;
Well... it looks like I'm trying to show the flaws of the earlier answers but just can't come up with other test cases. Maybe you should write more if you can.
I like short syntax more than fancy SQL keywords, so I always go with :: over CAST and function calls with comma-separated args over constructs like trim(trailing '00' FROM _column_). But that's only personal taste; you should check your company or team standards (and fight to change them XD)
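The main pitfall behind the warning above can be reproduced outside the database. SQL's trim(trailing '00' FROM ...) treats its first argument as a set of characters, so it strips every trailing 0, including the significant zeros of a whole number. The Python sketch below mimics that behaviour on the text form of the test values. naive_trim and safe_trim are hypothetical names, and safe_trim is only one possible guard: it strips zeros only when a decimal point is present.

```python
from decimal import Decimal


def naive_trim(text):
    """Mimic trim(trailing '0' FROM text): strip all trailing zeros."""
    return text.rstrip("0")


def safe_trim(text):
    """Strip trailing zeros only after a decimal point, then a dangling point."""
    if "." not in text:
        return text
    return text.rstrip("0").rstrip(".")


# Text forms of the numeric values used in the test query above.
for sample in ["1", "2.0", "30.00", "4.123456000", "300000"]:
    trimmed = safe_trim(sample)
    # The guarded version must never change the numeric value.
    assert Decimal(trimmed) == Decimal(sample)
    print(sample, "| naive:", naive_trim(sample), "| safe:", trimmed)
```

The naive version turns 300000 into 3, which is the kind of silent corruption the test query is meant to expose.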

How to XOR md5 hash values and cast them to HEX in postgresql

What I have tried so far
SELECT md5(text) returns text (a hex string).
After that, we need to XOR them:
SELECT x'hex_string' # x'hex_string';
But the above results in binary values.
How do I again convert them into hex string?
Is there any way to XOR md5 values in PostgreSQL and convert the result into a hexadecimal value again?
Those binary values are in fact of type bit varying, which differs significantly from bytea.
bit varying comes with built-in support for XOR and such, but PostgreSQL doesn't provide a cast from bit varying to bytea.
You could write a function that does the cast, but it's not trivial and probably not the most efficient way in your case.
It would make more sense to XOR the md5 digests directly. PostgreSQL doesn't provide the XOR operator for bytea either, but it can easily be written as a function, especially if we assume that the operands have equal length (16 bytes in the case of md5 digests):
CREATE FUNCTION xor_digests(_in1 bytea, _in2 bytea) RETURNS bytea
AS $$
DECLARE
o int; -- offset
BEGIN
FOR o IN 0..octet_length(_in1)-1 LOOP
_in1 := set_byte(_in1, o, get_byte(_in1, o) # get_byte(_in2, o));
END LOOP;
RETURN _in1;
END;
$$ language plpgsql;
Now the built-in PostgreSQL md5 function, which produces a hex string, is not the best fit for post-processing either. The pgcrypto module provides this function instead:
digest(data text, type text) returns bytea
Using this function and getting the final result as a hex string:
select encode(
xor_digests ( digest('first string', 'md5') ,
digest('second string', 'md5')),
'hex');
produces the result: c1bd61a3c411bc0127c6d7ab1238c4bd with type text.
If pgcrypto can't be installed and only the built-in md5 function is available, you could still combine encode and decode to achieve the result like this:
select
encode(
xor_digests(
decode(md5('first string'), 'hex'),
decode(md5('second string'), 'hex')
),
'hex'
);
Result:
c1bd61a3c411bc0127c6d7ab1238c4bd
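For cross-checking the SQL outside the database, the same byte-wise XOR can be sketched in Python with the standard hashlib module. xor_md5_hex is a hypothetical helper name, and the inputs are assumed to be UTF-8 encoded text. The second function goes through the hex strings instead, mirroring the decode/encode variant.

```python
import hashlib


def xor_md5_hex(first, second):
    """XOR the raw 16-byte md5 digests of two strings and return hex text."""
    d1 = hashlib.md5(first.encode("utf-8")).digest()
    d2 = hashlib.md5(second.encode("utf-8")).digest()
    return bytes(a ^ b for a, b in zip(d1, d2)).hex()


def xor_md5_hex_via_int(first, second):
    """Same result, computed from the hex strings as 128-bit integers."""
    h1 = int(hashlib.md5(first.encode("utf-8")).hexdigest(), 16)
    h2 = int(hashlib.md5(second.encode("utf-8")).hexdigest(), 16)
    return format(h1 ^ h2, "032x")  # zero-pad to 32 hex digits


print(xor_md5_hex("first string", "second string"))
```

Zero-padding matters in the integer variant: without it, a result whose leading bytes are zero would come out shorter than 32 hex digits.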
You may want to check
https://github.com/artejera/text_xor_agg/
I recently wrote this postgres script for such a use case.

How can I mimic the php urldecode function in postgresql?

I have a column that was URL-encoded with urlencode in PHP. I wish to make a select like this
SELECT some_mix_of_functions(...) AS Decoded FROM table
Replace is not a good solution because I would have to add all the decoding by hand. Is there any other way to get the desired result?
Yes you can:
CREATE OR REPLACE FUNCTION decode_url_part(p varchar) RETURNS varchar AS $$
SELECT convert_from(CAST(E'\\x' || string_agg(CASE WHEN length(r.m[1]) = 1 THEN encode(convert_to(r.m[1], 'SQL_ASCII'), 'hex') ELSE substring(r.m[1] from 2 for 2) END, '') AS bytea), 'UTF8')
FROM regexp_matches($1, '%[0-9a-f][0-9a-f]|.', 'gi') AS r(m);
$$ LANGUAGE SQL IMMUTABLE STRICT;
This creates a function decode_url_part, which you can then use like this:
SELECT decode_url_part('your%20urlencoded%20string')
Or you can just use the mix of functions and subqueries from the body of the above function.
This doesn't handle '+' characters (representing whitespace), but I guess adding this is quite easy (if you ever need it).
Also, this assumes utf-8 encoding for non-ascii characters, but you can replace 'UTF8' with your own encoding if you want.
It should be noted that the above code relies on an undocumented PostgreSQL feature, namely that the results of the regexp_matches function are processed in the order they occur in the original string (which is natural, but not specified in the docs).
As Pablo Santa Cruz notes, string_agg is a PostgreSQL 9.0 aggregate function. The equivalent code below doesn't use it (I hope it works for 8.x):
SELECT convert_from(CAST(E'\\x' || array_to_string(ARRAY(
SELECT CASE WHEN length(r.m[1]) = 1 THEN encode(convert_to(r.m[1], 'SQL_ASCII'), 'hex') ELSE substring(r.m[1] from 2 for 2) END
FROM regexp_matches($1, '%[0-9a-f][0-9a-f]|.', 'gi') AS r(m)
), '') AS bytea), 'UTF8');
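The tokenise-then-hex approach used by both SQL variants may be easier to follow in a procedural sketch. The Python version below mirrors it step by step: split the input into %XX escapes and single characters, turn each token into hex, join the hex, and decode the resulting bytes as UTF-8. The plus_as_space flag is an addition that is not in the SQL above; it covers the '+' case, since PHP's urlencode writes spaces as '+'.

```python
import re

# Same alternation as the SQL pattern: a %XX escape, or else any one character.
TOKEN = re.compile(r"%[0-9a-f][0-9a-f]|.", re.IGNORECASE | re.DOTALL)


def decode_url_part(text, plus_as_space=False, encoding="utf-8"):
    """Decode %XX escapes by building one hex string and decoding its bytes."""
    if plus_as_space:
        # Must happen before decoding, so that %2B still yields a literal '+'.
        text = text.replace("+", " ")
    hex_parts = []
    for token in TOKEN.findall(text):
        if len(token) == 1:
            # Plain character: take the hex of its encoded bytes.
            hex_parts.append(token.encode(encoding).hex())
        else:
            # Escape: the two digits after '%' are already hex.
            hex_parts.append(token[1:3])
    return bytes.fromhex("".join(hex_parts)).decode(encoding)


print(decode_url_part("your%20urlencoded%20string"))
```

Unlike the SQL version, the loop here processes tokens in a guaranteed left-to-right order, so the ordering caveat mentioned above does not apply to this sketch.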
Not out of the box. But you could create a pl/perl function that wraps the perl equivalent. (Or a pl/php function).