In PostgreSQL, how can I unwrap a json string to text? - postgresql

Suppose I have a value of type json, say y. One may obtain such a value through, for example, obj->'key', or any function that returns values of type json.
This value, when cast to text, includes the quotation marks, i.e. "y" instead of y. Where using json types is unavoidable, this poses a problem, especially when we want to compare the value with a literal string, e.g.
select foo(x)='bar';
The API Brainstorm page suggests a from_json function that will intelligently unwrap JSON strings, but I doubt that is available yet. In the meantime, how can one convert JSON strings to text without the quotation marks?

Text:
To extract a value as text, use #>>:
SELECT to_json('foo'::text) #>> '{}';
From: Postgres: How to convert a json string to text?
PostgreSQL doc page: https://www.postgresql.org/docs/11/functions-json.html
This answers the question as asked, but the result is always text: it does not give you an integer or a float directly. The #> operator does not help either, since it returns json rather than a native type.
Numbers:
Because JSON only has one numeric type, "number", and has no concept of int or float, there's no obvious way to cast a JSON type to a "correct" numeric type. It's best to know the schema of your JSON, extract the text and then cast to the correct type:
SELECT (('{"a":2.01}'::json)->'a'#>>'{}')::float
PostgreSQL does however have support for "arbitrary precision numbers" ("up to 131072 digits before the decimal point; up to 16383 digits after the decimal point") with its "numeric" type. JSON also supports 'e' notation for large numbers.
Try this to test them both out:
SELECT (('{"a":2e99999}'::json)->'a'#>>'{}')::numeric
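The same "extract the text, then choose the type yourself" idea can be sketched on the client side. This is a minimal Python illustration and not part of the answer above; the helper name json_number_as_decimal is made up for the example. It reads the JSON number as an arbitrary-precision Decimal, roughly analogous to casting to numeric, instead of going through a binary float.

```python
import json
from decimal import Decimal


def json_number_as_decimal(document: str, key: str) -> Decimal:
    """Parse a JSON object and return the number stored under `key` as a Decimal."""
    # parse_float and parse_int receive the raw number text, so nothing is
    # rounded through an intermediate binary float.
    parsed = json.loads(document, parse_float=Decimal, parse_int=Decimal)
    return parsed[key]


small = json_number_as_decimal('{"a":2.01}', "a")
huge = json_number_as_decimal('{"a":2e99999}', "a")
```

With the default float parsing, a value such as 2e99999 would not survive as a finite number, which is why the sketch asks for Decimal explicitly.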

The ->> operator unwraps quotation marks correctly. In order to take advantage of that operator, we wrap up our value inside an array, and then convert that to json.
CREATE OR REPLACE FUNCTION json2text(IN from_json JSON)
RETURNS TEXT AS $$
BEGIN
RETURN to_json(ARRAY[from_json])->>0;
END; $$
LANGUAGE plpgsql;
For completeness, we provide a CAST that makes use of the function above.
CREATE CAST (json AS text) WITH FUNCTION json2text(json) AS ASSIGNMENT;

Related

Convert comma separated non json string to json

Below is the value of a string in a text column.
select col1 from tt_d_tab;
'A:10000000,B:50000000,C:1000000,D:10000000,E:10000000'
I'm trying to convert it into json of below format.
'{"A": 10000000,"B": 50000000,"C": 1000000,"D": 10000000,"E": 10000000}'
Can someone help on this?
If you know that neither the keys nor values will have : or , characters in them, you can write
select json_object(regexp_split_to_array(col1,'[:,]')) from tt_d_tab;
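This splits the string on every colon and comma, then interprets the resulting array as alternating keys and values. Note that json_object(text[]) emits every value as a JSON string ("10000000"), not as a number.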
If the string manipulation gets any more complicated, SQL may not be the ideal tool for the job, but it's still doable, either by this method or by converting the string into the form you need directly and then casting it to json with ::json.
If your key is a single capital letter, as in your example:
select concat('{',regexp_replace('A:10000000,B:50000000,C:1000000,D:10000000,E:10000000','([A-Z])','"\1"','g'),'}')::json json_field;
A more general case, with keys of any number of letters, upper or lower case:
select concat('{',regexp_replace('Ac:10000000,BT:50000000,Cs:1000000,D:10000000,E:10000000','([a-zA-Z]+)','"\1"','g'),'}')::json json_field;
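If the conversion can happen in application code rather than in SQL, the same split-on-delimiters idea is easy to express there. A minimal Python sketch, under the same assumption that keys and values never contain : or , and the additional assumption that every value is an integer; pairs_to_json is a hypothetical helper name, not something from the answers above:

```python
import json


def pairs_to_json(raw: str) -> str:
    """Turn a string like 'A:1,B:2' into a JSON object string with integer values."""
    result = {}
    for pair in raw.split(","):
        key, value = pair.split(":")
        result[key.strip()] = int(value)  # assumption: values are integers
    return json.dumps(result)


converted = pairs_to_json("A:10000000,B:50000000,C:1000000,D:10000000,E:10000000")
```

Because the values are converted with int() before serialising, they come out as JSON numbers rather than quoted strings.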

pg_get_serial_sequence in postgres fails and returns misleading error

This is not obvious to me.
When I do:
SELECT MAX("SequenceNumber") FROM "TrackingEvent";
It returns the correct result without any problem.
When I do:
SELECT nextval(pg_get_serial_sequence("TrackingEvent", "SequenceNumber")) AS NextId
It returns an error which says
column "TrackingEvent" does not exist.
Not only is it wrong, but the first argument of pg_get_serial_sequence takes a table name, not a column name, so the error is also misleading.
Can someone explain why I get an error from the pg_get_serial_sequence function?
pg_get_serial_sequence() expects a string as its argument, not an identifier. String constants are written with single quotes in SQL, "TrackingEvent" is an identifier, 'TrackingEvent' is a string constant.
But because the function converts the string constant to an identifier, you need to include the double quotes as part of the string constant. This only applies to the table name, however, not the column name, as explained in the manual:
Because the first parameter is potentially a schema and table, it is not treated as a double-quoted identifier, meaning it is lower cased by default, while the second parameter, being just a column name, is treated as double-quoted and has its case preserved.
So you need to use:
SELECT nextval(pg_get_serial_sequence('"TrackingEvent"', 'SequenceNumber'))
This is another good example of why using quoted identifiers is a bad idea. You should rename "TrackingEvent" to tracking_event and "SequenceNumber" to sequence_number.

Is there a mechanism in SQL to escape a variable?

I will write a stored procedure in PostgreSQL which accepts a variable (my knowledge of SQL is close to zero, so I apologize if the question is obvious). Since this variable will be used verbatim in the call, I wanted to ensure that it is properly escaped to avoid injection.
Is there a function I can wrap the variable in, which would properly do the escaping?
I specifically would like to do that in SQL, as opposed to sanitizing the input (that variable) in the code which calls the SQL query (which would have arguably been easier).
I am surprised not to find any prominent documentation about such functionality, which leads me to believe that this is not standard practice. The closest I could get was the lexer source code of PostgreSQL, but it is beyond my ability to tell whether this is the right kind of escaping (it would lead to the string being used as u&’stringuescape’’’, which looks quite barbaric).
There are several quoting functions in PostgreSQL, documented at https://www.postgresql.org/docs/current/functions-string.html
quote_ident(string text) returns text
Return the given string suitably quoted to be used as an identifier in an SQL statement string. Quotes are added only if necessary (i.e., if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled. See also Example 40-1.
Example: quote_ident('Foo bar') gives "Foo bar"
quote_literal(string text) returns text
Return the given string suitably quoted to be used as a string literal in an SQL statement string. Embedded single-quotes and backslashes are properly doubled. Note that quote_literal returns null on null input; if the argument might be null, quote_nullable is often more suitable. See also Example 40-1.
Example: quote_literal(E'O\'Reilly') gives 'O''Reilly'
quote_literal(value anyelement) returns text
Coerce the given value to text and then quote it as a literal. Embedded single-quotes and backslashes are properly doubled.
Example: quote_literal(42.5) gives '42.5'
quote_nullable(string text) returns text
Return the given string suitably quoted to be used as a string literal in an SQL statement string; or, if the argument is null, return NULL. Embedded single-quotes and backslashes are properly doubled. See also Example 40-1.
Example: quote_nullable(NULL) gives NULL
quote_nullable(value anyelement) returns text
Coerce the given value to text and then quote it as a literal; or, if the argument is null, return NULL. Embedded single-quotes and backslashes are properly doubled.
Example: quote_nullable(42.5) gives '42.5'
But if you're designing procedures that prepare SQL from a string, you should use query parameters instead.
PREPARE fooplan (int, text, bool, numeric) AS
INSERT INTO foo VALUES($1, $2, $3, $4);
EXECUTE fooplan(1, 'Hunter Valley', 't', 200.00);
Read more in https://www.postgresql.org/docs/current/sql-prepare.html
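The point of query parameters is that the value travels separately from the SQL text, so no escaping function is needed at all. The sketch below illustrates this from client code. It uses Python's built-in sqlite3 module only because it needs no server; that choice, the two-column foo table, and the "?" placeholder style are assumptions of this example, and PostgreSQL drivers offer the same mechanism with their own placeholder syntax.

```python
import sqlite3

# In-memory database used purely for illustration; the table is loosely
# modelled on the foo example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER, name TEXT)")

# A value containing a quote and an injection attempt.
hostile = "O'Reilly'); DROP TABLE foo; --"

# The "?" placeholders are bound by the driver; the value is never spliced
# into the SQL text, so it cannot change the statement's structure.
conn.execute("INSERT INTO foo VALUES (?, ?)", (1, hostile))

stored = conn.execute("SELECT name FROM foo WHERE id = ?", (1,)).fetchone()[0]
conn.close()
```

The hostile string is stored verbatim as data, and the table is still there afterwards.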

Convert all hex in a string to its char value in Redshift

In Redshift, I'm trying to convert strings like this:
http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob
To look like this:
http://www.amazon.com/Test?name=Gary&Bob
Basically I need to convert all of the hex in a string to its char value. The only way I can think of is to use a regex function. I tried to do it in two different ways and received error messages for both:
SELECT REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])', CHR(x'\\1'::int))
ERROR: 22P02: "\" is not a valid hexadecimal digit
SELECT REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])',CHR(STRTOL('0x'||'\\1', 16)::int))
ERROR: 22023: The input 0x\1 is not valid to be converted to base 16
The CHR and STRTOL functions work on their own. For example:
SELECT CHR(x'3A'::int)
SELECT CHR(STRTOL('0x3A', 16)::int)
both return
:
And if I run the same pattern using a different function (other than CHR and STRTOL), it works:
REGEXP_REPLACE(hex_string, '%([[:xdigit:]][[:xdigit:]])', LOWER('{H}'||'\\1'||'{/H}'))
returns
http{h}3A{/h}{h}2F{/h}{h}2F{/h}www.amazon.com{h}2F{/h}Test{h}3F{/h}name{h}3D{/h}Gary{h}26{/h}Bob
But for some reason those functions won't recognize the regex matching group.
Any tips on how I can do this?
I guess the other solution is to use nested REPLACE() functions for all of the special hex characters, but that's probably a very last resort.
What you want to do is called "URL decode".
Currently there is no built-in function for doing this, but you can create a custom User-Defined Function (make sure you have the required privileges):
CREATE FUNCTION urldecode(url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urllib
return urllib.unquote(url).decode('utf8') # or 'latin-1', depending on how the text is encoded
$$ LANGUAGE plpythonu;
Example query:
SELECT urldecode('http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob');
Result:
http://www.amazon.com/Test?name=Gary&Bob
I tried @hiddenbit's answer in Redshift, but Python 3 isn't supported. The following Py2 code did work for me, however:
DROP FUNCTION urldecode(varchar);
CREATE FUNCTION urldecode(url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urllib
return urllib.unquote(url)
$$ LANGUAGE plpythonu;
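For comparison, outside Redshift the same decoding in Python 3 lives in urllib.parse. This is only a standalone sketch of what the UDF body does, reusing the urldecode name from the answers above; it is not a Redshift UDF, since the plpythonu examples are Python 2 code.

```python
from urllib.parse import unquote


def urldecode(url: str) -> str:
    """Replace each %XX escape with the character it encodes (UTF-8 by default)."""
    return unquote(url)


decoded = urldecode("http%3A%2F%2Fwww.amazon.com%2FTest%3Fname%3DGary%26Bob")
```

In Python 3, unquote already returns a text string, so no separate decode step is needed.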

PostgreSQL Trim excessive trailing zeroes: type numeric but expression is of type text

I'm trying to clean out excessive trailing zeros, I used the following query...
UPDATE _table_ SET _column_=trim(trailing '00' FROM '_column_');
...and I received the following error:
ERROR: column "_column_" is of type numeric but expression is of type text.
I've played around with the quotes, since that is usually what it boils down to for text versus numeric, but without any luck.
The CREATE TABLE syntax:
CREATE TABLE _table_ (
id bigint NOT NULL,
x bigint,
y bigint,
_column_ numeric
);
You can cast the argument to text and the result back to numeric:
UPDATE _table_ SET _column_=trim(trailing '00' FROM _column_::text)::numeric;
Also note that you don't quote column names with single quotes as you did.
Postgres version 13 now comes with the trim_scale() function:
UPDATE _table_ SET _column_ = trim_scale(_column_);
trim takes string parameters, so _column_ has to be cast to a string (varchar for example). Then, the result of trim has to be cast back to numeric.
UPDATE _table_ SET _column_=trim(trailing '00' FROM _column_::varchar)::numeric;
Another (arguably more consistent) way to clean out the trailing zeroes from a NUMERIC field would be to use something like the following:
UPDATE _table_ SET _column_ = CAST(to_char(_column_, 'FM999999999990.999999') AS NUMERIC);
Note that you would have to modify the FM pattern to match the maximum expected precision and scale of your _column_ field. For more details on the FM pattern modifier and/or the to_char(..) function see the PostgreSQL docs here and here.
Edit: Also, see the following post on the gnumed-devel mailing list for a longer and more thorough explanation on this approach.
Be careful with all the answers here. Although this looks like a simple problem, it's not.
If you have pg 13 or higher, you should use trim_scale (there is an answer about that already). If not, here is my "Polyfill":
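DO $x$
BEGIN
  -- Define trim_scale only when the server does not already provide it.
  IF NOT EXISTS (SELECT 1 FROM pg_proc WHERE proname = 'trim_scale') THEN
    CREATE FUNCTION trim_scale(numeric) RETURNS numeric AS $$
      -- Keep the trimmed value only if it still equals the input; otherwise return the input unchanged.
      -- Caveat: an input of 0 trims to an empty string, which does not cast to numeric.
      SELECT CASE WHEN trim($1::text, '0')::numeric = $1 THEN trim($1::text, '0')::numeric ELSE $1 END $$
    LANGUAGE SQL;
  END IF;
END;
$x$;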
And here is a query for testing the answers:
WITH test as (SELECT unnest(string_to_array('1|2.0|0030.00|4.123456000|300000','|'))::numeric _column_)
SELECT _column_ original,
trim(trailing '00' FROM _column_::text)::numeric accepted_answer,
CAST(to_char(_column_, 'FM999999999990.999') AS NUMERIC) another_fancy_one,
CASE WHEN trim(_column_::text, '0')::numeric = _column_ THEN trim(_column_::text, '0')::numeric ELSE _column_ END my FROM test;
Well... it looks like I'm trying to show the flaws of the earlier answers, but I can't come up with any other test cases. Maybe you should write more, if you can.
I like short syntax better than fancy SQL keywords, so I always go with :: over CAST, and with function calls with comma-separated arguments over constructs like trim(trailing '00' FROM _column_). But that is only personal taste; you should check your company or team standards (and fight to change them XD).
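The pitfall the test query exposes, that blindly stripping zero characters also eats significant zeros of whole numbers, can be reproduced outside the database. A minimal Python sketch using the same sample values; naive_trim and trim_scale_sketch are hypothetical helpers written for this illustration, not PostgreSQL functions:

```python
from decimal import Decimal


def naive_trim(value: Decimal) -> Decimal:
    """Strip every trailing '0' character from the text form, significant or not."""
    return Decimal(format(value, "f").rstrip("0"))


def trim_scale_sketch(value: Decimal) -> Decimal:
    """Drop trailing zeros only from the fractional part, keeping the value equal."""
    text = format(value, "f")
    if "." in text:
        # Only digits after the decimal point are safe to remove.
        text = text.rstrip("0").rstrip(".")
    return Decimal(text)


samples = ["1", "2.0", "0030.00", "4.123456000", "300000"]
trimmed = [str(trim_scale_sketch(Decimal(s))) for s in samples]
broken = naive_trim(Decimal("300000"))  # no longer equal to the input
```

Checking for a decimal point before trimming is what keeps 300000 intact, which is the case that trips up the purely text-based approach.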