SQLAlchemy IN_ - trouble with leading zeroes - postgresql

In my sqlalchemy ( sqlalchemy = "^1.4.36" ) query I have a clause:
.filter( some_model.some_field[2].in_(['item1', 'item2']) )
where some_field is jsonb and the value in some_field value in the db formatted like this:
["something","something","123"]
or
["something","something","0123"]
note: some_field[2] is always digits-only double-quoted string, sometimes with leading zeroes and sometimes without them.
The query works fine for cases like this:
.filter( some_model.some_field[2].in_(['123', '345']) )
and fails when the values in the in_ clause have leading zeroes:
e.g. .filter( some_model.some_field[2].in_(['0123', '0345']) ) fails.
The error it gives:
cursor.execute(statement, parameters)\\npsycopg2.errors.InvalidTextRepresentation: invalid input syntax for type json\\nLINE 3: ...d_on) = 2 AND (app_cache.value_metadata -> 2) IN (\\'0123\\'\\n ^\\nDETAIL: Token \"0123\" is invalid.
Again, in the case of '123' (or any string of digits without leading zero) instead of '0123' the error is not thrown.
What is wrong with having leading zeroes for the strings in the list of in_ clause? Thanks.
UPDATE: basically, sqlachemy's IN_ assumes int input and fails accordingly. There must be some reasoning behind this behavior, can't tell what it is. I removed that filter fromm the query and did the filtering of the ouput in python code afterwards.

The problem here is that the values in the IN clause are being interpreted by PostgreSQL as JSON representations of integers, and an integer with a leading zero is not valid JSON.
The IN clause has a value of type jsonb on the left hand side. The values on the right hand side are not explicitly typed, so Postgres tries to find the best match that will allow them to be compared with a jsonb value. This type is jsonb, so Postgres attempts to cast the values to jsonb. This works for values without a leading zero, because digits in single quotes without leading zeroes are valid representations of integers in JSON:
test# select '123'::jsonb;
jsonb
═══════
123
(1 row)
but it doesn't work for values with leading zeroes, because they are not valid JSON:
test# select '0123'::jsonb;
ERROR: invalid input syntax for type json
LINE 1: select '0123'::jsonb;
^
DETAIL: Token "0123" is invalid.
CONTEXT: JSON data, line 1: 0123
Assuming that you expect some_field[2].in_(['123', '345']) and some_field[2].in_(['0123', '345']) to match ["something","something","123"] and ["something","something","123"] respectively, you can either serialise the values to JSON yourself:
some_field[2].in_([json.dumps(x) for x in ['0123', '345']])
or use the contained_by operator (<# in PostgreSQL), to test whether some_field[2] is present in the list of values:
some_field[2].contained_by(['0123', '345'])
or cast some_field[2] to text (that is, use the ->> operator) so that the values are compared as text, not JSON.
some_field[2].astext.in_(['0123', '345'])

Related

Text and jsonb concatenation in a single postgresql query

How can I concatenate a string inside of a concatenated jsonb object in postgresql? In other words, I am using the JSONb concatenate operator as well as the text concatenate operator in the same query and running into trouble.
Or... if there is a totally different query I should be executing, I'd appreciate hearing suggestions. The goal is to update a row containing a jsonb column. We don't want to overwrite existing key value pairs in the jsonb column that are not provided in the query and we also want to update multiple rows at once.
My query:
update contacts as c set data = data || '{"geomatch": "MATCH","latitude":'||v.latitude||'}'
from (values (16247746,40.814140),
(16247747,20.900840),
(16247748,20.890570)) as v(contact_id,latitude) where c.contact_id = v.contact_id
The Error:
ERROR: invalid input syntax for type json
LINE 85: update contacts as c set data = data || '{"geomatch": "MATCH...
^
DETAIL: The input string ended unexpectedly.
CONTEXT: JSON data, line 1: {"geomatch": "MATCH","latitude":
SQL state: 22P02
Character: 4573
You might be looking for
SET data = data || ('{"geomatch": "MATCH","latitude":'||v.latitude||'}')::jsonb
-- ^^ jsonb ^^ text ^^ text
but that's not how one should build JSON objects - that v.latitude might not be a valid JSON literal, or even contain some injection like "", "otherKey": "oops". (Admittedly, in your example you control the values, and they're numbers so it might be fine, but it's still a bad practice). Instead, use jsonb_build_object:
SET data = data || jsonb_build_object('geomatch', 'MATCH', 'latitude', v.latitude)
There are two problems. The first is operator precedence preventing your concatenation of a jsonb object to what is read a text object. The second is that the concatenation of text pieces requires a cast to jsonb.
This should work:
update contacts as c
set data = data || ('{"geomatch": "MATCH","latitude":'||v.latitude||'}')::jsonb
from (values (16247746,40.814140),
(16247747,20.900840),
(16247748,20.890570)) as v(contact_id,latitude)
where c.contact_id = v.contact_id
;

Check if character varying is between range of numbers

I hava data in my database and i need to select all data where 1 column number is between 1-100.
Im having problems, because i cant use - between 1 and 100; Because that column is character varying, not integer. But all data are numbers (i cant change it to integer).
Code;
dst_db1.eachRow("Select length_to_fault from diags where length_to_fault between 1 AND 100")
Error - operator does not exist: character varying >= integer
Since your column supposed to contain numeric values but is defined as text (or version of text) there will be times when it does not i.e. You need 2 validations: that the column actually contains numeric data and that it falls into your value restriction. So add the following predicates to your query.
and length_to_fault ~ '^\+?\d+(\.\d*)?$'
and length_to_fault::numeric <# ('[1.0,100.0]')::numrange;
The first builds a regexp that insures the column is a valid floating point value. The second insures the numeric value fall within the specified numeric range. See fiddle.
I understand you cannot change the database, but this looks like a good place for a check constraint esp. if n/a is the only non-numeric are allowed. You may want to talk with your DBA ans see about the following constraint.
alter table diags
add constraint length_to_fault_check
check ( lower(length_to_fault) = 'n/a'
or ( length_to_fault ~ '^\+?\d+(\.\d*)?$'
and length_to_fault::numeric <# ('[1.0,100.0]')::numrange
)
);
Then your query need only check that:
lower(lenth_to_fault) != 'n/a'
The below PostgreSQL query will work
SELECT length_to_fault FROM diags WHERE regexp_replace(length_to_fault, '[\s+]', '', 'g')::numeric BETWEEN 1 AND 100;

Cast to int instead of decimal?

I have field that has up to 9 comma separated values each of which have a string value and a numeric value separated by colon. After parsing them all some of the values between 0 and 1 are being set to an integer rather than a numeric as cast. The problem is obviously related to data type but I am unsure what is causing it or how to fix it. The problem only exists in the case statement, the split_part function seems to be working perfect.
Things I have tried:
nvl(split_part(one,':',2),0) = COALESCE types text and integer cannot be matched
nvl(split_part(one,':',2)::numeric,0) => Invalid input syntax for type numeric
numerous other cast/convert variations
(CASE WHEN split_part(one,':',2) = '' THEN 0::numeric ELSE split_part(one,':',2)::numeric END)::numeric => runs but get int value of 0
When using the split_part function outside of case statement it does work correctly. However, I need the result to be zero for null values.
split_part(one,':',2) => 0.02068278096187390979 (expected result)
When running the code above I get zero but expect 0.02068278096187390979
Field "one" has the following value 'xyz: 0.02068278096187390979' before the split_part function.
EXAMPLE:
create table test(one varchar);
insert into test values('XYZ: 0.50000000000000000000')
select
one ,split_part(one,':',2) as correct_value_for_those_that_are_not_null ,
case
when split_part(one,':',2) = '' then null
else split_part(one,':',2)::numeric
end::numeric as this_one_is_the_problem
from test
However, I need the result to be zero for null values.
Your example does not deal with NULL values at all, though. Only addressing the empty string ('').
To replace either with 0 reliably, efficiently and without casting issues:
SELECT part1, CASE WHEN part2 <> '' THEN part2::numeric ELSE numeric '0' END AS part2
FROM (
SELECT split_part(one, ':', 1) AS part1
, split_part(one, ':', 2) AS part2
FROM test
) sub;
See:
Best way to check for "empty or null value"
Also note that all SQL CASE branches must agree on a common data type. There have been minor adjustments in the logic that determines the resulting type in the past, so the version of Postgres may play a role in corner cases. Don't recall the details now.
nvl()is not a Postgres function. You probably meant COALESCE. The manual:
This SQL-standard function provides capabilities similar to NVL and IFNULL, which are used in some other database systems.

RIGHT Function in UPDATE Statement w/ Integer Field

I am attempting to run a simple UPDATE script on an integer field, whereby the trailing 2 numbers are "kept", and the leading numbers are removed. For example, "0440" would be updated as "40." I can get the desired data in a SELECT statement, such as
SELECT RIGHT(field_name::varchar, 2)
FROM table_name;
However, I run into an error when I try to use this same functionality in an UPDATE script, such as:
UPDATE schema_name.table_name
SET field_name = RIGHT(field_name::varchar, 2);
The error I receive reads:
column . . . is of type integer but expression is of type text . . .
HINT: You will need to rewrite or cast the expression
You're casting the integer to varchar but you're not casting the result back to integer.
UPDATE schema_name.table_name
SET field_name = RIGHT(field_name::TEXT, 2)::INTEGER;
The error is quite straight forward - right returns textual data, which you cannot assign to an integer column. You could, however, explicitly cast it back:
UPDATE schema_name.table_name
SET field_name = RIGHT(field_name::varchar, 2)::int;
1 is a digit (or a number - or a string), '123' is a number (or a string).
Your example 0440 does not make sense for an integer value, since leading (insignificant) 0 are not stored.
Strictly speaking data type integer is no good to store the "trailing 2 numbers" - meaning digits - since 00 and 0 both result in the same integer value 0. But I don't think that's what you meant.
For operating on the numeric value, don't use string functions (which requires casting back and forth. The modulo operator % does what you need, exactly: field_name%100. So:
UPDATE schema_name.table_name
SET field_name = field_name%100
WHERE field_name > 99; -- to avoid empty updates

remove non-numeric characters in a column (character varying), postgresql (9.3.5)

I need to remove non-numeric characters in a column (character varying) and keep numeric values in postgresql 9.3.5.
Examples:
1) "ggg" => ""
2) "3,0 kg" => "3,0"
3) "15 kg." => "15"
4) ...
There are a few problems, some values are like:
1) "2x3,25"
2) "96+109"
3) ...
These need to remain as is (i.e when containing non-numeric characters between numeric characters - do nothing).
Using regexp_replace is more simple:
# select regexp_replace('test1234test45abc', '[^0-9]+', '', 'g');
regexp_replace
----------------
123445
(1 row)
The ^ means not, so any character that is not in the range 0-9 will be replaced with an empty string, ''.
The 'g' is a flag that means all matches will be replaced, not just the first match.
For modifying strings in PostgreSQL take a look at The String functions and operators section of the documentation. Function substring(string from pattern) uses POSIX regular expressions for pattern matching and works well for removing different characters from your string.
(Note that the VALUES clause inside the parentheses is just to provide the example material and you can replace it any SELECT statement or table that provides the data):
SELECT substring(column1 from '(([0-9]+.*)*[0-9]+)'), column1 FROM
(VALUES
('ggg'),
('3,0 kg'),
('15 kg.'),
('2x3,25'),
('96+109')
) strings
The regular expression explained in parts:
[0-9]+ - string has at least one number, example: '789'
[0-9]+.* - string has at least one number followed by something, example: '12smth'
([0-9]+.\*)* - the string similar to the previous line zero or more times, example: '12smth22smth'
(([0-9]+.\*)*[0-9]+) - the string from the previous line zero or more times and at least one number at the end, example: '12smth22smth345'