select an item with a null comparison - postgresql

I'm not using a full-on DB abstraction library; I'm using raw SQL templates in psycopg2 that look like this:
SELECT id FROM table WHERE message = %(message)s ;
The ideal query to retrieve my intended results looks something like this :
SELECT id FROM table WHERE message = 'a3cbb207' ;
SELECT id FROM table WHERE message IS NULL ;
Unfortunately... the obvious problem is that my NULL comparisons come out looking like this:
SELECT id FROM table WHERE message = NULL ;
... which is not the correct comparison - and doesn't give me the intended result set.
My actual queries are much more complex than the illustration above, so I can't change them easily. (Which would be the correct solution, I agree; I'm looking for an emergency fix right now.)
Does anyone know of a workaround, so I can keep the same singular templates going until a proper fix is in place? I was trying to get COALESCE and/or CAST to work, but I struck out with my attempts.

What you want is IS NOT DISTINCT FROM.
SELECT id FROM table WHERE message IS NOT DISTINCT FROM 'the text';
SELECT id FROM table WHERE message IS NOT DISTINCT FROM NULL;
NULL IS NOT DISTINCT FROM NULL is true, not NULL, so it's like = but with different NULL comparison semantics. Great in trigger functions.
AFAIK you can't use IS DISTINCT FROM for index lookups though, so be careful there. It can be better to use separate tests for the null case and the value case.
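For example, a minimal sketch of those separate tests, keeping a single %(message)s template parameter (the UNION ALL shape is my own suggestion, not from the original answer; check the plan with EXPLAIN):
SELECT id FROM table WHERE message = %(message)s
UNION ALL
SELECT id FROM table WHERE %(message)s IS NULL AND message IS NULL ;
Only one branch can return rows for any given parameter value, and the equality branch can use a plain btree index on message.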

You can try writing your query clause as follows:
WHERE message = %(message)s OR (%(message)s IS NULL AND message IS NULL)
It's a bit rough, but it means "select the messages that match my parameter, or all the messages that are null if my parameter is null". It should do the trick.

Unfortunately, NULL does not actually equal anything (not even another NULL) as the value of NULL is intended to represent an unknown. Your best bet is to change your templates to handle this correctly.
If it's possible that you can pass in separate values for the left and right operand in your template, one way to still use an equal sign would be:
SELECT id FROM table WHERE true = (message IS NULL);

Related

Postgres: getting "... is out of range for type integer" when using NULLIF

For context, this issue occurred in a Go program I am writing using the default postgres database driver.
I have been building a service to talk to a postgres database which has a table similar to the one listed below:
CREATE TABLE object (
id SERIAL PRIMARY KEY NOT NULL,
name VARCHAR(255) UNIQUE,
some_other_id BIGINT UNIQUE
...
);
I have created some endpoints for this item including an "Install" endpoint which effectively acts as an upsert function like so:
INSERT INTO object (name, some_other_id)
VALUES ($1, $2)
ON CONFLICT (name) DO UPDATE SET
some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id)
I also have an "Update" endpoint with an underlying query like so:
UPDATE object
SET some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id)
WHERE name = $1
The problem:
Whenever I run the update query I always run into the error, referencing the field "some_other_id":
pq: value "1010101010144" is out of range for type integer
However, this error never occurs on the "upsert" version of the query, even when the row already exists in the database (when it has to evaluate the COALESCE expression). I have been able to prevent this error by updating the COALESCE expression as follows:
COALESCE(NULLIF($2, CAST(0 AS BIGINT)), object.some_other_id)
But as it never occurs with the first query, I wondered whether this inconsistency comes from me doing something wrong or something that I don't understand? And what is the best practice here, should I be casting all values?
I am definitely passing a 64-bit integer to the query for "some_other_id", and the first query works in the Go implementation even without the explicit type cast.
If any more information (or Go implementation) is required then please let me know, many thanks in advance! (:
Edit:
To eliminate confusion, the queries are being executed directly in Go code like so:
res, err := s.db.ExecContext(ctx, `UPDATE object SET some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id) WHERE name = $1`,
"a name",
1010101010144,
)
Both queries are executed in exactly the same way.
Edit: Also corrected parameter (from $51 to $2) in my current workaround.
I would also like to take this opportunity to note that the query does work with my proposed fix, which suggests that the issue is me confusing postgres with types in the NULLIF expression? There is no stored procedure asking for an INTEGER arg in between my code and the database, at least none that I have written.
This has to do with how the postgres parser resolves types for the parameters. I don't know how exactly it's implemented, but given the observed behaviour, I would assume that the INSERT query doesn't fail because it is clear from (name,some_other_id) VALUES ($1,$2) that the $2 parameter should have the same type as the target some_other_id column, which is of type int8. This type information is then also used in the NULLIF expression of the DO UPDATE SET part of the query.
You can also test this assumption by using (name) VALUES ($1) in the INSERT and you'll see that the NULLIF expression in DO UPDATE SET will then fail the same way as it does in the UPDATE query.
So the UPDATE query fails because there is not enough context for the parser to infer the accurate type of the $2 parameter. The "closest" thing that the parser can use to infer the type of $2 is the NULLIF call expression, specifically it uses the type of the second argument of the call expression, i.e. 0, which is of type int4, and it then uses that type information for the first argument, i.e. $2.
To avoid this issue, you should use an explicit type cast with any parameter whose type cannot be inferred accurately, i.e. use NULLIF($2::int8, 0).
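Applied to the failing query, the recommended cast would make the UPDATE look like this:
UPDATE object
SET some_other_id = COALESCE(NULLIF($2::int8, 0), object.some_other_id)
WHERE name = $1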
COALESCE(NULLIF($51, CAST(0 AS BIGINT)), object.some_other_id)
Fifty-one? Really?
pq: value "1010101010144" is out of range for type integer
Pay attention: the data type in the error message is integer, not bigint.
I think the reason for the error lies outside the code shown. So I take out a magic crystal ball and make a pass with my hands.
an "Install" endpoint which effectively acts as an upsert function like so
I also have an "Update" endpoint
By "endpoint", do you mean a PostgreSQL function (stored procedure)? I think yes.
Also, $1 and $2 look like PostgreSQL function arguments.
The magic crystal ball says: you have two PostgreSQL functions with different argument data types:
"Install" endpoint has $2 function argument as a bigint data type. It looks like CREATE FUNCTION Install(VARCHAR(255), bigint)
"Update" endpoint has $2 function argument as an integer data type, not bigint. It looks like CREATE FUNCTION Update(VARCHAR(255), integer).
Lastly, I would rewrite your condition more understandably:
UPDATE object
SET some_other_id =
CASE
WHEN $2 = 0 THEN object.some_other_id
ELSE $2
END
WHERE name = $1

PreparedStatement setNull in SELECT query

I am using Postgresql together with HikariCP and my query is something like
SELECT * FROM my_table WHERE int_val = ? ...
Now, I would like to set a NULL value for my variable - I have tried:
ps.setNull(1, Types.INTEGER); // ps is instance of PreparedStatement
try (ResultSet rs = ps.executeQuery()) {
... // get result from resultset
}
Although I have rows matching the conditions (NULL in column 'int_val'), I have not received any records.
The problem is (I think) in the query produced by the Statement, which looks like:
System.out.println(ps.toString());
// --> SELECT * FROM my_table WHERE int_val = NULL ...
But the query should look like:
"SELECT * FROM my_table WHERE int_val IS NULL ..." - this query works
I need to use dynamically created PreparedStatements which will contain NULL values, so I cannot easily bypass this.
I have tried creating the connection without HikariCP with the same result, so I think the problem is in the postgresql driver? Or am I doing something wrong?
UPDATE:
Based on the answer from @Vao Tsun I have set transform_null_equals = on in postgresql.conf, which started changing val = null --> val is null in 'simple' Statements, but NOT in PreparedStatements.
To summarize:
try (ResultSet rs = st.executeQuery("SELECT * FROM my_table WHERE int_val = NULL")) {
// query is replaced to '.. int_val IS NULL ..' and gets correct result
}
ps.setNull(1, Types.INTEGER);
try (ResultSet rs = ps.executeQuery()) {
// Does not get replaced and does not get any result
}
I am using JVM version 1.8.0_121, the latest postgres driver (42.1.4), but I have also tried older driver (9.4.1212). Database version -- PostgreSQL 9.6.2, compiled by Visual C++ build 1800, 64-bit.
It is intended behaviour that the comparison x = null evaluates to null (no matter what x is equal to). Basically, for SQL, NULL is unknown, not an actual value... To bypass it you can set transform_null_equals to on or true. Please check out the docs:
https://www.postgresql.org/docs/current/static/functions-comparison.html
Some applications might expect that expression = NULL returns true if expression evaluates to the null value. It is highly recommended that these applications be modified to comply with the SQL standard. However, if that cannot be done the transform_null_equals configuration variable is available. If it is enabled, PostgreSQL will convert x = NULL clauses to x IS NULL.
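For example, it can also be enabled per session rather than in postgresql.conf (a minimal sketch):
SET transform_null_equals = on;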
I have just found a solution, which works the same for "values" and "NULLs" by using IS NOT DISTINCT FROM instead of =.
More on the PostgreSQL wiki.
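A sketch of the rewritten query, keeping the single placeholder so it works with both setInt and setNull:
SELECT * FROM my_table WHERE int_val IS NOT DISTINCT FROM ?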
It is important to recognize that null is not a value in SQL. It encodes the logical notion of "unknown". This is why null = var never evaluates to true, even when var is null. So even if you are replacing the value of your variable (aka ? in your case) with a value of null, the result by definition must not be what you expect, as long as the SQL standard is complied with.
Now there are some databases around that try to outsmart the SQL standard by treating a column value of null as a programming language null (nil, undef, or whatever is used for that purpose).
This creates some convenience for the unwary programmer, but in the long run causes grief as soon as you need a true distinction between an SQL null and a programming language null.
Nevertheless, for ease of porting from such databases to PostgreSQL (or simply for ease of lazy programming) you may resort to setting transform_null_equals.
BUT, you are using prepared statements. A prepared statement is converted to a query plan once, and that plan needs to be valid for all potential values of the variables used in the prepared statement's query. Now, VAR IS NULL is fundamentally different from VAR = ?. So there is no chance for the query parser, query optimizer, or even the query execution engine to dynamically rewrite the (already prepared) query based on the actual parameter values passed in.
From this, you should take seriously the recommendation given in the documentation of transform_null_equals and change your code to use VAR IS NULL when a null value is to be searched for, and VAR = ? for other cases.
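A minimal sketch of that two-statement approach, with the application choosing the query based on whether the search value is null:
-- when the search value is non-null:
SELECT * FROM my_table WHERE int_val = ?
-- when the search value is null:
SELECT * FROM my_table WHERE int_val IS NULL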

DB2: invalid use of one of the following: an untyped parameter marker, the DEFAULT keyword, or a null value

A user can only modify the ST_ASSMT_NM and CAN_DT columns in the ST_ASSMT_REF record. In our system we keep history in the same table; we never really update a record, we just insert a new row to represent the updated record. As a result, the "active" record is the one with the greatest LAST_TS timestamp value for a VENDR_ID. To prevent the possibility of an update to columns that cannot be changed, I wrote the logical UPDATE so that it retrieves the unchangeable values from the original record and copies them to the new one being created. The fields that can be modified I pass as params:
INSERT INTO GSAS.ST_ASSMT_REF
(
VENDR_ID
,ST_ASSMT_NM
,ST_CD
,EFF_DT
,CAN_DT
,LAST_TS
,LAST_OPER_ID
)
SELECT
ORIG_ST_ASSMT_REF.VENDR_ID
,#ST_ASSMT_NM
,ORIG_ST_ASSMT_REF.ST_CD
,ORIG_ST_ASSMT_REF.EFF_DT
,#CAN_DT
,CURRENT TIMESTAMP
,#LAST_OPER_ID
FROM
(
SELECT
ST_ASSMT_REF_ACTIVE_V.VENDR_ID
,ST_ASSMT_REF_ACTIVE_V.ST_ASSMT_NM
,ST_ASSMT_REF_ACTIVE_V.ST_CD
,ST_ASSMT_REF_ACTIVE_V.EFF_DT
,ST_ASSMT_REF_ACTIVE_V.CAN_DT
,CURRENT TIMESTAMP
,ST_ASSMT_REF_ACTIVE_V.LAST_OPER_ID
FROM
G2YF.ST_ASSMT_REF_ACTIVE_V ST_ASSMT_REF_ACTIVE_V --The view of only the most recent, active records
WHERE
ST_ASSMT_REF_ACTIVE_V.VENDR_ID = #VENDR_ID
) ORIG_ST_ASSMT_REF;
However, I am getting this error:
DB2 SP:
ERROR [42610] [IBM][DB2] SQL0418N The statement was not processed because the statement contains an invalid use of one of the following: an untyped parameter marker, the DEFAULT keyword, or a null value.
It appears as though DB2 will not allow me to use a variable in a SELECT statement. For example, when I do this in TOAD for DB2:
select 1, #vendorId from SYSIBM.SYSDUMMY1
I get a popup dialog box. When I provide any string value, I get the same error.
I usually use SQL Server, and I'm pretty sure I wouldn't have an issue doing this there, but I am not sure how to handle it yet.
Suggestions? I know that I could do this in two separate commands: one SELECT query to retrieve the original values, then supplying the returned values along with the modified ones to the INSERT command. But I should be able to do this in one. Why can't I?
As you mentioned in your comment, DB2 is really picky about data types, and it wants you to cast your variables into the right data types. Even if you are passing in NULLs, sometimes DB2 wants you to cast the NULL to the data type of the target column.
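A hedged sketch of the SELECT list with explicit casts added (the VARCHAR(255), DATE, and CHAR(8) types here are assumptions; substitute the actual types of the ST_ASSMT_NM, CAN_DT, and LAST_OPER_ID columns):
SELECT
ORIG_ST_ASSMT_REF.VENDR_ID
,CAST(#ST_ASSMT_NM AS VARCHAR(255))
,ORIG_ST_ASSMT_REF.ST_CD
,ORIG_ST_ASSMT_REF.EFF_DT
,CAST(#CAN_DT AS DATE)
,CURRENT TIMESTAMP
,CAST(#LAST_OPER_ID AS CHAR(8))
FROM
(...) ORIG_ST_ASSMT_REF; -- subquery unchanged from the question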
Here is another answer I have on the topic.

How can I change all occurrences of a particular value in any column in PostgreSQL?

I have three different values in my database that represent a null: an actual null, an empty string, and a string {x:Null}. This value appears across multiple columns.
{x:Null} is normalized on the web front-end, so all these values look exactly the same although they end up ordered differently in a sort. How can I write a query that will take these values and make them actual nulls across every column and every table?
Bonus points if you can tell me how to make sure these other empty values are always inserted as nulls going forward. (Disclaimer: I have no power to grant any actual bonus points. ;)
You can query the information_schema to get a list of all tables and columns with a string type.
SELECT table_name, column_name
FROM information_schema.columns
WHERE data_type IN ('text', 'character', 'character varying')
NOTE: double-check first what values data_type has; I'm not sure if it will be character or char or what.
Then I would write a small program to update each column in each table. Here it is sketched out in Perl.
while ( my ($table, $column) = $sth->fetchrow_array ) {
    # quote_identifier() (not quote()) is the right call for table/column names
    my $q_table  = $dbh->quote_identifier($table);
    my $q_column = $dbh->quote_identifier($column);
    # qq[] so the quoted identifiers interpolate into the SQL
    $dbh->do(qq[
        UPDATE $q_table
        SET $q_column = NULL
        WHERE $q_column = '{x:Null}'
           OR $q_column = ''
    ]);
}
Be sure to escape $table and $column as identifiers, as in my sample.
Going forward, you'll have to set CONSTRAINTS on each and every column. You can use information_schema.columns for this as well. Something like:
ALTER TABLE $q_table ADD CHECK ($q_column NOT IN ('{x:Null}', ''))
You could use a trigger to change the values to NULL, but I don't like data stores that silently change basic data for application purposes.
For new columns and tables, you'll have to remember to add that constraint. Same caveats about data_type apply.
However, it's probably a bad idea to say that no column can ever be an empty string. You might want to be a bit more selective.
Another thing to note: NULL is a funny thing; it's not true and it's not false. You might be better off deciding that an empty string is the thing to set empty values to.
I don't think this approach is maintainable. It's scribbling an application rule all over the data layer. What if you have some data that doesn't follow that rule? And it will have to be continuously maintained for any new data schema added. Perhaps instead you should put this at your ORM layer. Or write a few stored procedures to take care of this.
Using the information_schema.columns table, write a procedural-language routine which iterates through all applicable tables and columns, executing UPDATE ... SET column = NULL WHERE column IN ('', '{x:Null}') for each eligible column.
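A minimal sketch of such a routine as a PL/pgSQL DO block (assumes PostgreSQL 9.1+ for format(); the %I placeholders quote identifiers safely, and the join on information_schema.tables keeps views out of the loop):
DO $$
DECLARE
    rec record;
BEGIN
    FOR rec IN
        SELECT c.table_schema, c.table_name, c.column_name
        FROM information_schema.columns c
        JOIN information_schema.tables t
          ON t.table_schema = c.table_schema AND t.table_name = c.table_name
        WHERE t.table_type = 'BASE TABLE'
          AND c.data_type IN ('text', 'character', 'character varying')
          AND c.table_schema NOT IN ('pg_catalog', 'information_schema')
    LOOP
        -- turn both sentinel values into a real NULL in every text-ish column
        EXECUTE format(
            'UPDATE %I.%I SET %I = NULL WHERE %I IN ('''', ''{x:Null}'')',
            rec.table_schema, rec.table_name, rec.column_name, rec.column_name);
    END LOOP;
END
$$;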
As for inserting these values as NULL going forward, you would have to set triggers on your tables to intercept these values and replace them with NULL.
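A sketch of one such trigger for a single hypothetical table and column (some_table and some_column are placeholders; a real setup needs one per affected column):
CREATE FUNCTION null_out_sentinels() RETURNS trigger AS $$
BEGIN
    -- replace the sentinel values with a real NULL before the row is stored
    IF NEW.some_column IN ('', '{x:Null}') THEN
        NEW.some_column := NULL;
    END IF;
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER null_out_sentinels_trg
BEFORE INSERT OR UPDATE ON some_table
FOR EACH ROW EXECUTE PROCEDURE null_out_sentinels();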
I don't think there is any query that would do this thing for every table and every column. In principle, what you want to do is
UPDATE table SET column=NULL WHERE column='' OR column='{x:Null}';
You could try selecting from the pg_attribute and pg_class catalogs to get the names of the tables and the columns, and then generating the queries automatically. Be sure to select only those columns that contain textual data.
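A sketch of that catalog query (relkind = 'r' restricts it to ordinary tables; attnum > 0 and NOT attisdropped skip system and dropped columns):
SELECT c.relname, a.attname
FROM pg_class c
JOIN pg_attribute a ON a.attrelid = c.oid
JOIN pg_type t ON t.oid = a.atttypid
WHERE c.relkind = 'r'
  AND a.attnum > 0
  AND NOT a.attisdropped
  AND t.typname IN ('text', 'varchar', 'bpchar');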
What if somebody has entered a genuine string '{x:Null}'? You would then change it into NULL.
However, you have made a real mistake by letting the situation get as bad as it currently is. You should always normalize data before putting it into a database.

Copying data from varchar2 to number in the same table

I have two columns and I need to copy data from column VISITSAUTHORIZED to NEWVISITS. When I use the command below to copy the data, I get the error "invalid number". Can anyone correct this?
VISITSAUTHORIZED VARCHAR2(10)
NEWVISITS NUMBER(8)
SQL> update patientinsurance set NEWVISITS=VISITSAUTHORIZED ;
ERROR at line 1:
ORA-01722: invalid number
It depends what kind of data you have in your old column. If it is all consistently formatted then you might be able to do:
update patientinsurance
set newvisits = to_number(visitsauthorized, '<format model>')
But it sounds more likely that you have something less easy to deal with. (The joys of storing data as the wrong datatype, which I assume is what you're now correcting). If there are rogue characters then you could use translate to get rid of them, perhaps, but you'd have to wonder about the integrity of the data and the values you end up with.
You can do something like this to display all the values that can't be converted, which may give you an idea of the best way to proceed - if there are only a few you might be able to correct them manually before re-running your update:
set serveroutput on
declare
newvisits number;
number_format_exception exception;
pragma exception_init(number_format_exception, -6502);
begin
for r in (select id, visitsauthorized from patientinsurance) loop
begin
newvisits := to_number(r.visitsauthorized);
exception
when number_format_exception then
dbms_output.put_line(sqlcode || ' ID ' || r.id
|| ' value ' || r.visitsauthorized);
end;
end loop;
end;
/
This is guessing you have a unique identifier field called ID, but change that as appropriate for your table, obviously.
Another approach is to convert the numbers that are valid and skip over the rest, which you can do with an error logging table:
exec dbms_errlog.create_error_log(dml_table_name => 'PATIENTINSURANCE');
merge into patientinsurance target
using (select id, visitsauthorized from patientinsurance) source
on (target.id = source.id)
when matched then
update set target.newvisits = source.visitsauthorized
log errors into err$_patientinsurance reject limit unlimited;
You can then query the error table to see what failed:
select id, visitsauthorized, ora_err_number$
from err$_patientinsurance;
Or see which records in your main table have newvisits still null. Analysing your data should probably be the first step though.
If you want to strip out all non-numeric characters and treat whatever is left as a number then you can change the merge to do:
...
update set target.newvisits = regexp_replace(source.visitsauthorized,
'[^[:digit:]]', null)
But then you probably don't need the merge, you can just do:
update patientinsurance set newvisits = regexp_replace(visitsauthorized,
'[^[:digit:]]', null);
This will strip out any group or decimal separators as well, which might not be an issue, particularly as you're inserting into a number(8) column. But you could preserve those if you wanted to, by changing the pattern to '[^[:digit:].,]'... though that could give you other problems still, potentially.
You can also do this with translate if the regex is too slow.
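For instance, a common translate idiom for stripping non-digits (a sketch I'm adding for illustration; verify it against your data before running the real update):
UPDATE patientinsurance
SET newvisits = translate(visitsauthorized,
                          '0123456789' || translate(visitsauthorized, 'x0123456789', 'x'),
                          '0123456789');
The inner translate reduces each value to just its non-digit characters; the outer one then keeps the digits and deletes everything in that list.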