comparing two fields that may be null


Is there a comparison operator where a.unitnum = b.unitnum would be true if both a.unitnum and b.unitnum are null? It seems that a.unitnum IS b.unitnum is invalid.

Yes, there are IS DISTINCT FROM and IS NOT DISTINCT FROM:
postgres=# \pset null ****
Null display is "****".
postgres=# select null = null;
┌──────────┐
│ ?column? │
╞══════════╡
│ **** │
└──────────┘
(1 row)
postgres=# select null is not distinct from null;
┌──────────┐
│ ?column? │
╞══════════╡
│ t │
└──────────┘
(1 row)
postgres=# select 10 = null;
┌──────────┐
│ ?column? │
╞══════════╡
│ **** │
└──────────┘
(1 row)
postgres=# select 10 is distinct from null;
┌──────────┐
│ ?column? │
╞══════════╡
│ t │
└──────────┘
(1 row)
postgres=# select 10 is not distinct from null;
┌──────────┐
│ ?column? │
╞══════════╡
│ f │
└──────────┘
(1 row)
postgres=# select 10 is not distinct from 20;
┌──────────┐
│ ?column? │
╞══════════╡
│ f │
└──────────┘
(1 row)
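Applied to the question's columns, the operator can go straight into a select list or a join condition. A minimal sketch, assuming hypothetical tables a and b that each have an id and a nullable integer unitnum (the sample rows are invented for illustration):

```sql
-- Hypothetical sample data standing in for tables a and b.
with a(id, unitnum) as (
    values (1, 10), (2, null), (3, 30)
),
b(id, unitnum) as (
    values (1, 10), (2, null), (3, 99)
)
select a.id,
       a.unitnum = b.unitnum                    as plain_equals,
       a.unitnum is not distinct from b.unitnum as null_safe_equals
from a
join b using (id)
order by a.id;
```

For id 2, where both sides are null, plain_equals is null while null_safe_equals is true. For the other two rows both columns give the same answer.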

Yes, there is (the transform_null_equals setting), but using it is not recommended. Here is a sample:
t=# select null = null;
?column?
----------
(1 row)
t=# set transform_null_equals = on;
SET
t=# select null = null;
?column?
----------
t
(1 row)
UPDATE: apparently this works only for comparisons of the form column = NULL, not column = column:
t=# with s as (select null::int a, null::int b) select a <> b from s;
?column?
----------
(1 row)
So the shortest check that both values are null would be coalesce (note that this expression is also true when the first non-null value is 0):
t=# with s as (select null::int a, null::int b) select coalesce(a,b,0) = 0 from s;
?column?
----------
t
(1 row)
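A small sketch with made-up rows shows where the coalesce check and IS NOT DISTINCT FROM disagree. The sentinel value 0 is taken from the query above; everything else is illustrative:

```sql
with s(a, b) as (
    values (null::int, null::int),  -- both null
           (0, 5),                  -- different values, a equals the sentinel
           (null, 0),               -- only one side is null
           (5, 5)                   -- equal, non-null
)
select a, b,
       coalesce(a, b, 0) = 0    as coalesce_check,
       a is not distinct from b as null_safe_equals
from s;
```

The two result columns agree only on the first row, so the coalesce form is usable only when the sentinel can never occur in the data and only "both null" needs to be detected.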

IF a.unitnum IS NULL AND b.unitnum IS NULL
THEN
    RAISE NOTICE 'unitnum field is null in both a and b tables';
ELSE
    RAISE NOTICE 'unitnum field is not null in at least one of the a or b tables';
END IF;

No, but you can use a.unitnum = b.unitnum or (a.unitnum is null and b.unitnum is null).

If you need to handle all cases:
a.unitnum is null, b.unitnum is null
a.unitnum is null, b.unitnum is not null
a.unitnum is not null, b.unitnum is null
a.unitnum is not null, b.unitnum is not null
then you may want to use this expression:
select *
from a, b
where
((a.unitnum is not null) and (b.unitnum is not null) and (a.unitnum = b.unitnum)) or
((a.unitnum is null) and (b.unitnum is null));
Here you can test how it works:
SELECT
((a is not null) and (b is not null) and (a = b)) or
((a is null) and (b is null))
FROM (VALUES (null,null)
, (null,1)
, (1,null)
, (1,1)
, (1,2)
) t1 (a, b);
P.S.
Just use IS NOT DISTINCT FROM from the accepted answer. It works the same but is shorter.
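To check that the two forms really agree, both can be evaluated side by side over the same test rows. This is only a sketch reusing the VALUES list from above; the column aliases long_form and short_form are made up here:

```sql
select a, b,
       (((a is not null) and (b is not null) and (a = b)) or
        ((a is null) and (b is null))) as long_form,
       a is not distinct from b        as short_form
from (values (null::int, null::int)
           , (null, 1)
           , (1, null)
           , (1, 1)
           , (1, 2)
     ) t1 (a, b);
```

Both columns should be true for (null, null) and (1, 1) and false for the other three rows.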

Related

Single Column Table query much faster than with Multi Column Table

I have a bunch of strings for which I would like to find the best match in a table column. This column contains about 400,000 rows and the strings are no longer than 100 characters.
When I run the query against the whole table (8 columns in total), it takes about 32 seconds. All text columns have a GIN index.
with my_phrases(phrase) as (
values
('ABC'),
('123'),
('XYZ'),
('MNO'),
('KLM'),
('FOO'),
('AYE'),
('OPS')
)
select my_phrases.phrase, best_match.phrase, best_match.similarity
from my_phrases,
lateral (
select opm.phrase, similarity(opm.phrase, my_phrases.phrase) similarity
from my_table opm
where phrase % my_phrases.phrase
order by opm.phrase <-> my_phrases.phrase
limit 1
) best_match
order by my_phrases.phrase
;
Now, when I copy the column phrase into a separate table and add a GIN index, the query takes about 400 ms.
with my_phrases(phrase) as (
values
('ABC'),
('123'),
('XYZ'),
('MNO'),
('KLM'),
('FOO'),
('AYE'),
('OPS')
)
select my_phrases.phrase, best_match.phrase, best_match.similarity
from my_phrases,
lateral (
select phrase, similarity(phrase, my_phrases.phrase) similarity
from my_one_column_table
where phrase % my_phrases.phrase
order by phrase <-> my_phrases.phrase
limit 1
) best_match
order by my_phrases.phrase
;
Here is more info on the server:
select version(); ->
PostgreSQL 14.3 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), 64-bit
The multi-column table was created from a csv as follows:
create table my_table
(
id serial primary key,
p_id uuid,
p_name varchar,
b_id uuid,
b_name varchar,
pt_id uuid,
pt_name varchar,
phrase varchar
);
\copy my_table(p_id, p_name, b_id, b_name, pt_id, pt_name, phrase) from 'XXX.csv' csv header;
create index my_table__p_name__gin on my_table using gin(p_name gin_trgm_ops);
create index my_table__b_name__gin on my_table using gin(b_name gin_trgm_ops);
create index my_table__pt_name__gin on my_table using gin(pt_name gin_trgm_ops);
create index my_table__phrase__gin on my_table using gin(phrase gin_trgm_ops);
\d+ my_table
Column │ Type │ Collation │ Nullable │ Default │ Storage │ Compression │ Stats target │ Description
═══════════════════╪═══════════════════╪═══════════╪══════════╪═══════════════════════════════════════════════════════╪══════════╪═════════════╪══════════════╪═════════════
id │ integer │ │ not null │ nextval('my_table_id_seq'::regclass) │ plain │ │ │
p_id │ uuid │ │ │ │ plain │ │ │
p_name │ character varying │ │ │ │ extended │ │ │
b_id │ uuid │ │ │ │ plain │ │ │
b_name │ character varying │ │ │ │ extended │ │ │
pt_id │ uuid │ │ │ │ plain │ │ │
pt_name │ character varying │ │ │ │ extended │ │ │
phrase │ character varying │ │ │ │ extended │ │ │
Indexes:
"my_table_pkey" PRIMARY KEY, btree (id)
"my_table__b_name__gin" gin (b_name gin_trgm_ops)
"my_table__phrase__gin" gin (phrase gin_trgm_ops)
"my_table__phrase__idx" btree (phrase)
"my_table__p_name__gin" gin (p_name gin_trgm_ops)
"my_table__pt_name__gin" gin (pt_name gin_trgm_ops)
Access method: heap
For the slow query, the query plan is here
For the fast query, the query plan is here

Postgresql - SQL query to list all sequences in database

I would like to select all sequences in the database, get the schema of sequence, dependent table, the schema of a table, dependent column.
I've tried the following query:
SELECT
ns.nspname AS sequence_schema_name,
s.relname AS sequence_name,
t_ns.nspname AS table_schema_name,
t.relname AS table_name,
a.attname AS column_name,
s.oid,
s.relnamespace,
d.*,
a.*
FROM pg_class s
JOIN pg_namespace ns
ON ns.oid = s.relnamespace
left JOIN pg_depend d --
ON d.objid = s.oid --TO FIX???
AND d.classid = 'pg_class'::regclass --TO FIX???
AND d.refclassid = 'pg_class'::regclass --TO FIX???
left JOIN pg_class t
ON t.oid = d.refobjid --TO FIX???
left JOIN pg_attribute a
ON a.attrelid = d.refobjid
AND a.attnum = d.refobjsubid
left JOIN pg_namespace t_ns
ON t.relnamespace = t_ns.oid
WHERE s.relkind = 'S'
;
Unfortunately, this query does not work completely: it filters out some sequences.
I need it for further processing (after data restore on different ENV, I need to find max column-value and set sequence to MAX+1).
Could anyone help me?

The following query should work:
create table foo(id serial, v integer);
create table boo(id_boo serial, v integer);
create sequence omega;
create table bubu(id integer default nextval('omega'), v integer);
select sn.nspname as seq_schema,
s.relname as seqname,
st.nspname as tableschema,
t.relname as tablename,
at.attname as columname
from pg_class s
join pg_namespace sn on sn.oid = s.relnamespace
join pg_depend d on d.refobjid = s.oid
join pg_attrdef a on d.objid = a.oid
join pg_attribute at on at.attrelid = a.adrelid and at.attnum = a.adnum
join pg_class t on t.oid = a.adrelid
join pg_namespace st on st.oid = t.relnamespace
where s.relkind = 'S'
and d.classid = 'pg_attrdef'::regclass
and d.refclassid = 'pg_class'::regclass;
┌────────────┬────────────────┬─────────────┬───────────┬───────────┐
│ seq_schema │ seqname │ tableschema │ tablename │ columname │
╞════════════╪════════════════╪═════════════╪═══════════╪═══════════╡
│ public │ foo_id_seq │ public │ foo │ id │
│ public │ boo_id_boo_seq │ public │ boo │ id_boo │
│ public │ omega │ public │ bubu │ id │
└────────────┴────────────────┴─────────────┴───────────┴───────────┘
(3 rows)
To call sequence-related functions you can use the s.oid column. In this case it is the unique OID of the sequence. You need to cast it to regclass.
A script for your request could look like this:
do $$
declare
r record;
max_val bigint;
begin
for r in
select s.oid as seqoid,
at.attname as colname,
a.adrelid as reloid
from pg_class s
join pg_namespace sn on sn.oid = s.relnamespace
join pg_depend d on d.refobjid = s.oid
join pg_attrdef a on d.objid = a.oid
join pg_attribute at on at.attrelid = a.adrelid and at.attnum = a.adnum
where s.relkind = 'S'
and d.classid = 'pg_attrdef'::regclass
and d.refclassid = 'pg_class'::regclass
loop
-- taking a lock here is probably safer; in a safe (single-user) maintenance mode
-- it is not necessary
execute format('lock table %s in exclusive mode', r.reloid::regclass);
-- expect the usual one sequence per table
execute format('select COALESCE(max(%I),0) from %s', r.colname, r.reloid::regclass)
into max_val;
-- set sequence
perform setval(r.seqoid, max_val + 1);
end loop;
end;
$$;
Note: Using %s for the table name or sequence name in the format function is safe here, because the cast from oid to regclass generates a safe string (the schema is added whenever it is necessary, and escaping is applied whenever it is needed).
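For a single known table, a shorter route may be the built-in pg_get_serial_sequence function, which returns the sequence owned by a column. A minimal sketch, using a hypothetical table restored_items with invented rows. It only covers sequences owned by the column (serial or identity), so a standalone sequence such as omega would still need the catalog query:

```sql
-- Hypothetical table, named differently from foo to avoid clashing with it.
create table restored_items(id serial, v integer);

-- Explicit ids simulate restored data: the sequence has not advanced.
insert into restored_items(id, v) values (41, 1), (42, 2);

-- is_called = false makes the next nextval() return exactly this value.
select setval(pg_get_serial_sequence('restored_items', 'id'),
              coalesce(max(id), 0) + 1,
              false)
from restored_items;

select nextval(pg_get_serial_sequence('restored_items', 'id'));  -- returns 43
```

Passing false as the third argument is a design choice: with the two-argument form used in the script above, the next value handed out would be one higher than the value passed to setval.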

Postgresql pattern match a selection

I'm trying to find a selection where the start of a column matches a column in another table in postgres. I'm looking to do something along the lines of the following.
Return all records in table1 where table1.name starts with any of the labels in table2.labels.
SELECT
name
FROM table1
WHERE
name LIKE (SELECT distinct label FROM table2);

You should append a % sign to each label to use it in LIKE. Also, use any(), as the subquery may yield more than one row.
select name
from table1
where name like any(select distinct concat(label, '%') from table2);
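One caveat with building the pattern by concatenation: any % or _ inside a label is itself treated as a LIKE wildcard. The sketch below uses invented sample rows and shows the LIKE ANY form next to starts_with, which compares the prefix literally. It assumes a server where starts_with is available (PostgreSQL 11 or later, as far as I know):

```sql
with table1(name) as (
    values ('alpha_one'), ('alphaXone'), ('beta'), ('gamma')
),
table2(label) as (
    values ('alpha_'), ('be')
)
select t1.name,
       t1.name like any (select concat(label, '%') from table2) as like_any_match,
       exists (select 1
               from table2 t2
               where starts_with(t1.name, t2.label))            as literal_prefix_match
from table1 t1
order by t1.name;
```

Here 'alphaXone' matches under LIKE ANY, because the underscore in the label 'alpha_' matches any single character, but it does not match as a literal prefix.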

You can join data from tables on any operator you want, including for example the regexp matching operator ~.
begin;
create table so.a(f1 text);
create table so.b(f2 text);
insert into so.a(f1)
select md5(x::text)
from generate_series(1, 300) t(x);
insert into so.b(f2)
select substring(md5(x::text) from (20*random())::int for 4)
from generate_series(1, 20) t(x);
select f2, f1
from so.a
join so.b
on a.f1 ~ b.f2
order by f2;
rollback;
Which gives:
pgloader# \i /Users/dim/dev/temp/stackoverflow/45693581.sql
BEGIN
CREATE TABLE
CREATE TABLE
INSERT 0 300
INSERT 0 20
f2 │ f1
══════╪══════════════════════════════════
12bd │ 6512bd43d9caa6e02c990b0a82652dca
3708 │ 98f13708210194c475687be6106a3b84
4d76 │ c20ad4d76fe97759aa27a0c99bff6710
5d77 │ e4da3b7fbbce2345d7772b0674a318d5
5f74 │ 1f0e3dad99908345f7439f8ffabdffc4
5fce │ 8f14e45fceea167a5a36dedd4bea2543
6790 │ 1679091c5a880faf6fb5e6087eb1b2dc
6802 │ d3d9446802a44259755d38e6d163e820
6816 │ 6f4922f45568161a8cdf4ad2299f6d23
74d9 │ c74d97b01eae257e44aa9d5bade97baf
7ff0 │ 9bf31c7ff062936a96d3c8bd1f8f2ff3
820d │ c4ca4238a0b923820dcc509a6f75849b
87e4 │ eccbc87e4b5ce2fe28308fd9f2a7baf3
95fb │ c9f0f895fb98ab9159f51fd0297e236d
aab │ 32bb90e8976aab5298d5da10fe66f21d
aab │ aab3238922bcc25a6f606eb525ffdc56
aab │ d395771085aab05244a4fb8fd91bf4ee
c51c │ 45c48cce2e2d7fbdea1afc51c7c6ad26
c51c │ c51ce410c124a10e0db5e4b97fc2af39
ce2e │ 45c48cce2e2d7fbdea1afc51c7c6ad26
d918 │ a87ff679a2f3e71d9181a67b7542122c
e728 │ c81e728d9d4c2f636f067f89cc14862c
fdf2 │ 70efdf2ec9b086079795c442636b55fb
(23 rows)
ROLLBACK
The dataset isn't very interesting, granted. You can speed that up by using the pg_trgm extension: https://www.postgresql.org/docs/current/static/pgtrgm.html

Postgres: Create duplicates of existing rows, changing one value?

I am working in Postgres 9.4. I have a table that looks like this:
Column │ Type │ Modifiers
─────────────────┼──────────────────────┼───────────────────────
id │ integer │ not null default
total_list_size │ integer │ not null
date │ date │ not null
pct_id │ character varying(3) │
I want to take all values where date='2015-09-01', and create identical new entries with the date 2015-10-01.
How can I best do this?
I can get the list of values to copy with SELECT * from mytable WHERE date='2015-09-01', but I'm not sure what to do after that.

If the column id is serial, then:
INSERT INTO mytable (total_list_size, date, pct_id)
SELECT total_list_size, '2015-10-01', pct_id
FROM mytable
WHERE date = '2015-09-01';
Otherwise, if you want the ids to be duplicated (this fails if id is a primary key or has a unique constraint):
INSERT INTO mytable (id, total_list_size, date, pct_id)
SELECT id, total_list_size, '2015-10-01', pct_id
FROM mytable
WHERE date = '2015-09-01';

How to check table UNLOGGED with postgresql?

CREATE UNLOGGED TABLE IF NOT EXISTS <tablename>
How can I first check if the desired table is created UNLOGGED, and if not alter the table accordingly?
postgres 9.4

You can check the relpersistence column of the pg_class catalog:
postgres=# select relpersistence, relname from pg_class where relname like 'foo%';
┌────────────────┬─────────┐
│ relpersistence │ relname │
╞════════════════╪═════════╡
│ p │ foo │
│ p │ foo1 │
│ u │ foo2 │
└────────────────┴─────────┘
(3 rows)
foo2 is an unlogged table (relpersistence 'u'); 'p' marks a permanent table.
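To act on that check, the lookup can be wrapped in a DO block. This is a sketch only: the table name foo is taken from the listing above, the public schema is an assumption, and ALTER TABLE ... SET UNLOGGED was added in PostgreSQL 9.5, so on 9.4 the table would have to be recreated instead:

```sql
do $$
begin
    -- relpersistence: 'p' = permanent, 'u' = unlogged, 't' = temporary
    if exists (
        select 1
        from pg_class c
        join pg_namespace n on n.oid = c.relnamespace
        where n.nspname = 'public'      -- assumed schema
          and c.relname = 'foo'         -- table from the listing above
          and c.relkind = 'r'           -- ordinary tables only
          and c.relpersistence = 'p'
    ) then
        alter table public.foo set unlogged;  -- requires PostgreSQL 9.5 or later
    end if;
end;
$$;
```

If the table does not exist or is already unlogged, the block does nothing. Switching persistence rewrites the table, so it may be slow on large tables.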
