Compare two string arrays in PostgreSQL - postgresql

In a Postgres DB, I have 2 varchar[] columns in 2 tables with common ID that contain data as below:
I need to return rows where all values of t_col are present in f_col (with or without extra data).
Desired result:
I tried with the below as conditions in self join query.
select
from t1 a, t2 b
where a.id=b.id
not (a.f_col #> b.t_col)

Related

Join two tables on all columns to determine if they contain identical information

I want to check if tables table_a and table_b are identical. I thought I could full outer join both tables on all columns and count the number of rows and missing values. However, both tables have many columns and I do not want to explicitly type out every column name.
Both tables have the same number of columns as well as names. How can I full outer join both of them on all columns without explicitly typing every column name?
I would like to do something along this syntax:
select
count(1)
,sum(case when x.id is null then 1 else 0 end) as x_nulls
,sum(case when y.id is null then 1 else 0 end) as y_nulls
from
x
full outer join
y
on
*
;
You can use NATURAL FULL OUTER JOIN here. The NATURAL key word will join on all columns that have the same name.
Just testing if the tables are identical could then be:
SELECT *
FROM x NATURAL FULL OUTER JOIN y
WHERE x.id IS NULL OR y.id IS NULL
This will show "orphaned" rows in either table.
You might use except operators.
For example the following would return an empty set if both tables contain the same rows:
select * from t1
except
select * from t2;
If you want to find rows in t1 that are different to those in t2 you could do
select * from t1
where not exists (select * from t1 except select * from t2);
Provided the number and types of columns match you can use select *, the tables' columns can vary in names; you could also invert the above and union to return combined differences.

JPA equivalent of PostgreSQL with null comparison

I have a requirement something like there are two db tables A and B.
Table A and table B both have column 'merit' of type integer. The requirement is to find the entries from table A those have merit that matches with the least merit number in table B.
If merit in B is NULL for all the entries, the query should result all the entries from table A.
If merit in B has valid numbers, the query should result the entries of A those match the least merit number in table B.
Sample data is like this::
TABLE A TABLE B
COL1 COL2 MERIT COL1 COL2 MERIT
a ab 1 c ac 1
b bc 2 d ad 3
From above data the least merit in B is 1 so, only the matching entry should result from Table A.
If merit column in B is null for all the entries ie. B has no valid number for merit, two entries from A should result.
So, I came up with the below sql query::
select A.* from A where A.merit IS NOT DISTINCT FROM (
select min(B.merit) from B where B.merit IS NOT NULL);
I am unable to write the JPA equivalent of this sql because of "IS NOT DISTINCT FROM".
The below queries are not working.
select a from A a where a.merit in (select min(b.merit) from B b where b.merit is not null)
select a from A a where a.merit = (select min(b.merit) from B b where b.merit is not null)
My environment is POSTGRESQL, HIBERNATE in QUARKUS.
Thanks for any suggestions.

It takes a long time to compare two table values (large volume)

Table a has 100 row with name columns.
Table b is the same structure as Table A, but has 10 million rows.
Create a query to verify that the value of table a is not in table b.
However, comparing the value of table a with the value of table b takes too long.
I want to complete the work in 5 seconds, but I don't know how.
Below is the method I tried. Both table name columns have b-tree indexes.
1.
select
name
from
a
where
and not exists (select
name
from
b
where
a.name = b.name
);
select
a.name
from
a left outer join b
on a.name = b.name
where
b.name is null;
You want the following index on the B table:
CREATE INDEX name_idx ON b (name);
This should allow Postgres to rapidly lookup any of the 100 names in the a table against the above index. This should avoid the need to do a full table scan of the b table, which, as you are seeing, can be costly.

dynamically choose fields from different table based on existense

I have two tables A and B.
Both the tables have same number of columns.
Table A always contains all ids of Table B.
Need to fetch row from Table B first if it does not exist then have
to fetch from Table A.
I was trying to dynamically do this
select
CASE
WHEN b.id is null THEN
a.*
ELSE
b.*
END
from A a
left join B b on b.id = a.id
I think this syntax is not correct.
Can some one suggest how to proceed.
It looks like you want to select all columns from table A except when a matching ID exists in table B. In that case you want to select all columns from table B.
That can be done with this query as long as the number and types of columns in both tables are compatible:
select * from a where not exists (select 1 from b where b.id = a.id)
union all
select * from b
If the number, types, or order of columns differs you will need to explicitly specify the columns to return in each sub query.

Full outer join on multiple tables in PostgreSQL

In PostgreSQL, I have N tables, each consisting of two columns: id and value. Within each table, id is a unique identifier and value is numeric.
I would like to join all the tables using id and, for each id, create a sum of values of all the tables where the id is present (meaning the id may be present only in subset of tables).
I was trying the following query:
SELECT COALESCE(a.id, b.id, c.id) AS id,
COALESCE(a.value,0) + COALESCE(b.value,0) + COALESCE(c.value.0) AS value
FROM
a
FULL OUTER JOIN
b
ON (a.id=b.id)
FULL OUTER JOIN
c
ON (b.id=c.id)
But it doesn't work for cases when the id is present in a and c, but not in b.
I suppose I would have to do some bracketing like:
SELECT COALESCE(x.id, c.id) AS id, x.value+c.value AS value
FROM
(SELECT COALESCE(a.id, b.id), a.value+b.value AS value
FROM
a
FULL OUTER JOIN
b
ON (a.id=b.id)
) AS x
FULL OUTER JOIN
c
ON (x.id = c.id)
It was only 3 tables and the code is ugly enough already imho. Is there some elegant, systematic ways how to do the join for N tables? Not to get lost in my code?
I would also like to point out that I did some simplifications in my example. Tables a, b, c, ..., are actually results of quite complex queries over several materialized views. But the syntactical problem remains the same.
I understood you need to sum the values from N tables and group them by id, correct?
For that I would do this:
Select x.id, sum (x.value) from (
Select * from a
Union all
Select * from b
Union all........
) as x group by x.id;
Since the n tables are composed by the same fields you can union them all creating a big table full of all the id - value tuples from all tables. Use union all because union filters for duplicates!
Then just sum all the values grouped by id.