Multi-column index - PostgreSQL

I have 4 columns and I would like to match a record if any of the 4 columns match any of an array of values, something like this (syntax is not correct, but this is the idea):
SELECT * FROM y WHERE (col1,col2,col3,col4) IN (val1,val2,val3,val4)
Right now I'm using this syntax:
SELECT *
FROM y
WHERE col1 IN (val1, val2, val3, val4)
   OR col2 IN (val1, val2, val3, val4)
   OR col3 IN (val1, val2, val3, val4)
   OR col4 IN (val1, val2, val3, val4)
I have four individual indexes, one on each column, but I'm wondering if there's a better type of multi-column index I could use.
So two questions:
Is there a better type of index than individual ones on each of col1, col2, col3 and col4?
What's the syntax to use in the WHERE clause?

One index I can recommend in such cases is a bloom filter.
For that, you'll need PostgreSQL v9.6 or later, and you'll have to install the bloom extension:
CREATE EXTENSION bloom;
Then you can create a multicolumn index:
CREATE INDEX ON y USING bloom (col1, col2, col3, col4);
This index will support your query.
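You can confirm that the planner actually uses the index by checking the plan; a quick sketch (val1 through val4 stand in for your real values), where a sufficiently large table should show a bitmap scan on the bloom index rather than a sequential scan:
EXPLAIN (ANALYZE)
SELECT * FROM y
WHERE col1 IN (val1, val2, val3, val4)
   OR col2 IN (val1, val2, val3, val4)
   OR col3 IN (val1, val2, val3, val4)
   OR col4 IN (val1, val2, val3, val4);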
If the OR creates a performance problem, try using UNION; each branch can then use the index on its own column, and UNION (unlike UNION ALL) removes the duplicates that arise when a row matches more than one condition:
SELECT * FROM y WHERE col1 IN (val1,val2,val3,val4)
UNION
SELECT * FROM y WHERE col2 IN (val1,val2,val3,val4)
UNION
SELECT * FROM y WHERE col3 IN (val1,val2,val3,val4)
UNION
SELECT * FROM y WHERE col4 IN (val1,val2,val3,val4);

Related

DB2 - Concat all values in a column into a single string

Let's say I have a table like this:
test
Col1   Col2
----   ----
A      1
B      1
C      1
D      2
I am running the query select col1 from test where col2 = 1;
This returns the values A, B and C in 3 separate rows.
I want the SQL to return a single row with the value A|B|C. Is this possible to do? If so, how?
You can use the LISTAGG function like this:
SELECT LISTAGG("Col1", '|')
FROM "test"
WHERE "Col2" = 1
If LISTAGG is not available, it can be reproduced with XMLAGG:
SELECT SUBSTR(XMLSERIALIZE(XMLAGG(XMLTEXT('|'||"Col1"))),2)
FROM "test"
WHERE "Col2" = 1

Split a comma separated string of unknown elements to multiple columns in PostgreSQL 11.0

I have a table in PostgreSQL with following values.
col1
; substrate
positive allosteric modulator
inducer; substrate
I would like to split the row values on ';' into multiple columns. As per my understanding, the split_part() function only works for a fixed number of values.
How can I get the below output?
col1                           col2                           col3
; substrate                                                   substrate
positive allosteric modulator  positive allosteric modulator
inducer; substrate             inducer                        substrate
Thanks
You can split it into an array, then access each array element:
select col1,
       elements[1] as col2,
       elements[2] as col3
from (
    select col1, regexp_split_to_array(col1, '\s*;\s*') as elements
    from the_table
) t
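If the number of elements is truly unbounded, pivoting into a fixed set of columns won't scale; as a sketch (assuming the same the_table, with made-up aliases), you can split into rows instead and keep each element's position:
select t.col1, u.elem, u.idx
from the_table t
cross join lateral unnest(regexp_split_to_array(t.col1, '\s*;\s*'))
    with ordinality as u(elem, idx)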

SQL Where Clause with multiple and joined conditions

In the table below, I want to write a SQL query that excludes a row when col1 = 2 and col2 = 1. Only when both conditions are met do I want to drop that row. If col1 = 2 but col2 <> 1, I want to keep that row.
col1   col2
----   ----
1      2
1      2
2      2
2      1
I am trying the snippet below, but it's not working:
select *
from table
where (col1<>2 and col2<>1)
You can use either:
WHERE NOT (col1 = 2 AND col2 = 1)
or
WHERE (col1 <> 2 OR col2 <> 1)
You should use OR instead of AND: by De Morgan's laws, NOT (col1 = 2 AND col2 = 1) is equivalent to (col1 <> 2 OR col2 <> 1).
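A minimal sketch against the sample data from the question (the table name t is made up):
CREATE TABLE t (col1 INT, col2 INT);
INSERT INTO t VALUES (1, 2), (1, 2), (2, 2), (2, 1);

SELECT * FROM t WHERE NOT (col1 = 2 AND col2 = 1);
-- returns (1,2), (1,2) and (2,2); only the (2,1) row is excluded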

Query not working with ENUM field in Oracle NoSQL Database

I encountered issues executing a SELECT query with a WHERE clause on an ENUM field.
Here's a sample query which is not working:
kv-> execute "select * from Table1_TBL where col1 < 100 and col1 >10 and Table1Summaries.values($value.col2 = 'VAL1')"
In general, enum columns behave like strings in comparisons. So if col2 is a column of table Table1_TBL that is declared as an enum, the query should be the following:
select * from Table1_TBL where col1 < 100 and col1 > 10 and col2 = 'VAL1'
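For context, a sketch of how such an enum column could be declared in Oracle NoSQL table DDL (the key column and the enum values are assumptions carried over from the question):
CREATE TABLE Table1_TBL (
    id INTEGER,
    col1 INTEGER,
    col2 ENUM(VAL1, VAL2),  -- enum values are declared unquoted in the DDL
    PRIMARY KEY (id)
)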

Retrieving the rows using join query

I have two tables like this
A            B
----------   ----------
col1 col2    col1 col2
A table contains 300k rows
B table contains 400k rows
I need to count the values of col1 in table A that match col1 in table B.
I have written a query like this:
select count(distinct ab.col1) from A ab join B bc on(ab.col1=bc.col1)
but this takes too much time
You could try a GROUP BY...
Also ensure that col1 is indexed in both tables (a sketch follows the query below):
SELECT COUNT(col1)
FROM (
    SELECT aa.col1
    FROM A aa
    JOIN B bb ON aa.col1 = bb.col1
    GROUP BY aa.col1
) t
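For the index advice above, a minimal sketch (PostgreSQL syntax, with made-up index names):
CREATE INDEX a_col1_idx ON A (col1);
CREATE INDEX b_col1_idx ON B (col1);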
It's difficult to answer without you posting more details: did you analyze the tables? Do you have an index on col1 in each table? How many rows are you counting?
That being said, there aren't many potential query plans for your query. You likely have two seq scans that are hash joined together, which is about the best you can do... If you have a material number of rows, you'll be counting a gazillion rows, and this takes time.
Perhaps you could rewrite the query differently? If every B.col1 is in A.col1, you could get the same result without the join:
select count(distinct col1) from B
If A has low cardinality, it might be faster to rely on exists():
with vals as (
    select distinct A.col1 as val from A
)
select count(*) from vals
where exists (select 1 from B where B.col1 = vals.val)
Or, if you know every possible value from A.col1 and it's reasonably small, you could unnest an array without querying A at all:
select count(*) from unnest(Array[val1, val2, ...]) as vals (val)
where exists(select 1 from B where B.col1 = vals.val)
Or vice versa for each of the above, if B holds the reference values instead.