PostgreSQL: an efficient way to select using IN and OR

I'm trying to query a table in which I have 3 or 4 columns that I want to match against an array/list of 1,000 to 10,000 values (passed as a parameter to a function), but it takes too much time. I read that using indexes might help, but query performance is still poor.
Here is my query:
SELECT
"id", "parcel_number", "alternate_parcel_number", "parcel_tax_number"
FROM "addresses"
WHERE
"parcel_number" IN ('A080100', ... 'A0368895224')
OR "alternate_parcel_number" IN ('A080100', ... 'A0368895224')
OR "parcel_tax_number" IN ('A080100', ... 'A0368895224');
With an array/list of 1,000 values it takes about 2 minutes to return the results.
Thanks.

I found a solution on a website; changing to VALUES gives a huge speed improvement.
URL: https://www.datadoghq.com/blog/100x-faster-postgres-performance-by-changing-1-line/
SELECT
"id", "parcel_number", "alternate_parcel_number", "parcel_tax_number"
FROM "addresses"
WHERE
"parcel_number" = ANY (VALUES ('A080100'), ... ('A0368895224'))
OR "alternate_parcel_number" = ANY (VALUES ('A080100'), ... ('A0368895224'))
OR "parcel_tax_number" = ANY (VALUES ('A080100'), ... ('A0368895224'));

It's not advisable to pass 10,000 values as parameters to a function. When I need to do that, my solution is:
Insert all the parameters into a temporary table and associate them with a key value:
CREATE TEMP TABLE temp_table (param_key INT, temp_param TEXT);
Pass only the key value to the function (a full sketch of this flow follows the query below)
Run the query:
SELECT id, parcel_number, alternate_parcel_number, parcel_tax_number
FROM addresses
JOIN temp_table ON parcel_number = temp_param
WHERE param_key = :param_key
UNION
SELECT id, parcel_number, alternate_parcel_number, parcel_tax_number
FROM addresses
JOIN temp_table ON alternate_parcel_number = temp_param
WHERE param_key = :param_key
UNION
SELECT id, parcel_number, alternate_parcel_number, parcel_tax_number
FROM addresses
JOIN temp_table ON parcel_tax_number = temp_param
WHERE param_key = :param_key;
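A minimal sketch of that flow, assuming a plpgsql helper function named find_parcels and text-typed parcel columns (the function name, return shape, and column types are assumptions; adjust them to your schema):
CREATE TEMP TABLE temp_table (param_key INT, temp_param TEXT);

-- the caller fills the temporary table once, tagging every row with one key
INSERT INTO temp_table (param_key, temp_param)
VALUES (1, 'A080100'), (1, 'A0368895224');

-- sketch only: find_parcels is an illustrative name, and addresses'
-- parcel columns are assumed to be text
CREATE OR REPLACE FUNCTION find_parcels(p_key INT)
RETURNS SETOF addresses
LANGUAGE plpgsql AS $$
BEGIN
  -- IN across the three columns replaces the three UNION branches;
  -- add DISTINCT if one row can match more than one column
  RETURN QUERY
    SELECT a.*
    FROM addresses a
    JOIN temp_table t
      ON t.temp_param IN (a.parcel_number,
                          a.alternate_parcel_number,
                          a.parcel_tax_number)
    WHERE t.param_key = p_key;
END;
$$;

-- only the key crosses the function boundary
SELECT * FROM find_parcels(1);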

Related

SQL Insert values with subquery and column multiplication

I am trying to insert a value with a subquery, including a column multiplication, into a table. The subquery works perfectly on its own; however, when I query the table after the insert, it only contains 0 values. Does anybody know why?
My queries are:
create table user_payments
(payment_stream number(4));
insert into user_payments (payment_stream)
select ct.min_price*s.stream_min as payment_stream
from users u
, streaming s
, content_type ct
, contract c
, media_content mc
where u.user_ID = s.user_ID
and mc.media_content_ID = s.media_content_ID
and ct.content_type_ID = mc.content_type_ID
and u.user_ID = c.user_ID
and c.contract_name = 'Pay as you go';
If I run the SELECT query on its own, I get the expected outcome; however, once the rows are inserted into the table, all values are 0.
Thanks for your help!
Try the data type NUMBER(18,2) for the payment_stream column. NUMBER(4) has a scale of 0, so the fractional results of the multiplication are rounded to whole numbers, and products smaller than 0.5 are stored as 0.
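A minimal sketch of that fix, assuming two decimal places are enough for the computed payments (the precision and scale are illustrative; drop or alter the existing column first):
create table user_payments
  (payment_stream number(18,2));   -- scale 2 keeps the fractional part

-- number(4) has scale 0, so a computed value such as 0.35 is rounded
-- to 0 on insert; with number(18,2) it survives:
insert into user_payments (payment_stream) values (0.35);
select payment_stream from user_payments;   -- returns 0.35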

Insert multiple rows where not exists PostgreSQL

I'd like to generate a single SQL query to mass-insert a series of rows that don't already exist in a table. My current setup makes a new query for each record insertion, similar to the solution detailed in WHERE NOT EXISTS in PostgreSQL gives syntax error, but I'd like to move this to a single query to optimize performance, since my current setup could generate several hundred queries at a time. Right now I'm trying something like the example below:
INSERT INTO users (first_name, last_name, uid)
SELECT ( 'John', 'Doe', '3sldkjfksjd'), ( 'Jane', 'Doe', 'adslkejkdsjfds')
WHERE NOT EXISTS (
SELECT * FROM users WHERE uid IN ('3sldkjfksjd', 'adslkejkdsjfds')
)
Postgres returns the following error:
PG::Error: ERROR: INSERT has more target columns than expressions
The problem is that PostgreSQL doesn't seem to want to take a series of values when using SELECT. Conversely, I can make the insertions using VALUES, but then I can't prevent duplicates using WHERE NOT EXISTS.
http://www.techonthenet.com/postgresql/insert.php suggests in the section EXAMPLE - USING SUB-SELECT that multiple records should be insertable from another referenced table using SELECT, so I'm wondering why I can't seem to pass in a series of values to insert. The values I'm passing are coming from an external API, so I need to generate the values to insert by hand.
Your select is not doing what you think it does.
The most compact version in PostgreSQL would be something like this:
with data(first_name, last_name, uid) as (
values
( 'John', 'Doe', '3sldkjfksjd'),
( 'Jane', 'Doe', 'adslkejkdsjfds')
)
insert into users (first_name, last_name, uid)
select d.first_name, d.last_name, d.uid
from data d
where not exists (select 1
from users u2
where u2.uid = d.uid);
Which is pretty much equivalent to:
insert into users (first_name, last_name, uid)
select d.first_name, d.last_name, d.uid
from (
select 'John' as first_name, 'Doe' as last_name, '3sldkjfksjd' as uid
union all
select 'Jane', 'Doe', 'adslkejkdsjfds'
) as d
where not exists (select 1
from users u2
where u2.uid = d.uid);
a_horse_with_no_name's answer actually has a syntax error (a missing final closing parenthesis), but other than that it is the correct way to do this.
Update:
For anyone coming to this with a situation like mine, if you have columns that need to be type cast (for instance timestamps or uuids or jsonb in PG 9.5), you must declare that in the values you pass to the query:
-- insert multiple if not exists
-- where another_column_name is of type uuid, with strings cast as uuids
-- where created_at and updated_at are of type timestamp, with strings cast as timestamps
WITH data (id, some_column_name, another_column_name, created_at, updated_at) AS (
VALUES
(<id value>, <some_column_name_value>, 'a5fa7660-8273-4ffd-b832-d94f081a4661'::uuid, '2016-06-13T12:15:27.552-07:00'::timestamp, '2016-06-13T12:15:27.879-07:00'::timestamp),
(<id value>, <some_column_name_value>, 'b9b17117-1e90-45c5-8f62-d03412d407dd'::uuid, '2016-06-13T12:08:17.683-07:00'::timestamp, '2016-06-13T12:08:17.801-07:00'::timestamp)
)
INSERT INTO table_name (id, some_column_name, another_column_name, created_at, updated_at)
SELECT d.id, d.some_column_name, d.another_column_name, d.created_at, d.updated_at
FROM data d
WHERE NOT EXISTS (SELECT 1 FROM table_name t WHERE t.id = d.id);
a_horse_with_no_name's answer saved me today on a project, but had to make these tweaks to make it perfect.

PostgreSQL - select the results of two subqueries

I have 2 complex queries that are both subqueries in Postgres; their results are:
q1_results = id, delta, metric_1
q2_results = id, delta, metric_2
I'd like to combine the results of the queries, so the outer query can access either:
results_a = id, delta, metric_1, metric_2
results_b = id, delta, combined_metric
I can't figure out how to do this. Online searches keep leading me to UNION, but that keeps the metrics in the same column; I need to keep them split.
It's not entirely clear what you're asking in the question and the comments, but it sounds like you might be looking for a full join with a bunch of coalesce statements, e.g.:
-- create view at your option, e.g.:
-- create view combined_query as
select coalesce(a.id, b.id) as id,
       coalesce(a.delta, b.delta) as delta,
       a.metric_1,
       b.metric_2,
       coalesce(a.metric_1, 0) + coalesce(b.metric_2, 0) as combined_metric
from (...) as a      -- the query producing q1_results
full join (...) as b -- the query producing q2_results
  on a.id = b.id     -- and a.delta = b.delta maybe?
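A self-contained toy version of that pattern, with two small CTEs standing in for the complex subqueries (the sample rows are made up):
WITH q1 (id, delta, metric_1) AS (
    VALUES (1, 10, 100), (2, 20, 200)
), q2 (id, delta, metric_2) AS (
    VALUES (2, 20, 5), (3, 30, 7)
)
SELECT coalesce(a.id, b.id)       AS id,
       coalesce(a.delta, b.delta) AS delta,
       a.metric_1,
       b.metric_2,
       coalesce(a.metric_1, 0) + coalesce(b.metric_2, 0) AS combined_metric
FROM q1 a
FULL JOIN q2 b ON a.id = b.id;
-- id 1 has only metric_1, id 3 has only metric_2, id 2 has both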

Combining INSERT INTO and WITH/CTE

I have a very complex CTE and I would like to insert the result into a physical table.
Is the following valid?
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos
(
BatchID,
AccountNo,
APartyNo,
SourceRowID
)
WITH tab (
-- some query
)
SELECT * FROM tab
I am thinking of using a function to create this CTE which will allow me to reuse. Any thoughts?
You need to put the CTE first and then combine the INSERT INTO with your select statement. Also, the "AS" keyword following the CTE's name is not optional:
WITH tab AS (
bla bla
)
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos (
BatchID,
AccountNo,
APartyNo,
SourceRowID
)
SELECT * FROM tab
Please note that the code assumes that the CTE will return exactly four fields and that those fields match, in order and type, the ones specified in the INSERT statement.
If that is not the case, just replace the "SELECT *" with a specific select of the fields that you require.
As for your question on using a function, I would say "it depends". If you are putting the data in a table just for performance reasons, and the speed is acceptable when using it through a function, then I'd consider a function to be an option.
On the other hand, if you need to use the result of the CTE in several different queries, and speed is already an issue, I'd go for a table (either regular, or temp).
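For the temp-table route, a minimal T-SQL sketch of materializing the CTE's result so several later queries can reuse it (dbo.SomeSourceTable is a placeholder for the real source):
WITH tab AS (
    -- the complex CTE goes here; this source table is a placeholder
    SELECT BatchID, AccountNo, APartyNo, SourceRowID
    FROM dbo.SomeSourceTable
)
SELECT BatchID, AccountNo, APartyNo, SourceRowID
INTO #tab_results            -- SELECT ... INTO creates the temp table
FROM tab;

-- reuse the materialized result as often as needed
SELECT * FROM #tab_results;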
WITH common_table_expression (Transact-SQL)
The WITH clause for Common Table Expressions goes at the top.
Wrapping every insert in a CTE has the benefit of visually segregating the query logic from the column mapping.
Spot the mistake:
WITH _INSERT_ AS (
SELECT
[BatchID] = blah
,[APartyNo] = blahblah
,[SourceRowID] = blahblahblah
FROM Table1 AS t1
)
INSERT Table2
([BatchID], [SourceRowID], [APartyNo])
SELECT [BatchID], [APartyNo], [SourceRowID]
FROM _INSERT_
Same mistake:
INSERT Table2 (
[BatchID]
,[SourceRowID]
,[APartyNo]
)
SELECT
[BatchID] = blah
,[APartyNo] = blahblah
,[SourceRowID] = blahblahblah
FROM Table1 AS t1
A few lines of boilerplate make it extremely easy to verify the code inserts the right number of columns in the right order, even with a very large number of columns. Your future self will thank you later.
Yep:
WITH tab AS (
bla bla
)
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos (BatchID, AccountNo, APartyNo, SourceRowID)
SELECT * FROM tab
Note that this is for SQL Server, which supports multiple CTEs:
WITH x AS (...), y AS (...) INSERT INTO z (a, b, c) SELECT a, b, c FROM y
Teradata allows only one CTE, and the syntax is as in your example.
Late to the party here, but for my purposes I wanted to be able to run code the user had input and store the result in a temp table. With Oracle there is no such issue: the INSERT goes at the start of the statement, before the WITH clause.
For this to work in SQL Server, the following worked:
INSERT INTO #stagetable EXECUTE (@InputSql);
(so the dynamic SQL in @InputSql can itself start with a WITH clause)
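A minimal sketch of that INSERT ... EXEC pattern (the staging table's shape and the contents of @InputSql are illustrative):
DECLARE @InputSql NVARCHAR(MAX) =
    N'WITH tab AS (SELECT 1 AS BatchID, 2 AS AccountNo) SELECT * FROM tab;';

CREATE TABLE #stagetable (BatchID INT, AccountNo INT);

INSERT INTO #stagetable
EXECUTE (@InputSql);         -- the dynamic SQL may itself start with a CTE

SELECT * FROM #stagetable;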