Does SQL execute subqueries fully? - postgresql

Imagine I have this SQL query, and table2 is HUGE:
select product_id, count(product_id)
from table1
where table2_ptr_id in (select id
                        from table2
                        where author is not null)
Will SQL first execute the subquery and load all of table2 into memory? For example, if table1 has 10 rows and table2 has 10 million rows, would it be better to join first and then filter? Or is the DB smart enough to optimize the query as it is written?

You have to EXPLAIN the query to know what it is doing.
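For example, using the tables from your question (note that EXPLAIN ANALYZE actually executes the statement):
EXPLAIN (ANALYZE, BUFFERS)
SELECT product_id
FROM table1
WHERE table2_ptr_id IN (SELECT id
                        FROM table2
                        WHERE author IS NOT NULL);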
However, your query will likely perform better in PostgreSQL if you rewrite it to:
SELECT product_id
FROM table1
WHERE EXISTS (SELECT 1
              FROM table2
              WHERE table2.id = table1.table2_ptr_id
              AND table2.author IS NOT NULL);
Then PostgreSQL can use a semi-join, which will probably perform much better with a huge table2.
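If table2.id is the primary key, its index already supports the probe; otherwise (or to keep the probed index small), a partial index is an option. A sketch, assuming no such index exists yet:
CREATE INDEX table2_author_notnull_idx
ON table2 (id)
WHERE author IS NOT NULL;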
Remark: the count(product_id) in your query doesn't make any sense to me, since the statement has no GROUP BY.
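If a per-product count was intended (an assumption on my part), the rewritten query would need one, for example:
SELECT product_id, count(*)
FROM table1
WHERE EXISTS (SELECT 1
              FROM table2
              WHERE table2.id = table1.table2_ptr_id
              AND table2.author IS NOT NULL)
GROUP BY product_id;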

Related

Postgres select queries inside a transaction are slow

What could be the reason that select queries performed inside a transaction are really slow,
even though the queries filter on nothing but the primary key?
I have two tables; after updating one row in table1, a query on table2 within the same transaction is really slow,
even though the query on table2 is just:
select * from table2 where id = 10
table1 and table2 both have a lot of rows.
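One way to narrow this down is to capture the plan and timing from inside the open transaction, for example (the update here is a hypothetical stand-in for yours):
BEGIN;
UPDATE table1 SET some_col = 'x' WHERE id = 1;  -- hypothetical update; replace with yours
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM table2 WHERE id = 10;
ROLLBACK;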

Indexes to support OR condition over a JOIN

I'm wondering if Postgres has any support for optimizing the following fundamental problem.
I want to search against two columns on different tables joined via a foreign key, and I have created an index on each column. If I run my join query with a WHERE condition on one column or the other, the respective index is used to filter the result and query performance is great. If I use two WHERE conditions combined by an OR, one per table, the query gets very slow and no indexes are used. Presumably this is because the optimizer sees no other way than a full table join and scan. The query looks something like this:
select table1.id
from table1
left join table2 on table1.fk = table2.id
where table1.haystack ilike '%needle%'
   or table2.haystack ilike '%needle%'
The operation (ilike) isn't the issue and is interchangeable; I have a working trigram index set up. I just want to find out if there is any other way to make this type of query performant besides denormalizing all searched fields into one table.
I would be very grateful for any ideas.
No, there is no special support in the database to optimize this. Do it yourself:
SELECT table1.id
FROM table1
JOIN table2 ON table1.fk = table2.id
WHERE table1.haystack ILIKE '%needle%'
UNION
SELECT table1.id
FROM table1
JOIN table2 ON table1.fk = table2.id
WHERE table2.haystack ILIKE '%needle%'
Provided both conditions are selective and indexed with a trigram index, and you have indexes on the join condition, that will be faster.
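A sketch of the supporting indexes, assuming the pg_trgm extension and the column names from the question:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX ON table1 USING gin (haystack gin_trgm_ops);
CREATE INDEX ON table2 USING gin (haystack gin_trgm_ops);
CREATE INDEX ON table1 (fk);  -- join condition; table2.id is presumably the primary key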

How to update a joined table using a condition

I'm having an issue with a simple update statement. I'm new to PostgreSQL and I'm still stuck on MS SQL Server syntax.
What I want to do is update all records from table1 which are not present in / don't exist in table2. Table1 and table2 have a one-to-one relation. The join column is "colx" in my example.
On Ms SQL Server I would have something like this:
UPDATE table1
SET col1 = 'some value'
FROM table1 t1
LEFT JOIN table2 t2 ON t1.colx = t2.colx
WHERE t2.colx IS NULL
or
UPDATE table1
SET col1 = 'some value'
FROM table1 t1
WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t1.colx = t2.colx)
My issue is that when performing the same on PostgreSQL, it updates all records from table1, not only the records matching the condition (e.g. I was expecting 4 records to be updated, but all records from table1 were updated instead).
I checked the join condition with a select statement for all the approaches above and I get the expected result (e.g. 4 records).
Is there anything I'm missing?
Your question is not very clear about the requirement.
What I understood is that you want to update the value of col1 in table1 for those records which are not present in table2.
The reason your statements update every row is that in PostgreSQL, repeating table1 in the FROM clause brings in a second, independent instance of the table; the WHERE clause restricts that second instance, not the rows being updated, so every target row is updated as long as the filtered join produces any rows at all. You can do it this way in PostgreSQL instead:
UPDATE table1 t1
SET col1 = 'some value'
WHERE NOT EXISTS (SELECT 1
                  FROM table2
                  WHERE colx = t1.colx)
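A minimal, self-contained demo (with hypothetical data) that you can run as-is:
CREATE TEMP TABLE table1 (colx int, col1 text);
CREATE TEMP TABLE table2 (colx int);
INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO table2 VALUES (1);  -- only colx = 1 has a match in table2
UPDATE table1 t1
SET col1 = 'some value'
WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.colx = t1.colx);
-- rows with colx 2 and 3 are updated; the row with colx 1 is left alone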

Dynamic values to another SQL statement

Is there a way to combine two SQL queries into a single SELECT query in PostgreSQL?
My requirements are as follows:
SELECT id FROM table1;
SELECT name FROM table2 WHERE table2.id = table1.id;
I think I need to pass the values of table1.id as some sort of dynamic values (loop values) for use in the SELECT statement executed on table2. What is the easiest way to solve this problem? Is it possible to do this with stored procedures or functions in PostgreSQL?
select t1.id, name
from table1 t1
inner join table2 t2 using (id)
where t1.id = 1
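The where clause restricts the result to a single id; drop it to get the name for every row of table1. If you need more involved per-row logic, a correlated LATERAL join is another option (a sketch using the same tables):
select t1.id, t2.name
from table1 t1
left join lateral (select name
                   from table2
                   where table2.id = t1.id) t2 on true;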

Insert data into table efficiently, postgresql

I am new to postgresql (and databases in general) and was hoping to get some pointers on improving the efficiency of the following statement.
I am inserting data from one table into another and do not want to insert duplicate values. I have a rid (a unique identifier in each table) that is indexed and is the primary key.
I am currently using the following statement:
INSERT INTO table1
SELECT * FROM table2
WHERE rid NOT IN (SELECT rid FROM table1)
As of now table1 is 200,000 records and table2 is 20,000 records. Table1 is going to keep growing (probably to around 2,000,000), while table2 will stay around 20,000 records. As of now the statement takes about 15 minutes to run. I am concerned that as table1 grows this is going to take way too long. Any suggestions?
This should be more efficient than your current query, since NOT EXISTS can be planned as an anti-join (and, unlike NOT IN, is not tripped up if rid is ever NULL):
INSERT INTO table1
SELECT *
FROM table2
WHERE NOT EXISTS (SELECT 1
                  FROM table1
                  WHERE table1.rid = table2.rid);
insert into table1
select t2.*
from table2 t2
left join table1 t1 on t1.rid = t2.rid
where t1.rid is null