How to optimize the script - postgresql

I have the following SQL code:
select t1.*
from t1
join t3 on t3.id = t1.id
join t2 on t1.num = t2.num and coalesce(t1.date,t3.date) >= t2.date
But this script is not optimal at all, probably because of the inequality in the join. Is there a way to rewrite this? Nothing comes to my mind.

You can add indexes on the columns t1.id, t3.id, t1.num, t2.num,
t1.date, t2.date and t3.date so that the planner can use index scans
during query execution.
Postgresql - Index optimization for Date columns
Alternatively, if one exists, also add a direct join condition between
t2 and t3.
Return only the specific columns you need instead of "*".
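A minimal sketch of the indexing advice above, assuming the table and column names from the question. The index names are hypothetical, and the composite index on t2 is only one possible choice. Note that coalesce(t1.date, t3.date) spans two tables, so no single index can cover that expression; whether the planner actually uses any of these indexes depends on the data, so it is worth checking the plan with EXPLAIN.

```sql
-- Hypothetical index names; tables and columns are taken from the question.
CREATE INDEX IF NOT EXISTS t1_id_idx  ON t1 (id);
CREATE INDEX IF NOT EXISTS t3_id_idx  ON t3 (id);
CREATE INDEX IF NOT EXISTS t1_num_idx ON t1 (num);

-- Composite index: the equality column first, then the range column.
CREATE INDEX IF NOT EXISTS t2_num_date_idx ON t2 (num, date);

-- Inspect the plan the optimizer really chooses (this executes the query).
EXPLAIN (ANALYZE, BUFFERS)
SELECT t1.id, t1.num, t1.date
FROM t1
JOIN t3 ON t3.id = t1.id
JOIN t2 ON t1.num = t2.num
       AND coalesce(t1.date, t3.date) >= t2.date;
```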

Related

Using LIMIT Statement in INNER JOIN (postgreSQL)

I am having trouble using the LIMIT Statement. I would really appreciate your help.
I am trying to INNER JOIN three tables and use the LIMIT statement to only query a few lines because the tables are so huge.
So, basically, this is what I am trying to accomplish:
SELECT *
FROM ((scheme1.table1
INNER JOIN scheme1.table2
ON scheme1.table1.column1 = scheme1.table2.column1 LIMIT 1)
INNER JOIN scheme1.table3
ON scheme1.table1.column1 = scheme1.table3.column1)
LIMIT 1;
I get a syntax error on the LIMIT from the first INNER JOIN. Why? How can I limit the results I get from each of the INNER JOINs? If I only use the second "LIMIT 1" at the bottom, I will query the entire table.
Thanks a lot!
LIMIT can only be applied to queries, not to a table reference. So you need to use a complete SELECT query for table2 in order to be able to use the LIMIT clause:
SELECT *
FROM schema1.table1 as t1
INNER JOIN (
select *
from schema1.table2
order by ???
limit 1
) as t2 ON t1.column1 = t2.column1
INNER JOIN schema1.table3 as t3 ON t1.column1 = t3.column1
order by ???
limit 1;
Note that LIMIT without an ORDER BY typically makes no sense as results of a query have no inherent sort order. You should think about applying the necessary ORDER BY in the derived table (aka sub-query) and the outer query to get consistent and deterministic results.
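If the goal is one table2 row per table1 row, rather than one table2 row overall, a LATERAL subquery is one way to express it in PostgreSQL. This is only a sketch: created_at is a hypothetical ordering column that does not appear in the question, and the outer ORDER BY is likewise an assumption.

```sql
SELECT *
FROM schema1.table1 AS t1
CROSS JOIN LATERAL (
    -- Evaluated once per t1 row, so LIMIT applies per joined row.
    SELECT *
    FROM schema1.table2 AS x
    WHERE x.column1 = t1.column1
    ORDER BY x.created_at   -- hypothetical column, replace with a real one
    LIMIT 1
) AS t2
INNER JOIN schema1.table3 AS t3 ON t1.column1 = t3.column1
ORDER BY t1.column1         -- assumed ordering for the final LIMIT
LIMIT 1;
```

With the plain derived table, the single row chosen from table2 may not match any row of table1, in which case the query returns nothing; the LATERAL form picks the row after the match is known.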

PostgreSQL SELECT query using same structure as a DELETE query

With MySQL, I perform a delete with a join as follows:
DELETE t1
FROM t1
LEFT OUTER JOIN t2 ON t2.fk_id = t1.id
WHERE t2.id IS NULL;
Before executing the query, I would typically make sure I am okay with the outcome first by viewing the soon-to-be-deleted records as follows:
SELECT t1.*
FROM t1
LEFT OUTER JOIN t2 ON t2.fk_id=t1.id
WHERE t2.id IS NULL;
Using PostgreSQL, I believe I should perform the same delete action as follows (EDIT below):
DELETE FROM t1
USING t2
WHERE t1.id = t2.fk_id
AND t2.id IS NULL;
Concerned that I might not be correct, I unsuccessfully tried the following:
SELECT t1.*
USING t2
WHERE t2.fk_id=t1.id
AND t2.id IS NULL;
Is there a way with PostgreSQL to perform a SELECT query using the same structure as a DELETE query?
EDIT - Not even correct since I am effectively performing an INNER JOIN and will not delete any records, and will need to change it as follows. Still would appreciate an answer to my original question.
DELETE FROM t1
USING t1 t_1
LEFT OUTER JOIN t2 ON t2.fk_id=t_1.id
WHERE t1.id=t_1.id AND t2.id IS NULL;
I would use a NOT EXISTS condition:
delete from t1
where not exists (select *
from t2
where t1.id = t2.fk_id);
That's typically faster than a join and it's also standard compliant SQL.
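The same predicate can be reused unchanged in a SELECT, which gives the preview the question asks for. A sketch, using the t1 and t2 tables from the question; the second part is an alternative that relies on PostgreSQL's RETURNING clause inside a transaction that is rolled back.

```sql
-- Preview: identical WHERE clause to the DELETE above.
SELECT t1.*
FROM t1
WHERE NOT EXISTS (SELECT 1
                  FROM t2
                  WHERE t2.fk_id = t1.id);

-- Alternative: run the real DELETE, look at what it removed, then undo it.
BEGIN;
DELETE FROM t1
WHERE NOT EXISTS (SELECT 1
                  FROM t2
                  WHERE t2.fk_id = t1.id)
RETURNING t1.*;
ROLLBACK;  -- use COMMIT instead once the output looks right
```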

Get an Empty Column from a table when filter condition fetches empty in PostgreSQL

I have two tables table1(id,name,type) and table2(id,source,destination)
When I run query
SELECT
name,
source,
destination
FROM
table1,
table2
WHERE
table1.id=table2.id
If there is no matching id between the two tables, can I still get empty columns for source and destination?
Yes, you basically want an OUTER JOIN. Remember to always use the explicit ANSI JOIN syntax and not the implicit comma syntax for joins. Also use proper table aliases to avoid ambiguity.
SELECT
t1.name,
t2.source,
t2.destination
FROM
table1 t1 left outer join
table2 t2 ON t1.id = t2.id
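With the LEFT OUTER JOIN, unmatched rows come back with NULL in source and destination. If an empty string is wanted instead, COALESCE can substitute one. This sketch assumes both columns are text types, which the question does not state.

```sql
SELECT
    t1.name,
    COALESCE(t2.source, '')      AS source,       -- '' when no match in table2
    COALESCE(t2.destination, '') AS destination
FROM table1 t1
LEFT OUTER JOIN table2 t2 ON t1.id = t2.id;
```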

Update a Table with value from another Table. PL SQL

I frequently use the JOIN and UPDATE below to bring a value from one table to another in TSQL.
UPDATE T1
SET T1.Mobile = T2.Mobile
FROM Table1 T1 INNER JOIN Table2 T2 ON T1.ID = T2.ID
Then I found that in PL SQL, the syntax for such an update is either a MERGE or a nested statement.
Neither appears to be as straightforward as the TSQL solution, which makes me wonder whether PL SQL developers actually perform such cross-table updates, or whether some other development principle makes such updates unnecessary.
One comment I got from a PL SQL developer is that they'd rather create a view like
CREATE VIEW MyView AS
(
SELECT T1.Field1, T1.Field2, T2.Mobile
FROM Table1 T1 INNER JOIN Table2 T2 ON T1.ID = T2.ID
);
This looks like a viable solution, provided that the join does not introduce duplicates into Table1/MyView, or that dedup logic is put on top.
One obvious benefit for this is that we can continue refreshing Table2.Mobile, and MyView will always have the updated value.
I am seeking comment on coding principle. :)
You may update using a correlated subquery:
UPDATE Table1 T1
SET T1.Mobile = (SELECT Mobile FROM Table2 T2 WHERE T1.ID = T2.ID);
Use the following query to get inner-join behavior.
Without the WHERE clause it behaves like a left join, so rows with no match in T2 would get an empty Mobile value.
UPDATE Table1 T1
SET T1.Mobile = (SELECT min(Mobile) FROM Table2 T2 WHERE T1.ID = T2.ID)
where T1.ID in (SELECT T2.ID FROM Table2 T2)
;
And with min(Mobile) you avoid an error when there is a
1:n relation between T1 and T2.
To speed up the statements, you should create indexes on T1.ID and T2.ID.
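For comparison, the MERGE form mentioned in the question might look like the sketch below in Oracle, using the Table1 and Table2 names from the question. It only touches rows that have a match, so no extra WHERE clause is needed. As far as I know, Oracle rejects the statement with an "unable to get a stable set of rows" error if several Table2 rows match the same Table1 row, so the 1:n case still needs deduplication first.

```sql
MERGE INTO Table1 T1
USING Table2 T2
ON (T1.ID = T2.ID)
WHEN MATCHED THEN
    -- Only matched rows are updated; unmatched Table1 rows keep their Mobile.
    UPDATE SET T1.Mobile = T2.Mobile;
```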

Criteria in WHERE vs. JOIN

Are there any differences, limitations, or considerations when adding additional criteria to a JOIN rather than including them in a WHERE clause? Example:
SELECT
*
FROM
TABLE1 t1
INNER JOIN
TABLE2 t2
ON t1.a = t2.a
AND
t1.DATE_TIME < 06/01/2015
versus
SELECT
*
FROM
TABLE1 t1
INNER JOIN
TABLE2 t2
ON t1.a = t2.a
WHERE
t1.DATE_TIME < 06/01/2015
Every DBMS optimizer treats the two queries the same way, so there is no difference in performance between them. The second form is the most commonly used.
The optimizer will most likely treat those two the same.
But once you get to 4 or more joins, the query optimizer may fall back to a loop join, and in that case the two might be processed differently.
The safer bet is to put the condition in the join (the first form).
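The answers above concern INNER JOIN, where both placements return the same rows. With an OUTER JOIN the placement changes the result, which is worth keeping in mind before moving conditions around. A sketch using the tables from the question; the date literal assumes June 1, 2015 was meant.

```sql
-- Condition in ON: every t1 row is returned; t1 rows on or after the
-- cutoff simply get NULLs in the t2 columns.
SELECT *
FROM TABLE1 t1
LEFT JOIN TABLE2 t2
       ON t1.a = t2.a
      AND t1.DATE_TIME < DATE '2015-06-01';

-- Condition in WHERE: t1 rows on or after the cutoff are removed entirely.
SELECT *
FROM TABLE1 t1
LEFT JOIN TABLE2 t2
       ON t1.a = t2.a
WHERE t1.DATE_TIME < DATE '2015-06-01';
```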