Improve a query with duplicated SELECTS?

I am running a query that selects from the exact same table twice, comparing the same set of values against two different columns of another table.
My current method:
DELETE FROM MY_TABLE
WHERE MY_TABLE.BUY_ORDER_ID
IN ( SELECT #tmp_table.order_id FROM #tmp_table )
OR MY_TABLE.SELL_ORDER_ID
IN ( SELECT #tmp_table.order_id FROM #tmp_table )
Is there a way to improve on the query?
Thanks

Possibly. Need to test on your data.
DELETE MY_TABLE
FROM MY_TABLE m
JOIN #tmp_table
on #tmp_table.order_id = m.BUY_ORDER_ID
or #tmp_table.order_id = m.SELL_ORDER_ID
If #tmp_table.order_id is the primary key or otherwise unique, declare it as such.
Splitting hairs, but maybe:
DELETE MY_TABLE
FROM MY_TABLE m
JOIN #tmp_table
on #tmp_table.order_id in ( m.BUY_ORDER_ID, m.SELL_ORDER_ID )

I've tried this on SQL Server and it seems faster. I suppose you can do something similar on Sybase?
DELETE FROM MY_TABLE
WHERE EXISTS
(
SELECT * FROM #tmp_table
WHERE
#tmp_table.order_id = MY_TABLE.BUY_ORDER_ID
OR
#tmp_table.order_id = MY_TABLE.SELL_ORDER_ID
)
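Before switching forms, it helps to confirm that the EXISTS version deletes the same rows as the original IN ... OR IN ... version. Below is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server or Sybase. The table is named tmp_table rather than #tmp_table, and the sample rows and the rows_left_after helper are made up for illustration. It only checks that the two forms are equivalent on this data; it says nothing about speed on a real server.

```python
import sqlite3

# Original form from the question: two IN subqueries joined by OR.
IN_FORM = """
    DELETE FROM MY_TABLE
    WHERE MY_TABLE.BUY_ORDER_ID IN (SELECT tmp_table.order_id FROM tmp_table)
       OR MY_TABLE.SELL_ORDER_ID IN (SELECT tmp_table.order_id FROM tmp_table)
"""

# Rewritten form from the answer: one correlated EXISTS subquery.
EXISTS_FORM = """
    DELETE FROM MY_TABLE
    WHERE EXISTS (
        SELECT * FROM tmp_table
        WHERE tmp_table.order_id = MY_TABLE.BUY_ORDER_ID
           OR tmp_table.order_id = MY_TABLE.SELL_ORDER_ID
    )
"""

def rows_left_after(delete_sql):
    """Build a fresh in-memory database, run one DELETE, return surviving ids."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE MY_TABLE (
            id INTEGER PRIMARY KEY,
            BUY_ORDER_ID INTEGER,
            SELL_ORDER_ID INTEGER
        );
        CREATE TABLE tmp_table (order_id INTEGER PRIMARY KEY);
        INSERT INTO MY_TABLE VALUES (1, 10, 20), (2, 30, 40), (3, 50, 10), (4, 60, 70);
        INSERT INTO tmp_table VALUES (10), (40);
    """)
    conn.execute(delete_sql)
    rows = conn.execute("SELECT id FROM MY_TABLE ORDER BY id").fetchall()
    conn.close()
    return [r[0] for r in rows]

if __name__ == "__main__":
    print(rows_left_after(IN_FORM))      # surviving ids for the IN form
    print(rows_left_after(EXISTS_FORM))  # surviving ids for the EXISTS form
```

To compare performance, run both statements on the real server against real data and look at the execution plans.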

Related

How to create multiple temp tables using records from a CTE that I need to call multiple times in Postgres plpgsql Procedure?

UPDATE:
I am using the CTE because I am using a LOOP to loop in batches of 10000.
I am already using a CTE within a plpgsql procedure to grab some foreign keys from one specific table, which we can call master_table. In my DDL statements I created a brand new table, which we can call table_with_fks, to hold the FKs I fetch.
I later take these FKs from my table_with_fks and JOIN on my other tables in my database to get the entire original record (the full record with all columns from its corresponding table) and insert it into an archive table.
I have an awesome lucid chart I drew that might make what I say down below make much more sense:
My CTE example:
LOOP
EXIT WHEN some_condition;
WITH fk_list_cte AS (
SELECT mt.fk1, mt.fk2, mt.fk3, mt.fk4
FROM master_table mt
WHERE mt.created_date < now() - interval '365 days' -- archive record if >= 1 year old
LIMIT 10000
)
INSERT INTO table_with_fks (SELECT * FROM fk_list_cte);
commit;
END LOOP;
Now, I have (4) other Procedures that JOIN on each FK in this table_with_fks with its parent table that it references. I do this because as I said, I only got the FK at first, and I don't have all the original columns for the record. So I will do something like
LOOP
EXIT WHEN some_condition;
WITH full_record_cte AS (
SELECT *
FROM table_with_fks fks
JOIN parent_table1 pt1
ON fks.fk1 = pt1.id
LIMIT 10000),
INSERT INTO (select * from full_record_cte);
commit;
END LOOP;
NOW, what I want to do, is instead of having to RE-JOIN 4 times later on these FK's that are found in my table_with_fks, I want to use the first CTE fk_list_cte to JOIN on the parent tables right away and grab the full record from each (4) tables and put it in some TEMP postgres table. I think I will need (4) unique TEMP tables, as I don't know how it would work if I combine all their data into one BIG table, because each table has different data/different columns.
Is there a way to use the original CTE fk_list_cte and call it multiple times in succession and CREATE 4 TEMP tables right after, that all use the original CTE? example:
LOOP
EXIT WHEN some_condition;
WITH fk_list_cte AS (
SELECT mt.fk1, mt.fk2, mt.fk3, mt.fk4
FROM master_table mt
WHERE mt.created_date < now() - interval '365 days' -- archive record if >= 1 year old
LIMIT 10000
),
WITH fetch_fk1_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table1 pt1
ON cte.fk1 = pt1.id
),
WITH fetch_fk2_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table2 pt2
ON cte.fk2 = pt2.id
),
WITH fetch_fk3_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table3 pt3
ON cte.fk3 = pt3.id
),
WITH fetch_fk4_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table4 pt4
ON cte.fk4 = pt4.id
),
CREATE TEMPORARY TABLE fk1_tmp_tbl AS (
SELECT *
FROM fetch_fk1_original_record_from_parent
)
CREATE TEMPORARY TABLE fk2_tmp_tbl AS (
SELECT *
FROM fetch_fk2_original_record_from_parent
)
CREATE TEMPORARY TABLE fk3_tmp_tbl AS (
SELECT *
FROM fetch_fk3_original_record_from_parent
)
CREATE TEMPORARY TABLE fk4_tmp_tbl AS (
SELECT *
FROM fetch_fk4_original_record_from_parent
);
END LOOP;
I know the 4 CREATE TEMPORARY TABLE statements definitely won't work as written. (Can I create 4 temp tables at once?) Does anyone see the logic of what I am trying to do here and can help me?
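One possible direction, offered as a sketch and not a tested plpgsql solution: a WITH clause belongs to the single statement it is attached to, so one CTE cannot feed several separate CREATE TABLE statements. A workaround is to materialize the batch once in a temporary table and then build each per-parent temp table from it. The script below runs that pattern through Python's sqlite3 module as a stand-in for Postgres. The fk_batch name, the sample rows and the reduced two-parent schema are assumptions for illustration. In Postgres the date filter would be the now() - interval expression from the question, not SQLite's date() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down schema: two of the four parent tables.
    CREATE TABLE parent_table1 (id INTEGER PRIMARY KEY, p1_data TEXT);
    CREATE TABLE parent_table2 (id INTEGER PRIMARY KEY, p2_data TEXT);
    CREATE TABLE master_table (
        id INTEGER PRIMARY KEY, fk1 INTEGER, fk2 INTEGER, created_date TEXT
    );
    INSERT INTO parent_table1 VALUES (1, 'p1-old'), (2, 'p1-new');
    INSERT INTO parent_table2 VALUES (7, 'p2-old'), (8, 'p2-new');
    INSERT INTO master_table VALUES (1, 1, 7, '2000-01-01'), (2, 2, 8, '2999-01-01');

    -- Step 1: materialize the batch of foreign keys once.
    CREATE TEMPORARY TABLE fk_batch AS
        SELECT mt.fk1, mt.fk2
        FROM master_table mt
        WHERE mt.created_date < date('now', '-1 year')
        LIMIT 10000;

    -- Step 2: one temp table per parent, each reading the same batch.
    CREATE TEMPORARY TABLE fk1_tmp_tbl AS
        SELECT pt1.* FROM fk_batch b JOIN parent_table1 pt1 ON b.fk1 = pt1.id;
    CREATE TEMPORARY TABLE fk2_tmp_tbl AS
        SELECT pt2.* FROM fk_batch b JOIN parent_table2 pt2 ON b.fk2 = pt2.id;
""")

fk1_rows = conn.execute("SELECT id, p1_data FROM fk1_tmp_tbl").fetchall()
fk2_rows = conn.execute("SELECT id, p2_data FROM fk2_tmp_tbl").fetchall()
print(fk1_rows, fk2_rows)
```

Inside a loop, the temp tables would have to be dropped or truncated between batches. How temp tables interact with the commit inside the procedure needs checking against the Postgres documentation.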

insert into temp table without creating it from union results

I have the query below, which combines the results of more than one SELECT.
Now I want these results in a temp table.
Is there any way to insert them into a temp table without creating the table first?
I know how to do that for select
Select * into #s --like that
However, how do I do that with more than one SELECT?
SELECT Ori.[GeoBoundaryAssId], Ori.[FromGeoBoundaryId], Ori.Sort
From [GeoBoundaryAss] As Ori where Ori.[FromGeoBoundaryId] = (select distinct [FromGeoBoundaryId] from inserted )
Union
SELECT I.[GeoBoundaryAssId], I.[FromGeoBoundaryId], I.Sort
From [inserted] I ;

Add INTO after the first SELECT.
SELECT Ori.[GeoBoundaryAssId], Ori.[FromGeoBoundaryId], Ori.Sort
INTO #s
From [GeoBoundaryAss] As Ori where Ori.[FromGeoBoundaryId] = (select distinct [FromGeoBoundaryId] from inserted )
Union
SELECT I.[GeoBoundaryAssId], I.[FromGeoBoundaryId], I.Sort
From [inserted] I ;

Try this (note that INSERT INTO requires #s to exist already):
INSERT INTO #s ([GeoBoundaryAssId], [FromGeoBoundaryId], Sort)
(
SELECT Ori.[GeoBoundaryAssId], Ori.[FromGeoBoundaryId], Ori.Sort
FROM [GeoBoundaryAss] AS Ori WHERE Ori.[FromGeoBoundaryId] in (SELECT DISTINCT [FromGeoBoundaryId] FROM inserted )
UNION
SELECT I.[GeoBoundaryAssId], I.[FromGeoBoundaryId], I.Sort
FROM [inserted] I
)

dynamically create and load table from select query

SELECT
MEM_ID, [C1],[C2]
from
(select
MEM_ID, Condition_id, condition_result
from tbl_GConditionResult
) x
pivot
(
sum(condition_result)
for condition_id in ([C1],[C2])
) p
The above query returns three columns of data. I will not know until runtime how many columns the select statement returns. Is it possible to load the data from the select statement into a dynamically created table? After processing the data from the dynamically created table I want to drop the table.
Thank you for your help.
Smith

Yes, do a SELECT INTO, e.g.:
SELECT
MEM_ID, [C1],[C2]
into #TEMP
from
(select
MEM_ID, Condition_id, condition_result
from tbl_GConditionResult
) x
pivot
(
sum(condition_result)
for condition_id in ([C1],[C2])
) p
-- Do what you need with the TEMP table
DROP TABLE #TEMP

PostgreSQL Record Reordering using Update with a Sub-Select

I found this solution on the SQL Server forum on how to reorder records in a table.
UPDATE SomeTable
SET rankcol = SubQuery.Sort_Order
FROM
(
SELECT IDCol, Row_Number() OVER (ORDER BY ValueCOL) as SORT_ORDER
FROM SomeTable
) SubQuery
INNER JOIN SomeTable ON
SubQuery.IDCol = SomeTable.IDCol
When I try doing the same on PostgreSQL, I get an error message -
ERROR: table name "sometable" specified more than once
Any help will be appreciated.
Thanks!

You don't need to explicitly join SomeTable, how cool is that? :)
UPDATE SomeTable
SET rankcol = SubQuery.Sort_Order
FROM
(
SELECT IDCol, Row_Number() OVER (ORDER BY ValueCOL) as SORT_ORDER
FROM SomeTable
) SubQuery
where SubQuery.IDCol = SomeTable.IDCol
Remark: Postgres folds unquoted identifiers to lower case, so it is better to use lower-case names such as row_number, sort_order, id_col, etc.

T-SQL: Selecting rows to delete via joins

Scenario:
Let's say I have two tables, TableA and TableB. TableB's primary key is a single column (BId), and is a foreign key column in TableA.
In my situation, I want to remove all rows in TableA that are linked with specific rows in TableB: Can I do that through joins? Delete all rows that are pulled in from the joins?
DELETE FROM TableA
FROM
TableA a
INNER JOIN TableB b
ON b.BId = a.BId
AND [my filter condition]
Or am I forced to do this:
DELETE FROM TableA
WHERE
BId IN (SELECT BId FROM TableB WHERE [my filter condition])
The reason I ask is that it seems to me the first option would be much more efficient when dealing with larger tables.
Thanks!

DELETE TableA
FROM TableA a
INNER JOIN TableB b
ON b.Bid = a.Bid
AND [my filter condition]
should work

I would use this syntax
Delete a
from TableA a
Inner Join TableB b
on a.BId = b.BId
WHERE [filter condition]

Yes you can. Example :
DELETE TableA
FROM TableA AS a
INNER JOIN TableB AS b
ON a.BId = b.BId
WHERE [filter condition]

I was trying to do this with an Access database and found I needed to use a.* right after the DELETE.
DELETE a.*
FROM TableA AS a
INNER JOIN TableB AS b
ON a.BId = b.BId
WHERE [filter condition]

It's almost the same in MySQL, but you have to use the table alias right after the word "DELETE":
DELETE a
FROM TableA AS a
INNER JOIN TableB AS b
ON a.BId = b.BId
WHERE [filter condition]

The syntax above doesn't work in Interbase 2007. Instead, I had to use something like:
DELETE FROM TableA a WHERE [filter condition on TableA]
AND (a.BId IN (SELECT a.BId FROM TableB b JOIN TableA a
ON a.BId = b.BId
WHERE [filter condition on TableB]))
(Note Interbase doesn't support the AS keyword for aliases)

I'm using this
DELETE TableA
FROM TableA a
INNER JOIN
TableB b on b.Bid = a.Bid
AND [condition]
and @TheTXI's way is good enough, but after reading the answers and comments I found one thing that still needed answering: whether to put the condition in the WHERE clause or in the join condition. So I decided to test it and wrote a snippet, but I didn't find a meaningful difference between them. The SQL script is below. I would have preferred to post this as a comment because it is not an exact answer, but it is too large to fit in a comment, so please pardon me.
Declare @TableA Table
(
aId INT,
aName VARCHAR(50),
bId INT
)
Declare @TableB Table
(
bId INT,
bName VARCHAR(50)
)
Declare @TableC Table
(
cId INT,
cName VARCHAR(50),
dId INT
)
Declare @TableD Table
(
dId INT,
dName VARCHAR(50)
)
DECLARE @StartTime DATETIME;
SELECT @StartTime = GETDATE();
DECLARE @i INT;
SET @i = 1;
WHILE @i < 1000000
BEGIN
INSERT INTO @TableB VALUES(@i, 'nameB:' + CONVERT(VARCHAR, @i))
INSERT INTO @TableA VALUES(@i+5, 'nameA:' + CONVERT(VARCHAR, @i+5), @i)
SET @i = @i + 1;
END
-- Test 1: filter condition in the WHERE clause
SELECT @StartTime = GETDATE()
DELETE a
--SELECT *
FROM @TableA a
Inner Join @TableB b
ON a.BId = b.BId
WHERE a.aName LIKE '%5'
SELECT Duration = DATEDIFF(ms,@StartTime,GETDATE())
SET @i = 1;
WHILE @i < 1000000
BEGIN
INSERT INTO @TableD VALUES(@i, 'nameB:' + CONVERT(VARCHAR, @i))
INSERT INTO @TableC VALUES(@i+5, 'nameA:' + CONVERT(VARCHAR, @i+5), @i)
SET @i = @i + 1;
END
-- Test 2: filter condition in the JOIN condition
SELECT @StartTime = GETDATE()
DELETE c
--SELECT *
FROM @TableC c
Inner Join @TableD d
ON c.DId = d.DId
AND c.cName LIKE '%5'
SELECT Duration = DATEDIFF(ms,@StartTime,GETDATE())
If you can draw a sound conclusion from this script, or write a more useful one, please share. Thanks, and I hope this helps.

Let's say you have 2 tables, one with a Master set (eg. Employees) and one with a child set (eg. Dependents) and you're wanting to get rid of all the rows of data in the Dependents table that cannot key up with any rows in the Master table.
delete from Dependents where EmpID in (
select d.EmpID from Employees e
right join Dependents d on e.EmpID = d.EmpID
where e.EmpID is null)
The point to notice here is that you're just collecting an 'array' of EmpIDs from the join first, then using that set of EmpIDs to do a deletion operation on the Dependents table.
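The two-step idea (collect the orphaned EmpIDs, then delete by that set) can be tried out with a small script. This is a minimal sketch using Python's sqlite3 module with made-up sample rows. The DepID and Name columns are assumptions, since the answer only names EmpID. The subquery is written as a LEFT JOIN with the tables swapped, which returns the same rows as the RIGHT JOIN above and also runs on SQLite builds older than 3.39, which lack RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees  (EmpID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Dependents (DepID INTEGER PRIMARY KEY, EmpID INTEGER, Name TEXT);
    INSERT INTO Employees  VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Dependents VALUES
        (10, 1, 'Kid A'), (11, 2, 'Kid B'),
        (12, 3, 'Orphan 1'), (13, 4, 'Orphan 2');
""")

# Step 1: the 'array' of EmpIDs that have no matching row in Employees.
orphan_ids_sql = """
    SELECT d.EmpID
    FROM Dependents d
    LEFT JOIN Employees e ON e.EmpID = d.EmpID
    WHERE e.EmpID IS NULL
"""
orphan_emp_ids = sorted(row[0] for row in conn.execute(orphan_ids_sql))

# Step 2: delete the Dependents rows whose EmpID is in that set.
conn.execute(f"DELETE FROM Dependents WHERE EmpID IN ({orphan_ids_sql})")

remaining = [row[0] for row in
             conn.execute("SELECT DepID FROM Dependents ORDER BY DepID")]
print(orphan_emp_ids, remaining)
```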

In SQLite, the only thing that works is something similar to beauXjames' answer.
It seems to come down to this:
DELETE FROM table1 WHERE table1.col1 IN (SOME TEMPORARY TABLE);
and that temporary table can be created by a SELECT that JOINs your two tables, filtered on the condition that identifies the records you want to delete from table1.
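Applied to the TableA/TableB scenario from the question, that pattern might look like the sketch below, run through Python's sqlite3 module. The aId, aName and bName columns, the sample rows, and the filter on bName are all assumptions standing in for [my filter condition].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableB (BId INTEGER PRIMARY KEY, bName TEXT);
    CREATE TABLE TableA (
        aId INTEGER PRIMARY KEY,
        aName TEXT,
        BId INTEGER REFERENCES TableB(BId)
    );
    INSERT INTO TableB VALUES (1, 'keep'), (2, 'purge'), (3, 'purge');
    INSERT INTO TableA VALUES (100, 'a1', 1), (101, 'a2', 2),
                              (102, 'a3', 3), (103, 'a4', 1);
""")

# The subquery does the JOIN and the filtering; the DELETE itself only
# matches on the key values that the subquery returns.
cur = conn.execute("""
    DELETE FROM TableA
    WHERE TableA.aId IN (
        SELECT a.aId
        FROM TableA a
        INNER JOIN TableB b ON b.BId = a.BId
        WHERE b.bName = ?
    )
""", ("purge",))

deleted = cur.rowcount  # number of rows removed by the DELETE
remaining = [row[0] for row in
             conn.execute("SELECT aId FROM TableA ORDER BY aId")]
print(deleted, remaining)
```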

The simpler way is:
DELETE TableA
FROM TableB
WHERE TableA.ID = TableB.ID

DELETE FROM table1
where id IN
(SELECT id FROM table2..INNER JOIN..INNER JOIN WHERE etc)
Minimize the use of DML queries with joins. You should be able to write most DML queries with subqueries like the one above.
In general, joins should only be used when you need to SELECT or GROUP BY columns from 2 or more tables. If you're only touching multiple tables to define a population, use subqueries. For DELETE queries, use a correlated subquery.

You can run this query:
DELETE FROM TableA
FROM
TableA a, TableB b
WHERE
a.Bid=b.Bid
AND
[my filter condition]
