We had a badly performing CTE that took 3-5 minutes to execute. I replaced it with a function that uses temp tables to accomplish the same task. The new function runs in less than 5 seconds.
This is what I did:
From
WITH CTE1 AS (
SELECT ...
FROM ...
),
CTE2 AS (
SELECT ...
FROM ...
),
CTEn AS (
SELECT ...
FROM ...
)
SELECT A,B,C,D
FROM CTE1
JOIN CTE2 ON ...
JOIN CTEn ON ...;
TO
CREATE OR REPLACE FUNCTION FUNC_ABC(a integer)
RETURNS TABLE(A integer, B integer, C integer, D integer)
LANGUAGE plpgsql
AS $function$
DECLARE
x ALIAS for $1;
BEGIN
DROP TABLE IF EXISTS CTE1;
DROP TABLE IF EXISTS CTE2;
DROP TABLE IF EXISTS CTEn;
CREATE TEMP TABLE CTE1 AS
( SELECT ...
FROM ...
);
CREATE TEMP TABLE CTE2 AS
( SELECT ...
FROM ...);
CREATE TEMP TABLE CTEn AS
( SELECT ...
FROM ...);
CREATE INDEX ix_cte1 ON CTE1(A);
CREATE INDEX ix_cte2 ON CTE2(B);
CREATE INDEX ix_cten_c ON CTEn(C);
CREATE INDEX ix_cten_d ON CTEn(D);
RETURN QUERY SELECT A,B,C,D
FROM CTE1
JOIN CTE2 ON ...
JOIN CTEn ON ...;
END;
$function$
;
As I stated above, the function is pretty fast. I added the "DROP TABLE" statements because this function can be executed any number of times within a single transaction. But, intermittently, we see an error like:
ERROR: must be owner of relation CTE1
I am not able to reproduce this error. And there is only one user that runs this function. No other user has permissions to execute this function.
I can't think of a scenario in which this would fail. Any thoughts or insights will be appreciated.
UPDATE:
I am using the CTE because I am using a LOOP to loop in batches of 10000.
I am already using a CTE expression within a plpgsql Procedure to grab some Foreign Keys from (1) specific table, we can call it master_table. I created a brand new table, we can call this table table_with_fks, in my DDL statements so this table holds the FKs I am fetching and saving.
I later take these FKs from my table_with_fks and JOIN on my other tables in my database to get the entire original record (the full record with all columns from its corresponding table) and insert it into an archive table.
I have an awesome lucid chart I drew that might make what I say down below make much more sense:
My CTE example:
LOOP
EXIT WHEN some_condition;
WITH fk_list_cte AS (
SELECT mt.fk1, mt.fk2, mt.fk3, mt.fk4
FROM master_table mt
WHERE mt.created_date < now() - interval '365 days' -- archive record if >= 1 year old
LIMIT 10000
)
INSERT INTO table_with_fks (SELECT * FROM fk_list_cte);
commit;
END LOOP;
Now, I have (4) other Procedures that JOIN on each FK in this table_with_fks with its parent table that it references. I do this because as I said, I only got the FK at first, and I don't have all the original columns for the record. So I will do something like
LOOP
EXIT WHEN some_condition;
WITH full_record_cte AS (
SELECT *
FROM table_with_fks fks
JOIN parent_table1 pt1
ON fks.fk1 = pt1.id
LIMIT 10000),
INSERT INTO (select * from full_record_cte);
commit;
END LOOP;
NOW, what I want to do, is instead of having to RE-JOIN 4 times later on these FK's that are found in my table_with_fks, I want to use the first CTE fk_list_cte to JOIN on the parent tables right away and grab the full record from each (4) tables and put it in some TEMP postgres table. I think I will need (4) unique TEMP tables, as I don't know how it would work if I combine all their data into one BIG table, because each table has different data/different columns.
Is there a way to use the original CTE fk_list_cte and call it multiple times in succession and CREATE 4 TEMP tables right after, that all use the original CTE? example:
LOOP
EXIT WHEN some_condition;
WITH fk_list_cte AS (
SELECT mt.fk1, mt.fk2, mt.fk3, mt.fk4
FROM master_table mt
WHERE mt.created_date < now() - interval '365 days' -- archive record if >= 1 year old
LIMIT 10000
),
WITH fetch_fk1_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table1 pt1
ON cte.fk1 = pt1.id
),
WITH fetch_fk2_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table2 pt2
ON cte.fk2 = pt2.id
),
WITH fetch_fk3_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table3 pt3
ON cte.fk3 = pt3.id
),
WITH fetch_fk4_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table4 pt4
ON cte.fk4 = pt4.id
),
CREATE TEMPORARY TABLE fk1_tmp_tbl AS (
SELECT *
FROM fetch_fk1_original_record_from_parent
)
CREATE TEMPORARY TABLE fk2_tmp_tbl AS (
SELECT *
FROM fetch_fk2_original_record_from_parent
)
CREATE TEMPORARY TABLE fk3_tmp_tbl AS (
SELECT *
FROM fetch_fk3_original_record_from_parent
)
CREATE TEMPORARY TABLE fk4_tmp_tbl AS (
SELECT *
FROM fetch_fk4_original_record_from_parent
);
END LOOP;
I know the 4 CREATE TEMPORARY TABLE statements definitely won't work as written (can I create 4 temp tables at once?). Does anyone see the logic of what I am trying to do here, and can you help me?
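One way to picture the idea is the rough sketch below. It assumes that the batch is materialised once in a temp table instead of a CTE, because a CTE is only visible to the single statement it is attached to. SQLite (through Python's `sqlite3`) is used purely as a runnable stand-in for PostgreSQL. The name `fk_batch`, the sample rows, and the reduced set of two parent tables are hypothetical.

```python
import sqlite3

# In-memory SQLite database, standing in for the real PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_table(fk1 INTEGER, fk2 INTEGER);
    CREATE TABLE parent_table1(id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE parent_table2(id INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO master_table VALUES (1, 10), (2, 20);
    INSERT INTO parent_table1 VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO parent_table2 VALUES (10, 'Oslo'), (20, 'Lima');

    -- Materialise the batch once (hypothetical name fk_batch).
    CREATE TEMP TABLE fk_batch AS
        SELECT fk1, fk2 FROM master_table LIMIT 10000;

    -- Every later statement can reuse the same batch.
    CREATE TEMP TABLE fk1_tmp_tbl AS
        SELECT pt1.* FROM fk_batch AS b
        JOIN parent_table1 AS pt1 ON b.fk1 = pt1.id;
    CREATE TEMP TABLE fk2_tmp_tbl AS
        SELECT pt2.* FROM fk_batch AS b
        JOIN parent_table2 AS pt2 ON b.fk2 = pt2.id;
""")

fk1_rows = conn.execute("SELECT id, name FROM fk1_tmp_tbl ORDER BY id").fetchall()
fk2_rows = conn.execute("SELECT id, city FROM fk2_tmp_tbl ORDER BY id").fetchall()
print(fk1_rows)
print(fk2_rows)
```

Each temp table keeps the column layout of its own parent table, so nothing has to be merged into one wide table.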
I have a select query that returns a dataset with "n" records in one column. I would like to use this column as the parameter in a stored procedure. Below a reduced example of my case.
The query:
SELECT code FROM rawproducts
The dataset:
CODE
1
2
3
The stored procedure:
ALTER PROCEDURE [dbo].[MyInsertSP]
(@code INT)
AS
BEGIN
INSERT INTO PRODUCTS (description, price, stock)
SELECT description, price, stock
FROM INVENTORY I
WHERE I.icode = @code
END
I already have the actual query and stored procedure done; I just am not sure how to put them both together.
I would appreciate any assistance here! Thank you!
PS: of course the stored procedure is not as simple as above. I just chose a very silly example to keep things small here. :)
Here are two methods for you. The first uses a loop without a cursor:
DECLARE @code_list TABLE (code INT, row_id INT);
INSERT INTO @code_list SELECT code, ROW_NUMBER() OVER (ORDER BY code) AS row_id FROM rawproducts;
DECLARE @count INT;
SELECT @count = COUNT(*) FROM @code_list;
WHILE @count > 0
BEGIN
DECLARE @code INT;
SELECT @code = code FROM @code_list WHERE row_id = @count;
EXEC MyInsertSP @code;
DELETE FROM @code_list WHERE row_id = @count;
SELECT @count = COUNT(*) FROM @code_list;
END;
This works by putting the codes into a table variable, and assigning a number from 1..n to each row. Then we loop through them, one at a time, deleting them as they are processed, until there is nothing left in the table variable.
But here's what I would consider a better method:
CREATE TYPE dbo.code_list AS TABLE (code INT);
GO
CREATE PROCEDURE MyInsertSP (
@code_list dbo.code_list READONLY) -- table-valued parameters must be READONLY
AS
BEGIN
INSERT INTO PRODUCTS (
[description],
price,
stock)
SELECT
i.[description],
i.price,
i.stock
FROM
INVENTORY i
INNER JOIN @code_list cl ON cl.code = i.icode;
END;
GO
DECLARE @code_list dbo.code_list;
INSERT INTO @code_list SELECT code FROM rawproducts;
EXEC MyInsertSP @code_list = @code_list;
To get this to work I create a user-defined table type, then use this to pass a list of codes into the stored procedure. It means slightly rewriting your stored procedure, but the actual code to do the work is much smaller.
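The set-based idea behind the second method can be tried out with the small sketch below. It is only an illustration under assumptions: SQLite (through Python's `sqlite3`) stands in for SQL Server. Because SQLite has no stored procedures or table types, the list of codes is joined directly from `rawproducts`. The sample rows are made up.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rawproducts(code INTEGER);
    CREATE TABLE inventory(icode INTEGER, description TEXT, price REAL, stock INTEGER);
    CREATE TABLE products(description TEXT, price REAL, stock INTEGER);
    INSERT INTO rawproducts VALUES (1), (2), (3);
    INSERT INTO inventory VALUES
        (1, 'bolt', 0.10, 500),
        (2, 'nut', 0.05, 800),
        (4, 'washer', 0.02, 900);
""")

# One statement handles every code: the join plays the role of the code list.
conn.execute("""
    INSERT INTO products (description, price, stock)
    SELECT i.description, i.price, i.stock
    FROM inventory AS i
    INNER JOIN rawproducts AS r ON r.code = i.icode
""")

inserted = conn.execute(
    "SELECT description FROM products ORDER BY description"
).fetchall()
print(inserted)
```

Only inventory rows whose code appears in the list are copied, and no loop or per-row procedure call is needed.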
(How to) run a stored procedure using select columns as input parameters?
What you are looking for is APPLY; APPLY is how you use columns as input parameters. The only thing unclear is how/where the input column is populated. Let's start with sample data:
IF OBJECT_ID('dbo.Products', 'U') IS NOT NULL DROP TABLE dbo.Products;
IF OBJECT_ID('dbo.Inventory','U') IS NOT NULL DROP TABLE dbo.Inventory;
IF OBJECT_ID('dbo.Code','U') IS NOT NULL DROP TABLE dbo.Code;
CREATE TABLE dbo.Products
(
[description] VARCHAR(1000) NULL,
price DECIMAL(10,2) NOT NULL,
stock INT NOT NULL
);
CREATE TABLE dbo.Inventory
(
icode INT NOT NULL,
[description] VARCHAR(1000) NULL,
price DECIMAL(10,2) NOT NULL,
stock INT NOT NULL
);
CREATE TABLE dbo.Code(icode INT NOT NULL);
INSERT dbo.Inventory
VALUES (10,'',20.10,3),(11,'',40.10,3),(11,'',25.23,3),(11,'',55.23,3),(12,'',50.23,3),
(15,'',33.10,3),(15,'',19.16,5),(18,'',75.00,3),(21,'',88.00,3),(21,'',100.99,3);
CREATE CLUSTERED INDEX uq_inventory ON dbo.Inventory(icode);
The function:
CREATE FUNCTION dbo.fnInventory(@code INT)
RETURNS TABLE AS RETURN
SELECT i.[description], i.price, i.stock
FROM dbo.Inventory I
WHERE I.icode = @code;
USE:
DECLARE @code TABLE (icode INT);
INSERT @code VALUES (10),(11);
SELECT f.[description], f.price, f.stock
FROM @code AS c
CROSS APPLY dbo.fnInventory(c.icode) AS f;
Results:
description     price     stock
--------------  --------  -----------
                20.10     3
                40.10     3
                25.23     3
                55.23     3
Updated Proc (note my comments):
ALTER PROC dbo.MyInsertSP -- (1) Lose the input param
AS
-- (2) Code that populates the "code" table
INSERT dbo.Code VALUES (10),(11);
-- (3) Use CROSS APPLY to pass the values from dbo.code to your function
INSERT dbo.Products ([description], price, stock)
SELECT f.[description], f.price, f.stock
FROM dbo.code AS c
CROSS APPLY dbo.fnInventory(c.icode) AS f;
This ^^^ is how it's done.
I want to create a procedure in which I insert data into several tables. I need to get the inserted IDs, so I create temp tables in which I catch them. The problem is that I receive the errors "Invalid column name 'app_guid'" and "Invalid column name 'app_nazwa_pliku'", even though I create the temp tables with those columns. Do you happen to know what's wrong with my code?
create procedure p_paseczek_przenies
as
declare @new_nr_sprawy varchar(50)
if object_id('tempdb..##paseczki') is not null drop table ##paseczki
select
top 1 with ties
s.sp_numer as SprawaGlowna_sp_numer,
s.sp_id as SprawaGlowna_sp_id
,Paseczek.max_ak_id as Paseczek_max_ak_id
,apisp_data_przyjscia
,app_guid
,app_nazwa_pliku
into ##paseczki
from sprawa as s
join akcja as a on a.ak_sp_id=s.sp_id and ak_akt_id=111
join sprawa_powiazania as sp on s.sp_id=sp.sp_id and rodzaj_powiazania='SPRAWY POLUBOWNE'
join (select max(ak_id) max_ak_id,ak_sp_id from akcja
where ak_akt_id=1089
group by ak_sp_id) as Paseczek on Paseczek.ak_sp_id=sp.sp_id_powiazana
join akcja_pismo on apis_ak_id=max_ak_id
join akcja_pismo_przychodzace on apis_apisp_id=apisp_id
join akcja_pismo_plik on app_apis_id=apis_id
where s.sp_numer=@new_nr_sprawy
order by ROW_NUMBER() over (partition by s.sp_id order by paseczek.max_ak_id desc)
if exists (select * from ##paseczki)
begin
if object_id('tempdb..##akcja') is not null drop table ##akcja
create table ##akcja (
ak_id int
,apisp_data_przyjscia datetime
,app_guid varchar(max)
,app_nazwa_pliku varchar(max)
)
merge akcja as target using (
select * from ##paseczki) as source on 1=0
when not matched then insert
(ak_akt_id, ak_sp_id, ak_kolejnosc, ak_interwal, ak_zakonczono, ak_pr_id, ak_publiczna)
values (1089,SprawaGlowna_sp_id,1,1,getdate(),5,1)
output inserted.ak_id,source.apisp_data_przyjscia,source.app_guid,source.app_nazwa_pliku
into ##akcja;
insert into rezultat
(re_ak_id, re_ret_id, re_data_planowana, re_us_id_planujacy, re_data_wykonania, re_us_id_wykonujacy, re_konczy)
select ak_id,309,getdate(),5,getdate(),5,1 from ##akcja
if object_id('tempdb..##akcja_pismo_przychodzace') is not null drop table ##akcja_pismo_przychodzace
create table ##akcja_pismo_przychodzace (
apisp_id int
,ak_id int
,app_guid varchar(max)
,app_nazwa_pliku varchar(max)
)
merge akcja_pismo_przychodzace as target using (
select * from ##akcja) as source on 1=0
when not matched then insert
(apisp_data_przyjscia)
values (apisp_data_przyjscia)
output inserted.apisp_id,source.ak_id,source.app_guid,source.app_nazwa_pliku
into ##akcja_pismo_przychodzace;
if object_id('tempdb..##akcja_pismo') is not null drop table ##akcja_pismo
create table ##akcja_pismo (
apis_id int
,app_guid varchar(max)
,app_nazwa_pliku varchar(max)
)
merge akcja_pismo as target using (
select * from ##akcja_pismo_przychodzace) as source on 1=0
when not matched then insert
(apis_ak_id, apis_apisp_id, apis_data_stworzenia,[apis_us_id_tworzacy])
values (ak_id,apisp_id,getdate(),5)
output inserted.apis_id,source.app_guid,source.app_nazwa_pliku
into ##akcja_pismo;
alter table [dm_data_bps].[dbo].[akcja_pismo_plik] disable trigger [tr_akcja_pismo_plik_ins]
insert into akcja_pismo_plik
([app_guid],[app_apis_id],[app_nazwa_pliku])
select [app_guid],[apis_id],[app_nazwa_pliku] from ##akcja_pismo
alter table [dm_data_bps].[dbo].[akcja_pismo_plik] enable trigger [tr_akcja_pismo_plik_ins]
end
SQL Server compiles the procedure at creation and when it is first executed, verifying the entire procedure based on the context at that time.
For example, try the following query:
CREATE PROCEDURE P
AS
IF OBJECT_ID('tempdb..#T') IS NOT NULL DROP TABLE #T
SELECT 1 Y INTO #T
SELECT Y FROM #T
GO
CREATE TABLE #T (X INT)
GO
EXEC P
You will get an error ("Invalid column name 'Y'."), because when the procedure is compiled the table #T has only the column X.
To avoid this problem, you should make sure that the table #T either does not exist or has the right columns, before the procedure is executed.
One way would be to have another stored procedure (a wrapper):
CREATE PROCEDURE P1
AS
SELECT 1 Y INTO #T
SELECT Y FROM #T
GO
CREATE PROCEDURE P2
AS
IF OBJECT_ID('tempdb..#T') IS NOT NULL DROP TABLE #T
EXEC P1
GO
CREATE TABLE #T (X INT)
GO
EXEC P2
GO
DROP PROCEDURE P1, P2
--DROP TABLE #T
Another way would be to use dynamic SQL, because that code is compiled separately, as if it were another stored procedure.
A better way would be to make sure that temp tables are uniquely named in each stored procedure, unless sharing data between them is desired. For the latter case, you can read http://www.sommarskog.se/share_data.html#temptables for more insights.
This error is also encountered when a stored procedure creates a #temp table and then fires a trigger which creates a #temp table with the same name. The SP #temp table is referenced by the trigger when the column names are explicit, (like SELECT id FROM #temp;), but the local trigger #temp table is referenced when SELECT * FROM #temp; is used.
Microsoft, if you are listening, could you kindly attend to it and retrofit existing supported versions with a maintenance update?
It is expected to now take in a table called waypoints and follow through the function body.
drop function if exists everything(waypoints);
create function everything(waypoints) RETURNS TABLE(node int, xy text[]) as $$
BEGIN
drop table if exists bbox;
create temporary table bbox(...);
insert into bbox
select ... from waypoints;
drop table if exists b_spaces;
create temporary table b_spaces(
...
);
insert into b_spaces
select ...
drop table if exists b_graph; -- Line the error flags.
create temporary table b_graph(
...
);
insert into b_graph
select ...
drop table if exists local_green;
create temporary table local_green(
...
);
insert into local_green
...
with aug_temp as (
select ...
)
insert into b_graph(source, target, cost) (
(select ... from aug_temp)
UNION
(select ... from aug_temp)
);
return query
with
results as (
select id1, ... from b_graph -- The relation being complained about.
),
pkg as (
select loc, ...
)
select id1, array_agg(loc)
from pkg
group by id1;
return;
END;
$$ LANGUAGE plpgsql;
This returns the error: cannot DROP TABLE b_graph because it is being used by active queries in this session
How do I go about rectifying this issue?
The error message is rather obvious: you cannot drop a temp table while it is being used.
You might be able to avoid the problem by adding ON COMMIT DROP:
Temporary table and loops in a function
However, this can probably be simpler. If you don't need all those temp tables to begin with (which I suspect), you can replace them all with CTEs (or most of them probably even with cheaper subqueries) and simplify to one big query. Can be plpgsql or just SQL:
CREATE FUNCTION everything(waypoints)
RETURNS TABLE(node int, xy text[]) AS
$func$
WITH bbox AS (SELECT ... FROM waypoints) -- not the fct. parameter!
, b_spaces AS (SELECT ... )
, b_graph AS (SELECT ... )
, local_green AS (SELECT ... )
, aug_temp AS (SELECT ... )
, b_graph2(source, target, cost) AS (
SELECT ... FROM b_graph
UNION ALL -- guessing you really want UNION ALL
SELECT ... FROM aug_temp
UNION ALL
SELECT ... FROM aug_temp
)
, results AS (SELECT id1, ... FROM b_graph2)
, pkg AS (SELECT loc, ... )
SELECT id1, array_agg(loc)
FROM pkg
GROUP BY id1
$func$ LANGUAGE sql;
Views are just storing a query ("the recipe"), not the actual resulting values ("the soup").
It's typically cheaper to use CTEs instead of creating temp tables.
Derived tables in queries, sorted by their typical overall performance (exceptions for special cases involving indexes). From slow to fast:
CREATE TABLE
CREATE UNLOGGED TABLE
CREATE TEMP TABLE
CTE
subquery
UNION would try to fold duplicate rows. Typically, people really want UNION ALL, which just appends rows. Faster and does not try to remove dupes.
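The difference is easy to check with a tiny experiment. The sketch below uses SQLite through Python's `sqlite3` only as a convenient stand-in engine. The tables `a` and `b` and their contents are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a(x INTEGER);
    CREATE TABLE b(x INTEGER);
    INSERT INTO a VALUES (1), (2), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute(
    "SELECT x FROM a UNION SELECT x FROM b ORDER BY x"
).fetchall()

# UNION ALL simply appends the second result to the first.
union_all_rows = conn.execute(
    "SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x"
).fetchall()

print(union_rows)      # three distinct values
print(union_all_rows)  # all five input rows
```

Note that UNION also folds the duplicate inside table `a` itself, not only the overlap between the two inputs.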
I'm trying to run a graph search to find all nodes accessible from a starting point, like so:
with recursive
nodes_traversed as (
select START_NODE ID
from START_POSITION
union all
select ed.DST_NODE
from EDGES ed
join nodes_traversed NT
on (NT.ID = ed.START_NODE)
and (ed.DST_NODE not in (select ID from nodes_traversed))
)
select distinct * from nodes_traversed
Unfortunately, when I try to run that, I get an error:
Recursive CTE member (nodes_traversed) can refer itself only in FROM clause.
That "not in select" clause is important to the recursive expression, though, as it provides the ending point. (Without it, you get infinite recursion.) Using generation counting, like in the accepted answer to this question, would not help, since this is a highly cyclic graph.
Is there any way to work around this without having to create a stored proc that does it iteratively?
Here is my solution, which uses a global temporary table. I have limited the recursion by level and by the nodes already stored in the temporary table.
I am not sure how well it will work on a large data set.
create procedure get_nodes (
START_NODE integer)
returns (
NODE_ID integer)
as
declare variable C1 integer;
declare variable C2 integer;
begin
/**
create global temporary table id_list(
id integer
);
create index id_list_idx1 ON id_list (id);
*/
delete from id_list;
while ( 1 = 1 ) do
begin
select count(distinct id) from id_list into :c1;
insert into id_list
select id from
(
with recursive nodes_traversed as (
select :START_NODE AS ID , 0 as Lv
from RDB$DATABASE
union all
select ed.DST_NODE , Lv+1
from edges ed
join nodes_traversed NT
on
(NT.ID = ed.START_NODE)
and nt.Lv < 5 -- Max recursion level
and nt.id not in (select id from id_list)
)
select distinct id from nodes_traversed);
select count(distinct id) from id_list into :c2;
if (c1 = c2) then break;
end
for select distinct id from id_list into :node_id do
begin
suspend ;
end
end
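The stopping rule used by the procedure (repeat until a pass adds no new nodes, i.e. c1 = c2) can be sketched outside the database as follows. This is a simplified illustration in Python, not a translation of the Firebird code: it expands one level per pass instead of up to five, and the `reachable` function and the sample edge list are hypothetical.

```python
def reachable(edges, start):
    """Return every node reachable from start, including start itself."""
    # Adjacency map built from (source, destination) pairs.
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)

    visited = {start}
    while True:
        before = len(visited)
        # Expand one step from everything collected so far.
        for node in list(visited):
            visited |= neighbours.get(node, set())
        # Stop once a pass adds nothing new (the c1 = c2 test).
        if len(visited) == before:
            return visited


# A cyclic graph: 1 -> 2 -> 3 -> 1, plus 3 -> 4; node 5 is not reachable from 1.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (5, 1)]
print(sorted(reachable(edges, 1)))
```

Because already visited nodes are never added twice, the loop terminates even on a highly cyclic graph.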