Postgresql function executed much longer than the same query

I'm using PostgreSQL 9.2.9 and have the following problem.
I have this function:
CREATE OR REPLACE FUNCTION report_children_without_place(text, date, date, integer)
RETURNS TABLE (department_name character varying, kindergarten_name character varying, a1 bigint) AS $BODY$
BEGIN
RETURN QUERY WITH rh AS (
SELECT (array_agg(status ORDER BY date DESC))[1] AS status, request
FROM requeststatushistory
WHERE date <= $3
GROUP BY request
)
SELECT
w.name,
kgn.name,
COUNT(*)
FROM kindergarten_request_table_materialized kr
JOIN rh ON rh.request = kr.id
JOIN requeststatuses s ON s.id = rh.status AND s.sysname IN ('confirmed', 'need_meet_completion', 'kindergarten_need_meet')
JOIN workareas kgn ON kr.kindergarten = kgn.id AND kgn.tree <@ CAST($1 AS LTREE) AND kgn.active
JOIN organizationforms of ON of.id = kgn.organizationform AND of.sysname IN ('state','municipal','departmental')
JOIN workareas w ON w.tree @> kgn.tree AND w.active
JOIN workareatypes mt ON mt.id = w.type AND mt.sysname = 'management'
WHERE kr.requestyear = $4
GROUP BY kgn.name, w.name
ORDER BY w.name, kgn.name;
END
$BODY$ LANGUAGE PLPGSQL STABLE;
EXPLAIN ANALYZE SELECT * FROM report_children_without_place('83.86443.86445', '14-04-2015', '14-04-2015', 2014);
Total runtime: 242805.085 ms.
But the query from the function's body runs much faster:
EXPLAIN ANALYZE WITH rh AS (
SELECT (array_agg(status ORDER BY date DESC))[1] AS status, request
FROM requeststatushistory
WHERE date <= '14-04-2015'
GROUP BY request
)
SELECT
w.name,
kgn.name,
COUNT(*)
FROM kindergarten_request_table_materialized kr
JOIN rh ON rh.request = kr.id
JOIN requeststatuses s ON s.id = rh.status AND s.sysname IN ('confirmed', 'need_meet_completion', 'kindergarten_need_meet')
JOIN workareas kgn ON kr.kindergarten = kgn.id AND kgn.tree <@ CAST('83.86443.86445' AS LTREE) AND kgn.active
JOIN organizationforms of ON of.id = kgn.organizationform AND of.sysname IN ('state','municipal','departmental')
JOIN workareas w ON w.tree @> kgn.tree AND w.active
JOIN workareatypes mt ON mt.id = w.type AND mt.sysname = 'management'
WHERE kr.requestyear = 2014
GROUP BY kgn.name, w.name
ORDER BY w.name, kgn.name;
Total runtime: 2156.740 ms.
Why does the function take so much longer than the same query? Thanks.

Your standalone query runs faster because its "variables" are not variables at all: they are literal values (i.e. quoted strings), so the planner can see them and choose indexes accordingly. Inside your function, $1, $3 and $4 are real parameters, and the planner may have to build a plan that works for any value they could hold. For example, you might have a partial index on requeststatushistory where "date" <= '2012-12-31'. That index can only be chosen if the value of $3 is known at planning time. Since $3 might hold a date from 2015, a plan that must work for every value cannot rely on the partial index.
I frequently build a string inside my functions, concatenating my variables in as literals, and then execute that string using something like the following:
DECLARE
my_dynamic_sql TEXT;
BEGIN
my_dynamic_sql := $$
SELECT *
FROM my_table
WHERE $$ || quote_literal($3) || $$::TIMESTAMPTZ BETWEEN start_time
AND end_time;$$;
/* You can only see this if client_min_messages = DEBUG */
RAISE DEBUG '%', my_dynamic_sql;
RETURN QUERY EXECUTE my_dynamic_sql;
END;
Dynamic SQL is very useful here because you can get an EXPLAIN of the exact query: with client_min_messages set to DEBUG, I can copy the query from the screen, paste it after EXPLAIN or EXPLAIN ANALYZE, and see what the planner is doing. It also lets you build quite different queries depending on the inputs (e.g. leave out unnecessary tables where warranted) while keeping a common API for your clients.
You may be tempted to avoid dynamic SQL for fear of performance problems (I was at first), but you will be amazed at how little time is spent planning compared with the cost of a couple of table scans in your seven-table join!
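The same idea can also be written with format(), whose %L placeholder quotes a value as a literal in the same way quote_literal() does. This is only a minimal sketch: the table demo_events, its columns, and the function name demo_events_until are hypothetical stand-ins, not objects from the question.

```sql
-- Hypothetical demo table; names and types are assumptions for illustration.
CREATE TABLE demo_events (id integer, happened_on date);

CREATE OR REPLACE FUNCTION demo_events_until(p_until date)
RETURNS TABLE (id integer, happened_on date) AS $BODY$
DECLARE
    v_sql text;
BEGIN
    -- %L embeds p_until as a quoted literal, so the planner sees a constant.
    v_sql := format(
        'SELECT e.id, e.happened_on
           FROM demo_events e
          WHERE e.happened_on <= %L::date',
        p_until);

    -- Visible only when client_min_messages is set to DEBUG.
    RAISE DEBUG '%', v_sql;

    RETURN QUERY EXECUTE v_sql;
END;
$BODY$ LANGUAGE plpgsql STABLE;

SELECT * FROM demo_events_until(DATE '2015-04-14');
```

The query text is planned each time EXECUTE runs, which is the point of the technique: the plan is chosen for the actual value rather than for an unknown parameter.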
Good luck!
Follow-up: You might experiment with Common Table Expressions (CTEs) for performance as well. If you have a table with a low signal-to-noise ratio (it holds many, many more records than you actually want to return), a CTE can be very helpful. PostgreSQL versions before 12 always evaluate a CTE on its own and materialize the resulting rows (later versions may instead inline a CTE that is referenced only once). This lets you use the same result set multiple times and in multiple places in your query. The benefit can be really surprising if you design it correctly.
sql_txt := $$
WITH my_cte as (
select fk1 as moar_data1
, field1
, field2 /*do not need all other fields taking up RAM!*/
from my_table
where field3 between $$ || quote_literal(input_start_ts) || $$::timestamptz
and $$ || quote_literal(input_end_ts) || $$::timestamptz
),
keys_cte as ( select key_field
from big_look_up_table
where look_up_name = ANY($$ ||
QUOTE_LITERAL(input_array_of_names) || $$::VARCHAR[])
)
SELECT field1, field2, moar_data1, moar_data2
FROM moar_data_table
INNER JOIN my_cte
USING (moar_data1)
WHERE moar_data_table.moar_data_key in (select key_field from keys_cte) $$;
An execution plan is likely to show that it chooses to use an index on moar_data_table.moar_data_key. This would appear to contradict what I said in my earlier answer, except that the keys_cte results are materialized (and therefore cannot be changed by another transaction in a race condition): you have your own little copy of the data for use in this query.
Oh, and CTEs can use other CTEs that are declared earlier in the same query. I have used this "trick" to replace subqueries in very complex joins and seen great improvements.
Happy Hacking!

Related

postgresql function error ERROR: query has no destination for result data

I have created a function in PostgreSQL, but when I try to return data I get the error below:
ERROR: query has no destination for result data
HINT: If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT: PL/pgSQL function "fn_GetAllCountData"() line 27 at SQL statement
SQL state: 42601
Below is my PostgreSQL function. In this function I am trying to get the task status counts in one query:
CREATE OR REPLACE FUNCTION public."fn_GetAllCountData"() RETURNS setof "AssignDetails" AS $BODY$
DECLARE
total_draft text;
total_pending text;
total_rejected text;
total_approved text;
total_prev_pending text;
"AssignDetails" text;
BEGIN
--Total pending application no by the user
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "TotalPending" into total_pending
from user
left Outer Join public."tbl_task" S
on k."UserCode"=S."taskAssignTo" and s.Status='P'
And to_char(S."assignDate"::date, 'dd-mm-yyyy') = to_char(current_Date, 'dd-mm-yyyy')
group by k."UserCode";
--Previous Pending
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "TotalPrevPending" into total_prev_pending
from kyc k
left Outer Join public."tbl_task" S
on k."UserCode"=S."taskAssignTo" and s.Status='P'
And S."assignDate" < CONCAT(current_Date, ' 00:00:00'):: timestamp
group by k."UserCode";
-- Total Objection raised by the user
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "TotalRejected" into total_rejected
from kyc k
left Outer Join tbl_task S
on k."UserCode"=S."taskAssignTo" and s.Status='R'
And to_char(S."objectionDate"::date, 'dd-mm-yyyy') = to_char(current_Date, 'dd-mm-yyyy')
group by k."UserCode";
-- Total Approved application no by the user
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "TotalApproved" into total_approved
from kyc k
left Outer Join public."tbl_task" S
on k."UserCode"=S."taskAssignTo" and s.Status='A'
And S."assignDate" < CONCAT(current_Date, ' 00:00:00'):: timestamp
group by k."UserCode";
--Application no with start Time and total time
Select K."UserCode",K."Status", K."AppType",ST."taskNo" as "TaskId", ST."startTime" as "StartTime",
case
when COALESCE(ST."endTime",'')=''
then (SELECT DATEDIFF('second', ST."startTime":: timestamp, current_timestamp::timestamp))
else (SELECT DATEDIFF('second', ST."startTime":: timestamp, ST."endTime"::timestamp))
end as "Totaltime"
into "Final"
from kyc K
left outer join public."tbl_task_details" ST
On K."UserCode"=ST."empCode";
--Total Checked In Draft application no through by the user
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "Status_Count" into total_draft
from kyc k
left Outer Join public."tbl_task" S
on k."UserCode"=S."taskAssignTo" and s.Status='D'
And S."assignDate" < CONCAT(current_Date, ' 00:00:00'):: timestamp
group by k."UserCode";
Select distinct K."UserCode",K."Status",K."AppType",K."LoginTime",K."LogoutTime",
F."TaskId",F."StartTime",F."Totaltime",
TP."TotalPending" as "Pending",
TR."TotalRejected" as "Objection",
TA."TotalApproved" as "Approved",
TS."TotalAssign" as "Total_Assigned",
TD."Status_Count" as "Draft_Count",
TPP."TotalPrevPending" As "Prev_Pending"
into "AssignDetails"
From "Final" F
Right outer join kyc K On K."UserCode"=F."UserCode"
left outer join total_scrutiny TS On K."UserCode"=Ts."UserCode"
left outer join total_draft TD On TD."UserCode"=K."UserCode"
left outer join total_pending TP On TP."UserCode"=K."UserCode"
left outer join total_rejected TR On TR."UserCode"=K."UserCode"
left outer join total_approved TA On TA."UserCode"=K."UserCode"
Left Outer Join total_prev_pending TPP On TPP."UserCode"=K."UserCode"
order by TS."TotalAssign" desc;
Select * From "AssignDetails";
END
$BODY$ LANGUAGE plpgsql;
I tried to return a table with RETURN QUERY, but it still does not work. I don't know what I am doing wrong. Please help me with this.

Please note that PostgreSQL reports only one error at a time. There is in fact a great deal wrong with your function, so much that correcting everything here would take too long.
I have therefore given you a cut-down version, which should point you in the right direction. I will give the code first and then explain the points.
CREATE OR REPLACE FUNCTION public.fn_getallcountdata() RETURNS TABLE (usercode text, totalpending integer) AS $BODY$
BEGIN
CREATE TEMP TABLE total_pending
(
usercode text,
totalpending int
) ON COMMIT DROP;
--Total pending application no by the user
INSERT INTO total_pending
Select k.usercode, count(s.taskassignto)::integer
from public.user k
left Outer Join public.tbl_task s
on k.usercode=s.taskassignto and s.status='P'
And s.assigndate::date = current_date
group by k.usercode;
RETURN QUERY
select t.usercode, t.totalpending From total_pending t;
END;
$BODY$ LANGUAGE plpgsql;
Points to note:
Firstly, please avoid mixed-case names in PostgreSQL. They force you to double-quote every identifier, which is a real pain!
Secondly, you were declaring variables as text when in fact they were meant to hold table data. You cannot do this (a scalar variable holds only a single value). Instead, create temporary tables in the way I have done. Note in particular the use of ON COMMIT DROP, a useful way in PostgreSQL to avoid having to remember to drop temporary tables when you have finished with them!
Thirdly, your alias k does not refer to anything in your first SELECT. Note also that user is a reserved word. If you insist on having a table named user, you will need to access it as public.user (assuming it is in the public schema).
(As an aside, using the public schema is generally considered a security risk because of guest access.)
Fourthly, there is no need to convert a date to a string in order to compare it. Casting a timestamp to a date and comparing it directly with another date is in fact far faster than converting both dates to strings and comparing the strings.
Fifthly, COUNT in PostgreSQL returns a bigint, which is why I generally cast it to integer, because an integer usually suffices!
I have defined the function to return a table with named columns. You can use SETOF instead, but then it has to refer to a known table type.
For the final SELECT I have put the required RETURN QUERY first. Note also that I am using a table alias. This is because the column names of the returned table match those of the temporary table, so you need to be explicit about which one you mean.
I strongly recommend that you experiment with a shorter function first (as in my cut-down version) and then increase the complexity once you have it compiling and running. To this end, please also note that in PostgreSQL a function that compiles may still contain runtime errors. Also, if you change the return columns between compilations, you will need to drop the previous version first.
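The last point, about changing the return columns, could look roughly like the sketch below. The function demo_counts and its columns are hypothetical, chosen only to show the sequence of statements.

```sql
-- Hypothetical demo function; the name and columns are assumptions.
CREATE OR REPLACE FUNCTION demo_counts()
RETURNS TABLE (usercode text, totalpending integer) AS $BODY$
BEGIN
    RETURN QUERY SELECT 'u1'::text, 1;
END;
$BODY$ LANGUAGE plpgsql;

-- CREATE OR REPLACE alone cannot change the returned columns; PostgreSQL
-- rejects it with "cannot change return type of existing function".
-- Dropping the old version first avoids that error.
DROP FUNCTION IF EXISTS demo_counts();

CREATE OR REPLACE FUNCTION demo_counts()
RETURNS TABLE (usercode text, totalpending integer, totaldraft integer) AS $BODY$
BEGIN
    RETURN QUERY SELECT 'u1'::text, 1, 0;
END;
$BODY$ LANGUAGE plpgsql;

SELECT * FROM demo_counts();
```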
Hope this points you in the right direction, but please feel free to get back with any further issues.

Select query became very very very slow in postgresql

I have a table that contains 133,072,194 records, and I am trying to execute
SELECT COUNT(test)
FROM mytable
WHERE test = false
but it takes a very long time (Execution time: 128320.712 ms).
I already have an index on the test column. What can I optimize or change to make the query faster?
Because of this, my other SELECT query is not working either.

If there are many rows where test is FALSE, you won't be able to get an exact result faster than with a sequential scan, which is slow for big tables.
If only a few rows satisfy the condition, you should create a partial index:
CREATE INDEX mytable_notest_ind ON mytable(id) WHERE NOT test;
(assuming that id is the primary key) and keep mytable autovacuumed often enough that you get an index only scan.
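Whether the partial index is really used can be checked with EXPLAIN. The following is a small sketch on a hypothetical table demo_flags (all names and the data distribution are assumptions); the plan you get will depend on your data and PostgreSQL version.

```sql
-- Hypothetical demo table: 100000 rows, of which 100 have test = false.
CREATE TEMP TABLE demo_flags (id integer PRIMARY KEY, test boolean NOT NULL);

INSERT INTO demo_flags
SELECT g, g % 1000 <> 0
FROM generate_series(1, 100000) AS g;

-- Partial index covering only the rows where test is false.
CREATE INDEX demo_flags_notest_ind ON demo_flags (id) WHERE NOT test;

-- Refresh statistics and the visibility map so that an index-only scan
-- becomes possible (VACUUM cannot run inside a transaction block).
VACUUM (ANALYZE) demo_flags;

-- Phrase the predicate the same way as the index definition.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM demo_flags
WHERE NOT test;
```

In the plan, look for a scan on demo_flags_notest_ind and, for an index-only scan, a low "Heap Fetches" figure.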
But usually exact results for queries like this are not required.
You could calculate an estimated count from the table statistics with a query like this:
SELECT t.reltuples
* (1 - s.null_frac)
* mcv.freq AS count_false
FROM pg_stats AS s
CROSS JOIN LATERAL unnest(s.most_common_vals::text::boolean[],
s.most_common_freqs) AS mcv(val, freq)
JOIN pg_class AS t
ON s.tablename = t.relname
AND s.schemaname = t.relnamespace::regnamespace::text
WHERE s.tablename = 'mytable'
AND s.attname = 'test'
AND mcv.val = FALSE;
That would be very fast.
See my blog post for more considerations about the speed of SELECT count(*).

Four triggers: It does not work

Hello everybody,
Please excuse my poor English.
I have been trying to solve this problem for more than 4 days:
each trigger works well on its own, but when I combine them I get an error:
the subquery returns more than 1 value.
I tried to follow all the tips on this website and others, but I could not make it work.
The tables concerned are PIECES, COMPOSITIONSGAMMES, NOMENCLATURES and SITUATIONS.
What I want the triggers to do is:
When the user inserts a new row into "SITUATIONS" and 'nomstrategie' = 'DST' (it is the name of a strategy, but this detail does not really matter for anyone helping me), I need other rows to be inserted with the same reference (referencepiece) and the same strategy (nomstrategie). Only 'ancienposte' and 'nouveauposte' have to change: 'ancienposte' has to take every 'NumeroPoste' value from the table "COMPOSITIONSGAMMES", and 'nouveauposte' has to be '???'.
When I insert a new row and 'nomstrategie' = 'DST', I also need other rows to be inserted for all the 'piecesfilles' in the table "NOMENCLATURES"
of the reference 'referencepiece' in the row inserted by the user. In 'ancienposte' there should be the 'numeroposte' from the table "COMPOSITIONSGAMMES".
When the user inserts a new row and 'nomstrategie' = 'delestage', I need another row to be inserted as below, for example:
inserted row: Ref A ancienposte : P01 Nouveauposte :P02 Nomstrategie :Delestage…………
row to be inserted: Ref A ancienposte : P02 Nouveauposte :NULL Nomstrategie :Delestage…………
For every row in the table "SITUATIONS", I need to calculate a value called 'charge': charge = (TS/Taillelot)+Tu
Here are the triggers I have written:
create trigger [dbo].[ALLDST]
ON [dbo].[SITUATIONS]
AFTER INSERT /* no update */
as
begin
set nocount on
insert into SITUATIONS(ReferencePiece,nomstrategie,AncienPoste,nouveauposte,DateStrategie)
select distinct i.referencepiece, i.nomstrategie,COMPOSITIONSGAMMES.NumeroPoste,'???',i.DateStrategie
from inserted i, PIECES, compositionsgammes, SITUATIONS s
where i.ReferencePiece is not null
and i.NomStrategie='DST'
and i.ReferencePiece=pieces.ReferencePiece and pieces.CodeGamme=COMPOSITIONSGAMMES.CodeGamme
and i.AncienPoste<>COMPOSITIONSGAMMES.NumeroPoste
and i.DateStrategie=s.DateStrategie
end
create trigger [dbo].[Calcul_Charge]
on [charges].[dbo].[SITUATIONS]
after insert
as
begin
update situations
set charge= (select (cg.TS/pieces.TailleLot)+cg.tu from situations s
inner join COMPOSITIONSGAMMES cg on cg.NumeroPoste=SITUATIONS.AncienPoste
inner join pieces on SITUATIONS.ReferencePiece=pieces.ReferencePiece
inner join inserted i on s.DateStrategie=i.DateStrategie
where cg.CodeGamme=pieces.CodeGamme and NumeroPoste=situations.AncienPoste
)
end
create trigger [dbo].[Duplicate_SITUATIONS]
ON [dbo].[SITUATIONS]
AFTER INSERT
as
begin
set nocount on
declare @ref varchar(50)
declare @strategie varchar(50)
declare @ancienposte varchar(50)
declare @datestrategie date
declare @pourcentage decimal(18,3)
declare @coeff decimal(18,3)
declare @charge decimal(18,3)
/*while (select referencepiece from situations where ReferencePiece) is not null*/
select @ref=referencepiece, @strategie=nomstrategie,@ancienposte=NouveauPoste,
@datestrategie=datestrategie, @pourcentage=PourcentageStrategie,@coeff=coeffameliorationposte,@charge=charge
from inserted,POSTESDECHARGE
where ReferencePiece is not null
and POSTESDECHARGE.NumeroPoste = inserted.AncienPoste
if @strategie = 'delestage' and @ancienposte is not null
/*if GETDATE()>= (select datestrategie from SITUATIONS)*/
begin
insert into SITUATIONS(ReferencePiece, nomstrategie,AncienPoste,DateStrategie,
StatutStrategie,DateModification,PourcentageStrategie,charge)
values
(@ref, @strategie, @ancienposte, @datestrategie,1,getdate(),@pourcentage,@charge*@coeff)
end
end

I'm mostly familiar with T-SQL (MS SQL), so I'm not sure this will work in your case, but I usually avoid updates that use a subquery. I would rewrite your update:
update situations
set charge= (select (cg.TS/pieces.TailleLot)+cg.tu from situations s
inner join COMPOSITIONSGAMMES cg on cg.NumeroPoste=SITUATIONS.AncienPoste
inner join pieces on SITUATIONS.ReferencePiece=pieces.ReferencePiece
inner join inserted i on s.DateStrategie=i.DateStrategie
where cg.CodeGamme=pieces.CodeGamme and NumeroPoste=situations.AncienPoste
)
as follows:
update s set
charge= (cg.TS/pieces.TailleLot)+cg.tu
from situations s
inner join COMPOSITIONSGAMMES cg on cg.NumeroPoste=s.AncienPoste
inner join pieces on s.ReferencePiece=pieces.ReferencePiece
inner join inserted i on s.DateStrategie=i.DateStrategie
where cg.CodeGamme=pieces.CodeGamme

PostgreSQL - select the results of two subqueries

I have 2 complex queries that are both subqueries in Postgres, the results of which are:
q1_results = id , delta , metric_1
q2_results = id , delta , metric_2
I'd like to combine the results of the queries, so that the outer query can access either:
results_a = id , delta , metric_1 , metric_2
results_b = id , delta , combined_metric
I can't figure out how to do this. Online searches keep leading me to UNION, but that keeps the metrics in the same column. I need to keep them split.

It's not entirely clear what you're asking in the question and the comments, but it sounds like you might be looking for a full join with a bunch of coalesce calls, e.g.:
-- create view at your option, e.g.:
-- create view combined_query as
select coalesce(a.id, b.id) as id,
coalesce(a.delta, b.delta) as delta,
a.metric_1 as metric_1,
b.metric_2 as metric_2,
coalesce(a.metric_1,0) + coalesce(b.metric_2,0) as combined_metric
from (...) as a -- the q1_results subquery
full join (...) as b on a.id = b.id -- the q2_results subquery; and a.delta = b.delta maybe?
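A self-contained way to try the full-join approach is to stand in for the two subqueries with inline VALUES lists. The sample rows below are invented purely for illustration, and joining on both id and delta is an assumption about what identifies a row.

```sql
-- Invented sample data standing in for the two complex subqueries.
WITH q1_results (id, delta, metric_1) AS (
    VALUES (1, 10, 5.0), (2, 20, 7.0)
),
q2_results (id, delta, metric_2) AS (
    VALUES (2, 20, 1.5), (3, 30, 4.0)
)
SELECT coalesce(a.id, b.id)       AS id,
       coalesce(a.delta, b.delta) AS delta,
       a.metric_1,                                  -- NULL when only q2 has the row
       b.metric_2,                                  -- NULL when only q1 has the row
       coalesce(a.metric_1, 0) + coalesce(b.metric_2, 0) AS combined_metric
FROM q1_results a
FULL JOIN q2_results b
       ON a.id = b.id AND a.delta = b.delta
ORDER BY id;
```

Row 2 appears in both inputs and gets both metrics; rows 1 and 3 appear in only one input, so the missing metric is NULL and combined_metric falls back to the one that exists.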

Unexpected SQL results: string vs. direct SQL

Working SQL
The following code works as expected, returning two columns of data (a row number and a valid value):
sql_amounts := '
SELECT
row_number() OVER (ORDER BY taken)::integer,
avg( amount )::double precision
FROM
x_function( '|| id || ', 25 ) ca,
x_table m
WHERE
m.category_id = 1 AND
m.location_id = ca.id AND
extract( month from m.taken ) = 1 AND
extract( day from m.taken ) = 1
GROUP BY
m.taken
ORDER BY
m.taken';
FOR r, amount IN EXECUTE sql_amounts LOOP
SELECT array_append( v_row, r::integer ) INTO v_row;
SELECT array_append( v_amount, amount::double precision ) INTO v_amount;
END LOOP;
Non-Working SQL
The following code does not work as expected; the first column is a row number, the second column is NULL.
FOR r, amount IN
SELECT
row_number() OVER (ORDER BY taken)::integer,
avg( amount )::double precision
FROM
x_function( id, 25 ) ca,
x_table m
WHERE
m.category_id = 1 AND
m.location_id = ca.id AND
extract( month from m.taken ) = 1 AND
extract( day from m.taken ) = 1
GROUP BY
m.taken
ORDER BY
m.taken
LOOP
SELECT array_append( v_row, r::integer ) INTO v_row;
SELECT array_append( v_amount, amount::double precision ) INTO v_amount;
END LOOP;
Question
Why does the non-working code return a NULL value for the second column when the query itself returns two valid columns? (This question is mostly academic; if there is a way to express the query without resorting to wrapping it in a text string, that would be great to know.)
Full Code
http://pastebin.com/hgV8f8gL
Software
PostgreSQL 8.4
Thank you.

The two statements aren't strictly equivalent.
Assuming id = 4, the first one gets planned/prepared on each pass, and behaves like:
prepare dyn_stmt as '... x_function( 4, 25 ) ...'; execute dyn_stmt;
The other gets planned/prepared on the first pass only, and behaves more like:
prepare stc_stmt as '... x_function( $1, 25 ) ...'; execute stc_stmt(4);
(The loop will actually make it prepare a cursor for the above, but that's beside the point here.)
A number of factors can make the two yield different results:
Search path changes made before calling the procedure will be ignored by the second form, which matters in particular if they make x_table point to something different.
Constants of all kinds and calls to immutable functions are "hard-wired" into the second form's plan.
Consider this as an illustration of these side-effects:
deallocate all;
begin;
prepare good as select now();
prepare bad as select current_timestamp;
execute good; -- yields the current timestamp
execute bad; -- yields the current timestamp
commit;
execute good; -- yields the current timestamp
execute bad; -- yields the timestamp at which it was prepared
Why the two aren't returning the same results in your case would depend on the context (you only posted part of your pl/pgsql function, so it's hard to tell), but my guess is you're running into a variation of the above kind of problem.

From Tom Lane:
I think the problem is that you're assuming "amount" will refer to a table column of the query, when actually it's a local variable of the plpgsql function. The second interpretation will take precedence unless you qualify the column reference with the table's name/alias.
Note: PG 9.0 will throw an error by default when there is an ambiguity of this type.
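Qualifying the column with the table alias, as the quote suggests, might look like the sketch below. The table demo_amounts and its rows are hypothetical stand-ins for x_table, and the DO block requires PostgreSQL 9.0 or later (on 8.4 the same body would go inside a function).

```sql
-- Hypothetical stand-in for x_table; names, types and rows are assumptions.
CREATE TEMP TABLE demo_amounts (taken date, amount double precision);
INSERT INTO demo_amounts VALUES
    ('2001-01-01', 10), ('2001-01-01', 20), ('2002-01-01', 30);

DO $$
DECLARE
    r        integer;
    amount   double precision;          -- same name as the table column
    v_row    integer[] := '{}';
    v_amount double precision[] := '{}';
BEGIN
    FOR r, amount IN
        SELECT row_number() OVER (ORDER BY m.taken)::integer,
               avg(m.amount)::double precision   -- m.amount is the column
        FROM demo_amounts m
        GROUP BY m.taken
        ORDER BY m.taken
    LOOP
        v_row    := array_append(v_row, r);
        v_amount := array_append(v_amount, amount);  -- the loop variable
    END LOOP;

    RAISE NOTICE 'rows: %, amounts: %', v_row, v_amount;
END
$$;
```

Because every reference inside the query is written as m.amount, the variable named amount is only ever used as the loop target, and the second array receives the averages rather than NULLs. Renaming the variable (for example to v_amt) would avoid the clash altogether.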
