How to create a user-defined function in Postgresql? - postgresql

I have the following query in Postgres. I want to have a function where the user can define the value of record_year and record_month using the function call, dynamically without having to specify in the query statement. That way the same query can be reused for different user input values for record_year and record_month (in this case). Can anybody help me out?
SELECT x.sid, x.record_year, x.record_month, y.addr
FROM x
FULL OUTER JOIN y
ON x.sid = y.sid
WHERE x.record_time='23:50'
and x.record_year='2020'
and x.record_month='1'
group by x.station_id, x.record_year, x.record_month, y.addr;
The function call could be something like, select function_name(record_year, record_month);

As you are returning a result the function should be defined as returns table(). PL/pgSQL is not required if all you want to do is to return a result. A SQL function will be enough.
create function get_data(p_time time, p_year int, p_month int)
returns table (sid int, record_year int, record_month int, addr text)
as
$$
SELECT x.sid, x.record_year, x.record_month, y.addr
FROM x
FULL OUTER JOIN y ON x.sid = y.sid
WHERE x.record_time = p_time
and x.record_year = p_year
and x.record_month = p_month
group by x.sid, x.record_year, x.record_month, y.addr
$$
language sql
stable;
You have to adjust the data types of the returned columns - I have only guessed them based on name.
Note that your full outer join is really a left join because of the conditions on table x: rows that exist only in y have NULL in every x column, so the WHERE clause filters them out.
As it is a set returning function, use it like a table in the FROM clause:
select *
from get_data(time '23:50', 2020, 1);
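The remark about the outer join can be written out directly. The following is only a sketch under the same guessed column types as above; get_data_left is a hypothetical name, and DISTINCT stands in for the GROUP BY because the query has no aggregates.

```sql
-- Sketch: same result as the FULL OUTER JOIN version, because the WHERE
-- conditions on x already discard rows that exist only in y.
-- get_data_left is a hypothetical name; column types are guesses.
CREATE FUNCTION get_data_left(p_time time, p_year int, p_month int)
RETURNS TABLE (sid int, record_year int, record_month int, addr text)
LANGUAGE sql
STABLE
AS $$
    SELECT DISTINCT x.sid, x.record_year, x.record_month, y.addr
    FROM x
    LEFT JOIN y ON x.sid = y.sid
    WHERE x.record_time  = p_time
      AND x.record_year  = p_year
      AND x.record_month = p_month;
$$;

-- Called the same way as get_data:
SELECT * FROM get_data_left(time '23:50', 2020, 1);
```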

Related

PostgreSQL - Writing aggregate function which do average or percentile

I would like to create an aggregate function
avg_or_percentile(column, type::text)
which aggregates by average or by a given percentile depending on the parameter 'type', which can take values like:
'avg', '50', '99' etc.
If 'avg', aggregate using the 'avg' function; otherwise use the percentile.
I don't see any way to call the built-in avg function or another aggregate function from my custom one.
If I try doing something like this:
CREATE OR REPLACE FUNCTION avg_or_percentile_transition(state internal, value double precision)
RETURNS internal AS $$
BEGIN
CASE agg_type
WHEN 'avg' THEN
return int8_avg_accum(state, value);
ELSE
return ordered_set_transition(state, value);
END CASE;
END;
$$ LANGUAGE plpgsql;
I get an error saying that the internal type cannot be used.
Is the only way to achieve this to decompose them and combine both implementations in a custom one?
Am I missing something?
EDIT 1:
Right now I am achieving it using a query like this (the $ names are Grafana macros):
Select
$__timeGroup("Closed", $__interval, 0) as "time",
CASE
WHEN '${Func}'='avg'
THEN avg(duration)
ELSE percentile_cont(try_cast_numeric('0.${Func}')) within group (order by duration asc)
END as "${Func}"
from
(SELECT
*,
extract(epoch from "Closed"-"Created")/3600 as duration
FROM pr."PullRequests" pr
WHERE $__timeFilter("Closed") and (pr."TeamName" = ${Team} or ${Team} = 'All')) withDuration
GROUP BY time
ORDER BY time
But it is a very ugly statement, and it gets even worse when we add additional columns to aggregate. I would love to simplify it.

How to convert a jsonb array and use stats moments

I needed to store an array of numbers as JSONB in PostgreSQL.
Now I'm trying to calculate statistical moments from this JSON, and I'm facing some issues.
Sample of my data:
I was already able to convert the JSON into a float array, using this function:
CREATE OR REPLACE FUNCTION jsonb_array_castdouble(jsonb) RETURNS float[] AS $f$
SELECT array_agg(x)::float[] || ARRAY[]::float[] FROM jsonb_array_elements_text($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;
Using this SQL:
with data as (
select
s.id as id,
jsonb_array_castdouble(s.snx_normalized) as serie
FROM
spectra s
)
select * from data;
I found a function that can do these calculations and I need to pass an array for that: https://github.com/ellisonch/PostgreSQL-Stats-Aggregate/
But this function expects its input in a different form: unnested, one value per row.
I already tried to use unnest, but then I get only one value at a time, not the entire array.
My goal is:
Be able to apply stats moment (kurtosis, skewness) for each row.
like:
index | skewness
1     | 21.2131
2     | 1.123
Bonus: Is there a way to avoid the 'with data' part and do the transformation directly in the select statement?
snx_wavelengths is JSON, right? You provided the sample as a picture rather than text, so I am assuming the data looks like (id, snx_wavelengths), where id is what you called index (index is an SQL keyword, so it is better avoided as a column name):
1,[1,2,3,4]
2,[373,232,435,84]
If that is right:
select id, (stats_agg(v::float)).skewness
from myMeasures,
lateral json_array_elements_text(snx_wavelengths) v
group by id;
By the way, you don't need the "with data" part of the original sample if you don't want it; you could replace it with a subquery, e.g.:
select (stats_agg(n)).* from (select unnest(array[16,22,33,24,15])) data(n)
union all
select (stats_agg(n)).* from (select unnest(array[416,622,833,224,215])) data(n);
EDIT: And if you needed other stats too:
select id, "count","min","max","mean","variance","skewness","kurtosis"
from myMeasures,
lateral (select (stats_agg(v::float)).* from json_array_elements_text(snx_wavelengths) v) foo
group by id,"count","min","max","mean","variance","skewness","kurtosis";
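The question stores the values as jsonb in spectra.snx_normalized, so the same lateral pattern would use jsonb_array_elements_text rather than the json variant. The sketch below uses only built-in aggregates (avg, stddev_samp) so that it does not depend on the stats_agg extension; with the extension installed, the (stats_agg(v::float)).skewness expression from the answer could be used in their place.

```sql
-- Sketch: per-row statistics over a jsonb array, without a CTE and without
-- first converting to float[]. Table and column names follow the question.
SELECT s.id,
       avg(v::float8)         AS mean,
       stddev_samp(v::float8) AS stddev
FROM spectra AS s
CROSS JOIN LATERAL jsonb_array_elements_text(s.snx_normalized) AS t(v)
GROUP BY s.id
ORDER BY s.id;
```

The lateral join expands each array into one row per element, and the GROUP BY folds those rows back into one result per id, which is why the aggregate sees the whole array rather than a single value.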

postgresql function error ERROR: query has no destination for result data

I have created a function in PostgreSQL, but when I try to return data I get the error below:
ERROR: query has no destination for result data
HINT: If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT: PL/pgSQL function "fn_GetAllCountData"() line 27 at SQL statement
SQL state: 42601
Below is my PostgreSQL function. In this function I am getting the task status counts in one query.
CREATE OR REPLACE FUNCTION public."fn_GetAllCountData"() RETURNS setof "AssignDetails" AS $BODY$
DECLARE
total_draft text;
total_pending text;
total_rejected text;
total_approved text;
total_prev_pending text;
"AssignDetails" text;
BEGIN
--Total pending application no by the user
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "TotalPending" into total_pending
from user
left Outer Join public."tbl_task" S
on k."UserCode"=S."taskAssignTo" and s.Status='P'
And to_char(S."assignDate"::date, 'dd-mm-yyyy') = to_char(current_Date, 'dd-mm-yyyy')
group by k."UserCode";
--Previous Pending
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "TotalPrevPending" into total_prev_pending
from kyc k
left Outer Join public."tbl_task" S
on k."UserCode"=S."taskAssignTo" and s.Status='P'
And S."assignDate" < CONCAT(current_Date, ' 00:00:00'):: timestamp
group by k."UserCode";
-- Total Objection raised by the user
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "TotalRejected" into total_rejected
from kyc k
left Outer Join tbl_task S
on k."UserCode"=S."taskAssignTo" and s.Status='R'
And to_char(S."objectionDate"::date, 'dd-mm-yyyy') = to_char(current_Date, 'dd-mm-yyyy')
group by k."UserCode";
-- Total Approved application no by the user
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "TotalApproved" into total_approved
from kyc k
left Outer Join public."tbl_task" S
on k."UserCode"=S."taskAssignTo" and s.Status='A'
And S."assignDate" < CONCAT(current_Date, ' 00:00:00'):: timestamp
group by k."UserCode";
--Application no with start Time and total time
Select K."UserCode",K."Status", K."AppType",ST."taskNo" as "TaskId", ST."startTime" as "StartTime",
case
when COALESCE(ST."endTime",'')=''
then (SELECT DATEDIFF('second', ST."startTime":: timestamp, current_timestamp::timestamp))
else (SELECT DATEDIFF('second', ST."startTime":: timestamp, ST."endTime"::timestamp))
end as "Totaltime"
into "Final"
from kyc K
left outer join public."tbl_task_details" ST
On K."UserCode"=ST."empCode";
--Total Checked In Draft application no through by the user
Select k."UserCode" as "UserCode",count(S."taskAssignTo") as "Status_Count" into total_draft
from kyc k
left Outer Join public."tbl_task" S
on k."UserCode"=S."taskAssignTo" and s.Status='D'
And S."assignDate" < CONCAT(current_Date, ' 00:00:00'):: timestamp
group by k."UserCode";
Select distinct K."UserCode",K."Status",K."AppType",K."LoginTime",K."LogoutTime",
F."TaskId",F."StartTime",F."Totaltime",
TP."TotalPending" as "Pending",
TR."TotalRejected" as "Objection",
TA."TotalApproved" as "Approved",
TS."TotalAssign" as "Total_Assigned",
TD."Status_Count" as "Draft_Count",
TPP."TotalPrevPending" As "Prev_Pending"
into "AssignDetails"
From "Final" F
Right outer join kyc K On K."UserCode"=F."UserCode"
left outer join total_scrutiny TS On K."UserCode"=Ts."UserCode"
left outer join total_draft TD On TD."UserCode"=K."UserCode"
left outer join total_pending TP On TP."UserCode"=K."UserCode"
left outer join total_rejected TR On TR."UserCode"=K."UserCode"
left outer join total_approved TA On TA."UserCode"=K."UserCode"
Left Outer Join total_prev_pending TPP On TPP."UserCode"=K."UserCode"
order by TS."TotalAssign" desc;
Select * From "AssignDetails";
END
$BODY$ LANGUAGE plpgsql;
I tried to return a table with RETURN QUERY, but it is still not working. I don't know what I am doing wrong. Please help me with this.
Please note that PostgreSQL only reports one error at a time. In fact there is a great deal wrong with your function, so much so that it would take too long to correct everything here.
I have therefore given you a cut-down version, which should point you in the right direction. I will give the code first, and then explain the points.
CREATE OR REPLACE FUNCTION public.fn_getallcountdata() RETURNS TABLE (usercode text, totalpending integer) AS $BODY$
BEGIN
CREATE TEMP TABLE total_pending
(
usercode text,
totalpending int
) ON COMMIT DROP;
--Total pending application no by the user
INSERT INTO total_pending
Select k.usercode, count(s.taskassignto)::integer
from public.user k
left Outer Join public.tbl_task s
on k.usercode=s.taskassignto and s.status='P'
And s.assigndate::date = current_date
group by k.usercode;
RETURN QUERY
select t.usercode, t.totalpending From total_pending t;
END;
$BODY$ LANGUAGE plpgsql;
Points to note:
Firstly, please avoid using mixed-case names in PostgreSQL. It means that you have to double-quote everything, which is a real pain!
Secondly, you were declaring variables as text when in fact they were meant to hold table data. This you cannot do: a PL/pgSQL variable holds a single value (or a single row), not a whole result set. One way around this is to create temporary tables in the way I have done. Note in particular the use of ON COMMIT DROP. This is a useful way in PostgreSQL to avoid having to remember to drop temporary tables when you are finished with them!
Thirdly, your alias k is not referring to anything in your first select. Note also that user is a reserved word. If you insist on having user as a name for a table, then you will need to access it through public.user (assuming it is in the public schema) or double-quote it as "user".
(As an aside, using the public schema is often considered a security risk, because before PostgreSQL 15 every role could create objects in it by default.)
Fourthly, there is no need to convert a date to string form in order to compare it. Casting a timestamp to a date and comparing it directly to another date is in fact far faster than converting both dates to a string representation and comparing the strings.
Fifthly, COUNT in PostgreSQL returns a bigint, which is why I generally cast it to integer, because an integer usually suffices!
I have defined the function to return a table containing named columns. You can use setof, but if you do, it has to refer to an existing table or composite type.
For the final SELECT I have supplied the required RETURN QUERY first. Note also that I am using a table alias. This is because the column names in the returned table match those in the temporary table, so you need to be explicit about which one you mean.
I strongly recommend that you experiment with a shorter function first (as in my cut-down version) and then increase the complexity once you have it compiling (and running). To this end, please also note that in PostgreSQL a function that compiles may still contain runtime errors, because table and column references are only checked when the statement is first executed. Also, if you change the return columns between versions, you will need to drop the previous version first, because CREATE OR REPLACE cannot change a function's return type.
Hope this points you in the right direction, but please feel free to get back with any further issues.
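If the intermediate result is not needed anywhere else, the temporary table can be skipped and the query returned directly. This also sidesteps one limitation of the temporary-table approach: calling the function twice inside one transaction would fail at CREATE TEMP TABLE because the table already exists. The following is only a sketch; fn_getpendingcounts is a hypothetical name, and public.kyc, public.tbl_task and the lower-cased column names are assumptions based on the question, not a confirmed schema.

```sql
-- Sketch: one status count returned straight from a plain SQL function.
-- All object names are assumptions derived from the question.
CREATE OR REPLACE FUNCTION public.fn_getpendingcounts()
RETURNS TABLE (usercode text, totalpending integer)
LANGUAGE sql
STABLE
AS $$
    SELECT k.usercode,
           count(s.taskassignto)::integer      -- count() returns bigint
    FROM public.kyc AS k
    LEFT JOIN public.tbl_task AS s
           ON  k.usercode = s.taskassignto
           AND s.status = 'P'
           AND s.assigndate::date = current_date
    GROUP BY k.usercode;
$$;

SELECT * FROM public.fn_getpendingcounts();
```

The other counts (rejected, approved, draft) could then be added as further columns of the same query, for example with count(...) FILTER (WHERE s.status = 'R'), instead of one temporary table per status.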

Create function to compute an average of 3 values

I'm trying to write a function in PostgreSQL to take an average across three columns. I have written the following function:
create function xcol_avg (col1, col2, col3)
returns numeric as $$
begin
return (coalesce(col1, 0) + coalesce(col2,0) +coalesce(col3, 0))/
case when (col 1 is null or col1 = 0 then 0 else 1 end +
case when (col 2 is null or col2 = 0 then 0 else 1 end +
case when (col 3 is null or col3 = 0 then 0 else 1 end;
end
What is the problem with my code? Also, is there a way to get the function to return null if it ends up dividing by 0? Any help is really appreciated.
Thanks!
Actually, you can write a function that takes a variable number of arguments and computes the average of however many are passed. Postgres has the keyword VARIADIC for this:
SQL functions can be declared to accept variable numbers of arguments, so long as all the "optional" arguments are of the same data type
Function code:
CREATE FUNCTION xcol_avg(numeric, VARIADIC numeric[])
RETURNS numeric
LANGUAGE plpgsql
IMMUTABLE
AS $$
BEGIN
RETURN (SELECT AVG(vals) FROM unnest($2 || ARRAY[$1]) t(vals));
END;
$$;
Use case with different number of arguments:
select xcol_avg(1,6); -- returns 3.5
select xcol_avg(1,5.5,4); -- returns 3.5
select xcol_avg(1,2,3,4,5,6,7); -- returns 4
Explanation:
Marking a function as IMMUTABLE lets the optimizer pre-evaluate it when it is called with constant arguments, which can improve execution time. Immutable functions cannot modify the database and are guaranteed to always return the same results when called with the same input.
Declaring the last parameter of a function as VARIADIC, which has to be of an array type, lets you provide optional arguments that will be passed to the function as an array. Note that you don't explicitly write the array; you just list your parameters as you normally would.
unnest() is a function that returns a set of rows by expanding an array. In other words, it "unpacks" the array elements into separate rows.
|| is an array operator that provides array-to-array concatenation. Here it connects the first (required) argument with the rest, given in the VARIADIC array.
AVG() is an aggregate function that computes the average of all non-NULL input values. In our case it takes the "unpacked" rows from a column named vals and computes their average.
With this solution you don't need to worry about dividing by zero: at least one argument is required, and avg() builds the denominator for you. It skips NULL inputs and returns NULL when every input is NULL.
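One difference from the question's hand-built denominator is that zeros are still counted as values by avg(). If zeros should be left out as well, they can be turned into NULLs first. A minimal sketch, assuming that is the intended behaviour; xcol_avg_nonzero is a hypothetical name:

```sql
-- Sketch: average of the arguments that are neither NULL nor 0.
-- NULLIF(v, 0) turns zeros into NULL, and avg() then skips them.
CREATE FUNCTION xcol_avg_nonzero(VARIADIC vals numeric[])
RETURNS numeric
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT avg(NULLIF(v, 0)) FROM unnest(vals) AS t(v);
$$;

SELECT xcol_avg_nonzero(4, 0, NULL);   -- only the 4 is averaged
SELECT xcol_avg_nonzero(0, NULL, 0);   -- nothing left to average: NULL
```

Because avg() over no non-NULL values yields NULL, this also gives the "return null instead of dividing by zero" behaviour the question asks for.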
Apply it in a query:
This function would also work for computing an average of multiple columns in a row. Consider a table tbl with columns name, cost1, cost2, cost3 and below statement:
SELECT
name, cost1, cost2, cost3,
xcol_avg(cost1, cost2, cost3) AS average_cost
FROM tbl
For more general information about CREATE FUNCTION, see the PostgreSQL documentation.

Postgresql function doesn't return anything when UUID argument is used in WHERE clause

I'm pretty new to PostgreSQL. The issue I'm having is that I have a function that returns a table, but when I pass a UUID that is used in the WHERE clause, it returns nothing. The funny thing is that if I take the SQL statement inside the function and run it by itself in pgAdmin, it gives me the right result.
The function looks like the following:
CREATE OR REPLACE FUNCTION get_service (
service_id uuid ) RETURNS TABLE(id uuid,title text,description text,category text,photo_url text,address text,
created_by uuid,created_on timestamp,service_rating float,rating_count bigint) AS $func$
Select
service.id,
service.title,
service.description,
service.category,
service.photo_url,
service.address,
service.created_by,
service.created_on,
CAST(AVG(rating.rating) AS float) as service_rating,
Count(rating.rating) as rating_count
from service
left join rating_service_map map
on service.id = map.service_id
left join rating
on rating.id = map.rating_id
where service.id = service_id
group by service.id,service.title,service.description,service.category,service.photo_url,service.address,service.created_by,service.created_on;
$func$ LANGUAGE SQL;
I have two records in my service table. The ID is of the type uuid and has a default value of uuid_generate_v4(). One of the records has an id of '2af3f03e-b2e5-44fd-89e8-3dc5fb641732'
If I run this I get no result:
select * from get_service('2af3f03e-b2e5-44fd-89e8-3dc5fb641732')
But if I run the following statement (the SQL portion of the function), then I get my right result:
Select
service.id,
service.title,
service.description,
service.category,
service.photo_url,
service.address,
service.created_by,
service.created_on,
CAST(AVG(rating.rating) AS float) as service_rating,
Count(rating.rating) as rating_count
from service
left join rating_service_map map
on service.id = map.service_id
left join rating
on rating.id = map.rating_id
where service.id = '2af3f03e-b2e5-44fd-89e8-3dc5fb641732'
group by service.id,service.title,service.description,service.category,service.photo_url,service.address,service.created_by,service.created_on;
I've also tried to cast the service_id (I've tried "where service.id = service_id::uuid" and "where service.id = CAST(service_id AS uuid)"), but neither of them worked.
I really appreciate it if you can tell me what I'm doing wrong. I've been at this for a couple of hours now.
Thank you.
I suspect that it's because the identifier service_id is ambiguous, being present as both a function parameter and a column in the map table.
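Unlike a plain query, where such ambiguity would result in an error, conflicts in SQL functions are resolved by giving precedence to the column, so service_id in your case is actually referring to map.service_id. If so, the condition compares service.id with map.service_id and ignores the argument entirely, which means a service with no rows in rating_service_map is filtered out.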
You can either qualify it in your function body using the name of your function (i.e. get_service.service_id), or simply choose another name for the parameter.
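The first option can be sketched as follows. This is a trimmed-down illustration rather than the full function from the question; get_service_summary is a hypothetical name, while the table and column names follow the question.

```sql
-- Sketch: the parameter is qualified with the function name, so it can no
-- longer be mistaken for the column map.service_id.
CREATE OR REPLACE FUNCTION get_service_summary(service_id uuid)
RETURNS TABLE (id uuid, title text, rating_count bigint)
LANGUAGE sql
STABLE
AS $func$
    SELECT s.id,
           s.title,
           count(r.rating) AS rating_count
    FROM service AS s
    LEFT JOIN rating_service_map AS map ON s.id = map.service_id
    LEFT JOIN rating AS r ON r.id = map.rating_id
    WHERE s.id = get_service_summary.service_id   -- the parameter, not the column
    GROUP BY s.id, s.title;
$func$;

SELECT * FROM get_service_summary('2af3f03e-b2e5-44fd-89e8-3dc5fb641732');
```

Renaming the parameter (for example to p_service_id) avoids the clash altogether and is probably easier to read in longer function bodies.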