Create a list of objects in PL/pgSQL - postgresql

I need to create a list of objects in PL/pgSQL (PostgreSQL) and return it to the user as a table.
Here is the scenario. I have two tables:
create table ProcessDetails(
processName varchar,
processstartdate timestamp,
processenddate timestamp);
create table processSLA(
processName varchar,
sla numeric);
Now I need to loop over all the records in the ProcessDetails table and, for each process name, count how many records are within the SLA, how many have breached it, and how many have used more than 80% of it.
I would like help understanding how to loop over the records and build a collection that holds these details for each process name.
Sample data from the ProcessDetails table:
ProcessName processstartdate processenddate
-----------------------------------------------------
"Create" "2018-12-24 13:11:05.122694" null
"Delete" "2018-12-24 12:12:24.269266" null
"Delete" "2018-12-23 13:12:31.89164" null
"Create" "2018-12-22 13:12:37.505486" null
Sample data from the processSLA table:
ProcessName sla(in hrs)
---------------------------------
Create 1
Delete 10
And the output should look something like this:
ProcessName WithinSLA(Count) BreachedSLA(Count) Exceeded80%SLA(Count)
---------------------------------------------------------------------
Create 1 1 3
Delete 1 2 1

For each SLA, you can look up all corresponding process details with a join. The link between the two joined tables is specified in the join condition; for your example, using (processName) works.
A process has breached its SLA when the allowed end time (start time plus the SLA hours) is earlier than the actual end time, or than now() if the process has not finished yet:
select processName
-- within SLA: the allowed end time has not been passed
, count(case when det.processstartdate + interval '1 hour' * sla.sla >=
coalesce(det.processenddate, now()) then 1 end) as InSLA
-- breached: the allowed end time is before the actual end time (or now)
, count(case when det.processstartdate + interval '1 hour' * sla.sla <
coalesce(det.processenddate, now()) then 1 end) as BreachedSLA
-- more than 80% of the SLA used (this also counts breached processes)
, count(case when det.processstartdate + interval '1 hour' * 0.8 * sla.sla <
coalesce(det.processenddate, now()) then 1 end) as Over80PercentSLA
from processSLA sla
left join
ProcessDetails det
using (processName)
group by
processName

You can join both tables and use conditional aggregation based on the difference between the timestamps, expressed in hours.
Something like this:
SELECT ps.processname,
count(CASE
WHEN extract(EPOCH FROM pd.processenddate - pd.processstartdate) / 3600 < ps.sla * .8 THEN
1
END) "less than 80%",
count(CASE
WHEN extract(EPOCH FROM pd.processenddate - pd.processstartdate) / 3600 >= ps.sla * .8
AND extract(EPOCH FROM pd.processenddate - pd.processstartdate) / 3600 <= ps.sla THEN
1
END) "80% to 100%",
count(CASE
WHEN extract(EPOCH FROM pd.processenddate - pd.processstartdate) / 3600 > ps.sla THEN
1
END) "more than 100%"
FROM processdetails pd
INNER JOIN processsla ps
ON ps.processname = pd.processname
-- one row per process; rows whose processenddate is NULL fall into no bucket
GROUP BY ps.processname;
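To sanity-check the bucket boundaries outside the database, the three CASE branches can be mirrored in a few lines of Python. This is only a sketch: the function name sla_bucket and the sample rows are made up for illustration, and elapsed time is assumed to be already expressed in hours.

```python
from collections import Counter

def sla_bucket(elapsed_hours: float, sla_hours: float) -> str:
    """Return the bucket label that the conditional aggregation would count."""
    if elapsed_hours < 0.8 * sla_hours:
        return "less than 80%"
    if elapsed_hours <= sla_hours:
        return "80% to 100%"
    return "more than 100%"

# Hypothetical data: (process name, elapsed hours) and the SLA in hours.
rows = [("Create", 0.5), ("Create", 0.9), ("Delete", 12.0)]
sla = {"Create": 1.0, "Delete": 10.0}

# One counter per (process, bucket) pair, like GROUP BY processname.
counts = Counter((name, sla_bucket(hours, sla[name])) for name, hours in rows)
```

Each row lands in exactly one bucket, so the three counts per process always add up to the number of finished processes.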

Related

Calculations inside window function in PostgreSQL

I have a dataset of sales. To summarize, the structure is
client_id
date_purchase
There might be several purchases done by the same customer on different dates. There can also be several purchases done on the same date (by different or the same customer).
My goal is to get the number of customers, for any given day, that made 2 or more purchases between that day and 90 days prior.
That is, the expected output is
date_purchase | number_of_customers
---------------+---------------------
2022-12-19 | 200
2022-12-18 | 194
(...)
Please note that this calculates, for any given date, the number of customers with 2+ purchases between that date and 90 days prior.
I know it has something to do with a window function. But so far I have not found a way to calculate, for every window of 90 days, how many customers have done 2+ purchases.
I've tried several window functions with no success:
partition by date_purchase
range between interval '90 days' preceding and current row
So far I have not managed to calculate the correct number for each date.
A window function doesn't seem relevant here, because there is no relationship between the rows of the same window. A simple query or a self-join query should provide the expected result.
Assuming that client_id and date_purchase are two columns of my_table:
1. Query for a given date reference_date :
-- reference_date is a placeholder for the date of interest
SELECT a.reference_date AS date_purchase, count(*) AS number_of_customers
FROM ( SELECT reference_date AS reference_date, client_id
FROM my_table
WHERE date_purchase <= reference_date AND date_purchase >= reference_date - INTERVAL '90 days'
GROUP BY client_id
HAVING count(*) >= 2
) AS a
GROUP BY a.reference_date
2. Query for a given interval of dates reference_date => reference_date + INTERVAL '20 days' :
SELECT a.date AS date_purchase, count(*) AS number_of_customers
FROM ( SELECT ref.date, t.client_id
FROM my_table AS t
INNER JOIN generate_series(reference_date, reference_date + INTERVAL '20 days', '1 day') AS ref(date)
ON t.date_purchase <= ref.date AND t.date_purchase >= ref.date - INTERVAL '90 days'
GROUP BY ref.date, t.client_id
HAVING count(*) >= 2
) AS a
GROUP BY a.date
ORDER BY a.date
3. Query for all the date_purchase values in my_table :
SELECT a.date AS date_purchase, count(*) AS number_of_customers
FROM ( SELECT ref.date, t.client_id
FROM my_table AS t
INNER JOIN (SELECT DISTINCT date_purchase AS date FROM my_table) AS ref
ON t.date_purchase <= ref.date AND t.date_purchase >= ref.date - INTERVAL '90 days'
GROUP BY ref.date, t.client_id
HAVING count(*) >= 2
) AS a
GROUP BY a.date
ORDER BY a.date
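The logic of the third query can be checked on a handful of rows with a small Python sketch. The function name and the sample purchases are hypothetical; like the query, it uses every distinct purchase date as a reference date and omits dates on which no customer qualifies.

```python
from collections import Counter
from datetime import date, timedelta

def customers_with_repeat_purchases(purchases, window_days=90, min_purchases=2):
    """Map each purchase date to the number of clients with at least
    min_purchases purchases between that date and window_days before it."""
    reference_dates = sorted({d for _, d in purchases})
    result = {}
    for ref in reference_dates:
        window_start = ref - timedelta(days=window_days)
        # Purchases per client inside the window (both ends inclusive).
        per_client = Counter(c for c, d in purchases if window_start <= d <= ref)
        qualifying = sum(1 for n in per_client.values() if n >= min_purchases)
        if qualifying:  # the HAVING clause drops dates with no qualifying client
            result[ref] = qualifying
    return result

# Hypothetical sample: (client_id, date_purchase)
sample = [
    (1, date(2022, 1, 1)), (1, date(2022, 2, 1)), (1, date(2022, 6, 1)),
    (2, date(2022, 2, 1)), (2, date(2022, 6, 1)),
    (3, date(2022, 6, 1)), (3, date(2022, 6, 1)),
]
summary = customers_with_repeat_purchases(sample)
```

Here client 1 qualifies on 2022-02-01 (two purchases within 90 days) and client 3 qualifies on 2022-06-01 (two purchases on the same day).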

Dynamic value passing in Postgres

Here is a complex query to which I need to pass some dates dynamically. For now I have hardcoded the two dates '2021-08-01' and '2022-07-31'.
The dates should shift with the month being reported: for month 2022-06 the dates passed should be '2021-07-01' and '2022-06-30', that is, the trailing 12 months of data.
For 2022-05 the dates should be '2021-06-01' and '2022-05-31'.
How can we achieve this? Any suggestions or help would be much appreciated.
Below is the query for reference:
WITH base as
(
SELECT created_at as period ,order_number, TRIM(email) as email ,is_first_order
FROM orders
WHERE created_at::DATE BETWEEN '2021-08-01' AND '2022-07-31'
)
,base_agg as
(
select TO_CHAR(period,'YYYY-MM') as period
,COUNT(DISTINCT email)FILTER(WHERE is_first_order IS TRUE) as new_users
,COUNT(DISTINCT order_number)FILTER(WHERE is_first_order IS FALSE) as returning_orders
FROM base
GROUP BY 1
)
,base_cumulative as
(
SELECT ROW_NUMBER() OVER(ORDER BY PERIOD DESC ) as rno
,period
,new_users
,returning_orders
,sum("new_users")over (order by "period" asc rows between unbounded preceding and current row) as "cumulative_total"
from base_agg
)
SELECT
(SELECT period FROM base_cumulative WHERE rno=1) period
,(SELECT cumulative_total FROM base_cumulative WHERE rno=1) as cumulated_customers
,SUM(returning_orders) as returning_orders
,SUM(returning_orders)/NULLIF((SELECT cumulative_total FROM base_cumulative WHERE rno=1),0) as rate
FROM base_cumulative
You can calculate the end of the current month from NOW() with some date arithmetic; the same approach applies to the rest of the calculation:
select date_trunc('month', now())::date + interval '1 month - 1 day' end_of_this_month,
date_trunc('month', now())::date + interval '1 month - 1 day'::interval - '1 year'::interval + '1 day'::interval first_day_of_prev_year_month
;
Result
end_of_this_month | first_day_of_prev_year_month
---------------------+------------------------------
2022-08-31 00:00:00 | 2021-09-01 00:00:00
(1 row)
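The same window arithmetic can be checked against the dates listed in the question with a short Python sketch. The function name trailing_twelve_months is an assumption made for illustration; it takes the reporting month and returns the first day of the month 11 months earlier together with the last day of the reporting month.

```python
import calendar
from datetime import date

def trailing_twelve_months(year: int, month: int):
    """Return (first_day, last_day) of the 12-month window ending in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Count months from year 0, step back 11 months, then convert back.
    start_index = year * 12 + (month - 1) - 11
    first_day = date(start_index // 12, start_index % 12 + 1, 1)
    return first_day, last_day

window = trailing_twelve_months(2022, 7)
```

For the reporting month 2022-07 this yields 2021-08-01 and 2022-07-31, the two dates hardcoded in the original query.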

Postgres: Calculating the number of working months in the last X years

I have a users table and a jobs table. A user has many jobs, and each job has a start_date and an end_date:
Column | Type | Modifiers
----------------+-----------------------------+---------------------------------------------------
id | integer | not null default nextval('jobs_id_seq'::regclass)
title | character varying |
employer | character varying |
start_date | date |
end_date | date |
user_id | integer |
I need to calculate the total number of months that a person has spent working within the past X years.
I've looked at OVERLAPS and played with intervals a bit, but I can't quite figure out what I need. I want to make sure that even if the start_date is outside the X-year range, I still count the months that are inside the range.
Here is what I have so far:
select sum(EXTRACT(YEAR FROM months) * 12 + EXTRACT(MONTH FROM months))
as working_months
from (
select CASE current
WHEN true THEN
age(current_date, start_date)
ELSE age(end_date, start_date)
END as months
from jobs inner join users on jobs.user_id = users.id
where users.id = 4
) as employment_time;
with jobs (start_date, end_date, user_id) as ( values
('2000-01-01'::date, '2005-12-31'::date, 1),
('2007-10-01', '2008-09-30', 1),
('2010-09-01', '2014-10-20', 1)
)
select
user_id,
extract(year from work_time) * 12 + extract(month from work_time) as months
from (
select
user_id,
sum(age(upper(period), lower(period))) as work_time
from (
select
user_id,
daterange(start_date, end_date, '[]') *
daterange((current_date - interval '10 years')::date, current_date)
as period
from jobs
) s
group by user_id
) s
;
user_id | months
---------+--------
1 | 70
See the PostgreSQL documentation on range types and range functions.
The basic query would be this:
SELECT sum(extract(year from months) * 12 + extract(month from months)) AS working_months
FROM (
SELECT
-- age(later, earlier): the end of the job, then its start clamped to the 5-year window
age((CASE current
WHEN true THEN current_date
ELSE end_date
END)::timestamp,
(CASE (start_date, start_date) OVERLAPS (current_date, interval '-5 years')
WHEN true THEN start_date
ELSE current_date - interval '5 years'
END)::timestamp) AS months
FROM jobs
WHERE user_id = 4
-- skip jobs that ended before the window starts
AND (current OR end_date > current_date - interval '5 years')) AS employment_time;
You may also put this in a SQL function with parameters for the number of years and user_id. Note that you throw away partial months from individual jobs. You can add extract(day from months) / 30 to the top SELECT to harvest those partial months into full months.
This assumes that jobs cannot overlap. If they do, then the query becomes much more complex.
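The clamping idea (cut each job down to the part that lies inside the window, then count whole months per job) can be sketched in Python. The function names and the fixed window below are assumptions for illustration. Partial months are dropped per job, as noted above, so the total can differ from a query that sums the intervals first.

```python
from datetime import date

def whole_months(start: date, end: date) -> int:
    """Number of complete months between start and end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not complete yet
    return max(months, 0)

def working_months(jobs, window_start: date, window_end: date) -> int:
    """Sum whole months of each (start, end) job inside the window.
    An end of None means the job is still current."""
    total = 0
    for job_start, job_end in jobs:
        clamped_start = max(job_start, window_start)
        clamped_end = min(job_end or window_end, window_end)
        if clamped_start < clamped_end:  # job overlaps the window
            total += whole_months(clamped_start, clamped_end)
    return total

jobs = [
    (date(2000, 1, 1), date(2005, 12, 31)),
    (date(2007, 10, 1), date(2008, 9, 30)),
    (date(2010, 9, 1), date(2014, 10, 20)),
]
total = working_months(jobs, date(2005, 1, 1), date(2015, 1, 1))
```

With the window fixed at 2005-01-01 to 2015-01-01, the three jobs contribute 11, 11 and 49 whole months.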

need to capture NULL in a query

I have a query that shows the success rate for staff and works splendidly, except that if staff member "Bob" has had no activity in the date range, he does not appear in the results. If he had at least one code in the query, it would result in 0% or 100%; if there are no codes attached to his name, he does not show up at all. I have seen an example that uses
ISNULL(s.code, 'No Entry') AS NoContact
but I guess I am not using it correctly, and I cannot figure out how to add it to the query. Can someone assist?
Here is the current query, which works great but omits any staff who do not have any of the codes:
SELECT st.staff_id
,round((count(s.code IN ('10401','10402','10403') OR NULL) * 100.0)
/ count(*), 1) AS successes
-- unsuccessful code is 10405
FROM notes n
JOIN services s ON s.zzud_service = n.zrud_service
JOIN staff st ON st.zzud_staff = n.zrud_staff
WHERE n.date_service >= DATE '07/01/2014' AND n.date_service <= CURRENT_DATE
-- n.date_service BETWEEN (now() - '30 days'::interval) AND now()
AND s.code IN ('10401','10402','10403','10405')
GROUP BY st.staff_id;
Here is a sample result:
Staff SuccessRate Explanation
Sam 100% (has 1 successful and 0 unsuccessful)
Joe 50% (has 1 successful and 1 unsuccessful)
Amy 0% (has 1 unsuccessful)
Bob does not show ( no discharges in the date range)
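Since you place the staff table at the end, you need to right join it and move the filter conditions from the WHERE clause into the join conditions; otherwise the WHERE clause discards the rows for staff without matching notes. Staff with no matching notes then show a success rate of 0, because count(*) still counts their single null-extended row.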
select
st.staff_id,
round(
count(s.code in ('10401','10402','10403') or null) * 100.0
/
count(*)
, 1) as successes
-- unsuccessful code is 10405
from
notes n
inner join
services s on
s.zzud_service = n.zrud_service and
n.date_service >= date '07/01/2014' and
n.date_service <= current_date
right join
staff st on
st.zzud_staff = n.zrud_staff
-- n.date_service between (now() - '30 days'::interval) and now()
and s.code in ('10401','10402','10403','10405')
group by st.staff_id;

Checking missing hours for every id in a table

I have a table that contains a column for ids (id_code) and a transaction time (time). I need to find, for each id, the hours between two dates in which no transaction took place. Say I need to check missing hours for id 1 and id 2 in the table below between 2014-06-13 12:00:00 and 2014-06-13 14:59:59: the desired result is that id 1 is missing transactions in the hour starting 2014-06-13 13:00:00 and id 2 in the hour starting 2014-06-13 14:00:00.
id_code | time
1 | 2014-06-13 12:23:12
2 | 2014-06-13 12:27:23
1 | 2014-06-13 12:56:21
2 | 2014-06-13 13:34:12
1 | 2014-06-13 14:23:56
I am using PostgreSQL 9.3
SQL Fiddle
select c.id, d.time
from
(
select distinct id
from t
) c
cross join
generate_series (
(select date_trunc('hour', min(t.time)) from t),
(select date_trunc('hour', max(t.time)) from t),
interval '1 hour'
) d(time)
left join
(
select id, date_trunc('hour', t.time) as time
from t
group by id, 2
) t on t.time = d.time and c.id = t.id
where t.time is null
order by c.id, d.time
The generate_series call builds the set of all hours between the first and the last transaction. The cross join turns that into a matrix of every id against every hour. The t.time is null condition then keeps only the id and hour combinations that have no transaction.
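The same "all combinations minus existing ones" idea can be reproduced on the sample rows from the question with a short Python sketch. The function name missing_hours is hypothetical, and the sketch takes an explicit start and end time instead of deriving them from the data.

```python
from datetime import datetime, timedelta

def truncate_to_hour(ts: datetime) -> datetime:
    return ts.replace(minute=0, second=0, microsecond=0)

def missing_hours(transactions, start: datetime, end: datetime):
    """Return (id, hour) pairs between start and end with no transaction."""
    # Every hour in the range, like generate_series with a 1 hour step.
    hours = []
    current = truncate_to_hour(start)
    while current <= end:
        hours.append(current)
        current += timedelta(hours=1)
    ids = sorted({id_code for id_code, _ in transactions})
    # Hours in which each id actually had a transaction.
    seen = {(id_code, truncate_to_hour(ts)) for id_code, ts in transactions}
    # Cross join of ids and hours, minus the combinations that exist.
    return [(i, h) for i in ids for h in hours if (i, h) not in seen]

transactions = [
    (1, datetime(2014, 6, 13, 12, 23, 12)),
    (2, datetime(2014, 6, 13, 12, 27, 23)),
    (1, datetime(2014, 6, 13, 12, 56, 21)),
    (2, datetime(2014, 6, 13, 13, 34, 12)),
    (1, datetime(2014, 6, 13, 14, 23, 56)),
]
gaps = missing_hours(transactions,
                     datetime(2014, 6, 13, 12, 0, 0),
                     datetime(2014, 6, 13, 14, 59, 59))
```

On this data the sketch reports the 13:00 hour for id 1 and the 14:00 hour for id 2, which is the result the question asks for.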
SELECT DISTINCT id, h FROM t, generate_series('2014-06-13 12:00:00'::timestamp, '2014-06-13 14:59:59'::timestamp, '1 hour') h
EXCEPT
SELECT id, date_trunc('hour', time) FROM t
Thanks to Clodoaldo Neto for providing a useful SQL Fiddle page for testing!