filling missing rows with currency - postgresql

I have the following table (with different currencies):
date
currency
ex_rate
30/11/2020 00.00
USD
0.8347245409015025
27/11/2020 00.00
USD
0.8387854386847845
26/11/2020 00.00
USD
0.84033613445378152
As you can see, there is some missing data for two dates. I would like to fill it with the previous available date, so it would be like this:
date
currency
ex_rate
30/11/2020 00.00
USD
0.8347245409015025
29/11/2020 00.00
USD
0.8387854386847845
28/11/2020 00.00
USD
0.8387854386847845
27/11/2020 00.00
USD
0.8387854386847845
26/11/2020 00.00
USD
0.84033613445378152
Or redirect me to a question of the same kind

You can use generate_series to build the series of dates from the earlierst and later values available in the table, then bring the corresponding rows with a lateral join:
select d.dt, 'USD', t.ex_rate
from (
select generate_series(min(date), max(date), interval '1 day') as dt
from mytable
where currency = 'USD'
) d
cross join lateral (
select t.*
from mytable t
where currency = 'USD' and t.date <= d.dt
order by t.date desc limit 1
) t
I wonder whether a left join on date equality and then some window function technique to build groups of records might be more efficient:
select dt, 'USD', max(ex_rate) over(partition by grp) as ex_rate
from (
select d.*, t.*,
count(t.date) over(order by d.dt) as grp
from (
select generate_series(min(date), max(date), interval '1 day') as dt
from mytable
where currency = 'USD'
) d
left join mytable t on currency = 'USD' and t.date = d.dt
) t
Note that this can easily be generalized to handle all currencies at once:
select dt, currency, max(ex_rate) over(partition by currency, grp) as ex_rate
from (
select d.dt, c.currency, t.ex_rate,
count(t.date) over(partition by c.currency order by d.dt) as grp
from (select distinct currency from mytable) c
cross join (
select generate_series(min(date), max(date), interval '1 day') as dt
from mytable
) d
left join mytable t on t.currency = c.currency and t.date = d.dt
) t

Related

How to collapse overlapping date periods with acceptable gaps using T-SQL?

We want to group our members' enrollments into "continuous enrollments," allowing for a gap of up to 45 days. I know how to use LEAD to determine if an enrollment should be grouped with the next, but I don't know how to group them. Would it be more appropriate to add 45 to the term date and subtract 45 from the effective date, then check for overlapping date periods? My goal is to have a SQL view that returns the results similar to the final query below. Thank you for your help.
SELECT '101' AS MemID, '2021-01-01' AS EffDate, '2021-01-31' AS TermDate INTO #T1 UNION
SELECT '101', '2021-02-01', '2021-02-28' UNION
SELECT '101', '2021-03-01', '2021-03-31' UNION
SELECT '101', '2021-06-01', '2021-06-30' UNION
SELECT '999', '2021-01-01', '2021-01-15' UNION
SELECT '999', '2021-09-01', '2021-09-28' UNION
SELECT '999', '2021-10-01', '2021-10-31'
SELECT *
, LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate) AS LeadEffDate
, DATEDIFF(DAY, TermDate, (LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate))) AS DaysToNextEnrollment
, CASE WHEN (DATEDIFF(DAY, TermDate, (LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate)))) <= 45 THEN 1 ELSE 0 END AS CombineWithNextRecord
FROM #T1
-- result objective
SELECT 101 AS MemID, '2021-01-01' AS EffDate, '2021-03-31' AS TermDate UNION
SELECT 101, '2021-06-01', '2021-06-30' UNION
SELECT 999, '2021-01-01', '2021-01-15' UNION
SELECT 999, '2021-09-01', '2021-10-31'
I think you are really close. Your question is very similar to
TSQL - creating from-to date table while ignoring in-between steps with conditions with a logic difference on what you want to consider to be the same group.
My basic approach is to use the LAG() function to figure out the previous values for MemID and TermDate and combine that with your 45 day rule to define a group. And finally get the first and last values of each group.
Here is my response to that question modified to your situation.
SELECT
a4.MemID
, CONVERT (DATE, a4.First_EffDate) AS [EffDate]
, CONVERT (DATE, a4.TermDate) AS [TermDate]
FROM (
SELECT
a3.MemID
, a3.EffDate
, a3.TermDate
, a3.MemID_group
, FIRST_VALUE (a3.EffDate) OVER (PARTITION BY a3.MemID_group ORDER BY a3.EffDate) AS [First_EffDate]
, ROW_NUMBER () OVER (PARTITION BY a3.MemID_group ORDER BY a3.EffDate DESC) AS [Row_number]
FROM (
SELECT
a2.MemID
, a2.EffDate
, a2.TermDate
, a2.Previous_MemID
, a2.Previous_TermDate
, a2.New_group
, SUM (a2.New_group) OVER (ORDER BY a2.MemID, a2.EffDate) AS [MemID_group]
FROM (
SELECT
a1.MemID
, a1.EffDate
, a1.TermDate
, a1.Previous_MemID
, a1.Previous_TermDate
---------------------------------------------------------------------------------
-- new group if the MemID is different from the previous row OR
-- if the MemID is the same as the previous row AND it has been more than 45 days
-- between the TermDate of the previous row and the EffDate of the current row
,
IIF((a1.MemID <> a1.Previous_MemID)
OR (
a1.MemID = a1.Previous_MemID
AND DATEDIFF (DAY, a1.Previous_TermDate, a1.EffDate) > 45
)
, 1
, 0) AS [New_group]
---------------------------------------------------------------------------------
FROM (
SELECT
MemID
, EffDate
, TermDate
, LAG (MemID) OVER (ORDER BY MemID) AS [Previous_MemID]
, LAG (TermDate) OVER (PARTITION BY MemID ORDER BY EffDate) AS [Previous_TermDate]
FROM #T1
) a1
) a2
) a3
) a4
WHERE a4.[Row_number] = 1;
Here is the dbfiddle.

How to group by when using a modulo

For each company, I want to sum the revenue for the 4 most recent quarters, then the 4 subsequent ones, and so on (see screenshot attached for details). How can I do that?
SQL query and result - 1st attempt (failed)
https://i.stack.imgur.com/wWhhb.png
SELECT
ticker,
period_end_date,
revenue,
1+ ((rn - 1) % 4) AS test
FROM (
SELECT
ticker,
period_end_date,
revenue,
ROW_NUMBER() OVER (PARTITION BY ticker ORDER BY period_end_date DESC) rn
FROM "ANALYTICS"."vQUARTERLY_MASTER_MATERIALIZED"
--WHERE ticker = 'ACN'
ORDER BY ticker
) q
EDIT: the following code meets my needs. The 'revenue' is summed using the most recent quarter and the 3 quarters thereafter.
SELECT
ticker,
period_end_date,
SUM(revenue) OVER (PARTITION BY ticker ORDER BY period_end_date DESC ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) AS total_revenue
FROM "ANALYTICS"."vQUARTERLY_MASTER_MATERIALIZED"
--WHERE ticker = 'ACN'
ORDER BY ticker
You can try this :
SELECT ticker
, period_end_date
, total_revenue
FROM (
SELECT ticker
, period_end_date
, SUM(revenue) OVER (PARTITION BY ticker ORDER BY period_end_date DESC ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) AS total_revenue
, max(period_end_date) OVER (PARTITION BY ticker) AS period_end_date_max
FROM "ANALYTICS"."vQUARTERLY_MASTER_MATERIALIZED"
--WHERE ticker = 'ACN
) q
WHERE EXTRACT(MONTH FROM period_end_date) = EXTRACT(MONTH FROM period_end_date_max)
ORDER BY ticker, period_end_date ASC

TSQL - Replace Cursor

I found in our database a cursor statement and I would like to replace it.
Declare #max_date datetime
Select #max_date = max(finished) From Payments
Declare #begin_date datetime = '2015-02-01'
Declare #end_of_last_month datetime
While #begin_date <= #max_date
Begin
SELECT #end_of_last_month = CAST(DATEADD(DAY, -1 , DATEFROMPARTS(YEAR(#begin_date),MONTH(#begin_date),1)) AS DATE) --AS end_of_last_month
Insert Into #table(Customer, ArticleTypeID, ArticleType, end_of_month, month, year)
Select Count(distinct (customerId)), prod.ArticleTypeID, at.ArticleType, #end_of_last_month, datepart(month, #end_of_last_month), datepart(year, #end_of_last_month)
From Customer cust
Inner join Payments pay ON pay.member_id = m.member_id
Inner Join Products prod ON prod.product_id = pay.product_id
Inner Join ArticleType at ON at.ArticleTypeID = prod.ArticleTypeID
Where #end_of_last_month between begin_date and expire_date
and completed = 1
Group by prod.ArticleTypeID, at.ArticleType
order by prod.ArticleTypeID, at.ArticleType
Set #begin_date = DATEADD(month, 1, #begin_date)
End
It groups all User per Month where the begin- and expire date in the actual Cursormonth.
Notes:
The user has different payment types, for e.g. 1 Month, 6 Month and so on.
Is it possible to rewrite the code - my problem is only the identification at the where clause (#end_of_last_month between begin_date and expire_date)
How can I handle this with joins or cte's?
What you need first, if not already is a numbers table
Using said Numbers table you can create a dynamic list of dates for "end_of_Last_Month" like so
;WITH ctexAllDates
AS (
SELECT end_of_last_month = DATEADD(DAY, -1, DATEADD(MONTH, N.N -1, #begin_date))
FROM
dbo.Numbers N
WHERE
N.N <= DATEDIFF(MONTH, #begin_date, #max_date) + 1
)
select * FROM ctexAllDates
Then combine with your query like so
;WITH ctexAllDates
AS (
SELECT end_of_last_month = DATEADD(DAY, -1, DATEADD(MONTH, N.N -1, #begin_date))
FROM
dbo.Numbers N
WHERE
N.N <= DATEDIFF(MONTH, #begin_date, #max_date) + 1
)
INSERT INTO #table
(
Customer
, ArticleTypeID
, ArticleType
, end_of_month
, month
, year
)
SELECT
COUNT(DISTINCT (customerId))
, prod.ArticleTypeID
, at.ArticleType
, A.end_of_last_month
, DATEPART(MONTH, A.end_of_last_month)
, DATEPART(YEAR, A.end_of_last_month)
FROM
Customer cust
INNER JOIN Payments pay ON pay.member_id = m.member_id
INNER JOIN Products prod ON prod.product_id = pay.product_id
INNER JOIN ArticleType at ON at.ArticleTypeID = prod.ArticleTypeID
LEFT JOIN ctexAllDates A ON A.end_of_last_month BETWEEN begin_date AND expire_date
WHERE completed = 1
GROUP BY
prod.ArticleTypeID
, at.ArticleType
, A.end_of_last_month
ORDER BY
prod.ArticleTypeID
, at.ArticleType;

GROUP BY Month, Display Month Name in SELECT List

I am trying to group by month to separate my result set into monthly breakdowns, but I want to display the month name instead of the month number.
SELECT
DATEPART(MONTH, R.received_date) AS [Month],
--DATENAME( MONTH, DATEPART( MONTH, R.received_date)) AS [Month] ,
o.name AS [CountyName] ,
rsc.description AS [Filing],
COUNT(*) AS [Total]
FROM dbo.requests AS [r]
INNER JOIN dbo.organizations AS [o] ON o.id = r.submitted_to_organiztion_id
INNER JOIN dbo.request_status_codes AS [rsc] ON rsc.code = r.request_status_code
WHERE r.submitted_to_organiztion_id < 68
AND r.request_type_code = 1
AND CAST(r.received_date AS DATE) >= '01/01/2016'
AND CAST(r.received_date AS DATE) <= '06/30/2016'
AND o.name = 'Alachua'
GROUP BY
DATEPART(MONTH, R.received_date) ,
--DATENAME( MONTH, DATEPART( MONTH, R.received_date)) ,
o.name ,
rsc.description;
And the results are expected below:
But if I group by DATENAME, the results are not what I would expect from the query below:
SELECT
--DATEPART(MONTH, R.received_date) AS [Month],
DATENAME( MONTH, DATEPART( MONTH, R.received_date)) AS [Month] ,
o.name AS [CountyName] ,
rsc.description AS [Filing],
COUNT(*) AS [Total]
FROM dbo.requests AS [r]
INNER JOIN dbo.organizations AS [o] ON o.id = r.submitted_to_organiztion_id
INNER JOIN dbo.request_status_codes AS [rsc] ON rsc.code = r.request_status_code
WHERE r.submitted_to_organiztion_id < 68
AND r.request_type_code = 1
AND CAST(r.received_date AS DATE) >= '01/01/2016'
AND CAST(r.received_date AS DATE) <= '06/30/2016'
AND o.name = 'Alachua'
GROUP BY
--DATEPART(MONTH, R.received_date) ,
DATENAME( MONTH, DATEPART( MONTH, R.received_date)) ,
o.name ,
rsc.description
ORDER BY
[Month] ,
CountyName ,
Filing
when 6 months become grouped into one set - January:
What can I do to fix the query to display the first result set but with the month name instead of numeric value?
Thanks,

SUM OVER PARTITION to calculate running total

I am trying to modify my query to include a running total for each county in my report. Below is my working query with an attempt to use SUM OVER PARTITION commented out:
SELECT DATEPART(MONTH, r.received_date) AS [MonthID] ,
DATENAME(MONTH, r.received_date) AS [Month] ,
o.name AS [CountyName] ,
rsc.description AS [Filing] ,
COUNT(r.id) AS [Request_Total] ,
CAST (AVG(CAST (DATEDIFF(HOUR, received_date, completion_date) AS DECIMAL(8,2))) / 24 AS DECIMAL(8,2)) AS [Total_Time_Days]
--SUM(r.id) OVER (PARTITION BY o.name) AS [TotalFilings]
FROM dbo.requests AS [r]
INNER JOIN dbo.organizations AS [o] ON o.id = r.submitted_to_organiztion_id
INNER JOIN dbo.request_status_codes AS [rsc] ON rsc.code = r.request_status_code
WHERE r.submitted_to_organiztion_id < 68
AND r.request_type_code = 1
AND CAST(r.received_date AS DATE) >= '01/01/2016'
AND CAST(r.received_date AS DATE) <= '06/30/2016'
AND o.name = 'Alachua'
GROUP BY DATENAME(MONTH, r.received_date) ,
DATEPART(MONTH, r.received_date) ,
o.name ,
rsc.description
ORDER BY DATEPART(MONTH, r.received_date) ,
CountyName ,
Filing;
And the results look correct:
Perhaps I am misusing the SUM PARTITION BYbut my end goal is to add an additional column that will sum the filing types for each county by month.
For example, the additional column for the month of January should be 13,654 while February should be 14,238 and so on.
Could I get some advice on how to get this query working correctly? Thanks,
Not sure this is the best way or more efficient, but I was able to create a sub-query to obtain the results I wanted. I do believe a CTE or use of a Windows function would be better, but I haven't been able to get it to work. Here is my query however:
SELECT X.[MonthID] ,
X.[Month] ,
X.[CountyName] ,
X.[Filing] ,
X.[Avg_Time_Days] ,
SUM(X.Request_Total) AS [Total_Requests]
FROM ( SELECT DATEPART(MONTH, r.received_date) AS [MonthID] ,
DATENAME(MONTH, r.received_date) AS [Month] ,
o.name AS [CountyName] ,
rsc.description AS [Filing] ,
COUNT(r.id) AS [Request_Total] ,
CAST (AVG(CAST (DATEDIFF(HOUR, received_date,
completion_date) AS DECIMAL(8, 2)))
/ 24 AS DECIMAL(8, 2)) AS [Avg_Time_Days]
--, SUM(r.id) OVER (PARTITION BY o.name, rsc.description) AS [TotalFilings]
FROM dbo.requests AS [r]
INNER JOIN dbo.organizations AS [o] ON o.id = r.submitted_to_organiztion_id
INNER JOIN dbo.request_status_codes AS [rsc] ON rsc.code = r.request_status_code
WHERE r.submitted_to_organiztion_id < 68
AND r.request_type_code = 1
AND CAST(r.received_date AS DATE) >= '01/01/2016'
AND CAST(r.received_date AS DATE) <= '06/30/2016'
--AND o.name = 'Alachua'
GROUP BY DATENAME(MONTH, r.received_date) ,
DATEPART(MONTH, r.received_date) ,
o.name ,
rsc.description
--, r.id
--ORDER BY DATEPART(MONTH, r.received_date) ,
-- CountyName ,
-- Filing
) AS X
GROUP BY X.[MonthID] ,
X.[Month] ,
X.[CountyName] ,
X.[Filing] ,
X.[Avg_Time_Days]
ORDER BY X.[MonthID] ,
X.[Month] ,
X.[CountyName] ,
X.[Filing];