Maximum count of overlapping intervals in PostgreSQL - postgresql

Suppose there is a table structured as follows:
id start end
--------------------
01 00:18 00:23
02 00:22 00:31
03 00:23 00:48
04 00:23 00:39
05 00:24 00:25
06 00:24 00:31
07 00:24 00:38
08 00:25 00:37
09 00:26 00:42
10 00:31 00:34
11 00:33 00:38
The objective is to compute the overall maximum number of rows having been active (i.e. between start and end) at any given moment in time. This would be relatively straightforward using a procedural algorithm, but I'm not sure how to do this in SQL.
In the above example, this maximum value would be 8, reached at the 00:31 timestamp, where the active rows were 2, 3, 4, 6, 7, 8, 9, 10.
Obtaining the timestamp(s) and the active rows corresponding to the maximum value is not important; all that is needed is the value itself.

At first I was thinking of using generate_series() to iterate over every minute, count the active intervals for each one, and then take the max of those counts.
You can improve on your idea and check only the "start" values from the table, because the moment with the maximum number of active rows always coincides with the start of at least one interval.
select id, start,
(select count(1) from tbl t where tbl.start between t.start and t."end")
from tbl;
Here are the results:
id start count
-----------------
1 00:18:00 1
2 00:22:00 2
3 00:23:00 4
4 00:23:00 4
5 00:24:00 6
6 00:24:00 6
7 00:24:00 6
8 00:25:00 7
9 00:26:00 7
10 00:31:00 8
11 00:33:00 7
So this query gives you the maximum number of rows active at the same time:
select
max((select count(1) from tbl t where tbl.start between t.start and t."end"))
from tbl;
max
-----
8
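For comparison, here is the "straightforward procedural algorithm" the question alludes to, as a minimal Python sketch rather than SQL. It is not part of the original answer: the function names max_overlap and minutes are made up for illustration, and it assumes both endpoints count as active, like BETWEEN in the query above.

```python
def max_overlap(intervals):
    """Return the largest number of intervals active at the same moment.

    Both endpoints count as active, mirroring BETWEEN in the SQL query.
    """
    events = []
    for start, end in intervals:
        events.append((start, 0, +1))  # 0 sorts an opening before a closing at the same time
        events.append((end, 1, -1))
    events.sort()

    active = best = 0
    for _, _, delta in events:
        active += delta
        best = max(best, active)
    return best


def minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


# The sample rows from the question (start, end).
rows = [
    ("00:18", "00:23"), ("00:22", "00:31"), ("00:23", "00:48"),
    ("00:23", "00:39"), ("00:24", "00:25"), ("00:24", "00:31"),
    ("00:24", "00:38"), ("00:25", "00:37"), ("00:26", "00:42"),
    ("00:31", "00:34"), ("00:33", "00:38"),
]

result = max_overlap([(minutes(s), minutes(e)) for s, e in rows])
print(result)  # 8 for the sample data
```

Sorting the events once makes this O(n log n), whereas the correlated subquery compares every row against every other row.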

Related

PostgreSQL query with UNIQUE values returned based on condition

This is an example of a table from PostgreSQL.
I am learning SQL and can't find anything to help me get past this.
What I'm trying to achieve is:
Return the unique (DISTINCT) values of WNR where tdate >= '2020-01-13 00:00:01.757000'
WNR tdate T1 T2 T3
2 '2020-01-06 00:05:23.229000' 8 18 15
2 '2020-01-06 00:05:23.725000' 11 4 7
2 '2020-01-06 00:05:31.578000' 19 12 6
3 '2020-01-13 00:00:01.655000' 9 9 3
3 '2020-01-13 00:00:01.757000' 5 11 16
3 '2020-01-13 00:00:05.778000' 16 17 16
4 '2020-01-20 00:00:11.925000' 18 13 4
4 '2020-01-20 00:00:12.177000' 18 3 15
4 '2020-01-20 00:00:12.694000' 7 12 7
5 '2020-01-27 00:00:04.860000' 19 3 14
5 '2020-01-27 00:00:05.056000' 14 18 8
5 '2020-01-27 00:00:05.107000' 18 7 14
The expected result is 3, 4, 5.
Thank you!
To select distinct values in PostgreSQL you can use the DISTINCT clause.
From the PostgreSQL documentation: SELECT DISTINCT eliminates duplicate rows from the result. SELECT DISTINCT ON eliminates rows that match on all the specified expressions. SELECT ALL (the default) will return all candidate rows, including duplicates. (See DISTINCT Clause below.)
SELECT DISTINCT WNR
FROM table_name
WHERE tdate >='2020-01-13 00:00:01.757000';
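As a quick sanity check, the same DISTINCT filter can be run against the sample rows with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for PostgreSQL, the table name readings is hypothetical, the T1 to T3 columns are left out, and the timestamps are stored as ISO-formatted text (which compares correctly as strings).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; only the two columns the query needs.
conn.execute("CREATE TABLE readings (wnr INTEGER, tdate TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        (2, "2020-01-06 00:05:23.229000"),
        (2, "2020-01-06 00:05:23.725000"),
        (2, "2020-01-06 00:05:31.578000"),
        (3, "2020-01-13 00:00:01.655000"),
        (3, "2020-01-13 00:00:01.757000"),
        (3, "2020-01-13 00:00:05.778000"),
        (4, "2020-01-20 00:00:11.925000"),
        (4, "2020-01-20 00:00:12.177000"),
        (4, "2020-01-20 00:00:12.694000"),
        (5, "2020-01-27 00:00:04.860000"),
        (5, "2020-01-27 00:00:05.056000"),
        (5, "2020-01-27 00:00:05.107000"),
    ],
)

cutoff = "2020-01-13 00:00:01.757000"
wnrs = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT wnr FROM readings WHERE tdate >= ? ORDER BY wnr",
        (cutoff,),
    )
]
print(wnrs)  # [3, 4, 5]
```

WNR 3 is included because one of its rows has a tdate exactly equal to the cutoff, and the comparison is >=.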

PostgreSQL - How can I SUM until a certain hour of the day?

I'm trying to create a metric for a PostgreSQL integrated dashboard which would show today's "Total Payment Value" (TPV) of a certain product, as well as yesterday's TPV of the same product, up until the same moment as today, so if I'm accessing the dashboard at 5 pm, it will show what it was yesterday until 5 pm and today's TPV.
edit: My question wasn't very clear so I'm adding a few more lines and editing the query, which had a mistake.
I tried this:
select
sum(case when table.product in (13,14,15,16) then amount else 0 end) as "TPV"
,date_trunc('day', table.date) as "Day"
from table
where
date > current_date - 1
group by date_trunc('day', table.date)
order by 2,1
I only want to sum the amount when product = 13, 14, 15 or 16
An example of the product, date and amount would be like this:
product amount date
8 4750 19/03/2019 00:21
14 7840 12/04/2019 22:40
14 15000 22/03/2019 18:27
14 11715 19/03/2019 00:12
14 1054 22/03/2019 18:22
14 18491 17/03/2019 14:28
14 12253 17/03/2019 14:30
14 27600 17/03/2019 14:32
14 3936 17/03/2019 14:28
14 19007 19/03/2019 00:14
8 9400 19/03/2019 00:21
8 4750 19/03/2019 00:21
8 25000 19/03/2019 00:17
14 10346 22/03/2019 18:23
I would like to have a metric that always calculates the sum of the product value today up until the current moment - when the "product" corresponds to values 13, 14, 15 or 16 - as well as the same metric for yesterday, e.g., it's 1 PM now, I want today's TPV until 1 PM and yesterday's TPV until 1 PM as well!
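One way to pin down the requested metric is to write the two time windows out explicitly. The Python sketch below is an illustration, not a PostgreSQL query. The function name tpv_until is made up, and the sample rows are partly hypothetical: the thread's data has no two consecutive days, so rows for 21/03/2019 and one product 8 row were invented for the example.

```python
from datetime import datetime, timedelta

TPV_PRODUCTS = {13, 14, 15, 16}


def tpv_until(rows, now):
    """Return (today_tpv, yesterday_tpv), each summed from midnight up to the same time of day."""
    today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    yesterday_start = today_start - timedelta(days=1)
    yesterday_now = now - timedelta(days=1)

    today = yesterday = 0
    for product, amount, ts in rows:
        if product not in TPV_PRODUCTS:
            continue  # only products 13 to 16 count towards TPV
        if today_start <= ts <= now:
            today += amount
        elif yesterday_start <= ts <= yesterday_now:
            yesterday += amount
    return today, yesterday


rows = [
    (14, 1054, datetime(2019, 3, 22, 18, 22)),
    (14, 10346, datetime(2019, 3, 22, 18, 23)),
    (14, 15000, datetime(2019, 3, 22, 18, 27)),  # after "now", so excluded
    (8, 4750, datetime(2019, 3, 22, 10, 0)),     # hypothetical row, wrong product
    (14, 2000, datetime(2019, 3, 21, 9, 0)),     # hypothetical row
    (14, 3000, datetime(2019, 3, 21, 20, 0)),    # hypothetical row, after yesterday's cutoff
]

today_tpv, yesterday_tpv = tpv_until(rows, now=datetime(2019, 3, 22, 18, 25))
print(today_tpv, yesterday_tpv)  # 11400 2000
```

The key point is that yesterday's window ends at "now minus one day", not at the end of yesterday; in a SQL version the same two ranges would become the conditions inside the two conditional sums.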

Calculating Running Avg for YTD Sum with constant denominator for a year

I have the following table from SQL
ID Date Score
-----+-------------+----------
10 2015-01-10 5
20 2015-01-10 5
10 2015-02-10 15
40 2015-02-10 25
30 2015-02-10 5
10 2015-03-10 15
10 2014-01-10 25
20 2014-02-10 35
50 2014-03-10 45
In Tableau I want a line graph to display
(YTD Sum of Score)/Total number of IDs for a year.
For Jan 2015 - 10/4=2.5
For Feb 2015 - 55/4=13.75
For Feb 2014 - 60/3=20
The denominator should remain constant throughout the year and not change monthwise.
Looks like you can achieve your desired result with two calculated fields. First, make a [Year] field with:
year([Date])
Then make a second calculated field as follows:
sum([Score])/sum({fixed [Year] : countd([Id])})
This sums the score and divides it by the number of distinct IDs in the given year. It uses a FIXED Level of Detail (LOD) expression, so the ID count is computed per year rather than per month.
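To check the arithmetic independently of Tableau, the Python sketch below computes the running year-to-date sum per month and divides it by the number of distinct IDs in that year. It is not a Tableau calculation, and the function name ytd_average is made up. With the sample data, the 60/3 = 20 figure is reached once February 2014 is included in the running sum.

```python
from collections import defaultdict
from datetime import date

# The sample rows from the question: (ID, Date, Score).
rows = [
    (10, date(2015, 1, 10), 5),
    (20, date(2015, 1, 10), 5),
    (10, date(2015, 2, 10), 15),
    (40, date(2015, 2, 10), 25),
    (30, date(2015, 2, 10), 5),
    (10, date(2015, 3, 10), 15),
    (10, date(2014, 1, 10), 25),
    (20, date(2014, 2, 10), 35),
    (50, date(2014, 3, 10), 45),
]


def ytd_average(rows):
    """Map (year, month) to the YTD score sum divided by the year's distinct ID count."""
    ids_per_year = defaultdict(set)
    monthly = defaultdict(int)
    for id_, day, score in rows:
        ids_per_year[day.year].add(id_)
        monthly[(day.year, day.month)] += score

    running = defaultdict(int)
    result = {}
    for year, month in sorted(monthly):
        running[year] += monthly[(year, month)]  # running sum restarts for each year
        result[(year, month)] = running[year] / len(ids_per_year[year])
    return result


averages = ytd_average(rows)
print(averages[(2015, 1)], averages[(2015, 2)], averages[(2014, 2)])  # 2.5 13.75 20.0
```

The denominator comes from the whole year (4 IDs in 2015, 3 in 2014), so it stays constant from month to month, which is the behaviour the question asks for.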

Getting depth-first traversal instead of breadth first in T-SQL

I have the following T-SQL function: https://gist.github.com/cwattengard/11365802
This returns data in a breadth-first traversal. Is there a simple way to make this function return its data in a depth-first traversal? I have a treeview component that expects this (legacy system).
I already have a similar stored procedure that returns the tree in a depth-first traversal, but it's using cursors and is really slow. (6-7 seconds as opposed to this function that takes less than a second on the same data).
I think I just had a eureka moment. If I add the Path variable already supplied by the CTE, and sort by that, I get what I want. The OrgID is a unique ID. So ordering by it would make it sort by the expected output for the user (chronologically) and be depth-first for the treeview.
http://sqlanywhere.blogspot.in/2012/10/example-recursive-union-tree-traversal.html
Here's a diagram showing the primary keys for a tree-structured table:
                                   1
                                   |
         -----------------------------------------------------
         2                  93               4               5
         |                   |               |               |
------------------    --------------    ----------        ------
 6   7   8   9  10    11  12  13  14    15  16  17        18  19
     |                         |                 |         |
   -----                     -----             -----     -----
   27 26                     25 24             23 22     21 20
Here's what the breadth-first and depth-first queries should return:
Breadth-First Depth-First
1 1
2 2
93 6
4 7
5 27
6 26
7 8
8 9
9 10
10 93
11 11
12 12
13 13
14 25
15 24
16 14
17 4
18 15
19 16
27 17
26 23
25 22
24 5
23 18
22 21
21 20
20 19
If you order the output by tekst will that do it?
First populate a table variable @unsorted inside the function; then finally return Select * from @unsorted order by tekst?
I know this is very late to the game, but it seems like you could use hierarchyid to get a nice depth-first search.
The GitHub file referenced in the OP appears to have gone missing, but the basic formula is:
Put hierarchyid::GetRoot() as Foo in your CTE anchor query
Put cast (cte.Foo.ToString() + cast(row_number() over(order by ) as varchar) + '/' as hierarchyid) as Foo in your recursive query
Order by Foo when you invoke the CTE
and the results come out depth-first.
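Both the "sort by Path" idea and the hierarchyid suggestion rely on the same trick: give each row a path made of sibling positions from the root, then sort by that path. The Python sketch below illustrates this on the tree implied by the breadth-first/depth-first listing above. It is not T-SQL, and the function name breadth_first_with_paths is made up for the example.

```python
from collections import deque

# Parent -> ordered children, read off the breadth-first/depth-first listing.
children = {
    1: [2, 93, 4, 5],
    2: [6, 7, 8, 9, 10],
    93: [11, 12, 13, 14],
    4: [15, 16, 17],
    5: [18, 19],
    7: [27, 26],
    13: [25, 24],
    17: [23, 22],
    18: [21, 20],
}


def breadth_first_with_paths(children, root):
    """Walk the tree level by level, tagging each node with its sibling-position path."""
    rows = []
    queue = deque([(root, ())])
    while queue:
        node, path = queue.popleft()
        rows.append((node, path))
        for position, child in enumerate(children.get(node, []), start=1):
            queue.append((child, path + (position,)))
    return rows


rows = breadth_first_with_paths(children, 1)
breadth_first = [node for node, _ in rows]
# Tuples sort lexicographically and a prefix sorts first, so parents precede their subtrees.
depth_first = [node for node, _ in sorted(rows, key=lambda row: row[1])]
print(depth_first)
```

Using sibling positions instead of the raw IDs in the path keeps the original sibling order (for example 27 before 26), which is what row_number() does in the hierarchyid recipe.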

While loop to add data for pivot

Currently I have a requirement which needs a table to look like this:
Instrument Long Short 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 ....
Fixed 41 41 35 35 35 35 35 35 35 53 25 25
Index 16 16 22 22 22 32 12 12 12 12 12 12
Credits 29 29 41 16 16 16 16 16 16 16 16 16
Short term 12 12 5 5 5 5 5 5 5 5 5 17
My worktable looks like the following:
Instrument Long Short Annual Coupon Maturity Date Instrument ID
Fixed 10 10 10 01/01/2025 1
Index 5 5 10 10/05/2016 2
Credits 15 15 16 25/06/2020 3
Short term 12 12 5 31/10/2022 4
Fixed 13 13 15 31/03/2030 5
Fixed 18 18 10 31/01/2019 6
Credits 14 14 11 31/12/2013 7
Index 11 11 12 31/10/2040 8
..... etc
So basically the long and the short in the pivot should be the sum over each distinct instrument ID. Then, for each year, I need to take the sum of each Annual Coupon up to the maturity date year, where the long and the coupon rate are added together.
My thinking was that I would have to create a while loop that populates a table with one record per year for each instrument until the maturity date, so that I could then pivot it somehow using a SQL pivot. Does this seem feasible? Are there any other ideas on the best way of doing this? In particular, I might need help with the while loop.
The following solution uses a numbers table to unfold ranges in your table, performs some special processing on some of the data columns in the unfolded set, and finally pivots the results:
WITH unfolded AS (
  SELECT
    t.Instrument,
    Long      = SUM(z.Long ) OVER (PARTITION BY Instrument),
    Short     = SUM(z.Short) OVER (PARTITION BY Instrument),
    Year      = y.Number,
    YearValue = t.AnnualCoupon + z.Long + z.Short
  FROM YourTable t
  CROSS APPLY (SELECT YEAR(t.MaturityDate)) x (Year)
  INNER JOIN numbers y ON y.Number BETWEEN YEAR(GETDATE()) AND x.Year
  CROSS APPLY (
    SELECT
      Long  = CASE y.Number WHEN x.Year THEN t.Long  ELSE 0 END,
      Short = CASE y.Number WHEN x.Year THEN t.Short ELSE 0 END
  ) z (Long, Short)
),
pivoted AS (
  SELECT *
  FROM unfolded
  PIVOT (
    SUM(YearValue) FOR Year IN ([2013], [2014], [2015], [2016], [2017], [2018], [2019], [2020],
      [2021], [2022], [2023], [2024], [2025], [2026], [2027], [2028], [2029], [2030],
      [2031], [2032], [2033], [2034], [2035], [2036], [2037], [2038], [2039], [2040])
  ) p
)
SELECT *
FROM pivoted
;
It returns results for a static range of years. To use it for a dynamically calculated year range, you'll first need to prepare the list of years as a CSV string, something like this:
DECLARE @columnlist nvarchar(max);
SET @columnlist = STUFF(
  (
    SELECT ', [' + CAST(Number AS varchar(10)) + ']'
    FROM numbers
    WHERE Number BETWEEN YEAR(GETDATE())
                     AND (SELECT YEAR(MAX(MaturityDate)) FROM YourTable)
    ORDER BY Number
    FOR XML PATH ('')
  ),
  1, 2, ''
);
then put it into the dynamic SQL version of the query:
DECLARE @sql nvarchar(max);
SET @sql = N'
WITH unfolded AS (
...
PIVOT (
SUM(YearValue) FOR Year IN (' + @columnlist + ')
) p
)
SELECT *
FROM pivoted;
';
and execute the result:
EXECUTE(@sql);
You can try this solution at SQL Fiddle.
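The unfold-then-pivot logic of the query above can also be sketched outside SQL, which may help when checking individual cells by hand. The Python version below is an illustration only. The function name unfold_and_pivot and the first_year parameter (standing in for YEAR(GETDATE())) are made up, and it assumes the dd/mm/yyyy date format shown in the worktable.

```python
from collections import defaultdict
from datetime import datetime

# The visible worktable rows: (Instrument, Long, Short, Annual Coupon, Maturity Date).
worktable = [
    ("Fixed", 10, 10, 10, "01/01/2025"),
    ("Index", 5, 5, 10, "10/05/2016"),
    ("Credits", 15, 15, 16, "25/06/2020"),
    ("Short term", 12, 12, 5, "31/10/2022"),
    ("Fixed", 13, 13, 15, "31/03/2030"),
    ("Fixed", 18, 18, 10, "31/01/2019"),
    ("Credits", 14, 14, 11, "31/12/2013"),
    ("Index", 11, 11, 12, "31/10/2040"),
]


def unfold_and_pivot(rows, first_year):
    """Unfold each row into one value per year up to maturity, then sum per instrument and year."""
    totals = defaultdict(lambda: {"Long": 0, "Short": 0})
    by_year = defaultdict(lambda: defaultdict(int))
    for instrument, long_, short, coupon, maturity in rows:
        maturity_year = datetime.strptime(maturity, "%d/%m/%Y").year
        for year in range(first_year, maturity_year + 1):
            value = coupon
            if year == maturity_year:
                # Long and Short are only added in the maturity year, as in the query.
                value += long_ + short
                totals[instrument]["Long"] += long_
                totals[instrument]["Short"] += short
            by_year[instrument][year] += value
    return totals, by_year


totals, by_year = unfold_and_pivot(worktable, first_year=2013)
print(totals["Fixed"], by_year["Credits"][2013], by_year["Credits"][2014])
```

With these eight rows the Long and Short totals come out as 41, 16, 29 and 12, matching the Long and Short columns of the desired output in the question; the per-year cells depend on how the coupon rule is interpreted, so they are not claimed to match.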