I'm trying to find the longest sequence of consecutive days per customer in my data, i.e. the longest run of consecutive days on which a specific customer placed orders. If someone used my app on 25/08/16, 26/08/16, 27/08/16, 01/09/16 and 02/09/16, the max sequence is 3 days (25, 26, 27).
In the output I want to get two fields: custid | MaxDaySequence
I have the following fields in my data table: custid | orderdate (timestamp)
For example:
custid orderdate
1 25/08/2007
1 03/10/2007
1 13/10/2007
1 15/01/2008
1 16/03/2008
1 09/04/2008
2 18/09/2006
2 08/08/2007
2 28/11/2007
2 04/03/2008
3 27/11/2006
3 15/04/2007
3 13/05/2007
3 19/06/2007
3 22/09/2007
3 25/09/2007
3 28/01/2008
I'm using PostgreSQL 2014.
Thanks
Trying:
select custid, max(num_days) as longest
from (
select custid, grp, count(*) as num_days
from (
-- consecutive days share the same value of (day - row number)
select custid,
order_day - cast(row_number() over (partition by custid order by order_day) as int) as grp
from (select distinct custid, date(orderdate) as order_day from table_) d
) x group by custid, grp
) y group by custid
Try:
SELECT custid, max( abc ) as max_sequence_of_days
FROM (
SELECT custid, yy, count(*) abc
FROM (
SELECT * ,
SUM( xx ) OVER (partition by custid order by orderdate ) yy
FROM (
select * ,
CASE WHEN
orderdate - lag( orderdate ) over (partition by custid order by orderdate )
<= 1
THEN 0 ELSE 1 END xx
from mytable
) x
) z
GROUP BY custid, yy
) q
GROUP BY custid
Demo: http://sqlfiddle.com/#!15/00422/11
===== EDIT ===========
Got "operator does not exist: interval <= integer"
This means that the orderdate column is of type timestamp, not date.
In this case you need to use the condition <= interval '1' day instead of <= 1:
See https://www.postgresql.org/docs/9.0/static/functions-datetime.html to learn more about date arithmetic in PostgreSQL.
Please see this demo:
http://sqlfiddle.com/#!15/7c2200/2
SELECT custid, max( abc ) as max_sequence_of_days
FROM (
SELECT custid, yy, count(*) abc
FROM (
SELECT * ,
SUM( xx ) OVER (partition by custid order by orderdate ) yy
FROM (
select * ,
CASE WHEN
orderdate - lag( orderdate ) over (partition by custid order by orderdate )
<= interval '1' day
THEN 0 ELSE 1 END xx
from mytable
) x
) z
GROUP BY custid, yy
) q
GROUP BY custid
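For readers who want to check the logic outside the database, here is a minimal Python sketch of the same "start a new group whenever the gap is more than one day" idea. The function name longest_daily_streak is made up for illustration. One assumption differs from the queries above: they count rows, so two orders on the same day are both counted, while this sketch removes duplicate days first, which is my reading of "sequence of days".

```python
from datetime import date, timedelta

def longest_daily_streak(days):
    """Return the length of the longest run of consecutive calendar days."""
    unique_days = sorted(set(days))  # assumption: several orders on one day count once
    best = current = 0
    previous = None
    for day in unique_days:
        if previous is not None and day - previous == timedelta(days=1):
            current += 1   # no gap: the current run continues
        else:
            current = 1    # gap (or first row): a new run starts here
        best = max(best, current)
        previous = day
    return best

# The example from the question: 25, 26, 27 August, then 1 and 2 September 2016.
visits = [date(2016, 8, 25), date(2016, 8, 26), date(2016, 8, 27),
          date(2016, 9, 1), date(2016, 9, 2)]
print(longest_daily_streak(visits))  # prints 3
```

With a timestamp column you would truncate each value to its date before calling the function.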
Related
I have a pricing table that contains the pricing data for products. There are around 600 unique product_id, each currently having 4 days worth of pricing data, which will eventually go up to 30 days. The table below is a small subset of the data to represent that table structure:
date | product_id | price_trend
2022-08-21 | 1 | 0.08
2022-08-22 | 1 | 0.18
2022-08-23 | 1 | 0.30
2022-08-21 | 2 | 0.15
2022-08-22 | 2 | 0.20
2022-08-23 | 2 | 0.22
So in my script, for each product_id I am trying to get yesterday's price_trend and today's price_trend and then calculate the price_delta between the two. I then order by price_delta and limit the results to 5.
I am having some issues because in some cases yesterday's price_trend is 0 and today's price_trend is 0.50, for example. This does not mean that the price trend has increased, but most likely that price_trend was not gathered yesterday for whatever reason.
Now I would like to remove any records where price_trend for today or yesterday equals 0. However, when I add AND pricing.trend_price > 0, the value returned is just null instead.
Script:
SELECT
magic_sets_cards.name,
(SELECT pricing.trend_price
FROM pricing
WHERE pricing.product_id = magic_sets_cards_identifiers.mcm_id
AND pricing.date = (SELECT MAX(date) - INTERVAL '2 DAY' FROM pricing)
AND pricing.trend_price > 0) AS price_yesterday,
(SELECT pricing.trend_price
FROM pricing
WHERE pricing.product_id = magic_sets_cards_identifiers.mcm_id
AND pricing.date = (SELECT MAX(date) FROM pricing)
AND pricing.trend_price > 0) AS price_today,
((SELECT pricing.trend_price
FROM pricing
WHERE pricing.product_id = magic_sets_cards_identifiers.mcm_id
AND pricing.date = (SELECT MAX(date) FROM pricing)) -
(SELECT pricing.trend_price
FROM pricing
WHERE pricing.product_id = magic_sets_cards_identifiers.mcm_id
AND pricing.date = (SELECT MAX(date) - INTERVAL '2 DAY' FROM pricing))) AS price_delta
FROM magic_sets
JOIN magic_sets_cards ON magic_sets_cards.set_id = magic_sets.id
JOIN magic_sets_cards_identifiers ON magic_sets_cards_identifiers.card_id = magic_sets_cards.id
JOIN pricing ON pricing.product_id = magic_sets_cards_identifiers.mcm_id
WHERE magic_sets.code = '2X2'
AND pricing.date = (SELECT MAX(date) FROM pricing)
ORDER BY price_delta DESC
LIMIT 5
Results:
name | price_yesterday | price_today | price_delta
"Fiery Justice" | null | 0.50 | 0.50
"Hostage Taker" | 3.50 | 4.00 | 0.50
"Damnation" | 17.02 | 17.33 | 0.31
"Bring to Light" | 0.42 | 0.72 | 0.30
"City of Brass" | 17.41 | 17.68 | 0.27
I would like to get it so that the "Fiery Justice" in this example is just ignored.
You can get this output by using rank():
Query without null rows :
with cte as (Select
product_id,
SUM(CASE WHEN rank = 1 THEN price_trend ELSE null END) today,
SUM(CASE WHEN rank = 2 THEN price_trend ELSE null END) yesterday,
SUM(CASE WHEN rank = 1 THEN price_trend ELSE 0 END) -
SUM(CASE WHEN rank = 2 THEN price_trend ELSE 0 END) as diff
FROM (
SELECT
product_id,
price_trend,
date,
rank() OVER (PARTITION BY product_id ORDER BY date DESC) as rank
FROM tableName where price_trend>0 and date between current_date-5 and current_date-4) p
WHERE rank in (1,2)
GROUP BY product_id
) select * from cte where (case when today is null or yesterday is null then 'NULL' else 'VALID' end)!='NULL'
Query with null values :
Select
product_id,
SUM(CASE WHEN rank = 1 THEN price_trend ELSE 0 END) today,
SUM(CASE WHEN rank = 2 THEN price_trend ELSE 0 END) yesterday,
SUM(CASE WHEN rank = 1 THEN price_trend ELSE 0 END) -
SUM(CASE WHEN rank = 2 THEN price_trend ELSE 0 END) as diff
FROM (
SELECT
product_id,
price_trend,
date,
rank() OVER (PARTITION BY product_id ORDER BY date DESC) as rank
FROM tableName where date between current_date-5 and current_date-4) p
WHERE rank in (1,2)
GROUP BY product_id
Change the condition :
where date between current_date-3 and current_date-2
OUTPUT :
product_id today yesterday diff
1 0.06 0.02 0.04
2 0.64 0.62 0.02
CREATE TABLE tableName
(
date date,
product_id int,
price_trend numeric(9,2)
);
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-21 ', '1 ', '0.02');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-22 ', '1 ', '0.06');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-23 ', '1 ', '0.10');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-24 ', '1 ', '0.13');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-25 ', '1 ', '0.18');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-26 ', '1 ', '0.30');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-21 ', '2 ', '0.62');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-22 ', '2 ', '0.64');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-23 ', '2 ', '0.69');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-24 ', '2 ', '0.78');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-25 ', '2 ', '0.88');
INSERT INTO tableName (date ,product_id ,price_trend) VALUES ('2022-08-26 ', '2 ', '0.90');
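The filtering rule itself (keep a product only when both days have a price above 0, then sort by the difference) can be sketched in a few lines of Python. This is only an illustration of the logic, not a replacement for the query; the function name top_price_deltas and product 3 with its missing reading are made up for the example.

```python
def top_price_deltas(prices, today, yesterday, limit=5):
    """prices maps (product_id, date_string) -> price_trend."""
    deltas = []
    products = {pid for pid, _ in prices}
    for pid in sorted(products):
        p_today = prices.get((pid, today), 0)
        p_yesterday = prices.get((pid, yesterday), 0)
        # Skip products where either reading is missing or 0.
        if p_today > 0 and p_yesterday > 0:
            deltas.append((pid, round(p_today - p_yesterday, 2)))
    deltas.sort(key=lambda item: item[1], reverse=True)  # largest increase first
    return deltas[:limit]

prices = {
    (1, "2022-08-21"): 0.02, (1, "2022-08-22"): 0.06,
    (2, "2022-08-21"): 0.62, (2, "2022-08-22"): 0.64,
    (3, "2022-08-21"): 0.00, (3, "2022-08-22"): 0.50,  # hypothetical: yesterday not gathered
}
print(top_price_deltas(prices, "2022-08-22", "2022-08-21"))
# prints [(1, 0.04), (2, 0.02)]
```

Product 3 is dropped instead of showing up with a misleading delta of 0.50.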
Let's say I have a table:
DATE | ID | VALUE
01.2010 | 1 | 100
02.2010 | 1 | 200
... | ... | ...
12.2010 | 1 | 300
01.2011 | 1 | 150
02.2011 | 1 | 250
... | ... | ...
12.2011 | 1 | 350
01.2012 | 1 | 200
02.2012 | 1 | 300
... | ... | ...
12.2012 | 1 | 400
I want to get a median of VALUE grouped by months i.e. get something like
DATE | ID | VALUE | MEDIAN
01.2010 | 1 | 100 | 100
02.2010 | 1 | 200 | 200
... | ... | ... | ...
12.2010 | 1 | 300 | 300
01.2011 | 1 | 150 | 125 = (100+150)/2
02.2011 | 1 | 250 | 225 = (200+250)/2
... | ... | ... | ...
12.2011 | 1 | 350 | 325 = (300+350)/2
01.2012 | 1 | 200 | 150
02.2012 | 1 | 300 | 250
... | ... | ... | ...
12.2012 | 1 | 400 | 350
I have more ID in table so I would like to get this result for every ID.
I have tried doing
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY VALUE) OVER (PARTITION BY Id, MONTH(Date) ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
but I get "The function 'PERCENTILE_CONT' may not have a window frame."
I've also tried the following (but also without any results):
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY VALUE)
OVER (PARTITION BY Id, MONTH(Date))
FROM tab1 LEFT JOIN tab2
ON tab1.key = tab2.key
WHERE tab1.Date BETWEEN Min(Date) AND tab2.Date
EDIT
So far I have resolved it with
SELECT (CASE WHEN Date =2010 THEN PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY CASE WHEN Date = 2010 THEN VALUE ELSE NULL END) OVER (PARTITION BY Id, MONTH(Date)) ELSE 0 END) +
(CASE WHEN Date =2011 THEN PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY CASE WHEN Date <= 2011 THEN VALUE ELSE NULL END) OVER (PARTITION BY Id, MONTH(Date)) ELSE 0 END) +
(CASE WHEN Date =2012 THEN PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY CASE WHEN Date <= 2012 THEN VALUE ELSE NULL END) OVER (PARTITION BY Id, MONTH(Date)) ELSE 0 END)
FROM tab1
But to be honest, I would like a solution that does not assume a priori knowledge of the dates. I've thought about a WHILE loop that updates a column while @MinYear <= @MaxYear, with @MinYear = @MinYear + 1 in every iteration, but in that case I would have to create temporary tables, which I'm trying to avoid.
My idea is to use (Value1 + Value2)/2 as the median, since your requirement is a little complicated.
CREATE TABLE MedianData
(
[Date] VARCHAR(100)
,ID INT
,[Value] INT
)
INSERT INTO MedianData VALUES ('01.2010', 1, 100)
,('02.2010', 1, 200)
,('12.2010', 1, 300)
,('01.2011', 1, 150)
,('02.2011', 1, 250)
,('12.2011', 1, 350)
,('01.2012', 1, 200)
,('02.2012', 1, 300)
,('12.2012', 1, 400)
SELECT *
,ROW_NUMBER() OVER ( PARTITION BY Substring([Date],1,2 ) ORDER BY [Date] ) AS [row]
,Substring([Date],1,2 ) as [MONTH]
INTO #Temp_tbl2
FROM MedianData
SELECT
A.Date
,A.ID
,A.[Value]
--Logic is applied here. I used (Value1+value2)/2 as median
,CASE WHEN A.[row] = 3 THEN ( A.[Value] + ( SELECT T.[Value] FROM #Temp_tbl2
T where T.[MONTH] = Substring(A.[Date],1,2 ) AND T.[row] = 1 ) )/2
WHEN A.[row] != 1 THEN (A.total/2)
ELSE A.total END as [Median]
INTO #Temp_table
FROM
(
SELECT *
,ROW_NUMBER() OVER ( PARTITION BY Substring([Date],1,2 ) ORDER BY [Date] ) AS [row]
,SUM ([Value] ) OVER ( PARTITION BY Substring([Date],1,2 ) ORDER BY [Date] ) AS [total]
FROM MedianData
) AS A
--to make the table data order
SELECT MedianData.*, #Temp_table.Median
FROM MedianData
INNER JOIN #Temp_table
ON MedianData.[Date] = #Temp_table.[Date]
drop table #Temp_table
drop table #Temp_tbl2
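As a cross-check for whatever SQL you end up with, the requested result (for each ID and calendar month, the median of all values up to and including the current year) can be sketched in Python. This is a reference for the expected numbers only, under the assumption that the date has been split into year and month; running_monthly_median is a made-up name.

```python
from collections import defaultdict
from statistics import median

def running_monthly_median(rows):
    """rows: (year, month, id, value) tuples; returns them with the running median appended."""
    seen = defaultdict(list)  # (id, month) -> values seen so far, oldest year first
    result = []
    for year, month, id_, value in sorted(rows):  # sorted() puts the rows in chronological order
        seen[(id_, month)].append(value)
        result.append((year, month, id_, value, median(seen[(id_, month)])))
    return result

rows = [
    (2010, 1, 1, 100), (2010, 2, 1, 200), (2010, 12, 1, 300),
    (2011, 1, 1, 150), (2011, 2, 1, 250), (2011, 12, 1, 350),
    (2012, 1, 1, 200), (2012, 2, 1, 300), (2012, 12, 1, 400),
]
for row in running_monthly_median(rows):
    print(row)
```

For the sample data the last column comes out as 100, 200, 300, 125, 225, 325, 150, 250, 350, which matches the table in the question.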
DB-Fiddle
CREATE TABLE sales (
id SERIAL PRIMARY KEY,
country VARCHAR(255),
sales_date DATE,
sales_volume DECIMAL,
fix_costs DECIMAL
);
INSERT INTO sales
(country, sales_date, sales_volume, fix_costs
)
VALUES
('DE', '2020-01-03', '500', '2000'),
('FR', '2020-01-03', '350', '2000'),
('None', '2020-01-31', '0', '2000'),
('DE', '2020-02-15', '0', '5000'),
('FR', '2020-02-15', '0', '5000'),
('None', '2020-02-29', '0', '5000'),
('DE', '2020-03-27', '180', '4000'),
('FR', '2020-03-27', '970', '4000'),
('None', '2020-03-31', '0', '4000');
Expected Result:
sales_date | country | sales_volume | fix_costs
-------------|--------------|------------------|------------------------------------------
2020-01-03 | DE | 500 | 37.95 (= 2000/31 = 64.5 x 0.59)
2020-01-03 | FR | 350 | 26.57 (= 2000/31 = 64.5 x 0.41)
-------------|--------------|------------------|------------------------------------------
2020-02-15 | DE | 0 | 86.21 (= 5000/29 = 172.4 x 0.50)
2020-02-15 | FR | 0 | 86.21 (= 5000/29 = 172.4 x 0.50)
-------------|--------------|------------------|------------------------------------------
2020-03-27 | DE | 180 | 20.20 (= 4000/31 = 129.0 x 0.16)
2020-03-27 | FR | 970 | 108.84 (= 4000/31 = 129.0 x 0.84)
-------------|--------------|------------------|-------------------------------------------
The column fix_costs in the expected result is calculated as the following:
Step 1) Get the daily rate of the fix_costs per month.(2000/31 = 64.5; 5000/29 = 172.4; 4000/31 = 129.0)
Step 2) Split the daily value to the countries DE and FR based on their share in the sales_volume. (500/850 = 0.59; 350/850 = 0.41; 180/1150 = 0.16; 970/1150 = 0.84)
Step 3) In case the sales_volume is 0 the daily rate gets split 50/50 to DE and FR as you can see for 2020-02-15.
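To make the three steps concrete, here is a small Python sketch of the calculation. It is only meant to pin down the expected numbers; the function name imputed_fix_costs is made up, and the 'None' rows are assumed to be filtered out beforehand.

```python
import calendar
from datetime import date

def imputed_fix_costs(rows):
    """rows: (sales_date, country, sales_volume, fix_costs) tuples without the 'None' rows."""
    totals, counts = {}, {}
    for d, _, volume, _ in rows:
        totals[d] = totals.get(d, 0) + volume   # total sales_volume per date
        counts[d] = counts.get(d, 0) + 1        # number of countries per date
    result = []
    for d, country, volume, fix_costs in rows:
        days_in_month = calendar.monthrange(d.year, d.month)[1]
        daily_rate = fix_costs / days_in_month  # step 1
        if totals[d] > 0:
            share = volume / totals[d]          # step 2: share of the sales volume
        else:
            share = 1 / counts[d]               # step 3: even split when nothing was sold
        result.append((d, country, round(daily_rate * share, 2)))
    return result

rows = [
    (date(2020, 1, 3), "DE", 500, 2000), (date(2020, 1, 3), "FR", 350, 2000),
    (date(2020, 2, 15), "DE", 0, 5000), (date(2020, 2, 15), "FR", 0, 5000),
    (date(2020, 3, 27), "DE", 180, 4000), (date(2020, 3, 27), "FR", 970, 4000),
]
for row in imputed_fix_costs(rows):
    print(row)
```

This reproduces 37.95, 26.57, 86.21, 86.21, 20.2 and 108.84 from the expected result.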
In MariaDB I was able to do this with the query below:
SELECT
s.sales_date,
s.country,
s.sales_volume,
(CASE WHEN SUM(sales_volume) OVER (PARTITION BY sales_date) > 0
THEN ((s.fix_costs/ DAY(LAST_DAY(sales_date))) *
sales_volume / NULLIF(SUM(sales_volume) OVER (PARTITION BY sales_date), 0)
)
ELSE (s.fix_costs / DAY(LAST_DAY(sales_date))) * 1 / SUM(country <> 'None') OVER (PARTITION by sales_date)
END) AS imputed_fix_costs
FROM sales s
WHERE country <> 'None'
GROUP BY 1,2,3
ORDER BY 1;
However, in PostgreSQL I get an error on DAY(LAST_DAY(sales_date)).
I tried to replace this part with (date_part('DAY', ((date_trunc('MONTH', s.sales_date) + INTERVAL '1 MONTH - 1 DAY')::date)))
However, this is causing another error.
How do I need to modify the query to get the expected result?
The PostgreSQL equivalent of DAY(LAST_DAY(sales_date)) would be:
extract(day from (date_trunc('month', sales_date + interval '1 month') - interval '1 day'))
The expression SUM(country <> 'None') also needs to be fixed as
SUM(case when country <> 'None' then 1 else 0 end)
It might be a good idea to define this compatibility function:
create function last_day(d date) returns date as
$$
select (date_trunc('month', d + interval '1 month') - interval '1 day')::date;
$$ language sql immutable;
Then the first expression becomes simply
extract(day from last_day(sales_date))
I would create a function to return the last day (number) for a given date - which is actually the "length" of the month.
create function month_length(p_input date)
returns integer
as
$$
select extract(day from (date_trunc('month', p_input) + interval '1 month - 1 day'))::int;
$$
language sql
immutable;
Then the query can be written as:
select sales_date, country,
sum(sales_volume),
sum(fix_costs_per_day * cost_factor)
from (
select id, country, sales_date, sales_volume, fix_costs,
fix_costs / month_length(sales_date) as fix_costs_per_day,
case
when sum(sales_volume) over (partition by sales_date) > 0
then sales_volume::numeric / sum(sales_volume) over (partition by sales_date)
-- no sales on that date: split the daily rate evenly between the countries
else 1.0 / count(*) over (partition by sales_date)
end as cost_factor
from sales
where country <> 'None'
) t
group by sales_date, country
order by sales_date, country
I am working with a query in PSQL and I am trying to use a window function to divide two other window function counts. This is what I currently have:
WITH month_cte as ( Select generate_series(date_trunc('month', current_date) - interval '12' month, date_trunc('month', current_date), interval '1' month) as month_year
)
select DISTINCT ON (month_year, q.rep_name) month_cte.*, q.*
FROM month_CTE
LEFT JOIN (
select *,
CASE
WHEN date_quoted IS NOT NULL THEN COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', date_quoted))
ELSE NULL
END as month_quotes,
CASE WHEN edocs_signed_date IS NOT NULL THEN COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', edocs_signed_date))
ELSE NULL
END as month_sales,
CASE
WHEN date_quoted IS NOT NULL And edocs_signed_date IS NOT NULL THEN CAST(COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', edocs_signed_date))/
COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', date_quoted))* 100.0 AS numeric)
ELSE NULL
END as month_closing
FROM quote_report_view
ORDER BY rep_name, edocs_signed_date, date_quoted
) q
ON (date_trunc('month', q.date_quoted) = month_cte.month_year OR date_trunc('month', q.edocs_signed_date) = month_cte.month_year)
ORDER BY month_year, rep_name, month_quotes, month_sales
The line that I am trying to get to work is the 3rd Case:
CASE WHEN date_quoted IS NOT NULL And edocs_signed_date IS NOT NULL THEN CAST(COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', edocs_signed_date))/
COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', date_quoted))* 100.0 AS numeric)
ELSE NULL
END as month_closing
I am basically trying to divide the 2nd count window function by the 1st count window function and get a percentage for month_closing.
These are my current results:
"2020-08-01 00:00:00-04" 869272 "2020-08-04 00:00:00" "2020-08-04 00:00:00" "Jesus" 1 1 100.0
"2020-08-01 00:00:00-04" 875518 "2020-08-19 00:00:00" "2020-09-01 00:00:00" "Jim" 36 1 0.0
"2020-08-01 00:00:00-04" 876462 "2020-08-04 00:00:00" "2020-08-04 00:00:00" "Nick" 39 12 0.0
"2020-08-01 00:00:00-04" 873572 "2020-08-04 00:00:00" "2020-08-04 00:00:00" "Piero" 63 36 0.0
I am only getting either 0.00 or 1.00 in my last column where I am trying to calculate the closing percentage. How can I make this work to get a true percentage?
Thanks!
What is happening is that the database engine first evaluates the division integer / integer (with your data, for example, 1 / 36, which is 0 as an integer) and only then the multiplication 0 * 100.0 (integer * numeric, whose result type is numeric, but the value is still 0.00).
So either cast the first count(*) to numeric,
or multiply it by 1.0,
or, if you are calculating a percentage, multiply the first count(*) by 100.0 before dividing, like so:
WITH month_cte as ( Select generate_series(date_trunc('month', current_date) - interval '12' month, date_trunc('month', current_date), interval '1' month) as month_year
)
select DISTINCT ON (month_year, q.rep_name) month_cte.*, q.*
FROM month_CTE
LEFT JOIN (
select *,
CASE
WHEN date_quoted IS NOT NULL
THEN COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', date_quoted))
ELSE NULL
END as month_quotes,
CASE WHEN edocs_signed_date IS NOT NULL
THEN COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', edocs_signed_date))
ELSE NULL
END as month_sales,
CASE WHEN date_quoted IS NOT NULL And edocs_signed_date IS NOT NULL
THEN CAST(COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', edocs_signed_date)) * 100.0
/ COUNT(*) OVER (PARTITION BY rep_name, date_trunc('month', date_quoted)) AS numeric)
ELSE NULL
END as month_closing
FROM quote_report_view
ORDER BY rep_name, edocs_signed_date, date_quoted
) q
ON (date_trunc('month', q.date_quoted) = month_cte.month_year OR date_trunc('month', q.edocs_signed_date) = month_cte.month_year)
ORDER BY month_year, rep_name, month_quotes, month_sales
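The effect of the evaluation order is easy to reproduce outside the database. The Python sketch below uses // to stand in for SQL integer division (for non-negative counts both truncate the same way); the numbers are Jim's row from the question, 1 sale and 36 quotes.

```python
sales, quotes = 1, 36

# Division first: the integer division already yields 0, so the result stays 0.
truncated = sales // quotes * 100.0

# Multiplication first: the value is a decimal before the division happens.
percentage = sales * 100.0 / quotes

print(truncated)             # prints 0.0
print(round(percentage, 2))  # prints 2.78
```

The same reordering is what the rewritten month_closing expression does.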
I am trying to find the total per year.
For example:
Item | Start date | End date | Total value
1 | 07/01/14 | 01/01/15 | $10,000
2 | 08/01/13 | 12/01/14 | $10,000
3 | 03/01/13 | 05/01/15 | $10,000
As you can see, some items span multiple years. Is there a way to find out what the total value is per year?
Solution should be:
item 3
2013 - $3600
2014 - $4800
2015 - $1600
Then a summation would be done for all three items to give a yearly total.
What I have so far:
I have a rolling summation code which is shown below.
case when
(
[begin date] >= dateadd(mm,0,DATEADD(mm,DATEDIFF(mm,0,getdate()),0))
and [end date] >= dateadd(mm,0,DATEADD(mm,DATEDIFF(mm,0,getdate()),0))
)
OR
(
[Begin Date] < dateadd(mm,0,DATEADD(mm,DATEDIFF(mm,0,getdate()),0))
and [End Date] >= dateadd(mm,0,DATEADD(mm,DATEDIFF(mm,0,getdate()),0))
)
then [Totalvalue]/nullif(DATEDIFF(mm,[begin date],[end date]),0)
else 0
end [Current Month]
I don't know how you got those totals for item 3,
but for item 3 I think it should be
2013 = 3704
2014 = 4444
2015 = 1852
I don't know how efficient this code is, but give it a try:
CREATE TABLE #tblName
(
itemid INT,
startdate DATETIME,
endate DATETIME,
value int
)
INSERT INTO #tblName
VALUES (1,'2014/07/01','2015/01/01',10000),
(2,'2013/08/01','2014/12/01',10000),
(3,'2013/03/01','2015/05/01',10000)
DECLARE @mindate DATETIME,
@maxdate DATETIME
SELECT @mindate = Min(startdate),
@maxdate = Max(endate)
FROM #tblName
SELECT *
FROM #tblName;
WITH cte
AS (SELECT @mindate startdate
UNION ALL
SELECT Dateadd(mm, 1, startdate) startdate
FROM cte
WHERE startdate <= Dateadd(mm, -1, @maxdate))
SELECT a.value * ( ( convert(numeric(22,6),a.cnt) / convert(numeric(22,6),c.total) ) * 100 ) / 100,a.itemid,a.startdate
FROM (SELECT Avg(value) value,
Count(1) cnt,
itemid,
Year(a.startdate) startdate
FROM cte a
JOIN #tblName b
ON a.startdate BETWEEN b.startdate AND b.endate
GROUP BY itemid,
Year(a.startdate)) a
JOIN(SELECT Sum(cnt) total,
itemid
FROM (SELECT Avg(value) value,
Count(1) cnt,
itemid,
Year(a.startdate) startdate
FROM cte a
JOIN #tblName b
ON a.startdate BETWEEN b.startdate AND b.endate
GROUP BY itemid,
Year(a.startdate)) B
GROUP BY itemid) C
ON a.itemid = c.itemid
WHERE a.itemid = 3
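To show where 3704 / 4444 / 1852 come from: the query counts every month from the start month to the end month, both included (27 months for item 3), and spreads the value evenly over them. Below is a minimal Python sketch of that same rule; value_per_year is a made-up name, and the inclusive month counting is the assumption that makes the numbers differ from the ones in the question.

```python
from collections import Counter
from datetime import date

def value_per_year(start, end, total):
    """Spread total evenly over every month from start to end (both months included)."""
    months = Counter()  # year -> number of months of the item that fall in that year
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months[year] += 1
        month += 1
        if month > 12:
            year, month = year + 1, 1
    n = sum(months.values())
    return {y: round(total * count / n) for y, count in months.items()}

# Item 3 from the question: 03/01/13 to 05/01/15, total value 10,000.
print(value_per_year(date(2013, 3, 1), date(2015, 5, 1), 10000))
# prints {2013: 3704, 2014: 4444, 2015: 1852}
```

Calling it for each item and adding up the results per year gives the yearly totals.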