I am calculating counts for the last 12 months after joining multiple tables. The query works, but the output is not what I want. I want to add another column named "Current Month". The basic idea: if I view the report in May, the columns should run from last year's May through this year's April, with May as the current month, for a total of 13 count columns. My intuition says a window query will help here, but I am not sure how to do that.
select
c.name,
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'January' THEN 1 END) as "January",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'February' THEN 1 END) as "February",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'March' THEN 1 END) as "March",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'April' THEN 1 END) as "April",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'May' THEN 1 END) as "May",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'June' THEN 1 END) as "June",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'July' THEN 1 END) as "July",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'August' THEN 1 END) as "August",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'September' THEN 1 END) as "September",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'October' THEN 1 END) as "October",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'November' THEN 1 END) as "November",
SUM(case when RTRIM(TO_CHAR(mor.sent_at , 'Month')) = 'December' THEN 1 END) as "December"
from analytics_outbox mo
inner join analytics_outbox_recipient mor on mor.analytics_outbox_id = mo.id
inner join customer c on c.id = mo.customer_id
group by c.name
Current Output:
name |january|february|march |april |may |june|july|august|september|october|november|december|
----------------------------------+-------+--------+------+-------+-------+----+----+------+---------+-------+--------+--------+
ABC | | | 1| 2| | | | | | | | |
DEF | 11| 24| 34| 32| 19| | | | | | | |
GEH | 9| 3| 7| 18| 22| | | | | | | |
IJK | | | | 1| | | | | | | | |
Dynamic result column names are only possible with dynamic SQL.
This should do the job efficiently, apart from the dynamic column names:
SELECT c.name
, to_char(t.mon, 'Month YYYY') AS report_month
, count(*) FILTER (WHERE mor.sent_at >= t.mon - interval '12 mon' AND mor.sent_at < t.mon - interval '11 mon') AS mon1
, count(*) FILTER (WHERE mor.sent_at >= t.mon - interval '11 mon' AND mor.sent_at < t.mon - interval '10 mon') AS mon2
, count(*) FILTER (WHERE mor.sent_at >= t.mon - interval '10 mon' AND mor.sent_at < t.mon - interval '09 mon') AS mon3
-- etc.
FROM analytics_outbox mo
JOIN analytics_outbox_recipient mor ON mor.analytics_outbox_id = mo.id
JOIN customer c ON c.id = mo.customer_id
, (SELECT date_trunc('month', now())) AS t(mon) -- add once for ease of use
GROUP BY 1, 2; -- report_month is not aggregated, so it must be grouped as well
This compares unaltered values from sent_at to a constant value (computed once), which is cheaper than running each value through multiple functions before comparison.
Possible corner case issues with time zone and timestamp vs. timestamptz unresolved due to missing input.
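To make the half-open month windows concrete, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. It is not the PostgreSQL query above: the table `sends` with columns `name` and `sent_at`, the helpers `month_start` and `rolling_month_counts`, and the demo rows are all made up for illustration, and `SUM(CASE ...)` stands in for `count(*) FILTER (...)`. Dates are stored as ISO strings, so plain string comparison orders them correctly.

```python
import sqlite3
from datetime import date

def month_start(d, months_back):
    """First day of the month that lies `months_back` months before d's month."""
    idx = d.year * 12 + (d.month - 1) - months_back
    return date(idx // 12, idx % 12 + 1, 1)

def rolling_month_counts(conn, today, months_back=12):
    # Half-open [start, end) bounds, oldest month first, current month last.
    bounds = [(month_start(today, k), month_start(today, k - 1))
              for k in range(months_back, -1, -1)]
    # One conditional-sum column per month; the bounds are passed as parameters,
    # so sent_at itself is compared unaltered.
    cols = ", ".join(
        "SUM(CASE WHEN sent_at >= ? AND sent_at < ? THEN 1 ELSE 0 END)"
        for _ in bounds)
    params = [day.isoformat() for pair in bounds for day in pair]
    rows = conn.execute(
        f"SELECT name, {cols} FROM sends GROUP BY name ORDER BY name", params
    ).fetchall()
    labels = [start.strftime("%Y-%m") for start, _ in bounds]
    return labels, rows

def demo_connection():
    # Hypothetical demo data, not taken from the question.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sends (name TEXT, sent_at TEXT)")
    conn.executemany("INSERT INTO sends VALUES (?, ?)", [
        ("ABC", "2023-04-30"),  # one day before the oldest window
        ("ABC", "2023-05-10"),
        ("ABC", "2024-04-30"),
        ("ABC", "2024-05-01"),
        ("DEF", "2024-05-02"),
        ("DEF", "2024-05-19"),
    ])
    return conn

labels, rows = rolling_month_counts(demo_connection(), date(2024, 5, 20))
for row in rows:
    print(row[0], dict(zip(labels, row[1:])))
```

Because the column labels are computed in the client, this also sidesteps the dynamic column name limitation: the SQL returns positional counts and the caller attaches the month names.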
This is T-SQL, and I'm trying to calculate the repeat purchase rate for the last 12 months. This is achieved by looking at the sum of customers who have bought more than once in the last 12 months and the total number of customers in the last 12 months.
The SQL code below gives me just that, but I would like to do this dynamically for each of the last 12 months. This is the part where I'm stuck and not sure how best to achieve it.
Each month should include data going back 12 months, i.e. June should hold data from June 2018 till June 2019, and May should hold data from May 2018 till May 2019.
[Order Date] is a normal datetime field (yyyy-mm-dd hh:mm:ss).
DECLARE @startdate1 DATETIME
DECLARE @enddate1 DATETIME
SET @enddate1 = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0) -- Ending June 2019
SET @startdate1 = DATEADD(mm,DATEDIFF(mm,0,GETDATE())-13,0) -- Starting June 2018
;
with dataset as (
select [Phone No_] as who_identifier,
count(distinct([Order No_])) as mycount
from [MyCompany$Sales Invoice Header]
where [Order Date] between @startdate1 and @enddate1
group by [Phone No_]
),
frequentbuyers as (
select who_identifier, sum(mycount) as frequentbuyerscount
from dataset
where mycount > 1
group by who_identifier),
allpurchases as (
select who_identifier, sum(mycount) as allpurchasescount
from dataset
group by who_identifier
)
select sum(frequentbuyerscount) as frequentbuyercount, (select sum(allpurchasescount) from allpurchases) as allpurchasecount
from frequentbuyers
I'm hoping to achieve end result looking something like this:
...Dec, Jan, Feb, March, April, May, June each month holding both values for frequentbuyercount and allpurchasescount.
Here is the code. I made a small modification to how frequentbuyerscount and allpurchasescount are computed: if you use a SUMIF-like expression, you don't need a second CTE.
if object_id('tempdb.dbo.#tmpMonths') is not null drop table #tmpMonths
create table #tmpMonths ( MonthID datetime, StartDate datetime, EndDate datetime)
declare @MonthCount int = 12
declare @Month datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)
while @MonthCount > 0 begin
insert into #tmpMonths( MonthID, StartDate, EndDate )
select @Month, dateadd(month, -12, @Month), @Month
set @Month = dateadd(month, -1, @Month)
set @MonthCount = @MonthCount - 1
end
;with dataset as (
select m.MonthID as MonthID, [Phone No_] as who_identifier,
count(distinct([Order No_])) as mycount
from [MyCompany$Sales Invoice Header]
inner join #tmpMonths m on [Order Date] between m.StartDate and m.EndDate
group by m.MonthID, [Phone No_]
),
buyers as (
select MonthID, who_identifier
, sum(iif(mycount > 1, mycount, 0)) as frequentbuyerscount --sum only if count > 1
, sum(mycount) as allpurchasescount
from dataset
group by MonthID, who_identifier
)
select
b.MonthID
, max(tm.StartDate) StartDate, max(tm.EndDate) EndDate
, sum(b.frequentbuyerscount) as frequentbuyercount
, sum(b.allpurchasescount) as allpurchasecount
from buyers b inner join #tmpMonths tm on tm.MonthID = b.MonthID
group by b.MonthID
Be aware that the code was only tested for syntax.
With the test data, this is the result:
MonthID | StartDate | EndDate | frequentbuyercount | allpurchasecount
-----------------------------------------------------------------------------
2018-08-01 | 2017-08-01 | 2018-08-01 | 340 | 3702
2018-09-01 | 2017-09-01 | 2018-09-01 | 340 | 3702
2018-10-01 | 2017-10-01 | 2018-10-01 | 340 | 3702
2018-11-01 | 2017-11-01 | 2018-11-01 | 340 | 3702
2018-12-01 | 2017-12-01 | 2018-12-01 | 340 | 3703
2019-01-01 | 2018-01-01 | 2019-01-01 | 340 | 3703
2019-02-01 | 2018-02-01 | 2019-02-01 | 2 | 8
2019-03-01 | 2018-03-01 | 2019-03-01 | 2 | 3
2019-04-01 | 2018-04-01 | 2019-04-01 | 2 | 3
2019-05-01 | 2018-05-01 | 2019-05-01 | 2 | 3
2019-06-01 | 2018-06-01 | 2019-06-01 | 2 | 3
2019-07-01 | 2018-07-01 | 2019-07-01 | 2 | 3
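The window-table join and the SUMIF-like aggregation can be tried out on a small scale with the following sketch, which uses Python's sqlite3 rather than SQL Server. The tables `months` and `orders`, their column names, and the demo rows are hypothetical stand-ins for #tmpMonths and the invoice header table. One deliberate difference from the code above: the join uses a half-open range (>= start, < end) instead of BETWEEN, so an order placed exactly on a boundary cannot land in two adjacent windows.

```python
import sqlite3

ROLLING_SQL = """
WITH dataset AS (
    -- distinct orders per customer inside each 12-month window
    SELECT m.month_id, o.phone_no, COUNT(DISTINCT o.order_no) AS mycount
    FROM orders AS o
    JOIN months AS m
      ON o.order_date >= m.start_date AND o.order_date < m.end_date
    GROUP BY m.month_id, o.phone_no
)
SELECT month_id,
       -- SUMIF-like: count orders only for customers with more than one order
       SUM(CASE WHEN mycount > 1 THEN mycount ELSE 0 END) AS frequentbuyercount,
       SUM(mycount) AS allpurchasecount
FROM dataset
GROUP BY month_id
ORDER BY month_id
"""

def demo_connection():
    # Hypothetical demo data: two windows and four orders.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE months (month_id TEXT, start_date TEXT, end_date TEXT)")
    conn.execute("CREATE TABLE orders (phone_no TEXT, order_no TEXT, order_date TEXT)")
    conn.executemany("INSERT INTO months VALUES (?, ?, ?)", [
        ("2019-06-01", "2018-06-01", "2019-06-01"),
        ("2019-07-01", "2018-07-01", "2019-07-01"),
    ])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
        ("A", "o1", "2018-06-15"),
        ("A", "o2", "2019-05-10"),
        ("B", "o3", "2019-06-20"),
        ("C", "o4", "2018-05-01"),  # older than every window
    ])
    return conn

def repeat_purchase_by_window(conn):
    return conn.execute(ROLLING_SQL).fetchall()

for row in repeat_purchase_by_window(demo_connection()):
    print(row)
```

In the first window customer A has two orders and counts as a frequent buyer; in the second window A's older order has dropped out, so nobody qualifies.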
The table that I am querying is:
table testing_table:
testType | period_from | period_to| copies |
1 | 20180101| 20181201| 1|
2 | 20180101 | 20191201| 1|
3 | 20190101| 20191201| 1|
I want to loop through an array of dates (dateVar, defined below) and use the query below to generate values like this:
DateVar | ABTEST | CDTEST | EFTEST |
20180101| 4| 0| 0|
20180201| 3| 4| 2|
dateVar = ['20180101','20180201','20180501'].
I am trying to develop an SQL query like this:
SELECT
SUM (
CASE
WHEN (testType = 1 AND (period_from <= dateVar AND period_to >= dateVar)) THEN
copies
ELSE
0
END
) AS "ABTEST",
SUM (
CASE
WHEN (testType = 2 AND (period_from <= dateVar AND period_to >= dateVar)) THEN
copies
ELSE
0
END
) AS "CDTEST",
SUM (
CASE
WHEN (testType = 3 AND (period_from <= dateVar AND period_to >= dateVar)) THEN
copies
ELSE
0
END
) AS "EFTEST"
FROM
testing_table;
I am lost as to what to do with it. Do I look into functions?
I think you should use the unnest function to accomplish what you are asking. I have written a query you may want to check:
SELECT DTVAR,
SUM(CASE
WHEN TestType = 1
THEN copies
ELSE 0
END) AS "ABTEST",
SUM(CASE
WHEN TestType = 2
THEN copies
ELSE 0
END) AS "CDTEST",
SUM(CASE
WHEN TestType = 3
THEN copies
ELSE 0
END) AS "EFTEST"
FROM (
SELECT DTVAR, TestType, sum(copies) AS copies
FROM testing_table
INNER JOIN (
SELECT DTVAR
FROM unnest(ARRAY['20180101','20180201','20180501']) AS DTVAR
) AA
ON (
period_from <= DTVAR
AND period_to >= DTVAR
)
GROUP BY DTVAR, TestType
) A
GROUP BY DTVAR
Hope this helps.
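The same "turn the list into rows, then join" idea can be sketched outside PostgreSQL. The example below uses Python's sqlite3, where a VALUES list in a CTE plays the role of unnest; the function name `copies_by_date` is made up, and the period columns are assumed to be stored as text in YYYYMMDD form so that string comparison matches date order. A LEFT JOIN is used so that a date with no matching period still appears with zeros, which is a small departure from the INNER JOIN above.

```python
import sqlite3

def copies_by_date(conn, date_vars):
    # One "(?)" row per date; the list must not be empty.
    placeholders = ", ".join("(?)" for _ in date_vars)
    sql = f"""
        WITH dates(dtvar) AS (VALUES {placeholders})
        SELECT d.dtvar,
               COALESCE(SUM(CASE WHEN t.testType = 1 THEN t.copies END), 0) AS ABTEST,
               COALESCE(SUM(CASE WHEN t.testType = 2 THEN t.copies END), 0) AS CDTEST,
               COALESCE(SUM(CASE WHEN t.testType = 3 THEN t.copies END), 0) AS EFTEST
        FROM dates AS d
        LEFT JOIN testing_table AS t
               ON t.period_from <= d.dtvar AND t.period_to >= d.dtvar
        GROUP BY d.dtvar
        ORDER BY d.dtvar
    """
    return conn.execute(sql, list(date_vars)).fetchall()

def demo_connection():
    # Rows copied from the question's testing_table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testing_table "
                 "(testType INTEGER, period_from TEXT, period_to TEXT, copies INTEGER)")
    conn.executemany("INSERT INTO testing_table VALUES (?, ?, ?, ?)", [
        (1, "20180101", "20181201", 1),
        (2, "20180101", "20191201", 1),
        (3, "20190101", "20191201", 1),
    ])
    return conn

for row in copies_by_date(demo_connection(), ["20180101", "20180201", "20180501"]):
    print(row)
```

Each output row is one date from the list followed by the three sums, so no loop over the dates is needed on the client side.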
I'm having problems with lead() when the value it should return lies outside the range selected by the query: rows on the edge of the range get a null lead() value.
Let's say I have a simple table to keep track of continuous counters
create table anytable
( wseller integer NOT NULL,
wday date NOT NULL,
wshift smallint NOT NULL,
wcounter numeric(9,1) )
with the following values
wseller wday wshift wcounter
1 2016-11-30 1 100.5
1 2017-01-03 1 102.5
1 2017-01-25 2 103.2
1 2017-02-05 2 106.1
2 2015-05-05 2 81.1
2 2017-01-01 1 92.1
2 2017-01-01 2 93.1
3 2016-12-01 1 45.2
3 2017-01-05 1 50.1
and I want the net units for the current year
wseller wday wshift units
1 2017-01-03 1 2
1 2017-01-25 2 0.7
1 2017-02-05 2 2.9
2 2017-01-01 1 11
2 2017-01-01 2 1
3 2017-01-05 1 4.9
If I use
select wseller, wday, wshift, wcounter-lead(wcounter) over (partition by wseller order by wseller, wday desc, wshift desc)
from anytable
where wday>='2017-01-01'
it gives me nulls for the first row of each wseller partition. I'm using this query within a large CTE.
What am I doing wrong?
Window functions are evaluated after the WHERE clause, so they only see the rows that pass the filter; the earlier rows are already gone when lead() runs. Move the condition to the outer query:
select *
from (
select
wseller, wday, wshift,
wcounter- lead(wcounter) over (partition by wseller order by wday desc, wshift desc)
from anytable
) s
where wday >= '2017-01-01'
order by wseller, wday, wshift
wseller | wday | wshift | ?column?
---------+------------+--------+----------
1 | 2017-01-03 | 1 | 2.0
1 | 2017-01-25 | 2 | 0.7
1 | 2017-02-05 | 2 | 2.9
2 | 2017-01-01 | 1 | 11.0
2 | 2017-01-01 | 2 | 1.0
3 | 2017-01-05 | 1 | 4.9
(6 rows)
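The difference between filtering inside and outside the window query can be reproduced with a small sketch in Python's sqlite3 (window functions need SQLite 3.25 or newer). The table and data follow the question; the function `net_units`, its `filter_inside` flag, and the `units` alias are made up for this illustration, and ROUND is only there to hide floating-point noise because SQLite stores the counters as floats.

```python
import sqlite3

def net_units(conn, since, filter_inside):
    # filter_inside=True puts the date filter next to the window function
    # (the question's version); False moves it to the outer query (the fix).
    inner_where = "WHERE wday >= :since" if filter_inside else ""
    outer_where = "" if filter_inside else "WHERE wday >= :since"
    sql = f"""
        SELECT wseller, wday, wshift, units
        FROM (
            SELECT wseller, wday, wshift,
                   ROUND(wcounter - LEAD(wcounter) OVER (
                       PARTITION BY wseller
                       ORDER BY wday DESC, wshift DESC), 1) AS units
            FROM anytable
            {inner_where}
        ) AS s
        {outer_where}
        ORDER BY wseller, wday, wshift
    """
    return conn.execute(sql, {"since": since}).fetchall()

def demo_connection():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE anytable "
                 "(wseller INTEGER, wday TEXT, wshift INTEGER, wcounter REAL)")
    conn.executemany("INSERT INTO anytable VALUES (?, ?, ?, ?)", [
        (1, "2016-11-30", 1, 100.5), (1, "2017-01-03", 1, 102.5),
        (1, "2017-01-25", 2, 103.2), (1, "2017-02-05", 2, 106.1),
        (2, "2015-05-05", 2, 81.1), (2, "2017-01-01", 1, 92.1),
        (2, "2017-01-01", 2, 93.1), (3, "2016-12-01", 1, 45.2),
        (3, "2017-01-05", 1, 50.1),
    ])
    return conn

conn = demo_connection()
print(net_units(conn, "2017-01-01", filter_inside=True))   # earliest row per seller has units NULL
print(net_units(conn, "2017-01-01", filter_inside=False))  # every row has a value
```

With the filter inside, the earliest 2017 row of each seller has no predecessor left in its partition; with the filter outside, lead() still sees the pre-2017 rows before they are discarded.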
I'm having issues trying to wrap my head around how to extract some time series stats from my Postgres DB.
For example, I have several stores. I record how many sales each store made each day in a table that looks like:
+------------+----------+-------+
| Date | Store ID | Count |
+------------+----------+-------+
| 2017-02-01 | 1 | 10 |
| 2017-02-01 | 2 | 20 |
| 2017-02-03 | 1 | 11 |
| 2017-02-03 | 2 | 21 |
| 2017-02-04 | 3 | 30 |
+------------+----------+-------+
I'm trying to display this data on a bar/line graph with different lines per Store and the blank dates filled in with 0.
I have been successful getting it to show the sum per day (combining all the stores into one sum) using generate_series, but I can't figure out how to separate it out so each store has a value for each day... the result being something like:
["Store ID 1", 10, 0, 11, 0]
["Store ID 2", 20, 0, 21, 0]
["Store ID 3", 0, 0, 0, 30]
You need to build a cross join of dates × stores, then join the sales onto that grid:
select store_id, array_agg(total order by date) as total
from (
select store_id, date, coalesce(sum(total), 0) as total
from
t
right join (
generate_series(
(select min(date) from t),
(select max(date) from t),
'1 day'
) gs (date)
cross join
(select distinct store_id from t) s
) using (date, store_id)
group by 1,2
) s
group by 1
order by 1
;
store_id | total
----------+-------------
1 | {10,0,11,0}
2 | {20,0,21,0}
3 | {0,0,0,30}
Sample data:
create table t (date date, store_id int, total int);
insert into t (date, store_id, total) values
('2017-02-01',1,10),
('2017-02-01',2,20),
('2017-02-03',1,11),
('2017-02-03',2,21),
('2017-02-04',3,30);
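For readers without a PostgreSQL instance at hand, the dates × stores grid can be sketched with Python's sqlite3. This is an adaptation, not the query above: SQLite has no generate_series built in by default, so a recursive CTE produces the days, and the per-store lists are assembled in Python instead of with array_agg. The column name `sale_date` (instead of `date`) and the function `series_per_store` are choices made for this example.

```python
import sqlite3
from collections import defaultdict

GRID_SQL = """
WITH RECURSIVE days(d) AS (
    SELECT MIN(sale_date) FROM t
    UNION ALL
    SELECT date(d, '+1 day') FROM days
    WHERE d < (SELECT MAX(sale_date) FROM t)
),
stores AS (SELECT DISTINCT store_id FROM t)
SELECT s.store_id, days.d, COALESCE(SUM(t.total), 0) AS total
FROM days
CROSS JOIN stores AS s                 -- every day paired with every store
LEFT JOIN t ON t.sale_date = days.d AND t.store_id = s.store_id
GROUP BY s.store_id, days.d
ORDER BY s.store_id, days.d
"""

def series_per_store(conn):
    # Rows arrive ordered by store and day, so appending keeps date order.
    series = defaultdict(list)
    for store_id, _day, total in conn.execute(GRID_SQL):
        series[store_id].append(total)
    return dict(series)

def demo_connection():
    # Same sample rows as above, with the date column renamed to sale_date.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (sale_date TEXT, store_id INTEGER, total INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
        ("2017-02-01", 1, 10), ("2017-02-01", 2, 20),
        ("2017-02-03", 1, 11), ("2017-02-03", 2, 21),
        ("2017-02-04", 3, 30),
    ])
    return conn

print(series_per_store(demo_connection()))
```

The key point is the same as in the answer: the grid is built first, so a missing sale becomes a NULL from the outer join and is turned into 0 by COALESCE.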
I have a table with 3 columns:
+---------------+-------------------------+-------+
| InstrumentId | Date | Price |
+---------------+-------------------------+-------+
| 39 | 2012-10-31 00:00:00.000 | 150 |
| 39 | 2012-11-01 00:00:00.000 | 160 |
| 39 | 2012-11-01 00:00:00.000 | 200 |
| 40 | 2012-10-31 00:00:00.000 | 150 |
| 40 | 2012-11-01 00:00:00.000 | 140 |
| 40 | 2012-11-01 00:00:00.000 | 200 |
| 50 | 2012-10-31 00:00:00.000 | 150 |
| 50 | 2012-11-01 00:00:00.000 | 150 |
| 50 | 2012-11-01 00:00:00.000 | 150 |
+---------------+-------------------------+-------+
I need to get the following result:
+--------------+-------+
| InstrumentId | Price |
+--------------+-------+
| 39 | 200 |
| 40 | 0 |
| 50 | 150 |
+--------------+-------+
rules:
if the price for the same InstrumentId is growing or stays equal => return the last price (that is, every next price is greater than or equal to the previous price).
For instance Id 39: 150 <= 160 <= 200 => return 200
if any price for the same InstrumentId is less than the previous one => return 0 (see InstrumentId 40)
I can do that with a cursor, but I think there is a simpler way to do this.
Any ideas?
test data:
DECLARE @table TABLE(
instrumentId INT NOT NULL,
priceListDate DATETIME NOT NULL,
price DECIMAL NOT NULL
)
INSERT INTO @table
(
instrumentId,
priceListDate,
price
)
VALUES( 39, '2012-10-31 00:00:00.000', 150),
(39,'2012-11-01 00:00:00.000', 160),
(39,'2012-11-01 00:00:00.000', 200),
(40,'2012-10-31 00:00:00.000', 150),
(40,'2012-11-01 00:00:00.000', 140),
(40,'2012-11-01 00:00:00.000', 200),
(50,'2012-10-31 00:00:00.000', 150),
(50,'2012-11-01 00:00:00.000', 150),
(50,'2012-11-01 00:00:00.000', 150)
Let me know if this works OK. I'm assuming you will never have a price of -1 in your table, since -1 is used as a marker and a real -1 would cause problems for this solution.
WITH CTE
AS ( SELECT RN = ROW_NUMBER() OVER ( ORDER BY instrumentId, priceListDate ) ,
*
FROM @table
)
SELECT CASE WHEN MIN(X.xPrice) = -1 THEN 0
ELSE MAX(X.xPrice)
END 'price' ,
X.instrumentId
FROM ( SELECT CASE WHEN [Current Row].instrumentId = [Next Row].instrumentId
THEN CASE WHEN [Current Row].price > [Next Row].price
THEN -1
ELSE [Current Row].price
END
ELSE CASE WHEN [Previous Row].instrumentId = [Current Row].instrumentId
THEN CASE WHEN [Previous Row].price <= [Current Row].price
THEN [Current Row].price
ELSE -1
END
ELSE [Current Row].price
END
END 'xPrice' ,
[Current Row].RN ,
[Current Row].instrumentId
FROM CTE [Current Row]
LEFT JOIN CTE [Previous Row] ON [Previous Row].RN = [Current Row].RN
- 1
LEFT JOIN CTE [Next Row] ON [Next Row].RN = [Current Row].RN
+ 1
) X
GROUP BY X.instrumentId
It might seem a bit convoluted, but the basic idea is to determine the next and previous rows of the current one in order to test the value of the price column for that row.
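As an alternative to the self-joins, the previous row can be fetched with a window function. The sketch below is a different technique from the answer above, run against Python's sqlite3 (SQLite 3.25 or newer) rather than SQL Server. It assumes an extra `seq` column that records insertion order, because two rows per instrument share the same priceListDate and the date alone cannot order them; the table name `prices` and that column are inventions for this example.

```python
import sqlite3

MONOTONIC_SQL = """
WITH stepped AS (
    SELECT instrumentId, price,
           LAG(price) OVER (PARTITION BY instrumentId
                            ORDER BY priceListDate, seq) AS prev_price
    FROM prices
)
SELECT instrumentId,
       -- any drop yields 0; otherwise the series never decreases,
       -- so the last price is also the maximum
       CASE WHEN SUM(CASE WHEN price < prev_price THEN 1 ELSE 0 END) > 0
            THEN 0
            ELSE MAX(price)
       END AS price
FROM stepped
GROUP BY instrumentId
ORDER BY instrumentId
"""

def demo_connection():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (seq INTEGER PRIMARY KEY, "
                 "instrumentId INTEGER, priceListDate TEXT, price INTEGER)")
    conn.executemany(
        "INSERT INTO prices (instrumentId, priceListDate, price) VALUES (?, ?, ?)", [
            (39, "2012-10-31", 150), (39, "2012-11-01", 160), (39, "2012-11-01", 200),
            (40, "2012-10-31", 150), (40, "2012-11-01", 140), (40, "2012-11-01", 200),
            (50, "2012-10-31", 150), (50, "2012-11-01", 150), (50, "2012-11-01", 150),
        ])
    return conn

def last_price_or_zero(conn):
    return conn.execute(MONOTONIC_SQL).fetchall()

print(last_price_or_zero(demo_connection()))
```

This version needs no -1 marker, so a legitimate negative price would not be misread as a drop.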