How to divide a period in columns - postgresql

I am trying to create a query where the first column shows the list of the companies and the other 3 columns their revenues per month. This is what I do:
WITH time_frame AS
(SELECT date_trunc('month',NOW())-interval '0 week'),
time_frame1 AS
(SELECT date_trunc('month',NOW())-interval '1 month'),
time_frame2 AS
(SELECT date_trunc('month',NOW())-interval '2 month')
select table1.company_name,
(CASE
WHEN table2.date_of_transaction = (SELECT * FROM time_frame2) THEN sum(table2.amount)
ELSE NULL
END) AS "current week - 2",
(CASE
WHEN table2.date_of_transaction = (SELECT * FROM time_frame1) THEN sum(table2.amount)
ELSE NULL
END) AS "current week - 1",
(CASE
WHEN table2.date_of_transaction = (SELECT * FROM time_frame2) THEN
sum(table2.amount)
ELSE NULL
END) AS "current week - 2"
from table1
join table2 on table2.table1_id = table.id
where table1.company_joined >= '04-20-2019'
group by 1
When I execute the query, this error comes out:
Error running query: column "table2.date_of_transaction" must appear in the GROUP BY clause or be used in an aggregate function
LINE 15: WHEN table2.date_of_transaction = (SELECT * FROM time_frame) TH... ^
Do you have any ideas on how to solve it? Thank you.
company name   month1   month2
name 1         £233     £343
name 2         £243     £34
name 3         £133     £43

You can simplify the statement by using the FILTER clause on the aggregate:
select t1.company_name,
sum(t2.amount) filter (where t2.date_of_transaction = date_trunc('month',NOW())-interval '2 month'),
sum(t2.amount) filter (where t2.date_of_transaction = date_trunc('month',NOW())-interval '1 month'),
sum(t2.amount) filter (where t2.date_of_transaction = date_trunc('month',NOW()))
from table1 t1
join table2 t2 on t2.table1_id = t1.id
where t1.company_joined >= date '2019-04-20'
group by t1.company_name;
If you really want to put the date ranges into a CTE, you only need one:
with dates (r1, r2, r3) as (
values
(date_trunc('month',NOW())-interval '2 month',
date_trunc('month',NOW())-interval '1 month',
date_trunc('month',NOW()))
)
select t1.company_name,
sum(t2.amount) filter (where t2.date_of_transaction = d.r1),
sum(t2.amount) filter (where t2.date_of_transaction = d.r2),
sum(t2.amount) filter (where t2.date_of_transaction = d.r3)
from table1 t1
cross join dates d
join table2 t2 on t2.table1_id = t1.id
where t1.company_joined >= date '2019-04-20'
group by t1.company_name
;
The CTE dates returns a single row with three columns and thus the cross join doesn't change the resulting number of rows.
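Note that the equality comparisons above only match transactions stamped at exactly the first instant of a month. If date_of_transaction can hold any date or timestamp within the month (an assumption, the question does not say), a half-open range per month is one way to bucket the rows. The following is a minimal sketch; the sample rows and the helper CTE m are hypothetical and only make the query runnable on its own:

```sql
-- Hypothetical sample data standing in for table1 and table2.
WITH table1 (id, company_name, company_joined) AS (
    VALUES (1, 'name 1', date '2019-05-01'),
           (2, 'name 2', date '2019-06-15')
),
table2 (table1_id, date_of_transaction, amount) AS (
    VALUES (1, date_trunc('month', now()) + interval '1 day', 10.0),
           (1, date_trunc('month', now()) - interval '1 month' + interval '3 days', 20.0),
           (2, date_trunc('month', now()) - interval '2 month' + interval '10 days', 30.0)
),
-- Start of the current month, computed once.
m (cur) AS (
    SELECT date_trunc('month', now())
)
SELECT t1.company_name,
       -- Each bucket is a half-open range: [month start, next month start).
       sum(t2.amount) FILTER (WHERE t2.date_of_transaction >= m.cur - interval '2 month'
                                AND t2.date_of_transaction <  m.cur - interval '1 month') AS "current month - 2",
       sum(t2.amount) FILTER (WHERE t2.date_of_transaction >= m.cur - interval '1 month'
                                AND t2.date_of_transaction <  m.cur) AS "current month - 1",
       sum(t2.amount) FILTER (WHERE t2.date_of_transaction >= m.cur
                                AND t2.date_of_transaction <  m.cur + interval '1 month') AS "current month"
FROM table1 t1
CROSS JOIN m
JOIN table2 t2 ON t2.table1_id = t1.id
WHERE t1.company_joined >= date '2019-04-20'
GROUP BY t1.company_name
ORDER BY t1.company_name;
```

A company with no transactions in a given month gets NULL in that column; wrapping each sum in COALESCE(..., 0) would show 0 instead.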

Related

Sub query in SELECT - ungrouped column from outer query

I have to calculate the ARPU (Revenue / # users) but I got this error:
subquery uses ungrouped column "usage_records.date" from outer query
LINE 7: WHERE created_at <= date_trunc('day', usage_records.d... ^
Expected results:
Revenue(day) = SUM(quantity_eur) for that day
Users Count (day) = Total signed up users before that day
Postgresql (Query)
SELECT
date_trunc('day', usage_records.date) AS day,
SUM(usage_records.quantity_eur) as Revenue,
( SELECT
COUNT(users.id)
FROM users
WHERE created_at <= date_trunc('day', usage_records.date)
) as users_count
FROM users
INNER JOIN ownerships ON (ownerships.user_id = users.id)
INNER JOIN profiles ON (profiles.id = ownerships.profile_id)
INNER JOIN usage_records ON (usage_records.profile_id = profiles.id)
GROUP BY DAY
ORDER BY DAY asc
Your subquery (executed for each row) contains a column that is neither mentioned in the GROUP BY nor involved in an aggregation, and this produces the error.
You could refactor your query to use a conditional aggregate for this value instead:
SELECT
date_trunc('day', usage_records.date) AS day,
SUM(usage_records.quantity_eur) as Revenue,
sum( case when created_at <= date_trunc('day', usage_records.date)
AND users.id is not null
then 1 else 0 end ) users_count
FROM users
INNER JOIN ownerships ON (ownerships.user_id = users.id)
INNER JOIN profiles ON (profiles.id = ownerships.profile_id)
INNER JOIN usage_records ON (usage_records.profile_id = profiles.id)
GROUP BY DAY
ORDER BY DAY asc
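Be aware that a conditional sum over the joined rows counts one per usage record, which may not equal the total number of users signed up before that day. Another possible approach, sketched here under the assumption that the user count should come from the whole users table, is to aggregate revenue per day first and then run the correlated count against the already grouped day. The sample rows are hypothetical and the joins through ownerships and profiles are left out to keep the sketch short:

```sql
-- Hypothetical sample data; only the columns needed for the sketch.
WITH users (id, created_at) AS (
    VALUES (1, timestamp '2020-01-01 10:00'),
           (2, timestamp '2020-01-02 09:00'),
           (3, timestamp '2020-01-05 12:00')
),
usage_records (profile_id, date, quantity_eur) AS (
    VALUES (10, timestamp '2020-01-02 08:00', 5.0),
           (10, timestamp '2020-01-02 18:00', 7.0),
           (11, timestamp '2020-01-03 11:00', 4.0)
),
-- Step 1: revenue per day, so "day" becomes a grouped column.
daily AS (
    SELECT date_trunc('day', ur.date) AS day,
           sum(ur.quantity_eur)       AS revenue
    FROM usage_records ur
    GROUP BY 1
)
-- Step 2: the subquery now refers to d.day, which is a plain column of daily.
SELECT d.day,
       d.revenue,
       (SELECT count(*)
        FROM users u
        WHERE u.created_at <= d.day) AS users_count
FROM daily d
ORDER BY d.day;
```

Because the subquery references a column of the derived table rather than an ungrouped column of the outer query, the "ungrouped column" error does not arise.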

Return 0 value of SUM function for all rows in left table

Forgive the probably inane question.
I am using the following query, where I want to return a value for every salesman even if the spend is zero. How do I achieve this?
SELECT
CAST(coalesce(customer.salesman, '0') AS integer) as Salesman,
Sum(sophead.order_value) AS "Quote Value"
FROM
customer
INNER JOIN sophead ON sophead.inv_account = customer.account
WHERE
customer.company = 2 AND
Extract(MONTH FROM sophead.order_date) = Extract(MONTH FROM Now()) AND
Extract(YEAR FROM sophead.order_date) = Extract(YEAR FROM Now()) AND
sophead.order_type = 'Q' AND
sophead.salesman IN ('21','22','25','28','29','76')
GROUP BY
customer.salesman
ORDER BY
customer.salesman
You just need to use LEFT JOIN and COALESCE to ensure you show 0 if there are no sales. All conditions on sophead have to go into the ON clause; in the WHERE clause they would discard the customers that have no matching order:
SELECT
CAST(coalesce(customer.salesman, '0') AS integer) as Salesman,
COALESCE(Sum(sophead.order_value), 0) AS "Quote Value"
FROM customer
LEFT JOIN sophead
ON sophead.inv_account = customer.account
AND Extract(MONTH FROM sophead.order_date) = Extract(MONTH FROM Now())
AND Extract(YEAR FROM sophead.order_date) = Extract(YEAR FROM Now())
AND sophead.salesman IN ('21','22','25','28','29','76')
AND sophead.order_type = 'Q'
WHERE customer.company = 2
GROUP BY
customer.salesman
ORDER BY
customer.salesman
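The difference between filtering in ON and filtering in WHERE can be seen in a small self-contained sketch. The two customers and two orders below are hypothetical; salesman 22 only has an order that is not a quote:

```sql
-- Hypothetical sample data: salesman 22 has no order of type 'Q'.
WITH customer (account, salesman, company) AS (
    VALUES ('A1', '21', 2),
           ('A2', '22', 2)
),
sophead (inv_account, order_type, order_value) AS (
    VALUES ('A1', 'Q', 100.0),
           ('A2', 'O', 50.0)
)
SELECT c.salesman,
       COALESCE(sum(s.order_value), 0) AS "Quote Value"
FROM customer c
LEFT JOIN sophead s
       ON s.inv_account = c.account
      AND s.order_type = 'Q'   -- filter on the right-hand table belongs in ON
WHERE c.company = 2            -- filter on the left-hand table stays in WHERE
GROUP BY c.salesman
ORDER BY c.salesman;
```

Here salesman 22 is kept with a value of 0. Moving the order_type condition into WHERE would remove that row, because the NULL-extended row produced by the LEFT JOIN does not satisfy order_type = 'Q'.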

How do I avoid joining multiple times when using union all statement?

I was working on a query where I hit a point:
SELECT tpd.timestamp::Date,'Mon' AS Label,
count(tpd.aggregated)
FROM tap.deving AS tpd INNER JOIN
(select DATE_TRUNC('week', timestamp), MAX(timestamp) AS max_timestamp
from tap.deving
group by DATE_TRUNC('week', timestamp)
) b
on tpd.timestamp = b.max_timestamp
left JOIN ca.hardware AS ch ON tpd.dev = ch.name
left JOIN ca.sites AS css ON css.id = ch.id
WHERE (tpd.aggregated=TRUE)
AND (css.country='USA') and (tpd.timestamp::date=now()::Date - interval '1 day') group by tpd.timestamp
UNION ALL
SELECT tpd.timestamp::date,'Tap but not' AS Label,
count(tpd.tap)
FROM tap.deving AS tpd INNER JOIN
(select DATE_TRUNC('week', timestamp), MAX(timestamp) AS max_timestamp
from tap.deving
group by DATE_TRUNC('week', timestamp)
) b
on tpd.timestamp = b.max_timestamp
left JOIN ca.hardware AS ch ON tpd.dev = ch.name
left JOIN ca.sites AS css ON css.id = ch.id
WHERE (tpd.tap=true)
AND (tpd.aggregated=false) and (tpd.needs_to_be=true)
AND (css.country='USA') and (tpd.timestamp::date=now()::Date - interval '1 day') group by tpd.timestamp
I wrote this query with the help of many SO posts, and it has become quite messy and very slow. I cannot get my head around how to optimize it.
Can you please try this query. It reads the tables once and derives the label from tpd.aggregated:
SELECT tpd.timestamp::Date,CASE tpd.aggregated
WHEN false THEN 'Tap but not'
WHEN true THEN 'Mon' END as Label,
count(tpd.aggregated)
FROM tap.deving AS tpd INNER JOIN
(select DATE_TRUNC('week', timestamp), MAX(timestamp) AS max_timestamp
from tap.deving
group by DATE_TRUNC('week', timestamp)
) b
on tpd.timestamp = b.max_timestamp
left JOIN ca.hardware AS ch ON tpd.dev = ch.name
left JOIN ca.sites AS css ON css.id = ch.id
WHERE ((tpd.aggregated=TRUE) or ((tpd.tap=true) AND (tpd.aggregated=false) and (tpd.needs_to_be=true)))
AND (css.country='USA') and (tpd.timestamp::date=now()::Date - interval '1 day') group by tpd.timestamp, tpd.aggregated;

Join on generate_series and count

I'm trying to find the # users who did action A or action B on a monthly basis.
Table: User
- id
- "creationDate"
Table: action_A
- user_id (= user.id)
- "creationDate"
Table: action_B
- user_id (= user.id)
- "creationDate"
The general idea of what I was trying to do was that I'd find the list of users who did action A in Month X and the list of users who did action B in Month X, then count how many ids are there for every month based on a generate_series of monthly dates.
I tried the following, however, the query times out when running and I'm not sure if there's any way to optimize it (or if it is even correct).
SELECT monthseries."Month", count(*)
FROM
(SELECT to_char(DAY::date, 'YYYY-MM') AS "Month"
FROM generate_series('2014-01-01'::date, CURRENT_DATE, '1 month') DAY) monthseries
LEFT JOIN
(SELECT to_char("creationDate", 'YYYY-MM') AS "Month",
id
FROM action_A) did_action_A ON monthseries."Month" = did_action_A."Month"
LEFT JOIN
(SELECT to_char("creationDate", 'YYYY-MM') AS "Month",
id
FROM action_B) did_action_B ON monthseries."Month" = did_action_B."Month"
GROUP BY monthseries."Month"
Any comments/ help would be immensely helpful!
If you want to count distinct users (count(s.id) rather than count(*), so that months without any action report 0 instead of 1):
select to_char(month, 'YYYY-MM') as "Month", count(s.id)
from
generate_series(
'2014-01-01'::date, current_date, '1 month'
) monthseries (month)
left join (
(
select distinct date_trunc('month', "creationDate") as month, id
from action_a
) a
full outer join (
select distinct date_trunc('month', "creationDate") as month, id
from action_b
) b using (month, id)
) s using (month)
group by 1
order by 1
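As a variation on the same idea, the two action tables can be combined with UNION, which removes duplicate (month, user) pairs by itself, before joining to the month series. This is only a sketch: the sample rows are hypothetical, the series end is fixed for the example, and it assumes the user is identified by user_id as in the table descriptions of the question:

```sql
-- Hypothetical sample data for the two action tables.
WITH action_a (user_id, "creationDate") AS (
    VALUES (1, timestamp '2014-01-05'),
           (1, timestamp '2014-01-20'),
           (2, timestamp '2014-02-03')
),
action_b (user_id, "creationDate") AS (
    VALUES (1, timestamp '2014-01-07'),
           (3, timestamp '2014-01-09')
),
-- UNION (not UNION ALL) keeps each (month, user_id) pair once.
active AS (
    SELECT date_trunc('month', "creationDate") AS month, user_id FROM action_a
    UNION
    SELECT date_trunc('month', "creationDate"), user_id FROM action_b
)
SELECT to_char(ms.month, 'YYYY-MM') AS "Month",
       count(a.user_id)             AS users   -- counts only matched rows, so empty months give 0
FROM generate_series(timestamp '2014-01-01',
                     timestamp '2014-03-01',
                     interval '1 month') AS ms(month)
LEFT JOIN active a ON a.month = ms.month
GROUP BY ms.month
ORDER BY ms.month;
```

With this sample data the result is 2 users for 2014-01, 1 for 2014-02 and 0 for 2014-03. Joining on the truncated month value rather than on a to_char string keeps the comparison on timestamps.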

Dealing with periods and dates without using cursors

I would like to solve this issue without using cursors (FETCH).
Here comes the problem...
1st Table/quantity
------------------
periodid periodstart periodend quantity
1 2010/10/01 2010/10/15 5
2nd Table/sold items
-----------------------
periodid periodstart periodend solditems
14343 2010/10/05 2010/10/06 2
Now I would like to get the following view or just query result
Table/stock
-----------------------
periodstart periodend itemsinstock
2010/10/01 2010/10/04 5
2010/10/05 2010/10/06 3
2010/10/07 2010/10/15 5
It seems impossible to solve this problem without using cursors, or without using single dates instead of periods.
I would appreciate any help.
Thanks
DECLARE @t1 TABLE (periodid INT,periodstart DATE,periodend DATE,quantity INT)
DECLARE @t2 TABLE (periodid INT,periodstart DATE,periodend DATE,solditems INT)
INSERT INTO @t1 VALUES(1,'2010-10-01T00:00:00.000','2010-10-15T00:00:00.000',5)
INSERT INTO @t2 VALUES(14343,'2010-10-05T00:00:00.000','2010-10-06T00:00:00.000',2)
DECLARE @D1 DATE
SELECT @D1 = MIN(P) FROM (SELECT MIN(periodstart) P FROM @t1
UNION ALL
SELECT MIN(periodstart) FROM @t2) D
DECLARE @D2 DATE
SELECT @D2 = MAX(P) FROM (SELECT MAX(periodend) P FROM @t1
UNION ALL
SELECT MAX(periodend) FROM @t2) D
;WITH
L0 AS (SELECT 1 AS c UNION ALL SELECT 1),
L1 AS (SELECT 1 AS c FROM L0 A CROSS JOIN L0 B),
L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B),
L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B),
L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B),
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS i FROM L4),
Dates AS(SELECT DATEADD(DAY,i-1,@D1) AS D FROM Nums where i <= 1+DATEDIFF(DAY,@D1,@D2)) ,
Stock As (
SELECT D ,t1.quantity - ISNULL(t2.solditems,0) AS itemsinstock
FROM Dates
LEFT OUTER JOIN @t1 t1 ON t1.periodend >= D and t1.periodstart <= D
LEFT OUTER JOIN @t2 t2 ON t2.periodend >= D and t2.periodstart <= D ),
NStock As (
select D,itemsinstock, ROW_NUMBER() over (order by D) - ROW_NUMBER() over (partition by itemsinstock order by D) AS G
from Stock)
SELECT MIN(D) AS periodstart, MAX(D) AS periodend, itemsinstock
FROM NStock
GROUP BY G, itemsinstock
ORDER BY periodstart
Hopefully a little easier to read than Martin's. I used different tables and sample data, hopefully extrapolating the right info:
CREATE TABLE [dbo].[Quantity](
[PeriodStart] [date] NOT NULL,
[PeriodEnd] [date] NOT NULL,
[Quantity] [int] NOT NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[SoldItems](
[PeriodStart] [date] NOT NULL,
[PeriodEnd] [date] NOT NULL,
[SoldItems] [int] NOT NULL
) ON [PRIMARY]
INSERT INTO Quantity (PeriodStart,PeriodEnd,Quantity)
SELECT '20100101','20100115',5
INSERT INTO SoldItems (PeriodStart,PeriodEnd,SoldItems)
SELECT '20100105','20100107',2 union all
SELECT '20100106','20100108',1
The actual query is now:
;WITH Dates as (
select PeriodStart as DateVal from SoldItems union select PeriodEnd from SoldItems union select PeriodStart from Quantity union select PeriodEnd from Quantity
), Periods as (
select d1.DateVal as StartDate, d2.DateVal as EndDate
from Dates d1 inner join Dates d2 on d1.DateVal < d2.DateVal left join Dates d3 on d1.DateVal < d3.DateVal and d3.DateVal < d2.DateVal where d3.DateVal is null
), QuantitiesSold as (
select StartDate,EndDate,COALESCE(SUM(si.SoldItems),0) as Quantity
from Periods p left join SoldItems si on p.StartDate < si.PeriodEnd and si.PeriodStart < p.EndDate
group by StartDate,EndDate
)
select StartDate,EndDate,q.Quantity - qs.Quantity
from QuantitiesSold qs inner join Quantity q on qs.StartDate < q.PeriodEnd and q.PeriodStart < qs.EndDate
And the result is:
StartDate EndDate (No column name)
2010-01-01 2010-01-05 5
2010-01-05 2010-01-06 3
2010-01-06 2010-01-07 2
2010-01-07 2010-01-08 4
2010-01-08 2010-01-15 5
Explanation: I'm using three Common Table Expressions. The first (Dates) is gathering all of the dates that we're talking about, from the two tables involved. The second (Periods) selects consecutive values from the Dates CTE. And the third (QuantitiesSold) then finds items in the SoldItems table that overlap these periods, and adds their totals together. All that remains in the outer select is to subtract these quantities from the total quantity stored in the Quantity Table
John, what you could do is a WHILE loop. Declare and initialise 2 variables before your loop, one being the start date and the other being end date. Your loop would then look like this:
WHILE(@StartDate <= @EndDate)
BEGIN
--processing goes here
SET @StartDate = DATEADD(DAY, 1, @StartDate)
END
You would need to store your period definitions in another table, so you could retrieve those and output rows when required to a temporary table.
Let me know if you need any more detailed examples, or if I've got the wrong end of the stick!
Damien,
I am trying to fully understand your solution and test it on a large data set, but I receive the following errors for your code.
Msg 102, Level 15, State 1, Line 20
Incorrect syntax near 'Dates'.
Msg 102, Level 15, State 1, Line 22
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Line 25
Incorrect syntax near ','.
Damien,
Based on your solution I also wanted to get a neat display for StockItems without overlapping dates. How about this solution?
CREATE TABLE [dbo].[SoldItems](
[PeriodStart] [datetime] NOT NULL,
[PeriodEnd] [datetime] NOT NULL,
[SoldItems] [int] NOT NULL
) ON [PRIMARY]
INSERT INTO SoldItems (PeriodStart,PeriodEnd,SoldItems)
SELECT '20100105','20100106',2 union all
SELECT '20100105','20100108',3 union all
SELECT '20100115','20100116',1 union all
SELECT '20100101','20100120',10
;WITH Dates as (
select PeriodStart as DateVal from SoldItems
union
select PeriodEnd from SoldItems
union
select PeriodStart from Quantity
union
select PeriodEnd from Quantity
), Periods as (
select d1.DateVal as StartDate, d2.DateVal as EndDate
from Dates d1
inner join Dates d2 on d1.DateVal < d2.DateVal
left join Dates d3 on d1.DateVal < d3.DateVal and
d3.DateVal < d2.DateVal where d3.DateVal is null
), QuantitiesSold as (
select StartDate,EndDate,SUM(si.SoldItems) as Quantity
from Periods p left join SoldItems si on p.StartDate < si.PeriodEnd and si.PeriodStart < p.EndDate
group by StartDate,EndDate
)
select StartDate,EndDate, qs.Quantity
from QuantitiesSold qs
where qs.quantity is not null