postgres remove blank values from grouping - postgresql

I have a rollup SQL statement that I'm trying to convert from Oracle to PostgreSQL. Overall the results look correct, except I'm getting a blank value in the grouping column and I'm not sure how to get rid of it.
Right now I have:
SELECT
COALESCE(CASE WHEN GROUPING(COUNTY) = 1 THEN 'TOTAL' else county::text END) as COUNTY
,COUNT(CASE WHEN STV_TO_GAS THEN 1 END) as STOVE_TO_GAS_SUM
,COUNT(CASE WHEN FIRE_TO_GAS THEN 1 END) as FIRE_TO_GAS_SUM
,COUNT(CASE WHEN PELLET_TO_GAS THEN 1 END) as PELLET_TO_GAS_SUM
,COUNT(CASE WHEN STV_TO_ELECTRIC THEN 1 END) as STOVE_TO_ELECTRIC_SUM
,COUNT(CASE WHEN FIRE_TO_ELECTRIC THEN 1 END) as FIRE_TO_ELECTRIC_SUM
,COUNT(CASE WHEN PELLET_TO_ELECTRIC THEN 1 END) as PELLET_TO_ELECTRIC_SUM
,COUNT(CASE WHEN MERIDIAN THEN 1 END) as WITHIN_MERIDIAN_SUM
,count(CASE WHEN hb357.stv_to_gas then 1 END) +
count(CASE WHEN hb357.fire_to_gas then 1 END) +
count(CASE WHEN hb357.pellet_to_gas then 1 END) +
count(CASE WHEN hb357.stv_to_electric then 1 END) +
count(CASE WHEN hb357.fire_to_electric then 1 END) +
count(CASE WHEN hb357.pellet_to_electric then 1 END) +
count(CASE WHEN hb357.meridian then 1 END ) AS county_totals
FROM woodburn.HB357
WHERE app_status IN ('pending','approved')
AND (COUNTY IS NOT NULL OR trim(COUNTY) <> '')
GROUP BY rollup (county)
but it's still returning a blank value in the county column.
I've also tried changing the first CASE expression to
COALESCE(CASE
WHEN GROUPING(COUNTY) = 1 THEN 'TOTAL'
WHEN TRIM(COUNTY) != '' AND COUNTY IS NOT NULL then county
END) as COUNTY
but it's returning a NULL row for county.
FINAL SOLUTION
After trying all things suggested I essentially implemented what Jorge Campos recommended by doing the following:
SELECT
COALESCE(CASE
WHEN GROUPING(COUNTY) = 1 THEN 'TOTAL'
WHEN TRIM(COUNTY) != '' then county
END) as COUNTY
,COUNT(CASE WHEN STV_TO_GAS THEN 1 END) as STOVE_TO_GAS_SUM
,COUNT(CASE WHEN FIRE_TO_GAS THEN 1 END) as FIRE_TO_GAS_SUM
,COUNT(CASE WHEN PELLET_TO_GAS THEN 1 END) as PELLET_TO_GAS_SUM
,COUNT(CASE WHEN STV_TO_ELECTRIC THEN 1 END) as STOVE_TO_ELECTRIC_SUM
,COUNT(CASE WHEN FIRE_TO_ELECTRIC THEN 1 END) as FIRE_TO_ELECTRIC_SUM
,COUNT(CASE WHEN PELLET_TO_ELECTRIC THEN 1 END) as PELLET_TO_ELECTRIC_SUM
,COUNT(CASE WHEN MERIDIAN THEN 1 END) as WITHIN_MERIDIAN_SUM
,count(CASE WHEN hb357.stv_to_gas then 1 END) +
count(CASE WHEN hb357.fire_to_gas then 1 END) +
count(CASE WHEN hb357.pellet_to_gas then 1 END) +
count(CASE WHEN hb357.stv_to_electric then 1 END) +
count(CASE WHEN hb357.fire_to_electric then 1 END) +
count(CASE WHEN hb357.pellet_to_electric then 1 END) +
count(CASE WHEN hb357.meridian then 1 END ) AS county_totals
FROM woodburn.HB357
WHERE app_status IN ('pending','approved')
AND COUNTY IS NOT NULL
AND TRIM(COUNTY) != ''
GROUP BY rollup (county)
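For anyone wondering why the first attempt let blanks through: COUNTY IS NOT NULL OR trim(COUNTY) <> '' is true for every non-NULL value, including an empty string, so the two checks have to be combined with AND. Here is a minimal sketch of the same idea, assuming PostgreSQL 9.5 or later; the inline sample rows and the county_label alias are made up for illustration and stand in for woodburn.HB357. It folds both checks into a single NULLIF test:

```sql
-- Hypothetical sample rows standing in for woodburn.HB357.
WITH sample (county, stv_to_gas) AS (
    VALUES ('Ada',    true),
           ('Ada',    false),
           ('',       true),   -- empty string
           ('   ',    true),   -- whitespace only
           (NULL,     true),   -- missing value
           ('Canyon', true)
)
SELECT
    CASE WHEN GROUPING(county) = 1 THEN 'TOTAL' ELSE county END AS county_label,
    COUNT(CASE WHEN stv_to_gas THEN 1 END) AS stove_to_gas_sum
FROM sample
-- NULLIF turns a blank (after TRIM) into NULL, so one test rejects both cases.
WHERE NULLIF(TRIM(county), '') IS NOT NULL
GROUP BY ROLLUP (county)
ORDER BY GROUPING(county), county;
```

With these rows only Ada, Canyon and the TOTAL row should come back. TRIM(county) <> '' on its own would also work, because the comparison yields NULL for a NULL county and WHERE discards such rows.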

Related

Postgres Count Function for monthly transactions, counting distinct only

I am counting the size of my teams within a department. All employees have an employee ID beginning with "E" and then a designating number (i.e. "0", "1", etc) to denote which team they are on.
I have the following query in Postgres to count the size of the teams, but the problem is that with this query, I get a lot of rows that are empty, because some months are duplicated. For example, the row containing "May/2016" may be duplicated 3 times, with only 1 row containing the actual team counts.
select to_char("Date", 'Mon/YYYY') as "Date",
sum(case when l_part LIKE 'E0%%' then count end) as "ACCOUNTING",
sum(case when l_part LIKE 'E1%%' then count end) as "SW",
sum(case when l_part LIKE 'E2%%' then count end) as "SUPPORT",
sum(case when l_part LIKE 'E3%%' then count end) as "CALLCENTER",
sum(case when l_part LIKE 'E4%%' then count end) as "ADMIN",
sum(case when l_part LIKE 'E5%%' then count end) as "MARKETING",
sum(case when l_part LIKE 'E9%%' then count end) as "MANAGEMENT"
from (
select left("Type",4)as l_part, count(*),"Date" from
"Transactions" group by "Date",l_part
) p group by "Date"
order by min("Date");
If I can just get the count down to one row per month/yyyy, and order by the date that would be helpful and less confusing. Any tweaks to my attempt appreciated.
Here is what populates, as an example using September 2015:
This is what I get:
DATE | ACCOUNTING | SW | SUPPORT | CALLCENTER | ADMIN | MARKETING |
Sep/15| | | | | | |
Sep/15| | | | | | |
Sep/15| 1 | 2 | 1 | 5 | 5 | 3 |
I suspect your issue is the GROUP BY clause, which I think is solved using DATE_TRUNC(). Not sure if you need the where clause.
SELECT
to_char(DATE_TRUNC('month',"Date"), 'Mon/YYYY') as "Date"
, SUM(CASE WHEN left("Type",4) LIKE 'E0%%' THEN 1 END) AS "ACCOUNTING"
, SUM(CASE WHEN left("Type",4) LIKE 'E1%%' THEN 1 END) AS "SW"
, SUM(CASE WHEN left("Type",4) LIKE 'E2%%' THEN 1 END) AS "SUPPORT"
, SUM(CASE WHEN left("Type",4) LIKE 'E3%%' THEN 1 END) AS "CALLCENTER"
, SUM(CASE WHEN left("Type",4) LIKE 'E4%%' THEN 1 END) AS "ADMIN"
, SUM(CASE WHEN left("Type",4) LIKE 'E5%%' THEN 1 END) AS "MARKETING"
, SUM(CASE WHEN left("Type",4) LIKE 'E9%%' THEN 1 END) AS "MANAGEMENT"
FROM "Transactions"
WHERE "Date" IS NOT NULL
GROUP BY
DATE_TRUNC('month',"Date")
ORDER BY
DATE_TRUNC('month',"Date")
btw: Instead of SUM(), an alternative using COUNT() would be:
SELECT
to_char(DATE_TRUNC('month',"Date"), 'Mon/YYYY') as "Date"
, COUNT(CASE WHEN left("Type",4) LIKE 'E0%%' THEN 1 END) AS "ACCOUNTING"
, COUNT(CASE WHEN left("Type",4) LIKE 'E1%%' THEN 1 END) AS "SW"
, COUNT(CASE WHEN left("Type",4) LIKE 'E2%%' THEN 1 END) AS "SUPPORT"
, COUNT(CASE WHEN left("Type",4) LIKE 'E3%%' THEN 1 END) AS "CALLCENTER"
, COUNT(CASE WHEN left("Type",4) LIKE 'E4%%' THEN 1 END) AS "ADMIN"
, COUNT(CASE WHEN left("Type",4) LIKE 'E5%%' THEN 1 END) AS "MARKETING"
, COUNT(CASE WHEN left("Type",4) LIKE 'E9%%' THEN 1 END) AS "MANAGEMENT"
COUNT() increments by one for any NON-NULL value it encounters.
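The duplicate rows in the question most likely come from grouping by the raw "Date" column: in PostgreSQL a bare name in GROUP BY resolves to the input column before the output alias, so each distinct day gets its own row even though they all format to the same Mon/YYYY label. As a sketch of the same fix, assuming PostgreSQL 9.4 or later, the aggregate FILTER clause can stand in for the CASE expressions. The inline tx rows below are invented for illustration and only three of the team columns are shown:

```sql
-- Hypothetical rows standing in for the "Transactions" table.
WITH tx ("Date", "Type") AS (
    VALUES (DATE '2015-09-03', 'E0-001'),
           (DATE '2015-09-17', 'E1-002'),
           (DATE '2015-09-17', 'E1-003'),
           (DATE '2015-10-02', 'E3-004')
)
SELECT
    to_char(date_trunc('month', "Date"), 'Mon/YYYY') AS "Date",
    COUNT(*) FILTER (WHERE "Type" LIKE 'E0%') AS "ACCOUNTING",
    COUNT(*) FILTER (WHERE "Type" LIKE 'E1%') AS "SW",
    COUNT(*) FILTER (WHERE "Type" LIKE 'E3%') AS "CALLCENTER"
FROM tx
GROUP BY date_trunc('month', "Date")   -- one group per calendar month
ORDER BY date_trunc('month', "Date");  -- chronological, not alphabetical
```

Unlike SUM(CASE ... THEN 1 END), which returns NULL when nothing matches, COUNT returns 0 for a team with no rows in that month.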

order by with operation on select case query

I want to rank the rows of a table by the sum of several point values, where each point value is decided by a column's value.
For example, I can get the per-column points with the query below.
select
(case when seller=true then 50 else 0 end) as sel,
(case when buyer=true then 40 else 0 end) as buy
from company;
but I can't order by these values with a query like this:
select
(case when seller=true then 50 else 0 end) as sel,
(case when buyer=true then 40 else 0 end) as buy
from company
order by (sel + buy);
or this
select
(case when seller=true then 50 else 0 end) as sel,
(case when buyer=true then 40 else 0 end) as buy,
(sell, buy) as sm
from company
order by sm;
How can I do that?
Oh, sorry, I found the answer.
select * from
(select
(case when seller=true then 50 else 0 end) as sel,
(case when buyer=true then 40 else 0 end) as buy
from company) as tmp
order by (tmp.sel + tmp.buy);
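The first attempt fails because, in PostgreSQL at least (the question does not name the database), an output alias such as sel can be used in ORDER BY only on its own, not inside an expression like sel + buy. Wrapping the query in a derived table turns the aliases into real columns. A small variation, sketched here with made-up company rows and a hypothetical name column, also returns the summed score so the ranking is visible:

```sql
-- Hypothetical rows standing in for the company table.
WITH company (name, seller, buyer) AS (
    VALUES ('a', true,  false),
           ('b', true,  true),
           ('c', false, true)
)
SELECT name, sel, buy, sel + buy AS score
FROM (
    SELECT name,
           CASE WHEN seller THEN 50 ELSE 0 END AS sel,
           CASE WHEN buyer  THEN 40 ELSE 0 END AS buy
    FROM company
) AS tmp
ORDER BY score DESC;   -- a bare output alias is allowed here
```

With these rows, b (90) sorts first, then a (50), then c (40).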

How to get data on a single row

I have a table called RUGS with the data below. How do I write a T-SQL query to get the data as shown in the output? I am not familiar with UNPIVOT.
`cono ARtype days Revenue PPD
140 MCD 5 1000 500
140 MRA 6 2000 600
140 MRA 7 3000 700
141 MCD 1 5000 100
141 MRA 2 6000 200
141 MRA 3 7000 300`
Result
140 MCD 5 1000 500 MRA 6 2000 600 MRA 7 3000 700
141 MCD 1 5000 100 MRA 2 6000 200 MRA 3 7000 300
Given that every cono will have exactly 3 records (as stated in the comments), a cte with row_number can be used with case statements.
If any cono has fewer than three records, you will see blanks and zeroes in the results. Any with more than three will not have all records represented.
Here is an example with @RUGS as a table variable:
declare @RUGS table (cono int, ARType char(3), [days] int, Revenue int, PPD int)
insert into @RUGS VALUES
(140,'MCD',5,1000,500)
,(140,'MRA',6,2000,600)
,(140,'MRA',7,3000,700)
,(141,'MCD',1,5000,100)
,(141,'MRA',2,6000,200)
,(141,'MRA',3,7000,300);
with cte as
(
select row_number() over(partition by cono order by (select 1)) as rn, * from @RUGS
)
)
select cono,
max(case when rn = 1 then ARType else '' end) as ARType1,
max(case when rn = 1 then days else '' end) as days1,
max(case when rn = 1 then Revenue else '' end) as Revenue1,
max(case when rn = 1 then PPD else '' end) as PPD1,
max(case when rn = 2 then ARType else '' end) as ARType2,
max(case when rn = 2 then days else '' end) as days2,
max(case when rn = 2 then Revenue else '' end) as Revenue2,
max(case when rn = 2 then PPD else '' end) as PPD2,
max(case when rn = 3 then ARType else '' end) as ARType3,
max(case when rn = 3 then days else '' end) as days3,
max(case when rn = 3 then Revenue else '' end) as Revenue3,
max(case when rn = 3 then PPD else '' end) as PPD3
from cte group by cono
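Note that order by (select 1) leaves the row order within each cono up to the server, so which record lands in slot 1, 2 or 3 is not guaranteed. Below is a sketch of a deterministic variant. It assumes that [days] is an acceptable ordering key (the question does not say so), and it uses NULL instead of '' so that the integer columns are not implicitly converted to 0:

```sql
declare @RUGS table (cono int, ARType char(3), [days] int, Revenue int, PPD int);
insert into @RUGS values
 (140,'MCD',5,1000,500), (140,'MRA',6,2000,600), (140,'MRA',7,3000,700),
 (141,'MCD',1,5000,100), (141,'MRA',2,6000,200), (141,'MRA',3,7000,300);

with cte as
(
    -- Assumption: ordering by [days] reproduces the desired slot order.
    select row_number() over (partition by cono order by [days]) as rn, *
    from @RUGS
)
select cono,
    max(case when rn = 1 then ARType end)  as ARType1,
    max(case when rn = 1 then [days] end)  as days1,
    max(case when rn = 1 then Revenue end) as Revenue1,
    max(case when rn = 1 then PPD end)     as PPD1,
    max(case when rn = 2 then ARType end)  as ARType2,
    max(case when rn = 2 then [days] end)  as days2,
    max(case when rn = 2 then Revenue end) as Revenue2,
    max(case when rn = 2 then PPD end)     as PPD2,
    max(case when rn = 3 then ARType end)  as ARType3,
    max(case when rn = 3 then [days] end)  as days3,
    max(case when rn = 3 then Revenue end) as Revenue3,
    max(case when rn = 3 then PPD end)     as PPD3
from cte
group by cono
order by cono;
```

A cono with fewer than three records then shows NULL in the unused slots rather than a blank or a zero.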

How to optimize for both scenarios when a join is faster than dense_rank only when the result set is small?

When unfiltered, dense_rank on FeedDeliveryNutrients.NutrientID over 150,000 rows is 3.5x faster than joining to Nutrients with row_number on Nutrients.ID and using the joined row number. When filtered to a specific flock, joining with row_number is 9x faster.
Is there any optimization technique that could get the best of both worlds in a single query?
Fastest when unfiltered (150,000 rows returned):
select
FeedDeliveries.FlockID,
FeedDeliveryID,
DeliveryLb,
Bin,
DeliveryDate,
FormulaID,
FeedEnergy,
Nutrient1, Nutrient2, Nutrient3, Nutrient4, Nutrient5, Nutrient6, Nutrient7, Nutrient8, Nutrient9, Nutrient10, Nutrient11, Nutrient12, Nutrient13, Nutrient14, Nutrient15
from (
select
FeedDeliveryID,
sum(case when dense_rank = 1 then Amount end) as Nutrient1,
sum(case when dense_rank = 2 then Amount end) as Nutrient2,
sum(case when dense_rank = 3 then Amount end) as Nutrient3,
sum(case when dense_rank = 4 then Amount end) as Nutrient4,
sum(case when dense_rank = 5 then Amount end) as Nutrient5,
sum(case when dense_rank = 6 then Amount end) as Nutrient6,
sum(case when dense_rank = 7 then Amount end) as Nutrient7,
sum(case when dense_rank = 8 then Amount end) as Nutrient8,
sum(case when dense_rank = 9 then Amount end) as Nutrient9,
sum(case when dense_rank = 10 then Amount end) as Nutrient10,
sum(case when dense_rank = 11 then Amount end) as Nutrient11,
sum(case when dense_rank = 12 then Amount end) as Nutrient12,
sum(case when dense_rank = 13 then Amount end) as Nutrient13,
sum(case when dense_rank = 14 then Amount end) as Nutrient14,
sum(case when dense_rank = 15 then Amount end) as Nutrient15
from (select *, dense_rank() over (partition by FeedDeliveryID order by NutrientID) as dense_rank from dbo.FeedDeliveryNutrients) n
group by FeedDeliveryID
) pvt
join dbo.FeedDeliveries on FeedDeliveries.ID = FeedDeliveryID
Fastest when filtered by dbo.FeedDeliveries.FlockID (~100 rows returned):
select
FeedDeliveries.FlockID,
FeedDeliveryID,
DeliveryLb,
Bin,
DeliveryDate,
FormulaID,
FeedEnergy,
Nutrient1, Nutrient2, Nutrient3, Nutrient4, Nutrient5, Nutrient6, Nutrient7, Nutrient8, Nutrient9, Nutrient10, Nutrient11, Nutrient12, Nutrient13, Nutrient14, Nutrient15
from (
select
FeedDeliveryID,
sum(case when n.row_number = 1 then Amount end) as Nutrient1,
sum(case when n.row_number = 2 then Amount end) as Nutrient2,
sum(case when n.row_number = 3 then Amount end) as Nutrient3,
sum(case when n.row_number = 4 then Amount end) as Nutrient4,
sum(case when n.row_number = 5 then Amount end) as Nutrient5,
sum(case when n.row_number = 6 then Amount end) as Nutrient6,
sum(case when n.row_number = 7 then Amount end) as Nutrient7,
sum(case when n.row_number = 8 then Amount end) as Nutrient8,
sum(case when n.row_number = 9 then Amount end) as Nutrient9,
sum(case when n.row_number = 10 then Amount end) as Nutrient10,
sum(case when n.row_number = 11 then Amount end) as Nutrient11,
sum(case when n.row_number = 12 then Amount end) as Nutrient12,
sum(case when n.row_number = 13 then Amount end) as Nutrient13,
sum(case when n.row_number = 14 then Amount end) as Nutrient14,
sum(case when n.row_number = 15 then Amount end) as Nutrient15
from dbo.FeedDeliveryNutrients
join (select *, row_number() over (order by ID) as row_number from dbo.Nutrients) n on n.ID = NutrientID
group by FeedDeliveryID
) pvt
join dbo.FeedDeliveries on FeedDeliveries.ID = FeedDeliveryID
You already have the answer: optimize for the most critical scenario.
At first glance I would optimize for the 150,000-row scenario, but you should check what actually happens on your production server. If the 150,000-row query only runs a few times per week and the 100-row query hits your server many times per minute, then the 100-row query is the more critical one for you.

Sum up items between setup of custom times

We need to count the number of items that occur within 10 minutes before and 10 minutes after the hour, by day. We have a table that tracks the items individually. Ideally I would like the output to look something like the below, but I am totally open to other suggestions.
Table - Attendance
Att_item timestamp
1 2012-09-12 18:08:00
2 2012-09-01 23:26:00
3 2012-09-23 09:33:00
4 2012-09-11 09:43:00
5 2012-09-06 05:57:00
6 2012-09-17 19:26:00
7 2012-09-06 10:51:00
8 2012-09-19 09:42:00
9 2012-09-06 13:55:00
10 2012-09-05 07:26:00
11 2012-09-02 03:08:00
12 2012-09-19 12:17:00
13 2012-09-12 18:14:00
14 2012-09-12 18:14:00
Output
Date Timeslot_5pm Timeslot_6pm Timeslot_7pm
9/11/2012 11 22 22
9/12/2012 30 21 55
9/13/2012 44 33 44
Your requirements are not totally clear, but if you only want to count the number of records in the 20 minute window:
select cast(tstmp as date) date,
sum(case when datepart(hour, tstmp) = 1 then 1 else 0 end) Timeslot_1am,
sum(case when datepart(hour, tstmp) = 2 then 1 else 0 end) Timeslot_2am,
sum(case when datepart(hour, tstmp) = 3 then 1 else 0 end) Timeslot_3am,
sum(case when datepart(hour, tstmp) = 4 then 1 else 0 end) Timeslot_4am,
sum(case when datepart(hour, tstmp) = 5 then 1 else 0 end) Timeslot_5am,
sum(case when datepart(hour, tstmp) = 6 then 1 else 0 end) Timeslot_6am,
sum(case when datepart(hour, tstmp) = 7 then 1 else 0 end) Timeslot_7am,
sum(case when datepart(hour, tstmp) = 8 then 1 else 0 end) Timeslot_8am,
sum(case when datepart(hour, tstmp) = 9 then 1 else 0 end) Timeslot_9am,
sum(case when datepart(hour, tstmp) = 10 then 1 else 0 end) Timeslot_10am,
sum(case when datepart(hour, tstmp) = 11 then 1 else 0 end) Timeslot_11am,
sum(case when datepart(hour, tstmp) = 12 then 1 else 0 end) Timeslot_12pm,
sum(case when datepart(hour, tstmp) = 13 then 1 else 0 end) Timeslot_1pm,
sum(case when datepart(hour, tstmp) = 14 then 1 else 0 end) Timeslot_2pm,
sum(case when datepart(hour, tstmp) = 15 then 1 else 0 end) Timeslot_3pm,
sum(case when datepart(hour, tstmp) = 16 then 1 else 0 end) Timeslot_4pm,
sum(case when datepart(hour, tstmp) = 17 then 1 else 0 end) Timeslot_5pm,
sum(case when datepart(hour, tstmp) = 18 then 1 else 0 end) Timeslot_6pm,
sum(case when datepart(hour, tstmp) = 19 then 1 else 0 end) Timeslot_7pm,
sum(case when datepart(hour, tstmp) = 20 then 1 else 0 end) Timeslot_8pm,
sum(case when datepart(hour, tstmp) = 21 then 1 else 0 end) Timeslot_9pm,
sum(case when datepart(hour, tstmp) = 22 then 1 else 0 end) Timeslot_10pm,
sum(case when datepart(hour, tstmp) = 23 then 1 else 0 end) Timeslot_11pm
from yourtable
where datepart(minute, tstmp) >= 50
or datepart(minute, tstmp) <= 10
group by cast(tstmp as date)
If you want to count the number of records within each hour plus the records that are in the >=50 and <= 10 timeframe, then you will have to adjust this.
This does just one column (well 4 but you get my point).
select DATEPART(YYYY, FTSdate) as [year], DATEPART(mm, FTSdate) as [month]
, DATEPART(dd, FTSdate) as [day], DATEPART(hh, FTSdate) as [hour], COUNT(*)
from [Gabe2a].[dbo].[docSVsys]
where DATEPART(mi, FTSdate) >= 50 or DATEPART(mi, FTSdate) <= 10
group by DATEPART(YYYY, FTSdate), DATEPART(mm, FTSdate), DATEPART(dd, FTSdate), DATEPART(hh, FTSdate)
order by DATEPART(YYYY, FTSdate), DATEPART(mm, FTSdate), DATEPART(dd, FTSdate), DATEPART(hh, FTSdate)
Separate columns.
select DATEPART(YYYY, FTSdate) as [year], DATEPART(mm, FTSdate) as [month]
, DATEPART(dd, FTSdate) as [day]
, sum(case when DATEPART(hh, FTSdate) = '0' then 1 else 0 end) as [0:00] -- midnight
, sum(case when DATEPART(hh, FTSdate) = '1' then 1 else 0 end) as [1:00]
, sum(case when DATEPART(hh, FTSdate) = '2' then 1 else 0 end) as [2:00]
, sum(case when DATEPART(hh, FTSdate) = '3' then 1 else 0 end) as [3:00]
, sum(case when DATEPART(hh, FTSdate) = '4' then 1 else 0 end) as [4:00]
from [Gabe2a].[dbo].[docSVsys]
where DATEPART(mi, FTSdate) >= 50 or DATEPART(mi, FTSdate) <= 10
group by DATEPART(YYYY, FTSdate), DATEPART(mm, FTSdate), DATEPART(dd, FTSdate)
order by DATEPART(YYYY, FTSdate), DATEPART(mm, FTSdate), DATEPART(dd, FTSdate)
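Both answers bucket by the clock hour, so a record at 17:55 is counted under 5pm even though it lies in the window around 6pm. If the intent is a window centred on each hour (an assumption, since the question leaves this open), one possible adjustment is to shift each timestamp forward by 10 minutes before taking the hour. Here is a sketch in T-SQL, with a hypothetical @Attendance table variable, a few of the question's sample rows, and the column name tstmp borrowed from the first answer:

```sql
-- Hypothetical table variable holding a few of the question's sample rows.
declare @Attendance table (Att_item int, tstmp datetime);
insert into @Attendance values
 (1,  '2012-09-12T18:08:00'),
 (5,  '2012-09-06T05:57:00'),
 (7,  '2012-09-06T10:51:00'),
 (9,  '2012-09-06T13:55:00'),
 (11, '2012-09-02T03:08:00'),
 (13, '2012-09-12T18:14:00');

with shifted as
(
    -- Moving every timestamp forward 10 minutes maps (hh-1):50 .. hh:10
    -- onto hh:00 .. hh:20, so the hour part now names the slot.
    select dateadd(minute, 10, tstmp) as slot_ts
    from @Attendance
)
select cast(slot_ts as date) as [date],
    sum(case when datepart(hour, slot_ts) = 3  then 1 else 0 end) as Timeslot_3am,
    sum(case when datepart(hour, slot_ts) = 6  then 1 else 0 end) as Timeslot_6am,
    sum(case when datepart(hour, slot_ts) = 11 then 1 else 0 end) as Timeslot_11am,
    sum(case when datepart(hour, slot_ts) = 14 then 1 else 0 end) as Timeslot_2pm,
    sum(case when datepart(hour, slot_ts) = 18 then 1 else 0 end) as Timeslot_6pm
from shifted
where datepart(minute, slot_ts) <= 20   -- original minute was >= 50 or <= 10
group by cast(slot_ts as date)
order by cast(slot_ts as date);
```

Here the 05:57 record counts toward the 6am slot, and the 18:14 record is excluded because it falls outside the 20-minute window. Only a handful of slot columns are shown; the remaining hours would follow the same pattern.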