Merging several joins into a single select / case - tsql

I am currently querying a table 4 times using different criteria and then using left joins as part of a much larger query to return all the data. The larger query does not run particularly quickly and I am fairly certain that my current approach is not efficient.
What I am wondering is if it is possible to somehow use a CASE statement to increment one of the 4 columns. My 4 queries currently are:
SELECT ts.department,
Sum([hours]) AS ChargeableTimeYTD
FROM bwbfiles.sos.timesummary ts
WHERE category = 'C'
AND [year] = '2019'
GROUP BY department
SELECT ts.department,
Sum([hours]) AS ChargeableTimeMTD
FROM bwbfiles.sos.timesummary ts
WHERE category = 'C'
AND [year] = '2019'
AND [period] = 4
GROUP BY department
SELECT ts.department,
Sum([hours]) AS NonChargeableTimeProBono
FROM bwbfiles.sos.timesummary ts
WHERE category = 'NC'
AND ( [act_code] = '001N'
OR [act_code] = '00N6' )
AND [year] = '2019'
GROUP BY department
SELECT ts.department,
Sum([hours]) AS NonChargeableTimeNonProBono
FROM bwbfiles.sos.timesummary ts
WHERE category = 'NC'
AND ( [act_code] <> '001N'
AND [act_code] <> '00N6' )
AND [year] = '2019'
GROUP BY department
The aim would be to end up with a query result with 5 columns
Department, ChargeableTimeYTD, ChargeableTimeMTD, NonChargeableTimeProBono, NonChargeableTimeNonProBono
Or instead of CASE would I remove the group by department from each bit and have a query that produced 3 columns
Department, Hours, Category (where Category is ChargeableTimeYTD/ChargeableTimeMTD etc...etc...) and then pivot that into 5 columns.
Thanks in advance!

This may do what I think you're asking for:
SELECT ts.department,
Sum(case when category = 'C' then [hours] else 0 end) AS ChargeableTimeYTD,
Sum(case when category = 'C' and [period] = 4 then [hours] else 0 end) AS ChargeableTimeMTD,
Sum(case when category = 'NC' and ([act_code] = '001N' or [act_code] = '00N6') then [hours] else 0 end) AS NonChargeableTimeProBono,
Sum(case when category = 'NC' and ([act_code] <> '001N' and [act_code] <> '00N6') then [hours] else 0 end) AS NonChargeableTimeNonProBono
FROM bwbfiles.sos.timesummary ts
where [year] = '2019'
and [category] in ('C','NC')
GROUP BY department
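
To check the conditional-aggregation idea above on a small scale, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. The sample rows and the act_code values 'A100' and '0TRN' are made up for illustration, and SQLite stands in for SQL Server, so the bracket quoting and the database/schema prefix are dropped. Note that the "non pro bono" column needs both codes excluded (NOT IN, or two <> tests joined with AND).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE timesummary (
        department TEXT, category TEXT, act_code TEXT,
        year TEXT, period INTEGER, hours REAL
    )
""")
# Hypothetical sample data: one department, a mix of chargeable and non-chargeable time
conn.executemany(
    "INSERT INTO timesummary VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Tax", "C",  "A100", "2019", 3, 10.0),
        ("Tax", "C",  "A100", "2019", 4, 5.0),
        ("Tax", "NC", "001N", "2019", 4, 2.0),   # pro bono
        ("Tax", "NC", "00N6", "2019", 2, 1.0),   # pro bono
        ("Tax", "NC", "0TRN", "2019", 4, 4.0),   # other non-chargeable
        ("Tax", "C",  "A100", "2018", 4, 99.0),  # wrong year, removed by WHERE
    ],
)

# One pass over the table; each SUM only adds the rows its CASE lets through
query = """
    SELECT department,
           SUM(CASE WHEN category = 'C' THEN hours ELSE 0 END) AS ChargeableTimeYTD,
           SUM(CASE WHEN category = 'C' AND period = 4 THEN hours ELSE 0 END) AS ChargeableTimeMTD,
           SUM(CASE WHEN category = 'NC' AND act_code IN ('001N', '00N6')
                    THEN hours ELSE 0 END) AS NonChargeableTimeProBono,
           SUM(CASE WHEN category = 'NC' AND act_code NOT IN ('001N', '00N6')
                    THEN hours ELSE 0 END) AS NonChargeableTimeNonProBono
    FROM timesummary
    WHERE year = '2019' AND category IN ('C', 'NC')
    GROUP BY department
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With this data the single row for 'Tax' should carry 15 chargeable hours for the year, 5 for period 4, 3 pro bono hours and 4 other non-chargeable hours.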

Related

Aggregating columns and getting count of values as row

I have a query output as below:
ID      ID2       Working  Leave  Off Day
14595   76885302  10       0      0
178489  78756208  0        0      1
178489  78756208  0        1      0
I want to receive an output like below:
ID      ID2       code     value
14595   76885302  Working  10
178489  78756208  Off day  1
178489  78756208  Leave    1
My query is like below:
select tei.organisationunitid,pi.trackedentityinstanceid as tei,
count(case when tedv.value = 'Working' then tedv.value end) Working,
count(case when tedv.value = 'Off day' then tedv.value end) Offday,
count(case when tedv.value = 'Leave' then tedv.value end) Leave
from programstageinstance psi
inner join programinstance pi on pi.programinstanceid = psi.programinstanceid
inner join trackedentitydatavalue tedv on tedv.programstageinstanceid = psi.programstageinstanceid
inner join dataelement de on de.dataelementid = tedv.dataelementid
inner join trackedentityinstance tei on tei.trackedentityinstanceid = pi.trackedentityinstanceid
where psi.executiondate between '2017-01-01' and '2019-06-01'
and de.uid in ('x2222EGfY4K')
and psi.programstageid in (select programstageid
from programstage
where uid = 'CLoZpO22228')
and tei.organisationunitid in (select organisationunitid
from organisationunit
where path like '%Spd2222fvPr%')
group by pi.trackedentityinstanceid,de.uid,tei.organisationunitid,tedv.value
How can I achieve this?
I would try the JSON approach. I made a step-by-step fiddle:
demo:db<>fiddle
SELECT
id, id2,
elements ->> 'code' AS code,
SUM((elements ->> 'value')::int) AS value
FROM (
SELECT
id,
id2,
json_build_object('code', 'working', 'value', working) AS working,
json_build_object('code', 'leave', 'value', leave) AS leave,
json_build_object('code', 'off_day', 'value', off_day) AS off_day
FROM
mytable
) s,
unnest(ARRAY[working, leave, off_day]) as elements
GROUP BY 1,2,3
HAVING SUM((elements ->> 'value')::int) > 0
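
For comparison, the same unpivot can be sketched without JSON by stacking one SELECT per column with UNION ALL. This is a different technique from the answer above, shown only as a portable alternative; it is run here on SQLite from Python, and the table name mytable and its column names follow the answer's assumptions about the wide output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, id2 INTEGER, working INTEGER, leave INTEGER, off_day INTEGER)"
)
# The three rows from the question's current output
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (14595, 76885302, 10, 0, 0),
        (178489, 78756208, 0, 0, 1),
        (178489, 78756208, 0, 1, 0),
    ],
)

# Each branch turns one wide column into (code, val) rows; the outer query sums them
# per id/code and drops the codes whose total is zero.
query = """
    SELECT id, id2, code, SUM(val) AS total
    FROM (
        SELECT id, id2, 'Working' AS code, working AS val FROM mytable
        UNION ALL
        SELECT id, id2, 'Leave', leave FROM mytable
        UNION ALL
        SELECT id, id2, 'Off day', off_day FROM mytable
    ) AS unpivoted
    GROUP BY id, id2, code
    HAVING SUM(val) > 0
    ORDER BY id, code
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The trade-off is that the table is read once per column, which is usually acceptable for a handful of columns.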

SUM(CASE WHEN ...) returns a greater number than COUNT(DISTINCT..)

I have written the same query in two ways, but I can't figure out why the second version returns a greater number than the first. The number returned by the first version, which uses COUNT(DISTINCT ...), is correct:
WITH types(id) AS (VALUES('{1, 4, 5, 3}'::INTEGER[])),
date_gen64 AS
(
SELECT CAST (generate_series(date '10/1/2017', date '11/15/2017', interval
'1 day') AS date) as days ORDER BY days)
SELECT cl.class_date AS c_date,
count(DISTINCT (CASE WHEN co.id = 1 THEN p.id END)),
count(DISTINCT (CASE WHEN co.id = 2 THEN p.id END))
FROM person p
JOIN envelope e ON e.personID = p.id
JOIN "class" cl on cl.id = p.classID
JOIN course co ON co.id = cl.course_id AND co.id = 1
JOIN types ON cr.type_id = ANY (types.id)
RIGHT JOIN date_gen64 dg ON dg.days = cl.class_date
GROUP BY cl.class_date
ORDER BY cl.class_date
The above query returns 26, but the following query returns 27!
I rewrote it with SUM because the first query was too slow.
My question is: why does the second one count more?
WITH types(id) AS (VALUES('{1, 4, 5, 3}'::INTEGER[]))
SELECT tmpcl.days,
SUM(CASE WHEN tmp80.course_id = 1 THEN 1
ELSE 0 END),
SUM(CASE WHEN tmp80.course_id = 2 THEN 1
ELSE 0 END)
FROM (
SELECT CAST (generate_series(date '10/1/2017', date '11/15/2017',
interval '1 day') AS date) as days ORDER BY days) tmpcl
LEFT JOIN (
SELECT DISTINCT p.id AS "person_id",
cl.class_date AS c_date,
co.id AS "course_id"
FROM person p
JOIN envelope e ON e.personID = p.id
JOIN "class" cl on cl.id = p.classID
JOIN course co ON co.id = cl.course_id
JOIN types ON cr.type_id = ANY (types.id)
WHERE co.id IN ( 1 , 2 )
) tmp80 ON tmpcl.days = tmp80.class_date
GROUP BY tmpcl.days
ORDER BY tmpcl.days
You can theoretically have multiple people enrolled in the same class on the same day. Indeed that would seem to be the main point of having classes. So each time there are multiple people assigned to the same class on the same day you can have a higher count than you would in your first query. Does that make sense?
You don't appear to be using p.id in that inner query so simply remove it and your counts should match.
WITH types(id) AS (VALUES('{1, 4, 5, 3}'::INTEGER[]))
SELECT tmpcl.days,
SUM(CASE WHEN tmp80.course_id = 1 THEN 1
ELSE 0 END),
SUM(CASE WHEN tmp80.course_id = 2 THEN 1
ELSE 0 END)
FROM (
SELECT CAST (generate_series(date '10/1/2017', date '11/15/2017',
interval '1 day') AS date) as days ORDER BY days) tmpcl
LEFT JOIN (
SELECT DISTINCT cl.class_date AS c_date,
co.id AS "course_id"
FROM person p
JOIN envelope e ON e.personID = p.id
JOIN "class" cl on cl.id = p.classID
JOIN course co ON co.id = cl.course_id
JOIN types ON cr.type_id = ANY (types.id)
WHERE co.id IN ( 1 , 2 )
) tmp80 ON tmpcl.days = tmp80.c_date
GROUP BY tmpcl.days
ORDER BY tmpcl.days
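
As a general illustration of why the two styles can disagree: SUM(CASE ...) counts rows, while COUNT(DISTINCT ...) counts distinct values, so they only match when the rows being summed are already unique for the thing you want to count. The sketch below uses a deliberately simplified, hypothetical schema (no dates or classes) on SQLite from Python; it is not a reproduction of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person   (id INTEGER, course_id INTEGER);
    CREATE TABLE envelope (person_id INTEGER);
    INSERT INTO person   VALUES (1, 1), (2, 1), (3, 2);
    -- person 1 has two envelopes, so the join emits that person twice
    INSERT INTO envelope VALUES (1), (1), (2), (3);
""")

# Counts distinct people per course: join duplicates do not matter
distinct_counts = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN p.course_id = 1 THEN p.id END),
           COUNT(DISTINCT CASE WHEN p.course_id = 2 THEN p.id END)
    FROM person p JOIN envelope e ON e.person_id = p.id
""").fetchone()

# Counts joined rows per course: person 1 is counted twice
row_sums = conn.execute("""
    SELECT SUM(CASE WHEN p.course_id = 1 THEN 1 ELSE 0 END),
           SUM(CASE WHEN p.course_id = 2 THEN 1 ELSE 0 END)
    FROM person p JOIN envelope e ON e.person_id = p.id
""").fetchone()

# Deduplicating on exactly the counted key first makes SUM agree again
deduped_sums = conn.execute("""
    SELECT SUM(CASE WHEN course_id = 1 THEN 1 ELSE 0 END),
           SUM(CASE WHEN course_id = 2 THEN 1 ELSE 0 END)
    FROM (SELECT DISTINCT p.id, p.course_id
          FROM person p JOIN envelope e ON e.person_id = p.id) AS d
""").fetchone()

print(distinct_counts, row_sums, deduped_sums)
```

So when the SUM version comes out higher, it is worth checking which columns in the inner SELECT DISTINCT still let the same person through more than once per group.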

How to append CTE results to main query output?

I've created a TSQL query that pulls from two sets of tables in my database. The tables in the Common Table Expression are different from the tables in the main query. I'm joining on MRN and need the end result to contain accounts from both sets of tables. I've written the following query to this end:
with cteHosp as(
select Distinct p.EncounterNumber, p.MRN, p.AdmitAge
from HospitalPatients p
inner join Eligibility e on p.MRN = e.MRN
inner join HospChgDtl c on p.pt_id = c.pt_id
inner join HospitalDiagnoses d on p.pt_id = d.pt_id
where p.AdmitAge >=12
and d.dx_cd in ('G89.4','R52.1','R52.2','Z00.129')
)
Select Distinct a.AccountNo, a.dob, DATEDIFF(yy, a.dob, GETDATE()) as Age
from RHCCPTDetail c
inner join RHCAppointments a on c.ClaimID = a.ClaimID
inner join Eligibility e on c.hl7Id = e.MRN
full outer join cteHosp on e.MRN = cteHosp.MRN
where DATEDIFF(yy, a.dob, getdate()) >= 12
and left(c.PriDiag,7) in ('G89.4','R52.1','R52.2', 'Z00.129')
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode2,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode3,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
or (
DATEDIFF(yy, a.dob, getdate()) >= 12
and LEFT(c.DiagCode4,7) in ('G89.4','R52.1','R52.2','Z00.129')
)
order by AccountNo
How do I merge together the output of both the common table expression and the main query into one set of results?
In T-SQL, MERGE performs inserts, updates, or deletes. I believe you want to join the CTE instead. If so, here is an example.
Notice that cteBatch is joined to the main query below.
with
cteBatch (BatchID,BatchDate,Creator,LogID)
as
(
select
BatchID
,dateadd(day,right(BatchID,3) -1,
cast(cast(left(BatchID,4) as varchar(4))
+ '-01-01' as date)) BatchDate
,Creator
,LogID
from tblPriceMatrixBatch b
unpivot
(
LogID
for Logs in (LogIDISI,LogIDTG,LogIDWeb)
)u
)
Select
0 as isCurrent
,i.InterfaceID
,i.InterfaceName
,b.BatchID
,b.BatchDate
,case when isdate(l.start) = 0 and isdate(l.[end]) = 0 then 'Scheduled'
when isdate(l.start) = 1 and isdate(l.[end]) = 0 then 'Running'
when isdate(l.start) = 1 and isdate(l.[end]) = 1 and isnull(l.haserror,0) = 1 then 'Failed'
when isdate(l.start) = 1 and isdate(l.[end]) = 1 and isnull(l.haserror,0) != 1 then 'Success'
else 'idunno' end as stat
,l.Start as StartTime
,l.[end] as CompleteTime
,b.Creator as Usr
from EOCSupport.dbo.Interfaces i
join EOCSupport.dbo.Logs l
on i.InterfaceID = l.InterfaceID
join cteBatch b
on b.logid = l.LogID
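
If the goal is to append the CTE's rows to the main query's rows rather than join them, one option the answer above does not cover is UNION ALL (or UNION to drop exact duplicates). A minimal sketch, assuming both sides can be reduced to the same column list; the tables hospital_patients and clinic_patients and the source column are hypothetical, and SQLite is used from Python in place of SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hospital_patients (mrn TEXT, age INTEGER);
    CREATE TABLE clinic_patients   (mrn TEXT, age INTEGER);
    INSERT INTO hospital_patients VALUES ('A1', 15), ('A2', 9), ('A3', 40);
    INSERT INTO clinic_patients   VALUES ('A3', 40), ('B1', 13), ('B2', 11);
""")

# The CTE and the main query each produce (mrn, age, source);
# UNION ALL stacks the two result sets into one.
query = """
    WITH cte_hosp AS (
        SELECT mrn, age, 'hospital' AS source
        FROM hospital_patients
        WHERE age >= 12
    )
    SELECT mrn, age, 'clinic' AS source
    FROM clinic_patients
    WHERE age >= 12
    UNION ALL
    SELECT mrn, age, source
    FROM cte_hosp
    ORDER BY mrn, source
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Patient 'A3' appears once per source here; switching to UNION and dropping the source column would collapse such rows into one.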

how to select top 10 without duplicates

Using SQL Server 2012
I need to select the TOP 10 producers based on ProducerCode. But the data is messy: users entered the same producer with different spellings under the same ProducerCode.
So I just need the TOP 10, and when a ProducerCode repeats, I just want to pick the first one in the list.
How can I achieve that?
Sample of my data
;WITH cte_TopWP --T
AS
(
SELECT distinct ProducerCode, Producer,SUM(premium) as NetWrittenPremium,
SUM(CASE WHEN PolicyType = 'New Business' THEN Premium ELSE 0 END) as NewBusiness1,
SUM(CASE WHEN PolicyType = 'Renewal' THEN Premium ELSE 0 END) as Renewal1,
SUM(CASE WHEN PolicyType = 'Rewrite' THEN Premium ELSE 0 END) as Rewrite1
FROM ProductionReportMetrics
WHERE YEAR(EffectiveDate) = 2016 AND TransactionType = 'Policy' AND CompanyLine = 'Arch Insurance Company'--AND ProducerType = 'Wholesaler'
GROUP BY ProducerCode,Producer
)
,
cte_Counts --C
AS
(
SELECT distinct ProducerCode, ProducerName, COUNT (distinct ControlNo) as Submissions2,
SUM(CASE WHEN QuotedPremium IS NOT NULL THEN 1 ELSE 0 END) as Quoted2,
SUM(CASE WHEN Type = 'New Business' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as NewBusiness2,
SUM(CASE WHEN Type = 'Renewal' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as Renewal2,
SUM(CASE WHEN Type = 'Rewrite' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as Rewrite2,
SUM(CASE WHEN Status = 'Declined' THEN 1 ELSE 0 END ) as Declined2
FROM ClearanceReportMetrics
WHERE YEAR(EffectiveDate)=2016 AND CompanyLine = 'Arch Insurance Company'
GROUP BY ProducerCode,ProducerName
)
SELECT top 10 RANK() OVER (ORDER BY NetWrittenPremium desc) as Rank,
t.ProducerCode,
c.ProducerName as 'Producer',
NetWrittenPremium,
t.NewBusiness1,
t.Renewal1,
t.Rewrite1,
c.[NewBusiness2]+c.[Renewal2]+c.[Rewrite2] as PolicyCount,
c.Submissions2,
c.Quoted2,
c.[NewBusiness2],
c.Renewal2,
c.Rewrite2,
c.Declined2
FROM cte_TopWP t --LEFT OUTER JOIN tblProducers p on t.ProducerCode=p.ProducerCode
LEFT OUTER JOIN cte_Counts c ON t.ProducerCode=c.ProducerCode
You should use ROW_NUMBER to fix your issue.
https://msdn.microsoft.com/en-us/library/ms186734.aspx
A good example of this is the following answer:
https://dba.stackexchange.com/a/22198
Here's the code example from the answer.
SELECT * FROM
(
SELECT acss_lookup.ID AS acss_lookupID,
ROW_NUMBER() OVER
(PARTITION BY your_distinct_column ORDER BY any_column_you_think_is_appropriate)
as num,
acss_lookup.product_lookupID AS acssproduct_lookupID,
acss_lookup.region_lookupID AS acssregion_lookupID,
acss_lookup.document_lookupID AS acssdocument_lookupID,
product.ID AS product_ID,
product.parent_productID AS productparent_product_ID,
product.label AS product_label,
product.displayheading AS product_displayheading,
product.displayorder AS product_displayorder,
product.display AS product_display,
product.ignorenewupdate AS product_ignorenewupdate,
product.directlink AS product_directlink,
product.directlinkURL AS product_directlinkURL,
product.shortdescription AS product_shortdescription,
product.logo AS product_logo,
product.thumbnail AS product_thumbnail,
product.content AS product_content,
product.pdf AS product_pdf,
product.language_lookupID AS product_language_lookupID,
document.ID AS document_ID,
document.shortdescription AS document_shortdescription,
document.language_lookupID AS document_language_lookupID,
document.document_note AS document_document_note,
document.displayheading AS document_displayheading
FROM acss_lookup
INNER JOIN product ON (acss_lookup.product_lookupID = product.ID)
INNER JOIN document ON (acss_lookup.document_lookupID = document.ID)
)a
WHERE a.num = 1
ORDER BY product_displayheading ASC;
You could do this:
SELECT ProducerCode, MIN(Producer) AS Producer, ...
GROUP BY ProducerCode
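
Here is a small runnable sketch of that last suggestion: group on the code only and let MIN() pick one spelling of the name. The table name production and its rows are made up, and because this runs on SQLite from Python it uses LIMIT where SQL Server would use TOP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE production (producer_code TEXT, producer TEXT, premium REAL);
    INSERT INTO production VALUES
        ('P1', 'Acme Brokers',     100.0),
        ('P1', 'Acme Brokers Inc',  50.0),  -- same code, different spelling
        ('P2', 'Zenith',           120.0),
        ('P3', 'Delta',             30.0);
""")

# Grouping by code alone merges the two 'P1' spellings into one row;
# MIN(producer) picks one name per code (the alphabetically first).
query = """
    SELECT producer_code,
           MIN(producer) AS producer,
           SUM(premium)  AS net_written_premium
    FROM production
    GROUP BY producer_code
    ORDER BY net_written_premium DESC
    LIMIT 2
"""
rows = conn.execute(query).fetchall()
print(rows)
```

MIN() gives the alphabetically first name, which is not necessarily the preferred spelling; the ROW_NUMBER approach above is the one to reach for if you need control over which row wins.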

Help with difficult 'group by' clause

I need some help with a query.
I have a table Managers (ManagerId, ManagerName)
I have a table Statuses (StatusId, StatusName)
(There's about 10 statuses in that table)
I have a table Clients (ClientId, ClientName, ManagerId, StatusId, WhenAdded)
(WhenAdded is a datetime type)
It's obvious that field 'ManagerId' refers to a table 'Managers' and field 'StatusId' refers to a table 'Statuses'.
The user wants to get some statistics about Managers over a period of time (from startDate to endDate, using the field 'WhenAdded') in the following table.
Columns:
ManagerName, NumberOfClients, NumberOfClientsWithStatus1, NumberOfClientsWithStatus2, NumberOfClientsWithStatus3 and so on.
There should be one NumberOfClientsWithStatusI column per status, so the number of these columns equals the number of rows in the table 'Statuses'.
How can I do that?
t-sql, sql server 2008 r2 express edition.
SELECT
ManagerName,
COUNT(*) AS NumberOfClients,
COUNT(CASE WHEN S.StatusId = 1 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus1,
COUNT(CASE WHEN S.StatusId = 2 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus2,
COUNT(CASE WHEN S.StatusId = 3 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus3,
...
FROM
Clients C
JOIN
Managers M ON C.ManagerId = M.ManagerId
JOIN
Statuses S ON C.StatusId = S.StatusId
WHERE
C.WhenAdded BETWEEN @startDate AND @endDate
GROUP BY
M.ManagerName
Note: there is no clean way to add an arbitrary number of status columns in SQL (not just SQL Server) because a query has a fixed set of output columns. You'd have to change the query whenever a status is added, unless you deal with this in the client.
Edit, after comment
SELECT
ManagerName,
COUNT(C.ClientId) AS NumberOfClients,
COUNT(CASE WHEN S.StatusId = 1 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus1,
COUNT(CASE WHEN S.StatusId = 2 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus2,
COUNT(CASE WHEN S.StatusId = 3 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus3,
...
FROM
Managers M
LEFT JOIN
Clients C ON C.ManagerId = M.ManagerId
-- the date filter lives in the join so managers without matching clients are kept
AND C.WhenAdded BETWEEN @startDate AND @endDate
LEFT JOIN
Statuses S ON C.StatusId = S.StatusId
GROUP BY
M.ManagerName
If you know that the Statuses table will always contain a limited number of statuses, you can do this:
SELECT M.ManagerName,
COUNT(C.ClientId) NumberOfClients,
SUM(CASE WHEN S.StatusId= 1 THEN 1 ELSE 0 END) NumberOfClientsWithStatus1,
SUM(CASE WHEN S.StatusId= 2 THEN 1 ELSE 0 END) NumberOfClientsWithStatus2,
...
FROM Clients C
JOIN Managers M on M.ManagerId = C.ManagerId
JOIN Statuses S on S.StatusId = C.StatusId
WHERE C.WhenAdded BETWEEN @startDate AND @endDate
GROUP BY ManagerName
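
On the point that the output columns are fixed and an arbitrary number of statuses has to be handled in the client: one way to do that is to read the Statuses table first and build the SUM(CASE ...) columns from it. A minimal sketch on SQLite from Python, using the schema from the question; the sample rows, the manager names and the :start_date / :end_date parameter names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Managers (ManagerId INTEGER PRIMARY KEY, ManagerName TEXT);
    CREATE TABLE Statuses (StatusId INTEGER PRIMARY KEY, StatusName TEXT);
    CREATE TABLE Clients  (ClientId INTEGER PRIMARY KEY, ClientName TEXT,
                           ManagerId INTEGER, StatusId INTEGER, WhenAdded TEXT);
    INSERT INTO Managers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Statuses VALUES (1, 'New'), (2, 'Active'), (3, 'Closed');
    INSERT INTO Clients VALUES
        (1, 'c1', 1, 1, '2011-01-10'),
        (2, 'c2', 1, 2, '2011-01-15'),
        (3, 'c3', 1, 2, '2011-03-01'),  -- outside the date range used below
        (4, 'c4', 2, 3, '2011-01-20');
""")

# Read the status ids, then generate one conditional-count column per status.
# The ids are integers coming from the database itself, so formatting them
# into the SQL text is safe here; the dates are still passed as parameters.
status_ids = [row[0] for row in conn.execute("SELECT StatusId FROM Statuses ORDER BY StatusId")]
status_columns = ",\n       ".join(
    f"SUM(CASE WHEN C.StatusId = {int(sid)} THEN 1 ELSE 0 END) AS NumberOfClientsWithStatus{int(sid)}"
    for sid in status_ids
)

query = f"""
SELECT M.ManagerName,
       COUNT(C.ClientId) AS NumberOfClients,
       {status_columns}
FROM Managers M
LEFT JOIN Clients C
       ON C.ManagerId = M.ManagerId
      AND C.WhenAdded BETWEEN :start_date AND :end_date
GROUP BY M.ManagerName
ORDER BY M.ManagerName
"""
rows = conn.execute(query, {"start_date": "2011-01-01", "end_date": "2011-01-31"}).fetchall()
for row in rows:
    print(row)
```

Because the date filter sits in the LEFT JOIN condition, a manager with no clients in the period (Cy here) still shows up with zeros instead of disappearing from the result.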