Trouble with sums and a having clause - tsql

I have a query that populates a report of "written off" hours. The total is the sum of RegHrs, OvtHrs, and SpecialOvtHrs, but counting only the positive values (each of these fields may hold positive and negative values; positive values are "written off").
The query I am using (which doesn't work) is:
select
LD.Employee,
max(EM.LastName + ', ' + Em.FirstName) as EMName,
LD.WBS1, LD.WBS2,
sum(LD.RegHrs + LD.OvtHrs + LD.SpecialOvtHrs) as [Hours],
CL.Name as ClientName, pctf.CustProgram,
max(PR.Name) as ProjName,
LD.PKey, ISNULL(BillingDirectives.Comment, 'None')
from AnvilProd..LD
left join AnvilProd..PR on LD.WBS1 = PR.WBS1 and PR.WBS2 = ' ' and PR.WBS3 = ' '
left join AnvilProd..EM on LD.Employee = EM.Employee
left join AnvilProd..CL on PR.ClientID = CL.ClientID
left join AnvilProd..ProjectCustomTabFields pctf on PR.WBS1 = pctf.WBS1 and pctf.WBS2 = ' ' and pctf.WBS3 = ' '
left join InterfaceDev..BillingDirectives on BillingDirectives.PKey = LD.PKey
where LD.BillStatus = 'X'
and LD.WrittenOffPeriod = @custPeriod
and LD.WBS1 not in (select distinct WBS1 from AnvilProd..BT where FeeBasis = 'L')
and LD.WBS1 not in (select distinct WBS1 from InterfaceDev..CircledHoursReportEliminatedJobs where ActiveStatus = 'Active')
group by pctf.CustProgram, CL.Name, LD.WBS1, LD.WBS2, LD.Employee, BillingDirectives.Comment, LD.PKey
-- having ((sum(LD.RegHrs) > 0) or (sum(LD.OvtHrs) > 0) or (sum(LD.SpecialOvtHrs) > 0))
order by pctf.CustProgram, CL.Name, LD.WBS1, WBS2, EMName
I need to find written off hours for each Employee, WBS1, WBS2 combination.
I have tried a dozen different things with that having clause and cannot get it to give me an accurate result.

Use a case inside the sum():
select blah, blah,
sum(
case when LD.RegHrs > 0 then LD.RegHrs else 0 end +
case when LD.OvtHrs > 0 then LD.OvtHrs else 0 end +
case when LD.SpecialOvtHrs > 0 then LD.SpecialOvtHrs else 0 end
) as [Hours], blah, blah
from blah join blah ...
group by blah, blah
-- no having clause
As a mathematical curiosity, you could also code the sum this way, since (x + abs(x)) / 2 equals x when x is positive and 0 otherwise:
sum((LD.RegHrs + abs(LD.RegHrs) +
LD.OvtHrs + abs(LD.OvtHrs) +
LD.SpecialOvtHrs + abs(LD.SpecialOvtHrs)) / 2)
which, although a bit less readable, uses less code and may impress your colleagues more :)
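A minimal sketch for checking both forms of the sum on sample data. It uses Python's built-in sqlite3 module only so that it runs standalone. The trimmed-down LD table and its three rows are invented for illustration, and the CASE and abs() expressions are the same ones you would write in T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down version of the LD table from the question.
conn.execute(
    "CREATE TABLE LD (Employee TEXT, RegHrs REAL, OvtHrs REAL, SpecialOvtHrs REAL)"
)
conn.executemany(
    "INSERT INTO LD VALUES (?, ?, ?, ?)",
    [
        ("E1", 8.0, -2.0, 0.0),   # 8 written off; the -2 must not reduce it
        ("E1", -4.0, 1.5, 0.0),   # 1.5 written off
        ("E2", -3.0, -1.0, 0.0),  # nothing written off
    ],
)

# Only positive values contribute, column by column and row by row.
case_sql = """
    SELECT Employee,
           SUM(CASE WHEN RegHrs > 0 THEN RegHrs ELSE 0 END
             + CASE WHEN OvtHrs > 0 THEN OvtHrs ELSE 0 END
             + CASE WHEN SpecialOvtHrs > 0 THEN SpecialOvtHrs ELSE 0 END) AS Hours
    FROM LD GROUP BY Employee ORDER BY Employee
"""

# Same result via (x + |x|) / 2, which is x for positive x and 0 otherwise.
abs_sql = """
    SELECT Employee,
           SUM((RegHrs + ABS(RegHrs)
              + OvtHrs + ABS(OvtHrs)
              + SpecialOvtHrs + ABS(SpecialOvtHrs)) / 2) AS Hours
    FROM LD GROUP BY Employee ORDER BY Employee
"""

case_rows = conn.execute(case_sql).fetchall()
abs_rows = conn.execute(abs_sql).fetchall()
print(case_rows)  # E1 totals 9.5 written-off hours, E2 totals 0
```

A HAVING clause on sum(LD.RegHrs) cannot produce this, because by the time HAVING runs the negative rows have already been netted against the positive ones inside each group.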

Related

converting sql statement back to lambda expression

I have the query below and the SQL it generates. It runs really slowly, so it was rewritten in SQL, and now I'm not sure how to convert the SQL back to a lambda expression.
This is the part of the expression giving me problems, somewhere in
r.RecordProducts.Any()
records = records
.Include(r => r.Employer)
.Include(r => r.Contractor)
.Include(r => r.RecordProducts)
.ThenInclude(rp => rp.ProductDefendant.Defendant)
.Where(r => EF.Functions.Like(r.Employer.DefendantCode, "%" + input.DefendantCode + "%")
|| EF.Functions.Like(r.Contractor.DefendantCode, "%" + input.DefendantCode + "%")
|| r.RecordProducts.Any(rp => EF.Functions.Like(rp.ProductDefendant.Defendant.DefendantCode, "%" + input.DefendantCode + "%") && rp.IsActive == true));
The Any clause becomes an EXISTS, plus some funky stuff, in the WHERE clause of the SQL below:
SELECT [t].[Id], [t].[StartDate], [t].[EndDate], [t].[WitnessName], [t].[SourceCode], [t].[JobsiteName], [t].[ShipName], [t].[EmployerCode]
FROM (
SELECT DISTINCT [r].[RecordID] AS [Id], [r].[StartDate], [r].[EndDate], [r.Witness].[FullName] AS [WitnessName], CASE
WHEN [r].[SourceID] IS NOT NULL
THEN [r.Source].[SourceCode] ELSE N'zzzzz'
END AS [SourceCode], CASE
WHEN [r].[JobsiteID] IS NOT NULL
THEN [r.Jobsite].[JobsiteName] ELSE N'zzzzz'
END AS [JobsiteName], CASE
WHEN [r].[ShipID] IS NOT NULL
THEN [r.Ship].[ShipName] ELSE N'zzzzz'
END AS [ShipName], CASE
WHEN [r].[EmployerID] IS NOT NULL
THEN [r.Employer].[DefendantCode] ELSE N'zzzzz'
END AS [EmployerCode]
FROM [Records] AS [r]
LEFT JOIN [Ships] AS [r.Ship] ON [r].[ShipID] = [r.Ship].[ShipID]
LEFT JOIN [Jobsites] AS [r.Jobsite] ON [r].[JobsiteID] = [r.Jobsite].[JobsiteID]
LEFT JOIN [Sources] AS [r.Source] ON [r].[SourceID] = [r.Source].[SourceID]
LEFT JOIN [Witnesses] AS [r.Witness] ON [r].[WitnessID] = [r.Witness].[WitnessID]
LEFT JOIN [Defendants] AS [r.Contractor] ON [r].[ContractorID] = [r.Contractor].[DefendantID]
LEFT JOIN [Defendants] AS [r.Employer] ON [r].[EmployerID] = [r.Employer].[DefendantID]
WHERE ([r].[IsActive] = 1) AND (([r.Employer].[DefendantCode] LIKE (N'%' + 'cert') + N'%' OR [r.Contractor].[DefendantCode] LIKE (N'%' + 'cert') + N'%') OR EXISTS (
SELECT 1
FROM [Records_Products] AS [rp]
INNER JOIN [Product_Defendant] AS [rp.ProductDefendant] ON [rp].[DefendantProductID] = [rp.ProductDefendant].[DefendantProductID]
INNER JOIN [Defendants] AS [rp.ProductDefendant.Defendant] ON [rp.ProductDefendant].[DefendantID] = [rp.ProductDefendant.Defendant].[DefendantID]
WHERE ([rp.ProductDefendant.Defendant].[DefendantCode] LIKE (N'%' + 'cert') + N'%' AND ([rp].[IsActive] = 1)) AND ([r].[RecordID] = [rp].[RecordID])))
) AS [t]
ORDER BY [t].[SourceCode]
OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY
Here is the new SQL that works better; I'm just not sure how to convert it back to a lambda expression:
SELECT [t].[Id]
,[t].[StartDate]
,[t].[EndDate]
,[t].[WitnessName]
,[t].[SourceCode]
,[t].[JobsiteName]
,[t].[ShipName]
,[t].[EmployerCode]
FROM (
SELECT DISTINCT [r].[RecordID] AS [Id]
,[r].[StartDate]
,[r].[EndDate]
,[r.Witness].[FullName] AS [WitnessName]
,CASE
WHEN [r].[SourceID] IS NOT NULL
THEN [r.Source].[SourceCode]
ELSE N'zzzzz'
END AS [SourceCode]
,CASE
WHEN [r].[JobsiteID] IS NOT NULL
THEN [r.Jobsite].[JobsiteName]
ELSE N'zzzzz'
END AS [JobsiteName]
,CASE
WHEN [r].[ShipID] IS NOT NULL
THEN [r.Ship].[ShipName]
ELSE N'zzzzz'
END AS [ShipName]
,CASE
WHEN [r].[EmployerID] IS NOT NULL
THEN [r.Employer].[DefendantCode]
ELSE N'zzzzz'
END AS [EmployerCode]
FROM [Records] AS [r]
LEFT JOIN [Ships] AS [r.Ship] ON [r].[ShipID] = [r.Ship].[ShipID]
LEFT JOIN [Jobsites] AS [r.Jobsite] ON [r].[JobsiteID] = [r.Jobsite].[JobsiteID]
LEFT JOIN [Sources] AS [r.Source] ON [r].[SourceID] = [r.Source].[SourceID]
LEFT JOIN [Witnesses] AS [r.Witness] ON [r].[WitnessID] = [r.Witness].[WitnessID]
LEFT JOIN [Defendants] AS [r.Contractor] ON [r].[ContractorID] = [r.Contractor].[DefendantID]
LEFT JOIN [Defendants] AS [r.Employer] ON [r].[EmployerID] = [r.Employer].[DefendantID]
LEFT JOIN (
SELECT [rp].[RecordID]
FROM [Records_Products] AS [rp]
INNER JOIN [Product_Defendant] AS [rp.ProductDefendant] ON [rp].[DefendantProductID] = [rp.ProductDefendant].[DefendantProductID]
INNER JOIN [Defendants] AS [rp.ProductDefendant.Defendant] ON [rp.ProductDefendant].[DefendantID] = [rp.ProductDefendant.Defendant].[DefendantID]
WHERE (
[rp.ProductDefendant.Defendant].[DefendantCode] LIKE (N'%' + 'cert') + N'%'
AND ([rp].[IsActive] = 1)
)
) AS RecordProduct ON [r].[RecordID] = RecordProduct.[RecordID]
WHERE ([r].[IsActive] = 1)
AND (
(
[r.Employer].[DefendantCode] LIKE (N'%' + 'cert') + N'%'
OR [r.Contractor].[DefendantCode] LIKE (N'%' + 'cert') + N'%'
)
OR RecordProduct.RecordID IS NOT NULL
--OR EXISTS (
-- SELECT 1
-- FROM [Records_Products] AS [rp]
-- INNER JOIN [Product_Defendant] AS [rp.ProductDefendant] ON [rp].[DefendantProductID] = [rp.ProductDefendant].[DefendantProductID]
-- INNER JOIN [Defendants] AS [rp.ProductDefendant.Defendant] ON [rp.ProductDefendant].[DefendantID] = [rp.ProductDefendant.Defendant].[DefendantID]
-- WHERE ([rp.ProductDefendant.Defendant].[DefendantCode] LIKE (N'%' + 'cert') + N'%'
-- AND ([rp].[IsActive] = 1)) AND ([r].[RecordID] = [rp].[RecordID])
-- )
)
) AS [t]
ORDER BY [t].[SourceCode]
OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY
The LINQ expression you supplied and the generated SQL do not match. For one, the LINQ expression performs an Include on the various related tables, which would have put all of those entities' columns in the top-level SELECT; they are not present in your example SQL. I also don't see anything in the LINQ expression for the Take(500) and OrderBy, or for the IsActive check on Record.
To help determine the source of any performance concern, we need to see the complete LINQ expression and the resulting SQL.
Looking at the basis of the LINQ expression you provided:
records = records
.Include(r => r.Employer)
.Include(r => r.Contractor)
.Include(r => r.RecordProducts)
.ThenInclude(rp => rp.ProductDefendant.Defendant)
.Where(r => EF.Functions.Like(r.Employer.DefendantCode, "%" + input.DefendantCode + "%")
|| EF.Functions.Like(r.Contractor.DefendantCode, "%" + input.DefendantCode + "%")
|| r.RecordProducts.Any(rp => EF.Functions.Like(rp.ProductDefendant.Defendant.DefendantCode, "%" + input.DefendantCode + "%") && rp.IsActive == true));
There are a few suggestions I can make:
There is no need for the Functions.Like. You should be able to achieve the same with Contains.
Avoid using Include and instead utilize Select to retrieve the columns from the resulting structure that you actually need. Populate these into ViewModels or consume them in the code. The less data you pull back, the better optimized the SQL can be for indexing, and the less data pulled across the wire. Consuming entities also leads to unexpected lazy-load scenarios as systems mature and someone forgets to Include a new relation.
// Assign to a new variable: the Select projection changes the element type.
var results = records
.Where(r => r.IsActive
&& (r.Employer.DefendantCode.Contains(input.DefendantCode)
|| r.Contractor.DefendantCode.Contains(input.DefendantCode)
|| r.RecordProducts.Any(rp => rp.IsActive
&& rp.ProductDefendant.Defendant.DefendantCode.Contains(input.DefendantCode))))
.OrderBy(r => r.SourceCode)
.Select(r => new RecordViewModel
{
// Populate the data you want here.
}).Take(500).ToList();
This also adds the IsActive check, OrderBy, and Take(500) based on your sample SQL.

replace elements in a 2d array

Given a 2d array
select (ARRAY[[1,2,3], [4,0,0], [7,8,9]]);
{{1,2,3},{4,0,0},{7,8,9}}
Is there a way to replace the slice at [2:2][2:] (the {{0,0}}) with values 5 and 6? array_replace replaces a specific value so I'm not sure how to approach this.
I believe it's more readable to code a function in plpgsql. However, a pure SQL solution also exists:
select (
select array_agg(inner_array order by outer_index)
from (
select outer_index,
array_agg(
case
when outer_index = 2 and inner_index = 2 then 5
when outer_index = 2 and inner_index = 3 then 6
else item
end
order by inner_index
) inner_array
from (
select item,
1 + (n - 1) % array_length(a, 2) inner_index, -- column: position within the row
1 + (n - 1) / array_length(a, 2) outer_index -- row: both indices derive from the column count
from
unnest(a) with ordinality x (item, n)
) _
group by outer_index
)_
)
from (
select (ARRAY[[1,2,3], [4,0,0], [7,8,9]]) a
)_;
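The index arithmetic is the part that is easiest to get wrong, so here is a small sketch of it outside the database. For an array with c columns, element n of the row-major flattening (what unnest ... with ordinality yields) sits at row 1 + (n - 1) / c and column 1 + (n - 1) % c. The function name replace_cells and its arguments are made up for this illustration; plain Python lists stand in for the PostgreSQL array.

```python
def replace_cells(grid, replacements):
    """Return a copy of a rectangular 2-D list with some cells replaced.

    `replacements` maps 1-based (row, column) pairs to new values, mirroring
    the 1-based subscripts PostgreSQL arrays use by default.
    """
    n_cols = len(grid[0])
    flat = [item for row in grid for item in row]  # row-major, like unnest()
    rebuilt = [[None] * n_cols for _ in grid]
    for n, item in enumerate(flat, start=1):       # n plays the role of ordinality
        outer_index = 1 + (n - 1) // n_cols        # row
        inner_index = 1 + (n - 1) % n_cols         # column
        rebuilt[outer_index - 1][inner_index - 1] = replacements.get(
            (outer_index, inner_index), item
        )
    return rebuilt


result = replace_cells([[1, 2, 3], [4, 0, 0], [7, 8, 9]], {(2, 2): 5, (2, 3): 6})
print(result)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Both indices depend only on the column count, which matters as soon as the array is not square.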

UPDATE table via join in SQL

I am trying to normalize my tables to make the db more efficient.
To do this I have removed several columns from a table that I was updating several columns on.
Here is the original query when all the columns were in the table:
UPDATE myActDataBaselDataTable
set [Correct Arrears 2]=(case when [Maturity Date]='' then 0 else datediff(d,convert(datetime,@DataDate, 102),convert(datetime,[Maturity Date],102)) end)
from myActDataBaselDataTable
Now I have removed [Maturity Date] from the table myActDataBaselDataTable and it's necessary to retrieve that column from the base reference table ACTData, where it is called Mat.
In my table myActDataBaselDataTable the Account number field is a concatenation of 3 fields in ACTData, thus
myActDataBaselDataTable.[Account No]=ac.[Unit] + ' ' + ac.[ACNo] + ' ' + ac.[Suffix]
(where ac is the alias for ACTData)
So, having looked at the answers given elsewhere on SO (such as 1604091: update-a-table-using-join-in-sql-server), I tried to modify this particular update statement as below, but I cannot get it right:
UPDATE myActDataBaselDataTable
set dt.[Correct Arrears 2]=(
case when ac.[Mat]=''
then 0
else datediff(d,convert(datetime,'2014-04-30', 102),convert(datetime,ac.[Mat],102))
end)
from ACTData ac
inner join myActDataBaselDataTable dt
ON dt.[Account No]=ac.[Unit] + ' ' + ac.[ACNo] + ' ' + ac.[Suffix]
I either get an Incorrect syntax near 'From' error, or The multi-part identifier "dt.Correct Arrears 2" could not be bound.
I'd be grateful for any guidance on how to get this right, or suggestions about how to do it better.
Thanks.
EDIT:
BTW, when I run the below as a SELECT it returns data with no errors:
select case when [ac].[Mat]=''
then 0
else datediff(d,convert(datetime,'2014-04-30', 102),convert(datetime,[ac].[Mat],102))
end
from ACTData ac
inner join myActDataBaselDataTable dt
ON dt.[Account No]=ac.[Unit] + ' ' + ac.[ACNo] + ' ' + ac.[Suffix]
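In a join update, name the alias, not the table, on the update line, and keep the rest of your statement (set, from, join) as it is:
update dt
What is confusing is that in later versions of SQL Server you don't need to use the alias on the update line.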

How to add logic to my SQL query to include/exclude search parameters?

Using SQL Server 2008 R2, I have this simple query:
SELECT *
FROM ItemData
WHERE FREETEXT(Title, '"' + @OriginalSearchTerm + '"')
AND ( WebsiteID=@WebsiteID AND GeoCity = @GeoCity AND GeoState = @GeoState )
ORDER BY ItemListID DESC
This is all fine when there is a valid value for @GeoCity and @GeoState. However, there will be scenarios where @GeoCity = -1 and/or @GeoState = -1.
I would rather not write entirely separate queries for these cases, although that would work just fine.
How can I optimize the current query to do just this?
Thanks.
-- This might be more efficient, as an OR condition is evaluated separately for each branch.
-- Note: ISNULL only substitutes -1 when the parameter is NULL; it does not skip the filter.
SELECT *
FROM ItemData
WHERE FREETEXT(Title, '"' + @OriginalSearchTerm + '"')
AND ( WebsiteID=@WebsiteID AND
(GeoCity = ISNULL(@GeoCity,-1)) AND
(GeoState = ISNULL(@GeoState,-1)) )
ORDER BY ItemListID DESC
I'm not sure what you want the query to do when @GeoCity = -1 and/or @GeoState = -1. Assuming you want to exclude the GeoCity = @GeoCity and/or GeoState = @GeoState conditions in that case, here's the query:
SELECT *
FROM ItemData
WHERE FREETEXT(Title, '"' + @OriginalSearchTerm + '"')
AND ( WebsiteID=@WebsiteID AND
(@GeoCity = -1 OR GeoCity = @GeoCity) AND
(@GeoState = -1 OR GeoState = @GeoState) )
ORDER BY ItemListID DESC
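The "parameter is the sentinel OR column matches" pattern can be tried out with the sketch below. It is a simplified stand-in, not the original schema: it runs on SQLite through Python's sqlite3 module, the ItemData table is reduced to four integer columns with invented rows, the FREETEXT predicate is left out because SQLite has no equivalent, and the search helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of ItemData (no Title / full-text column).
conn.execute(
    "CREATE TABLE ItemData "
    "(ItemListID INTEGER, WebsiteID INTEGER, GeoCity INTEGER, GeoState INTEGER)"
)
conn.executemany(
    "INSERT INTO ItemData VALUES (?, ?, ?, ?)",
    [(1, 10, 100, 5), (2, 10, 101, 5), (3, 10, 100, 6), (4, 11, 100, 5)],
)

# Each geo condition is switched off when its parameter is the -1 sentinel.
QUERY = """
    SELECT ItemListID FROM ItemData
    WHERE WebsiteID = :website_id
      AND (:geo_city = -1 OR GeoCity = :geo_city)
      AND (:geo_state = -1 OR GeoState = :geo_state)
    ORDER BY ItemListID DESC
"""


def search(website_id, geo_city=-1, geo_state=-1):
    """Run the one query for every combination of supplied or omitted filters."""
    params = {"website_id": website_id, "geo_city": geo_city, "geo_state": geo_state}
    return [row[0] for row in conn.execute(QUERY, params)]


print(search(10))                # no geo filter: [3, 2, 1]
print(search(10, geo_city=100))  # city only: [3, 1]
print(search(10, 100, 5))        # city and state: [1]
```

One query text covers all four combinations, which is the point of the pattern; whether the optimizer produces a good plan for every combination is a separate question worth checking on real data.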

SQL query for two possible values?

I am using SSMS 2008 R2 and am trying to figure out the SQL select statement to select all records where two or more of the values are found.
These are the four possible values I am looking for. If two or more of these values (SubstanceAbuse, BehaviorEmotion, SexualAbuse, DomesticViolence) are met, I want to set a new field to 1. How do I do this?
case when qav.[test_setup_details_caption] in ('Substance Abuse / Drug Use','Caregiver monitor youth for drug alcohol use') then 1 else 0 end SubstanceAbuse,
case when qav.[test_setup_details_caption] in ('Physical Aggression','Firesetting','Gang Involvement','Runaway Behavior') then 1 else 0 end BehaviorEmotion,
case when qav.[test_setup_details_caption] = 'Problem Sexual Behavior' then 1 else 0 end SexualAbuse,
case when qav.[test_setup_details_caption] LIKE '%Domestic%' then 1 else 0 end DomesticViolence,
My suggestion would be to take the above statement and make it a derived ("virtual") table in a new SELECT statement. Then you can add up the ones (since they are already calculated) in the outer WHERE clause and display only the rows
where (Sub + Beh + Sex + Dom) > 1
It would look something like this (pseudo-code):
SELECT t.*
FROM (SELECT sub case, Beh case, etc.
FROM yourtable) t
WHERE (t.sub + t.Beh + t.Sex + t.Dom) > 1
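It seems like all you need is this WHERE clause (placed in an outer query, because column aliases cannot be referenced in the WHERE clause of the SELECT that defines them):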
WHERE SubstanceAbuse + BehaviorEmotion + SexualAbuse + DomesticViolence > 1
update myTable set myField = 1 where 2 <= (select SubstanceAbuse + BehaviorEmotion + SexualAbuse + DomesticViolence from ...)
Of course, this is just a template for your query, but you get the idea. If the answer is still unclear then I kindly ask you to give me more details.
Best regards,
Lajos Arpad.
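A sketch of the derived-table idea on sample data, with one extra assumption that is not in the question. If each row holds a single caption, the four flags on that row cannot add up to more than 1, so the flags have to be collapsed per person before they are summed. The answers table, the client_id column, and the sample rows below are hypothetical, and SQLite (via Python's sqlite3) stands in for SQL Server so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per (client, caption).
conn.execute("CREATE TABLE answers (client_id INTEGER, caption TEXT)")
conn.executemany(
    "INSERT INTO answers VALUES (?, ?)",
    [
        (1, "Substance Abuse / Drug Use"),
        (1, "Runaway Behavior"),
        (2, "Physical Aggression"),
        (2, "Firesetting"),  # same category twice still counts once
        (3, "Problem Sexual Behavior"),
        (3, "History of Domestic Violence"),
        (3, "Gang Involvement"),
    ],
)

# Inner query: the CASE flags from the question, one set per row.
# Outer query: collapse to one flag per category per client, then add them up.
sql = """
    SELECT client_id
    FROM (
        SELECT client_id,
               CASE WHEN caption IN ('Substance Abuse / Drug Use',
                                     'Caregiver monitor youth for drug alcohol use')
                    THEN 1 ELSE 0 END AS SubstanceAbuse,
               CASE WHEN caption IN ('Physical Aggression', 'Firesetting',
                                     'Gang Involvement', 'Runaway Behavior')
                    THEN 1 ELSE 0 END AS BehaviorEmotion,
               CASE WHEN caption = 'Problem Sexual Behavior'
                    THEN 1 ELSE 0 END AS SexualAbuse,
               CASE WHEN caption LIKE '%Domestic%'
                    THEN 1 ELSE 0 END AS DomesticViolence
        FROM answers
    ) t
    GROUP BY client_id
    HAVING MAX(SubstanceAbuse) + MAX(BehaviorEmotion)
         + MAX(SexualAbuse) + MAX(DomesticViolence) > 1
    ORDER BY client_id
"""
flagged = [row[0] for row in conn.execute(sql)]
print(flagged)  # clients with two or more distinct categories: [1, 3]
```

If your rows already carry all four flags at once, the GROUP BY is unnecessary and the plain outer WHERE from the answers above is enough.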