Could someone please offer some advice? I have the following query, which runs against roughly 200,000 records. I need to check a DATETIME field to determine whether the revenue falls in the correct time slot. I am currently using CASE expressions to evaluate the field, and it is an absolute pig: it runs for over 5 minutes. Is there a faster, more efficient way to do this? Note that the variables @cur_date, @end_date, @prev_yr_qtr_start, @cur_date_yr_prev etc. are all strings and r.pw_ship_date is of type DATETIME, so in essence I'm comparing r.pw_ship_date to strings such as '2017-01-01 00:00'.
Note: the query took 4:00 minutes to run even when I added 'SELECT TOP(500)'; for 200,000 records it would take forever.
Thanks in advance
DECLARE @total TABLE
(
    acct_number VARCHAR(50),
    pro_nbr VARCHAR(50),
    sales_rep VARCHAR(50),
    bill_to_name VARCHAR(50),
    billing_addr1 VARCHAR(50),
    billing_addr2 VARCHAR(50),
    billing_city CHAR(50),
    billing_state CHAR(2),
    billing_zip CHAR(10),
    cur_month_bills INT,
    cur_month_rev DECIMAL(30, 6),
    cur_qtr_bills INT,
    cur_qtr_rev DECIMAL(30, 6),
    prev_yr_qtr_bills INT,
    prev_yr_qtr_rev DECIMAL(30, 6),
    cur_ytd_bills INT,
    cur_ytd_rev DECIMAL(30, 6),
    prev_ytd_bills INT
)
INSERT INTO @total
SELECT TOP(50000) f.acct_number ,
    r.pro_nbr ,
    r.sales_rep ,
    r.bill_to_name ,
    r.billing_addr1 ,
    r.billing_addr2 ,
    r.billing_city ,
    r.billing_state ,
    r.billing_zip ,
    'cur_month_bills' = MAX(CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN 1 ELSE 0 END) ,
    'cur_month_rev' = MAX(ROUND(CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN f.tot_revenue ELSE 0 END, 2)) ,
    'cur_qtr_bills' = MAX(CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN 1 ELSE 0 END) ,
    'cur_qtr_rev' = MAX(ROUND(CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN f.tot_revenue ELSE 0 END, 2)) ,
    'prev_yr_qtr_bills' = MAX(CASE WHEN r.pw_ship_date BETWEEN @prev_yr_qtr_start AND @cur_date_yr_prev THEN 1 ELSE 0 END) ,
    'prev_yr_qtr_rev' = MAX(ROUND(CASE WHEN r.pw_ship_date BETWEEN @prev_yr_qtr_start AND @cur_date_yr_prev THEN f.tot_revenue ELSE 0 END, 2)) ,
    'cur_ytd_bills' = MAX(CASE WHEN r.pw_ship_date BETWEEN @first_day_cur_yr AND @end_date THEN 1 ELSE 0 END) ,
    'cur_ytd_rev' = MAX(ROUND(CASE WHEN r.pw_ship_date BETWEEN @first_day_cur_yr AND @end_date THEN f.tot_revenue ELSE 0 END, 2)) ,
    'prev_ytd_bills' = MAX(CASE WHEN r.pw_ship_date BETWEEN @first_day_prev_yr AND @end_date THEN 1 ELSE 0 END)
FROM @summed f
INNER JOIN @raw r ON f.acct_number = r.acct_number AND f.pro_nbr = r.pro_nbr
GROUP BY f.acct_number ,
    r.pro_nbr ,
    r.sales_rep ,
    r.bill_to_name ,
    r.billing_addr1 ,
    r.billing_addr2 ,
    r.billing_city ,
    r.billing_state ,
    r.billing_zip;
Change your table variables @raw and @summed to temporary tables. Table variables have no statistics and are extremely limited with regard to indexing (on older versions you can only index them through PRIMARY KEY or UNIQUE constraints). Because of this, SQL Server estimates a very small row count for a table variable when it compiles the plan (typically one row, unless the statement is recompiled). This means that you almost certainly are getting a bad execution plan for your query, and that's going to ruin you.
Once you've changed @raw and @summed into #raw and #summed, put an index on them. At a minimum, index your foreign keys (the fields you're joining on), acct_number and pro_nbr. It may be worth creating a clustered index and/or a primary key as well, but that's something you'll need to experiment with to find the performance you require.
The other thing that is killing your performance is comparing datetimes to strings. This is causing a type conversion and that can drag you down significantly. If you're working with a date/time, use the appropriate data type - not a string that looks like a date.
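A minimal sketch of that point, assuming the boundary variables can be declared as DATETIME rather than as strings. The dates are illustrative only, and initializing a variable inside DECLARE needs SQL Server 2008 or later:

```sql
-- Boundaries declared with the same type as r.pw_ship_date, so the
-- string-to-datetime conversion happens once, at assignment.
DECLARE @cur_date DATETIME = '2017-01-01T00:00:00';
DECLARE @end_date DATETIME = '2017-01-31T23:59:59';

-- Illustrative check against a literal ship date inside the range.
SELECT CASE WHEN CAST('2017-01-15T08:30:00' AS DATETIME)
                 BETWEEN @cur_date AND @end_date
            THEN 1 ELSE 0 END AS cur_month_bills;
```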
If this is still not running quickly enough, move your CASE expressions out of your aggregate functions.
MAX(( CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN 1 ELSE 0 END ))
Move the CASE expression into the query that populates #raw, so that pw_ship_date is evaluated once per row there and, when you're performing the aggregate, you're just looking at integers all the way down.
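The steps above could be sketched roughly as follows. This is only an outline under assumptions: the column list is abbreviated, is_cur_month is a hypothetical name for the precomputed flag, and two sample rows stand in for whatever query really populates #raw. The flag is an INT because MAX does not accept BIT columns.

```sql
-- Assumption: boundaries are real DATETIME values (illustrative dates).
DECLARE @cur_date DATETIME = '2017-01-01T00:00:00';
DECLARE @end_date DATETIME = '2017-01-31T23:59:59';

-- Temporary table instead of a table variable.
CREATE TABLE #raw (
    acct_number  VARCHAR(50) NOT NULL,
    pro_nbr      VARCHAR(50) NOT NULL,
    pw_ship_date DATETIME    NULL,
    is_cur_month INT         NOT NULL   -- hypothetical precomputed flag
);

-- Index the columns used in the join.
CREATE INDEX IX_raw_acct_pro ON #raw (acct_number, pro_nbr);

-- Evaluate the CASE once per row while loading.
INSERT INTO #raw (acct_number, pro_nbr, pw_ship_date, is_cur_month)
SELECT s.acct_number, s.pro_nbr, s.pw_ship_date,
       CASE WHEN s.pw_ship_date BETWEEN @cur_date AND @end_date THEN 1 ELSE 0 END
FROM (VALUES ('A1', 'P1', CAST('2017-01-15' AS DATETIME)),
             ('A1', 'P2', CAST('2016-12-20' AS DATETIME))
     ) AS s (acct_number, pro_nbr, pw_ship_date);

-- The aggregate now only looks at integers.
SELECT r.acct_number, MAX(r.is_cur_month) AS cur_month_bills
FROM #raw r
GROUP BY r.acct_number;

DROP TABLE #raw;
```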
I am having trouble sorting some dates into 3 different date ranges and returning a value according to the correct range. I am hoping you can give me an efficient and clean way of doing it.
I have 6 different dates that I get from a SQL table. Those dates are then stored in variables, and any of them can be NULL. The dates are separated into 3 date ranges. I want to return an indication of which range I am in, using the earliest start date across all of my ranges. The start date of the selected range must also be earlier than the current date. A date range can also consist of only an end date; in that case, we consider that the range ends at the end date and is active before that, and we select the earliest end date that is close to the current date.
Return 0 if all the dates are null
Range #1(Category #1) X Start Date and X end Date Return 1
Range #2(Category #2) Y Start Date and Y end Date Return 2
Range #3(Category #3) Z Start Date and Z end Date Return 3
EDIT
Ex#1 XStart = December 10 , XEnd = December 15
YStart = December 12 , Yend = December 13
ZStart = December 9 , ZEnd = Null
Expected result would be Z Category
Ex#2 XStart = December 8 , XEnd = December 15
YStart = NULL , Yend = NULL
ZStart = December 9 , ZEnd = Null
Expected result would be X Category
Ex#3 XStart = NULL , XEnd = December 15
YStart = NULL , Yend = NULL
ZStart = December 9 , ZEnd = Null
Expected result would be X Category
Ex#4 XStart = December 10 , XEnd = December 15
YStart = NULL , Yend = NULL
ZStart = December 9 , ZEnd = Null
Expected result would be Z Category
Is there a more efficient way than writing a lot of IF statements? I am having difficulty handling all of those conditions and checks. Here is a snippet of what I have so far.
-- Return 0 if no condition is applicable
ALTER PROCEDURE [dbo].[HO_GetReason]
    @HOID INT
AS
BEGIN
    Declare @IsHOIDReal INT = (SELECT ID from T_HO where id = @HOID)
    Declare @XStartDate Datetime
    Declare @XEndDate Datetime
    Declare @YStartDate Datetime
    Declare @YEndDate Datetime
    Declare @ZStartDate Datetime
    Declare @ZEndDate Datetime
    CREATE TABLE #tmpT_HO_Withhold (
        ID INT NOT NULL,
        XStartDate Datetime null,
        XEndDate Datetime null,
        YStartDate Datetime null,
        YEndDate Datetime null,
        ZStartDate Datetime null,
        ZEndDate Datetime null,
        PRIMARY KEY CLUSTERED (ID)
    )
    IF (@IsHOIDReal IS NOT NULL)
    BEGIN
        INSERT INTO #tmpT_HO_Withhold
        SELECT T_HO.ID,
            XStartDate ,
            XEndDate ,
            YStartDate ,
            YEndDate ,
            ZStartDate ,
            ZEndDate
        FROM dbo.T_HO
        WHERE ID = @HOID
        SET @XStartDate = (Select TOP 1 XStartDate from #tmpT_HO_Withhold)
        SET @XEndDate = (Select TOP 1 XEndDate from #tmpT_HO_Withhold)
        SET @YStartDate = (Select TOP 1 YStartDate from #tmpT_HO_Withhold)
        SET @YEndDate = (Select TOP 1 YEndDate from #tmpT_HO_Withhold)
        SET @ZStartDate = (Select TOP 1 ZStartDate from #tmpT_HO_Withhold)
        SET @ZEndDate = (Select TOP 1 ZEndDate from #tmpT_HO_Withhold)
        IF (@XStartDate IS NULL AND @YStartDate IS NULL AND @ZStartDate IS NULL)
        BEGIN print 'NO CONDITION' Select 0 as 'HO_GetReason' END
        ELSE IF (@XStartDate IS NOT NULL AND @YStartDate IS NULL AND @ZStartDate IS NULL) BEGIN print '1' Select 1 as 'HO_GetReason' END
        ELSE IF (@XStartDate IS NOT NULL AND @YStartDate IS NULL AND @ZStartDate IS NULL) BEGIN print '2' Select 2 as 'HO_GetReason' END
        ELSE IF (@XStartDate IS NULL AND @YStartDate IS NULL AND @ZStartDate IS NOT NULL) BEGIN print '3' Select 3 as 'HO_GetReason' END
    END
    DROP TABLE #tmpT_HO_Withhold
END
Notes regarding efficient and clean:
Complex conditionals are not in the inefficient category. They can fall into the hard-to-read-and-maintain category, but they are a pretty quick operation.
Example: that second "else if" looks strangely like the first "else if", so its code will never be reached.
Creating and destroying the temp table will be the slowest part of your stored procedure.
Temp tables using #tablename are not concurrency safe in stored procedures; you can end up with odd schema-altered errors in some cases.
You can get to the same results by swapping most of that with:
SELECT
    @XStartDate = XStartDate ,
    @XEndDate = XEndDate ,
    @YStartDate = YStartDate ,
    @YEndDate = YEndDate ,
    @ZStartDate = ZStartDate ,
    @ZEndDate = ZEndDate
FROM dbo.T_HO
WHERE ID = @HOID
ID is unique, based on the primary key in your CREATE TABLE, so TOP isn't necessary in this form. If no row matches, the variables simply keep their initial NULL values.
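The no-row behaviour can be checked with a small sketch. The table variable @T and its single row are made up for illustration and stand in for dbo.T_HO:

```sql
-- Hypothetical stand-in for dbo.T_HO with one row.
DECLARE @T TABLE (ID INT PRIMARY KEY, XStartDate DATETIME NULL);
INSERT INTO @T (ID, XStartDate) VALUES (1, '2017-12-10');

DECLARE @XStartDate DATETIME;   -- a freshly declared variable is NULL

-- No row has ID = 99, so the assignment never runs and the
-- variable keeps whatever value it had before (here, NULL).
SELECT @XStartDate = XStartDate FROM @T WHERE ID = 99;

SELECT @XStartDate AS XStartDate;
```

Note that a SELECT assignment that matches no rows leaves the variable unchanged rather than resetting it, so this only yields NULL when the variable has not been assigned earlier.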
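Personally, once I get that conditional working (absolute final form), I would be tempted to turn it directly into a CASE expression and set that as a PERSISTED computed column in the base table. Note that a persisted computed column must be deterministic, so this only works if the expression does not depend on the current date.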
ALTER TABLE dbo.T_HO ADD Reason AS (CASE WHEN XStartDate IS NOT NULL AND ... THEN ... WHEN ... THEN ... ELSE 0 END) PERSISTED
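As an alternative to the chain of IF statements, the selection could be sketched as a single query over the three ranges. This is only one reading of the rules, not the asker's procedure: it assumes that a range with an end date but no start date counts as having started before any other range, that a range with both dates NULL is ignored, and that ties go to the lower category number. The values are the Ex#3 inputs, with 2017 as an arbitrary year.

```sql
-- Ex#3 inputs; the year is an arbitrary choice for illustration.
DECLARE @XStartDate DATETIME = NULL,         @XEndDate DATETIME = '2017-12-15',
        @YStartDate DATETIME = NULL,         @YEndDate DATETIME = NULL,
        @ZStartDate DATETIME = '2017-12-09', @ZEndDate DATETIME = NULL;

SELECT COALESCE(
    (SELECT TOP (1) r.Category
     FROM (VALUES (1, @XStartDate, @XEndDate),
                  (2, @YStartDate, @YEndDate),
                  (3, @ZStartDate, @ZEndDate)
          ) AS r (Category, StartDate, EndDate)
     -- Ignore ranges where both dates are NULL.
     WHERE (r.StartDate IS NOT NULL OR r.EndDate IS NOT NULL)
       -- Assumption: an end-only range is treated as open on the left.
       AND COALESCE(r.StartDate, '19000101') <= GETDATE()
     ORDER BY COALESCE(r.StartDate, '19000101'), r.Category),
    0) AS HO_GetReason;   -- 0 when no range qualifies
```

Under those assumptions the X range wins for these inputs, which agrees with the expected result given for Ex#3. The rule about picking the end date closest to the current date is not modelled here.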
I have a catch script that is meant to pick up when a sum comes out as NULL and insert a 0 instead. It works fine in one query but not in my current one. Here is a breakdown of the code I am using:
Declare @Period int = 5
SELECT A.MATTER_CODE,DATEPART(month,A.DATE_OPENED) As DateOpened,B.DEPTNAME,B.DEPTCODE
INTO #TmpPREVYTD
FROM MATTER A
LEFT JOIN DEPT_MASTER B on A.DEPT_CODE = B.DEPTCODE
WHERE A.date_opened between DATEADD(YEAR, DATEDIFF(YEAR, 0,DATEADD(YEAR, -1, GETDATE())), 120) and
DATEADD(YEAR, DATEDIFF(YEAR, 0,DATEADD(YEAR, 0, GETDATE())), 120)
ORDER BY DATE_OPENED
This pushes out the data I require for the query in the correct year and month
SELECT COUNT(*) As 'Fin',Dateopened Into #TmpPFin FROM #TmpPREVYTD where DEPTCODE = 'FIN' GROUP BY
DATEOPENED
This counts every time a job relating to Finance is in the raw data.
SELECT SUM(FIN) As Fin INTO #PFIN FROM #TmpPFIN WHERE Dateopened Between 5 and @Period
This then sums up all the months that fall in the required range. In this example I just want it to count May only, which has nothing in it, which is why it produces a NULL value.
If EXISTS(Select Fin from #PFIN)
GOTO TmpITD
Else insert into #PFIN (Fin) Values( 0)
TmpITD:
Finally, this is the catcher, which should find that #PFIN holds a NULL value and insert a 0. However, I think it is going straight to TmpITD, because if I run just the insert statement it adds the 0.
So currently, if I run the entire statement I keep getting a NULL value, which means the final report comes out blank.
Am I missing something here? This exact same code works in other queries but not in this one. It would appear that the row does exist somehow, but with a NULL value, which has totally confused me.
EDIT: If I add something after TmpITD it gets output, so I now know for sure that the problem is that IF EXISTS thinks the row exists when it is actually NULL.
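If your final table #PFIN has any rows, EXISTS is true even if the values in them are NULL. A SUM without GROUP BY always returns exactly one row, and here that row contains NULL.
A solution might be the following tweak, which ignores NULL rows so that the check only succeeds when there is a real value: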
If EXISTS(Select Fin from #PFIN WHERE Fin IS NOT NULL)
GOTO TmpITD
Else insert into #PFIN (Fin) Values( 0)
TmpITD:
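Alternatively, since some aggregates (COUNT, for example) return 0 rather than NULL for an empty set, you may want to treat 0 as missing too. Note that Fin <> 0 is also not true for NULL, so this check covers both cases: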
If EXISTS(Select Fin from #PFIN WHERE Fin <> 0)
GOTO TmpITD
Else insert into #PFIN (Fin) Values( 0)
TmpITD:
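Another option, sketched here with made-up sample rows rather than the asker's data, is to avoid the NULL in the first place by wrapping the SUM in ISNULL. #PFIN then always holds a number, and the IF EXISTS / GOTO check is not needed:

```sql
-- Made-up monthly counts; there is deliberately no row for month 5.
CREATE TABLE #TmpPFin (Fin INT, DateOpened INT);
INSERT INTO #TmpPFin (Fin, DateOpened) VALUES (3, 4), (2, 6);

DECLARE @Period INT = 5;

-- SUM over zero rows yields one row containing NULL; ISNULL turns it into 0.
SELECT ISNULL(SUM(Fin), 0) AS Fin
INTO #PFIN
FROM #TmpPFin
WHERE DateOpened BETWEEN 5 AND @Period;

SELECT Fin FROM #PFIN;   -- one row with the value 0

DROP TABLE #PFIN;
DROP TABLE #TmpPFin;
```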
I developed the following function:
create function kv_fn_ValuationPerItem_AW (@dDate date, @active bit)
returns table
as
return
(
    select
        Code ItemCode
        , Description_1 ItemDescription
        , ItemGroup
        , Qty_On_Hand CurrentQtyOnHand
        , AveUCst CurrentAvgCost
        , Qty_On_Hand*AveUCst CurrentValue
    from _bvSTTransactionsFull t
    inner join StkItem s on t.AccountLink = s.StockLink
    where ServiceItem = 0
    and ItemActive = @active
    and TxDate <= @dDate
    group by Code, Description_1, ItemGroup, Qty_On_Hand, AveUCst
)
The function requires two parameters:
Date
Is the item Active - 1 = Active & 0 = Inactive
If I use the function as stipulated above, by specifying 1 for the Active Parameter, then the results will only be for Active Items.
If I specify 0, then it'll return all inactive Items.
How do I alter this function to cater for either active items only or both active and inactive items?
i.e. if the parameter is 1, the where clause should read ItemActive = @active, but when it's 0, it should read ItemActive in (1,0). How do I change the function to work like this?
I tried a case, but my syntax is not correct...
It's as simple as adding an OR to your where clause:
...
and (ItemActive = 1 OR @active = 0)
...
BTW, you might want to do it like this instead:
and (ItemActive = @active OR @active IS NULL)
which means that when you pass in 1 as @active you'll get only the active items, when you pass in 0 you'll get only the inactive items, and when you pass in NULL you'll get all records, regardless of the value in the ItemActive column.
Thanks Shnugo & Zohar for your answers,
Please amend your answers, then I'll mark yours as the answer.
The solution to my problem was to alter the Function as following:
create function kv_fn_ValuationPerItem_AW (@dDate date, @active bit)
returns table
as
return
(
    select
        Code ItemCode
        , Description_1 ItemDescription
        , ItemGroup
        , Qty_On_Hand CurrentQtyOnHand
        , AveUCst CurrentAvgCost
        , Qty_On_Hand*AveUCst CurrentValue
    from _bvSTTransactionsFull t
    inner join StkItem s on t.AccountLink = s.StockLink
    where ServiceItem = 0
    and ItemActive in (1,@active)
    and TxDate <= @dDate
    group by Code, Description_1, ItemGroup, Qty_On_Hand, AveUCst
)
I think you are looking for this:
DECLARE @mockup TABLE(ID INT IDENTITY,SomeValue VARCHAR(100),Active BIT);
INSERT INTO @mockup VALUES('Row 1 is active',1)
,('Row 2 is active',1)
,('Row 3 is inactive',0)
,('Row 4 is inactive',0);
DECLARE @OnlyActive BIT=0; --set this to 1 to see active rows only
SELECT *
FROM @mockup m
WHERE (@OnlyActive=0 OR m.Active=1);
The idea is: if the parameter is set to 0, this expression is always true; if not, the column Active must be set to 1.
Hint: I used parentheses, which are not needed in this simple case, but in your more complex WHERE clause they will be.
Hint2: I named the parameter OnlyActive, which expresses a bit better what you are looking for. You might turn the parameter into ShowAll with inverse logic too...
I have a table like this:
amount type app owe
1 a 10 10
2 a 8 -2
3 a 20 12
4 i 30 10
5 a 40 10
owe is:
(type == 'a')?app - sum(owe) where amount < (amount for current row):max(app-sum(owe)where amount<(amount for current row),0)
So I'd need a window function over the very column that the window function is computing. There are frame clauses such as rows between unbounded preceding and the prior row, but they have to be over a different column, not the column I'm summing. Is there a way to reference the same column the window function is on?
I tried an alias
case
when type = 'a'
then app - sum(owe) over (ROWS BETWEEN UNBOUNDED PRECEDING AND 1 preceding)
else
greatest(0, app - sum(owe) over (ROWS BETWEEN UNBOUNDED PRECEDING AND 1 preceding))
end as owe
But since owe doesn't exist when I made it, I get:
owe doesn't exist.
Is there some other way?
You cannot do that with window functions. Your only chance using SQL is a recursive CTE:
WITH RECURSIVE tab_owe AS (
    SELECT amount, type, app,
           CASE WHEN type = 'a'
                THEN app
                ELSE GREATEST(app, 0)
           END AS owe
    FROM tab
    ORDER BY amount LIMIT 1
  UNION ALL
    SELECT t.amount, t.type, t.app,
           CASE WHEN t.type = 'a'
                THEN t.app - sum(tab_owe.owe)
                ELSE GREATEST(t.app - sum(tab_owe.owe), 0)
           END AS owe
    FROM (SELECT amount, type, app
          FROM tab
          WHERE amount > (SELECT max(amount) FROM tab_owe)
          ORDER BY amount
          LIMIT 1) AS t
    CROSS JOIN tab_owe
    GROUP BY t.amount, t.type, t.app
)
SELECT amount, type, app, owe
FROM tab_owe;
(untested)
This would be much easier to write in procedural code, so consider using a table function.
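One caveat with the untested query above: PostgreSQL does not allow aggregate functions in the recursive term of a recursive CTE, nor a reference to the CTE inside a subquery, and an ORDER BY ... LIMIT in the first branch of a UNION needs parentheses. A variant that sidesteps this is sketched below: it carries the running total in an extra column (running_owe, a name introduced here) and fetches the next row with a LATERAL join. It assumes PostgreSQL 9.3 or later, unique amount values, and a temporary copy of the question's sample data.

```sql
-- Sample data from the question, in a temporary table.
CREATE TEMP TABLE tab (amount int, type text, app int);
INSERT INTO tab (amount, type, app) VALUES
    (1, 'a', 10), (2, 'a', 8), (3, 'a', 20), (4, 'i', 30), (5, 'a', 40);

WITH RECURSIVE tab_owe AS (
    -- Anchor: the first row by amount; nothing has been owed before it.
    (SELECT amount, type, app,
            CASE WHEN type = 'a' THEN app ELSE GREATEST(app, 0) END AS owe,
            CASE WHEN type = 'a' THEN app ELSE GREATEST(app, 0) END AS running_owe
     FROM tab
     ORDER BY amount
     LIMIT 1)
  UNION ALL
    -- Step: take the next row and subtract the total owed so far.
    SELECT n.amount, n.type, n.app, n.owe, p.running_owe + n.owe
    FROM tab_owe AS p
    CROSS JOIN LATERAL (
        SELECT t.amount, t.type, t.app,
               CASE WHEN t.type = 'a'
                    THEN t.app - p.running_owe
                    ELSE GREATEST(t.app - p.running_owe, 0)
               END AS owe
        FROM tab AS t
        WHERE t.amount > p.amount
        ORDER BY t.amount
        LIMIT 1
    ) AS n
)
SELECT amount, type, app, owe
FROM tab_owe
ORDER BY amount;
```

Worked by hand on the sample rows, the running total goes 10, 8, 20, 30, 40, which gives the owe values 10, -2, 12, 10, 10 shown in the question.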
This is what I came up with. Of course, I'm not a real programmer, so I'm sure there's a smarter way:
insert into mort (amount, "type", app)
values
    (1,'a',10),
    (2,'a',8),
    (3,'a',20),
    (4,'i',30),
    (5,'a',40);
CREATE OR REPLACE FUNCTION mort_v ()
RETURNS TABLE (
    zamount int,
    ztype text,
    zapp int,
    zowe double precision
) AS $$
DECLARE
    var_r record;
    charlie double precision;  -- running total of owe so far
    sam double precision;      -- app minus the running total
BEGIN
    charlie = 0;
    FOR var_r IN (SELECT
                      amount,
                      "type",
                      app
                  FROM mort order by 1)
    LOOP
        zamount = var_r.amount;
        ztype = var_r.type;
        zapp = var_r.app;
        sam = var_r.app - charlie;
        if ztype = 'a' then
            zowe = sam;
        else
            zowe = greatest(sam, 0);
        end if;
        charlie = charlie + zowe;
        RETURN NEXT;
    END LOOP;
END; $$
LANGUAGE plpgsql;
select * from mort_v();
So with my limited skills, you'll notice I had to add a 'z' in front of the columns that are already in the table so I can output them again. If your table has 30 columns you'd normally have to do this 30 times. But I asked a real engineer, and he mentioned that if you just return the primary key with the calculated column, you can join it back to the original table. That's smarter than what I have. If there's an even better solution, that would be great. This does serve as a nice reference for how to do something like a cursor in PostgreSQL and how to declare variables without an '@' in front, as you would in SQL Server.
OK, the umpteenth conditional column question:
I'm writing a stored proc that takes an input parameter that's mapped to one of several flag columns. What's the best way to filter on the requested column? I'm currently on SQL2000, but about to move to SQL2008, so I'll take a contemporary solution if one's available.
The table queried in the sproc looks like
ID  ...  fooFlag  barFlag  bazFlag  quuxFlag
--       -------  -------  -------  --------
01       1        0        0        1
02       0        1        0        0
03       0        0        1        1
04       1        0        0        0
and I want to do something like
select ID, name, description, ...
from myTable
where (colname like @flag + 'Flag') = 1
so if I call the sproc like exec uspMyProc @flag = 'foo' I'd get back rows 1 and 4.
I know I can't do the part in parens directly in SQL. In order to do dynamic SQL, I'll have to stuff the entire query into a string, concatenate the @flag param in the WHERE clause and then exec the string. Aside from the dirty feeling I get when doing dynamic SQL, my query is fairly large (I'm selecting a couple dozen fields, joining 5 tables, calling a couple of functions), so it's a big giant string all because of a single line in a 3-line WHERE filter.
Alternately, I could have 4 copies of the query and select among them in a CASE statement. This leaves the SQL code directly executable (and subject to syntax highlighting, etc.) but at the cost of repeating big chunks of code, since I can't use the CASE on just the WHERE clause.
Are there any other options? Any tricky joins or logical operations that can be applied? Or should I just get over it and exec the dynamic SQL?
There are a few ways to do this:
You can do this with a CASE expression.
select ID, name, description, ...
from myTable
where CASE
    WHEN @flag = 'foo' then fooFlag
    WHEN @flag = 'bar' then barFlag
END = 1
You can use IF.
IF (@flag = 'foo') BEGIN
    select ID, name, description, ...
    from myTable
    where fooFlag = 1
END ELSE IF (@flag = 'bar') BEGIN
    select ID, name, description, ...
    from myTable
    where barFlag = 1
END
....
You can have a complicated where clause with a lot of parentheses.
select ID, name, description, ...
from myTable
where (@flag = 'foo' and fooFlag = 1)
OR (@flag = 'bar' and barFlag = 1) OR ...
You can do this with dynamic SQL, building the column name from the parameter:
DECLARE @SQL nvarchar(4000)
SELECT @SQL = N'select ID, name, description, ...
from myTable
where ' + QUOTENAME(@flag + 'Flag') + N' = 1'
EXECUTE sp_executesql @SQL
There are more, but I think one of these will get you going.
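If you go the dynamic SQL route, it is worth checking @flag against the known flag names before concatenating it into the statement. The following is a rough sketch, not the asker's procedure: #myTable is a cut-down stand-in for myTable holding the sample rows from the question, and the DECLARE initializers and multi-row VALUES need SQL Server 2008 or later.

```sql
-- Cut-down stand-in for myTable with the question's sample rows.
CREATE TABLE #myTable (ID INT, fooFlag BIT, barFlag BIT, bazFlag BIT, quuxFlag BIT);
INSERT INTO #myTable (ID, fooFlag, barFlag, bazFlag, quuxFlag)
VALUES (1, 1, 0, 0, 1), (2, 0, 1, 0, 0), (3, 0, 0, 1, 1), (4, 1, 0, 0, 0);

DECLARE @flag VARCHAR(10) = 'foo';
DECLARE @sql NVARCHAR(4000);

-- Only build the statement for a known flag name.
IF @flag IN ('foo', 'bar', 'baz', 'quux')
BEGIN
    SET @sql = N'SELECT ID FROM #myTable WHERE '
             + QUOTENAME(@flag + 'Flag') + N' = 1;';
    EXEC sp_executesql @sql;   -- for 'foo' this returns IDs 1 and 4
END

DROP TABLE #myTable;
```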
"Alternately, I could have 4 copies of the query and select among them in a CASE statement."
You don't need to copy your entire query 4 times, just add all the possibilities into the where clauses in your single copy of the query:
select ID, name, description, ...
from myTable
where (@flag = 'foo' and fooFlag = 1) OR (@flag = 'bar' and barFlag = 1) OR ...
What I would do is CASE some variables at the beginning. Example:
DECLARE
    @fooFlag int,
    @barFlag int,
    @bazFlag int,
    @quuxFlag int
SET @fooFlag = CASE WHEN @flag = 'foo' THEN 1 ELSE NULL END
SET @barFlag = CASE WHEN @flag = 'bar' THEN 1 ELSE NULL END
SET @bazFlag = CASE WHEN @flag = 'baz' THEN 1 ELSE NULL END
SET @quuxFlag = CASE WHEN @flag = 'quux' THEN 1 ELSE NULL END
SELECT ID, name, description, ...
FROM myTable
WHERE (fooFlag >= ISNULL(@fooFlag, 0) AND fooFlag <= ISNULL(@fooFlag, 1))
AND (barFlag >= ISNULL(@barFlag, 0) AND barFlag <= ISNULL(@barFlag, 1))
AND (bazFlag >= ISNULL(@bazFlag, 0) AND bazFlag <= ISNULL(@bazFlag, 1))
AND (quuxFlag >= ISNULL(@quuxFlag, 0) AND quuxFlag <= ISNULL(@quuxFlag, 1))
The good thing about this query is that, because the possible values for the flags are bounded, you can calculate all your conditionals up front instead of wrapping columns in them. This keeps the predicates usable for an index seek on whichever columns are indexed, and doesn't require writing any dynamic SQL. And it's better than writing 4 separate queries for obvious reasons.
You could have a parameter for each possible flag column, then check if the parameter is null or the value in the column is equal to the parameter. Then you pass in a 1 for the flags that you want to check and leave the others null.
select id, name, description, ...
from myTable
where (@fooFlag is null or fooFlag = @fooFlag) AND
(@barFlag is null or barFlag = @barFlag) AND
...
Honestly, though, this seems like an ideal candidate for building a dynamic LINQ query and skipping the SPROC once you get to SQL2008.
The int CompanyID can be passed in as a varchar value:
declare @CompanyID as varchar(10) = '' -- or any other value
select * from EmployeeChatTbl chat
where chat.ConversationDetails like '%'+@searchKey+'%'
and
(
    (0 = CASE WHEN (@CompanyID = '' ) THEN 0 ELSE 1 END)
    or
    (chat.CompanyID = @CompanyID)
)
This works: when the CompanyID is present, filtering on it is applied; otherwise the filter is skipped.
where
case when @value<>0 then Field else 1 end
=
case when @value<>0 then @value else 1 end