Initializing multiple variables with one SELECT statement - T-SQL

It is a well-known practice to do this:
DECLARE @A INT
       ,@B INT
SELECT @A = [Column01]
      ,@B = [Column02]
FROM [dbo].[data]
I am wondering whether the same is true for the following code:
DECLARE @A INT
       ,@B INT
SELECT @A = [Column01]
      ,@B = @A + [Column02]
FROM [dbo].[data]
that is, is @A always assigned its value before @B?
In my real case, [Column01] and [Column02] are expressions involving many columns and T-SQL functions, and using @A as a reference simplifies the initialization of @B.

is @A always assigned its value before @B?
The answer is no.
Quoted from Itzik Ben-Gan's Microsoft SQL Server 2012 T-SQL Fundamentals
SQL supports a concept called all-at-once operations, which means that all expressions that appear in the same logical query processing phase are evaluated logically at the same point in time.
Please also check All-at-Once Operations in T-SQL
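If the goal is simply to avoid repeating the big expression, one pattern that does not depend on variable assignment order is to name the expression once with CROSS APPLY and reference the derived column rather than the variable; a minimal sketch, assuming the same table and columns as in the question:
DECLARE @A INT
       ,@B INT
SELECT @A = x.A
      ,@B = x.A + [Column02]  -- x.A is the column from the APPLY, not the variable @A
FROM [dbo].[data]
CROSS APPLY (SELECT [Column01] AS A) AS x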

In my experience so far this has always worked 'top down' for me. However, I too feel a bit queasy whenever I write something like that, and I have been known to split it into two separate commands on a more paranoid day; maybe I should have more of those =)
That said, the question could be extended to these syntaxes, which I 'assume by experience' to be correct, but again I wonder whether anyone has a more definite answer:
DECLARE @a int,
        @b int,
        @x int
SELECT @a = (CASE name WHEN 'A' THEN value ELSE @a END),
       @b = (CASE name WHEN 'B' THEN value ELSE @b END)
FROM myTable
WHERE name IN ('A', 'B')
which gives the same result as the two separate statements below but is quite a bit faster, especially if you have to fetch many of them:
SELECT @a = value FROM myTable WHERE name = 'A'
SELECT @b = value FROM myTable WHERE name = 'B'
Or, this one:
DECLARE @a int = 8,
        @b int = 5,
        @x int
UPDATE myTable
SET @x = @a * leftField + @b * rightField,
    mySum = @x,
    mySquare = Power(@x, 2)
WHERE ...
Here I use @x to calculate an intermediate value for a given record and then use that value later on to set a field or as part of another formula. (I agree it's a contrived example, but right now I can't come up with something more sensible.)
Or this one, which now seems to be generally accepted as 'OK', although I do remember the days when this would go haywire if you added an ORDER BY to it, which is often a requirement (a documented alternative is sketched below, after the UPDATE note):
DECLARE @a varchar(max)
SELECT @a = (CASE WHEN @a IS NULL THEN name ELSE @a + ',' + name END)
FROM sys.objects
WHERE type = 'S'
SELECT @a
UPDATE: these things (likely) usually work fine as long as the datasets are small, but odd things start to happen when the size of the data grows and the query optimizer decides to use a different plan, multi-threading, etc. So I tried to 'break' things by setting up a largish table that would no longer fit in memory and then see what would happen. I took a simple example that could easily be split into multiple threads. The result can be found here, but you'll need to adapt it to your hardware, of course (please don't bring sqlFiddle to its knees!). The results so far are that things keep working, but the query plans differ depending on the query we run (with or without @x and @y)!
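Coming back to the concatenation example above: a documented alternative that does not rely on variable-assignment order is FOR XML PATH (or STRING_AGG on SQL Server 2017+); a minimal sketch against the same sys.objects query:
DECLARE @a varchar(max)
SELECT @a = STUFF((SELECT ',' + name
                   FROM sys.objects
                   WHERE type = 'S'
                   ORDER BY name          -- the ORDER BY is honoured here
                   FOR XML PATH('')), 1, 1, '')  -- STUFF strips the leading comma
SELECT @a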

Related

Recursive replace from a table of characters

In short, I am looking for a single recursive query that can perform multiple replaces over one string. I have a notion it can be done, but am failing to wrap my head around it.
Granted, I'd prefer the biz-layer of the application, or even the CLR, to do the replacing, but these are not options in this case.
More specifically, I want to replace the below mess - which is C&P in 8 different stored procedures - with a TVF.
SET @temp = REPLACE(RTRIM(@target), '~', '-')
SET @temp = REPLACE(@temp, '''', '-')
SET @temp = REPLACE(@temp, '!', '-')
SET @temp = REPLACE(@temp, '@', '-')
SET @temp = REPLACE(@temp, '#', '-')
-- 23 additional lines redacted
SET @target = @temp
Here is where I've started:
-- I have a split string TVF called tvf_SplitString that takes a string
-- and a splitter, and returns a table with one row for each element.
-- EDIT: tvf_SplitString returns a two-column table: pos, element, of which
-- pos is simply the row_number of the element.
SELECT REPLACE('A~B!C@D@C!B~A', MM.ELEMENT, '-') TGT
FROM dbo.tvf_SplitString('~-''-!-@-#', '-') MM
Notice I've joined all the offending characters into a single string separated by '-' (knowing that '-' will never be one of the offending characters), which is then split. The result from this query looks like:
TGT
------------
A-B!C@D@C!B-A
A~B!C@D@C!B~A
A~B-C@D@C-B~A
A~B!C-D-C!B~A
A~B!C@D@C!B~A
So, the replace clearly works, but now I want it to be recursive so I can pull the top 1 and eventually come out with:
TGT
------------
A-B-C-D-C-B-A
Any ideas on how to accomplish this with one query?
EDIT: Well, actual recursion isn't necessary if there's another way. I'm pondering the use of a table of numbers here, too.
You can use this in a scalar function. I use it to remove all control characters from some external input.
SELECT @target = REPLACE(@target, invalidChar, '-')
FROM (VALUES ('~'), (''''), ('!'), ('@'), ('#')) AS T(invalidChar)
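A minimal sketch of wrapping that in a scalar function; the function name, parameter type, and length are hypothetical:
CREATE FUNCTION dbo.fn_CleanString (@target varchar(4000))
RETURNS varchar(4000)
AS
BEGIN
    -- each row of the VALUES list replaces one offending character in turn
    SELECT @target = REPLACE(@target, invalidChar, '-')
    FROM (VALUES ('~'), (''''), ('!'), ('@'), ('#')) AS T(invalidChar)
    RETURN RTRIM(@target)
END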
I figured it out. I failed to mention that the tvf_SplitString function returns a row number as "pos" (although a subquery assigning ROW_NUMBER could also have worked). With that fact, I could control the cross join between the recursive call and the split.
-- The cast to varchar(max) matches the output of the TVF; otherwise the recursive CTE raises a type-mismatch error.
-- The iteration counter is joined to the row number value from the split string
-- function to ensure each iteration only replaces one character.
WITH XX AS (SELECT CAST('A~B!C@D@C!B~A' AS VARCHAR(MAX)) TGT, 1 RN
            UNION ALL
            SELECT REPLACE(XX.TGT, MM.ELEMENT, '-'), RN + 1 RN
            FROM XX, dbo.tvf_SplitString('~-''-!-@-#', '-') MM
            WHERE XX.RN = MM.pos)
SELECT TOP 1 XX.TGT
FROM XX
ORDER BY RN DESC
Still, I'm open to other suggestions.

SQL invalid conversion return null instead of throwing error

I have a table with a varchar column, and I want to find values that match a certain number. So let's say that column contains the following entries (except with millions of rows in real life):
123456789012
2345678
3456
23 45
713?2
00123456789012
So I decide I want all the rows which are numerically 123456789012 and write a statement that looks something like this:
SELECT * FROM MyTable WHERE CAST(MyColumn as bigint) = 123456789012
It should return the first and last row, but instead the whole query blows up because it can't convert the "23 45" and "713?2" to bigint.
Is there another way to do the conversion that will return NULL for values that can't convert?
SQL Server does NOT guarantee boolean operator short-circuit, see On SQL Server boolean operator short-circuit. So all solutions using ISNUMERIC(...) AND CAST(...) are fundamentally flawed (they may work, but they can arbitrarily fail later depending on the generated plan). A better solution is using CASE, as Thomas suggests: CASE ISNUMERIC(...) WHEN 1 THEN CAST(...) ELSE NULL END. But, as gbn pointed out, ISNUMERIC is notoriously finicky in identifying what 'numeric' means, and in many cases where one would expect it to return 0 it returns 1. So, mixing the CASE with the LIKE:
CASE WHEN MyRow NOT LIKE '%[^0-9]%' THEN CAST(MyRow as bigint) ELSE NULL END
But the real problem is that if you have millions of rows and you have to search them like this, you'll always end up scanning end-to-end since the expression is not SARG-able (no matter how we rewrite it). The real issue here is data purity, and it should be addressed at the appropriate level, where the data is populated. Another thing to consider is whether it is possible to create a persisted computed column with this expression and create a filtered index on it which eliminates NULL (i.e. the non-numeric rows). That would speed things up a little.
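A minimal sketch of that computed-column idea, shown here with a plain index; the column and index names are hypothetical, and it assumes every all-digit value fits in a bigint:
-- NULL for anything that is not purely digits 0-9
ALTER TABLE MyTable ADD MyColumnAsBigint AS
    (CASE WHEN MyColumn NOT LIKE '%[^0-9]%' THEN CAST(MyColumn AS bigint) END) PERSISTED

CREATE INDEX IX_MyTable_MyColumnAsBigint ON MyTable (MyColumnAsBigint)

-- the search then becomes a seek on the new column
SELECT * FROM MyTable WHERE MyColumnAsBigint = 123456789012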
If you are using SQL Server 2012 you can use the two new functions:
TRY_CAST()
TRY_CONVERT()
Both are equivalent for this purpose: they return the value cast to the specified data type if the cast succeeds; otherwise, they return NULL. The only difference is that TRY_CONVERT is SQL Server specific while TRY_CAST follows the ANSI CAST syntax, so using TRY_CAST will make your code more portable (although I'm not sure whether any other database provider implements TRY_CAST).
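Applied to the question, a minimal sketch:
-- rows that can't be converted yield NULL and simply don't match
SELECT *
FROM MyTable
WHERE TRY_CAST(MyColumn AS bigint) = 123456789012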
ISNUMERIC will accept values like 1.23 or 5E-04 (and even a lone currency symbol), so it could be unreliable here.
And you don't know what order things will be evaluated in, so it could still fail (SQL is declarative, not procedural, so the WHERE clause probably won't be evaluated left to right).
So:
you want to accept values that consist only of the characters 0-9
you need to materialise the "number" filter so it's applied before the CAST
Something like:
SELECT
*
FROM
(
SELECT TOP 2000000000 *
FROM MyTable
WHERE MyColumn NOT LIKE '%[^0-9]%' --double negative rejects anything except 0-9
ORDER BY MyColumn
) foo
WHERE
CAST(MyColumn as bigint) = 123456789012 --applied after number check
Edit: quick example that fails.
CREATE TABLE #foo (bigintstring varchar(100))
INSERT #foo (bigintstring) VALUES ('1.23')
INSERT #foo (bigintstring) VALUES ('1 23')
INSERT #foo (bigintstring) VALUES ('123')
SELECT * FROM #foo
WHERE
ISNUMERIC(bigintstring) = 1
AND
CAST(bigintstring AS bigint) = 123
SELECT *
FROM MyTable
WHERE ISNUMERIC(MyRow) = 1
AND CAST(MyRow as float) = 123456789012
The ISNUMERIC() function should give you what you need.
SELECT * FROM MyTable
WHERE ISNUMERIC(MyRow) = 1
AND CAST(MyRow as bigint) = 123456789012
And to add a case statement like Thomas suggested:
SELECT * FROM MyTable
WHERE CASE ISNUMERIC(MyRow)
WHEN 1 THEN CAST(MyRow as bigint)
ELSE NULL
END = 123456789012
http://msdn.microsoft.com/en-us/library/ms186272.aspx
SELECT *
FROM MyTable
WHERE (ISNUMERIC(MyColumn) = 1) AND (CAST(MyColumn as bigint) = 123456789012)
Additionally, you can use a CASE expression in order to get NULL values.
SELECT
CASE
WHEN (ISNUMERIC(MyColumn) = 1) THEN CAST(MyColumn as bigint)
ELSE NULL
END AS 'MyColumnAsBigInt'
FROM tableName
If you require additional filtering, for numerics which are not valid to be cast to bigint, you can use the following instead of ISNUMERIC:
PATINDEX('%[^0-9]%', MyColumn) = 0
If you need decimal values instead of integers, cast to float instead and change the pattern to '%[^0-9.]%'
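For example, plugged into the CASE expression above (same hypothetical column and table names):
SELECT
CASE
    WHEN PATINDEX('%[^0-9]%', MyColumn) = 0 THEN CAST(MyColumn as bigint)
    ELSE NULL
END AS 'MyColumnAsBigInt'
FROM tableName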

How does ANSI_NULLS work in TSQL?

SET ANSI_NULLS OFF seems to give different results in T-SQL depending on whether you're comparing a field from a table or a value. Can anyone help me understand why the last two of my queries give no results? I'm not looking for a solution, just an explanation.
select 1 as 'Col' into #a
select NULL as 'Col' into #b
--This query gives results, as expected.
SET ANSI_NULLS OFF
select * from #b
where NULL = Col
--This query gives results, as expected.
SET ANSI_NULLS OFF
select * from #a
where NULL != Col
--This workaround gives results, too.
select * from #a a, #b b
where isnull(a.Col, '') != isnull(b.Col, '')
--This query gives no results, why?
SET ANSI_NULLS OFF
select * from #a a, #b b
where a.Col != b.Col
--This query gives no results, why?
SET ANSI_NULLS OFF
select * from #a a, #b b
where b.Col != a.Col
The reason the last two queries fail is that SET ANSI_NULLS ON/OFF only applies when you are comparing against a variable or the NULL value. It does not apply when you are comparing column values. From the BOL:
SET ANSI_NULLS ON affects a comparison only if one of the operands of the comparison is either a variable that is NULL or a literal NULL. If both sides of the comparison are columns or compound expressions, the setting does not affect the comparison.
Anything compared to a null value fails. Even comparing two null values will fail. Even the != will fail because of the (IMHO) stupid handling of NULL.
That said, the != queries could be rewritten to say:
select * from #a a where a.Col not in (select b.Col from #b b)
The last query is identical to the second to last query as the order of the comparison doesn't matter.
Incidentally, your workaround works simply because you are testing for a null value in the #b.Col column and explicitly converting it to a '' which then allows your query to do a string compare between them. An alternative way of writing that would be:
select * from #a a, #b b
where a.Col != COALESCE(b.Col, '')
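If you'd rather not substitute '' (which could collide with real data), a NULL-safe way to write the inequality is the EXISTS/INTERSECT idiom, which treats two NULLs as equal; a minimal sketch:
select * from #a a, #b b
where not exists (select a.Col intersect select b.Col)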

How to refactor this sql query

I have a lengthy query here, and I am wondering whether it could be refactored.
Declare @A1 as int
Declare @A2 as int
...
Declare @A50 as int
SET @A1 = (Select id from table where code='ABC1')
SET @A2 = (Select id from table where code='ABC2')
...
SET @A50 = (Select id from table where code='ABC50')
Insert into tableB
Select
Case when @A1='somevalue' Then 'x' else 'y' End,
Case when @A2='somevalue' Then 'x' else 'y' End,
...
Case when @A50='somevalue' Then 'x' else 'y' End
From tableC inner join ......
So as you can see from the above, there is quite a lot of redundant code, but I cannot think of a way to make it simpler.
Any help is appreciated.
If you need the variables assigned, you could pivot your table...
SELECT *
FROM
(
SELECT Code, Id
FROM Table
) t
PIVOT
(MAX(Id) FOR Code IN ([ABC1],[ABC2],[ABC3],[ABC50])) p /* List them all here */
;
...and then assign them accordingly.
SELECT @A1 = [ABC1], @A2 = [ABC2]
FROM
(
SELECT Code, Id
FROM Table
) t
PIVOT
(MAX(Id) FOR Code IN ([ABC1],[ABC2],[ABC3],[ABC50])) p /* List them all here */
;
But I doubt you actually need to assign them at all. I just can't really picture what you're trying to achieve.
Pivoting may help you, as you can still use the CASE statements (see the sketch below).
Rob
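Concretely, the pivoted one-row result could be joined straight into the INSERT while keeping the CASE expressions; a sketch using the question's table names (the question's other joins are elided here, as in the original):
INSERT INTO tableB
SELECT
    CASE WHEN ids.[ABC1] = 'somevalue' THEN 'x' ELSE 'y' END,
    CASE WHEN ids.[ABC2] = 'somevalue' THEN 'x' ELSE 'y' END
    /* ...repeat up to [ABC50] */
FROM tableC
CROSS JOIN
(
    SELECT [ABC1], [ABC2] /* ..., [ABC50] */
    FROM (SELECT Code, Id FROM [Table]) t
    PIVOT (MAX(Id) FOR Code IN ([ABC1], [ABC2] /* ..., [ABC50] */)) p
) AS ids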
Without taking the time to develop a full answer, I would start by trying:
select id from table where code in ('ABC1', ... ,'ABC50')
then pivot that, to get a one-row result set of columns ABC1 through ABC50 with ID values.
Join that row in the FROM.
If 'somevalue', 'x' and 'y' are constant for all fifty expressions, then start from:
select case id when 'somevalue' then 'x' else 'y' end as XY
from table
where code in ('ABC1', ... ,'ABC50')
I am not entirely sure from your example, but it looks like you should be able to do one of a few things.
Create a lookup table that tells you, for a given value of the select statement, what should be placed there. This would be much shorter and should be insanely fast (a sketch follows below).
Create a simple for loop in your code and generate a list of 50 small queries.
Use sub-selects, or generate a list of selects with one round trip to retrieve your @A1-@A50 values, and then generate the query with them already in place.
Jacob
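A minimal sketch of the lookup-table idea; the mapping table dbo.CodeOutputMap and its contents are hypothetical, and this yields one row per code rather than fifty columns (it could still be pivoted as in the other answers):
-- hypothetical mapping from code to the id value that should produce 'x'
CREATE TABLE dbo.CodeOutputMap (code varchar(10) PRIMARY KEY, somevalue int)
INSERT dbo.CodeOutputMap (code, somevalue)
VALUES ('ABC1', 1), ('ABC2', 2) /* ...up to ABC50 */

SELECT t.code,
       CASE WHEN t.id = m.somevalue THEN 'x' ELSE 'y' END AS output_value
FROM [table] t
JOIN dbo.CodeOutputMap m ON m.code = t.code
WHERE t.code IN ('ABC1', 'ABC2' /* ..., 'ABC50' */)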

How to combine variable assignment with data-retrieval operations in T-SQL

Just to clarify, I'm running Sybase 12.5.3, but I am led to believe that this holds true for SQL Server 2005 too. Basically, I'm trying to write a query that looks a little like this; I've simplified it as much as possible to highlight the problem:
DECLARE @a int, @b int, @c int
SELECT
@a = huzzah.a
,@b = huzzah.b
,@c = huzzah.c
FROM (
SELECT
1 a
,2 b
,3 c
) huzzah
This query gives me the following error: "Error:141 A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations."
The only workaround I've got for this so far is to insert the derived-table data into a temporary table and then select it right back out again, which works fine, but the fact that this doesn't work irks me. Is there a better way to do this?
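For reference, a minimal sketch of that temp-table workaround in SQL Server syntax (the temp table name is arbitrary; Sybase may need an explicit CREATE TABLE plus INSERT instead):
DECLARE @a int, @b int, @c int

-- materialise the derived table first
SELECT 1 a, 2 b, 3 c
INTO #huzzah

-- then the variable assignment is a plain SELECT from a table
SELECT @a = huzzah.a
      ,@b = huzzah.b
      ,@c = huzzah.c
FROM #huzzah huzzah

DROP TABLE #huzzah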
The error does appear as described in 12.5.3 ESD 4 & 7; it runs fine in 12.5.4 ESD 4 & 6.
Looks like a bug that's been patched, so your only options seem to be the workaround or a patch.
I have found what appears to be the bug: 377625.
I've just run your code against 12.5.3 and it parses fine... it doesn't return anything, but it does run. Have you maybe simplified the problem a bit too much? Because I'm not seeing any error messages at all.
Just to be clear, the following runs and returns what you'd expect.
DECLARE @a int, @b int, @c int
SELECT
@a = huzzah.a
,@b = huzzah.b
,@c = huzzah.c
FROM (
SELECT
1 a
,2 b
,3 c
) huzzah
select @a
select @b
select @c