How to combine variable assignment with data-retrieval operations in T-SQL

Just to clarify, I'm running Sybase 12.5.3, but I am led to believe that this holds true for SQL Server 2005 too. Basically, I'm trying to write a query that looks a little like this; I've simplified it as much as possible to highlight the problem:
DECLARE @a int, @b int, @c int
SELECT
@a = huzzah.a
,@b = huzzah.b
,@c = huzzah.c
FROM (
SELECT
1 a
,2 b
,3 c
) huzzah
This query gives me the following error: "Error:141 A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations."
The only workaround I've got for this so far is to insert the derived-table data into a temporary table and then select it right back out again. That works fine, but the fact that the direct assignment doesn't irks me. Is there a better way to do this?
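For reference, a minimal sketch of that temp-table workaround, assuming SELECT INTO is permitted for temp tables on the server:
-- Materialise the derived table first, then assign from it.
SELECT 1 a, 2 b, 3 c
INTO #huzzah

DECLARE @a int, @b int, @c int
SELECT @a = a, @b = b, @c = c FROM #huzzah
DROP TABLE #huzzah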

The error does appear as described in 12.5.3 ESD 4 and 7; it runs fine in 12.5.4 ESD 4 and 6.
Looks like a bug that's been patched; your only options seem to be the workaround or a patch.
I have found what appears to be the bug: 377625.

I've just run your code against 12.5.3 and it parses fine; it doesn't return anything, but it does run. Have you maybe simplified the problem a bit too much? I'm not seeing any error messages at all.
Just to be clear, the following runs and returns what you'd expect.
DECLARE @a int, @b int, @c int
SELECT
@a = huzzah.a
,@b = huzzah.b
,@c = huzzah.c
FROM (
SELECT
1 a
,2 b
,3 c
) huzzah
select @a
select @b
select @c

Related

TSQL - Parsing substring out of larger string

I have a bunch of rows with values that look like the example below. It's a JSON extract that I unfortunately have to parse out and load. My JSON parsing tool for some reason doesn't want to parse this full column out, so I need to do it in T-SQL. I only need the unique_id field:
[{"unique_id":"12345","system_type":"Test System."}]
I tried the SQL below, but it only returns the first 5 characters of the column. I know what the issue is: I need to tell the substring to continue until the fourth set of quotes, which comes after the value. I'm not sure how to code the substring like that.
select substring([jsonfield],CHARINDEX('[{"unique_id":"',[jsonfield]),
CHARINDEX('"',[jsonfield]) - CHARINDEX('[{"unique_id":"',[jsonfield]) +
LEN('"')) from etl.my_test_table
Can anyone help me with this?
Thank you, I appreciate it!
Since you tagged 2016, why not use OPENJSON()?
Here's an example:
DECLARE @TestData TABLE
(
[SampleData] NVARCHAR(MAX)
);
INSERT INTO @TestData (
[SampleData]
)
VALUES ( N'[{"unique_id":"12345","system_type":"Test System."}]' )
,( N'[{"unique_id":"1234567","system_type":"Test System."},{"unique_id":"1234567_2","system_type":"Test System."}]' )
SELECT b.[unique_id]
FROM @TestData [a]
CROSS APPLY
OPENJSON([a].[SampleData], '$')
WITH (
[unique_id] NVARCHAR(100) '$.unique_id'
) AS [b];
Giving you:
unique_id
---------------
12345
1234567
1234567_2
You can get all the fields as well, just add them to the WITH clause:
SELECT [b].[unique_id]
, [b].[system_type]
FROM @TestData [a]
CROSS APPLY
OPENJSON([a].[SampleData], '$')
WITH (
[unique_id] NVARCHAR(100) '$.unique_id'
, [system_type] NVARCHAR(100) '$.system_type'
) AS [b];
Take it step by step
First get everything to the left of system_type
SELECT LEFT(jsonfield, CHARINDEX('","system_type":"', jsonfield) - 1) as s
FROM -- etc
Then take everything to the right of "unique_id":"
SELECT RIGHT(S, LEN(S) - (CHARINDEX('"unique_id":"',S) + 12)) as Result
FROM (
SELECT LEFT(jsonfield, CHARINDEX('","system_type":"', jsonfield) - 1) as s
FROM -- etc
) X
Note, I did not test this so it could be off by one or have a syntax error, but you get the idea.
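Putting the two steps into a single expression, here's an untested sketch in the same spirit (13 is the length of '"unique_id":"', and it assumes every row has a unique_id followed by system_type):
SELECT SUBSTRING(jsonfield,
    CHARINDEX('"unique_id":"', jsonfield) + 13,          -- start just past the opening pattern
    CHARINDEX('","system_type"', jsonfield)
        - (CHARINDEX('"unique_id":"', jsonfield) + 13))  -- length up to the closing quote
    AS unique_id
FROM etl.my_test_table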
If your larger string is just a simple JSON as posted, the solution is very easy:
SELECT
JSON_VALUE(N'[{"unique_id":"12345","system_type":"Test System."}]','$[0].unique_id');
JSON_VALUE() needs SQL Server 2016 and will extract a single value from the specified path.
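Note that the path '$[0]' only reads the first element of the array; for rows holding more than one object (like the second sample above) you'd need OPENJSON as shown earlier. A quick sketch of the limitation:
-- Only reaches one element per call; '$[1]' reads the second object.
SELECT JSON_VALUE(N'[{"unique_id":"1234567","system_type":"Test System."},{"unique_id":"1234567_2","system_type":"Test System."}]',
                  '$[1].unique_id');  -- returns 1234567_2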

Initializing multiple variables with one SELECT statement

It is a well-known practice to do this:
DECLARE @A INT
,@B INT
SELECT @A = [Column01]
,@B = [Column02]
FROM [dbo].[data]
I am wondering whether it is true for the following code:
DECLARE @A INT
,@B INT
SELECT @A = [Column01]
,@B = @A + [Column02]
FROM [dbo].[data]
that @A always gets its value before @B?
In my real case, [Column01] and [Column02] are expressions with many columns and T-SQL functions, and using @A as a reference simplifies the initialization of @B.
that @A always gets its value before @B?
The answer is no.
Quoted from Itzik Ben-Gan's Microsoft SQL Server 2012 T-SQL Fundamentals
SQL supports a concept called all-at-once operations, which means that all expressions that appear in the same logical query processing phase are evaluated logically at the same point in time.
Please also check All-at-Once Operations in T-SQL
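Given all-at-once semantics, if @B genuinely depends on @A, the only guaranteed-safe options are to repeat the expression or to split the work into two statements. A minimal sketch of the latter, at the cost of a second read of the table:
DECLARE @A INT
,@B INT
SELECT @A = [Column01]
FROM [dbo].[data]

-- @A is now fully assigned before it is referenced.
SELECT @B = @A + [Column02]
FROM [dbo].[data]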
In my experience so far this has always worked 'top down' for me. However, I too feel a bit queasy whenever I write something like that, and I have been known to split it into two separate commands on my more paranoid days; maybe I should have more of those =)
That said, the question could be extended to the following syntaxes, which I 'assume from experience' are correct, but again I wonder whether anyone has a more definitive answer:
DECLARE @a int,
@b int,
@x int
SELECT @a = (CASE name WHEN 'A' THEN value ELSE @a END),
@b = (CASE name WHEN 'B' THEN value ELSE @b END)
FROM myTable
WHERE name IN ('A', 'B')
which gives the same result as below but is quite a bit faster, especially if you have to fetch many of them
SELECT @a = value FROM myTable WHERE name = 'A'
SELECT @b = value FROM myTable WHERE name = 'B'
Or, this one:
DECLARE @a int = 8,
@b int = 5,
@x int
UPDATE myTable
SET @x = @a * leftField + @b * rightField,
mySum = @x,
mySquare = Power(@x, 2)
WHERE ...
Here I use @x to calculate an intermediate value for a given record and then use that value later on to set a field, or as part of another formula. (I agree it's a stupid example, but right now I can't come up with something more sensible.)
Or this one, which now seems to be generally accepted as 'OK', though I do remember the days when this would go haywire if you added an ORDER BY to it, which is often a requirement.
DECLARE @a varchar(max)
SELECT @a = (CASE WHEN @a IS NULL THEN name ELSE @a + ',' + name END)
FROM sys.objects
WHERE type = 'S'
SELECT @a
UPDATE: these things (likely) work fine as long as the datasets are small, but odd things start to happen when the size of the data grows and the QO decides to use a different plan, multi-threading, etc. So I tried to 'break' things by setting up a largish table that would no longer fit in memory and then seeing what would happen. I took a simple example that could easily be split into multiple threads. The result can be found here, but you'll need to adapt it to your hardware, of course (please don't bring sqlFiddle to its knees!). The results so far are that things keep working, but the query plans differ depending on the query we run (with or without @x and @y)!
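For the concatenation case in particular, the commonly recommended alternative that respects ORDER BY is the FOR XML PATH idiom; a sketch equivalent to the example above (ignoring XML entitization of special characters in the names):
DECLARE @a varchar(max)
SELECT @a = STUFF((SELECT ',' + name
                   FROM sys.objects
                   WHERE type = 'S'
                   ORDER BY name
                   FOR XML PATH('')), 1, 1, '')  -- strip the leading comma
SELECT @a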

Recursive replace from a table of characters

In short, I am looking for a single recursive query that can perform multiple replaces over one string. I have a notion it can be done, but am failing to wrap my head around it.
Granted, I'd prefer the biz-layer of the application, or even the CLR, to do the replacing, but these are not options in this case.
More specifically, I want to replace the mess below - which is copy-and-pasted in 8 different stored procedures - with a TVF.
SET @temp = REPLACE(RTRIM(@target), '~', '-')
SET @temp = REPLACE(@temp, '''', '-')
SET @temp = REPLACE(@temp, '!', '-')
SET @temp = REPLACE(@temp, '@', '-')
SET @temp = REPLACE(@temp, '#', '-')
-- 23 additional lines redacted
SET @target = @temp
Here is where I've started:
-- I have a split string TVF called tvf_SplitString that takes a string
-- and a splitter, and returns a table with one row for each element.
-- EDIT: tvf_SplitString returns a two-column table: pos, element, of which
-- pos is simply the row_number of the element.
SELECT REPLACE('A~B!C#D#C!B~A', MM.ELEMENT, '-') TGT
FROM dbo.tvf_SplitString('~-''-!-#-@', '-') MM
Notice I've joined all the offending characters into a single string separated by '-' (knowing that '-' will never be one of the offending characters), which is then split. The result from this query looks like:
TGT
------------
A-B!C#D#C!B-A
A~B!C#D#C!B~A
A~B-C#D#C-B~A
A~B!C-D-C!B~A
A~B!C#D#C!B~A
So, the replace clearly works, but now I want it to be recursive so I can pull the top 1 and eventually come out with:
TGT
------------
A-B-C-D-C-B-A
Any ideas on how to accomplish this with one query?
EDIT: Well, actual recursion isn't necessary if there's another way. I'm pondering the use of a table of numbers here, too.
You can use this in a scalar function. I use it to remove all control characters from some external input.
SELECT @target = REPLACE(@target, invalidChar, '-')
FROM (VALUES ('~'),(''''),('!'),('@'),('#')) AS T(invalidChar)
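As a sketch of that scalar-function wrapper (the function name is just an example, and the VALUES row constructor needs SQL Server 2008 or later):
CREATE FUNCTION dbo.fn_CleanTarget (@target varchar(4000))
RETURNS varchar(4000)
AS
BEGIN
    -- Replace every offending character with '-' in one statement.
    SELECT @target = REPLACE(@target, invalidChar, '-')
    FROM (VALUES ('~'),(''''),('!'),('@'),('#')) AS T(invalidChar)
    RETURN RTRIM(@target)
END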
I figured it out. I failed to mention that the tvf_SplitString function returns a row number as "pos" (although a subquery assigning ROW_NUMBER could also have worked). With that fact, I could control the cross join between the recursive call and the split.
-- The cast to varchar(max) matches the output of the TVF; otherwise you get an error.
-- The iteration counter is joined to the row number from the split-string
-- function so that each iteration replaces only one character.
WITH XX AS (SELECT CAST('A~B!C#D#C!B~A' AS VARCHAR(MAX)) TGT, 1 RN
UNION ALL
SELECT REPLACE(XX.TGT, MM.ELEMENT, '-'), RN + 1 RN
FROM XX, dbo.tvf_SplitString('~-''-!-#-@', '-') MM
WHERE XX.RN = MM.pos)
SELECT TOP 1 XX.TGT
FROM XX
ORDER BY RN DESC
Still, I'm open to other suggestions.

SQL invalid conversion return null instead of throwing error

I have a table with a varchar column, and I want to find values that match a certain number. So let's say that column contains the following entries (except with millions of rows in real life):
123456789012
2345678
3456
23 45
713?2
00123456789012
So I decide I want all the rows that are numerically 123456789012 and write a statement that looks something like this:
SELECT * FROM MyTable WHERE CAST(MyColumn as bigint) = 123456789012
It should return the first and last row, but instead the whole query blows up because it can't convert the "23 45" and "713?2" to bigint.
Is there another way to do the conversion that will return NULL for values that can't convert?
SQL Server does NOT guarantee boolean operator short-circuit; see On SQL Server boolean operator short-circuit. So all solutions using ISNUMERIC(...) AND CAST(...) are fundamentally flawed (they may work, but they can arbitrarily fail later depending on the generated plan). A better solution is using CASE, as Thomas suggests: CASE ISNUMERIC(...) WHEN 1 THEN CAST(...) ELSE NULL END. But, as gbn pointed out, ISNUMERIC is notoriously finicky in identifying what 'numeric' means, and in many cases where one would expect it to return 0 it returns 1. So, mixing the CASE with LIKE:
CASE WHEN MyRow NOT LIKE '%[^0-9]%' THEN CAST(MyRow as bigint) ELSE NULL END
But the real problem is that if you have millions of rows and you have to search them like this, you'll always end up scanning end to end, since the expression is not SARGable (no matter how we rewrite it). The real issue here is data purity, and it should be addressed at the appropriate level, where the data is populated. Another thing to consider is whether it is possible to create a persisted computed column with this expression and index it so that NULLs (i.e. non-numeric values) are skipped quickly. That would speed things up a little.
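A sketch of the computed-column idea, with hypothetical object names (filtered-index predicates don't support computed columns, so a plain index is used here; the LEN guard avoids overflow on strings too long for bigint):
ALTER TABLE MyTable ADD MyColumnAsBigint AS
    (CASE WHEN MyColumn NOT LIKE '%[^0-9]%' AND LEN(MyColumn) BETWEEN 1 AND 18
          THEN CAST(MyColumn AS bigint) END) PERSISTED

CREATE INDEX IX_MyTable_MyColumnAsBigint ON MyTable (MyColumnAsBigint)

-- The search can now seek on the index; non-numeric rows are NULL and never match.
SELECT * FROM MyTable WHERE MyColumnAsBigint = 123456789012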
If you are using SQL Server 2012 or later, you can use the two new functions:
TRY_CAST()
TRY_CONVERT()
Both methods are equivalent. They return a value cast to the specified data type if the cast succeeds; otherwise, they return NULL. The only difference is that CONVERT is SQL Server specific while CAST is ANSI; using CAST will make your code more portable (although I'm not sure whether any other database vendor implements TRY_CAST).
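Applied to the original query, a minimal example:
-- Rows that cannot be cast come back as NULL and simply fail the equality test.
SELECT *
FROM MyTable
WHERE TRY_CAST(MyColumn AS bigint) = 123456789012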
ISNUMERIC will accept an empty string and values like 1.23 or 5E-04, so it could be unreliable.
And you don't know what order things will be evaluated in, so it could still fail (SQL is declarative, not procedural, so the WHERE clause probably won't be evaluated left to right).
So:
you want to accept values that consist only of the characters 0-9
you need to materialise the "number" filter so it's applied before the CAST
Something like:
SELECT
*
FROM
(
SELECT TOP 2000000000 *
FROM MyTable
WHERE MyColumn NOT LIKE '%[^0-9]%' --double negative rejects anything except 0-9
ORDER BY MyColumn
) foo
WHERE
CAST(MyColumn as bigint) = 123456789012 --applied after number check
Edit: a quick example that fails:
CREATE TABLE #foo (bigintstring varchar(100))
INSERT #foo (bigintstring) VALUES ('1.23')
INSERT #foo (bigintstring) VALUES ('1 23')
INSERT #foo (bigintstring) VALUES ('123')
SELECT * FROM #foo
WHERE
ISNUMERIC(bigintstring) = 1
AND
CAST(bigintstring AS bigint) = 123
SELECT *
FROM MyTable
WHERE ISNUMERIC(MyRow) = 1
AND CAST(MyRow as float) = 123456789012
The ISNUMERIC() function should give you what you need.
SELECT * FROM MyTable
WHERE ISNUMERIC(MyRow) = 1
AND CAST(MyRow as bigint) = 123456789012
And to add a case statement like Thomas suggested:
SELECT * FROM MyTable
WHERE CASE ISNUMERIC(MyRow)
WHEN 1 THEN CAST(MyRow as bigint)
ELSE NULL
END = 123456789012
http://msdn.microsoft.com/en-us/library/ms186272.aspx
SELECT *
FROM MyTable
WHERE (ISNUMERIC(MyColumn) = 1) AND (CAST(MyColumn as bigint) = 123456789012)
Additionally, you can use a CASE statement to get NULL values.
SELECT
CASE
WHEN (ISNUMERIC(MyColumn) = 1) THEN CAST(MyColumn as bigint)
ELSE NULL
END AS 'MyColumnAsBigInt'
FROM tableName
If you require additional filtering, for numerics that are not valid to cast to bigint, you can use the following instead of ISNUMERIC:
PATINDEX('%[^0-9]%', MyColumn) = 0
If you need decimal values instead of integers, cast to float instead and change the pattern to '%[^0-9.]%'
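Putting that together, a sketch of the full query with PATINDEX in place of ISNUMERIC (the empty-string check is added because '' contains no invalid characters yet casts to 0):
SELECT
CASE
WHEN PATINDEX('%[^0-9]%', MyColumn) = 0 AND MyColumn <> ''
THEN CAST(MyColumn as bigint)
ELSE NULL
END AS 'MyColumnAsBigInt'
FROM tableName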

How to refactor this sql query

I have a lengthy query here, and I am wondering whether it could be refactored.
Declare @A1 as int
Declare @A2 as int
...
Declare @A50 as int
SET @A1 = (Select id from table where code='ABC1')
SET @A2 = (Select id from table where code='ABC2')
...
SET @A50 = (Select id from table where code='ABC50')
Insert into tableB
Select
Case when @A1='somevalue' Then 'x' else 'y' End,
Case when @A2='somevalue' Then 'x' else 'y' End,
..
Case when @A50='somevalue' Then 'x' else 'y' End
From tableC inner join ......
So as you can see from the above, there is quite a lot of redundant code, but I cannot think of a way to make it simpler.
Any help is appreciated.
If you need the variables assigned, you could pivot your table...
SELECT *
FROM
(
SELECT Code, Id
FROM Table
) t
PIVOT
(MAX(Id) FOR Code IN ([ABC1],[ABC2],[ABC3],[ABC50])) p /* List them all here */
;
...and then assign them accordingly.
SELECT @A1 = [ABC1], @A2 = [ABC2]
FROM
(
SELECT Code, Id
FROM Table
) t
PIVOT
(MAX(Id) FOR Code IN ([ABC1],[ABC2],[ABC3],[ABC50])) p /* List them all here */
;
But I doubt you actually need to assign them at all. I just can't really picture what you're trying to achieve.
Pivoting may help you, as you can still use the CASE statements.
Rob
Without taking the time to develop a full answer, I would start by trying:
select id from table where code in ('ABC1', ... ,'ABC50')
then pivot that to get a one-row result set of columns ABC1 through ABC50 with the ID values.
Join that row in the FROM.
If 'somevalue', 'x' and 'y' are constant for all fifty expressions, then start from:
select case id when 'somevalue' then 'x' else 'y' end as XY
from table
where code in ('ABC1', ... ,'ABC50')
I am not entirely sure from your example, but it looks like you should be able to do one of a few things:
1. Create a lookup table that tells you, for a given value in the select statement, what should be placed there. This would be much shorter and should be insanely fast.
2. Create a simple for loop in your code and generate a list of 50 small queries.
3. Use sub-selects, or generate a list of selects with one round trip, to retrieve your @A1-@A50 values and then generate the query with them already in place (see the sketch below).
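A sketch of the single-round-trip idea from option 3, using conditional aggregation instead of PIVOT:
DECLARE @A1 int, @A2 int /* ...through @A50 */
-- One pass over the table; MAX ignores the NULLs the CASE produces for other codes.
SELECT @A1 = MAX(CASE WHEN code = 'ABC1' THEN id END),
       @A2 = MAX(CASE WHEN code = 'ABC2' THEN id END)
       -- ...one line per code, up to ABC50
FROM [table]
WHERE code IN ('ABC1', 'ABC2' /* ...,'ABC50' */)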
Jacob