I have a lengthy query here and am wondering whether it could be refactored.
DECLARE @A1 INT
DECLARE @A2 INT
...
DECLARE @A50 INT
SET @A1 = (SELECT id FROM table WHERE code = 'ABC1')
SET @A2 = (SELECT id FROM table WHERE code = 'ABC2')
...
SET @A50 = (SELECT id FROM table WHERE code = 'ABC50')
INSERT INTO tableB
SELECT
    CASE WHEN @A1 = 'somevalue' THEN 'x' ELSE 'y' END,
    CASE WHEN @A2 = 'somevalue' THEN 'x' ELSE 'y' END,
    ...
    CASE WHEN @A50 = 'somevalue' THEN 'x' ELSE 'y' END
FROM tableC INNER JOIN ......
As you can see, there is quite a lot of redundant code, but I cannot think of a way to make it simpler.
Any help is appreciated.
If you need the variables assigned, you could pivot your table...
SELECT *
FROM
(
SELECT Code, Id
FROM Table
) t
PIVOT
(MAX(Id) FOR Code IN ([ABC1],[ABC2],[ABC3],[ABC50])) p /* List them all here */
;
...and then assign them accordingly.
SELECT @A1 = [ABC1], @A2 = [ABC2]
FROM
(
SELECT Code, Id
FROM Table
) t
PIVOT
(MAX(Id) FOR Code IN ([ABC1],[ABC2],[ABC3],[ABC50])) p /* List them all here */
;
But I doubt you actually need to assign them at all. I just can't really picture what you're trying to achieve.
Pivoting may help you, as you can still use the CASE statements.
Rob
Without taking the time to develop a full answer, I would start by trying:
select id from table where code in ('ABC1', ... ,'ABC50')
then pivot that to get a one-row result set with columns ABC1 through ABC50 holding the ID values.
Join that row in the FROM clause.
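For illustration only, here is a rough sketch of that shape; lookupTable stands in for the table the fifty ids come from, the 'somevalue'/'x'/'y' literals are the ones from the question, and the ABC lists are elided:
SELECT
    CASE WHEN p.ABC1 = 'somevalue' THEN 'x' ELSE 'y' END,
    CASE WHEN p.ABC2 = 'somevalue' THEN 'x' ELSE 'y' END
    /* ... through ABC50 ... */
FROM tableC
CROSS JOIN
(
    SELECT *
    FROM (SELECT code, id FROM lookupTable WHERE code IN ('ABC1', 'ABC2' /* ... */)) t
    PIVOT (MAX(id) FOR code IN ([ABC1], [ABC2] /* ... */)) pv
) AS p
The pivoted derived table is a single row, so the CROSS JOIN simply attaches the fifty values to every row of tableC and the CASE expressions stay as they were.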
If 'somevalue', 'x' and 'y' are constant for all fifty expressions, then start from:
select case id when 'somevalue' then 'x' else 'y' end as XY
from table
where code in ('ABC1', ... ,'ABC50')
I am not entirely sure from your example, but it looks like you should be able to do one of a few things.
Create a lookup table that tells you, for a given value, what should be placed in the select output. This would be much shorter and should be insanely fast (see the sketch below).
Create a simple for loop in your application code and generate the list of 50 small queries.
Use sub-selects, or retrieve your @A1-@A50 values with one round trip and then generate the query with them already in place.
Jacob
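A minimal sketch of the first option, assuming the output character really does depend only on the code; the table and column names here are made up for illustration:
-- hypothetical mapping table: one row per code, holding the value to emit
CREATE TABLE dbo.CodeResult (
    code   VARCHAR(10) PRIMARY KEY,   -- 'ABC1' ... 'ABC50'
    result CHAR(1) NOT NULL           -- the 'x' or 'y' wanted for that code
);
-- each "CASE WHEN @A1 = 'somevalue' THEN 'x' ELSE 'y' END" then collapses to a lookup
SELECT result FROM dbo.CodeResult WHERE code = 'ABC1';
Whether this fits depends on how the fifty values are consumed downstream, but it removes both the variable declarations and the CASE expressions.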
I have a subquery like this
with subquery as (select host from table_A where << some condition >>)
In my main query I am querying data from another table called table_B, one of whose columns is destination_host. I need to check whether destination_host is in the list returned by my subquery: if it is, I want to output typeA in my select statement, otherwise typeB. My select statement looks something like:
select name, place, destination_host
from table_B
where <<some condition>>
I want to output a fourth column, say host_category, based on a condition check: if the destination_host value exists in the subquery then the value should be typeA, otherwise typeB. Can you help me understand how to write this? I understand that it is hard to provide guidance without actual data to work with.
I tried using case statements such as this one:
case
    when (destination_host in (select host from subquery)) then 'typeA'
    when (destination_host not in (select host from subquery)) then 'typeB'
end as host_category
but I don't think this is the way to solve this problem.
I would use EXISTS:
WITH subquery AS (...)
SELECT CASE WHEN EXISTS (SELECT 1 FROM subquery
WHERE subquery.host = table_b.destination_host)
THEN 'typeA'
ELSE 'typeB'
END
FROM table_b;
With queries like that, you have to take care of NULL values. If table_b.destination_host is NULL, the row will always show up as typeB, because NULL = NULL is not TRUE in SQL.
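To fold that into the column list from the question, here is a sketch that keeps the <<some condition>> placeholders as written and names the CTE subquery as above:
WITH subquery AS (SELECT host FROM table_A WHERE <<some condition>>)
SELECT name, place, destination_host,
       CASE WHEN EXISTS (SELECT 1 FROM subquery s
                         WHERE s.host = table_B.destination_host)
            THEN 'typeA'
            ELSE 'typeB'   -- rows with a NULL destination_host also land here
       END AS host_category
FROM table_B
WHERE <<some condition>>;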
I'm wondering if anybody can help me out with any or all of this code below. I've made it work, but it seems inefficient to me and is probably quite a bit slower than optimal.
Some basic background on the necessity of this code in the first place:
I have a table of shipping records that does not include the corresponding invoice number. I've looked all through the tables and I continue to do so. In fact, only this morning I discovered that if a packing slip has been generated, I can link the shipping table to the packing slip table via that packing slip ID and grab the invoice number from there. Absent that link, however, I'm forced to guess. In most instances that's not terribly difficult, because the invoice table has number, line and release that can match up. But when there are multiple shipments for a number, line and release (for instance, when a line is partially shipped) there can be multiple answers, only one of which is correct. I am partially helped by the presence of a column in the shipping table that states what the date sequence is for that number, line and release, but there are still circumstances where the process I use for "guessing" can be somewhat ambiguous.
What my procedure does is this. First, it creates a table of data that includes the invoice number if there was a pack slip to link it through.
Next, it dumps all of that data into a second table; where the invoice was NULL in the first table, it uses a "guess" about the invoice number, made by partitioning all the shipping records by number, line, release, date sequence and date, doing the same for the invoice table, and lining the two up by date.
Finally, it parses through that table and finds any last nulls and essentially matches them up with the first record of any invoice for that number, line and release.
Both guesses have added characters to show that they are, in fact, guesses.
-- setup: temp table for the pack-slip-linked data, table variable for the final result
IF OBJECT_ID('tempdb..#cosTable') IS NOT NULL
    DROP TABLE #cosTable
DECLARE @cosTable2 TABLE (
    ID INT IDENTITY
    ,co_num CoNumType
    ,co_line CoLineType
    ,co_release CoReleaseType
    ,date_seq DateSeqType
    ,ship_date DateType
    ,inv_num NVARCHAR(14)
)
DECLARE
    @co_num_ck CoNumType
    ,@co_line_ck CoLineType
    ,@co_release_ck CoReleaseType
DECLARE @Counter1 INT = 0
-- Step 1: shipping rows, plus the invoice number wherever a packing slip links to it
SELECT cos.co_num, cos.co_line, cos.co_release, cos.date_seq, cos.ship_date, cos.qty_invoiced, pck.inv_num
INTO #cosTable
FROM co_ship cos
LEFT JOIN pckitem pck
ON cos.pack_num = pck.pack_num
AND cos.co_num = pck.co_num
AND cos.co_line = pck.co_line
AND cos.co_release = pck.co_release
-- Step 2: for rows with no pack-slip link, guess the invoice by lining up shipments
-- and invoice lines by row number within co_num/co_line/co_release
;WITH cos_Order
AS(
SELECT co_num, co_line, co_release, qty_invoiced, date_seq, ship_date, ROW_NUMBER () OVER (PARTITION BY co_num, co_line, co_release ORDER BY ship_date) AS cosrow
FROM co_ship
WHERE qty_invoiced > 0
),
invi_Order
AS(
SELECT inv_num, co_num, co_line, co_release, ROW_NUMBER () OVER (PARTITION BY co_num, co_line, co_release ORDER BY RecordDate) AS invirow
FROM inv_item
WHERE qty_invoiced > 0
),
cos_invi
AS(
SELECT cosO.*, inviO.inv_num
FROM cos_Order cosO
LEFT JOIN invi_Order inviO
ON cosO.co_num = inviO.co_num AND cosO.co_line = inviO.co_line AND cosO.cosrow = inviO.invirow)
INSERT INTO @cosTable2 (co_num, co_line, co_release, date_seq, ship_date, inv_num)
SELECT cosT.co_num, cosT.co_line, cosT.co_release, cosT.date_seq, cosT.ship_date, COALESCE(cosT.inv_num,'*'+cosi.inv_num) AS inv_num
FROM #cosTable cosT
LEFT JOIN cos_invi cosi
ON cosT.co_num = cosi.co_num
AND cosT.co_line = cosi.co_line
AND cosT.co_release = cosi.co_release
AND cosT.date_seq = cosi.date_seq
AND cosT.ship_date = cosi.ship_date
-- Step 3: any rows still NULL get the first invoice found for the same co_num/co_line/co_release
WHILE @Counter1 < (SELECT MAX(ID) FROM @cosTable2) BEGIN
    SET @Counter1 += 1
    SET @co_num_ck = (SELECT co_num FROM @cosTable2 WHERE ID = @Counter1)
    SET @co_line_ck = (SELECT co_line FROM @cosTable2 WHERE ID = @Counter1)
    SET @co_release_ck = (SELECT co_release FROM @cosTable2 WHERE ID = @Counter1)
    IF EXISTS (SELECT * FROM @cosTable2 WHERE ID = @Counter1 AND inv_num IS NULL)
        UPDATE @cosTable2
        SET inv_num = '^' + (SELECT TOP 1 inv_num FROM @cosTable2 WHERE
            @co_num_ck = co_num AND
            @co_line_ck = co_line AND
            @co_release_ck = co_release)
        WHERE ID = @Counter1 AND inv_num IS NULL
END
SELECT * FROM @cosTable2
ORDER BY co_num, co_line, co_release, date_seq, ship_date
You're in a bad spot - as @craig.white and @HLGEM suggest, you've inherited something without sufficient constraints to make the data correct or safe...and now you have to "synthesize" it. I get that guesses are the best you can do, and you can, at least, make your guesses reasonable performance-wise.
After that, you should squeal loudly to get some time to fix the db - to apply the constraints needed to prevent further crapification of the data.
Performance-wise, the while loop is a disaster. You'd be better off replacing that whole mess with a single update statement...something like:
update c0
set inv_num = '^' + c1.inv_num
from @cosTable2 c0
left outer join
(
    select co_num, co_line, co_release, inv_num
    from @cosTable2
    where inv_num is not null
    group by co_num, co_line, co_release, inv_num
) as c1
    on c0.co_num = c1.co_num and
       c0.co_line = c1.co_line and
       c0.co_release = c1.co_release
where c0.inv_num is null
...which does the same thing the loop does, only in a single statement.
It seems to me that you are trying very hard to solve a problem that should not exist. What you describe is an unfortunately common situation: a process has grown organically, without intent or specific direction, as the business has grown, and that has made data extraction nearly impossible to automate. You very much need a set of policies and procedures. For a (very crude and simple) example:
1: An Order must exist before a packing slip can be generated.
2: A packing slip must exist before an invoice can be generated.
3: An invoice is created using data from the packing slip and the order (what was requested, what was picked, what do we bill).
Again, this is a crude example just to illustrate the idea.
All of the data MUST be entered at the proper time or someone has not done their job.
It is not in the IT department's typical skill set to accurately and consistently provide management with good data when such data does not exist.
I'm trying to decipher code written by another programmer who is long gone, and I came across a select statement in a stored procedure that looks like this (simplified) example:
SELECT #Table2.Col1, Table2.Col2, Table2.Col3, MysteryColumn = CASE WHEN y.Col3 IS NOT NULL THEN #Table2.MysteryColumn - y.Col3 ELSE #Table2.MysteryColumn END
INTO #Table1
FROM #Table2
LEFT OUTER JOIN (
SELECT Table3.Col1, Table3.Col2, Col3 = SUM(#Table3.Col3)
FROM Table3
INNER JOIN #Table4 ON Table4.Col1 = Table3.Col1 AND Table4.Col2 = Table3.Col2
GROUP BY Table3.Col1, Table3.Col2
) AS y ON #Table2.Col1 = y.Col1 AND #Table2.Col2 = y.Col2
WHERE #Table2.Col2 < @EnteredValue
My question: what does the fourth column of the primary selection do? Does it produce a boolean value checking whether the values are equal? Or does it set #Table2.MysteryColumn equal to some value and then insert it into #Table1? Or does it just update #Table2.MysteryColumn and not output a value into #Table1?
The same thing seems to happen on the third column inside the sub-query, and I am equally at a loss as to what that does.
MysteryColumn = gives the expression a name, also called a column alias. The fact that a column in #Table2 also has the same name is beside the point.
Since the query uses INTO, the alias also becomes the column's name in the resulting temporary table. See the SELECT clause documentation, noting the | column_alias = expression form, and the INTO clause.
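A tiny illustration of the two equivalent spellings (Col1 * 2 is just a made-up expression, not something from the procedure):
-- both return a column named MysteryColumn; neither statement updates anything
SELECT MysteryColumn = Col1 * 2 FROM #Table2;    -- "alias = expression" form
SELECT Col1 * 2 AS MysteryColumn FROM #Table2;   -- standard "expression AS alias" form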
In the table below I am storing some conditions like this:
Then, generally, in a second table, I have the following records:
and what I need is to compare these values using the right condition and store the result (let's say '0' for false and '1' for true) in an additional column.
I am going to do this in a stored procedure, and I will basically be comparing anywhere from several to hundreds of records.
One possible solution is to use sp_executesql for each row, building dynamic statements; the other is to create my own scalar function and call it for each row using CROSS APPLY.
Could anyone tell me which is the more efficient way?
Note: I know that the best way to answer this is to build both solutions and test, but I am hoping there might already be an answer based on things like caching and SQL Server's internal optimizations, which would save me a lot of time because this is only part of a bigger problem.
I don't see the need for sp_executesql in this case. You can obtain the result for all records at once in a single statement:
select Result = case
when ct.Abbreviation='=' and t.ValueOne=t.ValueTwo then 1
when ct.Abbreviation='>' and t.ValueOne>t.ValueTwo then 1
when ct.Abbreviation='>=' and t.ValueOne>=t.ValueTwo then 1
when ct.Abbreviation='<=' and t.ValueOne<=t.ValueTwo then 1
when ct.Abbreviation='<>' and t.ValueOne<>t.ValueTwo then 1
when ct.Abbreviation='<' and t.ValueOne<t.ValueTwo then 1
else 0 end
from YourTable t
join ConditionType ct on ct.ID = t.ConditionTypeID
and update the additional column with something like:
;with cte as (
select t.AdditionalColumn, Result = case
when ct.Abbreviation='=' and t.ValueOne=t.ValueTwo then 1
when ct.Abbreviation='>' and t.ValueOne>t.ValueTwo then 1
when ct.Abbreviation='>=' and t.ValueOne>=t.ValueTwo then 1
when ct.Abbreviation='<=' and t.ValueOne<=t.ValueTwo then 1
when ct.Abbreviation='<>' and t.ValueOne<>t.ValueTwo then 1
when ct.Abbreviation='<' and t.ValueOne<t.ValueTwo then 1
else 0 end
from YourTable t
join ConditionType ct on ct.ID = t.ConditionTypeID
)
update cte
set AdditionalColumn = Result
If the above logic is supposed to be applied in many places, not just over one table, then yes, you may think about a function. However, I would rather use an inline table-valued function (not a scalar one), because of the overhead imposed by user-defined scalar functions (the call-and-return cost grows with the number of rows processed).
create function ftComparison
(
    @v1 float,
    @v2 float,
    @cType int
)
returns table
as return
    select
        Result = case
            when ct.Abbreviation='=' and @v1=@v2 then 1
            when ct.Abbreviation='>' and @v1>@v2 then 1
            when ct.Abbreviation='>=' and @v1>=@v2 then 1
            when ct.Abbreviation='<=' and @v1<=@v2 then 1
            when ct.Abbreviation='<>' and @v1<>@v2 then 1
            when ct.Abbreviation='<' and @v1<@v2 then 1
            else 0
        end
    from ConditionType ct
    where ct.ID = @cType
which can then be applied as:
select f.Result
from YourTable t
cross apply ftComparison(ValueOne, ValueTwo, t.ConditionTypeID) f
or
select f.Result
from YourAnotherTable t
cross apply ftComparison(SomeValueColumn, SomeOtherValueColumn, @someConditionType) f
I have a spreadsheet with multiple columns, one of which is an owner_id column. The problem is that this column contains a comma-delimited list of owner IDs and not just a single one.
I've imported this spreadsheet into my SQL Server database (2008), have completed other importing tasks, and now have a parcel_id column as a result of this process.
I need to create an entry in my ParcelOwners table for each parcelID/ownerID pair, but I'm not sure how to go about this with the owner IDs being in a comma-delimited list.
My tables look like this:
ImportData
=================
owner_id varchar,
parcelID int
sample row (owner_id = '13782, 21431', parcelID = 319)
ParcelOwners
=================
ownerID int,
parcelID int
The sample row from the ImportData table should end up in ParcelOwners as:
ownerID = 13782, parcelID = 319
ownerID = 21431, parcelID = 319
Is this a common situation for anybody, and if so, how do you go about handling it?
The function below will split your comma-separated column into a table. You will then need to iterate through that result and insert one row into your ParcelOwners table for each value, using the data from your single column. To get this to work you will need an outer loop to iterate through the ImportData rows and an inner loop to iterate through the @temptable results for each row. Also, don't forget: if you come to a row in your outer loop with no commas in the owner_id column, you won't want to do anything.
CREATE FUNCTION dbo.Split(@String varchar(8000), @Delimiter char(1))
returns @temptable TABLE (items varchar(8000))
as
begin
    declare @idx int
    declare @slice varchar(8000)

    select @idx = 1
    if len(@String) < 1 or @String is null return

    while @idx != 0
    begin
        -- everything before the next delimiter (or the whole remainder) is one item
        set @idx = charindex(@Delimiter, @String)
        if @idx != 0
            set @slice = left(@String, @idx - 1)
        else
            set @slice = @String

        if len(@slice) > 0
            insert into @temptable(items) values(@slice)

        -- drop the consumed item (and its delimiter, when one was found)
        set @String = right(@String, len(@String) - @idx)
        if len(@String) = 0 break
    end
    return
end
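For instance, run against the sample owner_id value from the question, the function would behave roughly like this:
-- returns two rows: '13782' and ' 21431' (the leading space survives; CONVERT to int tolerates it)
SELECT items FROM dbo.Split('13782, 21431', ',');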
You can do this easily leveraging SQL Server's XML functions:
WITH xmlData (xml_owner_id, parcelID) AS (
    /* make into xml */
    SELECT cast('<x>'+replace(owner_id,',','</x><x>')+'</x>' as XML) AS xml_owner_id, parcelID
    FROM ImportData
)
SELECT x.value('.','int') AS owner_id, parcelID /* split up */
FROM xmlData
CROSS APPLY xmlData.xml_owner_id.nodes('//x') AS func(x)
(In response to @senloe's question about how to use the function supplied by @RandomBen)
This answer to a previous question shows how to use OUTER APPLY to apply a function to every row in a table. In your case, and assuming you have already run @RandomBen's code to create the dbo.Split function, the syntax would look something like this:
INSERT INTO ParcelOwners (ownerId, parcelID)
SELECT CONVERT(int, Results.items), ImportData.parcelID
FROM ImportData
OUTER APPLY dbo.Split(ImportData.owner_id, ',') AS Results
(I don't have access to SQL Server right now, so I haven't tried it yet. You can run it without the first line, i.e. just from SELECT onwards, to see what output it is going to generate before you actually do the INSERT).