Matching records by ID and displaying non-matching columns in TSQL - tsql

In SQL Server I have two tables, RegistrarRecords and TeacherRecords, which have identical structures. The first part of the query matches on the student ID and shows in a dynamic column whether or not a match was found (I have already done that part).
An additional dynamic column should list all the columns that didn't match. I'm wondering if there is a way to do this without writing a gigantic CASE expression, since that would have to cover many combinations. This is what I have so far:
SET ANSI_NULLS OFF
SELECT TR.*,
CASE WHEN RR.StudentID IS NULL THEN 'NO MATCH'
WHEN RR.StudentID IS NOT NULL AND RR.FirstName = TR.FirstName
AND RR.LastName = TR.LastName
AND RR.Floor = TR.Floor
AND RR.FirstQuarterGrade = TR.FirstQuarterGrade
AND RR.SecondQuarterGrade = TR.SecondQuarterGrade
AND RR.ThirdQuarterGrade = TR.ThirdQuarterGrade
AND RR.FinalGrade = TR.FinalGrade THEN 'MATCH'
ELSE 'MATCH WITH ISSUE' END AS MatchResult
--TO DO: Add an ISSUE column; lists columns with mismatch
FROM TeacherRecords TR
LEFT JOIN RegistrarRecords RR ON RR.StudentID = TR.StudentID
If a script for the table is needed, here is the table definition:
CREATE TABLE [dbo].[RegistrarRecords](
[StudentID] [varchar](10) NOT NULL,
[FirstName] [varchar](50) NULL,
[LastName] [varchar](50) NULL,
[Floor] [int] NULL,
[Room] [varchar](10) NULL,
[FirstQuarterGrade] [int] NULL,
[SecondQuarterGrade] [int] NULL,
[ThirdQuarterGrade] [int] NULL,
[FinalGrade] [int] NULL
) ON [PRIMARY]

You could split this query into two parts:
First, find all the issues.
Then, in the final select, check whether there are any issues.
Here is sample code to do so. The Issue_List field contains a comma-separated list of all the columns with issues (e.g., 'Floor, FirstQuarterGrade'); the final select strips the trailing comma. An empty Issue_List means there are no issues.
WITH TR_Match_Info AS
(SELECT TR.*,
RR.StudentID AS RR_StudentID,
'' + CASE WHEN RR.FirstName = TR.FirstName THEN '' ELSE 'FirstName, ' END
+ CASE WHEN RR.LastName = TR.LastName THEN '' ELSE 'LastName, ' END
+ CASE WHEN RR.Floor = TR.Floor THEN '' ELSE 'Floor, ' END
+ CASE WHEN RR.FirstQuarterGrade = TR.FirstQuarterGrade THEN '' ELSE 'FirstQuarterGrade, ' END
+ CASE WHEN RR.SecondQuarterGrade = TR.SecondQuarterGrade THEN '' ELSE 'SecondQuarterGrade, ' END
+ CASE WHEN RR.ThirdQuarterGrade = TR.ThirdQuarterGrade THEN '' ELSE 'ThirdQuarterGrade, ' END
+ CASE WHEN RR.FinalGrade = TR.FinalGrade THEN '' ELSE 'FinalGrade, ' END
AS Issue_List
FROM TeacherRecords TR
LEFT JOIN RegistrarRecords RR ON RR.StudentID = TR.StudentID
)
SELECT [StudentID],
[FirstName],
[LastName],
[Floor],
[Room],
[FirstQuarterGrade],
[SecondQuarterGrade],
[ThirdQuarterGrade],
[FinalGrade],
CASE WHEN RR_StudentID IS NULL THEN 'NO MATCH'
WHEN Issue_List = '' THEN 'MATCH'
ELSE 'MATCH WITH ISSUE' END
AS MatchResult,
CASE WHEN RR_StudentID IS NULL THEN ''
WHEN LEN(Issue_List) > 0 THEN LEFT(Issue_List, LEN(Issue_List) - 1)
ELSE Issue_List
END AS Match_Issues
FROM TR_Match_Info;
Note that the issue checks above are based on your code. You should review how NULLs are handled: if both values are NULL, the column is flagged as an issue, because NULL = NULL evaluates to NULL rather than true, so the CASE expression falls through to its ELSE branch.
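If two NULLs should count as a match, one possible approach is a NULL-safe comparison. The sketch below uses the EXISTS (SELECT ... INTERSECT SELECT ...) pattern, which relies on INTERSECT treating two NULLs as equal. The table variables @TR and @RR and their sample rows are hypothetical stand-ins for TeacherRecords and RegistrarRecords, trimmed to two compared columns.

```sql
-- Hypothetical sample data standing in for TeacherRecords / RegistrarRecords.
DECLARE @TR TABLE (StudentID varchar(10) NOT NULL, [Floor] int NULL, FinalGrade int NULL);
DECLARE @RR TABLE (StudentID varchar(10) NOT NULL, [Floor] int NULL, FinalGrade int NULL);

INSERT INTO @TR VALUES ('S1', NULL, 90), ('S2', 2, 80);
INSERT INTO @RR VALUES ('S1', NULL, 90), ('S2', 3, 80);

SELECT TR.StudentID,
       -- INTERSECT treats NULLs as equal, so NULL vs NULL is not flagged.
       '' + CASE WHEN EXISTS (SELECT TR.[Floor] INTERSECT SELECT RR.[Floor])
                 THEN '' ELSE 'Floor, ' END
          + CASE WHEN EXISTS (SELECT TR.FinalGrade INTERSECT SELECT RR.FinalGrade)
                 THEN '' ELSE 'FinalGrade, ' END AS Issue_List
FROM @TR AS TR
LEFT JOIN @RR AS RR ON RR.StudentID = TR.StudentID;
-- S1: '' (both Floor values are NULL); S2: 'Floor, ' (2 vs 3)
```

The SET ANSI_NULLS OFF in the question probably does not help here: according to the SQL Server documentation, that setting only affects comparisons where one operand is a literal NULL or a NULL variable, not column-to-column comparisons. SQL Server 2022 also offers IS NOT DISTINCT FROM for the same purpose.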

Related

Optimize multiple case expressions with THEN/END

I have the following query.
SELECT
i.id,
CASE WHEN ia.detail_count = 1 THEN i.space_id ELSE NULL END AS space_id,
CASE WHEN ia.detail_count = 1 THEN i.resident_id ELSE NULL END AS resident_id,
CASE WHEN ia.detail_count = 1 THEN i.lease_id ELSE NULL END AS lease_id,
i.deleted_by,
i.deleted_on,
i.updated_by,
i.updated_on,
i.created_by
From
myTable i
JOIN (
SELECT
icd.id,
json_build_object(
'lease_ids', array_remove(array_agg(icd.lease_id), NULL),
'resident_ids', array_remove(array_agg(icd.resident_id), NULL),
'space_ids', array_remove(array_agg(icd.space_id), NULL)
) AS details,
COUNT(icd.id) As detail_count
FROM
mytable_details icd
GROUP BY
icd.id
) ia ON ia.id = i.id;
Can we combine the following three expressions into one, since the condition is the same and only the operand differs?
CASE WHEN ia.detail_count = 1 THEN i.space_id ELSE NULL END AS space_id,
CASE WHEN ia.detail_count = 1 THEN i.resident_id ELSE NULL END AS resident_id,
CASE WHEN ia.detail_count = 1 THEN i.lease_id ELSE NULL END AS lease_id,
I'm not sure that this counts as an optimisation, but I think you could modify the inner query:
(COUNT(icd.id) = 1) as one_detail
... which would return a Boolean result, and then ...
case when one_detail then i.space_id end as space_id,
case when one_detail then i.resident_id end as resident_id,
case when one_detail then i.lease_id end as lease_id,
I'm not sure it's worthwhile in a simple case like this, but for a more complex condition it might be.

Why selectrow_array does not work with null values in where clause

I am trying to fetch a count from a SQL Server database, and it returns 0 for fields with NULL values. Below is what I am using.
my $sql = q{SELECT count(*) from customer where first_name = ? and last_name = ?};
my @bind_values = ($first_name, $last_name);
my $count = $dbh->selectrow_array($sql, undef, @bind_values);
This returns 0 if either value is NULL in the database. I understood that an undef parameter is automatically passed as NULL, but I don't know why it's not working.
Here is a weird observation: when I type the SQL with literal values in Toad for SQL Server, it works:
SELECT count(*) from customer where first_name = 'bob' and last_name is null
But when I run the same query with placeholders, passing 'bob' for first_name and undef (NULL) for last_name, it does not work:
SELECT count(*) from customer where first_name = ? and last_name = ?
For NULL in the WHERE clause you simply need a different query. I have written them one below the other so you can spot the difference:
...("select * from test where col2 = ?", undef, 1);
...("select * from test where col2 is ?", undef, undef);
...("select * from test where col2 is ?", undef, 1);
...("select * from test where col2 = ?", undef, undef);
The first two commands work; stick to those. The third is a syntax error, and the fourth is what you tried, which indeed returns nothing because a comparison with NULL using = is never true.
The DBI manpage has a section on NULL values that discusses this case in more detail.
So, here is what I did: for each field I added an "or field is null" condition that applies when the value is undef.
my $sql = q{SELECT count(*) from customer where (first_name = ? or (first_name is null and ? = 1)) and (last_name = ? or (last_name is null and ? = 1))};
my @bind_values = ($first_name, defined($first_name)?0:1, $last_name, defined($last_name)?0:1);
my $count = $dbh->selectrow_array($sql, undef, @bind_values);
If anyone has a better solution, please post it.
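The same idea can be sketched directly in T-SQL, where variables stand in for the bound parameters and a NULL variable corresponds to undef. The @customer table variable and its rows are hypothetical; this is only an illustration of the predicate, not a replacement for the DBI code above.

```sql
-- Hypothetical stand-in for the customer table.
DECLARE @customer TABLE (first_name varchar(50) NULL, last_name varchar(50) NULL);
INSERT INTO @customer VALUES ('bob', NULL), ('bob', 'smith');

-- These variables play the role of the bound parameters; NULL corresponds to undef.
DECLARE @first_name varchar(50) = 'bob';
DECLARE @last_name  varchar(50) = NULL;

-- Each parameter is referenced twice: once for equality, once for the NULL case.
SELECT COUNT(*) AS matches
FROM @customer
WHERE (first_name = @first_name OR (first_name IS NULL AND @first_name IS NULL))
  AND (last_name  = @last_name  OR (last_name  IS NULL AND @last_name  IS NULL));
-- matches = 1 (the row whose last_name is NULL)
```

With positional ? placeholders, each reference to a parameter still needs its own bound value, so the bind list grows to four entries as in the Perl version.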

Using IndexOf and/Or Substring to parse data into separate columns

I am working on migrating data from one database to another for a hospital. In the old database, the doctor's specialty IDs are all in one column (swvar_specialties), separated by commas. In the new database, each specialty ID will have its own column (example: Specialty1_PrimaryID, Specialty2_PrimaryID, Specialty3_PrimaryID, etc). I am trying to export the data out of the old database and split the IDs into these separate columns. I know I can use indexof and substring to do this - I just need help with the syntax.
So this query:
Select swvar_specialties as Specialty1_PrimaryID
From PhysDirectory
might return results similar to 39,52,16. I need this query to display Specialty1_PrimaryID = 39, Specialty2_PrimaryID = 52, and Specialty3_PrimaryID = 16 in the results. Below is my query so far. I will eventually have a join to pull the specialty names from the specialties table. I just need to get this worked out first.
Select pd.ref as PrimaryID, pd.swvar_name_first as FirstName, pd.swvar_name_middle as MiddleName,
pd.swvar_name_last as LastName, pd.swvar_name_suffix + ' ' + pd.swvar_name_degree as NameSuffix,
pd.swvar_birthdate as DateOfBirth,pd.swvar_notes as AdditionalInformation, 'images/' + '' + pd.swvar_photo as ImageURL,
pd.swvar_philosophy as PhilosophyOfCare, pd.swvar_gender as Gender, pd.swvar_specialties as Specialty1_PrimaryID, pd.swvar_languages as Language1_Name
From PhysDirectory as pd
The article Split function equivalent in T-SQL? provides some details on how to use a split function to split a comma-delimited string.
By modifying the table-valued function presented in that article to provide an identity column, we can target a specific row, such as the one for Specialty1_PrimaryID:
/*
Splits string into parts delimited with specified character.
*/
CREATE FUNCTION [dbo].[SDF_SplitString]
(
@sString nvarchar(2048),
@cDelimiter nchar(1)
)
RETURNS @tParts TABLE (id bigint IDENTITY, part nvarchar(2048) )
AS
BEGIN
if @sString is null return
declare @iStart int,
@iPos int
if substring( @sString, 1, 1 ) = @cDelimiter
begin
set @iStart = 2
insert into @tParts
values( null )
end
else
set @iStart = 1
while 1=1
begin
set @iPos = charindex( @cDelimiter, @sString, @iStart )
if @iPos = 0
set @iPos = len( @sString )+1
if @iPos - @iStart > 0
insert into @tParts
values ( substring( @sString, @iStart, @iPos-@iStart ))
else
insert into @tParts
values( null )
set @iStart = @iPos+1
if @iStart > len( @sString )
break
end
RETURN
END
Your query can then utilise this split function as follows:
Select
pd.ref as PrimaryID,
pd.swvar_name_first as FirstName,
pd.swvar_name_middle as MiddleName,
pd.swvar_name_last as LastName,
pd.swvar_name_suffix + ' ' + pd.swvar_name_degree as NameSuffix,
pd.swvar_birthdate as DateOfBirth,pd.swvar_notes as AdditionalInformation,
'images/' + '' + pd.swvar_photo as ImageURL,
pd.swvar_philosophy as PhilosophyOfCare, pd.swvar_gender as Gender,
(Select part from SDF_SplitString(pd.swvar_specialties, ',') where id=1) as Specialty1_PrimaryID,
(Select part from SDF_SplitString(pd.swvar_specialties, ',') where id=2) as Specialty2_PrimaryID,
pd.swvar_languages as Language1_Name
From PhysDirectory as pd
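As a possible alternative that needs no helper function, the list can be cast to XML and each position read by index. This is a different technique from the split function above, sketched under the assumption that the values contain only digits and commas (characters such as & or < would break the XML cast). The @PhysDirectory table variable and its rows are hypothetical.

```sql
-- Hypothetical sample rows standing in for PhysDirectory.
DECLARE @PhysDirectory TABLE (ref int NOT NULL, swvar_specialties varchar(200) NULL);
INSERT INTO @PhysDirectory VALUES (1, '39,52,16'), (2, '7'), (3, NULL);

SELECT pd.ref AS PrimaryID,
       -- Read the 1st, 2nd and 3rd <s> element; missing positions come back as NULL.
       x.parts.value('(/s)[1]', 'varchar(20)') AS Specialty1_PrimaryID,
       x.parts.value('(/s)[2]', 'varchar(20)') AS Specialty2_PrimaryID,
       x.parts.value('(/s)[3]', 'varchar(20)') AS Specialty3_PrimaryID
FROM @PhysDirectory AS pd
CROSS APPLY (
    -- '39,52,16' becomes <s>39</s><s>52</s><s>16</s>
    SELECT CAST('<s>' + REPLACE(pd.swvar_specialties, ',', '</s><s>') + '</s>' AS xml) AS parts
) AS x;
```

Each additional SpecialtyN_PrimaryID column needs one more value() call with the matching index.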

T-SQL 'AND' keyword not short-circuiting it seems

The below stored proc works fine except for the fact that when I uncomment the second part of the date check in the 'where' clause it blows up on a date conversion even if the passed in keyword is null or '111'.
I'm open to any suggestions on how to do this dynamic where clause differently.
I appreciate any help.
ALTER PROCEDURE [SurveyEngine].[GetPageOf_CommentsOverviewRowModel]
@sortColumn varchar(50),
@isASC bit,
@keyword varchar(50)
AS
BEGIN
declare @keywordType varchar(4)
set @keywordType = null
if ISDATE(@keyword) = 1
set @keywordType = 'date'
else if ISNUMERIC(@keyword) = 1
set @keywordType = 'int'
select c.CommentBatch BatchID, c.CreatedDate DateReturned, COUNT(c.CommentID) TotalComments
from SurveyEngine.Comment c
where (@keywordType is null)
or (@keywordType = 'date') --and c.CreatedDate = @keyword)
or (@keywordType = 'int' and (CONVERT(varchar(10), c.CommentBatch) like @keyword+'%'))
group by c.CommentBatch, c.CreatedDate
order by case when @sortColumn = 'BatchID' and @isASC = 0 then c.CommentBatch end desc,
case when @sortColumn = 'BatchID' and @isASC = 1 then c.CommentBatch end,
case when @sortColumn = 'DateReturned' and @isASC = 0 then c.CreatedDate end desc,
case when @sortColumn = 'DateReturned' and @isASC = 1 then c.CreatedDate end,
case when @sortColumn = 'TotalComments' and @isASC = 0 then COUNT(c.CommentID) end desc,
case when @sortColumn = 'TotalComments' and @isASC = 1 then COUNT(c.CommentID) end
END
EDIT Sorry, brain cloud. Things need to be initialized differently.
Change the setup to:
declare @keywordType varchar(4)
declare @TargetDate as DateTime = NULL
set @keywordType = null
if ISDATE(@keyword) = 1
begin
set @keywordType = 'date'
set @TargetDate = Cast( @keyword as DateTime )
end
else if ISNUMERIC(@keyword) = 1
set @keywordType = 'int'
Then change:
and c.CreatedDate = @keyword
to:
and c.CreatedDate = Coalesce( @TargetDate, c.CreatedDate )
That will result in a NOP if you are not searching by date.
Based on this blog post: http://blogs.msdn.com/b/bartd/archive/2011/03/03/don-t-depend-on-expression-short-circuiting-in-t-sql-not-even-with-case.aspx it looks like you can't guarantee the order of evaluation in the WHERE clause, even though short-circuiting is supported. The execution plan may choose to evaluate the second condition first.
The author recommends using a CASE expression instead (like pst mentioned before), as it is "more" guaranteed. But I don't think I can rewrite your WHERE clause as a CASE, because you're using three different operators (IS NULL, =, and LIKE).
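One way to sidestep the evaluation-order question is to keep every fallible conversion out of the WHERE clause. The sketch below assumes SQL Server 2012 or later, where TRY_CONVERT returns NULL instead of raising an error; the @Comment table variable and its rows are hypothetical stand-ins for SurveyEngine.Comment.

```sql
-- Hypothetical stand-in for SurveyEngine.Comment.
DECLARE @Comment TABLE (CommentID int, CommentBatch int, CreatedDate datetime);
INSERT INTO @Comment VALUES (1, 111, '20240101'), (2, 111, '20240101'), (3, 205, '20240102');

DECLARE @keyword varchar(50) = '111';

-- Convert once, outside the query; TRY_CONVERT yields NULL instead of raising an error.
DECLARE @TargetDate datetime = TRY_CONVERT(datetime, @keyword);
DECLARE @keywordType varchar(4) =
    CASE WHEN @TargetDate IS NOT NULL THEN 'date'
         WHEN ISNUMERIC(@keyword) = 1 THEN 'int'
    END;

SELECT c.CommentBatch AS BatchID, c.CreatedDate AS DateReturned, COUNT(c.CommentID) AS TotalComments
FROM @Comment AS c
WHERE @keywordType IS NULL
   OR (@keywordType = 'date' AND c.CreatedDate = @TargetDate)  -- datetime = datetime, no conversion
   OR (@keywordType = 'int'  AND CONVERT(varchar(10), c.CommentBatch) LIKE @keyword + '%')
GROUP BY c.CommentBatch, c.CreatedDate;
```

Because the query only ever compares a datetime column with a datetime variable, it no longer matters which branch the optimizer evaluates first.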

T-SQL: returning VARCHAR in a derived column

I am having problems returning a VARCHAR out of a derived column.
Below are extremely simplified code examples.
I have been able to do this before:
SELECT *, message =
CASE
WHEN (status = 0)
THEN 'aaa'
END
FROM products
But when I introduce a Common Table Expression or Derived Table:
WITH CTE_products AS (SELECT * from products)
SELECT *, message =
CASE WHEN (status = 0)
THEN 'aaa'
END
FROM CTE_products
this seems to fail with the following message:
Conversion failed when converting the varchar value 'aaa' to data type int.
When I tweak the line to say:
WITH CTE_products AS (SELECT * from products)
SELECT *, message =
CASE WHEN (status = 0)
THEN '123'
END
FROM CTE_products
It returns correctly.
...
When I remove all the other clauses prior to it, it also works fine returning 'aaa'.
My preference would be to keep this as a single, stand-alone query.
The problem is that the column is an integer data type, and SQL Server is trying to convert 'aaa' to an integer.
One way to fix it:
WITH CTE_products AS (SELECT * from products)
SELECT *, message =
CASE WHEN (status = 0)
THEN 'aaa' else convert(varchar(50),status)
END
FROM CTE_products
I actually ended up finding the answer.
One of my CASE/WHEN clauses used a derived column from the CTE and that ended up causing the confusion.
Before:
WITH CTE_products AS (SELECT *, qty_a + qty_b as qty_total FROM products)
SELECT *, message =
CASE WHEN (status = 0)
THEN 'Status is 0, the total is: ' + qty_total + '!'
END
FROM CTE_products
Corrected:
WITH CTE_products AS (SELECT *, qty_a + qty_b as qty_total FROM products)
SELECT *, message =
CASE WHEN (status = 0)
THEN 'Status is 0, the total is: ' + CAST(qty_total AS VARCHAR) + '!'
END
FROM CTE_products
Right afterwards, I tried removing WHEN/THEN clauses from the CASE expression to check whether it was a fluke parentheses error. That's when I realized that, without any WHEN/THEN clause that included the derived column from the CTE, the query was able to return a VARCHAR.
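The underlying rule is data type precedence: when + combines a varchar with an int, SQL Server converts the varchar operand to int rather than the other way round. A minimal sketch, with the hypothetical variable @qty_total standing in for the derived column:

```sql
DECLARE @qty_total int = 7;  -- hypothetical value standing in for qty_a + qty_b

-- Fails: int outranks varchar, so SQL Server tries to convert the text to int.
-- SELECT 'Status is 0, the total is: ' + @qty_total + '!';

-- Works: convert the number explicitly so + means string concatenation.
SELECT 'Status is 0, the total is: ' + CAST(@qty_total AS varchar(12)) + '!' AS message;

-- Also works on SQL Server 2012 and later: CONCAT converts its arguments to strings.
SELECT CONCAT('Status is 0, the total is: ', @qty_total, '!') AS message;
-- Both return: Status is 0, the total is: 7!
```

This also explains why '123' appeared to work in the earlier test: that string converts to int without error.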