SQL Server Pattern Matching on Joins - tsql

I have a series of identifiers in the format AAAA-NNN-AAA.
I want to join this to another table on the identifier, but only where each NNN contains 1, 2 or 3 in either the second or third position within NNN.
So ABCD-010 would match ABCD-010, ABCD-011, ABCD-001, etc., but not ABCD-121 or ABCD-003.
I've looked at LIKE, obviously, and PATINDEX, but wondered if there was a standard 'efficient' method?

One way to achieve this is something like the following:
create table #t1
(
id nvarchar(max)
)
create table #t2
(
id nvarchar(max)
)
insert into #t1 values('ABCD-010')
insert into #t2 values('ABCD-021')
select *
from #t1 t1
join #t2 t2 on left(t1.id,4) = left(t2.id,4)
and (cast(right(t2.id, 1) as int) in (1,2,3) or cast(left(right(t2.id, 2),1) as int) in (1,2,3))
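The same idea can also be written with LIKE, which avoids the casts. This is only a sketch under one reading of the requirement (the prefix must match, and the second or third digit of NNN on the joined side must be 1, 2 or 3). The tables #a and #b and their rows are made up for illustration. In a LIKE pattern, _ matches any single character and [1-3] matches one character in that range, so the trailing % also tolerates a -AAA suffix.

```sql
-- Sketch only: assumes identifiers shaped like 'ABCD-010' (4 characters, hyphen, 3 digits).
create table #a (id nvarchar(20))
create table #b (id nvarchar(20))
insert into #a values ('ABCD-010')
insert into #b values ('ABCD-021'), ('ABCD-100'), ('WXYZ-011')
select a.id, b.id as matched_id
from #a a
join #b b on left(a.id, 4) = left(b.id, 4)
and (b.id like '____-_[1-3]%'    -- second digit of NNN is 1, 2 or 3
  or b.id like '____-__[1-3]%')  -- third digit of NNN is 1, 2 or 3
drop table #a
drop table #b
```

With these sample rows only ABCD-021 is returned: ABCD-100 has no 1, 2 or 3 in the second or third digit, and WXYZ-011 fails the prefix comparison.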

Related

How to ignore a record in query to avoid a conversion error in JOIN?

I have a table T1 with alphanumeric codes (varchar column) where the first three characters are always numeric, like this:
001ABCD
100EFGH
541XYZZ
OTHER
NOTE: Please notice that I have ONE exception record which is all alpha (OTHER).
Also I have a table T2 with 3-digit numbers (int column) like this:
001
200
300
So when I run the following query:
SELECT * from T1
LEFT JOIN T2
ON SUBSTRING(T1.code1,1,3) = T2.code2
WHERE T1.code1 <> 'OTHER'
It is causing me the error:
Conversion failed when converting the varchar value 'OTH' to data type int.
I know the issue but not how to fix it (it's trying to compare 'OTH' with the T2.code2 INT column).
I tried to use WHERE but it didn't work at all.
I cannot get rid of the 'OTHER' record, and converting the T2.code2 column from int to varchar is not an option. Any ideas?
Here are 3 different ways you can solve this. I would recommend the persisted computed column, since it only has to be calculated on insert and update, not every time you run the read query.
DROP TABLE IF EXISTS #T2;
DROP TABLE IF EXISTS #T1;
CREATE TABLE #T1
(
Code1 VARCHAR(10)
,Code2Computed AS TRY_CONVERT(INT,SUBSTRING(Code1,1,3)) PERSISTED
)
;
CREATE TABLE #T2
(
Code2 INT
)
;
INSERT INTO #T1
(Code1)
VALUES
('001ABCD')
,('100EFGH')
,('541XYZZ')
,('OTHER')
;
INSERT INTO #T2
(Code2)
VALUES
(001)
,(100)
,(200)
,(300)
,(541)
;
--Convert INT to 3 digit code
SELECT *
FROM #T1
LEFT JOIN #T2
ON SUBSTRING(#T1.Code1,1,3) = RIGHT(CONCAT('000',#T2.Code2),3)
;
--Convert 3 digit code to INT
SELECT *
FROM #T1
LEFT JOIN #T2
ON TRY_CONVERT(INT,SUBSTRING(#T1.Code1,1,3)) = #T2.Code2
;
--Use computed column
SELECT *
FROM #T1
LEFT JOIN #T2
ON #T1.Code2Computed = #T2.Code2
;
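The second and third variants rely on TRY_CONVERT returning NULL instead of raising an error when the value cannot be converted. A minimal illustration, with literals chosen here only for demonstration:

```sql
-- TRY_CONVERT returns NULL for values that cannot be converted, rather than failing.
SELECT TRY_CONVERT(INT, '001') AS NumericPrefix  -- 1
      ,TRY_CONVERT(INT, 'OTH') AS AlphaPrefix;   -- NULL
```

Because NULL never compares equal to anything in the join condition, the 'OTHER' row simply finds no match and the LEFT JOIN returns it with NULL in the #T2 columns.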

T-SQL - Pivot/Crosstab - variable number of values

I have a simple data set that looks like this:
Name Code
A A-One
A A-Two
B B-One
C C-One
C C-Two
C C-Three
I want to output it so it looks like this:
Name Code1 Code2 Code3 Code4 Code...n ...
A A-One A-Two
B B-One
C C-One C-Two C-Three
For each of the 'Name' values, there can be an undetermined number of 'Code' values.
I have been looking at various examples of Pivot SQL [including simple Pivot sql and sql using the XML function?] but I have not been able to figure this out - or to understand if it is even possible.
I would appreciate any help or pointers.
Thanks!
Try it like this:
DECLARE @tbl TABLE([Name] VARCHAR(100),Code VARCHAR(100));
INSERT INTO @tbl VALUES
('A','A-One')
,('A','A-Two')
,('B','B-One')
,('C','C-One')
,('C','C-Two')
,('C','C-Three');
SELECT p.*
FROM
(
SELECT *
,CONCAT('Code',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
FROM @tbl
)t
PIVOT
(
MAX(Code) FOR ColumnName IN (Code1,Code2,Code3,Code4,Code5 /*add as many as you need*/)
)p;
This line
,CONCAT('Code',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
will use a partitioned ROW_NUMBER in order to create numbered column names per code. The rest is simple PIVOT...
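To see what PIVOT actually receives, it can help to run the inner query on its own. The sketch below uses a made-up table variable @demo with a subset of the sample rows:

```sql
-- Sketch: inspect the derived table that feeds PIVOT.
DECLARE @demo TABLE([Name] VARCHAR(100), Code VARCHAR(100));
INSERT INTO @demo VALUES ('A','A-One'),('A','A-Two'),('B','B-One');
SELECT [Name]
,Code
,CONCAT('Code',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
FROM @demo
ORDER BY [Name], ColumnName;
-- A | A-One | Code1
-- A | A-Two | Code2
-- B | B-One | Code1
```

Note that ORDER BY Code sorts alphabetically, so in the full sample C-Three would be numbered before C-Two. If the original order matters, the table would need a column to sort by.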
UPDATE: A dynamic approach to reflect the max amount of codes per group
CREATE TABLE TblTest([Name] VARCHAR(100),Code VARCHAR(100));
INSERT INTO TblTest VALUES
('A','A-One')
,('A','A-Two')
,('B','B-One')
,('C','C-One')
,('C','C-Two')
,('C','C-Three');
DECLARE @cols VARCHAR(MAX);
WITH GetMaxCount(mc) AS(SELECT TOP 1 COUNT([Code]) FROM TblTest GROUP BY [Name] ORDER BY COUNT([Code]) DESC)
SELECT @cols=STUFF(
(
SELECT CONCAT(',Code',Nmbr)
FROM
(SELECT TOP((SELECT mc FROM GetMaxCount)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) t(Nmbr)
FOR XML PATH('')
),1,1,'');
DECLARE @sql VARCHAR(MAX)=
'SELECT p.*
FROM
(
SELECT *
,CONCAT(''Code'',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
FROM TblTest
)t
PIVOT
(
MAX(Code) FOR ColumnName IN (' + @cols + ')
)p;';
EXEC(@sql);
GO
DROP TABLE TblTest;
As you can see, the only part that changes to reflect the actual number of columns is the list in PIVOT's IN() clause.
You can create a string that looks like Code1,Code2,Code3,...CodeN and build the statement dynamically. The statement can then be executed with EXEC().
I'd prefer the first approach. Dynamically created SQL is very powerful, but it can be a pain in the neck too...

Taking result from SQL/T-SQL Subselect into the parent select statement

I want to extend ListA with the Company value coming from @MyList.CompanyNo; please refer to the code listing.
Data&Init:
begin /*Just the init data*/
DECLARE @MyList TABLE (Mail nvarchar(max), CompanyNo int)
INSERT INTO @MyList VALUES ('...com',20)
INSERT INTO @MyList VALUES ('...com',230)
INSERT INTO @MyList VALUES ('...com',120)
INSERT INTO @MyList VALUES ('...com',223)
end
--DECLARE
DECLARE @ListA TABLE (Id nvarchar(max), Mail nvarchar(max))
DECLARE @ListB TABLE (Id nvarchar(max), Mail nvarchar(max),Company int)
Starting point (this works):
INSERT INTO @ListA(Id,Mail) select someId,name from [somedb].[dbo].aers where name IN (SELECT Mail FROM @MyList)
I was trying to do it the following way:
INSERT INTO @ListB(Id,Mail,Company) select someId,name,@MyList.CompanyNo from [somedb].[dbo].aers where name IN (SELECT Mail FROM @MyList)
So actually I want to extend ListB with the corresponding @MyList.CompanyNo.
What can I do? Thanks.
You could use a JOIN based on the condition from the WHERE clause:
INSERT INTO @ListB(Id,Mail,Company)
select a.someId,a.name,m.CompanyNo
from [somedb].[dbo].aers a
join @MyList m
ON a.name = m.Mail;
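A self-contained sketch of the same join is shown below. The table variable @Source is a hypothetical stand-in for [somedb].[dbo].aers, and the addresses and numbers are invented sample data:

```sql
-- @Source is a hypothetical stand-in for [somedb].[dbo].aers.
DECLARE @Source TABLE (someId nvarchar(50), name nvarchar(100));
DECLARE @MyList TABLE (Mail nvarchar(100), CompanyNo int);
DECLARE @ListB TABLE (Id nvarchar(50), Mail nvarchar(100), Company int);
INSERT INTO @Source VALUES ('1','a@example.com'),('2','b@example.com'),('3','c@example.com');
INSERT INTO @MyList VALUES ('a@example.com',20),('b@example.com',230),('b@example.com',120);
INSERT INTO @ListB(Id,Mail,Company)
SELECT s.someId,s.name,m.CompanyNo
FROM @Source s
JOIN @MyList m
ON s.name = m.Mail;
SELECT * FROM @ListB; -- three rows: one for a@example.com, two for b@example.com
```

Unlike the IN version, the JOIN returns a source row once per matching row in @MyList, so an address listed with two company numbers appears twice in @ListB.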

Is there a shortcut to deleting all in one table not in another?

Are there any shortcuts for deleting everything in one table that does not exist in the second?
I know I can do this:
DECLARE @Table1 TABLE (ID INT)
DECLARE @Table2 TABLE (ID INT)
INSERT INTO @Table1 VALUES (1),(2),(3),(4)
INSERT INTO @Table2 VALUES (3),(4)
DELETE t1
FROM @Table1 t1
WHERE NOT EXISTS (SELECT 1 FROM @Table2 t2 WHERE t2.ID = t1.ID)
SELECT * FROM @Table1
However, I have over 600 columns, so you can see why I might be reluctant to go that route if there's another way. What I WANT to do would look like this:
DECLARE @Table1 TABLE (ID INT)
DECLARE @Table2 TABLE (ID INT)
INSERT INTO @Table1 VALUES (1),(2),(3),(4)
INSERT INTO @Table2 VALUES (3),(4)
DELETE @Table1
EXCEPT SELECT * FROM @Table2
That EXCEPT has been very handy in dealing with this project I'm working on, but I guess it's limited.
Please use this:
DELETE FROM @Table1 WHERE BINARY_CHECKSUM(*) NOT IN(SELECT BINARY_CHECKSUM(*) FROM @Table2);
But be careful if your table contains float data types: in very rare cases a wrong checksum may be calculated. These cases are rare and random, so nothing should remain after a second delete pass.
Sure:
DELETE t1
FROM @Table1 t1
LEFT JOIN @Table2 t2 ON t2.ID = t1.ID
WHERE t2.ID IS NULL
My first answer covered the case where t1 and t2 have the same structure and the corresponding columns are compared when deciding what to delete.
OK, now for the other situation: your @Table1 column [ID] can be joined with any unknown @Table2 column. You can solve the 600+ columns problem using XML:
DELETE FROM @Table1 WHERE CONVERT(NVARCHAR, [ID]) NOT IN
(
SELECT
[col].[value]('(.)[1]', 'NVARCHAR(MAX)')
FROM
(
SELECT [xml] = (CONVERT(XML, (SELECT * FROM @Table2 FOR XML PATH('t2'))))
) AS [t2]
CROSS APPLY [t2].[xml].[nodes]('t2/*') AS [tab]([col])
);
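Since the question mentions EXCEPT, another option worth trying is a correlated INTERSECT inside NOT EXISTS, which compares whole rows without listing any columns. This is a sketch, not a tested fit for the 600-column tables: it assumes both tables have the same number of columns, in the same order, with compatible types. The two-column table variables here are made up for illustration.

```sql
-- Sketch: delete rows from @Table1 that have no identical row in @Table2.
DECLARE @Table1 TABLE (ID INT, Val VARCHAR(10));
DECLARE @Table2 TABLE (ID INT, Val VARCHAR(10));
INSERT INTO @Table1 VALUES (1,'a'),(2,'b'),(3,'c'),(4,NULL);
INSERT INTO @Table2 VALUES (3,'c'),(4,NULL);
DELETE t1
FROM @Table1 t1
WHERE NOT EXISTS (SELECT t1.* INTERSECT SELECT t2.* FROM @Table2 t2);
SELECT * FROM @Table1;
```

INTERSECT treats two NULLs as equal, so with this sample data the row (4, NULL) should be kept along with (3, 'c'), which a plain column-by-column equality comparison would not do.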

TSQL Getting the ident_current of a view that uses joins

Using the following queries, I am trying to understand why I am able to get the ident_current on one view, but not the other.
Here is some sample data:
create table temptable1 (id int identity(1,1), name varchar(100), [type] int)
insert into temptable1 values
( 'apple', 1),
( 'banana', 1),
( 'cake', 3)
create table temptable2 (id int identity(1,1), name varchar(100))
insert into temptable2 values
( 'fruit'),
( 'vegetable'),
( 'pastry')
exec ('
create view dbo.identcurrentworks
as
select
t1.id as t1id
,t1.name as t1name
,t1.type as t1type
from temptable1 t1
')
--drop view dbo.identcurrentworks
exec ('
create view dbo.identcurrentdoesnotwork
as
select
t1.id as t1id,
t1.name as t1name,
t1.type,
t2.id as t2id,
t2.name as t2name
from temptable1 t1
join temptable2 t2 on t1.type=t2.id
')
--drop view dbo.identcurrentdoesnotwork
select * from dbo.identcurrentworks
select IDENT_CURRENT('dbo.temptable1')
select IDENT_CURRENT('dbo.identcurrentworks')
select * from dbo.identcurrentdoesnotwork
select IDENT_CURRENT('dbo.temptable2')
select IDENT_CURRENT('dbo.identcurrentdoesnotwork')
--drop table temptable1
--drop table temptable2
I am uncertain as to why I can get the ident_current on the view dbo.identcurrentworks but not on the other. Any ideas?
@Pondlife is right - the view identcurrentdoesnotwork has no identity column defined because there are two identity columns in the select statement. You can verify this by running:
sp_help identcurrentdoesnotwork
Note that, as the others have pointed out, for most situations one should use SCOPE_IDENTITY instead of IDENT_CURRENT.
For more on IDENT_CURRENT, and on @@IDENTITY vs IDENT_CURRENT vs SCOPE_IDENTITY, consult the SQL Server documentation.
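As a rough illustration of the three functions side by side, the sketch below uses a throwaway table named dbo.IdentDemo (a name invented here) and a single insert:

```sql
-- Sketch: compare the three identity functions after one insert.
CREATE TABLE dbo.IdentDemo (id INT IDENTITY(1,1), name VARCHAR(100));
INSERT INTO dbo.IdentDemo (name) VALUES ('apple');
SELECT SCOPE_IDENTITY() AS ScopeIdentity               -- last identity in this scope and session
      ,@@IDENTITY AS SessionIdentity                   -- last identity in this session, any scope
      ,IDENT_CURRENT('dbo.IdentDemo') AS IdentCurrent; -- last identity for the table, any session
DROP TABLE dbo.IdentDemo;
```

In this simple case all three return the same value. They can diverge when triggers insert into other tables (@@IDENTITY) or when other sessions insert into the same table (IDENT_CURRENT), which is why SCOPE_IDENTITY is usually the safer choice.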