Compare 2 strings and get match - T-SQL

What I want is to compare two strings and get the number of characters they have in common.
For example:
I have declared a variable with the value Test1.
I get values from a table with a SELECT query and compare each of them with the variable, counting how many characters match in order, starting from the first character of the variable.
I compare the variable against the values from the query.
Characters are case sensitive (I use UPPER(string) to capitalize both the variable and the value from the SELECT statement).
I will select the string with the MAX() number. From the output image I will select Test1 and NOT Test11, because Test11 exceeds the number of characters in the variable.
Any suggestions?

You can use a recursive CTE for this...
For your next question: please do not post pictures. Rather, try to set up a stand-alone, self-running sample as I do here (DDL and INSERT).
DECLARE @tbl TABLE(ID INT IDENTITY, SomeValue VARCHAR(100));
INSERT INTO @tbl VALUES ('Test1')
,('Test11')
,('Test')
,('abc')
,('Tyes')
,('cest');
--This is the string we use to compare (casing depends on the underlying collation)
DECLARE @CheckString VARCHAR(100)='Test1';
--The query
WITH recCTE AS
(
    SELECT t.ID
    ,t.SomeValue
    ,1 AS pos
    --,SUBSTRING(@CheckString,1,1) AS LetterInCheckString
    --,SUBSTRING(t.SomeValue,1,1) AS LetterInTableValue
    ,CASE WHEN SUBSTRING(@CheckString,1,1)=SUBSTRING(t.SomeValue,1,1) THEN 1 ELSE 0 END AS IsTheSame
    FROM @tbl t
    UNION ALL
    SELECT recCTE.ID
    ,recCTE.SomeValue
    ,recCTE.Pos+1
    --,SUBSTRING(@CheckString,recCTE.Pos+1,1)
    --,SUBSTRING(recCTE.SomeValue,recCTE.Pos+1,1)
    ,CASE WHEN SUBSTRING(@CheckString,recCTE.Pos+1,1)=SUBSTRING(recCTE.SomeValue,recCTE.Pos+1,1) THEN 1 ELSE 0 END
    FROM recCTE
    WHERE recCTE.IsTheSame=1 AND SUBSTRING(@CheckString,recCTE.Pos+1,1) <>''
)
SELECT ID,SomeValue,SUM(IsTheSame) AS MatchingChars
FROM recCTE
GROUP BY ID,SomeValue
ORDER BY ID;
The idea in short:
We start with the recursion's anchor at position=1
We add to this, as long as the string is the same and substring() returns a value.
The result is the SUM() of same characters.
To be honest: T-SQL is the wrong tool for this...
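If recursion is not wanted, the same count can probably be obtained by looking for the first position where the two strings differ. The following is only a sketch under assumptions: the Numbers CTE and the column name MatchingChars are made up for illustration, sys.all_objects is used merely as a convenient row source, and trailing spaces are ignored because of how LEN() and string comparison treat them.

```sql
DECLARE @tbl TABLE(ID INT IDENTITY, SomeValue VARCHAR(100));
INSERT INTO @tbl VALUES ('Test1'),('Test11'),('Test'),('abc'),('Tyes'),('cest');

DECLARE @CheckString VARCHAR(100) = 'Test1';

-- Numbers 1 .. LEN(@CheckString)+1; the extra number acts as a sentinel
-- so that a full match still yields a "first mismatch" position.
WITH Numbers AS
(
    SELECT TOP (LEN(@CheckString) + 1)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects
)
SELECT t.ID
      ,t.SomeValue
      -- first differing position minus one = length of the common prefix
      ,(SELECT MIN(nr.n)
        FROM Numbers nr
        WHERE nr.n = LEN(@CheckString) + 1
           OR SUBSTRING(@CheckString, nr.n, 1) <> SUBSTRING(t.SomeValue, nr.n, 1)
       ) - 1 AS MatchingChars
FROM @tbl t
ORDER BY t.ID;
```

As with the recursive version, whether 't' equals 'T' depends on the collation in effect.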

Related

Concatenate string instead of just replacing it

I have a table with standard columns where I want to perform regular INSERTs.
But one of the columns is of type varchar with special semantics. It's a string that's supposed to behave as a set of strings, where the elements of the set are separated by commas.
E.g. if one row has the value fish,sheep,dove in that varchar column, and I insert the string ,fish,eagle, I want the result to be fish,sheep,dove,eagle (i.e. eagle gets added to the set, but fish doesn't because it's already in the set).
I have here this Postgres code that does the "set concatenation" that I want:
SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array('fish,sheep,dove' || ',fish,eagle', ','))) AS x;
But I can't figure out how to apply this logic to insertions.
What I want is something like:
CREATE TABLE IF NOT EXISTS t00(
userid int8 PRIMARY KEY,
a int8,
b varchar);
INSERT INTO t00 (userid,a,b) VALUES (0,1,'fish,sheep,dove');
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array(t00.b || EXCLUDED.b, ','))) AS x;
How can I achieve something like that?
Storing comma separated values is a huge mistake to begin with. But if you really want to make your life harder than it needs to be, you might want to create a function that merges two comma separated lists:
create function merge_lists(p_one text, p_two text)
returns text
as
$$
select string_agg(item, ',')
from (
select e.item
from unnest(string_to_array(p_one, ',')) as e(item)
where e.item <> '' --< necessary because of the leading , in your data
union
select t.item
from unnest(string_to_array(p_two, ',')) t(item)
where t.item <> ''
) t;
$$
language sql;
If you are using Postgres 14 or later, unnest(string_to_array(..., ',')) can be replaced with string_to_table(..., ',')
Then your INSERT statement gets a bit simpler:
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = merge_lists(excluded.b, t00.b);
I think I was only missing parentheses around the SELECT statement:
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = (SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array(t00.b || EXCLUDED.b, ','))) AS x);

Have strange DB developed which has no keys and allows entry of numbers into varchar which also allows nulls

How can I ensure that T-SQL will not bark at me when I convert these returned values to an int?
'1.00000000'
or
NULL
or
''
or
'some value'
If you are using SQL Server 2012 or later, you may use the TRY_CONVERT function, e.g.
WITH yourTable AS (
SELECT 123 AS intVal UNION ALL
SELECT '123' UNION ALL
SELECT NULL
)
SELECT
intVal,
CASE WHEN TRY_CONVERT(int, intVal) IS NOT NULL THEN 'yes' ELSE 'no' END AS can_parse
FROM yourTable;
The TRY_CONVERT function will return NULL in this case if it can't convert the input to an integer. So, this is a safe way to probe your data before trying a formal cast or conversion.
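Applied to the four values from the question, this could look roughly like the sketch below. It assumes SQL Server 2012 or later; the column aliases and the intermediate decimal(18, 8) type are illustrative choices, not something from the original posts. A string such as '1.00000000' does not convert directly to int, so the sketch goes through a decimal first and falls back to 0 when nothing can be parsed.

```sql
SELECT v.val
      -- direct attempt: NULL for strings such as '1.00000000' or 'some value'
      ,TRY_CONVERT(int, v.val) AS direct_int
      -- via decimal first, then int, with 0 as the default for unparsable input
      ,COALESCE(TRY_CONVERT(int, TRY_CONVERT(decimal(18, 8), v.val)), 0) AS int_or_zero
FROM (VALUES ('1.00000000'), (NULL), (''), ('some value')) AS v(val);
```

Note that a direct conversion of the empty string to int yields 0 rather than NULL, which may or may not be what is wanted.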
Here was the answer I found that worked for me...
TSQL - Cast string to integer or return default value
I'm not on 2012 or higher due to customer...
Don't give me credit though :) I was only good at searching for the answer that worked for me...
Although I changed it from returning null to returning zero since the stupid varchar should be an int column with a default of zero :)
Here's one that works for any value that is truly a VARCHAR and not an int, since VARCHAR is really a variable-length string data type:
WITH tmpTable AS (
SELECT '123' as intVal UNION ALL
SELECT 'dog' UNION ALL
SELECT '345' UNION ALL
SELECT 'cat' UNION ALL
SELECT '987' UNION ALL
SELECT '4f7g7' UNION ALL
SELECT NULL
)
SELECT
intVal
,case when intVal not like '%[^0-9]%' then 'yes' else 'no' end FROM tmpTable;
Credit goes to Tim Biegeleisen for his answer above. However, when non-digit characters are found, his solution will still error out, hence the changes.

Removing all the Alphabets from a string using a single SQL Query [duplicate]

I'm currently doing a data conversion project and need to strip all alphabetical characters from a string. Unfortunately I can't create or use a function as we don't own the source machine making the methods I've found from searching for previous posts unusable.
What would be the best way to do this in a select statement? Speed isn't too much of an issue as this will only be running over 30,000 records or so and is a once off statement.
You can do this in a single statement. You're not really creating a statement with 200+ REPLACEs are you?!
update tbl
set S = U.clean
from tbl
cross apply
(
select Substring(tbl.S,v.number,1)
-- this table will cater for strings up to length 2047
from master..spt_values v
where v.type='P' and v.number between 1 and len(tbl.S)
and Substring(tbl.S,v.number,1) like '[0-9]'
order by v.number
for xml path ('')
) U(clean)
Working SQL Fiddle showing this query with sample data
Replicated below for posterity:
create table tbl (ID int identity, S varchar(500))
insert tbl select 'asdlfj;390312hr9fasd9uhf012 3or h239ur ' + char(13) + 'asdfasf'
insert tbl select '123'
insert tbl select ''
insert tbl select null
insert tbl select '123 a 124'
Results
ID  S
1   390312990123239
2   123
3   (null)
4   (null)
5   123124
A recursive CTE can help here.
;WITH CTE AS
(
SELECT
[ProductNumber] AS OrigProductNumber
,CAST([ProductNumber] AS VARCHAR(100)) AS [ProductNumber]
FROM [AdventureWorks].[Production].[Product]
UNION ALL
SELECT OrigProductNumber
,CAST(STUFF([ProductNumber], PATINDEX('%[^0-9]%', [ProductNumber]), 1, '') AS VARCHAR(100) ) AS [ProductNumber]
FROM CTE WHERE PATINDEX('%[^0-9]%', [ProductNumber]) > 0
)
SELECT * FROM CTE
WHERE PATINDEX('%[^0-9]%', [ProductNumber]) = 0
OPTION (MAXRECURSION 0)
output:
OrigProductNumber ProductNumber
WB-H098 098
VE-C304-S 304
VE-C304-M 304
VE-C304-L 304
TT-T092 092
Here is RichardTheKiwi's script wrapped in a function, for use in SELECTs without CROSS APPLY.
I also kept the dot, because in my case I use it for double and money values stored in a varchar field.
CREATE FUNCTION dbo.ReplaceNonNumericChars (@string VARCHAR(5000))
RETURNS VARCHAR(1000)
AS
BEGIN
    -- treat a comma as a decimal point
    SET @string = REPLACE(@string, ',', '.')
    SET @string = (SELECT SUBSTRING(@string, v.number, 1)
                   FROM master..spt_values v
                   WHERE v.type = 'P'
                   AND v.number BETWEEN 1 AND LEN(@string)
                   AND (SUBSTRING(@string, v.number, 1) LIKE '[0-9]'
                        OR SUBSTRING(@string, v.number, 1) LIKE '[.]')
                   ORDER BY v.number
                   FOR
                   XML PATH('')
                  )
    RETURN @string
END
GO
Thanks RichardTheKiwi +1
Well if you really can't use a function, I suppose you could do something like this:
SELECT REPLACE(REPLACE(REPLACE(LOWER(col),'a',''),'b',''),'c','')
FROM dbo.table...
Obviously it would be a lot uglier than that, since I only handled the first three letters, but it should give the idea.

Update SQL variable when recordset is empty

We have a loop in SQL Server 2005 that walks a table, getting each item's parent until it reaches the top of the tree:
DECLARE @T Table
(
    ItemID INT NOT NULL PRIMARY KEY,
    AncestorID INT NULL
)
Which has data like this:
ItemID | AncestorID
1 2
2 3
3 4
4 NULL
We have a loop that basically does this:
DECLARE @AncestorID INT
SELECT @AncestorID = 1
WHILE (@AncestorID IS NOT NULL)
BEGIN
    --Do some work
    SELECT @AncestorID = T.AncestorID
    FROM @T t
    WHERE T.ItemID = @AncestorID
    print @AncestorID
END
(Yes I know SQL is set based, and this is processing row by row, the "Do some work" needs to be done line by line for a reason).
This has always worked fine until today when we ended up in an endless loop. Turns out the cause was some wrong data:
ItemID | AncestorID
1 2
2 3
4 NULL
ItemID 3 was deleted. The loop now never ends because AncestorID is never NULL - it stays at 3.
Is there any way to rewrite the select statement to make @AncestorID null if the SELECT query returns 0 rows, or do I need to have a separate SELECT statement to count the records and some IF ELSE type logic?
Is there any way to rewrite the select statement to make @AncestorID
null if the SELECT query returns 0 rows,
You can use an aggregate on T.AncestorID.
SELECT @AncestorID = min(T.AncestorID)
FROM @T t
WHERE T.ItemID = @AncestorID
You could use another variable, e.g. @PreviousAncestorId, to hold the previous value and reset @AncestorId to NULL before the query.
You could check @@ROWCOUNT after the query to see if a row was found.
The code will still have issues dealing with cycles of arbitrary length within the data, e.g. a row where both values are the same. You would need to keep track of the visited rows in order to detect cycles. A simple reality check would be to count the number of iterations of the loop and check it against the number of rows.
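One way to keep track of visited rows might look like the sketch below. It is only an illustration under assumptions: the @Visited table variable is hypothetical, the sample data reproduces the broken tree from the question, and the syntax is kept to what SQL Server 2005 accepts. Assigning from a scalar subquery with SET also resets the variable to NULL when no row is found, so the missing ItemID 3 ends the loop.

```sql
DECLARE @T TABLE (ItemID INT NOT NULL PRIMARY KEY, AncestorID INT NULL);
INSERT INTO @T SELECT 1, 2 UNION ALL SELECT 2, 3 UNION ALL SELECT 4, NULL; -- ItemID 3 is missing

-- hypothetical helper table: every ItemID the loop has already processed
DECLARE @Visited TABLE (ItemID INT NOT NULL PRIMARY KEY);

DECLARE @AncestorID INT;
SET @AncestorID = 1;

WHILE (@AncestorID IS NOT NULL)
BEGIN
    -- seeing the same item twice means the data contains a cycle
    IF EXISTS (SELECT 1 FROM @Visited WHERE ItemID = @AncestorID)
        BREAK;
    INSERT INTO @Visited (ItemID) VALUES (@AncestorID);

    --Do some work

    -- a scalar subquery returns NULL when no row matches, which ends the loop
    SET @AncestorID = (SELECT t.AncestorID FROM @T t WHERE t.ItemID = @AncestorID);
    PRINT @AncestorID;
END
```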
Use a Break
e.g.
WHILE (@AncestorID IS NOT NULL)
BEGIN
    SELECT T.AncestorID INTO #TEMP
    FROM @T t WHERE T.ItemID = @AncestorID
    IF((SELECT COUNT(*) FROM #TEMP) = 0) BREAK;
    SELECT @AncestorID=T.AncestorID
    FROM #TEMP t
    print @AncestorID
    DROP TABLE #TEMP
END

Check if a varchar is a number (T-SQL)

Is there an easy way to figure out if a varchar is a number?
Examples:
abc123 --> no number
123 --> yes, it's a number
ISNUMERIC will not do - it tells you that the string can be converted to any of the numeric types, which is almost always a pointless piece of information to know. For example, all of the following are numeric, according to ISNUMERIC:
£, $, 0d0
If you want to check for digits and only digits, a negative LIKE expression is what you want:
not Value like '%[^0-9]%'
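To see the difference side by side, something like the following sketch could be run (it assumes SQL Server 2008 or later for the VALUES table constructor; the aliases Val, IsNumericResult and DigitsOnly are made up for illustration). One detail worth knowing is that an empty string contains no non-digit character, so it passes the negative LIKE test on its own; the sketch therefore excludes it explicitly.

```sql
SELECT v.Val
      ,ISNUMERIC(v.Val) AS IsNumericResult
      -- digits only: no character outside 0-9, and not the empty string
      ,CASE WHEN v.Val <> '' AND v.Val NOT LIKE '%[^0-9]%' THEN 1 ELSE 0 END AS DigitsOnly
FROM (VALUES ('123'), ('abc123'), ('$'), ('0d0'), (''), (NULL)) AS v(Val);
```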
ISNUMERIC will do
Check the NOTES section too in the article.
You can check like this:
declare @vchar varchar(50)
set @vchar ='34343';
select case when @vchar not like '%[^0-9]%' then 'Number' else 'Not a Number' end
Using SQL Server 2012+, you can use the TRY_* functions if you have specific needs. For example,
-- will fail for decimal values, but allow negative values
TRY_CAST(@value AS INT) IS NOT NULL
-- will fail for non-positive integers; can be used with other examples below as well, or reversed if only negative desired
TRY_CAST(@value AS INT) > 0
-- will fail if a $ is used, but allow decimals to the specified precision
TRY_CAST(@value AS DECIMAL(10,2)) IS NOT NULL
-- will allow valid currency
TRY_CAST(@value AS MONEY) IS NOT NULL
-- will allow scientific notation to be used like 1.7E+3
TRY_CAST(@value AS FLOAT) IS NOT NULL
I ran into the need to allow decimal values, so I used not Value like '%[^0-9.]%'
Wade73's answer for decimals doesn't quite work. I've modified it to allow only a single decimal point.
declare @MyTable table(MyVar nvarchar(10));
insert into @MyTable (MyVar)
values
(N'1234')
, (N'000005')
, (N'1,000')
, (N'293.8457')
, (N'x')
, (N'+')
, (N'293.8457.')
, (N'......');
-- This shows that Wade73's answer allows some non-numeric values to slip through.
select * from (
    select
    MyVar
    , case when MyVar not like N'%[^0-9.]%' then 1 else 0 end as IsNumber
    from
    @MyTable
) t order by IsNumber;
-- Notice the addition of "and MyVar not like N'%.%.%'".
select * from (
    select
    MyVar
    , case when MyVar not like N'%[^0-9.]%' and MyVar not like N'%.%.%' then 1 else 0 end as IsNumber
    from
    @MyTable
) t
order by IsNumber;
Damien_The_Unbeliever noted that his was only good for digits
Wade73 added a bit to handle decimal points
neizan made an additional tweak as did notwhereuareat.
Unfortunately, none appear to handle negative values and they appear to have issues with a comma in the value...
Here's my tweak to pick up negative values and those with commas
declare @MyTable table(MyVar nvarchar(10));
insert into @MyTable (MyVar)
values
(N'1234')
, (N'000005')
, (N'1,000')
, (N'293.8457')
, (N'x')
, (N'+')
, (N'293.8457.')
, (N'......')
, (N'.')
, (N'-375.4')
, (N'-00003')
, (N'-2,000')
, (N'3-3')
, (N'3000-')
;
-- This shows that Neizan's answer allows "." to slip through.
select * from (
    select
    MyVar
    , case when MyVar not like N'%[^0-9.]%' then 1 else 0 end as IsNumber
    from
    @MyTable
) t order by IsNumber;
-- Notice the addition of "and MyVar not like '.'".
select * from (
    select
    MyVar
    , case when MyVar not like N'%[^0-9.]%' and MyVar not like N'%.%.%' and MyVar not like '.' then 1 else 0 end as IsNumber
    from
    @MyTable
) t
order by IsNumber;
--Trying to tweak for negative values and the comma
--Modified when comparison
select * from (
    select
    MyVar
    , case
    when MyVar not like N'%[^0-9.,-]%' and MyVar not like '.' and isnumeric(MyVar) = 1 then 1
    else 0
    end as IsNumber
    from
    @MyTable
) t
order by IsNumber;
DECLARE @A nvarchar(100) = '12'
IF(ISNUMERIC(@A) = 1)
BEGIN
    PRINT 'YES NUMERIC'
END
Neizan's code lets values of just a "." through. At the risk of getting too pedantic, I added one more AND clause.
declare @MyTable table(MyVar nvarchar(10));
insert into @MyTable (MyVar)
values
(N'1234')
, (N'000005')
, (N'1,000')
, (N'293.8457')
, (N'x')
, (N'+')
, (N'293.8457.')
, (N'......')
, (N'.')
;
-- This shows that Neizan's answer allows "." to slip through.
select * from (
    select
    MyVar
    , case when MyVar not like N'%[^0-9.]%' then 1 else 0 end as IsNumber
    from
    @MyTable
) t order by IsNumber;
-- Notice the addition of "and MyVar not like '.'".
select * from (
    select
    MyVar
    , case when MyVar not like N'%[^0-9.]%' and MyVar not like N'%.%.%' and MyVar not like '.' then 1 else 0 end as IsNumber
    from
    @MyTable
) t
order by IsNumber;
Do not forget to exclude carriage returns from your data!
As in:
SELECT
Myotherval
, CASE WHEN TRIM(REPLACE([MyVal], char(13) + char(10), '')) not like '%[^0-9]%' and RTRIM(REPLACE([MyVal], char(13) + char(10), '')) not like '.' and isnumeric(REPLACE([MyVal], char(13) + char(10), '')) = 1 THEN 'my number: ' + [MyVal]
ELSE ISNULL(Cast([MyVal] AS VARCHAR(8000)), '')
END AS 'MyVal'
FROM MyTable
In case you want to add a constraint on a field:
Positive integer with fixed length
ALTER TABLE dbo.BankBranchType
ADD CONSTRAINT CK_TransitNumberMustBe5Digits
CHECK (TransitNumber NOT like '%[^0-9]%'
AND LEN(TransitNumber) = 5)
To check number, currency, and amount values, use the SQL fragment below.
@value NOT LIKE '%[^0-9.,]%'
For a quick win, refer to the example below:
Function example:
CREATE FUNCTION [dbo].[fnCheckValueIsNumber](
    @value NVARCHAR(255)=NULL
)RETURNS INT AS BEGIN
    DECLARE @ReturnValue INT=0
    -- 1 when the value contains nothing but digits, dots and commas
    IF EXISTS (SELECT * WHERE @value NOT LIKE '%[^0-9.,]%') SELECT @ReturnValue=1
    RETURN @ReturnValue;
END
Execution result
SELECT [dbo].[fnCheckValueIsNumber]('12345')
RESULT = 1
SELECT [dbo].[fnCheckValueIsNumber]('10020.25')
RESULT = 1
SELECT [dbo].[fnCheckValueIsNumber]('10,020.25')
RESULT = 1
SELECT [dbo].[fnCheckValueIsNumber]('12,345ABCD')
RESULT = 0