Trying to extract text using CHARINDEX ()- 1 but getting an error - tsql

I have a column with Names, and I am trying to split the column into First and Last Name using Text functions such as LEFT/SUBSTRING/CHARINDEX.
Data in the column:
Name
Yang, Jon
Huang, Eugene
Torres, Ruben
Zhu, Christy
Johnson, Elizabeth
Everything works fine as long as I use this code:
SELECT
[Name]
--,LEFT([Name], CHARINDEX(' ', [Name])) AS FirstName
,SUBSTRING([Name], 1, CHARINDEX(' ', [Name] )) AS FirstName
FROM
DataModeling.Customer
But the problem arises when I try to subtract 1 from CHARINDEX to exclude the Comma from the result and it throws this error:
I have done this operation many times in Excel so trying to replicate it with TSQL. Any suggestion on what I am doing wrong is helpful.

You get that error when CHARINDEX(' ', [Name] ) return 0. So minus 1 will make it negative and it is invalid value for substring()
You can use CASE expression to check the return value from CHARINDEX() and return the correct value to substring()
Or, you can "cheat" by using
CHARINDEX( ' ', [Name] + ' ' )
So CHARINDEX() will always return a value that is more than 0

Related

TSQL query to extract a value between to char where a specific set of characters is there

I have a problem I can't seem to figure out. I am trying to extract capacity from a product description. It is always between two values, "," and "oz." however there could be other commas included in the description that are not part of what I'm trying to extract. Example value is , 15 oz., or , 2 oz.,
I'm trying to find values that have the oz in them and are between two commas and I have been completely unsuccessfully. I've tried many things, but here is the latest that I have tried today and I'm just getting an error.
SELECT SUBSTRING(
FullDescription,
CHARINDEX(',', FullDescription),
CHARINDEX('oz.',FullDescription)
- CHARINDEX(',', FullDescription)
+ Len('oz.')
)
from CatalogManagement.Product
Since the backwards pattern ,.zo is more recognisable, I'd go with the REVERSE function
Sample values:
"something, something more, 18oz., complete"
"shorter, 12oz., remainder"
"there is no capacity, in this, value"
"a bit more, 14oz, and some followups, maybe"
SELECT REVERSE(
SUBSTRING (
REVERSE(FullDescription),
CHARINDEX(',.zo', REVERSE(FullDescription)) + 1,
CHARINDEX(',', REVERSE(FullDescription), CHARINDEX(',.zo', REVERSE(FullDescription)) + 1) - CHARINDEX(',.zo', REVERSE(FullDescription)) - 1
)
)
FROM CatalogManagement.Product
WHERE FullDescription LIKE '%oz.,%'
You might use XML-splitting together with a XQuery predicate:
DECLARE #tbl TABLE(ID INT IDENTITY, YourString VARCHAR(MAX));
INSERT INTO #tbl VALUES('Here is one with an amount, 1 oz., some more text')
,('Here is one with no amount, some more text')
,('a, 10 oz.')
,('b, 20oz., no blank between oz and the number')
,('30oz., starts with the pattern, no leading comma');
SELECT t.*
,A.oz.value('.','nvarchar(max)') oz
FROM #tbl t
CROSS APPLY(SELECT CAST('<x>' + REPLACE((SELECT t.YourString AS [*] FOR XML PATH('')),',','</x><x>') + '</x>' AS XML)
.query('/x[contains(text()[1],"oz.")]')) A(oz);
The idea in short:
We use some string methods to replace commas with XML tags and to cast your string to XML. each fragment is placed within a decent <x> element.
We use a predicate to return just the fragments containing "oz.".
You can filter easily with
WHERE LEN(A.oz.value('.','nvarchar(max)'))>0

How to find/replace weird whitespace in string

I find in my sql database string whit weird whitespace which cannot be replace like REPLACE(string, ' ', '') RTRIM and cant it even find with string = '% %'. This space is even transfered to new table when using SELECT string INTO
If i select this string in managment studio and copy that is seems is normal space and when everything is works but cant do nothing directly from database. What else can i do? Its some kind of error or can i try some special character for this?
First, you must identify the character.
You can do that by using a tally table (or a cte) and the Unicode function:
The following script will return a table with two columns: one contains a char and the other it's unicode value:
DECLARE #Str nvarchar(100) = N'This is a string containing 1 number and some words.';
with Tally(n) as
(
SELECT TOP(LEN(#str)) ROW_NUMBER() OVER(ORDER BY ##SPID)
FROM sys.objects a
--CROSS JOIN sys.objects b -- (unremark if there are not enough rows in the tally cte)
)
SELECT SUBSTRING(#str, n, 1) As TheChar,
UNICODE(SUBSTRING(#str, n, 1)) As TheCode
FROM Tally
WHERE n <= LEN(#str)
You can also add a condition to the where clause to only include "special" chars:
AND SUBSTRING(#str, n, 1) NOT LIKE '[a-zA-Z0-9]'
Then you can replace it using it's unicode value using nchar (I've used 32 in this example since it's unicode "regular" space:
SELECT REPLACE(#str, NCHAR(32), '|')
Result:
This|is|a|string|containing|1|number|and|some|words.

PostgreSQL return last n words

How to return last n words using Postgres.
I have tried using LEFT method.
SELECT DISTINCT LEFT(name, -4) FROM my_table;
but it return last 4 characters ,i want to return last 3 words.
demo:db<>fiddle
You can do this using a the SUBSTRING() function and regular expressions:
SELECT
SUBSTRING(name FROM '((\S+\s+){0,3}\S+$)')
FROM my_table
This has been explained here: How can I match the last two words in a sentence in PostgreSQL?
\S+ is a string of non-whitespace characters
\s+ is a string of whitespace characters (e.g. one space)
(\S+\s+){0,3} Zero to three words separated by a space
\S+$ one word at the end of the text.
-> creates 4 words (or less if there are no more).
One way is to use regexp_split_to_array() to split the string into the words it contains and then put a string back together using the last 3 words in that array.
SELECT coalesce(w.words[array_length(w.words, 1) - 2] || ' ', '')
|| coalesce(w.words[array_length(w.words, 1) - 1] || ' ', '')
|| coalesce(w.words[array_length(w.words, 1)], '')
FROM mytable t
CROSS JOIN LATERAL (SELECT regexp_split_to_array(t."name", ' ') words) w;
db<>fiddle
RIGHT() should do
SELECT RIGHT('MYCOLUMN', 4); -- returns LUMN
UPD
You can convert to array and then back to string
SELECT array_to_string(sentence[(array_length(sentence,1)-3):(array_length(sentence,1))],' ','*')
FROM
(
SELECT regexp_split_to_array('this is the one of the way to get the last four words of the string', E'\\s+') AS sentence
) foo;
DEMO HERE

TSQL - CONCAT_NULL_YIELDS_NULL ON Not Returning Null?

I have had a look around and seem to have come across a strange issue with SQL Server 2008 R2.
I understand that with CONCAT_NULL_YIELDS_NULL = ON means that the following will always resolve to NULL
SELECT NULL + 'My String'
I'm happy with that, however when using this in conjunction with COALESCE() it doesn’t appear to be working on my database.
Consider the following query where MyString is VARCHAR(2000)
SELECT COALESCE(MyString + ', ', '') FROM MyTableOfValues
Now in my query, when MyString IS NULL it returns an empty (NOT NULL) string. I can see this in the query results window.
However unusually enough, when running this in conjunction with an INSERT it fails to recognise the CONCAT_NULL_YIELDS_NULL instead, inserting a blank ‘, ‘.
Query is as follows for insert.
CONCAT_NULL_YIELDS_NULL ON
INSERT INTO Mytable(StringValue)
SELECT COALESCE(MyString + ', ', '')
FROM MyTableOfValues
Further to this I have also checked the database and CONCAT_NULL_YIELDS_NULL = TRUE…
Use NULLIF(MyString, '') instead of just MyString:
SELECT COALESCE(NULLIF(MyString, '') + ', ', '') FROM MyTableOfValues
Coalesce returns the first nonnull expression among its arguments.
You're getting a ', ' because it's the first nonnull expression in your coalesce call.
http://msdn.microsoft.com/en-us/library/ms190349.aspx
From some of the answers provided I was able to assertain a more in depth understanding of COALESCE().
The reason the above query did not fully work was because although I was checking for nulls, and empty string ('') is not considered null. Therefore although the above query worked, I should have checked for empty strings in my table first.
e.g.
SELECT COALESCE(FirstName + ', ', '') + Surname
FROM
(
SELECT 'Joe' AS Firstname, 'Bloggs' AS Surname UNION ALL
SELECT NULL, 'Jones' UNION ALL
SELECT '', 'Jones' UNION ALL
SELECT 'Bob', 'Tielly'
) AS [MyTable]
Will return
FullName
-----------
Joe, Bloggs
Jones
, Jones
Bob, Tielly
Now row 3 has returned a "," character which I was not originally expecting due to a Blank but NOT NULL value.
The following code now works as expected as it checks for blank values. It works, but it looks like I took the long way around. There may be a better way.
-- Ammended Query
SELECT COALESCE(REPLACE(FirstName, Firstname , Firstname + ', '), '') + Surname AS FullName0
FROM
(
SELECT 'Joe' AS Firstname, 'Bloggs' AS Surname UNION ALL
SELECT NULL, 'Jones' UNION ALL
SELECT '', 'Jones' UNION ALL
SELECT 'Bob', 'Tielly'
) AS [MyTable]

Extract the first word of a string in a SQL Server query

What's the best way to extract the first word of a string in sql server query?
SELECT CASE CHARINDEX(' ', #Foo, 1)
WHEN 0 THEN #Foo -- empty or single word
ELSE SUBSTRING(#Foo, 1, CHARINDEX(' ', #Foo, 1) - 1) -- multi-word
END
You could perhaps use this in a UDF:
CREATE FUNCTION [dbo].[FirstWord] (#value varchar(max))
RETURNS varchar(max)
AS
BEGIN
RETURN CASE CHARINDEX(' ', #value, 1)
WHEN 0 THEN #value
ELSE SUBSTRING(#value, 1, CHARINDEX(' ', #value, 1) - 1) END
END
GO -- test:
SELECT dbo.FirstWord(NULL)
SELECT dbo.FirstWord('')
SELECT dbo.FirstWord('abc')
SELECT dbo.FirstWord('abc def')
SELECT dbo.FirstWord('abc def ghi')
I wanted to do something like this without making a separate function, and came up with this simple one-line approach:
DECLARE #test NVARCHAR(255)
SET #test = 'First Second'
SELECT SUBSTRING(#test,1,(CHARINDEX(' ',#test + ' ')-1))
This would return the result "First"
It's short, just not as robust, as it assumes your string doesn't start with a space. It will handle one-word inputs, multi-word inputs, and empty string inputs.
Enhancement of Ben Brandt's answer to compensate even if the string starts with space by applying LTRIM(). Tried to edit his answer but rejected, so I am now posting it here separately.
DECLARE #test NVARCHAR(255)
SET #test = 'First Second'
SELECT SUBSTRING(LTRIM(#test),1,(CHARINDEX(' ',LTRIM(#test) + ' ')-1))
Adding the following before the RETURN statement would solve for the cases where a leading space was included in the field:
SET #Value = LTRIM(RTRIM(#Value))
Marc's answer got me most of the way to what I needed, but I had to go with patIndex rather than charIndex because sometimes characters other than spaces mark the ends of my data's words. Here I'm using '%[ /-]%' to look for space, slash, or dash.
Select race_id, race_description
, Case patIndex ('%[ /-]%', LTrim (race_description))
When 0 Then LTrim (race_description)
Else substring (LTrim (race_description), 1, patIndex ('%[ /-]%', LTrim (race_description)) - 1)
End race_abbreviation
from tbl_races
Results...
race_id race_description race_abbreviation
------- ------------------------- -----------------
1 White White
2 Black or African American Black
3 Hispanic/Latino Hispanic
Caveat: this is for a small data set (US federal race reporting categories); I don't know what would happen to performance when scaled up to huge numbers.
DECLARE #string NVARCHAR(50)
SET #string = 'CUT STRING'
SELECT LEFT(#string,(PATINDEX('% %',#string)))
Extract the first word from the indicated field:
SELECT SUBSTRING(field1, 1, CHARINDEX(' ', field1)) FROM table1;
Extract the second and successive words from the indicated field:
SELECT SUBSTRING(field1, CHARINDEX(' ', field1)+1, LEN (field1)-CHARINDEX(' ', field1)) FROM table1;
A slight tweak to the function returns the next word from a start point in the entry
CREATE FUNCTION [dbo].[GetWord]
(
#value varchar(max)
, #startLocation int
)
RETURNS varchar(max)
AS
BEGIN
SET #value = LTRIM(RTRIM(#Value))
SELECT #startLocation =
CASE
WHEN #startLocation > Len(#value) THEN LEN(#value)
ELSE #startLocation
END
SELECT #value =
CASE
WHEN #startLocation > 1
THEN LTRIM(RTRIM(RIGHT(#value, LEN(#value) - #startLocation)))
ELSE #value
END
RETURN CASE CHARINDEX(' ', #value, 1)
WHEN 0 THEN #value
ELSE SUBSTRING(#value, 1, CHARINDEX(' ', #value, 1) - 1)
END
END
GO
SELECT dbo.GetWord(NULL, 1)
SELECT dbo.GetWord('', 1)
SELECT dbo.GetWord('abc', 1)
SELECT dbo.GetWord('abc def', 4)
SELECT dbo.GetWord('abc def ghi', 20)
Try This:
Select race_id, race_description
, Case patIndex ('%[ /-]%', LTrim (race_description))
When 0 Then LTrim (race_description)
Else substring (LTrim (race_description), 1, patIndex ('%[ /-]%', LTrim (race_description)) - 1)
End race_abbreviation
from tbl_races