T-SQL Substring of a Link - tsql

I have a string from which I want to extract a substring:
"app/reports/here"
The resulting substring I want is "app/reports"
I have some T-SQL as below:
DECLARE @document varchar(64);
SELECT @document = 'app/reports/here';
SELECT substring(@document, 0, CHARINDEX('/', replace(@document,'app/','')));
GO
The result of the code above is "app/rep".
How could I get the full string I need? Something about the CHARINDEX is confusing me.
Thanks

Just add 4 to the CHARINDEX result. You remove the 4 characters of 'app/' from the original string before searching, so the position CHARINDEX returns is shifted by 4. So it should be:
DECLARE @document varchar(64);
SELECT @document = 'app/reports/here';
SELECT substring(@document, 0, CHARINDEX('/', replace(@document,'app/',''))+4);
GO
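An alternative that avoids the REPLACE (and the +4 correction) is to locate the second slash directly, using the optional third argument of CHARINDEX as the search start position. This is only a sketch; it assumes the string always contains at least two slashes, and the variable names @first and @second are introduced here for illustration:

```sql
DECLARE @document varchar(64) = 'app/reports/here';

-- Position of the first '/'.
DECLARE @first int = CHARINDEX('/', @document);

-- Position of the second '/': start searching one character after the first.
DECLARE @second int = CHARINDEX('/', @document, @first + 1);

-- Keep everything before the second '/'.
SELECT LEFT(@document, @second - 1) AS result;  -- 'app/reports'
```

If the second slash is missing, CHARINDEX returns 0 and LEFT receives a negative length, which raises an error, so input that may lack the delimiter needs a guard.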

Related

Extract specific words from text value

I have a query and a returned value that looks like this:
select properties->>'text' as snippet from table where id = 31;
snippet
-----------------------------------
There are many variations of passages of Lorem Ipsum available, but the majority have suffered alteration in some form, by injected humour, or randomised words which don't look even slightly believable.
(1 row)
This returns as I expect based on my query.
Is there a way that I can slice the returned text to only return words from position 5 to position 8 for example? Or alternatively, slice by character position which I will be able to use as a workaround?
I have tried using:
select properties->>'text'[0:13] as snippet from table where id = 31;
Which I hoped would return:
There are many
But it hasn't worked.
Is it possible to slice a jsonb text field?
To "slice by character position", you can simply use the substr() function:
select substr(properties->>'text', 1, 14) as snippet
from the_table
where id = 31;
If you really want "words", you can split the text into an array using e.g. regexp_split_to_array. Once you have an array, you can use the slice syntax:
select (regexp_split_to_array(properties->>'text','\s+'))[5:8] as snippet
from the_table
where id = 31;
This returns an array. If you want it as a string, you can use array_to_string():
select array_to_string((regexp_split_to_array(properties->>'text','\s+'))[5:8],' ') as snippet
from the_table
where id = 31;
If you need that frequently, I would wrap it into a function:
create function extract_words(p_input text, p_start int, p_end int)
returns text
as
$$
select array_to_string((regexp_split_to_array(p_input,'\s+'))[p_start:p_end],' ');
$$
language sql
immutable;
Then the query is much easier to read:
select extract_words(properties->>'text', 5, 8) as snippet
from the_table
where id = 31;

Trying to manipulate string such as if '26169;#c785643', then the result should be like 'c785643'

I am trying to manipulate string data in a column such that if the given string is '20591;#e123456;#17507;#c567890;#15518;#e135791' or '26169;#c785643', then the result should be 'e123456;c567890;e135791' or 'c785643'. The number of digits in between can be of any length.
Some of the things I have tried so far are:
select replace('20591;#e123456;#17507;#c567890;#15518;#e135791','#','');
This leaves me with '20591;e123456;17507;c567890;15518;e135791', which still includes the numbers that have no 'e' or 'c' prefix. I want to get rid of 20591, 17507 and 15518.
Create function that will keep a pattern of '%[#][ec][0-9][;]%' and will get rid of the rest.
The most important advice is: do not store any data in a delimited string. This violates the most basic principle of relational database design (first normal form, 1NF).
The second hint is SO-related: Please always add / tag your questions with the appropriate tool. The tag [tsql] points to SQL-Server, but this might be wrong (which would invalidate both answers). Please tag the full product with its version (e.g. [sql-server-2012]). Especially with string splitting there are very important product related changes from version to version.
Now to your question.
Working with (almost) any version of SQL-Server
My suggestion uses a trick with XML:
(credits to Alan Burstein for the mockup)
DECLARE @table TABLE (someid INT IDENTITY, somestring VARCHAR(50));
INSERT @table VALUES ('20591;#e123456;#17507;#c567890;#15518;#e135791'),('26169;#c785643')
--the query
SELECT t.someid,t.somestring,A.CastedToXml
,STUFF(REPLACE(A.CastedToXml.query('/x[contains(text()[1],"#") and empty(substring(text()[1],2,100) cast as xs:int?)]')
.value('.','nvarchar(max)'),'#',';'),1,1,'') TheNewList
FROM @table t
CROSS APPLY(SELECT CAST('<x>' + REPLACE(t.somestring,';','</x><x>') + '</x>' AS XML)) A(CastedToXml);
The idea in short:
By replacing the ; with XML tags </x><x> we can transform your delimited list to XML. I included the intermediate XML into the result set. Just click it to see how this works.
In the next step I use an XQuery predicate to find entries which, first, contain a # and, second, do NOT cast to an integer once the # is removed.
The third step is specific to XML again. The XPath . in .value() will return all content as one string.
Finally we have to replace the # with ; and cut away the leading ; using STUFF().
UPDATE The same idea, but a bit shorter:
You can try this as well
SELECT t.someid,t.somestring,A.CastedToXml
,REPLACE(A.CastedToXml.query('data(/x[empty(. cast as xs:int?)])')
.value('.','nvarchar(max)'),' ',';') TheNewList
FROM @table t
CROSS APPLY(SELECT CAST('<x>' + REPLACE(t.somestring,';#','</x><x>') + '</x>' AS XML)) A(CastedToXml);
Here I use ;# to split your string and data() to implicitly concatenate your values (blank-separated).
UPDATE 2 for v2017
If you have v2017+ I'd suggest a combination of a JSON splitter and STRING_AGG():
SELECT t.someid,STRING_AGG(A.[value],';') AS TheNewList
FROM @table t
CROSS APPLY OPENJSON(CONCAT('["',REPLACE(t.somestring,';#','","'),'"]')) A
WHERE TRY_CAST(A.[value] AS INT) IS NULL
GROUP BY t.someid;
You did not include the version of SQL Server you are on. If you are using 2016+ you can use STRING_SPLIT; otherwise a good T-SQL splitter will do.
Against a single variable:
DECLARE @somestring VARCHAR(1000) = '20591;#e123456;#17507;#c567890;#15518;#e135791';
SELECT NewString = STUFF((
SELECT ','+split.item
FROM STRING_SPLIT(@somestring,';') AS s
CROSS APPLY (VALUES(REPLACE(s.[value],'#',''))) AS split(item)
WHERE split.item LIKE '[a-z][0-9]%'
FOR XML PATH('')),1,1,'');
Returns:
NewString
----------------------
e123456,c567890,e135791
Against a table:
DECLARE @table TABLE (someid INT IDENTITY, somestring VARCHAR(50));
INSERT @table VALUES ('20591;#e123456;#17507;#c567890;#15518;#e135791'),('26169;#c785643')
SELECT t.*, fn.NewString
FROM @table AS t
CROSS APPLY
(
SELECT NewString = STUFF((
SELECT ','+split.item
FROM STRING_SPLIT(t.somestring,';') AS s
CROSS APPLY (VALUES(REPLACE(s.[value],'#',''))) AS split(item)
WHERE split.item LIKE '[a-z][0-9]%'
FOR XML PATH('')),1,1,'')
) AS fn;
Returns:
someid  somestring                                      NewString
------  ----------------------------------------------  -----------------------
1       20591;#e123456;#17507;#c567890;#15518;#e135791  e123456,c567890,e135791
2       26169;#c785643                                  c785643

CHARINDEX and SUBSTRING between two Pipes

I have a string in a table :
S-1-5-21-109290937-1013972632-435976164-15678|l.smith|DOMAIN-UK|0x95231|1
I need to extract the data between the first and second | .
So from the above it would return l.smith only.
I have tried various CHARINDEX and SUBSTRING combinations, but they always error. The length of the string also changes, so I can't trim the other | out.
I tried this just now; I hope this is what you are looking for:
DECLARE @VALUE VARCHAR(MAX) = 'S-1-5-21-109290937-1013972632-435976164-15678|l.smith|DOMAIN-UK|0x95231|1';
DECLARE @FIRSTINDEX INT = CHARINDEX('|',@VALUE,1);
SELECT
SUBSTRING(@VALUE, @FIRSTINDEX+1, CHARINDEX('|',@VALUE,@FIRSTINDEX+1)-CHARINDEX('|',@VALUE,1)-1);
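The query above relies on both pipes being present; if either is missing, the computed length becomes negative and SUBSTRING raises an error. The following is a hedged sketch of one way to guard against that, by appending a sentinel '|' before searching for the closing delimiter. The names @first, @second and second_field are made up for this illustration:

```sql
DECLARE @value varchar(max) =
    'S-1-5-21-109290937-1013972632-435976164-15678|l.smith|DOMAIN-UK|0x95231|1';

-- Position of the first '|' (0 if the string contains none).
DECLARE @first int = CHARINDEX('|', @value);

-- Position of the next '|'. The appended sentinel guarantees a match,
-- so the length computed below can never be negative.
DECLARE @second int = CHARINDEX('|', @value + '|', @first + 1);

-- Return NULL when there is no first delimiter at all.
SELECT CASE
           WHEN @first > 0
           THEN SUBSTRING(@value, @first + 1, @second - @first - 1)
       END AS second_field;  -- 'l.smith'
```

With this shape, a string that has only one pipe returns everything after that pipe instead of failing.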

How to get last part of nvarchar with variable size in T-SQL?

Imagine that I have the following value in my nvarchar variable:
DECLARE @txt nvarchar(255)
SET @txt = '32|foo|foo2|123'
Is there a way to easily get the last part just after the last | that is 123 in this case ?
I could write a split function but I'm not interested in the first parts of this string. Is there another way to get that last part of the string without getting the first parts ?
Note that all parts of my string have variable sizes.
You can use a combination of LEFT, REVERSE and CHARINDEX for this.
The query below reverses the string, finds the first occurrence of |, keeps only the characters before it, and then reverses the result back.
DECLARE @txt nvarchar(255)
SET @txt = '32|foo|foo2|123'
SELECT REVERSE(LEFT(REVERSE(@txt),CHARINDEX('|',REVERSE(@txt))-1))
Output
123
Edit
If your string has four parts or fewer and . isn't a valid character in it, you can also use PARSENAME for this.
DECLARE @txt nvarchar(255)
SET @txt = '32|foo|foo2|123'
SELECT PARSENAME(REPLACE(@txt,'|','.'),1)
You can reverse your string to get the desired result:
DECLARE @txt nvarchar(255) = '32|foo|foo2|123'
SELECT REVERSE(SUBSTRING(REVERSE(@txt), 1, CHARINDEX('|', REVERSE(@txt)) -1))
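Both answers fail with an invalid-length error when the string contains no | at all, because CHARINDEX then returns 0. As a sketch of a variant that tolerates this, the position of the last | can be found in the reversed string and the tail taken with RIGHT; the sentinel N'|' appended to the reversed string is an addition of this example, not part of the original answers:

```sql
DECLARE @txt nvarchar(255) = N'32|foo|foo2|123';

-- In the reversed string, the first '|' marks the last '|' of the original.
-- The appended sentinel makes CHARINDEX return LEN + 1 when the original
-- has no '|', so RIGHT then returns the whole string instead of failing.
SELECT RIGHT(@txt, CHARINDEX(N'|', REVERSE(@txt) + N'|') - 1) AS last_part;  -- '123'
```

This needs only one REVERSE call, since RIGHT works on the original string directly.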

select first letter of different columns in oracle

I want a query which will return a combination of characters and number
Example:
Table name - emp
Columns required - fname,lname,code
If fname=abc and lname=pqr and the row is very first of the table then result should be code = ap001.
For next row it should be like this:
Fname = efg, lname = rst
Code = er002 and likewise.
I know that we can use substr to retrieve the first letter of a column, but I don't know how to use it with two columns or how to concatenate the results.
OK. You know you can use the substr function. To concatenate, you need the concatenation operator ||. To get the number of each row retrieved by your query, you need the rownum pseudocolumn. You will probably also need the to_char function to format the number. You can read about all of these functions and operators in the SQL reference. Anyway, I think you need something like this (I didn't check it):
select substr(fname, 1, 1) || substr(lname, 1, 1) || to_char(rownum, 'fm009') code
from emp