How do i extract texts from string and save it as two columns and add character at the end for third column - tsql

I m using TSQL, I want to extract text from the string and save it as two column and third one as
The following code is not complete but just getting rid of PRD1T_ the Finapp and not sure how to cater for rest of the text
Select substring(Table_name,
charindex('_',Table_name)+1,
Len(Table_name) - charindex('.',Table_name)) as Landing_Schema_Name
FROM [e].[Load_History_test]

When asking questions like this it is best to provide some sample data and expected results, as Ronen suggested. For SQL questions a really good way of doing this is with a temp table and sample data, like this:
CREATE TABLE #load_history_test (
table_name VARCHAR(100)
);
INSERT INTO #load_history_test
SELECT 'PRD1T_FINAPP.HOLCONTRACT'
UNION ALL
SELECT 'PRD1T_FINAPP.TOCCASE'
UNION ALL
SELECT 'PRD1T_FINAPP.TOCCASE';
So that provides something people can run and starts towards the criteria for a minimal, reproducible example, which is the secret to getting a good answer on StackOverflow. I have used SELECT with UNION ALL as Azure Synapse does not currently support the VALUES clause for multiple records.
For expected results, it's often good to display them in a table, something like this:
col1
col2
col3
PRD1T
FINAPP
HOLCONTRACT
PRD1T
FINAPP
TOCCASE
This way it's clear to people what you expect. It is not clear why you have two TOCCASE examples in your screenprint.
For your problem, there is more than one approach. You are along the right lines with CHARINDEX, SUBSTRING and LEFT but things can start to look complicated. Therefore I tend to wrap up some complexity in a Common Table Expression (CTE), see below for an example. There is also a kind of 'trick' approach with a built-in SQL function called PARSENAME. This is designed to extract from four-part object names common in SQL Server eg <server-name>.<database-name>.<schema-name>.<object-name>. As long as your object names will never have more than four parts, this approach will work for you. See the main help for PARSENAME here. See below for a complete demo that runs end to end with a temp table to demonstrate the different principles:
IF OBJECT_ID('#load_history_test') IS NOT NULL
DROP TABLE #load_history_test;
CREATE TABLE #load_history_test (
table_name VARCHAR(100)
);
INSERT INTO #load_history_test
SELECT 'PRD1T_FINAPP.HOLCONTRACT'
UNION ALL
SELECT 'PRD1T_FINAPP.TOCCASE'
UNION ALL
SELECT 'PRD1T_FINAPP.TOCCASE';
;WITH cte AS (
SELECT
table_name AS original_table_name,
CHARINDEX( '_', table_name ) underscorePos,
CHARINDEX( '.', table_name ) stopPos,
REPLACE( table_name, '_', '.' ) AS clean_table_name
FROM #load_history_test
)
SELECT
*,
PARSENAME( clean_table_name, 3 ) a,
PARSENAME( clean_table_name, 2 ) b,
PARSENAME( clean_table_name, 1 ) c,
LEFT( original_table_name, underscorePos - 1 ) getItemBeforeUnderscore,
SUBSTRING( original_table_name, underscorePos + 1, ( ( stopPos - 1 ) - underscorePos ) ) AS getItemAfterUnderscore,
SUBSTRING( original_table_name, stopPos + 1, 99 ) getItemAfterStop
FROM cte;

Related

TSQL - Parsing substring out of larger string

I have a bunch of rows with values that look like below. It's json extract that I unfortunately have to parse out and load. Anyway, my json parsing tool for some reason doesn't want to parse this full column out so i need to do it in TSQL. I only need the unique_id field:
[{"unique_id":"12345","system_type":"Test System."}]
I tried the below SQL but it's only returning the first 5 characters of the whole column. I know what the issue is which is I need to know how to tell the substring to continue until the 4th set of quotes which comes after the value. I'm not sure how to code the substring like that.
select substring([jsonfield],CHARINDEX('[{"unique_id":"',[jsonfield]),
CHARINDEX('"',[jsonfield]) - CHARINDEX('[{"unique_id":"',[jsonfield]) +
LEN('"')) from etl.my_test_table
Can anyone help me with this?
Thank you, I appreciate it!
Since you tagged 2016, why not use OPENJSON()
Here's an example:
DECLARE #TestData TABLE
(
[SampleData] NVARCHAR(MAX)
);
INSERT INTO #TestData (
[SampleData]
)
VALUES ( N'[{"unique_id":"12345","system_type":"Test System."}]' )
,( N'[{"unique_id":"1234567","system_type":"Test System."},{"unique_id":"1234567_2","system_type":"Test System."}]' )
SELECT b.[unique_id]
FROM #TestData [a]
CROSS APPLY
OPENJSON([a].[SampleData], '$')
WITH (
[unique_id] NVARCHAR(100) '$.unique_id'
) AS [b];
Giving you:
unique_id
---------------
12345
1234567
1234567_2
You can get all the fields as well, just add them to the WITH clause:
SELECT [b].[unique_id]
, [b].[system_type]
FROM #TestData [a]
CROSS APPLY
OPENJSON([a].[SampleData], '$')
WITH (
[unique_id] NVARCHAR(100) '$.unique_id'
, [system_type] NVARCHAR(100) '$.system_type'
) AS [b];
Take it step by step
First get everything to the left of system_type
SELECT LEFT(jsonfield, CHARINDEX('","system_type":"',jsonfield) as s
FROM -- etc
Then take everything to the right of "unique_id":"
SELECT RIGHT(S, LEN(S) - (CHARINDEX('"unique_id":"',S) + 12)) as Result
FROM (
SELECT LEFT(jsonfield, CHARINDEX('","system_type":"',jsonfield) as s
FROM -- etc
) X
Note, I did not test this so it could be off by one or have a syntax error, but you get the idea.
If your larger string ist just a simple JSON as posted, the solution is very easy:
SELECT
JSON_VALUE(N'[{"unique_id":"12345","system_type":"Test System."}]','$[0].unique_id');
JSON_VALUE() needs SQL-Server 2016 and will extract one single value from a specified path.

SQL NOT LIKE comparison against dynamic list

Working on a new TSQL Stored Procedure, I am wanting to get all rows where values in a specific column don't start with any of a specific set of 2 character substrings.
The general idea is:
SELECT * FROM table WHERE value NOT LIKE 's1%' AND value NOT LIKE 's2%' AND value NOT LIKE 's3%'.
The catch is that I am trying to make it dynamic so that the specific substrings can be pulled from another table in the database, which can have more values added to it.
While I have never used the IN operator before, I think something along these lines should do what I am looking for, however, I don't think it is possible to use wildcards with IN, so I might not be able to compare just the substrings.
SELECT * FROM table WHERE value NOT IN (SELECT substrings FROM subTable)
To get around that limitation, I am trying to do something like this:
SELECT * FROM table WHERE SUBSTRING(value, 1, 2) NOT IN (SELECT Prefix FROM subTable WHERE Prefix IS NOT NULL)
but I'm not sure this is right, or if it is the most efficient way to do this. My preference is to do this in a Stored Procedure, but if that isn't feasible or efficient I'm also open to building the query dynamically in C#.
Here's an option. Load values you want to filter to a table, left outer join and use PATINDEX().
DECLARE #FilterValues TABLE
(
[FilterValue] NVARCHAR(10)
);
--Table with values we want filter on.
INSERT INTO #FilterValues (
[FilterValue]
)
VALUES ( N's1' )
, ( N's2' )
, ( N's3' );
DECLARE #TestData TABLE
(
[TestValues] NVARCHAR(100)
);
--Load some test data
INSERT INTO #TestData (
[TestValues]
)
VALUES ( N's1 Test Data' )
, ( N's2 Test Data' )
, ( N's3 Test Data' )
, ( N'test data not filtered out' )
, ( N'test data not filtered out 1' );
SELECT a.*
FROM #TestData [a]
LEFT OUTER JOIN #FilterValues [b]
ON PATINDEX([b].[FilterValue] + '%', [a].[TestValues]) > 0
WHERE [b].[FilterValue] IS NULL;

Select into a table with a CTE [duplicate]

I have a very complex CTE and I would like to insert the result into a physical table.
Is the following valid?
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos
(
BatchID,
AccountNo,
APartyNo,
SourceRowID
)
WITH tab (
-- some query
)
SELECT * FROM tab
I am thinking of using a function to create this CTE which will allow me to reuse. Any thoughts?
You need to put the CTE first and then combine the INSERT INTO with your select statement. Also, the "AS" keyword following the CTE's name is not optional:
WITH tab AS (
bla bla
)
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos (
BatchID,
AccountNo,
APartyNo,
SourceRowID
)
SELECT * FROM tab
Please note that the code assumes that the CTE will return exactly four fields and that those fields are matching in order and type with those specified in the INSERT statement.
If that is not the case, just replace the "SELECT *" with a specific select of the fields that you require.
As for your question on using a function, I would say "it depends". If you are putting the data in a table just because of performance reasons, and the speed is acceptable when using it through a function, then I'd consider function to be an option.
On the other hand, if you need to use the result of the CTE in several different queries, and speed is already an issue, I'd go for a table (either regular, or temp).
WITH common_table_expression (Transact-SQL)
The WITH clause for Common Table Expressions go at the top.
Wrapping every insert in a CTE has the benefit of visually segregating the query logic from the column mapping.
Spot the mistake:
WITH _INSERT_ AS (
SELECT
[BatchID] = blah
,[APartyNo] = blahblah
,[SourceRowID] = blahblahblah
FROM Table1 AS t1
)
INSERT Table2
([BatchID], [SourceRowID], [APartyNo])
SELECT [BatchID], [APartyNo], [SourceRowID]
FROM _INSERT_
Same mistake:
INSERT Table2 (
[BatchID]
,[SourceRowID]
,[APartyNo]
)
SELECT
[BatchID] = blah
,[APartyNo] = blahblah
,[SourceRowID] = blahblahblah
FROM Table1 AS t1
A few lines of boilerplate make it extremely easy to verify the code inserts the right number of columns in the right order, even with a very large number of columns. Your future self will thank you later.
Yep:
WITH tab (
bla bla
)
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos ( BatchID, AccountNo,
APartyNo,
SourceRowID)
SELECT * FROM tab
Note that this is for SQL Server, which supports multiple CTEs:
WITH x AS (), y AS () INSERT INTO z (a, b, c) SELECT a, b, c FROM y
Teradata allows only one CTE and the syntax is as your example.
Late to the party here, but for my purposes I wanted to be able to run the code the user inputted and store in a temp table. Using oracle no such issues.. the insert is at the start of the statement before the with clause.
For this to work in sql server, the following worked:
INSERT into #stagetable execute (#InputSql)
(so the select statement #inputsql can start as a with clause).

help with TSQL IN statement with int

I am trying to create the following select statement in a stored proc
#dealerids nvarchar(256)
SELECT *
FROM INVOICES as I
WHERE convert(nvarchar(20), I.DealerID) in (#dealerids)
I.DealerID is an INT in the table. and the Parameter for dealerids would be formatted such as
(8820, 8891, 8834)
When I run this with parameters provided I get no rows back. I know these dealerIDs should provided rows as if I do it individually I get back what I expect.
I think I am doing
WHERE convert(nvarchar(20), I.DealerID) in (#dealerids)
incorrectly. Can anyone point out what I am doing wrong here?
Use a table values parameter (new in SQl Server 2008). Set it up by creating the actual table parameter type:
CREATE TYPE IntTableType AS TABLE (ID INTEGER PRIMARY KEY)
Your procedure would then be:
Create Procedure up_TEST
#Ids IntTableType READONLY
AS
SELECT *
FROM ATable a
WHERE a.Id IN (SELECT ID FROM #Ids)
RETURN 0
GO
if you can't use table value parameters, see: "Arrays and Lists in SQL Server 2005 and Beyond, When Table Value Parameters Do Not Cut it" by Erland Sommarskog, then there are many ways to split string in SQL Server. This article covers the PROs and CONs of just about every method. in general, you need to create a split function. This is how a split function can be used:
SELECT
*
FROM YourTable y
INNER JOIN dbo.yourSplitFunction(#Parameter) s ON y.ID=s.Value
I prefer the number table approach to split a string in TSQL but there are numerous ways to split strings in SQL Server, see the previous link, which explains the PROs and CONs of each.
For the Numbers Table method to work, you need to do this one time table setup, which will create a table Numbers that contains rows from 1 to 10,000:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this split function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
#SplitOn char(1) --REQUIRED, the character to split the #List string on
,#List varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
----------------
--SINGLE QUERY-- --this will not return empty rows
----------------
SELECT
ListValue
FROM (SELECT
LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(#SplitOn, List2, number+1)-number - 1))) AS ListValue
FROM (
SELECT #SplitOn + #List + #SplitOn AS List2
) AS dt
INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
WHERE SUBSTRING(List2, number, 1) = #SplitOn
) dt2
WHERE ListValue IS NOT NULL AND ListValue!=''
);
GO
You can now easily split a CSV string into a table and join on it:
Create Procedure up_TEST
#Ids VARCHAR(MAX)
AS
SELECT * FROM ATable a
WHERE a.Id IN (SELECT ListValue FROM dbo.FN_ListToTable(',',#Ids))
You can't use #dealerids like that, you need to use dynamic SQL, like this:
#dealerids nvarchar(256)
EXEC('SELECT *
FROM INVOICES as I
WHERE convert(nvarchar(20), I.DealerID) in (' + #dealerids + ')'
The downside is that you open yourself up to SQL injection attacks unless you specifically control the data going into #dealerids.
There are better ways to handle this depending on your version of SQL Server, which are documented in this great article.
Split #dealerids into a table then JOIN
SELECT *
FROM INVOICES as I
JOIN
ufnSplit(#dealerids) S ON I.DealerID = S.ParsedIntDealerID
Assorted split functions here (I'd probably a numbers table in this case for a small string

Combining INSERT INTO and WITH/CTE

I have a very complex CTE and I would like to insert the result into a physical table.
Is the following valid?
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos
(
BatchID,
AccountNo,
APartyNo,
SourceRowID
)
WITH tab (
-- some query
)
SELECT * FROM tab
I am thinking of using a function to create this CTE which will allow me to reuse. Any thoughts?
You need to put the CTE first and then combine the INSERT INTO with your select statement. Also, the "AS" keyword following the CTE's name is not optional:
WITH tab AS (
bla bla
)
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos (
BatchID,
AccountNo,
APartyNo,
SourceRowID
)
SELECT * FROM tab
Please note that the code assumes that the CTE will return exactly four fields and that those fields are matching in order and type with those specified in the INSERT statement.
If that is not the case, just replace the "SELECT *" with a specific select of the fields that you require.
As for your question on using a function, I would say "it depends". If you are putting the data in a table just because of performance reasons, and the speed is acceptable when using it through a function, then I'd consider function to be an option.
On the other hand, if you need to use the result of the CTE in several different queries, and speed is already an issue, I'd go for a table (either regular, or temp).
WITH common_table_expression (Transact-SQL)
The WITH clause for Common Table Expressions go at the top.
Wrapping every insert in a CTE has the benefit of visually segregating the query logic from the column mapping.
Spot the mistake:
WITH _INSERT_ AS (
SELECT
[BatchID] = blah
,[APartyNo] = blahblah
,[SourceRowID] = blahblahblah
FROM Table1 AS t1
)
INSERT Table2
([BatchID], [SourceRowID], [APartyNo])
SELECT [BatchID], [APartyNo], [SourceRowID]
FROM _INSERT_
Same mistake:
INSERT Table2 (
[BatchID]
,[SourceRowID]
,[APartyNo]
)
SELECT
[BatchID] = blah
,[APartyNo] = blahblah
,[SourceRowID] = blahblahblah
FROM Table1 AS t1
A few lines of boilerplate make it extremely easy to verify the code inserts the right number of columns in the right order, even with a very large number of columns. Your future self will thank you later.
Yep:
WITH tab (
bla bla
)
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos ( BatchID, AccountNo,
APartyNo,
SourceRowID)
SELECT * FROM tab
Note that this is for SQL Server, which supports multiple CTEs:
WITH x AS (), y AS () INSERT INTO z (a, b, c) SELECT a, b, c FROM y
Teradata allows only one CTE and the syntax is as your example.
Late to the party here, but for my purposes I wanted to be able to run the code the user inputted and store in a temp table. Using oracle no such issues.. the insert is at the start of the statement before the with clause.
For this to work in sql server, the following worked:
INSERT into #stagetable execute (#InputSql)
(so the select statement #inputsql can start as a with clause).