aggregate multiple columns over dynamic pivot in sql - tsql

I'm creating a stored procedure that would allow the user to retrieve data from 2 tables by providing the PersonID number as a parameter.
I thought of using the pivot function to pivot the Data table dynamically by non-aggregating over multiple columns and retrieving data from ONE column in a different table. The 2 tables below are just sample data as I have over 100 columns for the data table, hence the dynamic part. The 2 tables doesn't have a common ID column but just a common column_name.
Here are the 2 tables:
Mapping Table:
CREATE table #table (
ID varchar(10) NOT NULL,
Column_Name varchar (255) NOT NULL,
Page_Num varchar(10) NOT NULL,
Line_Num varchar(10) NOT NULL,
Element_Num varchar(10) NOT NULL
)
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('1','Name', 'DT-01', '200','20')
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('2','SSN', 'DT-02', '220','10')
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('3','City', 'DT-03', '300','11')
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('4','StreetName', 'DT-04', '350','33')
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('5','Sex', 'DT-05', '310','51')
Creates:
ID Column_Name Page_Num Line_Num Element_Num
_________________________________________________________________
1 Name DT-01 200 20
2 SSN DT-02 220 10
3 City DT-03 300 11
4 StreetName DT-04 350 33
5 Sex DT-05 310 51
Data table:
CREATE table #temp (
PersonID varchar (100) NOT NULL,
Name varchar(100) NOT NULL,
SSN varchar (255) NOT NULL,
City varchar(100) NOT NULL,
StreetName varchar(100) NOT NULL,
Sex varchar(100) NOT NULL
)
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('112','Joe','945890189', 'Lookesville', 'Broad st','Male')
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('140','Santana','514819926', 'Falls Church', 'Gane Rd', 'Female')
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('481','Wyatt','014523548','Gainesville', 'Westfield blvd', 'Male')
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('724','Brittany','551489230','Aldi', 'Ostrich rd', 'Female')
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('100','Giovanni','774451362','Paige', 'Company ln', 'Male')
Creates:
PersonID Name SSN City StreetName Sex
_______________________________________________________________________
112 Joe 945890189 Lookesville Broad st Male
140 Santana 514819926 Falls Church Gane Rd Female
481 Wyatt 014523548 Gainesville Westfield rd Male
724 Brittany 551489230 Aldi Ostrich rd Female
100 Giovanni 774451362 Paige Company ln Male
The end result should be:
Example: User enters parameter PersonID = 140
Column_name Page_Num Line_Num Element_Num Data
_____________________________________________________________________________
Name DT-01 200 20 Santana
SSN DT-02 220 10 514819926
City DT-03 300 11 Falls Church
StreetName DT-04 350 33 Gane Rd
Sex DT-05 310 51 Female
... ... ... ... ...
and so on..

The following will dynamically unpivot a data row, and then perform a join on the field name with the def data.
If you want to run this query without a filter, I would suggest adding A.PersonID to the top SELECT and remove the WHERE
I should add, UNPIVOT would be more performant, but with this approach, there is no need to define and/or recast values. That said, the performance is still very respectable.
Example
Select D.*
,Data=C.Value
From #Temp A
Cross Apply (Select XMLData = cast((Select A.* For XML Raw) as xml)) B
Cross Apply (
Select Item = attr.value('local-name(.)','varchar(100)')
,Value = attr.value('.','varchar(max)')
From B.XMLData.nodes('/row') as X(r)
Cross Apply X.r.nodes('./#*') AS N(attr)
) C
Join #Table D on (C.Item=D.Column_Name)
Where PersonID=140
Returns
If it Helps with the Visualization, the CROSS APPLY C generates the following:
EDIT - As a Stored Procedure
CREATE PROCEDURE [dbo].[YourProcedureName](#PersonID int)
As
Begin
Set NoCount On;
Select D.*
,Data=C.Value
From YourPersonTableName A
Cross Apply (Select XMLData = cast((Select A.* For XML Raw) as xml)) B
Cross Apply (
Select Item = attr.value('local-name(.)','varchar(100)')
,Value = attr.value('.','varchar(max)')
From B.XMLData.nodes('/row') as X(r)
Cross Apply X.r.nodes('./#*') AS N(attr)
) C
Join YourObjectTableName D on (C.Item=D.Column_Name)
Where PersonID=#PersonID
End

Related

Is it possible to find duplicating records in two columns simultaneously in PostgreSQL?

I have the following database schema (oversimplified):
create sequence partners_partner_id_seq;
create table partners
(
partner_id integer default nextval('partners_partner_id_seq'::regclass) not null primary key,
name varchar(255) default NULL::character varying,
company_id varchar(20) default NULL::character varying,
vat_id varchar(50) default NULL::character varying,
is_deleted boolean default false not null
);
INSERT INTO partners(name, company_id, vat_id) VALUES('test1','1010109191191', 'BG1010109191192');
INSERT INTO partners(name, company_id, vat_id) VALUES('test2','1010109191191', 'BG1010109191192');
INSERT INTO partners(name, company_id, vat_id) VALUES('test3','3214567890102', 'BG1010109191192');
INSERT INTO partners(name, company_id, vat_id) VALUES('test4','9999999999999', 'GE9999999999999');
I am trying to figure out how to return test1, test2 (because the company_id column value duplicates vertically) and test3 (because the vat_id column value duplicates vertically as well).
To put it in other words - I need to find duplicating company_id and vat_id records and group them together, so that test1, test2 and test3 would be together, because they duplicate by company_id and vat_id.
So far I have the following query:
SELECT *
FROM (
SELECT *, LEAD(row, 1) OVER () AS nextrow
FROM (
SELECT *, ROW_NUMBER() OVER (w) AS row
FROM partners
WHERE is_deleted = false
AND ((company_id != '' AND company_id IS NOT null) OR (vat_id != '' AND vat_id IS NOT NULL))
WINDOW w AS (PARTITION BY company_id, vat_id ORDER BY partner_id DESC)
) x
) y
WHERE (row > 1 OR nextrow > 1)
AND is_deleted = false
This successfully shows all company_id duplicates, but does not appear to show vat_id ones - test3 row is missing. Is this possible to be done within one query?
Here is a db-fiddle with the schema, data and predefined query reproducing my result.
You can do this with recursion, but depending on the size of your data you may want to iterate, instead.
The trick is to make the name just another match key instead of treating it differently than the company_id and vat_id:
create table partners (
partner_id integer generated always as identity primary key,
name text,
company_id text,
vat_id text,
is_deleted boolean not null default false
);
insert into partners (name, company_id, vat_id) values
('test1','1010109191191', 'BG1010109191192'),
('test2','1010109191191', 'BG1010109191192'),
('test3','3214567890102', 'BG1010109191192'),
('test4','9999999999999', 'GE9999999999999'),
('test5','3214567890102', 'BG8888888888888'),
('test6','2983489023408', 'BG8888888888888')
;
I added a couple of test cases and left in the lone partner.
with recursive keys as (
select partner_id,
array['n_'||name, 'c_'||company_id, 'v_'||vat_id] as matcher,
array[partner_id] as matchlist,
1 as size
from partners
), matchers as (
select *
from keys
union all
select p.partner_id, c.matcher,
p.matchlist||c.partner_id as matchlist,
p.size + 1
from matchers p
join keys c
on c.matcher && p.matcher
and not p.matchlist #> array[c.partner_id]
), largest as (
select distinct sort(matchlist) as matchlist
from matchers m
where not exists (select 1
from matchers
where matchlist #> m.matchlist
and size > m.size)
-- and size > 1
)
select *
from largest
;
matchlist
{1,2,3,5,6}
{4}
fiddle
EDIT UPDATE
Since recursion did not perform, here is an iterative example in plpgsql that uses a temporary table:
create temporary table match1 (
partner_id int not null,
group_id int not null,
matchkey uuid not null
);
create index on match1 (matchkey);
create index on match1 (group_id);
insert into match1
select partner_id, partner_id, md5('n_'||name)::uuid from partners
union all
select partner_id, partner_id, md5('c_'||company_id)::uuid from partners
union all
select partner_id, partner_id, md5('v_'||vat_id)::uuid from partners;
do $$
declare _cnt bigint;
begin
loop
with consolidate as (
select group_id,
min(group_id) over (partition by matchkey) as new_group_id
from match1
), minimize as (
select group_id, min(new_group_id) as new_group_id
from consolidate
group by group_id
), doupdate as (
update match1
set group_id = m.new_group_id
from minimize m
where m.group_id = match1.group_id
and m.new_group_id != match1.group_id
returning *
)
select count(*) into _cnt from doupdate;
if _cnt = 0 then
exit;
end if;
end loop;
end;
$$;
updated fiddle

SQL Stored Procedure with Table Valued Parameters and Hierarchical data [duplicate]

Code:
CREATE TYPE dbo.tEmployeeData AS TABLE
(
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
DepartmentType NVARCHAR(10),
DepartmentBuilding NVARCHAR(50),
DepartmentEmployeeLevel NVARCHAR(10),
DepartmentTypeAMetadata NVARCHAR(100),
DepartmentTypeBMetadata NVARCHAR(100)
)
GO
CREATE PROC dbo.EmployeeImport
(#tEmployeeData tEmployeeData READONLY)
AS
BEGIN
DECLARE #MainEmployee TABLE
(EmployeeID INT IDENTITY(1,1),
FirstName NVARCHAR(50),
LastName NVARCHAR(50))
DECLARE #ParentEmployeeDepartment TABLE
(EmployeeID INT,
ParentEmployeeDepartmentID INT IDENTITY(1,1),
DepartmentType NVARCHAR(10))
DECLARE #ChildEmployeeDepartmentTypeA TABLE
(ParentEmployeeDepartmentID INT,
DepartmentBuilding NVARCHAR(50),
DepartmentEmployeeLevel NVARCHAR(10),
DepartmentTypeAMetadata NVARCHAR(100))
DECLARE #ChildEmployeeDepartmentTypeB TABLE
(ParentEmployeeDepartmentID INT,
DepartmentBuilding NVARCHAR(50),
DepartmentEmployeeLevel NVARCHAR(10),
DepartmentTypeBMetadata NVARCHAR(100))
-- INSERT CODE GOES HERE
SELECT * FROM #MainEmployee
SELECT * FROM #ParentEmployeeDepartment
SELECT * FROM #ChildEmployeeDepartmentTypeA
SELECT * FROM #ChildEmployeeDepartmentTypeB
END
GO
DECLARE #tEmployeeData tEmployeeData
INSERT INTO #tEmployeeData (FirstName, LastName, DepartmentType,
DepartmentBuilding, DepartmentEmployeeLevel,
DepartmentTypeAMetadata, DepartmentTypeBMetadata)
SELECT
N'Tom_FN', N'Tom_LN', N'A',
N'101', N'IV', N'Tech/IT', NULL
UNION
SELECT
N'Mike_FN', N'Mike_LN', N'B',
N'OpenH', N'XII', NULL, N'Med'
UNION
SELECT
N'Joe_FN', N'Joe_LN', N'A',
N'101', N'IV', N'Tech/IT', NULL
UNION
SELECT
N'Dave_FN', N'Dave_LN', N'B',
N'OpenC', N'XII', NULL, N'Lab'
EXEC EmployeeImport #tEmployeeData
GO
DROP PROC dbo.EmployeeImport
DROP TYPE dbo.tEmployeeData
Notes:
The table variables are replaced by real tables in live environment.
EmployeeID and ParentEmployeeDepartmentID columns' values don't always match each other. Live environment has more records in the udt (tEmployeeData) than just 4
Goal:
The udt (tEmployeeData) will be passed into the procedure
The procedure should first insert the data into the #MainEmployee table (and get the EmployeeIDs)
Next, the procedure should insert the data into the #ParentEmployeeDepartment table (and get the ParentEmployeeDepartmentID) - note EmployeeID is coming from the previous output.
Then, the procedure should split the child level data based on the DepartmentType ("A" = insert into #ChildEmployeeDepartmentTypeA and "B" = insert into #ChildEmployeeDepartmentTypeB).
ParentEmployeeDepartmentID from #ParentEmployeeDepartment should be used when inserting data into either #ChildEmployeeDepartmentTypeA or #ChildEmployeeDepartmentTypeB
The procedure should should run fast (need to avoid row by row operation)
Output:
#MainEmployee:
EmployeeID FirstName LastName
---------------------------------
1 Tom_FN Tom_LN
2 Mike_FN Mike_LN
3 Joe_FN Joe_LN
4 Dave_FN Dave_LN
#ParentEmployeeDepartment:
EmployeeID ParentEmployeeDepartmentID DepartmentType
-------------------------------------------------------
1 1 A
2 2 B
3 3 A
4 4 B
#ChildEmployeeDepartmentTypeA:
ParentEmployeeDepartmentID DepartmentBuilding DepartmentEmployeeLevel DepartmentTypeAMetadata
---------------------------------------------------------------------------------------------------------
1 101 IV Tech/IT
3 101 IV Tech/IT
#ChildEmployeeDepartmentTypeB:
ParentEmployeeDepartmentID DepartmentBuilding DepartmentEmployeeLevel DepartmentTypeAMetadata
----------------------------------------------------------------------------------------------------------
2 OpenH XII Med
4 OpenC XII Lab
I know I can use the OUTPUT clause after the insert and get EmployeeID and ParentEmployeeDepartmentID, but I'm not sure how to insert the right child records into right tables with right mapping to the parent table. Any help would be appreciated.
Here is my solution (based on the same answer I've linked to in the comments):
First, you must add another column to your UDT, to hold a temporary ID for the employee:
CREATE TYPE dbo.tEmployeeData AS TABLE
(
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
DepartmentType NVARCHAR(10),
DepartmentBuilding NVARCHAR(50),
DepartmentEmployeeLevel NVARCHAR(10),
DepartmentTypeAMetadata NVARCHAR(100),
DepartmentTypeBMetadata NVARCHAR(100),
EmployeeId int
)
GO
Populating it with that new employeeId column:
DECLARE #tEmployeeData tEmployeeData
INSERT INTO #tEmployeeData (FirstName, LastName, DepartmentType,
DepartmentBuilding, DepartmentEmployeeLevel,
DepartmentTypeAMetadata, DepartmentTypeBMetadata, EmployeeId)
SELECT
N'Tom_FN', N'Tom_LN', N'A',
N'101', N'IV', N'Tech/IT', NULL, 5
UNION
SELECT
N'Mike_FN', N'Mike_LN', N'B',
N'OpenH', N'XII', NULL, N'Med', 6
UNION
SELECT
N'Joe_FN', N'Joe_LN', N'A',
N'101', N'IV', N'Tech/IT', NULL, 7
UNION
SELECT
N'Dave_FN', N'Dave_LN', N'B',
N'OpenC', N'XII', NULL, N'Lab', 8
Insert part goes here
Then, you use a table variable to map the inserted value from the employee table to the temp employee id in the data you sent to the procedure:
DECLARE #EmployeeidMap TABLE
(
temp_id int,
id int
)
Now, the trick is to populate the employee table with the MERGE statement instead of an INSERT...SELECT because you have to use values from both inserted and source data in the output clause:
MERGE INTO #MainEmployee USING #tEmployeeData AS sourceData ON 1 = 0 -- Always not matched
WHEN NOT MATCHED THEN
INSERT (FirstName, LastName)
VALUES (sourceData.FirstName, sourceData.LastName)
OUTPUT sourceData.EmployeeId, inserted.EmployeeID
INTO #EmployeeidMap (temp_id, id); -- populate the map table
From that point on it's simple, you need to join the data you sent to the #EmployeeidMap to get the actual employeeId:
INSERT INTO #ParentEmployeeDepartment (EmployeeID, DepartmentType)
SELECT Id, DepartmentType
FROM #tEmployeeData
INNER JOIN #EmployeeidMap ON EmployeeID = temp_id
Now you can use the data in #ParentEmployeeDepartment to map the actual values in ParentEmployeeDepartmentID to the data you sent:
Testing the inserts so far
SELECT FirstName,
LastName,
SentData.DepartmentType As [Dept. Type],
DepartmentBuilding As Building,
DepartmentEmployeeLevel As [Emp. Level],
DepartmentTypeAMetadata As [A Meta],
DepartmentTypeBMetadata As [B Meta],
SentData.EmployeeId As TempId, EmpMap.id As [Emp. Id], DeptMap.ParentEmployeeDepartmentID As [Dept. Id]
FROM #tEmployeeData SentData
INNER JOIN #EmployeeidMap EmpMap ON SentData.EmployeeId = temp_id
INNER JOIN #ParentEmployeeDepartment DeptMap ON EmpMap.id = DeptMap.EmployeeID
results:
FirstName LastName Dept. Type Building Emp. Level A Meta B Meta TempId Emp. Id Dept. Id
--------- -------- ---------- -------- ---------- ------ ------ ------ ----------- -----------
Dave_FN Dave_LN B OpenC XII NULL Lab 8 1 1
Joe_FN Joe_LN A 101 IV Tech/IT NULL 7 2 2
Mike_FN Mike_LN B OpenH XII NULL Med 6 3 3
Tom_FN Tom_LN A 101 IV Tech/IT NULL 5 4 4
I'm sure that from this point you can easily figure out the last 2 inserts yourself.

select columns in select query from another mapping table

I am looking for select query with dynamic column names that can be derived from another table. Below the sample data and query I am looking to get.
create table mapping_tmp (
string_number varchar,
mapping_name varchar
)
create table fact_tmp (
product varchar,
product_family varchar,
string1 varchar,
string2 varchar,
string3 varchar,
string4 varchar,
string5 varchar
)
insert into mapping_tmp values ('string1','commodity');
insert into mapping_tmp values ('string2','real commodity');
insert into mapping_tmp values ('string3','country');
insert into mapping_tmp values ('string4','region');
insert into mapping_tmp values ('string5','area');
insert into fact_tmp values ('P1','PF1','ABC1','DEF1','GHI1','JKL1','MNO1');
insert into fact_tmp values ('P2','PF2','ABC2','DEF2','GHI2','JKL2','MNO2');
insert into fact_tmp values ('P3','PF3','ABC3','DEF3','GHI3','JKL3','MNO3');
insert into fact_tmp values ('P4','PF4','ABC4','DEF4','GHI4','JKL4','MNO4');
insert into fact_tmp values ('P5','PF5','ABC5','DEF5','GHI5','JKL5','MNO5');
insert into fact_tmp values ('P6','PF6','ABC6','DEF6','GHI6','JKL6','MNO6');
Expected output, select fields should be taken from mapping_tmp and those fields data should be displayed in select result.
select product,
(select string_number from mapping_tmp where mapping_name = 'country') as country,
(select string_number from mapping_tmp where mapping_name = 'area') as area
from fact_tmp;
The actual query is
select product, string3 as country, string5 as area from fact_tmp;
and the output:
product country area
1 P1 GHI1 MNO1
2 P2 GHI2 MNO2
3 P3 GHI3 MNO3
4 P4 GHI4 MNO4
5 P5 GHI5 MNO5
6 P6 GHI6 MNO6
I am looking for simple sql query, I cannot use stored procedure or function in application.

What's wrong with this T-SQL MERGE statement?

I am new to MERGE, and I'm sure I have some error in my code.
This code will run and create my scenario:
I have two tables, one that is called TempUpsert that fills from a SqlBulkCopy operation (100s of millions of records) and a Sales table that holds the production data which is to be indexed and used.
I wish to merge the TempUpsert table with the Sales one
I am obviously doing something wrong as it fails with even the smallest example
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TempUpsert]') )
drop table TempUpsert;
CREATE TABLE [dbo].[TempUpsert](
[FirstName] [varchar](200) NOT NULL,
[LastName] [varchar](200) NOT NULL,
[Score] [int] NOT NULL
) ON [PRIMARY] ;
CREATE TABLE [dbo].[Sales](
[FullName] [varchar](200) NOT NULL,
[LastName] [varchar](200) NOT NULL,
[FirstName] [varchar](200) NOT NULL,
[lastUpdated] [date] NOT NULL,
CONSTRAINT [PK_Sales] PRIMARY KEY CLUSTERED
(
[FullName] ASC
)
---- PROC
CREATE PROCEDURE [dbo].[sp_MoveFromTempUpsert_to_Sales]
(#HashMod int)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
MERGE Sales AS trget
USING (
SELECT
--- Edit: Thanks to Mikal added DISTINCT
DISTINCT
FirstName, LastName , [Score], LastName+'.'+FirstName AS FullName
FROM TempUpsert AS ups) AS src (FirstName, LastName, [Score], FullName)
ON
(
src.[Score] = #hashMod
AND
trget.FullName=src.FullName
)
WHEN MATCHED
THEN
UPDATE SET trget.lastUpdated = GetDate()
WHEN NOT MATCHED
THEN INSERT ([FullName], [LastName], [FirstName], [lastUpdated])
VALUES (FullName, src.LastName, src.FirstName, GetDate())
OUTPUT $action, Inserted.*, Deleted.* ;
--print ##rowcount
END
GO
--- Insert dummie data
INSERT INTO TempUpsert (FirstName, LastName, Score)
VALUES ('John','Smith',2);
INSERT INTO TempUpsert (FirstName, LastName, Score)
VALUES ('John','Block',2);
INSERT INTO TempUpsert (FirstName, LastName, Score)
VALUES ('John','Smith',2); --make multiple on purpose
----- EXECUTE PROC
GO
DECLARE #return_value int
EXEC #return_value = [dbo].[sp_MoveFromTempUpsert_to_Sales]
#HashMod = 2
SELECT 'Return Value' = #return_value
GO
This returns:
(1 row(s) affected)
(1 row(s) affected)
(1 row(s) affected)
Msg 2627, Level 14, State 1, Procedure sp_MoveFromTempUpsert_to_Sales, Line 12
Violation of PRIMARY KEY constraint 'PK_Sales'. Cannot insert duplicate key in object
'dbo.Sales'. The statement has been terminated.
(1 row(s) affected)
What am I doing wrong please?
Greatly appreciated
The first two rows in your staging table will give you the duplicate PK. violation. Conc is the PK and you insert tmain+dmain with the same value twice.
In Summation
MERGE requires its input (Using) to be duplicates free
the Using is a regular SQL statement, so you can use Group By, distinct and having as well as Where clauses.
My final Merge looks like so :
MERGE Sales AS trget
USING (
SELECT FirstName, LastName, Score, LastName + '.' + FirstName AS FullName
FROM TempUpsert AS ups
WHERE Score = #hashMod
GROUP BY FirstName, LastName, Score, LastName + '.' + FirstName
) AS src (FirstName, LastName, [Score], FullName)
ON
(
-- src.[Score] = #hashMod
--AND
trget.FullName=src.FullName
)
WHEN MATCHED
THEN
UPDATE SET trget.lastUpdated = GetDate()
WHEN NOT MATCHED
THEN INSERT ([FullName], [LastName], [FirstName], [lastUpdated])
VALUES (FullName, src.LastName, src.FirstName, GetDate())
OUTPUT $action, Inserted.*, Deleted.* ;
--print ##rowcount
END
And it works!
Thanks to you all :)
Without DISTINCT or proper AGGREGATE function in subquery used in USING part of MERGE there will be two rows which suits criteria used in ON part of MERGE, which is not allowed. (Two John.Smith)
AND
Move the condition src.[Score] = #hashMod inside the subquery,
instead if ON clause not succeed, for example John.Smith have score of 2, and #HashMod = 1 - then if you already have the row with John.Smith in target table - you'll get an error with Primary Key Constraint

one column split to more column sql server 2008?

Table name: Table1
id name
1 1-aaa-14 milan road
2 23-abcde-lsd road
3 2-mnbvcx-welcoome street
I want the result like this:
Id name name1 name2
1 1 aaa 14 milan road
2 23 abcde lsd road
3 2 mnbvcx welcoome street
This function ought to give you what you need.
--Drop Function Dbo.Part
Create Function Dbo.Part
(#Value Varchar(8000)
,#Part Int
,#Sep Char(1)='-'
)Returns Varchar(8000)
As Begin
Declare #Start Int
Declare #Finish Int
Set #Start=1
Set #Finish=CharIndex(#Sep,#Value,#Start)
While (#Part>1 And #Finish>0)Begin
Set #Start=#Finish+1
Set #Finish=CharIndex(#Sep,#Value,#Start)
Set #Part=#Part-1
End
If #Part>1 Set #Start=Len(#Value)+1 -- Not found
If #Finish=0 Set #Finish=Len(#Value)+1 -- Last token on line
Return SubString(#Value,#Start,#Finish-#Start)
End
Usage:
Select ID
,Dbo.Part(Name,1,Default)As Name
,Dbo.Part(Name,2,Default)As Name1
,Dbo.Part(Name,3,Default)As Name2
From Dbo.Table1
It's rather compute-intensive, so if Table1 is very long you ought to write the results to another table, which you could refresh from time to time (perhaps once a day, at night).
Better yet, you could create a trigger, which automatically updates Table2 whenever a change is made to Table1. Assuming that column ID is primary key:
Create Table Dbo.Table2(
ID Int Constraint PK_Table2 Primary Key,
Name Varchar(8000),
Name1 Varchar(8000),
Name2 Varchar(8000))
Create Trigger Trigger_Table1 on Dbo.Table1 After Insert,Update,Delete
As Begin
If (Select Count(*)From Deleted)>0
Delete From Dbo.Table2 Where ID=(Select ID From Deleted)
If (Select Count(*)From Inserted)>0
Insert Dbo.Table2(ID, Name, Name1, Name2)
Select ID
,Dbo.Part(Name,1,Default)
,Dbo.Part(Name,2,Default)
,Dbo.Part(Name,3,Default)
From Inserted
End
Now, do your data manipulation (Insert, Update, Delete) on Table1, but do your Select statements on Table2 instead.
The below solution uses a recursive CTE for splitting the strings, and PIVOT for displaying the parts in their own columns.
WITH Table1 (id, name) AS (
SELECT 1, '1-aaa-14 milan road' UNION ALL
SELECT 2, '23-abcde-lsd road' UNION ALL
SELECT 3, '2-mnbvcx-welcoome street'
),
cutpositions AS (
SELECT
id, name,
rownum = 1,
startpos = 1,
nextdash = CHARINDEX('-', name + '-')
FROM Table1
UNION ALL
SELECT
id, name,
rownum + 1,
nextdash + 1,
CHARINDEX('-', name + '-', nextdash + 1)
FROM cutpositions c
WHERE nextdash < LEN(name)
)
SELECT
id,
[1] AS name,
[2] AS name1,
[3] AS name2
/* add more columns here */
FROM (
SELECT
id, rownum,
part = SUBSTRING(name, startpos, nextdash - startpos)
FROM cutpositions
) s
PIVOT ( MAX(part) FOR rownum IN ([1], [2], [3] /* extend the list here */) ) x
Without additional modifications this query can split names consisting of up to 100 parts (that's the default maximum recursion depth, which can be changed), but can only display no more than 3 of them. You can easily extend it to however many parts you want it to display, just follow the instructions in the comments.
select T.id,
substring(T.Name, 1, D1.Pos-1) as Name,
substring(T.Name, D1.Pos+1, D2.Pos-D1.Pos-1) as Name1,
substring(T.Name, D2.Pos+1, len(T.name)) as Name2
from Table1 as T
cross apply (select charindex('-', T.Name, 1)) as D1(Pos)
cross apply (select charindex('-', T.Name, D1.Pos+1)) as D2(Pos)
Testing performance of suggested solutions
Setup:
create table Table1
(
id int identity primary key,
Name varchar(50)
)
go
insert into Table1
select '1-aaa-14 milan road' union all
select '23-abcde-lsd road' union all
select '2-mnbvcx-welcoome street'
go 10000
Result:
if you always will have 2 dashes, you can do the following by using PARSENAME
--testing table
CREATE TABLE #test(id INT, NAME VARCHAR(1000))
INSERT #test VALUES(1, '1-aaa-14 milan road')
INSERT #test VALUES(2, '23-abcde-lsd road')
INSERT #test VALUES(3, '2-mnbvcx-welcoome street')
SELECT id,PARSENAME(name,3) AS name,
PARSENAME(name,2) AS name1,
PARSENAME(name,1)AS name2
FROM (
SELECT id,REPLACE(NAME,'-','.') NAME
FROM #test)x
if you have dots in the name column you have to first replace them and then replace them back to dots in the end
example, by using a tilde to substitute the dot
INSERT #test VALUES(3, '5-mnbvcx-welcoome street.')
SELECT id,REPLACE(PARSENAME(name,3),'~','.') AS name,
REPLACE(PARSENAME(name,2),'~','.') AS name1,
REPLACE(PARSENAME(name,1),'~','.') AS name2
FROM (
SELECT id,REPLACE(REPLACE(NAME,'.','~'),'-','.') NAME
FROM #test)x