T-SQL - Pivot/Crosstab - variable number of values

I have a simple data set that looks like this:
Name Code
A A-One
A A-Two
B B-One
C C-One
C C-Two
C C-Three
I want to output it so it looks like this:
Name Code1 Code2 Code3 Code4 Code...n ...
A A-One A-Two
B B-One
C C-One C-Two C-Three
For each of the 'Name' values, there can be an undetermined number of 'Code' values.
I have been looking at various examples of PIVOT SQL (including simple PIVOT queries and ones using the XML functions), but I have not been able to figure this out, or to understand whether it is even possible.
I would appreciate any help or pointers.
Thanks!

Try it like this:
DECLARE @tbl TABLE([Name] VARCHAR(100),Code VARCHAR(100));
INSERT INTO @tbl VALUES
('A','A-One')
,('A','A-Two')
,('B','B-One')
,('C','C-One')
,('C','C-Two')
,('C','C-Three');
SELECT p.*
FROM
(
SELECT *
,CONCAT('Code',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
FROM @tbl
)t
PIVOT
(
MAX(Code) FOR ColumnName IN (Code1,Code2,Code3,Code4,Code5 /*add as many as you need*/)
)p;
This line
,CONCAT('Code',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
will use a partitioned ROW_NUMBER in order to create numbered column names per code. The rest is simple PIVOT...
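With the sample data, the derived table t produced by that subquery looks like this (note that ORDER BY Code sorts alphabetically, so C-Three is numbered before C-Two):
Name Code ColumnName
A A-One Code1
A A-Two Code2
B B-One Code1
C C-One Code1
C C-Three Code2
C C-Two Code3
PIVOT then turns each distinct ColumnName into a column, keeping one row per Name.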
UPDATE: A dynamic approach to reflect the max amount of codes per group
CREATE TABLE TblTest([Name] VARCHAR(100),Code VARCHAR(100));
INSERT INTO TblTest VALUES
('A','A-One')
,('A','A-Two')
,('B','B-One')
,('C','C-One')
,('C','C-Two')
,('C','C-Three');
DECLARE @cols VARCHAR(MAX);
WITH GetMaxCount(mc) AS(SELECT TOP 1 COUNT([Code]) FROM TblTest GROUP BY [Name] ORDER BY COUNT([Code]) DESC)
SELECT @cols=STUFF(
(
SELECT CONCAT(',Code',Nmbr)
FROM
(SELECT TOP((SELECT mc FROM GetMaxCount)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) t(Nmbr)
FOR XML PATH('')
),1,1,'');
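-- With the sample data above, @cols now holds the string: Code1,Code2,Code3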
DECLARE @sql VARCHAR(MAX)=
'SELECT p.*
FROM
(
SELECT *
,CONCAT(''Code'',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
FROM TblTest
)t
PIVOT
(
MAX(Code) FOR ColumnName IN (' + @cols + ')
)p;';
EXEC(@sql);
GO
DROP TABLE TblTest;
As you can see, the only part that changes to reflect the actual number of columns is the list in PIVOT's IN() clause.
You can build a string that looks like Code1,Code2,Code3,...CodeN, construct the statement dynamically, and execute it with EXEC().
I'd prefer the first approach. Dynamically created SQL is very powerful, but it can be a pain in the neck too...
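For the sample data, the dynamically built statement that ends up being executed looks like the static version above, just with the column list trimmed to the actual maximum of three codes per name:
SELECT p.*
FROM
(
SELECT *
,CONCAT('Code',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
FROM TblTest
)t
PIVOT
(
MAX(Code) FOR ColumnName IN (Code1,Code2,Code3)
)p;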

Related

Removing all the Alphabets from a string using a single SQL Query [duplicate]

I'm currently doing a data conversion project and need to strip all alphabetical characters from a string. Unfortunately I can't create or use a function, as we don't own the source machine, which makes the methods I've found in previous posts unusable.
What would be the best way to do this in a SELECT statement? Speed isn't too much of an issue, as this will only be running over 30,000 records or so and is a one-off statement.
You can do this in a single statement. You're not really creating a statement with 200+ REPLACEs are you?!
update tbl
set S = U.clean
from tbl
cross apply
(
select Substring(tbl.S,v.number,1)
-- this table will cater for strings up to length 2047
from master..spt_values v
where v.type='P' and v.number between 1 and len(tbl.S)
and Substring(tbl.S,v.number,1) like '[0-9]'
order by v.number
for xml path ('')
) U(clean)
Working SQL Fiddle showing this query with sample data
Replicated below for posterity:
create table tbl (ID int identity, S varchar(500))
insert tbl select 'asdlfj;390312hr9fasd9uhf012 3or h239ur ' + char(13) + 'asdfasf'
insert tbl select '123'
insert tbl select ''
insert tbl select null
insert tbl select '123 a 124'
Results
ID S
1 390312990123239
2 123
3 (null)
4 (null)
5 123124
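Since the question asked for a SELECT statement rather than an UPDATE, the same spt_values / FOR XML PATH trick can also be used as a correlated subquery in the select list (a sketch along the same lines, not part of the original answer):
select tbl.ID,
(
select Substring(tbl.S,v.number,1)
from master..spt_values v
where v.type='P' and v.number between 1 and len(tbl.S)
and Substring(tbl.S,v.number,1) like '[0-9]'
order by v.number
for xml path ('')
) as clean
from tbl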
A recursive CTE comes to the rescue here.
;WITH CTE AS
(
SELECT
[ProductNumber] AS OrigProductNumber
,CAST([ProductNumber] AS VARCHAR(100)) AS [ProductNumber]
FROM [AdventureWorks].[Production].[Product]
UNION ALL
SELECT OrigProductNumber
,CAST(STUFF([ProductNumber], PATINDEX('%[^0-9]%', [ProductNumber]), 1, '') AS VARCHAR(100) ) AS [ProductNumber]
FROM CTE WHERE PATINDEX('%[^0-9]%', [ProductNumber]) > 0
)
SELECT * FROM CTE
WHERE PATINDEX('%[^0-9]%', [ProductNumber]) = 0
OPTION (MAXRECURSION 0)
output:
OrigProductNumber ProductNumber
WB-H098 098
VE-C304-S 304
VE-C304-M 304
VE-C304-L 304
TT-T092 092
Here is RichardTheKiwi's script wrapped in a function, for use in SELECTs without CROSS APPLY.
I also kept the dot character, because in my case I use the function for double and money values stored in a varchar field.
CREATE FUNCTION dbo.ReplaceNonNumericChars (@string VARCHAR(5000))
RETURNS VARCHAR(1000)
AS
BEGIN
SET @string = REPLACE(@string, ',', '.')
SET @string = (SELECT SUBSTRING(@string, v.number, 1)
FROM master..spt_values v
WHERE v.type = 'P'
AND v.number BETWEEN 1 AND LEN(@string)
AND (SUBSTRING(@string, v.number, 1) LIKE '[0-9]'
OR SUBSTRING(@string, v.number, 1) LIKE '[.]')
ORDER BY v.number
FOR
XML PATH('')
)
RETURN @string
END
GO
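A hypothetical usage against the tbl table from the earlier answer:
SELECT ID, dbo.ReplaceNonNumericChars(S) AS clean
FROM tbl;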
Thanks RichardTheKiwi +1
Well if you really can't use a function, I suppose you could do something like this:
SELECT REPLACE(REPLACE(REPLACE(LOWER(col),'a',''),'b',''),'c','')
FROM dbo.table...
Obviously it would be a lot uglier than that, since I only handled the first three letters, but it should give the idea.

SQL Server : Split string to row

How to turn data from below:
CODE COMBINATION USER
1111.111.11.0 KEN; JIMMY
666.778.0.99 KEN
888.66.77.99 LIM(JIM); JIMMY
To
CODE COMBINATION USER
1111.111.11.0 KEN
1111.111.11.0 JIMMY
666.778.0.99 KEN
888.66.77.99 LIM(JIM)
888.66.77.99 JIMMY
I know that in SQL Server 2016 this can be done with the STRING_SPLIT function, but my production server is SQL Server 2014.
With this TVF, you can supply the string to be split and the delimiter. Furthermore, you get the sequence number, which can be very useful for secondary processing.
Select [CODE COMBINATION]
,[USER] = B.RetVal
From YourTable A
Cross Apply [dbo].[udf-Str-Parse](A.[USER],';') B
Returns
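Based on the sample data from the question, this returns the desired rows:
CODE COMBINATION USER
1111.111.11.0 KEN
1111.111.11.0 JIMMY
666.778.0.99 KEN
888.66.77.99 LIM(JIM)
888.66.77.99 JIMMY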
The Parse UDF
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>'+ Replace(@String,@Delimiter,'</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
);
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
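Since RetSeq is returned as well, it can drive the secondary processing mentioned above, for example keeping only the first name from each list (a hypothetical usage):
Select [CODE COMBINATION]
,[USER] = B.RetVal
From YourTable A
Cross Apply [dbo].[udf-Str-Parse](A.[USER],';') B
Where B.RetSeq = 1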
Now, another option is the Parse-Row UDF. Notice we return the parsed string in one row. Currently 9 positions, but it is easy to expand or contract.
Select [CODE COMBINATION]
,B.*
From YourTable A
Cross Apply [dbo].[udf-Str-Parse-Row](A.[USER],';') B
Returns
The Parse Row UDF
CREATE FUNCTION [dbo].[udf-Str-Parse-Row] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select Pos1 = xDim.value('/x[1]','varchar(max)')
,Pos2 = xDim.value('/x[2]','varchar(max)')
,Pos3 = xDim.value('/x[3]','varchar(max)')
,Pos4 = xDim.value('/x[4]','varchar(max)')
,Pos5 = xDim.value('/x[5]','varchar(max)')
,Pos6 = xDim.value('/x[6]','varchar(max)')
,Pos7 = xDim.value('/x[7]','varchar(max)')
,Pos8 = xDim.value('/x[8]','varchar(max)')
,Pos9 = xDim.value('/x[9]','varchar(max)')
From (Select Cast('<x>' + Replace(@String,@Delimiter,'</x><x>')+'</x>' as XML) as xDim) A
)
--Select * from [dbo].[udf-Str-Parse-Row]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse-Row]('John Cappelletti',' ')
You need to use a UDF to split the string into rows
CREATE FUNCTION [DBO].[FN_SPLIT_STR_TO_COL] (@T AS VARCHAR(4000) )
RETURNS
@RESULT TABLE(VALUE VARCHAR(250))
AS
BEGIN
SET @T= @T+';'
;WITH MYCTE(START,[END]) AS(
SELECT 1 AS START,CHARINDEX(';',@T,1) AS [END]
UNION ALL
SELECT [END]+1 AS START,CHARINDEX(';',@T,[END]+1)AS [END]
FROM MYCTE WHERE [END]<LEN(@T)
)
INSERT INTO @RESULT
SELECT SUBSTRING(@T,START,[END]-START) NAME FROM MYCTE;
RETURN
END
Now query on your table by calling above function with CROSS APPLY
SELECT [CodeCombination],FN_RS.VALUE FROM TABLE1
CROSS APPLY
(SELECT * FROM [DBO].[FN_SPLIT_STR_TO_COL] (TABLE1.[USER]))
AS FN_RS
If your [USER] column only ever contains at most one semicolon, you don't need a "split string" function at all; you could use CROSS APPLY like this:
-- Your Sample data
DECLARE @table TABLE (CODE_COMBINATION varchar(30), [USER] varchar(100));
INSERT @table
VALUES ('1111.111.11.0', 'KEN; JIMMY'), ('666.778.0.99', 'XKEN'),
('888.66.77.99','LIM(JIM); JIMMY');
-- Solution using only CROSS APPLY
SELECT CODE_COMBINATION, [USER] = LTRIM(s.s)
FROM @table t
CROSS APPLY (VALUES (CHARINDEX(';',t.[USER]))) d(d)
CROSS APPLY
(
SELECT SUBSTRING(t.[USER], 1, ISNULL(NULLIF(d.d,0),1001)-1)
UNION ALL
SELECT SUBSTRING(t.[USER], d.d+1, 1000)
WHERE d.d > 0
) s(s);
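With the sample data declared above, this returns (row order not guaranteed):
CODE_COMBINATION USER
1111.111.11.0 KEN
1111.111.11.0 JIMMY
666.778.0.99 XKEN
888.66.77.99 LIM(JIM)
888.66.77.99 JIMMY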
If you do need a pre-SQL Server 2016 "split string" function, I would strongly suggest using Jeff Moden's DelimitedSplit8k or Eirikur Eiriksson's DelimitedSplit8K_LEAD. Both of these will outperform an XML-based or recursive CTE "split string" function.

How to write a multi-parameter CTE script?

I am trying to write a TSQL script for an SSRS report that uses a CTE to select records based on the parameters chosen. I'm looking for the most efficient way to do this, either all in TSQL and/or SSRS. I have 4 parameters which can be set to NULL (All values) or one specific value. Then in my CTE, I have the following line:
ROW_NUMBER() over(partition by G.[program_providing_service],G.people_id
order by G.[actual_date] desc) as rowID
The above CTE is for the case when Program is NULL and People is not null. My 4 parameters are:
Program, Facility, Staff, and People.
So I only want to partition on values when they are NULL. Currently I implement this with a different CTE depending on the parameter values. For example, if they choose NULL for all parameters except People, then this CTE would look like:
ROW_NUMBER() over(partition by G.people_id
order by G.[actual_date] desc) as rowID
Or if all 4 parameters are null:
ROW_NUMBER() over(partition by G.[program_providing_service], G.[site_providing_service], G.staff_id, G.people_id
order by G.[actual_date] desc) as rowID
If they do not choose NULL for any of the 4 parameters, then I probably do not need to partition by any field since I just want the top 1 record ordered by actual_date descending. This is what my CTE looks like:
;with cte as
(
Select distinct
G.[actual_date],
G.[site_providing_service],
p.[program_name],
G.[staff_id],
G.program_providing_service,
ROW_NUMBER() over(partition by G.[program_providing_service],G.people_id
order by G.[actual_date] desc) as rowID
From
event_log_rv G With (NoLock)
WHERE
...
AND (@ClientID Is Null OR [people_id]=@ClientID)
AND (@StaffID Is Null OR [staff_id] = @StaffID)
AND (@FacilityID Is Null OR [site_providing_service] = @FacilityID)
AND (@ProgramID Is Null OR [program_providing_service] = @ProgramID)
and (@SupervisorID is NULL OR staff_id in (select staff_id from #supervisors))
)
SELECT
[actual_date],
[site_providing_service],
[program_name],
[staff_id],
program_providing_service,
people_id,
rowID
FROM cte WHERE rowid = 1
ORDER BY [Client_FullName]
where the ROW_NUMBER line varies depending on the parameters chosen. Currently I have 5 IF statements in this TSQL script that look like:
IF @ProgramID IS NOT NULL AND @ClientID IS NULL
BEGIN
...
END
with one CTE in each of these IF statements:
IF @FacilityID IS NOT NULL AND @ClientID IS NULL
BEGIN
...
END
IF @ProgramID IS NOT NULL AND @ClientID IS NULL
BEGIN
...
END
IF @StaffID IS NOT NULL AND @ClientID IS NULL
BEGIN
...
END
IF @ClientID IS NOT NULL
BEGIN
...
END
How can I code for all possible options, whether they choose NULL or else specific values?
OMG... it took me a long time to understand what you want to do. There is some contradiction in your description; please revisit it. You said you only want to partition on values when they are NULL, but then you also said that when they choose NULL for all parameters except People, you partition on people_id...
No matter which way you want it (partitioning on the 'null' or the 'not null' parameters), you can construct dynamic SQL to achieve this instead of adding a lot of IF...ELSE blocks.
The following code is pseudo-code and definitely not tested; it is just meant as a hint. It makes one assumption: your parameters have a fixed priority in the partition order, e.g. if Program is null it always comes first in the list.
declare @sql varchar(max)
set @sql = '
;with cte as
(
Select distinct
G.[actual_date],
G.[site_providing_service],
p.[program_name],
G.[staff_id],
G.program_providing_service,
ROW_NUMBER() over(partition by
'
if(@ProgramID is null)
set @sql = @sql + 'G.[program_providing_service],'
if(@FacilityID is null)
set @sql = @sql + 'G.[site_providing_service],'
if(@StaffID is null)
set @sql = @sql + 'G.staff_id,'
if(@ClientID is null)
set @sql = @sql + 'G.people_id'
set @sql = @sql + '
order by G.[actual_date] desc) as rowID
From
event_log_rv G With (NoLock)
WHERE
...
AND (@ClientID Is Null OR [people_id]=@ClientID)
AND (@StaffID Is Null OR [staff_id] = @StaffID)
AND (@FacilityID Is Null OR [site_providing_service] = @FacilityID)
AND (@ProgramID Is Null OR [program_providing_service] = @ProgramID)
and (@SupervisorID is NULL OR staff_id in (select staff_id from #supervisors))
)
SELECT
[actual_date],
[site_providing_service],
[program_name],
[staff_id],
program_providing_service,
people_id,
rowID
FROM cte WHERE rowid = 1
ORDER BY [Client_FullName]
'
exec(@sql)
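One caveat with EXEC(): the variables @ClientID, @StaffID, @FacilityID, @ProgramID and @SupervisorID referenced inside the string are not visible in the scope of the executed batch. A sketch of passing them through with sp_executesql instead (the parameter types are assumed here, and @sql would need to be declared as NVARCHAR(MAX)):
EXEC sp_executesql @sql,
N'@ClientID INT, @StaffID INT, @FacilityID INT, @ProgramID INT, @SupervisorID INT',
@ClientID = @ClientID, @StaffID = @StaffID, @FacilityID = @FacilityID,
@ProgramID = @ProgramID, @SupervisorID = @SupervisorID;
-- the #supervisors temp table created in the outer scope remains visible to the inner batch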

PostgreSQL join to denormalize a table with generate_series

I've this table:
CREATE TABLE "mytable"
( name text, count integer );
INSERT INTO mytable VALUES ('john', 4),('mark',2),('albert',3);
and I would like to "denormalize" the rows in this way:
SELECT name FROM mytable JOIN generate_series(1,4) tmp(a) ON (a<=count)
so that I get a number of rows for each name equal to the count column: 4 rows with john, 2 with mark and 3 with albert.
But I can't use the generate_series() function like this if I don't know the highest count (in this case 4). Is there a way to do this without knowing MAX(count)?
select name,
generate_series(1,count)
from mytable;
Set-returning functions can be used in the select list and will effectively do a cross join with the row retrieved from the base table.
I think this is an undocumented behaviour that might go away in the future, but I'm not sure about that (I recall some discussion regarding this on the mailing list).
SQLFiddle example
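With the sample data from the question, this returns one row per name and counter value (shown here in insertion order; add an ORDER BY if the order matters):
name generate_series
john 1
john 2
john 3
john 4
mark 1
mark 2
albert 1
albert 2
albert 3
Selecting only name then gives the repeated rows the question asks for.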
DROP TABLE ztable ;
CREATE TABLE ztable (zname varchar, zvalue INTEGER NOT NULL);
INSERT INTO ztable(zname, zvalue) VALUES( 'one', 1), ( 'two', 2 ), ( 'three', 3) , ( 'four', 4 );
WITH expand AS (
WITH RECURSIVE zzz AS (
SELECT 1::integer AS rnk , t0.zname
FROM ztable t0
UNION
SELECT 1+rr.rnk , t1.zname
FROM ztable t1
JOIN zzz rr ON rr.rnk < t1.zvalue
)
SELECT zzz.zname
FROM zzz
)
SELECT x.*
FROM expand x
;

dynamically create and load table from select query

SELECT
MEM_ID, [C1],[C2]
from
(select
MEM_ID, Condition_id, condition_result
from tbl_GConditionResult
) x
pivot
(
sum(condition_result)
for condition_id in ([C1],[C2])
) p
The above query returns three columns of data, but until runtime I will not know how many columns the select statement will return. Is it possible to load the data from the select statement into a dynamically created table? After processing the data from the dynamically created table I want to drop the table.
Thank you for your help.
Smith
Yes, do a SELECT INTO, e.g.:
SELECT
MEM_ID, [C1],[C2]
into #TEMP
from
(select
MEM_ID, Condition_id, condition_result
from tbl_GConditionResult
) x
pivot
(
sum(condition_result)
for condition_id in ([C1],[C2])
) p
-- Do what you need with the TEMP table
DROP TABLE #TEMP
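If the set of condition ids is only known at runtime, the column list can be built dynamically (much like the dynamic pivot earlier on this page) and the SELECT INTO executed as dynamic SQL. This is only a sketch under those assumptions; note that a local #TEMP created inside EXEC() disappears when that batch ends, so a global ##TEMP (a hypothetical name) is used here instead:
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);
-- build a column list such as [C1],[C2],[C3] from the distinct condition ids
SELECT @cols = STUFF((
SELECT ',' + QUOTENAME(Condition_id)
FROM (SELECT DISTINCT Condition_id FROM tbl_GConditionResult) d
ORDER BY Condition_id
FOR XML PATH('')), 1, 1, '');
SET @sql = '
SELECT MEM_ID, ' + @cols + '
INTO ##TEMP
FROM (SELECT MEM_ID, Condition_id, condition_result FROM tbl_GConditionResult) x
PIVOT (SUM(condition_result) FOR Condition_id IN (' + @cols + ')) p;';
EXEC (@sql);
-- Do what you need with the global temp table, then drop it
SELECT * FROM ##TEMP;
DROP TABLE ##TEMP;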