Dynamic field definitions - can this be done in T-SQL? - tsql

I have a requirement that I'm struggling to implement. If possible, I'd like to achieve this with native T-SQL.
I have the following tables:
CUSTOMER
========
ID,
Name
FIELDDEF
========
ID,
Name
FieldType (Char T, N, D for Text, Number or Date)
CUSTOMERFIELD
=============
ID,
CustomerID,
FieldDefID,
CaptureDate,
ValueText,
ValueNumber,
ValueDate
Basically, the purpose of these tables is to provide an extensible custom field system. The idea is that the user creates new field definitions that can be a text, number or date field. Then, they create values for these fields in the ValueText, ValueNumber OR ValueDate field.
Example:
*Customer*
1,BOB
2,JIM
*FieldDef*
1,Mobile,T
2,DateOfBirth,D
*CustomerField*
ID,CustomerID,FieldDefID,CaptureDate,ValueText,ValueNumber,ValueDate
1,1,1,2011-01-1,07123456789,NULL,NULL
2,1,2,2011-01-1,NULL,NULL,09-DEC-1980
3,1,1,2011-01-2,07123498787,NULL,NULL
I need to create a view that looks like this:
*CustomerView*
ID,Name,Mobile,DateOfBirth
1,BOB,07123498787,09-DEC-1980
Note that Bob's mobile is the second one in the list, because it uses the most recent capture date.
Ideally, I need this to be extensible, so if I create a new field def in the future, it is automatically picked up in the CustomerView.
Is this possible in T-SQL at all?
Thanks,
Simon.

This is not possible with a plain view, because a view's column list is fixed when the view is created; the view would have to be re-created every time FieldDef changes. A stored procedure could return the dynamic columns instead, although whether that works for you depends on how you consume the result.
Edit 1
Here is a sample query that works just for your current field names, and would have to be modified by dynamic SQL to work in general:
Edit 2
Modified to grab the newest values from the customer field table
with CustomerFieldNewest as (
select
cf1.*
from
customerfield cf1
inner join
(
select
customerid,
fielddefid,
max(capturedate) as maxcapturedate
from
customerfield cf2
group by
customerid,
fielddefid
) cf2 on cf1.customerid = cf2.customerid
and cf1.fielddefid = cf2.fielddefid
and cf1.capturedate = cf2.maxcapturedate
)
,CustomerFieldPivot as (
select
C.ID as ID
,max(case when F.Name = 'Mobile' then CF.ValueText end) as Mobile
,max(case when F.Name = 'DateOfBirth' then CF.ValueDate end) as DateOfBirth
from
Customer C
left join
CustomerFieldNewest CF on C.ID = CF.CustomerID
left join
FieldDef F on F.ID = CF.FieldDefID
group by
C.ID
)
select
C.*
,P.Mobile
,P.DateOfBirth
from
Customer C
left join
CustomerFieldPivot P on C.ID = P.ID
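One caveat with the max(capturedate) join above: if two rows for the same customer and field share a capture date, both survive the join, and the pivot's max() then returns the larger value rather than a well-defined newest one. A possible alternative, sketched here on the assumption that CustomerField.ID increases with insertion order (the question does not say so), ranks the rows with ROW_NUMBER and keeps only the first per customer and field:

```sql
-- Sketch only: using ID as a tie-breaker is an assumption, not part of the original answer.
with CustomerFieldNewest as (
    select
        cf.*,
        row_number() over (
            partition by cf.CustomerID, cf.FieldDefID
            order by cf.CaptureDate desc, cf.ID desc
        ) as rn
    from CustomerField cf
)
select CustomerID, FieldDefID, ValueText, ValueNumber, ValueDate
from CustomerFieldNewest
where rn = 1;   -- exactly one row per customer and field
```

The same CTE body could replace the CustomerFieldNewest definition in the query above, with "and CF.rn = 1" added to the join condition.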
Edit 3
Here is T-SQL code to generate the view on the fly based on the current set of fields in FieldDef (this assumes the view CustomerView already exists, so you will need to create it first as a blank definition or you will get an error). I'm not sure about the performance of all this, but it should work correctly.
declare @sql varchar(max)
declare @fielddef varchar(max)
declare @fieldlist varchar(max)
select
    @fielddef = coalesce(@fielddef + ', ' + CHAR(13) + CHAR(10), '') +
        ' max(case when F.Name = ''' + F.Name + ''' then CF.' +
        case F.FieldType
            when 'T' then 'ValueText'
            when 'N' then 'ValueNumber'
            when 'D' then 'ValueDate'
        end
        + ' end) as [' + F.Name + ']'
    ,@fieldlist = coalesce(@fieldlist + ', ' + CHAR(13) + CHAR(10), '') +
        ' [' + F.Name + ']'
from
    FieldDef F
set @sql = '
alter view [CustomerView] as
with CustomerFieldNewest as (
    select
        cf1.*
    from
        customerfield cf1
    inner join
        (
            select
                customerid,
                fielddefid,
                max(capturedate) as maxcapturedate
            from
                customerfield cf2
            group by
                customerid,
                fielddefid
        ) cf2 on cf1.customerid = cf2.customerid
            and cf1.fielddefid = cf2.fielddefid
            and cf1.capturedate = cf2.maxcapturedate
)
,CustomerFieldPivot as (
    select
        C.ID as ID,
' + @fielddef + '
    from
        Customer C
    left join
        CustomerFieldNewest CF on C.ID = CF.CustomerID
    left join
        FieldDef F on F.ID = CF.FieldDefID
    group by
        C.ID
)
select
    C.*,
' + @fieldlist + '
from
    Customer C
left join
    CustomerFieldPivot P on C.ID = P.ID
'
print @sql
exec(@sql)
select * from CustomerView
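The "blank definition" that the script expects could be created along these lines. This is a minimal sketch that assumes the view lives in the dbo schema; the Placeholder column name is arbitrary and disappears on the first ALTER VIEW:

```sql
-- Create a placeholder only if the view does not exist yet, so the later ALTER VIEW succeeds.
-- CREATE VIEW must be the first statement in its batch, hence the EXEC wrapper.
if object_id('dbo.CustomerView', 'V') is null
    exec('create view dbo.CustomerView as select 1 as Placeholder');
```

Running this before the script above makes the whole thing safe to execute repeatedly.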

You need to build a crosstab, which you can do with the PIVOT operator in T-SQL. Here's an article that explains how to build the pivot dynamically.
http://sqlserver-qa.net/blogs/t-sql/archive/2008/08/27/4809.aspx
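In case the link is unavailable, here is a rough sketch of what a dynamic PIVOT might look like against the tables from the question. It is not taken from the linked article. It assumes the column names shown in the question, converts every value to text so that a single PIVOT can handle all three value columns, and uses CustomerField.ID as a tie-breaker for equal capture dates; all of those are assumptions.

```sql
declare @cols nvarchar(max), @sql nvarchar(max);

-- Build a quoted, comma-separated column list such as: [Mobile], [DateOfBirth]
select @cols = stuff((
    select ', ' + quotename(Name)
    from FieldDef
    order by ID
    for xml path(''), type).value('.', 'nvarchar(max)'), 1, 2, '');

set @sql = N'
select ID, Name, ' + @cols + N'
from (
    select C.ID, C.Name, F.Name as FieldName,
           coalesce(CF.ValueText,
                    convert(varchar(50), CF.ValueNumber),
                    convert(varchar(50), CF.ValueDate, 106)) as FieldValue
    from Customer C
    left join (
        -- newest row per customer and field
        select *, row_number() over (partition by CustomerID, FieldDefID
                                     order by CaptureDate desc, ID desc) as rn
        from CustomerField
    ) CF on CF.CustomerID = C.ID and CF.rn = 1
    left join FieldDef F on F.ID = CF.FieldDefID
) src
pivot (max(FieldValue) for FieldName in (' + @cols + N')) p;';

print @sql;
exec sp_executesql @sql;
```

Because every value is converted to text, the result loses the original number and date types, and a field definition named ID or Name would clash with the customer columns. If FieldDef is empty, @cols is NULL and the generated statement is NULL as well, so that case would need separate handling.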

Just for completeness there is sql_variant:
declare @t table (typ varchar(1), yuk sql_variant)
insert @t values ('d', getdate())
insert @t values ('i', 1234)
insert @t values ('s', 'bleep bloop')
select
    yuk,
    case typ
        when 'd' then convert(datetime, yuk, 106)+50
        when 'i' then cast(yuk as int) * 2
        when 's' then reverse(cast(yuk as varchar))
        else yuk
    end
from @t
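As a side note, a sql_variant value remembers its own base type, so the separate typ column is not strictly required. A small sketch using the built-in SQL_VARIANT_PROPERTY function (the table variable simply mirrors the one above):

```sql
declare @t table (yuk sql_variant);
insert @t values (getdate());       -- stored with base type datetime
insert @t values (1234);            -- stored with base type int
insert @t values ('bleep bloop');   -- stored with base type varchar

select
    yuk,
    sql_variant_property(yuk, 'BaseType') as base_type
from @t;
```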

Related

How to move the data to the next line based on spaces in sqlserver 2008 R2

Input: move the rest of the column value onto the next line when two words are separated by 3 spaces, or when a word is longer than 9 characters.
declare @Table table(CL1 varchar(50))
INSERT INTO @Table
SELECT 'Ohh   my GOD'
UNION ALL
SELECT 'hindunewspaer is no1 paper'
select * from @Table
Expected output:
CL1
ohh
my god
hindunewspaer
is no1 paper
Used a Split/Parse function. Can be inline if needed.
EDIT - Switch to a Parser which is not limited to 8K because the final
string could easily be larger than 8K
Example
;with cte0 as (
    Select Seq=Row_Number() over (Order by (Select null)),RetSeq,RetVal
    From @Table A
    Cross Apply (
        Select RetSeq
              ,RetVal=case when len(RetVal)>9 then '~~~' else '' end+RetVal+case when len(RetVal)>9 then '~~~' else '' end
        From [dbo].[udf-Str-Parse](Replace(CL1,'   ','~~~ '),' ')
    ) B ),
cte1 as ( Select S=Stuff((Select ' '+RetVal From cte0 Order by Seq For XML Path ('')),1,1,'') )
Select CL1 = RetVal
From cte1 A
Cross Apply [dbo].[udf-Str-Parse](A.S,'~~~') B
Order By RetSeq
Returns
CL1
Ohh
my GOD
hindunewspaer
is no1 paper
The Split/Parse Function if Needed
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
    Select RetSeq = Row_Number() over (Order By (Select null))
          ,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
    From (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
    Cross Apply x.nodes('x') AS B(i)
);
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
--Select * from [dbo].[udf-Str-Parse]('this,is,<test>,for,< & >',',')

Dynamic sql to select a specific value from a column using Joins

I am attempting to use dynamic sql to select a value based on a field. I have a table of field references I am using for the column names. What I am having troubles with is of course the dynamic sql. My return result is (SELECT ecoa_code FROM CRA_METRO2_BASE WHERE id = 568470) for example. But I really want it to run that select statement. Executing only returns the last row.
DECLARE @BaseCol VARCHAR(250)
SELECT
    @BaseCol = '(SELECT ' + FR_base.field_name + ' FROM CRA_METRO2_BASE WHERE id = ' + CONVERT(VARCHAR(15), B.id) + ')'
FROM CRA_INNOVIS_AUDIT_ERROR_FIELDS E
LEFT JOIN CRA_METRO2_BASE B
    ON B.id = E.base_id
LEFT JOIN CRA_METRO2_FIELD_REF FR_base
    ON FR_base.id = E.base_field_ref
WHERE E.audit_id = @audit_id
EXEC(@BaseCol)
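The reason only the last row comes back is that SELECT @BaseCol = ... assigns the variable once per row, so each row overwrites the previous one and EXEC sees a single statement. One possible way around this, sketched below, is to concatenate one SELECT per row into a single UNION ALL batch. The table and column names come from the question; the @audit_id value, the usb_data alias, the CONVERT to varchar(500) (so that columns of different types can be unioned), and the switch to inner joins are assumptions made for illustration.

```sql
DECLARE @audit_id int = 1;   -- hypothetical value; in the question this is supplied by the caller
DECLARE @sql nvarchar(max);

-- Build " UNION ALL SELECT ..." once per error row, then strip the leading separator.
SELECT @sql = STUFF((
    SELECT N' UNION ALL SELECT CONVERT(varchar(500), ' + QUOTENAME(FR_base.field_name)
         + N') AS usb_data FROM CRA_METRO2_BASE WHERE id = '
         + CONVERT(nvarchar(15), B.id)
    FROM CRA_INNOVIS_AUDIT_ERROR_FIELDS E
    JOIN CRA_METRO2_BASE B
        ON B.id = E.base_id
    JOIN CRA_METRO2_FIELD_REF FR_base
        ON FR_base.id = E.base_field_ref
    WHERE E.audit_id = @audit_id
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 11, N'');

PRINT @sql;                  -- inspect the generated batch before running it
EXEC sp_executesql @sql;
```

QUOTENAME guards against column names that contain spaces or brackets; it does not make the approach safe if field_name can hold arbitrary user input.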
I am not sure I understand your premise correctly, and there is no mock-up to test against, so please take this answer with a grain of salt :)
DECLARE @sqlstring VARCHAR(MAX)
SELECT @sqlstring = 'SELECT ' + a.column_name + ' FROM ' + a.[Schema] + '.' + a.table_name
from (
    SELECT TOP 1 T.object_id,OBJECT_SCHEMA_NAME(T.[object_id],DB_ID()) AS [Schema],
        T.[name] AS [table_name], AC.[name] AS [column_name]
        --,TY.[name] AS system_data_type
        , AC.[max_length],
        AC.[precision], AC.[scale], AC.[is_nullable], AC.[is_ansi_padded]
        ,AC.column_id
    FROM sys.tables AS T
    INNER JOIN sys.[all_columns] AC ON T.[object_id] = AC.[object_id]
) a
SELECT @sqlstring
EXEC(@sqlstring)
So I took my query above and now use a CTE to build the basic result list. In the CTE I generate UPDATE statements, which all go into a temp table.
I then extract the UPDATE statements and execute them against the temp table. And voilà, I have my results!
IF(OBJECT_ID('tempdb..#Temp') IS NOT NULL)
BEGIN
DROP TABLE #Temp
END
CREATE TABLE #Temp
(
usb_data VARCHAR(500),
cra_data VARCHAR(500)
);
WITH ErrorFieldsCTE(id, field, usb_data, cra_data, AUD, SOR, acceptable_variance, is_variance_known, is_reviewed)
AS(
SELECT
+ 'UPDATE #TEMP SET usb_data = (SELECT ' + FR_base.field_name +' FROM CRA_METRO2_BASE WHERE id = '+ CONVERT(VARCHAR(25), B.id) +' ) WHERE id = ' + CONVERT(VARCHAR(15), E.id) + ' ' [usb_data],
+ 'UPDATE #TEMP SET cra_data = (SELECT ' + FR_audit.field_name +' FROM CRA_INNOVIS_INBOUND_AUDIT_INFORMATION WHERE id = '+ CONVERT(VARCHAR(25), A.id) +') WHERE id = ' + CONVERT(VARCHAR(15), E.id) + ' ' [cra_data]
FROM CRA_INNOVIS_AUDIT_ERROR_FIELDS E
LEFT JOIN CRA_METRO2_BASE B
ON B.id = E.base_id
LEFT JOIN CRA_INNOVIS_INBOUND_AUDIT_INFORMATION A
ON A.id = E.audit_id
LEFT JOIN CRA_METRO2_FIELD_REF FR_audit
ON FR_audit.id = E.audit_field_ref
LEFT JOIN CRA_METRO2_FIELD_REF FR_base
ON FR_base.id = E.base_field_ref
WHERE E.audit_id = @audit_id
)
INSERT INTO #Temp
SELECT
id, field, usb_data, cra_data, AUD, SOR, acceptable_variance, is_variance_known, is_reviewed
FROM ErrorFieldsCTE
-- @usb_data and @cra_data must already be declared and initialised to '' (NULL += text stays NULL)
SELECT -- extract query
    @usb_data += usb_data + '',
    @cra_data += cra_data + ''
FROM #Temp
EXEC(@usb_data) -- updating temp table, selects usb-data
EXEC(@cra_data) -- updating temp table, selects cra-data
SELECT -- return to web
id, field, usb_data, cra_data, AUD, SOR, acceptable_variance, is_variance_known, is_reviewed
FROM #Temp
IF(OBJECT_ID('tempdb..#Temp') IS NOT NULL)
Begin
Drop Table #Temp
End

How to pivot a table to a view on matching-length delimited cells

Disclaimer: I'm dealing with a rather old legacy system so any comments telling me about poor design are redundant, although I do genuinely appreciate any such sentiment. There is a new version that solves most legacy problems but we still have to maintain the old system, so basically, we have to manage for now.
I have a table that looks like this (yes, that is a single column, I know):
And I need a view (for reporting purposes) that will dynamically process the data in said table and return this:
The values are \n-delimited (shudder) and you can assume there will always be the same number of values in each cell (9 in the example, although other databases could have 4 or 12 or any number), although I suppose having NULL-insertion in the event of missing values couldn't hurt. They will also always be in a matching order (as in the example, 'AUD', 'Australian Dollar', and '$' are all the first values in their respective cells, and so on).
I've found various approaches to splitting a single cell out into a view, but nothing that covers merging data in such a way as I require. Sitting at home with a cold has not helped my research capabilities. Help me StackOverflow, you're my only hope!
Bonus points for tidy, relatively readable SQL examples, although I'm anticipating messiness as a natural by-product of the hackish nature of my required solution.
Something like this. I didn't take the time to build out the tables, but it should be fairly obvious where you can replace my variables with your rows. You will also want to do a replace char(10) where I have used commas. You could package it up in a table valued function and then call as a view.
declare @xml1 xml
declare @xml2 xml
declare @xml3 xml
declare @c1 nvarchar(250)
declare @c2 nvarchar(250)
declare @c3 nvarchar(250)
set @c1 = N'AUD,CAD,EUR,GBP,JPY,NZD,USD,KES,CHF';
set @c2 = N'Australian Dollar,Canadian Dollar,Euro,Pound Sterling,Yen,New Zealand Dollar,United States Dollar,Kenyan Shilling, Swiss Franc';
set @c3 = N'$,$,C,L,Y,$,$,K,F';
-- you'd use replace(@c1, char(10), '</r><r>') etc. for \n-delimited data
set @xml1 = N'<root><r>' + replace(@c1,',','</r><r>') + '</r></root>';
set @xml2 = N'<root><r>' + replace(@c2,',','</r><r>') + '</r></root>';
set @xml3 = N'<root><r>' + replace(@c3,',','</r><r>') + '</r></root>';
select code.code, name.name, symbol.symbol
from
    (select ROW_NUMBER() over (order by @@rowcount) as ck,
        c.value('.','varchar(max)') as [code]
    from @xml1.nodes('//root/r') as a(c)) as code
inner join
    (select ROW_NUMBER() over (order by @@rowcount) as nk,
        n.value('.','varchar(max)') as [name]
    from @xml2.nodes('//root/r') as a(n)) as name on code.ck = name.nk
inner join
    (select ROW_NUMBER() over (order by @@rowcount) as sk,
        s.value('.','varchar(max)') as [symbol]
    from @xml3.nodes('//root/r') as a(s)) as symbol on symbol.sk = name.nk
You can run this as a single script in SSMS for verification that it works. No schema necessary.
Using Jeff Moden's Tally Ho! CSV splitter:
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
     -- enough to cover VARCHAR(8000)
  WITH
  E1(N) AS (
            SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
            SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
            SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
           ),                          --10E+1 or 10 rows
  E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
  E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
  cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                  -- for both a performance gain and prevention of accidental "overruns"
            SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
           ),
  cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
            SELECT 1 UNION ALL
            SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
           ),
  cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
            SELECT s.N1,
                   ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
              FROM cteStart s
           )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = SUBSTRING(@pString, l.N1, l.L1)
   FROM cteLen l
;
and inline CTE data like this
with
data as (select Num,Currencies from (values
(1,'AUD'+char(10)+'CAD'+char(10)+'USD'+char(10)+'KES')
,(2,'Australian DOllar'+char(10)+'Canadian Dollar'+char(10)+'US Dollar'+char(10)+'Kenyan Shilling')
,(3,'$'+char(10)+'$'+char(10)+'$'+char(10)+'k')
)data(Num,Currencies)
),
The solution is as simple as this:
map as (select * from (values
(1,'Code')
,(2,'Name')
,(3,'Symbol')
)map(Num,Col )
)
select
ItemNumber
,max(Code) as Code
,max(Name) as Name
,max(Symbol) as Symbol
from (
select
map.Num
,map.Col
,c.Item
,c.ItemNumber
from data
join map
on map.Num = data.Num
cross apply dbo.DelimitedSplit8K(data.Currencies,char(10)) c
) t
pivot (max(Item) for Col in (Code,Name,Symbol)) pvt
group by ItemNumber
to give us:
ItemNumber Code Name Symbol
-------------- ---- -------------------- ---------------
1 AUD Australian DOllar $
2 CAD Canadian Dollar $
3 USD US Dollar $
4 KES Kenyan Shilling k
Hope this Helps. Run all together or replace the table variable with a temptable.
Sample Data:
IF OBJECT_ID(N'tempdb..#table') > 0
BEGIN
DROP TABLE #table
END
DECLARE @table TABLE(ATTRIBUTELVAUE VARCHAR(MAX))
INSERT INTO @table
SELECT
'AFN
ALL
DZD
USD
EUR
AOA
XCD
XCD
ARS'
INSERT INTO @table
SELECT
'Afghanistan
Albania
Algeria
American Samoa
Andorra
Angola
Anguilla
Antigua and Barbuda
Argentina'
INSERT INTO @table
SELECT
'AF
AL
DZ
AS
AD
AO
AI
AG
AR'
Query:
IF OBJECT_ID(N'tempdb..#TEMP') > 0
BEGIN
DROP TABLE #TEMP
END
DECLARE @StartLoop INT
DECLARE @EndLoop INT
DECLARE @Code TABLE (ID INT IDENTITY(1, 1),
                     Code VARCHAR(250))
DECLARE @Name TABLE (ID INT IDENTITY(1, 1),
                     Name VARCHAR(250))
DECLARE @Symbol TABLE (ID INT IDENTITY(1, 1),
                       Symbol VARCHAR(250))
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ID,
       *
INTO #Temp
FROM @table
SELECT @StartLoop = MIN(ID),
       @EndLoop = MAX(ID)
FROM #Temp
WHILE @StartLoop <= @EndLoop
BEGIN
    DECLARE @WorkingString VARCHAR(MAX)
    SELECT @WorkingString = ATTRIBUTELVAUE + CHAR(10) + ' '
    FROM #Temp
    WHERE ID = @StartLoop
    --print @WorkingString
    WHILE CHARINDEX(CHAR(10), @WorkingString) > 0
    BEGIN
        DECLARE @SearchCharacter INT
        DECLARE @WorkingStringLength INT
        DECLARE @TempStringLength INT
        DECLARE @TempString VARCHAR(MAX)
        SET @WorkingStringLength = LEN(@WorkingString)
        SET @SearchCharacter = CHARINDEX(CHAR(10), @WorkingString)
        SET @TempString = SUBSTRING(@WorkingString, 1, @SearchCharacter - 1)
        SET @TempStringLength = LEN(@TempString)
        SET @WorkingString = SUBSTRING(@WorkingString, @SearchCharacter + 1, @WorkingStringLength)
        SET @TempString = REPLACE(@TempString, CHAR(13), '')
        IF @StartLoop = 1
        BEGIN
            INSERT INTO @Code
            SELECT @TempString
        END
        IF @StartLoop = 2
        BEGIN
            INSERT INTO @Name
            SELECT @TempString
        END
        IF @StartLoop = 3
        BEGIN
            INSERT INTO @Symbol
            SELECT @TempString
        END
    END
    SET @StartLoop = @StartLoop + 1
END
SELECT Code,
       Name,
       Symbol
FROM @Code AS c
JOIN @Name AS n
    ON c.ID = n.ID
JOIN @Symbol AS s
    ON s.ID = n.ID
Cleanup:
IF OBJECT_ID(N'tempdb..#TEMP') > 0
BEGIN
DROP TABLE #TEMP
END
IF OBJECT_ID(N'tempdb..#table') > 0
BEGIN
DROP TABLE #table
END
Because I needed a view, this ended up being my solution:
CREATE FUNCTION [dbo].[CurrencyTableGenerator]()
RETURNS
@CurrencyTable TABLE(
    Code NVARCHAR(250)
    ,Name NVARCHAR(250)
    ,Symbol NVARCHAR(250)
)
AS
BEGIN
    DECLARE @xml1 XML
    DECLARE @xml2 XML
    DECLARE @xml3 XML
    DECLARE @c1 NVARCHAR(250)
    DECLARE @c2 NVARCHAR(250)
    DECLARE @c3 NVARCHAR(250)
    SET @c1 = (SELECT ...)
    SET @c2 = (SELECT ...)
    SET @c3 = (SELECT ...)
    SET @xml1 = N'<root><r>' + REPLACE(@c1, CHAR(10), '</r><r>') + '</r></root>';
    SET @xml2 = N'<root><r>' + REPLACE(@c2, CHAR(10), '</r><r>') + '</r></root>';
    SET @xml3 = N'<root><r>' + REPLACE(@c3, CHAR(10), '</r><r>') + '</r></root>';
    INSERT INTO @CurrencyTable
    SELECT Code.Code, Name.Name, Symbol.Symbol
    FROM
        (SELECT ROW_NUMBER() OVER (ORDER BY @@ROWCOUNT) AS ck,
            c.value('.', 'VARCHAR(250)') AS [Code]
        FROM @xml1.nodes('//root/r') AS a(c)) AS Code
    INNER JOIN
        (SELECT ROW_NUMBER() OVER (ORDER BY @@ROWCOUNT) AS nk,
            n.value('.', 'VARCHAR(250)') AS [Name]
        FROM @xml2.nodes('//root/r') AS a(n)) AS Name ON Code.ck = Name.nk
    INNER JOIN
        (SELECT ROW_NUMBER() OVER (ORDER BY @@ROWCOUNT) AS sk,
            s.value('.', 'VARCHAR(250)') AS [Symbol]
        FROM @xml3.nodes('//root/r') AS a(s)) AS Symbol ON Symbol.sk = Name.nk
    RETURN
END
GO
CREATE VIEW [dbo].[CurrencyView]
AS
SELECT * FROM [dbo].[CurrencyTableGenerator]()
GO
Thanks to RThomas for the function.

Why does my query to combine columns return NULL? (SQL on StackExchange DataExplorer )

I have written a StackExchange DataExplorer query to list all comments by User.Id
The query works and returns Ids of questions and answers. What I do not understand is
why, for answers, the second column is empty.
DECLARE @UserId int = ##UserId##
Select p.Id
    , '<a href=https://stackoverflow.com/questions/'
    + Cast(p.Id as varchar(20)) + '>'
    + Cast(p.Id as varchar(20))
    + ' - ' + p.Title + '</a>'
    , c.Text
FROM Users u
Join Comments c ON c.UserId = @UserId
JOIN Posts p ON p.Id = c.PostId
where u.Id = @UserId AND p.Id IS NOT NULL
Even assuming that the column p.Title is NULL the column p.Id is not NULL and I would therefore expect that this part
'<a href=https://stackoverflow.com/questions/'
+ Cast(p.Id as varchar(20)) + '>'
+ Cast(p.Id as varchar(20))
+ ' - ' + p.Title + '</a>'
would return something as per this question. But the second column is totally empty.
Why is that the case?
"Even assuming that the column p.Title is NULL"
Which it is for those rows.
"the column p.Id is not NULL and therefore i would expect [the result to be something not null]"
Nope. In SQL Server, concatenating NULL with anything using the + operator yields NULL, unless CONCAT_NULL_YIELDS_NULL is OFF.
You can use the CONCAT function instead, which treats NULL as an empty string. This also removes the need to CAST:
DECLARE @UserId INT = ##UserId##
SELECT p.Id,
       CONCAT('<a href=http://stackoverflow.com/questions/',
              p.Id,
              '>',
              p.Id,
              ' - ',
              p.Title COLLATE SQL_Latin1_General_CP1_CI_AS,
              '</a>'),
       c.Text
FROM Users u
JOIN Comments c
    ON c.UserId = @UserId
JOIN Posts p
    ON p.Id = c.PostId
WHERE u.Id = @UserId
    AND p.Id IS NOT NULL
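The difference is easy to see in isolation. The following is a small standalone sketch (the variable name and literal are made up for illustration) that compares the + operator, + with ISNULL, and CONCAT under the default CONCAT_NULL_YIELDS_NULL ON setting; CONCAT requires SQL Server 2012 or later:

```sql
DECLARE @title nvarchar(50);   -- left unassigned, so it is NULL, like p.Title for an answer

SELECT
    'Post 42 - ' + @title             AS PlusResult,    -- NULL: one NULL operand makes the whole expression NULL
    'Post 42 - ' + ISNULL(@title, '') AS IsNullResult,  -- 'Post 42 - '
    CONCAT('Post 42 - ', @title)      AS ConcatResult;  -- 'Post 42 - ': CONCAT treats NULL as an empty string
```

On versions before SQL Server 2012, wrapping the nullable column in ISNULL or COALESCE achieves the same effect.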

Percentage of Values for Top 3 from a Character Field

I have an unusual situation. Please consider the following code:
IF OBJECT_ID('tempdb..#CharacterTest') IS NOT NULL
DROP TABLE #CharacterTest
CREATE TABLE #CharacterTest
(
[ID] int IDENTITY(1, 1) NOT NULL,
[CharField] varchar(50) NULL
)
INSERT INTO #CharacterTest (CharField)
VALUES ('A')
, ('A')
, ('A')
, ('A')
, ('B')
, ('B')
, ('B')
, ('C')
, ('C')
, ('D')
, ('D')
, ('F')
, ('G')
, ('H')
, ('I')
, ('J')
, ('K')
, ('L')
, ('M')
, ('N')
, (' ')
, (' ')
, (' ')
, (NULL)
, ('');
I would like a query which gives me a character string like this:
A (16%), B (12%), C (8%)
Please notice the following:
I don't want to have empty strings, strings with all blanks, or nulls listed in the top 3, but I do want the percentage of values calculated using the entire record count for the table.
Ties can be ignored, so if there were 22 values in the list with 8% frequency, it's alright to simply return whichever one is first.
Percentages can be rounded to whole numbers.
I'd like to find the easiest way to write this query while still retaining T-SQL compatibility back to SQL Server 2005. What is the best way to do this? Window Functions?
I'd go for.
WITH T1
AS (SELECT [CharField],
100.0 * COUNT(*) OVER (PARTITION BY [CharField]) /
COUNT(*) OVER () AS Pct
FROM #CharacterTest),
T2
AS (SELECT DISTINCT TOP 3 *
FROM T1
WHERE [CharField] <> '' --Excludes all blank or NULL as well
ORDER BY Pct DESC)
SELECT STUFF((SELECT ', ' + [CharField] + ' (' + CAST(CAST(ROUND(Pct,0) AS INT) AS VARCHAR(3)) + '%)'
              FROM T2
              ORDER BY Pct DESC
              FOR XML PATH('')), 1, 2, '') AS Result
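The single predicate [CharField] <> '' can stand in for three separate checks because SQL Server ignores trailing spaces when comparing strings, and any comparison with NULL evaluates to unknown, which WHERE treats as not true. A quick way to convince yourself, using two made-up variables and syntax that also runs on SQL Server 2005:

```sql
DECLARE @blank varchar(50), @missing varchar(50);
SET @blank = '   ';      -- all spaces
SET @missing = NULL;

SELECT
    CASE WHEN @blank   <> '' THEN 'kept' ELSE 'filtered out' END AS BlankString,  -- filtered out
    CASE WHEN @missing <> '' THEN 'kept' ELSE 'filtered out' END AS NullString;   -- filtered out
```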
My first attempt would probably be this. Not saying that it's the best way to handle it, but that it would work.
DECLARE @TotalCount INT
SELECT @TotalCount = COUNT(*) FROM #CharacterTest AS ct
SELECT TOP(3) CharField, COUNT(*) * 1.0 / @TotalCount AS OverallPercentage
FROM #CharacterTest AS ct
WHERE CharField IS NOT NULL AND REPLACE(CharField, ' ', '') <> ''
GROUP BY CharField
ORDER BY COUNT(*) desc
DROP TABLE #CharacterTest
This should get the character string you need:
declare @output varchar(200);
with cte as (
    select CharField
        , (count(*) * 100) / (select count(*) from #CharacterTest) as CharPct
        , row_number() over (order by count(*) desc, CharField) as RowNum
    from #CharacterTest
    where replace(CharField, ' ', '') not like ''
    group by CharField
)
select @output = coalesce(@output + ', ', '') + CharField + ' (' + cast(CharPct as varchar(11)) + '%)'
from cte
where RowNum <= 3
order by RowNum;
select @output;
-- Returns:
-- A (16%), B (12%), C (8%)
I would draw attention to storing a single character in a varchar(50) column, however.