Grouping sets of records in sql - tsql

The grouping is done on from and toloc, and each group is indicated by usrid.
Table:
from toloc usrid
a b 1
c d 1 --- group 1
e f 1
-------------------
a b 2
c d 2 --- group 2
e f 2
----------------------
a b 3
c d 3 --- group 3
h k 3
After the grouping query, the required result set is:
from toloc usrid
a b 1
c d 1 --- groups 1 & 2 combined to form one group
e f 1
-------------------
a b 2
c d 2 --- group 2
h k 2
How can I achieve this result set?
I have to group similar sets of records in SQL. Is it possible to do this with ROLLUP or the new GROUPING SETS? I haven't been able to figure it out.

I dug up this old question. Assuming there are no duplicated rows, it should work.
I solved the one you linked to first and rewrote it to match this one, so the fields will differ in names compared to your example.
DECLARE @t TABLE (fromloc VARCHAR(30), toloc VARCHAR(30), usr_history INT)
INSERT @t VALUES ('a', 'b', 1)
INSERT @t VALUES ('c', 'b', 1)
INSERT @t VALUES ('e', 'f', 1)
INSERT @t VALUES ('a', 'b', 2)
INSERT @t VALUES ('c', 'b', 2)
INSERT @t VALUES ('e', 'f', 2)
INSERT @t VALUES ('a', 'b', 3)
INSERT @t VALUES ('c', 'd', 3)
INSERT @t VALUES ('h', 'k', 3)
;WITH c AS
(
-- for each pair of groups (h1 < h2), count the rows they have in common
SELECT t1.usr_history h1, t2.usr_history h2, COUNT(*) cnt
FROM @t t1
JOIN @t t2 ON t1.fromloc = t2.fromloc AND t1.toloc = t2.toloc AND t1.usr_history < t2.usr_history
GROUP BY t1.usr_history, t2.usr_history
),
d AS (
-- total number of rows in each group
SELECT usr_history h, COUNT(*) cnt FROM @t GROUP BY usr_history
),
e AS (
-- h2 duplicates h1 when the shared row count equals the size of both groups
SELECT d.h FROM d JOIN c ON c.cnt = d.cnt AND c.h2 = d.h
JOIN d d2 ON d2.cnt = c.cnt AND d2.h = c.h1
)
-- keep only the non-duplicate groups and renumber them
SELECT fromloc, toloc, DENSE_RANK() OVER (ORDER BY usr_history) AS 'usrid'
FROM @t t
WHERE NOT EXISTS (SELECT 1 FROM e WHERE e.h = t.usr_history)

The answer to this question is here: https://stackoverflow.com/a/6727662/195446
Another way is to use FOR XML PATH to build a signature for each set of records.
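A minimal sketch of that signature idea, not taken from the linked answer: each usrid gets one string built from its sorted (fromloc, toloc) pairs, and groups with equal strings are collapsed. The table variable, the column names set_sig and new_usrid, and the '|' and ';' separators are assumptions for illustration; it also assumes the separators never occur in the data.

```sql
DECLARE @t TABLE (fromloc VARCHAR(30), toloc VARCHAR(30), usrid INT);
INSERT @t VALUES
    ('a', 'b', 1), ('c', 'd', 1), ('e', 'f', 1),
    ('a', 'b', 2), ('c', 'd', 2), ('e', 'f', 2),
    ('a', 'b', 3), ('c', 'd', 3), ('h', 'k', 3);

WITH sigs AS (
    -- one signature string per usrid, built from its rows in a fixed order
    SELECT u.usrid,
           (SELECT t.fromloc + '|' + t.toloc + ';'
            FROM @t t
            WHERE t.usrid = u.usrid
            ORDER BY t.fromloc, t.toloc
            FOR XML PATH('')) AS set_sig
    FROM (SELECT DISTINCT usrid FROM @t) u
),
keep AS (
    -- for each distinct signature keep the lowest usrid
    SELECT MIN(usrid) AS usrid
    FROM sigs
    GROUP BY set_sig
)
SELECT t.fromloc, t.toloc,
       DENSE_RANK() OVER (ORDER BY k.usrid) AS new_usrid
FROM keep k
JOIN @t t ON t.usrid = k.usrid
ORDER BY new_usrid, t.fromloc, t.toloc;
```

With the sample rows from the question, groups 1 and 2 share a signature, so only group 1 and the former group 3 (renumbered to 2) remain.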

Related

Select top 5 and bottom 5 columns from a list of columns based on their values

I have a requirement where I need to Select top 5 and bottom 5 columns from a list of columns based on their values.
If more than 1 column has same value then select any one from them.
Eg
CREATE TABLE #b(Company VARCHAR(10),A1 INt,A2 INt,A3 INt,A4 INt,B1 INt,G1 INt,G2 INt,G3 INt,HH5 INt,SS6 INt)
INSERT INTo #b
SELECT 'test_A',8,10,6,10,0,6,0,6,13,4 UNION ALL
SELECT 'test_B',17,7,0,1,3,18,0,6,9,5 UNION ALL
SELECT 'test_C',0,0,6,1,2,6,3,4,3,2 UNION ALL
SELECt 'test_D',13,1,4,1,4,1,9,0,0,5
SELECT * FROM #b
Desired Output:
Company  Top5               Bottom5
test_A   HH5,A2,A1,A3,SS6   B1,SS6,A3,A1,A2
test_B   G1,A1,HH5,A2,G3    A3,A4,B1,SS6,G3
I am able to find the top values but not the column names.
Here is where I am stuck: I can find the max score, but I am not sure how to find the column that holds it.
SELECT Company,(
SELECT MAX(myval)
FROM (VALUES (A1),(A2),(A3),(A4),(B1),(G1),(G2),(G3),(HH5)) AS temp(myval))
AS MaxOfColumns
FROM #b
As Larnu suggested, the first step would be to UNPIVOT the data into a form like (Company, ColumnName, Value). You can then use the ROW_NUMBER() window function to assign ordinals 1 - 10 to each value for each company based on the sorted value.
Next, you can wrap the above in a Common Table Expression (CTE) to feed a query that, for each Company, uses conditional aggregation with STRING_AGG() to combine the top 5 and bottom 5 column names separately and produce the desired result.
Something like:
;WITH Data AS (
SELECT
Company,
ColumnName,
Value,
ROW_NUMBER() OVER(PARTITION BY Company ORDER BY Value DESC, ColumnName) AS Ord
FROM #b
UNPIVOT (
Value FOR ColumnName IN (A1, A2, A3, A4, B1, G1, G2, G3, HH5, SS6)
) U
)
SELECT
D.Company,
STRING_AGG(CASE WHEN D.Ord BETWEEN 1 AND 5 THEN D.ColumnName END, ', ')
WITHIN GROUP (ORDER BY D.ORD) AS Top5,
STRING_AGG(CASE WHEN D.Ord BETWEEN 6 AND 10 THEN D.ColumnName END, ', ')
WITHIN GROUP (ORDER BY D.ORD) AS Bottom5
FROM Data D
GROUP BY D.Company
ORDER BY D.Company
For older SQL Server versions that don't support STRING_AGG(), the FOR XML PATH(''),TYPE construct can be used to concatenate text. The .value('text()[1]', 'varchar(max)') function is then used to safely extract the result from the XML, and finally the STUFF() function is used to strip out the leading separator (comma-space).
;WITH Data AS (
SELECT
Company,
ColumnName,
Value,
ROW_NUMBER() OVER(PARTITION BY Company ORDER BY Value DESC, ColumnName) AS Ord
FROM #b
UNPIVOT (
Value FOR ColumnName IN (A1, A2, A3, A4, B1, G1, G2, G3, HH5, SS6)
) U
)
SELECT B.Company, C.Top5, C.Bottom5
FROM #b B
CROSS APPLY (
SELECT
STUFF((
SELECT ', ' + D.ColumnName
FROM Data D
WHERE D.Company = B.Company
AND D.Ord BETWEEN 1 AND 5
ORDER BY D.ORD
FOR XML PATH(''),TYPE
).value('text()[1]', 'varchar(max)'), 1, 2, '') AS Top5,
STUFF((
SELECT ', ' + D.ColumnName
FROM Data D
WHERE D.Company = B.Company
AND D.Ord BETWEEN 6 AND 10
ORDER BY D.ORD
FOR XML PATH(''),TYPE
).value('text()[1]', 'varchar(max)'), 1, 2, '') AS Bottom5
) C
ORDER BY B.Company
See this db<>fiddle for a demo.
If you also want lists of the top 5 and bottom 5 values, you can repeat the aggregations above while substituting CONVERT(VARCHAR, D.Value) for D.ColumnName where appropriate.
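As a rough sketch of that substitution, assuming the temp table #b from the question already exists in the session and that STRING_AGG() is available (SQL Server 2017 or later); the output column names Top5Values and Bottom5Values are made up for illustration:

```sql
;WITH Data AS (
    SELECT
        Company,
        ColumnName,
        Value,
        ROW_NUMBER() OVER (PARTITION BY Company ORDER BY Value DESC, ColumnName) AS Ord
    FROM #b
    UNPIVOT (
        Value FOR ColumnName IN (A1, A2, A3, A4, B1, G1, G2, G3, HH5, SS6)
    ) U
)
SELECT
    D.Company,
    -- same conditional aggregation as above, but concatenating the values
    STRING_AGG(CASE WHEN D.Ord BETWEEN 1 AND 5 THEN CONVERT(VARCHAR(12), D.Value) END, ', ')
        WITHIN GROUP (ORDER BY D.Ord) AS Top5Values,
    STRING_AGG(CASE WHEN D.Ord BETWEEN 6 AND 10 THEN CONVERT(VARCHAR(12), D.Value) END, ', ')
        WITHIN GROUP (ORDER BY D.Ord) AS Bottom5Values
FROM Data D
GROUP BY D.Company
ORDER BY D.Company;
```

An explicit length such as VARCHAR(12) is used here because CONVERT(VARCHAR, ...) without a length defaults to 30 characters, which works for INT but is easy to misread.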

In SQL Server, Is it possible to pivot rows without aggregation?

Here is an example: Assuming that each command has a limited number of parameters, I would like to represent the parameters and their values as named_indexed columns.
--drop table TCommand
--drop table TParam
Create Table TCommand (
CommandID INT,
CommandName NCHAR(20),
Description NVARCHAR(100)
);
Create Table TParam (
CommandID INT,
ParamName NCHAR(20),
ParamValue NCHAR(50)
);
insert into TCommand Values(1, 'C1', 'Desc for command C1')
insert into TCommand Values(2, 'C2', 'Desc for command C2')
insert into TCommand Values(3, 'C3', 'Desc for command C3')
insert into TParam Values (1, 'Pa', 'xa1')
insert into TParam Values (1, 'Pb', 'yb1')
insert into TParam Values (1, 'Pc', 'zc1')
insert into TParam Values (2, 'Px', 'xa2')
insert into TParam Values (2, 'Py', 'yb2')
insert into TParam Values (3, 'Pt', 'xa3')
insert into TParam Values (3, 'Pu', 'yb3')
select tc.*, tp.ParamName, tp.ParamValue
from TCommand tc
join TParam tp on tp.CommandID=tc.CommandID
order by tc.CommandName, tp.ParamName
Results:
CommandID CommandName Description ParamName ParamValue
----------- ----------- -------------------- --------- ----------
1 C1 Desc for command C1 Pa xa1
1 C1 Desc for command C1 Pb yb1
1 C1 Desc for command C1 Pc zc1
2 C2 Desc for command C2 Px xa2
2 C2 Desc for command C2 Py yb2
3 C3 Desc for command C3 Pt xa3
3 C3 Desc for command C3 Pu yb3
Here is the format I would like to obtain.
CommandID CommandName Description ParamName_1 ParamValue_1 ParamName_2 ParamValue_2 ParamName_3 ParamValue_3
----------- ----------- -------------------- ----------- ------------ ----------- ------------ ----------- ------------
1 C1 Desc for command C1 Pa xa1 Pb yb1 Pc zc1
2 C2 Desc for command C2 Px xa2 Py yb2 NULL NULL
3 C3 Desc for command C3 Pt xa3 Pu yb3 NULL NULL
What query should I write? Earlier attempts failed using PIVOT because of missing Aggregation function (which I thought I do not need).
Thanks in advance.
You need to create two pivot queries here: one for ParamName and one for ParamValue. In the q1 and q2 derived tables I have selected only the columns needed for each pivot (if I had used just q in both queries, I would get extra rows with NULLs in the relevant columns).
To be able to join the two queries, you need a column representing the position of the parameter (named RowNum in the query below).
If there is a single value for each value of the pivot column, you can use an aggregator function such as MIN or MAX (which ignores the NULL values and keeps the single input value).
Therefore, you can use the following query:
;WITH q AS (
SELECT tp.CommandID,
ROW_NUMBER() OVER (PARTITION BY tp.CommandID ORDER BY tp.ParamName) AS RowNum,
tp.ParamName, tp.ParamValue
FROM dbo.TParam tp
)
SELECT tc.CommandID, tc.CommandName, tc.Description,
x1.ParamName_1, x2.ParamValue_1, x1.ParamName_2, x2.ParamValue_2, x1.ParamName_3, x2.ParamValue_3
FROM dbo.TCommand tc
LEFT JOIN (
SELECT p.CommandID, p.[1] AS ParamName_1, p.[2] AS ParamName_2, p.[3] AS ParamName_3
FROM (SELECT q.CommandID, q.RowNum, q.ParamName FROM q) q1
PIVOT (MAX(ParamName) FOR RowNum IN ([1],[2],[3])) p
) x1 ON x1.CommandID = tc.CommandID
LEFT JOIN (
SELECT p.CommandID, p.[1] AS ParamValue_1, p.[2] AS ParamValue_2, p.[3] AS ParamValue_3
FROM (SELECT q.CommandID, q.RowNum, q.ParamValue FROM q) q2
PIVOT (MAX(ParamValue) FOR RowNum IN ([1],[2],[3])) p
) x2 ON x2.CommandID = tc.CommandID
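The same "MAX over a single value" idea can also be sketched without PIVOT, using conditional aggregation. This is an alternative formulation rather than the query above; it assumes the TCommand and TParam tables from the question and at most three parameters per command.

```sql
;WITH q AS (
    SELECT tp.CommandID, tp.ParamName, tp.ParamValue,
           ROW_NUMBER() OVER (PARTITION BY tp.CommandID ORDER BY tp.ParamName) AS RowNum
    FROM dbo.TParam tp
)
SELECT tc.CommandID, tc.CommandName, tc.Description,
       -- each CASE yields at most one non-NULL value per command, so MAX just returns it
       MAX(CASE WHEN q.RowNum = 1 THEN q.ParamName  END) AS ParamName_1,
       MAX(CASE WHEN q.RowNum = 1 THEN q.ParamValue END) AS ParamValue_1,
       MAX(CASE WHEN q.RowNum = 2 THEN q.ParamName  END) AS ParamName_2,
       MAX(CASE WHEN q.RowNum = 2 THEN q.ParamValue END) AS ParamValue_2,
       MAX(CASE WHEN q.RowNum = 3 THEN q.ParamName  END) AS ParamName_3,
       MAX(CASE WHEN q.RowNum = 3 THEN q.ParamValue END) AS ParamValue_3
FROM dbo.TCommand tc
LEFT JOIN q ON q.CommandID = tc.CommandID
GROUP BY tc.CommandID, tc.CommandName, tc.Description
ORDER BY tc.CommandID;
```

This form needs only one pass over q and one join, at the cost of repeating the CASE expression for every output column.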

SSRS Expression Split string in rows and column

I am working with SQL Server 2008 Reporting Services. I am trying to split string values into different columns in the same row using an expression, but I can't get the expected output. I have provided input and output details. I have to split the values by space (" ") and by hyphen ("-").
Input :
Sample 1:
ASY-LOS,SLD,ME,A1,A5,J4A,J4B,J4O,J4P,J4S,J4T,J7,J10,J2A,J2,S2,S3,S3T,S3S,E2,E2F,E6,T6,8,SB1,E1S,OTH AS2-J4A,J4B,J4O,J4P,J4S,J4T,J7,J1O,J2A,S2,S3,J2,T6,T8,E2,E4,E6,SLD,SB1,OTH
Sample 2:
A1 A2 A3 A5 D2 D3 D6 E2 E4 E5 E6 EOW LH LL LOS OTH P8 PH PL PZ-1,2,T1,T2,T3 R2-C,E,A RH RL S1 S2-D S3
Output should be:
Thank you.
I wrote this before I saw your comment about having to do it in the report. If you can explain why you cannot do this in the dataset query then there may be a way around that.
Anyway, here's one way of doing this using SQL
DECLARE @t table (RowN int identity (1,1), sample varchar(500))
INSERT INTO @t (sample) SELECT 'ASY-LOS,SLD,ME,A1,A5,J4A,J4B,J4O,J4P,J4S,J4T,J7,J10,J2A,J2,S2,S3,S3T,S3S,E2,E2F,E6,T6,8,SB1,E1S,OTH AS2-J4A,J4B,J4O,J4P,J4S,J4T,J7,J1O,J2A,S2,S3,J2,T6,T8,E2,E4,E6,SLD,SB1,OTH'
INSERT INTO @t (sample) SELECT 'A1 A2 A3 A5 D2 D3 D6 E2 E4 E5 E6 EOW LH LL LOS OTH P8 PH PL PZ-1,2,T1,T2,T3 R2-C,E,A RH RL S1 S2-D S3'
drop table if exists #s1
SELECT RowN, sample, SampleIdx = idx, SampleValue = [Value]
into #s1
from @t t
CROSS APPLY
spring..fn_Split(sample, ' ') as x
drop table if exists #s2
SELECT
s1.*
, s2idx = Idx
, s2Value = [Value]
into #s2
FROM #s1 s1
CROSS APPLY spring..fn_Split(SampleValue, '-')
SELECT SampleKey = [1],
Output = [2] FROM #s2
PIVOT (
MAX(s2Value)
FOR s2Idx IN ([1],[2])
) p
This produced the following results
If you do not have a split function, here is the script to create the one I use
CREATE FUNCTION [dbo].[fn_Split]
/* Define I/O parameters WARNING! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE! */
(@pString VARCHAR(8000)
,@pDelimiter CHAR(1)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
/*"Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000: enough to cover VARCHAR(8000)*/
WITH E1(N) AS (
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)--10E+1 or 10 rows
,E2(N) AS (SELECT 1 FROM E1 a,E1 b)--10E+2 or 100 rows
,E4(N) AS (SELECT 1 FROM E2 a,E2 b)--10E+4 or 10,000 rows max
/* This provides the "base" CTE and limits the number of rows right up front
for both a performance gain and prevention of accidental "overruns" */
,cteTally(N) AS (
    SELECT TOP (ISNULL(DATALENGTH(@pString), 0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM E4
)
/* This returns N+1 (starting position of each "element" just once for each delimiter) */
,cteStart(N1) AS (
    SELECT 1 UNION ALL
    SELECT t.N + 1 FROM cteTally t WHERE SUBSTRING(@pString, t.N, 1) = @pDelimiter
)
/* Return start and length (for use in SUBSTRING later) */
,cteLen(N1, L1) AS (
    SELECT s.N1
    ,ISNULL(NULLIF(CHARINDEX(@pDelimiter, @pString, s.N1), 0) - s.N1, 8000)
    FROM cteStart s
)
/* Do the actual split.
The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found. */
SELECT
    idx = ROW_NUMBER() OVER (ORDER BY l.N1)
    ,value = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
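A small usage sketch, assuming the function above has been created as dbo.fn_Split in the current database (the query earlier calls it through a database named spring, which is specific to that environment):

```sql
-- Split one of the hyphenated tokens from Sample 2 into key and value parts
SELECT idx, value
FROM dbo.fn_Split('S2-D', '-');
-- idx  value
-- 1    S2
-- 2    D

-- A token without the delimiter comes back as a single row
SELECT idx, value
FROM dbo.fn_Split('RH', '-');
-- idx  value
-- 1    RH
```

Tokens with no hyphen therefore only have an idx = 1 row, which is why the PIVOT in the query above leaves their second column NULL.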

Converting two Text/Numerical data to Ranges

I'm attempting to convert data from two columns (one with text and one with numbers) to a range.
I've searched and been unable to find something that works for this:
Table:
ColumnA Nvarchar(50)
ColumnB Int
Table Sample:
ColumnA ColumnB
AA 1
AA 2
AA 3
AA 4
AA 5
AB 1
AB 2
AB 3
AB 4
Desired Output:
AA:1-5, AB:1-4
Any help would be greatly appreciated
Note I am assuming the reason you're asking the question is that you can have broken ranges and you're not simply looking for the min/max ColumnB for each ColumnA.
If you ask me, this type of thing is probably best handled in code on either an intermediate layer or directly in your presentation layer. Sort the rows by (ColumnA, ColumnB) in your query, then you can get the desired results in a single pass as you read rows - by comparing the current values with the previous row, and outputting a row when either ColumnA changes or ColumnB is not adjacent.
However, if you're bent on doing this in SQL, you can use a recursive CTE. The basic premise would be to correlate each row with an adjacent row and hold on to the beginning value of ColumnB as you proceed. An adjacent row is defined as a row with the same value of ColumnA and the next value of ColumnB (i.e. the previous row + 1).
Something like the following ought to do:
;with cte as (
select a.ColumnA, a.ColumnB, a.ColumnB as rangeStart
from myTable a
where not exists ( --make sure we don't keep 'intermediate rows' as start rows
select 1
from myTable b
where b.ColumnA = a.ColumnA
and b.ColumnB = a.ColumnB - 1
)
union all
select a.ColumnA, b.ColumnB, a.rangeStart
from cte a
join myTable b on a.ColumnA = b.ColumnA
and b.ColumnB = a.ColumnB + 1 --correlate with 'next' row
)
select ColumnA, rangeStart, max(ColumnB) as rangeEnd
from cte
group by ColumnA, rangeStart
And given your sample data, indeed it does.
And for kicks, here is another Fiddle with data having gaps in ColumnB.
Note the GROUP BY clause, which identifies each run of consecutive values with some arithmetic: within a run, ColumnB minus the row number stays constant.
DECLARE @Data table (ColumnA Nvarchar(50), ColumnB Int)
INSERT @Data VALUES
('AA', 1),
('AA', 2),
('AA', 3),
--('AA', 4),
('AA', 5),
('AB', 1),
('AB', 2),
('AB', 3),
('AB', 4)
;WITH Ordered AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY ColumnA ORDER BY ColumnB) AS Seq,
*
FROM @Data
)
SELECT
ColumnA,
CASE
WHEN 1 = 0 THEN ''
-- if the ColumnA only has 1 row, the display is 1-1? or just 1?
--WHEN MIN(ColumnB) = MAX(ColumnB) THEN CONVERT(varchar(10), MIN(ColumnB))
ELSE CONVERT(varchar(10), MIN(ColumnB)) + '-' + CONVERT(varchar(10), MAX(ColumnB))
END AS Range
FROM Ordered
GROUP BY
ColumnA,
ColumnB - Seq -- The math
ORDER BY ColumnA, MIN(ColumnB)
SQL Fiddle
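To make the ColumnB - Seq arithmetic concrete, here is a small sketch that lists the intermediate values for a ColumnA with a gap. The table variable and the Island column name are illustrative only.

```sql
DECLARE @Data TABLE (ColumnA NVARCHAR(50), ColumnB INT);
INSERT @Data VALUES ('AA', 1), ('AA', 2), ('AA', 3), ('AA', 5);

SELECT
    ColumnA,
    ColumnB,
    ROW_NUMBER() OVER (PARTITION BY ColumnA ORDER BY ColumnB) AS Seq,
    -- constant within a run of consecutive values, changes after each gap
    ColumnB - ROW_NUMBER() OVER (PARTITION BY ColumnA ORDER BY ColumnB) AS Island
FROM @Data
ORDER BY ColumnA, ColumnB;
-- ColumnA  ColumnB  Seq  Island
-- AA       1        1    0
-- AA       2        2    0
-- AA       3        3    0
-- AA       5        4    1
```

Grouping by (ColumnA, Island) then gives one row per run, here 1-3 and 5-5.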

Full Outer Self Join [duplicate]

The problem is to return the rows which contain nulls as well. Below is SQL code to create table and populate it with sample data.
I'm expecting the result below, but the query does not show the two rows with NULL values.
src_t1 id1_t1 id2_t1 val_t1 src_t2 id1_t2 id2_t2 val_t2
                            b      z      z      4
a      w      w      100    b      w      w      1
a      x      x      200    b      x      x      2
a      y      y      300
Data:
CREATE TABLE sample (
src VARCHAR(6)
,id1 VARCHAR(6)
,id2 VARCHAR(6)
,val FLOAT
);
INSERT INTO sample (src, id1, id2, val)
VALUES ('a', 'w', 'w', 100)
,('b', 'w', 'w', 1)
,('a', 'x', 'x', 200)
,('b', 'x', 'x', 2)
,('a', 'y', 'y', 300)
,('b', 'z', 'z', 4)
;
This is my test query. It does not show results when t1.src = 'a' and t1.id1 = 'y' or when t2.src = 'b' and t2.id1 = 'z'.
Why?
What's the correct query?
SELECT t1.src, t1.id1, t1.id2, t1.val
,t2.src as src2, t2.id1, t2.id2, t2.val
FROM sample t1 FULL OUTER JOIN sample t2
ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
WHERE (t1.src = 'a' AND t2.src = 'b')
OR (t1.src IS NULL AND t1.id1 IS NULL AND t1.id2 IS NULL)
OR (t2.src IS NULL AND t2.id1 IS NULL AND t2.id2 IS NULL)
I've also tried moving the conditions in the WHERE clause to the ON clause as well.
TIA.
The WHERE clause evaluates too late, effectively converting your query into an inner join.
Instead, write your query like this using proper JOIN syntax:
SELECT t1.src, t1.id1, t1.id2, t1.val
,t2.src as src2, t2.id1, t2.id2, t2.val
FROM (
select * from sample
where src='a'
) t1 FULL OUTER JOIN (
select * from sample
where src='b'
) t2
ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
yielding this result set:
src id1 id2 val src2 id1 id2 val
---- ---- ---- ----------- ---- ---- ---- -----------
a w w 100 b w w 1
a x x 200 b x x 2
NULL NULL NULL NULL b z z 4
a y y 300 NULL NULL NULL NULL
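One way to see why the original query lost the y and z rows, sketched against the sample table from the question: in a self join on id1 and id2, every row matches at least itself, so the outer join never produces a NULL-padded row for them, and the WHERE filter then discards the a-a and b-b pairings.

```sql
-- Unfiltered self join, restricted to the two ids that went missing
SELECT t1.src AS src1, t1.id1 AS id1_1, t2.src AS src2, t2.id1 AS id1_2
FROM sample t1
FULL OUTER JOIN sample t2
    ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
WHERE t1.id1 IN ('y', 'z');
-- src1  id1_1  src2  id1_2
-- a     y      a     y
-- b     z      b     z
```

Neither row satisfies (t1.src = 'a' AND t2.src = 'b'), and neither side is NULL, so all three branches of the original WHERE clause reject them. Filtering each side before the join, as in the query above, is what lets the outer join produce the NULL-padded rows.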
Update:
Note also the use of two sub-queries to clearly separate the source table into two distinct relvars. I missed this for a minute on my first submission.
Actually, I think the solution is a bit cleaner if a CTE is used:
WITH A AS (
select * from sample where src='a'
),
B AS (
select * from sample where src='b'
)
SELECT *
FROM A FULL OUTER JOIN B
ON A.ID1 = B.ID1 AND A.ID2 = B.ID2
;