Converting two Text/Numerical data to Ranges - sql-server-2008-r2

I'm attempting to convert data from two columns (one with text and one with numbers) to a range.
I've searched and unable to find something that works for this needed solution:
Table:
ColumnA Nvarchar(50)
ColumnB Int
Table Sample:
ColumnA ColumnB
AA 1
AA 2
AA 3
AA 4
AA 5
AB 1
AB 2
AB 3
AB 4
Desired Output:
AA:1-5, AB:1-4
Any help would be greatly appreciated

Note I am assuming the reason you're asking the question is that you can have broken ranges and you're not simply looking for the min/max ColumnB for each ColumnA.
If you ask me, this type of thing is probably best handled in code on either an intermediate layer or directly in your presentation layer. Sort the rows by (ColumnA, ColumnB) in your query, then you can get the desired results in a single pass as you read rows - by comparing the current values with the previous row, and outputting a row when either ColumnA changes or ColumnB is not adjacent.
However, if you're bent on doing this in SQL, you can use a recursive CTE. The basic premise would be to correlate each row with an adjacent row and hold on to the beginning value of ColumnB as you proceed. An adjacent row is defined as a row with the same value of ColumnA and the next value of ColumnB (i.e. the previous row + 1).
Something like the following ought to do:
;with cte as (
select a.ColumnA, a.ColumnB, a.ColumnB as rangeStart
from myTable a
where not exists ( --make sure we don't keep 'intermediate rows' as start rows
select 1
from myTable b
where b.ColumnA = a.ColumnA
and b.ColumnB = a.ColumnB - 1
)
union all
select a.ColumnA, b.ColumnB, a.rangeStart
from cte a
join myTable b on a.ColumnA = b.ColumnA
and b.ColumnB = a.ColumnB + 1 --correlate with 'next' row
)
select ColumnA, rangeStart, max(ColumnB) as rangeEnd
from cte
group by ColumnA, rangeStart
And given your sample data, indeed it does.
And for kicks, here is another Fiddle with data having gaps in ColumnB.

Note the group by clause for the continuous values by doing some math.
DECLARE #Data table (ColumnA Nvarchar(50), ColumnB Int)
INSERT #Data VALUES
('AA', 1),
('AA', 2),
('AA', 3),
--('AA', 4),
('AA', 5),
('AB', 1),
('AB', 2),
('AB', 3),
('AB', 4)
;WITH Ordered AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY ColumnA ORDER BY ColumnB) AS Seq,
*
FROM #Data
)
SELECT
ColumnA,
CASE
WHEN 1 = 0 THEN ''
-- if the ColumnA only has 1 row, the display is 1-1? or just 1?
--WHEN MIN(ColumnB) = MAX(ColumnB) THEN CONVERT(varchar(10), MIN(ColumnB))
ELSE CONVERT(varchar(10), MIN(ColumnB)) + '-' + CONVERT(varchar(10), MAX(ColumnB))
END AS Range
FROM Ordered
GROUP BY
ColumnA,
ColumnB - Seq -- The math
ORDER BY ColumnA, MIN(ColumnB)
SQL Fiddle

Related

Select top 5 and bottom 5 columns from a list of columns based on their values

I have a requirement where I need to Select top 5 and bottom 5 columns from a list of columns based on their values.
If more than 1 column has same value then select any one from them.
Eg
CREATE TABLE #b(Company VARCHAR(10),A1 INt,A2 INt,A3 INt,A4 INt,B1 INt,G1 INt,G2 INt,G3 INt,HH5 INt,SS6 INt)
INSERT INTo #b
SELECT 'test_A',8,10,6,10,0,6,0,6,13,4 UNION ALL
SELECT 'test_B',17,7,0,1,3,18,0,6,9,5 UNION ALL
SELECT 'test_C',0,0,6,1,2,6,3,4,3,2 UNION ALL
SELECt 'test_D',13,1,4,1,4,1,9,0,0,5
SELECT * FROM #b
Desired Output:
Company
Top5
Bottom5
test_A
HH5,A2,A1,A3,SS6
B1,SS6,A3,A1,A2
test_B
G1,A1,HH5,A2,G3
A3,A4,B1,SS6,G3
I am able to find the top values but not the column names.
Here is I am stuck at, I am able to find the max scores but not sure how to find the column that holds this max value.
SELECT Company,(
SELECT MAX(myval)
FROM (VALUES (A1),(A2),(A3),(A4),(B1),(G1),(G2),(G3),(HH5)) AS temp(myval))
AS MaxOfColumns
FROM #b
As Larnu suggested, the first step would be to UNPIVOT the data into a form like (Company, ColumnName, Value). You can then use the ROW_NUMBER() window function to assign ordinals 1 - 10 to each value for each company based on the sorted value.
Next, you can wrap the above in a Common Table Expression (CTE) to feed a query that, for each Company, uses conditional aggregation with the STRING_AGG() to selectively combine the top 5 and bottom 5 column names to produce the desired result.
Something like:
;WITH Data AS (
SELECT
Company,
ColumnName,
Value,
ROW_NUMBER() OVER(PARTITION BY Company ORDER BY Value DESC, ColumnName) AS Ord
FROM #b
UNPIVOT (
Value FOR ColumnName IN (A1, A2, A3, A4, B1, G1, G2, G3, HH5, SS6)
) U
)
SELECT
D.Company,
STRING_AGG(CASE WHEN D.Ord BETWEEN 1 AND 5 THEN D.ColumnName END, ', ')
WITHIN GROUP (ORDER BY D.ORD) AS Top5,
STRING_AGG(CASE WHEN D.Ord BETWEEN 6 AND 10 THEN D.ColumnName END, ', ')
WITHIN GROUP (ORDER BY D.ORD) AS Bottom5
FROM Data D
GROUP BY D.Company
ORDER BY D.Company
For older SQL Server versions that don't support STRING_AGG(), the FOR XML PATH(''),TYPE construct can be used to concatenate text. The .value('text()[1]', 'varchar(max)') function is then used to safely extract the result from the XML, and finally the STUFF() function is used to strip out the leading separator (comma-space).
;WITH Data AS (
SELECT
Company,
ColumnName,
Value,
ROW_NUMBER() OVER(PARTITION BY Company ORDER BY Value DESC, ColumnName) AS Ord
FROM #b
UNPIVOT (
Value FOR ColumnName IN (A1, A2, A3, A4, B1, G1, G2, G3, HH5, SS6)
) U
)
SELECT B.Company, C.Top5, C.Bottom5
FROM #b B
CROSS APPLY (
SELECT
STUFF((
SELECT ', ' + D.ColumnName
FROM Data D
WHERE D.Company = B.Company
AND D.Ord BETWEEN 1 AND 5
ORDER BY D.ORD
FOR XML PATH(''),TYPE
).value('text()[1]', 'varchar(max)'), 1, 2, '') AS Top5,
STUFF((
SELECT ', ' + D.ColumnName
FROM Data D
WHERE D.Company = B.Company
AND D.Ord BETWEEN 6 AND 10
ORDER BY D.ORD
FOR XML PATH(''),TYPE
).value('text()[1]', 'varchar(max)'), 1, 2, '') AS Bottom5
) C
ORDER BY B.Company
See this db<>fiddle fr a demo.
If you also want lists of the top 5 and bottom 5 values, you can repeat the aggregations above while substituting CONVERT(VARCHAR, D.Value) for D.ColumnName where appropriate.

TSQL - in a string, replace a character with a fixed one every 2 characters

I can't replace every 2 characters of a string with a '.'
select STUFF('abcdefghi', 3, 1, '.') c3,STUFF('abcdefghi', 5, 1,
'.') c5,STUFF('abcdefghi', 7, 1, '.') c7,STUFF('abcdefghi', 9, 1, '.')
c9
if I use STUFF I should subsequently overlap the strings c3, c5, c7 and c9. but I can't find a method
can you help me?
initial string:
abcdefghi
the result I would like is
ab.de.gh.
the string can be up to 50 characters
Create a numbers / tally / digits table, if you don't have one already, then you can use this to target each character position:
with digits as ( /* This would be a real table, here it's just to test */
select n from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10))x(n)
), t as (
select 'abcdefghi' as s
)
select String_Agg( case when d.n%3 = 0 then '.' else Substring(t.s, d.n, 1) end, '')
from t
cross apply digits d
where d.n <Len(t.s)
Using for xml with existing table
with digits as (
select n from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10))x(n)
),
r as (
select t.id, case when d.n%3=0 then '.' else Substring(t.s, d.n, 1) end ch
from t
cross apply digits d
where d.n <Len(t.s)
)
select result=(select '' + ch
from r r2
where r2.id=r.id
for xml path('')
)
from r
group by r.id
You can try it like this:
Easiest might be a quirky update ike here:
DECLARE #string VARCHAR(100)='abcdefghijklmnopqrstuvwxyz';
SELECT #string = STUFF(#string,3*A.pos,1,'.')
FROM (SELECT TOP(LEN(#string)/3) ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM master..spt_values) A(pos);
SELECT #string;
Better/Cleaner/Prettier was a recursive CTE:
We use a declared table to have some tabular sample data
DECLARE #tbl TABLE(ID INT IDENTITY, SomeString VARCHAR(200));
INSERT INTO #tbl VALUES('')
,('a')
,('ab')
,('abc')
,('abcd')
,('abcde')
,('abcdefghijklmnopqrstuvwxyz');
--the query
WITH recCTE AS
(
SELECT ID
,SomeString
,(LEN(SomeString)+1)/3 AS CountDots
,1 AS OccuranceOfDot
,SUBSTRING(SomeString,4,LEN(SomeString)) AS RestString
,CAST(LEFT(SomeString,2) AS VARCHAR(MAX)) AS Growing
FROM #tbl
UNION ALL
SELECT t.ID
,r.SomeString
,r.CountDots
,r.OccuranceOfDot+2
,SUBSTRING(RestString,4,LEN(RestString))
,CONCAT(Growing,'.',LEFT(r.RestString,2))
FROM #tbl t
INNER JOIN recCTE r ON t.ID=r.ID
WHERE r.OccuranceOfDot/2<r.CountDots-1
)
SELECT TOP 1 WITH TIES ID,Growing
FROM recCTE
ORDER BY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY OccuranceOfDot DESC);
--the result
1
2 a
3 ab
4 ab
5 ab
6 ab.de
7 ab.de.gh.jk.mn.pq.st.vw.yz
The idea in short
We use a recursive CTE to walk along the string
we add the needed portion together with a dot
We stop, when the remaining length is to short to continue
a little magic is the ORDER BY ROW_NUMBER() OVER() together with TOP 1 WITH TIES. This will allow all first rows (frist per ID) to appear.

SQL query - get parent index from level and child index

I have dataset with two columns: index and level.
Level is number indicating level in hierarchy of nested parent child records.
The records are in order of hierarchy and index is just the line number of record.
The rule is that any record's parent record has level = child level - 1.
My challenge is to identify the parent's index based on this rule.
For each record, I need to SQL query that will get the record's parent index.
The SQL query will be a self join, and get the max index value where the self join index < child.index and the self join level = child.level
I need help to figure out how to write this SQL.
I can use MS Access or use SQL in VBA to perform this query.
This is a visual representation of the data set.
This is sample data and expected result .. want to get parent index .. parent level is child level - 1.
Index,Level Number,Parent Level,Parent Index
1,1,1,1
2,2,1,1
4,4,3,3
9,9,8,8
3,3,2,2
5,5,4,4
8,8,7,7
6,6,5,5
7,7,6,6
10,10,9,9
11,11,10,10
12,12,11,11
13,13,12,12
14,14,13,13
15,14,13,13
16,14,13,13
17,14,13,13
18,14,13,13
19,14,13,13
20,14,13,13
21,13,12,12
22,13,12,12
23,13,12,12
24,14,13,23
25,14,13,23
26,14,13,23
27,11,10,10
28,9,8,8
29,9,8,8
30,9,8,8
31,9,8,8
32,9,8,8
33,9,8,8
34,9,8,8
35,8,7,7
36,9,8,35
37,10,9,36
38,11,10,37
39,11,10,37
40,12,11,39
41,12,11,39
42,13,12,41
43,13,12,41
44,13,12,41
45,11,10,37
46,12,11,45
47,13,12,46
48,14,13,47
49,14,13,47
50,14,13,47
51,14,13,47
52,14,13,47
53,14,13,47
54,14,13,47
55,13,12,46
56,13,12,46
57,13,12,46
58,9,8,35
59,9,8,35
60,9,8,35
61,9,8,35
62,8,7,7
63,8,7,7
64,8,7,7
65,8,7,7
66,8,7,7
67,8,7,7
68,8,7,7
Edited to add:
I tried to do this in Excel Power Query, and found an answer, but it takes forever to run so need to find SQL VBA/ADO solution. But here is Power Query solution to help give ideas about how to do it SQL.
let
Source = Excel.CurrentWorkbook(){[Name="Tabelle3"]}[Content],
ParentIndex = Table.AddColumn(Source, "ParentIndex", each let Index=[Index], LN=[Level Number] in List.Max(Table.SelectRows(Source, each _[Index] < Index and _[Level Number]=LN-1)[Index])),
#"Merged Queries" = Table.NestedJoin(ParentIndex,{"ParentIndex"},ParentIndex,{"Index"},"NewColumn",JoinKind.LeftOuter),
#"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Level Number"}, {"Level Number.1"})
in
#"Expanded NewColumn"
This Power Query solution finds Max index where each row index < all index and level = level -1
DECLARE #t TABLE (val INT)
INSERT INTO #t
VALUES
(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),
(14),(14),(14),(14),(14),(14),(14),(13),(13),(13),(14),(14),(14),(11)
SELECT REPLICATE(' ', val) + CAST(val AS VARCHAR(10))
FROM #t
Output
-----------------------------
1
2
3
4
5
6
7
8
9
10
11
12
13
14
14
14
14
14
14
14
13
13
--http://stackoverflow.com/questions/36639349/sql-query-get-parent-index-from-level-and-child-index
declare #table table
(idx int, level int)
insert into #table
(idx,level)
values
(1,1),
(2,2),
(3,3),
(4,4),
(5,5),
(6,6),
(7,7),
(8,8),
(9,9),
(10,10),
(11,11),
(12,12),
(13,13),
(14,14),
(15,14),
(16,14),
(17,14),
(18,14),
(19,14),
(20,14),
(21,14),
(22,13),
(23,13),
(24,13),
(25,14),
(26,14),
(27,14),
(28,11),
(29,9),
(30,8)
select v.idx,v.level,v.parentlevel,u.idx parentidx
from
(
select s.* from --Find the first idx,level
(
select t.*, t.level - 1 as parentlevel,
row_number() over (partition by level order by idx,level) rownum
from #table t
) s
where rownum = 1
) u
join --join to every occurance of
(select t2.*, t2.level - 1 parentlevel,
1 as rownum
from #table t2
) v
on (v.parentlevel = u.level and v.rownum = u.rownum)
union --and put 1 back
select w.idx,w.level,w.level,w.idx
from #table w
where w.idx = 1
order by v.idx

how to do dead reckoning on column of table, postgresql

I have a table looks like,
x y
1 2
2 null
3 null
1 null
11 null
I want to fill the null value by conducting a rolling
function to apply y_{i+1}=y_{i}+x_{i+1} with sql as simple as possible (inplace)
so the expected result
x y
1 2
2 4
3 7
1 8
11 19
implement in postgresql. I may encapsulate it in a window function, but the implementation of custom function seems always complex
WITH RECURSIVE t AS (
select x, y, 1 as rank from my_table where y is not null
UNION ALL
SELECT A.x, A.x+ t.y y , t.rank + 1 rank FROM t
inner join
(select row_number() over () rank, x, y from my_table ) A
on t.rank+1 = A.rank
)
SELECT x,y FROM t;
You can iterate over rows using a recursive CTE. But in order to do so, you need a way to jump from row to row. Here's an example using an ID column:
; with recursive cte as
(
select id
, y
from Table1
where id = 1
union all
select cur.id
, prev.y + cur.x
from Table1 cur
join cte prev
on cur.id = prev.id + 1
)
select *
from cte
;
You can see the query at SQL Fiddle. If you don't have an ID column, but you do have another way to order the rows, you can use row_number() to get an ID:
; with recursive sorted as
(
-- Specify your ordering here. This example sorts by the dt column.
select row_number() over (order by dt) as id
, *
from Table1
)
, cte as
(
select id
, y
from sorted
where id = 1
union all
select cur.id
, prev.y + cur.x
from sorted cur
join cte prev
on cur.id = prev.id + 1
)
select *
from cte
;
Here's the SQL Fiddle link.

Find duplicate row "details" in table

OrderId OrderCode Description
-------------------------------
1 Z123 Stuff
2 ABC999 Things
3 Z123 Stuff
I have duplicates in a table like the above. I'm trying to get a report of which Orders are duplicates, and what Order they are duplicates of, so I can figure out how they got into the database.
So ideally I'd like to get an output something like;
OrderId IsDuplicatedBy
-------------------------
1 3
3 1
I can't work out how to code this in SQL.
You can use the same table twice in one query and join on the fields you need to check against. T1.OrderID <> T2.OrderID is needed to not find a duplicate for the same row.
declare #T table (OrderID int, OrderCode varchar(10), Description varchar(50))
insert into #T values
(1, 'Z123', 'Stuff'),
(2, 'ABC999', 'Things'),
(3, 'Z123', 'Stuff')
select
T1.OrderID,
T2.OrderID as IsDuplicatedBy
from #T as T1
inner join #T as T2
on T1.OrderCode = T2.OrderCode and
T1.Description = T2.Description and
T1.OrderID <> T2.OrderID
Result:
OrderID IsDuplicatedBy
1 3
3 1