Multi-Column, Multi-Row PIVOT - sql-server-2008-r2

Consider that I have a table which contains data in the following form:
Foo_FK  MonthCode_FK  Activity_FK  SumResultX  SumResultY
-----------------------------------------------------------
1       201312        0            10          2
1       201312        1            5           1
1       201401        0            15          3
1       201401        1            7           2
2       201312        0            9           3
2       201312        1            1           2
2       201401        0            6           2
2       201401        1            17          4
For my purposes, it is safe to assume that this table is an aggregation which would have been created by a GROUP BY on Foo_FK, MonthCode_FK, Activity_FK with SUM( ResultsA ), SUM( ResultsB ) to obtain the data, making Foo_FK, MonthCode_FK, Activity_FK unique per record.
If for some reason I found it preferable to PIVOT this table in a stored procedure to ease the amount of screwing around with SSRS I'd have to do ( and undoubtedly later maintain ), wishing to get the following format for consumption via a matrix tablix thingy:
Foo_FK  1312_0_X  1312_0_Y  1312_1_X  1312_1_Y  1401_0_X  1401_0_Y  1401_1_X  1401_1_Y
--------------------------------------------------------------------------------------
1       10        2         5         1         15        3         7         2
2       9         3         1         2         6         2         17        4
How would I go about doing this in a not-mental way? Please refer to this SQL Fiddle as proof that I am likely trying to use a hammer to build a device that pushes in nails. Don't worry about a dynamic version, as I'm sure I can figure that out once I'm guided through the static solution for this test case.
Right now, I've tried to create a Foo_FK, MonthCode_FK set via the following, which I then attempt to PIVOT ( see the Fiddle for the full mess ):
SELECT Foo_FK = ISNULL( a0.Foo_FK, a1.Foo_FK ),
MonthCode_FK = ISNULL( a0.MonthCode_FK, a1.MonthCode_FK ),
[0_X] = ISNULL( a0.SumResultX, 0 ),
[0_Y] = ISNULL( a0.SumResultY, 0 ),
[1_X] = ISNULL( a1.SumResultX, 0 ),
[1_Y] = ISNULL( a1.SumResultY, 0 )
FROM ( SELECT Foo_FK, MonthCode_FK, Activity_FK,
SumResultX, SumResultY
FROM dbo.t_FooActivityByMonth
WHERE Activity_FK = 0 ) a0
FULL OUTER JOIN (
SELECT Foo_FK, MonthCode_FK, Activity_FK,
SumResultX, SumResultY
FROM dbo.t_FooActivityByMonth
WHERE Activity_FK = 1 ) a1
ON a0.Foo_FK = a1.Foo_FK;
I have come across some excellent advice on this SO question, so I'm in the process of performing some form of UNPIVOT before I twist everything back out using PIVOT and MAX, but if there's a better way to do this, I'm all ears.

It seems that you should be able to do this by applying unpivot to your SumResultX and SumResultY columns first, then pivoting the data:
;with cte as
(
select Foo_FK,
col = cast(MonthCode_FK as varchar(6))+'_'
+cast(activity_fk as varchar(1))+'_'+sumresult,
value
from dbo.t_FooActivityByMonth
cross apply
(
values
('X', SumResultX),
('Y', SumResultY)
) c (sumresult, value)
)
select Foo_FK,
[201312_0_X], [201312_0_Y], [201312_1_X], [201312_1_Y],
[201401_0_X], [201401_0_Y], [201401_1_X], [201401_1_Y]
from cte
pivot
(
max(value)
for col in ([201312_0_X], [201312_0_Y], [201312_1_X], [201312_1_Y],
[201401_0_X], [201401_0_Y], [201401_1_X], [201401_1_Y])
) piv;
See SQL Fiddle with Demo
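As a cross-check on the expected result, here is a minimal sketch of the same reshaping using conditional aggregation (MAX(CASE ...)) instead of UNPIVOT/PIVOT. It is not the answer's method: it runs against an in-memory SQLite database through Python's sqlite3 module, which is only a stand-in for SQL Server, and the column list is generated in Python from the month and activity codes in the sample data.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE t_FooActivityByMonth (
        Foo_FK INTEGER, MonthCode_FK INTEGER, Activity_FK INTEGER,
        SumResultX INTEGER, SumResultY INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO t_FooActivityByMonth VALUES (?, ?, ?, ?, ?)",
    [
        (1, 201312, 0, 10, 2), (1, 201312, 1, 5, 1),
        (1, 201401, 0, 15, 3), (1, 201401, 1, 7, 2),
        (2, 201312, 0, 9, 3), (2, 201312, 1, 1, 2),
        (2, 201401, 0, 6, 2), (2, 201401, 1, 17, 4),
    ],
)

# One MAX(CASE ...) expression per (month, activity, measure) combination.
months = (201312, 201401)
activities = (0, 1)
measures = (("X", "SumResultX"), ("Y", "SumResultY"))

select_items = []
for month in months:
    for activity in activities:
        for label, column in measures:
            select_items.append(
                f"MAX(CASE WHEN MonthCode_FK = {month} AND Activity_FK = {activity} "
                f'THEN {column} END) AS "{month}_{activity}_{label}"'
            )

pivot_sql = (
    "SELECT Foo_FK, " + ", ".join(select_items)
    + " FROM t_FooActivityByMonth GROUP BY Foo_FK ORDER BY Foo_FK"
)

cursor = conn.execute(pivot_sql)
pivot_columns = [description[0] for description in cursor.description]
pivot_rows = cursor.fetchall()

for row in pivot_rows:
    print(row)
```

The generated SELECT list uses only standard CASE and GROUP BY syntax, so it should carry over to T-SQL with bracketed column names, though that has not been verified here.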

Related

SQL Select based on each row of previous select

I have a table with answers regarding different questions, all of them numbered. There are basically these columns: IdAnswer (unique for each answer in the table), IdUser (which won't repeat even if the same user answer questions a second time), IdQuestion and Answer.
IdAnswer IdUser IdQuestion Answer
1 John 1 0
2 John 4 1
3 John 5 1
4 John 6 0
5 Bob 1 1
6 Bob 3 1
7 Bob 5 0
8 Mark 2 0
9 Mark 7 1
10 Mark 5 0
I'd like to select from this table all answers to a specific question (say, IdQuestion = 5), and also the last question each user answered just before question number 5.
In the end I need a table that should look like this:
IdAnswer IdUser IdQuestion Answer
2 John 4 1
3 John 5 1
6 Bob 3 1
7 Bob 5 0
9 Mark 7 1
10 Mark 5 0
I've managed to make this work using a cursor to iterate through each line from the first SELECT result (which filters by IdQuestion), but I'm not sure if this is the best (and fastest) way of doing it. Is there any more efficient way of achieving the same result?
And by the way, I'm using SQL Server Management Studio 2012.
Here is one way, using the LEAD function (available from SQL Server 2012):
select * from
(
    select *, NextQ = Lead(IdQuestion) over (partition by IdUser order by IdAnswer)
    from Yourtable
) a
where 5 in (IdQuestion, NextQ)
For older versions:
;WITH cte
AS (SELECT prev_id = Min(CASE WHEN IdQuestion = 5 THEN rn - 1 END) OVER( partition BY IdUser),*
FROM (SELECT rn = Row_number()OVER(partition BY IdUser ORDER BY IdAnswer),*
FROM Yourtable)a)
SELECT *
FROM cte
WHERE rn IN ( prev_id, prev_id + 1 )
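To check the LEAD approach against the sample data from the question, here is a small sketch that runs it in an in-memory SQLite database via Python's sqlite3 module. SQLite is only a stand-in for SQL Server (it needs version 3.25 or later for window functions), and the table name Answers is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; the table name Answers is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Answers ("
    "IdAnswer INTEGER PRIMARY KEY, IdUser TEXT, IdQuestion INTEGER, Answer INTEGER)"
)
conn.executemany(
    "INSERT INTO Answers VALUES (?, ?, ?, ?)",
    [
        (1, "John", 1, 0), (2, "John", 4, 1), (3, "John", 5, 1), (4, "John", 6, 0),
        (5, "Bob", 1, 1), (6, "Bob", 3, 1), (7, "Bob", 5, 0),
        (8, "Mark", 2, 0), (9, "Mark", 7, 1), (10, "Mark", 5, 0),
    ],
)

# For each row, look ahead to the next question the same user answered.
# A row is kept if it is question 5 itself or is immediately followed by question 5.
lead_rows = conn.execute(
    """
    SELECT IdAnswer, IdUser, IdQuestion, Answer
    FROM (
        SELECT *,
               LEAD(IdQuestion) OVER (PARTITION BY IdUser ORDER BY IdAnswer) AS NextQ
        FROM Answers
    ) AS a
    WHERE 5 IN (IdQuestion, NextQ)
    ORDER BY IdAnswer
    """
).fetchall()

for row in lead_rows:
    print(row)
```

The last row of each user has a NULL NextQ, so it only qualifies when it is itself an answer to question 5.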

How to optimize query

I have the same problem as mentioned in In SQL, how to select the top 2 rows for each group. The answer is working fine. But it takes too much time. How to optimize this query?
Example:
sample_table
act_id: act_cnt:
1 1
2 1
3 1
4 1
5 1
6 3
7 3
8 3
9 4
a 4
b 4
c 4
d 4
e 4
Now I want to group it (or use some other approach) and select 2 rows from each group. Sample output:
act_id: act_cnt:
1 1
2 1
6 3
7 3
9 4
a 4
I am new to SQL. How to do it?
The answer you linked to uses an inefficient workaround for MySQL's lack of window functions.
Using a window function is most probably much faster as you only need to read the table once:
select name,
score
from (
select name,
score,
dense_rank() over (partition by name order by score desc) as rnk
from the_table
) t
where rnk <= 2;
SQLFiddle: http://sqlfiddle.com/#!15/b0198/1
Having an index on (name, score) should speed up this query.
Edit after the question (and the problem) has been changed
select act_id,
act_cnt
from (
select act_id,
act_cnt,
row_number() over (partition by act_cnt order by act_id) as rn
from sample_table
) t
where rn <= 2;
New SQLFiddle: http://sqlfiddle.com/#!15/fc44b/1
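The row_number() query can be tried on the sample data with the sketch below. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in (an assumption; the linked fiddle is a separate environment), and it stores act_id as text so that the values a to e sort after the digits.

```python
import sqlite3

# In-memory SQLite is used only for illustration; act_id is stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_table (act_id TEXT, act_cnt INTEGER)")
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?)",
    [
        ("1", 1), ("2", 1), ("3", 1), ("4", 1), ("5", 1),
        ("6", 3), ("7", 3), ("8", 3),
        ("9", 4), ("a", 4), ("b", 4), ("c", 4), ("d", 4), ("e", 4),
    ],
)

# Number the rows inside each act_cnt group, then keep the first two per group.
top_two_rows = conn.execute(
    """
    SELECT act_id, act_cnt
    FROM (
        SELECT act_id,
               act_cnt,
               ROW_NUMBER() OVER (PARTITION BY act_cnt ORDER BY act_id) AS rn
        FROM sample_table
    ) AS t
    WHERE rn <= 2
    ORDER BY act_cnt, act_id
    """
).fetchall()

for row in top_two_rows:
    print(row)
```

Unlike dense_rank(), row_number() never assigns the same number twice within a group, so each group contributes at most two rows even when the ordering column has ties.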

how does one implement max(count(field_1)) in Hive?

I have run a query in Hive whose result gives me 2 columns (year and count).
1900 2
1901 5
1902 7
1903 3
1904 5
I need to find the maximum count and return both the year and the count;
expecting answer 1902 7
I ran a nested query like in SQL but it gives me a parse error saying "..cannot recognize input 'select'in expression specification.."
Can anyone let me know? Thanks.
regards,
Rahul
Use the collect_max UDF from Brickhouse ( http://github.com/klout/brickhouse ), which returns the keys and values with the maximum values:
select collect_max( year, count , 1 )
from mytable;
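Or, if you want separate columns (note that the inner result needs the map_max column alias and the subquery needs its own alias):
select array_index( map_keys( map_max ), 0 ) as max_year,
array_index( map_values( map_max ), 0 ) as max_value
from
( select collect_max( year, count, 1 ) as map_max from mytable ) t;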
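If installing a UDF is not an option, a plainer alternative (not the answer's method) is to sort by the count in descending order and keep only the first row; Hive supports ORDER BY and LIMIT for this. The sketch below demonstrates the idea in an in-memory SQLite database via Python's sqlite3 module rather than in Hive, and the table name year_counts and column name cnt are hypothetical.

```python
import sqlite3

# SQLite stands in for Hive here; year_counts and cnt are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE year_counts (year INTEGER, cnt INTEGER)")
conn.executemany(
    "INSERT INTO year_counts VALUES (?, ?)",
    [(1900, 2), (1901, 5), (1902, 7), (1903, 3), (1904, 5)],
)

# Sort by the count, largest first, and keep only the top row.
top_row = conn.execute(
    "SELECT year, cnt FROM year_counts ORDER BY cnt DESC LIMIT 1"
).fetchone()

print(top_row)
```

If several years share the maximum count, this returns only one of them, chosen arbitrarily unless a tie-breaking column is added to the ORDER BY.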

T-SQL table variable data order

I have a UDF which returns table variable like
--
--
RETURNS @ElementTable TABLE
(
ElementID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
ElementValue VARCHAR(MAX)
)
AS
--
--
Is the order of data in this table variable guaranteed to be the same as the order in which the data is inserted into it? E.g. if I issue
INSERT INTO @ElementTable(ElementValue) VALUES ('1')
INSERT INTO @ElementTable(ElementValue) VALUES ('2')
INSERT INTO @ElementTable(ElementValue) VALUES ('3')
I expect data will always be returned in that order when I say
select ElementValue from @ElementTable --Here I don't use order by
EDIT:
If the order is not guaranteed, then the following query
SELECT T1.ElementValue,T2.ElementValue FROM dbo.MyFunc() T1
Cross Apply dbo.MyFunc() T2
order by t1.elementid
will not produce the 3x3 matrix (9 rows)
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
consistently.
Is there any possibility that it could be like
1 2
1 1
1 3
2 3
2 2
2 1
3 1
3 2
3 3
How to do it using my above function?
No, the order is not guaranteed to be the same.
Unless, of course you are using ORDER BY. Then it is guaranteed to be the same.
Given your update, you obtain it in the obvious way - you ask the system to give you the results in the order you want:
SELECT T1.ElementValue,T2.ElementValue FROM dbo.MyFunc() T1
Cross join dbo.MyFunc() T2
order by t1.elementid, t2.elementid
If your UDF uses (inefficient) single-row inserts, you are guaranteed that the IDENTITY values will match the order in which the individual INSERT statements were specified.
Order is not guaranteed.
But if all you want is simply to get your records back in the same order you inserted them, then just order by your primary key. Since you already have that field set up as an auto-increment, it should suffice.
...or use a deterministic function
SELECT TOP 9
M1 = (ROW_NUMBER() OVER(ORDER BY id) + 2) / 3,
M2 = (ROW_NUMBER() OVER(ORDER BY id) + 2) % 3 + 1
FROM
sysobjects
M1 M2
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
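The advice to sort explicitly on both identity columns can be sketched as follows. This uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for the SQL Server UDF (an assumption), with an ordinary table named ElementTable and an AUTOINCREMENT key playing the role of the IDENTITY column.

```python
import sqlite3

# SQLite stands in for SQL Server; ElementTable is an ordinary table here,
# and AUTOINCREMENT plays the role of IDENTITY(1,1).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ElementTable ("
    "ElementID INTEGER PRIMARY KEY AUTOINCREMENT, ElementValue TEXT)"
)
# Single-row inserts, issued in the order the values should later appear.
for value in ("1", "2", "3"):
    conn.execute("INSERT INTO ElementTable (ElementValue) VALUES (?)", (value,))

# Sorting on both key columns is what fixes the row order of the cross join.
matrix_rows = conn.execute(
    """
    SELECT t1.ElementValue, t2.ElementValue
    FROM ElementTable AS t1
    CROSS JOIN ElementTable AS t2
    ORDER BY t1.ElementID, t2.ElementID
    """
).fetchall()

for row in matrix_rows:
    print(row)
```

Ordering by t1.ElementID alone would leave the order of the three rows within each group unspecified, which is exactly the situation the question asks about.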

simplest way to do recursive t-sql for multiple selects

I am developing a Bill Of Materials cost calculator program and I am struggling to fathom a simple solution to some recursive selects I want.
I am using SQL Server 2005 for this part of the application.
Say I have Product A, which contains assembly B, and Part C. Assembly B will contain parts D and E, but, here is where I struggle, D and or E may contain X number of other assemblies.
I can do something along the lines of;
SELECT * FROM TBLBOM WHERE Parent = A
UNION
SELECT * FROM TBLBOM WHERE Parent = B
UNION
SELECT * FROM TBLBOM WHERE Parent = C
To produce something along the lines of;
PARENT COMP COST
A X £1
B D £0.5
B E £0.5
....
C Y £1
But let's say Component D is made up of Components F and G. How would I accommodate this in a T-SQL statement?
In a nutshell, I need to expand out the full component list of all assemblies that are associated to a parent product regardless of whether they are in a sub assembly or a sub assembly of a sub assembly etc...
Ideally I would like to avoid a cursor at all costs :)
Any help / guidance would be appreciated.
Thank you.
EDIT;
As requested, here is the table structure and expected output. The parent is the DRAWINGNO and the child node is the PART (which could also be a parent in itself);
BOMID DRAWINGNO ITEM PART COST
1303 HGR05180 1 HGR05370 1
1304 HGR05180 2 HGF65050 4
1305 HGR05180 3 HGF50340 1
1312 HGR05370 1 HPN05075 1
1313 HGR05370 2 HPN05085 2
1314 HGR05370 3 HPN05080 1
1848 EXP-18G 1 HGR05180 1
1849 EXP-18G 2 HGR05210 3
1850 EXP-18G 3 HGR05230 1
1851 EXP-18G 4 HGR05140 1
1852 EXP-18G 5 HGR05150 2
1853 EXP-18G 6 HGR05050 1
1854 EXP-18G 7 ESC05350 1
1855 EXP-18G 8 ESC05330 3
1856 EXP-18G 9 HGR05360 1
1857 EXP-18G 10 HGR05370 2
1858 EXP-18G 11 ESC05640 1
If I understand correctly (and without the table structure), you can try something like this:
DECLARE @Table TABLE(
    Component VARCHAR(50),
    Parent VARCHAR(50),
    Cost FLOAT
)
INSERT INTO @Table SELECT 'B', 'A', 1
INSERT INTO @Table SELECT 'C', 'B', 2
INSERT INTO @Table SELECT 'C', 'B', 3
INSERT INTO @Table SELECT 'D', 'C', 4
DECLARE @Product VARCHAR(50)
SET @Product = 'A'
;WITH Selects AS (
    -- anchor member: direct children of the chosen product
    SELECT *
    FROM @Table
    WHERE Parent = @Product
    UNION ALL
    -- recursive member: children of every component found so far
    SELECT t.*
    FROM @Table t INNER JOIN
    Selects s ON t.Parent = s.Component
)
SELECT *
FROM Selects
You want to be using recursive common table expressions (CTEs). Books Online has a lot of information on how to use these; in the index, look up CTEs and pick the "Recursive Queries Using Common Table Expressions" entry. (I've had problems before linking to BOL online, or I'd try to link it here.)
Also, if you post your table structure, you should get half a dozen examples within five minutes. Better yet, try and search SO for prior examples.
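Applied to the table the asker posted in the edit, a recursive CTE might look like the sketch below. It runs in an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server 2005 (an assumption): SQLite spells the construct WITH RECURSIVE, whereas T-SQL uses plain WITH. The Depth column is an illustrative addition, and the ITEM column is omitted for brevity.

```python
import sqlite3

# SQLite stands in for SQL Server; rows are copied from the question's edit
# (the ITEM column is left out).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TBLBOM (BOMID INTEGER, DRAWINGNO TEXT, PART TEXT, COST REAL)"
)
conn.executemany(
    "INSERT INTO TBLBOM VALUES (?, ?, ?, ?)",
    [
        (1303, "HGR05180", "HGR05370", 1), (1304, "HGR05180", "HGF65050", 4),
        (1305, "HGR05180", "HGF50340", 1), (1312, "HGR05370", "HPN05075", 1),
        (1313, "HGR05370", "HPN05085", 2), (1314, "HGR05370", "HPN05080", 1),
        (1848, "EXP-18G", "HGR05180", 1), (1849, "EXP-18G", "HGR05210", 3),
        (1850, "EXP-18G", "HGR05230", 1), (1851, "EXP-18G", "HGR05140", 1),
        (1852, "EXP-18G", "HGR05150", 2), (1853, "EXP-18G", "HGR05050", 1),
        (1854, "EXP-18G", "ESC05350", 1), (1855, "EXP-18G", "ESC05330", 3),
        (1856, "EXP-18G", "HGR05360", 1), (1857, "EXP-18G", "HGR05370", 2),
        (1858, "EXP-18G", "ESC05640", 1),
    ],
)

# Anchor member: direct parts of the top-level drawing.
# Recursive member: parts of any part that is itself a drawing.
bom_rows = conn.execute(
    """
    WITH RECURSIVE Bom AS (
        SELECT DRAWINGNO, PART, COST, 1 AS Depth
        FROM TBLBOM
        WHERE DRAWINGNO = ?
        UNION ALL
        SELECT t.DRAWINGNO, t.PART, t.COST, b.Depth + 1
        FROM TBLBOM AS t
        JOIN Bom AS b ON t.DRAWINGNO = b.PART
    )
    SELECT DRAWINGNO, PART, COST, Depth
    FROM Bom
    ORDER BY Depth, DRAWINGNO, PART
    """,
    ("EXP-18G",),
).fetchall()

for row in bom_rows:
    print(row)
```

Because HGR05370 appears both directly under EXP-18G and inside HGR05180, its three parts are listed twice, once per path. That is usually what a cost roll-up needs, but this sketch does not multiply costs down the tree.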