PIVOT not producing data on a single row - tsql

I am trying to write a PIVOT to generate a single row of data from what currently sits as multiple rows in the DB. The DB data looks like this:
txtSchoolID txtSubjectArchivedName intSubjectID intGradeID intGradeTransposeValue
95406288448 History 7 634 2
95406288448 History 7 635 2
95406288448 History 7 636 2
95406288448 History 7 637 2
95406288448 History 7 638 2
95406288448 History 7 639 2
95406288448 History 7 640 2
95406288448 History 7 641 2
95406288448 History 7 642 2
95406288448 History 7 643 2
What I want to get to is 1 row for each subject and SchoolID with the grades listed as columns.
I have written the following pivot:
SELECT intSubjectID, txtSchoolID, [636] AS Effort, [637] AS Focus, [638] AS Participation, [639] AS Groupwork, [640] AS Rigour, [641] AS Curiosity, [642] AS Initiative,
[643] AS SelfOrganisation, [644] as Perserverance
FROM (SELECT txtSchoolID, intReportTypeID, txtSubjectArchivedName, intSubjectID, intReportProgress, txtTitle, txtForename, txtPreName, txtMiddleNames,
txtSurname, txtGender, txtForm, intNCYear, txtSubmitByTitle, txtSubmitByPreName, txtSubmitByFirstname, txtSubmitByMiddleNames,
txtSubmitBySurname, txtCurrentSubjectName, txtCurrentSubjectReportName, intReportCycleID, txtReportCycleName, intReportCycleType,
intPreviousReportCycle, txtReportCycleShortName, intReportCycleTerm, intReportCycleAcademicYear, dtReportCycleStartDate,
dtReportCycleFinishDate, dtReportCyclePrintDate, txtReportTermName, dtReportTermStartDate, dtReportTermFinishDate,
intGradeID, txtGradingName, txtGradingOptions, txtShortGradingName, txtGrade, intGradeTransposeValue FROM VwReportsManagementAcademicReports) p
PIVOT
(MAX (intGradeTransposeValue)
FOR intGradeID IN ([636], [637], [638], [639], [640], [641], [642], [643], [644] )
) AS pvt
WHERE (intReportCycleID = 142) AND (intReportProgress = 1)
However, this is producing this
intSubjectID txtSchoolID Effort Focus Participation Groupwork Rigour Curiosity Initiative SelfOrganisation Perserverance
8 74001484142 NULL NULL NULL NULL NULL NULL NULL NULL NULL
8 74001484142 NULL NULL NULL NULL NULL 2 NULL NULL NULL
8 74001484142 3 NULL NULL NULL NULL NULL NULL NULL NULL
8 74001484142 NULL 2 NULL NULL NULL NULL NULL NULL NULL
8 74001484142 NULL NULL NULL 2 NULL NULL NULL NULL NULL
8 74001484142 NULL NULL NULL NULL NULL NULL 2 NULL NULL
8 74001484142 NULL NULL 2 NULL NULL NULL NULL NULL NULL
8 74001484142 NULL NULL NULL NULL NULL NULL NULL NULL 2
8 74001484142 NULL NULL NULL NULL 2 NULL NULL NULL NULL
8 74001484142 NULL NULL NULL NULL NULL NULL NULL 2 NULL
What I want is
intSubjectID txtSchoolID Effort Focus Participation Groupwork Rigour Curiosity Initiative SelfOrganisation Perserverance
8 74001484142 3 2 2 2 2 2 2 2 2
Is there a way to get it like this?
I have never tried a PIVOT before; this is my first time, so all help is welcome.

I think the reason you got the unexpected result is that the sub-query's SELECT includes so many unwanted columns, and the pivot groups by all of them. The WHERE filter also needs to move into the sub-query, since any extra column left in the pivot source acts as a grouping column. Your query might be very close to your ideal result; try:
SELECT intSubjectID, txtSchoolID, [636] AS Effort, [637] AS Focus, [638] AS Participation, [639] AS Groupwork, [640] AS Rigour, [641] AS Curiosity, [642] AS Initiative,
[643] AS SelfOrganisation, [644] AS Perserverance
FROM (SELECT intSubjectID, txtSchoolID, intGradeID, intGradeTransposeValue --only the columns the pivot and output need
FROM VwReportsManagementAcademicReports
WHERE intReportCycleID = 142 AND intReportProgress = 1) p
PIVOT
(MAX (intGradeTransposeValue)
FOR intGradeID IN ([636], [637], [638], [639], [640], [641], [642], [643], [644] )
) AS pvt

As per the comment above, the solution was:
strip the inner select down to only the columns that will be used in the pivot and are expected in the output (intSubjectID, txtSchoolID, intGradeTransposeValue, and intGradeID). All other columns will act as grouping columns in the output and can cause this type of non-grouped result.

PIVOT alone won't give you what you're asking for here, but you can use another approach:
--test dataset
declare @test as table
( txtSchoolID bigint,
txtSubjectArchivedName varchar(10),
intSubjectID int,
intGradeID int,
intGradeTransposeValue int)
insert into @test
Values
(95406288448,'History',7,634,2),
(95406288448,'History',7,635,2),
(95406288448,'History',7,636,2),
(95406288448,'History',7,637,2),
(95406288448,'History',7,638,2),
(95406288448,'History',7,639,2),
(95406288448,'History',7,640,2),
(95406288448,'History',7,641,2),
(95406288448,'History',7,642,2),
(95406288448,'History',7,643,2)
--conditional aggregation
select intSubjectID,
txtSchoolID,
max(case when intGradeID = 636 then intGradeTransposeValue end) AS Effort,
max(case when intGradeID = 637 then intGradeTransposeValue end) AS Focus,
max(case when intGradeID = 638 then intGradeTransposeValue end) AS Participation,
max(case when intGradeID = 639 then intGradeTransposeValue end) AS Groupwork,
max(case when intGradeID = 640 then intGradeTransposeValue end) AS Rigour,
max(case when intGradeID = 641 then intGradeTransposeValue end) AS Curiosity,
max(case when intGradeID = 642 then intGradeTransposeValue end) AS Initiative,
max(case when intGradeID = 643 then intGradeTransposeValue end) AS SelfOrganisation,
max(case when intGradeID = 644 then intGradeTransposeValue end) as Perserverance
from @test
group by intSubjectID, txtSchoolID
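The conditional-aggregation answer can be checked end to end. Below is a sketch of the same query in SQLite via Python's sqlite3 (the original is T-SQL; only three of the nine grade columns are shown to keep it short):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (
    txtSchoolID            INTEGER,
    txtSubjectArchivedName TEXT,
    intSubjectID           INTEGER,
    intGradeID             INTEGER,
    intGradeTransposeValue INTEGER
);
INSERT INTO test VALUES
(95406288448,'History',7,634,2),
(95406288448,'History',7,635,2),
(95406288448,'History',7,636,2),
(95406288448,'History',7,637,2),
(95406288448,'History',7,638,2),
(95406288448,'History',7,639,2),
(95406288448,'History',7,640,2),
(95406288448,'History',7,641,2),
(95406288448,'History',7,642,2),
(95406288448,'History',7,643,2);
""")

# MAX(CASE ...) folds the ten grade rows into one row per subject/school;
# a grade with no source row (644 here) comes back as NULL.
rows = conn.execute("""
SELECT intSubjectID,
       txtSchoolID,
       MAX(CASE WHEN intGradeID = 636 THEN intGradeTransposeValue END) AS Effort,
       MAX(CASE WHEN intGradeID = 637 THEN intGradeTransposeValue END) AS Focus,
       MAX(CASE WHEN intGradeID = 644 THEN intGradeTransposeValue END) AS Perserverance
FROM test
GROUP BY intSubjectID, txtSchoolID
""").fetchall()
print(rows)   # [(7, 95406288448, 2, 2, None)]
```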

Related

How can i get all rows from two tables Postgres

I have a problem with JOIN of two tables.
CREATE table appointment(
idappointment serial primary key,
idday int references days(idday),
worktime text
);
create table booking(
idbooking serial,
idappointment int references appointment(idappointment),
date date,
primary key(idappointment)
);
appointment
idappointment idday worktime
1 1 07:00-08:00
2 1 08:00-09:00
3 1 09:00-10:00
4 2 09:00-10:00
booking
idbooking idappointment date
1 1 2021-08-22
1 2 2021-08-2
And I want:
idappointment idday worktime idbooking idappointment date
1 1 07:00-08:00 null null null
2 1 08:00-09:00 null null null
3 1 09:00-10:00 null null null
4 2 09:00-10:00 null null null
null null null 1 1 2021-08-22
null null null 1 2 2021-08-2
1 1 07:00-08:00 1 1 2021-08-22
2 1 08:00-09:00 1 2 2021-08-2
How can I get it?
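No answer is recorded here, but one reading of the desired output is: every appointment padded with NULLs on the right, every booking padded with NULLs on the left, then the matched pairs. A UNION ALL of those three selects reproduces it. Below is a sketch in SQLite via Python (a plain FULL OUTER JOIN would instead return each matched pair only once, without the padded copies):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE appointment (idappointment INTEGER PRIMARY KEY, idday INTEGER, worktime TEXT);
CREATE TABLE booking     (idbooking INTEGER, idappointment INTEGER, date TEXT);
INSERT INTO appointment VALUES (1,1,'07:00-08:00'),(2,1,'08:00-09:00'),
                               (3,1,'09:00-10:00'),(4,2,'09:00-10:00');
INSERT INTO booking VALUES (1,1,'2021-08-22'),(1,2,'2021-08-2');
""")

# Three branches: appointments alone, bookings alone, matched pairs.
rows = conn.execute("""
SELECT a.idappointment, a.idday, a.worktime, NULL, NULL, NULL FROM appointment a
UNION ALL
SELECT NULL, NULL, NULL, b.idbooking, b.idappointment, b.date FROM booking b
UNION ALL
SELECT a.idappointment, a.idday, a.worktime, b.idbooking, b.idappointment, b.date
FROM appointment a JOIN booking b ON b.idappointment = a.idappointment
""").fetchall()
print(len(rows))   # 8 rows, as in the desired output
```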

Get email frequency using grouping sets

I have a table with email logs and need to check the email frequency: using grouping sets, retrieve the email count hourly, daily, weekly, monthly and yearly.
Can anyone send me an example?
Thank you!
SELECT DATEPART(yyyy,create_time) [year]
, DATEPART(mm,create_time) [month]
, DATEPART(WEEK,create_time) [week]
, DATEPART(dd,create_time) [day]
, DATEPART(hour,create_time) [hour]
, COUNT(*) AS c
FROM your_table
GROUP BY GROUPING SETS (DATEPART(yyyy,create_time)
,DATEPART(mm,create_time)
,DATEPART(WEEK,create_time)
,DATEPART(dd,create_time)
,DATEPART(hour,create_time))
ORDER BY [year], [month], [week], [day], [hour]
To group by multiple fields, just group columns in parentheses:
,(DATEPART(yyyy,create_time), DATEPART(mm,create_time))
Here's sample output, including the year/month grouping:
year month week day hour c
NULL NULL NULL NULL 0 12
NULL NULL NULL NULL 1 1
NULL NULL NULL 1 NULL 219
NULL NULL NULL 2 NULL 467
NULL NULL 1 NULL NULL 124
NULL NULL 2 NULL NULL 216
NULL 1 NULL NULL NULL 1899
NULL 2 NULL NULL NULL 1419
2015 NULL NULL NULL NULL 3750
2016 NULL NULL NULL NULL 7446
2015 8 NULL NULL NULL 391
2015 9 NULL NULL NULL 891
Thank you for the reply!
I have added an ID column to the grouping sets, but I can't tell whether a row was generated by the hourly, daily, weekly, or monthly set.
SELECT ID, DATEPART(yyyy,create_time) [year]
, DATEPART(mm,create_time) [month]
, DATEPART(WEEK,create_time) [week]
, DATEPART(dd,create_time) [day]
, DATEPART(hour,create_time) [hour]
, COUNT(*) AS c
FROM your_table
GROUP BY GROUPING SETS(
(DATEPART(yyyy,create_time),ID)
,(DATEPART(mm,create_time),ID)
,(DATEPART(WEEK,create_time),ID)
,(DATEPART(dd,create_time),ID)
,(DATEPART(hour,create_time) ,ID))
ORDER BY [year], [month], [week], [day], [hour],ID
Result
Month Week Daily Hour C
NULL NULL NULL 0 12
NULL NULL NULL 0 471
NULL NULL NULL 0 176
NULL NULL NULL 0 145
NULL NULL NULL 0 633
NULL NULL NULL 0 13
NULL NULL NULL 0 24
NULL NULL NULL 0 2
NULL NULL NULL 0 324
NULL NULL NULL 0 555
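In T-SQL, the usual way to tell which grouping set produced a row is the GROUPING() / GROUPING_ID() functions. SQLite has no GROUPING SETS at all, so the sketch below simulates them with a UNION ALL of GROUP BY queries plus a literal label column, which answers the same "which grain is this row?" question. The table name email_log and its contents are hypothetical stand-ins for your_table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE email_log (create_time TEXT);  -- hypothetical stand-in for your_table
INSERT INTO email_log VALUES
('2015-08-01 00:15:00'),('2015-08-01 01:20:00'),
('2015-09-02 00:05:00'),('2016-01-01 10:00:00');
""")

# Each branch is one "grouping set"; the 'grain' literal plays the role
# of GROUPING_ID(), labeling which set produced the row.
rows = conn.execute("""
SELECT 'year' AS grain, strftime('%Y', create_time) AS bucket, COUNT(*) AS c
FROM email_log GROUP BY strftime('%Y', create_time)
UNION ALL
SELECT 'month', strftime('%Y-%m', create_time), COUNT(*)
FROM email_log GROUP BY strftime('%Y-%m', create_time)
UNION ALL
SELECT 'hour', strftime('%H', create_time), COUNT(*)
FROM email_log GROUP BY strftime('%H', create_time)
""").fetchall()
for grain, bucket, c in rows:
    print(grain, bucket, c)
```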

Fun with row_number() - Redshift Postgres - Time sequence and restarting numbering

I am looking to number streaks within my data, the goal is to find where at least 3 consecutive streaks are flagged by the np.
Here is a subset of my data:
drop table if exists bi_adhoc.test;
create table bi_adhoc.test (id varchar(12), rd date, np decimal);
insert into bi_adhoc.test
select 'aaabbbccc', '2016-07-25'::date, 0 union all
select 'aaabbbccc', '2016-08-01'::date, 0 union all
select 'aaabbbccc', '2016-08-08'::date, 0 union all
select 'aaabbbccc', '2016-08-15'::date, 0 union all
select 'aaabbbccc', '2016-08-22'::date, 1 union all
select 'aaabbbccc', '2016-08-29'::date, 0 union all
select 'aaabbbccc', '2016-09-05'::date, 1 union all
select 'aaabbbccc', '2016-09-12'::date, 0 union all
select 'aaabbbccc', '2016-09-19'::date, 1;
I am hoping to use row_number() and count(), but it doesn't seem to be giving me the result I want.
select
*
,row_number() over (partition by t.id order by t.rd) all_ctr
,count(t.id) over (partition by t.id) all_count
,row_number() over (partition by t.id,t.np order by t.rd) np_counter
,count(t.id) over (partition by t.id,t.np) np_non_np
from
bi_adhoc.test t
order by
t.rd;
Here are my results, and the desired result:
id rd np all_ctr all_count np_counter np_non_np **Desired**
aaabbbccc 7/25/2016 0 1 9 1 6 **1**
aaabbbccc 8/1/2016 0 2 9 2 6 **2**
aaabbbccc 8/8/2016 0 3 9 3 6 **3**
aaabbbccc 8/15/2016 0 4 9 4 6 **4**
aaabbbccc 8/22/2016 1 5 9 1 3 **1**
aaabbbccc 8/29/2016 0 6 9 5 6 **1**
aaabbbccc 9/5/2016 1 7 9 2 3 **1**
aaabbbccc 9/12/2016 0 8 9 6 6 **1**
aaabbbccc 9/19/2016 1 9 9 3 3 **1**
One way to do this would be to calculate the lag (np) value in a CTE, and then compare the current np and lagged np to detect a streak. This may not be the most optimal way, but seems to work fine.
with source_cte as
(
select
*
,row_number() over (partition by t.id order by t.rd) row_num
,lag(np,1) over (partition by t.id order by t.rd) as prev_np
from
bi_adhoc.test t
)
, streak_cte as
(
select
*,
case when np=prev_np or prev_np is NULL then 1 else 0 end as is_streak
from
source_cte
)
select
*,
case when is_streak=1 then dense_rank() over (partition by id, is_streak order by rd) else 1 end as desired
from
streak_cte
order by
rd;
First, I added some additional data to help fully illustrate the problem...
drop table if exists bi_adhoc.test;
create table bi_adhoc.test (id varchar(12),period date,hit decimal);
insert into bi_adhoc.test
select 'aaabbbccc', '2016-07-25'::date, 0 union all
select 'aaabbbccc', '2016-08-01'::date, 0 union all
select 'aaabbbccc', '2016-08-08'::date, 0 union all
select 'aaabbbccc', '2016-08-15'::date, 1 union all
select 'aaabbbccc', '2016-08-22'::date, 1 union all
select 'aaabbbccc', '2016-08-29'::date, 0 union all
select 'aaabbbccc', '2016-09-05'::date, 0 union all
select 'aaabbbccc', '2016-09-12'::date, 1 union all
select 'aaabbbccc', '2016-09-19'::date, 0 union all
select 'aaabbbccc', '2016-09-26'::date, 1 union all
select 'aaabbbccc', '2016-10-03'::date, 1 union all
select 'aaabbbccc', '2016-10-10'::date, 1 union all
select 'aaabbbccc', '2016-10-17'::date, 1 union all
select 'aaabbbccc', '2016-10-24'::date, 1 union all
select 'aaabbbccc', '2016-10-31'::date, 0 union all
select 'aaabbbccc', '2016-11-07'::date, 0 union all
select 'aaabbbccc', '2016-11-14'::date, 0 union all
select 'aaabbbccc', '2016-11-21'::date, 0 union all
select 'aaabbbccc', '2016-11-28'::date, 0 union all
select 'aaabbbccc', '2016-12-05'::date, 1 union all
select 'aaabbbccc', '2016-12-12'::date, 1;
Then the key was to figure out what a streak was and how to identify each streak, so I had something to partition the data on.
select
*
,case
when t1.hit = 1 then row_number() over (partition by t1.id,t1.hit_partition order by t1.period)
when t1.hit = 0 then row_number() over (partition by t1.id,t1.miss_partition order by t1.period)
else null
end desired
from
(
select
*
,row_number() over (partition by t.id order by t.id,t.period)
,case
when t.hit = 1 then row_number() over (partition by t.id, t.hit order by t.period)
else null
end hit_counter
,case
when t.hit = 1 then row_number() over (partition by t.id order by t.id,t.period) - row_number() over (partition by t.id, t.hit order by t.period)
else null
end hit_partition
,case
when t.hit = 0 then row_number() over (partition by t.id, t.hit order by t.period)
else null
end miss_counter
,case
when t.hit = 0 then row_number() over (partition by t.id order by t.id,t.period) - row_number() over (partition by t.id, t.hit order by t.period)
else null
end miss_partition
from
bi_adhoc.test t
) t1
order by
t1.id
,t1.period;
The result of this:
id period hit row_number hit_counter hit_partition miss_counter miss_partition desired
aaabbbccc 2016-07-25 0 1 NULL NULL 1 0 1
aaabbbccc 2016-08-01 0 2 NULL NULL 2 0 2
aaabbbccc 2016-08-08 0 3 NULL NULL 3 0 3
aaabbbccc 2016-08-15 1 4 1 3 NULL NULL 1
aaabbbccc 2016-08-22 1 5 2 3 NULL NULL 2
aaabbbccc 2016-08-29 0 6 NULL NULL 4 2 1
aaabbbccc 2016-09-05 0 7 NULL NULL 5 2 2
aaabbbccc 2016-09-12 1 8 3 5 NULL NULL 1
aaabbbccc 2016-09-19 0 9 NULL NULL 6 3 1
aaabbbccc 2016-09-26 1 10 4 6 NULL NULL 1
aaabbbccc 2016-10-03 1 11 5 6 NULL NULL 2
aaabbbccc 2016-10-10 1 12 6 6 NULL NULL 3
aaabbbccc 2016-10-17 1 13 7 6 NULL NULL 4
aaabbbccc 2016-10-24 1 14 8 6 NULL NULL 5
aaabbbccc 2016-10-31 0 15 NULL NULL 7 8 1
aaabbbccc 2016-11-07 0 16 NULL NULL 8 8 2
aaabbbccc 2016-11-14 0 17 NULL NULL 9 8 3
aaabbbccc 2016-11-21 0 18 NULL NULL 10 8 4
aaabbbccc 2016-11-28 0 19 NULL NULL 11 8 5
aaabbbccc 2016-12-05 1 20 9 11 NULL NULL 1
aaabbbccc 2016-12-12 1 21 10 11 NULL NULL 2
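The core of both answers is the gaps-and-islands trick: the difference between the overall ROW_NUMBER() and the per-np ROW_NUMBER() is constant within a streak, so it can serve as a partition key. A condensed, runnable sketch of that trick (SQLite via Python, using the question's original 9-row dataset):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (id TEXT, rd TEXT, np INTEGER);
INSERT INTO test VALUES
('aaabbbccc','2016-07-25',0),('aaabbbccc','2016-08-01',0),
('aaabbbccc','2016-08-08',0),('aaabbbccc','2016-08-15',0),
('aaabbbccc','2016-08-22',1),('aaabbbccc','2016-08-29',0),
('aaabbbccc','2016-09-05',1),('aaabbbccc','2016-09-12',0),
('aaabbbccc','2016-09-19',1);
""")

# grp is constant within a run of equal np values and changes whenever
# np flips, so numbering over (id, np, grp) restarts at each new streak.
rows = conn.execute("""
WITH numbered AS (
    SELECT id, rd, np,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY rd)
         - ROW_NUMBER() OVER (PARTITION BY id, np ORDER BY rd) AS grp
    FROM test
)
SELECT rd, np,
       ROW_NUMBER() OVER (PARTITION BY id, np, grp ORDER BY rd) AS streak_ctr
FROM numbered
ORDER BY rd
""").fetchall()
for rd, np, ctr in rows:
    print(rd, np, ctr)
```

The streak counters come out as 1, 2, 3, 4 for the opening run of zeros and restart at 1 for every later flip, matching the **Desired** column in the question.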

Group by in pivot table in SQL Server 2008 R2

How can I group by this pivot table?
select *
from
(
    SELECT ProductionID, ProductionDetailID, [DeviceID], [DeviceSpeed], [LattNO]
    from
    (
        SELECT *
        from view_3
        where ProductionID = 6
    ) x
    pivot
    (
        max(Value) FOR PropertyName in ([DeviceID], [DeviceSpeed], [LattNO])
    ) AS pvt
) as pp
Result:
ProductionID ProductionDetailID DeviceID DeviceSpeed LattNO
6 2 5 NULL NULL
6 2 NULL 8 NULL
6 2 NULL NULL 6
6 3 1 NULL NULL
6 3 NULL 2 NULL
and how can I get this result:
ProductionID ProductionDetailID DeviceID DeviceSpeed LattNO
6 2 5 8 6
6 3 1 2 NULL
SELECT ProductionID, ProductionDetailID,
    Sum(Cast(IsNull([DeviceID], 0) as Int)) [DeviceID],
    Sum(Cast(IsNull([DeviceSpeed], 0) as Int)) [DeviceSpeed],
    Case Sum(Cast(IsNull([LattNO], 0) as Int))
        When 0 then Null
        else Sum(Cast(IsNull([LattNO], 0) as Int))
    End [LattNO]
FROM
(
    SELECT * FROM dbo.View_3
) x
pivot
(
    max(Value) FOR PropertyName in ([DeviceID], [DeviceSpeed], [LattNO])
) AS pvt
Group by ProductionID, ProductionDetailID
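Since MAX() ignores NULLs, the ISNULL/CAST gymnastics can usually be replaced by a plain MAX() per column over the already-pivoted rows: a column stays NULL only when no row supplied a value. A sketch in SQLite via Python, starting from the intermediate rows shown in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The already-pivoted intermediate rows from the question.
CREATE TABLE pvt (ProductionID INTEGER, ProductionDetailID INTEGER,
                  DeviceID INTEGER, DeviceSpeed INTEGER, LattNO INTEGER);
INSERT INTO pvt VALUES
(6,2,5,NULL,NULL),(6,2,NULL,8,NULL),(6,2,NULL,NULL,6),
(6,3,1,NULL,NULL),(6,3,NULL,2,NULL);
""")

# MAX() skips NULLs, so one GROUP BY collapses each detail to a single row;
# LattNO for detail 3 has no value anywhere and stays NULL.
rows = conn.execute("""
SELECT ProductionID, ProductionDetailID,
       MAX(DeviceID) AS DeviceID, MAX(DeviceSpeed) AS DeviceSpeed, MAX(LattNO) AS LattNO
FROM pvt
GROUP BY ProductionID, ProductionDetailID
ORDER BY ProductionDetailID
""").fetchall()
print(rows)   # [(6, 2, 5, 8, 6), (6, 3, 1, 2, None)]
```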

Ignore null values when using SQL Server 2012's Last_Value() function

I am using SQL Server 2012 and have a table of values that look like this. It is populated with event data.
FldType Date Price Size
--------------------------------------------
2 2012-08-22 00:02:01 9140 1048
0 2012-08-22 00:02:02 9140 77
1 2012-08-22 00:02:03 9150 281
2 2012-08-22 00:02:04 9140 1090
0 2012-08-22 00:02:05 9150 1
1 2012-08-22 00:02:06 9150 324
2 2012-08-22 00:02:07 9140 1063
I would like to track the latest value for each of the 3 field types (0, 1, 2) so that the final output looks like this.
Date Price0 Size0 Price1 Size1 Price2 Size2
-----------------------------------------------------------------
2012-08-22 00:02:01 NULL NULL NULL NULL 9140 1048
2012-08-22 00:02:02 9140 77 NULL NULL 9140 1048
2012-08-22 00:02:03 9140 77 9150 281 9140 1048
2012-08-22 00:02:04 9140 77 9150 281 9140 1090
2012-08-22 00:02:05 9150 1 9150 281 9140 1090
2012-08-22 00:02:06 9150 1 9150 324 9140 1090
2012-08-22 00:02:07 9150 1 9150 324 9140 1063
Unfortunately, it is not ignoring subsequent null values so I get this instead.
Date Price0 Size0 Price1 Size1 Price2 Size2
-----------------------------------------------------------------
2012-08-22 00:02:01 NULL NULL NULL NULL 9140 1048
2012-08-22 00:02:02 9140 77 NULL NULL NULL NULL
2012-08-22 00:02:03 NULL NULL 9150 281 NULL NULL
2012-08-22 00:02:04 NULL NULL NULL NULL 9140 1090
2012-08-22 00:02:05 9150 1 NULL NULL NULL NULL
2012-08-22 00:02:06 NULL NULL 9150 324 NULL NULL
2012-08-22 00:02:07 NULL NULL NULL NULL 9140 1063
My current query looks like this
SELECT [Date],
LAST_VALUE(Price0) OVER (PARTITION BY FldType ORDER BY [Date] ) AS Price0,
LAST_VALUE(Size0) OVER (PARTITION BY FldType ORDER BY [Date]) AS Size0,
LAST_VALUE(Price1) OVER (PARTITION BY FldType ORDER BY [Date] ) AS Price1,
LAST_VALUE(Size1) OVER (PARTITION BY FldType ORDER BY [Date]) AS Size1,
LAST_VALUE(Price2) OVER (PARTITION BY FldType ORDER BY [Date] ) AS Price2,
LAST_VALUE(Size2) OVER (PARTITION BY FldType ORDER BY [Date]) AS Size2
FROM (
SELECT FldType, [Date], Price, Size,
CASE WHEN FldType = 0 THEN Price END as Price0,
CASE WHEN FldType = 0 THEN Size END as Size0,
CASE WHEN FldType = 1 THEN Price END as Price1,
CASE WHEN FldType = 1 THEN Size END as Size1,
CASE WHEN FldType = 2 THEN Price END as Price2,
CASE WHEN FldType = 2 THEN Size END as Size2
FROM [RawData].[dbo].[Events]
) as T1
ORDER BY [Date]
Is there some way to have SQL Server 2012 ignore null values when determining the latest value? Or is there a better approach that doesn't use the LAST_VALUE() function?
To summarize, I am trying to achieve two things:
Split the Price and Size columns into 6 columns (2 columns x 3 field types)
Keep track of the latest value in each of these columns.
Any suggestions would be appreciated.
I'm not sure you can do it with LAST_VALUE, unless maybe you add a PIVOT.
Also, you need to treat Size and Price separately because they come from different rows. So, this achieves what you want by breaking it down.
DECLARE @source TABLE (FldType int, DateCol DateTime, Price int, Size int);
INSERT @source VALUES
(2, '2012-08-22 00:02:01', 9140, 1048),(0, '2012-08-22 00:02:02', 9140, 77),
(1, '2012-08-22 00:02:03', 9150, 281),(2, '2012-08-22 00:02:04', 9140, 1090),
(0, '2012-08-22 00:02:05', 9150, 1),(1, '2012-08-22 00:02:06', 9150, 324),
(2, '2012-08-22 00:02:07', 9140, 1063);
SELECT
S.DateCol, Xp0.Price0, Xs0.Size0, Xp1.Price1, Xs1.Size1, Xp2.Price2, Xs2.Size2
FROM
@source S
OUTER APPLY
(SELECT TOP 1 S0.Price AS Price0 FROM @source S0 WHERE S0.FldType = 0 AND S0.DateCol <= S.DateCol ORDER BY S0.DateCol DESC) Xp0
OUTER APPLY
(SELECT TOP 1 S1.Price AS Price1 FROM @source S1 WHERE S1.FldType = 1 AND S1.DateCol <= S.DateCol ORDER BY S1.DateCol DESC) Xp1
OUTER APPLY
(SELECT TOP 1 S2.Price AS Price2 FROM @source S2 WHERE S2.FldType = 2 AND S2.DateCol <= S.DateCol ORDER BY S2.DateCol DESC) Xp2
OUTER APPLY
(SELECT TOP 1 S0.Size AS Size0 FROM @source S0 WHERE S0.FldType = 0 AND S0.DateCol <= S.DateCol ORDER BY S0.DateCol DESC) Xs0
OUTER APPLY
(SELECT TOP 1 S1.Size AS Size1 FROM @source S1 WHERE S1.FldType = 1 AND S1.DateCol <= S.DateCol ORDER BY S1.DateCol DESC) Xs1
OUTER APPLY
(SELECT TOP 1 S2.Size AS Size2 FROM @source S2 WHERE S2.FldType = 2 AND S2.DateCol <= S.DateCol ORDER BY S2.DateCol DESC) Xs2
ORDER BY
DateCol;
The other way is to maintain a separate table, via triggers or some ETL, that does the summary for you.
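The OUTER APPLY idea ("latest row of each FldType at or before this timestamp") ports to most databases as correlated scalar subqueries. A sketch in SQLite via Python, with only the FldType 0 and 2 columns shown for brevity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (FldType INTEGER, DateCol TEXT, Price INTEGER, Size INTEGER);
INSERT INTO events VALUES
(2,'2012-08-22 00:02:01',9140,1048),(0,'2012-08-22 00:02:02',9140,77),
(1,'2012-08-22 00:02:03',9150,281),(2,'2012-08-22 00:02:04',9140,1090),
(0,'2012-08-22 00:02:05',9150,1),(1,'2012-08-22 00:02:06',9150,324),
(2,'2012-08-22 00:02:07',9140,1063);
""")

# Each output column picks the most recent row of its FldType at or before
# the current timestamp; before any such row exists, it is NULL.
rows = conn.execute("""
SELECT e.DateCol,
       (SELECT Price FROM events s WHERE s.FldType = 0 AND s.DateCol <= e.DateCol
        ORDER BY s.DateCol DESC LIMIT 1) AS Price0,
       (SELECT Size  FROM events s WHERE s.FldType = 0 AND s.DateCol <= e.DateCol
        ORDER BY s.DateCol DESC LIMIT 1) AS Size0,
       (SELECT Price FROM events s WHERE s.FldType = 2 AND s.DateCol <= e.DateCol
        ORDER BY s.DateCol DESC LIMIT 1) AS Price2
FROM events e
ORDER BY e.DateCol
""").fetchall()
for r in rows:
    print(r)
```

The first row has NULLs for the type-0 columns (no type-0 event has happened yet), and each later row carries the most recent type-0 and type-2 values forward, which is exactly the fill-forward behavior the question asked for.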