TSQL Question About Update Column with Hierarchy in Value


I have a table with two columns, Hierarchy (in alphabetical order) and Access, as shown below.
N Hierarchy Access
1 A Y
2 A >B N
3 A >B >C NULL
4 A >B >C >D NULL
5 A >B >C >D >E NULL
6 A >B >C >D >E >F NULL
7 A >B >C >D >E >F >G Y
8 A >B >C >D >E >F >G >J NULL
I need to update the Access column with this logic: if the Access value is NULL, update the row's Access with the Access of the nearest higher level in the hierarchy whose Access is not NULL.
For example:
Row 8's Access is NULL, so the query will update Row 8's Access to Y, because hierarchy G is Y.
Row 6 will be N, because hierarchies E, D and C are NULL, and B is N (not NULL).
Row 5 will be N, because hierarchies D and C are NULL, and B is N (not NULL).
Row 4 will be N, because hierarchy C is NULL, and B is N (not NULL).
So the desired output would look like this:
N Hierarchy Access
1 A Y
2 A >B N
3 A >B >C N
4 A >B >C >D N
5 A >B >C >D >E N
6 A >B >C >D >E >F N
7 A >B >C >D >E >F >G Y
8 A >B >C >D >E >F >G >J Y
How can I achieve this? Thank you.

Honestly, I'm going to start by repeating my comment:
I really suggest fixing your design. Storing delimited data like this is not a good idea. SQL Server has a built-in hierarchyid data type, and if you don't want to use that, you are far better off using a primary and foreign key relationship.
This isn't pretty to achieve with your denormalised data. You have to use LIKE expressions and then get rid of the unwanted JOINed rows with a "TOP 1 in each group" approach, by doing something like this:
CREATE TABLE dbo.DelimitedHierarchy (N int NOT NULL,
Hierarchy varchar(200) NOT NULL,
Access char(1) NULL);
GO
INSERT INTO dbo.DelimitedHierarchy (N, Hierarchy, Access)
VALUES (1,'A','Y'),
(2,'A >B','N'),
(3,'A >B >C',NULL),
(4,'A >B >C >D',NULL),
(5,'A >B >C >D >E',NULL),
(6,'A >B >C >D >E >F',NULL),
(7,'A >B >C >D >E >F >G','Y'),
(8,'A >B >C >D >E >F >G >J',NULL);
GO
WITH rCTE AS(
    SELECT DH.N,
           DH.Hierarchy,
           DH.Access,
           1 AS Level
    FROM dbo.DelimitedHierarchy DH
    WHERE DH.Hierarchy NOT LIKE '%>%'
    UNION ALL
    SELECT DH.N,
           DH.Hierarchy,
           ISNULL(DH.Access,r.Access),
           r.Level + 1
    FROM dbo.DelimitedHierarchy DH
         JOIN rCTE r ON DH.Hierarchy LIKE r.Hierarchy + ' >%')
SELECT TOP (1) WITH TIES
       *
FROM rCTE r
ORDER BY ROW_NUMBER() OVER (PARTITION BY Hierarchy ORDER BY Level DESC);
GO
DROP TABLE dbo.DelimitedHierarchy;
If you had a properly normalised data set (without using hierarchyid), then the query would look like this:
CREATE TABLE dbo.NormalisedHierarchy (N int NOT NULL,
value char(1) NOT NULL,
parent char(1) NULL,
Access char(1) NULL);
GO
INSERT INTO dbo.NormalisedHierarchy (N, Value, Parent, Access)
VALUES (1,'A',NULL,'Y'),
(2,'B','A','N'),
(3,'C','B',NULL),
(4,'D','C',NULL),
(5,'E','D',NULL),
(6,'F','E',NULL),
(7,'G','F','Y'),
(8,'J','G',NULL);
GO
WITH rCTE AS(
    SELECT NH.N,
           NH.[value],
           NH.Access,
           1 AS Level
    FROM dbo.NormalisedHierarchy NH
    WHERE NH.parent IS NULL
    UNION ALL
    SELECT NH.N,
           NH.[value],
           ISNULL(NH.Access,r.Access),
           r.Level + 1
    FROM dbo.NormalisedHierarchy NH
         JOIN rCTE r ON NH.parent = r.[value])
SELECT *
FROM rCTE r
ORDER BY [Level] ASC;
GO
DROP TABLE dbo.NormalisedHierarchy;
Alternatively, implement a hierarchyid. I haven't included a solution for that, as I have never worked with that type; I have always used self-referencing normalised tables.
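For anyone who wants to check the inheritance rule itself outside of SQL, here is a minimal Python sketch. It assumes the same ` >` delimiter as the sample data, and the function name `inherit_access` is made up for illustration; it is not part of the answer above.

```python
def inherit_access(rows, sep=" >"):
    """Fill each missing Access with the nearest ancestor's non-null Access.

    rows: list of (hierarchy_path, access_or_None) tuples.
    """
    known = {path: access for path, access in rows}
    result = {}
    for path, access in rows:
        ancestor = path
        # Walk up one level at a time until a non-null Access is found
        # or the root of the path is reached.
        while access is None and sep in ancestor:
            ancestor = ancestor.rsplit(sep, 1)[0]
            access = known.get(ancestor)
        result[path] = access
    return result


rows = [
    ("A", "Y"),
    ("A >B", "N"),
    ("A >B >C", None),
    ("A >B >C >D", None),
    ("A >B >C >D >E", None),
    ("A >B >C >D >E >F", None),
    ("A >B >C >D >E >F >G", "Y"),
    ("A >B >C >D >E >F >G >J", None),
]
filled = inherit_access(rows)
```

This is the same "take the closest non-NULL parent" idea that the ISNULL in the recursive CTE expresses; here it walks up from each row instead of down from the root.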

Related

Overwriting group of values with in same column another set of group based on other column group

Input:
Name GroupId Processed NewGroupId NgId
Mike 1 N 9 NULL
Mikes 1 N 9 NULL
Miken 5 Y 9 5
Mikel 5 Y 9 5
Output:
Name GroupId Processed NewGroupId NgId
Mike 1 N 9 5
Mikes 1 N 9 5
Miken 5 Y 9 5
Mikel 5 Y 9 5
The query below worked in SQL Server, but the same query does not work in Spark SQL because of the correlated subquery.
Is there an alternative, either with Spark SQL or a PySpark DataFrame?
SELECT Name,groupid,IsProcessed,ngid,
CASE WHEN ngid IS NULL THEN
COALESCE((SELECT top 1 ngid FROM temp D
WHERE D.NewGroupId = T.NewGroupId AND
D.ngid IS NOT NULL ), null)
ELSE ngid
END AS ngid
FROM temp T

It worked with the following in Spark SQL:
spark.sql("select LKUP,groupid,IsProcessed,NewGroupId ,coalesce((select Max(D.ngid) from test2 D where D.NewGroupId = T.NewGroupId AND D.ngid is not null),null) as ngid from test2 T")
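If the correlated subquery is the sticking point, one possible alternative is to pre-aggregate one NgId per NewGroupId and join it back. The sketch below runs that query through Python's built-in sqlite3 module purely so it can be executed anywhere; SQLite is only a stand-in here, and whether the same statement behaves identically in Spark SQL is an assumption that should be checked. The table name `test2` is borrowed from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test2 (Name TEXT, GroupId INT, IsProcessed TEXT,"
    " NewGroupId INT, NgId INT)"
)
conn.executemany(
    "INSERT INTO test2 VALUES (?, ?, ?, ?, ?)",
    [
        ("Mike", 1, "N", 9, None),
        ("Mikes", 1, "N", 9, None),
        ("Miken", 5, "Y", 9, 5),
        ("Mikel", 5, "Y", 9, 5),
    ],
)

# Pre-aggregate one non-null NgId per NewGroupId, then join it back.
# MAX ignores NULLs, so no extra IS NOT NULL filter is needed.
query = """
SELECT t.Name, t.GroupId, t.IsProcessed, t.NewGroupId,
       COALESCE(t.NgId, g.NgId) AS NgId
FROM test2 AS t
LEFT JOIN (SELECT NewGroupId, MAX(NgId) AS NgId
           FROM test2
           GROUP BY NewGroupId) AS g
       ON g.NewGroupId = t.NewGroupId
ORDER BY t.rowid
"""
result = conn.execute(query).fetchall()
```

The join form avoids a subquery that references the outer row, which is the part the question says Spark SQL rejected.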

PIVOT not producing data on a single row

I am trying to write PIVOT to generate a row of data that originally sits as multiple rows in the DB. The DB data looks like this (appended)
txtSchoolID txtSubjectArchivedName intSubjectID intGradeID intGradeTransposeValue
95406288448 History 7 634 2
95406288448 History 7 635 2
95406288448 History 7 636 2
95406288448 History 7 637 2
95406288448 History 7 638 2
95406288448 History 7 639 2
95406288448 History 7 640 2
95406288448 History 7 641 2
95406288448 History 7 642 2
95406288448 History 7 643 2
What I want to get to is 1 row for each subject and SchoolID with the grades listed as columns.
I have written the following pivot:
SELECT intSubjectID, txtSchoolID, [636] AS Effort, [637] AS Focus, [638] AS Participation, [639] AS Groupwork, [640] AS Rigour, [641] AS Curiosity, [642] AS Initiative,
[643] AS SelfOrganisation, [644] as Perserverance
FROM (SELECT txtSchoolID, intReportTypeID, txtSubjectArchivedName, intSubjectID, intReportProgress, txtTitle, txtForename, txtPreName, txtMiddleNames,
txtSurname, txtGender, txtForm, intNCYear, txtSubmitByTitle, txtSubmitByPreName, txtSubmitByFirstname, txtSubmitByMiddleNames,
txtSubmitBySurname, txtCurrentSubjectName, txtCurrentSubjectReportName, intReportCycleID, txtReportCycleName, intReportCycleType,
intPreviousReportCycle, txtReportCycleShortName, intReportCycleTerm, intReportCycleAcademicYear, dtReportCycleStartDate,
dtReportCycleFinishDate, dtReportCyclePrintDate, txtReportTermName, dtReportTermStartDate, dtReportTermFinishDate,
intGradeID, txtGradingName, txtGradingOptions, txtShortGradingName, txtGrade, intGradeTransposeValue FROM VwReportsManagementAcademicReports) p
PIVOT
(MAX (intGradeTransposeValue)
FOR intGradeID IN ([636], [637], [638], [639], [640], [641], [642], [643], [644] )
) AS pvt
WHERE (intReportCycleID = 142) AND (intReportProgress = 1)
However, this is producing this
intSubjectID txtSchoolID Effort Focus Participation Groupwork Rigour Curiosity Initiative SelfOrganisation Perserverance
8 74001484142 NULL NULL NULL NULL NULL NULL NULL NULL NULL
8 74001484142 NULL NULL NULL NULL NULL 2 NULL NULL NULL
8 74001484142 3 NULL NULL NULL NULL NULL NULL NULL NULL
8 74001484142 NULL 2 NULL NULL NULL NULL NULL NULL NULL
8 74001484142 NULL NULL NULL 2 NULL NULL NULL NULL NULL
8 74001484142 NULL NULL NULL NULL NULL NULL 2 NULL NULL
8 74001484142 NULL NULL 2 NULL NULL NULL NULL NULL NULL
8 74001484142 NULL NULL NULL NULL NULL NULL NULL NULL 2
8 74001484142 NULL NULL NULL NULL 2 NULL NULL NULL NULL
8 74001484142 NULL NULL NULL NULL NULL NULL NULL 2 NULL
What I want is
intSubjectID txtSchoolID Effort Focus Participation Groupwork Rigour Curiosity Initiative SelfOrganisation Perserverance
8 74001484142 3 2 2 2 2 2 2 2 2
Is there a way to get it like this?
I have never tried a PIVOT before, this is my first time, so all help is welcome.

I think the reason you got the unexpected result is that the sub-query selects many unwanted columns, and the PIVOT groups by them too.
Your query is very close to what you want. Try:
SELECT intSubjectID, txtSchoolID, [636] AS Effort, [637] AS Focus, [638] AS Participation, [639] AS Groupwork, [640] AS Rigour, [641] AS Curiosity, [642] AS Initiative,
[643] AS SelfOrganisation, [644] as Perserverance
FROM (SELECT intSubjectID, txtSchoolID, intGradeID, intGradeTransposeValue --just the columns the pivot needs
FROM VwReportsManagementAcademicReports
WHERE (intReportCycleID = 142) AND (intReportProgress = 1)) p --filter here, these columns no longer reach the outer query
PIVOT
(MAX (intGradeTransposeValue)
FOR intGradeID IN ([636], [637], [638], [639], [640], [641], [642], [643], [644] )
) AS pvt

As per the comment above, the solution was:
Try stripping the inner select down to only the columns that will be used in the pivot and are expected in the output: intSubjectID, txtSchoolID, intGradeTransposeValue and intGradeID. All other columns will act as grouping columns in the output and can cause this type of ungrouped output.
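To see why the extra columns matter, here is a small Python sketch of the grouping that PIVOT does implicitly: every source column that is neither the pivot column nor the value column becomes part of the grouping key. The `pivot` helper and the `txtGradingName` values are made up for illustration; only the column names come from the question.

```python
def pivot(rows, key_cols, pivot_col, value_col):
    """Group by key_cols and spread pivot_col values into columns (MAX per cell)."""
    out = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        cells = out.setdefault(key, {})
        col, val = row[pivot_col], row[value_col]
        cells[col] = val if col not in cells else max(cells[col], val)
    return out


# Hypothetical sample: txtGradingName differs per grade row.
rows = [
    {"intSubjectID": 8, "txtSchoolID": 74001484142, "txtGradingName": "Effort",
     "intGradeID": 636, "intGradeTransposeValue": 3},
    {"intSubjectID": 8, "txtSchoolID": 74001484142, "txtGradingName": "Focus",
     "intGradeID": 637, "intGradeTransposeValue": 2},
    {"intSubjectID": 8, "txtSchoolID": 74001484142, "txtGradingName": "Participation",
     "intGradeID": 638, "intGradeTransposeValue": 2},
]

# An extra column in the grouping key: one output row per grade.
wide = pivot(rows, ["intSubjectID", "txtSchoolID", "txtGradingName"],
             "intGradeID", "intGradeTransposeValue")

# Only the wanted columns in the key: a single output row.
narrow = pivot(rows, ["intSubjectID", "txtSchoolID"],
               "intGradeID", "intGradeTransposeValue")
```

With the extra column in the key there are three groups, each with a single filled cell; with only intSubjectID and txtSchoolID there is one group holding all three grades.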

PIVOT is not the only way to get what you are asking for; you can use another approach:
--test dataset
declare @test as table
( txtSchoolID bigint,
txtSubjectArchivedName varchar(10),
intSubjectID int,
intGradeID int,
intGradeTransposeValue int)
insert into @test
Values
(95406288448,'History',7,634,2),
(95406288448,'History',7,635,2),
(95406288448,'History',7,636,2),
(95406288448,'History',7,637,2),
(95406288448,'History',7,638,2),
(95406288448,'History',7,639,2),
(95406288448,'History',7,640,2),
(95406288448,'History',7,641,2),
(95406288448,'History',7,642,2),
(95406288448,'History',7,643,2)
--conditional aggregation: max() returns the grade value itself, NULL when the grade is missing
select intSubjectID,
txtSchoolID,
max(case when intGradeID = 636 then intGradeTransposeValue end) AS Effort,
max(case when intGradeID = 637 then intGradeTransposeValue end) AS Focus,
max(case when intGradeID = 638 then intGradeTransposeValue end) AS Participation,
max(case when intGradeID = 639 then intGradeTransposeValue end) AS Groupwork,
max(case when intGradeID = 640 then intGradeTransposeValue end) AS Rigour,
max(case when intGradeID = 641 then intGradeTransposeValue end) AS Curiosity,
max(case when intGradeID = 642 then intGradeTransposeValue end) AS Initiative,
max(case when intGradeID = 643 then intGradeTransposeValue end) AS SelfOrganisation,
max(case when intGradeID = 644 then intGradeTransposeValue end) as Perserverance
from @test
group by intSubjectID, txtSchoolID

PostgreSQL window function & difference between dates

Suppose I have data formatted in the following way (FYI, total row count is over 30K):
customer_id order_date order_rank
A 2017-02-19 1
A 2017-02-24 2
A 2017-03-31 3
A 2017-07-03 4
A 2017-08-10 5
B 2016-04-24 1
B 2016-04-30 2
C 2016-07-18 1
C 2016-09-01 2
C 2016-09-13 3
I need a 4th column, let's call it days_since_last_order, which is 0 where order_rank = 1 and otherwise the number of days since the previous order (the one with rank n-1).
So, the above would return:
customer_id order_date order_rank days_since_last_order
A 2017-02-19 1 0
A 2017-02-24 2 5
A 2017-03-31 3 35
A 2017-07-03 4 94
A 2017-08-10 5 38
B 2016-04-24 1 0
B 2016-04-30 2 6
C 2016-07-18 1 0
C 2016-09-01 2 45
C 2016-09-13 3 12
Is there an easier way to calculate the above with a window function (or similar) rather than join the entire dataset against itself (eg. on A.order_rank = B.order_rank - 1) and doing the calc?
Thanks!

Use the LAG window function:
SELECT
customer_id
, order_date
, order_rank
, COALESCE(
DATE(order_date)
- DATE(LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date))
, 0) AS days_since_last_order
FROM <table_name>
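As a cross-check of the expected numbers, the same "previous row within the customer" logic can be sketched in plain Python. This is only an illustration of what LAG plus COALESCE computes, not a replacement for the query; the function name is made up.

```python
from datetime import date


def days_since_last_order(orders):
    """orders: iterable of (customer_id, order_date) tuples, in any order."""
    previous = {}  # last order date seen so far, per customer
    result = []
    # Sorting by (customer_id, order_date) plays the role of
    # PARTITION BY customer_id ORDER BY order_date.
    for customer_id, order_date in sorted(orders):
        last = previous.get(customer_id)
        # No earlier order for this customer: 0, like COALESCE(..., 0).
        gap = 0 if last is None else (order_date - last).days
        result.append((customer_id, order_date, gap))
        previous[customer_id] = order_date
    return result


orders = [
    ("A", date(2017, 2, 19)), ("A", date(2017, 2, 24)),
    ("A", date(2017, 3, 31)), ("A", date(2017, 7, 3)),
    ("A", date(2017, 8, 10)),
    ("B", date(2016, 4, 24)), ("B", date(2016, 4, 30)),
    ("C", date(2016, 7, 18)), ("C", date(2016, 9, 1)),
    ("C", date(2016, 9, 13)),
]
gaps = [gap for _, _, gap in days_since_last_order(orders)]
```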

One SQL Stored Procedure to get cut off date of two different cut off date format

I have one system that reads from two client databases. The two clients use different cut-off date formats:
1) Client A: the 15th of every month. Example: 15-12-2016.
2) Client B: the first day of every month. Example: 1-1-2017.
The cut-off dates are stored in the table as below:
Now I need a single query to retrieve the client's cut-off date for the current month. For instance, if today is 15-2-2017, the expected cut-off dates for the two clients are:
1) Client A: 15-1-2017
2) Client B: 1-2-2017
How can I accomplish this in a single stored procedure? For client B, I can always take the first day of the month, but this doesn't work for client A, since their cut-off falls on last month's date.

Something like this might be what you are looking for:
DECLARE @DummyClient TABLE(ID INT IDENTITY,ClientName VARCHAR(100));
DECLARE @DummyDates TABLE(ClientID INT,YourDate DATE);
INSERT INTO @DummyClient VALUES
('A'),('B');
INSERT INTO @DummyDates VALUES
(1,{d'2016-12-15'}),(2,{d'2017-01-01'});
WITH Numbers AS
( SELECT 0 AS Nr
UNION ALL SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
UNION ALL SELECT 6
UNION ALL SELECT 7
UNION ALL SELECT 8
UNION ALL SELECT 9
UNION ALL SELECT 10
UNION ALL SELECT 11
UNION ALL SELECT 12
UNION ALL SELECT 13
UNION ALL SELECT 14
UNION ALL SELECT 15
UNION ALL SELECT 16
UNION ALL SELECT 17
UNION ALL SELECT 18
UNION ALL SELECT 19
UNION ALL SELECT 20
UNION ALL SELECT 21
UNION ALL SELECT 22
UNION ALL SELECT 23
UNION ALL SELECT 24
)
,ClientExt AS
(
SELECT c.*
,MIN(d.YourDate) AS MinDate
FROM @DummyClient AS c
INNER JOIN @DummyDates AS d ON c.ID=d.ClientID
GROUP BY c.ID,c.ClientName
)
SELECT ID,ClientName,D
FROM ClientExt
CROSS APPLY(SELECT DATEADD(MONTH,Numbers.Nr,MinDate)
FROM Numbers) AS RunningDate(D);
The result
ID Cl Date
1 A 2016-12-15
1 A 2017-01-15
1 A 2017-02-15
1 A 2017-03-15
1 A 2017-04-15
1 A 2017-05-15
1 A 2017-06-15
1 A 2017-07-15
1 A 2017-08-15
1 A 2017-09-15
1 A 2017-10-15
1 A 2017-11-15
1 A 2017-12-15
1 A 2018-01-15
1 A 2018-02-15
1 A 2018-03-15
1 A 2018-04-15
1 A 2018-05-15
1 A 2018-06-15
1 A 2018-07-15
1 A 2018-08-15
1 A 2018-09-15
1 A 2018-10-15
1 A 2018-11-15
1 A 2018-12-15
2 B 2017-01-01
2 B 2017-02-01
2 B 2017-03-01
2 B 2017-04-01
2 B 2017-05-01
2 B 2017-06-01
2 B 2017-07-01
2 B 2017-08-01
2 B 2017-09-01
2 B 2017-10-01
2 B 2017-11-01
2 B 2017-12-01
2 B 2018-01-01
2 B 2018-02-01
2 B 2018-03-01
2 B 2018-04-01
2 B 2018-05-01
2 B 2018-06-01
2 B 2018-07-01
2 B 2018-08-01
2 B 2018-09-01
2 B 2018-10-01
2 B 2018-11-01
2 B 2018-12-01
2 B 2019-01-01
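The series above still leaves the step of picking the cut-off that applies today. A rough Python sketch of that last step follows. It rests on one loud assumption: the applicable cut-off is the latest one strictly earlier than today, which is what the question's example for 15-2-2017 implies for client A, but the question does not state the rule explicitly. The helper names are made up.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the month's end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def cutoff_series(first_cutoff, count=25):
    """Monthly cut-off dates starting at the client's first stored cut-off."""
    return [add_months(first_cutoff, n) for n in range(count)]


def latest_cutoff_before(first_cutoff, today):
    # ASSUMPTION: "current" means the latest cut-off strictly before today.
    earlier = [d for d in cutoff_series(first_cutoff) if d < today]
    return max(earlier) if earlier else None


today = date(2017, 2, 15)
client_a = latest_cutoff_before(date(2016, 12, 15), today)
client_b = latest_cutoff_before(date(2017, 1, 1), today)
```

If the rule should instead be "on or before today", changing `d < today` to `d <= today` is the only edit needed.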

How to generate a date to be included in UNPIVOT results without a loop?

Say I had an example like so, where I'm transposing columns into rows with UNPIVOT.
DECLARE @pvt AS TABLE (VendorID int, Emp1 int, Emp2 int, Emp3 int, Emp4 int, Emp5 int);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (1,4,3,5,4,4);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (2,4,1,5,5,5);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (3,4,3,5,4,4);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (4,4,2,5,5,4);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (5,5,1,5,5,5);
--Unpivot the table.
SELECT VendorID, Employee, Orders
FROM
(SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5
FROM @pvt) p
UNPIVOT
(Orders FOR Employee IN
(Emp1, Emp2, Emp3, Emp4, Emp5)
)AS unpvt;
GO
Which produces results like this
VendorID Employee Orders
1 Emp1 4
1 Emp2 3
1 Emp3 5
1 Emp4 4
1 Emp5 4
2 Emp1 4
2 Emp2 1
2 Emp3 5
2 Emp4 5
2 Emp5 5
3 Emp1 4
3 Emp2 3
3 Emp3 5
3 Emp4 4
3 Emp5 4
However, I want to include an "incremental" date that repeats in a group for each vendor, so the results would look like this:
VendorID Employee Orders OrderDate
1 Emp1 4 01/01/2014
1 Emp2 3 02/01/2014
1 Emp3 5 03/01/2014
1 Emp4 4 04/01/2014
1 Emp5 4 05/01/2014
2 Emp1 4 ..
2 Emp2 1
2 Emp3 5
2 Emp4 5
2 Emp5 5
3 Emp1 4
3 Emp2 3
3 Emp3 5
3 Emp4 4
3 Emp5 4
The kicker is that I want to try to do this without resorting to a loop, since the transposed results are going to be about 100K records. Is there a way to generate that date field without looping over the results?
[edit]
I think, but am not sure yet, that this post might help, using ROW_NUMBER.

You can use:
DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY VendorID ORDER BY Employee) - 1, @startdate)
Following your example, this partitions by VendorID and orders by Employee, but you can change the ordering just like a regular ORDER BY. ROW_NUMBER() starts at 1, so subtracting 1 gives the first row of each vendor @startdate itself.
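To make the effect of the partitioned row number concrete, here is a small Python sketch of the same idea: number the rows within each vendor and add that offset to a start date. It assumes daily increments starting on 2014-01-01, which is one reading of the dates in the question; the function name is made up.

```python
from datetime import date, timedelta
from itertools import groupby


def add_order_dates(rows, start_date):
    """rows: (vendor_id, employee, orders) tuples; returns rows plus a date."""
    out = []
    # Sort by vendor, then employee: PARTITION BY VendorID ORDER BY Employee.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[0]):
        # enumerate starts at 0, matching ROW_NUMBER() - 1.
        for offset, (vendor_id, employee, orders) in enumerate(group):
            out.append((vendor_id, employee, orders,
                        start_date + timedelta(days=offset)))
    return out


rows = [
    (1, "Emp1", 4), (1, "Emp2", 3), (1, "Emp3", 5),
    (2, "Emp1", 4), (2, "Emp2", 1),
]
dated = add_order_dates(rows, date(2014, 1, 1))
```

The date sequence restarts for every vendor, which is the "repeats in a group" behaviour the question asks for, and no explicit loop over the result set is needed on the SQL side.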
