How to order UNPIVOT - tsql

I have the following UNPIVOT code and I would like to order it by the FactSheetSummary columns so that when it is converted to rows it is order 1 - 12:
INSERT INTO #Results
SELECT DISTINCT ReportingDate, PortfolioID,ISIN, PortfolioNme, Section,REPLACE(REPLACE(Risks,'‘',''''),'’','''')
FROM
(SELECT DISTINCT
ReportingDate
, PortfolioID
, ISIN
, PortfolioNme
, Section
, FactSheetSummary_1, FactSheetSummary_2, FactSheetSummary_3
, FactSheetSummary_4, FactSheetSummary_5, FactSheetSummary_6
, FactSheetSummary_7, FactSheetSummary_8, FactSheetSummary_9
, FactSheetSummary_10, FactSheetSummary_11, FactSheetSummary_12
FROM #WorkingTableFactsheet) p
UNPIVOT
(Risks FOR FactsheetSummary IN
( FactSheetSummary_1, FactSheetSummary_2, FactSheetSummary_3
, FactSheetSummary_4, FactSheetSummary_5, FactSheetSummary_6
, FactSheetSummary_7, FactSheetSummary_8, FactSheetSummary_9
, FactSheetSummary_10, FactSheetSummary_11, FactSheetSummary_12)
)AS unpvt;
--DELETE records where there are no Risk Narratives
DELETE FROM #Results
WHERE Risks = ''
SELECT
ReportingDate
, PortfolioID
, ISIN
, PortfolioNme
, Section
, Risks
, ROW_NUMBER() OVER(PARTITION BY ISIN,Section ORDER BY ISIN,Section,Risks) as SortOrder
FROM #Results
order by ISIN, Risks
Is it possible to do this? I thought the IN of the UNPIVOT would dictate the order? Do I need to add a column to dictate which I would like to be 1 through to 12?

Just use case expression to order:
order by case FactsheetSummary
when 'FactSheetSummary_1' then 1
when 'FactSheetSummary_2' then 2
when 'FactSheetSummary_12' then 12 end

Related

How to collapse overlapping date periods with acceptable gaps using T-SQL?

We want to group our members' enrollments into "continuous enrollments," allowing for a gap of up to 45 days. I know how to use LEAD to determine if an enrollment should be grouped with the next, but I don't know how to group them. Would it be more appropriate to add 45 to the term date and subtract 45 from the effective date, then check for overlapping date periods? My goal is to have a SQL view that returns the results similar to the final query below. Thank you for your help.
SELECT '101' AS MemID, '2021-01-01' AS EffDate, '2021-01-31' AS TermDate INTO #T1 UNION
SELECT '101', '2021-02-01', '2021-02-28' UNION
SELECT '101', '2021-03-01', '2021-03-31' UNION
SELECT '101', '2021-06-01', '2021-06-30' UNION
SELECT '999', '2021-01-01', '2021-01-15' UNION
SELECT '999', '2021-09-01', '2021-09-28' UNION
SELECT '999', '2021-10-01', '2021-10-31'
SELECT *
, LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate) AS LeadEffDate
, DATEDIFF(DAY, TermDate, (LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate))) AS DaysToNextEnrollment
, CASE WHEN (DATEDIFF(DAY, TermDate, (LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate)))) <= 45 THEN 1 ELSE 0 END AS CombineWithNextRecord
FROM #T1
-- result objective
SELECT 101 AS MemID, '2021-01-01' AS EffDate, '2021-03-31' AS TermDate UNION
SELECT 101, '2021-06-01', '2021-06-30' UNION
SELECT 999, '2021-01-01', '2021-01-15' UNION
SELECT 999, '2021-09-01', '2021-10-31'
I think you are really close. Your question is very similar to
TSQL - creating from-to date table while ignoring in-between steps with conditions with a logic difference on what you want to consider to be the same group.
My basic approach is to use the LAG() function to figure out the previous values for MemID and TermDate and combine that with your 45 day rule to define a group. And finally get the first and last values of each group.
Here is my response to that question modified to your situation.
SELECT
a4.MemID
, CONVERT (DATE, a4.First_EffDate) AS [EffDate]
, CONVERT (DATE, a4.TermDate) AS [TermDate]
FROM (
SELECT
a3.MemID
, a3.EffDate
, a3.TermDate
, a3.MemID_group
, FIRST_VALUE (a3.EffDate) OVER (PARTITION BY a3.MemID_group ORDER BY a3.EffDate) AS [First_EffDate]
, ROW_NUMBER () OVER (PARTITION BY a3.MemID_group ORDER BY a3.EffDate DESC) AS [Row_number]
FROM (
SELECT
a2.MemID
, a2.EffDate
, a2.TermDate
, a2.Previous_MemID
, a2.Previous_TermDate
, a2.New_group
, SUM (a2.New_group) OVER (ORDER BY a2.MemID, a2.EffDate) AS [MemID_group]
FROM (
SELECT
a1.MemID
, a1.EffDate
, a1.TermDate
, a1.Previous_MemID
, a1.Previous_TermDate
---------------------------------------------------------------------------------
-- new group if the MemID is different from the previous row OR
-- if the MemID is the same as the previous row AND it has been more than 45 days
-- between the TermDate of the previous row and the EffDate of the current row
,
IIF((a1.MemID <> a1.Previous_MemID)
OR (
a1.MemID = a1.Previous_MemID
AND DATEDIFF (DAY, a1.Previous_TermDate, a1.EffDate) > 45
)
, 1
, 0) AS [New_group]
---------------------------------------------------------------------------------
FROM (
SELECT
MemID
, EffDate
, TermDate
, LAG (MemID) OVER (ORDER BY MemID) AS [Previous_MemID]
, LAG (TermDate) OVER (PARTITION BY MemID ORDER BY EffDate) AS [Previous_TermDate]
FROM #T1
) a1
) a2
) a3
) a4
WHERE a4.[Row_number] = 1;
Here is the dbfiddle.

SQL Server - Select with Group By together Raw_Number

I'm using SQL Server 2000 (80). So, it's not possible to use the LAG function.
I have a code a data set with four columns:
Purchase_Date
Facility_no
Seller_id
Sale_id
I need to identify missing Sale_ids. So every sale_id is a 100% sequential, so the should not be any gaps in order.
This code works for a specific date and store if specified. But i need to work on entire data set looping looping through every facility_id and every seller_id for ever purchase_date
declare #MAXCOUNT int
set #MAXCOUNT =
(
select MAX(Sale_Id)
from #table
where
Facility_no in (124) and
Purchase_date = '2/7/2020'
and Seller_id = 1
)
;WITH TRX_COUNT AS
(
SELECT 1 AS Number
union all
select Number + 1 from TRX_COUNT
where Number < #MAXCOUNT
)
select * from TRX_COUNT
where
Number NOT IN
(
select Sale_Id
from #table
where
Facility_no in (124)
and Purchase_Date = '2/7/2020'
and seller_id = 1
)
order by Number
OPTION (maxrecursion 0)
My Dataset
This column:
case when
Sale_Id=0 or 1=Sale_Id-LAG(Sale_Id) over (partition by Facility_no, Purchase_Date, Seller_id)
then 'OK' else 'Previous Missing' end
will tell you which Seller_Ids have some sale missing. If you want to go a step further and have exactly your desired output, then filter out and distinct the 'Previous Missing' ones, and join with a tally table on not exists.
Edit: OP mentions in comments they can't use LAG(). My suggestion, then, would be:
Make a temp table that that has the max(sale_id) group by facility/seller_id
Then you can get your missing results by this pseudocode query:
Select ...
from temptable t
inner join tally N on t.maxsale <=N.num
where not exists( select ... from sourcetable s where s.facility=t.facility and s.seller=t.seller and s.sale=N.num)
> because the only way to "construct" nonexisting combinations is to construct them all and just remove the existing ones.
This one worked out
; WITH cte_Rn AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY Facility_no, Purchase_Date, Seller_id ORDER BY Purchase_Date) AS [Rn_Num]
FROM (
SELECT
Facility_no,
Purchase_Date,
Seller_id,
Sale_id
FROM MyTable WITH (NOLOCK)
) a
)
, cte_Rn_0 as (
SELECT
Facility_no,
Purchase_Date,
Seller_id,
Sale_id,
-- [Rn_Num] AS 'Skipped Sale'
-- , case when Sale_id = 0 Then [Rn_Num] - 1 Else [Rn_Num] End AS 'Skipped Sale for 0'
, [Rn_Num] - 1 AS 'Skipped Sale for 0'
FROM cte_Rn a
)
SELECT
Facility_no,
Purchase_Date,
Seller_id,
Sale_id,
-- [Skipped Sale],
[Skipped Sale for 0]
FROM cte_Rn_0 a
WHERE NOT EXISTS
(
select * from cte_Rn_0 b
where b.Sale_id = a.[Skipped Sale for 0]
and a.Facility_no = b.Facility_no
and a.Purchase_Date = b.Purchase_Date
and a.Seller_id = b.Seller_id
)
--ORDER BY Purchase_Date ASC

GROUP BY column and clause in postgres

I would like group the columns of a table with by a column value as well as when another condition is met. For example, with the following table:
Events:
id session_id flags created_at ...
--------------------------------------------
1 100 OTHER ...
2 101 OTHER ...
3 101 NEW_SESSION ...
4 101 OTHER ...
5 101 NEW_SESSION ...
6 100 OTHER ...
7 102 OTHER ...
I want the following result:
session_id events_count first_event_id last_event_id
-------------------------------------------------------
100-0 2 1 6
101-0 1 2 2
101-1 2 3 4
101-2 1 5 5
102-0 1 7 7
The basic idea is that I want to extract sessions from events. They are grouped by session_id. I also want a new session whenever I have the flag NEW_SESSION.
The query is something like this:
SELECT ? as session_id
, count(id) as events_count
, MIN(id) as first_event_id
, MAX(id) last_event_id
GROUP BY session_id
-- , and whenever flags is NEW_SESSION
ORDER BY id
But I dont know how to express the group by condition properly. Any idea ?
Update 2
In the comments I've noticed that you want them unique. Then we can use a variable:
SET #inc := 0;
(
SELECT CONCAT(session_id, '-', !ABS(STRCMP(flags, 'NEW_SESSION'))) AS session_id
, COUNT(id) AS events_count
, MIN(id) AS first_event_id
, MAX(id) last_event_id
FROM events
WHERE flags != 'NEW_SESSION'
GROUP BY events.session_id, events.flags
ORDER BY events.id
) UNION (
SELECT CONCAT(session_id, '-', #inc := #inc + 1) AS session_id
, COUNT(id) AS events_count
, MIN(id) AS first_event_id
, MAX(id) last_event_id
FROM events
WHERE flags = 'NEW_SESSION'
GROUP by events.id
ORDER BY events.id
);
Update
The following prevents grouping for the NEW_SESSION rows:
(
SELECT CONCAT(session_id, '-', !ABS(STRCMP(flags, 'NEW_SESSION'))) AS session_id
, COUNT(id) AS events_count
, MIN(id) AS first_event_id
, MAX(id) last_event_id
FROM events
WHERE flags != 'NEW_SESSION'
GROUP BY events.session_id, events.flags
ORDER BY events.id
) UNION (
SELECT CONCAT(session_id, '-1') AS session_id
, COUNT(id) AS events_count
, MIN(id) AS first_event_id
, MAX(id) last_event_id
FROM events
WHERE flags = 'NEW_SESSION'
GROUP BY id
ORDER BY events.id
);
Original answer
As far as I understand, you are trying to group events by the session IDs and
"whether it's a NEW_SESSION" flag. If it's so, then I'd express it as follows:
SELECT CONCAT(session_id, '-', !ABS(STRCMP(flags, 'NEW_SESSION'))) AS session_id
, COUNT(id) AS events_count
, MIN(id) AS first_event_id
, MAX(id) last_event_id
FROM events
GROUP BY events.session_id, events.flags
ORDER BY events.id;

TSQL - LEAD for Next Different Row

Is there a way to use the lead function such that I can get the next row where something has changed, as opposed it where it is the same?
In this example, the RowType can be 'in' or 'out', for each 'in' I need to know the next RowNumber where it has become 'out'. I have been playing with the lead function as it is really fast, however I haven't been able to get it working. I just need to do the following really, which is partition by a RowType which isn't the one in the current row.
select
RowNumber
,RowType --In this case I am only interested in RowType = 'In'
, Lead(RowNumber)
OVER (partition by "RowType = out" --This is the bit I am stuck on--
order by RowNumber ASC) as NextOutFlow
from table
order by RowNumber asc
Thanks in advance for any help
Rather than using lead() I would use an outer apply that returns the next row with type out for all rows with type in:
select RowNumber, RowType, nextOut
from your_table t
outer apply (
select min(RowNumber) as nextOut
from your_table
where RowNumber > t.RowNumber and RowType='Out'
) oa
where RowType = 'In'
order by RowNumber asc
Given sample data like:
RowNumber RowType
1 in
2 out
3 in
4 in
5 out
6 in
This would return:
RowNumber RowType nextOut
1 in 2
3 in 5
4 in 5
6 in NULL
I think this will work
If you would use a bit field for in out you would get better performance
;with cte1 as
(
SELECT [inden], [OnOff]
, lag([OnOff]) over (order by [inden]) as [lagOnOff]
FROM [OnOff]
), cte2 as
(
select [inden], [OnOff], [lagOnOff]
, lead([inden]) over (order by [inden]) as [Leadinden]
from cte1
where [OnOff] <> [lagOnOff]
or [lagOnOff] is null
)
select [inden], [OnOff], [lagOnOff], [Leadinden]
from cte2
where [OnOff] = 'true'
probably slower but if you have the right indexes may work
select t1.rowNum as 'rowNumIn', min(t2.rownum) as 'nextRowNumOut'
from tabel t1
join table t2
on t1.rowType = 'In'
and t2.rowType = 'Out'
and t2.rowNum > t1.rowNum
and t2.rowNum < t1.rowNum + 1000 -- if you can constrain it
group by t1.rowNum

DISTINCT isn't removing dupes

I am not sure how to use DISTINCT in an AB BA fashion. For instance, I have two columns BoughtLoyaltyProgramId, SoldLoyaltyProgramId. But even when I use DISTINCT, it produces a duplicate when the same code in a boughtloyaltyprogramid appears in soldloyaltyprogramid. I want no dupes but I have no idea how this works with multiple columns and pairings.
Here is the stored procedure:
ALTER PROC AA
#LPPProgramID UNIQUEIDENTIFIER ,
#DateFrom DATETIME ,
#DateTo DATETIME
AS
SELECT DISTINCT TOP ( 5 )
BoughtLoyaltyProgramId ,
SoldLoyaltyProgramId ,
DateTransactionCleared ,
ExchangeRate
FROM dbo.PEX_ClearedTransactions
WHERE DateTransactionCleared >= #DateFrom
AND DateTransactionCleared < #DateTo
AND ( BoughtLoyaltyProgramId = #LPPProgramID
OR SoldLoyaltyProgramId = #LPPProgramID
)
ORDER BY ExchangeRate;
GO
Distinct is per ROW, so the value in the columns in a row are in a distinct combination, the data isn't compared in each column of a row, to other columns in that row.
You likely will also want to do some comparison in your Where statement for the column data.
Perhaps you want to use ROW_NUMBER:
WITH cte
AS (SELECT boughtloyaltyprogramid,
soldloyaltyprogramid,
datetransactioncleared,
exchangerate,
RN=Row_number() OVER(
partition BY boughtloyaltyprogramid, soldloyaltyprogramid
ORDER BY exchangerate)
FROM dbo.pex_clearedtransactions
WHERE datetransactioncleared >= #DateFrom
AND datetransactioncleared < #DateTo
AND ( boughtloyaltyprogramid = #LPPProgramID
OR soldloyaltyprogramid = #LPPProgramID ))
SELECT TOP(5) * FROM cte
WHERE RN = 1
ORDER BY exchangerate
Here is how you can get all distinct values from two columns:
SELECT distinct * from
(SELECT BoughtLoyaltyProgramId
FROM dbo.PEX_ClearedTransactions
UNION ALL
SELECT SoldLoyaltyProgramId
FROM dbo.PEX_ClearedTransactions) as A