Summarize values from month to month - tsql

I have a table like this.
https://i.stack.imgur.com/XpPZi.png
I need to add up the phase duration for each ID and each phase from month to month. If a phase does not occur in a month, that month should carry forward the sum for that phase from the previous month, so that every month lists every phase seen so far (plus the current month's value if the phase occurs in that month).
Expected result
https://i.stack.imgur.com/ukX0I.png
My issue is that I don't get the sum for each phase in the month, and a phase is missing from a month entirely if it does not occur in that month.
Can someone please help me?
CREATE TABLE [dbo].[Tab_Status_Test](
[ID] [int] NULL,
[Phase] [nvarchar](50) NULL,
[Phase_duration] [int] NULL,
[EOM_Date] [date] NULL
) ON [PRIMARY]
insert into Tab_status_test
(ID ,Phase,Phase_duration, EOM_Date)
values
('1' ,'C' , '22','2021/02/28')
,('1' ,'A' , '13','2021/03/31')
,('1' ,'A' , '5','2021/03/31')
,('1' ,'B' , '2','2021/03/31')
,('1' ,'B' , '19','2021/04/30')
,('1' ,'A' , '3','2021/04/30')
,('1' ,'B' , '1','2021/04/30')
,('1' ,'A' , '3','2021/04/30')
,('1' ,'B' , '22','2021/05/31')
,('1' ,'C' , '22','2021/06/30')
,('1' ,'D' , '20','2021/07/31')
,('1' ,'A' , '2','2021/07/31')
,('2' ,'C' , '22','2021/02/28')
,('2' ,'A' , '13','2021/03/31')
,('2' ,'A' , '5','2021/03/31')
,('3' ,'B' , '2','2021/03/31')
,('3' ,'B' , '19','2021/04/30')
,('2' ,'A' , '3','2021/04/30')
,('3' ,'B' , '1','2021/04/30')
,('2' ,'A' , '3','2021/04/30')
,('2' ,'B' , '22','2021/05/31')
,('3' ,'C' , '22','2021/06/30')
,('3' ,'D' , '20','2021/07/31')
,('3' ,'A' , '2','2021/07/31')
This is my code
WITH Sum_Dur
AS
(
SELECT ID
,EOM_Date
,phase
,Phase_duration
,LAG(Phase_duration) OVER (Partition BY phase, eom_date ORDER BY phase,eom_date) as PrevEvent
FROM [CM_PT].[dbo].Tab_Status_Test
)
SELECT *,
SUM(PrevEvent+Phase_duration) AS SummedCount
FROM Sum_Dur
GROUP BY ID
,EOM_Date
,phase
,Phase_duration
, PrevEvent

The main thing that you want is a running sum, which you can get using SUM() OVER(PARTITION BY ... ORDER BY ...).
Since you want to include gaps, you will need to generate a complete set of IDs, Dates and Phases, which you can do with several SELECT DISTINCT ... subqueries CROSS JOINed together.
Because your data contains multiple entries with activity for the same ID, date, and phase, that data needs to be grouped to avoid duplicate rows in the results.
The final piece is eliminating early results before any activity has occurred. That can be done by wrapping everything else up as another subselect to apply a WHERE Phase_duration > 0 condition.
The result is something like:
SELECT *
FROM (
-- Running totals
SELECT I.ID, D.EOM_Date, P.Phase,
SUM(T.Phase_duration) OVER(Partition by I.ID, P.Phase ORDER BY D.EOM_DATE) AS Phase_duration
FROM (SELECT DISTINCT ID FROM Tab_status_test) I
CROSS JOIN (SELECT DISTINCT EOM_Date FROM Tab_status_test) D
CROSS JOIN (SELECT DISTINCT Phase FROM Tab_status_test) P
LEFT JOIN (
-- Monthly totals
SELECT ID, EOM_Date, Phase, SUM(Phase_duration) AS Phase_duration
FROM Tab_status_test T
GROUP BY ID, EOM_Date, Phase
) T ON T.ID = I.ID AND T.EOM_Date = D.EOM_Date AND T.Phase = P.Phase
) A
WHERE A.Phase_duration > 0
ORDER BY A.ID, A.EOM_Date, A.Phase
Partial results:
ID  EOM_Date    Phase  Phase_duration
1   2021-02-28  C      22
1   2021-03-31  A      18
1   2021-03-31  B      2
1   2021-03-31  C      22
1   2021-04-30  A      24
1   2021-04-30  B      22
1   2021-04-30  C      22
See this db<>fiddle.
The above assumes that there is at least some activity in every month. If you could potentially have a gap where there is no activity for an entire month, you will need to replace the distinct-date subselect with a calendar generator.
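To illustrate what such a calendar generator has to produce, here is a minimal sketch in Python rather than T-SQL. The function name month_ends is hypothetical; the idea is simply to emit one end-of-month date for every month between the first and last dates in the data, whether or not any activity happened in that month.

```python
import calendar
from datetime import date

def month_ends(first, last):
    """Return the end-of-month date for every month from `first` to `last`, inclusive."""
    result = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        last_day = calendar.monthrange(year, month)[1]  # number of days in this month
        result.append(date(year, month, last_day))
        # advance one month, rolling over to January of the next year
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return result

# The sample data runs from 2021-02-28 to 2021-07-31
calendar_dates = month_ends(date(2021, 2, 28), date(2021, 7, 31))
for d in calendar_dates:
    print(d)
```

In the query, the same list of dates would take the place of the SELECT DISTINCT EOM_Date subquery.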
Your "expected results" were slightly different from the results I got. In particular, the total phase duration for {ID = 1, EOM_Date = 2021-04-30, Phase = B} should be 22 instead of 21 based on your supplied data.
If you want to order your data as C/A/B, you can replace the Phase term in the ORDER BY with a CASE expression that maps the values to an alternate sort order. Something like CASE Phase WHEN 'C' THEN 1 WHEN 'A' THEN 2 WHEN 'B' THEN 3 END.
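For readers who want to check the logic outside the database, the four steps of the query (monthly totals, full grid, running sum, dropping rows before first activity) can be mirrored in a short Python sketch. This is an illustration of the same idea and not a replacement for the T-SQL; the function name running_phase_totals is hypothetical, and the rows are the ID 1 entries through April from the question.

```python
from collections import defaultdict
from itertools import product

# Subset of the question's sample rows: (ID, Phase, Phase_duration, EOM_Date)
rows = [
    (1, "C", 22, "2021-02-28"),
    (1, "A", 13, "2021-03-31"),
    (1, "A", 5, "2021-03-31"),
    (1, "B", 2, "2021-03-31"),
    (1, "B", 19, "2021-04-30"),
    (1, "A", 3, "2021-04-30"),
    (1, "B", 1, "2021-04-30"),
    (1, "A", 3, "2021-04-30"),
]

def running_phase_totals(rows):
    # Step 1: monthly totals per (ID, Phase, EOM_Date), like the inner GROUP BY
    monthly = defaultdict(int)
    for id_, phase, duration, eom in rows:
        monthly[(id_, phase, eom)] += duration

    # Step 2: every ID x phase x date combination, like the CROSS JOINs
    ids = sorted({r[0] for r in rows})
    phases = sorted({r[1] for r in rows})
    dates = sorted({r[3] for r in rows})  # ISO date strings sort chronologically

    result = []
    for id_, phase in product(ids, phases):
        total = 0
        for eom in dates:
            # Step 3: running sum over dates; a month without activity adds nothing
            total += monthly.get((id_, phase, eom), 0)
            # Step 4: skip months before the phase has had any activity
            if total > 0:
                result.append((id_, eom, phase, total))
    return sorted(result)

totals = running_phase_totals(rows)
for row in totals:
    print(row)
```

For these rows the sketch gives the same seven rows as the partial results listed above, including 22 for phase B in April.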

Related

How to collapse overlapping date periods with acceptable gaps using T-SQL?

We want to group our members' enrollments into "continuous enrollments," allowing for a gap of up to 45 days. I know how to use LEAD to determine if an enrollment should be grouped with the next, but I don't know how to group them. Would it be more appropriate to add 45 to the term date and subtract 45 from the effective date, then check for overlapping date periods? My goal is to have a SQL view that returns the results similar to the final query below. Thank you for your help.
SELECT '101' AS MemID, '2021-01-01' AS EffDate, '2021-01-31' AS TermDate INTO #T1 UNION
SELECT '101', '2021-02-01', '2021-02-28' UNION
SELECT '101', '2021-03-01', '2021-03-31' UNION
SELECT '101', '2021-06-01', '2021-06-30' UNION
SELECT '999', '2021-01-01', '2021-01-15' UNION
SELECT '999', '2021-09-01', '2021-09-28' UNION
SELECT '999', '2021-10-01', '2021-10-31'
SELECT *
, LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate) AS LeadEffDate
, DATEDIFF(DAY, TermDate, (LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate))) AS DaysToNextEnrollment
, CASE WHEN (DATEDIFF(DAY, TermDate, (LEAD(EffDate) OVER (PARTITION BY MemID ORDER BY EffDate)))) <= 45 THEN 1 ELSE 0 END AS CombineWithNextRecord
FROM #T1
-- result objective
SELECT 101 AS MemID, '2021-01-01' AS EffDate, '2021-03-31' AS TermDate UNION
SELECT 101, '2021-06-01', '2021-06-30' UNION
SELECT 999, '2021-01-01', '2021-01-15' UNION
SELECT 999, '2021-09-01', '2021-10-31'
I think you are really close. Your question is very similar to "TSQL - creating from-to date table while ignoring in-between steps with conditions", with a logic difference in what you want to consider to be the same group.
My basic approach is to use the LAG() function to figure out the previous values for MemID and TermDate and combine that with your 45 day rule to define a group. And finally get the first and last values of each group.
Here is my response to that question modified to your situation.
SELECT
a4.MemID
, CONVERT (DATE, a4.First_EffDate) AS [EffDate]
, CONVERT (DATE, a4.TermDate) AS [TermDate]
FROM (
SELECT
a3.MemID
, a3.EffDate
, a3.TermDate
, a3.MemID_group
, FIRST_VALUE (a3.EffDate) OVER (PARTITION BY a3.MemID_group ORDER BY a3.EffDate) AS [First_EffDate]
, ROW_NUMBER () OVER (PARTITION BY a3.MemID_group ORDER BY a3.EffDate DESC) AS [Row_number]
FROM (
SELECT
a2.MemID
, a2.EffDate
, a2.TermDate
, a2.Previous_MemID
, a2.Previous_TermDate
, a2.New_group
, SUM (a2.New_group) OVER (ORDER BY a2.MemID, a2.EffDate) AS [MemID_group]
FROM (
SELECT
a1.MemID
, a1.EffDate
, a1.TermDate
, a1.Previous_MemID
, a1.Previous_TermDate
---------------------------------------------------------------------------------
-- new group if the MemID is different from the previous row OR
-- if the MemID is the same as the previous row AND it has been more than 45 days
-- between the TermDate of the previous row and the EffDate of the current row
,
IIF((a1.MemID <> a1.Previous_MemID)
OR (
a1.MemID = a1.Previous_MemID
AND DATEDIFF (DAY, a1.Previous_TermDate, a1.EffDate) > 45
)
, 1
, 0) AS [New_group]
---------------------------------------------------------------------------------
FROM (
SELECT
MemID
, EffDate
, TermDate
, LAG (MemID) OVER (ORDER BY MemID, EffDate) AS [Previous_MemID]
, LAG (TermDate) OVER (PARTITION BY MemID ORDER BY EffDate) AS [Previous_TermDate]
FROM #T1
) a1
) a2
) a3
) a4
WHERE a4.[Row_number] = 1;
Here is the dbfiddle.
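The grouping rule itself is easy to check outside SQL. The following is a minimal Python sketch of the same idea, assuming the question's 45-day limit; merge_enrollments is a hypothetical name. One small difference from the query: the sketch compares each row against the latest TermDate of the group so far, not only against the immediately preceding row, which gives the same answer on the sample data.

```python
from datetime import date

MAX_GAP_DAYS = 45  # allowed gap between enrollments, taken from the question

# Sample rows from the question: (MemID, EffDate, TermDate)
enrollments = [
    ("101", date(2021, 1, 1), date(2021, 1, 31)),
    ("101", date(2021, 2, 1), date(2021, 2, 28)),
    ("101", date(2021, 3, 1), date(2021, 3, 31)),
    ("101", date(2021, 6, 1), date(2021, 6, 30)),
    ("999", date(2021, 1, 1), date(2021, 1, 15)),
    ("999", date(2021, 9, 1), date(2021, 9, 28)),
    ("999", date(2021, 10, 1), date(2021, 10, 31)),
]

def merge_enrollments(rows, max_gap_days=MAX_GAP_DAYS):
    merged = []
    # Sorting by (MemID, EffDate) plays the role of the ORDER BY in the window functions
    for mem_id, eff, term in sorted(rows):
        if merged:
            prev_id, prev_eff, prev_term = merged[-1]
            gap = (eff - prev_term).days  # same quantity as DATEDIFF(DAY, prev_term, eff)
            if prev_id == mem_id and gap <= max_gap_days:
                # Same member and a small enough gap: extend the current group
                merged[-1] = (prev_id, prev_eff, max(prev_term, term))
                continue
        # Different member or gap too large: start a new group
        merged.append((mem_id, eff, term))
    return merged

continuous = merge_enrollments(enrollments)
for row in continuous:
    print(row)
```

On the sample data this produces the four rows listed under "result objective" in the question.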

PostgreSQL - SQL function to loop through all months of the year and pull 10 random records from each

I am attempting to pull 10 random records from each month of this year using the query below, but I get the error: ERROR: relation "c1" does not exist
I'm not sure where I'm going wrong. I think I may be using MySQL syntax instead, but how do I resolve this?
My desired output is like this
Month
Another header
2021-01
random email 1
2021-01
random email 2
In total, ten random emails from January, then ten more for each month this year (up to November, of course, as December is yet to happen).
With CTE AS
(
Select month,
email,
Row_Number() Over (Partition By month Order By FLOOR(RANDOM()*(1-1000000+1))) AS RN
From (
SELECT
DISTINCT(TO_CHAR(DATE_TRUNC('month', timestamp ), 'YYYY-MM')) AS month
,CASE
WHEN
JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'name') = 'email'
THEN
JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'value')
END AS email
FROM form_submits_y2 fs
WHERE fs.website_id IN (791)
AND month LIKE '2021%'
GROUP BY 1,2
ORDER BY 1 ASC
)
)
SELECT *
FROM CTE C1
LEFT JOIN
(SELECT RN
,month
,email
FROM CTE C2
WHERE C2.month = C1.month
ORDER BY RANDOM() LIMIT 10) C3
ON C1.RN = C3.RN
ORDER By month ASC
You can't reference an outer table inside a derived table with a regular join. You need to use LEFT JOIN LATERAL to make that work.
I did end up finding a more elegant solution to my query, via this source from GitHub:
SELECT
month
,email
FROM
(
Select month,
email,
Row_Number() Over (Partition By month Order By FLOOR(RANDOM()*(1-1000000+1))) AS RN
From (
SELECT
TO_CHAR(DATE_TRUNC('month', timestamp ), 'YYYY-MM') AS month
,CASE
WHEN JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'name') = 'email'
THEN JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'value')
END AS email
FROM form_submits_y2 fs
WHERE fs.website_id IN (791)
AND month LIKE '2021%'
GROUP BY 1,2
ORDER BY 1 ASC
)
) q
WHERE
RN <=10
ORDER BY month ASC
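The pattern in that query (number the rows of each month in random order, then keep the first ten) can be sketched in Python to make the mechanics explicit. This is only an illustration with made-up data; sample_per_month and the example.com addresses are hypothetical.

```python
import random
from collections import defaultdict

def sample_per_month(rows, per_month=10, seed=None):
    """Pick up to `per_month` random emails from each month."""
    rng = random.Random(seed)
    by_month = defaultdict(list)
    for month, email in rows:
        by_month[month].append(email)  # one bucket per month, like PARTITION BY month

    picked = []
    for month in sorted(by_month):
        emails = by_month[month]
        rng.shuffle(emails)  # random order inside the bucket, like ORDER BY RANDOM()
        # keep only the first `per_month` rows, like WHERE RN <= 10
        picked.extend((month, email) for email in emails[:per_month])
    return picked

# Made-up input: 25 emails in each of two months, 4 in a third
rows = [(f"2021-{m:02d}", f"user{m}_{i}@example.com") for m in (1, 2) for i in range(25)]
rows += [("2021-03", f"user3_{i}@example.com") for i in range(4)]

sample = sample_per_month(rows, seed=0)
print(len(sample))
```

A month with fewer than ten rows simply contributes all of its rows, which is also what the RN <= 10 filter does.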

How to find the average of the three maximum values in a specific group in a moving window in Big Query?

I have a data set as in the table below. I want to find the average of the maximum three values in a rolling 12 month window grouped by id.
id date value
id1 2020/01/01 500
id1 2021/02/01 300
id1 2021/03/01 150
id1 2021/08/01 100
id1 2021/12/01 400
id2 2020/01/01 50
id2 2020/02/01 900
id2 2021/12/01 100
So my expected output is:
id date value
id1 2020/01/01 500
id1 2021/02/01 300
id1 2021/03/01 225
id1 2021/08/01 183.33
id1 2021/12/01 283.33
id2 2020/01/01 50
id2 2020/02/01 500
id2 2021/12/01 100
I.e. for id1 2021/12/01: (400+300+150)/3 = 283.33 which is the average of the three largest values in a rolling 12 month window for group ID1.
I managed to get to this point:
CREATE TEMP FUNCTION avg_array(arr ANY TYPE) AS ((
SELECT AVG(val) FROM(
SELECT val FROM UNNEST(arr) val ORDER BY val DESC LIMIT 3)
)
);
SELECT id, date, avg_array(val_arr)
FROM (
SELECT
id, date, ARRAY_AGG(value) OVER (
PARTITION BY id
ORDER BY id, date DESC ROWS BETWEEN CURRENT ROW AND 11 FOLLOWING
) as val_arr
FROM `table` )
Which works, but I feel like there must be a better way to do this. Specifically, I can't figure out how to get the average of the maximum three from the OVER as well, rather than creating a separate function.
(If it is not possible to combine the date window with finding the maximum values, it would also be useful for me to know how to find the average of the maximum three in any GROUP BY group without creating a separate function.)
Your code is missing the year of the date in the PARTITION BY clause: it should read PARTITION BY id, EXTRACT(YEAR FROM date).
CREATE TEMP FUNCTION avg_array(arr ANY TYPE) AS ((
SELECT AVG(val) FROM(
SELECT val FROM UNNEST(arr) val ORDER BY val DESC LIMIT 3))
);
SELECT id, date, avg_array(val_arr)
FROM (
SELECT
id, date, ARRAY_AGG(value) OVER (
PARTITION BY id,EXTRACT(YEAR FROM date)
ORDER BY id, date DESC ROWS BETWEEN CURRENT ROW AND 11 FOLLOWING
) as val_arr
FROM `table` )
order by id,date asc
Here is sample code to get the average of the 3 largest values in each group:
select id, AVG(value) as vg from (
-- rank the rows of each id from largest to smallest value
select id, value, ROW_NUMBER() OVER (PARTITION BY id ORDER BY value DESC) as rn
from `table`
) b
where rn <= 3
group by id
You can find more information about the OVER clause in this link.
Consider below approach
select id, date,
(select round(avg(value), 2) from (
select value from t.arr value
order by value desc
limit 3
)) value
from (
select *, array_agg(value) over last_12_month arr from table
window last_12_month as (partition by id
order by 12 * (extract(year from date)) + extract(month from date)
range between 11 preceding and current row
)
) t
If applied to the sample data in your question, this produces the output shown in the original answer's image (not reproduced here).
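As a way to sanity-check the window logic of this approach, here is a minimal Python sketch of the same computation: months are numbered as 12 * year + month, the window covers the current month and the 11 before it, and the three largest values in the window are averaged. The function names are hypothetical and the data is the sample from the question.

```python
from datetime import date

# Sample data from the question: (id, date, value)
data = [
    ("id1", date(2020, 1, 1), 500),
    ("id1", date(2021, 2, 1), 300),
    ("id1", date(2021, 3, 1), 150),
    ("id1", date(2021, 8, 1), 100),
    ("id1", date(2021, 12, 1), 400),
    ("id2", date(2020, 1, 1), 50),
    ("id2", date(2020, 2, 1), 900),
    ("id2", date(2021, 12, 1), 100),
]

def month_index(d):
    # Same expression as the ORDER BY in the window definition
    return 12 * d.year + d.month

def rolling_top3_avg(rows, months=12, top_n=3):
    out = []
    for id_, d, _ in rows:
        hi = month_index(d)
        lo = hi - (months - 1)  # RANGE BETWEEN 11 PRECEDING AND CURRENT ROW
        window = [v for i, d2, v in rows if i == id_ and lo <= month_index(d2) <= hi]
        top = sorted(window, reverse=True)[:top_n]  # ORDER BY value DESC LIMIT 3
        out.append((id_, d, round(sum(top) / len(top), 2)))
    return out

averages = rolling_top3_avg(data)
for row in averages:
    print(row)
```

For id1 on 2021-12-01 the window holds 300, 150, 100 and 400, so the result is (400 + 300 + 150) / 3 = 283.33, as in the question. When fewer than three values fall in the window, the average is taken over what is there, so id2 on 2020-02-01 gives (900 + 50) / 2 = 475.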

postgreSQL select interval and fill blanks

I'm working on a system to manage the problems in different projects.
I have the following tables:
Projects
id  Description    Country
1   3D experience  Brazil
2   Lorem Epsum    Chile

Problems
id  idProject  Description
1   1          Not loading
2   1          Breaking down

Problems_status
id  idProblem  Status  Start_date  End_date
1   1          Red     2020-10-17  2020-10-25
2   1          Yellow  2020-10-25  2020-11-20
3   1          Red     2020-11-20
4   2          Red     2020-11-01  2020-11-25
5   2          Yellow  2020-11-25  2020-12-22
6   2          Red     2020-12-22  2020-12-23
7   2          Green   2020-12-23
In the examples above, problem 1 is still red and problem 2 is green (no end date).
When the user selects a specific project, I need to create a chart showing the status of its problems over the weeks (starting with the week of the first registered problem). The chart for project 1 should look like this:
I'm trying to write a code in postgreSQL to return a table like this, so that I can populate this chart:
Week   Green  Yellow  Red
42/20  0      0       1
43/20  0      0       1
44/20  0      1       0
...    ...    ...     ...
04/21  1      0       1
I've been trying multiple ways but just can't figure out how to do that, could someone help me please?
Below is a db-fiddle to help:
CREATE TABLE projects (
id serial NOT NULL,
description character varying(50) NOT NULL,
country character varying(50) NOT NULL,
CONSTRAINT projects_pkey PRIMARY KEY (id)
);
CREATE TABLE problems (
id serial NOT NULL,
id_project integer NOT NULL,
description character varying(50) NOT NULL,
CONSTRAINT problems_pkey PRIMARY KEY (id),
CONSTRAINT problems_id_project_fkey FOREIGN KEY (id_project)
REFERENCES projects (id) MATCH SIMPLE
);
CREATE TABLE problems_status (
id serial NOT NULL,
id_problem integer NOT NULL,
status character varying(50) NOT NULL,
start_date date NOT NULL,
end_date date,
CONSTRAINT problems_status_pkey PRIMARY KEY (id),
CONSTRAINT problems_status_id_problem_fkey FOREIGN KEY (id_problem)
REFERENCES problems (id) MATCH SIMPLE
);
INSERT INTO projects (description, country) VALUES ('3D experience','Brazil');
INSERT INTO projects (description, country) VALUES ('Lorem Epsum','Chile');
INSERT INTO problems (id_project ,description) VALUES (1,'Not loading');
INSERT INTO problems (id_project ,description) VALUES (1,'Breaking down');
INSERT INTO problems_status (id_problem, status, start_date, end_date) VALUES
(1, 'Red', '2020-10-17', '2020-10-25'),(1, 'Yellow', '2020-10-25', '2020-11-20'),
(1, 'Red', '2020-11-20', NULL),(2, 'Red', '2020-11-01', '2020-11-25'),
(2, 'Yellow', '2020-11-25', '2020-12-22'),(2, 'Red', '2020-12-22', '2020-12-23'),
(2, 'Green', '2020-12-23', NULL);
If I understood correctly, your goal is to produce a weekly tally by problem status for a particular project over a specific time period (the earliest date in the database through the current date). Further, if a problem status spans several weeks, it should be included in each week's tally. That involves two time periods, the report period and the status start/end dates, and checking whether they overlap. Several overlap scenarios need checking. Call the ranges A (any week in the report period) and B (the start/end of a status). Allowing that A must end within the reporting period but B need not, we have the following:
A starts, B starts, A ends, B ends. B overlaps end of A.
A starts, B starts, B ends, A ends. B totally contained within A.
B starts, A starts, B ends, A ends. B overlaps start of A.
B starts, A starts, A ends, B ends. A totally enclosed within B.
Fortunately, Postgres provides functionality that handles all of the above, so the query does not have to implement the individual checks: DATERANGEs and the overlap operator (&&). The difficult work then becomes defining each week within the report period. Then apply the overlap operator to the daterange of each week (A) against the daterange of each status (B: start_date, end_date), and do conditional aggregation over the overlaps detected. See full example here.
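To see why one overlap test covers all four scenarios, here is a small Python sketch of the check for half-open ranges, which is the default bound style of daterange(start, end). The function name overlaps is hypothetical, a missing end date is treated as open-ended, and each range is assumed to be non-empty (start before end).

```python
from datetime import date

def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) share a day.

    b_end=None means B is open-ended. Assumes both ranges are non-empty.
    """
    if b_end is None:
        return b_start < a_end
    return a_start < b_end and b_start < a_end

# A is week 43 of 2020: Monday 2020-10-19 up to, but not including, Monday 2020-10-26
week = (date(2020, 10, 19), date(2020, 10, 26))

scenarios = {
    "B overlaps end of A":   (date(2020, 10, 25), date(2020, 11, 20)),
    "B contained within A":  (date(2020, 10, 20), date(2020, 10, 22)),
    "B overlaps start of A": (date(2020, 10, 17), date(2020, 10, 25)),
    "A enclosed within B":   (date(2020, 10, 1), date(2020, 11, 30)),
    "B starts after A":      (date(2020, 10, 26), date(2020, 11, 20)),
}
for name, (b_start, b_end) in scenarios.items():
    print(name, overlaps(*week, b_start, b_end))
```

The four listed scenarios all come out as overlapping, while a status that starts on the following Monday does not, because the end of the week range is exclusive.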
with problem_list( problem_id ) as
-- identify the specific problem_ids desired
(select ps.id
from projects p
join problems ps on(ps.id_project = p.id)
where p.id = &selected_project
) --select * from problem_list;
, report_period(srange, erange) as
-- generate the first day of week (Mon) for the
-- oldest start date through day of week of Current_Date
(select min(first_of_week(ps.start_date))
, first_of_week(current_date)
from problems_status ps
join problem_list pl
on (pl.problem_id = ps.id_problem)
) --select * from report_period;
, weekly_calendar(wk,yr, week_dates) as
-- expand the start, end date ranges to week dates (Mon-Sun)
-- and identify the week number with year
(select extract( week from mon)::integer wk
, extract( isoyear from mon)::integer yr
, daterange(mon, mon+6, '[]'::text) wk_dates
from (select generate_series(srange,erange, interval '7 days')::date mon
from report_period
) d
) -- select * from weekly_calendar;
, status_by_week(yr,wk,status) as
-- determine where problem start_date, end_date overlaps each calendar week
-- then where multiple statuses exist for any week keep only the latest
( select yr,wk,status
from (select wc.yr,wc.wk,ps.status
-- , ps.start_date, wc.week_dates,id_problem
, row_number() over (partition by ps.id_problem,yr,wk order by yr, wk, start_date desc) rn
from problems_status ps
join problem_list pl on (pl.problem_id = ps.id_problem)
join weekly_calendar wc on (wc.week_dates && daterange(ps.start_date,ps.end_date)) -- actual overlap test
) ac
where rn=1
) -- select * from status_by_week order by wk;
select 'Project ' || p.id || ': ' || p.description Project
, to_char(wk,'fm09') || '/' || substr(to_char(yr,'fm0000'),3) "WK"
, "Red", "Yellow", "Green"
from projects p
cross join (select sbw.yr,sbw.wk
, count(*) filter (where sbw.status = 'Red') "Red"
, count(*) filter (where sbw.status = 'Yellow') "Yellow"
, count(*) filter (where sbw.status = 'Green') "Green"
from status_by_week sbw
group by sbw.yr, sbw.wk
) sr
where p.id = &selected_project
order by yr,wk;
The CTEs and the main query operate as follows:
problem_list: Identifies the problems (id_problem) related to the specified project.
report_period: Identifies the full reporting period, start to end.
weekly_calendar: Generates the beginning date (Mon) and ending date (Sun) for each week within the reporting period (A above). Along the way it also gathers the week of the year and the ISO year.
status_by_week: This is the real workhorse, performing two tasks. First, it passes each problem by each week in the calendar and builds a row for each overlap detected. Then it enforces the "one status" rule.
Finally, the main select aggregates the statuses into the appropriate buckets and adds the syntactic sugar of the project name.
Note the function first_of_week(). This is a user-defined function, available in the example and below. I created it some time ago and have found it useful. You are free to use it, but you do so without any claim of suitability or guarantee.
create or replace
function first_of_week(date_in date)
returns date
language sql
immutable strict
/*
* Given a date return the first day of the week according to ISO-8601
*
* ISO-8601 Standard (in short)
* 1 All weeks begin on Monday.
* 2 All Weeks have exactly 7 days.
* 3 First week of any year is the Monday on or before 4-Jan.
* This implies that the last few days of Dec may be in the
* first week of the following year and that the first few
* days of Jan may be in week 52 (or 53) of the prior year.
* (Not at the same time obviously.)
*
*/
as $$
with wk_adj(l_days) as (values (array[0,1,2,3,4,5,6]))
select date_in - l_days[ extract (isodow from date_in)::integer ]
from wk_adj;
$$;
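For comparison, the same calculation is a one-liner in Python, since isoweekday() uses the same Monday = 1 through Sunday = 7 numbering as isodow. This is a sketch for checking dates by hand, not part of the SQL solution.

```python
from datetime import date, timedelta

def first_of_week(d):
    """Return the Monday of the ISO week that contains `d`."""
    # isoweekday(): Monday = 1 ... Sunday = 7, so Monday subtracts 0 days and Sunday 6
    return d - timedelta(days=d.isoweekday() - 1)

# Earliest start_date in the question's data (a Saturday)
print(first_of_week(date(2020, 10, 17)))
# A Sunday in early January belongs to a week that began in the prior year
print(first_of_week(date(2021, 1, 3)), date(2021, 1, 3).isocalendar()[:2])
```

The first call returns 2020-10-12, the Monday of week 42, which is where the chart in the question starts. The second shows the year-boundary case from the comment block: 2021-01-03 falls in ISO week 53 of 2020.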
In the example I have implemented the query as a SQL function, as it seems db<>fiddle has issues with bound variables and substitution variables. Besides, it gave me the ability to parameterize it (I hate hard-coded values). For the example I added additional data for extra testing, mostly data that will not be selected, and an additional status (Pink) to show what happens if the query encounters something other than those 3 status values. This is easy to remove: just get rid of OTHER.
Your observation that "the daterange is covering mon-mon, instead of mon-sun" is incorrect, although it would appear that way to someone not used to looking at ranges. Let's take week 43. If you queried the date range it would show [2020-10-19,2020-10-26), and yes, both those dates are Mondays. However, the bracketing characters have meaning. The leading character [ says the date is to be included, and the trailing character ) says the date is not to be included. A standard containment condition:
somedate <@ '[2020-10-19,2020-10-26)'::daterange
is the same as
somedate >= '2020-10-19' and somedate < '2020-10-26'
This is why, when you changed the increment from "mon+6" to "mon+5", you fixed week 43 but introduced errors into other weeks.
You can fill in blanks using COALESCE to select the first non-null value in the list.
SELECT COALESCE(<some_value_that_could_be_null>, <some_value_that_will_not_be_null>);
If you want to force the bounds of your time range into a result set you can UNION your result set with a specific date.
SELECT ... -- your data query here
UNION ALL
SELECT end_ts -- WHERE end_ts is a timestamptz type
In order to UNION, the unioned query must return the same number of fields with matching types. You can fill in everything other than the timestamp with NULL cast to the matching type.
More concrete example:
WITH data AS -- get raw data
(
SELECT p.id
, ps.status
, ps.start_date
, COALESCE(ps.end_date, CURRENT_DATE, '01-01-2025'::DATE) -- you can fill in NULL values with COALESCE
, pj.country
, pj.description
, MAX(start_date) OVER (PARTITION BY p.id) AS latest_update
FROM problems p
JOIN projects pj ON (pj.id = p.id_project)
JOIN problems_status ps ON (p.id = ps.id_problem)
UNION ALL -- force bounds in the following
SELECT NULL::INTEGER -- could be null or a defaulted value
, NULL::TEXT -- could be null or a defaulted value
, start_date -- either as an input param to a function or a hard-coded date
, end_date -- either as an input param to a function or a hard-coded date
, NULL::TEXT
, NULL::TEXT
, NULL::DATE
) -- aggregate in the following
SELECT <week> -- you'll have to figure out how you're getting weeks out of the DATE data
, COUNT(*) FILTER (WHERE status = 'Red')
, COUNT(*) FILTER (WHERE status = 'Yellow')
, COUNT(*) FILTER (WHERE status = 'Green')
FROM data
WHERE start_date = latest_update
GROUP BY <week>
;
Some of the features used in this query are very powerful and you should look them up if they're new to you and you are going to be doing a bunch of reporting queries. Mainly coalesce, common table expressions (CTE), window functions, and aggregate expressions.
Aggregate Expressions
WITH Queries (CTEs)
COALESCE
Window Functions
I wrote a dbfiddle for you to take a look at here after you updated your requirements.

How to order UNPIVOT

I have the following UNPIVOT code and I would like to order it by the FactSheetSummary columns so that, when it is converted to rows, it is ordered 1 to 12:
INSERT INTO #Results
SELECT DISTINCT ReportingDate, PortfolioID,ISIN, PortfolioNme, Section,REPLACE(REPLACE(Risks,'‘',''''),'’','''')
FROM
(SELECT DISTINCT
ReportingDate
, PortfolioID
, ISIN
, PortfolioNme
, Section
, FactSheetSummary_1, FactSheetSummary_2, FactSheetSummary_3
, FactSheetSummary_4, FactSheetSummary_5, FactSheetSummary_6
, FactSheetSummary_7, FactSheetSummary_8, FactSheetSummary_9
, FactSheetSummary_10, FactSheetSummary_11, FactSheetSummary_12
FROM #WorkingTableFactsheet) p
UNPIVOT
(Risks FOR FactsheetSummary IN
( FactSheetSummary_1, FactSheetSummary_2, FactSheetSummary_3
, FactSheetSummary_4, FactSheetSummary_5, FactSheetSummary_6
, FactSheetSummary_7, FactSheetSummary_8, FactSheetSummary_9
, FactSheetSummary_10, FactSheetSummary_11, FactSheetSummary_12)
)AS unpvt;
--DELETE records where there are no Risk Narratives
DELETE FROM #Results
WHERE Risks = ''
SELECT
ReportingDate
, PortfolioID
, ISIN
, PortfolioNme
, Section
, Risks
, ROW_NUMBER() OVER(PARTITION BY ISIN,Section ORDER BY ISIN,Section,Risks) as SortOrder
FROM #Results
order by ISIN, Risks
Is it possible to do this? I thought the IN of the UNPIVOT would dictate the order? Do I need to add a column to dictate which I would like to be 1 through to 12?
Just use a CASE expression to order:
order by case FactsheetSummary
when 'FactSheetSummary_1' then 1
when 'FactSheetSummary_2' then 2
-- ... continue the same pattern for 3 through 11
when 'FactSheetSummary_12' then 12 end
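The reason an explicit mapping is needed is that the column names sort as text, so 10, 11 and 12 land before 2. A small Python sketch makes the difference visible; summary_order is a hypothetical helper that does the same job as the CASE expression by reading the number after the underscore.

```python
def summary_order(column_name):
    # "FactSheetSummary_10" -> 10
    return int(column_name.rsplit("_", 1)[1])

names = [f"FactSheetSummary_{n}" for n in range(1, 13)]

alphabetical = sorted(names)               # text order: _1, _10, _11, _12, _2, ...
numeric = sorted(names, key=summary_order)  # intended order: _1, _2, ..., _12

print(alphabetical[:5])
print(numeric[:5])
```

Either way, the unpivoted name column (FactsheetSummary) has to be kept in the result set so that there is something to sort on.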