Pivot for multiple columns in SQL Server - tsql

I have 3 areas:
Melt, HSM, LSM
Each order is produced in some of these areas, with the following data:
Start Date, Finish Date, Weight
I have a view in SQL Server 2012 (top image). How can I create a pivot that generates the bottom image using T-SQL?

You could use conditional aggregation:
SELECT [Order],
MAX(CASE WHEN Area = 'Melt' THEN StartDate END) AS Melt_SDate,
MAX(CASE WHEN Area = 'Melt' THEN FinisthDate END) AS Melt_FDate,
MAX(CASE WHEN Area = 'Melt' THEN Weight END) AS Melt_Weight,
MAX(CASE WHEN Area = 'HSM' THEN StartDate END) AS HSM_SDate,
MAX(CASE WHEN Area = 'HSM' THEN FinisthDate END) AS HSM_FDate,
MAX(CASE WHEN Area = 'HSM' THEN Weight END) AS HSM_Weight,
MAX(CASE WHEN Area = 'LSM' THEN StartDate END) AS LSM_SDate,
MAX(CASE WHEN Area = 'LSM' THEN FinisthDate END) AS LSM_FDate,
MAX(CASE WHEN Area = 'LSM' THEN Weight END) AS LSM_Weight
FROM tab_name
GROUP BY [Order]; -- ORDER is a reserved word; you should avoid such identifiers
To make it more concise you could use IIF:
MAX(CASE WHEN Area = 'Melt' THEN StartDate END) AS Melt_SDate,
<=>
MAX(IIF(Area='Melt',StartDate,NULL)) AS Melt_SDate,
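To see the conditional-aggregation pivot produce one row per order, here is a minimal sketch. It assumes Python's built-in sqlite3 as a stand-in for SQL Server, and the sample rows and the reduced column set are hypothetical, not taken from the question's view.

```python
import sqlite3

# SQLite (in memory) stands in for SQL Server; the rows below are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tab_name ("Order" TEXT, Area TEXT, StartDate TEXT, Weight REAL)'
)
conn.executemany(
    "INSERT INTO tab_name VALUES (?, ?, ?, ?)",
    [
        ("A1", "Melt", "2017-01-01", 10.0),
        ("A1", "HSM", "2017-01-03", 9.5),
        ("A2", "Melt", "2017-01-02", 20.0),
    ],
)

# One MAX(CASE ...) per (area, attribute) pair turns rows into columns.
pivot_sql = """
SELECT "Order",
       MAX(CASE WHEN Area = 'Melt' THEN StartDate END) AS Melt_SDate,
       MAX(CASE WHEN Area = 'Melt' THEN Weight END)    AS Melt_Weight,
       MAX(CASE WHEN Area = 'HSM'  THEN StartDate END) AS HSM_SDate,
       MAX(CASE WHEN Area = 'HSM'  THEN Weight END)    AS HSM_Weight
FROM tab_name
GROUP BY "Order"
ORDER BY "Order"
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

An order that never passed through an area simply gets NULL (None in Python) in that area's columns, because the CASE expression yields no value for MAX to pick up.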

Spark SQL in PySpark throws ParseError Exception

I have PySpark code running in an AWS Glue job. I have some temporary tables created from dataframes in the same code using the function createOrReplaceTempView().
I am running a multiline SQL query within the code to fetch data from these temporary tables and store in a dataframe. However, I am getting ParseError Exception. The same query runs absolutely fine in a SQL client. Below is the query. Can someone please help? Unfortunately I don't get any error message other than the text "ParseError Exception"
def getSCMCaseHistLanding(self):
landingDF = self.spark.sql("""SELECT aud.case_id,
r.name AS role_name,
cas.account_id,
cas.created_time AS case_created_time,
cas.last_updated_time,
cas.screening_decision,
aud.created_time AS record_created_time,
aud.username,
row_number() OVER (PARTITION BY aud.case_id ORDER BY date_trunc('minute', aud.created_time) + date_part('second', aud.created_time)::INT / 10 * INTERVAL '10 sec') rank,
max(CASE WHEN field = 'status_id' THEN aud.value ELSE NULL END) AS d_status,
max(CASE WHEN field = 'status_id' THEN SPLIT_PART(SPLIT_PART(aud.description, 'from ', 2), ' to', 1) ELSE NULL END) AS src_status,
max(CASE WHEN lower(field) = 'annotation' THEN [value] ELSE NULL END) AS annotation,
max(CASE WHEN lower(field) = 'reason_id' THEN [value] ELSE NULL END) AS reason,
max(CASE WHEN lower(field) = 'approver_id' THEN [value] ELSE NULL END) AS approver,
r.team_name AS approver_source,
max(CASE WHEN lower(field) = 'decision_id' THEN [value] ELSE NULL END) AS decision,
max(CASE WHEN field = 'status_id' THEN aud.created_time ELSE NULL END) AS lsd,
last_value(lsd IGNORE NULLS) OVER (PARTITION BY aud.case_id ORDER BY record_created_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS last_status_changed_date,
max(CASE WHEN aud.field = 'assigned_to' THEN REVERSE(SPLIT_PART(REVERSE(aud.value), ' ', 1)) ELSE NULL END) AS assigned_to_t,
last_value(assigned_to_t IGNORE NULLS)
OVER (PARTITION BY aud.case_id ORDER BY record_created_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS assigned_to,
max(CASE WHEN aud.field = 'assigned_to' THEN aud.username ELSE NULL END) AS assigned_by,
max(CASE
WHEN (aud.field = 'assigned_to' AND aud.description LIKE 'Workbasket%') OR
(aud.field = 'assigned_to' AND aud.description LIKE '%by%') THEN 'Auto Assign'
WHEN aud.field = 'assigned_to' AND aud.description LIKE '%themself%' THEN 'Get Next'
END) AS assignment_method,
max(
CASE WHEN aud.field = 'assigned_to' THEN aud.description ELSE NULL END) AS assignment_detail,
max(CASE WHEN lower(field) = 'state_id' THEN [value] ELSE NULL END) AS ll_state,
last_value(d_status IGNORE NULLS)
OVER (PARTITION BY aud.case_id ORDER BY record_created_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS dest_status,
last_value(ll_state IGNORE NULLS)
OVER (PARTITION BY aud.case_id ORDER BY record_created_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS l_state,
CASE WHEN l_state IS NULL THEN 'Open' ELSE l_state END AS state,
max(CASE WHEN lower(field) = 'responsive_action_id' THEN [value] ELSE NULL END) AS responsive_action,
max(CASE WHEN type_name = 'Language' THEN skill.skill_name ELSE NULL END) AS language_skill,
max(
CASE WHEN type_name = 'Business' THEN skill.skill_name ELSE NULL END) AS business_skill,
max(CASE
WHEN lower(field) = 'annotation' AND lower(value) LIKE 'f+ applied by jarvis%' THEN 'jarvis'
WHEN lower(field) = 'annotation' AND (lower(value) LIKE '%bulk action%'
OR lower(value) LIKE '%moving cases to invalid data%') THEN 'bulk action'
WHEN lower(field) = 'annotation' AND lower(value) LIKE '%t+ decision recommended by scr model%'
THEN 'SCR T+ Model'
WHEN lower(field) = 'annotation' AND lower(value) LIKE '%f+ applied by scr model%'
THEN 'SCR F+ Model'
ELSE NULL END) AS source_of_action,
max(CASE
WHEN lower(field) = 'annotation' AND lower(value) LIKE '%duplicate case%' THEN 'Y'
ELSE 'N' END) AS duplicate_case,
max(CASE
WHEN field = 'state_id' AND aud.description = 'Case reopened' THEN
date_trunc('minute', aud.created_time) +
date_part('second', aud.created_time)::INT / 10 * INTERVAL '10 sec'
ELSE NULL END) AS case_reopened_time,
max(0) AS active_record_flag
FROM cm_spectre_case_audit aud
JOIN cm_spectre_case cas
ON cas.case_id = aud.case_id
LEFT JOIN v_cm_user_snapshot u ON
CASE
WHEN aud.field = 'assigned_to' THEN REVERSE(SPLIT_PART(REVERSE(aud.value), ' ', 1))
ELSE aud.username
END = u.alias AND aud.created_time BETWEEN u.start_time AND u.end_time
LEFT JOIN cm_role r ON u.current_role_id = r.role_id
LEFT JOIN cm_lookup_skills lookup_skill ON lookup_skill.case_id = cas.case_id
LEFT JOIN cm_skill skill ON skill.skill_id = lookup_skill.skill_id
LEFT JOIN cm_skill_type skill_type ON skill_type.type_id = skill.type_id
WHERE 1 = 1
AND NOT (field = 'created_time'
AND username = 'System')
AND NOT (field = 'annotation'
AND username = 'System')
AND NOT (field = 'skill_name_')
AND NOT (field = 'assigned_to'
AND lower(aud.description) LIKE '%unassigned%')
AND NOT (field = 'attachment')
AND NOT (field = 'accept_list_id')
AND NOT (field LIKE 'screening_match_id%')
GROUP BY aud.case_id, cas.account_id, role_name, r.team_name, cas.created_time, cas.last_updated_time,
cas.screening_decision,
aud.created_time,
aud.username
""")
return landingDF

Aggregate Nested Pgsql query

I'm trying to query a table to get me some results, but the way I'm doing it gives me the error: ERROR: aggregate function calls cannot be nested.
The table looks like this:
ID | canal_c1 | tarifacao | date | ativo_id
The query I'm trying is this:
SELECT
SUM(case when tarifacao = 'ForaPonta' then canal_c1 else 0 end) as ConsForaPonta,
MAX(case when tarifacao = 'ForaPonta' then canal_c1 else 0 end) as DemForaPonta,
ativo_id as ativo_id,
data_leitura_inicio::date as date
FROM
medicao
WHERE
medicao.ativo_id in (45) AND
medicao.tipo_leitura = 'Consumo' AND
medicao.data_leitura_inicio >= '2017-01-01' AND
medicao.data_leitura_inicio < '2017-01-10'
GROUP BY
medicao.ativo_id,
medicao.data_leitura_inicio::date
That gives me results like these:
query result
That's fine. What I need now is the datetime at which the DemForaPonta value occurred. To get it I tried the following, but got that error:
MAX(case when tarifacao = 'ForaPonta' and
canal_c1 = MAX(case when tarifacao = 'ForaPonta' then canal_c1 else 0 end)
then data_leitura_inicio end) as DateDemForaPonta
Do you know how I could achieve this?
Thanks.
Edit:
Here's an example, the query result is the intended result.
example
It would be helpful to have an example with real data. But anyway, you can do what you need with nested selects, among other options.
Nested SELECTs
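SELECT data.ConsForaPonta,
data.DemForaPonta,
data.ativo_id,
data.date,
-- latest reading of the day whose value equals that day's maximum
MAX(m.data_leitura_inicio) AS DateDemForaPonta
FROM(
SELECT
SUM(case when tarifacao = 'ForaPonta' then canal_c1 else 0 end) as ConsForaPonta,
MAX(case when tarifacao = 'ForaPonta' then canal_c1 else 0 end) as DemForaPonta,
ativo_id as ativo_id,
data_leitura_inicio::date as date
FROM
medicao
WHERE
medicao.ativo_id in (45) AND
medicao.tipo_leitura = 'Consumo' AND
medicao.data_leitura_inicio >= '2017-01-01' AND
medicao.data_leitura_inicio < '2017-01-10'
GROUP BY
medicao.ativo_id,
medicao.data_leitura_inicio::date
)data
-- join back to the base table: the derived table does not expose tarifacao, canal_c1 or data_leitura_inicio
LEFT JOIN medicao m ON
m.ativo_id = data.ativo_id AND
m.data_leitura_inicio::date = data.date AND
m.tipo_leitura = 'Consumo' AND
m.tarifacao = 'ForaPonta' AND
m.canal_c1 = data.DemForaPonta
GROUP BY
data.ConsForaPonta,
data.DemForaPonta,
data.ativo_id,
data.date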
Since I don't have data to test it, I don't really know if it does the trick, but it should.
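One way to check the nested-select idea against data is to aggregate per day in a derived table and then join that result back to the base table to find when the maximum occurred. The sketch below assumes Python's built-in sqlite3 in place of PostgreSQL (so `date(...)` replaces the `::date` cast), uses made-up readings, and names the day column `dia`, which is a hypothetical alias.

```python
import sqlite3

# SQLite stands in for PostgreSQL; the readings are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE medicao (ativo_id INTEGER, tarifacao TEXT, "
    "canal_c1 REAL, data_leitura_inicio TEXT)"
)
conn.executemany(
    "INSERT INTO medicao VALUES (?, ?, ?, ?)",
    [
        (45, "ForaPonta", 5.0, "2017-01-01 00:15:00"),
        (45, "ForaPonta", 8.0, "2017-01-01 10:30:00"),
        (45, "Ponta", 9.0, "2017-01-01 18:00:00"),
        (45, "ForaPonta", 3.0, "2017-01-02 07:45:00"),
    ],
)

sql = """
SELECT d.dia, d.ConsForaPonta, d.DemForaPonta,
       MAX(m.data_leitura_inicio) AS DateDemForaPonta
FROM (
    -- inner level: one row per asset and day with the sum and the maximum
    SELECT ativo_id,
           date(data_leitura_inicio) AS dia,
           SUM(CASE WHEN tarifacao = 'ForaPonta' THEN canal_c1 ELSE 0 END) AS ConsForaPonta,
           MAX(CASE WHEN tarifacao = 'ForaPonta' THEN canal_c1 ELSE 0 END) AS DemForaPonta
    FROM medicao
    GROUP BY ativo_id, date(data_leitura_inicio)
) AS d
-- outer level: find the reading(s) that produced that day's maximum
LEFT JOIN medicao AS m
       ON m.ativo_id = d.ativo_id
      AND date(m.data_leitura_inicio) = d.dia
      AND m.tarifacao = 'ForaPonta'
      AND m.canal_c1 = d.DemForaPonta
GROUP BY d.ativo_id, d.dia, d.ConsForaPonta, d.DemForaPonta
ORDER BY d.dia
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The aggregate is never nested here: the maximum is computed first, and only afterwards compared with individual readings. If several readings tie for the maximum, the outer MAX keeps the latest timestamp.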

How can I insert data into a temporary table

Is there a way to insert the results of this query into a temporary table in SSMS? I have tried a number of ways but have failed so far.
Or is there another way? All I am looking for is to query the results of this and join them to another table, which I can't seem to do in the query itself.
USE CommDB
;With CTE AS (SELECT s.attendanceNumber,
MAX(CASE WHEN s.rnk = 1 THEN s.ExamExaminationCode END) as examCode1,
MAX(CASE WHEN s.rnk = 2 THEN s.ExamExaminationCode END) as examCode2,
MAX(CASE WHEN s.rnk = 3 THEN s.ExamExaminationCode END) as examCode3,
MAX(CASE WHEN s.rnk = 4 THEN s.ExamExaminationCode END) as examCode4,
MAX(CASE WHEN s.rnk = 5 THEN s.ExamExaminationCode END) as examCode5,
MAX(CASE WHEN s.rnk = 6 THEN s.ExamExaminationCode END) as examCode6,
MAX(CASE WHEN s.rnk = 7 THEN s.ExamExaminationCode END) as examCode7,
MAX(CASE WHEN s.rnk = 8 THEN s.ExamExaminationCode END) as examCode8,
MAX(CASE WHEN s.rnk = 9 THEN s.ExamExaminationCode END) as examCode9,
MAX(CASE WHEN s.rnk = 10 THEN s.ExamExaminationCode END) as examCode10,
MAX(CASE WHEN s.rnk = 11 THEN s.ExamExaminationCode END) as examCode11,
MAX(CASE WHEN s.rnk = 12 THEN s.ExamExaminationCode END) as examCode12,
MAX(CASE WHEN s.rnk = 13 THEN s.ExamExaminationCode END) as examCode13,
MAX(CASE WHEN s.rnk = 14 THEN s.ExamExaminationCode END) as examCode14
FROM (
SELECT [AttendanceNumber]
,[ExaminationDate]
,RadiologyID
,[ExamExaminationCode]
,ROW_NUMBER() OVER(PARTITION BY [AttendanceNumber]
ORDER BY [ExamExaminationCode]) as rnk --Ordered ASC so examcodes dont move
FROM [CommDB].[dbo].[tblRadiologyData] rd
where rd.ExaminationDate >= '01 october 2015'
and rd.AttendanceSiteCode IN('CNM','RNM') ) s
GROUP BY s.attendanceNumber)
Select c.examCode1,
c.examCode2,
c.examCode3,
c.examCode4,
c.examCode5,
c.examCode6,
c.examCode7,
c.examCode8,
c.examCode9,
c.examCode10,
c.examCode11,
c.examCode12,
c.examCode13,
c.examCode14,
c.AttendanceNumber,
COUNT(c.AttendanceNumber) as [No of occurances]
-- (Select lu.HRGCode from CommDB.dbo.tblRadiologyNucMedLookup lu
-- Where ISNULL(c.examCode1,'') = ISNULL(lu.Exam01,'')
-- )
from CTE c
GROUP by c.examCode1,
c.examCode2,
C.examCode3,
C.examCode4,
C.examCode5,
C.examCode6,
C.examCode7,
C.examCode8,
C.examCode9,
C.examCode10,
C.examCode11,
C.examCode12,
C.examCode13,
C.examCode14,
C.AttendanceNumber
ORDER BY C.examCode1
With a pivot statement your life will be much easier - you don't need the CTE.
As for your question - there are several types of temporary tables:
1. global temporary table - prefixed with ## (you can access this table from other sessions)
2. session temporary table - prefixed with # (you can access this table only from the session that created it)
3. table variable - must be declared and is prefixed with @ (you can access this table only within one batch)
4. table in tempdb - an ordinary table, but it is deleted after administration tasks (service restart)
Other options:
- you can use a view or a TVF instead of temporary tables (that is another way)
- you can make use of the potential of CTEs (most people use them only as views)
Types 1, 2 and 4 can be created as a standard table (with a CREATE statement or with SELECT ... INTO).
Type 3 has to be declared as a variable.
Example of session temporary table:
create table #TmpTable (id int, col1 varchar(5), col2 varchar(5))
insert into #TmpTable (id, col1, col2) select 1, 'KSDFA', 'ASDAS' -- from tbl
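The same idea applies to a whole query: store its result in a temporary table once, then join that table like any other. In SQL Server the usual form is `SELECT ... INTO #TmpTable FROM ...` (with a CTE, the INTO goes in the final SELECT). The sketch below only illustrates the pattern with Python's built-in sqlite3, where the equivalent is `CREATE TEMP TABLE ... AS SELECT`; the tables `exams` and `lookup` and their rows are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server; table names and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exams (attendance TEXT, exam_code TEXT);
INSERT INTO exams VALUES ('A1', 'X10'), ('A1', 'X20'), ('A2', 'X10');
CREATE TABLE lookup (exam_code TEXT, hrg_code TEXT);
INSERT INTO lookup VALUES ('X10', 'HRG-1'), ('X20', 'HRG-2');

-- Materialise the aggregated result once in a temporary table.
CREATE TEMP TABLE tmp_first_exam AS
SELECT attendance, MIN(exam_code) AS exam_code1
FROM exams
GROUP BY attendance;
""")

# The temporary table can now be joined like a regular table.
rows = conn.execute("""
SELECT t.attendance, t.exam_code1, l.hrg_code
FROM tmp_first_exam AS t
LEFT JOIN lookup AS l ON l.exam_code = t.exam_code1
ORDER BY t.attendance
""").fetchall()
for row in rows:
    print(row)
```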

how to select top 10 without duplicates

Using SQL Server 2012
I need to select the TOP 10 Producers based on ProducerCode. But the data is messed up: users entered the same Producers spelled differently, with the same ProducerCode.
I just need the TOP 10, so if a ProducerCode repeats, I just want to pick the first one in the list.
How can I achieve that?
Sample of my data
;WITH cte_TopWP --T
AS
(
SELECT distinct ProducerCode, Producer,SUM(premium) as NetWrittenPremium,
SUM(CASE WHEN PolicyType = 'New Business' THEN Premium ELSE 0 END) as NewBusiness1,
SUM(CASE WHEN PolicyType = 'Renewal' THEN Premium ELSE 0 END) as Renewal1,
SUM(CASE WHEN PolicyType = 'Rewrite' THEN Premium ELSE 0 END) as Rewrite1
FROM ProductionReportMetrics
WHERE YEAR(EffectiveDate) = 2016 AND TransactionType = 'Policy' AND CompanyLine = 'Arch Insurance Company'--AND ProducerType = 'Wholesaler'
GROUP BY ProducerCode,Producer
)
,
cte_Counts --C
AS
(
SELECT distinct ProducerCode, ProducerName, COUNT (distinct ControlNo) as Submissions2,
SUM(CASE WHEN QuotedPremium IS NOT NULL THEN 1 ELSE 0 END) as Quoted2,
SUM(CASE WHEN Type = 'New Business' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as NewBusiness2,
SUM(CASE WHEN Type = 'Renewal' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as Renewal2,
SUM(CASE WHEN Type = 'Rewrite' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as Rewrite2,
SUM(CASE WHEN Status = 'Declined' THEN 1 ELSE 0 END ) as Declined2
FROM ClearanceReportMetrics
WHERE YEAR(EffectiveDate)=2016 AND CompanyLine = 'Arch Insurance Company'
GROUP BY ProducerCode,ProducerName
)
SELECT top 10 RANK() OVER (ORDER BY NetWrittenPremium desc) as Rank,
t.ProducerCode,
c.ProducerName as 'Producer',
NetWrittenPremium,
t.NewBusiness1,
t.Renewal1,
t.Rewrite1,
c.[NewBusiness2]+c.[Renewal2]+c.[Rewrite2] as PolicyCount,
c.Submissions2,
c.Quoted2,
c.[NewBusiness2],
c.Renewal2,
c.Rewrite2,
c.Declined2
FROM cte_TopWP t --LEFT OUTER JOIN tblProducers p on t.ProducerCode=p.ProducerCode
LEFT OUTER JOIN cte_Counts c ON t.ProducerCode=c.ProducerCode
You should use ROW_NUMBER to fix your issue.
https://msdn.microsoft.com/en-us/library/ms186734.aspx
A good example of this is the following answer:
https://dba.stackexchange.com/a/22198
Here's the code example from the answer.
SELECT * FROM
(
SELECT acss_lookup.ID AS acss_lookupID,
ROW_NUMBER() OVER
(PARTITION BY your_distinct_column ORDER BY any_column_you_think_is_appropriate)
as num,
acss_lookup.product_lookupID AS acssproduct_lookupID,
acss_lookup.region_lookupID AS acssregion_lookupID,
acss_lookup.document_lookupID AS acssdocument_lookupID,
product.ID AS product_ID,
product.parent_productID AS productparent_product_ID,
product.label AS product_label,
product.displayheading AS product_displayheading,
product.displayorder AS product_displayorder,
product.display AS product_display,
product.ignorenewupdate AS product_ignorenewupdate,
product.directlink AS product_directlink,
product.directlinkURL AS product_directlinkURL,
product.shortdescription AS product_shortdescription,
product.logo AS product_logo,
product.thumbnail AS product_thumbnail,
product.content AS product_content,
product.pdf AS product_pdf,
product.language_lookupID AS product_language_lookupID,
document.ID AS document_ID,
document.shortdescription AS document_shortdescription,
document.language_lookupID AS document_language_lookupID,
document.document_note AS document_document_note,
document.displayheading AS document_displayheading
FROM acss_lookup
INNER JOIN product ON (acss_lookup.product_lookupID = product.ID)
INNER JOIN document ON (acss_lookup.document_lookupID = document.ID)
)a
WHERE a.num = 1
ORDER BY product_displayheading ASC;
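Applied to the producer data, the ROW_NUMBER pattern could look like the sketch below. It assumes Python's built-in sqlite3 (SQLite 3.25 or later, for window functions) as a stand-in for SQL Server, with `LIMIT 10` playing the role of `TOP 10`; the `producers` table and its rows are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server; the table and its rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE producers (ProducerCode TEXT, Producer TEXT, Premium REAL);
INSERT INTO producers VALUES
    ('P1', 'Acme Insurance', 100.0),
    ('P1', 'ACME Ins.',       50.0),
    ('P2', 'Beta Brokers',    70.0);
""")

rows = conn.execute("""
SELECT ProducerCode, Producer, TotalPremium
FROM (
    SELECT ProducerCode,
           Producer,
           -- total per code, regardless of how the name was spelled
           SUM(Premium) OVER (PARTITION BY ProducerCode) AS TotalPremium,
           -- number the spellings within each code; 1 marks the one to keep
           ROW_NUMBER() OVER (PARTITION BY ProducerCode ORDER BY Producer) AS num
    FROM producers
) AS a
WHERE a.num = 1
ORDER BY TotalPremium DESC
LIMIT 10
""").fetchall()
for row in rows:
    print(row)
```

Which spelling survives depends entirely on the ORDER BY inside ROW_NUMBER; here it is simply the alphabetically first name under SQLite's default binary collation.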
You could do this:
SELECT ProducerCode, MIN(Producer) AS Producer, ...
GROUP BY ProducerCode

Help with difficult 'group by' clause

I need some help with a query.
I have a table Managers (ManagerId, ManagerName)
I have a table Statuses (StatusId, StatusName)
(There are about 10 statuses in that table)
I have a table Clients (ClientId, ClientName, ManagerId, StatusId, WhenAdded)
(WhenAdded is a datetime type)
It's obvious that field 'ManagerId' refers to a table 'Managers' and field 'StatusId' refers to a table 'Statuses'.
User wants to get some statistics about Managers over a period of time (from startDate to endDate using field 'WhenAdded') in the following table.
Columns:
ManagerName, NumberOfClients, NumberOfClientsWithStatus1, NumberOfClientsWithStatus2, NumberOfClientsWithStatus3 and so on.
The number of columns named NumberOfClientsWithStatusI (where I is a status number) equals the number of rows in table 'Statuses'.
How can I do that?
t-sql, sql server 2008 r2 express edition.
SELECT
ManagerName,
COUNT(*) AS NumberOfClients,
COUNT(CASE WHEN S.StatusId = 1 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus1,
COUNT(CASE WHEN S.StatusId = 2 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus2,
COUNT(CASE WHEN S.StatusId = 3 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus3,
...
FROM
Clients C
JOIN
Managers M ON C.ManagerId = M.ManagerId
JOIN
Statuses S ON C.StatusId = S.StatusId
WHERE
C.WhenAdded BETWEEN @startDate AND @endDate
GROUP BY
M.ManagerName
Note: there is no clean way to add an arbitrary number of status columns in SQL (not just SQL Server) because a query has a fixed set of output columns. You'd have to change the query whenever a status is added, unless you deal with this in the client.
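Dealing with it in the client could mean reading the status ids first and generating one COUNT(CASE ...) column per status before running the query. This is only a sketch of that idea, assuming Python's built-in sqlite3 in place of SQL Server; the sample rows are hypothetical and the date filter is left out for brevity.

```python
import sqlite3

# SQLite stands in for SQL Server; all rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Statuses (StatusId INTEGER, StatusName TEXT);
INSERT INTO Statuses VALUES (1, 'New'), (2, 'Active'), (3, 'Closed');
CREATE TABLE Managers (ManagerId INTEGER, ManagerName TEXT);
INSERT INTO Managers VALUES (1, 'Ann'), (2, 'Bob');
CREATE TABLE Clients (ClientId INTEGER, ManagerId INTEGER, StatusId INTEGER);
INSERT INTO Clients VALUES (1, 1, 1), (2, 1, 3), (3, 1, 3), (4, 2, 2);
""")

# Step 1: read the status ids that currently exist.
status_ids = [
    row[0] for row in conn.execute("SELECT StatusId FROM Statuses ORDER BY StatusId")
]

# Step 2: generate one conditional COUNT per status (int() keeps the text safe).
columns = ",\n       ".join(
    f"COUNT(CASE WHEN C.StatusId = {int(sid)} THEN 1 END) "
    f"AS NumberOfClientsWithStatus{int(sid)}"
    for sid in status_ids
)

# Step 3: run the generated query.
sql = f"""
SELECT M.ManagerName,
       COUNT(*) AS NumberOfClients,
       {columns}
FROM Clients C
JOIN Managers M ON C.ManagerId = M.ManagerId
GROUP BY M.ManagerName
ORDER BY M.ManagerName
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Adding a row to Statuses then adds a column to the result without anyone editing the query text by hand.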
Edit, after comment
SELECT
M.ManagerName,
COUNT(C.ClientId) AS NumberOfClients, -- counts 0 for managers without clients
COUNT(CASE WHEN S.StatusId = 1 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus1,
COUNT(CASE WHEN S.StatusId = 2 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus2,
COUNT(CASE WHEN S.StatusId = 3 THEN 1 ELSE NULL END) AS NumberOfClientsWithStatus3,
...
FROM
Managers M
LEFT JOIN
Clients C ON C.ManagerId = M.ManagerId
AND C.WhenAdded BETWEEN @startDate AND @endDate -- filter in ON so managers without clients are kept
LEFT JOIN
Statuses S ON C.StatusId = S.StatusId
GROUP BY
M.ManagerName
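When managers without clients in the period should still be listed, it matters whether the date filter sits in the join condition or in the WHERE clause. The sketch below compares the two placements. It assumes Python's built-in sqlite3 as a stand-in for SQL Server, with `:startDate` / `:endDate` as named parameters and hypothetical sample rows.

```python
import sqlite3

# SQLite stands in for SQL Server; all rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Managers (ManagerId INTEGER, ManagerName TEXT);
INSERT INTO Managers VALUES (1, 'Ann'), (2, 'Bob');
CREATE TABLE Clients (ClientId INTEGER, ManagerId INTEGER, StatusId INTEGER, WhenAdded TEXT);
INSERT INTO Clients VALUES
    (1, 1, 1, '2011-01-10'),
    (2, 1, 2, '2011-03-05'),
    (3, 2, 1, '2010-12-31');
""")
params = {"startDate": "2011-01-01", "endDate": "2011-12-31"}

# Filter inside the join condition: every manager survives the LEFT JOIN.
filter_in_on = conn.execute("""
SELECT M.ManagerName, COUNT(C.ClientId) AS NumberOfClients
FROM Managers M
LEFT JOIN Clients C
       ON C.ManagerId = M.ManagerId
      AND C.WhenAdded BETWEEN :startDate AND :endDate
GROUP BY M.ManagerName
ORDER BY M.ManagerName
""", params).fetchall()

# Filter in WHERE: rows with no matching client are discarded after the join.
filter_in_where = conn.execute("""
SELECT M.ManagerName, COUNT(C.ClientId) AS NumberOfClients
FROM Managers M
LEFT JOIN Clients C ON C.ManagerId = M.ManagerId
WHERE C.WhenAdded BETWEEN :startDate AND :endDate
GROUP BY M.ManagerName
ORDER BY M.ManagerName
""", params).fetchall()

print(filter_in_on)
print(filter_in_where)
```

Bob's only client falls outside the period, so he keeps a row with a count of 0 in the first query and disappears from the second.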
If you know that statuses table will always contain a limited number of statuses, you can do this:
SELECT M.ManagerName,
COUNT(C.ClientId) NumberOfClients,
SUM(CASE WHEN S.StatusId= 1 THEN 1 ELSE 0 END) NumberOfClientsWithStatus1,
SUM(CASE WHEN S.StatusId= 2 THEN 1 ELSE 0 END) NumberOfClientsWithStatus2,
...
FROM Clients C
JOIN Managers M on M.ManagerId = C.ManagerId
JOIN Statuses S on S.StatusId = C.StatusId
WHERE C.WhenAdded BETWEEN @startDate AND @endDate
GROUP BY ManagerName