Postgres Crosstab on double columns with unknown value - postgresql

I have a table like this in my Postgres v10 database:
CREATE TABLE t1(id integer primary key, ref integer,v_id integer,total numeric, year varchar, total_lastyear numeric,lastyear varchar ) ;
INSERT INTO t1 VALUES
(1, 2077,15,10000,2020,9000,2019),
(2, 2000,13,190000,2020,189000,2019),
(3, 2065,11,10000,2020,10000,2019),
(4, 1999,14,2300,2020,9000,2019);
select * from t1;
id ref v_id total year total_lastyear lastyear
1 2077 15 10000 2020 9000 2019
2 2000 13 190000 2020 189000 2019
3 2065 11 10000 2020 10000 2019
4 1999 14 2300 2020 9000 2019
Now I want to pivot this table so that 2020 and 2019 become columns, with the total amounts as values.
My problems:
I don't know how to pivot two columns in the same query. Is that even possible, or does it take two steps?
The years 2020 and 2019 are dynamic and can change from one day to the next. The year inside each column is the same on every row.
So basically I need to store the years from lastyear and year in some variable and pass them to the crosstab query.
This is as far as I got on my own, but I only managed to pivot one year, and the years 2019 and 2020 are hardcoded.

You can pivot one column at a time, each in its own CTE (WITH), and then join the results.
WITH xd1 AS (
SELECT * FROM crosstab('SELECT ref,v_id,year,total FROM t1 ORDER BY 1,3',
'SELECT DISTINCT year FROM t1 ORDER BY 1') AS ct1(ref int,v_id int,"2020" int)
), xd2 AS (
SELECT * FROM crosstab('SELECT ref,v_id,lastyear,total_lastyear FROM t1 ORDER BY 1,3',
'SELECT DISTINCT lastyear FROM t1 ORDER BY 1') AS ct2(ref int,v_id int,"2019" int)
)
SELECT xd1.ref,xd1.v_id,xd1."2020",xxx."2019"
FROM xd1
LEFT JOIN xd2 AS xxx ON xxx.ref = xd1.ref AND xxx.v_id = xd1.v_id;
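This does not guard against year and lastyear holding the same value, in which case the two output columns would collide.
You still need to know in advance which years the query will return, because crosstab returns a record whose column list you must define in the AS clause.
You could wrap the statement in EXECUTE format() to build that column list dynamically, at the cost of some string handling.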
This issue was mentioned here.
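Because the question states that year and lastyear hold the same value on every row, one alternative to crosstab is simply to alias the two total columns with those values. The sketch below builds that statement on the client side in Python rather than with EXECUTE format(). The function names quote_ident and build_pivot_sql are hypothetical, and the two year values are assumed to have been read beforehand (for example with SELECT DISTINCT year, lastyear FROM t1).

```python
def quote_ident(name: str) -> str:
    """Quote a value for use as a SQL identifier, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_pivot_sql(year: str, lastyear: str) -> str:
    """Build a SELECT that exposes the totals under year-named columns.

    Assumes every row of t1 carries the same year / lastyear values,
    as stated in the question.
    """
    if year == lastyear:
        # Two output columns with the same name would collide.
        raise ValueError("year and lastyear must differ")
    return (
        "SELECT ref, v_id, "
        f"total AS {quote_ident(year)}, "
        f"total_lastyear AS {quote_ident(lastyear)} "
        "FROM t1 ORDER BY ref"
    )


sql = build_pivot_sql("2020", "2019")
print(sql)
```

The generated text can then be sent to the server like any other query. Quoting the year as an identifier is what lets a value such as 2020 serve as a column name.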

Related

convert values from two columns into one new column

I have two columns: year and month:
Year Month
2017 01
2017 02
2018 12
2019 06
2020 07
With
select to_date(concat(Year, Month), 'YYYYMM') csv_date FROM my_table;
I can get just one column with date datatype.
How can I add this column in my table, to get this:
Year Month csv_date
2017 01 2017-01-00
2017 02 2017-02-00
2018 12 2018-12-00
2019 06 2019-06-00
2020 07 2020-07-00
You cannot have a column of type date that contains 00 for the day: that is not a valid date, and Postgres will reject it. The suggested approach of concatenating the two values works if year and month are string-type columns, but the resulting date will have 01 as the day. If the columns are numeric (integer), you can use the make_date function instead.
with my_table(tyr, tmo, nyr,nmo) as
( values ('2020', '04', 2020, 04 ) )
select to_date(concat(tyr, tmo), 'YYYYMM') txt_date
, make_date(nyr,nmo,01) num_date
from my_table;
That said, if you have a date column, you can use the to_char function to get just the year and month and, if you must, append '-00':
with my_table (adate) as
( values ( date '2020-04-01') )
select adate, to_char(adate,'yyyy-mm') || '-00' as yyyymm
from my_table;
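The same distinction can be illustrated outside the database. The Python sketch below is only an analogy, not Postgres behaviour: a real date value needs a day of at least 1, while a label ending in -00 is just a string. The helper names first_of_month and year_month_label are hypothetical.

```python
from datetime import date


def first_of_month(year: int, month: int) -> date:
    """Counterpart of make_date(year, month, 1): a valid date on day 1."""
    return date(year, month, 1)


def year_month_label(d: date) -> str:
    """Counterpart of to_char(d, 'yyyy-mm') || '-00': a string, not a date."""
    return d.strftime("%Y-%m") + "-00"


d = first_of_month(2020, 4)
label = year_month_label(d)

# A day of 0 is rejected here as well, so '-00' can only live in text.
try:
    date(2020, 4, 0)
    zero_day_allowed = True
except ValueError:
    zero_day_allowed = False

print(d, label, zero_day_allowed)
```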
If you are on v12 or later and want to add the column, you can add it as a generated column. This has the advantage that it cannot be updated independently and is recomputed automatically whenever the source column(s) are updated. See the fiddle for a complete example (here yr and mo are integer year and month columns):
alter table my_table add column csv_date date generated always as (make_date(yr, mo,01)) stored;
Using PostgreSQL Query
If you want to add new column then
alter table my_table add column csv_date date;
update my_table set csv_date=to_date(concat(Year, Month), 'YYYYMM');
If you want only select output then:
select year, month, to_date(concat(Year, Month), 'YYYYMM') csv_date FROM my_table;

Insert subquery date according to day

I would like to insert, via a subquery, a date based on its day of the week. In addition, each date can be used only four times. Once a date has been used four times, the fifth row should use the next date that falls on the same day, in other words the Monday of the following week. For example, move from Monday 6 June 2016 to Monday 13 June 2016 (you may check the calendar).
I have a query of getting a list of date based on presentationdatestart and presentationdateend from presentation table:
select a.presentationid,
a.presentationday,
to_char (a.presentationdatestart + delta, 'DD-MM-YYYY', 'NLS_CALENDAR=GREGORIAN') list_date
from presentation a,
(select level - 1 as delta
from dual
connect by level - 1 <= (select max (presentationdateend - presentationdatestart)
from presentation))
where a.presentationdatestart + delta <= a.presentationdateend
and a.presentationday = to_char(a.presentationdatestart + delta, 'fmDay')
order by a.presentationdatestart + delta,
a.presentationid; --IMPORTANT!!!--
For example,
presentationday presentationdatestart presentationdateend
Monday 01-05-2016 04-06-2016
Tuesday 01-05-2016 04-06-2016
Wednesday 01-05-2016 04-06-2016
Thursday 01-05-2016 04-06-2016
The query result will list all possible dates between 01-05-2016 until 04-06-2016:
Monday 02-05-2016
Tuesday 03-05-2016
Wednesday 04-05-2016
Thursday 05-05-2016
....
Monday 30-05-2016
Tuesday 31-05-2016
Wednesday 01-06-2016
Thursday 02-06-2016 (20 rows)
This is my INSERT query :
insert into CSP600_SCHEDULE (studentID,
studentName,
projectTitle,
supervisorID,
supervisorName,
examinerID,
examinerName,
exavailableID,
availableday,
availablestart,
availableend,
availabledate)
select '2013816591',
'mong',
'abc',
'1004',
'Sue',
'1002',
'hazlifah',
2,
'Monday', -- BASED ON THIS DAY
'12:00:00',
'2:00:00',
to_char (a.presentationdatestart + delta, 'DD-MM-YYYY', 'NLS_CALENDAR=GREGORIAN') list_date -- FOR AVAILABLEDATE
from presentation a,
(select level - 1 as delta
from dual
connect by level - 1 <= (select max (presentationdateend - presentationdatestart)
from presentation))
where a.presentationdatestart + delta <= a.presentationdateend
and a.presentationday = to_char(a.presentationdatestart + delta, 'fmDay')
order by a.presentationdatestart + delta,
a.presentationid;
This query successfully inserted 20 rows, because there were 20 possible dates. I would like to modify the query so that it inserts based on availableDay and each date is used at most four times, each time for a different studentID.
Possible outcome in CSP600_SCHEDULE (I am removing unrelated columns to ease readability):
StudentID StudentName availableDay availableDate
2013 abc Monday 01-05-2016
2014 def Monday 01-05-2016
2015 ghi Monday 01-05-2016
2016 klm Monday 01-05-2016
2010 nop Tuesday 02-05-2016
2017 qrs Tuesday 02-05-2016
2018 tuv Tuesday 02-05-2016
2019 wxy Tuesday 02-05-2016
.....
2039 rrr Monday 09-05-2016
.....
You may check the calendar :)
I think what you're asking for is to list your students and then batch them up in groups of 4 - each batch is then allocated to a date. Is that right?
In which case something like this should work (I'm using a list of tables as the student names just so I don't need to insert any data into a custom table) :
WITH students AS
(SELECT table_name
FROM all_tables
WHERE rownum < 100
)
SELECT
table_name
,SYSDATE + (CEIL(rownum/4) -1)
FROM
students
;
I hope that helps you
...okay, following your comments, I think this might be a better solution :
WITH students AS
(SELECT table_name student_name
FROM all_tables
WHERE rownum < 100
)
, dates AS
(SELECT TRUNC(sysdate) appointment_date from dual UNION
SELECT TRUNC(sysdate+2) from dual UNION
SELECT TRUNC(sysdate+4) from dual UNION
SELECT TRUNC(sysdate+6) from dual UNION
SELECT TRUNC(sysdate+8) from dual UNION
SELECT TRUNC(sysdate+10) from dual UNION
SELECT TRUNC(sysdate+12) from dual UNION
SELECT TRUNC(sysdate+14) from dual
)
SELECT
s.student_name
,d.appointment_date
FROM
--get a list of students each with a sequential row number, ordered by student name
(SELECT
student_name
,ROW_NUMBER() OVER (ORDER BY student_name) rn
FROM students
) s
--get a list of available dates with a sequential row number, ordered by date
,(SELECT
appointment_date
,ROW_NUMBER() OVER (ORDER BY appointment_date) rn
FROM dates
) d
WHERE 1=1
--allocate the first four students to date rownumber1, next four students to date rownumber 2...
AND CEIL(s.rn/4) = d.rn
;
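The allocation rule in the last query (CEIL(s.rn/4) = d.rn) can be checked with a small Python sketch. The function name allocate, the student labels and the three sample Mondays are made up for illustration. As in the join, students for whom no date is left are simply dropped.

```python
import math
from datetime import date, timedelta


def allocate(students, dates, per_date=4):
    """Pair students with dates so that each date takes at most per_date students."""
    ordered_students = sorted(students)
    ordered_dates = sorted(dates)
    schedule = []
    for rn, student in enumerate(ordered_students, start=1):
        slot = math.ceil(rn / per_date)  # same rule as CEIL(s.rn/4) = d.rn
        if slot <= len(ordered_dates):   # no matching date row: student is left out
            schedule.append((student, ordered_dates[slot - 1]))
    return schedule


# Three consecutive Mondays starting 2 May 2016 (sample data only).
mondays = [date(2016, 5, 2) + timedelta(weeks=w) for w in range(3)]
students = [f"S{n:02d}" for n in range(1, 10)]

schedule = allocate(students, mondays)
for student, day in schedule:
    print(student, day.strftime("%d-%m-%Y"))
```

Students 1 to 4 land on the first Monday, 5 to 8 on the second, and the ninth on the third.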

What is the best way to get this TSQL Pivot to work [duplicate]

I need to do the following transpose in MS SQL
from:
Day A B
---------
Mon 1 2
Tue 3 4
Wed 5 6
Thu 7 8
Fri 9 0
To the following:
Value Mon Tue Wed Thu Fri
--------------------------
A 1 3 5 7 9
B 2 4 6 8 0
I understand how to do it with PIVOT when there is only one column (A) but I can not figure out how to do it when there are multiple columns to transpose (A,B,...)
Example code to be transposed:
select LEFT(datename(dw,datetime),3) as DateWeek,
sum(ACalls) as A,
Sum(BCalls) as B
from DataTable
group by LEFT(datename(dw,datetime),3)
Table Structure:
Column DataType
DateTime Datetime
ACalls int
BCalls int
Any help will be much appreciated.
In order to transpose the data into the result that you want, you will need to use both the UNPIVOT and the PIVOT functions.
The UNPIVOT function takes the A and B columns and converts the results into rows. Then you will use the PIVOT function to transform the day values into columns:
select *
from
(
select day, col, value
from yourtable
unpivot
(
value
for col in (A, B)
) unpiv
) src
pivot
(
max(value)
for day in (Mon, Tue, Wed, Thu, Fri)
) piv
See SQL Fiddle with Demo.
If you are using SQL Server 2008+, then you can use CROSS APPLY with VALUES to unpivot the data. Your code would be changed to the following:
select *
from
(
select day, col, value
from yourtable
cross apply
(
values ('A', A),('B', B)
) c (col, value)
) src
pivot
(
max(value)
for day in (Mon, Tue, Wed, Thu, Fri)
) piv
See SQL Fiddle with Demo.
Edit #1: applying the above solution to your current query, you would use something similar to this:
select *
from
(
select LEFT(datename(dw,datetime),3) as DateWeek,
col,
value
from DataTable
cross apply
(
values ('A', ACalls), ('B', BCalls)
) c (col, value)
) src
pivot
(
sum(value)
for dateweek in (Mon, Tue, Wed, Thu, Fri)
) piv
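To make the two stages concrete, here is a small Python sketch of the same unpivot-then-pivot transformation applied to the sample data from the question. It only illustrates the data flow and is not a replacement for the T-SQL. The variable names are hypothetical. Since each (day, column) pair occurs exactly once, no aggregate is needed.

```python
# Sample rows from the question: one row per day, columns A and B.
rows = [
    {"Day": "Mon", "A": 1, "B": 2},
    {"Day": "Tue", "A": 3, "B": 4},
    {"Day": "Wed", "A": 5, "B": 6},
    {"Day": "Thu", "A": 7, "B": 8},
    {"Day": "Fri", "A": 9, "B": 0},
]

# Stage 1 (unpivot): turn the A and B columns into (day, col, value) rows.
long_rows = [(r["Day"], col, r[col]) for r in rows for col in ("A", "B")]

# Stage 2 (pivot): one output row per col, one output column per day.
pivoted = {}
for day, col, value in long_rows:
    pivoted.setdefault(col, {})[day] = value

days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
for col in sorted(pivoted):
    print(col, *[pivoted[col][d] for d in days])
```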

T-SQL - Data Islands and Gaps - How do I summarise transactional data by month?

I'm trying to query some transactional data to establish the CurrentProductionHours value for each Report at the end of each month.
Providing there has been a transaction for each report in each month, that's pretty straight-forward... I can use something along the lines of the code below to partition transactions by month and then pick out the rows where TransactionByMonth = 1 (effectively, the last transaction for each report each month).
SELECT
ReportId,
TransactionId,
CurrentProductionHours,
ROW_NUMBER() OVER (PARTITION BY [ReportId], [CalendarYear], [MonthOfYear]
ORDER BY TransactionTimestamp desc
) AS TransactionByMonth
FROM
tblSource
The problem that I have is that there will not necessarily be a transaction for every report every month... When that's the case, I need to carry forward the last known CurrentProductionHours value to the month which has no transaction as this indicates that there has been no change. Potentially, this value may need to be carried forward multiple times.
Source Data:
ReportId TransactionTimestamp CurrentProductionHours
1 2014-01-05 13:37:00 14.50
1 2014-01-20 09:15:00 15.00
1 2014-01-21 10:20:00 10.00
2 2014-01-22 09:43:00 22.00
1 2014-02-02 08:50:00 12.00
Target Results:
ReportId Month Year ProductionHours
1 1 2014 10.00
2 1 2014 22.00
1 2 2014 12.00
2 2 2014 22.00
I should also mention that I have a date table available, which can be referenced if required.
** UPDATE 05/03/2014 **
I now have a query which generates the results shown in the example below, but I'm left with islands of data (where a transaction existed in that month) and gaps in between. My question is still similar but in some ways a little more generic: what is the best way to fill the gaps between data islands if you have the dataset below as a starting point?
ReportId Month Year ProductionHours
1 1 2014 10.00
1 2 2014 12.00
1 3 2014 NULL
2 1 2014 22.00
2 2 2014 NULL
2 3 2014 NULL
Any advice about how to tackle this would be greatly appreciated!
Try this:
;with a as
(
select dateadd(m, datediff(m, 0, min(TransactionTimestamp))+1,0) minTransactionTimestamp,
max(TransactionTimestamp) maxTransactionTimestamp from tblSource
), b as
(
select minTransactionTimestamp TT, maxTransactionTimestamp
from a
union all
select dateadd(m, 1, TT), maxTransactionTimestamp
from b
where tt < maxTransactionTimestamp
), c as
(
select distinct t.ReportId, b.TT from tblSource t
cross apply b
)
select c.ReportId,
month(dateadd(m, -1, c.TT)) Month,
year(dateadd(m, -1, c.TT)) Year,
x.CurrentProductionHours
from c
cross apply
(select top 1 CurrentProductionHours from tblSource
where TransactionTimestamp < c.TT
and ReportId = c.ReportId
order by TransactionTimestamp desc) x
A similar approach, but using a Cartesian product in the first step to obtain all combinations of report ids and months.
A second step adds to that Cartesian product the maximum timestamp from the source table where the month is less than or equal to the month in the current row.
Finally, it joins the source table to that result by report id and timestamp to obtain the latest source row for every report id/month.
;
WITH allcombinations -- Cartesian (reportid X yearmonth)
AS ( SELECT reportid ,
yearmonth
FROM ( SELECT DISTINCT
reportid
FROM tblSource
) a
JOIN ( SELECT DISTINCT
DATEPART(yy, transactionTimestamp)
* 100 + DATEPART(MM,
transactionTimestamp) yearmonth
FROM tblSource
) b ON 1 = 1
),
maxdates --add correlated max timestamp where the month is less or equal to the month in current record
AS ( SELECT a.* ,
( SELECT MAX(transactionTimestamp)
FROM tblSource t
WHERE t.reportid = a.reportid
AND DATEPART(yy, t.transactionTimestamp)
* 100 + DATEPART(MM,
t.transactionTimestamp) <= a.yearmonth
) maxtstamp
FROM allcombinations a
)
-- join previous data to the source table by reportid and timestamp
SELECT distinct m.reportid ,
m.yearmonth ,
t.CurrentProductionHours
FROM maxdates m
JOIN tblSource t ON t.transactionTimestamp = m.maxtstamp and t.reportid=m.reportid
ORDER BY m.reportid ,
m.yearmonth
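The three steps can be traced on the question's sample data with a short Python sketch. The function name month_end_values is hypothetical. Like the query above, it only considers months that appear somewhere in the source data.

```python
from datetime import datetime

# Sample transactions from the question: (ReportId, timestamp, hours).
transactions = [
    (1, datetime(2014, 1, 5, 13, 37), 14.50),
    (1, datetime(2014, 1, 20, 9, 15), 15.00),
    (1, datetime(2014, 1, 21, 10, 20), 10.00),
    (2, datetime(2014, 1, 22, 9, 43), 22.00),
    (1, datetime(2014, 2, 2, 8, 50), 12.00),
]


def month_end_values(transactions):
    """Return (ReportId, month, year, hours) for every report and month."""
    # Step 1: all combinations of report ids and (year, month) pairs.
    months = sorted({(ts.year, ts.month) for _, ts, _ in transactions})
    report_ids = sorted({rid for rid, _, _ in transactions})
    results = []
    for year, month in months:
        for rid in report_ids:
            # Step 2: transactions of this report up to and including the month.
            earlier = [
                (ts, hours)
                for r, ts, hours in transactions
                if r == rid and (ts.year, ts.month) <= (year, month)
            ]
            if earlier:
                # Step 3: keep the hours of the latest such transaction.
                _, hours = max(earlier, key=lambda pair: pair[0])
                results.append((rid, month, year, hours))
    return results


results = month_end_values(transactions)
for row in results:
    print(row)
```

On the sample data this reproduces the four target rows, including the value 22.00 carried forward into February for report 2.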

Returning multiple months of data into one select

I have a question on SQL 2008 which is probably quite easy but I can't see the woods for the trees now.
I am trying to produce a sql based report detailing the last six months of helpdesk issue stats, per application, per office, per month which I then take into ssrs to apply prettiness :o)
Anyway - I have my script, which is fine on a month by month basis, for example;
SELECT distinct t.name_1 'Application',
(select distinct name from location where location_ref = c.location_ref) as office,
Count (t.name_1) as [Call Count],
datename(month, dateadd(month,-2,getdate()))+' '+datename(year, dateadd(month,-2,getdate())) as [Report Month]
FROM call_logging C
Inner Join problem_type t On t.ref_composite = c.ref_composite
AND c.resolve_time between convert(datetime,convert(varchar,month(dateadd(m,-2,getdate()))) + '/01/' + convert(varchar,year(dateadd(m,-2,getdate()))))
and convert(datetime,convert(varchar,month(dateadd(m,-1,getdate()))) + '/01/' + convert(varchar,year(getdate())))
and c.resolve_group in ('48', '60')
which brings back all of May's issues.
The problem is that t.name_1 (the application the issue is for) is dynamic: the list grows or shrinks every month.
I basically need a layout of
APPLICATION OFFICE COUNT JUNE MAY APRIL MARCH FEB JAN
WORD LONDON 20 1 1 2 5 10 1
WORD PARIS 10 2 3 1 2 0 3
EXCEL MADRID 05 0 0 3 2 0 0
etc (if that makes sense on this layout!)
I've gone down the 6 separate reports road but it just doesn't look very nice in ssrs. I've thought about #tmptables but they don't like inserting distinct rows.
SELECT [C].[name_1] AS [APPLICATION]
,COUNT([name_1]) AS [CALL COUNT]
,[l].[location_ref]
,[dbo].[ufn_GetDateTime_CalenderYearMonth]([resolve_time]) AS [StartCalenderYearMonth]
FROM [call_logging] [C] INNER JOIN [problem_type] [t]
ON [t].[ref_composite] = [c].[ref_composite]
AND [c].[resolve_group] IN ('48', '60')
INNER JOIN [location] [l] ON [c].[location_ref] = [l].[location_ref]
WHERE [C].[resolve_time] BETWEEN '2011-01-01' AND GETDATE()
GROUP BY [C].[name_1], [l].[location_ref], [dbo].[ufn_GetDateTime_CalenderYearMonth]([resolve_time])
And the code for ufn_GetDateTime_CalenderYearMonth is:
CREATE FUNCTION [dbo].[ufn_GetDateTime_CalenderYearMonth] (@DateTime datetime)
RETURNS varchar(20)
AS
BEGIN
declare @dateString varchar(20)
declare @yearString varchar(10)
declare @monthString varchar(10)
set @yearString = cast( DATEPART(year, @DateTime) as varchar(10))
if(DATEPART(month, @DateTime) < 10)
set @monthString = '0' + cast( DATEPART(month, @DateTime) as varchar(5) )
else
set @monthString = cast( DATEPART(month, @DateTime) as varchar(5) )
set @dateString = @yearString + '-' + @monthString
RETURN (@dateString)
END
You just slap the result set into a matrix and group everything by [StartCalenderYearMonth], and it will show the numbers for each month from 1 January 2011 until now.
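For reference, the label produced by ufn_GetDateTime_CalenderYearMonth is the year, a dash, and the zero-padded month. A minimal Python sketch of that formatting follows. The name calendar_year_month is hypothetical, and the sketch shows only the expected labels, not the T-SQL implementation.

```python
from datetime import datetime


def calendar_year_month(value: datetime) -> str:
    """Format a timestamp as 'YYYY-MM', padding single-digit months with 0."""
    return f"{value.year}-{value.month:02d}"


may_label = calendar_year_month(datetime(2011, 5, 17, 9, 30))
nov_label = calendar_year_month(datetime(2011, 11, 2))
print(may_label, nov_label)
```

Because the month is zero-padded, the labels sort chronologically as plain strings, which keeps the matrix columns in calendar order.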