Qlik Sense - Displaying data by time series aggregated by a given time period - qliksense

I have data in the following format in Qlik Sense:
Date Customer Flag
2016-10-01 A 1
2016-10-01 B 0
2016-10-02 A 1
2016-10-02 C 1
2016-10-03 A 1
2016-10-03 B 1
2016-10-03 C 1
2016-10-05 C 1
2016-10-10 A 0
2016-10-10 B 1
2016-10-11 C 0
I would like to display this data in a table in Qlik Sense in the following format:
#Week Count Distinct Customer Where Flag is 1
39 2
40 3
41 3
Logic behind: I need a week-wise count of DISTINCT Customers who have Flag = 1 in that week or in the previous week. So Week 40 will display the distinct count for Weeks 39 and 40, Week 41 will display the distinct count for Weeks 40 and 41, and so on.
I would appreciate it if anyone could suggest a Qlik Sense expression for the above.

It seems to me that this should be solved at the script level instead of overcomplicating your expression/GUI (if it is possible there at all).
The script below will create an additional table, Dates, which contains the Date --> Week link. Week can then be used as a normal dimension.
RawData:
Load * Inline [
Date , Customer, Flag
2016-10-01, A , 1
2016-10-01, B , 0
2016-10-02, A , 1
2016-10-02, C , 1
2016-10-03, A , 1
2016-10-03, B , 1
2016-10-03, C , 1
2016-10-05, C , 1
2016-10-10, A , 0
2016-10-10, B , 1
2016-10-11, C , 0
];
// Generate week number from the existing dates
Dates_Temp:
Load
Date,
Week(Date) as Week
Resident
RawData
;
Concatenate
// Also link each date to the following week, so that week N
// picks up the dates of week N-1 as well
Load
Date,
Week(Date + 7) as Week // Week number of the following week
Resident
RawData
;
// The following code will remove week 42
// If week 42 needs to be visible just ignore/delete
// the script below --->
// Find the max week from the generated weeks
MaxWeek:
Load
max(Week) as maxWeek
Resident
Dates_Temp
;
// Load the max week into the vMaxWeek variable
let vMaxWeek = peek('maxWeek');
// This table is not needed anymore
Drop Table MaxWeek;
NoConcatenate
// The new Dates table will contain all weeks apart from week 42
Dates:
Load
*
Resident
Dates_Temp
Where
Week <> $(vMaxWeek)
;
Drop Table Dates_Temp;
let vMaxWeek = null();
After executing the script the data structure will look like this:
And Dates table will contain the following data:
As you can see, each date is associated with two week numbers. (Only the dates in week 41 have a single record, because week 42 is removed from the data; not sure if that is needed.)
So after this the expression is very simple:
= count( {< Flag = {1} >} distinct Customer)
And the result is:
P.S. the screenshots are from QlikView but the same load code and expression can be used in QS as well
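As a cross-check of the date-to-week link logic outside Qlik, here is a minimal Python sketch that mirrors the script: each date is linked to its own week and to the following week, the highest generated week is dropped, and distinct customers with Flag = 1 are counted per week. It assumes ISO week numbering, which may differ from your Qlik week settings, and the function name rolling_two_week_counts is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question: (Date, Customer, Flag)
ROWS = [
    ("2016-10-01", "A", 1), ("2016-10-01", "B", 0),
    ("2016-10-02", "A", 1), ("2016-10-02", "C", 1),
    ("2016-10-03", "A", 1), ("2016-10-03", "B", 1), ("2016-10-03", "C", 1),
    ("2016-10-05", "C", 1),
    ("2016-10-10", "A", 0), ("2016-10-10", "B", 1),
    ("2016-10-11", "C", 0),
]


def iso_week(d):
    """ISO week number (assumption: matches the Week() settings in use)."""
    return d.isocalendar()[1]


def rolling_two_week_counts(rows):
    # Link every date to its own week and to the following week.
    links = []  # (week, customer, flag)
    for day_text, customer, flag in rows:
        day = date.fromisoformat(day_text)
        for week in (iso_week(day), iso_week(day + timedelta(days=7))):
            links.append((week, customer, flag))

    # Drop the highest generated week, as the script does with vMaxWeek.
    max_week = max(week for week, _, _ in links)

    # Count distinct customers with Flag = 1 per remaining week.
    customers = defaultdict(set)
    for week, customer, flag in links:
        if week != max_week and flag == 1:
            customers[week].add(customer)
    return {week: len(names) for week, names in sorted(customers.items())}


print(rolling_two_week_counts(ROWS))  # {39: 2, 40: 3, 41: 3}
```

The result matches the table in the question. Like the load script, this sketch keys on the bare week number, so it does not handle data that spans a year boundary.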

As an alternative to Stefan's answer, Qlik Sense has some great features when it comes to dates. The script below can be plugged into your load script and will generate additional derived date fields that can be used on the selector screen.
//AutoCalendar
[Calendar]:
DECLARE FIELD DEFINITION Tagged ('$date')
FIELDS
Dual(Year($1), YearStart($1)) AS [Year] Tagged ('$axis', '$year')
,Dual('Q'&Num(Ceil(Num(Month($1))/3)),Num(Ceil(NUM(Month($1))/3),00)) AS [Quarter] Tagged ('$quarter')
,Dual(Year($1)&'-Q'&Num(Ceil(Num(Month($1))/3)),QuarterStart($1)) AS [YearQuarter] Tagged ('$axis', '$yearquarter')
,Month($1) AS [Month] Tagged ('$month')
,Dual(Year($1)&'-'&Month($1), monthstart($1)) AS [YearMonth] Tagged ('$axis', '$yearmonth')
,Dual('W'&Num(Week($1),00), Num(Week($1),00)) AS [Week] Tagged ('$weeknumber')
,Date(Floor($1)) AS [Date] Tagged ('$date')
/*User added date components*/
,Dual(Year($1), if(Year($1)=Year(today()),YearStart($1),null)) AS [ThisYear] Tagged ('$axis', '$thisyear')
,Dual(Year($1)&'-Q'&Num(Ceil(Num(Month($1))/3)), if(Year($1)=Year(today()),QuarterStart($1),null))
AS [ThisYearQuarter] Tagged ('$axis', '$thisyearquarter')
,Dual(Year($1)&'-'&Month($1)
, if(Year($1)=Year(today()), monthstart($1),null)) AS [ThisYearMonth] Tagged ('$axis', '$thisyearmonth')
,Dual(Year($1), if(Year($1)=(Year(today())-1),YearStart($1),null)) AS [LastYear] Tagged ('$axis', '$lastyear')
,Dual(Year($1)&'-Q'&Num(Ceil(Num(Month($1))/3))
, if(Year($1)=(Year(today())-1),QuarterStart($1),null)) AS [LastYearQuarter] Tagged ('$axis', '$lastyearquarter')
,Dual(Year($1)&'-'&Month($1)
, if(Year($1)=(Year(today())-1), monthstart($1),null)) AS [LastYearMonth] Tagged ('$axis', '$lastyearmonth')
,Dual(date(MonthStart($1),'MMM-YYYY')
, if(Monthstart($1)=Monthstart(today()),Monthstart($1),null)) AS [ThisMonth] Tagged ('$axis', '$thismonth')
,Dual(date(MonthStart($1),'MMM-YYYY')
, if(Monthstart($1)=Monthstart(addmonths(today(),-1)),Monthstart($1),null))
AS [LastMonth] Tagged ('$axis', '$lastmonth')
,Dual(Year($1)&'-Q'&Num(Ceil(Num(Month($1))/3))
,if(QuarterStart($1)=QuarterStart(Today()),QuarterStart($1),null))
AS [ThisQuarter] Tagged ('$axis', '$thisquarter')
,Dual(Year($1)&'-Q'&Num(Ceil(Num(Month($1))/3))
,if(QuarterStart($1)=QuarterStart(addmonths(Today(),-3)),QuarterStart($1),null))
AS [LastQuarter] Tagged ('$axis', '$lastquarter')
,Dual(date(MonthStart($1),'MMM-YYYY')
,if(QuarterStart($1)=QuarterStart(Today()),MonthStart($1),null)) AS [ThisQuarterMonth] Tagged ('$axis', '$thisquartermonths');
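// [CurrencyDate] is a placeholder: list your own date field(s) here,
// e.g. [Date] for the data in the question
DERIVE FIELDS FROM FIELDS [CurrencyDate]
USING [Calendar] ;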
This would use the same expression as Stefan.
= count( {< Flag = {1} >} distinct Customer)
The benefits of doing it this way:
- the derived fields are hidden in the dropdown, so they don't clutter the field selector
- multiple date fields can be processed in the same statement: just add them to the DERIVE FIELDS FROM FIELDS list, separated by commas

Related

Find all instances of a date in a date range - SQL Server

I need to find the price for an item for each financial year end date in a date range. In this case the financial year ends on, e.g., 31 March.
The table I have, for example:
ItemID Value DateFrom DateTo
1 10 '2019/01/01' '2021/02/28'
1 11 '2021/03/01' '2021/05/01'
The SQL would thus turn the above table into:
ItemID Value DateFrom DateTo
1 10 '2019/01/01' '2019/03/30'
1 10 '2020/03/31' '2021/02/28'
1 11 '2020/03/01' '2021/03/30'
1 11 '2020/03/31' '2021/05/01'
You can solve it, but a prerequisite is the creation of a table called financial_years and filling it with data. This would be the structure of the table:
financial_years(id, DateFrom, DateTo)
Now that you have this table, you can do something like this:
select ItemID, Value, financial_years.DateFrom, financial_years.DateTo
from items
join financial_years
on (items.DateFrom between financial_years.DateFrom and financial_years.DateTo) or
(items.DateTo between financial_years.DateFrom and financial_years.DateTo)
order by financial_years.DateFrom;
The accepted answer is not correct, as it does not split out different parts of the year which have different values.
You also do not need a Year table, although it can be beneficial. You can generate it on the fly using a VALUES table.
Note also a better way to check that the intervals overlap, using AND rather than OR:
WITH Years AS (
SELECT
YearStart = DATEFROMPARTS(v.yr, 3, 31),
YearEnd = DATEFROMPARTS(v.yr + 1, 3, 30) -- day before the next year starts, so consecutive years do not overlap
FROM (VALUES
(2015),(2016),(2017),(2018),(2019),(2020),(2021),(2022),(2023),(2024),(2025),(2026),(2027),(2028),(2029),(2030),(2031),(2032),(2033),(2034),(2035),(2036),(2037),(2038),(2039)
) v(yr)
)
SELECT
i.ItemID,
i.Value,
DateFrom = CASE WHEN i.DateFrom > y.YearStart THEN i.DateFrom ELSE y.YearStart END,
DateTo = CASE WHEN i.DateTo > y.YearEnd THEN y.YearEnd ELSE i.DateTo END
FROM items i
JOIN Years y ON i.DateFrom <= y.YearEnd
AND i.DateTo >= y.YearStart;
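The overlap test and clamping used in the query can be sketched in Python as follows. This is only an illustration under an assumption taken from the question's expected output: each financial year runs from 31 March to 30 March of the following year. The function name split_by_financial_year and its parameters are hypothetical.

```python
from datetime import date


def split_by_financial_year(start, end, first_year, last_year):
    """Split [start, end] into one piece per overlapping financial year.

    Assumption: a financial year runs from 31 March of `yr`
    to 30 March of `yr + 1`.
    """
    pieces = []
    for yr in range(first_year, last_year + 1):
        fy_start = date(yr, 3, 31)
        fy_end = date(yr + 1, 3, 30)
        # Two closed intervals overlap when each starts before the other ends.
        if start <= fy_end and end >= fy_start:
            # Clamp the item's interval to the financial year.
            pieces.append((max(start, fy_start), min(end, fy_end)))
    return pieces


for piece in split_by_financial_year(date(2019, 1, 1), date(2021, 2, 28), 2015, 2025):
    print(piece[0].isoformat(), piece[1].isoformat())
```

For the first item in the question this yields three pieces: 2019-01-01 to 2019-03-30, 2019-03-31 to 2020-03-30, and 2020-03-31 to 2021-02-28.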

Re-assign value based on dates

I need to compare the dates and reassign the values to two new variables by ID.
If there are two dates for the same id, then:
If the 'date' variable is earlier, its value should be reassigned to "earlier status".
If the 'date' variable is later, its value should be reassigned to "current status".
If there is only one date for the id, the value will be reassigned to "current status", and "earlier status" needs to be missing.
If there are more than two dates for the id, then the values for the middle dates will be ignored, and only the earliest and the most recent values are used.
Any thoughts? Much appreciated!
This is the code that I have tried:
data origin;
input id date mmddyy8. status;
datalines;
1 1/1/2010 0
1 1/1/2011 1
2 2/2/2002 1
3 3/3/2003 1
3 2/5/2010 0
4 1/1/2000 0
4 1/1/2003 0
4 1/1/2005 1
;
run;
proc print; format date yymmdd8.; run;
proc sort data=origin out=a1;
by id date;
run;
data need; set a1;
if first.date then EarlierStatus=status;
else if last.date then CurrentStatus=status;
by id;
run;
proc print; format date yymmdd8.; run;
So, a couple of things. First, note a few corrections to your code, in particular the : (colon modifier), which is critical if you're going to input with mixed list style.
Second, you need to retain EarlierStatus. Otherwise it gets cleared out on each data step iteration.
Third, you need to use first.id, not first.date (and similarly for last). What first.id says is "this is the first iteration of a new value of id". first.date is just what you'd say in English ("the first date for that id").
Finally, you need a couple more tests to set your variables the way you want them.
data origin;
input id date :mmddyy10. status;
format date mmddyy10.;
datalines;
1 1/1/2010 0
1 1/1/2011 1
2 2/2/2002 1
3 3/3/2003 1
3 2/5/2010 0
4 1/1/2000 0
4 1/1/2003 0
4 1/1/2005 1
;
run;
proc sort data=origin out=a1;
by id date;
run;
data need;
set a1;
by id;
retain EarlierStatus;
if first.id then call missing(EarlierStatus); *first time through for an ID, clear EarlierStatus;
if first.id and not last.id then EarlierStatus=status; *if it is first time for the id, but not ONLY time, then set EarlierStatus;
else if last.id then CurrentStatus=status; *if it is last time for the id, then set CurrentStatus;
if last.id then output; *and if it is last time for the id, then output;
run;
The if/elses there could be written slightly differently, depending on exactly how you want to do things; I was trying to keep them fairly direct in how they relate to each other.
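For readers less familiar with BY-group processing, the first.id / last.id logic above can be sketched in Python. This is an illustration only, not a replacement for the data step; the function name earlier_and_current is hypothetical, and missing values are represented as None.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter

# Same rows as the datalines above: (id, date, status)
ORIGIN = [
    (1, date(2010, 1, 1), 0), (1, date(2011, 1, 1), 1),
    (2, date(2002, 2, 2), 1),
    (3, date(2003, 3, 3), 1), (3, date(2010, 2, 5), 0),
    (4, date(2000, 1, 1), 0), (4, date(2003, 1, 1), 0), (4, date(2005, 1, 1), 1),
]


def earlier_and_current(rows):
    """Return {id: (EarlierStatus, CurrentStatus)}, one entry per id."""
    result = {}
    ordered = sorted(rows, key=itemgetter(0, 1))  # like proc sort; by id date
    for id_, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        first, last = group[0], group[-1]
        # first.id and not last.id: only set when the id has more than one row
        earlier = first[2] if len(group) > 1 else None
        # last.id: the most recent row always supplies CurrentStatus
        current = last[2]
        result[id_] = (earlier, current)
    return result


print(earlier_and_current(ORIGIN))
```

Rows between the first and last date of an id are simply never read, which is how the middle dates end up ignored.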
This proc sql will get what you want:
proc sql;
create table need as
select distinct
t1.id,
t2.EarlierStatus,
t1.CurrentStatus
from (select distinct
id,
date,
status as CurrentStatus
from origin
group by id
having date=max(date)) as t1
left join (select distinct
id,
date,
status as EarlierStatus
from origin
group by id
having date=min(date) and date ~= max(date)) as t2 on t1.id=t2.id;
quit;
The above code has two subqueries. The first keeps only the row with the latest date for each id and renames status to CurrentStatus. The second keeps only the row with the earliest date for each id, provided it is not also the latest, and renames status to EarlierStatus. So if an id has only one date, that row is dropped from the second subquery, and any middle dates are ignored. The left join then pulls EarlierStatus from the second subquery into the first. If no EarlierStatus is found, it is left missing.

Number of days in a month in DB2

Is there a way to find the number of days in a month in DB2? For example, I have a datetime field which I display as Jan-2020, Feb-2020 and so on. Based on this field I need to fetch the number of days in that month. The output should look something like the table below.
I'm using the query below:
select reportdate, TO_CHAR(reportdate, 'Mon-YYYY') as textmonth from mytable
Expected output
ReportDate textMonth No of Days
1-1-2020 08:00 Jan-2020 31
1-2-2020 09:00 Feb-2020 29
12-03-2020 07:00 Mar-2020 31
Try this:
/*
WITH MYTABLE (reportdate) AS
(
VALUES
TIMESTAMP('2020-01-01 08:00:00')
, TIMESTAMP('2020-02-01 09:00:00')
, TIMESTAMP('2020-03-12 07:00:00')
)
*/
SELECT reportdate, textMonth, DAYS(D + 1 MONTH) - DAYS(D) AS NO_OF_DAYS
FROM
(
SELECT
reportdate, TO_CHAR(reportdate, 'Mon-YYYY') textMonth
, DATE(TO_DATE('01-' || TO_CHAR(reportdate, 'Mon-YYYY'), 'dd-Mon-yyyy')) D
FROM MYTABLE
);
Db2 has the function DAYS_TO_END_OF_MONTH and several others which you could use. Based on your month input, construct the first day of the month. This should be something like 2020-01-01 for Jan-2020 or 2020-02-01 for Feb-2020. The Db2 documentation lists several other conversion functions which let you transform between formats and perform date arithmetic.
Convert your column to a proper date and try this: day(last_day(date_column))
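The arithmetic behind both approaches (first day of the next month minus first day of this month) can be sketched in Python. This is only an illustration of the calculation, not Db2 code; the function name days_in_month is hypothetical.

```python
import calendar
from datetime import datetime


def days_in_month(ts):
    """Days in the month of `ts`: first of next month minus first of this month."""
    first = ts.date().replace(day=1)
    if first.month == 12:
        next_first = first.replace(year=first.year + 1, month=1)
    else:
        next_first = first.replace(month=first.month + 1)
    return (next_first - first).days


# Sample timestamps from the question
for ts in (datetime(2020, 1, 1, 8, 0), datetime(2020, 2, 1, 9, 0), datetime(2020, 3, 12, 7, 0)):
    # calendar.monthrange gives the same number and serves as a cross-check
    assert days_in_month(ts) == calendar.monthrange(ts.year, ts.month)[1]
    print(ts.strftime("%b-%Y"), days_in_month(ts))
```

For the three sample rows this prints 31, 29 and 31 days, matching the expected output in the question.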

Merging average of time series corresponding to time span in a different data set

I have two datasets, one with contracts and one with market prices. The gist of what I am trying to accomplish is to find the average value of a time series that corresponds to a period of time in a cross-sectional data set. Please see below.
Example Dataset 1:
Beginning Ending Price
1/1/2014 5/15/2014 $19.50
3/2/2012 10/9/2015 $20.31
...
1/1/2012 1/8/2012 $19.00
In the example above there are several contracts, the first spanning from January 2014 to May 2014, the second from March 2012 to October 2015. Each one has a single price. The second dataset has weekly market prices.
Example Dataset 2:
Date Price
1/1/2012 $18
1/8/2012 $17.50
....
1/15/2015 $21.00
I would like to find the average "market price" (i.e. the average of the price in dataset 2) between the beginning and ending period for each contract on dataset 1. So, for the third contract from 1/1/2012 to 1/8/2012, from the second dataset the output would be (18+17.50)/2 = 17.75. Then merge this value back to the original dataset.
I work with Stata, but can also work with R or Excel.
Also, if you have a better suggestion for a title I would really appreciate it!
You can cross the contracts cross section data with the time series, which forms every pairwise combination, drop the prices from outside the date range, and calculate the mean like this:
/* Fake Data */
tempfile ts ccs
clear
input str9 d p_daily
"1/1/2012" 18
"1/8/2012" 17.50
"1/15/2015" 21.00
end
gen date = date(d,"MDY")
format date %td
drop d
rename date d
save `ts'
clear
input id str8 bd str9 ed p_contract
1 "1/1/2014" "5/15/2014" 19.50
2 "3/2/2012" "10/9/2015" 20.31
3 "1/1/2012" "1/8/2012" 19.00
end
foreach var of varlist bd ed {
gen date = date(`var',"MDY")
format date %td
drop `var'
rename date `var'
}
save `ccs'
/* Calculate Mean Prices and Merge Contracts Back In */
cross using `ts'
sort id d
keep if d >= bd & d <=ed
collapse (mean) mean_p = p_daily, by(id bd ed p_contract)
merge 1:1 id using `ccs', nogen
sort id
This gets you something like this:
id p_contract bd ed mean_p
1 19.5 01jan2014 15may2014 .
2 20.31 02mar2012 09oct2015 21
3 19 01jan2012 08jan2012 17.75
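Since the question mentions other tools as well, here is the same idea as a minimal Python sketch: pair every contract with every market price, keep the prices inside the contract's date range, and average them. It uses only the three market rows and three contracts shown above; the function name mean_market_price is hypothetical, and a contract with no prices in range gets None (the counterpart of Stata's missing value).

```python
from datetime import date
from statistics import mean

# (date, market price)
MARKET = [
    (date(2012, 1, 1), 18.0),
    (date(2012, 1, 8), 17.5),
    (date(2015, 1, 15), 21.0),
]

# (id, beginning, ending, contract price)
CONTRACTS = [
    (1, date(2014, 1, 1), date(2014, 5, 15), 19.50),
    (2, date(2012, 3, 2), date(2015, 10, 9), 20.31),
    (3, date(2012, 1, 1), date(2012, 1, 8), 19.00),
]


def mean_market_price(contracts, market):
    """Return {contract id: mean market price within [beginning, ending]}."""
    result = {}
    for contract_id, begin, end, _contract_price in contracts:
        # Equivalent of: cross, then keep if d >= bd & d <= ed
        in_range = [price for day, price in market if begin <= day <= end]
        result[contract_id] = mean(in_range) if in_range else None
    return result


print(mean_market_price(CONTRACTS, MARKET))  # {1: None, 2: 21.0, 3: 17.75}
```

As with cross in Stata, this forms every pairwise combination, so for very large datasets a range join would scale better.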

Qlikview - Data between dates; filter out data past or future data depending on selected date

I've seen threads where the document has Start Date and End Date "widgets" where users type in their dates, however, I'm looking for a dynamic solution, for example on the table below, when I select a date, say "1/1/2004", I only want to see active players (this would exclude Michael Jordan only).
Jersey# Name RookieYr RetirementYr Average PPG
23 Michael Jordan 1/1/1984 1/1/2003 24
33 Scotty Pippen 1/1/1987 1/1/2008 15
1 Derrick Rose 1/1/2008 1/1/9999 16
25 Vince Carter 1/1/1998 1/1/9999 18
The most flexible way is to IntervalMatch the RookieYr and RetirementYr dates into a table of all dates. See http://qlikviewcookbook.com/recipes/download-info/count-days-in-a-transaction-using-intervalmatch/ for a complete example.
Here's the interval match for your data. You can obviously create your calendar however you want.
STATS:
load * inline [
Jersey#, Name, RookieYr, RetirementYr, Average PPG
23, Michael Jordan, 1/1/1984, 1/1/2003, 24
33, Scotty Pippen, 1/1/1987, 1/1/2008, 15
1, Derrick Rose, 1/1/2008, 1/1/9999, 16
25, Vince Carter, 1/1/1998, 1/1/9999, 18
];
let zDateMin=37000;
let zDateMax=40000;
DATES:
LOAD
Date($(zDateMin) + IterNo() - 1) as [DATE],
year( Date($(zDateMin) + IterNo() - 1)) as YEAR,
month( Date($(zDateMin) + IterNo() - 1)) as MONTH
AUTOGENERATE 1
WHILE $(zDateMin)+IterNo()-1<= $(zDateMax);
INTERVAL:
IntervalMatch (DATE) load RookieYr, RetirementYr resident STATS;
left join (DATES) load * resident INTERVAL; drop table INTERVAL;
There's not much to it. You need to load two tables, one with the start and end dates and one with the calendar dates. Then you IntervalMatch the date field to the start and end fields, and from there it will work. The last join is just to tidy up a bit.
The result of all of that is this table view (Ctrl+T). Don't worry about the synthetic key; it is required to maintain the interval matching.
Then you can have something like this.
Derrick Rose is also excluded since he had not started by 1/1/2004
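What the interval match resolves for a single selected date can be sketched in Python. This is only an illustration of the matching rule (a player is active when RookieYr <= selected date <= RetirementYr), using the four rows from the question; the function name active_on is hypothetical.

```python
from datetime import date

# (Name, RookieYr, RetirementYr); 1/1/9999 marks a player who has not retired
PLAYERS = [
    ("Michael Jordan", date(1984, 1, 1), date(2003, 1, 1)),
    ("Scotty Pippen", date(1987, 1, 1), date(2008, 1, 1)),
    ("Derrick Rose", date(2008, 1, 1), date(9999, 1, 1)),
    ("Vince Carter", date(1998, 1, 1), date(9999, 1, 1)),
]


def active_on(selected, players=PLAYERS):
    """Names of players whose [RookieYr, RetirementYr] interval contains `selected`."""
    return [name for name, start, end in players if start <= selected <= end]


print(active_on(date(2004, 1, 1)))  # ['Scotty Pippen', 'Vince Carter']
```

IntervalMatch effectively precomputes this test for every calendar date, which is why selecting a DATE in the document filters the players without any expression.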