Period To Date Median Over Time - visualization

I'm trying to calculate the median over time in QuickSight on a quarterly basis, so given these values:
Value Date
1 01/01/2023
2 01/02/2023
3 01/02/2023
4 01/02/2023
5 01/03/2023
6 01/04/2023
7 01/04/2023
8 01/05/2023
Ideally I would like to get these values:
Value Date
1 01/2023
2.5 02/2023
3 03/2023
6.5 04/2023
7 05/2023
There are some functions in QuickSight (like periodToDateAvgOverTime()) that perform this kind of period-to-date aggregation for other measures, such as the average. Is there a way to calculate this with a custom formula or a workaround?

You could use periodToDateAvgOverTime(median(value),....), or use medianIf(value,condition) with the date as the condition.
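To make the target numbers concrete, here is a minimal Python sketch of the quarter-to-date median the question describes. It is not QuickSight syntax; the function name `quarter_to_date_median` is hypothetical, and the dates are assumed to be in dd/mm/yyyy format, as the expected output implies.

```python
from datetime import date
from statistics import median

# Sample rows from the question as (value, date); dd/mm/yyyy is an assumption.
rows = [
    (1, date(2023, 1, 1)), (2, date(2023, 2, 1)), (3, date(2023, 2, 1)),
    (4, date(2023, 2, 1)), (5, date(2023, 3, 1)), (6, date(2023, 4, 1)),
    (7, date(2023, 4, 1)), (8, date(2023, 5, 1)),
]

def quarter_to_date_median(rows):
    """For each month, take the median of all values from the start of
    that month's quarter up to and including that month."""
    months = sorted({(d.year, d.month) for _, d in rows})
    result = {}
    for year, month in months:
        quarter = (month - 1) // 3
        window = [
            v for v, d in rows
            if d.year == year
            and (d.month - 1) // 3 == quarter
            and d.month <= month
        ]
        result[(year, month)] = median(window)
    return result

for (year, month), value in quarter_to_date_median(rows).items():
    print(f"{month:02d}/{year}: {value}")
```

The window resets in April because a new quarter starts there, which is why the expected value jumps from 3 to 6.5.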

Related

Filtering a column per a maximum value to show on histogram

I want the histogram to show only the values from the week with the second-highest week number.
Example: I have this table with "Week" and "feature_value", where week 6 is the second-highest week.
Week feature_value
3 10,1
4 10,5
5 10,7
5 10,3
6 11,1
6 10,7
7 10,3
Basically, what I want is this:
Week feature_value
6 11,1
6 10,7
I succeeded in doing this in a Qlik Sense table with this formula:
=Num(Aggr(distinct IF(max(Week,2),feature_value),feature_value))
But when I use it on the histogram, the chart shows this error: "The chart is not displayed because it contains only undefined values."
Does anybody know how to solve it?
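For reference, the selection being asked for can be sketched in Python. This is not Qlik Sense expression syntax, only an illustration of the intended logic; the function name `values_for_nth_highest_week` is hypothetical, and the decimal commas in the table are written as decimal points.

```python
# Sample rows from the question as (week, feature_value).
rows = [
    (3, 10.1), (4, 10.5), (5, 10.7), (5, 10.3),
    (6, 11.1), (6, 10.7), (7, 10.3),
]

def values_for_nth_highest_week(rows, n=2):
    """Return the feature values whose week is the n-th highest distinct week."""
    weeks = sorted({week for week, _ in rows}, reverse=True)
    if len(weeks) < n:
        return []  # not enough distinct weeks to pick from
    target = weeks[n - 1]
    return [value for week, value in rows if week == target]

print(values_for_nth_highest_week(rows))
```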

How to average monthly data using a specific method in Matlab?

I have the following vector of monthly values (vectorA). I put the date-related info next to it to help illustrate the task, but I work with just the vector itself.
dates month_in_q vectorA
31/01/2020 1 10
29/02/2020 2 15
31/03/2020 3 6
30/04/2020 1 8
31/05/2020 2 4
30/06/2020 3 3
How can I create a new vector, vectorNEW, according to this algorithm?
In each quarter, the first month is the original first month.
In each quarter, the second month is the average of the first and second months.
In each quarter, the third month is the average of all three months.
So I would get the following vectorNEW by manipulating the original vectorA in a loop, given the recurring pattern above:
dates month_in_q vectorA vectorNEW
31/01/2020 1 10 10
29/02/2020 2 15 AVG(10+15)
31/03/2020 3 6 AVG(10+15+6)
30/04/2020 1 8 8
31/05/2020 2 4 AVG(8+4)
30/06/2020 3 3 AVG(8+4+3)
... ... ... ...
An elegant solution was provided by the user dpb on the MathWorks website:
vectorNEW=reshape(cumsum(reshape(vectorA,3,[]))./[1:3].',[],1);
Further info below
https://uk.mathworks.com/matlabcentral/answers/823055-how-to-average-monthly-data-using-a-specific-method-in-matlab
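The same running average within each quarter can be written as an explicit loop. Below is a minimal Python sketch of that idea, not the MATLAB one-liner itself; the function name `within_quarter_running_mean` is hypothetical, and it assumes the vector starts at the first month of a quarter.

```python
def within_quarter_running_mean(values, period=3):
    """Running mean that restarts every `period` elements.

    Assumes values[0] is the first month of a quarter.
    """
    result = []
    running_sum = 0.0
    for index, value in enumerate(values):
        position = index % period      # 0, 1, 2 within the quarter
        if position == 0:
            running_sum = 0.0          # new quarter: restart the sum
        running_sum += value
        result.append(running_sum / (position + 1))
    return result

vector_a = [10, 15, 6, 8, 4, 3]
print(within_quarter_running_mean(vector_a))
```

Unlike the reshape-based version, this loop also accepts a vector whose length is not a multiple of 3.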

Rolling N monthly average in Redshift with multiple entries per month

I would like to use Redshift's window aggregation functions to create an 'N' month rolling average of some data. The data will have multiple entries in any given month. If possible, I'd like to avoid first grouping by month and averaging within each month before computing the rolling average, since that takes an average of averages, which is not ideal (as this post does: 3 Month Moving Average - Redshift SQL).
This is a sample dataset of just one account (there will be more than 1).
Quote_Date Account. Value
3/24/2015 acme. 3
3/25/2015 acme. 7
4/1/2015 acme. 12
4/3/2015 acme. 17
5/15/2015 acme. 1
6/30/2015 acme. 3
7/30/2015 acme. 9
And this is what I would like the result to look like for a 3 month rolling average (as an example).
Quote_Date Account. Value Month 3M_Rolling_Average
3/24/2015 acme. 3 1 3
3/25/2015 acme. 7 1 5
4/1/2015 acme. 12 2 7.33
4/3/2015 acme. 17 2 9.75
5/15/2015 acme. 1 3 8
6/30/2015 acme. 3 4 8.25
7/30/2015 acme. 9 5 4.33
The code I have tried looks like this:
avg(Value) over (partition by Account order by Quote_Date rows between 2 preceding and current row)
But this only operates over the two preceding rows plus the current row, which would work if I had exactly one value per month; as stated, that is not the case. I am open to any kind of ranking solution or nested partitioning. Any help is greatly appreciated.
Since an average is just sum() / count(), you only need to group by month and carry the sum() and count() for each month. Then use lag to add up 3 months of sums and divide by the total of 3 months of counts. You are correct that an average of averages is not correct, but if you carry the sums and counts, things work.
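The sums-and-counts idea can be sketched in Python for a single account. This is an illustration of the arithmetic, not Redshift SQL; the function name `rolling_month_average` is hypothetical. It indexes months by calendar position, so a month with no rows simply contributes nothing to the window.

```python
from collections import defaultdict

# Sample rows for one account as (year, month, value).
rows = [
    (2015, 3, 3), (2015, 3, 7), (2015, 4, 12), (2015, 4, 17),
    (2015, 5, 1), (2015, 6, 3), (2015, 7, 9),
]

def rolling_month_average(rows, window=3):
    """Rolling average over the last `window` calendar months,
    computed from per-month sums and counts rather than per-month averages."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, value in rows:
        index = year * 12 + (month - 1)   # consecutive integer per calendar month
        sums[index] += value
        counts[index] += 1

    result = {}
    for index in sorted(sums):
        total = sum(sums.get(index - k, 0.0) for k in range(window))
        n = sum(counts.get(index - k, 0) for k in range(window))
        result[(index // 12, index % 12 + 1)] = total / n
    return result

for (year, month), avg in rolling_month_average(rows).items():
    print(f"{year}-{month:02d}: {avg:.2f}")
```

On the sample data, the monthly results match the last row of each month in the desired output table (5, 9.75, 8, 8.25, 4.33).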

Average and synchronize a timeseries with varying timestamps to a user-defined interval in MATLAB

Assume the following timeseries (ts) with assigned values:
time val
15:00 4
15:45 7
17:12 2.3
17:50 2.9
Every value from a timestamp is valid until the next one appears. Thus, from 15:00 to 15:45 the value is 4, and from 15:45 to 17:12 it is 7. Every new data point between these timestamps should have the same value. What I want is a new ts with a constant time interval and a predefined start point. Let's say the starting point is 15:00 and the interval is 30 min. Normally, I could use the synchronize function. However, that function interpolates, which is not what I need here: the values between the data points should not be interpolated but averaged whenever a new interval overlaps more than one timestamp.
The new ts should be like:
time val
15:00 4
15:30 5.5
16:00 7
16:30 7
17:00 4.18
The value for timestamp 15:30 is computed as (4*15+7*15)/30, and so on. I have implemented code that handles this by applying the trapz function with a lot of if statements. However, I was wondering whether there is a better or simpler solution, such as a modified synchronize function, since I have more than 500,000 data points.
Thanks in advance
I managed to fix my problem by resampling all time steps to minute values, applying the trapezoidal rule to get the cumulative area under the curve (AUC), and then taking the average by dividing the area within each new interval by the interval length in minutes.
AllValues = interp1(Time,Data,NewTime,'previous')';
[Xdata,Ydata] = stairs(NewTime,AllValues);
NewTS = timeseries(Xdata,Ydata);
TrapzSum = cumtrapz(NewTS.time,NewTS.data);
TrapzSum = TrapzSum(1:2:end);
NewResults = diff(TrapzSum(IndicesOfNewInterval))/MinInt;
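The time-weighted average behind this can also be computed directly from the step function, without resampling to minutes. The following Python sketch illustrates that alternative; it is not a translation of the MATLAB code above. The function name `step_average` is hypothetical, times are given as minutes since midnight, and every bin is assumed to start at or after the first timestamp.

```python
def step_average(times, values, start, width, n_bins):
    """Average a piecewise-constant signal over fixed-width bins.

    values[i] holds from times[i] until times[i + 1]; the last value
    is assumed to hold indefinitely.
    """
    averages = []
    for k in range(n_bins):
        lo = start + k * width
        hi = lo + width
        area = 0.0
        for i, (t, v) in enumerate(zip(times, values)):
            seg_end = times[i + 1] if i + 1 < len(times) else float("inf")
            overlap = min(hi, seg_end) - max(lo, t)
            if overlap > 0:
                area += v * overlap   # value weighted by minutes of overlap
        averages.append(area / width)
    return averages

# 15:00, 15:45, 17:12, 17:50 as minutes since midnight.
times = [900, 945, 1032, 1070]
values = [4, 7, 2.3, 2.9]
print([round(x, 2) for x in step_average(times, values, start=900, width=30, n_bins=5)])
```

The nested loop is quadratic and is kept for readability only; for hundreds of thousands of points, a single pass that advances through segments and bins together would be the more practical choice.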

Calculating MAX(DATE) for Value Groups Where Values Go Back and Forth

I have another challenge that I am trying to resolve but unable to get the solution yet. Here is the scenario. Pardon the formatting if it messes up at the time of posting.
ACCT_NUM CERT_ID Code Date Desired Output
A 1 10 1/1/2007 1/1/2008
A 1 10 1/1/2008 1/1/2008
A 1 20 1/1/2009 1/1/2010
A 1 20 1/1/2010 1/1/2010
A 1 10 1/1/2011 1/1/2012
A 1 10 1/1/2012 1/1/2012
A 2 20 1/1/2007 1/1/2008
A 2 20 1/1/2008 1/1/2008
A 2 10 1/1/2009 1/1/2010
A 2 10 1/1/2010 1/1/2010
A 2 30 1/1/2011 1/1/2011
A 2 10 1/1/2012 1/1/2013
A 2 10 1/1/2013 1/1/2013
As you can see, I need to take the MAX of the date for each group of consecutive Code values (within ACCT_NUM and CERT_ID) before the value changes. If the same value appears again later, I need to take the MAX of the date for that group separately. For example, for CERT_ID '1', I cannot group all four rows of Code 10 to get a MAX date of 1/1/2012. I need the MAX for the first two rows and then another MAX for the last two rows separately, since there is another code in between. I am trying to accomplish this in Cognos Framework Manager.
Gurus, please advise.
The syntax for getting the max value for CERT_ID is:
maximum(Date for CERT_ID)
If you want additional levels for the max, you can use the following syntax:
maximum(Date for ACCT_NUM,CERT_ID,Code)
In general, it is best practice to group and summarize values in the report, not in Framework Manager.
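The per-run maximum described in the question (a new group starts whenever Code changes, even if the same code appeared earlier) can be sketched in Python as follows. This is not Cognos syntax, only an illustration of the grouping logic; the function name `max_date_per_run` is hypothetical, and the rows are assumed to be sorted by ACCT_NUM, CERT_ID, and Date already.

```python
from datetime import date
from itertools import groupby

# (code, year) pairs per CERT_ID, taken from the table in the question.
codes_by_cert = {
    1: [(10, 2007), (10, 2008), (20, 2009), (20, 2010), (10, 2011), (10, 2012)],
    2: [(20, 2007), (20, 2008), (10, 2009), (10, 2010), (30, 2011),
        (10, 2012), (10, 2013)],
}
rows = [
    ("A", cert, code, date(year, 1, 1))
    for cert, items in codes_by_cert.items()
    for code, year in items
]

def max_date_per_run(rows):
    """Append to each row the max date of its run of consecutive equal
    (acct, cert, code) keys. groupby only merges adjacent rows, so a code
    that reappears later forms a separate run."""
    result = []
    for _, run in groupby(rows, key=lambda r: (r[0], r[1], r[2])):
        run = list(run)
        run_max = max(r[3] for r in run)
        result.extend(r + (run_max,) for r in run)
    return result

for acct, cert, code, d, run_max in max_date_per_run(rows):
    print(acct, cert, code, d.isoformat(), run_max.isoformat())
```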