I have the following vector of monthly values (vectorA). I put the date-related info next to it to help illustrate the task, but I work with just the vector itself.
dates month_in_q vectorA
31/01/2020 1 10
29/02/2020 2 15
31/03/2020 3 6
30/04/2020 1 8
31/05/2020 2 4
30/06/2020 3 3
How can I create a new vector, vectorNEW, according to this algorithm?
In each quarter, the first month is the original first month.
In each quarter, the second month is the average of the first and second months.
In each quarter, the third month is the average of all three months.
That is, I want to get the following vectorNEW by manipulating the original vectorA in a loop, given the recurring pattern above:
dates month_in_q vectorA vectorNEW
31/01/2020 1 10 10
29/02/2020 2 15 AVG(10,15)
31/03/2020 3 6 AVG(10,15,6)
30/04/2020 1 8 8
31/05/2020 2 4 AVG(8,4)
30/06/2020 3 3 AVG(8,4,3)
... ... ... ...
An elegant solution was provided by the user dpb on the MathWorks website:
vectorNEW=reshape(cumsum(reshape(vectorA,3,[]))./[1:3].',[],1);
Further details are in the original thread:
https://uk.mathworks.com/matlabcentral/answers/823055-how-to-average-monthly-data-using-a-specific-method-in-matlab
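For readers who prefer an explicit loop, the same idea can be sketched in Python. This is only an illustration, not dpb's code: the function name quarter_running_mean and the period parameter are made up here. It keeps a running sum that restarts at the first month of every quarter and divides by the position within the quarter.

```python
def quarter_running_mean(values, period=3):
    """Running mean that restarts at the start of every block of `period` items."""
    result = []
    running_sum = 0.0
    for i, value in enumerate(values):
        pos = i % period              # 0, 1, 2 = position within the quarter
        if pos == 0:
            running_sum = 0.0         # a new quarter starts: reset the sum
        running_sum += value
        result.append(running_sum / (pos + 1))
    return result


vector_a = [10, 15, 6, 8, 4, 3]
vector_new = quarter_running_mean(vector_a)
print(vector_new)
```

Unlike the reshape-based one-liner, which needs the length of vectorA to be a multiple of 3, this loop also accepts an incomplete final quarter.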
I have a SQL table with two columns, seq and sub_seq. I would like to add a third column, id, which goes up by 1 every time sub_seq starts again at 1, as shown in the table below.
seq sub_seq id
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1
6 1 2
7 2 2
8 3 2
9 1 3
10 2 3
11 3 3
12 4 3
13 5 3
14 6 3
15 7 3
I could write a solution using plpgsql, however I would like to know if there is a way of doing this in standard SQL. Any help would be greatly appreciated.
If sub_seq is always a running sequence (and seq itself has no gaps), you can use the DENSE_RANK function and order by the difference of the two columns, because that difference stays constant within each run and grows from one run to the next.
SELECT seq, sub_seq, DENSE_RANK() OVER (ORDER BY seq - sub_seq) AS id
FROM tableDemo;
This solution is based on the sample data you have provided; more sample data would help to check the whole scenario.
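To see why ordering by seq - sub_seq works, here is a small Python sketch that mimics a dense rank over that difference using the sample rows from the question. The helper name assign_ids is hypothetical, and the sketch relies on the same assumption as the query: seq increases by 1 on every row and sub_seq restarts at 1.

```python
# (seq, sub_seq) pairs copied from the sample table
rows = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5),
        (6, 1), (7, 2), (8, 3),
        (9, 1), (10, 2), (11, 3), (12, 4), (13, 5), (14, 6), (15, 7)]


def assign_ids(rows):
    """Dense-rank the rows by (seq - sub_seq), like DENSE_RANK() OVER (ORDER BY seq - sub_seq)."""
    diffs = [seq - sub_seq for seq, sub_seq in rows]
    # Each distinct difference gets the next rank, in ascending order of the difference.
    rank_of = {d: rank for rank, d in enumerate(sorted(set(diffs)), start=1)}
    return [(seq, sub_seq, rank_of[seq - sub_seq]) for seq, sub_seq in rows]


with_ids = assign_ids(rows)
for row in with_ids:
    print(row)
```

The differences are 0 for the first run, 5 for the second and 8 for the third, so the ranks come out as 1, 2 and 3.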
I would like to use Redshift's window aggregation functions to create an 'N' month rolling average of some data. The data will have multiple unique entries in any given month. If possible, I'd like to avoid first grouping by month and averaging within months before computing the rolling average, because that takes an average of averages, which is not ideal (this is what this post does: 3 Month Moving Average - Redshift SQL).
This is a sample dataset of just one account (there will be more than 1).
Quote_Date Account Value
3/24/2015 acme 3
3/25/2015 acme 7
4/1/2015 acme 12
4/3/2015 acme 17
5/15/2015 acme 1
6/30/2015 acme 3
7/30/2015 acme 9
And this is what I would like the result to look like for a 3 month rolling average (as an example).
Quote_Date Account Value Month 3M_Rolling_Average
3/24/2015 acme 3 1 3
3/25/2015 acme 7 1 5
4/1/2015 acme 12 2 7.33
4/3/2015 acme 17 2 9.75
5/15/2015 acme 1 3 8
6/30/2015 acme 3 4 8.25
7/30/2015 acme 9 5 4.33
The code I have tried looks like this:
avg(Value) over (partition by Account order by Quote_Date rows between 2 preceding and current row)
But this only operates over the 2 preceding rows plus the current row, which would work if I had exactly one value per month; as stated, that is not the case. I am open to any kind of ranking solution or nested partitioning. Any help is greatly appreciated.
Since an average is just sum() / count(), you can still group by month as long as you keep the sum() and count() rather than the average. Then use lag to add up 3 months of sums and divide by the total of 3 months of counts. You are correct that an average of averages is not correct, but if you carry the sums and counts, things work.
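The arithmetic can be checked outside the database with a short Python sketch. It is not Redshift SQL, and the function name rolling_month_average and the month_index encoding (year * 12 + month) are assumptions made for this illustration. It sums and counts per account and month, then divides the windowed sum by the windowed count.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (quote_date, account, value)
quotes = [
    (date(2015, 3, 24), "acme", 3),
    (date(2015, 3, 25), "acme", 7),
    (date(2015, 4, 1), "acme", 12),
    (date(2015, 4, 3), "acme", 17),
    (date(2015, 5, 15), "acme", 1),
    (date(2015, 6, 30), "acme", 3),
    (date(2015, 7, 30), "acme", 9),
]


def rolling_month_average(quotes, window=3):
    """Rolling average over `window` calendar months, built from monthly sums and counts."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for quote_date, account, value in quotes:
        month_index = quote_date.year * 12 + quote_date.month
        sums[(account, month_index)] += value
        counts[(account, month_index)] += 1

    result = {}
    for account, month_index in sums:
        total, n = 0.0, 0
        # Months without data simply contribute nothing to either total.
        for m in range(month_index - window + 1, month_index + 1):
            total += sums.get((account, m), 0.0)
            n += counts.get((account, m), 0)
        result[(account, month_index)] = total / n
    return result


averages = rolling_month_average(quotes)
for (account, month_index), avg in sorted(averages.items()):
    print(account, month_index, round(avg, 2))
```

This yields one value per account and month (5, 9.75, 8, 8.25 and 4.33 for March to July), which matches the last row of each month in the desired output; the per-row values within a month are not reproduced by this approach.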
Good morning, I have a problem with a MATLAB plot.
I have generated samples of data belonging to different days; the data are the main postures of a human (labelled 1, 2, 3, 4).
Now I have 30 vectors (one for each day), each with a number of samples equal to the seconds in the day (about 86400 samples). I have one posture for each second.
My aim is to plot the distribution of the samples over one month: on the X axis I would have the days of the month (1, 2, 3, ..., 30) and on the Y axis the hour (sample/3600, I think).
How can I plot all the data in a single graph? I have two main problems:
I have 30 vectors of different lengths (because I generated the samples with a random function), so I think the first step is to align the data, because the PLOT function needs vectors of the same length.
I need to plot 30 days in the same plot, in order to evaluate the whole distribution of the postures in a month.
A small example: day1 = [2222111333444] day2 = [22111333333444] day3 = [2221111133334444]. The inputs are sequences of postures (one sequence per day); I need to obtain a plot with a "vertical representation" of these postures (days on the x axis, hour of the day on the y axis; for each hour I have about 3600 samples, one sample per second). Using the "hold on" command is no problem, but I don't want to overlap the data; I need to place the vector data side by side.
Andrea
It goes something like this, but of course, if you have 30 days and one entry per second, you would need to use a matrix and sum the individual rows. Also, you don't need to make the vectors the same size, but then you have to use a different parameter for the x axis (Days) every time.
% One vector of posture labels (1..4) per day
day1=[2 2 2 2 1 1 1 1 3 3 3 4 4 4];
day2=[2 2 1 1 1 3 3 3 3 3 4 4 4 4];
day3=[2 2 2 1 1 1 1 3 4 4 4 4 4 4];
Days=1:3;
% Count the samples of each posture per day.
% With one sample per second, divide each count by 3600 to get hours.
LayingTime=[sum(day1==1),sum(day2==1),sum(day3==1)];
SittingTime=[sum(day1==2),sum(day2==2),sum(day3==2)];
StandingTime=[sum(day1==3),sum(day2==3),sum(day3==3)];
RockingTime=[sum(day1==4),sum(day2==4),sum(day3==4)];
% One line per posture, days on the x axis
plot(Days,LayingTime,Days,SittingTime,Days,StandingTime,Days,RockingTime)
xlabel('Day')
ylabel('Hours of Activity')
legend('Hours Laying','Hours Sitting','Hours Standing','Hours Rocking')
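The counting step does not require the day vectors to have the same length. A minimal Python sketch of that step follows; the sample data are invented for illustration, the helper name hours_per_posture is hypothetical, and one sample per second is assumed so that dividing by 3600 gives hours.

```python
from collections import Counter

SAMPLES_PER_HOUR = 3600  # assumption: one posture sample per second


def hours_per_posture(day_samples, postures=(1, 2, 3, 4)):
    """Return {posture: hours} for one day of per-second posture labels."""
    counts = Counter(day_samples)
    return {p: counts[p] / SAMPLES_PER_HOUR for p in postures}


# Two made-up days of different lengths (not real data)
days = [
    [2] * 7200 + [1] * 3600,
    [2] * 3600 + [3] * 1800 + [4] * 1800 + [1] * 900,
]

table = [hours_per_posture(day) for day in days]
for day_number, hours in enumerate(table, start=1):
    print(day_number, hours)
```

Each entry of table then holds the four values to plot for that day, whatever the original vector length was.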
I have another challenge that I am trying to resolve, but I have not found a solution yet. Here is the scenario. Pardon the formatting if it gets messed up when posting.
ACCT_NUM CERT_ID Code Date Desired Output
A 1 10 1/1/2007 1/1/2008
A 1 10 1/1/2008 1/1/2008
A 1 20 1/1/2009 1/1/2010
A 1 20 1/1/2010 1/1/2010
A 1 10 1/1/2011 1/1/2012
A 1 10 1/1/2012 1/1/2012
A 2 20 1/1/2007 1/1/2008
A 2 20 1/1/2008 1/1/2008
A 2 10 1/1/2009 1/1/2010
A 2 10 1/1/2010 1/1/2010
A 2 30 1/1/2011 1/1/2011
A 2 10 1/1/2012 1/1/2013
A 2 10 1/1/2013 1/1/2013
As you can see, I need to take the MAX of the date for each group of consecutive Code values (within ACCT_NUM and CERT_ID) before the value changes. If the same value appears again later, I need to take the MAX of the date for that group separately. For example, for CERT_ID '1', I cannot group all four rows of Code 10 to get a MAX date of 1/1/2012. I need to get the MAX for the first two rows and then another MAX for the last two rows separately, since there is another code in between. I am trying to accomplish this in Cognos Framework Manager.
Gurus, please advise.
The syntax for getting the max value for CERT_ID is:
maximum(Date for CERT_ID)
If you want additional levels for the max, you can use the following syntax:
maximum(Date for ACCT_NUM,CERT_ID,Code)
In general, it is best practice to group and summarize values in the report, not in Framework Manager.
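Note that the desired output in the question treats each consecutive run of the same Code as its own group, so a repeated Code after a different one starts a new group. The following Python sketch (not Cognos syntax; the helper name max_date_per_run is hypothetical) reproduces that logic on the sample data, assuming the rows are already sorted by ACCT_NUM, CERT_ID and Date.

```python
from datetime import date
from itertools import groupby

# (ACCT_NUM, CERT_ID, Code, Date) rows from the question, already sorted
rows = [
    ("A", 1, 10, date(2007, 1, 1)), ("A", 1, 10, date(2008, 1, 1)),
    ("A", 1, 20, date(2009, 1, 1)), ("A", 1, 20, date(2010, 1, 1)),
    ("A", 1, 10, date(2011, 1, 1)), ("A", 1, 10, date(2012, 1, 1)),
    ("A", 2, 20, date(2007, 1, 1)), ("A", 2, 20, date(2008, 1, 1)),
    ("A", 2, 10, date(2009, 1, 1)), ("A", 2, 10, date(2010, 1, 1)),
    ("A", 2, 30, date(2011, 1, 1)),
    ("A", 2, 10, date(2012, 1, 1)), ("A", 2, 10, date(2013, 1, 1)),
]


def max_date_per_run(rows):
    """Append to each row the max Date of its consecutive (ACCT_NUM, CERT_ID, Code) run."""
    out = []
    # groupby only merges adjacent rows with equal keys, so a Code that
    # reappears after a different Code forms a separate group.
    for _, run in groupby(rows, key=lambda r: (r[0], r[1], r[2])):
        run = list(run)
        run_max = max(r[3] for r in run)
        out.extend(r + (run_max,) for r in run)
    return out


result = max_date_per_run(rows)
for row in result:
    print(row)
```

For CERT_ID 1 this gives 1/1/2008 for the first two Code 10 rows and 1/1/2012 for the last two, as in the Desired Output column.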
I have a dataset with some numbers for each month.
For example:
1/1/2009 param1 param2
2/1/2009 param1 param2
3/1/2009 param1 param2
What I need is to show 4 lines of summary:
last 6 months
this year (last 12 months)
last year (12 to 24 months ago)
total
I was thinking of adding a parameter to each record that assigns it to a specific time frame (6 months ago, 12 months ago, etc.). But groups 1 and 2 overlap, so some records would belong to both.
Do you have any suggestions on how to display such a summary?
Thanks a lot!
Irene
You can use Running Totals with an evaluate formula to only total certain rows. Assuming you have an {asofdate} parameter and the {month} data field...
last 6 months
datediff("m", {month}, {asofdate}) <= 6
this year (last 12 months)
datediff("m", {month}, {asofdate}) <= 12
last year (12 to 24 months ago)
datediff("m", {month}, {asofdate}) >= 13
and datediff("m", {month}, {asofdate}) <= 24
total
just use a sum
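Because each running total has its own condition, a record can count toward several summary lines at once, which handles the overlap. Here is a rough Python sketch of the same conditions. The records and the as-of date are made up, the names months_between and summarize are hypothetical, and months_between is assumed to behave like datediff("m", ...) by counting month boundaries.

```python
from datetime import date


def months_between(earlier, later):
    """Number of month boundaries between two dates (assumed analogue of datediff("m"))."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def summarize(records, as_of):
    """Sum values into overlapping buckets using the same tests as the formulas above."""
    totals = {"last 6 months": 0, "this year": 0, "last year": 0, "total": 0}
    for month, value in records:
        diff = months_between(month, as_of)
        if diff <= 6:
            totals["last 6 months"] += value
        if diff <= 12:
            totals["this year"] += value
        if 13 <= diff <= 24:
            totals["last year"] += value
        totals["total"] += value
    return totals


# Invented (month, value) records, not data from the question
records = [
    (date(2009, 1, 1), 10),
    (date(2009, 6, 1), 20),
    (date(2010, 1, 1), 30),
    (date(2010, 6, 1), 40),
]
summary = summarize(records, as_of=date(2010, 7, 1))
print(summary)
```

With these records, the two 2010 rows fall into both "last 6 months" and "this year", while the two 2009 rows fall only into "last year".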