How do I convert monthly data into quarterly data in MATLAB as part of a table?

How do I convert monthly data into quarterly data?
For example:
My first table looks like this:
TB3m
0.08
0.07
0.06
0.12
0.13
0.14
My second table is a table of dates:
dates
1975/01/31
1975/02/28
1975/03/31
1975/04/30
1975/05/31
1975/06/30
I want to convert table 1 such that it takes the average of 3 months to give me quarterly data.
It should ideally look like:
TB3M_quarterly
0.07
0.13
so that it can match my other quarterly dates table which looks like:
dates_quarterly
1975/03/31
1975/06/30
Overall, my data runs from January 1975 to June 2021, which would give me around 186 quarterly observations. Please suggest what I can use. It is the first time I am using MATLAB.

If you don't understand a command, look it up in the MATLAB documentation. A quick way to do so on Windows is to click on the function name and press F1. For example, if you don't understand what calmonths is doing, click on calmonths and press F1.
TBm = [0.08; 0.07; 0.06; 0.12; 0.13; 0.14]; %values
% months
t1 = datetime(1975, 01, 31); %1st month
% datetime is a datatype to represent time
t2 = datetime(1975, 06, 30); %change this as your last month
dates = t1:calmonths(1):t2;
dates = dates'; %row vector to column vector
TBm_reshaped = reshape(TBm, 3, []);
TB3m_quarterly = mean(TBm_reshaped);
dates_quarterly = dates(3:3:end);
TB3m_quarterly = TB3m_quarterly';
T = table(dates_quarterly, TB3m_quarterly)
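The reshape approach above relies on the series starting at a quarter boundary and having a length that is a multiple of 3. If that cannot be guaranteed, a minimal sketch of an alternative is to group the rows by calendar year and quarter instead. The variable names G, yr and qtr are only illustrative, and a trailing incomplete quarter would be averaged over the months that are present:

```matlab
TBm = [0.08; 0.07; 0.06; 0.12; 0.13; 0.14];      % monthly values
dates = datetime(1975, 1, 31) + calmonths(0:5)';  % month-end dates

% One group per (year, quarter) pair
[G, yr, qtr] = findgroups(year(dates), quarter(dates));

% Mean of the months that fall in each quarter
TB3m_quarterly = splitapply(@mean, TBm, G);

% Last day of each quarter (months 3, 6, 9, 12)
dates_quarterly = dateshift(datetime(yr, 3*qtr, 1), 'end', 'month');

T = table(dates_quarterly, TB3m_quarterly)
```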

I suggest you organize your data in a timetable, and then use the function retime()
% Replicating your data
dates = datetime(1975,1,31) : calmonths(1) : datetime(1975,6,30);
T = timetable([8 7 6 12 13 14]'./100, 'RowTimes', dates);
% Using retime to get quarterly values:
T_quarterly = retime(T, 'quarterly', 'mean')
Here you aggregate by taking the mean of the monthly data. For other aggregation methods, look at the documentation for retime()
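One thing to be aware of: retime labels each quarterly bin with the start of the quarter (01-Jan-1975, 01-Apr-1975), whereas the question uses quarter-end dates. A small sketch of how the row times could be moved to the last day of each quarter with dateshift, assuming that labelling is what you want (the variable name TB3m is chosen here for illustration):

```matlab
dates = datetime(1975, 1, 31) + calmonths(0:5)';
T = timetable(dates, [8 7 6 12 13 14]'./100, 'VariableNames', {'TB3m'});

% Quarterly means; row times are the first day of each quarter
T_quarterly = retime(T, 'quarterly', 'mean');

% Relabel each row with the last day of its quarter
T_quarterly.Properties.RowTimes = ...
    dateshift(T_quarterly.Properties.RowTimes, 'end', 'quarter');
```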

Related

How can I convert time from seconds to decimal year in Matlab?

I have a dataset which includes the seconds that have passed since 2000-01-01 00:00:00.0 and I would like them to be converted to decimal years (for example 2013.87).
An example from the dataset:
416554767.293262
416554768.037637
416554768.782013
416554769.526386
416554770.270761
416554771.015136
416554771.759509
416554772.503884
416554773.248258
416554773.992632
416554774.737007
416554775.481381
416554776.225757
416554776.970131
416554777.714504
416554778.458880
Can anyone help me out on this?
Thanks!
You should be able to perform these computations using methods of datetime and duration. A bit like this. I've tried to be careful regarding the number of seconds / year, since of course that varies depending on whether the year in question is a leap year.
% Original data
data = [416554767.293262
416554768.037637
416554768.782013
416554769.526386
416554770.270761
416554771.015136
416554771.759509
416554772.503884
416554773.248258
416554773.992632
416554774.737007
416554775.481381
416554776.225757
416554776.970131
416554777.714504
416554778.458880];
% Original data is seconds since 'base':
base = datetime(2000,1,1);
% Get datetimes corresponding to 'data'
dates = base + seconds(data);
% Extract the year portion from the dates
wholeYears = year(dates);
% Extract the remainder of the dates as seconds
remainderInSeconds = seconds(dates - datetime(wholeYears,1,1));
% Calculate the number of seconds in each of the years
secondsPerYear = seconds(datetime(wholeYears + 1, 1, 1) - datetime(wholeYears, 1, 1));
% Final result is whole years + remainder expressed as years
result = wholeYears + (remainderInSeconds ./ secondsPerYear);
fprintf('%.16f\n', result);
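As a sanity check, the first sample can be worked by hand: 416554767.29 s is about 4821.24 days after 2000-01-01, there are 4749 days from 2000-01-01 to 2013-01-01, and the remaining 72.24 days out of 365 give roughly 2013.198. The same calculation for a single value can be sketched more compactly with dateshift (the variable names are illustrative, not from the answer above):

```matlab
secs = 416554767.293262;                  % seconds since 2000-01-01 00:00:00
t = datetime(2000, 1, 1) + seconds(secs);

yearStart     = dateshift(t, 'start', 'year');
nextYearStart = yearStart + calyears(1);

% Dividing one duration by another yields a plain double
decimalYear = year(t) + (t - yearStart) / (nextYearStart - yearStart);
fprintf('%.6f\n', decimalYear);
```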

Convert milliseconds into hours and plot

I'm trying to convert an array of millisecond timestamps (and the data recorded at each one) into hours and minutes.
Millis = [60000 120000 180000 240000....]
Power = [ 12 14 12 13 14 ...]
I've set it up so the data records every minute, hence the 60000 millis (= 1 minute). I am trying to plot time on the x axis and power on the y axis. I would like the x axis displayed in hours and minutes, with each power value plotted at its corresponding time.
I've tried this
for i=2:length(Millis)
Conv2Min(i) = Millis(i) / 60000;
Time(i) = startTime + Conv2Min(i);
if (Time(i-1) > Time(i) + 60)
Time(i) + 100;
end
end
s = num2str(Time);
This is an attempt to turn the milliseconds into hours, starting at 08:00 and rolling over to 09:00 once 60 minutes have passed. The problem is plotting this: I get a gap between 08:59 and 09:00. I also cannot keep the leading zero (0 = initial 0).
In this scenario it is preferable to work with datenum values and then use datetick to set the format of the tick labels of your plot to 'HH:MM'.
Let's suppose that you started taking measurements at t_1 = [HH_1, MM_1] and stopped taking measurements at t_2 = [HH_2, MM_2].
A cool trick to generate the array of datenum values is to use the following expression:
time_datenums = HH_1/24 + MM_1/1440 : 1/1440 : HH_2/24 + MM_2/1440;
Explanation:
We are creating a regularly-spaced vector time_datenums = A:B:C using the colon (:) operator, where A is the starting datenum value, B is the increment between datenum values and C is the ending datenum value.
Since your measurements have been taken every minute (60000 milliseconds), then the increment between datenum values should be of 1 minute too. As a day has 24 hours, that makes 1440 minutes a day, so use B = 1/1440 as the increment between vector elements, to get 1 minute increments.
For A and C we simply need to divide the hour digits by 24 and the minute digits by 1440 and sum them up like this:
A = HH_1/24 + MM_1/1440
C = HH_2/24 + MM_2/1440
So for example, if t_1 = [08, 00], then A = 08/24 + 00/1440. As simple as that.
Notice that this procedure doesn't use the datenum function at all, yet it still generates a valid array of datenum values, because only the time-of-day part of the datenum matters here and the date part can be ignored.
Going back to your original problem, let's have a look at the code:
time_millisec = 0:60000:9e6; % Time array in milliseconds.
power = 10*rand(size(time_millisec)); % Random power data.
% Elapsed time in milliseconds.
elapsed_millisec = time_millisec(end) - time_millisec(1);
% Integer part of elapsed hours.
elapsed_hours_int = fix(elapsed_millisec/(1000*60*60));
% Fractional part of elapsed hours.
elapsed_hours_frac = (elapsed_millisec/(1000*60*60)) - elapsed_hours_int;
t_1 = [08, 00]; % Start time 08:00
t_2 = [t_1(1) + elapsed_hours_int, t_1(2) + elapsed_hours_frac*60]; % Compute End time.
HH_1 = t_1(1); % Hour digits of t_1
MM_1 = t_1(2); % Minute digits of t_1
HH_2 = t_2(1); % Hour digits of t_2
MM_2 = t_2(2); % Minute digits of t_2
time_datenums = HH_1/24+MM_1/1440:1/1440:HH_2/24+MM_2/1440; % Array of datenums.
plot(time_datenums, power); % Plot data.
datetick('x', 'HH:MM'); % Set 'HH:MM' datetick format for the x axis.
I would use datenums:
Millis = [60000 120000 180000 240000 360000];
Power = [ 12 14 12 13 14 ];
d = [2017 05 01 08 00 00]; %starting point (y,m,d,h,m,s)
d = repmat(d,[length(Millis),1]);
d(:,6)=Millis/1000; %add them as seconds
D=datenum(d); %convert to datenums
plot(D,Power) %plot
datetick('x','HH:MM') %set the x-axis to datenums with HH:MM as format
An even shorter approach would be (thanks to codeaviator for the idea):
Millis = [60000 120000 180000 240000 360000];
Power = [ 12 14 12 13 14 ];
D = 8/24+Millis/86400000; %24h / day, 86400000ms / day
plot(D,Power) %plot
datetick('x','HH:MM') %set the x-axis to datenums with HH:MM as format
I guess there is an easier way using datetick and datenum, but I couldn't figure it out. This should solve your problem for now:
Millis=6e4:6e4:6e6;
power=randi([5 15],1,numel(Millis));
hours=floor(Millis/(6e4*60))+8; minutes=mod(Millis,(6e4*60))/6e4; % Calculate the hours and minutes of your Millisecond vector.
plot(Millis,power)
xlabels=arrayfun(@(x,y) sprintf('%d:%02d',x,y),hours,minutes,'UniformOutput',0); % Create time-strings of the format H:MM for your XTickLabels
tickDist=10; % define how often you want your XTicks (e.g. 1 if you want the ticks every minute)
set(gca,'XTick',Millis(tickDist:tickDist:end),'XTickLabel',xlabels(tickDist:tickDist:end))
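As an alternative to the datenum-based answers above, newer MATLAB releases can plot duration values directly, which avoids the manual tick labelling. This is a sketch under the assumption that your release supports duration arrays in plot; the 08:00 start time and the sample data are taken from the question, with the elided values left out:

```matlab
Millis = [60000 120000 180000 240000 300000];
Power  = [12 14 12 13 14];

% Time of day as a duration: 08:00 plus the elapsed milliseconds
t = hours(8) + milliseconds(Millis);
t.Format = 'hh:mm';   % display as hours and minutes

plot(t, Power)        % x axis is labelled in hh:mm
```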

Generate timestamp series in Matlab?

Hi all,
I wonder if there is a way to generate a timestamp series in MATLAB.
I assume there will be a start time, an end time, and a frequency.
It is simple to generate a normal series using 1:1:100 (1 to 100 by 1).
How can I generate a timestamp series in a similar way?
For instance, if I specify a start time of 9am and an end time of 10am, I want to generate something like 9:00:00:000, 9:00:00:500, 9:00:01:000, ....
spaced 500 milliseconds apart.
Or even better, include the date as well.
Use datenum. The only problem you might run into is colliding with a leap second/day or daylight saving time if you're spanning a long time period (though I don't think that's implemented in datestr).
Play around with datenum, now and datestr
starttime = datenum(2000, 1, 1, 9, 0, 0);
dt = 0.500/86400; % datenum is a serial time format with 1 = 1 day = 86400 sec
N = 5;
timevec = starttime + dt*(0:(N-1));
>> datestr(timevec, 'HH:MM:SS.FFF')
ans =
09:00:00.000
09:00:00.500
09:00:01.000
09:00:01.500
09:00:02.000
Starting from R2015a, you can use the milliseconds function to build a vector of timesteps between two time points:
start = datetime('2017/1/3 9:00:00:000','InputFormat','yyyy/MM/dd H:mm:ss:SSS');
step = milliseconds(500);
fin = datetime('2017/1/3 10:00:00:000','InputFormat','yyyy/MM/dd H:mm:ss:SSS');
time_vec = start:step:fin;
If you don't define the date explicitly it will choose the current date.
You can also have one structure for both the time and the data, you can use the timeseries class (using start from above):
data = rand(7201,1);
ts = timeseries(data,'Name','MyTs');
ts.TimeInfo.StartDate = start;
ts.TimeInfo.Units = 'milliseconds';
ts = setuniformtime(ts,'Interval',500);
This will create a time series object:
>> ts
timeseries
Common Properties:
Name: 'MyTs'
Time: [7201x1 double]
TimeInfo: [1x1 tsdata.timemetadata]
Data: [7201x1 double]
DataInfo: [1x1 tsdata.datametadata]
with the following time info:
>> ts.TimeInfo
tsdata.timemetadata
Package: tsdata
Uniform Time:
Length 7201
Increment 500 milliseconds
Time Range:
Start 03-Jan-2017 09:00:00
End 03-Jan-2017 10:00:00
Common Properties:
Units: 'milliseconds'
Format: ''
StartDate: '03-Jan-2017 09:00:00'
It depends on your needs, but you can consider using the combination of datetime() and one or many of days(), hours(), minutes(), seconds() etc. functions.
Let's write some code:
start=datetime(1985,07,13,9,0,0); % your start date
steps=seconds(0:0.5:100); % your vector with steps
time_series=start+steps; % your time series (named time_series to avoid shadowing the built-in timeseries class)
You can also set the display format to suit your needs; to do so, check the datetime properties documentation.
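For example, the Format property of a datetime array controls how the values are displayed. A minimal sketch showing millisecond precision, using a shorter step vector than above so the result is easy to inspect (the names stamps and labels are illustrative):

```matlab
start  = datetime(1985, 7, 13, 9, 0, 0);
steps  = seconds(0:0.5:2);        % 0, 0.5, 1, 1.5, 2 seconds
stamps = start + steps;

% Show the date plus time down to milliseconds
stamps.Format = 'yyyy-MM-dd HH:mm:ss.SSS';

labels = cellstr(stamps);         % text labels using the same format
disp(labels')
```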

MATLAB how to filter timeseries minute bar data so as to calculate realised volatility?

I have a data set that looks like this:
'2014-01-07 22:20:00' [0.0016]
'2014-01-07 22:25:00' [0.0013]
'2014-01-07 22:30:00' [0.0017]
'2014-01-07 22:35:00' [0.0020]
'2014-01-07 22:40:00' [0.0019]
'2014-01-07 22:45:00' [0.0022]
'2014-01-07 22:50:00' [0.0019]
'2014-01-07 22:55:00' [0.0019]
'2014-01-07 23:00:00' [0.0021]
'2014-01-07 23:05:00' [0.0021]
'2014-01-07 23:10:00' [0.0026]
The first column is the time stamp, recorded every 5 min; the second column is the return.
For each day, I want to calculate sum of squared 5 min bar returns. Here I define a day as from 5:00 pm - 5:00 pm. ( So date 2014-01-07 is from 2014-01-06 17:00 to 2014-01-07 17:00 ). So for each day, I would sum squared returns from 5:00 pm - 5:00 pm. Output will be something like:
'2014-01-07' [0.046]
'2014-01-08' [0.033]
How should I do this?
Here is an alternative solution.
Just defining some random data:
t1 = datetime('2016-05-31 00:00:00','InputFormat','yyyy-MM-dd HH:mm:ss');
t2 = datetime('2016-06-05 00:00:00','InputFormat','yyyy-MM-dd HH:mm:ss');
Samples = 288; %because your sampling time is 5 mins
t = (t1:1/Samples:t2).';
X = rand(1,length(t));
First we find the first sample that meets the given criterion (it can be anything; this example uses 05:00, so for a 5:00 pm boundary test t.Hour >= 17 instead)
n = find(t.Hour >= 5,1,'first')
b = n;
Find the total number of days after the given sample
totaldays = length(find(diff(t.Day)))
and square and accumulate the 'return' for each day
for i = 1:totaldays - 1
sum_acc(i) = sum(X(b:b + (Samples - 1)).^2);
b = b + Samples;
end
This is just for visualization of the data
Dates = datetime(datestr(bsxfun(@plus ,datenum(t(n)) , 0:totaldays - 2)),'Format','yyyy-MM-dd')
table(Dates,sum_acc.','VariableNames',{'Date' 'Sum'})
Date Sum
__________ ______
2016-05-31 93.898
2016-06-01 90.164
2016-06-02 90.039
2016-06-03 91.676
I assume that your dates are in a cell array and your values in a vector.
So for example you have:
date = {'2014-01-07 16:20:00','2014-01-07 22:25:00','2014-01-08 16:20:00','2014-01-08 22:25:00'};
value = [1 2 3 4];
You can find the sum for each date with:
%Creation of an index that separates each "day".
[~,~,ind] = unique(floor(cellfun(@datenum,date)+datenum(0,0,0,7,0,0))) %datenum(0,0,0,7,0,0) corresponds to the 7-hour offset (17:00 becomes midnight)
for i = 1:length(unique(ind))
sumdate(i) = sum(value(ind==i).^2)
end
And you can find the corresponding day of each sum with
datesum = cellstr(datestr(unique(floor(cellfun(@datenum,date)+datenum(0,0,0,7,0,0)))))
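The same 7-hour shift can also be sketched with datetime and findgroups instead of datenum. The sample stamps and returns below are made up for illustration, and the sketch assumes that a bar stamped exactly 17:00 belongs to the next trading day; adjust the offset if your convention differs:

```matlab
stamps = datetime({'2014-01-06 16:55:00'; '2014-01-06 17:05:00'; ...
                   '2014-01-07 16:55:00'; '2014-01-07 17:05:00'}, ...
                  'InputFormat', 'yyyy-MM-dd HH:mm:ss');
ret = [0.01; 0.02; 0.03; 0.04];   % illustrative 5-minute returns

% Shift by 7 hours so that 17:00 maps to midnight of the next day,
% then truncate to the day to get the trading-day label
tradingDay = dateshift(stamps + hours(7), 'start', 'day');

% Sum of squared returns per trading day
[G, dayLabel] = findgroups(tradingDay);
realisedVar = splitapply(@(r) sum(r.^2), ret, G);

result = table(dayLabel, realisedVar)
```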

How to construct moving time average with different weights for different months?

So I want to construct a moving time average with different weights for different months. E.g. see the filter function at http://www.mathworks.com/help/matlab/data_analysis/filtering-data.html, where b = # of days in each month and a = # of days in a year.
The issue is, though, that the time-series is a series of temperatures for every month (and I want to construct a yearly average temperature for each set of possible years, where a year could be from March to February, for example). Using this approach, the first month in each window would be weighted as 31/365, irrespective of whether the first month is January or June.
In that case, the standard filter algorithm wouldn't work. Is there an alternative?
A solution that incorporates leap years would also be nice, but is not necessary for an initial solution.
A weighted average is defined as sum(x .* weights) / sum(weights). If you want to calculate this in a moving average kind of way, I guess you could do (untested):
moving_sum = @(n, x) filter(ones(1,n), 1, x);
moving_weighted_avg = moving_sum(12, temperature .* days_per_month) ...
./ moving_sum(12, days_per_month);
If temperature is a vector of monthly temperatures and days_per_month contains the actual number of days of the corresponding months, this should even work in case of leap years.
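To see what this computes, here is a small worked example with made-up numbers and a 3-month window instead of 12. Because filter treats the values before the start of the series as zero in both the numerator and the denominator, the first n-1 outputs are weighted averages over the months available so far rather than over a full window:

```matlab
temperature    = [10 20 30 40];   % illustrative monthly temperatures
days_per_month = [31 28 31 30];   % days in the corresponding months
n = 3;                            % window length in months

moving_sum = @(n, x) filter(ones(1, n), 1, x);
moving_weighted_avg = moving_sum(n, temperature .* days_per_month) ...
    ./ moving_sum(n, days_per_month);

% Third entry: (10*31 + 20*28 + 30*31) / (31 + 28 + 31) = 1800 / 90 = 20
disp(moving_weighted_avg)
```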
Edit to answer comment
You can reconstruct days_per_month like so:
start_year = 2003;
start_month = 10;
nmonth = 130;
month_offset = 0:nmonth - 1;
month = mod(start_month + month_offset - 1, 12) + 1;
year = start_year + floor((start_month + month_offset - 1) / 12);
days_per_month = eomday(year, month);
disp([month_offset; year; month; days_per_month]') %print table to check