datetime matlab different temporal resolution - matlab

I am trying to plot two time series in one graph. Unfortunately, the data sets have different temporal resolutions and my code using datetime does not work. My aim is one xtick per hour. Any idea how I can solve that problem? Thanks!
dataset1 = rand(1,230).';
dataset2 = rand(1,33).';
xstart = datenum('19/02 09:00','dd/mm HH:MM');
xend = datenum('21/02 18:00','dd/mm HH:MM');
x = linspace(xstart,xend,20);
Dat = linspace(xstart,xend,numel(dataset1));
x1=[1:1:230].' %values every 15 minutes
x0_OM = datenum('19/02 09:00','dd/mm HH:MM');
x1_OM = datenum('20/02 18:00','dd/mm HH:MM');
xData = linspace(x0_OM,x1_OM,20);
Dat2 = linspace(xstart,xend,numel(dataset2));
x2=[1:4:130].' %hourly values
fig=figure ();
yyaxis left
plot(x1,dataset1);
ylabel('Dataset 1')
xlabel('timesteps (15min Interval)');
yyaxis right
plot(x2,dataset2);
ylabel('Dataset 2')
set(gca,'XTick', xData) %does not work
datetick('x', 'dd/mm HH:MM', 'keeplimits','keepticks') %does not work

I generalized your code a bit and used something nicer to inspect than random numbers. I removed the labelling part to keep the script short.
% Dataset 1, 15 minutes interval
xstart1 = datenum('19/02 09:00','dd/mm HH:MM');
xend1 = datenum('21/02 18:00','dd/mm HH:MM');
Dat1 = xstart1:1/24/4:xend1; % 1/24/4 is a 15 minutes step
dataset1 = sin(linspace(0, 2*pi, numel(Dat1)));
% Dataset 2, 1 hour interval
xstart2 = datenum('19/02 09:00', 'dd/mm HH:MM');
xend2 = datenum('20/02 18:00', 'dd/mm HH:MM');
Dat2 = xstart2:1/24:xend2; % 1/24 is a 1 hour step
dataset2 = cos(linspace(0, 2*pi, numel(Dat2)));
% Determine "global" start and end.
xstart = min(xstart1, xstart2);
xend = max(xend1, xend2);
Dat = xstart:1/24:xend;
% Plot
fig = figure();
hold on;
plot(Dat1, dataset1, '*');
plot(Dat2, dataset2, 'r*');
set(gca, 'XTick', Dat);
datetick('x', 'dd/mm HH:MM', 'keepticks', 'keeplimits');
hold off;
In principle, that should work, but the output is not nice because of the long tick labels. Could you please check whether that is what you wanted to achieve?

The last two commands actually work, but unfortunately the ticks are placed somewhere other than where the graph is. Your x1 (and x2) values run from 1 to 230, while the xData values for the ticks are around 730000. If you use the datenum values as the x values for the plot, it works.
Another issue is that the lengths of the vectors don't correspond to one value every 15 minutes (or every hour). If you want a value every 15 minutes for the time span from 19/02 09:00 to 21/02 18:00 (57 hours in total), you need:
4(1/h)*57(hours) + 1 for the last value = 229 values
or in general:
(timespan / timewindow) + 1
If you apply those changes to your code you get
dataset1 = rand(1,229).';
dataset2 = rand(1,34).';
xstart = datenum('19/02 09:00','dd/mm HH:MM');
xend = datenum('21/02 18:00','dd/mm HH:MM');
% in datenum format, 1 = 24 hours
fifteenminutes=(1/24/4);%15 minutes
spacing_in_15min=round((xend-xstart)/fifteenminutes)+1;%duration divided by time window, +1 for last value (round guards against floating-point error)
x1 = linspace(xstart,xend,spacing_in_15min); %values every 15 minutes
x0_OM = datenum('19/02 09:00','dd/mm HH:MM');
x1_OM = datenum('20/02 18:00','dd/mm HH:MM');
onehour=1/24; %one hour
spacing_in_1hour=round((x1_OM-x0_OM)/onehour)+1;%duration divided by time window, +1 for last value
x2 = linspace(x0_OM,x1_OM,spacing_in_1hour); %hourly values
tickvalues = linspace(xstart,xend,round((xend-xstart)/onehour)+1);
fig=figure ();
yyaxis left
plot(x1,dataset1);
ylabel('Dataset 1')
xlabel('timesteps (15min Interval)');
yyaxis right
plot(x2,dataset2);
ylabel('Dataset 2')
set(gca,'XTick', tickvalues); %Ticks every hour for the larger dataset
set(gca,'XLim', [x2(1) x2(end)]); %focus on the time with both datasets
datetick('x', 'dd/mm HH:MM', 'keeplimits','keepticks'); %Tickformat
Which I think is what you were looking for. I have removed a few variables (x, Dat, xData) that were unused. Unfortunately, even in full-screen mode 34 tick values is a lot, so you might want to change the tick format or zoom in on a specific part.
If you have to do more work in this area, I recommend looking into the MATLAB datetime type, which I find a bit easier to handle than datenum.
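As a rough illustration of the datetime route mentioned above, here is a minimal sketch. It assumes an arbitrary year (2018), since the question only gives day and month, and it uses random placeholder data; xticks and xtickformat require R2016b or later.

```matlab
% Assumption: the year 2018 is arbitrary; the question gives only day and month.
t1 = (datetime(2018,2,19,9,0,0):minutes(15):datetime(2018,2,21,18,0,0)).'; % 15-minute steps
t2 = (datetime(2018,2,19,9,0,0):hours(1):datetime(2018,2,20,18,0,0)).';    % hourly steps
dataset1 = rand(numel(t1),1); % placeholder data
dataset2 = rand(numel(t2),1); % placeholder data

figure;
yyaxis left
plot(t1, dataset1);
ylabel('Dataset 1')
yyaxis right
plot(t2, dataset2);
ylabel('Dataset 2')
xlim([t2(1) t2(end)]);      % focus on the span covered by both datasets
xticks(t2);                 % one tick per hour
xtickformat('dd/MM HH:mm'); % datetime format: MM = month, mm = minute
```

Because the time vectors are built with a fixed step, the number of samples follows from the start and end times and does not have to be computed by hand.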

Related

Problem plotting data in Matlab (setting month and year in the x-axis)

I have the following table structure in MATLAB:
Year Month datapoint
1990 1 5
1990 2 7
.
.
.
1995 12 3
I want to plot this with datapoint on y-axis and something like 1990_1, 1990_2... on the x-axis.
How can I go about doing this?
You can format the appearance of the XAxis by getting the handle to that object with the get function, and then modifying the properties directly.
% Create example table
t = table();
t.Year = repelem((1990:1995).',12,1); % each year repeated for its 12 months
t.Month = repmat((1:12).',6,1); % months 1..12 for each of the 6 years
t.datapoint = [5:76].';
plot(t.datapoint)
% Get x axis
xaxis = get(gca,'XAxis');
% Format tick labels
xaxis.TickLabels = compose('%d_%d',t.Year,t.Month);
% Format interpreter
xaxis.TickLabelInterpreter = 'none';
% Limit number of ticks
xaxis.TickValues = 1:numel(t.datapoint);
As per your comment, to only see every 12th label:
indx = 1:72;
indx(12:12:72) = 0;
indx(indx > 1) = 1;
xaxis.TickLabels(find(indx)) = {''}
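An alternative that might be simpler is to build real dates from the Year and Month columns and let the datetime axis do the labelling. This is only a sketch: the table contents are made up to match the question, the day of the month is assumed to be 1, and the 'yyyy_M' format string is my guess at the label style asked for.

```matlab
% Example table matching the layout in the question (values are made up)
t = table();
t.Year = repelem((1990:1995).', 12, 1);
t.Month = repmat((1:12).', 6, 1);
t.datapoint = (5:76).';

% Assumption: every data point is placed on the first day of its month
d = datetime(t.Year, t.Month, 1);

plot(d, t.datapoint);
xtickformat('yyyy_M'); % e.g. 1990_1; MATLAB chooses how many ticks to show
```

With this approach the number of ticks adapts when zooming, so there is no need to blank out labels by hand.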

How to convert 3-dimensional daily data into monthly?

I have a 3-dimensional data matrix for each of ten years (2001-2010). In each file the data matrix is 180 x 360 x 365/366 (latitude x longitude x daily rainfall), for example: 2001: 180x360x365, 2002: 180x360x365, 2003: 180x360x365, 2004: 180x360x366, ..., 2010: 180x360x365.
Now I want to convert this daily rainfall into monthly rainfall (by summing) and combine all the years in one file.
So my final output will be 180x360x120 (latitude x longitude x monthly rainfall over the ten years).
It might be time consuming, but I suppose you could use some form of loop to iterate over the data in each year on a monthly basis, pick out the appropriate number of data points for each month, and then add that to a final data set. Something to the effect of the (very rough) code below might work:
years = 2001:2010;
months = {'Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'};
finalDataset = [];
for i = 1:length(years)
    year = years(i);
    % placeholder: loadYearData is a hypothetical helper that returns the
    % 180x360x365/366 array for that year
    yearData = loadYearData(year);
    for j = 1:length(months)
        month = months{j};
        switch month
            case {'Apr','Jun','Sep','Nov'}
                days = 30;
            case 'Feb'
                days = 28;
                if year == 2004 || year == 2008
                    days = 29; % leap years in 2001-2010
                end
            otherwise
                days = 31;
        end
        monthData = yearData(:,:,1:days); % extract the data for this month
        yearData(:,:,1:days) = []; % delete data already extracted
        monthSummed = sum(monthData, 3); % monthly total; latitude and longitude are kept
        finalDataset = cat(3, finalDataset, monthSummed); % stack months along the 3rd dimension
    end
end
Apologies it's very shabby and I haven't included some of the indexing details, but I hope that helps in at least illustrating an idea? I'm also not entirely sure whether 'if' statements work within 'switch' statements, but the days amendment can be added elsewhere if not.
I am sure you can vectorise this to work faster, but it should do the job. Haven't tested properly
% range of years
years = 2000:2016;
leap_years = [2000 2004 2008 2012 2016];
% Generating random data
nr_of_years = numel(years);
rainfall_data = cell(nr_of_years, 1);
for i=1:nr_of_years
nr_of_days = 365;
if ismember(years(i), leap_years);
nr_of_days = 366;
end
rainfall_data{i} = rand(180, 360, nr_of_days);
end
The actual code you need is below
% fixed stuff
months = 12;
nr_of_days = [31 28 31 30 31 30 31 31 30 31 30 31];
nr_of_days_leap = [31 29 31 30 31 30 31 31 30 31 30 31];
% building vectors of month indices for days
month_indices = [];
month_indices_leap = [];
for i=1:months
month_indices_temp = repmat(i, nr_of_days(i), 1);
month_indices_leap_temp = repmat(i, nr_of_days_leap(i), 1);
month_indices = [month_indices; month_indices_temp];
month_indices_leap = [month_indices_leap; month_indices_leap_temp];
end
% the result will be stored here
result = zeros(size(rainfall_data{1}, 1), size(rainfall_data{1}, 2), months*nr_of_years);
for i=1:nr_of_years
% determining which indices to use depending if it is a leap year
month_indices_temp = month_indices;
if size(rainfall_data{i}, 3)==366
month_indices_temp = month_indices_leap;
end
% data for the current year
current_data = rainfall_data{i};
% this holds the data for current year
monthy_sums = zeros(size(rainfall_data{i}, 1), size(rainfall_data{i}, 2), months);
for j=1:months
monthy_sums(:,:,j) = sum(current_data(:,:,j==month_indices_temp), 3);
end
% putting it into the combined matrix
result(:,:,((i-1)*months+1):(i*months)) = monthy_sums;
end
You can probably achieve a more elegant solution using the built-in datetime, datestr and datenum, but I am not sure those would be a lot faster or shorter.
EDIT: An alternative using built-in date functions
months = 12;
% where the result will be stored
result = zeros(size(rainfall_data{1}, 1), size(rainfall_data{1}, 2), months*nr_of_years);
for i=1:nr_of_years
current_data = rainfall_data{i};
% first day of the year
year_start_timestamp = datenum(datetime(years(i), 1, 1));
% holding current sums
monthy_sums = zeros(size(current_data, 1), size(current_data, 2), months);
% finding the month indices vector
datetime_obj = datetime(datestr(year_start_timestamp:(year_start_timestamp+size(current_data, 3)-1)));
month_indices = datetime_obj.Month;
% summing
for j=1:months
monthy_sums(:,:,j) = sum(current_data(:,:,j==month_indices), 3);
end
% result
result(:,:,((i-1)*months+1):(i*months)) = monthy_sums;
end
This 2nd solution took 1.45 seconds for me, compared to the 1.2 seconds for the first solution. The results were the same for both cases. Hope this helps.
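The month lookup can probably also be done without the datenum/datestr round trip. The following is a small sketch for a single year, using a tiny 2x3 grid filled with ones so that the monthly sums are easy to check; the variable names are mine and not taken from the code above.

```matlab
yr = 2004;                % a leap year
daily = ones(2, 3, 366);  % stand-in for one year of 180x360x366 data

% One datetime per day of the year, then the month number of each day
d = datetime(yr, 1, 1) + caldays(0:size(daily, 3) - 1);
m = month(d);

monthly = zeros(size(daily, 1), size(daily, 2), 12);
for k = 1:12
    monthly(:, :, k) = sum(daily(:, :, m == k), 3); % sum over the days of month k
end

% With all-ones input, each monthly sum equals the number of days in that month
daysPerMonth = squeeze(monthly(1, 1, :)).';
```

The leap-year handling falls out of the calendar arithmetic, so no separate leap-year list is needed.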

How to plot event frequency using datetime array in matlab

I have an array of values in datetime format. Each value represents an event happening at the specified date and time. How can I plot event frequency in events per day, events per month, etc.?
I already managed to plot events per hour of the day using histogram(mydata.Hour)
Thank you for your answers!
EDIT: some clarifications after the first answer below:
Yes, that's what I'm already doing using histogram and data.Hour. However, what I want to do is compute the average number of events per day and plot that over the whole time period covered by my events.
Here is a working example:
% generating 500 random events
dates = datetime(now-1000*rand(500,1),'convertfrom','datenum');
figure;
edges = -0.5:23.5;
histogram(dates.Hour,edges)
title('Events per hours of the day')
xlim ([-0.5 23.5])
ax1 = gca;
ax1.XTick = 0:2:23;
ax1.XTickLabel = {'Midnight','2','4','6','8','10','Noon','14','16','18','20','22'};
ax1.XTickLabelRotation = 45;
figure;
daynumber = weekday(dates);
histogram(daynumber)
title('Events per days of the week')
ax2 = gca;
ax2.XTick = [1:7];
ax2.XTickLabel = {'Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday'};
ax2.XTickLabelRotation = 45;
Assuming you have an array of datetimes already, you can access the day, hour, minute, second, etc. of each datetime object with:
datetime.Day
datetime.Hour
datetime.Minute
Using this notation, we can use a simple array to keep count of how many events happen at each day/hour/minute/etc. and then use a bar graph to plot the results. Two examples are shown below and you can extrapolate from these to modify for what you need.
Here's how it would work for plotting the event frequency for every hour in one day:
hours_count = zeros(24,1);
for dt = 1:length(datetimes)
hour = datetimes(dt).Hour;
hours_count(hour+1) = hours_count(hour+1) + 1;
end
bar(hours_count)
set(gca,'Xtick',1:24,'XTickLabel',strtrim(cellstr(num2str([0:23]'))))
xlabel('Hour of the Day')
ylabel('Number of Events')
Here's how it would work for plotting the event frequency for every day in one month:
days_in_the_month = 30;
days_count = zeros(days_in_the_month,1);
for dt = 1:length(datetimes)
day = datetimes(dt).Day;
days_count(day) = days_count(day) + 1;
end
bar(days_count)
set(gca,'Xtick',1:days_in_the_month,'XTickLabel',strtrim(cellstr(num2str([1:days_in_the_month]'))))
xlabel('Day of the Month')
ylabel('Number of Events')
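For counts along the whole time period (rather than by hour of day or day of month), histcounts and histogram accept datetime input with a calendar-based 'BinMethod'. A minimal sketch with a handful of made-up events:

```matlab
% Made-up events: two on 1 Jan 2020, three on 2 Jan 2020, one on 10 Feb 2020
dates = datetime(2020,1,1) + days([0 0.5 1.2 1.7 1.9 40]);

nPerDay = histcounts(dates, 'BinMethod', 'day');     % one bin per calendar day
nPerMonth = histcounts(dates, 'BinMethod', 'month'); % one bin per calendar month
avgPerDay = mean(nPerDay); % average over every day in the span, empty days included

figure;
histogram(dates, 'BinMethod', 'day');
xlabel('Date')
ylabel('Events per day')
```

Swapping 'day' for 'week' or 'month' in the histogram call gives the other resolutions mentioned in the question.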

plotting data for 2 consecutive years

I have the following code for producing a plot of data that extends two years in terms of day of year:
time = datenum('2008-04-17 02:00'):datenum('2009-11-24 12:27');
dateV = datevec(time);
for i = 1:length(time);
DOY(i) = time(i) - datenum(dateV(i,1),0,0);
end
data = rand(length(time),1);
plot(time,data);
set(gca,'XTick',floor(time(1:50:end))','XTickLabel',floor(DOY(1:50:end)))
Could someone suggest a method for ensuring the ticks on the x-axis are for day numbers that are multiples of 10, i.e. 110, 160, etc.?
ADDED SECTION:
DateTime=datestr(datenum('2007-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
datenum('2011-12-31 23:00','yyyy-mm-dd HH:MM'),...
'yyyy-mm-dd HH:MM');
time = datenum(DateTime,'yyyy-mm-dd HH:MM');
dateV = datevec(time);
for i = 1:length(time);
DOY(i) = time(i) - datenum(dateV(i,1),0,0);
end
data = rand(length(time),1);
plot(time,data);
mydays = ~mod(floor(DOY),40); %true for days that are multiples of 40
set(gca,'XTick',floor(time(mydays))','XTickLabel',floor(DOY(mydays)))
Error using set
Values must be monotonically increasing
This can be fixed by removing floor, i.e.
set(gca,'XTick',time(mydays),'XTickLabel',floor(DOY(mydays)))
But it generates the labels in bold, why is this?
You can generate an index to days that are multiples of N using the mod function. This function returns you the remainder after dividing a number by N.
mydays = ~mod(floor(DOY),10); %true for days that are multiples of 10
data = rand(length(time),1);
plot(time,data);
set(gca,'XTick',time(mydays),'XTickLabel',floor(DOY(mydays)))
In your case, using every multiple of 10 would lead to too many labels on the x-axis, so maybe try multiples of 40 instead.
Finally, if you want to change the axis labels so that it is no longer in bold, you can use:
set(gca, 'FontWeight', 'normal');
or similarly you can alter its size and font:
set(gca, 'FontSize', 14, 'FontName', 'Calibri')
EDIT: corrected typo in previous code, and added note on altering label font
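One likely reason the labels look bold with the hourly data is that every selected day contributes 24 ticks, so the same label is drawn many times almost on top of itself. A sketch that keeps a single tick per day, assuming one year of hourly random data (the names dayStart, doy and keep are mine):

```matlab
time = datenum(2007,1,1):1/24:datenum(2007,12,31,23,0,0); % hourly samples for 2007
data = rand(numel(time), 1);                              % placeholder data

dayStart = unique(floor(time));                 % one value per calendar day
dv = datevec(dayStart);
doy = dayStart - datenum(dv(:,1).', 1, 0);      % day of year (1 Jan = 1)
keep = mod(doy, 40) == 0;                       % days that are multiples of 40

plot(time, data);
set(gca, 'XTick', dayStart(keep), 'XTickLabel', doy(keep));
```

Because dayStart holds each day only once, the tick values are strictly increasing and floor no longer triggers the "monotonically increasing" error.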

For command + interpolation: need some tips

I have a matrix A with three columns: daily dates, prices, and hours - all same size vector - there are multiple prices associated to hours in a day.
sample data below:
A_dates =    A_hours =   A_prices =
[20080902    [9.698      [24.09
 20080902     9.891       24.59
 20080902     10.251      24.60
 20080903     9.584       25.63
 20080903     10.45       24.96
 20080903     12.12       24.78
 20080904     12.95       26.98
 20080904     13.569      26.78
 20080904]    14.589]     25.41]
Keep in mind that I have about two years of daily data with about 10,000 prices per day, covering almost every minute of the day from 9:30 to 16:00. My initial dataset's time was actually in milliseconds, which I then converted to hours. Some hours, like 14.589, are repeated three times with 3 different prices. Hence I did the following:
time=[A_dates,A_hours,A_prices];
[timeinhr,price]=consolidator(time,A_prices,'mean'); where timeinhr is both vector A_dates and A_hours
to take an average price at each time, say 14.589 hours.
Then, for any missing hours at .25, .50, .75 and at integer hours, I wish to interpolate.
For each date the hours repeat, and I need to linearly interpolate the prices that I don't have for some "wanted" hours. But of course I can't use interp1 if the hours repeat in my column because I have multiple days. So say:
%# here I want hours in 0.25unit increments (like 9.5hrs)
new_timeinhr = 0:0.25:max(A_hours);
day_hour = rem(new_timeinhr, 24);
%# Here I want only prices between 9.5hours and 16hours
new_timeinhr( day_hour <= 9.2 | day_hour >= 16.1 ) = [];
I then create a vector of unique days and want to use a for loop with an if statement to interpolate day by day and then stack my new prices in a vector one after the other:
days = unique(A_dates);
for j = 1:length(days);
if A_dates == days(j)
int_prices(j) = interp1(A_hours, A_prices, new_timeinhr);
end;
end;
My error is:
In an assignment A(I) = B, the number of elements in B and I must be the same.
How can I write the int_prices(j) to the stack?
I recommend converting your input to a single monotonic time value. Use the MATLAB datenum format, which represents one day as 1. There are plenty of advantages to this: You get the builtin MATLAB time/date functions, you get plot labels formatted nicely as date/time via datetick, and interpolation just works. Without test data, I can't test this code, but here's the general idea.
Based on your new information that dates are stored as 20080902 (I assume yyyymmdd), I've updated the initial conversion code. Also, since the layout of A is causing confusion, I'm going to refer to the columns of A as the vectors A_prices, A_hours, and A_dates.
% This datenum vector matches A. I'm assuming they're already sorted by date and time
At = datenum(num2str(A_dates), 'yyyymmdd') + datenum(0, 0, 0, A_hours, 0, 0);
incr = datenum(0, 0, 0, 0.25, 0, 0); % 0.25 hour
t = (At(1):incr:At(end)).'; % Full timespan of dataset, in 0.25 hour increments
frac_hours = 24*(t - floor(t)); % Fractional hours into the day
t_business_day = t((frac_hours > 9.4) & (frac_hours < 16.1)); % Time vector only where you want it
P = interp1(At, A_prices, t_business_day);
I repeat, since there's no test data, I can't test the code. I highly recommend testing the date conversion code by using datestr to convert back from the datenum to readable dates.
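To make that check concrete, here is a small sketch with made-up sample values (four rows, not the asker's real data). It converts the date and hour columns to datenum, converts back with datestr, and interpolates a single price.

```matlab
% Made-up sample data in the same layout as the question
A_dates = [20080902; 20080902; 20080903; 20080903];
A_hours = [9.5; 10.5; 9.5; 10.0];
A_prices = [24.0; 26.0; 25.0; 27.0];

% Date part plus fractional-day part (one hour is 1/24 of a day)
At = datenum(num2str(A_dates), 'yyyymmdd') + A_hours/24;

% Sanity check: convert the first timestamp back to readable text
firstStamp = datestr(At(1), 'yyyy-mm-dd HH:MM');

% Interpolate the price at 10:00 on 2 Sep 2008, halfway between 9:30 and 10:30
tq = datenum(2008, 9, 2) + 10/24;
p = interp1(At, A_prices, tq);
```

Here firstStamp should read 2008-09-02 09:30, and p lies halfway between the two neighbouring prices.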
Converting days/hours to serial date numbers, as suggested by @Peter, is definitely the way to go. Based on his code (which I already upvoted), I present below a simple example.
First I start by creating some fake data resembling what you described (with some missing parts as well):
%# three days in increments of 1 hour
dt = datenum(num2str((0:23)','2012-06-01 %02d:00'), 'yyyy-mm-dd HH:MM'); %#'
dt = [dt; dt+1; dt+2];
%# price data corresponding to each hour
p = cumsum(rand(size(dt))-0.5);
%# show plot
plot(dt, p, '.-'), datetick('x')
grid on, xlabel('Date/Time'), ylabel('Prices')
%# lets remove some rows as missing
idx = ( rand(size(dt)) < 0.1 );
hold on, plot(dt(idx), p(idx), 'ro'), hold off
legend({'prices','missing'})
dt(idx) = [];
p(idx) = [];
%# matrix same as yours: days,prices,hours
ymd = str2double( cellstr(datestr(dt,'yyyymmdd')) );
hr = str2double( cellstr(datestr(dt,'HH')) );
A = [ymd p hr];
%# let's clear all variables except the data matrix A
clearvars -except A
Next we interpolate the price data across the entire range in 15-minute increments:
%# convert days/hours to serial date number
dt = datenum(num2str(A(:,[1 3]),'%d %d'), 'yyyymmdd HH');
%# create a vector of 15 min increments
t_15min = (0:0.25:(24-0.25))'; %#'
tt = datenum(0,0,0, t_15min,0,0);
%# offset serial date across all days
ymd = datenum(num2str(unique(A(:,1))), 'yyyymmdd');
tt = bsxfun(@plus, ymd', tt); %#'
tt = tt(:);
%# interpolate data at new datetimes
pp = interp1(dt, A(:,2), tt);
%# extract desired period of time from each day
idx = (9.5 <= t_15min & t_15min <= 16);
idx2 = bsxfun(@plus, find(idx), (0:numel(ymd)-1)*numel(t_15min));
P = pp(idx2(:));
%# plot interpolated data, and show extracted periods
figure, plot(tt, pp, '.-'), datetick('x'), hold on
plot([tt(idx2);nan(1,numel(ymd))], [pp(idx2);nan(1,numel(ymd))], 'r.-')
hold off, grid on, xlabel('Date/Time'), ylabel('Prices')
legend({'interpolated prices','period of 9:30 - 16:00'})
and here are the two plots showing the original and interpolated data:
I think I might have solved it this way:
new_timeinhr = 0:0.25:max(A(:,2));
day_hour = rem(new_timeinhr, 24);
new_timeinhr( day_hour <= 9.4 | day_hour >= 16.1 ) = [];
days=unique(A(:,1));
P=[];
for j=1:length(days);
condition=A(:,1)==days(j);
intprices = interp1(A(condition,2), A(condition,3), new_timeinhr);
P=vertcat(P,intprices');
end;