Candle Stick graph with Timetable data showing weekend gaps - matlab

First of all, I think the answers I've found are outdated:
Excluding weekend gaps from financial timeseries plots
Exclude Date Gaps in Time Series Plot in Matlab
Datetick take into account NaN in plot
My problem:
I've created a candlestick graph based on a Timetable table with these dates (the format is dd/mm/yyyy):
'25/01/2019'
'24/01/2019'
'23/01/2019'
'22/01/2019'
'21/01/2019'
'18/01/2019'
'17/01/2019'
'16/01/2019'
'15/01/2019'
'14/01/2019'
'11/01/2019'
'10/01/2019'
'09/01/2019'
'08/01/2019'
'07/01/2019'
'04/01/2019'
'03/01/2019'
'02/01/2019'
'28/12/2018'
'27/12/2018'
'26/12/2018'
'21/12/2018'
'20/12/2018'
'19/12/2018'
'18/12/2018'
And this code:
candle(this.values);
This gives me this plot:
As you can see, there are gaps corresponding to the non-business days.
Given the answers that I've found to the same problem what I did was:
Created two arrays, one with the dates and the other with the date strings:
this.aux = table2timetable(ticker(1:5:25,:));
% sort the rows because they were generated in reverse order
this.aux = timetable2table(sortrows(this.aux(:,1)));
this.dates = this.aux(:,1);
this.lbl = datestr(this.aux{:,1},'dd/mm/yyyy');
Obtain the gca object to set the X-axis properties:
this.ax = gca;
this.ax.XTick = this.dates{:,1};
this.ax.XTickMode = 'manual';
this.ax.XTickLabel = this.lbl;
And the result is this:
So the properties are being set correctly but the gaps remain.
Finally, I tried setting the timetable's VariableContinuity property and calling retime to generate the missing dates as NaN entries, to see if that helped, but the result was the same:
this.values.Properties.VariableContinuity = {'event','event','event','event','event','event','event','event'};
this.values = retime(this.values,'daily');
What else could I do to hide the gaps?

I believe that once the data is plotted, you cannot remove the gaps; you have to remove them before plotting. In your timetable, replace the dates with a linear date array (consecutive days, no gaps) and plot that. The gaps will be gone, but the tick labels will show the wrong dates. To display the correct dates, set the tick labels manually with the following code (similar to yours):
this.ax.XTickMode = 'manual';
this.ax.XTickLabel = YOUR_CORRECT_DATE_CELL_ARRAY
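The idea can be sketched with an ordinary line plot. This is only an illustration under stated assumptions: the dates are a subset of those in the question, closePrice holds made-up values, and plot stands in for candle (which needs the Financial Toolbox and a gap-free time axis as described above).

```matlab
% A subset of the business dates from the question, oldest first
dates = datetime({'18/12/2018';'19/12/2018';'20/12/2018';'21/12/2018'; ...
                  '26/12/2018';'27/12/2018';'28/12/2018';'02/01/2019'}, ...
                 'InputFormat','dd/MM/yyyy');
closePrice = [10; 11; 10.5; 12; 11.5; 12.5; 13; 12.8];   % made-up values

idx = (1:numel(dates))';        % evenly spaced positions, so no calendar gaps
plot(idx, closePrice, '-o')

ax = gca;
tickPos = idx(1:2:end);         % label every second point to avoid clutter
ax.XTick = tickPos;             % tick positions must match the labels below
ax.XTickLabel = cellstr(datestr(dates(tickPos), 'dd/mm/yyyy'));
ax.XTickLabelRotation = 45;
```

Setting XTick together with XTickLabel matters: if only the labels are set, they are attached to whatever ticks the axes picked automatically and no longer line up with the data.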

Related

Matlab how to remove the weekend from the chart

I want to plot a line chart with datetime on the x-axis, but the chart shows a gap at every weekend. How can I remove the weekend gaps?
figure
plot(data.DateTime(1:833),data.diff(1:833),'b');
hold on
plot(data.DateTime(834:970),data.diff(834:970),'r');
hold on
plot(data.DateTime(971:1546),data.diff(971:1546),'b');
hold off
You can use weekday to filter your dates so that they only include days of the week that aren't 1 or 7 (Sunday or Saturday in MATLAB's numbering):
days = datetime(2021,5,0:31);
days_weekdays_only = days(and(weekday(days)~=1,weekday(days)~=7));
This will filter the days data to only contain dates that are Mon-Fri.
Then plot this filtered data.
Edit: You can plot the values against an index of the filtered data and then change the x-axis labels to match the datetime string. This way it will skip the weekends but the x-axis will still show the date time.
days = datetime(2021,5,0:31);
isWeekday = and(weekday(days)~=1,weekday(days)~=7); % 1 = Sunday, 7 = Saturday
weekdays = days(isWeekday);
data = randi(5,length(days),1);
data_weekdays = data(isWeekday);
idx = 1:length(weekdays);
plot(idx,data_weekdays)
% put one tick on every plotted point so each label lines up with its date
set(gca,'XTick',idx,'XTickLabel',datestr(weekdays),'XTickLabelRotation',90);
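As a shorter way to build the same mask, datetime arrays also support isweekend. A minimal sketch, assuming Saturday and Sunday are the only days to drop (holidays would need a separate list); the variable names are illustrative:

```matlab
d = datetime(2021,5,1:31);                    % every calendar day in May 2021
maskWeekday = ~isweekend(d);                  % true for Monday to Friday
maskManual  = weekday(d)~=1 & weekday(d)~=7;  % same test written with weekday
businessDays = d(maskWeekday);                % dates to keep for plotting
```

Both masks select the same days, so either can be used to index the dates and the data before plotting against an index.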

How to use the index of selected data from one figure, to plot something in another figure?

I have a scatter plot on one figure. I'd like to be able to select possibly multiple data points on the mentioned scatter plot, and plot a (possibly) multi-line timeseries chart on the other figure, based on the indexes of the selected data.
Pseudo code:
data = { x: [1,2,3], y: [1,2,3], time_series: [[1,2,3],[4,5,6],[7,8,9]] }
figure1 = scatter_plot(x, y, select_enabled=True)
figure2 = multi_line_timeseries(figure1.indexes_of_selected_points)
show([figure1, figure2])
So if the [1,1] data point (index 0) is selected on figure 1, then the [1,2,3] timeseries (index 0) is plotted on figure 2. If multiple points are selected, then multiple timeseries are plotted.
One constraint is that the HoloViews library can't be used, because it doesn't support my platform.
How can this be achieved?
Note: I have opted to not support simultaneous multiple timeseries plotting, though that would be a trivial extension of this.
To use a selected data point's index to determine what is plotted in another figure, you need to:
put the relevant data (i.e. x,y,timeseries in the example) on one or multiple ColumnDataSources;
I put the data to select and data that will be updated on different cds's, because I fear it might create a callback loop, though I've not tested this.
create a ColumnDataSource which will act as source for the second figure that plots the timeseries;
enable a selection tool, for example TapTool ('tap');
add a CustomJS callback to the ColumnDataSource that holds the selectable data points;
parametrize that callback with the ColumnDataSource that holds the timeseries data;
have the callback access the indices of the selected data points;
have the callback make required changes to the second figure's ColumnDataSource;
call cds_of_2nd_figure.change.emit() before returning from the callback.
Code to illustrate:
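from bokeh.models import ColumnDataSource, CustomJS
# x, y and timeseries are assumed to exist; each timeseries entry is a dict with 'x' and 'y'
cds = ColumnDataSource(data=dict(x=x, y=y, timeseries=timeseries))
cds2 = ColumnDataSource(data=dict(x_to_plot=[], y_to_plot=[]))
def selection_callback(d=cds, d2=cds2):
    # cb_obj is supplied by Bokeh when the callback runs in the browser
    last_selected_ix = cb_obj.selected.indices[0]
    timeserie = d.data['timeseries'][last_selected_ix]
    x_to_plot = timeserie['x']
    y_to_plot = timeserie['y']
    d2.data['x_to_plot'] = x_to_plot
    d2.data['y_to_plot'] = y_to_plot
    d2.change.emit()
# turn the function above into JS (from_py_func only exists in Bokeh releases before 2.0)
selection_callback = CustomJS.from_py_func(selection_callback)
cds.callback = selection_callback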
When some figure selects data from cds, the timeseries[ix] timeserie will be plotted on the figure/s that plot cds2, where ix is the index of the last selected data point from cds.
Relevant resource that has all the relevant information:
https://docs.bokeh.org/en/latest/docs/user_guide/interaction/callbacks.html#customjs-for-tools

Using matlab to arrange and sort data

My data is an excel file with two columns in this format:
Date Type
3/12/06 A
3/12/06 B
3/12/06 B
3/12/06 C
6/01/07 A
6/01/07 A
8/01/07 B
...
Column A are dates and can be repeated while column B are types of observations on these dates.
In MATLAB I want to plot each type as a function of time, however first I need to arrange my data. There are often multiple identical rows that correspond to multiple observations of the same type on the same date. So I think first I need to count how many times a certain type occurred on the same day?
Any help would be great! I'm still at the stage of trying to read the dates in the correct format...
Here is a solution: I replace each type and each date with an index, then use accumarray to build a 2D pivot table of counts. You could also use Excel's pivot table feature directly.
% We load the xls file.
[~,txt] = xlsread('test.xls');
% We delete the header:
txt(1,:) = [];
% Value and index for the date:
[val_d,~,ind_d] = unique(txt(:,1));
% Value and index for the type:
[val_c,~,ind_c] = unique(txt(:,2));
% We use accumarray to create a pivot table that counts each occurrence.
acc = accumarray([ind_d,ind_c],1)
% Then we simply plot the result:
dateFormat = 'dd/mm/yy';
for i = 1:length(val_c);
subplot(1,length(val_c),i)
bar(datenum(val_d,dateFormat),acc(:,i),1) % easier to deal with datenum
datetick('x',dateFormat)
xlabel('Date')
ylabel([val_c{i},' count'])
ylim([0,3])
end
RESULT: (figure not reproduced: one bar chart per type, showing the count on each date)
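To see what the accumarray step produces, here is a small sketch that uses the seven sample rows from the question in place of the xlsread output. The literal cell arrays are stand-ins for txt(:,1) and txt(:,2):

```matlab
dates = {'3/12/06';'3/12/06';'3/12/06';'3/12/06';'6/01/07';'6/01/07';'8/01/07'};
types = {'A';'B';'B';'C';'A';'A';'B'};
[val_d,~,ind_d] = unique(dates);     % distinct dates plus a row index per observation
[val_c,~,ind_c] = unique(types);     % distinct types plus a column index per observation
acc = accumarray([ind_d,ind_c],1);   % acc(i,j) = number of rows with date i and type j
```

For this sample, acc is [1 2 1; 2 0 0; 0 1 0]: rows follow val_d and columns follow val_c (A, B, C). Note that unique sorts the date strings as text, which happens to be chronological here but will not be in general; converting the dates to numbers first avoids that.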

Add the year in the x-axis in Matlab

I have two-column data with mmyyyy and SPI (Standardized Precipitation Index) variables. The first two samples have no data (NAN). The file is:
011982 NAN
021982 NAN
031982 -1.348
.
.
.
122013 1.098
I load the time and SPI data into MATLAB and then try to plot it, but it is not working.
I would like a line graph, but I have no idea how to put time on the x-axis, and I would like the x-axis to show only the year.
Using the new datetime data type in MATLAB (added in R2014b), this should be easy.
Here is an example. First we load the data into a MATLAB table:
% import data from file
fid = fopen('file.dat', 'rt');
C = textscan(fid, '%{MMyyyy}D %f');
fclose(fid);
% create table
t = table(C{:}, 'VariableNames',{'date','SPI'});
You get something like this:
>> t(1:10,:)
ans =
date SPI
______ ________
011982 NaN
021982 NaN
031982 2.022
041982 1.5689
051982 0.75813
061982 -0.74338
071982 -1.7323
081982 -2.4466
091982 -0.86604
101982 0.085698
Next, plotting the data against the dates is as easy as calling plot:
plot(t.date, t.SPI)
xlabel('Date'), ylabel('Standardized Precipitation Index')
By default, plot chooses tick mark locations based on the range of data. When you zoom in and out of a plot, the tick labels automatically adjust to the new axis limits.
But if you want, you can also specify a custom format for the datetime tick labels. Note that when you do this, the plot always formats the tick labels according to the specified value, so they won't adjust on zoom:
plot(t.date, t.SPI, 'DatetimeTickFormat','yyyy')
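If it helps to check the parsing before plotting, here is a small sketch that converts a few mmyyyy strings like those in the question into datetime values and reads back the year and month. The sample strings and variable names are only for illustration:

```matlab
s = {'011982'; '021982'; '122013'};         % mmyyyy strings as in the data file
d = datetime(s, 'InputFormat', 'MMyyyy');   % datetime formats use MM for month
yrs = year(d);                              % numeric year of each sample
mos = month(d);                             % numeric month of each sample
```

Keep in mind that datetime formats write the month as MM, whereas the older datenum and datestr functions write it as mm.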
I'm adding another answer that works in older MATLAB versions without table or datetime data types.
Like before, we first import the data from file, but this time we read the dates as strings then convert them to serial date numbers using datenum function (defined as the number of days since "January 0, 0000"):
% import data from file
fid = fopen('file.dat', 'rt');
C = textscan(fid, '%s %f');
fclose(fid);
% create matrix
t = [datenum(C{1},'mmyyyy') C{2}];
The data looks like this:
>> format long
>> t(1:10,:)
ans =
1.0e+05 *
7.239120000000000 NaN
7.239430000000000 NaN
7.239710000000000 0.000005606888474
7.240020000000000 0.000009156147863
7.240320000000000 0.000004504804864
7.240630000000000 0.000008359005819
7.240930000000000 0.000007436313932
7.241240000000000 0.000002800134237
7.241550000000000 0.000005261613664
7.241850000000000 0.000001809901372
Next we plot the data like before, but instead we use the datetick function to format the x-axis as dates ('yyyy' for years):
plot(t(:,1), t(:,2))
datetick('x', 'yyyy')
xlabel('Date'), ylabel('Standardized Precipitation Index')
Unfortunately the tick labels will not automatically update when you zoom in and out... The good news, there are solutions on the File Exchange that solve this issue, for example datetickzoom and datetick2.

Matlab - Access index of max value in for loop and use it to remove values from array

I would like to find the maximum value in column 8 of each matrix in a series of matrices, then use the index of that maximum to set all rows up to and including it to NaN in columns 14:16. Finding the max value and its index is straightforward, but I am stumped on how to do it for multiple arrays with a for loop.
Here is how I can do it without a for loop:
[C,Max] = max(wy2000(:,8));
wy2000(1:Max,14:16) = NaN;
[C,Max] = max(wy2001(:,8));
wy2001(1:Max,14:16) = NaN;
[C,Max] = max(wy2002(:,8));
wy2002(1:Max,14:16) = NaN;
and so on and so forth...
Here are two ways I have tried using a for loop:
startyear = 2000;
endyear = 2009;
for n=startyear:endyear
currentYear = sprintf('wy%d',n);
[C,Max] = max(currentYear(:,8));
currentYear(1:Max,14:16) = NaN;
end
Here is another way I tried, using the eval function:
for n=2000:2009;
currentYear = ['wy' int2str(n)];
var2 = ['maxswe' int2str(n)];
eval([var2 ' = max(currentYear(:,8))']);
end
In both cases, the problem seems to be that MATLAB doesn't recognize the 'currentYear' variable to be the array that corresponds to the wyXXXX that I already have created in my workspace.
Based on Peter's answer, here is some more info about my data. I am starting with a matrix of data called all_data which holds 16 columns of data, spanning the time period 1982 - 2012. I am only interested in the period 2000 - 2009, and I am also interested in analyzing each year individually (2000, 2001,...,2009).
To get the data into individual years, I use the following code:
for n=2000:2009;
s = datenum(n-1,10,1);
e = datenum(n,9,30);
startcell = find(TIME(:,7)==s);
endcell = find(TIME(:,7)==e);
var1 = ['wy' int2str(n)];
eval([var1 '= all_data3(startcell:endcell,:)']);
eval(['save ', var1]);
end
For clarification, it is the period 10/1/YEAR1 to 9/30/YEAR2 that I am interested in, and TIME is a matrix holding the dates and times of my data.
So at the end of the above for-loop, I have a new matrix for each water-year (wy). I then want to find the date of maximum snow accumulation (column 8) and exclude all data prior to that date from my analysis. This is where the original question comes from.
Peter's solution works, but I was hoping for a simpler way to find the max date and set the values prior to that date to NaN, without having to declare a bunch of variables (or entries in a cell array).
If I could write a loop that creates the cell array Peter suggested from a start and end year, the code would be transferable to other datasets. But when I try to do this, I run into the issue that the cell-array index is 1:length(years) while the wy arrays are named after the actual year, so there is an inconsistency when using the eval function.
Matt
You've discovered the problem with eval and dynamically named variables. They're messy. I'd recommend recoding this as a cell array, with the cell array index being the index for the year:
years = 2000:2009;
wy{1} = wy2000;
wy{2} = wy2001;
% etc...
% Then,
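for n=1:length(years)
    [~, maxidx] = max(wy{n}(:,8)); % row index of the maximum in column 8
    wy{n}(1:maxidx,14:16) = NaN; % blank columns 14:16 up to and including that row
end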
You really only need the actual year when you input the data and when you display it. Now, if you're starting from a huge pile of arrays already named this way, that's the time to use eval: to convert them into this form that's easier to use. Just form the eval strings so they read, for example, 'wy{1} = wy2000;'
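A sketch of that conversion, in case it helps: the loop index k addresses the cell array while years(k) builds the variable name, which sidesteps the index/year mismatch mentioned in the question. wy2000 and wy2001 below are tiny made-up stand-ins for the real workspace variables.

```matlab
% Made-up stand-ins for the existing workspace variables (4 rows, 16 columns)
wy2000 = ones(4,16);  wy2000(:,8) = [1; 5; 2; 0];
wy2001 = ones(4,16);  wy2001(:,8) = [3; 1; 4; 9];

years = 2000:2001;
wy = cell(1, numel(years));
for k = 1:numel(years)
    % One-time use of eval: copy wyYYYY into slot k of the cell array
    wy{k} = eval(sprintf('wy%d', years(k)));
    [~, maxidx] = max(wy{k}(:,8));      % row of the maximum in column 8
    wy{k}(1:maxidx, 14:16) = NaN;       % blank columns 14:16 up to that row
end
```

After the loop, wy{k} holds the data for years(k), so the year is only needed when building the name and when labelling results.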