Mean difference between MATLAB and Excel - matlab

I wrote a function that calculates the mean price of the last 15 mins of the trading day. In the attached excel file I get a mean of 55.23 but my function in MATLAB returns 55.32. I've been messing with this all day, and cannot figure out the answer for the difference. Can anyone tell me why the means are different in excel and MATLAB? Thank you.
function last15MinsOfDay=last15MinsOfDay(time,price)
% last15MinsOfDay takes the average of prices between 3:45 and 4:00.
timeStr=cellstr(datestr(time));
timeDbl=datevec(timeStr);
times=and(timeDbl(:,4)==14,timeDbl(:,5)>=46)+and(timeDbl(:,4)==15,timeDbl(:,5)==0);
priceIdx=find(times);
z=find(fwdshift(1,priceIdx)~=priceIdx+1);
z=[1; z];
mu=zeros(length(z),1);
for i = 1:length(z)-1;
mu(i)=mean(price(priceIdx(z(i):z(i+1))));
end
last15MinsOfDay=mu;
Excel file

The correct mean is 55.236 in your data. I'm not sure exactly what you're intending in the lines after the priceIdx line, but in MATLAB simply:
mean(price(priceIdx))
where priceIdx is a vector of the indices of the items you want to include should be what you need.

Related

How can I increase the speed of this xlsread for loop?

I have made a script which contains a for loop selecting columns from 533 different excel files and places them into matrices so that they can be compared, however the process is taking too long (it ran for 3 hours yesterday and wasn't even halfway through!!).
I know xlsread is naturally slow, but does anyone know how I can make my script run faster? The script is below, thanks!!
%Split the data into g's and h's
CRNum = 533; %Number of Carrington Rotation files
A(:,1) = xlsread('CR1643.xlsx','A:A'); % Set harmonic coefficient columns
A(:,2) = xlsread('CR1643.xlsx','B:B');
B(:,1) = xlsread('CR1643.xlsx','A:A');
B(:,2) = xlsread('CR1643.xlsx','B:B');
for k = 1:CRNum
textFileName = ['CR' num2str(k+1642) '.xlsx'];
A(:,k+2) = xlsread(textFileName,'C:C'); %for g
B(:,k+2) = xlsread(textFileName,'D:D'); %for h
end
Don't use xlsread if you want to go through a loop. because it opens excel and then closes excel server each time you call it, which is time consuming. instead before the loop use actxserver to open excel, do what you want and finally close actxserver after your loop. For a good example of using actxserver, search for "Read Spreadsheet Data Using Excel as Automation Server" in MATLAB help.
And also take a look at readtable which works faster than xlsread, but generates a table instead.
The most obvious improvement seems to be to load the files only partially if possible. However, if that is not an option, try whether it helps to only open each file once (read everything you need, and then assign it).
M(:,k+2) = xlsread(textFileName,'C:D');
Also check how much you are reading in each time, if you read in many rows in the first file, you may make the first dimension of A big, and then you will fill it each time you read a file?
As an extra: a small but simple improvment can be found at the start. Don't use 4 load statements, but use 1 and then assign variables based on the result.
As mentioned in this post, the easiest thing to change would be to set 'Basic' to true. This disables things like formulas and macros in Excel and allows you to read a simple table more quickly. For example, you can use:
xlsread('CR1643.xlsx','A:A', 'Basic', true)
This resulted in a decrease in load time from about 22 seconds to about 1 second for me when I tested it on a 11,000 by 7 Excel sheet.

Converting from datenum back to actual time

Ok, so I've been messing with what should be a simple problem. I'm trying to plot an numerical array (double) vs. the associated time stamp (cell) with the format of the DD-MM-YY HH:MM:SS (for example: 13-Mar-15 07:23:10).
I'm able to plot a single set using datetime(time stamp). Due to the data set it outputs the nice HH:MM on the x-axis. Very nice.
Now in order to plot 2 sets of values on the same graph, I've found that Matlab doesn't like to use the date_time twice for the x-axis, so then I go to the infamous datenum function, which is able to plot both on the same graph. However, it's in the serial value of time and it jacks with my plot sizing (i.e. the x-axis doesn't autosize).
With what should be a simple problem has actually caused me days scouring the internet trying to reconvert it back to my beloved HH:MM after converting the "time stamp" into the serial time.
I don't think that a code sample or data set should be necessary for example purposes. (but can provide if needed)
I've tried to use the datetick function, but can't get seem to get it working.
The trick with datenum and datetick is to set limits and tick positions before you call datetick, then make sure it doesn't redo them.
So, after plotting your two sets of data against the datenums it would go something like:
step = 1/24; % for hourly - adjust to preference
ticks = datenum('mystarttime'):step:datenum('myendtime')
set(gca,'XTick',ticks)
datetick('x','HH:MM','keepticks','keeplimits')
I work a lot with time series and most often I have to plot my data versus time/date.
Since Matlab never gave me a definitively convenient solution, for many years now I work this way:
I defined once and for all a Matlab shortcut (in the matlab shortcut toolbar):
containing the following code:
xticktemp = get(gca,'Xtick') ;
ticklabel = {datestr(xticktemp(1)) datestr(xticktemp(2:end),15) } ;
set(gca,'Xticklabel',ticklabel)
clear xticktemp ticklabel
I convert all my times with datenum and use this format for working (or posix time when more convenient, but it's another story), calculating and plotting. When I need to now exactly the time of an event in a human readable format, I press the shortcut and I obtain something like:
Of course there are 2 major limitations with this method:
This does not control the steps of the ticks (it just replicate what Matlab initially set)
You have to re-click your shortcut to refresh the tick labels everytime you zoom or pan the figure
For these reasons, when I need to finalize figure for presentation to others I won't use this trick and define my ticks exactly like I want them, but when you are simply working away, this shortcut has saved me hours (may be even days?) of fiddling around over the last 10 years ....

using from workspace block in simulink

how to use the "from workspace block in simulink" ?
I have tried using the from workspace block by given 10*2 matrix as input. it is appending some extra data along the data I have given .
and I have such 3 such blocks and want to know how I merge them.
Read the documentation. Simulink is time-based so the data in your From Workspace block must be as a function of time. Does your 10 x 2 matrix represent a signal as a function of time? If so, it needs to be as follows:
A two-dimensional matrix:
The first element of each matrix row is a
time stamp.
The rest of each row is a scalar or vector of signal
values.
The leftmost element of each row is the time stamp of the
value(s) in the rest of the row.
10 values isn't very much, it's likely that Simulink will need additional data points at intermediate times, if you have the Interpolate Data check box ticked. If not, "the current output equals the output at the most recent time for which data exists".
I think you may have a misunderstanding of the variables intended to be read by the FromWorkspace block.
The block expects a time series defining the value at various points in the simulation.
The From Workspace block help should point you in the right direction on this. Mathworks Help Documentation
I believe that something like the following would work for you:
>> WorkspaceVar.time=0;
>> WorkspaceVar.signals.values=zeros(10,2)
>> WorkspaceVar.signals.dimensions = [10,2]

where do i put a formula in matlab

I have no matlab or mathematical experience but i would like to do the following :
convert an excel file to a tab delimited file and open this in matlab organized in the following way:
every row is a new subject
first colum is name of subject
other 8 columns are the parameters for each subject
I would like to run a growthfunction on each subject and obtain the following results
the maximum velocity and corresponding growth and time
the minimum velocity before the maximum velocity is reached and corresponding time and growth
the maximum growth (the function nears an asymptot)
-This is the code that I would use
tmin=0;
tmax=20;
dt=1
t=tmin:dt:tmax;
y = m1.*(1-1./(1+(m2.*(t+m8)).^m5+(m3.*(t+m8)).^m6+(m4.*(t+m8)).^m7));
dy=diff(y)./dt;
max(dy);
min(dy);
imax=find(dy==max(dy))+1;
imin=find(dy==min(dy))+1;
t(imax);
t(imin);
y(imax);
y(imin);
y(20);
Where do i put this code so that it knows that m1 to m8 are corresponding to the different columns in my file? how do i link these?
How can i make sure that the output of each subject appears in a column in my tab delimited file (like excel)
In brief what I would like to do:
have a file with on every row a new subject and column 2-9 are the values of the parameters m1 to m8. Run the formula so that in column 9 i will have the maximum velocity, in 10 the minimum velocity and so one...
Can anyone help me out
Thanks
[~,~,rawData]=xlsread('yourExcelSheet.xlsx')
SubjectNames=rawData(:,1) % I think () is better here than { }, may have to switch this up.
Data=cell2mat(rawData(:,2:9)) % convert the last 8 columns (2 to 9) to a matrix of type double
%This ^^ also assumes there are no headers in excel, if there are headers, it would be %rawData(2:end,2:9) instead of what I have above
m1=Data(:,2)
m2=Data(:,3)
% etc. etc.
m8=Data(:,9)
alternatively, if you can get over the messiness, substitute "Data(:,2)" in your equation instead of m1, it will give you a tiny speedup
Disclaimer: I just wrote all this off the top of my head, if it errors, hopefully its just something small, otherwise let me know.
you can just import the data by double clicking the xls file. A dialog should appear. Select the data range that you want to import.
you can then simply state untitled(:,1) = m1 etc.

MATLAB query about for loop, reading in data and plotting

I am a complete novice at using matlab and am trying to work out if there is a way of optimising my code. Essentially I have data from model outputs and I need to plot them using matlab. In addition I have reference data (with 95% confidence intervals) which I plot on the same graph to get a visual idea on how close the model outputs and reference data is.
In terms of the model outputs I have several thousand files (number sequentially) which I open in a loop and plot. The problem/question I have is whether I can preprocess the data and then plot later - to save time. The issue I seem to be having when I try this is that I have a legend which either does not appear or is inaccurate.
My code (apolgies if it not elegant):
fn= xlsread(['tbobserved' '.xls']);
time= fn(:,1);
totalreference=fn(:,4);
totalreferencelowerci=fn(:,6);
totalreferenceupperci=fn(:,7);
figure
plot(time,totalrefrence,'-', time, totalreferencelowerci,'--', time, totalreferenceupperci,'--');
xlabel('Year');
ylabel('Reference incidence per 100,000 population');
title ('Total');
clickableLegend('Observed reference data', 'Totalreferencelowerci', 'Totalreferenceupperci','Location','BestOutside');
xlim([1910 1970]);
hold on
start_sim=10000;
end_sim=10005;
h = zeros (1,1000);
for i=start_sim:end_sim %is there any way of doing this earlier to save time?
a=int2str(i);
incidenceFile =strcat('result_', 'Sim', '_', a, 'I_byCal_total.xls');
est_tot=importdata(incidenceFile, '\t', 1);
cal_tot=est_tot.data;
magnitude=1;
t1=cal_tot(:,1)+1750;
totalmodel=cal_tot(:,3)+cal_tot(:,5);
h(a)=plot(t1,totalmodel);
xlim([1910 1970]);
ylim([0 500]);
hold all
clickableLegend(h(a),a,'Location','BestOutside')
end
Essentially I was hoping to have a way of reading in the data and then plot later - ie. optimise the code.
I hope you might be able to help.
Thanks.
mp
Regarding your issue concerning
I have a legend which either does not
appear or is inaccurate.
have a look at the following extracts from your code.
...
h = zeros (1,1000);
...
a=int2str(i);
...
h(a)=plot(t1,totalmodel);
...
You are using a character array as index. Instead of h(a) you should use h(i). MATLAB seems to cast the character array a to double as shown in the following example with a = 10;.
>> double(int2str(10))
ans = 49 48
Instead of h(10) the plot handle will be assigned to h([49 48]) which is not your intention.