Tabulated values's management in MATLAB - matlab

I have to build a function from tabulated values (two columns) which are written in a text file. The process to make it is the following:
Use the command importdata to read the data file
Xp = importdata('Xp.dat','\t',1);
Store each column in a variable
x = Xp(1:18304,1);
y = Xp(1:18304,2);
Make a curve fitting with both variables
ft = fittype('linearinterp');
datos.f_Xp = fit(x,y,ft);
However, when I am profiling the code I have found out that my bottleneck are the built-in functions fittype.fittype, fittype.evaluate, cfit.feval, ppval and cfit.subsref
which are related to the curve fitting. So I ask myself how I should manage the tabulated values for improving my code.

you're trying to fit 18304 data points to a curve. Also, you're using linearinterp... which means a routine is being run in a piecewise fashion. if you want to make the code faster use less datapoints.
Or perhaps try:
ft = fittype('poly1');
Not sure is it will be the answer you need as I don't have access to the data

May be "Eval" function could work in your case,
some simple example :
A = '1+4'; eval(A)
ans =
5
P = 'pwd'; eval(P)
ans =
/home/myname
and a bit more advanced!
for n = 1:12
eval(['M',int2str(n),' = magic(n)'])
end
Also, it has a sister name "feval"
guess, what does it do !
[V,D] = feval('eig',A)
[V,D] = eig(A)
and here
function plotf(fun,x)
y = feval(fun,x);
plot(x,y)
You are right ! all are equivalent,
check out here and find more relevant function

Related

MATLAB how to make nlinfit() not quit out of loop and how to test distribution of results

Given some data (from the function 1/x), I solved for powers of a model function made of the sum of exponentials. I did this for several different samples of data and I want to plot the powers that I find and test to see what type of distribution they have.
My main problems are:
My loops are quitting when the nlinfit() function says the function gives -Inf values. I've tried playing around with initial conditions but can't see how to fix it. I know that I am working with the function 1/x, so there may not be a way around this problem. Is there at least a way to force the loop to keep going after getting this error? It only happens for some iterations. If I manually regenerate my sample data and run it again it works. So if I could force the loop to keep going it would be ideal. I have tried switching the for loop to a while loop and putting a try catch statement inside but this just seems to run endlessly.
I have code set to test how if the distribution of the resulting power is close to the exponential distribution. Is this the correct way to do this? When I add in the Monte Carlo tolerance ('MCTol') it has a drastic effect on p in some cases (going from 0.5 to 0.9). Do I have it set up correctly? Is there a way to also see empirically how far the curve is from the exponential distribution?
Are there better ways to fit this curve or do any of the above? I'm just learning so any suggestions are appreciated.
Here is my current code :
%Function that I use to fit curve and get error for each iteration
function[beta,error] = Fit_and_Calc_Error(T,X,modelfun, beta0)
opts = statset('nlinfit');
opts.RobustWgtFun = 'bisquare';
beta = nlinfit(T,X,modelfun,beta0,opts);
error = immse(X,modelfun(beta,T));
end
%Starting actual loop stuff
tracking_a = zeros();
tracking_error = zeros();
modelfun = #(a,x)(exp(-a(1)*x)+exp(-a(2)*x));
for i = 1:60
%make new random data
x = 20*rand(300,1)+1; %between 1 and 21 (Avoiding x=0)
X = sort(x);
Y = 1./(X);
%fit random data
beta0 = [max(X(:,1))/10;max(X(:,1))/10];
[beta,error]= Fit_and_Calc_Error(X,Y,modelfun, beta0);
%storing data found
tracking_a(i,1:length(beta)) = beta;
tracking_error(i) = error;
end
%Testing if first power found has exponential dist
pdN = fitdist(tracking_a(:,1),'Exponential');
histogram(tracking_a(:,1),10)
hold on
x_values2 = 0:0.001:5;
y2 = pdf(pdN,x_values2);
plot(x_values2,y2)
hold off
[h,p] = lillietest(tracking_a(:,1),'Distribution','exponential','MCTol',1e-4)
%Testing if second power found has exponential dist
pdN2 = fitdist(tracking_a(:,2),'Exponential');
histogram(tracking_a(:,2),10)
hold on
y3 = pdf(pdN2,x_values2);
plot(x_values2,y3)
hold off
[h2,p2] = lillietest(tracking_a(:,2),'Distribution','exponential','MCTol',1e-4)

Speeding up matlab for loop

I have a system of 5 ODEs with nonlinear terms involved. I am trying to vary 3 parameters over some ranges to see what parameters would produce the necessary behaviour that I am looking for.
The issue is I have written the code with 3 for loops and it takes a very long time to get the output.
I am also storing the parameter values within the loops when it meets a parameter set that satisfies an ODE event.
This is how I have implemented it in matlab.
function [m,cVal,x,y]=parameters()
b=5000;
q=0;
r=10^4;
s=0;
n=10^-8;
time=3000;
m=[];
cVal=[];
x=[];
y=[];
val1=0.1:0.01:5;
val2=0.1:0.2:8;
val3=10^-13:10^-14:10^-11;
for i=1:length(val1)
for j=1:length(val2)
for k=1:length(val3)
options = odeset('AbsTol',1e-15,'RelTol',1e-13,'Events',#eventfunction);
[t,y,te,ye]=ode45(#(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)]),0:time,[b,q,s,r,n],options);
if length(te)==1
m=[m;val1(i)];
cVal=[cVal;val2(j)];
x=[x;val3(k)];
y=[y;ye(1)];
end
end
end
end
Is there any other way that I can use to speed up this process?
Profile viewer results
I have written the system of ODEs simply with the a format like
function s=systemFunc(t,y,p)
s= zeros(2,1);
s(1)=f*y(1)*(1-(y(1)/k))-p(1)*y(2)*y(1)/(p(2)*y(2)+y(1));
s(2)=p(3)*y(1)-d*y(2);
end
f,d,k are constant parameters.
The equations are more complicated than what's here as its a system of 5 ODEs with lots of non linear terms interacting with each other.
Tommaso is right. Preallocating will save some time.
But I would guess that there is fundamentally not a lot you can do since you are running ode45 in a loop. ode45 itself may be the bottleneck.
I would suggest you profile your code to see where the bottleneck is:
profile on
parameters(... )
profile viewer
I would guess that ode45 is the problem. Probably you will find that you should actually focus your time on optimizing the systemFunc code for performance. But you won't know that until you run the profiler.
EDIT
Based on the profiler output and additional code, I see some things that will help
It seems like the vectorization of your values is hurting you. Instead of
#(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)])
try
#(t,y)systemFunc(t,y,val1(i),val2(j),val3(k))
where your system function is defined as
function s=systemFunc(t,y,p1,p2,p3)
s= zeros(2,1);
s(1)=f*y(1)*(1-(y(1)/k))-p1*y(2)*y(1)/(p2*y(2)+y(1));
s(2)=p3*y(1)-d*y(2);
end
Next, note that you don't have to preallocate space in the systemFunc, just combine them in the output:
function s=systemFunc(t,y,p1,p2,p3)
s = [ f*y(1)*(1-(y(1)/k))-p1*y(2)*y(1)/(p2*y(2)+y(1)),
p3*y(1)-d*y(2) ];
end
Finally, note that ode45 is internally taking about 1/3 of your runtime. There is not much you will be able to do about that. If you can live with it, I would suggest increasing your 'AbsTol' and 'RelTol' to more reasonable numbers. Those values are really small, and are making ode45 run for a really long time. If you can live with it, try increasing them to something like 1e-6 or 1e-8 and see how much the performance increases. Alternatively, depending on how smooth your function is, you might be able to do better with a different integrator (like ode23). But your mileage will vary based on how smooth your problem is.
I have two suggestions for you.
Preallocate the vectors in which you store your results and use an
increasing index to populate them into each iteration.
Since the options you use are always the same, instantiate then
outside the loop only once.
Final code:
function [m,cVal,x,y] = parameters()
b = 5000;
q = 0;
r = 10^4;
s = 0;
n = 10^-8;
time = 3000;
options = odeset('AbsTol',1e-15,'RelTol',1e-13,'Events',#eventfunction);
val1 = 0.1:0.01:5;
val1_len = numel(val1);
val2 = 0.1:0.2:8;
val2_len = numel(val2);
val3 = 10^-13:10^-14:10^-11;
val3_len = numel(val3);
total_len = val1_len * val2_len * val3_len;
m = NaN(total_len,1);
cVal = NaN(total_len,1);
x = NaN(total_len,1);
y = NaN(total_len,1);
res_offset = 1;
for i = 1:val1_len
for j = 1:val2_len
for k = 1:val3_len
[t,y,te,ye] = ode45(#(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)]),0:time,[b,q,s,r,n],options);
if (length(te) == 1)
m(res_offset) = val1(i);
cVal(res_offset) = val2(j);
x(res_offset) = val3(k);
y(res_offset) = ye(1);
end
res_offset = res_offset + 1;
end
end
end
end
If you only want to preserve result values that have been correctly computed, you can remove the rows containing NaNs at the bottom of your function. Indexing on one of the vectors will be enough to clear everything:
rows_ok = ~isnan(y);
m = m(rows_ok);
cVal = cVal(rows_ok);
x = x(rows_ok);
y = y(rows_ok);
In continuation of the other suggestions, I have 2 more suggestions for you:
You might want to try with a different solver, ODE45 is for non-stiff problems, but from the looks of it, it might seem like your problem could be stiff (parameters have a different order of magnitude). Try for instance with the ode23s method.
Secondly, without knowing which event you are looking for, maybe it is possible for you to use a logarithmic search rather than a linear one. e.g. the Bisection method. This will severely cut down on the number of times you have to solve the equation.

Matlab logncdf function is not producing expected result

So on this problem it seems pretty straight forward we are given
mean of x = 10,281 and sigma of x = 4112.4
We are asked to determine P(X<15,000)
Now I thought the code for this in matlab should be super straightforward
mu = 10281
sigma = 4112.4
p = logncdf(15000,10281,4112.4)
However this gives
p = .0063
The given answer is .8790 and just looking at p you can tell it is wrong because we are at 15000 which is over the mean which means it should be above .5. What is the deal with this function?
I saw somewhere you might need to take the exp(15000) for x in the function that results in a probability of 1 which is too high.
Any pointers would be much appreciated
%If X is lognormally distributed with parameters:-
mu = 10281;
sigma = 4112.4;
%then log(X) is normally distributed with following parameters:
mew_actual = log((mu^2)/sqrt(sigma^2+mu^2));
sigma_actual = sqrt(log((sigma^2)/(mu^2) +1));
Now you can use either of the following to compute CDF:-
p = cdf('Normal',log(15000),mew_actual,sigma_actual)
or
p=logncdf(15000,mew_actual,sigma_actual)
which gives 0.8796
(which I believe is the correct answer)
The answer given to you is 0.8790 because if you solve the question by hand, you'll get something like: z = 1.172759 and when you look this value in the table, you can only find z = 1.17(without the rest of decimal places) and for which φ(z)=0.8790.
You can verify the exact answer using this calculator. The related screenshot is attached below:

Efficient Implementation of `im2col` and `col2im`

MATLAB's im2col and col2im are very important function for vectorization in MATLAB when dealing with images.
Yet they require MATLAB's Image Processing Toolbox.
My question is, is there an efficient (Vectorzied) way to implement the using MATLAB's functions (With no toolbox)?
I need both the sliding and distinct mode.
I don't need any padding.
Thank You.
I can only hope that Mathworks guys don't sue you or me or Stackoverflow for that matter, trying to create vectorized implementations of their IP toolbox functions, as they have put price on that toolbox. But in any case, forgetting those issues, here are the implementations.
Replacement for im2col with 'sliding' option
I wasn't able to vectorize this until I sat down to write solution to another problem on Stackoverflow. So, I would strongly encourage to look into it too.
function out = im2col_sliding(A,blocksize)
nrows = blocksize(1);
ncols = blocksize(2);
%// Get sizes for later usages
[m,n] = size(A);
%// Start indices for each block
start_ind = reshape(bsxfun(#plus,[1:m-nrows+1]',[0:n-ncols]*m),[],1); %//'
%// Row indices
lin_row = permute(bsxfun(#plus,start_ind,[0:nrows-1])',[1 3 2]); %//'
%// Get linear indices based on row and col indices and get desired output
out = A(reshape(bsxfun(#plus,lin_row,[0:ncols-1]*m),nrows*ncols,[]));
return;
Replacement for im2col with 'distinct' option
function out = im2col_distinct(A,blocksize)
nrows = blocksize(1);
ncols = blocksize(2);
nele = nrows*ncols;
row_ext = mod(size(A,1),nrows);
col_ext = mod(size(A,2),ncols);
padrowlen = (row_ext~=0)*(nrows - row_ext);
padcollen = (col_ext~=0)*(ncols - col_ext);
A1 = zeros(size(A,1)+padrowlen,size(A,2)+padcollen);
A1(1:size(A,1),1:size(A,2)) = A;
t1 = reshape(A1,nrows,size(A1,1)/nrows,[]);
t2 = reshape(permute(t1,[1 3 2]),size(t1,1)*size(t1,3),[]);
t3 = permute(reshape(t2,nele,size(t2,1)/nele,[]),[1 3 2]);
out = reshape(t3,nele,[]);
return;
Some quick tests show that both these implementations particularly sliding one for small to decent sized input data and distinct for all datasizes perform much better than the in-built MATLAB function implementations in terms of runtime performance.
How to use
With in-built MATLAB function -
B = im2col(A,[nrows ncols],'sliding')
With our custom function -
B = im2col_sliding(A,[nrows ncols])
%// ------------------------------------
With in-built MATLAB function -
B = im2col(A,[nrows ncols],'distinct')
With our custom function -
B = im2col_distinct(A,[nrows ncols])
You can cheat while looking to GNU Octave image package. There are im2col and col2im as script language implemented:
im2col
col2im
As far as I see, it differs most in different comment style (# instead of %) and different string style (" instead of '). If you change this and remove the assert test on the bottom, it might be runnable already. If not, go through it with the debugger.
Furthermore, be aware of the License (GPLv3). It's free, but your changes have to be free too!

How can I plot data to a “best fit” cos² graph in Matlab?

I’m currently a Physics student and for several weeks have been compiling data related to ‘Quantum Entanglement’. I’ve now got to a point where I have to plot my data (which should resemble a cos² graph - and does) to a sort of “best fit” cos² graph. The lab script says the following:
A more precise determination of the visibility V (this is basically how 'clean' the data is) follows from the best fit to the measured data using the function:
f(b) = A/2[1-Vsin(b-b(center)/P)]
Granted this probably doesn’t mean much out of context, but essentially A is the amplitude, b is an angle and P is the periodicity. Hence this is also a “wave” like the experimental data I have found.
From this I understand, as previously mentioned, I am making a “best fit” curve. However, I have been told that this isn’t possible with Excel and that the best approach is Matlab.
I know intermediate JavaScript but do not know Matlab and was hoping for some direction.
Is there a tutorial I can read for this? Is it possible for someone to go through it with me? I really have no idea what it entails, so any feed back would be greatly appreciated.
Thanks a lot!
Initial steps
I guess we should begin by getting a representation in Matlab of the function that you're trying to model. A direct translation of your formula looks like this:
function y = targetfunction(A,V,P,bc,b)
y = (A/2) * (1 - V * sin((b-bc) / P));
end
Getting hold of the data
My next step is going to be to generate some data to work with (you'll use your own data, naturally). So here's a function that generates some noisy data. Notice that I've supplied some values for the parameters.
function [y b] = generateData(npoints,noise)
A = 2;
V = 1;
P = 0.7;
bc = 0;
b = 2 * pi * rand(npoints,1);
y = targetfunction(A,V,P,bc,b) + noise * randn(npoints,1);
end
The function rand generates random points on the interval [0,1], and I multiplied those by 2*pi to get points randomly on the interval [0, 2*pi]. I then applied the target function at those points, and added a bit of noise (the function randn generates normally distributed random variables).
Fitting parameters
The most complicated function is the one that fits a model to your data. For this I use the function fminunc, which does unconstrained minimization. The routine looks like this:
function [A V P bc] = bestfit(y,b)
x0(1) = 1; %# A
x0(2) = 1; %# V
x0(3) = 0.5; %# P
x0(4) = 0; %# bc
f = #(x) norm(y - targetfunction(x(1),x(2),x(3),x(4),b));
x = fminunc(f,x0);
A = x(1);
V = x(2);
P = x(3);
bc = x(4);
end
Let's go through line by line. First, I define the function f that I want to minimize. This isn't too hard. To minimize a function in Matlab, it needs to take a single vector as a parameter. Therefore we have to pack our four parameters into a vector, which I do in the first four lines. I used values that are close, but not the same, as the ones that I used to generate the data.
Then I define the function I want to minimize. It takes a single argument x, which it unpacks and feeds to the targetfunction, along with the points b in our dataset. Hopefully these are close to y. We measure how far they are from y by subtracting from y and applying the function norm, which squares every component, adds them up and takes the square root (i.e. it computes the root mean square error).
Then I call fminunc with our function to be minimized, and the initial guess for the parameters. This uses an internal routine to find the closest match for each of the parameters, and returns them in the vector x.
Finally, I unpack the parameters from the vector x.
Putting it all together
We now have all the components we need, so we just want one final function to tie them together. Here it is:
function master
%# Generate some data (you should read in your own data here)
[f b] = generateData(1000,1);
%# Find the best fitting parameters
[A V P bc] = bestfit(f,b);
%# Print them to the screen
fprintf('A = %f\n',A)
fprintf('V = %f\n',V)
fprintf('P = %f\n',P)
fprintf('bc = %f\n',bc)
%# Make plots of the data and the function we have fitted
plot(b,f,'.');
hold on
plot(sort(b),targetfunction(A,V,P,bc,sort(b)),'r','LineWidth',2)
end
If I run this function, I see this being printed to the screen:
>> master
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the function tolerance.
A = 1.991727
V = 0.979819
P = 0.695265
bc = 0.067431
And the following plot appears:
That fit looks good enough to me. Let me know if you have any questions about anything I've done here.
I am a bit surprised as you mention f(a) and your function does not contain an a, but in general, suppose you want to plot f(x) = cos(x)^2
First determine for which values of x you want to make a plot, for example
xmin = 0;
stepsize = 1/100;
xmax = 6.5;
x = xmin:stepsize:xmax;
y = cos(x).^2;
plot(x,y)
However, note that this approach works just as well in excel, you just have to do some work to get your x values and function in the right cells.