MATLAB code requires too much time for compiling

I am trying to compute this
in MATLAB, but the code takes about 8 hours to run. In particular, e, Ft=[h(t);q(t)] and Omega are 2x1 vectors (e' is 1x2), Gamma is a 2x2 matrix, and n=30. Can someone help me optimize this code?
I tried in this way:
aux=[0;0];
for k=0:29
for j=1:k-1
aux=[aux Gamma^j*Omega];
end
E(t,k+1)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
end
Vix=1/30*sum(E,2);
EDIT
I have now changed it to this, and it is faster, but I am not sure that I am applying the formula in the picture correctly...
for t=2:T
% 1. compute today's volatility
csi(t) = log(SP500(t)/SP500(t-1))-r(t)+0.5*h(t);
q(t+1) = omega+rho*q(t)+phi*((csi(t)-lambda*sqrt(h(t)))^2-h(t));
h(t+1) = q(t)+alpha*((csi(t)-lambda*sqrt(h(t)))^2-q(t))+beta*(h(t)-q(t));
for k=1:30
aux=zeros(2,k);
for j=0:k-1
aux(:,j+1)=Gamma^j*Omega;
end
E(t,k)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
end
end
Vix(2:end)=1/30*sum(E(2:end,:),2);
(I don't need Vix(1))

Here are some reasons I can think of:
REPEATED COPYING (no preallocation)
The main reason for the long run time is the line aux=[aux Gamma^j*Omega], in which an array is concatenated at every loop iteration. MATLAB's Code Analyzer should have flagged this in the editor and suggested preallocating the array, for example with zeros.
Essentially, when you concatenate arrays this way, MATLAB internally copies the array at every loop iteration, so copying takes place on top of the math operations. As the array grows, the copies become ever more expensive. Preallocation avoids this: you define the size of the storage array (here the variable aux) up front, so that MATLAB doesn't have to keep allocating space on the go. Try:
aux = zeros(2, 406); %Creates a 2 by 406 array. I explain how I get 406 below:
p = 0; %A variable that indexes the columns of aux
for k=0:29
for j=1:k-1
p = p+1; %Update column counter
aux(:,p) = Gamma^j*Omega; % A 2x2 matrix multiplied by a 2x1 matrix yields a 2x1.
end
E(t,k+1)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
end
Vix=1/30*sum(E,2);
Now, MATLAB simply overwrites the individual elements of aux instead of copying aux, and concatenating it with Gamma^j*Omega, and then overwriting aux. Essentially, the above makes MATLAB allocate space for aux ONCE instead of 406 times. I figured out that aux ends up being a 2 by 406 array for the n=30 case in the end by running this code:
p = 0;
for k = 0:29
for j = 1:k-1
p = p + 1;
end
end
To know the final size of aux for other values of n you should see if a formula for it is available (or derive your own).
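For the loop bounds used above (k = 0:n-1, j = 1:k-1), the inner loop runs k-1 times for each k >= 2, so the column count should work out to 1 + 2 + ... + (n-2) = (n-1)(n-2)/2, which gives 406 for n = 30. A minimal check of that closed form against brute-force counting (the variable name closedForm is purely illustrative):

```matlab
% Compare brute-force counting with the closed form (n-1)*(n-2)/2.
n = 30;
p = 0;
for k = 0:n-1
    for j = 1:k-1
        p = p + 1;          % one column of aux per inner iteration
    end
end
closedForm = (n-1)*(n-2)/2; % sum of the integers 1..(n-2)
fprintf('counted %d, formula %d\n', p, closedForm);
```

If the loop bounds change (for example j = 0:k-1), the count changes too, so the formula would need to be rederived.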
LOOPING TRANSPOSITION OF A CONSTANT?
Next, e'. As you may know, ' is the transpose operation. From your sample code, the variable e is not edited inside the for loops, yet you have the ' operator inside the outer for loop. If you perform the transpose operation once outside the outer for loop you save yourself the expense of transposing it at every loop iteration.
RUNNING TOTAL
As a final note, I would suggest replacing sum(aux,2) with a variable that keeps a running total. This is because currently, this makes MATLAB sum over the entirety of aux at every loop iteration.
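A minimal sketch of the running-total idea, assuming the formula is the one used in the question's EDIT, that is E(k) = e'*(sum over j = 0..k-1 of Gamma^j*Omega + Gamma^k*Ft). The numerical values of Gamma, Omega, e and Ft below are made up for illustration, and the names S, P and et are not from the original code:

```matlab
% Made-up inputs, for illustration only.
Gamma = [0.90 0.05; 0.00 0.95];
Omega = [0.01; 0.02];
e     = [1; 0];
Ft    = [0.04; 0.03];       % stands in for [h(t); q(t)]
n     = 30;

et = e';                    % transpose once, outside the loop
S  = zeros(2,1);            % running total of Gamma^j*Omega, j = 0..k-1
P  = eye(2);                % running power, holds Gamma^(k-1) at loop entry
E  = zeros(1,n);
for k = 1:n
    S = S + P*Omega;        % add the j = k-1 term
    P = P*Gamma;            % now P = Gamma^k
    E(k) = et*(S + P*Ft);
end
Vix = sum(E)/n;

% Reference: the direct double loop from the question's EDIT.
Eref = zeros(1,n);
for k = 1:n
    aux = zeros(2,k);
    for j = 0:k-1
        aux(:,j+1) = Gamma^j*Omega;
    end
    Eref(k) = e'*(sum(aux,2) + Gamma^k*Ft);
end
maxDiff = max(abs(E - Eref));   % should be at round-off level
```

Neither the partial sums S nor the powers P depend on t, so in a loop over t they could be computed once beforehand and reused for every t.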
Hope this helps mate.

Related

matlab fix slicing to execute parfor

Just beginning with parallel stuff...
I have a code that boils down to filling the columns of a matrix A (which is preallocated with NaNs) with an array of variable length:
A = nan(100);
for ii=1:100
hmy = randi([1,100]); %length of the array
A(1:hmy,ii) = rand(hmy,1); %array
end
Simply changing the for into a parfor does not even run
parfor ii=1:100
hmy = randi([1,100]); %how many not NaN values to put in
A(1:hmy,ii) = rand(hmy,1);
end
because the parfor does not like the indexing:
MATLAB runs loops in parfor functions by dividing the loop iterations
into groups, and then sending them to MATLAB workers where they run in
parallel. For MATLAB to do this in a repeatable, reliable manner, it
must be able to classify all the variables used in the loop. The code
uses the indicated variable in a way that is incompatible with
classification.
I thought that this was due to the indexing on the first dimension and tried a workaround that did not work (same error message as before):
parfor ii=1:100
hmy = randi([1,100]);
tmp = [rand(hmy,1); NaN(size(A,1)-hmy,1)];
A(:,ii) = tmp;
end
How can I index A in order to store the array?
You can't assign to a variable-length part of a row or column of A inside parfor. You have to assign a full row or a full column. Here is the updated code.
A = nan(100);
parfor ii=1:100
hmy = randi([1,100]); %length of the array
temp = nan(100,1); %full column, padded with NaN
temp(1:hmy) = rand(hmy,1); %array
A(:,ii) = temp; %assign the full ii-th column
end
First of all, the output variable (A in this case) is a sliced variable inside parfor, which means it is split into parts that are processed in parallel. The form of indexing into a sliced variable must be the same at every occurrence. Here, you generate a random number (hmy) and use 1:hmy as an index, which changes from iteration to iteration. That is why the output variable can't be sliced and you get the error.
If you try with a fixed hmy, then it will be alright.

Efficient way to perform loops of matrix multiplications

I have an implementation that involves multiplying matrices, summing them up, and storing them. It goes like this,
A = 0;
b = 0;
for i=1:1225
% load A_i operator
A_i_obj = load([path_temp,'A_',num2str(i),'.mat']);
A_i = (A_i_obj.A);
% z_i is some variable of size Nx1 that I compute in this loop something like
% x is some variable of size Nx1 calculated above this loop
z_i = A_i*x;
% I have to perform some operations like these
y_i = A_i*(z_i + x);
A = A + A_i*A_i';
b = b + A_i*y_i;
end
% A and b will be used here something like
soln = inv(A)*b;
My problem is the large amount of simulation time being consumed by the above code. Even when the operations inside the loop are efficient (let's say ~0.01mins), the entire looped implementation still consumes about ~12-13mins. Can somebody please help me out and suggest an efficient way to do this? Thanks so much!
I don't get the point of loading the MAT-file in every iteration. Load your data only once, outside the for loop. After all, loading is a very time-expensive operation and, more importantly, once the data is in a variable it doesn't need to be loaded again.
A = 0;
A_i = load('A_i.mat');
for i = 1:1225
% ...
y_i = A_i * (z_i + x);
A = A + A_i * A_i';
end
On a side note, you are assigning the output of the load function directly to a variable using the following overload:
S = load(___) loads data into S, using any of the input arguments in the previous syntax group.
- If filename is a MAT-file, then S is a structure array.
- If filename is an ASCII file, then S is a double-precision array containing data from the file.
So I suppose your file is in ASCII format, otherwise your A_i will not be a matrix but a structure array. Also, prefer .' over ' when you want a plain transpose, since ' is the complex conjugate transpose (the two only coincide for real matrices):
A = A + A_i * A_i.';
Since you omitted a part of the code running inside the loop, I can't do more in order to improve its performance.
Profile, profile, profile
I would guess that the process of loading the matrices is killing you, but it is hard to say. To get a better idea of which step is killing you, start your code with
profile on
and end it with
profile viewer
Then run it again. When the code completes, it will show you the time taken by each call, which will help you figure out where the problem is.
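As a complement to the profiler, a rough manual split between load time and arithmetic time can be obtained with tic/toc. The sketch below is self-contained and therefore writes a few small made-up MAT-files to a temporary folder first; the sizes, file count and the names tLoad, tMath and Asum are assumptions for illustration, not the asker's actual data:

```matlab
% Illustrative setup: write a few small MAT-files to a temporary folder.
tmpDir = tempname;
mkdir(tmpDir);
N = 50;
nFiles = 5;
for i = 1:nFiles
    A = rand(N);
    save(fullfile(tmpDir, ['A_', num2str(i), '.mat']), 'A');
end

x = rand(N,1);
tLoad = 0;                       % accumulated time spent in load
tMath = 0;                       % accumulated time spent in arithmetic
Asum = zeros(N);
b = zeros(N,1);
for i = 1:nFiles
    t0 = tic;
    S = load(fullfile(tmpDir, ['A_', num2str(i), '.mat']));
    tLoad = tLoad + toc(t0);

    t0 = tic;
    A_i = S.A;
    y_i = A_i*(A_i*x + x);       % same shape of work as in the question
    Asum = Asum + A_i*A_i';
    b = b + A_i*y_i;
    tMath = tMath + toc(t0);
end
fprintf('load: %.4f s, arithmetic: %.4f s\n', tLoad, tMath);
rmdir(tmpDir, 's');              % remove the temporary files
```

Whichever of the two totals dominates suggests where to look first; the profiler then gives the per-line detail.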

Evaluate a changing function in loop

I am writing a code that generates a function f in a loop. This function f changes in every loop, for example from f = x + 2x to f = 3x^2 + 1 (randomly), and I want to evaluate f at different points in every loop. I have tried using subs, eval, matlabFunction etc but it is still running slowly. How would you tackle a problem like this in the most efficient way?
This is as fast as I have been able to do it. matlabFunction and subs are slower than this.
The code below is my solution and it is one loop. In my larger code the function f and point x0 change in every loop so you can imagine why I want this to go as fast as possible. I would greatly appreciate it if someone could go through this, and give me any pointers. If my coding is crap feel free to tell me :D
x = sym('x',[2,1]);
f = [x(1)-x(1)*cos(x(2)), x(2)-3*x(2)^2*cos(x(1))];
J = jacobian(f,x);
x0 = [2,1];
N=length(x0); % Number of equations
%% Transform into string
fstr = map2mat(char(f));
Jstr = map2mat(char(J));
% replace every occurrence of 'xi' with 'x(i)'
Jstr = addPar(Jstr,N);
fstr = addPar(fstr,N);
x = x0;
phi0 = eval(fstr)
J = eval(Jstr)
function str = addPar(str,N)
% str = addPar(str,N)
% Transforms every occurrence of xi in str into x(i)
% N is the maximum value of i
% replace every occurrence of xi with x(i)
% note that we do this backwards to avoid x10 being
% replaced with x(1)0
for i=N:-1:1
is = num2str(i);
xis = ['x' is];
xpis = ['x(' is ')'];
str = strrep(str,xis,xpis);
end
function r = map2mat(r)
% MAP2MAT Maple to MATLAB string conversion.
% Lifted from the symbolic toolbox source code
% MAP2MAT(r) converts the Maple string r containing
% matrix, vector, or array to a valid MATLAB string.
%
% Examples: map2mat(matrix([[a,b], [c,d]]) returns
% [a,b;c,d]
% map2mat(array([[a,b], [c,d]]) returns
% [a,b;c,d]
% map2mat(vector([[a,b,c,d]]) returns
% [a,b,c,d]
% Deblank.
r(findstr(r,' ')) = [];
% Special case of the empty matrix or vector
if strcmp(r,'vector([])') | strcmp(r,'matrix([])') | ...
strcmp(r,'array([])')
r = [];
else
% Remove matrix, vector, or array from the string.
r = strrep(r,'matrix([[','['); r = strrep(r,'array([[','[');
r = strrep(r,'vector([','['); r = strrep(r,'],[',';');
r = strrep(r,']])',']'); r = strrep(r,'])',']');
end
There are several ways to get huge boosts in speed for this sort of problem:
The Java GUI front end slows everything down. Go back to version 2010a or earlier. Go back to when it was based on C or Fortran. The MATLAB script runs as fast as if you had put it into the MATLAB "compiler".
If you have the MATLAB Compiler (or Builder, I forget which) but not the Coder, then you can process your code and have it run a few times faster without modifying the code.
Write it to a file, then call it as a function. I have done this for changing finite-element expressions, large ugly math that makes $y = 3x^2 + 1$ look simple, and it gave me a solid speed increase.
Vectorize, vectorize, vectorize. It used to reliably give a 10x to 12x speed increase. Pull work out of loops. The Java, I think, obscures this somewhat by making everything slower.
Have you profiled your function to make sure that eval or the like is the problem? If you fix eval and your bottleneck is elsewhere, you will still have problems.
If you have the choice between eval and subs, stick with eval. subs gives you a symbolic solution, not a numeric one.
If there is a clean way to have multiple instances of MATLAB running, especially if you have a decently core-rich CPU that MATLAB does not fully utilize, then get several of them going. If you are at an educational institution you might try running several different versions (2010a, 2010b, 2009a,...) on the same system. I (fuzzily) recall they didn't collide when I did it. Running more than about 8 started slowing things down more than it improved them. Make sure they aren't colliding on file access if you are using files to share control.
You could write your program in LabVIEW (not MathScript, not MATLAB), and because it is a compiled language, there are times that code can run 1000x faster.
You could go all numeric and make it a matrix activity. This depends on your code, but if you could randomly populate the columns of a coefficient matrix and then multiply it by a matrix of powers $ \left[ 1, x, x^{2}, ...\right] $, that would likely be several hundred or several thousand times faster than your current level of equation handling, and still in MATLAB.
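The "all numeric" suggestion in the last item can be sketched as follows. This only applies if the changing functions are polynomials of bounded degree, which is an assumption (the Jacobian example in the question contains cos terms and would not fit). The coefficient matrix C and the points x are made up for illustration:

```matlab
% Each row of C holds the coefficients [c0 c1 c2] of c0 + c1*x + c2*x^2.
C = [0 3 0;      % f1(x) = x + 2x = 3x
     1 0 3];     % f2(x) = 3x^2 + 1
x = [0 1 2];     % evaluation points (row vector)
d = size(C,2) - 1;

% V(p+1,q) = x(q)^p; bsxfun keeps this working on releases before R2016b.
V = bsxfun(@power, x, (0:d)');
F = C*V;         % F(r,q) = value of the r-th polynomial at x(q)
disp(F)
```

Changing the function then amounts to changing a row of C, with no symbolic work or string handling inside the loop.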
About your coding:
Don't redeclare "x" as a symbol in every loop iteration; that is expensive.
What is this "map2mat" then "addPar" stuff?
The string handling functions are horrible for runtime. Stick to one language. The symbolic toolbox IS Maple, and you don't need goofy hand-made parsing to make it work with the rest of MATLAB.

MATLAB: The "dot operator" versus using a "for loop"

I'm a bit of a beginner at MATLAB, so I'm having some trouble understanding the difference between a dot operator and a for loop.
Given a Column vector (it's a pretty long column vector). We are given the following equation...
f(x)=0.2*x^3 + (1/3)*(x^2-1) + 2*cos(x)+3*cos(10x)
I need to use the method of dot operator and a for loop to create 2 plots and also the time (using tic, toc)
However, with dot operator does it mean using
.^ or .*
in the equation? and if this is the case, wouldn't I still need to use that in order to make a for loop?
Any clarification or assistance would be greatly appreciated! I don't really understand how I would write these...
The operators prefixed with a dot are called element-wise operators. They perform the operation on each element of the arrays (after checking that the arrays involved have the same size, or that one of them is a scalar). So you don't need a for-loop when using these operators; the loop is implied. This is called vectorization.
For example:
C = A.*B;
is equivalent to:
C = zeros(size(A));
for i=1:numel(A)
C(i) = A(i)*B(i);
end
but the first one is heavily optimized. So it's strongly advised to use vectorized operators as much as possible.
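Applied to the function from the question, the two approaches could be timed with tic/toc roughly as below. This is a sketch under assumptions: the sample points x are made up (the question's column vector is not shown), and the names f1, f2, tVec and tLoop are illustrative:

```matlab
x = linspace(0, 10, 1e5)';     % assumed column vector of sample points

% Version 1: element-wise (dot) operators, no explicit loop.
tic
f1 = 0.2*x.^3 + (1/3)*(x.^2 - 1) + 2*cos(x) + 3*cos(10*x);
tVec = toc;

% Version 2: explicit loop over the elements, with preallocation.
tic
f2 = zeros(size(x));
for i = 1:numel(x)
    f2(i) = 0.2*x(i)^3 + (1/3)*(x(i)^2 - 1) + 2*cos(x(i)) + 3*cos(10*x(i));
end
tLoop = toc;

fprintf('vectorized: %.4f s, loop: %.4f s\n', tVec, tLoop);
maxDiff = max(abs(f1 - f2));   % both versions should agree closely
```

Inside the loop each x(i) is a scalar, so ^ and .^ (and * and .*) give the same result there; the dot is only required when operating on the whole vector at once. Calling plot(x, f1) and plot(x, f2) afterwards gives the two plots.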
Your x is a vector of defined length and step size. For example, it can be:
x = 1:1:100 %generates 1,2,3,...,100
x = 1:0.1:10 %generates 1,1.1,1.2,...,10
So if you want to write a function of x (which is a vector), for speed purposes you might want to use the element-wise operators, such as .* for the element-wise product in MATLAB. In your case you can do:
f = 0.2.*x.^3 + (1/3).*(x.^2-1) + 2.*cos(x) + 3.*cos(10.*x);
The more costly way to compute it is to use a for loop:
for x = 1:100
f(x) = 0.2*x^3 + (1/3)*(x^2-1) + 2*cos(x) + 3*cos(10*x);
end
figure
plot(f)
The problem in your for loop is that you overwrite the value of fx2: in each iteration you give it a new value, but it always remains of size 1x1. For a minimal change to your code you could do something like:
fx2=[];
for x = A(:,4).' % for iterates over columns, so loop over a row vector
fx2 = [fx2 0.2*x^3+(1/3)*(x^2-1)+2*cos(x)+3*cos(10*x)];
end
plot(A(:,4),fx2) % x now holds only the last element, so plot against the full column
That way, you append a new value to the vector fx2 at each iteration instead of overwriting it (fx2 would be 1x1, then 1x2, and so on). However, be aware that this is not optimized at all, both because the for loop can be avoided and because fx2's size changes at each iteration. A better solution would be to predefine fx2 with the right size and then, in the loop, set its ith element at the ith iteration.
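The preallocated variant described in the last sentence might look like this. The vector xs stands in for A(:,4), which is not shown in the question, so its values are made up:

```matlab
xs  = (0:0.5:5)';              % stands in for A(:,4); values are made up
fx2 = zeros(size(xs));         % allocate the full result once
for i = 1:numel(xs)
    xi = xs(i);                % current scalar sample
    fx2(i) = 0.2*xi^3 + (1/3)*(xi^2 - 1) + 2*cos(xi) + 3*cos(10*xi);
end
```

Here fx2 keeps the same size throughout the loop, and plot(xs, fx2) can be called afterwards with matching vector lengths.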

MATLAB - vector script

I have recently started learning MATLAB, and wrote the following script today as part of my practice to see how we can generate a vector:
x = [];
n = 4;
for i = i:n
x = [x,i^2];
end
x
When I run this script I get what I expect, namely the following vector:
x = 0 1 4 9 16
However, if I run the script a second time right afterwards I only get the following output:
x = 16
What is the reason for this? How come I only get the last vector entry as output the second time I run the script, and not the vector in its entirety? If anyone can explain this to me, I would greatly appreciate it.
Beginning with a fresh workspace, i will simply be the complex number 1i (as in x^2=-1). I imagine you got this warning on the first run:
Warning: Colon operands must be real scalars.
So the for statement basically loops over for i = real(1i):4. Note that real(1i)=0.
When you rerun the script again with the variables already initialized (assuming you didn't clear the workspace), i will refer to a variable containing the last value of 4, shadowing the builtin function i with the same name, and the for-loop executes:
x=[];
for i=4:4
x = [x, i^2]
end
which iterates only one time, thus you end up with x=16
You forgot to initialize i.
After the first execution, i is 4 and remains 4.
Then you initialize x as an empty vector, but because i is 4 the loop runs only once.
Clear your workspace and inspect it before and after the first execution.
Is it possibly a simple typo?
for i = i:n
and should actually mean
for i = 1:n
As i is (probably) not defined as a variable in the first run, it refers to the built-in imaginary unit, whose real part 0 is used as the start of the range, so it works just fine.
The second time, i is still n (=4), and the loop only runs once.
Also, as a performance tip: in every iteration of your loop you increase the size of your vector. The more efficient (and more MATLAB-like) way would be to create the vector with the base values first, for example with
x = 1:n
and then square each value by
x = x.^2
In MATLAB, using vector operations (or matrix operations in higher dimensions) should be preferred over iterative loop approaches, as it gives MATLAB the opportunity to do optimized operations. It is also often more readable that way.
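Putting the suggestions from these answers together, a small sketch with the typo fixed (1:n instead of i:n), a loop variable that does not shadow the imaginary unit, and the vectorized equivalent. The names xLoop and xVec are illustrative:

```matlab
n = 4;

% Loop version with the typo fixed (1:n) and the result preallocated.
xLoop = zeros(1, n);
for k = 1:n        % k instead of i, so the imaginary unit i is not shadowed
    xLoop(k) = k^2;
end

% Vectorized version: element-wise power on the whole range.
xVec = (1:n).^2;

disp(xVec)
```

Because neither version depends on a leftover workspace variable, running the script repeatedly gives the same result each time.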