How to make non-iterative code faster than the iterative version when using a row-by-row backslash inverse? - matlab

This code creates a vector B containing the product of each row of A with the backslash inverse of the corresponding row of x:
A = [1,2,3,8,1;10,45,7,3,1;9,8,15,75,65,];
x = [14,5,11,15,33;7,1,9,1,1;87,45,11,0,65];
B=zeros(3,1);
% the iterative code
tic
for k = 1:size(x,1)
B(k) = A(k,:)*(x(k,:)\1);
end
t1 = toc;
disp(B)
How do I avoid the for loop and still keep the code fast?
I noticed that, in MATLAB, the backslash of a row vector x returns a vector y that inverts the largest-magnitude element and sets the rest to 0, so I tried to write the code without a for loop:
tic
% code without iteration
ind = abs(x)==max(abs(x),[],2); % compare magnitudes so a negative largest element is also found
y = (1./x).*ind;
y(isnan(y))=0;
C = sum(A.*y,2);
t2 = toc;
disp(C)
I got the same output. However, the second version was slower on my PC than the first:
t1 = 0.000681
t2 = 0.002536
I tried the pinv() function, but it doesn't give the same results (the backslash inverse is better for my code).
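One possible way to drop both the loop and the full-size temporary matrices is to pick out only the largest-magnitude element of each row by linear index. This is a minimal sketch that relies on the behavior observed in the question (the backslash of a row vector inverts its largest-magnitude element and zeros the rest); the names idx and lin are illustrative, and no timing is claimed for it.

```matlab
A = [1,2,3,8,1; 10,45,7,3,1; 9,8,15,75,65];
x = [14,5,11,15,33; 7,1,9,1,1; 87,45,11,0,65];

% Column index of the largest-magnitude element in each row of x
[~, idx] = max(abs(x), [], 2);

% Convert (row, column) pairs to linear indices into x and A
lin = sub2ind(size(x), (1:size(x,1)).', idx);

% Only one element per row contributes, so divide those elements directly
C = A(lin) ./ x(lin);
disp(C)
```

Timing with timeit, or averaging tic/toc over many repetitions, gives a fairer comparison than a single tic/toc pair on inputs this small.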

Related

How to run a for loop using two variables at the same time in Matlab

I am using this command in Matlab:
grazAng = grazingang(H,R)
If I fix H, I can treat R as a vector:
z=[];
for i=1:1000
z(i)=abs(grazingang(1,i));
end
Now I would like both H and R to be dynamic. For example:
H=[0,0.25,0.5]
R=[1,2,3]
And I would like my loop to run three times, each time selecting a pair of (H,R) values with the same indexes, i.e. (0,1),(0.25,2),(0.5,3) and then store the result in z. Could anyone help me out with this?
Remember, everything in MATLAB is an array. To do this with a loop, you need to index into the arrays:
H = [0,0.25,0.5];
R = [1,2,3];
z = zeros(size(H)); % Pre-allocation is generally advised
for i = 1:numel(H) % loop over the (H,R) pairs; 1:1000 would index past the end of H and R
z(i) = abs(grazingang(H(i),R(i)));
end
But MATLAB functions generally accept vectors and do this for you, so all you need to do is:
H=[0,0.25,0.5];
R=[1,2,3];
z = abs(grazingang(H,R));

Are there any special rules for nesting an if-statement in a for-loop in MATLAB?

I am trying to create a signal and then build a discrete-time signal by sampling the CT signal I created first. Up to the last for-loop, things work fine, but I need to take N samples separated by T. Without an if statement, I get an index out-of-bounds error, so I had to limit the sampling to the duration of the signal. For some reason, my code enters the if statement once and never again. For debugging, I am printing the values both inside and outside the if. Although the logical condition should be true for more than one iteration (the print statements show the values), the statements inside the if-statement are not printed. What's wrong here?
function x = myA2D(b,w,p,T,N)
%MYA2D description: Takes in parameters to construct the CT-sampled DT signal
%b,w,p are Mx1 vectors and it returns Nx1 vector.
timeSpace = 0:0.001:3*pi;
xConstT = zeros(size(timeSpace));
%Construct Xc(t) signal
for k = 1:size(b,1)
temp = b(k) .* cos(w(k).*timeSpace + p(k));
xConstT = xConstT + temp;
end
plot(xConstT);
%Sampling CT-Signal to build DT-signal
disp(strcat('xConstT size',int2str(size(xConstT))));
x = zeros(N,1);
sizeConstT = size(xConstT);
for i = 0:N-1
index = i .* T .* 1000 + 1;
disp(strcat('indexoo=',int2str(index)));
disp(strcat('xConstSizeeee',int2str(sizeConstT)));
if index <= sizeConstT
disp(strcat('idx=',int2str(index)));
disp(strcat('xSize',int2str(sizeConstT)));
%x(i+1,1) = xConstT(index);
end
end
end
sizeConstT = size(xConstT); creates a 1x2 array, so you are comparing a scalar to an array, and your code enters the if block only if the comparison is true for every element of the array. This example illustrates the issue:
if 1 <= [1 12]; disp('one'); end % <- prints 'one'
if 2 <= [1 12]; disp('two'); end % <- prints nothing
Your code will work with sizeConstT = length(xConstT);
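To see why the original condition passes only once, here is a small sketch with the same timeSpace as in the question. The flag names enteredWithSize and enteredWithNumel are illustrative, and numel is used here as an equivalent of length for a vector.

```matlab
timeSpace = 0:0.001:3*pi;
xConstT = zeros(size(timeSpace));
index = 5;                  % a sample index that is well inside the signal

sz = size(xConstT);         % 1x2 vector: [1, number of samples]
n  = numel(xConstT);        % scalar: number of samples

% index <= sz compares against both entries; the first entry is 1,
% so the condition holds for every element only when index is 1
enteredWithSize = false;
if index <= sz
    enteredWithSize = true;
end

% index <= n is a scalar comparison and behaves as intended
enteredWithNumel = false;
if index <= n
    enteredWithNumel = true;
end
```

With the size-based comparison the block is skipped for every index above 1, which matches the symptom in the question (the block runs only for i = 0).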

Quickly Evaluating MANY matlabFunctions

This post builds on my post about quickly evaluating analytic Jacobian in Matlab:
fast evaluation of analytical jacobian in MATLAB
The key difference is that now, I am working with the Hessian and I have to evaluate close to 700 matlabFunctions (instead of 1 matlabFunction, like I did for the Jacobian) each time the hessian is evaluated. So there is an opportunity to do things a little differently.
I have tried to do this two ways so far and I am thinking about implementing a third and was wondering if anyone has any other suggestions. I will go through each method with a toy example, but first some preprocessing to generate these matlabFunctions:
PreProcessing:
% This part of the code is calculated once, it is not the issue
dvs = 5;
X=sym('X',[dvs,1]);
num = dvs - 1; % number of constraints
% multiple functions
for k = 1:num
f1(X(k+1),X(k)) = (X(k+1)^3 - X(k)^2*k^2);
c(k) = f1;
end
gradc = jacobian(c,X).'; % .' performs transpose
parfor k = 1:num
hessc{k} = jacobian(gradc(:,k),X);
end
parfor k = 1:num
hess_name = strcat('hessian_',num2str(k));
matlabFunction(hessc{k},'file',hess_name,'vars',X);
end
METHOD #1 : Evaluate functions in series
%% Now we use the functions to run an "optimization." Just for an example the "optimization" is just a for loop
fprintf('This is test A, where the functions are evaluated in series!\n');
tic
for q = 1:10
x_dv = rand(dvs,1); % these are the design variables
lambda = rand(num,1); % these are the lagrange multipliers
x_dv_cell = num2cell(x_dv); % for passing large design variables
for k = 1:num
hess_name = strcat('hessian_',num2str(k));
function_handle = str2func(hess_name);
H_temp(:,:,k) = lambda(k)*function_handle(x_dv_cell{:});
end
H = sum(H_temp,3);
end
fprintf('The time for test A was:\n')
toc
METHOD # 2: Evaluate functions in parallel
%% Try to run a parfor loop
fprintf('This is test B, where the functions are evaluated in parallel!\n');
tic
for q = 1:10
x_dv = rand(dvs,1); % these are the design variables
lambda = rand(num,1); % these are the lagrange multipliers
x_dv_cell = num2cell(x_dv); % for passing large design variables
parfor k = 1:num
hess_name = strcat('hessian_',num2str(k));
function_handle = str2func(hess_name);
H_temp(:,:,k) = lambda(k)*function_handle(x_dv_cell{:});
end
H = sum(H_temp,3);
end
fprintf('The time for test B was:\n')
toc
RESULTS:
METHOD #1 = 0.008691 seconds
METHOD #2 = 0.464786 seconds
DISCUSSION of RESULTS
This result makes sense because the functions evaluate very quickly, and running them in parallel wastes a lot of time setting up and sending the jobs out to the different MATLAB workers (and then getting the data back from them). I see the same result on my actual problem.
METHOD # 3: Evaluating the functions using the GPU
I have not tried this yet, but I am interested to see what the performance difference is. I am not yet familiar with doing this in Matlab and will add it once I am done.
Any other thoughts? Comments? Thanks!
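One low-effort variation on Method #1, offered only as a sketch: build the function handles once, outside the optimization loop, and accumulate H directly instead of filling H_temp and summing. The anonymous functions below are hypothetical stand-ins for the generated hessian_k files so that the snippet runs on its own; with the real files, each handle would come from str2func(sprintf('hessian_%d',k)). Whether this helps measurably on the real problem is not something I have verified.

```matlab
dvs = 5;
num = dvs - 1;

% Build the handles once. These anonymous functions are hypothetical
% stand-ins for the generated hessian_k files.
hess_funcs = cell(num,1);
for k = 1:num
    hess_funcs{k} = @(varargin) k * eye(dvs);
end

x_dv = rand(dvs,1);           % design variables
lambda = rand(num,1);         % Lagrange multipliers
x_dv_cell = num2cell(x_dv);

% Accumulate the weighted sum directly; no 3-D temporary is needed
H = zeros(dvs);
for k = 1:num
    H = H + lambda(k) * hess_funcs{k}(x_dv_cell{:});
end
```

This avoids repeating strcat, num2str and str2func for every function on every Hessian evaluation, and it avoids growing H_temp inside the loop.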

Sweeping initial conditions for a set of ODEs using parfor

I am currently trying to use parfor to sweep across a range of initial conditions for a set of differential equations solved by ode45. The code works fine using two nested for loops but I was hoping parfor could make the process more efficient. Unfortunately, I have run into an issue where the solver is able to solve one of the combinations in the matrix representing initial conditions across a range of variables, but the others seem to have their initial values all set at 0, instead of the values specified by the initial conditions. It may have something to do with the fact that I need to create a matrix of zeros ('P') that the results will be written into, perhaps overwriting the initial conditions(?) Any help would be greatly appreciated.
Thanks,
Kyle
function help(C, R)
A = 0.01;
B = 0.00;
C = [0.001,0.01];
D = 0.00;
R = [1e-10,1e-9];
[CGrid,RGrid] = meshgrid(C,R);
parfor ij = 1 : numel(CGrid)
c2 = [A; B; CGrid(ij); D; RGrid(ij)];
[t,c] = ode45('ode_sys',[0:1:300],c2);
for k=1:length([0:1:300])
for l=1:length(c2)
if c(k,l)<0
c(k,l)=0;
end
end
end
P = zeros(301,5,numel(R),numel(C));
temp = zeros(301,5);
temp(:,1) = c(:,1);
temp(:,2) = c(:,2);
temp(:,3) = c(:,3);
temp(:,4) = c(:,4);
temp(:,5) = c(:,5);
P(:,:,ij)=temp;
parsave('data.mat', P);
end
end
You have one error, and a few opportunities to simplify the code.
In the parfor loop, you have this line P = zeros(301,5,numel(R),numel(C)); which overwrites P with all zeros at each iteration. Put this before the parfor loop.
The first double-for loop that makes negative elements of c zero can be done using max(c,0), which should be more efficient. You can also do P(:,:,ij)=c(:,1:5) directly.
So you can replace your parfor loop with
P = zeros(301,5,numel(R),numel(C));
for ij = 1 : numel(CGrid)
c2 = [A; B; CGrid(ij); D; RGrid(ij)];
[t,c] = ode45('ode_sys',0:300,c2);
c = max(c,0);
P(:,:,ij) = c(:,1:5);
parsave('data.mat',P);
end
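If you still want the sweep to run in parallel, one possible arrangement is sketched below: preallocate P as a 3-D array before the loop, write one slice per parfor iteration, and reshape and save only after the loop has finished. The odefun handle is a hypothetical stand-in for ode_sys (a simple decay) so that the snippet is self-contained. Without the Parallel Computing Toolbox, parfor simply runs as an ordinary loop.

```matlab
A = 0.01; B = 0.00; D = 0.00;
C = [0.001, 0.01];
R = [1e-10, 1e-9];
[CGrid, RGrid] = meshgrid(C, R);
tspan = 0:300;

% Hypothetical stand-in for ode_sys: simple exponential decay
odefun = @(t, y) -0.01 * y;

% Preallocate once, before the loop; each iteration fills one slice
P = zeros(numel(tspan), 5, numel(CGrid));
parfor ij = 1:numel(CGrid)
    c0 = [A; B; CGrid(ij); D; RGrid(ij)];
    [~, c] = ode45(odefun, tspan, c0);
    P(:, :, ij) = max(c, 0);      % clamp negative values to zero
end

% Restore the (R, C) grid layout; save here, after the loop
P = reshape(P, numel(tspan), 5, numel(R), numel(C));
```

Saving once after the loop also avoids rewriting data.mat on every iteration.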

Non Local Means Filter Optimization in MATLAB

I'm trying to write a Non-Local Means filter for an assignment. I've written the code in two ways, but the method I'd expect to be quicker is much slower than the other method.
Method 1: (This method is slower)
for i = 1:size(I,1)
tic
sprintf('%d/%d',i,size(I,1))
for j = 1:size(I,2)
w = exp((-abs(I-I(i,j))^2)/(h^2));
Z = sum(sum(w));
w = w/Z;
sumV = w .* I;
NL(i,j) = sum(sum(sumV));
end
toc
end
Method 2: (This method is faster)
for i = 1:size(I,1)
tic
sprintf('%d/%d',i,size(I,1))
for j = 1:size(I,2)
Z = 0;
for k = 1:size(I,1)
for l = 1:size(I,2)
w = exp((-abs(I(i,j)-I(k,l))^2)/(h^2));
Z = Z + w;
end
end
sumV = 0;
for k = 1:size(I,1)
for l = 1:size(I,2)
w = exp((-abs(I(i,j)-I(k,l))^2)/(h^2));
w = w/Z;
sumV = sumV + w * I(k,l);
end
end
NL(i,j) = sumV;
end
toc
end
I really thought that MATLAB would be optimized for matrix operations. Is there a reason it isn't in this code? The difference is pretty large. For a 512x512 image, with h = 0.05, one iteration of the outer loop takes 24-28 seconds for Method 1 and 10-12 seconds for Method 2.
The two methods are not doing the same thing. In Method 2, the term abs(I(i,j)-I(k,l)) in the w= expression is being squared, which is fine because the term is just a single numeric value.
However, in Method 1, the term abs(I-I(i,j)) is actually a matrix (The numeric value I(i,j) is being subtracted from every element in the matrix I, returning a matrix again). So, when this term is squared with the ^ operator, matrix multiplication is happening. My guess, based on Method 2, is that this is not what you intended. If instead, you want to square each element in that matrix, then use the .^ operator, as in abs(I-I(i,j)).^2
Matrix multiplication is a much more computation intensive operation, which is likely why Method 1 takes so much longer.
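As a quick check, here is a sketch that computes each pixel both ways on a small random image: the vectorized form from Method 1 with .^ in place of ^, and a scalar loop in the style of Method 2. The 8x8 image, h = 0.5, and the names NL_vec, NL_loop and maxDiff are assumptions made for illustration.

```matlab
rng(0);                        % reproducible test image
I = rand(8);                   % small stand-in for the real image
h = 0.5;

NL_vec  = zeros(size(I));      % preallocate both outputs
NL_loop = zeros(size(I));
for i = 1:size(I,1)
    for j = 1:size(I,2)
        % Vectorized weights: .^ squares each element of the matrix
        w = exp(-abs(I - I(i,j)).^2 / h^2);
        NL_vec(i,j) = sum(w(:) .* I(:)) / sum(w(:));

        % Scalar reference, accumulating over all pixels
        Z = 0;
        sumV = 0;
        for k = 1:numel(I)
            wk = exp(-abs(I(i,j) - I(k))^2 / h^2);
            Z = Z + wk;
            sumV = sumV + wk * I(k);
        end
        NL_loop(i,j) = sumV / Z;
    end
end

maxDiff = max(abs(NL_vec(:) - NL_loop(:)));
```

With the element-wise power the two results should agree to rounding error, which is not the case when ^ performs a matrix product.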
My guess is that you have not preallocated NL, and that both methods are in the same function (or are scripts, and you didn't clear NL between runs). This would have slowed the first method by quite a bit.
Try the following: Create a function for both methods. Run each method once. Then use the profiler to see where each function spends most of its time.
A much faster implementation (Vectorized) could be achieved using im2col:
Create a Vector out of each neighborhood.
Using predefined indices calculate the distance between each patch.
Sum over the values and the weights using sum function.
This method will work with no loop at all.