Suppose I have a matrix A and I want to apply a function f to each of its elements. I can then use f(A), if f is vectorized or arrayfun(f,A) if it's not.
But what if I had a function that depends on the entry and its indices: f = @(i,j,x) something. How do I apply this function to the matrix A without using nested for loops like the following?
for j=1:size(A,2)
for i=1:size(A,1)
fA(i,j) = f(i,j,A(i,j));
end
end
I'd like to consider the function f to be vectorized. Hints on shorter notation for non-vectorized functions are welcome, though.
I have read your answers and came up with another idea using indexing, which was the fastest in my test. Here is my test script:
%// Test function
f = @(i,j,x) i.*x + j.*x.^2;
%// Initialize times
tfor = 0;
tnd = 0;
tsub = 0;
tmy = 0;
%// Do the calculation 100 times
for it = 1:100
%// Random input data
A = rand(100);
%// Clear all variables
clear fA1 fA2 fA3 fA4;
%// Use the for loop
tic;
fA1(size(A,1),size(A,2)) = 0;
for j=1:size(A,2)
for i=1:size(A,1)
fA1(i,j) = f(i,j,A(i,j));
end
end
tfor = tfor + toc;
%// Use ndgrid, like @Divakar suggested
clear I J;
tic;
[I,J] = ndgrid(1:size(A,1),1:size(A,2));
fA2 = f(I,J,A);
tnd = tnd + toc;
%// Test if the calculation is correct
if max(max(abs(fA2-fA1))) > 0
max(max(abs(fA2-fA1)))
end
%// Use ind2sub, like @DennisKlopfer suggested
clear I J;
tic;
[I,J] = ind2sub(size(A),1:numel(A));
fA3 = arrayfun(f,reshape(I,size(A)),reshape(J,size(A)),A);
tsub = tsub + toc;
%// Test if the calculation is correct
if max(max(abs(fA3-fA1))) > 0
max(max(abs(fA3-fA1)))
end
%// My suggestion using indexing
clear sA1 sA2 ssA1 ssA2;
tic;
sA1=size(A,1);
ssA1=1:sA1;
sA2=size(A,2);
ssA2=1:sA2;
fA4 = f(ssA1(ones(1,sA2),:)', ssA2(ones(1,sA1),:), A);
tmy = tmy + toc;
%// Test if the calculation is correct
if max(max(abs(fA4-fA1))) > 0
max(max(abs(fA4-fA1)))
end
end
%// Print times
tfor
tnd
tsub
tmy
I get the result
tfor =
0.6813
tnd =
0.0341
tsub =
10.7477
tmy =
0.0171
Assuming that the function is vectorized (no dependencies or recursion involved), as mentioned in the comments earlier, you could use ndgrid to create 2D meshes that correspond to the two nested loop iterators i and j and have the same size as A. When these are fed to the function f, it operates on the 2D input arrays in a vectorized manner. Thus, the implementation would look something like this -
[I,J] = ndgrid(1:size(A,1),1:size(A,2));
out = f(I,J,A);
Sample run -
>> f = @(i,j,k) i.^2+j.^2+sin(k);
A = rand(4,5);
for j=1:size(A,2)
for i=1:size(A,1)
fA(i,j) = f(i,j,A(i,j));
end
end
>> fA
fA =
2.3445 5.7939 10.371 17.506 26.539
5.7385 8.282 13.538 20.703 29.452
10.552 13.687 18.076 25.804 34.012
17.522 20.684 25.054 32.13 41.331
>> [I,J] = ndgrid(1:size(A,1),1:size(A,2)); out = f(I,J,A);
>> out
out =
2.3445 5.7939 10.371 17.506 26.539
5.7385 8.282 13.538 20.703 29.452
10.552 13.687 18.076 25.804 34.012
17.522 20.684 25.054 32.13 41.331
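On MATLAB R2016b or later (and in Octave), implicit expansion offers a variant that avoids building the full index grids: pass the row indices as a column vector and the column indices as a row vector. This is a minimal sketch, assuming f uses only element-wise operators; the variable names rowIdx, colIdx, out_grid and out_bcast are illustrative.

```matlab
% Test function taken from the question's benchmark (element-wise operators only)
f = @(i,j,x) i.*x + j.*x.^2;
A = rand(4,5);

% Reference result from the ndgrid approach
[I,J] = ndgrid(1:size(A,1), 1:size(A,2));
out_grid = f(I,J,A);

% Column vector of row indices and row vector of column indices;
% implicit expansion broadcasts them against A
rowIdx = (1:size(A,1)).';
colIdx = 1:size(A,2);
out_bcast = f(rowIdx, colIdx, A);

% Largest difference between the two results
max(abs(out_grid(:) - out_bcast(:)))
```

On older MATLAB releases the same broadcasting would have to be written with bsxfun inside f, so the ndgrid form is likely the more portable choice there.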
Using arrayfun(), ind2sub() and reshape() you can create index arrays matching the shape of A, which makes arrayfun() applicable. There might be a better version, as this feels a little bit like a hack, but it should work on both vectorized and unvectorized functions.
[I,J] = ind2sub(size(A),1:numel(A));
fA = arrayfun(f,reshape(I,size(A)),reshape(J,size(A)),A)
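To illustrate the non-vectorized case, here is a small sketch with a hypothetical function g whose body uses 1:i and therefore only accepts scalar indices. The index arrays are built with ndgrid here, which should give the same arrays as the ind2sub and reshape combination.

```matlab
% Hypothetical scalar-only function: 1:i is not meaningful for an array i
g = @(i,j,x) sum((1:i).*x) + j;
A = rand(3,4);

% Index arrays with the same shape as A
[I,J] = ndgrid(1:size(A,1), 1:size(A,2));

% arrayfun calls g once per element with scalar arguments
gA = arrayfun(g, I, J, A);

% Closed form of the same expression, used only as a check:
% sum((1:i).*x) equals x*i*(i+1)/2
gA_check = A.*I.*(I+1)/2 + J;
max(abs(gA(:) - gA_check(:)))
```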
I have the following data:
N = 10^3;
x = randn(N,1);
y = randn(N,1);
z = randn(N,1);
f = x.^2+y.^2+z.^2;
Now I want to split this continuous 3D space into nB bins.
nB = 20;
[~,~,x_bins] = histcounts(x,nB);
[~,~,y_bins] = histcounts(y,nB);
[~,~,z_bins] = histcounts(z,nB);
And put in each cube average f or nan if no observations happen in the cube:
F = nan(nB,nB,nB);
for iX = 1:nB
for iY = 1:nB
for iZ = 1:nB
idx = (x_bins==iX)&(y_bins==iY)&(z_bins==iZ);
F(iX,iY,iZ) = mean(f(idx));
end
end
end
isosurface(F,0.5)
This code does what I want. My problem is the speed. This code is extremely slow when N > 10^5 and nB = 100.
How can I speed up this code?
I also tried the accumarray() function:
subs=([x_bins,y_bins,z_bins]);
F2 = accumarray(subs,f,[],@mean);
all(F(:) == F2(:)) % false
However, this code produces a different result.
The problem with the code in the OP is that it tests all elements of the data for each element in the output array. The output array has nB^3 elements, the data has N elements, so the algorithm is O(N*nB^3). Instead, one can loop over the N elements of the input, and set the corresponding element in the output array, which is an operation O(N) (2nd code block below).
The accumarray solution in the OP needs to use the fillval parameter, set to NaN (3rd code block below).
To compare the results, one needs to explicitly test that both arrays have NaN in the same locations, and have equal non-NaN values elsewhere:
all( ( isnan(F(:)) & isnan(F2(:)) ) | ( F(:) == F2(:) ) )
% \-------same NaN values------/ \--same values--/
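When the two arrays are computed with different summation orders, exact equality may fail on rounding alone. A tolerance-based check is one option; the sketch below uses small made-up arrays Fa and Fb and a tolerance chosen purely for illustration. Where exact equality is wanted, isequaln(F, F2) compares two arrays while treating NaNs as equal.

```matlab
% Small made-up arrays standing in for F and F2
Fa = [1 NaN; 3 4];
Fb = [1 NaN; 3 4 + 1e-15];
tol = 1e-12;                       % illustrative tolerance, adjust to the data

% NaNs must sit in the same places ...
sameNaN = isequal(isnan(Fa), isnan(Fb));
% ... and the remaining values must agree within the tolerance
% (&& short-circuits, so values are only compared when the NaN masks match)
sameVals = sameNaN && all(abs(Fa(~isnan(Fa)) - Fb(~isnan(Fb))) <= tol);
```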
Here is the code. All three versions produce identical results. Timings are from Octave 4.4.1 (no JIT); in MATLAB the loop code should be faster. (Using input data from the OP, with N=10^3 and nB=20.)
%% OP's code, O(N*nB^3)
tic
F = nan(nB,nB,nB);
for iX = 1:nB
for iY = 1:nB
for iZ = 1:nB
idx = (x_bins==iX)&(y_bins==iY)&(z_bins==iZ);
F(iX,iY,iZ) = mean(f(idx));
end
end
end
toc
% Elapsed time is 1.61736 seconds.
%% Looping over input, O(N)
tic
s = zeros(nB,nB,nB);
c = zeros(nB,nB,nB);
ind = sub2ind([nB,nB,nB],x_bins,y_bins,z_bins);
for ii=1:N
s(ind(ii)) = s(ind(ii)) + f(ii);
c(ind(ii)) = c(ind(ii)) + 1;
end
F2 = s ./ c;
toc
% Elapsed time is 0.0606539 seconds.
%% Other alternative, using accumarray
tic
ind = sub2ind([nB,nB,nB],x_bins,y_bins,z_bins);
F3 = accumarray(ind,f,[nB,nB,nB],@mean,NaN);
toc
% Elapsed time is 0.14113 seconds.
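A further variant, sketched here on tiny made-up data, calls accumarray twice with its default summation: once to add up f per bin and once to count observations. Dividing the two gives the mean, and empty bins become NaN through 0/0. Whether this is faster than passing @mean depends on the data and the version, so it is worth timing on the real input.

```matlab
nB = 2;
% Made-up bin indices and values (5 observations)
x_bins = [1; 1; 2; 2; 1];
y_bins = [1; 1; 2; 2; 2];
z_bins = [1; 1; 1; 1; 2];
f      = [1; 3; 10; 20; 7];

subs = [x_bins, y_bins, z_bins];
s = accumarray(subs, f, [nB nB nB]);   % sum of f in each bin
c = accumarray(subs, 1, [nB nB nB]);   % number of observations in each bin
F4 = s ./ c;                           % mean per bin; 0/0 gives NaN for empty bins

% Fill-value form with @mean, for comparison
F5 = accumarray(subs, f, [nB nB nB], @mean, NaN);
```

Passing the subscripts as a three-column matrix also removes the need for the sub2ind call.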
I have a problem with this nested for loop:
eta = [1e-3:1e-2:9e-1];
HN =5;
for ii = 1:numel(eta)
for v = 1:HN
DeltaEta(v) = eta(ii)*6;
end
end
This code gives the output of DeltaEta as a 1x5 vector.
However, I want the result to be a 90x5 matrix where DeltaEta is computed 5 times for each value of eta.
I believe the problem is with the way I am nesting the loops.
It seems trivial but I can't get the desired output, any leads would be appreciated.
You're assigning outputs to DeltaEta(v), where v = 1,2,..,HN. So you're only ever assigning to
DeltaEta(1), DeltaEta(2), ..., DeltaEta(5)
You can solve this with a 2D matrix output, indexing on ii too...
eta = [1e-3:1e-2:9e-1];
HN = 5;
DeltaEta = NaN( numel(eta), HN );
for ii = 1:numel(eta)
for v = 1:HN
DeltaEta(ii,v) = eta(ii)*6;
end
end
% optional reshape at end to get column vector
DeltaEta = DeltaEta(:);
Note, there is no change within your inner loop - DeltaEta is the same for all values of v. That means you can get rid of the inner loop
eta = [1e-3:1e-2:9e-1];
HN = 5;
DeltaEta = NaN( numel(eta), HN );
for ii = 1:numel(eta)
DeltaEta( ii, : ) = eta(ii) * 6;
end
And now we can see a way to actually remove the outer loop too
eta = [1e-3:1e-2:9e-1];
HN = 5;
DeltaEta = repmat( eta*6, HN, 1 ).';
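An equivalent form, shown here as a sketch, builds the same matrix as an outer product of the scaled column vector with a row of ones, which avoids the final transpose. The names DeltaEta_outer and DeltaEta_rep are used only to compare the two forms.

```matlab
eta = 1e-3:1e-2:9e-1;
HN = 5;

% Outer product: (numel(eta) x 1) * (1 x HN)
DeltaEta_outer = (eta(:) * 6) * ones(1, HN);

% repmat version from above, for comparison
DeltaEta_rep = repmat(eta*6, HN, 1).';

% Largest difference between the two forms
max(abs(DeltaEta_outer(:) - DeltaEta_rep(:)))
```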
To answer your question as asked, you need to index on ii as well as v:
eta = [1e-3:1e-2:9e-1];
HN =5;
for ii = 1:numel(eta)
for v = 1:HN
DeltaEta(ii,v) = eta(ii)*6;
end
end
However this is in general a bad idea -- if you catch yourself using for-loops in MATLAB (particularly doubly-nested for-loops) you should consider if there might be a better way that uses MATLAB's strong vectorisation abilities.
I am trying to convert my code to run with parfor, since it takes a long time to run as it is. However, I keep getting this error. I have searched around on the website and read about people with similar problems, but none of those answers seem to fix my problem. This is my code:
r = 5;
Mu = 12.57e-9;
Nu = 12e6;
I = 1.8;
const = pi*Nu*Mu*r*I;
a = 55;
b = 69;
c = 206;
[m,n,p] = size(Lesion_Visible);
A = zeros(m,n,p);
parpool(2)
syms k
parfor J = 1:m
for I = 1:n
for K = 1:p
if Lesion_Visible(J,I,K) ~= 0
Theta = atand((J-b)/(I-a));
Rho = abs((I-a)/cosd(Theta))*0.05;
Z = abs(c-K)*0.05;
E = vpa(const*int(abs(besselj(0,Rho*k)*exp(-Z*k)*besselj(0,r*k)),0,20),5);
A (J,I,K) = E;
end
end
end
end
I'm trying to calculate the electric field at specific positions in an array, and MATLAB gives me the error "The variable A in a parfor cannot be classified". I need help. Thanks.
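parfor must be able to classify every variable in the loop body (for example as sliced or temporary), and it fails to do so for A as written. You should try to save the output of each inner loop in a temporary variable and then assign the completed result to the desired variable, A in your case.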
This should do the job-
parfor J = 1:m
B=zeros(n,p); %create a padding matrix of two dimension
for I = 1:n
C=zeros(1,p); %create a padding row vector of length p
for K = 1:p
if Lesion_Visible(J,I,K) ~= 0
Theta = atand((J-b)./(I-a));
Rho = abs((I-a)./cosd(Theta))*0.05;
Z = abs(c-K).*0.05;
E = vpa(const.*int(abs(besselj(0,Rho.*k).*exp(-Z.*k).*besselj(0,r.*k)),0,20),5);
C(K) = E; %save output of innermost loop to the padded vector C
end
end
B(I,:)=C; % save the output to dim1 I of matrix B
end
A(J,:,:)=B; % save the output to dim1 J of final matrix A
end
Go through the following for better understanding-
http://www.mathworks.com/help/distcomp/classification-of-variables-in-parfor-loops.html
http://in.mathworks.com/help/distcomp/sliced-variable.html
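The pattern can be seen in isolation in the following sketch, which replaces the symbolic integral with a made-up arithmetic expression and uses a hypothetical logical mask V in place of Lesion_Visible. Each iteration fills a local slice and writes it back with a single assignment indexed by the parfor variable. Without the Parallel Computing Toolbox, the loop should simply run like an ordinary for loop.

```matlab
m = 4; n = 3; p = 2;
% Hypothetical mask standing in for Lesion_Visible
V = mod(reshape(1:m*n*p, m, n, p), 2) == 1;
A = zeros(m, n, p);

parfor J = 1:m
    Vj = reshape(V(J,:,:), n, p);   % slice of the input for this iteration
    slice = zeros(n, p);            % local result, recreated in every iteration
    for I = 1:n
        for K = 1:p
            if Vj(I,K)
                slice(I,K) = J + 10*I + 100*K;   % placeholder for the real computation
            end
        end
    end
    A(J,:,:) = reshape(slice, 1, n, p);   % one sliced assignment per iteration
end
```

Keeping the only write to A in the form A(J,:,:) is what lets parfor treat A as a sliced output variable.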
Can anyone help vectorize this Matlab code? The specific problem is the sum and bessel function with vector inputs.
Thank you!
N = 3;
rho_g = linspace(1e-3,1,N);
phi_g = linspace(0,2*pi,N);
n = 1:3;
tau = [1 2.*ones(1,length(n)-1)];
for ii = 1:length(rho_g)
for jj = 1:length(phi_g)
% Coordinates
rho_o = rho_g(ii);
phi_o = phi_g(jj);
% factors
fc = cos(n.*(phi_o-phi_s));
fs = sin(n.*(phi_o-phi_s));
Ez_t(ii,jj) = sum(tau.*besselj(n,k(3)*rho_s).*besselh(n,2,k(3)*rho_o).*fc);
end
end
You could try to vectorize this code, possibly with bsxfun or similar, but it is questionable whether it would run any faster, since your code already uses vector math in the inner loop (even though your vectors only have length 3). The resulting code would also be very difficult to read, so you or a colleague may have no idea what it does when you look at it again in two years' time.
Before wasting time on vectorization, it is much more important that you learn about loop invariant code motion, which is easy to apply to your code. Some observations:
You do not use fs, so remove it.
The term tau.*besselj(n,k(3)*rho_s) does not depend on either loop variable, ii or jj, so it is constant. Calculate it once before the loop.
You should probably pre-allocate the matrix Ez_t.
The only terms that change during the loop are fc, which depends on jj, and besselh(n,2,k(3)*rho_o), which depends on ii. I guess that the latter takes much longer to calculate, so it is better to calculate it only N times in the outer loop rather than N*N times in the inner loop. If the calculation based on jj took more time, you could swap the for-loops over ii and jj, but that does not seem to be the case here.
The resulting code would look something like this (untested):
N = 3;
rho_g = linspace(1e-3,1,N);
phi_g = linspace(0,2*pi,N);
n = 1:3;
tau = [1 2.*ones(1,length(n)-1)];
% constant part, does not depend on ii and jj, so calculate only once!
temp1 = tau.*besselj(n,k(3)*rho_s);
Ez_t = nan(length(rho_g), length(phi_g)); % preallocate space
for ii = 1:length(rho_g)
% calculate stuff that depends on ii only
rho_o = rho_g(ii);
temp2 = besselh(n,2,k(3)*rho_o);
for jj = 1:length(phi_g)
phi_o = phi_g(jj);
fc = cos(n.*(phi_o-phi_s));
Ez_t(ii,jj) = sum(temp1.*temp2.*fc);
end
end
Initialization -
N = 3;
rho_g = linspace(1e-3,1,N);
phi_g = linspace(0,2*pi,N);
n = 1:3;
tau = [1 2.*ones(1,length(n)-1)];
Nested loops form (Copy from your code and shown here for comparison only) -
for ii = 1:length(rho_g)
for jj = 1:length(phi_g)
% Coordinates
rho_o = rho_g(ii);
phi_o = phi_g(jj);
% factors
fc = cos(n.*(phi_o-phi_s));
fs = sin(n.*(phi_o-phi_s));
Ez_t(ii,jj) = sum(tau.*besselj(n,k(3)*rho_s).*besselh(n,2,k(3)*rho_o).*fc);
end
end
Vectorized solution -
%%// Term - 1
term1 = repmat(tau.*besselj(n,k(3)*rho_s),[N*N 1]);
%%// Term - 2
[n1,rho_g1] = meshgrid(n,rho_g);
term2_intm = besselh(n1,2,k(3)*rho_g1);
term2 = transpose(reshape(repmat(transpose(term2_intm),[N 1]),N,N*N));
%%// Term -3
angle1 = repmat(bsxfun(@times,bsxfun(@minus,phi_g,phi_s')',n),[N 1]);
fc = cos(angle1);
%%// Output
Ez_t = sum(term1.*term2.*fc,2);
Ez_t = transpose(reshape(Ez_t,N,N));
Points to note about this vectorization or code simplification –
fs does not change the output of the script, Ez_t, so it could be removed for now.
The output is Ez_t, which requires three basic terms in the code:
tau.*besselj(n,k(3)*rho_s), besselh(n,2,k(3)*rho_o) and fc. These are calculated separately for vectorization as terms 1, 2 and 3 respectively.
All three terms appear to be of size 1xN. Our aim thus becomes to calculate them without loops. The two loops run N times each, giving a total loop count of NxN, so each vectorized term must hold NxN times the data it held inside the nested loops.
This is the essence of the vectorization done here, with the three terms represented by term1, term2 and fc.
In order to give a self-contained answer, I'll copy the original initialization
N = 3;
rho_g = linspace(1e-3,1,N);
phi_g = linspace(0,2*pi,N);
n = 1:3;
tau = [1 2.*ones(1,length(n)-1)];
and generate some missing data (k(3) and rho_s and phi_s in the dimension of n)
rho_s = rand(size(n));
phi_s = rand(size(n));
k(3) = rand(1);
then you can compute the same Ez_t with multidimensional arrays:
[RHO_G, PHI_G, N] = meshgrid(rho_g, phi_g, n);
[~, ~, TAU] = meshgrid(rho_g, phi_g, tau);
[~, ~, RHO_S] = meshgrid(rho_g, phi_g, rho_s);
[~, ~, PHI_S] = meshgrid(rho_g, phi_g, phi_s);
FC = cos(N.*(PHI_G - PHI_S));
FS = sin(N.*(PHI_G - PHI_S)); % not used
EZ_T = sum(TAU.*besselj(N, k(3)*RHO_S).*besselh(N, 2, k(3)*RHO_G).*FC, 3).';
You can check afterwards that both matrices are the same
norm(Ez_t - EZ_T)
I want to remove the nested for loops from my code, but I can't work out how.
k = 3;
Data = rand(100,5);
m = zeros(size(Data));
N = size(Data,2); % number of features
M = size(Data,1); % number of objects
bound = zeros(N,k+1);
MAX = max(Data);
MIN = min(Data);
for ii = 1:N
bound(ii,:) = linspace(MIN(ii), MAX(ii), k+1);
end
bound(:,end) = bound(:,end)+eps;
tic;
for ii = 1:M
for jj=1:N
for kk=1:k
if bound(jj,kk)<=Data(ii,jj) && Data(ii,jj)<bound(jj,kk+1)
m(ii,jj) = kk;
end
end
end
end
You can do away with the nesting up to a point.
At a glance, as the jj index seems to be uniform in the operation within the nested loop, you can replace
for ii = 1:M
for jj=1:N
for kk=1:k
if bound(jj,kk)<=Data(ii,jj) && Data(ii,jj)<bound(jj,kk+1)
m(ii,jj) = kk;
end
end
end
end
by simply
for ii = 1:M
for kk=1:k
m(ii,(bound(:,kk)<=Data(ii,:)' & Data(ii,:)'<bound(:,kk+1))) = kk;
end
end
This would give you the exact same result as before.
Since your longest loop is over ii=1:M, we should prioritise vectorising this one over the others. The smallest loop is over kk=1:k so this one can probably stay without worrying about it too much. You can use bsxfun to great effect in vectorisations of this sort:
for kk = 1:k
ind = bsxfun(@le, bound(:, kk)', Data) & bsxfun(@gt, bound(:, kk+1)', Data);
m(ind) = kk;
end
This gives the same result as your above code.
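If k is small, the remaining loop can also be removed by counting, for every entry, how many lower bin edges it has reached. This is a sketch under the assumption that the edges in each row of bound are increasing and that every value lies inside its column's range, as in the question; the names lowerEdges, m_loop and m_count are illustrative.

```matlab
k = 3;
Data = rand(100,5);
N = size(Data,2);
bound = zeros(N,k+1);
MAX = max(Data);
MIN = min(Data);
for ii = 1:N
    bound(ii,:) = linspace(MIN(ii), MAX(ii), k+1);
end
bound(:,end) = bound(:,end) + eps;

% Loop version from above, for reference
m_loop = zeros(size(Data));
for kk = 1:k
    ind = bsxfun(@le, bound(:,kk)', Data) & bsxfun(@gt, bound(:,kk+1)', Data);
    m_loop(ind) = kk;
end

% Lower edges arranged as 1 x N x k, then compared against Data (M x N);
% the number of lower edges not exceeding a value is its bin index
lowerEdges = permute(bound(:,1:k), [3 1 2]);
m_count = sum(bsxfun(@ge, Data, lowerEdges), 3);

isequal(m_loop, m_count)
```

The intermediate logical array has M*N*k elements, so this trades memory for the removed loop.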
Another alternative is histc(), which is specifically designed for binning:
for jj = 1:N
[~, m(:,jj)] = histc(Data(:,jj),bound(jj,:));
end
This solution is on par with bsxfun(), but it's not a very meaningful comparison because here the loop runs across columns, while with bsxfun() it runs across bounds. Therefore, as a rule of thumb, I would go with histc() if I have fewer columns than bounds, and with bsxfun() otherwise.