I'm trying to evaluate a multivariable function f(nu,delta,Omega,kappa,Gamma) over a grid of parameter values. The code is as follows:
% PREALLOCATE RESULT MATRIX
N = 10;
d = 0.1;
M = zeros(4*N/d+(2*N+1)/d,6);
% SET UP LOOP INDICES
i=1;
increm=1;
% LOOP OVER MULTIPLE VARIABLES
for nu=d:d:N
for delta=-N:d:N
for Omega=d:d:N
for kappa=d:d:N
tic
for Gamma=d:d:N
% CALCULATE THE FUNCTION
mss = ((Gamma+kappa).*((Gamma+kappa).^2+4.*(delta+(-1).*nu).^2).^(-1)+( ...
-1).*(Gamma+kappa).*((Gamma+kappa).^2+4.*(delta+nu).^2).^(-1)+( ...
kappa.^2+4.*(delta+(-1).*nu).^2).^(-1).*(((2.*Gamma+kappa).^2+4.*( ...
delta+(-1).*nu).^2).*(kappa.*(Gamma+kappa).*(2.*Gamma+kappa)+4.*(( ...
-1).*Gamma+kappa).*(delta+(-1).*nu).^2)+4.*(kappa.*(2.*Gamma+ ...
kappa).^2+4.*(4.*Gamma+kappa).*(delta+(-1).*nu).^2).*Omega.^2).*( ...
16.*(delta+(-1).*nu).^4+4.*(delta+(-1).*nu).^2.*((Gamma+kappa).^2+ ...
(2.*Gamma+kappa).^2+(-8).*Omega.^2)+((Gamma+kappa).*(2.*Gamma+ ...
kappa)+4.*Omega.^2).^2).^(-1)+(-1).*(kappa.^2+4.*(delta+nu).^2).^( ...
-1).*(((2.*Gamma+kappa).^2+4.*(delta+nu).^2).*(kappa.*(Gamma+ ...
kappa).*(2.*Gamma+kappa)+4.*((-1).*Gamma+kappa).*(delta+nu).^2)+ ...
4.*(kappa.*(2.*Gamma+kappa).^2+4.*(4.*Gamma+kappa).*(delta+nu).^2) ...
.*Omega.^2).*(16.*(delta+nu).^4+4.*(delta+nu).^2.*((Gamma+kappa) ...
.^2+(2.*Gamma+kappa).^2+(-8).*Omega.^2)+((Gamma+kappa).*(2.*Gamma+ ...
kappa)+4.*Omega.^2).^2).^(-1)).^(-1).*((Gamma+kappa).*((Gamma+ ...
kappa).^2+4.*(delta+nu).^2).^(-1)+(kappa.^2+4.*(delta+nu).^2).^( ...
-1).*(((2.*Gamma+kappa).^2+4.*(delta+nu).^2).*(kappa.*(Gamma+ ...
kappa).*(2.*Gamma+kappa)+4.*((-1).*Gamma+kappa).*(delta+nu).^2)+ ...
4.*(kappa.*(2.*Gamma+kappa).^2+4.*(4.*Gamma+kappa).*(delta+nu).^2) ...
.*Omega.^2).*(16.*(delta+nu).^4+4.*(delta+nu).^2.*((Gamma+kappa) ...
.^2+(2.*Gamma+kappa).^2+(-8).*Omega.^2)+((Gamma+kappa).*(2.*Gamma+ ...
kappa)+4.*Omega.^2).^2).^(-1));
% STORE THE RESULT
M(i,:) = [mss nu delta Omega kappa Gamma];
i = i+increm;
end
end
toc
end
end
end
save M
However, the preallocation does not prevent each iteration from taking longer than the previous one. When I run the code and interrupt it prematurely, the iterations take
Elapsed time is 0.003354 seconds.
Elapsed time is 0.006374 seconds.
Elapsed time is 0.009043 seconds.
Elapsed time is 0.012092 seconds.
Elapsed time is 0.015287 seconds.
Elapsed time is 0.019239 seconds.
Elapsed time is 0.023898 seconds.
Elapsed time is 0.035345 seconds.
Elapsed time is 0.046675 seconds.
Elapsed time is 0.056000 seconds.
Elapsed time is 0.066323 seconds.
Elapsed time is 0.072178 seconds.
Elapsed time is 0.075174 seconds.
Elapsed time is 0.081095 seconds.
Elapsed time is 0.095016 seconds.
Elapsed time is 0.095214 seconds.
Elapsed time is 0.100089 seconds.
Elapsed time is 0.104286 seconds.
Elapsed time is 0.109454 seconds.
Elapsed time is 0.115368 seconds.
Elapsed time is 0.124278 seconds.
Elapsed time is 0.131521 seconds.
Elapsed time is 0.135023 seconds.
Elapsed time is 0.137370 seconds.
Elapsed time is 0.145331 seconds.
Elapsed time is 0.163449 seconds.
Elapsed time is 0.162654 seconds.
Elapsed time is 0.159628 seconds.
Elapsed time is 0.166585 seconds.
I don't see how the change of variables itself could cause this, because they only vary by the amount d, which shouldn't make it much harder to calculate the new value of mss than in the previous iteration.
You aren't pre-allocating a big enough array!
If you do the following nested loop
for ii=1:3
for jj = 1:4
doSomething()
end
end
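doSomething is executed 3*4=12 times. With your allocation scheme, you would be allocating only 3+4=7 rows.
In other words, the number of rows is the product of the loop lengths, not their sum. Note also that delta=-N:d:N takes 2*N/d+1 values. Change your preallocation to
M = zeros((N/d)^4*(2*N/d+1),6);
so that M no longer has to grow inside the loop (provided an array of that size fits in memory).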
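To check how many rows the loops actually produce, one option is to count the elements of each range instead of deriving the formula by hand. This is a minimal sketch that reuses N and d from the question; the names n_nu, n_delta, nRows and bytesNeeded are introduced here for illustration only.

```matlab
N = 10;
d = 0.1;

% Number of values each loop variable takes
n_nu    = numel(d:d:N);     % the same range is used for Omega, kappa and Gamma
n_delta = numel(-N:d:N);

% Total number of innermost iterations = number of rows needed in M
nRows = n_nu^4 * n_delta;

% Memory for a double matrix with 6 columns (8 bytes per element)
bytesNeeded = nRows * 6 * 8;

fprintf('rows: %g, memory: %.1f GB\n', nRows, bytesNeeded/1e9);
```

With these ranges the count comes to roughly 2e10 rows, which is close to a terabyte for M. Holding every result in one in-memory matrix is therefore unlikely to be practical; a coarser grid or writing results out in chunks may be needed.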
I am working on computing some features of an image data set and saving the features for later use. Below is the code:
tic
l = 9907 % size of image data set
% pre-allocating space for variables in the for loop
Icolor = cell(1,l);
Iwave = cell(1,l);
IglrlFeatures = cell(1,l);
for i = 1:l % l = size of image data set = 9907
IDB{1,i} = imread(strcat(path,strcat(num2str(i),'.jpg')));
Icolor{1,i} = colorMoments(IDB{1,i}); % 6-features in each cell
Iwave{1,i} = waveletTransform(IDB{1,i}); % 8-features in each cell
IglrlFeatures{1,i} = textureFeatures(IDB{1,i}); % 44-features in each cell
ICW{1,i} = [Icolor{1,i} Iwave{1,i} IglrlFeatures{1,i}];
end
toc
The computation time of each function on a single image is:
colorMoments(single_image) = Elapsed time is 0.009689 seconds.
waveletTransform(single_image) = Elapsed time is 0.018069 seconds.
textureFeatures(single_image) = Elapsed time is 0.022902 seconds.
l = data set size = 9907 images
Computational times for different data set sizes (l):
l = 10; Elapsed time is 0.402629 seconds.
l = 100; Elapsed time is 2.233971 seconds.
l = 1000; Elapsed time is 21.178395 seconds.
l = 2000; Elapsed time is 44.510071 seconds.
l = 5000; Elapsed time is 111.393866 seconds.
l = 9907; Elapsed time is 238.924998 seconds (approximately 4 minutes).
I want to decrease this computation time. Any suggestions?
Thanks,
Gopi
The measured times indicate that the run time grows linearly with the number of images, i.e. the computational complexity is of order O(n). I doubt that the order can be reduced further for this type of problem, since every image has to be processed once. So, at best, we can only hope for a constant-factor speed-up.
One thing you should look into is whether the code is using multiple processor cores. If not, try restructuring the for-loop to be able to use parfor instead.
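The restructuring could look roughly like the sketch below. It is only an illustration under assumptions: featureFcn is a hypothetical stand-in for colorMoments, waveletTransform and textureFeatures, and random matrices replace the imread calls. parfor needs the Parallel Computing Toolbox to run in parallel; without it, the loop should simply run serially.

```matlab
% Hypothetical stand-in for the per-image feature functions in the question
featureFcn = @(img) [mean(img(:)), std(double(img(:)))];

l = 8;                          % small data set size, for illustration only
images = cell(1, l);
for i = 1:l
    images{i} = rand(16, 16);   % placeholder data instead of imread
end

ICW = cell(1, l);               % preallocate the output cell array
parfor i = 1:l
    % Each iteration reads only images{i} and writes only ICW{i},
    % so the iterations are independent of each other
    ICW{i} = featureFcn(images{i});
end
```

In the original loop, the imread call and the three feature functions would all go inside the parfor body, with ICW{i} as the only output that is indexed by the loop variable.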
I have a 1×n cell array of values, and I want to count the values that fall within a given range in MATLAB.
I implemented it as follows:
count1 = length(find(h{1}<ti & h{1}>ti-INT));
h is my cell array, and I want the count of values between ti-INT and ti.
This implementation gives the correct result, but it is very slow.
Is there a faster function available for this operation?
Sum the occurrence flags:
count1 = sum(h{1}<ti & h{1}>ti-INT);
I know that I will upset the Gods of MATLAB for using tic and toc for code timing, but:
x = rand(10^7,1);
tic; sum(x>0.5); toc;
tic; nnz(x>0.5); toc;
tic; length(find(x>0.5)); toc;
shows on several runs that sum() is twice as fast as nnz(), and 3 times faster than length(find()), e.g.:
Elapsed time is 0.049855 seconds.
Elapsed time is 0.120931 seconds.
Elapsed time is 0.162025 seconds.
This is on my R2012a running on a Windows machine with i5 + 3Gb RAM.
Later edit:
For counting the elements from the entire cell array, one may use:
count_all = sum(cellfun(@(x) sum(x<ti & x>ti-INT), h));
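As a small worked example of both counts, the sketch below uses made-up data; the contents of h and the values of ti and INT are assumptions chosen so that the result is easy to check by hand.

```matlab
% Hypothetical example data: a 1-by-3 cell array of numeric vectors
h   = {[1 2 3 4 5], [2.5 3.5 10], [0 3]};
ti  = 4;      % upper bound (exclusive)
INT = 2;      % window width, so the open range is (2, 4)

% Count for the first cell only
count1 = sum(h{1} < ti & h{1} > ti - INT);

% Count per cell, then the total over the whole cell array
per_cell  = cellfun(@(x) sum(x < ti & x > ti - INT), h);
count_all = sum(per_cell);
```

Because both comparisons are strict, values equal to ti or ti-INT are not counted. Here only 3 qualifies in the first cell, 2.5 and 3.5 in the second, and 3 in the third.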
I am comparing the performance of three methods for finding the index of the data point closest to each point I click with ginput(). The first method computes the distance from each click to every timestamp and then finds the nearest index in a separate step. The second does the same through a for loop. The third is an exact copy of the first, except that the two steps are combined into one.
ax = subplot(1,1,1)
plot(timestamps,datavalue)
hzoom = zoom(ax);
hzoom.Motion = 'horizontal';
[x, ~] = ginput(2);
%1)
tic;
tmp = abs(bsxfun(@minus,x,datenum(timestamps).'));
[~, idx1] = min(tmp,[],2);
toc;
%2)
tic;
for r = 1:length(x)
val = x(r);
tmp = abs(datenum(datenum(timestamps - val)));
[~, idx2] = min(tmp);
closest_indx(r) = idx2;
end
toc;
%3)
tic;
[~, idx3] = min(abs(bsxfun(@minus,x,datenum(timestamps).')),[],2);
toc;
Now when I look at the results
test1)
Elapsed time is 0.009182 seconds.
Elapsed time is 0.019211 seconds.
Elapsed time is 0.011261 seconds.
test2)
Elapsed time is 0.012625 seconds.
Elapsed time is 0.022681 seconds.
Elapsed time is 0.017999 seconds.
test3)
Elapsed time is 0.013053 seconds.
Elapsed time is 0.020170 seconds.
Elapsed time is 0.015248 seconds.
test4)
Elapsed time is 0.011613 seconds.
Elapsed time is 0.018644 seconds.
Elapsed time is 0.015952 seconds.
The first method takes less time even though it has the extra step of storing all the values in a 'tmp' matrix. Does anyone have a good explanation for this?
I need to use etime to calculate how many seconds a computation takes. I thought about something like this:
t1 = datetime('now');
% Do some computation
t2 = datetime('now');
temp = etime(t2, t1)
But I am getting this error message:
Error using etime(line 40), Index exceeds matrix dimensions.
What's wrong with it?
The inputs to etime are expected to be vectors that are the same format as the output of clock and not datetime objects.
t1 = clock;
t2 = clock;
elapsed = etime(t2, t1)
It is likely easier to surround your code with tic and toc which will automatically compute the elapsed time.
tmr = tic;
% do stuff
elapsed = toc(tmr);
That being said, if you want an accurate measurement of execution time, it is far better to use timeit.
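A minimal sketch of how timeit might be used is shown below. The input data x and the expression being timed are assumptions made for illustration. timeit takes a function handle with no input arguments, calls it several times, and returns a typical run time in seconds.

```matlab
x = rand(1e6, 1);            % example input data (an assumption)

% timeit expects a function handle that takes no input arguments
f = @() sum(x > 0.5);

t = timeit(f);               % typical run time in seconds
fprintf('sum(x > 0.5) takes about %.6f s\n', t);
```

Wrapping the computation in an anonymous function lets timeit handle warm-up and repetition itself, which a single tic/toc pair does not do.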
I have the following code from a previous question and I need help optimizing the code for speed. This is the code:
function OfdmSym()
N = 64
n = 1000
symbol = ones(Complex{Float64}, n, 64)
data = ones(Complex{Float64}, 1, 48)
unused = zeros(Complex{Float64}, 1, 12)
pilot = ones(Complex{Float64}, 1, 4)
s = [-1-im -1+im 1-im 1+im]
for i=1:n # generate 1000 symbols
for j = 1:48 # generate 48 complex data symbols whose basis is s
r = rand(1:4) # 1, 2, 3, or 4
data[j] = s[r]
end
symbol[i,:]=[data[1,1:10] pilot[1] data[1,11:20] pilot[2] data[1,21:30] pilot[3] data[1,31:40] pilot[4] data[1,41:48] unused]
end
end
OfdmSym()
I appreciate your help.
First of all, I timed it with n = 100000
OfdmSym() # Warmup
for i = 1:5
@time OfdmSym()
end
and it's pretty quick as it is
elapsed time: 3.235866305 seconds (1278393328 bytes allocated, 15.18% gc time)
elapsed time: 3.147812323 seconds (1278393328 bytes allocated, 14.89% gc time)
elapsed time: 3.144739194 seconds (1278393328 bytes allocated, 14.68% gc time)
elapsed time: 3.118775273 seconds (1278393328 bytes allocated, 14.79% gc time)
elapsed time: 3.137765971 seconds (1278393328 bytes allocated, 14.85% gc time)
But I rewrote using for loops to avoid the slicing:
function OfdmSym2()
N = 64
n = 100000
symbol = zeros(Complex{Float64}, n, 64)
s = [-1-im, -1+im, 1-im, 1+im]
for i=1:n
for j = 1:48
@inbounds symbol[i,j] = s[rand(1:4)]
end
symbol[i,11] = one(Complex{Float64})
symbol[i,22] = one(Complex{Float64})
symbol[i,33] = one(Complex{Float64})
symbol[i,44] = one(Complex{Float64})
end
end
OfdmSym2() # Warmup
for i = 1:5
@time OfdmSym2()
end
which is 20x faster
elapsed time: 0.159715932 seconds (102400256 bytes allocated, 12.80% gc time)
elapsed time: 0.159113184 seconds (102400256 bytes allocated, 14.75% gc time)
elapsed time: 0.158200345 seconds (102400256 bytes allocated, 14.82% gc time)
elapsed time: 0.158469032 seconds (102400256 bytes allocated, 15.00% gc time)
elapsed time: 0.157919113 seconds (102400256 bytes allocated, 14.86% gc time)
If you look at the profiler (@profile) you'll see that most of the time is spent generating random numbers, as you'd expect, as everything else is just moving numbers around.
It's all just bits, right? This isn't clean (at all), but it runs slightly faster on my machine (which is much slower than yours so I won't bother posting my times). Is it a little faster on your machine?
function my_OfdmSym()
const n = 100000
const my_one_bits = uint64(1023) << 52
const my_sign_bit = uint64(1) << 63
my_sym = Array(Uint64,n<<1,64)
fill!(my_sym, my_one_bits)
for col = [1:10, 12:21, 23:32, 34:43, 45:52]
for row = 1:(n<<1)
if randbool() my_sym[row, col] |= my_sign_bit end
end
end
my_symbol = reinterpret(Complex{Float64}, my_sym, (n, 64))
for k in [11, 22, 33, 44]
my_symbol[:, k] = 1.0
end
for k=53:64
my_symbol[:, k] = 0.0
end
end