I have some massive matrix computation to do in MATLAB. It's nothing complicated (see below), but I'm having trouble making it efficient. What I have below works, but the computation time makes it infeasible.
for i = 1 : 100
for j = 1 : 20000
element = matrix{i}(j,1);
if element <= bigNum && element >= smallNum
count = count + 1;
end
end
end
Is there a way of making this faster? MATLAB is meant to be good at these problems so I would imagine so?
Thank you :).
count = 0
for i = 1:100
count = count + sum(matrix{i}(:,1) <= bigNum & matrix{i}(:,1) >= smallNum);
end
If your data is a single numeric matrix rather than a cell array, then this will do:
count = sum(matrix(:) >= smallNum & matrix(:) <= bigNum);
If your matrix is really huge, consider anyExceed (not a built-in function, so it has to be obtained separately). You can profile both approaches (check their running time) on your data and decide.
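For the cell-array case, a loop-free variant can be sketched as follows. The data, sizes and bounds here are made up for illustration; the sketch only relies on cellfun, vertcat and nnz, and it keeps the per-cell loop as a reference to compare against.

```matlab
% Illustrative data: 100 cells, each a 20000-by-3 matrix (sizes assumed).
matrix = cell(1, 100);
for i = 1:numel(matrix)
    matrix{i} = rand(20000, 3);
end
smallNum = 0.25;   % assumed lower bound
bigNum   = 0.75;   % assumed upper bound

% Pull the first column out of every cell and stack them into one vector.
firstCols = cellfun(@(m) m(:, 1), matrix, 'UniformOutput', false);
allVals   = vertcat(firstCols{:});

% Count the elements inside [smallNum, bigNum] in one vectorized step.
count = nnz(allVals >= smallNum & allVals <= bigNum);

% Reference: the per-cell loop, for comparison.
countLoop = 0;
for i = 1:numel(matrix)
    col = matrix{i}(:, 1);
    countLoop = countLoop + sum(col >= smallNum & col <= bigNum);
end
```

Wrapping each variant in tic/toc on your own data is the simplest way to see which one is faster for your sizes.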
I have a 2048x2048 image and I want to reset the pixel values according to a certain condition. The problem is that it takes hours (if not days) for the code to finish. Is there a way to shorten the run time?
This is the function:
function ProcImage = ProcessImage(X,length,width)
for i=1:length
for j=1:width
if X(j,i)<=0.025*(max(max(X(:,:))))
X(j,i)=0;
else
if X(j,i)>=0.95*(max(max(X(:,:))))
X(j,i)=(max(max(X(:,:))));
end
end
end
end
ProcImage=X(1:end,1:end);
thanks in advance.
Vectorize it. You don't need to compute the maximum value of X on every iteration, since it will be the same throughout. Compute it once, store that value, then use it later. You can also do away with the loops by using element-wise logical operations and matrix indexing. Here's a simplified version that should be much faster:
maxX = max(X(:));
X(X <= 0.025.*maxX) = 0;
X(X >= 0.95.*maxX) = maxX;
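To convince yourself that the vectorized version does the same thing as the loop, you can compare both on a small test matrix. This is a sketch only: magic(8)/64 is an arbitrary stand-in for the image, and Y and Z are names introduced here.

```matlab
X = magic(8) / 64;           % small stand-in "image", values in (0, 1]
maxX = max(X(:));            % computed once, outside any loop

% Vectorized thresholding with logical indexing.
Y = X;
Y(Y <= 0.025 * maxX) = 0;
Y(Y >= 0.95  * maxX) = maxX;

% Loop version with the maximum hoisted out of the loop.
Z = X;
for i = 1:size(Z, 2)
    for j = 1:size(Z, 1)
        if Z(j, i) <= 0.025 * maxX
            Z(j, i) = 0;
        elseif Z(j, i) >= 0.95 * maxX
            Z(j, i) = maxX;
        end
    end
end

same = isequal(Y, Z);        % true if both approaches agree
```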
If your image is a grayscale image whose values are integers in the range 0 to 255, here is a possible solution using a lookup table:
m = max(X(:));
tbl = 0:255;
tbl(tbl<=0.025*m)=0;
tbl(tbl>=0.95*m)=m;
X = tbl(int16(X)+1);
In the code below, I want to ensure that there is an even chance of theImageRand being equal to theImage or theImage2, but it seems to me that between 1 and 100 more numbers are congruent to 1 mod 2 than to 0 mod 2, so theImage would be chosen a disproportionate amount of the time.
This was the simplest idea that came to me, but maybe there is a function that can do this easier? I was also thinking that I could find a number that fits what I'm looking for and put that into randi(n).
xRand = randi(100);
if mod(xRand,2) == 1
theImageRand = theImage;
elseif mod(xRand,2) == 0
theImageRand = theImage2;
end
Please let me know if I can explain more clearly. Thanks in advance.
tl;dr
Your code does exactly what you want, but it can be simplified by using randi(2) and dropping the mod calculation. However, a few more points are worth addressing...
mod(xRand,2) to reduce 1-100 to 0/1
For xRand between 1 and 100, the result of mod(xRand,2) will be distributed equally on 0 and 1 as you can see by executing the following code:
xRand = 1:100;
xMod = mod(xRand,2);
cnt1 = sum(xMod == 1) % results in 50
cnt0 = sum(xMod == 0) % results in 50 as well
Basically, your code works as expected because randi chooses uniformly distributed numbers from 1 to 100. Subsequently, mod reduces them into a binary representation which is still uniformly distributed since the mapping is done for equal bins.
Simplification using randi(2)
The whole process of generating these uniformly distributed binary numbers can be simplified by just generating a binary set from the beginning. To achieve this, you can use randi(2) which directly gives you 1 or 2 as rayryeng pointed out in his comment to the question.
That would give you the following code:
xRand = randi(2);
if xRand == 1
theImageRand = theImage;
elseif xRand == 2
theImageRand = theImage2;
end
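The if/elseif branch can also be replaced by indexing into a cell array. A minimal sketch, with placeholder matrices standing in for the two images (the name images is introduced here for illustration):

```matlab
theImage  = zeros(4);              % placeholder for the first image
theImage2 = ones(4);               % placeholder for the second image

images = {theImage, theImage2};    % candidates to choose from
theImageRand = images{randi(2)};   % randi(2) returns 1 or 2 with equal probability
```

This scales to more than two images by adding entries to the cell array and calling randi(numel(images)).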
Is it correct?
Now we have a look at the interesting part of this question: Is the result really uniformly distributed? To check that, we can run the code N times and then analyze how many times each image has been chosen. Therefore, we assign 1 to the first image and 2 to the second image and store the result in res. After the for-loop, we take the sum of the elements where they are 1 or 2.
N = 1000000;
theImage = 1;
theImage2 = 2;
res = zeros(N,1);
for n = 1:N
xRand = randi(2);
if xRand == 1
theImageRand = theImage;
elseif xRand == 2
theImageRand = theImage2;
end
% xRand = randi(100);
% if mod(xRand,2) == 1
% theImageRand = theImage;
% elseif mod(xRand,2) == 0
% theImageRand = theImage2;
% end
res(n) = theImageRand;
end
cnt1 = sum(res==1);
cnt2 = sum(res==2);
percentage1 = cnt1/N*100 % approximately 50
percentage2 = cnt2/N*100 % approximately 50 as well
As we can see, percentage1 and percentage2 are both approximately 50, which means each image is chosen around 50% of the time. Counting the absolute difference between cnt1 and cnt2 can be misleading, because that number can be large when N is large. However, if we observe this difference over many realizations, its mean will be approximately zero. Furthermore, your original code using mod(randi(100),2) gives a 50% distribution as well. It is just not as efficient and straightforward as the solution with randi(2), which ran approximately 15% faster on my machine using R2016a.
Bottom line: I would recommend using randi(2) as proposed above, since it is more intuitive and more efficient. Any observed imbalance is due to the random process and evens out over more realizations. Look at the percentages rather than the absolute difference between the two counts.
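The same experiment can be run without the for-loop by asking randi for all N draws at once. This is a sketch of that idea; resMod and percentageOdd are names introduced here for the mod-based variant.

```matlab
N = 1000000;

% Direct approach: N draws from {1, 2}.
res = randi(2, N, 1);
percentage1 = mean(res == 1) * 100;   % share of draws equal to 1
percentage2 = mean(res == 2) * 100;   % share of draws equal to 2

% Original approach: draw from 1..100 and reduce with mod.
resMod = mod(randi(100, N, 1), 2);
percentageOdd = mean(resMod == 1) * 100;   % share of odd draws
```

Both percentage1 and percentageOdd should land close to 50, with small run-to-run fluctuations.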
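You can generate a random integer that is either zero or one and branch on it, which gives each image an even chance; in MATLAB that is randi([0 1]). With a larger range, the split between odd and even values is only exact when the range contains an even number of integers.
(In languages where the upper bound of the random-integer call is exclusive you would ask for a number between zero and two, i.e. always pass one extra; randi's bounds are inclusive.)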
1 + 1/(2^4) + 1/(3^4) + 1/(4^4) + ...
This is the infinite series whose sum I'd like to compute, so I wrote this code in MATLAB.
n = 1;
numToAdd = 1;
sum = 0;
while numToAdd > 0
numToAdd = n^(-4);
sum = sum + numToAdd;
n = n + 1;
end
disp(sum);
But I couldn't get a result because this code ran into an infinite loop. However, the code I wrote underneath worked well and took only a second.
n = 1;
oldsum = -1;
newsum = 0;
while newsum > oldsum
oldsum = newsum;
newsum = newsum + n^(-4);
n = n+1;
end
disp(newsum);
I read both versions again and googled for a while, but couldn't find the critical point. What makes the difference between these two versions? Is it a matter of the precision of double in MATLAB?
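The first version would have to continue until n^(-4) underflows to zero, i.e. drops below the smallest positive double (realmin is about 10^-308), while the second only needs the term to drop below the rounding granularity of the running sum, which is on the order of machine epsilon (about 10^-16). Machine epsilon is the gap between 1 and the next larger double, so adding anything smaller than about half of it to 1 leaves 1 unchanged.
This means the first version would need at least 10^77 iterations, while the second only needs about 10^4. In fact the first loop never ends at all: n itself stops increasing once it reaches 2^53, because n + 1 == n in double precision from there on, so the term never reaches zero.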
The problem boils down to this:
x = 1.23456789; % Some random number
xEqualsXPlusEps = (x == x + 1e-20)
ZeroEqualsEps = (0 == 1e-20)
xEqualsXPlusEps will be true, while ZeroEqualsEps is false. This is due to the way floating point arithmetic works. The value 1e-20 is smaller than the least significant bit of x, so x+1e-20 won't be larger than x. However 1e-20 is not considered equal to 0. In comparison to x, 1e-20 is relatively small, whereas in comparison to 0, 1e-20 is not small at all.
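The same effect can be checked directly with the built-in eps. A short sketch; the variable names are introduced here only to label the four checks.

```matlab
gapIsVisible   = (1 + eps) > 1;        % true: eps is the gap from 1 to the next double
quarterIsLost  = (1 + eps/4) == 1;     % true: the sum rounds back to 1
counterStalls  = (2^53 + 1) == 2^53;   % true: above 2^53, adding 1 is lost to rounding
tinyIsPositive = (1e-20 > 0);          % true: 1e-20 is tiny but not zero
```

The third line is why a loop counter stored as a double cannot count past 2^53 in steps of 1.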
To fix this problem you would have to use:
while numToAdd > tolerance %// Instead of > 0
where tolerance is some small number greater than zero.
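Put together, the first loop with a tolerance might look like the sketch below. The cutoff 1e-16 is an assumed value, total is used instead of sum so the built-in function is not shadowed, and the comparison value pi^4/90 is the known closed form of this particular series.

```matlab
tolerance = 1e-16;   % assumed cutoff, roughly machine epsilon
n = 1;
total = 0;           % running sum; "total" avoids shadowing the built-in sum
numToAdd = 1;

while numToAdd > tolerance
    numToAdd = n^(-4);
    total = total + numToAdd;
    n = n + 1;
end

exact = pi^4 / 90;           % closed form of 1 + 1/2^4 + 1/3^4 + ...
err = abs(total - exact);    % truncation plus rounding error
disp(total);
```

With this cutoff the loop stops after roughly 10^4 iterations, in line with the estimate above.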
So I have a list of 190 numbers from 1 to 19 (each number is repeated 10 times) that I need to sample 10 at a time. Within each sample of 10, I don't want the numbers to repeat. I tried incorporating a while loop, but the computation time was way too long. So far I'm at the point where I can generate the numbers and see if there are repetitions within each subset. Any ideas?
N=[];
for i=1:10
N=[N randperm(19)];
end
B=[];
for j=1:10
if length(unique(N(j*10-9:j*10)))<10
B=[B 1];
end
end
sum(B)
Below is an updated version of the code. This might show more clearly what I want (19 targets taken 10 at a time without repetition, until all 19 targets have been repeated 10 times).
nTargs = 19;
pairs = nchoosek(1:nTargs, 10);
nPairs = size(pairs, 1);
order = randperm(nPairs);
values=randsample(order,19);
targs=pairs(values,:);
Alltargs=false;
while ~Alltargs
targs=pairs(randsample(order,19),:);
B=[];
for i=1:19
G=length(find(targs==i))==10;
B=[B G];
end
if sum(B)==19
Alltargs=true;
end
end
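Here are some very simple steps to do this: shuffle the vector once, then grab 10 distinct values from it and remove them:
v = repmat(1:19,1,10);
v = v(randperm(numel(v)));
[a, idx] = unique(v); % distinct values and one index into v for each
pick = randperm(numel(a), 10); % keep only 10 of the distinct values
result = a(pick);
v(idx(pick)) = []; % remove the sampled entries from the pool
The algorithm should be fairly efficient. If you want the next 10, run the last four lines again and combine the results into a totalResult. Note that a later draw needs at least 10 distinct values left in v.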
You want to sample the numbers 1:19 randomly in blocks of 10 without repetitions. The MATLAB function randsample (Statistics Toolbox) has an optional 'replacement' argument which you can set to false if you do not want repetitions. For example:
N = [];
replacement = false;
for i = 1:19
N = [N randsample(19,10,replacement)];
end
This draws 19 samples of 10 integers each from the range [1,..,19], with no repetitions within a sample.
Edit: Here is a solution that addresses the requirement that each of the integers [1,..,19] occurs exactly 10 times, in addition to no repetition within each column / sample:
nRange = 19; nRep = 10;
valueRep = true; % true while there are repetitions
nLoops = 0; % count the number of iterations
while valueRep
l = zeros(1,nRange);
v = [];
for m = 1:nRep
v = [v, randperm(nRange)];
end
m1 = reshape(v,nRep,nRange); % one sample of nRep values per column
for n = 1:nRange % check all nRange columns
l(n) = length(unique(m1(:,n)));
end
if all(l == nRep)
valueRep = false;
end
nLoops = nLoops + 1;
end
result = m1;
This is a rejection loop, so the number of iterations varies from run to run and can be large; nLoops records how many were needed.
I think you should approach this constructively.
It's easy to find an initial set of 19 groups that fulfill your conditions just by rearranging the series 1:19: series1 = repmat(1:19,1,10); and rearranged = reshape(series1,10,19);
Then shuffle the values:
Select two random columns, copy them, and switch the values at two random positions.
Then test whether both columns still fulfill your condition, e.g. test = @(x) numel(unique(x))==10; and if yes, replace the original columns.
Just keep shuffling until your time runs out or you are happy.
Of course you might come up with more efficient shuffling or testing.
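The steps above can be sketched roughly as follows. The number of swap attempts (nSwaps) is an arbitrary choice, and groups, isValid and the other names are introduced here for illustration; this is one possible reading of the idea, not a tuned implementation.

```matlab
nTargs = 19;
nRep   = 10;
nSwaps = 5000;   % assumed number of swap attempts

% Valid starting point: each column holds 10 distinct values.
groups = reshape(repmat(1:nTargs, 1, nRep), nRep, nTargs);

% A column is valid when it contains no repeated value.
isValid = @(x) numel(unique(x)) == nRep;

for s = 1:nSwaps
    cols = randperm(nTargs, 2);     % two different columns
    rows = randi(nRep, 1, 2);       % one position in each column
    c1 = groups(:, cols(1));
    c2 = groups(:, cols(2));
    tmp = c1(rows(1));              % swap one value between the copies
    c1(rows(1)) = c2(rows(2));
    c2(rows(2)) = tmp;
    if isValid(c1) && isValid(c2)   % accept only swaps that keep both columns valid
        groups(:, cols(1)) = c1;
        groups(:, cols(2)) = c2;
    end
end

% Sanity checks: every column valid, every target used exactly nRep times.
colsOk   = all(arrayfun(@(c) isValid(groups(:, c)), 1:nTargs));
countsOk = all(accumarray(groups(:), 1) == nRep);
```

Because a swap only moves values between columns, the total count of each target stays at 10 throughout; only the within-column uniqueness has to be tested.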
I was given another solution through the MATLAB forum that works pretty well (Credit to Niklas Nylen over on the MATLAB forum). Computation time is pretty low too. It basically shuffles the numbers until there are no repetitions within every 10 values. Thanks all for your help.
y = repmat(1:19,1,10);
% Run enough iterations to get the output random enough, I selected 100000
for ii = 1:100000
% Select random index
index = randi(length(y)-1);
% Check if it is allowed to switch places
if y(index)~=y(min(index+10, length(y))) && y(index+1)~=y(max(1,index-9))
% Make the switch
yTmp = y(index);
y(index)=y(index+1);
y(index+1)=yTmp;
end
end
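Whatever shuffling method is used, the result can be verified with a short check. In this sketch y is set to the unshuffled starting vector as a stand-in; in practice you would run the check on the shuffled y. The names blocks, blockOk, allBlocksOk and countsOk are introduced here.

```matlab
y = repmat(1:19, 1, 10);          % stand-in; use the shuffled y in practice

blocks = reshape(y, 10, []);      % one block of 10 consecutive values per column

% A block is fine when its 10 values are all different.
blockOk = arrayfun(@(c) numel(unique(blocks(:, c))) == 10, 1:size(blocks, 2));
allBlocksOk = all(blockOk);

% Every number from 1 to 19 should still occur exactly 10 times.
countsOk = all(accumarray(y(:), 1) == 10);
```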
I want to make the code below faster. It takes a very long time to run, and I get this warning:
Warning: FOR loop index is too large. Truncating to 2147483647.
I need to calculate over 3^100 so... is it impossible?
function sodiv = divisorSum(n)
sodiv = 0;
for i=1:n
if (mod(n,i) == 0)
sodiv = sodiv + i;
end
end
end
function finalSum1 = formular1(N,n)
finalSum1 = 0;
for k = 1:N
finalSum1 = finalSum1 + (divisorSum(k) * divisorSum(3^n*(N-k)));
end
end
Nv=100;
nv=[1:20];
for i=1:length(nv)
tic;
nfunc1(i)=formular1(Nv,nv(i));
nt1(i)=toc;
sprintf('nt1 : %d finished, %f', i,nt1(i))
end
The purpose of this code is to check the algorithm's calculation time.
The algorithm is too general and inefficient for this particular problem.
I understand you want to sum the divisors of 3^100. But these divisors are easily determined.
S = 1 + 3 + 3^2 + 3^3 + ... + 3^100, a geometric series.
3*S = 3 + 3^2 + ... + 3^101
subtract
2*S = 3^101 - 1
S = (3^101 - 1)/2
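The closed form can be checked against brute force for a small exponent. This sketch uses b = 5 as an arbitrary test value; bruteForce and closedForm are names introduced here.

```matlab
b = 5;                                      % small exponent so brute force is cheap
n = 3^b;                                    % 243

% Trial division: indices of 1:n where the remainder is zero are the divisors.
bruteForce = sum(find(mod(n, 1:n) == 0));

% Geometric-series formula for 1 + 3 + ... + 3^b.
closedForm = (3^(b + 1) - 1) / 2;
```

Both give 364. For b = 100 the formula still evaluates instantly, but note that 3^100 is far above 2^53, so a double only holds an approximation of the exact integer.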
This code will never finish, because it is so inefficient.
For instance, divisorSum goes through all numbers from 1 to n and tests each one. Using a closed-form formula would make it run much faster.
Let's say that one needs to sum the divisors of a^b, where a is a prime number.
Instead of calculating a^b and going from 1 to a^b, it is better to go through
a^0, a^1, a^2, ..., a^b, because only these numbers are divisors. You can go even further and observe that these numbers form a geometric progression, so the sum of divisors becomes:
sum of divisors of a^b = (a^(b+1)-1) / (a-1)
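For a general n, the same formula can be applied to each prime power in the factorization and the results multiplied, since the divisor sum is multiplicative. A minimal sketch using the built-in factor, with n = 360 as an arbitrary example; sigma and bruteForce are names introduced here, and the approach is only exact while all values stay below 2^53.

```matlab
n = 360;                 % example value: 2^3 * 3^2 * 5

f = factor(n);           % prime factors with repetition, e.g. [2 2 2 3 3 5]
p = unique(f);           % distinct primes

sigma = 1;
for k = 1:numel(p)
    e = sum(f == p(k));  % exponent of the prime p(k)
    sigma = sigma * (p(k)^(e + 1) - 1) / (p(k) - 1);
end

% Brute-force reference for comparison.
bruteForce = sum(find(mod(n, 1:n) == 0));
```

Here both sigma and bruteForce come out as 1170, but the factorization-based version needs only a handful of operations instead of n modulo tests.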