I have the following vector v = [r1 r2 r3 r4 r5 ... rn], where each r is an integer.
I want to check:
if r1 not equal to r2 not equal to r3 ... not equal to rn (all elements different from each other):
print v
else (some elements are equal, others not):
print the indices of the equal elements.
I recommend using the unique function. It returns the unique values in the vector.
If you want to check that all values in your vector v are unique, I would use the following command:
everything_is_unique = length(unique(v))==length(v);
unique can also return index vectors, which let you find the equal elements. See the documentation on unique for more information.
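For instance, here is a minimal sketch (my own illustration, not part of the original answer) that uses the third output of unique to report the duplicates:
% Sketch: v == u(iv), so iv can be used to count occurrences of each
% unique value and to locate the repeated elements.
[u, ~, iv] = unique(v);
if numel(u) == numel(v)
    disp(v)                          % all elements are different
else
    counts = accumarray(iv(:), 1);   % occurrences per unique value
    idx = find(ismember(v, u(counts > 1)));
    disp(idx)                        % indices of the equal elements
end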
As an alternative to unique, you can also sort the elements and check whether they are all different:
all(diff(sort(v)))
By calling sort with a second output argument you can get the indices you are looking for.
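A hedged sketch of that idea (my own illustration, assuming v is a vector):
% Sketch: the second output of sort maps equal neighbors in the
% sorted vector back to their original positions.
[s, order] = sort(v);
eq = find(diff(s) == 0);                % equal adjacent pairs in s
idx = unique(order([eq(:); eq(:)+1]));  % original indices of duplicates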
I'll give one more solution, and a comparison of all methods tried so far.
My solution is based on the observation that built-in functions like sort() or unique() leave no opportunity for early escapes. That is, sort() has to sort the vector completely before your algorithm can continue, even though that work is unnecessary once two equal values have been detected inside the sort.
Therefore, I simply iterate through the array and compare the current value to all following values using any(). This works around some of these issues, and works well enough in a lot of cases.
However, the worst-case complexity is O(N²), which is a hell of a lot worse than sort()'s O(N·log N). So, as usual, it all depends on context :)
Trying this:
clc
N = 1e4;

% Zigzag's solution
tic
for ii = 1:1e4
    v = randi(N, N,1);
    length(unique(v)) == length(v);
end
toc

% Dennis Jaheruddin's solution
tic
for ii = 1:1e4
    v = randi(N, N,1);
    all(diff(sort(v)));
end
toc

% My solution
tic
for ii = 1:1e4
    v = randi(N, N,1);
    cond = true;
    for jj = 1:numel(v)
        % compare the current value to all following values
        if any(v(jj) == v(jj+1:end))
            cond = false;
            break;
        end
    end
end
toc
The random numbers are generated inside the loops to ensure a variety of different cases come up. Results on my PC:
Elapsed time is 16.787976 seconds. % unique
Elapsed time is 14.284696 seconds. % sort + diff
Elapsed time is 5.376655 seconds. % loop + any
So an explicit loop (provided feature accel is on) with early exit is actually almost three times faster than the standard "vectorized" approach :)
PS - I also tried nesting another loop to avoid having to compare all values before detecting equal ones (v(jj) == v(jj+1:end) is evaluated completely before any() can start doing its job), but there the overheads really start to get in the way (or the JIT is not coping well enough with this sort of thing, I don't know). In theory this should be even faster, of course, but unfortunately, not in MATLAB :)
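For reference, the nested variant would look roughly like this (a sketch reconstructed from the description above, not the exact code I ran):
% Sketch: fully explicit double loop with early exit; in theory this
% escapes even earlier than any(), but the loop overhead dominates.
cond = true;
for jj = 1:numel(v)-1
    for kk = jj+1:numel(v)
        if v(jj) == v(kk)
            cond = false;
            break;              % escape the inner loop
        end
    end
    if ~cond
        break;                  % escape the outer loop as well
    end
end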
However, change the random number generation
v = randi(N, N,1);
into
v = randi(N*N, N,1);
and the results are quite different:
Elapsed time is 0.162625 seconds. % unique
Elapsed time is 0.147369 seconds. % sort + diff
Elapsed time is 30.767247 seconds. % loop + any
Here I used only 100 iterations instead of 10,000, for obvious reasons :)
I have some code which uses sparse indexing (and there's no way that I can get around that). I run this in a function, and use it for two problems, where the sizes of all the variables involved do not change. However, for one problem, the sparse indexing part takes 5 seconds, and for the other, takes 25 seconds.
I checked the size of every variable involved, and they are the same for both problems. I also checked that xv is a full matrix for both problem types.
So, anyone else ever run into something weird like this? Any ideas as to why this would happen? Mainly I am trying to make the code more efficient, and while 5 seconds is ok for my particular application, 25 seconds (especially when I can't explain it) is very bad.
Edit: Here is a link to a photo that profiles this weird behavior. The runtime values were recorded on the third run to ensure that the size of X is also not changing. And I did check that xv is a dense (not sparse) matrix both times.
https://www.dropbox.com/s/i41j6afanzbjdyg/weird_bcd_thing.png?dl=0
Thanks so much for any help!
Code below (runs in a for loop). If I use ptype = 1, then it's 5 seconds, ptype = 3 is 25 seconds.
clvec = cliques{k};
xcurr = full(X(clvec));
xv = reshape(xcurr - Z(offset_index(k) + 1 : offset_index(k) + ncl^2),ncl,ncl);
%these two functions both take a dense symmetric matrix and return a dense symmetric matrix, and in both cases the size is the same for a given k.
if ptype == 1
    xv = proj_PSD(xv,0,0);
elseif ptype == 3
    xv = proj_Schoenberg(xv,0);
end
Xd = vec(xv) - xcurr;
%THIS IS THE WEIRD LINE
tic
X(clvec) = xv;
toc;
In the 'WEIRD LINE', X(clvec) = xv;, you are using random access into a sparse matrix.
Access into a sparse matrix is not constant-time; it depends on the stored data. The time may depend on the matrix values and on the indices you are trying to access.
This is not the case for a regular (full) matrix, where you usually get stable, and faster, access times.
To get stable, constant-time access, try to change the implementation based on your specific matrix usage, and try to avoid assigning values through random access.
See the following code as a reference:
% sparse 100x100 matrix with some random nonzeros
X = sparse(randi(100,50,1), randi(100,50,1), randn(1), 100, 100);

% random sets of linear indices into the 10000 elements of X
for i = 1:100
    rand_inds{i} = randperm(10000, 100);
end

% timed random-access assignment into the sparse matrix
for i = 1:100
    ti = tic;
    X(rand_inds{i}) = 3;
    to_X(i) = toc(ti);
end

% the same assignments into a full copy
Xf = full(X);
for i = 1:100
    ti = tic;
    Xf(rand_inds{i}) = 3;
    to_Xf(i) = toc(ti);
end

figure; plot(to_X); hold on; plot(to_Xf, 'r');
I solved my problem! I'm posting the answer because I think it's interesting.
One thing I didn't mention in the question is that the loop goes from k = 1 to k = L, and for ptype = 3 we add one more step: assigning all the diagonal indices to 0:
X(diag_index) = 0;
where diag_index is computed ahead of time.
The problem is that instead of just storing the zeros, MATLAB automatically discards those entries from the sparse structure, so on the next loop iteration, when the diagonal indices are accessed again, it has to re-allocate storage for X. So I changed that line to
X(diag_index) = eps;
and now they both run equally fast! (It's not the best solution, since that's going to be a source of error later, but there's no more mystery!)
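A tiny sketch (my own illustration) of the mechanism at work:
% Assigning 0 removes an entry from the sparse structure; assigning
% eps keeps it allocated, so later writes need no re-allocation.
S = speye(5);
S(1,1) = 0;      % entry is discarded: nnz(S) drops from 5 to 4
S(1,1) = eps;    % entry is re-allocated and kept: nnz(S) is 5 again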
The answer is never what you think it would be...
Suppose you have 5 vectors: v_1, v_2, v_3, v_4 and v_5. These vectors each contain a range of values from a minimum to a maximum. So for example:
v_1 = minimum_value:step:maximum_value;
Each of these vectors uses the same step size but has a different minimum and maximum value. Thus they are each of a different length.
A function F(v_1, v_2, v_3, v_4, v_5) depends on these vectors and can use any combination of the elements within them. (Apologies for the poor explanation.) I am trying to find the maximum value of F and record the values which produced it. My current approach has been to use multiple nested for loops, as shown, to evaluate the function for every combination of the vectors' elements:
% Initialise the running maximum to a small value
% (use -Inf instead if F can return negative values)
temp = 0;
% For every combination of the five vectors, evaluate the function. If the
% result is greater than the one calculated previously, store it along with
% the positions of the elements within the vectors.
for a = 1:length(v_1)
    for b = 1:length(v_2)
        for c = 1:length(v_3)
            for d = 1:length(v_4)
                for e = 1:length(v_5)
                    % The function is a combination of trigonometrics,
                    % summations, multiplications, etc.
                    Result = F(v_1(a), v_2(b), v_3(c), v_4(d), v_5(e));
                    % If the value of Result is greater than the previous
                    % value, store it and record the positions
                    % 'a','b','c','d' and 'e'
                    if Result > temp
                        temp = Result;
                        f = a;
                        g = b;
                        h = c;
                        i = d;
                        j = e;
                    end
                end
            end
        end
    end
end
This gets incredibly slow for small step sizes. If there are around 100 elements in each vector, the number of combinations is 100^5 = 10^10. This is a problem, as I need small step values to get a suitably converged answer.
I was wondering if it is possible to speed this up using vectorization, or any other method. I also looked at generating the combinations prior to the calculation, but that seemed even slower than my current method. I haven't used Matlab for a long time, but just looking at the number of nested for loops makes me think this can definitely be sped up. Thank you for the suggestions.
No matter how you generate your parameter combinations, you will end up calling your function F 100^5 times. The easiest improvement would be to use parfor instead, in order to exploit multi-core computation. If you do that, you should store the calculation results and find the maximum after the loop, because your current approach would not be thread-safe.
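A minimal sketch of that pattern, assuming F and the five vectors are defined as in the question (the linear-index decoding is my own illustration):
% Sketch: evaluate F for every combination inside a parfor, store all
% results, and locate the maximum afterwards (thread-safe).
n = [length(v_1), length(v_2), length(v_3), length(v_4), length(v_5)];
results = zeros(prod(n), 1);
parfor idx = 1:prod(n)
    [a, b, c, d, e] = ind2sub(n, idx);  % decode one combination
    results(idx) = F(v_1(a), v_2(b), v_3(c), v_4(d), v_5(e));
end
[~, best] = max(results);
[f, g, h, ii, jj] = ind2sub(n, best);   % indices into v_1 ... v_5
(Note that for 100 elements per vector the results array itself becomes far too large to hold in memory, so in practice you would process the combinations in chunks.)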
Having said that, and not knowing anything about your actual problem, I would advise you to implement a more structured approach: first find a coarse solution with a bigger step size, then narrow it down successively by reducing the min/max values of your parameter intervals. What you have currently is the absolute brute-force method, which will never be very effective.
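A 1-D sketch of that idea (a toy objective of my own; the real problem would refine all five intervals the same way):
% Sketch: coarse-to-fine grid search, shrinking the interval around
% the current best and reducing the step each round.
f = @(x) -(x - 0.3).^2;            % toy objective with maximum at 0.3
lo = 0; hi = 1; step = 0.1;
for round = 1:4
    grid = lo:step:hi;
    [~, k] = max(f(grid));
    best = grid(k);
    lo = max(lo, best - step);     % narrow the interval around the best
    hi = min(hi, best + step);
    step = step / 10;              % refine the step size
end
best                               % converges towards 0.3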
I want to make 1000 random permutations of a vector in matlab. I do it like this
% vector is A
num_A = length(A);
for i = 1:1000
    n = randperm(num_A);
    A = A(n); % This is one permutation
end
This takes like 73 seconds. Is there any way to do it more efficiently?
Problem 1 - Overwriting the original vector inside loop
Each time through the loop, A = A(n); overwrites A, the input vector, with a new permutation. This might be reasonable, since anyway you don't need the order but all the elements in A. However, it's extremely inefficient, because you have to re-write a million-element array in every iteration.
Solution: Store the permutation into a new variable -
B(ii, :) = A(n);
Problem 2 - Using i as iterator
We at Stack Overflow are always telling serious Matlab users that using i and j as iterators in loops is a bad idea. Check this answer to see why it can make your code slow, and check the other answers on that page for why it's bad practice.
Solution - use ii instead of i.
Problem 3 - Using an unnecessary for loop
Actually, you can avoid this for loop entirely, since the iterations are not related to each other, and it can be faster to let Matlab handle them as one batch.
Solution - use arrayfun to generate all 1000 results at once.
Final solution
Use arrayfun to generate a 1000 x num_A matrix of indices. I think (I didn't confirm) it's faster than directly permuting A.
n = cell2mat(arrayfun(@(x) randperm(num_A), 1:1000, 'UniformOutput', false)');
Then store all 1000 permutations at once, into a new variable.
B = A(n);
I also found this File Exchange submission, Shuffle, pretty attractive; you can use it as a replacement for randperm. Example code -
B = Shuffle(repmat(A, 1000, 1), 2);
P = perms(A);
B = P(1:1000, :);
perms returns all the different permutations of the elements of A, one per row; just take the first 1000 rows. (Note that this is only feasible for short vectors, since perms generates all factorial(length(A)) permutations.)
I plan on deleting the first row of a matrix multiple times and was wondering what the best/most efficient way of doing this would be.
I know I can do something like this
M(1,:)=[]
or
M = M(2:end,:)
but I am not sure which is the best way or if there is another better way.
Hey, I just tested those two methods with tic and toc.
This is the code I used:
A=rand(100,100000);
tic
a=A(2:end,:);
t1=toc
tic
A(1,:)=[];
t2=toc
and this is the result:
t1 =
0.0603
t2 =
0.0744
If you use longer columns it gets even more obvious:
A=rand(10000,100);
t1 =
0.0083
t2 =
0.0124
So copying the rows you want to keep seems to be faster.
Edit
It was commented that tic and toc are not "trustworthy" in the millisecond domain, and it was recommended to use loops to run the code multiple times. The result doesn't change, though.
A=rand(100,100000);
size_A=size(A);
tic
for k = 1:100
    A1 = A;
    A1 = A1(2:end,:);
end
t1 = toc
tic
for k = 1:100
    A1 = A;
    A1(1,:) = [];
end
t2 = toc
this results in:
t1 =
7.5237
t2 =
15.2234
Generally it might be faster to keep what you want. Depending on the dimensions of your matrix, however, results may vary. Consider the following test case, where two matrices, A1 and B1, are generated with dimensions 100x100000 and 100000x100 respectively. The results were obtained from the profile viewer, but tic/toc measurements confirmed them.
A1 = rand(100,100000);
for ii = 1:100
    A = A1;
    A = A(2:end,:);
end
for ii = 1:100
    A = A1;
    A(1,:) = [];
end

B1 = rand(100000,100);
for ii = 1:100
    B = B1;
    B = B(2:end,:);
end
for ii = 1:100
    B = B1;
    B(1,:) = [];
end
The results clearly show that the first case (keeping what you want, on a matrix with many columns) is actually very slow. There is no clear winner in general, though: you should time it for your own situation!
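For example, a sketch using timeit to compare both approaches on your own matrix shape (dropFirstRow is a little helper defined below, needed because M(1,:) = [] is a statement and cannot go in an anonymous function; local functions in scripts need R2016b or later):
% Sketch: time both row-deletion approaches with timeit.
M = rand(100, 100000);
t_keep = timeit(@() M(2:end, :))      % copy the rows to keep
t_del  = timeit(@() dropFirstRow(M))  % delete the first row in a copy

function N = dropFirstRow(M)
    M(1,:) = [];
    N = M;
end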
When multiplying two matrices, I tried the following two options:
1)
res = X*A;
2)
for i = 1:size(A,2)
    res(:,i) = X*A(:,i);
end
I preallocated memory for res in both. And surprisingly, I found option 2 to be faster.
Can someone explain how this is so?
edit:
I tried
K = 10000;
clear t1 t2
t1 = zeros(K,1);
t2 = zeros(K,1);
for k = 1:K
    clear res
    x = rand(100,100);
    a = rand(100,100);
    tic
    res = x*a;
    t1(k) = toc;
end
for k = 1:K
    clear res2
    res2 = zeros(100,100);
    x = rand(100,100);
    a = rand(100,100);
    tic
    for i = 1:100
        res2(:,i) = x*a(:,i);
    end
    t2(k) = toc;
end
I ran both codes in a loop 1000 times. On average (but not always) the first, vectorized code was 3-4 times faster. I cleared the result variables and preallocated before starting the timer.
x = rand(100,100);
a = rand(100,100);
K = 1000;
clear t1 t2
t1 = zeros(K,1);
t2 = zeros(K,1);
for k = 1:K
    clear res
    tic
    res = x*a;
    t1(k) = toc;
end
for k = 1:K
    clear res2
    res2 = zeros(100,100);
    tic
    for i = 1:100
        res2(:,i) = x*a(:,i);
    end
    t2(k) = toc;
end
So, never make a timing conclusion based on a single run.
I believe I can chime in on the variation in timings between the two methods, as well as on why people are getting different relative speeds.
Before Matlab version 2008a (or a version near that release), for loops took a major performance hit in any Matlab code, because the interpreter (a layer between the very readable script and a lower-level implementation of the code) had to re-interpret the code on each pass through the for loop.
Since that release, the interpreter has gotten progressively better, so when running a modern version of Matlab, the interpreter can look at your code and say "Ah ha! I know what he is doing, let me optimize it a bit" and avoid the hit it would otherwise take by re-interpreting the code.
I would expect the two ways of performing the matrix multiply to evaluate in the same amount of time; why the for-loop implementation runs faster comes down to some detail in the interpreter's optimizations that we mere mortals are not privy to.
One broad lesson to take from this is that not all versions are equal. I work on a couple of bleeding-edge cases using two Matlab add-ons, the SimBiology and Parallel Computing Toolboxes, both of which (especially if you want them to work together) are version-dependent in speed of execution, with occasional other stability issues. As such, I keep the three most recent releases of Matlab, test that I get the same answers out of each version, and occasionally roll back to an earlier version if I find issues with some features. This is probably overkill for most people, but it gives you an idea of version differences.
Hope this helps.
Edits:
To clarify, code vectorization is still important. But given a script like:
x_slow = zeros(1,1e5);
x_fast = zeros(1,1e5);

tic;
for i = 1:1e5
    x_slow(i) = log(i);
end
time_slow = toc; % evaluates for me in .0132 seconds

tic;
x_fast = log(1:1e5);
time_fast = toc; % evaluates for me in .0055 seconds
The disparity between time_slow and time_fast has shrunk over the past several versions, thanks to improvements in the interpreter. The example I saw was, I believe, on 2000a vs. 2008b, but that's subject to my recollection.
There is something else that might be going on that was addressed by Oli and Yuk. There is often a difference between the time_1 and time_2 in:
tic; x = log(1:1e5); time_1 = toc
tic; x = log(1:1e5); time_2 = toc
So the test of one million evaluations vs. one evaluation is valuable, depending on where x sits in memory (in cache or not).
Hope this helps again.
This may well be an effect of caching: a is already in the cache by the time you run the second version, so it has an advantage. Try creating an independent set of inputs to make the comparison fair. Also, it's probably better to measure the time of, e.g., one million iterations of this, in order to eliminate typical variations due to outside effects.
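A sketch of what that could look like (my own illustration):
% Sketch: fresh, independent inputs for each timed variant, so neither
% benefits from the other's cache warm-up.
x1 = rand(100); a1 = rand(100);
x2 = rand(100); a2 = rand(100);
tic; res1 = x1*a1; toc
res2 = zeros(100);
tic
for i = 1:100
    res2(:,i) = x2*a2(:,i);
end
toc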
It looks to me like you are not multiplying the matrices properly: you need to sum all the products of the i-th row of the X matrix and the j-th column of the A matrix; that might be the reason.
Look here to see how it's done.