Faster alternative for find and sort in MATLAB - matlab

I am working on community detection with a genetic algorithm. Communities are represented in locus-based representation, that each index(gene) and it's value are in same community.
For example in the figure below, if the chromosome is (b) the communities will be (d)
So to extract communities from a chromosome, I need to iteratively find the indexes and values, to do this, I have written this code:
while (SumComms)~=nVar
j=find(Sol>0,1,'first');%find the first node which has not placed in any community
Com=[j,Sol(j)];%index of the node and it's value
Comsize=0;
while Comsize<numel(Com)
Comsize=numel(Com);
x=find(ismembc(Sol,sort([Com,Sol(Com)])));%Indexes which Com occure in Sol
Com=unique([Com,x,Sol(x)]);
end
Sol(Com)=0;
i=i+1;
SumComms=SumComms+numel(Com);
Communities{i}=Com;
end
But x=find(ismembc(Sol,sort([Com,Sol(Com)]))) is very time consuming in even mid-size networks. Do you know any faster way?

Here is a solution using logical vector. Instead of operation on indices we can define Com as a logical vector so operations on indices such as ismember can be reduced to indexing operations:
i=0;
SumComms=0;
nVar = 9;
Sol = [3 0 3 1 5 6 4 7 7]+1;
while SumComms ~= nVar
j=find(Sol>1,1,'first');
Com = false(1,nVar);
Com([j Sol(j)])=true;
Comsize=0;
sumcom = 2;
while Comsize<sumcom
Comsize=sum(Com);
Com(Sol(Com))=true;
Com = Com(Sol);
Com(Sol(Com))=true;
sumcom = sum(Com);
end
Sol(Com)=1;
i = i + 1;
SumComms=SumComms+sumcom;
Communities{i}=find(Com);
end
Result of a test in Octave shows that the proposed method is at least 10x faster than the original method .

Related

Binary map in Matlab

I know it is not possible to create a true binary variable in Matlab. Even 'true(8,1)' has a size of 8 bytes, instead of 8 bits. Very memory inefficient.
In Jon Bentley's Programming Pearls he poses this problem:
input: a file containing at most n positive integers, each less then n, where n = 1E7. No integer exists twice.
output: a sorted list in increasing order of input integers
constraints: At most roughly 1MB of storage available in main memory. Ample disk storage.
In the answer he claims the program should be able to solve it lightning fast using a binary map, or bit-vector. I tried to implement this bit-vector in matlab:
classdef binar
properties (Access=protected)
byteslist uint8
end
methods
function obj = binar(N)
N=ceil(N/8);
obj.byteslist=zeros([N,1],'uint8');
end
function obj = setbit(obj,N,bit) %N from 0 to N-1
v=rem(N,8)+1; %byte position [1 8]
B = (N-v+1)/8+1; %byte nr [1 ..]
obj.byteslist(B)=bitset(obj.byteslist(B),v,bit);
end
function bit = getbit(obj,N)
v=rem(N,8)+1;
B = (N-v)/8+1;
bit = bitget(obj.byteslist(B),v);
end
end
end
To my surprise setting bits is very slow. I timed this code:
N=1E7;
B = binar(N);
itr=1E5;
tic
for ct = 1:itr
pos = uint32(randi([0,N-1],1));
B = setbit(B,pos,1);
end
toc
From the size perspective this works fine. 'S=whos('B');S.bytes/2^20' returns 1.19MB. However it does not seem to be very fast. Running bitset 1E5 times takes 0.367s, where I would think this is literally the simplest operation you can ask of a computer.
Is there a way to make this faster in Matlab?
update: I tried with bitand, and a predeclared vector
obj.byte=uint8([1 2 4 8 16 32 64 128]); %declared on creation of obj
obj.byteslist(B)=bitand(obj.byteslist(B),obj.byte(v));
Then i tried:
obj.byteslist(B)=obj.byteslist(B)+obj.byte(v);
There is hardly a speed difference with bitset.

Vectorizing the solution of a linear equation system in MATLAB

Summary: This question deals with the improvement of an algorithm for the computation of linear regression.
I have a 3D (dlMAT) array representing monochrome photographs of the same scene taken at different exposure times (the vector IT) . Mathematically, every vector along the 3rd dimension of dlMAT represents a separate linear regression problem that needs to be solved. The equation whose coefficients need to be estimated is of the form:
DL = R*IT^P, where DL and IT are obtained experimentally and R and P must be estimated.
The above equation can be transformed into a simple linear model after applying a logarithm:
log(DL) = log(R) + P*log(IT) => y = a + b*x
Presented below is the most "naive" way to solve this system of equations, which essentially involves iterating over all "3rd dimension vectors" and fitting a polynomial of order 1 to (IT,DL(ind1,ind2,:):
%// Define some nominal values:
R = 0.3;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(3)+P;
rMAT = 0.1*randn(3)+R;
%// Generate "fake" observation data:
dlMAT = bsxfun(#times,rMAT,bsxfun(#power,permute(IT,[3,1,2]),pMAT));
%// Regression:
sol = cell(size(rMAT)); %// preallocation
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
end
end
fittedP = cellfun(#(x)x(1),sol); %// Estimate of pMAT
fittedR = cellfun(#(x)exp(x(2)),sol); %// Estimate of rMAT
The above approach seems like a good candidate for vectorization, since it does not utilize MATLAB's main strength that is MATrix operations. For this reason, it does not scale very well and takes much longer to execute than I think it should.
There exist alternative ways to perform this computation based on matrix division, as demonstrated here and here, which involve something like this:
sol = [ones(size(x)),log(x)]\log(y);
That is, appending a vector of 1s to the observations, followed by mldivide to solve the equation system.
The main challenge I'm facing is how to adapt my data to the algorithm (or vice versa).
Question #1: How can the matrix-division-based solution be extended to solve the problem presented above (and potentially replace the loops I am using)?
Question #2 (bonus): What is the principle behind this matrix-division-based solution?
The secret ingredient behind the solution that includes matrix division is the Vandermonde matrix. The question discusses a linear problem (linear regression), and those can always be formulated as a matrix problem, which \ (mldivide) can solve in a mean-square error sense‡. Such an algorithm, solving a similar problem, is demonstrated and explained in this answer.
Below is benchmarking code that compares the original solution with two alternatives suggested in chat1, 2 :
function regressionBenchmark(numEl)
clc
if nargin<1, numEl=10; end
%// Define some nominal values:
R = 5;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(numEl)+P;
rMAT = 0.1*randn(numEl)+R;
%// Generate "fake" measurement data using the relation "DL = R*IT.^P"
dlMAT = bsxfun(#times,rMAT,bsxfun(#power,permute(IT,[3,1,2]),pMAT));
%% // Method1: loops + polyval
disp('-------------------------------Method 1: loops + polyval')
tic; [fR,fP] = method1(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method2: loops + Vandermonde
disp('-------------------------------Method 2: loops + Vandermonde')
tic; [fR,fP] = method2(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method3: vectorized Vandermonde
disp('-------------------------------Method 3: vectorized Vandermonde')
tic; [fR,fP] = method3(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
function [fittedR,fittedP] = method1(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
end
end
fittedR = cellfun(#(x)exp(x(2)),sol);
fittedP = cellfun(#(x)x(1),sol);
function [fittedR,fittedP] = method2(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = flipud([ones(numel(IT),1) log(IT(:))]\log(squeeze(dlMAT(ind1,ind2,:)))).'; %'
end
end
fittedR = cellfun(#(x)exp(x(2)),sol);
fittedP = cellfun(#(x)x(1),sol);
function [fittedR,fittedP] = method3(IT,dlMAT)
N = 1; %// Degree of polynomial
VM = bsxfun(#power, log(IT(:)), 0:N); %// Vandermonde matrix
result = fliplr((VM\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
%// Compressed version:
%// result = fliplr(([ones(numel(IT),1) log(IT(:))]\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
fittedR = exp(real(reshape(result(:,2),size(dlMAT,1),size(dlMAT,2))));
fittedP = real(reshape(result(:,1),size(dlMAT,1),size(dlMAT,2)));
The reason why method 2 can be vectorized into method 3 is essentially that matrix multiplication can be separated by the columns of the second matrix. If A*B produces matrix X, then by definition A*B(:,n) gives X(:,n) for any n. Moving A to the right-hand side with mldivide, this means that the divisions A\X(:,n) can be done in one go for all n with A\X. The same holds for an overdetermined system (linear regression problem), in which there is no exact solution in general, and mldivide finds the matrix that minimizes the mean-square error. In this case too, the operations A\X(:,n) (method 2) can be done in one go for all n with A\X (method 3).
The implications of improving the algorithm when increasing the size of dlMAT can be seen below:
For the case of 500*500 (or 2.5E5) elements, the speedup from Method 1 to Method 3 is about x3500!
It is also interesting to observe the output of profile (here, for the case of 500*500):
Method 1
Method 2
Method 3
From the above it is seen that rearranging the elements via squeeze and flipud takes up about half (!) of the runtime of Method 2. It is also seen that some time is lost on the conversion of the solution from cells to matrices.
Since the 3rd solution avoids all of these pitfalls, as well as the loops altogether (which mostly means re-evaluation of the script on every iteration) - it unsurprisingly results in a considerable speedup.
Notes:
There was very little difference between the "compressed" and the "explicit" versions of Method 3 in favor of the "explicit" version. For this reason it was not included in the comparison.
A solution was attempted where the inputs to Method 3 were gpuArray-ed. This did not provide improved performance (and even somewhat degradaed them), possibly due to wrong implementation, or the overhead associated with copying matrices back and forth between RAM and VRAM.

Vectorize Evaluations of Meshgrid Points in Matlab

I need the "for" loop in the following representative section of code to run as efficiently as possible. The mean function in the code is acting as a representative placeholder for my own function.
x = linspace(-1,1,15);
y = linspace(2,4,15);
[xgrid, ygrid] = meshgrid(x,y);
mc = rand(100000,1);
z=zeros(size(xgrid));
for i=1:length(xgrid)
for j=1:length(ygrid)
z(i,j) = mean(xgrid(i,j) + ygrid(i,j) + xgrid(i,j)*ygrid(i,j)*mc);
end
end
I have vectorized the code and improved its speed by about 2.5 times by building a matrix in which mc is replicated for each grid point. My implementation results in a very large matrix (3 x 22500000) filled with repeated data. I've mitigated the memory penalty of this approach by converting the matrix to single precision, but it seems like there should be a more efficient way to do what I want that avoids replicating so much data.
You could use bsxfun with few reshapes -
A = bsxfun(#times,y,x.'); %//'
B = bsxfun(#plus,y,x.'); %//'
C = mean(bsxfun(#plus,bsxfun(#times,mc,reshape(A,1,[])) , reshape(B,1,[])),1);
z_out = reshape(C,numel(x),[]).';

Rectifying compute_curvature.m error in Toolbox Graph in Matlab

I am currently using the Toolbox Graph on the Matlab File Exchange to calculate curvature on 3D surfaces and find them very helpful (http://www.mathworks.com/matlabcentral/fileexchange/5355). However, the following error message is issued in “compute_curvature” for certain surface descriptions and the code fails to run completely:
> Error in ==> compute_curvature_mod at 75
> dp = sum( normal(:,E(:,1)) .* normal(:,E(:,2)), 1 );
> ??? Index exceeds matrix dimensions.
This happens only sporadically, but there is no obvious reason why the toolbox works perfectly fine for some surfaces and not for others (of a similar topology). I also noticed that someone had asked about this bug back in November 2009 on File Exchange, but that the question had gone unanswered. The post states
"compute_curvature will generate an error on line 75 ("dp = sum(
normal(:,E(:,1)) .* normal(:,E(:,2)), 1 );") for SOME surfaces. The
error stems from E containing indices that are out of range which is
caused by line 48 ("A = sparse(double(i),double(j),s,n,n);") where A's
values eventually entirely make up the E matrix. The problem occurs
when the i and j vectors create the same ordered pair twice in which
case the sparse function adds the two s vector elements together for
that matrix location resulting in a value that is too large to be used
as an index on line 75. For example, if i = [1 1] and j = [2 2] and s
= [3 4] then A(1,2) will equal 3 + 4 = 7.
The i and j vectors are created here:
i = [face(1,:) face(2,:) face(3,:)];
j = [face(2,:) face(3,:) face(1,:)];
Just wanted to add that the error I mentioned is caused by the
flipping of the sign of the surface normal of just one face by
rearranging the order of the vertices in the face matrix"
I have tried debugging the code myself but have not had any luck. I am wondering if anyone here has solved the problem or could give me insight – I need the code to be sufficiently general-purpose in order to calculate curvature for a variety of surfaces, not just for a select few.
The November 2009 bug report on File Exchange traces the problem back to the behavior of sparse:
S = SPARSE(i,j,s,m,n,nzmax) uses the rows of [i,j,s] to generate an
m-by-n sparse matrix with space allocated for nzmax nonzeros. The
two integer index vectors, i and j, and the real or complex entries
vector, s, all have the same length, nnz, which is the number of
nonzeros in the resulting sparse matrix S . Any elements of s
which have duplicate values of i and j are added together.
The lines of code where the problem originates are here:
i = [face(1,:) face(2,:) face(3,:)];
j = [face(2,:) face(3,:) face(1,:)];
s = [1:m 1:m 1:m];
A = sparse(i,j,s,n,n);
Based on this information removal of the repeat indices, presumably using unique or similar, might solve the problem:
[B,I,J] = unique([i.' j.'],'rows');
i = B(:,1).';
j = B(:,2).';
s = s(I);
The full solution may look something like this:
i = [face(1,:) face(2,:) face(3,:)];
j = [face(2,:) face(3,:) face(1,:)];
s = [1:m 1:m 1:m];
[B,I,J] = unique([i.' j.'],'rows');
i = B(:,1).';
j = B(:,2).';
s = s(I);
A = sparse(i,j,s,n,n);
Since I do not have a detailed understanding of the algorithm it is hard to tell whether the removal of entries will have a negative effect.

How can I speed up this call to quantile in Matlab?

I have a MATLAB routine with one rather obvious bottleneck. I've profiled the function, with the result that 2/3 of the computing time is used in the function levels:
The function levels takes a matrix of floats and splits each column into nLevels buckets, returning a matrix of the same size as the input, with each entry replaced by the number of the bucket it falls into.
To do this I use the quantile function to get the bucket limits, and a loop to assign the entries to buckets. Here's my implementation:
function [Y q] = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q)
q=transpose(q);
end
Y = zeros(size(X));
for i = 1:nLevels
% "The variables g and l indicate the entries that are respectively greater than
% or less than the relevant bucket limits. The line Y(g & l) = i is assigning the
% value i to any element that falls in this bucket."
if i ~= nLevels % "The default; doesnt include upper bound"
g = bsxfun(#ge,X,q(i,:));
l = bsxfun(#lt,X,q(i+1,:));
else % "For the final level we include the upper bound"
g = bsxfun(#ge,X,q(i,:));
l = bsxfun(#le,X,q(i+1,:));
end
Y(g & l) = i;
end
Is there anything I can do to speed this up? Can the code be vectorized?
If I understand correctly, you want to know how many items fell in each bucket.
Use:
n = hist(Y,nbins)
Though I am not sure that it will help in the speedup. It is just cleaner this way.
Edit : Following the comment:
You can use the second output parameter of histc
[n,bin] = histc(...) also returns an index matrix bin. If x is a vector, n(k) = >sum(bin==k). bin is zero for out of range values. If x is an M-by-N matrix, then
How About this
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
Y = zeros(size(X));
for i = 1:numel(q)-1
Y = Y+ X>=q(i);
end
This results in the following:
>>X = [3 1 4 6 7 2];
>>[Y, q] = levels(X,2)
Y =
1 1 2 2 2 1
q =
1 3.5 7
You could also modify the logic line to ensure values are less than the start of the next bin. However, I don't think it is necessary.
I think you shoud use histc
[~,Y] = histc(X,q)
As you can see in matlab's doc:
Description
n = histc(x,edges) counts the number of values in vector x that fall
between the elements in the edges vector (which must contain
monotonically nondecreasing values). n is a length(edges) vector
containing these counts. No elements of x can be complex.
I made a couple of refinements (including one inspired by Aero Engy in another answer) that have resulted in some improvements. To test them out, I created a random matrix of a million rows and 100 columns to run the improved functions on:
>> x = randn(1000000,100);
First, I ran my unmodified code, with the following results:
Note that of the 40 seconds, around 14 of them are spent computing the quantiles - I can't expect to improve this part of the routine (I assume that Mathworks have already optimized it, though I guess that to assume makes an...)
Next, I modified the routine to the following, which should be faster and has the advantage of being fewer lines as well!
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
Y = ones(size(X));
for i = 2:nLevels
Y = Y + bsxfun(#ge,X,q(i,:));
end
The profiling results with this code are:
So it is 15 seconds faster, which represents a 150% speedup of the portion of code that is mine, rather than MathWorks.
Finally, following a suggestion of Andrey (again in another answer) I modified the code to use the second output of the histc function, which assigns entries to bins. It doesn't treat the columns independently, so I had to loop over the columns manually, but it seems to be performing really well. Here's the code:
function [Y q] = levels(X,nLevels)
p = linspace(0,1,nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
q(end,:) = 2 * q(end,:);
Y = zeros(size(X));
for k = 1:size(X,2)
[junk Y(:,k)] = histc(X(:,k),q(:,k));
end
And the profiling results:
We now spend only 4.3 seconds in codes outside the quantile function, which is around a 500% speedup over what I wrote originally. I've spent a bit of time writing this answer because I think it's turned into a nice example of how you can use the MATLAB profiler and StackExchange in combination to get much better performance from your code.
I'm happy with this result, although of course I'll continue to be pleased to hear other answers. At this stage the main performance increase will come from increasing the performance of the part of the code that currently calls quantile. I can't see how to do this immediately, but maybe someone else here can. Thanks again!
You can sort the columns and divide+round the inverse indexes:
function Y = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
[S,IX]=sort(X);
[grid1,grid2]=ndgrid(1:size(IX,1),1:size(IX,2));
invIX=zeros(size(X));
invIX(sub2ind(size(X),IX(:),grid2(:)))=grid1;
Y=ceil(invIX/size(X,1)*nLevels);
Or you can use tiedrank:
function Y = levels(X,nLevels)
% "Assign each of the elements of X to an integer-valued level"
R=tiedrank(X);
Y=ceil(R/size(X,1)*nLevels);
Surprisingly, both these solutions are slightly slower than the quantile+histc solution.