I know it is not possible to create a true binary variable in MATLAB: even 'true(8,1)' has a size of 8 bytes instead of 8 bits, which is very memory inefficient.
In Jon Bentley's Programming Pearls he poses this problem:
input: a file containing at most n positive integers, each less than n, where n = 1E7. No integer appears twice.
output: a sorted list in increasing order of input integers
constraints: At most roughly 1MB of storage available in main memory. Ample disk storage.
In the answer he claims the program should be able to solve it lightning fast using a bitmap, or bit-vector. I tried to implement this bit-vector in MATLAB:
classdef binar
    properties (Access=protected)
        byteslist uint8
    end
    methods
        function obj = binar(N)
            N = ceil(N/8);
            obj.byteslist = zeros([N,1],'uint8');
        end
        function obj = setbit(obj,N,bit) %N from 0 to N-1
            v = rem(N,8)+1;  %bit position within the byte [1 8]
            B = (N-v+1)/8+1; %byte nr [1 ..]
            obj.byteslist(B) = bitset(obj.byteslist(B),v,bit);
        end
        function bit = getbit(obj,N)
            v = rem(N,8)+1;  %bit position within the byte [1 8]
            B = (N-v+1)/8+1; %byte nr [1 ..]
            bit = bitget(obj.byteslist(B),v);
        end
    end
end
To my surprise setting bits is very slow. I timed this code:
N=1E7;
B = binar(N);
itr=1E5;
tic
for ct = 1:itr
    pos = uint32(randi([0,N-1],1));
    B = setbit(B,pos,1);
end
toc
From the size perspective this works fine: 'S=whos('B');S.bytes/2^20' returns 1.19 MB. However, it does not seem to be very fast: running bitset 1E5 times takes 0.367 s, even though I would think this is just about the simplest operation you can ask of a computer.
Is there a way to make this faster in Matlab?
update: I tried with bitor and a predeclared mask vector:
obj.byte=uint8([1 2 4 8 16 32 64 128]); %mask vector, declared on creation of obj
obj.byteslist(B)=bitor(obj.byteslist(B),obj.byte(v));
Then I tried plain addition (only valid if the bit is not already set):
obj.byteslist(B)=obj.byteslist(B)+obj.byte(v);
There is hardly a speed difference with bitset.
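For what it's worth, here is a sketch (my own, not from the original post) of the same timing loop on a bare uint8 array, to check how much of the cost is the value-class method call and the object copy rather than bitset itself:
N = 1e7;
itr = 1e5;
bytes = zeros(ceil(N/8),1,'uint8');
tic
for ct = 1:itr
    pos = randi([0,N-1],1); %0-based bit index
    v = rem(pos,8)+1;       %bit position within the byte
    B = floor(pos/8)+1;     %byte index
    bytes(B) = bitset(bytes(B),v);
end
toc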
I am working on community detection with a genetic algorithm. Communities are represented in a locus-based representation, meaning that each index (gene) and its value are in the same community.
For example, in the figure below (not reproduced here), if the chromosome is (b), the communities will be (d).
So to extract the communities from a chromosome, I need to iteratively find the indices and values. To do this, I have written this code:
i = 0;
SumComms = 0;
while SumComms ~= nVar
    j = find(Sol>0,1,'first'); %find the first node which has not been placed in any community
    Com = [j,Sol(j)];          %index of the node and its value
    Comsize = 0;
    while Comsize < numel(Com)
        Comsize = numel(Com);
        x = find(ismembc(Sol,sort([Com,Sol(Com)]))); %indices at which members of Com occur in Sol
        Com = unique([Com,x,Sol(x)]);
    end
    Sol(Com) = 0;
    i = i+1;
    SumComms = SumComms+numel(Com);
    Communities{i} = Com;
end
But x=find(ismembc(Sol,sort([Com,Sol(Com)]))) is very time consuming on even mid-size networks. Do you know any faster way?
Here is a solution using a logical vector. Instead of operating on indices, we can define Com as a logical vector, so that operations on indices such as ismember reduce to plain logical indexing:
i = 0;
SumComms = 0;
nVar = 9;
Sol = [3 0 3 1 5 6 4 7 7]+1;
while SumComms ~= nVar
    j = find(Sol>1,1,'first');
    Com = false(1,nVar);
    Com([j Sol(j)]) = true;
    Comsize = 0;
    sumcom = 2;
    while Comsize < sumcom
        Comsize = sum(Com);
        Com(Sol(Com)) = true;
        Com = Com(Sol);
        Com(Sol(Com)) = true;
        sumcom = sum(Com);
    end
    Sol(Com) = 1;
    i = i + 1;
    SumComms = SumComms+sumcom;
    Communities{i} = find(Com);
end
A test in Octave shows that the proposed method is at least 10x faster than the original method.
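For the sample Sol above, tracing the loop by hand (my own sanity check, not part of the original answer) gives three communities:
celldisp(Communities)
% Communities{1} = [1 2 3 4]
% Communities{2} = [5 6 7]
% Communities{3} = [8 9]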
I'm completely new to MATLAB and this is my first project. MNIST has 60000 pictures of the digits 0 to 9 for training and 1000 pictures for testing. What I did is try to make a pattern for each of the 10 classes (0 to 9) by using the mean; then for recognition I use the Euclidean distance. This is very simple, but the accuracy is really low.
I don't know exactly where my problem is that produces this accuracy. The accuracy: 1.73%.
Here is my code.
Finding the 10 patterns, one for each class:
root = 'F:\matlab\ex1\exercise-EquivaliencOfL2DistanceAndDotProduct\dataset';
fn = strcat (root, '\MnistTrainX.mat');
load (fn);
fn = strcat (root, '\MnistTrainY.mat');
load (fn);
weights = zeros (10, 784);
b = zeros (10, 1);
im=reshape(MnistTrainX(5,:),[28 ,28]);
imshow(im,[]);
imshow(im',[]);
for c = 1:10
    idx = find(MnistTrainY == c-1);
    weights(c,:) = mean(MnistTrainX(idx,:));
end
trainAccuray = ComputeInnerProductAccuracy(weights,b, MnistTrainX,MnistTrainY);
display(trainAccuray);
fn = strcat (root, '\MnistTestX.mat');
load (fn);
fn = strcat (root, '\MnistTestY.mat');
load (fn);
testAccuray = ComputeInnerProductAccuracy(weights, b, MnistTestX, MnistTestY);
display(testAccuray);
and this is the accuracy function:
function [acc]=ComputeInnerProductAccuracy(weights, b, X, Y)
    n = size(X, 1);
    minmat = zeros(60000, 2);
    endmat = zeros(60000, 10);
    m = size(X);
    a = 0;
    for i = 1:n
        for j = 1:10
            endmat(i,j) = sum((X(i,:)-(weights(j,:))).^2,2);
        end
        [minmat(i,1), minmat(i,2)] = min(endmat(i,:));
        if minmat(i,2) == Y(i)
            a = a+1;
        end
    end
    acc = (a*100)/60000;
end
Your code is mostly correct, though it's quite inefficient. I won't spend the time making it more efficient, as there are many areas that need addressing; instead I'll focus on what is wrong. There are two things wrong with the code. The first is when you find which digit has the lowest distance:
[minmat(i,1) ,minmat(i,2)]= min(endmat(i,:));
Note that the second output of min produces the location of the minimum, with indices starting at 1. The class values in Y run from 0 to 9, but the output index of min in your case runs from 1 to 10. The minimum indices and the corresponding class values are therefore off by one, which is probably the reason why you have such bad accuracy.
Therefore, you have to subtract 1 from minmat(i, 2) before you check whether the minimum label is indeed the ground truth... or you can simply add 1 to Y(i) when checking:
[minmat(i,1), minmat(i,2)] = min(endmat(i,:));
if minmat(i,2) == Y(i)+1 % Change
    a = a+1;
end
The second thing that is incorrect is that the "inner product" function (actually you're computing the Euclidean distance... but let's put that aside for this answer) assumes that there are always 60000 inputs, yet your test set doesn't have that many. This will work fine on your training data, but it will report the wrong accuracy for your test data. Make sure you change all instances of 60000 in the function to n; this variable you've already created in your code, and it holds the number of inputs.
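Putting both fixes together, a corrected sketch of the function could look like this (my rewrite, not the original poster's code; it uses implicit expansion, which needs R2016b or newer, and keeps the unused b argument only to preserve the signature):
function acc = ComputeInnerProductAccuracy(weights, b, X, Y)
    n = size(X, 1);                        %number of inputs; works for train and test sets
    a = 0;
    for i = 1:n
        %squared Euclidean distance from sample i to each of the 10 class means
        d = sum((X(i,:) - weights).^2, 2); %10-by-1, via implicit expansion
        [~, idx] = min(d);
        if idx == Y(i) + 1                 %classes are 0-9 but min indices are 1-10
            a = a + 1;
        end
    end
    acc = a*100/n;
end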
I am trying to increase the speed of code that operates on large datasets. I need to perform the function out = sinc(x), where x is a 2048-by-37499 matrix of doubles. This is very expensive and is the bottleneck of my program (even when computed on the GPU).
I am looking for any solution which improves the speed of this operation.
I expect that this might be achieved by pre-computing a vector LookUp = sinc(y) where y is the vector y = min(min(x)):dy:max(max(x)), i.e. a vector spanning the whole range of expected x elements.
How can I efficiently generate an approximation of sinc(x) from this LookUp vector?
I need to avoid generating a three dimensional array, since this would consume more memory than I have available.
Here is a test for the interp1 solution:
a = -15;
b = 15;
rands = (b-a).*rand(1024,37499) + a;
sincx = -15:0.000005:15;
sincy = sinc(sincx);
tic
res1 = interp1(sincx,sincy,rands);
toc
tic
res2 = sinc(rands);
toc
sincx = gpuArray(sincx);
sincy = gpuArray(sincy);
r = gpuArray(rands);
tic
r = interp1(sincx,sincy,r);
toc
r = gpuArray(rands);
tic
r = sinc(r);
toc
Elapsed time is 0.426091 seconds.
Elapsed time is 0.472551 seconds.
Elapsed time is 0.004311 seconds.
Elapsed time is 0.130904 seconds.
These correspond to CPU interp1, CPU sinc, GPU interp1, and GPU sinc, respectively.
Not sure I completely understood your problem, but once you have LookUp = sinc(y) you can use the MATLAB function interp1:
out = interp1(y,LookUp,x)
where x can be a matrix of any size.
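For instance (a sketch; the spacing dy is an assumption to be tuned for accuracy versus table size):
dy = 1e-4;                     %lookup table spacing (assumed)
y = min(x(:)):dy:max(x(:))+dy; %grid spanning the range of x, padded so no query falls outside
LookUp = sinc(y);
out = interp1(y,LookUp,x);     %linear interpolation; x can be a matrix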
I came to the conclusion that your code cannot be improved significantly. The fastest possible lookup table is based on simple indexing. For a performance test, let's just run the test on random data:
%test data:
x=rand(2048,37499);
%relevant code:
out = sinc(x);
Now the lookup based on integer indices:
a = min(x(:));
b = max(x(:));
n = 1000;
x2 = round((x-a)/(b-a)*(n-1)+1); %map x onto integer table indices 1..n
lookup = sinc(linspace(a,b,n));  %table sampled over [a,b]
out2 = lookup(x2);
Regardless of the size of the lookup table or the input data, the last lines in both code blocks take roughly the same time. With sinc evaluating roughly as fast as an indexing operation, I can only assume that it is already implemented using a lookup table.
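To quantify how coarse such a table is, you can compare the two results directly (my addition):
max(abs(out(:)-out2(:))) %worst-case error of the n-point table versus direct sinc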
I found a faster way (if you have an NVIDIA GPU in your PC); however, this will return NaN for x = 0. But if, for any reason, you can deal with having NaN, or you know it will never be zero, then:
if you define r = gpuArray(rands); and actually evaluate the sinc function yourself on the GPU as:
tic
r=rdivide(sin(pi*r),pi*r);
toc
This generally gives me about 3.2x the speed of the interp1 version on the GPU, and it's more accurate (tested using your code above, iterating 100 times with different random data; both methods have similar std).
This works because sin and elementwise division rdivide are also GPU-implemented (while for some reason sinc isn't). See: http://uk.mathworks.com/help/distcomp/run-built-in-functions-on-a-gpu.html
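If you do need the x = 0 case handled, a cheap patch (my addition) is to overwrite the NaNs afterwards, since isnan is also GPU-enabled:
r = gpuArray(rands);
r = rdivide(sin(pi*r),pi*r);
r(isnan(r)) = 1; %sinc(0) is defined as 1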
m = min(x(:));
y = m:dy:max(x(:));
LookUp = sinc(y);
now sinc(n) should equal
LookUp((n-m)/dy + 1)
assuming n - m is an integer multiple of dy and n lies within the range m to max(x(:)). To get the LookUp index (i.e. an integer between 1 and numel(y)), we first shift n by the minimum m, then divide it by the spacing dy, and finally add 1 because MATLAB indexes from 1 instead of 0.
I don't know what that will do for your efficiency, though, but give it a try.
Also you can put this into an anonymous function to help readability:
sinc_lookup = @(n)(LookUp((n-m)/dy + 1))
and now you can just call
sinc_lookup(n)
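If n is not guaranteed to lie exactly on the grid, a variant (my own suggestion) is to round to the nearest table entry, accepting a quantization error of up to dy/2:
sinc_nearest = @(n) LookUp(round((n - m)/dy) + 1);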
Please suggest how to sort out this issue:
nNodes = 50400;
adj = sparse(nNodes,nNodes);
adj(sub2ind([nNodes nNodes], ind, ind + 1)) = 1; %ind is a vector of indices
??? Maximum variable size allowed by the program is exceeded.
I think the problem is 32/64-bit related. If you have a 32-bit processor, you can address at most
2^32 = 4.294967296e+09
elements. If you have a 64-bit processor, this number increases to
2^64 = 1.844674407370955e+19
Unfortunately, for reasons that are at best vague to me, Matlab does not use this full range. To find out the actual range used by Matlab, issue the following command:
[~,maxSize] = computer
On a 32-bit system, this gives
>> [~,maxSize] = computer
maxSize =
2.147483647000000e+09
>> log2(maxSize)
ans =
3.099999999932819e+01
and on a 64-bit system, it gives
>> [~,maxSize] = computer
maxSize =
2.814749767106550e+14
>> log2(maxSize)
ans =
47.999999999999993
So apparently, on a 32-bit system, Matlab only uses 31 bits to address elements, which gives you the upper limit.
If anyone can clarify why Matlab only uses 31 bits on a 32-bit system, and only 48 bits on a 64-bit system, that'd be awesome :)
Internally, Matlab always uses linear indices to access elements in an array (it probably just uses a C-style array or so), which implies for your adj matrix that its final element is
finEl = nNodes*nNodes = 2.54016e+09
This, unfortunately, is larger than the maximum addressable with 31 bits. Therefore, on the 32-bit system,
>> adj(end) = 1;
??? Maximum variable size allowed by the program is exceeded.
while this command poses no problem at all on the 64-bit system.
You'll have to use a workaround on a 32-bit system:
nNodes = 50400;

% split the sparse array up into 4 pieces
adj{1,1} = sparse(nNodes/2,nNodes/2);  adj{1,2} = sparse(nNodes/2,nNodes/2);
adj{2,1} = sparse(nNodes/2,nNodes/2);  adj{2,2} = sparse(nNodes/2,nNodes/2);

% assign values to or index into the HUGE sparse array
function ret = indHuge(mat, inds, vals)
    % get size of cell
    sz = size(mat);
    if nargin < 3
        % return current values when not given new values
        % I have to leave this up to you...
    else
        % otherwise, assign new values
        % I have to leave this up to you...
    end
end

% now initialize the desired elements to 1
adj = indHuge(adj, sub2ind([nNodes nNodes], ind, ind + 1), 1);
I just had the idea to cast all this into a proper class, so that you can use much more intuitive syntax...but that's a whole lot more than I have time for now :)
adj = sparse(ind, ind + 1, ones(size(ind)), nNodes, nNodes, length(ind));
This worked fine... Building the matrix directly from row/column index vectors avoids forming the huge linear indices that sub2ind produces.
And if we have to access the last element of the sparse matrix, we can access it as adj(nNodes, nNodes), but adj(nNodes * nNodes) throws an error.
I have a MATLAB routine with one rather obvious bottleneck. I've profiled the function, with the result that 2/3 of the computing time is used in the function levels:
The function levels takes a matrix of floats and splits each column into nLevels buckets, returning a matrix of the same size as the input, with each entry replaced by the number of the bucket it falls into.
To do this I use the quantile function to get the bucket limits, and a loop to assign the entries to buckets. Here's my implementation:
function [Y q] = levels(X,nLevels)
% Assign each of the elements of X to an integer-valued level
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q)
    q = transpose(q);
end
Y = zeros(size(X));
for i = 1:nLevels
    % The variables g and l indicate the entries that are respectively greater than
    % or less than the relevant bucket limits. The line Y(g & l) = i assigns the
    % value i to any element that falls in this bucket.
    if i ~= nLevels % the default; doesn't include the upper bound
        g = bsxfun(@ge,X,q(i,:));
        l = bsxfun(@lt,X,q(i+1,:));
    else % for the final level we include the upper bound
        g = bsxfun(@ge,X,q(i,:));
        l = bsxfun(@le,X,q(i+1,:));
    end
    Y(g & l) = i;
end
Is there anything I can do to speed this up? Can the code be vectorized?
If I understand correctly, you want to know how many items fell in each bucket.
Use:
n = hist(Y,nbins)
Though I am not sure that it will help with the speedup; it is just cleaner this way.
Edit: following the comment, you can use the second output parameter of histc:
[n,bin] = histc(...) also returns an index matrix bin. If x is a vector, n(k) = sum(bin==k). bin is zero for out of range values. If x is an M-by-N matrix, then...
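For a single column x and an edge vector q, that looks like (a minimal sketch; the question's author applies exactly this per column in his follow-up answer below):
[n,bin] = histc(x,q); %bin(k) is the index of the bucket containing x(k)
Y = bin;              %bucket assignments; zero for out-of-range values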
How about this:
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
Y = zeros(size(X));
for i = 1:numel(q)-1
    Y = Y + (X>=q(i)); %parentheses needed: + binds tighter than >=
end
This results in the following:
>>X = [3 1 4 6 7 2];
>>[Y, q] = levels(X,2)
Y =
1 1 2 2 2 1
q =
1 3.5 7
You could also modify the logic line to ensure values are less than the start of the next bin. However, I don't think it is necessary.
I think you should use histc:
[~,Y] = histc(X,q)
As you can see in MATLAB's documentation:
Description
n = histc(x,edges) counts the number of values in vector x that fall
between the elements in the edges vector (which must contain
monotonically nondecreasing values). n is a length(edges) vector
containing these counts. No elements of x can be complex.
I made a couple of refinements (including one inspired by Aero Engy in another answer) that have resulted in some improvements. To test them out, I created a random matrix of a million rows and 100 columns to run the improved functions on:
>> x = randn(1000000,100);
First, I ran my unmodified code (profiler screenshot not reproduced here).
Note that of the 40 seconds, around 14 of them are spent computing the quantiles - I can't expect to improve this part of the routine (I assume that MathWorks have already optimized it, though I guess that to assume makes an...)
Next, I modified the routine to the following, which should be faster and has the advantage of being fewer lines as well!
function [Y q] = levels(X,nLevels)
p = linspace(0, 1.0, nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
Y = ones(size(X));
for i = 2:nLevels
    Y = Y + bsxfun(@ge,X,q(i,:));
end
The profiling results with this code (screenshot again not reproduced) show the improvement:
So it is 15 seconds faster, which represents a 150% speedup of the portion of code that is mine, rather than MathWorks.
Finally, following a suggestion of Andrey (again in another answer) I modified the code to use the second output of the histc function, which assigns entries to bins. It doesn't treat the columns independently, so I had to loop over the columns manually, but it seems to be performing really well. Here's the code:
function [Y q] = levels(X,nLevels)
p = linspace(0,1,nLevels+1);
q = quantile(X,p);
if isvector(q), q = transpose(q); end
q(end,:) = 2 * q(end,:);
Y = zeros(size(X));
for k = 1:size(X,2)
    [junk, Y(:,k)] = histc(X(:,k),q(:,k));
end
And the profiling results (screenshot not reproduced here) confirm it:
We now spend only 4.3 seconds in code outside the quantile function, which is around a 500% speedup over what I wrote originally. I've spent a bit of time writing this answer because I think it's turned into a nice example of how you can use the MATLAB profiler and StackExchange in combination to get much better performance from your code.
I'm happy with this result, although of course I'll continue to be pleased to hear other answers. At this stage the main performance increase will come from increasing the performance of the part of the code that currently calls quantile. I can't see how to do this immediately, but maybe someone else here can. Thanks again!
You can sort the columns, then divide and round the inverse (rank) indices:
function Y = levels(X,nLevels)
% Assign each of the elements of X to an integer-valued level
[S,IX] = sort(X);
[grid1,grid2] = ndgrid(1:size(IX,1),1:size(IX,2));
invIX = zeros(size(X));
invIX(sub2ind(size(X),IX(:),grid2(:))) = grid1;
Y = ceil(invIX/size(X,1)*nLevels);
Or you can use tiedrank:
function Y = levels(X,nLevels)
% Assign each of the elements of X to an integer-valued level
R = tiedrank(X);
Y = ceil(R/size(X,1)*nLevels);
Surprisingly, both these solutions are slightly slower than the quantile+histc solution.