Efficient Multiplication of Matrices with Large Numbers of Zeroes - matlab

I have two arrays that take the following form:
      0 … 0   0 0 0   0 … 0           0 … 0   0 0 0   0 … 0
      ⋮   ⋮   ⋮ ⋮ ⋮   ⋮   ⋮           ⋮   ⋮   ⋮ ⋮ ⋮   ⋮   ⋮
      0 … 0   0 0 0   0 … 0           0 … 0   0 0 0   0 … 0
A =   0 … 0   1 2 3   0 … 0     B =   0 … 0   9 8 7   0 … 0
      0 … 0   4 5 6   0 … 0           0 … 0   6 5 4   0 … 0
      0 … 0   0 0 0   0 … 0           0 … 0   0 0 0   0 … 0
      ⋮   ⋮   ⋮ ⋮ ⋮   ⋮   ⋮           ⋮   ⋮   ⋮ ⋮ ⋮   ⋮   ⋮
      0 … 0   0 0 0   0 … 0           0 … 0   0 0 0   0 … 0
The size of the non-zero areas of A and B may not be exactly the same, but the diagram above is already getting a bit unwieldy.
Ultimately, the value I'm after is sum(sum(A .* B)). I feel like there must be a way to only multiply the non-zero elements, but every approach I can come up with seems to cause MATLAB to make a copy of the matrix, which utterly destroys any gains made by reducing the number of operations. B is reused for several iterations of the inner loop, so I can amortize expensive calculations on B over many loop iterations.
I've tried the following approaches so far:
Naive Approach:
function C = innerLoop(A, B)
C = sum(sum(A .* B))
end
innerLoop takes about 4.3 seconds over 86,000 calls using this. (Based on MATLAB's "Run and Time" functionality.)
Shrinking B First:
function B = resize(self, B)
rows = abs(sum(B, 2)) > 1e-4;
top = find(rows, 1, 'first');
bot = find(rows, 1, 'last');
cols = abs(sum(B, 1)) > 1e-4;
left = find(cols, 1, 'first');
right = find(cols, 1, 'last');
self.Rows = top:bot; % Store in class properties for use in inner loop
self.Cols = left:right; % Store in class properties for use in inner loop
B = B(top:bot, left:right);
end
function C = innerLoop(A, B)
result = A(self.Rows, self.Cols) .* B;
C = sum(sum(result));
end
My hope with this approach was that MATLAB would realize that I wasn't writing to A and elide the copy, but this approach spends about 6.8 seconds in innerLoop.
I also tried only calculating the offsets outside innerLoop, in the hope that MATLAB might pick up on the fact that I'm using the same subscripts on both matrices and optimize accordingly:
function B = resize(self, B)
rows = abs(sum(B, 2)) > 1e-4;
top = find(rows, 1, 'first');
bot = find(rows, 1, 'last');
cols = abs(sum(B, 1)) > 1e-4;
left = find(cols, 1, 'first');
right = find(cols, 1, 'last');
self.Rows = top:bot; % Store in class properties for use in inner loop
self.Cols = left:right; % Store in class properties for use in inner loop
end
function C = innerLoop(A, B)
result = A(self.Rows, self.Cols) .* B(self.Rows, self.Cols);
C = sum(sum(result));
end
Unfortunately this was the slowest yet at about 8.6 seconds.
I also tried looping with the following code:
function C = innerLoop(A, B)
C = 0;
for i = self.Rows
for j = self.Cols
C = C + A(i, j) * B(i, j);
end
end
end
I know that looping used to be very slow in MATLAB, but I've read some papers indicating that it is much faster than it used to be. That said, if the loop version ever finishes running, I'll let you know how long it took, but it's well over a couple minutes by now.
Any suggestions on how to optimize this would be greatly appreciated!

You can use sparse matrices for this problem. Matlab handles different sizes of the «non-sparse-part» automatically. To get a sparse matrix, the sparse-function is used. After that you can do the element-wise multiplication and then sum all elements of C in a separate line.
A = [0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 0 1 2 3 0 0;
0 0 4 5 6 0 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0];
B = [0 0 0 0 0 0 0;
0 0 0 0 0 0 0;
0 0 9 8 7 0 0;
0 0 6 5 4 0 0;
0 0 0 0 0 0 0;
0 0 0 0 0 0 0];
A = sparse(A);
B = sparse(B);
C = A .* B;
sum(C(:))
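Since the question says B is reused across many inner-loop iterations, another option is to extract B's non-zero pattern once and reuse it. The following is only a sketch under that assumption; the names idx and bVals are illustrative, and whether it beats the sparse version depends on matrix size and MATLAB release, so time it on real data.

```matlab
% Sample data matching the shape in the question.
A = zeros(6, 7);  A(3:4, 3:5) = [1 2 3; 4 5 6];
B = zeros(6, 7);  B(3:4, 3:5) = [9 8 7; 6 5 4];

% Done once per B: linear indices and values of B's non-zero entries.
idx   = find(B);     % column vector of linear indices
bVals = B(idx);      % matching non-zero values (column vector)

% Done for every A: gather the same entries of A and take a dot product.
C = A(idx).' * bVals;
```

Only the entries where B is non-zero can contribute to sum(sum(A .* B)), so the dot product gives the same value as the naive expression.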

This is a rewrite of my initial post
I stand corrected. I don't know what went wrong in my former test. I thought it might have been a difference between the 32-bit and 64-bit implementations of the sparse algorithm, but that is not the case. After carefully re-running the benchmark on 2 different machines, the sparse method beats them all.
Benchmark code:
function ExecTimes = bench_sum_sparse
nOrder = (1:9).' * 10.^(2:3) ; nOrder = [nOrder(:) ; (1:2).'*1e4] ;
%// nOrder = (1:30)*100 ;
npt = numel(nOrder) ;
ExecTimes = zeros( npt , 3 ) ;
fprintf('\n%s%5d \n','Calculating for N = ',0) ;
for k = 1:npt
% // sample data
N = nOrder(k) ;
fprintf('\b\b\b\b\b\b%5d\n',N) ; % // display progress
A = zeros(N) ;
B = A ;
innerMat = (1:10).'*(1:10) ; %'
ixInnerMat = innerMat + N/2 - 5 ;
A(ixInnerMat) = innerMat ;
B(ixInnerMat) = innerMat ;
% // benchmark
f1 = @() innerLoop(A,B) ;
ExecTimes(k,1) = timeit( f1 ) ;
clear f1
f2 = @() sum_logicIndex(A, B) ;
ExecTimes(k,2) = timeit( f2 ) ;
clear f2
A = sparse(A);
B = sparse(B);
f3 = @() sum_sparse(A,B) ;
ExecTimes(k,3) = timeit( f3 ) ;
clear f3
%// checksum1 = f1() - f2 ()
%// checksum2 = f1() - f3 ()
end
end
function C = innerLoop(A, B)
C = sum(sum(A .* B)) ;
end
function C = sum_logicIndex(A,B)
idx = A>0 & B>0 ;
C = sum(sum(A(idx).*B(idx))) ;
end
function C = sum_sparse(A,B)
C = A .* B;
C = sum(C(:)) ;
end
All tests ran on Matlab 2013b
64 bit Machine : Intel I7-3820 @ 3.6GHz - 16 GB RAM - Windows 7
32 bit Machine : Intel E2200 @ 2.2GHz - 3GB RAM - Windows 8.1

Related

How can I randomize two binary vectors that are similar while making sure all possible combinations are respected?

Been trying to solve this simple problem for a while but just couldn't find the solution for the life of me...
I'm programming an experiment in PsychToolbox but I'll spare you the details, basically I have two vectors A and B of equal size with the same number of ones and zeroes:
A = [0 0 1 1]
B = [0 0 1 1]
Both vectors A and B must be randomized independently but in such a way that one combination of items between the two vectors is never repeated. That is, I must end up with this
A = [1 1 0 0]
B = [1 0 0 1]
or this:
A = [0 0 1 1]
B = [0 1 0 1]
but I should never end up with this:
A = [1 1 0 0]
B = [1 1 0 0]
or this
A = [0 1 0 1]
B = [0 1 0 1]
One way to determine this is to check the sum of items between the two vectors A+B, which should always contain only one 2 or only one 0:
A = [1 1 0 0]
B = [1 0 0 1]
A+B = 2 1 0 1
Been trying to make this a condition within a 'while' loop (e.g. so long as the number of zeroes in the vector obtained by A+B is greater than 1, keep randomizing A and B), but either it still produces repeated combinations or it just never stops looping. I know this is a trivial problem but I just can't get my head around it somehow. Anyone care to help?
This is a simplified version of the script I got:
A = [1 1 0 0];
B = A;
ARand = randperm(length(A));
A = A(ARand);
BRand = randperm(length(B));
B = B(BRand);
while nnz(~(A+B)) > 1
ARand = randperm(length(A));
A = A(ARand);
BRand = randperm(length(B));
B = B(BRand);
end
Still, I end up with repeated combinations.
% If you are only looking for an answer to this scenario the easiest way is
% as follows:
A = [0 0 1 1];
B = [0 0 1 1];
nn = length(A);
keepset = [0 0 1 1;0 1 0 1];
keepset = keepset(:,randperm(nn))
% If you want a more general solution for arbitrary A & B (for example)
A = [0 0 0 1 1 1 2 2 2];
B = [0 0 0 1 1 1 2 2 2];
nn = length(A);
Ai = A(randperm(nn));
Bi = B(randperm(nn));
% initialize keepset with the first combination of A & B
keepset = [Ai(1);Bi(1)];
loopcnt = 0;
while (size(keepset,2) < nn)
% randomize the elements in A and B independently
Ai = A(randperm(nn));
Bi = B(randperm(nn));
% test each combination of Ai and Bi to see if it is already in the
% keepset
for ii = 1:nn
tstcombo = [Ai(ii);Bi(ii)];
matchtest = bsxfun(@eq,tstcombo,keepset);
matchind = find((matchtest(1,:) & matchtest(2,:)));
if isempty(matchind)
keepset = [keepset tstcombo];
end
end
loopcnt = loopcnt + 1;
if loopcnt > 1000
disp('Execution halted after 1000 attempts')
break
elseif (size(keepset,2) >= nn)
disp(sprintf('Completed in %0.f iterations',loopcnt))
end
end
keepset
It's much more efficient to permute the combinations randomly than shuffling the arrays independently and handling the inevitable matching A/B elements.
There are lots of ways to generate all possible pairs, see
How to generate all pairs from two vectors in MATLAB using vectorised code?
For this example I'll use
allCombs = combvec([0,1],[0,1]);
% = [ 0 1 0 1
% 0 0 1 1 ]
Now you just want to select some amount of unique (non-repeating) columns from this array in a random order. In all of your examples you select all 4 columns. The randperm function is perfect for this, from the docs:
p = randperm(n,k) returns a row vector containing k unique integers selected randomly from 1 to n inclusive.
n = size(allCombs,2); % number of combinations (or columns) to choose from
k = 4; % number of columns to choose for output
AB = allCombs( :, randperm(n,k) ); % random selection of pairs
If you need this split into two variables then you have
A = AB(1,:);
B = AB(2,:);
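combvec ships with a toolbox rather than base MATLAB (the Neural Network / Deep Learning Toolbox, as far as I know), so here is a sketch of the same idea using only ndgrid and randperm. The variable names a and b are illustrative.

```matlab
% Build all (A, B) pairs from the two value sets, one pair per column.
[a, b] = ndgrid([0 1], [0 1]);
allCombs = [a(:).'; b(:).'];           % 2-by-4, each column is one pair

% Shuffle the columns: every pair appears exactly once, in random order.
n  = size(allCombs, 2);
AB = allCombs(:, randperm(n));

A = AB(1, :);
B = AB(2, :);
```

Because whole columns are permuted, no combination can repeat and no retry loop is needed.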
Here's a possible solution:
A = [0 0 1 1];
B = [0 0 1 1];
% Randomize A and B independently
ARand = randperm(length(A));
A = A(ARand);
BRand = randperm(length(B));
B = B(BRand);
% Keep randomizing A and B until the condition is met
while nnz(A+B == 2) ~= 1 || nnz(A+B == 0) ~= 1
ARand = randperm(length(A));
A = A(ARand);
BRand = randperm(length(B));
B = B(BRand);
end
This solution checks that A+B contains exactly one 2 (the pair 1,1) and exactly one 0 (the pair 0,0), which for these four-element vectors means each of the four combinations appears exactly once. Note that sum(A+B) cannot be used as the test: it always equals 4 here, however the vectors are shuffled. If the condition is not met, the vectors A and B are randomized again.
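Whatever loop condition is used, the requirement itself (no pair of items occurs twice) can be tested directly by counting distinct columns of [A; B]. This is a minimal sketch; the variable names are illustrative.

```matlab
% An acceptable shuffle: all four (A(k), B(k)) pairs are different.
A = [1 1 0 0];
B = [1 0 0 1];
pairs = [A; B].';                                   % one row per pair
noRepeats = size(unique(pairs, 'rows'), 1) == numel(A);

% A rejected shuffle: the pairs (1,1) and (0,0) each occur twice.
Abad = [1 1 0 0];
Bbad = [1 1 0 0];
hasRepeats = size(unique([Abad; Bbad].', 'rows'), 1) < numel(Abad);
```

The same test works for vectors with more than two distinct values, where counting zeros and twos in A+B no longer applies.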

How to generate a customized checker board matrix as fast as possible?

I need a function that creates a checker board matrix with M rows and N columns of P*Q rectangles. I modified the third solution from here to get that:
function [I] = mycheckerboard(M, N, P, Q)
nr = M*P;
nc = N*Q;
i = floor(mod((0:(nc-1))/Q, 2));
j = floor(mod((0:(nr-1))/P, 2))';
r = repmat(i, [nr 1]);
c = repmat(j, [1 nc]);
I = xor(r, c);
It works with no problem:
I=mycheckerboard(2, 3, 4, 3)
I =
0 0 0 1 1 1 0 0 0
0 0 0 1 1 1 0 0 0
0 0 0 1 1 1 0 0 0
0 0 0 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1
1 1 1 0 0 0 1 1 1
1 1 1 0 0 0 1 1 1
1 1 1 0 0 0 1 1 1
But it's not fast enough since there are lots of calls of this function in a single run. Is there a faster way to get the result? How can I remove floating point divisions and/or calls of the floor function?
Your code is fairly fast for small matrices, but becomes less so as the dimensions get larger. Here's a one-liner using bsxfun and imresize (requires Image Processing toolbox that most have):
m = 2;
n = 3;
p = 4;
q = 3;
I = imresize(bsxfun(@xor, mod(1:m, 2).', mod(1:n, 2)), [p*m q*n], 'nearest')
Or, inspired by @AndrasDeak's use of kron, this is faster with R2015b:
I = kron(bsxfun(@xor, mod(1:m, 2).', mod(1:n, 2)), ones(p, q))
For a small bit more speed, the underlying code for kron can be simplified by taking advantage of the structure of the problem:
A = bsxfun(@xor, mod(1:m, 2).', mod(1:n, 2));
A = permute(A, [3 1 4 2]); % 1-by-m-by-1-by-n
B = ones(p, 1, q);
I = reshape(bsxfun(@times, A, B), [m*p n*q]); % p-by-m-by-q-by-n collapsed to (m*p)-by-(n*q)
or as one (long) line:
I = reshape(bsxfun(@times, permute(bsxfun(@xor, mod(1:m, 2).', mod(1:n, 2)), [3 1 4 2]), ones(p, 1, q)), [m*p n*q]);
I suggest first creating a binary matrix for the checkerboard's fields, then using the built-in kron to blow it up to the necessary size:
M = 2;
N = 3;
P = 4;
Q = 3;
[iM,iN] = meshgrid(1:M,1:N);
A = zeros(M,N);
A(mod(iM.'+iN.',2)==1) = 1;
board = kron(A,ones(P,Q))
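On the question of avoiding the division and the floor call: one possible sketch computes the parity per block (M and N values only) and then repeats each value, so no per-pixel arithmetic is needed. It assumes repelem is available (R2015a or later); rowParity and colParity are illustrative names.

```matlab
M = 2; N = 3; P = 4; Q = 3;

% Parity of each block row / block column, repeated P and Q times.
rowParity = repelem(mod(0:M-1, 2).', P);   % (M*P)-by-1 column
colParity = repelem(mod(0:N-1, 2), Q);     % 1-by-(N*Q) row

% A cell is set where exactly one of the two parities is odd.
I = bsxfun(@xor, rowParity, colParity);    % logical (M*P)-by-(N*Q)
```

This produces the same board as mycheckerboard(2, 3, 4, 3) above; whether it is actually faster should be checked with timeit for the sizes in use.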

Vectorization of multiple embedded for loops

I have the following code, which uses 3 nested for loops to create an upper triangular matrix. I plan on running it on a large data set many times and want to make it as computationally efficient as possible.
data = magic(3);
n = size(data,1);
W = zeros(n,n);
for i = 1:n
for j = i:n
if i==j
W(i,j)=0;
else
for k = 1:n
temp(1,k) = (data(i,k)-data(j,k))^2;
sumTemp = sumTemp + temp(1,k);
end
W(i,j)=sqrt(sumTemp);
end
temp = 0;
sumTemp = 0;
end
end
Answer should look like:
[0 6.4807 9.7980
0 0 6.4807
0 0 0]
I am working it hard right now, but figure I would throw it out there in case anyone has any suggestions that would save me hours of fiddling around.
This is what I have at the moment:
data = magic(3);
n = size(data,1);
W = zeros(n,n);
for i = 1:n
for j = i+1:n
W(i,j)= norm(data(i,:)-data(j,:))
%W(i,j)= sqrt(sum((data(i,:)-data(j,:)).^2));
end
end
What I did:
* vectorized the inner loop
* removed www, which is unused
* changed the 2nd loop to start at i+1, because nothing is done for i=j
* replaced sqrt(sum((a-b).^2)) with norm(a-b)
And now the "full" vectorization:
data = magic(3);
n = size(data,1);
W = zeros(n,n);
tri=triu(ones(n,n),1)>0;
[i,j]=find(tri);
W(tri)=arrayfun(@(i,j)norm(data(i,:)-data(j,:)),i,j)
Here is a straightforward solution with bsxfun:
Wfull = sqrt(squeeze(sum(bsxfun(@minus,data,permute(data,[3 2 1])).^2,2)))
W = triu(Wfull)
Use this where data is N-by-D, where N is the number of points and D is dimensions. For example,
>> data = magic(3);
>> triu(sqrt(squeeze(sum(bsxfun(@minus,data,permute(data,[3 2 1])).^2,2))))
ans =
0 6.4807 9.7980
0 0 6.4807
0 0 0
>> data = magic(5); data(:,end-1:end)=[]
data =
17 24 1
23 5 7
4 6 13
10 12 19
11 18 25
>> triu(sqrt(squeeze(sum(bsxfun(@minus,data,permute(data,[3 2 1])).^2,2))))
ans =
0 20.8087 25.2389 22.7376 25.4558
0 0 19.9499 19.0263 25.2389
0 0 0 10.3923 18.3576
0 0 0 0 8.5440
0 0 0 0 0
>>
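The bsxfun version builds an N-by-D-by-N intermediate array, which can get large. A possible alternative, sketched here and not taken from the answers above, expands the squared distance as |x|^2 + |y|^2 - 2*x'*y so that only N-by-N arrays are created. The names sq and D2 are illustrative.

```matlab
data = magic(3);

sq = sum(data.^2, 2);                                % squared norm of each row
D2 = bsxfun(@plus, sq, sq.') - 2 * (data * data.');  % squared pairwise distances
D2 = max(D2, 0);                                     % clamp small negatives from round-off

W = triu(sqrt(D2), 1);                               % keep the strict upper triangle
```

For non-integer data this form can lose a little precision compared with subtracting the rows directly, which is why the clamp is there.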

MATLAB matrix not formatting correctly

I have some code below, and I can't seem to get the matrices formatted correctly. I have been trying to get the matrices to look more professional (close together) with \t and fprintf, but can't seem to do so. I am also having some trouble putting titles on each column of the matrix. Any help would be much appreciated!
clear all
clc
format('bank')
% input file values %
A = [4 6 5 1 0 0 0 0 0; 7 8 4 0 1 0 0 0 0; 6 5 9 0 0 1 0 0 0; 1 0 0 0 0 0 -1 0 0; 0 1 0 0 0 0 0 -1 0; 0 0 1 0 0 0 0 0 -1];
b = [480; 600; 480; 24; 20; 25];
c = [3000 4000 4000 0 0 0 0 0 0];
% Starting xb %
xb = [1 2 3 4 5 6]
% Starting xn %
xn = [7 8 9]
cb = c(xb)
cn = c(xn)
% Get B from A %
B = A(:,xb)
% Get N from A %
N = A(:,xn)
% Calculate z %
z = ((cb*(inv(B))*A)-c)
% Calculate B^(-1) %
Binv = inv(B)
% Calculate RHS of row 0 %
RHS0 = cb*Binv*b
% Calculates A %
A = Binv*A
%STARTING Tableau%
ST = [z RHS0;A b]
for j=1:A
fprintf(1,'\tz%d',j)
end
q = 0
while q == 0
m = input('what is the index value of the ENTERING variable? ')
n = input('what is the index value of the LEAVING variable? ')
xn(xn==m)= n
xb(xb==n) = m
cb = c(xb)
cn = c(xn)
B = A(:,xb)
N = A(:,xn)
Tableuz = (c-(cb*(B^(-1))*A))
RHS0 = (cb*(B^(-1))*b)
TableuA = ((B^(-1))*A)
Tableub = ((B^(-1))*b)
CT = [Tableuz RHS0; TableuA Tableub];
disp(CT)
q = input('Is the tableau optimal? Y-1, N-0')
end
I didn't dig into what you are doing really deeply, but a few pointers:
* Put semicolons at the end of lines you don't want printing to the screen--it makes it easier to see what is happening elsewhere.
* Your for j=1:A loop only prints j. I think what you want is more like this:
for row = 1:size(A,1)
for column = 1:size(A,2)
fprintf('%10.2f', A(row,column));
end
fprintf('\n');
end
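For the column titles the question also asks about, the header can use the same field width as the numbers so that everything lines up. This is only a sketch: the 'x1', 'x2', ... labels and the small stand-in matrix are assumptions, not taken from the original tableau.

```matlab
A = [4 6 5; 7 8 4];                 % small stand-in for the tableau
nCols = size(A, 2);

% Header: one right-aligned label per column, 10 characters wide.
names  = arrayfun(@(k) sprintf('x%d', k), 1:nCols, 'UniformOutput', false);
header = sprintf('%10s', names{:});

% Body: fprintf/sprintf consume data column by column, so pass A.' to print rows.
body = sprintf([repmat('%10.2f', 1, nCols) '\n'], A.');

fprintf('%s\n%s', header, body);
```

Building the row format with repmat avoids the nested loops and keeps the width in one place.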
If you haven't used the Matlab debugger yet, give it a try; it makes a lot of these problems easier to spot. All you have to do to start it is to add a breakpoint to the file by clicking on the dash(-) next to the line numbers and starting the script. Quick web searches can turn up the solution very quickly too--someone else has usually already had any problem you're going to run into.
Good luck.
Try using num2str with a format argument of your desired precision. It's meant for converting matrices to strings. (note: this is different than mat2str which serializes matrices so they can be deserialized with eval)

How can I generate the following matrix in MATLAB?

I want to generate a matrix that is "stairsteppy" from a vector.
Example input vector: [8 12 17]
Example output matrix:
[1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1]
Is there an easier (or built-in) way to do this than the following?:
function M = stairstep(v)
M = zeros(length(v),max(v));
v2 = [0 v];
for i = 1:length(v)
M(i,(v2(i)+1):v2(i+1)) = 1;
end
You can do this via indexing.
A = eye(3);
B = A(:,[zeros(1,8)+1, zeros(1,4)+2, zeros(1,5)+3])
Here's a solution without explicit loops:
function M = stairstep(v)
L = length(v); % M will be
V = max(v); % an L x V matrix
M = zeros(L, V);
% create indices to set to one
idx = zeros(1, V);
idx(v + 1) = 1;
idx = cumsum(idx) + 1;
idx = sub2ind(size(M), idx(1:V), 1:V);
% update the output matrix
M(idx) = 1;
EDIT: fixed bug :p
There's no built-in function I know of to do this, but here's one vectorized solution:
v = [8 12 17];
N = numel(v);
M = zeros(N,max(v));
M([0 v(1:N-1)]*N+(1:N)) = 1;
M(v(1:N-1)*N+(1:N-1)) = -1;
M = cumsum(M,2);
EDIT: I like the idea that Jonas had to use BLKDIAG. I couldn't help playing with the idea a bit until I shortened it further (using MAT2CELL instead of ARRAYFUN):
C = mat2cell(ones(1,max(v)),1,diff([0 v]));
M = blkdiag(C{:});
A very short version of a vectorized solution
function out = stairstep(v)
% create lists of ones
oneCell = arrayfun(@(x)ones(1,x),diff([0,v]),'UniformOutput',false);
% create output
out = blkdiag(oneCell{:});
You can use ones to define the places where you have 1's:
http://www.mathworks.com/help/techdoc/ref/ones.html
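One more loop-free variant, sketched under the assumption that v is a strictly increasing row vector of positive integers: compare each column index against the start and end of every step. The names cols and lo are illustrative.

```matlab
v = [8 12 17];

cols = 1:max(v);              % column indices 1..17
lo   = [0 v(1:end-1)];        % each step starts just after the previous one ends

% Row k is true for columns in the range lo(k)+1 .. v(k).
M = bsxfun(@gt, cols, lo.') & bsxfun(@le, cols, v.');
```

The result is a logical matrix; wrap it in double(M) if a numeric matrix is required.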