Contracting a tensor in Matlab

I am looking for a way to contract two indices of a tensor in Matlab.
Say I have a tensor of dimensions [17,10,17,12]. I am looking for a function that sums over the first and third dimensions with the same index, leaving a matrix of dimensions [10,12] (analogous to a trace in two dimensions).
I am currently studying tensor networks and I mainly use the functions "permute" and "reshape". If one is contracting multiple tensors and is not careful from the beginning, one might end up with a single tensor whose to-be-contracted indices have the form [i,j,i,k].
Of course one can go back and contract the tensors in a way such that this does not happen, but I'd nonetheless be interested in a more robust solution.
EDIT:
Something to the effect of:
A = rand(17,10,17,12);
A_contracted = zeros(10,12);
for i = 1:10
    for j = 1:12
        for k = 1:17
            A_contracted(i,j) = A_contracted(i,j) + A(k,i,k,j);
        end
    end
end

Here's a way to do it:
A_contracted = permute(sum( ...
A.*((1:size(A,1)).'==reshape(1:size(A,3), 1, 1, [])), [1 3]), [2 4 1 3]);
The above uses implicit expansion and the ability of sum to operate along multiple dimensions at once, which are relatively recent Matlab features (R2016b and R2018b, respectively). For older Matlab versions,
A_contracted = permute(sum(sum( ...
A.*bsxfun(@eq, (1:size(A,1)).', reshape(1:size(A,3), 1, 1, [])),1),3), [2 4 1 3]);
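As an aside for readers coming from Python, the same partial trace is a one-liner with NumPy's einsum; this is just an illustrative sketch of the contraction, not part of the Matlab answer:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((17, 10, 17, 12))

# Repeating the label 'k' for dimensions 1 and 3 sums the equal-index terms:
# A_contracted[i, j] = sum_k A[k, i, k, j]
A_contracted = np.einsum('kikj->ij', A)

# Loop reference mirroring the question's Matlab code
ref = np.zeros((10, 12))
for k in range(17):
    ref += A[k, :, k, :]

assert np.allclose(A_contracted, ref)
```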

[I feel like I'm starting to sound like a broken record...]
You should always implement your code as a loop first, then try to optimize using permute and reshape. But note that permute needs to copy data, so tends to increase the amount of work, rather than decrease it. Recent versions of MATLAB are no longer slow with loops, and thus copying data is no longer always a useful hack to speed up things.
For example, the loop in the question can be simplified to:
A_contracted = zeros(size(A,2),size(A,4));
for k = 1:size(A,1)
A_contracted = A_contracted + squeeze(A(k,:,k,:));
end
(I've also generalized to arbitrary sizes).
Comparing with Luis' answer, I see the vectorized method winning for small arrays such as the one in the OP (17x10x17x12) with 0.09 ms vs 0.19 ms. But with very small times all around it is likely not worth the effort. However, for larger arrays (I tried 17x100x17x120) I see the loop method winning 1.3 ms vs 2.6 ms.
The more data, the bigger the advantage to using just plain old loops. With 170x100x170x120 it is 0.04 s vs 0.45 s.
Test code:
A = rand(17,100,17,120);
assert(all(method2(A)==method1(A),'all'))
timeit(@()method1(A))
timeit(@()method2(A))
function A_contracted = method1(A)
A_contracted = permute(sum( ...
A.*((1:size(A,1)).'==reshape(1:size(A,3), 1, 1, [])), [1 3]), [2 4 1 3]);
end
function A_contracted = method2(A)
A_contracted = zeros(size(A,2),size(A,4));
for k = 1:size(A,1)
A_contracted = A_contracted + squeeze(A(k,:,k,:));
end
end

My professor suggested another solution (in the following denoted by method3) involving reshape and matrix multiplication.
take a unit matrix of the size of the contracted index
reshape it into a vector
reshape the tensor you want to contract accordingly
multiply the vector and the tensor
reshape the contracted tensor
Sample code comparing it to Luis' answer (method1) and Cris' answer (method2):
A = rand(17,10,17,10);
timeit(@()method1(A))
timeit(@()method2(A))
timeit(@()method3(A))
function A_contracted = method1(A)
A_contracted = permute(sum( ...
A.*((1:size(A,1)).'==reshape(1:size(A,3), 1, 1, [])), [1 3]), [2 4 1 3]);
end
function A_contracted = method2(A)
A_contracted = zeros(size(A,2),size(A,4));
for k = 1:size(A,1)
A_contracted = A_contracted + squeeze(A(k,:,k,:));
end
end
function A_contracted = method3(A)
sa_1 = size(A,1);
Unity = eye(size(A, 1));
Unity = reshape(Unity, [1,sa_1*sa_1]);
A1 = permute(A, [1,3,2,4]);
A2 = reshape(A1, [sa_1*sa_1, size(A1, 3)* size(A1,4)]);
UnA = Unity*A2;
A_contracted = reshape(UnA, [size(A1,3), size(A1,4)]);
end
method3 dominates for small dimensions by an order of magnitude over both method1 and method2, and beats method1 for larger dimensions as well, but for larger dimensions it is in turn beaten by the for loop (method2) by an order of magnitude.
method3 has the (somewhat personal) advantage of being more intuitive for the application in my physics course, in the sense that a contraction is not really performed on the tensor itself, but with respect to a metric. method3 may be easily adapted to incorporate this feature.

Pretty easy:
squeeze(sum(sum(a,3),1))
sum(a,n) sums over the nth dimension of the array, and squeeze removes any singleton dimensions. Note, however, that this sums over all combinations of the first and third indices, not just the terms where they are equal, so it is not the trace-like contraction asked for in the question.
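A quick numerical cross-check (sketched in NumPy purely for illustration) shows that summing over both dimensions is not the same as the equal-index contraction from the question:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((4, 3, 4, 2))

# Equal-index (trace-like) contraction: sum_k A[k, i, k, j]
contracted = np.einsum('kikj->ij', A)

# Full sum over the first and third dimensions, as in sum(sum(a,3),1)
summed = A.sum(axis=(0, 2))

assert contracted.shape == summed.shape == (3, 2)
# The full sum also includes all terms with unequal first/third indices,
# so the two results generally differ.
assert not np.allclose(contracted, summed)
```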

Related

Matlab: Vectorizing 4 nested for loops

So, I need to vectorize some for loops into a single line. I understand how to vectorize one and two for-loops, but am really struggling to do more than that. Essentially, I am computing a "blur" matrix M2 of size (n-2)x(m-2) from an original matrix M of size nxm, where s = size(M):
for x = 0:1
    for y = 0:1
        m = zeros(1, 9);
        k = 1;
        for i = 1:(s(1) - 1)
            for j = 1:(s(2) - 1)
                m(1, k) = M(i+x, j+y);
                k = k+1;
            end
        end
        M2(x+1, y+1) = mean(m);
    end
end
This is the closest I've gotten:
for x = 0:1
    for y = 0:1
        M2(x+1, y+1) = mean(mean(M((x+1):(3+x), (y+1):(3+y))))
    end
end
To get any closer to a one-line solution, it seems like there has to be some kind of "communication" where I assign two variables (x,y) to index over M2 and index over M; I just don't see how it can be done otherwise, but I am assured there is a solution.
Is there a reason why you are not using MATLAB's convolution function to help you do this? You are performing a blur with a 3 x 3 averaging kernel with overlapping neighbourhoods. This is exactly what convolution is doing. You can perform this using conv2:
M2 = conv2(M, ones(3) / 9, 'valid');
The 'valid' flag ensures that the returned matrix is smaller than M by 2 in each dimension, as you have requested.
In your code, you have hardcoded this for a 4 x 4 matrix. To double-check to see if we have the right results, let's generate a random 4 x 4 matrix:
rng(123);
M = rand(4, 4);
s = size(M);
If we run this with your code, we get:
>> M2
M2 =
0.5054 0.4707
0.5130 0.5276
Doing this with conv2:
>> M2 = conv2(M, ones(3) / 9, 'valid')
M2 =
0.5054 0.4707
0.5130 0.5276
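The same 'valid'-mode mean blur can be cross-checked in NumPy (a sketch using sliding_window_view, which plays the role of conv2's 'valid' mode here; the random matrix differs from the Matlab one, so the numbers will not match the output above):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(123)
M = rng.random((4, 4))

# Every fully-contained 3x3 neighbourhood, then its mean -> 'valid' blur
windows = sliding_window_view(M, (3, 3))   # shape (2, 2, 3, 3)
M2 = windows.mean(axis=(2, 3))             # shape (2, 2)

# Loop reference mirroring the hand-written version
ref = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        ref[i, j] = M[i:i+3, j:j+3].mean()

assert np.allclose(M2, ref)
```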
However, if you want to do this from first principles, the overlapping of the pixel neighbourhoods is very difficult to escape using loops. The two for loop approach you have is good enough and it tackles the problem appropriately. I would, however, derive the sizes from the input instead of hard-coding them. Therefore, write a function that does something like this:
function M2 = blur_fp(M)
    s = size(M);
    M2 = zeros(s(1) - 2, s(2) - 2);
    for ii = 2 : s(1) - 1
        for jj = 2 : s(2) - 1
            p = M(ii - 1 : ii + 1, jj - 1 : jj + 1);
            M2(ii - 1, jj - 1) = mean(p(:));
        end
    end
end
The first line of code defines the function, which we will call blur_fp. The next couple of lines determine the size of the input matrix and initialise a blank matrix to store the output. We then loop through each pixel location where the kernel fits entirely within the boundaries of the image. At each location we grab the 3 x 3 neighbourhood centred on that pixel, unroll it into a single column vector, take the average, and store it in the corresponding output position. For small kernels and relatively large matrices, this should perform OK.
To take this a little further, you can use user Divakar's im2col_sliding function which takes overlapping neighbourhoods and unrolls them into columns. Therefore, each column represents a neighbourhood which you can then blur the input using vector-matrix multiplication. You would then use reshape to reshape the result back into a matrix:
T = im2col_sliding(M, [3 3]);
V = ones(1, 9) / 9;
s = size(M);
M2 = reshape(V * T, s(1) - 2, s(2) - 2);
This unfortunately cannot be done in a single line unless you use built-in functions. I'm not sure what your intention is, but hopefully the gamut of approaches you have seen here has given you some insight on how to do this efficiently. BTW, using loops for small matrices (i.e. 4 x 4) may be better in efficiency. You will start to notice performance changes when you increase the size of the input... then again, I would argue that loops are competitive as of R2015b, when the JIT was significantly improved.

Extract data from multidimensional array into 2 dims based on index

I have a huge (1000000x100x7) matrix and I need to create a (1000000x100x1) matrix based on an index vector (100x1) which holds 1, 2, 3, 4, 5, 6 or 7 for each location.
I do not want to use loops
The problem (I think)
First, let me try create a minimum working example that I think captures what you want to do. You have a matrix A and an index vector index:
A = rand(1000000, 100, 7);
index = randi(7, [100, 1]);
And you would like to do something like this:
[I,J,K] = size(A);
B = zeros(I,J);
for i=1:I
    for j=1:J
        B(i,j) = A(i,j,index(j));
    end
end
Only you'd like to do so without the loops.
Linear indexing
One way to do this is by using linear indexing. This is kind of a tricky thing that depends on how the matrix is laid out in memory, and I'm going to do a rough job of explaining it, but you can also check out the documentation for the sub2ind and ind2sub functions.
Anyways, it means that given your (1,000,000 x 100 x 7) matrix stored in column-major format, you can refer to the same element in many different ways, i.e.:
A(i, j, k)
A(i, j + 100*(k-1))
A(i + 1000000*(j-1 + 100*(k-1)))
all refer to the same element of the matrix. Anyways, the punchline is:
linear_index = (1:J)' + J*(index-1);
B_noloop = A(:, linear_index);
And of course we should verify that this produces the same answer:
>> isequal(B, B_noloop)
ans =
1
Yay!
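For comparison, the same page-per-column selection can be sketched with NumPy's advanced indexing (0-based, and with a smaller first dimension than the question so it runs quickly); the index arrays broadcast so that B[i, j] = A[i, j, index[j]]:

```python
import numpy as np

I, J, K = 1000, 100, 7   # smaller first dimension than the question, for speed
rng = np.random.default_rng(0)
A = rng.random((I, J, K))
index = rng.integers(0, K, size=J)   # one page index per column, 0-based

# Advanced indexing: pick A[i, j, index[j]] for all i, j at once
B = A[:, np.arange(J), index]

# Loop reference
ref = np.empty((I, J))
for j in range(J):
    ref[:, j] = A[:, j, index[j]]

assert np.array_equal(B, ref)
```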
Performance vs. readability
So testing this on my computer, the nested loops took 5.37 seconds and the no-loop version took 0.29 seconds. However, it's kinda hard to tell what's going on in that code. Perhaps a more reasonable compromise would be:
B_oneloop = zeros(I,J);
for j=1:J
B_oneloop(:,j) = A(:,j,index(j));
end
which vectorizes the longest dimension of the matrix and thus gets most of the way there (0.43 seconds), but maintains the readability of the original code.

Need help in using bsxfun

I have two arrays in MATLAB:
A; % size(A) = [NX NY NZ 3 3]
b; % size(b) = [NX NY NZ 3 1]
In fact, at each point (i, j, k) of the three-dimensional domain I have two arrays, of sizes [3 3] and [3 1], obtained from the above-mentioned arrays A and b, respectively. For the sake of example, let's call these arrays m and n.
m; % size(m) = [3 3]
n; % size(n) = [3 1]
How can I solve m\n for each point of the domain in a vectorized fashion? I used bsxfun but was not successful.
solution = bsxfun( @(A,b) A\b, A, b );
I think the problem is with the expansion of the singleton elements and I don't know how to fix it.
I tried some solutions; it seems that a for loop is actually the fastest possibility in this case.
A naive approach looks like this:
% iterate
C = zeros(size(B));
for a=1:size(A,1)
    for b=1:size(A,2)
        for c=1:size(A,3)
            C(a,b,c,:) = squeeze(A(a,b,c,:,:))\squeeze(B(a,b,c,:));
        end
    end
end
The squeeze is expensive in computation time, because it needs some advanced indexing. Swapping the dimensions instead is faster.
A=permute(A,[4,5,1,2,3]);
B=permute(B,[4,1,2,3]);
C2=zeros(size(B));
for a=1:size(A,3)
    for b=1:size(A,4)
        for c=1:size(A,5)
            C2(:,a,b,c) = (A(:,:,a,b,c))\(B(:,a,b,c));
        end
    end
end
C2=permute(C2,[2,3,4,1]);
The second solution is about 5 times faster.
/Update: I found an improved version. Reshaping and using only one large loop increases the speed again. This version is also suitable for use with the Parallel Computing Toolbox; in case you own it, replace the for with a parfor and start the workers.
A=permute(A,[4,5,1,2,3]);
B=permute(B,[4,1,2,3]);
%linearize A and B to get a better performance
linA=reshape(A,[size(A,1),size(A,2),size(A,3)*size(A,4)*size(A,5)]);
linB=reshape(B,[size(B,1),size(B,2)*size(B,3)*size(B,4)]);
C3=zeros(size(linB));
for a=1:size(linA,3)
C3(:,a)=(linA(:,:,a))\(linB(:,a));
end
%undo linearization
C3=reshape(C3,size(B));
%undo dimension swap
C3=permute(C3,[2,3,4,1]);
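As a side note, NumPy's np.linalg.solve broadcasts over any leading dimensions, so the analogous batched solve needs no loop at all; a small sketch (with made-up grid sizes) of the same computation:

```python
import numpy as np

NX, NY, NZ = 4, 3, 2     # small illustrative grid
rng = np.random.default_rng(0)
A = rng.random((NX, NY, NZ, 3, 3))
b = rng.random((NX, NY, NZ, 3, 1))

# solve broadcasts over the leading (NX, NY, NZ) dimensions:
# every 3x3 system on the grid is solved in a single call.
x = np.linalg.solve(A, b)            # shape (NX, NY, NZ, 3, 1)

# Loop reference, one small solve per grid point
ref = np.empty_like(x)
for i in range(NX):
    for j in range(NY):
        for k in range(NZ):
            ref[i, j, k] = np.linalg.solve(A[i, j, k], b[i, j, k])

assert np.allclose(x, ref)
```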

Matlab: element 3D matrices multiplication

I have two matrices: B with size 9x100x51 and K with size 34x9x100. I want to multiply all of K(34) with each one of B(9) so as to have a final matrix G with size 34x9x100x51.
For example: the element G(:,5,60,25) is composed as follows:
G(:,5,60,25)=K(:,5,60)*B(5,60,25).
I hope that the example helps to understand what I want to do.
Thank you
Any time you find yourself writing nested loops in matlab, there's a good chance you can speed up quite a bit using the built-in vectorized forms of the functions. The code ends up being quite a bit shorter typically too (but often less immediately clear to a reader, so comment your code!).
In this case, does avoiding the nested loops make a difference? Absolutely! Let's get to work. @slayton has provided a 3-loop solution. We can get faster.
Restating the problem a bit, B has 51 9x100 matrices and K has 34 9x100 matrices. For each combination of 51x34, you want to element-wise multiply the respective 9x100 matrices from B and K.
Element-wise multiplication is a great job for bsxfun, so we can conceptually reduce this problem to working along two dimensions (the third dimension of B, first dimension of K):
Initial, two-loop solution:
B = rand(9,100,51);
K = rand(34,9,100);
G = nan(34,9,100,51);
for b=1:size(B,3)
    for k=1:size(K,1)
        G(k,:,:,b) = bsxfun(@times, B(:,:,b), squeeze(K(k,:,:)));
    end
end
Ok, two loops is making progress. Can we do better? Well, let's recognize that the matrices B and K can be replicated along the appropriate dimensions, then element-wise multiplied all at once.
B = rand(9,100,51);
K = rand(34,9,100);
B2 = repmat(permute(B,[4 1 2 3]), [size(K,1) 1 1 1]);
K2 = repmat(K, [1 1 1 size(B,3)]);
G = bsxfun(@times, B2, K2);
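For comparison, in NumPy the same replicate-and-multiply collapses into pure broadcasting, with no explicit repmat at all (an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.random((9, 100, 51))
K = rng.random((34, 9, 100))

# Insert singleton axes so the shapes line up as
# K: (34, 9, 100, 1) and B: (1, 9, 100, 51); broadcasting does the rest.
G = K[:, :, :, None] * B[None, :, :, :]
assert G.shape == (34, 9, 100, 51)

# Spot-check the element definition G[:, i, j, k] = K[:, i, j] * B[i, j, k]
assert np.allclose(G[:, 4, 59, 24], K[:, 4, 59] * B[4, 59, 24])
```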
So, how do the solutions compare speed-wise? I tested them on the Octave online utility, and didn't include the time to generate the initial B and K matrices. I did include the time to preallocate the G matrix for the solutions that needed preallocation. The code is below.
3 loops (@slayton's answer): 4.024471 s
2 loop solution: 1.616120 s
0-loop repmat/bsxfun solution: 1.211850 s
0-loop repmat/bsxfun solution, no temporaries: 0.605838 s
Caveat: The timing may depend quite a bit on your machine, I wouldn't trust the online utility for great timing tests. Changing the order of when the loops were executed (even taking care not to reuse variables and mess up time of allocation) did change things a bit, namely the 2-loop solution was sometimes as fast as the no-loop solution with temporaries stored. However, the more vectorized you can get, the better you will be.
Here's the code for the speed test:
B = rand(9,100,51);
K = rand(34,9,100);
tic
G1 = nan(34,9,100,51);
for ii = 1:size(B,1)
    for jj = 1:size(B,2)
        for kk = 1:size(B,3)
            G1(:, ii, jj, kk) = K(:,ii,jj) .* B(ii,jj,kk);
        end
    end
end
t=toc;
printf('Time for 3 loop solution: %f\n', t)
tic
G2 = nan(34,9,100,51);
for b=1:size(B,3)
    for k=1:size(K,1)
        G2(k,:,:,b) = bsxfun(@times, B(:,:,b), squeeze(K(k,:,:)));
    end
end
t=toc;
printf('Time for 2 loop solution: %f\n', t)
tic
B2 = repmat(permute(B,[4 1 2 3]), [size(K,1) 1 1 1]);
K2 = repmat(K, [1 1 1 size(B,3)]);
G3 = bsxfun(@times, B2, K2);
t=toc;
printf('Time for 0-loop repmat/bsxfun solution: %f\n', t)
tic
G4 = bsxfun(@times, repmat(permute(B,[4 1 2 3]), [size(K,1) 1 1 1]), repmat(K, [1 1 1 size(B,3)]));
t=toc;
printf('Time for 0-loop repmat/bsxfun solution, no temporaries: %f\n', t)
disp('Are the results equal?')
isequal(G1,G2)
isequal(G1,G3)
Time for 3 loop solution: 4.024471
Time for 2 loop solution: 1.616120
Time for 0-loop repmat/bsxfun solution: 1.211850
Time for 0-loop repmat/bsxfun solution, no temporaries: 0.605838
Are the results equal?
ans = 1
ans = 1
You can do this with nested loops, although it probably won't be terribly fast:
B = rand(9,100,51);
K = rand(34,9,100);
G = nan(34,9,100,51);
for ii = 1:size(B,1)
    for jj = 1:size(B,2)
        for kk = 1:size(B,3)
            G(:, ii, jj, kk) = K(:,ii,jj) .* B(ii,jj,kk);
        end
    end
end
It's been a long day and my brain is a bit fried; kudos to anyone who can improve this!

How to perform a column by column circular shift of a matrix without a loop

I need to circularly shift individual columns of a matrix.
This is easy if you want to shift all the columns by the same amount, however, in my case I need to shift them all by a different amount.
Currently I'm using a loop and if possible I'd like to remove the loop and use a faster, vector based, approach.
My current code
A = randi(2, 4, 2);
B = A;
for i = 1:size(A, 2)
    d = randi(size(A, 1));
    B(:,i) = circshift(A(:,i), [d, 0]);
end
Is is possible to remove the loop from this code?
Update: I tested all three methods and compared them to the loop described in this question. I timed how long it took to execute a column-by-column circular shift on a 1000x1000 matrix 100 times, and repeated this test several times.
Results:
My loop took more than 12 seconds
Pursuit's suggestion took less than a second
Zroth's original answer took just over 2 seconds
Ansari's suggestion was slower than the original loop
Edit
Pursuit is right: Using a for-loop and appropriate indexing seems to be the way to go here. Here's one way of doing it:
[m, n] = size(A);
D = randi([0, m - 1], [1, n]);
B = zeros(m, n);
for i = (1 : n)
    B(:, i) = [A((m - D(i) + 1 : m), i); A((1 : m - D(i)), i)];
end
Original answer
I've looked for something similar before, but I never came across a good solution. A modification of one of the algorithms used here gives a slight performance boost in my tests:
[m, n] = size(A);
mtxLinearIndices ...
    = bsxfun(@plus, ...
             mod(bsxfun(@minus, (0 : m - 1)', D), m), ...
             (1 : m : m * n));
C = A(mtxLinearIndices);
Ugly? Definitely. Like I said, it seems to be slightly faster (2--3 times faster for me); but both algorithms are clocking in at under a second for m = 3000 and n = 1000 (on a rather old computer, too).
It might be worth noting that, for me, both algorithms seem to outperform the algorithm provided by Ansari, though his answer is certainly more straightforward. (Ansari's algorithm's output does not agree with the other two algorithms for me; but that could just be a discrepancy in how the shifts are being applied.) In general, arrayfun seems pretty slow when I've tried to use it. Cell arrays also seem slow to me. But my testing might be biased somehow.
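The index-arithmetic trick above carries over directly to NumPy, where it can be checked column by column against np.roll (a sketch with 0-based indexing):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 4
A = rng.integers(1, 3, size=(m, n))
D = rng.integers(0, m, size=n)       # per-column downward shift amounts

# Row index arithmetic: column j is shifted down by D[j], wrapping around,
# so B[i, j] = A[(i - D[j]) % m, j].
rows = (np.arange(m)[:, None] - D[None, :]) % m
B = A[rows, np.arange(n)]

# Reference: np.roll applied to each column separately
ref = np.column_stack([np.roll(A[:, j], D[j]) for j in range(n)])
assert np.array_equal(B, ref)
```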
Not sure how much faster this would be, but you could try this:
[nr, nc] = size(A);
B = arrayfun(@(i) circshift(A(:, i), randi(nr)), 1:nc, 'UniformOutput', false);
B = cell2mat(B);
You'll have to benchmark it, but using arrayfun may speed it up a little bit.
I suspect your circular shift operations on the random integer matrix do not make it any more random, since the numbers are uniformly distributed.
So I hope you are using randi() for demonstration purposes only.