Translating np.einsum() to MATLAB

I am having trouble understanding the documentation of np.einsum(). How are subscripts interpreted?
I am trying to write np.einsum('a...c,b...c', Y, conj(Y)), where Y is an array of shape (C, F, T) in the original Python code. Also, due to earlier implementation differences, my MATLAB Y has size [F, T, C].
What does 'a...c,b...c' index in each component? I am confused.
How do I write the same instructions in MATLAB?

Quoting from the einsum documentation page:
To enable and control broadcasting, use an ellipsis. Default NumPy-style broadcasting is done by adding an ellipsis to the left of each term, like np.einsum('...ii->...i', a). To take the trace along the first and last axes, you can do np.einsum('i...i', a), or to do a matrix-matrix product with the left-most indices instead of rightmost, one can do np.einsum('ij...,jk...->ik...', a, b).
Later, an example is given:
>>> a = np.arange(25).reshape(5,5)
>>> np.einsum('...j->...', a)
array([ 10, 35, 60, 85, 110])
The equivalent MATLAB code to this example would be:
>> a = reshape(0:24, [5,5]).';
>> sum(a,2).'
ans =
10 35 60 85 110
Several things to note:
The ellipsis operator (...) should not be understood as "range", but as "whatever needs to be there".
"Broadcasting" refers to automatic replication of an array along the relevant dimension so that the mathematical operation is defined. This feature has existed in MATLAB since R2016b (where it is called "implicit expansion").
You may notice several transpose operations (.') in the MATLAB equivalent. This is because numpy arrays are row-major while MATLAB arrays are column-major. Practically, while the underlying data has the same sequential order, a numpy array appears transposed compared to MATLAB. The transpositions were done so that arrays appear the same in intermediate stages.
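The row-major versus column-major point can be checked from the NumPy side. The following is a minimal sketch, assuming that reshape with order='F' is an acceptable stand-in for MATLAB's column-major reshape; the variable names m and row_sums are illustrative only.

```python
import numpy as np

# Row-major fill, as in the NumPy example above.
a = np.arange(25).reshape(5, 5)

# Column-major fill, mimicking MATLAB's reshape(0:24, [5,5]).
m = np.arange(25).reshape(5, 5, order='F')

# The column-major array is the transpose of the row-major one,
# which is why the MATLAB code needs the extra .' operations.
print(np.array_equal(m.T, a))

# '...j->...' sums over the last axis, i.e. sum(a, 2) in MATLAB.
row_sums = np.einsum('...j->...', a)
print(row_sums)
```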
Another example from these docs is:
>>> a = np.arange(6).reshape((3,2))
>>> b = np.arange(12).reshape((4,3))
>>> np.einsum('ki,jk->ij', a, b)
array([[10, 28, 46, 64],
[13, 40, 67, 94]])
>>> np.einsum('ki,...k->i...', a, b)
array([[10, 28, 46, 64],
[13, 40, 67, 94]])
>>> np.einsum('k...,jk', a, b)
array([[10, 28, 46, 64],
[13, 40, 67, 94]])
Which can be written as follows in MATLAB:
A = reshape(0:5, [2 3]).';
B = reshape(0:11, [3 4]).';
A.' * B.'
permute(sum( permute(A, [3 1 2]) .* B,2), [3 1 2])
shiftdim(sum( shiftdim(A, -1) .* B, 2), 2)
Several things to note:
When going from np.einsum('ki,jk->ij', a, b) to np.einsum('ki,...k->i...', a, b), you can see that the j-th dimension is replaced with .... Both of these examples contain a ->, which means they are in explicit mode.
When going from np.einsum('ki,jk->ij', a, b) to np.einsum('k...,jk', a, b), you can see that now the i-th dimension is replaced with .... The omission of ->...j simply demonstrates implicit mode (where the output dimensions are ordered alphabetically).
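Applying the same reading to the expression from the question: in 'a...c,b...c' the ellipsis stands for the middle (F) axis, c is repeated and therefore summed over, and implicit mode puts the broadcast axis first, followed by a and b. The sketch below checks this against a fully named explicit form. The sizes are small made-up values, and the second check assumes the [F, T, C] layout mentioned in the question.

```python
import numpy as np

rng = np.random.default_rng(0)
C, F, T = 2, 3, 4  # small illustrative sizes (assumed, not from the question)
Y = rng.standard_normal((C, F, T)) + 1j * rng.standard_normal((C, F, T))

# Implicit mode: '...' matches the F axis, 'c' (the T axis) is summed,
# and the output axes are ordered (F, a, b).
implicit = np.einsum('a...c,b...c', Y, np.conj(Y))

# The same contraction with every axis named explicitly.
explicit = np.einsum('aft,bft->fab', Y, np.conj(Y))

# The same data stored as (F, T, C), as in the MATLAB layout from the question.
Y_ftc = np.transpose(Y, (1, 2, 0))
from_ftc = np.einsum('fta,ftb->fab', Y_ftc, np.conj(Y_ftc))

# For each f this is the C-by-C matrix Y[:, f, :] @ Y[:, f, :]^H.
per_f = np.stack([Y[:, f, :] @ Y[:, f, :].conj().T for f in range(F)])

print(implicit.shape)
```

In other words, the result holds one C-by-C matrix per value of F, each being the product of a C-by-T slice with its own conjugate transpose.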

Related

Optimizing tensor multiplications

I've got a real-time image processing program I'm trying to optimize, and it all boils down to matrix multiplications. Consider 3 tensors I'm calculating in the initialization stage:
A = np.arange(35 * 51 * 59).reshape([35, 51, 59])
B = np.arange(37 * 51 * 51 * 59).reshape([37, 51, 51, 59])
C = np.arange(59 * 27).reshape([59, 27])
Each frame, I get new data in the form of a fourth tensor:
M = np.arange(35 * 37 * 59).reshape([35, 37, 59])
Currently, I'm calculating D = np.einsum('xyf,xtf,ytpf,fr->tpr', M, A, B, C), where D is my desired result, and it's the major bottleneck of the program. There are two directions I'm trying to follow in order to optimize it.
First, I tried coming up with a tensor T, a function of A, B, C, that I can pre-calculate, so that it all boils down to D = np.tensordot(M, T, axes=..). I wasn't successful. I spent a lot of time on it; is it even possible at all?
Moreover, the program itself is written in MATLAB. As it doesn't have a built-in tensor multiplication function (an einsum or tensordot equivalent), I'm currently using the tprod toolbox, and doing:
temp1 = etprod('dcb', A, 'abc', M, 'adc');
temp2 = etprod('dbc', B, 'abcd', temp1, 'adb');
D = etprod('cdb', C, 'ab', temp2, 'acd');
As the default dot product function in MATLAB (for 2D matrices) is much faster than etprod, I thought about reshaping A, B, C, D to 2D arrays so that I would be able to multiply 2D matrices using the default function, without hand-written for loops. I wasn't successful with that either.
Any thoughts? thanks!
If this operation is done many times with different values of M we could define
D0 = np.einsum('xtf,ytpf,fr->tprxyf', A, B, C)
The whole operation could be broken into binary steps:
D0=np.einsum('xtf,ytpf->xyptf',A,B)
D0=np.einsum('xyptf,fr->tprxyf',D0,C)
D=np.einsum('tprxyf,xyf->tpr',D0,M)
The final operation uses D0 and M and can be coded as a matrix-vector operation. In MATLAB it would be
D=reshape(D0,[],numel(M))*M(:);
which could then be reshaped as desired.
We could write this order as (((A,B),C),M)
It might be better, however, to use (((M,C),A),B)
D=np.einsum('xyf,fr->xyfr',M,C)
D=np.einsum('xyfr,xtf->ytfr',D,A)
D=np.einsum('ytfr,ytpf->tpr',D,B)
This ordering of operations has intermediate arrays with only 4 indices rather than one with 6. If each operation is much faster than the single one this may be an advantage.
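Both orderings can be checked against the single four-operand einsum. The sketch below uses small made-up sizes so that it runs quickly (the sizes in the question would make the precomputed six-index array very large). The names step, pairwise, and precomputed are illustrative, and optimize=True is included only as one more option worth timing, not as a claim about which variant is fastest.

```python
import numpy as np

# Small stand-in sizes (assumed); the index letters follow the question.
X, Y, T, P, F, R = 3, 4, 5, 5, 6, 2
rng = np.random.default_rng(0)
M = rng.standard_normal((X, Y, F))
A = rng.standard_normal((X, T, F))
B = rng.standard_normal((Y, T, P, F))
C = rng.standard_normal((F, R))

# Reference: the original single call.
reference = np.einsum('xyf,xtf,ytpf,fr->tpr', M, A, B, C)

# Ordering (((M,C),A),B): no intermediate has more than 4 indices.
step = np.einsum('xyf,fr->xyfr', M, C)
step = np.einsum('xyfr,xtf->ytfr', step, A)
pairwise = np.einsum('ytfr,ytpf->tpr', step, B)

# Ordering (((A,B),C),M): precompute D0 once, then one matrix-vector product per frame.
D0 = np.einsum('xtf,ytpf,fr->tprxyf', A, B, C)
precomputed = (D0.reshape(T * P * R, -1) @ M.reshape(-1)).reshape(T, P, R)

# Let NumPy choose a pairwise contraction order itself.
optimized = np.einsum('xyf,xtf,ytpf,fr->tpr', M, A, B, C, optimize=True)

print(np.allclose(reference, pairwise), np.allclose(reference, precomputed))
```

Which variant wins in practice depends on the real sizes and on memory, so each would need to be timed on the actual data.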

Monotonicity with the interp1 function

Sometimes when I use the interp1 function in MATLAB, it throws an error saying my vectors need to be monotonically increasing, and other times it doesn't.
For example, let's say I have 3 vectors.
A = [286, 295, 298, 301, 304, 308, 310, 324, 330, 335];
B = [31000, 30950, 30875, 30775, 30650, 30500, 30425, 29900, 29675, 29450];
C = [290, 291, 292, 293, 294, 295, 296, 297, 298, 299];
And I want to run
D = interp1(A,B,C);
This function will return successfully even though B is not monotonically increasing. Does the monotonicity only apply to the first and third vectors passing into the equation?
What the error message actually means
The error is a little misleading in this case: it is raised when the values in A are not all unique (that is, A is not strictly monotonic). The less useful message actually propagates up from griddedInterpolant, which is used by many interpolation functions and therefore has a generic error message.
Why it only applies to some inputs
With interp1 you are essentially attempting to construct an estimate of a function f(x) using x locations provided by the user as well as their corresponding values (f(x)). In your example, A contains the location of each data point (x) and B contains the values of your function at each of those points (f(x)). It is only necessary that the locations (values in A) are unique so that you don't have multiple values in B for the same value of A. If you did, interp1 doesn't know how to cope with that.
The ordering of A (the monotonically increasing part of the error) doesn't matter because interp1 will automatically sort A to be increasing (it also re-arranges B so that the values still correspond to A).*
C is simply the locations at which you want to sample the interpolant. You can request the value of the function at the same point a million times with no issue. interp1 will simply return the corresponding value for each location in C so there are no constraints on the values or ordering of C.
A = [1 3 2]; % Not monotonically increasing but DOES need to be unique values
B = [1 2 1]; % Can have any value and can repeat values but each
% value corresponds with each element in A
C = [3 3 1 1 2 2]; % Can be any order and can repeat values
% ERROR FREE!
interp1(A, B, C)
% 2 2 1 1 1 1
*If you do want the ordering of your A and B points to be respected, then you'll want to parameterize your input data in a different way as suggested in this answer
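The sort-then-interpolate behaviour described above can be imitated in NumPy as a rough analogue. This is only a sketch: np.interp does not sort its sample points for you, so the sort is written out explicitly here, and the variable names order and D are illustrative.

```python
import numpy as np

A = np.array([1, 3, 2])           # sample locations: unsorted, but unique
B = np.array([1, 2, 1])           # function values at those locations
C = np.array([3, 3, 1, 1, 2, 2])  # query points: any order, repeats allowed

# Sort the locations and reorder the values so the pairs stay matched.
order = np.argsort(A)
D = np.interp(C, A[order], B[order])

print(D)
```

The result matches the values in the MATLAB example above: 2 at x = 3, and 1 at x = 1 and x = 2.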

Applying arrayfun on n-dimensional matrices

I need your help in solving the following problem:
How can I generalize the following for any n-dimensional array:
reshape(arrayfun(@(x,y)sprintf('%d,%d',x,y),C{:},'un',0),size(M));
M is my matrix and C is my matrix of indices of M.
Thanks in advance.
The problem isn't the number of dimensions in the arguments to arrayfun per se, but the number of arguments themselves - which happens in your example to correspond to the number of dimensions each argument has. You therefore need to pass it a function that accepts varargin, which still works on an anonymous function:
reshape(arrayfun(@(varargin)sprintf(strjoin(repmat({'%d'},size(varargin)),','),varargin{:}),C{:},'un',0),size(M));
This function gave me a lot of headache.
For functions like
f1 = @(x1,x2) x1*x2
You can do
output = arrayfun(f1,x1,x2);
where x1 and x2 are input columns.
However, if you're doing a generalized program, where f1 could have any number of inputs and you need a generalized input matrix like X, you'll need, for example
f1 = @(x1,x2,x3,x4,x5) 2*x1+4*x2+10*x3+0.2*x4+x5;
cols = num2cell(X,1); % one cell per column of X
output = arrayfun(f1,cols{:});
where X represents a matrix with 5 columns representing x1 through x5
For example:
X = [1, 2, 3, 4, 5;
6, 7, 8, 9, 0;];

Converting Matrix into an array

I have a matrix that is size(A) = 20x301088 and another vector linear_index which is 301088x1.
I need to convert A into an array that is 97x97x32x20. But it has to be in a certain order, the vector linear_index contains the linear indices of a 97x97x32 in a specific order.
For example, the element at A(20,4) should be put into linear_index(4) of B(:,:,:,20). Hopefully that makes sense. Each row of A will make its own 97x97x32 matrix, and the elements will be placed at the indices specified by the value in linear_index.
I have done it once but it requires the shiftdim command:
B(1:length(lx) , linear_index) = A(1:length(lx) , :);
B = shiftdim(B,1);
This works, but the shiftdim command takes a bit of time, especially as the size of my matrices can go up to 97x97x32x194.
How about
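>> B = zeros( numel(linear_index), size(A,1) ); % one column per row of A
>> B(linear_index,:) = A.'; % place A(r,k) at linear index linear_index(k) of volume r
>> B = reshape( B, 97, 97, 32, [] );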

Matlab "for loop" to create a matrix

I have a fairly large vector named blender. I have extracted n elements for which blender is greater than x (irrelevant). Now my difficulty is the following:
I am trying to create a (21 x n) matrix with each element of blender plus 10 things before, and the 10 things after.
element=find(blender >= 120);
I have been trying variations of the following:
for i=element(1:end)
Matrix(i)= Matrix(blender(i-10:i+10));
end
then I want to plot one column of the matrix at the time when I hit Enter.
This second part I can figure out later, but I would appreciate some help making the Matrix
Thanks
First, you can use "logical indexing" of your array, which uses a logical expression to address your vector. With blender = [2, 302, 35, 199, 781, 312, 8], it could look like this:
>> b_hi = blender(blender>=120)
b_hi =
302 199 781 312
Second, you can concatenate arrays like in b_padded = [1, 2, b_hi, 3, 4]. If b_hi was a column vector, you'd use semicolons instead of commas.
Third, there is a function reshape that allows you to turn the resulting vector into a matrix. doc reshape will tell you details. For example, to turn b_padded into a 2-by-4 matrix,
>> b_matrix = reshape(b_padded, 2, 4)
b_matrix =
1 302 781 3
2 199 312 4
will do. This means you can do the whole job without any for-loop. Note that transposing the result of reshape(b_padded, 4, 2) will give you the other possible 2-by-4 matrix. You obtain the transpose of a matrix A by A'. You will find out which one you want.
You need to create a new matrix, and use two indices so that MATLAB knows it is assigning to a column in a 2D matrix.
NewMatrix = zeros(21, length(element));
for i = 1:length(element)
k = element(i);
% copy the 21 samples centred on k (assumes 10 < k <= numel(blender) - 10)
NewMatrix(:,i) = blender(k-10:k+10);
end