Extracting and storing non-zero entries in MATLAB - matlab

Could anyone help me build and correct my code which aims to only save the non-zero elements of an arbitrary square matrix and its index? Basically I need to write a script that does the same function as 'sparse' in MATLAB.
`%Consider a 3x3 matrix
A=[ 0 0 9 ;-1 8 0;0 -5 0 ];
n=3; %size of matrix
%initialise following arrays:
RI= zeros(n,1); %row index
CI = zeros(n,1); %column index
V = zeros(n,1); %value in the matrix
for k = 1:n %row 1 to n
for j = 1:n %column 1 to n
if A(k,j)~=0
RI(k)=k;
CI(j)=j;
V(k,j)=A(k,j);
end
end
end`

You could use the find function to find all the non-zero elements.
So,
[RI, CI, V] = find(A);
% 2 1 -1
% 2 2 8
% 3 2 -5
% 1 3 9
EDIT :
I realize from your comments that your goal was to learn coding in Matlab and you might be wondering why your code didn't work as expected. So let me try to explain the issue along with an example code that is similar to yours.
% Given:
A=[ 0 0 9 ;-1 8 0;0 -5 0 ];
Firstly, instead of manually specifying the size as n = 3, I'd recommend using the built-in size function.
sz = size(A);
% note that this contains 2 elements:
% [number of rows, number of columns]
Next, to initialize the arrays RI, CI and V we would like to know their sizes. Since we do not know the number of non-zero elements to start with, we have two options: (1) choose a large number that is guaranteed to be equal to or greater than the number of non-zero elements, for example prod(sz) (why is that true?); (2) do not initialize them at all and let Matlab allocate memory dynamically as required. I'd follow the second option in the code below.
% we'll keep a count of non-zero elements as we find them
numNZ = 0; % this will increment every time a non-zero element is found
for iCol = 1:sz(2) %column 1 to end
for iRow = 1:sz(1) %row 1 to end
if A(iRow,iCol)~=0
numNZ = numNZ + 1;
RI(numNZ,1) = iRow; % the explicit ",1" grows RI as a column vector
CI(numNZ,1) = iCol;
V(numNZ,1) = A(iRow,iCol);
end
end
end
disp([RI, CI, V])
% 2 1 -1
% 2 2 8
% 3 2 -5
% 1 3 9
Makes sense?
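For comparison, option (1) might look like the sketch below. It assumes the same A as above; maxNZ is a name introduced here for illustration. The arrays are preallocated to the upper bound prod(sz) and trimmed once the loop has finished.

```matlab
A = [0 0 9; -1 8 0; 0 -5 0];
sz = size(A);
maxNZ = prod(sz);            % upper bound: every element could be non-zero
RI = zeros(maxNZ,1);         % preallocate column vectors
CI = zeros(maxNZ,1);
V  = zeros(maxNZ,1);
numNZ = 0;
for iCol = 1:sz(2)
    for iRow = 1:sz(1)
        if A(iRow,iCol) ~= 0
            numNZ = numNZ + 1;
            RI(numNZ) = iRow;
            CI(numNZ) = iCol;
            V(numNZ)  = A(iRow,iCol);
        end
    end
end
% discard the unused tail of each array
RI = RI(1:numNZ);
CI = CI(1:numNZ);
V  = V(1:numNZ);
disp([RI, CI, V])
```

Preallocating avoids growing the arrays inside the loop, at the cost of temporarily holding three arrays as large as A itself.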

So I think we've established that the point of this is to learn an unfamiliar programming language. The simplest solution is to use sparse itself but that gives you no insight into programming. Nor does find, which can be used similarly.
Now, we could go the same route you've started using: procedural for and if over each row and each column. Could be almost any programming language, but for a few quirks of punctuation. But what you'll find, even if you do correct the mistakes (like the fact that n should be the number of non-zero entries, not the number of rows) is that this is a very slow way of doing numerical work in Matlab.
Here's another (still inefficient, but less so) way which will hopefully provide some insight into the "vectorized" way of doing things, which is one of the things that makes Matlab as powerful as it is:
function [RI, CI, V] = mysparse(A) % first: use functions!
[nRows, nCols] = size(A);
[allRowIndices, allColIndices] = ndgrid(1:nRows, 1:nCols) % let's leave the semicolon off so you can see for yourself what it does.
% It's very similar to `meshgrid` which you'll see more often (it's heavily used in Matlab graphics)
% but `ndgrid` is "simpler" in that it's more in tune with the fundamental conventions of Matlab (rows, then columns)
isNonZero = A ~= 0; % this gives you a "logical array" which is a very powerful thing: it can be used as a subscript to select elements from another array, in one shot...
RI = allRowIndices(isNonZero); % like this
CI = allColIndices(isNonZero); % or this
V = A(isNonZero); % or even this
RI = RI(:); % have to do this explicitly, because the lines above return a column vector in most cases but a row vector when A itself is a row vector
CI = CI(:);
V = V(:);
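As a quick sanity check, the same logical-indexing idea can be run as a plain script on the 3x3 matrix from the question and compared against find. This is only a sketch; the names r, c, mask and T are chosen here for illustration.

```matlab
A = [0 0 9; -1 8 0; 0 -5 0];
[r, c] = ndgrid(1:size(A,1), 1:size(A,2));  % row and column index of every element
mask = A ~= 0;                              % logical array marking the non-zeros
T = [r(mask), c(mask), A(mask)];            % one [row col value] triplet per row
disp(T)
```

Logical indexing walks the matrix in column-major order, which is why the triplets come out in the same order as those returned by find.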

I will go with an N x 3 matrix, where N is the number of non-zero elements in the matrix.
% Define a matrix A as follows:
A = randi([0 1],[4 4])
for i=1:16
if A(i) ~= 0
A(i) = rand;
end
end
[row,col] = find(A);
elms = A(A~=0); % MATLAB always works in column-major order and is consistent,
% so no need to use sub2ind to access elements given by find
newSparse_A = [row col elms];
Output:
newSparse_A =
1.0000 1.0000 0.9027
2.0000 1.0000 0.9448
3.0000 1.0000 0.4909
1.0000 2.0000 0.4893
2.0000 2.0000 0.3377
4.0000 2.0000 0.9001
>> sparse(A)
ans =
(1,1) 0.9027
(2,1) 0.9448
(3,1) 0.4909
(1,2) 0.4893
(2,2) 0.3377
(4,2) 0.9001
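To confirm that such a triplet list carries the same information as the built-in sparse format, it can be fed back into sparse. The sketch below uses the fixed 3x3 matrix from the question rather than random data, so the result is reproducible.

```matlab
A = [0 0 9; -1 8 0; 0 -5 0];
[row, col, elms] = find(A);                 % triplets of the non-zero entries
triplets = [row col elms];
% rebuild a sparse matrix from the triplets, giving the full size explicitly
S = sparse(triplets(:,1), triplets(:,2), triplets(:,3), size(A,1), size(A,2));
disp(isequal(full(S), A))                   % displays 1 if the round trip is lossless
```

Passing the matrix size explicitly matters when the last rows or columns of A are entirely zero, because the triplets alone cannot reveal them.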

Related

Octave Coding - I need help coding coefficients of polynomial

This question is fairly easy to do manually; however, I am struggling to write it in code.
There is a quartic polynomial:
P(x)=ax^4+bx^3+cx^2+dx+e
There is also a given matrix M:
5 0 -1 2 9
-2 -1 0 1 2
The first row gives P(x) and the second row gives the corresponding value of x.
Using the information in matrix M, find the coefficients:
a, b, c, d, e
I know how to work this out manually: substitute each column into P(x) and solve the resulting equations simultaneously to obtain a value for each coefficient, or put them in a matrix.
I have an idea of what to do, but I don't know how to code it.
I do believe the last line would be linearsolve(M(,1),M(,2)) and thus be able to obtain each coefficient but I have no idea how to get to that line.
Welcome J Cheong
% Values of y and x (note: in your post you had 2 values at x = 1, I assumed that was an accident)
M = [5 0 -1 2 9 ; -2 -1 0 1 2];
% Separate for clarity
y = M(1,:);
x = M(2,:);
% Fit to highest order polynomial model
f = fit(x',y',['poly', num2str(length(y)-1)])
% Extract coefficients
coeff = coeffvalues(f);
% Plotting
X = linspace(min(x)-1, max(x) + 1, 1000) ;
plot(x,y,'.',X,f(X))
Edit
Sorry, I'm using Matlab. Looking at the Octave documentation, you should be able to get the coefficients using
p = polyfit(x,y,length(y)-1)';
Then to display the coefficients the way you specified try this
strcat(cellstr(char(96+(1:length(p))')), { ' = ' } , cellstr(num2str(p)))
y=[5 0 -1 2 9];
x=[-2 -1 0 1 2];
P=polyfit(x,y,2)
gives
P =
2.0000 1.0000 -1.0000
These are your coefficients c, d and e; the others (a and b) are zero. You can check the result:
polyval(P, x)
ans =
5.0000e+00 2.2204e-16 -1.0000e+00 2.0000e+00 9.0000e+00
which gives you y
Btw, you can solve this very fast just inside your head without calculator because the function values for x=0 and x=+/-1 are very easy to calculate.
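The "solve simultaneously" idea from the question can also be written directly as a linear system. The sketch below assumes the five points in M are distinct in x, builds the Vandermonde matrix with vander (available in both MATLAB and Octave), and solves it with the backslash operator. The variable names are illustrative.

```matlab
M = [5 0 -1 2 9; -2 -1 0 1 2];
y = M(1,:).';              % values of P(x) as a column
x = M(2,:).';              % corresponding x values as a column
Vm = vander(x);            % columns are x.^4, x.^3, x.^2, x, 1
coeffs = Vm \ y;           % [a; b; c; d; e], approximately [0; 0; 2; 1; -1]
disp(coeffs.')
```

Each row of Vm is one substituted equation a*x^4 + b*x^3 + c*x^2 + d*x + e = P(x), so the backslash solve does the manual elimination in one step.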

Interpolate matrices for different times in Matlab

I have computed variables stored in a matrix for a specific time vector.
Now I want to interpolate between those whole matrices for a new time vector to get the matrices for the desired new time vector.
I've come up with the following solution, but it seems clunky and computationally demanding:
clear all;
a(:,:,1) = [1 1 1;2 2 2;3 3 3]; % Matrix 1
a(:,:,2) = [4 4 4;6 6 6;8 8 8]; % Matrix 2
t1 = [1 2]; % Old time vector
t2 = [1 1.5 2]; % New time vector
% Interpolation for each matrix element
for r = 1:1:size(a,1) % rows
for c = 1:1:size(a,2) % columns
tab(:) = a(r,c,:);
tabInterp(r,c,:) = interp1(t1,tab(:),t2);
end
end
The result for the new time t = 1.5 is, and should be:
[2.5000 2.5000 2.5000
4.0000 4.0000 4.0000
5.5000 5.5000 5.5000]
Any thoughts?
You can do the linear interpolation manually, and all at once...
m = ( t2 - t1(1) ) / ( t1(2) - t1(1) );
% Linear interpolation using the standard 'y = m*x + c' linear structure
tabInterp = reshape(m,1,1,[]) .* (a(:,:,2)-a(:,:,1)) + a(:,:,1);
This will work for any size t2, as long as t1 has 2 elements.
If you have a t1 with more than 2 elements, you can create the scaling vector m using interp1. This is relatively efficient because you're only using interp1 for your time vector, not the matrix:
m = interp1( t1, (t1-min(t1))/(max(t1)-min(t1)), t2, 'linear', 'extrap' );
This uses implicit expansion with the .* operation, which requires R2016b or newer. If you have an older MATLAB version then use bsxfun for the same functionality.
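For releases before R2016b, the same two-point interpolation might be written with bsxfun as sketched below. It reuses the data from the question; d is a helper name introduced here.

```matlab
a = cat(3, [1 1 1; 2 2 2; 3 3 3], [4 4 4; 6 6 6; 8 8 8]);
t1 = [1 2];                                  % old time vector
t2 = [1 1.5 2];                              % new time vector
m = (t2 - t1(1)) / (t1(2) - t1(1));          % fraction of the way from t1(1) to t1(2)
d = a(:,:,2) - a(:,:,1);                     % change between the two matrices
% bsxfun expands the 1x1xK weights against the RxC matrices explicitly
tabInterp = bsxfun(@plus, bsxfun(@times, reshape(m,1,1,[]), d), a(:,:,1));
disp(tabInterp(:,:,2))
```

The second page of tabInterp corresponds to t = 1.5 and should match the matrix shown in the question.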
I don't really see a problem with a loop based approach, but if you're looking for a loopless method you can do the following.
[rows, cols, ~] = size(a);
aReshape = reshape(a, rows*cols, []).';
tabInterp = reshape(interp1(t1, aReshape, t2).', rows, cols, []);
Looking at the source code for interp1 it appears a for loop is being used anyway so I doubt this will result in any performance gain.

Linspace applied on array [duplicate]

Given an array like a = [ -1; 0; 1];. For each a(i), I need to compute a linearly spaced vector with linspace(min(a(i),0),max(a(i),0),3);, where each linspace-vector should be stored into a matrix:
A = [-1 -0.5 0;
0 0 0;
0 0.5 1];
With a for loop, I can do this like so:
for i=1:3
A(i,:) = linspace(min(a(i),0),max(a(i),0),3);
end
How can I achieve this without using loops?
The fastest way I can think of is to calculate the step sizes and construct the vectors from them using implicit expansion.
a = [ -1; 0; 1];
n = 3;
stepsizes = (max(a,0)-min(a,0))/(n-1);
A = min(a,0) + (0:(n-1)).*stepsizes;
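A small check of this construction against the plain loop, as a sketch with the example data from the question (Aloop is a name introduced here; implicit expansion needs R2016b or newer):

```matlab
a = [-1; 0; 1];
n = 3;
stepsizes = (max(a,0) - min(a,0)) / (n-1);   % one step size per row
A = min(a,0) + (0:(n-1)) .* stepsizes;       % column .* row expands to a matrix
% reference result, one linspace call per element of a
Aloop = zeros(numel(a), n);
for i = 1:numel(a)
    Aloop(i,:) = linspace(min(a(i),0), max(a(i),0), n);
end
disp(max(abs(A(:) - Aloop(:))))              % largest deviation between the two
```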
Timeit:
A couple of timeit results (use timeit(@SO) and uncomment the block to be timed):
function SO()
n = 1e3;
m = 1e5;
a = randi(9,m,1)-4;
% %Wolfie
% aminmax = [min(a, 0), max(a,0)]';
% A = interp1( [0,1], aminmax, linspace(0,1,n) )';
% %Nicky
% stepsizes = (max(a,0)-min(a,0))/(n-1);
% A = min(a,0) + (0:(n-1)).*stepsizes;
% %Loop
% A = zeros(m,n);
% for i=1:m
% A(i,:) = linspace(min(a(i),0),max(a(i),0),n);
% end
%Arrayfun:
A = cell2mat(arrayfun(@(x) linspace(min(x,0),max(x,0),n),a,'UniformOutput',false));
Then the times are:
Wolfie: 2.2243 s
Mine: 0.3643 s
Standard loop: 1.0953 s
arrayfun: 2.6298 s
Take a = [ -1; 0; 1]. Create the min / max array:
aminmax = [min(a, 0), max(a,0)].';
Now use interp1
N = 3; % Number of interpolation points.
b = interp1( [0,1], aminmax, linspace(0,1,N) ).';
>> b =
-1.0000 -0.5000 0
0 0 0
0 0.5000 1.0000
One of the possible solutions is to use arrayfun that applies a function to each element of the array. You also want to convert your results into a matrix, since the output is in the cell array. Since the output of the arrayfun is non-scalar, you have to turn off uniform output.
cell2mat(arrayfun(@(x) linspace(min(x,0),max(x,0),3),a,'UniformOutput',false))
Edit: I performed some testing using the tic-toc method on arrays of length 100000. I found that the solution with arrayfun takes approximately 1.5 times longer than the one you suggested with for loops.
The fastest approach would be to calculate what you need using matrix-vector operation. For example, if you only need to calculate linspace with 3 elements, you can use something like:
[min(a(:),0), (max(a(:),0)+min(a(:),0))/2 ,max(a(:),0)];
You can generalize this method for any number of elements in linspace function (not necessarily just 3). Note that readability will suffer, as the volume of code will increase:
j=4; % number of elements in each linspace
b=zeros(size(a,1),j); % create a solution matrix of size Nxj
b(:,1)=min(a(:),0); % first column
b(:,end)=max(a(:),0); % last column
temp=b(:,end)-b(:,1); % span between the first and the last column
for i=2:j-1
b(:,i)=b(:,1)+temp*(i-1)/(j-1); % fill in intermediate columns
end
Note, that in this method I loop over the number of elements in each linspace, but not through the array a. With small j (like j=3 in your example) this will work way faster compared to the method with looping over the array a (if you consider large arrays like a=rand(100000,1)).

Add a diagonal of zeros to a matrix in MATLAB

Suppose I have a matrix A of dimension Nx(N-1) in MATLAB, e.g.
N=5;
A=[1 2 3 4;
5 6 7 8;
9 10 11 12;
13 14 15 16;
17 18 19 20 ];
I want to transform A into an NxN matrix B, just by adding a zero diagonal, i.e.,
B=[ 0 1 2 3 4;
5 0 6 7 8;
9 10 0 11 12;
13 14 15 0 16;
17 18 19 20 0];
This code does what I want:
B_temp = zeros(N,N);
B_temp(1,:) = [0 A(1,:)];
B_temp(N,:) = [A(N,:) 0];
for j=2:N-1
B_temp(j,:)= [A(j,1:j-1) 0 A(j,j:end)];
end
B = B_temp;
Could you suggest an efficient way to vectorise it?
You can do this with upper and lower triangular parts of the matrix (triu and tril).
Then it's a 1 line solution:
B = [tril(A,-1) zeros(N, 1)] + [zeros(N,1) triu(A)];
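Applied to the 5x4 example from the question, the one-liner can be checked as follows. This is a sketch; A is rebuilt with reshape purely for brevity.

```matlab
N = 5;
A = reshape(1:N*(N-1), N-1, N).';      % rows are [1 2 3 4], [5 6 7 8], ...
% the strictly lower part keeps its columns, the upper part shifts right by one
B = [tril(A,-1) zeros(N,1)] + [zeros(N,1) triu(A)];
disp(B)
```

Entries below the diagonal of A stay in their column, while entries on or above it move one column to the right, which leaves exactly the main diagonal of B empty.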
Edit: benchmark
This is a comparison of the loop method, the 2 methods in Sardar's answer, and my method above.
Benchmark code, using timeit for timing and directly lifting code from question and answers:
function benchie()
N = 1e4; A = rand(N,N-1); % Initialise large matrix
% Set up anonymous functions for input to timeit
s1 = @() sardar1(A,N); s2 = @() sardar2(A,N);
w = @() wolfie(A,N); u = @() user3285148(A,N);
% timings
timeit(s1), timeit(s2), timeit(w), timeit(u)
end
function sardar1(A, N) % using eye as an indexing matrix
B=double(~eye(N)); B(find(B))=A.'; B=B.';
end
function sardar2(A,N) % similar to sardar1, but avoiding slow operations
B=1-eye(N); B(logical(B))=A.'; B=B.';
end
function wolfie(A,N) % using triangular parts of the matrix
B = [tril(A,-1) zeros(N, 1)] + [zeros(N,1) triu(A)];
end
function user3285148(A, N) % original looping method
B = zeros(N,N); B(1,:) = [0 A(1,:)]; B(N,:) = [A(N,:) 0];
for j=2:N-1; B(j,:)= [A(j,1:j-1) 0 A(j,j:end)]; end
end
Results:
Sardar method 1: 2.83 secs
Sardar method 2: 1.82 secs
My method: 1.45 secs
Looping method: 3.80 secs (!)
Conclusions:
Your desire to vectorise this was well founded: looping is far slower than the other methods.
Avoiding data conversions and find is important for large matrices, saving ~35% processing time between Sardar's two methods.
By avoiding indexing altogether you can save a further 20% processing time.
Generate a matrix with zeros at diagonal and ones at non-diagonal indices. Replace the non-diagonal elements with the transpose of A (since MATLAB is column major). Transpose again to get the correct order.
B = double(~eye(N)); %Converting to double since we want to replace with double entries
B(find(B)) = A.'; %Replacing the entries
B = B.'; %Transposing again to get the matrix in the correct order
Edit:
As suggested by Wolfie for the same algorithm, you can get rid of conversion to double and the use of find with:
B = 1-eye(N);
B(logical(B)) = A.';
B = B.';
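The double transpose is easiest to see on the small example from the question. The sketch below rebuilds that 5x4 matrix with reshape and applies the mask-based version.

```matlab
N = 5;
A = reshape(1:N*(N-1), N-1, N).';   % the 5x4 example matrix
B = 1 - eye(N);                     % ones off the diagonal, zeros on it
B(logical(B)) = A.';                % fills off-diagonal slots column by column
B = B.';                            % transpose so the rows of A become rows of B
disp(B)
```

Because assignment through a logical mask proceeds in column-major order, the rows of A first land in the columns of B; the final transpose puts them back into rows.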
If you want to insert a vector on any diagonal of a matrix, you can use plain indexing. The following snippet gives you the indices of the desired diagonal, given the size n of the square matrix (the matrix is n by n) and the number k of the diagonal, where k=0 corresponds to the main diagonal, positive k to upper diagonals and negative k to lower diagonals. idx finally gives you the 2D indices as [row col] pairs.
function [idx] = diagidx(n,k)
% n size of square matrix
% k number of diagonal
if k==0 % identity
idx = [(1:n).' (1:n).']; % [row col]
elseif k>0 % Upper diagonal
idx = [(1:n-k).' (1+k:n).'];
elseif k<0 % lower diagonal
idx = [(1+abs(k):n).' (1:n-abs(k)).'];
end
end
Usage:
n=10;
k=3;
A = rand(n);
idx = diagidx(n,k);
A(sub2ind([n n], idx(:,1), idx(:,2))) = 1:(n-k); % convert the [row col] pairs to linear indices before assigning

measure similarity between 1 dimensional vectors

EDITED QUESTION
I have n signals of equal length.
X_signal
Y_signal
...
Z_signal
I calculate minima of these signals and I store their location (in time) in the vectors
X = [x1 x2 x3 x4 ... x100]
Y = [y1 y2 y3 y4 ... y150]
...
Z = [z1 z2 z3 z4 ... z110]
You can think of X, Y, ..., Z as time series that can have different lengths.
I assume that the original signals are similar if they have their minima almost at the same locations.
I would like to know what would be a smart approach to measure this kind of similarity keeping in mind that some minima in X,Y,Z can be just noise.
For example, if X = [1 5 8 12 15 20] and Y = [1.5 5.5 7.5 10 12 15.5 20.2], they should be similar since almost all the points have nearly the same value, except for Y(4) = 10.
If you have time, code or pseudo code in Matlab is appreciated; otherwise a suggestion, link, etc. is also super fine.
Thanks
ORIGINAL QUESTION
I have n vectors of different length.
X = [x1 x2 x3 x4 ... x100]
Y = [y1 y2 y3 y4 ... y150]
...
Z = [z1 z2 z3 z4 ... z110]
Vectors (X Y ... Z) represent minima values of the energy of the corresponding signals (X_energy, Y_energy, etc).
To recap starting from the signals X_signal, Y_signal ... Z_signal I compute the energy in windows of 20 samples and I calculate the minima of the resulting energy signals.
I assume that 2 or more vectors are similar if they have almost equal values (i.e. X and Y are similar if x1 ~= y1, x2 ~= y2, etc.). In other words, I assume that the original signals are similar if they have minimum energy at the same (or almost the same) time instant. I would like to know what would be a smart approach to measure this kind of similarity.
PS.
It is almost impossible that two vectors are equal so I would like to have just an idea of how close their "points" are.
X and Y could be similar also if they are shifted (i.e. x1~=y3, x2=~y4, etc)
It is always the case that the values are in ascending order (x1<x2<...<x100)
If you have time, code or pseudo code in Matlab is appreciated; otherwise a suggestion, link, etc. is also super fine.
Thanks
One possible approach (particularly if you do not have the Statistics and/or Signal Processing toolbox) is to generate a correlation matrix for all of your vectors with the Matlab function corrcoef.
Since your vectors are different sizes, you would have to either
zero pad the smaller vectors so they are the same size as the largest,
or take an aligned sample of values, less than or equal to the number of values in the smallest vector, out of each of them before computing the correlation.
It depends on your application which procedure is more suitable. Since your vectors are ordered in ascending order, likely zero padding would be inappropriate.
Then you would need to create a matrix M with the rows corresponding to the elements, and the columns corresponding to each (zero padded or sampled) vector.
You could do that with the Matlab function horzcat:
M=horzcat(V1,V2,...Vn)
where V1, V2, ..Vn are each column vectors of the same size.
Finally you could get a correlation matrix for all of your vectors with corrcoef:
Cmat=corrcoef(M)
Matlab docs for corrcoef at this link will help you understand how to interpret the results statistically.
Note that this approach would not take into account any correlation between lagged versions of your vectors.
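As a minimal sketch of the sampling variant, the example vectors from the question can be truncated to the length of the shortest one before calling corrcoef. Simple truncation is only an assumption about what an "aligned sample" might be; nMin is a name introduced here.

```matlab
X = [1 5 8 12 15 20];
Y = [1.5 5.5 7.5 10 12 15.5 20.2];
nMin = min(numel(X), numel(Y));          % length of the shortest vector
M = horzcat(X(1:nMin).', Y(1:nMin).');   % one column per (truncated) vector
Cmat = corrcoef(M);                      % Cmat(i,j): correlation of columns i and j
disp(Cmat)
```

Note that the extra minimum at Y(4) shifts every later element of Y against X, so truncation alone does not handle spurious minima.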
Edited answer
Now that it is clear that X vector is the time positions of all minima of signal 'X', Y vector is the time positions of all minima of signal 'Y', etc... Here is some updated code.
In fact the idea is still the same ... we build a linearly sampled time vector from all time positions of the minima in all signals (plus some time sampling precision) ... then we build new signals being 1.0 everywhere except at minima time locations (set to 0.0) ... finally we use the same correlation code as before.
NB Speed and memory optimized version is now available here
function [RMax] = MinimaCorrelation(c, ts)
%[
% Some default resolution and time-location of minima positions
if (nargin < 2), ts = 0.1; end
if (nargin < 1), c = { [1 3 8 7 3 4 12]; [3 8 7 3]; [4 12]; [5 3 8 -3 12]; [1 3 8 7 3 4 12]; }; end
% Number of channels
n = length(c);
% Build linearly sample time vector for all time locations
minTime = min(cellfun(@min, c));
maxTime = max(cellfun(@max, c));
timeVector = minTime:ts:maxTime;
timeVector(end+1) = timeVector(end) + ts; % just to really include min and max if step is not ok
% Build new signals being '1' everywhere except at minima locations (set to '0')
s = ones(n, length(timeVector));
for ni = 1:n
for mv = c{ni}
[~, ind] = min(abs(timeVector - mv));
s(ni, ind) = 0;
end
end
% Correlation (copied 3 times to avoid biased effect on sides ==> circular shifting is ok this way)
s = [s, s, s].';
RMax = max(xcorr(s, 'coeff'), [], 1);
% Put in R(i,j) format
RMax = reshape(RMax, [n n]);
%]
end
With default data, one obtains:
1.0000 0.9899 0.9866 0.9829 1.0000
0.9899 1.0000 0.9833 0.9865 0.9899
0.9866 0.9833 1.0000 0.9832 0.9866
0.9829 0.9865 0.9832 1.0000 0.9829
1.0000 0.9899 0.9866 0.9829 1.0000
Careful, this is a brute-force solution (time and memory consumption increase quickly with the number of signals and the required time resolution). Now that the question is clearer, maybe someone will find a smarter answer.
Original answer
Here is coarse-code for an approach using the maximum of cross-correlation and xcorr routine (in signal processing toolbox):
function [RMax] = xcorrmax(c)
%[
% Default signals for test
if (nargin < 1),
c = cell(0,0);
c{end+1} = [1 3 8 7 3 4 12];
c{end+1} = [3 8 7 3];
c{end+1} = [4 12];
c{end+1} = [5 3 8 -3 12];
c{end+1} = [1 3 8 7 3 4 12];
end
% Number of channels
n = length(c);
% Padding to have vectors all of the same length
% See also `padarray` to do circular/symmetric padding (i don't have image toolbox)
maxlength = max(cellfun(@length, c));
c = cellfun(@(x)myquickpad(x, maxlength), c, 'UniformOutput', false);
c = cell2mat(c.').';
% Compute cross correlation (multichannel case) and keep max value
% NB1: May also use xcov if signal mean is not important
% NB2: Normalization at lag = 0
RMax = max(xcorr(c, 'coeff'), [], 1);
% Put in R(i,j) format
RMax = reshape(RMax, [n n]);
%]
end
function [a] = myquickpad(a, maxlength)
%[
if (length(a) < maxlength)
a(maxlength) = 0;
end
%]
end
For the following signals:
(1) [1 3 8 7 3 4 12]
(2) [3 8 7 3]
(3) [4 12]
(4) [5 3 8 -3 12]
(5) [1 3 8 7 3 4 12]
It returns the following correlation matrix R(i,j) between ith and jth signals:
1.0000 0.6698 0.7402 0.8016 1.0000
0.6698 1.0000 0.8012 0.4853 0.6698
0.7402 0.8012 1.0000 0.6587 0.7402
0.8016 0.4853 0.6587 1.0000 0.8016
1.0000 0.6698 0.7402 0.8016 1.0000
Some remarks:
It looks coherent, for instance signal (1) and (5) are identical and correlation is 1.0.
Because of the normalization used, it considers (1) closer to (3) than to (2) ... so this should be reviewed according to your needs (see the normalization in corrcoef, for instance, as shown by @paisanco).
You can use xcov instead of xcorr if signal shifts in amplitude are not important.
Again, this is a coarse approach, not speed/memory optimized at all, nor accounting for the fact that values are sorted, and may be not fully inline with what you'd really like to have.
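The zero padding done by myquickpad above relies on Matlab growing an array when an element past its end is assigned. A minimal sketch of that behaviour, using signal (2) and the length of the longest signal:

```matlab
a = [3 8 7 3];            % signal (2) from the list above
maxlength = 7;            % length of the longest signal
if numel(a) < maxlength
    a(maxlength) = 0;     % assigning past the end fills the gap with zeros
end
disp(a)
```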