I am trying to compute column-wise differences in the following matrix:
A =
0 NaN NaN 0.3750 NaN
NaN 0.1250 0.2500 0.3750 NaN
I would like to obtain:
0.3750 NaN NaN
0.1250 0.1250 0.1250
Here I am essentially taking a column-wise difference, skipping NaN values and shifting the results to the left.
A one-dimensional case would be:
A = [0 NaN 0.250 0.375 NaN 0.625];
NaN_diff(A) = [0.250 0.125 0.250];
Any way to do this efficiently in MATLAB without using inefficient find() queries per row?
Here's a solution that vectorizes most of the operations:
notNan = ~isnan(A);
numNN = sum(notNan,2);
shifted = NaN(size(A));
for r = 1:size(A,1)
myRow = A(r,:);
shifted(r,1:numNN(r)) = myRow(notNan(r,:));
end
nanDiff = diff(shifted,1,2);
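If the remaining loop is a concern, one possible loop-free variant is sketched below. It is not the method above: it assumes the stable ordering of MATLAB's sort to push the non-NaN entries of each row to the left while keeping their relative order. The names order and rowIdx are only illustrative.

```matlab
A = [0 NaN NaN 0.3750 NaN; NaN 0.1250 0.2500 0.3750 NaN];

% Sorting the NaN mask (0 = value, 1 = NaN) along each row puts values first.
% sort is stable, so the values keep their original left-to-right order.
[~, order] = sort(double(isnan(A)), 2);

% Turn the per-row column order into linear indices and gather.
rowIdx  = repmat((1:size(A,1)).', 1, size(A,2));
shifted = A(sub2ind(size(A), rowIdx, order));

nanDiff = diff(shifted, 1, 2);
```

For the matrix in the question this gives nanDiff = [0.375 NaN NaN NaN; 0.125 0.125 NaN NaN], the same as the loop version above.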
Here is an alternative vectorized solution:
% Convert to cell array without NaNs
[rows, cols] = size(A);
C = cellfun(@(x)x(~isnan(x)), mat2cell(A, ones(1, rows), cols), 'Uniform', 0);
% Compute diff for each row and pad
N = max(sum(~isnan(A), 2));
C = cellfun(@(x)[diff(x) nan(1, N - length(x))], C, 'Uniform', 0);
% Convert back to a matrix
nandiff = vertcat(C{:});
If you want to pad the result matrix with zeroes instead of NaN values, change the nan function call in nan(1, N - length(x)) to zeros.
Here is an alternative method that does require you to loop over each row, but should still have decent performance and feels very intuitive to me.
B = NaN(size(A,1),size(A,2)-1);
for i = 1:size(A,1)
idx = ~isnan(A(i,:)); % non-NaN entries of row i
B(i,1:sum(idx)-1) = diff(A(i,idx)); % diff returns one element fewer than its input
end
I'm aware that this is a rather old question, but for people like me who stumble into this page, here is a simpler (imho) solution to the question:
A = [0 NaN 0.250 0.375 NaN 0.625];
A(isnan(A))=[]; % identify index of NaN values and remove them from the array
B = diff(A);
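The same idea can be kept non-destructive by wrapping it in an anonymous function, so that A itself is not modified. This is only a small sketch; the name nanDiff1 is made up for illustration.

```matlab
% Difference of the non-NaN entries of a vector, leaving the input untouched.
nanDiff1 = @(v) diff(v(~isnan(v)));

A = [0 NaN 0.250 0.375 NaN 0.625];
B = nanDiff1(A);
```

For the one-dimensional example from the question, B is [0.250 0.125 0.250].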
Here is another simple solution without using a loop [but assuming all values are in ascending order]:
A=[0 NaN NaN 0.3750 NaN;NaN 0.1250 0.2500 0.3750 NaN]
A(isnan(A(:,1)))=0;
B=sort(A,2);
C=diff(B,1,2)
Related
I have three matlab matrices A, B, and C with the same size:
A = [1:3; 4:6; 7:9];
B = [2 NaN 5; NaN NaN 7; 0 1 NaN];
C = [3 NaN 2; 1 NaN NaN; 1 NaN 5];
%>> A =            %>> B =            %>> C =
%   1   2   3      %   2 NaN   5      %   3 NaN   2
%   4   5   6      % NaN NaN   7      %   1 NaN NaN
%   7   8   9      %   0   1 NaN      %   1 NaN   5
I would like the three matrices to keep only values for which each of the 3 matrices does not have a NaN in that specific position. That is, I would like to obtain the following:
%>> A =            %>> B =            %>> C =
%   1 NaN   3      %   2 NaN   5      %   3 NaN   2
% NaN NaN NaN      % NaN NaN NaN      % NaN NaN NaN
%   7 NaN NaN      %   0 NaN NaN      %   1 NaN NaN
In my attempt, I'm stacking the three matrices along the third dimension of a new matrix ABC with size 3x3x3 and then I'm using a for loop to make sure all the three matrices do not have NaN in that specific position.
ABC(:,:,1)=A; ABC(:,:,2)=B; ABC(:,:,3)=C;
for i=1:size(A,1)
for j=1:size(A,2)
count = squeeze(ABC(i,j,:));
if sum(~isnan(count))<size(ABC,3)
A(i,j)=NaN;
B(i,j)=NaN;
C(i,j)=NaN;
end
end
end
This code works fine. However, as I have more than 30 matrices of bigger size I was wondering whether there is a more elegant solution to this problem.
Thank you for your help.
Let's do fancy indexing!
First, the solution:
indnan=sum(isnan(cat(3,A,B,C)),3)>0;
A(indnan)=NaN;
B(indnan)=NaN;
C(indnan)=NaN;
This code stacks the matrices into a 3D array and counts how many NaNs there are along the third dimension at each position (i,j). Wherever that count is greater than 0 (i.e. any of the matrices has a NaN there), the logical index indnan is true. Finally, we set all those positions to NaN in every matrix, leaving only the positions that are non-NaN everywhere.
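As a quick check on the matrices from the question, the mask can be computed and inspected directly. The sketch below uses any(...,3), which is equivalent to the sum(...)>0 test; the name Aclean is only illustrative.

```matlab
A = [1:3; 4:6; 7:9];
B = [2 NaN 5; NaN NaN 7; 0 1 NaN];
C = [3 NaN 2; 1 NaN NaN; 1 NaN 5];

% True wherever at least one of the three matrices holds a NaN.
indnan = any(isnan(cat(3, A, B, C)), 3);

Aclean = A;
Aclean(indnan) = NaN;
```

Here indnan is [0 1 0; 1 1 1; 0 1 1], and Aclean matches the desired A from the question: [1 NaN 3; NaN NaN NaN; 7 NaN NaN].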
Ander’s answer is good, but for very large matrices it might be expensive to create that 3D matrix.
First of all, I would suggest putting the matrices into a cell array. That makes it a lot easier to programmatically manage many arrays. That is, instead of A, B, etc, work with C{1}, C{2}, etc:
C = {A,B,C};
It takes essentially zero cost to make this change.
Now, to find all elements where one of the matrices is NaN:
M = isnan(C{1});
for ii=2:numel(C)
M = M | isnan(C{ii});
end
A similar loop then sets the corresponding elements to NaN:
for ii=1:numel(C)
C{ii}(M) = NaN;
end
This latter loop can be replaced by a call to cellfun, but I like explicit loops.
EDIT: Here are some timings. This is yet another example of loops being faster in modern MATLAB than the equivalent vectorized code. Back in the old days, the loop code would have been 100x slower.
This is the test code:
function so(sz) % input argument is the size of the arrays
C3 = cell(1,3);
for ii=1:numel(C3)
C3{ii} = create(sz,0.05);
end
C20 = cell(1,20);
for ii=1:numel(C20)
C20{ii} = create(sz,0.01);
end
if(~isequal(method1(C3),method2(C3))), error('not equal!'), end
if(~isequal(method1(C20),method2(C20))), error('not equal!'), end
fprintf('method 1, 3 arrays: %f s\n',timeit(@()method1(C3)))
fprintf('method 2, 3 arrays: %f s\n',timeit(@()method2(C3)))
fprintf('method 1, 20 arrays: %f s\n',timeit(@()method1(C20)))
fprintf('method 2, 20 arrays: %f s\n',timeit(@()method2(C20)))
% method 1 is the vectorized code from Ander:
function mask = method1(C)
mask = sum(isnan(cat(3,C{:})),3)>0;
% method 2 is the loop code from this answer:
function mask = method2(C)
mask = isnan(C{1});
for ii=2:numel(C)
mask = mask | isnan(C{ii});
end
function mat = create(sz,p)
mat = rand(sz);
mat(mat<p) = nan;
These are the results on my machine (with R2017a):
>> so(500)
method 1, 3 arrays: 0.003215 s
method 2, 3 arrays: 0.000386 s
method 1, 20 arrays: 0.016503 s
method 2, 20 arrays: 0.001257 s
The loop is 10x faster! For small arrays I see much less of a difference, but the loop code is still several times faster, even for 5x5 arrays.
Similarly to this question, I have a matrix A of dimension mxn in Matlab with real values (including NaNs). I want to construct a matrix B listing row-wise each element of the non-unique Cartesian product of the non-NaN values contained in A's rows. To be clearer, consider the following example.
Example:
%m=3;
%n=3;
A=[2.1 0 NaN;
69 NaN 1;
NaN 32.1 NaN];
%Hence, the Cartesian product {2.1,0}x{69,1}x{32.1} is
%{(2.1,69,32.1),(2.1,1,32.1),(0,69,32.1),(0,1,32.1)}
%I construct B by disposing row-wise each 3-tuple in the Cartesian product
B=[2.1 69 32.1;
2.1 1 32.1;
0 69 32.1;
0 1 32.1];
I came up with a solution using cells:
function B = q48444528(A)
if nargin < 1
A = [2.1 0 NaN;
69 NaN 1 ;
NaN 32.1 NaN];
end
% Converting to a cell array of rows:
C = num2cell(A,2);
% Getting rid of NaN values:
C = cellfun(@(x)x(~isnan(x)),C,'UniformOutput',false);
% Finding combinations:
B = combvec(C{:}).';
Output:
B =
2.1000 69.0000 32.1000
0 69.0000 32.1000
2.1000 1.0000 32.1000
0 1.0000 32.1000
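Note that combvec ships with the Deep Learning Toolbox (formerly Neural Network Toolbox). If that toolbox is not available, a comparable result can be sketched with ndgrid from base MATLAB. This is a substitute technique, not the answer's method, and it assumes A has at least two rows, since ndgrid with a single input behaves differently. The name G is illustrative.

```matlab
A = [2.1 0 NaN; 69 NaN 1; NaN 32.1 NaN];

% One cell per row, with the NaN values removed.
C = cellfun(@(x) x(~isnan(x)), num2cell(A, 2), 'UniformOutput', false);

% One grid per row; the first set varies fastest, as with combvec.
G = cell(1, numel(C));
[G{:}] = ndgrid(C{:});

% Flatten each grid into a column and place the columns side by side.
B = cell2mat(cellfun(@(g) g(:), G, 'UniformOutput', false));
```

For the example matrix, B has the rows (2.1, 69, 32.1), (0, 69, 32.1), (2.1, 1, 32.1) and (0, 1, 32.1), in the same order as the combvec output above.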
I have the 4x2 matrix A:
A = [2 NaN 5 8; 14 NaN 23 NaN]';
I want to replace the non-NaN values with their associated indices within each column in A. The output looks like this:
out = [1 NaN 3 4; 1 NaN 3 NaN]';
I know how to do it for each column manually, but I would like an automatic solution, as I have much larger matrices to handle. Does anyone have an idea?
out = bsxfun(@times, A-A+1, (1:size(A,1)).');
How it works:
A-A+1 replaces actual numbers in A by 1, and keeps NaN as NaN
(1:size(A,1)).' is a column vector of row indices
bsxfun(@times, ...) multiplies both of the above with singleton expansion.
As pointed out by @thewaywewalk, from Matlab R2016b onwards bsxfun(@times...) can be replaced by .*, as singleton expansion is enabled by default:
out = (A-A+1) .* (1:size(A,1)).';
An alternative suggested by @Dev-Il is
out = bsxfun(@plus, A*0, (1:size(A,1)).');
This works because multiplying by 0 replaces actual numbers by 0, and keeps NaN as is.
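A small check that both variants produce the requested output for the example is sketched below. The names rowIdx and expected are illustrative; isequaln is used because isequal treats NaN as unequal to itself.

```matlab
A = [2 NaN 5 8; 14 NaN 23 NaN].';
rowIdx = (1:size(A,1)).';

out1 = bsxfun(@times, A - A + 1, rowIdx);   % A-A+1 is 1 for numbers, NaN for NaN
out2 = bsxfun(@plus,  A * 0,     rowIdx);   % A*0   is 0 for numbers, NaN for NaN

expected = [1 NaN 3 4; 1 NaN 3 NaN].';
sameResult = isequaln(out1, expected) && isequaln(out2, expected);
```

One caveat for both tricks: Inf - Inf and Inf * 0 are NaN as well, so infinite entries in A would also end up as NaN in the output.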
Applying ind2sub to a mask created with isnan will do.
mask = find(~isnan(A));
[rows,~] = ind2sub(size(A),mask)
A(mask) = rows;
Note that the second output of ind2sub must be requested as well (it can be discarded with ~, as in [rows,~]); with a single output, ind2sub would simply return the linear indices instead of the row subscripts of a 2D matrix.
A =
1 1
NaN NaN
3 3
4 NaN
A.' =
1 NaN 3 4
1 NaN 3 NaN
Also be careful with the two different transpose operators ' and .'.
Alternative
[n,m] = size(A);
B = ndgrid(1:n,1:m);
B(isnan(A)) = NaN;
or even (with a little inspiration by Luis Mendo)
[n,m] = size(A);
B = A-A + ndgrid(1:n,1:m)
or in one line
B = A-A + ndgrid(1:size(A,1),1:size(A,2))
This can be done using repmat and isnan as follows:
A = [ 2 NaN 5 8;
14 NaN 23 NaN];
out=repmat([1:size(A,2)],size(A,1),1); % out contains indexes of all the values
out(isnan(A))= NaN % Replacing the indexes where NaN exists with NaN
Output:
1 NaN 3 4
1 NaN 3 NaN
You can take the transpose if you want.
I'm adding another answer for a couple of reasons:
Because overkill (*ahem* kron *ahem*) is fun.
To demonstrate that A*0 does the same as A-A.
A = [2 NaN 5 8; 14 NaN 23 NaN].';
out = A*0 + kron((1:size(A,1)).', ones(1,size(A,2)))
out =
1 1
NaN NaN
3 3
4 NaN
I have a 3x15000 matrix and I want to save the segments that lie between a change from NaN to a number and the next change back to NaN. There are large sections in which all 3 rows are NaN, and whenever the data changes to numbers I want to generate a new matrix. Is it best to index the start and end point? How do I set a flag in order to step through all the data?
For example:
NaN 5.30669473796592 5.82479888441640 NaN
NaN 103.308010031436 103.534581233064 NaN
NaN 1787365.55338272 1745186.16219408 NaN
So I want to save the numeric values in between.
Correct me if I am wrong, but it sounds like that 3x15000 matrix of yours contains some discrete data and you want to save each chunk of data as an individual matrix.
Suppose your matrix looks like this:
h = NaN(3,6);
h(:,[2,3,5]) = rand(3,3)
h =
NaN 0.9649 0.9572 NaN 0.1419 NaN
NaN 0.1576 0.4854 NaN 0.4218 NaN
NaN 0.9706 0.8003 NaN 0.9157 NaN
Now you want to copy columns 2,3 into one matrix and column 5 into another. One approach is to first find the columns that contain only NaNs. You could do something like this:
ind = all(isnan(h),1)
ind =
1 0 0 1 0 1
isnan returns a logical array in which the 1s mark where the NaNs are. all(...,1) then reduces each column to a single logical value that is 1 only if every row of that column is NaN, so ind contains the flags that you want. To save each chunk of data individually, you can use a simple for loop. Here is a quick and dirty solution:
j = 1;
k = 1;
x = nan(3,1); %temporary matrix to store numerical values
c = cell(2,1); %cell array to store the chunks of data individually
%if you can predict how many elements `c` should have, then
%you can pre-allocate appropriately.
for i=1:length(ind)
%loop through all the columns
if ind(i) == 1
%if we encounter a flag and 'x' has data, dump 'x' into 'c{k}',
%reset 'x' and continue.
if ~all(isnan(x))
c{k} = x;
k = k+1;
end
%reset x
x = nan(3,1);
j=1;
continue
end
x(:,j) = h(:,i);
j = j+1;
%catch data at the end, if last column of h does not contain all NaNs
if i==length(ind)
c{k} = x;
end
end
Your data chunks are stored as matrices in a cell array:
c{1}
ans =
0.9649 0.9572
0.1576 0.4854
0.9706 0.8003
c{2}
ans =
0.1419
0.4218
0.9157
Hope this helps.
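Since the question asks about indexing the start and end points, here is a rough sketch of that alternative. It is not the loop above: it pads the flag vector with zeros and uses diff to locate where each run of data columns begins and ends. The names valid, starts, stops and chunks are illustrative.

```matlab
h = NaN(3, 6);
h(:, [2 3 5]) = rand(3, 3);

valid = ~all(isnan(h), 1);          % 1 for columns that hold data
d = diff([0, valid, 0]);            % +1 where a run starts, -1 just after it ends
starts = find(d == 1);
stops  = find(d == -1) - 1;

% Cut out one matrix per run of consecutive data columns.
chunks = arrayfun(@(s, e) h(:, s:e), starts, stops, 'UniformOutput', false);
```

For this h, starts is [2 5] and stops is [3 5], so chunks{1} holds columns 2 and 3 and chunks{2} holds column 5. The zero padding means runs that touch the first or last column are picked up as well.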
I'm not sure I understand what kind of output you want, but here is a way to get rid of the NaN elements:
Let's consider this matrix:
A =
NaN 1 2 NaN 3 4 NaN
NaN 5 6 NaN 7 8 NaN
NaN 9 10 NaN 11 12 NaN
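Then, using the find command, we can get the column index of every NaN element. In this example those are exactly the columns in which all the elements are NaN (as you stated in your question), but note that a column containing only some NaNs would be listed too: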
[~ ,col] = find(isnan(A))
col =
1
1
1
4
4
4
7
7
7
Then we can delete them from A to form a new matrix:
A(:,col) = []
A =
1 2 3 4
5 6 7 8
9 10 11 12
Is this what you had in mind? If not please be more specific. Thanks!
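If the data can also contain isolated NaNs inside otherwise valid columns, the find-based deletion would drop those columns too. A hedged variant that removes only columns consisting entirely of NaN is sketched below, on a made-up matrix with one such partial column; allNaNCols and B are illustrative names.

```matlab
A = [NaN 1   2 NaN  3;
     NaN 5 NaN NaN  7;
     NaN 9  10 NaN 11];

allNaNCols = all(isnan(A), 1);   % true only where every row is NaN
B = A;
B(:, allNaNCols) = [];           % column 3 survives despite its single NaN
```

Here B is [1 2 3; 5 NaN 7; 9 10 11].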
I have been trying to make a general import of Ghaul's answer to my earlier question about importing an upper triangular matrix.
Initial Data:
1.0 3.32 -7.23
1.00 0.60
1.00
A = importdata('A.txt')
A =
1.0000 3.3200 -7.2300
1.0000 0.6000 NaN
1.0000 NaN NaN
So you will have to shift the two last rows, like this:
A(2,:) = circshift(A(2,:),[0 1])
A(3,:) = circshift(A(3,:),[0 2])
A =
1.0000 3.3200 -7.2300
NaN 1.0000 0.6000
NaN NaN 1.0000
and then replace the NaNs with their symmetric counterparts:
A(isnan(A)) = A(isnan(A)')
A =
1.0000 3.3200 -7.2300
3.3200 1.0000 0.6000
-7.2300 0.6000 1.0000
I have this, so we get the complete matrix for any size:
A = importdata('A.txt')
for i = 1:size(A,1)-1
A(i+1,:) = circshift(A(i+1,:),[0 i]);
end
A(isnan(A)) = A(isnan(A)');
Is this approach the best? There must be something better. I remember someone told me to try not to use for loops in MATLAB.
UPDATE
So this is the result. Is there any way to make it faster without the loop?
A = importdata('A.txt')
for i = 1:size(A,1)-1
A(i+1,:) = circshift(A(i+1,:),[0 i]);
end
A(isnan(A)) = 0;
A = A + triu(A, 1)';
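To convince yourself that this version really works for sizes other than 3x3, it can be run against a known symmetric matrix. The sketch below fakes the left-justified layout that importdata is assumed to return here (as in the 3x3 example) instead of reading A.txt; S and n are made-up names.

```matlab
n = 4;
S = triu(magic(n)) + triu(magic(n), 1).';   % symmetric reference matrix

% Left-justified upper-triangular rows padded with NaN, as in the import above.
A = NaN(n);
for r = 1:n
    A(r, 1:n-r+1) = S(r, r:n);
end

% The approach from the update: shift each row, zero the NaNs, mirror.
for i = 1:size(A,1)-1
    A(i+1,:) = circshift(A(i+1,:), [0 i]);
end
A(isnan(A)) = 0;
A = A + triu(A, 1).';

ok = isequal(A, S);
```

After this runs, ok is true, meaning the reconstructed A equals the reference S for n = 4.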
Here's another general solution that should work for any size upper triangular matrix. It uses the functions ROT90, SPDIAGS, and TRIU:
>> A = [1 3.32 -7.23; 1 0.6 nan; 1 nan nan]; % Sample matrix
>> A = spdiags(rot90(A),1-size(A,2):0); % Shift the rows
>> A = A+triu(A,1).' % Mirror around the main diagonal
A =
1.0000 3.3200 -7.2300
3.3200 1.0000 0.6000
-7.2300 0.6000 1.0000
Here's one way without loop. If you have a more recent version of Matlab, you may want to check which solution is really faster, since loops aren't as bad as they used to be.
A = A'; % transpose so that linear indices get the right order
out = tril(ones(size(A))); % create an array of indices
out(out>0) = A(~isnan(A)); % overwrite the indices with the right number
out = out + triu(out',1); % fix the upper half of the array