Cartesian product of row values of a matrix in Matlab - matlab

Similarly to this question, I have a matrix A of real values (including NaNs) of dimension mxn in Matlab. I want to construct a matrix B listing row-wise each element of the non-unique Cartesian product of the non-NaN values contained in A's rows. To be clearer, consider the following example.
Example:
%m=3;
%n=3;
A=[2.1 0 NaN;
69 NaN 1;
NaN 32.1 NaN];
%Hence, the Cartesian product {2.1,0}x{69,1}x{32.1} is
%{(2.1,69,32.1),(2.1,1,32.1),(0,69,32.1),(0,1,32.1)}
%I construct B by disposing row-wise each 3-tuple in the Cartesian product
B=[2.1 69 32.1;
2.1 1 32.1;
0 69 32.1;
0 1 32.1];

I came up with a solution using cells:
function B = q48444528(A)
if nargin < 1
A = [2.1 0 NaN;
69 NaN 1 ;
NaN 32.1 NaN];
end
% Converting to a cell array of rows:
C = num2cell(A,2);
% Getting rid of NaN values:
C = cellfun(@(x)x(~isnan(x)),C,'UniformOutput',false);
% Finding combinations (combvec ships with the Deep Learning Toolbox):
B = combvec(C{:}).';
Output:
B =
2.1000 69.0000 32.1000
0 69.0000 32.1000
2.1000 1.0000 32.1000
0 1.0000 32.1000
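If combvec is not available, the same combinations can be built with ndgrid from base MATLAB. This is a minimal sketch rather than part of the original answer; the variable G is a name introduced here for illustration, and the sketch assumes A has at least two rows with at least one non-NaN value each.

```matlab
A = [2.1 0 NaN;
     69 NaN 1;
     NaN 32.1 NaN];
% One cell per row of A, holding that row's non-NaN values:
C = cellfun(@(x) x(~isnan(x)), num2cell(A,2), 'UniformOutput', false);
% ndgrid returns one N-D grid per input vector; collect all of them:
G = cell(1, numel(C));
[G{:}] = ndgrid(C{:});
% Flatten each grid into a column and place the columns side by side:
B = cell2mat(cellfun(@(g) g(:), G, 'UniformOutput', false));
```

Each row of B is one tuple of the Cartesian product, with the first factor varying fastest.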

Related

Finding the NaN boundary of a matrix in MATLAB

I have a very large (2019x1678 double) DEM (digital elevation model) file put as a matrix in MATLAB. The edges of it contain NaN values. In order to account for edge effects in my code, I have to put a 1 cell buffer (same value as adjacent cell) around my DEM. Where NaNs are present, I need to find the edge of the NaN values in order to build that buffer. I have tried doing this two ways:
In the first, I get the row and column coordinates of all non-NaN DEM values, and find the first and last row numbers for each column to get the north and south boundaries, then find the first and last column numbers for each row to get the east and west boundaries. I use these in sub2ind() to create my buffer.
[r, c] = find(~isnan(Zb_ext)); %Zb is my DEM matrix
idx = accumarray(c, r, [], @(x) {[min(x) max(x)]});
idx = vertcat(idx{:});
NorthBoundary_row = transpose(idx(:,1)); % the values to fill my buffer with
NorthBoundary_row_ext = transpose(idx(:,1) - 1); % My buffer cells
columnmax = length(NorthBoundary_row);
column1 = min(c);
Boundary_Colu = linspace(column1,column1+columnmax-1,columnmax);
SouthBoundary_row = (transpose(idx(:,2))); % Repeat for south Boundary
SouthBoundary_row_ext = transpose(idx(:,2) + 1);
SouthB_Ind = sub2ind(size(Zb_ext),SouthBoundary_row,Boundary_Colu);
SouthB_Ind_ext = sub2ind(size(Zb_ext),SouthBoundary_row_ext, Boundary_Colu);
NorthB_Ind = sub2ind(size(Zb_ext),NorthBoundary_row, Boundary_Colu);
NorthB_Ind_ext = sub2ind(size(Zb_ext),NorthBoundary_row_ext, Boundary_Colu);
Zb_ext(NorthB_Ind_ext) = Zb_ext(NorthB_Ind);
Zb_ext(SouthB_Ind_ext) = Zb_ext(SouthB_Ind);
% Repeat above for East and West Boundary by reversing the roles of row and
% column
[r, c] = find(~isnan(Zb_ext));
idx = accumarray(r, c, [], @(x) {[min(x) max(x)]});
idx = vertcat(idx{:});
EastBoundary_colu = transpose(idx(:,1)); % Repeat for east Boundary
EastBoundary_colu_ext = transpose(idx(:,1) - 1);
row1 = min(r);
rowmax = length(EastBoundary_colu);
Boundary_row = linspace(row1,row1+rowmax-1,rowmax);
WestBoundary_colu = transpose(idx(:,2)); % Repeat for west Boundary
WestBoundary_colu_ext = transpose(idx(:,2) + 1);
EastB_Ind = sub2ind(size(Zb_ext),Boundary_row, EastBoundary_colu);
EastB_Ind_ext = sub2ind(size(Zb_ext),Boundary_row, EastBoundary_colu_ext);
WestB_Ind = sub2ind(size(Zb_ext),Boundary_row, WestBoundary_colu);
WestB_Ind_ext = sub2ind(size(Zb_ext),Boundary_row, WestBoundary_colu_ext);
Zb_ext(NorthB_Ind_ext) = Zb_ext(NorthB_Ind);
Zb_ext(SouthB_Ind_ext) = Zb_ext(SouthB_Ind);
Zb_ext(EastB_Ind_ext) = Zb_ext(EastB_Ind);
Zb_ext(WestB_Ind_ext) = Zb_ext(WestB_Ind);
This works well on my small development matrix, but fails on my full sized DEM. I do not understand the behavior of my code, but looking at the data there are gaps in my boundary. I wonder if I need to better control the order of max/min row/column values, though in my test on a smaller dataset, all seemed in order....
The second method I got from a similar question to this and basically uses a dilation method. However, when I transition to my full dataset, it takes hours to calculate ZbDilated. Although my first method does not work, it at least calculates within seconds.
[m, n] = size(Zb); %
Zb_ext = nan(size(Zb)+2);
Zb_ext(2:end-1, 2:end-1) = Zb; % pad Zb with NaNs on each side
ZbNANs = ~isnan(Zb_ext);
ZbDilated = zeros(m + 2, n + 2); % this will hold the dilated shape.
for i = 1:(m+2)
if i == 1 %handling boundary situations during dilation
i_f = i;
i_l = i+1;
elseif i == m+2
i_f = i-1;
i_l = i;
else
i_f = i-1;
i_l = i+1;
end
for j = 1:(n+2)
mask = zeros(size(ZbNANs));
if j == 1 %handling boundary situations again
j_f = j;
j_l = j+1;
elseif j == n+2
j_f = j-1;
j_l = j;
else
j_f = j-1;
j_l = j+1;
end
mask(i_f:i_l, j_f:j_l) = 1; % this places a 3x3 square of 1's around (i, j)
ZbDilated(i, j) = max(ZbNANs(logical(mask)));
end
end
Zb_ext(logical(ZbDilated)) = fillmissing(Zb_ext(logical(ZbDilated)),'nearest');
Does anyone have any ideas on making either of these usable?
Here is what I start out with:
NaN NaN 2 5 39 55 44 8 NaN NaN
NaN NaN NaN 7 33 48 31 66 17 NaN
NaN NaN NaN 28 NaN 89 NaN NaN NaN NaN
Here is the matrix buffered on the limits with NaNs:
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN 2 5 39 55 44 8 NaN NaN NaN
NaN NaN NaN NaN 7 33 48 31 66 17 NaN NaN
NaN NaN NaN NaN 28 NaN 89 NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Here is what I want to get after using fillmissing (though I have noticed some irregularities with how buffer values are filled...):
NaN NaN 2 2 5 39 55 44 8 17 NaN NaN
NaN NaN 2 2 5 39 55 44 8 17 17 NaN
NaN NaN 2 2 7 33 48 31 66 17 17 NaN
NaN NaN NaN 2 28 33 89 31 66 17 17 NaN
NaN NaN NaN 5 28 55 89 8 NaN NaN NaN NaN
To try and clear up any confusion about what I am doing, here is the logical mask I get from the dilation, which I use for fillmissing:
0 0 1 1 1 1 1 1 1 1 0 0
0 0 1 1 1 1 1 1 1 1 1 0
0 0 1 1 1 1 1 1 1 1 1 0
0 0 0 1 1 1 1 1 1 1 1 0
0 0 0 1 1 1 1 1 0 0 0 0
A faster way to apply a 3x3 dilation would be as follows. This does involve some large intermediate matrices, which make it less efficient than, say, applying imdilate.
[m, n] = size(Zb); %
Zb_ext = nan(size(Zb)+2);
Zb_ext(2:end-1, 2:end-1) = Zb; % pad Zb with NaNs on each side
ZbNANs = ~isnan(Zb_ext);
ZbDilated = ZbNANs; % this will hold the dilated shape.
% up and down neighbors
ZbDilated(2:end, :) = max(ZbDilated(2:end, :), ZbNANs(1:end-1, :));
ZbDilated(1:end-1, :) = max(ZbDilated(1:end-1, :), ZbNANs(2:end, :));
% left and right neighbors
ZbDilated(:, 2:end) = max(ZbDilated(:, 2:end), ZbNANs(:, 1:end-1));
ZbDilated(:, 1:end-1) = max(ZbDilated(:, 1:end-1), ZbNANs(:, 2:end));
% and 4 diagonal neighbors
ZbDilated(2:end, 2:end) = max(ZbDilated(2:end, 2:end), ZbNANs(1:end-1, 1:end-1));
ZbDilated(1:end-1, 2:end) = max(ZbDilated(1:end-1, 2:end), ZbNANs(2:end, 1:end-1));
ZbDilated(2:end, 1:end-1) = max(ZbDilated(2:end, 1:end-1), ZbNANs(1:end-1, 2:end));
ZbDilated(1:end-1, 1:end-1) = max(ZbDilated(1:end-1, 1:end-1), ZbNANs(2:end, 2:end));
This is a tedious way to write it, and I'm sure a shorter loop could be written, but I think this makes the intention clearer.
[Edit: Because we're dealing with a logical array here, instead of max(A,B) we could also do A | B. I'm not sure if there would be any difference in time.]
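The eight shifted comparisons above can also be collapsed into a single 2-D convolution with a 3x3 kernel of ones, using conv2 from base MATLAB. This is a sketch of a swapped-in technique, not what the answer itself proposes; the name valid is introduced here for illustration, and Zb is the small example matrix from the question.

```matlab
Zb = [NaN NaN 2   5  39  55  44   8  NaN NaN;
      NaN NaN NaN 7  33  48  31  66  17  NaN;
      NaN NaN NaN 28 NaN 89  NaN NaN NaN NaN];
Zb_ext = nan(size(Zb)+2);
Zb_ext(2:end-1, 2:end-1) = Zb;        % pad Zb with NaNs on each side
valid = ~isnan(Zb_ext);               % true where the DEM has data
% Count the valid cells in each 3x3 neighbourhood; any count above zero
% means the cell belongs to the dilated shape.
ZbDilated = conv2(double(valid), ones(3), 'same') > 0;
```

On this example the result matches the logical mask shown in the question.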
What @beaker said in a comment was to not use
mask = zeros(size(ZbNANs));
mask(i_f:i_l, j_f:j_l) = 1; % this places a 3x3 square of 1's around (i, j)
ZbDilated(i, j) = max(ZbNANs(logical(mask)));
but rather do
ZbDilated(i, j) = max(ZbNANs(i_f:i_l, j_f:j_l), [], 'all');
[Edit: Because we're dealing with a logical array here, instead of max(A,[],'all') we could also do any(A,'all'), which should be faster. See @beaker's other comment.]

Selecting simultaneously not NaN values in multiple matrices

I have three matlab matrices A, B, and C with the same size:
A = [1:3; 4:6; 7:9];
B = [2 NaN 5; NaN NaN 7; 0 1 NaN];
C = [3 NaN 2; 1 NaN NaN; 1 NaN 5];
%>> A = %>>B = %>>C =
% 1 2 3 % 2 NaN 5 % 3 NaN 2
% 4 5 6 % NaN NaN 7 % 1 NaN NaN
% 7 8 9 % 0 1 NaN % 1 NaN 5
I would like the three matrices to keep only values for which each of the 3 matrices does not have a NaN in that specific position. That is, I would like to obtain the following:
%>> A = %>>B = %>>C =
% 1 NaN 3 % 2 NaN 5 % 3 NaN 2
% NaN NaN NaN % NaN NaN NaN % NaN NaN NaN
% 7 NaN NaN % 0 NaN NaN % 1 NaN NaN
In my attempt, I'm stacking the three matrices along the third dimension of a new matrix ABC with size 3x3x3 and then I'm using a for loop to make sure all the three matrices do not have NaN in that specific position.
ABC(:,:,1)=A; ABC(:,:,2)=B; ABC(:,:,3)=C;
for i=1:size(A,1)
for j=1:size(A,2)
count = squeeze(ABC(i,j,:));
if sum(~isnan(count))<size(ABC,3)
A(i,j)=NaN;
B(i,j)=NaN;
C(i,j)=NaN;
end
end
end
This code works fine. However, as I have more than 30 matrices of bigger size I was wondering whether there is a more elegant solution to this problem.
Thank you for your help.
Let's do fancy indexing!
First, the solution:
indnan=sum(isnan(cat(3,A,B,C)),3)>0;
A(indnan)=NaN;
B(indnan)=NaN;
C(indnan)=NaN;
This code essentially builds a 3D matrix and counts how many NaNs there are in each (i,j,:) array. If the count is greater than 0 (i.e. any of them is NaN), the logical index is true at that position. Finally, we set all those positions to NaN in every matrix, leaving only the positions where no matrix has a NaN.
Ander’s answer is good, but for very large matrices it might be expensive to create that 3D matrix.
First of all, I would suggest putting the matrices into a cell array. That makes it a lot easier to programmatically manage many arrays. That is, instead of A, B, etc, work with C{1}, C{2}, etc:
C = {A,B,C};
It takes essentially zero cost to make this change.
Now, to find all elements where one of the matrices is NaN:
M = isnan(C{1});
for ii=2:numel(C)
M = M | isnan(C{ii});
end
A similar loop then sets the corresponding elements to NaN:
for ii=1:numel(C)
C{ii}(M) = NaN;
end
This latter loop can be replaced by a call to cellfun, but I like explicit loops.
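Putting the two loops together on the three example matrices from the question gives the following sketch. The cell array is called mats here (a name chosen for illustration) so that it does not shadow the matrix C.

```matlab
A = [1:3; 4:6; 7:9];
B = [2 NaN 5; NaN NaN 7; 0 1 NaN];
C = [3 NaN 2; 1 NaN NaN; 1 NaN 5];
mats = {A, B, C};                 % any number of equally sized matrices
% Build the mask of positions where at least one matrix is NaN:
M = false(size(mats{1}));
for ii = 1:numel(mats)
    M = M | isnan(mats{ii});
end
% Blank out those positions in every matrix:
for ii = 1:numel(mats)
    mats{ii}(M) = NaN;
end
```

After this, mats{1}, mats{2} and mats{3} hold the masked versions of A, B and C.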
EDIT: Here are some timings. This is yet another example of loops being faster in modern MATLAB than the equivalent vectorized code. Back in the old days, the loop code would have been 100x slower.
This is the test code:
function so(sz) % input argument is the size of the arrays
C3 = cell(1,3);
for ii=1:numel(C3)
C3{ii} = create(sz,0.05);
end
C20 = cell(1,20);
for ii=1:numel(C20)
C20{ii} = create(sz,0.01);
end
if(~isequal(method1(C3),method2(C3))), error('not equal!'), end
if(~isequal(method1(C20),method2(C20))), error('not equal!'), end
fprintf('method 1, 3 arrays: %f s\n',timeit(@()method1(C3)))
fprintf('method 2, 3 arrays: %f s\n',timeit(@()method2(C3)))
fprintf('method 1, 20 arrays: %f s\n',timeit(@()method1(C20)))
fprintf('method 2, 20 arrays: %f s\n',timeit(@()method2(C20)))
% method 1 is the vectorized code from Ander:
function mask = method1(C)
mask = sum(isnan(cat(3,C{:})),3)>0;
% method 2 is the loop code from this answer:
function mask = method2(C)
mask = isnan(C{1});
for ii=2:numel(C)
mask = mask | isnan(C{ii});
end
function mat = create(sz,p)
mat = rand(sz);
mat(mat<p) = nan;
These are the results on my machine (with R2017a):
>> so(500)
method 1, 3 arrays: 0.003215 s
method 2, 3 arrays: 0.000386 s
method 1, 20 arrays: 0.016503 s
method 2, 20 arrays: 0.001257 s
The loop is roughly 10x faster (about 8x with 3 arrays and 13x with 20)! For small arrays I see much less of a difference, but the loop code is still several times faster, even for 5x5 arrays.

Rearranging elements in a row matlab

I have two matrices in Matlab.
A =
and
B =
I want to move each element of A to the column whose number equals the element's value. I also want to map the elements of B onto A, so that each B element moves to the same position as its A counterpart.
I want this
A =
And therefore,
B =
Is there a way to do this?!
Thanks.
The easiest way I can think of is to create row/column pairs where the rows correspond to the row locations in the matrix and the columns are the matrix elements themselves. The values placed at these row/column pairs are again just the matrix values.
You can very easily do this with sparse. Recreating the matrix above and storing this in A:
A = [1 2 5 8; 1 2 4 7];
... I would do it this way:
r = repmat((1:size(A,1)).', 1, size(A,2)); %'
S = full(sparse(r(:),A(:),A(:)));
The first line of code generates the row location of each value in the matrix A. We then use sparse to specify the row/column pairs and their associated values, and full to convert the result to a proper numeric matrix.
We get:
S =
1 2 0 0 5 0 0 8
1 2 0 4 0 0 7 0
You can also do the same for the matrix B. You'd use sparse and specify the third parameter to be B instead:
B = [0.5 0.2 0.6 0.8; 0.4 0.6 0.8 0.9];
S2 = full(sparse(r(:),A(:),B(:)));
We get:
>> S2
S2 =
0.5000 0.2000 0 0 0.6000 0 0 0.8000
0.4000 0.6000 0 0.8000 0 0 0.9000 0
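With sparse, empty positions come out as 0, which cannot be told apart from a genuine 0 in B. If that matters, one alternative is to preallocate with NaN and place the values by linear index. This is only a sketch, assuming the entries of A are positive integers and no value repeats within a row.

```matlab
A = [1 2 5 8; 1 2 4 7];
B = [0.5 0.2 0.6 0.8; 0.4 0.6 0.8 0.9];
r = repmat((1:size(A,1)).', 1, size(A,2));   % row index of every element
S2 = nan(size(A,1), max(A(:)));              % empty slots stay NaN
% The column index of each element is its value in A:
S2(sub2ind(size(S2), r(:), A(:))) = B(:);
```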

Save Matrix values when values change from NaN to a Number in MATLAB

I have a 3x15000 matrix and I want to save the segments that start where the values change from NaN to a number. There are large sections where all 3 rows are NaN, and each time that changes I want to generate a new matrix. Is it best to index the start and end point? How do I set a flag in order to step through all the data?
For example:
NaN 5.30669473796592 5.82479888441640 NaN
NaN 103.308010031436 103.534581233064 NaN
NaN 1787365.55338272 1745186.16219408 NaN
So I want to save the numeric values in between.
Correct me if I am wrong, but it sounds like that 3x15000 matrix of yours contains some discrete data and you want to save each chunk of data as an individual matrix.
Suppose your matrix looks like this:
h = nan(3,6);
h(:,[2,3,5]) = rand(3,3)
h =
NaN 0.9649 0.9572 NaN 0.1419 NaN
NaN 0.1576 0.4854 NaN 0.4218 NaN
NaN 0.9706 0.8003 NaN 0.9157 NaN
Now you want to copy columns 2,3 into one matrix and column 5 into another. One approach is to first find the columns that contain only NaNs. You could do something like this:
ind = all(isnan(h),1)
ind =
1 0 0 1 0 1
isnan returns a logical array in which the 1s mark the NaNs. all(...,1) then collapses each column to a single flag that is 1 only where every row of that column is NaN. ind contains the flags that you want. To save each chunk of data individually, you can use a simple for loop. Here is a quick and dirty solution:
j = 1;
k = 1;
x = nan(3,1); %temporary matrix to store numerical values
c = cell(2,1); %cell array to store the chunks of data individually
%if you can predict how many elements `c` should have, then
%you can pre-allocate appropriately.
for i=1:length(ind)
%loop through all the columns
if ind(i) == 1
%if we encounter a flag and 'x' has data, dump 'x' into 'c{k}',
%reset 'x' and continue.
if ~all(isnan(x))
c{k} = x;
k = k+1;
end
%reset x
x = nan(3,1);
j=1;
continue
end
x(:,j) = h(:,i);
j = j+1;
%catch data at the end, if last column of h does not contain all NaNs
if i==length(ind)
c{k} = x;
end
end
Your data chunks are stored as matrices in a cell array:
c{1}
ans =
0.9649 0.9572
0.1576 0.4854
0.9706 0.8003
c{2}
ans =
0.1419
0.4218
0.9157
Hope this helps.
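The start and end column of each chunk can also be read directly from the flags with diff, which avoids growing a temporary matrix column by column. The following is a sketch only; valid, d, starts and stops are names introduced here, and h is a fixed example matrix so the result is reproducible.

```matlab
h = nan(3,6);
h(:,[2,3,5]) = [1 2 3; 4 5 6; 7 8 9];
valid = ~all(isnan(h),1);          % true for columns that hold data
d = diff([false valid false]);     % +1 at the start of a run, -1 just past its end
starts = find(d == 1);
stops  = find(d == -1) - 1;
c = cell(numel(starts),1);
for k = 1:numel(starts)
    c{k} = h(:, starts(k):stops(k));   % one chunk per run of data columns
end
```

Padding valid with false on both sides makes sure that a run touching the first or last column still gets a start and an end.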
I'm not sure I understand what kind of output you want, but here is a way to get rid of the NaN elements:
Let's consider this matrix:
A =
NaN 1 2 NaN 3 4 NaN
NaN 5 6 NaN 7 8 NaN
NaN 9 10 NaN 11 12 NaN
Then, using the find command, we can get the column index of every NaN element. Here each NaN-containing column is entirely NaN, as you stated in your question, so these are exactly the columns to drop:
[~ ,col] = find(isnan(A))
col =
1
1
1
4
4
4
7
7
7
Then we can delete them from A to form a new matrix:
A(:,col) = []
A =
1 2 3 4
5 6 7 8
9 10 11 12
Is this what you had in mind? If not please be more specific. Thanks!
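If some columns may contain only a few NaNs and should be kept, a stricter test removes just the columns that are NaN in every row. A minimal sketch using the same example matrix:

```matlab
A = [NaN 1  2 NaN  3  4 NaN;
     NaN 5  6 NaN  7  8 NaN;
     NaN 9 10 NaN 11 12 NaN];
% all(...,1) is true only for columns in which every row is NaN:
A(:, all(isnan(A),1)) = [];
```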

MATLAB remove NaN values from matrix and shift values left

I am trying to compute column-wise differences in the following matrix:
A =
0 NaN NaN 0.3750 NaN
NaN 0.1250 0.2500 0.3750 NaN
I would like to obtain:
0.3750 NaN NaN
0.1250 0.1250 0.1250
Where I am essentially taking a columnwise difference, skipping NaN values and shifting values to the left.
A one-dimensional case would be:
A = [0 NaN 0.250 0.375 NaN 0.625];
NaN_diff(A) = [0.250 0.125 0.250];
Any way to do this efficiently in MATLAB without using inefficient find() queries per row?
Here's a solution that vectorizes most of the operations:
notNan = ~isnan(A);
numNN = sum(notNan,2);
shifted = NaN(size(A));
for r = 1:size(A,1)
myRow = A(r,:);
shifted(r,1:numNN(r)) = myRow(notNan(r,:));
end
nanDiff = diff(shifted,1,2);
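The shifting step can also be written without the loop by sorting the NaN flags of each row. This sketch relies on sort returning the indices of equal elements in their original order, so the non-NaN values keep their sequence; order and rows are names introduced here for illustration.

```matlab
A = [0   NaN    NaN    0.3750 NaN;
     NaN 0.1250 0.2500 0.3750 NaN];
% Sorting the flags puts the non-NaN positions (0) before the NaN ones (1):
[~, order] = sort(isnan(A), 2);
rows = repmat((1:size(A,1)).', 1, size(A,2));
shifted = A(sub2ind(size(A), rows, order));   % values moved to the left
nanDiff = diff(shifted, 1, 2);
```

As with the loop version, the trailing columns of nanDiff are NaN wherever a row ran out of values.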
Here is an alternative vectorized solution:
%// Convert to cell array without NaNs
[rows, cols] = size(A);
C = cellfun(@(x)x(~isnan(x)), mat2cell(A, ones(1, rows), cols), 'Uniform', 0);
%// Compute diff for each row and pad
N = max(sum(~isnan(A), 2));
C = cellfun(@(x)[diff(x) nan(1, N - length(x))], C, 'Uniform', 0);
%// Convert back to a matrix
nandiff = vertcat(C{:});
If you want to pad the result matrix with zeroes instead of NaN values, change the nan function call in nan(1, N - length(x)) to zeros.
Here is an alternative method that does require you to loop over each row, but should still have decent performance and feels very intuitive to me.
B = NaN(size(A,1),size(A,2)-1);
for i = 1:size(A,1)
idx = ~isnan(A(i,:)); % non-NaN entries of row i
B(i,1:sum(idx)-1) = diff(A(i,idx)); % diff yields one element fewer than its input
end
I'm aware that this is a rather old question, but for people like me who stumble into this page, here is a simpler (imho) solution to the question:
A = [0 NaN 0.250 0.375 NaN 0.625];
A(isnan(A))=[]; % identify index of NaN values and remove them from the array
B = diff(A);
Here is another simple solution without using a loop [but assuming all values are in ascending order]:
A=[0 NaN NaN 0.3750 NaN;NaN 0.1250 0.2500 0.3750 NaN]
A(isnan(A(:,1)))=0;
B=sort(A,2);
C=diff(B,1,2)