I'd like to replace every NaN in a vector with the most recent non-NaN value before it:
input = [1 2 3 NaN NaN 2];
output = [1 2 3 3 3 2];
I'd like to speed up the loop I already have:
input = [1 2 3 NaN NaN 2];
if isnan(input(1))
input(1) = 0;
end
for i= 2:numel(input)
if isnan(input(i))
input(i) = input(i-1);
end
end
Thanks in advance.
Since you want the previous non-NaN value, I'll assume that the first value must be a number.
while(any(isnan(input)))
input(isnan(input)) = input(find(isnan(input))-1);
end
I profiled dylan's solution, Oleg's solution, and mine on a vector of 47.7 million elements. The times were 12.3 s for dylan's, 3.7 s for Oleg's, and 1.9 s for mine.
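To see why the while loop needs more than one pass: the right-hand side is evaluated before anything is assigned, so each pass fills only the first NaN of every run. The sketch below counts the passes on the question's vector. The counter passes and the helper nanPos are names introduced here for illustration only.

```matlab
x = [1 2 3 NaN NaN 2];
passes = 0;                      % illustrative counter, not part of the answer
while any(isnan(x))
    nanPos = find(isnan(x));     % positions that are still NaN
    x(nanPos) = x(nanPos - 1);   % copy the left neighbour (may itself be NaN)
    passes = passes + 1;
end
disp(x)       % 1 2 3 3 3 2
disp(passes)  % 2
```

The number of passes equals the length of the longest NaN run, so the approach does the least work when runs are short. As in the answer, the first element is assumed to be a number.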
Here is a commented solution. It works for a vector only but might be extended to work on a matrix:
A = [NaN NaN 1 2 3 NaN NaN 2 NaN NaN NaN 3 NaN 5 NaN NaN];
% start/end positions of NaN sequences
sten = diff([0 isnan(A) 0]);
B = [NaN A];
% replace with previous non NaN
B(sten == -1) = B(sten == 1);
% Trim first value (previously padded)
B = B(2:end);
Comparison
A: NaN NaN 1 2 3 NaN NaN 2 NaN NaN NaN 3 NaN 5 NaN NaN
B: NaN NaN 1 2 3 NaN 3 2 NaN NaN 2 3 3 5 NaN 5
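As the comparison shows, the snippet above fills only the last NaN of each run and leaves leading NaNs untouched. A minimal sketch of a loop-free forward fill that covers every NaN in a run follows. It is a different technique (cumulative counting of valid samples), not the method above, and the names valid, vals and filled are illustrative.

```matlab
A = [NaN NaN 1 2 3 NaN NaN 2 NaN NaN NaN 3 NaN 5 NaN NaN];
valid  = ~isnan(A);                 % true where A holds a usable number
vals   = [NaN A(valid)];            % valid values, padded in front so leading NaNs stay NaN
filled = vals(cumsum(valid) + 1);   % each position picks the latest valid value seen so far
disp(filled)
```

Here filled is [NaN NaN 1 2 3 3 3 2 2 2 2 3 3 5 5 5]. In MATLAB R2016b and later, fillmissing(A,'previous') is a built-in alternative.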
Not fully vectorized but quite simple and probably still fairly efficient:
x = [1 2 3 NaN NaN 2];
for f = find(isnan(x))
x(f)=x(f-1);
end
Of course this is only slightly different from the solution provided by @Hugh Nolan
nan_ind = find(isnan(A)==1);
A(nan_ind) = A(nan_ind-1);
Related
I have a 1-D data file with occasional NaN values. If I apply movmean to this input data, is there a simple way to set the movmean value to NaN if the number of NaN values within the moving window is greater than a threshold value? For example, if the window length is 10 and the threshold value is 3, I would like the movmean value to be NaN for this set of 10 values:
[1 3 NaN 4 NaN 2 5 NaN NaN 3]
but to give me a valid movmean value for this set of 10 values:
[1 3 2 4 NaN NaN 3 2 5 3]
This is a matlab question, and you can do something like the following:
w = 10; t = 3;
A = [1 3 NaN 4 NaN 2 5 NaN NaN 3];
M = movmean(A,w,'omitnan');
N = movsum(isnan(A),w) >= t;
M(N) = NaN;
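As a quick check, the sketch below applies this approach to both vectors from the question. It assumes the threshold is meant strictly (more than t NaNs in the window), hence > rather than >=. It inspects only element 6, which for an even window of 10 is the one position whose window spans all ten samples. The names A1, A2, M1 and M2 are illustrative.

```matlab
w = 10; t = 3;
A1 = [1 3 NaN 4 NaN 2 5 NaN NaN 3];   % four NaNs in the full window
A2 = [1 3 2 4 NaN NaN 3 2 5 3];       % two NaNs in the full window

M1 = movmean(A1, w, 'omitnan');
M1(movsum(isnan(A1), w) > t) = NaN;   % too many NaNs: invalidate the mean

M2 = movmean(A2, w, 'omitnan');
M2(movsum(isnan(A2), w) > t) = NaN;

disp([M1(6) M2(6)])   % M1(6) is NaN, M2(6) is the mean of the 8 valid values
```

Near the ends of the vector the window shrinks by default, so fewer than w samples contribute there and the NaN count is taken over that shorter window.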
Knowing that:
There is a lot of discussion about plotting equal-sized matrices stored in a cell array, and it is quite easy to do without a loop.
For example, to plot the 2-by-2 matrices in mycell:
mycell = {[1 1; 2 1], [1 1; 3 1], [1 1; 4 1]};
We can use cellfun to add a row of NaN at the bottom of each matrix and then convert the cell to a matrix:
mycellnaned = cellfun(@(x) {[x;nan(1,2)]}, mycell);
mymat = cell2mat(mycellnaned); % mycell is a row cell here, so no transpose is needed
mymat looks like:
1 1 1 1 1 1
2 1 3 1 4 1
NaN NaN NaN NaN NaN NaN
Then we can plot it easily:
mymatx = mymat(:,1:2:end);
mymaty = mymat(:,2:2:end);
figure;
plot(mymatx, mymaty,'+-');
The problem:
The problem now is: how do I do something similar with a cell containing matrices of unequal size? Such as:
mycell = {
[1:2; ones(1,2)]';
[1:4; ones(1,4)*2]';
[1:6; ones(1,6)*3]';
[1:8; ones(1,8)*4]';
[1:10; ones(1,10)*5]';
[1:12; ones(1,12)*6]';
};
mycell = repmat(mycell,1000,1);
I would not be able to convert them into one matrix like I did before. I could use a loop, as suggested in this answer, but it would be very inefficient if the cell contains thousands of matrices.
Therefore, I'm looking for a more efficient way of plotting non-equal sized matrices in a cell array.
Note that different colours should be used for different matrices in the figure.
Well, while I was writing the question, I figured it out...
I'd like to keep the question open since there might be better solutions.
For everyone else's reference, the solution is simple: add NaN to make the matrices equal sized:
% find out the maximum length of all matrices in the array
cellLengthMax = max(cellfun('length', mycell));
% fill the matrices so they are equal in size.
mycellfilled = cellfun(@(x) {[
x
nan(cellLengthMax-size(x,1), 2)
nan(1, 2)
]}, mycell);
Then convert to a matrix and plot:
mymat = cell2mat(mycellfilled');
mymatx = mymat(:,1:2:end);
mymaty = mymat(:,2:2:end);
figure;
plot(mymatx, mymaty,'+-');
mymat looks like:
1 1 1 2 1 3 1 4 1 5 1 6
2 1 2 2 2 3 2 4 2 5 2 6
NaN NaN 3 2 3 3 3 4 3 5 3 6
NaN NaN 4 2 4 3 4 4 4 5 4 6
NaN NaN NaN NaN 5 3 5 4 5 5 5 6
NaN NaN NaN NaN 6 3 6 4 6 5 6 6
NaN NaN NaN NaN NaN NaN 7 4 7 5 7 6
NaN NaN NaN NaN NaN NaN 8 4 8 5 8 6
NaN NaN NaN NaN NaN NaN NaN NaN 9 5 9 6
NaN NaN NaN NaN NaN NaN NaN NaN 10 5 10 6
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11 6
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 12 6
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Update:
Time cost for plotting 6000 matrices:
using the solution proposed here: 1.183546 seconds.
using a loop: 3.450423 seconds.
Still not very satisfactory. I would really like to reduce the time to 0.1 seconds, because I'm trying to design an interactive UI where the user can change a few parameters and the result gets plotted instantly.
I don't want to reduce the resolution of the figure.
Update:
I ran the profiler and it seems that 99% of the time is spent in the call plot(mymatx, mymaty,'+-'). So the conclusion is that there is probably no other way to speed this up.
I have the 4x2 matrix A:
A = [2 NaN 5 8; 14 NaN 23 NaN]';
I want to replace the non-NaN values with their associated indices within each column in A. The output looks like this:
out = [1 NaN 3 4; 1 NaN 3 NaN]';
I know how to do it for each column manually, but I would like an automatic solution, as I have much larger matrices to handle. Does anyone have any idea?
out = bsxfun(@times, A-A+1, (1:size(A,1)).');
How it works:
A-A+1 replaces actual numbers in A by 1, and keeps NaN as NaN
(1:size(A,1)).' is a column vector of row indices
bsxfun(@times, ...) multiplies both of the above with singleton expansion.
As pointed out by @thewaywewalk, from MATLAB R2016b onwards bsxfun(@times, ...) can be replaced by .*, as singleton expansion is enabled by default:
out = (A-A+1) .* (1:size(A,1)).';
An alternative suggested by @Dev-Il is
out = bsxfun(@plus, A*0, (1:size(A,1)).');
This works because multiplying by 0 replaces actual numbers by 0, and keeps NaN as is.
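One caveat, offered as an observation rather than something the answers above claim: both A-A and A*0 also turn Inf into NaN, because Inf-Inf and Inf*0 are NaN. If the data may contain Inf, a mask-based variant keeps those entries. The sketch below uses a made-up matrix with one Inf. The names idx, outArith and outMask are illustrative.

```matlab
A = [2 NaN 5 Inf; 14 NaN 23 NaN].';            % 4x2, with one Inf added for illustration
idx = repmat((1:size(A,1)).', 1, size(A,2));   % row index of every element

outArith = A*0 + idx;        % arithmetic trick: Inf*0 is NaN, so the Inf entry is lost
outMask = idx;
outMask(isnan(A)) = NaN;     % mask-based: only true NaNs are blanked

disp(outArith)
disp(outMask)
```

With this input, outArith has NaN in row 4 of both columns, while outMask keeps the index 4 where A holds Inf.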
Applying ind2sub to a mask created with isnan will do.
mask = find(~isnan(A));
[rows,~] = ind2sub(size(A),mask)
A(mask) = rows;
Note that the second output of ind2sub must be requested as well (and can be discarded with ~), as in [rows,~], to indicate that you want the subscripts of a 2-D matrix.
A =
1 1
NaN NaN
3 3
4 NaN
A.' =
1 NaN 3 4
1 NaN 3 NaN
Also be careful with the two different transpose operators ' and .': the first is the complex conjugate transpose, the second the plain transpose.
Alternative
[n,m] = size(A);
B = ndgrid(1:n,1:m);
B(isnan(A)) = NaN;
or even (with a little inspiration by Luis Mendo)
[n,m] = size(A);
B = A-A + ndgrid(1:n,1:m)
or in one line
B = A-A + ndgrid(1:size(A,1),1:size(A,2))
This can be done using repmat and isnan as follows:
A = [ 2 NaN 5 8;
14 NaN 23 NaN];
out=repmat([1:size(A,2)],size(A,1),1); % out contains indexes of all the values
out(isnan(A))= NaN % Replacing the indexes where NaN exists with NaN
Output:
1 NaN 3 4
1 NaN 3 NaN
You can take the transpose if you want.
I'm adding another answer for a couple of reasons:
Because overkill (*ahem* kron *ahem*) is fun.
To demonstrate that A*0 does the same as A-A.
A = [2 NaN 5 8; 14 NaN 23 NaN].';
out = A*0 + kron((1:size(A,1)).', ones(1,size(A,2)))
out =
1 1
NaN NaN
3 3
4 NaN
I would appreciate if someone can help me with this problem...
I have a vector
A = [NaN 1 1 1 1 NaN NaN NaN NaN NaN 2 2 2 NaN NaN NaN 2 NaN NaN 3 NaN NaN];
I would like to fill the NaN values according to this logic.
1) if the value that precedes the sequence of NaN is different from the one that follows the sequence => assign half of the NaNs to the first value and half to the second value
2) if the NaN sequence is between 2 equal values => fill the NaNs with that value.
A should be then:
A = [1 1 1 1 1 1 1 (1) 2 2 2 2 2 2 2 2 2 2 3 3 3 3]
I have put one 1 within brackets because I assigned that value to the first half: that sequence of NaNs has odd length.
I am typing this on my phone, without MATLAB, so there may be some issues. But this should be close:
t = 1:numel(A);
Anew = interp1(t(~isnan(A)),A(~isnan(A)),t,'nearest','extrap');
If you have the image processing toolbox, you can use bwdist to calculate the index of the nearest non-NaN-neighbor:
nanMask = isnan(A);
[~,idx] = bwdist(~nanMask);
A(nanMask) = A(idx(nanMask));
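If relying on tie-breaking inside a library function is undesirable and the Image Processing Toolbox is not available, the rule from the question can be spelled out directly. The following is a sketch, not either answer's method. It assumes A is a row vector with at least one non-NaN value and sends ties (the middle element of an odd run) to the preceding value. All helper names (prevIdx, nextIdx, src, filled) are introduced here.

```matlab
A = [NaN 1 1 1 1 NaN NaN NaN NaN NaN 2 2 2 NaN NaN NaN 2 NaN NaN 3 NaN NaN];
n = numel(A);
valid = ~isnan(A);
pos = find(valid);                        % positions of the non-NaN values

c = cumsum(valid);                        % number of valid values at or before each position
prevIdx = zeros(1, n);
prevIdx(c > 0) = pos(c(c > 0));           % nearest valid position to the left (0 if none)

cr = cumsum(valid(end:-1:1));
cr = cr(end:-1:1);                        % number of valid values at or after each position
nextIdx = zeros(1, n);
nextIdx(cr > 0) = pos(numel(pos) - cr(cr > 0) + 1);   % nearest valid position to the right

dPrev = (1:n) - prevIdx;  dPrev(prevIdx == 0) = Inf;  % distance to the left neighbour
dNext = nextIdx - (1:n);  dNext(nextIdx == 0) = Inf;  % distance to the right neighbour

src = nextIdx;
usePrev = dPrev <= dNext;                 % ties go to the preceding value
src(usePrev) = prevIdx(usePrev);
filled = A(src);
disp(filled)
```

For the vector in the question this gives eight 1s, ten 2s and four 3s, with the middle NaN of the five-element run assigned to 1.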
I have a vector of values such as the following:
1
2
3
NaN
4
7
NaN
NaN
54
5
2
7
2
NaN
NaN
NaN
5
54
3
2
NaN
NaN
NaN
NaN
4
NaN
How can I use interp1 in such a way that only runs of NaN values up to a given length are interpolated? For example, I would want to interpolate only those NaN values that occur in runs of at most three consecutive NaNs. So NaN, NaN NaN and NaN NaN NaN would be interpolated, but not NaN NaN NaN NaN.
Thank you for any help =)
P.S. If I can't do this with interp1, any ideas how to do this in another way? =)
To give an example, the vector I gave would become:
1
2
3
interpolated
4
7
interpolated
interpolated
54
5
2
7
2
interpolated
interpolated
interpolated
5
54
3
2
NaN
NaN
NaN
NaN
4
interpolated
First of all, find the positions and lengths of all sequences of NaN values:
nan_idx = isnan(x(:))';
nan_start = strfind([0, nan_idx], [0 1]);
nan_len = strfind([nan_idx, 0], [1 0]) - nan_start + 1;
Next, find the indices of the NaN elements not to interpolate:
thr = 3;
nan_start = nan_start(nan_len > thr);
nan_end = nan_start + nan_len(nan_len > thr) - 1;
idx = cell2mat(arrayfun(@colon, nan_start, nan_end, 'UniformOutput', false));
Now, interpolate everything and replace the elements that shouldn't have been interpolated back with NaN values:
x_new = interp1(find(~nan_idx), x(~nan_idx), 1:numel(x));
x_new(idx) = NaN;
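The same idea can be sketched with diff instead of strfind for locating the runs, applied to the vector from the question (written as a row here). This is an illustrative variant that assumes linear interpolation is wanted. The names runStart, runEnd, runLen and xi are chosen here.

```matlab
x = [1 2 3 NaN 4 7 NaN NaN 54 5 2 7 2 NaN NaN NaN 5 54 3 2 NaN NaN NaN NaN 4 NaN];
thr = 3;                                   % longest run that may be interpolated

isn = isnan(x);
d = diff([0 isn 0]);                       % +1 where a NaN run starts, -1 just after it ends
runStart = find(d == 1);
runEnd = find(d == -1) - 1;
runLen = runEnd - runStart + 1;

% interpolate everything first (linear, no extrapolation) ...
xi = interp1(find(~isn), x(~isn), 1:numel(x));

% ... then put NaN back into the runs that are too long
for r = find(runLen > thr)
    xi(runStart(r):runEnd(r)) = NaN;
end
disp(xi)
```

The trailing NaN stays NaN because interp1 does not extrapolate unless asked to, so it is not filled in, unlike in the expected output shown in the question.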
I know this is a bad habit in MATLAB, but I would think this particular case requires a loop:
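function out = f(v)
% Linearly interpolate runs of at most 3 consecutive NaNs; longer runs stay NaN.
out = v;
k = 0; % number of consecutive NaN values encountered so far
for i = 1:numel(v)
    if ~isnan(v(i)) % note: v(i) ~= NaN is always true, so isnan is required
        % a run of k NaNs has just ended at i-1; fill it only if it is short
        % enough and has a valid neighbour on its left
        if k > 0 && k <= 3 && i-k-1 >= 1
            out(i-k:i-1) = interp1([i-k-1, i], [v(i-k-1), v(i)], i-k:i-1);
        end
        k = 0;
    else
        k = k + 1;
    end
end
% leading and trailing NaN runs have only one neighbour and are left as NaN
end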