MATLAB: Interpolation over NaNs in a time series

Question: How to locally interpolate over small lengths of NaNs?
I have a time series ("x" data sampled evenly at "t" times) that has blocks of NaNs.
For example:
x = [ 1 2 4 2 3 15 10 NaN NaN NaN NaN 2 4 NaN 19 25]
t = [0.1 0.2 0.3 ...etc..]
I want to perform interpolation over the NaN.
The most elementary approach would be to just linearly interpolate from the left-most data point to the right-most data point. Eg. a line from x = 10 to x = 2 and the 4 NaN values will be assigned values from the line.
The length of the time series is ~1.5 million with ~10000 NaNs, so I don't want to incorporate data (in the interpolation) that is far away from the NaN locations. Some of the NaNs span a length of 1000-2000.
x(isnan(x)) = interp1(find(~isnan(x)), x(~isnan(x)), find(isnan(x)), 'linear');
will linearly interpolate over the NaN using the whole time series.
How would I interpolate locally? Linear should be sufficient. Perhaps linear interpolation incorporating a few points to the left and to the right of the NaN blocks (maybe 100-200 points). A natural neighbour or spline (?) algorithm might be more suitable; I must be careful in not adding anomalous behaviour to the time series (e.g. interpolation that adds fictitious "power" to a frequency).
UPDATE:
The time series is a record of a minute-sampled temperature over a year long period. Linear interpolation is sufficient; I just need to fill in the ~6-7 hour length gaps of NaNs (I am provided with data before the NaN gaps and after the NaN gaps).

I think this is (at least partially) what you seek:
% example data
x = [ 1 2 4 2 3 15 10 NaN NaN NaN NaN 2 4 NaN 19 25];
t = linspace(0.1, 10, numel(x));
% indices to NaN values in x
% (assumes there are no NaNs in t)
nans = isnan(x);
% replace all NaNs in x with linearly interpolated values
x(nans) = interp1(t(~nans), x(~nans), t(nans));
Note that you can easily switch the interpolation method here:
% cubic splines
x(nans) = interp1(t(~nans), x(~nans), t(nans), 'spline');
% nearest neighbor
x(nans) = interp1(t(~nans), x(~nans), t(nans), 'nearest');
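On the "local" part of the question: with the default 'linear' method, interp1 only uses the two valid samples that bracket each query point, so data far from a gap never enters the result. If gaps above a certain length should be left unfilled, a minimal sketch could look like the following. The name maxGap and its value are assumptions made up for illustration, not something from the question.

```matlab
% example data (same as above)
x = [1 2 4 2 3 15 10 NaN NaN NaN NaN 2 4 NaN 19 25];
t = linspace(0.1, 10, numel(x));
maxGap = 2;                          % hypothetical limit, in samples

nans = isnan(x);
% locate the first and last index of every run of NaNs
d = diff([0, nans, 0]);
runStart = find(d == 1);
runEnd   = find(d == -1) - 1;

% mark only the runs that are short enough to fill
fillMask = false(size(x));
for k = 1:numel(runStart)
    if runEnd(k) - runStart(k) + 1 <= maxGap
        fillMask(runStart(k):runEnd(k)) = true;
    end
end

% linear interpolation from the valid samples, evaluated at the short gaps only
x(fillMask) = interp1(t(~nans), x(~nans), t(fillMask));
```

With maxGap = 2 only the single NaN at position 14 is filled; the run of four NaNs is left as it is.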

Consider using inpaint_nans, a very nice tool designed to interpolate NaN elements in a 1-d or 2-d array using non-NaN elements. It can also extrapolate, as it does not use a triangulation of the data. It also allows different approaches to the interpolation.

Related

Matlab: How to turn a vector of 2928x1 into a vector of 8784x1 with specific intervals of elements?

I am trying to convert a vector with 2928 values into a vector with 8784 values. The first vector is a vector with info with an interval of 3hours, and I would like to have an hourly vector with those values added every 3 hours and the remaining should be filled with NaN.
My first approach was to create a NaN vector with 8784 values but then I have not been able to create a 'for loop' that worked with this.
To make it simple, I'll try to explain with an example (n is the number of values of the smallest vector):
S_3h = ones(n,1); % this acts as the small vector that has only information each 3hours
B_h = nan(3*n,1); %this is the created hourly vector that I want to fulfill
The result wanted is:
B_h = [1 nan nan 1 nan nan 1 nan nan 1 nan nan ...]
Can you help me please?
Thank you very much in advance!
Simply index with a step other than 1. In this case, the step is 3.
B_h(1:3:end) = S_3h
There are already two good answers, so just for the sport (and for the kron...), here is a one-liner solution:
S_3h = randi(10,1,n);
B_h=kron(S_3h,[1 NaN NaN]);
Zizy Archer's solution is good (and probably what you should use), but below is another option.
S_3h = ones(n,1);
B_h = nan(3,n); % notice the different dimensions: 3-by-n instead of 3*n-by-1
B_h(1,:) = S_3h; % the top row contains the non-NaN values. This is common to all methods.
B_h = B_h(:); % reshape to a column vector
Done a bit differently:
B_h = reshape( S_3h.' .* [1; NaN(2,1)],[],1); % R2016b onward
B_h = reshape( bsxfun(@times, S_3h.',[1; NaN(2,1)]),[],1 ); % R2007a onward
If you have the image processing toolbox, you can also use the padarray function like so:
B_h = reshape(padarray(S_3h, [0 2], NaN, 'post').', [], 1);
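To check that the toolbox-free approaches above really produce the same vector, a small sanity check can help. This is only a sketch: n = 4 and the values 1..4 are arbitrary test inputs, and the kron call is written for a column vector so that every result is 3*n-by-1. isequaln is used because it treats NaNs in matching positions as equal.

```matlab
n = 4;
S_3h = (1:n).';                       % arbitrary test values, column vector

% stepped indexing
B1 = nan(3*n, 1);
B1(1:3:end) = S_3h;

% kron (column form)
B2 = kron(S_3h, [1; NaN; NaN]);

% fill the first row of a 3-by-n array, then unroll column by column
B3 = nan(3, n);
B3(1, :) = S_3h;
B3 = B3(:);

% bsxfun
B4 = reshape(bsxfun(@times, S_3h.', [1; NaN(2,1)]), [], 1);

% isequaln treats NaNs in matching positions as equal
same = isequaln(B1, B2) && isequaln(B1, B3) && isequaln(B1, B4);
disp(same)
```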

Using interp2 in Matlab with NaN inputs

I have some observational data that is relatively complete but contains some NaN values, stored in a matrix in MATLAB, and I want to interpolate it to a more evenly spaced grid using interp2.
So, to keep things simple, let's say I have one complete (no NaN values) matrix, and one that looks something like:
A = [ 1 2 3 4;
2 3 2 NaN;
0 2 3 4;
0 NaN 4 5 ]
with B and C being complete matrices. interp2 won't accept an input matrix with NaN values, so if I do something like this:
[AI,BI] = meshgrid(a,b) %# matrices to interpolate data to, arbitrary
CI = interp2(A,B,C,AI,BI) %# interpolation, A has NaN values
I get an error:
Error using griddedInterpolant
The coordinates of the input points must be finite values; Inf and NaN are not permitted.
Can anyone suggest either a solution, or reasonable work around that doesn't obstruct my data?
Sorry, the quick fix I gave in the comments does not work directly for 2D data (it does work that simply with interp1, though, if you ever need it).
For gridded data, if you have NaNs in your grid then you do not have a uniform grid and you cannot use interp2 directly. In this case you have to use griddata first, to re-interpolate your data over a uniform grid (patch the holes basically).
(1) Let's show an example inspired by the MATLAB documentation:
%% // define a surface
[A,B] = meshgrid(-3:0.25:3);
C = peaks(A,B);
%// poke some holes in it (in every coordinate set)
A(15,3:8) = NaN ;
B(14:18,13) = NaN ;
C(8,16:21) = NaN ;
(2) Now let's fix your data on a clean grid:
%// identify the indices that are valid in all 3 matrices
idxgood=~(isnan(A) | isnan(B) | isnan(C));
%// define a "uniform" grid without holes (same boundaries and sampling as the original grid)
[AI,BI] = meshgrid(-3:0.25:3) ;
%// re-interpolate scattered data (only valid indices) over the "uniform" grid
CI = griddata( A(idxgood),B(idxgood),C(idxgood), AI, BI ) ;
(3) Once your grid is uniform, you can then use interp2 if you want to mesh on a finer grid for example:
[XI,YI] = meshgrid(-3:0.1:3) ; %// create finer grid
ZI = interp2( AI,BI,CI,XI,YI ) ; %// re-interpolate
However, note that if this is all you wanted to do, you could also use griddata only, and do everything in one step:
%// identify the indices that are valid in all 3 matrices
idxgood=~(isnan(A) | isnan(B) | isnan(C));
%// define a "uniform" grid without holes (finer grid than original grid)
[XI,YI] = meshgrid(-3:0.1:3) ;
%// re-interpolate scattered data (only valid indices) over the "uniform" grid
ZI = griddata( A(idxgood),B(idxgood),C(idxgood), XI, YI ) ;
This produces exactly the same grid and data as we obtained in step (3) above.
Last note: In case your NaNs are on the border of your domain, by default these functions cannot "interpolate" values for the border. To force them to do so, look at the extrapolation options of these functions, or simply interpolate on a slightly smaller grid that has no NaNs on the border.
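For the border case, one option is scatteredInterpolant, which takes an explicit extrapolation method as its fifth argument. The following is a sketch under the assumption that only the values C contain NaNs (the coordinates are complete) and that nearest-neighbour extrapolation is acceptable for your data; the hole position is made up for illustration.

```matlab
% surface with a hole on the border of the domain
[A, B] = meshgrid(-3:0.25:3);
C = peaks(A, B);
C(1, 1:5) = NaN;                      % illustrative hole touching the edge

good = ~isnan(C);
% linear interpolation inside the convex hull of the valid points,
% nearest-neighbour extrapolation outside of it
F = scatteredInterpolant(A(good), B(good), C(good), 'linear', 'nearest');

CI = C;
CI(~good) = F(A(~good), B(~good));    % fill only the missing entries
```

Values that were valid are left untouched; only the NaN entries are replaced.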

Trend values of a 3d matrix having nan values

I have a 3D matrix (19-by-21-by-23) of mean AOD values. My dataset consists of NaN values at many positions in the matrix. I wish to calculate the trend of each grid of the matrix, but I am somehow getting incorrect results and mostly NaN value as the trend. Help me in correcting the code. The code I wrote is below:
y=1:21;
% = mat2cell(y,1*ones(1,1),21);
% Mcell = mat2cell(aa,19,repmat(1,21,1));
kq= zeros(19,2);
X = mat2cell(winter,1*ones(19,1),21);
for i=1:19;
for k=1:23;
u(i,:,:)= winter(i,:,:);
kq(i,:,:)= polyfit(u(i,:,:),y,1);
end
end
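One possible way to get a per-cell trend while skipping NaNs is to fit each cell's time series separately and pass only the valid samples to polyfit. The sketch below is not taken from the question: it assumes that time runs along the third dimension (23 steps) and that the trend is the slope of a straight-line fit, and it uses a synthetic array with a known slope in place of the real winter data. If time is along another dimension, permute the array first.

```matlab
% synthetic stand-in for the 19-by-21-by-23 array (assumption: time is dim 3)
rng(0);
nr = 19; nc = 21; nt = 23;
tt = (1:nt).';
winter = repmat(reshape(0.5 + 0.01*tt, 1, 1, nt), nr, nc, 1);  % slope 0.01
winter(rand(size(winter)) < 0.2) = NaN;                        % knock out 20%

trend = nan(nr, nc);                     % slope per grid cell
for i = 1:nr
    for j = 1:nc
        s  = squeeze(winter(i, j, :));   % time series of one cell (nt-by-1)
        ok = ~isnan(s);
        if nnz(ok) >= 2                  % a line needs at least two points
            c = polyfit(tt(ok), s(ok), 1);
            trend(i, j) = c(1);          % c(1) is the slope, c(2) the intercept
        end
    end
end
```

Note that polyfit expects the independent variable (time) first and the data second; cells with fewer than two valid samples keep a NaN trend.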

Voxel neighborhood indexing - Detecting "out of bounds" in 26 neighbor access with linear indexing

Well, I don't know how to describe my problem with a title, I hope the one I got is correct.
I have a matrix (M in the example below) that is a 3D image, composed, in this case, of 11x11x11 voxels (I made it logical just for simplicity, and the size is also just an example).
In my code, I need to reach the 26 neighbors of some voxels, and for that I use some fancy linear indexing found in: http://www.mathworks.com/matlabcentral/answers/86900-how-to-find-all-neighbours-of-an-element-in-n-dimensional-matrix
The problem is that if the point is on the "boundary" of M, the code tries to access out-of-bounds values, and that generates an error.
To solve this problem, a good approach would be to create a boundary around M, making it +2 in size in every dimension, and populate it with zeros. However, I would really like to avoid changing M, as my code is quite a bit more complex than the one in the example.
I can't find any way of doing it; I'm a bit stuck here. Any suggestions?
EDIT: @Dan's answer works, however I would like to see if there is a possible solution using this linear indexing method.
% Example data
M=round(randn(11,11,11))~=0;
% Fancy way of storing the 26 neighbour offsets for easy access
s=size(M);
N=length(s);
[c1{1:N}]=ndgrid(1:3);
c2(1:N)={2};
neigh26=sub2ind(s,c1{:}) - sub2ind(s,c2{:});
point=[5 1 6];
% This will work unless the point is in the boundary (like in this example)
neighbours=M(sub2ind(s,point(1),point(2),point(3))+neigh26)
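Is that linear indexing stuff essential? Because it's pretty easy to handle boundary conditions if you use subscript indexing and min and max like this:
p = [5, 1, 6];
neighbourhood = M(max(1,p(1)-1):min(p(1)+1,end), ...
                  max(1,p(2)-1):min(p(2)+1,end), ...
                  max(1,p(3)-1):min(p(3)+1,end));
%// Get rid of the point itself. Its position inside the clipped block is
%// p - max(1,p-1) + 1, which is linear index 14 only when p is not on the boundary
c = p - max(1, p-1) + 1;
centre = sub2ind(size(neighbourhood), c(1), c(2), c(3));
neighbours = neighbourhood([1:centre-1, centre+1:end])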
This way you can also easily generalize this if you want a broader neighbourhood:
p = [5, 1, 6];
n = 2;
neighbourhood = M(max(1,p(1)-n):min(p(1)+n,end), ...
                  max(1,p(2)-n):min(p(2)+n,end), ...
                  max(1,p(3)-n):min(p(3)+n,end));
%// Get rid of the point itself (it is off-centre if the block was clipped)
c = p - max(1, p-n) + 1;
mid = sub2ind(size(neighbourhood), c(1), c(2), c(3));
neighbours = neighbourhood([1:mid-1, mid+1:end])
or if you liked to keep the cube shape then maybe:
neighbours = double(neighbourhood); %// a logical array cannot hold NaN
neighbours(mid) = NaN;
If you want to use this many times in your code it's probably best to refactor it as an m-file function that just returns the indices:
function ind = getNeighbours(M,p,n)
M = zeros(size(M));
M(max(1,p(1)-n):min(p(1)+n,end), max(1,p(2)-n):min(p(2)+n,end), max(1,p(3)-n):min(p(3)+n,end)) = 1;
M(p(1), p(2), p(3)) = 0;
ind = find(M);
end
Basic theory: Pad the input array with one layer of NaNs on each side of every dimension (left-right, up-down, and both ends of the third dimension). This allows us to use a uniform 3x3x3 grid everywhere, and later on use those NaNs to detect elements that fall outside the boundaries of the input array and are therefore to be discarded.
Code
%// Initializations
sz_ext = size(M)+2; %// Get size of padded/extended input 3D array
M_ext = NaN(sz_ext); %// Initialize extended array
M_ext(2:end-1,2:end-1,2:end-1) = M; %// Insert values from M into it
%// Important stuff here : Calculate linear offset indices within one 3D slice
%// then for neighboring 3D slices too
offset2D = bsxfun(@plus,[-1:1]',[-1:1]*sz_ext(1)); %//'
offset3D = bsxfun(@plus,offset2D,permute([-1:1]*sz_ext(1)*sz_ext(2),[1 3 2]));
%// Get linear indices for all points
points_linear_idx = sub2ind(size(M_ext),point(:,1)+1,point(:,2)+1,point(:,3)+1);
%// Linear indices for all neighboring elements for all points; index into M_ext
neigh26 = M_ext(bsxfun(@plus,offset3D,permute(points_linear_idx,[4 3 2 1])))
How to use: Each slice along the 4th dimension holds the 27 elements (the neighbours plus the element itself) as a 3x3x3 array. Hence, neigh26 is a 3x3x3xN array, where N is the number of points in the point array.
Example: As an example, let's assume some random values in M and point -
M=rand(11,11,11);
point = [
1 1 4;
1 7 1]
On running the earlier code with these inputs, I get something like this -
neigh26(:,:,1,1) =
NaN NaN NaN
NaN 0.5859 0.4917
NaN 0.6733 0.6688
neigh26(:,:,2,1) =
NaN NaN NaN
NaN 0.0663 0.5544
NaN 0.3440 0.3664
neigh26(:,:,3,1) =
NaN NaN NaN
NaN 0.3555 0.1257
NaN 0.4424 0.9577
neigh26(:,:,1,2) =
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
neigh26(:,:,2,2) =
NaN NaN NaN
0.7708 0.3712 0.2866
0.7088 0.3743 0.2326
neigh26(:,:,3,2) =
NaN NaN NaN
0.4938 0.5051 0.9416
0.1966 0.0213 0.8036
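Regarding the EDIT in the question (a linear-indexing solution that leaves M untouched): one option is to build the neighbour subscripts first, drop the ones that fall outside the array, and only then convert to linear indices. This is a sketch of that idea rather than a tuned implementation; the variable names are made up for illustration.

```matlab
% example data from the question
M = round(randn(11,11,11)) ~= 0;
s = size(M);
point = [5 1 6];                        % lies on the boundary (second subscript is 1)

% all 27 offsets of the 3x3x3 block around the point
[di, dj, dk] = ndgrid(-1:1);
off = [di(:), dj(:), dk(:)];
sub = [off(:,1) + point(1), off(:,2) + point(2), off(:,3) + point(3)];

% keep subscripts inside the array and drop the centre itself
inBounds = all(sub >= 1, 2) & sub(:,1) <= s(1) & sub(:,2) <= s(2) & sub(:,3) <= s(3);
isCentre = all(off == 0, 2);
keep = inBounds & ~isCentre;

% convert the surviving subscripts to linear indices into the original M
idx = sub2ind(s, sub(keep,1), sub(keep,2), sub(keep,3));
neighbours = M(idx);
```

For the boundary point above, 9 of the 27 offsets fall outside M, so 17 neighbours remain once the centre is removed.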

How to hide zero values in bar3 plot in MATLAB

I've got a 2-D histogram (the plot is 3D - several histograms graphed side by side) that I've generated with the bar3 plot command. However, all the zero values show up as flat squares in the x-y plane. Is there a way I can prevent MATLAB from displaying the values? I already tried replacing all zeros with NaNs, but it didn't change anything about the plot. Here's the code I've been experimenting with:
x1=normrnd(50,15,100,1); %generate random data to test code
x2=normrnd(40,13,100,1);
x3=normrnd(65,12,100,1);
low=min([x1;x2;x3]);
high=max([x1;x2;x3]);
y=linspace(low,high,(high-low)/4); %establish consistent bins for histogram
z1=hist(x1,y);
z2=hist(x2,y);
z3=hist(x3,y);
z=[z1;z2;z3]';
bar3(z)
As you can see, there are quite a few zero values on the plot. Closing the figure and re-plotting after replacing zeros with NaNs seems to change nothing:
close
z(z==0)=NaN;
bar3(z)
One solution is to modify the graphics objects created by bar3. First, you have to get the handles returned from bar3:
h = bar3(z);
In your case, h will be a 3-element vector of handles, one for each set of colored bars. The following code should then make the bins with counts of zero invisible:
for i = 1:numel(h)
index = logical(kron(z(:, i) == 0, ones(6, 1)));
zData = get(h(i), 'ZData');
zData(index, :) = nan;
set(h(i), 'ZData', zData);
end
And here's an illustration (with obligatory free-hand circles):
How it works...
If your vector of bin counts is N-by-1, then bar3 will plot 6*N rectangular patches (i.e. the 6 faces of a cuboid for each bin). The 'ZData' property for each set of patch objects in h will therefore be (6*N)-by-4, since there are 4 corners for each rectangular face. Each cluster of 6 rows of the 'ZData' property is therefore a set of z-coordinates for the 6 faces of one bin.
The above code first creates a logical vector with ones everywhere the bin count equals 0, then replicates each element of this vector 6 times using the kron function. This becomes an index for the rows of the 'ZData' property, and this index is used to set the z-coordinates to nan for the patches of empty bins. This will cause the patches to not be rendered.
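The row layout described above can be inspected directly. The following is a small sketch with made-up counts (3 bins, 2 series); the figure is created invisibly so that it can run in a script, and the choice of bin k is arbitrary.

```matlab
z = [0 2; 3 0; 1 4];                  % N = 3 bins, 2 series (made-up counts)
fig = figure('Visible', 'off');       % keep the window hidden
h = bar3(z);

zData = get(h(1), 'ZData');
disp(size(zData))                     % 6*N rows, 4 columns

% rows 6*(k-1)+1 : 6*k hold the six faces of bin k in this series
k = 1;
binRows = 6*(k-1) + (1:6);
zData(binRows, :) = NaN;              % bin 1 of series 1 has a count of zero
set(h(1), 'ZData', zData);            % that bar is no longer drawn
```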
EDIT:
Here's a slightly modified version of the code that makes it more general by fetching the bar height from the 'ZData' property of the plotted bars, so all that's needed for it to work are the handles returned from bar3. I've also wrapped the code in a function (sans error and input checking):
function remove_empty_bars(hBars)
for iSeries = 1:numel(hBars)
zData = get(hBars(iSeries), 'ZData'); % Get the z data
index = logical(kron(zData(2:6:end, 2) == 0, ones(6, 1))); % Find empty bars
zData(index, :) = nan; % Set the z data for empty bars to nan
set(hBars(iSeries), 'ZData', zData); % Update the graphics objects
end
end
Here is an example that shows how to hide bars with zero-values. We start with a normal BAR3 plot:
x = 1:7;
Y = jet(numel(x));
h = bar3(x,Y,'detached');
xlabel x; ylabel y; zlabel z; box on;
Note that the variable h contains an array of surface handles (3 in this case, one for each "group" of bars. The groups correspond to the columns of the Y matrix, each represented by a different color).
And now the code to hide zero values:
for i=1:numel(h)
%# get the ZData matrix of the current group
Z = get(h(i), 'ZData');
%# row-indices of Z matrix. Columns correspond to each rectangular bar
rowsInd = reshape(1:size(Z,1), 6,[]);
%# find bars with zero height
barsIdx = all([Z(2:6:end,2:3) Z(3:6:end,2:3)]==0, 2);
%# replace their values with NaN for those bars
Z(rowsInd(:,barsIdx),:) = NaN;
%# update the ZData
set(h(i), 'ZData',Z)
end
Explanation:
For each group of bars, a surface graphic object is created (with its handle stored in h(i)). Its Z-coordinates matrix ZData is represented as a 6*N-by-4 matrix (same thing for the XData, YData, and CData matrices), where N is the number of rectangular bars in each group, or 7 in the example above.
This way each rectangle is represented with 6x4 matrices (one for each of X/Y/Z coordinates). For example the coordinates of one such rectangle would look like:
>> xx = get(h(3),'XData'); yy = get(h(3),'YData'); zz = get(h(3),'ZData'); cc = get(h(3),'CData');
>> xx(1:6,:)
ans =
NaN 2.6 3.4 NaN
2.6 2.6 3.4 3.4
2.6 2.6 3.4 3.4
NaN 2.6 3.4 NaN
NaN 2.6 3.4 NaN
NaN NaN NaN NaN
>> yy(1:6,:)
ans =
NaN 0.6 0.6 NaN
0.6 0.6 0.6 0.6
1.4 1.4 1.4 1.4
NaN 1.4 1.4 NaN
NaN 0.6 0.6 NaN
NaN NaN NaN NaN
>> zz(1:6,:)
ans =
NaN 0 0 NaN
0 1 1 0
0 1 1 0
NaN 0 0 NaN
NaN 0 0 NaN
NaN NaN NaN NaN
The second column of each traces the points along the left face, the third column traces the points along the right face, and when the two are connected would draw 4 faces of the rectangle:
>> surface(xx(1:6,2:3), yy(1:6,2:3), zz(1:6,2:3), cc(1:6,2:3))
>> view(3)
The first and last columns would draw the two remaining faces by closing the sides of the rectangle.
All such matrices are concatenated as one tall matrix, and the rectangles are all drawn using a single surface object. This is achieved by using NaN values to separate the different parts, both inside the points of the same rectangle, and in between the different rectangles.
So what the above code does is to look for rectangles where the Z-height is zero, and replace all its values with NaN values which effectively tells MATLAB not to draw the surfaces formed by those points.
My problem was not zero values, but NaN values (which are converted into zero values inside of bar3).
I wanted to keep displaying elements with values zero, but not the elements with value nan.
I adjusted the code slightly, and it worked perfectly:
for i = 1:numel(h)
index = logical(kron(isnan(z(:,i)),ones(6,1)));
zData = get(h(i),'ZData');
zData(index,:) = nan;
set(h(i),'ZData',zData);
end
Thanks!