Interpolation of time series data in MATLAB - matlab

I have a multidimensional time series in MATLAB. Let's say it's of M dimensions, N samples, and as such I have it stored in terms of NxM matrix.
I want interpolate the time series, to fit a new length (N1), where always N is always less than N1.
In other words, if I have multiple time series (all sampled at the same rate, just of different lengths), I want to interpolate them all to be of length N0.
How can one achieve this with MATLAB?
EDIT: Could one achieve this with imresize?
i.e.:
A = randn(5,10) % 10 dimensions, 5 samples
desiredLength = 15; % we want 15 samples in lenght
newA = imresize(A, [desiredLength 10], 'bilinear');

A procedure like the following might do what you want. The new data will be a linear interpolation of the old data.
[initSize1, initSize2] = ndgrid(1:size(Data, 1), 1:size(Data, 2));
[newSize1, newSize2] = ndgrid(linspace(1, size(Data, 1), newlength), 1:size(Data, 2));
newData = interpn(initSize1, initSize2, Data, newSize1, newSize2);
As coded up, only dimension 1 should change, as the second gridded dimension is the same in the first and second calls to ndgrid.

If you have a timeseries object, you might also want to look at the resample method for the timeseries object:
http://www.mathworks.co.uk/help/matlab/ref/timeseries.resample.html

Related

How to sum 3d Matrix row by interval in Matlab?

I have a 36x256x2232 3d matrix in Matlab created by M = ones(36,256,2232) and I want to reduce the size of the matrix by sum rows by interval 3. The result matrix should be 12x256x2232 and each cell should have the value 3.
I tried using reshape and sum function but I get 1x256x2232 matrix.
How can I do this without using the for-loop ?
This should do it:
M = ones(36,256,2232)
reduced = reshape(sum(reshape(M, 3,[], 256,2232), 1),[], 256, 2232);
reshape makes a 4d matrix with the given intervals
sum reduce it
second reshape transform it to 3d again
you can use also squeeze, which removes singleton dimensions:
reduced = squeeze(sum(reshape(M, 3,[], 256,2232), 1));
You can use the new-ish splitapply function (which is similar to accumarray but can handle data with multiple dimensions). This approach works even if the number of rows is not a multiple of the group size:
M = ones(4,5,2); % example data
n = 3; % group size
result = splitapply(#(x)sum(x,1), M, floor((0:size(M,1)-1).'/n)+1);

Get Matrix of minimum coordinate distance to point set

I have a set of points or coordinates like {(3,3), (3,4), (4,5), ...} and want to build a matrix with the minimum distance to this point set. Let me illustrate using a runnable example:
width = 10;
height = 10;
% Get min distance to those points
pts = [3 3; 3 4; 3 5; 2 4];
sumSPts = length(pts);
% Helper to determine element coordinates
[cols, rows] = meshgrid(1:width, 1:height);
PtCoords = cat(3, rows, cols);
AllDistances = zeros(height, width,sumSPts);
% To get Roh_I of evry pt
for k = 1:sumSPts
% Get coordinates of current Scribble Point
currPt = pts(k,:);
% Get Row and Col diffs
RowDiff = PtCoords(:,:,1) - currPt(1);
ColDiff = PtCoords(:,:,2) - currPt(2);
AllDistances(:,:,k) = sqrt(RowDiff.^2 + ColDiff.^2);
end
MinDistances = min(AllDistances, [], 3);
This code runs perfectly fine but I have to deal with matrix sizes of about 700 milion entries (height = 700, width = 500, sumSPts = 2k) and this slows down the calculation. Is there a better algorithm to speed things up?
As stated in the comments, you don't necessary have to put everything into a huge matrix and deal with gigantic matrices. You can :
1. Slice the pts matrix into reasonably small slices (say of length 100)
2. Loop on the slices and calculate the Mindistances slice over these points
3. Take the global min
tic
Mindistances=[];
width = 500;
height = 700;
Np=2000;
pts = [randi(width,Np,1) randi(height,Np,1)];
SliceSize=100;
[Xcoords,Ycoords]=meshgrid(1:width,1:height);
% Compute the minima for the slices from 1 to floor(Np/SliceSize)
for i=1:floor(Np/SliceSize)
% Calculate indexes of the next slice
SliceIndexes=((i-1)*SliceSize+1):i*SliceSize
% Get the corresponding points and reshape them to a vector along the 3rd dim.
Xpts=reshape(pts(SliceIndexes,1),1,1,[]);
Ypts=reshape(pts(SliceIndexes,2),1,1,[]);
% Do all the diffs between your coordinates and your points using bsxfun singleton expansion
Xdiffs=bsxfun(#minus,Xcoords,Xpts);
Ydiffs=bsxfun(#minus,Ycoords,Ypts);
% Calculate all the distances of the slice in one call
Alldistances=bsxfun(#hypot,Xdiffs,Ydiffs);
% Concatenate the mindistances
Mindistances=cat(3,Mindistances,min(Alldistances,[],3));
end
% Check if last slice needed
if mod(Np,SliceSize)~=0
% Get the corresponding points and reshape them to a vector along the 3rd dim.
Xpts=reshape(pts(floor(Np/SliceSize)*SliceSize+1:end,1),1,1,[]);
Ypts=reshape(pts(floor(Np/SliceSize)*SliceSize+1:end,2),1,1,[]);
% Do all the diffs between your coordinates and your points using bsxfun singleton expansion
Xdiffs=bsxfun(#minus,Xcoords,Xpts);
Ydiffs=bsxfun(#minus,Ycoords,Ypts);
% Calculate all the distances of the slice in one call
Alldistances=bsxfun(#hypot,Xdiffs,Ydiffs);
% Concatenate the mindistances
Mindistances=cat(3,Mindistances,min(Alldistances,[],3));
end
% Get global minimum
Mindistances=min(Mindistances,[],3);
toc
Elapsed time is 9.830051 seconds.
Note :
You'll not end up doing less calculations. But It will be a lot less intensive for your memory (700M doubles takes 45Go in memory), thus speeding up the process (With the help of vectorizing aswell)
About bsxfun singleton expansion
One of the great strength of bsxfun is that you don't have to feed it matrices whose values are along the same dimensions.
For example :
Say I have two vectors X and Y defined as :
X=[1 2]; % row vector X
Y=[1;2]; % Column vector Y
And that I want a 2x2 matrix Z built as Z(i,j)=X(i)+Y(j) for 1<=i<=2 and 1<=j<=2.
Suppose you don't know about the existence of meshgrid (The example is a bit too simple), then you'll have to do :
Xs=repmat(X,2,1);
Ys=repmat(Y,1,2);
Z=Xs+Ys;
While with bsxfun you can just do :
Z=bsxfun(#plus,X,Y);
To calculate the value of Z(2,2) for example, bsxfun will automatically fetch the second value of X and Y and compute. This has the advantage of saving a lot of memory space (No need to define Xs and Ys in this example) and being faster with big matrices.
Bsxfun Vs Repmat
If you're interested with comparing the computational time between bsxfun and repmat, here are two excellent (word is not even strong enough) SO posts by Divakar :
Comparing BSXFUN and REPMAT
BSXFUN on memory efficiency with relational operations

Computing a moving average

I need to compute a moving average over a data series, within a for loop. I have to get the moving average over N=9 days. The array I'm computing in is 4 series of 365 values (M), which itself are mean values of another set of data. I want to plot the mean values of my data with the moving average in one plot.
I googled a bit about moving averages and the "conv" command and found something which i tried implementing in my code.:
hold on
for ii=1:4;
M=mean(C{ii},2)
wts = [1/24;repmat(1/12,11,1);1/24];
Ms=conv(M,wts,'valid')
plot(M)
plot(Ms,'r')
end
hold off
So basically, I compute my mean and plot it with a (wrong) moving average. I picked the "wts" value right off the mathworks site, so that is incorrect. (source: http://www.mathworks.nl/help/econ/moving-average-trend-estimation.html) My problem though, is that I do not understand what this "wts" is. Could anyone explain? If it has something to do with the weights of the values: that is invalid in this case. All values are weighted the same.
And if I am doing this entirely wrong, could I get some help with it?
My sincerest thanks.
There are two more alternatives:
1) filter
From the doc:
You can use filter to find a running average without using a for loop.
This example finds the running average of a 16-element vector, using a
window size of 5.
data = [1:0.2:4]'; %'
windowSize = 5;
filter(ones(1,windowSize)/windowSize,1,data)
2) smooth as part of the Curve Fitting Toolbox (which is available in most cases)
From the doc:
yy = smooth(y) smooths the data in the column vector y using a moving
average filter. Results are returned in the column vector yy. The
default span for the moving average is 5.
%// Create noisy data with outliers:
x = 15*rand(150,1);
y = sin(x) + 0.5*(rand(size(x))-0.5);
y(ceil(length(x)*rand(2,1))) = 3;
%// Smooth the data using the loess and rloess methods with a span of 10%:
yy1 = smooth(x,y,0.1,'loess');
yy2 = smooth(x,y,0.1,'rloess');
In 2016 MATLAB added the movmean function that calculates a moving average:
N = 9;
M_moving_average = movmean(M,N)
Using conv is an excellent way to implement a moving average. In the code you are using, wts is how much you are weighing each value (as you guessed). the sum of that vector should always be equal to one. If you wish to weight each value evenly and do a size N moving filter then you would want to do
N = 7;
wts = ones(N,1)/N;
sum(wts) % result = 1
Using the 'valid' argument in conv will result in having fewer values in Ms than you have in M. Use 'same' if you don't mind the effects of zero padding. If you have the signal processing toolbox you can use cconv if you want to try a circular moving average. Something like
N = 7;
wts = ones(N,1)/N;
cconv(x,wts,N);
should work.
You should read the conv and cconv documentation for more information if you haven't already.
I would use this:
% does moving average on signal x, window size is w
function y = movingAverage(x, w)
k = ones(1, w) / w
y = conv(x, k, 'same');
end
ripped straight from here.
To comment on your current implementation. wts is the weighting vector, which from the Mathworks, is a 13 point average, with special attention on the first and last point of weightings half of the rest.

MATLAB windowed FFT for a 2D matrix data (image)

Data: Say I have a 2000 rows by 500 column matrix (image)
What I need: Compute the FFT of 64 rows by 10 column chunks of above data. In other words, I want to compute the FFT of 64X10 window that is run across the entire data matrix. The FFT result is used to compute a scalar value (say peak amplitude frequency) which is used to create a new "FFT value" image.
Now, I need the final FFT image to be the same size as the original data (2000 X 500).
What is the fastest way to accomplish this in MATLAB? I am currently using for loops which is relatively slow. Also I use interpolation to size up the final image to the original data size.
As #EitanT pointed out, you can use blockproc for batch block processing of an image J. However you should define your function handle as
fun = #(block_struct) fft2(block_struct.data);
B = blockproc(J, [64 10], fun);
For a [2000 x 500] matrix this will give you a [2000 x 500] output of complex Fourier values, evaluated at sub-sampled pixel locations with a local support (size of the input to FFT) of [64 x 10]. Now, to replace those values with a single, e.g. with the peak log-magnitude, you can further specify
fun = #(block_struct) max(max(log(abs(fft2(block_struct.data)))));
B = blockproc(J, [64 10], fun);
The output then is a [2000/64 x 500/10] output of block-patch values, which you can resize by nearest-neighbor interpolation (or something else for smoother versions) to the desired [2000 x 500] original size
C = imresize(B, [2000 500], 'nearest');
I can include a real image example if it will further help.
Update: To get overlapping blocks you can use the 'Bordersize' option of blockproc by setting the overlap [V H] such that the final windows of size [M + 2*V, N + 2*H] will still be [64, 10] in size. Example:
fun = #(block_struct) log(abs(fft2(block_struct.data)));
V = 16; H = 3; % overlap values
overlap = [V H];
M = 32; N = 4; % non-overlapping values
B1 = blockproc(J, [M N], fun, 'BorderSize', overlap); % final windows are 64 x 10
However, this will work with keeping the full Fourier response, not the single-value version with max(max()) above.
See also this post for filtering using blockproc:Dealing with “Really Big” Images: Block Processing.
If you want to apply the same function (in your case, the 2-D Fourier transform) on individual distinct blocks in a larger matrix, you can do that with the blkproc function, which is replaced in newer MATLAB releases by blockproc.
However, I infer that you wish to apply apply fft2 on overlapping blocks in a "sliding window" fashion. For this purpose you can use colfilt with the 'sliding' option. Note that the function that we're applying on each block is the fft:
block_size = [64, 10];
temp_size = 5 * block_size;
col_func = #(x)cellfun(#(y)max(max(abs(fft2(y)))), num2cell(x, 1), 'Un', 0);
B = colfilt(A, block_size, 10 * block_size, 'sliding', col_func);
How does this work? colfilt processes the matrix A by rearranging each "sliding" block into a separate column of a new temporary matrix, and then applying the col_func to this new matrix. col_func in turn restores each column into the original block and applies fft2 on it, returning the largest amplitude value for each column.
Important things to note:
Since this mentioned temporary matrix includes all possible "sliding" blocks, memory could be a limitation. Therefore, in order to use less memory in calculations, colfilt breaks up the original matrix A into sub-matrices of temp_size, and performs calculations separately on each. The resulting matrix B is still the same, of course.
Each element in the resulting matrix B is computed from the corresponding block neighborhood. The larger your image is, the more blocks you will need to process, so the computation time will increase geometrically. I believe that you'll have to wait quite a bit until MATLAB finishes processing all sliding windows on your 2000-by-500 matrix.

Modified linear interpolation with missing data

Imagine a set of data with given x-values (as a column vector) and several y-values combined in a matrix (row vector of column vectors). Some of the values in the matrix are not available:
%% Create the test data
N = 1e2; % Number of x-values
x = 2*sort(rand(N, 1))-1;
Y = [x.^2, x.^3, x.^4, x.^5, x.^6]; % Example values
Y(50:80, 4) = NaN(31, 1); % Some values are not avaiable
Now i have a column vector of new x-values for interpolation.
K = 1e2; % Number of interplolation values
x_i = rand(K, 1);
My goal is to find a fast way to interpolate all y-values for the given x_i values. If there are NaN values in the y-values, I want to use the y-value which is before the missing data. In the example case this would be the data in Y(49, :).
If I use interp1, I get NaN-values and the execution is slow for large x and x_i:
starttime = cputime;
Y_i1 = interp1(x, Y, x_i);
executiontime1 = cputime - starttime
An alternative is interp1q, which is about two times faster.
What is a very fast way which allows my modifications?
Possible ideas:
Do postprocessing of Y_i1 to eliminate NaN-values.
Use a combination of a loop and the find-command to always use the neighbour without interpolation.
Using interp1 with spline interpolation (spline) ignores NaN's.