Signal smoothing algorithm (Matlab's moving average)

I have written a simple code that performs a 3-point moving average smoothing algorithm. It is meant to follow the same basic algorithm as Matlab's smooth(...) function as described here.
However, the result of my code is very different from that of Matlab's. Matlab's 3-point filter appears to perform a much more aggressive smoothing.
Here is a comparison of noisy data smoothed using my code (red) and Matlab's function (blue):
Here is my code written in the form of a function:
function [NewSignal] = smoothing(signal)
NewSignal = signal;
for i = 2 : length(signal)-1
NewSignal(i,:) = (NewSignal(i,:)+NewSignal(i-1,:)+NewSignal(i+1,:))./3;
end
end
Matlab's function is used as follows:
signal = smooth(time, signal, 3, 'moving');
As far as I understand, Matlab's function works the same way: it averages 3 adjacent bins into a single bin. So I expected both algorithms to produce the same results.
So, what is the reason for the discrepancy? And how could I tweak my code to produce the same results?
Edit:
My sample data can be found here. It can be accessed using:
M = csvread('DS0009.csv');
time = M(:,1);
signal = M(:,2);
Here is the new result (red plot) using rinkert's correction:

One reason for the difference could be that you are partially using your already-smoothed signal during smoothing. In your loop, you store the smoothed value in NewSignal(i,:), and when smoothing the next sample this smoothed value is read back as NewSignal(i-1,:).
Let NewSignal be determined by the original signal only:
function [NewSignal] = smoothing(signal)
NewSignal = signal;
for i = 2 : length(signal)-1
    % average each interior sample with its two neighbours,
    % using only the original (unsmoothed) signal
    NewSignal(i,:) = (signal(i,:)+signal(i-1,:)+signal(i+1,:))./3;
end
end
Update: To show that the function above in fact does the same as Matlab's smooth function, let's consider this MVCE:
t = (0:0.01:10).'; % time vector
y = sin(t) + 0.5*randn(size(t));
y_smooth1 = smooth(t,y,3,'moving');
y_smooth2 = smoothing(y);
difference_methods = abs(y_smooth1-y_smooth2);
So we create a sine wave, add some noise, and determine the absolute difference between the two methods. If you take the sum of all the differences, you will see that it adds up to something like 7.5137e-14, which cannot explain the differences you see.
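In code, that sum is simply (the exact value varies from run to run because of the random noise):
% total absolute difference between the two methods; it stays at
% machine-precision level (order 1e-14)
total_difference = sum(difference_methods)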
Plotting the smoothed signal (blue original, red smoothed):
figure(1); clf; hold on
plot(t,y)
plot(t,y_smooth2)
And then plotting the difference between the two methods:
figure(2); clf; hold on;
plot(t,y_smooth1-y_smooth2)
As you can see, the difference is of the order of 1e-16, i.e. within floating-point relative accuracy (see eps).

To answer your question in the comments: the functions filter and smooth perform the same arithmetic (when they are used for a moving average). However, the special cases at the start and end points are handled differently.
This is also stated in the documentation of smooth: "Because of the way smooth handles endpoints, the result differs from the result returned by the filter function."
Here you see it in an example:
%generate random data
signal=rand(1,50);
%plot data
plot(signal,'LineWidth',2)
hold on
%plot filtered data
plot(filter(ones(3,1)/3,1,signal),'r-','LineWidth',2)
%plot smoothed data
plot( smooth(signal,3,'moving'),'m--','LineWidth',2)
%plot smoothed and delayed
plot([zeros(1,1); smooth(signal,3,'moving')],'k--','LineWidth',2)
hold off
legend({'Data','Filter','Smooth','Smooth-Delay'})
As you can see, the filtered data (in red) is just a delayed version of the smoothed data (in magenta). Additionally, they differ at the beginning. Delaying the smoothed data results in a waveform identical to the filtered data (apart from the beginning). As rinkert pointed out, your approach overwrites data points that you then access in the next step; this is a separate issue.
In the next example you will see that rinkert's implementation (smooth-rinkert) is identical to Matlab's smooth, and that your approach differs from both because it overwrites values:
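The comparison figure is not reproduced here, but a minimal sketch along those lines could look as follows (smoothing_orig and smoothing_rinkert are hypothetical file names for the two loop versions shown above):
% Sketch of the comparison; smoothing_orig.m and smoothing_rinkert.m are
% assumed to contain the original (overwriting) loop and the corrected loop.
signal = rand(50,1);                       % random test signal (column vector)
s_matlab  = smooth(signal, 3, 'moving');   % Matlab's moving average
s_rinkert = smoothing_rinkert(signal);     % corrected loop (reads only the original signal)
s_orig    = smoothing_orig(signal);        % original loop (reuses already-smoothed values)
figure; hold on
plot(signal, 'LineWidth', 2)
plot(s_matlab, 'm--', 'LineWidth', 2)
plot(s_rinkert, 'g:', 'LineWidth', 2)
plot(s_orig, 'r-', 'LineWidth', 2)
legend({'Data','smooth','smooth-rinkert','original loop'})
% max(abs(s_matlab - s_rinkert)) is at machine precision, while s_orig visibly deviates.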
So it is your own function that low-passes the input more strongly (as pointed out by Cris).

Related

Using Matlab digitalfilter object from lowpass() in filter()

I need to perform a lowpass filtering operation many times (same cutoff frequency and sampling frequency each time).
It seems that lowpass() is a slow operation and I can speed it up if I reuse the filter object that is created. The help for lowpass() says that:
[Y,D] = lowpass(...) returns the digital filter, D, used to filter the signal. [...] Call filter(D,X) to filter data.
However when I try to use the filter object in this way I get different results to what lowpass() returns. I expect lp1 and lp2 to be identical in this example:
% Test input data
x = rand([200,1]);
% Lowpass filter, and save the filter object to df:
[lp1, df] = lowpass(x,50,1000);
% Reuse the filter object:
lp2 = filter(df, x);
% Plot to show that they are clearly different results
figure;plot(lp1);hold on;plot(lp2);
I have also tried using filtfilt() but this does not produce an identical result.
Quick trick to reuse filter object df if you want the same output as from lowpass IN THIS SPECIFIC SCENARIO and if time's not on your side: lp3 = conv(x, df.Coefficients, 'same');
Explanation: the filter you designed using lowpass.m is a finite impulse response (FIR) filter, which shifts phases. Look again at the two curves in the figure your code snippet produces and you'll realize that the curve produced by filter.m is essentially a right-shifted version of the one produced by lowpass.m. You could try to correct for that, see https://de.mathworks.com/help/signal/ug/compensate-for-the-delay-introduced-by-an-fir-filter.html. The surprising part, though, is that lowpass.m corrects for the delay automatically, which a cursory reading of the docs didn't reveal.
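If you do want to compensate for the delay manually instead, a rough sketch could look like this (it assumes the FIR returned by lowpass is linear-phase, so its group delay is roughly half the filter length):
% Sketch of manual delay compensation (linear-phase FIR assumed)
x = rand([200,1]);
[lp1, df] = lowpass(x, 50, 1000);
lp2 = filter(df, x);
gd = round((numel(df.Coefficients)-1)/2);   % group delay in samples (assumed)
lp2_aligned = [lp2(gd+1:end); nan(gd,1)];   % shift the output left by the group delay
figure; plot([lp1, lp2_aligned])            % curves should overlay except near the ends
legend('lowpass.m','filter.m, delay-compensated')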
filtfilt.m performs the filtering twice, once in the forward direction and then in the reverse direction, and additionally '...minimizes start-up and ending transients...'. Because this to-and-fro filtering essentially cancels the delays but also filters harder, the results will by definition be different from those of either lowpass.m or filter.m.
In the long term, you will be best off investing some time into designing your filters from the bottom up (the Matlab docs are excellent in this regard), not using relatively new convenience wrapper functions like lowpass which seemingly do things unasked for in the background.
% Test input data
x = rand([200,1]);
t = 1:200;
% Lowpass filter, and save the filter object to df:
[lp1, df] = lowpass(x,50,1000);
% Reuse the filter object:
lp2 = filter(df, x);
% Convolution of time series with filter coefficients
lp3 = conv(x, df.Coefficients, 'same');
% Plot
figure(1), clf;
subplot(2,1,1)
% Plot lp3 with a tiny offset so we can discern it from lp1
plot(t, lp1, t, lp2, t, lp3+0.01)
legend('lowpass.m','filter.m','convolution')
% Use filtfilt to obtain zero-lag, doubly filtered version (for comparison)
lp4 = filtfilt(df, x);
subplot(2,1,2)
plot(t, lp1, t, lp4)
legend('lowpass.m','filtfilt.m')

FFT: why do reconstructions from different frequency-domain data produce the same result?

Does the (i)fftshift operation, which changes the positions of certain values, have any effect on the reconstructed image?
And with zero-filling, does cutting the data in the frequency domain also make no difference?
A MATLAB demonstration:
I = imread('cameraman.tif');
% making 3 different frequency data
kraw = fft2(I);
kshift = fftshift(kraw);
kcut = kshift(:,1:end-64);
imshow(abs([kraw,kshift,kcut]),[])
% reconstructing
ToImage = @(x) uint8(abs(x));
Rraw = ToImage(ifft2(kraw));
Rshift = ToImage(ifft2(kshift));
Rcut = ToImage(ifft2(kcut,size(I,1),size(I,2)));
imshow([I,Rraw,Rshift,Rcut])
% metric the difference
ssim_raw = ssim(uint8(abs(Rraw)),I);
ssim_shift = ssim(uint8(abs(Rshift)),I);
ssim_cut = ssim(uint8(abs(Rcut)),I);
title(['SSIM: 1-----|-----',num2str(ssim_raw),'----|-----',num2str(ssim_shift),'----|-----',num2str(ssim_cut)])
I can't run Matlab right now, but the general answer is that they have to produce different results. The DFT is an isomorphism, which means that there is one and only one spectrum for any image and one and only one image for any spectrum.
You should probably look at the actual coherent differences of the results. For instance, an fftshift in the frequency domain is equivalent to a linear phase multiplication in the spatial domain and will not affect the magnitude. The cut example surprises me, so I suspect it's the result of how the ssim metric works. I am not familiar with it, so I can't give any specifics.
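A small check of that fftshift claim (a sketch using the even-sized cameraman image, where shifting the spectrum corresponds to a (-1)^(x+y) checkerboard modulation in the spatial domain):
% Magnitude of the reconstruction is unchanged by fftshift;
% only the phase picks up a +/-1 checkerboard (even image size assumed).
I  = double(imread('cameraman.tif'));
K  = fft2(I);
R1 = ifft2(K);                 % reconstruction from the raw spectrum
R2 = ifft2(fftshift(K));       % reconstruction from the shifted spectrum
max(abs(abs(R1(:)) - abs(R2(:))))            % ~0: magnitudes are identical
[X, Y] = meshgrid(0:size(I,2)-1, 0:size(I,1)-1);
max(abs(R2(:) - R1(:).*(-1).^(X(:)+Y(:))))   % ~0: R2 is R1 times the checkerboard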

matlab: cdfplot of relative error

The figure shown above is a cumulative distribution function (CDF) plot of the relative error (the code used to generate the plot is attached below). The relative error is defined as abs(measured-predicted)/measured. Could someone explain the possible error or how to interpret this, since the plot is supposed to be a smooth curve?
X = load('measured.txt');
Xhat = load('predicted.txt');
idx = find(X>0);
x = X(idx);
xhat = Xhat(idx);
relativeError = abs(x-xhat)./(x);
cdfplot(relativeError);
The input data file is a 4x4 matrix with zeros on the diagonal and some unmeasured entries (represented by 0). I appreciate your kind help. Thanks!
The plot should be a discontinuous one because you are using discrete data. You are not plotting an analytic function with an explicit (or implicit) mapping from, say, x to y. Instead, all you have are at most 16 points that relate x and y.
The CDF only "grows" when a new sample is counted; otherwise its value stays constant, simply because there is no sample there to increase the cumulative frequency.
You can check the example in MathWorks' cdfplot documentation to understand the concept of an "empirical CDF". Again, the CDF only increases where you observe a sample.
If you really want to "get" a smooth curve, either 1) add more points so that the discontinuous line looks smoother, or 2) find a statistical model of whatever you are working with and plot its analytic CDF instead.
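A quick illustration of that staircase behaviour (using uniform random numbers purely as stand-in data):
% With a handful of samples the empirical CDF is a staircase;
% with many samples it approaches the smooth analytic CDF.
small_sample = rand(16,1);       % roughly as many points as in the question
large_sample = rand(5000,1);
figure; hold on
cdfplot(small_sample)            % clearly stepped
cdfplot(large_sample)            % looks almost smooth
plot([0 1],[0 1],'k--')          % analytic CDF of U(0,1) for reference
legend('16 samples','5000 samples','analytic CDF','Location','southeast')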

Convert grayscale image to point cloud (similar to dither)

I'm currently trying to implement a method to generate TSP art, and for that I need a list of points (x,y), the local density of which is proportional to the gray scale pixel value of a given image.
My first thought was: well, that works pretty much like Inverse Transform Sampling in statistics (you want to draw samples that follow a given probability density function, but you can only generate uniformly distributed samples).
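In one dimension the idea is simply to push uniform samples through the inverse CDF of the target density; a minimal sketch, where the density f(x) = 2x on [0,1] is only an illustrative assumption:
% inverse transform sampling in 1-D: for pdf f(x) = 2x on [0,1],
% the CDF is F(x) = x^2, so its inverse is sqrt(u)
u = rand(1e4,1);        % uniform samples
x = sqrt(u);            % samples distributed with density 2x
histogram(x, 50)        % histogram grows linearly, as desired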
I implemented this approach and it works fairly well, as is evident from executing the following code:
%% Load image, adjust it for our needs
im=imread('http://goo.gl/DDwV3t'); %load random headshot from google
im=imadjust(im,stretchlim(im,[.01,.65]),[]);
im=im2double(rgb2gray(im));
im=im(10:end-5,50:end-5);
figure;imshow(im);title('original');
im=1-im; %we want black dots on white background
im=flipud(im); %and we want it the right way up
%% process per row
imrow = cumsum(im,2);
imrow=imrow*size(imrow,1)./repmat(max(imrow,[],2),1,size(imrow,2));
y=1:size(imrow,2);
ximrow_i = zeros(size(imrow));
for i = 1:size(imrow,1)
mask =logical([diff(imrow(i,:))>=0.01,0]); %needed for interp
ximrow_i(i,:) = interp1(imrow(i,mask),y(mask),y);
end
y=1:size(ximrow_i,1);
y=repmat(y',1,size(ximrow_i,2));
y1=y(1:5:end,1:5:end); %downscale a bit
ximcol_i1=ximrow_i(1:5:end,1:5:end); %downscale a bit
figure('Color','w');plot(ximcol_i1(:),y1(:),'k.');title('Inverse Transform Sampling on rows');
axis equal;axis off;
%% process per column
imcol=cumsum(im,1);
imcol=imcol*size(imcol,2)./repmat(max(imcol,[],1),size(imcol,1),1);
y=1:size(imcol,1);
yimcol_i=zeros(size(imcol));
for i = 1:size(imcol,2)
mask =logical([diff(imcol(:,i))>=0.01;0]);
yimcol_i(:,i) = interp1(imcol(mask,i),y(mask),y);
end
y=1:size(imcol,2);
y=repmat(y,size(imcol,1),1);
y1=y(1:5:end,1:5:end);
yimcol_i1=yimcol_i(1:5:end,1:5:end);
figure('Color','w');plot(y1(:),yimcol_i1(:),'k.');title('Inverse Transform Sampling on cols');
axis equal;axis off;
It has the shortcoming that I can only use this per-row or per-column, but not both. The Inverse Transform Sampling method does not work for multivariate PDFs in general, and I'm fairly sure I won't be able to get it to work in this case.
Is there a simple method to achieve my goal that I haven't seen yet?
I am aware that an algorithm called Voronoi Stippler has been used to create the desired result and I will investigate that, but for the moment I liked the simplicity of Inverse Transform Sampling and would like to know if I can extend that method to match my needs.
It turns out this is fairly simple and can be done by Rejection Sampling.
For the special case where the instrumental distribution is U(0,1) it works like this (if I understood it correctly):
im=imread('http://goo.gl/DDwV3t'); %load random headshot from google
im=imadjust(im,stretchlim(im,[.01,.65]),[]);
im=im2double(rgb2gray(im));
im=im(10:end-5,50:end-5);
im=1-flipud(im);
d = im > .9*rand(size(im));
d=d&(rand(size(d))>.95); %randomly sieve out some more points
[i,j]=ind2sub(size(d),find(d));
figure('Color','w');plot(j,i,'k.');title('Rejection Sampling');
axis equal;axis off;
The sampling is done in one line:
d = im > .9*rand(size(im));
Since I ended up with too many points, I randomly thinned the result, reducing the number of points by approximately a factor of 20.
This is pretty much the result I originally desired.

How to implement a matched filter

I have a template. I compute the impulse response of the matched filter by taking the inverse Fourier transform of the conjugate of the Fourier transform of my template. I would like to perform the matched filtering operation on one of my available EEG channels using the 'filter' command in Matlab. When using the filter command, is the coefficient 'b' my impulse response? Moreover, I would like to write Matlab code that thresholds the output of the matched filter to detect peaks. How can I achieve this?
Here is a start for you,
% A template is given
temp = randn(100,1);
% Create a matched filter based on the template
b = flipud(temp(:));
% For testing the matched filter, create a random signal which
% contains a match for the template at some time index
x = [randn(200,1); temp(:); randn(300,1)];
n = 1:length(x);
% Process the signal with the matched filter
y = filter(b,1,x);
% Set a detection threshold (example used is 90% of template)
thresh = 0.9;
% Compute normalizing factor
u = temp.'*temp;
% Find matches
matches = n(y>thresh*u);
% Plot the results
plot(n,y,'b', n(matches), y(matches), 'ro');
% Print the results to the console
display(matches);
As Andreas mentions in his answer, there is no need for the Fourier transform. If you have a time-domain template, then its matched filter is simply a time-reversed version of itself (which I achieve with flipud). As you go along, you will find that there are many nuances to be worked out. This code works great because I am in control from start to finish, but once you start working with real data, things get much more complicated. Choosing an appropriate threshold value, for example, will require some knowledge about the data you will be working with.
In truth, peak detection can be a very non-trivial task depending on the nature of your signals, etc. In my case, peak detection was easy because my signal was completely uncorrelated with the template, except at the point in the middle, and I also knew exactly what amplitude I was expecting to see. All of these assumptions are over-simplifications of the problem which I used to demonstrate the concepts.
Practically, you do this
y = filter( h, 1, x )
with h the impulse response and x and y the input and output signals.
The matched filter is nothing other than a correlator that correlates with a given signal pattern.
Its impulse response is just the time reverse of the signal pattern you are looking for.
So, by the way: if you have a measured signal pattern, reverse it and set this as the impulse response of an FIR filter. There is no need to do this in the frequency domain if you have a measurement in the time domain (both approaches are equivalent, but one is more error-prone than the other).
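As a small check of that equivalence, here is a sketch; the circshift accounts for the DFT-based reversal being circular:
% The time-reversed template and the frequency-domain construction
% (inverse FFT of the conjugated template spectrum) give the same
% coefficients, up to a circular shift and floating-point error.
temp   = randn(100,1);                     % some template
b_time = flipud(temp);                     % time-domain matched filter
b_freq = real(ifft(conj(fft(temp))));      % frequency-domain construction
b_freq_aligned = circshift(b_freq, -1);    % align the circular reversal with flipud
max(abs(b_time - b_freq_aligned))          % ~1e-15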