How to fit time series data as sum of time-shifted kernels - matlab

I have some voltage data recorded from brain electrophysiology that look like this:
Raw voltage trace
Notice that the fluctuations <200 uV is the actual data and the huge peaks are artifacts that might arise from occasional touching of the reference with something else. I'm not an expert in electrical engineering, but the artifact looks like the step response of a high-pass filter that is used to remove the DC offset.
Now I'm trying to remove the artifacts by curve fitting. If I detect peaks in my diff(data) and avarage all waveforms around the peak, it looks like this:
Average artifact waveform
My questions are the following:
What is the function/kernal for the artifact? I thought exponential decay would work, but the waveform clearly has overshoot that an exponential decay doesn't. Is this actually a step response of a high-pass filter? If so, what's the analytical form?
How do I solve the fitting problem of expressing my signal as a sum of time-shifted and scaled kernals? cfit doesn't seem to be the correct option.
Thank you!

You can try using a superposition of three sigmoid functions, for example:
Yields this:
When used with parameters described in this Desmos: https://www.desmos.com/calculator/ouuauyihdt
You can probably simplify the equation a bit or add additional parameters, and then use any of the standard curve fitters to fit your data.

Related

Fourier transform and filtering the MD trajectory data instead of PCA for dimensionality reduction?

I was using PCA for dimensionality reduction of MD (molecular dynamics) trajectory data of some protein simulations. Basically my data is xyz coordinates of protein atoms which change with time (that means I have lot of frames of this xyz coordinates). The dimension of this data is something like 20000 frames of 200x3 (atoms by coordinates). I implemented PCA using princomp command in Matlab.
I was wondering if I can do FFT on my data. I have experience of doing FFT on audio signals (1D signal). Here my data has both time and space in picture. It must be theoretically possible to implement FFT on my data and then filter it using a LPF (low pass filter). But I am unsure.
Can someone give me some direction/code snippets/references towards implementing FFT on my data?
Why people are preferring PCA more often compared to FFT and filtering. Is it because of computational efficiency of algorithm or is it because of the statistical nature of underlying data?
For the first question "Can someone give me some direction/code snippets/references towards implementing FFT on my data?":
I should say fft is implemented in matlab and you do not need to implement it by your own. Also, for your case you should use fftn (fft documentation)to transform and after applying lowpass filtering by dessignfilt (design filter in matalab), the apply ifftn (inverse fft in matlab)to inverse the transform.
For the second question "Why people are preferring PCA more often compared to FFT and filtering ...":
I should say as the filtering in fft is done in signal space, after filtering you can't generalize it in time space. You can more details about this drawback in this article.
But, Fourier analysis has also some other
serious drawbacks. One of them may be that time
information is lost in transforming to the frequency
domain. When looking at a Fourier transform of a
signal, it is impossible to tell when a particular event
has taken place. If it is a stationary signal - this
drawback isn't very important. However, most
interesting signals contain numerous non-stationary or
transitory characteristics: drift, trends, abrupt changes,
and beginnings and ends of events. These
characteristics are often the most important part of the
signal, and Fourier analysis is not suitable in detecting
them.

Notch or Bandstop filter and preparing data for it

I am new to matlab and signal processing methods, but i am trying to use its filter properties over a set of data I have. I have a collection of amplitude values obtained at different timestamps. When this is plotted, I get a waveform with several peaks that I can identify. I then perform calculations to derive the time between each consecutive peak and I want to eliminate the rates that are around the range of 48-52peaks per second.
What would be the correct way to go about processing this data step by step? Would a bandstop or notch filter be better if I want to eliminate those frequencies and not attenuate it simply? I am completely lost in the parameters required to feed into the filters for this. Please help...
periodogram is OK, but I would suggest using pwelch instead. It makes a more reasonable PSD estimate and the default parameters are well thought out (Hann windows, 50% overlap of segments, etc.)
If what you want is to remove signals in a wide band (e.g. 48-52 Hz) equally, rather than a single and unchanging frequency, than a bandstop filter is ideal. For example:
fs = 2048;
y = rand(fs*8, 1);
[b,a] = ellip(4, 2, 40, [46 54]/(fs/2));
yy = filter(b,a,y);
This will use a 4th order elliptic bandstop filter to filter the random data variable 'y'. filtfilt.m is also a nice function; it applies the filter forwards and backwards so you get twice the filter action and none of the phase lag or dispersion.
I am currently doing something similar to what you are doing.
I am processing a lot of signals from the Inertial Measurement Unit and motor drives. They all are asynchronously obtained, i.e. they all have very different timestamp and also very different acquisition frequency.
First thing I did was to interpolate all signals data in order to have all signals with same timestamp. You can use the matlab function interp to do this.
After this, you will have all signals with same sample frequency and also timestamp, which will be good in further analysis.
Ok, another thing you can do to analyse the frequency of the peaks is to perform the fourier transform. For beginners i advice the use of periodogram function and not the fft function.
Imagine you signal is x and your sample frequency (after interpolation) is Fs.
You can now use the function periodogram available in matlab like this:
[P,f] = periodogram(x,[],[length(t)],Fs);
This will give the power vs frequency of your signal. After that you will be able to plot and take a look at the frequencies of your signal. In other words, you be able to see the frequencies of the signals that make your acquired signal.
Plot the data this way:
plot(f,P); or semilogy(f,P);
The second is the same thing as the first, but with a logarithmic scale.
After this analysis you can use the Filter Desing and Analysis Tool to design you filter. Just type fdatool in matlab and it will open the design window. Choose the filter type, the cut and pass frequencies and click in design. This tool is very intuitive.
After designing you can export the filter to workspace.
Finally you can use the filter you designed in your signal to see if its what you wanted.
Use the functions filter os filtfilt for this.
Search in the web of the matlab help for the functions I wrote to get more details.
There are a lot of examples availables too.
I hope I could help you.
Good luck.

Power spectrum from autocorrelation function with MATLAB

I have some dynamic light scattering data. The machine pumps out the autocorrelation function, and a count-rate.
I can do a simple fit to the ACF
ACF = exp(-D*q^2*t)
and obtain the diffusion coefficient.
I want to obtain the same D from the power spectrum. I have been able to create a power spectrum in two ways -- from the Fourier transform of the ACF, and from the count rate. Both agree, but the power spectrum does not look like in the one in the books, so I'm not sure how to use it to work out the line width.
Attached is an image from a PDF that shows what you should get, and what I get from MATLAB. Can anyone make sense of whats going on?
I have used the code of answer #3 on this question. The resulting autocorrelation comes out exactly the same as
the machine gives me and
using MATLAB's autocorr command on the photoncount data.
Thank you for your time.
When you compute the Fourier transform from short sequences of data it often looks very noisy. There are a number of reasons for this. One reason is that the statistics of individual Fourier components are not Gaussian, and so averaging the spectra across multiple samples of data will only slowly improve the quality of the estimate.
Another causes of "noisiness" in empirical spectra behavior is that you are applying (to a finite data sample) a transform which involves a pathological sinc function and which assumes an infinite length signal. To diminish this problem, it helps to apply a "windowing-function" to your data before computing the Fourier transform. One of the more complicated but also more powerful windowing approaches is the use of so-called 'Slepian tapers'.
MATLAB conveniently implements well-known windows in functions such as hamming and hann.

Designing a simple bandpass/bandstop filter in Matlab

For a homework assignment I have to design a simple bandpass filter in Matlab that filters out everything between 250Hz and 1000 Hz. What I did so far:
- using the 'enframe' function to create half overlapping windows with 512 samples each. On the windows I apply the hann window function.
- On each window I apply an fft. After this I reconstruct the original signal with the function ifft, that all goes well.
But the problem is how I have to interpret the result of the fft function and how to filter out a frequency band.
Unless I'm mistaken, it sounds like you're taking the wrong approach to this.
If your assignment is to manipulate a signal specifically by manipulating its FFT then ignore me. Otherwise.. read on.
The FFT is normally used to analyse a signal in the frequency domain. If you start fiddling with the complex coefficients that an FFT returns then you're getting into a complicated mathematical situation. This is particularly the case since your cut-off frequencies aren't going to lie nicely on FFT bin frequencies. Also, remember that the FFT is not a perfect transform of the signal you're analysing. It will always introduce artefacts of its own due to scalloping error, and convolution with your hann window.
So.. let's leave the FFT for analysis, and build a filter.
If you're doing band-pass design in your class I'm going to assume you understand what they do. There's a number of functions in Matlab to generate the coefficients for different types of filter i.e. butter, kaiser cheby1. Look up their help pages in Matlab for loads more info. The values you plug in to these functions will be dependent on your filter specification, i.e. you want "X"dB rolloff and "Y"dB passband ripple. You'll need some idea of the how these filters work, and knowledge of their transfer functions to understand how their filter order relates to your specification.
Once you have your coefficients, it's just a case of running them through the filter function (again.. check the help page if you're not sure how this works).
The mighty JOS has a great walkthrough of bandpass filter design here.
One other little niggle.. in your question you mentioned that you want your filter to "filter out" everything between 250Hz and 1000Hz. This is a bit ambiguous. If you're designing a bandpass filter you would want to "pass" everything between 250Hz and 1000Hz. If you do in fact want to "filter out" everything in this range you want a band-stop filter instead.
It all depends on the sampling rate you use.
If you sample right according to the Nyquist-Shannon sampling theorem then you can try and interpret the samples of your fft using the definition of the DFT.
For understanding which frequencies correspond with which samples in the dft results, I think it's best to look at the inverse transformation. You multiply coefficient k with
exp(i*2*pi*k/N*n)
which can be interpreted to be a cosine with Euler's Formula. So each coefficient gets multiplied by a sine of a certain frequency.
Good luck ;)

Finding Relevant Peaks in Messy FFTs

I have FFT outputs that look like this:
At 523 Hz is the maximum value. However, being a messy FFT, there are lots of little peaks that are right near the large peaks. However, they're irrelevant, whereas the peaks shown aren't. Are the any algorithms I can use to extract the maxima of this FFT that matter; I.E., aren't just random peaks cropping up near "real" peaks? Perhaps there is some sort of filter I can apply to this FFT output?
EDIT: The context of this is that I am trying to take one-hit sound samples (like someone pressing a key on a piano) and extract the loudest partials. In the image below, the peaks above 2000 Hz are important, because they are discrete partials of the given sound (which happens to be a sort of bell). However, the peaks that are scattered about right near 523 seem to be just artifacts, and I want to ignore them.
If the peak is broad, it could indicate that the peak frequency is modulated (AM, FM or both), or is actually a composite of several spectral peaks, themselves each potentially modulated.
For instance, a piano note may be the result of the hammer hitting up to 3 strings that are all tuned just a tiny fraction differently, and they all can modulate as they exchange energy between strings though the piano frame. Guitar strings can change frequency as the pluck shape distortion smooths out and decays. Bells change shape after they are hit, which can modulate their spectrum. Etc.
If the sound itself is "messy" then you need a good definition of what you mean by the "real" peak, before applying any sort of smoothing or side-band rejection filter. e.g. All that "messiness" may be part of what makes a bell sound like a real bell instead of an electronic sinewave generator.
Try convolving your FFT (treating it as a signal) with a rectangular pulse( pulse = ones(1:20)/20; ). This might get rid of some of them. Your maxima will be shifted by 10 frequency bins to teh right, to take that into account. You would basically be integrating your signal. Similar techniques are used in Pan-Tompkins algorithm for heart beat identification.
I worked on a similar problem once, and choosed to use savitsky-golay filters for smoothing the spectrum data. I could get some significant peaks, and it didn't messed too much with the overall spectrum.
But I Had a problem with what hotpaw2 is alerting you, I have lost important characteristics along with the lost of "messiness", so I truly recommend you hear him. But, if you think you won't have a problem with that, I think savitsky-golay can help.
There are non-FFT methods for creating a frequency domain representation of time domain data which are better for noisy data sets, like Max-ent recontruction.
For noisy time-series data, a max-ent reconstruction will be capable of distinguising true peaks from noise very effectively (without adding any artifacts or other modifications to suppress noise).
Max ent works by "guessing" an FFT for a time domain specturm, and then doing an IFT, and comparing the results with the "actual" time-series data, iteratively. The final output of maxent is a frequency domain spectrum (like the one you show above).
There are implementations in java i believe for 1-d spectra, but I have never used one.