FFT artificial defects due to finite sampling frequency - MATLAB

I use MATLAB to calculate the FFT of time series data. The signal has an unknown fundamental frequency (~80 MHz in this case), together with several higher-order harmonics (1st to 20th order). However, due to the finite sampling frequency (500 MHz in this case), I always get mixing frequencies from the higher-order harmonics (7th to 20th), e.g. the 7th with a peak at abs(2*500 - 80*7) = 440 MHz, the 8th at 360 MHz, and the 13th with a peak at abs(13*80 - 2*500) = 40 MHz. Does anyone know how to get rid of these artificial mixing frequencies? One possible way is to increase the sampling frequency to a sufficiently large value. However, my data set has a fixed number of points and a fixed time range, so the sampling frequency is actually determined by the properties of the data set. Any solutions to this problem?
(I have an image for this problem but I don't have enough reputation to post it. Sorry for the inconvenience in understanding this question.)

You are hitting on a fundamental property of sampling: when you sample data at a fixed frequency fs, you cannot tell the difference between two signals with the same amplitude but different frequencies, where one has f1 = fs/2 - d and the other has f2 = fs/2 + d. This effect is frequently used to advantage - for example in mixers - but at other times it's an inconvenience.
Unless you are looking for this mixing effect (done, for example, at the digital receiver in a modern MRI scanner), you need to apply a "brick wall filter" with a cutoff frequency of fs/2 before you sample. It is not uncommon to have filters with a roll-off of 24 dB/octave or higher - in other words, they let "everything" through below the cutoff and "stop everything" above it.
Data acquisition vendors will often supply filtering solutions with their ADC boards for exactly this reason.
It's a long way to say "that's how digitization works" - but it's true: that is how digitization works.
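As a quick illustration, here is a minimal MATLAB sketch (the sample count N is made up) that reproduces the artifact from the question: the 7th harmonic at 7*80 = 560 MHz folds down to |560 - 500| = 60 MHz, which shows up mirrored at 500 - 60 = 440 MHz on the full FFT axis.

    fs = 500e6;                   % sampling frequency, 500 MHz
    N  = 5000;                    % number of samples (made up), 100 kHz bins
    t  = (0:N-1)/fs;
    x  = sin(2*pi*7*80e6*t);      % 7th harmonic at 560 MHz, above Nyquist (250 MHz)
    X  = abs(fft(x));
    f  = (0:N-1)*fs/N;            % frequency axis from 0 to fs
    [~, k] = max(X);
    fprintf('Strongest bin at %.0f MHz\n', f(k)/1e6)   % 60 MHz; mirror peak at 440 MHz

No processing after the fact can undo this fold: the sampled 560 MHz tone and a genuine 60 MHz tone produce identical sample vectors.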

Typically, one low-pass filters the signal to below half the sample rate before sampling. Otherwise, after sampling, there is usually no way to separate any aliased high-frequency content (your higher-order harmonics) from the useful spectrum below the Nyquist frequency (half the sample rate).
If you don't filter the signal before sampling it, the defect is inherent in the sample vector, not in the FFT.


Using low frequency data to calibrate high frequency data

I have a 10 Hz time series measured by a fast instrument and a 1 minute time series measured by a slow reference instrument. The data consists of a fluctuating meteorological parameter. The slow reference instrument is used to calibrate the fast instrument measurements. Both time series are synchronised.
My idea:
Average the 10 Hz data into 1 minute blocks.
Take 5 one-minute blocks from each time series and calculate the linear regression equations.
Use the regression equations to calibrate the 10 Hz data in 5 minute blocks (3000 data points).
What would be the best way to match (calibrate) the high frequency data using the low frequency data? I use MATLAB.
More background: The fast instrument outputs a fluctuating voltage signal while the slow instrument outputs the true value of a trace gas concentration in ppb (parts per billion). The slow instrument samples every ten seconds and outputs the average every one minute.
In short, I would like to have my fast signal also in ppb, but without losing its integrity (I need the turbulent fluctuations to remain unfiltered), hence the need to use a linear fit.
Here's my approach and the results I got...
I modelled the problem as there being
a real (unmeasured by instruments) signal.
Let's call this real.
a slow signal - which is just the real signal sampled once a minute.
Let's call this lf (short for low frequency).
a fast signal - real signal + noise + signal drift.
Let's call this hf (short for high frequency).
The task was to take the slow and fast signals and try to reconstruct the real signal.
(Using least squares as a scoring metric)
Strategy:
Define a "piecewise linear filter" - this takes a signal, and returns a piecewise version of it. (With each piecewise part occurring where the slow signal is measured.)
NOTE: The slow signal is considered piecewise anyway.
Define a forwards-backwards low pass filter.
Define "uncertainty" to be 0 at the points where the low frequency signal is measured. It linearly increases to 1 when the timestamp is halfway between low frequency signal measurements.
Now, take your high frequency signal and filter it with the low pass filter.
Let's call this hf_lp
Take hf_lp and apply the "piecewise linear filter" to it.
Let's call this hf_lp_pl
Subtract the last two from each other.
I.e. hf_diff = hf_lp - hf_lp_pl.
You now want to find some function that estimates by how much hf_diff should be added to the low frequency signal (lf) such that the squared error between real_estimated and real is minimized.
I fitted a function along the lines of real_estimated = lf + diff.*(a1*uncertainty + a2*uncertainty.^2 + a3*uncertainty.^3)
Use fminsearch or other optimization techniques to get a1, a2, a3...
Here is a sample plot of my results - you can see that real_estimated is much closer to real than the slow signal lf.
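For reference, here is a rough, self-contained MATLAB sketch of the strategy above; every signal and parameter is made up for illustration, and butter/filtfilt assume the Signal Processing Toolbox. Note that fitting against the real signal is only possible in a simulation like this, which is how the problem was modelled above.

    fs = 10;  T = 600;                          % 10 Hz fast signal, 10 minutes
    t  = (0:1/fs:T-1/fs)';                      % fast time base
    real_sig = 100 + 10*sin(2*pi*t/300) + 2*sin(2*pi*t/20);  % unmeasured truth
    hf = real_sig + randn(size(t)) + 0.01*t;    % fast signal: truth + noise + drift
    t_lf = (0:60:T-60)';                        % slow timestamps, once a minute
    lf   = interp1(t, real_sig, t_lf);          % slow signal samples the truth

    [b, a]   = butter(2, 0.05);                 % low-pass filter (cutoff is a guess)
    hf_lp    = filtfilt(b, a, hf);              % forwards-backwards low pass
    hf_lp_pl = interp1(t_lf, interp1(t, hf_lp, t_lf), t, 'linear', 'extrap');
    hf_diff  = hf_lp - hf_lp_pl;                % the medium-frequency content

    u    = 1 - 2*abs(mod(t, 60)/60 - 0.5);      % uncertainty: 0 at lf samples, 1 halfway
    lf_t = interp1(t_lf, lf, t, 'linear', 'extrap');
    est  = @(p) lf_t + hf_diff .* (p(1)*u + p(2)*u.^2 + p(3)*u.^3);
    p    = fminsearch(@(p) sum((real_sig - est(p)).^2), [1 0 0]);
    real_estimated = est(p);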
Closing thoughts...
The fast signal contains too much very low frequency (drift) and too much very high frequency (noise) content.
But it has valuable medium frequency info.
The slow signal has perfect low frequency information, but no medium frequency info.
The strategy above is really just one way of extracting the medium frequencies from the fast signal and adding it to the low frequency signal.
This way, we get the best of all worlds: low frequencies, medium frequencies and low noise.

How to decide the cutoff frequencies of a filter when using an ADC (flow: analog signal to ADC to bits to FIR filter to filtered output)

An FIR filter has to be used to remove the noise.
I don't know the frequencies of the noise that might be added to the analog feedback signal I am taking.
My apparatus produces an analog feedback signal, which I digitize with an ADC; now I have to apply an FIR filter to remove the noise. I am not sure which noise to target: the noise added to the analog signal from the environment, or some sort of noise introduced by the ADC itself?
I have to code this in VHDL (this part is easy, I can do that).
My main problem is in deciding the frequencies.
Thanks in advance!
I am tagging vhdl as some people who are working in vhdl might know about the filter.
Let me start by stating the obvious: an ADC samples at a fixed rate and cannot represent any frequency higher than the Nyquist frequency.
Step one: understand aliasing, and that any frequency higher than Nyquist will alias into your signal as noise. Once you get this, you understand that you need an anti-aliasing filter in your hardware, in your analog signal path, before you digitize. Depending on the noise requirements of the application you may implement a very complicated 4-pole filter using op-amps; the simplest is an RC filter.
Step two: set the filter cutoff. Don't set the cutoff right at the Nyquist frequency; make sure the filter is cutting well before Nyquist (a half down to a tenth of it, depending on how clean the signal is and how much noise is present).
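As a small worked example (all numbers made up): with a 1 MHz ADC, Nyquist is 500 kHz, so a cutoff around Nyquist/10 = 50 kHz is reasonable, and the first-order RC relation fc = 1/(2*pi*R*C) gives the component values directly.

    fs = 1e6;             % ADC sample rate, 1 MHz -> Nyquist = 500 kHz
    fc = (fs/2)/10;       % cutoff a tenth of the way to Nyquist = 50 kHz
    R  = 1e3;             % pick a resistor value, 1 kOhm
    C  = 1/(2*pi*R*fc)    % fc = 1/(2*pi*R*C)  ->  C is about 3.2 nF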
So now you're actually somewhat oversampling your signal: the filter is cutting above your signal band, and the sample rate is high enough that the Nyquist frequency sits sufficiently higher. Oversampling gives you extra data, captured with the intent of filtering further, and possibly even decimating (keeping one in N samples and throwing the rest out).
Step three: use a digital filter to further remove the noise between the initial cutoff of the anti-aliasing filter and the Nyquist frequency. This is a science of its own really, but let me start by suggesting a good decimation filter: averaging 2 values. It's a boxcar filter of length 2 (whose frequency response is a sinc), and it can be reapplied N times. After N times it is equivalent to a single FIR filter whose taps are the values of the Nth row of Pascal's triangle (divided by their sum).
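A quick MATLAB sketch of that claim, cascading the 2-value average and recovering the binomial taps:

    h = 1;                       % start from the identity filter
    N = 4;                       % number of passes of the 2-sample average
    for k = 1:N
        h = conv(h, [1 1]/2);    % one more pass of the 2-value boxcar
    end
    disp(h * 2^N)                % -> 1 4 6 4 1, the 4th row of Pascal's triangle
    % y = filter(h, 1, x);       % smooth first, then keep one sample in M:
    % y = y(1:M:end);            % (decimation)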
Again, the filter choice is a science of its own really. At the extreme are the decimation filters of a sigma-delta ADC. The CS5376A datasheet clearly explains what they're doing; I learned quite a bit just from reading that datasheet!

Changing data in Matlab to correlate to sampling frequency

Good day,
I have a document of data measured during experiments. The first columns of the document is time thereafter torque and displacement readings.
My measuring equipment was supposed to sample at 200 Hz; however, during the experiment, as the amount of measured data increased the computer slowed down, resulting in sampling rates lower than 200 Hz.
I do, however, require readings at an exact sampling frequency (anything between 0 and 200 Hz is acceptable). How can I modify/interpolate my data to match the desired frequency?
For general resampling, use the resample function (see its documentation for examples of use). It lets you specify the resampling factor as a rational number, with the limitation that the numerator and denominator can't be too large. That imposes restrictions when the resampling factor is very close to 1; other than that, it is the way to go.
If you need very fine control of your resampling factor (for example, correcting the sampling frequency by amounts of the order of 1 part per million, which requires a resampling factor very close to 1), I suggest you use linear interpolation with the function interp1 (see its documentation). This interpolation method is not as good as that of resample, but the error is negligible for resampling factors close to 1, and it gives you very fine control of the resampling factor.
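A minimal sketch with assumed variable names (t is the measured, slightly irregular time column; x is a torque or displacement column):

    fs_new    = 100;                          % any target rate in (0, 200] Hz
    t_uniform = (t(1):1/fs_new:t(end))';      % uniform time grid
    x_uniform = interp1(t, x, t_uniform);     % linear interpolation onto the grid
    % With the Signal Processing Toolbox, resample also accepts nonuniformly
    % sampled input directly (and applies an anti-aliasing filter):
    % [x_uniform, t_uniform] = resample(x, t, fs_new);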

Best way to extract neuronal spike times from a noisy signal / voltage measurement

I'm a neuroscientist, and not a very good one. My colleague has kindly provided me with noisy voltage measurements of the PY neuron of the Stomatogastric Ganglion of the lobster.
The activity of this neuron is characterised by a slow depolarised plateau with fast spikes on top (a burst).
Both idealised and noisy versions are presented here for you to peruse at your leisure.
It's my job to extract the spike times from the noisy signal but this is so far beyond my experience level I have no idea where to begin. Fortunately, I am a total ninja at Matlab.
Could someone kindly provide me with the name of the procedure, filter or smoothing function which is best suited for this task. Or even the appropriate forum to ask such an asinine question.
Presumably, it needs to increase the signal to noise ratio? The problem here seems to be determining the difference between noise and a bona fide spike as the margin between the two is quite small.
UPDATE: 02/07/2013
I have tried the following filters in Matlab with mixed results. It's still very hard to say what is noise and what is a spike.
Lowpass Butterworth filter,
median filter,
gaussian,
moving weighted window,
moving average filter,
smooth,
sgolay filter.
This may not be an adequate response for Stack Overflow, but one way of increasing the signal-to-noise ratio in your case is to average parts of the signal:
low pass your signal to remove noise (and spikes), and find the minima of the filtered signal (from your image, one minimum every 600 data points). Keep the indexes of each minimum,
on the noisy signal, for each minimum index, select the consecutive 700 data points. If you have 50 minima, you should have a 50 by 700 matrix,
average your matrix. You should have a 1 by 700 vector.
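A rough MATLAB sketch of those three steps, with guessed filter settings and window lengths (v is the noisy voltage trace; butter, filtfilt and findpeaks assume the Signal Processing Toolbox):

    [b, a] = butter(2, 0.01);                             % heavy low pass
    v_lp   = filtfilt(b, a, v);                           % removes spikes and noise
    [~, locs] = findpeaks(-v_lp, 'MinPeakDistance', 500); % indices of the minima
    locs = locs(locs + 699 <= numel(v));                  % keep only full windows
    M = zeros(numel(locs), 700);
    for k = 1:numel(locs)
        M(k, :) = v(locs(k) : locs(k) + 699);             % 700 points per minimum
    end
    avg = mean(M, 1);                                     % 1-by-700 averaged burst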
By averaging parts of the signal (minimum-locked potentials), you take advantage of two properties: noise is zero-mean (well, it should be), and the signal of interest is repetitive. The noise will therefore decrease as you pile up potentials, and the signal will increase. With this process, however, you lose the spike times for each individual slow wave, but you at least have them for blocks of 50 minima.
This technique is known in neuroscience as the event-related potential (http://en.wikipedia.org/wiki/Event-related_potential). It may not fit your signal perfectly, and the result may not give nice spikes, but you may extract the spike times for some periods of interest (given the nature of your signal, I would say you need 5 or 10 potentials to see a mean activity emerge).
There are toolboxes that do part of the job (though I would program it myself given the complexity of the task), such as EEGLAB or FieldTrip. They have a bunch of filtering/decomposition options too, as well as some statistical features.

Finding Relevant Peaks in Messy FFTs

I have FFT outputs that look like this:
At 523 Hz is the maximum value. However, being a messy FFT, there are lots of little peaks right next to the large peaks, and they're irrelevant, whereas the peaks shown aren't. Are there any algorithms I can use to extract the maxima of this FFT that matter, i.e. those that aren't just random peaks cropping up near "real" peaks? Perhaps there is some sort of filter I can apply to this FFT output?
EDIT: The context of this is that I am trying to take one-hit sound samples (like someone pressing a key on a piano) and extract the loudest partials. In the image below, the peaks above 2000 Hz are important, because they are discrete partials of the given sound (which happens to be a sort of bell). However, the peaks that are scattered about right near 523 seem to be just artifacts, and I want to ignore them.
If the peak is broad, it could indicate that the peak frequency is modulated (AM, FM or both), or is actually a composite of several spectral peaks, themselves each potentially modulated.
For instance, a piano note may be the result of the hammer hitting up to 3 strings that are all tuned just a tiny fraction differently, and they all can modulate as they exchange energy between strings through the piano frame. Guitar strings can change frequency as the pluck shape distortion smooths out and decays. Bells change shape after they are hit, which can modulate their spectrum. Etc.
If the sound itself is "messy" then you need a good definition of what you mean by the "real" peak, before applying any sort of smoothing or side-band rejection filter. e.g. All that "messiness" may be part of what makes a bell sound like a real bell instead of an electronic sinewave generator.
Try convolving your FFT (treating it as a signal) with a rectangular pulse (pulse = ones(1,20)/20;). This might get rid of some of them. Your maxima will be shifted by 10 frequency bins to the right; take that into account. You would basically be integrating your signal. Similar techniques are used in the Pan-Tompkins algorithm for heartbeat identification.
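A sketch of that suggestion; using conv with the 'same' option keeps the output aligned with the input, so no 10-bin shift correction is needed (X is assumed to be your FFT magnitude vector):

    pulse    = ones(1, 20)/20;              % 20-bin rectangular pulse
    X_smooth = conv(X, pulse, 'same');      % centered moving average, no shift
    [pk, loc] = max(X_smooth);              % dominant smoothed peak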
I worked on a similar problem once and chose to use Savitzky-Golay filters to smooth the spectrum data. I could get some significant peaks, and it didn't mess too much with the overall spectrum.
But I did run into the problem hotpaw2 is alerting you to: I lost important characteristics along with the "messiness", so I truly recommend you listen to him. But if you think you won't have a problem with that, I think Savitzky-Golay can help.
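If you want to try it, a one-line sketch with guessed parameters (sgolayfilt is in the Signal Processing Toolbox):

    X_smooth = sgolayfilt(X, 3, 41);   % cubic fit over a sliding 41-bin window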
There are non-FFT methods for creating a frequency domain representation of time domain data which are better for noisy data sets, like maximum-entropy (max-ent) reconstruction.
For noisy time-series data, a max-ent reconstruction will be capable of distinguishing true peaks from noise very effectively (without adding any artifacts or other modifications to suppress noise).
Max-ent works by "guessing" a frequency domain spectrum, doing an inverse FT, and comparing the result with the actual time-series data, iteratively. The final output of max-ent is a frequency domain spectrum (like the one you show above).
There are implementations in Java, I believe, for 1-D spectra, but I have never used one.