FFT: Match samples to frequency - matlab

let us assume,
I have a vector t with the times in seconds of my samples. (These samples are not equally distributed on the time domain.
Also I have a vector data containing the samplevalues at the time t.
t and data have the same length.
If I plot the graph some sort of periodical signal is obtained.
now I could perform: abs(fft(data)) to get my spectrum, which is then plotted over the amount of data points on the x-axis.
How can I obtain my spectrum regarding the times in vector t and plot it?
I want to see which frequencies in 1/s or which period in s my signal contains.
Thanks for your help.

[Not the OP's intention]: FFT will give you the spectrum (global) for any number of input data points. You cannot have a specific data point (in time) associated with parts (or the full) spectrum.
What you can do instead is use spectrogram and obtain the Short-Time Fourier Transform (STFT). This will give you a NxM discrete grid of time-frequency FT values (N: FT frequency bins, M: signal time-windows).
By localizing the (overlapping) STFT windows on your data samples of interest you will get N frequency magnitude values, thus the distribution of short-term spectrum estimates as the signal changes in time.
See also the possibly relevant answer here: https://stackoverflow.com/a/12085728/651951
EDIT/UPDATE:
For unevenly spaced data you need to consider the Non-Uniform DFT (and Non-uniform FFT implementations). See the relevant question/answer here https://scicomp.stackexchange.com/q/593
The primary approaches for NFFT or NUFFT, are based on creating a uniform grid through local convolutions/interpolation, running FFT on this and undoing the convolutional effect of the interpolation filter.
You can read more:
A. Dutt and V. Rokhlin, Fast Fourier transforms for nonequispaced data, SIAM J. Sci. Comput., 14, 1993.
L. Greengard and J.-Y. Lee, Accelerating the Nonuniform Fast Fourier Transform, SIAM Review, 46 (3), 2004.
Pippig, M. und Potts, D., Particle Simulation Based on Nonequispaced Fast Fourier Transforms, in: Fast Methods for Long-Range Interactions in Complex Systems, 2011.
For an implementation (with an interface to MATLAB) try NFFT and possibly its parallelized version PNFFT. You may find a nice walk-through on how to set-up and use here.

You can resample or interpolate your sample points to get another set of sample points that are equally spaced in t. The chosen spacing or sample rate of the second set of equally spaced sample points will allow you to infer frequencies to the result of an FFT of that second set.
The results may be noisy or include aliasing unless the initial data set is bandlimited to a sufficiently low frequency to allow interpolation. If bandlimited, then you might try something like cubic splines as an interpolation method.
Although it may look like one can get a high FFT bin frequency resolution by resampling to a larger number of data points, the actual useful resolution accuracy will be more related to the original number of samples.

Related

How does this logic produce high and low pass filters?

I was studying for a signals & systems project and I have come across this code on high and low pass filters for an audio signal on the internet. Now I have tested this code and it works but I really don't understand how it is doing the low/high pass action.
The logic is that a sound is read into MATLAB by using the audioread or wavread function and the audio is stored as an nx2 matrix. The n depends on the sampling rate and the 2 columns are due to the 2 sterio channels.
Now here is the code for the low pass;
[hootie,fs]=wavread('hootie.wav'); % loads Hootie
out=hootie;
for n=2:length(hootie)
out(n,1)=.9*out(n-1,1)+hootie(n,1); % left
out(n,2)=.9*out(n-1,2)+hootie(n,2); % right
end
And this is for the high pass;
out=hootie;
for n=2:length(hootie)
out(n,1)=hootie(n,1)-hootie(n-1,1); % left
out(n,2)=hootie(n,2)-hootie(n-1,2); % right
end
I would really like to know how this produces the filtering effect since this is making no sense to me yet it works. Also shouldn't there be any cutoff points in these filters ?
The frequency response for a filter can be roughly estimated using a pole-zero plot. How this works can be found on the internet, for example in this link. The filter can be for example be a so called Finite Impulse Response (FIR) filter, or an Infinite Impulse Response (IIR) filter. The FIR-filters properties is determined only from the input signal (no feedback, open loop), while the IIR-filter uses the previous signal output to control the current signal output (feedback loop or closed loop). The general equation can be written like,
a_0*y(n)+a_1*y(n-1)+... = b_0*x(n)+ b_1*x(n-1)+...
Applying the discrete fourier transform you may define a filter H(z) = X(z)/Y(Z) using the fact that it is possible to define a filter H(z) so that Y(Z)=H(Z)*X(Z). Note that I skip a lot of steps here to cut down this text to proper length.
The point of the discussion is that these discrete poles can be mapped in a pole-zero plot. The pole-zero plot for digital filters plots the poles and zeros in a diagram where the normalized frequencies, relative to the sampling frequencies are illustrated by the unit circle, where fs/2 is located at 180 degrees( eg. a frequency fs/8 will be defined as the polar coordinate (r, phi)=(1,pi/4) ). The "zeros" are then the nominator polynom A(z) and the poles are defined by the denominator polynom B(z). A frequency close to a zero will have an attenuation at that frequency. A frequency close to a pole will instead have a high amplifictation at that frequency instead. Further, frequencies far from a pole is attenuated and frequencies far from a zero is amplified.
For your highpass filter you have a polynom,
y(n)=x(n)-x(n-1),
for each channel. This is transformed and it is possble to create a filter,
H(z) = 1 - z^(-1)
For your lowpass filter the equation instead looks like this,
y(n) - y(n-1) = x(n),
which becomes the filter
H(z) = 1/( 1-0.9*z^(-1) ).
Placing these filters in the pole-zero plot you will have the zero in the highpass filter on the positive x-axis. This means that you will have high attenuation for low frequencies and high amplification for high frequencies. The pole in the lowpass filter will also be loccated on the positive x-axis and will thus amplify low frequencies and attenuate high frequencies.
This description is best illustrated with images, which is why I recommend you to follow my links. Good luck and please comment ask if anything is unclear.

What is a spectrogram and how do I set its parameters?

I am trying to plot the spectrogram of my time domain signal given:
N=5000;
phi = (rand(1,N)-0.5)*pi;
a = tan((0.5.*phi));
i = 2.*a./(1-a.^2);
plot(i);
spectrogram(i,100,1,100,1e3);
The problem is I don't understand the parameters and what values should be given. These values that I am using, I referred to MATLAB's online documentation of spectrogram. I am new to MATLAB, and I am just not getting the idea. Any help will be greatly appreciated!
Before we actually go into what that MATLAB command does, you probably want to know what a spectrogram is. That way you'll get more meaning into how each parameter works.
A spectrogram is a visual representation of the Short-Time Fourier Transform. Think of this as taking chunks of an input signal and applying a local Fourier Transform on each chunk. Each chunk has a specified width and you apply a Fourier Transform to this chunk. You should take note that each chunk has an associated frequency distribution. For each chunk that is centred at a specific time point in your time signal, you get a bunch of frequency components. The collection of all of these frequency components at each chunk and plotted all together is what is essentially a spectrogram.
The spectrogram is a 2D visual heat map where the horizontal axis represents the time of the signal and the vertical axis represents the frequency axis. What is visualized is an image where darker colours means that for a particular time point and a particular frequency, the lower in magnitude the frequency component is, the darker the colour. Similarly, the higher in magnitude the frequency component is, the lighter the colour.
Here's one perfect example of a spectrogram:
Source: Wikipedia
Therefore, for each time point, we see a distribution of frequency components. Think of each column as the frequency decomposition of a chunk centred at this time point. For each column, we see a varying spectrum of colours. The darker the colour is, the lower the magnitude component at that frequency is and vice-versa.
So!... now you're armed with that, let's go into how MATLAB works in terms of the function and its parameters. The way you are calling spectrogram conforms to this version of the function:
spectrogram(x,window,noverlap,nfft,fs)
Let's go through each parameter one by one so you can get a greater understanding of what each does:
x - This is the input time-domain signal you wish to find the spectrogram of. It can't get much simpler than that. In your case, the signal you want to find the spectrogram of is defined in the following code:
N=5000;
phi = (rand(1,N)-0.5)*pi;
a = tan((0.5.*phi));
i = 2.*a./(1-a.^2);
Here, i is the signal you want to find the spectrogram of.
window - If you recall, we decompose the image into chunks, and each chunk has a specified width. window defines the width of each chunk in terms of samples. As this is a discrete-time signal, you know that this signal was sampled with a particular sampling frequency and sampling period. You can determine how large the window is in terms of samples by:
window_samples = window_time/Ts
Ts is the sampling time of your signal. Setting the window size is actually very empirical and requires a lot of experimentation. Basically, the larger the window size, the better frequency resolution you get as you're capturing more of the frequencies, but the time localization is poor. Similarly, the smaller the window size, the better localization you have in time, but you don't get that great of a frequency decomposition. I don't have any suggestions here on what the most optimal size is... which is why wavelets are preferred when it comes to time-frequency decomposition. For each "chunk", the chunks get decomposed into smaller chunks of a dynamic width so you get a mixture of good time and frequency localization.
noverlap - Another way to ensure good frequency localization is that the chunks are overlapping. A proper spectrogram ensures that each chunk has a certain number of samples that are overlapping for each chunk and noverlap defines how many samples are overlapped in each window. The default is 50% of the width of each chunk.
nfft - You are essentially taking the FFT of each chunk. nfft tells you how many FFT points are desired to be computed per chunk. The default number of points is the largest of either 256, or floor(log2(N)) where N is the length of the signal. nfft also gives a measure of how fine-grained the frequency resolution will be. A higher number of FFT points would give higher frequency resolution and thus showing fine-grained details along the frequency axis of the spectrogram if visualised.
fs - The sampling frequency of your signal. The default is 1 Hz, but you can override this to whatever the sampling frequency your signal is at.
Therefore, what you should probably take out of this is that I can't really tell you how to set the parameters. It all depends on what signal you have, but hopefully the above explanation will give you a better idea of how to set the parameters.
Good luck!

Methodology of FFT for Matlab spectrogram / short time Fourier transform functions

I'm trying to figure out how MATLAB does the short time Fourier transforms for its spectrogram function (and related functions like specgram, or stft in Octave). What is curious to me is that you can apparently specify the length of the window and the FFT length (number of output frequencies) independently, whereas I would have expected that these two should be equal (since the length of an FFT'd signal is the same as the length of the original signal). To illustrate what I mean, here is the function call:
[S,F,T]=spectrogram(signal,winSize,overlapSize,fftSize,rate);
winSize is the length of subintervals which are to be (individually) FFT'd, and fftSize is the number of frequency components given in the output. When these are not equal, does Matlab do interpolation to produce the required number of frequency bins?
Ultimately the reason I want to know is so that I can determine the proper units and scaling for the frequencies.
Cheers
A windowed segment of a signal can be zero-padded to a longer length vector to use a longer FFT. The frequency scaling will be determined by the length of the FFT (and the signals sample rate). The window size and window formula will determine the effective resolution, in terms of peak separation ability.
Why do this? Some FFT sizes can be computed more efficiently than others (slightly or a lot, depending on the FFT library used). Also, a longer FFT will calculate more points or bins, thus producing a higher density of interpolated points in a potentially smoother spectrum result.

How to measure power spectral density in matlab?

I am trying to measure the PSD of a stochastic process in matlab, but I am not sure how to do it. I have posted the exact same question here, but I thought I might have more luck here.
The stochastic process describes wind speed, and is represented by a vector of real numbers. Each entry corresponds to the wind speed in a point in space, measured in m/s. The points are 0.0005 m apart. How do I measure and plot the PSD? Let's call the vector V. My first idea was to use
[p, w] = pwelch(V);
loglog(w,p);
But is this correct? The thing is, that I'm given an analytical expression, which the PSD should (in theory) match. By plotting it with these two lines of code, it looks all wrong. Specifically it looks as though it could need a translation and a scaling. Other than that, the shapes of the two are similar.
UPDATE:
The image above actually doesn't depict the PSD obtained by using pwelch on a single vector, but rather the mean of the PSD of 200 vectors, since these vectors stems from numerical simulations. As suggested, I have tried scaling by 2*pi/0.0005. I saw that you can actually give this information to pwelch. So I tried using the code
[p, w] = pwelch(V,[],[],[],2*pi/0.0005);
loglog(w,p);
instead. As seen below, it looks much nicer. It is, however, still not perfect. Is that just something I should expect? Taking the squareroot is not the answer, by the way. But thanks for the suggestion. For one thing, it should follow Kolmogorov's -5/3 law, which it does now (it follows the green line, which has slope -5/3). The function I'm trying to match it with is the Shkarofsky spectral density function, which is the one-dimensional Fourier transform of the Shkarofsky correlation function. Is it not possible to mark up math, here on the site?
UPDATE 2:
I have tried using [p, w] = pwelch(V,[],[],[],1/0.0005); as I was suggested. But as you can seem it still doesn't quite match up. It's hard for me to explain exactly what I'm looking for. But what I would like (or, what I expected) is that the dip, of the computed and the analytical PSD happens at the same time, and falls off with the same speed. The data comes from simulations of turbulence. The analytical expression has been fitted to actual measurements of turbulence, wherein this dip is present as well. I'm no expert at all, but as far as I know the dip happens at the small length scales, since the energy is dissipated, due to viscosity of the air.
What about using the standard equation for obtaining a PSD? I'd would do this way:
Sxx(f) = (fft(x(t)).*conj(fft(x(t))))*(dt^2);
or
Sxx = fftshift(abs(fft(x(t))))*(dt^2);
Then, if you really need, you may think of applying a windowing criterium, such as
Hanning
Hamming
Welch
which will only somehow filter your PSD.
Presumably you need to rescale the frequency (wavenumber) to units of 1/m.
The frequency units from pwelch should be rescaled, since as the documentation explains:
W is the vector of normalized frequencies at which the PSD is
estimated. W has units of rad/sample.
Off the cuff my guess is that the scaling factor is
scale = 1/0.0005/(2*pi);
or 318.3 (m^-1).
As for the intensity, it looks like taking a square root might help. Perhaps your equation reports an intensity, not PSD?
Edit
As you point out, since the analytical and computed PSD have nearly identical slopes they appear to obey similar power laws up to 800 m^-1. I am not sure to what degree you require exponents or offsets to match to be satisfied with a specific model, and I am not familiar with this particular theory.
As for the apparent inconsistency at high wavenumbers, I would point out that you are entering the domain of very small numbers and therefore (1) floating point issues and (2) noise are probably lurking. The very nice looking dip in the computed PSD on the other hand appears very real but I have no explanation for it (maybe your noise is not white?).
You may want to look at this submission at matlab central as it may be useful.
Edit #2
After inspecting documentation of pwelch, it appears that you should pass 1/0.0005 (the sampling rate) and not 2*pi/0.0005. This should not affect the slope but will affect the intercept.
The dip in PSD in your simulation results looks similar to aliasing artifacts
that I have seen in my data when the original data were interpolated with a
low-order method. To make this clearer - say my original data was spaced at
0.002m, but in the course of cleaning up missing data, trying to save space, whatever,
I linearly interpolated those data onto a 0.005m spacing. The frequency response
of linear interpolation is not well-behaved, and will introduce peaks and valleys
at the high wavenumber end of your spectrum.
There are different conventions for spectral estimates - whether the wavenumber
units are 1/m, or radians/m. Single-sided spectra or double-sided spectra.
help pwelch
shows that the default settings return a one-sided spectrum, i.e. the bin for some
frequency ω will include the power density for both +ω and -ω.
You should double check that the idealized spectrum to which you are comparing
is also a one-sided spectrum. Otherwise, you'll need to half the values of your
one-sided spectrum to get values representative of the +ω side of a
two-sided spectrum.
I agree with Try Hard that it is the cyclic frequency (generally Hz, or in this case 1/m)
which should be specified to pwelch. That said, the returned frequency vector
from pwelch is also in those units. Analytical
spectral formulae are usually written in terms of angular frequency, so you'll
want to be sure that you evaluate it in terms of radians/m, but scale back to 1/m
for plotting.

How to get coefficients for sine/cosine function from complex FFT in Matlab?

I'm working on a control system that measures the movement of a vibrating robot arm. Because there is some deadtime, I need to look into the future of the somewhat noisy signal.
My idea was to use the frequencies in the sampled signal and produce a fourier function that could be used for extrapolation.
My question: I already have the FFT of the signal vector (containing 60-100 values e.g.) and can see the main frequencies in the amplitude spectrum. Now I want to have a function f(t) which fits to the signal, removes some noise, and can be used to predict the near future of the signal. How do I calculate the coefficients for the sine/cosine functions out of the complex FFT data?
Thank you so much!
AFAIR FFT essentially produces output as a sum of sine functions with different frequencies. The importance of each frequency is the height of each peak. So what you really want to do here is filter out some frequencies (ie. high frequencies for the arm to move gently) and then come back to the time domain.
In matlab this should be like going through the vector of what you got from fft, setting some values to 0 (or doing something more complex to it) and then use ifft to come back to time domain and make the prediction based on what you get.
There's also one thing you should consider while doing this - Nyquist frequency - this means that the highest frequency that you get on your fft is half of the sampling frequency.
If you use an FFT for data that isn't periodic within the FFT aperture length, then you may need to use a window to reduce spurious frequencies due to "spectral leakage". Frequency estimation techniques to better estimate "between bin" frequency content may also be appropriate. The phase of each cosine sinusoid, relative to the edge of the window, is usually atan2(imag[i], real[i]). The frequency depends on the sample rate and bin number versus the length of the FFT.
You might also want to look into using a Kalman filter instead of an FFT.
Added: If your signal isn't exactly integer periodic in the FFT length, then you may want to do an fftshift before the FFT to move the resulting phase measurement reference point to the center of your data vector, instead of a possibly discontinuous circular edge.