I am trying to find the strongest frequency component with MATLAB. It works, but if the data points and periods are not nicely aligned, I need to zero-pad my data to increase the FFT resolution. So far so good.
The problem is that, when I zero-pad too much, the frequency with the maximal power changes, even if everything is aligned nicely and I would expect a clear result.
This is my MWE:
Tmax = 1024;
resolution = 1024;
period = 512;
X = linspace(0,Tmax,resolution);
Y = sin(2*pi*X/period);
% N_fft = 2^12; % still fine, max_period is 512
N_fft = 2^13; % now max_period is 546.1333
F = fft(Y,N_fft);
power = abs(F(1:N_fft/2)).^2;
dt = Tmax/resolution;
freq = (0:N_fft/2-1)/N_fft/dt;
[~, ind] = max(power);
max_period = 1/freq(ind)
With zero-padding up to 2^12 everything works fine, but when I zero-pad to 2^13, I get a wrong result. It seems like too much zero-padding shifts the spectrum, but I doubt it. I rather expect a bug in my code, but I cannot find it. What am I doing wrong?
EDIT: It seems like the spectrum is skewed towards the low frequencies. Zero-padding just makes this visible:
Why is my spectrum skewed? Shouldn't it be symmetric?

Here is a graphic explanation of what you're doing wrong (which is mostly a resolution problem).
EDIT: this shows the power for each fft data point, mapped to the indices of the 2^14 dataset. That is, the indices for the 2^13 data numbered 1,2,3 map to 1,3,5 on this graph; the indices for 2^12 data numbered 1,2,3 map to 1,5,9; and so on.
You can see that the "true" value should in fact not be 512 -- your indexing is off by 1 or a fraction of 1.

It's not a bug in your code. It has to do with the properties of the DFT (and thus the FFT, which is merely a fast version of the DFT).
When you zero-pad, you add frequency resolution, particularly on the lower end.
Here you use a sine wave as a test signal, so you are basically convolving a finite-length sine with finite sines and cosines (see https://en.wikipedia.org/wiki/Fast_Fourier_transform for details), some of which have almost the same or a lower frequency.
If you were computing a "proper" Fourier transform, i.e. integrating from -inf to +inf, even those low-frequency components would give you zero coefficients. But since you are computing finite sums, the result of those convolutions is not zero, and hence the computed Fourier transform is inaccurate.

TL;DR: Use a better window function!
The long version:
After searching further, I finally found the explanation. The problem is neither the indexing nor the additional low-frequency components added by the zero-padding. The culprit is the frequency response of the rectangular window, combined with the negative frequency components. I found this out on a website explaining window functions.
I made more plots to explain:
Top: The frequency response without windowing: two delta peaks, one at the positive and one at the negative frequency. I always plotted the positive part, since I didn't expect to need the negative frequency components.
Middle: The frequency response of the rectangular window function. It is relatively broad, but I didn't care, because I thought I'd have only a single peak.
Bottom: The frequency response of the zero-padded signal. In the time domain, this is the multiplication of the window function and the sine wave. In the frequency domain, this amounts to the convolution of the frequency response of the window function with the frequency response of the perfect sine. Since there are two peaks, the relatively broad frequency responses of the window overlap significantly, leading to a skewed spectrum and therefore a shifted peak.
The solution: A way to circumvent this is to use a proper window function, like a Hamming window, to have a much smaller frequency response of the window, leading to less overlap.
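The effect can be reproduced with a small sketch. It is written in plain Python rather than MATLAB so that it needs only the standard library, and it makes a few assumptions that are not in the question: the sine is sampled at integer times (so dt is exactly 1), the Hamming window is the periodic form `0.54 - 0.46*cos(2*pi*n/N)`, and only the lowest bins are searched to keep the naive DFT cheap. The helper name `peak_bin` is made up for this illustration.

```python
import cmath
import math

def peak_bin(samples, nfft, max_bin=64):
    """Return the 0-based bin with the largest power among bins 1..max_bin-1
    of the nfft-point DFT of `samples` (implicitly zero-padded to nfft)."""
    best_k, best_power = 0, -1.0
    for k in range(1, max_bin):
        acc = sum(s * cmath.exp(-2j * math.pi * k * n / nfft)
                  for n, s in enumerate(samples))
        power = abs(acc) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k

N = 1024        # number of samples, taken at t = 0, 1, ..., N-1
period = 512    # exactly two cycles fit in the record
nfft = 2 ** 13  # zero-padded FFT length from the question

sine = [math.sin(2 * math.pi * n / period) for n in range(N)]

# Periodic Hamming window (an assumption; the answer only says "like a Hamming window")
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / N) for n in range(N)]
windowed = [s * w for s, w in zip(sine, hamming)]

k_rect = peak_bin(sine, nfft)      # rectangular window = no extra weighting
k_hamm = peak_bin(windowed, nfft)

print("rectangular window, peak period:", nfft / k_rect)
print("Hamming window, peak period:    ", nfft / k_hamm)
```

With the rectangular window the peak should land one bin too low (period about 546.13, as in the question), while the Hamming-windowed data should peak at the bin for a period of 512. The tapered window has much lower sidelobes, so the leakage from the negative-frequency peak barely tilts the positive-frequency peak.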


Why is the number of sample frequencies in `scipy.signal.stft()` tied to the hop size?

This question relates to SciPy's Short-time Fourier Transform function for signal processing.
For some reason I don't understand, the size of the output 'array of sample frequencies' is exactly equal to the hop size. From the documentation:
nperseg : int, optional
Length of each segment. Defaults to 256.
noverlap : int, optional
Number of points to overlap between segments. If None, noverlap = nperseg // 2. Defaults to None. When specified, the COLA constraint must be met (see Notes below).
f : ndarray
Array of sample frequencies.
hop size H = nperseg - noverlap
I'm new to signal processing and Fourier transforms, but as far as I understand a STFT is just chopping an audio file into segments ('time frames') on which you perform a Fourier transform. So if I want to do a STFT on 100 time frames, I'd expect the output to be a matrix of size 100 x F, where F is an array of measured frequencies ('measured' probably isn't the right word here but you know what I mean).
This is kinda what SciPy's implementation does, but the size of f here is what bothers me. It's supposed to be an array describing the different frequencies, like [0Hz 500Hz 1000Hz], and it is, but for some reason its size is exactly the same as the hop size. If the hop size is 700, the number of measured frequencies is 700.
The hop size is the number of samples (i.e. time) between each time frame, and is correctly calculated as H = nperseg - noverlap, but what does this have to do with the frequency array?
Edit: Related to this question
An FFT is a square matrix transform from one orthogonal basis to another of the same dimension. This is because N is the exact number of orthogonal (i.e. that don't interfere with one another) complex sinusoids that fit in a time domain vector of length N.
A longer time vector can contain more frequency information (e.g. it's hard to tell 2 frequencies apart using just 3 sample points, but much easier with 3000 samples, etc.)
You can zero-pad your short time vector of length N to use a longer FFT, but that is identical to interpolating a nice curve between N frequency points, which makes all the FFT results interdependent.
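The interpolation claim can be checked on a toy example. The sketch below uses a naive DFT in plain Python and made-up sample values; it only illustrates that the zero-padded transform passes through the original N frequency points and adds values in between.

```python
import cmath
import math

def dft(x, nfft):
    """Naive DFT of x, implicitly zero-padded to nfft points."""
    return [sum(v * cmath.exp(-2j * math.pi * k * n / nfft)
                for n, v in enumerate(x))
            for k in range(nfft)]

x = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.3, -0.6]  # arbitrary N = 8 samples
N = len(x)
pad_factor = 4

X = dft(x, N)                        # N independent frequency points
X_padded = dft(x, N * pad_factor)    # 4N points after zero-padding

# Every pad_factor-th bin of the padded result reproduces an original bin;
# the bins in between are interpolated from those same N values.
max_diff = max(abs(X_padded[pad_factor * k] - X[k]) for k in range(N))
print(len(X), len(X_padded), max_diff)
```

The padded transform has more output points but no new information: all of them are determined by the same 8 input samples.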
For many purposes (visualization, etc.) an STFT is overlapped, where the adjacent segments share some overlapped data instead of just being end-to-end. This gives better time locality (e.g. the segments can be spaced closer but still be long enough so that each one can provide the frequency resolution required).
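Applied to the question, the sketch below works through the size arithmetic. It does not call SciPy; it assumes, based on my reading of the `scipy.signal.stft` documentation, that `nfft` defaults to `nperseg` and that real input yields a one-sided spectrum with `nfft // 2 + 1` bins. The function name `stft_sizes` is made up for illustration.

```python
def stft_sizes(nperseg=256, noverlap=None, nfft=None):
    """Hop size and number of one-sided frequency bins for a real-input STFT.

    Assumed defaults: noverlap = nperseg // 2 (quoted in the question) and
    nfft = nperseg (an assumption about the library's behaviour).
    """
    if noverlap is None:
        noverlap = nperseg // 2
    if nfft is None:
        nfft = nperseg
    hop = nperseg - noverlap
    n_freqs = nfft // 2 + 1   # DC up to and including Nyquist
    return hop, n_freqs

print(stft_sizes(1400))                   # (700, 701)
print(stft_sizes(1400, 1050))             # (350, 701)
print(stft_sizes(1400, 1050, nfft=4096))  # (350, 2049)
```

Under these assumptions the number of frequencies follows the segment (FFT) length, not the hop. With the default 50% overlap the hop is `nperseg / 2`, which happens to be almost the same number as `nperseg // 2 + 1`; changing `noverlap` changes the hop but leaves the frequency count alone.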

Frequency domain phase shift, amplitude, hop size and non-linearity

I am trying to implement a frequency domain phase shift but there are few points on which I am not sure.
1- I am able to get a perfect reconstruction from a sine or sweep signal using a hanning window with a hop size of 50%. Nevertheless, how should I normalise my result when using a hop size > 50%?
2- When shifting the phase of low frequency signals (f<100, window size<1024, fs=44100) I can clearly see some non-linearity in my result. Is this because the window size is too short for low frequencies?
Thank you very much for your help.
t = 1/fs:1/fs:(endTime);
x = sin(2*pi*f1.*t);
H=L*.50;% Hopsize of 50%
%match input length with window length
pend=length(x)- L ;
buffer0pad= zeros(N,1);
outBuffer0pad= zeros(L,1);
df = fs/N;
f= -fs/2:df:fs/2 - df;
while pin<pend
buffer = x(pin+1:pin+L).*w;
%append zero padding in the middle
buffer0pad(1:(L)/2)=buffer((L)/2+1: L);
X = fft(buffer0pad,N);
% Phase modification
X = abs(X).*exp(1i*(angle(X))-(1i*2*pi*f'*delay));
% undo zero padding----------------------
outBuffer0pad(1:L/2)=outBuffer(N-(L/2-1): N);
y(pin+1:pin+L) = y(pin+1:pin+L) + outBuffer0pad;
%match output length with original input length
hold on
hold off
Anything below 100 Hz has less than two cycles in your FFT window. Note that a DFT or FFT represents any waveform, including a single non-integer-periodic sinusoid, by summing up a whole bunch of sinusoids of possibly very different frequencies, i.e. a lot more than just one. That's just how the math works.
For a von Hann window containing less than 2 cycles, these are often a bunch of mostly completely different frequencies (possibly very far away in terms of percentage from your low frequency). Shifting the phase of all those completely different frequencies may or may not shift your windowed low frequency sinusoid by the desired amount (depending on how different in frequency your signal is from being integer-periodic).
Also for low frequencies, the complex conjugate mirror needs to be shifted in the opposite direction in phase in order to still represent a completely real result. So you end up mixing 2 overlapped and opposite phase changes, which again is mostly a problem if the low frequency signal is far from being integer periodic in the DFT aperture.
Using a longer window in time and samples allows more cycles of a given frequency to fit inside it (thus possibly needing a lesser power of very different frequency sinusoids to be summed up in order to compose, make up or synthesize your low frequency sinusoid); and the complex conjugate is farther away in terms of FFT result bin index, thus reducing interference.
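The "opposite direction" requirement for the conjugate mirror can be made concrete with a toy delay. The sketch below is plain Python with a naive DFT; the helper names (`signed_bin`, `delay`) and the test signal are made up. It applies the phase ramp with signed frequency indices in FFT output order (0, 1, ..., N/2-1, -N/2, ..., -1), so the negative-frequency bins get the opposite phase change and the result stays real.

```python
import cmath
import math

def dft(x):
    n_pts = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * n / n_pts)
                for n, v in enumerate(x)) for k in range(n_pts)]

def idft(X):
    n_pts = len(X)
    return [sum(V * cmath.exp(2j * math.pi * k * n / n_pts)
                for k, V in enumerate(X)) / n_pts for n in range(n_pts)]

def signed_bin(k, n_pts):
    """Signed frequency index for FFT output position k."""
    return k if k < n_pts // 2 else k - n_pts

def delay(x, shift):
    """Circularly delay x by `shift` samples with a frequency-domain phase ramp."""
    n_pts = len(x)
    X = dft(x)
    Y = [V * cmath.exp(-2j * math.pi * signed_bin(k, n_pts) * shift / n_pts)
         for k, V in enumerate(X)]
    return idft(Y)

# 1.3 cycles in 16 samples: deliberately not integer-periodic in the aperture
x = [math.sin(2 * math.pi * 1.3 * n / 16) for n in range(16)]
y = delay(x, 3)

max_imag = max(abs(v.imag) for v in y)
max_err = max(abs(y[n].real - x[(n - 3) % 16]) for n in range(16))
print(max_imag, max_err)
```

For an integer shift this reproduces a circular shift of the frame. Note that the ramp has to be lined up with the FFT's own bin order; a frequency vector built from -fs/2 to fs/2-df, like `f` in the question, would have to be reordered first (for example with MATLAB's `ifftshift`). Whether the question's full code does that cannot be told from the fragment.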
A sequence of von Hann windows using any hop that is 50% / (some integer) of the window length is non-lossy (except for the very first and last windows). All other hop sizes modulate or destroy information, and thus can't be normalized by a constant for reconstruction.
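This can be checked by overlap-adding the windows alone. The sketch below assumes a periodic von Hann window, `0.5 - 0.5*cos(2*pi*n/L)`, and a made-up helper `overlap_add_gain` that reports the smallest and largest summed window weight away from the edges.

```python
import math

def overlap_add_gain(win_len, hop, n_frames=12):
    """Overlap-add n_frames periodic Hann windows and return (min, max)
    of the summed weight, ignoring the partially covered first and last window."""
    w = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len) for n in range(win_len)]
    total = [0.0] * (hop * (n_frames - 1) + win_len)
    for frame in range(n_frames):
        start = frame * hop
        for n, v in enumerate(w):
            total[start + n] += v
    interior = total[win_len:-win_len]
    return min(interior), max(interior)

L = 64
for hop in (L // 2, L // 4, 3 * L // 4):
    lo, hi = overlap_add_gain(L, hop)
    print(f"hop = {hop:2d}: summed window between {lo:.3f} and {hi:.3f}")
```

For hops of L/2 and L/4 the summed weight is constant (1 and 2 respectively), so dividing the output by that constant normalizes it. For a hop of 3L/4 the sum varies from sample to sample, so no single constant undoes it.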

how to use ifft function in MATLAB with experimental data

I am trying to use the ifft function in MATLAB on some experimental data, but I don't get the expected results.
I have frequency data of a logarithmic sine sweep excitation, therefore I know the amplitude [g's], the frequency [Hz] and the phase (which is 0 since the point is a piloting point).
I tried to feed it directly to the ifft function, but I get a complex number as a result (and I expected a real result since it is a time signal). I thought the problem could be that the signal is not symmetric, therefore I computed the symmetric part in this way (in a 'for' loop)
x(i) = conj(x(mod(N-i+1,N)+1))
and I added it at the end of the amplitude vector.
new_amp = [amplitude x];
In this way the new amplitude vector is symmetric, but now I also doubled the dimension of that vector and this means I have to double the dimension of the frequency vector also.
Anyway, I fed the new amplitude vector to the ifft but still I don't get the logarithmic sine sweep, although this time the output is real as expected.
To compute the time [s] for the plot I used the following formula:
t = 60*3.33*log10(f/f(1))/(sweep rate)
What am I doing wrong?
Thank you in advance
If you want to recreate an identical time domain signal from specified frequency values, you have to take a lot of details into account. It seems to me a very complicated problem, and I think it needs a very strong background in the mathematics behind it.
But I think you may work on some details to get more acceptable result:
1- Time vector should be equally spaced based on sampling from frequency steps and maximum.
t = (0:N-1)/fs;   % N equally spaced time samples
where: *N* is the length of signal in frequency domain, and *fs* is twice the
highest frequency in frequency domain.
2- You should have some sort of logarithmic phases on the frequency bins I think.
3- Your signal in the frequency domain must be conjugate-symmetric (even magnitude, odd phase) to give a real signal in the time domain.
I hope this could help, even for someone to improve it.
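As a rough illustration of point 3, the sketch below builds a full spectrum from one-sided data by appending the complex conjugates of the interior bins in reverse order. It is plain Python with a naive inverse DFT; the spectrum values are made up, and it assumes the one-sided data runs from DC to Nyquist with both of those bins real.

```python
import cmath
import math

def idft(X):
    n_pts = len(X)
    return [sum(V * cmath.exp(2j * math.pi * k * n / n_pts)
                for k, V in enumerate(X)) / n_pts for n in range(n_pts)]

def mirror_one_sided(half):
    """half holds bins 0..M (DC to Nyquist); returns the full spectrum of
    length N = 2*M with X[N-k] = conj(X[k])."""
    interior = half[1:-1]                  # DC and Nyquist are not duplicated
    return list(half) + [v.conjugate() for v in reversed(interior)]

# Made-up one-sided spectrum: DC and Nyquist bins are real
half = [0.0 + 0j, 1.0 - 2.0j, 0.5j, 0j, 3.0 + 0j]

full = mirror_one_sided(half)              # 8 bins
x = idft(full)
max_imag = max(abs(v.imag) for v in x)

# For comparison: padding with zeros instead of mirroring
unmirrored = idft(list(half) + [0j] * 3)
max_imag_unmirrored = max(abs(v.imag) for v in unmirrored)

print(len(full), max_imag, max_imag_unmirrored)
```

The mirrored spectrum has 2*M bins rather than 2*(M+1), and its inverse transform is real up to rounding, while the unmirrored one is not. Getting a real output is necessary but not sufficient: the phases still have to describe a sweep for the result to look like one.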

Time delay estimation using crosscorrelation

I have two sensors separated by some distance which receive a signal from a source. The signal in its pure form is a sine wave at a frequency of 17kHz. I want to estimate the TDOA between the two sensors. I am using cross-correlation and below is my code
x1; % signal as received by sensor1
x2; % signal as received by sensor2
len = length(x1);
nfft = 2^nextpow2(2*len-1);
X1 = fft(x1);
X2 = fft(x2);
X = X1.*conj(X2);
m = ifft(X);
r = [m(end-len+1) m(1:len)];
[a,i] = max(r);
td = i - length(r)/2;
I am filtering my signals x1 and x2 by removing all frequencies below 17kHz.
I am having two problems with the above code:
1. With the sensors and source at the same place, I am getting different values of 'td' at each time. I am not sure what is wrong. Is it because of the noise? If so can anyone please provide a solution? I have read many papers and went through other questions on stackoverflow so please answer with code along with theory instead of just stating the theory.
2. The value of 'td' sometimes does not match the delay calculated using xcorr. What am I doing wrong? Below is my code for td using xcorr
[xc,lags] = xcorr(x1,x2);
[m,i] = max(xc);
td = lags(i);
One problem you might have is the fact that you only use a single frequency. At f = 17 kHz, and an estimated speed-of-sound v = 340 m/s (I assume you use ultra-sound), the wavelength is lambda = v / f = 2 cm. This means that your length measurement has an unambiguity range of 2 cm (sorry, cannot find a good link, google yourself). This means that you already need to know your distance to better than 2 cm, before you can use the result of your measurement to refine the distance.
Think of it in another way: when taking the cross-correlation between two perfect sines, the result should be a 'comb' of peaks with spacing equal to the wavelength. If they overlap perfectly, and you displace one signal by one wavelength, they still overlap perfectly. This means that you first have to know which of these peaks is the right one, otherwise a different peak can be the highest every time purely by random noise. Did you make a plot of the calculated cross-correlation before trying to blindly find the maximum?
This problem is the same as in interferometry, where it is easy to measure small distance variations with a resolution smaller than a wavelength by measuring phase differences, but you have no idea about the absolute distance, since you do not know the absolute phase.
The solution to this is actually easy: let your source generate more frequencies. Even using (band-limited) white-noise should work without problems when calculating cross-correlations, and it removes the ambiguity problem. You should see the white noise as a collection of sines. The cross-correlation of each of them will generate a comb, but with different spacing. When adding all those combs together, they will add up significantly only in a single point, at the delay you are looking for!
White noise, a Maximum Length Sequence or other non-periodic signals should be used as the test signal for time delay measurement using cross-correlation. This is because non-periodic signals have only one cross-correlation peak, so there is no ambiguity in determining the time delay. It is possible to use burst-type periodic signals to do the job, but with degraded SNR. If you have to use a continuous periodic signal as the test signal, then you can only measure a time delay within one period of the periodic test signal. This should explain why, in your case, using a lower frequency sine wave as the test signal works while using a higher frequency sine wave does not. This is demonstrated in these videos: https://youtu.be/L6YJqhbsuFY, https://youtu.be/7u1nSD0RlwY .
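The difference between the two kinds of test signal shows up even in a noise-free toy case. The sketch below is plain Python, uses a direct (slow) circular cross-correlation, and all values are made up: 256 samples, a true delay of 7 samples, a sine with a period of 8 samples, and seeded Gaussian noise.

```python
import math
import random

def circular_xcorr(a, b):
    """r[lag] = sum over n of a[n] * b[n - lag], indices taken modulo len(a)."""
    n_pts = len(a)
    return [sum(a[n] * b[(n - lag) % n_pts] for n in range(n_pts))
            for lag in range(n_pts)]

def delayed(x, d):
    """Circularly delay x by d samples."""
    return [x[(n - d) % len(x)] for n in range(len(x))]

def count_peaks(r, rel_tol=1e-6):
    """Number of lags whose correlation is (numerically) as high as the maximum."""
    top = max(r)
    return sum(1 for v in r if v >= top * (1 - rel_tol))

n_pts, true_delay = 256, 7
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n_pts)]
sine = [math.sin(2 * math.pi * n / 8) for n in range(n_pts)]

r_noise = circular_xcorr(delayed(noise, true_delay), noise)
r_sine = circular_xcorr(delayed(sine, true_delay), sine)

print("noise: best lag", r_noise.index(max(r_noise)),
      "equal peaks:", count_peaks(r_noise))
print("sine:  equal peaks:", count_peaks(r_sine))
```

The noise correlation has a single maximum at the true delay. The sine correlation has one equally high peak per period, so with real measurement noise any of them can come out on top, which matches the varying `td` described in the question.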

Time Series from spectrum

I am having a small problem while converting a spectrum to a time series. I have read many articles and I think I am applying the right procedure, but I do not get the right results. Could you help me find the error?
I have a time series like:
When I compute the spectrum I do:
%number of points
%time interval
%Fast Fourier transform
%power of positive frequencies
%plot spectrum
semilogy(frequency,spectrum); grid on;
xlabel('Frequency [Hz]');
ylabel('Power Spectrum [N*m]^2/[Hz]');
title('SPD load signal');
And I obtain:
I think the spectrum is well computed. However now I need to go back and obtain a time series from this spectrum and I do:
ap = sqrt(2.*spectrum*df)';
%random number from -pi to pi
epsilon=-pi + 2*pi*rand(1,length(ap));
%transform to time series
%Add the mean value
However, the plot looks like:
It is one order of magnitude lower than the original series.
Any recommendation?
There are (at least) two things going on here. The first is that you are throwing away information, and then substituting random numbers for that information.
The FFT of a real sequence is a sequence of complex numbers consisting of a real and imaginary part. Converting those numbers to polar form gives you magnitude and phase angle. You are capturing the magnitude part with p=abs(fft(...)), but you are not capturing the phase angle (which would involve atan2(...)). You are then making up random numbers (epsilon=...) and using those to replace the original phases when you reconstruct your time series. Also, as the FFT of a real sequence has a particular symmetry, substituting random numbers for the phase angle destroys that symmetry, which means that the IFFT will in general no longer be a real sequence, but a sequence of complex numbers. And again, you're only looking at the real portion of the IFFT, so you're throwing away information again. If this is an audio signal, the results may sound somewhat like the original (or they may be completely different), but the waveform definitely won't match...
The second issue is that in many implementations, ifft(fft(...)) will scale the result by the number of points in the signal. There are several different ways to avoid that, with differing results, but sometimes more attractive in different scenarios, depending on what you are trying to do. You can either scale the fft() result before you do the ifft(), or scale the ifft() result at the end, or in some cases, I've even seen both being scaled by a factor of sqrt(N) - doing it twice has the end result of scaling the final result by N, but it is a bit less efficient since you do the scaling twice...
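To illustrate the first point, here is a small sketch in plain Python with a naive DFT and a made-up 16-sample signal. It rebuilds the signal once from magnitude plus the original phase, and once from magnitude plus random phases that are mirrored so the spectrum keeps the symmetry of a real sequence. The inverse transform here divides by N, which is an assumption about the convention in use.

```python
import cmath
import math
import random

def dft(x):
    n_pts = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * n / n_pts)
                for n, v in enumerate(x)) for k in range(n_pts)]

def idft(X):
    n_pts = len(X)
    return [sum(V * cmath.exp(2j * math.pi * k * n / n_pts)
                for k, V in enumerate(X)) / n_pts for n in range(n_pts)]

n_pts = 16
x = [math.sin(2 * math.pi * 2 * n / n_pts)
     + 0.5 * math.cos(2 * math.pi * 5 * n / n_pts) + 0.2 for n in range(n_pts)]

X = dft(x)
mag = [abs(v) for v in X]
phase = [cmath.phase(v) for v in X]          # atan2(imag, real)

# 1) Magnitude and original phase: the waveform comes back
same = idft([m * cmath.exp(1j * p) for m, p in zip(mag, phase)])
err_same = max(abs(s.real - v) for s, v in zip(same, x))

# 2) Magnitude and random phase, mirrored as conjugates; DC and Nyquist kept
rng = random.Random(1)
Y = list(X)
for k in range(1, n_pts // 2):
    phi = rng.uniform(-math.pi, math.pi)
    Y[k] = mag[k] * cmath.exp(1j * phi)
    Y[n_pts - k] = Y[k].conjugate()
y = idft(Y)

max_imag = max(abs(v.imag) for v in y)
err_random = max(abs(v.real - u) for v, u in zip(y, x))
mag_diff = max(abs(abs(V) - m) for V, m in zip(dft([v.real for v in y]), mag))

print(err_same, max_imag, err_random, mag_diff)
```

With the original phases the round trip reproduces the waveform. With mirrored random phases the result is real and has the same magnitude spectrum, but it is a different waveform, so it can only be compared to the original statistically, not sample by sample.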