how to improve the resolution of the PSD using Matlab - matlab

I have and audio signal, which I read with Matlab, and use pwelch to get its PSD, here ist the code that I'm using
[x,Fs] = audioread('audioFile.wav');
x= x(:,1) % mono
[xPSD,f] = pwelch(x,hamming(512),16,1024,Fs);
plot(f,xPSD);
since the FS=96000 and I'm only interrested in Frequencies bellow 5khz, I would like to calculate the PSD only for the area, and also being able to adjust the resolution of the PSD ! any idea hwo to do that !

When calculating PSDs with pwelch, there is always a trade-off between spectral resolution, number of averages and the amount of data you need. My preferred way to use it is:
[psd_x, freq] = pwelch(x, hann(nfft), nfft/2, nfft, fsample);
A few differences with your code:
I prefer to use hann windows, since I have bad experiences with hamming windows, they are not very good if your signal contains for example a large DC component. See this comparison, which shows that the roll-off of hann is much better, at the only cost of a slightly higher first sidelobe.
I use windows that overlap 50% (by using noverlap = nfft/2), in this way you 'get the most out of your data'. In your case, there is only 16/512 = 3% overlap between windows, and since the window function is close to zero at its edges, the data points at the edges do not contribute as much as points in the middle of a window. With half-overlapping windows this effect is much smaller. Making the overlap bigger than 50% is useless, you will get more averages, but since you will use the same points many times, this does not add any extra information. Just stick to 50%.
I usually make the length of the fft (fourth argument to pwelch) the same as the window length. The only case you want to have this different is when you use zero-padding, which has limited use.
There are a few simple formulas, which you should memorize when working with pwelch and similar functions:
The spectral resolution is given by the window length only: df = 1 / t_window
The length of a single window is t_window = nfft / f_sample.
With half-overlapping windows, the total amount of needed data is t_total = t_window * (n_average + 1) / 2
For one-sided spectra, the number of spectral bins of the PSD is nfft / 2.
Nyquist: f_max = f_sample / 2
To get a reasonably smooth spectrum, I would typically use on the order of 20 averages. Combining the equations above and filling in the required spectral resolution then gives you the total length of data you need. Or the other way around, if you only have a limited amount of data available, you could calculate the frequency resolution you can obtain.

Related

Frequency domain phase shift, amplitude, hope size and non-linearity

I am trying to implement a frequency domain phase shift but there are few points on which I am not sure.
1- I am able to get a perfect reconstruction from a sine or sweep signal using a hanning window with a hop size of 50%. Nevertheless, how should I normalise my result when using a hop size > 50%?
2- When shifting the phase of low frequency signals (f<100, window size<1024, fs=44100) I can clearly see some non-linearity in my result. Is this because of the window size being to short for low frequencies?
Thank you very much for your help.
clear
freq=500;
fs=44100;
endTime=0.02;
t = 1/fs:1/fs:(endTime);
f1=linspace(freq,freq,fs*endTime);
x = sin(2*pi*f1.*t);
targetLength=numel(x);
L=1024;
w=hanning(L);
H=L*.50;% Hopsize of 50%
N=1024;
%match input length with window length
x=[zeros(L,1);x';zeros(L+mod(length(x),H),1)];
pend=length(x)- L ;
pin=0;
count=1;
X=zeros(N,1);
buffer0pad= zeros(N,1);
outBuffer0pad= zeros(L,1);
y=zeros(length(x),1);
delay=-.00001;
df = fs/N;
f= -fs/2:df:fs/2 - df;
while pin<pend
buffer = x(pin+1:pin+L).*w;
%append zero padding in the middle
buffer0pad(1:(L)/2)=buffer((L)/2+1: L);
buffer0pad(N-(L)/2+1:N)=buffer(1:(L)/2);
X = fft(buffer0pad,N);
% Phase modification
X = abs(X).*exp(1i*(angle(X))-(1i*2*pi*f'*delay));
outBuffer=real(ifft(X,N));
% undo zero padding----------------------
outBuffer0pad(1:L/2)=outBuffer(N-(L/2-1): N);
outBuffer0pad(L/2+1:L)=outBuffer(1:(L)/2);
%Overlap-add
y(pin+1:pin+L) = y(pin+1:pin+L) + outBuffer0pad;
pin=pin+H;
count=count+1;
end
%match output length with original input length
output=y(L+1:numel(y)-(L+mod(targetLength,H)));
figure(2)
plot(t,x(L+1:numel(x)-(L+mod(targetLength,H))))
hold on
plot(t,output)
hold off
Anything below 100 Hz has less than two cycles in your FFT window. Note that a DFT or FFT represents any waveform, including a single non-integer-periodic sinusoid, by possibly summing up of a whole bunch of sinusoids of very different frequencies. e.g. a lot more than just one. That's just how the math works.
For a von Hann window containing less than 2 cycles, these are often a bunch of mostly completely different frequencies (possibly very far away in terms of percentage from your low frequency). Shifting the phase of all those completely different frequencies may or may not shift your windowed low frequency sinusoid by the desired amount (depending on how different in frequency your signal is from being integer-periodic).
Also for low frequencies, the complex conjugate mirror needs to be shifted in the opposite direction in phase in order to still represent a completely real result. So you end up mixing 2 overlapped and opposite phase changes, which again is mostly a problem if the low frequency signal is far from being integer periodic in the DFT aperture.
Using a longer window in time and samples allows more cycles of a given frequency to fit inside it (thus possibly needing a lesser power of very different frequency sinusoids to be summed up in order to compose, make up or synthesize your low frequency sinusoid); and the complex conjugate is farther away in terms of FFT result bin index, thus reducing interference.
A sequence using any hop of a von Hann window that in 50% / (some-integer) in length is non-lossy (except for the very first or last window). All other hop sizes modulate or destroy information, and thus can't be normalized by a constant for reconstruction.

Summing Frequency Spectrums

I've a set of data from an EEG device from which I want to find the strength of different brain waves in Matlab. I tried to use EEGLAB but I wasn't really sure how, so at this point I'm simply using the dsp toolbox in Matlab.
For background: I've 15 epochs, 4 seconds in length. The device sampled at 256 Hz, and there are 264 sensors, so there are 1024 data points for each sensor for each epoch, i.e. my raw data is 264 x 1024 x 15. The baseline is removed. The data in each epoch is going to be used to train a classifier eventually, so I'm dealing with each epoch individually. I'll come up with more data samples later.
Anyways, what I've done so far is apply a Hann filter to the data and then run fft on the filtered data. So now I have the information in frequency domain. However, I'm not quite sure how to go from the power of the fft buckets to the power of certain frequency bands (e.g. alpha 8-13), to get the values I seek.
I know the answer should be straightforward but I can't seem to get find the answer I want online, and then there's further confusion by certain sources recommending using a wavelet transform? Here's the little bit of code I have so far, the input "data" is one epoch, i.e. 264 x 1024.
% apply a hann window
siz = size(data);
hann_window = hann(siz(2));
hann_window = repmat(hann_window.', siz(1), 1);
hann_data = data.' * hann_window;
% run fft
X = fft(hann_data, [], 2);
X_mag = abs(X);
X_mag = X_mag.';
Thanks for the assistance!
If I'm understanding your question correctly, you are wanting to scale the FFT output to get the correct power. To do this you need to divide by the number of samples used for the FFT.
X_mag = abs(X)/length(hann_data); % This gives the correct power.
See this question for more info.
Once the content is scaled correctly, you can find the power in a band (e.g. 8 - 13 Hz) by integrating the content from the start to the stop of the band. Since you are dealing with discrete values it is a discrete integration. For perspective, this is equivalent to changing the resolution bandwidth of a spectrum analyzer.

What is a spectrogram and how do I set its parameters?

I am trying to plot the spectrogram of my time domain signal given:
N=5000;
phi = (rand(1,N)-0.5)*pi;
a = tan((0.5.*phi));
i = 2.*a./(1-a.^2);
plot(i);
spectrogram(i,100,1,100,1e3);
The problem is I don't understand the parameters and what values should be given. These values that I am using, I referred to MATLAB's online documentation of spectrogram. I am new to MATLAB, and I am just not getting the idea. Any help will be greatly appreciated!
Before we actually go into what that MATLAB command does, you probably want to know what a spectrogram is. That way you'll get more meaning into how each parameter works.
A spectrogram is a visual representation of the Short-Time Fourier Transform. Think of this as taking chunks of an input signal and applying a local Fourier Transform on each chunk. Each chunk has a specified width and you apply a Fourier Transform to this chunk. You should take note that each chunk has an associated frequency distribution. For each chunk that is centred at a specific time point in your time signal, you get a bunch of frequency components. The collection of all of these frequency components at each chunk and plotted all together is what is essentially a spectrogram.
The spectrogram is a 2D visual heat map where the horizontal axis represents the time of the signal and the vertical axis represents the frequency axis. What is visualized is an image where darker colours means that for a particular time point and a particular frequency, the lower in magnitude the frequency component is, the darker the colour. Similarly, the higher in magnitude the frequency component is, the lighter the colour.
Here's one perfect example of a spectrogram:
Source: Wikipedia
Therefore, for each time point, we see a distribution of frequency components. Think of each column as the frequency decomposition of a chunk centred at this time point. For each column, we see a varying spectrum of colours. The darker the colour is, the lower the magnitude component at that frequency is and vice-versa.
So!... now you're armed with that, let's go into how MATLAB works in terms of the function and its parameters. The way you are calling spectrogram conforms to this version of the function:
spectrogram(x,window,noverlap,nfft,fs)
Let's go through each parameter one by one so you can get a greater understanding of what each does:
x - This is the input time-domain signal you wish to find the spectrogram of. It can't get much simpler than that. In your case, the signal you want to find the spectrogram of is defined in the following code:
N=5000;
phi = (rand(1,N)-0.5)*pi;
a = tan((0.5.*phi));
i = 2.*a./(1-a.^2);
Here, i is the signal you want to find the spectrogram of.
window - If you recall, we decompose the image into chunks, and each chunk has a specified width. window defines the width of each chunk in terms of samples. As this is a discrete-time signal, you know that this signal was sampled with a particular sampling frequency and sampling period. You can determine how large the window is in terms of samples by:
window_samples = window_time/Ts
Ts is the sampling time of your signal. Setting the window size is actually very empirical and requires a lot of experimentation. Basically, the larger the window size, the better frequency resolution you get as you're capturing more of the frequencies, but the time localization is poor. Similarly, the smaller the window size, the better localization you have in time, but you don't get that great of a frequency decomposition. I don't have any suggestions here on what the most optimal size is... which is why wavelets are preferred when it comes to time-frequency decomposition. For each "chunk", the chunks get decomposed into smaller chunks of a dynamic width so you get a mixture of good time and frequency localization.
noverlap - Another way to ensure good frequency localization is that the chunks are overlapping. A proper spectrogram ensures that each chunk has a certain number of samples that are overlapping for each chunk and noverlap defines how many samples are overlapped in each window. The default is 50% of the width of each chunk.
nfft - You are essentially taking the FFT of each chunk. nfft tells you how many FFT points are desired to be computed per chunk. The default number of points is the largest of either 256, or floor(log2(N)) where N is the length of the signal. nfft also gives a measure of how fine-grained the frequency resolution will be. A higher number of FFT points would give higher frequency resolution and thus showing fine-grained details along the frequency axis of the spectrogram if visualised.
fs - The sampling frequency of your signal. The default is 1 Hz, but you can override this to whatever the sampling frequency your signal is at.
Therefore, what you should probably take out of this is that I can't really tell you how to set the parameters. It all depends on what signal you have, but hopefully the above explanation will give you a better idea of how to set the parameters.
Good luck!

Time delay estimation using crosscorrelation

I have two sensors seperated by some distance which receive a signal from a source. The signal in its pure form is a sine wave at a frequency of 17kHz. I want to estimate the TDOA between the two sensors. I am using crosscorrelation and below is my code
x1; % signal as recieved by sensor1
x2; % signal as recieved by sensor2
len = length(x1);
nfft = 2^nextpow2(2*len-1);
X1 = fft(x1);
X2 = fft(x2);
X = X1.*conj(X2);
m = ifft(X);
r = [m(end-len+1) m(1:len)];
[a,i] = max(r);
td = i - length(r)/2;
I am filtering my signals x1 and x2 by removing all frequencies below 17kHz.
I am having two problems with the above code:
1. With the sensors and source at the same place, I am getting different values of 'td' at each time. I am not sure what is wrong. Is it because of the noise? If so can anyone please provide a solution? I have read many papers and went through other questions on stackoverflow so please answer with code along with theory instead of just stating the theory.
2. The value of 'td' is sometimes not matching with the delay as calculated using xcorr. What am i doing wrong? Below is my code for td using xcorr
[xc,lags] = xcorr(x1,x2);
[m,i] = max(xc);
td = lags(i);
One problem you might have is the fact that you only use a single frequency. At f = 17 kHz, and an estimated speed-of-sound v = 340 m/s (I assume you use ultra-sound), the wavelength is lambda = v / f = 2 cm. This means that your length measurement has an unambiguity range of 2 cm (sorry, cannot find a good link, google yourself). This means that you already need to know your distance to better than 2 cm, before you can use the result of your measurement to refine the distance.
Think of it in another way: when taking the cross-correlation between two perfect sines, the result should be a 'comb' of peaks with spacing equal to the wavelength. If they overlap perfectly, and you displace one signal by one wavelength, they still overlap perfectly. This means that you first have to know which of these peaks is the right one, otherwise a different peak can be the highest every time purely by random noise. Did you make a plot of the calculated cross-correlation before trying to blindly find the maximum?
This problem is the same as in interferometry, where it is easy to measure small distance variations with a resolution smaller than a wavelength by measuring phase differences, but you have no idea about the absolute distance, since you do not know the absolute phase.
The solution to this is actually easy: let your source generate more frequencies. Even using (band-limited) white-noise should work without problems when calculating cross-correlations, and it removes the ambiguity problem. You should see the white noise as a collection of sines. The cross-correlation of each of them will generate a comb, but with different spacing. When adding all those combs together, they will add up significantly only in a single point, at the delay you are looking for!
White Noise, Maximum Length Sequency or other non-periodic signals should be used as the test signal for time delay measurement using cross correleation. This is because non-periodic signals have only one cross correlation peak and there will be no ambiguity to determine the time delay. It is possible to use the burst type of periodic signals to do the job, but with degraded SNR. If you have to use a continuous periodic signal as the test signal, then you can only measure a time delay within one period of the periodic test signal. This should explain why, in your case, using lower frequency sine wave as the test signal works while using higher frequency sine wave does not. This is demonstrated in these videos: https://youtu.be/L6YJqhbsuFY, https://youtu.be/7u1nSD0RlwY .

How does number of points change a FFT in MATLAB

When taking fft(signal, nfft) of a signal, how does nfft change the outcome and why? Can I have a fixed value for nfft, say 2^18, or do I need to go 2^nextpow2(2*length(signal)-1)?
I am computing the power spectral density(PSD) of two signals by taking the FFT of the autocorrelation, and I want to compare the the results. Since the signals are of different lengths, I am worried if I don't fix nfft, it would make the comparison really hard!
There is no inherent reason to use a power-of-two (it just might make the processing more efficient in some circumstances).
However, to make the FFTs of two different signals "commensurate", you will indeed need to zero-pad one or other (or both) signals to the same lengths before taking their FFTs.
However, I feel obliged to say: If you need to ask this, then you're probably not at a point on the DSP learning curve where you're going to be able to do anything useful with the results. You should get yourself a decent book on DSP theory, e.g. this.
Most modern FFT implementations (including MATLAB's which is based on FFTW) now rarely require padding a signal's time series to a length equal to a power of two. However, nearly all implementations will offer better, and sometimes much much better, performance for FFT's of data vectors w/ a power of 2 length. For MATLAB specifically, padding to a power of 2 or to a length with many low prime factors will give you the best performance (N = 1000 = 2^3 * 5^3 would be excellent, N = 997 would be a terrible choice).
Zero-padding will not increase frequency resolution in your PSD, however it does reduce the bin-size in the frequency domain. So if you add NZeros to a signal vector of length N the FFT will now output a vector of length ( N + NZeros )/2 + 1. This means that each bin of frequencies will now have a width of:
Bin width (Hz) = F_s / ( N + NZeros )
Where F_s is the signal sample frequency.
If you find that you need to separate or identify two closely space peaks in the frequency domain, you need to increase your sample time. You'll quickly discover that zero-padding buys you nothing to that end - and intuitively that's what we'd expect. How can we expect more information in our power spectrum w/o adding more information (longer time series) in our input?
Best,
Paul