Matlab: finding phase difference using cpsd - matlab

From my understanding, when using the cpsd function as such:
[Pxy,f] = cpsd(x,y,window,Ns,NFFT,Fs);
matlab chops the time series data into smaller windows with size specified by you. And the windows are shifted by Ns data point. The final [Pxy, f] are an average of results obtained from each individual window. Please correct me if I am wrong about this process.
My question is, if I use angle(Pxy) at a specific frequency, say 34Hz. Does that give me the phase difference between signal x and y at the frequency 34Hz?
I am having doubt about this because if Pxy was an average between each individual window, and because each individual was offset by a window shift, doesn't that mean the averaged Pxy's phase is affected by the window shift?
I've tried to correct this by ensuring that the window shift corresponds to an integer of full phase difference corresponding to 34Hz. Is this correct?
And just a little background about what I am doing:
I basically have numerous time-series pressure measurement over 60 seconds at 1000Hz sampling rate.
Power spectrum analysis indicates that there is a peak frequency at 34 Hz for each signal. (averaged over all windows)
I want to compare each signal's phase difference from each other corresponding to the 34Hz peak.
FFT analysis of individual window reveals that this peak frequency moves around. So I am not sure if cpsd is the correct way to be going about this.
I am currently considering trying to use xcorr to calculate the overall time lag between the signals and then calculate the phase difference from that. I have also heard of hilbert transform, but I got no idea how that works yet.

Yes, cpsd works.
You can test your result by set two input signals, such as:
t=[0:0.001:5];
omega=25;
x1=sin(2*pi*omega*t);
x2=sin(2*pi*omega*t+pi/3);
you can check whether the phase shift calculated by cpsd is pi/3.

Related

If we take STFT of a single sinusoid, and plot the value corresponding to carrier frequency in real imaginary plane, how many points should it be?

I created a sinusoid with frequency 550Hz that goes for 1 second
fs=44100;
Duration=1; %second
Len=Duration * fs; %length of sinusoid
t=(0:Len-1)/fs;
x=sin(2*pi*550*t);
for the purpose of exploring and learning, I have decided to take the short time Fourier transform of this signal. I did it as below:
window_len=0.02*fs; %length of the window
hop=window_len/3; %hop size
nfft=2^nextpow2(window_len);
window=hamming(window_len,'periodic');
[S,f,t]=spectrogram(x,window,hop,nfft,fs);
Now I want to plot the real versus imaginary value of S for the frequency equal to 550 and see what happens. First of all, in the frequency vector I didn’t have the exact 550. There was one 516.5 and 559.6. So, I just looked at the spectrogram and chose whichever that was close to it and picked that. When I tried to plot real vs imaginary of S for the frequency I chose (over all time frames), the values all fall in 3 points as it shows in the attached plot. Why three points?
Each STFT window can have a different complex phase depending on how the start (or middle) of the window is synchronized (or not) with the sinusoids period. So the real-complex IQ plot for the peak magnitude DFT result bin can be a circular scatter plot, depending on the number of DFT windows and the ratio between the stepping distance (or length - overlap) and the period of the sinusoid.
The phase of the STFT coefficients for the different windows depends on which data exactly the window "sees". So for your particular choice of window length and hop, it so happens that as you slide through your single-frequency sinusoid, there only three different data chunks that you window "sees". To see what I mean, just plot:
plot(x(1:window_len),'x')
plot(x(1+hop:window_len+hop),'x')
plot(x(1+2*hop:window_len+2*hop),'x')
plot(x(1+3*hop:window_len+3*hop),'x')
.. and if you continue you will see that the pattern repeats itself, i.e., the first plot for instance is the same as the fourth, the second as the fifth etc. Therefore you only have three different real-imaginary part combinations.
Of course, this will change if you change the window length and the hopsize, and you will get more points. For instance, try
window_len =nfft;
hop=ceil(window_len/4)
I hope that helps.

What is a spectrogram and how do I set its parameters?

I am trying to plot the spectrogram of my time domain signal given:
N=5000;
phi = (rand(1,N)-0.5)*pi;
a = tan((0.5.*phi));
i = 2.*a./(1-a.^2);
plot(i);
spectrogram(i,100,1,100,1e3);
The problem is I don't understand the parameters and what values should be given. These values that I am using, I referred to MATLAB's online documentation of spectrogram. I am new to MATLAB, and I am just not getting the idea. Any help will be greatly appreciated!
Before we actually go into what that MATLAB command does, you probably want to know what a spectrogram is. That way you'll get more meaning into how each parameter works.
A spectrogram is a visual representation of the Short-Time Fourier Transform. Think of this as taking chunks of an input signal and applying a local Fourier Transform on each chunk. Each chunk has a specified width and you apply a Fourier Transform to this chunk. You should take note that each chunk has an associated frequency distribution. For each chunk that is centred at a specific time point in your time signal, you get a bunch of frequency components. The collection of all of these frequency components at each chunk and plotted all together is what is essentially a spectrogram.
The spectrogram is a 2D visual heat map where the horizontal axis represents the time of the signal and the vertical axis represents the frequency axis. What is visualized is an image where darker colours means that for a particular time point and a particular frequency, the lower in magnitude the frequency component is, the darker the colour. Similarly, the higher in magnitude the frequency component is, the lighter the colour.
Here's one perfect example of a spectrogram:
Source: Wikipedia
Therefore, for each time point, we see a distribution of frequency components. Think of each column as the frequency decomposition of a chunk centred at this time point. For each column, we see a varying spectrum of colours. The darker the colour is, the lower the magnitude component at that frequency is and vice-versa.
So!... now you're armed with that, let's go into how MATLAB works in terms of the function and its parameters. The way you are calling spectrogram conforms to this version of the function:
spectrogram(x,window,noverlap,nfft,fs)
Let's go through each parameter one by one so you can get a greater understanding of what each does:
x - This is the input time-domain signal you wish to find the spectrogram of. It can't get much simpler than that. In your case, the signal you want to find the spectrogram of is defined in the following code:
N=5000;
phi = (rand(1,N)-0.5)*pi;
a = tan((0.5.*phi));
i = 2.*a./(1-a.^2);
Here, i is the signal you want to find the spectrogram of.
window - If you recall, we decompose the image into chunks, and each chunk has a specified width. window defines the width of each chunk in terms of samples. As this is a discrete-time signal, you know that this signal was sampled with a particular sampling frequency and sampling period. You can determine how large the window is in terms of samples by:
window_samples = window_time/Ts
Ts is the sampling time of your signal. Setting the window size is actually very empirical and requires a lot of experimentation. Basically, the larger the window size, the better frequency resolution you get as you're capturing more of the frequencies, but the time localization is poor. Similarly, the smaller the window size, the better localization you have in time, but you don't get that great of a frequency decomposition. I don't have any suggestions here on what the most optimal size is... which is why wavelets are preferred when it comes to time-frequency decomposition. For each "chunk", the chunks get decomposed into smaller chunks of a dynamic width so you get a mixture of good time and frequency localization.
noverlap - Another way to ensure good frequency localization is that the chunks are overlapping. A proper spectrogram ensures that each chunk has a certain number of samples that are overlapping for each chunk and noverlap defines how many samples are overlapped in each window. The default is 50% of the width of each chunk.
nfft - You are essentially taking the FFT of each chunk. nfft tells you how many FFT points are desired to be computed per chunk. The default number of points is the largest of either 256, or floor(log2(N)) where N is the length of the signal. nfft also gives a measure of how fine-grained the frequency resolution will be. A higher number of FFT points would give higher frequency resolution and thus showing fine-grained details along the frequency axis of the spectrogram if visualised.
fs - The sampling frequency of your signal. The default is 1 Hz, but you can override this to whatever the sampling frequency your signal is at.
Therefore, what you should probably take out of this is that I can't really tell you how to set the parameters. It all depends on what signal you have, but hopefully the above explanation will give you a better idea of how to set the parameters.
Good luck!

how to use ifft function in MATLAB with experimental data

I am trying to use the ifft function in MATLAB on some experimental data, but I don't get the expected results.
I have frequency data of a logarithmic sine sweep excitation, therefore I know the amplitude [g's], the frequency [Hz] and the phase (which is 0 since the point is a piloting point).
I tried to feed it directly to the ifft function, but I get a complex number as a result (and I expected a real result since it is a time signal). I thought the problem could be that the signal is not symmetric, therefore I computed the symmetric part in this way (in a 'for' loop)
x(i) = conj(x(mod(N-i+1,N)+1))
and I added it at the end of the amplitude vector.
new_amp = [amplitude x];
In this way the new amplitude vector is symmetric, but now I also doubled the dimension of that vector and this means I have to double the dimension of the frequency vector also.
Anyway, I fed the new amplitude vector to the ifft but still I don't get the logarithmic sine sweep, although this time the output is real as expected.
To compute the time [s] for the plot I used the following formula:
t = 60*3.33*log10(f/f(1))/(sweep rate)
What am I doing wrong?
Thank you in advance
If you want to create identical time domain signal from specified frequency values you should take into account lots of details. It seems to me very complicated problem and I think it need very strength background on the mathematics behind it.
But I think you may work on some details to get more acceptable result:
1- Time vector should be equally spaced based on sampling from frequency steps and maximum.
t = 0:1/fs:N/fs;
where: *N* is the length of signal in frequency domain, and *fs* is twice the
highest frequency in frequency domain.
2- You should have some sort of logarithmic phases on the frequency bins I think.
3- Your signal in frequency domain must be even to have real signal in time domain.
I hope this could help, even for someone to improve it.

Time delay estimation using crosscorrelation

I have two sensors seperated by some distance which receive a signal from a source. The signal in its pure form is a sine wave at a frequency of 17kHz. I want to estimate the TDOA between the two sensors. I am using crosscorrelation and below is my code
x1; % signal as recieved by sensor1
x2; % signal as recieved by sensor2
len = length(x1);
nfft = 2^nextpow2(2*len-1);
X1 = fft(x1);
X2 = fft(x2);
X = X1.*conj(X2);
m = ifft(X);
r = [m(end-len+1) m(1:len)];
[a,i] = max(r);
td = i - length(r)/2;
I am filtering my signals x1 and x2 by removing all frequencies below 17kHz.
I am having two problems with the above code:
1. With the sensors and source at the same place, I am getting different values of 'td' at each time. I am not sure what is wrong. Is it because of the noise? If so can anyone please provide a solution? I have read many papers and went through other questions on stackoverflow so please answer with code along with theory instead of just stating the theory.
2. The value of 'td' is sometimes not matching with the delay as calculated using xcorr. What am i doing wrong? Below is my code for td using xcorr
[xc,lags] = xcorr(x1,x2);
[m,i] = max(xc);
td = lags(i);
One problem you might have is the fact that you only use a single frequency. At f = 17 kHz, and an estimated speed-of-sound v = 340 m/s (I assume you use ultra-sound), the wavelength is lambda = v / f = 2 cm. This means that your length measurement has an unambiguity range of 2 cm (sorry, cannot find a good link, google yourself). This means that you already need to know your distance to better than 2 cm, before you can use the result of your measurement to refine the distance.
Think of it in another way: when taking the cross-correlation between two perfect sines, the result should be a 'comb' of peaks with spacing equal to the wavelength. If they overlap perfectly, and you displace one signal by one wavelength, they still overlap perfectly. This means that you first have to know which of these peaks is the right one, otherwise a different peak can be the highest every time purely by random noise. Did you make a plot of the calculated cross-correlation before trying to blindly find the maximum?
This problem is the same as in interferometry, where it is easy to measure small distance variations with a resolution smaller than a wavelength by measuring phase differences, but you have no idea about the absolute distance, since you do not know the absolute phase.
The solution to this is actually easy: let your source generate more frequencies. Even using (band-limited) white-noise should work without problems when calculating cross-correlations, and it removes the ambiguity problem. You should see the white noise as a collection of sines. The cross-correlation of each of them will generate a comb, but with different spacing. When adding all those combs together, they will add up significantly only in a single point, at the delay you are looking for!
White Noise, Maximum Length Sequency or other non-periodic signals should be used as the test signal for time delay measurement using cross correleation. This is because non-periodic signals have only one cross correlation peak and there will be no ambiguity to determine the time delay. It is possible to use the burst type of periodic signals to do the job, but with degraded SNR. If you have to use a continuous periodic signal as the test signal, then you can only measure a time delay within one period of the periodic test signal. This should explain why, in your case, using lower frequency sine wave as the test signal works while using higher frequency sine wave does not. This is demonstrated in these videos: https://youtu.be/L6YJqhbsuFY, https://youtu.be/7u1nSD0RlwY .

Position of high correlation using matlab

I have a signal that more or less repeats itself (not exactly the same from one to the next, see plot to the left). If I use autocorrelation I get a number of maximums (right plot), but it doesn't tell me where (which sample number) the correlation is high. It gives me the lags but I lose information on the position, that is, the sample number in my original data where the signal occurs. For example in the auto-corr. plot, the second peak at sample 500 should correspond to the signal at about sample 750 in the data plot. I could do this by using a small window that moves over the data trace and find the maximums but it takes too much time. Is there a faster way of doing this in matlab? thanks.
I think you're misinterpreting autocorrelation. The correlation peak at 5000 is not due to a single location in the time series, but rather to the fact that the entire time series is similar to itself, when offset by 5000 samples. As much of that peak is due to the time series peak at 18000 as it is to the time series peak at 7500. Your autocorrelation will get very strange if, for example, you do not have a truly periodic time series (that is, if the interval between pulses is nonuniform).
If you can isolate one example of your pulse, and choose the location you want as your t=0, then a correlation of that one pulse with the time series will give you just what you want. Each pulse will light up clearly, at the time location at which it occurs. Then you just need a peak finder.
Yes, you could get the indices of the elements with the maximum amplitude using
treshold = max(a)/2
ind = find(a>=treshold)
where a is the matrix containing the correlation result.