Position of high correlation using matlab - matlab

I have a signal that more or less repeats itself (not exactly the same from one to the next, see plot to the left). If I use autocorrelation I get a number of maximums (right plot), but it doesn't tell me where (which sample number) the correlation is high. It gives me the lags but I lose information on the position, that is, the sample number in my original data where the signal occurs. For example in the auto-corr. plot, the second peak at sample 500 should correspond to the signal at about sample 750 in the data plot. I could do this by using a small window that moves over the data trace and find the maximums but it takes too much time. Is there a faster way of doing this in matlab? thanks.

I think you're misinterpreting autocorrelation. The correlation peak at 5000 is not due to a single location in the time series, but rather to the fact that the entire time series is similar to itself, when offset by 5000 samples. As much of that peak is due to the time series peak at 18000 as it is to the time series peak at 7500. Your autocorrelation will get very strange if, for example, you do not have a truly periodic time series (that is, if the interval between pulses is nonuniform).
If you can isolate one example of your pulse, and choose the location you want as your t=0, then a correlation of that one pulse with the time series will give you just what you want. Each pulse will light up clearly, at the time location at which it occurs. Then you just need a peak finder.
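A minimal sketch of that idea, on synthetic data since the original trace isn't available. The pulse shape, the pulse positions, the ripple standing in for noise and the names `template`, `true_pos` and `score` are all made up for illustration. The sliding correlation is done with `filter` so that nothing beyond base MATLAB is needed; `xcorr` would be the more usual choice if you have it.

```matlab
% Synthetic trace: the same pulse at irregular positions, plus a small ripple.
N = 2000;
n = -25:25;
template = exp(-(n/8).^2);           % hypothetical pulse, t=0 at its centre
true_pos = [300 750 1400 1720];      % made-up, non-uniform pulse locations
x = 0.05*sin(0.9*(1:N));             % deterministic ripple standing in for noise
for p = true_pos
    x(p+n) = x(p+n) + template;
end

% Slide the template over the trace. Filtering with the time-reversed
% template computes the correlation, delayed by half the template length,
% so pad the end with zeros and drop the first 'half' output samples.
half = (numel(template)-1)/2;
c = filter(fliplr(template), 1, [x zeros(1,half)]);
score = c(half+1:end);               % score(k): correlation centred on sample k

% Simple peak finder: local maxima above half the global maximum.
is_peak = score > 0.5*max(score) & ...
          score >= [-Inf score(1:end-1)] & ...
          score >  [score(2:end) -Inf];
found_pos = find(is_peak);
disp(found_pos)
```

Here `found_pos` gives sample numbers in the original trace directly, which is the information the autocorrelation lags do not carry.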

Yes, you could get the indices of all elements whose amplitude is at least half the maximum using
threshold = max(a)/2;
ind = find(a >= threshold);
where a is the array containing the correlation result.
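Note that thresholding like this returns every sample above the threshold, so each peak shows up as a run of neighbouring indices. A small sketch of one way to reduce each run to a single index (the values in `a` are made up, and the run-grouping approach is just one option, not the only way to do it):

```matlab
a = [0 1 3 8 9 7 2 0 1 6 10 5 1 0 4 9 4 0];   % made-up correlation values
threshold = max(a)/2;
above = a >= threshold;

% Find where each run of above-threshold samples starts and stops.
starts = find(diff([0 above]) == 1);
stops  = find(diff([above 0]) == -1);

% Keep the index of the largest value inside each run.
peak_idx = zeros(size(starts));
for r = 1:numel(starts)
    [~, offset] = max(a(starts(r):stops(r)));
    peak_idx(r) = starts(r) + offset - 1;
end
disp(peak_idx)   % one index per peak: 5 11 16
```

For comparison, `find(a >= threshold)` on the same data returns 4 5 6 10 11 12 16.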

Related

If we take STFT of a single sinusoid, and plot the value corresponding to carrier frequency in real imaginary plane, how many points should it be?

I created a sinusoid with frequency 550Hz that goes for 1 second
fs=44100;
Duration=1; %second
Len=Duration * fs; %length of sinusoid
t=(0:Len-1)/fs;
x=sin(2*pi*550*t);
For the purpose of exploring and learning, I decided to take the short-time Fourier transform of this signal. I did it as below:
window_len=0.02*fs; %length of the window
hop=window_len/3; %hop size
nfft=2^nextpow2(window_len);
window=hamming(window_len,'periodic');
[S,f,t]=spectrogram(x,window,hop,nfft,fs);
Now I want to plot the real versus imaginary value of S for the frequency equal to 550 and see what happens. First of all, the frequency vector didn't contain exactly 550; the nearest values were 516.5 and 559.6. So I looked at the spectrogram, chose whichever bin was closest, and used that. When I plotted real vs imaginary of S for the frequency I chose (over all time frames), the values all fell on 3 points, as shown in the attached plot. Why three points?
Each STFT window can have a different complex phase depending on how the start (or middle) of the window is synchronized (or not) with the sinusoid's period. So the real-versus-imaginary (IQ) plot of the peak-magnitude DFT bin can be a circular scatter plot, depending on the number of DFT windows and on the ratio between the step size (window length minus overlap) and the period of the sinusoid.
The phase of the STFT coefficients for the different windows depends on exactly which data the window "sees". For your particular choice of window length and hop, it so happens that as you slide through your single-frequency sinusoid, there are only three different data chunks that your window "sees". To see what I mean, just plot:
plot(x(1:window_len),'x')
plot(x(1+hop:window_len+hop),'x')
plot(x(1+2*hop:window_len+2*hop),'x')
plot(x(1+3*hop:window_len+3*hop),'x')
.. and if you continue you will see that the pattern repeats itself, i.e., the first plot for instance is the same as the fourth, the second as the fifth etc. Therefore you only have three different real-imaginary part combinations.
Of course, this will change if you change the window length and the hopsize, and you will get more points. For instance, try
window_len =nfft;
hop=ceil(window_len/4)
I hope that helps.
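The count of three can also be checked with a little arithmetic, sketched below under the question's own settings (550 Hz, fs = 44100, 20 ms window, hop of a third of the window). The variable names `cycles_per_hop`, `n_distinct` and `chunk_diff` are mine. The idea: the phase seen by the window repeats once the accumulated hop covers a whole number of sinusoid periods.

```matlab
fs = 44100;  f0 = 550;
window_len = round(0.02*fs);     % 882 samples
hop = window_len/3;              % 294 samples

% Cycles of the sinusoid that pass between consecutive window starts.
cycles_per_hop = hop*f0/fs;      % 3.6667: the phase advances by 2/3 of a turn
disp(cycles_per_hop)

% The window phase repeats after n_distinct hops, where n_distinct is the
% denominator of hop*f0/fs written as a reduced fraction.
n_distinct = fs / gcd(hop*f0, fs);
disp(n_distinct)

% Direct check: the chunk three hops later matches the first one.
t = (0:fs-1)/fs;
x = sin(2*pi*f0*t);
chunk_diff = max(abs(x(1:window_len) - x(1+3*hop:window_len+3*hop)));
disp(chunk_diff)                 % rounding error only

% Same arithmetic for window_len = 1024 with hop = 256.
n_distinct_alt = fs / gcd(256*f0, fs);
disp(n_distinct_alt)
```

With the original settings `n_distinct` is 3. With a hop of 256 it is 441, which is more than the number of frames that fit in one second, so every frame lands on a different point.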

Matlab: finding phase difference using cpsd

From my understanding, when using the cpsd function as such:
[Pxy,f] = cpsd(x,y,window,Ns,NFFT,Fs);
matlab chops the time series data into smaller windows with size specified by you. And the windows are shifted by Ns data point. The final [Pxy, f] are an average of results obtained from each individual window. Please correct me if I am wrong about this process.
My question is, if I use angle(Pxy) at a specific frequency, say 34Hz. Does that give me the phase difference between signal x and y at the frequency 34Hz?
I have doubts about this because Pxy is an average over the individual windows, and each window is offset from the previous one by the window shift. Doesn't that mean the phase of the averaged Pxy is affected by the window shift?
I've tried to correct for this by making the window shift an integer number of full periods at 34Hz. Is this correct?
And just a little background about what I am doing:
I basically have numerous time-series pressure measurement over 60 seconds at 1000Hz sampling rate.
Power spectrum analysis indicates that there is a peak frequency at 34 Hz for each signal. (averaged over all windows)
I want to compare each signal's phase difference from each other corresponding to the 34Hz peak.
FFT analysis of individual window reveals that this peak frequency moves around. So I am not sure if cpsd is the correct way to be going about this.
I am currently considering trying to use xcorr to calculate the overall time lag between the signals and then calculate the phase difference from that. I have also heard of hilbert transform, but I got no idea how that works yet.
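Yes, cpsd works.
You can test this by constructing two signals with a known phase offset, such as:
t=[0:0.001:5];
omega=25; %frequency in Hz
x1=sin(2*pi*omega*t);
x2=sin(2*pi*omega*t+pi/3);
You can then check whether the phase shift calculated by cpsd at 25 Hz is pi/3 in magnitude (the sign depends on which signal is taken as the reference).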
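On the worry about the window shift, a hand-rolled sketch of Welch-style averaging may help. This is not the internals of cpsd, just the same idea written with plain `fft`; the Hann window, segment length, hop and variable names are my own choices, and the cross term is formed as conj(X).*Y, so the sign convention is this sketch's own. Each segment's spectrum X picks up a phase that depends on where the segment starts, but Y picks up the same extra phase, so it cancels in the product and every segment contributes the same cross-phase. Note also that the fourth argument of cpsd is the number of overlapping samples (noverlap), not the shift.

```matlab
fs = 1000;  f0 = 25;
t  = 0:1/fs:5;
x1 = sin(2*pi*f0*t);
x2 = sin(2*pi*f0*t + pi/3);

nwin = 1000;                               % 1 s segments -> 1 Hz bin spacing
hop  = 333;                                % deliberately not a multiple of the 40-sample period
w    = 0.5 - 0.5*cos(2*pi*(0:nwin-1)/nwin);  % Hann window, base MATLAB only
k    = f0*nwin/fs + 1;                     % index of the 25 Hz bin

seg_starts  = 1:hop:(numel(t)-nwin+1);
seg_phase_x = zeros(size(seg_starts));     % phase of X at 25 Hz, per segment
seg_cross   = zeros(size(seg_starts));     % cross-phase at 25 Hz, per segment
Pxy = 0;
for m = 1:numel(seg_starts)
    idx = seg_starts(m) + (0:nwin-1);
    X = fft(w .* x1(idx));
    Y = fft(w .* x2(idx));
    seg_phase_x(m) = angle(X(k));
    seg_cross(m)   = angle(conj(X(k))*Y(k));
    Pxy = Pxy + conj(X(k))*Y(k);           % accumulate the cross spectrum
end
phase_est = angle(Pxy/numel(seg_starts));
disp([phase_est pi/3])
```

The per-segment phases in `seg_phase_x` jump around with the shift, while `seg_cross` stays at pi/3 for every segment, so the average is not biased by the hop. If the 34 Hz peak really drifts between windows, the averaged phase describes whatever falls into that one bin, which is a separate issue from the shift.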

What is a spectrogram and how do I set its parameters?

I am trying to plot the spectrogram of my time domain signal given:
N=5000;
phi = (rand(1,N)-0.5)*pi;
a = tan((0.5.*phi));
i = 2.*a./(1-a.^2);
plot(i);
spectrogram(i,100,1,100,1e3);
The problem is I don't understand the parameters and what values should be given. These values that I am using, I referred to MATLAB's online documentation of spectrogram. I am new to MATLAB, and I am just not getting the idea. Any help will be greatly appreciated!
Before we actually go into what that MATLAB command does, you probably want to know what a spectrogram is. That way you'll get more meaning into how each parameter works.
A spectrogram is a visual representation of the Short-Time Fourier Transform. Think of this as taking chunks of an input signal and applying a local Fourier Transform on each chunk. Each chunk has a specified width and you apply a Fourier Transform to this chunk. You should take note that each chunk has an associated frequency distribution. For each chunk that is centred at a specific time point in your time signal, you get a bunch of frequency components. The collection of all of these frequency components at each chunk and plotted all together is what is essentially a spectrogram.
The spectrogram is a 2D heat map where the horizontal axis represents time and the vertical axis represents frequency. The colour at each point encodes the magnitude of that frequency component at that time: here, darker colours mean lower magnitude and lighter colours mean higher magnitude.
Here's one perfect example of a spectrogram:
Source: Wikipedia
Therefore, for each time point, we see a distribution of frequency components. Think of each column as the frequency decomposition of a chunk centred at this time point. For each column, we see a varying spectrum of colours. The darker the colour is, the lower the magnitude component at that frequency is and vice-versa.
So!... now you're armed with that, let's go into how MATLAB works in terms of the function and its parameters. The way you are calling spectrogram conforms to this version of the function:
spectrogram(x,window,noverlap,nfft,fs)
Let's go through each parameter one by one so you can get a greater understanding of what each does:
x - This is the input time-domain signal you wish to find the spectrogram of. It can't get much simpler than that. In your case, the signal you want to find the spectrogram of is defined in the following code:
N=5000;
phi = (rand(1,N)-0.5)*pi;
a = tan((0.5.*phi));
i = 2.*a./(1-a.^2);
Here, i is the signal you want to find the spectrogram of.
window - If you recall, we decompose the signal into chunks, and each chunk has a specified width. window defines the width of each chunk in terms of samples. (If you pass an integer, as you do here, MATLAB applies a Hamming window of that length to each chunk; you can also pass a vector of window coefficients.) As this is a discrete-time signal, you know that this signal was sampled with a particular sampling frequency and sampling period. You can determine how large the window is in terms of samples by:
window_samples = window_time/Ts
Ts is the sampling period of your signal (Ts = 1/fs). Setting the window size is actually very empirical and requires a lot of experimentation. Basically, the larger the window size, the better the frequency resolution, as you're capturing more cycles of each frequency, but the time localization is poor. Similarly, the smaller the window size, the better the localization in time, but the frequency decomposition is coarser. I don't have any suggestions here on what the optimal size is... which is why wavelets are often preferred when it comes to time-frequency decomposition. There, the analysis width adapts to the frequency being analysed, so you get a mixture of good time and frequency localization.
noverlap - Consecutive chunks normally overlap, so that a feature falling near the edge of one chunk is still captured by its neighbour and the time axis is sampled more finely. noverlap defines how many samples each chunk shares with the next one, and it must be smaller than the window length. The default is 50% of the width of each chunk.
nfft - You are essentially taking the FFT of each chunk. nfft tells you how many FFT points are computed per chunk. The default is the larger of 256 and the next power of two greater than or equal to the window length. nfft also sets how finely the frequency axis is sampled: the spacing between frequency bins is fs/nfft. A larger nfft therefore shows finer detail along the frequency axis of the spectrogram, although the ability to separate two nearby frequencies is still limited by the window length.
fs - The sampling frequency of your signal. The default is 1 Hz, but you can override this to whatever the sampling frequency your signal is at.
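To make the roles of the four parameters concrete, here is a rough sketch that builds the same kind of time-frequency matrix by hand with plain `fft`, so it runs without the Signal Processing Toolbox. It is an illustration of the idea, not what `spectrogram` does internally; the 100 Hz test tone, the 0.1 s window and all variable names are assumptions of mine.

```matlab
fs = 1e3;                          % sampling frequency in Hz
t  = (0:4999)/fs;                  % 5000 samples, i.e. 5 s of signal
x  = sin(2*pi*100*t);              % made-up test tone at 100 Hz

window_time = 0.1;                 % chunk duration in seconds
win_len  = round(window_time*fs);  % window_samples = window_time/Ts = 100
noverlap = win_len/2;              % 50% overlap between neighbouring chunks
nfft     = 256;                    % FFT points per chunk (zero-padded from 100)

w   = 0.54 - 0.46*cos(2*pi*(0:win_len-1)/(win_len-1));  % Hamming window
hop = win_len - noverlap;          % samples between chunk starts
n_frames = floor((numel(x) - noverlap)/hop);

S = zeros(nfft/2+1, n_frames);     % one column per chunk, one row per frequency
for m = 1:n_frames
    seg = x((m-1)*hop + (1:win_len)) .* w;
    F = fft(seg, nfft);
    S(:,m) = F(1:nfft/2+1).';      % keep the non-negative frequencies
end
f  = (0:nfft/2)*fs/nfft;                     % frequency axis in Hz
tc = ((0:n_frames-1)*hop + win_len/2)/fs;    % chunk centres in seconds

[~, row] = max(abs(S(:,1)));       % strongest bin in the first chunk
disp(size(S))
disp(f(row))                       % lands within one bin of 100 Hz
```

Something like `imagesc(tc, f, 20*log10(abs(S))); axis xy;` then gives the usual picture. Changing `win_len` trades time localization against frequency resolution, `noverlap` changes how many columns you get, and `nfft` changes how many rows.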
Therefore, what you should probably take out of this is that I can't really tell you how to set the parameters. It all depends on what signal you have, but hopefully the above explanation will give you a better idea of how to set the parameters.
Good luck!

DTMF, DFT window length

We've got an assignment in school to create a DTMF decoder, but are having trouble understanding what needs to be done, and how. First of all we need to calculate the energy of the signal using convolution. We do it by making use of the window length and the absolute value of the input signal:
SmoothEnergyOfInputSignal = conv(abs(X), ones(1,winlen)/winlen); %moving average
Now, we don't know how to get the proper window length. The smoothed energy is used to segment the signal, and later to determine the different frequencies in the signal making use of basis vectors(?)
The DTMF pulses are at least 40 ms long, separated by at least 40 ms of silence.
The sampling frequency is at 8kHz and our signal is about 17601 samples long.
We thought that by doing fs*0.04 we'd get the window length. 0.04=40ms, but now the smoothed energy signal is shifted so the segments go beyond the maximum samples of the input signal.
[Sound, fs] = audioread('dtmf_all.wav');
winlen = fs*0.04
E = conv(abs(Sound),ones(1, winlen)/winlen)
Long story short: How do we calculate the "correct" window length?
Thanks in advance.
EDIT: The instructions were updated, and we're not supposed to use convolution. We're supposed to use filter()
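A rough sketch of the `filter()` version, on a made-up test signal rather than dtmf_all.wav, and not necessarily what the assignment intends. Two things it is meant to show: `filter` returns an output the same length as its input (the full `conv` output is winlen-1 samples longer, which is where the overrun comes from), and a causal moving average lags the input by about (winlen-1)/2 samples, which can be compensated by shifting. On the length itself: since the gaps can be as short as 40 ms, 40 ms looks like an upper bound for the window, and something somewhat shorter should give cleaner dips between pulses.

```matlab
fs = 8000;                               % sampling rate from the question
winlen = round(0.04*fs);                 % 40 ms -> 320 samples

% Made-up signal: 40 ms silence, a 60 ms two-tone burst, 80 ms silence.
n_sil = round(0.04*fs);  n_tone = round(0.06*fs);
tt = (0:n_tone-1)/fs;
burst = sin(2*pi*697*tt) + sin(2*pi*1209*tt);   % DTMF digit "1" frequencies
X = [zeros(1,n_sil) burst zeros(1,2*n_sil)];

% Moving average of |X| with filter(): same length as X.
b = ones(1,winlen)/winlen;
E = filter(b, 1, abs(X));

% Compensate the lag of the causal moving average.
delay = round((winlen-1)/2);
E_aligned = [E(delay+1:end) zeros(1,delay)];

[~, pk] = max(E_aligned);                % falls inside the burst
disp([numel(X) numel(E) numel(conv(abs(X), b))])
disp(pk)
```

Calling `conv(abs(X), b, 'same')` would be another way to get a centred, same-length result if convolution were still allowed.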

FFT when data set has varying vector lengths

I have data from a model I am running. However the data is collected at each time step and there are varying numbers of time steps. It works out that although there are varying time steps, it is compensated by the change in time step so that all runs are running for the same time.
However I would think that when I have a vector that is 200 in length and one that is 900 in length, taking the FFT will give me inherently different frequencies. I feel like I should take the FFT with respect to the same time axis of all the samples.
The way I have the data now is just as row vectors where each entry is not associated with a point in time.
Is there a way to take the fft of each vector with respect to their place in a time axis rather than their place in the vector array?
My goal is to write a for loop and take the fft of many data sets, and then plot them to compare how the frequency signatures change.
If you collect 200 samples in 1 second (200 Hz), you can resolve input data from 1 Hz (1/(1 sec)) to 100 Hz. If you sample for 1 second collecting 900 samples, you can resolve input from 1 Hz to 450 Hz. So both your samples have the same spacing (sampling in the frequency axis is 1 Hz), but they go up to different maximum frequencies!
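The same arithmetic as a short sketch, assuming the common duration is 1 second as in the example above (the names `T`, `df` and `fmax` are mine):

```matlab
T = 1;                         % common duration of every run, in seconds
for N = [200 900]              % number of samples in each run
    df   = 1/T;                % spacing between FFT bins, in Hz
    fmax = (N/2)/T;            % highest resolvable (Nyquist) frequency, in Hz
    f    = (0:N/2)*df;         % frequency axis of the one-sided spectrum
    fprintf('N = %d: df = %g Hz, fmax = %g Hz, %d bins\n', N, df, fmax, numel(f));
end
```

So bin k of any run corresponds to the same frequency (k-1)/T; the longer runs simply have more bins.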
If your issue is just about plotting, you can either throw away the high frequencies which are not available in all your plots:
totaltime=1; %# common total time of all datasets, in seconds
minsamplenumber=200; %# length of the shortest dataset (assumed even)
figure;
hold all;
cutofffreq=(minsamplenumber/2)/totaltime; %# Nyquist frequency of the shortest dataset
freqscale=0:(1/totaltime):cutofffreq;
datasetcount=42;
ffts=NaN(numel(freqscale),datasetcount);
for i=1:datasetcount
data{i}=... %# collect your data; to make life easier always collect an even number..
F=fft(data{i})/numel(data{i}); %# full-length FFT, normalised by the sample count
ffts(:,i)=F(1:numel(freqscale)); %# keep only the bins every dataset has
plot(freqscale,abs(ffts(:,i)));
end
... or live with reality, and plot all data you have:
totaltime=1; %# common total time of all datasets, in seconds
figure;
hold all;
for i=1:42
data{i}=... %# collect your data; to make life easier always collect an even number..
ffts{i}=fft(data{i})/numel(data{i}); %# normalise so amplitudes are comparable
maxfreq(i)=(numel(ffts{i})/2)/totaltime; %# Nyquist frequency of this dataset
freqscale{i}=0:(1/totaltime):maxfreq(i);
plot(freqscale{i},abs(ffts{i}(1:end/2+1)));
end
Alternatively, you could resample your data (by filtered interpolation) into vectors of constant length, so that every dataset has the same sample rate. You may also have to overlap your FFT frames to get constant frame or window offsets.
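A minimal sketch of the resampling route, assuming every run covers the same 1 second and using plain linear interpolation with `interp1` as a stand-in. That is not the filtered interpolation mentioned above: when going from 900 samples down to 200, anything above 100 Hz would alias, and `resample` from the Signal Processing Toolbox (which applies an anti-aliasing filter) would be the better tool. The 10 Hz test tone, the run lengths and the variable names are made up.

```matlab
T = 1;                                   % common duration in seconds
N_common = 200;                          % hypothetical common length
t_common = (0:N_common-1)*T/N_common;    % shared time axis

lengths = [200 900];                     % made-up run lengths
spectra = zeros(N_common/2+1, numel(lengths));
for i = 1:numel(lengths)
    N = lengths(i);
    t = (0:N-1)*T/N;                     % this run's own time axis
    d = sin(2*pi*10*t);                  % stand-in data: a 10 Hz tone
    d_common = interp1(t, d, t_common, 'linear');   % put it on the shared axis
    F = fft(d_common)/N_common;
    spectra(:,i) = abs(F(1:N_common/2+1)).';
end
f = (0:N_common/2)/T;                    % one frequency axis for every run
[~, k] = max(spectra);                   % row of the peak in each column
disp(f(k))
```

After this step all spectra share the same frequency axis `f`, so they can be stacked in one matrix and compared bin by bin.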