Estimation of Pitch from Speech Signals Using Autocorrelation Algorithm

I want to detect the pitch frequency of a speech signal using the autocorrelation algorithm. I have MATLAB code, but the results are wrong. I would be grateful if you could find the mistake in my code.
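[y, Fs] = audioread('Sample1.wav');     % samples and sample rate
y = y(:,1);                             % keep the first channel only
auto_corr_y = xcorr(y);                 % autocorrelation over all lags
subplot(2,1,1); plot(y)
subplot(2,1,2); plot(auto_corr_y)
[pks, locs] = findpeaks(auto_corr_y);   % every local maximum
[mm, peak1_ind] = max(pks);             % largest peak (zero lag)
period = locs(peak1_ind+1) - locs(peak1_ind);   % distance to the next local maximum
pitch_Hz = Fs/period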
Thank you for your help in this matter.

It seems your code does not work because Sample1.wav must contain only a short quasi-periodic segment of voiced speech. Also note that the pitch frequency is not constant over time, so your estimate must take this into account.
If you just want to estimate the frequency, you can use the RAPT method from the Speech Filing System (see the sfs_rapt.m wrapper for Windows).
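To make the framing advice concrete, here is a minimal sketch of autocorrelation pitch estimation on one short voiced frame. It is written in Python with only the standard library so it runs without MATLAB. The 60-400 Hz search range, the function name autocorr_pitch, and the synthetic test frame are all assumptions for illustration and stand in for a real segment cut from Sample1.wav. Note that it searches for the largest autocorrelation value within a plausible lag range, rather than taking the local maximum right next to the zero-lag peak, which can be any small ripple.

```python
import math

def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch (Hz) of one short voiced frame by autocorrelation."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]            # remove any DC offset
    lag_min = int(fs / fmax)                 # shortest pitch period considered
    lag_max = min(int(fs / fmin), n - 1)     # longest pitch period considered
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return fs / best_lag

# Synthetic "voiced" frame (assumption): 40 ms of a 125 Hz tone with two harmonics.
fs = 8000
frame = [math.sin(2 * math.pi * 125 * t / fs)
         + 0.5 * math.sin(2 * math.pi * 250 * t / fs)
         + 0.25 * math.sin(2 * math.pi * 375 * t / fs)
         for t in range(int(0.040 * fs))]
pitch_hz = autocorr_pitch(frame, fs)
print(round(pitch_hz, 1))
```

For a whole recording, the same function would be called on successive short frames (for example every 10 ms), which gives a pitch track instead of a single number.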

Related

Manual pitch estimation of a speech signal

I am new to speech processing, so please forgive my ignorance. I was given a short speech signal (10 s) and asked to manually annotate its pitch using MATLAB or the WaveSurfer software. How do I find the pitch of a speech signal? Is there any theoretical resource that would help with this problem? I tried plotting the pitch contour of the signal using WaveSurfer. Is that the right approach?
Edit 1: My work involves applying various pitch detection algorithms to our data and comparing their accuracies, so the manually annotated pitch acts as the reference.
UPDATE 1: I obtained the GCIs (Glottal Closure Instants) by differentiating the EGG signal: the peaks in the differentiated signal (dEGG) are the GCIs. The time interval between two successive GCIs is the pitch period (s), and the inverse of the pitch period is the pitch (Hz).
UPDATE 2: SIGMA is a well-known algorithm for automatic GCI detection.
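The conversion described in UPDATE 1 can be sketched in a few lines of Python. The GCI times below are made-up values for illustration, and the function name pitch_from_gcis is hypothetical. Detecting the GCIs themselves (peak picking on the dEGG signal) is not shown.

```python
def pitch_from_gcis(gci_times_s):
    """Return (midpoint time, F0 in Hz) for each pair of successive GCIs."""
    track = []
    for t0, t1 in zip(gci_times_s, gci_times_s[1:]):
        period = t1 - t0                     # pitch period in seconds
        if period > 0:
            track.append(((t0 + t1) / 2, 1.0 / period))
    return track

# Hypothetical GCI times in seconds, e.g. peak locations of the dEGG signal.
gcis = [0.100, 0.108, 0.116, 0.126, 0.136]
f0_track = pitch_from_gcis(gcis)
for t, f0 in f0_track:
    print(f"{t:.3f} s  {f0:.1f} Hz")
```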
Thanks everyone.

Ground truth is usually obtained from a signal recorded together with an EGG signal. EGG stands for electroglottograph, a device that records vocal fold contact at the larynx, from which the true pitch can be derived.
Since I doubt you have access to such a device, I recommend using an existing database for pitch extraction evaluation that was carefully prepared for this task. You can download it here. The data was collected at the University of Edinburgh by Paul Bagshaw.
I suggest you read his thesis as well.
If you want to compare against a state-of-the-art pitch extraction algorithm, check https://github.com/google/REAPER. Also note that the "true" pitch might not be the best feature for subsequent algorithms. Sometimes pitch extracted with mistakes gives better accuracy downstream, for example in speech recognition. See this publication for more information.

FFT in Matlab in order to find signal frequency and create a graph with peaks

I have data from an accelerometer and have plotted acceleration (y-axis) against time (x-axis). The sensor's sampling rate is around 100 samples per second, but the samples are not equally spaced in time (for example, the timestamps go 10.046, 10.047, 10.163, etc.), so the sampling interval is not constant. I also have no analytical expression for the signal. I need to find the frequency of the signal and plot acceleration (y-axis) against frequency (Hz, x-axis), but I don't know which FFT code suits my case. (Sorry for my bad English.)
Any help would be greatly appreciated

For an FFT to work, you need to reconstruct your signal on a regular sampling interval. There are two ways to do this:
Interpolate the data you already have to make an accurate guess at where the signal would be at regular intervals. However, the resulting FFT may contain significant inaccuracies.
OR
Adjust the device that reads the accelerometer to incorporate an accurate timer, so that results are always transmitted at regular intervals. This is what I would recommend.
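The first option (interpolation) can be sketched as below, in Python with only the standard library. The function name resample_uniform and the sample data are assumptions for illustration. Linear interpolation is only one possible choice, and the toy values lie on a straight line so the result is easy to check by eye.

```python
import math

def resample_uniform(times, values, fs):
    """Linearly interpolate irregularly timed samples onto a grid with spacing 1/fs."""
    n_out = int(math.floor((times[-1] - times[0]) * fs + 1e-9)) + 1
    out_t, out_v = [], []
    j = 0
    for k in range(n_out):
        t = times[0] + k / fs
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1                           # advance to the interval containing t
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)             # position of t inside [t0, t1]
        out_t.append(t)
        out_v.append(values[j] + w * (values[j + 1] - values[j]))
    return out_t, out_v

# Hypothetical irregular timestamps (seconds) and readings on the line v = 2 t.
times = [0.000, 0.011, 0.019, 0.032, 0.040]
values = [2.0 * t for t in times]
grid_t, grid_v = resample_uniform(times, values, fs=100)
print([round(v, 3) for v in grid_v])
```

The regularly spaced output can then be passed to any FFT routine, with the caveat from the answer that interpolation errors carry over into the spectrum.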

Is there any way to get the frequency of a sound?

I want to know the frequency of some sound data. I have a rough idea that it can be done using an FFT, but I am not sure how. When I passed the entire data to an FFT, it gave me 2 peaks, but how can I get the frequency from that?
Thanks a lot in advance.

Have a look at this page for an explanation on how to calculate it:
FFT Fundamentals
Please also check this answer (it's C# code but I think you can easily understand it)
How do I obtain the frequencies of each value in an FFT?
And finally have a look at this one, it uses DFT instead of FFT:
Determining the magnitude of a certain frequency on the iPhone
I also found this implementation that you can use in Objective-C:
A lib to find the frequency https://github.com/jkells/sc_listener
An example using the above library https://github.com/jkells/sc_listener_sample
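As a rough illustration of the bin-to-frequency mapping the links above explain: bin k of an N-point transform of data sampled at fs Hz corresponds to k * fs / N Hz. For real input the spectrum is mirrored around N/2, so the two peaks seen in the question are most likely the same component and its mirror image. The sketch below is in Python with a naive DFT from the standard library. The sample rate, length, and test tone are assumptions.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) DFT, fine for a demo)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

fs = 8000            # sample rate in Hz (assumed)
n = 256              # number of samples passed to the transform
x = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(n)]

mags = dft_magnitudes(x)
peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])   # skip the DC bin
peak_hz = peak_bin * fs / n          # bin k corresponds to k * fs / n Hz
print(peak_bin, peak_hz)
```

Only bins 0..N/2 are searched, which is why a single peak is reported here.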
Regards

An FFT will give you the frequencies of all the sinusoidal components of a signal. If instead you want just the frequency of the periodicity of common waveforms (more interesting sounding and looking than a plain sine wave), such as those produced by speech and music, then you may want to use a pitch detection/estimation algorithm instead of just an FFT peak.
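A small sketch of that difference, in Python with only the standard library: the synthetic tone below (an assumption, not real speech) repeats every 10 ms, so its pitch is 100 Hz, but its strongest sinusoidal component is the second harmonic at 200 Hz. Autocorrelation is used here as one simple pitch estimator among many, and the 60-400 Hz search range is also an assumption.

```python
import cmath
import math

fs, n = 8000, 400
# Harmonic tone whose 100 Hz fundamental is weaker than its 2nd harmonic.
x = [0.3 * math.sin(2 * math.pi * 100 * t / fs)
     + 1.0 * math.sin(2 * math.pi * 200 * t / fs)
     + 0.6 * math.sin(2 * math.pi * 300 * t / fs) for t in range(n)]

# 1) Frequency of the strongest spectral component (naive DFT, bins 1..N/2).
mags = [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)]
fft_peak_hz = max(range(1, len(mags)), key=lambda k: mags[k]) * fs / n

# 2) Pitch from the largest autocorrelation value in a 60-400 Hz lag range.
lags = range(fs // 400, fs // 60 + 1)
best_lag = max(lags, key=lambda lag: sum(x[i] * x[i + lag] for i in range(n - lag)))
pitch_hz = fs / best_lag

print(fft_peak_hz, pitch_hz)
```

The spectral peak lands on the harmonic, while the autocorrelation peak lands on the repetition period of the whole waveform.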

How can I find sound intensity using MATLAB?

I'm looking for MATLAB functions to find some parameters of a sound, such as intensity, density, frequency, time, and spectral identity.
I know how to use 'audiorecorder' to record the sampled voice, and 'getaudio' to plot it. But I need to determine the parameters mentioned above for a sampled, recorded voice. I'd be very thankful if anyone could help me.

This is a very vague question. You may want to narrow it down (at first) and add as much contextual detail as you can; that will certainly attract a lot more answers (also, as mentioned by Ion, you could post it at http://dsp.stackexchange.com).
Sound intensity: microphones usually measure pressure, but you can get the intensity from that quite easily (see this question). Your main problem is that microphones are usually not calibrated, which means you cannot associate a recorded amplitude with a pressure. You can get sound density from sound intensity.
Frequency: you can get the spectrum of your sound by using the Fast Fourier Transform (see the MATLAB function fft).
As for spectral or time identity, I believe these are psychoacoustic notions, which is not really my area of expertise.
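For the intensity part, here is a minimal Python sketch under strong assumptions: the samples are already calibrated in pascals, the sound is a plane progressive wave in air at about 20 degrees C, and "sound density" is read as sound energy density. Under those assumptions the intensity is p_rms^2 / (rho * c) and the energy density is the intensity divided by c. The constant names and the test tone are hypothetical.

```python
import math

RHO_AIR = 1.204      # air density in kg/m^3 at about 20 degC (assumed)
C_AIR = 343.0        # speed of sound in air in m/s at about 20 degC (assumed)
P_REF = 20e-6        # reference pressure for dB SPL, 20 micropascals

def sound_metrics(pressure_pa):
    """Plane-wave estimates from calibrated pressure samples in pascals."""
    p_rms = math.sqrt(sum(p * p for p in pressure_pa) / len(pressure_pa))
    intensity = p_rms ** 2 / (RHO_AIR * C_AIR)      # W/m^2
    energy_density = intensity / C_AIR              # J/m^3
    spl_db = 20 * math.log10(p_rms / P_REF)         # sound pressure level, dB SPL
    return p_rms, intensity, energy_density, spl_db

# Hypothetical calibrated signal: 0.1 s of a 1 kHz tone with 1 Pa RMS pressure.
fs = 48000
tone = [math.sqrt(2) * math.sin(2 * math.pi * 1000 * t / fs) for t in range(fs // 10)]
p_rms, intensity, density, spl = sound_metrics(tone)
print(f"{p_rms:.3f} Pa, {intensity:.2e} W/m^2, {density:.2e} J/m^3, {spl:.1f} dB SPL")
```

With an uncalibrated microphone, only relative levels between recordings are meaningful, as noted above.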

I'm no expert, but I have played with MATLAB a little in the past.
One function I remember is wavread(), which reads a sound signal into MATLAB. Called in the form [y, Fs, nbits] = wavread('audio.wav'), it would return something like:
AUDIO.WAV:
Fs = 100 kHz
Bits per sample = 10
Size = 100000
(numbers from the top of my head)
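For readers without MATLAB, the same three quantities (sample rate, bits per sample, number of samples) can be read with Python's standard wave module. This is only an analogous sketch, not the MATLAB call described above. The WAV data is generated in memory so the example does not depend on a file on disk, and all the values in it are made up.

```python
import io
import math
import struct
import wave

# Build a small 16-bit mono WAV in memory so the sketch needs no file on disk.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)                # bytes per sample, i.e. 16 bits
    w.setframerate(16000)
    samples = [int(10000 * math.sin(2 * math.pi * 440 * t / 16000)) for t in range(1600)]
    w.writeframes(struct.pack("<1600h", *samples))

# Read it back: sample rate, bits per sample, length, and the samples themselves.
buf.seek(0)
with wave.open(buf, "rb") as r:
    fs = r.getframerate()
    bits = 8 * r.getsampwidth()
    n_frames = r.getnframes()
    data = struct.unpack(f"<{n_frames}h", r.readframes(n_frames))

print(fs, bits, n_frames)
```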
Now about the other things you ask, I'm not really sure. You can expect a better answer from somebody else. I think this question should be moved to Signal Processing SE btw.

Peak detection in Performous code

I want to implement voice pitch detection on the iPhone using the HPS (harmonic product spectrum) method, but the detected tones are not very accurate. Performous does a decent job of pitch detection.
I looked through its code, but I did not fully understand the theory behind the calculations.
They use an FFT and find the peaks, but the part where they use the phase of the FFT output confused me. I figure they use some heuristics for voice frequencies.
So, could anyone please explain the algorithm Performous uses to detect pitch?

Performous extracts pitch from the microphone, and its code is open source. Here is a description of what the algorithm does, from the guy who coded it (Tronic on irc.freenode.net#performous).
PCM input (with buffering)
FFT (1024 samples at a time, remove 200 samples from front of the buffer afterwards)
Reassignment method (against the previous FFT that was 200 samples earlier)
Filtering of peaks (this part could be done much better or even left out)
Combining peaks into sets of harmonics (we call the combination a tone)
Temporal filtering of tones (update the set of tones detected earlier instead of simply using the newly detected ones)
Pick the best vocal tone (frequency limits, weighting, could use the harmonic array also but I don't think we do)
I still wasn't able to figure it out and implement it from this information. If anyone manages this, please please post your results here, and comment on this response so that SO notifies me.
The task would be to create a minimal C++ wrapper around this code.
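On the question of why the FFT phase is used: one common reason is that comparing the phase of the same bin in two overlapping frames gives a much finer frequency estimate than the bin spacing alone. The Python sketch below shows that phase-difference refinement (phase-vocoder style) with the 1024-sample frame and 200-sample hop from the list above. Whether this matches what Performous actually does in its reassignment step is an assumption, and the function names, the Hann window, the 44100 Hz sample rate, and the test tone are all hypothetical.

```python
import cmath
import math

def windowed_bin(x, start, n, k):
    """DFT bin k of the Hann-windowed frame x[start:start+n]."""
    return sum(x[start + t] * 0.5 * (1 - math.cos(2 * math.pi * t / n))
               * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))

def refine_frequency(x, fs, n=1024, hop=200, k_max=64):
    """Refine the peak-bin frequency using the phase advance over `hop` samples."""
    # Coarse peak search over bins 1..k_max-1 of the first frame.
    k = max(range(1, k_max), key=lambda b: abs(windowed_bin(x, 0, n, b)))
    phase0 = cmath.phase(windowed_bin(x, 0, n, k))
    phase1 = cmath.phase(windowed_bin(x, hop, n, k))
    expected = 2 * math.pi * k * hop / n              # advance of an exact bin-centre tone
    dev = phase1 - phase0 - expected
    dev = (dev + math.pi) % (2 * math.pi) - math.pi   # wrap to [-pi, pi)
    return (k / n + dev / (2 * math.pi * hop)) * fs

fs = 44100                                            # assumed sample rate
x = [math.sin(2 * math.pi * 440.0 * t / fs) for t in range(1024 + 200)]
estimate_hz = refine_frequency(x, fs)
print(round(estimate_hz, 2))
```

With these settings the bin spacing is about 43 Hz, so the coarse peak alone would place a 440 Hz tone at the centre of bin 10 (about 431 Hz), and the phase term supplies the correction.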