Selecting part of a sample by frequency - matlab

I'm wondering if there is a way to select the part of a sample at a given frequency. The only way I can think of to index the sample by frequency is using an FFT, but doing that seems to mess up the sample so that it's not actually playable anymore. How else might one select the part of a sample at a given frequency while keeping the sound intelligible?
Edit: The exact instructions were "synthesize an example of each vowel of pitch 150 Hz and duration 5 seconds".
Edit: I completely misunderstood what I needed to do originally. New question is here: Synthesizing vowel from existing audio sample in matlab

The exact phrasing suggests you are being asked to synthesize, i.e. create a new signal, not to filter or modify an existing one. Moreover, it asks for a pitch of 150 Hz (it uses the word "pitch" rather than "frequency"; I'm assuming that fundamental frequency is good enough and/or what they meant :).
So, let me try rewording the question for you:
Do the following for each vowel sound (A, E, I, O, U, etc):
Create a 5 second sound with a fundamental frequency of 150 Hz.
I can think of two ways to solve this problem:
1. Sum up some sine waves (all of which will be multiples of 150 Hz) at different intensities. Knowing the intensities is the trick here.
2. Start with a 150 Hz pulse train and filter it. Knowing the exact filter to use is the trick here, although using the right pulse will probably have some impact as well.
Either way, you don't need or want an FFT in the generation stage. If you can't or don't want to look up the unknowns above, you could use an FFT to analyze a real person saying those sounds and use the results of the analysis to fill in the gaps. It wouldn't be too hard to do that, and it's probably covered in an advanced textbook on phonetics and/or acoustics.
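For illustration, here is a minimal MATLAB sketch of approach 1 (additive synthesis). The formant centres and bandwidths below are assumed, ballpark values for an /a/-like vowel, not measured data:

% Approach 1: sum harmonics of 150 Hz, weighted by assumed formants.
Fs  = 16000;                  % sample rate (Hz)
f0  = 150;                    % fundamental / pitch (Hz)
t   = 0:1/Fs:5-1/Fs;          % 5 seconds
formants = [800 1200 2500];   % rough formant centres for /a/ (assumed)
bws      = [80 90 120];       % rough formant bandwidths (assumed)
y = zeros(size(t));
for k = 1:floor((Fs/2)/f0)    % every harmonic below Nyquist
    fk = k*f0;
    a  = max(exp(-(fk - formants).^2 ./ (2*bws.^2)));  % weight by nearest formant
    y  = y + a*sin(2*pi*fk*t);
end
y = y/max(abs(y));            % normalise to avoid clipping
sound(y, Fs);

Swapping in measured formant values for each vowel (from a phonetics table, or from your own FFT analysis of recordings) is what makes each vowel sound right.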
If you need a more detailed answer, perhaps you should create a new question and link it here for help answering that. I suggest the following tags, if they exist:
Speech synthesis
Filtering
audio
phonetics

You should define "at a given frequency" more precisely, but it seems that what you want is a filter with a narrow pass-band tuned at the desired frequency.
However, the narrow-band requirement works against intelligibility. In the limit, a single frequency would just give you a sinusoid, and intelligibility would be completely lost.
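As a minimal sketch of such a filter in MATLAB (the centre frequency and bandwidth are placeholders; fir1 is in the Signal Processing Toolbox):

% Narrow band-pass around a frequency of interest.
Fs = 44100;                        % sample rate of your recording
f0 = 440;                          % frequency to isolate (placeholder)
bw = 50;                           % pass-band width in Hz (placeholder)
Wn = [f0-bw/2, f0+bw/2]/(Fs/2);    % normalised band edges
b  = fir1(512, Wn, 'bandpass');    % linear-phase FIR band-pass
yf = filter(b, 1, y);              % y: your input sample

The wider you make bw, the more of the sound survives; the narrower you make it, the closer you get to a bare sinusoid.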

Related

Modifying Sound Input to Determine Frequency

I'm working on a project and I've hit a snag that is past my understanding. My goal is to create an artificial neural network which is fed information from a sound file; that information is then passed through the system, resulting in a labeling of the chord. I'm hoping to make this to help in music transcription -- not to actually do the transcription itself, but to help with the harmonization aspect. I digress.
I've read as much as I can on the Goertzel algorithm and the FFT, but I'm unsure if these functions are what I'm looking for. I'm not looking for any particular frequency in the sound sample; rather, I'm hoping to find the high-, middle-, and low-range frequencies of the sample.
I know the Goertzel algorithm returns a high number if a particular frequency is found, but it seems computationally wasteful to run the algorithm for every possible tone in a given sample. Any ideas on what to use?
Or, if this is impossible, I'd love to know that too before spending too much time on this one project.
Thank you for your time!
Probably better suited to DSP StackExchange.
Suppose you FFT a single 110 Hz tone to get a spectrum; you'll see evenly spaced peaks at 110, 220, 330, ... Hz -- the harmonics. 110 Hz is the fundamental.
Suppose you have 3 tones. Already it's going to look quite messy in the frequency domain. Especially if you have a chord containing e.g. A110 and A220.
On account of this, I think a neural network is a good approach.
Feed in FFT output.
It would be a good idea to use a neural network that accepts complex-valued inputs, as the FFT outputs a complex number for each frequency bin.
http://www.eagle.tamut.edu/faculty/igor/PRESENTATIONS/IJCNN-0813_Tutorial.pdf
It may seem computationally wasteful to extract so many frequencies with an FFT, but FFT algorithms are extremely efficient nowadays. You should probably use an FFT size of 2^10 = 1024 samples, which gives 2^9 = 512 complex bins.
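A minimal MATLAB sketch of that input stage (x is assumed to be a column vector of audio samples; hann is in the Signal Processing Toolbox):

% One 1024-sample frame -> 512 complex bins for the network.
N     = 2^10;
frame = x(1:N) .* hann(N);          % window the frame to reduce leakage
X     = fft(frame);
bins  = X(1:N/2);                   % real input is conjugate-symmetric,
                                    % so 512 bins carry all the information
nnInput = [real(bins); imag(bins)]; % or feed the complex values directly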
The FFT is the right solution. Basically, when you have the FFT of an input signal that consists only of sine waves, you can determine the chord by mapping which frequencies are present to specific tones in whichever musical temperament you want to use, then looking up the chord specified by those tones. If you don't have sine waves as input, then using a neural network is a valid attempt at solving the problem, provided that you have enough samples to train it.
The FFT is the right way. Harmonics don't bother you much: since they are integer multiples of the fundamental frequency, the strongest ones (2x, 4x, ...) are just higher octaves of the same note. And for recognizing a chord, transpositions of notes over whole octaves don't matter.
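As a rough illustration of that mapping in MATLAB, you can fold FFT magnitudes into 12 pitch classes (a chroma vector), so octaves and most harmonics land on the same note (x and Fs are assumed to be a mono signal, as a column vector, and its sample rate):

% Fold the spectrum into 12 pitch classes for chord identification.
N = 4096;
X = abs(fft(x(1:N) .* hann(N)));
f = (0:N-1)' * Fs / N;               % bin centre frequencies (Hz)
chroma = zeros(12, 1);
for k = 2:N/2                        % skip DC
    pc = mod(round(12*log2(f(k)/440)), 12) + 1;   % pitch class, A = 1
    chroma(pc) = chroma(pc) + X(k);
end
[~, order] = sort(chroma, 'descend');
order(1:3)                           % the three strongest pitch classes

The three or four strongest pitch classes can then be looked up against a table of chord templates.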

Notch filters and harmonic noise in matlab

So basically, my problem is that I have a speech signal in .wav format that is corrupted by a harmonic noise source at some frequency. My goal is to identify the frequency at which this noise occurs, and use a notch filter to remove said noise. So far, I have read the speech signal into matlab using:
[data, Fs] = wavread('signal.wav');
My question is how can I identify the frequency at which the harmonic noise is occurring, and once I've done that, how can I go about implementing a notch filter at that frequency?
NOTE: I do not have access to the iirnotch() command or fdesign.notch() due to the version of MATLAB I am currently using (2010).
The general procedure would be to analyse the spectrum, identify the frequency in question, then design a filter around that frequency. For most real applications it's all a bit woolly: the frequencies move around and there's no easy way to distinguish noise from signal, so you have to use clever techniques and a bit of guesswork. However, if you know the corruption sits at a single, constant frequency then, yes, an FFT and a notch filter will probably do the trick.
You can analyse the signal with fft and design a filter with, among others, fir1, which I believe is part of the Signal Processing Toolbox. If you don't have the Signal Processing Toolbox you can do it 'by hand': transform to the frequency domain, remove the frequencies you don't want (by zeroing the relevant elements of the frequency vector), and transform back to the time domain. There's a tutorial on exactly that here.
The fft and fir1 functions are well documented: search the Mathworks site to get code examples to get you up and running.
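As a rough sketch of both steps in MATLAB (assumes a mono file and the Signal Processing Toolbox; the notch width is a guess to tune by ear):

% Locate the strongest spectral peak, then cut it with a band-stop FIR.
[data, Fs] = wavread('signal.wav');
N = length(data);
X = abs(fft(data));
f = (0:N-1)' * Fs / N;
[~, idx] = max(X(2:floor(N/2)));       % ignore DC, search up to Nyquist
fNoise = f(idx + 1);                   % frequency of the noise peak
bw = 20;                               % notch width in Hz (tune by ear)
Wn = [fNoise-bw/2, fNoise+bw/2]/(Fs/2);
b  = fir1(1024, Wn, 'stop');           % narrow band-stop FIR
clean = filter(b, 1, data);
wavwrite(clean, Fs, 'clean.wav');

For harmonic noise you would repeat the notch at 2*fNoise, 3*fNoise, and so on.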
To add to/amend xenoclast's answer, filtering in the frequency domain may or may not work for you. There are many thorny issues with filtering in the frequency domain, some of which are covered here: http://blog.bjornroche.com/2012/08/why-eq-is-done-in-time-domain.html
One additional issue is that if you try to process your entire file at once, the "width" or Q of the filters will depend on the length of your file. This might work out for you, or it might not. If you have many files of different lengths, don't expect similar results this way.
To design your own IIR notch filter, you could use the RBJ audio filter cookbook. If you need help, I wrote up a tutorial here:
http://blog.bjornroche.com/2012/08/basic-audio-eqs.html
My tutorial uses a bell/peaking filter, but it's easy to follow it and then swap in a notch filter from RBJ.
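For reference, a minimal MATLAB sketch of the RBJ cookbook notch (coefficients straight from the Audio EQ Cookbook; f0 and Q are placeholders, and data/Fs come from the wavread call above):

% RBJ notch biquad, applied as a time-domain filter.
f0 = 1000;                   % notch centre frequency (placeholder)
Q  = 30;                     % higher Q -> narrower notch
w0 = 2*pi*f0/Fs;
alpha = sin(w0)/(2*Q);
b = [1, -2*cos(w0), 1];
a = [1+alpha, -2*cos(w0), 1-alpha];
b = b/a(1); a = a/a(1);      % normalise so a(1) == 1
clean = filter(b, a, data);

Unlike the whole-file FFT approach, the width of this filter is independent of the file length.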
One final note: assuming this is actually an audio signal in your .wav file, you can also use your ears to find and fix the problem frequencies:
1. Open the file in an audio editing program that lets you adjust filter settings in real time (I am not sure if Audacity lets you do this, but it probably does).
2. Use a "boost" or "parametric" filter set to a high gain and sweep the frequency setting until you hear the noise accentuated the most.
3. Replace the boost filter with a notch filter at the same frequency. You may need to tweak the width to trade off noise elimination against signal preservation.
4. Repeat as needed (due to the many harmonics).
5. Save the resulting file.
Of course, some audio editing apps have built-in harmonic noise reduction features that work especially well for 50/60 Hz noise.

How can I find sound intensity by using MATLAB?

I'm looking for some functions in MATLAB to find out some parameters of a sound, such as intensity, density, frequency, time, and spectral identity.
I know how to use audiorecorder as a function to record the sampled voice, and also getaudiodata in order to plot it. But I need to work out the parameters of a sampled recorded voice that I mentioned above. I'd be so thankful if anyone could help me.
This is a very vague question; you may want to narrow it down (at first) and add as much contextual detail as you can. It will certainly attract a lot more answers (also, as mentioned by Ion, you could post it at http://dsp.stackexchange.com).
Sound intensity: microphones usually measure pressure, but you can get the intensity from that quite easily (see this question). Your main problem is that microphones are not usually calibrated, which means that you cannot associate an amplitude with a pressure. You can get sound density from sound intensity.
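With an uncalibrated microphone you can still report a relative level. A minimal MATLAB sketch (the 3-second duration is arbitrary):

% Record a few seconds and report the RMS level relative to full scale.
rec = audiorecorder(44100, 16, 1);   % 44.1 kHz, 16-bit, mono
recordblocking(rec, 3);
x = getaudiodata(rec);
rmsVal = sqrt(mean(x.^2));           % RMS amplitude, 0..1
level  = 20*log10(rmsVal)            % displayed, in dB relative to full scale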
Frequency: you can get the spectrum of your sound by using the Fast Fourier Transform (see the Matlab function fft).
As for spectral or time identity, I believe these are psychoacoustics notions, which is not really my area of expertise.
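Continuing the recording sketch above, the one-sided magnitude spectrum from fft looks like this:

% One-sided magnitude spectrum of the recorded signal.
Fs = 44100;                          % sample rate used above
N  = length(x);
X  = abs(fft(x))/N;
f  = (0:N-1)*Fs/N;                   % frequency axis in Hz
plot(f(1:floor(N/2)), 2*X(1:floor(N/2)));
xlabel('Frequency (Hz)'); ylabel('Amplitude');

Peaks in this plot show which frequencies dominate the recording.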
I'm no expert but I have played with Matlab a little in the past.
One function I remember was wavread() to read a sound signal into MATLAB, which if executed in this form [y, Fs, nbits] = wavread('AUDIO.WAV') would return something like:
AUDIO.WAV:
Fs = 100 kHz
Bits per sample = 10
Size = 100000
(numbers from the top of my head)
Now about the other things you ask, I'm not really sure. You can expect a better answer from somebody else. I think this question should be moved to Signal Processing SE btw.

Audio processing with fft iphone app [duplicate]

I'm trying to write a simple tuner (no, not to make yet another tuner app), and am looking at the AurioTouch sample source (has anyone tried to comment this code??).
My worry is that aurioTouch doesn't seem to actually work very well when looking at the frequency domain graph. I play a single note on an instrument and I don't see a nicely ordered, small set of frequencies with one strong peak at the appropriate frequency of the note.
Has anyone used aurioTouch enough to know whether the underlying code is functional or whether it is just a crude sample?
Other options I have are to use FFTW or KISS FFT. Anyone have any experience with those?
Thanks.
You're expecting the wrong thing!!
Not the library's fault
Whether or not the library produces it properly, you're looking for a pattern that rarely exists in real-life sounds. Only a perfect, electronically generated sine wave will produce anything close to a discrete 'spike' in the frequency graph. If you don't believe it, try firing up a 'spectrum analyzer' visualization in Winamp or Media Player. It's not so much the PC's fault.
Real sound waves are complicated animals
Picture a sawtooth or square wave in your mind's eye. Those sharp turnarounds - the corners or points on the wave - look like tons of higher harmonics to the FFT, or even to a true Fourier transform. And if you've ever seen a real square wave or sawtooth on a scope, or even a 'sine wave' produced by an instrument that is supposed to produce a sine wave, take a look at all the sharp nooks and crannies in just ONE note (if you don't have a scope, just zoom way in on the wave in Audacity; the more you zoom, the higher the frequencies you're looking at). Yep, those deviations all count as frequencies.
It's hard to tell the difference between one note and a whole orchestra sometimes in a spectrum analysis.
But I hear single notes!
So how does the ear do it? It considers the entire waveform. Then your lower brain lies to your upper brain about what the input is: one note, not a mess of overtones.
You can't do it as fully, but you can approximate it via 'training.'
Approximation: building some smarts
PLAY the note on the instrument and 'save' the frequency graph. Do this for notes in several frequency ranges, or better yet all notes.
Then interpolate to fill in the gaps (by half or quarter steps) by scaling the frequencies of the saved graphs for that instrument by 2^(1/12) per half step (or 2^(1/24) per quarter step, etc.).
Figure out how to store them in a quickly-searchable data structure like a BST or trie. Only it would have to return a 'how close is this' score. It would have to identify the match via proportions of frequencies as well, in case it came in different volumes.
Using the smarts
Next time you're looking for a note from that instrument, just take the 'heard' frequency graph and find it in that data structure. You can record several instruments that make different waveforms and search for them too. If there are background sounds or multiple notes, take the closest match. Then if you want to identify other notes, 'subtract' the found frequency pattern from the sampled one, and lather, rinse, repeat.
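A minimal MATLAB sketch of the matching step (templates is a hypothetical matrix whose columns are the saved per-note spectra; normalising makes the score volume-independent, as suggested above):

% Score a heard spectrum against stored note templates.
heard = abs(fft(x(1:N) .* hann(N)));   % x: audio frame (column), N: FFT size
heard = heard(1:N/2) / norm(heard(1:N/2));
scores = zeros(1, size(templates, 2));
for j = 1:size(templates, 2)
    t = templates(:, j) / norm(templates(:, j));
    scores(j) = heard' * t;            % cosine similarity, 1 = perfect match
end
[best, note] = max(scores);            % best-matching note index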
It won't work for your voice...
If you've ever tried to tune yourself by singing into a guitar tuner, you'll know that tuners aren't that clever. Of course some instruments (especially voice) really float around the pitch and generate an ever-evolving waveform.
What are you trying to accomplish?
You would not have to get this fancy for a 'simple' tuner app, but if you're not making just another tuner app then I'm guessing you actually want to identify notes (e.g., maybe you want to autogenerate MIDI files from songs on the radio ;-)
Good luck. I hope you find a library that does all this junk instead of having to roll your own.
Edit 2017
Note this webpage: http://www.feilding.net/sfuad/musi3012-01/html/lectures/015_instruments_II.htm
Well down the page, there are spectrum analyses of various organ pipes. There are many, many overtones. These are possible to detect - with enough work - if you 'train' your app with them first (just like telling a kid, 'this is what a clarinet sounds like...')
aurioTouch looks weird because the frequency axis is on a linear scale. It's very difficult to interpret FFT output when the x-axis is anything other than a logarithmic scale (traditionally log2).
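For illustration, in MATLAB a log-frequency plot is one line, and it makes equal musical intervals equally spaced (x and Fs are an audio frame, as a column vector, and its sample rate):

% FFT magnitude on a logarithmic frequency axis.
N = 4096;
X = abs(fft(x(1:N)));
f = (1:N/2-1)*Fs/N;                  % skip DC (log of 0 is undefined)
semilogx(f, 20*log10(X(2:N/2)));
xlabel('Frequency (Hz, log scale)'); ylabel('Magnitude (dB)');

(semilogx uses log10 rather than log2, but octaves still come out evenly spaced.)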
If you can't use aurioTouch's integer-FFT, check out my library:
http://github.com/alexbw/iPhoneFFT
It uses double-precision, has support for multiple window types, and implements Welch's method (which should give you more stable spectra when viewed over time).
@zaph, the FFT does compute a true Discrete Fourier Transform. It is simply an efficient algorithm for doing so, exploiting the structure of the DFT rather than computing it term by term.
FFTs use frequency bins, and the bin width is set by the FFT parameters (sample rate divided by FFT length). To find a frequency you will need the signal sampled at a rate at least twice the highest frequency present in it. Then find the time between the cycles. If it is not a pure tone this will of course be harder.
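One common way to 'find the time between the cycles' is the autocorrelation. A minimal MATLAB sketch (x and Fs are a mono frame and its sample rate; xcorr is in the Signal Processing Toolbox):

% Estimate the fundamental from the autocorrelation peak.
[r, lags] = xcorr(x, 'coeff');
r = r(lags >= 0); lags = lags(lags >= 0);
minLag = round(Fs/2000);             % search 50 Hz .. 2 kHz
maxLag = round(Fs/50);
[~, k] = max(r(minLag:maxLag));
period = (minLag + k - 1)/Fs;        % seconds per cycle
f0 = 1/period                        % estimated fundamental (Hz)

This works well on clean, roughly periodic signals and degrades on noisy or inharmonic ones.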
I am using Ooura FFT to compute the FFT of accelerometer data. I do not always obtain the correct spectrum. For some reason, Ooura FFT produces completely wrong results with spectral magnitudes on the order of 10^200 across all frequencies.
