In the iPhone SDK, the AudioQueue.h header defines a structure called AudioQueueLevelMeterState.
It has two float fields: mAveragePower and mPeakPower.
What units are stored in them?
Are they decibels or not?
Since neither the documentation nor the header comments say, I think the standard unit is assumed. As far as I know, the decibel is the only unit of power for sound.
There are other units for measuring sound, but only the decibel measures power (in the scientific sense).
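For what it's worth, which unit you get seems to depend on which property you query: AudioToolbox has both kAudioQueueProperty_CurrentLevelMeterDB (decibels) and kAudioQueueProperty_CurrentLevelMeter (linear 0.0 to 1.0). A minimal Swift sketch, assuming you already have a running AudioQueueRef:

    import AudioToolbox

    // Turn on metering for the queue (it is off by default).
    func enableMetering(_ queue: AudioQueueRef) {
        var on: UInt32 = 1
        AudioQueueSetProperty(queue, kAudioQueueProperty_EnableLevelMetering,
                              &on, UInt32(MemoryLayout<UInt32>.size))
    }

    // Read one AudioQueueLevelMeterState per channel.
    func levelsInDB(_ queue: AudioQueueRef, channels: Int) -> [AudioQueueLevelMeterState] {
        var meters = [AudioQueueLevelMeterState](repeating: AudioQueueLevelMeterState(),
                                                 count: channels)
        var size = UInt32(MemoryLayout<AudioQueueLevelMeterState>.stride * channels)
        // ...CurrentLevelMeterDB fills mAveragePower/mPeakPower in decibels;
        // the sibling ...CurrentLevelMeter uses a linear 0.0-1.0 scale instead.
        AudioQueueGetProperty(queue, kAudioQueueProperty_CurrentLevelMeterDB,
                              &meters, &size)
        return meters
    }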
Related
The docs for AVAudioPlayer say averagePowerForChannel: "Returns the average power for a given channel, in decibels, for the sound being played." But it doesn't say what "average" means, or how/if that average is weighted. I'm particularly interested in what the range of samples is, specifically the time interval and whether it goes forward into the future at all.
AVAudioRecorder: peak and average power says that the formula used to calculate power is RMS (root mean square) over a number of samples, but also doesn't say how many samples are used.
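For concreteness, here is a sketch of the RMS-to-decibels computation those posts describe. The window length (whatever buffer you pass in) is an assumption on my part; Apple doesn't document how many samples its meters average over:

    import Foundation

    // dBFS of a buffer: 20 * log10(RMS amplitude relative to full scale, 1.0).
    func averagePowerDB(samples: [Float]) -> Float {
        guard !samples.isEmpty else { return -160 }          // silence floor
        let meanSquare = samples.reduce(0) { $0 + $1 * $1 } / Float(samples.count)
        let rms = sqrt(meanSquare)
        return 20 * log10(max(rms, 1e-8))                    // clamp to avoid log10(0)
    }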
Optional bonus question: if I want to calculate what the average power will be in, say, 100 msec -- enough time for an animation to begin now and reach an appropriate level soon -- what frameworks should I be looking into? Confusion with meters in AVAudioRecorder says I can get ahold of the raw sample data with AudioQueue or RemoteIO, and then I can use something like this: Android version of AVAudioPlayer's averagePowerForChannel -- but I haven't found sample code (pardon the pun).
I want to measure the loudness of ambient sound. Having read a number of posts on Stack Overflow, I feel more confused than I was originally. I'm not a sound engineer, just a programmer.
I think I need to calculate dB SPL with the formula 20 * log10(voltage / Voltage_Ref).
So for this I need to sample the internal microphone's voltage (or pressure in pascals?) level. The AVAudioRecorder class lets me meter-read peakPowerForChannel, but this gives a dBFS reading between -160 and 0, where 0 is full scale. How do I access the voltage/pressure levels, with another API perhaps?
I had read that roughly 0 dBFS = 99 dB SPL. But that would mean the maximum dB SPL I could read using peakPowerForChannel would be 99 dB SPL, and I'm looking to read levels higher than this.
Any information on this would be most appreciated - I'm somewhat stuck at this point.
Thanks
Mike
The only way to do this is to test your particular iOS device model (and perhaps production batch) against a known sound source at a given distance and relationship to the mic in an anechoic chamber. The voltage and pressure relationship is neither specified by Apple nor available from any API.
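If you do calibrate against a known source, applying the result is simple. A hypothetical sketch; calibrationOffsetDB is a made-up placeholder that must come from your own measurement, not from any API:

    import Foundation

    // Measured once per device model: the SPL that corresponds to 0 dBFS.
    // 99.0 here is only the rough figure quoted in the question.
    let calibrationOffsetDB: Float = 99.0

    // peakPowerForChannel reports dBFS in roughly -160...0; shift it to SPL.
    func estimatedSPL(fromDBFS dbFS: Float) -> Float {
        return dbFS + calibrationOffsetDB
    }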
I want to add a few bytes of data to a sound file (for example, a song). The sound file will be transmitted via radio to a receiver who uses, for example, the iPhone microphone to pick up the sound, and an application will show the original bytes of data. Preferably, it should not be audible to humans.
What is such technology called? Are there any applications that can do this?
Libraries/apps that can be used on iPhone?
This is called audio steganography. There are algorithms to do it; refer to here.
I've done some research, and it seems the way to go is:
Use low audio frequencies.
Spread the "bits" around randomly - do not use a pattern as it will be picked up by the listener. "White noise" is a good clue. The random pattern is known by the sender and receiver.
Use Fourier transform to pick up frequency and amplitude
Clean up input data.
Use checksum/redundancy-algorithms to compensate for loss.
I'm writing a prototype and am having a bit of difficulty picking up the right frequency, as it has a ~4 Hz offset (100 Hz becomes 96.x Hz when played and picked up by the microphone).
This is not the answer, but I hope it helps.
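As one illustration of the Fourier step, here is a rough sketch using the Goertzel algorithm, which measures the energy at a single frequency bin. targetHz and sampleRate are illustrative parameters, not values from the question:

    import Foundation

    // Squared magnitude of the DFT bin nearest targetHz (Goertzel algorithm).
    func goertzelPower(samples: [Float], targetHz: Float, sampleRate: Float) -> Float {
        let coeff = 2 * cos(2 * Float.pi * targetHz / sampleRate)
        var s1: Float = 0   // filter state, one sample back
        var s2: Float = 0   // filter state, two samples back
        for x in samples {
            let s0 = x + coeff * s1 - s2
            s2 = s1
            s1 = s0
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2
    }

Scanning targetHz over a small range around the expected frequency and taking the peak is one way to quantify an offset like the ~4 Hz drift described above.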
Is it possible to compare two sounds?
For example, the app already has a sound file (MP3 or any other format); is it possible to compare that static sound file with a sound recorded inside the app?
Any comments are welcome.
Regards
This forum thread has a good answer (about three down) - http://www.dsprelated.com/showmessage/103820/1.php.
The trick is to get the decoded audio from the mp3 - if they're just short 'hello' sounds, I'd store them inside the app as a wav instead of decoding them (though I've never used CoreAudio or any of the other frameworks before so mp3 decoding into memory might be easy).
When you've got your reference wav and your recorded wav, follow the steps in the post above:
1. Do whatever is necessary to convert the .wav files to their discrete-time signals: http://www.sonicspot.com/guide/wavefiles.html
2. Time-warping might or might not be necessary, depending on the difference between the two sample rates: http://en.wikipedia.org/wiki/Dynamic_time_warping
3. After time-warping, truncate both signals so that their durations are equivalent.
4. Compute the normalized energy spectral density (ESD) from the DFTs of the two signals: http://en.wikipedia.org/wiki/Power_spectrum
5. Compute the mean squared error (MSE) between the normalized ESDs of the two signals: http://en.wikipedia.org/wiki/Mean_squared_error
The MSE between the normalized ESDs of two signals is a good metric of closeness. If you have, say, 10 .wav files, and 2 of them are nearly the same but the others are not, the two that are close should have a relatively low MSE. Two perfectly identical signals will obviously have an MSE of zero. Ideally, two "equivalent" signals with different time scales (20-second human talking versus 5-second chipmunk), different energies (soft-spoken human versus yelling chipmunk), and different phases (sampling began at a slightly different instant against the continuous-time input) should still have an MSE of zero, but quantization errors inherent in DSP will yield an MSE slightly greater than zero.
http://en.wikipedia.org/wiki/Minimum_mean-square_error
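To make steps 4 and 5 concrete, here is a minimal sketch with a naive O(n^2) DFT; a real app would use Accelerate's FFT instead, and both inputs are assumed to be truncated to equal length already (step 3):

    import Foundation

    // Step 4: |X[k]|^2 for each bin, normalized so the energies sum to 1,
    // which makes the comparison independent of overall loudness.
    func normalizedESD(_ x: [Float]) -> [Float] {
        let n = x.count
        var esd = [Float](repeating: 0, count: n / 2)
        for k in 0..<(n / 2) {
            var re: Float = 0, im: Float = 0
            for t in 0..<n {
                let angle = 2 * Float.pi * Float(k) * Float(t) / Float(n)
                re += x[t] * cos(angle)
                im -= x[t] * sin(angle)
            }
            esd[k] = re * re + im * im
        }
        let total = esd.reduce(0, +)
        return total > 0 ? esd.map { $0 / total } : esd
    }

    // Step 5: mean squared error between two normalized ESDs.
    func meanSquaredError(_ a: [Float], _ b: [Float]) -> Float {
        precondition(a.count == b.count, "truncate signals to equal length first")
        var sum: Float = 0
        for (x, y) in zip(a, b) { sum += (x - y) * (x - y) }
        return sum / Float(a.count)
    }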
You should get two different MSE values, one between your male->recorded track and one between your female->recorded track. The comparison with the lowest MSE is probably the correct gender.
I confess that I've never tried to do this and it looks very hard - good luck!
I have an OpenAL sound engine in my iPhone app. When I play a sound that I have loaded, I can control its pitch.
In OpenAL, a pitch of 1.0 has no effect. If you double it to 2.0, the note plays one octave higher (12 semitones); if you halve it to 0.5, it plays one octave lower (12 semitones).
So, my original sample is playing a C. I assumed that if I divided 1 by 12 (semitones), I could get the pitch for the individual notes in that octave, but this does not seem to be the case. That makes me think that semitones are not equal steps. Is that true?
Does anyone know how I can work out the OpenAL pitch value for individual notes in an octave?
Thank you
Semitones are equal ratios, not equal differences. So, if your sample is a C, C# will be the 12th root of two. If you count semitones C = 0, C# = 1, etc., the ratio is pow(2.0, n * 1.0 / 12.0).
Works for negative numbers, too.
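In code, a minimal sketch, with n as the semitone offset from the sample's own pitch (the C in the question):

    import Foundation

    // Equal temperament: every semitone is a ratio of 2^(1/12), so the
    // OpenAL pitch for n semitones above (or below, for negative n) the
    // sample's recorded pitch is 2^(n/12).
    func openALPitch(semitoneOffset n: Int) -> Float {
        return pow(2.0, Float(n) / 12.0)
    }

    // e.g. openALPitch(semitoneOffset: 0)  == 1.0   (C, unchanged)
    //      openALPitch(semitoneOffset: 7)  ~= 1.498 (G, a fifth up)
    //      openALPitch(semitoneOffset: 12) == 2.0   (C, an octave up)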
I should note, this is not strictly true in every tuning scheme... but this is a good start. If you really care about the full complexities of musical tuning, I can find you some references.