I'm building a piece of hardware that sends data into the headphone jack, and I need a way to record short snippets and analyze them quickly (hopefully without having to save the file and reopen it for analysis). I have played around with the FFT routines in the Accelerate framework, though I don't think they're exactly what I'm looking for.
I'm wondering mostly if something like this is feasible: record a ~30 ms snippet of audio, then grab an array of floats representing the voltage (dB levels?) throughout the recording. Then I could interpret the data depending on the levels at different points in the recording. Would something like AVAudioRecorder be able to record at a resolution where I could examine every millisecond of the recording? Since this will be a repeating process, I'm hoping to keep CPU usage down as well.
This is totally doable. Use AudioSession with AudioUnits.
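To make "AudioSession with AudioUnits" concrete, here is a minimal sketch of capturing a ~30 ms snippet of microphone audio as floats with the RemoteIO unit, assuming mono 44.1 kHz input. All of the names here (`Capture`, `StartCapture`, `InputCallback`) are invented for the example, and error handling and mic-permission prompts are elided:

```objc
// Sketch: capture ~30 ms of mic input as floats for in-memory analysis.
#import <AVFoundation/AVFoundation.h>
#import <AudioToolbox/AudioToolbox.h>
#include <string.h>

enum { kSnippetFrames = 1323 };           // ~30 ms at 44.1 kHz

typedef struct {
    AudioUnit unit;
    float     samples[kSnippetFrames];    // the snippet to analyze
    UInt32    framesCaptured;
} Capture;

static Capture gCapture;

static OSStatus InputCallback(void *inRefCon,
                              AudioUnitRenderActionFlags *ioActionFlags,
                              const AudioTimeStamp *inTimeStamp,
                              UInt32 inBusNumber, UInt32 inNumberFrames,
                              AudioBufferList *ioData)
{
    Capture *cap = (Capture *)inRefCon;
    float buf[4096];                      // scratch space for this slice
    AudioBufferList list = { .mNumberBuffers = 1, .mBuffers = {{
        .mNumberChannels = 1,
        .mDataByteSize   = inNumberFrames * (UInt32)sizeof(float),
        .mData           = buf }} };

    // Pull this slice of mic samples out of the RemoteIO unit.
    OSStatus err = AudioUnitRender(cap->unit, ioActionFlags, inTimeStamp,
                                   inBusNumber, inNumberFrames, &list);
    if (err != noErr) return err;

    // Append to the snippet until ~30 ms has accumulated.
    UInt32 room = kSnippetFrames - cap->framesCaptured;
    UInt32 n = inNumberFrames < room ? inNumberFrames : room;
    memcpy(cap->samples + cap->framesCaptured, buf, n * sizeof(float));
    cap->framesCaptured += n;             // once full, analyze cap->samples
    return noErr;
}

static void StartCapture(void)
{
    [[AVAudioSession sharedInstance]
        setCategory:AVAudioSessionCategoryRecord error:NULL];
    [[AVAudioSession sharedInstance] setActive:YES error:NULL];

    AudioComponentDescription desc = {
        .componentType         = kAudioUnitType_Output,
        .componentSubType      = kAudioUnitSubType_RemoteIO,
        .componentManufacturer = kAudioUnitManufacturer_Apple };
    AudioComponentInstanceNew(AudioComponentFindNext(NULL, &desc),
                              &gCapture.unit);

    UInt32 one = 1, zero = 0;             // mic in on bus 1, no speaker out
    AudioUnitSetProperty(gCapture.unit, kAudioOutputUnitProperty_EnableIO,
                         kAudioUnitScope_Input, 1, &one, sizeof(one));
    AudioUnitSetProperty(gCapture.unit, kAudioOutputUnitProperty_EnableIO,
                         kAudioUnitScope_Output, 0, &zero, sizeof(zero));

    AudioStreamBasicDescription fmt = {   // mono 32-bit float LPCM
        .mSampleRate = 44100.0, .mFormatID = kAudioFormatLinearPCM,
        .mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,
        .mFramesPerPacket = 1, .mChannelsPerFrame = 1,
        .mBitsPerChannel = 32, .mBytesPerPacket = 4, .mBytesPerFrame = 4 };
    AudioUnitSetProperty(gCapture.unit, kAudioUnitProperty_StreamFormat,
                         kAudioUnitScope_Output, 1, &fmt, sizeof(fmt));

    AURenderCallbackStruct cb = { InputCallback, &gCapture };
    AudioUnitSetProperty(gCapture.unit,
                         kAudioOutputUnitProperty_SetInputCallback,
                         kAudioUnitScope_Global, 1, &cb, sizeof(cb));

    AudioUnitInitialize(gCapture.unit);
    AudioOutputUnitStart(gCapture.unit);
}
```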
I am using several instances of AVAudioPlayer to play overlapping sounds, and getting harsh distortion as a result. Here is my situation... I have an app with several piano keys. Upon touching a key, it plays a note. If I touch 6-7 keys in rapid succession, my app plays a 2 second .mp3 clip for each key. Since I am using separate audio streams, the sounds overlap (which they should), but the result is lots of distortion, pops, or buzzing sounds!
How can I make the overlapping audio crisp and clean? I recorded the piano sounds myself and they are very nice, clean, noise-free recordings, and I don't understand why the overlapping streams sound so bad. Even at low volume or through headphones, the quality is just very degraded.
Any suggestions are appreciated!
Couple of things:
Clipping
The "buzzing" you describe is almost assuredly clipping—the result of adding two or more waveforms together and the resulting, combined waveform having its peaks cut off—clipped—at unity.
When you're designing virtual synthesizers with polyphony, you have to take into consideration how many voices will likely play at once and provide headroom, typically by attenuating each voice.
In practice, you can achieve this with AVAudioPlayer by setting each instance's volume property to 0.316, giving you 10 dB of headroom (enough for 8 simultaneous voices).
The obvious problem here is that when the user plays a single voice, it may seem too quiet. You'll want to experiment with various headroom values against typical user behavior and adjust to taste. (It's also signal-dependent: your piano samples may clip more or less easily than other waveforms, depending on their recorded amplitude.)
Depending on your app's intended user, you might consider making this headroom parameter available to them.
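As a concrete illustration of the attenuation above, here is a hypothetical helper (`PlayerForNote` is a made-up name) that builds each AVAudioPlayer with 10 dB of headroom; the 0.316 figure is just that headroom converted to linear gain, 10^(-10/20) ≈ 0.316:

```objc
// Sketch: attenuate each AVAudioPlayer instance to leave headroom.
#import <AVFoundation/AVFoundation.h>
#include <math.h>

static const float kHeadroomDB = 10.0f;  // tune to taste, per the caveats above

AVAudioPlayer *PlayerForNote(NSURL *sampleURL)
{
    AVAudioPlayer *player =
        [[AVAudioPlayer alloc] initWithContentsOfURL:sampleURL error:NULL];
    player.volume = powf(10.0f, -kHeadroomDB / 20.0f);  // ≈ 0.316
    [player prepareToPlay];
    return player;
}
```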
Discontinuities/Performance
The pops and clicks you're hearing may not be a result of clipping, but rather a side effect of using mp3 as your audio file format. This is a Bad Idea™. iOS devices have only one hardware stereo mp3 decoder, so as soon as you spin up a second, third, etc. voice, iOS has to decode the mp3 audio data on the CPU. Depending on the device, you can only decode a couple of audio streams this way before suffering underflow discontinuities (cut that in half for stereo files, obviously): the CPU simply can't decode enough samples for the output audio stream in time, so you hear nasty pops and clicks.
For sample playback, you want an LPCM audio encoding (like WAV or AIFF) or something extremely cheap to decode, like IMA4. One strategy I've used in every app I've shipped with these types of audio samples is to ship the samples in mp3 or AAC format, but decode them once to an LPCM file in the app's sandbox the first time the app launches. That way you get the benefit of a smaller app bundle and low CPU utilization/higher polyphony at runtime (with a small hit to the first-run experience while the user waits for the samples to be decoded).
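A rough sketch of that decode-once strategy using ExtAudioFile, which converts compressed audio to LPCM as it reads; the 16-bit mono 44.1 kHz target format and the CAF container are assumptions, and error checking is omitted:

```objc
// Sketch: decode a compressed file (mp3/AAC) once into an LPCM CAF file.
#import <AudioToolbox/AudioToolbox.h>

static void DecodeToLPCM(NSURL *srcURL, NSURL *dstURL)
{
    ExtAudioFileRef src = NULL, dst = NULL;
    ExtAudioFileOpenURL((__bridge CFURLRef)srcURL, &src);

    AudioStreamBasicDescription fmt = {   // 16-bit mono LPCM, 44.1 kHz
        .mSampleRate = 44100.0, .mFormatID = kAudioFormatLinearPCM,
        .mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
        .mFramesPerPacket = 1, .mChannelsPerFrame = 1,
        .mBitsPerChannel = 16, .mBytesPerPacket = 2, .mBytesPerFrame = 2 };

    // Ask the reader to hand us LPCM regardless of the source encoding.
    ExtAudioFileSetProperty(src, kExtAudioFileProperty_ClientDataFormat,
                            sizeof(fmt), &fmt);
    ExtAudioFileCreateWithURL((__bridge CFURLRef)dstURL, kAudioFileCAFType,
                              &fmt, NULL, kAudioFileFlags_EraseFile, &dst);

    enum { kFramesPerRead = 4096 };
    SInt16 buf[kFramesPerRead];
    AudioBufferList list = { .mNumberBuffers = 1, .mBuffers = {{
        .mNumberChannels = 1, .mDataByteSize = sizeof(buf), .mData = buf }} };

    UInt32 frames;
    do {                                   // pump decoded frames to the CAF
        frames = kFramesPerRead;
        list.mBuffers[0].mDataByteSize = sizeof(buf);
        ExtAudioFileRead(src, &frames, &list);
        if (frames > 0) ExtAudioFileWrite(dst, frames, &list);
    } while (frames > 0);

    ExtAudioFileDispose(src);
    ExtAudioFileDispose(dst);
}
```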
My understanding is that AVAudioPlayer isn't meant to be used like that. In general, when combining lots of sounds into a single output like that, you want to open a single stream and mix the sounds yourself.
What you are encountering is clipping — it's occurring because the combined volumes of the sounds you're playing are exceeding the maximum possible volume. You need to decrease the volume of these sounds when there's more than one playing at a time.
I have a series of sounds that a user will play, rearrange, and edit etc. while using my app. When the user is finished, I want them to be able to save their work and record it to an mp3.
I don't want to play it through speakers and record it with the mic since that will result in low sound quality and interference. I cannot think of any ways of doing this that doesn't require extra hardware and/or a computer.
How can I do this using just their device?
Well, I would say it can't be done with AVFoundation.
My suggestion is to use Audio Units and transform all your interactions into an audio graph. At some point you set a render notify on the RemoteIO unit, so every time it renders sound to the speakers you get a callback where you can write those frames/packets down to a file.
I would also suggest AAC (m4a) over MP3. I am not very fond of MP3, and as far as I know the SDK does not provide encoding to MP3, probably due to licensing issues. I could be wrong, though. Check the sample code below; it's probably the best sample code on Audio Units you will find on the web.
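Here is roughly what that render-notify looks like; `gRecordFile` is a placeholder for an ExtAudioFileRef you opened ahead of time (and primed with one `ExtAudioFileWriteAsync(file, 0, NULL)` call before rendering starts, so the async write machinery is set up off the real-time thread):

```objc
// Sketch: tap the RemoteIO output via a render notify and append it to a file.
#import <AudioToolbox/AudioToolbox.h>

static ExtAudioFileRef gRecordFile;  // opened with ExtAudioFileCreateWithURL

static OSStatus RenderNotify(void *inRefCon,
                             AudioUnitRenderActionFlags *ioActionFlags,
                             const AudioTimeStamp *inTimeStamp,
                             UInt32 inBusNumber, UInt32 inNumberFrames,
                             AudioBufferList *ioData)
{
    // Post-render: ioData now holds exactly what is about to hit the speakers.
    if (*ioActionFlags & kAudioUnitRenderAction_PostRender) {
        ExtAudioFileWriteAsync(gRecordFile, inNumberFrames, ioData);
    }
    return noErr;
}

// Attached once, after the RemoteIO unit is set up:
//   AudioUnitAddRenderNotify(remoteIOUnit, RenderNotify, NULL);
```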
AudioGraph by Tom Zic
So I'm making an app, and what I need to do is detect when there is a sound (for example, when someone starts talking) and then record it.
I found this tutorial http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/ but it starts recording at the beginning and then detects the sound based on the recording.
Is there any other way to detect a sound without actually starting the recorder first? What I thought of was having two recorders: one for detection and one for actually recording the sound. Another solution would be to edit (trim) the sound after it's recorded.
Are these approaches somehow standard or is there a better way to detect sound?
Thanks.
Edit: if anyone ever reads this, I also found this: http://bonkel.wordpress.com/2010/03/03/frequency-detection-using-fourier-transform/
If you don't mind getting a little dirty, you could go down to a lower level, to CoreAudio, and read data out of the input buffers until you see values exceeding your threshold, then start recording those input buffers (or trigger a higher-level recording call). You can similarly stop recording after a period of silence.
If you use CoreAudio, you have a lot of control over what you record. You could, pretty easily, filter out background noise, or add beeps to signify when the recording stopped due to silence, and even add markers to use later to match time to the recording.
CoreAudio does require you to do more work. You will have to read the microphone buffers on a timely basis and either save or discard the data pretty quickly in order not to drop any sound data. This isn't that hard, as the devices have plenty of CPU power to do that and other tasks at the same time - you just have to have a good grasp of CoreAudio.
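As a sketch of that thresholding logic (the level values and the start/stop hooks are made up for illustration), something like this could run over each input buffer you read:

```objc
// Sketch: open the "gate" on a loud peak, close it after sustained silence.
#include <math.h>
#include <stdbool.h>

#define kOpenThreshold 0.05f  // start recording above this peak level
#define kQuietBuffers  50     // stop after this many consecutive quiet buffers

static bool gRecording = false;
static int  gQuietRun  = 0;

void ExamineBuffer(const float *samples, unsigned count)
{
    float peak = 0.0f;
    for (unsigned i = 0; i < count; i++)
        peak = fmaxf(peak, fabsf(samples[i]));

    if (!gRecording && peak >= kOpenThreshold) {
        gRecording = true;        // sound detected: begin keeping buffers
        gQuietRun  = 0;
        // StartSaving();  (hypothetical: open a file / mark a buffer offset)
    } else if (gRecording) {
        gQuietRun = (peak < kOpenThreshold) ? gQuietRun + 1 : 0;
        if (gQuietRun >= kQuietBuffers) {
            gRecording = false;   // sustained silence: stop recording
            // StopSaving();
        }
    }
}
```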
There are plenty of Apple CoreAudio samples that can guide you. The WWDC 2010 CoreAudio sessions are also a must-see.
You could use either the Audio Queue or the Core Audio (RemoteIO Audio Unit) API. Unless your app requires low latency, the Audio Queue API may be simpler to use.
You need to start the recording API to detect any sound, but you don't need to save everything you get from the recording callback to a file.
Specifically, I just want to record something, reverse it, and play it back. I've looked through the apple docs and couldn't find anything about editing audio. Is it possible?
Yes, it is definitely possible. Last I checked, the Apple Core Audio docs were not very good, but it has been a few months since I worked with them. Here are the steps I would follow (a sketch of step 2 follows the list):
1. Record the audio sample.
2. Reverse the audio by looping through the first half of the array and swapping each value with the one equidistant from the end of the array.
3. Play the resulting audio clip.
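Step 2 is the easy part once the samples are in memory. A minimal sketch, assuming the recording is mono 16-bit LPCM held as an array of SInt16 (interleaved stereo would need to swap whole two-sample frames instead of individual samples):

```objc
// Sketch: reverse a mono SInt16 buffer in place, swapping ends toward the middle.
#import <Foundation/Foundation.h>

void ReverseSamples(SInt16 *samples, NSUInteger count)
{
    if (count < 2) return;
    for (NSUInteger i = 0, j = count - 1; i < j; i++, j--) {
        SInt16 tmp = samples[i];  // swap sample i with its mirror from the end
        samples[i] = samples[j];
        samples[j] = tmp;
    }
}
```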
Quite frankly, the first step is probably the hardest. Here is a decent article about doing audio on the iPhone, including recording; make sure you look at all of the different parts of the article. Here is another article about recording sound on the iPhone, but using a different framework. There are really several ways to go about recording on the iPhone, though last I checked, if you want to play audio while you are recording you have to use RemoteIO.
Edit:
If you would like to use RemoteIO(which I preferred), then this site is pretty helpful for getting started with it. Also, the aurioTouch sample program that Apple provides is immensely helpful (though more than you want).
If you don't need RemoteIO (it can be a major pain, though it is more low-level and thus more flexible), then try the SpeakHere sample program. It is made just to record and play back. However, I just looked at it, and it writes the recording to a file rather than a buffer, which isn't what you want. I would recommend going with RemoteIO for that reason (unless you can find a way to have it write to a buffer instead).
I'm looking to create an app that emulates a physical instrument. I've got audio samples but I want to be able to increase the pitch/frequency dynamically so I don't have to load from too many files.
Any idea which audio API will be able to do this? I reckon either OpenAL or Audio Queue Services but am not sure which is suitable. Any links to guides/sample code is also much appreciated.
Thanks in advance.
I went down this road in 2009, trying Audio Toolbox, Audio Queue Services, OpenAL, and finally settling on the RemoteIO AudioUnit.
Audio Toolbox is fine for basic triggered sound effects, but it wasn't able to change frequencies or loop samples.
Audio Queue Services can loop samples, but the only way I could find to adjust the playback frequency of a sample was to re-read the data from the file, which was very painful. Plus, the framework is tremendously cumbersome; I'd only use it if I were trying to stream something off the Internet.
OpenAL was a godsend. I was up and running with it in under an hour, after getting my hands on the no-longer-available-from-Apple "CrashLanding" iPhone sample app. I found OpenAL ideally suited to games or even a musical instrument: samples could be pre-loaded, adjusting the frequency was easy, and looping was no problem. The deal-breaker for me was that starting and stopping a looped sample would produce a nasty "pop" almost every time. Also, the built-in 3D positional audio mixer was a bit too CPU-intensive for my liking.
If your instrument does not use looped samples, I'd suggest trying the OpenAL route first - the learning curve is much less intimidating. Try to track down "SoundEngine.h", "CrashLanding" or "TouchFighter", or check out the following link:
http://benbritten.com/blog/2008/11/06/openal-sound-on-the-iphone/
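For a taste of how simple the OpenAL route is, here is roughly what "adjusting the frequency was easy" looks like: AL_PITCH resamples the attached buffer, so 2.0f plays an octave up and 0.5f an octave down. `source` is assumed to be a source you have already generated with alGenSources and attached a buffer to:

```objc
// Sketch: play an already-loaded OpenAL source looped, at an arbitrary pitch.
#include <OpenAL/al.h>

void PlayLoopedAtPitch(ALuint source, float pitch)
{
    alSourcef(source, AL_PITCH, pitch);      // 1.0f = original speed/pitch
    alSourcei(source, AL_LOOPING, AL_TRUE);  // loop until stopped
    alSourcePlay(source);
}
```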
Since looped samples were a requirement for me, I finally settled on AudioUnits (which, on the iPhone, means "RemoteIO" if you want to do input or output). It was tremendously difficult to implement, and very similar to Audio Queue Services in that the core of your implementation lives inside a "buffer callback" that is called several times per second to fill a buffer of outbound audio with raw SInt16 values.
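To make that concrete, here is a minimal sketch of the shape of such a render callback, with a plain sine wave standing in for the looped-sample mixing (the author's actual instrument code is in the GitHub link below); the mono SInt16 stream format and 44.1 kHz rate are assumptions:

```objc
// Sketch: a RemoteIO render callback filling ioData with raw SInt16 samples.
#import <AudioToolbox/AudioToolbox.h>
#include <math.h>

static double gPhase = 0.0;

static OSStatus RenderCallback(void *inRefCon,
                               AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp *inTimeStamp,
                               UInt32 inBusNumber, UInt32 inNumberFrames,
                               AudioBufferList *ioData)
{
    SInt16 *out = (SInt16 *)ioData->mBuffers[0].mData;
    const double freq = 440.0, sampleRate = 44100.0;

    for (UInt32 i = 0; i < inNumberFrames; i++) {
        out[i] = (SInt16)(sin(gPhase) * 0.25 * 32767.0);  // -12 dB sine
        gPhase += 2.0 * M_PI * freq / sampleRate;
        if (gPhase > 2.0 * M_PI) gPhase -= 2.0 * M_PI;
    }
    return noErr;
}

// Installed on the RemoteIO unit's input scope, bus 0:
//   AURenderCallbackStruct cb = { RenderCallback, NULL };
//   AudioUnitSetProperty(unit, kAudioUnitProperty_SetRenderCallback,
//                        kAudioUnitScope_Input, 0, &cb, sizeof(cb));
```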
Ultimately, I got my instrument working beautifully with multi-note polyphony, looped samples, no popping, and minimal latency.
Unfortunately, RemoteIO is not well documented. Michael Tyson was one of the first in the field to write about RemoteIO at length, and his posts (and the comments) were very useful to me:
http://michael.tyson.id.au/2008/11/04/using-remoteio-audio-unit/
Good luck!
Edited years later: I've open-sourced the RemoteIO/AudioUnits code I alluded to above: https://github.com/glenn-barnett/hexaphone/blob/master/Classes/Instrument.m - apologies for the mess, I hope to get some time to clean up the code and comments.
Try creating an Audio Unit. I'm doing something similar, and an AU worked well for me. Initially I used an audio queue, as it was simpler (higher level?) and synchronous; however, it was lacking in responsiveness, so I dumped it for the Audio Unit.
It sounds a bit like you're essentially recreating the wavetable synthesis method of playing MIDI files. You might be able to find a MIDI synthesizer for the iPhone, then use your audio samples to build a wavetable set. Any time you want to play tones, you would simply send the MIDI event into the iPhone MIDI synth with your loaded wavetable set.
Another option now is AUSampler.
http://developer.apple.com/library/mac/#technotes/tn2283/_index.html
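For the curious, here is a rough sketch of how AUSampler gets used, following the pattern in that tech note: create it as a music-device audio unit (type kAudioUnitType_MusicDevice, subtype kAudioUnitSubType_Sampler) in an AUGraph, load an .aupreset that maps your samples across the keyboard, and send MIDI note events. The graph setup is omitted, and `samplerUnit`/`presetURL` are placeholders:

```objc
// Sketch: load an .aupreset into AUSampler and trigger a note via MIDI.
#import <AudioToolbox/AudioToolbox.h>

static void LoadPresetAndPlay(AudioUnit samplerUnit, NSURL *presetURL)
{
    // Load an .aupreset (built in AU Lab) describing the sample key map.
    NSData *data = [NSData dataWithContentsOfURL:presetURL];
    CFPropertyListRef preset =
        CFPropertyListCreateWithData(NULL, (__bridge CFDataRef)data,
                                     kCFPropertyListImmutable, NULL, NULL);
    AudioUnitSetProperty(samplerUnit, kAudioUnitProperty_ClassInfo,
                         kAudioUnitScope_Global, 0, &preset, sizeof(preset));

    // Note-on for middle C (MIDI note 60) at velocity 100 on channel 0.
    MusicDeviceMIDIEvent(samplerUnit, 0x90, 60, 100, 0);
}
```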