I'm making an app where, when someone starts talking for example, I need to detect that there is sound and then record it.
I found this tutorial http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/ but it starts recording right at the beginning and then detects the sound based on what is being recorded.
Is there any other way to detect a sound without actually starting the recorder first? What I thought of was having two recorders, one for detection and one for actually recording the sound. Another solution would be to edit (trim) the sound after it's recorded.
Are these approaches standard in any way, or is there a better way to detect sound?
Thanks.
Edit: if anyone ever reads this, I also found this: http://bonkel.wordpress.com/2010/03/03/frequency-detection-using-fourier-transform/
If you don't mind getting a little dirty, you could go down to a lower level, to Core Audio, and read data out of the input buffers until you see values exceeding your threshold, then start recording those input buffers (or trigger a higher-level recording call). You can similarly stop recording after a period of silence.
If you use CoreAudio, you have a lot of control over what you record. You could, pretty easily, filter out background noise, or add beeps to signify when the recording stopped due to silence, and even add markers to use later to match time to the recording.
CoreAudio does require you to do more work. You will have to read the microphone buffers on a timely basis and either save or discard the data pretty quickly in order not to drop any sound data. This isn't that hard, as the devices have plenty of CPU power to do that and other tasks at the same time - you just have to have a good grasp of CoreAudio.
There are plenty of Apple Core Audio samples that can guide you. The WWDC 2010 Core Audio sessions are also a must-see.
You could use either the Audio Queue or the Core Audio (RemoteIO Audio Unit) API. Unless your app requires low latency, the Audio Queue API may be simpler to use.
You need to start the recording API to detect any sound, but you don't need to save everything you get from the recording callback to a file.
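As a rough illustration of that "start the recording API but only keep what crosses your threshold" idea, here is a minimal sketch using AVAudioEngine's input tap, a higher-level route than raw Core Audio or an Audio Queue. The 0.02 threshold and the output file name are placeholders, and AVAudioSession setup plus microphone permission are omitted.

```swift
import AVFoundation

// Sketch only: level-gated capture via AVAudioEngine's input tap.
// (AVAudioSession category/permission setup omitted for brevity.)
let engine = AVAudioEngine()
let input = engine.inputNode
let format = input.outputFormat(forBus: 0)

do {
    let outURL = FileManager.default.temporaryDirectory.appendingPathComponent("capture.caf")
    let outFile = try AVAudioFile(forWriting: outURL, settings: format.settings)
    var isRecording = false

    input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
        guard let samples = buffer.floatChannelData?[0] else { return }
        let frameCount = Int(buffer.frameLength)

        // Root-mean-square level of this buffer.
        var sum: Float = 0
        for i in 0..<frameCount { sum += samples[i] * samples[i] }
        let rms = (sum / Float(max(frameCount, 1))).squareRoot()

        if rms > 0.02 { isRecording = true }   // sound detected: start keeping audio
        if isRecording {
            try? outFile.write(from: buffer)   // quiet buffers before detection were simply discarded
        }
    }

    try engine.start()
} catch {
    print("Audio setup failed: \(error)")
}
```

Stopping after a period of silence works the same way in reverse: track how long the RMS has stayed below the threshold and remove the tap (or close the file) once that exceeds your chosen timeout.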
Related
I have an audio app in which all of the sound-generating work is done by Pure Data (using libpd).
I've coded a special sequencer in Swift which controls the start/stop playback of multiple sequences, played by the synth engines in Pure Data.
Until now, I've completely avoided using Core Audio or AVFoundation for any aspect of my app, because I know nothing about them, and they both seem to require C or Objective-C coding, which I know almost nothing about.
However, I've been told in a previous Q&A on here that I need to use Core Audio or AVFoundation to get accurate timing. Without it, I've tried everything else, and the timing is totally messed up (laggy, jittery).
All of the tutorials and books on Core Audio seem overwhelmingly broad and deep to me. If all I need from one of these frameworks is accurate timing for my sequencer, how do you suggest I achieve this as someone who is a total novice to Core Audio and Objective-C, but otherwise has a 95% finished audio app?
If your sequencer is Swift code that depends on being called just in time to push audio, it won't work with good timing accuracy; you simply can't get the timing you need that way.
Core Audio uses a real-time pull-model (which excludes Swift code of any interesting complexity). AVFoundation likely requires you to create your audio ahead of time, and schedule buffers. An iOS app needs to be designed nearly from the ground up for one of these two solutions.
Added: If your existing code can generate audio samples a bit ahead of time, enough to statistically cover using a jittery OS timer, you can schedule this pre-generated output to be played a few milliseconds later (e.g. when pulled at the correct sample time).
AudioKit is an open source audio framework that provides Swift access to Core Audio services. It includes a Core Audio based sequencer, and there is plenty of sample code available in the form of Swift Playgrounds.
The AudioKit AKSequencer class has the transport controls you need. You can add MIDI events to your sequencer instance programmatically, or read them from a file. You could then connect your sequencer to an AKCallbackInstrument which can execute code upon receiving MIDI noteOn and noteOff commands, which might be one way to trigger your generated audio.
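For example, here is a rough sketch of that AKSequencer-plus-AKCallbackInstrument wiring, based on the AudioKit 4.x API. The class was later renamed AKAppleSequencer and the callback signature has shifted between AudioKit versions, so treat the exact names as assumptions and check the docs for the version you use; the note numbers and tempo are arbitrary.

```swift
import AudioKit

// Sketch based on AudioKit 4.x (AKSequencer was later renamed AKAppleSequencer).
let sequencer = AKSequencer()
let callbackInstrument = AKCallbackInstrument()

let track = sequencer.newTrack()
track?.setMIDIOutput(callbackInstrument.midiIn)

// Execute your own code (e.g. trigger a libpd synth) on each MIDI event.
callbackInstrument.callback = { status, noteNumber, velocity in
    if status == .noteOn {
        // start a voice in your Pure Data patch
    } else if status == .noteOff {
        // stop that voice
    }
}

track?.add(noteNumber: 60, velocity: 100,
           position: AKDuration(beats: 0), duration: AKDuration(beats: 1))
sequencer.setTempo(120)
sequencer.enableLooping()
sequencer.play()
```

The point of this arrangement is that the callback is driven by the Core Audio based sequencer rather than by a Swift timer, which is where the timing accuracy comes from.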
I'm building a piece of hardware that sends data into the headphone jack, and I need a way to record short snippets and analyze them quickly (hopefully without having to save the file and reopen it for analysis). I have played around with FFTs and the Accelerate framework, though I don't think that's exactly what I'm looking for.
I'm wondering mostly if something like this is feasible: record a ~30 ms snippet of audio, then grab an array of floats representing the voltage (dB levels?) throughout the recording. I could then interpret the data depending on the levels at different points in the recording. Would something like AVAudioRecorder be able to record at a resolution that lets me examine every millisecond of the recording? Since this will be a repeating process, I'm hoping to keep CPU usage down as well.
This is totally doable. Use AudioSession with AudioUnits.
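For the analysis side, here is a rough sketch that slices the float samples of a captured AVAudioPCMBuffer (for example, one delivered by an input tap or an Audio Unit wrapper) into 1 ms windows and computes an RMS level for each; at 44.1 kHz one millisecond is about 44 samples. The buffer source and function name are assumptions, not part of the answer above.

```swift
import AVFoundation

// Sketch: RMS level per ~1 ms window of a captured buffer.
// Assumes a mono (or first-channel) float buffer such as one from an input tap.
func levelsPerMillisecond(_ buffer: AVAudioPCMBuffer) -> [Float] {
    guard let samples = buffer.floatChannelData?[0] else { return [] }
    let samplesPerMs = max(1, Int(buffer.format.sampleRate / 1000.0)) // ~44 at 44.1 kHz
    let frameCount = Int(buffer.frameLength)

    var levels: [Float] = []
    var start = 0
    while start + samplesPerMs <= frameCount {
        var sum: Float = 0
        for i in start..<(start + samplesPerMs) {
            sum += samples[i] * samples[i]
        }
        levels.append((sum / Float(samplesPerMs)).squareRoot())
        start += samplesPerMs
    }
    return levels
}
```

A ~30 ms buffer then yields roughly 30 values that you can interpret however your signalling scheme requires.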
I'm trying to find the best way to play a seamless loop of audio, that the user can switch out for another at the shortest possible notice, with a decent number (30-150) of very short loops being available. Will OpenAL be sufficient for this, or do I need to delve into Audio Units? The Apple Documentation says that for real-time feedback like an instrument, Audio Units is the right choice.
I just want to get the community's opinion on this, and any links and sample projects would be greatly appreciated.
You can use AVAudioPlayer to seamlessly loop a compressed audio file (numberOfLoops = -1). I suggest using IMA4-encoded CAF files, as these are rumored to benefit from hardware decompression (saving CPU cycles for other things).
To keep file size down, you can lower the bit rate (try 96 kbps) and/or use mono.
Note that AVAudioPlayer does not allow you to change the tempo or frequency of playback.
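A minimal looping sketch along those lines; the asset name is a placeholder, and you would typically convert the file to IMA4 CAF beforehand (for example with afconvert, as noted in the comment).

```swift
import AVFoundation

// Minimal seamless-loop sketch; "loop.caf" is a placeholder asset name,
// e.g. produced with: afconvert -f caff -d ima4 input.wav loop.caf
guard let url = Bundle.main.url(forResource: "loop", withExtension: "caf") else {
    fatalError("missing audio asset")
}

do {
    // Keep a strong reference to the player in a real app, or it will be
    // deallocated and playback will stop.
    let player = try AVAudioPlayer(contentsOf: url)
    player.numberOfLoops = -1   // -1 loops indefinitely
    player.prepareToPlay()      // reduces the delay before playback starts
    player.play()
} catch {
    print("Could not create player: \(error)")
}
```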
This probably doesn't really answer your question, but have you ever looked at Finch?
Just looking at the source might provide some pointers.
Johannes
If you need to be able to switch to another audio sample with no playback delay, you'll need to use OpenAL. AVAudioPlayer has a delay before it starts playing.
You can minimize that delay by calling prepareToPlay, but it won't always eliminate the delay completely. As well, if you have 30 to 150 samples that the user selects for playback, you won't know beforehand which samples need to be preloaded.
Here's a rundown of the pros and cons between OpenAL and AVAudioPlayer: http://kstenerud.github.com/ObjectAL-for-iPhone/documentation/index.html#choosing_sec
I ended up using Cocos2D's audio library for this, and it was far more performant than I'd expected. AVAudioPlayer ended up being a good bit lower-level than I'd needed.
I want to record the sound of my iPhone app. For example, someone plays something on an iPhone instrument, and afterwards you can listen to it.
Is it possible without the microphone?
Do you mean an app you build yourself? If yes, you could just save the rendered waveform (maybe encoded/compressed to save space) for later playback. (See Extended Audio File Services; it can write to a file the same AudioBufferList that you would render to the RemoteIO Audio Unit when playing audio in your instrument app.)
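A rough sketch of that Extended Audio File Services route follows; the 16-bit stereo 44.1 kHz stream description and the file path are assumptions, and error checking is omitted.

```swift
import AudioToolbox

// Sketch: create a CAF file and append the same buffers you hand to the
// RemoteIO unit. The format and path here are assumptions.
var asbd = AudioStreamBasicDescription(
    mSampleRate: 44_100,
    mFormatID: kAudioFormatLinearPCM,
    mFormatFlags: kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
    mBytesPerPacket: 4,
    mFramesPerPacket: 1,
    mBytesPerFrame: 4,
    mChannelsPerFrame: 2,
    mBitsPerChannel: 16,
    mReserved: 0)

var fileRef: ExtAudioFileRef?
let url = URL(fileURLWithPath: NSTemporaryDirectory() + "performance.caf") as CFURL
ExtAudioFileCreateWithURL(url, kAudioFileCAFType, &asbd, nil,
                          AudioFileFlags.eraseFile.rawValue, &fileRef)

// Then, inside the render cycle, after filling an AudioBufferList `bufferList`
// with `frameCount` frames for the RemoteIO unit:
//
//     ExtAudioFileWrite(fileRef!, frameCount, &bufferList)
//
// and when the performance ends:
//
//     ExtAudioFileDispose(fileRef!)
```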
[Edit: removed comments on recording third-party app audio output ...]
With AVFoundation, which you are currently using, you're always working at the level of sound files; your code never sees the actual audio signal. Thus, you can't 'grab' the audio signal that your app generates when it is used. AVAudioPlayer also does not provide any means of getting at the final signal, and if you're using multiple instances of AVAudioPlayer to play several sounds at the same time, you wouldn't be able to get at the mixed signal either.
Alas, you probably need to use Core Audio, which is a much lower-level interface.
I'd like to suggest an alternative approach: Instead of recording the audio output, why not record the sequence of actions together with their time which lead to the audio being played? Write this sequence of events to a file and read it back in to reproduce the 'performance' - it's a bit like your own MIDI sequencer :)
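A bare-bones sketch of that idea follows; all of the type and method names are made up for illustration, and the point is just to capture timestamped actions and replay them later.

```swift
import Foundation

// Hypothetical event log: record what the user did and when, then replay it.
struct NoteEvent: Codable {
    let time: TimeInterval   // seconds since the performance started
    let note: Int
    let isNoteOn: Bool
}

final class PerformanceRecorder {
    private var events: [NoteEvent] = []
    private var startTime = Date()

    func begin() {
        events.removeAll()
        startTime = Date()
    }

    // Call this from the same code path that triggers your sounds.
    func record(note: Int, isNoteOn: Bool) {
        events.append(NoteEvent(time: Date().timeIntervalSince(startTime),
                                note: note, isNoteOn: isNoteOn))
    }

    // Replay by scheduling the same actions at the same offsets.
    func replay(trigger: @escaping (Int, Bool) -> Void) {
        for event in events {
            DispatchQueue.main.asyncAfter(deadline: .now() + event.time) {
                trigger(event.note, event.isNoteOn)
            }
        }
    }
}
```

Since the events are Codable, you can also persist them with JSONEncoder and load them back in to reproduce the performance later.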
I'm looking to create an app that emulates a physical instrument. I've got audio samples but I want to be able to increase the pitch/frequency dynamically so I don't have to load from too many files.
Any idea which audio API will be able to do this? I reckon either OpenAL or Audio Queue Services but am not sure which is suitable. Any links to guides/sample code is also much appreciated.
Thanks in advance.
I went down this road in 2009, trying Audio Toolbox, Audio Queue Services, and OpenAL, before finally settling on the RemoteIO Audio Unit.
Audio Toolbox is fine for basic triggered sound effects, but it wasn't able to change frequencies or loop samples.
Audio Queue Services can loop samples, but the only way I could find to adjust the playback frequency of a sample was to re-read the data from the file -- very painful. Plus, the framework is tremendously cumbersome - I'd only use it if I was trying to stream something off the Internet.
OpenAL was a godsend - I was up and running with it in under an hour, after getting my hands on the no-longer-available-from-Apple "CrashLanding" iPhone sample app. I found OpenAL to be ideally suited to games or even a musical instrument: samples could be pre-loaded, adjusting the frequency was easy, and looping was no problem. The deal-breaker for me was that starting and stopping a looped sample would result in a nasty "pop" almost every time. Also, the built-in 3D positional audio mixer was a bit too CPU-intensive for my liking.
If your instrument does not use looped samples, I'd suggest trying the OpenAL route first - the learning curve is much less intimidating. Try to track down "SoundEngine.h", "CrashLanding" or "TouchFighter", or check out the following link:
http://benbritten.com/blog/2008/11/06/openal-sound-on-the-iphone/
Since looped samples were a requirement for me, I finally settled on Audio Units (which, on the iPhone, means "RemoteIO" if you want to do input or output). It was tremendously difficult to implement - very similar to Audio Queue Services, in that the core of your implementation lives inside a "buffer callback" that is called several times per second to fill a buffer of outbound audio with raw SInt16 values.
Ultimately, I got my instrument working beautifully with multi-note polyphony, looped samples, no popping, and minimal latency.
Unfortunately, RemoteIO is not well documented. Michael Tyson was one of the first in the field to write about RemoteIO at length, and his posts (and the comments) were very useful to me:
http://michael.tyson.id.au/2008/11/04/using-remoteio-audio-unit/
Good luck!
Edited years later: I've open-sourced the RemoteIO/AudioUnits code I alluded to above: https://github.com/glenn-barnett/hexaphone/blob/master/Classes/Instrument.m - apologies for the mess, I hope to get some time to clean up the code and comments.
Try creating an Audio Unit. I'm doing something similar, and an AU worked well for me.
Initially I used an Audio Queue, as it was simpler (higher level?) and synchronous; however, it was lacking in responsiveness, so I dumped it for the Audio Unit.
It sounds a bit like you're essentially recreating the wavetable-synthesis approach to playing MIDI files. You might be able to find a MIDI synthesizer for the iPhone that you can use, then build a wavetable set from your audio samples. Any time you want to play tones, you would simply send the MIDI event into the iPhone MIDI synth with your loaded wavetable set.
Another option now is AUSampler.
http://developer.apple.com/library/mac/#technotes/tn2283/_index.html
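For reference, here is a small sketch of driving the AUSampler through AVFoundation's wrapper, AVAudioUnitSampler. The SoundFont name and the note values are placeholders, not anything from the tech note above.

```swift
import AVFoundation
import AudioToolbox   // for the kAUSampler_Default* bank constants

// Sketch: load a sample set into the AUSampler and trigger notes;
// "instrument.sf2" is a placeholder asset name.
let engine = AVAudioEngine()
let sampler = AVAudioUnitSampler()
engine.attach(sampler)
engine.connect(sampler, to: engine.mainMixerNode, format: nil)

do {
    if let bankURL = Bundle.main.url(forResource: "instrument", withExtension: "sf2") {
        try sampler.loadSoundBankInstrument(at: bankURL,
                                            program: 0,
                                            bankMSB: UInt8(kAUSampler_DefaultMelodicBankMSB),
                                            bankLSB: UInt8(kAUSampler_DefaultBankLSB))
    }
    try engine.start()
    sampler.globalTuning = 100                              // optional: shift pitch by 100 cents (one semitone)
    sampler.startNote(60, withVelocity: 100, onChannel: 0)  // middle C; the sampler repitches the sample as needed
} catch {
    print("Sampler setup failed: \(error)")
}
```

The appeal here is that one loaded sample set covers a range of pitches, so you don't have to ship a separate file per note.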