I have an audio app in which all of the sound-generating work is done by Pure Data (using libpd).
I've coded a special sequencer in Swift which controls the start/stop playback of multiple sequences, played by the synth engines in Pure Data.
Until now, I've completely avoided using Core Audio or AVFoundation for any aspect of my app, because I know nothing about them, and they both seem to require C or Objective-C coding, which I know almost nothing about.
However, I've been told from a previous q&a on here, that I need to use Core Audio or AVFoundation to get accurate timing. Without it, I've tried everything else, and the timing is totally messed up (laggy, jittery).
All of the tutorials and books on Core Audio seem overwhelmingly broad and deep to me. If all I need from one of these frameworks is accurate timing for my sequencer, how do you suggest I achieve this as someone who is a total novice to Core Audio and Objective-C, but otherwise has a 95% finished audio app?
If your sequencer is Swift code that depends on being called just-in-time to push audio, it won't work with good timing accuracy; i.e., you can't get the timing you need that way.
Core Audio uses a real-time pull-model (which excludes Swift code of any interesting complexity). AVFoundation likely requires you to create your audio ahead of time, and schedule buffers. An iOS app needs to be designed nearly from the ground up for one of these two solutions.
Added: If your existing code can generate audio samples a bit ahead of time, enough to statistically cover using a jittery OS timer, you can schedule this pre-generated output to be played a few milliseconds later (e.g. when pulled at the correct sample time).
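A minimal sketch of that "generate ahead, pull later" idea, assuming mono 16-bit samples and exactly one producer thread and one consumer thread. The names SampleRing, ring_write and ring_read are hypothetical and not part of libpd or Core Audio. The timer-driven code calls ring_write whenever it runs; the render callback calls ring_read and gets silence if the producer has fallen behind.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 4096  /* frames; must be a power of two */

/* Single-producer, single-consumer ring of pre-generated samples. */
typedef struct {
    int16_t samples[RING_CAPACITY];
    atomic_size_t write_pos;  /* total frames written so far */
    atomic_size_t read_pos;   /* total frames read so far */
} SampleRing;

/* Producer side: called from ordinary, possibly jittery, timer code.
   Stores as many frames as fit and returns that count. */
size_t ring_write(SampleRing *r, const int16_t *src, size_t frames) {
    size_t w = atomic_load_explicit(&r->write_pos, memory_order_relaxed);
    size_t rd = atomic_load_explicit(&r->read_pos, memory_order_acquire);
    size_t free_frames = RING_CAPACITY - (w - rd);
    if (frames > free_frames) frames = free_frames;
    for (size_t i = 0; i < frames; i++)
        r->samples[(w + i) & (RING_CAPACITY - 1)] = src[i];
    atomic_store_explicit(&r->write_pos, w + frames, memory_order_release);
    return frames;
}

/* Consumer side: called from the audio render callback.
   Copies what is available, pads the rest with silence, and returns
   the number of real (non-padding) frames delivered. */
size_t ring_read(SampleRing *r, int16_t *dst, size_t frames) {
    size_t rd = atomic_load_explicit(&r->read_pos, memory_order_relaxed);
    size_t w = atomic_load_explicit(&r->write_pos, memory_order_acquire);
    size_t avail = w - rd;
    size_t n = frames < avail ? frames : avail;
    for (size_t i = 0; i < n; i++)
        dst[i] = r->samples[(rd + i) & (RING_CAPACITY - 1)];
    for (size_t i = n; i < frames; i++)
        dst[i] = 0;  /* underrun: the producer was late */
    atomic_store_explicit(&r->read_pos, rd + n, memory_order_release);
    return n;
}
```

The ring size sets the trade-off: a larger ring tolerates more timer jitter but adds latency between generating a sample and hearing it.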
AudioKit is an open source audio framework that provides Swift access to Core Audio services. It includes a Core Audio based sequencer, and there is plenty of sample code available in the form of Swift Playgrounds.
The AudioKit AKSequencer class has the transport controls you need. You can add MIDI events to your sequencer instance programmatically, or read them from a file. You could then connect your sequencer to an AKCallbackInstrument which can execute code upon receiving MIDI noteOn and noteOff commands, which might be one way to trigger your generated audio.
Related
I've been searching for a while and can't come to a good conclusion.
I am trying to create an app that can "record" beats that a user makes on a 4x4 button array. Each button has a sound tied to it and after they hit record, I want to mix the audio that gets played and save it to a file so they can listen to it and play over it later.
What makes this even trickier is that there will be a metronome playing and I do not want to mix the metronome sound into the audio that is getting saved.
From what I have found, the only way to go is Audio Units for these features, but I am reluctant to since it seems a little overkill and somewhat complicated to learn. Can Audio Toolbox make this any easier?
Thanks!
In general, this is easy to implement with Audio Toolbox.
For more information, see the sample code below; it helps a lot:
MixerHost
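Whichever API ends up delivering the buffers, one way to keep the metronome out of the saved file is to produce two mixes per buffer: one for the speaker and one for the recording. The sketch below is only an illustration under assumptions: mono 16-bit buffers that are already time-aligned, and a hypothetical function name mix_pads that does not come from MixerHost or Audio Toolbox.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit sum back into the 16-bit sample range. */
static int16_t clip16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Mix all pad buffers into two outputs:
   - record_out: pads only (this is what gets saved to the file)
   - speaker_out: pads plus metronome (this is what the user hears) */
void mix_pads(const int16_t *const *pads, size_t pad_count,
              const int16_t *metronome,
              int16_t *speaker_out, int16_t *record_out, size_t frames) {
    for (size_t i = 0; i < frames; i++) {
        int32_t sum = 0;
        for (size_t p = 0; p < pad_count; p++)
            sum += pads[p][i];
        record_out[i] = clip16(sum);
        speaker_out[i] = clip16(sum + metronome[i]);
    }
}
```

Hard clipping is the simplest overflow policy; a real mixer would more likely scale the inputs or apply a limiter.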
I just need an IO Unit with a processing callback with a simple play and stop feature.
Apple has this giant MixerHost demo with thousands of lines of code just to play two mixed audio files.
It seems 99% of that code is boilerplate to set things up.
Maybe there is an open source framework which deals with this boilerplate such that you can just set up your Audio Session and start constructing a simple processing graph with an IO Unit?
Take a look at Novocaine, an analgesic for high-performance audio on the iPhone, iPad and Mac OS X. Really fast audio in iOS and Mac OS X using Audio Units is hard, and will leave you scarred and bloody. What used to take days can now be done with just a few lines of code.
Just to add a bit to @fannheyward's answer, Novocaine is definitely the way to go. The key advantage is that you can pass in an Objective-C block which will be executed each time the audio subsystem is ready to process a block of audio. It abstracts away most of the difficult boilerplate code, and lets you focus on the DSP.
I'm making an app that needs to detect when a sound starts (for example, when someone starts talking) and then record it.
I found this tutorial http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/ but it starts recording at the beginning and then detects the sound from that recording.
Is there any other way to detect a sound without actually starting the recorder first? What I thought of would be having 2 recorders, one for detection and one for actually recording the sound. Another solution would be to edit (trim) the sound after it's recorded.
Are these approaches somehow standard or is there a better way to detect sound?
Thanks.
edit: if anyone ever reads this, I also found this http://bonkel.wordpress.com/2010/03/03/frequency-detection-using-fourier-transform/
If you don't mind getting a little dirty, you could go down to a lower level, to CoreAudio, and read data out of the input buffers until you see values exceeding your threshold, and start recording those input buffers, or triggering a high level recording call. You can similarly stop recording after a period of silence.
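A rough sketch of that threshold logic, independent of whichever API delivers the input buffers. It assumes mono 16-bit samples; SoundGate, gate_process, and the threshold and hold values are hypothetical names and parameters chosen for illustration. Each incoming buffer is passed to gate_process, and the buffer is kept (written to the file) only when the function returns true.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int threshold;         /* peak |sample| that counts as sound */
    size_t hold_frames;    /* frames of silence required before stopping */
    size_t silent_frames;  /* consecutive quiet frames seen so far */
    bool recording;        /* true while buffers should be kept */
} SoundGate;

/* Largest absolute sample value in the buffer. */
static int buffer_peak(const int16_t *buf, size_t frames) {
    int peak = 0;
    for (size_t i = 0; i < frames; i++) {
        int v = abs((int)buf[i]);
        if (v > peak) peak = v;
    }
    return peak;
}

/* Feed one input buffer. Returns true if this buffer should be kept.
   A loud buffer starts (or continues) recording; recording stops once
   hold_frames worth of consecutive quiet buffers have gone by. */
bool gate_process(SoundGate *g, const int16_t *buf, size_t frames) {
    if (buffer_peak(buf, frames) >= g->threshold) {
        g->recording = true;
        g->silent_frames = 0;
    } else if (g->recording) {
        g->silent_frames += frames;
        if (g->silent_frames >= g->hold_frames)
            g->recording = false;
    }
    return g->recording;
}
```

Because the decision is made per buffer, the very start of a sound can fall in the buffer before the one that crosses the threshold; keeping the most recent quiet buffer around and saving it too is one way to avoid clipping the onset.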
If you use CoreAudio, you have a lot of control over what you record. You could, pretty easily, filter out background noise, or add beeps to signify when the recording stopped due to silence, and even add markers to use later to match time to the recording.
CoreAudio does require you to do more work. You will have to read the microphone buffers on a timely basis and either save or discard the data pretty quickly in order not to drop any sound data. This isn't that hard, as the devices have plenty of CPU power to do that and other tasks at the same time - you just have to have a good grasp of CoreAudio.
There are plenty of Apple CoreAudio samples that can guide you. The WWDC 2010 CoreAudio sessions are also a must-see.
You could use either the Audio Queue or the Core Audio (RemoteIO Audio Unit) API. Unless your app requires low latency, the Audio Queue API may be simpler to use.
You need to start the recording API to detect any sound, but you don't need to save everything you get from the recording callback to a file.
Hi, unfortunately I've not been able to figure out audio on the iPhone. The closest I've come is the AVAudioRecorder/Player classes, and I know that they are no good for audio processing.
So I'm wondering if someone would be able to explain to me how to "listen" to the iPhone's mic input in chunks of, say, 1024 samples, analyse the samples and do stuff. And just keep going like that until my app terminates or tells it to stop. I'm not looking to save any data; all I want is to analyse the data in real time and do stuff in real time with it.
I've tried to understand Apple's "aurioTouch" example but it's just way too complicated for me to understand.
So can someone explain to me how I should go about this?
If you want to analyze audio input in real-time, it doesn't get a lot simpler than Apple's aurioTouch iOS sample app with source code (there is also a mirror site). You can google a bit more info on using the Audio Unit RemoteIO API for recording, but you'll still have to figure out the real-time analysis DSP portion.
The Audio Queue API is a slight bit simpler for getting input buffers of raw PCM audio data from the mic, but not much simpler, and it has a higher latency.
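With either API, the input callback hands you buffers whose size you don't fully control, so a small accumulator is a common way to get fixed 1024-sample chunks for analysis. The sketch below assumes mono 16-bit input; Chunker, chunker_feed, LevelMeter and the mean-square "analysis" are hypothetical stand-ins for whatever DSP you actually need.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_FRAMES 1024

/* Called once for every completed chunk. */
typedef void (*ChunkHandler)(const int16_t *chunk, size_t frames, void *user);

typedef struct {
    int16_t chunk[CHUNK_FRAMES];
    size_t filled;         /* frames currently stored in chunk */
    ChunkHandler handler;  /* analysis callback */
    void *user;            /* passed through to the handler */
} Chunker;

/* Call from the input callback with however many frames arrived.
   Fires the handler each time CHUNK_FRAMES frames have accumulated. */
void chunker_feed(Chunker *c, const int16_t *in, size_t frames) {
    for (size_t i = 0; i < frames; i++) {
        c->chunk[c->filled++] = in[i];
        if (c->filled == CHUNK_FRAMES) {
            c->handler(c->chunk, CHUNK_FRAMES, c->user);
            c->filled = 0;
        }
    }
}

/* Example analysis: mean of squared samples, scaled to 0.0 .. 1.0. */
double chunk_mean_square(const int16_t *chunk, size_t frames) {
    double sum = 0.0;
    for (size_t i = 0; i < frames; i++) {
        double s = chunk[i] / 32768.0;
        sum += s * s;
    }
    return frames ? sum / (double)frames : 0.0;
}

/* Example handler: remembers the level of the most recent chunk. */
typedef struct {
    size_t chunks_seen;
    double last_level;
} LevelMeter;

void level_meter_on_chunk(const int16_t *chunk, size_t frames, void *user) {
    LevelMeter *m = user;
    m->chunks_seen++;
    m->last_level = chunk_mean_square(chunk, frames);
}
```

The handler runs on whatever thread calls chunker_feed, so if that is the real-time audio thread the analysis needs to stay short and avoid locks and allocation.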
Added later: There's also a version of aurioTouch converted to Swift here: https://github.com/ooper-shlab/aurioTouch2.0-Swift
The AVAudioPlayer/Recorder classes won't take you there if you want to do any real-time audio processing. The Audio Toolbox and Audio Unit frameworks are the way to go. See Apple's audio programming guide to decide which framework suits your needs. And believe me, this low-level stuff is not easy and is poorly documented. CocoaDev has some tutorials where you can find sample code. Also, there is an audio DSP library, DIRAC, that I recently discovered for tempo and pitch manipulation. I haven't looked into it much but you might find it useful.
If all you want is samples with a minimum amount of processing by the OS, you probably want the Audio Queue API; see Audio Queue Services Programming Guide.
AVAudioRecorder is designed for recording to a file, and AudioUnit is more for "pluggable" audio processing (and on the Mac side of things, AU Lab is actually pretty cool).
I'm looking to create an app that emulates a physical instrument. I've got audio samples but I want to be able to increase the pitch/frequency dynamically so I don't have to load from too many files.
Any idea which audio API will be able to do this? I reckon either OpenAL or Audio Queue Services but am not sure which is suitable. Any links to guides/sample code is also much appreciated.
Thanks in advance.
I went down this road in 2009, trying Audio Toolbox, Audio Queue Services, OpenAL, and finally settling on the RemoteIO AudioUnit.
Audio Toolbox is fine for basic triggered sound effects, but it wasn't able to change frequencies or loop samples.
Audio Queue Services can loop samples, but the only way I could find to adjust the playback frequency of a sample was to re-read the data from the file -- very painful. Plus, the framework is tremendously cumbersome - I'd only use it if I was trying to stream something off the Internet.
OpenAL was a godsend - I was up and running with it in under an hour, after getting my hands on the no-longer-available-from-Apple "CrashLanding" iPhone sample app. I found OpenAL to be ideally suited to games or even a musical instrument -- samples could be pre-loaded, adjusting the frequency was easy, and looping was no problem. The deal-breaker for me was that starting and stopping a looped sample would result in a nasty "pop" almost every time. Also, the built-in 3D positional audio mixer was a bit too CPU-intensive for my liking.
If your instrument does not use looped samples, I'd suggest trying the OpenAL route first - the learning curve is much less intimidating. Try to track down "SoundEngine.h", "CrashLanding" or "TouchFighter", or check out the following link:
http://benbritten.com/blog/2008/11/06/openal-sound-on-the-iphone/
Since looped samples were a requirement for me, I finally settled on AudioUnits (which, on the iPhone, is referred to as "RemoteIO" if you want to do input or output). It was tremendously difficult to implement - very similar to Audio Queue Services, in that the core of your implementation will be inside a "buffer callback", being called several times per second to fill a buffer of outbound audio with raw SInt16 values.
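To give a feel for what goes inside such a buffer callback, here is a simplified sketch of looped playback at an adjustable pitch. It is not the author's code (that is linked further down); it assumes a mono 16-bit loop already in memory and a positive rate, and LoopVoice and voice_render are hypothetical names. Pitch is changed by stepping through the sample at a fractional rate and interpolating linearly between neighbouring frames.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const int16_t *sample;  /* one loop of mono source audio */
    size_t length;          /* number of frames in the loop */
    double position;        /* fractional read position, 0 <= position < length */
    double rate;            /* 1.0 = original pitch, 2.0 = one octave up */
} LoopVoice;

/* Fill `out` with `frames` samples, as a render callback would.
   The read position wraps around, so the sample loops indefinitely. */
void voice_render(LoopVoice *v, int16_t *out, size_t frames) {
    if (v->length == 0) {  /* nothing loaded: output silence */
        for (size_t i = 0; i < frames; i++) out[i] = 0;
        return;
    }
    for (size_t i = 0; i < frames; i++) {
        size_t i0 = (size_t)v->position;     /* frame at or before position */
        size_t i1 = (i0 + 1) % v->length;    /* next frame, wrapping at the end */
        double frac = v->position - (double)i0;
        double s = v->sample[i0] * (1.0 - frac) + v->sample[i1] * frac;
        out[i] = (int16_t)s;
        v->position += v->rate;
        while (v->position >= (double)v->length)
            v->position -= (double)v->length;
    }
}
```

Polyphony then amounts to rendering several voices and summing them into the output buffer. Linear interpolation is the cheapest option and can alias audibly at high rates, so treat it as a starting point.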
Ultimately, I got my instrument working beautifully with multi-note polyphony, looped samples, no popping, and minimal latency.
Unfortunately, RemoteIO is not well documented. Michael Tyson was one of the first in the field to write about RemoteIO at length, and his posts (and the comments) were very useful to me:
http://michael.tyson.id.au/2008/11/04/using-remoteio-audio-unit/
Good luck!
Edited years later: I've open-sourced the RemoteIO/AudioUnits code I alluded to above: https://github.com/glenn-barnett/hexaphone/blob/master/Classes/Instrument.m - apologies for the mess, I hope to get some time to clean up the code and comments.
Try creating an Audio Unit. I'm doing something similar, and an AU worked well for me.
Initially I used an audio queue as it was simpler (higher level?) and synchronous. However, it was lacking in responsiveness, so I dumped it for the Audio Unit.
It sounds a bit like you're essentially recreating the wavetable synthesis method of playing MIDI files. You might be able to find a MIDI synthesizer for the iPhone that you can use, and then use your audio samples to build a wavetable set. Any time you want to play tones, you would simply send the MIDI event into the iPhone MIDI synth with your loaded wavetable set.
Another option now is AUSampler.
http://developer.apple.com/library/mac/#technotes/tn2283/_index.html