Using graphics hardware for audio processing in an iPhone app

We are developing an iPhone app that needs to process audio data in real time, but we are struggling with performance. The bottlenecks are in the audio effects, which are in fact quite simple, but the performance hit is noticeable when several are added.
Most of the audio effects code is written in C.
We think there are two places where GPU hardware could speed things up: using OpenCL for the effects, and hardware for interpolation/smoothing. We are fairly new to this and don't know where to begin.

You probably mean OpenGL, as OpenCL is only available on the desktop. Yes, you could use OpenGL ES 2.0 programmable shaders for this if you wanted to perform some very fast parallel processing, but it would be extremely complex to pull off.
You might first want to look at the Accelerate framework, which has hardware-accelerated functions for doing just the kind of tasks needed for audio processing. A great place to start is Apple's WWDC 2010 session 202 - "The Accelerate framework for iPhone OS", along with their "Taking Advantage of the Accelerate Framework" article.
Also, don't dismiss Hans' suggestion that you profile your code first, because your performance bottleneck might be somewhere you don't expect.

You might get better DSP acceleration by coding for the ARM NEON SIMD unit. NEON is designed for DSP operations and can pipeline multiple single-precision floating-point operations per cycle. Getting audio data in and out of GPU memory, on the other hand, may be possible, but may not be that fast.
But you might want to profile your code first to see if something else is the bottleneck. The iPhone 4 CPU can easily keep up with multiple FFTs and IIR filters on a real-time audio stream.

Related

Is there a maximum number of OpenGL ES calls you can make on iPad?

Can anyone tell me if there is a limit to the maximum number of OpenGL ES calls that can be made on iPad (i.e. OpenGL draw calls and state changes)?
I am working on a game and seeing low FPS, so I wonder if it has anything to do with my large number of OpenGL calls.
There's no real maximum for OpenGL ES commands, but each one does have some overhead associated with it. Redundant state changes should be eliminated, and expensive state changes reduced by grouping geometry so that everything using one state is drawn, then everything using the next. Apple has some recommendations for this in their OpenGL ES Programming Guide for iOS.
However, I've rarely found the OpenGL ES commands to be the cause of significant performance degradation in my applications. The larger problems tend to be due to the size of your geometry or from the complexity of any shaders or other effects you apply to your scene. I share some tips that I've applied for reducing geometry size here, and one tool for profiling shaders here, but I'm still learning the ins-and-outs of shader tuning myself.
If you do really care about fine-tuning the OpenGL calls you're making, the best profiling tool to use is the new OpenGL ES Analyzer instrument that comes with Xcode 4. I show a couple of example screens from that instrument in my answer here, where I used it to identify some redundant settings. It will find these calls for you, and point out where they are in your code. You can also use Time Profiler to see if you're putting more load on the CPU than you should be when rendering your frames, and track down the offending lines of code.
As far as I know, there is no hard limit on the number of GL calls. But definitely, the more GL calls you make, the more time they take. Batched rendering is one of the main optimizations to make with OpenGL.
Instead of suspecting the bottleneck, use Instruments to locate it.
The OpenGL ES Analysis trace template is the weapon of choice. Last time I used it, you had to manually attach it to a running process on the device (all other startup options failed).

iPhone shader profiling

I'm using a series of shaders to perform realtime image processing on the iPhone (3GS/4/iPad). The fps isn't what I'd like it to be.
Are there any tools that I can use to help me work out what the bottlenecks are?
I assume you already know that performance tests on the Simulator are worthless and that you're testing on real metal, so Instruments is always a good place to start - specifically in your case you'd be interested in the OpenGL ES and OpenGL ES Analyzer instruments.
Generally speaking for GLSL, there's a list of common GLSL mistakes at the OpenGL.org site. The O'Reilly labs "iPhone 3D Programming" book has some further hints, such as avoiding expensive operations in conditionals, and watching for texture lookups.
Also, it's going to depend on what kind of image processing you're doing; if you're trying to apply heavy Photoshop-esque filters that would give a quad-core machine pause, it's going to be costly on a lowly phone.
The only currently available tool is the PVRUniSCo editor, which will give you a cycle count for each line of code in your shader (though only on Windows, it seems).

iPhone: CPU power to do DSP/Fourier transform/frequency domain?

I want to analyze mic audio on an ongoing basis (not just a snippet or prerecorded sample), display a frequency graph, and filter out certain aspects of the audio. Is the iPhone powerful enough for that? I suspect the answer is yes, given Google and iPhone voice recognition, Shazam and other music recognition apps, and the guitar tuner apps out there. However, I don't know what limitations I'll have to deal with.
Anyone play around with this area?
Apple's sample code aurioTouch has an FFT implementation.
The apps I've seen that do some sort of music/voice recognition need an internet connection, so it's highly likely that they just do some sort of feature calculation on the audio and send those features over HTTP for recognition on the server.
In any case, frequency graphs and filtering have been done before on lesser CPUs a dozen years ago. The iPhone should be no problem.
"Fast enough" may be a function of your (or your customer's) expectations of how much frequency resolution you are looking for, and of your base sample rate.
An N-point FFT is on the order of N*log2(N) computations, so if you don't have enough MIPS, reducing N is a potential area of concession for you.
In many applications, sample rate is a non-negotiable, but if it was, this would be another possibility.
I made an app that calculates the FFT live:
http://www.itunes.com/apps/oscope
You can find my code for the FFT on GitHub (although it's a little rough):
http://github.com/alexbw/iPhoneFFT
Apple's new iPhone OS 4.0 SDK allows for built-in computation of the FFT with the "Accelerate" library, so I'd definitely start working with the new OS if it's a central part of your app's functionality.
You can't just port FFT code written in C into your app as-is: the Thumb compiler option complicates floating-point arithmetic. You need to compile it in ARM mode.

What's the most suitable sound/audio framework for iPhone OpenGL-ES games?

I'm writing a game for iPhone/iPod.
My engine uses OpenGL ES, which means the game requires some performance.
(A realtime game, not a static board-game-like one.)
I looked at the basic sound frameworks on the iPhone; there are several (Core Audio, Audio Toolbox, OpenAL...), but I cannot make out their differences in detail.
I think OpenAL will give the best performance, but that's just a guess with no evidence. And although the iPhone/iPod is music-player hardware, I don't know its in-depth audio features.
I'm new to all of these frameworks, so I'll have to study one of them, and now I'm choosing which one.
The features required for me is:
Delay-free playback; sound effects should give realtime feedback.
Streamed playback of long music with a very small memory footprint.
Volume control per playback of each sound effect.
Mixing: multiple different sound effects can play at the same time (around 4 or more).
Other features required for games:
Hardware acceleration (if it exists)
Realtime filtering effects (reverb, echo, 3D, ...) if possible
...
Can you recommend a framework for my game? Some explanation of each framework would also be much appreciated.
You can do everything you want with OpenAL. It's what I'd recommend for a game.
Plus, it's the only framework for 3D positional audio which often goes hand-in-hand with a 3D game.
OpenAL, Core Audio, AudioToolbox etc. are wrappers around the same things: namely, Apple’s own audio processing features. OpenAL is just a different interface but has the same performance as Core Audio, as it sends commands to the same things.
There are several other “audio engines” out there that are just wrappers.
At risk of tooting my own horn, Superpowered is the only audio SDK that outperforms Apple’s Core Audio on mobile devices. It’s specifically designed to outperform every single one of those, with lower memory footprint, CPU load and battery usage. For example, the Superpowered reverb is 5x faster than Apple’s. See http://superpowered.com/reverb/

Most performant audio layer?

I'm curious as to which of the available audio layers is the most performant, out of the ones available on the iPhone. Currently I've used the SystemSoundID method, and the AVAudioPlayer method, and I'm wondering if it's worth investigating AudioQueue or OpenAL...are there significant performance gains to be had?
Thanks!
Audio is a complex issue, and most of it is handled by hardware, so there are no performance gains to be had by changing APIs.
The different APIs are for different tasks:
SystemSound is for short notification sounds (max 10 sec)
AudioQueue is for everything longer than a SystemSound
AVAudioPlayer is just an Objective-C layer above AudioQueue, and you don't lose any performance for this layer. (So if AVAudioPlayer is working for you, stay with it!)
OpenAL is for sound effects.
What about FMOD for the iPhone? It's mostly used for game development and is available for various platforms.
I've been reading about very low-level and very low-latency audio using RemoteIO. Take a look at this article and the subsequent (long) discussion: Using RemoteIO audio units. I wouldn't recommend going down this path unless the higher-level libraries completely fail for your application. The author found very distinct performance differences between the different approaches, some quite unexpected. YMMV.