iPhone: CPU power to do DSP/Fourier transform/frequency domain?

I want to analyze microphone audio on an ongoing basis (not just a snippet or prerecorded sample), display a frequency graph, and filter out certain aspects of the audio. Is the iPhone powerful enough for that? I suspect the answer is yes, given the Google and iPhone voice recognition, Shazam and other music recognition apps, and guitar tuner apps out there. However, I don't know what limitations I'll have to deal with.
Has anyone played around in this area?

Apple's sample code aurioTouch includes an FFT implementation.

The apps I've seen that do some sort of music/voice recognition need an internet connection, so it's highly likely that they just do some sort of feature calculation on the audio and send those features via HTTP to do the recognition on the server.
In any case, frequency graphs and filtering have been done before on lesser CPUs a dozen years ago. The iPhone should be no problem.

"Fast enough" may be a function of your (or your customer's) expectations on how much frequency resolution you are looking for and your base sample rate.
An N-point FFT is on the order of N*log2(N) computations, so if you don't have enough MIPS, reducing N is a potential area of concession for you.
In many applications the sample rate is non-negotiable, but where it can be changed, lowering it is another option.
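To put rough numbers on that trade-off, here is a minimal sketch in plain C. The function names are made up for illustration, and the N*log2(N) figure is only a relative estimate, not a cycle count for any real FFT implementation.

```c
#include <assert.h>

/* Width of one FFT bin in Hz: the sample rate divided by the FFT size. */
double fft_bin_width_hz(double sample_rate_hz, unsigned n)
{
    assert(n > 0);
    return sample_rate_hz / (double)n;
}

/* floor(log2(n)) for n >= 1; exact when n is a power of two. */
static unsigned ilog2(unsigned n)
{
    unsigned k = 0;
    while (n >>= 1)
        k++;
    return k;
}

/* Rough relative cost N * log2(N). This is an order-of-magnitude
   estimate only (hypothetical helper, not from any library). */
unsigned long fft_cost_estimate(unsigned n)
{
    assert(n > 0);
    return (unsigned long)n * ilog2(n);
}
```

At 44.1 kHz, a 4096-point FFT gives bins about 10.8 Hz wide with an estimate of 49152, while a 1024-point FFT drops the estimate to 10240 but widens the bins to about 43 Hz.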

I made an app that calculates the FFT live
http://www.itunes.com/apps/oscope
You can find my code for the FFT on GitHub (although it's a little rough)
http://github.com/alexbw/iPhoneFFT
Apple's new iPhone OS 4.0 SDK allows for built-in computation of the FFT with the "Accelerate" library, so I'd definitely start working with the new OS if it's a central part of your app's functionality.

You can't just port FFT code written in C into your app... the Thumb compiler option complicates floating-point arithmetic. You need to compile it in ARM mode.

Related

AVAudio detect note/pitch/etc. iPhone xcode objective-c

I'm making an app on the iPhone and I need a way of detecting the note of the sounds coming in through the microphone (e.g. A#, G, C♭, etc.).
I assumed I'd use AVAudio, but I really don't know, and I can't find anything in the documentation.
Any help?
Musical notes are nothing more than specific frequencies of sound. You will need a way to analyze all of the frequencies in your input signal, and then find a way to isolate the individual notes.
Finding frequencies in an audio signal is done using the Fast Fourier Transform (FFT). There is plenty of source code available online to compute the FFT from an audio signal. In particular, oScope offers an open-source solution for the iPhone.
Edit: Pitch detection seems to be the technical name for what you are trying to do. The answers to a similar question here on SO may be of use.
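Once you have a frequency estimate, mapping it to a note name is the easy part. A minimal sketch in plain C, assuming equal temperament with A4 = 440 Hz and MIDI note numbering (A4 = 69); the function names are hypothetical, and it only reports sharps, so C♭ would come out as B:

```c
#include <assert.h>

/* Nearest MIDI note number for a frequency, assuming A4 = 440 Hz.
   Walks semitone by semitone from A4 so no math library is needed. */
int nearest_midi_note(double freq_hz)
{
    const double semitone = 1.0594630943592953;       /* 2^(1/12) */
    const double half_semitone = 1.0293022366434921;  /* 2^(1/24) */
    int note = 69;
    double f = 440.0;

    assert(freq_hz > 0.0);
    /* Move up while the input lies above the upper boundary of this note. */
    while (freq_hz >= f * half_semitone) { f *= semitone; note++; }
    /* Move down while the input lies below the lower boundary of this note. */
    while (freq_hz < f / half_semitone) { f /= semitone; note--; }
    return note;
}

/* Pitch class name ("C", "C#", ..., "B") for a non-negative MIDI note. */
const char *pitch_class_name(int midi_note)
{
    static const char *names[12] = {
        "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"
    };
    assert(midi_note >= 0);
    return names[midi_note % 12];
}

/* Octave number in scientific pitch notation (MIDI 60 is C4). */
int octave_of(int midi_note)
{
    assert(midi_note >= 0);
    return midi_note / 12 - 1;
}
```

For example, 82.41 Hz (a guitar's low E string) maps to MIDI note 40, which is E in octave 2.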
There's nothing built-in to the iOS APIs for musical pitch estimation. You will have to code your own DSP function. The FFTs in the Accelerate framework will give you spectral frequency information from a PCM sampled waveform, but frequency is different from psycho-perceptual pitch.
There are a bunch of good and bad ways to estimate frequency and pitch. I have a long partial list of various estimation methods on my DSP resources web page.
You can look at Apple's aurioTouch sample app for an example of getting iOS device audio input and displaying its frequency spectrum.
Like @e.James said, you are looking to find the pitch of a note; it's called Pitch Detection. There are a ton of resources at CCRMA, Stanford University for what you are looking for. Just google for Pitch Detection and you will see a brilliant collection of algorithms. As far as wanting to find the FFT of blocks of audio samples, you could use the built-in FFT function of the Accelerate framework (see this and this) or use the MoMu toolkit. Using MoMu has the benefit of its functions decomposing the audio stream into samples for you and easy application of the FFT using its own functions.

Is number recognition on iPhone possible in real-time?

I need to recognise numbers from the camera image on iPhone, in real-time. I know there will be no more than 5 digits on the image.
Is this problem realistic to solve given the computational specifications of the iPhone?
Does anyone have any experience using the Tesseract OCR library, and do you think it could be solved by using it?
That depends on your definition of "real-time", but yes, it should be possible to do relatively fast recognition of just the digits 0-9 on an iPhone 4, particularly if you can constrain the fonts, lighting conditions, etc. that they will appear in.
I highly recommend reading the article on how Sudoku Grab does its recognition of puzzles using the iPhone camera. In their case, a trained neural network was used to identify the digits, which should be reasonably simple and fast on modern iOS hardware.
The current recognition libraries out there, like OpenCV, will use the iPhone's CPU to do the processing. I've heard that they can do even more complex tasks like facial recognition fast enough to use with video sources while showing a minimal amount of stutter.
For even better performance, I believe that there's a lot of potential in the programmable GPUs on the newer iOS devices. In my benchmarks, I saw a 14X - 28X speedup when using the iPhone 4's GPU for simple image processing. While few people are looking at this right now, something like Sudoku Grab's neural network should be a parallel enough process to benefit from running on the GPU.
It should be computationally possible. There are apps that can read a bar code in real time, and also an app that does real-time translation (Word Lens). I'm not sure what libraries they use, however.
Yes, it is possible using the Tesseract engine.
Here is some sample code if you'd like to check it out:
https://github.com/nolanbrown/Tesseract-iPhone-Demo
There is a free SDK for that: http://rtrsdk.com/ It supports both iOS and Android, works in real time, and helps you capture any text, so numbers should not be a problem.
Disclaimer: I work for ABBYY
Yes. Bender can help you with that. It lets you build and run neural nets on iOS. As it uses Metal under the hood, it runs fast and smooth. It also supports running TensorFlow models directly.
So you can run in Bender an existing TensorFlow model trained for digit recognition. See "Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras" if you need help.
Disclaimer: I worked on this project.

Calculating frequency with Apple's aurioTouch example

I am working on a program that needs to capture the frequency of sound from a guitar. I have modified the aurioTouch example to output the frequency by picking the frequency bin with the highest magnitude. It works OK for high notes but is very inaccurate on the lower strings. I believe this is due to overtones. I researched ways to solve this problem, such as cepstrum analysis, but I am lost on how to implement this within the example code, as it is unclear and hard to follow without comments. Any help would be greatly appreciated, thanks!
As you have discovered, musical pitch is not the same as peak frequency.
But trying to investigate algorithms while trying to work with real-time audio is not easy.
I suggest you separate the problems. Record some music sounds (guitar plucks, etc.) on your Mac into raw sound files. Try your chosen pitch estimation algorithms on these recorded sample sets. Then, after you get this working, figure out how to integrate your code into the iOS audio and Accelerate (for FFT) frameworks.
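As a starting point for that kind of offline experiment, here is a minimal sketch of one of the simpler estimators: plain time-domain autocorrelation. It is not the cepstrum method mentioned in the question and not anything taken from aurioTouch. The function name and parameters are assumptions for illustration, and a real recording would likely need windowing, normalization, and peak interpolation on top of this.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate the fundamental frequency of x[0..n-1] by finding the lag
   (within the given frequency range) with the largest autocorrelation.
   Returns 0.0 if the buffer is too short or no positive peak is found. */
double estimate_pitch_autocorr(const float *x, size_t n, double sample_rate_hz,
                               double min_hz, double max_hz)
{
    assert(x != NULL && sample_rate_hz > 0.0 && min_hz > 0.0 && max_hz > min_hz);

    size_t min_lag = (size_t)(sample_rate_hz / max_hz);
    size_t max_lag = (size_t)(sample_rate_hz / min_hz);
    if (min_lag < 1)
        min_lag = 1;
    if (max_lag >= n)
        return 0.0;

    /* Use the same number of products for every lag so sums are comparable. */
    size_t window = n - max_lag;
    double best_sum = 0.0;
    size_t best_lag = 0;

    for (size_t lag = min_lag; lag <= max_lag; lag++) {
        double sum = 0.0;
        for (size_t i = 0; i < window; i++)
            sum += (double)x[i] * (double)x[i + lag];
        if (sum > best_sum) {   /* strict '>' keeps the shortest best lag */
            best_sum = sum;
            best_lag = lag;
        }
    }
    return best_lag ? sample_rate_hz / (double)best_lag : 0.0;
}
```

Because it looks for the repetition period rather than the strongest spectral peak, this kind of estimator tends to be less easily fooled by a loud overtone, which is the failure described for the low strings. The cost grows with the buffer length times the number of lags searched, so keep the frequency range as narrow as the instrument allows.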

How to develop an iphone app with reverb functionality?

I am developing an iPhone application for audio processing, and I need to apply effects to the audio.
For a desktop app there are many options, with good examples and full projects such as Audacity. But I want to develop for the iPhone.
I found an app with a reverb option (see the link below). I only watched the video; I have not tested the application on my iPhone.
http://www.appstorehq.com/reverb-iphone-89870/app
My question is: how can I develop an app with reverb functionality? Is there any documentation for that? If so, please share it.
NOTE: It may be possible to use AudioUnit to develop an app with reverb functionality (I am not clear on this).
EDIT: I would prefer not to use any third-party library.
If anybody has knowledge about this, please share it.
Thanks.
If you're targeting iOS 5, you can just use the audio unit subtype kAudioUnitSubType_Reverb2 of the effect audio unit.
reverb unit
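// Describe Apple's built-in reverb effect unit (zero-initialized so the flag fields are 0).
AudioComponentDescription auEffectUnitDescription = {0};
auEffectUnitDescription.componentType = kAudioUnitType_Effect;
auEffectUnitDescription.componentSubType = kAudioUnitSubType_Reverb2;
auEffectUnitDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
// processingGraph is assumed to be an AUGraph you have already created.
AUNode auEffectNode;
OSStatus status = AUGraphAddNode(processingGraph, &auEffectUnitDescription, &auEffectNode);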
Failing that, you could just write your own reverb code in the RemoteIO callback. A simple delay might be easier to do and would sound similar.
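A simple delay of that kind could look roughly like the sketch below. It is plain C with no Core Audio calls; the DelayLine type and function names are made up for illustration, and it assumes mono float samples. In a render callback you would call delay_process on the buffer you were handed.

```c
#include <assert.h>
#include <stddef.h>

#define DELAY_MAX_SAMPLES 48000  /* assumed upper bound, about 1 s at 48 kHz */

typedef struct {
    float  buffer[DELAY_MAX_SAMPLES]; /* circular buffer of past samples */
    size_t length;                    /* delay time in samples */
    size_t pos;                       /* current read/write position */
    float  feedback;                  /* 0..<1, how much echo is fed back */
    float  mix;                       /* 0 = dry only, 1 = delayed only */
} DelayLine;

void delay_init(DelayLine *d, size_t delay_samples, float feedback, float mix)
{
    assert(d != NULL);
    assert(delay_samples > 0 && delay_samples <= DELAY_MAX_SAMPLES);
    for (size_t i = 0; i < DELAY_MAX_SAMPLES; i++)
        d->buffer[i] = 0.0f;
    d->length = delay_samples;
    d->pos = 0;
    d->feedback = feedback;
    d->mix = mix;
}

/* Process a block of mono samples in place. */
void delay_process(DelayLine *d, float *samples, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        float dry = samples[i];
        float delayed = d->buffer[d->pos];           /* sample from 'length' ago */
        d->buffer[d->pos] = dry + delayed * d->feedback;
        d->pos = (d->pos + 1) % d->length;
        samples[i] = dry * (1.0f - d->mix) + delayed * d->mix;
    }
}
```

Keep feedback below 1 so the echoes die away. The struct is large, so allocate it once up front rather than on the stack inside the audio callback.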
iOS 5.0 brings native OpenAL support, so it is now much easier - you don't have to code the algorithm yourself. It also brings support for a variety of reverb spaces:
Small Room
Medium Room
Large Room (2 configurations)
Medium Hall (3 configurations)
Large Hall (2 configurations)
Plate
Medium Chamber
Large Chamber
Cathedral
I suggest that you try the ObjectAL wrapper which already has a great support for the reverb effect:
https://github.com/kstenerud/ObjectAL-for-iPhone
Grab the source from this repository, load "ObjectAL.xcodeproj" and run the ObjectALDemo target on any iOS 5.0 device (should also work on the simulator). This will give you a good starting point and feeling of what the reverb effect is capable of.
If you still don't want to use any third-party library, you can just grab the relevant pieces from ObjectAL. Look for the reverb-related code in the following source files (and their corresponding headers):
https://github.com/kstenerud/ObjectAL-for-iPhone/blob/master/ObjectAL/ObjectAL/OpenAL/ALListener.m
https://github.com/kstenerud/ObjectAL-for-iPhone/blob/master/ObjectAL/ObjectAL/OpenAL/ALSource.m
https://github.com/kstenerud/ObjectAL-for-iPhone/blob/master/ObjectAL/ObjectAL/OpenAL/ALWrapper.m
Good luck with your project!
AUs are a good place to start.
Write your own reverb AU which contains a reverb implementation. There are tons of ways to implement a reverb. A medium/long convolution reverb is a lot to ask from a phone, but something such as an FDN (feedback delay network) will not require a lot of memory or CPU.
Both approaches are easy to implement if you're familiar with audio programming and optimization. The tough part is actually making one that sounds very good and performs well.
If you're unable to write optimal low-level code or you do not (presently) understand basic audio signal processing, then you'll have a few obstacles to overcome; it may be a long road in that case.
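To give a feel for how small an FDN core can be, here is a bare-bones sketch in plain C: four delay lines mixed through a scaled Hadamard matrix. The type and function names, the number of lines, and the output scaling are all assumptions for illustration. It leaves out the damping filters, tuned delay lengths, and stereo output that a good-sounding reverb would need.

```c
#include <assert.h>
#include <stddef.h>

#define FDN_LINES 4
#define FDN_MAX_DELAY 4096   /* assumed maximum delay per line, in samples */

typedef struct {
    float  buf[FDN_LINES][FDN_MAX_DELAY];
    size_t len[FDN_LINES];   /* delay length of each line in samples */
    size_t pos[FDN_LINES];
    float  gain;             /* feedback gain, must be < 1 for the tail to decay */
} Fdn;

void fdn_init(Fdn *f, const size_t lengths[FDN_LINES], float gain)
{
    assert(f != NULL && gain >= 0.0f && gain < 1.0f);
    for (size_t k = 0; k < FDN_LINES; k++) {
        assert(lengths[k] > 0 && lengths[k] <= FDN_MAX_DELAY);
        f->len[k] = lengths[k];
        f->pos[k] = 0;
        for (size_t i = 0; i < FDN_MAX_DELAY; i++)
            f->buf[k][i] = 0.0f;
    }
    f->gain = gain;
}

/* Process one mono sample and return the wet (reverberated) signal. */
float fdn_process_sample(Fdn *f, float in)
{
    float o[FDN_LINES];
    for (size_t k = 0; k < FDN_LINES; k++)
        o[k] = f->buf[k][f->pos[k]];

    /* 4x4 Hadamard mix scaled by 1/2, which makes the matrix orthogonal,
       so energy only leaves the loop through 'gain'. */
    float a = o[0] + o[1], b = o[0] - o[1];
    float c = o[2] + o[3], d = o[2] - o[3];
    float fb[FDN_LINES] = {
        0.5f * (a + c), 0.5f * (b + d), 0.5f * (a - c), 0.5f * (b - d)
    };

    for (size_t k = 0; k < FDN_LINES; k++) {
        f->buf[k][f->pos[k]] = in + f->gain * fb[k];
        f->pos[k] = (f->pos[k] + 1) % f->len[k];
    }
    return 0.25f * (o[0] + o[1] + o[2] + o[3]);
}
```

Delay lengths that share no common factors (for example, distinct primes) are a common choice because the echoes from different lines then rarely line up. Memory use is just the four buffers, and the per-sample work is a handful of adds and multiplies.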
Searching the iOS documentation for "reverb" produces a link to the Core Audio Overview, which references reverb as an "effect unit." Perhaps that's worth further study?
No good. I have attempted the audio unit approach, and even though it is in the documentation, it is not implemented yet by the Apple engineers. Each time you call the function to set the reverb property you will only get a failure status code. You would have to implement your own reverb effect. Try reading a DSP book and you might find a clue.
You need to learn some DSP-level coding; the DSP cookbook is okay and there are others out there. Basically, you need to be comfortable with handling audio signals in the frequency domain and things such as FFTs. Once you have that, implementing a reverb filter should be straightforward.
This is an answer I've given before, but I believe it is relevant here. I am going to agree with the others and say that you are going to have to become a bit more familiar with core-audio if you want to do this properly.
I highly recommend this core-audio book. It will teach what you need to do this right and will save you a lot of frustration.
The chapter on audio effects has not been published yet, but if it is anything like the rest of the book it's worth the wait.
EDIT
You will most likely need to do this with an audio effect (which is a form of an audio unit).

Change in pitch of voice

I am creating an iPhone application in which, when I make a call to anyone, I should be able to change the pitch of my voice in real time.
Which framework or third-party library should I use for that?
Thanks,
Sunil.
For speech your best bet is probably an implementation of PSOLA. This allows pitch shifting and/or time compression/expansion. You can either implement it yourself (it's fairly straightforward if you're familiar with DSP etc) or Google for open source implementations.
If we want to change the pitch of a sound, the most natural approach seems to be to transform small sound segments into the frequency domain using an FFT, shift the frequency distribution, and return to the time domain using an inverse FFT. Yes, it works, but unfortunately algorithms of this kind are too time-consuming for the iPhone.
But there is another group of SOLA-like algorithms, and their simplest versions can be implemented on the iPhone.
Follow these links for libraries and more info:
http://www.dspdimension.com/admin/time-pitch-overview
http://www.surina.net/soundtouch/index.html#download
http://www.guitarpitchshifter.com/algorithm.html