Easy and straightforward FFT of an MP3 file - iPhone

I have an MP3 file and need to continuously detect and display the Hz value of the playing file. A bit of googling suggests I have two options: use an FFT, or use Apple's Accelerate framework. Unfortunately I haven't found an easy-to-use sample of either. All the samples, like aurioTouch etc., need tons of code just to get a simple number for a sample buffer. Is there any easy example of pitch detection for iOS?
For example, I've found https://github.com/clindsey/pkmFFT, but it's missing some files that its author has removed. Is there anything similar that works?

I'm afraid not. Working with sound is generally hard, and Core Audio is no exception. Now to the matter at hand.
The FFT is an algorithm for transforming input from the time domain to the frequency domain. It is not inherently tied to sound processing; you can use it for things other than sound as well.
Accelerate is an Apple-provided framework which, among many other things, offers an FFT implementation. So you don't actually have two options there, just one, the FFT, and Accelerate's implementation of it.
Now, depending on what you want to do (e.g. whether you favour speed over accuracy, or robustness over simplicity) and the type of waveform you have (simple, complex, human speech, music), the FFT may not be enough on its own, or may not even be the right choice for your task. There are other options: autocorrelation, zero-crossing, cepstral analysis, and maximum likelihood, to mention a few. But none of them are trivial, except for zero-crossing, which also gives you the poorest results and will fail on complex waveforms.
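For a sense of scale, here is what zero-crossing, the simplest of those methods, amounts to; a minimal Python/NumPy sketch (the sample rate and test tone are illustrative choices):

    import numpy as np

    def zero_crossing_pitch(samples, sample_rate):
        """Crude pitch estimate: count sign changes in the waveform.

        A pure tone crosses zero twice per period, so
        pitch ~= crossings / (2 * duration). As noted above, this
        fails badly on complex or noisy waveforms.
        """
        signs = np.sign(samples)
        signs = signs[signs != 0]  # drop exact zeros so each crossing counts once
        crossings = np.count_nonzero(np.diff(signs))
        duration = len(samples) / sample_rate
        return crossings / (2.0 * duration)

    # Illustrative check on a clean 440 Hz sine sampled at 44.1 kHz.
    sr = 44100
    t = np.arange(sr) / sr
    print(zero_crossing_pitch(np.sin(2 * np.pi * 440 * t), sr))  # ~440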

Here is a good place to start:
http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html
There are also other questions on SO.
However, as indicated by other answers, this is not something that can just be "magically" done. Even if you license code from someone (e.g., iZotope and z-plane both make excellent code for doing what you want to do), you still need to understand what's going on to get data in and out of their libraries.
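To make the moving parts concrete, here is the core of that kind of approach, window, FFT, pick the strongest bin, as a minimal Python/NumPy sketch rather than C (the frame size and sample rate are illustrative assumptions):

    import numpy as np

    def fft_peak_pitch(frame, sample_rate):
        """Estimate pitch as the frequency of the strongest FFT bin.

        Window the frame to reduce spectral leakage, take the real FFT,
        and convert the peak bin index back to Hz. Resolution is
        sample_rate / len(frame), so short frames give coarse answers.
        """
        windowed = frame * np.hanning(len(frame))
        spectrum = np.abs(np.fft.rfft(windowed))
        peak_bin = int(np.argmax(spectrum))
        return peak_bin * sample_rate / len(frame)

    # Illustrative check: 4096-sample frame of a 440 Hz sine at 44.1 kHz.
    sr, n = 44100, 4096
    t = np.arange(n) / sr
    print(fft_peak_pitch(np.sin(2 * np.pi * 440 * t), sr))  # ~441 (bin-limited)

Note how the answer lands on a bin center rather than exactly 440; that resolution limit is one reason the raw FFT alone may not be accurate enough for pitch work.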

If you need fast pitch detection, go with http://www.schmittmachine.com/dywapitchtrack.html
You'll find iOS sample code inside.
If you need an FFT, you should use Apple's Accelerate framework.
Hope this helps.

Related

Does Web Audio have any plans for limiters, imagers, saturators, or multiband compressors?

I have checked the interfaces that Web Audio offers. Pizzicato.js offers a great library for these effects, but it is a pity that some of the best and most essential effects are missing, like a limiter, multiband compressor, parametric equalizer, saturator, and stereo imager. I was just wondering if there are any plans for them, and where I can check whether they are willing to build these in the future. I just don't know where I could ask.
Thanks
WebAudio is a collection of fairly elemental base processors. There are really no higher-order effects, because instead it provides the foundational elements with which to build the effects.
For example, there's a dynamics compressor, but there's no vari-mu emulation, or FET circuit, or even your run-of-the-mill digital compressor. But, using the pieces that the API does have, you can build or model a compressor that behaves however you want. Just think it through and figure out how the signal needs to be processed, and you'll find pretty much every component you need to achieve it. If not, the AudioWorklet (the successor to the deprecated ScriptProcessorNode) can fill in the blanks.
That being said, you would build a limiter using the DynamicsCompressor, and use BiquadFilters and DynamicsCompressors to build a multiband compressor. You'd use the WaveShaper to build things like tape saturators, overdrive, and bit crushers. And you can create an imager or stereo-widener effect using things like the PannerNode and one of the AudioParams (setValueAtTime() is probably the simplest).
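The Web Audio graph itself is JavaScript, but the signal flow behind that multiband recipe is easy to sketch offline. Here is a conceptual Python/SciPy version (the crossover frequencies, threshold, and ratio are all illustrative assumptions, and the toy gain computer stands in for what a DynamicsCompressorNode would do per band):

    import numpy as np
    from scipy import signal

    def compress(band, threshold_db=-24.0, ratio=4.0):
        # Toy static gain computer: attenuate the whole band when its RMS
        # level exceeds the threshold (no attack/release smoothing here).
        rms_db = 20 * np.log10(np.sqrt(np.mean(band ** 2)) + 1e-12)
        if rms_db <= threshold_db:
            return band
        gain_db = (threshold_db - rms_db) * (1 - 1 / ratio)
        return band * 10 ** (gain_db / 20)

    def multiband_compress(x, sr, crossovers=(200.0, 2000.0)):
        lo_c, hi_c = crossovers
        # Split into low/mid/high bands, mirroring a tree of BiquadFilterNodes.
        low = signal.sosfilt(signal.butter(4, lo_c, 'lowpass', fs=sr, output='sos'), x)
        mid = signal.sosfilt(signal.butter(4, [lo_c, hi_c], 'bandpass', fs=sr, output='sos'), x)
        high = signal.sosfilt(signal.butter(4, hi_c, 'highpass', fs=sr, output='sos'), x)
        # Compress each band independently, then sum back together.
        return sum(compress(b) for b in (low, mid, high))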
WebAudio isn't really plug & play, but it's what you'd use to build something that is. If you'd rather skip the tedium of building your own DSP, I can't really blame you; it's not easy. But there are plenty of people out there who already did, many of them very imaginative and talented engineers. A couple of libraries worth checking out would be Tuna.js, a very user-friendly, straightforward effects library, or Tone.js for something much more fleshed out and complete.
According to the docs several already seem to be implemented:
Compressor: https://developer.mozilla.org/en-US/docs/Web/API/DynamicsCompressorNode
EQ: https://developer.mozilla.org/en-US/docs/Web/API/BiquadFilterNode
Saturator can probably be implemented using https://developer.mozilla.org/en-US/docs/Web/API/ConvolverNode (though the WaveShaperNode mentioned above is the more usual choice for saturation)
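For what it's worth, the saturation itself is just a nonlinear per-sample transfer curve, which is exactly what a WaveShaperNode's lookup table encodes. Conceptually (Python/NumPy, with tanh as an illustrative curve choice):

    import numpy as np

    def saturate(samples, drive=4.0):
        """Soft-clip saturation: push the signal into a tanh transfer
        curve, then rescale so full-scale input stays full-scale.
        Sampling this curve over [-1, 1] gives a WaveShaperNode-style
        lookup table."""
        return np.tanh(drive * samples) / np.tanh(drive)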

How to use Math.NET Neodym Signal Processing Library to Filter Frequencies?

I have streaming continuous data that I need to filter by frequency. I have high-level knowledge of FFT and frequency filtering, but no real practical knowledge.
After searching for a suitable library to aid me, I came across Math.NET's Neodym Signal Processing Library (https://mathnetneodym.codeplex.com/releases/view/91769)
I downloaded it, upgraded it to VS2010, and used the SignalGenerator class to generate sine waves of various frequencies. It worked great.
Now, I need to know how to filter dynamic data. This StackOverflow question seems to come the closest to mine, and is a good start:
Filtering continuous data, how to get rid of transients?
There is a .chm file in the downloaded project, but it seems to just be the API documentation. I need some high-level help pointing me in the right direction on how to use the library.
Can anybody provide a good resource to refer to so I can use this library for my purpose?
Thank you!
John
UPDATE
I'm pretty sure I should use the OnlineFilter.CreateBandpass static function with the mode ImpulseResponse.Infinite... but I'll post whatever I find out.
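Neodym's OnlineFilter is C#, but the essential trick for streaming data, carrying the filter's internal state from one buffer to the next so each chunk continues where the last one left off instead of re-triggering the startup transient, can be seen in this conceptual Python/SciPy sketch (the passband, sample rate, and chunk size are illustrative assumptions):

    import numpy as np
    from scipy import signal

    # Illustrative bandpass: keep 5-15 Hz from data sampled at 100 Hz.
    sr = 100.0
    sos = signal.butter(4, [5.0, 15.0], 'bandpass', fs=sr, output='sos')

    def filter_chunk(chunk, zi):
        """Filter one buffer of the stream; zi is the filter state carried
        over from the previous buffer, and the updated state is returned
        to be passed in with the next one."""
        return signal.sosfilt(sos, chunk, zi=zi)

    # Usage: seed the state once, then thread it through successive chunks.
    stream = np.random.randn(1000)
    zi = signal.sosfilt_zi(sos) * stream[0]  # scaling tames the initial transient
    filtered = []
    for chunk in np.split(stream, 10):
        y, zi = filter_chunk(chunk, zi)
        filtered.append(y)
    filtered = np.concatenate(filtered)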

FFT using Apple's Accelerate Framework

I'm trying to use Apple's Accelerate framework to set up FFT, windowing, overlapping, and downsampling with input taken from the device's microphone. I've been looking all over for some sample code/application showing how to use it, but I couldn't quite figure out how to implement something like this in an actual project. Could you help point me in the right direction? Thanks
UPDATE
Here is the code I've managed to put together so far from various sources, including some other questions answered here. What it's missing is a downsampling function and the overlapping. In Apple's documentation I found the function vDSP_zrdesamp for downsampling, but I'm having trouble implementing it.
The real problem is trying to put this into an actual project. I've tried modifying this GitHub project but I didn't manage to fit it in, so any help would be more than welcome. The actual FFT (which I need to modify) takes place in the RIOInterface class, in the audio callback function.
Please note that the project I mention on GitHub does not belong to me, but it does not have a licence that forbids me from modifying it for my own purposes. Also, this project is for testing only, until I reach a more stable algorithm.
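While vDSP itself is a C API, the pipeline being assembled here (window, overlap, FFT, downsample) may be easier to see in a conceptual Python/NumPy sketch first; the frame size, hop size, and decimation factor are illustrative assumptions:

    import numpy as np

    def spectral_frames(samples, frame_size=1024, hop=512):
        """Hann-windowed, 50%-overlapped magnitude spectra.

        Using hop < frame_size is what produces the overlap; each
        iteration corresponds to one windowed FFT call in the
        Accelerate version.
        """
        window = np.hanning(frame_size)
        spectra = []
        for start in range(0, len(samples) - frame_size + 1, hop):
            frame = samples[start:start + frame_size] * window
            spectra.append(np.abs(np.fft.rfft(frame)))
        return np.array(spectra)

    def downsample(samples, factor=4):
        # Crude decimation: low-pass first (a moving average keeps the
        # sketch dependency-free), then keep every factor-th sample.
        # This is roughly what a vDSP decimation routine does with a
        # supplied FIR filter.
        kernel = np.ones(factor) / factor
        smoothed = np.convolve(samples, kernel, mode='same')
        return smoothed[::factor]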

OpenCV vs Matlab - Line/Hallway Detection

For my computer vision class, I'm going to be doing a project where I extract information about a hallway based on an image of that hallway. In particular, the lines of the hallway which extend toward a vanishing point will be of interest. My question is whether I should use Matlab, OpenCV, or something else to implement this.
I don't have a ton of time for this project. That makes Matlab seem like a good option, since it seems you can usually get things up and running quickly there. On the other hand, I hope to take what I do for this class project and extend it further for research once the class is complete. This makes OpenCV seem better, as (from what I've read) it's much more efficient. Another possible choice would be to implement it in Matlab for the project, then port that code to an OpenCV form later. It should be noted that I have plenty of experience with C/C++, but only a little in both Matlab and OpenCV.
At the moment, I'm leaning toward just using OpenCV from the start. However, I would like the opinion of someone who's had a bit more experience here than myself. If you'd recommend something over both OpenCV and Matlab, please say so. Also, if you have any tips on what packages or toolkits might be useful for such a project, they would be greatly appreciated.
Any suggestions? Thanks for your time!
With which one is it easier for you to write a piece of code that reads an image file and displays it?
If you know C++ very well, then it should be easy to debug the code. Since you say you have little experience with Matlab, a small mistake in the code could take a long time to debug there.
So I suggest breaking the problem down into:
read an image and display it; this is very easy in both
detect edges using a simple/classic method; this is super easy in both; display the result and visually check it's correctly done
use a robust line-fitting method; RANSAC and the Hough transform are probably what you're going to use, and OpenCV makes using them easier than you might guess (see the sketch below); Matlab also has built-in functions to detect lines using the Hough transform, and gives you the start/end points of each segment. But if you're finding a vanishing point, you shouldn't need those.
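If you go the OpenCV route, a minimal sketch of those three steps with the Python bindings looks like this (the file name and all thresholds are illustrative and will need tuning):

    import cv2
    import numpy as np

    # Step 1: read the image, with a grayscale copy for edge detection.
    img = cv2.imread('hallway.jpg')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Step 2: classic edge detection; eyeball the result to tune thresholds.
    edges = cv2.Canny(gray, 50, 150)

    # Step 3: probabilistic Hough transform for line segments.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=100, maxLineGap=10)

    # Draw whatever was found; hallway edges should dominate.
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)

    cv2.imshow('hallway lines', img)
    cv2.waitKey(0)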
The decision is yours; this is not a very difficult problem, and you can find loads of help on the web. Good luck with the project, and please let us know how it goes.

Building better positional audio [AudioQueue manipulation]

I'm building an app that has a requirement for really accurate positional audio, down to the level of modelling inter-aural time difference (ITD), the slight delay difference between stereo channels that varies with a sound's position relative to a listener. Unfortunately, the iPhone's implementation of OpenAL doesn't have this feature, nor is a delay Audio Unit supplied in the SDK.
After a bit of reading around, I've decided that the best way to approach this problem is to implement my own delay by manipulating an AudioQueue (I can also see some projects in my future which may require learning this stuff, so this is as good an excuse to learn as any). However, I don't have any experience in low-level audio programming at all, and certainly none with AudioQueue. Trying to learn both:
a) the general theory of audio processing
and
b) the specifics of how AudioQueue implements that theory
is proving far too much to take in all at once :(
So, my questions are:
1) where's a good place to start learning about DSP and how audio generation and processing works in general (down to the level of how audio data is structured in memory, how mixing works, that kinda thing)?
2) what's a good way to get a feel for how AudioQueue does this? Are there any good examples of how to get it reading from a generated ring buffer, rather than just fetching bits of a file on demand with AudioFileReadPackets, like Apple's SpeakHere example does?
and, most importantly
3) is there a simpler way of doing this that I've overlooked?
I think Richard Lyons' "Understanding Digital Signal Processing" is widely revered as a good starter DSP book, though it's all math and no code.
If timing is so important, you'll likely want to use the Remote I/O audio unit, rather than the higher-latency audio queue. Some of the audio unit examples may be helpful to you here, like the "aurioTouch" example that uses the Remote I/O unit for capture and performs an FFT on it to get the frequencies.
If the built-in AL isn't going to do it for you, I think you've opted into the "crazy hard" level of difficulty.
Sounds like you should probably be on the coreaudio-api list (lists.apple.com), where Apple's Core Audio engineers hang out.
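Whichever API ends up doing the playback, the ITD itself reduces to delaying the far-ear channel by a fraction of a millisecond (roughly 0.6-0.7 ms at most for a human head). A conceptual Python/NumPy sketch of that per-channel delay follows; the sample rate and the sine-law angle-to-delay mapping are illustrative assumptions, and a real implementation would interpolate for fractional-sample delays:

    import numpy as np

    def apply_itd(mono, sample_rate, itd_seconds):
        """Position a mono source by delaying the far-ear channel.

        Positive itd_seconds delays the left channel (source to the
        listener's right); negative delays the right. Whole-sample
        delays only, which quantizes the ITD at low sample rates.
        """
        delay = int(round(abs(itd_seconds) * sample_rate))
        delayed = np.concatenate([np.zeros(delay), mono])
        direct = np.concatenate([mono, np.zeros(delay)])
        if itd_seconds >= 0:
            return np.column_stack([delayed, direct])  # left lags
        return np.column_stack([direct, delayed])      # right lags

    # Illustrative: a 440 Hz tone placed ~45 degrees to the right,
    # using a simple sine law scaled by a ~0.66 ms maximum ITD.
    sr = 44100
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440 * t)
    stereo = apply_itd(tone, sr, 0.66e-3 * np.sin(np.radians(45)))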
Another great resource for learning the fundamental basics of DSP and their applications is The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith. It is available online for free at http://www.dspguide.com/ but you can also order a printed copy.
I really like how the author builds up the fundamental theory in a way that is very palatable.
Furthermore, you should check out the Core Audio Public Utility which you'll find at /Developer/Extras/CoreAudio/PublicUtility. It covers a lot of the basic structures you'll need to get in place in order to work with CoreAudio.