Building better positional audio [AudioQueue manipulation] - iPhone

I'm building an app that has a requirement for really accurate positional audio, down to the level of modelling inter-aural time difference (ITD), the slight delay difference between stereo channels that varies with a sound's position relative to a listener. Unfortunately, the iPhone's implementation of OpenAL doesn't have this feature, nor is a delay Audio Unit supplied in the SDK.
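For reference, a common first-order model for ITD is Woodworth's spherical-head formula, ITD = (a / c) * (theta + sin theta), where a is the head radius, c the speed of sound and theta the source azimuth. A quick sketch of the delays it predicts (the head radius, speed of sound and sample rate below are assumed illustrative values, not anything specified in this question):

```c
/* Woodworth spherical-head ITD model -- a minimal sketch.
 * The constants are typical textbook values, assumed for illustration. */
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define HEAD_RADIUS_M   0.0875   /* average adult head radius, metres */
#define SPEED_OF_SOUND  343.0    /* metres per second, ~20 degrees C  */

/* Azimuth in radians: 0 = straight ahead, pi/2 = fully to one side. */
static double itd_seconds(double azimuth)
{
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (azimuth + sin(azimuth));
}

int main(void)
{
    double sample_rate = 44100.0;
    for (int deg = 0; deg <= 90; deg += 15) {
        double az  = deg * M_PI / 180.0;
        double itd = itd_seconds(az);
        printf("azimuth %2d deg -> ITD %3.0f us (%4.1f samples at %.0f Hz)\n",
               deg, itd * 1e6, itd * sample_rate, sample_rate);
    }
    return 0;
}
```

Even at 90 degrees the delay is well under a millisecond (roughly 30 samples at 44.1 kHz), which is why it has to be applied per-sample in the audio path rather than by scheduling playback.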
After a bit of reading around, I've decided that the best way to approach this problem is to implement my own delay by manipulating an AudioQueue (I can also see some projects in my future which may require learning this stuff, so this is as good an excuse to learn as any). However, I don't have any experience in low-level audio programming at all, and certainly none with AudioQueue. Trying to learn both:
a) the general theory of audio processing
and
b) the specifics of how AudioQueue implements that theory
is proving far too much to take in all at once :(
So, my questions are:
1) where's a good place to start learning about DSP and how audio generation and processing works in general (down to the level of how audio data is structured in memory, how mixing works, that kinda thing)?
2) what's a good way to get a feel for how AudioQueue does this? Are there any good examples of how to get it reading from a generated ring buffer, rather than just fetching bits of a file on-demand with AudioFileReadPackets, like Apple's SpeakHere example does?
and, most importantly
3) is there a simpler way of doing this that I've overlooked?

I think Richard Lyons' "Understanding Digital Signal Processing" is widely revered as a good starter DSP book, though it's all math and no code.
If timing is so important, you'll likely want to use the Remote I/O audio unit, rather than the higher-latency audio queue. Some of the audio unit examples may be helpful to you here, like the "aurioTouch" example that uses the Remote I/O unit for capture and performs an FFT on it to get the frequencies.
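To give a feel for where your delay code would live with the Remote I/O unit, here is a rough sketch of a render callback that reads a single mono ring buffer at two different offsets to produce an ITD-delayed stereo pair. Only the callback signature comes from AudioUnit.framework; the RingBuffer struct is a hypothetical stand-in for whatever buffering scheme you end up with, and the code assumes you've configured the unit's output for non-interleaved 32-bit float stereo.

```c
#include <AudioUnit/AudioUnit.h>

/* Hypothetical ring buffer: mono source material plus a per-ear delay. */
typedef struct {
    float  *samples;     /* mono samples, `length` entries                 */
    UInt32  length;      /* total number of samples in the buffer          */
    UInt32  readIndex;   /* advances once per rendered frame               */
    UInt32  itdSamples;  /* extra delay applied to the far ear (<= length) */
} RingBuffer;

static OSStatus renderCallback(void                       *inRefCon,
                               AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp       *inTimeStamp,
                               UInt32                      inBusNumber,
                               UInt32                      inNumberFrames,
                               AudioBufferList            *ioData)
{
    RingBuffer *rb    = (RingBuffer *)inRefCon;
    float      *left  = (float *)ioData->mBuffers[0].mData;   /* near ear */
    float      *right = (float *)ioData->mBuffers[1].mData;   /* far ear  */

    for (UInt32 i = 0; i < inNumberFrames; i++) {
        UInt32 nearIdx = (rb->readIndex + i) % rb->length;
        UInt32 farIdx  = (rb->readIndex + i + rb->length - rb->itdSamples)
                         % rb->length;
        left[i]  = rb->samples[nearIdx];   /* undelayed channel            */
        right[i] = rb->samples[farIdx];    /* delayed by itdSamples frames */
    }
    rb->readIndex = (rb->readIndex + inNumberFrames) % rb->length;
    return noErr;
}
```

Moving the source then becomes a matter of updating itdSamples (and, in a real implementation, interpolating or cross-fading so the delay change doesn't click).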
If the built-in AL isn't going to do it for you, I think you've opted into the "crazy hard" level of difficulty.
Sounds like you should probably be on the coreaudio-api list (lists.apple.com), where Apple's Core Audio engineers hang out.

Another great resource for learning the fundamental basics of DSP and their applications is The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith. It is available online for free at http://www.dspguide.com/ but you can also order a printed copy.
I really like how the author builds up the fundamental theory in a way that is very palatable.
Furthermore, you should check out the Core Audio Public Utility which you'll find at /Developer/Extras/CoreAudio/PublicUtility. It covers a lot of the basic structures you'll need to get in place in order to work with CoreAudio.

Related

Does Web Audio have any plans for making limiters, imagers, saturators, or multiband compressors?

I have checked the interfaces that Web Audio offers. Pizzicato.js is a great library for these effects, but it is a pity that some of the best and most essential effects are missing, like a limiter, multi-band compressor, parametric equalizer, saturator, or stereo imager. I was just wondering whether there are any plans for them, and where I could check if they are willing to make these in the future. I just don't know where I could ask.
Thanks
WebAudio is a collection of fairly elemental base processors. There are really no higher-order effects, because instead it provides the foundational elements with which to build the effects.
For example, there's a dynamics compressor, but there's no vari-mu emulation, or FET circuit, or even your run-of-the-mill digital compressor. But, using the pieces that the API does have, you can build out or model a compressor that behaves however you want. Just think it through and figure out how the signal needs to be processed and you'll find pretty much every component you need to achieve it. If not, the AudioWorklet (the successor to the deprecated ScriptProcessorNode) can fill in the blanks.
That being said, you would build a limiter using the DynamicsCompressor, and use BiquadFilters and DynamicsCompressors to build a multiband comp. You'd use the WaveShaper to build things like tape saturators, overdrive, and bit crushers. And you can create an imager or stereo widener effect using things like the PannerNode and one of the AudioParams (setValueAtTime() is probably the simplest).
WebAudio isn't really plug & play, but it's what you'd use to build something that is. If you'd rather skip the tedium of building your own DSP effects, I can't really blame you; it's not easy. But there are plenty of sadists out there who already did, and many of those libraries are made by very imaginative and talented engineers. A couple of libraries worth checking out are Tuna.js, a very user-friendly and straightforward effects library, and Tone.js for something much more fleshed out and complete.
According to the docs several already seem to be implemented:
Compressor: https://developer.mozilla.org/en-US/docs/Web/API/DynamicsCompressorNode
EQ: https://developer.mozilla.org/en-US/docs/Web/API/BiquadFilterNode
A saturator can probably be implemented using: https://developer.mozilla.org/en-US/docs/Web/API/ConvolverNode

Audio analysis and feature extraction for music visualization

I have a general question regarding how I should proceed with my music visualization endeavors. I am interested in visualizing classical music pieces, either recorded or live. So far I have used Processing, but I am open to other software or programming languages, too. Since my background is in musicology and music theory, I would like to incorporate music-theoretical concepts or any tools that MIR research has made available for analyzing music, so that I could create visualizations based on them.
However, I don't really know where to start. I feel like I would like to study MIR concepts, but it can become very technical, and it seems that I don't need to be knowledgeable about all facets of MIR research. For example, I want to analyze and visualize music, not synthesize it. Or could it be that learning about sound synthesis would also help me with my visualizations? And to what extent?
Alternatively, would it be better to just pick some software/language and try its specific libraries for audio analysis?
Basically, my dilemma is whether I should start with MIR theory, or whether there is an alternative way of getting a good overview of all the features one could extract from a piece of music, in order to have a richer palette for visualizations.
If you could give me some advice on this, I would be very grateful.
Thanks, Ilias

Easy and straightforward FFT of an MP3 file

I have an MP3 file and need to constantly detect and show the Hz value of this playing MP3 file. A bit of googling shows that I have two options: use an FFT or use the Apple Accelerate framework. Unfortunately I haven't found any easy-to-use sample of either. All samples, like aurioTouch etc., need tons of code to get a simple number for a sample buffer. Is there any easy example of pitch detection for iOS?
For example, I've found https://github.com/clindsey/pkmFFT, but it's missing some files that its author has removed. Is there anything working like that?
I'm afraid not. Working with sound is generally hard, and Core Audio is no exception. Now to the matter at hand.
The FFT is an algorithm for transforming input from the time domain to the frequency domain. It is not necessarily tied to sound processing; you can use it for things other than sound as well.
Accelerate is an Apple-provided framework which, among many other things, offers an FFT implementation. So you actually don't have two options there, just one and its implementation.
Now, depending on what you want to do (e.g. whether you favour speed over accuracy, or robustness over simplicity) and the type of waveform you have (simple, complex, human speech, music), the FFT may not be enough on its own, or may not even be the right choice for your task. There are other options: auto-correlation, zero-crossing, cepstral analysis, and maximum likelihood, to mention some. But none are trivial, except for zero-crossing, which also gives you the poorest results and will fail to work with complex waveforms.
Here is a good place to start:
http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html
There are also other questions on SO.
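To make the time-to-frequency idea concrete, here is a deliberately naive peak-bin estimator in plain C. It is only an illustration of the concept: a brute-force DFT rather than the Accelerate API, and, as noted above, a bare FFT peak is often not enough for real pitch detection.

```c
/* Naive peak-bin frequency estimate -- illustration only.
 * A real iOS app would hand a windowed buffer to Accelerate's vDSP FFT
 * instead of this O(n^2) loop. */
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Returns the frequency (Hz) of the strongest bin in signal[0..n-1]. */
static double peak_frequency(const float *signal, int n, double sample_rate)
{
    int    best_bin = 0;
    double best_mag = 0.0;

    for (int k = 1; k < n / 2; k++) {          /* skip DC, stop at Nyquist */
        double re = 0.0, im = 0.0;
        for (int t = 0; t < n; t++) {
            double phase = 2.0 * M_PI * k * t / n;
            re += signal[t] * cos(phase);
            im -= signal[t] * sin(phase);
        }
        double mag = re * re + im * im;
        if (mag > best_mag) { best_mag = mag; best_bin = k; }
    }
    return best_bin * sample_rate / n;         /* bin index -> Hz */
}

int main(void)
{
    enum { N = 1024 };
    double sr = 44100.0;
    float  signal[N];
    for (int t = 0; t < N; t++)                /* 440 Hz test tone */
        signal[t] = (float)sin(2.0 * M_PI * 440.0 * t / sr);
    printf("peak at ~%.1f Hz\n", peak_frequency(signal, N, sr));
    return 0;
}
```

Note the resolution limit: with 1024 samples at 44.1 kHz each bin is about 43 Hz wide, so the 440 Hz test tone reports as roughly 431 Hz. That coarseness is one reason a bare FFT peak usually needs refinement (windowing, interpolation between bins, or one of the other methods listed above).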
However, as indicated by other answers, this is not something that can just be "magically" done. Even if you license code from someone (e.g., iZotope and zplane both make excellent code for doing what you want to do), you still need to understand what's going on to get data in and out of their libraries.
If you need fast pitch detection go with http://www.schmittmachine.com/dywapitchtrack.html
You'll find iOS sample code inside.
If you need an FFT, you should use the Apple Accelerate framework.
Hope this helps.

Change the tempo/BPM of an MP3 file

I am looking for a way to change the BPM/tempo of an MP3 file (with the result also converted back to MP3).
It can also be done with a C library that I will add to my project.
There are two methods for changing the perceived tempo of audio. One method is to simply alter the playback speed. This is easy to do with PCM audio, but will require you to decode the MP3. It might be possible to alter the sample rate (effectively the same thing) in the compressed domain (i.e. in the MP3 file itself), but I don't know how to do that. The big drawback with this approach is all the pitches in the audio change, too. This can make vocals sound unnatural, for instance.
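As a sketch of that first method, here is a plain-C linear-interpolation resampler. It assumes the MP3 has already been decoded to mono float PCM (the decoding and re-encoding steps are not shown); a speed factor of 1.1 plays about 10% faster and raises the pitch by the same amount, which is exactly the drawback described above.

```c
/* Change playback speed by resampling decoded PCM (pitch changes too).
 * speed > 1.0 -> faster/higher, speed < 1.0 -> slower/lower. */
#include <math.h>
#include <stdlib.h>

/* Returns a newly malloc'd buffer; *out_len receives its length.
 * Caller frees the result. Returns NULL on bad input or allocation failure. */
static float *change_speed(const float *in, int in_len, double speed, int *out_len)
{
    if (in_len < 2 || speed <= 0.0) { *out_len = 0; return NULL; }

    int    n   = (int)floor((in_len - 1) / speed);
    float *out = malloc(sizeof(float) * (size_t)n);
    if (!out) { *out_len = 0; return NULL; }

    for (int i = 0; i < n; i++) {
        double pos  = i * speed;              /* fractional read position */
        int    idx  = (int)pos;
        double frac = pos - idx;
        out[i] = (float)((1.0 - frac) * in[idx] + frac * in[idx + 1]);
    }
    *out_len = n;
    return out;
}
```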
Another approach is to apply a pitch-invariant speed change. This is a far more complex operation, and there are many proprietary algorithms and research papers addressing it. The pitch-synchronous overlap add (PSOLA) technique is one that works well. You can also look at what Audacity (an open-source audio editor) does. Since you are on iOS, Apple's audio framework might also provide some support for this. Look for AUPitch in the iOS documentation.

Real time system concept proof project

I'm taking an introductory course (3 months) on real-time systems design, but without any implementation.
I would like to build something that lets me understand better what I'll learn in theory, but since I have never built a real-time system I can't estimate how long any project will take. It would be a proof-of-concept project, or something like that, given my available time and knowledge.
Please, could you give me some idea? Thank you in advance.
I program in T-SQL, Delphi and C#, but I won't have any problem learning another language.
Suggest you consider exploring the Real-Time Specification for Java (RTSJ). While it is not a traditional environment for constructing real-time software, it is an up-and-coming technology with a lot of interest. Even better, you can witness some of the ongoing debate about what matters and what doesn't in real-time systems.
Sun's JavaRTS is freely available for download, and has some interesting demonstrations available to show deterministic behavior, and show off their RT garbage collector.
In terms of a specific project, I suggest you start simple:
1) Build a work generator that you can tune to consume a given amount of CPU time;
2) Put this into a framework that can produce a distribution of work-generator tasks (as threads, or as chunks of work executed in a thread) and a mechanism for logging the work produced;
3) Produce charts of the execution time, sojourn time, deadline, and slack/overrun of these tasks versus their priority;
4) Demonstrate that tasks running in the context of real-time threads (as opposed to timesharing) behave differently.
(A rough sketch of steps 1-3 follows below.)
Bonus points if you can measure the overhead in the scheduler by determining at what supplied load (total CPU time produced by your work generator tasks divided by wall-clock time) your tasks begin missing deadlines.
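As a very rough, non-real-time sketch of steps 1-3, the following plain POSIX program releases a periodic job, burns a configurable amount of CPU per job, and logs response times and deadline misses. All the numbers are arbitrary illustrative values; for the real experiment you would run it both under timesharing and under real-time scheduling (e.g. SCHED_FIFO) and compare.

```c
/* Periodic work generator with deadline-miss logging -- a sketch only. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Busy-loop for roughly `seconds` of wall-clock time. */
static void burn(double seconds)
{
    double end = now_sec() + seconds;
    while (now_sec() < end)
        ;  /* spin */
}

int main(void)
{
    const double period   = 0.010;   /* release a job every 10 ms    */
    const double cost     = 0.004;   /* each job burns ~4 ms of CPU  */
    const double deadline = period;  /* implicit deadline = period   */
    const int    jobs     = 500;
    int misses = 0;

    double next_release = now_sec();
    for (int job = 0; job < jobs; job++) {
        struct timespec tick = { 0, 1000000 };        /* 1 ms */
        while (now_sec() < next_release)              /* wait for release */
            nanosleep(&tick, NULL);

        burn(cost);
        double response = now_sec() - next_release;   /* response time    */
        if (response > deadline)
            misses++;
        printf("job %3d  response %6.3f ms%s\n",
               job, response * 1000.0, response > deadline ? "  MISS" : "");
        next_release += period;
    }
    printf("%d of %d deadlines missed\n", misses, jobs);
    return 0;
}
```

Plotted against priority and supplied load (step 3 and the bonus question), the response times from runs like this are where the difference between timesharing and real-time threads shows up.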
Try to think of real-time tasks that are time-critical, for instance video playback, which fails if tasks (e.g. calculating the next frame) are not finished in time.
You can also think of some industrial solutions, but they are probably more difficult to study in your local environment.
You should definitely consider building your system using a hardware development board equipped with a small processor (ARM, PIC, AVR, any one will do). This really helped remove my fear of the low-level when I started developing. You'll have to use C or C++ though.
You will then have two alternatives : either go bare-metal, or use a real-time OS.
Going bare-metal, you can learn:
How to initialize your processor from scratch and, most importantly, how to use interrupts, which are the fastest way you have to respond to an external event
How to implement lightweight threads with fast context switching, something every real-time OS implements
In order to ease this a bit, look for a dev kit which comes with lots of documentation and source code. I used Embedded Artists ARM boards and they give you a lot of material.
Going with an RT OS:
You'll fast-track your project, and will be able to learn how to fine-tune an RT OS
You may try your hand at an open-source OS, such as Linux or the BSDs, and learn a lot from the source code
Either choice is good, you will get a really cool hands-on project to show off and hopefully better understand your course material. Good luck!
As most real-time systems are still implemented in C or C++, it may be good to brush up your knowledge of these programming languages. Many real-time systems are also embedded systems, so you might want to play around with a cheap open-source board like the BeagleBoard (http://beagleboard.org/). This will also give you a chance to learn about cross-compiling and the like.