Software channels, what are they? - fmod

I'm learning how to use FMOD with this guide:
http://www.gamedev.net/page/resources/_/technical/game-programming/a-quick-guide-to-fmod-r2098
It says under the Initialization sub-title that the second parameter of FSOUND_Init is the maximum number of software channels. Could someone please explain, or provide a link to an article that explains, what software channels are?
If possible please keep the explanations as simple as possible.
Thank you

You can learn a lot from the FMOD samples; you should have a look at them.
A channel is what you use to manipulate a playing sound (e.g. you can set a sound's volume through its channel).
The maximum number of software channels determines how many sounds you can play simultaneously.
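For what it's worth, here is a rough sketch of how those pieces fit together, assuming the FMOD 3.x C API that the linked guide uses (names from memory, exact signatures may differ between versions, and "explosion.wav" is just a placeholder):

    // Minimal sketch (untested) of the FMOD 3.x C API discussed in the guide.
    #include "fmod.h"

    int main()
    {
        // 44100 Hz mix rate, 32 software channels, default flags:
        // up to 32 sounds can be mixed and heard at the same time.
        if (!FSOUND_Init(44100, 32, 0))
            return 1;

        // Load a sample into any free sample slot (path is just an example).
        FSOUND_SAMPLE* sample = FSOUND_Sample_Load(FSOUND_FREE, "explosion.wav", 0, 0, 0);

        // Playing the sample grabs one of the 32 software channels and returns
        // its handle; the channel is how you manipulate the playing sound.
        int channel = FSOUND_PlaySound(FSOUND_FREE, sample);
        FSOUND_SetVolume(channel, 128);   // volume range is 0..255

        // ... game loop ...

        FSOUND_Sample_Free(sample);
        FSOUND_Close();
        return 0;
    }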

Related

Audio analysis and feature extraction for music visualization

I have a general question regarding how I should proceed with my music visualization endeavors. I am interested in visualizing classical music pieces, either recorded or live. So far I have used Processing, but I am open to other software or programming languages, too. Since my background is in musicology and music theory, I would like to incorporate music-theoretical concepts or any tools that MIR research has made available for analyzing music, so that I could create visualizations based on them.
However, I don't really know where to start. I feel like I would like to study MIR concepts, but it can become very technical, and it seems that I don't need to be knowledgeable about all facets of MIR research. For example, I want to analyze and visualize music, but not to synthesize it. Or could it be that learning about sound synthesis would also help me with my visualizations? And to what extent?
Alternatively, would it be better to just pick some software/language and try its specific libraries for audio analysis?
Basically, my dilemma is whether I should start with MIR theory, or whether there is an alternative way of getting a good overview of all the features one could extract from a music piece, in order to have a richer palette for visualizations.
If you could give me some advice on this, I would be very grateful.
Thanks, Ilias

Easy and straight-forward FFT of MP3 file

I have an MP3 file and need to continuously detect and show the Hz value of the playing file. A bit of googling shows that I have two options: use an FFT or use Apple's Accelerate framework. Unfortunately I haven't found an easy-to-use sample of either. All the samples, like aurioTouch etc., need tons of code to get a simple number for a sample buffer. Is there any easy example of pitch detection for iOS?
For example, I've found https://github.com/clindsey/pkmFFT, but it's missing some files that its author has removed. Is there anything like that which works?
I'm afraid not. Working with sound is generally hard, and Core Audio is no exception. Now to the matter at hand.
The FFT is an algorithm for transforming input from the time domain to the frequency domain. It is not necessarily tied to sound processing; you can use it for things other than sound as well.
Accelerate is an Apple-provided framework which, among many other things, offers an FFT implementation. So you don't actually have two options there, just one technique and one implementation of it.
Now, depending on what you want to do (e.g. whether you favour speed over accuracy, robustness over simplicity, etc.) and the type of waveform you have (simple, complex, human speech, music), the FFT may not be enough on its own, or may not even be the right choice for your task. There are other options: auto-correlation, zero-crossing, cepstral analysis, and maximum likelihood, to mention a few. None of them are trivial, except for zero-crossing, which also gives you the poorest results and will fail on complex waveforms.
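To make that trade-off concrete, here is a minimal, untested sketch of the zero-crossing approach just mentioned (plain C++, no Accelerate): it is about the simplest pitch estimate you can write, and also the least robust.

    // Naive zero-crossing pitch estimate: count sign changes in a mono buffer
    // and convert them to a frequency. Only an illustration of the simplest
    // method mentioned above; it breaks down on polyphonic or noisy material.
    #include <cstddef>
    #include <vector>

    double estimatePitchZeroCrossing(const std::vector<float>& samples, double sampleRate)
    {
        std::size_t crossings = 0;
        for (std::size_t i = 1; i < samples.size(); ++i)
        {
            if ((samples[i - 1] < 0.0f) != (samples[i] < 0.0f))
                ++crossings;
        }
        // A pure tone produces two zero crossings per cycle.
        const double seconds = samples.size() / sampleRate;
        return (seconds > 0.0) ? (crossings / 2.0) / seconds : 0.0;
    }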
Here is a good place to start:
http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html
There are also other questions on SO covering this.
However, as indicated by other answers, this is not something that can just be "magically" done. Even if you license code from someone (e.g. iZotope and z-plane both make excellent code for doing what you want to do), you still need to understand what's going on to get data in and out of their libraries.
If you need fast pitch detection, go with http://www.schmittmachine.com/dywapitchtrack.html
You'll find iOS sample code inside.
If you need an FFT, you should use Apple's Accelerate framework.
Hope this helps.

Simple examples/applications of Bayesian Networks

Thanks for reading.
I want to implement a Bayesian network using Matlab's BNT toolbox. The thing is, I can't find "easy" examples, since this is the first time I've had to deal with BNs.
Can you propose some possible applications (with not many nodes), please? ^^
Have a look at Tom Mitchell's "Machine Learning" book, which covers the subject starting with small, simple examples. I suspect there are many course slides you could access online which also give simple examples.
I think it helps to start with higher level tools to get a feel for how to construct networks before constructing them in code. Having a UI also allows you to play with the network and get a feel for the way the networks behave (propagation, explaining away, etc).
For example, have a look at the free Genie (http://genie.sis.pitt.edu) and its samples, and/or the 50-node-limited Hugin Lite (http://www.hugin.com/productsservices/demo/hugin-lite) with its sample networks. You can then check your BNT implementations against these packages to make sure they produce the same results.
Edit: I forgot to mention Netica, which is another BN/influence diagram software package that I think has the biggest selection of examples: http://www.norsys.com/netlibrary/index.htm.
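To give a sense of how small a "simple" network can be, here is a tiny illustration in plain C++ (not BNT code, and the numbers are made up): a two-node Rain -> WetGrass network, with the posterior computed directly from Bayes' rule. This is the kind of result you could then reproduce in BNT and verify against Genie, Hugin or Netica.

    // Tiny two-node Bayesian network (Rain -> WetGrass) evaluated by hand with
    // Bayes' rule. Not BNT code -- just an illustration with made-up numbers.
    #include <cstdio>

    int main()
    {
        const double pRain           = 0.2;  // prior P(Rain = true)
        const double pWetGivenRain   = 0.9;  // P(WetGrass | Rain)
        const double pWetGivenNoRain = 0.1;  // P(WetGrass | not Rain)

        // Marginal probability of observing wet grass.
        const double pWet = pRain * pWetGivenRain + (1.0 - pRain) * pWetGivenNoRain;

        // Posterior: P(Rain | WetGrass) via Bayes' rule.
        const double pRainGivenWet = pRain * pWetGivenRain / pWet;

        std::printf("P(Rain | WetGrass) = %.3f\n", pRainGivenWet);  // ~0.692
        return 0;
    }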

How would you compare a spoken word to an audio file?

How would you go about comparing a spoken word to an audio file and determining if they match? For example, if I say "apple" to my iPhone application, I would like for it to record the audio and compare it with a prerecorded audio file of someone saying "apple". It should be able to determine that the two spoken words match.
What kind of algorithm or library could I use to perform this kind of voice-based audio file matching?
You should look up acoustic fingerprinting; see the Wikipedia link below. Shazam basically does this for music.
http://en.wikipedia.org/wiki/Acoustic_fingerprint
I know this question is old, but I discovered this library today:
http://www.ispikit.com/
Sphinx does voice recognition, and PocketSphinx has been ported to the iPhone by Brian King:
check https://github.com/KingOfBrian/VocalKit
He has provided excellent details and made it easy to implement yourself. I've run his example and modified my own rendition of it.
You can use a neural network library and teach it to recognize different speech patterns. This will require some know-how about the general theory of neural networks and how they can be used to create systems that behave a particular way. If you know nothing about the subject, you can start with just the basics and then use a library rather than implementing something yourself. Hope that helps.

Building better positional audio [AudioQueue manipulation]

I'm building an app that has a requirement for really accurate positional audio, down to the level of modelling inter-aural time difference (ITD), the slight delay difference between stereo channels that varies with a sound's position relative to a listener. Unfortunately, the iPhone's implementation of OpenAL doesn't have this feature, nor is a delay Audio Unit supplied in the SDK.
After a bit of reading around, I've decided that the best way to approach this problem is to implement my own delay by manipulating an AudioQueue (I can also see some projects in my future which may require learning this stuff, so this is as good an excuse to learn as any). However, I don't have any experience in low-level audio programming at all, and certainly none with AudioQueue. Trying to learn both:
a) the general theory of audio processing
and
b) the specifics of how AudioQueue implements that theory
is proving far too much to take in all at once :(
So, my questions are:
1) where's a good place to start learning about DSP and how audio generation and processing works in general (down to the level of how audio data is structured in memory, how mixing works, that kinda thing)?
2) what's a good way to get a feel for how AudioQueue does this? Are there any good examples of how to get it reading from a generated ring buffer, rather than just fetching bits of a file on demand with AudioFileReadPackets, like Apple's SpeakHere example does?
and, most importantly
3) is there a simpler way of doing this that I've overlooked?
I think Richard Lyons' "Understanding Digital Signal Processing" is widely revered as a good starter DSP book, though it's all math and no code.
If timing is so important, you'll likely want to use the Remote I/O audio unit, rather than the higher-latency audio queue. Some of the audio unit examples may be helpful to you here, like the "aurioTouch" example that uses the Remote I/O unit for capture and performs an FFT on it to get the frequencies.
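On the "how audio data is structured in memory and how mixing works" part of question 1: uncompressed audio is typically just an array of sample values, often interleaved left/right for stereo, and mixing mostly boils down to adding corresponding samples. A minimal sketch (the format and function here are illustrative assumptions, not a Core Audio API):

    // Mixing two interleaved stereo buffers (L R L R ...) of float samples in
    // the -1.0..1.0 range: add corresponding samples and clamp the result.
    // Purely illustrative; real mixers usually attenuate or soft-limit rather
    // than hard-clip.
    #include <algorithm>
    #include <cstddef>
    #include <vector>

    void mixInto(std::vector<float>& dst, const std::vector<float>& src)
    {
        const std::size_t n = std::min(dst.size(), src.size());
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = std::clamp(dst[i] + src[i], -1.0f, 1.0f);
    }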
If the built-in AL isn't going to do it for you, I think you've opted into the "crazy hard" level of difficulty.
Sounds like you should probably be on the coreaudio-api list (lists.apple.com), where Apple's Core Audio engineers hang out.
Another great resource for learning the fundamental basics of DSP and their applications is The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith. It is available online for free at http://www.dspguide.com/ but you can also order a printed copy.
I really like how the author builds up the fundamental theory in a way that is very palatable.
Furthermore, you should check out the Core Audio Public Utility, which you'll find at /Developer/Extras/CoreAudio/PublicUtility. It covers a lot of the basic structures you'll need to get in place in order to work with Core Audio.
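To make the ITD requirement itself concrete: at typical head sizes the inter-aural delay is well under a millisecond, i.e. a handful of samples at 44.1 kHz. Here is a rough, untested sketch of computing and applying it using Woodworth's spherical-head approximation (the constants and the whole-sample delay are simplifying assumptions; a real implementation would use fractional delays):

    // Rough sketch of the ITD idea from the question: compute the inter-aural
    // delay for a source azimuth and apply it as a whole-sample delay to the
    // far ear's channel. Uses Woodworth's approximation
    // ITD ~= (r / c) * (theta + sin(theta)).
    #include <cmath>
    #include <cstddef>
    #include <vector>

    std::size_t itdInSamples(double azimuthRadians, double sampleRate)
    {
        const double headRadius   = 0.0875;  // metres, typical approximation
        const double speedOfSound = 343.0;   // m/s at room temperature
        const double theta = std::abs(azimuthRadians);
        const double itdSeconds = (headRadius / speedOfSound) * (theta + std::sin(theta));
        return static_cast<std::size_t>(itdSeconds * sampleRate + 0.5);
    }

    // Delay one channel by 'delay' samples (zero-padding at the start,
    // dropping the same number of samples at the end to keep the length).
    void applyDelay(std::vector<float>& channel, std::size_t delay)
    {
        channel.insert(channel.begin(), delay, 0.0f);
        channel.resize(channel.size() - delay);
    }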