I'm trying to get samples from an AudioQueue to show the spectrum of the music (like in iTunes) on iPhone.
I've read a lot of posts, but almost all of them ask about getting samples when recording, not playing :(
I'm using Audio Queue Services for streaming audio. Please help me understand the following points:
1/ Where can I get access to the samples (PCM, not MP3; I'm using an MP3 stream)?
2/ Should I collect samples in my own buffer to apply the FFT?
3/ Is it possible to get frequencies without FFT transformations?
4/ How can I synchronize my FFT position in the buffer with the samples currently playing?
thanks,
Update: AudioQueueProcessingTapNew works fine for me on iOS 6+. But what about iOS 5?
For playing audio, the idea is to get at the samples before you feed them to the Audio Queue callback. You may need to convert any compressed audio file format into raw PCM samples beforehand. This can be done using one of the AVFoundation converter or file reader services.
You can then copy frames of data from the same source used to feed the Audio Queue callback buffers, and apply your FFT or other DSP for visualization to them.
You can use either FFTs or a bank of band-pass filters to get frequency info, but the FFT is very efficient at this.
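As a rough illustration of the non-FFT route, here is a minimal sketch in plain Python (not Core Audio code) that measures the energy in a few chosen bands with the Goertzel recurrence. Goertzel is swapped in here as a simple stand-in for a bank of narrow band-pass filters; the function names and the choice of band centres are assumptions made for the example.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Return the signal power near `freq` (Hz) using the Goertzel recurrence."""
    n = len(samples)
    k = round(n * freq / sample_rate)            # nearest DFT bin to `freq`
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Squared magnitude of DFT bin k, computed from the last two states.
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def band_levels(samples, sample_rate, centers):
    """One power value per band centre: enough for a coarse bar display."""
    return [goertzel_power(samples, sample_rate, f) for f in centers]
```

For a handful of bars this costs one pass over the frame per band. Once you want dozens of bands, a single FFT over the frame is usually cheaper.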
Synchronization needs to be done by trial and error, as Apple does not specify exact audio and view graphic display latencies, which may differ between iOS devices and OS versions anyway. But short Audio Queue buffers or using the RemoteIO Audio Unit may give you better control of the audio latency, and OpenGL ES will give you better control of the graphic latency.
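One way to make that trial-and-error tuning manageable is to keep recent analysis frames together with the time they were handed to the audio system, and display the frame that is one latency value old. The sketch below is plain Python, not Core Audio code. The class name, the single `latency_seconds` knob, and the timestamps are all assumptions for illustration; the right latency still has to be found by experiment on each device.

```python
from collections import deque

class DelayedFrames:
    """Holds recent analysis frames and returns the one that should be audible now."""

    def __init__(self, latency_seconds, max_frames=64):
        self.latency = latency_seconds            # tuned by hand, per device
        self.frames = deque(maxlen=max_frames)    # (enqueue_time, frame), oldest first

    def push(self, enqueue_time, frame):
        """Record a frame at the moment its samples are queued for playback."""
        self.frames.append((enqueue_time, frame))

    def frame_for_display(self, now):
        """Return the newest frame queued at or before (now - latency), or None."""
        target = now - self.latency
        current = None
        for queued_at, frame in self.frames:
            if queued_at <= target:
                current = frame
            else:
                break
        return current
```

The display loop would call `frame_for_display` with the current time on every redraw, and you would adjust `latency_seconds` until the picture lines up with what you hear.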
Related
I am attempting to stream live audio from an iOS device to a web browser. The iOS device sends small, mono wav files (as they are recorded) through a web socket. Once the client receives the wav files, I have the Web Audio API decode and schedule them accordingly.
This gets me about 99% of the way there, except I can hear clicks between each audio chunk. After some reading around, I have realized the likely source of my problem: the audio is being recorded at a sample rate of only 4 kHz, and this cannot be changed. It appears that the Web Audio API's decodeAudioData() function does not handle sample rates other than 44.1 kHz with exact precision, resulting in gaps between chunks.
I have tried literally everything I could find about this problem (scriptProcessorNodes, adjusting the timing, creating new buffers, even manually upsampling) and none of it has worked. At this point I am about to abandon the Web Audio API.
Is the Web Audio API appropriate for this?
Is there a better alternative for what I am trying to accomplish?
Any help/suggestions are appreciated, thanks!
At last! AudioFeeder.js works wonders. I just specify the sampling rate of 4 kHz, feed it raw 32-bit PCM data, and it outputs a consistent stream of seamless audio! It even has built-in buffer handling events, so there is no need to set up any loops or timeouts to schedule chunk playback. I did have to tweak it a bit, though, to connect it to the rest of my Web Audio nodes and not just context.destination.
Note: AudioFeeder does automatically upsample to the audio context sampling rate. Going from 4 kHz to 44.1 kHz did introduce some pretty gnarly-sounding artifacts in the high end, but a 48 dB/octave low-pass filter (4 x 12 dB) at 2 kHz got rid of them. I chose 2 kHz because, thanks to Harry Nyquist, I know that a sampling rate of 4 kHz couldn't possibly have produced frequencies above 2 kHz in the original file.
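To show what a "4 x 12 dB" low-pass amounts to, here is a minimal sketch in plain Python rather than Web Audio code: four identical second-order (12 dB/octave) low-pass sections run in series. The coefficient formulas are the widely used "Audio EQ Cookbook" low-pass ones. The function names and the Q of about 0.707 are assumptions, and four identical sections are not a true 8th-order Butterworth (the response is about 12 dB down at the cutoff itself).

```python
import math

def lowpass_coeffs(cutoff_hz, sample_rate, q=1.0 / math.sqrt(2.0)):
    """Second-order low-pass coefficients, normalised so that a0 == 1."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cosw = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cosw) / 2.0 / a0, (1.0 - cosw) / a0, (1.0 - cosw) / 2.0 / a0]
    a = [-2.0 * cosw / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(samples, b, a):
    """Run one second-order section (direct form I) over the samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def lowpass_cascade(samples, cutoff_hz, sample_rate, stages=4):
    """Apply `stages` identical 12 dB/octave sections in series."""
    b, a = lowpass_coeffs(cutoff_hz, sample_rate)
    for _ in range(stages):
        samples = biquad(samples, b, a)
    return samples

def rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))
```

With a 2 kHz cutoff at a 44.1 kHz output rate, content well below 2 kHz passes almost unchanged, while the images created by upsampling (which all sit above 2 kHz) are strongly attenuated.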
All hail Brion Vibber
I'm currently playing a stream in my iOS app, but one feature we'd like to add is a visualization of the output wave. I use an output audio queue to play the stream, but have found no way to read the output buffer. Can this be achieved using audio queues, or does it have to be done with a lower-level API?
To visualize, you presumably need PCM (uncompressed) data, so if you're pushing some compressed format into the queue like MP3 or AAC, then you never see the data you need. If you were working with PCM (maybe you're uncompressing it yourself with the Audio Conversion APIs), then you could visualize before putting samples into the queue. But then the problem would be latency - you want to visualize samples when they play, not when they go into the queue.
For latency reasons alone, you probably want to be using audio units.
It cannot actually be done with audio queues. To do this, I need to implement the streamer with audio units.
I want to apply an audio filter to the user's voice on iPhone.
The filter is quite heavy and needs many audio samples to reach the desired quality. I do not want to apply the filter in real time, but I want almost real-time performance. I would like the processing to happen in parallel with the recording, as soon as the necessary samples have been collected, so that when the user stops recording they hear the distorted sound after a few seconds.
My questions are:
1. Which is the right technology layer for this task e.g. audio units?
2. Which are the steps involved?
3. Which are the key concepts and API methods to use?
4. I want to capture the user's voice. Which are the right recording settings for this? If my filter alters the frequency, should I use a wider range?
5. How can I collect the necessary samples for my filter? How can I handle the audio data? I mean, depending on the recording settings, how is the data packed?
6. How can I write the final audio recording to a file?
Thanks in advance!
If you find a delay of over a hundred milliseconds acceptable, you can use the Audio Queue API, which is a bit simpler than using the RemoteIO Audio Unit, for both capture and audio playback. You can process the buffers in your own NSOperationQueue as they come in from the audio queue, and either save the processed results to a file or just keep them in memory if there is room.
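The hand-off described above (capture callback on one side, heavy processing on another thread) can be sketched as follows. This is plain Python using a thread and a queue as a stand-in for NSOperationQueue. `heavy_filter` is a hypothetical placeholder for the real filter, and the class name is an assumption.

```python
import queue
import threading

def heavy_filter(block):
    # Hypothetical stand-in for the expensive filter: it only halves the amplitude.
    return [0.5 * x for x in block]

class BackgroundProcessor:
    """Processes captured blocks on a worker thread, in arrival order."""

    def __init__(self, process):
        self._process = process
        self._pending = queue.Queue()
        self.results = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, block):
        """Called from the capture side; copies the block and returns at once."""
        self._pending.put(list(block))

    def finish(self):
        """Called when recording stops; waits for the remaining blocks."""
        self._pending.put(None)                  # sentinel: no more input
        self._worker.join()
        return self.results

    def _run(self):
        while True:
            block = self._pending.get()
            if block is None:
                break
            self.results.append(self._process(block))
```

The capture side only copies data and returns, so recording is never blocked by the filter. When the user stops, `finish()` waits for whatever is still queued, which is where the "few seconds" of delay would come from.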
For Question 4: If your audio filter is linear, then you won't need any wider frequency range. If you are doing non-linear filtering, all bets are off.
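The linear versus non-linear distinction can be checked numerically. In this minimal Python sketch (the function names, the 3-tap moving average, and the squaring non-linearity are all illustrative assumptions), a linear filter applied to a 1 kHz tone leaves energy only at 1 kHz, while squaring the same tone creates a new component at 2 kHz.

```python
import cmath
import math

def tone(freq, sample_rate, count):
    return [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(count)]

def bin_amplitude(samples, sample_rate, freq):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * i / sample_rate)
              for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n

def moving_average(samples, taps=3):
    """A linear filter: each output is the mean of the last `taps` inputs."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - taps + 1): i + 1]
        out.append(sum(window) / taps)
    return out

def square_law(samples):
    """A non-linear operation: squares every sample."""
    return [x * x for x in samples]

signal = tone(1000.0, 8000.0, 800)
linear_out = moving_average(signal)
nonlinear_out = square_law(signal)
```

Comparing `bin_amplitude(linear_out, 8000.0, 2000.0)` with `bin_amplitude(nonlinear_out, 8000.0, 2000.0)` shows the difference: the first is close to zero, the second is about 0.5. New frequencies like that are why a non-linear filter may need more bandwidth than the input signal occupies.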
Hi, unfortunately I've not been able to figure out audio on the iPhone. The closest I've come is the AVAudioRecorder/Player classes, and I know that they are no good for audio processing.
So I'm wondering if someone would be able to explain to me how to "listen" to the iPhone's mic input in chunks of, say, 1024 samples, analyse the samples and do stuff, and just keep going like that until my app terminates or tells it to stop. I'm not looking to save any data; all I want is to analyse the data in real time and do stuff in real time with it.
I've tried to understand Apple's "aurioTouch" example, but it's just way too complicated for me to understand.
So can someone explain to me how I should go about this?
If you want to analyze audio input in real-time, it doesn't get a lot simpler than Apple's aurioTouch iOS sample app with source code (there is also a mirror site). You can google a bit more info on using the Audio Unit RemoteIO API for recording, but you'll still have to figure out the real-time analysis DSP portion.
The Audio Queue API is a slight bit simpler for getting input buffers of raw PCM audio data from the mic, but not much simpler, and it has a higher latency.
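Whichever API delivers the input, the buffers rarely arrive in exactly the chunk size you want to analyse. Here is a minimal sketch of the regrouping step, in plain Python rather than Core Audio code. The class name, the callback name, and the peak-level "analysis" are assumptions for the example.

```python
class Chunker:
    """Regroups arbitrarily sized input buffers into fixed-size chunks."""

    def __init__(self, chunk_size, on_chunk):
        self.chunk_size = chunk_size
        self.on_chunk = on_chunk       # called once per complete chunk
        self._pending = []

    def feed(self, samples):
        """Call this from the input callback with whatever samples arrived."""
        self._pending.extend(samples)
        while len(self._pending) >= self.chunk_size:
            chunk = self._pending[:self.chunk_size]
            del self._pending[:self.chunk_size]
            self.on_chunk(chunk)

    def pending_count(self):
        return len(self._pending)

def peak_level(chunk):
    """Trivial example analysis: the largest absolute sample value in the chunk."""
    return max(abs(x) for x in chunk)
```

In a real-time callback you would normally avoid allocating memory like this and copy into a preallocated ring buffer instead. The sketch only shows the bookkeeping.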
Added later: There's also a version of aurioTouch converted to Swift here: https://github.com/ooper-shlab/aurioTouch2.0-Swift
The AVAudioPlayer/Recorder classes won't take you there if you want to do any real-time audio processing. The Audio Toolbox and Audio Unit frameworks are the way to go. Check here for Apple's audio programming guide to see which framework suits your needs. And believe me, this low-level stuff is not easy and is poorly documented. CocoaDev has some tutorials where you can find sample code. Also, there is an audio DSP library, DIRAC, that I recently discovered for tempo and pitch manipulation. I haven't looked into it much, but you might find it useful.
If all you want is samples with a minimum amount of processing by the OS, you probably want the Audio Queue API; see Audio Queue Services Programming Guide.
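Once you have a buffer of raw bytes, what the "samples" are depends entirely on the format the queue was configured with. As a minimal sketch, assuming 16-bit signed little-endian interleaved linear PCM (an assumption; your stream description may differ), the bytes can be turned into per-channel floats like this in plain Python. The function name is hypothetical.

```python
import struct

def decode_pcm16(raw, channels=1):
    """Convert 16-bit little-endian interleaved PCM bytes to floats in [-1, 1).

    Returns one list of samples per channel.
    """
    count = len(raw) // 2                         # two bytes per sample
    ints = struct.unpack("<%dh" % count, raw[:count * 2])
    floats = [v / 32768.0 for v in ints]
    # Interleaved layout: L R L R ... so channel c is every `channels`-th sample.
    return [floats[c::channels] for c in range(channels)]
```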
AVAudioRecorder is designed for recording to a file, and AudioUnit is more for "pluggable" audio processing (and on the Mac side of things, AU Lab is actually pretty cool).
I am new to Core Audio and really lost. I am trying to record audio, then apply voice modulation to that recording and play it back. I have looked at the SpeakHere example, which uses Audio Queue for audio recording. I am stuck on how to change the audio samples. I understand that it can be done by using an Audio Unit and changing the audio samples in the callback function, but I have no idea what to apply to those samples to change them (will changing the pitch help?).
If you could direct me to some source code, a tutorial, or any site that explains voice modulation for Objective-C, it would really help me. Thank you all in advance.
What you are trying to do here is not that simple. Basically, you would have to implement a vocoder ("voice-coder") to change a voice. The Wikipedia links should help you there.
Then, you still have to manipulate those samples in CoreAudio. You can do this using Audio Queue Services, but that is not exactly an easy-to-use API. It might actually be less trouble to use one of the simpler CoreAudio APIs and wrap your vocoder in an Audio Unit.
Do you have some experience with audio processing? Implementing a vocoder without some prior knowledge about audio processing in general is a tough task.
First, to actually answer your question: when you call the AudioQueueNewInput() function, you pass it the name of a routine that will be called every time data is available to you. You probably called it MyInputBufferHandler() or something. Its third argument is an AudioQueueBufferRef, which holds the incoming data.
Be aware that this is not as simple as looking at each sample (amplitude) and lowering or raising it. You receive samples in the temporal (time) domain as amplitudes. There is no pitch or frequency information available. What you need to do is move the incoming samples (waveform) into the frequency domain, wherein each "point" in that space is a frequency and its accompanying power and phase. You can do that with an FFT (fast Fourier transform), but the mathematics are somewhat sophisticated. Apple does provide FFT routines in the Accelerate framework, but be aware that you are wading into very deep water here.
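To make "each point is a frequency with its power and phase" concrete, here is a minimal sketch in plain Python of a textbook radix-2 FFT. It is not Apple's routine; on a device you would use the vDSP functions in Accelerate instead. The function names are assumptions, and the frame length must be a power of two.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def spectrum(samples, sample_rate):
    """Return (frequency_hz, magnitude, phase_radians) for each bin below Nyquist."""
    n = len(samples)
    bins = fft(samples)
    return [(k * sample_rate / n, abs(bins[k]), cmath.phase(bins[k]))
            for k in range(n // 2)]

def dominant_frequency(samples, sample_rate):
    """Frequency of the strongest bin, ignoring DC."""
    return max(spectrum(samples, sample_rate)[1:], key=lambda b: b[1])[0]
```

Each bin covers `sample_rate / n` Hz, so a 1024-sample frame at 44.1 kHz resolves roughly 43 Hz per bin. Changing the voice means modifying these bins (or a model derived from them) and transforming back, which is where the difficulty starts.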