While there are plenty of tutorials for how to use AVCaptureSession to grab camera data, I can find no information (even on apple's dev network itself) on how to properly handle microphone data.
I have implemented AVCaptureAudioDataOutputSampleBufferDelegate, and I'm getting calls to my delegate, but I have no idea how the contents of the CMSampleBufferRef I get are formatted. Are the contents of the buffer one discrete sample? What are its properties? Where can these properties be set?
Video properties can be set using [AVCaptureVideoDataOutput setVideoSettings:], but there is no corresponding call for AVCaptureAudioDataOutput (no setAudioSettings or anything similar).
They are formatted as LPCM! You can verify this by getting the AudioStreamBasicDescription like so:
CMFormatDescriptionRef formatDescription = CMSampleBufferGetFormatDescription(sampleBuffer);
const AudioStreamBasicDescription *streamDescription = CMAudioFormatDescriptionGetStreamBasicDescription(formatDescription);
and then checking the stream description’s mFormatId.
Related
Okay guys, I've read many things about the FFT stuff, but it seems to be a bit more complicated than building a tableView.
I am searching for a way to analyze the playing audio (from iPod Library) in three ranges (low, mid, high). I think FFT is doing the job, but I'm not sure if I could filter (Lowpass, Bandpass and Highpass) the playing audio and analyze the peaks as well.
So if anyone knows what is the best (by best I mean, fastest (CPU) way to do so, please help me. There will be no front-end, so I won't draw the FFT in a Window (I guess the drawing does eat a lot of the cpu).
Then I have no idea how I could analyze the audio. All the FFT Sample Codes I found are using the mic. I do not want to use the mic. I saw something getting the Audio File and exporting it to a uncompressed file, but I need a live-analysation.
I've had a look at aurioTouch2, but I don't get how I could change the input from the mic to the iPod Library.
I think, the part I'm searching for is here:
// Initialize our remote i/o unit
inputProc.inputProc = PerformThru;
inputProc.inputProcRefCon = self;
CFURLRef url = NULL;
try {
url = CFURLCreateWithFileSystemPath(kCFAllocatorDefault, CFStringRef([[NSBundle mainBundle] pathForResource:#"button_press" ofType:#"caf"]), kCFURLPOSIXPathStyle, false);
XThrowIfError(AudioServicesCreateSystemSoundID(url, &buttonPressSound), "couldn't create button tap alert sound");
CFRelease(url);
// Initialize and configure the audio session
XThrowIfError(AudioSessionInitialize(NULL, NULL, rioInterruptionListener, self), "couldn't initialize audio session");
UInt32 audioCategory = kAudioSessionCategory_PlayAndRecord;
XThrowIfError(AudioSessionSetProperty(kAudioSessionProperty_AudioCategory, sizeof(audioCategory), &audioCategory), "couldn't set audio category");
XThrowIfError(AudioSessionAddPropertyListener(kAudioSessionProperty_AudioRouteChange, propListener, self), "couldn't set property listener");
Float32 preferredBufferSize = .005;
XThrowIfError(AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration, sizeof(preferredBufferSize), &preferredBufferSize), "couldn't set i/o buffer duration");
UInt32 size = sizeof(hwSampleRate);
XThrowIfError(AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareSampleRate, &size, &hwSampleRate), "couldn't get hw sample rate");
XThrowIfError(AudioSessionSetActive(true), "couldn't set audio session active\n");
XThrowIfError(SetupRemoteIO(rioUnit, inputProc, thruFormat), "couldn't setup remote i/o unit");
unitHasBeenCreated = true;
drawFormat.SetAUCanonical(2, false);
drawFormat.mSampleRate = 44100;
(...)
But I'm quite new to all of these AudioUnits, so I can't understand where an input is loaded. Then, the code mentioned above uses AVAudioSession. A little birdie told me, this will be deprecated, so what is the alternative?
So, basically:
How can I get the currently playing audio in order to do an analyzation? Can I just use a MPMusicPlayerController and get the samples? Or do I have to build a entire AudioUnit which plays the Library?
What is the fastest way (CPU) to analyze lows, mids and highs? Filtering? FFT? Something else?
Will I get in trouble with the Copyrights of bought music? Because I tried to convert the playing file to PCA Samples and sometimes I have this error:
VTM_AViPodReader[7666:307] * Terminating app
due to uncaught exception 'NSInvalidArgumentException', reason:
'* -[AVAssetReader initWithAsset:error:] invalid parameter not
satisfying: asset != ((void *)0)'
What is the "new" way to do an FFT if the whole AVAudioSession stuff won't work in the future?
You can't get the currently playing audio (security sandbox prevents this) on iOS, unless your app is the one playing the audio using certain select APIs (Audio Queue, RemoteIO, etc.)
3 bandpass filters (made with IIR biquads) will be faster than an FFT. But even a full FFT will use a very small percentage of CPU time.
An app can't convert or play protected music from the iTunes library in a form where samples can be captured.
The FFT is in the Accelerate framework, not in the audio session.
I'm using monotouch to create iphone applications and I need to encode audio that I receive from the mic into a gsm file.
I have already encoded audio into wav, but now, for more specific needs, I need to record it into GSM. If someone could tell me or show me some doc that explains either how to encode from mic into gsm or how to convert wav into gsm that would be awesome.
Ty,
Axel
--- UPDATE ---
There's an entry for MicrosoftGSM in MonoTouch.AudioToolbox.AudioFormatType, yet I get a 1718449215 OSStatus error. I guess that the reason is that my other arguments aren't corrent. Tough I don't know the specification for saving as GSM. Here's my not working code:
//set up the NSObject Array of values that will be combined with the keys to make the NSDictionary
NSObject[] values = new NSObject[]
{
NSNumber.FromFloat (44100.0f), //Sample Rate
NSNumber.FromInt32 ((int)MonoTouch.AudioToolbox.AudioFormatType.MicrosoftGSM),
NSNumber.FromInt32(2),
NSNumber.FromInt32((int)AVAudioQuality.High),
};
//Set up the NSObject Array of keys that will be combined with the values to make the NSDictionary
NSObject[] keys = new NSObject[]
{
AVAudioSettings.AVSampleRateKey,
AVAudioSettings.AVFormatIDKey,
AVAudioSettings.AVNumberOfChannelsKey,
AVAudioSettings.AVEncoderAudioQualityKey,
};
//Set Settings with the Values and Keys to create the NSDictionary
settings = NSDictionary.FromObjectsAndKeys (values, keys);
Bad news, there is no built in way to use AVAudioRecorder to record GSM audio files
The only supported audio recording formats are
MPEG4AAC
AppleLossless
AppleIMA4
iLBC
ULaw
LinearPCM
Anyways you could setup a webservice and use a third party converter like SoX that can convert the audio for you.
btw if you try to use the recorder using MicrosoftGSM format you will likely to get an OSStatus error 1718449215 which is the representation of kAudioFormatUnsupportedDataFormatError error
Alex
In my application, I am receiving audio data in LinearPCM format, which I need to play.
I am following iOS SpeakHere example. However I cannot get how and where I should provide a buffer to AudioQueue.
Can anyone provide me a working example of playing audio buffer in iOS via AudioQueue?
In the SpeakHere example playback is achieved using AudioQueue.
In the set up of AudioQueue, a function is specified that will be called when the queue wants more data.
You can see that in this method:
void AQPlayer::SetupNewQueue()
Here's the line that specifies the callback function:
XThrowIfError(AudioQueueNewOutput(&mDataFormat, AQPlayer::AQBufferCallback, this,
CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &mQueue), "AudioQueueNew failed");
If you take a look at AQPlayer::AQBufferCallback, you'll see where it gets the data from. In this example, the data has been written out to a file on disk. That's a good solution if you want to save memory, or if there's a possibility the audio file could be quite large.
Anyway, looking at AQPlayer::AQBufferCallback, you'll see a call to a function AudioFileReadPackets. That's what reads in the audio packets from the file on disk. It reads them straight into the buffer that AudioQueue will use:
OSStatus result = AudioFileReadPackets(THIS->GetAudioFileID(), false, &numBytes, inCompleteAQBuffer->mPacketDescriptions, THIS->GetCurrentPacket(), &nPackets,
inCompleteAQBuffer->mAudioData);
That buffer is inCompleteAQBuffer->mAudioData.
Finally, the callback function must enqueue the buffer as follows:
if (nPackets > 0) {
inCompleteAQBuffer->mAudioDataByteSize = numBytes;
inCompleteAQBuffer->mPacketDescriptionCount = nPackets;
AudioQueueEnqueueBuffer(inAQ, inCompleteAQBuffer, 0, NULL);
THIS->mCurrentPacket = (THIS->GetCurrentPacket() + nPackets);
}
Note first that it has to check that we have some packets to play. It also has to specify how many bytes are in the buffer.
Then, this line here:
THIS->mCurrentPacket = (THIS->GetCurrentPacket() + nPackets);
That keeps a track of where we are overall in our audio buffer. In other words, as more data is copied in from the file, we need to position the mCurrentPacket forward to that the next copy puts data in the correct place.
I'm wondering if there is a way to "listen" without recording and display the microphone's input levels?
Apples SpeakHere sample does the record and playback, and am wondering if there could a be a lighter version of just "listening" without actually recording and saving a file.
I use AudioQueues for this purpose. In your callback, get the input level like so:
AudioQueueLevelMeterState meter[NUM_INPUT_CHANNELS];
UInt32 dataSize = sizeof(meter);
AudioQueueGetProperty(aqInput, kAudioQueueProperty_CurrentLevelMeterDB, meter, &dataSize);
// input 'level' is in meter.mAveragePower
And simply don't write the audio into a file.
I want to use OpenAL to play music in an iOS game. The music files are stored in mp3 format and I want to stream them using a buffer queue. I load audio data into the buffers using AudioFileReadPacketData(). However playing the buffers only gives me noise. It works perfectly for caf files, but not for mp3s. Did I miss some vital step in decoding the file?
Code I use to open the sound file:
- (void) openFile:(NSString*)fileName {
NSBundle *bundle = [NSBundle mainBundle];
CFURLRef url = (CFURLRef)[[NSURL fileURLWithPath:[bundle pathForResource:fileName ofType:#"mp3"]] retain];
AudioFileOpenURL(url, kAudioFileReadPermission, 0, &audioFile);
AudioStreamBasicDescription theFormat;
UInt32 formatSize = sizeof(theFormat);
AudioFileGetProperty(audioFile, kAudioFilePropertyDataFormat, &formatSize, &theFormat);
freq = (ALsizei)theFormat.mSampleRate;
CFRelease(url);
}
Code I use to fill in buffers:
- (void) loadOneChunkIntoBuffer:(ALuint)buffer {
char data[STREAM_BUFFER_SIZE];
UInt32 loadSize = STREAM_BUFFER_SIZE;
AudioStreamPacketDescription packetDesc[STREAM_PACKETS];
UInt32 numPackets = STREAM_PACKETS;
AudioFileReadPacketData(audioFile, NO, &loadSize, packetDesc, packetsLoaded, &numPackets, data);
alBufferData(buffer, AL_FORMAT_STEREO16, data, loadSize, freq);
packetsLoaded += numPackets;
}
Because you're reading bytes of MP3 data and treating them as PCM data.
You almost certainly want AudioFileReadPacketData(). EDIT: Except that still gives you MP3 data; it just gives it in packets and (possibly) parses packet headers.
If you don't require OpenAL, AVAudioPlayer is probably the better way to go (according to the Multimedia Programming Guide, there's also Audio Queue services if you want more control).
If you really need to use OpenAL, according to TN2199 you'll need to convert it to PCM in the native byte order. See oalTouch/Classes/MyOpenALSupport.c for an example of using Extended Audio File Services to do this. Note that TN2199 says the format "must ... not use hardware decompression" — according to the Multimedia Programming Guide, software decoding is supported for everything except HE-AAC since OS 3.0. Also note that software MP3 decoding can use a significant amount of CPU time.
Alternatively, explicitly convert the audio using AudioConverter or (possibly) AudioUnit with kAudioUnitSubType_AUConverter. If you do this, it might be worthwhile decompressing everything once and keeping it in memory to minimize overhead.