Has anyone had success converting 32KHz PCM to 96Kbit AAC on iPhone/iOS?
I can not get this to work correctly on any hardware device. The code I wrote only works correctly in the simulator. When run on current-generation iPad/iPod/iPhone, my code 'skips' large chunks of audio.
The resulting encoded stream contains a repeating pattern of ~640ms of 'good' audio followed by ~640ms of 'bad' audio.
Encoding both 16bit linear and 8.24 fixed-point PCM yielded the same results.
Here is the code to set up an Audio Converter to encode MPEG4-AAC at 96 kbit/s @ 32 kHz:
AudioStreamBasicDescription descPCMFormat;
descPCMFormat.mSampleRate = 32000;
descPCMFormat.mChannelsPerFrame = 1;
descPCMFormat.mBitsPerChannel = sizeof(AudioUnitSampleType) * 8;
descPCMFormat.mBytesPerPacket = sizeof(AudioUnitSampleType);
descPCMFormat.mFramesPerPacket = 1;
descPCMFormat.mBytesPerFrame = sizeof(AudioUnitSampleType);
descPCMFormat.mFormatID = kAudioFormatLinearPCM;
descPCMFormat.mFormatFlags = kAudioFormatFlagsAudioUnitCanonical;
AudioStreamBasicDescription descAACFormat;
descAACFormat.mSampleRate = 32000;
descAACFormat.mChannelsPerFrame = 1;
descAACFormat.mBitsPerChannel = 0;
descAACFormat.mBytesPerPacket = 0;
descAACFormat.mFramesPerPacket = 1024;
descAACFormat.mBytesPerFrame = 0;
descAACFormat.mFormatID = kAudioFormatMPEG4AAC;
descAACFormat.mFormatFlags = 0;
AudioConverterNew(& descPCMFormat, & descAACFormat, &m_hCodec);
UInt32 ulBitRate = 96000;
UInt32 ulSize = sizeof(ulBitRate);
AudioConverterSetProperty(m_hCodec, kAudioConverterEncodeBitRate, ulSize, & ulBitRate);
Simple conversion routine. This routine is called every 32ms with a block of 1024 PCM samples, and expects 384 bytes of encoded AAC:
OSStatus CMyObj::Convert(
const AudioUnitSampleType * pSrc,
const size_t ulSrc,
uint8_t * pDst,
size_t & ulDst)
{
// error and sanity checking removed..
// assume caller is converting 1024 samples to at most 384 bytes
OSStatus osStatus;
m_pSrcPtr = (uint8_t*)pSrc;
m_ulSrcLen = ulSrc; // verified to be 1024*sizeof(AudioUnitSampleType);
AudioBufferList destBuffers;
destBuffers.mNumberBuffers = 1;
destBuffers.mBuffers[0].mNumberChannels = 1;
destBuffers.mBuffers[0].mDataByteSize = 384;
destBuffers.mBuffers[0].mData = pDst;
AudioStreamPacketDescription destDescription;
destDescription.mStartOffset = 0;
destDescription.mVariableFramesInPacket = 0;
destDescription.mDataByteSize = 384;
UInt32 ulDstPackets = 1;
osStatus = AudioConverterFillComplexBuffer(
m_hCodec,
InputDataProc,
this,
& ulDstPackets,
& destBuffers,
& destDescription);
ulDst = destBuffers.mBuffers[0].mDataByteSize;
return osStatus;
}
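The numbers used above (32 ms per call, 384 bytes per packet) follow from the sample rate, the packet size and the bit rate. Here is a minimal sketch of that arithmetic in plain C; the helper names are hypothetical and nothing here calls Core Audio. Note that 384 bytes is only the average packet size at the target bit rate. This sketch does not show that every packet fits in 384 bytes; Core Audio has the kAudioConverterPropertyMaximumOutputPacketSize property for asking the converter about an upper bound.

```c
#include <assert.h>
#include <stdint.h>

/* Duration of one encoder packet in milliseconds (rounded down). */
static uint32_t packet_duration_ms(uint32_t frames_per_packet,
                                   uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return (uint32_t)(((uint64_t)frames_per_packet * 1000u) / sample_rate_hz);
}

/* Average encoded bytes per packet at the requested bit rate (rounded down).
   Individual packets may be larger or smaller than this average. */
static uint32_t avg_bytes_per_packet(uint32_t bit_rate_bps,
                                     uint32_t frames_per_packet,
                                     uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return (uint32_t)(((uint64_t)bit_rate_bps * frames_per_packet) /
                      (8u * (uint64_t)sample_rate_hz));
}
```

For 1024 frames at 32000 Hz and 96000 bit/s this gives 32 ms and 384 bytes, and 20 such packets span 640 ms, which matches the length of the 'good'/'bad' runs described in the question.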
The input data procedure simply provides the 1024 samples to the encoder:
static OSStatus CMyObj::InputDataProc(
AudioConverterRef hCodec,
UInt32 *pulSrcPackets,
AudioBufferList *pSrcBuffers,
AudioStreamPacketDescription **ppPacketDescription,
void *pUserData)
{
// error and sanity checking removed
CMyObj *pThis = (CMyObj*)pUserData;
const UInt32 ulMaxSrcPackets = pThis->m_ulSrcLen / sizeof(AudioUnitSampleType);
const UInt32 ulRetSrcPackets = min(ulMaxSrcPackets, *pulSrcPackets);
if( ulRetSrcPackets )
{
UInt32 ulRetSrcBytes = ulRetSrcPackets * sizeof(AudioUnitSampleType);
*pulSrcPackets = ulRetSrcPackets;
pSrcBuffers->mBuffers[0].mData = pThis->m_pSrcPtr;
pSrcBuffers->mBuffers[0].mDataByteSize = ulRetSrcBytes;
pSrcBuffers->mBuffers[0].mNumberChannels = 1;
pThis->m_pSrcPtr += ulRetSrcBytes;
pThis-> m_ulSrcLen -= ulRetSrcBytes;
return noErr;
}
*pulSrcPackets = 0;
pSrcBuffers->mBuffers[0].mData = NULL;
pSrcBuffers->mBuffers[0].mDataByteSize = 0;
pSrcBuffers->mBuffers[0].mNumberChannels = 1;
return 500; // local error code to signal end-of-packet
}
Everything works fine when run on the simulator.
When run on the device, however, InputDataProc is not called consistently. For up to 20 times in a row, calls to AudioConverterFillComplexBuffer provoke calls to InputDataProc, and everything looks fine. Then, for the next ~ 21 calls to AudioConverterFillComplexBuffer, InputDataProc will NOT be called. This pattern repeats forever:
-> Convert
-> AudioConverterFillComplexBuffer
-> InputDataProc
-> results in 384 bytes of 'good' AAC
-> Convert
-> AudioConverterFillComplexBuffer
-> InputDataProc
-> results in 384 bytes of 'good' AAC
.. repeats up to 18 more times
-> Convert
-> AudioConverterFillComplexBuffer
-> results in 384 bytes of 'bad' AAC
-> Convert
-> AudioConverterFillComplexBuffer
-> results in 384 bytes of 'bad' AAC
.. repeats up to 18 more times
Where is the converter getting the input data to create the 'bad' AAC, since it isn't calling InputDataProc?
Does anyone see anything glaringly wrong with this approach?
Are there any special settings that need to be made on the hardware codec (MagicCookies or ?) ?
Does the HW AAC codec support 32000 sample rate?
I found that the default output bit rate is 48000 bit/s for 32 kHz input PCM and 64000 bit/s for 44.1 kHz input PCM.
With the default output bit rate, 32 kHz input produces a lot of noise.
Even with the code from Apple's sample, 44.1 kHz input has a little noise.
After I fixed the output bit rate at 64 kbit/s, both 32 kHz and 44.1 kHz worked well.
UInt32 outputBitRate = 64000; // 64kbs
UInt32 propSize = sizeof(outputBitRate);
if (AudioConverterSetProperty(m_converter, kAudioConverterEncodeBitRate, propSize, &outputBitRate) != noErr) {
// log only when setting the bit rate actually failed
NSLog(@"upyun.com uplivesdk UPAACEncoder error 102");
}
I am working on an iOS project that needs to encode and decode Speex audio using a remoteIO audio unit as input / output.
The problem I am having is that, although Speex doesn't print any errors, the audio I get is recognizable as voice but very distorted; it sounds as if the gain were cranked up in a robotic way.
Here are the encode and decode functions (Input to encode is 320 bytes of signed integers from the audio unit render function, Input to decode is 62 bytes of compressed data ):
#define AUDIO_QUALITY 10
#define FRAME_SIZE 160
#define COMP_FRAME_SIZE 62
char *encodeSpeexWithBuffer(spx_int16_t *buffer, int *insize) {
SpeexBits bits;
void *enc_state;
char *outputBuffer = (char *)malloc(200);
speex_bits_init(&bits);
enc_state = speex_encoder_init(&speex_nb_mode);
int quality = AUDIO_QUALITY;
speex_encoder_ctl(enc_state, SPEEX_SET_QUALITY, &quality);
speex_bits_reset(&bits);
speex_encode_int(enc_state, buffer, &bits);
*insize = speex_bits_write(&bits, outputBuffer, 200);
speex_bits_destroy(&bits);
speex_encoder_destroy(enc_state);
return outputBuffer;
}
short *decodeSpeexWithBuffer(char *buffer) {
SpeexBits bits;
void *dec_state;
speex_bits_init(&bits);
dec_state = speex_decoder_init(&speex_nb_mode);
short *outTemp = (short *)malloc(FRAME_SIZE * 2);
speex_bits_read_from(&bits, buffer, COMP_FRAME_SIZE);
speex_decode_int(dec_state, &bits, outTemp);
speex_decoder_destroy(dec_state);
speex_bits_destroy(&bits);
return outTemp;
}
And the audio unit format:
// Describe format
audioFormat.mSampleRate = 8000.00;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger |
kAudioFormatFlagsNativeEndian |
kAudioFormatFlagIsPacked;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 1;
audioFormat.mBitsPerChannel = 16;
audioFormat.mBytesPerPacket = 2;
audioFormat.mBytesPerFrame = 2;
No errors are reported anywhere and I have confirmed that the Audio Unit is processing at a sample rate of 8000
After a few days of going crazy over this I finally figured it out. The trick with Speex is that you must initialize one SpeexBits struct and one encoder state (the void *) and use them throughout the entire session. Because I was re-creating them for every frame I encoded, the results sounded strange.
Once I moved:
speex_bits_init(&bits);
enc_state = speex_encoder_init(&speex_nb_mode);
Out of the while loop everything worked great.
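The effect of re-creating codec state for every frame can be shown without Speex at all. The sketch below is a toy stand-in, not Speex: ToyState, toy_init and toy_process are hypothetical names, and the two-tap average is only there because it carries one sample of history between frames, the way a real speech codec carries far more. The idea is that resetting the state at each frame boundary introduces an artifact at each boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a stateful codec: one sample of history. NOT Speex. */
typedef struct {
    int32_t history;
} ToyState;

static void toy_init(ToyState *st)
{
    assert(st != NULL);
    st->history = 0;
}

/* out[i] = (in[i] + previous input) / 2; the previous input is kept in
   the state so that it survives from one frame to the next. */
static void toy_process(ToyState *st, const int16_t *in, int16_t *out, size_t n)
{
    assert(st != NULL && in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = (int16_t)((in[i] + st->history) / 2);
        st->history = in[i];
    }
}
```

With a constant input of 1000, a state that is initialized once produces 500 only for the very first sample and 1000 afterwards. Calling toy_init before every frame produces a 500 at the start of every frame, i.e. a periodic glitch at the frame rate.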
I am currently in the process of building an application that reads in audio from my iPhone's microphone, and then does some processing and visuals. Of course I am starting with the audio stuff first, but am having one minor problem.
I am defining my sampling rate to be 44100 Hz and defining my buffer to hold 4096 samples, which it does. However, when I print this data out and copy it into MATLAB to double-check accuracy, the sample rate I have to use is half of my iPhone-defined rate, or 22050 Hz, for it to be correct.
I think it has something to do with the following code and how it is putting 2 bytes per packet, and when I am looping through the buffer, the buffer is spitting out the whole packet, which my code assumes is a single number. So what I am wondering is how to split up those packets and read them as individual numbers.
- (void)setupAudioFormat {
memset(&dataFormat, 0, sizeof(dataFormat));
dataFormat.mSampleRate = kSampleRate;
dataFormat.mFormatID = kAudioFormatLinearPCM;
dataFormat.mFramesPerPacket = 1;
dataFormat.mChannelsPerFrame = 1;
// dataFormat.mBytesPerFrame = 2;
// dataFormat.mBytesPerPacket = 2;
dataFormat.mBitsPerChannel = 16;
dataFormat.mReserved = 0;
dataFormat.mBytesPerPacket = dataFormat.mBytesPerFrame = (dataFormat.mBitsPerChannel / 8) * dataFormat.mChannelsPerFrame;
dataFormat.mFormatFlags =
kLinearPCMFormatFlagIsSignedInteger |
kLinearPCMFormatFlagIsPacked;
}
If what I described is unclear, please let me know. Thanks!
EDIT
Adding the code that I used to print the data
float *audioFloat = (float *)malloc(numBytes * sizeof(float));
int *temp = (int*)inBuffer->mAudioData;
int i;
float power = pow(2, 31);
for (i = 0;i<numBytes;i++) {
audioFloat[i] = temp[i]/power;
printf("%f ",audioFloat[i]);
}
I found the problem with what I was doing. It was a C pointer issue; since I had never really programmed in C before, I got it wrong.
You cannot directly cast inBuffer->mAudioData to an int array, because the samples are 16 bits wide. What I did instead was the following:
SInt16 *buffer = (SInt16 *)inBuffer->mAudioData; // view the queue's data as 16-bit samples; no malloc needed
This worked out just fine, and now my data has the correct length and the samples are represented properly.
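One plausible reading of the halved sample rate is that a 4-byte int pointer walks over 2-byte samples, so each value read packs two samples and only half as many values come out. Here is a minimal sketch in plain C, assuming packed 16-bit signed PCM in host byte order; pcm16_bytes_to_float is a hypothetical helper, not part of Audio Queue Services.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert a byte buffer of packed 16-bit signed PCM to floats in [-1, 1).
   The sample count is derived from the byte count and the 2-byte sample
   size. Returns the number of samples written (at most max_out). */
static size_t pcm16_bytes_to_float(const void *bytes, size_t num_bytes,
                                   float *out, size_t max_out)
{
    assert(bytes != NULL && out != NULL);
    size_t count = num_bytes / sizeof(int16_t);
    if (count > max_out)
        count = max_out;
    for (size_t i = 0; i < count; i++) {
        int16_t s;
        /* memcpy avoids any alignment assumption about the source buffer */
        memcpy(&s, (const uint8_t *)bytes + i * sizeof s, sizeof s);
        out[i] = (float)s / 32768.0f; /* 2^15, not 2^31 */
    }
    return count;
}
```

An 8-byte buffer therefore holds 4 samples when read as 16-bit values, but only 2 values when read through a 32-bit pointer.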
I saw your answer. There is also an underlying issue that gives wrong sample values: the bytes are swapped because of an endianness mismatch.
-(void)feedSamplesToEngine:(UInt32)audioDataBytesCapacity audioData:(void *)audioData {
int sampleCount = audioDataBytesCapacity / sizeof(SAMPLE_TYPE);
SAMPLE_TYPE *samples = (SAMPLE_TYPE*)audioData;
//SAMPLE_TYPE *sample_le = (SAMPLE_TYPE *)malloc(sizeof(SAMPLE_TYPE)*sampleCount );//for swapping endians
std::string shorts;
double power = pow(2,10);
for(int i = 0; i < sampleCount; i++)
{
SAMPLE_TYPE sample_le = (0xff00 & (samples[i] << 8)) | (0x00ff & (samples[i] >> 8)) ; //Endianess issue
char dataInterim[30];
sprintf(dataInterim,"%f ", sample_le/power); // normalize it.
shorts.append(dataInterim);
}
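The swap in the loop above can be isolated into a small helper. This is a sketch in plain C with hypothetical names (swap_bytes_16, swap_buffer_16). It only shows the swap itself; whether a swap is needed at all depends on the byte order of the data compared with the host, which this sketch does not determine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Exchange the high and low bytes of one 16-bit sample. Working on the
   unsigned representation keeps the shifts free of sign extension. */
static int16_t swap_bytes_16(int16_t sample)
{
    uint16_t u = (uint16_t)sample;
    uint16_t swapped = (uint16_t)((u << 8) | (u >> 8));
    return (int16_t)swapped;
}

/* Swap every sample of a buffer in place. */
static void swap_buffer_16(int16_t *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        buf[i] = swap_bytes_16(buf[i]);
}
```

Swapping twice returns the original value, which makes for an easy sanity check on real data.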
I wrote a loop to encode PCM audio data generated by my app to AAC using Extended Audio File Services. The encoding takes place in a background thread synchronously, and not in real time.
The encoding works flawlessly on iPad 1 and iPhone 3GS/4 for both iOS 4 and 5. However, on dual-core devices (iPhone 4S, iPad 2) the third call to ExtAudioFileWrite crashes the encoding thread with no stack trace and no error code.
Here is the code in question:
The data formats
AudioStreamBasicDescription AUCanonicalASBD(Float64 sampleRate,
UInt32 channel){
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = sampleRate;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagsAudioUnitCanonical;
audioFormat.mChannelsPerFrame = channel;
audioFormat.mBytesPerPacket = sizeof(AudioUnitSampleType);
audioFormat.mBytesPerFrame = sizeof(AudioUnitSampleType);
audioFormat.mFramesPerPacket = 1;
audioFormat.mBitsPerChannel = 8 * sizeof(AudioUnitSampleType);
audioFormat.mReserved = 0;
return audioFormat;
}
AudioStreamBasicDescription MixdownAAC(void){
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 44100.0;
audioFormat.mFormatID = kAudioFormatMPEG4AAC;
audioFormat.mFormatFlags = kMPEG4Object_AAC_Main;
audioFormat.mChannelsPerFrame = 2;
audioFormat.mBytesPerPacket = 0;
audioFormat.mBytesPerFrame = 0;
audioFormat.mFramesPerPacket = 1024;
audioFormat.mBitsPerChannel = 0;
audioFormat.mReserved = 0;
return audioFormat;
}
The render loop
OSStatus err;
ExtAudioFileRef outFile;
NSURL *mixdownURL = [NSURL fileURLWithPath:filePath isDirectory:NO];
// internal data format
AudioStreamBasicDescription localFormat = AUCanonicalASBD(44100.0, 2);
// output file format
AudioStreamBasicDescription mixdownFormat = MixdownAAC();
err = ExtAudioFileCreateWithURL((CFURLRef)mixdownURL,
kAudioFileM4AType,
&mixdownFormat,
NULL,
kAudioFileFlags_EraseFile,
&outFile);
err = ExtAudioFileSetProperty(outFile, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), &localFormat);
// prep
AllRenderData *allData = &allRenderData;
writeBuffer = malloc(sizeof(AudioBufferList) + (2*sizeof(AudioBuffer)));
writeBuffer->mNumberBuffers = 2;
writeBuffer->mBuffers[0].mNumberChannels = 1;
writeBuffer->mBuffers[0].mDataByteSize = bufferBytes;
writeBuffer->mBuffers[0].mData = malloc(bufferBytes);
writeBuffer->mBuffers[1].mNumberChannels = 1;
writeBuffer->mBuffers[1].mDataByteSize = bufferBytes;
writeBuffer->mBuffers[1].mData = malloc(bufferBytes);
memset(writeBuffer->mBuffers[0].mData, 0, bufferBytes);
memset(writeBuffer->mBuffers[1].mData, 0, bufferBytes);
UInt32 framesToGet;
UInt32 frameCount = allData->gLoopStartFrame;
UInt32 startFrame = allData->gLoopStartFrame;
UInt32 lastFrame = allData->gLoopEndFrame;
// write one silent buffer
ExtAudioFileWrite(outFile, bufferFrames, writeBuffer);
while (frameCount < lastFrame){
// how many frames do we need to get
if (lastFrame - frameCount > bufferFrames)
framesToGet = bufferFrames;
else
framesToGet = lastFrame - frameCount;
// get dem frames
err = theBigOlCallback((void*)&allRenderData,
NULL, NULL, 1,
framesToGet, writeBuffer);
// write to output file
ExtAudioFileWrite(outFile, framesToGet, writeBuffer);
frameCount += framesToGet;
}
// write one trailing silent buffer
memset(writeBuffer->mBuffers[0].mData, 0, bufferBytes);
memset(writeBuffer->mBuffers[1].mData, 0, bufferBytes);
processLimiterInPlace8p24(limiter, writeBuffer->mBuffers[0].mData, writeBuffer->mBuffers[1].mData, bufferFrames);
ExtAudioFileWrite(outFile, bufferFrames, writeBuffer);
err = ExtAudioFileDispose(outFile);
The pcm frames are properly created, but ExtAudioFileWrite fails the 2nd/3rd time it is called.
Any ideas? Thank you!
I had a very similar problem where I was attempting to use Extended Audio File Services in order to stream PCM sound into an m4a file on an iPad 2. Everything appeared to work except that every call to ExtAudioFileWrite returned the error code -66567 (kExtAudioFileError_MaxPacketSizeUnknown). The fix I eventually found was to set the "Codec Manufacturer" to software instead of hardware. So place
UInt32 codecManf = kAppleSoftwareAudioCodecManufacturer;
ExtAudioFileSetProperty(FileToWrite, kExtAudioFileProperty_CodecManufacturer, sizeof(UInt32), &codecManf);
just before you set the client data format.
This would lead me to believe that Apple's hardware codecs can only support very specific encoding, but the software codecs can more reliably do what you want. In my case, the software codec translation to m4a takes 50% longer than writing the exact same file to LPCM format.
Does anyone know whether Apple specifies somewhere what their audio codec hardware is capable of? It seems that software engineers are stuck playing the hours-long guessing game of setting the ~20 parameters in the AudioStreamBasicDescription and AudioChannelLayout for the client and for the file to every possible permutation until something works...
I have a problem with the function AudioConverterConvertBuffer. Basically, I want to convert from this format
_streamFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked |0 ;
_streamFormat.mBitsPerChannel = 16;
_streamFormat.mChannelsPerFrame = 2;
_streamFormat.mBytesPerPacket = 4;
_streamFormat.mBytesPerFrame = 4;
_streamFormat.mFramesPerPacket = 1;
_streamFormat.mSampleRate = 44100;
_streamFormat.mReserved = 0;
to this format
_streamFormatOutput.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked|0 ;//| kAudioFormatFlagIsNonInterleaved |0;
_streamFormatOutput.mBitsPerChannel = 16;
_streamFormatOutput.mChannelsPerFrame = 1;
_streamFormatOutput.mBytesPerPacket = 2;
_streamFormatOutput.mBytesPerFrame = 2;
_streamFormatOutput.mFramesPerPacket = 1;
_streamFormatOutput.mSampleRate = 44100;
_streamFormatOutput.mReserved = 0;
What I want to do is extract one audio channel (left or right) from an LPCM buffer in the input format, so that the output format is mono. The conversion logic is as follows.
This sets the channel map for the PCM output file:
SInt32 channelMap[1] = {0};
status = AudioConverterSetProperty(converter, kAudioConverterChannelMap, sizeof(channelMap), channelMap);
and this is to convert the buffer in a while loop
AudioBufferList audioBufferList;
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampBuffer, NULL, &audioBufferList, sizeof(audioBufferList), NULL, NULL, 0, &blockBuffer);
for (int y=0; y<audioBufferList.mNumberBuffers; y++) {
AudioBuffer audioBuffer = audioBufferList.mBuffers[y];
//frames = audioBuffer.mData;
NSLog(@"the number of channel for buffer number %d is %d",y,audioBuffer.mNumberChannels);
NSLog(@"The buffer size is %d",audioBuffer.mDataByteSize);
numBytesIO = audioBuffer.mDataByteSize;
convertedBuf = malloc(sizeof(char)*numBytesIO);
status = AudioConverterConvertBuffer(converter, audioBuffer.mDataByteSize, audioBuffer.mData, &numBytesIO, convertedBuf);
char errchar[10];
NSLog(@"status audio converter convert %d",status);
if (status != 0) {
NSLog(@"Fail conversion");
assert(0);
}
NSLog(@"Bytes converted %d",numBytesIO);
status = AudioFileWriteBytes(mRecordFile, YES, countByteBuf, &numBytesIO, convertedBuf);
NSLog(@"status for writebyte %d, bytes written %d",status,numBytesIO);
free(convertedBuf);
if (numBytesIO != audioBuffer.mDataByteSize) {
NSLog(@"Something wrong in writing");
assert(0);
}
countByteBuf = countByteBuf + numBytesIO;
But I still get the 'insz' error (kAudioConverterErr_InvalidInputSize), so it can't convert. I would appreciate any input.
Thanks in advance.
First, you cannot use AudioConverterConvertBuffer() to convert anything where input and output byte size is different. You need to use AudioConverterFillComplexBuffer(). This includes performing any kind of sample rate conversions, or adding/removing channels.
See Apple's documentation on AudioConverterConvertBuffer(). This was also discussed on Apple's CoreAudio mailing lists, but I'm afraid I cannot find a reference right now.
Second, even if this could be done (which it can't) you are passing the same number of bytes allocated for output as you had for input, despite actually requiring half of the number of bytes (due to reducing number of channels from 2 to 1).
I'm actually working on using AudioConverterConvertBuffer() right now, and the test files are mono while I need to play stereo. I'm currently stuck with the converter performing conversion only of the first chunk of the data. If I manage to get this to work, I'll try to remember to post the code. If I don't post it, please poke me in comments.
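For the specific case in the question (keep one channel of interleaved 16-bit stereo), the size arithmetic from the second point can be seen in a plain C loop. This is a sketch of an alternative that bypasses the converter entirely, not the AudioConverter API; extract_channel_16 is a hypothetical helper and it assumes interleaved, packed 16-bit samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy one channel out of interleaved 16-bit PCM into `mono`, which must
   have room for `frames` samples. Returns the number of bytes written. */
static size_t extract_channel_16(const int16_t *interleaved, size_t frames,
                                 unsigned channels, unsigned channel,
                                 int16_t *mono)
{
    assert(interleaved != NULL && mono != NULL);
    assert(channels > 0 && channel < channels);
    for (size_t f = 0; f < frames; f++)
        mono[f] = interleaved[f * channels + channel];
    return frames * sizeof(int16_t);
}
```

For stereo input the returned byte count is exactly half the input byte count, so a check such as numBytesIO != audioBuffer.mDataByteSize would always trip after a successful 2-to-1 channel conversion.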
I am writing an iPhone app that records and plays audio simultaneously using the I/O audio unit as per Apple's recommendations.
I want to apply some sound effects (reverb, etc) on the recorded audio before playing it back. For these effects to work well, I need the samples to be floating point numbers, rather than integers. It seems this should be possible, by creating an AudioStreamBasicDescription with kAudioFormatFlagIsFloat set on mFormatFlags. This is what my code looks like:
AudioStreamBasicDescription streamDescription;
streamDescription.mSampleRate = 44100.0;
streamDescription.mFormatID = kAudioFormatLinearPCM;
streamDescription.mFormatFlags = kAudioFormatFlagIsFloat;
streamDescription.mBitsPerChannel = 32;
streamDescription.mBytesPerFrame = 4;
streamDescription.mBytesPerPacket = 4;
streamDescription.mChannelsPerFrame = 1;
streamDescription.mFramesPerPacket = 1;
streamDescription.mReserved = 0;
OSStatus status;
status = AudioUnitSetProperty(audioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &streamDescription, sizeof(streamDescription));
if (status != noErr)
fprintf(stderr, "AudioUnitSetProperty (kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input) returned status %ld\n", status);
status = AudioUnitSetProperty(audioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 1, &streamDescription, sizeof(streamDescription));
if (status != noErr)
fprintf(stderr, "AudioUnitSetProperty (kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output) returned status %ld\n", status);
However, when I run this (on an iPhone 3GS running iPhoneOS 3.1.3), I get this:
AudioUnitSetProperty (kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input) returned error -10868
AudioUnitSetProperty (kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output) returned error -10868
(-10868 is the value of kAudioUnitErr_FormatNotSupported)
I didn't find anything of value in Apple's documentation, apart from a recommendation to stick to 16 bit little-endian integers. However, the aurioTouch example project contains at least some support code related to kAudioFormatFlagIsFloat.
So, is my stream description incorrect, or is kAudioFormatFlagIsFloat simply not supported on iPhoneOS?
It's not supported, as far as I know. You can pretty easily convert to floats, though, using AudioConverter. I do this conversion (both ways) in real time to use the Accelerate framework with iOS audio. (Note: this code is copied and pasted from more modular code, so there may be some minor typos.)
First, you'll need the AudioStreamBasicDescription from the input. Say
AudioStreamBasicDescription aBasicDescription = {0};
aBasicDescription.mSampleRate = self.samplerate;
aBasicDescription.mFormatID = kAudioFormatLinearPCM;
aBasicDescription.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
aBasicDescription.mFramesPerPacket = 1;
aBasicDescription.mChannelsPerFrame = 1;
aBasicDescription.mBitsPerChannel = 8 * sizeof(SInt16);
aBasicDescription.mBytesPerPacket = sizeof(SInt16) * aBasicDescription.mFramesPerPacket;
aBasicDescription.mBytesPerFrame = sizeof(SInt16) * aBasicDescription.mChannelsPerFrame;
Then, generate a corresponding AudioStreamBasicDescription for float.
AudioStreamBasicDescription floatDesc = {0};
floatDesc.mFormatID = kAudioFormatLinearPCM;
floatDesc.mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked;
floatDesc.mBitsPerChannel = 8 * sizeof(float);
floatDesc.mFramesPerPacket = 1;
floatDesc.mChannelsPerFrame = 1;
floatDesc.mBytesPerPacket = sizeof(float) * floatDesc.mFramesPerPacket;
floatDesc.mBytesPerFrame = sizeof(float) * floatDesc.mChannelsPerFrame;
floatDesc.mSampleRate = [controller samplerate];
Make some buffers.
UInt32 intSize = inNumberFrames * sizeof(SInt16);
UInt32 floatSize = inNumberFrames * sizeof(float);
float *dataBuffer = (float *)calloc(inNumberFrames, sizeof(float));
Then convert. (ioData is your AudioBufferList containing the int audio)
AudioConverterRef converter;
OSStatus err = noErr;
err = AudioConverterNew(&aBasicDescription, &floatDesc, &converter);
//check for error here in "real" code
err = AudioConverterConvertBuffer(converter, intSize, ioData->mBuffers[0].mData, &floatSize, dataBuffer);
//check for error here in "real" code
//do stuff to dataBuffer, which now contains floats
//convert the floats back by running the conversion the other way
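For reference, the round trip that the converter performs here can be written by hand for mono 16-bit data. This is a sketch of the manual alternative, not what AudioConverter does internally; int16_to_float and float_to_int16 are hypothetical helpers, and the 32768 scale factor plus clamping on the way back is one common convention that I am assuming.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit signed PCM -> float in [-1, 1). */
static void int16_to_float(const int16_t *in, float *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (float)in[i] / 32768.0f;
}

/* float -> 16-bit signed PCM, clamping values that processing pushed
   outside the representable range, then rounding to nearest. */
static void float_to_int16(const float *in, int16_t *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        float scaled = in[i] * 32768.0f;
        if (scaled > 32767.0f)
            scaled = 32767.0f;
        if (scaled < -32768.0f)
            scaled = -32768.0f;
        out[i] = (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
    }
}
```

The clamp matters once effects are applied: a float sample of 1.5 after a gain stage would otherwise overflow when cast back to 16 bits.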
I'm doing something unrelated to AudioUnits but I am using AudioStreamBasicDescription on iOS. I was able to use float samples by specifying:
dstFormat.mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsNonInterleaved | kAudioFormatFlagsNativeEndian | kLinearPCMFormatFlagIsPacked;
The book Learning Core Audio: A Hands-on Guide to Audio Programming for Mac and iOS was helpful for this.
It is supported.
The problem is you must also set kAudioFormatFlagIsNonInterleaved on mFormatFlags. If you don't do this when setting kAudioFormatFlagIsFloat, you will get a format error.
So, you want to do something like this when preparing your AudioStreamBasicDescription:
streamDescription.mFormatFlags = kAudioFormatFlagIsFloat |
kAudioFormatFlagIsNonInterleaved;
As for why iOS requires this, I'm not sure - I only stumbled across it via trial and error.
From the Core Audio docs:
kAudioFormatFlagIsFloat
Set for floating point, clear for integer.
Available in iPhone OS 2.0 and later.
Declared in CoreAudioTypes.h.
I don't know enough about your stream to comment on its [in]correctness.
You can obtain an interleaved float RemoteIO with the following ASBD setup:
// STEREO_CHANNEL = 2, defaultSampleRate = 44100
AudioStreamBasicDescription const audioDescription = {
.mSampleRate = defaultSampleRate,
.mFormatID = kAudioFormatLinearPCM,
.mFormatFlags = kAudioFormatFlagIsFloat,
.mBytesPerPacket = STEREO_CHANNEL * sizeof(float),
.mFramesPerPacket = 1,
.mBytesPerFrame = STEREO_CHANNEL * sizeof(float),
.mChannelsPerFrame = STEREO_CHANNEL,
.mBitsPerChannel = 8 * sizeof(float),
.mReserved = 0
};
This worked for me.
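The interleaved and non-interleaved answers above differ in how the size fields are computed. With kAudioFormatFlagIsNonInterleaved set, the description covers one channel's buffer, so mBytesPerFrame is the size of a single sample; for interleaved data it is the sample size times the channel count. Here is a small sketch of that arithmetic in plain C. PcmLayout and pcm_layout are hypothetical names for illustration only, not Core Audio types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of the size-related ASBD fields for linear PCM. */
typedef struct {
    uint32_t bytes_per_frame;  /* would go into mBytesPerFrame */
    uint32_t bytes_per_packet; /* would go into mBytesPerPacket */
    uint32_t buffer_count;     /* AudioBuffers in the AudioBufferList */
} PcmLayout;

static PcmLayout pcm_layout(uint32_t channels, uint32_t bytes_per_sample,
                            int interleaved)
{
    assert(channels > 0 && bytes_per_sample > 0);
    PcmLayout layout;
    layout.bytes_per_frame =
        interleaved ? channels * bytes_per_sample : bytes_per_sample;
    /* linear PCM uses one frame per packet */
    layout.bytes_per_packet = layout.bytes_per_frame;
    layout.buffer_count = interleaved ? 1u : channels;
    return layout;
}
```

Interleaved stereo float gives 8 bytes per frame in one buffer, matching STEREO_CHANNEL * sizeof(float) above; non-interleaved stereo float gives 4 bytes per frame spread over two buffers.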