I am trying to encode video in h.264 format using FFMpeg in iOS. Actually I am receiving sample buffer from iPhone Camera and then converting it AVFrame and further from AVFrame to H.264 video. But while h.264 encoding if I use :
codec = avcodec_find_encoder(CODEC_ID_MPEG1VIDEO);
Then codec is found but if I use:
codec = avcodec_find_encoder(CODEC_ID_H264);
Then codec is nil means codec not found. Full code is as below:
static void encode(AVFrame *picture)
{
AVCodec *codec;
AVCodecContext *c= NULL;
int i, out_size, size, outbuf_size;
uint8_t *outbuf, *picture_buf;
printf("Video encoding\n");
/* find the mpeg1 video encoder */
//avcodec_init() ; // Also tried this but giving warning and not worked
avcodec_register_all();
codec = avcodec_find_encoder(CODEC_ID_H264);
if (!codec) {
fprintf(stderr, "codec not found\n");
exit(1);
}
c= avcodec_alloc_context();
picture= avcodec_alloc_frame();
/* put sample parameters */
c->bit_rate = 400000;
/* resolution must be a multiple of two */
c->width = 352;
c->height = 288;
/* frames per second */
c->time_base= (AVRational){1,25};
c->gop_size = 10; /* emit one intra frame every ten frames */
c->max_b_frames=1;
c->pix_fmt = PIX_FMT_YUV420P;
/* open it */
if (avcodec_open(c, codec) < 0) {
fprintf(stderr, "could not open codec\n");
exit(1);
}
/* alloc image and output buffer */
outbuf_size = 100000;
outbuf = malloc(outbuf_size);
size = c->width * c->height;
picture_buf = malloc((size * 3) / 2); /* size for YUV 420 */
out_size = avcodec_encode_video(c, outbuf, outbuf_size, picture);
NSLog(#"NSdada===%#",[NSData dataWithBytes:(const void *)outbuf length:out_size]);
free(picture_buf);
free(outbuf);
avcodec_close(c);
av_free(c);
av_free(picture);
printf("\n");
}
Your ffmpeg probably was compiled without libx264 support. So it really can't find encoder for H.264 as there are no other H.264 encoders in ffmpeg afaik.
For one thing, ffmpeg and x264 are licensed under the LGPL and GPL, so you would need to address license issues before you could include that code in your iPhone app. Under iOS, you would want to use the hardware h.264 encoder already supplied with the device. The AVAsset API is used to decode from and encode to h.264 in an iOS app.
Related
I found a library that helps to convert WAV file to Flac:
https://github.com/jhurt/wav_to_flac
Also succeed to compile Flac to the platform and it works fine.
I've been using this library after capturing the audio on wav format to convert it to Flac and then send to my server.
Problem is that the audio file could be long and then precious time is wasted.
The thing is that I want to encode the audio as Flac format and send that to server on the same time when capturing and not after capturing stops, So, I need a help here on how to do that (encode Flac directly from the audio so I could send it to my server)...
In my library called libsprec, you can see an example of both recording a WAV file (here) and converting it to FLAC (here). (Credits: the audio recording part heavily relies on Erica Sadun's work, for the record.)
Now if you want to do this in one step, you can do that as well. The trick is that you have to do the initialization of both the Audio Queues and the FLAC library first, then "interleave" the calls to them, i. e. when you get some audio data in the callback function of the Audio Queue, you immediately FLAC-encode it.
I don't think, however, that this would be much faster than recording and encoding in two separate steps. The heavy part of the processing is the recording and the maths in the encoding itself, so re-reading the same buffer (or I dare you, even a file!) won't add much to the processing time.
That said, you may want to do something like this:
// First, we initialize the Audio Queue
AudioStreamBasicDescription desc;
desc.mFormatID = kAudioFormatLinearPCM;
desc.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
desc.mReserved = 0;
desc.mSampleRate = SAMPLE_RATE;
desc.mChannelsPerFrame = 2; // stereo (?)
desc.mBitsPerChannel = BITS_PER_SAMPLE;
desc.mBytesPerFrame = BYTES_PER_FRAME;
desc.mFramesPerPacket = 1;
desc.mBytesPerPacket = desc.mFramesPerPacket * desc.mBytesPerFrame;
AudioQueueRef queue;
status = AudioQueueNewInput(
&desc,
audio_queue_callback, // our custom callback function
NULL,
NULL,
NULL,
0,
&queue
);
if (status)
return status;
AudioQueueBufferRef buffers[NUM_BUFFERS];
for (i = 0; i < NUM_BUFFERS; i++) {
status = AudioQueueAllocateBuffer(
queue,
0x5000, // max buffer size
&buffers[i]
);
if (status)
return status;
status = AudioQueueEnqueueBuffer(
queue,
buffers[i],
0,
NULL
);
if (status)
return status;
}
// Then, we initialize the FLAC encoder:
FLAC__StreamEncoder *encoder;
FLAC__StreamEncoderInitStatus status;
FILE *infile;
const char *dataloc;
uint32_t rate; /* sample rate */
uint32_t total; /* number of samples in file */
uint32_t channels; /* number of channels */
uint32_t bps; /* bits per sample */
uint32_t dataoff; /* offset of PCM data within the file */
int err;
/*
* BUFFSIZE samples * 2 bytes per sample * 2 channels
*/
FLAC__byte buffer[BUFSIZE * 2 * 2];
/*
* BUFFSIZE samples * 2 channels
*/
FLAC__int32 pcm[BUFSIZE * 2];
/*
* Create and initialize the FLAC encoder
*/
encoder = FLAC__stream_encoder_new();
if (!encoder)
return -1;
FLAC__stream_encoder_set_verify(encoder, true);
FLAC__stream_encoder_set_compression_level(encoder, 5);
FLAC__stream_encoder_set_channels(encoder, NUM_CHANNELS); // 2 for stereo
FLAC__stream_encoder_set_bits_per_sample(encoder, BITS_PER_SAMPLE); // 32 for stereo 16 bit per channel
FLAC__stream_encoder_set_sample_rate(encoder, SAMPLE_RATE);
status = FLAC__stream_encoder_init_stream(encoder, flac_callback, NULL, NULL, NULL, NULL);
if (status != FLAC__STREAM_ENCODER_INIT_STATUS_OK)
return -1;
// We now start the Audio Queue...
status = AudioQueueStart(queue, NULL);
// And when it's finished, we clean up the FLAC encoder...
FLAC__stream_encoder_finish(encoder);
FLAC__stream_encoder_delete(encoder);
// and the audio queue and its belongings too
AudioQueueFlush(queue);
AudioQueueStop(queue, false);
for (i = 0; i < NUM_BUFFERS; i++)
AudioQueueFreeBuffer(queue, buffers[i]);
AudioQueueDispose(queue, true);
// In the audio queue callback function, we do the encoding:
void audio_queue_callback(
void *data,
AudioQueueRef inAQ,
AudioQueueBufferRef buffer,
const AudioTimeStamp *start_time,
UInt32 num_packets,
const AudioStreamPacketDescription *desc
)
{
unsigned char *buf = buffer->mAudioData;
for (size_t i = 0; i < num_packets * channels; i++) {
uint16_t msb = *(uint8_t *)(buf + i * 2 + 1);
uint16_t usample = (msb << 8) | lsb;
union {
uint16_t usample;
int16_t ssample;
} u;
u.usample = usample;
pcm[i] = u.ssample;
}
FLAC__bool succ = FLAC__stream_encoder_process_interleaved(encoder, pcm, num_packets);
if (!succ)
// handle_error();
}
// Finally, in the FLAC stream encoder callback:
FLAC__StreamEncoderWriteStatus flac_callback(
const FLAC__StreamEncoder *encoder,
const FLAC__byte buffer[],
size_t bytes,
unsigned samples,
unsigned current_frame,
void *client_data
)
{
// Here process `buffer' and stuff,
// then:
return FLAC__STREAM_ENCODER_SEEK_STATUS_OK;
}
You are welcome.
Your question is not very specific, but you need to use Audio Recording Services, which will let you get access to the audio data in chunks, and then move the data you get from there into the streaming interface of the FLAC encoder. You can not use the WAV to FLAC program you linked to, you have to tap into the FLAC library yourself. API docs here.
Example on how to use a callback here.
can't you record your audio in wav using audio queue services and process output packets with your lib ?
edit from apple dev doc :
"Applications writing AIFF and WAV files must either update the data header’s size field at the end of recording—which can result in an unusable file if recording is interrupted before the header is finalized—or they must update the size field after recording each packet of data, which is inefficient."
apparently it seems quite hard to encode a wav file on the fly
I'm looking at this example: https://developer.apple.com/library/ios/#samplecode/SpeakHere/Introduction/Intro.html
I modified it ( the AQRecorder.mm)to record mp3 instead of caf file. I changed from kAudioFileCAFType to kAudioFileMP3Type but it does not create the file.
The code became
void AQRecorder::SetupAudioFormat(UInt32 inFormatID)
{
memset(&mRecordFormat, 0, sizeof(mRecordFormat));
UInt32 size = sizeof(mRecordFormat.mSampleRate);
XThrowIfError(AudioSessionGetProperty( kAudioSessionProperty_CurrentHardwareSampleRate,
&size,
&mRecordFormat.mSampleRate), "couldn't get hardware sample rate");
size = sizeof(mRecordFormat.mChannelsPerFrame);
XThrowIfError(AudioSessionGetProperty( kAudioSessionProperty_CurrentHardwareInputNumberChannels,
&size,
&mRecordFormat.mChannelsPerFrame), "couldn't get input channel count");
mRecordFormat.mFormatID = inFormatID;
if (inFormatID == kAudioFormatLinearPCM)
{
// if we want pcm, default to signed 16-bit little-endian
mRecordFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
mRecordFormat.mBitsPerChannel = 16;
mRecordFormat.mBytesPerPacket = mRecordFormat.mBytesPerFrame = (mRecordFormat.mBitsPerChannel / 8) * mRecordFormat.mChannelsPerFrame;
mRecordFormat.mFramesPerPacket = 1;
}
}
void AQRecorder::StartRecord(CFStringRef inRecordFile)
{
int i, bufferByteSize;
UInt32 size;
CFURLRef url;
try {
mFileName = CFStringCreateCopy(kCFAllocatorDefault, inRecordFile);
// specify the recording format
SetupAudioFormat(kAudioFormatLinearPCM);
// create the queue
XThrowIfError(AudioQueueNewInput(
&mRecordFormat,
MyInputBufferHandler,
this /* userData */,
NULL /* run loop */, NULL /* run loop mode */,
0 /* flags */, &mQueue), "AudioQueueNewInput failed");
// get the record format back from the queue's audio converter --
// the file may require a more specific stream description than was necessary to create the encoder.
mRecordPacket = 0;
size = sizeof(mRecordFormat);
XThrowIfError(AudioQueueGetProperty(mQueue, kAudioQueueProperty_StreamDescription,
&mRecordFormat, &size), "couldn't get queue's format");
NSString *recordFile = [NSTemporaryDirectory() stringByAppendingPathComponent: (NSString*)inRecordFile];
NSLog(recordFile);
url = CFURLCreateWithString(kCFAllocatorDefault, (CFStringRef)recordFile, NULL);
// create the audio file kAudioFileCAFType
XThrowIfError(AudioFileCreateWithURL(url, kAudioFileMP3Type, &mRecordFormat, kAudioFileFlags_EraseFile,
&mRecordFile), "AudioFileCreateWithURL failed");
CFRelease(url);
// copy the cookie first to give the file object as much info as we can about the data going in
// not necessary for pcm, but required for some compressed audio
CopyEncoderCookieToFile();
// allocate and enqueue buffers
bufferByteSize = ComputeRecordBufferSize(&mRecordFormat, kBufferDurationSeconds); // enough bytes for half a second
for (i = 0; i < kNumberRecordBuffers; ++i) {
XThrowIfError(AudioQueueAllocateBuffer(mQueue, bufferByteSize, &mBuffers[i]),
"AudioQueueAllocateBuffer failed");
XThrowIfError(AudioQueueEnqueueBuffer(mQueue, mBuffers[i], 0, NULL),
"AudioQueueEnqueueBuffer failed");
}
// start the queue
mIsRunning = true;
XThrowIfError(AudioQueueStart(mQueue, NULL), "AudioQueueStart failed");
}
catch (CAXException &e) {
char buf[256];
fprintf(stderr, "Error: %s (%s)\n", e.mOperation, e.FormatError(buf));
}
catch (...) {
fprintf(stderr, "An unknown error occurred\n");
}
}
Am I missing any settings, or what's wrong with my code? , mp3 be supported from apple
https://developer.apple.com/library/mac/#documentation/MusicAudio/Reference/AudioFileConvertRef/Reference/reference.html
iOS devices don't support recording to the MP3 encoding format. Actually, I don't think any of the iOS devices do. You have to choose an alternate format. Core Audio can read, but not write, MP3 files.
You can use Lame library for encoding caf to mp3 file format. Check this sample iOSMp3Recorder
I successfully compiled libavcodec with speex enabled.
I modified example from FFMPEG docs to encode the sample audio into Speex.
But the result file cannot be played with VLC Player(which has Speex decoder).
Any tips?
static void audio_encode_example(const char *filename)
{
AVCodec *codec;
AVCodecContext *c= NULL;
int frame_size, i, j, out_size, outbuf_size;
FILE *f;
short *samples;
float t, tincr;
uint8_t *outbuf;
printf("Audio encoding\n");
/* find the MP2 encoder */
codec = avcodec_find_encoder(CODEC_ID_SPEEX);
if (!codec) {
fprintf(stderr, "codec not found\n");
exit(1);
}
c= avcodec_alloc_context();
/* put sample parameters */
c->bit_rate = 64000;
c->sample_rate = 32000;
c->channels = 2;
c->sample_fmt=AV_SAMPLE_FMT_S16;
/* open it */
if (avcodec_open(c, codec) < 0) {
fprintf(stderr, "could not open codec\n");
exit(1);
}
/* the codec gives us the frame size, in samples */
frame_size = c->frame_size;
printf("frame size %d\n",frame_size);
samples =(short*) malloc(frame_size * 2 * c->channels);
outbuf_size = 10000;
outbuf =( uint8_t*) malloc(outbuf_size);
f = fopen(filename, "wb");
if (!f) {
fprintf(stderr, "could not open %s\n", filename);
exit(1);
}
/* encode a single tone sound */
t = 0;
tincr = 2 * M_PI * 440.0 / c->sample_rate;
for(i=0;i<200;i++) {
for(j=0;j<frame_size;j++) {
samples[2*j] = (int)(sin(t) * 10000);
samples[2*j+1] = samples[2*j];
t += tincr;
}
/* encode the samples */
out_size = avcodec_encode_audio(c, outbuf, outbuf_size, samples);
fwrite(outbuf, 1, out_size, f);
}
fclose(f);
free(outbuf);
free(samples);
avcodec_close(c);
av_free(c);
}
int main(int argc, char **argv)
{
avcodec_register_all();
audio_encode_example(argv[1]);
return 0;
}
Does Speex (I don't know it) by chance require a container format into which these frames are placed, with some kind of header? You're just taking the output of an encoder and dumping into a file without going through any formatting (libavformat).
Try encoding the same data into Speex using the ffmpeg command line utility and see if the resulting file plays.
I'm looking at some info at www.speex.org and it seems that speex data is put into .ogg files. The player you are using might not recognize raw Speex data, but only if it is wrapped in .ogg.
Though not a 100% definite answer, I hope this is of some help!
I'm writing a CoreAudio backend for an audio library called XAL. Input buffers can be of various sample rates. I'm using a single audio unit for output. Idea is to convert the buffers and mix them prior to sending them to the audio unit.
Everything works as long as the input buffer has the same properties (sample rate, channel count, etc) as the output audio unit. Hence, the mixing part works.
However, I'm stuck with sample rate and channel count conversion. From what I figured out, this is easiest to do with Audio Converter Services API. I've managed to construct a converter; the idea is that the output format is the same as the output unit format, but possibly adjusted for purposes of the converter.
Audio converter is successfully constructed, but upon calling AudioConverterFillComplexBuffer(), I get output status error -50.
I'd love if I could get another set of eyeballs on this code. Problem is probably somewhere below AudioConverterNew(). Variable stream contains incoming (and outgoing) buffer data, and streamSize contains byte-size of incoming (and outgoing) buffer data.
What did I do wrong?
void CoreAudio_AudioManager::_convertStream(Buffer* buffer, unsigned char** stream, int *streamSize)
{
if (buffer->getBitsPerSample() != unitDescription.mBitsPerChannel ||
buffer->getChannels() != unitDescription.mChannelsPerFrame ||
buffer->getSamplingRate() != unitDescription.mSampleRate)
{
printf("INPUT STREAM SIZE: %d\n", *streamSize);
// describe the input format's description
AudioStreamBasicDescription inputDescription;
memset(&inputDescription, 0, sizeof(inputDescription));
inputDescription.mFormatID = kAudioFormatLinearPCM;
inputDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
inputDescription.mChannelsPerFrame = buffer->getChannels();
inputDescription.mSampleRate = buffer->getSamplingRate();
inputDescription.mBitsPerChannel = buffer->getBitsPerSample();
inputDescription.mBytesPerFrame = (inputDescription.mBitsPerChannel * inputDescription.mChannelsPerFrame) / 8;
inputDescription.mFramesPerPacket = 1; //*streamSize / inputDescription.mBytesPerFrame;
inputDescription.mBytesPerPacket = inputDescription.mBytesPerFrame * inputDescription.mFramesPerPacket;
printf("INPUT : %lu bytes per packet for sample rate %g, channels %d\n", inputDescription.mBytesPerPacket, inputDescription.mSampleRate, inputDescription.mChannelsPerFrame);
// copy conversion output format's description from the
// output audio unit's description.
// then adjust framesPerPacket to match the input we'll be passing.
// framecount of our input stream is based on the input bytecount.
// output stream will have same number of frames, but different
// number of bytes.
AudioStreamBasicDescription outputDescription = unitDescription;
outputDescription.mFramesPerPacket = 1; //inputDescription.mFramesPerPacket;
outputDescription.mBytesPerPacket = outputDescription.mBytesPerFrame * outputDescription.mFramesPerPacket;
printf("OUTPUT : %lu bytes per packet for sample rate %g, channels %d\n", outputDescription.mBytesPerPacket, outputDescription.mSampleRate, outputDescription.mChannelsPerFrame);
// create an audio converter
AudioConverterRef audioConverter;
OSStatus acCreationResult = AudioConverterNew(&inputDescription, &outputDescription, &audioConverter);
printf("Created audio converter %p (status: %d)\n", audioConverter, acCreationResult);
if(!audioConverter)
{
// bail out
free(*stream);
*streamSize = 0;
*stream = (unsigned char*)malloc(0);
return;
}
// calculate number of bytes required for output of input stream.
// allocate buffer of adequate size.
UInt32 outputBytes = outputDescription.mBytesPerPacket * (*streamSize / inputDescription.mBytesPerFrame); // outputDescription.mFramesPerPacket * outputDescription.mBytesPerFrame;
unsigned char *outputBuffer = (unsigned char*)malloc(outputBytes);
memset(outputBuffer, 0, outputBytes);
printf("OUTPUT BYTES : %d\n", outputBytes);
// describe input data we'll pass into converter
AudioBuffer inputBuffer;
inputBuffer.mNumberChannels = inputDescription.mChannelsPerFrame;
inputBuffer.mDataByteSize = *streamSize;
inputBuffer.mData = *stream;
// describe output data buffers into which we can receive data.
AudioBufferList outputBufferList;
outputBufferList.mNumberBuffers = 1;
outputBufferList.mBuffers[0].mNumberChannels = outputDescription.mChannelsPerFrame;
outputBufferList.mBuffers[0].mDataByteSize = outputBytes;
outputBufferList.mBuffers[0].mData = outputBuffer;
// set output data packet size
UInt32 outputDataPacketSize = outputDescription.mBytesPerPacket;
// convert
OSStatus result = AudioConverterFillComplexBuffer(audioConverter, /* AudioConverterRef inAudioConverter */
CoreAudio_AudioManager::_converterComplexInputDataProc, /* AudioConverterComplexInputDataProc inInputDataProc */
&inputBuffer, /* void *inInputDataProcUserData */
&outputDataPacketSize, /* UInt32 *ioOutputDataPacketSize */
&outputBufferList, /* AudioBufferList *outOutputData */
NULL /* AudioStreamPacketDescription *outPacketDescription */
);
printf("Result: %d wheee\n", result);
// change "stream" to describe our output buffer.
// even if error occured, we'd rather have silence than unconverted audio.
free(*stream);
*stream = outputBuffer;
*streamSize = outputBytes;
// dispose of the audio converter
AudioConverterDispose(audioConverter);
}
}
OSStatus CoreAudio_AudioManager::_converterComplexInputDataProc(AudioConverterRef inAudioConverter,
UInt32* ioNumberDataPackets,
AudioBufferList* ioData,
AudioStreamPacketDescription** ioDataPacketDescription,
void* inUserData)
{
printf("Converter\n");
if(*ioNumberDataPackets != 1)
{
xal::log("_converterComplexInputDataProc cannot provide input data; invalid number of packets requested");
*ioNumberDataPackets = 0;
ioData->mNumberBuffers = 0;
return -50;
}
*ioNumberDataPackets = 1;
ioData->mNumberBuffers = 1;
ioData->mBuffers[0] = *(AudioBuffer*)inUserData;
*ioDataPacketDescription = NULL;
return 0;
}
Working code for Core Audio sample rate conversion and channel count conversion, using Audio Converter Services (now available as a part of the BSD-licensed XAL audio library):
void CoreAudio_AudioManager::_convertStream(Buffer* buffer, unsigned char** stream, int *streamSize)
{
if (buffer->getBitsPerSample() != unitDescription.mBitsPerChannel ||
buffer->getChannels() != unitDescription.mChannelsPerFrame ||
buffer->getSamplingRate() != unitDescription.mSampleRate)
{
// describe the input format's description
AudioStreamBasicDescription inputDescription;
memset(&inputDescription, 0, sizeof(inputDescription));
inputDescription.mFormatID = kAudioFormatLinearPCM;
inputDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
inputDescription.mChannelsPerFrame = buffer->getChannels();
inputDescription.mSampleRate = buffer->getSamplingRate();
inputDescription.mBitsPerChannel = buffer->getBitsPerSample();
inputDescription.mBytesPerFrame = (inputDescription.mBitsPerChannel * inputDescription.mChannelsPerFrame) / 8;
inputDescription.mFramesPerPacket = 1; //*streamSize / inputDescription.mBytesPerFrame;
inputDescription.mBytesPerPacket = inputDescription.mBytesPerFrame * inputDescription.mFramesPerPacket;
// copy conversion output format's description from the
// output audio unit's description.
// then adjust framesPerPacket to match the input we'll be passing.
// framecount of our input stream is based on the input bytecount.
// output stream will have same number of frames, but different
// number of bytes.
AudioStreamBasicDescription outputDescription = unitDescription;
outputDescription.mFramesPerPacket = 1; //inputDescription.mFramesPerPacket;
outputDescription.mBytesPerPacket = outputDescription.mBytesPerFrame * outputDescription.mFramesPerPacket;
// create an audio converter
AudioConverterRef audioConverter;
OSStatus acCreationResult = AudioConverterNew(&inputDescription, &outputDescription, &audioConverter);
if(!audioConverter)
{
// bail out
free(*stream);
*streamSize = 0;
*stream = (unsigned char*)malloc(0);
return;
}
// calculate number of bytes required for output of input stream.
// allocate buffer of adequate size.
UInt32 outputBytes = outputDescription.mBytesPerPacket * (*streamSize / inputDescription.mBytesPerPacket); // outputDescription.mFramesPerPacket * outputDescription.mBytesPerFrame;
unsigned char *outputBuffer = (unsigned char*)malloc(outputBytes);
memset(outputBuffer, 0, outputBytes);
// describe input data we'll pass into converter
AudioBuffer inputBuffer;
inputBuffer.mNumberChannels = inputDescription.mChannelsPerFrame;
inputBuffer.mDataByteSize = *streamSize;
inputBuffer.mData = *stream;
// describe output data buffers into which we can receive data.
AudioBufferList outputBufferList;
outputBufferList.mNumberBuffers = 1;
outputBufferList.mBuffers[0].mNumberChannels = outputDescription.mChannelsPerFrame;
outputBufferList.mBuffers[0].mDataByteSize = outputBytes;
outputBufferList.mBuffers[0].mData = outputBuffer;
// set output data packet size
UInt32 outputDataPacketSize = outputBytes / outputDescription.mBytesPerPacket;
// fill class members with data that we'll pass into
// the InputDataProc
_converter_currentBuffer = &inputBuffer;
_converter_currentInputDescription = inputDescription;
// convert
OSStatus result = AudioConverterFillComplexBuffer(audioConverter, /* AudioConverterRef inAudioConverter */
CoreAudio_AudioManager::_converterComplexInputDataProc, /* AudioConverterComplexInputDataProc inInputDataProc */
this, /* void *inInputDataProcUserData */
&outputDataPacketSize, /* UInt32 *ioOutputDataPacketSize */
&outputBufferList, /* AudioBufferList *outOutputData */
NULL /* AudioStreamPacketDescription *outPacketDescription */
);
// change "stream" to describe our output buffer.
// even if error occured, we'd rather have silence than unconverted audio.
free(*stream);
*stream = outputBuffer;
*streamSize = outputBytes;
// dispose of the audio converter
AudioConverterDispose(audioConverter);
}
}
OSStatus CoreAudio_AudioManager::_converterComplexInputDataProc(AudioConverterRef inAudioConverter,
UInt32* ioNumberDataPackets,
AudioBufferList* ioData,
AudioStreamPacketDescription** ioDataPacketDescription,
void* inUserData)
{
if(ioDataPacketDescription)
{
xal::log("_converterComplexInputDataProc cannot provide input data; it doesn't know how to provide packet descriptions");
*ioDataPacketDescription = NULL;
*ioNumberDataPackets = 0;
ioData->mNumberBuffers = 0;
return 501;
}
CoreAudio_AudioManager *self = (CoreAudio_AudioManager*)inUserData;
ioData->mNumberBuffers = 1;
ioData->mBuffers[0] = *(self->_converter_currentBuffer);
*ioNumberDataPackets = ioData->mBuffers[0].mDataByteSize / self->_converter_currentInputDescription.mBytesPerPacket;
return 0;
}
In the header, as part of the CoreAudio_AudioManager class, here are relevant instance variables:
AudioStreamBasicDescription unitDescription;
AudioBuffer *_converter_currentBuffer;
AudioStreamBasicDescription _converter_currentInputDescription;
A few months later, I'm looking at this and I've realized that I didn't document the changes.
If you are interested in what the changes were:
look at the callback function CoreAudio_AudioManager::_converterComplexInputDataProc
one has to properly specify the number of output packets into ioNumberDataPackets
this has required introduction of new instance variables to hold both the buffer (the previous inUserData) and the input description (used to calculate the number of packets to be fed into Core Audio's converter)
this calculation of "output" packets (those fed into the converter) is done based on amount of data that our callback received, and the number of bytes per packet that the input format contains
Hopefully this edit will help a future reader (myself included)!
I'm writing an iPhone app which should record the users voice, and feed the audio data into a library for modifications such as changing tempo and pitch. I started off with the SpeakHere example code from Apple:
http://developer.apple.com/library/ios/#samplecode/SpeakHere/Introduction/Intro.html
That project lays the groundwork for recording the user's voice and playing it back. It works well.
Now I'm diving into the code and I need to figure out how to feed the audio data into the SoundTouch library (http://www.surina.net/soundtouch/) to change the pitch. I became familiar with the Audio Queue framework while going through the code, and I found the place where I receive the audio data from the recording.
Essentially, you call AudioQueueNewInput to create a new input queue. You pass a callback function which is called every time a chunk of audio data is available. It is within this callback that I need to pass the chunks of data into SoundTouch.
I have it all setup, but the noise I play back from the SoundTouch library is very staticky (it barely resembles the original). If I don't pass it through SoundTouch and play the original audio it works fine.
Basically, I'm missing something about what the actual data I'm getting represents. I was assuming that I am getting a stream of shorts which are samples, 1 sample for each channel. That's how SoundTouch is expecting it, so it must not be right somehow.
Here is the code which sets up the audio queue so you can see how it is configured.
void AQRecorder::SetupAudioFormat(UInt32 inFormatID)
{
memset(&mRecordFormat, 0, sizeof(mRecordFormat));
UInt32 size = sizeof(mRecordFormat.mSampleRate);
XThrowIfError(AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareSampleRate,
&size,
&mRecordFormat.mSampleRate), "couldn't get hardware sample rate");
size = sizeof(mRecordFormat.mChannelsPerFrame);
XThrowIfError(AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareInputNumberChannels,
&size,
&mRecordFormat.mChannelsPerFrame), "couldn't get input channel count");
mRecordFormat.mFormatID = inFormatID;
if (inFormatID == kAudioFormatLinearPCM)
{
// if we want pcm, default to signed 16-bit little-endian
mRecordFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
mRecordFormat.mBitsPerChannel = 16;
mRecordFormat.mBytesPerPacket = mRecordFormat.mBytesPerFrame = (mRecordFormat.mBitsPerChannel / 8) * mRecordFormat.mChannelsPerFrame;
mRecordFormat.mFramesPerPacket = 1;
}
}
And here's part of the code which actually sets it up:
SetupAudioFormat(kAudioFormatLinearPCM);
// create the queue
XThrowIfError(AudioQueueNewInput(
&mRecordFormat,
MyInputBufferHandler,
this /* userData */,
NULL /* run loop */, NULL /* run loop mode */,
0 /* flags */, &mQueue), "AudioQueueNewInput failed");
And finally, here is the callback which handles new audio data:
void AQRecorder::MyInputBufferHandler(void *inUserData,
AudioQueueRef inAQ,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp *inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription *inPacketDesc) {
AQRecorder *aqr = (AQRecorder *)inUserData;
try {
if (inNumPackets > 0) {
CAStreamBasicDescription queueFormat = aqr->DataFormat();
SoundTouch *soundTouch = aqr->getSoundTouch();
soundTouch->putSamples((const SAMPLETYPE *)inBuffer->mAudioData,
inBuffer->mAudioDataByteSize / 2 / queueFormat.NumberChannels());
SAMPLETYPE *samples = (SAMPLETYPE *)malloc(sizeof(SAMPLETYPE) * 10000 * queueFormat.NumberChannels());
UInt32 numSamples;
while((numSamples = soundTouch->receiveSamples((SAMPLETYPE *)samples, 10000))) {
// write packets to file
XThrowIfError(AudioFileWritePackets(aqr->mRecordFile,
FALSE,
numSamples * 2 * queueFormat.NumberChannels(),
NULL,
aqr->mRecordPacket,
&numSamples,
samples),
"AudioFileWritePackets failed");
aqr->mRecordPacket += numSamples;
}
free(samples);
}
// if we're not stopping, re-enqueue the buffe so that it gets filled again
if (aqr->IsRunning())
XThrowIfError(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL), "AudioQueueEnqueueBuffer failed");
} catch (CAXException e) {
char buf[256];
fprintf(stderr, "Error: %s (%s)\n", e.mOperation, e.FormatError(buf));
}
}
You can see that I'm passing the data in inBuffer->mAudioData to SoundTouch. In my callback, what exactly are the bytes representing, i.e. how do I extract samples from mAudioData?
The default endianess of the Audio Queue may be the opposite of what you expect. You may have to swap upper and lower bytes of each 16-bit audio samples after record and before play.
sample_le = (0xff00 & (sample_be << 8)) | (0x00ff & (sample_be >> 8)) ;
You have to check that the endianess, the signedness, etc. of what you are getting match what the library expects. Use mFormatFlags of AudioStreamBasicDescription to determine the source format. Then you might have to convert the samples (e.g. newSample = sample + 0x8000)