I'm trying to get an audio queue working in an iPhone app, and whenever AudioQueueStart is called it returns the "fmt?" result code (kAudioFormatUnsupportedDataFormatError). In the code below I'm setting the format to kAudioFormatLinearPCM, which is surely supported. What am I doing wrong?
data.mDataFormat.mSampleRate = 44100;
data.mDataFormat.mFormatID = kAudioFormatLinearPCM;
data.mDataFormat.mFormatFlags = 0;
data.mDataFormat.mBytesPerPacket = 4;
data.mDataFormat.mFramesPerPacket = 1;
data.mDataFormat.mBytesPerFrame = 4;
data.mDataFormat.mChannelsPerFrame = 2;
data.mDataFormat.mBitsPerChannel = 16;
OSStatus status;
status = AudioQueueNewOutput(&data.mDataFormat, audioCallback, &data, CFRunLoopGetCurrent (), kCFRunLoopCommonModes, 0, &data.mQueue);
for (int i = 0; i < NUMBUFFERS; ++i)
{
status = AudioQueueAllocateBuffer (data.mQueue, BUFFERSIZE, &data.mBuffers[i] );
audioCallback (&data, data.mQueue, data.mBuffers[i]);
}
Float32 gain = 1.0;
status = AudioQueueSetParameter (data.mQueue, kAudioQueueParam_Volume, gain);
status = AudioQueueStart(data.mQueue, NULL);
data is of type audioData, which is defined like this:
typedef struct _audioData {
AudioQueueRef mQueue;
AudioQueueBufferRef mBuffers[NUMBUFFERS];
AudioStreamBasicDescription mDataFormat;
} audioData;
thanks
The cause of your error is actually AudioQueueNewOutput rather than AudioQueueStart. See this related question: "audio streaming services failing to recognize file type".
It turns out I needed to set some flags. It works with:
data.mDataFormat.mFormatFlags = kLinearPCMFormatFlagIsBigEndian | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
Edit: actually, don't use kLinearPCMFormatFlagIsBigEndian; it seems that with this format the samples should be little-endian.
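For anyone filling in these fields by hand, here is a minimal sketch of how the size fields relate to each other for packed, interleaved integer PCM. It does not use the Core Audio headers: PcmFormat, the PCM_FLAG_* values and both functions are hypothetical stand-ins for AudioStreamBasicDescription and the kLinearPCMFormatFlag* constants, and the consistency check only encodes what worked in the answer above (signed integer plus packed), not everything the audio queue accepts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the AudioStreamBasicDescription fields used in the question. */
typedef struct {
    double   sampleRate;
    uint32_t formatFlags;
    uint32_t bytesPerPacket;
    uint32_t framesPerPacket;
    uint32_t bytesPerFrame;
    uint32_t channelsPerFrame;
    uint32_t bitsPerChannel;
} PcmFormat;

/* Illustrative flag bits; real code should use the kLinearPCMFormatFlag* names. */
enum {
    PCM_FLAG_SIGNED_INTEGER = 1u << 2,
    PCM_FLAG_PACKED         = 1u << 3
};

/* Build a packed, interleaved, signed-integer PCM description. */
PcmFormat make_packed_int_pcm(double sampleRate, uint32_t channels, uint32_t bits)
{
    assert(channels > 0 && bits % 8 == 0);
    PcmFormat f;
    f.sampleRate       = sampleRate;
    f.formatFlags      = PCM_FLAG_SIGNED_INTEGER | PCM_FLAG_PACKED;
    f.channelsPerFrame = channels;
    f.bitsPerChannel   = bits;
    f.bytesPerFrame    = channels * (bits / 8);  /* one sample per channel */
    f.framesPerPacket  = 1;                      /* uncompressed audio    */
    f.bytesPerPacket   = f.bytesPerFrame * f.framesPerPacket;
    return f;
}

/* True when the flags and size fields agree with each other. */
bool pcm_format_is_consistent(const PcmFormat *f)
{
    uint32_t wanted = PCM_FLAG_SIGNED_INTEGER | PCM_FLAG_PACKED;
    return (f->formatFlags & wanted) == wanted
        && f->bitsPerChannel % 8 == 0
        && f->bytesPerFrame == f->channelsPerFrame * (f->bitsPerChannel / 8)
        && f->bytesPerPacket == f->bytesPerFrame * f->framesPerPacket;
}
```

With 44100 Hz, 2 channels and 16 bits this gives 4 bytes per frame and 4 bytes per packet, matching the values in the question; the only thing the question's description was missing is the flags.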
I'm trying to create a game in Unity where each frame is rendered into a texture and then put together into a video using FFmpeg. The output created by FFmpeg should eventually be sent over the network to a client UI. However, I'm struggling mainly with the part where a frame is caught, and passed to an unsafe method as a byte array where it should be processed further by FFmpeg. The wrapper I'm using is FFmpeg.AutoGen.
The render to texture method:
private IEnumerator CaptureFrame()
{
yield return new WaitForEndOfFrame();
RenderTexture.active = rt;
frame.ReadPixels(rect, 0, 0);
frame.Apply();
bytes = frame.GetRawTextureData();
EncodeAndWrite(bytes, bytes.Length);
}
The unsafe encoding method so far:
private unsafe void EncodeAndWrite(byte[] bytes, int size)
{
GCHandle pinned = GCHandle.Alloc(bytes, GCHandleType.Pinned);
IntPtr address = pinned.AddrOfPinnedObject();
sbyte** inData = (sbyte**)address;
fixed(int* lineSize = new int[1])
{
lineSize[0] = 4 * textureWidth;
// Convert RGBA to YUV420P
ffmpeg.sws_scale(sws, inData, lineSize, 0, codecContext->width, inputFrame->extended_data, inputFrame->linesize);
}
inputFrame->pts = frameCounter++;
if(ffmpeg.avcodec_send_frame(codecContext, inputFrame) < 0)
throw new ApplicationException("Error sending a frame for encoding!");
pkt = new AVPacket();
fixed(AVPacket* packet = &pkt)
ffmpeg.av_init_packet(packet);
pkt.data = null;
pkt.size = 0;
pinned.Free();
...
}
sws_scale takes a sbyte** as the second parameter, therefore I'm trying to convert the input byte array to sbyte** by first pinning it with GCHandle and doing an explicit type conversion afterwards. I don't know if that's the correct way, though.
Moreover, the condition if(ffmpeg.avcodec_send_frame(codecContext, inputFrame) < 0) is always true, so the ApplicationException is thrown every time, and I don't know why this happens. codecContext and inputFrame are my AVCodecContext and AVFrame objects, respectively, and they are set up as follows:
codecContext
codecContext = ffmpeg.avcodec_alloc_context3(codec);
codecContext->bit_rate = 400000;
codecContext->width = textureWidth;
codecContext->height = textureHeight;
AVRational timeBase = new AVRational();
timeBase.num = 1;
timeBase.den = (int)fps;
codecContext->time_base = timeBase;
videoAVStream->time_base = timeBase;
AVRational frameRate = new AVRational();
frameRate.num = (int)fps;
frameRate.den = 1;
codecContext->framerate = frameRate;
codecContext->gop_size = 10;
codecContext->max_b_frames = 1;
codecContext->pix_fmt = AVPixelFormat.AV_PIX_FMT_YUV420P;
inputFrame
inputFrame = ffmpeg.av_frame_alloc();
inputFrame->format = (int)codecContext->pix_fmt;
inputFrame->width = textureWidth;
inputFrame->height = textureHeight;
inputFrame->linesize[0] = inputFrame->width;
Any help in fixing the issue would be greatly appreciated :)
Check the examples here: https://github.com/FFmpeg/FFmpeg/tree/master/doc/examples
Especially scaling_video.c. In FFmpeg, scaling and pixel format conversion are the same operation (keep the size parameters the same if you only want pixel format conversion).
These examples are very easy to follow. Give them a try.
I think your cast is incorrect: sbyte** inData = (sbyte**)address;
address is an IntPtr that points at the pixel data itself, not at a pointer to that data, so the correct cast should probably be
sbyte* pinData = (sbyte *)address.ToPointer();
sbyte** ppInData = &pinData;
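To illustrate the pointer-to-pointer point in plain C: the source argument of sws_scale is an array of plane pointers plus an array of strides, so a packed RGBA buffer has to be wrapped in a one-element array rather than cast. A minimal sketch, where wrap_rgba and read_byte are hypothetical helpers (read_byte just stands in for a consumer that indexes planes the way a scaler would):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wrap a packed RGBA buffer as "one plane, stride = 4 bytes per pixel". */
void wrap_rgba(const uint8_t *rgba, int width,
               const uint8_t *planes[1], int strides[1])
{
    assert(rgba != NULL && width > 0);
    planes[0]  = rgba;        /* pointer to the pixel data      */
    strides[0] = 4 * width;   /* bytes from one row to the next */
}

/* Hypothetical consumer taking the same array-of-planes shape. */
uint8_t read_byte(const uint8_t *const planes[], const int strides[],
                  int plane, int row, int col)
{
    return planes[plane][(size_t)row * (size_t)strides[plane] + (size_t)col];
}
```

Casting the buffer address straight to a pointer-to-pointer would make the consumer interpret the first pixel bytes as an address, which is why the extra level of indirection is needed. Separately, the fifth argument of sws_scale (srcSliceH) is a number of rows, so the frame height is presumably what belongs there.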
I'm trying to use libavcodec to encode a flv video.
The following is sample code that generates an MPEG video, and it works well. But after replacing the codec ID with AV_CODEC_ID_FLV1, the generated video file cannot be played.
void simpleEncode(){
AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_MPEG1VIDEO);
AVCodecContext *ctx = avcodec_alloc_context3(codec);
ctx->bit_rate = 400000;
ctx->width = 352;
ctx->height = 288;
AVRational time_base = {1,25};
ctx->time_base = time_base;
ctx->gop_size = 10;
ctx->pix_fmt = AV_PIX_FMT_YUV420P;
avcodec_open2(ctx, codec, NULL);
AVFrame *frame = av_frame_alloc();
av_image_alloc(frame->data, frame->linesize, ctx->width, ctx->height, ctx->pix_fmt, 32);
frame->format = ctx->pix_fmt;
frame->height = ctx->height;
frame->width = ctx->width;
AVPacket pkt;
int got_output;
FILE *f = fopen("test.mpg", "wb");
for(int i=0; i<25; i++){
av_init_packet(&pkt);
pkt.data = NULL;
pkt.size = 0;
for(int w=0; w<ctx->width; w++){
for(int h=0; h<ctx->height; h++){
frame->data[0][h*frame->linesize[0]+w]=i*10;
}
}
for(int w=0; w<ctx->width/2; w++){
for(int h=0; h<ctx->height/2; h++){
frame->data[1][h*frame->linesize[1]+w]=i*10;
frame->data[2][h*frame->linesize[2]+w]=i*10;
}
}
frame->pts=i;
avcodec_encode_video2(ctx, &pkt, frame, &got_output);
fwrite(pkt.data, 1, pkt.size, f);
}
}
AV_CODEC_ID_FLV1 is a poorly named macro. It refers to the Sorenson H.263 codec, which was at one time the default codec for the FLV container format but no longer is; it was replaced by VP6 and now H.264. Apart from this history, it has no relation to the FLV container format.
An MPEG-1 stream is an odd thing in that its elementary stream format is also a valid container. This is not the case for H.263: you cannot simply write the packets to disk and play them back. You must encapsulate the elementary stream (ES) in a container. The easiest way to do that is to use libavformat.
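One quick way to see the difference between raw packets and a container is to look at the first bytes of the file. A minimal sketch, assuming the usual FLV file header layout (the bytes "FLV", a version byte, a flags byte, then a 32-bit big-endian header size of at least 9); looks_like_flv is a hypothetical helper, not a libavformat function:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True when buf starts with something shaped like an FLV file header. */
bool looks_like_flv(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len < 9)
        return false;                      /* header is 9 bytes long */
    if (memcmp(buf, "FLV", 3) != 0)
        return false;                      /* missing signature      */
    uint32_t headerSize = ((uint32_t)buf[5] << 24) | ((uint32_t)buf[6] << 16)
                        | ((uint32_t)buf[7] << 8)  |  (uint32_t)buf[8];
    return headerSize >= 9;
}
```

A file produced by writing pkt.data straight to disk, as in the question, would not start with this header, since nothing in that loop ever writes one.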
I'm writing a CoreAudio backend for an audio library called XAL. Input buffers can have various sample rates. I'm using a single audio unit for output. The idea is to convert the buffers and mix them prior to sending them to the audio unit.
Everything works as long as the input buffer has the same properties (sample rate, channel count, etc) as the output audio unit. Hence, the mixing part works.
However, I'm stuck with sample rate and channel count conversion. From what I figured out, this is easiest to do with Audio Converter Services API. I've managed to construct a converter; the idea is that the output format is the same as the output unit format, but possibly adjusted for purposes of the converter.
The audio converter is constructed successfully, but upon calling AudioConverterFillComplexBuffer(), I get the output status error -50.
I'd love it if I could get another set of eyeballs on this code. The problem is probably somewhere below AudioConverterNew(). The variable stream contains the incoming (and outgoing) buffer data, and streamSize contains the size in bytes of the incoming (and outgoing) buffer data.
What did I do wrong?
void CoreAudio_AudioManager::_convertStream(Buffer* buffer, unsigned char** stream, int *streamSize)
{
if (buffer->getBitsPerSample() != unitDescription.mBitsPerChannel ||
buffer->getChannels() != unitDescription.mChannelsPerFrame ||
buffer->getSamplingRate() != unitDescription.mSampleRate)
{
printf("INPUT STREAM SIZE: %d\n", *streamSize);
// describe the input format's description
AudioStreamBasicDescription inputDescription;
memset(&inputDescription, 0, sizeof(inputDescription));
inputDescription.mFormatID = kAudioFormatLinearPCM;
inputDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
inputDescription.mChannelsPerFrame = buffer->getChannels();
inputDescription.mSampleRate = buffer->getSamplingRate();
inputDescription.mBitsPerChannel = buffer->getBitsPerSample();
inputDescription.mBytesPerFrame = (inputDescription.mBitsPerChannel * inputDescription.mChannelsPerFrame) / 8;
inputDescription.mFramesPerPacket = 1; //*streamSize / inputDescription.mBytesPerFrame;
inputDescription.mBytesPerPacket = inputDescription.mBytesPerFrame * inputDescription.mFramesPerPacket;
printf("INPUT : %lu bytes per packet for sample rate %g, channels %d\n", inputDescription.mBytesPerPacket, inputDescription.mSampleRate, inputDescription.mChannelsPerFrame);
// copy conversion output format's description from the
// output audio unit's description.
// then adjust framesPerPacket to match the input we'll be passing.
// framecount of our input stream is based on the input bytecount.
// output stream will have same number of frames, but different
// number of bytes.
AudioStreamBasicDescription outputDescription = unitDescription;
outputDescription.mFramesPerPacket = 1; //inputDescription.mFramesPerPacket;
outputDescription.mBytesPerPacket = outputDescription.mBytesPerFrame * outputDescription.mFramesPerPacket;
printf("OUTPUT : %lu bytes per packet for sample rate %g, channels %d\n", outputDescription.mBytesPerPacket, outputDescription.mSampleRate, outputDescription.mChannelsPerFrame);
// create an audio converter
AudioConverterRef audioConverter;
OSStatus acCreationResult = AudioConverterNew(&inputDescription, &outputDescription, &audioConverter);
printf("Created audio converter %p (status: %d)\n", audioConverter, acCreationResult);
if(!audioConverter)
{
// bail out
free(*stream);
*streamSize = 0;
*stream = (unsigned char*)malloc(0);
return;
}
// calculate number of bytes required for output of input stream.
// allocate buffer of adequate size.
UInt32 outputBytes = outputDescription.mBytesPerPacket * (*streamSize / inputDescription.mBytesPerFrame); // outputDescription.mFramesPerPacket * outputDescription.mBytesPerFrame;
unsigned char *outputBuffer = (unsigned char*)malloc(outputBytes);
memset(outputBuffer, 0, outputBytes);
printf("OUTPUT BYTES : %d\n", outputBytes);
// describe input data we'll pass into converter
AudioBuffer inputBuffer;
inputBuffer.mNumberChannels = inputDescription.mChannelsPerFrame;
inputBuffer.mDataByteSize = *streamSize;
inputBuffer.mData = *stream;
// describe output data buffers into which we can receive data.
AudioBufferList outputBufferList;
outputBufferList.mNumberBuffers = 1;
outputBufferList.mBuffers[0].mNumberChannels = outputDescription.mChannelsPerFrame;
outputBufferList.mBuffers[0].mDataByteSize = outputBytes;
outputBufferList.mBuffers[0].mData = outputBuffer;
// set output data packet size
UInt32 outputDataPacketSize = outputDescription.mBytesPerPacket;
// convert
OSStatus result = AudioConverterFillComplexBuffer(audioConverter, /* AudioConverterRef inAudioConverter */
CoreAudio_AudioManager::_converterComplexInputDataProc, /* AudioConverterComplexInputDataProc inInputDataProc */
&inputBuffer, /* void *inInputDataProcUserData */
&outputDataPacketSize, /* UInt32 *ioOutputDataPacketSize */
&outputBufferList, /* AudioBufferList *outOutputData */
NULL /* AudioStreamPacketDescription *outPacketDescription */
);
printf("Result: %d wheee\n", result);
// change "stream" to describe our output buffer.
// even if an error occurred, we'd rather have silence than unconverted audio.
free(*stream);
*stream = outputBuffer;
*streamSize = outputBytes;
// dispose of the audio converter
AudioConverterDispose(audioConverter);
}
}
OSStatus CoreAudio_AudioManager::_converterComplexInputDataProc(AudioConverterRef inAudioConverter,
UInt32* ioNumberDataPackets,
AudioBufferList* ioData,
AudioStreamPacketDescription** ioDataPacketDescription,
void* inUserData)
{
printf("Converter\n");
if(*ioNumberDataPackets != 1)
{
xal::log("_converterComplexInputDataProc cannot provide input data; invalid number of packets requested");
*ioNumberDataPackets = 0;
ioData->mNumberBuffers = 0;
return -50;
}
*ioNumberDataPackets = 1;
ioData->mNumberBuffers = 1;
ioData->mBuffers[0] = *(AudioBuffer*)inUserData;
*ioDataPacketDescription = NULL;
return 0;
}
Working code for Core Audio sample rate conversion and channel count conversion, using Audio Converter Services (now available as a part of the BSD-licensed XAL audio library):
void CoreAudio_AudioManager::_convertStream(Buffer* buffer, unsigned char** stream, int *streamSize)
{
if (buffer->getBitsPerSample() != unitDescription.mBitsPerChannel ||
buffer->getChannels() != unitDescription.mChannelsPerFrame ||
buffer->getSamplingRate() != unitDescription.mSampleRate)
{
// describe the input format's description
AudioStreamBasicDescription inputDescription;
memset(&inputDescription, 0, sizeof(inputDescription));
inputDescription.mFormatID = kAudioFormatLinearPCM;
inputDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
inputDescription.mChannelsPerFrame = buffer->getChannels();
inputDescription.mSampleRate = buffer->getSamplingRate();
inputDescription.mBitsPerChannel = buffer->getBitsPerSample();
inputDescription.mBytesPerFrame = (inputDescription.mBitsPerChannel * inputDescription.mChannelsPerFrame) / 8;
inputDescription.mFramesPerPacket = 1; //*streamSize / inputDescription.mBytesPerFrame;
inputDescription.mBytesPerPacket = inputDescription.mBytesPerFrame * inputDescription.mFramesPerPacket;
// copy conversion output format's description from the
// output audio unit's description.
// then adjust framesPerPacket to match the input we'll be passing.
// framecount of our input stream is based on the input bytecount.
// output stream will have same number of frames, but different
// number of bytes.
AudioStreamBasicDescription outputDescription = unitDescription;
outputDescription.mFramesPerPacket = 1; //inputDescription.mFramesPerPacket;
outputDescription.mBytesPerPacket = outputDescription.mBytesPerFrame * outputDescription.mFramesPerPacket;
// create an audio converter
AudioConverterRef audioConverter;
OSStatus acCreationResult = AudioConverterNew(&inputDescription, &outputDescription, &audioConverter);
if(!audioConverter)
{
// bail out
free(*stream);
*streamSize = 0;
*stream = (unsigned char*)malloc(0);
return;
}
// calculate number of bytes required for output of input stream.
// allocate buffer of adequate size.
UInt32 outputBytes = outputDescription.mBytesPerPacket * (*streamSize / inputDescription.mBytesPerPacket); // outputDescription.mFramesPerPacket * outputDescription.mBytesPerFrame;
unsigned char *outputBuffer = (unsigned char*)malloc(outputBytes);
memset(outputBuffer, 0, outputBytes);
// describe input data we'll pass into converter
AudioBuffer inputBuffer;
inputBuffer.mNumberChannels = inputDescription.mChannelsPerFrame;
inputBuffer.mDataByteSize = *streamSize;
inputBuffer.mData = *stream;
// describe output data buffers into which we can receive data.
AudioBufferList outputBufferList;
outputBufferList.mNumberBuffers = 1;
outputBufferList.mBuffers[0].mNumberChannels = outputDescription.mChannelsPerFrame;
outputBufferList.mBuffers[0].mDataByteSize = outputBytes;
outputBufferList.mBuffers[0].mData = outputBuffer;
// set output data packet size
UInt32 outputDataPacketSize = outputBytes / outputDescription.mBytesPerPacket;
// fill class members with data that we'll pass into
// the InputDataProc
_converter_currentBuffer = &inputBuffer;
_converter_currentInputDescription = inputDescription;
// convert
OSStatus result = AudioConverterFillComplexBuffer(audioConverter, /* AudioConverterRef inAudioConverter */
CoreAudio_AudioManager::_converterComplexInputDataProc, /* AudioConverterComplexInputDataProc inInputDataProc */
this, /* void *inInputDataProcUserData */
&outputDataPacketSize, /* UInt32 *ioOutputDataPacketSize */
&outputBufferList, /* AudioBufferList *outOutputData */
NULL /* AudioStreamPacketDescription *outPacketDescription */
);
// change "stream" to describe our output buffer.
// even if an error occurred, we'd rather have silence than unconverted audio.
free(*stream);
*stream = outputBuffer;
*streamSize = outputBytes;
// dispose of the audio converter
AudioConverterDispose(audioConverter);
}
}
OSStatus CoreAudio_AudioManager::_converterComplexInputDataProc(AudioConverterRef inAudioConverter,
UInt32* ioNumberDataPackets,
AudioBufferList* ioData,
AudioStreamPacketDescription** ioDataPacketDescription,
void* inUserData)
{
if(ioDataPacketDescription)
{
xal::log("_converterComplexInputDataProc cannot provide input data; it doesn't know how to provide packet descriptions");
*ioDataPacketDescription = NULL;
*ioNumberDataPackets = 0;
ioData->mNumberBuffers = 0;
return 501;
}
CoreAudio_AudioManager *self = (CoreAudio_AudioManager*)inUserData;
ioData->mNumberBuffers = 1;
ioData->mBuffers[0] = *(self->_converter_currentBuffer);
*ioNumberDataPackets = ioData->mBuffers[0].mDataByteSize / self->_converter_currentInputDescription.mBytesPerPacket;
return 0;
}
In the header, as part of the CoreAudio_AudioManager class, here are relevant instance variables:
AudioStreamBasicDescription unitDescription;
AudioBuffer *_converter_currentBuffer;
AudioStreamBasicDescription _converter_currentInputDescription;
A few months later, I'm looking at this and I've realized that I didn't document the changes.
If you are interested in what the changes were:
Look at the callback function CoreAudio_AudioManager::_converterComplexInputDataProc.
The callback has to write the number of packets it actually provides into ioNumberDataPackets.
This required introducing new instance variables to hold both the buffer (previously passed as inUserData) and the input description (used to calculate the number of packets to be fed into Core Audio's converter).
This number of packets (the callback's "output", i.e. what is fed into the converter) is calculated from the amount of data our callback received and the number of bytes per packet in the input format.
Hopefully this edit will help a future reader (myself included)!
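For reference, the packet arithmetic from the notes above can be sketched on its own. These are hypothetical helper functions written without the Core Audio headers; they assume constant-size packets with one frame per packet, as in the uncompressed PCM formats used here. The last function is an addition that is not in the code above: when the sample rates differ, the number of output frames scales with the rate ratio, so it estimates that count with integer math rounded up.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole packets in a buffer of constant-size packets. */
uint32_t packets_in_buffer(uint32_t byteSize, uint32_t bytesPerPacket)
{
    return bytesPerPacket ? byteSize / bytesPerPacket : 0;
}

/* Output size when one output packet is produced per input packet
   (the sizing used for outputBytes in the code above). */
uint32_t output_bytes_same_packet_count(uint32_t inBytes,
                                        uint32_t inBytesPerPacket,
                                        uint32_t outBytesPerPacket)
{
    return outBytesPerPacket * packets_in_buffer(inBytes, inBytesPerPacket);
}

/* Assumption, not from the original: rounded-up packet count after
   resampling from inRate to outRate (rates in whole Hz). */
uint32_t estimated_output_packets(uint32_t inPackets,
                                  uint32_t inRate, uint32_t outRate)
{
    assert(inRate > 0);
    return (uint32_t)(((uint64_t)inPackets * outRate + inRate - 1) / inRate);
}
```

For example, 4096 bytes of 16-bit mono (2 bytes per packet) hold 2048 packets, which become 8192 bytes of 16-bit stereo if the packet count stays the same.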
Reading the iOS SDK documentation for CMBufferQueueCreate, it says that getDuration and version are required; all the other callbacks can be NULL.
But running the following code:
CFAllocatorRef allocator;
CMBufferCallbacks *callbacks;
callbacks = malloc(sizeof(CMBufferCallbacks));
callbacks->version = 0;
callbacks->getDuration = timeCallback;
callbacks->refcon = NULL;
callbacks->getDecodeTimeStamp = NULL;
callbacks->getPresentationTimeStamp = NULL;
callbacks->isDataReady = NULL;
callbacks->compare = NULL;
callbacks->dataBecameReadyNotification = NULL;
CMItemCount capacity = 4;
OSStatus s = CMBufferQueueCreate(allocator, capacity, callbacks, queue);
NSLog(@"QUEUE: %x", queue);
NSLog(@"STATUS: %i", s);
with timeCallback:
CMTime timeCallback(CMBufferRef buf, void *refcon){
return CMTimeMake(1, 1);
}
and queue is:
CMBufferQueueRef* queue;
queue creation fails (queue = 0) and returns a status of:
kCMBufferQueueError_RequiredParameterMissing = -12761,
The callbacks variable is correctly initialized, at least the debugger says so.
Has anybody used the CMBufferQueue?
Presumably there is nothing wrong with the callbacks; CMBufferQueue.h states the same thing you wrote about the required parameters. But it looks like you are passing a null pointer as the CMBufferQueueRef* parameter. I have updated your sample as follows and it seems to create the queue OK.
CMBufferQueueRef queue;
CFAllocatorRef allocator = kCFAllocatorDefault;
CMBufferCallbacks *callbacks;
callbacks = malloc(sizeof(CMBufferCallbacks));
callbacks->version = 0;
callbacks->getDuration = timeCallback;
callbacks->refcon = NULL;
callbacks->getDecodeTimeStamp = NULL;
callbacks->getPresentationTimeStamp = NULL;
callbacks->isDataReady = NULL;
callbacks->compare = NULL;
callbacks->dataBecameReadyNotification = NULL;
CMItemCount capacity = 4;
OSStatus s = CMBufferQueueCreate(allocator, capacity, callbacks, &queue);
NSLog(@"QUEUE: %x", queue);
NSLog(@"STATUS: %i", s);
The time callback is still the same.
This probably comes too late to help the original poster, but I hope it helps somebody else.
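The underlying C pattern is that an out-parameter must point at storage the caller owns. A minimal sketch with hypothetical stand-in names (DemoQueue, demo_queue_create); it only imitates the behaviour described above, where a null out-pointer is reported as a missing required parameter, and makes no claim about what CMBufferQueueCreate checks internally:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct DemoQueue { int capacity; } DemoQueue;
typedef DemoQueue *DemoQueueRef;

enum {
    DEMO_OK = 0,
    DEMO_ERR_ALLOCATION_FAILED = -1,
    DEMO_ERR_REQUIRED_PARAMETER_MISSING = -12761  /* value quoted in the question */
};

/* Creates a queue and stores the reference through queueOut. */
int32_t demo_queue_create(int capacity, DemoQueueRef *queueOut)
{
    assert(capacity > 0);
    if (queueOut == NULL)
        return DEMO_ERR_REQUIRED_PARAMETER_MISSING;  /* nowhere to store it */
    DemoQueueRef q = malloc(sizeof *q);
    if (q == NULL)
        return DEMO_ERR_ALLOCATION_FAILED;
    q->capacity = capacity;
    *queueOut = q;
    return DEMO_OK;
}
```

Declaring DemoQueueRef queue; and passing &queue works; declaring DemoQueueRef *queue; and passing it while it is still null gives the function nowhere to write the result.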
It's meant to return an OSType, but instead I'm just getting -50. Does anyone have any idea what error this represents? I can't find it anywhere.
A code snippet for context (the error is so ambiguous I don't know what snippet to paste, here's pretty much everything):
ExtAudioFileRef cafFile;
AudioStreamBasicDescription cafDesc;
cafDesc.mBitsPerChannel = 16;
cafDesc.mBytesPerFrame = 4;
cafDesc.mBytesPerPacket = 4;
cafDesc.mChannelsPerFrame = 2;
cafDesc.mFormatFlags = 0;
cafDesc.mFormatID = 'ima4';
cafDesc.mFramesPerPacket = 1;
cafDesc.mReserved = 0;
cafDesc.mSampleRate = 44100;
OSType status = ExtAudioFileCreateWithURL(
fileURL, // inURL
'caff', // inFileType
&cafDesc, // inStreamDesc
NULL, // inChannelLayout
kAudioFileFlags_EraseFile, // inFlags
&cafFile // outExtAudioFile
); // returns 0xFFFFFFCE
ExtAudioFileCreateWithURL() returns an OSStatus, not an OSType. See the file MacErrors.h for the various error codes. In this case, -50 is paramErr (error in user parameter list), so you're passing one or more of the parameters incorrectly to the function.
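When debugging these calls it helps to print the status both ways, since some Core Audio errors are four-character codes (like 'fmt?') and others are plain negative numbers (like -50, which shows up as 0xFFFFFFCE when viewed as unsigned). A minimal sketch in plain C; format_status is a hypothetical helper, and int32_t stands in for OSStatus:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Writes status as 'abcd' when all four bytes are printable, else as decimal.
   A 16-byte buffer is enough for either form. */
void format_status(int32_t status, char *out, size_t outSize)
{
    assert(out != NULL && outSize > 0);
    uint32_t u = (uint32_t)status;
    unsigned char c[4] = {
        (unsigned char)((u >> 24) & 0xFF),
        (unsigned char)((u >> 16) & 0xFF),
        (unsigned char)((u >> 8) & 0xFF),
        (unsigned char)(u & 0xFF)
    };
    if (isprint(c[0]) && isprint(c[1]) && isprint(c[2]) && isprint(c[3]))
        snprintf(out, outSize, "'%c%c%c%c'", c[0], c[1], c[2], c[3]);
    else
        snprintf(out, outSize, "%d", (int)status);
}
```

Passing -50 yields the decimal form, which can then be looked up in MacErrors.h as described above.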