How to check results of ExtAudioFileRead? - iPhone

I'm using ExtAudioFileRead to read a WAV file into memory as a float * buffer. However, I'm not quite sure about the results: when I print them out, I get values from -1 to +1 (which should be theoretically correct), but how can I be sure that they are correct?
- (float *) readTestFileAndSize: (int *) size
{
CFStringRef str = CFStringCreateWithCString(
NULL,
[[[NSBundle mainBundle] pathForResource: @"25" ofType: @"wav"] UTF8String],
kCFStringEncodingMacRoman
);
CFURLRef inputFileURL = CFURLCreateWithFileSystemPath(
kCFAllocatorDefault,
str,
kCFURLPOSIXPathStyle,
false
);
ExtAudioFileRef fileRef;
ExtAudioFileOpenURL(inputFileURL, &fileRef);
SInt64 theFileLengthInFrames = 0;
// Get the total frame count
UInt32 thePropertySize = sizeof(theFileLengthInFrames);
ExtAudioFileGetProperty(fileRef, kExtAudioFileProperty_FileLengthFrames, &thePropertySize, &theFileLengthInFrames);
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 44100;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kLinearPCMFormatFlagIsFloat;
audioFormat.mBitsPerChannel = sizeof(Float32) * 8;
audioFormat.mChannelsPerFrame = 1; // Mono
audioFormat.mBytesPerFrame = audioFormat.mChannelsPerFrame * sizeof(Float32); // == sizeof(Float32)
audioFormat.mFramesPerPacket = 1;
audioFormat.mBytesPerPacket = audioFormat.mFramesPerPacket * audioFormat.mBytesPerFrame; // = sizeof(Float32)
// 3) Apply audio format to the Extended Audio File
ExtAudioFileSetProperty(
fileRef,
kExtAudioFileProperty_ClientDataFormat,
sizeof(AudioStreamBasicDescription), // the size of audioFormat
&audioFormat);
int numSamples = 1024; //How many samples to read in at a time
UInt32 sizePerPacket = audioFormat.mBytesPerPacket; // = sizeof(Float32) = 4 bytes
UInt32 packetsPerBuffer = numSamples;
UInt32 outputBufferSize = packetsPerBuffer * sizePerPacket;
// outputBuffer points to the memory reserved for one read's worth of data
UInt8 *outputBuffer = (UInt8 *)malloc(outputBufferSize); // outputBufferSize bytes; the original sizeof(UInt8 *) * outputBufferSize over-allocated by a factor of the pointer size
NSLog(@"theFileLengthInFrames - %lld", theFileLengthInFrames);
float* total = malloc(theFileLengthInFrames * sizeof(float));
*size = theFileLengthInFrames;
AudioBufferList convertedData;
convertedData.mNumberBuffers = 1; // Set this to 1 for mono
convertedData.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame; //also = 1
convertedData.mBuffers[0].mDataByteSize = outputBufferSize;
convertedData.mBuffers[0].mData = outputBuffer;
int totalFrames = 0; // counts frames copied so far (not bytes)
UInt32 frameCount = numSamples;
while (frameCount > 0) {
frameCount = numSamples;
convertedData.mBuffers[0].mDataByteSize = outputBufferSize; // reset each pass; ExtAudioFileRead may shrink it
ExtAudioFileRead(fileRef, &frameCount, &convertedData);
if (frameCount > 0) {
AudioBuffer audioBuffer = convertedData.mBuffers[0];
float *samplesAsCArray = (float *)audioBuffer.mData;
memcpy(total + totalFrames, samplesAsCArray, frameCount * sizeof(float));
totalFrames += frameCount;
}
}
free(outputBuffer);
ExtAudioFileDispose(fileRef);
CFRelease(inputFileURL);
CFRelease(str);
return total;
}

There are only a few ways to test that I can think of:
- Compare the data you've loaded to data loaded by something you know works.
- Play the audio data back out somehow (probably using an AudioQueue).

Related

iPhone - finalizing Apple's vague "VerificationController.m"

I am trying to implement the new VerificationController.m class that Apple released to fix the in-app purchase fraud problem.
Like everything released by Apple, this is one more vague, incomplete, and badly explained document, with a lot of gaps and unknowns that not everyone can work around or understand.
I am trying to implement that, but at the end of the code we see these four methods:
- (NSString *)encodeBase64:(const uint8_t *)input length:(NSInteger)length
{
#warning Replace this method.
return nil;
}
- (NSString *)decodeBase64:(NSString *)input length:(NSInteger *)length
{
#warning Replace this method.
return nil;
}
#warning Implement this function.
char* base64_encode(const void* buf, size_t size)
{ return NULL; }
#warning Implement this function.
void * base64_decode(const char* s, size_t * data_len)
{ return NULL; }
You can see that Apple didn't bother to implement the C functions at the end of the code. As my C/C++ abilities stink, I see I need to implement these two functions in C/C++ and that they must return char * and void * (???). Other people have posted routines to do that on SO, but they are either in Objective-C or don't return char * and void * (??).
NOTE: this is another problem I have: how can a function return void * if it is used by Apple in this form?
uint8_t *purchase_info_bytes = base64_decode([purchase_info_string cStringUsingEncoding:NSASCIIStringEncoding], &purchase_info_length);
shouldn't it be returning uint8_t *?
NOTE 2: another problem I have is that Apple says base64_encode is required, but it is not used in the code they provided. I think they are smoking bad stuff, or my C/C++ knowledge really stinks.
So, returning to my first question: can someone post or point to an implementation that does the job and matches the declared signatures of base64_encode and base64_decode? Please refrain from posting Objective-C methods that are not compatible with these requirements imposed by Apple.
Thanks.
This solution should be pretty straightforward and includes all the methods needed to fill in the missing pieces. Tested and functional within the sandbox.
// single base64 character conversion
static int POS(char c)
{
if (c>='A' && c<='Z') return c - 'A';
if (c>='a' && c<='z') return c - 'a' + 26;
if (c>='0' && c<='9') return c - '0' + 52;
if (c == '+') return 62;
if (c == '/') return 63;
if (c == '=') return -1;
[NSException raise:@"invalid BASE64 encoding" format:@"Invalid BASE64 encoding"];
return 0;
}
- (NSString *)encodeBase64:(const uint8_t *)input length:(NSInteger)length
{
char *encoded = base64_encode(input, (size_t)length);
NSString *result = [NSString stringWithUTF8String:encoded];
free(encoded); // base64_encode returns malloc'd memory
return result;
}
- (NSString *)decodeBase64:(NSString *)input length:(NSInteger *)length
{
size_t retLen;
uint8_t *retStr = base64_decode([input UTF8String], &retLen);
if (length)
*length = (NSInteger)retLen;
NSString *st = [[[NSString alloc] initWithBytes:retStr
length:retLen
encoding:NSUTF8StringEncoding] autorelease];
free(retStr); // base64_decode returns malloc'd memory
return st;
}
char* base64_encode(const void* buf, size_t size)
{
static const char base64[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
char* str = (char*) malloc((size+3)*4/3 + 1);
char* p = str;
unsigned char* q = (unsigned char*) buf;
size_t i = 0;
while(i < size) {
int c = q[i++];
c *= 256;
if (i < size) c += q[i];
i++;
c *= 256;
if (i < size) c += q[i];
i++;
*p++ = base64[(c & 0x00fc0000) >> 18];
*p++ = base64[(c & 0x0003f000) >> 12];
if (i > size + 1)
*p++ = '=';
else
*p++ = base64[(c & 0x00000fc0) >> 6];
if (i > size)
*p++ = '=';
else
*p++ = base64[c & 0x0000003f];
}
*p = 0;
return str;
}
void* base64_decode(const char* s, size_t* data_len_ptr)
{
size_t len = strlen(s);
if (len % 4)
[NSException raise:@"Invalid input in base64_decode" format:@"%zu is an invalid length for an input string for BASE64 decoding", len];
unsigned char* data = (unsigned char*) malloc(len/4*3);
int n[4];
unsigned char* q = (unsigned char*) data;
for(const char*p=s; *p; )
{
n[0] = POS(*p++);
n[1] = POS(*p++);
n[2] = POS(*p++);
n[3] = POS(*p++);
if (n[0]==-1 || n[1]==-1)
[NSException raise:@"Invalid input in base64_decode" format:@"Invalid BASE64 encoding"];
if (n[2]==-1 && n[3]!=-1)
[NSException raise:@"Invalid input in base64_decode" format:@"Invalid BASE64 encoding"];
q[0] = (n[0] << 2) + (n[1] >> 4);
if (n[2] != -1) q[1] = ((n[1] & 15) << 4) + (n[2] >> 2);
if (n[3] != -1) q[2] = ((n[2] & 3) << 6) + n[3];
q += 3;
}
// make sure that data_len_ptr is not null
if (!data_len_ptr)
[NSException raise:@"Invalid input in base64_decode" format:@"Invalid destination for output string length"];
*data_len_ptr = q-data - (n[2]==-1) - (n[3]==-1);
return data;
}
Here is a base 64 encode function for NSString to NSString:
+(NSString *) encodeString:(NSString *)inString
{
NSData *data = [inString dataUsingEncoding:NSUTF8StringEncoding];
//Point to start of the data and set buffer sizes
int inLength = [data length];
int outLength = ((((inLength * 4)/3)/4)*4) + (((inLength * 4)/3)%4 ? 4 : 0);
const char *inputBuffer = [data bytes];
char *outputBuffer = malloc(outLength + 1); // +1 for the terminating NUL; writing outputBuffer[outLength] into a malloc(outLength) buffer is out of bounds
outputBuffer[outLength] = 0;
//64 digit code
static char Encode[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
//start the count
int cycle = 0;
int inpos = 0;
int outpos = 0;
char temp;
outputBuffer[outLength-1] = '=';
outputBuffer[outLength-2] = '=';
while (inpos < inLength){
switch (cycle) {
case 0:
outputBuffer[outpos++] = Encode[(inputBuffer[inpos]&0xFC)>>2];
cycle = 1;
break;
case 1:
temp = (inputBuffer[inpos++]&0x03)<<4;
outputBuffer[outpos] = Encode[temp];
cycle = 2;
break;
case 2:
outputBuffer[outpos++] = Encode[temp|(inputBuffer[inpos]&0xF0)>> 4];
temp = (inputBuffer[inpos++]&0x0F)<<2;
outputBuffer[outpos] = Encode[temp];
cycle = 3;
break;
case 3:
outputBuffer[outpos++] = Encode[temp|(inputBuffer[inpos]&0xC0)>>6];
cycle = 4;
break;
case 4:
outputBuffer[outpos++] = Encode[inputBuffer[inpos++]&0x3f];
cycle = 0;
break;
default:
cycle = 0;
break;
}
}
NSString *pictemp = [NSString stringWithUTF8String:outputBuffer];
free(outputBuffer);
return pictemp;
}
and Here is a base 64 decode function for NSString to NSString:
+(NSString *) decodeString:(NSString *)inString
{
const char* string = [inString cStringUsingEncoding:NSASCIIStringEncoding];
NSInteger inputLength = inString.length;
static char decodingTable[128];
static char encodingTable[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
for (NSInteger i = 0; i < 64; i++) { // the encoding table has only 64 entries; looping to 128 reads past it
decodingTable[encodingTable[i]] = i;
}
if ((string == NULL) || (inputLength % 4 != 0)) {
return nil;
}
while (inputLength > 0 && string[inputLength - 1] == '=') {
inputLength--;
}
NSInteger outputLength = inputLength * 3 / 4;
NSMutableData* data = [NSMutableData dataWithLength:outputLength];
uint8_t* output = data.mutableBytes;
NSInteger inputPoint = 0;
NSInteger outputPoint = 0;
while (inputPoint < inputLength) {
char i0 = string[inputPoint++];
char i1 = string[inputPoint++];
char i2 = inputPoint < inputLength ? string[inputPoint++] : 'A'; /* 'A' will decode to \0 */
char i3 = inputPoint < inputLength ? string[inputPoint++] : 'A';
output[outputPoint++] = (decodingTable[i0] << 2) | (decodingTable[i1] >> 4);
if (outputPoint < outputLength) {
output[outputPoint++] = ((decodingTable[i1] & 0xf) << 4) | (decodingTable[i2] >> 2);
}
if (outputPoint < outputLength) {
output[outputPoint++] = ((decodingTable[i2] & 0x3) << 6) | decodingTable[i3];
}
}
NSLog(@"%@", data);
NSString *finalString = [[[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding] autorelease];
return finalString;
}
These were pieced together from examples I found in various places on the internet a while ago. They may be easier for you to implement. I just created a Base64 class and placed these methods in it.
Here are the C wrappers around Justin's answer:
char* base64_encode(const void* buf, size_t size)
{
NSData* data = [NSData dataWithBytesNoCopy:(void*)buf length:size freeWhenDone:NO];
NSString* string = [[[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding] autorelease];
return strdup([[_Class_ encode:string] UTF8String]);
}
void* base64_decode(const char* s, size_t* data_len)
{
NSString* result = [_Class_ decode:[NSString stringWithCString:s encoding:NSASCIIStringEncoding]];
*data_len = result.length;
return strdup([result UTF8String]);
}
Where _Class_ is the class that contains Justin's functions. Note the strdup calls: returning the result of -UTF8String directly would hand back memory owned by an autoreleased object, and freeWhenDone:NO keeps NSData from freeing the caller's buffer.

Noise in sound in iOS 4.3.x and 5.0 after recording

I've got an app where I use Core Audio for sound recording. Sound is assembled from several parts and then must be saved to the device. It was working fine, but in some newer iOS versions I get noise, something like distortion, in the output files. What is the possible reason? ExtAudioFileCreateWithURL is used to create the output file and ExtAudioFileSetProperty to set its properties.
Any help will be appreciated.
This code was created by another programmer who is currently unavailable, so I don't have any idea why such a hack was implemented.
The way the sound buffer is created for different versions of iOS:
static BOOL shouldFixData = NO;
static int checkOnce = 1;
if (checkOnce) {
checkOnce = 0;
if (inNumberFrames * 8 == ioData->mBuffers[0].mDataByteSize) {
shouldFixData = YES;
}
}
if (shouldFixData) {
AudioBufferList cutData = {0};
cutData.mNumberBuffers = 1;
cutData.mBuffers[0].mNumberChannels = ioData->mBuffers[0].mNumberChannels;
cutData.mBuffers[0].mDataByteSize = ioData->mBuffers[0].mDataByteSize / 2;
cutData.mBuffers[0].mData = malloc(cutData.mBuffers[0].mDataByteSize);
SInt32* oldData = (SInt32*)ioData->mBuffers[0].mData;
SInt32* newData = (SInt32*)cutData.mBuffers[0].mData;
int count = cutData.mBuffers[0].mDataByteSize/4;
for (int i = 0; i < count; ++i) {
newData[i] = oldData[i*2];
}
ExtAudioFileWriteAsync(userData->outputFile, inNumberFrames, &cutData);
free(cutData.mBuffers[0].mData);
} else {
ExtAudioFileWriteAsync(userData->outputFile, inNumberFrames, ioData);
}
}
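For reference, the core of the "shouldFixData" branch above is just "keep every other 32-bit sample", halving a buffer that arrived twice as large as expected (8 bytes per frame instead of 4). A portable sketch of that operation, with a hypothetical function name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Keep every other 32-bit sample, halving the buffer. This mirrors the
   "shouldFixData" branch: on OS versions where the callback delivers
   8 bytes per frame instead of the expected 4, the extra interleaved
   samples are dropped before writing. Returns samples written to out. */
static size_t drop_alternate_samples(const int32_t *in, size_t inCount,
                                     int32_t *out) {
    size_t n = inCount / 2;
    for (size_t i = 0; i < n; i++)
        out[i] = in[i * 2];     /* take samples 0, 2, 4, ... */
    return n;
}
```

Writing the untouched even-indexed samples is what keeps the audio intelligible; writing the full double-size buffer as-is is one plausible source of the distortion described.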
Saving the record:
CAStreamBasicDescription dstFormat;
dstFormat.mSampleRate = mOutputFormat.mSampleRate;
dstFormat.mFormatID = kAudioFormatLinearPCM;
dstFormat.mChannelsPerFrame = 2;
dstFormat.mBitsPerChannel = 16;
dstFormat.mBytesPerPacket = 2 * dstFormat.mChannelsPerFrame;
dstFormat.mBytesPerFrame = 2 * dstFormat.mChannelsPerFrame;
dstFormat.mFramesPerPacket = 1;
dstFormat.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
//recordInfo.output file is ExtAudioFileRef
err = ExtAudioFileCreateWithURL((CFURLRef)recordFileURL, kAudioFileWAVEType, &dstFormat, NULL, kAudioFileFlags_EraseFile, &recordInfo.outputFile);
if (err) { printf("ExtAudioFileCreateWithURL result %d %08X %4.4s\n", (int)err, (unsigned int)err, (char*)&err); return; }
NSString *currSysVer = [[UIDevice currentDevice] systemVersion];
NSComparisonResult versionCompareRes = [currSysVer compare:@"4.3" options:NSNumericSearch];
if (versionCompareRes == NSOrderedSame || versionCompareRes == NSOrderedDescending) {
//for new versions
err = ExtAudioFileSetProperty(recordInfo.outputFile, kExtAudioFileProperty_ClientDataFormat, sizeof(mOutputFormat), &mOutputFormat);
if (err) { printf("ExtAudioFileSetProperty result %d %08X %4.4s\n", (int)err, (unsigned int)err, (char*)&err); return; }
} else {
//for old versions
err = ExtAudioFileSetProperty(recordInfo.outputFile, kExtAudioFileProperty_ClientDataFormat, sizeof(dstFormat), &dstFormat);
if (err) { printf("ExtAudioFileSetProperty result %d %08X %4.4s\n", (int)err, (unsigned int)err, (char*)&err); return; }
}
sample output file:
output.mp3
You could use Speex to do denoising. Include speex_preprocess.h first:
SpeexPreprocessState *_spt = speex_preprocess_state_init(NN, 16000); // NN is 320 for a sample rate of 16000
int i = 1;
speex_preprocess_ctl(_spt, SPEEX_PREPROCESS_SET_DENOISE, &i);
-(BOOL)doConvert:(void *)data SampleLength:(SInt64)length
{
AudioBufferList *dataBuffer = (AudioBufferList*)data;
AudioSampleType *samples = (AudioSampleType*)dataBuffer->mBuffers[0].mData;
SInt64 count = length / NN;
//short sample_ground[count][NN];
short **sample_group;
sample_group = (short**)malloc(sizeof(short*)*count);
for (int i=0; i<count; i++)
sample_group[i] = (short*)malloc(sizeof(short)*NN);
for (int i = 0; i < count; i++) {
for (int j = 0 ; j < NN; j++) {
short value = samples[i*NN+j];
sample_group[i][j] = value;
}
}
for (int i = 0; i < count; i++) {
speex_preprocess_run(_spt, sample_group[i]);
}
for (int i = 0; i < count; i++) {
for (int j = 0 ; j < NN; j++) {
samples[i*NN+j] = sample_group[i][j];
}
}
for (int i=0; i<count; i++)
free(sample_group[i]);
free(sample_group);
return YES;
}
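Note that the per-frame copying into sample_group above is unnecessary: speex_preprocess_run operates on one contiguous NN-sample frame, so you can pass pointers straight into the original buffer. A self-contained sketch of that layout, with a stand-in clamp instead of the real Speex call so it compiles without the library:

```c
#include <assert.h>
#include <stddef.h>

#define NN 320  /* frame size for 16 kHz, as in the snippet above */

/* Stand-in for speex_preprocess_run (here it just clamps, so the sketch
   is self-contained). The real call has the same shape: it processes one
   contiguous NN-sample frame in place. */
static void process_frame(short *frame) {
    for (int j = 0; j < NN; j++)
        if (frame[j] > 30000) frame[j] = 30000;
}

/* Process each full NN-sample frame in place; no temporary 2-D array,
   no per-sample copying. A trailing partial frame is skipped, matching
   the integer division in the original (count = length / NN). */
static void denoise_in_place(short *samples, size_t count) {
    size_t frames = count / NN;
    for (size_t i = 0; i < frames; i++)
        process_frame(samples + i * NN);
}
```

In the real code you would replace process_frame with speex_preprocess_run(_spt, samples + i * NN).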
Oops, this should have been feedback. Sorry.
The main difference in the code you posted is the format setup:
sizeof(mOutputFormat), &mOutputFormat
where it is never shown how mOutputFormat is set, compared to the dstFormat for the older versions, which is fully specified.
Setting the "wrong" format usually gives bad sound :-)

How can I save array of samples as audio file in iPhone?

I have a sound as array of samples.
How can I save this as audio file?
I have examined iPhone Core Audio APIs.
And I understand how to record from mic and play music.
But I can't find how to do that.
Here is a piece of code that works for me. For any more information you should check out the book Core Audio Rough Cuts.
#include "WavGenerator.h"
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
#include "AudioController.h"
#define SAMPLE_RATE 44100
#define DURATION 5.0
#define COUNT_OF(x) ((sizeof(x)/sizeof(0[x])) / ((size_t)(!(sizeof(x) % sizeof(0[x])))))
// #define FILENAME @"newFile.caf"
extern unsigned int global_size_of_instrumental;
extern unsigned int global_size_output;
void createNewWAV (const char *location, int *sample_array){
NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
NSString *filePath = NSTemporaryDirectory();
filePath = [filePath stringByAppendingPathComponent:@"name_of_your_file.wav"];
NSURL *fileURL = [NSURL fileURLWithPath:filePath];
AudioStreamBasicDescription asbd;
memset(&asbd,0, sizeof(asbd));
asbd.mSampleRate = SAMPLE_RATE;
asbd.mFormatID = kAudioFormatLinearPCM;
asbd.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
// asbd.mFormatFlags = kAudioFormatFlagIsBigEndian;
asbd.mBitsPerChannel = 16;
asbd.mChannelsPerFrame = 1;
asbd.mFramesPerPacket = 1;
asbd.mBytesPerFrame = 2;
asbd.mBytesPerPacket = 2;
AudioFileID audioFile;
OSStatus audioErr = noErr;
audioErr = AudioFileCreateWithURL((CFURLRef)fileURL,
kAudioFileWAVEType,
&asbd,
kAudioFileFlags_EraseFile,
&audioFile);
assert (audioErr == noErr);
printf("WAV GENERATOR --- global_size_output %d \n", global_size_output);
int size_of_output = global_size_output;
SInt16 *the_samples = (SInt16 *) malloc(global_size_of_instrumental*size_of_output*sizeof(SInt16));
for (int i=0; i< global_size_of_instrumental*size_of_output; i++)
{
the_samples[i] = sample_array[i];
}
UInt32 numSamples = global_size_of_instrumental*size_of_output;
UInt32 bytesToWrite = numSamples * sizeof(SInt16); // AudioFileWriteBytes counts bytes, not samples
audioErr = AudioFileWriteBytes(audioFile, false, 0, &bytesToWrite, the_samples);
assert(audioErr == noErr);
audioErr = AudioFileClose(audioFile);
assert(audioErr == noErr);
free(the_samples);
[pool drain];
}
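AudioFileCreateWithURL and AudioFileWriteBytes do the container bookkeeping for you, but for reference the WAV format itself is simple: a 44-byte RIFF header followed by raw little-endian PCM. A plain-C sketch for 16-bit mono (assumes a little-endian host, which is true for iOS devices and the simulator; the function is mine, not a Core Audio API):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static void put_u32(FILE *f, uint32_t v) { fwrite(&v, 4, 1, f); } /* host LE */
static void put_u16(FILE *f, uint16_t v) { fwrite(&v, 2, 1, f); }

/* Write a minimal 16-bit mono PCM WAV: 44-byte RIFF header + raw samples.
   Returns 0 on success, -1 on failure. */
static int write_wav(const char *path, const int16_t *samples,
                     uint32_t count, uint32_t rate) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    uint32_t dataBytes = count * 2;
    fwrite("RIFF", 1, 4, f); put_u32(f, 36 + dataBytes);
    fwrite("WAVE", 1, 4, f);
    fwrite("fmt ", 1, 4, f); put_u32(f, 16);  /* fmt chunk size */
    put_u16(f, 1);                            /* 1 = PCM */
    put_u16(f, 1);                            /* mono */
    put_u32(f, rate);
    put_u32(f, rate * 2);                     /* byte rate */
    put_u16(f, 2);                            /* block align */
    put_u16(f, 16);                           /* bits per sample */
    fwrite("data", 1, 4, f); put_u32(f, dataBytes);
    fwrite(samples, 2, count, f);
    fclose(f);
    return 0;
}
```

Seeing the layout spelled out makes it clear why bytesToWrite in the answer above must be a byte count (count * sizeof(SInt16)), not a sample count.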
If you download the free version of http://www.dspdimension.com/technology-licensing/dirac2/ you will find, in the sample source code, functions for reading and writing audio files; I can't remember what format, though.

How can I get AAC encoding with ExtAudioFile on iOS to work?

I need to convert a WAVE file into an AAC encoded M4A file on iOS. I'm aware that AAC encoding is not supported on older devices or in the simulator. I'm testing that before I run the code. But I still can't get it to work.
I looked into Apple's very own iPhoneExtAudioFileConvertTest example and I thought I followed it exactly, but still no luck!
Currently, I get a -50 error (= error in user parameter list) while trying to set the client format on the destination file. It works on the source file.
Below is my code. Any help is very much appreciated, thanks!
UInt32 size;
// Open a source audio file.
ExtAudioFileRef sourceAudioFile;
ExtAudioFileOpenURL( (CFURLRef)sourceURL, &sourceAudioFile );
// Get the source data format
AudioStreamBasicDescription sourceFormat;
size = sizeof( sourceFormat );
result = ExtAudioFileGetProperty( sourceAudioFile, kExtAudioFileProperty_FileDataFormat, &size, &sourceFormat );
// Define the output format (AAC).
AudioStreamBasicDescription outputFormat;
outputFormat.mFormatID = kAudioFormatMPEG4AAC;
outputFormat.mSampleRate = 44100;
outputFormat.mChannelsPerFrame = 2;
// Use AudioFormat API to fill out the rest of the description.
size = sizeof( outputFormat );
AudioFormatGetProperty( kAudioFormatProperty_FormatInfo, 0, NULL, &size, &outputFormat);
// Make a destination audio file with this output format.
ExtAudioFileRef destAudioFile;
ExtAudioFileCreateWithURL( (CFURLRef)destURL, kAudioFileM4AType, &outputFormat, NULL, kAudioFileFlags_EraseFile, &destAudioFile );
// Create canonical PCM client format.
AudioStreamBasicDescription clientFormat;
clientFormat.mSampleRate = sourceFormat.mSampleRate;
clientFormat.mFormatID = kAudioFormatLinearPCM;
clientFormat.mFormatFlags = kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
clientFormat.mChannelsPerFrame = 2;
clientFormat.mBitsPerChannel = 16;
clientFormat.mBytesPerFrame = 4;
clientFormat.mBytesPerPacket = 4;
clientFormat.mFramesPerPacket = 1;
// Set the client format in source and destination file.
size = sizeof( clientFormat );
ExtAudioFileSetProperty( sourceAudioFile, kExtAudioFileProperty_ClientDataFormat, size, &clientFormat );
size = sizeof( clientFormat );
ExtAudioFileSetProperty( destAudioFile, kExtAudioFileProperty_ClientDataFormat, size, &clientFormat );
// Make a buffer
int bufferSizeInFrames = 8000;
int bufferSize = ( bufferSizeInFrames * sourceFormat.mBytesPerFrame );
UInt8 * buffer = (UInt8 *)malloc( bufferSize );
AudioBufferList bufferList;
bufferList.mNumberBuffers = 1;
bufferList.mBuffers[0].mNumberChannels = clientFormat.mChannelsPerFrame;
bufferList.mBuffers[0].mData = buffer;
bufferList.mBuffers[0].mDataByteSize = ( bufferSize );
while( TRUE )
{
// Try to fill the buffer to capacity.
UInt32 framesRead = bufferSizeInFrames;
ExtAudioFileRead( sourceAudioFile, &framesRead, &bufferList );
// 0 frames read means EOF.
if( framesRead == 0 )
break;
// Write.
ExtAudioFileWrite( destAudioFile, framesRead, &bufferList );
}
free( buffer );
// Close the files.
ExtAudioFileDispose( sourceAudioFile );
ExtAudioFileDispose( destAudioFile );
Answered my own question: I had to pass this problem to my colleague and he got it to work! I never had the chance to analyze my original problem, but I thought I'd post it here for the sake of completeness. The following method is called from within an NSThread. Parameters are set via the threadDictionary, and he created a custom delegate to transmit progress feedback (the following is one block of method implementation):
- (void)encodeToAAC
{
RXAudioEncoderStatusType encoderStatus;
OSStatus result = noErr;
BOOL success = NO;
BOOL cancelled = NO;
UInt32 size;
ExtAudioFileRef sourceAudioFile,destAudioFile;
AudioStreamBasicDescription sourceFormat,outputFormat, clientFormat;
SInt64 totalFrames;
unsigned long long encodedBytes, totalBytes;
int bufferSizeInFrames, bufferSize;
UInt8 * buffer;
AudioBufferList bufferList;
NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
NSFileManager * fileManager = [[[NSFileManager alloc] init] autorelease];
NSMutableDictionary * threadDict = [[NSThread currentThread] threadDictionary];
NSObject<RXAudioEncodingDelegate> * delegate = (NSObject<RXAudioEncodingDelegate> *)[threadDict objectForKey:@"Delegate"];
NSString *sourcePath = (NSString *)[threadDict objectForKey:@"SourcePath"];
NSString *destPath = (NSString *)[threadDict objectForKey:@"DestinationPath"];
NSURL * sourceURL = [NSURL fileURLWithPath:sourcePath];
NSURL * destURL = [NSURL fileURLWithPath:destPath];
// Open a source audio file.
result = ExtAudioFileOpenURL( (CFURLRef)sourceURL, &sourceAudioFile );
if( result != noErr )
{
DLog( @"Error in ExtAudioFileOpenURL: %ld", (long)result );
goto bailout;
}
// Get the source data format
size = sizeof( sourceFormat );
result = ExtAudioFileGetProperty( sourceAudioFile, kExtAudioFileProperty_FileDataFormat, &size, &sourceFormat );
if( result != noErr )
{
DLog( @"Error in ExtAudioFileGetProperty: %ld", (long)result );
goto bailout;
}
// Define the output format (AAC).
memset(&outputFormat, 0, sizeof(outputFormat));
outputFormat.mFormatID = kAudioFormatMPEG4AAC;
outputFormat.mSampleRate = 44100;
outputFormat.mFormatFlags = kMPEG4Object_AAC_Main;
outputFormat.mChannelsPerFrame = 2;
outputFormat.mBitsPerChannel = 0;
outputFormat.mBytesPerFrame = 0;
outputFormat.mBytesPerPacket = 0;
outputFormat.mFramesPerPacket = 1024;
// Use AudioFormat API to fill out the rest of the description.
//size = sizeof( outputFormat );
//AudioFormatGetProperty( kAudioFormatProperty_FormatInfo, 0, NULL, &size, &outputFormat);
// Make a destination audio file with this output format.
result = ExtAudioFileCreateWithURL( (CFURLRef)destURL, kAudioFileM4AType, &outputFormat, NULL, kAudioFileFlags_EraseFile, &destAudioFile );
if( result != noErr )
{
DLog( @"Error creating destination file: %ld", (long)result );
goto bailout;
}
// Create the canonical PCM client format.
memset(&clientFormat, 0, sizeof(clientFormat));
clientFormat.mSampleRate = sourceFormat.mSampleRate;
clientFormat.mFormatID = kAudioFormatLinearPCM;
clientFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked; //kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
clientFormat.mChannelsPerFrame = 2;
clientFormat.mBitsPerChannel = 16;
clientFormat.mBytesPerFrame = 4;
clientFormat.mBytesPerPacket = 4;
clientFormat.mFramesPerPacket = 1;
// Set the client format in source and destination file.
size = sizeof( clientFormat );
result = ExtAudioFileSetProperty( sourceAudioFile, kExtAudioFileProperty_ClientDataFormat, size, &clientFormat );
if( result != noErr )
{
DLog( @"Error while setting client format in source file: %ld", (long)result );
goto bailout;
}
size = sizeof( clientFormat );
result = ExtAudioFileSetProperty( destAudioFile, kExtAudioFileProperty_ClientDataFormat, size, &clientFormat );
if( result != noErr )
{
DLog( @"Error while setting client format in destination file: %ld", (long)result );
goto bailout;
}
// Make a buffer
bufferSizeInFrames = 8000;
bufferSize = ( bufferSizeInFrames * sourceFormat.mBytesPerFrame );
buffer = (UInt8 *)malloc( bufferSize );
bufferList.mNumberBuffers = 1;
bufferList.mBuffers[0].mNumberChannels = clientFormat.mChannelsPerFrame;
bufferList.mBuffers[0].mData = buffer;
bufferList.mBuffers[0].mDataByteSize = ( bufferSize );
// Obtain total number of audio frames to encode
size = sizeof( totalFrames );
result = ExtAudioFileGetProperty( sourceAudioFile, kExtAudioFileProperty_FileLengthFrames, &size, &totalFrames );
if( result != noErr )
{
DLog( @"Error in ExtAudioFileGetProperty, could not get kExtAudioFileProperty_FileLengthFrames from sourceFile: %ld", (long)result );
goto bailout;
}
encodedBytes = 0;
totalBytes = totalFrames * sourceFormat.mBytesPerFrame;
[threadDict setValue:[NSValue value:&totalBytes withObjCType:@encode(unsigned long long)] forKey:@"TotalBytes"];
if (delegate != nil)
[self performSelectorOnMainThread:@selector(didStartEncoding) withObject:nil waitUntilDone:NO];
while( TRUE )
{
// Try to fill the buffer to capacity.
UInt32 framesRead = bufferSizeInFrames;
result = ExtAudioFileRead( sourceAudioFile, &framesRead, &bufferList );
if( result != noErr )
{
DLog( @"Error in ExtAudioFileRead: %ld", (long)result );
success = NO;
break;
}
// 0 frames read means EOF.
if( framesRead == 0 ) {
success = YES;
break;
}
// Write.
result = ExtAudioFileWrite( destAudioFile, framesRead, &bufferList );
if( result != noErr )
{
DLog( @"Error in ExtAudioFileWrite: %ld", (long)result );
success = NO;
break;
}
encodedBytes += framesRead * sourceFormat.mBytesPerFrame;
if (delegate != nil)
[self performSelectorOnMainThread:@selector(didEncodeBytes:) withObject:[NSValue value:&encodedBytes withObjCType:@encode(unsigned long long)] waitUntilDone:NO];
if ([[NSThread currentThread] isCancelled]) {
cancelled = YES;
DLog( @"Encoding was cancelled." );
success = NO;
break;
}
}
free( buffer );
// Close the files.
ExtAudioFileDispose( sourceAudioFile );
ExtAudioFileDispose( destAudioFile );
bailout:
encoderStatus.result = result;
[threadDict setValue:[NSValue value:&encoderStatus withObjCType:@encode(RXAudioEncoderStatusType)] forKey:@"EncodingError"];
// Report to the delegate if one exists
if (delegate != nil)
if (success)
[self performSelectorOnMainThread:@selector(didEncodeFile) withObject:nil waitUntilDone:YES];
else if (cancelled)
[self performSelectorOnMainThread:@selector(encodingCancelled) withObject:nil waitUntilDone:YES];
else
[self performSelectorOnMainThread:@selector(failedToEncodeFile) withObject:nil waitUntilDone:YES];
// Clear the partially encoded file if encoding failed or is cancelled midway
if ((cancelled || !success) && [fileManager fileExistsAtPath:destPath])
[fileManager removeItemAtURL:destURL error:NULL];
[threadDict setValue:[NSNumber numberWithBool:NO] forKey:@"isEncoding"];
[pool release];
}
Are you sure the sample rates match? Can you print the values for clientFormat and outputFormat at the point you’re getting the error? Otherwise I think you might need an AudioConverter.
I tried out the code in Sebastian's answer and while it worked for uncompressed files (aif, wav, caf), it didn't for a lossy compressed file (mp3). I also had an error code of -50, but in ExtAudioFileRead rather than ExtAudioFileSetProperty. From this question I learned that this error signifies a problem with the function parameters. Turns out the buffer for reading the audio file had a size of 0 bytes, a result of this line:
int bufferSize = ( bufferSizeInFrames * sourceFormat.mBytesPerFrame );
Switching it to use the bytes per frame from clientFormat instead (sourceFormat's value was 0) worked for me:
int bufferSize = ( bufferSizeInFrames * clientFormat.mBytesPerFrame );
This line was also in the question code, but I don't think that was the problem (but I had too much text for a comment).

How to get AVFrame(ffmpeg) from NSImage/UIImage

I'd like to convert an NSImage/UIImage to an AVFrame (ffmpeg).
I found some example code:
http://lists.mplayerhq.hu/pipermail/libav-user/2010-April/004550.html
but this code doesn't work.
I tried another approach.
AVFrame *frame = avcodec_alloc_frame();
int numBytes = avpicture_get_size(PIX_FMT_YUV420P, outputWidth, outputHeight);
uint8_t *buffer = (uint8_t *)av_malloc(numBytes*sizeof(uint8_t));
avpicture_fill((AVPicture *)frame, buffer, PIX_FMT_YUV420P, outputWidth, outputHeight);
//UIImage *image = … something … ;
NSImage *image = … something … ;
//CGImageRef newCgImage = image.CGImage;
CGImageRef newCgImage = [image CGImageForProposedRect:nil context:nil hints:nil];
//NSBitmapImageRep* bm = [NSBitmapImageRep imageRepWithData:[image TIFFRepresentation]];
//CGImageRef newCgImage = [bm CGImage];
size_t w = CGImageGetWidth(newCgImage);
size_t h = CGImageGetHeight(newCgImage); // was CGImageGetHeight(cgImage): undefined variable
CGDataProviderRef dataProvider = CGImageGetDataProvider(newCgImage);
CFDataRef bitmapData = CGDataProviderCopyData(dataProvider);
uint8_t *imgData = (uint8_t *)CFDataGetBytePtr(bitmapData); // renamed: 'buffer' already names the AVFrame's data above
frame->linesize[0] = w;
int y, x;
for (y = 0; y < h; y++) {
for (x = 0; x < w; x++) {
int z = y * w + x;
frame->data[0][z] = imgData[z];
}
}
But this AVFrame gives me a green picture.
Please let me know how I can get it working.
Thanks.
The following is additional.
I tried again, paying attention to the color format.
I found an example to convert RGB to YUV:
How to perform RGB->YUV conversion in C/C++?
The new code is like this. But it still doesn't work…
#import <Foundation/Foundation.h>
#import <AppKit/AppKit.h>
#import <libavutil/avstring.h>
#import <libavcodec/avcodec.h>
#import <libavformat/avformat.h>
#import <libswscale/swscale.h>
int main(int argc, char *argv[]) {
NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
int i;
int outputWidth = 480; // must match the size of the input images
int outputHeight = 360; // must match the size of the input images
av_register_all();
AVOutputFormat *format = av_guess_format("mp4", NULL, NULL);
if(!format) return -1;
AVFormatContext *outFormatCtx = avformat_alloc_context();
if(!outFormatCtx) return -1;
outFormatCtx->oformat = format;
av_strlcpy(outFormatCtx->filename, "test.mov", sizeof(outFormatCtx->filename));
AVStream *vstream = av_new_stream(outFormatCtx, 0);
if(!vstream) return -1;
enum CodecID codec_id = av_guess_codec(outFormatCtx->oformat,
NULL,
outFormatCtx->filename,
NULL, CODEC_TYPE_VIDEO);
AVCodec *ovCodec = avcodec_find_encoder(codec_id);
if(!ovCodec) return -1;
AVCodecContext *ovCodecCtx = vstream->codec;
ovCodecCtx->codec_id = ovCodec->id;
ovCodecCtx->codec_type = CODEC_TYPE_VIDEO;
ovCodecCtx->width = outputWidth;
ovCodecCtx->height = outputHeight;
ovCodecCtx->pix_fmt = PIX_FMT_NONE;
if(ovCodec && ovCodec->pix_fmts){
const enum PixelFormat *p = ovCodec->pix_fmts;
while(*p++ != -1){
if(*p == ovCodecCtx->pix_fmt) break;
}
if(*p == -1) ovCodecCtx->pix_fmt = ovCodec->pix_fmts[0];
}
ovCodecCtx->time_base.num = 1;
ovCodecCtx->time_base.den = 30;
if(format->flags & AVFMT_GLOBALHEADER)
ovCodecCtx->flags |= CODEC_FLAG_GLOBAL_HEADER;
if(avcodec_open(ovCodecCtx, ovCodec) != 0) return -1;
if (! ( format->flags & AVFMT_NOFILE )) {
if(url_fopen(&outFormatCtx->pb, outFormatCtx->filename, URL_WRONLY) < 0) return -1;
}
av_write_header(outFormatCtx);
int buf_size = ovCodecCtx->width * ovCodecCtx->height * 4;
uint8_t *buf = av_malloc(buf_size);
AVFrame *buffer_frame = avcodec_alloc_frame();
if(!buffer_frame) return -1;
AVFrame *frame = avcodec_alloc_frame();
if(!frame) return -1;
int numBytes = avpicture_get_size(PIX_FMT_YUV420P, outputWidth, outputHeight);
uint8_t *buffer = (uint8_t *)av_malloc(numBytes*sizeof(uint8_t));
avpicture_fill((AVPicture *)frame, buffer, PIX_FMT_YUV420P, outputWidth, outputHeight);
for(i=1;i<argc;i++){
NSAutoreleasePool *innerPool = [[NSAutoreleasePool alloc] init];
NSImage *image = [[NSImage alloc] initWithContentsOfFile:[NSString stringWithCString: argv[i] encoding: NSUTF8StringEncoding]];
CGImageRef imageRef = [image CGImageForProposedRect:nil context:nil hints:nil];
size_t w = CGImageGetWidth(imageRef);
size_t h = CGImageGetHeight(imageRef);
size_t bytesPerRow = CGImageGetBytesPerRow(imageRef);
CGDataProviderRef dataProvider = CGImageGetDataProvider(imageRef);
CFDataRef bitmapData = CGDataProviderCopyData(dataProvider);
uint8_t *buff = (uint8_t *)CFDataGetBytePtr(bitmapData);
uint8_t R,G,B,Y,U,V;
int x,y;
for(y=0;y<h;y++){
for(x=0;x<w;x++){
uint8_t *tmp = buff + y * bytesPerRow + x * 4;
R = *(tmp + 3);
G = *(tmp + 2);
B = *(tmp + 1);
Y = (0.257 * R) + (0.504 * G) + (0.098 * B) + 16;
U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128;
V = (0.439 * R) - (0.368 * G) - (0.071 * B) + 128;
//printf("y:%d x:%d R:%d,G:%d,B:%d Y:%d,U:%d,V:%d \n",y,x,R,G,B,Y,U,V);
frame->data[0][y*frame->linesize[0]+x]= Y;
//frame->data[1][y*frame->linesize[1]+x]= U; // uncommenting this gives a "Bus error"
//frame->data[2][y*frame->linesize[2]+x]= V; // uncommenting this gives a "Bus error"
}
}
CGImageRelease(imageRef);
CFRelease(bitmapData);
int out_size = avcodec_encode_video (ovCodecCtx, buf, buf_size, frame);
AVPacket outPacket;
av_init_packet(&outPacket);
outPacket.stream_index= vstream->index;
outPacket.data= buf;
outPacket.size= out_size;
//outPacket.pts = ?;
//outPacket.dts = ?;
if(ovCodecCtx->coded_frame->key_frame)
outPacket.flags |= PKT_FLAG_KEY;
if(av_interleaved_write_frame(outFormatCtx, &outPacket) != 0) return -1;
[image release];
[innerPool release];
}
av_write_trailer(outFormatCtx);
if (! ( format->flags & AVFMT_NOFILE ))
if(url_fclose(outFormatCtx->pb) < 0) return -1;
avcodec_close(vstream->codec);
for(i = 0; i < outFormatCtx->nb_streams; i++) {
av_freep(&outFormatCtx->streams[i]->codec);
av_freep(&outFormatCtx->streams[i]);
}
av_freep(&outFormatCtx);
av_free(buffer);
av_free(frame);
av_free(buffer_frame);
[pool release];
return 0;
}
and the makefile is like this:
CC = /usr/bin/gcc
CFLAGS = -O4 -Wall -I/usr/local/include
LDFLAGS =
LDLIBS = -L/usr/local/bin -lavutil -lavformat -lavcodec -lswscale
FRAMEWORK = -framework Foundation -framework AppKit #-framework CoreGraphics
OBJS = test.o
test: $(OBJS)
$(CC) -o $@ $(LDFLAGS) $(OBJS) $(LDLIBS) $(FRAMEWORK) -lz -lbz2 -arch x86_64
Please somebody help me.
There is a colorspace mismatch between the data of the CGImage and the destination AVFrame. To fix it, you need to convert the CGImage data (probably ARGB) into the YUV420 format (FFmpeg has a built-in format converter, libswscale). You can get information on the layout of a CGImage with the CGImageGetBitsPerComponent, CGImageGetBitsPerPixel and CGImageGetBytesPerRow functions. The green picture is the typical symptom of filling only the Y plane while leaving the U and V planes at zero.
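For what it's worth, the RGB→YUV coefficients in the question are the standard BT.601 "studio swing" ones, and they can be sanity-checked against known values: white should map to (Y, U, V) = (235, 128, 128) and black to (16, 128, 128). A small C sketch of the per-pixel conversion (for production use, libswscale does this whole-frame and much faster):

```c
#include <assert.h>
#include <stdint.h>

/* BT.601 "studio swing" RGB -> YUV, the same coefficients as in the
   question's loop, with +0.5 added for round-to-nearest. For valid RGB
   input every result stays in [16, 240], so the uint8_t cast is safe. */
static void rgb_to_yuv(uint8_t r, uint8_t g, uint8_t b,
                       uint8_t *y, uint8_t *u, uint8_t *v) {
    *y = (uint8_t)( 0.257 * r + 0.504 * g + 0.098 * b +  16.5);
    *u = (uint8_t)(-0.148 * r - 0.291 * g + 0.439 * b + 128.5);
    *v = (uint8_t)( 0.439 * r - 0.368 * g - 0.071 * b + 128.5);
}
```

Note that a neutral gray always yields U = V = 128, which is why planes left at zero (not 128) decode with a heavy color cast.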