How to read an audio file in iOS and get the samples? - iphone

Here is my code. I copied parts of it from the iOS Developer Library.
- (void)getSamplesWithFileName:(NSString *)fileName
storeToPointer:(SInt16*)pointer
withNumberOfSamples:(UInt32*)numberOfSamples
{
NSFileManager* fileManager = [NSFileManager defaultManager];
NSURL* url = [[fileManager URLsForDirectory:NSCachesDirectory inDomains:NSUserDomainMask] lastObject];
NSLog(@"%@", url);
NSString* directory = [url path];
NSString* nsFilePath = [directory stringByAppendingPathComponent:fileName];
const char* filePath = [nsFilePath cStringUsingEncoding:[NSString defaultCStringEncoding]];
CFURLRef audioFileURL =
CFURLCreateFromFileSystemRepresentation (
NULL,
(const UInt8 *) filePath,
strlen (filePath),
false
);
struct AQPlayerState aqData;
aqData.mDataFormat.mFormatID = kAudioFormatLinearPCM;
aqData.mDataFormat.mSampleRate = 8000.0;
aqData.mDataFormat.mChannelsPerFrame = 1;
aqData.mDataFormat.mBitsPerChannel = 16;
aqData.mDataFormat.mBytesPerPacket =
aqData.mDataFormat.mBytesPerFrame =
aqData.mDataFormat.mChannelsPerFrame * sizeof (SInt16);
aqData.mDataFormat.mFramesPerPacket = 1;
OSStatus result =
AudioFileOpenURL (
audioFileURL,
kAudioFileReadPermission, // fsRdPerm
0,
&aqData.mAudioFile
);
NSLog(@"Play back open status:%ld", (long)result);
CFRelease (audioFileURL);
UInt32 dataFormatSize = sizeof (aqData.mDataFormat);
AudioFileGetProperty (
aqData.mAudioFile,
kAudioFilePropertyDataFormat,
&dataFormatSize,
&aqData.mDataFormat
);
UInt32 numBytesReadFromFile;
UInt32 numPackets = 100000;
// pointer = malloc(numPackets * sizeof(SInt16));
AudioFileReadPackets(aqData.mAudioFile, false, &numBytesReadFromFile, NULL, 0, &numPackets, pointer);
printf("%ld", numBytesReadFromFile);
*numberOfSamples = numBytesReadFromFile;
}
But it seems that I get the wrong data. I tested it with a recording of a voice saying 'Ahhhhhh' and got a really high zero-crossing rate. How exactly should I read this audio file?

Use Extended Audio File Services
Open files with ExtAudioFileOpenURL, read them into buffers with ExtAudioFileRead. Works with all formats Core Audio supports.
I can recommend the book Learning Core Audio: A Hands-On Guide to Audio Programming for Mac and iOS if you want to get into Core Audio programming.
Edit: The docs at the first link have sample code that will probably help you.
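One detail in the question's code may be worth checking regardless of which API is used: AudioFileReadPackets reports a byte count, but the code hands that number back as `numberOfSamples`. With 16-bit samples the two differ by a factor of two, so an analysis loop that walks `numberOfSamples` entries would run past the valid data. The following is a minimal C sketch, not taken from the original posts; the function names `sample_count_from_bytes` and `zero_crossing_rate` are made up for illustration, and the data is assumed to be 16-bit linear PCM.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 16-bit samples contained in a buffer of num_bytes bytes. */
size_t sample_count_from_bytes(size_t num_bytes) {
    return num_bytes / sizeof(int16_t);
}

/* Fraction of adjacent sample pairs whose signs differ (0.0 to 1.0).
   Zero is treated as non-negative. Returns 0.0 for fewer than 2 samples. */
double zero_crossing_rate(const int16_t *samples, size_t count) {
    if (samples == NULL || count < 2) return 0.0;
    size_t crossings = 0;
    for (size_t i = 1; i < count; i++) {
        int prev_negative = samples[i - 1] < 0;
        int curr_negative = samples[i] < 0;
        if (prev_negative != curr_negative) crossings++;
    }
    return (double)crossings / (double)(count - 1);
}
```

A sustained vowel should give a low rate; a value close to what random noise produces suggests the buffer holds something other than decoded PCM (for example compressed packets, or memory past the end of the read).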

Related

Record and play audio simultaneously in iOS

I am trying to play back recorded audio while the recording is still in progress. Currently I use AVAudioRecorder for recording and AVAudioPlayer for playback.
When I try to play the content during recording, nothing plays. Below is pseudocode for what I am doing.
If I do the same thing after stopping the recording, everything works fine.
AVAudioRecorder *recorder; //Initializing the recorder properly.
[recorder record];
NSError *error=nil;
NSURL *recordingPathUrl; //Contains the recording path.
AVAudioPlayer *audioPlayer = [[AVAudioPlayer alloc] initWithContentsOfURL:recordingPathUrl
error:&error];
[audioPlayer prepareToPlay];
[audioPlayer play];
Could anybody please share some thoughts or ideas?
This is achievable. Download the example project from this link:
https://code.google.com/p/ios-coreaudio-example/downloads/detail?name=Aruts.zip&can=2&q=
The project at that link plays sound through the speaker but does not record it. I have implemented the record functionality as well; the full code is below.
In the .h file:
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
#ifndef max
#define max( a, b ) ( ((a) > (b)) ? (a) : (b) )
#endif
#ifndef min
#define min( a, b ) ( ((a) < (b)) ? (a) : (b) )
#endif
@interface IosAudioController : NSObject {
AudioComponentInstance audioUnit;
AudioBuffer tempBuffer; // this will hold the latest data from the microphone
ExtAudioFileRef mAudioFileRef;
}
@property (readonly) ExtAudioFileRef mAudioFileRef;
@property (readonly) AudioComponentInstance audioUnit;
@property (readonly) AudioBuffer tempBuffer;
- (void) start;
- (void) stop;
- (void) processAudio: (AudioBufferList*) bufferList;
@end
// setup a global iosAudio variable, accessible everywhere
extern IosAudioController* iosAudio;
In the .m file:
#import "IosAudioController.h"
#import <AudioToolbox/AudioToolbox.h>
#import <AVFoundation/AVFoundation.h>
#define kOutputBus 0
#define kInputBus 1
IosAudioController* iosAudio;
void checkStatus(int status){
if (status) {
printf("Status not 0! %d\n", status);
// exit(1);
}
}
static void printAudioUnitRenderActionFlags(AudioUnitRenderActionFlags * ioActionFlags)
{
if (*ioActionFlags == 0) {
printf("AudioUnitRenderActionFlags(%lu) ", *ioActionFlags);
return;
}
printf("AudioUnitRenderActionFlags(%lu): ", *ioActionFlags);
if (*ioActionFlags & kAudioUnitRenderAction_PreRender) printf("kAudioUnitRenderAction_PreRender ");
if (*ioActionFlags & kAudioUnitRenderAction_PostRender) printf("kAudioUnitRenderAction_PostRender ");
if (*ioActionFlags & kAudioUnitRenderAction_OutputIsSilence) printf("kAudioUnitRenderAction_OutputIsSilence ");
if (*ioActionFlags & kAudioOfflineUnitRenderAction_Preflight) printf("kAudioOfflineUnitRenderAction_Preflight ");
if (*ioActionFlags & kAudioOfflineUnitRenderAction_Render) printf("kAudioOfflineUnitRenderAction_Render ");
if (*ioActionFlags & kAudioOfflineUnitRenderAction_Complete) printf("kAudioOfflineUnitRenderAction_Complete ");
if (*ioActionFlags & kAudioUnitRenderAction_PostRenderError) printf("kAudioUnitRenderAction_PostRenderError ");
if (*ioActionFlags & kAudioUnitRenderAction_DoNotCheckRenderArgs) printf("kAudioUnitRenderAction_DoNotCheckRenderArgs ");
}
/**
This callback is called when new audio data from the microphone is
available.
*/
static OSStatus recordingCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData) {
double timeInSeconds = inTimeStamp->mSampleTime / 44100.00;
printf("\n%fs inBusNumber: %lu inNumberFrames: %lu ", timeInSeconds, inBusNumber, inNumberFrames);
printAudioUnitRenderActionFlags(ioActionFlags);
// Because of the way our audio format (setup below) is chosen:
// we only need 1 buffer, since it is mono
// Samples are 16 bits = 2 bytes.
// 1 frame includes only 1 sample
AudioBuffer buffer;
buffer.mNumberChannels = 1;
buffer.mDataByteSize = inNumberFrames * 2;
buffer.mData = malloc( inNumberFrames * 2 );
// Put buffer in a AudioBufferList
AudioBufferList bufferList;
SInt16 samples[inNumberFrames]; // A large enough size to not have to worry about buffer overrun
memset (&samples, 0, sizeof (samples));
bufferList.mNumberBuffers = 1;
bufferList.mBuffers[0] = buffer;
// Then:
// Obtain recorded samples
OSStatus status;
status = AudioUnitRender([iosAudio audioUnit],
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames,
&bufferList);
checkStatus(status);
// Now, we have the samples we just read sitting in buffers in bufferList
// Process the new data
[iosAudio processAudio:&bufferList];
// Queue the captured samples for asynchronous writing to the recording file
ExtAudioFileWriteAsync([iosAudio mAudioFileRef], inNumberFrames, &bufferList);
// release the malloc'ed data in the buffer we created earlier
free(bufferList.mBuffers[0].mData);
return noErr;
}
/**
This callback is called when the audioUnit needs new data to play through the
speakers. If you don't have any, just don't write anything in the buffers
*/
static OSStatus playbackCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData) {
// Notes: ioData contains buffers (may be more than one!)
// Fill them up as much as you can. Remember to set the size value in each buffer to match how
// much data is in the buffer.
for (int i=0; i < ioData->mNumberBuffers; i++) { // in practice we will only ever have 1 buffer, since audio format is mono
AudioBuffer *buffer = &ioData->mBuffers[i]; // use a pointer so the size update below reaches ioData instead of a local copy
// NSLog(@" Buffer %d has %d channels and wants %d bytes of data.", i, buffer->mNumberChannels, buffer->mDataByteSize);
// copy temporary buffer data to output buffer
UInt32 size = min(buffer->mDataByteSize, [iosAudio tempBuffer].mDataByteSize); // don't copy more data than we have, or than fits
memcpy(buffer->mData, [iosAudio tempBuffer].mData, size);
buffer->mDataByteSize = size; // indicate how much data we wrote in the buffer
// uncomment to hear random noise
/*
UInt16 *frameBuffer = buffer->mData;
for (int j = 0; j < inNumberFrames; j++) {
frameBuffer[j] = rand();
}
*/
}
return noErr;
}
@implementation IosAudioController
@synthesize audioUnit, tempBuffer, mAudioFileRef;
/**
Initialize the audioUnit and allocate our own temporary buffer.
The temporary buffer will hold the latest data coming in from the microphone,
and will be copied to the output when this is requested.
*/
- (id) init {
self = [super init];
OSStatus status;
AVAudioSession *session = [AVAudioSession sharedInstance];
NSLog(@"%f", session.preferredIOBufferDuration);
// Describe audio component
AudioComponentDescription desc;
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_RemoteIO;
desc.componentFlags = 0;
desc.componentFlagsMask = 0;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;
// Get component
AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);
// Get audio units
status = AudioComponentInstanceNew(inputComponent, &audioUnit);
checkStatus(status);
// Enable IO for recording
UInt32 flag = 1;
status = AudioUnitSetProperty(audioUnit,
kAudioOutputUnitProperty_EnableIO,
kAudioUnitScope_Input,
kInputBus,
&flag,
sizeof(flag));
checkStatus(status);
// Enable IO for playback
status = AudioUnitSetProperty(audioUnit,
kAudioOutputUnitProperty_EnableIO,
kAudioUnitScope_Output,
kOutputBus,
&flag,
sizeof(flag));
checkStatus(status);
// Describe format
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 44100.00;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 1;
audioFormat.mBitsPerChannel = 16;
audioFormat.mBytesPerPacket = 2;
audioFormat.mBytesPerFrame = 2;
// Apply format
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
kInputBus,
&audioFormat,
sizeof(audioFormat));
checkStatus(status);
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
kOutputBus,
&audioFormat,
sizeof(audioFormat));
checkStatus(status);
// Set input callback
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = recordingCallback;
callbackStruct.inputProcRefCon = self;
status = AudioUnitSetProperty(audioUnit,
kAudioOutputUnitProperty_SetInputCallback,
kAudioUnitScope_Global,
kInputBus,
&callbackStruct,
sizeof(callbackStruct));
checkStatus(status);
// Set output callback
callbackStruct.inputProc = playbackCallback;
callbackStruct.inputProcRefCon = self;
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Global,
kOutputBus,
&callbackStruct,
sizeof(callbackStruct));
checkStatus(status);
// Disable buffer allocation for the recorder (optional - do this if we want to pass in our own)
flag = 0;
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_ShouldAllocateBuffer,
kAudioUnitScope_Output,
kInputBus,
&flag,
sizeof(flag));
// set preferred buffer size
Float32 audioBufferSize = (0.023220);
UInt32 size = sizeof(audioBufferSize);
status = AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
size, &audioBufferSize);
// Allocate our own buffer (1 channel, 16 bits per sample, thus 16 bits per frame, thus 2 bytes per frame).
// In practice the incoming buffers contain 512 frames; if that changes, processAudio resizes tempBuffer.
tempBuffer.mNumberChannels = 1;
tempBuffer.mDataByteSize = 512 * 2;
tempBuffer.mData = malloc( 512 * 2 );
NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectory = [paths objectAtIndex:0];
NSString *destinationFilePath = [[NSString alloc] initWithFormat: @"%@/output.caf", documentsDirectory];
NSLog(@">>> %@\n", destinationFilePath);
CFURLRef destinationURL = CFURLCreateWithFileSystemPath(kCFAllocatorDefault, ( CFStringRef)destinationFilePath, kCFURLPOSIXPathStyle, false);
OSStatus setupErr = ExtAudioFileCreateWithURL(destinationURL, kAudioFileCAFType, &audioFormat, NULL, kAudioFileFlags_EraseFile, &mAudioFileRef);
CFRelease(destinationURL);
NSAssert(setupErr == noErr, @"Couldn't create file for writing");
setupErr = ExtAudioFileSetProperty(mAudioFileRef, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), &audioFormat);
NSAssert(setupErr == noErr, @"Couldn't create file for format");
setupErr = ExtAudioFileWriteAsync(mAudioFileRef, 0, NULL);
NSAssert(setupErr == noErr, @"Couldn't initialize write buffers for audio file");
// Initialise
status = AudioUnitInitialize(audioUnit);
checkStatus(status);
// [NSTimer scheduledTimerWithTimeInterval:5 target:self selector:@selector(stopRecording:) userInfo:nil repeats:NO];
return self;
}
/**
Start the audioUnit. This means data will be provided from
the microphone, and requested for feeding to the speakers, by
use of the provided callbacks.
*/
- (void) start {
OSStatus status = AudioOutputUnitStart(audioUnit);
checkStatus(status);
}
/**
Stop the audioUnit
*/
- (void) stop {
OSStatus status = AudioOutputUnitStop(audioUnit);
checkStatus(status);
[self stopRecording:nil];
}
/**
Change this function to decide what is done with incoming
audio data from the microphone.
Right now we copy it to our own temporary buffer.
*/
- (void) processAudio: (AudioBufferList*) bufferList{
AudioBuffer sourceBuffer = bufferList->mBuffers[0];
// fix tempBuffer size if it's the wrong size
if (tempBuffer.mDataByteSize != sourceBuffer.mDataByteSize) {
free(tempBuffer.mData);
tempBuffer.mDataByteSize = sourceBuffer.mDataByteSize;
tempBuffer.mData = malloc(sourceBuffer.mDataByteSize);
}
// copy incoming audio data to temporary buffer
memcpy(tempBuffer.mData, bufferList->mBuffers[0].mData, bufferList->mBuffers[0].mDataByteSize);
}
- (void)stopRecording:(NSTimer*)theTimer
{
printf("\nstopRecording\n");
OSStatus status = ExtAudioFileDispose(mAudioFileRef);
printf("OSStatus(ExtAudioFileDispose): %ld\n", status);
}
/**
Clean up.
*/
- (void) dealloc {
AudioUnitUninitialize(audioUnit);
free(tempBuffer.mData);
[super dealloc]; // must come last under manual reference counting
}
@end
This should help.
Another good approach is to download Audio Touch from https://github.com/tkzic/audiograph and look at the Echo function of that application. It repeats your voice as you speak but does not record audio, so add a recording function to it as described below:
In MixerHostAudio.h:
@property (readwrite) ExtAudioFileRef mRecordFile;
-(void)Record;
-(void)StopRecord;
In MixerHostAudio.m:
// Add these two methods to this class
-(void)Record{
NSString *completeFileNameAndPath = [[NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject] stringByAppendingString:@"/Record.wav"];
//create the url that the recording object needs to reference the file
CFURLRef audioFileURL = CFURLCreateFromFileSystemRepresentation (NULL, (const UInt8 *)[completeFileNameAndPath cStringUsingEncoding:[NSString defaultCStringEncoding]] , strlen([completeFileNameAndPath cStringUsingEncoding:[NSString defaultCStringEncoding]]), false);
AudioStreamBasicDescription dstFormat, clientFormat;
memset(&dstFormat, 0, sizeof(dstFormat));
memset(&clientFormat, 0, sizeof(clientFormat));
AudioFileTypeID fileTypeId = kAudioFileWAVEType;
UInt32 size = sizeof(dstFormat);
dstFormat.mFormatID = kAudioFormatLinearPCM;
// setup the output file format
dstFormat.mSampleRate = 44100.0; // set sample rate
// create a 16-bit 44.1 kHz stereo format
dstFormat.mChannelsPerFrame = 2;
dstFormat.mBitsPerChannel = 16;
dstFormat.mBytesPerPacket = dstFormat.mBytesPerFrame = 4;
dstFormat.mFramesPerPacket = 1;
dstFormat.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger; // little-endian
// get the client format directly from the mixer unit's input bus
UInt32 asbdSize = sizeof (AudioStreamBasicDescription);
AudioUnitGetProperty(mixerUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
0, // input bus
&clientFormat,
&asbdSize);
ExtAudioFileCreateWithURL(audioFileURL, fileTypeId, &dstFormat, NULL, kAudioFileFlags_EraseFile, &mRecordFile);
printf("recording\n");
ExtAudioFileSetProperty(mRecordFile, kExtAudioFileProperty_ClientDataFormat, size, &clientFormat);
//call this once as this will alloc space on the first call
ExtAudioFileWriteAsync(mRecordFile, 0, NULL);
}
-(void)StopRecord{
ExtAudioFileDispose(mRecordFile);
}
// In the micLineInCallback function, add this line last, just before return noErr;
ExtAudioFileWriteAsync([THIS mRecordFile] , inNumberFrames, ioData);
Then call these methods from the - (IBAction) playOrStop: (id) sender method in MixerHostViewController.m.
You'll need to use AudioUnits if you want real-time monitoring of your audio input.
Apple's Audio Unit Hosting Guide
Tutorial on configuring the Remote I/O Audio Unit
The RemoteIO Audio Unit can be used for simultaneous record and play. There are plenty of examples of recording using RemoteIO (aurioTouch) and playing using RemoteIO. Just enable both unit input and unit output, and handle both buffer callbacks. See an example here
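When both callbacks are handled, the recorded samples have to be passed from the input callback to the output callback. The code posted earlier keeps only the most recent buffer in `tempBuffer`; a first-in, first-out queue is a common alternative. The following is a minimal C sketch of such a queue as a fixed-size ring buffer. It is not from the original posts: the names `SampleRing`, `ring_write`, `ring_read` and the capacity are made up for illustration, and it is deliberately not thread-safe, whereas real render callbacks run on a separate audio thread and would need a lock-free or otherwise synchronized variant.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 4096  /* samples; arbitrary size chosen for this sketch */

typedef struct {
    int16_t data[RING_CAPACITY];
    size_t head;   /* next write position */
    size_t tail;   /* next read position */
    size_t count;  /* samples currently stored */
} SampleRing;

void ring_init(SampleRing *r) {
    r->head = r->tail = r->count = 0;
}

/* Store up to n samples; returns how many were actually stored. */
size_t ring_write(SampleRing *r, const int16_t *src, size_t n) {
    size_t written = 0;
    while (written < n && r->count < RING_CAPACITY) {
        r->data[r->head] = src[written++];
        r->head = (r->head + 1) % RING_CAPACITY;
        r->count++;
    }
    return written;
}

/* Fetch up to n samples; pads the remainder of dst with silence (zeros).
   Returns how many real samples were delivered. */
size_t ring_read(SampleRing *r, int16_t *dst, size_t n) {
    size_t got = 0;
    while (got < n && r->count > 0) {
        dst[got++] = r->data[r->tail];
        r->tail = (r->tail + 1) % RING_CAPACITY;
        r->count--;
    }
    for (size_t i = got; i < n; i++) dst[i] = 0;
    return got;
}
```

In this sketch the input callback would call `ring_write` with the captured samples and the output callback would call `ring_read` to fill the playback buffer, so a mismatch in frame counts between the two callbacks yields silence padding rather than stale data.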

Decrease size of converted audio file

I am converting my recorded audio, which is in .m4a format, to .caf format. The settings for the recorded audio are given below:
/* Record settings for recording the audio*/
recordSetting = [[NSDictionary alloc] initWithObjectsAndKeys:[NSNumber numberWithInt:kAudioFormatMPEG4AAC],AVFormatIDKey,
[NSNumber numberWithFloat:44100.0],AVSampleRateKey,
[NSNumber numberWithInt: 2],AVNumberOfChannelsKey,
[NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
nil];
I convert the audio to .caf using this function:
-(NSString *)handleConvertToPCM:(NSURL *)convertUrl
{
[self performSelectorOnMainThread:@selector(showActivity) withObject:nil waitUntilDone:NO];
DEBUG_LOG(@"DEBUGGING");
DEBUG_LOG(@"handleConvertToPCM");
// open an ExtAudioFile
NSLog (@"opening %@", convertUrl);
ExtAudioFileRef inputFile;
CheckResult (ExtAudioFileOpenURL((CFURLRef)convertUrl, &inputFile),
"ExtAudioFileOpenURL failed");
// prepare to convert to a plain ol' PCM format
AudioStreamBasicDescription requiredPCMFormat;
requiredPCMFormat.mSampleRate = 44100; // todo: or use source rate?
requiredPCMFormat.mFormatID = kAudioFormatLinearPCM ;
requiredPCMFormat.mFormatFlags = kAudioFormatFlagsCanonical;
requiredPCMFormat.mChannelsPerFrame = 2;
requiredPCMFormat.mFramesPerPacket = 1;
requiredPCMFormat.mBitsPerChannel = 16;
requiredPCMFormat.mBytesPerPacket = 4;
requiredPCMFormat.mBytesPerFrame = 4;
CheckResult (ExtAudioFileSetProperty(inputFile, kExtAudioFileProperty_ClientDataFormat,
sizeof (requiredPCMFormat), &requiredPCMFormat),
"ExtAudioFileSetProperty failed");
// allocate a big buffer. size can be arbitrary for ExtAudioFile.
UInt32 outputBufferSize = 0x10000;
void* ioBuf = malloc (outputBufferSize);
UInt32 sizePerPacket = requiredPCMFormat.mBytesPerPacket;
UInt32 packetsPerBuffer = outputBufferSize / sizePerPacket;
// set up output file
self.outputPath = [NSString stringWithFormat:@"%@/export-pcm.caf",DOCUMENTS_FOLDER];
self.outputURL = [NSURL fileURLWithPath:self.outputPath];
DEBUG_LOG(@"creating output file %@", self.outputURL);
AudioFileID outputFile;
CheckResult(AudioFileCreateWithURL((CFURLRef)outputURL,
kAudioFileCAFType,
&requiredPCMFormat,
kAudioFileFlags_EraseFile,
&outputFile),
"AudioFileCreateWithURL failed");
// start convertin'
UInt32 outputFilePacketPosition = 0; //in bytes
while (true)
{
// wrap the destination buffer in an AudioBufferList
AudioBufferList convertedData;
convertedData.mNumberBuffers = 1;
convertedData.mBuffers[0].mNumberChannels = requiredPCMFormat.mChannelsPerFrame;
convertedData.mBuffers[0].mDataByteSize = outputBufferSize;
convertedData.mBuffers[0].mData = ioBuf;
UInt32 frameCount = packetsPerBuffer;
// read from the extaudiofile
CheckResult (ExtAudioFileRead(inputFile,
&frameCount,
&convertedData),
"Couldn't read from input file");
if (frameCount == 0)
{
printf ("done reading from file");
break;
}
// write the converted data to the output file
CheckResult (AudioFileWritePackets(outputFile,
false,
frameCount,
NULL,
outputFilePacketPosition / requiredPCMFormat.mBytesPerPacket,
&frameCount,
convertedData.mBuffers[0].mData),
"Couldn't write packets to file");
DEBUG_LOG(@"Converted %u bytes", (unsigned int)outputFilePacketPosition);
// advance the output file write location
outputFilePacketPosition += (frameCount * requiredPCMFormat.mBytesPerPacket);
}
// clean up
ExtAudioFileDispose(inputFile);
AudioFileClose(outputFile);
return(self.outputPath);
}
My problem is that the converted file is very large compared to the file given for conversion. Is there any way to decrease the size by changing the conversion settings?
I tried compressing the resulting file, but compression takes a long time, so I would like a way to reduce the size as part of the conversion.
Decompressing a highly compressed audio file almost always results in a much larger result file, unless you re-compress using an even lossier compression format.
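The size difference follows directly from the format settings. The following is a minimal C sketch of the arithmetic, not from the original posts; the function names are made up, and the 128 kbit/s AAC bitrate used in the note below is an assumed figure for comparison, not something stated in the question.

```c
#include <stdint.h>

/* Bytes of uncompressed linear PCM for the given format and duration. */
uint64_t pcm_bytes(uint32_t sample_rate, uint32_t channels,
                   uint32_t bits_per_channel, uint32_t seconds) {
    return (uint64_t)sample_rate * channels * (bits_per_channel / 8) * seconds;
}

/* Bytes of a constant-bitrate compressed stream (bitrate in bits per second). */
uint64_t cbr_bytes(uint32_t bits_per_second, uint32_t seconds) {
    return (uint64_t)bits_per_second / 8 * seconds;
}
```

For the question's settings (44100 Hz, 2 channels, 16 bits), one minute is `pcm_bytes(44100, 2, 16, 60)` = 10,584,000 bytes, while `cbr_bytes(128000, 60)` = 960,000 bytes, roughly 11 times smaller. If the output must stay PCM, only the format parameters can shrink it: `pcm_bytes(22050, 1, 16, 60)` = 2,646,000 bytes for mono at half the sample rate, at the cost of audio quality.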

How can I save array of samples as audio file in iPhone?

I have a sound as an array of samples.
How can I save it as an audio file?
I have looked through the iPhone Core Audio APIs,
and I understand how to record from the mic and play music,
but I can't find out how to write samples to a file.
Here is a piece of code that works for me. For any more information you should check out the book Core Audio Rough Cuts.
#include "WavGenerator.h"
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
#include "AudioController.h"
#define SAMPLE_RATE 44100
#define DURATION 5.0
#define COUNT_OF(x) ((sizeof(x)/sizeof(0[x])) / ((size_t)(!(sizeof(x) % sizeof(0[x])))))
// #define FILENAME @"newFile.caf"
extern unsigned int global_size_of_instrumental;
extern unsigned int global_size_output;
void createNewWAV (const char *location, int *sample_array){
NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
NSString *filePath = NSTemporaryDirectory();
filePath = [filePath stringByAppendingPathComponent:@"name_of_your_file.wav"];
NSURL *fileURL = [NSURL fileURLWithPath:filePath];
AudioStreamBasicDescription asbd;
memset(&asbd,0, sizeof(asbd));
asbd.mSampleRate = SAMPLE_RATE;
asbd.mFormatID = kAudioFormatLinearPCM;
asbd.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
// asbd.mFormatFlags = kAudioFormatFlagIsBigEndian;
asbd.mBitsPerChannel = 16;
asbd.mChannelsPerFrame = 1;
asbd.mFramesPerPacket = 1;
asbd.mBytesPerFrame = 2;
asbd.mBytesPerPacket = 2;
AudioFileID audioFile;
OSStatus audioErr = noErr;
audioErr = AudioFileCreateWithURL((CFURLRef)fileURL,
kAudioFileWAVEType,
&asbd,
kAudioFileFlags_EraseFile,
&audioFile);
assert (audioErr == noErr);
printf("WAV GENERATOR --- global_size_output %d \n", global_size_output);
int size_of_output = global_size_output;
SInt16 *the_samples = (SInt16 *) malloc(global_size_of_instrumental*size_of_output*sizeof(SInt16));
for (int i=0; i< global_size_of_instrumental*size_of_output; i++)
{
the_samples[i] = sample_array[i];
}
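UInt32 numSamples = global_size_of_instrumental*size_of_output;
UInt32 bytesToWrite = numSamples * sizeof(SInt16); // AudioFileWriteBytes counts bytes, not samples
audioErr = AudioFileWriteBytes(audioFile, false, 0, &bytesToWrite, the_samples);
assert(audioErr == noErr);
audioErr = AudioFileClose(audioFile);
assert(audioErr == noErr);
free(the_samples); // release the temporary 16-bit copy
[pool drain];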
}
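To make the byte layout explicit, here is a minimal C sketch that writes the same kind of file (16-bit mono PCM WAV) by hand with stdio instead of AudioFileCreateWithURL. This is a different technique from the code above, shown only for illustration; the function names `clamp_to_int16` and `write_wav_mono16` are made up, and the input is assumed to be an array of `int` samples as in `createNewWAV`. Unlike the code above, it clamps values that do not fit into 16 bits.

```c
#include <stdint.h>
#include <stdio.h>

/* Write v as little-endian bytes, independent of host byte order. */
static int put_le(FILE *f, uint32_t v, int nbytes) {
    for (int i = 0; i < nbytes; i++) {
        if (fputc((int)((v >> (8 * i)) & 0xFF), f) == EOF) return -1;
    }
    return 0;
}

/* Clamp a wide sample into the signed 16-bit range. */
int16_t clamp_to_int16(int v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Write count mono samples as a 16-bit PCM WAV file. Returns 0 on success. */
int write_wav_mono16(const char *path, const int *samples,
                     uint32_t count, uint32_t sample_rate) {
    FILE *f = fopen(path, "wb");
    if (f == NULL) return -1;
    uint32_t data_bytes = count * 2;           /* 2 bytes per sample */
    int err = 0;
    err |= (fwrite("RIFF", 1, 4, f) != 4);
    err |= put_le(f, 36 + data_bytes, 4);      /* RIFF chunk size */
    err |= (fwrite("WAVEfmt ", 1, 8, f) != 8);
    err |= put_le(f, 16, 4);                   /* fmt chunk size */
    err |= put_le(f, 1, 2);                    /* format tag 1 = PCM */
    err |= put_le(f, 1, 2);                    /* channels */
    err |= put_le(f, sample_rate, 4);
    err |= put_le(f, sample_rate * 2, 4);      /* byte rate */
    err |= put_le(f, 2, 2);                    /* block align */
    err |= put_le(f, 16, 2);                   /* bits per sample */
    err |= (fwrite("data", 1, 4, f) != 4);
    err |= put_le(f, data_bytes, 4);           /* payload size in bytes */
    for (uint32_t i = 0; i < count && !err; i++) {
        uint16_t bits = (uint16_t)clamp_to_int16(samples[i]);
        err |= put_le(f, bits, 2);
    }
    if (fclose(f) != 0) err = 1;
    return err ? -1 : 0;
}
```

The resulting file is 44 header bytes plus two bytes per sample, which is the same samples-versus-bytes distinction that matters when calling AudioFileWriteBytes.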
If you download the free version of http://www.dspdimension.com/technology-licensing/dirac2/ you will find functions for reading and writing audio files in the sample source code, though I can't remember which format they use.

Using AVMutableAudioMix to adjust volumes for tracks within asset

I'm applying an AVMutableAudioMix to an asset I've created; the asset generally consists of 3-5 audio tracks (no video). The goal is to add several volume commands throughout the play time, i.e. I'd like to set the volume to 0.1 at 1 second, 0.5 at 2 seconds, then 0.1 or whatever at 3 seconds. For now I'm trying to do this with an AVPlayer, but I will later also use it when exporting the AVSession to a file. The problem is that it only seems to honor the first volume command and ignores all later ones. If the first command sets the volume to 0.1, that becomes the permanent volume for this track for the rest of the asset, even though it looks like you should be able to add any number of these commands, seeing as the "inputParameters" member of AVMutableAudioMix is an NSArray holding a series of AVMutableAudioMixInputParameters. Has anyone figured this out?
Edit: I figured this out partly. I'm able to add several volume changes throughout a certain track, but the timings appear way off and I'm not sure how to fix that. For example, setting the volume to 0.0 at 5 seconds, then 1.0 at 10 seconds, and then back to 0.0 at 15 seconds should make the volume switch off and on promptly at those times, but the results are very unpredictable: sometimes the sound ramps, and sometimes it works (with sudden volume changes, as expected from setVolume). If anyone has gotten the audio mix to work, please provide an example.
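The step behavior the question expects from a series of volume commands can be written down as a small model, which is handy for checking what should be heard at a given time. The following is a minimal C sketch and not AVFoundation code; the type `VolumeCommand` and the function `volume_at` are made up for illustration, and the default volume of 1.0 before the first command is an assumption of this sketch.

```c
#include <stddef.h>

typedef struct {
    double time_seconds;  /* when the command takes effect */
    float volume;         /* 0.0 (silent) to 1.0 (full) */
} VolumeCommand;

/* Volume in effect at time t, given commands sorted by ascending time.
   Before the first command the assumed default volume 1.0 applies. */
float volume_at(const VolumeCommand *cmds, size_t count, double t) {
    float current = 1.0f;
    for (size_t i = 0; i < count; i++) {
        if (cmds[i].time_seconds > t) break;
        current = cmds[i].volume;
    }
    return current;
}
```

With the commands from the example (0.0 at 5 s, 1.0 at 10 s, 0.0 at 15 s), this model gives silence from 5 s to 10 s, full volume from 10 s to 15 s, and silence afterwards, with no ramping in between.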
The code I use to change the track volume is:
AVURLAsset *soundTrackAsset = [[AVURLAsset alloc]initWithURL:trackUrl options:nil];
AVMutableAudioMixInputParameters *audioInputParams = [AVMutableAudioMixInputParameters audioMixInputParameters];
[audioInputParams setVolume:0.5 atTime:kCMTimeZero];
[audioInputParams setTrackID:[[[soundTrackAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0] trackID]];
audioMix = [AVMutableAudioMix audioMix];
audioMix.inputParameters = [NSArray arrayWithObject:audioInputParams];
Don't forget to add the audio mix to your AVAssetExportSession:
exportSession.audioMix = audioMix;
However, I noticed that this does not work with all formats, so if you keep having issues with AVFoundation you can use the following function to change the volume level of a stored file. Be aware that this function can be quite slow.
- (void)scaleAudioFileAmplitude:(NSURL *)theURL withAmpScale:(float)ampScale {
OSStatus err = noErr;
ExtAudioFileRef audiofile;
ExtAudioFileOpenURL((CFURLRef)theURL, &audiofile);
assert(audiofile);
// get some info about the file's format.
AudioStreamBasicDescription fileFormat;
UInt32 size = sizeof(fileFormat);
err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_FileDataFormat, &size, &fileFormat);
// we'll need to know what type of file it is later when we write
AudioFileID aFile;
size = sizeof(aFile);
err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_AudioFile, &size, &aFile);
AudioFileTypeID fileType;
size = sizeof(fileType);
err = AudioFileGetProperty(aFile, kAudioFilePropertyFileFormat, &size, &fileType);
// tell the ExtAudioFile API what format we want samples back in
AudioStreamBasicDescription clientFormat;
bzero(&clientFormat, sizeof(clientFormat));
clientFormat.mChannelsPerFrame = fileFormat.mChannelsPerFrame;
clientFormat.mBytesPerFrame = 4;
clientFormat.mBytesPerPacket = clientFormat.mBytesPerFrame;
clientFormat.mFramesPerPacket = 1;
clientFormat.mBitsPerChannel = 32;
clientFormat.mFormatID = kAudioFormatLinearPCM;
clientFormat.mSampleRate = fileFormat.mSampleRate;
clientFormat.mFormatFlags = kLinearPCMFormatFlagIsFloat | kAudioFormatFlagIsNonInterleaved;
err = ExtAudioFileSetProperty(audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);
// find out how many frames we need to read
SInt64 numFrames = 0;
size = sizeof(numFrames);
err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_FileLengthFrames, &size, &numFrames);
// create the buffers for reading in data
AudioBufferList *bufferList = malloc(sizeof(AudioBufferList) + sizeof(AudioBuffer) * (clientFormat.mChannelsPerFrame - 1));
bufferList->mNumberBuffers = clientFormat.mChannelsPerFrame;
for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
bufferList->mBuffers[ii].mDataByteSize = sizeof(float) * numFrames;
bufferList->mBuffers[ii].mNumberChannels = 1;
bufferList->mBuffers[ii].mData = malloc(bufferList->mBuffers[ii].mDataByteSize);
}
// read in the data
UInt32 rFrames = (UInt32)numFrames;
err = ExtAudioFileRead(audiofile, &rFrames, bufferList);
// close the file
err = ExtAudioFileDispose(audiofile);
// process the audio
for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
float *fBuf = (float *)bufferList->mBuffers[ii].mData;
for (int jj=0; jj < rFrames; ++jj) {
*fBuf = *fBuf * ampScale;
fBuf++;
}
}
// open the file for writing
err = ExtAudioFileCreateWithURL((CFURLRef)theURL, fileType, &fileFormat, NULL, kAudioFileFlags_EraseFile, &audiofile);
// tell the ExtAudioFile API what format we'll be sending samples in
err = ExtAudioFileSetProperty(audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);
// write the data
err = ExtAudioFileWrite(audiofile, rFrames, bufferList);
// close the file
ExtAudioFileDispose(audiofile);
// destroy the buffers
for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
free(bufferList->mBuffers[ii].mData);
}
free(bufferList);
bufferList = NULL;
}
Please also note that you may need to fine-tune the ampScale value depending on where your volume value comes from. The system volume ranges from 0 to 1 and can be obtained by calling AudioSessionGetProperty:
Float32 volume;
UInt32 dataSize = sizeof(Float32);
AudioSessionGetProperty (
kAudioSessionProperty_CurrentHardwareOutputVolume,
&dataSize,
&volume
);
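When choosing ampScale, keep in mind that a factor above 1 can push float samples outside the nominal [-1.0, 1.0] range. The following is a minimal C sketch of the scaling loop with an explicit limit, not taken from the original answer (which multiplies without limiting); the function name `scale_samples` is made up for illustration, and the samples are assumed to be 32-bit floats as in the client format above.

```c
#include <stddef.h>

/* Multiply each sample by amp_scale and limit the result to [-1.0, 1.0].
   Returns how many samples had to be limited. */
size_t scale_samples(float *samples, size_t count, float amp_scale) {
    size_t clipped = 0;
    for (size_t i = 0; i < count; i++) {
        float v = samples[i] * amp_scale;
        if (v > 1.0f) { v = 1.0f; clipped++; }
        else if (v < -1.0f) { v = -1.0f; clipped++; }
        samples[i] = v;
    }
    return clipped;
}
```

A non-zero return value indicates that the chosen scale distorts the signal, which may be a hint to pick a smaller ampScale.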
The ExtAudioFile-based function above no longer works as is because of API changes. It now requires you to set up an audio session category. When setting the export properties I was getting an error code of '?cat' (which NSError prints in decimal).
Here is the code that works now in iOS 5.1. It is incredibly slow too; at a glance I'd say several times slower. It is also memory intensive, since it appears to load the whole file into memory, which generates memory warnings for 10 MB mp3 files.
-(void) scaleAudioFileAmplitude:(NSURL *)theURL withAmpScale:(float) ampScale
{
OSStatus err = noErr;
ExtAudioFileRef audiofile;
ExtAudioFileOpenURL((CFURLRef)theURL, &audiofile);
assert(audiofile);
// get some info about the file's format.
AudioStreamBasicDescription fileFormat;
UInt32 size = sizeof(fileFormat);
err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_FileDataFormat, &size, &fileFormat);
// we'll need to know what type of file it is later when we write
AudioFileID aFile;
size = sizeof(aFile);
err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_AudioFile, &size, &aFile);
AudioFileTypeID fileType;
size = sizeof(fileType);
err = AudioFileGetProperty(aFile, kAudioFilePropertyFileFormat, &size, &fileType);
// tell the ExtAudioFile API what format we want samples back in
AudioStreamBasicDescription clientFormat;
bzero(&clientFormat, sizeof(clientFormat));
clientFormat.mChannelsPerFrame = fileFormat.mChannelsPerFrame;
clientFormat.mBytesPerFrame = 4;
clientFormat.mBytesPerPacket = clientFormat.mBytesPerFrame;
clientFormat.mFramesPerPacket = 1;
clientFormat.mBitsPerChannel = 32;
clientFormat.mFormatID = kAudioFormatLinearPCM;
clientFormat.mSampleRate = fileFormat.mSampleRate;
clientFormat.mFormatFlags = kLinearPCMFormatFlagIsFloat | kAudioFormatFlagIsNonInterleaved;
err = ExtAudioFileSetProperty(audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);
// find out how many frames we need to read
SInt64 numFrames = 0;
size = sizeof(numFrames);
err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_FileLengthFrames, &size, &numFrames);
// create the buffers for reading in data
AudioBufferList *bufferList = malloc(sizeof(AudioBufferList) + sizeof(AudioBuffer) * (clientFormat.mChannelsPerFrame - 1));
bufferList->mNumberBuffers = clientFormat.mChannelsPerFrame;
//printf("bufferList->mNumberBuffers = %lu \n\n", bufferList->mNumberBuffers);
for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
bufferList->mBuffers[ii].mDataByteSize = sizeof(float) * numFrames;
bufferList->mBuffers[ii].mNumberChannels = 1;
bufferList->mBuffers[ii].mData = malloc(bufferList->mBuffers[ii].mDataByteSize);
}
// read in the data
UInt32 rFrames = (UInt32)numFrames;
err = ExtAudioFileRead(audiofile, &rFrames, bufferList);
// close the file
err = ExtAudioFileDispose(audiofile);
// process the audio
for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
float *fBuf = (float *)bufferList->mBuffers[ii].mData;
for (int jj=0; jj < rFrames; ++jj) {
*fBuf = *fBuf * ampScale;
fBuf++;
}
}
// open the file for writing
err = ExtAudioFileCreateWithURL((CFURLRef)theURL, fileType, &fileFormat, NULL, kAudioFileFlags_EraseFile, &audiofile);
NSError *error = NULL;
/*************************** You Need This Now ****************************/
AVAudioSession *session = [AVAudioSession sharedInstance];
[session setCategory:AVAudioSessionCategoryAudioProcessing error:&error];
/************************* End You Need This Now **************************/
// tell the ExtAudioFile API what format we'll be sending samples in
err = ExtAudioFileSetProperty(audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);
error = [NSError errorWithDomain:NSOSStatusErrorDomain
code:err
userInfo:nil];
NSLog(@"Error: %@", [error description]);
// write the data
err = ExtAudioFileWrite(audiofile, rFrames, bufferList);
// close the file
ExtAudioFileDispose(audiofile);
// destroy the buffers
for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
free(bufferList->mBuffers[ii].mData);
}
free(bufferList);
bufferList = NULL;
}
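The memory warnings mentioned above come from reading every frame before processing any of them. Here is a minimal sketch of the alternative in plain C (which also compiles inside an Objective-C file): walk the samples in fixed-size chunks so that only one chunk has to be resident at a time. The names `scale_samples` and `scale_in_chunks` are hypothetical, and the in-memory array stands in for the file. In the real method each pass would presumably map to one ExtAudioFileRead call followed by one ExtAudioFileWrite call to a separate output file; that mapping is an assumption and is not tested here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiply `count` float samples by `amp_scale` in place. */
void scale_samples(float *samples, size_t count, float amp_scale)
{
    for (size_t i = 0; i < count; ++i) {
        samples[i] *= amp_scale;
    }
}

/* Hypothetical driver: process `total` samples in passes of at most `chunk`
 * samples each. Returns the number of passes that were needed.
 * In a file-based version, each pass would read, scale and write one chunk. */
size_t scale_in_chunks(float *samples, size_t total, size_t chunk, float amp_scale)
{
    assert(chunk > 0); /* a zero-sized chunk would never make progress */

    size_t passes = 0;
    for (size_t offset = 0; offset < total; offset += chunk) {
        size_t remaining = total - offset;
        size_t n = remaining < chunk ? remaining : chunk;
        scale_samples(samples + offset, n, amp_scale);
        ++passes;
    }
    return passes;
}
```

For example, scaling 5 samples with a chunk size of 2 takes 3 passes (2 + 2 + 1 samples). The result is the same as scaling the whole buffer at once; only the peak memory requirement changes.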
Thanks for the help provided in this post.
I would just like to add one thing: you should restore the AVAudioSession category to what it was before, or you'll end up not playing anything.
AVAudioSession *session = [AVAudioSession sharedInstance];
NSString *originalSessionCategory = [session category];
[session setCategory:AVAudioSessionCategoryAudioProcessing error:&error];
...
...
// restore category
[session setCategory:originalSessionCategory error:&error];
if(error)
NSLog(@"%@",[error localizedDescription]);
Cheers
To set different volumes for the tracks of a mutable composition, you can use the code below:
self.audioMix = [AVMutableAudioMix audioMix];
AVMutableAudioMixInputParameters *audioInputParams = [AVMutableAudioMixInputParameters audioMixInputParameters];
[audioInputParams setVolume:0.1 atTime:kCMTimeZero];
audioInputParams.trackID = compositionAudioTrack2.trackID;
AVMutableAudioMixInputParameters *audioInputParams1 = [AVMutableAudioMixInputParameters audioMixInputParameters];
[audioInputParams1 setVolume:0.9 atTime:kCMTimeZero];
audioInputParams1.trackID = compositionAudioTrack1.trackID;
AVMutableAudioMixInputParameters *audioInputParams2 = [AVMutableAudioMixInputParameters audioMixInputParameters];
[audioInputParams2 setVolume:0.3 atTime:kCMTimeZero];
audioInputParams2.trackID = compositionAudioTrack.trackID;
self.audioMix.inputParameters =[NSArray arrayWithObjects:audioInputParams,audioInputParams1,audioInputParams2, nil];

Extracting audio channel from Linear PCM

I would like to extract one audio channel from a raw LPCM file, i.e. extract the left and right channels of a stereo LPCM file. The LPCM is 16-bit depth, interleaved, 2 channels, little endian. From what I gather, the byte order is {LeftChannel, RightChannel, LeftChannel, RightChannel, ...}, and since it is 16-bit depth there will be 2 bytes of sample for each channel, right?
So my question is: if I want to extract the left channel, would I take the bytes at addresses 0, 2, 4, 6, ..., n*2, while the right channel would be at 1, 3, 5, ..., (n*2+1)?
Also, after extracting the audio channel, should I set the format of the extracted channel to 16-bit depth, 1 channel?
Thanks in advance
This is the code that I currently use to extract PCM audio with AVAssetReader. It works fine when writing a music file without extracting a channel, so the problem might be caused by the format or something similar.
NSURL *assetURL = [song valueForProperty:MPMediaItemPropertyAssetURL];
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetURL options:nil];
NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
// [NSData dataWithBytes:&channelLayout length:sizeof(AudioChannelLayout)], AVChannelLayoutKey,
[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
[NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
nil];
NSError *assetError = nil;
AVAssetReader *assetReader = [[AVAssetReader assetReaderWithAsset:songAsset
error:&assetError]
retain];
if (assetError) {
NSLog (@"error: %@", assetError);
return;
}
AVAssetReaderOutput *assetReaderOutput = [[AVAssetReaderAudioMixOutput
assetReaderAudioMixOutputWithAudioTracks:songAsset.tracks
audioSettings: outputSettings]
retain];
if (! [assetReader canAddOutput: assetReaderOutput]) {
NSLog (@"can't add reader output... die!");
return;
}
[assetReader addOutput: assetReaderOutput];
NSArray *dirs = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectoryPath = [dirs objectAtIndex:0];
//CODE TO SPLIT STEREO
[self setupAudioWithFormatMono:kAudioFormatLinearPCM];
NSString *splitExportPath = [[documentsDirectoryPath stringByAppendingPathComponent:@"monoleft.caf"] retain];
if ([[NSFileManager defaultManager] fileExistsAtPath:splitExportPath]) {
[[NSFileManager defaultManager] removeItemAtPath:splitExportPath error:nil];
}
AudioFileID mRecordFile;
NSURL *splitExportURL = [NSURL fileURLWithPath:splitExportPath];
OSStatus status = AudioFileCreateWithURL(splitExportURL, kAudioFileCAFType, &_streamFormat, kAudioFileFlags_EraseFile,
&mRecordFile);
NSLog(@"status os %d",status);
[assetReader startReading];
CMSampleBufferRef sampBuffer = [assetReaderOutput copyNextSampleBuffer];
UInt32 countsamp= CMSampleBufferGetNumSamples(sampBuffer);
NSLog(@"number of samples %d",countsamp);
SInt64 countByteBuf = 0;
SInt64 countPacketBuf = 0;
UInt32 numBytesIO = 0;
UInt32 numPacketsIO = 0;
NSMutableData * bufferMono = [NSMutableData new];
while (sampBuffer) {
AudioBufferList audioBufferList;
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampBuffer, NULL, &audioBufferList, sizeof(audioBufferList), NULL, NULL, 0, &blockBuffer);
for (int y=0; y<audioBufferList.mNumberBuffers; y++) {
AudioBuffer audioBuffer = audioBufferList.mBuffers[y];
//frames = audioBuffer.mData;
NSLog(@"the number of channel for buffer number %d is %d",y,audioBuffer.mNumberChannels);
NSLog(@"The buffer size is %d",audioBuffer.mDataByteSize);
//Append mono left to buffer data
for (int i=0; i<audioBuffer.mDataByteSize; i= i+4) {
[bufferMono appendBytes:(audioBuffer.mData+i) length:2];
}
//the number of bytes in the mutable data containing mono audio file
numBytesIO = [bufferMono length];
numPacketsIO = numBytesIO/2;
NSLog(@"numpacketsIO %d",numPacketsIO);
status = AudioFileWritePackets(mRecordFile, NO, numBytesIO, &_packetFormat, countPacketBuf, &numPacketsIO, audioBuffer.mData);
NSLog(@"status for writebyte %d, packets written %d",status,numPacketsIO);
if(numPacketsIO != (numBytesIO/2)){
NSLog(@"Something wrong");
assert(0);
}
countPacketBuf = countPacketBuf + numPacketsIO;
[bufferMono setLength:0];
}
sampBuffer = [assetReaderOutput copyNextSampleBuffer];
countsamp= CMSampleBufferGetNumSamples(sampBuffer);
NSLog(@"number of samples %d",countsamp);
}
AudioFileClose(mRecordFile);
[assetReader cancelReading];
[self performSelectorOnMainThread:@selector(updateCompletedSizeLabel:)
withObject:0
waitUntilDone:NO];
The output format with audiofileservices is as follows:
_streamFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
_streamFormat.mBitsPerChannel = 16;
_streamFormat.mChannelsPerFrame = 1;
_streamFormat.mBytesPerPacket = 2;
_streamFormat.mBytesPerFrame = 2;// (_streamFormat.mBitsPerChannel / 8) * _streamFormat.mChannelsPerFrame;
_streamFormat.mFramesPerPacket = 1;
_streamFormat.mSampleRate = 44100.0;
_packetFormat.mStartOffset = 0;
_packetFormat.mVariableFramesInPacket = 0;
_packetFormat.mDataByteSize = 2;
Sounds almost right - you have a 16 bit depth, so that means each sample will take 2 bytes. That means the left channel data will be in bytes {0,1}, {4,5}, {8,9} and so on. Interleaved means the samples are interleaved, not the bytes.
Other than that I would try it out and see if you have any problems with your code.
"Also, after extracting the audio channel, should I set the format of the extracted channel to 16-bit depth, 1 channel?"
Only one of the two channels is remaining after your extraction, so yes, this is correct.
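To make the byte layout concrete, here is a minimal sketch in plain C, assuming the format described in the question (16-bit signed, little endian, interleaved, 2 channels). Each frame is 4 bytes: bytes 0-1 hold the left sample and bytes 2-3 hold the right sample. The function names `read_le16` and `split_stereo` are hypothetical and not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one little-endian signed 16-bit sample from two bytes. */
int16_t read_le16(const uint8_t *p)
{
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Split interleaved 16-bit stereo data into separate left and right arrays.
 * `byte_count` must be a multiple of 4 (one frame = 2 channels * 2 bytes).
 * `left` and `right` must each have room for byte_count / 4 samples.
 * Returns the number of frames written to each output array. */
size_t split_stereo(const uint8_t *bytes, size_t byte_count,
                    int16_t *left, int16_t *right)
{
    assert(byte_count % 4 == 0);

    size_t frames = byte_count / 4;
    for (size_t f = 0; f < frames; ++f) {
        const uint8_t *frame = bytes + f * 4;
        left[f]  = read_le16(frame);      /* bytes {0,1}, {4,5}, {8,9}, ... */
        right[f] = read_le16(frame + 2);  /* bytes {2,3}, {6,7}, {10,11}, ... */
    }
    return frames;
}
```

Each output array is then plain 16-bit mono audio at the original sample rate, so a format with 1 channel per frame and 2 bytes per frame describes it.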
I had a similar problem where the audio sounded 'slow'. The reason is that you specified an mChannelsPerFrame of 1 while the sound data has two channels. Set it to 2 and playback should speed up. Also, do tell whether the output sounds correct after you do this... :)
I'm trying to split my stereo audio into two mono files (split stereo audio to mono streams on iOS). I've been using your code but can't seem to get it to work. What are the contents of your setupAudioWithFormatMono method?
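The 'slow' effect follows from simple arithmetic: the player derives the frame count from the byte count and the declared channel count, so the same bytes declared as mono contain twice as many frames as when declared as stereo, and take twice as long to play. Here is a small sketch in plain C, assuming packed linear PCM; `pcm_duration_seconds` is a hypothetical helper, not a Core Audio function.

```c
#include <assert.h>
#include <stdint.h>

/* Playback duration implied by a packed linear PCM format description.
 * The duration depends on how the bytes are declared, not on what they
 * actually contain. */
double pcm_duration_seconds(uint64_t data_bytes, double sample_rate,
                            uint32_t channels, uint32_t bits_per_channel)
{
    assert(sample_rate > 0.0);
    assert(channels > 0);
    assert(bits_per_channel > 0 && bits_per_channel % 8 == 0);

    uint32_t bytes_per_frame = channels * (bits_per_channel / 8);
    uint64_t frames = data_bytes / bytes_per_frame;
    return (double)frames / sample_rate;
}
```

With 176400 bytes of 16-bit audio at 44100 Hz, declaring 2 channels gives 1.0 second, while declaring 1 channel gives 2.0 seconds. If stereo-interleaved bytes end up in a file whose format says 1 channel, playback therefore runs at half speed.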