AVAssetReaderAudioMixOutput with multiple files? - iphone

I have 2 WAV files (mono) I would like to merge.
I want to merge them into a stereo WAV file where the first file will use the left channel while the second file will use the right channel (if possible, I would also like to control the volume and lower the second file a bit).
I've tried to use AVAssetReaderAudioMixOutput, but got the following error:
[AVAssetReaderAudioMixOutput initWithAudioTracks:audioSettings:] tracks must all be part of the same AVAsset
I'm not sure how to merge 2 different files.
AVAssetReaderOutput* reader=[AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:[NSArray arrayWithObjects:
[[AVURLAsset URLAssetWithURL:[NSURL fileURLWithPath:[documentDirectory stringByAppendingPathComponent:@"left.wav"]] options:nil].tracks lastObject],
[[AVURLAsset URLAssetWithURL:[NSURL fileURLWithPath:[documentDirectory stringByAppendingPathComponent:@"right.wav"]] options:nil].tracks lastObject],
nil] audioSettings:nil];

Related

Adding metadata to AAC M4A via AVAssetExportSession

I am creating and storing an AAC-encoded .m4a file using AVAudioRecorder. This produces a playable .m4a file just fine. I want to then use AVAssetExportSession to process the file in order to add metadata to the file. The below code is producing a .m4a file of a similar size (1 KB less than source), but when it plays back, there is just silence.
NSURL* url = [NSURL fileURLWithPath:self.m4aPath];
AVURLAsset* asset = [AVAsset assetWithURL:url];
AVMutableMetadataItem* t = [AVMutableMetadataItem metadataItem];
t.key = AVMetadataCommonKeyTitle;
t.keySpace = AVMetadataKeySpaceCommon;
t.value = @"Unit Test";
NSArray* metadata = [NSArray arrayWithObject:t];
AVAssetExportSession *exportSession = [[AVAssetExportSession alloc] initWithAsset:asset presetName:AVAssetExportPresetAppleM4A];
exportSession.outputURL = [NSURL fileURLWithPath:[[NSFileManager rawRecordingsDirectory] stringByAppendingPathComponent:@"test.m4a"]];
exportSession.outputFileType = AVFileTypeAppleM4A;
exportSession.metadata = metadata;
[exportSession exportAsynchronouslyWithCompletionHandler:^{....}];
One more piece of info: When I look at the source and exported file in the Finder, the source file has the black iTunes icon, while the exported file has the white iTunes icon. Not sure what this means in practice, but hoping it might be helpful. Moreover, double-clicking source adds it to iTunes and starts playback, while double-clicking the exported opens iTunes but does nothing.
I had a similar issue where my output m4a file had the white icon (instead of black) and wouldn't play. Though that was when I was creating the original source file from raw sample data, not when adding metadata to it.
My issue was that I wasn't closing the exported file in my code (I was just terminating the app before calling the close function). Once I called the close function, it started working. You might want to check that.
Also, I found "open with->Quicktime" useful as that gives an error when the file is corrupt, and plays it fine when it isn't. More useful than iTunes silently ignoring the error.

iPhone: Mix two audio files programmatically?

I want to mix two audio files programmatically and play the result. While the first file is playing, I need to mix in a second, short audio file at some point partway through (the time is chosen dynamically), and then save the result as a single audio file on the device. Playing that saved file should produce the first audio with the second one mixed in.
I have gone through many forums but couldn't find a clear explanation of how to achieve this.
Could someone please clarify the following?
In this case, what audio file format should I use? Can I use .avi files?
How do I add the second audio to the first audio file at a dynamically chosen time, programmatically? For example: if the first audio is 2 minutes long, I might need to mix in the second audio file (3 seconds long) at 1 minute, 1.5 minutes, or 55 seconds into the first file. It's dynamic.
How do I save the final output audio file on the device? If I save the audio file programmatically somewhere, can I play it back again?
I don't know how to achieve this. Please suggest your thoughts!
Open each audio file
Read the header info
Get raw uncompressed audio into memory as an array of ints for each file
Starting at the point in file 1's array where you want to mix in file2, loop through, adding file2's int value to file1's, being sure to 'clip' any values above or below the max (this is how you mix audio ... yes, it's that simple). If file2 is longer, you'll have to make the first array long enough to hold the remainder of file2 completely.
Write new header info and then the audio from the array to which you added file2.
If there is compression involved or the files won't fit in memory, you may have to implement a more complex buffering scheme.
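The mixing step above can be sketched in plain C. This is only a minimal illustration, assuming both files have already been decoded to 16-bit signed mono PCM at the same sample rate; the names `clip16` and `mix_into` are made up for this example and are not part of any Apple API.

```c
#include <stdint.h>
#include <stddef.h>

/* Clamp a 32-bit intermediate sum into the signed 16-bit sample range. */
int16_t clip16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Mix `overlay` into `base`, starting at sample index `offset`.
 * `base_len` is the capacity of `base`; anything in `overlay` that would
 * run past it is ignored, so grow `base` first if file 2 runs longer.
 * Returns the number of samples actually mixed. */
size_t mix_into(int16_t *base, size_t base_len,
                const int16_t *overlay, size_t overlay_len, size_t offset) {
    if (offset >= base_len) return 0;
    size_t n = overlay_len;
    if (n > base_len - offset) n = base_len - offset;
    for (size_t i = 0; i < n; i++) {
        /* Add in 32 bits so the sum cannot overflow before clipping. */
        int32_t sum = (int32_t)base[offset + i] + (int32_t)overlay[i];
        base[offset + i] = clip16(sum);
    }
    return n;
}
```

Hard clipping like this is audible when it happens often, so it is best treated as a safety net rather than as level control.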
In this case, what audio file format should I use? Can I use .avi files?
You can choose a compressed or uncompressed format. Common uncompressed formats include WAV and AIFF. CAF can hold both compressed and uncompressed data. .avi is not an option (it is not offered by the OS).
If the files are large and storage space (on disk) is a concern, you may consider AAC saved in a CAF (or simply .m4a). For most applications, 16-bit samples will be enough, and you can also save space, memory and CPU by saving these files at an appropriate sample rate (for reference, CDs are 44.1 kHz).
Since the ExtAudioFile interface abstracts the conversion process, you should not have to change your program to compare the size and speed differences of compressed and uncompressed formats for your distribution (AAC in CAF would be fine for normal applications).
Uncompressed CD-quality audio (16-bit, 44.1 kHz) consumes about 5.3 MB per minute, per channel. So if you have 2 stereo audio files, each 3 minutes long, and a 3-minute stereo destination buffer, holding everything in memory at once would take around 95 MB (3 buffers × 2 channels × 3 minutes × 5.3 MB).
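To check the arithmetic for your own files, the size of an uncompressed PCM buffer is just sample rate × bytes per sample × channels × duration. A small sketch (the helper name `pcm_bytes` is made up for this example):

```c
#include <stdint.h>

/* Bytes needed to hold uncompressed integer PCM of the given shape. */
uint64_t pcm_bytes(uint32_t sample_rate, uint32_t bits_per_sample,
                   uint32_t channels, uint32_t seconds) {
    return (uint64_t)sample_rate * (bits_per_sample / 8) * channels * seconds;
}
```

For example, `pcm_bytes(44100, 16, 1, 60)` is 5,292,000 bytes (the 5.3 MB per minute per channel figure), and three 3-minute stereo buffers at that format come to 3 × 31,752,000 bytes, roughly 95 MB.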
Since you have 'minutes' of audio, you may need to consider avoiding loading all audio data into memory at once. In order to read, manipulate, and combine audio, you will need a non-compressed representation to work with in memory, so compression formats would not help here. As well, converting a compressed representation to pcm takes a good amount of resources; reading a compressed file, although fewer bytes, can take more (or less) time.
How do I add the second audio to the first audio file at a dynamically chosen time, programmatically? For example: if the first audio is 2 minutes long, I might need to mix in the second audio file (3 seconds long) at 1 minute, 1.5 minutes, or 55 seconds into the first file. It's dynamic.
To read the files and convert them to the format you want to use, use ExtAudioFile APIs - this will convert to your destination sample format for you. Common PCM sample representations in memory include SInt32, SInt16, and float, but that can vary wildly based on the application and the hardware (beyond iOS). ExtAudioFile APIs would also convert compressed formats to PCM, if needed.
Your input audio files should have the same sample rate. If not, you will have to resample the audio, a complex process which also takes a lot of resources (if done correctly/accurately). If you need to support resampling, double the time you've allocated to completing this task (not detailing the process here).
To add the sounds, you would request PCM samples from the files, process, and write to the output file (or buffer in memory).
To determine when to add the other sounds, you will need to get the sample rates for the input files (via ExtAudioFileGetProperty). If you want to write the second sound to the destination buffer at 55s, then you would start adding the sounds at sample number SampleRate * 55, where SampleRate is the sample rate of the files you are reading.
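As a rough sketch of that offset calculation (the function names are made up for this example; the sample rate is whatever you read back from the file):

```c
#include <stdint.h>

/* Frame index at which mixing should start, for a time given in seconds.
 * Rounds to the nearest frame. */
int64_t frame_offset(double sample_rate, double seconds) {
    return (int64_t)(sample_rate * seconds + 0.5);
}

/* For interleaved multi-channel buffers, one frame holds one sample per
 * channel, so the array index of a frame's first sample is frame * channels. */
int64_t interleaved_index(int64_t frame, int channels) {
    return frame * channels;
}
```

At 44.1 kHz, 55 seconds is frame 2,425,500; in an interleaved stereo buffer that frame starts at array index 4,851,000.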
To mix audio, you will just use this form (pseudocode):
mixed[i] = fileA[i] + fileB[i];
but you have to be sure you avoid over/underflow and other arithmetic errors. Typically, you will perform this process using some integer value, because floating point calculations can take a long time (when there are so many). For some applications, you could just shift and add with no worry of overflow - this would effectively reduce each input by one half before adding them. The amplitude of the result would be one half. If you have control over the files' content (e.g. they are all bundled as resources) then you could simply ensure no peak sample in the files exceeded one half of the full scale value (about -6dBFS). Of course, saving as float would solve this issue at the expense of introducing higher CPU, memory, and file i/o demands.
At this point, you'd have 2 files open for reading, and one open for writing, then a few small temporary buffers for processing and mixing the inputs before writing to the output file. You should perform these requests in blocks for efficiency (e.g. read 1024 samples from each file, process the samples, write 1024 samples). The APIs don't guarantee much regarding caching and buffering for efficiency.
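Here is a minimal sketch of the "halve, then add" approach processed in blocks. It works on in-memory 16-bit buffers of equal length to stay self-contained; in a real program each block would be filled by a read from the input file and handed to a write on the output file. The names and the 1024-frame block size are just illustrative.

```c
#include <stdint.h>
#include <stddef.h>

#define BLOCK_FRAMES 1024

/* Halve both inputs before adding, so the sum always fits in 16 bits.
 * Division is used instead of >> because right-shifting a negative
 * value is implementation-defined in C. */
void mix_block_halved(const int16_t *a, const int16_t *b,
                      int16_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = (int16_t)(a[i] / 2 + b[i] / 2);
    }
}

/* Walk two equal-length inputs one block at a time. */
void mix_in_blocks(const int16_t *a, const int16_t *b,
                   int16_t *out, size_t total) {
    for (size_t pos = 0; pos < total; pos += BLOCK_FRAMES) {
        size_t n = total - pos;
        if (n > BLOCK_FRAMES) n = BLOCK_FRAMES;
        mix_block_halved(a + pos, b + pos, out + pos, n);
    }
}
```

The trade-off is the one described above: no clipping is possible, but each input ends up about 6 dB quieter in the result.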
How to save the final output audio file on the device? If I save the audio file programmatically somewhere, can I play back again?
ExtAudioFile APIs would work for your reading and writing needs. Yes, you can read/play it later.
You can do this using AVFoundation:
- (BOOL) combineVoices1
{
NSError *error = nil;
BOOL ok = NO;
NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectory = [paths objectAtIndex:0];
CMTime nextClipStartTime = kCMTimeZero;
//Create AVMutableComposition Object.This object will hold our multiple AVMutableCompositionTrack.
AVMutableComposition *composition = [[AVMutableComposition alloc] init];
AVMutableCompositionTrack *compositionAudioTrack = [composition addMutableTrackWithMediaType:AVMediaTypeAudio preferredTrackID:kCMPersistentTrackID_Invalid];
[compositionAudioTrack setPreferredVolume:0.8];
NSString *soundOne =[[NSBundle mainBundle]pathForResource:@"test1" ofType:@"caf"];
NSURL *url = [NSURL fileURLWithPath:soundOne];
AVAsset *avAsset = [AVURLAsset URLAssetWithURL:url options:nil];
NSArray *tracks = [avAsset tracksWithMediaType:AVMediaTypeAudio];
AVAssetTrack *clipAudioTrack = [[avAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
[compositionAudioTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero, avAsset.duration) ofTrack:clipAudioTrack atTime:kCMTimeZero error:nil];
AVMutableCompositionTrack *compositionAudioTrack1 = [composition addMutableTrackWithMediaType:AVMediaTypeAudio preferredTrackID:kCMPersistentTrackID_Invalid];
[compositionAudioTrack1 setPreferredVolume:0.3];
NSString *soundOne1 =[[NSBundle mainBundle]pathForResource:@"test" ofType:@"caf"];
NSURL *url1 = [NSURL fileURLWithPath:soundOne1];
AVAsset *avAsset1 = [AVURLAsset URLAssetWithURL:url1 options:nil];
NSArray *tracks1 = [avAsset1 tracksWithMediaType:AVMediaTypeAudio];
AVAssetTrack *clipAudioTrack1 = [[avAsset1 tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
[compositionAudioTrack1 insertTimeRange:CMTimeRangeMake(kCMTimeZero, avAsset1.duration) ofTrack:clipAudioTrack1 atTime:kCMTimeZero error:nil];
AVMutableCompositionTrack *compositionAudioTrack2 = [composition addMutableTrackWithMediaType:AVMediaTypeAudio preferredTrackID:kCMPersistentTrackID_Invalid];
[compositionAudioTrack2 setPreferredVolume:1.0];
NSString *soundOne2 =[[NSBundle mainBundle]pathForResource:@"song" ofType:@"caf"];
NSURL *url2 = [NSURL fileURLWithPath:soundOne2];
AVAsset *avAsset2 = [AVURLAsset URLAssetWithURL:url2 options:nil];
NSArray *tracks2 = [avAsset2 tracksWithMediaType:AVMediaTypeAudio];
AVAssetTrack *clipAudioTrack2 = [[avAsset2 tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
[compositionAudioTrack2 insertTimeRange:CMTimeRangeMake(kCMTimeZero, avAsset2.duration) ofTrack:clipAudioTrack2 atTime:kCMTimeZero error:nil];
AVAssetExportSession *exportSession = [AVAssetExportSession
exportSessionWithAsset:composition
presetName:AVAssetExportPresetAppleM4A];
if (nil == exportSession) return NO;
NSString *soundOneNew = [documentsDirectory stringByAppendingPathComponent:@"combined10.m4a"];
//NSLog(@"Output file path - %@",soundOneNew);
// configure export session output with all our parameters
exportSession.outputURL = [NSURL fileURLWithPath:soundOneNew]; // output path
exportSession.outputFileType = AVFileTypeAppleM4A; // output file type
// perform the export
[exportSession exportAsynchronouslyWithCompletionHandler:^{
if (AVAssetExportSessionStatusCompleted == exportSession.status) {
NSLog(@"AVAssetExportSessionStatusCompleted");
} else if (AVAssetExportSessionStatusFailed == exportSession.status) {
// a failure may happen because of an event out of your control
// for example, an interruption like a phone call coming in
// make sure and handle this case appropriately
NSLog(@"AVAssetExportSessionStatusFailed");
} else {
NSLog(@"Export Session Status: %ld", (long)exportSession.status);
}
}];
return YES;
}
If you are going to play multiple sounds at once, definitely use the .caf format; Apple recommends it for that. As for mixing them programmatically, I am assuming you just want them to play at the same time. While one sound is playing, just tell the other sound to play at whatever time you would like. To start it at a specific time, use an NSTimer and create a method that plays the sound when the timer fires.

Using AVAssetReader to read (stream) from a remote asset

My main goal is to stream a video from a server, and cut it frame by frame while streaming (so that it can be used by OpenGL). For that, I've used this code that I found everywhere on the Internet (as I recall it was from Apple's GLVideoFrame sample code):
NSArray * tracks = [asset tracks];
NSLog(@"%lu", (unsigned long)tracks.count);
for(AVAssetTrack* track in tracks) {
NSLog(@"type: %@", [track mediaType]);
initialFPS = track.nominalFrameRate;
width = (GLuint)track.naturalSize.width;
height = (GLuint)track.naturalSize.height;
NSError * error = nil;
// _movieReader is a member variable
@try {
self._movieReader = [[[AVAssetReader alloc] initWithAsset:asset error:&error] autorelease];
}
@catch (NSException *exception) {
NSLog(@"%@ -- %@", [exception name], [exception reason]);
NSLog(@"skipping track");
continue;
}
if (error)
{
NSLog(@"CODE:%ld\nDOMAIN:%@\nDESCRIPTION:%@\nFAILURE_REASON:%@", (long)[error code], [error domain], error.localizedDescription, [error localizedFailureReason]);
continue;
}
NSString* key = (NSString*)kCVPixelBufferPixelFormatTypeKey;
NSNumber* value = [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA];
NSDictionary* videoSettings = [NSDictionary dictionaryWithObject:value forKey:key];
[_movieReader addOutput:[AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:track
outputSettings:videoSettings]];
[_movieReader startReading];
[self performSelectorOnMainThread:@selector(frameStarter) withObject:nil waitUntilDone:NO];
}
But I always get this exception at [[AVAssetReader alloc] initWithAsset:error:].
NSInvalidArgumentException -- *** -[AVAssetReader initWithAsset:error:] Cannot initialize an instance of AVAssetReader with an asset at non-local URL 'http://devimages.apple.com/iphone/samples/bipbop/bipbopall.m3u8'
So my two questions are:
Is the exception really telling me that AVAssetReader must have a local URL? Can it be used for streaming (just like the rest of the AVFoundation classes)?
If the AVFoundation approach won't work, what are other suggestions to stream the video and split its frames at the same time?
Thanks a lot for your help.
AVFoundation does not seem to distinguish as much between local and non-local files, as it does between the KIND of files or protocols used. There is a VERY clear distinction between using mp4/mov's versus using the HTTP Live streaming protocol via m3u8's, but the differences using a local or remote mp4 are a little fuzzier.
To expand on the above:
a) If your 'remote' asset is an M3U8 (that is, you are using HTTP 'live' streaming), then no chance whatsoever. No matter if the M3U8 is in your local filesystem or on a remote server, for a multitude of reasons AVAssetReader and all AVAsset-associated functionality just does NOT work. However, AVPlayer, AVPlayerItem etc would work just fine.
b) If it is an MP4/MOV, a little further investigation is due. Local MP4/MOV's work flawlessly. In the case of remote MP4/MOV's, I'm able to create (or retrieve from an AVPlayerItem or AVPlayer or AVAssetTracks) an AVURLAsset with which I'm sometimes able to initialize an AVAssetReader successfully (I'll expand on the 'sometimes' as well, shortly). HOWEVER, copyNextSampleBuffer always returns nil in the case of remote MP4's. Since several things up to the point of invoking copyNextSampleBuffer work, I'm not 100% sure which of these is true:
i) copyNextSampleBuffer not working for remote MP4's, after all the other steps have succeeded, is intended/expected functionality.
ii) That the 'other steps' seem to work at all for remote MP4's is an accident of Apple's implementation, and this incompatibility simply comes to the fore when we hit copyNextSampleBuffer. (What these 'other steps' are, I'll detail shortly.)
iii) I'm doing something wrong when trying to invoke copyNextSampleBuffer for remote MP4's.
So @Paula, you could try to investigate a little further with remote MOV/MP4's.
For reference, here are the approaches I tried for capturing a frame from videos:
a)
Create an AVURLAsset directly from the video URL.
Retrieve the video track using [asset tracksWithMediaType:AVMediaTypeVideo]
Prepare an AVAssetReaderTrackOutput using the video track as the source.
Create an AVAssetReader using the AVURLAsset.
Add AVAssetReaderTrackOutput to the AVAssetReader and startReading.
Retrieve images using copyNextSampleBuffer.
b)
Create an AVPlayerItem from the video URL, and then an AVPlayer from it (or create the AVPlayer directly from the URL).
Retrieve the AVPlayer's 'asset' property and load its 'tracks' using "loadValuesAsynchronouslyForKeys:".
Separate the tracks of type AVMediaTypeVideo (or simply call tracksWithMediaType: on the asset once the tracks are loaded), and create your AVAssetReaderTrackOutput using the video track.
Create AVAssetReader using the AVPlayer's 'asset', 'startReading' and then retrieve images using copyNextSampleBuffer.
c)
Create an AVPlayerItem+AVPlayer or AVPlayer directly from the video URL.
KVO the AVPlayerItem's 'tracks' property, and once the tracks are loaded, separate the AVAssetTracks of type AVMediaTypeVideo.
Retrieve the AVAsset from AVPlayerItem/AVPlayer/AVAssetTrack's 'asset' property.
Remaining steps are similar to approach (b).
d)
Create an AVPlayerItem+AVPlayer or AVPlayer directly from the video URL.
KVO the AVPlayerItem's 'tracks' property, and once the tracks are loaded, separate the ones of type AVMediaTypeVideo.
Create an AVMutableComposition, and initialize an associated AVMutableCompositionTrack of type AVMediaTypeVideo.
Insert the appropriate CMTimeRange from video track retrieved earlier, into this AVMutableCompositionTrack.
Similar to (b) and (c), now create your AVAssetReader and AVAssetReaderTrackOutput, but with the difference that you use the AVMutableComposition as the base AVAsset for initializing your AVAssetReader, and AVMutableCompositionTrack as the base AVAssetTrack for your AVAssetReaderTrackOutput.
'startReading' and use copyNextSampleBuffer to get frames from the AVAssetReader.
P.S: I tried approach (d) here to get around the fact that the AVAsset retrieved directly from AVPlayerItem or AVPlayer was not behaving. So I wanted to create a new AVAsset from the AVAssetTracks I already had in hand. Admittedly hacky, and perhaps pointless (where else would the track information be ultimately retrieved from if not the original AVAsset!) but it was worth a desperate try anyway.
Here's a summary of the results for different types of files:
1) Local MOV/MP4's - All 4 approaches work flawlessly.
2) Remote MOV/MP4's - The asset and tracks are retrieved correctly in approaches (b) through (d), and the AVAssetReader is initialized as well but copyNextSampleBuffer always returns nil. In case of (a), creation of the AVAssetReader itself fails with an 'Unknown Error' NSOSStatusErrorDomain -12407.
3) Local M3U8's (accessed through an in-app/local HTTP server) - Approaches (a), (b) and (c) fail miserably, as trying to get an AVURLAsset/AVAsset in any shape or form for files streamed via M3U8's is a fool's errand.
In case of (a), the asset is not created at all, and the initWithURL: call on AVURLAsset fails with an 'Unknown Error' AVFoundationErrorDomain -11800.
In case of (b) and (c), retrieving the AVURLAsset from the AVPlayer/AVPlayerItem or AVAssetTracks returns SOME object, but accessing the 'tracks' property on it always returns an empty array.
In case of (d), I'm able to retrieve and isolate the video tracks successfully, but while trying to create the AVMutableCompositionTrack, it fails when trying to insert the CMTimeRange from the source track into the AVMutableCompositionTrack, with an 'Unknown Error' NSOSStatusErrorDomain -12780.
4) Remote M3U8's behave exactly the same as local M3U8's.
I don't fully understand why these differences exist, or why Apple could not have mitigated them. But there you go.
You can insert a remote file into an AVMutableCompositionTrack:
AVURLAsset* soundTrackAsset = [[AVURLAsset alloc]initWithURL:[NSURL URLWithString:@"http://www.yoururl.com/yourfile.mp3"] options:nil];
AVMutableCompositionTrack *compositionAudioSoundTrack = [mixComposition addMutableTrackWithMediaType:AVMediaTypeAudio preferredTrackID:kCMPersistentTrackID_Invalid];
[compositionAudioSoundTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero, soundTrackAsset.duration)
ofTrack:[[soundTrackAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0]
atTime:kCMTimeZero error:nil];
However, this approach does not work very well with more heavily compressed files such as MP4s.

How can I append to a recorded MPEG4 AAC file?

I'm recording audio on an iPhone, using an AVAudioRecorder with the following settings:
NSDictionary *recordSettings = [[NSDictionary alloc] initWithObjectsAndKeys:
[NSNumber numberWithInt: kAudioFormatMPEG4AAC], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:1], AVNumberOfChannelsKey,
[NSNumber numberWithInt:12800], AVEncoderBitRateKey,
[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
[NSNumber numberWithInt: AVAudioQualityHigh], AVEncoderAudioQualityKey,
nil];
(I can be flexible on most of these settings, but I have to use MPEG4 AAC.)
I save the audio to a file.
The user needs to be able to come back at a later date and continue recording to the same file. There doesn't seem to be an option to do this directly with AVAudioRecorder, so instead I'm recording to a new file and concatenating them.
At the moment I'm appending the files using an AVMutableComposition and an AVMutableCompositionTrack as here, but it's really slow for longer recordings so this isn't really feasible.
I'm thinking it would be much quicker if I could strip the header from the second file, append the audio data to the first file, then alter the header of the combined file to reflect the new duration. As I know both files were created with exactly the same settings, I figure the other details in the headers should be identical.
Unfortunately I can't find any information about what format the headers are in, or if it's possible to combine files in this way.
So my questions are:
What is the format of the MPEG-4 AAC file header, when created on an iPhone?
Can I combine two audio files by messing with the headers like this?
Is there a better way of appending two MPEG-4 AAC audio files almost instantaneously?
Though we ask the AVAudioRecorder to record in MPEG4-AAC format, it always produces a .caf (Core Audio Format) file. This is just a wrapper format, however, and the actual audio data it contains is in AAC format.
In the end, appending files came down to manipulating the .caf files byte-by-byte. Apple publishes a specification for Core Audio Format files. Digesting this document and processing the files accordingly was a little off-putting at first, but it turns out the spec is very clear and complete, so it wasn't too onerous.
As the spec explains, .caf files consist of chunks with four-byte names at the beginning. For AAC files, there's always a desc chunk and a kuki chunk. As we know our two original files are in the same format, we can copy these chunks unchanged to the output file.
There's also a pakt chunk and a data chunk. We can't guarantee which order these will be in within the input files. There may or may not be a free chunk - but this just contains padding 0x00's, so we needn't copy this to the output file.
To combine the pakt chunks, we need to examine the chunk headers and produce a new pakt chunk whose mNumberPackets and mNumberValidFrames fields are the sums of those in the input files. The mPrimingFrames and mRemainderFrames are always zero - these are only relevant for streaming media. The bulk of the pakt chunks (ie. the actual packet table data) can just be concatenated.
Similarly for the data chunks: the mChunkSize fields need to be summed and then the bulk of the data can be concatenated.
Be careful when reading data from all the binary numeric fields within these files: the files are big-endian but the iPhone is little-endian.
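To illustrate the endianness point and the pakt merge described above, here is a minimal C sketch. It assumes the pakt chunk's payload begins with mNumberPackets and mNumberValidFrames as two consecutive big-endian 64-bit integers, as the CAF spec lays them out; the function names are made up for this example, and real code should check every field against the spec.

```c
#include <stdint.h>

/* Read a big-endian 32-bit value (e.g. a chunk type such as 'pakt'). */
uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a big-endian 64-bit value, independent of host byte order. */
int64_t read_be64(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v = (v << 8) | p[i];
    return (int64_t)v;
}

/* Write a 64-bit value in big-endian byte order. */
void write_be64(uint8_t *p, int64_t value) {
    uint64_t v = (uint64_t)value;
    for (int i = 7; i >= 0; i--) { p[i] = (uint8_t)(v & 0xFF); v >>= 8; }
}

/* Sum the packet and valid-frame counts of two pakt headers into `out`.
 * Only the first 16 bytes are written; the priming and remainder fields
 * that follow are left for the caller to fill in. */
void merge_pakt_counts(const uint8_t *hdr_a, const uint8_t *hdr_b,
                       uint8_t *out) {
    write_be64(out,     read_be64(hdr_a)     + read_be64(hdr_b));
    write_be64(out + 8, read_be64(hdr_a + 8) + read_be64(hdr_b + 8));
}
```

Assembling values byte by byte like this avoids any dependence on the host's byte order, so the same code works on the device and in the simulator.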
For extra credit, you might also like to consider deleting segments of audio from within a file, or inserting one audio file into the middle of another. This is a little trickier as you have to parse the contents of the pakt chunk. Again it's a case of following the spec: there's a good description of how the packet sizes are stored in variable-length integers, so you'll have to parse these to find how many bytes each packet takes up in the data chunk, and calculate their positions accordingly.
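A rough sketch of that packet-table parsing in C follows. It assumes the encoding as I understand it from the spec: each byte carries 7 bits of the value, most significant group first, and a set top bit means another byte follows (so 128 is stored as 0x81 0x00). It also assumes the table holds only byte sizes, which should be the case for AAC, where frames per packet is constant. The function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Decode one variable-length integer from `p` (at most `avail` bytes).
 * Returns the number of bytes consumed, or 0 if the data is truncated. */
size_t read_varint(const uint8_t *p, size_t avail, uint64_t *value) {
    uint64_t v = 0;
    for (size_t i = 0; i < avail; i++) {
        v = (v << 7) | (uint64_t)(p[i] & 0x7F);
        if ((p[i] & 0x80) == 0) {   /* top bit clear: last byte */
            *value = v;
            return i + 1;
        }
    }
    return 0;
}

/* Byte offset of packet `index` within the audio data, found by summing
 * the sizes of all packets before it. Returns -1 if the table is short. */
int64_t packet_byte_offset(const uint8_t *table, size_t table_len,
                           uint64_t index) {
    uint64_t offset = 0;
    size_t pos = 0;
    for (uint64_t k = 0; k < index; k++) {
        uint64_t size = 0;
        size_t used = read_varint(table + pos, table_len - pos, &size);
        if (used == 0) return -1;
        pos += used;
        offset += size;
    }
    return (int64_t)offset;
}
```

With these offsets you can work out which byte ranges of the data chunk to cut or splice, and which entries of the packet table to drop or copy alongside them.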
All in all this is rather more hassle than I was hoping for. Maybe there's an open source library that will do all this for you, but I couldn't find one.
However, handling raw files like this is blindingly fast compared to using AVMutableComposition and AVMutableCompositionTrack as in the original question - inserting an hour-long recording into another of the same length takes about two seconds.
Good luck!
I found a way that was much faster to implement:
Use AVAudioRecorder and use the extension "m4a" for a temporary file, you can however also use "caf" if you want but it's unnecessary.
Modify the code here to use AVAssetExportPresetPassthrough and exportSession.outputFileType = AVFileTypeQuickTimeMovie and a filename "audioJoined.mov". Use your newly recorded temporary m4a and an existing m4a file. This gives you an instant join (no recompression) and produces a "mov".
Note. Unfortunately the AVAudioPlayer cannot play a "mov" so the next step is to convert it to something playable. However, if you are just going to share the file somewhere you could potentially skip the next step since the mov is perfectly playable on a Mac in Quicktime. It also can be played in iTunes and synced back to an iPhone and plays in the iPod app.
Convert the mov back to an m4a using [[AVAssetExportSession alloc] initWithAsset:movFileAsset presetName:AVAssetExportPresetAppleM4A], @"audioJoined.m4a" for the filename and exportSession.outputFileType = AVFileTypeAppleM4A. Again, this is instant. I'm guessing that the exporter is smarter in this situation when it starts with a mov asset rather than an AVMutableComposition asset.
I'm using this technique in an app that is able to resume recording after recording has been stopped and the file has been played, or even if the app is restarted, pretty cool.

Play a wav file retrieved from a database on the iPhone?

I have a lot of wav files stored in sqlite3, but when I retrieve one of them, I can't play it. The retrieve code is
NSData *soundData = (NSData *)sqlite3_column_blob(statement, 0);
mPlayer = [[AVAudioPlayer alloc] initWithData:soundData error:&error];
The data is stored as binary and it's there when I search for it using sqlite3.
Sorry. Never mind. I just compressed the data more and it works fine now. Seems the number of files is not as important as their size after all.