How can I record AMR audio format on the iPhone?

A voice recorder doesn't need uncompressed Linear PCM audio. Compressed AMR would do fine. The iPhone framework built for recording audio is simple enough, but the only examples I've found for setting up the audio format (which come from Apple) use LinearPCM. I've tried various other combinations of values, but can't seem to get anything to work.
Does anybody have any code that actually records AMR?
Edit:
The AMR format is one of the options for setting the data type, but the other options (packet size, frame size, etc.) don't seem to match up no matter what I set them to.
Edit: Here's what I have for the PCM version:
/*
If we want to use AMR instead of PCM:
AMR Format:
Sampling Frequency: 8 kHz/13-bit (160 samples for 20 ms frames), filtered to 200-3400 Hz
eight source codecs: 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s
generated frame length: 244, 204, 159, 148, 134, 118, 103, 95 bits per frame
*/
format->mFormatID = kAudioFormatLinearPCM;
format->mSampleRate = 8000.0; //8 kHz
format->mFramesPerPacket = 1; //1 frame per packet
format->mChannelsPerFrame = 1; //Mono
format->mBytesPerFrame = 2; //16 bits per sample / 8 = 2 bytes per frame (mono)
format->mBytesPerPacket = 2; //Same as bytes per frame
format->mBitsPerChannel = 16; //16-bit audio
format->mReserved = 0; //always 0
format->mFormatFlags = kLinearPCMFormatFlagIsBigEndian |
kLinearPCMFormatFlagIsSignedInteger |
kLinearPCMFormatFlagIsPacked;
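For context, here is a minimal sketch (not from the original question) of how an ASBD like the one above is typically handed to Audio Queue Services for recording; the callback body and buffer sizes are placeholders.
#include <AudioToolbox/AudioToolbox.h>

// Hypothetical input callback: a real recorder would write inBuffer->mAudioData
// to a file here before re-enqueueing the buffer.
static void MyInputCallback(void *inUserData,
                            AudioQueueRef inAQ,
                            AudioQueueBufferRef inBuffer,
                            const AudioTimeStamp *inStartTime,
                            UInt32 inNumPackets,
                            const AudioStreamPacketDescription *inPacketDesc)
{
    AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);
}

// Sketch: create and start a recording queue with the PCM format above.
static OSStatus StartRecording(const AudioStreamBasicDescription *format,
                               AudioQueueRef *outQueue)
{
    OSStatus err = AudioQueueNewInput(format, MyInputCallback, NULL,
                                      NULL, kCFRunLoopCommonModes, 0, outQueue);
    if (err != noErr) return err;

    // Allocate and enqueue a few capture buffers (sizes are illustrative).
    for (int i = 0; i < 3; i++) {
        AudioQueueBufferRef buffer;
        err = AudioQueueAllocateBuffer(*outQueue, 8000, &buffer);
        if (err != noErr) return err;
        AudioQueueEnqueueBuffer(*outQueue, buffer, 0, NULL);
    }
    return AudioQueueStart(*outQueue, NULL);
}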

The AMR codec is NOT supported for encoding/recording on the iPhone, although it is supported for playback: that is why the kAudioFormatAMR constant exists.
The official API documentation says that the supported encoding formats are:
ALAC (Apple Lossless) ~> kAudioFormatAppleLossless
iLBC (internet Low Bitrate Codec, for speech) ~> kAudioFormatiLBC
IMA/ADPCM (IMA4) ~> kAudioFormatAppleIMA4
linear PCM ~> kAudioFormatLinearPCM
µ-law ~> kAudioFormatULaw
a-law ~> kAudioFormatALaw
You may try one of these formats or use an open source AMR encoder as goldenmean suggests.
Edit: updated the official API link.
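If you use one of the supported compressed formats instead, you can avoid guessing the packet/frame fields by hand and let Core Audio fill them in. A minimal sketch (iLBC is used only as an example; the completed fields are whatever the system reports):
#include <AudioToolbox/AudioToolbox.h>

// Sketch: fill in an ASBD for a compressed format the iPhone can encode.
// Only the known fields are set; kAudioFormatProperty_FormatInfo completes the
// rest (bytes/frames per packet, etc.) so they are guaranteed to be consistent.
AudioStreamBasicDescription format = {0};
format.mFormatID         = kAudioFormatiLBC;  // or kAudioFormatAppleIMA4, kAudioFormatULaw, ...
format.mSampleRate       = 8000.0;            // speech-band sample rate
format.mChannelsPerFrame = 1;                 // mono

UInt32 size = sizeof(format);
OSStatus err = AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
                                      0, NULL, &size, &format);
// On success, 'format' can be passed to AudioQueueNewInput as in the earlier sketch.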

To update olegueret's link to the official documentation (why do they hide this stuff?)
http://developer.apple.com/library/ios/#qa/qa2008/qa1615.html

I guess the AMR codec format is not supported by the iPhone voice recorder app.
Maybe one can try integrating an open-source implementation of an AMR encoder into Apple's iPhone application framework and have the voice recorder store the audio in AMR-encoded format. (I don't know whether that's allowed by Apple under their NDA/license.)
-AD

You can record audio to an uncompressed Linear PCM buffer (circular or ring buffer) and, in another thread, convert the data in this buffer using your own AMR (or other) compression engine before saving the compressed audio data to a file.
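Here is a minimal sketch of that idea, assuming a single producer (the audio input callback) and a single consumer (the encoder thread); the encoder call at the end is a placeholder for whatever AMR library you integrate, not a real API.
#include <stdint.h>

// Hypothetical single-producer/single-consumer ring buffer for 16-bit PCM.
// Simplified for illustration: a production version needs proper atomics /
// memory barriers between the audio thread and the encoder thread.
#define RING_CAPACITY 32768  // samples; power of two

typedef struct {
    int16_t data[RING_CAPACITY];
    volatile uint32_t head;   // advanced by the producer (audio callback)
    volatile uint32_t tail;   // advanced by the consumer (encoder thread)
} PCMRing;

// Producer side: called from the audio input callback. Returns samples stored.
static uint32_t ring_write(PCMRing *r, const int16_t *samples, uint32_t count)
{
    uint32_t written = 0;
    while (written < count && (r->head - r->tail) < RING_CAPACITY) {
        r->data[r->head % RING_CAPACITY] = samples[written++];
        r->head++;
    }
    return written;   // anything not written is dropped when the buffer is full
}

// Consumer side: called from the encoder thread. Returns samples read.
static uint32_t ring_read(PCMRing *r, int16_t *out, uint32_t count)
{
    uint32_t read = 0;
    while (read < count && r->tail != r->head) {
        out[read++] = r->data[r->tail % RING_CAPACITY];
        r->tail++;
    }
    return read;
}

// Encoder thread loop (placeholder): pull 160-sample blocks (20 ms at 8 kHz)
// and hand them to your AMR encoder, then append each encoded frame to a file.
//   int16_t block[160];
//   while (recording) {
//       if (ring_read(&ring, block, 160) == 160)
//           amr_encode_frame(block, ...);   // hypothetical encoder call
//   }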

Related

What is the content of an .aac audio file?

I may sound like too much of a rookie, please excuse me. When I read an .AAC audio file in MATLAB using the audioread function, the output is a 256000x6 matrix. How do I know what the content of each column is?
filename = 'sample1.aac';
[y,Fs] = audioread(filename,'native');
Writing the first column using audiowrite as below, I can hear the whole sound. So what are the other columns?
audiowrite('sample2.wav',y,Fs);
Documentation:
https://uk.mathworks.com/help/matlab/ref/audioread.html
Output Arguments
y - Audio Data
Audio data in the file, returned as an m-by-n matrix, where m is the number of audio samples read and n is the number of audio channels in the file.
Each column of y is one audio channel, so your file contains six channels. If you can hear the entire recording in the first channel, it just means most of the audio content is carried in that (mono) channel. From Wikipedia r.e. AAC audio channels:
AAC supports inclusion of 48 full-bandwidth (up to 96 kHz) audio channels in one stream plus 16 low frequency effects (LFE, limited to 120 Hz) channels, up to 16 "coupling" or dialog channels, and up to 16 data streams
https://en.wikipedia.org/wiki/Advanced_Audio_Coding

How to change a CAF from high quality (sample rate) to low quality (sample rate)

I'm using AVAudioRecorder. I record the audio in CAF format with a 44100 Hz sample rate, and it records successfully. After recording, I want to convert the already-recorded CAF audio file to 11025 Hz and 22050 Hz sample rates. Is it possible to change from the high to the low and medium sample rates?
As of iOS 6, AVAudioSession can take that input and can likely convert it via the setPreferredSampleRate:error: method.
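If the goal is to convert a file that has already been recorded (rather than changing the recording session's rate), another commonly used route is ExtAudioFile, which resamples whenever the client format differs from the file's format. A rough sketch, assuming a mono 16-bit recording; the URLs and the 22050 Hz target are illustrative:
#include <AudioToolbox/AudioToolbox.h>

// Rough sketch: read a 44100 Hz mono CAF and write a 22050 Hz mono CAF.
// ExtAudioFile performs the sample-rate conversion on read because the client
// format we ask for differs from the source file's format.
static OSStatus ConvertToLowerRate(CFURLRef srcURL, CFURLRef dstURL)
{
    ExtAudioFileRef src = NULL, dst = NULL;
    OSStatus err = ExtAudioFileOpenURL(srcURL, &src);
    if (err != noErr) return err;

    // Client format: 16-bit signed mono PCM at the target rate.
    AudioStreamBasicDescription fmt = {0};
    fmt.mSampleRate       = 22050.0;
    fmt.mFormatID         = kAudioFormatLinearPCM;
    fmt.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    fmt.mChannelsPerFrame = 1;
    fmt.mBitsPerChannel   = 16;
    fmt.mBytesPerFrame    = 2;
    fmt.mFramesPerPacket  = 1;
    fmt.mBytesPerPacket   = 2;

    err = ExtAudioFileSetProperty(src, kExtAudioFileProperty_ClientDataFormat,
                                  sizeof(fmt), &fmt);
    if (err == noErr)
        err = ExtAudioFileCreateWithURL(dstURL, kAudioFileCAFType, &fmt, NULL,
                                        kAudioFileFlags_EraseFile, &dst);

    // Pump frames through; the read side hands back 22050 Hz samples.
    enum { kFrames = 4096 };
    SInt16 samples[kFrames];
    while (err == noErr) {
        AudioBufferList abl;
        abl.mNumberBuffers = 1;
        abl.mBuffers[0].mNumberChannels = 1;
        abl.mBuffers[0].mDataByteSize   = sizeof(samples);
        abl.mBuffers[0].mData           = samples;

        UInt32 frames = kFrames;
        err = ExtAudioFileRead(src, &frames, &abl);
        if (err != noErr || frames == 0) break;
        err = ExtAudioFileWrite(dst, frames, &abl);
    }

    if (src) ExtAudioFileDispose(src);
    if (dst) ExtAudioFileDispose(dst);
    return err;
}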

CoreAudio Audio Unit plays only one channel of stereo audio

Recently I've bumped into the following problem.
I use a CoreAudio AudioUnit (Remote I/O) to play/record a sound stream in an iOS app.
The sound stream which goes into the audio unit is 2-channel LPCM, 16-bit, signed integer, interleaved (I also configure an output recording stream which is basically the same but has only one channel and 2 bytes per packet and frame).
I have configured my input ASBD as follows (I get no error when I set it and when I initialize unit):
ASBD.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
ASBD.mBytesPerPacket = 4;
ASBD.mFramesPerPacket = 1;
ASBD.mBytesPerFrame = 4;
ASBD.mChannelsPerFrame = 2;
ASBD.mBitsPerChannel = 16;
In my render callback function I get an AudioBufferList with one buffer (as I understand it, because the audio stream is interleaved).
I have a sample stereo file for testing which is 100% stereo with 2 obvious channels. I translate it into a stream that corresponds to the ASBD and feed it to the audio unit.
When I play sample file I hear only left channel.
I would appreciate any ideas why this happens. If needed I can post more code.
Update: I've tried to set
ASBD.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsNonInterleaved;
ASBD.mBytesPerPacket = 2;
ASBD.mFramesPerPacket = 1;
ASBD.mBytesPerFrame = 2;
ASBD.mChannelsPerFrame = 2;
ASBD.mBitsPerChannel = 16;
as the ASBD, and I got a buffer list with two buffers. I deinterleaved my stream into 2 channels (1 channel per buffer) and got the same result. I tried with a headset and with the speaker on an iPad (I know the speaker is mono).
OK. So I've checked my code and spotted that I use a VoiceProcessingIO audio unit (instead of the RemoteIO unit mentioned in the question), which is basically correct for my app, since the documentation says: "The Voice-Processing I/O unit (subtype kAudioUnitSubType_VoiceProcessingIO) has the characteristics of the Remote I/O unit and adds echo suppression for two-way duplex communication. It also adds automatic gain correction, adjustment of voice-processing quality, and muting"
When I changed the audio unit type to RemoteIO I immediately got stereo playback. I didn't have to change the stream properties.
Basically, the VoiceProcessingIO audio unit falls back to mono and disregards the stream properties.
I've posted a question on the Apple Developer forum regarding stereo output with the VoiceProcessingIO audio unit but haven't got any answer yet.
It seems pretty logical to me to fall back to mono in order to do signal processing like echo cancellation, because iOS devices can record only mono sound without specific external accessories, although this is not documented anywhere in Apple's documentation. I've also come across a post from someone who claimed that stereo worked with the VoiceProcessingIO AU prior to iOS 5.0.
Anyway thanks for your attention. Any other comments on the matter would be greatly appreciated.
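For reference, the difference between the two setups boils down to which I/O unit subtype gets instantiated. A minimal sketch of obtaining the Remote I/O unit (variable names are illustrative; swap in kAudioUnitSubType_VoiceProcessingIO to reproduce the mono behaviour described above):
#include <AudioUnit/AudioUnit.h>

// Sketch: instantiate the Remote I/O audio unit. Using
// kAudioUnitSubType_VoiceProcessingIO here instead is the only change between
// the two configurations discussed above.
AudioComponentDescription desc = {0};
desc.componentType         = kAudioUnitType_Output;
desc.componentSubType      = kAudioUnitSubType_RemoteIO;   // vs. kAudioUnitSubType_VoiceProcessingIO
desc.componentManufacturer = kAudioUnitManufacturer_Apple;

AudioComponent comp = AudioComponentFindNext(NULL, &desc);
AudioUnit ioUnit = NULL;
OSStatus err = AudioComponentInstanceNew(comp, &ioUnit);
// ...then set kAudioUnitProperty_StreamFormat with the interleaved stereo ASBD
// from the question and call AudioUnitInitialize / AudioOutputUnitStart.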

Recording playback and mic on IPhone

In iPhone SDK 4.3 I would like to record what is being played out through the speaker via Remote IO, and also record the mic input. I was wondering whether the best way is to record each separately to a different channel in an audio file. If so, which APIs allow me to do this, and what audio format should I use? I was planning on using ExtAudioFileWrite to do the actual writing to the file.
Thanks
If both of the tracks you have are mono, 16-bit integer, with the same sample rate:
format->mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
format->mBitsPerChannel = 16;
you can combine those tracks into 2-channel PCM by simply alternating a sample from one track with a sample from the other (see the sketch after this answer):
[short1_track1][short1_track2][short2_track1][short2_track2] and so on.
After that you can write these samples to the output file using ExtAudioFileWrite. That file should of course be 2-channel kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked.
If one of the tracks is stereo (I don't think it is reasonable to record stereo from the iPhone mic), you can convert it to mono by averaging the 2 channels or by skipping every second sample.
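A small sketch of the interleaving described above, assuming two equally long mono 16-bit buffers; function and variable names are illustrative:
#include <stdint.h>
#include <stddef.h>

// Interleave two mono 16-bit tracks into one stereo buffer:
// [track1_sample0][track2_sample0][track1_sample1][track2_sample1]...
// 'frames' is the number of samples per track; 'stereoOut' must hold 2 * frames samples.
static void InterleaveStereo(const int16_t *track1, const int16_t *track2,
                             int16_t *stereoOut, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        stereoOut[2 * i]     = track1[i];   // channel 0 (e.g. playback tap)
        stereoOut[2 * i + 1] = track2[i];   // channel 1 (e.g. mic input)
    }
}

// The interleaved buffer can then be handed to ExtAudioFileWrite as a
// 2-channel, 16-bit packed PCM stream.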
You can separately save PCM data from the play and record callback buffers of the RemoteIO Audio Unit, then mix them using your own mixer code (DSP code) before writing the mixed result to a file.
You may or may not need to do your own echo cancellation (advanced DSP code) as well.

Native iPhone audio format

I've currently got my output audio on the iPhone set up like this:
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 48000;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 2;
audioFormat.mBitsPerChannel = 16;
audioFormat.mBytesPerPacket = 4;
audioFormat.mBytesPerFrame = 4;
However, when I examine my performance figures through Shark I am seeing functions such as:
SRC_Convert_table_i32_scalar_stereo
take a fair chunk of time.
This made me think: what is the ideal and suggested output format for the iPhone? The one that requires as little work as possible for the device to play.
Shark does work with the iPhone. You can enable iPhone profiling by selecting "Sampling > Network/iPhone Profiling..." in the menu.
Definitely try using a 44100 Hz sampling rate. With 48000 I see the same function that you posted appearing in the callstacks -- no such function shows up when using 44100. The canonical audio format for Audio Units on the iPhone is non-interleaved 8.24 linear PCM:
streamFormat.mFormatID = kAudioFormatLinearPCM;
streamFormat.mFormatFlags =
kAudioFormatFlagIsSignedInteger
| kAudioFormatFlagsNativeEndian
| kLinearPCMFormatFlagIsNonInterleaved
| (24 << kLinearPCMFormatFlagsSampleFractionShift);
streamFormat.mSampleRate = mixing_rate;
streamFormat.mBitsPerChannel = 32;
streamFormat.mChannelsPerFrame = 2;
streamFormat.mFramesPerPacket = 1;
streamFormat.mBytesPerFrame = ( streamFormat.mBitsPerChannel / 8 );
streamFormat.mBytesPerPacket = streamFormat.mBytesPerFrame *
streamFormat.mFramesPerPacket;
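A short sketch of where a stream format like this is typically applied, assuming a Remote I/O unit (ioUnit below) has already been instantiated; the variable name is illustrative:
#include <AudioUnit/AudioUnit.h>

// Apply the stream format to the input scope of the I/O unit's output element
// (element 0), i.e. the format of the audio your render callback supplies.
OSStatus err = AudioUnitSetProperty(ioUnit,
                                    kAudioUnitProperty_StreamFormat,
                                    kAudioUnitScope_Input,
                                    0,                 // output element
                                    &streamFormat,
                                    sizeof(streamFormat));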
From the iPhone Dev Center (requires login), the hardware-supported codecs are:
iPhone Audio Hardware Codecs
iPhone OS applications can use a wide range of audio data formats, as described in the next section. Some of these formats use software-based encoding and decoding. You can simultaneously play multiple sounds in these formats. Moreover, your application and a background application (notably, the iPod application) can simultaneously play sounds in these formats.
Other iPhone OS audio formats employ a hardware codec for playback. These formats are:
AAC
ALAC (Apple Lossless)
MP3
48000 Hz is a weird rate for audio in general. While it's marginally (and imperceptibly) better than the CD-standard 44.1 kHz, it's not worth the size.
Is there any compelling reason in your app to use high quality stereo sound? In other words is the app likely to be played on speakers or good headphones?
Linear PCM is hardware native, so it should be optimal (no decompression to fiddle with). So that call may be a downsample to 44.1 kHz.