Convert PCM Buffer to AAC ELD Format and vice versa - swift

I'm having trouble converting a linear PCM buffer to a compressed AAC ELD (Enhanced Low Delay) buffer.
I got some working code for the conversion into ilbc format from this question:
AVAudioCompressedBuffer to UInt8 array and vice versa
This approach worked fine.
I changed the input for the format to this:
let packetCapacity = 8
let maximumPacketSize = 96
lazy var capacity = packetCapacity * maximumPacketSize // 768
let convertedSampleRate: Double = 16000
lazy var aaceldFormat: AVAudioFormat = {
var descriptor = AudioStreamBasicDescription(mSampleRate: convertedSampleRate, mFormatID: kAudioFormatMPEG4AAC_ELD, mFormatFlags: 0, mBytesPerPacket: 0, mFramesPerPacket: 0, mBytesPerFrame: 0, mChannelsPerFrame: 1, mBitsPerChannel: 0, mReserved: 0)
return AVAudioFormat(streamDescription: &descriptor)!
}()
The conversion to a compressed buffer worked fine and I was able to convert the buffer to a UInt8 Array.
However, the conversion back to a PCM Buffer didn't work. The input block for the conversion back to a buffer looks like this:
func convertToBuffer(uints: [UInt8], outcomeSampleRate: Double) -> AVAudioPCMBuffer? {
// Convert to buffer
let compressedBuffer: AVAudioCompressedBuffer = AVAudioCompressedBuffer(format: aaceldFormat, packetCapacity: AVAudioPacketCount(packetCapacity), maximumPacketSize: maximumPacketSize)
compressedBuffer.byteLength = UInt32(capacity)
compressedBuffer.packetCount = AVAudioPacketCount(packetCapacity)
var compressedBytes = uints
compressedBytes.withUnsafeMutableBufferPointer {
compressedBuffer.data.copyMemory(from: $0.baseAddress!, byteCount: capacity)
}
guard let audioFormat = AVAudioFormat(
commonFormat: AVAudioCommonFormat.pcmFormatFloat32,
sampleRate: outcomeSampleRate,
channels: 1,
interleaved: false
) else { return nil }
guard let uncompressor = getUncompressingConverter(outputFormat: audioFormat) else { return nil }
var newBufferAvailable = true
let inputBlock : AVAudioConverterInputBlock = {
inNumPackets, outStatus in
if newBufferAvailable {
outStatus.pointee = .haveData
newBufferAvailable = false
return compressedBuffer
} else {
outStatus.pointee = .noDataNow
return nil
}
}
guard let uncompressedBuffer: AVAudioPCMBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: AVAudioFrameCount((audioFormat.sampleRate / 10))) else { return nil }
var conversionError: NSError?
uncompressor.convert(to: uncompressedBuffer, error: &conversionError, withInputFrom: inputBlock)
if let err = conversionError {
print("couldnt decompress compressed buffer", err)
}
return uncompressedBuffer
}
The error block after the convert method triggers and prints out "too few bits left in input buffer". Also, it seems like the input block only gets called once.
I've tried different codes and this seems to be one of the most common outcomes. I'm also not sure if the problem is in the initial conversion from the pcm buffer to uint8 array although I get an UInt8 Array filled with 768 values every 0.1 seconds (Sometimes the array contains a few zeros at the end, which doesn't happen in ilbc format.
Questions:
1. Is the initial conversion from pcm buffer to uint8 array done with the right approach? Are the packetCapacity, capacity and maximumPacketSize valid? -> Again, seems to work
2. Am I missing something at the conversion back to pcm buffer? Also, am I using the variables in the right way?
3. Has anyone achieved this conversion without using C in the project?
** EDIT: ** I also worked with the approach from this post:
Decode AAC to PCM format using AVAudioConverter Swift
It works fine with AAC format, but not with AAC_LD or AAC_ELD

Related

Converting AvAudioInputNode to S16LE PCM

I'm trying to convert input node format to S16LE format. I've tried it with AVAudioMixerNode
First I create audio session
do {
try audioSession.setCategory(.record)
try audioSession.setActive(true)
} catch {
...
}
//Define formats
let inputNodeOutputFormat = audioEngine.inputNode.outputFormat(forBus: 0)
guard let wantedFormat = AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatInt16, sampleRate: 16000, channels: 1, interleaved: false) else {
return;
}
//Create mixer node and attach it to the engine
audioEngine.attach(mixerNode)
//Connect the input node to mixer node and mixer node to mainMixerNode
audioEngine.connect(audioEngine.inputNode, to: mixerNode, format: inputNodeOutputFormat)
audioEngine.connect(mixerNode, to: audioEngine.mainMixerNode, format: wantedFormat)
//Install the tab on the output of the mixerNode
mixerNode.installTap(onBus: 0, bufferSize: bufferSize, format: wantedFormat) { (buffer, time) in
let theLength = Int(buffer.frameLength)
var bufferData: [Int16] = []
for i in 0 ..< theLength
{
let char = Int16((buffer.int16ChannelData?.pointee[i])!)
bufferData.append(char)
}
}
I get the following error.
Exception 'I[busArray objectAtindexedSubscript:
(NSUlnteger)element] setFormat:format
error:&nsErr]: returned false, error Error
Domain=NSOSStatusErrorDomain Code=-10868
"(null)"' was thrown
What part of the graph did I mess up?
You have to set the format of nodes to match the actual format of the data. Setting the node's format doesn't cause any conversions to happen, except that mixer nodes can convert sample rates (but not data formats). You'll need to use an AVAudioConverter in your tap to do the conversion.
As an example of what this code would look like, to handle arbitrary conversions:
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.inputFormat(forBus: 0)
let outputFormat = AVAudioFormat(... define your format ...)
guard let converter = AVAudioConverter(from: inputFormat, to: outputFormat) else {
throw ...some error...
}
inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) {[weak self] (buffer, time) in
let inputBlock: AVAudioConverterInputBlock = {inNumPackets, outStatus in
outStatus.pointee = AVAudioConverterInputStatus.haveData
return buffer
}
let targetFrameCapacity = AVAudioFrameCount(outputFormat.sampleRate) * buffer.frameLength / AVAudioFrameCount(buffer.format.sampleRate)
if let convertedBuffer = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: targetFrameCapacity) {
var error: NSError?
let status = converter.convert(to: convertedBuffer, error: &error, withInputFrom: inputBlock)
assert(status != .error)
let sampleCount = convertedBuffer.frameLength
let rawData = convertedBuffer.int16ChannelData![0]
// ... and here you have your data ...
}
}
If you don't need to change the sample rate, and you're converting from uncompressed audio to uncompressed audio, you may be able to use the simpler convert(to:from:) method in your tap.
Since iOS 13, you can also do this with AVAudioSinkNode rather than a tap, which can be more convenient.

Why do I get popping noises from my Core Audio program?

I am trying to figure out how to use Apple's Core Audio APIs to record and play back linear PCM audio without any file I/O. (The recording side seems to work just fine.)
The code I have is pretty short, and it works somewhat. However, I am having trouble with identifying the source of clicks and pops in the output. I've been beating my head against this for many days with no success.
I have posted a git repo here, with a command-line program program that shows where I'm at: https://github.com/maxharris9/AudioRecorderPlayerSwift/tree/main/AudioRecorderPlayerSwift
I put in a couple of functions to prepopulate the recording. The tone generator (makeWave) and noise generator (makeNoise) are just in here as debugging aids. I'm ultimately trying to identify the source of the messed up output when you play back a recording in audioData:
// makeWave(duration: 30.0, frequency: 441.0) // appends to `audioData`
// makeNoise(frameCount: Int(44100.0 * 30)) // appends to `audioData`
_ = Recorder() // appends to `audioData`
_ = Player() // reads from `audioData`
Here's the player code:
var lastIndexRead: Int = 0
func outputCallback(inUserData: UnsafeMutableRawPointer?, inAQ: AudioQueueRef, inBuffer: AudioQueueBufferRef) {
guard let player = inUserData?.assumingMemoryBound(to: Player.PlayingState.self) else {
print("missing user data in output callback")
return
}
let sliceStart = lastIndexRead
let sliceEnd = min(audioData.count, lastIndexRead + bufferByteSize - 1)
print("slice start:", sliceStart, "slice end:", sliceEnd, "audioData.count", audioData.count)
if sliceEnd >= audioData.count {
player.pointee.running = false
print("found end of audio data")
return
}
let slice = Array(audioData[sliceStart ..< sliceEnd])
let sliceCount = slice.count
// doesn't fix it
// audioData[sliceStart ..< sliceEnd].withUnsafeBytes {
// inBuffer.pointee.mAudioData.copyMemory(from: $0.baseAddress!, byteCount: Int(sliceCount))
// }
memcpy(inBuffer.pointee.mAudioData, slice, sliceCount)
inBuffer.pointee.mAudioDataByteSize = UInt32(sliceCount)
lastIndexRead += sliceCount + 1
// enqueue the buffer, or re-enqueue it if it's a used one
check(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil))
}
struct Player {
struct PlayingState {
var packetPosition: UInt32 = 0
var running: Bool = false
var start: Int = 0
var end: Int = Int(bufferByteSize)
}
init() {
var playingState: PlayingState = PlayingState()
var queue: AudioQueueRef?
// this doesn't help
// check(AudioQueueNewOutput(&audioFormat, outputCallback, &playingState, CFRunLoopGetMain(), CFRunLoopMode.commonModes.rawValue, 0, &queue))
check(AudioQueueNewOutput(&audioFormat, outputCallback, &playingState, nil, nil, 0, &queue))
var buffers: [AudioQueueBufferRef?] = Array<AudioQueueBufferRef?>.init(repeating: nil, count: BUFFER_COUNT)
print("Playing\n")
playingState.running = true
for i in 0 ..< BUFFER_COUNT {
check(AudioQueueAllocateBuffer(queue!, UInt32(bufferByteSize), &buffers[i]))
outputCallback(inUserData: &playingState, inAQ: queue!, inBuffer: buffers[i]!)
if !playingState.running {
break
}
}
check(AudioQueueStart(queue!, nil))
repeat {
CFRunLoopRunInMode(CFRunLoopMode.defaultMode, BUFFER_DURATION, false)
} while playingState.running
// delay to ensure queue emits all buffered audio
CFRunLoopRunInMode(CFRunLoopMode.defaultMode, BUFFER_DURATION * Double(BUFFER_COUNT + 1), false)
check(AudioQueueStop(queue!, true))
check(AudioQueueDispose(queue!, true))
}
}
I captured the audio with Audio Hijack, and noticed that the jumps are indeed correlated with the size of the buffer:
Why is this happening, and what can I do to fix it?
I believe you were beginning to zero in on, or at least suspect, the cause of the popping you are hearing: it's caused by discontinuities in your waveform.
My initial hunch was that you were generating the buffers independently (i.e. assuming that each buffer starts at time=0), but I checked out your code and it wasn't that. I suspect some of the calculations in makeWave were at fault. To check this theory I replaced your makeWave with the following:
func makeWave(offset: Double, numSamples: Int, sampleRate: Float64, frequency: Float64, numChannels: Int) -> [Int16] {
var data = [Int16]()
for sample in 0..<numSamples / numChannels {
// time in s
let t = offset + Double(sample) / sampleRate
let value = Double(Int16.max) * sin(2 * Double.pi * frequency * t)
for _ in 0..<numChannels {
data.append(Int16(value))
}
}
return data
}
This function removes the double loop in the original, accepts an offset so it knows which part of the wave is being generated and makes some changes to the sampling of the sine wave.
When Player is modified to use this function you get a lovely steady tone. I'll add the changes to player soon. I can't in good conscience show the quick and dirty mess it is now to the public.
Based on your comments below I refocused on your player. The issue was that the audio buffers expect byte counts but the slice count and some other calculations were based on Int16 counts. The following version of outputCallback will fix it. Concentrate on the use of the new variable bytesPerChannel.
func outputCallback(inUserData: UnsafeMutableRawPointer?, inAQ: AudioQueueRef, inBuffer: AudioQueueBufferRef) {
guard let player = inUserData?.assumingMemoryBound(to: Player.PlayingState.self) else {
print("missing user data in output callback")
return
}
let bytesPerChannel = MemoryLayout<Int16>.size
let sliceStart = lastIndexRead
let sliceEnd = min(audioData.count, lastIndexRead + bufferByteSize/bytesPerChannel)
if sliceEnd >= audioData.count {
player.pointee.running = false
print("found end of audio data")
return
}
let slice = Array(audioData[sliceStart ..< sliceEnd])
let sliceCount = slice.count
print("slice start:", sliceStart, "slice end:", sliceEnd, "audioData.count", audioData.count, "slice count:", sliceCount)
// need to be careful to convert from counts of Ints to bytes
memcpy(inBuffer.pointee.mAudioData, slice, sliceCount*bytesPerChannel)
inBuffer.pointee.mAudioDataByteSize = UInt32(sliceCount*bytesPerChannel)
lastIndexRead += sliceCount
// enqueue the buffer, or re-enqueue it if it's a used one
check(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil))
}
I did not look at the Recorder code, but you may want to check if the same sort of error crept in there.

Change AudioQueueBuffer's mAudioData

I would need to set the AudioQueueBufferRef's mAudioData. I tried with copyMemory:
inBuffer.pointee.copyMemory(from: lastItemOfArray, byteCount: byteCount) // byteCount is 512
but it doesnt't work.
The AudioQueueNewOutput() queue is properly setted up to Int16 pcm format
Here is my code:
class CustomObject {
var pcmInt16DataArray = [UnsafeMutableRawPointer]() // this contains pcmInt16 data
}
let callback: AudioQueueOutputCallback = { (
inUserData: UnsafeMutableRawPointer?,
inAQ: AudioQueueRef,
inBuffer: AudioQueueBufferRef) in
guard let aqp: CustomObject = inUserData?.bindMemory(to: CustomObject.self, capacity: 1).pointee else { return }
var numBytes: UInt32 = inBuffer.pointee.mAudioDataBytesCapacity
/// Set inBuffer.pointee.mAudioData to pcmInt16DataArray.popLast()
/// How can I set the mAudioData here??
inBuffer.pointee.mAudioDataByteSize = numBytes
AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil)
}
From apple doc: https://developer.apple.com/documentation/audiotoolbox/audioqueuebuffer?language=objc
mAudioData:
The audio data owned the audio queue buffer. The buffer address cannot be changed.
So I guess the solution would be to set a new value to the same address
Anybody who knows how to do it?
UPDATE:
The incoming audio format is "pcm" signal (Little Endian) sampled at 48kHz. Here are my settings:
var dataFormat = AudioStreamBasicDescription()
dataFormat.mSampleRate = 48000;
dataFormat.mFormatID = kAudioFormatLinearPCM
dataFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsNonInterleaved;
dataFormat.mChannelsPerFrame = 1
dataFormat.mFramesPerPacket = 1
dataFormat.mBitsPerChannel = 16
dataFormat.mBytesPerFrame = 2
dataFormat.mBytesPerPacket = 2
And I am collecting the incoming data to
var pcmData = [UnsafeMutableRawPointer]()
You're close!
Try this:
inBuffer.pointee.mAudioData.copyMemory(from: lastItemOfArray, byteCount: Int(numBytes))
or this:
memcpy(inBuffer.pointee.mAudioData, lastItemOfArray, Int(numBytes))
Audio Queue Services was tough enough to work with when it was pure C. Now that we have to do so much bridging to get the API to work with Swift, it's a real pain. If you have the option, try out AVAudioEngine.
A few other things to check:
Make sure your AudioQueue has the same format that you've defined in your AudioStreamBasicDescription.
var queue: AudioQueueRef?
// assumes userData has already been initialized and configured
AudioQueueNewOutput(&dataFormat, callBack, &userData, nil, nil, 0, &queue)
Confirm you have allocated and primed the queue's buffers.
let numBuffers = 3
// using forced optionals here for brevity
for _ in 0..<numBuffers {
var buffer: AudioQueueBufferRef?
if AudioQueueAllocateBuffer(queue!, userData.bufferByteSize, &buffer) == noErr {
userData.mBuffers.append(buffer!)
callBack(inUserData: &userData, inAQ: queue!, inBuffer: buffer!)
}
}
Consider making your callback a function.
func callBack(inUserData: UnsafeMutableRawPointer?, inAQ: AudioQueueRef, inBuffer: AudioQueueBufferRef) {
let numBytes: UInt32 = inBuffer.pointee.mAudioDataBytesCapacity
memcpy(inBuffer.pointee.mAudioData, pcmData, Int(numBytes))
inBuffer.pointee.mAudioDataByteSize = numBytes
AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil)
}
Also, see if you can get some basic PCM data to play through your audio queue before attempting to bring in the server side data.
var pcmData: [Int16] = []
for i in 0..<frameCount {
let element = Int16.random(in: Int16.min...Int16.max) // noise
pcmData.append(Int16(element))
}

How to decode m4a to PCM using AVFoundation's AVAsetReader

Trying to convert M4A to PCM goes well in the start.
i am able to convert and read the bytes.
however i am not sure if this is the correct way to do this.
as i am getting 16384 bytes when i try to get the bytes in NSData.
Here is my function
func getData(sampleRef:CMSampleBufferRef) -> NSMutableData{
let dataBuffer = CMSampleBufferGetDataBuffer(sampleRef)
let length = CMBlockBufferGetDataLength(dataBuffer!)
var data = NSMutableData(length: length)
CMBlockBufferCopyDataBytes(dataBuffer!, 0, length, data!.mutableBytes)
print(data!)// this prints 16384 bytes
return data!
}
this i try to convert this data to Int16
// 3 lines below i was just testing how it converts to Int16
let count = data!.length / sizeof(Int16)
var array = [Int16](count: count, repeatedValue: 0)
data!.getBytes(&array, length: data!.length)
these are my settings to decode the PCM from
M4A file.
let outputSettings = [
AVFormatIDKey: Int(kAudioFormatLinearPCM),
AVSampleRateKey: 44100,
AVLinearPCMBitDepthKey:16,
AVLinearPCMIsFloatKey:0,
AVNumberOfChannelsKey: 1 as NSNumber,
]
PS. i record the file with the same settings

Read a WAV file and convert it to an array of amplitudes in Swift

I have followed a very good tutorial on udacity to explore the basis of audio applications with Swift. I would like to extend its current functionalities, starting with displaying the waveform of the WAV file. For that purpose, I would need to retrieve the amplitude versus sample from the WAV file. How could I proceed in swift, given that I have a recorded file already?
Thank you!
AudioToolBox meets you need.
You can use AudioFileService to get the audio samples from the audio file, such as the WAV file,
Then you can get the amplitude from every sample.
// this is your desired amplitude data
public internal(set) var packetsX = [Data]()
public required init(src path: URL) throws {
Utility.check(error: AudioFileOpenURL(path as CFURL, .readPermission, 0, &playbackFile) , // set on output to the AudioFileID
operation: "AudioFileOpenURL failed")
guard let file = playbackFile else {
return
}
var numPacketsToRead: UInt32 = 0
GetPropertyValue(val: &numPacketsToRead, file: file, prop: kAudioFilePropertyAudioDataPacketCount)
var asbdFormat = AudioStreamBasicDescription()
GetPropertyValue(val: &asbdFormat, file: file, prop: kAudioFilePropertyDataFormat)
dataFormatD = AVAudioFormat(streamDescription: &asbdFormat)
/// At this point we should definitely have a data format
var bytesRead: UInt32 = 0
GetPropertyValue(val: &bytesRead, file: file, prop: kAudioFilePropertyAudioDataByteCount)
guard let dataFormat = dataFormatD else {
return
}
let format = dataFormat.streamDescription.pointee
let bytesPerPacket = Int(format.mBytesPerPacket)
for i in 0 ..< Int(numPacketsToRead) {
var packetSize = UInt32(bytesPerPacket)
let packetStart = Int64(i * bytesPerPacket)
let dataPt: UnsafeMutableRawPointer = malloc(MemoryLayout<UInt8>.size * bytesPerPacket)
AudioFileReadBytes(file, false, packetStart, &packetSize, dataPt)
let startPt = dataPt.bindMemory(to: UInt8.self, capacity: bytesPerPacket)
let buffer = UnsafeBufferPointer(start: startPt, count: bytesPerPacket)
let array = Array(buffer)
packetsX.append(Data(array))
}
}
For example , the WAV file has channel one 、bit depth of Int16 .
// buffer is of two Int8, to express an Int16
let buffer = UnsafeBufferPointer(start: startPt, count: bytesPerPacket)
more information , you can check my github repo