I am working on a push-to-talk feature where a sender transmits audio as a byte array to the server and a receiver listens to it in real time over a socket connection.
When I try to play the audio at the receiver's end using AVAudioEngine, it is not working.
let buffer = dataToPCMBuffer(format: format16KHzMono!, data: data)
let player = AVAudioPlayerNode()
self.audioEngine?.attach(audioPlayerNode)
let mixer = self.audioEngine?.mainMixerNode
self.audioEngine?.connect(player, to: mixer!, format: AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 16000, channels: 1, interleaved: true))
self.playerQueue.async {
    self.audioPlayerNode.scheduleBuffer(buffer!) {
        print("stopping")
        if self.audioEngine!.isRunning {
            self.audioPlayerNode.play()
        } else {
            try? self.audioEngine?.start()
        }
    }
}
And I am facing a crash at the line below:
self.audioEngine?.connect(player, to: mixer!, format: AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 16000, channels: 1, interleaved: true))
Any help will be appreciated.
I think it’s the format in your connection. Try using nil instead. There are some magic numbers needed for sample rates, maybe 16000 is not one of them.
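A minimal sketch of that suggestion (the engine, player, and buffer names here are placeholders): attach and connect the player with a nil format so the engine picks a compatible one, start the engine, and only then schedule buffers. Note that any buffer passed to scheduleBuffer then has to match the player's output format, which is typically deinterleaved Float32, so raw Int16 data would still need converting when the buffer is built.
let player = AVAudioPlayerNode()
audioEngine.attach(player)
// Passing nil lets the engine choose the connection format instead of forcing
// an interleaved Int16 format on the player's output, a common cause of this crash.
audioEngine.connect(player, to: audioEngine.mainMixerNode, format: nil)
audioEngine.prepare()
try? audioEngine.start()
// Schedule buffers that match the player's output format, then start playback.
player.scheduleBuffer(floatBuffer, completionHandler: nil)
player.play()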
Related
I have an audio file with 5.1 channels. How do I get access to a buffer containing all of this information during playback?
The setup is roughly like this
engine.attach(playerNode)
engine.connect(playerNode, to: engine.mainMixerNode, format: audioFile.processingFormat)
engine.prepare()
try? engine.start()
File loading and playback is pretty standard, I think.
guard let audioFile = try? AVAudioFile(forReading: url, commonFormat: .pcmFormatFloat32, interleaved: false) else { return }
// ...
playerNode.scheduleFile(audioFile, at: nil)
Then I used a tap on the bus to get access to the buffer.
let format: AVAudioFormat = engine.mainMixerNode.outputFormat(forBus: 0)
engine.mainMixerNode.installTap(
    onBus: 0,
    bufferSize: 1024,
    format: format
) { buffer, time in
    // Do something cool with buffer, but buffer only has two channels.
    for channel in 0..<buffer.format.channelCount {
    }
}
Questions:
Is a tap on the bus the right way to get hold of this data? Is there a better way?
How do I get at the data for more than two channels?
Install your tap on the playerNode and don't bother specifying the format:
playerNode.installTap(
    onBus: 0,
    bufferSize: 1024,
    format: nil
) { buffer, time in
    // 6-channel buffer
    for channel in 0..<buffer.format.channelCount {
    }
}
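Inside that tap closure, the per-channel samples can then be read through floatChannelData; a small sketch, assuming the buffer is deinterleaved Float32 (which it is when the file is loaded as in the question):
if let channels = buffer.floatChannelData {
    for channel in 0..<Int(buffer.format.channelCount) {
        // One pointer per channel, frameLength samples each.
        let samples = UnsafeBufferPointer(start: channels[channel],
                                          count: Int(buffer.frameLength))
        let peak = samples.map(abs).max() ?? 0   // e.g. per-channel peak level
        print("channel \(channel): peak \(peak)")
    }
}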
We are working on a project that records voice from an external microphone. For analysis purposes, we need a sample rate of about 5 kHz.
We are using AVAudioEngine to record the voice.
We know Apple devices aren't able to record at arbitrary sample rates, so we are using AVAudioConverter to reduce the sample rate.
As far as we understand, downsampling acts a bit like compression: the lower the sample rate, the smaller the file, and in our case the file duration seems to shrink as well, which is what is currently happening (correct us if we are wrong about this).
Issue
Downsampling shortens the file length, which throws off our calculations and analysis.
For example, a 1-hour recording came out as roughly 45 minutes after downsampling, so any analysis done on 5-minute intervals goes wrong.
What would be the best solution for this?
Query
We have searched the internet but could not figure out how the buffer size passed to installTap affects things. In the current code we have set it to 2688.
Can anyone clarify?
Code
let bus = 0
let inputNode = engine.inputNode
let equalizer = AVAudioUnitEQ(numberOfBands: 2)
equalizer.bands[0].filterType = .lowPass
equalizer.bands[0].frequency = 3000
equalizer.bands[0].bypass = false
equalizer.bands[1].filterType = .highPass
equalizer.bands[1].frequency = 1000
equalizer.bands[1].bypass = false
engine.attach(equalizer) // Attach equalizer

// Connect nodes
engine.connect(inputNode, to: equalizer, format: inputNode.inputFormat(forBus: 0))
engine.connect(equalizer, to: engine.mainMixerNode, format: inputNode.inputFormat(forBus: 0))

// Call before creating converter because this changes the mainMixer's output format
engine.prepare()

let outputFormat = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                 sampleRate: 5000,
                                 channels: 1,
                                 interleaved: false)!

// Downsampling converter
guard let converter = AVAudioConverter(from: engine.mainMixerNode.outputFormat(forBus: 0), to: outputFormat) else {
    print("Can't convert in to this format")
    return
}

engine.mainMixerNode.installTap(onBus: bus, bufferSize: 2688, format: nil) { (buffer, time) in
    var newBufferAvailable = true

    let inputCallback: AVAudioConverterInputBlock = { inNumPackets, outStatus in
        if newBufferAvailable {
            outStatus.pointee = .haveData
            newBufferAvailable = false
            return buffer
        } else {
            outStatus.pointee = .noDataNow
            return nil
        }
    }

    let convertedBuffer = AVAudioPCMBuffer(pcmFormat: outputFormat,
                                           frameCapacity: AVAudioFrameCount(outputFormat.sampleRate) * buffer.frameLength / AVAudioFrameCount(buffer.format.sampleRate))!

    var error: NSError?
    let status = converter.convert(to: convertedBuffer, error: &error, withInputFrom: inputCallback)
    assert(status != .error)

    if status == .haveData {
        // Process with converted buffer
    }
}

do {
    try engine.start()
} catch {
    print("Can't start the engine: \(error)")
}
Expected Result
We are fine with the buffer being compressed, but we would like the output file to keep the same recording duration. If we record for 10 minutes, the output file should contain 10 minutes of audio.
Digitized audio doesn't have an intrinsic duration since it can be played back at any sample rate.
In order for the resulting file's duration to be what you expect, the sample rates have to be what you expect at each stage: Recording, processing, and playback.
I suspect that one of two possible things is happening:
A) The sample rate of the buffer you receive inside installTap is not what you assumed it would be, and you are converting from the wrong format.
B) You are playing back your audio at sample rates that are different from what you assume they are. (How do you know your player is playing at 5000 Hz?)
In order to check this, you would have to break the process down into smaller pieces and check the sample rate at each stage.
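For example, a quick (hypothetical) way to check those assumptions is to print the format at each stage before trusting it:
print("input node:", engine.inputNode.inputFormat(forBus: 0))        // hardware rate, e.g. 44100 or 48000 Hz
print("mixer output:", engine.mainMixerNode.outputFormat(forBus: 0)) // the format the converter reads from
engine.mainMixerNode.installTap(onBus: 0, bufferSize: 2688, format: nil) { buffer, _ in
    // With a nil format, the tap delivers buffers in the mixer's output format, not at 5000 Hz.
    print("tap buffer:", buffer.format.sampleRate, "Hz,", buffer.frameLength, "frames")
}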
I am developing a small framework on top of AVFoundation that wraps up some functionality, and I want to create tests around it.
I'm calling the configuration method below only in the initializer of my class implementation, as it only needs to run once. I don't want to add test-specific logic that prevents this line from running during tests, but I am having trouble testing functionality of this class, like reading an audio file and appending it to the audio buffer.
private let mixerNode = AVAudioMixerNode()

// Used to get audio from the microphone.
private lazy var audioEngine = AVAudioEngine()

private func configureAudioEngine() {
    // Get the native audio format of the engine's input bus.
    let inputFormat = audioEngine.inputNode.inputFormat(forBus: 0)

    // Set an output format
    // let outputFormat = AVAudioFormat(standardFormatWithSampleRate: 48000, channels: 1)
    let outputFormat = AVAudioFormat(
        standardFormatWithSampleRate: audioEngine.inputNode.outputFormat(forBus: 0).sampleRate,
        channels: 1)

    // Mixer node converts the input
    audioEngine.attach(mixerNode)

    // Attach the mixer to the microphone input and the output of the audio engine.
    // Crash on the line below (the prints shown further down come from this line).
    audioEngine.connect(audioEngine.inputNode, to: mixerNode, format: inputFormat)
    audioEngine.connect(mixerNode, to: audioEngine.outputNode, format: outputFormat)

    // Install a tap on the mixer node to capture the microphone audio.
    mixerNode.installTap(onBus: 0,
                         bufferSize: 8192,
                         format: outputFormat) { [weak self] buffer, audioTime in
        // Add captured audio to the buffer
        // TODO: Do operation on the buffer data that's being received...
    }
}
I am seeing inconsistencies between running this on the device and running the tests, because the returned AVAudioFormat is different. The two outputs below come from the same line indicated above.
AVAudioEngine during a test
Tests iOS-Runner[4515:894511] [aurioc] AURemoteIO.cpp:1123 failed: 561015905 (enable 1, outf< 2 ch, 0 Hz, Float32, deinterleaved> inf< 2 ch, 0 Hz, Float32, deinterleaved>)
AVAudioEngine when running on device
<AVAudioFormat 0x2834962b0: 1 ch, 44100 Hz, Float32>
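The 0 Hz, 2-channel format in the test run typically means the input node has no usable input hardware or active audio session behind it in the test host, and connecting with such a format is what makes AURemoteIO fail. One way to make that explicit without test-only branches is to guard on the reported format before connecting; this is a defensive sketch, not part of the original code:
let inputFormat = audioEngine.inputNode.inputFormat(forBus: 0)
guard inputFormat.sampleRate > 0, inputFormat.channelCount > 0 else {
    // No usable input device (e.g. when running unit tests without an audio session).
    print("Skipping audio engine configuration: no input hardware available")
    return
}
audioEngine.connect(audioEngine.inputNode, to: mixerNode, format: inputFormat)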
So I've been trying to do speech recognition in Swift using the built-in SFSpeechRecognizer class while also downsampling and then recording the audio to a file, but I'm not well-versed enough in AVAudioEngine to figure it out.
I've gotten the speech recognition to work by itself, and I've gotten the audio recording to work by itself, but I can't get them to work together.
Here's my existing code in which I try to record; the remaining code is just the standard speech recognition setup:
let audioFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 16000, channels: 1, interleaved: false)
let mixer = AVAudioMixerNode()
audioEngine.attach(mixer)
audioEngine.connect(inputNode!, to: mixer, format: inputNode!.inputFormat(forBus: 0))

// 1 Connecting the mixer
audioEngine.connect(mixer, to: audioEngine.outputNode, format: audioFormat)

// 2 Recognition
inputNode!.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
    print("Testing if this tap works")
    self.recognitionRequest?.append(buffer)
}

// 3 Downsampling and recording
mixer.installTap(onBus: 0, bufferSize: 1024, format: audioFormat) { (buffer, when) in
    print(buffer)
    try? self.outputFile!.write(from: buffer)
}
If I comment out 3, the speech recognition works, but otherwise 2 doesn't even run: the tap doesn't output anything. I also can't move the recognitionRequest into 3, because then the speech recognition throws an error. I see in the docs that each bus can only have one tap; how can I get around this? Should I use an AVAudioConnectionPoint? It isn't well documented.
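For reference, one possible arrangement (a sketch only, reusing the AVAudioConverter approach from the downsampling thread above; inputNode, recognitionRequest, and outputFile are assumed to exist as in the question, with outputFile created in the 16 kHz format): keep a single tap on the input node, append the raw buffer to the recognition request, and convert a copy to 16 kHz before writing it to the file.
let recordingFormat = inputNode!.outputFormat(forBus: 0)
let fileFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                               sampleRate: 16000, channels: 1, interleaved: false)!
let converter = AVAudioConverter(from: recordingFormat, to: fileFormat)!

inputNode!.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { buffer, _ in
    // Recognition: the request accepts buffers in the input node's native format.
    self.recognitionRequest?.append(buffer)

    // Recording: downsample this buffer to 16 kHz and write it out.
    let capacity = AVAudioFrameCount(Double(buffer.frameLength)
        * fileFormat.sampleRate / recordingFormat.sampleRate)
    guard let converted = AVAudioPCMBuffer(pcmFormat: fileFormat, frameCapacity: capacity) else { return }
    var consumed = false
    var error: NSError?
    let status = converter.convert(to: converted, error: &error) { _, outStatus in
        if consumed {
            outStatus.pointee = .noDataNow
            return nil
        }
        consumed = true
        outStatus.pointee = .haveData
        return buffer
    }
    if status == .haveData || converted.frameLength > 0 {
        try? self.outputFile?.write(from: converted)
    }
}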
I am receiving audio from the server as bytes through my socket connection, and I am trying to convert it to a PCM buffer in order to play it.
func playAudio(data: NSData) {
    let buffer = dataToPCMBuffer(format: format16KHzMono!, data: data)
    let player = AVAudioPlayerNode()
    self.audioEngine?.attach(audioPlayerNode)
    let mixer = self.audioEngine?.mainMixerNode
    self.audioEngine?.connect(player, to: mixer!, format: format16KHzMono)

    self.playerQueue.async {
        self.audioPlayerNode.scheduleBuffer(buffer!) {
            print("stopping")
            if self.audioEngine!.isRunning {
                self.audioPlayerNode.play()
            } else {
                try? self.audioEngine?.start()
            }
        }

        self.audioEngine?.prepare()
        try! self.audioEngine?.start()
    }
}
But I am facing a crash on the line below.
self.audioEngine?.connect(player, to: mixer!, format: format16KHzMono)
I hope this is the right way to stream audio from bytes.
Any help will be appreciated.
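For what it's worth, a possible implementation of the dataToPCMBuffer(format:data:) helper referenced above could look roughly like this, assuming format16KHzMono was created with commonFormat .pcmFormatInt16 and the socket delivers raw 16-bit mono samples:
import AVFoundation

func dataToPCMBuffer(format: AVAudioFormat, data: NSData) -> AVAudioPCMBuffer? {
    let bytesPerFrame = Int(format.streamDescription.pointee.mBytesPerFrame)
    let frameCount = AVAudioFrameCount(data.length / bytesPerFrame)
    guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount),
          let channelData = buffer.int16ChannelData else {
        return nil
    }
    buffer.frameLength = frameCount
    // Copy the raw bytes straight into the single mono channel.
    memcpy(channelData[0], data.bytes, data.length)
    return buffer
}
Note that the crash on the connect line may be independent of this helper: player nodes are normally connected with a standard (deinterleaved Float32) format, so either pass nil as the connection format (as suggested in the first answer above) or convert the samples to Float32 before scheduling them.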