How to select audio input device (mic) in AVAudioEngine on macOS / Swift?

Is it possible to select the input device in AVAudioEngine using Swift on macOS?
Use case:
I am using SFSpeechRecognizer on macOS.
To feed microphone data into it I am using
private let audioEngine = AVAudioEngine()
:
let inputNode = audioEngine.inputNode
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
    self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
try audioEngine.start()
This will use the system default microphone.
I have an external USB microphone I wish to use instead. I can go into System Preferences -> Sound and set the default input device to my USB microphone; then it works. But if I disconnect the microphone and reconnect it to a different USB port, I have to go through the process of setting it as the default again.
To avoid having to do this repeatedly, I would like to set the microphone manually from code.
Is this possible?
EDIT: I found Set AVAudioEngine Input and Output Devices which uses Obj-C.
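For reference, a rough Swift port of that Obj-C approach might look like the sketch below. It assumes you already have the AudioDeviceID of the USB microphone (for example, looked up by its device UID via Core Audio; the UID is what stays stable when the mic moves to a different USB port) and that the device is set on the input node's underlying audio unit before the engine is started. Treat it as a starting point, not a verified solution.
import AVFoundation
import CoreAudio
import AudioToolbox

// Sketch: point the engine's input node at a specific capture device.
// `deviceID` is assumed to be the AudioDeviceID of the USB microphone.
func setInputDevice(_ deviceID: AudioDeviceID, for engine: AVAudioEngine) {
    guard let inputUnit = engine.inputNode.audioUnit else { return }
    var deviceID = deviceID
    let status = AudioUnitSetProperty(inputUnit,
                                      kAudioOutputUnitProperty_CurrentDevice,
                                      kAudioUnitScope_Global,
                                      0,                      // element
                                      &deviceID,
                                      UInt32(MemoryLayout<AudioDeviceID>.size))
    if status != noErr {
        NSLog("Could not set input device: \(status)")
    }
}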

Related

webrtc macOS change input source

I'm trying to change the audio input source (microphone) in my macOS app like this:
var engine = AVAudioEngine()

private func activateNewInput(_ id: AudioDeviceID) {
    let input = engine.inputNode
    let inputUnit = input.audioUnit!
    var inputDeviceID: AudioDeviceID = id
    let status = AudioUnitSetProperty(inputUnit,
                                      kAudioOutputUnitProperty_CurrentDevice,
                                      kAudioUnitScope_Global,
                                      0,
                                      &inputDeviceID,
                                      UInt32(MemoryLayout<AudioDeviceID>.size))
    if status != 0 {
        NSLog("Could not change input: \(status)")
    }
    /*
    engine.prepare()
    do {
        try engine.start()
    }
    catch {
        NSLog("\(error.localizedDescription)")
    }
    */
}
The status returned by AudioUnitSetProperty is successful (zero), but WebRTC keeps using the default input source no matter what.
On iOS this is typically done with AVAudioSession, which is missing on macOS; that's why I use AudioUnitSetProperty from the AudioToolbox framework. I also checked RTCPeerConnectionFactory, and it has no constructor to inject an ADM (audio device module) nor any API to control the audio input. I use WebRTC branch 106.
Any ideas or hints are much appreciated.
Thanks
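One avenue worth trying is a sketch under the assumption that WebRTC's built-in audio device module captures from the system default input rather than from your engine's audio unit: change the system-wide default input device via Core Audio instead of (or in addition to) setting the unit property. This is untested against WebRTC branch 106.
import CoreAudio

// Sketch: make `deviceID` the system default input device.
// Assumes WebRTC's audio capture follows the system default input.
func setSystemDefaultInput(_ deviceID: AudioDeviceID) -> OSStatus {
    var deviceID = deviceID
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyDefaultInputDevice,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain) // kAudioObjectPropertyElementMaster on older SDKs
    return AudioObjectSetPropertyData(
        AudioObjectID(kAudioObjectSystemObject),
        &address,
        0,
        nil,
        UInt32(MemoryLayout<AudioDeviceID>.size),
        &deviceID)
}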

How to test AVAudioEngine and AVFoundation?

I am developing a small wrapper framework on top of AVFoundation that packages up some functionality, and I want to create tests around it.
I'm calling this only in the initializer of my class implementation, as it only needs to run once. I don't want to add test-specific logic that stops this code from running during tests, but I am having trouble testing functionality on this class, such as reading an audio file and appending it to the audio buffer.
private let mixerNode = AVAudioMixerNode()

// Used to get audio from the microphone.
private lazy var audioEngine = AVAudioEngine()

private func configureAudioEngine() {
    // Get the native audio format of the engine's input bus.
    let inputFormat = audioEngine.inputNode.inputFormat(forBus: 0)

    // Set an output format
    // let outputFormat = AVAudioFormat(standardFormatWithSampleRate: 48000, channels: 1)
    let outputFormat = AVAudioFormat(
        standardFormatWithSampleRate: audioEngine.inputNode.outputFormat(forBus: 0).sampleRate,
        channels: 1)

    // Mixer node converts the input
    audioEngine.attach(mixerNode)

    // Attach the mixer to the microphone input and the output of the audio engine.
    // Crash on the line below (shown in the prints further down).
    audioEngine.connect(audioEngine.inputNode, to: mixerNode, format: inputFormat)
    audioEngine.connect(mixerNode, to: audioEngine.outputNode, format: outputFormat)

    // Install a tap on the mixer node to capture the microphone audio.
    mixerNode.installTap(onBus: 0,
                         bufferSize: 8192,
                         format: outputFormat) { [weak self] buffer, audioTime in
        // Add captured audio to the buffer
        // TODO: Do operation on the buffer data that's being received...
    }
}
I am seeing inconsistencies between running this on the device and running the tests, as the returned AVAudioFormat is different. The two prints below are from the same line indicated above.
AVAudioEngine during a test
Tests iOS-Runner[4515:894511] [aurioc] AURemoteIO.cpp:1123 failed: 561015905 (enable 1, outf< 2 ch, 0 Hz, Float32, deinterleaved> inf< 2 ch, 0 Hz, Float32, deinterleaved>)
AVAudioEngine when running on device
<AVAudioFormat 0x2834962b0: 1 ch, 44100 Hz, Float32>
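The 0 Hz, 2-channel format in the test log suggests the test host has no usable input device, so connecting the nodes with that format is presumably what fails. A minimal defensive sketch (my assumption about the cause, not a confirmed fix, and `configureAudioEngineIfPossible` is a hypothetical wrapper) would be to skip the wiring when the hardware format is invalid:
private func configureAudioEngineIfPossible() {
    // Hypothetical guard: skip engine wiring when there is no real input device,
    // e.g. when running inside a unit-test host without microphone access.
    let inputFormat = audioEngine.inputNode.inputFormat(forBus: 0)
    guard inputFormat.sampleRate > 0, inputFormat.channelCount > 0 else {
        return
    }
    configureAudioEngine()
}
More generally, hiding the engine behind a protocol that tests can replace with a fake would let you test the file-reading and buffer-appending logic without touching real audio hardware.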

Record audio from AirPods Pro at a higher sample rate than 16 kHz

Is it possible to record audio through the microphone of AirPods Pro at a sample rate higher than 16 kHz?
I'm tapping into the microphone bus using audioEngine:
let node = audioEngine.inputNode
let recordingFormat = node.outputFormat(forBus: 0)
node.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { [unowned self] (buffer, _) in
    self.request!.append(buffer)
}
audioEngine.prepare()
do {
    try audioEngine.start()
} catch {
    fatalError("\t[Error] There was a problem starting speech recognition")
}
It seems as though the default sample rate (found in recordingFormat) is 16000 Hz, and I've had difficulty specifying a higher sample rate.
This particular sample rate returns an audio recording with fairly low quality compared to a recording from the iPhone microphone, which has a sample rate of 44100 Hz.
Try calling the following before starting your AVAudioEngine instance (both calls throw, so wrap them in a do/catch or a throwing function):
try AVAudioSession.sharedInstance().setCategory(.playAndRecord)
try AVAudioSession.sharedInstance().setPreferredSampleRate(44_100)
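A slightly fuller version of that suggestion could look like the sketch below; the session activation and error handling are additions to the original answer, not part of it.
import AVFoundation

let session = AVAudioSession.sharedInstance()
do {
    try session.setCategory(.playAndRecord)
    try session.setPreferredSampleRate(44_100)
    try session.setActive(true)
} catch {
    print("Failed to configure AVAudioSession: \(error)")
}
// Start the AVAudioEngine only after the session is configured.
Also note that setPreferredSampleRate is only a request; check AVAudioSession.sharedInstance().sampleRate after activation to see what the current route actually granted.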

How to send microphone and in-app audio CMSampleBuffers to WebRTC in Swift?

I am working on a screen-broadcast application. I want to send my screen recording to a WebRTC server.
override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, with sampleBufferType: RPSampleBufferType) {
    //if source!.isSocketConnected {
    switch sampleBufferType {
    case RPSampleBufferType.video:
        // Handle video sample buffer
        source?.processVideoSampleBuffer(sampleBuffer)
    case RPSampleBufferType.audioApp:
        // Handle audio sample buffer for app audio
        source?.processInAppAudioSampleBuffer(sampleBuffer)
    case RPSampleBufferType.audioMic:
        // Handle audio sample buffer for mic audio
        source?.processAudioSampleBuffer(sampleBuffer)
    @unknown default:
        break
    }
}
// VideoBuffer sending method
func startCaptureLocalVideo(sampleBuffer: CMSampleBuffer) {
    let _pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
    if let pixelBuffer = _pixelBuffer {
        let rtcPixelBuffer = RTCCVPixelBuffer(pixelBuffer: pixelBuffer)
        let timeStampNs = CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) * 1_000_000_000
        let rtcVideoFrame = RTCVideoFrame(buffer: rtcPixelBuffer, rotation: RTCVideoRotation._90, timeStampNs: Int64(timeStampNs))
        localVideoSource!.capturer(videoCapturer!, didCapture: rtcVideoFrame)
    }
}
I succeeded in sending the VIDEO sample buffer to WebRTC, but I am stuck on the AUDIO part.
I did not find any way to send the AUDIO buffer to WebRTC.
Thank you so much for your answers.
I found a solution for this; follow the guideline at this link:
https://github.com/pixiv/webrtc/blob/branch-heads/pixiv-m78/README.pixiv.md
The WebRTC team no longer supports the native framework, so we need to modify the WebRTC source code and rebuild it to use it inside another app.
Luckily, I found someone who forked the WebRTC project and updated the code so that CMSampleBuffers can be passed from the broadcast extension to the RTCPeerConnection.

How to do SFSpeechRecognition and record audio at the same time in Swift?

So I've been trying to do speech recognition in Swift using the built-in SFSpeechRecognizer class while also downsampling and then recording the audio to a file, but I'm not well-versed enough in AVAudioEngine to figure it out.
I've gotten the speech recognition to work by itself, and I've gotten the audio recording to work by itself, but I can't get them to work together.
Here's my existing code in which I try to record - the remaining code is just the standard speech recognition type stuff:
let audioFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 16000, channels: 1, interleaved: false)
let mixer = AVAudioMixerNode()
audioEngine.attach(mixer)
audioEngine.connect(inputNode!, to: mixer, format: inputNode!.inputFormat(forBus: 0))

// 1 Connecting Mixer
audioEngine.connect(mixer, to: audioEngine.outputNode, format: audioFormat)

// 2 Recognition
inputNode!.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
    print("Testing if this tap works")
    self.recognitionRequest?.append(buffer)
}

// 3 Downsampling and recording
mixer.installTap(onBus: 0, bufferSize: 1024, format: audioFormat) { (buffer, when) in
    print(buffer)
    try? self.outputFile!.write(from: buffer)
}
If I comment out 3, then the speech recognition works, but otherwise 2 doesn't even run - the tap doesn't output anything. I also can't put the recognitionRequest in 3, because then the speech recognition throws an error. I see in the docs that each bus can only have one tap - how can I get around this? Should I use an AVAudioConnectionPoint? It isn't well documented.
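One workaround to consider, since a bus only supports one tap, is to keep a single tap on the mixer and feed both the recognizer and the output file from the same callback. This is a sketch under the assumption that SFSpeechAudioBufferRecognitionRequest accepts the downsampled 16 kHz mono buffers, which it generally does:
// Single tap doing both jobs: recognition and file recording.
mixer.installTap(onBus: 0, bufferSize: 1024, format: audioFormat) { (buffer, when) in
    self.recognitionRequest?.append(buffer)   // speech recognition
    try? self.outputFile?.write(from: buffer) // downsampled recording
}
With this arrangement the tap on inputNode can be removed entirely, so the one-tap-per-bus limit is never hit.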