I am trying to use AudioKit v5 to build a simple synthesizer app that plays a certain frequency whenever a button is pressed.
I would like there to be 28 buttons. However, I do not know if I should use the DunneAudioKit Synth class or create a dictionary of 28 AudioKit DynamicOscillators.
If I use the Synth class, I currently have no way of changing the waveform of the synth. If I use the dictionary of DynamicOscillators, I will have to start 28 oscillators and keep them running throughout the lifetime of the app. Neither scenario seems that great. One option only allows for a certain sound while the other one is energy inefficient.
Is there a better way to allow for polyphony using AudioKit? A way that is efficient and also able to produce many different kinds of sound? AudioKit SynthOne is a great example of what I am trying to achieve.
I downloaded "AudioKit Synth One - The Ultimate Guide" by Francis Preve and from that I learned that SynthOne uses 2 Oscillators, a Sub-Oscillator, an FM Pair, and a Noise Generator to produce its sounds. However, the eBook does not explain how to actually code a polyphonic synthesizer using these 5 generators. I know that SynthOne's source code is online. I have downloaded it, but it is a little too advanced for me to understand. However, if someone can help explain how to use just those 5 objects to create a polyphonic synthesizer, that would be incredible.
Thanks in advance.
I'm not sure how they did things in AudioKit 4, which Synth One uses. I would speculate that it has an internal oscillator bank when polyphonic mode is enabled, so essentially one oscillator instance per voice.
In the AudioKit 5 documentation it says Dunne Synth is the only polyphonic oscillator at this time, but I did add a WIP polyphonic oscillator example to the AudioKit Cookbook. I'm not sure how much of a resource hog it is. 28 instances seems excessive, so you might be able to get by with around 10 and change the frequencies for each voice with button presses.
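Here is a rough, untested sketch of that voice-pool idea. It assumes SoundpipeAudioKit's DynamicOscillator (frequency/amplitude parameters, start(), and a setWaveform(Table) method) plus a plain AudioKit Mixer; treat the exact names as assumptions:
import AudioKit
import SoundpipeAudioKit   // assumed home of DynamicOscillator in AudioKit 5

class OscillatorBank {
    let engine = AudioEngine()
    let mixer = Mixer()
    var voices: [DynamicOscillator] = []   // small fixed pool instead of 28 oscillators
    let voiceCount = 10

    init() {
        for _ in 0..<voiceCount {
            let osc = DynamicOscillator()
            osc.amplitude = 0                      // silent until a note is assigned
            // osc.setWaveform(Table(.sawtooth))   // assumed API for swapping the waveform
            voices.append(osc)
            mixer.addInput(osc)
        }
        engine.output = mixer
        try? engine.start()
        voices.forEach { $0.start() }              // keep them running; only amplitude changes
    }

    // Assign the first silent voice to the requested frequency.
    func noteOn(frequency: AUValue) {
        guard let free = voices.first(where: { $0.amplitude == 0 }) else { return } // voice stealing omitted
        free.frequency = frequency
        free.amplitude = 0.5
    }

    func noteOff(frequency: AUValue) {
        voices.first(where: { $0.frequency == frequency && $0.amplitude > 0 })?.amplitude = 0
    }
}
The pool size caps the CPU cost: ten always-running oscillators cover ten simultaneous notes, and the 28 buttons just map to frequencies.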
The third option would be to use something like AppleSampler or DunneSampler and make instruments based on single cycle wavetable audio files. This is more of a workaround and wouldn't give as much control over certain parameters, but it would be lighter on the resources.
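To sketch that third option: load a single-cycle wavetable into an AppleSampler and trigger it per button. This is untested, and the loadAudioFile(_:) and play/stop(noteNumber:velocity:channel:) calls plus the "Sawtooth.wav" file name are assumptions that may need adjusting for your AudioKit version:
import AudioKit
import AVFoundation

class WavetableSampler {
    let engine = AudioEngine()
    let sampler = AppleSampler()

    init() {
        engine.output = sampler
        do {
            // "Sawtooth.wav" is a hypothetical single-cycle wavetable bundled with the app.
            if let url = Bundle.main.url(forResource: "Sawtooth", withExtension: "wav") {
                let file = try AVAudioFile(forReading: url)
                try sampler.loadAudioFile(file)
            }
            try engine.start()
        } catch {
            Log("Sampler setup failed: \(error)")
        }
    }

    // Each of the 28 buttons calls this with its own MIDI note number.
    func buttonPressed(note: MIDINoteNumber) {
        sampler.play(noteNumber: note, velocity: 100, channel: 0)
    }

    func buttonReleased(note: MIDINoteNumber) {
        sampler.stop(noteNumber: note, channel: 0)
    }
}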
I had a similar question and tried several ways of making a versatile polyphonic sampler.
It's true that the AppleSampler and the DunneSampler support polyphony; however, I needed a sampler that I could control with more precision on a note-by-note basis, i.e. playing each "voice" with unique playback parameters such as playspeed.
I found that building a sampler based on the AudioPlayer was the right path for me. I created a member variable inside my sampler "voice" that keeps track of when that voice is busy: when a voice is assigned a note to play, it marks itself as busy, and when it finishes, the AudioPlayer's completion callback sets the voice's "busy" flag back to false.
I then use a "conductor" to find the first available voice that is not "busy" to play a sound.
Here is a snippet:
import AudioKit
import AudioKitUI
import AVFoundation
import Keyboard
import Combine
import SwiftUI
import DunneAudioKit
class AudioPlayerVoice: ObservableObject, HasAudioEngine {
// For audio playback
let engine = AudioEngine()
let player = AudioPlayer()
let variSpeed: VariSpeed
var voiceNumber = 0
var busy : Bool
init() {
variSpeed = VariSpeed(player)
engine.output = variSpeed
do {
try engine.start()
} catch {
Log("AudioKit did not start!")
}
busy = false
variSpeed.rate = 1.0
player.isBuffered = true
player.completionHandler = donePlaying
}
func play(buffer: AVAudioPCMBuffer) {
// Set this voice to busy so that new incoming notes are not played here
busy = true
// Load buffer into player
player.load(buffer: buffer)
// Compare buffer and audioplayer formats
// print("Player format 1: ")
// print(player.outputFormat)
// print("Buffer format: ")
// print(buffer.format)
// Set AudioPlayer format to be the same as buffer format
player.playerNode.engine?.connect( player.playerNode, to: player.mixerNode, format: buffer.format)
// Compare buffer and audioplayer formats again to see if the above line changed anything
// print("Player format 2: ")
// print(player.outputFormat)
// Play sound with a completion callback
player.play(completionCallbackType: .dataPlayedBack)
}
func donePlaying() {
print("done!")
busy = false
}
}
class AudioPlayerConductor: ObservableObject {
// Marked @Published so the View updates the label on changes
@Published private(set) var lastPlayed: String = "None"
let voiceCount = 16
var soundFileList: [String] = []
var buffers : [AVAudioPCMBuffer] = []
var players: [AudioPlayerVoice] = []
var sampleDict: [String: AVAudioPCMBuffer] = [:]
func loadAudioFiles() {
// Build audio file name list
let fileNameExtension = ".wav"
if let files = try? FileManager.default.contentsOfDirectory(atPath: Bundle.main.bundlePath + "/Samples" ){
// var counter = 0
///print("Files... " + files)
for file in files {
if file.hasSuffix(fileNameExtension) {
let name = file.prefix(file.count - fileNameExtension.count)
// add sound file name without extension to our soundFileList
soundFileList.append(String(name))
// get url for current sound
let url = Bundle.main.url(forResource: String(name), withExtension: "wav", subdirectory: "Samples")
// read audiofile into an AVAudioFile
let audioFile = try! AVAudioFile(forReading: url!)
// find the audio format and frame count
let audioFormat = audioFile.processingFormat
let audioFrameCount = UInt32(audioFile.length)
// create a new AVAudioPCMBuffer and read from the AVAudioFile into the AVAudioPCMBuffer
let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)
try! audioFile.read(into: audioFileBuffer!)
// update the sampleDict dictionary with "name" / "buffer" key / value
sampleDict[String(name)] = audioFileBuffer
//print("loading... " + name)
//print(".......... " + url!.absoluteString)
}
}
}
print("Loaded Samples:")
print(soundFileList)
}
func initializeSamplerVoices() {
for i in 1...voiceCount {
let newAudioPlayerVoice = AudioPlayerVoice()
newAudioPlayerVoice.voiceNumber = i
players.append(newAudioPlayerVoice)
}
}
func playWithAvailableVoice (bufferToPlay: AVAudioPCMBuffer, playspeed: Float) {
for i in 0...(voiceCount-1) {
if (!players[i].busy) {
players[i].variSpeed.rate = playspeed
players[i].play(buffer: bufferToPlay)
break
}
}
}
func playXY(x: Double, y: Double) {
let playspeed = Float(AliSwift.scale(x, 0.0, UIScreen.screenWidth, 0.1, 3.0))
let soundNumber = Int(AliSwift.scale(y, 0.0, UIScreen.screenHeight, 0 , Double(soundFileList.count - 1)))
let soundBuffer = sampleDict[soundFileList[soundNumber]]
playWithAvailableVoice(bufferToPlay: soundBuffer!, playspeed: playspeed)
}
init() {
loadAudioFiles()
initializeSamplerVoices()
}
}
struct ContentViewAudioPlayer: View {
@StateObject var conductor = AudioPlayerConductor()
// @StateObject var samplerVoice = AudioPlayerVoice()
var body: some View {
ZStack {
VStack {
Rectangle()
.fill(.red)
.frame(maxWidth: .infinity)
.frame(maxHeight: .infinity)
.onTapGesture { location in
print("Tapped at \(location)")
let someSound = conductor.sampleDict.randomElement()!
let someSoundName = someSound.key
let someSoundBuffer = someSound.value
print("Playing: " + someSoundName)
conductor.playXY(x: location.x, y: location.y)
}
}
.onAppear {
// conductor.start()
}
.onDisappear {
// conductor.stop()
}
}
}
struct ContentViewAudioPlayer_Previews: PreviewProvider {
static var previews: some View {
ContentViewAudioPlayer()
}
}
I am trying to figure out how to use Apple's Core Audio APIs to record and play back linear PCM audio without any file I/O. (The recording side seems to work just fine.)
The code I have is pretty short, and it works somewhat. However, I am having trouble identifying the source of clicks and pops in the output. I've been beating my head against this for many days with no success.
I have posted a git repo here, with a command-line program that shows where I'm at: https://github.com/maxharris9/AudioRecorderPlayerSwift/tree/main/AudioRecorderPlayerSwift
I put in a couple of functions to prepopulate the recording. The tone generator (makeWave) and noise generator (makeNoise) are just in here as debugging aids. I'm ultimately trying to identify the source of the messed up output when you play back a recording in audioData:
// makeWave(duration: 30.0, frequency: 441.0) // appends to `audioData`
// makeNoise(frameCount: Int(44100.0 * 30)) // appends to `audioData`
_ = Recorder() // appends to `audioData`
_ = Player() // reads from `audioData`
Here's the player code:
var lastIndexRead: Int = 0
func outputCallback(inUserData: UnsafeMutableRawPointer?, inAQ: AudioQueueRef, inBuffer: AudioQueueBufferRef) {
guard let player = inUserData?.assumingMemoryBound(to: Player.PlayingState.self) else {
print("missing user data in output callback")
return
}
let sliceStart = lastIndexRead
let sliceEnd = min(audioData.count, lastIndexRead + bufferByteSize - 1)
print("slice start:", sliceStart, "slice end:", sliceEnd, "audioData.count", audioData.count)
if sliceEnd >= audioData.count {
player.pointee.running = false
print("found end of audio data")
return
}
let slice = Array(audioData[sliceStart ..< sliceEnd])
let sliceCount = slice.count
// doesn't fix it
// audioData[sliceStart ..< sliceEnd].withUnsafeBytes {
// inBuffer.pointee.mAudioData.copyMemory(from: $0.baseAddress!, byteCount: Int(sliceCount))
// }
memcpy(inBuffer.pointee.mAudioData, slice, sliceCount)
inBuffer.pointee.mAudioDataByteSize = UInt32(sliceCount)
lastIndexRead += sliceCount + 1
// enqueue the buffer, or re-enqueue it if it's a used one
check(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil))
}
struct Player {
struct PlayingState {
var packetPosition: UInt32 = 0
var running: Bool = false
var start: Int = 0
var end: Int = Int(bufferByteSize)
}
init() {
var playingState: PlayingState = PlayingState()
var queue: AudioQueueRef?
// this doesn't help
// check(AudioQueueNewOutput(&audioFormat, outputCallback, &playingState, CFRunLoopGetMain(), CFRunLoopMode.commonModes.rawValue, 0, &queue))
check(AudioQueueNewOutput(&audioFormat, outputCallback, &playingState, nil, nil, 0, &queue))
var buffers: [AudioQueueBufferRef?] = Array<AudioQueueBufferRef?>.init(repeating: nil, count: BUFFER_COUNT)
print("Playing\n")
playingState.running = true
for i in 0 ..< BUFFER_COUNT {
check(AudioQueueAllocateBuffer(queue!, UInt32(bufferByteSize), &buffers[i]))
outputCallback(inUserData: &playingState, inAQ: queue!, inBuffer: buffers[i]!)
if !playingState.running {
break
}
}
check(AudioQueueStart(queue!, nil))
repeat {
CFRunLoopRunInMode(CFRunLoopMode.defaultMode, BUFFER_DURATION, false)
} while playingState.running
// delay to ensure queue emits all buffered audio
CFRunLoopRunInMode(CFRunLoopMode.defaultMode, BUFFER_DURATION * Double(BUFFER_COUNT + 1), false)
check(AudioQueueStop(queue!, true))
check(AudioQueueDispose(queue!, true))
}
}
I captured the audio with Audio Hijack, and noticed that the jumps are indeed correlated with the size of the buffer:
Why is this happening, and what can I do to fix it?
I believe you were beginning to zero in on, or at least suspect, the cause of the popping you are hearing: it's caused by discontinuities in your waveform.
My initial hunch was that you were generating the buffers independently (i.e. assuming that each buffer starts at time=0), but I checked out your code and it wasn't that. I suspect some of the calculations in makeWave were at fault. To check this theory I replaced your makeWave with the following:
func makeWave(offset: Double, numSamples: Int, sampleRate: Float64, frequency: Float64, numChannels: Int) -> [Int16] {
var data = [Int16]()
for sample in 0..<numSamples / numChannels {
// time in s
let t = offset + Double(sample) / sampleRate
let value = Double(Int16.max) * sin(2 * Double.pi * frequency * t)
for _ in 0..<numChannels {
data.append(Int16(value))
}
}
return data
}
This function removes the double loop in the original, accepts an offset so it knows which part of the wave is being generated and makes some changes to the sampling of the sine wave.
When Player is modified to use this function you get a lovely steady tone. I'll add the changes to player soon. I can't in good conscience show the quick and dirty mess it is now to the public.
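For reference, here is a rough sketch of how the offset-aware makeWave might be driven to prefill the repo's audioData store in chunks while keeping the phase continuous. The chunk size and chunk count are arbitrary, and it assumes audioData is the [Int16] array from the question:
// Prefill `audioData` chunk by chunk; `timeOffset` carries the phase across chunks.
var timeOffset = 0.0
let chunkSamples = 8192           // total Int16 samples per chunk (both channels)
let sampleRate = 44100.0
let channels = 2

for _ in 0 ..< 100 {
    let chunk = makeWave(offset: timeOffset,
                         numSamples: chunkSamples,
                         sampleRate: sampleRate,
                         frequency: 441.0,
                         numChannels: channels)
    audioData.append(contentsOf: chunk)
    // Each chunk covers (samples per channel) / sampleRate seconds.
    timeOffset += Double(chunkSamples / channels) / sampleRate
}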
Based on your comments below I refocused on your player. The issue was that the audio buffers expect byte counts but the slice count and some other calculations were based on Int16 counts. The following version of outputCallback will fix it. Concentrate on the use of the new variable bytesPerChannel.
func outputCallback(inUserData: UnsafeMutableRawPointer?, inAQ: AudioQueueRef, inBuffer: AudioQueueBufferRef) {
guard let player = inUserData?.assumingMemoryBound(to: Player.PlayingState.self) else {
print("missing user data in output callback")
return
}
let bytesPerChannel = MemoryLayout<Int16>.size
let sliceStart = lastIndexRead
let sliceEnd = min(audioData.count, lastIndexRead + bufferByteSize/bytesPerChannel)
if sliceEnd >= audioData.count {
player.pointee.running = false
print("found end of audio data")
return
}
let slice = Array(audioData[sliceStart ..< sliceEnd])
let sliceCount = slice.count
print("slice start:", sliceStart, "slice end:", sliceEnd, "audioData.count", audioData.count, "slice count:", sliceCount)
// need to be careful to convert from counts of Ints to bytes
memcpy(inBuffer.pointee.mAudioData, slice, sliceCount*bytesPerChannel)
inBuffer.pointee.mAudioDataByteSize = UInt32(sliceCount*bytesPerChannel)
lastIndexRead += sliceCount
// enqueue the buffer, or re-enqueue it if it's a used one
check(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil))
}
I did not look at the Recorder code, but you may want to check if the same sort of error crept in there.
I basically followed this great tutorial on VNRecognizeTextRequest and modified some things:
https://bendodson.com/weblog/2019/06/11/detecting-text-with-vnrecognizetextrequest-in-ios-13/
I am trying to recognise text from devices with seven-segment-style displays, which seems to get a bit tricky for this framework. Often it works, but numbers with a comma are hard, especially if there's a gap as well. I'm wondering whether there is a possibility to "train" this recognition engine. Another possibility might be to somehow tell it to specifically look for numbers; maybe then it could focus more processing power on that instead of generically looking for text?
I use this modified code for the request:
ocrRequest = VNRecognizeTextRequest { (request, error) in
guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
for observation in observations {
guard let topCandidate = observation.topCandidates(1).first else { continue }
let topCandidateText = topCandidate.string
if let float = Float(topCandidateText), topCandidate.confidence > self.bestConfidence {
self.bestCandidate = float
self.bestConfidence = topCandidate.confidence
}
}
if self.bestConfidence >= 0.5 {
self.captureSession?.stopRunning()
DispatchQueue.main.async {
self.found(measurement: self.bestCandidate!)
}
}
}
ocrRequest.recognitionLevel = .accurate
ocrRequest.minimumTextHeight = 1/10
ocrRequest.recognitionLanguages = ["en-US", "en-GB"]
ocrRequest.usesLanguageCorrection = true
There are 3 global variables in this class regarding the text recognition:
private var ocrRequest = VNRecognizeTextRequest(completionHandler: nil)
private var bestConfidence: Float = 0
private var bestCandidate: Float?
Thanks in advance for your answers, even though this is not directly code-related, but more concept-related (i.e. "am I doing something wrong / did I overlook an important feature?" etc.).
Example image that works:
Example that half works:
(recognises 58)
Example that does not work:
(it has a very low confidence for "91" and often thinks it's just 9 or 9!)
I updated to the latest version of AudioKit 4.5, and my AudioKit class that is meant to listen to the microphone amplitude is now printing 55: EXCEPTION (-1): "" infinitely on the console. The app doesn't crash or anything, but it keeps logging that.
My app is a video camera app that records using the GPUImage library.
The logs appear only when I start recording, for some reason.
In addition, my onAmplitudeUpdate callback method no longer outputs anything, just 0.0 values. This didn't happen before updating AudioKit. Any ideas?
Here is my class:
// G8Audiokit.swift
// GenerateToolkit
//
// Created by Omar Juarez Ortiz on 2017-08-03.
// Copyright © 2017 All rights reserved.
//
import Foundation
import AudioKit
class G8Audiokit{
//Variables for Audio audioAnalysis
var microphone: AKMicrophone! // Device Microphone
var amplitudeTracker: AKAmplitudeTracker! // Tracks the amplitude of the microphone
var signalBooster: AKBooster! // boosts the signal
var audioAnalysisTimer: Timer? // Continuously calls audioAnalysis function
let amplitudeBuffSize = 10 // A smaller buffer yields more amplitude responsiveness but more instability; a higher value responds more slowly but is smoother
var amplitudeBuffer: [Double] // This stores a rolling window of amplitude values, used to get average amplitude
public var onAmplitudeUpdate: ((_ value: Float) -> ())?
static let sharedInstance = G8Audiokit()
private init(){ //private because that way the class can only be initialized by itself.
self.amplitudeBuffer = [Double](repeating: 0.0, count: amplitudeBuffSize)
startAudioAnalysis()
}
// public override init() {
// // Initialize the audio buffer with zeros
//
// }
/**
Set up AudioKit Processing Pipeline and start the audio analysis.
*/
func startAudioAnalysis(){
stopAudioAnalysis()
// Settings
AKSettings.bufferLength = .medium // Sets the audio signal buffer size
do {
try AKSettings.setSession(category: .playAndRecord)
} catch {
AKLog("Could not set session category.")
}
// ----------------
// Input + Pipeline
// Initialize the built-in Microphone
microphone = AKMicrophone()
// Pre-processing
signalBooster = AKBooster(microphone)
signalBooster.gain = 5.0 // When video recording starts, the signal gets boosted to the equivalent of 5.0, so we're setting it to 5.0 here and changing it to 1.0 when we start video recording.
// Filter out anything outside human voice range
let highPass = AKHighPassFilter(signalBooster, cutoffFrequency: 55) // Lowered this a bit to be more sensitive to bass-drums
let lowPass = AKLowPassFilter(highPass, cutoffFrequency: 255)
// At this point you don't have much signal left, so you balance it against the original signal!
let rebalanced = AKBalancer(lowPass, comparator: signalBooster)
// Track the amplitude of the rebalanced signal, we use this value for audio reactivity
amplitudeTracker = AKAmplitudeTracker(rebalanced)
// Mute the audio that gets routed to the device output, preventing feedback
let silence = AKBooster(amplitudeTracker, gain:0)
// We need to complete the chain, routing silenced audio to the output
AudioKit.output = silence
// Start the chain and timer callback
do{ try AudioKit.start(); }
catch{}
audioAnalysisTimer = Timer.scheduledTimer(timeInterval: 0.01,
target: self,
selector: #selector(audioAnalysis),
userInfo: nil,
repeats: true)
// Put the timer on the main thread so UI updates don't interrupt
RunLoop.main.add(audioAnalysisTimer!, forMode: RunLoopMode.commonModes)
}
// Call this when closing the app or going to background
public func stopAudioAnalysis(){
audioAnalysisTimer?.invalidate()
AudioKit.disconnectAllInputs() // Disconnect all AudioKit components, so they can be relinked when we call startAudioAnalysis()
}
// This is called on the audioAnalysisTimer
@objc func audioAnalysis() {
writeToBuffer(val: amplitudeTracker.amplitude) // Write an amplitude value to the rolling buffer
let val = getBufferAverage()
onAmplitudeUpdate?(Float(val))
}
// Writes amplitude values to a rolling window buffer, writes to index 0 and pushes the previous values to the right, removes the last value to preserve buffer length.
func writeToBuffer(val: Double) {
for (index, _) in amplitudeBuffer.enumerated() {
if (index == 0) {
amplitudeBuffer.insert(val, at: 0)
_ = amplitudeBuffer.popLast()
}
else if (index < amplitudeBuffer.count-1) {
amplitudeBuffer.rearrange(from: index-1, to: index+1)
}
}
}
// Returns the average of the amplitudeBuffer, resulting in a smoother audio reactivity signal
func getBufferAverage() -> Double {
var avg:Double = 0.0
for val in amplitudeBuffer {
avg = avg + val
}
avg = avg / Double(amplitudeBuffer.count)
return avg
}
}
I'm capturing audio from AKLazyTap and rendering the accumulated [AVAudioPCMBuffer] to an audio file, in the background, while my app's audio is running. This works great, but I want to add fade in/out to clean up the result. I see the convenience extension for adding fades to a single AVAudioPCMBuffer, but I'm not sure how I'd do it on an array. I'd thought to concatenate the buffers, but there doesn't appear to be support for that. Does anyone know if that's currently possible? Basically it would require something similar to copy(from:readOffset:frames), but would need to have a write offset as well...
Or maybe there's an easier way?
UPDATE
Okay, after studying some related AK code, I tried directly copying buffer data over to a single, long buffer, then applying the fade convenience function. But this gives me an empty (well, 4k) file. Is there some obvious error here that I'm just not seeing?
func renderBufferedAudioToFile(_ audioBuffers: [AVAudioPCMBuffer], withStartOffset startOffset: Int, endOffset: Int, fadeIn: Float64, fadeOut: Float64, atURL url: URL) {
// strip off the file name
let name = String(url.lastPathComponent.split(separator: ".")[0])
var url = self.module.stateManager.audioCacheDirectory
// UNCOMPRESSED
url = url.appendingPathComponent("\(name).caf")
let format = Conductor.sharedInstance.sourceMixer.avAudioNode.outputFormat(forBus: 0)
var settings = format.settings
settings["AVLinearPCMIsNonInterleaved"] = false
// temp buffer for fades
let totalFrameCapacity = audioBuffers.reduce(0) { $0 + $1.frameLength }
guard let tempAudioBufferForFades = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: totalFrameCapacity) else {
print("Failed to create fade buffer!")
return
}
// write ring buffer to file.
let file = try! AVAudioFile(forWriting: url, settings: settings)
var writeOffset: AVAudioFrameCount = 0
for i in 0 ..< audioBuffers.count {
var buffer = audioBuffers[i]
let channelCount = Int(buffer.format.channelCount)
if i == 0 && startOffset != 0 {
// copy a subset of samples in the buffer
if let subset = buffer.copyFrom(startSample: AVAudioFrameCount(startOffset)) {
buffer = subset
}
} else if i == audioBuffers.count - 1 && endOffset != 0 {
if let subset = buffer.copyTo(count: AVAudioFrameCount(endOffset)) {
buffer = subset
}
}
// write samples into single, long buffer
for i in 0 ..< buffer.frameLength {
for n in 0 ..< channelCount {
tempAudioBufferForFades.floatChannelData?[n][Int(i + writeOffset)] = (buffer.floatChannelData?[n][Int(i)])!
}
}
print("buffer \(i), writeOffset = \(writeOffset)")
writeOffset = writeOffset + buffer.frameLength
}
// update!
tempAudioBufferForFades.frameLength = totalFrameCapacity
if let bufferWithFades = tempAudioBufferForFades.fade(inTime: fadeIn, outTime: fadeOut) {
try! file.write(from: bufferWithFades)
}
}
I'm trying to have a handler in my Mac OS X app written in Swift for a global (system-wide) hotkey combo but I just cannot find proper documentation for it. I've read that I'd have to mess around in some legacy Carbon API for it, is there no better way? Can you show me some proof of concept Swift code? Thanks in advance!
Since Swift 2.0, you can now pass a function pointer to C APIs.
var gMyHotKeyID = EventHotKeyID()
gMyHotKeyID.signature = OSType("swat".fourCharCodeValue)
gMyHotKeyID.id = UInt32(keyCode)
var eventType = EventTypeSpec()
eventType.eventClass = OSType(kEventClassKeyboard)
eventType.eventKind = OSType(kEventHotKeyPressed)
// Install handler.
InstallEventHandler(GetApplicationEventTarget(), {(nextHandler, theEvent, userData) -> OSStatus in
var hkCom = EventHotKeyID()
GetEventParameter(theEvent, EventParamName(kEventParamDirectObject), EventParamType(typeEventHotKeyID), nil, sizeof(EventHotKeyID), nil, &hkCom)
// Check that hkCom is indeed your hotkey ID and handle it.
return noErr
}, 1, &eventType, nil, nil)
// Register hotkey.
let status = RegisterEventHotKey(UInt32(keyCode), UInt32(modifierKeys), gMyHotKeyID, GetApplicationEventTarget(), 0, &hotKeyRef)
I don't believe you can do this in 100% Swift today. You'll need to call InstallEventHandler() or CGEventTapCreate(), and both of those require a CFunctionPointer, which can't be created in Swift. Your best plan is to use established ObjC solutions such as DDHotKey and bridge to Swift.
You can try using NSEvent.addGlobalMonitorForEventsMatchingMask(handler:), but that only makes copies of events. You can't consume them. That means the hotkey will also be passed along to the currently active app, which can cause problems. Here's an example, but I recommend the ObjC approach; it's almost certainly going to work better.
let keycode = UInt16(kVK_ANSI_X)
let keymask: NSEventModifierFlags = .CommandKeyMask | .AlternateKeyMask | .ControlKeyMask
func handler(event: NSEvent!) {
if event.keyCode == self.keycode &&
event.modifierFlags & self.keymask == self.keymask {
println("PRESSED")
}
}
// ... to set it up ...
let options = NSDictionary(object: kCFBooleanTrue, forKey: kAXTrustedCheckOptionPrompt.takeUnretainedValue() as NSString) as CFDictionaryRef
let trusted = AXIsProcessTrustedWithOptions(options)
if (trusted) {
NSEvent.addGlobalMonitorForEventsMatchingMask(.KeyDownMask, handler: self.handler)
}
This also requires that accessibility services be approved for this app. It also doesn't capture events that are sent to your own application, so you have to either capture them with your responder chain, or use addLocalMonitorForEventsMatchingMask(handler:) to add a local handler.
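For the local side, here is a small sketch using the current API spelling, NSEvent.addLocalMonitorForEvents(matching:handler:); the keycode/keymask values mirror the ones defined above:
import Cocoa
import Carbon.HIToolbox

let keycode = UInt16(kVK_ANSI_X)
let keymask: NSEvent.ModifierFlags = [.command, .option, .control]

// Keep the returned monitor if you want to remove it later with NSEvent.removeMonitor(_:).
let localMonitor = NSEvent.addLocalMonitorForEvents(matching: .keyDown) { event in
    if event.keyCode == keycode && event.modifierFlags.isSuperset(of: keymask) {
        print("PRESSED (locally)")
        return nil    // consume the hotkey inside our own app
    }
    return event      // pass everything else through
}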
The following code works for me for Swift 5.0.1. This solution is the combination of the solution from the accepted answer by Charlie Monroe and the recommendation by Rob Napier to use DDHotKey.
DDHotKey seems to work out of the box but it had one limitation that I had to change: the eventKind is hardcoded to kEventHotKeyReleased while I needed both kEventHotKeyPressed and kEventHotKeyReleased event types.
eventSpec.eventKind = kEventHotKeyReleased;
If you want to handle both Pressed and Released events, just add a second InstallEventHandler call that registers the other event kind (a sketch of this follows the complete example below).
This is the complete example of the code that registers the "Command + R" key for the kEventHotKeyReleased type.
import Carbon
extension String {
/// This converts string to UInt as a fourCharCode
public var fourCharCodeValue: Int {
var result: Int = 0
if let data = self.data(using: String.Encoding.macOSRoman) {
data.withUnsafeBytes({ (rawBytes) in
let bytes = rawBytes.bindMemory(to: UInt8.self)
for i in 0 ..< data.count {
result = result << 8 + Int(bytes[i])
}
})
}
return result
}
}
class HotkeySolution {
static
func getCarbonFlagsFromCocoaFlags(cocoaFlags: NSEvent.ModifierFlags) -> UInt32 {
let flags = cocoaFlags.rawValue
var newFlags: Int = 0
if ((flags & NSEvent.ModifierFlags.control.rawValue) > 0) {
newFlags |= controlKey
}
if ((flags & NSEvent.ModifierFlags.command.rawValue) > 0) {
newFlags |= cmdKey
}
if ((flags & NSEvent.ModifierFlags.shift.rawValue) > 0) {
newFlags |= shiftKey;
}
if ((flags & NSEvent.ModifierFlags.option.rawValue) > 0) {
newFlags |= optionKey
}
if ((flags & NSEvent.ModifierFlags.capsLock.rawValue) > 0) {
newFlags |= alphaLock
}
return UInt32(newFlags);
}
static func register() {
var hotKeyRef: EventHotKeyRef?
let modifierFlags: UInt32 =
getCarbonFlagsFromCocoaFlags(cocoaFlags: NSEvent.ModifierFlags.command)
let keyCode = kVK_ANSI_R
var gMyHotKeyID = EventHotKeyID()
gMyHotKeyID.id = UInt32(keyCode)
// Not sure what "swat" vs "htk1" do.
gMyHotKeyID.signature = OSType("swat".fourCharCodeValue)
// gMyHotKeyID.signature = OSType("htk1".fourCharCodeValue)
var eventType = EventTypeSpec()
eventType.eventClass = OSType(kEventClassKeyboard)
eventType.eventKind = OSType(kEventHotKeyReleased)
// Install handler.
InstallEventHandler(GetApplicationEventTarget(), {
(nextHandler, theEvent, userData) -> OSStatus in
// var hkCom = EventHotKeyID()
// GetEventParameter(theEvent,
// EventParamName(kEventParamDirectObject),
// EventParamType(typeEventHotKeyID),
// nil,
// MemoryLayout<EventHotKeyID>.size,
// nil,
// &hkCom)
NSLog("Command + R Released!")
return noErr
/// Check that hkCom is indeed your hotkey ID and handle it.
}, 1, &eventType, nil, nil)
// Register hotkey.
let status = RegisterEventHotKey(UInt32(keyCode),
modifierFlags,
gMyHotKeyID,
GetApplicationEventTarget(),
0,
&hotKeyRef)
assert(status == noErr)
}
}
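For completeness, the extra handler for the Pressed case mentioned above would sit next to the existing InstallEventHandler call inside register() and look roughly like this:
// Second handler: same registered hotkey, but for the Pressed event kind.
var pressedEventType = EventTypeSpec()
pressedEventType.eventClass = OSType(kEventClassKeyboard)
pressedEventType.eventKind = OSType(kEventHotKeyPressed)
InstallEventHandler(GetApplicationEventTarget(), {
    (nextHandler, theEvent, userData) -> OSStatus in
    NSLog("Command + R Pressed!")
    return noErr
}, 1, &pressedEventType, nil, nil)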
A quick Swift 3 update for the setup:
let opts = NSDictionary(object: kCFBooleanTrue, forKey: kAXTrustedCheckOptionPrompt.takeUnretainedValue() as NSString) as CFDictionary
guard AXIsProcessTrustedWithOptions(opts) == true else { return }
NSEvent.addGlobalMonitorForEvents(matching: .keyDown, handler: self.handler)
I maintain this Swift package that makes it easy to both add global keyboard shortcuts to your app and also let the user set their own.
import SwiftUI
import KeyboardShortcuts
// Declare the shortcut for strongly-typed access.
extension KeyboardShortcuts.Name {
static let toggleUnicornMode = Self("toggleUnicornMode")
}
@main
struct YourApp: App {
@StateObject private var appState = AppState()
var body: some Scene {
WindowGroup {
// …
}
Settings {
SettingsScreen()
}
}
}
@MainActor
final class AppState: ObservableObject {
init() {
// Register the listener.
KeyboardShortcuts.onKeyUp(for: .toggleUnicornMode) { [self] in
isUnicornMode.toggle()
}
}
}
// Present a view where the user can set the shortcut they want.
struct SettingsScreen: View {
var body: some View {
Form {
HStack(alignment: .firstTextBaseline) {
Text("Toggle Unicorn Mode:")
KeyboardShortcuts.Recorder(for: .toggleUnicornMode)
}
}
}
}
SwiftUI is used in this example, but it also supports Cocoa.
Take a look at the HotKey library. You can simply use Carthage to integrate it into your own app.
HotKey Library
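A minimal usage sketch (assuming the library's HotKey(key:modifiers:) initializer and keyDownHandler property; Command-Option-R is just an arbitrary example):
import HotKey

// Keep a strong reference; the hotkey is unregistered when this object is deallocated.
let hotKey = HotKey(key: .r, modifiers: [.command, .option])

hotKey.keyDownHandler = {
    print("Command + Option + R pressed")
}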
There is a pretty hacky, but also pretty simple, workaround if your app has a menu (a small programmatic sketch follows the steps):
add a new MenuItem (maybe call it something like "Dummy for Hotkey")
in the attributes inspector, conveniently enter your hotkey in the Key Equivalent field
set Allowed when Hidden, Enabled and Hidden to true
link it with an IBAction to do whatever your hotkey is supposed to do
done!
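If you prefer to do the same thing in code rather than in Interface Builder, it looks roughly like this; AppDelegate.doHotkeyAction(_:) is a hypothetical @objc action of yours, and Command-Option-R is an arbitrary key equivalent:
import Cocoa

// Hidden menu item whose key equivalent acts as the in-app "hotkey".
let dummyItem = NSMenuItem(title: "Dummy for Hotkey",
                           action: #selector(AppDelegate.doHotkeyAction(_:)),
                           keyEquivalent: "r")
dummyItem.keyEquivalentModifierMask = [.command, .option]
dummyItem.isHidden = true
dummyItem.allowsKeyEquivalentWhenHidden = true
NSApp.mainMenu?.items.first?.submenu?.addItem(dummyItem)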