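I have a memory reference, mBuffers.mData (from an AudioUnit bufferList), declared in the OS X and iOS framework headers as an untyped mutable pointer (UnsafeMutablePointer<Void> in Swift 2, UnsafeMutableRawPointer? from Swift 3 on).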
What is an efficient way to write lots of Int16 values into memory referenced by this pointer?
A disassembly of this Swift source code:
for i in 0..<count {
    let x : Int16 = someFastCalculation()
    let loByte : Int32 = Int32(x) & 0x00ff
    let hiByte : Int32 = (Int32(x) >> 8) & 0x00ff
    memset(mBuffers.mData + 2 * i    , loByte, 1)
    memset(mBuffers.mData + 2 * i + 1, hiByte, 1)
}
shows lots of instructions setting up the memset() function calls (far more instructions than in my someFastCalculation). This is a loop inside a real-time audio callback, so efficient code to minimize latency and battery consumption is important.
Is there a faster way?
This Swift source allows array assignment of individual audio samples to an Audio Unit (or AUAudioUnit) audio buffer, and compiles down to a faster result than using memset.
let mutableData = UnsafeMutablePointer<Int16>(mBuffers.mData)
let sampleArray = UnsafeMutableBufferPointer<Int16>(
    start: mutableData,
    count: Int(mBuffers.mDataByteSize)/sizeof(Int16))
for i in 0..<count {
    let x : Int16 = mySampleSynthFunction(i)
    sampleArray[i] = x
}
More complete Gist here.
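For Swift 3 and later, where mData is an UnsafeMutableRawPointer? and sizeof() no longer exists, the same idea might look like the sketch below. The allocated block only stands in for mBuffers.mData so the snippet runs on its own; fillSamples and the synth closure are hypothetical names, not part of any Apple API.

```swift
import Foundation

// Hypothetical helper: binds raw buffer memory to Int16 once, then writes
// samples through a typed buffer pointer instead of calling memset per byte.
func fillSamples(_ rawData: UnsafeMutableRawPointer,
                 byteSize: Int,
                 synth: (Int) -> Int16) -> Int {
    let frameCount = byteSize / MemoryLayout<Int16>.stride
    let samples = UnsafeMutableBufferPointer(
        start: rawData.bindMemory(to: Int16.self, capacity: frameCount),
        count: frameCount)
    for i in 0..<frameCount {
        samples[i] = synth(i)
    }
    return frameCount
}

// Stand-in for an audio buffer (assumption: in a real render callback the
// pointer and size would come from mBuffers.mData and mBuffers.mDataByteSize).
func demoSamples() -> [Int16] {
    let frames = 8
    let byteSize = frames * MemoryLayout<Int16>.stride
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: byteSize,
        alignment: MemoryLayout<Int16>.alignment)
    defer { raw.deallocate() }
    _ = fillSamples(raw, byteSize: byteSize) { Int16($0 * 100) }
    let typed = raw.assumingMemoryBound(to: Int16.self)
    return Array(UnsafeBufferPointer(start: typed, count: frames))
}
```

Binding the memory once outside the loop keeps the per-sample work down to a single store, which is the point of the answer above.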
I need to perform a simple math operation on Data that contains RGB pixel data. Currently I'm doing it like so:
let imageMean: Float = 127.5
let imageStd: Float = 127.5
let rgbData: Data // Some data containing RGB pixels
let floats = (0..<rgbData.count).map {
    (Float(rgbData[$0]) - imageMean) / imageStd
}
return Data(bytes: floats, count: floats.count * MemoryLayout<Float>.size)
This works, but it's too slow. I was hoping I could use the Accelerate framework to calculate this faster, but have no idea how to do this. I reserved some space so that it's not allocated every time this function starts, like so:
inputBufferDataNormalized = malloc(width * height * 3) // 3 channels RGB
I tried a few functions, like vDSP_vasm, but I couldn't make them work. Can someone show me how to use them? Basically I need to replace this map function, because it takes too long. It would also be great to reuse the pre-allocated space every time.
Following up on my comment on your other related question. You can use SIMD to parallelize the operation, but you'd need to split the original array into chunks.
This is a simplified example that assumes that the array is exactly divisible by 64, for example, an array of 1024 elements:
let arr: [Float] = (0 ..< 1024).map { _ in Float.random(in: 0...1) }
let imageMean: Float = 127.5
let imageStd: Float = 127.5
var chunks = [SIMD64<Float>]()
chunks.reserveCapacity(arr.count / 64)
for i in stride(from: 0, to: arr.count, by: 64) {
    let v = SIMD64.init(arr[i ..< i+64])
    chunks.append((v - imageMean) / imageStd) // same calculation using SIMD
}
You can now access each chunk with a subscript:
var results: [Float] = []
for chunk in chunks {
    for i in chunk.indices {
        results.append(chunk[i])
    }
}
Of course, you'd need to deal with a remainder if the array isn't exactly divisible by 64.
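One way to deal with that remainder is to run the SIMD loop over the largest multiple of the vector width and finish the tail with plain scalar code. The sketch below uses SIMD8 rather than SIMD64 only to keep the example small; the function name normalize is made up for illustration.

```swift
// Normalizes every element as (x - mean) / std, using SIMD8 for full chunks
// and a scalar loop for whatever is left over at the end.
func normalize(_ values: [Float], mean: Float, std: Float) -> [Float] {
    let width = 8
    var result = [Float]()
    result.reserveCapacity(values.count)

    // Index where the last full chunk ends.
    let vectorEnd = values.count - values.count % width

    var i = 0
    while i < vectorEnd {
        let v = SIMD8<Float>(values[i ..< i + width])
        let normalized = (v - mean) / std
        for lane in normalized.indices {
            result.append(normalized[lane])
        }
        i += width
    }

    // Scalar tail: fewer than `width` elements remain.
    for j in vectorEnd ..< values.count {
        result.append((values[j] - mean) / std)
    }
    return result
}

// 10 elements: one full SIMD8 chunk plus a tail of 2.
let normalizedPixels = normalize((0..<10).map { Float($0) * 25.5 },
                                 mean: 127.5, std: 127.5)
```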
I have found a way to do this using Accelerate. First I reserve space for the converted buffer, like so:
var inputBufferDataRawFloat = [Float](repeating: 0, count: width * height * 3)
Then I can use it like so:
let rawBytes = [UInt8](rgbData)
vDSP_vfltu8(rawBytes, 1, &inputBufferDataRawFloat, 1, vDSP_Length(rawBytes.count))
vDSP.add(inputBufferDataRawScalars.mean, inputBufferDataRawFloat, result: &inputBufferDataRawFloat)
vDSP.multiply(inputBufferDataRawScalars.std, inputBufferDataRawFloat, result: &inputBufferDataRawFloat)
return Data(bytes: inputBufferDataRawFloat, count: inputBufferDataRawFloat.count * MemoryLayout<Float>.size)
This works very fast. If anyone knows of a better function in Accelerate, please let me know. It needs to compute (A[n] + B) * C (or, to be exact, (A[n] - B) / C, which can be rewritten in the first form).
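A possible single-pass alternative is vDSP_vsmsa, which computes A[n] * B + C with scalar B and C. Since (A[n] - mean) / std equals A[n] * (1 / std) + (-mean / std), the add and multiply can be folded into one call. This is only a sketch: the function name normalize(bytes:mean:std:) is hypothetical, and the scalar fallback exists purely so the snippet also compiles where Accelerate is unavailable.

```swift
import Foundation
#if canImport(Accelerate)
import Accelerate
#endif

// Converts UInt8 pixel values to Float and normalizes them as (x - mean) / std.
func normalize(bytes: [UInt8], mean: Float, std: Float) -> [Float] {
    var output = [Float](repeating: 0, count: bytes.count)
    #if canImport(Accelerate)
    let n = vDSP_Length(bytes.count)
    var floats = [Float](repeating: 0, count: bytes.count)
    // UInt8 -> Float conversion.
    vDSP_vfltu8(bytes, 1, &floats, 1, n)
    // output[i] = floats[i] * scale + offset
    var scale = 1 / std
    var offset = -mean / std
    vDSP_vsmsa(floats, 1, &scale, &offset, &output, 1, n)
    #else
    // Portable fallback with the same result up to rounding.
    for i in bytes.indices {
        output[i] = (Float(bytes[i]) - mean) / std
    }
    #endif
    return output
}

let normalizedBytes = normalize(bytes: [0, 255], mean: 127.5, std: 127.5)
```

In real use the two Float arrays would be allocated once and reused between calls, as in the answer above, rather than created inside the function.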
I have created an app which I am using to take acoustic measurements. The app generates a log sine sweep stimulus, and when the user presses 'start' the app simultaneously plays the stimulus sound, and records the microphone input.
All fairly standard stuff. I am using Core Audio because down the line I want to really delve into different functionality, and potentially use multiple interfaces, so I have to start learning somewhere.
This is for iOS so I am creating an AUGraph with remoteIO Audio Unit for input and output. I have declared the audio formats, and they are correct as no errors are shown and the AUGraph initialises, starts, plays sound and records.
I have a render callback on the input scope to input 1 of my mixer. (ie, every time more audio is needed, the render callback is called and this reads a few samples into the buffer from my stimulus array of floats).
let genContext = Unmanaged.passRetained(self).toOpaque()
var genCallbackStruct = AURenderCallbackStruct(inputProc: genCallback,
                                               inputProcRefCon: genContext)
AudioUnitSetProperty(mixerUnit!, kAudioUnitProperty_SetRenderCallback,
                     kAudioUnitScope_Input, 1, &genCallbackStruct,
                     UInt32(MemoryLayout<AURenderCallbackStruct>.size))
I then have an input callback which is called every time the buffer is full on the output scope of the remoteIO input. This callback saves the samples to an array.
var inputCallbackStruct = AURenderCallbackStruct(inputProc: recordingCallback,
                                                 inputProcRefCon: context)
AudioUnitSetProperty(remoteIOUnit!, kAudioOutputUnitProperty_SetInputCallback,
                     kAudioUnitScope_Global, 0, &inputCallbackStruct,
                     UInt32(MemoryLayout<AURenderCallbackStruct>.size))
Once the stimulus reaches the last sample, the AUGraph is stopped, and then I write both the stimulus and the recorded array to separate WAV files so I can check my data. What I am finding is that there is currently about 3000 samples delay between the recorded input and the stimulus.
Whilst it is hard to see the start of the waveforms (both the speakers and the microphone may not detect that low), the ends of the stimulus (bottom WAV) and the recorded should roughly line up.
I realise there will be some propagation time for the audio, but at a 44100 Hz sample rate, 3000 samples is about 68 ms. Core Audio is meant to keep latency down.
So my question is this: can anybody account for this additional latency, which seems quite high?
My inputCallback is as follows:
let recordingCallback: AURenderCallback = { (
ioData ) -> OSStatus in
let audioObject = unsafeBitCast(inRefCon, to: AudioEngine.self)
var err: OSStatus = noErr
var bufferList = AudioBufferList(
mNumberBuffers: 1,
mBuffers: AudioBuffer(
mNumberChannels: UInt32(1),
mDataByteSize: 512,
mData: nil))
if let au: AudioUnit = audioObject.remoteIOUnit! {
err = AudioUnitRender(au,
let data = Data(bytes: bufferList.mBuffers.mData!, count: Int(bufferList.mBuffers.mDataByteSize))
let samples = data.withUnsafeBytes {
UnsafeBufferPointer<Int16>(start: $0, count: data.count / MemoryLayout<Int16>.size)
let factor = Float(Int16.max)
var floats: [Float] = Array(repeating: 0.0, count: samples.count)
for i in 0..<samples.count {
floats[i] = (Float(samples[i]) / factor)
var j = audioObject.in1BufIndex
let m = audioObject.in1BufSize
for i in 0..<(floats.count) {
audioObject.in1Buf[j] = Float(floats[i])
j += 1 ; if j >= m { j = 0 }
audioObject.in1BufIndex = j
audioObject.inputCallbackFrameSize = Int(frameCount)
audioObject.callbackcount += 1
var WindowSize = totalRecordSize / Int(frameCount)
if audioObject.callbackcount == WindowSize {
audioObject.running = false
return 0
So from when the engine starts, this callback should be called after the first set of data is collected from remoteIO. 512 samples as that is the default allocated buffer size. All it does is convert from the signed integer into Float, and save to a buffer. The value in1BufIndex is a reference to the last index in the array written to, and this is referenced and written to with each callback, to make sure the data in the array lines up.
Currently it seems about 3000 samples of silence is in the recorded array before the captured sweep is heard. Inspecting the recorded array by debugging in Xcode, all samples have values (and yes the first 3000 are very quiet), but somehow this doesn't add up.
Below is the generator Callback used to play my stimulus
let genCallback: AURenderCallback = { (
ioData) -> OSStatus in
let audioObject = unsafeBitCast(inRefCon, to: AudioEngine.self)
for buffer in UnsafeMutableAudioBufferListPointer(ioData!) {
var frames = buffer.mData!.assumingMemoryBound(to: Float.self)
var j = 0
if audioObject.stimulusReadIndex < (audioObject.Stimulus.count - Int(frameCount)){
for i in stride(from: 0, to: Int(frameCount), by: 1) {
frames[i] = Float((audioObject.Stimulus[j + audioObject.stimulusReadIndex]))
j += 1
audioObject.in2Buf[j + audioObject.stimulusReadIndex] = Float((audioObject.Stimulus[j + audioObject.stimulusReadIndex]))
audioObject.stimulusReadIndex += Int(frameCount)
return noErr;
There may be several things contributing to the round trip latency.
512 samples, or about 11.6 ms at 44100 Hz, is the time required to gather enough samples before remoteIO can call your callback.
Sound propagates at about 1 foot per millisecond, double that for a round trip.
The DAC has an output latency.
There is the time needed for the multiple ADCs (there’s more than 1 microphone on your iOS device) to sample and post-process the audio (for sigma-delta conversion, beam forming, equalization, etc.). The post-processing might be done in blocks, thus incurring the latency to gather enough samples (an undocumented number) for one block.
There’s possibly also added overhead latency in moving data (hardware DMA of some unknown block size?) between the ADC and system memory, as well as driver and OS context switching overhead.
There’s also a startup latency to power up the audio hardware subsystems (amplifiers, etc.), so it may be best to start playing and recording audio well before outputting your sound (frequency sweep).
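To put numbers on the items above, it can help to convert sample counts to milliseconds and see how much of the observed delay each known contribution explains. The helper below is a hypothetical sketch using only the figures from the question (3000 samples of delay, 512-sample buffers, 44100 Hz).

```swift
// Converts a number of samples to milliseconds at a given sample rate.
func milliseconds(forSamples samples: Int, sampleRate: Double) -> Double {
    return Double(samples) / sampleRate * 1000
}

let sampleRate = 44100.0

// Delay observed between stimulus and recording in the question.
let observedDelayMs = milliseconds(forSamples: 3000, sampleRate: sampleRate)

// Time needed to fill one 512-sample buffer before a callback can fire.
let oneBufferMs = milliseconds(forSamples: 512, sampleRate: sampleRate)

// How many buffer periods the observed delay corresponds to.
let buffersOfDelay = observedDelayMs / oneBufferMs

print(observedDelayMs, oneBufferMs, buffersOfDelay)
```

On iOS, AVAudioSession also exposes inputLatency, outputLatency and ioBufferDuration, which can be compared against a budget like this one; the remaining gap would then come from the undocumented hardware and startup effects listed above.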
Hi, I need to decompose a number into powers of 2 in Swift 5 for an iOS app I'm writing for a click-and-collect system.
The backend of this system is written in C# and uses the following to save a multi-pick list of options as a single number in the database, e.g.:
choosing salads for a filled roll on an order system works thus:
lettuce = 1
cucumber = 2
tomato = 4
sweetcorn = 8
onion = 16
Using this method, the choice (lettuce + tomato + onion) is saved in the database as 21 (1 + 4 + 16).
At the other end I use a C# function to decode it:
for(int j = 0; j < 32; j++)
int mask = 1 << j;
I need to convert this function to Swift 5 to integrate the decoder into my iOS app.
Any help would be greatly appreciated.
In Swift, these bit fields are expressed as option sets, which are types that conform to the OptionSet protocol. Here is an example for your use case:
struct Veggies: OptionSet {
    let rawValue: UInt32
    static let lettuce = Veggies(rawValue: 1 << 0)
    static let cucumber = Veggies(rawValue: 1 << 1)
    static let tomato = Veggies(rawValue: 1 << 2)
    static let sweetcorn = Veggies(rawValue: 1 << 3)
    static let onion = Veggies(rawValue: 1 << 4)
}
let someVeggies: Veggies = [.lettuce, .tomato]
print(someVeggies) // => Veggies(rawValue: 5)
print(Veggies.onion.rawValue) // => 16
OptionSets are better than just using their raw values, for two reasons:
1) They standardize the names of the cases and give a consistent and easy way to interact with these values.
2) OptionSet derives from the SetAlgebra protocol and provides default implementations for many useful methods such as union, intersection, subtract, contains, etc.
I would caution against this design, however. Option sets are useful only when there's a really small number of flags (fewer than 64) that you can't foresee expanding. They're really basic, can't store any payload besides "x exists, or it doesn't", and they're primarily intended for use cases that are very sensitive to performance and memory use, which are quite rare these days. I would recommend using regular objects (a Veggie class storing a name and any other relevant data) instead.
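To decode a stored number such as 21 back into the chosen salads, the raw value can be wrapped in the option set and tested with contains. A minimal sketch follows; the `named` lookup table and the `names(forRawValue:)` function are illustrative additions, not part of OptionSet itself.

```swift
struct Veggies: OptionSet {
    let rawValue: UInt32
    static let lettuce   = Veggies(rawValue: 1 << 0)
    static let cucumber  = Veggies(rawValue: 1 << 1)
    static let tomato    = Veggies(rawValue: 1 << 2)
    static let sweetcorn = Veggies(rawValue: 1 << 3)
    static let onion     = Veggies(rawValue: 1 << 4)

    // Illustrative table pairing each flag with a display name.
    static let named: [(flag: Veggies, name: String)] = [
        (.lettuce, "lettuce"), (.cucumber, "cucumber"), (.tomato, "tomato"),
        (.sweetcorn, "sweetcorn"), (.onion, "onion"),
    ]
}

// Returns the names of all flags set in the stored database value.
func names(forRawValue raw: UInt32) -> [String] {
    let chosen = Veggies(rawValue: raw)
    return Veggies.named
        .filter { chosen.contains($0.flag) }
        .map { $0.name }
}

let decodedSalads = names(forRawValue: 21)   // 21 = 1 + 4 + 16
```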
You can just use a while loop, like this :
var j = 0
while j < 32 {
    let mask = 1 << j
    j += 1
}
Here is a link about loops and control flow in Swift 5.
Hi, I figured it out. This is my final solution:
var salads = "" as String
let value = 127
var j = 0
// 1 << j is 0 once j >= Int.bitWidth, so the iterations past bit 63 never match
while j < 256 {
    let mask = 1 << j
    if (value & mask) != 0 {
        salads.append(String(mask) + ",")
    }
    j += 1
}
salads = String(salads.dropLast()) // removes the final ","
This now feeds nicely into the IN clause in my SQL query. Thank you all for your help! :)
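A shorter variant that visits only the bits that are actually set is sketched below. It assumes the stored value is non-negative; powersOfTwo(in:) is a hypothetical helper name.

```swift
// Splits a non-negative integer into the powers of two that sum to it.
func powersOfTwo(in value: Int) -> [Int] {
    var remaining = value
    var powers = [Int]()
    while remaining > 0 {
        // Isolate the lowest set bit, e.g. 20 (0b10100) gives 4.
        let lowest = remaining & -remaining
        powers.append(lowest)
        // Clear that bit and continue with the rest.
        remaining &= remaining - 1
    }
    return powers
}

// Comma-separated list, ready for use in an SQL IN clause.
let inClause = powersOfTwo(in: 21)
    .map { String($0) }
    .joined(separator: ",")
```

Using joined(separator:) avoids having to drop the trailing comma afterwards.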
I am trying to implement this code, which I got from an Apple WWDC video. However, the video is from 2016 and I think the syntax has changed. How do I call sizeof(Float)? This produces an error.
func render(buffer:AudioBuffer){
let nFrames = Int(buffer.mDataByteSize) / sizeof(Float)
var ptr = UnsafeMutableRawPointer(buffer.mData)
var j = self.counter
let cycleLength = self.sampleRate / self.frequency
let halfCycleLength = cycleLength / 2
let amp = self.amplitude, minusAmp = -amp
for _ in 0..<nFrames{
if j < halfCycleLength{
ptr.pointee = amp
} else {
ptr.pointee = minusAmp
ptr = ptr.successor()
j += 1.0
if j > cycleLength {
self.counter = j
The sizeof() function is no longer supported in Swift.
As Leo Dabus said in his comment, you want MemoryLayout<Type>.size, or in your case, MemoryLayout<Float>.size.
Note that this tells you the abstract size of an item of that type. However, due to alignment, you should not assume that the size of a struct containing different types of items is the sum of the sizes of its elements. Also, you need to consider the device it's running on: on a 64-bit device, Int is 8 bytes; on a 32-bit device, it's 4 bytes.
See the article on MemoryLayout at SwiftDoc.org for more information.
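The difference between size, stride and alignment can be checked directly. In the sketch below the Mixed struct and the frameCount(byteSize:) helper are made up for illustration; the helper mirrors the nFrames line from the question.

```swift
// A struct whose size is not a multiple of its alignment.
struct Mixed {
    var a: UInt32   // 4 bytes
    var b: UInt8    // 1 byte
}

let floatSize = MemoryLayout<Float>.size          // 4
let mixedSize = MemoryLayout<Mixed>.size          // 5: no tail padding counted
let mixedStride = MemoryLayout<Mixed>.stride      // 8: distance between array elements
let mixedAlignment = MemoryLayout<Mixed>.alignment // 4

// Replacement for `Int(buffer.mDataByteSize) / sizeof(Float)`.
func frameCount(byteSize: UInt32) -> Int {
    return Int(byteSize) / MemoryLayout<Float>.size
}

let framesIn2048Bytes = frameCount(byteSize: 2048)   // 512 Float samples
```

When stepping through contiguous elements, stride is usually the value to use; for Float the two are the same.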
So, I have a stream of well-formed data coming from some hardware. The stream consists of a bunch of chunks of 8-bit data, some of which are meant to form into 32-bit integers. That's all good. The data moves along and now I want to parcel the sequence up.
The data is actually a block of contiguous bytes, with segments of it mapped to useful data. So, for example, the first byte is a confirmation code, the following four bytes represent a UInt32 of some application-specific meaning, followed by two bytes representing a UInt16, and so on for a couple dozen bytes.
I found two different ways to do that, both of which seem a bit overwrought. It may just be what happens when you get close to the metal.
But — are these two code idioms generally what one should expect to do? Or am I missing something more compact?
// data : Data exists before this code, and has what we're transforming into UInt32
// One Way to get 4 bytes from Data into a UInt32
var y : [UInt8] = [UInt8](repeating:UInt8(0x0), count: 4)
data.copyBytes(to: &y, from: Range(uncheckedBounds: (2,6)))
let u32result = UnsafePointer(y).withMemoryRebound(to: UInt32.self, capacity: 1, {
    $0.pointee
})
// u32result contains the 4 bytes from data
// Another Way to get 4 bytes from Data into a UInt32 via NSData
var result : UInt32 = 0
let resultAsNSData : NSData = data.subdata(in: Range(uncheckedBounds: (2,6))) as NSData
resultAsNSData.getBytes(&result, range: NSRange(location: 0, length: 4))
// result contains the 4 bytes from data
Creating UInt32 array from well-formed data object.
Swift 3
// Create sample data
let data = "foo".data(using: .utf8)!
// Using pointers style constructor
let array = data.withUnsafeBytes {
    // count is the number of UInt32 elements, not the number of bytes
    [UInt32](UnsafeBufferPointer(start: $0, count: data.count / MemoryLayout<UInt32>.stride))
}
Swift 2
// Create sample data
let data = "foo".dataUsingEncoding(NSUTF8StringEncoding)!
// Using pointers style constructor
let array = Array(UnsafeBufferPointer(start: UnsafePointer<UInt32>(data.bytes), count: data.length / sizeof(UInt32)))
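For pulling a single UInt32 out of the middle of a Data value, as in the question, one option that avoids alignment problems is to copy the four bytes into an integer variable and then fix up the byte order. This is a sketch assuming the hardware sends little-endian values; readUInt32LE(from:at:) is a hypothetical helper.

```swift
import Foundation

// Reads 4 bytes starting at `offset` and interprets them as a
// little-endian UInt32. Returns nil if the range is out of bounds.
func readUInt32LE(from data: Data, at offset: Int) -> UInt32? {
    let start = data.startIndex + offset
    guard offset >= 0, start + 4 <= data.endIndex else { return nil }
    var raw: UInt32 = 0
    // Byte-wise copy, so the source does not need to be 4-byte aligned.
    withUnsafeMutableBytes(of: &raw) { destination in
        destination.copyBytes(from: data[start ..< start + 4])
    }
    return UInt32(littleEndian: raw)
}

// Made-up packet: 2 header bytes, a UInt32, then a UInt16.
let packet = Data([0x01, 0xFF, 0x0D, 0x0C, 0x0B, 0x0A, 0x34, 0x12])
let parsedValue = readUInt32LE(from: packet, at: 2)
```

Wrapping the steps in a function keeps each call site to one line, which may address the "overwrought" feeling of the two idioms in the question.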
I found two other ways of doing this, which leads me to believe that there are plenty of ways to do it, which is good, I suppose.
Two additional ways are described in some fashion over on Ray Wenderlich.
This code dropped into your Xcode playground will reveal these two other idioms.
do {
    let count = 1 // number of UInt32s
    let stride = MemoryLayout<UInt32>.stride
    let alignment = MemoryLayout<UInt32>.alignment
    let byteCount = count * stride
    var bytes : [UInt8] = [0x0D, 0x0C, 0x0B, 0x0A] // little-endian LSB -> MSB
    var data : Data = Data.init(bytes: bytes) // In my situation, I actually start with an instance of Data, so the [UInt8] above is a conceit.
    print("---------------- 1 ------------------")
    let placeholder = UnsafeMutableRawPointer.allocate(bytes: byteCount, alignedTo:alignment)
    withUnsafeBytes(of: &data, { (bytes) in
        for (index, byte) in data.enumerated() {
            print("byte[\(index)]->\(String(format: "0x%02x",byte)) data[\(index)]->\(String(format: "0x%02x", data[index])) addr: \(bytes.baseAddress!+index)")
            placeholder.storeBytes(of: byte, toByteOffset: index, as: UInt8.self)
        }
    })
    let typedPointer1 = placeholder.bindMemory(to: UInt32.self, capacity: count)
    print("u32: \(String(format: "0x%08x", typedPointer1.pointee))")
    print("---------------- 2 ------------------")
    for (index, byte) in bytes.enumerated() {
        placeholder.storeBytes(of: byte, toByteOffset: index, as: UInt8.self)
        // print("byte \(index): \(byte)")
        print("byte[\(index)]->\(String(format: "0x%02x",byte))")
    }
    let typedPointer = placeholder.bindMemory(to: UInt32.self, capacity: count)
    let result : UInt32 = typedPointer.pointee
    print("u32: \(String(format: "0x%08x", typedPointer.pointee))")
}
With output:
---------------- 1 ------------------
byte[0]->0x0d data[0]->0x0d addr: 0x00007fff57243f68
byte[1]->0x0c data[1]->0x0c addr: 0x00007fff57243f69
byte[2]->0x0b data[2]->0x0b addr: 0x00007fff57243f6a
byte[3]->0x0a data[3]->0x0a addr: 0x00007fff57243f6b
u32: 0x0a0b0c0d
---------------- 2 ------------------
u32: 0x0a0b0c0d
Here's a Gist.
let a = [ 0x00, 0x00, 0x00, 0x0e ]
let b = a[0] << 24 + a[1] << 16 + a[2] << 8 + a[3]
print(b) // will print 14.
Should I describe this operation?
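As a description of the operation: each byte is shifted left to its position in the 32-bit word (the first byte is the most significant, so this is big-endian order) and the pieces are combined. With real UInt8 bytes each one has to be widened to UInt32 first, otherwise the shifted bits would not fit in 8 bits. A minimal sketch, with bigEndianUInt32 as a hypothetical helper name:

```swift
// Combines exactly 4 bytes, most significant first, into a UInt32.
func bigEndianUInt32(_ bytes: [UInt8]) -> UInt32? {
    guard bytes.count == 4 else { return nil }
    var value: UInt32 = 0
    for byte in bytes {
        // Make room for the next byte, then place it in the low 8 bits.
        value = (value << 8) | UInt32(byte)
    }
    return value
}

let fourteen = bigEndianUInt32([0x00, 0x00, 0x00, 0x0e])
let mixedBytes = bigEndianUInt32([0x0A, 0x0B, 0x0C, 0x0D])
```

For little-endian input, the same loop can be run over bytes.reversed().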