Concatenating AVAssets seamlessly - swift

I've got some simple AVFoundation code to concatenate a bunch of four-second-long mp4 files together that looks like this:
func
compose(parts inParts: [Part], progress inProgress: (CMTime) -> ())
    -> AVAsset?
{
    guard
        let composition = self.composition,
        let videoTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid),
        let audioTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
    else
    {
        debugLog("Unable to create tracks for composition")
        return nil
    }

    do
    {
        var time = CMTime.zero
        for p in inParts
        {
            let asset = AVURLAsset(url: p.path.url)
            if let track = asset.tracks(withMediaType: .video).first
            {
                try videoTrack.insertTimeRange(CMTimeRange(start: .zero, duration: asset.duration), of: track, at: time)
            }
            if let track = asset.tracks(withMediaType: .audio).first
            {
                try audioTrack.insertTimeRange(CMTimeRange(start: .zero, duration: asset.duration), of: track, at: time)
            }
            time = CMTimeAdd(time, asset.duration)
            inProgress(time)
        }
    }
    catch (let e)
    {
        debugLog("Error adding clips: \(e)")
        return nil
    }

    return composition
}
Unfortunately, every four seconds you can hear the audio cut out for a moment, indicating to me that this isn't an entirely seamless concatenation. Is there anything I can do to improve this?
Solution
Thanks to NoHalfBits’s excellent answer below, I’ve updated the above loop with the following, and it works very well:
for p in inParts
{
    let asset = AVURLAsset(url: p.path.url)

    // It’s possible (and turns out, it’s often the case with UniFi NVR recordings)
    // for the audio and video tracks to be of slightly different start time
    // and duration. Find the intersection of the two tracks’ time ranges and
    // use that range when inserting both tracks into the composition…

    // Calculate the common time range between the video and audio tracks…
    let sourceVideo = asset.tracks(withMediaType: .video).first
    let sourceAudio = asset.tracks(withMediaType: .audio).first
    var commonTimeRange = CMTimeRange.zero
    if sourceVideo != nil && sourceAudio != nil
    {
        commonTimeRange = CMTimeRangeGetIntersection(sourceVideo!.timeRange, otherRange: sourceAudio!.timeRange)
    }
    else if sourceVideo != nil
    {
        commonTimeRange = sourceVideo!.timeRange
    }
    else if sourceAudio != nil
    {
        commonTimeRange = sourceAudio!.timeRange
    }
    else
    {
        // There’s neither video nor audio tracks, bail…
        continue
    }

    debugLog("Asset duration: \(asset.duration.seconds), common time range duration: \(commonTimeRange.duration.seconds)")

    // Insert the video and audio tracks…
    if sourceVideo != nil
    {
        try videoTrack.insertTimeRange(commonTimeRange, of: sourceVideo!, at: time)
    }
    if sourceAudio != nil
    {
        try audioTrack.insertTimeRange(commonTimeRange, of: sourceAudio!, at: time)
    }

    time = time + commonTimeRange.duration
    inProgress(time)
}

In an mp4 container, every track can have its own start time and duration. Especially in recorded material, it is not uncommon to have audio and video tracks with slightly different time ranges (insert some CMTimeRangeShow(track.timeRange) calls near the insertTimeRange calls to have a look at this).
To overcome this, instead of blindly inserting from CMTime.zero for the duration of the whole asset (the maximum end time of all tracks), do the following (sketched in code below):
get the timeRange of the source's audio and video tracks
calculate the common time range from these (CMTimeRangeGetIntersection does this for you)
use the common time range when inserting the segments from the source tracks into the destination tracks
increment your time by the duration of the common time range
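
A minimal sketch of those steps (my own condensation, using the Swift overlay's CMTimeRange.intersection(_:), which is equivalent to CMTimeRangeGetIntersection; `asset` is assumed to be loaded elsewhere):

import AVFoundation

// Sketch: log each track's own time range, then compute the range common to both.
func commonTimeRange(of asset: AVAsset) -> CMTimeRange?
{
    let video = asset.tracks(withMediaType: .video).first
    let audio = asset.tracks(withMediaType: .audio).first

    // The per-track ranges often differ slightly in recorded material…
    if let v = video { CMTimeRangeShow(v.timeRange) }
    if let a = audio { CMTimeRangeShow(a.timeRange) }

    switch (video, audio)
    {
    case let (v?, a?):  return v.timeRange.intersection(a.timeRange)
    case let (v?, nil): return v.timeRange
    case let (nil, a?): return a.timeRange
    default:            return nil
    }
}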

Related

How to route audio to default speakers in Swift for macOS?

I have a function that plays audio for a macOS SwiftUI app, but I want it to play the sound through the default built-in speakers every single time. Does anyone know of a reliable method for this?
I've researched a lot but haven't found a solid method for macOS. This is what I've tried:
AVRoutePickerView
This is only available for iOS and Mac Catalyst, not macOS.
Getting Device ID in AVAudioEngine
I found the code snippet below, but it assumes that the built-in speaker's device ID stays the same, which it doesn't, so that doesn't help.
engine = AVAudioEngine()
let output = engine.outputNode

// get the low level output audio unit from the engine:
let outputUnit = output.audioUnit!

// use a Core Audio low level call to set the output device:
var outputDeviceID: AudioDeviceID = 51 // replace with actual, dynamic value
AudioUnitSetProperty(outputUnit,
                     kAudioOutputUnitProperty_CurrentDevice,
                     kAudioUnitScope_Global,
                     0,
                     &outputDeviceID,
                     UInt32(MemoryLayout<AudioDeviceID>.size))
Disabling Bluetooth so the audio only goes through the main speakers and not a Bluetooth speaker. This didn't seem like the best approach, so I haven't tested it.
The following is the code I have for playing sound:
func playTheSound() {
    let url = Bundle.main.url(forResource: "Blow", withExtension: "mp3")
    player = try! AVAudioPlayer(contentsOf: url!)
    player?.play()
    print("Sound was played")
}
So, any recommendations on how to route the audio to main speakers for macOS?
By "default built-in" I assume you actually just mean "built-in." The default speakers are the ones the audio will route to already.
The simplest solution to this that will probably always work is to route to the UID "BuiltInSpeakerDevice". For example, this does what you want:
let player = AVPlayer()

func playTheSound() {
    let url = URL(filePath: "/System/Library/Sounds/Blow.aiff")
    let item = AVPlayerItem(url: url)
    player.replaceCurrentItem(with: item)
    player.audioOutputDeviceUniqueID = "BuiltInSpeakerDevice"
    player.play()
}
Note the use of AVPlayer and audioOutputDeviceUniqueID here. I'm betting this will work in approximately 100% of cases. It should even "work" if there were no built-in speakers, in that this silently fails (without crashing) if the UID doesn't exist.
But...sigh...I can't find anywhere that this is documented or any system constant for this string. And I really hate magic, undocumented strings. So, let's do it right. Besides, if we do it right, it'll work with AVAudioEngine, too. So let's get there.
First, you should always take a look at the invaluable CoreAudio output device useful methods in Swift 4. I don't know if anyone has turned this into a real framework, but this is a treasure trove of examples. The following code is a modernized version of that.
struct AudioDevice {
    let id: AudioDeviceID

    static func getAll() -> [AudioDevice] {
        var propertyAddress = AudioObjectPropertyAddress(
            mSelector: kAudioHardwarePropertyDevices,
            mScope: kAudioObjectPropertyScopeGlobal,
            mElement: kAudioObjectPropertyElementMain)

        // Get size of buffer for list
        var devicesBufferSize: UInt32 = 0
        AudioObjectGetPropertyDataSize(AudioObjectID(kAudioObjectSystemObject), &propertyAddress,
                                       0, nil,
                                       &devicesBufferSize)
        let devicesCount = Int(devicesBufferSize) / MemoryLayout<AudioDeviceID>.stride

        // Get list
        let devices = Array<AudioDeviceID>(unsafeUninitializedCapacity: devicesCount) { buffer, initializedCount in
            AudioObjectGetPropertyData(AudioObjectID(kAudioObjectSystemObject), &propertyAddress,
                                       0, nil,
                                       &devicesBufferSize, buffer.baseAddress!)
            initializedCount = devicesCount
        }

        return devices.map(Self.init)
    }

    var hasOutputStreams: Bool {
        var propertySize: UInt32 = 256
        var propertyAddress = AudioObjectPropertyAddress(
            mSelector: kAudioDevicePropertyStreams,
            mScope: kAudioDevicePropertyScopeOutput,
            mElement: kAudioObjectPropertyElementMain)
        AudioObjectGetPropertyDataSize(id, &propertyAddress, 0, nil, &propertySize)
        return propertySize > 0
    }

    var isBuiltIn: Bool {
        transportType == kAudioDeviceTransportTypeBuiltIn
    }

    var transportType: AudioDevicePropertyID {
        var deviceTransportType = AudioDevicePropertyID()
        var propertySize = UInt32(MemoryLayout<AudioDevicePropertyID>.size)
        var propertyAddress = AudioObjectPropertyAddress(
            mSelector: kAudioDevicePropertyTransportType,
            mScope: kAudioObjectPropertyScopeGlobal,
            mElement: kAudioObjectPropertyElementMain)
        AudioObjectGetPropertyData(id, &propertyAddress,
                                   0, nil, &propertySize,
                                   &deviceTransportType)
        return deviceTransportType
    }

    var uid: String {
        var propertySize = UInt32(MemoryLayout<CFString>.size)
        var propertyAddress = AudioObjectPropertyAddress(
            mSelector: kAudioDevicePropertyDeviceUID,
            mScope: kAudioObjectPropertyScopeGlobal,
            mElement: kAudioObjectPropertyElementMain)
        var result: CFString = "" as CFString
        AudioObjectGetPropertyData(id, &propertyAddress, 0, nil, &propertySize, &result)
        return result as String
    }
}
And with that in place, you can fetch the first built-in output device:
player.audioOutputDeviceUniqueID = AudioDevice.getAll()
.first(where: {$0.hasOutputStreams && $0.isBuiltIn })?
.uid
Or you can use your AVAudioEngine approach if you want more control (note the difference between uid and id here):
let player = AVAudioPlayerNode()
let engine = AVAudioEngine()

func playTheSound() {
    let output = engine.outputNode
    let outputUnit = output.audioUnit!

    var outputDeviceID = AudioDevice.getAll()
        .first(where: { $0.hasOutputStreams && $0.isBuiltIn })!
        .id

    AudioUnitSetProperty(outputUnit,
                         kAudioOutputUnitProperty_CurrentDevice,
                         kAudioUnitScope_Global,
                         0,
                         &outputDeviceID,
                         UInt32(MemoryLayout<AudioDeviceID>.size))

    engine.attach(player)
    engine.connect(player, to: engine.outputNode, format: nil)
    try! engine.start()

    let url = URL(filePath: "/System/Library/Sounds/Blow.aiff")
    let file = try! AVAudioFile(forReading: url)
    player.scheduleFile(file, at: nil)
    player.play()
}

Export a video with dynamic text per frame in Swift AVFoundation

I fetch the timestamps from every frame and store them in an array using the showTimestamps function. I now want to "draw" each timestamp on each frame of the video, and export it.
func showTimestamps(videoFile: URL) -> [String] {
    let asset = AVAsset(url: videoFile)
    let track = asset.tracks(withMediaType: AVMediaType.video)[0]
    let output = AVAssetReaderTrackOutput(track: track, outputSettings: nil)
    guard let reader = try? AVAssetReader(asset: asset) else { exit(1) }
    output.alwaysCopiesSampleData = false
    reader.add(output)
    reader.startReading()

    var times: [String] = []
    while reader.status == .reading {
        if let sampleBuffer = output.copyNextSampleBuffer(), CMSampleBufferIsValid(sampleBuffer) && CMSampleBufferGetTotalSampleSize(sampleBuffer) != 0 {
            let frameTime = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer)
            if frameTime.isValid {
                times.append(String(format: "%.3f", frameTime.seconds))
            }
        }
    }
    return times.sorted()
}
However, I cannot figure out how to export a new video with each frame containing its respective timestamp, i.e. how can I implement this function:
func generateNewVideoWithTimestamps(videoFile: URL, timestampsForFrames: [String]) {
    // TODO
}
I want to keep the framerate, video quality, etc. the same. The only thing that should differ is the text added at the bottom of each frame.
To get this far, I used these guides and failed: Frames, Static Text, Watermark
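
One possible approach (my sketch, not from the guides above): build an AVMutableVideoComposition with a per-frame Core Image handler. The handler receives each frame's compositionTime, so the timestamp text can be generated on the fly with the CITextImageGenerator filter (instead of from a pre-collected array) and composited over the source frame, then the whole thing exported with AVAssetExportSession. The output URL, font, text position and export preset below are placeholder choices.

import AVFoundation
import CoreImage

// Sketch only: overlays each frame's presentation time as text, then exports.
func exportVideoWithTimestamps(videoFile: URL, outputFile: URL,
                               completion: @escaping (Error?) -> Void) {
    let asset = AVAsset(url: videoFile)

    // The handler runs once per output frame and receives that frame's time.
    let composition = AVMutableVideoComposition(asset: asset) { request in
        let text = String(format: "%.3f", request.compositionTime.seconds)
        let textFilter = CIFilter(name: "CITextImageGenerator", parameters: [
            "inputText": text,
            "inputFontName": "Helvetica",
            "inputFontSize": 36,
            "inputScaleFactor": 1.0
        ])
        guard let textImage = textFilter?.outputImage else {
            request.finish(with: request.sourceImage, context: nil)
            return
        }
        // Move the text near the bottom-left corner and composite it over the frame.
        let positioned = textImage.transformed(by: CGAffineTransform(translationX: 20, y: 20))
        request.finish(with: positioned.composited(over: request.sourceImage), context: nil)
    }

    // Export with the composition applied; frame timing comes from the source asset.
    guard let export = AVAssetExportSession(asset: asset,
                                            presetName: AVAssetExportPresetHighestQuality) else {
        completion(NSError(domain: "export", code: -1, userInfo: nil))
        return
    }
    export.videoComposition = composition
    export.outputURL = outputFile
    export.outputFileType = .mp4
    export.exportAsynchronously {
        completion(export.error)
    }
}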

(iOS SceneKit) Convert texture coordinate to xyz coordinate

I'm able to get the SCNHitTestResult from a long-press gesture using the code below, which gives me the worldCoordinates (x, y, z) and textureCoordinates.
guard recognizer.state != .ended else { return }
let point = recognizer.location(in: self.panoramaView.sceneView)
guard let hitTest = self.panoramaView.sceneView.hitTest(point, options: nil).first else { return }
let textureCoordinates = hitTest.textureCoordinates(withMappingChannel: 0)
Now the problem is: given the textureCoordinates, is it possible to convert them back to worldCoordinates?
PS: I'm using the CTPanoramaView library.
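
For what it's worth, here is a heavily hedged sketch based on an assumption about the setup (not something stated in the question): if the panorama is an equirectangular image mapped onto a sphere centered at the origin, a texture coordinate can be converted back to a point on that sphere with spherical trigonometry. The radius, axis orientation and any mirroring are guesses and may need adjusting to match CTPanoramaView's sphere node.

import Foundation
import SceneKit

// Sketch: map an equirectangular texture coordinate (u, v), both in 0…1,
// back onto a sphere of radius `radius` centered at the origin.
// Axis and sign conventions are assumptions, not CTPanoramaView specifics.
func worldCoordinates(fromTexture uv: CGPoint, radius: Double = 10) -> SCNVector3 {
    let longitude = Double(uv.x) * 2 * .pi        // around the vertical (y) axis
    let latitude  = (0.5 - Double(uv.y)) * .pi    // +π/2 at the top, -π/2 at the bottom

    let x = radius * cos(latitude) * sin(longitude)
    let y = radius * sin(latitude)
    let z = radius * cos(latitude) * cos(longitude)
    return SCNVector3(Float(x), Float(y), Float(z))
}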

AVFoundation route audio between two non-system input and ouputs

I've been trying to route audio from a virtual Soundflower device to another hardware speaker. The Soundflower virtual device is my system output. I want my AVAudioEngine to take the Soundflower input and output it to the hardware speaker.
However, from my research it seems AVAudioEngine only supports RIO devices. I've looked at AudioKit and its Output Splitter example, but I was getting crackling and unsatisfactory results. The bones of my code are as follows:
static func set(device: String, isInput: Bool, toUnit unit: AudioUnit) -> Int {
    let devs = (isInput ? EZAudioDevice.inputDevices() : EZAudioDevice.outputDevices()) as! [EZAudioDevice]
    let mic = devs.first(where: { $0.name == device })!
    var inputID = mic.deviceID // replace with actual, dynamic value
    AudioUnitSetProperty(unit, kAudioOutputUnitProperty_CurrentDevice,
                         kAudioUnitScope_Global, 0, &inputID, UInt32(MemoryLayout<AudioDeviceID>.size))
    return Int(inputID)
}

let outputRenderCallback: AURenderCallback = {
    (inRefCon: UnsafeMutableRawPointer,
     ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
     inTimeStamp: UnsafePointer<AudioTimeStamp>,
     inBusNumber: UInt32,
     inNumberFrames: UInt32,
     ioData: UnsafeMutablePointer<AudioBufferList>?) -> OSStatus in

    // Get Refs
    let buffer = UnsafeMutableAudioBufferListPointer(ioData)
    let engine = Unmanaged<Engine>.fromOpaque(inRefCon).takeUnretainedValue()

    // If Engine hasn't saved any data yet just output silence
    if (engine.latestSampleTime == nil) {
        //makeBufferSilent(buffer!)
        return noErr
    }

    // Read the latest available Sample
    let sampleTime = engine.latestSampleTime
    if let err = checkErr(engine.ringBuffer.fetch(ioData!, framesToRead: inNumberFrames, startRead: sampleTime!).rawValue) {
        //makeBufferSilent(buffer!)
        return err
    }
    return noErr
}
private let trailEngine: AVAudioEngine
private let subEngine: AVAudioEngine

init() {
    subEngine = AVAudioEngine()
    let inputUnit = subEngine.inputNode.audioUnit!
    print(Engine.set(device: "Soundflower (2ch)", isInput: true, toUnit: inputUnit))

    trailEngine = AVAudioEngine()
    let outputUnit = trailEngine.outputNode.audioUnit!
    print(Engine.set(device: "Boom 3", isInput: false, toUnit: outputUnit))

    subEngine.inputNode.installTap(onBus: 0, bufferSize: 2048, format: nil) { [weak self] (buffer, time) in
        guard let self = self else { return }
        let sampleTime = time.sampleTime
        self.latestSampleTime = sampleTime

        // Write to RingBuffer
        if let _ = checkErr(self.ringBuffer.store(buffer.audioBufferList, framesToWrite: 2048, startWrite: sampleTime).rawValue) {
            //makeBufferSilent(UnsafeMutableAudioBufferListPointer(buffer.mutableAudioBufferList))
        }
    }

    var renderCallbackStruct = AURenderCallbackStruct(
        inputProc: outputRenderCallback,
        inputProcRefCon: UnsafeMutableRawPointer(Unmanaged<Engine>.passUnretained(self).toOpaque())
    )

    if let _ = checkErr(
        AudioUnitSetProperty(
            trailEngine.outputNode.audioUnit!,
            kAudioUnitProperty_SetRenderCallback,
            kAudioUnitScope_Global,
            0,
            &renderCallbackStruct,
            UInt32(MemoryLayout<AURenderCallbackStruct>.size)
        )
    ) {
        return
    }

    subEngine.prepare()
    trailEngine.prepare()

    ringBuffer = RingBuffer<Float>(numberOfChannels: 2, capacityFrames: UInt32(4800 * 20))

    do {
        try self.subEngine.start()
    } catch {
        print("Error starting the input engine: \(error)")
    }

    DispatchQueue.main.asyncAfter(deadline: .now() + 0.01) {
        do {
            try self.trailEngine.start()
        } catch {
            print("Error starting the output engine: \(error)")
        }
    }
}
For reference, the RingBuffer implementation is at:
https://github.com/vgorloff/CARingBuffer
and the AudioKit example is at:
https://github.com/AudioKit/OutputSplitter/tree/master/OutputSplitter
I was using AudioKit 4 (however the example only uses AudioKit's device wrappers). The result of this code is super-crackly audio through the speakers, which suggests the signal is getting completely mangled in the transfer between the two engines. I am not too worried about latency between the two engines.
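
One thing worth checking (my suggestion, not something stated in the post above): crackling when hand-copying audio between two engines is often a format or frame-count mismatch at the hand-off. A small diagnostic sketch, assuming the subEngine/trailEngine setup above:

// Diagnostic sketch: log the formats on both sides of the ring buffer.
// A sampleRate or channelCount mismatch between these is a common cause
// of crackling when raw buffers are copied straight across.
let tapFormat    = subEngine.inputNode.outputFormat(forBus: 0)
let renderFormat = trailEngine.outputNode.inputFormat(forBus: 0)
print("tap format:    \(tapFormat.sampleRate) Hz, \(tapFormat.channelCount) ch")
print("render format: \(renderFormat.sampleRate) Hz, \(renderFormat.channelCount) ch")

// Also note that installTap's bufferSize is only a request: the delivered
// buffer.frameLength can differ from 2048, so storing a hard-coded 2048
// frames (as the tap above does) can push stale or partial data into the
// ring buffer. Using buffer.frameLength for framesToWrite is safer.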

Get Shuffled Tracks from Media Player

I want to get the queue from the Media Player. I believe that I can't read the queue (is this correct?) and that I can only get the queue when I set it, like when I select a certain playlist to play. However, I'm struggling to get the shuffled queue.
let musicPlayerController = MPMusicPlayerController.systemMusicPlayer

let myMediaQuery = MPMediaQuery.songs()
let predicateFilter = MPMediaPropertyPredicate(value: chosenPlaylist, forProperty: MPMediaPlaylistPropertyName)
myMediaQuery.filterPredicates = NSSet(object: predicateFilter) as? Set<MPMediaPredicate>

musicPlayerController.setQueue(with: myMediaQuery)
musicPlayerController.repeatMode = .all
musicPlayerController.shuffleMode = .songs
musicPlayerController.play()

for track in myMediaQuery.items! {
    print(track.value(forProperty: MPMediaItemPropertyTitle)!)
}
// here I don't get the shuffled order that is going to play, just the original order of the playlist
I need the shuffled order because I want to be able to display what's going to play next.
I've had trouble with .shuffleMode in the past. Call shuffled() on the collection instead:
// shuffle
func shufflePlaylist() {
    let query = MPMediaQuery.songs()
    let predicate = MPMediaPropertyPredicate(value: "Art Conspiracy",
                                             forProperty: MPMediaPlaylistPropertyName,
                                             comparisonType: .equalTo)
    query.addFilterPredicate(predicate)

    guard let items = query.items else { return }
    let collection = MPMediaItemCollection(items: items.shuffled())

    player.setQueue(with: collection)
    player.prepareToPlay()
    player.play()
}
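
A follow-up on the asker's actual goal of displaying what plays next (my addition, not part of the answer above): because the shuffled order is created in code before it's handed to the player, the same array can drive the UI. A minimal sketch, with `player` assumed to be the system music player:

import MediaPlayer

let player = MPMusicPlayerController.systemMusicPlayer

// Keep the shuffled items around so the upcoming tracks can be displayed.
var upNext: [MPMediaItem] = []

func shuffleAndQueue(_ items: [MPMediaItem]) {
    let shuffled = items.shuffled()
    upNext = shuffled                       // drive an "up next" list from this
    player.setQueue(with: MPMediaItemCollection(items: shuffled))
    player.prepareToPlay()
    player.play()
}

// Example: titles of the next few tracks after the one currently playing.
func upcomingTitles(count: Int = 5) -> [String] {
    let current = player.indexOfNowPlayingItem
    return upNext.dropFirst(current + 1).prefix(count).map { $0.title ?? "Unknown" }
}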