How do I wrap up a series of related NSTask operations into discrete functions - swift

I'm writing a Swift command line tool that uses NSTask to interact with git. In the simplest scenario I want to run three commands: init, add ., and commit -m Initial Commit. I intend to use a separate NSTask for each command, and want to house each command in its own function - returning true if the task succeeded or false if it didn't. This set-up would allow my main function to look like this:
func main() {
if runInit() {
if runStage() {
if runCommit() {
NSLog("success!")
}
}
}
}
To accomplish this each of the three functions must do the following before returning (i) launch the task (ii) wait for it to complete, (iii) obtain whatever is in stdout, and (iv) set the return value (true or false). Here's what I've got for the commit stage:
func runCommit() -> Bool {
var retval = false
var commitTask = NSTask()
commitTask.standardOutput = NSPipe()
commitTask.launchPath = gitPath
commitTask.arguments = ["commit", "-m", "Initial Commit"]
commitTask.currentDirectoryPath = demoProjectURL.path!
commitTask.standardOutput.fileHandleForReading.readToEndOfFileInBackgroundAndNotify()
nc.addObserverForName(NSFileHandleReadToEndOfFileCompletionNotification,
object: commitTask.standardOutput.fileHandleForReading,
queue: nil) { (note) -> Void in
// get the output, log it, then...
if commitTask.terminationStatus == EXIT_SUCCESS {
retval = true
}
}
commitTask.launch()
commitTask.waitUntilExit()
return retval
}
My question is essentially about how waitUntilExit works, particularly in conjunction with the notification I sign up for to enable me to get the output. Apple's docs say:
This method first checks to see if the receiver is still running using isRunning. Then it polls the current run loop using NSDefaultRunLoopMode until the task completes.
I'm a bit out of my depth when it comes to run loop mechanics, and was wondering what this means in this context - can I safely assume that my notification block will always be executed before the enclosing function returns?

waitUntilExit returns when the SIGCHILD signal has been received
to indicate that the child process has terminated. The notification
block is executed when EOF is read from the pipe to the child process.
It is not specified which of these events occurs first.
Therefore you have to wait for both. There are several possible solutions,
here is one using a "signalling semaphore", you could also use
a "dispatch group".
Another error in your code is that the observer is never removed.
func runCommit() -> Bool {
let commitTask = NSTask()
commitTask.standardOutput = NSPipe()
commitTask.launchPath = gitPath
commitTask.arguments = ["commit", "-m", "Initial Commit"]
commitTask.currentDirectoryPath = demoProjectURL.path!
commitTask.standardOutput!.fileHandleForReading.readToEndOfFileInBackgroundAndNotify()
let sema = dispatch_semaphore_create(0)
var obs : NSObjectProtocol!
obs = nc.addObserverForName(NSFileHandleReadToEndOfFileCompletionNotification,
object: commitTask.standardOutput!.fileHandleForReading, queue: nil) {
(note) -> Void in
// Get data and log it.
if let data = note.userInfo?[NSFileHandleNotificationDataItem] as? NSData,
let string = String(data: data, encoding: NSUTF8StringEncoding) {
print(string)
}
// Signal semaphore.
dispatch_semaphore_signal(sema)
nc.removeObserver(obs)
}
commitTask.launch()
// Wait for process to terminate.
commitTask.waitUntilExit()
// Wait for semaphore to be signalled.
dispatch_semaphore_wait(sema, DISPATCH_TIME_FOREVER)
let retval = commitTask.terminationStatus == EXIT_SUCCESS
return retval
}

Related

Swift stdout redirect to a string

I'm trying to redirect stdout to a string with Swift, to collect whatever is passed to the print function. I've read through a few resources that suggest my approach should be working, however, the following proof of concept script only outputs "Starting":
import Foundation
let outPipe = Pipe()
var outString = "Initial"
outPipe.fileHandleForReading.readabilityHandler = { fileHandle in
outString += String(data: fileHandle.availableData, encoding: .utf8)!
}
print("Starting")
// Redirect
setvbuf(stdout, nil, _IONBF, 0)
dup2(outPipe.fileHandleForWriting.fileDescriptor, STDOUT_FILENO)
print("Captured")
freopen("/dev/stdout", "a", stdout)
print(outString)
Any ideas where I've gone wrong? All help is appreciated!
The main problem is that the pipe's read handler is called on a secondary queue, but your program does not have a run loop. Therefore the program terminates before the event handler is called once.
In a command line application, one option so solve this is with a “dispatch semaphore” which is signaled in the event handler, once it detect an end-of-file condition on the pipe. And in order to make this work, we must close the writing end of the pipe eventually.
Also the standard output file descriptor should be saved and restored later. Reopening "/dev/stdout" will not work if the program is run in the Xcode debugger console.
The following code produces the expected output. It is meant as a “proof of concept” example. In your production code you'll want to avoid things like a forced try!.
import Foundation
let outPipe = Pipe()
var outString = "Initial"
let sema = DispatchSemaphore(value: 0)
outPipe.fileHandleForReading.readabilityHandler = { fileHandle in
let data = fileHandle.availableData
if data.isEmpty { // end-of-file condition
fileHandle.readabilityHandler = nil
sema.signal()
} else {
outString += String(data: data, encoding: .utf8)!
}
}
print("Starting")
// Redirect
setvbuf(stdout, nil, _IONBF, 0)
let savedStdout = dup(STDOUT_FILENO)
dup2(outPipe.fileHandleForWriting.fileDescriptor, STDOUT_FILENO)
print("Captured")
// Undo redirection
dup2(savedStdout, STDOUT_FILENO)
try! outPipe.fileHandleForWriting.close()
close(savedStdout)
sema.wait() // Wait until read handler is done
print(outString) // Prints "InitialCaptured"

Synchronize nested async network requests inside a while loop by using Semaphores

I have a func that gets a list of Players. When i fetch the players i need only to show those who belongs to the current Team so i am showing only a subset of the original list by filtering them. I don't know in advance, before making the request, how much players belong to the Team selected by the User, so i may need to do additional requests until i can display on the TableView at least 10 rows of Players. The User by pulling up from the bottom of the TableView can request more players to display. To do this i am calling a first async func request which in turn calls, inside a while, another nested async func request. Here a code to give you an idea of what i am trying to do:
let semaphore = DispatchSemaphore(value: 0)
func getTeamPlayersRequest() {
service.getTeamPlayers(...)
{
(result) in
switch result
{
case .success(let playersModel):
if let validCurrentPage = currentPageTmp ,
let validTotalPages = totalPagesTmp ,
let validNextPage = self.getTeamPlayersListNextPage()
{
while self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages
{
self.currentPage = validNextPage //global var
self.fetchMorePlayers()
self.semaphore.wait() //global semaphore
}
}
case .failure(let error):
//some code...
}
})
}
private func fetchMorePlayers(){
// Completion handler of the following function is never called..
service.getTeamPlayers(requestedPage: currentPage, completion: {
(result) in
switch result
{
case .success(let playersModel):
if let validPlayerList = playersList,
let validPlayerListData = validPlayerList.data,
let validTeamModel = self.teamPlayerModel,
let validNextPage = self.getTeamPlayersListNextPage()
{
for player in validPlayerListData
{
if ( validTeamModel.id == player.team?.id)
{
self.playersToShowTemp.append(player)
}
}
}
self.currentPage = validNextPage
self.semaphore.signal() //global semaphore
case .failure(let error):
//some code...
}
}
}
I have tried both with DispatchGroup and Semaphore but i don't get it what i am doing wrong. I debugged the code and saw that the first async call get executed in a different queue (not the main queue) and a different thread. The nested async call getexecuted on a different thread but i don't know if it's the same concurrent queue of the first async call.
The completion handler of thenested call it's never called. Does anyone know why? is the self.semaphore.wait(), even if it get executed after the fetchMorePlayers() return, blocking/preventing the nested async completion handler to be called?
I am noticing through the Debugger that the completion() in the Xcode vars window has the note "swift partial apply forwarder for closure #1"
If we inline the function call in your loop, it looks something like this:
while self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages
{
self.currentPage = validNextPage //global var
nbaService.getTeamPlayers(requestedPage: currentPage, completion: { ... })
self.semaphore.wait() //global semaphore
}
So nbaService.getTeamPlayers schedules a request, probably on the main DispatchQueue and immediately returns. Then you call wait on your semaphore, which blocks, probably before GCD even tries to run the task scheduled by nbaService.getTeamPlayers.
That's a problem on DispatchQueue.main, which is a serial queue. It has to be a serial queue for UI updates to work. What normally happens is on some iteration of the run loop you make a request, and return.. that bubbles back up to the run loop, which checks for more events and queued tasks. In this case, when your completion handler in getTeamPlayersRequest is waiting to be run, the run loop (via GCD) executes it for that iteration. Then you block the main thread, so the run loop can't continue. If you do need to block always do it on a different DispatchQueue, preferably a .concurrent one.
There is sometimes confusion about what .async does. It only means "run this later and right now return control back to the caller". That's all. It does not guarantee that your closure will run concurrently. It merely schedules it to be run later (possibly soon) on whatever DispatchQueue you called it on. If that queue is a serial queue, then it will be queued to run in its turn in that dispatch queue's run loop. If it's a concurrent queue (ie one you specifically set the attributes to include .concurrent). Then it will run, possibly at the same time as other tasks on that same DispatchQueue.
To avoid that instead of using a loop you can use async-chaining.
private func fetchMorePlayers(while condition: #autoclosure #escaping () -> Bool){
guard condition() else { return }
nbaService.getTeamPlayers(requestedPage: currentPage, completion: {
(result) in
switch result
{
case .success(let playersModel):
if let validPlayerList = playersList,
let validPlayerListData = validPlayerList.data,
let validTeamModel = self.teamPlayerModel,
let validNextPage = self.getTeamPlayersListNextPage()
{
for player in validPlayerListData
{
if ( validTeamModel.id == player.team?.id)
{
self.playersToShowTemp.append(player)
}
}
}
self.currentPage = validNextPage
// Chain to next call
self.fetchMorePlayers(while: condition))
case .failure(let error):
//some code...
}
}
}
Then in getTeamPlayersRequest you can do this:
func getTeamPlayersRequest() {
service.getTeamPlayers(...)
{
(result) in
switch result
{
case .success(let playersModel):
if let validCurrentPage = currentPageTmp ,
let validTotalPages = totalPagesTmp ,
let validNextPage = self.getTeamPlayersListNextPage()
{
self.currentPage = validNextPage //global var
self.fetchMorePlayers(while: self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages)
}
case .failure(let error):
//some code...
}
})
}
This avoids the need to block on a semaphore, because each subsequent request happens in the completion handler of the previously completed one. The only issue is if you need for the completion handler in getTeamPlayersRequest to block while the fetchMorePlayers requests are being fetched, because now it won't you can re-introduce the semaphore. In that case the guard statement in fetchMorePlayers becomes:
guard condition() else
{
self.semaphore.signal()
return
}
That way it only signals on the last completion handler in the chain. You may need to block in a different DispatchQueue though. I think if you need to block, you probably have something about your design that needs to be reconsidered.
If you find yourself reaching for semaphores, it is almost always a mistake. Semaphores are inefficient at best, and introduce deadlock risks if misused. Semaphores should generally be avoided. (Don't get me wrong: Semaphores can be useful in some very narrow use cases, but this is not one of them.)
Use asynchronous patterns. One simple approach might be to recursively call the routine, calling the completion handler when done:
func startFetching(#escaping completion: () -> Void) {
fetchPlayers(page: 0, completion: completion)
}
private func fetchPlayers(page: Int, #escaping completion: () -> Void) {
// prepare request
// now perform request
performRequest(...) { ...
if let error = error {
completion()
return
}
...
if doesNeedMorePlayers {
fetchPlayers(page: page + 1, completion: completion)
} else {
completion()
}
}
}
Personally, I might probably add another closure to emit the players retrieved as we go along, e.g. like, if not actually, a Combine Publisher. Or if you want to update the UI all at once at the very end, just pass the players retrieved thus far as additional parameter in this recursive routine and pass the whole array back in the completion handler. But avoid globals or other state properties.
But the broader idea is to scrupulously avoid semaphores and instead embrace asynchronous patterns.

How to exit a runloop?

So, I have a Swift command-line program:
import Foundation
print("start")
startAsyncNetworkingStuff()
RunLoop.current.run()
print("end")
The code compiles without error. The async networking code runs just fine, fetches all its data, prints the result, and eventually calls its completion function.
How do I get that completion function to break out of above current runloop so that the last "end" gets printed?
Added:
Replacing RunLoop.current.run() with the following:
print("start")
var shouldKeepRunning = true
startAsyncNetworkingStuff()
let runLoop = RunLoop.current
while ( shouldKeepRunning
&& runLoop.run(mode: .defaultRunLoopMode,
before: .distantFuture ) ) {
}
print("end")
Setting
shouldKeepRunning = false
in the async network completion function still does not result in "end" getting printed. (This was checked by bracketing the shouldKeepRunning = false statement with print statements which actually do print to console). What is missing?
For a command line interface use this pattern and add a completion handler to your AsyncNetworkingStuff (thanks to Rob for code improvement):
print("start")
let runLoop = CFRunLoopGetCurrent()
startAsyncNetworkingStuff() { result in
CFRunLoopStop(runLoop)
}
CFRunLoopRun()
print("end")
exit(EXIT_SUCCESS)
Please don't use ugly while loops.
Update:
In Swift 5.5+ with async/await it has become much more comfortable. There's no need anymore to maintain the run loop.
Rename the file main.swift as something else and use the #main attribute like in a normal application.
#main
struct CLI {
static func main() async throws {
let result = await startAsyncNetworkingStuff()
// do something with result
}
}
The name of the struct is arbitrary, the static function main is mandatory and is the entry point.
Here's how to use URLSession in a macOS Command Line Tool using Swift 4.2
// Mac Command Line Tool with Async Wait and Exit
import Cocoa
// Store a reference to the current run loop
let runLoop = CFRunLoopGetCurrent()
// Create a background task to load a webpage
let url = URL(string: "http://SuperEasyApps.com")!
let task = URLSession.shared.dataTask(with: url) { (data, response, error) in
if let data = data {
print("Loaded: \(data) bytes")
}
// Stop the saved run loop
CFRunLoopStop(runLoop)
}
task.resume()
// Start run loop after work has been started
print("start")
CFRunLoopRun()
print("end") // End will print after the run loop is stopped
// If your command line tool finished return success,
// otherwise return EXIT_FAILURE
exit(EXIT_SUCCESS)
You'll have to call the stop function using a reference to the run loop before you started (as shown above), or using GCD in order to exit as you'd expect.
func stopRunLoop() {
DispatchQueue.main.async {
CFRunLoopStop(CFRunLoopGetCurrent())
}
}
References
https://developer.apple.com/documentation/corefoundation/1542011-cfrunlooprun
Run loops can be run recursively. You can call CFRunLoopRun() from within any run loop callout and create nested run loop activations on the current thread’s call stack.
https://developer.apple.com/documentation/corefoundation/1541796-cfrunloopstop
If the run loop is nested with a callout from one activation starting another activation running, only the innermost activation is exited.
(Answering my own question)
Adding the following snippet to my async network completion code allows "end" to be printed :
DispatchQueue.main.async {
shouldKeepRunning = false
}

Downloaded Data does not print in order using GCDs in Swift

I am trying to have the downloaded return message print before the message "second". Basically, once the message has been downloaded it should print and then the "second" message. Everytime the code runs, the second message prints and then the returnMessage because the return message takes a bit to download. Is it possible to allow the return message to fire after it completes and then the second message everytime the code is run?
var returnMessage: String? = ""
var downloadGroup = dispatch_group_create()
dispatch_async(utility.GlobalUtilityQueue){
dispatch_group_enter(downloadGroup)
service.executeQuery(query, completionHandler: { (ticket: GTLServiceTicket!, object: AnyObject!, error: NSError!) -> Void in
// Process the response
let json = JSON(object.JSON)
returnMessage = json["message"].string
println("\(returnMessage)") // print first
})
dispatch_group_leave(downloadGroup)
dispatch_group_notify(downloadGroup, self.utility.GlobalMainQueue) {
println("second")//should print second
}
}
The problem is that the dispatch_group_leave should be inside the completionHandler of executeQuery.
var returnMessage: String? = ""
let downloadGroup = dispatch_group_create()
dispatch_async(utility.GlobalUtilityQueue){
dispatch_group_enter(downloadGroup)
service.executeQuery(query) { ticket, object, error in
// Process the response
let json = JSON(object.JSON)
returnMessage = json["message"].string
println("\(returnMessage)") // will print first
dispatch_group_leave(downloadGroup)
}
dispatch_group_notify(downloadGroup, self.utility.GlobalMainQueue) {
println("second")//will print second
}
}
Obviously, this is not a situation where you would use dispatch group (you'd generally only do it if you were entering and leaving multiple times). Also, the outer dispatch_async is probably unnecessary (you're calling an asynchronous method, so there's no need to dispatch that to some background queue). But I assume this was more of an academic question, so hopefully this helps.

Multiple workers in Swift Command Line Tool

When writing a Command Line Tool (CLT) in Swift, I want to process a lot of data. I've determined that my code is CPU bound and performance could benefit from using multiple cores. Thus I want to parallelize parts of the code. Say I want to achieve the following pseudo-code:
Fetch items from database
Divide items in X chunks
Process chunks in parallel
Wait for chunks to finish
Do some other processing (single-thread)
Now I've been using GCD, and a naive approach would look like this:
let group = dispatch_group_create()
let queue = dispatch_queue_create("", DISPATCH_QUEUE_CONCURRENT)
for chunk in chunks {
dispatch_group_async(group, queue) {
worker(chunk)
}
}
dispatch_group_wait(group, DISPATCH_TIME_FOREVER)
However GCD requires a run loop, so the code will hang as the group is never executed. The runloop can be started with dispatch_main(), but it never exits. It is also possible to run the NSRunLoop just a few seconds, however that doesn't feel like a solid solution. Regardless of GCD, how can this be achieved using Swift?
I mistakenly interpreted the locking thread for a hanging program. The work will execute just fine without a run loop. The code in the question will run fine, and blocking the main thread until the whole group has finished.
So say chunks contains 4 items of workload, the following code spins up 4 concurrent workers, and then waits for all of the workers to finish:
let group = DispatchGroup()
let queue = DispatchQueue(label: "", attributes: .concurrent)
for chunk in chunk {
queue.async(group: group, execute: DispatchWorkItem() {
do_work(chunk)
})
}
_ = group.wait(timeout: .distantFuture)
Just like with an Objective-C CLI, you can make your own run loop using NSRunLoop.
Here's one possible implementation, modeled from this gist:
class MainProcess {
var shouldExit = false
func start () {
// do your stuff here
// set shouldExit to true when you're done
}
}
println("Hello, World!")
var runLoop : NSRunLoop
var process : MainProcess
autoreleasepool {
runLoop = NSRunLoop.currentRunLoop()
process = MainProcess()
process.start()
while (!process.shouldExit && (runLoop.runMode(NSDefaultRunLoopMode, beforeDate: NSDate(timeIntervalSinceNow: 2)))) {
// do nothing
}
}
As Martin points out, you can use NSDate.distantFuture() as NSDate instead of NSDate(timeIntervalSinceNow: 2). (The cast is necessary because the distantFuture() method signature indicates it returns AnyObject.)
If you need to access CLI arguments see this answer. You can also return exit codes using exit().
Swift 3 minimal implementation of Aaron Brager solution, which simply combines autoreleasepool and RunLoop.current.run(...) until you break the loop:
var shouldExit = false
doSomethingAsync() { _ in
defer {
shouldExit = true
}
}
autoreleasepool {
var runLoop = RunLoop.current
while (!shouldExit && (runLoop.run(mode: .defaultRunLoopMode, before: Date.distantFuture))) {}
}
I think CFRunLoop is much easier than NSRunLoop in this case
func main() {
/**** YOUR CODE START **/
let group = dispatch_group_create()
let queue = dispatch_queue_create("", DISPATCH_QUEUE_CONCURRENT)
for chunk in chunks {
dispatch_group_async(group, queue) {
worker(chunk)
}
}
dispatch_group_wait(group, DISPATCH_TIME_FOREVER)
/**** END **/
}
let runloop = CFRunLoopGetCurrent()
CFRunLoopPerformBlock(runloop, kCFRunLoopDefaultMode) { () -> Void in
dispatch_async(dispatch_queue_create("main", nil)) {
main()
CFRunLoopStop(runloop)
}
}
CFRunLoopRun()