Combining two Observable<Void>s - swift

I'm still a reactive newbie and I'm looking for help.
func doA() -> Observable<Void>
func doB() -> Observable<Void>
enum Result {
case Success
case BFailed
}
func doIt() -> Observable<Result> {
// start both doA and doB.
// If both complete then emit .Success and complete
// If doA completes, but doB errors emit .BFailed and complete
// If both error then error
}
The above is what I think I want... The initial functions doA() and doB() wrap network calls so they will both emit one signal and then Complete (or Error without emitting any Next events.) If doA() completes but doB() errors, I want doIt() to emit .BFailed and then complete.
It feels like I should be using zip or combineLatest but I'm not sure how to know which sequence failed if I do that. I'm also pretty sure that catchError is part of the solution, but I'm not sure exactly where to put it.
--
As I'm thinking about it, I'm okay with the calls happening sequentially. That might even be better...
IE:
Start doA()
if it completes start doB()
if it completes emit .Success
else emit .BFailed.
else forward the error.
Thanks for any help.

I believe .flatMapLatest() is what you're looking for, chaining your observable requests.
doFirst()
.flatMapLatest({ [weak self] (firstResult) -> Observable<Result> in
// Assuming this doesn't fail and returns result on main scheduler,
// otherwise `catchError` and `observeOn(MainScheduler.instance)` can be used to correct this
// ...
// do something with result #1
// ...
return self?.doSecond()
}).subscribeNext { [weak self] (secondResult) -> Void in
// ...
// do something with result #2
// ...
}.addDisposableTo(disposeBag)
And here is .flatMapLatest() doc in RxSwift.
Projects each element of an observable sequence into a new sequence of observable sequences and then
transforms an observable sequence of observable sequences into an observable sequence producing values only from the most recent observable sequence. It is a combination of map + switchLatest operator.

I apologize that I don't know the syntax for swift, so I'm writing the answer in c#. The code should be directly translatable.
var query =
doA
.Materialize()
.Zip(doB.Materialize(), (ma, mb) => new { ma, mb })
.Select(x =>
x.ma.Kind == NotificationKind.OnError
|| x.mb.Kind == NotificationKind.OnError
? Result.BFailed
: Result.Success);
Basically the .Materialize() operator turns the OnNext, OnError, and OnCompleted notifications for an observable of type T into OnNext notifications for an observable of type Notification<T>. You can then .Zip(...) these and check for your required conditions.

I've learned RxSwift well enough to answer this question now...
func doIt() -> Observable<Result> {
Observable.zip(
doA().map { Result.Success },
doB().map { Result.Success }
.catch { _ in Observable.just(Result.BFailed) }
) { $1 }
}

Related

Mapping Swift Combine Future to another Future

I have a method that returns a Future:
func getItem(id: String) -> Future<MediaItem, Error> {
return Future { promise in
// alamofire async operation
}
}
I want to use it in another method and covert MediaItem to NSImage, which is a synchronous operation. I was hoping to simply do a map or flatMap on the original Future but it creates a long Publisher that I cannot erased to Future<NSImage, Error>.
func getImage(id: String) -> Future<NSImage, Error> {
return getItem(id).map { mediaItem in
// some sync operation to convert mediaItem to NSImage
return convertToNSImage(mediaItem) // this returns NSImage
}
}
I get the following error:
Cannot convert return expression of type 'Publishers.Map<Future<MediaItem, Error>, NSImage>' to return type 'Future<NSImage, Error>'
I tried using flatMap but with a similar error. I can eraseToAnyPublisher but I think that hides the fact that getImage(id: String returns a Future.
I suppose I can wrap the body of getImage in a future but that doesn't seem as clean as chaining and mapping. Any suggestions would be welcome.
You can't use dribs and drabs and bits and pieces from the Combine framework like that. You have to make a pipeline — a publisher, some operators, and a subscriber (which you store so that the pipeline will have a chance to run).
Publisher
|
V
Operator
|
V
Operator
|
V
Subscriber (and store it)
So, here, getItem is a function that produces your Publisher, a Future. So you can say
getItem (...)
.map {...}
( maybe other operators )
.sink {...} (or .assign(...))
.store (...)
Now the future (and the whole pipeline) will run asynchronously and the result will pop out the end of the pipeline and you can do something with it.
Now, of course you can put the Future and the Map together and then stop, vending them so someone else can attach other operators and a subscriber to them. You have now assembled the start of a pipeline and no more. But then its type is not going to be Future; it will be an AnyPublisher<NSImage,Error>. And there's nothing wrong with that!
You can always wrap one future in another. Rather than mapping it as a Publisher, subscribe to its result in the future you want to return.
func mapping(futureToWrap: Future<MediaItem, Error>) -> Future<NSImage, Error> {
var cancellable: AnyCancellable?
return Future<String, Error> { promise in
// cancellable is captured to assure the completion of the wrapped future
cancellable = futureToWrap
.sink { completion in
if case .failure(let error) = completion {
promise(.failure(error))
}
} receiveValue: { value in
promise(.success(convertToNSImage(mediaItem)))
}
}
}
This could always be generalized to
extension Publisher {
func asFuture() -> Future<Output, Failure> {
var cancellable: AnyCancellable?
return Future<Output, Failure> { promise in
// cancellable is captured to assure the completion of the wrapped future
cancellable = self.sink { completion in
if case .failure(let error) = completion {
promise(.failure(error))
}
} receiveValue: { value in
promise(.success(value))
}
}
}
}
Note above that if the publisher in question is a class, it will get retained for the lifespan of the closure in the Future returned. Also, as a future, you will only ever get the first value published, after which the future will complete.
Finally, simply erasing to AnyPublisher is just fine. If you want to assure you only get the first value (similar to getting a future's only value), you could just do the following:
getItem(id)
.map(convertToNSImage)
.eraseToAnyPublisher()
.first()
The resulting type, Publishers.First<AnyPublisher<Output, Failure>> is expressive enough to convey that only a single result will ever be received, similar to a Future. You could even define a typealias to that end (though it's probably overkill at that point):
typealias AnyFirst<Output, Failure> = Publishers.First<AnyPublisher<Output, Failure>>

Creating and consuming a cursor with Vapor 3

This might be a can of worms, I'll do my best to describe the issue. We have a long running data processing job. Our database of actions is added to nightly and the outstanding actions are processed. It takes about 15 minutes to process nightly actions. In Vapor 2 we utilised a lot of raw queries to create a PostgreSQL cursor and loop through it until it was empty.
For the time being, we run the processing via a command line parameter. In future we wish to have it run as part of the main server so that progress can be checked while processing is being performed.
func run(using context: CommandContext) throws -> Future<Void> {
let table = "\"RecRegAction\""
let cursorName = "\"action_cursor\""
let chunkSize = 10_000
return context.container.withNewConnection(to: .psql) { connection in
return PostgreSQLDatabase.transactionExecute({ connection -> Future<Int> in
return connection.simpleQuery("DECLARE \(cursorName) CURSOR FOR SELECT * FROM \(table)").map { result in
var totalResults = 0
var finished : Bool = false
while !finished {
let results = try connection.raw("FETCH \(chunkSize) FROM \(cursorName)").all(decoding: RecRegAction.self).wait()
if results.count > 0 {
totalResults += results.count
print(totalResults)
// Obviously we do our processing here
}
else {
finished = true
}
}
return totalResults
}
}, on: connection)
}.transform(to: ())
}
Now this doesn't work because I'm calling wait() and I get the error "Precondition failed: wait() must not be called when on the EventLoop" which is fair enough. One of the issues I face is that I have no idea how you even get off the main event loop to run things like this on a background thread. I am aware of BlockingIOThreadPool, but that still seems to operate on the same EventLoop and still causes the error. While I'm able to theorise more and more complicated ways to achieve this, I'm hoping I'm missing an elegant solution which perhaps somebody with better knowledge of SwiftNIO and Fluent could help out with.
Edit: To be clear, the goal of this is obviously not to total up the number of actions in the database. The goal is to use the cursor to process every action synchronously. As I read the results in, I detect changes in the actions and then throw batches of them out to processing threads. When all the threads are busy, I don't start reading from the cursor again until they complete.
There are a LOT of these actions, up to 45 million in a single run. Aggregating promises and recursion didn't seem to be a great idea and when I tried it, just for the sake of it, the server hung.
This is a processing intensive task that can run for days on a single thread, so I'm not concerned about creating new threads. The issue is that I cannot work out how I can use the wait() function inside a Command as I need a container to create the database connection and the only one I have access to is context.container Calling wait() on this leads to the above error.
TIA
Ok, so as you know, the problem lies in these lines:
while ... {
...
try connection.raw("...").all(decoding: RecRegAction.self).wait()
...
}
you want to wait for a number of results and therefore you use a while loop and .wait() for all the intermediate results. Essentially, this is turning asynchronous code into synchronous code on the event loop. That is likely leading to deadlocks and will for sure stall other connections which is why SwiftNIO tries to detect that and give you that error. I won't go into the details why it's stalling other connections or why this is likely to lead to deadlocks in this answer.
Let's see what options we have to fix this issue:
as you say, we could just have this .wait() on another thread that isn't one of the event loop threads. For this any non-EventLoop thread would do: Either a DispatchQueue or you could use the BlockingIOThreadPool (which does not run on an EventLoop)
we could rewrite your code to be asynchronous
Both solutions will work but (1) is really not advisable as you would burn a whole (kernel) thread just to wait for the results. And both Dispatch and BlockingIOThreadPool have a finite number of threads they're willing to spawn so if you do that often enough you might run out of threads so it'll take even longer.
So let's look into how we can call an asynchronous function multiple times whilst accumulating the intermediate results. And then if we have accumulated all the intermediate results continue with all the results.
To make things easier let's look at a function that is very similar to yours. We assume this function to be provided just like in your code
/// delivers partial results (integers) and `nil` if no further elements are available
func deliverPartialResult() -> EventLoopFuture<Int?> {
...
}
what we would like now is a new function
func deliverFullResult() -> EventLoopFuture<[Int]>
please note how the deliverPartialResult returns one integer each time and deliverFullResult delivers an array of integers (ie. all the integers). Ok, so how do we write deliverFullResult without calling deliverPartialResult().wait()?
What about this:
func accumulateResults(eventLoop: EventLoop,
partialResultsSoFar: [Int],
getPartial: #escaping () -> EventLoopFuture<Int?>) -> EventLoopFuture<[Int]> {
// let's run getPartial once
return getPartial().then { partialResult in
// we got a partial result, let's check what it is
if let partialResult = partialResult {
// another intermediate results, let's accumulate and call getPartial again
return accumulateResults(eventLoop: eventLoop,
partialResultsSoFar: partialResultsSoFar + [partialResult],
getPartial: getPartial)
} else {
// we've got all the partial results, yay, let's fulfill the overall future
return eventLoop.newSucceededFuture(result: partialResultsSoFar)
}
}
}
Given accumulateResults, implementing deliverFullResult is not too hard anymore:
func deliverFullResult() -> EventLoopFuture<[Int]> {
return accumulateResults(eventLoop: myCurrentEventLoop,
partialResultsSoFar: [],
getPartial: deliverPartialResult)
}
But let's look more into what accumulateResults does:
it invokes getPartial once, then when it calls back it
checks if we have
a partial result in which case we remember it alongside the other partialResultsSoFar and go back to (1)
nil which means partialResultsSoFar is all we get and we return a new succeeded future with everything we have collected so far
that's already it really. What we did here is to turn the synchronous loop into asynchronous recursion.
Ok, we looked at a lot of code but how does this relate to your function now?
Believe it or not but this should actually work (untested):
accumulateResults(eventLoop: el, partialResultsSoFar: []) {
connection.raw("FETCH \(chunkSize) FROM \(cursorName)")
.all(decoding: RecRegAction.self)
.map { results -> Int? in
if results.count > 0 {
return results.count
} else {
return nil
}
}
}.map { allResults in
return allResults.reduce(0, +)
}
The result of all this will be an EventLoopFuture<Int> which carries the sum of all the intermediate result.count.
Sure, we first collect all your counts into an array to then sum it up (allResults.reduce(0, +)) at the end which is a bit wasteful but also not the end of the world. I left it this way because that makes accumulateResults be usable in other cases where you want to accumulate partial results in an array.
Now one last thing, a real accumulateResults function would probably be generic over the element type and also we can eliminate the partialResultsSoFar parameter for the outer function. What about this?
func accumulateResults<T>(eventLoop: EventLoop,
getPartial: #escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<[T]> {
// this is an inner function just to hide it from the outside which carries the accumulator
func accumulateResults<T>(eventLoop: EventLoop,
partialResultsSoFar: [T] /* our accumulator */,
getPartial: #escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<[T]> {
// let's run getPartial once
return getPartial().then { partialResult in
// we got a partial result, let's check what it is
if let partialResult = partialResult {
// another intermediate results, let's accumulate and call getPartial again
return accumulateResults(eventLoop: eventLoop,
partialResultsSoFar: partialResultsSoFar + [partialResult],
getPartial: getPartial)
} else {
// we've got all the partial results, yay, let's fulfill the overall future
return eventLoop.newSucceededFuture(result: partialResultsSoFar)
}
}
}
return accumulateResults(eventLoop: eventLoop, partialResultsSoFar: [], getPartial: getPartial)
}
EDIT: After your edit your question suggests that you do not actually want to accumulate the intermediate results. So my guess is that instead, you want to do some processing after every intermediate result has been received. If that's what you want to do, maybe try this:
func processPartialResults<T, V>(eventLoop: EventLoop,
process: #escaping (T) -> EventLoopFuture<V>,
getPartial: #escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<V?> {
func processPartialResults<T, V>(eventLoop: EventLoop,
soFar: V?,
process: #escaping (T) -> EventLoopFuture<V>,
getPartial: #escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<V?> {
// let's run getPartial once
return getPartial().then { partialResult in
// we got a partial result, let's check what it is
if let partialResult = partialResult {
// another intermediate results, let's call the process function and move on
return process(partialResult).then { v in
return processPartialResults(eventLoop: eventLoop, soFar: v, process: process, getPartial: getPartial)
}
} else {
// we've got all the partial results, yay, let's fulfill the overall future
return eventLoop.newSucceededFuture(result: soFar)
}
}
}
return processPartialResults(eventLoop: eventLoop, soFar: nil, process: process, getPartial: getPartial)
}
This will (as before) run getPartial until it returns nil but instead of accumulating all of getPartial's results, it calls process which gets the partial result and can do some further processing. The next getPartial call will happen when the EventLoopFuture process returns is fulfilled.
Is that closer to what you would like?
Notes: I used SwiftNIO's EventLoopFuture type here, in Vapor you would just use Future instead but the remainder of the code should be the same.
Here's the generic solution, rewritten for NIO 2.16/Vapor 4, and as an extension to EventLoop
extension EventLoop {
func accumulateResults<T>(getPartial: #escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<[T]> {
// this is an inner function just to hide it from the outside which carries the accumulator
func accumulateResults<T>(partialResultsSoFar: [T] /* our accumulator */,
getPartial: #escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<[T]> {
// let's run getPartial once
return getPartial().flatMap { partialResult in
// we got a partial result, let's check what it is
if let partialResult = partialResult {
// another intermediate results, let's accumulate and call getPartial again
return accumulateResults(partialResultsSoFar: partialResultsSoFar + [partialResult],
getPartial: getPartial)
} else {
// we've got all the partial results, yay, let's fulfill the overall future
return self.makeSucceededFuture(partialResultsSoFar)
}
}
}
return accumulateResults(partialResultsSoFar: [], getPartial: getPartial)
}
}

How to include completion handlers for multiple functions in swift?

Consider this code:
func test() {
A()
B()
C()
D()
E()
}
Each function here have their own set of actions like calling API's, parsing them, writing results in file, uploading to servers, etc etc.
I want to run this functions one by one. I read about completion handlers. My problem with completion handlers are:
All examples given to understand completion handlers was having just two methods
I don't want to place this functions in some other function. I want all function calls (A to E) inside Test() function alone
Can someone help on this?
This is easy to do, you just need to add a closure argument to call on completion. For example:
func a(completion: (() -> Void)) {
// When all async operations are complete:
completion()
}
func b(completion: (() -> Void)) {
// When all async operations are complete:
completion()
}
func c(completion: (() -> Void)) {
// When all async operations are complete:
completion()
}
func d(completion: (() -> Void)) {
// When all async operations are complete:
completion()
}
func e(completion: (() -> Void)) {
// When all async operations are complete:
completion()
}
func test() {
a {
b {
c {
d {
e {
// All have now completed.
}
}
}
}
}
}
As you can see, this looks horrible. One problem with multiple async operations, non concurrently is that you end up with this horrible nesting.
Solutions to this do exist, I personally recommend PromiseKit. It encapsulates blocks in easy chaining methods, which is far cleaner.
Take a look at reactive programming and observable chains. RxSwift will be an excellent library for you. Reactive programming is very popular right now and was created exactly to solve problems like yours. It enables you to easily create a process (like an assembly line) that transforms inputs into your desired outputs. For example, your code in Rx would look like:
A()
.flatMap { resultsFromA in
B(resultsFromA)
}
.flatMap { resultsFromB in
C(resultsFromB)
}
.flatMap { resultsFromC in
D(resultsFromC)
}
.flatMap { resultsFromD in
E(resultsFromD)
}
.subscribe(onNext: { finalResult in
//result from the process E, subscribe starts the whole 'assembly line'
//without subscribing nothing will happen, so you can easily create observable chains and reuse them throughout your app
})
This code sample would create a process where you'd transform results from your initial API call (A), into parsed results (B), then write parsed results to a file (C) etc. Those processes can be synchronous or asynchronous. Reactive programming can be understood as Observable pattern on steroids.

Is it possible to make `ReplaySubject` to run a closure on being subscribed to?

I want to create a cold observable that would only start doing expensive operation if there is an actual subscription. ReplaySubject would fit nicely except for the part that I need to be able to start an expensive background operation when the actual subscription is made and not on create of the observable. Is there a way to do so? Some sort of onSubscribed { ... } method.
Here are a couple of options:
Adding the expensive operation to a doOn(onSubscribe:) that's in between the Observable and the subscription:
let observable = Observable.of(1, 2)
.doOn(onSubscribe: { _ in
expensiveOperation()
})
observable
.subscribeNext { e in
print(e)
}
Making the Observable connectable and separating the doOn(onSubscribe:):
let observable = Observable.of(1, 2)
.publish()
observable
.doOn(onSubscribe: { _ in
expensiveOperation()
})
.subscribe()
observable
.subscribeNext { e in
print(e)
}
observable.connect()

Proper way to dispose of a disposable within an observable

I have an HTTPService which returns an Observable<NSData>. My goal is to compose that service into another service, ServiceA which transforms that data for my use case. Using Observable.create in RxSwift 2.0.0-rc.0 in ServiceA it's straight forward enough. My question is how to properly handle the disposable returned from the subscription of the HTTPService.
If I don't do anything I get the compile time warning that the result of call is unused: http://git.io/rxs.ud. I understand from reading that if I do nothing it's likely ok: (where xs mentioned below is let xs: Observable<E> ....
In case xs terminates in a predictable way with Completed or Error message, not handling subscription Disposable won't leak any resources, but it's still preferred way because in that way element computation is terminated at predictable moment.
So here is how I am currently addressing it, and also where I am wondering if I am doing this properly or if I have misunderstood something.
public struct ServiceA{
public static func changes() -> Observable<ChangeSet>{
return Observable.create{ observable in
// return's Observable<NSData>
let request = HTTPService.get("https://httpbin.org/get")
let disposable = request.subscribe(
onNext: { data in
// Do more work to transform this data
// into something meaningful for the application.
// For example purposes just use an empty object
observable.onNext(ChangeSet())
observable.onCompleted()
},
onError:{ error in
observable.onError(error)
})
// Is this the right way to deal with the
// disposable from the subscription in this situation?
return AnonymousDisposable{
disposable.dispose()
}
}
}
}
As documentation says
subscribe function returns a subscription Disposable that can be used to cancel computation and free resources.
Preferred way of terminating these fluent calls is by using
.addDisposableTo(disposeBag) or in some equivalent way.
When disposeBag gets deallocated, subscription will be automatically
disposed.
Actually your example looks fine in terms of rules, but it loos pretty bad ;) (Also it would be ok, if you would just return this disposable) :
public static func changes() -> Observable<ChangeSet>{
return Observable.create{ observable in
// return's Observable<NSData>
let request = HTTPService.get("https://httpbin.org/get")
return request.subscribe(
onNext: { data in
// Do more work to transform this data
// into something meaningful for the application.
// For example purposes just use an empty object
observable.onNext(ChangeSet())
observable.onCompleted()
},
onError:{ error in
observable.onError(error)
})
}
But as you you returning Observeble I wonder, why you dont just use map operator ?
In your example it would be something like this:
public static func changes() -> Observable<ChangeSet> {
return HTTPService.get("https://httpbin.org/get")
.map(ChangeSet.init)
}