I am trying to load data from two different endpoints using two different publishers which have different return types. I need to update the UI when both requests complete, but both requests can also fail so Zip doesn't do the trick. Usually I would use a DispatchGroup to accomplish this, but I have not figured out how to do that using Combine. Is there a way to use DispatchGroup with Combine?
let dispatchGroup: DispatchGroup = .init()
let networkQueue: DispatchQueue = .init(label: "network", cos: .userInitiated)
dispatchGroup.notify { print("work all done!" }
publisher
.receive(on: networkQueue, options: .init(group: dispatchGroup)
.sink { ... }
.receiveValue { ... }
.store(in: &cancellables)
publisher2
.receive(on: networkQueue, options: .init(group: dispatchGroup)
.sink { ... }
.receiveValue { ... }
.store(in: &cancellables)
The notify is immediately executed. Is this not the right way of doing this?
You'll want to use the Publishers.CombineLatest which will take the two publishers and create a new publisher, with the result of the latest value from both streams:
Publishers.CombineLatest(publisher, publisher2)
// Receive values on the main queue (you decide whether you want to do this)
.receive(on: DispatchQueue.main)
.sink(receiveCompletion: { completion in
// Handle error / completion
// If either stream produces an error, the error will be forwarded in here
}, receiveValue: { value1, value2 in
// value1 will be the value of publisher's Output type
// value2 will be the value of pubslier2's Output type
})
// You only need to store this subscription - not publisher and publisher2 individually
.store(in: &cancellables)
The Publishers.CombineLatest publisher, is what can be seen as the equivalent of using a DispatchGroup, where you call dispatchGroup.enter() for each network operation you initiate. However, one key difference is that the CombineLatest publisher will produce more than one value, if any of the publishers produce more than one value. For normal network operations, you don't need to worry about this. But if you find yourself in a situation where you only need the first or the first N values produces by the combined publisher, you could use the prefix(_:) modifier, which will make sure that you will never receive more than N events.
EDIT: Updated to fix typo in code.
Related
I currently have code like...
var isFetching = false
func fetch() {
guard !isFetching else { return }
fetching = true
apiPublisher
.receive(on: .main)
.handleEvents(receiveCompletion: { [weak self] _ in
self?.fetching = false
})
.assign(to: \.foo, on: self)
.store(in: &cancellables)
}
But I'm not happy with the way I'm stopping multiple requests happening like this. It works but feels clunky.
I feel like there should be a more "Combine-y" way of doing this.
Is there a more elegant/Combine-y way of doing this?
The issue here is the fact that you have the entire Publisher subscription chain in a function (fetch()) that can be called by virtually any other code at virtually any time from any thread. In order to do this in a "more Combine-y" you need to limit that. So the question is what calls this fetch() function? If you wrap all those call sites in a Publisher and then setup a trigger.flatMap { apiPublisher } and set up the flatMap to ignore events while it's waiting for the current network request to complete.
Something like this:
trigger
.flatMap(maxPublishers: .max(1)) {
apiPublisher
}
.receive(on: .main)
.assign(to: \.foo, on: self)
.store(in: &cancellables)
The important thing to note here is that you only execute this code once (probably in the viewDidLoad if this is a view controller. Then when you want to make the request, you send an event through the trigger Publisher. The flatMap will ensure that only one apiPublisher subscription will happen at a time.
I'm trying to use Combine to do several millions concurrent request through the network. Here is a mock up of the naive approach I'n using:
import Foundation
import Combine
let cancellable = (0..<1_000_000).publisher
.map(some_preprocessing)
.flatMap(maxPublishers: .max(32)) { request in
URLSession.dataTaskPublisher(for: request)
.map(\.data)
.catch { _ in
return Just(Data())
}
}
.sink { completion in
print(completion)
} receiveValue: { value in
print(value)
}
// Required in a command line tool
sleep(100)
This pipeline first creates a request, the the request is done in flatMap to confine errors. Also, flatMap merges several requests to they are effectively done concurrently, which is great.
The issue is that it will literally make 1,000,000 requests concurrently, so I added the parameter maxPublishers which limits the number of publishers that are subscribed at the same time in flatMap. This kind of work, only 32 publishers are active at the same time, but unfortunately some_preprocessing will still be performed 1,000,000 times before flatMap will be executed.
I expected flatMap(maxPublishers: .max(32)) to apply some back pressure, i.e. only requesting items from the upstream publisher map when maxPublishers < 32. This does not seem to be the case, and it fills up the RAM rapidly and delays the processing.
I then tried to use the buffer operator that is used to introduce back pressure between a producer and a consumer, but Apple documentation is so poor I don't understand its functioning (more specifically the prefechStrategy argument).
So I tried different combinations such as:
import Foundation
import Combine
let cancellable = (0..<1_000_000).publisher
.map(some_preprocessing)
.buffer(size: 32, prefetch: .byRequest, whenFull: .dropNewest)
.flatMap(maxPublishers: .max(32)) { request in
URLSession.dataTaskPublisher(for: request)
.map(\.data)
.catch { _ in
return Just(Data())
}
}
.sink { completion in
print(completion)
} receiveValue: { value in
print(value)
}
// Required in a command line tool
sleep(100)
This does not seem to do anything useful though, flatMap still requests as much element as it can.
How to properly apply back pressure in this case? I.e I need the upstream map publisher to "wait" for demand asked by the downstream publisher flatMap, which should only ask items when it as an empty slot.
The issue appears to be a Combine bug, as pointed out here. Using Publishers.Sequence causes the following operator to accumulate every value sent downstream before proceeding.
A workaround is to type-erase the sequence publisher:
import Foundation
import Combine
let cancellable = (0..<1_000_000).publisher
.eraseToAnyPublisher() // <----
.map(some_preprocessing)
.flatMap(maxPublishers: .max(32)) { request in
URLSession.dataTaskPublisher(for: request)
.map(\.data)
.catch { _ in
return Just(Data())
}
}
.sink { completion in
print(completion)
} receiveValue: { value in
print(value)
}
// Required in a command line tool without running loop
sleep(.max)
We know Empty Publisher in Combine will trigger a completion event immediately:
Empty<Void,Never>()
.sink {
print("completion: \($0)") // will print!
} receiveValue: {}
But Empty Publisher that flatMap returned will NOT trigger completion event:
var subs = Set<AnyCancellable>()
let p0 = PassthroughSubject<[Int],Error>()
let p1 = p0
.flatMap {_ in
Empty<Void,Never>() // same Empty Publisher
}.eraseToAnyPublisher()
p1
.sink {
print("completion: \($0)") // but NOT print!
} receiveValue: {}
.store(in: &subs)
p0.send([1,2,3])
Why is that??? Am I miss something??? Thanks! ;)
FlatMap works in the following way: for every upstream value it creates a publisher. The downstream receives all the values emitted by all these FlatMap-created publishers.
It completes when the upstream completes, or errors out if either the upstream errors out, or if any created publishers error out.
So, in your case, for the single upstream value of [1,2,3] you emit an Empty publisher (which completes), but there's no overall completion because PassthroughSubject hasn't completed.
p0.send([1,2,3])
p0.send(completion: .finished)
The above would complete the entire pipeline.
Im moving my project to Combine from RxSwift
I have a logic where I want publisher to emit event every time I click button. Acrually clicking button executed pushMe.send()
pushMe
.print("Debug")
.flatMap { (res) -> AnyPublisher<Bool, Error> in
return Future<Bool, Error>.init { closure in
closure(.failure(Errors.validationFail))
}.eraseToAnyPublisher()
}
.sink(receiveCompletion: { completion in
print("Completion received")
}, receiveValue: { value in
print("Value = \(value)")
})
.store(in: &subscriptions)
The console result
Debug: receive value: (true)
Completion received
Debug: receive value: (true)
Debug: receive value: (true)
I do not understand why sink receive error only on first event. The rest clicks are ignored.
What does flatMap do -
Subscribes to the given publisher (let's say XPublisher).
Sends the Errors and Output values (not finished event/ completion) emitted
by XPublisher to the down stream.
So If you handle errors inside the flat map , (which means the publisher inside the flatMap does not emit errors), then flatMap Never sends an error to the down stream.
pushMe
.print("Debug")
.flatMap { (res) -> AnyPublisher<Bool, Never> in //<= here
return Future<Bool, Error>.init { closure in
closure(.failure(Errors.validationFail))
}
.replaceError(with: false) //<= here
.eraseToAnyPublisher()
}
.sink(receiveCompletion: { completion in
print("Completion received")
}, receiveValue: { value in
print("Value = \(value)")
})
.store(in: &subscriptions)
Otherwise you can handle error outside the fatMap. Problem here is that, once an error out whole the subscription / cancellable cancelled. ( in the below example error has replace with a false value)
pushMe
.print("Debug")
.flatMap { (res) -> AnyPublisher<Bool, Error> in
return Future<Bool, Error>.init { closure in
closure(.failure(Errors.validationFail))
}
.eraseToAnyPublisher()
}
.replaceError(with: false) //<= here
.sink(receiveCompletion: { completion in
print("Completion received")
}, receiveValue: { value in
print("Value = \(value)")
})
.store(in: &subscriptions)
What is happening in the above code.
FlatMap error outs.
replace the error with false (One false value will receive Because of this)
subscription cancelled because of the error out in the stream.
The rule is that if an error propagates down the pipeline, the entire pipeline is cancelled. Thus, if your Future generates an error, it passes as an error to the Sink and thus the pipeline is cancelled all the way up to the Publisher.
The pattern for preventing this is to deal with the error inside the FlatMap. Basically, you've got two pipelines here: the one that starts with pushMe and the one that starts with Future. Simply don't let the error generated by the Future pipeline "leak" out into the pushMe pipeline, and so the pushMe pipeline will not be cancelled. Instead, catch the error inside the FlatMap and, if you want to pass something out of it to your Sink, pass out of it some sort of value that tells your Sink that there has been a bad input.
A simple solution in your case would be to change the type your FlatMap to <Bool,Never>, and pass either true or false as the Bool to indicate whether validation succeeded in the Future or not.
Or, if it's important to you to pass more detailed information about the error down the pipeline, change the type of your FlatMap to <Result<Bool,Error>,Never> and package the error information into the .failure case of the Result object.
This is how Publishers work in Combine.
The Publisher can either emit values or emit a completion event - once a completion event was emitted, the Publisher is finished and it cannot emit any other values or another completion event. If the Publisher emits an error, the error is emitted as a failure completion, meaning that once an error is emitted, the Publisher completes and it cannot emit any more values.
There are several Combine operators designed for handling errors without completing the Publisher. Have a look into the catch operator for instance.
First, thanks all for helping with this question.
Answer of #matt is one of the possible solution.
Another solution is to create new pipeline every time you clicking button.
Im using this approach because I have sequence of steps below failing publisher and Im not able to rely of dummy true/false result further.
Just<String>()
.sink(receiveValue: { value in
startProcess()
.sink(receiveCompletion: { (completion:
Subscribers.Completion<Failure>) in
// can handle ALL types of error of pipe line in ONE place
}, receiveValue: { (v: P.Output) in
// handle correct result
})
})
.store(in: &subscriptions)
func startProcess() -> AnyPublisher<Bool, Error> {
Future<Bool, Error>.init { closure in
// action 1
closure(.success(true))
}
.flatMap { (b: Bool) -> AnyPubilsher<Void, Error> in
Future<Bool, Error>.init { closure in
// action 2
closure(.success(()))
}
}
}
Benefit is that you are able to handle all types of errors in one place if second sink()
If you use Combine for network requests with URLSession, then you need to save the Subscription (aka, the AnyCancellable) - otherwise it gets immediately deallocated, which cancels the network request. Later, when the network response has been processed, you want to deallocate the subscription, because keeping it around would be a waste of memory.
Below is some code that does this. It's kind of awkward, and it may not even be correct. I can imagine a race condition where network request could start and complete on another thread before sub is set to the non-nil value.
Is there a nicer way to do this?
class SomeThing {
var subs = Set<AnyCancellable>()
func sendNetworkRequest() {
var request: URLRequest = ...
var sub: AnyCancellable? = nil
sub = URLSession.shared.dataTaskPublisher(for: request)
.map(\.data)
.decode(type: MyResponse.self, decoder: JSONDecoder())
.sink(
receiveCompletion: { completion in
self.subs.remove(sub!)
},
receiveValue: { response in ... }
}
subs.insert(sub!)
I call this situation a one-shot subscriber. The idea is that, because a data task publisher publishes only once, you know for a fact that it is safe to destroy the pipeline after you receive your single value and/or completion (error).
Here's a technique I like to use. First, here's the head of the pipeline:
let url = URL(string:"https://www.apeth.com/pep/manny.jpg")!
let pub : AnyPublisher<UIImage?,Never> =
URLSession.shared.dataTaskPublisher(for: url)
.map {$0.data}
.replaceError(with: Data())
.compactMap { UIImage(data:$0) }
.receive(on: DispatchQueue.main)
.eraseToAnyPublisher()
Now comes the interesting part. Watch closely:
var cancellable: AnyCancellable? // 1
cancellable = pub.sink(receiveCompletion: {_ in // 2
cancellable?.cancel() // 3
}) { image in
self.imageView.image = image
}
Do you see what I did there? Perhaps not, so I'll explain it:
First, I declare a local AnyCancellable variable; for reasons having to do with the rules of Swift syntax, this needs to be an Optional.
Then, I create my subscriber and set my AnyCancellable variable to that subscriber. Again, for reasons having to do with the rules of Swift syntax, my subscriber needs to be a Sink.
Finally, in the subscriber itself, I cancel the AnyCancellable when I receive the completion.
The cancellation in the third step actually does two things quite apart from calling cancel() — things having to do with memory management:
By referring to cancellable inside the asynchronous completion function of the Sink, I keep cancellable and the whole pipeline alive long enough for a value to arrive from the subscriber.
By cancelling cancellable, I permit the pipeline to go out of existence and prevent a retain cycle that would cause the surrounding view controller to leak.
Below is some code that does this. It's kind of awkward, and it may not even be correct. I can imagine a race condition where network request could start and complete on another thread before sub is set to the non-nil value.
Danger! Swift.Set is not thread safe. If you want to access a Set from two different threads, it is up to you to serialize the accesses so they don't overlap.
What is possible in general (although not perhaps with URLSession.DataTaskPublisher) is that a publisher emits its signals synchronously, before the sink operator even returns. This is how Just, Result.Publisher, Publishers.Sequence, and others behave. So those produce the problem you're describing, without involving thread safety.
Now, how to solve the problem? If you don't think you want to actually be able to cancel the subscription, then you can avoid creating an AnyCancellable at all by using Subscribers.Sink instead of the sink operator:
URLSession.shared.dataTaskPublisher(for: request)
.map(\.data)
.decode(type: MyResponse.self, decoder: JSONDecoder())
.subscribe(Subscribers.Sink(
receiveCompletion: { completion in ... },
receiveValue: { response in ... }
))
Combine will clean up the subscription and the subscriber after the subscription completes (with either .finished or .failure).
But what if you do want to be able to cancel the subscription? Maybe sometimes your SomeThing gets destroyed before the subscription is complete, and you don't need the subscription to complete in that case. Then you do want to create an AnyCancellable and store it in an instance property, so that it gets cancelled when SomeThing is destroyed.
In that case, set a flag indicating that the sink won the race, and check the flag before storing the AnyCancellable.
var sub: AnyCancellable? = nil
var isComplete = false
sub = URLSession.shared.dataTaskPublisher(for: request)
.map(\.data)
.decode(type: MyResponse.self, decoder: JSONDecoder())
// This ensures thread safety, if the subscription is also created
// on DispatchQueue.main.
.receive(on: DispatchQueue.main)
.sink(
receiveCompletion: { [weak self] completion in
isComplete = true
if let theSub = sub {
self?.subs.remove(theSub)
}
},
receiveValue: { response in ... }
}
if !isComplete {
subs.insert(sub!)
}
combine publishers have an instance method called prefix which does this:
func prefix(_ maxLength: Int) -> Publishers.Output<Self>
https://developer.apple.com/documentation/combine/publisher/prefix(_:)
playground example