With Combine, how to deallocate the Subscription after a network request - swift

If you use Combine for network requests with URLSession, then you need to save the Subscription (aka, the AnyCancellable) - otherwise it gets immediately deallocated, which cancels the network request. Later, when the network response has been processed, you want to deallocate the subscription, because keeping it around would be a waste of memory.
Below is some code that does this. It's kind of awkward, and it may not even be correct. I can imagine a race condition where the network request could start and complete on another thread before sub is set to the non-nil value.
Is there a nicer way to do this?
class SomeThing {
    var subs = Set<AnyCancellable>()
    func sendNetworkRequest() {
        var request: URLRequest = ...
        var sub: AnyCancellable? = nil
        sub = URLSession.shared.dataTaskPublisher(for: request)
            .map(\.data)
            .decode(type: MyResponse.self, decoder: JSONDecoder())
            .sink(
                receiveCompletion: { completion in
                    self.subs.remove(sub!)
                },
                receiveValue: { response in ... }
            )
        subs.insert(sub!)
    }
}

I call this situation a one-shot subscriber. The idea is that, because a data task publisher publishes only once, you know for a fact that it is safe to destroy the pipeline after you receive your single value and/or completion (error).
Here's a technique I like to use. First, here's the head of the pipeline:
let url = URL(string:"https://www.apeth.com/pep/manny.jpg")!
let pub : AnyPublisher<UIImage?,Never> =
URLSession.shared.dataTaskPublisher(for: url)
.map {$0.data}
.replaceError(with: Data())
.compactMap { UIImage(data:$0) }
.receive(on: DispatchQueue.main)
.eraseToAnyPublisher()
Now comes the interesting part. Watch closely:
var cancellable: AnyCancellable? // 1
cancellable = pub.sink(receiveCompletion: {_ in // 2
cancellable?.cancel() // 3
}) { image in
self.imageView.image = image
}
Do you see what I did there? Perhaps not, so I'll explain it:
1. First, I declare a local AnyCancellable variable; for reasons having to do with the rules of Swift syntax, this needs to be an Optional.
2. Then, I create my subscriber and set my AnyCancellable variable to that subscriber. Again, for reasons having to do with the rules of Swift syntax, my subscriber needs to be a Sink.
3. Finally, in the subscriber itself, I cancel the AnyCancellable when I receive the completion.
The cancellation in the third step actually does two things quite apart from calling cancel() — things having to do with memory management:
By referring to cancellable inside the asynchronous completion function of the Sink, I keep cancellable and the whole pipeline alive long enough for a value to arrive from the subscriber.
By cancelling cancellable, I permit the pipeline to go out of existence and prevent a retain cycle that would cause the surrounding view controller to leak.

"Below is some code that does this. It's kind of awkward, and it may not even be correct. I can imagine a race condition where the network request could start and complete on another thread before sub is set to the non-nil value."
Danger! Swift.Set is not thread safe. If you want to access a Set from two different threads, it is up to you to serialize the accesses so they don't overlap.
What is possible in general (although not perhaps with URLSession.DataTaskPublisher) is that a publisher emits its signals synchronously, before the sink operator even returns. This is how Just, Result.Publisher, Publishers.Sequence, and others behave. So those produce the problem you're describing, without involving thread safety.
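A minimal sketch of that synchronous case, assuming a playground or command-line target where Combine is available. The function name and the value 42 are illustrative and not from the question. It records whether the local variable was still nil when the completion closure ran.

```swift
import Combine

// Hypothetical helper: reports whether the completion closure ran
// before `sub` had been assigned.
func completionRanBeforeAssignment() -> Bool {
    var sub: AnyCancellable? = nil
    var subWasNil = false
    sub = Just(42).sink(
        receiveCompletion: { _ in
            // Just delivers its value and completion synchronously,
            // inside the call to sink, so `sub` has not been assigned yet.
            subWasNil = (sub == nil)
        },
        receiveValue: { _ in }
    )
    _ = sub // keeps the compiler from flagging `sub` as unused
    return subWasNil
}

print(completionRanBeforeAssignment())
```

With a publisher like this, force-unwrapping sub inside the completion closure would crash, which is why the code further down checks it with if let.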
Now, how to solve the problem? If you don't think you want to actually be able to cancel the subscription, then you can avoid creating an AnyCancellable at all by using Subscribers.Sink instead of the sink operator:
URLSession.shared.dataTaskPublisher(for: request)
.map(\.data)
.decode(type: MyResponse.self, decoder: JSONDecoder())
.subscribe(Subscribers.Sink(
receiveCompletion: { completion in ... },
receiveValue: { response in ... }
))
Combine will clean up the subscription and the subscriber after the subscription completes (with either .finished or .failure).
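Here is a small sketch of that behaviour, using a PassthroughSubject as a stand-in for the network publisher (the subject and the variable names are assumptions made for illustration). Nothing is stored, yet the values still arrive.

```swift
import Combine

let subject = PassthroughSubject<Int, Never>()
var received: [Int] = []
var didFinish = false

// subscribe(_:) returns Void, so there is no AnyCancellable to keep.
subject.subscribe(Subscribers.Sink<Int, Never>(
    receiveCompletion: { _ in didFinish = true },
    receiveValue: { received.append($0) }
))

subject.send(1)
subject.send(2)
subject.send(completion: .finished)

print(received, didFinish)
```

The trade-off is that nothing outside the pipeline can cancel it; it only ends when the publisher completes.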
But what if you do want to be able to cancel the subscription? Maybe sometimes your SomeThing gets destroyed before the subscription is complete, and you don't need the subscription to complete in that case. Then you do want to create an AnyCancellable and store it in an instance property, so that it gets cancelled when SomeThing is destroyed.
In that case, set a flag indicating that the sink won the race, and check the flag before storing the AnyCancellable.
var sub: AnyCancellable? = nil
var isComplete = false
sub = URLSession.shared.dataTaskPublisher(for: request)
.map(\.data)
.decode(type: MyResponse.self, decoder: JSONDecoder())
// This ensures thread safety, if the subscription is also created
// on DispatchQueue.main.
.receive(on: DispatchQueue.main)
.sink(
receiveCompletion: { [weak self] completion in
isComplete = true
if let theSub = sub {
self?.subs.remove(theSub)
}
},
receiveValue: { response in ... }
)
if !isComplete {
subs.insert(sub!)
}
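The same flag technique can be sketched in a self-contained form that works with any publisher. OneShotStore and run(_:onValue:) are hypothetical names introduced here, and the sketch assumes everything runs on a single thread (for example, the main queue), as the comment in the code above requires.

```swift
import Combine

// Hypothetical container; stands in for SomeThing from the question.
final class OneShotStore {
    private(set) var subs = Set<AnyCancellable>()

    func run<P: Publisher>(_ publisher: P, onValue: @escaping (P.Output) -> Void) {
        var sub: AnyCancellable? = nil
        var isComplete = false
        sub = publisher.sink(
            receiveCompletion: { [weak self] _ in
                isComplete = true
                // `sub` is still nil if the publisher completed synchronously.
                if let theSub = sub {
                    self?.subs.remove(theSub)
                }
            },
            receiveValue: onValue
        )
        // Store the subscription only if it has not already finished.
        if !isComplete, let theSub = sub {
            subs.insert(theSub)
        }
    }
}

let store = OneShotStore()

// Synchronous publisher: completes inside sink, so nothing is stored.
store.run(Just(1)) { _ in }
let countAfterJust = store.subs.count

// Asynchronous-style publisher: stored until it completes.
let subject = PassthroughSubject<Int, Never>()
store.run(subject) { _ in }
let countWhileRunning = store.subs.count
subject.send(completion: .finished)
let countAfterCompletion = store.subs.count

print(countAfterJust, countWhileRunning, countAfterCompletion)
```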

Combine publishers have an instance method called prefix(_:) which does this:
func prefix(_ maxLength: Int) -> Publishers.Output<Self>
https://developer.apple.com/documentation/combine/publisher/prefix(_:)
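As a rough illustration (the range and variable names are made up for this sketch), prefix(2) lets two values through and then finishes the pipeline, even though the upstream sequence has more to give:

```swift
import Combine

var values: [Int] = []
var didFinish = false

let cancellable = (1...5).publisher
    .prefix(2) // republish at most two values, then complete
    .sink(
        receiveCompletion: { _ in didFinish = true },
        receiveValue: { values.append($0) }
    )

print(values, didFinish)
```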

Related

Executing 2 parallel network requests using Swift Combine

I am trying to load data from two different endpoints using two different publishers which have different return types. I need to update the UI when both requests complete, but both requests can also fail so Zip doesn't do the trick. Usually I would use a DispatchGroup to accomplish this, but I have not figured out how to do that using Combine. Is there a way to use DispatchGroup with Combine?
let dispatchGroup: DispatchGroup = .init()
let networkQueue: DispatchQueue = .init(label: "network", qos: .userInitiated)
dispatchGroup.notify(queue: .main) { print("work all done!") }
publisher
    .receive(on: networkQueue, options: .init(group: dispatchGroup))
    .sink(receiveCompletion: { ... }, receiveValue: { ... })
    .store(in: &cancellables)
publisher2
    .receive(on: networkQueue, options: .init(group: dispatchGroup))
    .sink(receiveCompletion: { ... }, receiveValue: { ... })
    .store(in: &cancellables)
The notify is immediately executed. Is this not the right way of doing this?
You'll want to use the Publishers.CombineLatest which will take the two publishers and create a new publisher, with the result of the latest value from both streams:
Publishers.CombineLatest(publisher, publisher2)
// Receive values on the main queue (you decide whether you want to do this)
.receive(on: DispatchQueue.main)
.sink(receiveCompletion: { completion in
// Handle error / completion
// If either stream produces an error, the error will be forwarded in here
}, receiveValue: { value1, value2 in
// value1 will be the value of publisher's Output type
// value2 will be the value of publisher2's Output type
})
// You only need to store this subscription - not publisher and publisher2 individually
.store(in: &cancellables)
The Publishers.CombineLatest publisher can be seen as the equivalent of using a DispatchGroup where you call dispatchGroup.enter() for each network operation you initiate. One key difference, however, is that CombineLatest will produce more than one value if either of its upstream publishers produces more than one value. For normal network operations, you don't need to worry about this. But if you only need the first, or the first N, values produced by the combined publisher, you can use the prefix(_:) operator, which ensures that you never receive more than N values.
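A small sketch of both behaviours, using two PassthroughSubjects in place of the network publishers (the subjects, their types, and the string formatting are assumptions for illustration only):

```swift
import Combine

let numbers = PassthroughSubject<Int, Never>()
let letters = PassthroughSubject<String, Never>()
var allPairs: [String] = []
var firstPairOnly: [String] = []
var cancellables = Set<AnyCancellable>()

// Every time either side emits, a pair with the latest values is produced.
Publishers.CombineLatest(numbers, letters)
    .sink { pair in allPairs.append("\(pair.0)-\(pair.1)") }
    .store(in: &cancellables)

// prefix(1) stops the pipeline after the first combined pair.
Publishers.CombineLatest(numbers, letters)
    .prefix(1)
    .sink { pair in firstPairOnly.append("\(pair.0)-\(pair.1)") }
    .store(in: &cancellables)

numbers.send(1)   // no output yet: `letters` has not produced a value
letters.send("a") // both sides have a value now
numbers.send(2)   // a new pair, reusing the latest value of `letters`

print(allPairs, firstPairOnly)
```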

How to apply back pressure with Combine buffer operator to avoid flatMap to ask an infinite demand upstream?

I'm trying to use Combine to make several million concurrent requests over the network. Here is a mock-up of the naive approach I'm using:
import Foundation
import Combine
let cancellable = (0..<1_000_000).publisher
.map(some_preprocessing)
.flatMap(maxPublishers: .max(32)) { request in
URLSession.shared.dataTaskPublisher(for: request)
.map(\.data)
.catch { _ in
return Just(Data())
}
}
.sink { completion in
print(completion)
} receiveValue: { value in
print(value)
}
// Required in a command line tool
sleep(100)
This pipeline first creates a request, then the request is performed inside flatMap to confine errors. flatMap also merges several requests so they are effectively performed concurrently, which is great.
The issue is that it would literally make 1,000,000 requests concurrently, so I added the maxPublishers parameter, which limits the number of publishers that are subscribed at the same time in flatMap. This kind of works: only 32 publishers are active at the same time, but unfortunately some_preprocessing is still performed 1,000,000 times before flatMap executes.
I expected flatMap(maxPublishers: .max(32)) to apply some back pressure, i.e. only requesting items from the upstream publisher map when maxPublishers < 32. This does not seem to be the case, and it fills up the RAM rapidly and delays the processing.
I then tried to use the buffer operator, which is meant to introduce back pressure between a producer and a consumer, but Apple's documentation is so sparse that I don't understand how it works (more specifically, the prefetch strategy argument).
So I tried different combinations such as:
import Foundation
import Combine
let cancellable = (0..<1_000_000).publisher
.map(some_preprocessing)
.buffer(size: 32, prefetch: .byRequest, whenFull: .dropNewest)
.flatMap(maxPublishers: .max(32)) { request in
URLSession.shared.dataTaskPublisher(for: request)
.map(\.data)
.catch { _ in
return Just(Data())
}
}
.sink { completion in
print(completion)
} receiveValue: { value in
print(value)
}
// Required in a command line tool
sleep(100)
This does not seem to do anything useful though; flatMap still requests as many elements as it can.
How do I properly apply back pressure in this case? That is, I need the upstream map publisher to "wait" for demand from the downstream flatMap publisher, which should only ask for items when it has an empty slot.
The issue appears to be a Combine bug, as pointed out here. Using Publishers.Sequence causes the following operator to accumulate every value sent downstream before proceeding.
A workaround is to type-erase the sequence publisher:
import Foundation
import Combine
let cancellable = (0..<1_000_000).publisher
.eraseToAnyPublisher() // <----
.map(some_preprocessing)
.flatMap(maxPublishers: .max(32)) { request in
URLSession.shared.dataTaskPublisher(for: request)
.map(\.data)
.catch { _ in
return Just(Data())
}
}
.sink { completion in
print(completion)
} receiveValue: { value in
print(value)
}
// Required in a command line tool without running loop
sleep(.max)
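To make "waiting for demand" concrete, here is a minimal sketch that counts how often the preprocessing step runs when the downstream subscriber asks for only two values. It uses the type-erased form from the workaround; the counter, the range size, and the AnySubscriber are assumptions for illustration and no network request is made.

```swift
import Combine

var preprocessed = 0

// A subscriber that asks for exactly two values and never asks for more.
let twoValueSubscriber = AnySubscriber<Int, Never>(
    receiveSubscription: { subscription in subscription.request(.max(2)) },
    receiveValue: { _ in Subscribers.Demand.none },
    receiveCompletion: { _ in }
)

(0..<1_000).publisher
    .eraseToAnyPublisher() // same workaround as above
    .map { value -> Int in
        preprocessed += 1 // stands in for some_preprocessing
        return value
    }
    .subscribe(twoValueSubscriber)

print(preprocessed)
```

Under these assumptions the map closure runs only for the values that were actually requested, which is the behaviour the question expected from flatMap(maxPublishers:).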

In a Combine Publisher chain, how to keep inner objects alive until cancel or complete?

I've created a Combine publisher chain that looks something like this:
let pub = getSomeAsyncData()
.mapError { ... }
.map { ... }
...
.flatMap { data in
let wsi = WebSocketInteraction(data, ...)
return wsi.subject
}
.share().eraseToAnyPublisher()
It's a flow of different possible network requests and data transformations. The calling code wants to subscribe to pub to find out when the whole asynchronous process has succeeded or failed.
I'm confused about the design of the flatMap step with the WebSocketInteraction. That's a helper class that I wrote. I don't think its internal details are important, but its purpose is to provide its subject property (a PassthroughSubject) as the next Publisher in the chain. Internally the WebSocketInteraction uses URLSessionWebSocketTask, talks to a server, and publishes to the subject. I like flatMap, but how do you keep this piece alive for the lifetime of the Publisher chain?
If I store it in the outer object (no problem), then I need to clean it up. I could do that when the subject completes, but if the caller cancels the entire publisher chain then I won't receive a completion event. Do I need to use Publisher.handleEvents and listen for cancellation as well? This seems a bit ugly. But maybe there is no other way...
.flatMap { data in
let wsi = WebSocketInteraction(data, ...)
self.currentWsi = wsi // store in containing object to keep it alive.
wsi.subject.sink(receiveCompletion: { self.currentWsi = nil })
wsi.subject.handleEvents(receiveCancel: {
wsi.closeWebSocket()
self.currentWsi = nil
})
Anyone have any good "design patterns" here?
One design I've considered is making my own Publisher. For example, instead of having WebSocketInteraction vend a PassthroughSubject, it could conform to Publisher. I may end up going this way, but making a custom Combine Publisher is more work, and the documentation steers people toward using a subject instead. To make a custom Publisher you have to implement some of the things that PassthroughSubject does for you, like responding to demand and cancellation, and keeping state to ensure you complete at most once and don't send events after that.
[Edit: to clarify that WebSocketInteraction is my own class.]
It's not exactly clear what problems you are facing with keeping an inner object alive. The object should be alive so long as something has a strong reference to it.
It's either an external object that will start some async process, or an internal closure that keeps a strong reference to self via self.subject.send(...).
class WebSocketInteraction {
    private let subject = PassthroughSubject<String, Error>()
    private var isCancelled: Bool = false
    init() {
        // start some async work
        DispatchQueue.main.asyncAfter(deadline: .now() + 1) {
            if !self.isCancelled { self.subject.send("Done") } // <-- ref
        }
    }
    // return a publisher that marks the operation as cancelled when the subscription is cancelled
    var pub: AnyPublisher<String, Error> {
        subject
            .handleEvents(receiveCancel: {
                print("cancel handler")
                self.isCancelled = true // <-- ref
            })
            .eraseToAnyPublisher()
    }
}
You should be able to use it with flatMap as you wanted, since the publisher returned by the pub property and the inner closure both hold a reference to self:
let pub = getSomeAsyncData()
...
.flatMap { data in
let wsi = WebSocketInteraction(data, ...)
return wsi.pub
}
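The lifetime claim can be checked with a small sketch. Interaction and LiveCounter are hypothetical stand-ins (not the asker's WebSocketInteraction), and the counter exists only to observe init and deinit.

```swift
import Combine

// Hypothetical helper that tracks how many Interaction objects are alive.
final class LiveCounter { var live = 0 }

// Hypothetical stand-in for WebSocketInteraction.
final class Interaction {
    private let subject = PassthroughSubject<String, Never>()
    private let counter: LiveCounter

    init(counter: LiveCounter) {
        self.counter = counter
        counter.live += 1
    }
    deinit { counter.live -= 1 }

    var pub: AnyPublisher<String, Never> {
        subject
            .handleEvents(receiveCancel: { self.cleanUp() }) // strong capture of self
            .eraseToAnyPublisher()
    }

    func cleanUp() { /* close sockets, release resources, ... */ }
}

let counter = LiveCounter()

func makeSubscription() -> AnyCancellable {
    let interaction = Interaction(counter: counter)
    return interaction.pub.sink { _ in }
    // The local `interaction` goes out of scope here.
}

let subscription = makeSubscription()
let liveWhileSubscribed = counter.live
print(liveWhileSubscribed)
```

The object outlives the function that created it because the subscription keeps the receiveCancel closure, and the closure keeps self. It should become eligible for release once the subscription is cancelled or completes, though this sketch does not verify that part.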

Swift Combine publishers vs completion handler and when to cancel

I know in general a publisher is more powerful than a closure, however I want to ask and discuss a specific example:
func getNotificationSettingsPublisher() -> AnyPublisher<UNNotificationSettings, Never> {
let notificationSettingsFuture = Future<UNNotificationSettings, Never> { (promise) in
UNUserNotificationCenter.current().getNotificationSettings { (settings) in
promise(.success(settings))
}
}
return notificationSettingsFuture.eraseToAnyPublisher()
}
I think this is a valid example of a Future publisher and it could be used here instead of using a completion handler. Let's do something with it:
func test() {
getNotificationSettingsPublisher().sink { (notificationSettings) in
// Do something here
}
}
This works, however it will tell me that the result of sink (AnyCancellable) is unused. So whenever I try to get a value, I need to either store the cancellable or assign it until I get a value.
Is there something like sinkOnce, or some way to auto-destroy cancellables? Sometimes I don't need tasks to be cancelled. I could however do this:
func test() {
self.cancellable = getNotificationSettingsPublisher().sink { [weak self] (notificationSettings) in
self?.cancellable?.cancel()
self?.cancellable = nil
}
}
So once I receive a value, I cancel the subscription. (I could do the same in the completion closure of sink I guess).
What's the correct way of doing so? Because if I use a closure, it will be called as many times as the function is called, and if it is called only once, then I don't need to cancel anything.
Would you say normal completion handlers could be replaced by Combine and if so, how would you handle receiving one value and then cancelling?
Last but not least: once the completion is called, do I still need to cancel the subscription? I at least need to update the cancellable and set it to nil, right? I assume storing subscriptions in a set is for long-running subscriptions, but what about single-value subscriptions?
Thanks
Instead of using the .sink operator, you can use the Sink subscriber directly. That way you don't receive an AnyCancellable that you need to save. When the publisher completes the subscription, Combine cleans everything up.
func test() {
getNotificationSettingsPublisher()
.subscribe(Subscribers.Sink(
receiveCompletion: { _ in },
receiveValue: ({
print("value: \($0)")
})
))
}
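Building on that, one possible shape for the sinkOnce the question asks about is sketched below. sinkOnce is a hypothetical helper, not part of Combine; it combines first() with a Subscribers.Sink so that the pipeline ends after one value and nothing has to be stored.

```swift
import Combine

extension Publisher where Failure == Never {
    // Hypothetical helper, not a Combine API: receive one value, then finish.
    func sinkOnce(_ receiveValue: @escaping (Output) -> Void) {
        first().subscribe(Subscribers.Sink<Output, Never>(
            receiveCompletion: { _ in },
            receiveValue: receiveValue
        ))
    }
}

let subject = PassthroughSubject<Int, Never>()
var received: [Int] = []

subject.sinkOnce { received.append($0) }
subject.send(1) // delivered; first() then completes the pipeline
subject.send(2) // ignored

print(received)
```

As with any Subscribers.Sink used this way, the subscription cannot be cancelled from outside, so this only suits work you are happy to let run to completion.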

URLSession.shared.dataTaskPublisher not working on iOS 13.3

When trying to make a network request, I'm getting an error
finished with error [-999] Error Domain=NSURLErrorDomain Code=-999 "cancelled"
If I use URLSession.shared.dataTask instead of URLSession.shared.dataTaskPublisher, it works on iOS 13.3.
Here is my code :
return URLSession.shared.dataTaskPublisher(for : request).map{ a in
return a.data
}
.decode(type: MyResponse.self, decoder: JSONDecoder())
.receive(on: DispatchQueue.main)
.eraseToAnyPublisher()
This code worked on iOS 13.2.3.
You have 2 problems here:
1. As @matt said, your publisher isn't living long enough. You can either store the AnyCancellable in an instance variable or, as I like to do (and it appears to be a redux best practice), use store(in:) with a Set<AnyCancellable> to keep it around and have it automatically cleaned up when the object is deallocated.
2. In order to kick off the actual network request you need to sink or assign the value.
So, putting these together:
var cancellableSet: Set<AnyCancellable> = []
func getMyResponse() {
URLSession.shared.dataTaskPublisher(for : request).map{ a in
return a.data
}
.decode(type: MyResponse.self, decoder: JSONDecoder())
.receive(on: DispatchQueue.main)
.replaceError(with: MyResponse())
.sink { myResponse in print(myResponse) }
.store(in: &cancellableSet)
}
You have not shown enough code, but based on the symptom it is clear what the problem is: your publisher / subscriber objects are not living long enough. I would venture to say that your code was always wrong and it was just a quirk that it seemed to succeed. Make sure that your publisher and especially your subscriber are retained in long-lived objects, such as instance properties, so that the network communication has time to take place.
Here's a working example of how to use a data task publisher:
class ViewController: UIViewController {
let url = URL(string:"https://apeth.com/pep/manny.jpg")!
lazy var pub = URLSession.shared.dataTaskPublisher(for: url)
.compactMap {UIImage(data: $0.data)}
.receive(on: DispatchQueue.main)
var sub : AnyCancellable?
override func viewDidLoad() {
super.viewDidLoad()
let sub = pub.sink(receiveCompletion: {_ in}, receiveValue: {print($0)})
self.sub = sub
}
}
That prints <UIImage:0x6000008ba490 anonymous {180, 206}>, which is correct (as you can see by going to that URL yourself).
The point I'm making is that if you don't say self.sub = sub, you get exactly the error you are reporting: the subscriber sub, which is merely a local, goes out of existence immediately and the network transaction is prematurely cancelled (with the error you reported).
EDIT I think that code was written before the .store(in:) method existed; if I were writing it today, I'd use that instead of a sub property. But the principle is the same.
I needed to move my cancellable set "above" the scope of the function where my subscriber was executing. This worked fine in iOS 13.2 when the cancellable set had the same scope as the function of the subscriber, but stopped working in 13.3. The dataTaskPublisher cancels with the error cited above. It makes sense that the cancellable set should "outlive" the subscriber. Developer error. Lesson learned.
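The lifetime rule described in these answers can be reproduced without a network call. In this sketch a PassthroughSubject stands in for the data task publisher, and the function and variable names are made up for illustration.

```swift
import Combine

let subject = PassthroughSubject<Int, Never>()
var received: [Int] = []
var cancellables = Set<AnyCancellable>()

func subscribeWithoutStoring() {
    // The AnyCancellable is discarded, so it is deallocated right away,
    // which cancels the subscription.
    _ = subject.sink { received.append($0) }
}

func subscribeAndStore() {
    // The set outlives this function, so the subscription stays alive.
    subject.sink { received.append($0) }
        .store(in: &cancellables)
}

subscribeWithoutStoring()
subject.send(1) // nobody is listening any more

subscribeAndStore()
subject.send(2) // delivered

print(received)
```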