F# async: parent/child cancellation? - apache-kafka

So here we go: given a Confluent.Kafka IConsumer<>, this function wraps it in a dedicated async CE and consumes for as long as cancellation hasn't been requested. The code also defends itself against OperationCanceledException and runs a finally block to ensure graceful termination of the consumer.
let private consumeUntilCancelled callback (consumer: IConsumer<'key, 'value>) =
    async {
        let! ct = Async.CancellationToken
        try
            try
                while not ct.IsCancellationRequested do
                    let consumeResult = consumer.Consume(ct)
                    if not consumeResult.IsPartitionEOF then do! (callback consumeResult)
            with
            | :? OperationCanceledException -> return ()
        finally
            consumer.Close()
            consumer.Dispose()
    }
Question #1: is this code correct or am I abusing the async?
So far so good. In my app I have to deal with lots of consumers that must all die together. So, assuming that consumers: seq<Async<unit>> represents them, the following code is what I came up with:
async {
    for consumer in consumers do
        do! (Async.StartChild consumer |> Async.Ignore)
}
I expect this code to chain the children to the parent's cancellation context, so that once the parent is cancelled, the children are cancelled as well.
Question #2: is my finally block guaranteed to run even though the child got cancelled?

I have a few observations about your code:
Your use of Async.StartChild is correct - all child computations will inherit the same cancellation token and they will all get cancelled when the main token is cancelled.
The async workflow can be cancelled after you call consumer.Consume(ct) and before you call callback. I'm not sure what this means for your specific problem, but if it removes some data from a queue, the data could be lost before it is processed. If that's an issue, then I think you'll need to make callback non-asynchronous, or invoke it differently.
In your consumeUntilCancelled function, you do not need to check ct.IsCancellationRequested explicitly in the while condition. The async workflow checks for cancellation automatically at every do! or let!, so you can replace this with just a while true loop.
Here is a minimal stand-alone demo:
let consume s = async {
    try
        while true do
            do! Async.Sleep 1000
            printfn "%s did work" s
    finally
        printfn "%s finalized" s }
let work =
    async {
        for c in ["A"; "B"; "C"; "D"] do
            do! Async.StartChild (consume c) |> Async.Ignore }
Now we create the computation with a cancellation token:
// Run this in F# interactive
let ct = new System.Threading.CancellationTokenSource()
Async.Start(work, ct.Token)
// Run this sometime later
ct.Cancel()
Once you call ct.Cancel, all the finally blocks will be called and all the loops will stop.
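For readers more familiar with other runtimes, the same parent/child behaviour can be sketched with Python's asyncio. This is only an analogy, not the F# mechanism: asyncio.gather stands in for the chain of Async.StartChild calls, cancelling the parent task plays the role of ct.Cancel(), and the log list is a hypothetical helper added purely so the result can be inspected.

```python
import asyncio

log = []  # hypothetical helper: collects events so they can be inspected


async def consume(name: str) -> None:
    try:
        while True:
            # Cancellation is delivered at an await point, much like do!/let!.
            await asyncio.sleep(0.01)
            log.append(f"{name} did work")
    finally:
        # Runs even when the task is cancelled.
        log.append(f"{name} finalized")


async def work() -> None:
    # gather() cancels every unfinished child when the awaiting parent is cancelled.
    await asyncio.gather(*(consume(c) for c in "ABCD"))


async def main() -> None:
    parent = asyncio.create_task(work())
    await asyncio.sleep(0.05)  # let the children run for a while
    parent.cancel()            # plays the role of ct.Cancel()
    try:
        await parent
    except asyncio.CancelledError:
        pass


asyncio.run(main())
```

After the run, log should contain one "finalized" entry per child, which mirrors the guarantee that the finally blocks run on cancellation.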

Related

How to add a timeout to an awaiting function call

What's the best way to add a timeout to an awaiting function?
Example:
/// let's pretend this is in a library that I'm using and I can't mess with the guts of this thing
func fetchSomething() async -> Thing? {
    // fetches something
}
// if fetchSomething() never returns, then doSomethingElse() never runs. Is there any way to add a timeout to this system?
let thing = await fetchSomething()
doSomethingElse()
I want to make the system more robust in the case that fetchSomething() never returns. If this were using Combine, I'd use the timeout operator.
One can create a Task, and then cancel it if it has not finished in a certain period of time. E.g., launch two tasks in parallel:
// cancel the fetch after 2 seconds
func fetchSomethingWithTimeout() async throws -> Thing {
    let fetchTask = Task {
        try await fetchSomething()
    }
    let timeoutTask = Task {
        try await Task.sleep(nanoseconds: 2 * NSEC_PER_SEC)
        fetchTask.cancel()
    }
    let result = try await fetchTask.value
    timeoutTask.cancel()
    return result
}
// here is a random mockup that will take between 1 and 3 seconds to finish
func fetchSomething() async throws -> Thing {
    let duration: TimeInterval = .random(in: 1...3)
    try await Task.sleep(nanoseconds: UInt64(TimeInterval(NSEC_PER_SEC) * duration))
    return Thing()
}
If the fetchTask finishes first, it will reach the timeoutTask.cancel and stop it. If timeoutTask finishes first, it will cancel the fetchTask.
Obviously, this rests upon the implementation of fetchSomething. It should not only detect the cancelation, but also throw an error (likely a CancellationError) if it was canceled. We cannot comment further without details regarding the implementation of fetchSomething.
For example, in the above example, rather than returning an optional Thing?, I would instead return Thing, but have it throw an error if it was canceled.
I hesitate to mention it, but while the above assumes that fetchSomething was well-behaved (i.e., cancelable), there are permutations on the pattern that work even if it does not (i.e., run doSomethingElse in some reasonable timetable even if fetchSomething “never returns”).
But this is an inherently unstable situation, as the resources used by fetchSomething cannot be recovered until it finishes. Swift does not offer preemptive cancelation, so while we can easily solve the tactical issue of making sure that doSomethingElse eventually runs, if fetchSomething might never finish in some reasonable timetable, you have a deeper problem.
You really should find a rendition of fetchSomething that is cancelable, if it is not already.
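The two-task pattern described above is not specific to Swift. A minimal sketch of the same idea in Python's asyncio follows, assuming a cancellable fetch; fetch_something and its duration parameter are hypothetical stand-ins for the library call. As in Swift, cancellation here is cooperative: it only works because the fetch reaches an await point where the cancellation can be delivered.

```python
import asyncio


async def fetch_something(duration: float) -> str:
    # Hypothetical stand-in for the library call. It is cancellable because
    # asyncio.sleep raises CancelledError when the task is cancelled.
    await asyncio.sleep(duration)
    return "thing"


async def fetch_with_timeout(duration: float, timeout: float) -> str:
    fetch_task = asyncio.create_task(fetch_something(duration))

    async def cancel_after_timeout() -> None:
        await asyncio.sleep(timeout)
        fetch_task.cancel()  # the timeout elapsed first: cancel the fetch

    timeout_task = asyncio.create_task(cancel_after_timeout())
    try:
        return await fetch_task
    finally:
        timeout_task.cancel()  # the fetch finished or failed: stop the timer


async def main() -> list:
    results = [await fetch_with_timeout(duration=0.01, timeout=0.5)]
    try:
        await fetch_with_timeout(duration=0.5, timeout=0.01)
    except asyncio.CancelledError:
        results.append("timed out")
    return results


results = asyncio.run(main())
```

In practice asyncio.wait_for(awaitable, timeout) packages this pattern; the manual version is shown only because it maps one-to-one onto fetchTask and timeoutTask.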
You can use try and catch. Wait for the fetched data inside the try block. When it fails, the catch block can run a different statement. Something like this:
await getResource()
try {
    await fetchData();
} catch (err) {
    doSomethingElse();
}
// program continues

How to cancel a tokio tcp connecting gracefully?

When we connect to a remote host via tcp, it can be a time-consuming operation. And while waiting for a connection, the user may cancel the operation at any time.
When connecting with async tokio, TcpStream::connect() returns a future that resolves to io::Result<TcpStream>; assume it is called tcps_ft.
There are two parts: one is the normal logic of the program, which should call .await on tcps_ft above, and the other is the UI of the program, where the user wants to call drop(tcps_ft) when they click the cancel button. But this seems impossible, because both calls consume tcps_ft.
#[tokio::test]
async fn test_cancel_by_drop() {
    let addr = "192.168.1.100:8080";
    let tcps_ft = TcpStream::connect(addr);
    let mut tcps = tcps_ft.await.unwrap();
    // simulate user's operation.
    let cancel_jh = tokio::spawn(async move {
        tokio::time::sleep(Duration::from_millis(100)).await;
        drop(tcps_ft); // can not compile:: tcps_ft moved when await
    });
    // simulate user's program
    tcps.shutdown().await;
    cancel_jh.await;
}
So I considered using a task instead; after all, JoinHandle::abort() does not consume the atjh: JoinHandle object corresponding to the task. But I still can't call atjh.await before abort() returns synchronously, and in any case, await will consume the variable, making it impossible to call abort() asynchronously. (In other words, the call to abort() must be executed synchronously before await.)
#[tokio::test]
async fn test_cancel_by_abort() {
    let addr = "192.168.1.100:8080";
    let atjh = tokio::spawn(async move { TcpStream::connect(addr).await.unwrap() });
    // simulate user's operation.
    let cancel_jh = tokio::spawn(async {
        tokio::time::sleep(Duration::from_millis(100)).await;
        &atjh.abort();
    });
    // simulate user's program
    let mut tcps = atjh.await.unwrap(); // can not compile:: atjh moved when await
    tcps.shutdown().await;
    cancel_jh.await;
}
Of course, one less direct way is to use callback functions. In my asynchronous connection task, when connect().await returns, the user's callback function is called to notify the user to call atjh.await.
But here the callback function is introduced again, and I know await/async itself is designed to solve the callback hell problem.
Further, for user-supplied asynchronous callback functions, the compiler may impose very many requirements, such as implementing Send, avoiding cross-thread safety issues, etc. This is certainly not something that async would like to encounter.
How can I do it asynchronously and gracefully to cancel this asynchronous connection process? Is there a suggested model to handle it?

When exactly do we use async-await and then?

I am very confused about this. I request you to clarify the concept.
Consider the following scenarios:
Case 1:
int number = 0;
void calculate() {
  number = number + 2;
  print(number);
}
I know this works just fine. "2" will be printed on the terminal.
But why shouldn't I use async-await here, like this:
int number = 0;
void calculate() async {
  void addition() async {
    number = number + 2;
  }
  await addition();
  print(number);
}
This seems logical to me, since print(number) should wait for number = number + 2 to finish. Why isn't this necessary? How does Dart know which operation to execute first?
How is it ensured that print(number) isn't executed before number = number + 2 and "0" is printed on the terminal?
Does the sequence in which we write these operations in the function matter?
Case 2:
Consider the case where I am interacting with SQFLite database and values fetched depend on each other.
Note: number1, number2, number3 will still have values before the following function is called.
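void getValues() async {
  // The helpers must be async and return a Future so that they can use
  // await internally and be chained with then().
  Future<void> calculate1() async {
    number1 = await db.getNumber1(10);
  }
  Future<void> calculate2() async {
    number2 = await db.getNumber2(number1);
  }
  await calculate1().then((_) async {
    await calculate2().then((_) async {
      number3 = await db.getNumber3(number2);
    });
  });
}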
I have a lot of these types of functions in my app and I am doing this everywhere.
I am kind of paranoid: if old values of number1 and number2 are passed as parameters to getNumber2() and getNumber3() respectively, then I'll be doomed.
async/await are just syntax sugar for the underlying Future framework. 95% of the time, they will suffice, and are preferred by the style guide.
One exception is when you have multiple futures that you want to run concurrently and wait on until all of them are complete. In that case, you'll need Future.wait([future1, future2, future3]), which cannot be expressed using await alone.
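The distinction between chained awaits and waiting on several futures at once exists in most async runtimes. As a rough illustration in Python's asyncio (an analogy, not Dart), asyncio.gather plays the role of Future.wait, and get_number is a hypothetical stand-in for the database calls in the question.

```python
import asyncio


async def get_number(seed: int) -> int:
    # Hypothetical stand-in for a database call such as db.getNumber1(...).
    await asyncio.sleep(0.01)
    return seed + 1


async def dependent() -> int:
    # Each await completes before the next line runs, so every call
    # receives the fresh value produced by the previous one.
    number1 = await get_number(10)
    number2 = await get_number(number1)
    number3 = await get_number(number2)
    return number3


async def independent() -> list:
    # Independent calls can run concurrently; gather waits for all of them.
    return await asyncio.gather(get_number(1), get_number(2), get_number(3))


chained = asyncio.run(dependent())
parallel = asyncio.run(independent())
```

The dependent version needs no then() nesting: plain sequential awaits already guarantee that stale values are never passed along.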
Dart executes statements in order, line by line. So when the function is called, the calculation is done first and then the result is printed, so you will always get 2 printed.
You can think of it as one main thread, the UI thread. Any operations you write on this thread are performed line by line; each line finishes executing completely before the next one starts.
Now suppose you have something that you know will take time to compute or to complete with either a result or an error. If you write this synchronously on the main UI thread, you stop the UI of the app, which in turn can make the app appear to crash (Application Not Responding error): the operating system thinks the app has frozen, when in fact the UI is just waiting for the long-running computation to finish.
To overcome this we use asynchronous methods for time-consuming work, such as getting data from a database, which will return a value or an error in the "future". The main UI thread doesn't wait for the asynchronous operations. If you have nothing to show the user until an asynchronous task completes, you display a loading indicator in the meantime.
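The idea that the main thread keeps running while an asynchronous operation is pending can be sketched outside Flutter as well. The following Python asyncio example is only an analogy: slow_query is a hypothetical stand-in for a database read, and the loop that counts ticks stands in for the UI staying responsive (for example, animating a loading indicator).

```python
import asyncio


async def slow_query() -> str:
    await asyncio.sleep(0.05)  # stands in for a slow database read
    return "rows"


async def main() -> tuple:
    ticks = 0
    query = asyncio.create_task(slow_query())
    while not query.done():
        ticks += 1                 # the "UI" keeps doing work meanwhile
        await asyncio.sleep(0.01)  # yield so the query can make progress
    return ticks, query.result()


ticks, result = asyncio.run(main())
```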
Hope this helps!

Replay subject subscription behaviour

The following code works as expected but I am puzzled by the way it behaves when I uncomment the line 'o.OnCompleted();'
The code joins all subscribers to the result of a single long operation and caches the result for further subscribers for 2 seconds. Any subscription after this time starts the process again.
Subscriptions will come from other threads (simulated with the thread pool).
var obs = Observable.Create((IObserver<Guid> o) =>
    {
        Console.WriteLine("Start");
        Thread.Sleep(1000); // process
        Console.WriteLine("End");
        o.OnNext(Guid.NewGuid());
        //o.OnCompleted(); // <-- uncomment this
        return Disposable.Empty;
    })
    .Replay(TimeSpan.FromSeconds(2))
    .RefCount()
    .Take(1);
ThreadPool.QueueUserWorkItem(delegate
{
    // simulate request from threadpool
    obs.Subscribe(x => Console.WriteLine($"1: {x}"), () => Console.WriteLine($"1: complete"));
});
ThreadPool.QueueUserWorkItem(delegate
{
    obs.Subscribe(x => Console.WriteLine($"2: {x}"), () => Console.WriteLine($"2: complete"));
});
Thread.Sleep(4000);
ThreadPool.QueueUserWorkItem(delegate
{
    obs.Subscribe(x => Console.WriteLine($"3: {x}"), () => Console.WriteLine($"3: complete"));
});
Here is the result:
Start
End
1: 255BEFDC-2F14-40AD-AE77-2B005C5A3AA9
2: 255BEFDC-2F14-40AD-AE77-2B005C5A3AA9
1: complete
2: complete
Start
End
3: 1214DC63-F688-475A-9CB7-C3784054A4AC
3: complete
The odd behaviour is if I uncomment the line 'o.OnCompleted()' the result changes to this:
Start
End
1: 255BEFDC-2F14-40AD-AE77-2B005C5A3AA9
2: 255BEFDC-2F14-40AD-AE77-2B005C5A3AA9
1: complete
2: complete
Start
End
3: complete
The 3rd subscriber causes another subscription to the root observable but the result is missing. It appears the ReplaySubject caches the result of the previous observable having completed but still causes a new subscription. This seems unintuitive. I would like to understand why it doesn't work.
Note: I originally tried this using Defer instead of Create, which had the same result as the second run above (for obvious reasons).
When you use the Replay/RefCount pair you create an observable that shares a common subscription to the source observable.
From the source:
Returns a connectable observable sequence that shares a single subscription to the underlying sequence replaying all notifications.
Now, it's important to remember that an observable produces a series of zero or more values, followed by either a complete or error signal. It cannot produce values after a complete or error is produced.
Since you are sharing a common subscription to the source, once the source produces a complete signal it cannot produce any more values. Calling o.OnCompleted() does exactly that.
Also, as a side-note, you should avoid ever writing return Disposable.Empty; inside a Create. It means you're creating an observable that can complete before the subscription has returned, and that can lead to race conditions.
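To make the contract concrete, here is a toy model in Python of a time-windowed replay subject. It is an assumption-laden sketch and not the Rx.NET implementation: ReplaySubjectModel and its explicit now timestamps are invented for illustration. It only captures two rules, namely that values are dropped after completion and that the terminal signal is replayed to late subscribers even when the buffered values have expired.

```python
class ReplaySubjectModel:
    """Toy model of a time-windowed replay subject (not the Rx.NET code)."""

    def __init__(self, window: float) -> None:
        self.window = window
        self.buffer = []        # (timestamp, value) pairs
        self.completed = False

    def on_next(self, value: str, now: float) -> None:
        if self.completed:
            return              # no values are allowed after completion
        self.buffer.append((now, value))

    def on_completed(self) -> None:
        self.completed = True

    def subscribe(self, now: float) -> list:
        # Replay the values still inside the window, then the terminal signal.
        events = [f"next:{v}" for t, v in self.buffer if now - t <= self.window]
        if self.completed:
            events.append("complete")
        return events


subject = ReplaySubjectModel(window=2.0)
subject.on_next("guid-1", now=1.0)
subject.on_completed()
early = subject.subscribe(now=1.0)   # subscriber inside the replay window

subject.on_next("guid-2", now=5.0)   # source runs again: the value is dropped
late = subject.subscribe(now=5.0)    # subscriber after the window expired
```

In this model the late subscriber sees only the completion, which matches the "3: complete" line in the second run above.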
The way to write your code without it is:
var obs =
    Observable
        .Defer(() => Observable.Return(Guid.NewGuid()).Concat(Observable.Never<Guid>()))
        .Replay(TimeSpan.FromSeconds(2.0))
        .RefCount()
        .Take(1);
But this is the same as not calling o.OnCompleted().

RxJava2 Single take different route based on the item

I have the following code:
Single<Response<User>> single = service.registerUser();
single
    .subscribeOn(Schedulers.io())
    .observeOn(Schedulers.computation())
    .map(Response::body)
    .flatMap(parentsRepsitory::writeUser)
    .observeOn(AndroidSchedulers.mainThread())
    .flatMap(parentsRepsitory::getUser)
Here parentsRepository is a repo wrapping my Realm database. The problems come when the server returns validation errors, however. So somewhere in my stream I want to have the equivalent of
if (response.code() == 201) {
    // CONTINUE STREAM USING THE LOGIC THAT HANDLES SUCCESS
} else if (response.code() == 400) {
    // CONTINUE STREAM USING LOGIC TO HANDLE THE VALIDATION ERRORS
}
A solution I have previously implemented is as follows:
Observable<Response<User>> observable_from_api =
    service.attemptLogin(username, password)
        .share();
observable_from_api
    .filter(response -> response.code() == HttpStatus.HTTP_STATUS_200_OK)
    // ... handle logic for success
observable_from_api
    .filter(response -> response.code() == HttpStatus.HTTP_STATUS_400_BAD_REQUEST)
    // ... handle logic for validation errors
I don't like this solution for several reasons. The main one is that it just does not seem right. The second is that the .share() method is only available on an Observable object. Since my network operation emits only one response, I would much rather use Single instead, but the .share() method is not available there.
Excuse me if this is a duplicate question, I have done some digging around and only found the solution I mentioned. I want to either see the optimal solution or be told explicitly that this is in fact the optimal solution.
I think you need to define which kind of data you want your consumer to receive. I assume you want to receive in the consumer a User object.
These are the signatures of the method that you should create:
Single<User> handleSuccess(Response<User> response)
Single<User> handleError(Response<User> response)
And then you create your stream in this way:
service.registerUser()
    .flatMap(response -> {
        // Retrofit's Response exposes isSuccessful() for 2xx status codes
        if (response.isSuccessful()) {
            return handleSuccess(response);
        } else {
            return handleError(response);
        }
    })
    .subscribe(user -> logd("user: " + user.name));
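The shape of this solution, one function that inspects the response and delegates to a success or an error handler returning the same type, does not depend on RxJava. A minimal sketch in Python follows; the Response dataclass, handle_success, handle_error and route are all hypothetical names standing in for Retrofit's Response<User> and the two handler methods above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    # Hypothetical stand-in for Retrofit's Response<User>.
    code: int
    body: Optional[str] = None
    error: Optional[str] = None


def handle_success(response: Response) -> str:
    return f"user: {response.body}"


def handle_error(response: Response) -> str:
    return f"validation failed: {response.error}"


def route(response: Response) -> str:
    # Both branches return the same type, so downstream code stays uniform.
    if response.code == 201:
        return handle_success(response)
    if response.code == 400:
        return handle_error(response)
    raise ValueError(f"unexpected status code {response.code}")


created = route(Response(code=201, body="alice"))
rejected = route(Response(code=400, error="name is required"))
```

Keeping the branch inside a single mapping step avoids subscribing to the source twice, which is what made share() necessary in the filter-based version.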