I've read a lot about async and I wonder what happens when async and non-async function calls are mixed, from a thread's point of view.
(I know that such a mix should not be done; I'm asking in order to understand the flow better.)
Suppose I have a function (cube 1) which calls an async function (cube 2), which in turn calls another async function (cube 3).
(Please note that the cube 1 function is not async.)
Question: when control reaches await Task.Delay(100); in cube 3, what is really happening? Does control go back to cube 1 and block there (the pink arrow), or is control released while the task is still awaited (the orange arrow)?
When control reaches the await statement in Cube3, Task.Delay() is called and immediately returns a Task that will complete after the delay. The rest of CalculateAnswer() is refactored into a continuation that will run when the Task returned by Delay() completes, and CalculateAnswer() immediately returns a Task<int> that will complete when that continuation completes.
CallCalculateAnswer() awaits the Task<int> returned by CalculateAnswer(), so the same logic applies: the code running after the await (i.e. the return statement) is refactored into a continuation that will run when the Task<int> completes, and CallCalculateAnswer() proceeds to return its own Task<int>.
StartChain(), however, does not await the Task<int> produced by CallCalculateAnswer(), so no continuation is generated. Instead, it stores the task into a local variable and returns. Therefore, the orange arrow in your question is the right one.
Note that, in this specific case, since nothing is done with the Task<int> returned by CallCalculateAnswer() and it is only stored in a local variable, it becomes "forgotten" once StartChain() returns: the task will complete (or fail) some time in the future, but its result (the int it produces) will be lost.
Update following comments: The pink arrow in your question implies you're expecting StartChain() to block, waiting for the task returned by CallCalculateAnswer() to complete, because it is not async and does not await that task. This is not what happens, as the usual control flow semantics apply: the method has returned a task, so you get a reference to that task and nothing more.
If you want StartChain() to block, you can use Task.Wait(), Task<T>.Result, or similar on the task returned by CallCalculateAnswer().
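As a rough cross-language analogy (a sketch in Scala using Future; the method names mirror the question's cubes and are assumptions, not actual code):
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Analogue of CalculateAnswer(): returns a handle to a result computed later.
def calculateAnswer(): Future[Int] = Future { Thread.sleep(100); 42 }

// Analogue of StartChain(): not async itself, it just receives the Future and returns.
def startChain(): Unit = {
  val task = calculateAnswer() // returns immediately; nothing blocks here
}

// If you do want to block, you ask for it explicitly; Await.result plays the
// role of Task.Wait() / Task<T>.Result here:
def startChainBlocking(): Int =
  Await.result(calculateAnswer(), 1.second)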
I came across this piece of code on SO:
EDIT: the following code snippet is fully functional; I'm trying to understand whether, besides "working", it can lead to errors due to a possible race condition.
if (!_downloaders.containsKey(page)) {
  _downloaders[page] = NetworkProvider().getRecentPodcasts(page);
  _downloaders[page].then((_) => _downloaders.remove(page));
}
final podcasts = await _downloaders[page];
and I cannot wrap my head around one part:
_downloaders[page].then((_) => _downloaders.remove(page));
Here, after we add a future to this Map, we execute the future by calling then(), because we want the future to be removed from the Map after it has finished.
That much is clear, but on the next line we call await on the Future that was added to the Map and that will soon be removed.
I cannot really understand how this is good code. I know that code execution doesn't stop for then() (but it does for await), but isn't there a remote case where execution reaches the await while the Future is NOT inside the Map anymore, because it has already been removed?
Or can this never happen? If so, can you explain the inner workings so I can fully grasp this concept and improve my codebase?
I cannot really understand how this is good code. I know that code execution doesn't stop for then() (but it does for await), but isn't there a remote case where execution reaches the await while the Future is NOT inside the Map anymore, because it has already been removed?
Future.then() does not execute the Future's computation. Future.then() only registers a callback.
The cited code usually shouldn't be racy. Futures are normally asynchronous; even if the Future's computation is already complete, the .then() callback will not execute until the Dart runtime returns to the event loop, which in this case would happen at the await line. You can observe this:
void main() async {
  print('Constructing Future');
  var future = Future.sync(() => print('Ran future'));
  print('Constructed Future; registering callback');
  // ignore: unawaited_futures
  future.then((_) => print('Ran .then() callback'));
  print('Registered callback; awaiting');
  await future;
  print('Done');
}
which will print:
Constructing Future
Ran future
Constructed Future; registering callback
Registered callback; awaiting
Ran .then() callback
Done
It's possible (but unlikely) that the code could be racy in pathological cases. For example, Flutter provides a SynchronousFuture class that implements the Future interface but executes its .then() callback synchronously upon registration. However, that is rather unusual (which is why the documentation for SynchronousFuture explicitly discourages using it). For that to happen, the NetworkProvider().getRecentPodcasts implementation would have to explicitly return a SynchronousFuture (or some equivalent implementation).
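For a loose parallel outside Dart: with Scala's calling-thread ExecutionContext.parasitic (Scala 2.13+), a callback registered on an already-completed Future runs synchronously at registration, which is exactly the kind of behavior that can reintroduce such ordering surprises. A tiny sketch:
import scala.concurrent.Future

val done = Future.successful(42)
// parasitic runs the callback on the registering thread; because the Future is
// already complete, the println happens before the next statement executes.
done.onComplete(println)(scala.concurrent.ExecutionContext.parasitic)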
For a few days I have been wrapping my head around cats-effect and IO, and I feel I have some misconceptions about this effect, or that I have simply missed its point.
First of all: if IO can replace Scala's Future, how can we create an async IO task? Using IO.shift? Using IO.async? Is IO.delay sync or async? Can we make a generic async task with code like Async[F].delay(...)? Or does async happen when we call IO with unsafeToAsync or unsafeToFuture?
What's the point of Async and Concurrent in cats-effect? Why are they separate?
Is IO a green thread? If yes, why is there a Fiber object in cats-effect? As I understand it, Fiber is the green thread, but the docs claim we can think of IOs as green threads.
I would appreciate some clarification on any of this, as I have failed to comprehend the cats-effect docs on these points and the internet was not that helpful...
if IO can replace Scala's Future, how can we create an async IO task
First, we need to clarify what is meant by an async task. Usually async means "does not block the OS thread", but since you're mentioning Future, it's a bit blurry. Say, if I wrote:
Future { (1 to 1000000).foreach(println) }
it would not be async, as it's a blocking loop and blocking output, but it would potentially execute on a different OS thread, as managed by an implicit ExecutionContext. The equivalent cats-effect code would be:
for {
  _ <- IO.shift
  _ <- IO.delay { (1 to 1000000).foreach(println) }
} yield ()
(this is not the shortest way to write it)
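One shorter way to write it (a sketch assuming cats-effect 2, where IO.shift takes an implicit ContextShift) could be:
import cats.effect.{ContextShift, IO}
import cats.syntax.apply._
import scala.concurrent.ExecutionContext

implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

// Shift to the pool first, then run the synchronous loop there.
val printAll: IO[Unit] = IO.shift *> IO.delay { (1 to 1000000).foreach(println) }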
So,
IO.shift is used to possibly switch the thread / thread pool. Future does this on every operation, but it's not free performance-wise.
IO.delay { ... } (a.k.a. IO { ... }) does NOT make anything async and does NOT switch threads. It's used to create simple IO values from synchronous side-effecting APIs.
Now, let's get back to true async. The thing to understand here is this:
Every async computation can be represented as a function taking a callback.
Whether you're using an API that returns Future, Java's CompletableFuture, or something like an NIO CompletionHandler, it can all be converted to callbacks. This is what IO.async is for: you can convert any function taking a callback into an IO. And in a case like:
for {
  _ <- IO.async { ... }
  _ <- IO(println("Done"))
} yield ()
"Done" will only be printed when (and if) the computation in ... calls back. You can think of it as blocking the green thread, but not the OS thread.
So,
IO.async is for converting any already asynchronous computation to IO.
IO.delay is for converting any completely synchronous computation to IO.
The code with truly asynchronous computations behaves like it's blocking a green thread.
The closest analogy when working with Futures is creating a scala.concurrent.Promise and returning p.future.
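As a small sketch of the first point, here is what wrapping a hypothetical callback-based API with IO.async might look like (fetchUser is made up for illustration):
import cats.effect.IO

// Hypothetical callback-based API, standing in for NIO handlers,
// CompletableFuture adapters, and the like:
def fetchUser(id: Int)(onSuccess: String => Unit, onError: Throwable => Unit): Unit =
  onSuccess(s"user-$id") // pretend this arrives later from some event loop

// IO.async turns "a function taking a callback" into an IO value:
def fetchUserIO(id: Int): IO[String] =
  IO.async { cb =>
    fetchUser(id)(user => cb(Right(user)), err => cb(Left(err)))
  }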
Or does async happen when we call IO with unsafeToAsync or unsafeToFuture?
Sort of. With IO, nothing happens unless you call one of these (or use IOApp). But IO does not guarantee execution on a different OS thread, or even asynchronous execution, unless you ask for it explicitly with IO.shift or IO.async.
You can guarantee thread switching at any time with e.g. (IO.shift *> myIO).unsafeRunAsyncAndForget(). This is possible exactly because myIO is not executed until it is asked for, whether you have it as val myIO or def myIO.
You cannot magically transform blocking operations into non-blocking ones, however. That's not possible with either Future or IO.
What's the point of Async and Concurrent in cats-effect? Why are they separate?
Async and Concurrent (and Sync) are type classes. They are designed so that programmers can avoid being locked into cats.effect.IO, and so that libraries can offer an API that supports whatever you choose instead, such as monix Task or Scalaz 8 ZIO, or even a monad transformer type such as OptionT[Task, *something*]. Libraries like fs2, monix and http4s make use of them to give you more choice of what to use them with.
Concurrent adds extra things on top of Async, the most important of which are .cancelable and .start. These have no direct analogy with Future, since Future does not support cancellation at all.
.cancelable is a version of .async that allows you to also specify some logic to cancel the operation you're wrapping. A common example is network requests: if you're no longer interested in the results, you can just abort the request without waiting for the server's response and not waste any sockets or processing time on reading it. You might never use it directly, but it has its place.
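A minimal sketch of that network-request case with cats-effect 2's IO.cancelable (the request API below is hypothetical, invented for illustration):
import cats.effect.IO

// Hypothetical client that starts a request and hands back an abort handle:
trait RequestHandle { def abort(): Unit }
def sendRequest(cb: Either[Throwable, String] => Unit): RequestHandle = ???

// The function passed to IO.cancelable registers the callback and returns
// the finalizer (an IO[Unit]) to run if the operation gets canceled:
val request: IO[String] =
  IO.cancelable { cb =>
    val handle = sendRequest(cb)
    IO(handle.abort())
  }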
But what good are cancelable operations if you can't cancel them? The key observation here is that you cannot cancel an operation from within itself. Somebody else has to make that decision, and that happens concurrently with the operation itself (which is where the type class gets its name). That's where .start comes in. In short,
.start is an explicit fork of a green thread.
Doing someIO.start is akin to doing val t = new Thread(someRunnable); t.start(), except it's green now. And Fiber is essentially a stripped-down version of the Thread API: you can call .join, which is like Thread#join() but does not block an OS thread, and .cancel, which is a safe version of .interrupt().
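Concretely, a small cats-effect 2 sketch of forking and joining (note that .start needs an implicit ContextShift):
import cats.effect.{ContextShift, IO}
import scala.concurrent.ExecutionContext

object ForkExample extends App {
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  val someIO: IO[Int] = IO(41 + 1) // placeholder computation

  val program: IO[Int] =
    for {
      fiber  <- someIO.start // explicit fork of a green thread
      _      <- IO(println("doing something else concurrently"))
      result <- fiber.join   // like Thread#join(), but does not block an OS thread
    } yield result

  println(program.unsafeRunSync()) // 42
}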
Note that there are other ways to fork green threads. For example, doing parallel operations:
val ids: List[Int] = List.range(1, 1000)
def processId(id: Int): IO[Unit] = ???
val processAll: IO[Unit] = ids.parTraverse_(processId)
will fork the processing of all IDs onto green threads and then join them all. Or, using .race:
val fetchFromS3: IO[String] = ???
val fetchFromOtherNode: IO[String] = ???
val fetchWhateverIsFaster = IO.race(fetchFromS3, fetchFromOtherNode).map(_.merge)
will execute the fetches in parallel, give you the first result that completes, and automatically cancel the slower fetch. So doing .start and using Fiber is not the only way to fork more green threads, just the most explicit one. And that answers:
Is IO a green thread? If yes, why is there a Fiber object in cats-effect? As I understand it, Fiber is the green thread, but the docs claim we can think of IOs as green threads.
IO is like a green thread, meaning you can have lots of them running in parallel without the overhead of OS threads, and the code in a for-comprehension behaves as if it were blocking until the result is computed.
Fiber is a tool for controlling explicitly forked green threads (waiting for completion or cancelling them).
It should be simple, but I have no idea how to do it. I want to run a ScalaZ Task on the current thread. I was surprised that task.run doesn't run on the current thread, given that it is synchronous.
Is it possible to run it on the current thread, and if so, how?
There were some updates and deprecations since http://timperrett.com/2014/07/20/scalaz-task-the-missing-documentation/.
Right now, the recommended way of running a task synchronously is:
task.unsafePerformSync // returns result or throws exception
task.unsafePerformSyncAttempt // returns -\/(error) or \/-(result)
Keep in mind, though, that it is not exactly done on the caller's thread: the execution is performed on the thread pool defined for the task, but the caller's thread blocks until the execution is finished. There is no way of making the task run on exactly the same thread.
In general, if Task.async is used, there is no way to make a composite Task always stay on the same thread, as the cb (callback) can be called from anywhere (any thread). So, in a chain like:
Task
  .delay("aaa")
  .map(_ + "bbb")
  .flatMap(x => Task.async(cb => completeCallBackSomewhereElse(cb, x)))
  .map(_ + "ccc")
  .unsafePerformSync
_ + "bbb" is gonna be executed in a caller's thread
_ + "ccc" is gonna be executed in Somewhereelse's thread as scalaz have no control over it.
Basically, this allows a Task to be a powerful instrument for asynchronous operations, so it might not even know about underlying thread pools or even implement behavior without pure threads and wait/notify.
However, there are special cases where it might work as caller-runs:
1) No Strategy/Task.async related stuff:
Task.delay("aaa").map(_ + "bbb").unsafePerformSync
unsafePerformSync uses a CountDownLatch to await the result of runAsync, so if there are no async/non-deterministic operations along the way, runAsync will use the caller's thread:
/**
 * Run this `Future`, passing the result to the given callback once available.
 * Any pure, non-asynchronous computation at the head of this `Future` will
 * be forced in the calling thread. At the first `Async` encountered, control
 * switches to whatever thread backs the `Async` and this function returns.
 */
def runAsync(cb: A => Unit): Unit =
  listen(a => Trampoline.done(cb(a)))
2) You have control over execution strategies, so a simple Java trick (an ExecutorService that runs submitted work on the calling thread) will help. Besides, this is already implemented in scalaz and called Strategy.sequential.
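A sketch of that trick, assuming scalaz's Task.apply (which accepts an implicit ExecutorService):
import java.util.Collections
import java.util.concurrent.{AbstractExecutorService, TimeUnit}
import scalaz.concurrent.Task

// An ExecutorService that runs submitted work on the thread that submits it.
object CallerRunsExecutor extends AbstractExecutorService {
  def execute(command: Runnable): Unit = command.run()
  def shutdown(): Unit = ()
  def shutdownNow(): java.util.List[Runnable] = Collections.emptyList[Runnable]()
  def isShutdown(): Boolean = false
  def isTerminated(): Boolean = false
  def awaitTermination(timeout: Long, unit: TimeUnit): Boolean = false
}

// Passing the caller-runs executor keeps this (non-async) computation on the
// thread that calls unsafePerformSync.
val t = Task { Thread.currentThread.getName }(CallerRunsExecutor)
println(t.unsafePerformSync) // prints the caller's thread name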
P.S.
1) If you simply want to start a computation as soon as possible, use Task.now / Task.unsafeStart.
2) If you want something less heavily tied to asynchronous execution but still lazy and stack-safe, you might take a look at Eval from the Cats library: http://eed3si9n.com/herding-cats/Eval.html
3) If you just need to encapsulate side effects, take a look at scalaz.effect.
I was looking over some Scala server code and I saw this async/await block:
async {
  while (cancellationToken.nonCancelled) {
    val (request, exchange) = await(listener.nextRequest)
    respond(exchange, cancellationToken, handler(request))
  }
}
How can this be correct syntax?
As I understand it:
For every iteration of the while loop:
Thread 1 will execute the code in the while loop, except for the part in the await clause.
Thread 2 will go into the await clause.
But then Thread 1 would have val (request, exchange) uninitialized in case Thread 2 hasn't finished computing,
and these values would be passed to the respond and handler methods uninitialized.
So how can you have an assignment in two different threads?
So how can you have an assignment in two different threads?
async-await's main goal is to allow you to do asynchronous programming in a synchronous fashion.
What really happens is that the awaited call executes listener.nextRequest and asynchronously waits for its completion; it doesn't execute the next line of code until then. This guarantees that when the next line of code is executed, its values are populated. The assignment happens where it is visible to the next line of code in the method.
This is possible because the async macro transforms this code into a state machine, where the first part is the execution up until the first await, and the next part is everything after it.
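As a rough sketch of the idea (not the actual macro output; nextRequest below is a stand-in for listener.nextRequest), the code after an await behaves like a continuation attached to the Future:
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

// Stand-in for listener.nextRequest:
def nextRequest: Future[(String, String)] = Future(("GET /", "some-exchange"))

def handleOne(): Future[Unit] =
  nextRequest.map { case (request, exchange) =>
    // This block is the "rest of the method" after the await; it runs only
    // after nextRequest has completed, so request and exchange are populated.
    println(s"responding to $request on $exchange")
  }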
This is a question about Scala continuations. Can resets be nested? If they can, what are nested resets useful for? Is there an example of nested resets?
Yes, resets can be nested, and, yes, it can be useful. As an example, I recently prototyped an API for the scalagwt project that would allow GWT developers to write asynchronous RPCs (remote procedure calls) in a direct style (as opposed to the callback-passing style that is used in GWT for Java). For example:
field1 = "starting up..." // 1
field2 = "starting up..." // 2
async { // (reset)
val x = service.someAsyncMethod() // 3 (shift)
field1 = x // 5
async { // (reset)
val y = service.anotherAsyncMethod() // 6 (shift)
field2 = y // 8
}
field2 = "waiting..." // 7
}
field1 = "waiting..." // 4
The comments indicate the order of execution. Here, the async method performs a reset, and each service call performs a shift (you can see the implementation on my github fork, specifically Async.scala).
Note how the nested async changes the control flow. Without it, the line field2 = "waiting..." would not be executed until after successful completion of the second RPC.
When an RPC is made, the implementation captures the continuation up to the innermost async boundary and suspends it for execution upon successful completion of the RPC. Thus, the nested async block allows control to flow immediately to the line after it as soon as the second RPC is made. Without that nested block, on the other hand, the continuation would extend all the way to the end of the outer async block, in which case all the code within the outer async would block on each and every RPC.
reset forms an abstraction so that code outside is not affected by the fact that the code inside is implemented with continuation magic. So if you're writing code with reset and shift, it can call other code which may or may not be implemented with reset and shift as well. In this sense they can be nested.