Future declaration seems independent from promise - scala

I was reading this article http://danielwestheide.com/blog/2013/01/16/the-neophytes-guide-to-scala-part-9-promises-and-futures-in-practice.html and I was looking at this code:
object Government {
def redeemCampaignPledge(): Future[TaxCut] = {
val p = Promise[TaxCut]()
Future {
println("Starting the new legislative period.")
Thread.sleep(2000)
p.success(TaxCut(20))
println("We reduced the taxes! You must reelect us!!!!1111")
}
p.future
}
}
I've seen this type of code a few times and I'm confused. So we have this Promise:
val p = Promise[TaxCut]()
And this Future:
Future {
println("Starting the new legislative period.")
Thread.sleep(2000)
p.success(TaxCut(20))
println("We reduced the taxes! You must reelect us!!!!1111")
}
I don't see any assignment between them so I don't understand: How are they connected?

I don't see any assignment between them so I don't understand: How are
they connected?
A Promise is a one way of creating a Future.
When you use Future { } and import scala.concurrent.ExecutionContext.Implicits.global, you're queuing a function on one of Scala's threadpool threads. But, that isn't the only way to generate a Future. A Future need not necessarily be scheduled on a different thread.
What this example does is:
Creates a Promise[TaxCut] which will be completed sometime in the near future.
Queues a function to be ran inside a threadpool thread via the Future apply. This function also completes the Promise via the Promise.success method
Returns the future generated by the promise via Promise.future. When this future returns, it may not be completed yet, depending on how fast the execution of the function queued to the Future really runs (the OP was trying to convey this via the Thread.sleep method, delaying the completion of the future).

Related

Is synchronous HTTP request wrapped in a Future considered CPU or IO bound?

Consider the following two snippets where first wraps scalaj-http requests with Future, whilst second uses async-http-client
Sync client wrapped with Future using global EC
object SyncClientWithFuture {
def main(args: Array[String]): Unit = {
import scala.concurrent.ExecutionContext.Implicits.global
import scalaj.http.Http
val delay = "3000"
val slowApi = s"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"
val nestedF = Future(Http(slowApi).asString).flatMap { _ =>
Future.sequence(List(
Future(Http(slowApi).asString),
Future(Http(slowApi).asString),
Future(Http(slowApi).asString)
))
}
time { Await.result(nestedF, Inf) }
}
}
Async client using global EC
object AsyncClient {
def main(args: Array[String]): Unit = {
import scala.concurrent.ExecutionContext.Implicits.global
import sttp.client._
import sttp.client.asynchttpclient.future.AsyncHttpClientFutureBackend
implicit val sttpBackend = AsyncHttpClientFutureBackend()
val delay = "3000"
val slowApi = uri"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"
val nestedF = basicRequest.get(slowApi).send().flatMap { _ =>
Future.sequence(List(
basicRequest.get(slowApi).send(),
basicRequest.get(slowApi).send(),
basicRequest.get(slowApi).send()
))
}
time { Await.result(nestedF, Inf) }
}
}
The snippets are using
Slowwly to simulate slow API
scalaj-http
async-http-client sttp backend
time
The former takes 12 seconds whilst the latter takes 6 seconds. It seems the former behaves as if it is CPU bound however I do not see how that is the case since Future#sequence should executes the HTTP requests in parallel? Why does synchronous client wrapped in Future behave differently from proper async client? Is it not the case that async client does the same kind of thing where it wraps calls in Futures under the hood?
Future#sequence should execute the HTTP requests in parallel?
First of all, Future#sequence doesn't execute anything. It just produces a future that completes when all parameters complete.
Evaluation (execution) of constructed futures starts immediately If there is a free thread in the EC. Otherwise, it simply submits it for a sort of queue.
I am sure that in the first case you have single thread execution of futures.
println(scala.concurrent.ExecutionContext.Implicits.global) -> parallelism = 6
Don't know why it is like this, it might that other 5 thread is always busy for some reason. You can experiment with explicitly created new EC with 5-10 threads.
The difference with the Async case that you don't create a future by yourself, it is provided by the library, that internally don't block the thread. It starts the async process, "subscribes" for a result, and returns the future, which completes when the result will come.
Actually, async lib could have another EC internally, but I doubt.
Btw, Futures are not supposed to contain slow/io/blocking evaluations without blocking. Otherwise, you potentially will block the main thread pool (EC) and your app will be completely frozen.

Why is it not recommended to retrieve value from Scala's Future?

I started working on Scala very recently and came across its feature called Future. I had posted a question for help with my code and some help from it.
In that conversation, I was told that it is not recommended to retrieve the value from a Future.
I understand that it is a parallel process when executed but if the value of a Future is not recommended to be retrieved, how/when do I access the result of it ? If the purpose of Future is to run a thread/process independent of main thread, why is it that it is not recommended to access it ? Will the Future automatically assign its output to its caller ? If so, how would we know when to access it ?
I wrote the below code to return a Future with a Map[String, String].
def getBounds(incLogIdMap:scala.collection.mutable.Map[String, String]): Future[scala.collection.mutable.Map[String, String]] = Future {
var boundsMap = scala.collection.mutable.Map[String, String]()
incLogIdMap.keys.foreach(table => if(!incLogIdMap(table).contains("INVALID")) {
val minMax = s"select max(cast(to_char(update_tms,'yyyyddmmhhmmss') as bigint)) maxTms, min(cast(to_char(update_tms,'yyyyddmmhhmmss') as bigint)) minTms from queue.${table} where key_ids in (${incLogIdMap(table)})"
val boundsDF = spark.read.format("jdbc").option("url", commonParams.getGpConUrl()).option("dbtable", s"(${minMax}) as ctids")
.option("user", commonParams.getGpUserName()).option("password", commonParams.getGpPwd()).load()
val maxTms = boundsDF.select("minTms").head.getLong(0).toString + "," + boundsDF.select("maxTms").head.getLong(0).toString
boundsMap += (table -> maxTms)
}
)
boundsMap
}
If I have to use the value which is returned from the method getBounds, can I access it in the below way ?
val tmsobj = new MinMaxVals(spark, commonParams)
tmsobj.getBounds(incLogIds) onComplete ({
case Success(Map) => val boundsMap = tmsobj.getBounds(incLogIds)
case Failure(value) => println("Future failed..")
})
Could anyone care to clear my doubts ?
As the others have pointed out, waiting to retrieve a value from a Future defeats the whole point of launching the Future in the first place.
But onComplete() doesn't cause the rest of your code to wait, it just attaches extra instructions to be carried out as part of the Future thread while the rest of your code goes on its merry way.
So what's wrong with your proposed code to access the result of getBounds()? Let's walk through it.
tmsobj.getBounds(incLogIds) onComplete { //launch Future, when it completes ...
case Success(m) => //if Success then store the result Map in local variable "m"
val boundsMap = tmsobj.getBounds(incLogIds) //launch a new and different Future
//boundsMap is a local variable, it disappears after this code block
case Failure(value) => //if Failure then store error in local variable "value"
println("Future failed..") //send some info to STDOUT
}//end of code block
You'll note that I changed Success(Map) to Success(m) because Map is a type (it's a companion object) and can't be used to match the result of your Future.
In conclusion: onComplete() doesn't cause your code to wait on the Future, which is good, but it is somewhat limited because it returns Unit, i.e. it has no return value with which it can communicate the result of the Future.
TLDR; Futures are not meant to manage shared state but they are good for composing asynchronous pieces of code. You can use map, flatMap and many other operations to combine Futures.
The computation that the Future represents will be executed using the given ExecutionContext (usually given implicitly), which will usually be on a thread-pool, so you are right to assume that the Future computation happens in parallel. Because of this concurrency, it is generally not advised to mutate state that is shared from inside the body of the Future, for example:
var i: Int = 0
val f: Future[Unit] = Future {
// Some computation
i = 42
}
Because you then run the risk of also accessing/modifying i in another thread (maybe the "main" one). In this kind of concurrent access situation, Futures would probably not be the right concurrency model, and you could imagine using monitors or message-passing instead.
Another possibility that is tempting but also discouraged is to block the main thread until the result becomes available:
val f: Future[Init] = Future { 42 }
val i: Int = Await.result(f)
The reason this is bad is that you will completely block the main thread, annealing the benefits of having concurrent execution in the first place. If you do this too much, you might also run in trouble because of a large number of threads that are blocked and hogging resources.
How do you then know when to access the result? You don't and it's actually the reason why you should try to compose Futures as much as possible, and only subscribe to their onComplete method at the very edge of your application. It's typical for most of your methods to take and return Futures, and only subscribe to them in very specific places.
It is not recommended to wait for a Future using Await.result because this blocks the execution of the current thread until some unknown point in the future, possibly forever.
It is perfectly OK to process the value of a Future by passing a processing function to a call such as map on the Future. This will call your function when the future is complete. The result of map is another Future, which can, in turn, be processed using map, onComplete or other methods.

Asynchronous message handling with Akka's Actors

In my project I'm using Akka's Actors. By definition Actors are thread-safe, which means that in the Actor's receive method
def receive = {
case msg =>
// some logic here
}
only one thread at a time processes the commented piece of code. However, things are starting to get more complicated when this code is asynchronous:
def receive = {
case msg =>
Future {
// some logic here
}
}
If I understand this correctly, in this case only the Future construct will be synchronized, so to speak, and not the logic inside the Future.
Of course I may block the Future:
def receive = {
case msg =>
val future = Future {
// some logic here
}
Await.result(future, 10.seconds)
}
which solves the problem, but I think we all should agree that this is hardly an acceptable solution.
So this is my question: how can I retain the thread-safe nature of actors in case of asynchronous computing without blocking Scala's Futures?
How can I retain the thread-safe nature of actors in case of
asynchronous computing without block Scalas Future?
This assumption is only true if you modify the internal state of the actor inside the Future which seems to be a design smell in the first place. Use the future for computation only by creating a copy of the data and pipe to result of the computation to the actor using pipeTo. Once the actor receives the result of the computation you can safely operate on it:
import akka.pattern.pipe
case class ComputationResult(s: String)
def receive = {
case ComputationResult(s) => // modify internal state here
case msg =>
Future {
// Compute here, don't modify state
ComputationResult("finished computing")
}.pipeTo(self)
}
I think you need to "resolve" the db query first and then use the result to return a new Future. If the db query returns a Future[A], then you can use flatMap to operate over A and return a new Future. Something in the lines of
def receive = {
case msg =>
val futureResult: Future[Result] = ...
futureResult.flatMap { result: Result =>
// ....
// return a new Future
}
}
the simplest solution here is to turn the actor into a state machine (use AkkaFSM) and do the following:
dispatch a future for the mongoDB request.
use the reference to your own actor to commuincate with your actor
tell the message back from the future.
depending on context you might have to do some more to get a proper response.
But this has the advantage that you process the message with the actor state and you can mutate the actor state as you please as you own the thread.

Do Futures always end up not returning anything?

Given that we must avoid...
1) Modifying state
2) Blocking
...what is a correct end-to-end usage for a Future?
The general practice in using Futures seems to be transforming them into other Futures by using map, flatMap etc. but it's no good creating Futures forever.
Will there always be a call to onComplete somewhere, with methods writing the result of the Future to somewhere external to the application (e.g. web socket; the console; a message broker) or is there a non-blocking way of accessing the result?
All of the information on Futures in the Scaladocs - http://docs.scala-lang.org/overviews/core/futures.html seem to end up writing to the console. onComplete doesn't return anything, so presumably we have to end up doing some "fire-and-forget" IO.
e.g. a call to println
f onComplete {
case Success(number) => println(number)
case Failure(err) => println("An error has occured: " + err.getMessage)
}
But what about in more complex cases where we want to do more with the result of the Future?
As an example, in the Play framework Action.async can return a Future[Result] and the framework handles the rest. Will it eventually have to expect never to get a result from the Future?
We know the user needs to be returned a Result, so how can a framework do this using only a Unit method?
Is there a non-blocking way to retrieve the value of a future and use it elsewhere within the application, or is a call to Await inevitable?
Best practice is to use callbacks such as onComplete, onSuccess, onFailure for side effecting operations, e.g. logging, monitoring, I/O.
If you need the continue with the result of of your Future computation as opposed to do a side-effecting operation, you should use map to get access to the result of your computation and compose over it.
Future returns a unit, yes. That's because it's an asynchronous trigger. You need to register a callback in order to gather the result.
From your referenced scaladoc (with my comments):
// first assign the future with expected return type to a variable.
val f: Future[List[String]] = Future {
session.getRecentPosts
}
// immediately register the callbacks
f onFailure {
case t => println("An error has occurred: " + t.getMessage)
}
f onSuccess {
case posts => for (post <- posts) println(post)
}
Or instead of println-ing you could do something with the result:
f onSuccess {
case posts: List[String] => someFunction(posts)
}
Try this out:
import scala.concurrent.duration._
import scala.concurrent._
import scala.concurrent.ExecutionContext.Implicits.global
val f: Future[Int] = Future { 43 }
val result: Int = Await.result(f, 0 nanos)
So what is going on here?
You're defining a computation to be executed on a different thread.
So you Future { 43 } returns immediately.
Then you can wait for it and gather the result (via Await.result) or define computation on it without waiting for it to be completed (via map etc...)
Actually, the kind of Future you are talking about are used for side-effects. The result returned by a Future depends its type :
val f = Future[Int] { 42 }
For example, I could send the result of Future[Int] to another Future :
val f2 = f.flatMap(integer => Future{ println(integer) }) // may print 42
As you know, a future is a process that happens concurrently. So you can get its result in the future (that is, using methods such as onComplete) OR by explicitly blocking the current thread until it gets a value :
import scala.concurrent.Await
import akka.util.Timeout
import scala.concurrent.duration._
implicit val timeout = Timeout(5 seconds)
val integer = Await.result(Future { 42 }, timeout.duration)
Usually when you start dealing with asynchronous processes, you have to think in terms of reactions which may never occur. Using chained Futures is like declaring a possible chain of events which could be broken at any moment. Therefore, waiting for a Future's value is definitely not a good practice as you may never get it :
val integer = Await.result(Future { throw new RuntimeException() }, timeout.duration) // will throw an uncaught exception
Try to think more in terms of events, than in procedures.

Scala async/await and parallelization

I'm learning about the uses of async/await in Scala. I have read this in https://github.com/scala/async
Theoretically this code is asynchronous (non-blocking), but it's not parallelized:
def slowCalcFuture: Future[Int] = ...
def combined: Future[Int] = async {
await(slowCalcFuture) + await(slowCalcFuture)
}
val x: Int = Await.result(combined, 10.seconds)
whereas this other one is parallelized:
def combined: Future[Int] = async {
val future1 = slowCalcFuture
val future2 = slowCalcFuture
await(future1) + await(future2)
}
The only difference between them is the use of intermediate variables.
How can this affect the parallelization?
Since it's similar to async & await in C#, maybe I can provide some insight. In C#, it's a general rule that Task that can be awaited should be returned 'hot', i.e. already running. I assume it's the same in Scala, where the Future returned from the function does not have to be explicitly started, but is just 'running' after being called. If it's not the case, then the following is pure (and probably not true) speculation.
Let's analyze the first case:
async {
await(slowCalcFuture) + await(slowCalcFuture)
}
We get to that block and hit the first await:
async {
await(slowCalcFuture) + await(slowCalcFuture)
^^^^^
}
Ok, so we're asynchronously waiting for that calculation to finish. When it's finished, we 'move on' with analyzing the block:
async {
await(slowCalcFuture) + await(slowCalcFuture)
^^^^^
}
Second await, so we're asynchronously waiting for second calculation to finish. After that's done, we can calculate the final result by adding two integers.
As you can see, we're moving step-by-step through awaits, awaiting Futures as they come one by one.
Let's take a look at the second example:
async {
val future1 = slowCalcFuture
val future2 = slowCalcFuture
await(future1) + await(future2)
}
OK, so here's what (probably) happens:
async {
val future1 = slowCalcFuture // >> first future is started, but not awaited
val future2 = slowCalcFuture // >> second future is started, but not awaited
await(future1) + await(future2)
^^^^^
}
Then we're awaiting the first Future, but both of the futures are currently running. When the first one returns, the second might have already completed (so we will have the result available at once) or we might have to wait for a little bit longer.
Now it's clear that second example runs two calculations in parallel, then waits for both of them to finish. When both are ready, it returns. First example runs the calculations in a non-blocking way, but sequentially.
the answer by Patryk is correct if a little difficult to follow. the main thing to understand about async/await is that it's just another way of doing Future's flatMap. there's no concurrency magic behind the scenes. all the calls inside an async block are sequential, including await which doesn't actually block the executing thread but rather wraps the rest of the async block in a closure and passes it as a callback on completion of the Future we're waiting on. so in the first piece of code the second calculation doesn't start until the first await has completed since no one started it prior to that.
In first case you create a new thread to execute a slow future and wait for it in a single call. So invocation of the second slow future is performed after the first one is complete.
In the second case when val future1 = slowCalcFuture is called, it effectively create a new thread, pass pointer to "slowCalcFuture" function to the thread and says "execute it please". It takes as much time as it is necessary to get a thread instance from thread pool, and pass a pointer to a function to the thread instance. Which can be considered instant. So, because val future1 = slowCalcFuture is translated into "get thread and pass pointer" operations, it is complete in no time and the next line is executed without any delay val future2 = slowCalcFuture. Feauture 2 is scheduled to execution without any delay too.
Fundamental difference between val future1 = slowCalcFuture and await(slowCalcFuture) is the same as between asking somebody to make you coffee and waiting for your coffee to be ready. Asking takes 2 seconds: which is needed to say phrase: "could you make me coffee please?". But waiting for coffee to be ready will take 4 minutes.
Possible modification of this task could be waiting for 1st available answer. For example, you want to connect to any server in a cluster. You issue requests to connect to every server you know, and the first one which responds, will be your server. You could do this with:
Future.firstCompletedOf(Array(slowCalcFuture, slowCalcFuture))