Is map of Future lazy or not? - scala

Basically I mean:
for(v <- Future(long time operation)) yield v*someOtherValue
This expression returns another Future, but the question is, is the v*someOhterValue operation lazy or not? Will this expression block on getting the value of Future(long time operation)?
Or it is like a chain of callbacks?

A short experiment can test this question.
import concurrent._;
import concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
object TheFuture {
def main(args: Array[String]): Unit = {
val fut = for (v <- Future { Thread.sleep(2000) ; 10 }) yield v * 10;
println("For loop is finished...")
println(Await.ready(fut, Duration.Inf).value.get);
}
}
If we run this, we see For loop is finished... almost immediately, and then two seconds later, we see the result. So the act of performing map or similar operations on a future is not blocking.

A map (or, equivalently, your for comprehension) on a Future is not lazy: it will be executed as soon as possible on another thread. However, since it runs on another thread, it isn't blocking, either.

If you want to do the definition and execution of the Future separately, then you have to use something like a Monix Task.
https://monix.io/api/3.0/monix/eval/Task.html

Related

Is synchronous HTTP request wrapped in a Future considered CPU or IO bound?

Consider the following two snippets where first wraps scalaj-http requests with Future, whilst second uses async-http-client
Sync client wrapped with Future using global EC
object SyncClientWithFuture {
def main(args: Array[String]): Unit = {
import scala.concurrent.ExecutionContext.Implicits.global
import scalaj.http.Http
val delay = "3000"
val slowApi = s"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"
val nestedF = Future(Http(slowApi).asString).flatMap { _ =>
Future.sequence(List(
Future(Http(slowApi).asString),
Future(Http(slowApi).asString),
Future(Http(slowApi).asString)
))
}
time { Await.result(nestedF, Inf) }
}
}
Async client using global EC
object AsyncClient {
def main(args: Array[String]): Unit = {
import scala.concurrent.ExecutionContext.Implicits.global
import sttp.client._
import sttp.client.asynchttpclient.future.AsyncHttpClientFutureBackend
implicit val sttpBackend = AsyncHttpClientFutureBackend()
val delay = "3000"
val slowApi = uri"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"
val nestedF = basicRequest.get(slowApi).send().flatMap { _ =>
Future.sequence(List(
basicRequest.get(slowApi).send(),
basicRequest.get(slowApi).send(),
basicRequest.get(slowApi).send()
))
}
time { Await.result(nestedF, Inf) }
}
}
The snippets are using
Slowwly to simulate slow API
scalaj-http
async-http-client sttp backend
time
The former takes 12 seconds whilst the latter takes 6 seconds. It seems the former behaves as if it is CPU bound however I do not see how that is the case since Future#sequence should executes the HTTP requests in parallel? Why does synchronous client wrapped in Future behave differently from proper async client? Is it not the case that async client does the same kind of thing where it wraps calls in Futures under the hood?
Future#sequence should execute the HTTP requests in parallel?
First of all, Future#sequence doesn't execute anything. It just produces a future that completes when all parameters complete.
Evaluation (execution) of constructed futures starts immediately If there is a free thread in the EC. Otherwise, it simply submits it for a sort of queue.
I am sure that in the first case you have single thread execution of futures.
println(scala.concurrent.ExecutionContext.Implicits.global) -> parallelism = 6
Don't know why it is like this, it might that other 5 thread is always busy for some reason. You can experiment with explicitly created new EC with 5-10 threads.
The difference with the Async case that you don't create a future by yourself, it is provided by the library, that internally don't block the thread. It starts the async process, "subscribes" for a result, and returns the future, which completes when the result will come.
Actually, async lib could have another EC internally, but I doubt.
Btw, Futures are not supposed to contain slow/io/blocking evaluations without blocking. Otherwise, you potentially will block the main thread pool (EC) and your app will be completely frozen.

fs2.Stream[IO, Something] does not return on take(1)

I have a function that returns a fs2.Stream of Measurements.
import cats.effect._
import fs2._
def apply(sds: SerialPort, interval: Int)(implicit cs: ContextShift[IO]): Stream[IO, SdsMeasurement] =
for {
blocker <- Stream.resource(Blocker[IO])
stream <- io.readInputStream(IO(sds.getInputStream), 1, blocker)
.through(SdsStateMachine.collectMeasurements())
} yield stream
Normally it is an infinite Stream, unless I pass it a test flag, in which case it should output one value and halt.
val infiniteSource: Stream[IO, SdsMeasurement] = ...
val source = if (isTest) infiniteSource.take(1) else infiniteSource
source.compile.drain
The infinite Stream works fine. It gives me all Measurements infinitely. The test Stream indeed gives me only the first measurement, nothing more. The problem I have is that the Stream does not return after this last measurements. It blocks forever. What am I doing wrong?
Note: I think I abstracted the essential code, but for more context, please take a look at my project: https://github.com/jkransen/fijnstof/blob/ZIO/src/main/scala/nl/kransen/fijnstof/Main.scala
The code you've presented here looks fine, I don't think the issue lies within that code. If it blocks, then presumably one of the underlying APIs blocks, for instance it might be the close method of the InputStream. What I usually do in such situations is to add log statements before and after every function call that might block.

scalaz.concurrent.Task repeatEval only evaluate Task.now and Task.async once

I am learning scalaz stream at the moment, I am confused why repeatEval only evaluate Task.async once.
val result = Process
.repeatEval(Task.async[Unit](t => {
val result = scala.io.Source.fromURL("http://someUrl").mkString
println(".......")
println(result)
}))
result.runLog.run //only print once
However, if I change Task.async to Task.delay. It evaluates the function infinitely. I dont know why is that
val result = Process
.repeatEval(Task.delay({
val result = scala.io.Source.fromURL("http://someUrl").mkString
println(".......")
println(result)
}))
result.runLog.run //print infinitely
Many thanks in advance
As I mention in my answer to your recent question about Task, Task.async takes a function that registers callbacks—not some code that should be executed asynchronously. In the case of the other question, you actually want Task.async, since you're interoperating with a callback-based API.
Here it seems like you probably want Task.apply, not Task.delay. The two look similar, but delay simply suspends the computation—it doesn't use an ExecutorService to run it in a separate thread. You can see this in the following example:
import scalaz._, Scalaz._, concurrent._
val delayTask = Task.delay(Thread.sleep(5000))
val applyTask = Task(Thread.sleep(5000))
Nondeterminism[Task].both(delayTask, delayTask).run
Nondeterminism[Task].both(applyTask, applyTask).run
The delayTask version will take longer.

Do Futures always end up not returning anything?

Given that we must avoid...
1) Modifying state
2) Blocking
...what is a correct end-to-end usage for a Future?
The general practice in using Futures seems to be transforming them into other Futures by using map, flatMap etc. but it's no good creating Futures forever.
Will there always be a call to onComplete somewhere, with methods writing the result of the Future to somewhere external to the application (e.g. web socket; the console; a message broker) or is there a non-blocking way of accessing the result?
All of the information on Futures in the Scaladocs - http://docs.scala-lang.org/overviews/core/futures.html seem to end up writing to the console. onComplete doesn't return anything, so presumably we have to end up doing some "fire-and-forget" IO.
e.g. a call to println
f onComplete {
case Success(number) => println(number)
case Failure(err) => println("An error has occured: " + err.getMessage)
}
But what about in more complex cases where we want to do more with the result of the Future?
As an example, in the Play framework Action.async can return a Future[Result] and the framework handles the rest. Will it eventually have to expect never to get a result from the Future?
We know the user needs to be returned a Result, so how can a framework do this using only a Unit method?
Is there a non-blocking way to retrieve the value of a future and use it elsewhere within the application, or is a call to Await inevitable?
Best practice is to use callbacks such as onComplete, onSuccess, onFailure for side effecting operations, e.g. logging, monitoring, I/O.
If you need the continue with the result of of your Future computation as opposed to do a side-effecting operation, you should use map to get access to the result of your computation and compose over it.
Future returns a unit, yes. That's because it's an asynchronous trigger. You need to register a callback in order to gather the result.
From your referenced scaladoc (with my comments):
// first assign the future with expected return type to a variable.
val f: Future[List[String]] = Future {
session.getRecentPosts
}
// immediately register the callbacks
f onFailure {
case t => println("An error has occurred: " + t.getMessage)
}
f onSuccess {
case posts => for (post <- posts) println(post)
}
Or instead of println-ing you could do something with the result:
f onSuccess {
case posts: List[String] => someFunction(posts)
}
Try this out:
import scala.concurrent.duration._
import scala.concurrent._
import scala.concurrent.ExecutionContext.Implicits.global
val f: Future[Int] = Future { 43 }
val result: Int = Await.result(f, 0 nanos)
So what is going on here?
You're defining a computation to be executed on a different thread.
So you Future { 43 } returns immediately.
Then you can wait for it and gather the result (via Await.result) or define computation on it without waiting for it to be completed (via map etc...)
Actually, the kind of Future you are talking about are used for side-effects. The result returned by a Future depends its type :
val f = Future[Int] { 42 }
For example, I could send the result of Future[Int] to another Future :
val f2 = f.flatMap(integer => Future{ println(integer) }) // may print 42
As you know, a future is a process that happens concurrently. So you can get its result in the future (that is, using methods such as onComplete) OR by explicitly blocking the current thread until it gets a value :
import scala.concurrent.Await
import akka.util.Timeout
import scala.concurrent.duration._
implicit val timeout = Timeout(5 seconds)
val integer = Await.result(Future { 42 }, timeout.duration)
Usually when you start dealing with asynchronous processes, you have to think in terms of reactions which may never occur. Using chained Futures is like declaring a possible chain of events which could be broken at any moment. Therefore, waiting for a Future's value is definitely not a good practice as you may never get it :
val integer = Await.result(Future { throw new RuntimeException() }, timeout.duration) // will throw an uncaught exception
Try to think more in terms of events, than in procedures.

Reducing code repetition with recurring code in if-else

I'm wondering if the following short snippet, which does show repetition can be made more DRY. I seem to be hitting these kind of constructions quite often.
Say I want some computation to be done either synchronous or asynchronous, which is chosen at runtime.
for(i <- 1 to reps) {
Thread.sleep(expDistribution.sample().toInt)
if (async) {
Future {
sqlContext.sql(query).collect()
}
} else {
sqlContext.sql(query).collect()
}
}
It feels clumsy repeating the call to the sqlContext. Is there an idiom for this trivial recurring construct?
You can "store" your computation in a local def and then evaluate it either synchronously or asynchronously
def go = sqlContext.sql(query).collect()
if(async) Future(go) else Future.successful(go)
You can execture Future in your current thread using MoreExecutors.directExecutor() which is implemented in guava library.
(If you don't wan't to use guava library, see this question)
Using this method, you can switch the execution context according to async flag.
Here's the sample code.
You can see that setting async flag to false makes each Future executed in order.
import com.google.common.util.concurrent.MoreExecutors
import scala.concurrent.{Future,ExecutionContext}
import java.lang.Thread
object Main {
def main(args:Array[String]){
val async = false // change this to switch sync/async
implicit val ec = if(async){
ExecutionContext.Implicits.global // use thread pool
}else{
ExecutionContext.fromExecutor(MoreExecutors.directExecutor) // directy execute in current thread
}
println(Thread.currentThread.getId)
Future{
Thread.sleep(1000)
println(Thread.currentThread.getId)
}
Future{
Thread.sleep(2000)
println(Thread.currentThread.getId)
}
Thread.sleep(4000) // if you are doing asynchronously, you need this.
}
}