Suppose I have this method:
def callApis(f1, f2, f3): Future[Result] = {
for {
a <- Future { f1 }
b <- Future { f2 }
c <- Future { f3 }
} yield Result(a,b,c)
}
If you are familiar with Scala, you will know that the lines in the for block execute sequentially. More specifically, a is calculated first; once we have the result for a, the code calculates b; and once we have the result for b, the code calculates c.
My question is: how can I write a UNIT TEST that ensures a is always computed before b, and b is always computed before c? My fear is that someone who doesn't know much about how futures work in Scala could accidentally make this code run concurrently.
I mean people can accidentally do something like this, which starts a, b and c concurrently (which I don't want people to do):
def callApis(f1, f2, f3): Future[Result] = {
val fut1 = Future { f1 }
val fut2 = Future { f2 }
val fut3 = Future { f3 }
for {
a <- fut1
b <- fut2
c <- fut3
} yield Result(a,b,c)
}
Perhaps try defining a single-threaded execution context and require it in blocks that should execute serially. For example,
trait SerialExecutionContext extends ExecutionContext {
val singleThreadPool = Executors.newFixedThreadPool(1, (r: Runnable) => new Thread(r, s"single-thread-pool"))
val serialEc = ExecutionContext.fromExecutor(singleThreadPool)
override def execute(runnable: Runnable): Unit = serialEc.execute(runnable)
override def reportFailure(cause: Throwable): Unit = serialEc.reportFailure(cause)
}
def callApis()(implicit ec: SerialExecutionContext): Future[Result] = {
val fut1 = Future { ...doSomething... }
val fut2 = Future { ...doSomething... }
val fut3 = Future { ...doSomething... }
for {
a <- fut1
b <- fut2
c <- fut3
} yield Result(a,b,c)
}
Now callApis compiles only if a SerialExecutionContext is available implicitly at the call site. Since only one thread is available within the body, each future can start only after the previous one has finished.
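As for the unit test the question asks for, one possible approach is to instrument the three computations so that they record how many of them are running at once, and then assert that the peak never exceeds one. The sketch below assumes that Result holds three Int values and that the arguments are passed by name; SequentialCheck, task and peakOverlap are hypothetical names used only for illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequentialCheck {
  // Assumed shape of Result; the question does not define it.
  final case class Result(a: Int, b: Int, c: Int)

  // The sequential version from the question, with by-name arguments.
  def callApis(f1: => Int, f2: => Int, f3: => Int): Future[Result] =
    for {
      a <- Future(f1)
      b <- Future(f2)
      c <- Future(f3)
    } yield Result(a, b, c)

  private var running = 0 // tasks currently executing
  private var peak = 0    // highest value `running` has reached

  // An instrumented computation: tracks overlap, then returns n.
  def task(n: Int): Int = {
    synchronized { running += 1; peak = math.max(peak, running) }
    try { Thread.sleep(100); n }
    finally synchronized { running -= 1 }
  }

  // Runs callApis once and reports the peak number of overlapping tasks.
  def peakOverlap(): Int = {
    synchronized { running = 0; peak = 0 }
    Await.result(callApis(task(1), task(2), task(3)), 5.seconds)
    synchronized(peak)
  }
}
```

A test would then assert that SequentialCheck.peakOverlap() equals 1. If someone moves the Future creation out of the for block, the same check would typically report a peak above one on a multi-threaded execution context, and the assertion fails.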
Problem:
I am trying to call a lazy function (a function result assigned to a lazy val) inside a Future block, and it is not behaving as expected. When I call the function directly inside the block, it works as expected. I am not sure whether I am missing something.
Working Code:
Below is the code that works as expected when I call the method directly inside the Future block.
implicit val ec = ExecutionContext.fromExecutorService {
Executors.newFixedThreadPool(8)
}
def execute1() = {
Thread.sleep(4000); println("Completed 1!!!")
1
}
def execute2() = {
Thread.sleep(3000); println("Completed 2!!!")
2
}
def execute3() = {
Thread.sleep(2000); println("Completed 3!!!")
3
}
def execute4() = {
Thread.sleep(1000); println("Completed 4!!!")
4
}
val future1 : Future[Int] = Future.apply(execute1())
val future2 : Future[Int] = Future.apply(execute2())
val future3 : Future[Int] = Future.apply(execute3())
val future4 : Future[Int] = Future.apply(execute4())
val result = for { r1 <- future1
r2 <- future2
r3 <- future3
r4 <- future4
} yield {
println(r1+","+r2+","+r3+","+r4)
}
StdIn.readLine()
sys.exit()
When the above code is executed, the methods are executed in order "execute4,execute3,execute2,execute1" which is as expected.
Not Working Code:
When I instead assign the result of each "execute" method to a lazy val and refer to that val inside the Future block, the code behaves differently: it executes in the order 1, 4, 3, 2. Please see the code below.
implicit val ec = ExecutionContext.fromExecutorService {
Executors.newFixedThreadPool(8)
}
def execute1() = {
Thread.sleep(4000); println("Completed 1!!!")
1
}
def execute2() = {
Thread.sleep(3000); println("Completed 2!!!")
2
}
def execute3() = {
Thread.sleep(2000); println("Completed 3!!!")
3
}
def execute4() = {
Thread.sleep(1000); println("Completed 4!!!")
4
}
lazy val e1 = execute1()
lazy val e2 = execute2()
lazy val e3 = execute3()
lazy val e4 = execute4()
val future1 : Future[Int] = Future.apply(e1)
val future2 : Future[Int] = Future.apply(e2)
val future3 : Future[Int] = Future.apply(e3)
val future4 : Future[Int] = Future.apply(e4)
val result = for { r1 <- future1
r2 <- future2
r3 <- future3
r4 <- future4
} yield {
println(r1+","+r2+","+r3+","+r4)
}
StdIn.readLine()
sys.exit()
Expected Behavior: Since e1, e2, e3 and e4 are declared lazy, I expected each of them to be evaluated inside its Future block on first access, and therefore to behave the same as the working code. The weird behavior I notice is that execute1() appears to run synchronously and the rest of the methods asynchronously. Any guidance or suggestions would be appreciated.
Output I am Expecting:
Regardless of whether I call the method directly inside the Future block or assign it to a lazy val outside the Future block and refer to that inside the block, I expect the same result. In my example, the expected order of (asynchronous) completion is execute4(), execute3(), execute2() and then execute1().
To simplify the example:
Future execution differs between the two approaches below. I am expecting the same output from both.
//Approach#1
def method() = {
}
Future{
method()
}
//Approach#2
lazy val lazyMethod = method()
Future {
lazyMethod // a lazy val is referenced by name, without parentheses
}
Actually the code is working as expected. Let me explain.
First comes the for, when you did,
val result = for {
r1 <- future1
r2 <- future2
r3 <- future3
r4 <- future4
} yield {
println(r1+","+r2+","+r3+","+r4)
}
You are roughly doing,
val result =
future1.flatMap(r1 =>
future2.flatMap(r2 =>
future3.flatMap(r3 =>
future4.map(r4 =>
println(r1+","+r2+","+r3+","+r4)
)
)
)
)
Which means you are "accessing" the values computed by these futures only after you have accessed the value of the previous one.
Now comes the Future.apply which takes a body: => T as argument and gives you a Future[T] but the thing is that this body will start executing as soon as you create the future.
So, in your first implementation, when you did,
val future1 : Future[Int] = Future.apply(execute1())
val future2 : Future[Int] = Future.apply(execute2())
val future3 : Future[Int] = Future.apply(execute3())
val future4 : Future[Int] = Future.apply(execute4())
All these futureI's began executing your executeI's at this point. So, the println inside each executeI will be executed x ms after this, irrespective of when you try to access the value inside any of these futures.
Now comes the lazy. When you declare something like this,
lazy val x = {
println("accessing lazy x")
5
}
The block will be executed only when you access x for the first time.
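To see this "evaluate once, on first access" behaviour in isolation, here is a minimal sketch. The counter evaluations and the object name LazyDemo are additions for the example and are not part of the code in the question.

```scala
object LazyDemo {
  var evaluations = 0 // how many times the initializer has run

  lazy val x: Int = {
    evaluations += 1
    5
  }

  def main(args: Array[String]): Unit = {
    println(evaluations) // nothing has been evaluated yet
    println(x + x)       // the first access runs the block, the second reuses the value
    println(evaluations) // the initializer ran exactly once
  }
}
```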
And when you are doing this,
val future1 : Future[Int] = Future.apply(e1)
val future2 : Future[Int] = Future.apply(e2)
val future3 : Future[Int] = Future.apply(e3)
val future4 : Future[Int] = Future.apply(e4)
You are still not "accessing" any of these lazy eI's but as you know that each future starts computing as soon as it is created. So when these futures start executing they will "access" these eI's.
To understand it better, let's change our executeI's as follows,
def execute1() = {
println("Started 1!!! " + System.currentTimeMillis())
Thread.sleep(4000)
println("Completed 1!!! " + System.currentTimeMillis())
1
}
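With that change you will notice that all of these eI's execute one after another rather than concurrently.
This is most likely a consequence of how Scala 2 implements lazy val: initializing a lazy val synchronizes on the enclosing instance, so two lazy vals of the same object cannot be initialized at the same time. Each eI is still evaluated on the pool thread that runs its Future (the argument of Future.apply is by-name), but the threads acquire the lock one at a time, so the Thread.sleep calls run back to back. Which thread gets the lock next is not deterministic; here the order happens to be 1, 4, 3, 2.
But if you change the order in which the futures are created to,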
val future1 : Future[Int] = Future.apply(e1)
val future4 : Future[Int] = Future.apply(e4)
val future2 : Future[Int] = Future.apply(e2)
val future3 : Future[Int] = Future.apply(e3)
It will become 1, 3, 2, 4.
I'm having this issue where I use a for-comprehension in Scala to chain some Futures, and then get an instance of a class with some of those values. The problem is that the val I assign the value to, is of type Future[MyClass] instead of MyClass and I can't seem to figure out why.
The code is something like this:
val future = someService.someFutureReturningMethod()
val x = for{
a <- future
b <- someOtherService.someOtherFutureMethod(a)
} yield {
MyClass(b.sth, b.sthElse)
}
The problem here is that x ends up being of type Future[MyClass] and not MyClass and I can't seem to figure out why.
That behavior is correct. You can use a for-comprehension here because Future[T] defines flatMap and map, and the result of the comprehension is again a Future.
The following code
val futureA = Future.successful(1)
val futureB = Future.successful(2)
val futureC = Future.successful(3)
val x1 = for {
a <- futureA
b <- futureB
c <- futureC
} yield {
a + b + c
}
is compiled to
val x2 = futureA.flatMap {
a => futureB.flatMap {
b => futureC.map {
c => a + b + c
}
}
}
A call to Future.flatMap or Future.map returns a Future (it is the same with Option, Try, Either, etc.: each returns a value of its own type).
If you want the plain result, you need to wait for it.
Await.result(x, Duration(10, TimeUnit.SECONDS))
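Blocking with Await is usually kept for the edge of a program (a main method or a test). Elsewhere it is more common to keep working inside the Future with map or flatMap. A minimal sketch of both styles follows; MyClass with these two fields, load, describe and describeBlocking are stand-ins invented for illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ConsumeFuture {
  // Hypothetical stand-in for the class in the question.
  final case class MyClass(sth: Int, sthElse: String)

  // The for-comprehension yields a Future[MyClass], not a MyClass.
  def load(): Future[MyClass] =
    for {
      a <- Future.successful(1)
      b <- Future.successful("one")
    } yield MyClass(a, b)

  // Non-blocking: transform the eventual value and stay inside Future.
  def describe(): Future[String] =
    load().map(m => s"${m.sth}-${m.sthElse}")

  // Blocking: only here does a plain value come out.
  def describeBlocking(): String =
    Await.result(describe(), 10.seconds)
}
```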
I want to create a function similar to the following. Basically the function, say F will create a Future say fut1. When fut1 resolves, then another Future say fut2 should get created inside fut1. The fut2 should return the final value of the function F. The code has to be non-blocking all the way. I have written something like this but the return type is not Future[Int] but Future[Future[Int]]. I understand why this is the case (because map creates a Future) but I am unable to figure out how to return Future[Int] from this code.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
def fut:Future[Int] = {
val f1 = Future{ 1 } //create a Future
f1.map (x => { //when f1 finishes, create another future
println(x)
val f2 = Future{ 2 }
f2.map(x=> x) //this creates another Future and thus the return is Future[Future[Int]]
})
}
You can achieve this using flat map or for comprehension.
FlatMap-
def futureFunctionWithFlatMap: Future[Int] = {
val f1 = Future {
1
}
f1.flatMap(x => {
println(x)
val f2 = Future {
2
}
f2.map(x => x)
})
}
For Comprehension
def futureFunctionWithForComprehension: Future[Int] = {
for {
f1 <- Future { 1 }
f2 <- {
println(f1)
Future { 2 }
}
} yield f2
}
Use flatMap
val f1 = Future{ 1 } //create a Future
val f2: Future[Int] = f1.flatMap(x => {
//will be triggered only after x is ready
Future{2}
})
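For completeness: if you already hold a Future[Future[Int]], Scala 2.12 and later also provide flatten on Future, which collapses one level of nesting. A small sketch (the object and method names are made up for the example):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FlattenDemo {
  // map with a Future-returning function nests the result.
  def nested(): Future[Future[Int]] = Future(1).map(x => Future(x + 1))

  // flatten collapses one level of nesting (requires Scala 2.12+).
  def flat(): Future[Int] = nested().flatten

  // flatMap does both steps at once.
  def viaFlatMap(): Future[Int] = Future(1).flatMap(x => Future(x + 1))

  def main(args: Array[String]): Unit = {
    println(Await.result(flat(), 5.seconds))
    println(Await.result(viaFlatMap(), 5.seconds))
  }
}
```

Both calls produce the same value; the difference is only in where the nesting is removed.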
How are functions with side-effects best handled in for-comprehensions in Scala?
I have a for comprehension that starts by creating a kind of resource (x) by calling a function f1. This resource has a close-method that needs to be called at the end but also if the for-comprehension fails somehow (unless.
So we have something like:
import scala.util.{Try,Success,Failure}
trait Resource {
def close() : Unit
}
// Opens some resource and returns it as Success or returns Failure
def f1 : Try[Resource] = ...
def f2 : Try[Resource] = ...
val res = for {
x <- f1
y <- f2
} yield {
(x,y)
}
Where should I call the close method? I can call it at the end of the for-comprehension as the last statement (z <- x.close), in the yield-part, or after the for-comprehension (res._1.close). None of them ensures that close is called if an error occurs (e.g. if f2 fails).
Alternatively, I could separate
x <- f1
out of the for-comprehension like this:
val res = f1
res match {
case Success(x) => {
for {
y <- f2
}
x.close
}
case Failure(e) => ...
:
That would ensure the call of close but is not very nice code.
Is there not a smarter and more clean way to achieve the same?
When I have this kind of problem, I decide between two possibilities:
Use Scala ARM
Implement Loan Pattern on my own (link is volatile and could die)
In most cases I prefer my own implementation, to avoid an additional dependency.
Here is the code of Loan Pattern:
def using[A](r : Resource)(f : Resource => A) : A =
try {
f(r)
} finally {
r.close()
}
Usage:
using(getResource())(r =>
useResource(r)
)
Since you need 2 resources you will need to use this pattern twice:
using(getResource1())(r1 =>
using(getResource2())(r2 =>
doYourWork(r1, r2)))
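To connect this back to the Try-based code in the question, the nested loans can be wrapped in Try, so that an exception becomes a Failure after both resources have been closed. The following is only a sketch: the log buffer, the open factory and the fail flag are hypothetical helpers added to make the order of events visible.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

object LoanDemo {
  trait Resource { def close(): Unit }

  // Records the order of events so the behaviour can be inspected.
  val log: ListBuffer[String] = ListBuffer.empty

  // Hypothetical factory: "opens" a named resource and logs it.
  def open(name: String): Resource = {
    log += s"open $name"
    new Resource {
      def close(): Unit = { log += s"close $name"; () }
    }
  }

  // The loan pattern from above.
  def using[A](r: Resource)(f: Resource => A): A =
    try f(r) finally r.close()

  // Try turns an exception into a Failure, but only after both
  // finally blocks have already closed the resources.
  def run(fail: Boolean): Try[Int] = Try {
    using(open("r1")) { _ =>
      using(open("r2")) { _ =>
        if (fail) throw new IllegalStateException("boom") else 42
      }
    }
  }
}
```

After LoanDemo.run(fail = true), the log contains the two opens followed by the closes in reverse order (r2 first, then r1), and the returned value is a Failure.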
You can also look at the following answers:
Scala: Disposable Resource Pattern
functional try & catch w/ Scala
Using a variable in finally block
A common pattern for closing resources is the loan pattern:
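import scala.language.reflectiveCalls // structural types are invoked reflectively
type Closable = { def close(): Unit }
// Generic in the resource type A, so that op can take the concrete resource type
def withClosable[A <: Closable, B](closable: A)(op: A => B): B = {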
try {
op(closable)
} finally {
closable.close()
}
}
With a little refactoring you can use this pattern:
import scala.util.{Try,Success,Failure}
trait Resource {
def close() : Unit
}
// Opens some resource and returns it as Success or returns Failure
def f1(res: Resource) : Try[Resource] = ???
def f2(res: Resource) : Try[Resource] = ???
val f1Resource: Resource = ???
val f2Resource: Resource = ???
val res = for {
x <- withClosable(f1Resource)(f1)
y <- withClosable(f2Resource)(f2)
} yield {
(x,y)
}
or
import scala.util.{Try,Success,Failure}
trait Resource {
def close() : Unit
}
// Opens some resource and returns it as Success or returns Failure
def f1: Try[Resource] = {
val res: Resource = ???
withClosable(res){ ... }
}
def f2: Try[Resource] = {
val res: Resource = ???
withClosable(res){ ... }
}
val res = for {
x <- f1
y <- f2
} yield {
(x,y)
}
You could use
https://github.com/jsuereth/scala-arm
If your "resource" does not implement java.io.Closeable (or some other closable interface supported by that library), you just need to write an implicit conversion:
implicit def yourEnititySupport[A <: your.closable.Enitity]: Resource[A] =
new Resource[A] {
override def close(r: A) = r.commit()
// if you need custom behavior here
override def closeAfterException(r: A, t: Throwable) = r.rollback()
}
And use it like this:
import resource._
for {
a <- managed(your.closable.Enitity())
b <- managed(your.closable.Enitity())
} { doSomething(a, b) }
Suppose I need to run two concurrent computations, wait for both of them, and then combine their results. More specifically, I need to run f1: X1 => Y1 and f2: X2 => Y2 concurrently and then call f: (Y1, Y2) => Y to finally get a value of Y.
I can create future computations fut1: X1 => Future[Y1] and fut2: X2 => Future[Y2] and then compose them to get fut: (X1, X2) => Future[Y] using monadic composition.
The problem is that monadic composition implies sequential wait. In our case it implies that we wait for one future first and then for the other. For instance, if it takes 2 sec. for the first future to complete and just 1 sec. for the second future to fail, we waste 1 sec.
Thus it looks like we need an applicative composition of the futures that waits until either both complete or at least one future fails. Does it make sense? How would you implement <*> for futures?
None of the methods in other answers does the right thing in case of a future that fails quickly plus a future that succeeds after a long time.
But such a method can be implemented manually:
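import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success, Try}
def smartSequence[A](futures: Seq[Future[A]]): Future[Seq[A]] = {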
val counter = new AtomicInteger(futures.size)
val result = Promise[Seq[A]]()
def attemptComplete(t: Try[A]): Unit = {
val remaining = counter.decrementAndGet
t match {
// If one future fails, fail the result immediately
case Failure(cause) => result tryFailure cause
// If all futures have succeeded, complete successful result
case Success(_) if remaining == 0 =>
result tryCompleteWith Future.sequence(futures)
case _ =>
}
}
futures.foreach(_ onComplete attemptComplete)
result.future
}
Scalaz does a similar thing internally, so both f1 |#| f2 and List(f1, f2).sequence fail immediately after any of the futures fails.
Here is a quick test of the failing time for those methods:
import java.util.Date
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scalaz._, Scalaz._
object ReflectionTest extends App {
def f1: Future[Unit] = Future {
Thread.sleep(2000)
}
def f2: Future[Unit] = Future {
Thread.sleep(1000)
throw new RuntimeException("Failure")
}
def test(name: String)(
f: (Future[Unit], Future[Unit]) => Future[Unit]
): Unit = {
val start = new Date().getTime
f(f1, f2).andThen {
case _ =>
println(s"Test $name completed in ${new Date().getTime - start}")
}
Thread.sleep(2200)
}
test("monadic") { (f1, f2) => for (v1 <- f1; v2 <- f2) yield () }
test("zip") { (f1, f2) => (f1 zip f2).map(_ => ()) }
test("Future.sequence") {
(f1, f2) => Future.sequence(Seq(f1, f2)).map(_ => ())
}
test("smartSequence") { (f1, f2) => smartSequence(Seq(f1, f2)).map(_ => ())}
test("scalaz |#|") { (f1, f2) => (f1 |#| f2) { case _ => ()}}
test("scalaz sequence") { (f1, f2) => List(f1, f2).sequence.map(_ => ())}
Thread.sleep(30000)
}
And the result on my machine is:
Test monadic completed in 2281
Test zip completed in 2008
Test Future.sequence completed in 2007
Test smartSequence completed in 1005
Test scalaz |#| completed in 1003
Test scalaz sequence completed in 1005
The problem is that monadic composition implies sequential wait. In our case it implies that we wait for one future first and then we will wait for another.
This is unfortunately true.
import java.util.Date
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
object Test extends App {
def timestamp(label: String): Unit = Console.println(label + ": " + new Date().getTime.toString)
timestamp("Start")
for {
step1 <- Future {
Thread.sleep(2000)
timestamp("step1")
}
step2 <- Future {
Thread.sleep(1000)
timestamp("step2")
}
} yield { timestamp("Done") }
Thread.sleep(4000)
}
Running this code outputs:
Start: 1430473518753
step1: 1430473520778
step2: 1430473521780
Done: 1430473521781
Thus it looks like we need an applicative composition of the futures to wait till either both complete or at least one future fails.
I am not sure applicative composition has anything to do with the concurrent strategy. Using for comprehensions, you get a result if all futures complete or a failure if any of them fails. So it's semantically the same.
Why Are They Running Sequentially
I think the reason why futures are run sequentially is because step1 is available within step2 (and in the rest of the computation). Essentially we can convert the for block as:
def step1() = Future {
Thread.sleep(2000)
timestamp("step1")
}
def step2() = Future {
Thread.sleep(1000)
timestamp("step2")
}
def finalStep() = timestamp("Done")
step1().flatMap(step1 => step2()).map(_ => finalStep())
So the results of previous computations are available to the rest of the steps. It differs from <?> & <*> in this respect.
How To Run Futures In Parallel
@andrey-tyukin's code runs futures in parallel:
import java.util.Date
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
object Test extends App {
def timestamp(label: String): Unit = Console.println(label + ": " + new Date().getTime.toString)
timestamp("Start")
(Future {
Thread.sleep(2000)
timestamp("step1")
} zip Future {
Thread.sleep(1000)
timestamp("step2")
}).map(_ => timestamp("Done"))
Thread.sleep(4000)
}
Output:
Start: 1430474667418
step2: 1430474668444
step1: 1430474669444
Done: 1430474669446
Your post seems to contain two more or less independent questions.
I will address the concrete practical problem of running two concurrent computations first. The question about Applicative is answered at the very end.
Suppose you have two asynchronous functions:
val f1: X1 => Future[Y1]
val f2: X2 => Future[Y2]
And two values:
val x1: X1
val x2: X2
Now you can start the computations in multiple different ways. Let's take a look at some of them.
Starting computations outside of for (parallel)
Suppose you do this:
val y1: Future[Y1] = f1(x1)
val y2: Future[Y2] = f2(x2)
Now, the computations f1 and f2 are already running. It does not matter in which order you collect the results. You could do it with a for-comprehension:
val y: Future[(Y1,Y2)] = for(res1 <- y1; res2 <- y2) yield (res1,res2)
Using the expressions y1 and y2 in the for-comprehension does not interfere with the order of computation of y1 and y2, they are still being computed in parallel.
Starting computations inside of for (sequential)
If we simply take the definitions of y1 and y2, and plug them into the for comprehension directly, we will still get the same result, but the order of execution will be different:
val y = for (res1 <- f1(x1); res2 <- f2(x2)) yield (res1, res2)
translates into
val y = f1(x1).flatMap{ res1 => f2(x2).map{ res2 => (res1, res2) } }
In particular, the second computation starts only after the first one has terminated. This is usually not what one wants.
Here, a basic substitution principle is violated. If there were no side-effects, one probably could transform this version into the previous one, but in Scala, one has to take care of the order of execution explicitly.
Zipping futures (parallel)
Futures respect products. There is a method Future.zip, which allows you to do this:
val y = f1(x1) zip f2(x2)
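This runs both computations in parallel. The combined future completes once both are done. A failure of the first (left) future is reported as soon as it happens, whereas a failure of the second is only seen after the first has completed.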
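The failure behaviour of zip can be checked without relying on timing by driving the left side from a Promise. The sketch below (object and value names are mine) suggests that, at least in the standard library versions I am aware of, a failure on the right-hand side does not complete the zipped future until the left-hand side has finished:

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ZipFailureDemo {
  // Returns (completed before the left side finished?, failed afterwards?)
  def run(): (Boolean, Boolean) = {
    val slow = Promise[Int]() // left side: completed manually below
    val failed: Future[Int] = Future.failed(new RuntimeException("Failure"))

    val zipped = slow.future zip failed
    // The right side has already failed at this point.
    val completedEarly = zipped.isCompleted

    slow.success(1) // now let the left side finish
    val outcome = Await.ready(zipped, 5.seconds).value
    (completedEarly, outcome.exists(_.isFailure))
  }
}
```

If fail-fast behaviour in both directions is required, a Promise-based combinator is one way to get it.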
Demo
Here is a little script that demonstrates this behaviour (inspired by muhuk's post):
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import java.lang.Thread.sleep
import java.lang.System.{currentTimeMillis => millis}
var time: Long = 0
val x1 = 1
val x2 = 2
// this function just waits
val f1: Int => Future[Unit] = {
x => Future { sleep(x * 1000) }
}
// this function waits and then prints
// elapsed time
val f2: Int => Future[Unit] = {
x => Future {
sleep(x * 1000)
val elapsed = millis() - time
printf("Time: %1.3f seconds\n", elapsed / 1000.0)
}
}
/* Outside `for` */ {
time = millis()
val y1 = f1(x1)
val y2 = f2(x2)
val y = for(res1 <- y1; res2 <- y2) yield (res1,res2)
Await.result(y, Duration.Inf)
}
/* Inside `for` */ {
time = millis()
val y = for(res1 <- f1(x1); res2 <- f2(x2)) yield (res1, res2)
Await.result(y, Duration.Inf)
}
/* Zip */ {
time = millis()
val y = f1(x1) zip f2(x2)
Await.result(y, Duration.Inf)
}
Output:
Time: 2.028 seconds
Time: 3.001 seconds
Time: 2.001 seconds
Applicative
Using this definition from your other post:
trait Applicative[F[_]] {
def apply[A, B](f: F[A => B]): F[A] => F[B]
}
one could do something like this:
object FutureApplicative extends Applicative[Future] {
def apply[A, B](ff: Future[A => B]): Future[A] => Future[B] = {
fa => for ((f,a) <- ff zip fa) yield f(a)
}
}
However, I'm not sure what this has to do with your concrete problem, or with understandable and readable code. A Future already is a monad (this is stronger than Applicative), and there is even built-in syntax for it, so I don't see any advantages in adding some Applicatives here.
It need not be sequential. The future computation may start the moment the future is created. Of course, if the future is created by the flatMap argument (and it will necessarily be so if it needs the result of the first computation), then it will be sequential. But in code such as
val f1 = Future {....}
val f2 = Future {....}
for (a1 <- f1; a2 <- f2) yield f(a1, a2)
you get concurrent execution.
So the implementation of Applicative implied by Monad is ok.
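To make that concrete, here is a hedged sketch of an applicative-style ap written only with flatMap and map. The two futures are created before ap is called, and each one waits (with a timeout) until the other has started, so the call can only succeed if they really overlap. The names ApFromMonad, ap and rendezvous, and the two-thread pool, are assumptions made for the example.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ApFromMonad {
  // Applicative-style application derived from the monad operations.
  def ap[A, B](ff: Future[A => B])(fa: Future[A])(
      implicit ec: ExecutionContext): Future[B] =
    ff.flatMap(f => fa.map(f))

  def run(): Int = {
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val bothStarted = new CountDownLatch(2)

    // Fails unless the other task starts while this one is still running.
    def rendezvous(): Unit = {
      bothStarted.countDown()
      if (!bothStarted.await(2, TimeUnit.SECONDS))
        throw new IllegalStateException("tasks did not overlap")
    }

    val inc: Int => Int = _ + 1
    try {
      // Both futures are created (and therefore started) before ap is called.
      val ff: Future[Int => Int] = Future { rendezvous(); inc }
      val fa: Future[Int] = Future { rendezvous(); 41 }
      Await.result(ap(ff)(fa), 5.seconds)
    } finally pool.shutdown()
  }
}
```

Creating the two futures inside the arguments of flatMap instead would start the second one only after the first has completed, and the rendezvous would time out.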