Does cats mapN run all the futures in parallel?

Does cats mapN run all the futures in parallel? - scala

As mentioned in the jump-start guide, mapN will run all the futures in parallel, so I created the below simple Scala program, but a sample run shows diff to be 9187 ms and diffN to be 9106 ms. So it looks like that the mapN is also running the futures sequentially, isn't it? Please let me know if I am missing something?
package example
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import java.time.LocalDateTime
import java.time.Duration
import scala.util.Failure
import scala.util.Success
import java.time.ZoneOffset
import cats.instances.future._
import cats.syntax.apply._
object FutureEx extends App {
val before = LocalDateTime.now()
val sum = for {
a <- getA
b <- getB
c <- getC
} yield (a + b + c)
sum onComplete {
case Failure(ex) => println(s"Error: ${ex.getMessage()}")
case Success(value) =>
val after = LocalDateTime.now()
println(s"before: $before")
println(s"after: $after")
val diff = getDiff(before, after)
println(s"diff: $diff")
println(s"sum: $value")
}
// let the above finish
Thread.sleep(20000)
val beforeN = LocalDateTime.now()
val usingMapN = (getA, getB, getC).mapN(add)
usingMapN onComplete {
case Failure(ex) => println(s"Error: ${ex.getMessage()}")
case Success(value) =>
val afterN = LocalDateTime.now()
println(s"beforeN: $beforeN")
println(s"afterN: $afterN")
val diff = getDiff(beforeN, afterN)
println(s"diffN: $diff")
println(s"sum: $value")
}
def getA: Future[Int] = {
println("inside A")
Thread.sleep(3000)
Future.successful(2)
}
def getB: Future[Int] = {
println("inside B")
Thread.sleep(3000)
Future.successful(3)
}
def getC: Future[Int] = {
println("inside C")
Thread.sleep(3000)
Future.successful(4)
}
def add(a: Int, b: Int, c: Int) = a + b + c
def getDiff(before: LocalDateTime, after: LocalDateTime): Long = {
Duration.between(before.toInstant(ZoneOffset.UTC), after.toInstant(ZoneOffset.UTC)).toMillis()
}
}

Because you have sleep outside Future it should be like:
def getA: Future[Int] = Future {
println("inside A")
Thread.sleep(3000)
2
}
So you start async Future with apply - Future.successful on the other hand returns pure value, meaning you execute sleep in same thread.

The time is going before mapN is ever called.
(getA, getB, getC).mapN(add)
This expression is creating a tuple (sequentially) and then calling mapN on it. So it is calling getA then getB then getC and since each of them has a 3 second delay, it takes 9 seconds.

Related

How to make sure a given future completes first in tests?

I'm writing tests for function bar:
def bar(fut1: Future[Int],
fut2: Future[Int],
fut3: Future[Int]): Future[Result] = ???
bar returns Result like this:
case class Result(
x: Int, // fut1 value
oy: Option[Int], // if fut2 is complete then Some of fut2 value else None
oz: Option[Int] // if fut3 is complete then Some of fut3 value else None
)
I want to write tests for all test cases:
fut1 completed, fut2 and fut3 did not complete
fut1 completed, fut2 completed, fut3 did not complete
etc.
So I am writing a fake implementation of functions foo1, foo2, and foo3 for these tests.
def foo1(x: Int): Future[Int] = ???
def foo2(x: Int): Future[Int] = ???
def foo3(x: Int): Future[Int] = ???
Test #1 invokes all these functions, checks if fut1 completes first, and invokes bar
val fut1 = foo1(0)
val fut2 = foo2(0)
val fut3 = foo3(0)
// make sure `fut1` completes first
Test #2 invokes all these functions, makes sure that fut2 completes first, and invokes bar.
Test #3 invokes all these functions, makes sure that fut3 completes first, and invokes bar.
My question is how to implement the functions foo1, foo2, and foo3 and the tests.

You can try attach completeness timestamp to each future via map, like:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._
import scala.language.postfixOps
def foo1(x: Int): Future[Int] = Future {Thread.sleep(200); 1}
def foo2(x: Int): Future[Int] = Future {Thread.sleep(500); 2}
def foo3(x: Int): Future[Int] = Future {Thread.sleep(500); 3}
def completeTs[T](future: Future[T]): Future[(T, Long)] = future.map(v => v -> System.currentTimeMillis())
val resutls = Await.result(Future.sequence(
List(completeTs(foo1(1)), completeTs(foo2(1)), completeTs(foo3(1)))
), 2 seconds)
val firstCompleteTs = resutls.map(_._2).min
val firstCompleteIndex = resutls.indexWhere(_._2 == firstCompleteTs)
assert(firstCompleteIndex == 0)
Scatie: https://scastie.scala-lang.org/L9g78DSNQIm2K1jGlQzXBg

You could repurpose firstCompletedOf to verify whether the future of a given index in the futures list is the first completed one:
import java.util.concurrent.atomic.AtomicReference
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.util.Try
def isFirstCompleted[T](idx: Int)(futures: List[Future[T]])(
implicit ec: ExecutionContext): Future[Boolean] = {
val promise = Promise[(T, Int)]()
val pRef = new AtomicReference[Promise[(T, Int)]](promise)
futures.zipWithIndex foreach { case (f, i) => f onComplete { case tt: Try[T] =>
val p = pRef.getAndSet(null)
if (p != null) p tryComplete tt.map((_, i))
}
}
promise.future.map{ case (t, i) => i == idx }
}
Test running:
import scala.concurrent.ExecutionContext.Implicits.global
val futures = List(
Future{Thread.sleep(100); 1},
Future{Thread.sleep(150); throw new Exception("oops!")},
Future{Thread.sleep(50); 3}
)
isFirstCompleted(0)(futures) // Future(Success(false))
isFirstCompleted(2)(futures) // Future(Success(true))
For writing test cases, consider using ScalaTest AsyncFlatSpec.

It is unclear what it is exactly you are trying to test.
If you simply use futures that have already completed, you will get the behavior you describe:
def f1 = Future.successful(1)
def f2 = Future.successful(2)
def f3 = Future.successful(3)
eventually {
Future.firstCompletedOf(Seq(f1, f2, f3)).value shouldBe Some(1)
}
(note, that you cannot compare directly with fut1 like you did in the question, that'll always be false, because .firstCompletedOf returns a new future).
You can also make only one future complete, and leave the others alone:
val promise = Promise[Int].future
def f1 = promise.future // or just Future.successful(1) ... or Future(1)
def f2 = Future.never
def f3 = Future.never
result = Future.firstCompletedOf(Seq(f1, f2, f3))
promise.complete(Success(1))
eventually {
result.value shouldBe 1
}
Etc ... Can make the other futures be backed by their own promise too for example, if you want them all to complete eventually (not sure what it'll gain you, but then again, I am not sure what you are testing here to begin with).
Another possibility is make them depend on each other:
val promise = Promise[Int]
def f1 = promise.future
def f2 = promise.future.map(_ + 1)
def f3 = promise.future.map(_ + 2)
...
promise.complete(Success(1))

How do I measure elapsed time inside Cats IO effect?

I'd like to measure elapsed time inside IO container. It's relatively easy to do with plain calls or futures (e.g. something like the code below)
class MonitoringComponentSpec extends FunSuite with Matchers with ScalaFutures {
import scala.concurrent.ExecutionContext.Implicits.global
def meter[T](future: Future[T]): Future[T] = {
val start = System.currentTimeMillis()
future.onComplete(_ => println(s"Elapsed ${System.currentTimeMillis() - start}ms"))
future
}
def call(): Future[String] = Future {
Thread.sleep(500)
"hello"
}
test("metered call") {
whenReady(meter(call()), timeout(Span(550, Milliseconds))) { s =>
s should be("hello")
}
}
}
But not sure how to wrap IO call
def io_meter[T](effect: IO[T]): IO[T] = {
val start = System.currentTimeMillis()
???
}
def io_call(): IO[String] = IO.pure {
Thread.sleep(500)
"hello"
}
test("metered io call") {
whenReady(meter(call()), timeout(Span(550, Milliseconds))) { s =>
s should be("hello")
}
}
Thank you!

Cats-effect has a Clock implementation that allows pure time measurement as well as injecting your own implementations for testing when you just want to simulate the passing of time. The example from their documentation is:
def measure[F[_], A](fa: F[A])
(implicit F: Sync[F], clock: Clock[F]): F[(A, Long)] = {
for {
start <- clock.monotonic(MILLISECONDS)
result <- fa
finish <- clock.monotonic(MILLISECONDS)
} yield (result, finish - start)
}

In cats effect 3, you can use .timed. Like,
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import cats.implicits._
import scala.concurrent.duration._
val twoSecondsLater = IO.sleep(2.seconds) *> IO.println("Hi")
val elapsedTime = twoSecondsLater.timed.map(_._1)
elapsedTime.unsafeRunSync()
// would give something like this.
Hi
res0: FiniteDuration = 2006997667 nanoseconds

Scala Future and TimeoutException: how to know the root cause?

Suppose I have the following code:
val futureInt1 = getIntAsync1();
val futureInt2 = getIntAsync2();
val futureSum = for {
int1 <- futureInt1
int2 <- futureInt2
} yield (int1 + int2)
val sum = Await.result(futureSum, 60 seconds)
Now suppose one of getIntAsync1 or getIntAsync2 takes a very long time, and it leads to Await.result throwing exception:
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [60 seconds]
How am I supposed to know which one of getIntAsync1 or getIntAsync2 was still pending and actually lead to the timeout?
Note that here I'm merging 2 futures with zip, and this is a simple example for the question, but in my app I have this kind of code at different level (ie getIntAsync1 itself can use Future.zip or Future.sequence, map/flatMap/applicative)
Somehow what I'd like is to be able to log the pending concurrent operation stacktraces when a timeout occur on my main thread, so that I can know where are the bottlenexts on my system.
I have an existing legacy API backend which is not fully reactive yet and won't be so soon. I'm trying to increase response times by using concurrency. But since using this kind of code, It's become way more painful to understand why something takes a lot of time in my app. I would appreciate any tip you can provide to help me debugging such issues.

The key is realizing is that a the Future doesn't time out in your example—it's your calling thread which pauses for at most X time.
So, if you want to model time in your Futures you should use zipWith on each branch and zip with a Future which will contain a value within a certain amount of time. If you use Akka then you can use akka.pattern.after for this, together with Future.firstCompletedOf.
Now, even if you do, how do you figure out why any of your futures weren't completed in time, perhaps they depended on other futures which didn't complete.
The question boils down to: Are you trying to do some root-cause analysis on throughput? Then you should monitor your ExecutionContext, not your Futures. Futures are only values.

The proposed solution wraps each future from for block into TimelyFuture which requires timeout and name. Internally it uses Await to detect individual timeouts.
Please bear in mind this style of using futures is not intended for production code as it uses blocking. It is for diagnostics only to find out which futures take time to complete.
package xxx
import java.util.concurrent.TimeoutException
import scala.concurrent.{Future, _}
import scala.concurrent.duration.Duration
import scala.util._
import scala.concurrent.duration._
class TimelyFuture[T](f: Future[T], name: String, duration: Duration) extends Future[T] {
override def onComplete[U](ff: (Try[T]) => U)(implicit executor: ExecutionContext): Unit = f.onComplete(x => ff(x))
override def isCompleted: Boolean = f.isCompleted
override def value: Option[Try[T]] = f.value
#scala.throws[InterruptedException](classOf[InterruptedException])
#scala.throws[TimeoutException](classOf[TimeoutException])
override def ready(atMost: Duration)(implicit permit: CanAwait): TimelyFuture.this.type = {
Try(f.ready(atMost)(permit)) match {
case Success(v) => this
case Failure(e) => this
}
}
#scala.throws[Exception](classOf[Exception])
override def result(atMost: Duration)(implicit permit: CanAwait): T = {
f.result(atMost)
}
override def transform[S](ff: (Try[T]) => Try[S])(implicit executor: ExecutionContext): Future[S] = {
val p = Promise[S]()
Try(Await.result(f, duration)) match {
case s#Success(_) => ff(s) match {
case Success(v) => p.success(v)
case Failure(e) => p.failure(e)
}
case Failure(e) => e match {
case e: TimeoutException => p.failure(new RuntimeException(s"future ${name} has timed out after ${duration}"))
case _ => p.failure(e)
}
}
p.future
}
override def transformWith[S](ff: (Try[T]) => Future[S])(implicit executor: ExecutionContext): Future[S] = {
val p = Promise[S]()
Try(Await.result(f, duration)) match {
case s#Success(_) => ff(s).onComplete({
case Success(v) => p.success(v)
case Failure(e) => p.failure(e)
})
case Failure(e) => e match {
case e: TimeoutException => p.failure(new RuntimeException(s"future ${name} has timed out after ${duration}"))
case _ => p.failure(e)
}
}
p.future
}
}
object Main {
import scala.concurrent.ExecutionContext.Implicits.global
def main(args: Array[String]): Unit = {
val f = Future {
Thread.sleep(5);
1
}
val g = Future {
Thread.sleep(2000);
2
}
val result: Future[(Int, Int)] = for {
v1 <- new TimelyFuture(f, "f", 10 milliseconds)
v2 <- new TimelyFuture(g, "g", 10 milliseconds)
} yield (v1, v2)
val sum = Await.result(result, 1 seconds) // as expected, this throws exception : "RuntimeException: future g has timed out after 10 milliseconds"
}
}

If you are merely looking for informational metrics on which individual future was taking a long time (or in combination with others), your best bet is to use a wrapper when creating the futures to log the metrics:
object InstrumentedFuture {
def now() = System.currentTimeMillis()
def apply[T](name: String)(code: => T): Future[T] = {
val start = now()
val f = Future {
code
}
f.onComplete {
case _ => println(s"Future ${name} took ${now() - start} ms")
}
f
}
}
val future1 = InstrumentedFuture("Calculator") { /*...code...*/ }
val future2 = InstrumentedFuture("Differentiator") { /*...code...*/ }

You can check if a future has completed by calling its isComplete method
if (futureInt1.isComplete) { /*futureInt2 must be the culprit */ }
if (futureInt2.isComplete) { /*futureInt1 must be the culprit */ }

As a first approach i would suggest to lift your Future[Int] into Future[Try[Int]]. Something like that:
object impl {
def checkException[T](in: Future[T]): Future[Try[T]] =
in.map(Success(_)).recover {
case e: Throwable => {
Failure(new Exception("Error in future: " + in))
}
}
implicit class FutureCheck(s: Future[Int]) {
def check = checkException(s)
}
}
Then a small function to combine the results, something like that:
object test {
import impl._
val futureInt1 = Future{ 1 }
val futureInt2 = Future{ 2 }
def combine(a: Try[Int], b: Try[Int])(f: (Int, Int) => (Int)) : Try[Int] = {
if(a.isSuccess && b.isSuccess) {
Success(f(a.get, b.get))
}
else
Failure(new Exception("Error adding results"))
}
val futureSum = for {
int1 <- futureInt1.check
int2 <- futureInt2.check
} yield combine(int1, int2)(_ + _)
}
In futureSum you will have a Try[Int] with the integer or a Failure with the exception corresponding with the possible error.
Maybe this can be useful

val futureInt1 = getIntAsync1();
val futureInt2 = getIntAsync2();
val futureSum = for {
int1 <- futureInt1
int2 <- futureInt2
} yield (int1 + int2)
Try(Await.result(futureSum, 60 seconds)) match {
case Success(sum) => println(sum)
case Failure(e) => println("we got timeout. the unfinished futures are: " + List(futureInt1, futureInt2).filter(!_.isCompleted)
}

on scala, how to wait for a Future.onFailure to complete?

Scala futures:
With Await.result I can wait for the future to complete before terminating the program.
How can I wait for a Future.onFailure to complete before terminating the program?
package test
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
object Test5 extends App {
def step(i: Int): Future[String] = Future {
Thread.sleep(1000)
if (i == 3) throw new Exception("exception on t3")
Thread.sleep(1000); val s = s"done: $i"; println(s); s
}
val task: Future[String] = step(1).flatMap(_ => step(2).flatMap(_ => step(3)))
task.onFailure {
case e: Throwable => println("task.onFailure. START: " + e); Thread.sleep(3000); println("task.onFailure. END.")
}
try {
val result: String = Await.result(task, Duration.Inf)
println("RESULT: " + result)
} catch {
case e: Throwable => println("Await.result.try catch: " + e)
}
// without this, "task.onFailure. END." does not get printed.
// How can I replace this line with something such as Await.result([the task.onFailure I set previously])
Thread.sleep(5000)
println("END")
}
Note: Instead of using task.onFailure, I could catch the exception on Await.result (as in the example). But I would prefer to use task.onFailure.
Update
A solution is to use transform or recover as proposed by Ryan. In my case, I needed something more, and I've added this alternative answer: on scala, how to wait for a Future.onFailure to complete?

If all you want to do is to sequence a side-effect such that you know that it has been executed I'd recommend using 'andThen'.

onFailure returns Unit. There's no way to block on that.
Either catch the exception on Await.result or use transform or recover to side-effect on the exception.

You can't. Future#onFailure returns Unit so there is no handle for which you can grab to block execution.
Since you're blocking on the completion of task anyway, the catch clause in your code is equivalent to what you would get from onFailure. That is, if the Future fails, the body of catch will be executed before the program exits because it is running on the main thread (whereas onComplete is not).

We can enrich Future with onComplete2 and onCompleteWith2 as follows:
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.util.Try
object FutureUtils {
implicit class RichFuture[T](val self: Future[T]) {
def onComplete2[U](f: (Try[T]) => U)(implicit executor: ExecutionContext): Future[T] = {
val p = Promise[T]()
self.onComplete { r: Try[T] => f(r); p.complete(r) }
p.future
}
def onCompleteWith2[U](f: (Try[T]) => Future[U])(implicit executor: ExecutionContext): Future[T] = {
val p = Promise[T]()
self.onComplete { r: Try[T] => f(r).onComplete(_ => p.complete(r)) }
p.future
}
}
}
then I case use it as follows:
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import FutureUtils._
import scala.util.{Failure, Success, Try}
object Test5 extends App {
def stepf(i: Int): Future[String] = Future { step(i) }
def step(i: Int): String = {
Thread.sleep(1000)
if (i == 3) throw new Exception("exception on t3")
Thread.sleep(1000); val s = s"done: $i"; println(s); s
}
val task1: Future[String] =
stepf(1).flatMap(_ => stepf(2).flatMap(_ => stepf(3)))
// the result of step(10) and step(11) is ignored
val task2: Future[String] = task1.onComplete2((r: Try[String]) => r match {
case Success(s) => step(10)
case Failure(e) => step(11)
})
/*
// If I want to recover (and so, to avoid an exception on Await.result:
val task3 = task2.recover {case e: Throwable => step(12) }
// I can use onCompleteWith2 instead of onComplete2
val task2: Future[String] = task1.onCompleteWith2((r: Try[String]) => r match {
case Success(s) => stepf(10)
case Failure(e) => stepf(11)
})
*/
try {
val result = Await.result(task2, Duration.Inf)
println("RESULT: " + result)
} catch {
// see my comment above to remove this
case e: Throwable => println("Await.result.try catch: " + e)
}
println("END.")
}
The execution is as follows:
done: 1
done: 2
done: 11
Await.result.try catch: java.lang.Exception: exception on t3
END.
If we don't throw the exception at step(3), the execution is as follows. Note that RESULT is "done: 3", not "done: 10":
done: 1
done: 2
done: 3
done: 10
RESULT: done: 3
END.

Scala Nested Futures

I have a couple of futures. campaignFuture returns a List[BigInt] and I want to be able to call the second future profileFuture for each of the values in the list returned from the first one. The second future can only be called when the first one is complete. How do I achieve this in Scala?
campaignFuture(1923).flatMap?? (May be?)
def campaignFuture(advertiserId: Int): Future[List[BigInt]] = Future {
val campaignHttpResponse = getCampaigns(advertiserId.intValue())
parseProfileIds(campaignHttpResponse.entity.asString)
}
def profileFuture(profileId: Int): Future[List[String]] = Future {
val profileHttpResponse = getProfiles(profileId.intValue())
parseSegmentNames(profileHttpResponse.entity.asString)
}

A for comprehension is here not applicable because we have a mix of List's and Future's.
So, your friends are map and flatMap:
To react on Future result
import scala.concurrent.{Future, Promise, Await}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global
def campaignFuture(advertiserId: Int): Future[List[BigInt]] = Future {
List(1, 2, 3)
}
def profileFuture(profileId: Int): Future[List[String]] = {
// delayed Future
val p = Promise[List[String]]
Future {
val delay: Int = (math.random * 5).toInt
Thread.sleep(delay * 1000)
p.success(List(s"profile-for:$profileId", s"delayed:$delay sec"))
}
p.future
}
// Future[List[Future[List[String]]]
val listOfProfileFuturesFuture = campaignFuture(1).map { campaign =>
campaign.map(id => profileFuture(id.toInt))
}
// React on Futures which are done
listOfProfileFuturesFuture foreach { campaignFutureRes =>
campaignFutureRes.foreach { profileFutureRes =>
profileFutureRes.foreach(profileListEntry => println(s"${new Date} done: $profileListEntry"))
}
}
// !!ONLY FOR TESTING PURPOSE - THIS CODE BLOCKS AND EXITS THE VM WHEN THE FUTURES ARE DONE!!
println(s"${new Date} waiting for futures")
listOfProfileFuturesFuture.foreach{listOfFut =>
Await.ready(Future.sequence(listOfFut), Duration.Inf)
println(s"${new Date} all futures done")
System.exit(0)
}
scala.io.StdIn.readLine()
To get the result of all Futures at once
import scala.concurrent.{Future, Await}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global
def campaignFuture(advertiserId: Int): Future[List[BigInt]] = Future {
List(1, 2, 3)
}
def profileFuture(profileId: Int): Future[List[String]] = Future {
List(s"profile-for:$profileId")
}
// type: Future[List[Future[List[String]]]]
val listOfProfileFutures = campaignFuture(1).map { campaign =>
campaign.map(id => profileFuture(id.toInt))
}
// type: Future[List[List[String]]]
val listOfProfileFuture = listOfProfileFutures.flatMap(s => Future.sequence(s))
// print the result
//listOfProfileFuture.foreach(println)
//scala.io.StdIn.readLine()
// wait for the result (THIS BLOCKS INFINITY!)
Await.result(listOfProfileFuture, Duration.Inf)
we use Future.sequence to convert a List[Future[T]] to Future[List[T]].
flatMap to get a Future[T] from Future[Future[T]]
if you need wait for the result (BLOCKING!) use Await to wait for the result

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Does cats mapN run all the futures in parallel? - scala

Because you have sleep outside Future it should be like: def getA: Future[Int] = Future { println("inside A") Thread.sleep(3000) 2 } So you start async Future with apply - Future.successful on the other hand returns pure value, meaning you execute sleep in same thread.

The time is going before mapN is ever called. (getA, getB, getC).mapN(add) This expression is creating a tuple (sequentially) and then calling mapN on it. So it is calling getA then getB then getC and since each of them has a 3 second delay, it takes 9 seconds.

Related

How to make sure a given future completes first in tests?

How do I measure elapsed time inside Cats IO effect?

Scala Future and TimeoutException: how to know the root cause?

on scala, how to wait for a Future.onFailure to complete?

Scala Nested Futures

Categories

Resources