Scala Futures for-comprehension with a list of values - scala

I need to execute a Future method on some elements I have in a list simultaneously. My current implementation works sequentially, which is not optimal for saving time. I did this by mapping my list and calling the method on each element and processing the data this way.
My manager shared a link with me showing how to execute Futures simultaneously using for-comprehension but I cannot see/understand how I can implement this with my List.
The link he shared with me is https://alvinalexander.com/scala/how-use-multiple-scala-futures-in-for-comprehension-loop/
Here is my current code:
private def method1(id: String): Tuple2[Boolean, List[MyObject]] = {
  val workers = List.concat(idleWorkers, activeWorkers.keys.toList)
  var ready = true
  val workerStatus = workers.map { worker =>
    // Blocks on each worker in turn, which is why this runs sequentially
    val option = Await.result(method2(worker), 1 seconds)
    val state = if (option.isDefined) {
      if (option.get._2 == id) {
        option.get._1.toString
      } else {
        "INVALID"
      }
    } else "FAILED"
    val status = s"$worker: $state"
    if (option.exists(_._1)) {
      ready = false
    }
    MyObject(worker.toString, status)
  }.filterNot(s => s.status.contains("INVALID"))
  (ready, workerStatus)
}
private def method2(worker: ActorRef): Future[Option[(Boolean, String)]] = Future {
  implicit val timeout: Timeout = 1 seconds
  Try(Await.result(worker ? GetStatus, 1 seconds)) match {
    case Success(extractedVal) => extractedVal match {
      case res: (Boolean, String) => Some(res)
      case _ => None
    }
    case Failure(_) => None
  }
}
If someone could suggest how to implement for-comprehension in this scenario, I would be grateful. Thanks

For method2 there is no need for the Future/Await mix. Just map the Future:
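def method2(worker: ActorRef): Future[Option[(Boolean, String)]] =
  (worker ? GetStatus).map {
    // Match on the tuple's elements; a type pattern on (Boolean, String) is unchecked because of erasure
    case (ready: Boolean, id: String) => Some((ready, id))
    case _ => None
  }.recover {
    // Keep the original behaviour of returning None when the ask fails or times out
    case _ => None
  }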
For method1 you likewise need to map the result of method2 and do the processing inside the map. This will make workerStatus a List[Future[MyObject]] and means that everything runs in parallel.
Then use Future.sequence(workerStatus) to turn the List[Future[MyObject]] into a Future[List[MyObject]]. You can then use map again to do the filtering/ checking on that List[MyObject]. This will happen when all the individual Futures have completed.
Ideally you would then return a Future from method1 to keep everything asynchronous. You could, if absolutely necessary, use Await.result at this point which would wait for all the asynchronous operations to complete (or fail).
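The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: workers are plain Strings instead of ActorRefs, and queryStatus is a hypothetical stand-in for method2 that returns canned answers instead of asking an actor. The shape of method1 (map over the list, map each Future, then Future.sequence) is the part that carries over.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object ParallelStatus {
  final case class MyObject(name: String, status: String)

  // Hypothetical stand-in for method2: a real version would ask an actor.
  def queryStatus(worker: String): Future[Option[(Boolean, String)]] =
    Future {
      worker match {
        case "w1" => Some((true, "job-1"))
        case "w2" => Some((false, "job-2"))
        case "w3" => Some((false, "job-1"))
        case _    => None
      }
    }

  def method1(id: String, workers: List[String]): Future[(Boolean, List[MyObject])] = {
    // Every queryStatus call is made here, before any result is awaited,
    // so the Futures are all in flight at the same time.
    val perWorker: List[Future[(Boolean, MyObject)]] = workers.map { worker =>
      queryStatus(worker).map { option =>
        val state = option match {
          case Some((flag, `id`)) => flag.toString
          case Some(_)            => "INVALID"
          case None               => "FAILED"
        }
        (option.exists(_._1), MyObject(worker, s"$worker: $state"))
      }
    }
    // Completes once all individual Futures have completed; order is preserved.
    Future.sequence(perWorker).map { results =>
      val ready    = !results.exists(_._1)
      val statuses = results.map(_._2).filterNot(_.status.contains("INVALID"))
      (ready, statuses)
    }
  }
}
```

A caller that really must block can wrap the returned Future in Await.result with a finite timeout; otherwise it can keep mapping over it.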

Related

.map or match returning Future[List] when expected to return List

I'm seeing some odd behaviour and can't tell where I'm going wrong. The tester2 function returns a Future[Option[Boolean]]. I want to wait for it to complete and then return a List[String] chosen by the cases inside the reset function. The problem is that instead of a List[String] I get a Future[List[String]], and I don't understand why map/match behaves like this.
To be exact, the error comes from this line:
val les = Await.ready(tester2(5),Duration.Inf).map(reset).forEach(println)
object HelloWorld {
def main(args: Array[String]) {
val exp = tester2(5).map(reset)
val les = Await.ready(tester2(5),Duration.Inf).map(reset).forEach(println)
println(s"what do you say ${les}")
}
def reset (x: Option[Boolean]): List[String] =
x match {
case None => List("abc","def")
case Some(false) => List("abc","def")
case Some(true) => List("def","abc")
}
def tester():Future[Option[Message]]={
Future{
Thread.sleep(5000)
Option(Message("abc","def","ghi"))
}
}
def tester2(param:Int):Future[Option[Boolean]]={
Future{
Thread.sleep(5000)
if(param>10){
Some(true)
}else{
Some(false)
}
}
}
If tester2 returns a Future of an Option of a Boolean
def tester2(param:Int):Future[Option[Boolean]] = ???
and you want to change the value to a string, you need to say "when this future completes and there is a real Option[Boolean], then do this thing". This is what map does on a future: it says "once the future completes, run this code". So you can do this:
def reset (in :Future[Option[Boolean]]) = in.map { optionOfBoolean :Option[Boolean] =>
  optionOfBoolean match {
    case None => ...
    case Some(true) => ...
  }
}
Scala also allows you to short cut having the map and match together and just write:
def reset (in :Future[Option[Boolean]]) = in map {
case None => List("abc", "bcd")
case Some(true) => List("d3", "d4")
case Some(false) => List("sds", "dssds")
}
Since I can't see your error I can't help you further but something like this should work.
val booleanResult :Future[Option[Boolean]] = tester2(...)
val futureListStr :Future[List[String]] = reset(booleanResult)
val answer :List[String] = Await.result(futureListStr, scala.concurrent.duration.Duration.Inf)
Use Await.result to extract the result value.
final def result[T](awaitable: Awaitable[T], atMost: Duration): T
Await and return the result (of type T) of an Awaitable.
awaitable: the Awaitable to be awaited
atMost: maximum wait time, which may be negative (no waiting is done), Duration.Inf for unbounded waiting, or a finite positive duration
returns: the result value if awaitable is completed within the specific maximum wait time
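To make the difference concrete, here is a small sketch contrasting Await.ready and Await.result. It reuses the reset and tester2 names from the question in simplified form (no sleep), and the 5 second timeout is an arbitrary choice for the example. Await.ready hands back the same, now completed, Future, so calling map on it still produces a Future; Await.result hands back the value itself.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AwaitDemo {
  def reset(x: Option[Boolean]): List[String] = x match {
    case Some(true) => List("def", "abc")
    case _          => List("abc", "def")
  }

  // Simplified version of tester2 from the question (no Thread.sleep).
  def tester2(param: Int): Future[Option[Boolean]] =
    Future(Some(param > 10))

  // Await.ready returns the completed Future, so the result is still wrapped.
  def viaReady(param: Int): Future[List[String]] =
    Await.ready(tester2(param), 5.seconds).map(reset)

  // Await.result unwraps the value, so reset can be applied to it directly.
  def viaResult(param: Int): List[String] =
    reset(Await.result(tester2(param), 5.seconds))
}
```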

How to write asynchronous code that is awaited concisely?

I'm a beginner in Scala.
Please let me know if any part of the code below can be written more concisely.
To supplement, I'd like to call each Future method synchronously.
◆getUser method:
def getUser: Option[User] = {
Await.ready(
twitterService.getUser(configService.getString(TWITTER_USERNAME_CONF)),
Duration.Inf)
.value
.flatMap(x => Option(x.getOrElse(null)))
}
◆ process method:
def process : Unit =
for {
user <- getUser
} yield {
Await.ready(
twitterService.delete(user.id, configService.getString(TWITTER_SEARCH_KEYWORD)),
Duration.Inf)
.value
.foreach {
case Success(tweets) => tweets.foreach(tweet => println(s"Delete Successfully!!. $tweet"))
case Failure(exception) => println(s"Failed Delete.... Exception:[$exception]")
}
}
I made some assumptions on user and tweet data types but I would rewrite that to:
def maybeDeleteUser(userName: String, maybeUser: Option[User]): Future[String] =
  maybeUser match {
    case Some(user) =>
      twitterService.delete(user.id, configService.getString(TWITTER_SEARCH_KEYWORD))
        // map only sees a successful result; recover turns a failed Future into a message
        .map(tweets => tweets.map(tweet => s"Delete Successfully!!. $tweet").mkString(System.lineSeparator()))
        .recover { case exception => s"Failed Delete.... Exception:[${exception.getMessage}]" }
    case _ => Future.successful(s"Failed to find user $userName")
  }
def getStatusLogMessage: Future[String] = {
  val userName = configService.getString(TWITTER_USERNAME_CONF)
  for {
    maybeUser <- twitterService.getUser(userName)
    statusLogMessage <- maybeDeleteUser(userName, maybeUser)
  } yield statusLogMessage
}
def process: Unit = {
  val message = Await.result(getStatusLogMessage, Duration.Inf)
  println(message)
}
That way the side effect (println) is isolated and the other methods can be unit tested. If you need to block, do it only at the very end, and use map and flatMap to chain Futures when their execution must be ordered. Also be careful with Duration.Inf: if you really need to block, use a defined timeout.
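As a rough illustration of blocking with a defined timeout, the sketch below wraps Await.result in a Try so that a timeout shows up as a value rather than an exception thrown at the call site. The names awaitWithin and describe are made up for this example and are not part of any library.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object BoundedAwait {
  // Blocks for at most `atMost`. Both a timeout and a failed Future end up as Failure.
  def awaitWithin[T](future: Future[T], atMost: FiniteDuration): Try[T] =
    Try(Await.result(future, atMost))

  // Turns the outcome into a log message, telling timeouts apart from other failures.
  def describe(result: Try[String]): String = result match {
    case Success(message)             => message
    case Failure(_: TimeoutException) => "Timed out waiting for the result"
    case Failure(other)               => s"Failed: ${other.getMessage}"
  }
}
```

With this shape, process could call something like describe(awaitWithin(getStatusLogMessage, 30.seconds)), where the 30 seconds is only a placeholder value.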

Submitting operations in created future

I have a Future lazy val that obtains some object and a function which submits operations in the Future.
class C {
def printLn(s: String) = println(s)
}
lazy val futureC: Future[C] = Future{Thread.sleep(3000); new C()}
def func(s: String): Unit = {
futureC.foreach{c => c.printLn(s)}
}
The problem is that when the Future completes, it executes the operations in the reverse of the order in which they were submitted. For example, if I execute sequentially
func("A")
func("B")
func("C")
I get after Future completion
scala> C
B
A
This order is important for me. Is there a way to preserve this order?
Of course I could use an actor that asks for the future and stashes the strings while the future is not ready, but that seems redundant to me.
lazy val futureC: Future[C]
lazy vals in scala will be compiled in to the code which uses a synchronized block for thread safety.
Here when the func(A) is called, it will obtain the lock for the lazy val and that thread will go to sleep.
Therefore func(B) & func(C) will be blocked by the lock.
When those blocked threads are run, the order cannot be guaranteed.
If you do it like below, you'll have the order as you expect. This is because the for comprehension creates a flatMap, & map based chain that gets executed sequentially.
lazy val futureC: Future[C] = Future {
Thread.sleep(1000)
new C()
}
def func(s: String) : Future[Unit] = {
futureC.map { c => c.printLn(s) }
}
val x = for {
_ <- func("A")
_ <- func("B")
_ <- func("C")
} yield ()
The order is preserved even without the lazy keyword, so you can remove it unless it is really necessary.
Hope this helps.
You can use a traverse function, in the style of Future.traverse, to ensure the order of execution.
Something like this. I'm not sure how your func gets a reference to the correct futureC, so I moved it inside.
def func(s: String): Future[Unit] = {
lazy val futureC = Future{Thread.sleep(3000); new C()}
futureC.map{c => c.printLn(s)}
}
def traverse[A,B](xs: Seq[A])(fn: A => Future[B]): Future[Seq[B]] =
xs.foldLeft(Future(Seq[B]())) { (acc, item) =>
acc.flatMap { accValue =>
fn(item).map { itemValue =>
accValue :+ itemValue
}
}
}
traverse(Seq("A","B","C"))(func)
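For reference, here is a self-contained sketch of the same fold with a small harness that records the order in which the side effects run. The harness (run, record, the sleep durations) is invented for the illustration. Note that this is a hand-written fold and not the standard library's Future.traverse, which, as far as I know, applies the function to every element up front and so does not wait for one Future before starting the next.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object OrderedCalls {
  // Calls fn on an item only after the previous item's Future has completed.
  def traverseInOrder[A, B](xs: Seq[A])(fn: A => Future[B]): Future[Seq[B]] =
    xs.foldLeft(Future.successful(Seq.empty[B])) { (acc, item) =>
      acc.flatMap(done => fn(item).map(done :+ _))
    }

  // Records the order in which the side effects actually ran.
  def run(inputs: Seq[String]): List[String] = {
    val log = ListBuffer.empty[String]
    def record(s: String): Future[Unit] = Future {
      // The first item sleeps longest, so it would finish last if all were started together.
      Thread.sleep(if (inputs.headOption.contains(s)) 50 else 5)
      log.synchronized { log += s; () }
    }
    Await.result(traverseInOrder(inputs)(record), 5.seconds)
    log.synchronized(log.toList)
  }
}
```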

Scala - Batched Stream from Futures

I have instances of a case class Thing, and I have a bunch of queries to run that return a collection of Things like so:
def queries: Seq[Future[Seq[Thing]]]
I need to collect all Things from all futures (like above) and group them into equally sized collections of 10,000 so they can be serialized to files of 10,000 Things.
def serializeThings(Seq[Thing]): Future[Unit]
I want it to be implemented in such a way that I don't wait for all queries to run before serializing. As soon as there are 10,000 Things returned after the futures of the first queries complete, I want to start serializing.
If I do something like:
Future.sequence(queries)
It will collect the results of all the queries, but my understanding is that operations like map won't be invoked until all queries complete and all the Things must fit into memory at once.
What's the best way to implement a batched stream pipeline using Scala collections and concurrent libraries?
I think that I managed to make something. The solution is based on my previous answer. It collects the results of the Future[List[Thing]] queries until it reaches a threshold of BatchSize. Then it calls the serializeThings future, and when that finishes, the loop continues with the rest.
object BatchFutures extends App {
case class Thing(id: Int)
def getFuture(id: Int): Future[List[Thing]] = {
Future.successful {
List.fill(3)(Thing(id))
}
}
def serializeThings(things: Seq[Thing]): Future[Unit] = Future.successful {
//Thread.sleep(2000)
println("processing: " + things)
}
val ids = (1 to 4).toList
val BatchSize = 5
val future = ids.foldLeft(Future.successful[List[Thing]](Nil)) {
case (acc, id) =>
acc flatMap { processed =>
getFuture(id) flatMap { res =>
val all = processed ++ res
val (batch, rest) = all.splitAt(BatchSize)
if (batch.length == BatchSize) { // if futures filled the batch with needed amount
serializeThings(batch) map { _ =>
rest // process the rest
}
} else {
Future.successful(all) //if we need more Things for a batch
}
}
}
}.flatMap { rest =>
serializeThings(rest)
}
Await.result(future, Duration.Inf)
}
The result prints:
processing: List(Thing(1), Thing(1), Thing(1), Thing(2), Thing(2))
processing: List(Thing(2), Thing(3), Thing(3), Thing(3), Thing(4))
processing: List(Thing(4), Thing(4))
When the number of Things isn't divisible by BatchSize we have to call serializeThings once more (the last flatMap). I hope it helps! :)
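One limitation of the fold above is that it serializes at most one batch per query result, so a single query returning more than two batches' worth of Things would leave an oversized remainder. A possible variation, sketched here under my own naming (batchAll and drain are hypothetical), drains every full batch before moving on. Like the original, it consumes the queries in list order rather than in completion order.

```scala
import scala.concurrent.{ExecutionContext, Future}

object Batching {
  // Folds over the queries in order, carrying the not-yet-full remainder forward.
  def batchAll[T](queries: Seq[Future[Seq[T]]], batchSize: Int)(
      serialize: Seq[T] => Future[Unit])(implicit ec: ExecutionContext): Future[Unit] = {
    require(batchSize > 0, "batchSize must be positive")

    // Serializes every full batch in `items`, one after another, and returns the leftover.
    def drain(items: Vector[T]): Future[Vector[T]] =
      if (items.size < batchSize) Future.successful(items)
      else {
        val (batch, rest) = items.splitAt(batchSize)
        serialize(batch).flatMap(_ => drain(rest))
      }

    queries
      .foldLeft(Future.successful(Vector.empty[T])) { (acc, query) =>
        for {
          leftover <- acc
          result   <- query
          rest     <- drain(leftover ++ result)
        } yield rest
      }
      // Whatever is left at the end forms the final, possibly short, batch.
      .flatMap(rest => if (rest.isEmpty) Future.successful(()) else serialize(rest))
  }
}
```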
Before you call Future.sequence, do what you need to do with each individual future, and only then use Future.sequence.
//this can be used for serializing
def doSomething(): Unit = ???
//do something with the failed future
def doSomethingElse(): Unit = ???
def doSomething(list: List[_]) = ???
val list: List[Future[_]] = List.fill(10000)(Future(doSomething()))
val newList =
list.par.map { f =>
f.map { result =>
doSomething()
}.recover { case throwable =>
doSomethingElse()
}
}
Future.sequence(newList).map ( list => doSomething(list)) //wait till all are complete
Instead of generating newList you could use Future.traverse:
Future.traverse(list)(f => f.map( x => doSomething()).recover {case th => doSomethingElse() }).map ( completeListOfValues => doSomething(completeListOfValues))

waiting for "recursive" futures in scala

Here is a simple code sample that describes my problem:
import scala.util._
import scala.concurrent._
import scala.concurrent.duration._
import ExecutionContext.Implicits.global
class LoserException(msg: String, dice: Int) extends Exception(msg) { def diceRoll: Int = dice }
def aPlayThatMayFail: Future[Int] = {
Thread.sleep(1000) //throwing a dice takes some time...
//throw a dice:
(1 + Random.nextInt(6)) match {
case 6 => Future.successful(6) //I win!
case i: Int => Future.failed(new LoserException("I did not get 6...", i))
}
}
def win(prefix: String): String = {
val futureGameLog = aPlayThatMayFail
futureGameLog.onComplete(t => t match {
case Success(diceRoll) => "%s, and finally, I won! I rolled %d !!!".format(prefix, diceRoll)
case Failure(e) => e match {
case ex: LoserException => win("%s, and then i got %d".format(prefix, ex.diceRoll))
case _: Throwable => "%s, and then somebody cheated!!!".format(prefix)
}
})
"I want to do something like futureGameLog.waitForRecursiveResult, using Await.result or something like that..."
}
win("I started playing the dice")
This simple example illustrates what I want to do. Put in words, I want to wait for the result of a computation in which I compose different actions depending on whether previous attempts succeeded or failed.
So how would you implement the win method?
My "real world" problem, if it makes any difference, is using dispatch for asynchronous HTTP calls, where I want to keep making HTTP calls whenever the previous one ends, but the actions differ depending on whether the previous HTTP call succeeded or not.
You can recover your failed future with a recursive call:
def foo(x: Int) = x match {
case 10 => Future.successful(x)
case _ => Future.failed[Int](new Exception)
}
def bar(x: Int): Future[Int] = {
foo(x) recoverWith { case _ => bar(x+1) }
}
scala> bar(0)
res0: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise@64d6601
scala> res0.value
res1: Option[scala.util.Try[Int]] = Some(Success(10))
recoverWith takes a PartialFunction[Throwable,scala.concurrent.Future[A]] and returns a Future[A]. You should be careful though, because it will use quite some memory when it does lots of recursive calls here.
As drexin answered the part about exception handling and recovering, let me try and answer the part about a recursive function involving futures. I believe using a Promise will help you achieve your goal. The restructured code would look like this:
def win(prefix: String): String = {
  val prom = Promise[String]()
  def doWin(p: String): Unit = {
    val futureGameLog = aPlayThatMayFail
    // Use p, the log accumulated so far, rather than the original prefix
    futureGameLog.onComplete(t => t match {
      case Success(diceRoll) => prom.success("%s, and finally, I won! I rolled %d !!!".format(p, diceRoll))
      case Failure(e) => e match {
        case ex: LoserException => doWin("%s, and then i got %d".format(p, ex.diceRoll))
        case other => prom.failure(new Exception("%s, and then somebody cheated!!!".format(p)))
      }
    })
  }
  doWin(prefix)
  Await.result(prom.future, someTimeout)
}
This isn't true recursion: because the futures are asynchronous, each call to doWin returns before the next one runs, so it does not build up one long stack, but it is similar to recursion in spirit. The promise gives you something to block against while the recursion does its thing, shielding the caller from what's happening behind the scenes.
Now, if I were doing this, I would probably redefine things like so:
def win(prefix: String): Future[String] = {
  val prom = Promise[String]()
  def doWin(p: String): Unit = {
    val futureGameLog = aPlayThatMayFail
    // Use p, the log accumulated so far, rather than the original prefix
    futureGameLog.onComplete(t => t match {
      case Success(diceRoll) => prom.success("%s, and finally, I won! I rolled %d !!!".format(p, diceRoll))
      case Failure(e) => e match {
        case ex: LoserException => doWin("%s, and then i got %d".format(p, ex.diceRoll))
        case other => prom.failure(new Exception("%s, and then somebody cheated!!!".format(p)))
      }
    })
  }
  doWin(prefix)
  prom.future
}
This way you can defer the decision on whether to block or use async callbacks to the caller of this function. This is more flexible, but it also exposes the caller to the fact that you are doing async computations and I'm not sure that is going to be acceptable for your scenario. I'll leave that decision up to you.
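For comparison, the two answers can be combined: the game log can be built with map and recoverWith alone, without an explicit Promise. The sketch below is one way that might look. To keep it checkable, the dice roll is passed in as a function (roll), which is my own change and not part of the original question; any other failure is simply left in the returned Future.

```scala
import scala.concurrent.{ExecutionContext, Future}

object DiceGame {
  final class LoserException(msg: String, val diceRoll: Int) extends Exception(msg)

  // `roll` is injected so the behaviour can be checked with a fixed sequence of rolls.
  def play(roll: () => Int): Future[Int] = roll() match {
    case 6 => Future.successful(6) // a win
    case i => Future.failed(new LoserException("I did not get 6...", i))
  }

  // Extends the log on every losing roll and tries again; stops on the first 6.
  def win(prefix: String, roll: () => Int)(implicit ec: ExecutionContext): Future[String] =
    play(roll)
      .map(diceRoll => s"$prefix, and finally, I won! I rolled $diceRoll !!!")
      .recoverWith {
        case ex: LoserException => win(s"$prefix, and then i got ${ex.diceRoll}", roll)
      }
}
```

The caller still decides whether to block with Await.result or to attach callbacks to the returned Future.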
This works for me:
def retryWithFuture[T](f: => Future[T],retries:Int, delay:FiniteDuration) (implicit ec: ExecutionContext, s: Scheduler): Future[T] ={
f.recoverWith { case _ if retries > 0 => after[T](delay,s)(retryWithFuture[T]( f , retries - 1 , delay)) }
}
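The after and Scheduler used there appear to come from Akka. As a minimal sketch of the same idea without the delay and without Akka, the version below retries immediately. The flaky helper is hypothetical and exists only to exercise the retry. The detail worth noticing is that f is a by-name parameter, so each attempt re-evaluates it and starts a fresh Future instead of reusing the failed one.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future}

object Retry {
  // `f` is by-name: every attempt re-evaluates it and therefore starts a new Future.
  def retry[T](f: => Future[T], retries: Int)(implicit ec: ExecutionContext): Future[T] =
    f.recoverWith { case _ if retries > 0 => retry(f, retries - 1) }

  // Hypothetical flaky operation: fails until it has been called `succeedOn` times.
  def flaky(calls: AtomicInteger, succeedOn: Int): Future[Int] = {
    val n = calls.incrementAndGet()
    if (n >= succeedOn) Future.successful(n)
    else Future.failed(new RuntimeException(s"attempt $n failed"))
  }
}
```

When the retries are used up, the guard no longer matches and the last failure is returned unchanged.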