Play/Scala: Making an unknown number of I/O calls in parallel, waiting for the results - scala

So, I read the article here about parallel comprehension. He gives the following code example:
// Make 3 parallel async calls
val fooFuture = WS.url("http://foo.com").get()
val barFuture = WS.url("http://bar.com").get()
val bazFuture = WS.url("http://baz.com").get()
for {
foo <- fooFuture
bar <- barFuture
baz <- bazFuture
} yield {
// Build a Result using foo, bar, and baz
Ok(...)
}
All fine so far, but I am in a situation where I don't always know in advance how many WS.get() calls I need to make; I want it to be dynamic. So for instance:
val checks = Seq(callOne(param), callTwo(param))
Where the calls are:
def callOne(param: String): Future[Boolean] = {
// do something and return the Future with a true/false value
Future(true)
}
def callTwo(param: String): Future[Boolean] = {
// do something and return the Future with a true/false value
Future(false)
}
So, my question is: how should I react to the results of my sequence of WS calls (or database queries, for that matter) in a for-yield?
I have given two example calls, but I want the same code to be able to process one or many calls in parallel and gather the results in the for-yield, to ultimately proceed to do other things.
Important: All calls should be carried out in parallel; the quickest ones should complete before the slow ones, regardless of the order in which they are fired.

Future.sequence is likely what you want.
Example usage:
val futures = List(WS.url("http://foo.com").get(), WS.url("http://bar.com").get())
Future.sequence(futures) // => Transforms a Seq[Future[_]] into a Future[Seq[_]]
The future returned from Future.sequence will not be completed until all of the futures in the input sequence are completed. If any of them fails, the resulting future fails too.
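To connect this back to the question's `checks` sequence, here is a minimal sketch of gathering a dynamic number of `Future[Boolean]` results inside a for-yield. The bodies of `callOne` and `callTwo`, the `allPass` helper and the `run` method are made up for illustration; real code would call WS or the database instead, and would return the `Future` rather than block on it.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceSketch {
  // Hypothetical stand-ins for the real WS or database calls.
  def callOne(param: String): Future[Boolean] = Future(param.nonEmpty)
  def callTwo(param: String): Future[Boolean] = Future(param.length > 3)

  // The futures in `checks` are already running when this is called;
  // Future.sequence only collects their results, in input order.
  def allPass(checks: Seq[Future[Boolean]]): Future[Boolean] =
    for {
      results <- Future.sequence(checks)
    } yield results.forall(identity)

  def run(param: String): Boolean = {
    val checks = Seq(callOne(param), callTwo(param))
    // Blocking is for demonstration only; a Play action would return the Future.
    Await.result(allPass(checks), 5.seconds)
  }
}
```

Because each call starts as soon as its future is created, the length of `checks` can vary freely without changing `allPass`.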
Bonus:
If your futures are heterogeneously typed, and you need to preserve those types, you can use an HList. I've written the following snippet, which takes an HList of futures and transforms it into a Future containing an HList of resolved values:
import shapeless._
import scala.concurrent.{ExecutionContext,Future}
object FutureHelpers {
object FutureReducer extends Poly2 {
import scala.concurrent.ExecutionContext.Implicits.global
implicit def f[A, B <: HList] = at[Future[A], Future[B]] { (f, resultFuture) =>
for {
result <- resultFuture
value <- f
} yield value :: result
}
}
// Like Future.sequence, but for HList
// hsequence(Future { 1 } :: Future { "string" } :: HNil)
// => Future { 1 :: "string" :: HNil }
def hsequence[T <: HList](hlist: T)(implicit
executor: ExecutionContext,
folder: RightFolder[T, Future[HNil], FutureReducer.type]) = {
hlist.foldRight(Future.successful[HNil](HNil))(FutureReducer)
}
}

Related

MVar tryPut returns true and isEmpty also returns true

I wrote a simple callback (handler) which I pass to an async API, and I want to wait for the result:
object Handlers {
val logger: Logger = Logger("Handlers")
implicit val cs: ContextShift[IO] =
IO.contextShift(ExecutionContext.Implicits.global)
class DefaultHandler[A] {
val response: IO[MVar[IO, A]] = MVar.empty[IO, A]
def onResult(obj: Any): Unit = {
obj match {
case obj: A =>
println(response.flatMap(_.tryPut(obj)).unsafeRunSync())
println(response.flatMap(_.isEmpty).unsafeRunSync())
case _ => logger.error("Wrong expected type")
}
}
def getResponse: A = {
response.flatMap(_.take).unsafeRunSync()
}
}
}
But for some reason both tryPut and isEmpty return true (when I call the onResult method manually), so when I call getResponse it blocks forever.
This is my test:
class HandlersTest extends FunSuite {
test("DefaultHandler.test") {
val handler = new DefaultHandler[Int]
handler.onResult(3)
val response = handler.getResponse
assert(response != 0)
}
}
Can somebody explain why tryPut returns true, but nothing is put? And what is the right way to use MVar/channels in Scala?
IO[X] means that you have a recipe to create some X; each time the recipe is run, it creates a fresh X. So in your example, you are putting into one MVar and then asking a different one.
Here is how I would do it.
object Handlers {
trait DefaultHandler[A] {
def onResult(obj: Any): IO[Unit]
def getResponse: IO[A]
}
object DefaultHandler {
def apply[A : ClassTag]: IO[DefaultHandler[A]] =
MVar.empty[IO, A].map { response =>
new DefaultHandler[A] {
override def onResult(obj: Any): IO[Unit] = obj match {
case obj: A =>
for {
r1 <- response.tryPut(obj)
_ <- IO(println(r1))
r2 <- response.isEmpty
_ <- IO(println(r2))
} yield ()
case _ =>
IO(logger.error("Wrong expected type"))
}
override def getResponse: IO[A] =
response.take
}
}
}
}
The "unsafe" is sort of a hint, but every time you call unsafeRunSync, you should basically think of it as an entire new universe. Before you make the call, you can only describe instructions for what will happen, you can't actually change anything. During the call is when all the changes occur. Once the call completes, that universe is destroyed, and you can read the result but no longer change anything. What happens in one unsafeRunSync universe doesn't affect another.
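The "new universe" point can be reproduced without cats-effect. The sketch below is a toy model only: `Recipe` and `Cell` are invented stand-ins for `IO` and `MVar`, not the library's types, and they ignore concurrency entirely. It shows why two separate runs each see their own empty cell, while a single run shares one.

```scala
object RecipeSketch {
  // Toy stand-in for an effect type: it describes a computation and
  // re-executes it from scratch on every unsafeRunSync() call.
  final class Recipe[A](thunk: () => A) {
    def unsafeRunSync(): A = thunk()
    def flatMap[B](f: A => Recipe[B]): Recipe[B] =
      new Recipe(() => f(thunk()).unsafeRunSync())
  }
  object Recipe {
    def apply[A](a: => A): Recipe[A] = new Recipe(() => a)
  }

  // Toy single-slot cell, loosely playing the role of an MVar.
  final class Cell[A] {
    private var slot: Option[A] = None
    def tryPut(a: A): Recipe[Boolean] = Recipe {
      if (slot.isEmpty) { slot = Some(a); true } else false
    }
    def isEmpty: Recipe[Boolean] = Recipe(slot.isEmpty)
  }

  // Like the question's `response`: a recipe for a cell, not a cell.
  val response: Recipe[Cell[Int]] = Recipe(new Cell[Int])

  // Two separate runs: each one builds its own fresh cell.
  def separateRuns(): (Boolean, Boolean) = {
    val put = response.flatMap(_.tryPut(3)).unsafeRunSync()
    val empty = response.flatMap(_.isEmpty).unsafeRunSync()
    (put, empty)
  }

  // One run: both operations act on the same cell.
  def singleRun(): (Boolean, Boolean) =
    response.flatMap { cell =>
      cell.tryPut(3).flatMap { put =>
        cell.isEmpty.flatMap(empty => Recipe((put, empty)))
      }
    }.unsafeRunSync()
}
```

`separateRuns()` reproduces the symptom from the question (put succeeds, yet the cell looks empty), and `singleRun()` is the shape the fixed test takes.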
You need to call it exactly once in your test code. That means your test code needs to look something like:
val test = for {
handler <- Handlers.DefaultHandler[Int]
_ <- handler.onResult(3)
response <- handler.getResponse
} yield response
assert(test.unsafeRunSync() == 3)
Note this doesn't really buy you much over just using the MVar directly. I think you're trying to mix side effects inside IO and outside it, but that doesn't work. All the side effects need to be inside.

Scala: Sequential processing of file data

I have a CSV file from which I read data and populate my database. I am using Scala to do this. Instead of firing DB inserts in parallel, I want to execute the inserts sequentially (i.e. one after another). I do not want to use Await in a for loop. Is there any approach other than using Await?
P.S.: I have read the 1000 entries from the CSV into a list and am looping over the list to create the DB inserts.
Assuming you have some kind of save(entity: T): Future[_] method for your database, you can just fold your futures with flatMap (or for comprehension):
def saveAll(entities: List[T]): Future[Unit] = {
entities.foldLeft(Future.successful(())){
case (f, entity) => for {
_ <- f
_ <- save(entity)
} yield ()
}
}
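To check that the fold really serializes the inserts, here is a small self-contained sketch. The `save` body, the `events` log and the `run` helper are invented for the demonstration; a real `save` would talk to the database.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequentialSaveSketch {
  // Records start/end events so the ordering can be inspected afterwards.
  private val events = ListBuffer.empty[String]
  private def record(e: String): Unit = events.synchronized { events += e; () }

  // Hypothetical stand-in for a database insert.
  def save(entity: Int): Future[Unit] = Future {
    record(s"start $entity")
    Thread.sleep(10)
    record(s"end $entity")
  }

  def saveAll(entities: List[Int]): Future[Unit] =
    entities.foldLeft(Future.successful(())) { (acc, entity) =>
      // save(entity) is only invoked once `acc` has completed.
      acc.flatMap(_ => save(entity))
    }

  def run(): List[String] = {
    // Blocking here is only to inspect the log in a demo.
    Await.result(saveAll(List(1, 2, 3)), 5.seconds)
    events.synchronized(events.toList)
  }
}
```

The key detail is that `save(entity)` is called inside the `flatMap` callback. Building the list of futures first (for example with `entities.map(save)`) would start them all at once.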
Another option is a recursive function. Less concise than foldLeft, but more readable to some. Just one more option for your consideration (assuming save(entity: T): Future[R]):
def saveAll(entities: List[T]): Future[List[R]] = {
entities.headOption match {
case Some(entity) =>
for {
head <- save(entity)
tail <- saveAll(entities.tail)
} yield {
head :: tail
}
case None =>
Future.successful(Nil)
}
}
Yet another option, if your save method allows you to supply your own ExecutionContext, i.e. save(entity: T)(implicit ec: ExecutionContext): Future[R], is to just fire the Futures concurrently but use a single-threaded execution context:
def saveAll(entities: List[T]): Future[List[R]] = {
// Note: this only serializes the work that save actually runs on ec.
implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(java.util.concurrent.Executors.newSingleThreadExecutor())
Future.sequence(entities.map(e => save(e)))
}

Scala Future[A] and Future[Option[B]] composition

I have an app that manages Items. When the client queries an item by some info, the app first tries to find an existing item in the db with the info. If there isn't one, the app would:
1. Check if info is valid. This is an expensive operation (much more so than a db lookup), so the app only performs this when there isn't an existing item in the db.
2. If info is valid, insert a new Item into the db with info.
There are two more classes, ItemDao and ItemService:
object ItemDao {
def findByInfo(info: Info): Future[Option[Item]] = ...
// This DOES NOT validate info; it assumes info is valid
def insertIfNotExists(info: Info): Future[Item] = ...
}
object ItemService {
// Very expensive
def isValidInfo(info: Info): Future[Boolean] = ...
// Ugly
def findByInfo(info: Info): Future[Option[Item]] = {
ItemDao.findByInfo(info) flatMap { maybeItem =>
if (maybeItem.isDefined)
Future.successful(maybeItem)
else
isValidInfo(info) flatMap {
if (_) ItemDao.insertIfNotExists(info) map (Some(_))
else Future.successful(None)
}
}
}
}
The ItemService.findByInfo(info: Info) method is pretty ugly. I've been trying to clean it up for a while, but it's difficult since there are three types involved (Future[Boolean], Future[Item], and Future[Option[Item]]). I've tried to use scalaz's OptionT to clean it up but the non-optional Futures make it not very easy either.
Any ideas on a more elegant implementation?
To expand on my comment.
Since you've already indicated a willingness to go down the route of monad transformers, this should do what you want. There is unfortunately quite a bit of line noise due to Scala's less than stellar typechecking here, but hopefully you find it elegant enough.
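import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global // needed for scalaz's Monad[Future] instance
import scalaz._
import Scalaz._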
object ItemDao {
def findByInfo(info: Info): Future[Option[Item]] = ???
// This DOES NOT validate info; it assumes info is valid
def insertIfNotExists(info: Info): Future[Item] = ???
}
object ItemService {
// Very expensive
def isValidInfo(info: Info): Future[Boolean] = ???
def findByInfo(info: Info): Future[Option[Item]] = {
lazy val nullFuture = OptionT(Future.successful(none[Item]))
lazy val insert = ItemDao.insertIfNotExists(info).liftM[OptionT]
lazy val validation =
isValidInfo(info)
.liftM[OptionT]
.ifM(insert, nullFuture)
val maybeItem = OptionT(ItemDao.findByInfo(info))
val result = maybeItem <+> validation
result.run
}
}
Two comments about the code:
We are using the OptionT monad transformer here to capture the Future[Option[_]] stuff and anything that just lives inside Future[_] we're liftMing up to our OptionT[Future, _] monad.
<+> is an operation provided by MonadPlus. In a nutshell, as the name suggests, MonadPlus captures the intuition that often times monads have an intuitive way of being combined (e.g. List(1, 2, 3) <+> List(4, 5, 6) = List(1, 2, 3, 4, 5, 6)). Here we're using it to short-circuit when findByInfo returns Some(item) rather than the usual behavior to short-circuit on None (this is roughly analogous to List(item) <+> List() = List(item)).
Other small note, if you actually wanted to go down the monad transformers route, often times you end up building everything in your monad transformer (e.g. ItemDao.findByInfo would return an OptionT[Future, Item]) so that you don't have extraneous OptionT.apply calls and then .run everything at the end.
You don't need scalaz for this. Just break your flatMap into two steps:
first, find and validate, then insert if necessary. Something like this:
ItemDao.findByInfo(info).flatMap {
case None => isValidInfo(info).map(None -> _)
case x => Future.successful(x -> true)
}.flatMap {
case (_, true) => ItemDao.insertIfNotExists(info).map(Some(_))
case (x, _) => Future.successful(x)
}
Doesn't look too bad, does it? If you don't mind running validation in parallel with retrieval (marginally more expensive resource-wise, but likely faster on average), you could further simplify it like this:
ItemDao
.findByInfo(info)
.zip(isValidInfo(info))
.flatMap {
case (None, true) => ItemDao.insertIfNotExists(info).map(Some(_))
case (x, _) => Future.successful(x)
}
Also, what does insertIfNotExists return if the item does exist? If it returned the existing item, things could be even simpler:
isValidInfo(info)
.filter(identity)
.flatMap { _ => ItemDao.insertIfNotExists(info) }
.map { item => Some(item) }
.recover { case _: NoSuchElementException => None }
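For a runnable check of the lookup, then validate, then insert flow, here is a minimal sketch that uses plain pattern matching and no extra libraries. Everything in it is a stand-in: `Info`, `Item`, the in-memory `db`, and the `validations` counter exist only to make the behaviour observable, and `findInDb` plays the role of `ItemDao.findByInfo`.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FindOrInsertSketch {
  // Hypothetical stand-ins for the question's Info and Item types.
  final case class Info(value: String)
  final case class Item(info: Info)

  private val db = TrieMap.empty[Info, Item]
  val validations = new AtomicInteger(0) // counts calls to the expensive check

  def findInDb(info: Info): Future[Option[Item]] = Future(db.get(info))
  def insertIfNotExists(info: Info): Future[Item] =
    Future(db.getOrElseUpdate(info, Item(info)))
  def isValidInfo(info: Info): Future[Boolean] = Future {
    validations.incrementAndGet()
    info.value.nonEmpty
  }

  def findByInfo(info: Info): Future[Option[Item]] =
    findInDb(info).flatMap {
      case found @ Some(_) => Future.successful(found) // skip validation
      case None =>
        isValidInfo(info).flatMap {
          case true  => insertIfNotExists(info).map(Some(_))
          case false => Future.successful(None)
        }
    }

  // Blocking is for demonstration only.
  def run(info: Info): Option[Item] = Await.result(findByInfo(info), 5.seconds)
}
```

The counter makes it easy to confirm that the expensive validation is skipped whenever the item is already stored.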
If you are comfortable with path-dependent types and higher-kinded types, something like the following can be an elegant solution:
type Const[A] = A
sealed trait Request {
type F[_]
type A
type FA = F[A]
def query(client: Client): Future[FA]
}
case class FindByInfo(info: Info) extends Request {
type F[x] = Option[x]
type A = Item
def query(client: Client): Future[Option[Item]] = ???
}
case class CheckIfValidInfo(info: Info) extends Request {
type F[x] = Const[x]
type A = Boolean
def query(client: Client): Future[Boolean] = ???
}
class DB {
private val dbClient: Client = ???
def exec(request: Request): Future[request.FA] = request.query(dbClient)
}
What this does is basically abstract over both the wrapper type (e.g. Option[_]) and the inner type. For types without a wrapper type, we use the Const[_] type, which is basically an identity type.
In Scala, many problems like this can be solved elegantly using algebraic data types and the language's advanced type system (i.e. path-dependent types and higher-kinded types). Note that we now have a single point of entry, exec(request: Request), for executing db requests instead of something like a DAO.
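As a runnable illustration of the dependent return type, here is a simplified sketch that collapses `F[_]` and `A` into a single type member `Out`. The `Client` alias (a plain `Map`), the `Find` and `Exists` requests, and the sample data are all assumptions made for the example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object RequestSketch {
  // Hypothetical in-memory "client" standing in for a real database client.
  type Client = Map[String, Int]

  sealed trait Request {
    type Out // result type chosen by each concrete request
    def query(client: Client): Future[Out]
  }
  final case class Find(key: String) extends Request {
    type Out = Option[Int]
    def query(client: Client): Future[Option[Int]] =
      Future.successful(client.get(key))
  }
  final case class Exists(key: String) extends Request {
    type Out = Boolean
    def query(client: Client): Future[Boolean] =
      Future.successful(client.contains(key))
  }

  private val client: Client = Map("a" -> 1)

  // The return type depends on the argument: request.Out.
  def exec(request: Request): Future[request.Out] = request.query(client)

  def run(): (Option[Int], Boolean) = {
    val find = Find("a")
    val exists = Exists("b")
    // The compiler sees Future[Option[Int]] and Future[Boolean] here.
    (Await.result(exec(find), 1.second), Await.result(exec(exists), 1.second))
  }
}
```

Binding each request to a `val` before calling `exec` keeps the path stable, so the compiler can resolve `request.Out` to the concrete type.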

Logging the value of a future before returning it in Scala

def returnFuture[A](x: A): Future[A] = {
val xFuture = Future { x } // suppose an API call that returns a future
xFuture.flatMap(x => {
println(x) // logging the value of x
xFuture
})
}
This is the way I'm currently doing it. To provide more context:
This function is called inside an API when a request is made, and I'd like the log message to be printed just before the value computed in the request is returned. That is why the following is not a good solution for me:
def returnFuture[A](x: A): Future[A] = {
val xFuture = Future { x } // suppose an API call that returns a future
xFuture.map(x => {
println(x) // logging the value of x
})
xFuture
}
Logging is a side-effect, meaning that you don't want the operation to fail if the logging fails for any reason (e.g. a call to toString throwing NPE).
Future#andThen is perfect for this use case. From the docs:
Applies the side-effecting function to the result of this future, and returns a new future with the result of this future.
This method allows one to enforce that the callbacks are executed in a specified order.
Note that if one of the chained andThen callbacks throws an exception, that exception is not propagated to the subsequent andThen callbacks. Instead, the subsequent andThen callbacks are given the original value of this future.
Your example becomes:
def returnFuture[A](x: A): Future[A] = {
Future { x } // suppose an API call that returns a future
.andThen { case Success(v) => println(v) }
}
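A slightly fuller sketch of the same idea, handling both outcomes and showing that a misbehaving logger does not affect the returned value. The helper names `logged` and `withFailingLogger` are made up for the example; with the global execution context, the exception thrown by the callback is reported to the context (typically printed to stderr) rather than propagated.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object AndThenSketch {
  // Logs success or failure, then passes the original result through.
  def logged[A](compute: => A): Future[A] =
    Future(compute).andThen {
      case Success(v) => println(s"result: $v")
      case Failure(e) => println(s"failed: ${e.getMessage}")
    }

  // A callback that throws does not change the value of the returned future.
  def withFailingLogger[A](compute: => A): Future[A] =
    Future(compute).andThen {
      case _ => throw new IllegalStateException("logger is broken")
    }

  // Blocking is for demonstration only.
  def run(): (Int, Int) = (
    Await.result(logged(41 + 1), 5.seconds),
    Await.result(withFailingLogger(7), 5.seconds)
  )
}
```

The future returned by `andThen` completes only after the callback has run, which gives the "log just before returning" ordering asked for in the question.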
You can use onComplete callback:
def returnFuture[A](x: A): Future[A] = {
val f = Future { x }
f.onComplete(println)
f
}
A map will work too:
def returnFuture[A](x: A): Future[A] = {
Future { x }.map { v =>
println(v)
v
}
}
Keep in mind that the whole point of using Futures is that you are trying to avoid blocking and that you don't control exactly when the Future will be executed. So, if you want more detailed logs while keeping the asynchronous nature of a Future, do something like this:
def doSomething(param: String): String = {
// log something here
val result = param.toUpperCase
// log something else here
result
}
def asFuture(param: String) = Future {
doSomething(param)
}
In other words, if this is an option, add logs to the x operation instead.

Why does a Scala for-comprehension have to start with a generator?

According to the Scala Language Specification (§6.19), "An enumerator sequence always starts with a generator". Why?
I sometimes find this restriction to be a hindrance when using for-comprehensions with monads, because it means you can't do things like this:
def getFooValue(): Future[Int] = {
for {
manager = Manager.getManager() // could throw an exception
foo <- manager.makeFoo() // method call returns a Future
value = foo.getValue()
} yield value
}
Indeed, scalac rejects this with the error message '<-' expected but '=' found.
If this was valid syntax in Scala, one advantage would be that any exception thrown by Manager.getManager() would be caught by the Future monad used within the for-comprehension, and would cause it to yield a failed Future, which is what I want. The workaround of moving the call to Manager.getManager() outside the for-comprehension doesn't have this advantage:
def getFooValue(): Future[Int] = {
val manager = Manager.getManager()
for {
foo <- manager.makeFoo()
value = foo.getValue()
} yield value
}
In this case, an exception thrown by foo.getValue() will yield a failed Future (which is what I want), but an exception thrown by Manager.getManager() will be thrown back to the caller of getFooValue() (which is not what I want). Other possible ways of handling the exception are more verbose.
I find this restriction especially puzzling because in Haskell's otherwise similar do notation, there is no requirement that a do block should begin with a statement containing <-. Can anyone explain this difference between Scala and Haskell?
Here's a complete working example showing how exceptions are caught by the Future monad in for-comprehensions:
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Try, Success, Failure}
class Foo(val value: Int) {
def getValue(crash: Boolean): Int = {
if (crash) {
throw new Exception("failed to get value")
} else {
value
}
}
}
class Manager {
def makeFoo(crash: Boolean): Future[Foo] = {
if (crash) {
throw new Exception("failed to make Foo")
} else {
Future(new Foo(10))
}
}
}
object Manager {
def getManager(crash: Boolean): Manager = {
if (crash) {
throw new Exception("failed to get manager")
} else {
new Manager()
}
}
}
object Main extends App {
def getFooValue(crashGetManager: Boolean,
crashMakeFoo: Boolean,
crashGetValue: Boolean): Future[Int] = {
for {
manager <- Future(Manager.getManager(crashGetManager))
foo <- manager.makeFoo(crashMakeFoo)
value = foo.getValue(crashGetValue)
} yield value
}
def waitForValue(future: Future[Int]): Unit = {
val result = Try(Await.result(future, Duration("10 seconds")))
result match {
case Success(value) => println(s"Got value: $value")
case Failure(e) => println(s"Got error: $e")
}
}
val future1 = getFooValue(false, false, false)
waitForValue(future1)
val future2 = getFooValue(true, false, false)
waitForValue(future2)
val future3 = getFooValue(false, true, false)
waitForValue(future3)
val future4 = getFooValue(false, false, true)
waitForValue(future4)
}
Here's the output:
Got value: 10
Got error: java.lang.Exception: failed to get manager
Got error: java.lang.Exception: failed to make Foo
Got error: java.lang.Exception: failed to get value
This is a trivial example, but I'm working on a project in which we have a lot of non-trivial code that depends on this behaviour. As far as I understand, this is one of the main advantages of using Future (or Try) as a monad. What I find strange is that I have to write
manager <- Future(Manager.getManager(crashGetManager))
instead of
manager = Manager.getManager(crashGetManager)
(Edited to reflect @RexKerr's point that the monad is doing the work of catching the exceptions.)
for comprehensions do not catch exceptions. Try does, and it has the appropriate methods to participate in for-comprehensions, so you can
for {
manager <- Try { Manager.getManager() }
...
}
But then it's expecting Try all the way down unless you manually or implicitly have a way to switch container types (e.g. something that converts Try to a List).
So I'm not sure your premises are right. Any assignment you made in a for-comprehension can just be made early.
(Also, there is no point doing an assignment inside a for comprehension just to yield that exact value. Just do the computation in the yield block.)
(Also, just to illustrate that multiple types can play a role in for comprehensions so there's not a super-obvious correct answer for how to wrap an early assignment in terms of later types:
// List and Option, via implicit conversion
for {i <- List(1,2,3); j <- Option(i).filter(_ <2)} yield j
// Custom compatible types with map/flatMap
// Use :paste in the REPL to define A and B together
class A[X] { def flatMap[Y](f: X => B[Y]): A[Y] = new A[Y] }
class B[X](x: X) { def map[Y](f: X => Y): B[Y] = new B(f(x)) }
for{ i <- (new A[Int]); j <- (new B(i)) } yield j.toString
Even if you take the first type you still have the problem of whether there is a unique "bind" (way to wrap) and whether to doubly-wrap things that are already the correct type. There could be rules for all these things, but for-comprehensions are already hard enough to learn, no?)
Haskell translates the equivalent of for { manager = Manager.getManager(); ... } to the equivalent of lazy val manager = Manager.getManager(); for { ... }. This seems to work:
scala> lazy val x: Int = throw new Exception("")
x: Int = <lazy>
scala> for { y <- Future(x + 1) } yield y
res8: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise@fedb05d
scala> Try(Await.result(res8, Duration("10 seconds")))
res9: scala.util.Try[Int] = Failure(java.lang.Exception: )
I think the reason this can't be done is that for-comprehensions are syntactic sugar for the flatMap and map methods (except when you use a condition in the for-comprehension, in which case it is desugared with withFilter). When you are storing a value in an immutable variable, you can't use these methods. That's the reason you would be OK using Try, as pointed out by Rex Kerr. In that case, you should be able to use the map and flatMap methods.
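To make the desugaring concrete, here is a small sketch using Try, so that it runs synchronously. The `parse` helper is a made-up stand-in for a call like Manager.getManager() that may throw, and `desugared` is only an approximation of what the compiler generates for `sugared`.

```scala
import scala.util.Try

object DesugarSketch {
  // Hypothetical helper that may throw, standing in for Manager.getManager().
  def parse(s: String): Int = s.trim.toInt

  // Sugared form: the first line must be a generator, so the throwing call
  // is wrapped in Try to bring it inside the monad.
  def sugared(s: String): Try[Int] =
    for {
      n <- Try(parse(s))
      doubled = n * 2
      result <- Try(100 / doubled)
    } yield result

  // Roughly what the compiler produces: the `=` line becomes a map that
  // carries the new value along in a tuple.
  def desugared(s: String): Try[Int] =
    Try(parse(s))
      .map { n => val doubled = n * 2; (n, doubled) }
      .flatMap { case (_, doubled) => Try(100 / doubled).map(result => result) }
}
```

The `doubled = n * 2` line has a preceding generator to attach its `map` to. A leading `manager = ...` line would have nothing to map over, which is one way to see why the first enumerator has to be a generator.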