performance enhancement using flatMap vs for-comprehension - scala

I have an actor in my play app, that every tick (2 sec) sends a message to some method:
onSomething() : Future[Unit] = {
for {
a <- somethingThatReturnsFuture
b <- anotherThingThatReturnsFuture
}
}
This method has two calls that return future so I decided to use for-comprehension, but is it true that for-comprehension is blocking? So akka could not call this method again even with the 16 instances they run until the method complete?
If I would have my method to work with flatMap/map this will allow akka to have better performance? Like this:
onSomething() : Future[Unit] = {
somethingThatReturnsFuture.flatMap(res1 => {
anotherThingThatReturnsFuture.map(res2 => {
//whatever
})
})
}
thanks

As per Luis' comment, for-comprehensions are just syntactic sugar
Scala’s “for comprehensions” are syntactic sugar for composition of
multiple operations with foreach, map, flatMap, filter or withFilter.
Scala actually translates a for-expression into calls to those
methods, so any class providing them, or a subset of them, can be used
with for comprehensions.
which expands into underlying monadic operations, thus there should be no performance hit over using monadic operations directly. If your methods are independent of each other then you might gain same performance by taking advantage of Futures being eager and start them outside the for-comprehension like so
val aF = somethingThatReturnsFuture()
val bF = anotherThingThatReturnsFuture() // I started without waiting on anyone
for {
a <- aF
b <- bF
} yield {
a + b
}
However if calculation of b depends on a then you will not be able to kick them off in parallel
for {
a <- somethingThatReturnsFuture
b <- anotherThingThatReturnsFuture(a)
} yield {
a + b
}
Here anotherThingThatReturnsFuture "blocks" in the sense of having to wait on somethingThatReturnsFuture.

First of all.
Because the methods you are calling return a Future, none of them will block the Thread execution.
But what it is true, is that flatmap will concatenate sequentially the two operations. I mean, it will call the first method, then it returns inmediately, because it is a Future, and then it will call the second.
This will happen in the two options you have posted before (for comprehension and flatmap) because they are basically the same.
If you want to call the two methods at the same time, (in two different threads), so you don't know which of them will start to execute first, you have to use parallel collections.
But in your case, perhaps it is better to not use them because using futures will guarante that the thread will not block

Related

How does the cats-effect IO monad really work?

I'm new to functional programming and Scala, and I was checking out the Cats Effect framework and trying to understand what the IO monad does. So far what I've understood is that writing code in the IO block is just a description of what needs to be done and nothing happens until you explicitly run using the unsafe methods provided, and also a way to make code that performs side-effects referentially transparent by actually not running it.
I tried executing the snippet below just to try to understand what it means:
object Playground extends App {
var out = 10
var state = "paused"
def changeState(newState: String): IO[Unit] = {
state = newState
IO(println("Updated state."))
}
def x(string: String): IO[Unit] = {
out += 1
IO(println(string))
}
val tuple1 = (x("one"), x("two"))
for {
_ <- x("1")
_ <- changeState("playing")
} yield ()
println(out)
println(state)
}
And the output was:
13
paused
I don't understand why the assignment state = newState does not run, but the increment and assign expression out += 1 run. Am I missing something obvious on how this is supposed to work? I could really use some help. I understand that I can get this to run using the unsafe methods.
In your particular example, I think what is going on is that regular imperative Scala coded is unaffected by the IO monad--it runs when it normally would under the rules of Scala.
When you run:
for {
_ <- x("1")
_ <- changeState("playing")
} yield ()
this immediately calls x. That has nothing to do with the IO monad; it's just how for comprehensions are defined. The first step is to evaluate the first statement so you can call flatMap on it.
As you observe, you never "run" the monadic result, so the argument to flatMap, the monadic continuation, is never invoked, resulting in no call to changeState. This is specific to the IO monad, as, e.g., the List monad's flatMap would have immediately invoked the function (unless it were an empty list).

How to cancel a future action if another future did failed?

I have 2 futures (2 actions on db tables) and I want that before saving modifications to check if both futures have finished successfully.
Right now, I start second future inside the first (as a dependency), but I know is not the best option. I know I can use a for-comprehension to execute both futures in parallel, but even if one fail, the other will be executed (not tested yet)
firstFuture.dropColumn(tableName) match {
case Success(_) => secondFuture.deleteEntity(entity)
case Failure(e) => throw new Exception(e.getMessage)
}
// the first future alters a table, drops a column
// the second future deletes a row from another table
in this case, if first future is executed successfully, the second can fail. I want to revert the update of first future. I heard about SQL transactions, seems to be something like that, but how?
val futuresResult = for {
first <- firstFuture.dropColumn(tableName)
second <- secondFuture.deleteEntity(entity)
} yield (first, second)
A for-comprehension is much better in my case because I don't have dependencies between these two futures and can be executed in parallel, but this not solve my problem, the result can be (success, success) or (failed, success) for example.
Regarding Future running sequentially vs in parallel:
This is a bit tricky because Scala Future is designed to be eager. There are some other constructs across various Scala libraries that handle synchronous and asynchronous effects, such as cats IO, Monix Task, ZIO etc. which are designed in a lazy way, and they don't have this behaviour.
The thing with Future being eager is that it will start the computation as soon as it is can. Here "start" means schedule it on an ExecutionContext that is either selected explicitly or present implicitly. While it's technically possible that the execution is stalled a bit in case the scheduler decides to do so, it will most likely be started almost instantaneously.
So if you have a value of type Future, it's going to start running then and there. If you have a lazy value of type Future, or a function / method that returns a value of type Future, then it's not.
But even if all you have are simple values (no lazy vals or defs), if the Future definition is done inside the for-comprehension, then it means it's part of a monadic flatMap chain (if you don't understand this, ignore it for now) and it will be run in sequence, not in parallel. Why? This is not specific to Futures; every for-comprehension has the semantics of being a sequential chain in which you can pass the result of the previous step to the next step. So it's only logical that you can't run something in step n + 1 if it depends on something from step n.
Here's some code to demonstrate this.
val program = for {
_ <- Future { Thread.sleep(5000); println("f1") }
_ <- Future { Thread.sleep(5000); println("f2") }
} yield ()
Await.result(program, Duration.Inf)
This program will wait five seconds, then print "f1", then wait another five seconds, and then print "f2".
Now let's take a look at this:
val f1 = Future { Thread.sleep(5000); println("f1") }
val f2 = Future { Thread.sleep(5000); println("f2") }
val program = for {
_ <- f1
_ <- f2
} yield ()
Await.result(program, Duration.Inf)
The program, however, will print "f1" and "f2" simultaneously after five seconds.
Note that the sequence semantics are not really violated in the second case. f2 still has the opportunity to use the result of f1. But f2 is not using the result of f1; it's a standalone value that can be computed immediately (defined with a val). So if we change val f2 to a function, e.g. def f2(number: Int), then the execution changes:
val f1 = Future { Thread.sleep(5000); println("f1"); 42 }
def f2(number: Int) = Future { Thread.sleep(5000); println(number) }
val program = for {
number <- f1
_ <- f2(number)
} yield ()
As you would expect, this will print "f1" after five seconds, and only then will the other Future start, so it will print "42" after another five seconds.
Regarding transactions:
As #cbley mentioned in the comment, this sounds like you want database transactions. For example, in SQL databases this has a very specific meaning and it ensures the ACID properties.
If that's what you need, you need to solve it on the database layer. Future is too generic for that; it's just an effect type that models sync and async computations. When you see a Future value, just by looking at the type, you can't tell if it's the result of a database call or, say, some HTTP call.
For example, doobie describes every database query as a ConnectionIO type. You can have multiple queries lined up in a for-comprehension, just how you would have with Future:
val program = for {
a <- database.getA()
_ <- database.write("foo")
b <- database.getB()
} yield {
// use a and b
}
But unlike our earlier examples, here getA() and getB() don't return a value of type Future[A], but ConnectionIO[A]. What's cool about that is that doobie completely takes care of the fact that you probably want these queries to be run in a single transaction, so if getB() fails, "foo" will not be committed to the database.
So what you would do in that case is obtain the full description of your set of queries, wrap it into a single value program of type ConnectionIO, and once you want to actually run the transaction, you would do something like program.transact(myTransactor), where myTransactor is an instance of Transactor, a doobie construct that knows how to connect to your physical database.
And once you transact, your ConnectionIO[A] is turned into a Future[A]. If the transaction failed, you'll have a failed Future, and nothing will be really committed to your database.
If your database operations are independent of each other and can be run in parallel, doobie will also allow you to do that. Committing transactions via doobie, both in sequence and in parallel, is quite nicely explained in the docs.

How IO monad makes concurrency easy to do in Scala?

I watch this video and starts at 6min35s, it mentioned this graph:
saying that the IO Monad makes it easy to handle concurrency. I got confused on this: how does it work? How do the two for comprehension enable the concurrency (the computation of d and f)?
No, it doesn't enable the concurrency
for comprehensions only help you omitting several parentheses and indents.
The code you referred, translate to [flat]map is strictly equivalent to:
async.boundedQueue[Stuff](100).flatMap{ a =>
val d = computeB(a).flatMap{
b => computeD(b).map{ result =>
result
}
}
val f = computeC(a).flatMap{ c =>
computeE(c).flatMap{ e =>
computeF(e).map{ result =>
result
}
}
}
d.merge(f).map(g => g)
}
See, it only helps you omitting several parentheses and indents (joke)
The concurrency is hidden in flatMap and map
Once you understand how for is translated to flatMap and map, you can implement your concurrency inside them.
As map takes a function as argument, it doesn't mean that the function is executed during execution of map function, you can defer the function to another thread or run it latter. This is how concurrency implemented.
Take Promise and Future as example:
val future: Future = ```some promise```
val r: Future = for (v <- future) yield doSomething(v)
// or
val r: Future = future.map(v => doSomething(v))
r.wait
The function doSomething is not executed during the execution of Future.map function, instead it is called when the promise commits.
Conclusion
How to implement concurrency using for syntax suger:
for will convert into flatMap and map by scala compiler
Write you flatMap and map, where you will get a callback function from argument
Call the function you got whenever and wherever you like
Further reading
The flow control feature of many languages share a same property, they are like delimited continuation shift/reset, where they capture the following execution upto a scope into an function.
JavaScript:
async function() {
...
val yielded = await new Promise((resolve) => shift(resolve))
// resolve will captured execution of following statements upto end of the function.
...captured
}
Haskell:
do {
...
yielded_monad <- ```some monad``` -- shift function is >>= of the monad
...captured
}
Scala:
for {
...
yielded_monad <- ```some monad``` // shift function is flatMap/map of the monad
...captured
} yield ...
next time you see a language feature which capture following execution into a function, you know you can implement flow control using the feature.
The difference between delimited continuation and call/cc is that call/cc capture the whole following execution of the program, but delimited continuation has a scope.

Losing types on sequencing Futures

I'm trying to do this:
case class ConversationData(members: Seq[ConversationMemberModel], messages: Seq[MessageModel])
val membersFuture: Future[Seq[ConversationMemberModel]] = ConversationMemberPersistence.searchByConversationId(conversationId)
val messagesFuture: Future[Seq[MessageModel]] = MessagePersistence.searchByConversationId(conversationId)
Future.sequence(List(membersFuture, messagesFuture)).map{ result =>
// some magic here
self ! ConversationData(members, messages)
}
But when I'm sequencing the two futures compiler is losing types. The compiler says that type of result is List[Seq[Product with Serializable]] At the beginning I expect to do something like
Future.sequence(List(membersFuture, messagesFuture)).map{ members, messages => ...
But it looks like sequencing futures don't work like this... I also tried to using a collect inside the map but I get similar errors.
Thanks for your help
When using Future.sequence, the assumption is that the underlying types produced by the multiple Futures are the same (or extend from the same parent type). With sequence, you basically invert a Seq of Futures for a particular type to a single Future for a Seq of that particular type. A concrete example is probably more illustrative of that point:
val f1:Future[Foo] = ...
val f2:Future[Foo] = ...
val f3:Future[Foo] = ...
val futures:List[Future[Foo]] = List(f1, f2, f3)
val aggregateFuture:Future[List[Foo]] = Future.sequence(futures)
So you can see that I went from a List of Future[Foo] to a single Future wrapping a List[Foo]. You use this when you already have a bunch of Futures for results of the same type (or base type) and you want to aggregate all of the results for the next processing step. The sequence method product a new Future that won't be completed until all of the aggregated Futures are done and it will then contain the aggregated results of all of those Futures. This works especially well when you have an indeterminate or variable number of Futures to process.
For your case, it seems that you have a fixed number of Futures to handle. As #Zoltan suggested, a simple for comprehension is probably a better fit here because the number of Futures is known. So solving your problem like so:
for{
members <- membersFuture
messages <- messagesFuture
} {
self ! ConversationData(members, messages)
}
is probably the best way to go for this specific example.
What are you trying to achieve with the sequence call? I'd just use a for-comprehension instead:
val membersFuture: Future[Seq[ConversationMemberModel]] = ConversationMemberPersistence.searchByConversationId(conversationId)
val messagesFuture: Future[Seq[MessageModel]] = MessagePersistence.searchByConversationId(conversationId)
for {
members <- membersFuture
messages <- messagesFuture
} yield (self ! ConversationData(members, messages))
Note that it is important that you declare the two futures outside the for-comprehension, because otherwise your messagesFuture wouldn't be submitted until the membersFuture is completed.
You could also use zip:
membersFuture.zip(messagesFuture).map {
case (members, messages) => self ! ConversationData(members, messages)
}
but I'd prefer the for-comprehension.

Use only the value from IO monad without precedent IO actions

Doing some home project, I encountered an interested effect, which now , seems obvious to me, but still I do not see a way to get away from it.
That is the gist (I am using ScalaZ, but in haskell there would be probably the same result):
def askAndReadResponse(question: String): IO[String] = {
putStrLn(question) >> readLn
}
def core: IO[String] = {
val answer: IO[String] = askAndReadResponse("enter something")
val cond: IO[Boolean] = answer map {_.length > 2}
IO.ioMonad.ifM(cond, answer, core)
}
When I am trying to get an input from core, the askAndReadResponse evaluates twice - once for evaluating the condition, and then in ifM (so I have the message and readLn one more time then necessary).
What I need - just the validated value (to print it later, for instance)
Is there any elegant way to do this, in particular - to pass further the result of IO, without preceding IO actions, namely avoiding execution of askAndReadResponse twice?
You can sequence the effects using monadic binding with flatMap:
def core: IO[String] = askAndReadResponse("enter something").flatMap {
case response if response.length > 2 => response.point[IO]
case response => core
}
This lets you take the result of one computation (the user entering text after being prompted) and use it in subsequent computations (the calculation about whether to return or loop, and the result if returning).
ifM just isn't going to be useful in your case—it would only work here if your condition and your successful branch were independent computations.