Scala style with Futures - scala

Looking for the best way to write a chain of functions that need to run async one after the other. Given these two options:
Option 1
def operation1(): Unit = {...}
def operation2(): Unit = {...}
def foo(): Future[Unit] =
Future {
operation1()
operation2()
} onComplete {case _ => println("done!")}
Option 2
def operation1(): Future[Unit] = {...}
def operation2(): Future[Unit] = {...}
def foo(): Future[Unit] = {
operation1()
.flatMap {case _ => operation2() }
.onComplete {case _ => println("done!")}
}
Are there any advantages/disadvantages of one over the other?
I believe that the Option 1 will run the two functions on the same background thread. Is that also the case for the Option 2?
Are there any good practices for this?
Another question, given this function:
def foo: Future[A]
if I want to cast the result to unit, is this the best way to do it:
foo map { _ => () }
Thanks!

The potential advantage of Option 1 over Option 2 is that, it guarantees operation2 will run right after operation1 - if it didn't fail with an exception - whereas, in Option 2, you might have exhausted your thread pool available threads when the flatMap is to be done.
Yes, Option1 will run the operations in the same thread for sure. Option 2 will try to run them in two threads as long as there are enough of them available.
flatMap[S](f: (T) ⇒ Future[S])(implicit executor: ExecutionContext): Future[S]
You did have to declare an implicit execution context, or import it: That determines which pool you are using. If you imported the default global executor then your pool is a fork join based one with - by default - as many threads as cores you machine has.
The first option is like having a thread running both operations, one after the another whereas the second option runs the first operation in a thread and then tries to get another thread from your ExecutionContext to run the second operation.
The best practice is to use what you need:
Do you want to make sure that operation2 run in a context where no more threads are available in the execution context? If the answer is yes, then use Option1. Otherwise, you can use Option2
In relation to your last question: What you're doing in your proposed snippet is not casting, your are mapping a function which provides an Unit value for any value of type A. The effect is that you get a future of type Unit which is useful to check its completion state. That is the best way to get what you seem to want.
Be aware, however, of the fact that, as well as with flatMap, the execution of that "transformation function" will be run in a different thread provided by the implicit executor in your context. That is why map has an implicit parameter executor too:
def map[S](f: (T) ⇒ S)(implicit executor: ExecutionContext): Future[S]

Related

Reentrant locks within monads in Scala

A colleague of mine stated the following, about using a Java ReentrantReadWriteLock in some Scala code:
Acquiring the lock here is risky. It's "reentrant", but that internally depends on the thread context. F may run different stages of the same computation in different threads. You can easily cause a deadlock.
F here refers to some effectful monad.
Basically what I'm trying to do is to acquire the same reentrant lock twice, within the same monad.
Could somebody clarify why this could be a problem?
The code is split into two files. The outermost one:
val lock: Resource[F, Unit] = for {
// some other resource
_ <- store.writeLock
} yield ()
lock.use { _ =>
for {
// stuff
_ <- EitherT(store.doSomething())
// other stuff
} yield ()
}
Then, in the store:
import java.util.concurrent.locks.{Lock, ReentrantReadWriteLock}
import cats.effect.{Resource, Sync}
private def lockAsResource[F[_]](lock: Lock)(implicit F: Sync[F]): Resource[F, Unit] =
Resource.make {
F.delay(lock.lock())
} { _ =>
F.delay(lock.unlock())
}
private val lock = new ReentrantReadWriteLock
val writeLock: Resource[F, Unit] = lockAsResource(lock.writeLock())
def doSomething(): F[Either[Throwable, Unit]] = writeLock.use { _ =>
// etc etc
}
The writeLock in the two pieces of code is the same, and it's a cats.effect.Resource[F, Unit] wrapping a ReentrantReadWriteLock's writeLock. There are some reasons why I was writing the code this way, so I wouldn't want to dig into that. I would just like to understand why (according to my colleague, at least), this could potentially break stuff.
Also, I'd like to know if there is some alternative in Scala that would allow something like this without the risk for deadlocks.
IIUC your question:
You expect that for each interaction with the Resource lock.lock and lock.unlock actions happen in the same thread.
1) There is no guarantee at all since you are using arbitrary effect F here.
It's possible to write an implementation of F that executes every action in a new thread.
2) Even if we assume that F is IO then the body of doSomething someone could do IO.shift. So the next actions including unlock would happen in another thread. Probably it's not possible with the current signature of doSomething but you get the idea.
Also, I'd like to know if there is some alternative in Scala that would allow something like this without the risk for deadlocks.
You can take a look at scalaz zio STM.

Slight difference in Future.zip and Future.zipWith implementation. Why?

Let's consider the following excerpt from scala.concurrent.Future.scala:
def zip[U](that: Future[U]): Future[(T, U)] = {
implicit val ec = internalExecutor
flatMap { r1 => that.map(r2 => (r1, r2)) }
}
def zipWith[U, R](that: Future[U])(f: (T, U) => R)(implicit executor: ExecutionContext): Future[R] =
flatMap(r1 => that.map(r2 => f(r1, r2)))(internalExecutor)
It does not differ a lot seemingly, except for the application of function f in the zipWith case. It is interesting to me, why the internalExecutor (which just delegates to the current thread) is declared as implicit value in the zip and thus used in both underlying map and flatMap calls, but is used explicitly only in the flatMap call inside the zipWith?
As I understand after some thinking, the f function execution may involve some blocking or intensive computation which is out of Scala library control, and so the user should provide another execution context for it to not occasionally block the internalExecutor (current thread). Is this understanding correct?
The application of f is done with the supplied ExecutionContext, and internalExecutor is used to perform the flattening operation. The rule is basically: When the user supplies the logic, that logic is executed on the ExecutionContext supplied by the user.
You could imagine that zipWith was implemented as this.zip(that).map(f.tupled) or that zip was implemented as zipWith(Tuple2.apply)(internalExecutor).

Abstraction for asynchronous computation that cannot fail

Scala Futures are a fine abstraction for asynchronous computation that can fail. What abstraction should I use when I know that there is no failure possible?
Here is a concrete use case:
case class Service123Error(e: Throwable)
val f1: Future[User] =
service123.call()
val f2: Future[Either[Service123Error, User]] =
f1.map(Right.apply)
.recover { case e => Left(Service123Error(e)) }
In this code, the types do not model the fact that f2 always successfully completes. Given several values similar to f2, I know I can Future.sequence over them (i.e. without the fail fast behavior), but the types are not carrying this information.
I would not stick an Either into a Future if it carries the error information. Simply use the failure side of the future.
Future(Right(User)) -> Future[User](User)
Future(Left(Service123Error)) -> Future[User](throw Service123Error)

Running scala futures somewhat in parallel

I have some scala futures. I can easily run them in parallel with Future.sequence. I can also run them one-after-another with something like this:
def serFut[A, B](l: Iterable[A])(fn: A ⇒ Future[B]) : Future[List[B]] =
l.foldLeft(Future(List.empty[B])) {
(previousFuture, next) ⇒
for {
previousResults ← previousFuture
next ← fn(next)
} yield previousResults :+ next
}
(Described here). Now suppose that I want to run them slightly in parallel - ie with the constraint that at most m of them are running at once. The above code does this for the special case of m=1. Is there a nice scala-idiomatic way of doing it for general m? Then for extra utility, what's the most elegant way to implement a kill-switch into the routine? And could I change m on the fly?
My own solutions to this keep leading me back to procedural code, which feels rather wimpy next to scala elegance.
You can achieve it by using an ExecutionContext which uses a pool of max m threads:
How to configure a fine tuned thread pool for futures?
Put the implicit val ec = new ExecutionContext { ... somewhere within the scope of the serFut function so that it is used when creating futures.
The easiest way is to define your ExecutionContext like
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(m))

Is Future in Scala a monad?

Why and how specifically is a Scala Future not a Monad; and would someone please compare it to something that is a Monad, like an Option?
The reason I'm asking is Daniel Westheide's The Neophyte's Guide to Scala Part 8: Welcome to the Future where I asked whether or not a Scala Future was a Monad, and the author responded that it wasn't, which threw off base. I came here to ask for a clarification.
A summary first
Futures can be considered monads if you never construct them with effectful blocks (pure, in-memory computation), or if any effects generated are not considered as part of semantic equivalence (like logging messages). However, this isn't how most people use them in practice. For most people using effectful Futures (which includes most uses of Akka and various web frameworks), they simply aren't monads.
Fortunately, a library called Scalaz provides an abstraction called Task that doesn't have any problems with or without effects.
A monad definition
Let's review briefly what a monad is. A monad must be able to define at least these two functions:
def unit[A](block: => A)
: Future[A]
def bind[A, B](fa: Future[A])(f: A => Future[B])
: Future[B]
And these functions must statisfy three laws:
Left identity: bind(unit(a))(f) ≡ f(a)
Right identity: bind(m) { unit(_) } ≡ m
Associativity: bind(bind(m)(f))(g) ≡ bind(m) { x => bind(f(x))(g) }
These laws must hold for all possible values by definition of a monad. If they don't, then we simply don't have a monad.
There are other ways to define a monad that are more or less the same. This one is popular.
Effects lead to non-values
Almost every usage of Future that I've seen uses it for asychronous effects, input/output with an external system like a web service or a database. When we do this, a Future isn't even a value, and mathematical terms like monads only describe values.
This problem arises because Futures execute immediately upon data contruction. This messes up the ability to substitute expressions with their evaluated values (which some people call "referential transparency"). This is one way to understand why Scala's Futures are inadequate for functional programming with effects.
Here's an illustration of the problem. If we have two effects:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def twoEffects =
( Future { println("hello") },
Future { println("hello") } )
we will have two printings of "hello" upon calling twoEffects:
scala> twoEffects
hello
hello
scala> twoEffects
hello
hello
But if Futures were values, we should be able to factor out the common expression:
lazy val anEffect = Future { println("hello") }
def twoEffects = (anEffect, anEffect)
But this doesn't give us the same effect:
scala> twoEffects
hello
scala> twoEffects
The first call to twoEffects runs the effect and caches the result, so the effect isn't run the second time we call twoEffects.
With Futures, we end up having to think about the evaluation policy of the language. For instance, in the example above, the fact I use a lazy value rather than a strict one makes a difference in the operational semantics. This is exactly the kind of twisted reasoning functional programming is designed to avoid -- and it does it by programming with values.
Without substitution, laws break
In the presense of effects, monad laws break. Superficially, the laws appear to hold for simple cases, but the moment we begin to substitute expressions with their evaluated values, we end up with the same problems we illustrated above. We simply can't talk about mathematical concepts like monads when we don't have values in the first place.
To put it bluntly, if you use effects with your Futures, saying they're monads is not even wrong because they aren't even values.
To see how monad laws break, just factor out your effectful Future:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def unit[A]
(block: => A)
: Future[A] =
Future(block)
def bind[A, B]
(fa: Future[A])
(f: A => Future[B])
: Future[B] =
fa flatMap f
lazy val effect = Future { println("hello") }
Again, it will only run one time, but you need it to run twice -- once for the right-hand side of the law, and another for the left. I'll illustrate the problem for the right identity law:
scala> effect // RHS has effect
hello
scala> bind(effect) { unit(_) } // LHS doesn't
The implicit ExecutionContext
Without putting an ExecutionContext in implicit scope, we can't define either unit or bind in our monad. This is because the Scala API for Futures has these signature:
object Future {
// what we need to define unit
def apply[T]
(body: ⇒ T)
(implicit executor: ExecutionContext)
: Future[T]
}
trait Future {
// what we need to define bind
flatMap[S]
(f: T ⇒ Future[S])
(implicit executor: ExecutionContext)
: Future[S]
}
As a "convenience" to the user, the standard library encourages users to define an execution context in implicit scope, but I think this is a huge hole in the API that just leads to defects. One scope of the computation may have one execution context defined while another scope can have another context defined.
Perhaps you can ignore the problem if you define an instance of unit and bind that pins both operations to a single context and use this instance consistently. But this is not what people do most of the time. Most of the time, people use Futures with for-yield comprehensions that become map and flatMap calls. To make for-yield comprehensions work, an execution context must be defined at some non-global implicit scope (because for-yield doesn't provide a way to specify additional parameters to the map and flatMap calls).
To be clear, Scala lets you use lots of things with for-yield comprehensions that aren't actually monads, so don't believe that you have a monad just because it works with for-yield syntax.
A better way
There's a nice library for Scala called Scalaz that has an abstraction called scalaz.concurrent.Task. This abstraction doesn't run effects upon data construction as the standard library Future does. Furthermore, Task actually is a monad. We compose Task monadically (we can use for-yield comprehensions if we like), and no effects run while we're composing. We have our final program when we have composed a single expression evaluating to Task[Unit]. This ends up being our equivalent of a "main" function, and we can finally run it.
Here's an example illustrating how we can substitute Task expressions with their respective evaluated values:
import scalaz.concurrent.Task
import scalaz.IList
import scalaz.syntax.traverse._
def twoEffects =
IList(
Task delay { println("hello") },
Task delay { println("hello") }).sequence_
We will have two printings of "hello" upon calling twoEffects:
scala> twoEffects.run
hello
hello
And if we factor out the common effect,
lazy val anEffect = Task delay { println("hello") }
def twoEffects =
IList(anEffect, anEffect).sequence_
we get what we'd expect:
scala> twoEffects.run
hello
hello
In fact, it doesn't really matter that whether we use a lazy value or a strict value with Task; we get hello printed out twice either way.
If you want to functionally program, consider using Task everywhere you may use Futures. If an API forces Futures upon you, you can convert the Future to a Task:
import concurrent.
{ ExecutionContext, Future, Promise }
import util.Try
import scalaz.\/
import scalaz.concurrent.Task
def fromScalaDeferred[A]
(future: => Future[A])
(ec: ExecutionContext)
: Task[A] =
Task
.delay { unsafeFromScala(future)(ec) }
.flatMap(identity)
def unsafeToScala[A]
(task: Task[A])
: Future[A] = {
val p = Promise[A]
task.runAsync { res =>
res.fold(p failure _, p success _)
}
p.future
}
private def unsafeFromScala[A]
(future: Future[A])
(ec: ExecutionContext)
: Task[A] =
Task.async(
handlerConversion
.andThen { future.onComplete(_)(ec) })
private def handlerConversion[A]
: ((Throwable \/ A) => Unit)
=> Try[A]
=> Unit =
callback =>
{ t: Try[A] => \/ fromTryCatch t.get }
.andThen(callback)
The "unsafe" functions run the Task, exposing any internal effects as side-effects. So try not to call any of these "unsafe" functions until you've composed one giant Task for your entire program.
I believe a Future is a Monad, with the following definitions:
def unit[A](x: A): Future[A] = Future.successful(x)
def bind[A, B](m: Future[A])(fun: A => Future[B]): Future[B] = fut.flatMap(fun)
Considering the three laws:
Left identity:
Future.successful(a).flatMap(f) is equivalent to f(a). Check.
Right identity:
m.flatMap(Future.successful _) is equivalent to m (minus some possible performance implications). Check.
Associativity
m.flatMap(f).flatMap(g) is equivalent to m.flatMap(x => f(x).flatMap(g)). Check.
Rebuttal to "Without substitution, laws break"
The meaning of equivalent in the monad laws, as I understand it, is you could replace one side of the expression with the other side in your code without changing the behavior of the program. Assuming you always use the same execution context, I think that is the case. In the example #sukant gave, it would have had the same issue if it had used Option instead of Future. I don't think the fact that the futures are evaluated eagerly is relevant.
As the other commenters have suggested, you are mistaken. Scala's Future type has the monadic properties:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def unit[A](block: => A): Future[A] = Future(block)
def bind[A, B](fut: Future[A])(fun: A => Future[B]): Future[B] = fut.flatMap(fun)
This is why you can use for-comprehension syntax with futures in Scala.