Cats effect - parallel composition of independent effects - scala

I want to combine multiple IO values that should run independently in parallel.
val io1: IO[Int] = ???
val io2: IO[Int] = ???
As I see it, I have two options:
Use cats-effect's fibers with a fork-join pattern
val parallelSum1: IO[Int] = for {
  fiber1 <- io1.start
  fiber2 <- io2.start
  i1 <- fiber1.join
  i2 <- fiber2.join
} yield i1 + i2
Use the Parallel instance for IO with parMapN (or one of its siblings like parTraverse, parSequence, parTupled etc)
val parallelSum2: IO[Int] = (io1, io2).parMapN(_ + _)
I'm not sure about the pros and cons of each approach, or when I should choose one over the other. This becomes even trickier when abstracting over the effect type (tagless-final style):
def io1[F[_]]: F[Int] = ???
def io2[F[_]]: F[Int] = ???
def parallelSum1[F[_]: Concurrent]: F[Int] = for {
  fiber1 <- io1[F].start
  fiber2 <- io2[F].start
  i1 <- fiber1.join
  i2 <- fiber2.join
} yield i1 + i2

def parallelSum2[F[_], G[_]](implicit parallel: Parallel[F, G]): F[Int] =
  (io1[F], io2[F]).parMapN(_ + _)
The Parallel typeclass requires two type constructors, making it somewhat more cumbersome to use: there is no context-bound syntax, and it comes with an additional, vague type parameter G[_].
Your guidance is appreciated :)
Amitay

I want to combine multiple IO values that should run independently in
parallel.
The way I view it, in order to figure out "when do I use which?", we need to go back to the old parallelism vs. concurrency discussion, which basically boils down to (quoting the accepted answer):
Concurrency is when two or more tasks can start, run, and complete in
overlapping time periods. It doesn't necessarily mean they'll ever
both be running at the same instant. For example, multitasking on a
single-core machine.
Parallelism is when tasks literally run at the same time, e.g., on a
multicore processor.
We often give concurrency as an example when we do IO-like operations, such as making a call over the wire or talking to disk.
The question is: which one do you want when you say you want to execute "in parallel", the former or the latter?
If we're referring to the former, then using Concurrent[F] both conveys the intention by the signature and provides the proper execution semantics. If it's the latter, and we, for example, want to process a collection of elements in parallel, then going with Parallel[F, G] would be the better solution.
It is often quite confusing when we think about the semantics of this with regard to IO, because it has instances for both Parallel and Concurrent, and we mostly use it to opaquely define side-effecting operations.
As a side note, the reason behind Parallel taking two unary type constructors is that M (in Parallel[M[_], F[_]]) always has a Monad instance, and we need a way to prove that this Monad also has a related Applicative[F] instance for parallel execution, because when we think of a Monad we always talk about sequential execution semantics.
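For reference, a rough sketch of the shape of that two-parameter Parallel (as in cats 1.x, which the question is using): M is the sequential Monad (e.g. IO), F the related Applicative used for parallel composition (e.g. IO.Par), together with conversions between the two.
import cats.{Applicative, Monad, ~>}

// Simplified sketch of the cats 1.x Parallel shape, not code to paste as-is.
trait Parallel[M[_], F[_]] {
  def monad: Monad[M]             // sequential semantics
  def applicative: Applicative[F] // parallel semantics
  def parallel: M ~> F            // switch to the parallel Applicative
  def sequential: F ~> M          // switch back after combining
}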

Related

Doobie - lifting arbitrary effect into ConnectionIO CE3

I am trying to migrate a project from cats-effect 2 to cats-effect 3; I am using doobie for interacting with the database. Previously I could lift IO into ConnectionIO as described, but after the upgrade I didn't find any implementation of LiftIO[ConnectionIO]. How can the same be achieved with CE3?
There are two approaches here. Broadly, there's the "don't do that" option (recommended), and the other is to use WeakAsync.
"Don't do that"
Generally, it's not recommended to interleave arbitrary IO into a ConnectionIO program, because ConnectionIO is transactional, and you cannot roll back an IO. A ConnectionIO might run multiple times, for example to retry a recoverable failure.
When possible, it's a good idea to refactor the code so that the non-transactional operation is not done inside a transaction, but is instead performed afterwards.
However, when that's not practical, ConnectionIO provides a Sync instance, so there are a number of things you can do easily that don't require lifting from IO in the first place (a short sketch follows this list):
Printing to the console can be done by summoning Console[ConnectionIO]
Getting the current time can be done with Clock[ConnectionIO]
You can create a log4cats Logger[ConnectionIO] as needed using the appropriate factory for your backend
Generating a UUID value can be done with UUIDGen[ConnectionIO]
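For instance, a minimal sketch of a transactional program using these instances directly (assuming a reasonably recent cats-effect 3 and that the usual doobie imports bring the ConnectionIO instances into implicit scope):
import cats.effect.Clock
import cats.effect.std.{Console, UUIDGen}
import doobie._
import doobie.implicits._

// No lifting from IO: everything below runs via ConnectionIO's Sync instance.
val program: ConnectionIO[Int] = for {
  id   <- UUIDGen[ConnectionIO].randomUUID
  now  <- Clock[ConnectionIO].realTimeInstant
  _    <- Console[ConnectionIO].println(s"query $id started at $now")
  rows <- sql"select 42".query[Int].unique
} yield rows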
Using WeakAsync
Doobie's WeakAsync provides a way to get a Resource[F, F ~> ConnectionIO].
Note that because this is a resource, you must take care to complete the transaction inside of use - the lifecycle of the FunctionK from the resource will be shut down once use completes.
Usually that means something like this:
def doStuff(rows: List[Int]): F[Unit] = ???
WeakAsync.liftK[F, ConnectionIO].use { fk =>
  val transaction = for {
    rows <- sql"select 1".query[Int].to[List]
    _ <- fk(doStuff(rows))
  } yield ()
  transaction.transact(xa)
}
I found a way to achieve it:
def f()(implicit liftToConnectionIO: FunctionK[IO, ConnectionIO]): IO[Unit] = IO.unit
implicit val liftToConnectionIO = WeakAsync.liftK[IO, ConnectionIO]
liftToConnectionIO.use(implicit lift => f())

When should one use applicatives over monads?

I’ve been using Scala at work, and to understand functional programming more deeply I picked up Graham Hutton’s Programming in Haskell (love it :)
In the chapter on Monads I got my first look at the concept of Applicative Functors (AFs).
In my (limited) professional-Scala capacity I’ve never had to use AFs and have always written code that uses Monads. I’m trying to distill the understanding of “when to use AFs” and hence the question. Is this insight correct:
If all your computations are independent and parallelizable (i.e., the result of one doesn’t determine the output of another), and the output needs to be piped to a pure function without effects, your needs would be better served by an AF. If, however, you have even a single dependency, AFs won’t help and you’ll be forced to use Monads. Likewise, if the output needs to be piped to a function with effects (e.g., one returning Maybe), you’ll need Monads.
For example, if you have “monadic” code like so:
val result = for {
  x <- callServiceX(...)
  y <- callServiceY(...) // not dependent on x
} yield f(x, y)
It’s better to do something like this (pseudo-AF syntax for Scala, where |#| is like a separator between parallel/async calls):
val result = (callServiceX(...) |#| callServiceY(...)).f(_,_)
If f == pure and callService* are independent AFs will serve you better
If f has effects i.e., f(x,y): Option[Response] you’ll need Monads
If callServiceX(...), y <- callServiceY(...), callServiceZ(y) i.e., there is even a single dependency in the chain, use Monads.
Is my understanding correct? I know there’s a lot more to AFs/Monads and I believe I understand the advantages of one over the other (for the most part). What I want to know is the decision making process of deciding which one to use in a particular context.
There is not really a decision to be made here: always use the Applicative interface, unless it is too weak.1
It's the essential tension of abstraction strength: more computations can be expressed with Monad; computations expressed with Applicative can be used in more ways.
You seem to be mostly correct about the conditions where you need to use Monad. I'm not sure about this one:
If f has effects i.e. f(x,y) : Option[Response] you'll need Monads.
Not necessarily. What is the functor in question here? There is nothing stopping you from creating a F[Option[X]] if F is the applicative. But just as before you won't be able to make further decisions in F depending on whether the Option succeeded or not -- the whole "call tree" of F actions must be knowable without computing any values.
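For example, a minimal Scala sketch with cats (the callService* functions and f are hypothetical): f returns an Option, yet Applicative is enough, because the shape of the F computation does not depend on the computed values.
import cats.Applicative
import cats.implicits._

def callServiceX[F[_]: Applicative]: F[Int] = ???
def callServiceY[F[_]: Applicative]: F[Int] = ???
def f(x: Int, y: Int): Option[String] = if (x + y > 0) Some("ok") else None

// Applicative gives us F[Option[String]]; only if we need to branch in F on
// whether the Option is defined do we have to upgrade to Monad (flatMap).
def result[F[_]: Applicative]: F[Option[String]] =
  (callServiceX[F], callServiceY[F]).mapN(f)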
1 Readability concerns aside, that is. Monadic code will probably be more approachable to people from traditional backgrounds because of its imperative look.
I think you'll need to be a little cautious about terms like "independent" or "parallelizable" or "dependency". For example, in the IO monad, consider the computation:
foo :: IO (String, String)
foo = do
  line1 <- getLine
  line2 <- getLine
  return (line1, line2)
The first and second lines are not independent or parallelizable in the usual sense. The second getLine's result is affected by the action of the first getLine through their shared external state (i.e., the first getLine reads a line, implying the second getLine will not read that same line but will rather read the next line). Nonetheless, this action is applicative:
foo = (,) <$> getLine <*> getLine
As a more realistic example, a monadic parser for the expression 3 + 4 might look like:
expr :: Parser Expr
expr = do
  x <- factor
  op <- operator
  y <- factor
  return $ x `op` y
The three actions here are interdependent. The success of the first factor parser determines whether or not the others will be run, and its behavior (e.g., how much of the input stream it absorbs) clearly affects the results of the other parsers. It would not be reasonable to consider these actions as operating "in parallel" or being "independent". Still, it's an applicative action:
expr = factor <**> operator <*> factor
Or, consider this State Int action:
bar :: Int -> Int -> State Int Int
bar x y = do
  put (x + y)
  z <- gets (2*)
  return z
Clearly, the result of the gets (2*) action depends on the computation performed in the put (x + y) action. But, again, this is an applicative action:
bar x y = put (x + y) *> gets (2*)
I'm not sure that there's a really straightforward way of thinking about this intuitively. Roughly, if you think of a monadic action/computation m a as having "monadic structure" m as well as a "value structure" a, then applicatives keep the monadic and value structures separate. For example, the applicative computation:
λ> [(1+),(10+)] <*> [3,4,5]
[4,5,6,13,14,15]
has a monadic (list) structure whereby we always have:
[f,g] <*> [a,b,c] = [f a, f b, f c, g a, g b, g c]
regardless of the actual values involved. Therefore, the resulting list's length is the product of the lengths of both "input" lists, the first element of the result involves the first elements of the "input" lists, etc. It also has a value structure whereby the value 4 in the result clearly depends on the value (1+) and the value 3 in the inputs.
A monadic computation, on the other hand, permits a dependency of the monadic structure on the value structure, so for example in:
quux :: [Int]
quux = do
  n <- [1,2,3]
  drop n [10..15]
we can't write down the structural list computation independent of the values. The list structure (e.g., the length of the final list) is dependent on the value level data (the actual values in the list [1,2,3]). This is the kind of dependency that requires a monad instead of an applicative.
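The same value-dependent structure in Scala, for comparison:
// The length of the result depends on the values 1, 2, 3, so flatMap (Monad)
// is required; a fixed applicative "call tree" cannot express this.
val quux: List[Int] = List(1, 2, 3).flatMap(n => (10 to 15).drop(n))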

How does building a big Task computation compare to executing several steps synchronously?

I have the following two pieces of code written in Scala/Monix:
def f1(input) =
  (for {
    a <- task1(input)
    b <- task2(a)
    c <- task3(b)
  } yield c).runSyncUnsafe
and
def f2(input) = {
  val a = task1(input).runSyncUnsafe
  val b = task2(a).runSyncUnsafe
  task3(b).runSyncUnsafe
}
I think version f1 is better, as it is fully async and doesn't block threads, and my assumption is that, if there are many tasks running, the first should perform better under multithreading.
I know I should write a test to compare the two implementations but it would require a lot of refactoring of the legacy code. Also the profiling of the two versions is not easy in our specific situation so I'm asking here first, hoping for an answer from somebody with a lot of Scala/Monix experience:
How should the two compare in terms of performance under heavy load? Is this a real concern or is it a non-issue?
As a general rule it is better to stay async for as long as possible. So you could write f1 like this:
def f1(input) =
  for {
    a <- task1(input)
    b <- task2(a)
    c <- task3(b)
  } yield c
The caller can then decide whether to call runSyncUnsafe or an async call (runAsync, runOnComplete) or flatMap it with another task. This removes the Unsafe call from your code and leaves it to the caller to decide whether to be safe or not.
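For instance (a sketch; f1 and input are the question's, and the Scheduler import is what runToFuture/runSyncUnsafe need):
import monix.execution.Scheduler.Implicits.global

// The description is built once; the caller picks the execution strategy.
val program = f1(input)
val asFuture = program.runToFuture      // stay asynchronous
// val result = program.runSyncUnsafe() // or block, only at the very edge of the app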
As far as performance goes, the tasks will be evaluated sequentially either way because later tasks depend on the results of earlier tasks.

Are there better monadic abstraction alternatives for representing long-running, async tasks?

A Future is good at representing a single asynchronous task that will / should be completed within some fixed amount of time.
However, there exist another kind of asynchronous task, one where it's not possible / very hard to know exactly when it will finish. For example, the time taken for a particular string processing task might depend on various factors such as the input size.
For this kind of task, failure might be better detected by checking whether the task is able to make progress within a reasonable amount of time, instead of by setting a hard timeout value as with Future.
Are there any libraries providing suitable monadic abstraction of such kind of task in Scala?
You could use a stream of values like this:
sealed trait Update[T]
case class Progress[T](value: Double) extends Update[T]
case class Finished[T](result: T) extends Update[T]
Let your task emit Progress values when it is convenient (e.g., every time a chunk of the computation has finished), and emit one Finished value once the whole computation is finished. The consumer can check for Progress values to ensure that the task is still making progress; a consumer not interested in progress updates can just filter them out. I think this is more composable than an actor-based approach.
Depending on how much performance or purity you need, you might want to look at akka streams or scalaz streams. Akka streams has a pure DSL for building flow graphs, but allows mutability in processing stages. Scalaz streams is more functional, but has lower performance last I heard.
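As an illustration only (the answer mentions akka streams and scalaz streams; this sketch uses fs2, the successor of scalaz streams, with a hypothetical doChunk standing in for one unit of the long-running work):
import cats.effect.IO
import fs2.Stream

def doChunk(i: Int): IO[Int] = IO(i * i) // hypothetical unit of work

// Emit a Progress value after each chunk, then a single Finished value.
def job(n: Int): Stream[IO, Update[Int]] =
  Stream.range(0, n)
    .evalMap(doChunk)
    .scan((0, 0)) { case ((done, acc), x) => (done + 1, acc + x) }
    .map { case (done, acc) =>
      if (done == n) Finished(acc) else Progress[Int](done.toDouble / n)
    }

// A consumer that only wants the result filters out the progress updates.
val result: IO[Option[Int]] = job(10).collect { case Finished(r) => r }.compile.last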
You can break your work into chunks. Each chunk is a Future with a timeout - your notion of reasonable progress. Chain those Futures together to get a complete task.
Example 1 - both chunks can run in parallel and don't depend on each other (embarrassingly parallel task):
val chunk1 = Future { ... } // chunk 1 starts execution here
val chunk2 = Future { ... } // chunk 2 starts execution here
val result = for {
  c1 <- chunk1
  c2 <- chunk2
} yield combine(c1, c2)
Example 2 - second chunk depends on the first:
val chunk1 = Future { ... } // chunk 1 starts execution here
val result = for {
  c1 <- chunk1
  c2 <- Future { ... } // chunk 2 starts execution here and can use c1
} yield combine(c1, c2)
There are obviously other constructs, such as Future.sequence, to help you when you have many Futures (see the sketch below).
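For instance, a brief sketch with Future.sequence:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Combine many independently running chunks into one Future of the total.
val chunks: List[Future[Int]] = List(Future(1), Future(2), Future(3))
val total: Future[Int] = Future.sequence(chunks).map(_.sum)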
The article "The worst thing in our Scala code: Futures" by Ken Scambler points out the need for a separation of concerns:
scala.concurrent.Future is built to work naturally with scala.util.Try, so our code often ends up clumsily using Try everywhere to represent failure, using raw exceptions as failure values even where no exceptions get thrown.
To do anything with a scala.concurrent.Future, you need to lug around an implicit ExecutionContext. This obnoxious dependency needs to be threaded through everywhere they are used.
So if your code does not depend directly on Future, but on simple Monad properties, you can abstract it with a Monad type:
trait Monad[F[_]] {
  def flatMap[A, B](fa: F[A], f: A => F[B]): F[B]
  def pure[A](a: A): F[A]
}
// read the type parameter as “for all F, constrained by Monad”.
final def load[F[_]: Monad](pageUrl: URL): F[Page]
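A hypothetical instance for scala.concurrent.Future shows how such abstract code can later be tied back to a concrete effect, keeping the ExecutionContext dependency at the edge:
import scala.concurrent.{ExecutionContext, Future}

// A Monad instance (for the trait defined above) backed by Future.
implicit def futureMonad(implicit ec: ExecutionContext): Monad[Future] =
  new Monad[Future] {
    def flatMap[A, B](fa: Future[A], f: A => Future[B]): Future[B] = fa.flatMap(f)
    def pure[A](a: A): Future[A] = Future.successful(a)
  }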

Scala, Using Responder to abstract a possible Asynchronous computation

I have been looking into scala and AKKA for managing an obviously parallelisable algorithm.
I have some knowledge of functional programming and mostly do Java, so my FP might not be the best yet.
The algorithm I am working with is pretty simple, there is a top computation:
def computeFull(...): FullObject
This computation calls sub-computations and then sums them up (to simplify):
def computePartial(...): Int
and computeFull does something like this (again simplifying):
val partials = for {
  x <- 1 to 10
  y <- 1 to 10
} yield computePartial(x, y)

partials.foldLeft(0)(_ + _)
So, it's very close to the Akka example that computes Pi. I have many computeFull calls to make and many computePartial calls within each of them. So I can wrap all of this in Akka actors, or to simplify, in Futures, calling each computeFull and each computePartial in separate threads. I can then use the fold, zip and map functions of http://doc.akka.io/docs/akka/snapshot/scala/futures.html to combine the futures.
However, this implies that computeFull and computePartial will have to return Futures wrapping the actual results. They thus become dependent on Akka and assume that things are run in parallel. In fact, I also have to implicitly pass down the execution contexts within my functions.
I think that this is weird and that the algorithm "shouldn't" know the details of how it is parallelised, or if it is.
After reading about Futures in Scala (not the Akka ones) and looking into code continuations, it seems like the Responder monad provided by Scala (http://www.scala-lang.org/api/current/scala/Responder.html) is the right way to abstract how the function calls are run.
I have this vague intuition that computeFull and computePartial could return Responders instead of Futures, and that when the monad is executed, it decides how the code embedded within the Responder gets executed (whether it spawns a new actor or is executed on the same thread).
However, I am not really sure how to get to this result. Any suggestions? Do you think I am on the right way?
If you don’t want to be dependent on Akka (but note that Akka-style futures will be moved and included with Scala 2.10) and your computation is a simple fold on a collection you can simply use Scala’s parallel collections:
val partials = for {
  x <- (1 to 10).par
  y <- 1 to 10
} yield computePartial(x, y)

// blocks until everything is computed
partials.foldLeft(0)(_ + _)
Of course, this will block until partials is ready, so it may not be appropriate in situations where you really need futures.
With Scala 2.10 style futures you can make that completely asynchronous without your algorithms ever noticing it:
def computePartial(x: Int, y: Int) = {
  Thread.sleep((1000 * math.random).toInt)
  println(x + ", " + y)
  x * y
}
import scala.concurrent.future
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global // needed by future(...) and map
val partials: IndexedSeq[Future[Int]] = for {
  x <- 1 to 10
  y <- 1 to 10
} yield future(computePartial(x, y))
val futureResult: Future[Int] = Future.sequence(partials).map(_.fold(0)(_ + _))
def useResult(result: Int) = println(s"The sum is $result")
// now I want to use the result of my computation
futureResult map { result => // called when ready
  useResult(result)
}
// still no blocking
println("This is probably printed before the sum is calculated.")
So, computePartial does not need to know anything about how it is being executed. (It should not have any side-effects though, even though for the purpose of this example, a println side-effect was included.)
A possible computeFull method should manage the algorithm and as such know about Futures or parallelism. After all this is part of the algorithm.
(As for the Responder: Scala’s old futures use it so I don’t know where this is going. – And isn’t an execution context exactly the means of configuration you are looking for?)
A single actor in Akka does not know whether it runs in parallel or not; that is how Akka is designed. But if you don't want to rely on Akka you can use parallel collections like:
(for (i <- (0 until numberOfPartialComputations).par) yield computePartial(i)).sum
The sum is called on a parallel collection and is performed in parallel.