Applicative vs. monadic combinators and the free monad in Scalaz


A couple of weeks ago Dragisa Krsmanovic asked a question here about how to use the free monad in Scalaz 7 to avoid stack overflows in this situation (I've adapted his code a bit):
import scalaz._, Scalaz._
def setS(i: Int): State[List[Int], Unit] = modify(i :: _)
val s = (1 to 100000).foldLeft(state[List[Int], Unit](())) {
case (st, i) => st.flatMap(_ => setS(i))
}
s(Nil)
I thought that just lifting a trampoline into StateT should work:
import Free.Trampoline
val s = (1 to 100000).foldLeft(state[List[Int], Unit](()).lift[Trampoline]) {
case (st, i) => st.flatMap(_ => setS(i).lift[Trampoline])
}
s(Nil).run
But it still blows the stack, so I just posted it as a comment.
Dave Stevens just pointed out that sequencing with the applicative *> instead of the monadic flatMap actually works just fine:
val s = (1 to 100000).foldLeft(state[List[Int], Unit](()).lift[Trampoline]) {
case (st, i) => st *> setS(i).lift[Trampoline]
}
s(Nil).run
(Well, it's super slow of course, because that's the price you pay for doing anything interesting like this in Scala, but at least there's no stack overflow.)
What's going on here? I don't think there could be a principled reason for this difference, but really I have no idea what could be going on in the implementation and don't have time to dig around at the moment. But I'm curious and it would be cool if someone else knows.

Mandubian is correct, the flatMap of StateT doesn't allow you to bypass stack accumulation because of the creation of the new StateT immediately before calling the wrapped monad's bind (which would be a Free[Function0] in your case).
So Trampoline can't help, but the Free Monad over the functor for State is one way to ensure stack safety.
We want to go from State[List[Int],Unit] to Free[({ type l[a] = State[List[Int],a] })#l, Unit], and our flatMap call will then be Free's flatMap (which does nothing other than build the Free data structure).
val s = (1 to 100000).foldLeft(
Free.liftF[({ type l[a] = State[List[Int],a]})#l,Unit](state[List[Int], Unit](()))) {
case (st, i) => st.flatMap(_ =>
Free.liftF[({ type l[a] = State[List[Int],a]})#l,Unit](setS(i)))
}
Now we have a Free data structure built that we can easily thread a state through as such:
s.foldRun(List[Int]())( (a,b) => b(a) )
Calling liftF is fairly ugly so I have a PR in to make it easier for State and Kleisli monads so hopefully in the future there won't need to be type lambdas.
Edit: PR accepted so now we have
val s = (1 to 100000).foldLeft(state[List[Int], Unit](()).liftF) {
case (st, i) => st.flatMap(_ => setS(i).liftF)
}

There is a principled intuition for this difference.
The applicative operator *> evaluates its left argument only for its side effects, and always ignores the result. This is similar (in some cases equivalent) to Haskell's >> function for monads. Here's the source for *>:
/** Combine `self` and `fb` according to `Apply[F]` with a function that discards the `A`s */
final def *>[B](fb: F[B]): F[B] = F.apply2(self,fb)((_,b) => b)
and Apply#apply2:
def apply2[A, B, C](fa: => F[A], fb: => F[B])(f: (A, B) => C): F[C] =
ap(fb)(map(fa)(f.curried))
In general, flatMap depends on the result of the left argument (it must, as it is the input for the function in the right argument). Even though in this specific case you are ignoring the left result, flatMap doesn't know that.
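To make the "effect is kept, value is discarded" reading concrete, here is a small sketch on plain Option that does not use Scalaz. The helper names apply2Opt, discardLeft and viaFlatMap are made up for illustration; they only mirror the shape of apply2 and *> quoted above and are not the Scalaz implementation.

```scala
// Hypothetical stand-in for Apply#apply2, specialised to Option.
def apply2Opt[A, B, C](fa: Option[A], fb: Option[B])(f: (A, B) => C): Option[C] =
  (fa, fb) match {
    case (Some(a), Some(b)) => Some(f(a, b))
    case _                  => None
  }

// Same shape as *>: combine both sides, keep only the right-hand value.
def discardLeft[A, B](fa: Option[A], fb: Option[B]): Option[B] =
  apply2Opt(fa, fb)((_, b) => b)

// Monadic version: the function ignores its argument, but flatMap cannot know that.
def viaFlatMap[A, B](fa: Option[A], fb: Option[B]): Option[B] =
  fa.flatMap(_ => fb)
```

For Option the two agree on results: with Some on the left both return the right-hand side, and a None on the left makes both None. The difference discussed here is about how the combinators are evaluated, not about what they return.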
It seems likely, given your results, that the implementation for *> is optimized for the case where the result of the left argument is unneeded. However flatMap cannot perform this optimization and so each call grows the stack by retaining the unused left result.
It's possible that this could be optimized at the compiler (scalac) or JIT (HotSpot) level (Haskell's GHC certainly performs this optimization), but for now this seems like a missed optimization opportunity.

Just to add to the discussion...
In StateT, you have:
def flatMap[S3, B](f: A => IndexedStateT[F, S2, S3, B])(implicit F: Bind[F]): IndexedStateT[F, S1, S3, B] =
IndexedStateT(s => F.bind(apply(s)) {
case (s1, a) => f(a)(s1)
})
The apply(s) fixes the current state reference in the next state.
The definition of bind evaluates its parameters eagerly, capturing that reference, because it needs the value:
def bind[A, B](fa: F[A])(f: A => F[B]): F[B]
This is unlike ap, which takes its parameters by name and so might not need to evaluate one of them:
def ap[A, B](fa: => F[A])(f: => F[A => B]): F[B]
With this code, the Trampoline can't help for StateT flatMap (and also map)...
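As a rough illustration of why the eager apply(s) matters, here is a sketch that uses only scala.util.control.TailCalls from the standard library rather than Scalaz. SafeState and push are hypothetical names; the one deliberate difference from the flatMap above is that the inner run is wrapped in tailcall, so it is suspended instead of being invoked on the current stack. Treat it as a sketch of the idea, not as a description of how Scalaz behaves.

```scala
import scala.util.control.TailCalls._

// Hypothetical minimal state monad whose steps return TailRec (not part of Scalaz).
final case class SafeState[S, A](run: S => TailRec[(S, A)]) {
  def flatMap[B](f: A => SafeState[S, B]): SafeState[S, B] =
    SafeState[S, B] { s =>
      // tailcall suspends run(s); nothing is evaluated until .result is called.
      tailcall(run(s)).flatMap { case (s1, a) => tailcall(f(a).run(s1)) }
    }
}

// Prepend i to the state, analogous to setS in the question.
def push(i: Int): SafeState[List[Int], Unit] =
  SafeState[List[Int], Unit](s => done((i :: s, ())))

// Same left-nested chain of flatMap calls as in the question.
val prog = (1 to 100000).foldLeft(SafeState[List[Int], Unit](s => done((s, ())))) {
  (st, i) => st.flatMap(_ => push(i))
}

// The TailRec interpreter runs the suspended steps in a loop.
val (finalState, _) = prog.run(Nil).result
```

Here finalState ends up holding all 100000 elements, with the last one pushed at the head.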

Related

Is there a concise way to "invert" an Option?

Say I have a function that can take an optional parameter, and I want to return a Some if the argument is None and a None if the argument is Some:
def foo(a: Option[A]): Option[B] = a match {
case Some(_) => None
case None => Some(makeB())
}
So what I want to do is kind of the inverse of map. The variants of orElse are not applicable, because they retain the value of a if it's present.
Is there a more concise way to do this than if (a.isDefined) None else Some(makeB())?

fold is more concise than pattern matching
val op:Option[B] = ...
val inv = op.fold(Option(makeB()))(_ => None)

Overview of this answer:
One-liner solution using fold
Little demo with the fold
Discussion of why the fold-solution could be just as "obvious" as the if-else-solution.
Solution
You can always use fold to transform Option[A] into whatever you want:
a.fold(Option(makeB())){_ => Option.empty[B]}
Demo
Here is a complete runnable example with all the necessary type definitions:
class A
class B
def makeB(): B = new B
def foo(a: Option[A]): Option[B] = a match {
case Some(_) => None
case None => Some(makeB())
}
def foo2(a: Option[A]): Option[B] =
a.fold(Option(makeB())){_ => Option.empty[B]}
println(foo(Some(new A)))
println(foo(None))
println(foo2(Some(new A)))
println(foo2(None))
This outputs:
None
Some(Main$$anon$1$B@5fdef03a)
None
Some(Main$$anon$1$B@48cf768c)
Why fold only seems less intuitive
In the comments, @TheArchetypalPaul has commented that fold seems a "lot less obvious" than the if-else solution. I agree, but I still think that it might be interesting to reflect on the reasons why that is.
I think that this is mostly an artifact resulting from the presence of special if-else syntax for booleans.
If there were something like a standard
def ifNone[A, B](opt: Option[A])(e: => B) = new {
def otherwise[C >: B](f: A => C): C = opt.fold((e: C))(f)
}
syntax that can be used like this:
val optStr: Option[String] = Some("hello")
val reversed = ifNone(optStr) {
Some("makeB")
} otherwise {
str => None
}
and, more importantly, if this syntax was mentioned on the first page of every introduction to every programming language invented in the past half-century, then the ifNone-otherwise solution (that is, fold), would look much more natural to most people.
Indeed, the Option.fold method is the eliminator of the Option[T] type: whenever we have an Option[T] and want to get an A out of it, the most obvious thing to expect should be a fold(a)(b) with a: A and b: T => A. In contrast to the special treatment of booleans with the if-else-syntax (which is a mere convention), the fold method is very fundamental, the fact that it must be there can be derived from the first principles.

I've come up with this definition a.map(_ => None).getOrElse(Some(makeB())):
scala> def f[A](a: Option[A]) = a.map(_ => None).getOrElse(Some(makeB()))
f: [A](a: Option[A])Option[makeB]
scala> f(Some(44))
res104: Option[makeB] = None
scala> f(None)
res105: Option[makeB] = Some(makeB())

I think the most concise and clearest might be Option.when(a.isEmpty)(makeB()), which is available since Scala 2.13.
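A minimal sketch of that suggestion, assuming Scala 2.13 or later. The case class B, the makeB body and the name invert are placeholders chosen so the result can be compared by value.

```scala
// Placeholder type and factory, only for illustration.
final case class B(id: Int)
def makeB(): B = B(1)

// Option.when(cond)(value) is Some(value) when cond holds, None otherwise.
// The second argument is by-name, so makeB() only runs when a is empty.
def invert[A](a: Option[A]): Option[B] =
  Option.when(a.isEmpty)(makeB())
```

invert(Some("x")) is None and invert(None) is Some(B(1)).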

Scala: different foldRight implementations in list

I've just figured out that scala (I'm on 2.12) provides completely different implementations of foldRight for immutable list and mutable list.
Immutable list (List.scala):
override def foldRight[B](z: B)(op: (A, B) => B): B =
reverse.foldLeft(z)((right, left) => op(left, right))
Mutable list (LinearSeqOptimized.scala):
def foldRight[B](z: B)(@deprecatedName('f) op: (A, B) => B): B =
if (this.isEmpty) z
else op(head, tail.foldRight(z)(op))
Now I'm just curious.
Could you please explain why it was implemented so differently?

The override in List seems to override the foldRight in LinearSeqOptimized. The implementation in LinearSeqOptimized
def foldRight[B](z: B)(@deprecatedName('f) op: (A, B) => B): B =
if (this.isEmpty) z
else op(head, tail.foldRight(z)(op))
looks exactly like the canonical definition of foldRight as a catamorphism from your average theory book. However, as was noticed in SI-2818, this implementation is not stack-safe (throws unexpected StackOverflowError for long lists). Therefore, it was replaced by a stack-safe reverse.foldLeft in this commit. The foldLeft is stack-safe, because it has been implemented by a while loop:
def foldLeft[B](z: B)(@deprecatedName('f) op: (B, A) => B): B = {
var acc = z
var these = this
while (!these.isEmpty) {
acc = op(acc, these.head)
these = these.tail
}
acc
}
That hopefully explains why it was overridden in List. It doesn't explain why it was not overridden in other classes. I guess it's simply because the mutable data structures are used less often and quite differently anyway (often as buffers and accumulators during the construction of immutable ones).
Hint: there is a blame button in the top right corner over every file on Github, so you can always track down what was changed when, by whom, and why.
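The two strategies can be compared side by side in a small standalone sketch. foldRightNaive and foldRightSafe are hypothetical free-standing functions that copy the shape of the two library implementations quoted above.

```scala
// Recursive definition: one stack frame per element, so very long lists may overflow.
def foldRightNaive[A, B](xs: List[A], z: B)(op: (A, B) => B): B = xs match {
  case Nil    => z
  case h :: t => op(h, foldRightNaive(t, z)(op))
}

// Reverse once, then fold from the left in a loop: constant stack depth.
def foldRightSafe[A, B](xs: List[A], z: B)(op: (A, B) => B): B =
  xs.reverse.foldLeft(z)((right, left) => op(left, right))

// A non-commutative operation shows that both associate to the right:
// 1 - (2 - (3 - 0)) = 2
val naive = foldRightNaive(List(1, 2, 3), 0)(_ - _)
val safe  = foldRightSafe(List(1, 2, 3), 0)(_ - _)

// The safe version also handles a list with a million elements.
val bigSum = foldRightSafe(List.range(0, 1000000), 0L)((a, acc) => a + acc)
```

The cost of the safe version is one extra traversal and an intermediate reversed list. Whether the naive version actually overflows on a given input depends on the JVM stack size.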

Cats: Non tail recursive tailRecM method for Monads

In cats, when a Monad is created using Monad trait, an implementation for method tailRecM should be provided.
I have a scenario below that I found impossible to provide a tail recursive implementation of tailRecM
sealed trait Tree[+A]
final case class Branch[A](left: Tree[A], right: Tree[A]) extends Tree[A]
final case class Leaf[A](value: A) extends Tree[A]
implicit val treeMonad = new Monad[Tree] {
override def pure[A](value: A): Tree[A] = Leaf(value)
override def flatMap[A, B](initial: Tree[A])(func: A => Tree[B]): Tree[B] =
initial match {
case Branch(l, r) => Branch(flatMap(l)(func), flatMap(r)(func))
case Leaf(value) => func(value)
}
//@tailrec
override def tailRecM[A, B](a: A)(func: (A) => Tree[Either[A, B]]): Tree[B] = {
func(a) match {
case Branch(l, r) =>
Branch(
flatMap(l) {
case Right(l) => pure(l)
case Left(l) => tailRecM(l)(func)
},
flatMap(r){
case Right(r) => pure(r)
case Left(r) => tailRecM(r)(func)
}
)
case Leaf(Left(value)) => tailRecM(value)(func)
case Leaf(Right(value)) => Leaf(value)
}
}
}
1) In the above example, how can this tailRecM method be used to optimize the flatMap method call? Is the implementation of flatMap overridden or modified by tailRecM at compile time?
2) If tailRecM is not tail recursive, as above, will it still be more efficient than using the original flatMap method?
Please share your thoughts.

Sometimes there is a way to replace the call stack with an explicit list.
Here toVisit keeps track of the branches that are waiting to be processed,
and toCollect keeps the branches that are waiting to be merged until the corresponding branch has finished processing.
override def tailRecM[A, B](a: A)(f: (A) => Tree[Either[A, B]]): Tree[B] = {
@tailrec
def go(toVisit: List[Tree[Either[A, B]]],
toCollect: List[Tree[B]]): List[Tree[B]] = toVisit match {
case (tree :: tail) =>
tree match {
case Branch(l, r) =>
l match {
case Branch(_, _) => go(l :: r :: tail, toCollect)
case Leaf(Left(a)) => go(f(a) :: r :: tail, toCollect)
case Leaf(Right(b)) => go(r :: tail, pure(b) +: toCollect)
}
case Leaf(Left(a)) => go(f(a) :: tail, toCollect)
case Leaf(Right(b)) =>
go(tail,
if (toCollect.isEmpty) pure(b) +: toCollect
else Branch(toCollect.head, pure(b)) :: toCollect.tail)
}
case Nil => toCollect
}
go(f(a) :: Nil, Nil).head
}
From a cats ticket on why to use tailRecM:
tailRecM won't blow the stack (like almost every JVM program it may OOM), for any of the Monads in cats.
and then
Without tailRecM (or recursive flatMap) being safe, libraries like
iteratee.io can't safely be written since they require monadic recursion.
and another ticket states that clients of cats.Monad should be aware that some monads don't have stacksafe tailRecM
tailRecM can still be used by those that are trying to get stack safety, so long as they understand that certain monads will not be able to give it to them

Relation between tailRecM and flatMap
To answer your first question, the following code is part of FlatMapLaws.scala, from cats-laws. It tests consistency between the flatMap and tailRecM methods.
/**
* It is possible to implement flatMap from tailRecM and map
* and it should agree with the flatMap implementation.
*/
def flatMapFromTailRecMConsistency[A, B](fa: F[A], fn: A => F[B]): IsEq[F[B]] = {
val tailRecMFlatMap = F.tailRecM[Option[A], B](Option.empty[A]) {
case None => F.map(fa) { a => Left(Some(a)) }
case Some(a) => F.map(fn(a)) { b => Right(b) }
}
F.flatMap(fa)(fn) <-> tailRecMFlatMap
}
This shows how to implement a flatMap from tailRecM and implicitly suggests that the compiler will not do such a thing automatically. It's up to the user of the Monad to decide when it makes sense to use tailRecM over flatMap.
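The same construction can be tried without cats. The sketch below hand-writes a tailRecM for Option and derives flatMap from it in the way the law does. tailRecMOption, flatMapViaTailRecM and countTo are hypothetical names; this is an illustration of the law's shape, not the cats instance.

```scala
import scala.annotation.tailrec

// Hand-written tailRecM for Option: loop while f yields Left, stop on Right or None.
@tailrec
def tailRecMOption[A, B](a: A)(f: A => Option[Either[A, B]]): Option[B] =
  f(a) match {
    case None              => None
    case Some(Left(next))  => tailRecMOption(next)(f)
    case Some(Right(done)) => Some(done)
  }

// flatMap derived from tailRecM and map: the first step reads fa,
// the second step applies fn to the value found there.
def flatMapViaTailRecM[A, B](fa: Option[A])(fn: A => Option[B]): Option[B] =
  tailRecMOption[Option[A], B](Option.empty[A]) {
    case None    => fa.map(a => Left(Some(a)))
    case Some(a) => fn(a).map(b => Right(b))
  }

// Monadic recursion that never grows the stack.
val countTo = tailRecMOption[Int, Int](0) { i =>
  if (i < 1000000) Some(Left(i + 1)) else Some(Right(i))
}
```

For Option the derived version agrees with the built-in flatMap, which is what the law asserts for any lawful instance.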
This blog has nice scala examples to explain when tailRecM comes in useful. It follows the PureScript article by Phil Freeman, which originally introduced the method.
It explains the downsides in using flatMap for monadic composition:
This characteristic of Scala limits the usefulness of monadic composition where flatMap can call monadic function f, which then can call flatMap etc..
In contrast with a tailRecM-based implementation:
This guarantees greater safety on the user of FlatMap typeclass, but it would mean that each the implementers of the instances would need to provide a safe tailRecM.
Many of the provided methods in cats leverage monadic composition. So, even if you don't use it directly, implementing tailRecM allows for more efficient composition with other monads.
Implementation for tree
In a different answer, @nazarii-bardiuk provides an implementation of tailRecM which is tail recursive, but does not pass the flatMap/tailRecM consistency test mentioned above. The tree structure is not properly rebuilt after recursion. A fixed version is below:
def tailRecM[A, B](arg: A)(func: A => Tree[Either[A, B]]): Tree[B] = {
@tailrec
def loop(toVisit: List[Tree[Either[A, B]]],
toCollect: List[Option[Tree[B]]]): List[Tree[B]] =
toVisit match {
case Branch(l, r) :: next =>
loop(l :: r :: next, None :: toCollect)
case Leaf(Left(value)) :: next =>
loop(func(value) :: next, toCollect)
case Leaf(Right(value)) :: next =>
loop(next, Some(pure(value)) :: toCollect)
case Nil =>
toCollect.foldLeft(Nil: List[Tree[B]]) { (acc, maybeTree) =>
maybeTree.map(_ :: acc).getOrElse {
val left :: right :: tail = acc
Branch(left, right) :: tail
}
}
}
loop(List(func(arg)), Nil).head
}
(gist with test)
You're probably aware, but your example (as well as the answer by @nazarii-bardiuk) is used in the book Scala with Cats by Noel Welsh and Dave Gurnell (highly recommended).

Use of underscore in function call with Try parameters

I'm trying to understand a particular use of the underscore in Scala, and I cannot understand the following piece of code:
class Test[T, S] {
def f1(f: T => S): Unit = f2(_ map f)
def f2(f: Try[T] => Try[S]): Unit = {}
}
How is the _ treated in this case? How does the T => S become Try[T] => Try[S]?

It seems you are reading it wrongly. Look at the type of f2(Try[T] => Try[S]):Unit.
Then looking into f1 we have f: T => S.
The _ in value position desugars to f2(g => g map f).
Let's see what we know so far:
1. f2(Try[T] => Try[S]):Unit
2. f: T => S
3. f2(g => g map f)
Given 1. and 3., we can infer that the type of g has to be Try[T]. map over Try[T] takes T => Something; here that function is f, which is T => S, so Something is S.
It may seem a bit hard to read now, but once you learn to distinguish between type position and value position, reading this type of code becomes trivial.
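The desugaring can be checked directly with concrete types. In this sketch Int and String stand in for T and S, and the names f, short and long are chosen only for illustration.

```scala
import scala.util.Try

// A plain function, playing the role of f: T => S.
val f: Int => String = i => s"n=$i"

// Placeholder syntax: the expected type Try[Int] => Try[String] tells the
// compiler that _ is a Try[Int].
val short: Try[Int] => Try[String] = _ map f

// The same function with the parameter written out.
val long: Try[Int] => Try[String] = g => g map f
```

Both values behave identically: a Success is mapped through f, and a Failure is passed along unchanged.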
Another thing to notice: def f2(f: Try[T] => Try[S]): Unit = {} is quite uninteresting and may be a bit of a distraction from your actual question.
I'd try to solve this like that: first forget the class you created. Now implement this (replace the ??? with a useful implementation):
object P1 {
def fmap[A, B](f: A => B): Try[A] => Try[B] = ???
}
For bonus points use the _ as the first char in your implementation.

What's the deal with all the Either cruft?

The Either class seems useful and the ways of using it are pretty obvious. But then I look at the API documentation and I'm baffled:
def joinLeft [A1 >: A, B1 >: B, C] (implicit ev: <:<[A1, Either[C, B1]]):
Either[C, B1]
Joins an Either through Left.
def joinRight [A1 >: A, B1 >: B, C] (implicit ev: <:<[B1, Either[A1, C]]):
Either[A1, C]
Joins an Either through Right.
def left : LeftProjection[A, B]
Projects this Either as a Left.
def right : RightProjection[A, B]
Projects this Either as a Right.
What do I do with a projection and how do I even invoke the joins?
Google just points me to the API documentation.
This might just be a case of "paying no attention to the man behind the curtain", but I don't think so. I think this is important.

left and right are the important ones. Either is useful without projections (mostly you do pattern matching), but projections are quite worthy of attention, as they give a much richer API. You will use joins much less.
Either is often used to mean "a proper value or an error". In this respect, it is like an extended Option. When there is no data, instead of None, you have an error.
Option has a rich API. The same can be made available on Either, provided we know, in Either, which one is the result and which one is the error.
The left and right projections say just that. A projection is the Either, plus the added knowledge that the value is at left or at right respectively, and that the other side is the error.
For instance, in Option, you can map, so opt.map(f) returns an Option with f applied to the value of opt if it has one, and still None if opt was None. On a left projection, it will apply f on the value at left if it is a Left, and leave it unchanged if it is a Right. Observe the signatures:
In LeftProjection[A,B], map[C](f: A => C): Either[C,B]
In RightProjection[A,B], map[C](f: B => C): Either[A,C].
left and right are simply the way to say which side is considered the value when you want to use one of the usual API routines.
Alternatives could have been:
set a convention, as in Haskell, where there were strong syntactical reasons to put the value at right. When you want to apply a method on the other side (you may well want to change the error with a map for instance), do a swap before and after.
postfix method names with Left or Right (maybe just L and R). That would prevent using for comprehension. With for comprehensions (flatMap in fact, but the for notation is quite convenient) Either is an alternative to (checked) exceptions.
Now the joins. Left and Right mean the same thing as for the projections, and the joins are closely related to flatMap. Consider joinLeft. The signature may be puzzling:
joinLeft [A1 >: A, B1 >: B, C] (implicit ev: <:<[A1, Either[C, B1]]):
Either[C, B1]
A1 and B1 are technically necessary, but not critical to the understanding, so let's simplify:
joinLeft[C](implicit ev: <:<[A, Either[C, B]])
What the implicit means is that the method can only be called if A is an Either[C,B]. The method is not available on an Either[A,B] in general, but only on an Either[Either[C,B], B]. As with left projection, we consider that the value is at left (that would be right for joinRight). What the join does is flatten this (think flatMap). When one joins, one does not care whether the error (B) is inside or outside, we just want Either[C,B]. So Left(Left(c)) yields Left(c), both Left(Right(b)) and Right(b) yield Right(b). The relation with flatMap is as follows:
joinLeft(e) = e.left.flatMap(identity)
e.left.flatMap(f) = e.left.map(f).joinLeft
The Option equivalent would work on an Option[Option[A]], Some(Some(x)) would yield Some(x) both Some(None) and None would yield None. It can be written o.flatMap(identity). Note that Option[A] is isomorphic to Either[A,Unit] (if you use left projections and joins) and also to Either[Unit, A] (using right projections).
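The three cases and the relation to flatMap can be checked with the standard library alone. The values e1, e2, e3 and the helper viaProjection are hypothetical names used only in this sketch.

```scala
// The three possible shapes of an Either[Either[String, Int], Int].
val e1: Either[Either[String, Int], Int] = Left(Left("foo"))
val e2: Either[Either[String, Int], Int] = Left(Right(1))
val e3: Either[Either[String, Int], Int] = Right(2)

// joinLeft written through the left projection, as in the relation above.
def viaProjection[C, B](e: Either[Either[C, B], B]): Either[C, B] =
  e.left.flatMap(x => x)

// Left(Left(c)) yields Left(c); Left(Right(b)) and Right(b) both yield Right(b).
val joined = List(e1, e2, e3).map(_.joinLeft)
```

joined is List(Left("foo"), Right(1), Right(2)), and viaProjection gives the same result on each of the three values.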

Ignoring the joins for now, projections are a mechanism allowing you to use an Either as a monad. Think of it as extracting either the left or right side into an Option, but without losing the other side.
As always, this probably makes more sense with an example. So imagine you have an Either[Exception, Int] and want to convert the Exception to a String (if present)
val result = opReturningEither
val better = result.left map {_.getMessage}
This will map over the left side of result, giving you an Either[String,Int]

joinLeft and joinRight enable you to "flatten" a nested Either:
scala> val e: Either[Either[String, Int], Int] = Left(Left("foo"))
e: Either[Either[String,Int],Int] = Left(Left(foo))
scala> e.joinLeft
res2: Either[String,Int] = Left(foo)
Edit: My answer to this question shows one example of how you can use the projections, in this case to fold together a sequence of Eithers without pattern matching or calling isLeft or isRight. If you're familiar with how to use Option without matching or calling isDefined, it's analogous.
While curiously looking at the current source of Either, I saw that joinLeft and joinRight are implemented with pattern matching. However, I stumbled across this older version of the source and saw that it used to implement the join methods using projections:
def joinLeft[A, B](es: Either[Either[A, B], B]) =
es.left.flatMap(x => x)

My suggestion is to add the following to your utility package:
implicit class EitherRichClass[A, B](thisEither: Either[A, B])
{
def map[C](f: B => C): Either[A, C] = thisEither match
{
case Left(l) => Left[A, C](l)
case Right(r) => Right[A, C](f(r))
}
def flatMap[C](f: B => Either[A, C]): Either[A, C] = thisEither match
{
case Left(l) => Left[A, C](l)
case Right(r) => (f(r))
}
}
In my experience the only useful provided method is fold. You don't really use isLeft or isRight in functional code. joinLeft and joinRight might be useful as flatten functions, as explained by Dider Dupont, but I haven't had occasion to use them that way. The above uses Either as right-biased, which I suspect is how most people use it. It's like an Option with an error value instead of None.
Here's some of my own code. Apologies that it's not polished, but it's an example of using Either in a for comprehension. Adding the map and flatMap methods to Either allows us to use the special syntax in for comprehensions. It's parsing HTTP headers, returning either an Http and Html error page response or a parsed custom HTTP Request object. Without the use of the for comprehension the code would be very difficult to comprehend.
object getReq
{
def LeftError[B](str: String) = Left[HResponse, B](HttpError(str))
def apply(line1: String, in: java.io.BufferedReader): Either[HResponse, HttpReq] =
{
def loop(acc: Seq[(String, String)]): Either[HResponse, Seq[(String, String)]] =
{
val ln = in.readLine
if (ln == "")
Right(acc)
else
ln.splitOut(':', s => LeftError("400 Bad Syntax in Header Field"), (a, b) => loop(acc :+ Tuple2(a.toLowerCase, b)))
}
val words: Seq[String] = line1.lowerWords
for
{
a3 <- words match
{
case Seq("get", b, c) => Right[HResponse, (ReqType.Value, String, String)]((ReqType.HGet, b, c))
case Seq("post", b, c) => Right[HResponse, (ReqType.Value, String, String)]((ReqType.HPost, b, c))
case Seq(methodName, b, c) => LeftError("405" -- methodName -- "method not Allowed")
case _ => LeftError("400 Bad Request: Bad Syntax in Status Line")
}
val (reqType, target, version) = a3
fields <- loop(Nil)
val optLen = fields.find(_._1 == "content-length")
pair <- optLen match
{
case None => Right((0, fields))
case Some(("content-length", second)) => second.filterNot(_.isWhitespace) match
{
case s if s.forall(_.isDigit) => Right((s.toInt, fields.filterNot(_._1 == "content-length")))
case s => LeftError("400 Bad Request: Bad Content-Length SyntaxLine")
}
}
val (bodyLen, otherHeaderPairs) = pair
val otherHeaderFields = otherHeaderPairs.map(pair => HeaderField(pair._1, pair._2))
val body = if (bodyLen > 0) (for (i <- 1 to bodyLen) yield in.read.toChar).mkString else ""
}
yield (HttpReq(reqType, target, version, otherHeaderFields, bodyLen, body))
}
}
