The Either class seems useful and the ways of using it are pretty obvious. But then I look at the API documentation and I'm baffled:
def joinLeft [A1 >: A, B1 >: B, C] (implicit ev: <:<[A1, Either[C, B1]]):
Either[C, B1]
Joins an Either through Left.
def joinRight [A1 >: A, B1 >: B, C] (implicit ev: <:<[B1, Either[A1, C]]):
Either[A1, C]
Joins an Either through Right.
def left : LeftProjection[A, B]
Projects this Either as a Left.
def right : RightProjection[A, B]
Projects this Either as a Right.
What do I do with a projection and how do I even invoke the joins?
Google just points me to the API documentation.
This might just be a case of "paying no attention to the man behind the curtain", but I don't think so. I think this is important.
left and right are the important ones. Either is useful without projections (mostly you do pattern matching), but projections are quite worthy of attention, as they give a much richer API. You will use joins much less.
Either is often used to mean "a proper value or an error". In this respect, it is like an extended Option . When there is no data, instead of None, you have an error.
Option has a rich API. The same can be made available on Either, provided we know, in Either, which one is the result and which one is the error.
left and right projection says just that. It is the Either, plus the added knowledge that the value is respectively at left or at right, and the other one is the error.
For instance, in Option, you can map, so opt.map(f) returns an Option with f applied to the value of opt if it has a one, and still None if opt was None. On a left projection, it will apply f on the value at left if it is a Left, and leave it unchanged if it is a Right. Observe the signatures:
In LeftProjection[A,B], map[C](f: A => C): Either[C,B]
In RightProjection[A,B], map[C](f: B => C): Either[A,C].
left and right are simply the way to say which side is considered the value when you want to use one of the usual API routines.
Alternatives could have been:
set a convention, as in Haskell, where there were strong syntactical reasons to put the value at right. When you want to apply a method on the other side (you may well want to change the error with a map for instance), do a swap before and after.
postfix method names with Left or Right (maybe just L and R). That would prevent using for comprehension. With for comprehensions (flatMap in fact, but the for notation is quite convenient) Either is an alternative to (checked) exceptions.
Now the joins. Left and Right means the same thing as for the projections, and they are closely related to flatMap. Consider joinLeft. The signature may be puzzling:
joinLeft [A1 >: A, B1 >: B, C] (implicit ev: <:<[A1, Either[C, B1]]):
Either[C, B1]
A1 and B1 are technically necessary, but not critical to the understanding, let's simplify
joinLeft[C](implicit ev: <:<[A, Either[C, B])
What the implicit means is that the method can only be called if A is an Either[C,B]. The method is not available on an Either[A,B] in general, but only on an Either[Either[C,B], B]. As with left projection, we consider that the value is at left (that would be right for joinRight). What the join does is flatten this (think flatMap). When one join, one does not care whether the error (B) is inside or outside, we just want Either[C,B]. So Left(Left(c)) yields Left(c), both Left(Right(b)) and Right(b) yield Right(b). The relation with flatMap is as follows:
joinLeft(e) = e.left.flatMap(identity)
e.left.flatMap(f) = e.left.map(f).joinLeft
The Option equivalent would work on an Option[Option[A]], Some(Some(x)) would yield Some(x) both Some(None) and None would yield None. It can be written o.flatMap(identity). Note that Option[A] is isomorphic to Either[A,Unit] (if you use left projections and joins) and also to Either[Unit, A] (using right projections).
Ignoring the joins for now, projections are a mechanism allowing you to use use an Either as a monad. Think of it as extracting either the left or right side into an Option, but without losing the other side
As always, this probably makes more sense with an example. So imagine you have an Either[Exception, Int] and want to convert the Exception to a String (if present)
val result = opReturningEither
val better = result.left map {_.getMessage}
This will map over the left side of result, giving you an Either[String,Int]
joinLeft and joinRight enable you to "flatten" a nested Either:
scala> val e: Either[Either[String, Int], Int] = Left(Left("foo"))
e: Either[Either[String,Int],Int] = Left(Left(foo))
scala> e.joinLeft
res2: Either[String,Int] = Left(foo)
Edit: My answer to this question shows one example of how you can use the projections, in this case to fold together a sequence of Eithers without pattern matching or calling isLeft or isRight. If you're familiar with how to use Option without matching or calling isDefined, it's analagous.
While curiously looking at the current source of Either, I saw that joinLeft and joinRight are implemented with pattern matching. However, I stumbled across this older version of the source and saw that it used to implement the join methods using projections:
def joinLeft[A, B](es: Either[Either[A, B], B]) =
es.left.flatMap(x => x)
My suggestion is add the following to your utility package:
implicit class EitherRichClass[A, B](thisEither: Either[A, B])
{
def map[C](f: B => C): Either[A, C] = thisEither match
{
case Left(l) => Left[A, C](l)
case Right(r) => Right[A, C](f(r))
}
def flatMap[C](f: B => Either[A, C]): Either[A, C] = thisEither match
{
case Left(l) => Left[A, C](l)
case Right(r) => (f(r))
}
}
In my experience the only useful provided method is fold. You don't really use isLeft or isRight in functional code. joinLeft and joinRight might be useful as flatten functions as explained by Dider Dupont but, I haven't had occasion to use them that way. The above is using Either as right biased, which I suspect is how most people use them. Its like an Option with an error value instead of None.
Here's some of my own code. Apologies its not polished code but its an example of using Either in a for comprehension. Adding the map and flatMap methods to Either allows us to use the special syntax in for comprehensions. Its parsing HTTP headers, either returning an Http and Html error page response or a parsed custom HTTP Request object. Without the use of the for comprehension the code would be very difficult to comprehend.
object getReq
{
def LeftError[B](str: String) = Left[HResponse, B](HttpError(str))
def apply(line1: String, in: java.io.BufferedReader): Either[HResponse, HttpReq] =
{
def loop(acc: Seq[(String, String)]): Either[HResponse, Seq[(String, String)]] =
{
val ln = in.readLine
if (ln == "")
Right(acc)
else
ln.splitOut(':', s => LeftError("400 Bad Syntax in Header Field"), (a, b) => loop(acc :+ Tuple2(a.toLowerCase, b)))
}
val words: Seq[String] = line1.lowerWords
for
{
a3 <- words match
{
case Seq("get", b, c) => Right[HResponse, (ReqType.Value, String, String)]((ReqType.HGet, b, c))
case Seq("post", b, c) => Right[HResponse, (ReqType.Value, String, String)]((ReqType.HPost, b, c))
case Seq(methodName, b, c) => LeftError("405" -- methodName -- "method not Allowed")
case _ => LeftError("400 Bad Request: Bad Syntax in Status Line")
}
val (reqType, target, version) = a3
fields <- loop(Nil)
val optLen = fields.find(_._1 == "content-length")
pair <- optLen match
{
case None => Right((0, fields))
case Some(("content-length", second)) => second.filterNot(_.isWhitespace) match
{
case s if s.forall(_.isDigit) => Right((s.toInt, fields.filterNot(_._1 == "content-length")))
case s => LeftError("400 Bad Request: Bad Content-Length SyntaxLine")
}
}
val (bodyLen, otherHeaderPairs) = pair
val otherHeaderFields = otherHeaderPairs.map(pair => HeaderField(pair._1, pair._2))
val body = if (bodyLen > 0) (for (i <- 1 to bodyLen) yield in.read.toChar).mkString else ""
}
yield (HttpReq(reqType, target, version, otherHeaderFields, bodyLen, body))
}
}
Related
Say I have a function that can take an optional parameter, and I want to return a Some if the argument is None and a None if the argument is Some:
def foo(a: Option[A]): Option[B] = a match {
case Some(_) => None
case None => Some(makeB())
}
So what I want to do is kind of the inverse of map. The variants of orElse are not applicable, because they retain the value of a if it's present.
Is there a more concise way to do this than if (a.isDefined) None else Some(makeB())?
fold is more concise than pattern matching
val op:Option[B] = ...
val inv = op.fold(Option(makeB()))(_ => None)
Overview of this answer:
One-liner solution using fold
Little demo with the fold
Discussion of why the fold-solution could be just as "obvious" as the if-else-solution.
Solution
You can always use fold to transform Option[A] into whatever you want:
a.fold(Option(makeB())){_ => Option.empty[B]}
Demo
Here is a complete runnable example with all the necessary type definitions:
class A
class B
def makeB(): B = new B
def foo(a: Option[A]): Option[B] = a match {
case Some(_) => None
case None => Some(makeB())
}
def foo2(a: Option[A]): Option[B] =
a.fold(Option(makeB())){_ => Option.empty[B]}
println(foo(Some(new A)))
println(foo(None))
println(foo2(Some(new A)))
println(foo2(None))
This outputs:
None
Some(Main$$anon$1$B#5fdef03a)
None
Some(Main$$anon$1$B#48cf768c)
Why fold only seems less intuitive
In the comments, #TheArchetypalPaul has commented that fold seems "lot less obvious" than the if-else solution. I agree, but I still think that it might be interesting to reflect on the reasons why that is.
I think that this is mostly an artifact resulting from the presence of special if-else syntax for booleans.
If there were something like a standard
def ifNone[A, B](opt: Option[A])(e: => B) = new {
def otherwise[C >: B](f: A => C): C = opt.fold((e: C))(f)
}
syntax that can be used like this:
val optStr: Option[String] = Some("hello")
val reversed = ifNone(optStr) {
Some("makeB")
} otherwise {
str => None
}
and, more importantly, if this syntax was mentioned on the first page of every introduction to every programming language invented in the past half-century, then the ifNone-otherwise solution (that is, fold), would look much more natural to most people.
Indeed, the Option.fold method is the eliminator of the Option[T] type: whenever we have an Option[T] and want to get an A out of it, the most obvious thing to expect should be a fold(a)(b) with a: A and b: T => A. In contrast to the special treatment of booleans with the if-else-syntax (which is a mere convention), the fold method is very fundamental, the fact that it must be there can be derived from the first principles.
I've come up with this definition a.map(_ => None).getOrElse(Some(makeB())):
scala> def f[A](a: Option[A]) = a.map(_ => None).getOrElse(Some(makeB()))
f: [A](a: Option[A])Option[makeB]
scala> f(Some(44))
res104: Option[makeB] = None
scala> f(None)
res105: Option[makeB] = Some(makeB())
I think the most concise and clearest might be Option.when(a.isEmpty)(makeB)
I've just figured out that scala (I'm on 2.12) provides completely different implementations of foldRight for immutable list and mutable list.
Immutable list (List.scala):
override def foldRight[B](z: B)(op: (A, B) => B): B =
reverse.foldLeft(z)((right, left) => op(left, right))
Mutable list (LinearSeqOptimized.scala):
def foldRight[B](z: B)(#deprecatedName('f) op: (A, B) => B): B =
if (this.isEmpty) z
else op(head, tail.foldRight(z)(op))
Now I'm just curious.
Could you please explain me why was it implemented so differently?
The override in List seems to override the foldRight in LinearSeqOptimized. The implementation in LinearSeqOptimized
def foldRight[B](z: B)(#deprecatedName('f) op: (A, B) => B): B =
if (this.isEmpty) z
else op(head, tail.foldRight(z)(op))
looks exactly like the canonical definition of foldRight as a catamorphism from your average theory book. However, as was noticed in SI-2818, this implementation is not stack-safe (throws unexpected StackOverflowError for long lists). Therefore, it was replaced by a stack-safe reverse.foldLeft in this commit. The foldLeft is stack-safe, because it has been implemented by a while loop:
def foldLeft[B](z: B)(#deprecatedName('f) op: (B, A) => B): B = {
var acc = z
var these = this
while (!these.isEmpty) {
acc = op(acc, these.head)
these = these.tail
}
acc
}
That hopefully explains why it was overridden in List. It doesn't explain why it was not overridden in other classes. I guess it's simply because the mutable data structures are used less often and quite differently anyway (often as buffers and accumulators during the construction of immutable ones).
Hint: there is a blame button in the top right corner over every file on Github, so you can always track down what was changed when, by whom, and why.
I'm trying to write a generic invert method that takes a Map from keys of type A to values that are collections of type B and converts it to a Map with keys of type B and collections of A using the same original collection type.
My goal is to make this method a member of a MyMap[A,B] class that offers extensions of the basic library methods, where Maps are implicitly converted to MyMaps. I am able to do this implicit conversion for a generic map, but I want to further specify that the invert method should only work in the case where B is a collection.
I lack the understanding of scala's collections framework to accomplish this - I've scoured the net for thorough introductory explanations of the signatures that look like a hodgepodge of Repr, CC,That, and CanBuildFrom, but I don't really understand how all these pieces fit together well enough to construct the method signature on my own. Please don't just give me the working signature for this case - I want to understand how the signatures of methods that use generic collections work in a broader sense so I'm able to do this independently going forward. Alternatively, feel free to reference an online resource that elaborates on this - I was unable to find one that was comprehensive and clear.
EDIT
I seem to have gotten it to work with the following code. If I did something wrong or you see something that can be improved, please comment & answer with a more optimal alternative.
class MyMap[A, B](val _map: Map[A, B]) {
def invert[E, CC[E]](
implicit ev1: B =:= CC[E],
ev2: CC[E] <:< TraversableOnce[E],
cbf: CanBuildFrom[CC[A], A, CC[A]]
): Map[E, CC[A]] = {
val inverted = scala.collection.mutable.Map.empty[E, Builder[A, CC[A]]]
for {
(key, values) <- _map
value <- values.asInstanceOf[CC[E]]
} {
if (!inverted.contains(value)) {
inverted += (value -> cbf())
}
inverted.get(value).foreach(_ += key)
}
return inverted.map({ case (k,v) => (k -> v.result) }).toMap
}
}
I started from your code and ended up with this:
implicit class MyMap[A, B, C[B] <: Traversable[B]](val _map: Map[A, C[B]]) {
def invert(implicit cbf: CanBuildFrom[C[A], A, C[A]]): Map[B, C[A]] = {
val inverted = scala.collection.mutable.Map.empty[B, Builder[A, C[A]]]
for ((k, vs) <- _map; v <- vs) {
inverted.getOrElseUpdate(v, cbf()) += k
}
inverted.map({ case (k, v) => (k -> v.result)}).toMap
}
}
val map = Map("a"-> List(1,2,3), "b" -> List(1,2))
println(map.invert) //Map(2 -> List(a, b), 1 -> List(a, b), 3 -> List(a))
I want to fold a collection or Y's and return an Option[X]. I want to start with None. Like this...
def f(optX: Option[X], y: Y): Option[X]
val optX = collectionOfY.fold(None) { case (prev, y) => f(prev,y) }
adding unneeded types to make it clearer
val optX: Option[X] = collectionOfY.fold(None) { case (prev: Option[X], y: Y) => f(prev,y) }
However, the compiler can not figure out the type properly and I have to write it like this
val xx: Option[X] = None
val optX = collectionOfY.fold(xx) { case (prev, y) => f(prev,y) }
What is the magic Scala syntax to write this?
Thanks
Peter
Just use foldLeft and any of the following
... foldLeft(Option.empty[X]) ... or ... foldLeft(None: Option[X]) ... or ... foldLeft[Option[X]](None) ...
After all, fold just calls foldLeft. You only really want to use fold when your A1 really is a super-type of A, if that really is the case then you can use fold as above and the compiler will know the type correctly.
For example, Option[List[Int]] <: Option[Seq[Int]] by covariance so we don't get an Any here:
List(Some(List(1,2,3))).fold[Option[Seq[Int]]](None)((_, _) => Some(Seq(1)))
> res2: Option[Seq[Int]] = Some(List(1))
Finally, if you do indeed know Option[X] will be a super-type of Y then say this explicitly in the type declaration of Y - i.e. Y <: Option[X], then you can use fold with the solutions given above.
See When should .empty be used versus the singleton empty instance? for a related discussion.
As was pointed out in the comments above, the preferred solution is change the first parameter passed to Option.fold to avoid the use of None and use Option.empty[X] instead.
val optX = collectionOfY.fold(Option.empty[X]) { case (prev, y) => f(prev,y) }
The Scala compiler accepts this without complaint.
This behavior is logical, because fold is defined as
def fold[A1 >: A](z: A1)(op: (A1, A1) ⇒ A1): A1
that means the starting parameter is a supertype of A, but None is's a supertype of Option[T]. If you specify the type directly (like in your second example) then the compiler has enough information to figure out the return type of fold.
One possible workaround is to specify the result of interaction of head of your collection with None as a starting point, and fold it with the rest of the collection (adding a check for Nil):
val optX = collectionOfY.match {
case Nil => None
case x:xs => xs.fold(f(None,x)) {case (prev,y) => f(prev,y) }
}
A couple of weeks ago Dragisa Krsmanovic asked a question here about how to use the free monad in Scalaz 7 to avoid stack overflows in this situation (I've adapted his code a bit):
import scalaz._, Scalaz._
def setS(i: Int): State[List[Int], Unit] = modify(i :: _)
val s = (1 to 100000).foldLeft(state[List[Int], Unit](())) {
case (st, i) => st.flatMap(_ => setS(i))
}
s(Nil)
I thought that just lifting a trampoline into StateT should work:
import Free.Trampoline
val s = (1 to 100000).foldLeft(state[List[Int], Unit](()).lift[Trampoline]) {
case (st, i) => st.flatMap(_ => setS(i).lift[Trampoline])
}
s(Nil).run
But it still blows the stack, so I just posted it as a comment.
Dave Stevens just pointed out that sequencing with the applicative *> instead of the monadic flatMap actually works just fine:
val s = (1 to 100000).foldLeft(state[List[Int], Unit](()).lift[Trampoline]) {
case (st, i) => st *> setS(i).lift[Trampoline]
}
s(Nil).run
(Well, it's super slow of course, because that's the price you pay for doing anything interesting like this in Scala, but at least there's no stack overflow.)
What's going on here? I don't think there could be a principled reason for this difference, but really I have no idea what could be going on in the implementation and don't have time to dig around at the moment. But I'm curious and it would be cool if someone else knows.
Mandubian is correct, the flatMap of StateT doesn't allow you to bypass stack accumulation because of the creation of the new StateT immediately before calling the wrapped monad's bind (which would be a Free[Function0] in your case).
So Trampoline can't help, but the Free Monad over the functor for State is one way to ensure stack safety.
We want to go from State[List[Int],Unit] to Free[a[State[List[Int],a],Unit] and our flatMap call will be to Free's flatMap (that doesn't do anything other than create the Free data structure).
val s = (1 to 100000).foldLeft(
Free.liftF[({ type l[a] = State[List[Int],a]})#l,Unit](state[List[Int], Unit](()))) {
case (st, i) => st.flatMap(_ =>
Free.liftF[({ type l[a] = State[List[Int],a]})#l,Unit](setS(i)))
}
Now we have a Free data structure built that we can easily thread a state through as such:
s.foldRun(List[Int]())( (a,b) => b(a) )
Calling liftF is fairly ugly so I have a PR in to make it easier for State and Kleisli monads so hopefully in the future there won't need to be type lambdas.
Edit: PR accepted so now we have
val s = (1 to 100000).foldLeft(state[List[Int], Unit](()).liftF) {
case (st, i) => st.flatMap(_ => setS(i).liftF)
}
There is a principled intuition for this difference.
The applicative operator *> evaluates its left argument only for its side effects, and always ignores the result. This is similar (in some cases equivalent) to Haskell's >> function for monads. Here's the source for *>:
/** Combine `self` and `fb` according to `Apply[F]` with a function that discards the `A`s */
final def *>[B](fb: F[B]): F[B] = F.apply2(self,fb)((_,b) => b)
and Apply#apply2:
def apply2[A, B, C](fa: => F[A], fb: => F[B])(f: (A, B) => C): F[C] =
ap(fb)(map(fa)(f.curried))
In general, flatMap depends on the result of the left argument (it must, as it is the input for the function in the right argument). Even though in this specific case you are ignoring the left result, flatMap doesn't know that.
It seems likely, given your results, that the implementation for *> is optimized for the case where the result of the left argument is unneeded. However flatMap cannot perform this optimization and so each call grows the stack by retaining the unused left result.
It's possible that this could be optimized at the compiler (scalac) or JIT (HotSpot) level (Haskell's GHC certainly performs this optimization), but for now this seems like a missed optimization opportunity.
Just to add to the discussion...
In StateT, you have:
def flatMap[S3, B](f: A => IndexedStateT[F, S2, S3, B])(implicit F: Bind[F]): IndexedStateT[F, S1, S3, B] =
IndexedStateT(s => F.bind(apply(s)) {
case (s1, a) => f(a)(s1)
})
The apply(s) fixes the current state reference in the next state.
bind definition interpretes eagerly its parameters catching the reference because it requires it:
def bind[A, B](fa: F[A])(f: A => F[B]): F[B]
At the difference of ap which might not need to interprete one of its parameters:
def ap[A, B](fa: => F[A])(f: => F[A => B]): F[B]
With this code, the Trampoline can't help for StateT flatMap (and also map)...