Comparing Haskell and Scala Bind/Flatmap Examples - scala

The following bind(>>=) code, in Haskell, does not compile:
ghci> [[1]] >>= Just
<interactive>:38:11:
Couldn't match type ‘Maybe’ with ‘[]’
Expected type: [t] -> [[t]]
Actual type: [t] -> Maybe [t]
In the second argument of ‘(>>=)’, namely ‘Just’
In the expression: [[1]] >>= Just
But, in Scala, it does actually compile and run:
scala> List( List(1) ).flatMap(x => Some(x) )
res1: List[List[Int]] = List(List(1))
Haskell's >>= signature is:
>>= :: Monad m => m a -> (a -> m b) -> m b
So, in [[1]] >>= f, f's type should be: a -> [b].
Why does the Scala code compile?

As #chi explained Scala's flatMap is more general than the Haskell's >>=. The full signature from the Scala docs is:
final def flatMap[B, That](f: (A) ⇒ GenTraversableOnce[B])(implicit bf: CanBuildFrom[List[A], B, That]): That
This implicit isn't relevant for this specific problem, so we could as well use the simpler definition:
final def flatMap[B](f: (A) ⇒ GenTraversableOnce[B]): List[B]
There is only one Problem, Option is no subclass of GenTraversableOnce, here an implicit conversion comes in. Scala defines an implicit conversion from Option to Iterable which is a subclass of Traversable which is a subclass of GenTraversableOnce.
implicit def option2Iterable[A](xo: Option[A]): Iterable[A]
The implicit is defined in the companion object of Option.
A simpler way to see the implicit at work is to assign a Option to an Iterable val:
scala> val i:Iterable[Int] = Some(1)
i: Iterable[Int] = List(1)
Scala uses some defaulting rules, to select List as the implementation of Iterable.
The fact that you can combine different subtypes of TraversableOnce with monad operations comes from the implicit class MonadOps:
implicit class MonadOps[+A](trav: TraversableOnce[A]) {
def map[B](f: A => B): TraversableOnce[B] = trav.toIterator map f
def flatMap[B](f: A => GenTraversableOnce[B]): TraversableOnce[B] = trav.toIterator flatMap f
def withFilter(p: A => Boolean) = trav.toIterator filter p
def filter(p: A => Boolean): TraversableOnce[A] = withFilter(p)
}
This enhances every TraversableOnce with the methods above. The subtypes are free to define more efficient versions on there own, these will shadow the implicit definitions. This is the case for List.

Quoting from the Scala reference for List
final def flatMap[B](f: (A) ⇒ GenTraversableOnce[B]): List[B]
So, flatMap is more general than Haskell's (>>=), since it only requires the mapped function f to generate a traversable type, not necessarily a List.

Related

What's the relationship between PartialFunction and Function1

I'm learning Scala, and find from scala doc the definition of PartialFunction and Function1, as shown below:
trait PartialFunction[-A, +B] extends (A) ⇒ B
trait Function1[-T1, +R] extends AnyRef
Q1) My 1st question is: what's the type of (A) => B?
And, I know we can turn the PartialFunction to be a normal function by the lift method.
But Q2) what's the relationship between ParitialFunction and Function1?
It seems if some function parameter is of type Function1, we can pass a matching PartitionFunction to it, as shown below:
scala> val process = (f: Function1[String, Int]) => f("1024")
process: (String => Int) => Int = <function1>
scala> val pattern = "([0-9]+)".r
pattern: scala.util.matching.Regex = ([0-9]+)
scala> val str2int: PartialFunction[String, Int] = {
| case pattern(num) => num.toInt
| }
str2int: PartialFunction[String,Int] = <function1>
scala> accept(str2int)
res67: Int = 1024
Thanks!
A ⇒ B is syntax sugar for Function1[A, B]. Similarly, (A1, A2) ⇒ R is actually Function2[A1, A2, R], etc. all the way to 22 (completely arbitrary limit). The definition of PartialFunction is thus
trait PartialFunction[-A, +B] extends Function1[A, B]
Since a PartialFunction[A, B] is also a Function1[A, B], you can pass it into something that wants an A ⇒ B. The only reason we use ⇒ over FunctionN is aesthetic: it looks nicer. Indeed, since ⇒ isn't really a type name, we can't say something like:
type ApIntInt[T[_, _]] = T[Int, Int]
// ApIntInt[⇒] // Error: ⇒ is not a type and was not expected here
ApIntInt[Function1] // Fine: Function1 is a type, it has the right kind, so it works.
// ApIntInt[Function1] = Function1[Int, Int] = Int ⇒ Int
Since you're a beginner, you won't see this kind of stuff (higher kinds) for a long time yet, but it is there, and you may well hit it someday.
When you use a PartialFunction as a Function1, if you pass a value at which it is not defined, it (probably) throws an exception, which is usually a MatchError (but doesn't have to be). By contrast, if you call pf.lift, that creates a Function[In, Option[Out]], which returns Some(result) if the PartialFunction is defined at a point, and returns None if it isn't, per the Scaladoc.
Ex:
lazy val factorial: PartialFunction[Int, Int] = {
case num if num > 1 => num * factorial(num - 1)
case 1 => 1
}
assert(!factorial.isDefinedAt(0))
factorial.apply(0) // Using a PF as a Function1 on an undefined point breaks (here with MatchError)
factorial.lift.apply(0) // This just returns None, because it checks isDefinedAt first

Compilation error when declaring Functor for Either even after using type projections

I am trying to write a Functor for Either for academic purposes in Scala. With help of higher-kinded types and type-projections, I managed to write an implementation for Either.
trait Functor[F[_]] {
def map[A, B](fa: F[A])(f: A => B): F[B]
}
object Functor {
implicit def eitherFunctor[A] = new Functor[({type λ[α] = Either[A, α]})#λ] {
override def map[B, C](fa: Either[A, B])(f: B => C) = fa.map(f)
}
}
def mapAll[F[_], A, B](fa: F[A])(f: A => B)(implicit fe: Functor[F]): F[B] = fe.map(fa)(f)
val right: Either[String, Int] = Right(2)
mapAll(right)(_ + 2)
Now, the code above does not compile. I am not sure of the reason but the compilation error that I am getting is given below -
Error:(19, 16) type mismatch;
found : Either[String,Int]
required: ?F[?A]
Note that implicit conversions are not applicable because they are ambiguous:
both method ArrowAssoc in object Predef of type [A](self: A)ArrowAssoc[A]
and method Ensuring in object Predef of type [A](self: A)Ensuring[A]
are possible conversion functions from Either[String,Int] to ?F[?A]
mapAll(right)(_ + 2)
Can someone point what I am not doing right in the code above?
PS: Please do not suggest me to use kind-projector.
You've just been bitten by SI-2712. If you're using Scala >= 2.12.2 just add this line to your build.sbt:
scalacOptions += "-Ypartial-unification"
For other Scala versions you can use this plugin.
Either[+A, +B] is expecting two type parameters(as #Ziyang Liu said), so for your example actually need BiFunctor not Functor, BiFunctor accept two functors and bound the two types.
there is a Bifunctor from Scalaz
trait Bifunctor[F[_, _]] { self =>
////
/** `map` over both type parameters. */
def bimap[A, B, C, D](fab: F[A, B])(f: A => C, g: B => D): F[C, D]
So you can use this Bifunctor like:
Bifunctor[Either].bimap(Right(1): Either[String, Int])(_.toUpperCase, _ + 1).println
Hope it's helpful for you.
As others said, what the compiler is trying to tell you is that the shapes of your types don't match. When you require an F[_], you're requiring a type constructor with a single type parameter, which Either[A, B] doesn't satisfy.
What we need to do is apply a type lambda when applying mapAll, same as we did when we created the instance of the Either functor:
val right: Either[String, Int] = Right(2)
mapAll[({type λ[α]=Either[String, α]})#λ, Int, Int](right)(_ + 2)
We've now squeezed in String and fixed it as the first argument, allowing the type projected type to only need to satisfy our alpha, and now the shapes match.
Of course, we can also use a type alias which would free us from specifying any additional type information when applying mapAll:
type MyEither[A] = Either[String, A]
val right: MyEither[Int] = Right(2)
mapAll(right)(_ + 2)

How does flatmap really work in Scala

I took the scala odersky course and thought that the function that Flatmap takes as arguments , takes an element of Monad and returns a monad of different type.
trait M[T] {
def flatMap[U](f: T => M[U]): M[U]
}
On Monad M[T] , the return type of function is also the same Monad , the type parameter U might be different.
However I have seen examples on internet , where the function returns a completely different Monad. I was under impression that return type of function should the same Monad. Can someone simplify the below to explain how flapmap results in the actual value instead of Option in the list.
Is the List not a Monad in Scala.
val l= List(1,2,3,4,5)
def f(x:int) = if (x>2) Some(x) else None
l.map(x=>f(x))
//Result List[Option[Int]] = List( None , None , Some(3) , Some(4) , Some(5))
l.flatMap(x=>f(x))
//Result: List(3,4,5)
Let's start from the fact that M[T]is not a monad by itself. It's a type constructor. It becomes a monad when it's associated with two operators: bind and return (or unit). There are also monad laws these operators must satisfy, but let's omit them for brevity. In Haskell the type of bind is:
class Monad m where
...
(>>=) :: m a -> (a -> m b) -> m b
where m is a type constructor. Since Scala is OO language bind will look like (first argument is self):
trait M[T] {
def bind[U](f: T => M[U]): M[U]
}
Here M === m, T === a, U === b. bind is often called flatMap. In a pure spherical world in a vacuum that would be a signature of flatMap in OO language. Scala is a very practical language, so the real signature of flatMap for List is:
final def flatMap[B, That](f: (A) ⇒ GenTraversableOnce[B])(implicit bf: CanBuildFrom[List[A], B, That]): That
It's not bind, but will work as a monadic bind if you provide f in the form of (A) => List[B] and also make sure that That is List[B]. On the other hand Scala is not going to watch your back if you provide something different, but will try to find some meaningful conversion (e.g. CanBuildFrom or something else) if it exists.
UPDATE
You can play with scalac flags (-Xlog-implicits, -Xlog-implicit-conversions) to see what's happening:
scala> List(1).flatMap { x => Some(x) }
<console>:1: inferred view from Some[Int] to scala.collection.GenTraversableOnce[?] via scala.this.Option.option2Iterable[Int]: (xo: Option[Int])Iterable[Int]
List(1).flatMap { x => Some(x) }
^
res1: List[Int] = List(1)
Hmm, perhaps confusingly, the signature you gave is not actually correct, since it's really (in simplified form):
def flatMap[B](f: (A) ⇒ GenTraversableOnce[B]): Traversable[B]
Since the compiler is open-source, you can actually see what it's doing (with its full signature):
def flatMap[B, That](f: A => GenTraversableOnce[B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
def builder = bf(repr) // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
val b = builder
for (x <- this) b ++= f(x).seq
b.result
}
So you can see that there is actually no requirement that the return type of f be the same as the return type of flatMap.
The flatmap found in the standard library is a much more general and flexible method than the monadic bind method like the flatMap from Odersky's example.
For example, the full signature of flatmap on List is
def flatMap[B, That](f: (A) ⇒ GenTraversableOnce[B])(implicit bf: CanBuildFrom[List[A], B, That]): That
Instead of requiring the function passed into flatmap to return a List, it is able to return any GenTraversableOnce object, a very generic type.
flatmap then uses the implicit CanBuildFrom mechanism to determine the appropriate type to return.
So when you use flatmap with a function that returns a List, it is a monadic bind operation, but it lets you use other types as well.

What are type lambdas in Scala and what are their benefits?

Sometime I stumble into the semi-mysterious notation of
def f[T](..) = new T[({type l[A]=SomeType[A,..]})#l] {..}
in Scala blog posts, which give it a "we used that type-lambda trick" handwave.
While I have some intutition about this (we gain an anonymous type parameter A without having to pollute the definition with it?), I found no clear source describing what the type lambda trick is, and what are its benefits. Is it just syntactic sugar, or does it open some new dimensions?
Type lambdas are vital quite a bit of the time when you are working with higher-kinded types.
Consider a simple example of defining a monad for the right projection of Either[A, B]. The monad typeclass looks like this:
trait Monad[M[_]] {
def point[A](a: A): M[A]
def bind[A, B](m: M[A])(f: A => M[B]): M[B]
}
Now, Either is a type constructor of two arguments, but to implement Monad, you need to give it a type constructor of one argument. The solution to this is to use a type lambda:
class EitherMonad[A] extends Monad[({type λ[α] = Either[A, α]})#λ] {
def point[B](b: B): Either[A, B]
def bind[B, C](m: Either[A, B])(f: B => Either[A, C]): Either[A, C]
}
This is an example of currying in the type system - you have curried the type of Either, such that when you want to create an instance of EitherMonad, you have to specify one of the types; the other of course is supplied at the time you call point or bind.
The type lambda trick exploits the fact that an empty block in a type position creates an anonymous structural type. We then use the # syntax to get a type member.
In some cases, you may need more sophisticated type lambdas that are a pain to write out inline. Here's an example from my code from today:
// types X and E are defined in an enclosing scope
private[iteratee] class FG[F[_[_], _], G[_]] {
type FGA[A] = F[G, A]
type IterateeM[A] = IterateeT[X, E, FGA, A]
}
This class exists exclusively so that I can use a name like FG[F, G]#IterateeM to refer to the type of the IterateeT monad specialized to some transformer version of a second monad which is specialized to some third monad. When you start to stack, these kinds of constructs become very necessary. I never instantiate an FG, of course; it's just there as a hack to let me express what I want in the type system.
The benefits are exactly the same as those conferred by anonymous functions.
def inc(a: Int) = a + 1; List(1, 2, 3).map(inc)
List(1, 2, 3).map(a => a + 1)
An example usage, with Scalaz 7. We want to use a Functor that can map a function over the second element in a Tuple2.
type IntTuple[+A]=(Int, A)
Functor[IntTuple].map((1, 2))(a => a + 1)) // (1, 3)
Functor[({type l[a] = (Int, a)})#l].map((1, 2))(a => a + 1)) // (1, 3)
Scalaz provides some implicit conversions that can infer the type argument to Functor, so we often avoid writing these altogether. The previous line can be rewritten as:
(1, 2).map(a => a + 1) // (1, 3)
If you use IntelliJ, you can enable Settings, Code Style, Scala, Folding, Type Lambdas. This then hides the crufty parts of the syntax, and presents the more palatable:
Functor[[a]=(Int, a)].map((1, 2))(a => a + 1)) // (1, 3)
A future version of Scala might directly support such a syntax.
To put things in context: This answer was originally posted in another thread. You are seeing it here because the two threads have been merged. The question statement in the said thread was as follows:
How to resolve this type definition: Pure[({type ?[a]=(R, a)})#?] ?
What are the reasons of using such construction?
Snipped comes from scalaz library:
trait Pure[P[_]] {
def pure[A](a: => A): P[A]
}
object Pure {
import Scalaz._
//...
implicit def Tuple2Pure[R: Zero]: Pure[({type ?[a]=(R, a)})#?] = new Pure[({type ?[a]=(R, a)})#?] {
def pure[A](a: => A) = (Ø, a)
}
//...
}
Answer:
trait Pure[P[_]] {
def pure[A](a: => A): P[A]
}
The one underscore in the boxes after P implies that it is a type constructor takes one type and returns another type. Examples of type constructors with this kind: List, Option.
Give List an Int, a concrete type, and it gives you List[Int], another concrete type. Give List a String and it gives you List[String]. Etc.
So, List, Option can be considered to be type level functions of arity 1. Formally we say, they have a kind * -> *. The asterisk denotes a type.
Now Tuple2[_, _] is a type constructor with kind (*, *) -> * i.e. you need to give it two types to get a new type.
Since their signatures do not match, you cannot substitute Tuple2 for P. What you need to do is partially apply Tuple2 on one of its arguments, which will give us a type constructor with kind * -> *, and we can substitue it for P.
Unfortunately Scala has no special syntax for partial application of type constructors, and so we have to resort to the monstrosity called type lambdas. (What you have in your example.) They are called that because they are analogous to lambda expressions that exist at value level.
The following example might help:
// VALUE LEVEL
// foo has signature: (String, String) => String
scala> def foo(x: String, y: String): String = x + " " + y
foo: (x: String, y: String)String
// world wants a parameter of type String => String
scala> def world(f: String => String): String = f("world")
world: (f: String => String)String
// So we use a lambda expression that partially applies foo on one parameter
// to yield a value of type String => String
scala> world(x => foo("hello", x))
res0: String = hello world
// TYPE LEVEL
// Foo has a kind (*, *) -> *
scala> type Foo[A, B] = Map[A, B]
defined type alias Foo
// World wants a parameter of kind * -> *
scala> type World[M[_]] = M[Int]
defined type alias World
// So we use a lambda lambda that partially applies Foo on one parameter
// to yield a type of kind * -> *
scala> type X[A] = World[({ type M[A] = Foo[String, A] })#M]
defined type alias X
// Test the equality of two types. (If this compiles, it means they're equal.)
scala> implicitly[X[Int] =:= Foo[String, Int]]
res2: =:=[X[Int],Foo[String,Int]] = <function1>
Edit:
More value level and type level parallels.
// VALUE LEVEL
// Instead of a lambda, you can define a named function beforehand...
scala> val g: String => String = x => foo("hello", x)
g: String => String = <function1>
// ...and use it.
scala> world(g)
res3: String = hello world
// TYPE LEVEL
// Same applies at type level too.
scala> type G[A] = Foo[String, A]
defined type alias G
scala> implicitly[X =:= Foo[String, Int]]
res5: =:=[X,Foo[String,Int]] = <function1>
scala> type T = World[G]
defined type alias T
scala> implicitly[T =:= Foo[String, Int]]
res6: =:=[T,Foo[String,Int]] = <function1>
In the case you have presented, the type parameter R is local to function Tuple2Pure and so you cannot simply define type PartialTuple2[A] = Tuple2[R, A], because there is simply no place where you can put that synonym.
To deal with such a case, I use the following trick that makes use of type members. (Hopefully the example is self-explanatory.)
scala> type Partial2[F[_, _], A] = {
| type Get[B] = F[A, B]
| }
defined type alias Partial2
scala> implicit def Tuple2Pure[R]: Pure[Partial2[Tuple2, R]#Get] = sys.error("")
Tuple2Pure: [R]=> Pure[[B](R, B)]

Scala 2.8 breakOut

In Scala 2.8, there is an object in scala.collection.package.scala:
def breakOut[From, T, To](implicit b : CanBuildFrom[Nothing, T, To]) =
new CanBuildFrom[From, T, To] {
def apply(from: From) = b.apply() ; def apply() = b.apply()
}
I have been told that this results in:
> import scala.collection.breakOut
> val map : Map[Int,String] = List("London", "Paris").map(x => (x.length, x))(breakOut)
map: Map[Int,String] = Map(6 -> London, 5 -> Paris)
What is going on here? Why is breakOut being called as an argument to my List?
The answer is found on the definition of map:
def map[B, That](f : (A) => B)(implicit bf : CanBuildFrom[Repr, B, That]) : That
Note that it has two parameters. The first is your function and the second is an implicit. If you do not provide that implicit, Scala will choose the most specific one available.
About breakOut
So, what's the purpose of breakOut? Consider the example given for the question, You take a list of strings, transform each string into a tuple (Int, String), and then produce a Map out of it. The most obvious way to do that would produce an intermediary List[(Int, String)] collection, and then convert it.
Given that map uses a Builder to produce the resulting collection, wouldn't it be possible to skip the intermediary List and collect the results directly into a Map? Evidently, yes, it is. To do so, however, we need to pass a proper CanBuildFrom to map, and that is exactly what breakOut does.
Let's look, then, at the definition of breakOut:
def breakOut[From, T, To](implicit b : CanBuildFrom[Nothing, T, To]) =
new CanBuildFrom[From, T, To] {
def apply(from: From) = b.apply() ; def apply() = b.apply()
}
Note that breakOut is parameterized, and that it returns an instance of CanBuildFrom. As it happens, the types From, T and To have already been inferred, because we know that map is expecting CanBuildFrom[List[String], (Int, String), Map[Int, String]]. Therefore:
From = List[String]
T = (Int, String)
To = Map[Int, String]
To conclude let's examine the implicit received by breakOut itself. It is of type CanBuildFrom[Nothing,T,To]. We already know all these types, so we can determine that we need an implicit of type CanBuildFrom[Nothing,(Int,String),Map[Int,String]]. But is there such a definition?
Let's look at CanBuildFrom's definition:
trait CanBuildFrom[-From, -Elem, +To]
extends AnyRef
So CanBuildFrom is contra-variant on its first type parameter. Because Nothing is a bottom class (ie, it is a subclass of everything), that means any class can be used in place of Nothing.
Since such a builder exists, Scala can use it to produce the desired output.
About Builders
A lot of methods from Scala's collections library consists of taking the original collection, processing it somehow (in the case of map, transforming each element), and storing the results in a new collection.
To maximize code reuse, this storing of results is done through a builder (scala.collection.mutable.Builder), which basically supports two operations: appending elements, and returning the resulting collection. The type of this resulting collection will depend on the type of the builder. Thus, a List builder will return a List, a Map builder will return a Map, and so on. The implementation of the map method need not concern itself with the type of the result: the builder takes care of it.
On the other hand, that means that map needs to receive this builder somehow. The problem faced when designing Scala 2.8 Collections was how to choose the best builder possible. For example, if I were to write Map('a' -> 1).map(_.swap), I'd like to get a Map(1 -> 'a') back. On the other hand, a Map('a' -> 1).map(_._1) can't return a Map (it returns an Iterable).
The magic of producing the best possible Builder from the known types of the expression is performed through this CanBuildFrom implicit.
About CanBuildFrom
To better explain what's going on, I'll give an example where the collection being mapped is a Map instead of a List. I'll go back to List later. For now, consider these two expressions:
Map(1 -> "one", 2 -> "two") map Function.tupled(_ -> _.length)
Map(1 -> "one", 2 -> "two") map (_._2)
The first returns a Map and the second returns an Iterable. The magic of returning a fitting collection is the work of CanBuildFrom. Let's consider the definition of map again to understand it.
The method map is inherited from TraversableLike. It is parameterized on B and That, and makes use of the type parameters A and Repr, which parameterize the class. Let's see both definitions together:
The class TraversableLike is defined as:
trait TraversableLike[+A, +Repr]
extends HasNewBuilder[A, Repr] with AnyRef
def map[B, That](f : (A) => B)(implicit bf : CanBuildFrom[Repr, B, That]) : That
To understand where A and Repr come from, let's consider the definition of Map itself:
trait Map[A, +B]
extends Iterable[(A, B)] with Map[A, B] with MapLike[A, B, Map[A, B]]
Because TraversableLike is inherited by all traits which extend Map, A and Repr could be inherited from any of them. The last one gets the preference, though. So, following the definition of the immutable Map and all the traits that connect it to TraversableLike, we have:
trait Map[A, +B]
extends Iterable[(A, B)] with Map[A, B] with MapLike[A, B, Map[A, B]]
trait MapLike[A, +B, +This <: MapLike[A, B, This] with Map[A, B]]
extends MapLike[A, B, This]
trait MapLike[A, +B, +This <: MapLike[A, B, This] with Map[A, B]]
extends PartialFunction[A, B] with IterableLike[(A, B), This] with Subtractable[A, This]
trait IterableLike[+A, +Repr]
extends Equals with TraversableLike[A, Repr]
trait TraversableLike[+A, +Repr]
extends HasNewBuilder[A, Repr] with AnyRef
If you pass the type parameters of Map[Int, String] all the way down the chain, we find that the types passed to TraversableLike, and, thus, used by map, are:
A = (Int,String)
Repr = Map[Int, String]
Going back to the example, the first map is receiving a function of type ((Int, String)) => (Int, Int) and the second map is receiving a function of type ((Int, String)) => String. I use the double parenthesis to emphasize it is a tuple being received, as that's the type of A as we saw.
With that information, let's consider the other types.
map Function.tupled(_ -> _.length):
B = (Int, Int)
map (_._2):
B = String
We can see that the type returned by the first map is Map[Int,Int], and the second is Iterable[String]. Looking at map's definition, it is easy to see that these are the values of That. But where do they come from?
If we look inside the companion objects of the classes involved, we see some implicit declarations providing them. On object Map:
implicit def canBuildFrom [A, B] : CanBuildFrom[Map, (A, B), Map[A, B]]
And on object Iterable, whose class is extended by Map:
implicit def canBuildFrom [A] : CanBuildFrom[Iterable, A, Iterable[A]]
These definitions provide factories for parameterized CanBuildFrom.
Scala will choose the most specific implicit available. In the first case, it was the first CanBuildFrom. In the second case, as the first did not match, it chose the second CanBuildFrom.
Back to the Question
Let's see the code for the question, List's and map's definition (again) to see how the types are inferred:
val map : Map[Int,String] = List("London", "Paris").map(x => (x.length, x))(breakOut)
sealed abstract class List[+A]
extends LinearSeq[A] with Product with GenericTraversableTemplate[A, List] with LinearSeqLike[A, List[A]]
trait LinearSeqLike[+A, +Repr <: LinearSeqLike[A, Repr]]
extends SeqLike[A, Repr]
trait SeqLike[+A, +Repr]
extends IterableLike[A, Repr]
trait IterableLike[+A, +Repr]
extends Equals with TraversableLike[A, Repr]
trait TraversableLike[+A, +Repr]
extends HasNewBuilder[A, Repr] with AnyRef
def map[B, That](f : (A) => B)(implicit bf : CanBuildFrom[Repr, B, That]) : That
The type of List("London", "Paris") is List[String], so the types A and Repr defined on TraversableLike are:
A = String
Repr = List[String]
The type for (x => (x.length, x)) is (String) => (Int, String), so the type of B is:
B = (Int, String)
The last unknown type, That is the type of the result of map, and we already have that as well:
val map : Map[Int,String] =
So,
That = Map[Int, String]
That means breakOut must, necessarily, return a type or subtype of CanBuildFrom[List[String], (Int, String), Map[Int, String]].
I'd like to build upon Daniel's answer. It was very thorough, but as noted in the comments, it doesn't explain what breakout does.
Taken from Re: Support for explicit Builders (2009-10-23), here is what I believe breakout does:
It gives the compiler a suggestion as to which Builder to choose implicitly (essentially it allows the compiler to choose which factory it thinks fits the situation best.)
For example, see the following:
scala> import scala.collection.generic._
import scala.collection.generic._
scala> import scala.collection._
import scala.collection._
scala> import scala.collection.mutable._
import scala.collection.mutable._
scala>
scala> def breakOut[From, T, To](implicit b : CanBuildFrom[Nothing, T, To]) =
| new CanBuildFrom[From, T, To] {
| def apply(from: From) = b.apply() ; def apply() = b.apply()
| }
breakOut: [From, T, To]
| (implicit b: scala.collection.generic.CanBuildFrom[Nothing,T,To])
| java.lang.Object with
| scala.collection.generic.CanBuildFrom[From,T,To]
scala> val l = List(1, 2, 3)
l: List[Int] = List(1, 2, 3)
scala> val imp = l.map(_ + 1)(breakOut)
imp: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 3, 4)
scala> val arr: Array[Int] = l.map(_ + 1)(breakOut)
imp: Array[Int] = Array(2, 3, 4)
scala> val stream: Stream[Int] = l.map(_ + 1)(breakOut)
stream: Stream[Int] = Stream(2, ?)
scala> val seq: Seq[Int] = l.map(_ + 1)(breakOut)
seq: scala.collection.mutable.Seq[Int] = ArrayBuffer(2, 3, 4)
scala> val set: Set[Int] = l.map(_ + 1)(breakOut)
seq: scala.collection.mutable.Set[Int] = Set(2, 4, 3)
scala> val hashSet: HashSet[Int] = l.map(_ + 1)(breakOut)
seq: scala.collection.mutable.HashSet[Int] = Set(2, 4, 3)
You can see the return type is implicitly chosen by the compiler to best match the expected type. Depending on how you declare the receiving variable, you get different results.
The following would be an equivalent way to specify a builder. Note in this case, the compiler will infer the expected type based on the builder's type:
scala> def buildWith[From, T, To](b : Builder[T, To]) =
| new CanBuildFrom[From, T, To] {
| def apply(from: From) = b ; def apply() = b
| }
buildWith: [From, T, To]
| (b: scala.collection.mutable.Builder[T,To])
| java.lang.Object with
| scala.collection.generic.CanBuildFrom[From,T,To]
scala> val a = l.map(_ + 1)(buildWith(Array.newBuilder[Int]))
a: Array[Int] = Array(2, 3, 4)
Daniel Sobral's answer is great, and should be read together with Architecture of Scala Collections (Chapter 25 of Programming in Scala).
I just wanted to elaborate on why it is called breakOut:
Why is it called breakOut?
Because we want to break out of one type and into another:
Break out of what type into what type? Lets look at the map function on Seq as an example:
Seq.map[B, That](f: (A) -> B)(implicit bf: CanBuildFrom[Seq[A], B, That]): That
If we wanted to build a Map directly from mapping over the elements of a sequence such as:
val x: Map[String, Int] = Seq("A", "BB", "CCC").map(s => (s, s.length))
The compiler would complain:
error: type mismatch;
found : Seq[(String, Int)]
required: Map[String,Int]
The reason being that Seq only knows how to build another Seq (i.e. there is an implicit CanBuildFrom[Seq[_], B, Seq[B]] builder factory available, but there is NO builder factory from Seq to Map).
In order to compile, we need to somehow breakOut of the type requirement, and be able to construct a builder that produces a Map for the map function to use.
As Daniel has explained, breakOut has the following signature:
def breakOut[From, T, To](implicit b: CanBuildFrom[Nothing, T, To]): CanBuildFrom[From, T, To] =
// can't just return b because the argument to apply could be cast to From in b
new CanBuildFrom[From, T, To] {
def apply(from: From) = b.apply()
def apply() = b.apply()
}
Nothing is a subclass of all classes, so any builder factory can be substituted in place of implicit b: CanBuildFrom[Nothing, T, To]. If we used the breakOut function to provide the implicit parameter:
val x: Map[String, Int] = Seq("A", "BB", "CCC").map(s => (s, s.length))(collection.breakOut)
It would compile, because breakOut is able to provide the required type of CanBuildFrom[Seq[(String, Int)], (String, Int), Map[String, Int]], while the compiler is able to find an implicit builder factory of type CanBuildFrom[Map[_, _], (A, B), Map[A, B]], in place of CanBuildFrom[Nothing, T, To], for breakOut to use to create the actual builder.
Note that CanBuildFrom[Map[_, _], (A, B), Map[A, B]] is defined in Map, and simply initiates a MapBuilder which uses an underlying Map.
Hope this clears things up.
A simple example to understand what breakOut does:
scala> import collection.breakOut
import collection.breakOut
scala> val set = Set(1, 2, 3, 4)
set: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)
scala> set.map(_ % 2)
res0: scala.collection.immutable.Set[Int] = Set(1, 0)
scala> val seq:Seq[Int] = set.map(_ % 2)(breakOut)
seq: Seq[Int] = Vector(1, 0, 1, 0) // map created a Seq[Int] instead of the default Set[Int]