Related
I am currently reading Hutton's and Meijer's paper on parsing combinators in Haskell http://www.cs.nott.ac.uk/~pszgmh/monparsing.pdf. For the sake of it I am trying to implement them in scala. I would like to construct something easy to code, extend and also simple and elegant. I have come up with two solutions for the following haskell code
/* Haskell Code */
type Parser a = String -> [(a,String)]
result :: a -> Parser a
result v = \inp -> [(v,inp)]
zero :: Parser a
zero = \inp -> []
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)]
/* Scala Code */
object Hutton1 {
type Parser[A] = String => List[(A, String)]
def Result[A](v: A): Parser[A] = str => List((v, str))
def Zero[A]: Parser[A] = str => List()
def Character: Parser[Char] = str => if (str.isEmpty) List() else List((str.head, str.tail))
}
object Hutton2 {
trait Parser[A] extends (String => List[(A, String)])
case class Result[A](v: A) extends Parser[A] {
def apply(str: String) = List((v, str))
}
case object Zero extends Parser[T forSome {type T}] {
def apply(str: String) = List()
}
case object Character extends Parser[Char] {
def apply(str: String) = if (str.isEmpty) List() else List((str.head, str.tail))
}
}
object Hutton extends App {
object T1 {
import Hutton1._
def run = {
val r: List[(Int, String)] = Zero("test") ++ Result(5)("test")
println(r.map(x => x._1 + 1) == List(6))
println(Character("abc") == List(('a', "bc")))
}
}
object T2 {
import Hutton2._
def run = {
val r: List[(Int, String)] = Zero("test") ++ Result(5)("test")
println(r.map(x => x._1 + 1) == List(6))
println(Character("abc") == List(('a', "bc")))
}
}
T1.run
T2.run
}
Question 1
In Haskell, zero is a function value that can be used as it is, expessing all failed parsers whether they are of type Parser[Int] or Parser[String]. In scala we achieve the same by calling the function Zero (1st approach) but in this way I believe that I just generate a different function everytime Zero is called. Is this statement true? Is there a way to mitigate this?
Question 2
In the second approach, the Zero case object is extending Parser with the usage of existential types Parser[T forSome {type T}] . If I replace the type with Parser[_] I get the compile error
Error:(19, 28) class type required but Hutton2.Parser[_] found
case object Zero extends Parser[_] {
^
I thought these two expressions where equivalent. Is this the case?
Question 3
Which approach out of the two do you think that will yield better results in expressing the combinators in terms of elegance and simplicity?
I use scala 2.11.8
Note: I didn't compile it, but I know the problem and can propose two solutions.
The more Haskellish way would be to not use subtyping, but to define zero as a polymorphic value. In that style, I would propose to define parsers not as objects deriving from a function type, but as values of one case class:
final case class Parser[T](run: String => List[(T, String)])
def zero[T]: Parser[T] = Parser(...)
As shown by #Alec, yes, this will produce a new value every time, since a def is compiled to a method.
If you want to use subtyping, you need to make Parser covariant. Then you can give zero a bottom result type:
trait Parser[+A] extends (String => List[(A, String)])
case object Zero extends Parser[Nothing] {...}
These are in some way quite related; in system F_<:, which is the base of what Scala uses, the types _|_ (aka Nothing) and \/T <: Any. T behave the same (this hinted at in Types and Programming Languages, chapter 28). The two possibilities given here are a consequence of this fact.
With existentials I'm not so familiar with, but I think that while unbounded T forSome {type T} will behave like Nothing, Scala does not allow inhertance from an existential type.
Question 1
I think that you are right, and here is why: Zero1 below prints hello every time you use it. The solution, Zero2, involves using a val instead.
def Zero1[A]: Parser[A] = { println("hi"); str => List() }
val Zero2: Parser[Nothing] = str => List()
Question 2
No idea. I'm still just starting out with Scala. Hope someone answers this.
Question 3
The trait one will play better with Scala's for (since you can define custom flatMap and map), which turns out to be (somewhat) like Haskell's do. The following is all you need.
trait Parser[A] extends (String => List[(A, String)]) {
def flatMap[B](f: A => Parser[B]): Parser[B] = {
val p1 = this
new Parser[B] {
def apply(s1: String) = for {
(a,s2) <- p1(s1)
p2 = f(a)
(b,s3) <- p2(s2)
} yield (b,s3)
}
}
def map[B](f: A => B): Parser[B] = {
val p = this
new Parser[B] {
def apply(s1: String) = for ((a,s2) <- p(s1)) yield (f(a),s2)
}
}
}
Of course, to do anything interesting you need more parsers. I'll propose to you one simple parser combinator: Choice(p1: Parser[A], p2: Parser[A]): Parser[A] which tries both parsers. (And rewrite your existing parsers more to my style).
def choice[A](p1: Parser[A], p2: Parser[A]): Parser[A] = new Parser[A] {
def apply(s: String): List[(A,String)] = { p1(s) ++ p2(s) }
}
def unit[A](x: A): Parser[A] = new Parser[A] {
def apply(s: String): List[(A,String)] = List((x,s))
}
val character: Parser[Char] = new Parser[Char] {
def apply(s: String): List[(Char,String)] = List((s.head,s.tail))
}
Then, you can write something like the following:
val parser: Parser[(Char,Char)] = for {
x <- choice(unit('x'),char)
y <- char
} yield (x,y)
And calling parser("xyz") gives you List((('x','x'),"yz"), (('x','y'),"z")).
I know a lot of questions exist about type erasure and pattern matching on generic types, but I could not understand what should I do in my case from answers to those, and I could not explain it better in title.
Following code pieces are simplified to present my case.
So I have a trait
trait Feature[T] {
value T
def sub(other: Feature[T]): Double
}
// implicits for int,float,double etc to Feature with sub mapped to - function
...
Then I have a class
class Data(val features: IndexedSeq[Feature[_]]) {
def sub(other: Data): IndexedSeq[Double] = {
features.zip(other.features).map {
case(e1: Feature[t], e2: Feature[y]) => e1 sub e2.asInstanceOf[Feature[t]]
}
}
}
And I have a test case like this
case class TestFeature(val value: String) extends Feature[String] {
def sub(other: Feature[String]): Double = value.length - other.length
}
val testData1 = new Data(IndexedSeq(8, 8.3f, 8.232d, TestFeature("abcd"))
val testData2 = new Data(IndexedSeq(10, 10.1f, 10.123d, TestFeature("efg"))
testData1.sub(testData2).zipWithIndex.foreach {
case (res, 0) => res should be (8 - 10)
case (res, 1) => res should be (8.3f - 10.1f)
case (res, 2) => res should be (8.232d - 10.123d)
case (res, 3) => res should be (1)
}
This somehow works. If I try sub operation with instances of Data that have different types in same index of features, I get a ClassCastException. This actually satisfies my requirements, but if possible I would like to use Option instead of throwing an exception. How can I make following code work?
class Data(val features: IndexedSeq[Feature[_]]) {
def sub(other: Data): IndexedSeq[Double] = {
features.zip(other.features).map {
// of course this does not work, just to give idea
case(e1: Feature[t], e2: Feature[y]) if t == y => e1 sub e2.asInstanceOf[Feature[t]]
}
}
}
Also I am really inexperienced in Scala, so I would like to get feedback on this type of structure. Are there another ways to do this and which way would make most sense?
Generics don't exist at runtime, and an IndexedSeq[Feature[_]] has forgotten what the type parameter is even at compile time (#Jatin's answer won't allow you to construct a Data with a list of mixed types of Feature[_]). The easiest answer might be just to catch the exception (using catching and opt from scala.util.control.Exception). But, to answer the question as written:
You could check the classes at runtime:
case (e1: Feature[t], e2: Feature[y]) if e1.value.getClass ==
e2.value.getClass => ...
Or include the type information in the Feature:
trait Feature[T] {
val value: T
val valueType: ClassTag[T] // write classOf[T] in subclasses
def maybeSub(other: Feature[_]) = other.value match {
case valueType(v) => Some(actual subtraction)
case _ => None
}
}
The more complex "proper" solution is probably to use Shapeless HList to preserve the type information in your lists:
// note the type includes the type of all the elements
val l1: Feature[Int] :: Feature[String] :: HNil = f1 :: f2 :: HNil
val l2 = ...
// a 2-argument function that's defined for particular types
// this can be applied to `Feature[T], Feature[T]` for any `T`
object subtract extends Poly2 {
implicit def caseFeatureT[T] =
at[Feature[T], Feature[T]]{_ sub _}
}
// apply our function to the given HLists, getting a HList
// you would probably inline this
// could follow up with .toList[Double]
// since the resulting HList is going to be only Doubles
def subAll[L1 <: HList, L2 <: HList](l1: L1, l2: L2)(
implicit zw: ZipWith[L1, L2, subtract.type]) =
l1.zipWith(l2)(subtract)
That way subAll can only be called for l1 and l2 all of whose elements match, and this is enforced at compile time. (If you really want to do Options you can have two ats in the subtract, one for same-typed Feature[T]s and one for different-typed Feature[_]s, but ruling it out entirely seems like a better solution)
You could do something like this:
class Data[T: TypeTag](val features: IndexedSeq[Feature[T]]) {
val t = implicitly[TypeTag[T]]
def sub[E: TypeTag](other: Data[E]): IndexedSeq[Double] = {
val e = implicitly[TypeTag[E]]
features.zip(other.features).flatMap{
case(e1, e2: Feature[y]) if e.tpe == t.tpe => Some(e1 sub e2.asInstanceOf[Feature[T]])
case _ => None
}
}
}
And then:
case class IntFeature(val value: Int) extends Feature[Int] {
def sub(other: Feature[Int]): Double = value - other.value
}
val testData3 = new Data(IndexedSeq(TestFeature("abcd")))
val testData4 = new Data(IndexedSeq(IntFeature(1)))
println(testData3.sub(testData4).zipWithIndex)
gives Vector()
One of the most powerful patterns available in Scala is the enrich-my-library* pattern, which uses implicit conversions to appear to add methods to existing classes without requiring dynamic method resolution. For example, if we wished that all strings had the method spaces that counted how many whitespace characters they had, we could:
class SpaceCounter(s: String) {
def spaces = s.count(_.isWhitespace)
}
implicit def string_counts_spaces(s: String) = new SpaceCounter(s)
scala> "How many spaces do I have?".spaces
res1: Int = 5
Unfortunately, this pattern runs into trouble when dealing with generic collections. For example, a number of questions have been asked about grouping items sequentially with collections. There is nothing built in that works in one shot, so this seems an ideal candidate for the enrich-my-library pattern using a generic collection C and a generic element type A:
class SequentiallyGroupingCollection[A, C[A] <: Seq[A]](ca: C[A]) {
def groupIdentical: C[C[A]] = {
if (ca.isEmpty) C.empty[C[A]]
else {
val first = ca.head
val (same,rest) = ca.span(_ == first)
same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
}
}
}
except, of course, it doesn't work. The REPL tells us:
<console>:12: error: not found: value C
if (ca.isEmpty) C.empty[C[A]]
^
<console>:16: error: type mismatch;
found : Seq[Seq[A]]
required: C[C[A]]
same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
^
There are two problems: how do we get a C[C[A]] from an empty C[A] list (or from thin air)? And how do we get a C[C[A]] back from the same +: line instead of a Seq[Seq[A]]?
* Formerly known as pimp-my-library.
The key to understanding this problem is to realize that there are two different ways to build and work with collections in the collections library. One is the public collections interface with all its nice methods. The other, which is used extensively in creating the collections library, but which are almost never used outside of it, is the builders.
Our problem in enriching is exactly the same one that the collections library itself faces when trying to return collections of the same type. That is, we want to build collections, but when working generically, we don't have a way to refer to "the same type that the collection already is". So we need builders.
Now the question is: where do we get our builders from? The obvious place is from the collection itself. This doesn't work. We already decided, in moving to a generic collection, that we were going to forget the type of the collection. So even though the collection could return a builder that would generate more collections of the type we want, it wouldn't know what the type was.
Instead, we get our builders from CanBuildFrom implicits that are floating around. These exist specifically for the purpose of matching input and output types and giving you an appropriately typed builder.
So, we have two conceptual leaps to make:
We aren't using standard collections operations, we're using builders.
We get these builders from implicit CanBuildFroms, not from our collection directly.
Let's look at an example.
class GroupingCollection[A, C[A] <: Iterable[A]](ca: C[A]) {
import collection.generic.CanBuildFrom
def groupedWhile(p: (A,A) => Boolean)(
implicit cbfcc: CanBuildFrom[C[A],C[A],C[C[A]]], cbfc: CanBuildFrom[C[A],A,C[A]]
): C[C[A]] = {
val it = ca.iterator
val cca = cbfcc()
if (!it.hasNext) cca.result
else {
val as = cbfc()
var olda = it.next
as += olda
while (it.hasNext) {
val a = it.next
if (p(olda,a)) as += a
else { cca += as.result; as.clear; as += a }
olda = a
}
cca += as.result
}
cca.result
}
}
implicit def iterable_has_grouping[A, C[A] <: Iterable[A]](ca: C[A]) = {
new GroupingCollection[A,C](ca)
}
Let's take this apart. First, in order to build the collection-of-collections, we know we'll need to build two types of collections: C[A] for each group, and C[C[A]] that gathers all the groups together. Thus, we need two builders, one that takes As and builds C[A]s, and one that takes C[A]s and builds C[C[A]]s. Looking at the type signature of CanBuildFrom, we see
CanBuildFrom[-From, -Elem, +To]
which means that CanBuildFrom wants to know the type of collection we're starting with--in our case, it's C[A], and then the elements of the generated collection and the type of that collection. So we fill those in as implicit parameters cbfcc and cbfc.
Having realized this, that's most of the work. We can use our CanBuildFroms to give us builders (all you need to do is apply them). And one builder can build up a collection with +=, convert it to the collection it is supposed to ultimately be with result, and empty itself and be ready to start again with clear. The builders start off empty, which solves our first compile error, and since we're using builders instead of recursion, the second error also goes away.
One last little detail--other than the algorithm that actually does the work--is in the implicit conversion. Note that we use new GroupingCollection[A,C] not [A,C[A]]. This is because the class declaration was for C with one parameter, which it fills it itself with the A passed to it. So we just hand it the type C, and let it create C[A] out of it. Minor detail, but you'll get compile-time errors if you try another way.
Here, I've made the method a little bit more generic than the "equal elements" collection--rather, the method cuts the original collection apart whenever its test of sequential elements fails.
Let's see our method in action:
scala> List(1,2,2,2,3,4,4,4,5,5,1,1,1,2).groupedWhile(_ == _)
res0: List[List[Int]] = List(List(1), List(2, 2, 2), List(3), List(4, 4, 4),
List(5, 5), List(1, 1, 1), List(2))
scala> Vector(1,2,3,4,1,2,3,1,2,1).groupedWhile(_ < _)
res1: scala.collection.immutable.Vector[scala.collection.immutable.Vector[Int]] =
Vector(Vector(1, 2, 3, 4), Vector(1, 2, 3), Vector(1, 2), Vector(1))
It works!
The only problem is that we don't in general have these methods available for arrays, since that would require two implicit conversions in a row. There are several ways to get around this, including writing a separate implicit conversion for arrays, casting to WrappedArray, and so on.
Edit: My favored approach for dealing with arrays and strings and such is to make the code even more generic and then use appropriate implicit conversions to make them more specific again in such a way that arrays work also. In this particular case:
class GroupingCollection[A, C, D[C]](ca: C)(
implicit c2i: C => Iterable[A],
cbf: CanBuildFrom[C,C,D[C]],
cbfi: CanBuildFrom[C,A,C]
) {
def groupedWhile(p: (A,A) => Boolean): D[C] = {
val it = c2i(ca).iterator
val cca = cbf()
if (!it.hasNext) cca.result
else {
val as = cbfi()
var olda = it.next
as += olda
while (it.hasNext) {
val a = it.next
if (p(olda,a)) as += a
else { cca += as.result; as.clear; as += a }
olda = a
}
cca += as.result
}
cca.result
}
}
Here we've added an implicit that gives us an Iterable[A] from C--for most collections this will just be the identity (e.g. List[A] already is an Iterable[A]), but for arrays it will be a real implicit conversion. And, consequently, we've dropped the requirement that C[A] <: Iterable[A]--we've basically just made the requirement for <% explicit, so we can use it explicitly at will instead of having the compiler fill it in for us. Also, we have relaxed the restriction that our collection-of-collections is C[C[A]]--instead, it's any D[C], which we will fill in later to be what we want. Because we're going to fill this in later, we've pushed it up to the class level instead of the method level. Otherwise, it's basically the same.
Now the question is how to use this. For regular collections, we can:
implicit def collections_have_grouping[A, C[A]](ca: C[A])(
implicit c2i: C[A] => Iterable[A],
cbf: CanBuildFrom[C[A],C[A],C[C[A]]],
cbfi: CanBuildFrom[C[A],A,C[A]]
) = {
new GroupingCollection[A,C[A],C](ca)(c2i, cbf, cbfi)
}
where now we plug in C[A] for C and C[C[A]] for D[C]. Note that we do need the explicit generic types on the call to new GroupingCollection so it can keep straight which types correspond to what. Thanks to the implicit c2i: C[A] => Iterable[A], this automatically handles arrays.
But wait, what if we want to use strings? Now we're in trouble, because you can't have a "string of strings". This is where the extra abstraction helps: we can call D something that's suitable to hold strings. Let's pick Vector, and do the following:
val vector_string_builder = (
new CanBuildFrom[String, String, Vector[String]] {
def apply() = Vector.newBuilder[String]
def apply(from: String) = this.apply()
}
)
implicit def strings_have_grouping(s: String)(
implicit c2i: String => Iterable[Char],
cbfi: CanBuildFrom[String,Char,String]
) = {
new GroupingCollection[Char,String,Vector](s)(
c2i, vector_string_builder, cbfi
)
}
We need a new CanBuildFrom to handle the building of a vector of strings (but this is really easy, since we just need to call Vector.newBuilder[String]), and then we need to fill in all the types so that the GroupingCollection is typed sensibly. Note that we already have floating around a [String,Char,String] CanBuildFrom, so strings can be made from collections of chars.
Let's try it out:
scala> List(true,false,true,true,true).groupedWhile(_ == _)
res1: List[List[Boolean]] = List(List(true), List(false), List(true, true, true))
scala> Array(1,2,5,3,5,6,7,4,1).groupedWhile(_ <= _)
res2: Array[Array[Int]] = Array(Array(1, 2, 5), Array(3, 5, 6, 7), Array(4), Array(1))
scala> "Hello there!!".groupedWhile(_.isLetter == _.isLetter)
res3: Vector[String] = Vector(Hello, , there, !!)
As of this commit it's a lot easier to "enrich" Scala collections than it was when Rex gave his excellent answer. For simple cases it might look like this,
import scala.collection.generic.{ CanBuildFrom, FromRepr, HasElem }
import language.implicitConversions
class FilterMapImpl[A, Repr](val r : Repr)(implicit hasElem : HasElem[Repr, A]) {
def filterMap[B, That](f : A => Option[B])
(implicit cbf : CanBuildFrom[Repr, B, That]) : That = r.flatMap(f(_).toSeq)
}
implicit def filterMap[Repr : FromRepr](r : Repr) = new FilterMapImpl(r)
which adds a "same result type" respecting filterMap operation to all GenTraversableLikes,
scala> val l = List(1, 2, 3, 4, 5)
l: List[Int] = List(1, 2, 3, 4, 5)
scala> l.filterMap(i => if(i % 2 == 0) Some(i) else None)
res0: List[Int] = List(2, 4)
scala> val a = Array(1, 2, 3, 4, 5)
a: Array[Int] = Array(1, 2, 3, 4, 5)
scala> a.filterMap(i => if(i % 2 == 0) Some(i) else None)
res1: Array[Int] = Array(2, 4)
scala> val s = "Hello World"
s: String = Hello World
scala> s.filterMap(c => if(c >= 'A' && c <= 'Z') Some(c) else None)
res2: String = HW
And for the example from the question, the solution now looks like,
class GroupIdenticalImpl[A, Repr : FromRepr](val r: Repr)
(implicit hasElem : HasElem[Repr, A]) {
def groupIdentical[That](implicit cbf: CanBuildFrom[Repr,Repr,That]): That = {
val builder = cbf(r)
def group(r: Repr) : Unit = {
val first = r.head
val (same, rest) = r.span(_ == first)
builder += same
if(!rest.isEmpty)
group(rest)
}
if(!r.isEmpty) group(r)
builder.result
}
}
implicit def groupIdentical[Repr : FromRepr](r: Repr) = new GroupIdenticalImpl(r)
Sample REPL session,
scala> val l = List(1, 1, 2, 2, 3, 3, 1, 1)
l: List[Int] = List(1, 1, 2, 2, 3, 3, 1, 1)
scala> l.groupIdentical
res0: List[List[Int]] = List(List(1, 1),List(2, 2),List(3, 3),List(1, 1))
scala> val a = Array(1, 1, 2, 2, 3, 3, 1, 1)
a: Array[Int] = Array(1, 1, 2, 2, 3, 3, 1, 1)
scala> a.groupIdentical
res1: Array[Array[Int]] = Array(Array(1, 1),Array(2, 2),Array(3, 3),Array(1, 1))
scala> val s = "11223311"
s: String = 11223311
scala> s.groupIdentical
res2: scala.collection.immutable.IndexedSeq[String] = Vector(11, 22, 33, 11)
Again, note that the same result type principle has been observed in exactly the same way that it would have been had groupIdentical been directly defined on GenTraversableLike.
As of this commit the magic incantation is slightly changed from what it was when Miles gave his excellent answer.
The following works, but is it canonical? I hope one of the canons will correct it. (Or rather, cannons, one of the big guns.) If the view bound is an upper bound, you lose application to Array and String. It doesn't seem to matter if the bound is GenTraversableLike or TraversableLike; but IsTraversableLike gives you a GenTraversableLike.
import language.implicitConversions
import scala.collection.{ GenTraversable=>GT, GenTraversableLike=>GTL, TraversableLike=>TL }
import scala.collection.generic.{ CanBuildFrom=>CBF, IsTraversableLike=>ITL }
class GroupIdenticalImpl[A, R <% GTL[_,R]](val r: GTL[A,R]) {
def groupIdentical[That](implicit cbf: CBF[R, R, That]): That = {
val builder = cbf(r.repr)
def group(r: GTL[_,R]) {
val first = r.head
val (same, rest) = r.span(_ == first)
builder += same
if (!rest.isEmpty) group(rest)
}
if (!r.isEmpty) group(r)
builder.result
}
}
implicit def groupIdentical[A, R <% GTL[_,R]](r: R)(implicit fr: ITL[R]):
GroupIdenticalImpl[fr.A, R] =
new GroupIdenticalImpl(fr conversion r)
There's more than one way to skin a cat with nine lives. This version says that once my source is converted to a GenTraversableLike, as long as I can build the result from GenTraversable, just do that. I'm not interested in my old Repr.
class GroupIdenticalImpl[A, R](val r: GTL[A,R]) {
def groupIdentical[That](implicit cbf: CBF[GT[A], GT[A], That]): That = {
val builder = cbf(r.toTraversable)
def group(r: GT[A]) {
val first = r.head
val (same, rest) = r.span(_ == first)
builder += same
if (!rest.isEmpty) group(rest)
}
if (!r.isEmpty) group(r.toTraversable)
builder.result
}
}
implicit def groupIdentical[A, R](r: R)(implicit fr: ITL[R]):
GroupIdenticalImpl[fr.A, R] =
new GroupIdenticalImpl(fr conversion r)
This first attempt includes an ugly conversion of Repr to GenTraversableLike.
import language.implicitConversions
import scala.collection.{ GenTraversableLike }
import scala.collection.generic.{ CanBuildFrom, IsTraversableLike }
type GT[A, B] = GenTraversableLike[A, B]
type CBF[A, B, C] = CanBuildFrom[A, B, C]
type ITL[A] = IsTraversableLike[A]
class FilterMapImpl[A, Repr](val r: GenTraversableLike[A, Repr]) {
def filterMap[B, That](f: A => Option[B])(implicit cbf : CanBuildFrom[Repr, B, That]): That =
r.flatMap(f(_).toSeq)
}
implicit def filterMap[A, Repr](r: Repr)(implicit fr: ITL[Repr]): FilterMapImpl[fr.A, Repr] =
new FilterMapImpl(fr conversion r)
class GroupIdenticalImpl[A, R](val r: GT[A,R])(implicit fr: ITL[R]) {
def groupIdentical[That](implicit cbf: CBF[R, R, That]): That = {
val builder = cbf(r.repr)
def group(r0: R) {
val r = fr conversion r0
val first = r.head
val (same, other) = r.span(_ == first)
builder += same
val rest = fr conversion other
if (!rest.isEmpty) group(rest.repr)
}
if (!r.isEmpty) group(r.repr)
builder.result
}
}
implicit def groupIdentical[A, R](r: R)(implicit fr: ITL[R]):
GroupIdenticalImpl[fr.A, R] =
new GroupIdenticalImpl(fr conversion r)
One of the most powerful patterns available in Scala is the enrich-my-library* pattern, which uses implicit conversions to appear to add methods to existing classes without requiring dynamic method resolution. For example, if we wished that all strings had the method spaces that counted how many whitespace characters they had, we could:
class SpaceCounter(s: String) {
def spaces = s.count(_.isWhitespace)
}
implicit def string_counts_spaces(s: String) = new SpaceCounter(s)
scala> "How many spaces do I have?".spaces
res1: Int = 5
Unfortunately, this pattern runs into trouble when dealing with generic collections. For example, a number of questions have been asked about grouping items sequentially with collections. There is nothing built in that works in one shot, so this seems an ideal candidate for the enrich-my-library pattern using a generic collection C and a generic element type A:
class SequentiallyGroupingCollection[A, C[A] <: Seq[A]](ca: C[A]) {
def groupIdentical: C[C[A]] = {
if (ca.isEmpty) C.empty[C[A]]
else {
val first = ca.head
val (same,rest) = ca.span(_ == first)
same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
}
}
}
except, of course, it doesn't work. The REPL tells us:
<console>:12: error: not found: value C
if (ca.isEmpty) C.empty[C[A]]
^
<console>:16: error: type mismatch;
found : Seq[Seq[A]]
required: C[C[A]]
same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
^
There are two problems: how do we get a C[C[A]] from an empty C[A] list (or from thin air)? And how do we get a C[C[A]] back from the same +: line instead of a Seq[Seq[A]]?
* Formerly known as pimp-my-library.
The key to understanding this problem is to realize that there are two different ways to build and work with collections in the collections library. One is the public collections interface with all its nice methods. The other, which is used extensively in creating the collections library, but which are almost never used outside of it, is the builders.
Our problem in enriching is exactly the same one that the collections library itself faces when trying to return collections of the same type. That is, we want to build collections, but when working generically, we don't have a way to refer to "the same type that the collection already is". So we need builders.
Now the question is: where do we get our builders from? The obvious place is from the collection itself. This doesn't work. We already decided, in moving to a generic collection, that we were going to forget the type of the collection. So even though the collection could return a builder that would generate more collections of the type we want, it wouldn't know what the type was.
Instead, we get our builders from CanBuildFrom implicits that are floating around. These exist specifically for the purpose of matching input and output types and giving you an appropriately typed builder.
So, we have two conceptual leaps to make:
We aren't using standard collections operations, we're using builders.
We get these builders from implicit CanBuildFroms, not from our collection directly.
Let's look at an example.
class GroupingCollection[A, C[A] <: Iterable[A]](ca: C[A]) {
import collection.generic.CanBuildFrom
def groupedWhile(p: (A,A) => Boolean)(
implicit cbfcc: CanBuildFrom[C[A],C[A],C[C[A]]], cbfc: CanBuildFrom[C[A],A,C[A]]
): C[C[A]] = {
val it = ca.iterator
val cca = cbfcc()
if (!it.hasNext) cca.result
else {
val as = cbfc()
var olda = it.next
as += olda
while (it.hasNext) {
val a = it.next
if (p(olda,a)) as += a
else { cca += as.result; as.clear; as += a }
olda = a
}
cca += as.result
}
cca.result
}
}
implicit def iterable_has_grouping[A, C[A] <: Iterable[A]](ca: C[A]) = {
new GroupingCollection[A,C](ca)
}
Let's take this apart. First, in order to build the collection-of-collections, we know we'll need to build two types of collections: C[A] for each group, and C[C[A]] that gathers all the groups together. Thus, we need two builders, one that takes As and builds C[A]s, and one that takes C[A]s and builds C[C[A]]s. Looking at the type signature of CanBuildFrom, we see
CanBuildFrom[-From, -Elem, +To]
which means that CanBuildFrom wants to know the type of collection we're starting with--in our case, it's C[A], and then the elements of the generated collection and the type of that collection. So we fill those in as implicit parameters cbfcc and cbfc.
Having realized this, that's most of the work. We can use our CanBuildFroms to give us builders (all you need to do is apply them). And one builder can build up a collection with +=, convert it to the collection it is supposed to ultimately be with result, and empty itself and be ready to start again with clear. The builders start off empty, which solves our first compile error, and since we're using builders instead of recursion, the second error also goes away.
One last little detail--other than the algorithm that actually does the work--is in the implicit conversion. Note that we use new GroupingCollection[A,C] not [A,C[A]]. This is because the class declaration was for C with one parameter, which it fills it itself with the A passed to it. So we just hand it the type C, and let it create C[A] out of it. Minor detail, but you'll get compile-time errors if you try another way.
Here, I've made the method a little bit more generic than the "equal elements" collection--rather, the method cuts the original collection apart whenever its test of sequential elements fails.
Let's see our method in action:
scala> List(1,2,2,2,3,4,4,4,5,5,1,1,1,2).groupedWhile(_ == _)
res0: List[List[Int]] = List(List(1), List(2, 2, 2), List(3), List(4, 4, 4),
List(5, 5), List(1, 1, 1), List(2))
scala> Vector(1,2,3,4,1,2,3,1,2,1).groupedWhile(_ < _)
res1: scala.collection.immutable.Vector[scala.collection.immutable.Vector[Int]] =
Vector(Vector(1, 2, 3, 4), Vector(1, 2, 3), Vector(1, 2), Vector(1))
It works!
The only problem is that we don't in general have these methods available for arrays, since that would require two implicit conversions in a row. There are several ways to get around this, including writing a separate implicit conversion for arrays, casting to WrappedArray, and so on.
Edit: My favored approach for dealing with arrays and strings and such is to make the code even more generic and then use appropriate implicit conversions to make them more specific again in such a way that arrays work also. In this particular case:
class GroupingCollection[A, C, D[C]](ca: C)(
implicit c2i: C => Iterable[A],
cbf: CanBuildFrom[C,C,D[C]],
cbfi: CanBuildFrom[C,A,C]
) {
def groupedWhile(p: (A,A) => Boolean): D[C] = {
val it = c2i(ca).iterator
val cca = cbf()
if (!it.hasNext) cca.result
else {
val as = cbfi()
var olda = it.next
as += olda
while (it.hasNext) {
val a = it.next
if (p(olda,a)) as += a
else { cca += as.result; as.clear; as += a }
olda = a
}
cca += as.result
}
cca.result
}
}
Here we've added an implicit that gives us an Iterable[A] from C--for most collections this will just be the identity (e.g. List[A] already is an Iterable[A]), but for arrays it will be a real implicit conversion. And, consequently, we've dropped the requirement that C[A] <: Iterable[A]--we've basically just made the requirement for <% explicit, so we can use it explicitly at will instead of having the compiler fill it in for us. Also, we have relaxed the restriction that our collection-of-collections is C[C[A]]--instead, it's any D[C], which we will fill in later to be what we want. Because we're going to fill this in later, we've pushed it up to the class level instead of the method level. Otherwise, it's basically the same.
Now the question is how to use this. For regular collections, we can:
implicit def collections_have_grouping[A, C[A]](ca: C[A])(
implicit c2i: C[A] => Iterable[A],
cbf: CanBuildFrom[C[A],C[A],C[C[A]]],
cbfi: CanBuildFrom[C[A],A,C[A]]
) = {
new GroupingCollection[A,C[A],C](ca)(c2i, cbf, cbfi)
}
where now we plug in C[A] for C and C[C[A]] for D[C]. Note that we do need the explicit generic types on the call to new GroupingCollection so it can keep straight which types correspond to what. Thanks to the implicit c2i: C[A] => Iterable[A], this automatically handles arrays.
But wait, what if we want to use strings? Now we're in trouble, because you can't have a "string of strings". This is where the extra abstraction helps: we can call D something that's suitable to hold strings. Let's pick Vector, and do the following:
val vector_string_builder = (
new CanBuildFrom[String, String, Vector[String]] {
def apply() = Vector.newBuilder[String]
def apply(from: String) = this.apply()
}
)
implicit def strings_have_grouping(s: String)(
implicit c2i: String => Iterable[Char],
cbfi: CanBuildFrom[String,Char,String]
) = {
new GroupingCollection[Char,String,Vector](s)(
c2i, vector_string_builder, cbfi
)
}
We need a new CanBuildFrom to handle the building of a vector of strings (but this is really easy, since we just need to call Vector.newBuilder[String]), and then we need to fill in all the types so that the GroupingCollection is typed sensibly. Note that we already have floating around a [String,Char,String] CanBuildFrom, so strings can be made from collections of chars.
Let's try it out:
scala> List(true,false,true,true,true).groupedWhile(_ == _)
res1: List[List[Boolean]] = List(List(true), List(false), List(true, true, true))
scala> Array(1,2,5,3,5,6,7,4,1).groupedWhile(_ <= _)
res2: Array[Array[Int]] = Array(Array(1, 2, 5), Array(3, 5, 6, 7), Array(4), Array(1))
scala> "Hello there!!".groupedWhile(_.isLetter == _.isLetter)
res3: Vector[String] = Vector(Hello, , there, !!)
As of this commit it's a lot easier to "enrich" Scala collections than it was when Rex gave his excellent answer. For simple cases it might look like this,
import scala.collection.generic.{ CanBuildFrom, FromRepr, HasElem }
import language.implicitConversions
class FilterMapImpl[A, Repr](val r : Repr)(implicit hasElem : HasElem[Repr, A]) {
def filterMap[B, That](f : A => Option[B])
(implicit cbf : CanBuildFrom[Repr, B, That]) : That = r.flatMap(f(_).toSeq)
}
implicit def filterMap[Repr : FromRepr](r : Repr) = new FilterMapImpl(r)
which adds a "same result type" respecting filterMap operation to all GenTraversableLikes,
scala> val l = List(1, 2, 3, 4, 5)
l: List[Int] = List(1, 2, 3, 4, 5)
scala> l.filterMap(i => if(i % 2 == 0) Some(i) else None)
res0: List[Int] = List(2, 4)
scala> val a = Array(1, 2, 3, 4, 5)
a: Array[Int] = Array(1, 2, 3, 4, 5)
scala> a.filterMap(i => if(i % 2 == 0) Some(i) else None)
res1: Array[Int] = Array(2, 4)
scala> val s = "Hello World"
s: String = Hello World
scala> s.filterMap(c => if(c >= 'A' && c <= 'Z') Some(c) else None)
res2: String = HW
And for the example from the question, the solution now looks like,
class GroupIdenticalImpl[A, Repr : FromRepr](val r: Repr)
(implicit hasElem : HasElem[Repr, A]) {
def groupIdentical[That](implicit cbf: CanBuildFrom[Repr,Repr,That]): That = {
val builder = cbf(r)
def group(r: Repr) : Unit = {
val first = r.head
val (same, rest) = r.span(_ == first)
builder += same
if(!rest.isEmpty)
group(rest)
}
if(!r.isEmpty) group(r)
builder.result
}
}
implicit def groupIdentical[Repr : FromRepr](r: Repr) = new GroupIdenticalImpl(r)
Sample REPL session,
scala> val l = List(1, 1, 2, 2, 3, 3, 1, 1)
l: List[Int] = List(1, 1, 2, 2, 3, 3, 1, 1)
scala> l.groupIdentical
res0: List[List[Int]] = List(List(1, 1),List(2, 2),List(3, 3),List(1, 1))
scala> val a = Array(1, 1, 2, 2, 3, 3, 1, 1)
a: Array[Int] = Array(1, 1, 2, 2, 3, 3, 1, 1)
scala> a.groupIdentical
res1: Array[Array[Int]] = Array(Array(1, 1),Array(2, 2),Array(3, 3),Array(1, 1))
scala> val s = "11223311"
s: String = 11223311
scala> s.groupIdentical
res2: scala.collection.immutable.IndexedSeq[String] = Vector(11, 22, 33, 11)
Again, note that the same result type principle has been observed in exactly the same way that it would have been had groupIdentical been directly defined on GenTraversableLike.
As of this commit the magic incantation is slightly changed from what it was when Miles gave his excellent answer.
The following works, but is it canonical? I hope one of the canons will correct it. (Or rather, cannons, one of the big guns.) If the view bound is an upper bound, you lose application to Array and String. It doesn't seem to matter if the bound is GenTraversableLike or TraversableLike; but IsTraversableLike gives you a GenTraversableLike.
import language.implicitConversions
import scala.collection.{ GenTraversable=>GT, GenTraversableLike=>GTL, TraversableLike=>TL }
import scala.collection.generic.{ CanBuildFrom=>CBF, IsTraversableLike=>ITL }
class GroupIdenticalImpl[A, R <% GTL[_,R]](val r: GTL[A,R]) {
def groupIdentical[That](implicit cbf: CBF[R, R, That]): That = {
val builder = cbf(r.repr)
def group(r: GTL[_,R]) {
val first = r.head
val (same, rest) = r.span(_ == first)
builder += same
if (!rest.isEmpty) group(rest)
}
if (!r.isEmpty) group(r)
builder.result
}
}
implicit def groupIdentical[A, R <% GTL[_,R]](r: R)(implicit fr: ITL[R]):
GroupIdenticalImpl[fr.A, R] =
new GroupIdenticalImpl(fr conversion r)
There's more than one way to skin a cat with nine lives. This version says that once my source is converted to a GenTraversableLike, as long as I can build the result from GenTraversable, just do that. I'm not interested in my old Repr.
class GroupIdenticalImpl[A, R](val r: GTL[A,R]) {
def groupIdentical[That](implicit cbf: CBF[GT[A], GT[A], That]): That = {
val builder = cbf(r.toTraversable)
def group(r: GT[A]) {
val first = r.head
val (same, rest) = r.span(_ == first)
builder += same
if (!rest.isEmpty) group(rest)
}
if (!r.isEmpty) group(r.toTraversable)
builder.result
}
}
implicit def groupIdentical[A, R](r: R)(implicit fr: ITL[R]):
GroupIdenticalImpl[fr.A, R] =
new GroupIdenticalImpl(fr conversion r)
This first attempt includes an ugly conversion of Repr to GenTraversableLike.
import language.implicitConversions
import scala.collection.{ GenTraversableLike }
import scala.collection.generic.{ CanBuildFrom, IsTraversableLike }
type GT[A, B] = GenTraversableLike[A, B]
type CBF[A, B, C] = CanBuildFrom[A, B, C]
type ITL[A] = IsTraversableLike[A]
class FilterMapImpl[A, Repr](val r: GenTraversableLike[A, Repr]) {
def filterMap[B, That](f: A => Option[B])(implicit cbf : CanBuildFrom[Repr, B, That]): That =
r.flatMap(f(_).toSeq)
}
implicit def filterMap[A, Repr](r: Repr)(implicit fr: ITL[Repr]): FilterMapImpl[fr.A, Repr] =
new FilterMapImpl(fr conversion r)
class GroupIdenticalImpl[A, R](val r: GT[A,R])(implicit fr: ITL[R]) {
def groupIdentical[That](implicit cbf: CBF[R, R, That]): That = {
val builder = cbf(r.repr)
def group(r0: R) {
val r = fr conversion r0
val first = r.head
val (same, other) = r.span(_ == first)
builder += same
val rest = fr conversion other
if (!rest.isEmpty) group(rest.repr)
}
if (!r.isEmpty) group(r.repr)
builder.result
}
}
implicit def groupIdentical[A, R](r: R)(implicit fr: ITL[R]):
GroupIdenticalImpl[fr.A, R] =
new GroupIdenticalImpl(fr conversion r)
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
What are the hidden features of Scala that every Scala developer should be aware of?
One hidden feature per answer, please.
Okay, I had to add one more. Every Regex object in Scala has an extractor (see answer from oxbox_lakes above) that gives you access to the match groups. So you can do something like:
// Regex to split a date in the format Y/M/D.
val regex = "(\\d+)/(\\d+)/(\\d+)".r
val regex(year, month, day) = "2010/1/13"
The second line looks confusing if you're not used to using pattern matching and extractors. Whenever you define a val or var, what comes after the keyword is not simply an identifier but rather a pattern. That's why this works:
val (a, b, c) = (1, 3.14159, "Hello, world")
The right hand expression creates a Tuple3[Int, Double, String] which can match the pattern (a, b, c).
Most of the time your patterns use extractors that are members of singleton objects. For example, if you write a pattern like
Some(value)
then you're implicitly calling the extractor Some.unapply.
But you can also use class instances in patterns, and that is what's happening here. The val regex is an instance of Regex, and when you use it in a pattern, you're implicitly calling regex.unapplySeq (unapply versus unapplySeq is beyond the scope of this answer), which extracts the match groups into a Seq[String], the elements of which are assigned in order to the variables year, month, and day.
Structural type definitions - i.e. a type described by what methods it supports. For example:
object Closer {
def using(closeable: { def close(): Unit }, f: => Unit) {
try {
f
} finally { closeable.close }
}
}
Notice that the type of the parameter closeable is not defined other than it has a close method
Type-Constructor Polymorphism (a.k.a. higher-kinded types)
Without this feature you can, for example, express the idea of mapping a function over a list to return another list, or mapping a function over a tree to return another tree. But you can't express this idea generally without higher kinds.
With higher kinds, you can capture the idea of any type that's parameterised with another type. A type constructor that takes one parameter is said to be of kind (*->*). For example, List. A type constructor that returns another type constructor is said to be of kind (*->*->*). For example, Function1. But in Scala, we have higher kinds, so we can have type constructors that are parameterised with other type constructors. So they're of kinds like ((*->*)->*).
For example:
trait Functor[F[_]] {
def fmap[A, B](f: A => B, fa: F[A]): F[B]
}
Now, if you have a Functor[List], you can map over lists. If you have a Functor[Tree], you can map over trees. But more importantly, if you have Functor[A] for any A of kind (*->*), you can map a function over A.
Extractors which allow you to replace messy if-elseif-else style code with patterns. I know that these are not exactly hidden but I've been using Scala for a few months without really understanding the power of them. For (a long) example I can replace:
val code: String = ...
val ps: ProductService = ...
var p: Product = null
if (code.endsWith("=")) {
p = ps.findCash(code.substring(0, 3)) //e.g. USD=, GBP= etc
}
else if (code.endsWith(".FWD")) {
//e.g. GBP20090625.FWD
p = ps.findForward(code.substring(0,3), code.substring(3, 9))
}
else {
p = ps.lookupProductByRic(code)
}
With this, which is much clearer in my opinion
implicit val ps: ProductService = ...
val p = code match {
case SyntheticCodes.Cash(c) => c
case SyntheticCodes.Forward(f) => f
case _ => ps.lookupProductByRic(code)
}
I have to do a bit of legwork in the background...
object SyntheticCodes {
// Synthetic Code for a CashProduct
object Cash extends (CashProduct => String) {
def apply(p: CashProduct) = p.currency.name + "="
//EXTRACTOR
def unapply(s: String)(implicit ps: ProductService): Option[CashProduct] = {
if (s.endsWith("=")
Some(ps.findCash(s.substring(0,3)))
else None
}
}
//Synthetic Code for a ForwardProduct
object Forward extends (ForwardProduct => String) {
def apply(p: ForwardProduct) = p.currency.name + p.date.toString + ".FWD"
//EXTRACTOR
def unapply(s: String)(implicit ps: ProductService): Option[ForwardProduct] = {
if (s.endsWith(".FWD")
Some(ps.findForward(s.substring(0,3), s.substring(3, 9))
else None
}
}
But the legwork is worth it for the fact that it separates a piece of business logic into a sensible place. I can implement my Product.getCode methods as follows..
class CashProduct {
def getCode = SyntheticCodes.Cash(this)
}
class ForwardProduct {
def getCode = SyntheticCodes.Forward(this)
}
Manifests which are a sort of way at getting the type information at runtime, as if Scala had reified types.
In scala 2.8 you can have tail-recursive methods by using the package scala.util.control.TailCalls (in fact it's trampolining).
An example:
def u(n:Int):TailRec[Int] = {
if (n==0) done(1)
else tailcall(v(n/2))
}
def v(n:Int):TailRec[Int] = {
if (n==0) done(5)
else tailcall(u(n-1))
}
val l=for(n<-0 to 5) yield (n,u(n).result,v(n).result)
println(l)
Case classes automatically mixin the Product trait, providing untyped, indexed access to the fields without any reflection:
case class Person(name: String, age: Int)
val p = Person("Aaron", 28)
val name = p.productElement(0) // name = "Aaron": Any
val age = p.productElement(1) // age = 28: Any
val fields = p.productIterator.toList // fields = List[Any]("Aaron", 28)
This feature also provides a simplified way to alter the output of the toString method:
case class Person(name: String, age: Int) {
override def productPrefix = "person: "
}
// prints "person: (Aaron,28)" instead of "Person(Aaron, 28)"
println(Person("Aaron", 28))
It's not exactly hidden, but certainly a under advertised feature: scalac -Xprint.
As a illustration of the use consider the following source:
class A { "xx".r }
Compiling this with scalac -Xprint:typer outputs:
package <empty> {
class A extends java.lang.Object with ScalaObject {
def this(): A = {
A.super.this();
()
};
scala.this.Predef.augmentString("xx").r
}
}
Notice scala.this.Predef.augmentString("xx").r, which is a the application of the implicit def augmentString present in Predef.scala.
scalac -Xprint:<phase> will print the syntax tree after some compiler phase. To see the available phases use scalac -Xshow-phases.
This is a great way to learn what is going on behind the scenes.
Try with
case class X(a:Int,b:String)
using the typer phase to really feel how useful it is.
You can define your own control structures. It's really just functions and objects and some syntactic sugar, but they look and behave like the real thing.
For example, the following code defines dont {...} unless (cond) and dont {...} until (cond):
def dont(code: => Unit) = new DontCommand(code)
class DontCommand(code: => Unit) {
def unless(condition: => Boolean) =
if (condition) code
def until(condition: => Boolean) = {
while (!condition) {}
code
}
}
Now you can do the following:
/* This will only get executed if the condition is true */
dont {
println("Yep, 2 really is greater than 1.")
} unless (2 > 1)
/* Just a helper function */
var number = 0;
def nextNumber() = {
number += 1
println(number)
number
}
/* This will not be printed until the condition is met. */
dont {
println("Done counting to 5!")
} until (nextNumber() == 5)
#switch annotation in Scala 2.8:
An annotation to be applied to a match
expression. If present, the compiler
will verify that the match has been
compiled to a tableswitch or
lookupswitch, and issue an error if it
instead compiles into a series of
conditional expressions.
Example:
scala> val n = 3
n: Int = 3
scala> import annotation.switch
import annotation.switch
scala> val s = (n: #switch) match {
| case 3 => "Three"
| case _ => "NoThree"
| }
<console>:6: error: could not emit switch for #switch annotated match
val s = (n: #switch) match {
Dunno if this is really hidden, but I find it quite nice.
Typeconstructors that take 2 type parameters can be written in infix notation
object Main {
class FooBar[A, B]
def main(args: Array[String]): Unit = {
var x: FooBar[Int, BigInt] = null
var y: Int FooBar BigInt = null
}
}
Scala 2.8 introduced default and named arguments, which made possible the addition of a new "copy" method that Scala adds to case classes. If you define this:
case class Foo(a: Int, b: Int, c: Int, ... z:Int)
and you want to create a new Foo that's like an existing Foo, only with a different "n" value, then you can just say:
foo.copy(n = 3)
in scala 2.8 you can add #specialized to your generic classes/methods. This will create special versions of the class for primitive types (extending AnyVal) and save the cost of un-necessary boxing/unboxing :
class Foo[#specialized T]...
You can select a subset of AnyVals :
class Foo[#specialized(Int,Boolean) T]...
Extending the language. I always wanted to do something like this in Java (couldn't). But in Scala I can have:
def timed[T](thunk: => T) = {
val t1 = System.nanoTime
val ret = thunk
val time = System.nanoTime - t1
println("Executed in: " + time/1000000.0 + " millisec")
ret
}
and then write:
val numbers = List(12, 42, 3, 11, 6, 3, 77, 44)
val sorted = timed { // "timed" is a new "keyword"!
numbers.sortWith(_<_)
}
println(sorted)
and get
Executed in: 6.410311 millisec
List(3, 3, 6, 11, 12, 42, 44, 77)
You can designate a call-by-name parameter (EDITED: this is different then a lazy parameter!) to a function and it will not be evaluated until used by the function (EDIT: in fact, it will be reevaluated every time it is used). See this faq for details
class Bar(i:Int) {
println("constructing bar " + i)
override def toString():String = {
"bar with value: " + i
}
}
// NOTE the => in the method declaration. It indicates a lazy paramter
def foo(x: => Bar) = {
println("foo called")
println("bar: " + x)
}
foo(new Bar(22))
/*
prints the following:
foo called
constructing bar 22
bar with value: 22
*/
You can use locally to introduce a local block without causing semicolon inference issues.
Usage:
scala> case class Dog(name: String) {
| def bark() {
| println("Bow Vow")
| }
| }
defined class Dog
scala> val d = Dog("Barnie")
d: Dog = Dog(Barnie)
scala> locally {
| import d._
| bark()
| bark()
| }
Bow Vow
Bow Vow
locally is defined in "Predef.scala" as:
#inline def locally[T](x: T): T = x
Being inline, it does not impose any additional overhead.
Early Initialization:
trait AbstractT2 {
println("In AbstractT2:")
val value: Int
val inverse = 1.0/value
println("AbstractT2: value = "+value+", inverse = "+inverse)
}
val c2c = new {
// Only initializations are allowed in pre-init. blocks.
// println("In c2c:")
val value = 10
} with AbstractT2
println("c2c.value = "+c2c.value+", inverse = "+c2c.inverse)
Output:
In AbstractT2:
AbstractT2: value = 10, inverse = 0.1
c2c.value = 10, inverse = 0.1
We instantiate an anonymous inner
class, initializing the value field
in the block, before the with
AbstractT2 clause. This guarantees
that value is initialized before the
body of AbstractT2 is executed, as
shown when you run the script.
You can compose structural types with the 'with' keyword
object Main {
type A = {def foo: Unit}
type B = {def bar: Unit}
type C = A with B
class myA {
def foo: Unit = println("myA.foo")
}
class myB {
def bar: Unit = println("myB.bar")
}
class myC extends myB {
def foo: Unit = println("myC.foo")
}
def main(args: Array[String]): Unit = {
val a: A = new myA
a.foo
val b: C = new myC
b.bar
b.foo
}
}
placeholder syntax for anonymous functions
From The Scala Language Specification:
SimpleExpr1 ::= '_'
An expression (of syntactic category Expr) may contain embedded underscore symbols _ at places where identifiers are legal. Such an expression represents an anonymous function where subsequent occurrences of underscores denote successive parameters.
From Scala Language Changes:
_ + 1 x => x + 1
_ * _ (x1, x2) => x1 * x2
(_: Int) * 2 (x: Int) => x * 2
if (_) x else y z => if (z) x else y
_.map(f) x => x.map(f)
_.map(_ + 1) x => x.map(y => y + 1)
Using this you could do something like:
def filesEnding(query: String) =
filesMatching(_.endsWith(query))
Implicit definitions, particularly conversions.
For example, assume a function which will format an input string to fit to a size, by replacing the middle of it with "...":
def sizeBoundedString(s: String, n: Int): String = {
if (n < 5 && n < s.length) throw new IllegalArgumentException
if (s.length > n) {
val trailLength = ((n - 3) / 2) min 3
val headLength = n - 3 - trailLength
s.substring(0, headLength)+"..."+s.substring(s.length - trailLength, s.length)
} else s
}
You can use that with any String, and, of course, use the toString method to convert anything. But you could also write it like this:
def sizeBoundedString[T](s: T, n: Int)(implicit toStr: T => String): String = {
if (n < 5 && n < s.length) throw new IllegalArgumentException
if (s.length > n) {
val trailLength = ((n - 3) / 2) min 3
val headLength = n - 3 - trailLength
s.substring(0, headLength)+"..."+s.substring(s.length - trailLength, s.length)
} else s
}
And then, you could pass classes of other types by doing this:
implicit def double2String(d: Double) = d.toString
Now you can call that function passing a double:
sizeBoundedString(12345.12345D, 8)
The last argument is implicit, and is being passed automatically because of the implicit de declaration. Furthermore, "s" is being treated like a String inside sizeBoundedString because there is an implicit conversion from it to String.
Implicits of this type are better defined for uncommon types to avoid unexpected conversions. You can also explictly pass a conversion, and it will still be implicitly used inside sizeBoundedString:
sizeBoundedString(1234567890L, 8)((l : Long) => l.toString)
You can also have multiple implicit arguments, but then you must either pass all of them, or not pass any of them. There is also a shortcut syntax for implicit conversions:
def sizeBoundedString[T <% String](s: T, n: Int): String = {
if (n < 5 && n < s.length) throw new IllegalArgumentException
if (s.length > n) {
val trailLength = ((n - 3) / 2) min 3
val headLength = n - 3 - trailLength
s.substring(0, headLength)+"..."+s.substring(s.length - trailLength, s.length)
} else s
}
This is used exactly the same way.
Implicits can have any value. They can be used, for instance, to hide library information. Take the following example, for instance:
case class Daemon(name: String) {
def log(msg: String) = println(name+": "+msg)
}
object DefaultDaemon extends Daemon("Default")
trait Logger {
private var logd: Option[Daemon] = None
implicit def daemon: Daemon = logd getOrElse DefaultDaemon
def logTo(daemon: Daemon) =
if (logd == None) logd = Some(daemon)
else throw new IllegalArgumentException
def log(msg: String)(implicit daemon: Daemon) = daemon.log(msg)
}
class X extends Logger {
logTo(Daemon("X Daemon"))
def f = {
log("f called")
println("Stuff")
}
def g = {
log("g called")(DefaultDaemon)
}
}
class Y extends Logger {
def f = {
log("f called")
println("Stuff")
}
}
In this example, calling "f" in an Y object will send the log to the default daemon, and on an instance of X to the Daemon X daemon. But calling g on an instance of X will send the log to the explicitly given DefaultDaemon.
While this simple example can be re-written with overload and private state, implicits do not require private state, and can be brought into context with imports.
Maybe not too hidden, but I think this is useful:
#scala.reflect.BeanProperty
var firstName:String = _
This will automatically generate a getter and setter for the field that matches bean convention.
Further description at developerworks
Implicit arguments in closures.
A function argument can be marked as implicit just as with methods. Within the scope of the body of the function the implicit parameter is visible and eligible for implicit resolution:
trait Foo { def bar }
trait Base {
def callBar(implicit foo: Foo) = foo.bar
}
object Test extends Base {
val f: Foo => Unit = { implicit foo =>
callBar
}
def test = f(new Foo {
def bar = println("Hello")
})
}
Build infinite data structures with Scala's Streams :
http://www.codecommit.com/blog/scala/infinite-lists-for-the-finitely-patient
Result types are dependent on implicit resolution. This can give you a form of multiple dispatch:
scala> trait PerformFunc[A,B] { def perform(a : A) : B }
defined trait PerformFunc
scala> implicit val stringToInt = new PerformFunc[String,Int] {
def perform(a : String) = 5
}
stringToInt: java.lang.Object with PerformFunc[String,Int] = $anon$1#13ccf137
scala> implicit val intToDouble = new PerformFunc[Int,Double] {
def perform(a : Int) = 1.0
}
intToDouble: java.lang.Object with PerformFunc[Int,Double] = $anon$1#74e551a4
scala> def foo[A, B](x : A)(implicit z : PerformFunc[A,B]) : B = z.perform(x)
foo: [A,B](x: A)(implicit z: PerformFunc[A,B])B
scala> foo("HAI")
res16: Int = 5
scala> foo(1)
res17: Double = 1.0
Scala's equivalent of Java double brace initializer.
Scala allows you to create an anonymous subclass with the body of the class (the constructor) containing statements to initialize the instance of that class.
This pattern is very useful when building component-based user interfaces (for example Swing , Vaadin) as it allows to create UI components and declare their properties more concisely.
See http://spot.colorado.edu/~reids/papers/how-scala-experience-improved-our-java-development-reid-2011.pdf for more information.
Here is an example of creating a Vaadin button:
val button = new Button("Click me"){
setWidth("20px")
setDescription("Click on this")
setIcon(new ThemeResource("icons/ok.png"))
}
Excluding members from import statements
Suppose you want to use a Logger that contains a println and a printerr method, but you only want to use the one for error messages, and keep the good old Predef.println for standard output. You could do this:
val logger = new Logger(...)
import logger.printerr
but if logger also contains another twelve methods that you would like to import and use, it becomes inconvenient to list them. You could instead try:
import logger.{println => donotuseprintlnt, _}
but this still "pollutes" the list of imported members. Enter the über-powerful wildcard:
import logger.{println => _, _}
and that will do just the right thing™.
require method (defined in Predef) that allow you to define additional function constraints that would be checked during run-time. Imagine that you developing yet another twitter client and you need to limit tweet length up to 140 symbols. Moreover you can't post empty tweet.
def post(tweet: String) = {
require(tweet.length < 140 && tweet.length > 0)
println(tweet)
}
Now calling post with inappropriate length argument will cause an exception:
scala> post("that's ok")
that's ok
scala> post("")
java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:145)
at .post(<console>:8)
scala> post("way to looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong tweet")
java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:145)
at .post(<console>:8)
You can write multiple requirements or even add description to each:
def post(tweet: String) = {
require(tweet.length > 0, "too short message")
require(tweet.length < 140, "too long message")
println(tweet)
}
Now exceptions are verbose:
scala> post("")
java.lang.IllegalArgumentException: requirement failed: too short message
at scala.Predef$.require(Predef.scala:157)
at .post(<console>:8)
One more example is here.
Bonus
You can perform an action every time requirement fails:
scala> var errorcount = 0
errorcount: Int = 0
def post(tweet: String) = {
require(tweet.length > 0, {errorcount+=1})
println(tweet)
}
scala> errorcount
res14: Int = 0
scala> post("")
java.lang.IllegalArgumentException: requirement failed: ()
at scala.Predef$.require(Predef.scala:157)
at .post(<console>:9)
...
scala> errorcount
res16: Int = 1
Traits with abstract override methods are a feature in Scala that is as not widely advertised as many others. The intend of methods with the abstract override modifier is to do some operations and delegating the call to super. Then these traits have to be mixed-in with concrete implementations of their abstract override methods.
trait A {
def a(s : String) : String
}
trait TimingA extends A {
abstract override def a(s : String) = {
val start = System.currentTimeMillis
val result = super.a(s)
val dur = System.currentTimeMillis-start
println("Executed a in %s ms".format(dur))
result
}
}
trait ParameterPrintingA extends A {
abstract override def a(s : String) = {
println("Called a with s=%s".format(s))
super.a(s)
}
}
trait ImplementingA extends A {
def a(s: String) = s.reverse
}
scala> val a = new ImplementingA with TimingA with ParameterPrintingA
scala> a.a("a lotta as")
Called a with s=a lotta as
Executed a in 0 ms
res4: String = sa attol a
While my example is really not much more than a poor mans AOP, I used these Stackable Traits much to my liking to build Scala interpreter instances with predefined imports, custom bindings and classpathes. The Stackable Traits made it possible to create my factory along the lines of new InterpreterFactory with JsonLibs with LuceneLibs and then have useful imports and scope varibles for the users scripts.