I'm new to Scala. While learning it by reading code written by others, one of the most distinctive features I've found, compared with other languages, is its pattern matching.
While I appreciate the convenience and expressiveness it brings, I can't help being curious about the potential performance cost behind it. Generally speaking, how fast is match?
First, without "advanced" features such as matching constructor parameters, match in Scala is, IMO, the counterpart of switch-case in other languages. For instance,
color match {
case 0 => println("color is red!")
case 1 => println("color is green!")
case 2 => println("color is blue!")
}
As a novice, I want to know whether the code above is exactly as fast as the equivalent if-else chain.
Second, bringing those "advanced" features back in, for instance:
expr match {
case Animal(name) => ...
case Animal(name, age) => ...
case Animal(_, _, id) => ...
}
As for the code above or other features of match (list matching, pair matching, etc.), I am curious how Scala implements these fancy usages. And most importantly, how fast can I expect this code to be? (Say, is it still as fast as the match in the first case? Or maybe slightly slower? Or extremely slow owing to the use of some technique such as reflection?)
Thanks in advance!
The first snippet is translated to a tableswitch (or lookupswitch) bytecode instruction and is as fast as Java's switch/case:
scala> def foo(i: Int) = i match {
| case 1 => 2
| case 2 => 10
| case 3 => 42
| case _ => 777
| }
foo: (i: Int)Int
scala> :javap -c foo
Compiled from "<console>"
public class {
public static final MODULE$;
public static {};
Code:
0: new #2 // class
3: invokespecial #12 // Method "<init>":()V
6: return
public int foo(int);
Code:
0: iload_1
1: istore_2
2: iload_2
3: tableswitch { // 1 to 3
1: 44
2: 39
3: 34
default: 28
}
28: sipush 777
31: goto 45
34: bipush 42
36: goto 45
39: bipush 10
41: goto 45
44: iconst_2
45: ireturn
public ();
Code:
0: aload_0
1: invokespecial #18 // Method java/lang/Object."<init>":()V
4: aload_0
5: putstatic #20 // Field MODULE$:L;
8: return
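If you want the compiler to check this for you rather than reading javap output, the standard library's scala.annotation.switch annotation can be put on the scrutinee. A minimal sketch (the object name SwitchDemo is made up for illustration): the compiler reports a diagnostic if the match cannot be emitted as a tableswitch or lookupswitch.

```scala
import scala.annotation.switch

object SwitchDemo {
  // @switch asks the compiler to complain if this match cannot be
  // compiled to a tableswitch or lookupswitch instruction.
  def foo(i: Int): Int = (i: @switch) match {
    case 1 => 2
    case 2 => 10
    case 3 => 42
    case _ => 777
  }
}
```

The annotation does not change the generated code; it only turns a silent fallback to chained comparisons into something you notice at compile time.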
The second snippet is translated to a bunch of unapply/isInstanceOf/null-check calls, and is (obviously) slower than a tableswitch. But it has the same performance as manual checking via isInstanceOf (or better, if the compiler can optimize something), with no reflection or similar machinery:
scala> case class Foo(s: String, i: Int)
defined class Foo
scala> def bar(foo: Foo) = foo match {
| case Foo("test", _) => 1
| case Foo(_, 42) => 2
| case _ => 3
| }
bar: (foo: Foo)Int
scala> :javap -c bar
Compiled from "<console>"
public class {
public static final MODULE$;
public static {};
Code:
0: new #2 // class
3: invokespecial #12 // Method "<init>":()V
6: return
public int bar(Foo);
Code:
0: aload_1
1: astore_2
2: aload_2
3: ifnull 26
6: aload_2
7: invokevirtual #20 // Method Foo.s:()Ljava/lang/String;
10: astore_3
11: ldc #22 // String test
13: aload_3
14: invokevirtual #26 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
17: ifeq 26
20: iconst_1
21: istore 4
23: goto 52
26: aload_2
27: ifnull 49
30: aload_2
31: invokevirtual #30 // Method Foo.i:()I
34: istore 5
36: bipush 42
38: iload 5
40: if_icmpne 49
43: iconst_2
44: istore 4
46: goto 52
49: iconst_3
50: istore 4
52: iload 4
54: ireturn
public ();
Code:
0: aload_0
1: invokespecial #34 // Method java/lang/Object."<init>":()V
4: aload_0
5: putstatic #36 // Field MODULE$:L;
8: return
}
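Read back into source form, the bytecode for bar is roughly what you would write by hand. The sketch below is only an approximation for comparison (barManual and the wrapper object ManualMatchDemo are hypothetical names, not compiler output):

```scala
object ManualMatchDemo {
  final case class Foo(s: String, i: Int)

  // The pattern-matching version from the answer above.
  def bar(foo: Foo): Int = foo match {
    case Foo("test", _) => 1
    case Foo(_, 42)     => 2
    case _              => 3
  }

  // Hand-written approximation of the emitted checks:
  // a null check, a field read, then an equals or an int comparison.
  def barManual(foo: Foo): Int =
    if (foo != null && "test" == foo.s) 1
    else if (foo != null && foo.i == 42) 2
    else 3
}
```

Both versions do the same amount of work per case, which is why the match should not be expected to be slower than the manual checks.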
Related
Just curious why the Scala authors didn't use recursion or even pattern matching when implementing find on Lists?
Their implementation looks like this:
override final def find(p: A => Boolean): Option[A] = {
var these: List[A] = this
while (!these.isEmpty) {
if (p(these.head)) return Some(these.head)
these = these.tail
}
None
}
Using a while and head and tail. They could have done something more "Scala-esque" with recursion, no?
@tailrec
def find(p: A => Boolean): Option[A] = {
this match {
case Nil => None
case head :: tail if p(head) => Some(head)
case elements => find(p, elements.tail)
}
}
It can't be because of tail-call optimisation can it? Is it somehow more efficient and I'm missing it? Could it be just author preference and style?! Something inflexible about it when A could be anything? hmmm
A quick experiment (using Scala 2.13.2). The three candidate implementations are:
while-loop
tail-recursive, but keeping the same logic as the while version
tail-recursive with a pattern match
I've modified the logic where appropriate to depend less on compiler optimizations (nonEmpty vs. !isEmpty and explicitly saving these.head so it's not called twice).
import scala.annotation.tailrec
object ListFindComparison {
def whileFind[A](lst: List[A])(p: A => Boolean): Option[A] = {
var these: List[A] = lst
while (these.nonEmpty) {
val h = these.head
if (p(h)) return Some(h)
else these = these.tail
}
None
}
def tailrecFind[A](lst: List[A])(p: A => Boolean): Option[A] = {
@tailrec
def iter(these: List[A]): Option[A] =
if (these.nonEmpty) {
val h = these.head
if (p(h)) Some(h)
else iter(these.tail)
} else None
iter(lst)
}
def tailRecPM[A](lst: List[A])(p: A => Boolean): Option[A] = {
@tailrec
def iter(these: List[A]): Option[A] =
these match {
case Nil => None
case head :: tail if p(head) => Some(head)
case _ => iter(these.tail)
}
iter(lst)
}
}
When inspecting the bytecode (using :javap ListFindComparison$), we see
For whileFind, the emitted code is straightforward
Code:
0: aload_1
1: astore_3
2: aload_3
3: invokevirtual #25 // Method scala/collection/immutable/List.nonEmpty:()Z
6: ifeq 50
9: aload_3
10: invokevirtual #29 // Method scala/collection/immutable/List.head:()Ljava/lang/Object;
13: astore 4
15: aload_2
16: aload 4
18: invokeinterface #35, 2 // InterfaceMethod scala/Function1.apply:(Ljava/lang/Object;)Ljava/lang/Object;
23: invokestatic #41 // Method scala/runtime/BoxesRunTime.unboxToBoolean:(Ljava/lang/Object;)Z
26: ifeq 39
29: new #43 // class scala/Some
32: dup
33: aload 4
35: invokespecial #46 // Method scala/Some."<init>":(Ljava/lang/Object;)V
38: areturn
39: aload_3
40: invokevirtual #49 // Method scala/collection/immutable/List.tail:()Ljava/lang/Object;
43: checkcast #21 // class scala/collection/immutable/List
46: astore_3
47: goto 2
50: getstatic #54 // Field scala/None$.MODULE$:Lscala/None$;
53: areturn
The tail-recursive finds are basically the same:
aload_0
aload_1
aload_2
invokespecial // call the appropriate (private) iter methods
areturn
The iter in tailrecFind is
Code:
0: aload_1
1: invokevirtual #25 // Method scala/collection/immutable/List.nonEmpty:()Z
4: ifeq 53
7: aload_1
8: invokevirtual #29 // Method scala/collection/immutable/List.head:()Ljava/lang/Object;
11: astore 4
13: aload_2
14: aload 4
16: invokeinterface #35, 2 // InterfaceMethod scala/Function1.apply:(Ljava/lang/Object;)Ljava/lang/Object;
21: invokestatic #41 // Method scala/runtime/BoxesRunTime.unboxToBoolean:(Ljava/lang/Object;)Z
24: ifeq 39
27: new #43 // class scala/Some
30: dup
31: aload 4
33: invokespecial #46 // Method scala/Some."<init>":(Ljava/lang/Object;)V
36: goto 50
39: aload_1
40: invokevirtual #49 // Method scala/collection/immutable/List.tail:()Ljava/lang/Object;
43: checkcast #21 // class scala/collection/immutable/List
46: astore_1
47: goto 0
50: goto 56
53: getstatic #54 // Field scala/None$.MODULE$:Lscala/None$;
56: areturn
There's no major difference in the core of the while loop and this iter: it's quite likely that the JIT will bring these to the same machine code after enough invocations. tailrecFind has slightly greater constant overhead for getting into iter than whileFind has for getting into the loop. There's not likely to be a meaningful performance difference here (for what it's worth, Scala 3 dropped do-while from the language but kept while, so neither form is going away).
The iter with pattern-matching is very different:
Code:
0: aload_1
1: astore 5
3: getstatic #77 // Field scala/collection/immutable/Nil$.MODULE$:Lscala/collection/immutable/Nil$;
6: aload 5
8: invokevirtual #80 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
11: ifeq 22
14: getstatic #54 // Field scala/None$.MODULE$:Lscala/None$;
17: astore 4
19: goto 92
22: goto 25
25: aload 5
27: instanceof #82 // class scala/collection/immutable/$colon$colon
30: ifeq 78
33: aload 5
35: checkcast #82 // class scala/collection/immutable/$colon$colon
38: astore 6
40: aload 6
42: invokevirtual #83 // Method scala/collection/immutable/$colon$colon.head:()Ljava/lang/Object;
45: astore 7
47: aload_2
48: aload 7
50: invokeinterface #35, 2 // InterfaceMethod scala/Function1.apply:(Ljava/lang/Object;)Ljava/lang/Object;
55: invokestatic #41 // Method scala/runtime/BoxesRunTime.unboxToBoolean:(Ljava/lang/Object;)Z
58: ifeq 75
61: new #43 // class scala/Some
64: dup
65: aload 7
67: invokespecial #46 // Method scala/Some."<init>":(Ljava/lang/Object;)V
70: astore 4
72: goto 92
75: goto 81
78: goto 81
81: aload_1
82: invokevirtual #49 // Method scala/collection/immutable/List.tail:()Ljava/lang/Object;
85: checkcast #21 // class scala/collection/immutable/List
88: astore_1
89: goto 0
92: aload 4
94: areturn
This is unlikely to be anywhere near as performant as the versions without pattern-matching (though to be fair, the branches will in practice be really easy for a predictor: not-taken (not-Nil), not-taken (::), not-taken (predicate fails), except for the very last run).
It's a little interesting to me that we get a call to equals when checking for Nil: it's probably still faster than isEmpty/nonEmpty, but it would be even faster without pattern-matching and with an explicit eq/ne against Nil.
I also note that pattern-matching against this is a bit of an antipattern IMO: at that point, you're almost certainly better off using virtual method dispatch, since you're basically implementing a slow vtable (it does have the advantage of potentially being pre-JIT'd if you put the common case first).
If you really care about performance, I'd try to avoid pattern-matching.
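For reference, the explicit eq/ne comparison against Nil mentioned above could look like the following. This is a minimal sketch that I have not benchmarked; the object name EqNilFind is made up for illustration.

```scala
import scala.annotation.tailrec

object EqNilFind {
  def find[A](lst: List[A])(p: A => Boolean): Option[A] = {
    @tailrec
    def iter(these: List[A]): Option[A] =
      // Reference comparison with the Nil singleton: no equals call
      // and no instanceof test.
      if (these eq Nil) None
      else {
        val h = these.head
        if (p(h)) Some(h) else iter(these.tail)
      }
    iter(lst)
  }
}
```

This works because Nil is the only empty List instance, so reference equality is enough to detect the end of the list.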
PS: I haven't analyzed the simple foldLeft solution:
lst.foldLeft(Option.empty[A]) { (acc, v) =>
acc.orElse {
if (p(v)) Some(v)
else None
}
}
But since that doesn't short-circuit, I suspect it won't consistently beat any of the candidates, and even in the cases where there's no match before the last element, it might not even beat the pattern-match version then.
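To illustrate the lack of short-circuiting, here is a small sketch that counts how many elements the fold visits. The step counter and the name FoldFindDemo are additions for illustration only.

```scala
object FoldFindDemo {
  // Returns the result together with the number of elements visited.
  def foldFind[A](lst: List[A])(p: A => Boolean): (Option[A], Int) = {
    var steps = 0
    val result = lst.foldLeft(Option.empty[A]) { (acc, v) =>
      steps += 1
      // orElse takes its argument by name, so p is no longer called once
      // acc is defined, but the fold still walks the rest of the list.
      acc.orElse(if (p(v)) Some(v) else None)
    }
    (result, steps)
  }
}
```

With a five-element list whose second element matches, the fold still reports five steps, whereas the loop-based versions stop after two.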
Given:
case object A
What's the difference, if any, between the @ and : in:
def f(a: A.type): Int = a match {
case aaa @ A => 42
}
and
def f(a: A.type): Int = a match {
case aaa : A.type => 42
}
The first one, @, uses an extractor to do the pattern matching, while the second one, :, requires a type; that's why you need to write A.type there.
There's actually no difference between them in terms of matching. To better illustrate the difference between @ and :, we can look at a simple class, which doesn't provide an extractor out of the box.
class A
def f(a: A) = a match {
case _ : A => // works fine
case _ @ A => // doesn't compile: there is no value or extractor named A
}
In this very specific case, almost nothing is different. They will both achieve the same results.
Semantically, case aaa @ A => 42 is a use of pattern binding where we're matching on the exact object A, and case aaa : A.type => 42 is a type pattern where we want a to have the type A.type. In short, type versus equality, which doesn't make a difference for a singleton.
The generated code is actually slightly different. Consider this code compiled with -Xprint:patmat:
def f(a: A.type): Int = a match {
case aaa @ A => 42
case aaa : A.type => 42
}
The relevant code for f shows that the two cases are slightly different, but will not produce different results:
def f(a: A.type): Int = {
case <synthetic> val x1: A.type = a;
case6(){
if (A.==(x1)) // case aaa @ A
matchEnd5(42)
else
case7()
};
case7(){
if (x1.ne(null)) // case aaa: A.type
matchEnd5(42)
else
case8()
};
case8(){
matchEnd5(throw new MatchError(x1))
};
matchEnd5(x: Int){
x
}
}
The first case checks equality, where the second case only checks that the reference is not null (we already know the type matches since the method parameter is the singleton type).
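To make the equality-versus-type distinction concrete, here is a minimal sketch. It uses a hypothetical object Greedy whose equals accepts everything; that is an assumption made purely for illustration and is not part of the code above. With a widened scrutinee type, the stable-identifier pattern goes through ==, while the singleton type pattern compares references.

```scala
object EqualityVsType {
  // Hypothetical object with a deliberately over-eager equals.
  object Greedy {
    override def equals(other: Any): Boolean = true
    override def hashCode: Int = 0
  }

  // Stable-identifier pattern: tested as Greedy == x.
  def byEquality(x: Any): String = x match {
    case Greedy => "matched"
    case _      => "no match"
  }

  // Singleton type pattern: tested by reference identity.
  def byType(x: Any): String = x match {
    case _: Greedy.type => "matched"
    case _              => "no match"
  }
}
```

For a well-behaved singleton the two agree, which is why the choice does not matter in the example from the question.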
Semantically, there is no difference in this case. We can have a look at the bytecode to see if there is a runtime difference:
> object A
defined object A
> object X { def f(a: A.type) = a match { case a @ A => 42 } }
defined object X
> :javap X
...
public int f($line4.$read$$iw$$iw$$iw$$iw$$iw$$iw$A$);
descriptor: (L$line4/$read$$iw$$iw$$iw$$iw$$iw$$iw$A$;)I
flags: ACC_PUBLIC
Code:
stack=3, locals=4, args_size=2
0: aload_1
1: astore_3
2: getstatic #51 // Field $line4/$read$$iw$$iw$$iw$$iw$$iw$$iw$A$.MODULE$:L$line4/$read$$iw$$iw$$iw$$iw$$iw$$iw$A$;
5: aload_3
6: invokevirtual #55 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
9: ifeq 18
12: bipush 42
14: istore_2
15: goto 30
18: goto 21
21: new #57 // class scala/MatchError
24: dup
25: aload_3
26: invokespecial #60 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
29: athrow
30: iload_2
31: ireturn
And the other case:
> object Y { def f(a: A.type) = a match { case a: A.type => 42 } }
defined object Y
> :javap Y
...
public int f($line4.$read$$iw$$iw$$iw$$iw$$iw$$iw$A$);
descriptor: (L$line4/$read$$iw$$iw$$iw$$iw$$iw$$iw$A$;)I
flags: ACC_PUBLIC
Code:
stack=3, locals=4, args_size=2
0: aload_1
1: astore_3
2: aload_3
3: ifnull 12
6: bipush 42
8: istore_2
9: goto 24
12: goto 15
15: new #50 // class scala/MatchError
18: dup
19: aload_3
20: invokespecial #53 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
23: athrow
24: iload_2
25: ireturn
Indeed, there is a small difference. In the second case the compiler can see that a parameter of type A.type has only two possible values: A and null. Therefore at runtime there is only a null check, because the other case is already established at compile time. In the first version of the code, the compiler doesn't do this optimization. Instead it calls the equals method.
If we change the type of the parameter slightly, we get a different result:
> object Z { def f(a: AnyRef) = a match { case a: A.type => 42 } }
defined object Z
> :javap Z
...
public int f(java.lang.Object);
descriptor: (Ljava/lang/Object;)I
flags: ACC_PUBLIC
Code:
stack=3, locals=4, args_size=2
0: aload_1
1: astore_3
2: aload_3
3: getstatic #51 // Field $line4/$read$$iw$$iw$$iw$$iw$$iw$$iw$A$.MODULE$:L$line4/$read$$iw$$iw$$iw$$iw$$iw$$iw$A$;
6: if_acmpne 15
9: bipush 42
11: istore_2
12: goto 27
15: goto 18
18: new #53 // class scala/MatchError
21: dup
22: aload_3
23: invokespecial #56 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
26: athrow
27: iload_2
28: ireturn
In this version the compiler no longer knows what the parameter is, so it compares the reference against the singleton A at runtime (the if_acmpne instruction). We could now discuss whether the call to equals in the first version or the reference comparison in the third version is more efficient, but I guess the JVM's JIT optimizes away any overhead in both cases anyway, so we would first have to look at the machine code to tell which is more efficient, if there is a difference at all.
Semantically there is no difference in this particular example, but in general we use the @ binder when we want to do something with the matched object itself. This thread explains the use of these extractors with a simple example.
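A typical use of the binder, sketched with a hypothetical case class Person (not from the question): the @ form gives access to the whole matched object and its parts at the same time, while a type pattern only narrows the type.

```scala
object BinderDemo {
  final case class Person(name: String, age: Int)

  def describe(x: Any): String = x match {
    // p @ ... binds the whole object while the inner pattern
    // still destructures it.
    case p @ Person(name, _) => s"$name is described by $p"
    // A type pattern binds the value at the narrower type, without
    // destructuring.
    case n: Int => s"int ${n + 1}"
    case _      => "unknown"
  }
}
```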
I noticed that there is no || operator available when pattern matching - is | short circuited?
In pattern matching, | is short circuited. You can't call unapply or the like (with returned parameters) with the or-operator, where side-effects might be more likely. So short-circuiting is purely an optimization technique (won't affect the correctness of the code except in extraordinary cases such as a side-effecting equals method). This does mean you are limited in your ability to short circuit or not for performance or side-effecting reasons.
To see this, if we write this code:
def matchor(s: String) = s match {
case "tickle" | "fickle" => "sickle"
case _ => "hammer"
}
We see this bytecode (in part)
public java.lang.String matchor(java.lang.String);
Code:
0: aload_1
1: astore_2
2: ldc #12; //String tickle
4: aload_2
5: astore_3
6: dup
7: ifnonnull 18
10: pop
11: aload_3
12: ifnull 25
15: goto 31
18: aload_3
19: invokevirtual #16; //Method java/lang/Object.equals:(Ljava/lang/Object;)Z
22: ifeq 31
25: iconst_1
26: istore 4
28: goto 66
31: ldc #18; //String fickle
33: aload_2
...
66: iload 4
68: ifeq 78
71: ldc #20; //String sickle
73: astore 6
75: goto 82
...
82: aload 6
84: areturn
See the jump at offset 28 to avoid testing the "fickle" case? That's the short-circuit.
| short-circuits.
object First {
def unapply(str: String): Boolean = {
println("in First")
str == "first"
}
}
object Second {
def unapply(str: String) = {
println("in Second")
str == "second"
}
}
object Run extends App {
"first" match {
case First() | Second() => None
}
//Output: in First
"first" match {
case Second() | First() => None
}
//Output: in Second\nin First
}
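As mentioned earlier, the alternatives themselves cannot bind variables. If you still need the matched value, a binder around the whole alternative is allowed. A minimal sketch (AlternativeDemo and classify are made-up names):

```scala
object AlternativeDemo {
  def classify(x: Any): String = x match {
    // Variables cannot be bound inside the alternatives, but the
    // whole group can be bound with @.
    case n @ (1 | 2 | 3)            => s"small: $n"
    case s @ ("tickle" | "fickle")  => s"rhyme: $s"
    case _                          => "other"
  }
}
```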
For example, suppose we have:
object Types {
type ObjectMap = collection.Map[String, Any]
}
class X {
def toObjectMap(x:Any): Types.ObjectMap = x.asInstanceOf[Types.ObjectMap]
}
Does this have any additional runtime penalties compared to:
class X {
def toObjectMap(x:Any): collection.Map[String, Any]= x.asInstanceOf[collection.Map[String, Any]]
}
I wouldn't expect it to, but you know it's like, really easy to try it out.
scala> :javap -prv X
public scala.collection.Map<java.lang.String, java.lang.Object> toObjectMap(java.lang.Object);
flags: ACC_PUBLIC
Code:
stack=1, locals=2, args_size=2
0: aload_1
1: checkcast #9 // class scala/collection/Map
4: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this L$line9/$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$X;
0 5 1 x Ljava/lang/Object;
LineNumberTable:
line 53: 0
Signature: #75 // (Ljava/lang/Object;)Lscala/collection/Map<Ljava/lang/String;Ljava/lang/Object;>;
public scala.collection.Map<java.lang.String, java.lang.Object> toObjectMap2(java.lang.Object);
flags: ACC_PUBLIC
Code:
stack=1, locals=2, args_size=2
0: aload_1
1: checkcast #9 // class scala/collection/Map
4: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this L$line9/$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$X;
0 5 1 x Ljava/lang/Object;
LineNumberTable:
line 54: 0
Signature: #75 // (Ljava/lang/Object;)Lscala/collection/Map<Ljava/lang/String;Ljava/lang/Object;>;
Type aliases are just shorthand. The compiler expands the alias and from then on, proceeds precisely as if you had written the type out yourself. (As Som's answer shows, at least for your particular example.)
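One way to see at compile time that an alias and its expansion are the same type is to ask for =:= evidence. The sketch below uses a made-up wrapper object AliasDemo; it only compiles if the compiler treats the two types as identical.

```scala
object AliasDemo {
  type ObjectMap = collection.Map[String, Any]

  // Compiles only if the alias and the spelled-out type are the same type.
  val evidence = implicitly[ObjectMap =:= collection.Map[String, Any]]

  def viaAlias(x: Any): ObjectMap = x.asInstanceOf[ObjectMap]

  def spelledOut(x: Any): collection.Map[String, Any] =
    x.asInstanceOf[collection.Map[String, Any]]
}
```

Both methods erase to the same checkcast against scala.collection.Map, matching the javap output shown above.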
I am looking into scala TCO and have written the following code
import scala.annotation.tailrec
final def tailReccursionEx(str:String):List[String]={
@tailrec
def doTailRecursionEx(str:String,pos:Int,accu:List[String]):List[String]={
if(pos==str.length) return accu
else{
doTailRecursionEx(str,pos+1,accu++accu.foldLeft(List[String](str(`pos`).toString)){
(l,ch)=>l:+ch+str(`pos`)})
}
}
doTailRecursionEx(str,0,List[String]())
}
My code passes the @tailrec check, and I believe my function makes a self-recursive tail call. Yet, when I look at the Java bytecode with
javap -c -private RecursionEx\$\$anonfun\$doTailRecursionEx\$1\$1
I don't see the promised goto that TCO should produce for a self-recursive function. Here is the bytecode.
public RecursionEx$$anonfun$doTailRecursionEx$1$1(java.lang.String, int);
Code:
0: aload_0
1: aload_1
2: putfield #35; //Field str$2:Ljava/lang/String;
5: aload_0
6: iload_2
7: putfield #41; //Field pos$1:I
10: aload_0
11: invokespecial #93; //Method scala/runtime/AbstractFunction2."<init>":()V
14: return
}
I think you need to run javap on a different generated class file. The file you are examining at present corresponds to the closure you use as part of foldLeft. If you try looking at the "RecursionEx$.class" file you should see the tail call recursion. When I compile the code:
import scala.annotation.tailrec
object RecursionEx {
@tailrec
final def doTailRecursionEx(str: String, pos: Int, accu: List[String]): List[String] = {
if (pos == str.length) return accu
doTailRecursionEx(str, pos + 1 , accu ++ accu.foldLeft(List[String](str(`pos`).toString)) {
(l, ch) => l :+ ch + str(`pos`)
})
}
def main(args: Array[String]) {
doTailRecursionEx("mew",0,List[String]())
}
}
and then run javap -c -private RecursionEx$ I see the following for the relevant section of code:
public final scala.collection.immutable.List doTailRecursionEx(java.lang.String, int, scala.collection.immutable.List);
Code:
0: iload_2
1: aload_1
2: invokevirtual #21; //Method java/lang/String.length:()I
5: if_icmpne 10
8: aload_3
9: areturn
10: iload_2
11: iconst_1
12: iadd
13: aload_3
14: aload_3
15: getstatic #26; //Field scala/collection/immutable/List$.MODULE$:Lscala/collection/immutable/List$;
18: getstatic #31; //Field scala/Predef$.MODULE$:Lscala/Predef$;
21: iconst_1
22: anewarray #17; //class java/lang/String
25: dup
26: iconst_0
27: getstatic #31; //Field scala/Predef$.MODULE$:Lscala/Predef$;
30: aload_1
31: invokevirtual #35; //Method scala/Predef$.augmentString:(Ljava/lang/String;)Lscala/collection/immutable/StringOps;
34: iload_2
35: invokeinterface #41, 2; //InterfaceMethod scala/collection/immutable/StringLike.apply:(I)C
40: invokestatic #47; //Method scala/runtime/BoxesRunTime.boxToCharacter:(C)Ljava/lang/Character;
43: invokevirtual #53; //Method java/lang/Object.toString:()Ljava/lang/String;
46: aastore
47: checkcast #55; //class "[Ljava/lang/Object;"
50: invokevirtual #59; //Method scala/Predef$.wrapRefArray:([Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;
53: invokevirtual #62; //Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;
56: new #64; //class RecursionEx$$anonfun$doTailRecursionEx$1
59: dup
60: aload_1
61: iload_2
62: invokespecial #67; //Method RecursionEx$$anonfun$doTailRecursionEx$1."<init>":(Ljava/lang/String;I)V
65: invokeinterface #73, 3; //InterfaceMethod scala/collection/LinearSeqOptimized.foldLeft:(Ljava/lang/Object;Lscala/Function2;)Ljava/lang/Object;
70: checkcast #75; //class scala/collection/TraversableOnce
73: getstatic #26; //Field scala/collection/immutable/List$.MODULE$:Lscala/collection/immutable/List$;
76: invokevirtual #79; //Method scala/collection/immutable/List$.canBuildFrom:()Lscala/collection/generic/CanBuildFrom;
79: invokevirtual #85; //Method scala/collection/immutable/List.$plus$plus:(Lscala/collection/TraversableOnce;Lscala/collection/generic/CanBuildFrom;)Ljava/lang/Object;
82: checkcast #81; //class scala/collection/immutable/List
85: astore_3
86: istore_2
87: goto 0
with a goto at the end, just as you would expect.
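A quick way to convince yourself of the rewrite without reading bytecode is to recurse far deeper than the JVM stack would normally allow. A minimal sketch with a hypothetical sumTo function (not the code from the question):

```scala
import scala.annotation.tailrec

object TailRecDemo {
  // The self-call is in tail position, so the compiler turns it into
  // a jump back to the start of the method instead of a new stack frame.
  @tailrec
  def sumTo(n: Long, acc: Long = 0L): Long =
    if (n == 0) acc
    else sumTo(n - 1, acc + n)
}
```

Calling TailRecDemo.sumTo(1000000L) completes normally, whereas a non-tail-recursive version of the same depth would be at risk of a StackOverflowError.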