In the accepted answer to this question, there is a clear explanation of why boxing happens.
However, when I decompile the code (using Java Decompiler), I cannot see any use of scala.runtime.BoxesRunTime. Furthermore, when I profile the code (using JProfiler), I cannot see any instances of BoxesRunTime.
So how can I actually see proof of boxing/unboxing taking place?
In this code:
class Foo[T] {
def bar(i: T) = i
}
object Main {
def main(args: Array[String]) {
val f = new Foo[Int]
f.bar(5)
}
}
The invocation of bar should first box the integer. Compiling with Scala 2.8.1 and using:
javap -c -l -private -verbose -classpath <dir> Main$
to see the bytecode produced for the main method of the Main class yields:
public void main(java.lang.String[]);
...
9: iconst_5
10: invokestatic #24; //Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
13: invokevirtual #28; //Method Foo.bar:(Ljava/lang/Object;)Ljava/lang/Object;
16: pop
17: return
...
You can see the call to BoxesRunTime before the call to bar.
BoxesRunTime is an object which contains boxing methods for primitive types, so there should be exactly one instance in total. The trick here is that this particular file in the library was written in Java, and the conversions are static methods. For this reason there aren't any instances of it at runtime, although using it in Scala code feels as if it were an object.
You should probably look for boxed primitives (e.g. java.lang.Integer) with JProfiler, though I am uncertain how the JVM works and whether it may actually rewrite the code at runtime and optimize it to avoid boxing. To my knowledge, it shouldn't apply specialization (but I believe the CLR does). A few microbenchmarks with and without the boxing situation are another way to figure out what happens at runtime.
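One way to observe the boxing without a profiler is to ask the value for its runtime class from inside the generic method. This is a minimal sketch, assuming a variant of Foo whose method reports the class name; `describe` and `BoxingDemo` are hypothetical names, not part of the original code:

```scala
// Hypothetical variant of Foo: instead of returning its argument,
// `describe` reports the runtime class of the value it received.
class Foo[T] {
  def describe(i: T): String = i.asInstanceOf[AnyRef].getClass.getName
}

object BoxingDemo {
  def main(args: Array[String]): Unit = {
    // T is erased to Object, so the Int has to be boxed at the call site.
    println(new Foo[Int].describe(5)) // prints "java.lang.Integer"
  }
}
```

Seeing java.lang.Integer here is consistent with the boxToInteger call in the bytecode above.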
EDIT:
The above assumes that the type parameter wasn't annotated with the @specialized annotation. If it is annotated, the boxing/unboxing can be avoided. Certain classes in the standard library are specialized. See this SID.
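As a rough sketch of what the annotated version could look like (assuming Scala 2, where scalac acts on @specialized; `SpecializedDemo` is a made-up name):

```scala
// Sketch only: `@specialized(Int)` asks scalac (Scala 2) to also generate an
// Int-specific variant of the class next to the generic, boxing one.
class Foo[@specialized(Int) T] {
  def bar(i: T): T = i
}

object SpecializedDemo {
  def main(args: Array[String]): Unit = {
    val f = new Foo[Int]
    println(f.bar(5))
    // The runtime class name hints at whether a specialized variant was
    // instantiated; the exact name depends on the compiler, so it is only printed.
    println(f.getClass.getName)
  }
}
```

Running javap on the generated class files, as shown earlier, is still the more direct way to check whether the boxing call has disappeared.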
Given the following Test.scala program:
object Test {
def main(args:Array[String]) {
val list = List(1,5,15)
val res = list.map(e => e*2).filter(e => e>10)
}
}
If I compile with scalac -Xprint:jvm Test.scala, I get this snippet suggesting that specialization occurs (sorry for wide paste):
package <empty> {
final class Test extends java.lang.Object with ScalaObject {
def main(args: Array[java.lang.String]): Unit = {
val list: List = immutable.this.List.apply(scala.this.Predef.wrapIntArray(Array[Int]{1, 5, 15}));
val res: List = list.map({
(new Test$$anonfun$1(): Function1)
}, immutable.this.List.canBuildFrom()).$asInstanceOf[scala.collection.TraversableLike]().filter({
(new Test$$anonfun$2(): Function1)
}).$asInstanceOf[List]();
()
};
def this(): object Test = {
Test.super.this();
()
}
};
@SerialVersionUID(0) @serializable final <synthetic> class Test$$anonfun$1 extends scala.runtime.AbstractFunction1$mcII$sp {
final def apply(e: Int): Int = Test$$anonfun$1.this.apply$mcII$sp(e);
<specialized> def apply$mcII$sp(v1: Int): Int = v1.*(2);
final <bridge> def apply(v1: java.lang.Object): java.lang.Object = scala.Int.box(Test$$anonfun$1.this.apply(scala.Int.unbox(v1)));
def this(): Test$$anonfun$1 = {
Test$$anonfun$1.super.this();
()
}
};
@SerialVersionUID(0) @serializable final <synthetic> class Test$$anonfun$2 extends scala.runtime.AbstractFunction1$mcZI$sp {
final def apply(e: Int): Boolean = Test$$anonfun$2.this.apply$mcZI$sp(e);
<specialized> def apply$mcZI$sp(v1: Int): Boolean = v1.>(10);
final <bridge> def apply(v1: java.lang.Object): java.lang.Object = scala.Boolean.box(Test$$anonfun$2.this.apply(scala.Int.unbox(v1)));
def this(): Test$$anonfun$2 = {
Test$$anonfun$2.super.this();
()
}
}
}
Could be why you don't see any evidence of boxing in bytecode...
An example is worth a thousand words:
class A { def foo: Any = new Object }
class B extends A {
override def foo: AnyVal = 42
}
In Java, the signature @Override public int foo() wouldn't even be allowed, and the overridden method foo in B could only return the wrapper integer type (@Override java.lang.Integer foo()).
Is Scala able to avoid the boxing/unboxing of AnyVal values in the overridden def foo: AnyVal method above?
No, it does not. Scala has to adhere to emitting the correct bytecode:
λ scalac -Xprint:jvm Bar.scala
[[syntax trees at end of jvm]] // Bar.scala
package yuval.tests {
class A extends Object {
def foo(): Object = new Object();
def <init>(): yuval.tests.A = {
A.super.<init>();
()
}
};
class B extends yuval.tests.A {
override def foo(): Object = scala.Int.box(42);
def <init>(): yuval.tests.B = {
B.super.<init>();
()
}
}
}
You can see that although AnyVal was permitted in Scala, the actual method signature for the emitted foo is Object and not AnyVal, and Int is boxed.
Yuval's answer can be generalized: erasure of AnyVal is Object (you can see this e.g. by entering classOf[AnyVal] in the REPL), so whenever you have AnyVal in Scala, you can expect Object in bytecode.
E.g. if you change A to
class A { def foo: AnyVal = 0 }
it's still Object.
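The erased signature can also be checked at runtime with Java reflection. A small sketch using the A and B classes from the question (`ErasureDemo` is a made-up name):

```scala
class A { def foo: Any = new Object }
class B extends A { override def foo: AnyVal = 42 }

object ErasureDemo {
  def main(args: Array[String]): Unit = {
    // Java reflection sees the erased signature, not the Scala one.
    println(classOf[B].getMethod("foo").getReturnType.getName) // prints "java.lang.Object"
    // The value itself arrives as a boxed integer.
    println((new B).foo.asInstanceOf[AnyRef].getClass.getName) // prints "java.lang.Integer"
  }
}
```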
Maybe there is some situation in which using AnyVal itself will avoid boxing, but I would be surprised. It was created pretty much for the compiler's convenience and later picked up another use (value classes), but it's rarely useful in user code (except for defining value classes).
What methods are generated for Scala case classes?
I know that some methods are generated specifically for case classes:
equals
canEqual
What are the others?
Also, I see that I can call productArity() on any case class. How does this work? In other words, why is the following code valid?
case class CaseClass()
object CaseClass {
val cc = new CaseClass()
cc.productArity
}
A good way to find out what methods are generated for a specific class in Scala is to use the javap command.
Find the .class file that was compiled by scalac and then run the javap -private command on it from your respective command line tool. This will show you the constructors, fields, and all methods for a class.
You can do this for your case class to see what kinds of things are automagically supplied by Scala.
Case classes mix in the Product trait, which provides the productArity method. For case classes, productArity returns the number of parameters in the parameter list supplied in the class definition.
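If javap is not at hand, Java reflection gives a similar listing from inside a program. This is only a sketch; `Point` and `GeneratedMethodsDemo` are hypothetical names chosen for illustration:

```scala
case class Point(x: Int, y: Int)

object GeneratedMethodsDemo {
  // Names of the methods the compiler put into Point.class itself.
  def declaredNames: Seq[String] =
    classOf[Point].getDeclaredMethods.map(_.getName).distinct.sorted.toSeq

  def main(args: Array[String]): Unit = {
    declaredNames.foreach(println)
    // productArity is the number of constructor parameters.
    println(Point(1, 2).productArity) // prints 2
  }
}
```

The exact list varies between compiler versions, which is why the sketch prints it rather than assuming its contents.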
Given Test.scala -
case class Test()
You can run scalac Test.scala -print to see exactly what's generated
[[syntax trees at end of cleanup]] // Test.scala
package com {
case class Test extends Object with Product with Serializable {
<synthetic> def copy(): com.Test = new com.Test();
override <synthetic> def productPrefix(): String = "Test";
<synthetic> def productArity(): Int = 0;
<synthetic> def productElement(x$1: Int): Object = {
case <synthetic> val x1: Int = x$1;
case4(){
matchEnd3(throw new IndexOutOfBoundsException(scala.Int.box(x$1).toString()))
};
matchEnd3(x: Object){
x
}
};
override <synthetic> def productIterator(): Iterator = runtime.this.ScalaRunTime.typedProductIterator(Test.this);
<synthetic> def canEqual(x$1: Object): Boolean = x$1.$isInstanceOf[com.Test]();
override <synthetic> def hashCode(): Int = ScalaRunTime.this._hashCode(Test.this);
override <synthetic> def toString(): String = ScalaRunTime.this._toString(Test.this);
override <synthetic> def equals(x$1: Object): Boolean = {
case <synthetic> val x1: Object = x$1;
case5(){
if (x1.$isInstanceOf[com.Test]())
matchEnd4(true)
else
case6()
};
case6(){
matchEnd4(false)
};
matchEnd4(x: Boolean){
x
}
}.&&(x$1.$asInstanceOf[com.Test]().canEqual(Test.this));
def <init>(): com.Test = {
Test.super.<init>();
scala.Product$class./*Product$class*/$init$(Test.this);
()
}
};
<synthetic> object Test extends scala.runtime.AbstractFunction0 with Serializable {
final override <synthetic> def toString(): String = "Test";
case <synthetic> def apply(): com.Test = new com.Test();
case <synthetic> def unapply(x$0: com.Test): Boolean = if (x$0.==(null))
false
else
true;
<synthetic> private def readResolve(): Object = com.this.Test;
case <synthetic> <bridge> <artifact> def apply(): Object = Test.this.apply();
def <init>(): com.Test.type = {
Test.super.<init>();
()
}
}
}
It's true that a case class automatically defines equals and canEqual methods, but it also defines getter methods for the constructor arguments. There's also a toString method that you can call.
A case class is also an instance of Product and thus inherits its methods. This is why you can call productArity.
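A short sketch of these generated members in use (`User` and `CaseClassDemo` are made-up names for illustration):

```scala
case class User(name: String, age: Int)

object CaseClassDemo {
  def main(args: Array[String]): Unit = {
    val u = User("Ann", 30)
    println(u.name)               // getter for a constructor argument, prints "Ann"
    println(u)                    // generated toString, prints "User(Ann,30)"
    println(u == User("Ann", 30)) // generated equals, prints "true"
    println(u.copy(age = 31))     // generated copy, prints "User(Ann,31)"
    println(u.productElement(0))  // inherited from Product, prints "Ann"
  }
}
```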
I'm currently making extensive use of the type class pattern in performance-relevant portions of my code. I have identified at least two potential sources of inefficiency.
The implicit parameters get passed along with method calls. I don't know whether this really happens. Maybe scalac can simply insert the implicit parameters where they are used and remove them from the method signature. This is probably not possible in cases where you insert the implicit parameters manually, since they might be resolved at runtime only. What optimizations apply to passing implicit parameters?
If the type class instance is provided by a def (as opposed to a val), the object has to be recreated on every invocation of a "type classed method". This issue may be addressed by the JVM, which might optimize the object creation away. It might also be addressed by scalac by reusing these objects. What optimizations apply to the creation of implicit parameter objects?
And of course there might be additional sources of inefficiency when applying the type class pattern. Please tell me about them.
If you genuinely care about writing ultra-high-performance code (and you may think you do but be very wrong about this) then typeclasses are going to cause some pain for the following reasons:
Many extra virtual method calls
Likely boxing of primitives (e.g. if using scalaz's typeclasses for monoids etc)
Object creations via def which are necessary because functions cannot be parameterized
Object creations to access the "pimped" methods
At runtime, the JVM may optimize some of the extraneous creations away (e.g. the creation of an MA simply to call <*>), but scalac does not do much to help. You can see this trivially by compiling some code which uses typeclasses and using -Xprint:icode as an argument.
Here's an example:
import scalaz._; import Scalaz._
object TC {
def main(args: Array[String]) {
println((args(0).parseInt.liftFailNel |@| args(1).parseInt.liftFailNel)(_ |+| _))
}
}
And here's the icode:
final object TC extends java.lang.Object with ScalaObject {
def main(args: Array[java.lang.String]): Unit = scala.this.Predef.println(scalaz.this.Scalaz.ValidationMA(scalaz.this.Scalaz.StringTo(args.apply(0)).parseInt().liftFailNel()).|@|(scalaz.this.Scalaz.StringTo(args.apply(1)).parseInt().liftFailNel()).apply({
(new anonymous class TC$$anonfun$main$1(): Function2)
}, scalaz.this.Functor.ValidationFunctor(), scalaz.this.Apply.ValidationApply(scalaz.this.Semigroup.NonEmptyListSemigroup())));
def this(): object TC = {
TC.super.this();
()
}
};
@SerialVersionUID(0) final <synthetic> class TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$2 extends scala.runtime.AbstractFunction0 with Serializable {
final def apply(): Int = TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$2.this.v1$1;
final <bridge> def apply(): java.lang.Object = scala.Int.box(TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$2.this.apply());
<synthetic> <paramaccessor> private[this] val v1$1: Int = _;
def this($outer: anonymous class TC$$anonfun$main$1, v1$1: Int): anonymous class TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$2 = {
TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$2.this.v1$1 = v1$1;
TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$2.super.this();
()
}
};
@SerialVersionUID(0) final <synthetic> class TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$1 extends scala.runtime.AbstractFunction0$mcI$sp with Serializable {
final def apply(): Int = TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$1.this.apply$mcI$sp();
<specialized> def apply$mcI$sp(): Int = TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$1.this.v2$1;
final <bridge> def apply(): java.lang.Object = scala.Int.box(TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$1.this.apply());
<synthetic> <paramaccessor> private[this] val v2$1: Int = _;
def this($outer: anonymous class TC$$anonfun$main$1, v2$1: Int): anonymous class TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$1 = {
TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$1.this.v2$1 = v2$1;
TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$1.super.this();
()
}
};
@SerialVersionUID(0) final <synthetic> class TC$$anonfun$main$1 extends scala.runtime.AbstractFunction2$mcIII$sp with Serializable {
final def apply(x$1: Int, x$2: Int): Int = TC$$anonfun$main$1.this.apply$mcIII$sp(x$1, x$2);
<specialized> def apply$mcIII$sp(v1$1: Int, v2$1: Int): Int = scala.Int.unbox(scalaz.this.Scalaz.mkIdentity({
(new anonymous class TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$2(TC$$anonfun$main$1.this, v1$1): Function0)
}).|+|({
(new anonymous class TC$$anonfun$main$1$$anonfun$apply$mcIII$sp$1(TC$$anonfun$main$1.this, v2$1): Function0)
}, scalaz.this.Semigroup.IntSemigroup()));
final <bridge> def apply(v1: java.lang.Object, v2: java.lang.Object): java.lang.Object = scala.Int.box(TC$$anonfun$main$1.this.apply(scala.Int.unbox(v1), scala.Int.unbox(v2)));
def this(): anonymous class TC$$anonfun$main$1 = {
TC$$anonfun$main$1.super.this();
()
}
}
}
You can see there's a huge amount of object creation going on here.
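The cost of def-provided instances mentioned in the list above can be illustrated without scalaz. The following is a minimal sketch under stated assumptions: the `Show` type class, its instances and the counter are hypothetical and exist only to make the allocations visible.

```scala
trait Show[A] { def show(a: A): String }

object Show {
  // Counts how often the def-based instance below is constructed.
  var listInstancesCreated = 0

  // A val instance is created once and then reused.
  implicit val intShow: Show[Int] = new Show[Int] {
    def show(a: Int): String = a.toString
  }

  // A def is needed here because the instance is parameterized;
  // its body runs every time the implicit is resolved.
  implicit def listShow[A](implicit elem: Show[A]): Show[List[A]] = {
    listInstancesCreated += 1
    new Show[List[A]] {
      def show(as: List[A]): String = as.map(elem.show).mkString("[", ", ", "]")
    }
  }

  // The context bound becomes an extra implicit parameter of render.
  def render[A: Show](a: A): String = implicitly[Show[A]].show(a)
}

object TypeClassCostDemo {
  def main(args: Array[String]): Unit = {
    (1 to 3).foreach(_ => Show.render(List(1, 2)))
    println(Show.listInstancesCreated) // prints 3: one new instance per call
  }
}
```

Whether the JIT later removes these allocations (for example through escape analysis) is a separate question that only measurement can answer.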
I stumbled over the fact that
def foo(f: Int => Unit) {}
def foo(f: Long => Unit) {}
doesn't compile because method foo is defined twice. I know that the above is only shorthand for
def foo(f: Function1[Int, Unit]) {}
def foo(f: Function1[Long, Unit]) {}
and that after type erasure both methods have the same signature.
Now I've read in "Try out specialized Function1/Function2 in 2.8.0 RC1!" that Function1 and Function2 have @specialized versions for Int, Long and Double since Scala 2.8. That surely means that Function1[Int, Unit] and Function1[Long, Unit] have separate class files at JVM level.
Wouldn't both signatures then be different?
Is the problem that the second type parameter will continue to be erased? But there is the same problem with
class Bar[@specialized T]
def foo(f: Bar[Int]) {}
def foo(f: Bar[Long]) {}
which doesn't compile either.
@specialized has nothing to do with type erasure, at least in this case. It means that an extra version of your class is generated with the native type in the position. This notably saves on boxing/unboxing.
So you define a class like:
class MyClass[@specialized(Int) T] {
def foobar(t: T) = {}
}
and you get two classes as output (approximately, in pseudocode):
class MyClass[java.lang.Object] {
def foobar(t: java.lang.Object) = {}
}
class MyClass[int] {
def foobar(t: int) = {}
}
You need to have two implementations of the class because you can't always guarantee that the one with the correct native type will be called. The Scala compiler will choose which one to call. Note that the Java compiler has no idea this specialization is taking place, so it must call the unspecialized methods.
In fact, the output is the following (via JAD):
public class MyClass implements ScalaObject {
public void foobar(Object obj) { }
public void foobar$mcI$sp(int t) {
foobar(BoxesRunTime.boxToInteger(t));
}
public MyClass() { }
}
public class MyClass$mcI$sp extends MyClass {
public void foobar(int t) {
foobar$mcI$sp(t);
}
public void foobar$mcI$sp(int i) { }
public volatile void foobar(Object t) {
foobar(BoxesRunTime.unboxToInt(t));
}
public MyClass$mcI$sp() {}
}
So your type erasure problem will not be fixed with @specialized.
Both for compatibility and for cases where the type parameters of Function1 are unknown, a method with the signature as if Function1 was not specialized must also be generated.
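The identical erased signatures can be checked with Java reflection. A small sketch, assuming two differently named methods (`fooInt` and `fooLong` are hypothetical, since the two overloads named `foo` would not compile):

```scala
class Api {
  // Distinct names, because two overloads named `foo` are rejected by scalac.
  def fooInt(f: Int => Unit): Unit = f(1)
  def fooLong(f: Long => Unit): Unit = f(1L)
}

object ErasedSignatureDemo {
  // Parameter types of the named method as the JVM sees them.
  def paramTypes(name: String): List[Class[_]] =
    classOf[Api].getMethods.find(_.getName == name).get.getParameterTypes.toList

  def main(args: Array[String]): Unit = {
    println(paramTypes("fooInt"))
    println(paramTypes("fooLong"))
    println(paramTypes("fooInt") == paramTypes("fooLong")) // prints "true"
  }
}
```

Both methods take a plain scala.Function1 at the JVM level, which is why giving them the same name causes the double-definition error.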
Inspired especially by Matthew Farwell's answer I've tried the following
class Bar[@specialized(Int) T](val t: T)
class Foo {
def foo(b: Bar[_]) { print(b.t) }
}
val bari = new Bar(1)
print(bari.t)
foo(bari)
with scalac -print and got:
// unspecialized version Bar[_] = Bar[Object]
class Bar extends Object with ScalaObject {
protected[this] val t: Object = _;
def t(): Object = Bar.this.t;
def t$mcI$sp(): Int = Int.unbox(Bar.this.t());
def specInstance$(): Boolean = false;
def this(t: Object): Bar = {
Bar.this.t = t;
Bar.super.this();
()
}
};
// specialized version Bar[Int]
class Bar$mcI$sp extends Bar {
protected[this] val t$mcI$sp: Int = _;
// inside of a specialized class methods are specialized,
// so the `val t` accessor is compiled twice:
def t$mcI$sp(): Int = Bar$mcI$sp.this.t$mcI$sp;
override def t(): Int = Bar$mcI$sp.this.t$mcI$sp();
def specInstance$(): Boolean = true;
override def t(): Object = Int.box(Bar$mcI$sp.this.t());
def this(t$mcI$sp: Int): Bar$mcI$sp = {
Bar$mcI$sp.this.t$mcI$sp = t$mcI$sp;
Bar$mcI$sp.super.this(null);
()
}
}
class Foo extends Object with ScalaObject {
// scalac compiles only ONE foo method not one for every special case
def foo(b: Bar): Unit = Predef.print(b.t());
def this(): Foo = {
Foo.super.this();
()
}
};
val bari: or.gate.Bar = new or.gate.Bar$mcI$sp(1);
// specialized version of `val t` accessor is used:
Predef.print(scala.Int.box(bari.t$mcI$sp()));
Foo.this.foo(bari)
But through foo, only the unspecialized version of the val t accessor is used, even for the specialized instance bari, and indirectly bari's overridden method def t(): Object = Int.box(Bar$mcI$sp.this.t()); is called.
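The same effect can be observed at runtime. This is a sketch only; `runtimeClassOfT` and `WildcardDemo` are hypothetical names that stand in for the foo method above:

```scala
class Bar[@specialized(Int) T](val t: T)

object WildcardDemo {
  // Only one version of this method exists; it works on the erased view of Bar,
  // so the value of `t` reaches it as an object.
  def runtimeClassOfT(b: Bar[_]): String =
    b.t.asInstanceOf[AnyRef].getClass.getName

  def main(args: Array[String]): Unit = {
    val bari = new Bar(1)
    println(bari.t + 1)            // statically an Int here, prints 2
    println(runtimeClassOfT(bari)) // prints "java.lang.Integer"
  }
}
```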
According to this, Scala methods belong to a class. However, if I define a method in the REPL or in a script that I then execute using scala, what class does the method belong to?
scala> def hoho(str:String) = {println("hoho " + str)}
hoho: (str: String)Unit
scala> hoho("rahul")
hoho rahul
In this example, what class does the method belong to?
The REPL wraps all your statements (actually rewrites your statements) in objects automagically. You can see it in action if you print the intermediate code by using the -Xprint:typer option:
scala> def hoho(str:String) = {println("hoho " + str)}
[[syntax trees at end of typer]]// Scala source: <console>
package $line1 {
final object $read extends java.lang.Object with ScalaObject {
def this(): object $line1.$read = {
$read.super.this();
()
};
final object $iw extends java.lang.Object with ScalaObject {
def this(): object $line1.$read.$iw = {
$iw.super.this();
()
};
final object $iw extends java.lang.Object with ScalaObject {
def this(): object $line1.$read.$iw.$iw = {
$iw.super.this();
()
};
def hoho(str: String): Unit = scala.this.Predef.println("hoho ".+(str))
}
}
}
}
So your method hoho is really $line1.$read.$iw.$iw.hoho. Then when you use hoho("foo") later on, it'll rewrite to add the package and outer objects.
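The wrapping can be imitated by hand to make the idea concrete. In this sketch the object names `line1`, `read` and `iw` are illustrative stand-ins for the generated `$line1`, `$read` and `$iw`, and hoho returns its text instead of printing it so that the result can be inspected:

```scala
// Hand-written stand-ins for the wrappers the REPL generates.
object line1 {
  object read {
    object iw {
      def hoho(str: String): String = "hoho " + str
    }
  }
}

object ReplWrapperDemo {
  def main(args: Array[String]): Unit = {
    // A later use of hoho resolves to the fully qualified member.
    println(line1.read.iw.hoho("rahul")) // prints "hoho rahul"
    // The method is an ordinary member of the innermost object's class.
    val owner = line1.read.iw.getClass.getMethod("hoho", classOf[String]).getDeclaringClass
    println(owner.getName)
  }
}
```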
Additional notes: for scripts, -Xprint:typer (-Xprint:parser) reveals that the code is wrapped inside a code block in the main(args:Array[String]) of an object Main. You have access to the arguments as args or argv.