Implementing a fixed size, immutable, and specialized vector - scala

For performance and safety I would like to implement a fixed-size vector which is both immutable and specialized (I need fast arithmetic). My first idea was to use the @specialized annotation (because I need both integers and reals).
Here is a first try:
package so
class Vec[@specialized A] private[so] ( ary: Array[A] ) {
def apply( i: Int ) = ary(i)
}
However, when I analyze the resulting bytecode with javap, I can see that the elements are still boxed. For instance:
public double apply$mcD$sp(int);
Code:
0: aload_0
1: iload_1
2: invokevirtual #33; //Method apply:(I)Ljava/lang/Object;
5: invokestatic #83; //Method scala/runtime/BoxesRunTime.unboxToDouble:(Ljava/lang/Object;)D
8: dreturn
It looks like arrays are not specialized, which seems silly because arrays are specialized on the JVM.
Is there anything I can still do to reach my goal?

You are likely looking at the code compiled to Vec.class. According to this thread the specialization occurs in subclasses. This can be verified in the REPL:
scala> class Vec[@specialized A] ( ary: Array[A] ) {
| def apply( i: Int ) = ary(i)
| }
defined class Vec
scala> new Vec( Array[Int](1) ).getClass
res0: java.lang.Class[_ <: Vec[Int]] = class Vec$mcI$sp
As you can see, for Int it is using the subclass Vec$mcI$sp. If you run javap on that class you will see that it is in fact specializing the code properly. This is what the apply method looks like in Vec$mcI$sp.class using javap:
public int apply(int);
flags: ACC_PUBLIC
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: iload_1
2: invokevirtual #13 // Method apply$mcI$sp:(I)I
5: ireturn
Which I suppose is what you want when using Int.
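The REPL check above can also be written as a small standalone program. This is only a sketch: the names SpecializedVecDemo, ints, doubles and strings are made up for illustration, and restricting the annotation to Int and Double is an assumption made here to keep the number of generated classes small.

```scala
object SpecializedVecDemo {
  // Same shape as the Vec in the answer. Specialized variants are generated
  // as subclasses, so the generic Vec.class still contains boxing code.
  class Vec[@specialized(Int, Double) A](ary: Array[A]) {
    def apply(i: Int): A = ary(i)
    def length: Int = ary.length
  }

  val ints    = new Vec(Array[Int](1, 2, 3))     // A = Int
  val doubles = new Vec(Array[Double](1.5, 2.5)) // A = Double
  val strings = new Vec(Array("a"))              // A = String, not specialized

  def main(args: Array[String]): Unit = {
    // Print the runtime classes that were actually instantiated.
    println(ints.getClass.getName)
    println(doubles.getClass.getName)
    println(strings.getClass.getName)
  }
}
```

On a Scala 2 compiler the first two names should end in $mcI$sp and $mcD$sp, in line with the REPL output above, while the String vector falls back to the generic class.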

Related

Scala: why do foo(1,2) and foo((1,2)) work the same?

Say I have a Scala function :
def func(x:(Int,Int)):Int = x._1 + x._2
func((1,2)) // This works as expected
But how come below function call also works correctly?
func(1,2)
I know that a function call can be turned into a call to an object's apply method, but even so I cannot see how this works.
If there are no appropriate multi-argument methods and a single appropriate one-argument method, the Scala compiler will try to convert those comma separated arguments into tuples.
The type of the argument x to your func method is (Int, Int), which is syntactic sugar for Tuple2[Int, Int]. So the signature of the func method is actually func(Tuple2[Int, Int]).
You invoke it as func(1, 2), but there's no method with signature func(Int, Int) defined in the scope, so the compiler will roughly translate the invocation to func(Tuple2(1, 2)), which matches the signature of your method. So this kind of invocation will work, but can lead to unexpected results (it's not hard to see why).
EDIT: Also see this question for additional reading.
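A short sketch of this argument adaptation, including one of the surprising cases alluded to above. The object name AutoTuplingDemo and the helper describe are hypothetical and only serve the illustration; depending on compiler version and lint flags, the adapted calls may produce a warning.

```scala
object AutoTuplingDemo {
  // Takes ONE parameter of type Tuple2[Int, Int].
  def func(x: (Int, Int)): Int = x._1 + x._2

  val explicit: Int = func((1, 2)) // tuple passed explicitly
  val adapted: Int  = func(1, 2)   // no func(Int, Int) exists, so the
                                   // arguments are packed into a tuple

  // The surprising side: a one-argument method that accepts Any will
  // silently receive a tuple when called with two arguments.
  def describe(x: Any): String = x.toString
  val surprise: String = describe(1, 2)
}
```

Here describe(1, 2) does not fail to compile; it receives the tuple (1, 2), which is probably not what the caller meant.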
This is part of Scala's syntax:
(x_1, …, x_n) and ((x_1, …, x_n)) are shorthand for Tuplen(x_1, …, x_n).
See Tuples, revised.
The generated bytecode shows the same thing:
scala> def bar(x: Int, y: Int) = func(x, y)
scala> :javap -c bar
Compiled from "<console>"
public class $line5.$read$$iw$$iw$ {
public static $line5.$read$$iw$$iw$ MODULE$;
public static {};
Code:
0: new #2 // class $line5/$read$$iw$$iw$
3: invokespecial #23 // Method "<init>":()V
6: return
public int bar(int, int);
Code:
0: getstatic #30 // Field $line3/$read$$iw$$iw$.MODULE$:L$line3/$read$$iw$$iw$;
3: new #32 // class scala/Tuple2$mcII$sp
6: dup
7: iload_1
8: iload_2
9: invokespecial #35 // Method scala/Tuple2$mcII$sp."<init>":(II)V
12: invokevirtual #39 // Method $line3/$read$$iw$$iw$.func:(Lscala/Tuple2;)I
15: ireturn
public $line5.$read$$iw$$iw$();
Code:
0: aload_0
1: invokespecial #42 // Method java/lang/Object."<init>":()V
4: aload_0
5: putstatic #44 // Field MODULE$:L$line5/$read$$iw$$iw$;
8: return
}
We can see that the compiler inserts the tuple construction: new #32 // class scala/Tuple2$mcII$sp
I think this is equivalent to Function.untupled, for example:
scala> Function.untupled(func _)(1, 2)
res1: Int = 3

Why can I use :: operator with Seq in pattern matching but not elsewhere

So I have been really confused with this behavior regarding Seq in Scala.
When using pattern matching I am able to use either :: or +: operator and they seem interchangeable
val s=Seq(1,2,3)
s match{
case x :: l => ...
but when I try to use :: in a different situation, like so:
val s=1::Seq(2,3)
I receive a "value :: is not a member of Seq[Int]" message. I understand that I am supposed to use the +: and :+ operators with Seq, but why does :: work only in the pattern matching scenario?
:: is for Lists, and in fact Seq.apply will currently give you a List:
scala> val s = Seq(1,2,3)
s: Seq[Int] = List(1, 2, 3)
So the type of value s is Seq[Int], but the object it points to is of type List[Int]. That's fine, because List extends Seq. And that will of course match a pattern involving :: because it is in fact a List:
scala> s match { case x :: xs => x }
res2: Int = 1
But the type of expression Seq(1,2,3) is not List[Int] but Seq[Int] -- even though the actual object is indeed a List. So the following fails because Seq does not define a :: method:
scala> val s = 1 :: Seq(2,3)
<console>:7: error: value :: is not a member of Seq[Int]
val s = 1 :: Seq(2,3)
You have to use the method for Seq instead:
scala> val s = 1 +: Seq(2,3)
s: Seq[Int] = List(1, 2, 3)
The key to your confusion is that when you invoke a method on a value like s, the set of methods available depends entirely on the value's static type, whereas the pattern match checks that the object being matched is of class ::.
To show this, let's compile some sample code and use javap to see the bytecode; the first few instructions of the first method check that the argument is of class :: (rather than some other class extending Seq) and cast to it:
NS% cat Test.scala
object Test {
def first(xs: Seq[Int]) = xs match { case x :: xs => x }
}
NS% javap -c Test\$.class
Compiled from "Test.scala"
public final class Test$ {
public static final Test$ MODULE$;
public static {};
Code:
0: new #2 // class Test$
3: invokespecial #12 // Method "<init>":()V
6: return
public int first(scala.collection.Seq<java.lang.Object>);
Code:
0: aload_1
1: astore_2
2: aload_2
3: instanceof #16 // class scala/collection/immutable/$colon$colon
6: ifeq 30
9: aload_2
10: checkcast #16 // class scala/collection/immutable/$colon$colon
13: astore_3
14: aload_3
15: invokevirtual #20 // Method scala/collection/immutable/$colon$colon.head:()Ljava/lang/Object;
18: invokestatic #26 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
21: istore 4
23: iload 4
25: istore 5
27: iload 5
29: ireturn
30: new #28 // class scala/MatchError
33: dup
34: aload_2
35: invokespecial #31 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
38: athrow
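The same distinction can be observed at run time without reading bytecode. The sketch below uses hypothetical helper names (headWithCons, headWithPlusColon): a :: pattern only matches when the object really is a non-empty List, while the +: extractor works for any Seq.

```scala
object SeqPatternDemo {
  // Matches only if the runtime object is an instance of the :: class,
  // i.e. a non-empty List.
  def headWithCons(xs: Seq[Int]): Option[Int] = xs match {
    case x :: _ => Some(x)
    case _      => None
  }

  // The +: extractor is defined for sequences in general.
  def headWithPlusColon(xs: Seq[Int]): Option[Int] = xs match {
    case x +: _ => Some(x)
    case _      => None
  }
}
```

Passing Vector(1, 2) to headWithCons yields None even though the static type is Seq[Int] in both cases, because the runtime class is not ::.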
Finally, you could ask why the Scala folks didn't make :: the equivalent method (prepend an element) for Seq. If they had, then 1 :: Seq(2,3) would work.
But for Seq they really needed a pair of operators, one to prepend (this one must end in a colon, so that it is right-associative) and one to append. You want to avoid appending an element to a List because you have to traverse the existing elements to do so, but the same is not true for Seqs in general -- e.g. append is quite efficient for a Vector. So they chose +: for prepend and :+ for append.
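As a quick illustration of the operator pair, here is a minimal sketch (the object and value names are invented for the example). The colon always sits on the side of the collection, and :: remains available once you actually have a List.

```scala
object PrependAppendDemo {
  val base: Seq[Int] = Vector(2, 3)

  val prepended: Seq[Int] = 1 +: base // right-associative: base.+:(1)
  val appended: Seq[Int]  = base :+ 4 // ordinary left-associative call

  // :: is a List method, so convert first if the static type is only Seq.
  val consed: List[Int] = 1 :: base.toList
}
```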
Of course, you could ask why they didn't use +: for List to match Seq. I don't know the full answer to that. I do know that :: comes from other languages which have list structures, so in part the answer is probably consistency with established conventions. And perhaps they did not realize that they wanted a matching pair of operators for a supertype of List until it was too late -- not sure. Does anyone know the history here?
:: (pronounced "cons") is an operator for Lists.
Seq is a generic trait that all Scala sequences inherit. This is a generic interface for any type of sequence and not just for lists.
Because the Seq() factory method currently returns a List by default, pattern matching on its result can use cons.
So you can do
val s = 1::List(2,3)
But not
val s = 1::Seq(2,3)

How can I get Class[T] from an implicit ClassTag[T]?

I'd like to have a method like
def retrieve[T](value: Option[T])(implicit ct: ClassTag[T]): T;
Inside this method I need to call a Java method (beyond my control) to create an instance of T that requires Class[T]:
public <T> T construct(Class<T> clazz /* other arguments */) { ... }
How can I get Class[T] from ClassTag[T]? First I thought I could use runtimeClass from ClassTag, but its type is Class[_], not Class[T]. Or is there any other implicit value that the compiler can automatically provide, from which I can obtain Class[T]?
Here is the ticket on getClass and the linked forum discussion in which Odersky speculates:
You could also use a cast.
Here is the duplicate ticket where getClass is fixed. 5.getClass also casts:
/** Return the class object representing an unboxed value type,
* e.g. classOf[int], not classOf[java.lang.Integer]. The compiler
* rewrites expressions like 5.getClass to come here.
*/
def anyValClass[T <: AnyVal : ClassTag](value: T): jClass[T] =
classTag[T].runtimeClass.asInstanceOf[jClass[T]]
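The same cast can be wrapped in a small helper for the retrieve method from the question. This is a sketch under assumptions: classOfT is a hypothetical helper, and construct is a Scala stand-in for the Java method, which is not shown in full in the question; here it simply calls the no-argument constructor.

```scala
import scala.reflect.ClassTag

object ClassTagDemo {
  // runtimeClass is typed Class[_]; the cast narrows it to Class[T].
  // For primitives such as Int this is the primitive class (int), which
  // cannot be instantiated reflectively.
  def classOfT[T](implicit ct: ClassTag[T]): Class[T] =
    ct.runtimeClass.asInstanceOf[Class[T]]

  // Stand-in for the Java construct(Class<T>) method (assumption).
  def construct[T](clazz: Class[T]): T =
    clazz.getDeclaredConstructor().newInstance()

  // Returns the wrapped value if present, otherwise builds a new T.
  def retrieve[T: ClassTag](value: Option[T]): T =
    value.getOrElse(construct(classOfT[T]))
}
```

The cast is unchecked, so it is only as trustworthy as the ClassTag that the compiler supplies at the call site.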
The limitation is reminiscent of this question about pattern matching with ClassTag, in which our naive expectations are also not met.
Does the resistance to Class[A] represent the impedance mismatch between Scala types and the platform?
Given the class type, all one can really do is newInstance. But reflective invocation with a constructor mirror won't give me my type back.
scala> res24 reflectConstructor res25.asMethod
res27: reflect.runtime.universe.MethodMirror = constructor mirror for Bar.<init>(): Bar (bound to null)
scala> res27()
res28: Any = Bar@2eeb08d9
scala> bar.getClass.newInstance
res29: Bar = Bar@31512f0a
scala> classOf[Bar].newInstance
res30: Bar = Bar@2bc1d89f
That doesn't seem fair.
As that mailing thread from 2008 concludes, you expect to use fewer casts in Scala.
BTW, it's not that I disbelieved the code comment, but:
scala> 5.getClass
res38: Class[Int] = int
scala> :javap -
Size 1285 bytes
MD5 checksum a30a28543087238b563fb1983d7d139b
Compiled from "<console>"
[snip]
9: getstatic #27 // Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
12: iconst_5
13: invokestatic #33 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
16: getstatic #38 // Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
19: invokevirtual #42 // Method scala/reflect/ClassTag$.Int:()Lscala/reflect/ClassTag;
22: invokevirtual #46 // Method scala/runtime/ScalaRunTime$.anyValClass:(Ljava/lang/Object;Lscala/reflect/ClassTag;)Ljava/lang/Class;
25: putfield #18 // Field res38:Ljava/lang/Class;
28: return

Why are `private val` and `private final val` different?

I used to think that private val and private final val were the same, until I saw section 4.1 of the Scala Reference:
A constant value definition is of the form
final val x = e
where e is a constant expression (§6.24). The final modifier must be present and no type annotation may be given. References to the constant value x are themselves treated as constant expressions; in the generated code they are replaced by the definition’s right-hand side e.
And I have written a test:
class PrivateVal {
private val privateVal = 0
def testPrivateVal = privateVal
private final val privateFinalVal = 1
def testPrivateFinalVal = privateFinalVal
}
javap -c output:
Compiled from "PrivateVal.scala"
public class PrivateVal {
public int testPrivateVal();
Code:
0: aload_0
1: invokespecial #19 // Method privateVal:()I
4: ireturn
public int testPrivateFinalVal();
Code:
0: iconst_1
1: ireturn
public PrivateVal();
Code:
0: aload_0
1: invokespecial #24 // Method java/lang/Object."<init>":()V
4: aload_0
5: iconst_0
6: putfield #14 // Field privateVal:I
9: return
}
The bytecode is just as the Scala Reference says: private val is not the same as private final val.
Why doesn't scalac just treat private val as private final val? Is there any underlying reason?
So, this is just a guess, but it was a perennial annoyance in Java that final static variables with a literal on the right-hand side get inlined into bytecode as constants. That engenders a performance benefit sure, but it causes binary compatibility of the definition to break if the "constant" ever changed. When defining a final static variable whose value might need to change, Java programmers have to resort to hacks like initializing the value with a method or constructor.
A val in Scala is already final in the Java sense. It looks like Scala's designers are using the redundant modifier final to mean "permission to inline the constant value". So Scala programmers have complete control over this behavior without resorting to hacks: if they want an inlined constant, a value that should never change but is fast, they write "final val". If they want the flexibility to change the value without breaking binary compatibility, they write just "val".
I think the confusion here arises from conflating immutability with the semantics of final. vals can be overridden in child classes and therefore can't be treated as final unless marked as such explicitly.
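A minimal sketch of that difference for non-private members; the class and member names are invented for the example. A plain val can be overridden in a subclass, whereas a final val written without a type annotation and with a constant right-hand side is the "constant value definition" quoted from the specification.

```scala
object ValOverrideDemo {
  class Base {
    val limit: Int = 10 // plain val: read through an accessor, overridable
    final val Max = 100 // final, no type annotation, constant expression
    def describe: String = s"limit=$limit, max=$Max"
  }

  class Child extends Base {
    override val limit: Int = 20
    // override val Max = 200 // would not compile: Max is final
  }
}
```

Because describe reads limit through its accessor, a Child instance reports the overridden value, which is exactly what prevents the compiler from inlining a non-final val in general.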
@Brian The REPL provides class scope at the line level. See:
scala> $iw.getClass.getPackage
res0: Package = package $line3
scala> private val x = 5
<console>:5: error: value x cannot be accessed in object $iw
lazy val $result = `x`
scala> private val x = 5; println(x);
5

Why does scalac generate additional/wrapping closures?

First, consider the following code:
scala> val fail = (x: Any) => { throw new RuntimeException }
fail: Any => Nothing = <function1>
scala> List(1).foreach(fail)
java.lang.RuntimeException
at $anonfun$1.apply(<console>:7)
at $anonfun$1.apply(<console>:7)
at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
There is an additional anonfun frame between foreach and the exception. One is expected to be the value of fail itself (an object of class Function1), but where does the second one come from?
foreach signature takes this function:
def foreach[U](f: A => U): Unit
So, what is the purpose of the second one?
Second, consider the following code:
scala> def outer() {
| def innerFail(x: Any) = { throw new RuntimeException("inner fail") }
|
| Set(1) foreach innerFail
| }
outer: ()Unit
scala> outer()
java.lang.RuntimeException: inner fail
at .innerFail$1(<console>:8)
at $anonfun$outer$1.apply(<console>:10)
at $anonfun$outer$1.apply(<console>:10)
at scala.collection.immutable.Set$Set1.foreach(Set.scala:86)
There are two additional anonfuns... are they really needed? :-E
Let's look at the bytecode.
object ExtraClosure {
val fail = (x: Any) => { throw new RuntimeException }
List(1).foreach(fail)
}
We find, inside the (single) anonymous function:
public final scala.runtime.Nothing$ apply(java.lang.Object);
Code:
0: new #15; //class java/lang/RuntimeException
3: dup
4: invokespecial #19; //Method java/lang/RuntimeException."<init>":()V
7: athrow
public final java.lang.Object apply(java.lang.Object);
Code:
0: aload_0
1: aload_1
2: invokevirtual #27; //Method apply:(Ljava/lang/Object;)Lscala/runtime/Nothing$;
5: athrow
So it's actually not an extra closure after all. We have one method overloaded with two different return types (which is perfectly okay for the JVM, since it treats the return type as part of the method signature). Function is generic, so it has to have an apply that returns Object; but because the code you wrote returns specifically Nothing, the compiler also creates a method that returns the type you'd expect.
There are various ways around this, but none are without their flaws. This is the type of thing that JVMs are pretty good at eliding, however, so I wouldn't worry about it too much.
Edit: And of course in your second example, you used a def, and the anonfun is the class that wraps that def in a function object. That is of course needed since foreach takes a Function1. You have to generate that Function1 somehow.
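The wrapping of a def into a function object can be made visible in source form. In this sketch (names are illustrative only), twice is a method, and each place where a Function1 is expected makes the compiler generate a function value that forwards to it.

```scala
object EtaExpansionDemo {
  // A method, not a function value.
  def twice(x: Int): Int = x * 2

  // The expected type is a function type, so the compiler wraps the
  // method in a Function1 instance (eta-expansion).
  val asFunction: Int => Int = twice

  // The same wrapping happens implicitly when a method name is passed
  // where a function is required.
  val mapped: List[Int] = List(1, 2, 3).map(twice)
}
```

How the wrapper is materialized (an anonymous class or some other mechanism) depends on the compiler version, but a Function1 object has to exist either way.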