Understanding #specialized with Non-Primitives - scala

Looking at #specialized's docs, I see:
scala> class MyList[#specialized T]
defined class MyList
My incomplete understanding is that MyList accepts a generic parameter, T, that must be a primitive.
scala> new MyList[Int] {}
res1: MyList[Int] = $anon$1#17884d
But, I then made a case class.
scala> case class Zig(x: String)
defined class Zig
However, given my above assumption, I did not expect to be able to new a MyList with a parameterized type of Zig.
scala> new MyList[Zig]
res2: MyList[Zig] = MyList#62de73eb
What am I missing?

The #specialized annotation adds in additional implementations of the class (hidden away in the bytecode), that are implemented in such away as to avoid wrapping primitive types. In terms of Java they'd use int rather than Integer, as the constant wrapping and unwrapping can be quite bad for performance.
But it still retains the implementation that you write that can take any type.

Related

Functions vs function pointers

Can someone explain the following to me:
scala> def squared(x: Int) = x * x
squared: (x: Int)Int
scala> val sq : (Int) => Int = squared
sq: Int => Int = <function1>
scala> sq.getClass
res111: Class[_ <: Int => Int] = class $anonfun$1
I understand this so far, squared is a function while sq is a function pointer.
But then I do this:
scala> squared.getClass
<console>:13: error: missing arguments for method squared;
follow this method with `_' if you want to treat it as a partially applied function
squared.getClass
^
Why can't I invoke getClass on squared ? After all, aren't functions 1st class objects ? Why do I need to do this for it to work ?
scala> squared(7).getClass
res113: Class[Int] = int
And I also get the same result for
scala> sq(5).getClass
res115: Class[Int] = int
Come to think of it, why do
scala>squared(5)
and
scala> sq(5)
produce the same result, even though one is a function, and the other a function pointer without needing to use a different syntax ?
Something akin to *sq(5) may have been clearer, no ?
The concept of a pointer isn't really relevant here, or in Scala (or the JVM) more generally. The difference between squared and sq is that squared is a method, and sq is a function.
Scala is (primarily) a language designed to be compiled to JVM bytecode. The JVM doesn't have first class functions, but it does have methods, which are associated either with an instance of a class (instance methods) or simply with the class itself (static methods). Methods in this sense are a fundamentally different kind of thing than objects to the JVM—they can't be passed as arguments to other methods, etc.
Because Scala is a functional language and functional languages are built on the idea of higher-order functions and first class functions more generally, the Scala language designers needed to be able to encode functions in a way that would work on the JVM. This is done via a Function1 class, and when you write something like this:
val sq: (Int) => Int = x => x * x
You are using Scala's syntactic sugar for creating instances of the Function1 class. These things are functions encoded as JVM objects, so they can be passed around and treated as first class things in the language.
Scala doesn't abandon the idea of methods, though. For reasons related in part to Scala's functional-OOP hybridity and in part to issues of performance, most Scala programs make extensive use of def definitions, which define methods, not "functions" in the sense of Function1. Scala provides a special conversion process (called eta expansion) by which methods can be treated as functions in many situations (including the right-hand side of your sq definition here).
If this all seems confusing, believe me, it is. You get used to it after a while, though (just get the idea of pointers out of your head as quickly as possible).
I understand this so far, squared is a function while sq is a function pointer.
Incorrect. Scala has neither explicit pointers nor function pointers
scala> (squared _).getClass
res4: Class[_ <: Int => Int] = class $anonfun$1
scala> sq.getClass
res5: Class[_ <: Int => Int] = class $anonfun$1
scala> :type squared _
Int => Int
scala> :type sq
Int => Int
Both have the same type. The difference is that functions which are defd do not live in the same namespace as vals or vars and code which refers to one is parsed differently from code which refers to the other.
Specifically, a reference to a val is an expression which can be evaluated. A reference to a defd function is not an expression and you can't apply operators like . which only apply to values.

Scala : Does variable type inference affect performance?

In Scala, you can declare a variable by specifying the type, like this: (method 1)
var x : String = "Hello World"
or you can let Scala automatically detect the variable type (method 2)
var x = "Hello World"
Why would you use method 1? Does it have a performance benefit?
And once the variable has been declared, will it behave exactly the same in all situations wether it has been declared by method 1 or method 2?
Type inference is done at compile time - it's essentially the compiler figuring out what you mean, filling in the blanks, and then compiling the resulting code.
What this means is that there can be no runtime cost to type inference. The compile time cost, however, can sometimes be prohibitive and require you to explicitly annotate some of your expressions.
You will not have any performance difference using this two variants.
They will both be compiled to the same code.
The other answers assume that the compiler inferred what you think it inferred.
It is easy to demonstrate that specifying the type in a definition will set the expected type for the RHS of the definition and guide type inference.
For example, in this method that builds a collection of something, A is inferred to be Nothing, which may not be what you wanted:
scala> def build[A, B, C <: Iterable[B]](bs: B*)(implicit cbf: CanBuildFrom[A, B, C]): C = {
| val b = cbf(); println(b.getClass); b ++= bs; b.result }
build: [A, B, C <: Iterable[B]](bs: B*)(implicit cbf: scala.collection.generic.CanBuildFrom[A,B,C])C
scala> val xs = build(1,2,3)
class scala.collection.immutable.VectorBuilder
xs: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3)
scala> val xs: List[Int] = build(1,2,3)
class scala.collection.mutable.ListBuffer
xs: List[Int] = List(1, 2, 3)
scala> val xs: Seq[Int] = build(1,2,3)
class scala.collection.immutable.VectorBuilder
xs: Seq[Int] = Vector(1, 2, 3)
Obviously, it matters for runtime performance whether you get a List or a Vector.
This is a lame example, but in many expressions you wouldn't notice the type of an intermediate collection unless it caused a performance problem.
Sample conversations:
https://groups.google.com/forum/#!msg/scala-language/mQ-bIXbC1zs/wgSD4Up5gYMJ
http://grokbase.com/p/gg/scala-user/137mgpjg98/another-funny-quirk
Why is Seq.newBuilder returning a ListBuffer?
https://groups.google.com/forum/#!topic/scala-user/1SjYq_qFuKk
In the simple example you gave, there is no difference in the generated byte code, and therefore no difference in performance. It would also make no noticeable difference in compilation speed.
In more complex code (likely involving implicits) you could run into cases where compile-type performance would be noticeably improved by specifying some types. However, I would completely ignore this until and unless you run into it -- specify types or not for other, better reasons.
More in line with your question, there is one very important case where it is a good idea to specify the type to ensure good run-time performance. Consider this code:
val x = new AnyRef { def sayHi() = println("Howdy!") }
x.sayHi
That code uses reflection to call sayHi, and that's a huge performance hit. Recent versions of Scala will warn you about this code for that reason, unless you have enabled the language feature for it:
warning: reflective access of structural type member method sayHi should be enabled
by making the implicit value scala.language.reflectiveCalls visible.
This can be achieved by adding the import clause 'import scala.language.reflectiveCalls'
or by setting the compiler option -language:reflectiveCalls.
See the Scala docs for value scala.language.reflectiveCalls for a discussion
why the feature should be explicitly enabled.
You might then change the code to this, which does not make use of reflection:
trait Talkative extends AnyRef { def sayHi(): Unit }
val x = new Talkative { def sayHi() = println("Howdy!") }
x.sayHi
For this reason you generally want to specify the type of the variable when you are defining classes this way; that way if you inadvertently add a method that would require reflection to call, you'll get a compilation error -- the method won't be defined for the variable's type. So while it is not the case that specifying the type makes the code run faster, it is the case that if the code would be slow, specifying the type makes it fail to compile.
val x: AnyRef = new AnyRef { def sayHi() = println("Howdy!") }
x.sayHi // ERROR: sayHi is not defined on AnyRef
There are of course other reasons why you might want to specify a type. They are required for the formal parameters of methods/functions, and for the return types of methods that are recursive or overloaded.
Also, you should always specify return types for methods in a public API (unless they are just trivially obvious), or you might end up with different method signatures than you intended, and then risk breaking existing clients of your API when you fix the signature.
You may of course want to deliberately widen a type so that you can assign other types of things to a variable later, e.g.
var shape: Shape = new Circle(1.0)
shape = new Square(1.0)
But in these cases there is no performance impact.
It is also possible that specifying a type will cause a conversion, and of course that will have whatever performance impact the conversion imposes.

Getting implicit scala Numeric from Azavea Numeric

I am using the Azavea Numeric Scala library for generic maths operations. However, I cannot use these with the Scala Collections API, as they require a scala Numeric and it appears as though the two Numerics are mutually exclusive. Is there any way I can avoid re-implementing all mathematical operations on Scala Collections for Azavea Numeric, apart from requiring all types to have context bounds for both Numerics?
import Predef.{any2stringadd => _, _}
class Numeric {
def addOne[T: com.azavea.math.Numeric](x: T) {
import com.azavea.math.EasyImplicits._
val y = x + 1 // Compiles
val seq = Seq(x)
val z = seq.sum // Could not find implicit value for parameter num: Numeric[T]
}
}
Where Azavea Numeric is defined as
trait Numeric[#scala.specialized A] extends java.lang.Object with
com.azavea.math.ConvertableFrom[A] with com.azavea.math.ConvertableTo[A] with scala.ScalaObject {
def abs(a:A):A
...remaining methods redacted...
}
object Numeric {
implicit object IntIsNumeric extends IntIsNumeric
implicit object LongIsNumeric extends LongIsNumeric
implicit object FloatIsNumeric extends FloatIsNumeric
implicit object DoubleIsNumeric extends DoubleIsNumeric
implicit object BigIntIsNumeric extends BigIntIsNumeric
implicit object BigDecimalIsNumeric extends BigDecimalIsNumeric
def numeric[#specialized(Int, Long, Float, Double) A:Numeric]:Numeric[A] = implicitly[Numeric[A]]
}
You can use Régis Jean-Gilles solution, which is a good one, and wrap Azavea's Numeric. You can also try recreating the methods yourself, but using Azavea's Numeric. Aside from NumericRange, most should be pretty straightforward to implement.
You may be interested in Spire though, which succeeds Azavea's Numeric library. It has all the same features, but some new ones as well (more operations, new number types, sorting & selection, etc.). If you are using 2.10 (most of our work is being directed at 2.10), then using Spire's Numeric eliminates virtually all overhead of a generic approach and often runs as fast as a direct (non-generic) implementation.
That said, I think your question is a good suggestion; we should really add a toScalaNumeric method on Numeric. Which Scala collection methods were you planning on using? Spire adds several new methods to Arrays, such as qsum, qproduct, qnorm(p), qsort, qselect(k), etc.
The most general solution would be to write a class that wraps com.azavea.math.Numeric and implements scala.math.Numeric in terms of it:
class AzaveaNumericWrapper[T]( implicit val n: com.azavea.math.Numeric[T] ) extends scala.math.Numeric {
def compare (x: T, y: T): Int = n.compare(x, y)
def minus (x: T, y: T): T = n.minus(x, y)
// and so on
}
Then implement an implicit conversion:
// NOTE: in scala 2.10, we could directly declare AzaveaNumericWrapper as an implicit class
implicit def toAzaveaNumericWrapper[T]( implicit n: com.azavea.math.Numeric[T] ) = new AzaveaNumericWrapper( n )
The fact that n is itself an implicit is key here: it allows for implicit values of type com.azavea.math.Numeric to be automatically used where na implicit value of
type scala.math.Numeric is expected.
Note that to be complete, you'll probably want to do the reverse too (write a class ScalaNumericWrapper that implements com.azavea.math.Numeric in terms of scala.math.Numeric).
Now, there is a disadvantage to the above solution: you get a conversion (and thus an instanciation) on each call (to a method that has a context bound of type scala.math.Numeric, and where you only an instance of com.azavea.math.Numeric is in scope).
So you will actually want to define an implicit singleton instance of AzaveaNumericWrapper for each of your numeric type. Assuming that you have types MyType and MyOtherType for which you defined instances of com.azavea.math.Numeric:
implicit object MyTypeIsNumeric extends AzaveaNumericWrapper[MyType]
implicit object MyOtherTypeIsNumeric extends AzaveaNumericWrapper[MyOtherType]
//...
Also, keep in mind that the apparent main purpose of azavea's Numeric class is to greatly enhance execution speed (mostly due to type parameter specialization).
Using the wrapper as above, you lose the specialization and hence the speed that comes out of it. Specialization has to be used all the way down,
and as soon as you call a generic method that is not specialized, you enter in the world of unspecialized generics (even if that method then calls back a specialized method).
So in cases where speed matters, try to use azavea's Numeric directly instead of scala's Numeric (just because AzaveaNumericWrapper uses it internally
does not mean that you will get any speed increase, as specialization won't happen here).
You may have noticed that I avoided in my examples to define instances of AzaveaNumericWrapper for types Int, Long and so on.
This is because there are already (in the standard library) implicit values of scala.math.Numeric for these types.
You might be tempted to just hide them (via something like import scala.math.Numeric.{ShortIsIntegral => _}), so as to be sure that your own (azavea backed) version is used,
but there is no point. The only reason I can think of would be to make it run faster, but as explained above, it wont.

Type extraction in Scala

I am pretty new to Scala and advanced programming languages. I try to solve the following problem.
I have got:
val s: Seq[SomeMutableType[_]]
I assume that all elements in the sequence are of the same type (but do not know which one at this point).
How may I call :
def proc[T](v0: SomeMutableType[T], v1: SomeMutableType[T]) { /* ... */ }
with something like
proc(s(0), s(1))
The compiler complains :
type mismatch; found : SomeMutableType[_$351] where type _$351 required:
SomeMutableType[Any] Note: _$351 <: Any, but class SomeMutableType is invariant in type T. You
may wish to define T as +T instead. (SLS 4.5)
I thought about that covariant thing, but I do not believe it makes sense in my case. I just want the compiler believe me when I say that s(0) and s(1) are of the same type! I usually do this via some casting, but I cannot cast to SomeMutableType[T] here since T is unknown due to erasure. Of course, I cannot change the definition of proc.
The problem is that you truly cannot make such a guarantee. For example:
scala> import scala.collection.mutable.Buffer
import scala.collection.mutable.Buffer
scala> val s: Seq[Buffer[_]] = Seq(Buffer(1), Buffer("a"))
s: Seq[scala.collection.mutable.Buffer[_]] = List(ArrayBuffer(1), ArrayBuffer(a))
See? You don't know that s(0) and s(1) are of the same type, because they may not be of the same type.
At this point, you should ask a question about what you want to accomplish, instead of asking how to solve a problem in how you want to accomplish it. They way you took won't work. Step back, think what problem you were trying to solve with this approach, and ask how to solve that problem.
For instance, you say:
I assume that all elements in the sequence are of the same type (but do not know which
one at this point).
It may be that what you want to do is parameterize a class or method, and use its type parameter when declaring s. Or, maybe, not have an s at all.
I am new to Scala, but as far as I can see your problem is the use of a wildcard type parameter when you declare s:
val s: Seq[SomeMutableType[_]]
As far as I understand, type erasure will always happen and what you really want here is a parameterised type bound to where s is initialised.
For example:
scala> class Erased(val s: List[_])
defined class Erased
scala> new Erased(List(1,2,3)).s.head
res21: Any = 1
If instead you use
scala> class Kept[T](val s: List[T])
defined class Kept
scala> new Kept(List(1,2,3)).s.head
res22: Int = 1
Then the contents of s retain their type information as it is bound to T. i.e. This is exactly how you tell the compiler "that s(0) and s(1) are of the same type".

Spurious ambiguous reference error in Scala 2.7.7 compiler/interpreter?

Can anyone explain the compile error below? Interestingly, if I change the return type of the get() method to String, the code compiles just fine. Note that the thenReturn method has two overloads: a unary method and a varargs method that takes at least one argument. It seems to me that if the invocation is ambiguous here, then it would always be ambiguous.
More importantly, is there any way to resolve the ambiguity?
import org.scalatest.mock.MockitoSugar
import org.mockito.Mockito._
trait Thing {
def get(): java.lang.Object
}
new MockitoSugar {
val t = mock[Thing]
when(t.get()).thenReturn("a")
}
error: ambiguous reference to overloaded definition,
both method thenReturn in trait OngoingStubbing of type
java.lang.Object,java.lang.Object*)org.mockito.stubbing.OngoingStubbing[java.lang.Object]
and method thenReturn in trait OngoingStubbing of type
(java.lang.Object)org.mockito.stubbing.OngoingStubbing[java.lang.Object]
match argument types (java.lang.String)
when(t.get()).thenReturn("a")
Well, it is ambiguous. I suppose Java semantics allow for it, and it might merit a ticket asking for Java semantics to be applied in Scala.
The source of the ambiguitity is this: a vararg parameter may receive any number of arguments, including 0. So, when you write thenReturn("a"), do you mean to call the thenReturn which receives a single argument, or do you mean to call the thenReturn that receives one object plus a vararg, passing 0 arguments to the vararg?
Now, what this kind of thing happens, Scala tries to find which method is "more specific". Anyone interested in the details should look up that in Scala's specification, but here is the explanation of what happens in this particular case:
object t {
def f(x: AnyRef) = 1 // A
def f(x: AnyRef, xs: AnyRef*) = 2 // B
}
if you call f("foo"), both A and B
are applicable. Which one is more
specific?
it is possible to call B with parameters of type (AnyRef), so A is
as specific as B.
it is possible to call A with parameters of type (AnyRef,
Seq[AnyRef]) thanks to tuple
conversion, Tuple2[AnyRef,
Seq[AnyRef]] conforms to AnyRef. So
B is as specific as A. Since both are
as specific as the other, the
reference to f is ambiguous.
As to the "tuple conversion" thing, it is one of the most obscure syntactic sugars of Scala. If you make a call f(a, b), where a and b have types A and B, and there is no f accepting (A, B) but there is an f which accepts (Tuple2(A, B)), then the parameters (a, b) will be converted into a tuple.
For example:
scala> def f(t: Tuple2[Int, Int]) = t._1 + t._2
f: (t: (Int, Int))Int
scala> f(1,2)
res0: Int = 3
Now, there is no tuple conversion going on when thenReturn("a") is called. That is not the problem. The problem is that, given that tuple conversion is possible, neither version of thenReturn is more specific, because any parameter passed to one could be passed to the other as well.
In the specific case of Mockito, it's possible to use the alternate API methods designed for use with void methods:
doReturn("a").when(t).get()
Clunky, but it'll have to do, as Martin et al don't seem likely to compromise Scala in order to support Java's varargs.
Well, I figured out how to resolve the ambiguity (seems kind of obvious in retrospect):
when(t.get()).thenReturn("a", Array[Object](): _*)
As Andreas noted, if the ambiguous method requires a null reference rather than an empty array, you can use something like
v.overloadedMethod(arg0, null.asInstanceOf[Array[Object]]: _*)
to resolve the ambiguity.
If you look at the standard library APIs you'll see this issue handled like this:
def meth(t1: Thing): OtherThing = { ... }
def meth(t1: Thing, t2: Thing, ts: Thing*): OtherThing = { ... }
By doing this, no call (with at least one Thing parameter) is ambiguous without extra fluff like Array[Thing](): _*.
I had a similar problem using Oval (oval.sf.net) trying to call it's validate()-method.
Oval defines 2 validate() methods:
public List<ConstraintViolation> validate(final Object validatedObject)
public List<ConstraintViolation> validate(final Object validatedObject, final String... profiles)
Trying this from Scala:
validator.validate(value)
produces the following compiler-error:
both method validate in class Validator of type (x$1: Any,x$2: <repeated...>[java.lang.String])java.util.List[net.sf.oval.ConstraintViolation]
and method validate in class Validator of type (x$1: Any)java.util.List[net.sf.oval.ConstraintViolation]
match argument types (T)
var violations = validator.validate(entity);
Oval needs the varargs-parameter to be null, not an empty-array, so I finally got it to work with this:
validator.validate(value, null.asInstanceOf[Array[String]]: _*)