I’m new to using Scala and am trying to see if a list contains any objects of a certain type.
When I make a method to do this, I get the following results:
var l = List("Some string", 3)
def containsType[T] = l.exists(_.isInstanceOf[T])
containsType[Boolean] // val res0: Boolean = true
l.exists(_.isInstanceOf[Boolean]) // val res1: Boolean = false
Could someone please help me understand why my method doesn’t return the same results as the expression on the last line?
Thank you,
Johan
Alin's answer details perfectly why the generic isn't available at runtime. You can get a bit closer to what you want with the magic of ClassTag, but you still have to be conscious of some issues with Java generics.
import scala.reflect.ClassTag
var l = List("Some string", 3)
def containsType[T](implicit cls: ClassTag[T]): Boolean = {
l.exists(cls.runtimeClass.isInstance(_))
}
Now, whenever you call containsType, a hidden extra argument of type ClassTag[T] gets passed it. So when you write, for instance, println(containsType[String]), then this gets compiled to
scala.this.Predef.println($anon.this.containsType[String](ClassTag.apply[String](classOf[java.lang.String])))
An extra argument gets passed to containsType, namely ClassTag.apply[String](classOf[java.lang.String]). That's a really long winded way of explicitly passing a Class<String>, which is what you'd have to do in Java manually. And java.lang.Class has an isInstance function.
Now, this will mostly work, but there are still major caveats. Generics arguments are completely erased at runtime, so this won't help you distinguish between an Option[Int] and an Option[String] in your list, for instance. As far as the JVM is concerned, they're both Option.
Second, Java has an unfortunate history with primitive types, so containsType[Int] will actually be false in your case, despite the fact that the 3 in your list is actually an Int. This is because, in Java, generics can only be class types, not primitives, so a generic List can never contain int (note the lowercase 'i', this is considered a fundamentally different thing in Java than a class).
Scala paints over a lot of these low-level details, but the cracks show through in situations like this. Scala sees that you're constructing a list of Strings and Ints, so it wants to construct a list of the common supertype of the two, which is Any (strings and ints have no common supertype more specific than Any). At runtime, Scala Int can translate to either int (the primitive) or Integer (the object). Scala will favor the former for efficiency, but when storing in generic containers, it can't use a primitive type. So while Scala thinks that your list l contains a String and an Int, Java thinks that it contains a String and a java.lang.Integer. And to make things even crazier, both int and java.lang.Integer have distinct Class instances.
So summon[ClassTag[Int]] in Scala is java.lang.Integer.TYPE, which is a Class<Integer> instance representing the primitive type int (yes, the non-class type int has a Class instance representing it). While summon[ClassTag[java.lang.Integer]] is java.lang.Integer::class, a distinct Class<Integer> representing the non-primitive type Integer. And at runtime, your list contains the latter.
In summary, generics in Java are a hot mess. Scala does its best to work with what it has, but when you start playing with reflection (which ClassTag does), you have to start thinking about these problems.
println(containsType[Boolean]) // false
println(containsType[Double]) // false
println(containsType[Int]) // false (list can't contain primitive type)
println(containsType[Integer]) // true (3 is converted to an Integer)
println(containsType[String]) // true (class type so it works the way you expect)
println(containsType[Unit]) // false
println(containsType[Long]) // false
Scala uses the type erasure model of generics. This means that no
information about type arguments is kept at runtime, so there's no way
to determine at runtime the specific type arguments of the given
List object. All the system can do is determine that a value is a
List of some arbitrary type parameters.
You can verify this behavior by trying any List concrete type:
val l = List("Some string", 3)
println(l.isInstanceOf[List[Int]]) // true
println(l.isInstanceOf[List[String]]) // true
println(l.isInstanceOf[List[Boolean]]) // also true
println(l.isInstanceOf[List[Unit]]) // also true
Now regarding your example:
def containsType[T] = l.exists(_.isInstanceOf[T])
println(containsType[Int]) // true
println(containsType[Boolean]) // also true
println(containsType[Unit]) // also true
println(containsType[Double]) // also true
isInstanceOf is a synthetic function (a function generated by the Scala compiler at compile-time, usually to work around the underlying JVM limitations) and does not work the way you would expect with generic type arguments like T, because after compilation, this would normally be equivalent in Java to instanceof T which, by the way - is illegal in Java.
Why is illegal? Because of type erasure. Type erasure means all your generic code (generic classes, generic methods, etc.) is converted to non-generic code. This usually means 3 things:
all type parameters in generic types are replaced with their bounds or Object if they are unbounded;
wherever necessary the compiler inserts type casts to preserve type-safety;
bridge methods are generated if needed to preserve polymorphism of all generic methods.
However, in the case of instanceof T, the JVM cannot differentiate between types of T at execution time, so this makes no sense. The type used with instanceof has to be reifiable, meaning that all information about the type needs to be available at runtime. This property does not apply to generic types.
So if Java forbids this because it can't work, why does Scala even allows it? The Scala compiler is indeed more permissive here, but for one good reason; because it treats it differently. Like the Java compiler, the Scala compiler also erases all generic code at compile-time, but since isInstanceOf is a synthetic function in Scala, calls to it using generic type arguments such as isInstanceOf[T] are replaced during compilation with instanceof Object.
Here's a sample of your code decompiled:
public <T> boolean containsType() {
return this.l().exists(x$1 -> BoxesRunTime.boxToBoolean(x$1 instanceof Object));
}
Main$.l = (List<Object>)package$.MODULE$.List().apply((Seq)ScalaRunTime$.MODULE$.wrapIntArray(new int[] { 1, 2, 3 }));
Predef$.MODULE$.println((Object)BoxesRunTime.boxToBoolean(this.containsType()));
Predef$.MODULE$.println((Object)BoxesRunTime.boxToBoolean(this.containsType()));
This is why no matter what type you give to the polymorphic function containsType, it will always result in true. Basically, containsType[T] is equivalent to containsType[_] from Scala's perspective - which actually makes sense because a generic type T, without any upper bounds, is just a placeholder for type Any in Scala. Because Scala cannot have raw types, you cannot for example, create a List without providing a type parameter, so every List must be a List of "something", and that "something" is at least an Any, if not given a more specific type.
Therefore, isInstanceOf can only be called with specific (concrete) type arguments like Boolean, Double, String, etc. That is why, this works as expected:
println(l.exists(_.isInstanceOf[Boolean])) // false
We said that Scala is more permissive, but that does not mean you get away without a warning.
To alert you of the possibly non-intuitive runtime behavior, the Scala compiler does usually emit unchecked warnings. For example, if you had run your code in the Scala interpreter (or compile it using scalac), you would have received this:
Related
I have a simple question, why is the Scala compiler not able to infer the function parameter types by itself ?
Some functional programming languages, like Haskell, can infer almost
all types, because they can perform global type inference. Scala can’t
do this, in part because Scala has to support subtype polymorphism
(inheritance), which makes type inference much harder. Here is a
summary of the rules for when explicit type annotations are required
in Scala.
When Explicit Type Annotations Are Required
In practical terms, you have to provide explicit type annotations for the following situations:
A mutable var or immutable val declaration where you don’t assign a value, (e.g., abstract declarations in a class like val book: String, var count: Int
All method parameters (e.g., def deposit(amount: Money) = {…}).
Method return types in the following cases:
1)When you explicitly call return in a method (even at the end).
2)When a method is recursive.
3)When two or more methods are overloaded (have the same name) and one of them calls another; the calling method needs a return type annotation.
4)When the inferred return type would be more general than you intended, e.g., Any.
Source Programming Scala, 2nd Edition - O'Reilly Media
I have a simple question, why is the Scala compiler not able to infer the function parameter types by itself ?
Scala has local type inference. If you look at the method, and look at what information is there locally, it should be easy to see, that there simply is no information about what the types could possibly be.
Say, you have the following method:
def foo(a) = a + a
How is Scala going to figure out what type a is? It can only figure that out from looking at how the method is called, but that is non-local information!
And even worse: since Scala supports dynamic code loading, it is possible that the calling code doesn't even exist yet at compile time.
More explicitly, can this code produce any errors in any scenrios:
def foreach[U](f :Int=>U) = f.asInstanceOf[Int=>Unit](1)
I know it works, and I have a vague idea why: any function, as an instance of a generic type, must define an erased version of apply and jvm performs type check only when the object is actually to be returned to a code where it had a concrete type (often miles away). So, in theory, as long as I never look at the returned value, I should be safe. I don't have an enough low-level understandings of java byte code, let alone scalac, to have any certainty about it.
Why would I want to do it? Look at the following example:
val b = new mutable.Buffer[Int]
val ints = Seq(1, 2, 3, 4)
ints foreach { b += _ }
It's a typical scala construct, as far as imperative style can be typical. foreach in this example takes an Int as an argument, and as scalac knows it to be an Int, it will create a closure with a specialized apply(x :Int). Unfortunately, its return type in this case is a mutable.Buffer[Int], which is an AnyRef. As far as I was able to see, scalac will never invoke a specialized apply providing an AnyVal argument if the result is an AnyRef (and vice versa). This means, that even if the caller applies the function to Int, underneath the function will box the argument and invoke the erased variant. Here of course it doesn't matter as they are boxed within the List anyway, but I'm talking about the principle.
For this reason I prefer to define this type of method as foreach(f :X=>Unit), rather than foreach[O](f: X=>O) as it is in TraversableOnce. If the input sequence in the example had such a signature, everything would compile just as fine, and the compiler would ignore the actual type of the expression and generate a function with Unit return type, which - when applied to an unboxed Int - would invoke directly void apply(Int x), without boxing.
The problem arises with interoperability - sometimes I need to call a method expecting a function with a Unit return type and all I have is a generic function returning Odin knows what. Of course, I could just write f(_) to box it in another function object instead of passing it directly, but it to large extent makes the whole optimisation of small tight loops moot.
I am trying to understand how to think about type classes in Haskell versus traits in Scala.
My understanding is that type classes are primarily important at compile time in Haskell and not at runtime anymore, on the other hand traits in Scala are important both at compile time and run time. I want to illustrate this idea with a simple example, and I want to know if this viewpoint of mine is correct or not.
First, let us consider type classes in Haskell:
Let's take a simple example. The type class Eq.
For example, Int and Char are both instances of Eq. So it is possible to create a polymorphic List that is also an instance of Eq and can either contain Ints or Chars but not both in the same List.
My question is : is this the only reason why type classes exist in Haskell?
The same question in other words:
Type classes enable to create polymorphic types ( in this example a polymorphic List) that support operations that are defined in a given type class ( in this example the operation == defined in the type class Eq) but that is their only reason for existence, according to my understanding. Is this understanding of mine correct?
Is there any other reason why type classes exist in ( standard ) Haskell?
Is there any other use case in which type classes are useful in standard Haskell ? I cannot seem to find any.
Since Haskell's Lists are homogeneous, it is not possible to put Char and Int into the same list. So the usefulness of type classes, according to my understanding, is exhausted at compile time. Is this understanding of mine correct?
Now, let's consider the analogous List example in Scala:
Lets define a trait Eq with an equals method on it.
Now let's make Char and Int implement the trait Eq.
Now it is possible to create a List[Eq] in Scala that accepts both Chars and Ints into the same List ( Note that this - putting different type of elements into the same List - is not possible Haskell, at least not in standard Haskell 98 without extensions)!
In the case of the Haskell's List, the existence of type classes is important/useful only for type checking at compile time, according to my understanding.
In contrast, the existence of traits in Scala is important both at compile time for type checking and at run type for polymorphic dispatch on the actual runtime type of the object in the List when comparing two Lists for equality.
So, based on this simple example, I came to the conclusion that in Haskell type classes are primarily important/used at compilation time, in contrast, Scala's traits are important/used both at compile time and run time.
Is this conclusion of mine correct?
If not, why not ?
EDIT:
Scala code in response to n.m.'s comments:
case class MyInt(i:Int) {
override def equals(b:Any)= i == b.asInstanceOf[MyInt].i
}
case class MyChar(c:Char) {
override def equals(a:Any)= c==a.asInstanceOf[MyChar].c
}
object Test {
def main(args: Array[String]) {
val l1 = List(MyInt(1), MyInt(2), MyChar('a'), MyChar('b'))
val l2 = List(MyInt(1), MyInt(2), MyChar('a'), MyChar('b'))
val l3 = List(MyInt(1), MyInt(2), MyChar('a'), MyChar('c'))
println(l1==l1)
println(l1==l3)
}
}
This prints:
true
false
I will comment on the Haskell side.
Type classes bring restricted polymorphism in Haskell, wherein a type variable a can still be quantified universally, but ranges over only a subset of all the types -- namely, the types for which an instance of the type class is available.
Why restricted polymorphism is useful? A nice example would be the equality operator
(==) :: ?????
What its type should be? Intuitively, it takes two values of the same type and returns a boolean, so:
(==) :: a -> a -> Bool -- (1)
But the typing above is not entirely honest, since it allows one to apply == to any type a, including function types!
(\x :: Integer -> x + x) == (\x :: Integer -> 2*x)
The above would pass type checking if (1) were the typing for (==), since both arguments are of the same type a = (Integer -> Integer). However, we can not effectively compare two functions: well-known Computability results tell us that there is no algorithm to do that in general.
So, what we could do to implement (==)?
Option 1: at run time, if a function (or any other value involving functions -- such as a list of functions) is found to be passed to (==), raise an exception. This is what e.g. ML does. Typed programs can now "go wrong", despite checking types at compile time.
Option 2: introduce a new kind of polymorphism, restricting a to the function-free types. For instance, ww could have (==) :: forall-non-fun a. a -> a -> Bool so that comparing functions yields to a type error. Haskell exploits type classes to obtain exactly that.
So, Haskell type classes allow one to type (==) "honestly", ensuring no error at run time, and without being overly restrictive. Of course, the power of type classes goes far beyond of that but, at least in my own view, they primary purpose is to allow restricted polymorphism, in a very general and flexible way. Indeed, with type classes the programmer can define their own restrictions on the universal type quantifications.
Can anyone please share insight into the trait "Immutable" in scala? At first glance I thought this would be a nice control structure to limit a class I'm building, but oddly I noticed that primitive types do not extend this. Is there a reason for this? Is there a way to bind the syntax to Immutable or AnyVal?
class Test {
def test[T<:Immutable](x:T)={
println("passes "+x)
}
case class X(s:String) extends Immutable
test(X("hello")) //passes
// test("fail") - does not pass compiler
}
The only direct subtypes of Immutable in the Scala core library are:
collection.immutable.Traversable
collection.parallel.immutable.ParIterable
Nothing else refers to Immutable at all.
Immutable hasn't been changed since it was added in 2009 in Martin Odersky's "massive new collections checkin". I'm searching through that commit, and it looks like Immutable was never even used as a bound when it was first introduced either.
Honestly, I doubt there's much intent behind these traits anymore. Odersky probably planned to use Immutable to bound the type arguments on immutable collections, and then thought better of it. But that's just my speculation.
So-called primitive types (Boolean, Byte, Char, Short, Int, Long, Float, Double) are intrinsically immutable. 5 is 5 is 5. You cannot do anything to 5 to turn it into anything that is not 5.
Otherwise, immutability is a property of how a value is stored. If stored in a var, that var may be replaced freely with a new value (of a compatible type). By extension, constructed types (classes, traits and objects) may be either immutable or mutable depending on whether they allow any of their internal state to be altered following construction.
Java's String (also used as Scala's String) is immutable.
However, none of this has anything to do with you example, since you did not demonstrate mutability. You simply showed what happens when one applies the + method of one value to another value.
While it is certainly possible that one can implement a + method that mutates its (apparent) left-hand operand, one rarely does that. If there's a need for that kind of mutation, one would conventionally define the += method instead.
+ is somewhat special in that it may be applied to any value (if the argument / right-hand operand) is a String by virtue of an implicit conversion to a special class that defines +(s: String) so that the string concatenation interpretation of + may be applied. In other words, if you write e1 + "e2" and the type of the expression e1 does not define +, then Scala will convert e1 to String and concatenate it with "e2".
Some time ago Oracle decided that adding Closures to Java 8 would be an good idea. I wonder how design problems are solved there in comparison to Scala, which had closures since day one.
Citing the Open Issues from javac.info:
Can Method Handles be used for Function Types?
It isn't obvious how to make that work. One problem is that Method Handles reify type parameters, but in a way that interferes with function subtyping.
Can we get rid of the explicit declaration of "throws" type parameters?
The idea would be to use disjuntive type inference whenever the declared bound is a checked exception type. This is not strictly backward compatible, but it's unlikely to break real existing code. We probably can't get rid of "throws" in the type argument, however, due to syntactic ambiguity.
Disallow #Shared on old-style loop index variables
Handle interfaces like Comparator that define more than one method, all but one of which will be implemented by a method inherited from Object.
The definition of "interface with a single method" should count only methods that would not be implemented by a method in Object and should count multiple methods as one if implementing one of them would implement them all. Mainly, this requires a more precise specification of what it means for an interface to have only a single abstract method.
Specify mapping from function types to interfaces: names, parameters, etc.
We should fully specify the mapping from function types to system-generated interfaces precisely.
Type inference. The rules for type inference need to be augmented to accomodate the inference of exception type parameters. Similarly, the subtype relationships used by the closure conversion should be reflected as well.
Elided exception type parameters to help retrofit exception transparency.
Perhaps make elided exception type parameters mean the bound. This enables retrofitting existing generic interfaces that don't have a type parameter for the exception, such as java.util.concurrent.Callable, by adding a new generic exception parameter.
How are class literals for function types formed?
Is it #void().class ? If so, how does it work if object types are erased? Is it #?(?).class ?
The system class loader should dynamically generate function type interfaces.
The interfaces corresponding to function types should be generated on demand by the bootstrap class loader, so they can be shared among all user code. For the prototype, we may have javac generate these interfaces so prototype-generated code can run on stock (JDK5-6) VMs.
Must the evaluation of a lambda expression produce a fresh object each time?
Hopefully not. If a lambda captures no variables from an enclosing scope, for example, it can be allocated statically. Similarly, in other situations a lambda could be moved out of an inner loop if it doesn't capture any variables declared inside the loop. It would therefore be best if the specification promises nothing about the reference identity of the result of a lambda expression, so such optimizations can be done by the compiler.
As far as I understand 2., 6. and 7. aren't a problem in Scala, because Scala doesn't use Checked Exceptions as some sort of "Shadow type-system" like Java.
What about the rest?
1) Can Method Handles be used for Function Types?
Scala targets JDK 5 and 6 which don't have method handles, so it hasn't tried to deal with that issue yet.
2) Can we get rid of the explicit declaration of "throws" type parameters?
Scala doesn't have checked exceptions.
3) Disallow #Shared on old-style loop index variables.
Scala doesn't have loop index variables. Still, the same idea can be expressed with a certain kind of while loop . Scala's semantics are pretty standard here. Symbols bindings are captured and if the symbol happens to map to a mutable reference cell then on your own head be it.
4) Handle interfaces like Comparator that define more than one method all but one of which come from Object
Scala users tend to use functions (or implicit functions) to coerce functions of the right type to an interface. e.g.
[implicit] def toComparator[A](f : (A, A) => Int) = new Comparator[A] {
def compare(x : A, y : A) = f(x, y)
}
5) Specify mapping from function types to interfaces:
Scala's standard library includes FuncitonN traits for 0 <= N <= 22 and the spec says that function literals create instances of those traits
6) Type inference. The rules for type inference need to be augmented to accomodate the inference of exception type parameters.
Since Scala doesn't have checked exceptions it can punt on this whole issue
7) Elided exception type parameters to help retrofit exception transparency.
Same deal, no checked exceptions.
8) How are class literals for function types formed? Is it #void().class ? If so, how does it work if object types are erased? Is it #?(?).class ?
classOf[A => B] //or, equivalently,
classOf[Function1[A,B]]
Type erasure is type erasure. The above literals produce scala.lang.Function1 regardless of the choice for A and B. If you prefer, you can write
classOf[ _ => _ ] // or
classOf[Function1[ _,_ ]]
9) The system class loader should dynamically generate function type interfaces.
Scala arbitrarily limits the number of arguments to be at most 22 so that it doesn't have to generate the FunctionN classes dynamically.
10) Must the evaluation of a lambda expression produce a fresh object each time?
The Scala specification does not say that it must. But as of 2.8.1 the the compiler does not optimizes the case where a lambda does not capture anything from its environment. I haven't tested with 2.9.0 yet.
I'll address only number 4 here.
One of the things that distinguishes Java "closures" from closures found in other languages is that they can be used in place of interface that does not describe a function -- for example, Runnable. This is what is meant by SAM, Single Abstract Method.
Java does this because these interfaces abound in Java library, and they abound in Java library because Java was created without function types or closures. In their absence, every code that needed inversion of control had to resort to using a SAM interface.
For example, Arrays.sort takes a Comparator object that will perform comparison between members of the array to be sorted. By contrast, Scala can sort a List[A] by receiving a function (A, A) => Int, which is easily passed through a closure. See note 1 at the end, however.
So, because Scala's library was created for a language with function types and closures, there isn't need to support such a thing as SAM closures in Scala.
Of course, there's a question of Scala/Java interoperability -- while Scala's library might not need something like SAM, Java library does. There are two ways that can be solved. First, because Scala supports closures and function types, it is very easy to create helper methods. For example:
def runnable(f: () => Unit) = new Runnable {
def run() = f()
}
runnable { () => println("Hello") } // creates a Runnable
Actually, this particular example can be made even shorter by use of Scala's by-name parameters, but that's beside the point. Anyway, this is something that, arguably, Java could have done instead of what it is going to do. Given the prevalence of SAM interfaces, it is not all that surprising.
The other way Scala handles this is through implicit conversions. By just prepending implicit to the runnable method above, one creates a method that gets automatically (note 2) applied whenever a Runnable is required but a function () => Unit is provided.
Implicits are very unique, however, and still controversial to some extent.
Note 1: Actually, this particular example was choose with some malice... Comparator has two abstract methods instead of one, which is the whole problem with it. Since one of its methods can be implemented in terms of the other, I think they'll just "subtract" defender methods from the abstract list.
And, on the Scala side, even though there's a sort method that uses (A, A) => Boolean, not (A, A) => Int, the standard sorting method calls for a Ordering object, which is quite similar to Java's Comparator! In Scala's case, though, Ordering performs the role of a type class.
Note 2: Implicits are automatically applied, once they have been imported into scope.