In Scala, why must a type annotation follow the function parameters? Why does the compiler not infer the function parameter types? - scala

I have a simple question: why is the Scala compiler not able to infer function parameter types by itself?

Some functional programming languages, like Haskell, can infer almost
all types, because they can perform global type inference. Scala can’t
do this, in part because Scala has to support subtype polymorphism
(inheritance), which makes type inference much harder. Here is a
summary of the rules for when explicit type annotations are required
in Scala.
When Explicit Type Annotations Are Required
In practical terms, you have to provide explicit type annotations in the following situations (a short sketch follows the source note below):
A mutable var or immutable val declaration where you don't assign a value (e.g., abstract declarations in a class, like val book: String or var count: Int).
All method parameters (e.g., def deposit(amount: Money) = {…}).
Method return types in the following cases:
1) When you explicitly call return in a method (even at the end).
2) When a method is recursive.
3) When two or more methods are overloaded (have the same name) and one of them calls another; the calling method needs a return type annotation.
4) When the inferred return type would be more general than you intended, e.g., Any.
Source: Programming Scala, 2nd Edition, O'Reilly Media
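To make the first cases above concrete, here is a minimal sketch (the book's Money type is replaced by BigDecimal as a stand-in):
abstract class Library {
  val book: String                       // abstract val: annotation required
  var count: Int                         // abstract var: annotation required
  def deposit(amount: BigDecimal): Unit  // parameter types: always required
}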

I have a simple question: why is the Scala compiler not able to infer function parameter types by itself?
Scala has local type inference. If you look at the method and at what information is available locally, it is easy to see that there simply is no information about what the types could possibly be.
Say, you have the following method:
def foo(a) = a + a
How is Scala going to figure out what type a is? It can only figure that out from looking at how the method is called, but that is non-local information!
And even worse: since Scala supports dynamic code loading, it is possible that the calling code doesn't even exist yet at compile time.
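To illustrate with a minimal sketch (the method names are invented): the body a + a type-checks for several unrelated parameter types, so there is no single type the compiler could pick from the body alone.
def fooInt(a: Int): Int          = a + a // numeric addition: 2 * a
def fooString(a: String): String = a + a // string concatenation: "abab"
// Both bodies are identical to foo's, yet the parameter types differ,
// so local information alone cannot determine the type of a.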

Related

Disfunctionality of type parameter

I’m new to using Scala and am trying to see if a list contains any objects of a certain type.
When I make a method to do this, I get the following results:
var l = List("Some string", 3)
def containsType[T] = l.exists(_.isInstanceOf[T])
containsType[Boolean] // val res0: Boolean = true
l.exists(_.isInstanceOf[Boolean]) // val res1: Boolean = false
Could someone please help me understand why my method doesn’t return the same results as the expression on the last line?
Thank you,
Johan
Alin's answer explains perfectly why the generic type isn't available at runtime. You can get a bit closer to what you want with the magic of ClassTag, but you still have to be conscious of some issues with Java generics.
import scala.reflect.ClassTag
var l = List("Some string", 3)
def containsType[T](implicit cls: ClassTag[T]): Boolean = {
  l.exists(cls.runtimeClass.isInstance(_))
}
Now, whenever you call containsType, a hidden extra argument of type ClassTag[T] gets passed to it. So when you write, for instance, println(containsType[String]), this gets compiled to
scala.this.Predef.println($anon.this.containsType[String](ClassTag.apply[String](classOf[java.lang.String])))
An extra argument gets passed to containsType, namely ClassTag.apply[String](classOf[java.lang.String]). That's a really long-winded way of explicitly passing a Class<String>, which is what you'd have to do manually in Java. And java.lang.Class has an isInstance function.
Now, this will mostly work, but there are still major caveats. Generic arguments are completely erased at runtime, so this won't help you distinguish between an Option[Int] and an Option[String] in your list, for instance. As far as the JVM is concerned, they're both Option.
Second, Java has an unfortunate history with primitive types, so containsType[Int] will actually be false in your case, despite the fact that the 3 in your list is actually an Int. This is because, in Java, generics can only be class types, not primitives, so a generic List can never contain int (note the lowercase 'i': this is considered a fundamentally different thing in Java than a class).
Scala paints over a lot of these low-level details, but the cracks show through in situations like this. Scala sees that you're constructing a list of Strings and Ints, so it wants to construct a list of the common supertype of the two, which is Any (strings and ints have no common supertype more specific than Any). At runtime, Scala Int can translate to either int (the primitive) or Integer (the object). Scala will favor the former for efficiency, but when storing in generic containers, it can't use a primitive type. So while Scala thinks that your list l contains a String and an Int, Java thinks that it contains a String and a java.lang.Integer. And to make things even crazier, both int and java.lang.Integer have distinct Class instances.
So summon[ClassTag[Int]] in Scala wraps java.lang.Integer.TYPE, a Class<Integer> instance representing the primitive type int (yes, the non-class type int has a Class instance representing it), while summon[ClassTag[java.lang.Integer]] wraps classOf[java.lang.Integer], a distinct Class<Integer> representing the non-primitive type Integer. And at runtime, your list contains the latter.
In summary, generics in Java are a hot mess. Scala does its best to work with what it has, but when you start playing with reflection (which ClassTag does), you have to start thinking about these problems.
println(containsType[Boolean]) // false
println(containsType[Double]) // false
println(containsType[Int]) // false (list can't contain primitive type)
println(containsType[Integer]) // true (3 is converted to an Integer)
println(containsType[String]) // true (class type so it works the way you expect)
println(containsType[Unit]) // false
println(containsType[Long]) // false
Scala uses the type erasure model of generics. This means that no
information about type arguments is kept at runtime, so there's no way
to determine at runtime the specific type arguments of the given
List object. All the system can do is determine that a value is a
List of some arbitrary type parameters.
You can verify this behavior by testing any concrete type argument for List:
val l = List("Some string", 3)
println(l.isInstanceOf[List[Int]]) // true
println(l.isInstanceOf[List[String]]) // true
println(l.isInstanceOf[List[Boolean]]) // also true
println(l.isInstanceOf[List[Unit]]) // also true
Now regarding your example:
def containsType[T] = l.exists(_.isInstanceOf[T])
println(containsType[Int]) // true
println(containsType[Boolean]) // also true
println(containsType[Unit]) // also true
println(containsType[Double]) // also true
isInstanceOf is a synthetic function (a function generated by the Scala compiler at compile time, usually to work around underlying JVM limitations) and does not work the way you would expect with generic type arguments like T, because after compilation this would normally be equivalent in Java to instanceof T, which, by the way, is illegal in Java.
Why is it illegal? Because of type erasure. Type erasure means all your generic code (generic classes, generic methods, etc.) is converted to non-generic code. This usually means three things:
all type parameters in generic types are replaced with their bounds or Object if they are unbounded;
wherever necessary the compiler inserts type casts to preserve type-safety;
bridge methods are generated if needed to preserve polymorphism of all generic methods.
However, in the case of instanceof T, the JVM cannot differentiate between types of T at execution time, so this makes no sense. The type used with instanceof has to be reifiable, meaning that all information about the type needs to be available at runtime. This property does not apply to generic types.
So if Java forbids this because it can't work, why does Scala even allow it? The Scala compiler is indeed more permissive here, but for a good reason: it treats it differently. Like the Java compiler, the Scala compiler also erases all generic code at compile time, but since isInstanceOf is a synthetic function in Scala, calls to it using generic type arguments such as isInstanceOf[T] are replaced during compilation with instanceof Object.
Here's a sample of your code decompiled:
public <T> boolean containsType() {
    return this.l().exists(x$1 -> BoxesRunTime.boxToBoolean(x$1 instanceof Object));
}
Main$.l = (List<Object>)package$.MODULE$.List().apply((Seq)ScalaRunTime$.MODULE$.wrapIntArray(new int[] { 1, 2, 3 }));
Predef$.MODULE$.println((Object)BoxesRunTime.boxToBoolean(this.containsType()));
Predef$.MODULE$.println((Object)BoxesRunTime.boxToBoolean(this.containsType()));
This is why, no matter what type you give to the polymorphic function containsType, it will always result in true. Basically, containsType[T] is equivalent to containsType[_] from Scala's perspective - which actually makes sense, because a generic type T without any upper bound is just a placeholder for type Any in Scala. Because Scala cannot have raw types, you cannot, for example, create a List without providing a type parameter, so every List must be a List of "something", and that "something" is at least an Any, if not a more specific type.
Therefore, isInstanceOf can only be called with specific (concrete) type arguments like Boolean, Double, String, etc. That is why this works as expected:
println(l.exists(_.isInstanceOf[Boolean])) // false
We said that Scala is more permissive, but that does not mean you get away without a warning.
To alert you to the possibly non-intuitive runtime behavior, the Scala compiler usually emits unchecked warnings. For example, if you had run your code in the Scala interpreter (or compiled it using scalac), you would have received a warning along these lines (Scala 2 wording; the exact text varies by compiler version):
warning: abstract type T is unchecked since it is eliminated by erasure
       def containsType[T] = l.exists(_.isInstanceOf[T])

Benefit of explicitly providing the method return type or variable type in scala

This question may be very silly, but I am a little confused about which is the best way to do this in Scala.
In Scala, the compiler performs type inference and assigns the closest (or, perhaps, most restrictive) type to each variable or method.
I am new to Scala, and from many code samples and libraries, I have noticed that in many places people do not explicitly provide the types, while in most of the code I wrote, I was/still am explicitly providing them. For example:
val someVal: String = "def"

def getMeResult(): List[String] = {
  val list: List[String] = List("abc", "def")
  list
}
The reason I started to do this, especially for method return types, is that when I write a method, I know what it should return. So if I explicitly provide the return type, I can find out if I am making any mistakes. Also, I felt it is easier to understand what a method returns by reading the return type itself. Otherwise, I will have to check what the return type of the last statement is.
So my questions/doubts are :
1. Does it take less compilation time since the compiler doesn't have to infer much? Or does it not matter much?
2. What is the normal standard in the Scala world?
From "Scala in Depth" chapter 4.5:
For a human reading a nontrivial method implementation, inferring the
return type can be troubling. It’s best to explicitly document and
enforce return types in public APIs.
From "Programming in Scala" chapter 2:
Sometimes the Scala compiler will require you to specify the result
type of a function. If the function is recursive, for example, you
must explicitly specify the function’s result type.
It is often a good idea to indicate function result types explicitly.
Such type annotations can make the code easier to read, because the
reader need not study the function body to figure out the inferred
result type.
From "Scala in Action" chapter 2.2.3:
It’s a good practice to specify the return type for the users of the
library. If you think it’s not clear from the function what its return
type is, either try to improve the name or specify the return type.
From "Programming Scala" chapter 1:
Recursive functions are one exception where the execution scope
extends beyond the scope of the body, so the return type must be
declared.
For simple functions perhaps it’s not that important to show it
explicitly. However, sometimes the inferred type won’t be what’s
expected. Explicit return types provide useful documentation for the
reader. I recommend adding return types, especially in public APIs.
You have to provide explicit return types in the following cases (two of them are sketched after this list):
When you explicitly call return in a method.
When a method is recursive.
When two or more methods are overloaded and one of them calls another; the calling method needs a return type annotation.
When the inferred return type would be more general than you intended, e.g., Any.
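A minimal sketch of two of these cases (the function names are invented for the example): a recursive method, and an inferred result type that is more general than intended.
// Recursive: the result type annotation is mandatory.
def factorial(n: Int): Int =
  if (n <= 1) 1 else n * factorial(n - 1)

// The branches return String and Int, so the inferred result type is Any,
// which is probably more general than intended; annotate to catch this.
def pick(flag: Boolean) = if (flag) "yes" else 0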
Another reason which has not yet been mentioned in the other answers is the following. You probably know that it is a good idea to program to an interface, not an implementation.
In the case of return values of functions or methods, that means that you don't want users of the function or method to know what specific implementation of some interface (or trait) the function returns - that's an implementation detail you want to hide.
If you write a method like this:
trait Example
class ExampleImpl1 extends Example { ... }
class ExampleImpl2 extends Example { ... }
def example() = new ExampleImpl1
then the return type of the method will be inferred to be ExampleImpl1 - so, it is exposing the fact that it is returning a specific implementation of trait Example. You can use an explicit return type to hide this:
def example(): Example = new ExampleImpl1
The standard rule is to use explicit types for API (in order to specify the type precisely and as a guard against refactoring) and also for implicits (especially because implicits without an explicit type may be ignored if the definition site is after the use site).
To the first question, type inference can be a significant tax, but that is balanced against the ease of both writing and reading expressions.
In the example, the type on the local list is not even a "better java." It's just visual clutter.
However, it should be easy to read the inferred type. Occasionally, I have to fire up the IDE just to tell me what is inferred.
By implication, methods should be short enough so that it's easy to scan for the result type.
Sorry for the lack of references. Maybe someone else will step forward; the topic is frequent on MLs and SO.
2. The Scala style guide says:
Use type inference where possible, but put clarity first, and favour explicitness in public APIs.
You should almost never annotate the type of a private field or a local variable, as their type will usually be immediately evident in their value:
private val name = "Daniel"
However, you may wish to still display the type where the assigned value has a complex or non-obvious form.
All public methods should have explicit type annotations. Type inference may break encapsulation in these cases, because it depends on internal method and class details. Without an explicit type, a change to the internals of a method or val could alter the public API of the class without warning, potentially breaking client code. Explicit type annotations can also help to improve compile times.
The twitter scala style guide says of method return types:
While Scala allows these to be omitted, such annotations provide good documentation: this is especially important for public methods. Where a method is not exposed and its return type obvious, omit them.
I think there's a broad consensus that explicit types should be used for public APIs, and shouldn't be used for most local variable declarations. When to use explicit types for "internal" methods is less clear-cut and more a matter of judgement; different organizations have different standards.
1. Type inference doesn't seem to visibly affect compilation time for the line where the inference happens (aside from a few rare cases with implicits which are basically compiler bugs) - after all, the compiler still has to check the type, which is pretty much the same calculation it would use to infer it. But if a method return type is inferred then anything using that method has to be recompiled when that method changes.
So inferring a method (or public variable) that's used in many places can slow down compilation (particularly if you're using incremental compilation). But inferring local or private variables, private methods, or public methods that are only used in one or two places, makes no (significant) difference.
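To illustrate the encapsulation point above (a hedged sketch; the class and member names are invented): an inferred result type on a public member silently tracks the implementation.
class Repo {
  // Inferred result type: List[String]. Rewriting the body to use
  // Vector("a", "b") would silently change the public type to
  // Vector[String], breaking clients that rely on List-specific members.
  def users = List("a", "b")
}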

Type alias vs lambda type

Can anybody explain the pros/cons of
type VNel[+A] = ValidationNel[String, A]
x.sequence[VNel, ....
vs
x.sequence[({ type l[a] = ValidationNel[String, a] })#l, ....
From what I understand, using structural types incurs a runtime performance hit of having to use reflection.
A type lambda is a way to express a complex type inline.
A type alias is a way to create an identifier for a type. It can name a complex type or be as simple as type UserId = Int. It is useful when you need a complex type more than once, or when you want to simplify a complex signature by breaking it into parts.
Neither type lambdas nor type aliases are structural typing; both are just ways to express types.
For more details on type lambdas:
https://stackoverflow.com/a/8737611/547564
They are mostly equivalent - use whichever you find clearer. IMO the type alias is usually more readable. In the context of a trait (or class) written to be extended, the type lambda can be clearer as it prevents overriding the type, but this is very much an edge case.
Accessing values defined in a structural type in ordinary code would indeed incur the cost of using reflection. But in a type lambda the structural type is only used as a generic type parameter, which will be erased at runtime. So there will be no runtime performance impact.
If you're making extensive use of type lambdas, you might like to consider the kind-projector plugin, which provides a more convenient syntax (and avoids the misleading visual resemblance to a structural type).
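For reference, here are the equivalent spellings side by side (a sketch; the last spelling assumes the kind-projector compiler plugin, and older plugin versions wrote ? instead of *):
type VNel[+A] = ValidationNel[String, A]         // named type alias
// ({ type l[a] = ValidationNel[String, a] })#l  // inline type lambda
// ValidationNel[String, *]                      // kind-projector syntax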

Closures in Scala vs Closures in Java

Some time ago Oracle decided that adding closures to Java 8 would be a good idea. I wonder how the design problems are solved there in comparison to Scala, which has had closures since day one.
Citing the Open Issues from javac.info:
Can Method Handles be used for Function Types?
It isn't obvious how to make that work. One problem is that Method Handles reify type parameters, but in a way that interferes with function subtyping.
Can we get rid of the explicit declaration of "throws" type parameters?
The idea would be to use disjunctive type inference whenever the declared bound is a checked exception type. This is not strictly backward compatible, but it's unlikely to break real existing code. We probably can't get rid of "throws" in the type argument, however, due to syntactic ambiguity.
Disallow #Shared on old-style loop index variables
Handle interfaces like Comparator that define more than one method, all but one of which will be implemented by a method inherited from Object.
The definition of "interface with a single method" should count only methods that would not be implemented by a method in Object and should count multiple methods as one if implementing one of them would implement them all. Mainly, this requires a more precise specification of what it means for an interface to have only a single abstract method.
Specify mapping from function types to interfaces: names, parameters, etc.
We should fully specify the mapping from function types to system-generated interfaces precisely.
Type inference. The rules for type inference need to be augmented to accommodate the inference of exception type parameters. Similarly, the subtype relationships used by the closure conversion should be reflected as well.
Elided exception type parameters to help retrofit exception transparency.
Perhaps make elided exception type parameters mean the bound. This enables retrofitting existing generic interfaces that don't have a type parameter for the exception, such as java.util.concurrent.Callable, by adding a new generic exception parameter.
How are class literals for function types formed?
Is it #void().class ? If so, how does it work if object types are erased? Is it #?(?).class ?
The system class loader should dynamically generate function type interfaces.
The interfaces corresponding to function types should be generated on demand by the bootstrap class loader, so they can be shared among all user code. For the prototype, we may have javac generate these interfaces so prototype-generated code can run on stock (JDK5-6) VMs.
Must the evaluation of a lambda expression produce a fresh object each time?
Hopefully not. If a lambda captures no variables from an enclosing scope, for example, it can be allocated statically. Similarly, in other situations a lambda could be moved out of an inner loop if it doesn't capture any variables declared inside the loop. It would therefore be best if the specification promises nothing about the reference identity of the result of a lambda expression, so such optimizations can be done by the compiler.
As far as I understand, 2., 6. and 7. aren't a problem in Scala, because Scala doesn't use checked exceptions as some sort of "shadow type system" the way Java does.
What about the rest?
1) Can Method Handles be used for Function Types?
Scala targets JDK 5 and 6 which don't have method handles, so it hasn't tried to deal with that issue yet.
2) Can we get rid of the explicit declaration of "throws" type parameters?
Scala doesn't have checked exceptions.
3) Disallow #Shared on old-style loop index variables.
Scala doesn't have loop index variables. Still, the same idea can be expressed with a certain kind of while loop. Scala's semantics are pretty standard here: symbol bindings are captured, and if a symbol happens to map to a mutable reference cell, then on your own head be it.
4) Handle interfaces like Comparator that define more than one method, all but one of which come from Object
Scala users tend to use functions (or implicit functions) to coerce functions of the right type to an interface, e.g.:
import java.util.Comparator

// The `implicit` modifier is optional here; with it in scope, the
// conversion is applied automatically wherever a Comparator is expected.
def toComparator[A](f: (A, A) => Int): Comparator[A] = new Comparator[A] {
  def compare(x: A, y: A): Int = f(x, y)
}
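For example, under the definition above (a hypothetical usage):
val byLength = toComparator[String]((a, b) => a.length - b.length)
java.util.Arrays.sort(Array("ccc", "a", "bb"), byLength) // yields a, bb, ccc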
5) Specify mapping from function types to interfaces:
Scala's standard library includes FunctionN traits for 0 <= N <= 22, and the spec says that function literals create instances of those traits.
6) Type inference. The rules for type inference need to be augmented to accommodate the inference of exception type parameters.
Since Scala doesn't have checked exceptions, it can punt on this whole issue.
7) Elided exception type parameters to help retrofit exception transparency.
Same deal, no checked exceptions.
8) How are class literals for function types formed? Is it #void().class ? If so, how does it work if object types are erased? Is it #?(?).class ?
classOf[A => B] //or, equivalently,
classOf[Function1[A,B]]
Type erasure is type erasure. The above literals produce scala.Function1 regardless of the choice of A and B. If you prefer, you can write
classOf[ _ => _ ] // or
classOf[Function1[ _,_ ]]
9) The system class loader should dynamically generate function type interfaces.
Scala arbitrarily limits the number of arguments to be at most 22 so that it doesn't have to generate the FunctionN classes dynamically.
10) Must the evaluation of a lambda expression produce a fresh object each time?
The Scala specification does not say that it must. But as of 2.8.1, the compiler does not optimize the case where a lambda does not capture anything from its environment. I haven't tested with 2.9.0 yet.
I'll address only number 4 here.
One of the things that distinguishes Java "closures" from closures found in other languages is that they can be used in place of an interface that does not describe a function - for example, Runnable. This is what is meant by SAM: Single Abstract Method.
Java does this because these interfaces abound in the Java library, and they abound in the Java library because Java was created without function types or closures. In their absence, all code that needed inversion of control had to resort to SAM interfaces.
For example, Arrays.sort takes a Comparator object that will perform comparison between members of the array to be sorted. By contrast, Scala can sort a List[A] by receiving a function (A, A) => Int, which is easily passed through a closure. See note 1 at the end, however.
So, because Scala's library was created for a language with function types and closures, there isn't need to support such a thing as SAM closures in Scala.
Of course, there's a question of Scala/Java interoperability -- while Scala's library might not need something like SAM, Java library does. There are two ways that can be solved. First, because Scala supports closures and function types, it is very easy to create helper methods. For example:
def runnable(f: () => Unit): Runnable = new Runnable {
  def run(): Unit = f()
}
runnable { () => println("Hello") } // creates a Runnable
Actually, this particular example can be made even shorter by use of Scala's by-name parameters, but that's beside the point. Anyway, this is something that, arguably, Java could have done instead of what it is going to do. Given the prevalence of SAM interfaces, it is not all that surprising.
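For the curious, here is a hedged sketch of the by-name variant alluded to above (the name runnableByName is invented; this is not from the original answer):
def runnableByName(body: => Unit): Runnable = new Runnable {
  def run(): Unit = body
}

runnableByName { println("Hello") } // no explicit () => wrapper needed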
The other way Scala handles this is through implicit conversions. By just prepending implicit to the runnable method above, one creates a method that gets automatically (note 2) applied whenever a Runnable is required but a function () => Unit is provided.
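A minimal sketch of that implicit variant (note: since Scala 2.12, native SAM conversion would also cover this particular case):
import scala.language.implicitConversions

implicit def funToRunnable(f: () => Unit): Runnable = new Runnable {
  def run(): Unit = f()
}

val hello: Runnable = () => println("Hello") // conversion applied automatically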
Implicits are a distinctive feature, however, and still controversial to some extent.
Note 1: Actually, this particular example was chosen with some malice... Comparator has two abstract methods instead of one, which is the whole problem with it. Since one of its methods can be implemented in terms of the other, I think they'll just "subtract" defender methods from the abstract list.
And, on the Scala side, even though there's a sort method that uses (A, A) => Boolean rather than (A, A) => Int, the standard sorting method calls for an Ordering object, which is quite similar to Java's Comparator! In Scala's case, though, Ordering performs the role of a type class.
Note 2: Implicits are applied automatically once they have been imported into scope.

Advantages of Scala's type system

I am exploring the Scala language. One claim I often hear is that Scala has a stronger type system than Java. By this I think what people mean is that:
scalac rejects certain buggy programs which javac will compile happily, only to cause a runtime error.
Certain invariants can be encoded in a Scala program such that the compiler won't let the programmer write code that violates the condition.
Am I right in thinking so?
The main advantage of the Scala Type system is not so much being stronger but rather being far richer (see "The Scala Type System").
(Java can define some of them, and implement others, but Scala has them built-in).
See also The Myth Makers 1: Scala's "Type Types", commenting Steve Yegge's blog post, where he "disses" Scala as "Frankenstein's Monster" because "there are type types, and type type types".
Value type classes (useful for reasonably small data structures that have value semantics), used instead of primitive types (Int, Double, ...), with implicit conversion to "Rich" classes for additional methods.
Nonnullable type
Monad types
Trait types (and the mixin composition that comes with it)
Singleton object types (just define an 'object' and you have one),
Compound types (intersections of object types, to express that the type of an object is a subtype of several other types),
Function types ((type1, …) => returnType syntax),
Case classes (regular classes which export their constructor parameters and which provide a recursive decomposition mechanism via pattern matching),
Path-dependent types (Languages that let you nest types provide ways to refer to those type paths),
Anonymous types (for defining anonymous functions),
Self types (can be used for instance in Trait),
Type aliases, along with:
package object (introduced in 2.8)
Generic types (like Java), with a type parameter annotation mechanism to control the subtyping behavior of generic types,
Covariant generic types: the annotation +T declares type T to be used only in covariant positions. Stack[T] is a subtype of Stack[S] if T is a subtype of S (see the sketch after this list).
Contravariant generic types: -T would declare T to be used only in contravariant positions.
Bounded generic types (even though Java supports some part of it),
Higher kinded types, which allow one to express more advanced type relationships than is possible with Java Generics,
Abstract types (the alternative to generic type),
Existential types (used in Scala like the Java wildcard type),
Implicit types (see "The awesomeness of Scala is implicit",
View bounded types, and
Structural types, for specifying a type by the characteristics of the desired type (duck typing).
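A minimal sketch of the variance annotations from the list (the class and trait names are invented for illustration):
class Stack[+T]                             // covariant: Stack[String] is a Stack[Any]
trait Printer[-T] { def print(t: T): Unit } // contravariant: Printer[Any] is a Printer[String]

val s: Stack[Any] = new Stack[String]       // allowed by +T
val p: Printer[String] = new Printer[Any] { def print(t: Any): Unit = println(t) }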
The main safety problem with Java relates to variance. Basically, a programmer can use incorrect variance declarations that may result in exceptions being thrown at run-time in Java, while Scala will not allow it.
In fact, the very fact that Java's Array is covariant is already a problem, since it allows incorrect code to compile. For instance, as exemplified by sepp2k:
String[] strings = {"foo"};
Object[] objects = strings;
objects[0] = new Object(); // compiles, but throws ArrayStoreException at runtime
Then, of course, there are raw types in Java, which allow all sorts of things.
Also, though Scala has it as well, there's casting. The Java API is rich in type casts, and there's no idiom like Scala's case x: X => // x is now safely cast. Sure, one can use instanceof to accomplish that, but there's no incentive to do it. In fact, Scala's asInstanceOf is intentionally verbose.
These are the things that make Scala's type system stronger. It is also much richer, as VonC shows.