Scala type hierarchy - scala

I looked at Scala Type Hierarchy
It's pretty clear that Unit is a subtype of AnyVal. So I tried this:
object Main extends App {
val u : Unit = Seq[String]()
}
and it compiled fine. But I expected some error. DEMO.
Why? Unit is not a supertype of a Seq

This happens because Unit can also be inferred from statements (in contrast with expressions). This means that if you use the Unit type annotation, the compiler will ignore the last value in your expression and return Unit after your actual code.
You can simply testing this, by casting your u back to its original type. You will get a cast error, because u isn't actually bound to your Seq.
Here's what happens if you run this in the REPL:
scala> u.asInstanceOf[Seq[String]]
java.lang.ClassCastException: scala.runtime.BoxedUnit cannot be cast to scala.collection.Seq
As Jasper-M correctly pointed out, the compiler will rewrite this to
val u: Unit = { Seq[String](); () }

Why? Unit is not a supertype of a Seq
You are assuming that the Seq is assigned to u. But it isn't:
println(u)
# ()
As you can see, the value of u is () (i.e. the unique singleton instance of Unit), not an empty Seq.
This is due to Value Discarding, which is described in clause 5 of section 6.26.1 Value Conversions of the Scala Language Specification:
The following seven implicit conversions can be applied to an expression e which has some value type T and which is type-checked with some expected type pt.
[…]
If e has some value type and the expected type is Unit, e is converted to the expected type by embedding it in the term { e; () }.
In plain English as opposed to "Speclish", what this means is: if you say "this is of type Unit", but it actually isn't, the compiler will return () for you.
The Unit type is sort-of the equivalent to a statement: statements have no value. Well, in Scala, everything has a value, but we have () which is a useless value, so what would in other languages be a statement which has no value, is in Scala an expression which has a useless value. And just like other languages which distinguish between expressions and statements have such things like statement expressions and expression statements (e.g. ECMAScript, Java, C, C++, C♯), Scala has Value Discarding, which is essentially its analog for an Expression Statement (i.e. a statement which contains an expression whose value is discarded).
For example, in C you are allowed to write
printf("Hello, World");
even though printf is a function which returns a size_t. But, C will happily allow you to use printf as a statement and simply discard the return value. Scala does the same.
Whether or not that's a good thing is a different question. Would it be better if C forced you to write
size_t ignoreme = printf("Hello, World");
or if Scala forced you to write
val _ = processKeyBinding​(…)
// or
{ processKeyBinding​(…); () }
instead of just
processKeyBinding​(…)
I don't know. However, one of Scala's design goals is to integrate well with the host platform and that does not just mean high-performance language interoperability with other languages, it also means easy onboarding of existing developers, so there are some features in Scala that are not actually needed but exist for familiarity with developers from the host platform (e.g. Java, C♯, ECMAScript) – Value Discarding is one of them, while loops are another.

Related

How unsafe is it to cast an arbitrary function X=>Y to X => Unit in scala?

More explicitly, can this code produce any errors in any scenrios:
def foreach[U](f :Int=>U) = f.asInstanceOf[Int=>Unit](1)
I know it works, and I have a vague idea why: any function, as an instance of a generic type, must define an erased version of apply and jvm performs type check only when the object is actually to be returned to a code where it had a concrete type (often miles away). So, in theory, as long as I never look at the returned value, I should be safe. I don't have an enough low-level understandings of java byte code, let alone scalac, to have any certainty about it.
Why would I want to do it? Look at the following example:
val b = new mutable.Buffer[Int]
val ints = Seq(1, 2, 3, 4)
ints foreach { b += _ }
It's a typical scala construct, as far as imperative style can be typical. foreach in this example takes an Int as an argument, and as scalac knows it to be an Int, it will create a closure with a specialized apply(x :Int). Unfortunately, its return type in this case is a mutable.Buffer[Int], which is an AnyRef. As far as I was able to see, scalac will never invoke a specialized apply providing an AnyVal argument if the result is an AnyRef (and vice versa). This means, that even if the caller applies the function to Int, underneath the function will box the argument and invoke the erased variant. Here of course it doesn't matter as they are boxed within the List anyway, but I'm talking about the principle.
For this reason I prefer to define this type of method as foreach(f :X=>Unit), rather than foreach[O](f: X=>O) as it is in TraversableOnce. If the input sequence in the example had such a signature, everything would compile just as fine, and the compiler would ignore the actual type of the expression and generate a function with Unit return type, which - when applied to an unboxed Int - would invoke directly void apply(Int x), without boxing.
The problem arises with interoperability - sometimes I need to call a method expecting a function with a Unit return type and all I have is a generic function returning Odin knows what. Of course, I could just write f(_) to box it in another function object instead of passing it directly, but it to large extent makes the whole optimisation of small tight loops moot.

Understanding type inferrence in Scala

I wrote the following simple program:
import java.util.{Set => JavaSet}
import java.util.Collections._
object Main extends App {
def test(set: JavaSet[String]) = ()
test(emptySet()) //fine
test(emptySet) //error
}
DEMO
And was really surprised the the final line test(emptySet) was not compiled. Why? What is the difference between test(emptySet())? I thought in Scala we could omit parenthesis freely in such cases.
See Method Conversions in Scala specification:
The following four implicit conversions can be applied to methods which are not applied to some argument list.
Evaluation
A parameterless method m
of type => T is always converted to type T by evaluating the expression to which m is bound.
Implicit Application
If the method takes only implicit parameters, implicit arguments are passed following the rules here.
Eta Expansion
Otherwise, if the method is not a constructor, and the expected type pt
is a function type (Ts′)⇒T′, eta-expansion is performed on the expression e.
Empty Application
Otherwise, if e
has method type ()T, it is implicitly applied to the empty argument list, yielding e().
The one you want is "Empty Application", but it's only applied if none of the earlier conversions are, and in this case "Eta Expansion" happens instead.
EDIT: This was wrong and #Jasper-M's comment is right. No eta-expansion is happening, "Empty Application" is just inapplicable to generic methods currently.

def layout[A](x: A) = ... syntax in Scala

I'm a beginner of Scala who is struggling with Scala syntax.
I got the line of code from https://www.tutorialspoint.com/scala/higher_order_functions.htm.
I know (x: A) is an argument of layout function
( which means argument x of Type A)
But what is [A] between layout and (x: A)?
I've been googling scala function syntax, couldn't find it.
def layout[A](x: A) = "[" + x.toString() + "]"
It's a type parameter, meaning that the method is parameterised (some also say "generic"). Without it, compiler would think that x: A denotes a variable of some concrete type A, and when it wouldn't find any such type it would report a compile error.
This is a fairly common thing in statically typed languages; for example, Java has the same thing, only syntax is <A>.
Parameterized methods have rules where the types can occur which involve concepts of covariance and contravariance, denoted as [+A] and [-A]. Variance is definitely not in the scope of this question and is probably too much for you too handle right now, but it's an important concept so I figured I'd just mention it, at least to let you know what those plus and minus signs mean when you see them (and you will).
Also, type parameters can be upper or lower bounded, denoted as [A <: SomeType] and [A >: SomeType]. This means that generic parameter needs to be a subtype/supertype of another type, in this case a made-up type SomeType.
There are even more constructs that contribute extra information about the type (e.g. context bounds, denoted as [A : Foo], used for typeclass mechanism), but you'll learn about those later.
This means that the method is using a generic type as its parameter. Every type you pass that has the definition for .toString could be passed through layout.
For example, you could pass both int and string arguments to layout, since you could call .toString on both of them.
val i = 1
val s = "hi"
layout(i) // would give "[1]"
layout(s) // would give "[hi]"
Without the gereric parameter, for this example you would have to write two definitions for layout: one that accepts integers as param, and one that accepts string as param. Even worse: every time you need another type you'd have to write another definition that accepts it.
Take a look at this example here and you'll understand it better.
I also recomend you to take a look at generic classes here.
A is a type parameter. Rather than being a data type itself (Ex. case class A), it is generic to allow any data type to be accepted by the function. So both of these will work:
layout(123f) [Float datatype] will output: "[123]"
layout("hello world") [String datatype] will output: "[hello world]"
Hence, whichever datatype is passed, the function will allow. These type parameters can also specify rules. These are called contravariance and covariance. Read more about them here!

What is the reasoning for the imbalance of Scala's regular value assignment vs extractor assignment?

Scala appears to have different semantics for regular value assignment versus assignment during an extraction. This has created some very subtle runtime bugs for me as my codebase has migrated over time.
To illustrate:
case class Foo(s: String)
def anyRef(): AnyRef = { ... }
val Foo(x) = anyRef() // compiles but throws exception if anyRef() is not a Foo
val f: Foo = anyRef() // does not compile
I don't see why the two val assignment lines would be imbalanced with regard to compile/runtime behavior.
Some I am curious: Is there a reason for this behavior? Is it an undesirable artifact of the way the extraction is implemented?
(tested in Scala 2.11.7)
Yes, there is.
In the case of an extractor you are specifying a pattern that you expect to match at this position. This is translated to a call of an extractor method from Any to Option[String] and this method can be called with the value of type AnyRef you provide. You are just asserting, that you do indeed get a result and not "None".
In the other line you are using a type annotation, i.e. you are specifying the type of "f" explicitly. But then you are assigning an incompatible value.
Of course the compiler could add implicit type casts but making type casts so easy would not really suit a language like Scala.
You should also keep in mind that extractors have nothing to do with types. In the Pattern Foo(x) the name Foo has nothing to do with the type Foo. It is just the Name of an extractor method.
Using patterns in this way is of course a quite dynamic feature, but I think it is intentional.

The purpose of type classes in Haskell vs the purpose of traits in Scala

I am trying to understand how to think about type classes in Haskell versus traits in Scala.
My understanding is that type classes are primarily important at compile time in Haskell and not at runtime anymore, on the other hand traits in Scala are important both at compile time and run time. I want to illustrate this idea with a simple example, and I want to know if this viewpoint of mine is correct or not.
First, let us consider type classes in Haskell:
Let's take a simple example. The type class Eq.
For example, Int and Char are both instances of Eq. So it is possible to create a polymorphic List that is also an instance of Eq and can either contain Ints or Chars but not both in the same List.
My question is : is this the only reason why type classes exist in Haskell?
The same question in other words:
Type classes enable to create polymorphic types ( in this example a polymorphic List) that support operations that are defined in a given type class ( in this example the operation == defined in the type class Eq) but that is their only reason for existence, according to my understanding. Is this understanding of mine correct?
Is there any other reason why type classes exist in ( standard ) Haskell?
Is there any other use case in which type classes are useful in standard Haskell ? I cannot seem to find any.
Since Haskell's Lists are homogeneous, it is not possible to put Char and Int into the same list. So the usefulness of type classes, according to my understanding, is exhausted at compile time. Is this understanding of mine correct?
Now, let's consider the analogous List example in Scala:
Lets define a trait Eq with an equals method on it.
Now let's make Char and Int implement the trait Eq.
Now it is possible to create a List[Eq] in Scala that accepts both Chars and Ints into the same List ( Note that this - putting different type of elements into the same List - is not possible Haskell, at least not in standard Haskell 98 without extensions)!
In the case of the Haskell's List, the existence of type classes is important/useful only for type checking at compile time, according to my understanding.
In contrast, the existence of traits in Scala is important both at compile time for type checking and at run type for polymorphic dispatch on the actual runtime type of the object in the List when comparing two Lists for equality.
So, based on this simple example, I came to the conclusion that in Haskell type classes are primarily important/used at compilation time, in contrast, Scala's traits are important/used both at compile time and run time.
Is this conclusion of mine correct?
If not, why not ?
EDIT:
Scala code in response to n.m.'s comments:
case class MyInt(i:Int) {
override def equals(b:Any)= i == b.asInstanceOf[MyInt].i
}
case class MyChar(c:Char) {
override def equals(a:Any)= c==a.asInstanceOf[MyChar].c
}
object Test {
def main(args: Array[String]) {
val l1 = List(MyInt(1), MyInt(2), MyChar('a'), MyChar('b'))
val l2 = List(MyInt(1), MyInt(2), MyChar('a'), MyChar('b'))
val l3 = List(MyInt(1), MyInt(2), MyChar('a'), MyChar('c'))
println(l1==l1)
println(l1==l3)
}
}
This prints:
true
false
I will comment on the Haskell side.
Type classes bring restricted polymorphism in Haskell, wherein a type variable a can still be quantified universally, but ranges over only a subset of all the types -- namely, the types for which an instance of the type class is available.
Why restricted polymorphism is useful? A nice example would be the equality operator
(==) :: ?????
What its type should be? Intuitively, it takes two values of the same type and returns a boolean, so:
(==) :: a -> a -> Bool -- (1)
But the typing above is not entirely honest, since it allows one to apply == to any type a, including function types!
(\x :: Integer -> x + x) == (\x :: Integer -> 2*x)
The above would pass type checking if (1) were the typing for (==), since both arguments are of the same type a = (Integer -> Integer). However, we can not effectively compare two functions: well-known Computability results tell us that there is no algorithm to do that in general.
So, what we could do to implement (==)?
Option 1: at run time, if a function (or any other value involving functions -- such as a list of functions) is found to be passed to (==), raise an exception. This is what e.g. ML does. Typed programs can now "go wrong", despite checking types at compile time.
Option 2: introduce a new kind of polymorphism, restricting a to the function-free types. For instance, ww could have (==) :: forall-non-fun a. a -> a -> Bool so that comparing functions yields to a type error. Haskell exploits type classes to obtain exactly that.
So, Haskell type classes allow one to type (==) "honestly", ensuring no error at run time, and without being overly restrictive. Of course, the power of type classes goes far beyond of that but, at least in my own view, they primary purpose is to allow restricted polymorphism, in a very general and flexible way. Indeed, with type classes the programmer can define their own restrictions on the universal type quantifications.