When is a return type required for methods in Scala?

The Scala compiler can often infer return types for methods, but there are some circumstances where it's required to specify the return type. Recursive methods, for example, require a return type to be specified.
I notice that sometimes I get the error message "overloaded method (methodname) requires return type", but it's not a general rule that return types must always be specified for overloaded methods (I have examples where I don't get this error).
When exactly is it required to specify a return type, for methods in general and specifically for overloaded methods?

Chapter 2, "Type Less, Do More", of the book Programming Scala says:
When Explicit Type Annotations Are Required.
In practical terms, you have to provide explicit type annotations for the following situations:
Method return values in the following cases:
When you explicitly call return in a method (even at the end).
When a method is recursive.
When a method is overloaded and one of the methods calls another. The calling method needs a return type annotation.
When the inferred return type would be more general than you intended, e.g., Any.
Example:
// code-examples/TypeLessDoMore/method-nested-return-script.scala
// ERROR: Won't compile until you put a String return type on upCase.
def upCase(s: String) = {
if (s.length == 0)
return s // ERROR - forces return type of upCase to be declared.
else
s.toUpperCase()
}
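As a minimal sketch of the first two cases in the list (explicit return and recursion), both methods below compile once the result type is written out. The object name ReturnTypeExamples and the factorial method are illustrative assumptions, not part of the book's examples.

```scala
object ReturnTypeExamples {
  // Explicit `return` is accepted because the result type is declared.
  def upCase(s: String): String = {
    if (s.length == 0)
      return s
    else
      s.toUpperCase()
  }

  // A recursive method: the compiler needs the declared result type
  // to type-check the recursive call inside the body.
  def factorial(n: Int): BigInt =
    if (n <= 1) BigInt(1) else BigInt(n) * factorial(n - 1)
}
```

Removing either annotation should bring back the corresponding compile error.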
Overloaded methods can sometimes require an explicit return type. When one such method calls another, we have to add a return type to the one doing the calling, as in this example.
// code-examples/TypeLessDoMore/method-overloaded-return-script.scala
// Version 1 of "StringUtil" (with a compilation error).
// ERROR: Won't compile: needs a String return type on the second "joiner".
object StringUtil {
def joiner(strings: List[String], separator: String): String =
strings.mkString(separator)
def joiner(strings: List[String]) = joiner(strings, " ") // ERROR
}
import StringUtil._ // Import the joiner methods.
println( joiner(List("Programming", "Scala")) )
The two joiner methods concatenate a List of strings together.
The first method also takes an argument for the separator string.
The second method calls the first with a “default” separator of a single space.
If you run this script, you get the following error.
... 9: error: overloaded method joiner needs result type
def joiner(strings: List[String]) = joiner(strings, " ")
Since the second joiner method calls the first, it requires an explicit String return type. It should look like this:
def joiner(strings: List[String]): String = joiner(strings, " ")
Basically, specifying the return type can be a good practice even though Scala can infer it.
Randall Schulz comments:
As a matter of (my personal) style, I give explicit return types for all but the most simple methods (basically, one-liners with no conditional logic).
Keep in mind that if you let the compiler infer a method's result type, it may well be more specific than you want. (E.g., HashMap instead of Map.)
And since you may want to expose the minimal interface in your return type (see for instance this SO question), this kind of inference might get in the way.
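To illustrate the HashMap versus Map point, here is a small sketch. The Registry object and its two methods are hypothetical names chosen for the example.

```scala
import scala.collection.immutable.HashMap

object Registry {
  // No annotation: the inferred result type is HashMap[String, Int],
  // so the implementation class leaks into the method's signature.
  def inferred = HashMap("a" -> 1)

  // Annotated: callers only see the Map interface.
  def declared: Map[String, Int] = HashMap("a" -> 1)
}
```

With the second form, the implementation can later switch to another Map without changing the signature callers compile against.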
And about the last scenario ("When the inferred return type would be more general than you intended"), Ken Bloom adds:
specify the return type when you want the compiler to verify that code in the function returns the type you expected
(The faulty code that triggers a "more general than expected" return type was:
// code-examples/TypeLessDoMore/method-broad-inference-return-script.scala
// ERROR: Won't compile. Method actually returns List[Any], which is too "broad".
def makeList(strings: String*) = {
if (strings.length == 0)
List(0) // #1
else
strings.toList
}
val list: List[String] = makeList() // ERROR
, which I incorrectly interpreted as producing List[Any] because it returned an empty List, but Ken called it out:
List(0) doesn't create a list with 0 elements.
It creates a List[Int] containing one element (the value 0).
Thus a List[Int] on one conditional branch and a List[String] on the other conditional branch generalize to List[Any].
In this case, the typer isn't being overly-general -- it's a bug in the code.
)
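A sketch of how an annotation turns this bug into a compile-time check: with List[String] declared, a List(0) branch would be rejected, so the empty case has to be written correctly. Wrapping the method in an object called MakeList is an assumption made only to keep the snippet self-contained.

```scala
object MakeList {
  // The declared result type makes the compiler verify both branches.
  // Writing List(0) in the first branch would no longer compile.
  def makeList(strings: String*): List[String] =
    if (strings.isEmpty) List.empty[String]
    else strings.toList
}
```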

Related

Why does the absence of an else block translate to Unit type return for a function?

I noticed a type mismatch on the line else if(r1 == 0 || divisors.tail.isEmpty || !divisors.tail.contains(r1)){newAcc}, because there is no else clause after my if ... else if ...
def euclidianDivision(dividend:Int,divisor:Int):(Int,Int)={
val quotient = dividend/divisor
val remainder = dividend%divisor
(quotient,remainder)
}
def firstExpansion(dividend:Int,divisors:List[Int]):List[(Int,Int)]={
def firstExpansionIter(dividend:Int,divisors:List[Int], acc:List[(Int,Int)]):List[(Int,Int)]= {
val div1:Int = divisors.head
val (q1,r1):(Int,Int) = euclidianDivision(dividend,div1)
val newAcc:List[(Int,Int)] = acc:::List((div1,q1))
if (divisors.tail.contains(r1)){
firstExpansionIter(r1,divisors.tail,newAcc)
}else if(r1 == 0 || divisors.tail.isEmpty || !divisors.tail.contains(r1)){newAcc}
}
firstExpansionIter(dividend,divisors,List((0,0))).tail
}
Here's the error code:
Error:(32, 15) type mismatch; found : Unit required: List[(Int, Int)]
}else if(r1 == 0 || divisors.tail.isEmpty || !divisors.tail.contains(r1)){newAcc}
I can correct this by adding the else clause, but how come if there is no outcome handled by default, the function tries to return a Unit?
N.B : Corrected code :
def firstExpansion(dividend:Int,divisors:List[Int]):List[(Int,Int)]={
def firstExpansionIter(dividend:Int,divisors:List[Int], acc:List[(Int,Int)]):List[(Int,Int)]= {
val div1:Int = divisors.head
val (q1,r1):(Int,Int) = euclidianDivision(dividend,div1)
val newAcc:List[(Int,Int)] = acc:::List((div1,q1))
if (divisors.tail.contains(r1)){
firstExpansionIter(r1,divisors.tail,newAcc)
}else if(r1 == 0 || divisors.tail.isEmpty || !divisors.tail.contains(r1)){newAcc}
else throw new RuntimeException("Something unexpected happened.")
}
firstExpansionIter(dividend,divisors,List((0,0))).tail
}
I can correct this by adding the else clause, but how come if there is no outcome handled by default, the function tries to return a Unit?
In Scala, unlike more "imperative" languages, (almost) everything is an expression (there are very few statements), and every expression evaluates to a value (which also means that every method returns a value).
This means that, for example, the conditional expression if (condition) consequence else differentConsequence is an expression that evaluates to a value.
For example, in this piece of code:
val foo = if (someRandomCondition) 42 else "Hello"
the then part of the expression will evaluate to 42, the else part of the expression will evaluate to "Hello", which means the if expression as a whole will evaluate to either 42 or "Hello".
So, what is the type of foo going to be? Well, in the then case, the value is of type Int and in the else case, the value is of type String. But, this depends on the runtime value of someRandomCondition, which is unknown at compile time. So, the only choice we have as the type for the whole if expression is the lowest common ancestor (technically, the weak least upper bound) of Int and String, which is Any.
In a language with union types, we could give it a more precise type, namely the union type Int | String. (Scala 3 has union types, so we could give the expression this exact type, although Scala 3 will not infer union types.) In Scala 3, we could even annotate it with the even more precise type 42 | "Hello", which is actually the type that TypeScript is going to infer for the equivalent conditional expression:
const foo = someRandomCondition ? 42 : "Hello"
Now, let's move forward towards the code in the question:
val bar = if (someRandomCondition) 42
What is the type of bar going to be? We said above that it is the lowest common ancestor of the types of the then and else branch, but … what is the type of the else branch? What does the else branch evaluate to?
Remember, we said that every expression evaluates to a value, so the else branch must evaluate to some value. It can't just evaluate to "nothing".
This is solved by a so-called unit value of a unit type. The unit value and type are called the "unit" value and type because the type is designed in such a way that it can only possibly be inhabited by a single value. The unit type has no members, no properties, no fields, no semantics, no nothing. As such, it is impossible to distinguish two values of the unit type from one another, or put another way: there can only be one value of the unit type, because every other value of the unit type must be identical.
In many programming languages, the unit value and type use the same notation as a tuple value and type, and are simply identified with the empty tuple (). An empty tuple and a unit value are the same thing: they have no content, no meaning. In Haskell, for example, both the type and the value are written ().
Scala also has a unit value, and it is also written (). The unit type, however, is scala.Unit.
So, the unit value, which is a useless value, is used to signify a meaningless return value.
A related, but different concept in some imperative languages is the void type (or in some languages, it is more a "pseudo-type").
Note that "returns nothing" is different from "doesn't return", which will become important in the second part of this answer.
So the first half of the puzzle is: the Scala Language Specification says that
if (condition) expression
is equivalent to
if (condition) expression else ()
Which means that in the (implicit) else case, the return type is Unit, which is not compatible with List[(Int, Int)], and therefore, you get a type error.
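The equivalence can be observed directly in a short sketch. The object IfWithoutElse and its methods are hypothetical names; the result type is declared as Any so that both the Int and the unit value fit.

```scala
object IfWithoutElse {
  // No else branch: the compiler treats this as `if (flag) 42 else ()`.
  def describe(flag: Boolean): Any = if (flag) 42

  // The same expression with the implicit branch written out.
  def describeExplicit(flag: Boolean): Any = if (flag) 42 else ()
}
```

When flag is false, both methods evaluate to the unit value.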
But why does throwing an exception fix this?
This brings us to the second special type: Nothing. Nothing is a so-called bottom type, which means that it is a subtype of every type. Nothing does not have any value. So, what then, would a return type of Nothing signify?
It signifies an expression that doesn't return. And I repeat what I said above: this is different from returning nothing.
A method that has only a side-effect returns nothing, but it does return. Its return type is Unit and its return value is (). It doesn't have a meaningful return value.
A method that has an infinite loop or throws an exception doesn't return at all. Its return type is Nothing and it doesn't have a return value.
And that is why throwing an exception in the else clause fixes the problem: this means that the type of the else clause is Nothing, and since Nothing is a subtype of every type, it doesn't even matter what the type of the then clause is, the lowest common supertype of the type of the then clause and Nothing will always be the type of the then clause. (Think about it: the lowest common ancestor of a father and any of his children, grandchildren, great-grandchildren, etc. will always be the father himself. The lowest common ancestor of T and any subtype of T will always be T. Since Nothing is a subtype of all types, the lowest common ancestor of T and Nothing will always be T because Nothing is always a subtype of T, no matter what T is.)
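A small sketch of the same reasoning, using hypothetical names (NothingExample, fail, firstOrFail): one branch has type Int, the other has type Nothing, and the whole conditional is accepted as Int.

```scala
object NothingExample {
  // Never returns normally, so its result type is Nothing.
  def fail(msg: String): Nothing = throw new IllegalStateException(msg)

  // Int on the then branch, Nothing on the else branch:
  // the conditional as a whole conforms to Int.
  def firstOrFail(xs: List[Int]): Int =
    if (xs.nonEmpty) xs.head else fail("empty list")
}
```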

Scala - Function's implicit return type

I am new to Scala and have a small question about function definitions and the default return type.
Here is a function definition:
def wol(s: String) = s.length.toString.length
The prompt says it's:
wol: (s: String)Int
But the code didn't specify the return type explicitly. Shouldn't it default to Unit, which corresponds to void in Java?
So, what are the rules for the default return type of a Scala function?
The return type in a function is actually the return type of the last expression that occurs in the function. In this case it's an Int, because #length returns an Int.
This is the work done by the compiler when it tries to infer the type. If you don't specify a type, it automatically gets inferred, but it's not necessarily Unit. You could force it to be Unit by stating it:
def wol(s: String): Unit = s.length.toString.length
EDIT [syntactic sugar sample]
I just remembered something that might be connected to your previous beliefs. When you define a method without specifying its return type and without putting the = sign, the compiler will force the return type to be Unit.
def wol(s: String) {
s.length.toString.length
}
val x = wol("") // x has type Unit!
IntelliJ actually warns you and gives the hint "Useless expression". Behind the scenes, the #wol function here is converted into something like:
// This is actually the same as the first function
def wol(s: String): Unit = { s.length.toString.length }
Anyway, as a best practice try to avoid using this syntax and always opt for putting that = sign. Furthermore if you define public methods try to always specify the return type.
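As a rough side-by-side sketch of the recommended style (always with the = sign), the two methods below differ only in whether the result type is inferred or declared as Unit. The names ResultTypes, wolInferred and wolUnit are made up for this example.

```scala
object ResultTypes {
  // No annotation: the result type is inferred from the body, here Int.
  def wolInferred(s: String) = s.length.toString.length

  // Declared Unit: the method is called only for its side effect.
  def wolUnit(s: String): Unit = println(s.length.toString.length)
}
```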
Hope that helps :)

Scala Function Overloading Anomaly

In Scala, why would this overload be allowed?
import java.util.UUID
class log {
def LogInfo(m: String, properties: Map[String, String]): Unit = {
println(m)
}
def LogInfo(m: String, properties: Map[String, String], c: UUID = null): Unit = {
println(m + c.toString())
}
}
In the second definition of the LogInfo function, I have set the extra parameter to a default value of null. When I make the following call, it will call the first overload.
val l: log = new log()
val props: Map[String, String] = Map("a" -> "1")
l.LogInfo("message", props)
Why would it not throw an exception? With a default value, I would have thought both definitions could look the same.
An exception wouldn't be thrown here because the compiler chooses the first overload as the applicable one. This has to do with the way overload resolution works with default arguments. As per the specification, a strong hint to the fact such methods are discarded would be the following line:
Otherwise, let CC be the set of applicable alternatives which don't employ any default argument in the application to e1,…,em.
This has to do with the way the Scala compiler emits JVM byte code for these two methods. If we compile them and look behind the curtains, we'll see (omitting the actual byte code for brevity):
public class testing.ReadingFile$log$1 {
public void LogInfo(java.lang.String,
scala.collection.immutable.Map<java.lang.String, java.lang.String>);
Code:
public void LogInfo(java.lang.String,
scala.collection.immutable.Map<java.lang.String, java.lang.String>,
java.util.UUID);
Code:
public java.util.UUID LogInfo$default$3();
Code:
0: aconst_null
1: areturn
}
You see that the generated code actually contains two methods, one taking two arguments and one taking three. Additionally, the compiler added a method called LogInfo$default$3 (the name has a meaning: $3 means "the default value for the third parameter"), which returns the default value for the c parameter of the second overload. If the method with the default argument were invoked, LogInfo$default$3 would be used to introduce a fresh variable with the given value.
Both methods are applicable, but overloading resolution specifically tosses out the application that requires default args:
applicable alternatives which don't employ any default argument
http://www.scala-lang.org/files/archive/spec/2.12/06-expressions.html#overloading-resolution
As to "why", imagine the overload has many default parameters, such that most applications of it don't look like invocations of the first method.
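The rule can be checked with a reduced sketch. The object Log and its info overloads are hypothetical stand-ins for the LogInfo methods; they return a tag so the chosen alternative is observable.

```scala
object Log {
  // Alternative that needs no default argument for a two-argument call.
  def info(m: String, props: Map[String, String]): String =
    "two-parameter"

  // Alternative that would need its default argument for a two-argument call.
  def info(m: String, props: Map[String, String], tag: String = "none"): String =
    "three-parameter:" + tag
}
```

A call with two arguments picks the first alternative. Passing the third argument explicitly selects the second one.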

Right associative functions with two parameter list

I was looking at the foldLeft and foldRight methods, and the operator version of the method looked extremely peculiar, something like this: (0 /: List.range(1,10))(_ + _).
For right associative functions with two parameter lists one would expect the syntax to be something like this: ((param1)(param2) op HostClass).
But here in this case it is of the syntax (param1 op HostClass)(param2). This causes ambiguity with another case where a right associative function returns another function that takes a single parameter.
Because of this ambiguity the class compiles but fails when the function call is made as shown below.
class Test() {
val func1:(String => String) = { (in) => in * 2 }
def `test:`(x:String) = { println(x); func1 }
def `test:`(x:String)(y:String) = { x+" "+y }
}
val test = new Test
(("Foo") `test:` test)("hello")
<console>:10: error: ambiguous reference to overloaded definition,
both method test: in class Test of type (x: String)(y: String)String
and method test: in class Test of type (x: String)String => String
match argument types (String)
(("Foo") `test:` test)("hello")
so my questions are
Is this an expected behaviour or is it a bug?
Why the two parameter list right associative function call has been designed the way it is, instead of what I think to be more intuitive syntax of ((param1)(param2) op HostClass)?
Is there a workaround to call either of the overloaded test: function without ambiguity.
In Scala 2, overloading resolution considers only the first parameter list of a method. Hence, to uniquely identify one of the overloaded methods in a class or object, the first parameter list has to be distinct for each of the overloaded definitions. This can be demonstrated by the following example.
object Test {
def test(x:String)(y:Int) = { x+" "+y.toString() }
def test(x:String)(y:String) = { x+" "+y }
}
Test.test("Hello")(1)
<console>:9: error: ambiguous reference to overloaded definition,
both method test in object Test of type (x: String)(y: String)String
and method test in object Test of type (x: String)(y: Int)String
match argument types (String)
Test.test("Hello")(1)
Does it really fail at runtime? When I tested it, the class compiles, but the call of the method test: does not.
I think that the problem is not with the operator syntax, but with the fact that you have two overloaded functions, one with just one and the other with two parameter lists.
You will get the same error with the dot-notation:
test.`test:`("Foo")("hello")
If you rename the one-param list function, the ambiguity will be gone and
(("Foo") `test:` test)("hello")
will compile.
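A sketch of the renaming workaround applied to methods with two parameter lists. The names CurriedOverloads, joinInt and joinText are assumptions for illustration only.

```scala
object CurriedOverloads {
  // Distinct names, so the call site never has to choose between
  // alternatives that share the same first parameter list.
  def joinInt(x: String)(y: Int): String = x + " " + y.toString
  def joinText(x: String)(y: String): String = x + " " + y
}
```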

Simple Type Inference in Scala

I have been looking at type inference in Scala and there are a couple of things I'd like to understand a bit better around why expression/method-return types have to be explicitly declared in a few cases.
Explicit return declaration
Example (works if the return keyword is omitted):
def upCase(s: String) = {
if (s.length == 0)
return s // COMPILE ERROR - forces return type of upCase to be declared.
else
s.toUpperCase()
}
Why can't I use the explicitly typed parameter as a return value without declaring the return type? And that's not only for direct parameter references, just for any 'type-inferable' expression.
Method overloading
Example (fails to compile when the second joiner method is added):
def joiner(ss: List[String], sep: String) = ss.mkString(sep)
def joiner(ss: List[String]) = joiner(ss, " ") // COMPILE ERROR WHEN ADDED
Well, the most obvious answer is: because it is stated in the specification; see section 6.20 of the Scala reference. But why it was designed this way is indeed a very interesting question. I suspect it is connected to the fact that the compiler can't predict which expression will be the last one, since return changes the execution flow.
EDIT:
Consider the following code if return didn't require an explicit return type:
def bar() = {
if(guard())
return "SS"
else if(guard1())
return true
2
}
What return type should bar have in this situation? Well, there is the option of the most common supertype, but I think that would get us to returning Any in many cases. Well, these are just my thoughts, which may be totally incorrect =)
The type of a function or method is inferred from the type of its last statement. Usually, that's an expression.
Now, "return" breaks the control flow. It is an "immediate interrupt", so to speak. Because of that, the normal rules used to infer the type of an expression can't be used anymore. It still could be done, of course, but I'm guessing the cost in compiler complexity was deemed too high for the return.
Here's an example of how the flow is broken:
def toNumber(s: String) = {
if (s == null)
return ""
if (s matches """\d+""")
s.toInt
else
0
}
Normally, the type of the second if statement would be used to infer the type of the whole function. But the return on the first if introduces a second return point from the function, so this rule won't work.
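For comparison, here is a sketch of a variant that does compile: once the result type is declared, each return point is simply checked against it. Returning 0 for null (instead of the empty string above) is a change made for this example so that every exit has type Int, and the wrapper object ToNumber is likewise an assumption.

```scala
object ToNumber {
  // With the result type declared, the early `return` is checked against Int.
  def toNumber(s: String): Int = {
    if (s == null)
      return 0
    if (s.matches("""\d+"""))
      s.toInt
    else
      0
  }
}
```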
Type inference infers the return type of a method whenever it can, which is more or less any case where the method isn't recursive.
Your example would work if you changed it to:
def upCase(s: String) = {
if (s.length == 0)
s // note: no return
else
s.toUpperCase()
}
I don't know offhand why the return changes this.
Disclaimer - this answer was directed to the question as it was originally posted
Scala's type inference already does infer the return type of a method / expression:
scala> def foo(s : String) = s + " Hello"
foo: (String)java.lang.String
scala> var t = foo("World")
t: java.lang.String = World Hello
and:
scala> def bar( s : String) = s.toInt
bar: (String)Int
scala> var i = bar("3")
i: Int = 3
and:
scala> var j = if (System.getProperty("user.name") == "oxbow") 4 else "5".toInt
j: Int = 5
EDIT - I didn't realize that the inclusion of the return keyword meant that the return type of an expression had to be explicitly declared: I've pretty much stopped using return myself - but it's an interesting question. For the joiner example, the return type must be declared because of overloading. Again, I don't know the details as to why and would be interested to find out. I suspect a better-phrased question subject would elicit an answer from the likes of James Iry, Dan Spiewak or Daniel Sobral.
I suspect the method overloading (lack of) inference is related to the similar problem with recursive calls, because if the overloaded methods don't call each other, it works perfectly:
def joiner1(ss: List[String], sep: String) = ss.mkString(sep)
def joiner(ss: List[String], sep: String) = ss.mkString(sep)
def joiner(ss: List[String]) = joiner1(ss, " ")
There are two overloaded joiner methods, but the types are inferred correctly and the code compiles.