Why is a return statement required to allow this while statement to be evaluated properly? - scala

The following code does not compile:
import java.io.File
import java.io.FileInputStream
import java.io.InputStream
import java.io.BufferedReader
import java.io.InputStreamReader
trait Closeable {
def close ()
}
trait ManagedCloseable extends Closeable {
def use (code: () => Unit) {
try {
code()
}
finally {
this.close()
}
}
}
class CloseableInputStream (stream: InputStream)
extends InputStream with ManagedCloseable {
def read = stream.read
}
object autoclose extends App {
implicit def inputStreamToClosable (stream: InputStream):
CloseableInputStream = new CloseableInputStream(stream)
override
def main (args: Array[String]) {
val test = new FileInputStream(new File("test.txt"))
test use {
val reader = new BufferedReader(new InputStreamReader(test))
var input: String = reader.readLine
while (input != null) {
println(input)
input = reader.readLine
}
}
}
}
This produces the following error from scalac:
autoclose.scala:40: error: type mismatch;
found : Unit
required: () => Unit
while (input != null) {
^
one error found
It appears that it's attempting to treat the block following the use as an
inline statement rather than a lambda, but I'm not exactly sure why. Adding
return after the while alleviates the error:
test use {
val reader = new BufferedReader(new InputStreamReader(test))
var input: String = reader.readLine
while (input != null) {
println(input)
input = reader.readLine
}
return
}
And the application runs as expected. Can anyone explain what exactly is going
on here? This looks as though it should be a bug, but it has persisted
across many versions of Scala (tested 2.8.0, 2.9.0, 2.9.1).

That's because use is declared to take a () => Unit, so the compiler expects the block you are giving to use to evaluate to a value of that type (a function object), and a while loop evaluates to Unit.
It seems that what you want is to turn the entire block into a by-name parameter. To do so, change def use (code : () => Unit) to def use (code : => Unit).

() => Unit is the type of a Function0 object, and you've required the use expression to be of that type, which it obviously isn't. => Unit is a by name parameter, which you should use instead.
You might find my answer to this question useful.

To go to the heart of the matter, blocks are not lambdas. A block in Scala is a scope delimiter, nothing more.
If you had written
test use { () =>
val reader = new BufferedReader(new InputStreamReader(test))
var input: String = reader.readLine
while (input != null) {
println(input)
input = reader.readLine
}
}
Then you'd have a function (indicated by () =>) which is delimited by the block.
If use had been declared as
def use (code: => Unit) {
Then the syntax you used would work, but not because of any lambda thingy. That syntax indicates the parameter is passed by name, which, roughly speaking, means you'd take the whole expression passed as parameter (i.e., the whole block), and substitute it for code inside the body of use. The type of code would be Unit, not a function, but the parameter would not be passed by value.
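A minimal sketch of the by-name version of use, to make the difference concrete. The Resource class and the events log are hypothetical helpers added only so the order of evaluation can be observed; they are not part of the original question.

```scala
object ByNameUse {
  // Stand-in for the question's ManagedCloseable trait.
  trait ManagedCloseable {
    def close(): Unit

    // `code` is a by-name parameter: the block passed by the caller is
    // evaluated where `code` is referenced here, not at the call site.
    def use(code: => Unit): Unit =
      try code
      finally close()
  }

  // Hypothetical resource that records what happened, in order.
  class Resource extends ManagedCloseable {
    val events = scala.collection.mutable.ListBuffer.empty[String]
    def close(): Unit = events += "closed"
  }

  def demo(): List[String] = {
    val r = new Resource
    // A plain block is accepted because `use` takes `=> Unit`.
    r.use {
      r.events += "body ran"
    }
    r.events.toList
  }
}
```

With this signature the caller writes an ordinary block, and close() still runs after the block, even if the block throws.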

return or return expr has the type Nothing. You can substitute this for any type, as it never yields a value to the surrounding expression, instead it returns control to the caller.
In your program, it masquerades as the required type () => Unit.
Here's an occasionally convenient use for that (although you might be tarnished as unidiomatic if you use it too often, don't tell anyone you heard this from me!)
def foo(a: Option[Int]): Int = {
val aa: Int = a.getOrElse(return 0)
aa * 2
}
For the record, you should probably write:
def foo(a: Option[Int]): Int =
a.map(_ * 2).getOrElse(0)
You can get an insight into the mind of the compiler by checking the output of scala -Xprint:typer -e <one-liner>. Add -Ytyper-debug if you like sifting through the reams of output!
scala210 -Ytyper-debug -Xprint:typer -e 'def foo: Any = {val x: () => Any = { return }}'
... elided ...
typed return (): Nothing
adapted return (): Nothing to () => Any,
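The point that an expression of type Nothing is accepted wherever any other type is required can be sketched without return at all. The names fail and positive below are made up for illustration.

```scala
object NothingDemo {
  // Declared to return Nothing: it never yields a value, it always throws.
  def fail(msg: String): Nothing = throw new IllegalArgumentException(msg)

  // Both branches must conform to Int. Nothing conforms to every type,
  // so the else branch type-checks even though it never produces an Int.
  def positive(n: Int): Int =
    if (n > 0) n else fail("not positive: " + n)
}
```

The same conformance rule is what lets return stand in for a () => Unit in the question's code.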

Related

Unable to understand Scala's type inference

I was going through Learning Concurrency With Scala.
It had the following piece of code.
package week_parallel.week1.SC_Book
import scala.collection.mutable
object SyncPoolArgs extends App {
private val tasks = mutable.Queue[() => Unit]()
object Worker extends Thread {
setDaemon(true)
def poll() = tasks.synchronized {
while (tasks.isEmpty) tasks.wait()
tasks.dequeue()
}
override def run() = while (true) {
val task = poll()
task()
}
}
Worker.start()
def asynchronous(body: =>Unit) = tasks.synchronized {
tasks.enqueue(() => body)
tasks.notify()
}
def sum(x: Int, y:Int) = {println("USING SUM")
x+y}
asynchronous { log("Hello ") }
asynchronous { log("World!") }
asynchronous { sum(4,5) }
Thread.sleep(500)
}
So, my question is: if tasks holds functions that take no arguments and return nothing, why does tasks.enqueue(() => body) put sum in the queue? Shouldn't the compiler reject it because the signature of body is wrong in the case of sum?
Also, I am particularly unable to grasp how tasks.enqueue(() => body) conforms to the type of private val tasks = mutable.Queue[() => Unit]().
I think you may be confused by the declaration
body: => Unit
This means that body is a pass-by-name parameter of type Unit. This does not mean that body is a function that returns Unit, which would be body: () => Unit.
"pass by name" means that the expression that is passed to body will not be evaluated until the value is required. When it is evaluated, it will return Unit.
Since body is of type Unit, the expression () => body has type () => Unit which is what is required.
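In this case the actual expression passed as body is sum(4,5), which has type Int. Because the expected type is Unit, the compiler applies value discarding: it evaluates the expression and drops the result, so there is no error.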
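A single-threaded sketch of the same mechanics, assuming the worker thread and synchronization are left out for brevity. The log buffer and the demo method are hypothetical additions used to observe when sum actually runs.

```scala
import scala.collection.mutable

object ByNameQueue {
  private val tasks = mutable.Queue[() => Unit]()
  val log = mutable.ListBuffer.empty[String]

  // body has type Unit (passed by name), so () => body has type () => Unit.
  def asynchronous(body: => Unit): Unit = tasks.enqueue(() => body)

  def sum(x: Int, y: Int): Int = { log += "USING SUM"; x + y }

  // Returns the log size right after enqueueing, and the log after running.
  def demo(): (Int, List[String]) = {
    // sum(4, 5) is an Int; with Unit expected, its value is discarded.
    asynchronous { sum(4, 5) }
    val before = log.size // still 0: the by-name argument has not run yet
    while (tasks.nonEmpty) tasks.dequeue()()
    (before, log.toList)
  }
}
```

The call to sum is deferred until the queued function is applied, which is why nothing is logged at enqueue time.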

How to pass a Function name to a Case-Match statement in Scala

I am a newbie to Scala programming and am trying to create a case-match (switch-case) statement which in turn invokes different functions based on the input value.
For example, please see the sample code snippet. I hope it explains what I intend to do.
def getValue(x: Any):String = x match {
case "Value1"=> Function1(int)
case "Value2"=> Function2(int)
case _ => println("This is an invalid value")
}
def Function1(int) {
Do Something
}
def Function2(int) {
Do Something
}
When I give like this, I am getting an error as shown below :
Error:(26, 34) type mismatch;
found : Unit
required: String
case "Value1"=> Function1(int)
Edited :
Modified the return type of getValue to be a "Unit" instead of "String". Now this error is resolved but I am getting the following error message
Error:(26, 22) forward reference extends over definition of value spark
case "Value1"=> Function1(int)
Modified code snippet
def getValue(x: Any):Unit = x match {
case "Value1"=> Function1(int)
case "Value2"=> Function2(int)
case _ => println("This is an invalid value")
}
def Function1(int) {
Do Something
}
def Function2(int) {
Do Something
}
I resolved this error by keeping the declaration of spark variable at the end of the code.
The problem is that your Function1 and Function2 (horrible names!) return Unit, and getValue is declared to return a String
A declaration of a function looks like def functionName(args): ReturnType = { ... }
Some parts of this can be omitted, and then defaults are assumed.
In your case, you omitted the ReturnType declaration, and (more importantly) the = sign. When there is no = before the function body, the function will always return Unit. If you want it to return a String, you need to add an = before the body, and make sure that the last statement in the body is indeed a String.
Additionally, the default case clause does not return anything. This does not work, because, again, getValue is declared to return a String. You need to either throw an exception in that case, or think of a default value to return (empty string?) or else use Options, like the other answer suggests.
Function1 and Function2 should return strings. Also, you cannot use println, since the result of it is Unit, in this case, you can throw an exception:
def getValue(x: Any): String = x match {
  case "Value1" => function1(1)
  case "Value2" => function2(2)
  // throw has type Nothing, so this branch is compatible with String
  case _ => throw new IllegalArgumentException("This is an invalid value")
}
def function1(v: Int): String = {
  // some stuff that returns a String
  ???
}
def function2(v: Int): String = {
  // some stuff that returns a String
  ???
}
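The Option route mentioned above could look roughly like this. The bodies of function1 and function2 are placeholders invented for the sketch, since the question does not say what they should compute.

```scala
object Dispatch {
  // Hypothetical handlers; the returned strings are placeholders.
  def function1(v: Int): String = "one:" + v
  def function2(v: Int): String = "two:" + v

  // Unknown inputs produce None instead of throwing or printing.
  def getValue(x: Any): Option[String] = x match {
    case "Value1" => Some(function1(1))
    case "Value2" => Some(function2(2))
    case _        => None
  }
}
```

The caller then decides how to handle an invalid value, for example with Dispatch.getValue(input).getOrElse("This is an invalid value").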

Explanation on the error with for comprehension and co-variance

Question
Would like to get assistance to understand the cause of the error. The original is from Coursera Scala Design Functional Random Generators.
Task
With the factories for random int and random boolean, trying to implement a random tree factory.
trait Factory[+T] {
self => // alias of 'this'
def generate: T
def map[S](f: T => S): Factory[S] = new Factory[S] {
def generate = f(self.generate)
}
def flatMap[S](f: T => Factory[S]): Factory[S] = new Factory[S] {
def generate = f(self.generate).generate
}
}
val intFactory = new Factory[Int] {
val rand = new java.util.Random
def generate = rand.nextInt()
}
val boolFactory = intFactory.map(i => i > 0)
Problem
The implementation in the 1st block causes the error, but if it is changed into the 2nd block, it does not. I believe Factory[+T] means that Factory[Inner] and Factory[Leaf] can both be treated as Factory[Tree].
I have no idea why the same if expression is OK in the for block but not OK in the yield block. I would appreciate an explanation.
trait Tree
case class Inner(left: Tree, right: Tree) extends Tree
case class Leaf(x: Int) extends Tree
def leafFactory: Factory[Leaf] = intFactory.map(i => new Leaf(i))
def innerFactory: Factory[Inner] = new Factory[Inner] {
def generate = new Inner(treeFactory.generate, treeFactory.generate)
}
def treeFactory: Factory[Tree] = for {
isLeaf <- boolFactory
} yield if (isLeaf) leafFactory else innerFactory
^^^^^^^^^^^ ^^^^^^^^^^^^
type mismatch; found : Factory[Inner] required: Tree
type mismatch; found : Factory[Leaf] required: Tree
However, below works.
def treeFactory: Factory[Tree] = for {
isLeaf <- boolFactory
tree <- if (isLeaf) leafFactory else innerFactory
} yield tree
I have no idea why the same if expression in for block is OK but it is
not OK in yield block
Because they are translated differently by the compiler. The version with the if expression in a generator (the for block) is translated into:
boolFactory.flatMap((isLeaf: Boolean) => (if (isLeaf) leafFactory else innerFactory).map(tree => tree))
Which yields the expected Factory[Tree], while the version with the if expression in the yield block is translated into:
boolFactory.map((isLeaf: Boolean) => if (isLeaf) leafFactory else innerFactory)
Which yields a Factory[Factory[Tree]], not a Factory[Tree], thus not conforming to your method signature. This isn't about covariance, but rather about how the for comprehension translates these two forms differently.
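The two translations can be checked side by side in a small self-contained sketch. It assumes a hypothetical const factory in place of the random ones so the results are deterministic, and it trims Tree down to a single Leaf case.

```scala
object ForDesugar {
  trait Factory[+T] { self =>
    def generate: T
    def map[S](f: T => S): Factory[S] = new Factory[S] {
      def generate: S = f(self.generate)
    }
    def flatMap[S](f: T => Factory[S]): Factory[S] = new Factory[S] {
      def generate: S = f(self.generate).generate
    }
  }

  // Hypothetical helper: a factory that always produces the same value.
  def const[T](value: T): Factory[T] = new Factory[T] { def generate: T = value }

  sealed trait Tree
  case class Leaf(x: Int) extends Tree

  val leafFactory: Factory[Leaf] = const(Leaf(1))

  // if expression in a generator: flatMap, then map.
  val viaFor: Factory[Tree] = for {
    isLeaf <- const(true)
    tree   <- if (isLeaf) leafFactory else const(Leaf(0))
  } yield tree

  // The same thing written out by hand.
  val viaFlatMap: Factory[Tree] =
    const(true).flatMap(isLeaf =>
      (if (isLeaf) leafFactory else const(Leaf(0))).map(tree => tree))

  // if expression in yield: a single map, so the factories end up nested.
  val nested: Factory[Factory[Tree]] =
    for (isLeaf <- const(true)) yield (if (isLeaf) leafFactory else const(Leaf(0)))
}
```

Note that nested only compiles because its declared type is Factory[Factory[Tree]]; declaring it as Factory[Tree] reproduces the error from the question.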

Scala: Return multiple data types from function

This is somewhat of a theoretical question but something I might want to do. Is it possible to return multiple data types from a Scala function but limit the types that are allowed? I know I can return one type by specifying it, or I can essentially allow any data type by not specifying the return type, but I would like to return 1 of 3 particular data types to preserve a little bit of type safety. Is there a way to write an 'or' in the return type like:
def myFunc(input:String): [Int || String] = { ...}
The main context for this is trying to write universal data loading script. Some of my users use Spark, some Scalding, and who knows what will be next. I want my users to be able to use a generic loading script that might return a RichPipe, RDD, or some other data format depending on the framework they are using, but I don't want to throw type safety completely out the window.
You can use the Either type provided by the Scala Library.
def myFunc(input:String): Either[Int, String] = {
if (...)
Left(42) // return an Int
else
Right("Hello, world") // return a String
}
You can use more than two types by nesting, for instance Either[A,Either[B,C]].
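On the calling side, the Either result is usually taken apart with pattern matching. A small sketch, assuming a made-up rule (all-digit input becomes an Int) in place of the elided condition above:

```scala
object EitherDemo {
  // Hypothetical rule: all-digit input becomes an Int, anything else stays a String.
  def myFunc(input: String): Either[Int, String] =
    if (input.nonEmpty && input.forall(_.isDigit)) Left(input.toInt)
    else Right(input)

  // The compiler knows only these two cases exist, so the match is exhaustive.
  def describe(result: Either[Int, String]): String = result match {
    case Left(n)  => "Int: " + n
    case Right(s) => "String: " + s
  }
}
```

The caller cannot use the value without choosing a branch, which is the type safety the question asks for.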
As already noted in comments you'd better use Either for this task, but if you really want it, you can use implicits
object IntOrString {
implicit def fromInt(i: Int): IntOrString = new IntOrString(None, Some(i))
implicit def fromString(s: String): IntOrString = new IntOrString(Some(s), None)
}
case class IntOrString(str: Option[String], int: Option[Int])
implicit def IntOrStringToInt(v: IntOrString): Int = v.int.get
implicit def IntOrStringToStr(v: IntOrString): String = v.str.get
def myFunc(input:String): IntOrString = {
if(input.isEmpty) {
1
} else {
"test"
}
}
val i: Int = myFunc("")
val s: String = myFunc("123")
//exception
val ex: Int = myFunc("123")
I'd make the typing by the user less implicit and more explicit. Here are three examples:
def loadInt(input: String): Int = { ... }
def loadString(input: String): String = { ... }
That's nice and simple. Alternatively, we can have a function that returns the appropriate curried function using an implicit context:
def loader[T]()(implicit context: String): String => T = {
context match {
case "RDD" => loadInt _ // or loadString _
}
}
Then the user would:
implicit val context: String = "RDD" // simple example
val loader: String => Int = loader()
loader(input)
Alternatively, can turn it into an explicit parameter:
val loader: String => Int = loader("RDD")

Use of break in Scala With Return Value

I have the below requirement where I am checking whether a value is greater than 10 or not and based on that I will break, otherwise I will return a String. Below is my code:
import scala.util.control.Breaks._
class BreakInScala {
val breakException = new RuntimeException("Break happened")
def break = throw breakException
def callMyFunc(x: Int): String = breakable(myFunc(x))
def myFunc(x: Int): String = {
if (x > 10) {
"I am fine"
} else {
break
}
}
}
Now what is happening is that I am getting the error message "type mismatch; found : Unit required: String". The reason is:
def breakable(op: => Unit)
But then how I will write a function which can return value as well as break if required?
The Scala compiler can evaluate that a branch throws an exception and not use it to form a minimum bound for the return type, but not if you move the throwing code out in a method: since it can be overridden, the compiler cannot be sure it will actually never return.
Your usage of the Break constructs seems confused: it already provides a break method, there is no need to provide your own, unless you want to throw your exception instead, which would make using Break unnecessary.
You are left with a couple of options then, since I believe usage of Break is unnecessary in your case.
1) Simply throw an exception on failure
def myFunc(x: Int): String = {
if (x > 10) {
"I am fine"
} else {
throw new RuntimeException("Break happened")
}
}
def usemyFunc(): Unit = {
try {
println("myFunc(11) is " + myFunc(11))
println("myFunc(5) is " + myFunc(5))
} catch {
case e: Throwable => println("myFunc failed with " + e)
}
}
2) Use the Try class (available from Scala 2.10) to return either a value or an exception. This differs from the previous suggestion because it forces the caller to inspect the result and check whether a value is available or not, but makes using the result a bit more cumbersome
import scala.util.{Try, Success, Failure}
def myFunc(x: Int): Try[String] = {
  Try {
    if (x > 10) {
      "I am fine"
    } else {
      throw new RuntimeException("Break happened")
    }
  }
}
def useMyFunc(): Unit = {
  // Success and Failure are the two cases of Try; the argument 5 is just an example
  myFunc(5) match {
    case Success(s) => println("myFunc succeeded with " + s)
    case Failure(e) => println("myFunc failed with " + e)
  }
}
3) If the thrown exception isn't relevant, you can use the Option class instead.
You can see how the multiple ways of working with Options relate to each other in
this great cheat sheet.
def myFunc(x: Int): Option[String] = {
if (x > 10) {
Some("I am fine") /* Some(value) creates an Option containing value */
} else {
None /* None represents an Option that has no value */
}
}
/* There are multiple ways to work with Option instances.
One of them is using pattern matching. */
def useMyFunc(): Unit = {
myFunc(10) match {
case Some(s) => println("myFunc succeeded with " + s)
case None => println("myFunc failed")
}
}
/* Another one is using the map, flatMap, getOrElse, etc methods.
They usually take a function as a parameter, which is only executed
if some condition is met.
map only runs the received function if the Option contains a value,
and passes said value as a parameter to it. It then takes the result
of the function application, and creates a new Option containing it.
getOrElse checks if the Option contains a value. If it does, it is returned
directly. If it does not, then the result of the expression passed to it
(a by-name parameter) is returned instead.
Chaining map and getOrElse is a common idiom meaning roughly 'transform the value
contained in this Option using this code, but if there is no value, return whatever
this other piece of code returns instead'.
*/
def useMyFunc2(): Unit = {
val toPrint = myFunc(10).map{ s =>
"myFunc(10) succeeded with " + s
}.getOrElse{
"myFunc(10) failed"
}
/* toPrint now contains a message to be printed, depending on whether myFunc
returned a value or not. The Scala compiler is smart enough to infer that
both code paths return String, and make toPrint a String as well. */
println(toPrint)
}
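If Breaks is still wanted, note that breakable takes a => Unit block and therefore cannot hand a value back by itself. One common workaround is to store the result in a variable before breaking. The sketch below uses a hypothetical firstOver function rather than the question's myFunc.

```scala
import scala.util.control.Breaks.{break, breakable}

object BreakWithResult {
  // Returns the first element greater than limit, if there is one.
  def firstOver(limit: Int, xs: Seq[Int]): Option[Int] = {
    var found: Option[Int] = None
    breakable {
      for (x <- xs) {
        if (x > limit) {
          found = Some(x) // record the result before leaving the loop
          break()         // jumps to the end of the breakable block
        }
      }
    }
    found
  }
}
```

For a case this simple, xs.find(_ > limit) does the same job without Breaks, which supports the point that Breaks is rarely necessary.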
Throwing an exception is a slightly odd way of doing things. An alternative might be to define a "partial function" (a function which is defined only on a specific subset of its domain), a bit like this:
scala> val partial = new PartialFunction[Int, String] {
| def apply(i : Int) = "some string"
| def isDefinedAt(i : Int) = i < 10
| }
partial: PartialFunction[Int, String] = <function1>
Once you've defined the function, you can then "lift" it into a function from Int to Option[String], by doing the following:
scala> val partialLifted = partial.lift
partialLifted: Int => Option[String] = <function1>
Then, if you call the function with a value outside your range, you'll get a "None" as a return value, otherwise you'll get your string return value. This makes it much easier to apply flatMaps/ getOrElse logic to the function without having to throw exceptions all over the place...
scala> partialLifted(45)
res7: Option[String] = None
scala> partialLifted(5)
res8: Option[String] = Some(some string)
IMO, this is a slightly more functional way of doing things...hope it helps
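A partial function can also be written as a case literal, which keeps the domain check and the result in one place. This sketch assumes the question's condition (x > 10) rather than the i < 10 used in the session above.

```scala
object PartialDemo {
  // Defined only for inputs greater than 10.
  val fine: PartialFunction[Int, String] = {
    case x if x > 10 => "I am fine"
  }

  // lift turns the partial function into a total one returning Option.
  val lifted: Int => Option[String] = fine.lift
}
```

Here PartialDemo.lifted(11) gives Some("I am fine") and PartialDemo.lifted(5) gives None, so the caller can chain map or getOrElse instead of catching an exception.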