I have this weird situation that I don't understand. I'm reading the book "Programming in Scala", Ch. 9.
Let's say I have a curried function:
def withThis(n:Int)(op:Int=>Unit){
println("Before")
op(n);
println("After")
}
When I call it with one argument inside a special curly-syntax it works as expected:
withThis(5){
(x) => {println("Hello!"); println(x); }
}
// Outputs
Before
Hello!
5
After
However, if I put in two statements, I get something weird:
withThis(5){
println("Hello!")
println(_)
}
// Outputs
Hello!
Before
5
After
How come the "Hello!" gets printed before "Before" and then "5" is printed inside? Am I crazy?
Your last code example should be rewritten as follows to produce the expected result:
withThis(5) { x =>
println("Hello!")
println(x)
}
Otherwise, your example is equivalent to
withThis(5) {
println("Hello!")
(x: Int) => println(x)
}
as the placeholder _ will be expanded to bind as tightly as possible in a non-degenerate way (i.e., it wouldn't expand to println(x => x)).
The other thing to note is that a block always returns its last value. In your example, the last value is actually (x: Int) => println(x).
In your second example the part in curlies, { println("Hello!"); println(_) }, is a block which prints "Hello!" and returns a function that calls println on its argument. Imagine it simplified as { println("Hello!"); 5 }, a block which prints "Hello!" and returns 5.
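To make the evaluation order visible without relying on console output, here is a minimal sketch that records events in a buffer instead of printing them. The object name BlockArgumentDemo and the events buffer are my own illustrative names, not something from the book.

```scala
import scala.collection.mutable.ListBuffer

object BlockArgumentDemo {
  // Records events in order so the evaluation order can be inspected.
  val events = ListBuffer.empty[String]

  def withThis(n: Int)(op: Int => Unit): Unit = {
    events += "Before"
    op(n)
    events += "After"
  }

  def run(): List[String] = {
    events.clear()
    withThis(5) {
      events += "Hello!"                       // runs once, while the argument block is evaluated
      (x: Int) => { events += x.toString; () } // last expression: this function is what gets passed as op
    }
    events.toList
  }
}
```

Calling BlockArgumentDemo.run() should give List("Hello!", "Before", "5", "After"), which mirrors the output order in the question.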
Background
I tried to answer the question "what is a function?" and now wonder whether I actually know what it is. Please help me understand what a "function" is in Scala. It may sound like a pointless debate, but please bear with me.
Questions
1. What is function
A "function" is a "computation/operation" that is "applied" to a single "argument" to generate a "value". If there are multiple arguments, it can be converted into the form ()()()..., which is called currying.
n is a name (here, w)
b is a binding of a function object to a name
g is the argument to which the computation is applied
a is application
c is computation
val w  =  ( )  =>  { }
    ^  ^   ^   ^    ^
    |  |   |   |    |
   (n)(b) (g) (a)  (c)
Is it OK to say these?
Also if this is to apply a computation to an argument:
() => { }
Then it actually should be in the opposite direction?
() <= { }
or
{ } => ()
2. Decomposition of definition
Is this correct understanding of what "def f (x:Unit):Unit = {}" is?
//--------------------------------------------------------------------------------
// The function literal says:
// 1. Define a "function" (ignore "method" here).
// 2. Bind the function to a name f.
// 3. It is to be applied to an "argument" of type Unit.
// 4. Bind the argument to a name x.
// 5. Evaluation, or application of the function to an argument, generates a "value" of type Unit.
// 6. After evaluation, substitute it with the "value".
//--------------------------------------------------------------------------------
def f (x:Unit):Unit = {}
3. Evaluation / Application
Is "evaluation" the same as "application of a function to an argument to yield a value"? When I read about lambda calculus, the word "application" is used, but I think "evaluation" is also used.
Unit
//--------------------------------------------------------------------------------
// The literal says:
// 1. Apply the function f
// 2. on the "argument" enclosed between '(' and ')', which is Unit.
// 3. and yield Unit as the "value" evaluated.
//--------------------------------------------------------------------------------
def f (x:Unit):Unit = {}
f()
Is it the same with this? If so is "Unit" an object?
f(Unit) // No error in Scala worksheet
What is causing the error "Too many arguments" for Unit as argument below?
// Define a function that applies on an argument x of type Unit to generate Unit
() => {} // res0: () => Unit = <function0>
(() => {}) // res1: () => Unit = <function0>
// Application
(() => {})()
/* Error: Too many arguments */
(() => {})(Unit)
4. Referential transparency
Please advise if this is correct.
Using "def g (x:String): Unit = println(x)" as an example, "referential transparency" means that g(x) can always be substituted with its result without breaking anything.
If
g("foo")
can be always replaced with
Unit
then it is referentially transparent. However, it is not the case here for g. Hence g is not a referentially transparent function, hence it is not "pure" function. Random is also not pure.
{ scala.util.Random.nextInt } // res0: Int = -487407277
In Scala, a function can be pure or side-effecting. There is no way to tell just by looking at a function's signature. Or is there a way to mark it as such, or to validate whether it is pure?
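As a rough illustration of the substitution test described above, here is a small sketch. The names PurityDemo, square and countingSquare are hypothetical, and the counter stands in for any observable side effect such as println. As far as I know, the standard Scala compiler does not check purity for you, so this is only a way to observe the difference.

```scala
object PurityDemo {
  // Pure: the result depends only on the argument.
  def square(x: Int): Int = x * x

  // Impure: every call also bumps a counter, which is an observable side effect.
  var calls = 0
  def countingSquare(x: Int): Int = { calls += 1; x * x }

  // Calls the impure function twice.
  def sumCalledTwice(x: Int): Int = countingSquare(x) + countingSquare(x)

  // "Substitutes" the call with its result, so the function runs only once.
  def sumSubstituted(x: Int): Int = { val r = countingSquare(x); r + r }
}
```

Both sums return the same number, but calls ends up different (2 versus 1), so replacing the call with its value changes the program's behavior. With square no such difference can be observed.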
5. Method is not a function
A method cannot be a first class object to be passed around but it can be by converting it to a function.
def g (x:String): Unit = println(x)
g("foo")
val _g = g _
_g("foo")
Why can't a method be a first-class object? If a method were an object, what would happen or what would break?
The Scala compiler does clever inference and comprehension, so if a method can be converted into an object with _, why doesn't Scala make it a first-class object?
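For reference, here is a minimal sketch of a method being turned into a function value and then treated like any other object. The names MethodAsValueDemo and shout are made up for illustration. It relies on the expected type to trigger the conversion rather than on the trailing underscore.

```scala
object MethodAsValueDemo {
  def shout(s: String): String = s.toUpperCase + "!"

  // The expected type String => String triggers eta-expansion of the method.
  val shoutFn: String => String = shout

  // Written out by hand, the conversion is roughly this lambda.
  val shoutFnExplicit: String => String = (s: String) => shout(s)

  // A function value is an ordinary object, so it can be stored in a collection.
  val pipeline: List[String => String] = List(shoutFn, shoutFnExplicit)
}
```

Both values behave the same when applied, for example shoutFn("foo") and shoutFnExplicit("foo") both give "FOO!".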
6. What is { ... }?
Update:
"=> T" is a call-by-name parameter: the expression is passed unevaluated and is evaluated inside the function. It has nothing to do with "{...}" specifically, since {...} is just a block expression. Hence everything below is invalid.
It looks like "{...}" is the same as "=> T".
def fill[T](n: Int)(elem: => T)
Array.fill[Int](3)({ scala.util.Random.nextInt })
{...} in itself yields a value without taking any argument.
{ scala.util.Random.nextInt } // res0: Int = 951666328
{ 1 } // res1: Int = 1
Does it mean "application" is an independent first class object, or the Scala compiler is clever enough to understand it is an abbreviation of:
() => { scala.util.Random.nextInt }
or
val next = (x:Int) => { scala.util.Random.nextInt(x) }
If so, "=> T" is actually "() => T"?
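To see how a by-name parameter "=> T" behaves compared with an explicit "() => T" and a plain by-value parameter, here is a small sketch. All names in it (ByNameDemo, next, twiceByName and so on) are hypothetical, and a counter replaces Random so that the results are predictable.

```scala
object ByNameDemo {
  var counter = 0
  def next(): Int = { counter += 1; counter }

  // By-name: the argument expression is re-evaluated at every use of elem.
  def twiceByName(elem: => Int): (Int, Int) = (elem, elem)

  // Explicit zero-argument function: the caller wraps the expression, the callee applies it.
  def twiceThunk(elem: () => Int): (Int, Int) = (elem(), elem())

  // By-value: the argument is evaluated once, before the call.
  def twiceByValue(elem: Int): (Int, Int) = (elem, elem)
}
```

Starting from counter = 0, twiceByName(next()) gives (1, 2), twiceThunk(() => next()) gives (3, 4), and twiceByValue(next()) gives (5, 5). So "=> T" behaves much like "() => T" at run time, but the caller writes a plain expression and the types are distinct in the source.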
In Scala, a function is an instance of one of the traits Function0 through Function22, depending on the number of parameters. In your example, w takes no parameters, so it is shorthand for anonfunW:
val w = () => {}
val anonfunW = new Function0[Unit] {
def apply(): Unit = ()
}
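As a quick check of this correspondence, here is a sketch with function literals next to hand-written trait instances. The names are illustrative only. A zero-parameter literal such as () => 42 corresponds to Function0, and a one-parameter literal to Function1.

```scala
object FunctionTraitDemo {
  // Zero parameters: the type () => Int is Function0[Int].
  val w: () => Int = () => 42
  val explicitW: Function0[Int] = new Function0[Int] {
    def apply(): Int = 42
  }

  // One parameter: the type Int => Int is Function1[Int, Int].
  val inc: Int => Int = x => x + 1
  val explicitInc: Function1[Int, Int] = new Function1[Int, Int] {
    def apply(x: Int): Int = x + 1
  }
}
```

Applying a function value with w() is just sugar for w.apply(), so the literal and the explicit instance are interchangeable here.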
In the Scala Language Specification, 6.20 Return Expressions, it says:
"A return expression return e must occur inside the body of some enclosing named method or function. The innermost enclosing named method or function in a source program, f..."
There is an innermost enclosing f here, and I want to find that f for return. But when it comes to anonymous functions, things become a little complicated.
Example1: The innermost enclosing f here is f1. In the decompiled java code, we can see that the exception is caught by f1.
def main(args: Array[String]) {
def f1() { () => return }
}
Example2: The innermost enclosing f here is still f1. In the decompiled java code, we can see that the exception is still caught by f1, even though, from the execution view, f2 is return's innermost enclosing f.
def main(args: Array[String]) {
def f2(x: Any) { println(x) }
def f1() { f2(() => return) }
}
Perhaps we can say that in Example2, x is not truly executed in f2. But here is another weird example.
Example3: The innermost enclosing f here is main, not map or println. (But I know that the class files of map and println cannot be changed here.)
def main(args: Array[String]) {
val list = List(1, 2, 3)
println(list.map(x => return x * 2))
}
In short, it seems that this innermost enclosing f is exactly the method whose def directly contains the return expression in the source.
Am I right?
Yes, you are right. "Enclosing" refers to "enclosing definition".
As you yourself have noticed, it cannot refer to "enclosing method invocations", because then even something as simple as the following canonical use case of an early return:
def findFirstGreaterZero(as: List[Int]): Option[Int] = {
for (a <- as) {
if (a > 0) return Some(a)
}
None
}
wouldn't work, even though there are no explicit lambdas in here. The for desugars to as.foreach { a => ... return ... }, and if the lambda { a => ... return ... } "enclosed" the return keyword, the return would only return from the current invocation of the body of foreach instead of from the method findFirstGreaterZero, which would make return mostly useless.
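Here is the same method with the desugared foreach written out by hand, as a minimal sketch (the object name ReturnDemo is hypothetical). One caveat I should flag as an assumption about your toolchain: recent Scala 3 releases deprecate this kind of nonlocal return, so you may see a warning there.

```scala
object ReturnDemo {
  def findFirstGreaterZero(as: List[Int]): Option[Int] = {
    // The return sits inside a lambda, but it still returns from
    // findFirstGreaterZero, the enclosing named method in the source.
    as.foreach { a =>
      if (a > 0) return Some(a)
    }
    None
  }
}
```

If the return only left the lambda, the method would always fall through to None. Instead, findFirstGreaterZero(List(-1, 0, 3, 4)) gives Some(3).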
Suppose l = List(1, 2, 3):
scala> l foreach { println _ }
1
2
3
scala> l foreach { println }
1
2
3
l foreach { println _ } <=> l foreach { println } because _ can be omitted. But why does the following also produce the same result?
scala> l foreach { println(_) }
1
2
3
Shouldn't the _ be bound to println instead of foreach?
in other words:
l foreach { println(_) } <=> l foreach { println(x => x) }
and therefore throws an error on missing parameter type?
l foreach { println(_.toString) } produces the expected missing parameter type error
foreach takes a function A => Unit, in this case, Int => Unit
println satisfies this condition, but it is a method, not a function. Scala can get around this through a technique called eta-expansion: it creates a function that takes the inputs for the method and calls the method with those inputs. In your case, it looks similar to (x: Int) => println(x).
Each way you've written accomplishes this.
l foreach { println }
Here Scala is able to infer that you want to treat println as a function and pass it to foreach
l foreach { println _ }
By adding the underscore you are explicitly saying that you want to turn the method into a function.
l foreach { println(_) }
This is similar to the previous form. For any method you call, you can use an underscore instead of passing a parameter. By doing this, instead of calling the method, you create a partially applied function. You then pass this function to foreach.
l foreach { println(_.toString) }
This is quite a bit different. _.toString creates a function A => String, but Scala cannot figure out the correct type for A. Another problem is that you are now passing a value to println, so you are calling println and passing its result to foreach instead of turning println into a function. println returns Unit, which is the wrong type to pass to foreach.
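The working forms above can be compared side by side in a short sketch. I use map with a hypothetical value-returning method twice instead of foreach and println, so that the results can be checked.

```scala
object PlaceholderDemo {
  def twice(n: Int): Int = n * 2
  val l = List(1, 20, 300)

  val viaPlaceholder = l.map(twice(_))      // expands to x => twice(x)
  val viaLambda      = l.map(x => twice(x)) // the same function written out
  val viaMethodName  = l.map(twice)         // eta-expansion of the method

  // Here the underscore is bound by the whole argument expression:
  // this expands to x => x.toString.length.
  val digitCounts = l.map(_.toString.length)
}
```

The first three values are all List(2, 40, 600), and digitCounts is List(1, 2, 3).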
Shouldn't the _ be bound to println instead of foreach? in other words:
l foreach { println(_) } <=> l foreach { println(x => x) }
No, this is specifically excluded by rules for placeholders in anonymous functions:
An expression e of syntactic category Expr binds an underscore section u, if the following two conditions hold: (1) e properly contains u, and (2) there is no other expression of syntactic category Expr which is properly contained in e and which itself properly contains u.
"Properly contains" means _ never binds itself, and so it never expands to x => x. In your last example, _.toString does properly contain _ and so satisfies both conditions above.
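Keep in mind that Scala also has an infix form for one-argument method calls: with an explicit receiver, Console.println(x) can be written as Console println x, so parentheses are not always required around a single argument. A bare println x with no receiver does not compile, though.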
def fun(f: Int => Unit) {
f(10)
f(20)
}
println("method 1 call:")
fun(i => {println("hi"); println(i)})
println("method 2 call:")
fun{println("hi"); println(_)}
The output is:
E:\test\scala>scala i.scala
method 1 call:
hi
10
hi
20
method 2 call:
hi
10
20
I think i => {println("hi"); println(i)} and println("hi"); println(_) are the same. Because we have one parameter and the parameter is used just once, we can use _ to simplify the code.
Then, why does method 2 just print "hi" once?
(Does it mean that if I want to use _ to simplify the call, the right-hand side of => can contain only one expression, and if it contains more than one, e.g. println("hi"); println(i), then I cannot use _ as a replacement?)
println(_) expands to x => println(x), so {println("hi"); println(_)} expands to {println("hi"); x => println(x)}. So when fun{println("hi"); println(_)} executes, the following steps take place:
1. The expression {println("hi"); println(_)} is evaluated, which means:
   println("hi") is evaluated, and then
   x => println(x) is evaluated, creating a function object that will print its argument.
   The thus-created function object is the result of the expression.
2. The method fun is called with the created function object as its argument. fun will call the function with 10 and 20, causing it to print those numbers.
First, you have to know that in Scala { block; of; code } is an expression that evaluates to whatever the last expression inside it evaluates to.
When you say:
fun(i => { println("hi"); println(i) })
you create an anonymous function whose body contains two expressions, both returning (). Both are evaluated each time the function is called, so everything works as expected.
But when you say
fun({println("hi"); println(_)})
You pass in a block, not an anonymous function. As sepp2k explained this expands to
{ println("hi"); x => println(x) }
So you pass a block to fun, and this block is evaluated before it is passed. First println("hi") happens; it is printed just once because the block is evaluated once. Then x => println(x) is evaluated, which is a function Int => Unit that prints its argument. This, and only this (as the last expression of the block), is passed to fun. This is why fun just prints the two arguments, 10 and 20, without repeating "hi".
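If you prefer numbers over console output, here is a rough sketch that counts how often each part runs. The counters setupRuns and bodyRuns and the object name EvalCountDemo are my own additions for illustration.

```scala
object EvalCountDemo {
  var setupRuns = 0 // how often the first statement ran
  var bodyRuns = 0  // how often the per-argument part ran

  def fun(f: Int => Unit): Unit = { f(10); f(20) }

  // Block argument: the first statement runs once, then the function is built and passed.
  def blockStyle(): Unit = fun {
    setupRuns += 1
    (x: Int) => bodyRuns += 1
  }

  // Lambda argument: both statements belong to the function body.
  def lambdaStyle(): Unit = fun(x => { setupRuns += 1; bodyRuns += 1 })
}
```

Starting from zero, blockStyle() leaves setupRuns at 1 and bodyRuns at 2, while lambdaStyle() (again starting from zero) leaves both at 2.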
To see further on how block could work you can look at this example that does more in the block
import java.util.UUID // needed for UUID.randomUUID
fun {
println("building the function")
val uuidOfThisFunction = UUID.randomUUID
x => println(s"$uuidOfThisFunction, $x")
}
So this block prepares a function, giving it some extra context through the closure. The UUID stays the same for both calls that happen in fun.
building the function
86e74b5e-83b5-41f3-a71c-eeafcd1db2a0, 10
86e74b5e-83b5-41f3-a71c-eeafcd1db2a0, 20
The example that would look more like what you did with first call would be
fun(
x => {
println("calling the function")
val uuidOfThisCall = UUID.randomUUID
println(s"$uuidOfThisCall, $x")
}
)
Here the function body is evaluated every time f is called.
calling the function
d3c9ff0a-84b4-47a3-8153-c002fa16d6c2, 10
calling the function
a0b1aa5b-c0ea-4047-858b-9db3d43d4983, 20
I have this method:
def myMethod(value: File, x: File => Unit) = {
// Some processing here
// More processing
x(value)
}
I know I can call this as:
myMethod(new File("c:/"),(x:File) => println(x))
Is there a way I could call it using braces? Something like:
myMethod(new File("c:/"),{ (x:File) =>
if(x.toString.endsWith(".txt")) {
println x
}
})
Or do I have to write that in another method and pass that to myMethod?
The body part of the function can be a block enclosed in braces:
myMethod(new File("c:/"), x => {
if (x.toString.endsWith(".txt")) {
println(x)
}
})
An alternative is to define myMethod as a curried function:
def myMethod(value: File)(x: File => Unit) = x(value)
Now you can write code like the following:
myMethod(new File("c:/")) { x =>
if (x.toString.endsWith(".txt")) {
println(x)
}
}
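For completeness, here is a self-contained sketch of the curried version in use. Instead of printing, it collects the matching file names so the result can be checked; the helper collectTxt and the object name are hypothetical, and no files are actually touched on disk.

```scala
import java.io.File
import scala.collection.mutable.ListBuffer

object CurriedCallDemo {
  // Curried definition: the function goes in its own parameter list.
  def myMethod(value: File)(x: File => Unit): Unit = x(value)

  // Collects the names of the given paths that end in ".txt".
  def collectTxt(paths: List[String]): List[String] = {
    val found = ListBuffer.empty[String]
    paths.foreach { p =>
      // The second argument list can be written with braces.
      myMethod(new File(p)) { f =>
        if (f.getName.endsWith(".txt")) found += f.getName
      }
    }
    found.toList
  }
}
```

For example, collectTxt(List("notes.txt", "image.png")) gives List("notes.txt").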
The example you gave actually works if you add the missing parentheses around x in println x. Just put in the parentheses, and your code will work.
So, now, you might be wondering about when you need parenthesis, and when you don't. Fortunately for you, someone else has asked that very question.