Scala _ Placeholders (How does this code function?) - scala

I am learning Scala (coming from a background of mostly Java). I am trying to wrap my head around the following code:
object Main {
def main(args : Array[String]) {
for (file <- filesEnding(".txt"))
println(file.getName)
}
private val filesHere = (new java.io.File(".")).listFiles
def filesMatching(matcher: String => Boolean) =
for (file <- filesHere; if matcher(file.getName))
yield file
def filesEnding(query: String) = filesMatching(_.endsWith(query))
/* Other matcher functions */
}
In particular I am confused where Scala gets the value for _ in each of the matcher functions. I can see that filesEnding is called with an argument of .txt. That argument is assigned to query. filesEnding then calls filesMatching with an argument consistent with a String => Boolean function. Finally I can see that file.getName is what eventually replaces the _ placeholder.
What I don't get is how Scala knows to put file.getName in place of _. I am having trouble tracing this code in my head, and the Eclipse debugger isn't much help in this situation. Can somebody walk me through what is happening in this code?

The _ is just a shorthand for making an anonymous function:
_.endsWith(query)
is the same as the anonymous function
fileName => fileName.endsWith(query)
This function is then fed as the argument matcher to filesMatching. Inside that function you can see the call
matcher(file.getName)
This calls the anonymous function with file.getName as the _ argument (which I called fileName in the explicit example).

If you write _.someMethod(someArguments), this desugars to x => x.someMethod(someArguments), so filesMatching(_.endsWith(query)) desugars to filesMatching(x => x.endsWith(query)).
So filesMatching is called with matcher being the function x => x.endsWith(query), i.e. a function which takes one argument x and calls x.endsWith(query) on that argument.
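To make the desugaring concrete, here is a minimal sketch. It assumes a hard-coded list of names in place of the directory listing so that it runs anywhere; names, namesMatching and namesEnding are hypothetical stand-ins for filesHere, filesMatching and filesEnding.

```scala
object PlaceholderDemo {
  // Hypothetical stand-in for the directory listing, so no file system is needed.
  val names: List[String] = List("notes.txt", "Main.scala", "todo.txt")

  // Calls matcher once per name; this call is what supplies the value for "_".
  def namesMatching(matcher: String => Boolean): List[String] =
    for (name <- names if matcher(name)) yield name

  // Placeholder form, as in the question.
  def namesEnding(query: String): List[String] =
    namesMatching(_.endsWith(query))

  // The explicit anonymous function the placeholder form stands for.
  def namesEndingExplicit(query: String): List[String] =
    namesMatching(name => name.endsWith(query))
}
```

Both namesEnding(".txt") and namesEndingExplicit(".txt") select the same names, because the two definitions build the same function.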

Related

Strange definition and call of a function with curly bracket in Scala (function as parameter of a function)

I'm totally new to Scala and I have to maintain some old code, so I have to understand what it does.
Now I am stuck on this piece of code, which defines and calls a method.
This is the definition of the method:
private def myMethod[I, O](price: Long, id: Int)(i: I)(f: (I, String) => O): O = {
..some code..
}
this is the method call
myMethod(price, id)(b) {
..some code.. //single line of code, just calling an other function
}
I understand the type parameters and the multiple parameter lists (currying).
What I don't understand is:
first of all this part: (f: (I, String) => O) , this is completely strange for me
second, why in the method call, it contains code after the { symbol, is it overriding the original method? Even if that is the case, it makes no sense to override it when making the call
also, myMethod is supposed to return a value of type O , but in my code it's never affected to any variable. (EDIT: this point is clear now, I just misunderstood the code, never mind about it)
Can anyone please clarify these points (especially the first and second ones, which confuse me the most)?
EDIT
private var x : classX
myMethod(price, id)(b) {
x.listX //calling method without parameters
}
def listX (param1: ListFaFBI, param2: String): ListX ={
//returning an Object of type ListX, not a function
}
As you can see, myMethod is calling listX. If I understood correctly, myMethod is returning the method listX itself, which has two parameters, ListFaFBI (or I) and String, and returns ListX (or O), as defined in (f: (I, String) => O)
f is a function that takes in an I and a String and returns an O. f: (I, String) => O is syntactic sugar for f: Function2[I, String, O].
The braces act essentially the same as parentheses would, although there are some differences, as they can be treated as blocks (See this question). The code inside the braces is actually a function literal, and it will be passed as f. Also see this question. Here,
myMethod(price, id)(b) { (i, s) =>
..some code..
}
would be syntactic sugar for
myMethod(price, id)(b)({ (i, s) =>
..some code..
})
I'm not sure what you mean by "in my code it's never affected to any variable", but I assume that what is returned is either irrelevant or that there is an implied return (in case the call to myMethod is at the end of a block).
first of all this part: (f: (I, String) => O) , this is completely strange for me
It is a function that takes two parameters, one of type I and one of type String, and returns an O. (It is a Function2[I, String, O], not a function of a single tuple argument.)
why in the method call, it contains code after the { symbol, is it overriding the original method?
Your method uses multiple parameter lists, and the last parameter list can be written as a block in braces, which lets you define the function value ((I, String) => O) inline.
For example, if we had a method which takes a function in the same parameter list:
def foo(s: String, f: String => String)
Our call would look like this:
foo("hello", {
s => s + "world"
}
)
However, if we used a separate parameter group:
def foo(s: String)(f: String => String)
Our call would look like this:
foo("hello") {
s => s + "world"
}
Which is more eye pleasing and reads nicer.
myMethod is supposed to return a value of type O , but in my code it's never affected to any variable
If you add the implementation of the method, we can better show you where it returns a value of type O.
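The points above can be tried out with a small sketch. The body of myMethod here is invented purely for illustration (the original elides it), and describe is a hypothetical method playing the role of listX: a two-parameter method whose shape matches (I, String) => O.

```scala
object CurriedDemo {
  // Hypothetical body: the real implementation is elided in the question.
  // It simply calls f with i and a String built from the first parameter list.
  def myMethod[I, O](price: Long, id: Int)(i: I)(f: (I, String) => O): O =
    f(i, s"price=$price,id=$id")

  // Plays the role of listX: an ordinary method taking (List[Int], String).
  def describe(items: List[Int], label: String): String =
    s"$label -> ${items.sum}"

  // Braces around an explicit function literal.
  val a: String =
    myMethod(10L, 1)(List(1, 2, 3)) { (items, label) => describe(items, label) }

  // Braces around just the method name: the compiler converts the method
  // into a function value because a function type is expected here.
  val b: String =
    myMethod(10L, 1)(List(1, 2, 3)) { describe }
}
```

In both calls nothing is overridden: the braces only delimit the argument for the last parameter list, and myMethod decides when to call it.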

Scala use of underscore as an object placeholder

Trying to wrap my head around the varying uses of the _. Right now I'm struggling with this example:
object Chapter9 extends App {
FileMatcher.filesEnding(".scala").foreach(println)
}
object FileMatcher {
private def filesHere = (new java.io.File(".")).listFiles
private def filesMatching(matcher: String => Boolean) = {
for (file <- filesHere; if matcher(file.getName))
yield file
}
def filesEnding(query: String) =
filesMatching(_.endsWith(query))
def filesContaining(query: String) =
filesMatching(_.contains(query))
def filesRegex(query: String) =
filesMatching(_.matches(query))
}
So clearly we want to abstract away the common work of looping/filtering/yielding for the varying types of matchers, makes sense to put it in a helper function.
I'm getting hung up on the _.endsWith part. My understanding is that this underscore (being the first and only one used in the method body) will be filled in by the first parameter, which in this case is query. I tried to test this theory by doing:
def filesEnding(query: String) = {
println(_: String)
}
But the program doesn't print anything. So what is _ here? How does Scala know what object to search for an endsWith method on?
From the output of the program it looks like file somehow gets filled in for this underscore, but I have no idea how. Maybe the underscore remains a "wildcard" until it is used inside filesMatching's body, and by that point the nearest enclosing scope is the for and the first "parameter" is file?
Look at the signature for filesMatching(). Notice that it takes one argument of type String => Boolean. So its argument is a function that itself takes a String argument and turns it into a Boolean.
Now remember that an anonymous function often looks something like this:
{ x => /* do something with x */ }
And in cases where x is used only once, then that can be abbreviated to a single _. So, working backwards, this
filesMatching(_.endsWith(query))
can be rewritten as this
filesMatching(x => x.endsWith(query))
So the filesMatching() code has its argument, a function that takes a string (which in the anonymous function I've called x). That function, matcher, is invoked with the string file.getName to get a Boolean. That boolean value is tested in an if clause:
if matcher(file.getName)
TL;DR: The underscore stands for the parameter of an anonymous function, and that parameter receives the file.getName string when matcher is invoked.
The canonical answer is What are all the uses of an underscore in Scala?
But -Xprint:parser shows
((x$1: String) => println((x$1: String)))
which is uninteresting except for the redundantly typed expression in the body of the function.
It doesn't seem to generate any extra code. The param is already a String.
I don't think your example compiles? Or I don't know what you're asking.
Explicit types can help with debugging when the type of an anonymous function isn't inferred as you wish.
Edit: I gave this a try:
object Chapter9 extends App {
FileMatcher.filesEnding(".scala").foreach(println)
}
object FileMatcher {
private def filesHere = (new java.io.File(".")).listFiles
private def filesMatching(matcher: String => Boolean) = {
for (file <- filesHere; if matcher(file.getName))
yield file
}
def filesEnding(query: String) = {
println(_: String)
}
}
An expression with an underscore as an anonymous function needs its expected type to tell it what type the underscore is, unless explicitly annotated as you did. But that is not common usage.
Instead of (_: Int) * 2, you could write (i: Int) => i * 2, but that's a style question.
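A short sketch of why the experiment in the question prints nothing: an expression such as println(_: String) only builds a function, and its body does not run until that function is applied. To keep the effect observable, this sketch assumes a hypothetical record method that appends to a buffer instead of printing.

```scala
object UnderscoreFunctionDemo {
  // Collects what would otherwise be printed, so the effect can be inspected.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  // Hypothetical replacement for println.
  def record(s: String): Unit = { log += s; () }

  // Same shape as the question's experiment: the body is a function value,
  // equivalent to (x: String) => record(x). Calling filesEnding records nothing.
  def filesEnding(query: String): String => Unit =
    record(_: String)
}
```

Calling UnderscoreFunctionDemo.filesEnding(".scala") leaves log empty; only applying the returned function to a String adds an entry. Note that query is never used, which shows that the underscore is not bound to the enclosing method's parameter.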

Is there an implicit in this call to flatMap?

In this code :
import java.io.File
def recursiveListFiles(f: File): Array[File] = {
val these = f.listFiles
these ++ these.filter(_.isDirectory).flatMap(recursiveListFiles)
}
taken from : How do I list all files in a subdirectory in scala?
Why does flatMap(recursiveListFiles) compile, given that recursiveListFiles accepts a File parameter? Is the File parameter implicitly passed to recursiveListFiles?
No because the expanded flatMap looks like:
flatMap(file => recursiveListFiles(file))
So each file in these is getting mapped to an Array[File], which gets flattened in flatMap. No implicit magic here (in the way that you're asking).
Sort of. flatMap quite explicitly passes an argument to its own argument. This is the nature of a higher-order function -- you basically give it a callback, it calls the callback, and then it does something with the result. The only implicit thing happening is conversion of a method to a function type.
Any method can be converted to an equivalent function type. So def recursiveListFiles(f: File): Array[File] is equivalent to File => Array[File], which is great, because on an Array[File], you have flatMap[B](f: File => Array[B]): Array[B], and your method's function type fits perfectly: the type parameter B is chosen to be File.
As mentioned in another answer, you could explicitly create that function by doing:
these.filter(_.isDirectory).flatMap(file => recursiveListFiles(file)) or
these.filter(_.isDirectory).flatMap(recursiveListFiles(_)) or
these.filter(_.isDirectory).flatMap(recursiveListFiles _)
In more complex circumstances, you might need to work with one of those more verbose options, but in your case, no need to bother.
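flatMap takes a function f: (A) ⇒ GenTraversableOnce[B] and returns a collection of B of the same kind as the one it was called on (a List[B] for a List, an Array[B] for an Array).
In your case, it is taking recursiveListFiles, which is a File ⇒ Array[File], and is called on an Array[File], so it returns an Array[File]. This resulting Array[File] is then concatenated to these.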
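The same pattern can be sketched without touching the file system. Node and descendants below are hypothetical stand-ins for java.io.File and recursiveListFiles; the point is only that a method name can be passed where flatMap expects a function.

```scala
object FlatMapDemo {
  // Hypothetical stand-in for a directory tree.
  final case class Node(name: String, children: List[Node])

  // The method name is passed directly; the compiler converts it to the
  // function n => descendants(n). A recursive method needs an explicit return type.
  def descendants(n: Node): List[Node] =
    n.children ++ n.children.flatMap(descendants)

  val tree: Node =
    Node("root", List(Node("a", List(Node("a1", Nil))), Node("b", Nil)))

  // Direct children first, then the flattened results of the recursive calls.
  val names: List[String] = descendants(tree).map(_.name)
}
```

Here names is List("a", "b", "a1"): the direct children come first, followed by everything found below them.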

Passing function as block of code between curly braces

A few times I have seen Scala code like this:
object Doer{
def doStuff(op: => Unit) {
op
}
}
Invoked in this way:
Doer.doStuff{
println("Done")
}
What seems strange to me is how a function is passed to another function as just a block of code between curly braces. There are not even the parentheses that normally mark the beginning and end of an argument list.
What is the name of this Scala syntax/feature? In what cases can I use it? Where is it documented?
This is a by-name parameter (the argument is sometimes called a nullary function or a thunk), and is an example of call-by-name evaluation: http://www.scala-lang.org/old/node/138
You can use nullaries pretty much anywhere you have a parameter list. They are basically just syntactic sugar around zero-argument functions that make them look like ordinary values, and are invoked whenever they are referenced.
So
def doSomething(op: => Unit): Unit = {
op
}
doSomething {
println("Hello!")
}
behaves the same as:
def doSomething(op: () => Unit): Unit = {
op()
}
doSomething(() => println("Hello!"))
The one thing to keep in mind with nullaries is they are invoked every time they are referenced, so something like:
def foo(op: => Int) = op + op
foo {
println("foo")
5
}
will print "foo" twice.
Edit: To expand on Randall's comment, one of the big ways that a nullary function differs from a zero-arg function is that nullaries are not first-class values. For example, you can have a List[() => Int] but you cannot have a List[=> Int]. And if you had something like:
def foo(i: => Int) = List(i)
you are not adding the nullary function to the list, only its return value.
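The evaluation behaviour described above can be checked with a counter instead of a println. This is a minimal sketch; evaluations, twice, viaFunction and once are hypothetical names introduced for illustration.

```scala
object ByNameDemo {
  // Counts how many times the argument block has been evaluated.
  var evaluations = 0

  // By-name parameter: the block is re-evaluated at each reference to op.
  def twice(op: => Int): Int = op + op

  // The explicit zero-argument function version, invoked with op().
  def viaFunction(op: () => Int): Int = op() + op()

  // Storing the result in a val evaluates the block only once.
  def once(op: => Int): Int = {
    val v = op
    v + v
  }
}
```

For example, ByNameDemo.twice { ByNameDemo.evaluations += 1; 5 } returns 10 and increments the counter twice, while the same block passed to once increments it a single time.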

What does this piece of code mean in scala?

def func(arg: String => Int): Unit = {
// body of function
}
I mean this fragment:
String => Int
Short answer
It's a function that receives a String and returns an Int
Long answer
In Scala, functions are first class citizens. That means you can store them in variables or (like in this case) pass them around as arguments.
This is what a function type looks like:
() => Unit
This is a function that receives no arguments and returns Unit (Java's equivalent is void).
This would be a function that receives a String as a parameter and returns an Int:
(String) => Int
Also, Scala lets you drop the parentheses as a form of syntactic sugar, like in your example. The preceding arg: is just the name of the argument.
Inside func you would call the function received (arg) like this:
val result = arg("Some String") // this returns an Int
As mentioned in Advantages of Scala’s Type System, it is a Functional type.
The article Scala for Java Refugees Part 6: Getting Over Java describes this syntax in its section "Higher-Order Functions".
def iterate(array:Array[String], fun:(String)=>Unit) = {
for (i <- 0 to (array.length - 1)) { // anti-idiom array iteration
fun(array(i))
}
}
val a = Array("Daniel", "Chris", "Joseph", "Renee")
iterate(a, (s:String) => println(s))
See? The syntax is so natural you almost miss it.
Starting at the top, we look at the type of the fun parameter and we see the (type1, …)=>returnType syntax which indicates a functional type.
In this case, fun will be a functional which takes a single parameter of type String and returns Unit (effectively void, so anything at all).
Two lines down in the function, we see the syntax for actually invoking the functional. fun is treated just as if it were a method available within the scope, the call syntax is identical.
Veterans of the C/C++ dark-ages will recognize this syntax as being reminiscent of how function pointers were handled back-in-the-day.
The difference is, no memory leaks to worry about, and no over-verbosity introduced by too many star symbols.
In your case: def func(arg: String => Int): Unit, arg would be a function taking a String and returning an Int.
You might also see it written (perhaps by a decompiler) as
def func(arg: Function1[String, Int]): Unit = {
// body of function
}
They are precisely equivalent.
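As a rough sketch of that equivalence: the func below is a hypothetical variation that returns the Int it gets back (the original returns Unit) so the result is visible, and length and sameType are illustrative names.

```scala
object FunctionTypeDemo {
  // Variation on the question's func: it applies arg and returns the result.
  def func(arg: String => Int): Int = arg("Some String")

  // A function value declared with the arrow syntax.
  val length: String => Int = s => s.length

  // The same value viewed through the desugared type; no conversion is involved.
  val sameType: Function1[String, Int] = length

  val viaArrow: Int = func(length)
  val viaFunction1: Int = func(sameType)
  val viaLiteral: Int = func(s => s.length)
}
```

All three calls give the same result, since String => Int and Function1[String, Int] name the same type.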