Unexpected behavior of StringBuilder in foreach - scala

While answering this question I stumbled upon a behavior I could not explain.
Coming from:
val builder = new StringBuilder("foo bar baz ")
(0 until 4) foreach { builder.append("!") }
builder.toString -> res1: String = foo bar baz !
The issue seemed clear: the function passed to foreach was missing the Int argument, so StringBuilder.apply got executed. But that does not really explain why it appends the '!' only once. So I started experimenting.
I would have expected the following six statements to be equivalent, but the resulting Strings differ:
(0 until 4) foreach { builder.append("!") } -> res1: String = foo bar baz !
(0 until 4) foreach { builder.append("!")(_) } -> res1: String = foo bar baz !!!!
(0 until 4) foreach { i => builder.append("!")(i) } -> res1: String = foo bar baz !!!!
(0 until 4) foreach { builder.append("!").apply } -> res1: String = foo bar baz !
(0 until 4) foreach { builder.append("!").apply(_) } -> res1: String = foo bar baz !!!!
(0 until 4) foreach { i => builder.append("!").apply(i) } -> res1: String = foo bar baz !!!!
So the statements are obviously not equivalent. Can somebody explain the difference?

Let's label them:
A - (0 until 4) foreach { builder.append("!").apply }
B - (0 until 4) foreach { builder.append("!").apply(_) }
C - (0 until 4) foreach { i => builder.append("!").apply(i) }
At first glance it is confusing, because it appears they should all be equivalent to each other. Let's look at C first. If we look at it as a Function1, it should be clear enough that builder.append("!") is evaluated with each invocation.
val C = new Function1[Int, StringBuilder] {
def apply(i: Int): StringBuilder = builder.append("!").apply(i)
}
For each element in (0 until 4), C is called, which re-evaluates builder.append("!") on each invocation.
The important step to understanding this is that B is syntactic sugar for C, and not A. Using the underscore in apply(_) tells the compiler to create a new anonymous function i => builder.append("!").apply(i). We might not necessarily expect this because builder.append("!").apply can be a function in its own right, if eta-expanded. The compiler appears to prefer creating a new anonymous function that simply wraps builder.append("!").apply, rather than eta-expanding it.
From the SLS 6.23.1 - Placeholder Syntax for Anonymous Functions
An expression e of syntactic category Expr binds an underscore section u, if the following two conditions hold: (1) e properly contains u, and (2) there is no other expression of syntactic category Expr which is properly contained in e and which itself properly contains u.
So builder.append("!").apply(_) properly contains the underscore, so the placeholder syntax applies and the expression becomes the anonymous function i => builder.append("!").apply(i), like C.
Compare this to:
(0 until 4) foreach { builder.append("!").apply _ }
Here, the underscore is not properly contained in the expression, so the underscore syntax does not immediately apply as builder.append("!").apply _ can also mean eta-expansion. In this case, eta-expansion comes first, which will be equivalent to A.
For A, builder.append("!").apply is implicitly eta-expanded to a function, which evaluates builder.append("!") only once. That is, it is something like:
val A = new Function1[Int, Char] {
private val a = builder.append("!")
// append is not called on subsequent apply calls
def apply(i: Int): Char = a.apply(i)
}
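To see the difference without relying on the REPL, here is a minimal sketch (the object and method names are made up for illustration). It only uses the forms whose behavior is described above: passing the StringBuilder itself as the function, the placeholder form, and an explicit val that mirrors what the eta-expanded version A does.

```scala
object AppendDemo {
  // StringBuilder is itself an Int => Char, so the argument expression
  // runs once and foreach only indexes into the resulting builder.
  def builderAsFunction(): String = {
    val b = new StringBuilder("foo bar baz ")
    (0 until 4).foreach(b.append("!"))
    b.toString
  }

  // The placeholder form is a new lambda, x => b.append("!")(x),
  // so append runs on every invocation.
  def placeholderLambda(): String = {
    val b = new StringBuilder("foo bar baz ")
    (0 until 4).foreach(b.append("!")(_))
    b.toString
  }

  // Hand-written equivalent of version A: the receiver is evaluated once
  // and only apply is called per element.
  def capturedReceiver(): String = {
    val b = new StringBuilder("foo bar baz ")
    val receiver = b.append("!")
    (0 until 4).foreach(i => receiver.apply(i))
    b.toString
  }
}
```

builderAsFunction() and capturedReceiver() both end up with a single "!", while placeholderLambda() appends four.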

scala.collection.mutable.StringBuilder extends (Int => Char), and therefore builder.append("!"), which returns a StringBuilder, is a valid function argument to foreach. The first line is therefore equivalent to writing:
val f: Int => Char = builder.append("!").asInstanceOf[Int => Char] // appends "!" once
(0 until 4).foreach(f) // fetches the 0th to 3rd chars in the string builder, and does nothing with them
All the lines that append !!!! actually create a new anonymous function i => builder.append("!").apply(i), and are therefore equivalent to
val f: Int => Char = (i: Int) => builder.append("!").apply(i)
(0 until 4).foreach(f) // appends 4 times (and fetches the 0th to 3rd chars in the string builder, and does nothing with them)
As for your fourth line, it's weirder IMO. In that case, you are trying to read a "field" apply in builder.append("!"). But apply is a method (Int)Char, and the expected type (as determined by the parameter type of foreach) is Int => ?. So there is a way to lift the method apply(Int)Char to an Int => ?, which is to create a lambda that calls the method. But since you are initially reading apply as a field, the this of .apply is evaluated once and stored as a capture for the this parameter of the method call, giving something equivalent to this:
val this$1: StringBuilder = builder.append("!") // appends "!" once
val f: Int => Char = (i: Int) => this$1.apply(i)
(0 until 4).foreach(f) // fetches the 0th to 3rd chars in the string builder, and does nothing with them

Related

why S:String.indexOf(T:Substring) return <function1> in scala

scala> var a = List("1","2","3")
a: List[String] = List(1, 2, 3)
scala> a.foreach(_ => print((_:String).indexOf("123")))
<function1><function1><function1>
a.foreach(_ => print((_:String).indexOf("123")))
... is not the same thing as ...
a.foreach(x => print((x:String).indexOf("123")))
For one thing, the underscore, _, represents a passed parameter only once. If you encounter something like _ + _ it does not mean the parameter is added to itself, it represents the addition of two different passed parameters.
So what is (_:String).indexOf("123")? It is an anonymous function that takes one parameter of type String and attempts to find the index where the sub-string "123" can be found. In this case the underscore is unrelated to the parameter sent to the foreach lambda.
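A small sketch of the scoping rules described above may help (object and value names are made up for illustration). Each underscore stands for its own parameter, and an underscore inside the body of `_ => ...` starts a new, inner function rather than referring to the outer parameter.

```scala
object UnderscoreScope {
  val a = List("1", "2", "3")

  // Two underscores mean two different parameters.
  val sum: (Int, Int) => Int = _ + _

  // The inner placeholder builds a String => Int; the outer `_` is ignored,
  // so this is a list of functions, not a list of indices.
  val functions: List[String => Int] = a.map(_ => (_: String).indexOf("123"))

  // Naming the parameter actually calls indexOf on each element.
  val indices: List[Int] = a.map(x => x.indexOf("123"))
}
```

Printing the elements of functions shows function values, which is what happened in the question; indices holds the actual results of indexOf.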

Scala function partial application

I'm trying to understand how function partial application works in Scala.
To do that, I've built this simple code:
object Test extends App {
myCustomConcat("General", "Public", "License") foreach print
GeneralPublicLicenceAcronym(myCustomConcat(_)) foreach print
def myCustomConcat(strings: String*): List[Char] = {
val result = for (s <- strings) yield {
s.charAt(0)
}
result.toList
}
def GeneralPublicLicenceAcronym (concatFunction: (String*) => List[Char] ) = {
myCustomConcat("General", "Public", "License")
}
}
The myCustomConcat function takes a variable number of Strings as input and returns a list containing the first letter of each string.
So, the code
myCustomConcat("General", "Public", "License") foreach print
will print on console: GPL
Suppose now that I want to write a function to generate the GPL acronym, using (as input parameter) my previous function extracting the first letter of each string:
def GeneralPublicLicenceAcronym (concatFunction: (String*) => List[Char] ): List[Char] = {
myCustomConcat("General", "Public", "License")
}
Running this new function with partial application:
GeneralPublicLicenceAcronym(myCustomConcat(_)) foreach print
I get this error:
Error:(8, 46) type mismatch;
 found   : Seq[String]
 required: String
GeneralPublicLicenceAcronym(myCustomConcat(_)) foreach print
Why? Can I use partial application in this case?
All you need to do is change myCustomConcat(_) to myCustomConcat _, or indeed just myCustomConcat
What you are doing isn't exactly partial application - it's just using a method as a function value.
In some cases (where a function value is expected) the compiler will work out what you mean, but in other contexts you often need to tell the compiler your intention, using the _ suffix.
"partial application" means that we are supplying some, but not all, of the arguments to a function, to create a new function, for example:
def add(x: Int, y: Int) = x + y //> add: (x: Int, y: Int)Int
val addOne: Int => Int = add(1, _) //> addOne : Int => Int = <function1>
addOne(2) //> res0: Int = 3
I suppose your case could be seen as partial application, but applying none of the arguments - you can use partial application syntax here, but you need to give a _* hint to the compiler because of the repeated parameters (String*), which ends up a bit ugly:
myCustomConcat(_:_*)
See also: Scala type ascription for varargs using _* cause error
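To keep the two ideas apart, here is a simplified sketch. It is not the code from the question: firstLetters and acronym are hypothetical names, the function parameter takes a plain Seq[String] instead of String* to sidestep the varargs issue, and acronym actually calls the function it is given (the original ignores its parameter).

```scala
object AcronymDemo {
  def firstLetters(strings: Seq[String]): List[Char] =
    strings.map(_.charAt(0)).toList

  // Takes a function value and applies it.
  def acronym(concat: Seq[String] => List[Char]): List[Char] =
    concat(Seq("General", "Public", "License"))

  // Using a method as a function value: no arguments are supplied here.
  val gpl: List[Char] = acronym(firstLetters)

  // Partial application: one argument is fixed, the other stays open.
  def add(x: Int, y: Int): Int = x + y
  val addOne: Int => Int = add(1, _)
}
```

AcronymDemo.gpl.mkString gives "GPL", and addOne(2) gives 3 as in the example above.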

scala. higher order function calling by name. does it make sense

Just want to clarify. If we use a higher-order function (a function that accepts another function as an argument), does it make any sense to add the "=>" sign so that the argument is passed by-name? It seems the argument function is called by-name anyhow?
There is an example:
// 1.
// the function that accepts arg-function with: two int params and returning String
// the function passing v1 & v2 as parameters to arg-function, invoking arg-function 2 times, connecting the result to one string
def takeFunction1(f: (Int, Int) => String, v1:Int, v2:Int ): String = {
f(v1, v2) + f(v1, v2)
}
// 2. same as #1 but calling arg-function by-name
def takeFunction2(f: => ((Int, Int) => String), v1:Int, v2:Int ): String = {
f(v1, v2) + f(v1, v2)
}
def aFun(v1:Int, v2:Int) : String = {
(v1 + v2).toString
}
// --
println( takeFunction1( aFun, 2, 2) )
println( takeFunction2( aFun, 2, 2) )
And what if I want to call it like this ?:
println( takeFunction2( aFun(2,2)), ... ) // it tries to evaluate immediately when passing
The difference is that if you pass as the first argument a call to a function that returns the (Int, Int) => String value to use, this call to the generator function is evaluated only once with pass-by-value, compared to being evaluated each time the argument is used in the case of pass-by-name.
Rather contrived example:
var bar = 0
def fnGen() = {
bar += 1
def myFun(v1:Int, v2:Int) = {
(v1 + v2).toString
}
myFun _
}
Now run some calls of your methods above using fnGen:
scala> println( takeFunction1( fnGen(), 2, 2) )
44
scala> bar
res1: Int = 1
scala> println( takeFunction2( fnGen(), 2, 2) )
44
scala> bar
res3: Int = 3
As you can see, calling takeFunction1 increments bar only once, while calling takeFunction2 increments bar twice.
The argument that you're passing by name is aFun; that's a valid expression, and it does get evaluated both times that takeFunction2 uses it, but since it's just a variable, and you're not doing anything else with it, "evaluating" it is not very meaningful. (It just evaluates to the same value both times.) For pass-by-name to behave differently from pass-by-value, you have to pass in an impure expression (one that has side-effects, or that can evaluate to different values on successive calls, or whatnot).
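For the call shape asked about in the question, where aFun(2, 2) itself is passed, the by-name parameter would have to have the result type rather than a function type. A minimal sketch, assuming a hypothetical helper takeResult and a counter added to aFun purely to make the evaluations visible:

```scala
object ByNameResultDemo {
  var calls = 0

  // Same as aFun from the question, plus a side effect for counting.
  def aFun(v1: Int, v2: Int): String = {
    calls += 1
    (v1 + v2).toString
  }

  // Hypothetical helper: the argument expression is re-evaluated at each use of r.
  def takeResult(r: => String): String = r + r

  def run(): (String, Int) = {
    calls = 0
    val s = takeResult(aFun(2, 2)) // not evaluated at the call site
    (s, calls)
  }
}
```

Here the impure expression is the call aFun(2, 2), so it runs once per use of r inside takeResult.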

Polish notation evaluate function

I am new to Scala and I am having a hard time defining, or more likely translating from Ruby, my code for evaluating calculations written in Polish notation,
e.g. (+ 3 2) or (- 4 (+ 3 2))
I successfully parse the string into the form ArrayBuffer(+, 3, 2) or ArrayBuffer(-, 4, ArrayBuffer(+, 3, 2)).
The problem actually starts when I try to define a recursive eval function, which simply takes an ArrayBuffer as its argument and "returns" an Int (the result of the evaluated application).
IN THE BASE CASE:
I want to simply check if 2nd element is an instanceOf[Int] and 3rd element is instanceOf[Int] then evaluate them together (depending on sign operator - 1st element) and return Int.
However, if any of the elements is another ArrayBuffer, I simply want to reassign that element to the value returned by a recursive call to eval, like:
Storage(2) = eval(Storage(2)). (** that's why I am using a mutable ArrayBuffer **)
The error I get is:
scala.collection.mutable.ArrayBuffer cannot be cast to java.lang.Integer
I am of course not looking for copy-and-paste answers, but for some advice and observations.
Constructive Criticism fully welcomed.
****** This is the testing code I am using only for the addition ******
def eval(Input: ArrayBuffer[Any]):Int = {
if(ArrayBuffer(2).isInstaceOf[ArrayBuffer[Any]]) {
ArrayBuffer(2) = eval(ArrayBuffer(2))
}
if(ArrayBuffer(3).isInstaceOf[ArrayBuffer[Any]]) {
ArrayBuffer(3) = eval(ArrayBuffer(3))
}
if(ArrayBuffer(2).isInstaceOf[Int] && ArrayBuffer(3).isInstanceOf[Int]) {
ArrayBuffer(2).asInstanceOf[Int] + ArrayBuffer(3).asInstanceOf[Int]
}
}
A few problems with your code:
ArrayBuffer(2) means "construct an ArrayBuffer with one element: 2". Nowhere in your code are you referencing your parameter Input. You would need to replace instances of ArrayBuffer(2) with Input(2) for this to work.
ArrayBuffer (and all collections in Scala) are 0-indexed, so if you want to access the second thing in the collection, you would do input(1).
If you leave the final if there, then the compiler will complain since your function won't always return an Int; if the input contained something unexpected, then that last if would evaluate to false, and you have no else to fall to.
Here's a direct rewrite of your code, fixing the issues:
def eval(input: ArrayBuffer[Any]):Int = {
if(input(1).isInstanceOf[ArrayBuffer[Any]])
input(1) = eval(input(1).asInstanceOf[ArrayBuffer[Any]])
if(input(2).isInstanceOf[ArrayBuffer[Any]])
input(2) = eval(input(2).asInstanceOf[ArrayBuffer[Any]])
input(1).asInstanceOf[Int] + input(2).asInstanceOf[Int]
}
(note also that variable names, like input, should be lowercased.)
That said, the procedure of replacing entries in your input with their evaluations is probably not the best route because it destroys the input in the process of evaluating. You should instead write a function that takes the ArrayBuffer and simply recurses through it without modifying the original.
You'll want your eval function to check for specific cases. Here's a simple implementation as a demonstration:
def eval(e: Seq[Any]): Int =
e match {
case Seq("+", a: Int, b: Int) => a + b
case Seq("+", a: Int, b: Seq[Any]) => a + eval(b)
case Seq("+", a: Seq[Any], b: Int) => eval(a) + b
case Seq("+", a: Seq[Any], b: Seq[Any]) => eval(a) + eval(b)
}
So you can see that for the simple case of (+ arg1 arg2), there are 4 cases. In each case, if the argument is an Int, we use it directly in the addition. If the argument itself is a sequence (like ArrayBuffer), then we recursively evaluate before adding. Notice also that Scala's case syntax lets you do pattern matches with types, so you can skip the isInstanceOf and asInstanceOf stuff.
Now there are definitely style improvements you'd want to make down the line (like using Either instead of Any and not hard coding the "+"), but this should get you on the right track.
And here's how you would use it:
scala> eval(Seq("+", 3, 2))
res0: Int = 5
scala> eval(Seq("+", 4, Seq("+", 3, 2)))
res1: Int = 9
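As for the style improvements mentioned above, one possible direction is sketched below. Note that it swaps in a small sealed hierarchy for the Any-typed sequences (Either is used only for the result, to report an unknown operator), and all names here (TypedEval, Expr, Num, Op, operators) are made up for illustration. It assumes Scala 2.12 or later, where Either is right-biased.

```scala
object TypedEval {
  sealed trait Expr
  final case class Num(value: Int) extends Expr
  final case class Op(symbol: String, left: Expr, right: Expr) extends Expr

  // Operator table, so "+" is no longer hard-coded in the match.
  val operators: Map[String, (Int, Int) => Int] = Map(
    "+" -> ((a: Int, b: Int) => a + b),
    "-" -> ((a: Int, b: Int) => a - b),
    "*" -> ((a: Int, b: Int) => a * b),
    "/" -> ((a: Int, b: Int) => a / b) // still throws on division by zero
  )

  // Returns Left with a message for an unknown operator, Right with the value otherwise.
  def eval(e: Expr): Either[String, Int] = e match {
    case Num(v) => Right(v)
    case Op(symbol, l, r) =>
      for {
        f <- operators.get(symbol).toRight(s"unknown operator: $symbol")
        a <- eval(l)
        b <- eval(r)
      } yield f(a, b)
  }
}
```

With this shape, (- 4 (+ 3 2)) would be written as Op("-", Num(4), Op("+", Num(3), Num(2))), and the input is never modified during evaluation.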
Now, if you want to really take advantage of Scala features, you could use an Eval extractor:
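object Eval {
def unapply(e: Any): Option[Int] = {
e match {
case i: Int => Some(i)
case Seq("+", Eval(a), Eval(b)) => Some(a + b)
// anything else is not an expression we understand: report "no match"
// instead of throwing a MatchError from inside the extractor
case _ => None
}
}
}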
And you'd use it like this:
scala> val Eval(result) = 2
result: Int = 2
scala> val Eval(result) = ArrayBuffer("+", 2, 3)
result: Int = 5
scala> val Eval(result) = ArrayBuffer("+", 2, ArrayBuffer("+", 2, 3))
result: Int = 7
Or you could wrap it in an eval function:
def eval(e: Any): Int = {
val Eval(result) = e
result
}
Here is my take on right to left stack-based evaluation:
def eval(expr: String): Either[Throwable, Int] = {
import java.lang.NumberFormatException
import scala.util.control.Exception._
def int(s: String) = catching(classOf[NumberFormatException]).opt(s.toInt)
val symbols = expr.replaceAll("""[^\d\+\-\*/ ]""", "").split(" ").toSeq
allCatch.either {
val results = symbols.foldRight(List.empty[Int]) {
(symbol, operands) => int(symbol) match {
case Some(op) => op :: operands
case None => val x :: y :: ops = operands
val result = symbol match {
case "+" => x + y
case "-" => x - y
case "*" => x * y
case "/" => x / y
}
result :: ops
}
}
results.head
}
}

some operator questions

I'm new to scala so sorry if this is easy but I've had a hard time finding the answer.
I'm having a hard time understanding what <- does, and what () => Unit does. My understanding of these is that -> is sometimes used in foreach, and that => is used in maps. Trying to google "scala <-" doesn't prove very fruitful. I found http://jim-mcbeath.blogspot.com/2008/12/scala-operator-cheat-sheet.html but it wasn't as helpful as it looks at first glance.
val numbers = List("one", "two", "three","four","five")
def operateOnList() {
for(number <- numbers) {
println(number + ": came out of this crazy thing!")
}
}
def tweener(method: () => Unit) {
method()
}
tweener(operateOnList)
() => Unit means that method is a function that takes no parameter and returns nothing (Unit).
<- is used in a for comprehension as a kind of assignment operator. For comprehensions are a little bit special because they are transformed internally. In your case, the loop is transformed into numbers.foreach(i => println(i + ": came out of this crazy thing!"))
<- in the for comprehension means that we iterate over each element of the numbers list and bind it to number.
'<-' could be treated as 'in', so
for(number <- numbers){
...
}
could be translated into english as for each number in numbers do
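The translation mentioned above can be checked with a small sketch (names are made up for illustration): a for loop without yield behaves like foreach, and a for with yield behaves like map.

```scala
import scala.collection.mutable.ListBuffer

object ForDesugarDemo {
  val numbers = List("one", "two", "three")

  // for without yield: runs the body for its side effects
  def viaFor(): List[String] = {
    val out = ListBuffer.empty[String]
    for (number <- numbers) { out += number + "!" }
    out.toList
  }

  // the same loop written directly with foreach
  def viaForeach(): List[String] = {
    val out = ListBuffer.empty[String]
    numbers.foreach(number => out += number + "!")
    out.toList
  }

  // for with yield corresponds to map
  val lengthsViaFor: List[Int] = for (number <- numbers) yield number.length
  val lengthsViaMap: List[Int] = numbers.map(number => number.length)
}
```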
'<-' has a twin with different semantics: '->'. It is simply a replacement for the comma in tuples: (a,b) is equivalent to (a->b) or just a->b. The meaning behind this symbol is that 'a' maps to 'b', so it is often used in the definition of Maps:
Map("a" -> 1,"aba" -> 3)
Map("London" -> "Britain", "Paris" -> "France")
Here you can think about mapping as a projection (or not) via some function (e.g. 'length of string', 'capital of').
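A quick sketch to confirm that -> only builds a pair (the object and value names are made up for illustration):

```scala
object ArrowDemo {
  // "a" -> 1 is just another way to write the tuple ("a", 1)
  val pair: (String, Int) = "a" -> 1
  val tuple: (String, Int) = ("a", 1)
  val sameAsTuple: Boolean = pair == tuple

  // A Map is built from such pairs, written either way
  val lengths: Map[String, Int] = Map("a" -> 1, ("aba", 3))
  val capitals: Map[String, String] = Map("London" -> "Britain", "Paris" -> "France")
}
```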
Last, but not least is '=>' which is map too, but with a general semantics. '=>' is in use all over the place in anonymous expressions:
scala> List(1,2,3,4).map(current => current+1)
res5: List[Int] = List(2, 3, 4, 5)
This reads as: for each element of the list, map the current element with the function 'plus one'.
scala> List(1,2,3,4).map(c => c%2 match {
| case 0 => "even"
| case 1 => "odd"
| }
| )
res6: List[java.lang.String] = List(odd, even, odd, even)
This maps the current element using the provided pattern matching.
In the method
def tweener(method: () => Unit) {
method()
}
the method is called tweener, the parameter is arbitrarily named method, and the type of method is () => Unit, which is a function type, as you can tell from the =>.
Unit is a return type similar to void in Java, and represents no interesting value being returned. For instance, the return type of print is Unit. () represents an empty parameter list.
Confusingly, () is also used to represent an instance of Unit, called the unit value, the only value a Unit can take. But this is not what it means in the function type () => Unit, just as you can't have a function type 42 => Unit.
Back to your example, tweener takes a function of type () => Unit. operateOnList is a method, but it gets partially applied by the compiler to turn it into a function value. You can turn methods into functions yourself like this:
scala> def m = println("hi")
m: Unit
scala> m _
res17: () => Unit = <function0>
operateOnList can be turned into the right type of function because its parameter list is empty (), and its return type is implicitly Unit.
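A minimal sketch of the same idea with something observable (the counter and the bump method are made up for illustration; tweener is written with an explicit Unit result type, but is otherwise the method from the question). It assumes Scala 2, where the trailing underscore is the usual way to ask for the conversion explicitly.

```scala
object TweenerDemo {
  var calls = 0

  def tweener(method: () => Unit): Unit = method()

  // A method with an empty parameter list and Unit result, like operateOnList
  def bump(): Unit = calls += 1

  def run(): Int = {
    calls = 0
    tweener(bump _)        // method explicitly turned into a () => Unit value
    tweener(() => bump())  // the same thing written as a function literal
    calls
  }
}
```

Both calls pass a value of type () => Unit, and tweener invokes it once each time.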
As a side-note, if operateOnList were defined without the empty parameter list (as is legal, but more common when the return type is not Unit), you would need to manually partially apply it, else its value will be passed instead:
def f1() {}
def f2 {}
def g(f: () => Unit) {}
g(f1) // OK
g(f2) // error, since we're passing f2's result (),
// rather than partial function () => Unit
g(f2 _) // OK