When to use parenthesis in Scala infix notation - scala

When programming in Scala, I do more and more functional stuff. However, when using infix notation it is hard to tell when you need parenthesis and when you don't.
For example the following piece of code:
def caesar(k:Int)(c:Char) = c match {
case c if c isLower => ('a'+((c-'a'+k)%26)).toChar
case c if c isUpper => ('A'+((c-'A'+k)%26)).toChar
case _ => c
}
def encrypt(file:String,k:Int) = (fromFile(file) mkString) map caesar(k)_
The (fromFile(file) mkString) needs parenthesis in order to compile. When removed I get the following error:
Caesar.scala:24: error: not found: value map
def encrypt(file:String,k:Int) = fromFile(file) mkString map caesar(k)_
^
one error found
mkString obviously returns a string on which (by implicit conversion AFAIK)I can use the map function.
Why does this particular case needs parentheses? Is there a general guideline on when and why you need it?

This is what I put together for myself after reading the spec:
Any method which takes a single parameter can be used as an infix operator: a.m(b) can be written a m b.
Any method which does not require a parameter can be used as a postfix operator: a.m can be written a m.
For instance a.##(b) can be written a ## b and a.! can be written a!
Postfix operators have lower precedence than infix operators, so foo bar baz means foo.bar(baz) while foo bar baz bam means (foo.bar(baz)).bam and foo bar baz bam bim means (foo.bar(baz)).bam(bim).
Also given a parameterless method m of object a, a.m.m is valid but a m m is not as it would parse as exp1 op exp2.
Because there is a version of mkString that takes a single parameter it will be seen as an infix opertor in fromFile(file) mkString map caesar(k)_. There is also a version of mkString that takes no parameter which can be used a as postfix operator:
scala> List(1,2) mkString
res1: String = 12
scala> List(1,2) mkString "a"
res2: String = 1a2
Sometime by adding dot in the right location, you can get the precedence you need, e.g. fromFile(file).mkString map { }
And all that precedence thing happens before typing and other phases, so even though list mkString map function makes no sense as list.mkString(map).function, this is how it will be parsed.

The Scala reference mentions (6.12.3: Prefix, Infix, and Postfix Operations)
In a sequence of consecutive type infix operations t0 op1 t1 op2 . . .opn tn, all operators op1, . . . , opn must have the same associativity.
If they are all left-associative, the sequence is interpreted as (. . . (t0 op1 t1) op2 . . .) opn tn.
In your case, 'map' isn't a term for the operator 'mkstring', so you need grouping (with the parenthesis around 'fromFile(file) mkString')
Actually, Matt R comments:
It's not really an associativity issue, more that "Postfix operators always have lower precedence than infix operators. E.g. e1 op1 e2 op2 is always equivalent to (e1 op1 e2) op2". (Also from 6.12.3)
huynhjl's answer (upvoted) gives more details, and Mark Bush's answer (also upvoted) point to "A Tour of Scala: Operators" to illustrate that "Any method which takes a single parameter can be used as an infix operator".

Here's a simple rule: never used postfix operators. If you do, put the whole expression ending with the postfix operator inside parenthesis.
In fact, starting with Scala 2.10.0, doing that will generate a warning by default.
For good measure, you might want to move the postfix operator out, and use dot notation for it. For example:
(fromFile(file)).mkString map caesar(k)_
Or, even more simply,
fromFile(file).mkString map caesar(k)_
On the other hand, pay attention to the methods where you can supply an empty parenthesis to turn them into infix:
fromFile(file) mkString () map caesar(k)_

The spec doesn't make it clear, but my experience and experimentation has shown that the Scala compiler will always try to treat method calls using infix notation as infix operators. Even though your use of mkString is postfix, the compiler tries to interpret it as infix and so is trying to interpret "map" as its argument. All uses of postfix operators must either be immediately followed by an expression terminator or be used with "dot" notation for the compiler to see it as such.
You can get a hint of this (although it's not spelled out) in A Tour of Scala: Operators.

Related

How to annotate types of expressions within a Scala for expression

I'm wondering how to (safely) add type ascriptions to the lefthand side of Scala for expressions.
When working with complex for expressions, the inferred types are often hard to follow, or can go off track, and it's desirable to annotate the expressions within the for {..} block, to check that you and the compiler agree. An illustrative code sketch that uses the Eff monad:
for {
e1: Option[(Int, Vector[Int])] <- complexCalc
e2 <- process(e1)
} yield e2
def complexCalc[E]: Eff[E, Option[(Int, Vector[Int])]] = ???
However, this conflicts with Scala's support for pattern matching on the LHS of a for-expression.
As discussed in an earlier question, Scala treats the type ascription : Option[(Int, Vector[Int])] as a pattern to be matched.
The compiler has the concept of irrefutable patterns, which always match. Conversely, a refutable pattern might not match, and so when used in for expression, the righthand-side expression must provide a withFilter method to handle the non-matching case.
Due to a longstanding compiler bug/limitation, tuples and other common constructor patterns seem never to be treated as irrefutable.
The net result is that adding a type ascription makes the LHS pattern seem refutable and requires a filter op to be defined on the RHS. A symptom is compiler errors like value filter is not a member of org.atnos.eff.Eff[E,Option[(Int, Vector[Int])]].
If there any syntactic form that can be used to ascribe types to the intermediate expressions containing tuples, within for expressions, without encountering refutable pattern matching?
Turns out there's extensive discussion of this problem, and the current best solution, in this Scalaz PR *.
The work around is to add the type ascription in an assignment on the line below:
for {
e <- complexCalc
e1: Option[(Int, Vector[Int])] = e
e2 <- process(e1)
} yield e2
def complexCalc[E]: Eff[E, Option[(Int, Vector[Int])]] = ???

How much code does the body of an anonymous Scala function with "_" (underscore) comprise?

In Scala, if you have an expression containing an underscore, this is an anonymous function with the expression as its body and the underscore as as its parameter, e.g. 2*_ is the anonymous function that doubles its argument. But how far does the function body extend? I'm missing a clear rule here that disambiguates cases like e.g. the following (tested with the Scala 2.11.7 REPL):
scala> (_: Int)+2-1 // function body up to 1 - OK
res7: Int => Int = <function1>
scala> ((_: Int)+2)-1 // function body up to 2, - applied to function is an error
<console>:11: error: value - is not a member of Int => Int
((_: Int)+2)-1
^
The definition is given in http://www.scala-lang.org/files/archive/spec/2.11/06-expressions.html#placeholder-syntax-for-anonymous-functions, and it's... not that simple.
An expression e of syntactic category Expr binds an underscore section u, if the following two conditions hold: (1) e properly contains u, and (2) there is no other expression of syntactic category Expr which is properly contained in e and which itself properly contains u.
If an expression e binds underscore sections u_1 , \ldots , u_n, in this order, it is equivalent to the anonymous function (u'_1, ... u'_n) => e' where each u_i' results from u_i by replacing the underscore with a fresh identifier and e' results from e by replacing each underscore section u_i by u_i'.
And if you look at the grammar in the beginning of the section, (_: Int)+2 in (_: Int)+2-1 is not an Expr, but in ((_: Int)+2)-1 it is.
((_: Int)+2)-1 // function body up to 2, - applied to function is an error
error: value - is not a member of Int => Int
((_: Int)+2)-1
The error message from the compiler is sensible. Your additional parens have created a function literal that adds '2' to a wildcard/placeholder parameter. The compiler reads your code to mean that you have a this function value and you are trying to subtract '1' from it.
And this doesn't make sense. You can subract '1' from other numbers, but certainly not a function value. Thus, the compiler is telling you that it doesn't make sense to subtract one from a function value. Or, in compiler terms, a function of type Int => Int doesn't have a '-' function.
value - is not a member of Int => Int
Understanding this error message requires that you know that all operators in Scala ( -, *, +, etc ) are implemented as methods of types. If you look at the Scala API docs for Int, you'll see that it defines a long list of methods with common mathematical and logical operator symbols as function names.

Named parameters vs _, dot notation vs infix operation, curly vs round brackets when using higher-order functions in Scala

I'm having the hardest time understanding when I can or can't omit brackets and/or periods, and how this interplays with _.
The specific case I had with this was
val x: X = ???
val xss: List[List[X]] = ???
xss map x :: _ //this doesn't compile
xss map _.::(x) //this is the same as the above (and thus doesn't compile)
the above two seem to be identical to xss.map(_).::(x)
xss map (x :: _) //this works as expected
xss map {x :: _} //this does the same thing as the above
meanwhile, the following also fail:
xss.map xs => x :: xs //';' expected but '=>' found.
xss.map x :: _ //missing arguments for method map in class List; follow this method with `_' if you want to treat it as a partially applied function
//so when I try following the method with _, I get my favourite:
xss.map _ x :: _ //Cannot construct a collection of type That with elements of type B based on a collection of type List[List[Main.X]]
//as opposed to
xss map _ x :: _ //missing parameter type for expanded function ((x$1) => xss.map(x$1).x(($colon$colon: (() => <empty>))))
Right now, I often play "toggle the symbols until it compiles" which I believe to be a suboptimal programming strategy. How does this all work?
First we need to distinguish between xss.map(f) and xss map f. According to Scala Documentation any method which takes a single parameter can be used as an infix operator.
Actually map method in List is one of these methods. Ignoring the full signature and the fact that it's inherited from TraversableLike, the signature is as follows:
final def map[B](f: (A) ⇒ B): List[B]
So it takes a single parameter, namely f, which is a function with type A => B. So if you have a function value defined as
val mySize = (xs:List[Int]) => xs.size
you can choose between
xss.map(mySize)
or
xss map mySize
This is a matter of preference but according to Scala Style Guide, for this case, the latter is preferred, unless if it is part of a complex expression where it's better to stick with dot notation.
Note that if you opt to use the dot notation you always need to qualify the function application with brackets! That's why none of the following compiles successfully.
xss.map xs => x :: xs // Won't compile
xss.map x :: _ // Won't compile
xss.map _ x :: _ // Won't compile
But most of the time instead of passing a function value you need to pass a function literal (aka anonymous function). In this case again if you use the dot notation you need something like xss.map(_.size). But if you use the infix notation, it will be a matter of precedence.
For example
xss map x :: _ // Won't compile!
does not work because of operator precedence. So you need to use brackets to disambiguiate the situation for compiler by xss map (x :: _).
Use of curly braces instead of brackets has a very clear and simple rule. Again any function which takes only one parameter can be applied with curly braces instead of brackets, both for infix and dot notations. So the following statements will compile.
xss.map{x :: _}
xss map {x :: _}
For avoiding confusions you can begin with dot notation and explicit types for parameters. Later after being compiled - and probably writing some unit tests for your code - you can start refactoring the code by removing unnecessary types, using infix notation, and using curly braces instead of brackets where it makes sense.
For this purpose you can refer to Scala Style Guide and Martin Odersky's talk in Scala Days 2013 which is concerning Scala coding style. Also you can always ask for help from IDEs for refactoring the code to be more concise.

Scala - method precedence

I am new to Scala. I wonder whether it is possible to define some precedence with method calls. For example, if I have the chain of method calls:
someObject method1 param1 method2 param2 method3 param3
can this be equivalent to the following:
someObject.method1(param1).method2(param2.method3(param3))
or
someObject method1 param1 method2 (param2 method3 param3)
So I want method3 to take precedence over method2...
The reason I want to do this is that I want to develop a DSL, so I want to avoid using dots and parentheses as much as possible. If you guys find another solution for me, feel free to let me know.
You'll have to use methods with special operator characters to influence precedence as implied by Tomasz. This is partly why lots of Scala DSL make heavy use of operators. Also why some DSL are hard to read if you don't work with them daily.
Given method with using only letters, underscore and digits - you won't be able to influence things, here is what I put together for myself after reading the spec:
Any method which takes a single parameter can be used as an infix operator: a.m(b) can be written a m b.
Any method which does not require a parameter can be used as a postfix operator: a.m can be written a m.
Postfix operators have lower precedence than infix operators, so foo bar baz means foo.bar(baz) while foo bar baz bam means (foo.bar(baz)).bam and foo bar baz bam bim means (foo.bar(baz)).bam(bim).
So without knowing at all what your method signatures are, the following code (because it's all alphanumeric):
someObject method1 param1 method2 param2 method3 param3
will be parsed as:
someObject.method1(param1).method2(param2).method3(param3)
If you rename method3 to |*| or +:+ or whatever operator makes sense, you can achieve what you want:
someObject method1 param1 method2 param2 |*| param3
// same as
someObject.method1(param1).method2(param2.|*|(param3))
For example to see the difference:
implicit def pimp(s:String) = new {
def |*|(t:String) = t + s
def switch(t:String) = t + s
}
scala> "someObject" concat "param1" concat "param2" |*| "param3"
res2: java.lang.String = someObjectparam1param3param2
scala> "someObject" concat "param1" concat "param2" switch "param3"
res3: java.lang.String = param3someObjectparam1param2
This behavior is defined in chapter 6.12.3 Infix Operations of The Scala Language Specification.
In short: methods are invoked from left to right by default with some exceptions. These exceptions were only introduced to support mathematical operators precedence. So when you have two functions named * and +:
a + b * c
This will always be translated to:
a.+(b.*(c))
Here the first name of the function controls the precedence. However for ordinary functions you cannot control the order. Think about it - this would actually cause havoc and terribly unmaintainable code.
See also (not quite duplicate?): Operator precedence in Scala.

Why do people use _? as an identifier suffix?

I start reading Lift framework source code, I find that there're so many methods are defined using a name like methodName_? , is there a convention that _? has some special meaning?
def empty_? : Boolean = {}
You're unlikely to see the construct outside of the lift framework; I suspect that it's mostly Ruby-envy.
Almost any other Scala project will shy away from this syntax. Not because of the trailing question mark, but because it's yet another way to introduce underscores into your code - and the symbol is already far too heavily overloaded.
Based on an evaluation of multiple Scala projects, the notation can be reasonably described as non-idiomatic.
UPDATE
The reason that the underscore is required is to disambiguate from infix notation, in which:
x?y
would be read as
x.?(y)
with ? being a method name. Whereas:
x_?y
Clearly demarks x_? as being atomic.
The syntax is an example of what is formally known as a "mixed identifier", and is intended to allow definitions such as
def prop_=(v:String) = ... //setter
def unary_- = ... //prefix negation operator
It could (arguably) be considered a hack when similar construct is used simply to shove a question mark at the end of a method name.
The ? denotes that this is a predicate, a function returning Boolean. This convention goes back to Lisp, where ? (Scheme), p or -p (other Lisps, simulating the question mark with a "similar" letter) also denote predicates. Think of it as asking a question, "is the object empty?"
Scala will only allow mixed identifier names (containing alphanumerics and punctuation) if you separate them by _. E.g.,
scala> def iszero?(x : Int) = x == 0
<console>:1: error: '=' expected but identifier found.
def iszero?(x : Int) = x == 0
^
doesn't work, but
scala> def iszero_?(x : Int) = x == 0
iszero_$qmark: (x: Int)Boolean
does.