I have a simple unapply that checks whether an integer is less than 10:
object MatchLess {
  def unapply(i: Int): Option[Int] = if (i < 10) Some(i) else None
}
// so this prints
// 7 8 9 . . .
for (i <- 7 to 12) i match {
  case MatchLess(x) => print(x + " ") // line 8
  case _ => print(". ")
}
I have one doubt about the unapply syntax: why, in the case on line 8, is the value x visible on both sides of =>? Can I assume that the compiler implicitly adds an assignment like this?
// ...
case /* val x = i */ MatchLess(x) => print(x + " ") // line 8
When you write case MatchLess(x) => ..., the meaning is the following:
execute the unapply method
if it succeeds, bind the variables (here x) to the values returned by unapply (here the i of Some(i)); otherwise, the pattern doesn't match and the next case is tried.
So in your particular case x is bound to the same value as i. But if MatchLess.unapply returned something else instead of Some(i) (for example Some(42)), x would be bound to 42.
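To see that x is bound to whatever unapply returns rather than to the matched value itself, here is a small sketch. AlwaysFortyTwo is a hypothetical extractor invented for this illustration; it is not part of the question.

```scala
object BindingDemo {
  // The extractor from the question: succeeds for values below 10.
  object MatchLess {
    def unapply(i: Int): Option[Int] = if (i < 10) Some(i) else None
  }

  // Hypothetical extractor for illustration: succeeds under the same
  // condition but always wraps 42 in the Some.
  object AlwaysFortyTwo {
    def unapply(i: Int): Option[Int] = if (i < 10) Some(42) else None
  }

  def viaMatchLess(i: Int): String = i match {
    case MatchLess(x) => x.toString // x is the value wrapped in Some(i)
    case _            => "."
  }

  def viaFortyTwo(i: Int): String = i match {
    case AlwaysFortyTwo(x) => x.toString // x is 42, not i
    case _                 => "."
  }
}
```

Both extractors accept the same inputs; only the value that ends up in x differs.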
Check out the language spec, section 8 on pattern matching:
Syntax:
Pattern       ::= Pattern1 { ‘|’ Pattern1 }
Pattern1      ::= varid ‘:’ TypePat
                | ‘_’ ‘:’ TypePat
                | Pattern2
Pattern2      ::= varid [‘@’ Pattern3]
                | Pattern3
Pattern3      ::= SimplePattern
                | SimplePattern {id [nl] SimplePattern}
SimplePattern ::= ‘_’
                | varid // <- 2)
                | Literal
                | StableId
                | StableId ‘(’ [Patterns] ‘)’ // <- 1)
                | StableId ‘(’ [Patterns ‘,’] [varid ‘@’] ‘_’ ‘*’ ‘)’
                | ‘(’ [Patterns] ‘)’
                | XmlPattern
Patterns      ::= Pattern {‘,’ Patterns}
MatchLess(x) is identified as a SimplePattern (1), and the expression between parentheses, according to the above, is identified via Patterns -> Pattern -> Pattern1 -> Pattern2 -> Pattern3 -> SimplePattern -> varid (2). This variable pattern is described as:
A variable pattern x is a simple identifier which starts with a lower
case letter. It matches any value, and binds the variable name to that
value. The type of x is the expected type of the pattern as given from
outside.
In your example, unapply is called on i, and the value wrapped in the returned Some is bound to x.
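As a rough mental model (the compiler's actual translation differs in detail), the extractor pattern behaves like an explicit call to unapply followed by a match on the resulting Option. The names withPattern and byHand are made up for this sketch.

```scala
object DesugarDemo {
  object MatchLess {
    def unapply(i: Int): Option[Int] = if (i < 10) Some(i) else None
  }

  // The match from the question, using the extractor pattern.
  def withPattern(i: Int): String = i match {
    case MatchLess(x) => s"$x "
    case _            => ". "
  }

  // Roughly the same logic written by hand: call unapply explicitly and
  // bind x only when it returns a Some.
  def byHand(i: Int): String = MatchLess.unapply(i) match {
    case Some(x) => s"$x "
    case None    => ". "
  }
}
```

So there is no hidden val x = i; x only comes into existence when unapply succeeds, and it holds the content of the Some.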
I found that parentheses are not required when a partial function is used as a parameter in Scala:
val array = Array(2)
array.map(x => x + 1)
array.map { case x => x + 1 }
{ case x => x + 1 } defines a partial function here, so it should be array.map({ case x => x + 1 }), but there are no parentheses. So what happens here? Which syntax rule applies?
The answer is in the language specification. The syntax for function applications is this:
SimpleExpr ::= SimpleExpr1 ArgumentExprs
ArgumentExprs ::= ‘(’ [Exprs] ‘)’
| ‘(’ [Exprs ‘,’] PostfixExpr ‘:’ ‘_’ ‘*’ ‘)’
| [nl] BlockExpr
Exprs ::= Expr {‘,’ Expr}
So the arguments to a function can be zero or more expressions surrounded by ( ), or a single BlockExpr if the function takes a single argument.
Moving on to the section about blocks we find this:
BlockExpr ::= ‘{’ CaseClauses ‘}’
| ‘{’ Block ‘}’
Block ::= BlockStat {semi BlockStat} [ResultExpr]
Partial functions are defined using the CaseClauses option, so they must be surrounded by { } to make a block expression. This block expression is then a valid argument for a function with a single parameter.
Functions that take multiple parameters must always use ( ).
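The grammar above can be exercised with a short sketch. The value names are invented for the illustration; each call passes the same single argument to map in a different syntactic form.

```scala
object BlockArgDemo {
  val array = Array(2, 3)

  // Ordinary parenthesised argument list.
  val viaParens: Array[Int] = array.map(x => x + 1)

  // The single argument is a BlockExpr of the form { CaseClauses }.
  val viaCases: Array[Int] = array.map { case x => x + 1 }

  // The same block wrapped in parentheses is also legal, just redundant.
  val viaBoth: Array[Int] = array.map({ case x => x + 1 })

  // A BlockExpr of the form { Block } works the same way.
  val viaBlock: Array[Int] = array.map { x =>
    val y = x + 1
    y
  }
}
```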
I have recently been learning Scala parser combinators. I would like to parse the keys in a given string. For instance,
val expr1 = "local_province != $province_name$ or city=$city_name$ or people_number<>$some_digit$"
// ==> List("local_province", "city", "people_number")
val expr2 = "(local_province=$province_name$)"
// ==> List("local_province")
val expr3 = "(local_province=$province_name$ or city=$city_name$) and (lib_name=$akka$ or lib_author=$martin$)"
// ==> List("local_province", "city", "lib_name", "lib_author")
Trial
import scala.util.parsing.combinator.JavaTokenParsers
class KeyParser extends JavaTokenParsers {
  lazy val key = """[a-zA-Z_]+""".r
  lazy val value = "$" ~ key ~ "$"
  lazy val logicOps = ">" | "<" | "=" | ">=" | "<=" | "!=" | "<>"
  lazy val elem: Parser[String] = key <~ (logicOps ~ value)
  lazy val expr: Parser[List[String]] =
    "(" ~> repsep(elem, "and" | "or") <~ ")" | repsep(elem, "and" | "or")
  lazy val multiExpr: Parser[List[String]] =
    repsep(expr, "and" | "or") ^^ { _.foldLeft(List.empty[String])(_ ++ _) }
}
object KeyParser extends KeyParser {
  def parse(input: String) = parseAll(multiExpr, input)
}
Here is my test in Scala REPL
KeyParser.parse(expr1)
[1.72] failure: `$' expected but `>' found
KeyParser.parse(expr2)
[1.33] parsed: List(local_province)
KeyParser.parse(expr3)
[1.98] parsed: List(local_province, city, lib_name, lib_author)
I notice that the KeyParser only works for "=" and it doesn't support the case like "(local_province<>$province_name$ AND city!=$city_name$)" which contains "<> | !=" and "AND".
So I would like to know how to revise it.
I notice that the KeyParser only works for "="
This isn't quite true. It also works for !=, < and >. The ones it doesn't work for are >=, <= and <>.
More generally, it does not work for any operator that has one of its own prefixes listed before it among the alternatives. That is, >= is not matched because > appears before it and is a prefix of it.
So why does this happen? The | operator creates a parser that produces the result of the left parser if it succeeds or of the right parser otherwise. So if you have a chain of |s, you'll get the result of the first parser in that chain which can match the current input. So if the current input is <>$some_digit$, the parser logicOps will match < and leave you with >$some_digit$ as the remaining input. So now it tries to match value against that input and fails.
Why doesn't backtracking help here? Because the logicOps parser already succeeded, so there's nowhere to backtrack to. If the parser were structured like this:
lazy val logicOpAndValue = ">" ~ value | "<" ~ value | "=" ~ value |
">=" ~ value | "<=" ~ value | "!=" ~ value |
"<>" ~ value
lazy val elem: Parser[String] = key <~ logicOpAndValue
Then the following would happen (with the current input being <>$some_digit$):
">" does not match the current input, so go to next alternative
"<" does match the current input, so try the right operand of the ~ (i.e. value) with the current input >$some_digit$. This fails, so continue with the next alternative.
... bunch of alternatives that don't match ...
"<>" does match the current input, so try the right operand of the ~. This matches as well. Success!
However in your code the ~ value is outside of the list of alternatives, not inside each alternative. So when the parser fails, we're no longer inside any alternative, so there's no next alternative to try and it just fails.
Of course moving the ~ value inside the alternatives isn't really a satisfying solution as it's ugly as hell and not very maintainable in the general case.
One solution is simply to move the longer operators to the beginning of the alternatives (i.e. ">=" | "<=" | "<>" | ">" | "<" | ...). This way ">" and "<" will only be tried if ">=", "<=" and "<>" have already failed.
A still better solution, which does not rely on the order of alternatives and is thus less error-prone, is to use ||| instead of |. ||| works like | except that it tries all of the alternatives and then returns the longest successful result, not the first.
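The difference between first-match and longest-match choice can be sketched without the parser combinator library. Everything below (the type P and the helpers lit, firstOf and longestOf) is a toy model invented for this illustration; it only mimics how | and ||| pick among alternatives and is not the library's implementation.

```scala
object ChoiceDemo {
  // A toy parser: on success, returns the matched text and the remaining input.
  type P = String => Option[(String, String)]

  def lit(s: String): P =
    in => if (in.startsWith(s)) Some((s, in.drop(s.length))) else None

  // Ordered choice, modelled on `|`: the first alternative that succeeds wins.
  def firstOf(ps: Seq[P]): P =
    in => ps.iterator.map(p => p(in)).collectFirst { case Some(r) => r }

  // Longest match, modelled on `|||`: try every alternative and keep the one
  // that consumed the most input.
  def longestOf(ps: Seq[P]): P =
    in => ps.flatMap(p => p(in).toList).sortBy(r => -r._1.length).headOption

  // Same operator order as logicOps in the question.
  val ops: Seq[P] = Seq(">", "<", "=", ">=", "<=", "!=", "<>").map(s => lit(s))

  val ordered: P = firstOf(ops)
  val longest: P = longestOf(ops)
}
```

With the input <>$x$, ordered stops at "<" and leaves ">$x$" behind, which is the situation that makes value fail in the question, while longest consumes "<>".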
PS: This isn't related to your problem but you're currently limiting the nesting depth of parentheses because your grammar is not recursive. To allow unlimited nesting, you'll want your expr and multiExpr rules to look like this:
lazy val expr: Parser[List[String]] =
  "(" ~> multiExpr <~ ")" | elem
lazy val multiExpr: Parser[List[String]] =
  repsep(expr, "and" | "or") ^^ { _.foldLeft(List.empty[String])(_ ++ _) }
However I recommend renaming expr to something like primaryExpr and multiExpr to expr.
_.foldLeft(List.empty[String])(_ ++ _) can also be more succinctly expressed as _.flatten.
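A quick check of that equivalence on plain lists, using sample data shaped like the nested result the question's multiExpr rule produces before the ^^ step (the value names are made up for the sketch):

```scala
object FlattenDemo {
  // Sample nested result: one inner list per parenthesised group.
  val parsed: List[List[String]] =
    List(List("local_province", "city"), List("lib_name", "lib_author"))

  // The fold used in the question's grammar.
  val viaFold: List[String] = parsed.foldLeft(List.empty[String])(_ ++ _)

  // The shorter equivalent.
  val viaFlatten: List[String] = parsed.flatten
}
```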
Say I have the following code:
val a: List[(Int, String)] = List((1,"A"),(2,"B"),(3,"C"))
val b: List[String] = List("A","C","E")
I can do:
a.map{case (fst,snd) => (fst,snd + "a")}
a.filter{case (_,snd) => b.contains(snd)}
But why can't I do:
a.map((_._1,_._2 + "a"))
a.filter(b.contains(_._2))
Is there a way to accomplish this using underscore notation, or am I forced to use the case syntax here?
For the example:
a.map((_._1,_._2 + "a"))
Each placeholder (i.e. each underscore/_) introduces a new parameter in the argument expression.
To cite the Scala spec
An expression (of syntactic category Expr)
may contain embedded underscore symbols _ at places where identifiers
are legal. Such an expression represents an anonymous function where subsequent
occurrences of underscores denote successive parameters.
[...]
The anonymous functions in the left column use placeholder
syntax. Each of these is equivalent to the anonymous function on its right.
|---------------------------|----------------------------|
|`_ + 1` | `x => x + 1` |
|`_ * _` | `(x1, x2) => x1 * x2` |
|`(_: Int) * 2` | `(x: Int) => (x: Int) * 2` |
|`if (_) x else y` | `z => if (z) x else y` |
|`_.map(f)` | `x => x.map(f)` |
|`_.map(_ + 1)` | `x => x.map(y => y + 1)` |
You'll have to use the expanded forms when you need to use a given parameter more than once. So your example has to be rewritten as:
a.map(x => (x._1, x._2 + "a"))
For the example
a.filter(b.contains(_._2))
The problem is that you are effectively passing in an anonymous function to contains rather than filter, so you won't be able to use underscore notation here either. Instead you'll have to write
a.filter(x => b.contains(x._2))
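The following sketch puts the working forms side by side, using the lists from the question. The value names (firsts, product, suffixed, kept) are invented for the illustration.

```scala
object PlaceholderDemo {
  val a: List[(Int, String)] = List((1, "A"), (2, "B"), (3, "C"))
  val b: List[String] = List("A", "C", "E")

  // One underscore, used once: placeholder syntax is fine.
  val firsts: List[Int] = a.map(_._1)

  // Two underscores mean two parameters, as in the spec's `_ * _` row.
  val product: Int = List(1, 2, 3).reduce(_ * _)

  // The parameter is used twice, so it needs a name.
  val suffixed: List[(Int, String)] = a.map(x => (x._1, x._2 + "a"))

  // The function has to wrap the whole call to contains, so name the parameter.
  val kept: List[(Int, String)] = a.filter(x => b.contains(x._2))
}
```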
You can't do
a.map((_._1,_._2 + "a"))
because each _ introduces a new, separate parameter, so the two underscores never refer to the same element. The expression therefore does not expand to x => (x._1, x._2 + "a"), the one-parameter function that map expects, and the Scala compiler reports a compilation error.
In your second line of code
a.filter(b.contains(_._2))
the underscore is bound inside the call to contains, so the expression expands to b.contains(x => x._2). contains is handed a function rather than a String, and filter no longer receives a function at all.
To make it work you can write:
a.map(x => (x._1, x._2 + "a"))
a.filter(x => b.contains(x._2))
I am a little confused by the various uses of the 'block' {...} construct in Scala, especially when calling a higher-order function as in the following example.
def higherOrder(func: Int => Int): Int = {
  func(4)
}
val f = ((x: Int) => x*x)
Then I can call higherOrder like so:
(1) higherOrder(f), or
(2) higherOrder {f}, or
(3) higherOrder { x => x*x }
(1) is obvious, but I cannot wrap my head around how the syntax of (2) and (3) is parsed by the compiler.
Can somebody explain what (2) and (3) correspond to, with regard to the language specification?
See SLS 6.6 Function Applications. Function application is defined like this:
SimpleExpr ::= SimpleExpr1 ArgumentExprs
ArgumentExprs ::= ‘(’ [Exprs] ‘)’
...
| [nl] BlockExpr
And BlockExpr is
BlockExpr ::= ‘{’ CaseClauses ‘}’
| ‘{’ Block ‘}’
So after a function or method name you can specify either an argument list in parentheses or a block expression in braces.
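Applied to the three calls from the question, that reading looks like this (the names r1, r2 and r3 are made up for the sketch):

```scala
object HigherOrderDemo {
  def higherOrder(func: Int => Int): Int = func(4)

  val f: Int => Int = x => x * x

  // (1) ArgumentExprs in the form '(' [Exprs] ')'.
  val r1: Int = higherOrder(f)

  // (2) ArgumentExprs as a BlockExpr; the block's result expression is f.
  val r2: Int = higherOrder { f }

  // (3) ArgumentExprs as a BlockExpr; the block's result is a function literal.
  val r3: Int = higherOrder { x => x * x }
}
```

In all three cases higherOrder receives one argument of type Int => Int; only the delimiters around it differ.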
I was working on a project last night, and had some code like this:
/* fixes warnings in 2.10 */
import scala.language.implicitConversions
/* had some kind of case class with some members */
case class Wrapper[A](x: A)
/* for convenience's sake */
implicit def aToWrapper[A](x: A) = Wrapper(x)
/* had some kind of Seq */
val xs: List[Wrapper[String]] = List("hello", "world", "this", "is", "a", "test")
I then accidentally wrote:
xs foldLeft("") { (a, x) => a + x }
having left the . out between xs and foldLeft.
The types in question were a little more complex, and it asked me to annotate the lambda's parameter types, so I quickly did, thinking that was the source of my mistake. I ended up with something like this:
xs foldLeft("") { (a: String, x: Wrapper[String]) => a + x }
At this point I was receiving the error:
<console>:13: error: type mismatch;
found : (String, Wrapper[String]) => String
required: Int
xs foldLeft("") { (a: String, x: Wrapper[String]) => a + x }
^
Obviously the fix was xs.foldLeft("") ..., but I have been wondering why the compiler expects an Int in this case. Could anyone illuminate how this is being parsed? This has been nagging me all day.
When you leave out the dot and the parentheses, you use the so-called infix notation. It allows you to write a + b instead of a.+(b). An important rule here is that this is only allowed if a call is of the form object method paramlist (see SLS 6.12.3):
The right-hand operand of a left-associative operator may consist of several arguments enclosed in parentheses, e.g. e op (e1, ..., en). This expression is then interpreted as e.op(e1, ..., en).
foldLeft doesn't fit into this form, it uses object method paramlist1 paramlist2. Thus if you write this in operator notation the compiler treats it as object.method(paramlist1).paramlist2 (as described in SLS 6.12.2):
A postfix operator can be an arbitrary identifier. The postfix operation e op is interpreted as e.op.
But there is another rule applied here: Function Applications (SLS 6.6).
An application f(e1, ..., em) applies the function f to the argument expressions
e1, ..., em.
[...]
If f has some value type, the application is taken to be equivalent to f.apply(e1, ..., em), i.e. the application of an apply method defined by f.
Here we go:
scala> { (a, x) => a + x }
<console>:12: error: missing parameter type
{ (a, x) => a + x }
^
<console>:12: error: missing parameter type
{ (a, x) => a + x }
^
This is just a function literal that is missing its parameter types. If we add them everything compiles fine:
scala> { (a: String, x: Wrapper[String]) => a + x }
res6: (String, Wrapper[String]) => String = <function2>
The compiler just applies the rule about function applications described above:
scala> "" { (a: String, x: Wrapper[String]) => a + x }
<console>:13: error: type mismatch;
found : (String, Wrapper[String]) => String
required: Int
"" { (a: String, x: Wrapper[String]) => a + x }
^
scala> "" apply { (a: String, x: Wrapper[String]) => a + x }
<console>:13: error: type mismatch;
found : (String, Wrapper[String]) => String
required: Int
"" apply { (a: String, x: Wrapper[String]) => a + x }
^
Thus, your code is interpreted as
scala> xs foldLeft ("").apply{ (a: String, x: Wrapper[String]) => a + x }
<console>:14: error: type mismatch;
found : (String, Wrapper[String]) => String
required: Int
xs foldLeft ("").apply{ (a: String, x: Wrapper[String]) => a + x }
^
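The Int in these messages is consistent with the apply method that strings offer, which takes an index and returns the character at that position. A minimal sketch of the f(e) to f.apply(e) rule on a string, with value names invented for the illustration:

```scala
object ApplyDemo {
  val s = "hello"

  // f(e) on a value is shorthand for f.apply(e); on a String, apply takes an Int index.
  val viaSugar: Char = s(1)
  val viaApply: Char = s.apply(1)

  // A block expression is accepted as the single argument as well.
  val viaBlock: Char = s { 1 }
}
```

Putting a function literal where the index goes is what triggers "required: Int".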
But why does it apply the function application rule? It would also be possible to apply the function literal as a postfix operator. To find out why we get the error message shown, we need to look at the SLS Scala Syntax Summary. There we can see the following:
InfixExpr ::= PrefixExpr
| InfixExpr id [nl] InfixExpr
PrefixExpr ::= [‘-’ | ‘+’ | ‘~’ | ‘!’] SimpleExpr
SimpleExpr ::= ‘new’ (ClassTemplate | TemplateBody)
| BlockExpr
| SimpleExpr1 [‘_’]
SimpleExpr1 ::= Literal
| Path
| ‘_’
| ‘(’ [Exprs] ‘)’
| SimpleExpr ‘.’ id
| SimpleExpr TypeArgs
| SimpleExpr1 ArgumentExprs
| XmlExpr
Exprs ::= Expr {‘,’ Expr}
ArgumentExprs ::= ‘(’ [Exprs] ‘)’
| ‘(’ [Exprs ‘,’] PostfixExpr ‘:’ ‘_’ ‘*’ ‘)’
| [nl] BlockExpr
From the sections described above we know that ArgumentExprs describes function application while InfixExpr describes an infix expression. Because of the rules of EBNF, the topmost rule has the lowest precedence. And because the former rule is called by the latter one, the function literal is applied before the infix expression is formed, hence the error message.
I believe you can only use binary operators in infix notation. I think you can also remedy the situation by using parentheses: (xs foldLeft ("")) { (a: String, x: Wrapper[String]) => a + x }.
Probably, it parses your original code as xs.foldLeft("").{ (a: String, x: Wrapper[String]) => a + x }. Check out this answer: When to use parenthesis in Scala infix notation
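A minimal sketch of the two working spellings, under a few assumptions: the implicit conversion from the question is left out, the list is shortened, and the lambda concatenates the wrapped strings via x.x, which is a guess at the intent rather than the original code.

```scala
object FoldLeftDemo {
  case class Wrapper[A](x: A)

  val xs: List[Wrapper[String]] =
    List("hello", "world").map(s => Wrapper(s))

  // Method-call syntax: both parameter lists belong to foldLeft.
  val withDot: String = xs.foldLeft("") { (a, x) => a + x.x }

  // Infix syntax, parenthesised so that the block is applied to the result of
  // `xs foldLeft ""` rather than to the string literal.
  val withParens: String = (xs foldLeft "") { (a, x) => a + x.x }
}
```

Newer compiler versions may warn about using foldLeft in infix position, so the dotted form is the safer habit.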