What is the meaning and purpose of the syntactic sugar in this code?
def exp: Parser[Expr] = operands ~ binOp ~ exp ^^ {case e1~o~e2=>BinaryOp(o,e1,e2)}
In particular, what do each of these expressions mean?
operands ~ binOp ~ exp ^^
e1~o~e2
operands ~ binOp ~ exp ^^ ...
Operators in Scala are just ordinary method calls:
operands ~ binOp ~ exp ^^ ...
is the same as
operands.~(binOp).~(exp).^^(...)
You can see the documentation for the ~ and ^^ methods here, or you should be able to click through to them in your IDE.
case e1~o~e2
This is matching a case class called ~. Lots of two-parameter things can be written in this "infix notation" in Scala. Infix patterns associate to the left, just like the ~ method calls that built the value, so it's equivalent to:
case ~(~(e1, o), e2)
(see the documentation on case classes)
Those don't have any special meaning - they're just methods that are called ~ and ^^. You'll need to look at the documentation/implementation of whichever library you're using that defined them to figure out what they do.
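A minimal sketch of both points, using only the standard library. The `~` case class below is a hypothetical stand-in that mirrors the shape of the library's class, so the snippet runs without any dependency; `TildeDemo`, `parsed`, `infix` and `prefix` are names made up for illustration.

```scala
object TildeDemo {
  // Hypothetical stand-in for the parser library's `~` case class.
  final case class ~[+A, +B](_1: A, _2: B)

  // An operator is an ordinary method call: both sides are the same expression.
  val sameCall: Boolean = (1 + 2) == (1).+(2)

  // The kind of value that `operands ~ binOp ~ exp` produces: nested to the left.
  val parsed: String ~ String ~ String = new ~(new ~("1", "+"), "2")

  // Infix pattern, as written in the question.
  def infix(v: String ~ String ~ String): String = v match {
    case e1 ~ o ~ e2 => s"$e1 $o $e2"
  }

  // The same pattern written in prefix form.
  def prefix(v: String ~ String ~ String): String = v match {
    case ~(~(e1, o), e2) => s"$e1 $o $e2"
  }
}
```

Both `TildeDemo.infix(TildeDemo.parsed)` and `TildeDemo.prefix(TildeDemo.parsed)` produce the string `1 + 2`.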
What is the significance of ordered choice? Does it simply mean that you put the longest pattern match first?
Let's say you had this expression:
val expr = "eat" ~ "more" ~ "beans" |
"eat" ~ "more" ~ "beans" ~ "and" ~ "fruit"
Since parser combinators use ordered choice, would the string eat more beans and soup ... result in a match on the first line? Does val expr use ordered choice poorly because it puts the less specific expression first?
Also, what is left recursion?
Scala parser combinators implement parsing expression grammars (PEGs). A PEG is predicated on the availability of unlimited lookahead and backtracking, which makes grammars easier to express because it is not necessary to make a unilateral decision at any single point in the parsing process.
Ordered choice (alternation) can be considered the primary enabler of this behavior: it is a production under which a list of alternatives is tried in order, accepting the first one that matches the input. In your example the second alternative will never be matched, because any input that matches it is already accepted by the first alternative.
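A small sketch of that effect, assuming the scala-parser-combinators module is on the classpath. `OrderedChoiceDemo`, `shortForm`, `longForm`, `shortFirst` and `longFirst` are hypothetical names used only for this illustration.

```scala
import scala.util.parsing.combinator.RegexParsers

object OrderedChoiceDemo extends RegexParsers {
  def shortForm: Parser[Any] = "eat" ~ "more" ~ "beans"
  def longForm: Parser[Any]  = "eat" ~ "more" ~ "beans" ~ "and" ~ "fruit"

  // Less specific alternative first: the long form can never be chosen.
  def shortFirst: Parser[Any] = shortForm | longForm
  // More specific alternative first: the short form is the fallback.
  def longFirst: Parser[Any] = longForm | shortForm

  val input = "eat more beans and fruit"

  // shortFirst succeeds on a prefix of the input and stops there ...
  val prefixOnly: Boolean = parse(shortFirst, input).successful
  // ... so requiring the whole input to be consumed fails,
  val shortFirstAll: Boolean = parseAll(shortFirst, input).successful
  // while trying the longer alternative first consumes everything.
  val longFirstAll: Boolean = parseAll(longFirst, input).successful
}
```

Here `prefixOnly` and `longFirstAll` are true and `shortFirstAll` is false, which is why the more specific alternative usually goes first.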
Left recursion occurs when, given a production of the form a = b, an expansion of b begins with a again, so the parser re-enters a without having consumed any input. Consider:
def a = b ~ c
def b = a ~ c
Expansion (matching) of the production a proceeds as follows:
b ~ c
(a ~ c) ~ c // substituting b
((b ~ c) ~ c) ~ c // substituting a
(((a ~ c) ~ c) ~ c) ~ c // substituting b
This is effectively infinite, unterminated recursion.
I am working on a simple expression parser. However, given the parser combinator declarations below, I can't seem to pass my tests, and a right-associative tree keeps popping up.
def EXPR:Parser[E] = FACTOR ~ rep(SUM|MINUS) ^^ {case a~b => (a /: b)((acc,f) => f(acc))}
def SUM:Parser[E => E] = "+" ~ EXPR ^^ {case "+" ~ b => Sum(_, b)}
def MINUS:Parser[E => E] = "-" ~ EXPR ^^ {case "-" ~ b => Diff(_, b)}
I've been debugging this for hours. I hope someone can help me figure out why it's not coming out right.
"5-4-3" would yield a tree that evaluates to 4 instead of the expected -2.
What is wrong with the grammar above?
I don't work with Scala, but I do work with F# parser combinators and also needed associativity with infix operators. While I am sure you can parse 5-4 or 2+3, the problem comes in with a sequence of two or more operators of the same precedence, e.g. 5-4-2 or 2+3+5. The problem won't show up with addition because (2+3)+5 = 2+(3+5), but (5-4)-2 <> 5-(4-2), as you know.
See: Monadic Parser Combinators 4.3 Repetition with meaningful separators. Note: The separators are the operators such as "+" and "*" and not whitespace or commas.
See: Functional Parsers Look for the chainl and chainr parsers in section 7. More parser combinators.
For example, in arithmetical expressions, where the operators that
separate the subexpressions have to be part of the parse tree. For
this case we will develop the functions chainr and chainl. These
functions expect that the parser for the separators yields a function
(!);
The function f should operate on an element and a list of tuples, each
containing an operator and an element. For example, f(e0, [(⊕₁, e1),
(⊕₂, e2), (⊕₃, e3)]) should return ((e0 ⊕₁ e1) ⊕₂ e2) ⊕₃ e3. You may
recognize a version of foldl in this (albeit an uncurried one), where
a tuple (⊕, y) from the list and intermediate result x are combined
applying x ⊕ y.
You need a fold function in the semantic part of the parser, i.e. the part that converts the tokens from the syntactic parser into the output of the parser. In your code I believe it is this part:
{case a~b => (a /: b)((acc,f) => f(acc))}
Sorry I can't do better as I don't use Scala.
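For what it's worth, the fold described in the quoted passage can be sketched in plain Scala without any parser library. Everything here is an assumption made for illustration: `chainLeft`, `Num` and `eval` are hypothetical names, and `E` and `Diff` only mimic the shapes used in the question.

```scala
object ChainLeft {
  sealed trait E
  final case class Num(value: Int) extends E         // assumed leaf node
  final case class Diff(left: E, right: E) extends E

  // Combine a first element with (operator, element) pairs from left to right,
  // i.e. ((e0 op1 e1) op2 e2) op3 e3, which is an uncurried foldLeft.
  def chainLeft[A](first: A, rest: List[((A, A) => A, A)]): A =
    rest.foldLeft(first) { case (acc, (op, next)) => op(acc, next) }

  def eval(e: E): Int = e match {
    case Num(v)     => v
    case Diff(l, r) => eval(l) - eval(r)
  }

  val minus: (E, E) => E = Diff(_, _)

  // 5 - 4 - 3 as a first element plus two (operator, element) pairs.
  val tree: E = chainLeft[E](Num(5), List((minus, Num(4)), (minus, Num(3))))
}
```

With these definitions `ChainLeft.tree` is `Diff(Diff(Num(5), Num(4)), Num(3))`, and evaluating it gives -2.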
"-" ~ EXPR ^^ {case "-" ~ b => Diff(_, b)}
for 5-4-3, it expands to
Diff(5, 4-3)
which is
Diff(5, Diff(4, 3))
however, what you need is:
Diff(Diff(5, 4), 3)
// for 5 + 4 - 3 it should be
Diff(Sum(5, 4), 3)
You need to build the tree from left to right, for example with a stack or a left fold.
It seems that using "+" ~ EXPR (and likewise "-" ~ EXPR) made the answer incorrect. The right-hand operand should have been FACTOR instead.
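A sketch of the grammar with that change applied, assuming the scala-parser-combinators module is on the classpath. The question does not show `FACTOR` or the tree type, so `Num`, the whole-number `FACTOR` and `eval` below are hypothetical placeholders.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

object LeftAssoc extends JavaTokenParsers {
  sealed trait E
  final case class Num(value: Int) extends E          // assumed leaf node
  final case class Sum(left: E, right: E) extends E
  final case class Diff(left: E, right: E) extends E

  // One FACTOR followed by any number of "+ FACTOR" / "- FACTOR" steps,
  // folded from the left so that 5-4-3 becomes (5-4)-3.
  def EXPR: Parser[E] = FACTOR ~ rep(SUM | MINUS) ^^ {
    case first ~ steps => steps.foldLeft(first)((acc, f) => f(acc))
  }
  // The right operand is a FACTOR, not an EXPR, so it cannot swallow the rest.
  def SUM: Parser[E => E]   = "+" ~> FACTOR ^^ (b => (a: E) => Sum(a, b))
  def MINUS: Parser[E => E] = "-" ~> FACTOR ^^ (b => (a: E) => Diff(a, b))
  // Assumption: a factor is just a whole number in this sketch.
  def FACTOR: Parser[E] = wholeNumber ^^ (s => Num(s.toInt))

  def eval(e: E): Int = e match {
    case Num(v)     => v
    case Sum(l, r)  => eval(l) + eval(r)
    case Diff(l, r) => eval(l) - eval(r)
  }
}
```

Parsing "5-4-3" with `LeftAssoc.parseAll(LeftAssoc.EXPR, "5-4-3").get` yields `Diff(Diff(Num(5), Num(4)), Num(3))`, which evaluates to -2.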
I'm seeing very strange behaviour when I run Play Framework in Scala. I use Anorm as the database access layer, so I started writing some code and saw very strange Scala compiler behaviour.
This is code which is working:
case class P_Page_Control(Control_ID:Int,
Client_ID:String,
cContent: String,
Page_ID: Int,
Language_ID: Int,
InsertTime: Date,
ChangeTime: Option[Date],
IsDeleted: Boolean)
and:
object P_Page_Control { val parser = {
get[Int]("Control_ID") ~
get[String]("Client_ID") ~
get[String]("Content") ~
get[Int]("Page_ID") ~
get[Int]("Language_ID") ~
get[Date]("InsertTime") ~
get[Option[Date]]("ChangeTime") ~
get[Boolean]("IsDeleted") map {
case a ~ b ~ c ~ d ~ e ~ f ~ g ~ h =>
P_Page_Control(a, b, c, d, e, f, g, h)
}}}
Up to this point there are no compilation errors. It works fine.
But when I change a property name I get errors:
object P_Page_Control { val parser = {
get[Int]("Control_ID") ~
get[String]("Client_ID") ~
get[String]("Content") ~
get[Int]("Page_ID") ~
get[Int]("Language_ID") ~
get[Date]("InsertTime") ~
get[Option[Date]]("ChangeTime") ~
get[Boolean]("IsDeleted") map {
case A_B ~ b ~ c ~ d ~ e ~ f ~ g ~ h =>
P_Page_Control(A_B, b, c, d, e, f, g, h)
}}}
As I'm totally new to Scala, I thought _ was some magic keyword or other magic stuff.
So I changed the property name to aBB_AccAd, but there were no compilation errors.
Oops...
Next funny thing: I renamed it to AAbbdddsadasdasAAFFFFeeee and I got compilation errors again.
So what makes Scala throw a compilation error for some names?
Is this a bug or a feature? :-)
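Names in patterns that start with a capital letter are not treated as new variables. They are interpreted as references to an existing value or object (a stable identifier that the matched value is compared against), or to an extractor (an object with an unapply or unapplySeq method) when they are followed by arguments. Since you haven't declared anything with that name, Scala can't find it and reports an error. Names that start with a lowercase letter, such as aBB_AccAd, simply bind a new variable.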
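A minimal illustration of the naming rule, using only the standard library. `PatternNames`, `Limit` and the three methods are hypothetical names made up for this sketch.

```scala
object PatternNames {
  val Limit = 10

  // Lowercase name: a fresh variable that matches anything and binds it.
  def lower(x: Int): String = x match {
    case limit => s"bound to $limit"
  }

  // Uppercase name: refers to the existing value Limit and compares against it.
  // Without the `val Limit` above, this would not compile.
  def upper(x: Int): String = x match {
    case Limit => "equal to Limit"
    case other => s"something else: $other"
  }

  // Backticks force a lowercase name to be treated as an existing value too.
  def backticks(x: Int, limit: Int): String = x match {
    case `limit` => "equal to limit"
    case _       => "different"
  }
}
```

For example, `PatternNames.lower(3)` binds 3 to the new variable, while `PatternNames.upper(3)` falls through to the second case because 3 is not equal to `Limit`.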
I am working on a Parsing logic that needs to take operator precedence into consideration. My needs are not too complex. To start with I need multiplication and division to take higher precedence than addition and subtraction.
For example: 1 + 2 * 3 should be treated as 1 + (2 * 3). This is a simple example but you get the point!
[There are a couple more custom tokens that I need to add to the precedence logic, which I may be able to add based on the suggestions I receive here.]
Here is one example of dealing with operator precedence: http://jim-mcbeath.blogspot.com/2008/09/scala-parser-combinators.html#precedencerevisited.
Are there any other ideas?
This is a bit simpler than Jim McBeath's example, but it does what you say you need, i.e. correct arithmetic precedence, and it also allows for parentheses. I adapted the example from Programming in Scala to get it to actually do the calculation and provide the answer.
It should be quite self-explanatory. There is a hierarchy formed by saying an expr consists of terms interspersed with operators, terms consist of factors with operators, and factors are floating point numbers or expressions in parentheses.
import scala.util.parsing.combinator.JavaTokenParsers
class Arith extends JavaTokenParsers {
type D = Double
def expr: Parser[D] = term ~ rep(plus | minus) ^^ {case a~b => b.foldLeft(a)((acc,f) => f(acc))}
def plus: Parser[D=>D] = "+" ~ term ^^ {case "+"~b => _ + b}
def minus: Parser[D=>D] = "-" ~ term ^^ {case "-"~b => _ - b}
def term: Parser[D] = factor ~ rep(times | divide) ^^ {case a~b => b.foldLeft(a)((acc,f) => f(acc))}
def times: Parser[D=>D] = "*" ~ factor ^^ {case "*"~b => _ * b }
def divide: Parser[D=>D] = "/" ~ factor ^^ {case "/"~b => _ / b}
def factor: Parser[D] = fpn | "(" ~> expr <~ ")"
def fpn: Parser[D] = floatingPointNumber ^^ (_.toDouble)
}
object Main extends Arith with App {
val input = "(1 + 2 * 3 + 9) * 2 + 1"
println(parseAll(expr, input).get) // prints 33.0
}
I'm doing Cay Horstmann's combinator parser exercises, and I wonder about the best way to distinguish between strings that represent numbers and strings that represent variables in a match statement:
def factor: Parser[ExprTree] = (wholeNumber | "(" ~ expr ~ ")" | ident) ^^ {
case a: wholeNumber => Number(a.toInt)
case a: String => Variable(a)
}
The second line there, "case a: wholeNumber" is not legal. I thought about a regexp, but haven't found a way to get it to work with "case".
I would split it up a bit and push the case analysis into the |. This is one of the advantages of combinators and really LL(*) parsing in general:
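def factor: Parser[ExprTree] = ( wholeNumber ^^ { s => Number(s.toInt) }
| "(" ~> expr <~ ")"
| ident ^^ { Variable(_) } )
I apologize if you're not familiar with the underscore syntax. Basically it just means "substitute the nth parameter to the enclosing function value". Thus { Variable(_) } is equivalent to { x => Variable(x) }. The placeholder expands at the innermost enclosing expression, so Number(_.toInt) would mean Number(x => x.toInt) and would not type-check, which is why the first alternative names its parameter explicitly.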
Another bit of syntax magic here is the ~> and <~ operators in place of ~. These operators mean that the parser still has to match both parentheses, but the result is determined solely by the result of expr. Thus, "(" ~> expr <~ ")" matches exactly the same input as "(" ~ expr ~ ")", but it doesn't require the extra case analysis to retrieve the inner result value from expr.
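To make the comparison concrete, here is a self-contained sketch, assuming the scala-parser-combinators module is on the classpath. The question does not show `ExprTree`, `Number`, `Variable` or `expr`, so the definitions below are hypothetical placeholders, and `expr` is reduced to a single factor to keep the example short.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

object FactorDemo extends JavaTokenParsers {
  sealed trait ExprTree
  final case class Number(value: Int) extends ExprTree
  final case class Variable(name: String) extends ExprTree

  // Placeholder: a real grammar would put terms and operators here.
  def expr: Parser[ExprTree] = factor

  // Each alternative builds its own tree node, so no type test is needed.
  def factor: Parser[ExprTree] =
    wholeNumber ^^ (s => Number(s.toInt)) |
    "(" ~> expr <~ ")" |
    ident ^^ (name => Variable(name))

  // The same parenthesised case written with plain ~, which then needs a
  // pattern match to throw away the two parentheses.
  def parenVerbose: Parser[ExprTree] =
    "(" ~ expr ~ ")" ^^ { case _ ~ e ~ _ => e }
}
```

With these definitions, `FactorDemo.parseAll(FactorDemo.factor, "(42)").get` is `Number(42)`, and `parenVerbose` returns the same inner result for parenthesised input.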