I'm reading the sbt documentation on Commands and wondering what ^^^ and ~> mean.
I tried to google them but found nothing; I guess Google strips these characters. Thanks a lot.
// Demonstration of a custom parser.
// The command changes the foreground or background terminal color
// according to the input.
lazy val change = Space ~> (reset | setColor)
lazy val reset = token("reset" ^^^ "\033[0m")
lazy val color = token( Space ~> ("blue" ^^^ "4" | "green" ^^^ "2") )
lazy val select = token( "fg" ^^^ "3" | "bg" ^^^ "4" )
lazy val setColor = (select ~ color) map { case (g, c) => "\033[" + g + c + "m" }
def changeColor = Command("color")(_ => change) { (state, ansicode) =>
print(ansicode)
state
}
The full code is the example project/CommandExample.scala at http://www.scala-sbt.org/0.13/docs/Commands.html
Those are methods on the RichParser class.
See http://www.scala-sbt.org/0.13/api/#sbt.complete.RichParser
Hint: to look up symbolic methods, click the '#' in the upper left corner of the API doc page.
^^^[B](value: B): Parser[B]: Apply the original Parser, but provide value as the result if it succeeds.
~>[B](b: Parser[B]): Parser[B]: Produces a Parser that applies the original Parser and then applies next (in order), discarding the result of the original parser.
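To make the two descriptions concrete, here is a minimal sketch of a toy parser type with the same two operators. It is not sbt's implementation: the names P, lit, space and color are made up for illustration, and the toy parser simply consumes a prefix of a String.

```scala
// Toy parser: consume a prefix of the input, return the value and the rest.
// NOT sbt's RichParser, only an illustration of what ^^^ and ~> do.
object OperatorSketch {
  final case class P[A](run: String => Option[(A, String)]) {
    // ^^^ : succeed or fail like this parser, but replace its result with `value`
    def ^^^[B](value: B): P[B] =
      P(in => run(in).map { case (_, rest) => (value, rest) })

    // ~> : run this parser, then `next`; keep only the result of `next`
    def ~>[B](next: P[B]): P[B] =
      P(in => run(in).flatMap { case (_, rest) => next.run(rest) })

    // | : try this parser, fall back to `other` if it fails
    def |(other: P[A]): P[A] =
      P(in => run(in).orElse(other.run(in)))
  }

  // Parser that matches a literal prefix and returns it
  def lit(s: String): P[String] =
    P(in => if (in.startsWith(s)) Some((s, in.drop(s.length))) else None)

  val space: P[String] = lit(" ")

  // Same shape as `color` in the question: skip a space, then map a name to a code
  val color: P[String] = space ~> (lit("blue") ^^^ "4" | lit("green") ^^^ "2")

  def main(args: Array[String]): Unit = {
    println(color.run(" blue")) // the space is discarded by ~>, "blue" becomes "4"
    println(color.run(" red"))  // None: neither alternative matches
  }
}
```

Here `color.run(" blue")` yields the pair ("4", "") wrapped in Some: the matched text "blue" is thrown away by ^^^ and the leading space by ~>.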
I am starting to use the state monad to clean up my code. I have got it working for my problem where I process a transaction called CDR and modify the state accordingly.
It is working perfectly fine for individual transactions, using this function to perform the state update.
def addTraffic(cdr: CDR): Network => Network = ...
Here is an example:
scala> val processed: (CDR) => State[Network, Long] = cdr =>
| for {
| m <- init
| _ <- modify(Network.addTraffic(cdr))
| p <- get
| } yield p.count
processed: CDR => scalaz.State[Network,Long] = $$Lambda$4372/1833836780#1258d5c0
scala> val r = processed(("122","celda 1", 3))
r: scalaz.State[Network,Long] = scalaz.IndexedStateT$$anon$13#4cc4bdde
scala> r.run(Network.empty)
res56: scalaz.Id.Id[(Network, Long)] = (Network(Map(122 -> (1,0.0)),Map(celda 1 -> (1,0.0)),Map(1 -> Map(1 -> 3)),1,true),1)
What I want to do now is chain a number of transactions from an iterator. I have found something that works quite well, but there the state transitions take no inputs (the state changes through the RNG):
import scalaz._
import scalaz.std.list.listInstance
type RNG = scala.util.Random
val f = (rng:RNG) => (rng, rng.nextInt)
val intGenerator: State[RNG, Int] = State(f)
val rng42 = new scala.util.Random
val applicative = Applicative[({type l[Int] = State[RNG,Int]})#l]
// To generate the first 5 Random integers
val chain: State[RNG, List[Int]] = applicative.sequence(List.fill(5)(intGenerator))
val chainResult: (RNG, List[Int]) = chain.run(rng42)
chainResult._2.foreach(println)
I have tried to adapt this without success: I cannot get the type signatures to match, because my state function requires the cdr (transaction) as input.
Thanks
TL;DR
You can use traverse from the Traverse type-class on a collection (e.g. List) of CDRs, with a function of this signature: CDR => State[Network, Long]. The result will be a State[Network, List[Long]]. Alternatively, if you don't care about the List[Long], you can use traverse_ instead, which returns State[Network, Unit]. Finally, if you want to aggregate the results T as they come along, and T forms a Monoid, you can use foldMap from Foldable, which returns State[Network, T], where T is the combined (i.e. folded) result of all the Ts in your chain.
A code example
Now some more details, with code examples. I will answer this using Cats State rather than Scalaz, as I never used the latter, but the concept is the same and, if you still have problems, I will dig out the correct syntax.
Assume that we have the following data types and imports to work with:
import cats.implicits._
import cats.data.State
case class Position(x : Int = 0, y : Int = 0)
sealed trait Move extends Product
case object Up extends Move
case object Down extends Move
case object Left extends Move
case object Right extends Move
Position represents a point in a 2D plane, and a Move moves that point up, down, left or right.
Now, let's create a method that will allow us to see where we are at a given time:
def whereAmI : State[Position, String] = State.inspect{ s => s.toString }
and a method to change our position, given a Move:
def move(m : Move) : State[Position, String] = State{ s =>
m match {
case Up => (s.copy(y = s.y + 1), "Up!")
case Down => (s.copy(y = s.y - 1), "Down!")
case Left => (s.copy(x = s.x - 1), "Left!")
case Right => (s.copy(x = s.x + 1), "Right!")
}
}
Notice that this will return a String, with the name of the move followed by an exclamation mark. This is just to simulate the type change from Move to something else, and show how the results will be aggregated. More on this in a bit.
Now let's try to play with our methods:
val positions : State[Position, List[String]] = for{
pos1 <- whereAmI
_ <- move(Up)
_ <- move(Right)
_ <- move(Up)
pos2 <- whereAmI
_ <- move(Left)
_ <- move(Left)
pos3 <- whereAmI
} yield List(pos1,pos2,pos3)
And we can feed it an initial Position and see the result:
positions.runA(Position()).value // List(Position(0,0), Position(1,2), Position(-1,2))
(you can ignore the .value there, it's a quirk due to the fact that State[S,A] is really just an alias for StateT[Eval,S,A])
As you can see, this behaves as you would expect, and you can create different "blueprints" (e.g. sequences of state modifications), which will be applied once an initial state is provided.
Now, to actually answer your question: say we have a List[Move] and we want to apply the moves sequentially to an initial state and get the result. For that we use traverse from the Traverse type-class.
val moves = List(Down, Down, Left, Up)
val result : State[Position, List[String]] = moves.traverse(move)
result.run(Position()).value // (Position(-1,-1),List(Down!, Down!, Left!, Up!))
Alternatively, should you not need the A at all (the List in your case), you can use traverse_ instead of traverse, and the result type will be State[Position, Unit]:
val result_ : State[Position, Unit] = moves.traverse_(move)
result_.run(Position()).value // (Position(-1,-1),())
Finally, if your A type in State[S,A] forms a Monoid, then you could also use foldMap from Foldable to combine (e.g. fold) all As as they are calculated. A trivial example (probably useless, because this will just concatenate all Strings) would be this:
val result : State[Position,String] = moves.foldMap(move)
result.run(Position()).value // (Position(-1,-1),Down!Down!Left!Up!)
Whether this final approach is useful to you really depends on what A you have and whether it makes sense to combine it.
And this should be all you need in your scenario.
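To show what traverse does for State without depending on Cats or Scalaz, here is a hand-rolled sketch using only the standard library. The State class, the traverse helper, and the CDR and Network shapes are all stand-ins invented for this example; the real Network from the question is richer.

```scala
object StateSketch {
  // Minimal State: a function from a state S to (new state, result A).
  final case class State[S, A](run: S => (S, A)) {
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State[S, B] { s =>
        val (s1, a) = run(s)
        f(a).run(s1)
      }
    def map[B](f: A => B): State[S, B] =
      State[S, B] { s =>
        val (s1, a) = run(s)
        (s1, f(a))
      }
  }

  // What traverse does for State: thread the state through each step
  // from left to right and collect the results in order.
  def traverse[S, A, B](as: List[A])(f: A => State[S, B]): State[S, List[B]] =
    as.foldRight(State[S, List[B]](s => (s, Nil))) { (a, acc) =>
      f(a).flatMap(b => acc.map(b :: _))
    }

  type CDR = (String, String, Int)   // (caller, cell, units): hypothetical shape
  type Network = Map[String, Long]   // units per caller: stand-in for the real Network

  def addTraffic(cdr: CDR): Network => Network =
    net => net.updated(cdr._1, net.getOrElse(cdr._1, 0L) + cdr._3)

  // Takes the transaction as input and returns a state transition,
  // the same shape as `processed` in the question.
  def processed(cdr: CDR): State[Network, Long] =
    State[Network, Long] { net =>
      val next = addTraffic(cdr)(net)
      (next, next.values.sum)
    }

  val cdrs: List[CDR] =
    List(("122", "celda 1", 3), ("122", "celda 2", 4), ("555", "celda 1", 1))

  // One State value for the whole chain, run once against an empty network.
  val (finalNetwork, totals) = traverse(cdrs)(processed).run(Map.empty)
}
```

With these inputs, totals holds the running sum after each transaction (3, 7, 8) and finalNetwork maps "122" to 7 and "555" to 1. The library versions of traverse generalise the same idea to any Traverse instance and any Applicative.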
Problem
I want to parse line like this:
fieldName: value1|value2 anotherFieldName: value3 valueWithoutFieldName
into
List(Some("fieldName") ~ List("value1", "value2"), Some("anotherFieldName") ~ List("value3"), None~List("valueWithoutFieldName"))
Alternative field values are separated by a pipe (|). The field name is optional. If a field has no name, it should be parsed as None (see valueWithoutFieldName).
My current (not working) solution
This is what I have so far:
val parser: Parser[ParsedQuery] = {
phrase(rep(opt(fieldNameTerm) ~ (multipleValueTerm | singleValueTerm))) ^^ {
case termSequence =>
// operate on List[Option[String] ~ List[String]]
}
}
val fieldNameTerm: Parser[String] = {
("\\w+".r <~ ":(?=\\S)".r) ^^ {
case fieldName => fieldName
}
}
val multipleValueTerm = rep1((singleValueTerm <~ alternativeValueTerm) | (alternativeValueTerm ~> singleValueTerm))
val alternativeValueTerm: Parser[String] = {
// matches '|'
("\\|".r) ^^ {
case token => token
}
}
val singleValueTerm: Parser[String] = {
// all non-whitespace characters except '|'
("[\\S&&[^\\|]]+".r) ^^ {
case token => token
}
}
Unfortunately, my code does not parse the last field value (the one after the final pipe) correctly and treats it as the value of a new nameless field. For instance:
The following string:
"abc:111|222|333|444 cde:555"
is parsed into:
List((Some(abc)~List(111, 222, 333)), (None~444), (Some(cde)~555))
while I'd like it to be:
List((Some(abc)~List(111, 222, 333, 444)), (Some(cde)~555))
My suspicions
I think that the problem lies in the definition of multipleValueTerm:
rep1((singleValueTerm <~ alternativeValueTerm) | (alternativeValueTerm ~> singleValueTerm))
Its second part is probably not interpreted correctly, but I have no idea why.
Shouldn't <~ in the first part of multipleValueTerm leave the pipe (the value separator) in the input, so that the second part of the expression (alternativeValueTerm ~> singleValueTerm) is able to parse it successfully?
Let's look at what's happening. We want to parse 111|222|333|444 with multipleValueTerm.
111| fits (singleValueTerm <~ alternativeValueTerm). <~ consumes the | and throws it away, and we keep the 111.
So we have 222|333|444 left.
In the same way, 222| and 333| are consumed, which leaves 444. But 444 fits neither (singleValueTerm <~ alternativeValueTerm), because no pipe follows it, nor (alternativeValueTerm ~> singleValueTerm), because the pipe before it has already been consumed. So it is not taken, and that is why it is treated as a new value without a field name.
I would improve your parser this way:
val seperator = "|"
lazy val parser: Parser[List[(Option[String] ~ List[String])]] = rep(termParser)
lazy val termParser: Parser[(Option[String] ~ List[String])] = opt(fieldNameTerm) ~ valueParser
lazy val fieldNameTerm: Parser[String] = ("\\w+".r <~ ":(?=\\S)".r)
lazy val valueParser: Parser[List[String]] = rep1sep(singleValueTerm, seperator)
lazy val singleValueTerm: Parser[String] = ("[\\S&&[^\\|]]+".r)
There is no need for all the identity mappings (^^ { case x => x }), so I removed them. I also treat single and multiple values the same way: the result is always a List with one or more elements. rep1sep is nice for dealing with separators.
rep1sep(singleValueTerm, seperator) could be equivalently expressed with
singleValueTerm ~ rep(seperator ~> singleValueTerm)
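The equivalence can be checked with a small hand-rolled stand-in for the combinators, so that it runs with the standard library alone. The type P and the helpers regex, rep, skipThen and rep1sep below are written for this sketch and only imitate the behaviour of scala-parser-combinators; in particular, they do not skip whitespace.

```scala
object Rep1SepSketch {
  // Toy parser: consume a prefix, return the parsed value and the remaining input.
  type P[A] = String => Option[(A, String)]

  // Match a regex at the start of the input.
  def regex(r: scala.util.matching.Regex): P[String] =
    in => r.findPrefixOf(in).map(m => (m, in.drop(m.length)))

  // Zero or more repetitions of p (p must consume input when it succeeds).
  def rep[A](p: P[A]): P[List[A]] = in =>
    p(in) match {
      case Some((a, rest)) => rep(p)(rest).map { case (as, r) => (a :: as, r) }
      case None            => Some((Nil, in))
    }

  // sep ~> p : run sep, then p, keep only p's result.
  def skipThen[A](sep: P[String], p: P[A]): P[A] =
    in => sep(in).flatMap { case (_, rest) => p(rest) }

  // p ~ rep(sep ~> p), flattened into a single non-empty list.
  def rep1sep[A](p: P[A], sep: P[String]): P[List[A]] =
    in => p(in).flatMap { case (first, rest) =>
      rep(skipThen(sep, p))(rest).map { case (more, r) => (first :: more, r) }
    }

  val value: P[String] = regex("""[^\s|]+""".r) // non-whitespace, no pipe
  val pipe: P[String]  = regex("""\|""".r)

  def main(args: Array[String]): Unit =
    println(rep1sep(value, pipe)("111|222|333|444 cde:555"))
}
```

The last value, 444, is picked up by the (sep ~> p) repetition because the pipe before it is still in the input at that point, which is exactly what the original multipleValueTerm could not guarantee.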
I have a working parser, but I've just realised I do not cater for comments. In the DSL I am parsing, comments start with a ; character. If a ; is encountered, the rest of the line is ignored (not all of it however, unless the first character is ;).
I am extending RegexParsers for my parser and ignoring whitespace (the default way), so I am losing the newline characters anyway. I don't wish to modify each and every parser I have to cater for the possibility of comments either, because statements can span multiple lines (thus each part of each statement may end with a comment). Is there any clean way to achieve this?
One thing that may influence your choice is whether comments can be found within your valid parsers. For instance let's say you have something like:
val p = "(" ~> "[a-z]*".r <~ ")"
which would parse something like ( abc ) but because of comments you could actually encounter something like:
( ; comment goes here
abc
)
Then I would recommend using TokenParsers or one of its subclasses. It's more work because you have to provide a lexical parser that will do a first pass to discard the comments. But it is also more flexible if you have nested comments, or if the ; can be escaped, or if the ; can appear inside a string literal like:
abc = "; don't ignore this" ; ignore this
On the other hand, you could also try to override the value of whitespace to be something like
override protected val whiteSpace = """(\s|;.*)+""".r
Or something along those lines.
For instance using the example from the RegexParsers scaladoc:
import scala.util.parsing.combinator.RegexParsers
object so1 {
Calculator("""(1 + ; foo
(1 + 2))
; bar""")
}
object Calculator extends RegexParsers {
override protected val whiteSpace = """(\s|;.*)+""".r
def number: Parser[Double] = """\d+(\.\d*)?""".r ^^ { _.toDouble }
def factor: Parser[Double] = number | "(" ~> expr <~ ")"
def term: Parser[Double] = factor ~ rep("*" ~ factor | "/" ~ factor) ^^ {
case number ~ list => (number /: list) {
case (x, "*" ~ y) => x * y
case (x, "/" ~ y) => x / y
}
}
def expr: Parser[Double] = term ~ rep("+" ~ log(term)("Plus term") | "-" ~ log(term)("Minus term")) ^^ {
case number ~ list => list.foldLeft(number) { // same as before, using alternate name for /:
case (x, "+" ~ y) => x + y
case (x, "-" ~ y) => x - y
}
}
def apply(input: String): Double = parseAll(expr, input) match {
case Success(result, _) => result
case failure: NoSuccess => scala.sys.error(failure.msg)
}
}
This prints:
Plus term --> [2.9] parsed: 2.0
Plus term --> [2.10] parsed: 3.0
res0: Double = 4.0
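To see why the whiteSpace override swallows comments, the regex can be tried on its own with the standard library. This is only a sketch: the skip helper is made up here and roughly mimics how RegexParsers drops a whitespace prefix before each token.

```scala
object WhiteSpaceSketch {
  // Same pattern as the override: whitespace, or ";" up to the end of the line.
  // "." does not match a newline by default, so a comment stops at the line end.
  val whiteSpace = """(\s|;.*)+""".r

  // Drop the whitespace/comment prefix, if there is one.
  def skip(in: String): String =
    whiteSpace.findPrefixOf(in).map(ws => in.drop(ws.length)).getOrElse(in)

  def main(args: Array[String]): Unit = {
    // The comment, the newline and the indentation are all skipped;
    // what remains starts at "abc".
    println(skip("  ; comment goes here\n  abc\n)"))
  }
}
```

Note that the prefix is only skipped where a token could start, so a ";" that some other parser consumes as part of a token (for example inside a string-literal regex) is not treated as a comment by this mechanism.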
Just filter out all the comments with a regex before you pass the code into your parser.
def removeComments(input: String): String = {
"""(?ms)\".*?\"|;.*?$|.+?""".r.findAllIn(input).map(str => if(str.startsWith(";")) "" else str).mkString
}
val code =
"""abc "def; ghij"
abc ;this is a comment
def"""
println(removeComments(code))
I'm trying to build an internal DSL in Scala to represent algebraic definitions. Let's consider this simplified data model:
case class Var(name:String)
case class Eq(head:Var, body:Var*)
case class Definition(name:String, body:Eq*)
For example a simple definition would be:
val x = Var("x")
val y = Var("y")
val z = Var("z")
val eq1 = Eq(x, y, z)
val eq2 = Eq(y, x, z)
val defn = Definition("Dummy", eq1, eq2)
I would like to have an internal DSL to represent such an equation in the form:
Dummy {
x = y z
y = x z
}
The closest I could get is the following:
Definition("Dummy") := (
"x" -> ("y", "z"),
"y" -> ("x", "z")
)
The first problem I encountered is that I cannot have two implicit conversions from String, one to Definition and one to Var, hence Definition("Dummy"). The main problem, however, is the lists. I don't want to surround them with anything, e.g. (), and I also don't want their elements to be separated by commas.
Is what I want possible using Scala? If yes, can anyone show me an easy way of achieving it?
While Scala's syntax is powerful, it is not flexible enough to create arbitrary delimiters for symbols. Thus, there is no way to drop the commas and separate the elements with spaces only.
Nevertheless, it is possible to use macros and parse a string with arbitrary content at compile time. It is not an "easy" solution, but one that works:
object AlgDefDSL {
import language.experimental.macros
import scala.reflect.macros.Context
implicit class DefDSL(sc: StringContext) {
def dsl(): Definition = macro __dsl_impl
}
def __dsl_impl(c: Context)(): c.Expr[Definition] = {
import c.universe._
val defn = c.prefix.tree match {
case Apply(_, List(Apply(_, List(Literal(Constant(s: String)))))) =>
def toAST[A : TypeTag](xs: Tree*): Tree =
Apply(
Select(Ident(typeOf[A].typeSymbol.companionSymbol), newTermName("apply")),
xs.toList
)
def toVarAST(varObj: Var) =
toAST[Var](c.literal(varObj.name).tree)
def toEqAST(eqObj: Eq) =
toAST[Eq]((eqObj.head +: eqObj.body).map(toVarAST(_)): _*)
def toDefAST(defObj: Definition) =
toAST[Definition](c.literal(defObj.name).tree +: defObj.body.map(toEqAST(_)): _*)
parsers.parse(s) match {
case parsers.Success(defn, _) => toDefAST(defn)
case parsers.NoSuccess(msg, _) => c.abort(c.enclosingPosition, msg)
}
}
c.Expr(defn)
}
import scala.util.parsing.combinator.JavaTokenParsers
private object parsers extends JavaTokenParsers {
override val whiteSpace = "[ \t]*".r
lazy val newlines =
opt(rep("\n"))
lazy val varP =
"[a-z]+".r ^^ Var
lazy val eqP =
(varP <~ "=") ~ rep(varP) ^^ {
case lhs ~ rhs => Eq(lhs, rhs: _*)
}
lazy val defHead =
newlines ~> ("[a-zA-Z]+".r <~ "{") <~ newlines
lazy val defBody =
rep(eqP <~ rep("\n"))
lazy val defEnd =
"}" ~ newlines
lazy val defP =
defHead ~ defBody <~ defEnd ^^ {
case name ~ eqs => Definition(name, eqs: _*)
}
def parse(s: String) = parseAll(defP, s)
}
case class Var(name: String)
case class Eq(head: Var, body: Var*)
case class Definition(name: String, body: Eq*)
}
It can be used with something like this:
scala> import AlgDefDSL._
import AlgDefDSL._
scala> dsl"""
| Dummy {
| x = y z
| y = x z
| }
| """
res12: AlgDefDSL.Definition = Definition(Dummy,WrappedArray(Eq(Var(x),WrappedArray(Var(y), Var(z))), Eq(Var(y),WrappedArray(Var(x), Var(z)))))
In addition to sschaef's nice solution I want to mention a few possibilities that are commonly used to get rid of commas in list construction for a DSL.
Colons
This might be trivial, but it is sometimes overlooked as a solution.
line1 ::
line2 ::
line3 ::
Nil
For a DSL it is often desirable that every line containing an instruction or data is terminated the same way (as opposed to Lists, where all lines but the last get a comma). With such a solution, reordering the lines can no longer mess up the trailing comma. Unfortunately, the Nil looks a bit ugly.
Fluent API
Another alternative that might be interesting for a DSL is something like that:
BuildDefinition()
.line1
.line2
.line3
.build
where each line is a member function of the builder (and returns a modified builder). This solution requires eventually converting the builder to a list (which might be done with an implicit conversion). Note that for some APIs it might be possible to pass around the builder instances themselves, and only extract the data wherever needed.
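A minimal sketch of such a builder, assuming the lines are plain strings. The single `line` method and the string payload are choices made for this example; a real DSL would have one method per kind of instruction.

```scala
object FluentSketch {
  // Immutable builder: every call returns a new builder with one more line.
  final case class BuildDefinition(lines: Vector[String] = Vector.empty) {
    def line(text: String): BuildDefinition = copy(lines = lines :+ text)
    def build: List[String] = lines.toList
  }

  // Every line ends the same way, so lines can be reordered freely.
  val definition: List[String] =
    BuildDefinition()
      .line("x = y z")
      .line("y = x z")
      .build
}
```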
Constructor API
Similarly another possibility is to exploit constructors.
new BuildInterface {
line1
line2
line3
}
Here, BuildInterface is a trait and we simply instantiate an anonymous class from the interface. The line functions call some member functions of this trait. Each invocation can internally update the state of the build interface. Note that this commonly results in a mutable design (but only during construction). To extract the list, an implicit conversion could be used.
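A sketch of this variant, again assuming string lines. The `line` method and the `lines` accessor are names invented for the example, and an explicit accessor is used instead of an implicit conversion to keep it short.

```scala
object ConstructorSketch {
  trait BuildInterface {
    // Mutable only while the anonymous class body runs.
    private val buffer = scala.collection.mutable.ListBuffer.empty[String]

    protected def line(text: String): Unit = {
      buffer += text
      ()
    }

    // Immutable snapshot of what was recorded during construction.
    def lines: List[String] = buffer.toList
  }

  // The body of the anonymous class is its constructor, so each call
  // to `line` runs once, in order, when the instance is created.
  val definition: BuildInterface = new BuildInterface {
    line("x = y z")
    line("y = x z")
  }
}
```

The trait's own initialiser runs before the anonymous class body, so the buffer already exists when the first `line` call executes.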
Since I don't understand the actual purpose of your DSL, I'm not really sure if any of these techniques is interesting for your scenario. I just wanted to add them since they are common ways to get rid of ",".
Here is another solution which is relatively simple and enables a syntax that is pretty close to your ideal (as others have pointed out, the exact syntax you asked for is not possible, in particular because you cannot redefine delimiter symbols).
My solution stretches a bit what is reasonable to do, because it adds an operator directly on scala.Symbol, but if you're going to use this DSL in a constrained scope then this should be OK.
object VarOps {
val currentEqs = new util.DynamicVariable( Vector.empty[Eq] )
}
implicit class VarOps( val variable: Var ) extends AnyVal {
import VarOps._
def :=[T]( body: Var* ) = {
val eq = Eq( variable, body:_* )
currentEqs.value = currentEqs.value :+ eq
}
}
implicit class SymbolOps( val sym: Symbol ) extends AnyVal {
def apply[T]( body: => Unit ): Definition = {
import VarOps._
currentEqs.withValue( Vector.empty[Eq] ) {
body
Definition( sym.name, currentEqs.value:_* )
}
}
}
Now you can do:
'Dummy {
x := (y, z)
y := (x, z)
}
Which builds the following definition (as printed in the REPL):
Definition(Dummy,Vector(Eq(Var(x),WrappedArray(Var(y), Var(z))), Eq(Var(y),WrappedArray(Var(x), Var(z)))))
I'm writing an application that will take in various "command" strings. I've been looking at the Scala combinator library to tokenize the commands. I find in a lot of cases I want to say: "These tokens are an orderless set, and so they can appear in any order, and some might not appear".
With my current knowledge of grammars I would have to define all combinations of sequences as such (pseudo grammar):
command = action~content
action = alphanum
content = (tokenA~tokenB~tokenC | tokenB~tokenC~tokenA | tokenC~tokenB~tokenA ....... )
So my question is, considering tokenA-C are unique, is there a shorter way to define a set of any order using a grammar?
You can use the "Parser.^?" operator to check a group of parse elements for duplicates.
def tokens = tokenA | tokenB | tokenC
def uniqueTokens = (tokens*) ^? (
{ case t if (t == t.distinct) => t },
{ "duplicate tokens found: " + _ })
Here is an example that allows you to enter any of the four stooges in any order, but fails to parse if a duplicate is encountered:
package blevins.example
import scala.util.parsing.combinator._
case class Stooge(name: String)
object StoogesParser extends RegexParsers {
def moe = "Moe".r
def larry = "Larry".r
def curly = "Curly".r
def shemp = "Shemp".r
def stooge = ( moe | larry | curly | shemp ) ^^ { case s => Stooge(s) }
def certifiedStooge = stooge | """\w+""".r ^? (
{ case s: Stooge => s },
{ "not a stooge: " + _ })
def stooges = (certifiedStooge*) ^? (
{ case x if (x == x.distinct) => x.toSet },
{ "duplicate stooge in: " + _ })
def parse(s: String): String = {
parseAll(stooges, new scala.util.parsing.input.CharSequenceReader(s)) match {
case Success(r,_) => r.mkString(" ")
case Failure(r,_) => "failure: " + r
case Error(r,_) => "error: " + r
}
}
}
And some example usage:
package blevins.example
object App extends scala.App {
def printParse(s: String): Unit = println(StoogesParser.parse(s))
printParse("Moe Shemp Larry")
printParse("Moe Shemp Shemp")
printParse("Curly Beyonce")
/* Output:
Stooge(Moe) Stooge(Shemp) Stooge(Larry)
failure: duplicate stooge in: List(Stooge(Moe), Stooge(Shemp), Stooge(Shemp))
failure: not a stooge: Beyonce
*/
}
There are ways around it. Take a look at the parser here, for example. It accepts 4 pre-defined numbers, which may appear in any order, but each must appear once and only once.
OTOH, you could write a combinator, if this pattern happens often:
def comb3[A](a: Parser[A], b: Parser[A], c: Parser[A]) =
a ~ b ~ c | a ~ c ~ b | b ~ a ~ c | b ~ c ~ a | c ~ a ~ b | c ~ b ~ a
I would not try to enforce this requirement syntactically. I'd write a production that admits multiple tokens from the set allowed and then use a non-parsing approach to ascertaining the acceptability of the keywords actually given. In addition to allowing a simpler grammar, it will allow you to more easily continue parsing after emitting a diagnostic about the erroneous usage.
Randall Schulz
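A sketch of that two-step idea using only the standard library: a permissive first step that accepts any words, followed by a separate check for unknown and duplicate tokens. The token names and the helpers tokenize and validate are hypothetical; in a real parser the first step would be a production such as rep(token).

```scala
object TokenSetSketch {
  val allowed: Set[String] = Set("tokenA", "tokenB", "tokenC") // hypothetical token names

  // Step 1 (permissive "grammar"): accept any whitespace-separated words.
  def tokenize(content: String): List[String] =
    content.trim.split("\\s+").filter(_.nonEmpty).toList

  // Step 2 (check outside the grammar): collect every problem instead of
  // failing at the first one, so diagnostics can be reported together.
  def validate(tokens: List[String]): Either[List[String], Set[String]] = {
    val unknown    = tokens.filterNot(allowed).distinct.map(t => s"unknown token: $t")
    val duplicates = tokens.diff(tokens.distinct).distinct.map(t => s"duplicate token: $t")
    val problems   = unknown ++ duplicates
    if (problems.isEmpty) Right(tokens.toSet) else Left(problems)
  }

  def main(args: Array[String]): Unit = {
    println(validate(tokenize("tokenC tokenA")))            // accepted, order ignored
    println(validate(tokenize("tokenA tokenB tokenA foo"))) // two diagnostics
  }
}
```

Because the result of a successful check is a Set, the order in which the tokens were written no longer matters to the rest of the program.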
I don't know what kind of constructs you want to support, but I gather you should be specifying a more specific grammar. From your comment to another answer:
todo message:link Todo class to database
I guess you don't want to accept something like
todo message:database Todo to link class
So you probably want to define some message-level keywords like "link" and "to"...
def token = (alphanum <~ ":") ~ ("link" ~> alphanum <~ "class") ~ ("to" ~> alphanum) ^^ {
case a ~ b ~ c => (a, b, c) /* a == "message", b == "Todo", c == "database" */
}
I guess you would have to define your grammar at that level.
You could of course write a combination rule that does this for you if you encounter this situation frequently.
On the other hand, maybe the option exists to make "tokenA..C" just "token" and then differentiate inside the handler of "token"