I am currently reading Hutton's and Meijer's paper on parsing combinators in Haskell http://www.cs.nott.ac.uk/~pszgmh/monparsing.pdf. For the sake of it I am trying to implement them in scala. I would like to construct something easy to code, extend and also simple and elegant. I have come up with two solutions for the following haskell code
/* Haskell Code */
type Parser a = String -> [(a,String)]
result :: a -> Parser a
result v = \inp -> [(v,inp)]
zero :: Parser a
zero = \inp -> []
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)]
/* Scala Code */
object Hutton1 {
type Parser[A] = String => List[(A, String)]
def Result[A](v: A): Parser[A] = str => List((v, str))
def Zero[A]: Parser[A] = str => List()
def Character: Parser[Char] = str => if (str.isEmpty) List() else List((str.head, str.tail))
}
object Hutton2 {
trait Parser[A] extends (String => List[(A, String)])
case class Result[A](v: A) extends Parser[A] {
def apply(str: String) = List((v, str))
}
case object Zero extends Parser[T forSome {type T}] {
def apply(str: String) = List()
}
case object Character extends Parser[Char] {
def apply(str: String) = if (str.isEmpty) List() else List((str.head, str.tail))
}
}
object Hutton extends App {
object T1 {
import Hutton1._
def run = {
val r: List[(Int, String)] = Zero("test") ++ Result(5)("test")
println(r.map(x => x._1 + 1) == List(6))
println(Character("abc") == List(('a', "bc")))
}
}
object T2 {
import Hutton2._
def run = {
val r: List[(Int, String)] = Zero("test") ++ Result(5)("test")
println(r.map(x => x._1 + 1) == List(6))
println(Character("abc") == List(('a', "bc")))
}
}
T1.run
T2.run
}
Question 1
In Haskell, zero is a function value that can be used as it is, expessing all failed parsers whether they are of type Parser[Int] or Parser[String]. In scala we achieve the same by calling the function Zero (1st approach) but in this way I believe that I just generate a different function everytime Zero is called. Is this statement true? Is there a way to mitigate this?
Question 2
In the second approach, the Zero case object is extending Parser with the usage of existential types Parser[T forSome {type T}] . If I replace the type with Parser[_] I get the compile error
Error:(19, 28) class type required but Hutton2.Parser[_] found
case object Zero extends Parser[_] {
^
I thought these two expressions where equivalent. Is this the case?
Question 3
Which approach out of the two do you think that will yield better results in expressing the combinators in terms of elegance and simplicity?
I use scala 2.11.8
Note: I didn't compile it, but I know the problem and can propose two solutions.
The more Haskellish way would be to not use subtyping, but to define zero as a polymorphic value. In that style, I would propose to define parsers not as objects deriving from a function type, but as values of one case class:
final case class Parser[T](run: String => List[(T, String)])
def zero[T]: Parser[T] = Parser(...)
As shown by #Alec, yes, this will produce a new value every time, since a def is compiled to a method.
If you want to use subtyping, you need to make Parser covariant. Then you can give zero a bottom result type:
trait Parser[+A] extends (String => List[(A, String)])
case object Zero extends Parser[Nothing] {...}
These are in some way quite related; in system F_<:, which is the base of what Scala uses, the types _|_ (aka Nothing) and \/T <: Any. T behave the same (this hinted at in Types and Programming Languages, chapter 28). The two possibilities given here are a consequence of this fact.
With existentials I'm not so familiar with, but I think that while unbounded T forSome {type T} will behave like Nothing, Scala does not allow inhertance from an existential type.
Question 1
I think that you are right, and here is why: Zero1 below prints hello every time you use it. The solution, Zero2, involves using a val instead.
def Zero1[A]: Parser[A] = { println("hi"); str => List() }
val Zero2: Parser[Nothing] = str => List()
Question 2
No idea. I'm still just starting out with Scala. Hope someone answers this.
Question 3
The trait one will play better with Scala's for (since you can define custom flatMap and map), which turns out to be (somewhat) like Haskell's do. The following is all you need.
trait Parser[A] extends (String => List[(A, String)]) {
def flatMap[B](f: A => Parser[B]): Parser[B] = {
val p1 = this
new Parser[B] {
def apply(s1: String) = for {
(a,s2) <- p1(s1)
p2 = f(a)
(b,s3) <- p2(s2)
} yield (b,s3)
}
}
def map[B](f: A => B): Parser[B] = {
val p = this
new Parser[B] {
def apply(s1: String) = for ((a,s2) <- p(s1)) yield (f(a),s2)
}
}
}
Of course, to do anything interesting you need more parsers. I'll propose to you one simple parser combinator: Choice(p1: Parser[A], p2: Parser[A]): Parser[A] which tries both parsers. (And rewrite your existing parsers more to my style).
def choice[A](p1: Parser[A], p2: Parser[A]): Parser[A] = new Parser[A] {
def apply(s: String): List[(A,String)] = { p1(s) ++ p2(s) }
}
def unit[A](x: A): Parser[A] = new Parser[A] {
def apply(s: String): List[(A,String)] = List((x,s))
}
val character: Parser[Char] = new Parser[Char] {
def apply(s: String): List[(Char,String)] = List((s.head,s.tail))
}
Then, you can write something like the following:
val parser: Parser[(Char,Char)] = for {
x <- choice(unit('x'),char)
y <- char
} yield (x,y)
And calling parser("xyz") gives you List((('x','x'),"yz"), (('x','y'),"z")).
Related
I'm experimenting with Free monad in Scalaz and trying to build simple interpreter to parse and evaluate expressions like:
dec(inc(dec(dec(10)))
where dec means decrement, inc means increment. Here is what I got:
trait Interpreter[A]
case class V[A](a: A) extends Interpreter[A]
object Inc {
private[this] final val pattern = Pattern.compile("^inc\\((.*)\\)$")
def unapply(arg: String): Option[String] = {
val m = pattern.matcher(arg)
if(m.find()){
Some(m.group(1))
} else None
}
}
object Dec {
private[this] final val pattern = Pattern.compile("^dec\\((.*)\\)$")
def unapply(arg: String): Option[String] = {
val m = pattern.matcher(arg)
if(m.find()){
Some(m.group(1))
} else None
}
}
object Val {
def unapply(arg: String): Option[Int] =
if(arg.matches("^[0-9]+$")) Some(Integer.valueOf(arg))
else None
}
Now this is all I need to build AST. It currently looks as follows:
def buildAst(expression: String): Free[Interpreter, Int] =
expression match {
case Inc(arg) => inc(buildAst(arg))
case Dec(arg) => dec(buildAst(arg))
case Val(arg) => value(arg)
}
private def inc(i: Free[Interpreter, Int]) = i.map(_ + 1)
private def dec(d: Free[Interpreter, Int]) = d.map(_ - 1)
private def value(v: Int): Free[Interpreter, Int] = Free.liftF(V(v))
Now when testing the application:
object Test extends App{
val expression = "inc(dec(inc(inc(inc(dec(10))))))"
val naturalTransform = new (Interpreter ~> Id) {
override def apply[A](fa: Interpreter[A]): Id[A] = fa match {
case V(a) => a
}
}
println(buildAst(expression).foldMap(naturalTransform)) //prints 12
}
And it works pretty much fine (I'm not sure about if it is in scalaz style).
THE PROBLEM is the extractor objects Inc, Dec, Val feels like boilerplate code. Is there a way to reduce such a code duplication.
This will definitely become a problem if the number of functions supported gets larger.
Free monads are creating some boilerplate and that is a fact. However if you are willing to stick to some conventions, you could rewrite interpreter with Freasy Monad:
#free trait Interpreter {
type InterpreterF[A] = Free[InterpreterADT, A]
sealed trait InterpreterADT[A]
def inc(arg: InterpreterF[Int]): InterpreterF[Int]
def dec(arg: InterpreterF[Int]): InterpreterF[Int]
def value(arg: Int): InterpreterF[Int]
}
and that would generate all of case classes and matching on them. The interpreter becomes just a trait to implement.
However, you already have some logic within unapply - so you would have to split the parsing and executing logic:
import Interpreter.ops._
val incP = """^inc\\((.*)\\)$""".r
val decP = """^dec\\((.*)\\)$""".r
val valP = """^val\\((.*)\\)$""".r
def buildAst(expression: String): InterpreterF[Int] = expression match {
case incP(arg) => inc(buildAst(arg))
case decP(arg) => dec(buildAst(arg))
case valP(arg) => value(arg.toInt)
}
Then you could implement an actual interpreter:
val impureInterpreter = new Interpreter.Interp[Id] {
def inc(arg: Int): Int = arg+1
def dec(arg: Int): Int = arg-1
def value(arg: Int): Int = arg
}
and run it:
impureInterpreter.run(buildAst(expression))
I admit that this is more of a pseudocode than tested working solution, but it should give a general idea. Another library that uses similar idea is Freestyle but they use their own free monads implementation instead of relying on a cats/scalaz.
So, I would say it is possible to remove some boilerplate as long as you have no issue with splitting parsing and interpretation. Of course not all can be removed - you have to declare possible operations on your Interpreter algebra as well as you have to implement interpreter yourself.
It's been while Since I've started working on scala and I am wondering what kind of variable type is the best when I create a method which requires to return multiple data.
let's say If I have to make a method to get user info and it'll be called from many places.
def getUserParam(userId: String):Map[String,Any] = {
//do something
Map(
"isExist" -> true,
"userDataA" -> "String",
"userDataB" -> 1 // int
)
}
in this case, the result type is Map[String,Any] and since each param would be recognized as Any, You cannot pass the value to some other method requiring something spesifically.
def doSomething(foo: String){}
val foo = getUserParam("bar")
doSomething(foo("userDataA")) // type mismatch error
If I use Tuple, I can avoid that error, but I don't think it is easy to guess what each indexed number contains.
and of course there is a choice to use Case Class but once I use case class as a return type, I need to import the case class where ever I call the method.
What I want to ask is what is the best way to make a method returning more than 2 different variable type values.
Here are three options. Even though you might like the third option (using anonymous class) it's actually my least favorite. As you can see, it requires you to enable reflective calls (otherwise it throws a compilation warning). Scala will use reflection to achieve this which is not that great.
Personally, if there are only 2 values I use tuple. If there are more than two I will use a case class since it greatly improves code readability. The anonymous class option I knew it existed for a while, but I never used that it my code.
import java.util.Date
def returnTwoUsingTuple: (Date, String) = {
val date = new Date()
val str = "Hello world"
(date,str)
}
val tupleVer = returnTwoUsingTuple
println(tupleVer._1)
println(tupleVer._2)
case class Reply(date: Date, str: String)
def returnTwoUsingCaseClass: Reply = {
val date = new Date()
val str = "Hello world"
Reply(date,str)
}
val caseClassVer = returnTwoUsingCaseClass
println(caseClassVer.date)
println(caseClassVer.str)
import scala.language.reflectiveCalls
def returnTwoUsingAnonymousClass = {
val date = new Date()
val str = "Hello world"
new {
val getDate = date
val getStr = str
}
}
val anonClassVer = returnTwoUsingAnonymousClass
println(anonClassVer.getDate)
println(anonClassVer.getStr)
Sinse your logic with Map[String,Any] is more like for each key I have one of .. not for each key I have both ... more effective use in this case would be Either or even more effectively - scalaz.\/
scalaz.\/
import scalaz._
import scalaz.syntax.either._
def getUserParam(userId: String): Map[String, String \/ Int \/ Boolean] = {
//do something
Map(
"isExist" -> true.right,
"userDataA" -> "String".left.left,
"userDataB" -> 1.right.left
)
}
String \/ Int \/ Boolean is left-associatited to (String \/ Int) \/ Boolean
now you have
def doSomething(foo: String){}
unluckily it's the most complex case, if for example you had
def doSomethingB(foo: Boolean){}
you could've just
foo("userDataA").foreach(doSomethingB)
since the right value considered as correct so for String which is left to the left you could write
foo("userdata").swap.foreach(_.swap.foreach(doSomething))
Closed Family
Or you could craft you own simple type for large number of alternatives like
sealed trait Either3[+A, +B, +C] {
def ifFirst[T](action: A => T): Option[T] = None
def ifSecond[T](action: B => T): Option[T] = None
def ifThird[T](action: C => T): Option[T] = None
}
case class First[A](x: A) extends Either3[A, Nothing, Nothing] {
override def ifFirst[T](action: A => T): Option[T] = Some(action(x))
}
case class Second[A](x: A) extends Either3[Nothing, A, Nothing] {
override def ifSecond[T](action: A => T): Option[T] = Some(action(x))
}
case class Third[A](x: A) extends Either3[Nothing, Nothing, A] {
override def ifThird[T](action: A => T): Option[T] = Some(action(x))
}
now having
def getUserParam3(userId: String): Map[String, Either3[Boolean, String, Int]] = {
//do something
Map(
"isExist" -> First(true),
"userDataA" -> Second("String"),
"userDataB" -> Third(1)
)
}
val foo3 = getUserParam3("bar")
you can use your values as
foo3("userdata").ifSecond(doSomething)
I have an idea (vague), to pass (or chain) some implicit value in this manner, not introducing parameters to block f:
def block(x: Int)(f: => Unit)(implicit v: Int) = {
implicit val nv = v + x
f
}
def fun(implicit v: Int) = println(v)
such that if I used something alike:
implicit val ii: Int = 0
block(1) {
block(2) {
fun
}
}
It would print 3.
If I could say def block(x: Int)(f: implicit Int => Unit).
In other words I'm seeking for some design pattern which will allow me to implement this DSL: access some cumulative value inside nested blocks but without explicitly passing it as parameter. Is it possible? (implicits are not necessary, just a hint to emphasize that I don't want to pass that accumulator explicitly). Of course upper code will print 0.
EDIT: One of possible usages: composing http routes, in a following manner
prefix("path") {
prefix("subpath") {
post("action1") { (req, res) => do action }
get("action2") { (req, res) => do action }
}
}
Here post and get will access (how?) accumulated prefix, say List("path", "subpath") or "/path/subpath/".
Consider using DynamicVariable for this. It's really simple to use, and thread-safe:
val acc: DynamicVariable[Int] = new DynamicVariable(0)
def block(x: Int)(f: => Unit) = {
acc.withValue(acc.value + x)(f)
}
def fun = println(acc.value)
Passing state via implicit is dirty and will lead to unexpected and hard to track down bugs. What you're asking to do is build a function that can compose in such a way that nested calls accumulate over some operation and anything else uses that value to execute the function?
case class StateAccum[S](init: S){
val op: S => S
def flatMap[A <: S](f: S => StateAccum[A]) ={
val StateAccum(out) = f(s)
StateAccum(op(init, out))
}
def apply(f: S => A) = f(init)
}
which could allow you do exactly what you're after with a slight change in how you're calling it.
Now, if you really want the nested control structures, your apply would have to use an implicit value to distinguish the types of the return such that it applied the function to one and a flatMap to StateAccum returns. It gets crazy but looks like the following:
def apply[A](f: S => A)(implicit mapper: Mapper[S, A]): mapper.Out = mapper(this, f)
trait Mapper[S, A]{
type Out
def apply(s: StateAccum[S], f: S => A): Out
}
object Mapper extends LowPriorityMapper{
implicit def acuum[S, A <: S] = new Mapper[S, StateAccum[A]]{
type Out = StateAccum[A]
def apply(s: StateAccum[S], f: S => StateAccum[A]) = s.flatMap(f)
}
}
trait LowPriorityMapper{
implicit def acuum[S, A] = new Mapper[S, A]{
type Out = A
def apply(s: StateAccum[S], f: S => A) = f(s.init)
}
}
Is that possible to somehow marshall PartialFunction (let's assume it will always contains only one case) into something human-readable?
Let's say we have collection of type Any (messages: List[Any])
and number of PartialFuntion[Any, T] defined using pattern matching block.
case object R1
case object R2
case object R3
val pm1: PartialFunction[Any, Any] = {
case "foo" => R1
}
val pm2: PartialFunction[Any, Any] = {
case x: Int if x > 10 => R2
}
val pm3: PartialFunction[Any, Any] = {
case x: Boolean => R3
}
val messages: List[Any] = List("foo", 20)
val functions = List(pm1, pm2)
then we can find all the messages matched by provided PFs and related applications
val found: List[Option[Any]] = functions map { f =>
messages.find(f.isDefined).map(f)
}
but what if I need resulting map of 'what I expect' to 'what I've got' in the human-readable form (for logging). Say,
(case "foo") -> Some(R1)
(case Int if _ > 10) -> Some(R2)
(case Boolean) -> None
Is that possible? Some macro/meta works?
There's nothing at runtime which will print compiled code nicely.
You could write a macro which will print the source code of the tree and use that? Most macro tutorials start with a macro for printing source code -- see e.g. http://www.warski.org/blog/2012/12/starting-with-scala-macros-a-short-tutorial/
Perhaps:
// Given a partial function "pf", return the source code for pf
// as a string as well as the compiled, runnable function itself
def functionAndSource(pf: PartialFunction[Any, Any]): (String, PartialFunction[Any, Any]) = macro functionAndSourceImpl
def functionAndSourceImpl ...
val pm1: (String, PartialFunction[Any, Any]) = functionAndSource {
case "foo" => R1
}
This isn't ever going to be that easy or nice in Scala.
Scala isn't Lisp or Ruby: it's a compiled language and it is not optimised for reflection on the code itself.
Thanks for your answers. Using Macro is interesting one choice.
But as an option the solution might be to use kind of named partial functions. The idea is to name the function so in the output you can see the name of function instead source code.
object PartialFunctions {
type FN[Result] = PartialFunction[Any, Result]
case class NamedPartialFunction[A,B](name: String)(pf: PartialFunction[A, B]) extends PartialFunction[A,B] {
override def isDefinedAt(x: A): Boolean = pf.isDefinedAt(x)
override def apply(x: A): B = pf.apply(x)
override def toString(): String = s"matching($name)"
}
implicit class Named(val name: String) extends AnyVal {
def %[A,B](pf: PartialFunction[A,B]) = new NamedPartialFunction[A, B](name)(pf)
}
}
So then you can use it as follows
import PartialFunctions._
val pm1: PartialFunction[Any, Any] = "\"foo\"" % {
case "foo" => R1
}
val pm2: PartialFunction[Any, Any] = "_: Int > 10" % {
case x: Int if x > 10 => R2
}
val pm3: PartialFunction[Any, Any] = "_: Boolean" % {
case x: Boolean => R3
}
val messages: List[Any] = List("foo", 20)
val functions = List(pm1, pm2)
val found: List[Option[(String, Any)]] = functions map { case f: NamedPartialFunction =>
messages.find(f.isDefined).map(m => (f.name, f(m))
}
If I want to narrow, say, an Iterable[A] for all elements of a particular type (e.g. String) I can do:
as filter { _.isInstanceOf[String] }
However, it's obviously desirable to use this as an Iterable[String] which can be done via a map:
as filter { _.isInstanceOf[String] } map { _.asInstanceOf[String] }
Which is pretty ugly. Of course I could use flatMap instead:
as flatMap[String] { a =>
if (a.isInstanceOf[String])
Some(a.asInstanceOf[String])
else
None
}
But I'm not sure that this is any more readable! I have written a function, narrow, which can be used via implicit conversions:
as.narrow(classOf[String])
But I was wondering if there was a better built-in mechanism which I have overlooked. Particularly as it would be nice to be able to narrow a List[A] to a List[String], rather than to an Iterable[String] as it will be with my function.
The Scala syntax sugar for isInstanceOf / asInstanceOf is pattern matching:
as flatMap { case x: String => Some(x); case _ => None }
Because that uses flatMap, it should usually return the same collection you had to begin with.
On Scala 2.8, there's an experimental function that does that kind of pattern, defined inside the object PartialFunction. So, on Scala 2.8 you can do:
as flatMap (PartialFunction.condOpt(_ : Any) { case x: String => x })
Which looks bigger mostly because I did not import that function first. But, then again, on Scala 2.8 there's a more direct way to do it:
as collect { case x: String => x }
For the record, here's a full implementation of narrow. Unlike the signature given in the question, it uses an implicit Manifest to avoid some characters:
implicit def itrToNarrowSyntax[A](itr: Iterable[A]) = new {
def narrow[B](implicit m: Manifest[B]) = {
itr flatMap { x =>
if (Manifest.singleType(x) <:< m)
Some(x)
else
None
}
}
}
val res = List("daniel", true, 42, "spiewak").narrow[String]
res == Iterable("daniel", "spiewak")
Unfortunately, narrowing to a specific type (e.g. List[String]) rather than Iterable[String] is a bit harder. It can be done with the new collections API in Scala 2.8.0 by exploiting higher-kinds, but not in the current framework.
You may use in future:
for(a :Type <- itr) yield a
But it doesn't work now.
For more information, go to the following links:
http://lampsvn.epfl.ch/trac/scala/ticket/1089
http://lampsvn.epfl.ch/trac/scala/ticket/900
Shape preserving: I'm a bit rushed right now so I'm leaving the cast in there, but I'm pretty sure it can be eliminated. This works in trunk:
import reflect.Manifest
import collection.Traversable
import collection.generic.CanBuildFrom
import collection.mutable.ListBuffer
object narrow {
class Narrower[T, CC[X] <: Traversable[X]](coll: CC[T])(implicit m1: Manifest[CC[T]], bf: CanBuildFrom[CC[T], T, CC[T]]) {
def narrow[B: Manifest]: CC[B] = {
val builder = bf(coll)
def isB(x: T): Option[T] = if (Manifest.singleType(x) <:< manifest[B]) Some(x) else None
coll flatMap isB foreach (builder += _)
builder mapResult (_.asInstanceOf[CC[B]]) result
}
}
implicit def toNarrow[T, CC[X] <: Traversable[X]](coll: CC[T])(implicit m1: Manifest[CC[T]], bf: CanBuildFrom[CC[T], T, CC[T]]) =
new Narrower[T,CC](coll)
def main(args: Array[String]): Unit = {
println(Set("abc", 5, 5.5f, "def").narrow[String])
println(List("abc", 5, 5.5f, "def").narrow[String])
}
}
Running it:
Set(abc, def)
List(abc, def)