What `JObject(rec) <- someJArray` means inside a for comprehension - Scala

I'm learning the Json4s library.
I have a JSON fragment like this:
{
  "records": [
    {
      "name": "John Derp",
      "address": "Jem Street 21"
    },
    {
      "name": "Scala Jo",
      "address": "in my sweet dream"
    }
  ]
}
And I have Scala code which converts the JSON string into a List of Maps, like this:
import org.json4s._
import org.json4s.JsonAST._
import org.json4s.native.JsonParser

val json = JsonParser.parse("""{"records":[{"name":"John Derp","address":"Jem Street 21"},{"name":"Scala Jo","address":"in my sweet dream"}]}""")

val records: List[Map[String, Any]] = for {
  JObject(rec) <- json \ "records"
  JField("name", JString(name)) <- rec
  JField("address", JString(address)) <- rec
} yield Map("name" -> name, "address" -> address)

println(records)
Printing records gives this output:
List(Map(name -> John Derp, address -> Jem Street 21), Map(name ->
Scala Jo, address -> in my sweet dream))
I want to understand what the lines inside the for loop mean. For example, what is the meaning of this line:
JObject(rec) <- json \ "records"
I understand that json \ "records" produces a JArray object, but why is it fetched as JObject(rec) at the left of <-? What is the meaning of the JObject(rec) syntax? Where does the rec variable come from? Does JObject(rec) mean instantiating a new JObject from the rec input?
BTW, I have a Java programming background, so it would also be helpful if you could show me the Java equivalent of the loop above.

You have the following type hierarchy:
sealed abstract class JValue {
  def \(nameToFind: String): JValue = ???
  def filter(p: (JValue) => Boolean): List[JValue] = ???
}

case class JObject(obj: List[JField]) extends JValue
case class JField(name: String, value: JValue) extends JValue
case class JString(s: String) extends JValue

case class JArray(arr: List[JValue]) extends JValue {
  override def filter(p: (JValue) => Boolean): List[JValue] =
    arr.filter(p)
}
Your JSON parser returns the following object:
object JsonParser {
  def parse(s: String): JValue = {
    new JValue {
      override def \(nameToFind: String): JValue =
        JArray(List(
          JObject(List(
            JField("name", JString("John Derp")),
            JField("address", JString("Jem Street 21")))),
          JObject(List(
            JField("name", JString("Scala Jo")),
            JField("address", JString("in my sweet dream"))))))
    }
  }
}

val json = JsonParser.parse("Your JSON")
Under the hood, the Scala compiler generates the following:
val res = (json \ "records")
  .filter(_.isInstanceOf[JObject])
  .flatMap { x =>
    x match {
      case JObject(obj) =>
        obj
          .withFilter(f => f match {
            case JField("name", _) => true
            case _ => false
          })
          .flatMap(n => obj.withFilter(f => f match {
            case JField("address", _) => true
            case _ => false
          }).map(a => Map(
            "name" -> (n.value match { case JString(name) => name }),
            "address" -> (a.value match { case JString(address) => address }))))
    }
  }
The first line, JObject(rec) <- json \ "records", is possible because JArray.filter returns List[JValue] (here, effectively a List[JObject]). Each value of that List[JValue] is matched against JObject(rec) with pattern matching.
The remaining calls are a series of flatMap and map calls (this is how Scala for comprehensions work) combined with pattern matching.
I used Scala 2.11.4.
Of course, the match expressions above are themselves implemented as a series of type checks and casts.
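To connect this with a Java-style mental model, here is a rough sketch of what a single case JObject(obj) => arm boils down to against the simplified hierarchy above (a type check plus a cast; not the exact generated code):
// roughly what one `case JObject(obj) => ...` arm does
def objFields(x: JValue): List[JField] =
  if (x.isInstanceOf[JObject]) x.asInstanceOf[JObject].obj
  else throw new MatchError(x)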
UPDATE:
When you use the Json4s library there is an implicit conversion from JValue to org.json4s.MonadicJValue. See the json4s package object:
implicit def jvalue2monadic(jv: JValue) = new MonadicJValue(jv)
This conversion is used in JObject(rec) <- json \ "records". First, json is converted to MonadicJValue and \("records") is applied. Its result is a JValue, which is again implicitly converted to MonadicJValue so that MonadicJValue.filter can be called on it. The result of MonadicJValue.filter is List[JValue]. After that, the steps described above are performed.
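Written out by hand, those implicit steps look roughly like this (a sketch using the json4s names above; the real generated code continues with the flatMap/map chain shown earlier):
import org.json4s._

val step1: MonadicJValue = jvalue2monadic(json)     // the implicit conversion made explicit
val step2: JValue        = step1 \ "records"        // the "records" JArray
val step3: List[JValue]  = jvalue2monadic(step2).filter(_.isInstanceOf[JObject])
// step3 is what the for comprehension's pattern matching then runs over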

You are using a Scala for comprehension and I believe much of the confusion is about how for comprehensions work. This is Scala syntax for accessing the map, flatMap and filter methods of a monad in a concise way for iterating over collections. You will need some understanding of monads and for comprehensions in order to fully comprehend this. The Scala documentation can help, and so will a search for "scala for comprehension". You will also need to understand about extractors in Scala.
You asked about the meaning of this line:
JObject(rec) <- json \ "records"
This is part of the for comprehension.
Your statement:
I understand that the json \ "records" produces a JArray object,
is slightly incorrect. The \ function extracts a List[JObject] from the parser result, json.
but why is it fetched as JObject(rec) at left of <-?
The json \ "records" uses the json4s extractor \ to select the "records" member of the Json data and yield a List[JObject]. The <- can be read as "is taken from" and implies that you are iterating over the list. The elements of the list have type JObject and the construct JObject(rec) applies an extractor to create a value, rec, that holds the content of the JObject (its fields).
how come it's fetched as JObject(rec) at left of <-?
That is the Scala syntax for iterating over a collection. For example, we could also write:
for (x <- 1 to 10)
which would simply give us the values of 1 through 10 in x. In your example, we're using a similar kind of iteration but over the content of a list of JObjects.
What is the meaning of the JObject(rec)?
This is a Scala extractor. If you look in the json4s code you will find that JObject is defined like this:
case class JObject(obj: List[JField]) extends JValue
When we have a case class in Scala there are two methods defined automatically: apply and unapply. The meaning of JObject(rec) then is to invoke the unapply method and produce a value, rec, that corresponds to the value obj in the JObject constructor (apply method). So, rec will have the type List[JField].
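If it helps, here is a tiny self-contained illustration of the same mechanism with a made-up case class (Box is hypothetical, not part of json4s):
case class Box(items: List[String])

val b: Any = Box(List("x", "y"))

b match {
  // the compiler calls Box.unapply(b), which returns Some(List("x", "y")),
  // and binds the extracted list to `items`
  case Box(items) => println(items) // List(x, y)
  case _          => println("not a Box")
}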
Where does the rec variable come from?
It comes into existence simply by being written there: it is a pattern variable, declared as a placeholder for the obj field that JObject's unapply method extracts.
Does JObject(rec) mean instantiating new JObject class from rec input?
No, it doesn't. It comes about because the JArray resulting from json \ "records" contains only JObject values.
So, to interpret this:
JObject(rec) <- json \ "records"
we could write the following pseudo-code in english:
Find the "records" in the parsed json as a JArray and iterate over them. The elements of the JArray should be of type JObject. Pull the "obj" field of each JObject as a list of JField and assign it to a value named "rec".
Hopefully that makes all this a bit clearer?
it's also helpful if you can show me the Java equivalent code for the loop above.
That could be done, of course, but it is far more work than I'm willing to contribute here. One thing you could do is compile the code with Scala, find the associated .class files, and decompile them as Java. That might be quite instructive for you to learn how much Scala simplifies programming over Java. :)
Why can't I do this: for (rec <- json \ "records"), so that rec becomes a JObject? What is the reason for JObject(rec) at the left of <-?
You could! However, you'd then need to get the contents of the JObject. You could write the for comprehension this way:
val records: List[Map[String, Any]] = for {
  obj: JObject <- json \ "records"
  rec = obj.obj
  JField("name", JString(name)) <- rec
  JField("address", JString(address)) <- rec
} yield Map("name" -> name, "address" -> address)
It would have the same meaning, but it is longer.
I just want to understand what the N(x) pattern means, because I have only ever seen the for (x <- y) pattern before.
As explained above, this is an extractor, which is simply a use of the unapply method that is automatically created for case classes. A similar thing is done in a case statement in Scala.
UPDATE:
The code you provided does not compile for me against version 3.2.11 of json4s-native. This import:
import org.json4s.JsonAST._
is redundant with this import:
import org.json4s._
such that JObject is brought into scope twice and the reference becomes ambiguous. If I remove the JsonAST import then it compiles just fine.
To test this out a little further, I put your code in a Scala file like this:
package example

import org.json4s._
// import org.json4s.JsonAST._
import org.json4s.native.JsonParser

class ForComprehension {
  val json = JsonParser.parse(
    """{
      |"records":[
      |{"name":"John Derp","address":"Jem Street 21"},
      |{"name":"Scala Jo","address":"in my sweet dream"}
      |]}""".stripMargin
  )

  val records: List[Map[String, Any]] = for {
    JObject(rec) <- json \ "records"
    JField("name", JString(name)) <- rec
    JField("address", JString(address)) <- rec
  } yield Map("name" -> name, "address" -> address)

  println(records)
}
and then started a Scala REPL session to investigate:
scala> import example.ForComprehension
import example.ForComprehension
scala> val x = new ForComprehension
List(Map(name -> John Derp, address -> Jem Street 21), Map(name -> Scala Jo, address -> in my sweet dream))
x: example.ForComprehension = example.ForComprehension@5f9cbb71
scala> val obj = x.json \ "records"
obj: org.json4s.JValue = JArray(List(JObject(List((name,JString(John Derp)), (address,JString(Jem Street 21)))), JObject(List((name,JString(Scala Jo)), (address,JString(in my sweet dream))))))
scala> for (a <- obj) yield { a }
res1: org.json4s.JValue = JArray(List(JObject(List((name,JString(John Derp)), (address,JString(Jem Street 21)))), JObject(List((name,JString(Scala Jo)), (address,JString(in my sweet dream))))))
scala> import org.json4s.JsonAST.JObject
import org.json4s.JsonAST.JObject

scala> for ( JObject(rec) <- obj ) yield { rec }
res2: List[List[org.json4s.JsonAST.JField]] = List(List((name,JString(John Derp)), (address,JString(Jem Street 21))), List((name,JString(Scala Jo)), (address,JString(in my sweet dream))))
So:
- You are correct: the result of the \ operator is a JArray.
- The "iteration" over the JArray just treats the entire array as the only value in the list.
- There must be an implicit conversion from JArray to JObject that permits the extractor to yield the contents of the JArray as a List[JField].
- Once everything is a List, the for comprehension proceeds as normal.
Hope that helps with your understanding of this.
For more on pattern matching within assignments, try this blog
UPDATE #2:
I dug around a little more to discover the implicit conversion at play here. The culprit is the \ operator. To understand how json \ "records" turns into a monadic iterable thing, you have to look at this code:
org.json4s package object: This line declares an implicit conversion from JValue to MonadicJValue. So what's a MonadicJValue?
org.json4s.MonadicJValue: This defines all the things that make JValues iterable in a for comprehension: filter, map, flatMap and also provides the \ and \\ XPath-like operators
So, essentially, the use of the \ operator results in the following sequence of actions:
- Implicitly convert the json (JValue) into a MonadicJValue
- Apply the \ operator in MonadicJValue to yield a JArray (the "records")
- Implicitly convert the JArray into a MonadicJValue
- Use the MonadicJValue.filter and MonadicJValue.map methods to implement the for comprehension

Just a simplified example of how the for comprehension works here:
scala> trait A
defined trait A
scala> case class A2(value: Int) extends A
defined class A2
scala> case class A3(value: Int) extends A
defined class A3
scala> val a = List(1,2,3)
a: List[Int] = List(1, 2, 3)
scala> val a: List[A] = List(A2(1),A3(2),A2(3))
a: List[A] = List(A2(1), A3(2), A2(3))
So this is simply:
scala> for(A2(rec) <- a) yield rec //will return and unapply only A2 instances
res34: List[Int] = List(1, 3)
Which is equivalent to:
scala> a.collect{case A2(rec) => rec}
res35: List[Int] = List(1, 3)
collect is based on filter, so it is enough to have a filter method, as JValue does.
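Spelled out on the A2/A3 example, the pattern in the generator amounts to a filter followed by a map (a sketch of the idea, not the exact code the compiler emits):
// roughly what for (A2(rec) <- a) yield rec does under the hood
a.filter { case A2(_) => true; case _ => false }
  .map { case A2(rec) => rec } // List(1, 3)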
P.S. There is no foreach on JValue, so for (rec <- json \ "records") rec won't work. But there is map, so this will: for (rec <- json \ "records") yield rec
If you need your for comprehension without pattern matching:
for {
  rec <- (json \ "records").filter(_.isInstanceOf[JObject]).map(_.asInstanceOf[JObject])
  rcobj = rec.obj
  name <- rcobj if name._1 == "name"
  address <- rcobj if address._1 == "address"
  nm = name._2.asInstanceOf[JString].s
  vl = address._2.asInstanceOf[JString].s
} yield Map("name" -> nm, "address" -> vl)

res27: List[scala.collection.immutable.Map[String,String]] = List(Map(name -> John Derp, address -> Jem Street 21), Map(name -> Scala Jo, address -> in my sweet dream))

Related

Convert Seq[Try[Option[(String, Any)]]] into Try[Option[Map[String, Any]]]

How can I conveniently convert a Seq[Try[Option[(String, Any)]]] into a Try[Option[Map[String, Any]]]?
If any Try in the input holds an exception, the converted Try should hold it as well.
Assuming that the input type has a tuple inside the Option, this should give you the result you want:
val in: Seq[Try[Option[(String, Any)]]] = ???
val out: Try[Option[Map[String,Any]]] = Try(Some(in.flatMap(_.get).toMap))
- If any of the Trys is a Failure then the outer Try will catch the exception raised by the get and return a Failure.
- The Some is there to give the correct return type.
- The get extracts the Option from the Try (or raises an exception).
- Using flatMap rather than map removes the Option wrapper, keeping all Some values and discarding None values, giving Seq[(String, Any)].
- The toMap call converts the Seq to a Map.
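As a quick usage sketch (the sample values are made up, and the one-liner is wrapped in a helper purely for illustration):
import scala.util.{Try, Success, Failure}

def convert(in: Seq[Try[Option[(String, Any)]]]): Try[Option[Map[String, Any]]] =
  Try(Some(in.flatMap(_.get).toMap))

val ok: Seq[Try[Option[(String, Any)]]] =
  Seq(Success(Some("a" -> 1)), Success(None), Success(Some("b" -> "two")))

convert(ok)                                   // Success(Some(Map(a -> 1, b -> two)))
convert(ok :+ Failure(new Exception("boom"))) // Failure(java.lang.Exception: boom)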
Here is something that's not very clean but may help get you started. It assumes Option[(String,Any)], returns the first Failure if there are any in the input Seq and just drops None elements.
foo.scala
package foo

import scala.util.{Try, Success, Failure}

object foo {
  val x0 = Seq[Try[Option[(String, Any)]]]()
  val x1 = Seq[Try[Option[(String, Any)]]](Success(Some(("A", 1))), Success(None))
  val x2 = Seq[Try[Option[(String, Any)]]](Success(Some(("A", 1))), Success(Some(("B", "two"))))
  val x3 = Seq[Try[Option[(String, Any)]]](Success(Some(("A", 1))), Success(Some(("B", "two"))), Failure(new Exception("bad")))

  def f(x: Seq[Try[Option[(String, Any)]]]) =
    x.find(_.isFailure).getOrElse(Success(Some(x.map(_.get).filterNot(_.isEmpty).map(_.get).toMap)))
}
Example session
bash-3.2$ scalac foo.scala
bash-3.2$ scala -classpath .
Welcome to Scala 2.13.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_66).
Type in expressions for evaluation. Or try :help.
scala> import foo.foo._
import foo.foo._
scala> f(x0)
res0: scala.util.Try[Option[Equals]] = Success(Some(Map()))
scala> f(x1)
res1: scala.util.Try[Option[Equals]] = Success(Some(Map(A -> 1)))
scala> f(x2)
res2: scala.util.Try[Option[Equals]] = Success(Some(Map(A -> 1, B -> two)))
scala> f(x3)
res3: scala.util.Try[Option[Equals]] = Failure(java.lang.Exception: bad)
scala> :quit
If you're willing to use a functional support library like Cats then there are two tricks that can help this along:
Many things like List and Try are traversable, which means that (if Cats's implicits are in scope) they have a sequence method that can swap two types, for example converting List[Try[T]] to Try[List[T]] (failing if any of the items in the list is a Failure).
Almost all of the container types support a map method that can operate on the contents of a container, so if you have a function from A to B then map can convert a Try[A] to a Try[B]. (In Cats language they are functors but the container-like types in the standard library generally have map already.)
Cats doesn't directly support Seq, so this answer is mostly in terms of List instead.
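For instance, a small sketch of the first trick, with sequence flipping the two outer types (made-up values, Cats implicits in scope):
import cats.implicits._
import scala.util.{Try, Success, Failure}

val allGood: List[Try[Int]] = List(Success(1), Success(2))
val oneBad: List[Try[Int]]  = List(Success(1), Failure(new Exception("nope")))

allGood.sequence // Success(List(1, 2)): Try[List[Int]]
oneBad.sequence  // Failure(java.lang.Exception: nope)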
Given that type signature, you can iteratively sequence the value you have to, in effect, push the List type down one level in the type chain, then map over that container to work on its contents. That can look like:
import cats.implicits._
import scala.util._

def convert(listTryOptionPair: List[Try[Option[(String, Any)]]]): Try[Option[Map[String, Any]]] = {
  val tryListOptionPair = listTryOptionPair.sequence
  tryListOptionPair.map { listOptionPair =>
    val optionListPair = listOptionPair.sequence
    optionListPair.map { listPair =>
      Map.from(listPair)
    }
  }
}
https://scastie.scala-lang.org/xbQ8ZbkoRSCXGDJX0PgJAQ has a slightly more complete example.
One way to approach this is by using a foldLeft:
// Let's say this is the object you're trying to convert
val seq: Seq[Try[Option[(String, Any)]]] = ???
seq.foldLeft(Try(Option(Map.empty[String, Any]))) {
  case (acc, e) =>
    for {
      accOption <- acc
      elemOption <- e
    } yield elemOption match {
      case Some(value) => accOption.map(_ + value)
      case None => accOption
    }
}
You start off with an empty Map. You then use a for comprehension to go through the accumulator and the current element, and finally you add the new tuple to the map if it is present.
The following solutions are based on this answer, to the point that it almost makes the question a duplicate.
Method 1: Using recursion
def trySeqToMap1[X, Y](trySeq: Seq[Try[Option[(X, Y)]]]): Try[Option[Map[X, Y]]] = {
  def helper(it: Iterator[Try[Option[(X, Y)]]], m: Map[X, Y] = Map()): Try[Option[Map[X, Y]]] = {
    if (it.hasNext) {
      val x = it.next()
      if (x.isFailure)
        Failure(x.failed.get)
      else if (x.get.isDefined)
        helper(it, m + (x.get.get._1 -> x.get.get._2))
      else
        helper(it, m)
    } else Success(Some(m))
  }
  helper(trySeq.iterator)
}
Method 2: pattern matching directly, in case you are able to get a stream (LazyList) or a List instead:
def trySeqToMap2[X, Y](trySeq: LazyList[Try[Option[(X, Y)]]], m: Map[X, Y] = Map.empty[X, Y]): Try[Option[Map[X, Y]]] =
  trySeq match {
    case Success(Some(h)) #:: tail => trySeqToMap2(tail, m + (h._1 -> h._2))
    case Success(None) #:: tail    => trySeqToMap2(tail, m)
    case Failure(f) #:: _          => Failure(f)
    case _                         => Success(Some(m))
  }
note: this answer was previously using different method signatures. It has been updated to conform to the signature given in the question.

Monad for-comprehensions with implicit Monad fails, use inheritance?

I am running into this famous 10-year-old Scala ticket: https://github.com/scala/bug/issues/2823
I am expecting for comprehensions to work like do blocks in Haskell. And why shouldn't they? Monads go great with a side of sugar. At this point I have something like this:
import scalaz.{Monad, Traverse}
import scalaz.std.either._
import scalaz.std.list._

type ErrorM[A] = Either[String, A]

def toIntSafe(s: String): ErrorM[Int] = {
  try {
    Right(s.toInt)
  } catch {
    case e: Exception => Left(e.getMessage)
  }
}

def convert(args: List[String])(implicit m: Monad[ErrorM], tr: Traverse[List]): ErrorM[List[Int]] = {
  val expanded = for {
    arg <- args
    result <- toIntSafe(arg)
  } yield result
  tr.sequence(expanded)(m)
}

println(convert(List("1", "2", "3")))
println(convert(List("1", "foo")))
And I'm getting the error
"value map is not a member of ErrorM[Int]"
at the line result <- toIntSafe(arg).
How do I get back to the beautiful monadic comprehensions that I am used to? Some research shows the FilterMonadic[A, Repr] abstract class is what to extend if you want to be usable in a comprehension; are there any examples of combining FilterMonadic with scalaz?
Can I reuse my Monad implicit and not have to redefine map, flatMap etc?
Can I keep my type alias and not have to wrap around Either or worse, redefine case classes for ErrorM?
Using Scala 2.11.8
EDIT: I am adding Haskell code to show this does indeed work in GHC without explicit Monad transformers, only traversals and default monad instances for Either and List.
type ErrorM = Either String

toIntSafe :: Read a => String -> ErrorM a
toIntSafe s = case reads s of
  [(val, "")] -> Right val
  _           -> Left $ "Cannot convert to int " ++ s

convert :: [String] -> ErrorM [Int]
convert = sequence . conv
  where conv s = do
          arg <- s
          return . toIntSafe $ arg

main :: IO ()
main = do
  putStrLn . show . convert $ ["1", "2", "3"]
  putStrLn . show . convert $ ["1", "foo"]
Your Haskell code and your Scala code are not equivalent:
do
  arg <- s
  return . toIntSafe $ arg
corresponds to
for {
  arg <- args
} yield toIntSafe(arg)
Which compiles fine.
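Putting that together, a minimal sketch of a corrected convert, keeping the question's implicits and its tr.sequence(expanded)(m) call unchanged and fixing only the for comprehension:
def convert(args: List[String])(implicit m: Monad[ErrorM], tr: Traverse[List]): ErrorM[List[Int]] = {
  // stay inside the List monad: one ErrorM[Int] per argument
  val expanded: List[ErrorM[Int]] = for {
    arg <- args
  } yield toIntSafe(arg)
  // then flip List[ErrorM[Int]] into ErrorM[List[Int]] in one place
  tr.sequence(expanded)(m)
}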
To see why your example doesn't compile, we can desugar it:
for {
  arg <- args
  result <- toIntSafe(arg)
} yield result
=
args.flatMap { arg =>
  toIntSafe(arg).map { result => result }
}
Now looking at types:
args: List[String]
args.flatMap: (String => List[B]) => List[B]
arg => toIntSafe(arg).map {result => result} : String => ErrorM[Int]
Which shows the problem. flatMap is expecting a function returning a List but you are giving it a function returning an ErrorM.
Haskell code along the lines of:
do
  arg <- s
  result <- toIntSafe arg
  return result
wouldn't compile either for roughly the same reason: trying to bind across two different monads, List and Either.
A for comprehension in Scala or a do expression in Haskell will only ever work within a single underlying monad, because they are both basically syntactic translations to a series of flatMaps and >>=s respectively. And those still need to typecheck.
If you want to compose monads, one thing you can do is use monad transformers (EitherT), although in your example above I don't think you want to, since you actually want to sequence in the end.
Finally, in my opinion the most elegant way of expressing your code is:
def convert(args: List[String]) = args.traverse(toIntSafe)
Because map followed by sequence is traverse.

How to "de-sugar" this Scala statement?

LINQ-style queries in Scala with json4s look as follows:
val jvalue = parse(text) // (1)
val jobject = for(JObject(o) <- jvalue) yield o // (2)
I do not understand exactly how (2) works. How would you de-sugar this for-statement?
for-comprehensions of the form
for(v <- generator) yield expr
are translated into
generator.map(v => expr)
When you have a pattern match on the left, then any input values which do not match the pattern are filtered out. This means a partial function is created containing the match, and each input argument can be tested with isDefinedAt, e.g.:
val f: PartialFunction[JValue, JObject] = { case o @ JObject(_) => o }
f.isDefinedAt(JObject(List[JField]())) // true
f.isDefinedAt(JNull)                   // false
This means your example will be translated into something like:
val mfun: PartialFunction[JValue, List[JField]] = { case JObject(o) => o }
val jobject = jvalue.filter(mfun.isDefinedAt(_)).map(mfun)
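As a rough check of that translation against json4s itself (this leans on the implicit JValue-to-MonadicJValue conversion discussed earlier on this page, so treat it as a sketch rather than the exact compiler output):
import org.json4s._

// a small hand-built JValue standing in for parse(text)
val jvalue: JValue = JArray(List(JObject(List("k" -> JString("v"))), JNull))

val mfun: PartialFunction[JValue, List[JField]] = { case JObject(o) => o }

// filter keeps only the values the pattern accepts; map then applies the extractor
val jobject = jvalue.filter(mfun.isDefinedAt).map(mfun)
// List(List((k,JString(v))))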

Threading extra state through a parser in Scala

I'll give you the tl;dr up front
I'm trying to use the state monad transformer in Scalaz 7 to thread extra state through a parser, and I'm having trouble doing anything useful without writing a lot of t m a -> t m b versions of m a -> m b methods.
An example parsing problem
Suppose I have a string containing nested parentheses with digits inside them:
val input = "((617)((0)(32)))"
I also have a stream of fresh variable names (characters, in this case):
val names = Stream('a' to 'z': _*)
I want to pull a name off the top of the stream and assign it to each parenthetical
expression as I parse it, and then map that name to a string representing the
contents of the parentheses, with the nested parenthetical expressions (if any) replaced by their
names.
To make this more concrete, here's what I'd want the output to look like for the example input above:
val target = Map(
  'a' -> "617",
  'b' -> "0",
  'c' -> "32",
  'd' -> "bc",
  'e' -> "ad"
)
There may be either a string of digits or arbitrarily many sub-expressions at a given level, but these two kinds of content won't be mixed in a single parenthetical expression.
To keep things simple, we'll assume that the stream of names will never
contain either duplicates or digits, and that it will always contain enough
names for our input.
Using parser combinators with a bit of mutable state
The example above is a slightly simplified version of the parsing problem in
this Stack Overflow question.
I answered that question with
a solution that looked roughly like this:
import scala.util.parsing.combinator._

class ParenParser(names: Iterator[Char]) extends RegexParsers {
  def paren: Parser[List[(Char, String)]] = "(" ~> contents <~ ")" ^^ {
    case (s, m) => (names.next -> s) :: m
  }

  def contents: Parser[(String, List[(Char, String)])] =
    "\\d+".r ^^ (_ -> Nil) | rep1(paren) ^^ (
      ps => ps.map(_.head._1).mkString -> ps.flatten
    )

  def parse(s: String) = parseAll(paren, s).map(_.toMap)
}
It's not too bad, but I'd prefer to avoid the mutable state.
What I want
Haskell's Parsec library makes
adding user state to a parser trivially easy:
import Control.Applicative ((*>), (<$>), (<*))
import Data.Map (fromList)
import Text.Parsec

paren = do
  (s, m) <- char '(' *> contents <* char ')'
  h : t <- getState
  putState t
  return $ (h, s) : m
  where
    contents
      =   flip (,) []
      <$> many1 digit
      <|> (\ps -> (map (fst . head) ps, concat ps))
      <$> many1 paren

main = print $
  runParser (fromList <$> paren) ['a'..'z'] "example" "((617)((0)(32)))"
This is a fairly straightforward translation of my Scala parser above, but without mutable state.
What I've tried
I'm trying to get as close to the Parsec solution as I can using Scalaz's state monad transformer, so instead of Parser[A] I'm working with StateT[Parser, Stream[Char], A].
I have a "solution" that allows me to write the following:
import scala.util.parsing.combinator._
import scalaz._, Scalaz._

object ParenParser extends ExtraStateParsers[Stream[Char]] with RegexParsers {
  protected implicit def monadInstance = parserMonad(this)

  def paren: ESP[List[(Char, String)]] =
    (lift("(") ~> contents <~ lift(")")).flatMap {
      case (s, m) => get.flatMap(
        names => put(names.tail).map(_ => (names.head -> s) :: m)
      )
    }

  def contents: ESP[(String, List[(Char, String)])] =
    lift("\\d+".r ^^ (_ -> Nil)) | rep1(paren).map(
      ps => ps.map(_.head._1).mkString -> ps.flatten
    )

  def parse(s: String, names: Stream[Char]) =
    parseAll(paren.eval(names), s).map(_.toMap)
}
This works, and it's not that much less concise than either the mutable state version or the Parsec version.
But my ExtraStateParsers is ugly as sin—I don't want to try your patience more than I already have, so I won't include it here (although here's a link, if you really want it). I've had to write new versions of every Parser and Parsers method I use above
for my ExtraStateParsers and ESP types (rep1, ~>, <~, and |, in case you're counting). If I had needed to use other combinators, I'd have had to write new state transformer-level versions of them as well.
Is there a cleaner way to do this? I'd love to see an example of a Scalaz 7's state monad transformer being used to thread state through a parser, but Scalaz 6 or Haskell examples would also be useful and appreciated.
Probably the most general solution would be to rewrite Scala's parser library to accommodate monadic computations while parsing (like you partly did), but that would be quite a laborious task.
I suggest a solution using Scalaz's State where each of our results isn't a value of type Parser[X], but a value of type Parser[State[Stream[Char],X]] (aliased as ParserS[X]). So the overall parsed result isn't a value, but a monadic state value, which is then run on some Stream[Char]. This is almost a monad transformer, but we have to do the lifting/unlifting manually. It makes the code a bit uglier, as we need to lift values sometimes or use map/flatMap in several places, but I believe it's still reasonable.
import scala.util.parsing.combinator._
import scalaz._
import Scalaz._
import Traverse._

object ParenParser extends RegexParsers with States {
  type S[X] = State[Stream[Char], X]
  type ParserS[X] = Parser[S[X]]

  // Haskell's `return` for States
  def toState[S, X](x: X): State[S, X] = gets(_ => x)

  // Haskell's `mapM` for State
  def mapM[S, X](l: List[State[S, X]]): State[S, List[X]] =
    l.traverse[({type L[Y] = State[S, Y]})#L, X](identity _)

  // Read the next character from the stream inside the state
  // and update the state to the stream's tail.
  def next: S[Char] = state(s => (s.tail, s.head))

  def paren: ParserS[List[(Char, String)]] =
    "(" ~> contents <~ ")" ^^ (_ flatMap {
      case (s, m) => next map (v => (v -> s) :: m)
    })

  def contents: ParserS[(String, List[(Char, String)])] = digits | parens

  def digits: ParserS[(String, List[(Char, String)])] =
    "\\d+".r ^^ (_ -> Nil) ^^ (toState _)

  def parens: ParserS[(String, List[(Char, String)])] =
    rep1(paren) ^^ (mapM _) ^^ (_.map(
      ps => ps.map(_.head._1).mkString -> ps.flatten
    ))

  def parse(s: String): ParseResult[S[Map[Char, String]]] =
    parseAll(paren, s).map(_.map(_.toMap))

  def parse(s: String, names: Stream[Char]): ParseResult[Map[Char, String]] =
    parse(s).map(_ ! names)
}
object ParenParserTest extends App {
  println(ParenParser.parse("((617)((0)(32)))", Stream('a' to 'z': _*)))
}
Note: I believe that your approach with StateT[Parser, Stream[Char], _] isn't conceptually correct. The type says that we're constructing a parser given some state (a stream of names), so given different streams we could get different parsers. That is not what we want: we only want the result of parsing to depend on the names, not the whole parser. In this way Parser[State[Stream[Char],_]] seems more appropriate (Haskell's Parsec takes a similar approach; the state/monad is inside the parser).
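To make that contrast concrete, here is a rough type-level sketch using simplified stand-ins (P is a placeholder for the parser type, not the real scalaz or parser-combinator definitions):
object StateVsParserSketch {
  type Names = Stream[Char]

  trait P[+A] // stand-in for Parser[A]

  // StateT[P, Names, A]: the name stream picks the parser itself
  type StateOutside[A] = Names => P[(Names, A)]

  // P[State[Names, A]]: one fixed parser whose result is a state transition,
  // run on the name stream only after parsing succeeds
  type StateInside[A] = P[Names => (Names, A)]
}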

traverse collection of type "Any" in Scala

I would like to traverse a collection resulting from the Scala JSON toolkit on GitHub.
The problem is that the JsonParser returns Any, so I am wondering how I can avoid the following error:
"value foreach is not a member of Any"
val json = Json.parse(urls)
for (l <- json) { ... }

object Json {
  def parse(s: String): Any = (new JsonParser).parse(s)
}
You will have to do pattern matching to traverse the structures returned from the parser.
/*
 * (untested)
 */
def printThem(a: Any) {
  a match {
    case l: List[_] =>
      println("List:")
      l foreach printThem
    case m: Map[_, _] =>
      for ((k, v) <- m) {
        print("%s -> " format k)
        printThem(v)
      }
    case x =>
      println(x)
  }
}

val json = Json.parse(urls)
printThem(json)
You might have more luck using the lift-json parser, available at: http://github.com/lift/lift/tree/master/framework/lift-base/lift-json/
It has a much richer type-safe DSL available, and (despite the name) can be used completely standalone outside of the Lift framework.
If you are sure that in all cases there will be only one type, you can use the following cast:
for (l <- json.asInstanceOf[List[List[String]]]) {...}
Otherwise, do a pattern match over all the expected cases.
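For example, a small sketch of matching the expected shapes explicitly instead of casting (the shapes here are assumptions about what this particular parser returns):
val result: List[List[String]] = Json.parse(urls) match {
  case l: List[List[String]] @unchecked => l // trust the runtime shape; erasure makes this check unchecked
  case other => sys.error("Unexpected JSON shape: " + other)
}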