Meta-programming to parse JSON in Scala

I need some hints for writing a Scala program that can read a JSON file and create a case class at run time. As an example, if we have a JSON description like -
Employ{
name:{datatype:String, null:false}
age:{datatype:Int, null:true}
Address:{city: {datatype: String, null:true}, zip: {datatype: String, null:false}}
}
and this should create classes like
case class Employ(name: String, age: Option[Int], address: Address)
case class Address(city: Option[String], zip: String)
Would it be possible to do this in Scala?

Yes, you can easily achieve this using TreeHugger. I did a similar thing for one of my work projects.
Below is a toy example which produces a Scala Akka Actor class. It needs to be cleaned up but, hopefully, you get the idea:
import argonaut.Argonaut._
import argonaut._
import org.scalatest.FunSuite
import treehugger.forest._
import definitions._
import treehuggerDSL._
class ConvertJSONToScalaSpec extends FunSuite {
test("read json") {
val input =
"""
|{
| "rulename" : "Rule 1",
| "condition" : [
| {
| "attribute" : "country",
| "operator" : "eq",
| "value" : "DE"
| }
| ],
| "consequence" : "route 1"
|}
""".stripMargin
val updatedJson: Option[Json] = input.parseOption
val tree =
BLOCK(
IMPORT(sym.actorImports),
CLASSDEF(sym.c).withParents(sym.d, sym.e) :=
BLOCK(
IMPORT(sym.consignorImport, "_"),
DEFINFER(sym.methodName) withFlags (Flags.OVERRIDE) := BLOCK(
CASE(sym.f DOT sym.methodCall APPLY (REF(sym.mc))) ==>
BLOCK(
sym.log DOT sym.logmethod APPLY (LIT(sym.logmessage)),
(IF (sym.declaration DOT sym.header DOT sym.consignor DOT sym.consignoreTID ANY_== LIT(1))
THEN (sym.sender APPLY() INFIX ("!", LIT(sym.okcm)))
ELSE
(sym.sender APPLY() INFIX ("!", LIT(sym.badcm)))
)
)
)
)
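) inPackage (sym.packageName)
// `sym` is a helper object (not shown) holding the symbols and names referenced above;
// treeToString(tree) renders the generated source as a String
}
}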
Essentially all you need to do is work out how to use the TreeHugger DSL; each construct (CLASSDEF, IF, CASE and so on) represents a specific piece of Scala syntax. It gives you a type-safe way to do your meta-programming.
There's also Scala Meta but I haven't used that.
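If a code-generation library feels heavy for the schema in the question, the source text can also be assembled with plain string interpolation. What follows is only a minimal sketch: Field, Leaf, Nested and CaseClassGen are hypothetical names made up for illustration, and it assumes the JSON has already been parsed into that small model.

```scala
// Hypothetical model of the schema from the question (not part of any library).
sealed trait Field
final case class Leaf(datatype: String, nullable: Boolean) extends Field
final case class Nested(fields: List[(String, Field)]) extends Field

object CaseClassGen {
  // Lower-case the first letter so that "Address" becomes the field name "address".
  private def decapitalize(s: String): String =
    if (s.isEmpty) s else s"${s.head.toLower}${s.tail}"

  // Returns the source of the requested case class followed by any nested ones.
  def generate(name: String, fields: List[(String, Field)]): List[String] = {
    val params = fields.map {
      case (n, Leaf(t, nullable)) =>
        // Nullable fields are wrapped in Option, as in the question.
        if (nullable) s"${decapitalize(n)}: Option[$t]" else s"${decapitalize(n)}: $t"
      case (n, Nested(_)) =>
        s"${decapitalize(n)}: ${n.capitalize}"
    }
    val nested = fields.flatMap {
      case (n, Nested(fs)) => generate(n.capitalize, fs)
      case _               => Nil
    }
    s"case class $name(${params.mkString(", ")})" :: nested
  }
}
```

The resulting strings still have to be compiled somehow (written to a file before the build, or handed to a runtime compiler), so this only covers the generation half of the problem.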

Well... let's say you used a library such as TreeHugger or Scala Meta to generate the code string for the case class. There are several approaches you can take from there; here is one of them.
// import the current runtime mirror as cm
import scala.reflect.runtime.{currentMirror => cm}
// needed for mkToolBox (requires scala-compiler on the classpath)
import scala.tools.reflect.ToolBox
// your case class code string
val codeString = """
case class Address(city: Option[String], zip:String)
Address(Some("CityName"), "zipcode")
"""
// get the toolbox from mirror
val tb = cm.mkToolBox()
// use tool box to convert string to Tree
val codeTree = tb.parse(codeString)
// eval your tree
val address = tb.eval(codeTree)
The problem is that the val address will have type Any. Also, the universe still does not know about the type Address, so you will not be able to do address.asInstanceOf[Address].
You can work around this by digging into ClassSymbol and ClassLoader, and with enough luck you may solve many of the other issues you will face by learning more about how reflection works in Scala and Java. But that path takes a lot of effort, with no guarantee of success.
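One limited way to get something useful out of the Any value is to lean on the fact that every case class implements Product, which is an ordinary standard-library trait. The sketch below assumes Scala 2 with scala-reflect and scala-compiler on the classpath; ToolBoxProductDemo and its method names are made up for illustration.

```scala
import scala.reflect.runtime.{currentMirror => cm}
import scala.tools.reflect.ToolBox

object ToolBoxProductDemo {
  // Same kind of code string as above: a definition followed by an expression.
  private val codeString =
    """
      |case class Address(city: Option[String], zip: String)
      |Address(Some("CityName"), "zipcode")
    """.stripMargin

  // Compile and evaluate at runtime; the static type of the result is still Any.
  def evalAddress(): Any = {
    val tb = cm.mkToolBox()
    tb.eval(tb.parse(codeString))
  }

  // Product is known at compile time, so this cast does not need
  // the runtime-generated Address type.
  def fieldValues(instance: Any): List[Any] =
    instance.asInstanceOf[Product].productIterator.toList
}
```

This gives access to the field values in declaration order, but not to a statically typed Address, so it only helps when generic handling of the fields is enough.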

Related

How to apply sequence function to List of ValidatedNel in cats?

I have the following code
sealed trait DomainValidation {
def errorMessage: String
}
type ValidationResult[A] = ValidatedNel[DomainValidation, A]
val ai:ValidationResult[String] = "big".validNel
val bi:ValidationResult[String] = "leboski".validNel
val l = List(ai,bi)
I want to convert l to ValidationResult[List[String]]. I came across the sequence functionality, but I am unable to use cats' sequence because it needs some implicit that knows how to handle ValidationResult[A], and I cannot figure out what exactly is needed. I wrote the following
object helper {
implicit class hello[A](l: List[ValidationResult[A]]) {
def mysequence: ValidationResult[List[A]] = {
val m = l.collect { case Invalid(a) => Invalid(a) }
if (m.isEmpty) l.map { case Valid(a) => a }.validNel
else /* merge the NonEmpty Lists */
}
}
}
I am able to do l.mysequence, but how do I use cats' sequence?
PS: I am a Scala beginner and having a hard time learning :). Forgive any incorrect mentions.
The following should work as expected on Scala 2.12:
import cats.data.ValidatedNel, cats.syntax.validated._
// Your code:
sealed trait DomainValidation {
def errorMessage: String
}
type ValidationResult[A] = ValidatedNel[DomainValidation, A]
val ai:ValidationResult[String] = "big".validNel
val bi:ValidationResult[String] = "leboski".validNel
val l = List(ai,bi)
And then:
scala> import cats.instances.list._, cats.syntax.traverse._
import cats.instances.list._
import cats.syntax.traverse._
scala> l.sequence
res0: ValidationResult[List[String]] = Valid(List(big, leboski))
You don't show your code or explain what's not working, so it's hard to diagnose your issue, but it's likely to be one of the following problems:
You're on Scala 2.11, where .sequence requires you to enable -Ypartial-unification in your compiler options. If you're using sbt, you can do this by adding scalacOptions += "-Ypartial-unification" to your build.sbt (assuming you're on 2.11.9+).
You've omitted one of the necessary imports. You need at least the Traverse instance for List and the syntax for Traverse. The example code above includes the two imports you need, or you can just import cats.implicits._ and make your life a little easier.
If it's not one of these two things, you'll probably need to include more detail in your question for us to be able to help.
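To see what sequence does when some elements are invalid, here is a small sketch using the single cats.implicits._ import mentioned above. It assumes a cats 1.x or later dependency (and -Ypartial-unification on Scala 2.11/2.12); EmptyName, BadAge and the fail helper are hypothetical additions for illustration.

```scala
import cats.data.ValidatedNel
import cats.implicits._

object SequenceDemo {
  sealed trait DomainValidation { def errorMessage: String }
  // Hypothetical error cases, only here to have something to accumulate.
  case object EmptyName extends DomainValidation { val errorMessage = "name is empty" }
  case object BadAge extends DomainValidation { val errorMessage = "age is negative" }

  type ValidationResult[A] = ValidatedNel[DomainValidation, A]

  // Helper that pins the error type to DomainValidation.
  def fail(e: DomainValidation): ValidationResult[String] = e.invalidNel

  // All elements valid: the values are collected into one list.
  val allValid: ValidationResult[List[String]] =
    List[ValidationResult[String]]("big".validNel, "leboski".validNel).sequence

  // Some elements invalid: the errors are accumulated in order.
  val someInvalid: ValidationResult[List[String]] =
    List[ValidationResult[String]]("big".validNel, fail(EmptyName), fail(BadAge)).sequence
}
```

Because the Applicative for Validated combines errors with the NonEmptyList semigroup, the second value should carry both errors rather than stopping at the first, which is the merging step left open in the hand-written mysequence.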

circe type field not showing

When encoding to Json with circe we really want the type field to show e.g.
scala> val fooJson = foo.asJson
fooJson: io.circe.Json =
{
"this_is_a_string" : "abc",
"another_field" : 123,
"type" : "Foo"
}
This is taken from the release notes, which earlier mention that you can configure the encoding like this:
implicit val customConfig: Configuration =
Configuration.default.withSnakeCaseKeys.withDefaults.withDiscriminator("type")
Also, other information about circe here suggests that even without any configuration you should get some class type information in the encoded JSON.
Am I missing something? How do you get the class type to show?
UPDATE 30/03/2017: Follow up to OP's comment
I was able to make this work, as shown in the linked release notes.
Preparation step 1: add additional dependency to build.sbt
libraryDependencies += "io.circe" %% "circe-generic-extras" % "0.7.0"
Preparation step 2: setup dummy sealed trait hierarchy
import io.circe.{ Decoder, Encoder }
import io.circe.parser._, io.circe.syntax._
import io.circe.generic.extras.Configuration
import io.circe.generic.extras.auto._
import io.circe.generic.{ semiauto => boring } // <- This is the default generic derivation behaviour
import io.circe.generic.extras.{ semiauto => fancy } // <- This is the new generic derivation behaviour
implicit val customConfig: Configuration = Configuration.default.withDefaults.withDiscriminator("type")
sealed trait Stuff
case class Foo(thisIsAString: String, anotherField: Int = 13) extends Stuff
case class Bar(thisIsAString: String, anotherField: Int = 13) extends Stuff
object Foo {
implicit val decodeFoo: Decoder[Foo] = fancy.deriveDecoder
implicit val encodeFoo: Encoder[Foo] = fancy.deriveEncoder
}
object Bar {
implicit val decodeBar: Decoder[Bar] = boring.deriveDecoder
implicit val encodeBar: Encoder[Bar] = boring.deriveEncoder
}
Actual code using this:
val foo: Stuff = Foo("abc", 123)
val bar: Stuff = Bar("xyz", 987)
val fooString = foo.asJson.noSpaces
// fooString: String = {"thisIsAString":"abc","anotherField":123,"type":"Foo"}
val barString = bar.asJson.noSpaces
// barString: String = {"thisIsAString":"xyz","anotherField":987,"type":"Bar"}
val bar2 = for{
json <- parse(barString)
bar2 <- json.as[Stuff]
} yield bar2
// bar2: scala.util.Either[io.circe.Error,Stuff] = Right(Bar(xyz,987))
val foo2 = for{
json <- parse(fooString)
foo2 <- json.as[Stuff]
} yield foo2
// foo2: scala.util.Either[io.circe.Error,Stuff] = Right(Foo(abc,123))
So, provided you import the extra dependency (which is where Configuration comes from), it looks like it works.
Finally, as a sidenote, it does seem that there is some disconnection between Circe's DESIGN.md and practice, for which I am actually happy.
Original answer:
I am not sure this is supposed to be supported, by design.
Taken from Circe's DESIGN.md:
Implicit scope should not be used for configuration. Lots of people have asked for a way to configure generic codec derivation to use e.g. a type field as the discriminator for sealed trait hierarchies, or to use snake case for member names. argonaut-shapeless supports this quite straightforwardly with a JsonCoproductCodec type that the user can provide implicitly.
I don't want to criticize this approach—it's entirely idiomatic Scala, and it often works well in practice—but I personally don't like using implicit values for configuration, and I'd like to avoid it in circe until I am 100% convinced that there's no alternative way to provide this functionality.
What this means concretely: You'll probably never see an implicit argument that isn't a type class instance—i.e. that isn't a type constructor applied to a type in your model—in circe, and configuration of generic codec derivation is going to be relatively limited (compared to e.g. argonaut-shapeless) until we find a nice way to do this kind of thing with type tags or something similar.
In particular, customConfig: Configuration seems to be exactly the type of argument that the last paragraph refers to (i.e. an implicit argument that isn't a type class instance).
I am sure that #travis-brown or any of Circe's other main contributors could shed some more light on this, in case there is in fact a way of doing this - and I would be very happy to know it! :)

How to match all words in a sentence with scala combinators?

For a first test with scala combinators, I am trying to get all words from a sentence, but I am just getting "None" from the following code :
import java.io.File
import scala.io.Source
import scala.util.parsing.combinator._
object PgnReader extends TagParser {
def parseFile(inputFile:File) = {
val pgnStream = Source.fromFile(inputFile)
val pgnStr = pgnStream.mkString
println(parseAll(tag, "Hello World !").getOrElse("None"))
pgnStream.close
}
}
trait TagParser extends RegexParsers {
val tag:Parser[String] = """[:alpha:]+""".r ^^ (_.toString)
}
I would like to get something like :
Hello
World
or even like :
List(Hello, World)
Am I on the right track with my code?
I am using Scala 2.11 and scala-parser-combinators.
You should use something like this to match a sequence of tokens instead of a single token:
trait TagParser extends RegexParsers {
val tags: Parser[List[String]] = rep("""[a-zA-Z]+""".r)
}
rep is:
A parser generator for repetitions.
rep(p) repeatedly uses p to parse the input until p fails (the result
is a List of the consecutive results of p).
http://www.scala-lang.org/files/archive/nightly/docs/parser-combinators/index.html#scala.util.parsing.combinator.RegexParsers
I think this might get you closer:
trait TagParser extends RegexParsers {
val tag = rep("""\p{Alpha}+""".r) ^^ (_.map(_.toString))
}
POSIX character classes have a different syntax in Scala (as inherited from Java). The rep() syntax allows for multiple occurrences (giving a List()).
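The difference between the two spellings can be checked with plain scala.util.matching.Regex, without any parser combinators. This is a small illustrative sketch; CharClassDemo and its member names are made up.

```scala
import scala.util.matching.Regex

object CharClassDemo {
  // In java.util.regex this is an ordinary character set containing
  // ':', 'a', 'l', 'p' and 'h', not a POSIX class.
  val bracketStyle: Regex = "[:alpha:]+".r

  // The Java spelling of the POSIX alphabetic class.
  val javaStyle: Regex = """\p{Alpha}+""".r

  // All non-overlapping matches of the regex in the input.
  def matches(re: Regex, input: String): List[String] =
    re.findAllIn(input).toList
}
```

With the input "Hello World !", javaStyle finds whole words, while bracketStyle only picks up runs of the letters it happens to list (such as "ll"), which is why the original parser never matched at the start of the sentence.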
That will still choke on the exclamation point, so you can augment your regex a little. I'd probably also go with the notion of "tag" and "tags" separately to make things clearer:
trait TagParser extends RegexParsers {
val tags = rep(tag)
val tag = """\p{Alpha}+|!""".r ^^ (_.toString)
}
...
println(parseAll(tags, "Hello World !").getOrElse(None))
...
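Putting the pieces together, a self-contained version might look like the sketch below. It assumes the scala-parser-combinators module is on the classpath; WordParser, punctuation and wordsOf are names invented here. It also matches on the ParseResult instead of calling getOrElse, so that a failure message is not silently replaced by "None".

```scala
import scala.util.parsing.combinator.RegexParsers

object WordParser extends RegexParsers {
  // A single alphabetic word; RegexParsers skips leading whitespace for us.
  val word: Parser[String] = """\p{Alpha}+""".r

  // A small, arbitrary set of punctuation marks accepted between words.
  val punctuation: Parser[String] = """[!?.,;:]""".r

  // Any number of words or punctuation tokens.
  val tokens: Parser[List[String]] = rep(word | punctuation)

  // Right(words) on success, Left(error message) when the input cannot be parsed.
  def wordsOf(sentence: String): Either[String, List[String]] =
    parseAll(tokens, sentence) match {
      case Success(result, _) => Right(result.filter(_.forall(_.isLetter)))
      case failure: NoSuccess => Left(failure.msg)
    }
}
```

Calling WordParser.wordsOf("Hello World !") should yield the two words, while input containing characters outside the grammar (digits, for example) ends up as a Left with the parser's message.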

Scala pickling: how?

I'm trying to use "pickling" serialization in Scala, and I keep seeing the same example demonstrating it:
import scala.pickling._
import json._
val pckl = List(1, 2, 3, 4).pickle
Unpickling is just as easy as pickling:
val lst = pckl.unpickle[List[Int]]
This example raises some questions. First of all, it skips converting the object to a string. Apparently you need to call pckl.value to get the JSON string representation.
Unpickling is even more confusing. Deserialization is the act of turning a string (or bytes) into an object. How does this "example" demonstrate deserialization if there is no string/binary representation of the object?
So, how do I deserialize a simple object with the pickling library?
Use the type system and case classes to achieve your goals. You can unpickle to some superior type in your hierarchy (up to and including AnyRef). Here is an example:
trait Zero
case class One(a:Int) extends Zero
case class Two(s:String) extends Zero
object Test extends App {
import scala.pickling._
import json._
// String that can be sent down a wire
val wire: String = Two("abc").pickle.value
// On the other side, just use a case class
wire.unpickle[Zero] match {
case One(a) => println(a)
case Two(s) => println(s)
case unknown => println(unknown.getClass.getCanonicalName)
}
}
Ok, I think I understood it.
import scala.pickling._
import json._
var str = Array(1,2,3).pickle.value // this is JSON string
println(str)
val x = str.unpickle[Array[Int]] // unpickle from string
will produce JSON string:
{
"tpe": "scala.Array[scala.Int]",
"value": [
1,
2,
3
]
}
So, the same way we pickle any type, we can unpickle a string. The serialization format is determined by the implicit formatter brought in by "json._" and can be replaced by "binary._"
It does look like you will be starting with a pickle to unpickle to a case class. But the JSON string can be fed to the JSONPickle class to get the starting pickle.
Here's an example based on their array-json test
package so
import scala.pickling._
import json._
case class C(arr: Array[Int]) { override def toString = s"""C(${arr.mkString("[", ",", "]")})""" }
object PickleTester extends App {
val json = """{"arr":[ 1, 2, 3 ]}"""
val cPickle = JSONPickle( json )
val unpickledC: C = cPickle.unpickle[C]
println( s"$unpickledC, arr.sum = ${unpickledC.arr.sum}" )
}
The output printed is:
C([1,2,3]), arr.sum = 6
I was able to drop the "tpe" from the test's JSON, as well as the .stripMargin.trim on the input. It works all in one line, but I thought it might be clearer split up. It's unclear to me whether the "tpe" in the test is supposed to provide a measure of type safety for the incoming JSON.
Looks like the only other class they support for pickling is a BinaryPickle unless you want to roll your own. The latest scala-pickling snapshot jar requires quasiquotes to compile the code in this answer.
I tried something more complicated this morning and discovered that the "tpe" is required for non-primitives in the incoming JSON, which shows that the serialized string really must be compatible with the pickler (which I mixed into the above code):
case class J(a: Option[Boolean], b: Option[String], c: Option[Int]) { override def toString = s"J($a, $b, $c)" }
...
val jJson = """{"a": {"tpe": "scala.None.type"},
| "b":{"tpe": "scala.Some[java.lang.String]","x":"donut"},
| "c":{"tpe": "scala.Some[scala.Int]","x":47}}"""
val jPickle = JSONPickle( jJson.stripMargin.trim )
val unpickledJ: J = jPickle.unpickle[J]
println( s"$unpickledJ" )
...
where, naturally, I had to use .value on a J(None, Some("donut"), Some(47)) to figure out how to create the jJson input value so that unpickling would not throw an exception.
The output for J is like:
J(None, Some(donut), Some(47))
Looking at this test, it appears that if the incoming JSON consists only of primitives or case classes (or combinations of them), the JSONPickle magic works, but some other classes such as Option require extra "tpe" type information to unpickle correctly.

Introspect argument passed to a Scala macro

I would like to program a Scala macro that takes an instance of a case class as argument. All objects that can be passed to the macro have to implement a specific marker trait.
The following snippet shows the marker trait and two example case classes implementing it:
trait Domain
case class Country( id: String, name: String ) extends Domain
case class Town( id: String, longitude: Double, latitude: Double ) extends Domain
Now, I would like to write the following code using macros to avoid the heaviness of runtime reflection and its thread unsafety:
object Test extends App {
// instantiate example domain object
val myCountry = Country( "CH", "Switzerland" )
// this is a macro call
logDomain( myCountry )
}
The macro logDomain is implemented in a different project and looks similar to:
object Macros {
def logDomain( domain: Domain ): Unit = macro logDomainMacroImpl
def logDomainMacroImpl( c: Context )( domain: c.Expr[Domain] ): c.Expr[Unit] = {
// Here I would like to introspect the argument object but do not know how?
// I would like to generate code that prints out all val's with their values
}
}
The macro's purpose should be to generate code that - at runtime - outputs all values (id and name) of the given object and prints them as shown next:
id (String) : CH
name (String) : Switzerland
To achieve this, I would have to dynamically inspect the passed type argument and determine its members (vals). Then I would have to generate an AST representing the code that creates the log output. The macro should work regardless of what specific object implementing the marker trait "Domain" is passed to the macro.
At this point I am lost. I would appreciate if someone could give me a starting point or point me to some documentation? I am relatively new to Scala and have not found a solution in the Scala API docs or the Macro guide.
Listing the accessors of a case class is such a common operation when you're working with macros that I tend to keep a method like this around:
def accessors[A: u.WeakTypeTag](u: scala.reflect.api.Universe) = {
import u._
u.weakTypeOf[A].declarations.collect {
case acc: MethodSymbol if acc.isCaseAccessor => acc
}.toList
}
This will give us all the case class accessor method symbols for A, if it has any. Note that I'm using the general reflection API here—there's no need to make this macro-specific yet.
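Since nothing here is macro-specific, the same idea can be tried out directly against the runtime universe. The sketch below is an illustration under a few assumptions: Scala 2.11 or later (where declarations is spelled decls) with scala-reflect on the classpath, and AccessorDemo and describe are names invented for the example.

```scala
import scala.reflect.runtime.universe._

object AccessorDemo {
  trait Domain
  case class Country(id: String, name: String) extends Domain
  case class Town(id: String, longitude: Double, latitude: Double) extends Domain

  // Name and simple return-type name of every case accessor of A,
  // in declaration order; empty if A is not a case class.
  def describe[A: TypeTag]: List[(String, String)] =
    typeOf[A].decls.collect {
      case acc: MethodSymbol if acc.isCaseAccessor =>
        acc.name.decodedName.toString -> acc.returnType.typeSymbol.name.decodedName.toString
    }.toList
}
```

For example, AccessorDemo.describe[AccessorDemo.Country] should list the id and name accessors with their String types, which is the information the macro later turns into printf calls at compile time.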
We can wrap this method up with some other convenience stuff:
trait ReflectionUtils {
import scala.reflect.api.Universe
def accessors[A: u.WeakTypeTag](u: Universe) = {
import u._
u.weakTypeOf[A].declarations.collect {
case acc: MethodSymbol if acc.isCaseAccessor => acc
}.toList
}
def printfTree(u: Universe)(format: String, trees: u.Tree*) = {
import u._
Apply(
Select(reify(Predef).tree, "printf"),
Literal(Constant(format)) :: trees.toList
)
}
}
And now we can write the actual macro code pretty concisely:
trait Domain
object Macros extends ReflectionUtils {
import scala.language.experimental.macros
import scala.reflect.macros.Context
def log[D <: Domain](domain: D): Unit = macro log_impl[D]
def log_impl[D <: Domain: c.WeakTypeTag](c: Context)(domain: c.Expr[D]) = {
import c.universe._
if (!weakTypeOf[D].typeSymbol.asClass.isCaseClass) c.abort(
c.enclosingPosition,
"Need something typed as a case class!"
) else c.Expr(
Block(
accessors[D](c.universe).map(acc =>
printfTree(c.universe)(
"%s (%s) : %%s\n".format(
acc.name.decoded,
acc.typeSignature.typeSymbol.name.decoded
),
Select(domain.tree.duplicate, acc.name)
)
),
c.literalUnit.tree
)
)
}
}
Note that we still need to keep track of the specific case class type we're dealing with, but type inference will take care of that at the call site—we won't need to specify the type parameter explicitly.
Now we can open a REPL, paste in your case class definitions, and then write the following:
scala> Macros.log(Town("Washington, D.C.", 38.89, 77.03))
id (String) : Washington, D.C.
longitude (Double) : 38.89
latitude (Double) : 77.03
Or:
scala> Macros.log(Country("CH", "Switzerland"))
id (String) : CH
name (String) : Switzerland
As desired.
From what I can see, you need to solve two problems: 1) get the necessary information from the macro argument, 2) generate trees that represent the code you need.
In Scala 2.10 these things are done with the reflection API. See the question "Is there a tutorial on Scala 2.10's reflection API yet?" for the documentation that is available.
import scala.reflect.macros.Context
import language.experimental.macros
trait Domain
case class Country(id: String, name: String) extends Domain
case class Town(id: String, longitude: Double, latitude: Double) extends Domain
object Macros {
def logDomain(domain: Domain): Unit = macro logDomainMacroImpl
def logDomainMacroImpl(c: Context)(domain: c.Expr[Domain]): c.Expr[Unit] = {
import c.universe._
// problem 1: getting the list of all declared vals and their types
// * declarations return declared, but not inherited members
// * collect filters out non-methods
// * isCaseAccessor only leaves accessors of case class vals
// * typeSignature is how you get types of members
// (for generic members you might need to use typeSignatureIn)
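// domain.actualType is the type of the argument at the call site (e.g. Country),
// so the macro is not tied to one particular case class
val vals = domain.actualType.declarations.toList.collect{ case sym if sym.isMethod => sym.asMethod }.filter(_.isCaseAccessor)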
val types = vals map (_.typeSignature)
// problem 2: generating the code which would print:
// id (String) : CH
// name (String) : Switzerland
//
// usually reify is of limited usefulness
// (see https://stackoverflow.com/questions/13795490/how-to-use-type-calculated-in-scala-macro-in-a-reify-clause)
// but here it's perfectly suitable
// a subtle detail: `domain` will be possibly used multiple times
// therefore we need to duplicate it
val stmts = vals.map(v => c.universe.reify(println(
c.literal(v.name.toString).splice +
" (" + c.literal(v.returnType.toString).splice + ")" +
" : " + c.Expr[Any](Select(domain.tree.duplicate, v)).splice)).tree)
c.Expr[Unit](Block(stmts, Literal(Constant(()))))
}
}