Can I make json4s's extract method case insensitive?

I am using case classes to extract json with json4s's extract method. Unfortunately, the Natural Earth source data I am using isn't consistent about casing... at some resolutions a field is called iso_a2 and at others it's ISO_A2. I can only make json4s accept the one that matches the field in the case class:
object TopoJSON {
  case class Properties(ISO_A2: String)
  ...
}
// only accepts the capitalised version
Is there any way to make json4s ignore case and accept both?

There is no way to make it case insensitive using the configuration properties, but a similar result can be achieved by either lowercasing or uppercasing the field names in the parsed JSON.
For example, we have our input:
import org.json4s._
import org.json4s.native.JsonMethods._ // or org.json4s.jackson.JsonMethods._

case class Properties(iso_a2: String)
implicit val formats = DefaultFormats

val parsedLower = parse("""{ "iso_a2": "test1" }""")
val parsedUpper = parse("""{ "ISO_A2": "test2" }""")
We can lowercase all field names using a short function:
private def lowercaseAllFieldNames(json: JValue) = json transformField {
  case (field, value) => (field.toLowerCase, value)
}
or make it for specific fields only:
private def lowercaseFieldByName(fieldName: String, json: JValue) = json transformField {
  case (field, value) if field == fieldName => (fieldName.toLowerCase, value)
}
Now, to extract the case class instances:
val resultFromLower = lowercaseAllFieldNames(parsedLower).extract[Properties]
val resultFromUpper = lowercaseAllFieldNames(parsedUpper).extract[Properties]
val resultByFieldName = lowercaseFieldByName("ISO_A2", parsedUpper).extract[Properties]
// all produce expected items:
// Properties(test1)
// Properties(test2)
// Properties(test2)
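If this is needed in more than one place, the transform and the extraction can be rolled into a single generic helper. A minimal sketch (the helper name is mine, not part of json4s); it works as long as the case-class field names are themselves lowercase:
// Lowercase every field name, then extract as usual.
def extractCaseInsensitive[T](json: JValue)(implicit formats: Formats, mf: Manifest[T]): T =
  json.transformField { case (name, value) => (name.toLowerCase, value) }.extract[T]

// extractCaseInsensitive[Properties](parsedLower) // Properties(test1)
// extractCaseInsensitive[Properties](parsedUpper) // Properties(test2)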

Related

Pattern matching in Scala with generic return type

There are four different types: Location, Language, Technology and Industry. There is a repository for each type that can return a collection of that type, for example a list of Locations. Each type has a name property of type String. There is a list of Strings which can contain names of Locations, Languages, etc. I would like to write a function to find the typed entities (Location, Language, ...) whose names match those in the String list. I was thinking about something like this:
def find[T](names: String): Set[T] = {
  val collection = T match {
    case Language => LanguageRepository.getAll
    case Location => LocationRepository.getAll
    case Technology => TechnologyRepository.getAll
    case Industry => IndustryRepository.getAll
  }
  // find the elements in the collection
}
This is not correct, so how can the querying of the collection be done, and after that how can I be sure that the name property is there?
Why not use algebraic data types for this? They can serve as enums. For example:
sealed trait Property
case object LanguageProperty extends Property
case object LocationProperty extends Property
case object TechnologyProperty extends Property
case object IndustryProperty extends Property
def find(names: String, property: Property): Set[ParentClass] = {
  val collection = property match {
    case LanguageProperty => LanguageRepository.getAll
    case LocationProperty => LocationRepository.getAll
    case TechnologyProperty => TechnologyRepository.getAll
    case IndustryProperty => IndustryRepository.getAll
  }
  // find the elements in the collection
}
Implying that ParentClass is a parent class of your Language, Location, Technology and Industry classes.
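To address the second half of the question (being sure the name property is there), ParentClass itself can declare it, so the filtering compiles against the common supertype. A minimal sketch, assuming the repositories from the question return Set[ParentClass] (or subsets thereof):
sealed trait ParentClass { def name: String }
case class Language(name: String) extends ParentClass
case class Location(name: String) extends ParentClass
case class Technology(name: String) extends ParentClass
case class Industry(name: String) extends ParentClass

// names as a list here, matching "a list of Strings" in the question
def find(names: List[String], property: Property): Set[ParentClass] = {
  val collection: Set[ParentClass] = property match {
    case LanguageProperty   => LanguageRepository.getAll
    case LocationProperty   => LocationRepository.getAll
    case TechnologyProperty => TechnologyRepository.getAll
    case IndustryProperty   => IndustryRepository.getAll
  }
  // name is guaranteed by ParentClass, so this compiles for every branch
  collection.filter(entity => names.contains(entity.name))
}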
Alternatively, you can take an implicit ClassTag parameter to determine the runtime class of the type argument:
import scala.reflect.ClassTag

case class Language()
case class Location()
case class Technology()
case class Industry()

val LANG = classOf[Language]
val LOC = classOf[Location]
val TEC = classOf[Technology]
val IND = classOf[Industry]

def find[Z](names: String)(implicit ct: ClassTag[Z]): Set[Z] = {
  val collection = ct.runtimeClass match {
    case LANG => Set(Language())
    case LOC => Set(Location())
    case TEC => Set(Technology())
    case IND => Set(Industry())
  }
  collection.asInstanceOf[Set[Z]]
}
and then
find[Language]("")
find[Industry]("")
produces
res0: Set[Language] = Set(Language())
res1: Set[Industry] = Set(Industry())

Nested Scala case classes to/from CSV

There are many nice libraries for writing/reading Scala case classes to/from CSV files. I'm looking for something that goes beyond that, which can handle nested cases classes. For example, here a Match has two Players:
case class Player(name: String, ranking: Int)
case class Match(place: String, winner: Player, loser: Player)
val matches = List(
  Match("London", Player("Jane", 7), Player("Fred", 23)),
  Match("Rome", Player("Marco", 19), Player("Giulia", 3)),
  Match("Paris", Player("Isabelle", 2), Player("Julien", 5))
)
I'd like to effortlessly (no boilerplate!) write/read matches to/from this CSV:
place,winner.name,winner.ranking,loser.name,loser.ranking
London,Jane,7,Fred,23
Rome,Marco,19,Giulia,3
Paris,Isabelle,2,Julien,5
Note the automated header line using the dot "." to form the column name for a nested field, e.g. winner.ranking. I'd be delighted if someone could demonstrate a simple way to do this (say, using reflection or Shapeless).
[Motivation. During data analysis it's convenient to have a flat CSV to play around with, for sorting, filtering, etc., even when case classes are nested. And it would be nice if you could load nested case classes back from such files.]
Since a case-class is a Product, getting the values of the various fields is relatively easy. Getting the names of the fields/columns does require using Java reflection.
The following function takes a list of case-class instances and returns a list of rows, each a list of strings. It uses recursion to collect the values and headers of nested case-class instances.
def toCsv(p: List[Product]): List[List[String]] = {
  def header(c: Class[_], prefix: String = ""): List[String] = {
    c.getDeclaredFields.toList.flatMap { field =>
      val name = prefix + field.getName
      if (classOf[Product].isAssignableFrom(field.getType)) header(field.getType, name + ".")
      else List(name)
    }
  }
  def flatten(p: Product): List[String] =
    p.productIterator.flatMap {
      case p: Product => flatten(p)
      case v => List(v.toString)
    }.toList
  // derive the header from the runtime class of the first element
  header(p.head.getClass) :: p.map(flatten)
}
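For a quick check, printing the rows joined with commas reproduces the CSV from the question (getDeclaredFields returns fields in declaration order for Scala case classes in practice, though the JVM does not strictly guarantee it):
toCsv(matches).map(_.mkString(",")).foreach(println)
// place,winner.name,winner.ranking,loser.name,loser.ranking
// London,Jane,7,Fred,23
// Rome,Marco,19,Giulia,3
// Paris,Isabelle,2,Julien,5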
However, constructing case classes from CSV is far more involved, requiring reflection to get the types of the various fields, to create the values from the CSV strings, and to construct the case-class instances.
For simplicity (not saying the code is simple, just so it won't be further complicated), I assume that the order of columns in the CSV is the same as if the file was produced by the toCsv(...) function above.
The following function starts by creating a list of "instructions for how to process a single CSV row" (the instructions are also used to verify that the column headers in the CSV match the case-class properties). The instructions are then used to recursively produce one CSV row at a time.
import scala.reflect.ClassTag

def fromCsv[T <: Product](csv: List[List[String]])(implicit tag: ClassTag[T]): List[T] = {
  trait Instruction {
    val name: String
    val header = true
  }
  case class BeginCaseClassField(name: String, clazz: Class[_]) extends Instruction {
    override val header = false
  }
  case class EndCaseClassField(name: String) extends Instruction {
    override val header = false
  }
  case class IntField(name: String) extends Instruction
  case class StringField(name: String) extends Instruction
  case class DoubleField(name: String) extends Instruction

  def scan(c: Class[_], prefix: String = ""): List[Instruction] = {
    c.getDeclaredFields.toList.flatMap { field =>
      val name = prefix + field.getName
      val fType = field.getType
      if (fType == classOf[Int]) List(IntField(name))
      else if (fType == classOf[Double]) List(DoubleField(name))
      else if (fType == classOf[String]) List(StringField(name))
      else if (classOf[Product].isAssignableFrom(fType)) BeginCaseClassField(name, fType) :: scan(fType, name + ".")
      else throw new IllegalArgumentException(s"Unsupported field type: $fType")
    } :+ EndCaseClassField(prefix)
  }

  def produce(instructions: List[Instruction], row: List[String], argAccumulator: List[Any]): (List[Instruction], List[String], List[Any]) = instructions match {
    case IntField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head.toInt)
    case StringField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head)
    case DoubleField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head.toDouble)
    case BeginCaseClassField(_, clazz) :: tail =>
      val (instructionRemaining, rowRemaining, constructorArgs) = produce(tail, row, List.empty)
      val newCaseClass = clazz.getConstructors.head.newInstance(constructorArgs.map(_.asInstanceOf[AnyRef]): _*)
      produce(instructionRemaining, rowRemaining, argAccumulator :+ newCaseClass)
    case EndCaseClassField(_) :: tail => (tail, row, argAccumulator)
    case Nil if row.isEmpty => (Nil, Nil, argAccumulator)
    case Nil => throw new IllegalArgumentException("Not all values from CSV row were used")
  }

  val instructions = BeginCaseClassField(".", tag.runtimeClass) :: scan(tag.runtimeClass)
  assert(csv.head == instructions.filter(_.header).map(_.name), "CSV header doesn't match target case-class fields")
  csv.drop(1).map(row => produce(instructions, row, List.empty)._3.head.asInstanceOf[T])
}
I've tested this using:
case class Player(name: String, ranking: Int, price: Double)
case class Match(place: String, winner: Player, loser: Player)
val matches = List(
  Match("London", Player("Jane", 7, 12.5), Player("Fred", 23, 11.1)),
  Match("Rome", Player("Marco", 19, 13.54), Player("Giulia", 3, 41.8)),
  Match("Paris", Player("Isabelle", 2, 31.7), Player("Julien", 5, 16.8))
)
val csv = toCsv(matches)
val matchesFromCsv = fromCsv[Match](csv)
assert(matches == matchesFromCsv)
Obviously this should be optimized and hardened if you ever want to use this for production...

Check if a value either equals a string or is an array which contains a string

A Scala map contains a key X.
The value can be either an array of Strings Array("Y")
or a simple String object, "Y".
I need to retrieve the value from the map and test
if the value is a string,
mayMap("X")=="Y"
or, if the value is an array.
myMap("X").contains("Y")
I don't want to use an if statement to check the type of the value first. One option would be to write a function that checks the value: if it is an array, it returns the array; otherwise it creates an array from the single string contained in the map. The call would then be:
myToArrayFunction(myMap("X")).contains("Y")
That's what I actually do in Java.
But this is Scala. Is there a better idiom to do this in one line using pre-existing functions?
This should work:
myMap.get("X") match {
case None => println("oh snap!")
case Some(x) => x match {
case i: String => println(s"hooray for my String $i") // do something with your String here
case a: Array[String] => println(s"It's an Array $a") // do something with your Array here
}
}
case class Y(name: String)

//val a = Map[String, Any]("X" -> "Y")
val a = Map[String, Any]("X" -> Y("Peter"))

a.getOrElse("X", "Default") match {
  case s: String => println(s)
  case Y(name) => println(name)
}
You can also use something like this:
//val a = Map[String, Any]("X" -> "Y")
val a = Map[String, Any]("X" -> Y("Peter"))
a.foreach(v => println(v._2))
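For the one-line idiom the question asks for, the normalizing function only needs to be written once. A minimal sketch (toStringSeq is a hypothetical name, not a library function):
// Normalize the value to a Seq[String], whatever shape it arrived in.
def toStringSeq(value: Any): Seq[String] = value match {
  case s: String        => Seq(s)
  case a: Array[String] => a.toSeq
  case other            => Seq(other.toString)
}

// Then the check is a one-liner:
toStringSeq(myMap("X")).contains("Y")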

Specs2 JSONMatchers: mapping over Array elements?

I'm using the Specs2 JSONMatcher to validate that a GET request is being correctly converted from its internal representation (there are some manipulations we do before generating the JSON). What I need to do is, make sure that an element in the JSON array matches the corresponding object from our repository.
What I've tried:
val resp = response.entity.asString // Spray's way of getting the JSON response
repository.all.map { obj =>
  resp must */ ("id" -> obj.id)
  resp must */ ("state" -> generateState(obj))
}
The problem is that the */ matcher just finds that "state": "whatever" (assuming generateState returns "whatever") exists somewhere in the JSON document, not necessarily in the same element matched by the ID.
I tried using the indices but the repository.all method doesn't always return the elements in the same order, so there's no way of matching by index.
What I'd like to do is, iterate over the elements of the JSON array and match each one separately. Say an /## operator (or something) that takes matchers for each element:
resp /## { elem =>
  val id = elem("id")
  val obj = repository.lookup(id)
  elem /("state" -> generateState(obj))
}
Does anyone have a way to do this or something similar?
Probably the easiest thing to do for now (until a serious refactoring of JsonMatchers) is to do some parsing and use JsonMatchers recursively inside a Matcher[String]:
"""{'db' : { 'id' : '1', 'state' : 'ok_1'} }""" must /("db" -> stateIsOk)
// a string matcher for the json string in 'db'
def stateIsOk: Matcher[String] = { json: String =>
// you need to provide a way to access the 'id' field
// then you can go on using a json matcher for the state
findId(json) must beSome { id: String =>
val obj = repository.lookup(id)
json must /("state" -> generate(obj))
}
}
// I am using my own parse function here
def findId(json: String): Option[String] =
parse(json).flatMap { a =>
findDeep("id", a).collect { case JSONArray(List(v)) => v.toString }
}
// dummy system
def generate(id: String) = "ok_"+id
case object repository {
def lookup(id: String) = id
}
What I did in the end was to use responseAs[JArray], JArray#arr and JObject#values to convert the JSON structures into Lists and Maps, and then use the standard List and Map matchers. Much more flexible.
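A rough sketch of that approach with a json4s-style AST (repository.all and generateState are carried over from the question; the collection shapes are assumptions):
import org.json4s._
import org.json4s.native.JsonMethods.parse

// Turn the JSON array into plain Scala Lists and Maps...
val items: List[Map[String, JValue]] = parse(resp) match {
  case JArray(arr) => arr.collect { case JObject(fields) => fields.toMap }
  case _           => Nil
}

// ...then compare order-independently with the standard matchers.
val expected = repository.all.map(obj => (obj.id, generateState(obj))).toSet
val actual   = items.map(m => (m("id").values, m("state").values)).toSet
actual must beEqualTo(expected)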

how to deserialize DateTime in Lift

I am having trouble deserializing a org.joda.time.DateTime field from JSON into a case class.
The JSON:
val ajson=parse(""" { "creationDate": "2013-01-02T10:48:41.000-05:00" }""")
I also set these serialization options:
implicit val formats = Serialization.formats(NoTypeHints) ++ net.liftweb.json.ext.JodaTimeSerializers.all
And the deserialization:
val val1=ajson.extract[Post]
where Post is:
case class Post(creationDate: DateTime){ ... }
The exception I get is:
net.liftweb.json.MappingException: No usable value for creationDate
Invalid date format 2013-01-02T10:48:41.000-05:00
How can I deserialize that date string into a DateTime object?
EDIT:
This works: val date3= new DateTime("2013-01-05T06:24:53.000-05:00")
which uses the same date string from the JSON as in the deserialization. What's happening here?
It seems to be the date format that Lift's DateParser uses by default. Digging into the code, you can see that the parser attempts DateParser.parse(s, format) before passing the result to the constructor for org.joda.time.DateTime.
object DateParser {
  def parse(s: String, format: Formats) =
    format.dateFormat.parse(s).map(_.getTime).getOrElse(throw new MappingException("Invalid date format " + s))
}

case object DateTimeSerializer extends CustomSerializer[DateTime](format => (
  {
    case JString(s) => new DateTime(DateParser.parse(s, format))
    case JNull => null
  },
  {
    case d: DateTime => JString(format.dateFormat.format(d.toDate))
  }
))
The format that Lift seems to be using is: yyyy-MM-dd'T'HH:mm:ss.SSS'Z'
To get around that, you could either specify the correct pattern and add that to your serialization options, or if you would prefer to just have the JodaTime constructor do all the work, you could create your own serializer like:
case object MyDateTimeSerializer extends CustomSerializer[DateTime](format => (
  {
    case JString(s) => tryo(new DateTime(s)).openOr(throw new MappingException("Invalid date format " + s))
    case JNull => null
  },
  {
    case d: DateTime => JString(format.dateFormat.format(d.toDate))
  }
))
And then add that to your list of formats, instead of net.liftweb.json.ext.JodaTimeSerializers.all
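Registering the custom serializer would then look something like this (a sketch, reusing ajson and Post from the question):
implicit val formats = Serialization.formats(NoTypeHints) + MyDateTimeSerializer

val post = ajson.extract[Post] // creationDate is now parsed by the DateTime constructor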
Perhaps not 100% elegant, but it is just a few lines, quite readable, and it works:
val SourceISODateTimeFormat = DateTimeFormat.forPattern("YYYY-MM-dd'T'HH:mm:ss.SSSZ")
val IntermediateDateTimeFormat = DateTimeFormat.forPattern("YYYY-MM-dd'T'HH:mm:ss'Z'")

def transformTimestamps(jvalue: JValue) = jvalue.transform {
  case JField(name @ ("createdTime" | "updatedTime"), JString(value)) =>
    val dt = SourceISODateTimeFormat.parseOption(value).get
    JField(name, JString(IntermediateDateTimeFormat.print(dt)))
}