Why does play-json lose precision while reading/parsing?

In the following example (scala 2.11 and play-json 2.13)
val j ="""{"t":2.2599999999999997868371792719699442386627197265625}"""
println((Json.parse(j) \ "t").as[BigDecimal].compare(BigDecimal("2.2599999999999997868371792719699442386627197265625")))
The output is -1. Shouldn't they be equal? Printing the parsed value shows that it has been rounded:
println((Json.parse(j) \ "t").as[BigDecimal]) gives 2.259999999999999786837179271969944

The problem is that by default play-json configures the Jackson parser with the MathContext set to DECIMAL128. You can fix this by setting the play.json.parser.mathContext system property to unlimited. For example, in a Scala REPL that would look like this:
scala> System.setProperty("play.json.parser.mathContext", "unlimited")
res0: String = null
scala> val j ="""{"t":2.2599999999999997868371792719699442386627197265625}"""
j: String = {"t":2.2599999999999997868371792719699442386627197265625}
scala> import play.api.libs.json.Json
import play.api.libs.json.Json
scala> val res = (Json.parse(j) \ "t").as[BigDecimal]
res: BigDecimal = 2.2599999999999997868371792719699442386627197265625
scala> val expected = BigDecimal("2.2599999999999997868371792719699442386627197265625")
expected: scala.math.BigDecimal = 2.2599999999999997868371792719699442386627197265625
scala> res.compare(expected)
res1: Int = 0
Note that the setProperty call has to happen first, before any reference to Json. In normal (non-REPL) use you'd set the property with -D on the command line or in your build configuration.
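For reference, the rounding comes straight from java.math.MathContext.DECIMAL128 (34 significant digits), and you can reproduce it without play-json at all; a standalone sketch:

```scala
import java.math.MathContext

// DECIMAL128 keeps at most 34 significant digits, which is exactly the
// truncation observed when play-json parses the literal with its default.
val full    = "2.2599999999999997868371792719699442386627197265625"
val rounded = BigDecimal(full, MathContext.DECIMAL128)

println(rounded)  // 2.259999999999999786837179271969944
// The exact value (50 significant digits) no longer compares equal:
println(BigDecimal(full) == rounded)  // false
```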
Alternatively you could use Jawn's play-json parsing support, which works as expected out of the box:
scala> val j ="""{"t":2.2599999999999997868371792719699442386627197265625}"""
j: String = {"t":2.2599999999999997868371792719699442386627197265625}
scala> import org.typelevel.jawn.support.play.Parser
import org.typelevel.jawn.support.play.Parser
scala> val res = (Parser.parseFromString(j).get \ "t").as[BigDecimal]
res: BigDecimal = 2.2599999999999997868371792719699442386627197265625
Or for that matter you could switch to circe:
scala> import io.circe.Decoder, io.circe.jawn.decode
import io.circe.Decoder
import io.circe.jawn.decode
scala> decode(j)(Decoder[BigDecimal].prepare(_.downField("t")))
res0: Either[io.circe.Error,BigDecimal] = Right(2.2599999999999997868371792719699442386627197265625)
…which handles a range of number-related corner cases more responsibly than play-json in my view. For example:
scala> val big = "1e2147483648"
big: String = 1e2147483648
scala> io.circe.jawn.parse(big)
res0: Either[io.circe.ParsingFailure,io.circe.Json] = Right(1e2147483648)
scala> play.api.libs.json.Json.parse(big)
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:491)
at java.math.BigDecimal.<init>(BigDecimal.java:824)
at scala.math.BigDecimal$.apply(BigDecimal.scala:287)
at play.api.libs.json.jackson.JsValueDeserializer.parseBigDecimal(JacksonJson.scala:146)
...
But that's out of scope for this question.
To be honest I'm not sure why play-json defaults to DECIMAL128 for the MathContext, but that's a question for the play-json maintainers, and is also out of scope here.

Related

How to get the alias of a Spark Column as String?

If I declare a Column in a val, like this:
import org.apache.spark.sql.functions._
val col: org.apache.spark.sql.Column = count("*").as("col_name")
col is of type org.apache.spark.sql.Column. Is there a way to access its name ("col_name")?
Something like:
col.getName() // returns "col_name"
In this case, col.toString returns "count(1) AS col_name"
Try the code below.
scala> val cl = count("*").as("col_name")
cl: org.apache.spark.sql.Column = count(1) AS `col_name`
scala> cl.expr.argString
res14: String = col_name
scala> cl.expr.productElement(1).asInstanceOf[String]
res24: String = col_name
scala> val cl = count("*").cast("string").as("column_name")
cl: org.apache.spark.sql.Column = CAST(count(1) AS STRING) AS `column_name`
scala> cl.expr.argString
res113: String = column_name
Note that in the code above, if you swap the order of .as and .cast, expr.argString will give you the wrong result.
You can also use json4s to extract the name from expr.toJSON:
scala> import org.json4s._
import org.json4s._
scala> import org.json4s.jackson.JsonMethods._
import org.json4s.jackson.JsonMethods._
scala> implicit val formats = DefaultFormats
formats: org.json4s.DefaultFormats.type = org.json4s.DefaultFormats$@16cccda5
scala> val cl = count("*").as("column_name").cast("string") // Used cast last.
cl: org.apache.spark.sql.Column = CAST(count(1) AS `column_name` AS STRING)
scala> (parse(cl.expr.toJSON) \\ "name").extract[String]
res104: String = column_name
Another easy way: the column alias is always wrapped in backtick characters in the Column's toString, so you can either use a regex or split the string and take the element at index 1.
with split,
col.toString.split("`")(1)
with regex,
val pattern = "`(.*)`".r
pattern.findFirstMatchIn(col.toString).get.group(1)
The advantage of this approach is that it still works even if you add something like .cast("string") to your column.
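To make that concrete, here's the backtick extraction run against sample strings that mimic the Column.toString output shown above (Spark itself isn't needed for this part; the strings are assumed to match what Spark prints):

```scala
// Sample strings mimicking Column.toString (taken from the transcripts above)
val plain  = "count(1) AS `col_name`"
val casted = "CAST(count(1) AS STRING) AS `column_name`"

val pattern = "`(.*)`".r
def aliasOf(s: String): Option[String] =
  pattern.findFirstMatchIn(s).map(_.group(1))

println(aliasOf(plain))   // Some(col_name)
println(aliasOf(casted))  // Some(column_name)
```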

How can I get all (non-final) object vals and subobject vals using reflection in Scala?

Note: This question is not a duplicate of How can I get all object vals and subobject vals using reflection in Scala?
The answer provided in that question only works for final members.
For example:
scala> object Settings {
| val Host = "host"
| }
defined module Settings
scala> deepMembers(Settings)
res0: Map[String,String] = Map()
It must be a duplicate, but I need a refresher:
$ scala
Welcome to Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45).
Type in expressions to have them evaluated.
Type :help for more information.
scala> object Settings { val Host = "host" ; val Guest = "guest" }
defined object Settings
scala> import reflect.runtime._,universe._
import reflect.runtime._
import universe._
scala> val im = currentMirror reflect Settings
im: reflect.runtime.universe.InstanceMirror = instance mirror for Settings$@c8e4bb0
scala> im.symbol.asClass.typeSignature.members filter (s => s.isTerm && s.asTerm.isAccessor)
res0: Iterable[reflect.runtime.universe.Symbol] = SynchronizedOps(value Guest, value Host)
scala> res0 map (im reflectMethod _.asMethod) map (_.apply())
res2: Iterable[Any] = List(guest, host)
scala> val members = Settings.getClass.getDeclaredFields.map(_.getName).filterNot(_ == "MODULE$")
members: Array[String] = Array(Host)
This works but I think there's certainly a better way of doing this.
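Packaging the REPL steps above into a single helper (Scala 2 runtime reflection; the name memberValues is made up, and scala-reflect must be on the classpath):

```scala
import scala.reflect.runtime.{currentMirror => cm}

object Settings { val Host = "host"; val Guest = "guest" }

// Collect every val accessor on the module and invoke it reflectively
def memberValues(obj: Any): Map[String, Any] = {
  val im = cm.reflect(obj)
  im.symbol.asClass.typeSignature.members.collect {
    case s if s.isTerm && s.asTerm.isAccessor =>
      s.name.toString -> im.reflectMethod(s.asMethod).apply()
  }.toMap
}

println(memberValues(Settings))  // contains Host -> host and Guest -> guest
```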

How to eval a string val in Scala?

I have scala expression stored in String variable:
val myExpr = "(xml \\ \"node\")"
How do I execute this?
I tried s"${myExpr}" but it only gives me the string contents.
What I'm trying to achieve is parsing user string input in the form:
"/some/node/in/xml"
and get that corresponding node in Scala:
(xml \ "node" \ "in" \ "xml")
For the REPL, my init includes:
implicit class interpoleter(val sc: StringContext) {def i(args: Any*) = $intp interpret sc.s(args: _*) }
with which
scala> val myExpr = "(xml \\ \"node\")"
myExpr: String = (xml \ "node")
scala> val xml = <x><node/></x>
xml: scala.xml.Elem = <x><node/></x>
scala> i"${myExpr}"
res3: scala.xml.NodeSeq = NodeSeq(<node/>)
res2: scala.tools.nsc.interpreter.IR.Result = Success
because isn't code really just a string, like everything else?
There is probably a more idiomatic way in recent Scala versions, but you can use Twitter's Eval for that:
val i: Int = new Eval()("1 + 1") // => 2
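In plain Scala 2 you can get the same effect with the compiler's ToolBox, at the cost of having scala-compiler on the classpath; a sketch:

```scala
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox

// parse turns the string into an AST; eval compiles and runs it
val tb = currentMirror.mkToolBox()
val result = tb.eval(tb.parse("1 + 1")).asInstanceOf[Int]
println(result)  // 2
```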

In Scala I can have reference to a private type via implicit conversion

I've found this interesting behaviour in nscala_time package (scala version of joda-time)
import com.github.nscala_time.time.Imports._
import com.github.nscala_time.time.DurationBuilder
object tests {
val x = 3 seconds
//> x : is of type com.github.nscala_time.time.DurationBuilder
val xx: DurationBuilder = 3 seconds
//> fails to compile:
// class DurationBuilder in package time cannot be accessed in package com.github.nscala_time.time
}
What I'm trying to achieve is implicit conversion from nscala_time Duration to scala.concurrent.Duration
I need this because I'm using RxScala and nscala_time in one application.
// e.g. the following should be implicitly converted
// to a nscala_time Duration first,
// then to a scala.concurrent.duration.Duration
3 seconds
nscala_time offers a rich time & date API for my application, while I'm using RxScala in the same class for GUI responsiveness.
You can download a simple project to play around: https://dl.dropboxusercontent.com/u/9958045/implicit_vs_private.zip
From scala-user group: It's a known issue https://issues.scala-lang.org/browse/SI-1800
perhaps you can use an implicit conversion? (btw Duration in nscala is essentially org.joda.time.Duration):
scala> import com.github.nscala_time.time.Imports._
import com.github.nscala_time.time.Imports._
scala> implicit class DurationHelper(d:org.joda.time.Duration) {
| def toScalaDuration = scala.concurrent.duration.Duration.apply(d.getMillis,scala.concurrent.duration.MILLISECONDS)
| }
defined class DurationHelper
scala> val d = RichInt(3).seconds.toDuration
// toDuration method is defined for com.github.nscala_time.time.DurationBuilder
d: org.joda.time.Duration = PT3S
scala> def exfun(d:scala.concurrent.duration.Duration) = d.toString
exfun: (d: scala.concurrent.duration.Duration)String
scala> exfun(d.toScalaDuration)
res41: String = 3000 milliseconds
(not using import scala.concurrent.duration._ here to avoid name clashes with the joda/nscala stuff)

Problem with outputting map values in scala

I have the following code snippet:
val map = new LinkedHashMap[String,String]
map.put("City","Dallas")
println(map.get("City"))
This outputs Some(Dallas) instead of just Dallas. What's the problem with my code?
Thank you
Use the apply method: it returns the String directly and throws a NoSuchElementException if the key is not found:
scala> import scala.collection.mutable.LinkedHashMap
import scala.collection.mutable.LinkedHashMap
scala> val map = new LinkedHashMap[String,String]
map: scala.collection.mutable.LinkedHashMap[String,String] = Map()
scala> map.put("City","Dallas")
res2: Option[String] = None
scala> map("City")
res3: String = Dallas
It's not really a problem.
While Java's Map uses null to indicate that a key doesn't have an associated value, Scala's Map[A,B].get returns an Option[B], which can be Some[B] or None; None plays a similar role to Java's null.
REPL session showing why this is useful:
scala> map.get("State")
res6: Option[String] = None
scala> map.get("State").getOrElse("Texas")
res7: String = Texas
Or the simple but not recommended get:
scala> map.get("City").get
res8: String = Dallas
scala> map.get("State").get
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:262)
Check the Option documentation for more goodies.
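One more alternative worth a mention: Option.fold collapses the default and the present case into a single expression (a small self-contained sketch):

```scala
import scala.collection.mutable.LinkedHashMap

val map = LinkedHashMap("City" -> "Dallas")

// fold takes a default for None and a function to apply to the Some value
val city  = map.get("City").fold("unknown")(identity)
val state = map.get("State").fold("unknown")(identity)

println(city)   // Dallas
println(state)  // unknown
```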
There are two more ways you can handle Option results.
You can pattern match them:
scala> map.get("City") match {
| case Some(value) => println(value)
| case _ => println("found nothing")
| }
Dallas
Or there is another neat approach that appears somewhere in Programming in Scala. Use foreach to process the result. If a result is of type Some, then it will be used. Otherwise (if it's None), nothing happens:
scala> map.get("City").foreach(println)
Dallas
scala> map.get("Town").foreach(println)
(No output for the last call: the key is absent, so foreach does nothing.)