I set up an sbt console session like this:
import org.json4s._
import org.json4s.native.JsonMethods._
import org.json4s.JsonDSL._
case class TagOptionOrNull(tag: String, optionUuid: Option[java.util.UUID], uuid: java.util.UUID)
val t1 = new TagOptionOrNull("t1", Some(java.util.UUID.randomUUID), java.util.UUID.randomUUID)
val t2 = new TagOptionOrNull("t2", None, null)
I'm trying to see how json4s handles null versus Option[UUID], but I can't figure out how to get it to turn my case class into a JSON string.
scala> implicit val formats = DefaultFormats
formats: org.json4s.DefaultFormats.type = org.json4s.DefaultFormats$@614275d5
scala> compact(render(t1))
<console>:23: error: type mismatch;
found : TagOptionOrNull
required: org.json4s.JValue
(which expands to) org.json4s.JsonAST.JValue
compact(render(t1))
What am I missing?
Serialization.write should be able to serialise a case class, like so:
import org.json4s.ext.JavaTypesSerializers // provided by json4s-ext
import org.json4s.native.Serialization.write
implicit val formats = DefaultFormats ++ JavaTypesSerializers.all
println(write(t1))
which should output
{"tag":"t1","optionUuid":"95645021-f60c-4708-8bf3-9d5609559fdb","uuid":"19cc4979-5836-4edf-aedd-dcb3e96f17d6"}
Note that to serialise a UUID we need the JavaTypesSerializers formats from json4s-ext:
libraryDependencies += "org.json4s" %% "json4s-ext" % version
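The compile error in the question arises because render expects a JValue rather than an arbitrary case class. As a minimal sketch, assuming json4s-native is on the classpath, Extraction.decompose can turn the case class into a JValue first. The Tag class and the DecomposeSketch object are hypothetical names used only for illustration, and the UUID fields are left out so that json4s-ext is not needed here.

```scala
import org.json4s._
import org.json4s.native.JsonMethods.{compact, render}
import org.json4s.native.Serialization

// Hypothetical case class standing in for TagOptionOrNull, without the UUID fields.
case class Tag(name: String, note: Option[String])

object DecomposeSketch {
  implicit val formats: Formats = DefaultFormats

  // decompose converts the case class into the JValue that render expects.
  def viaDecompose(t: Tag): String = compact(render(Extraction.decompose(t)))

  // Serialization.write goes from case class to JSON string in one call.
  def viaWrite(t: Tag): String = Serialization.write(t)

  def main(args: Array[String]): Unit = {
    println(viaDecompose(Tag("t1", Some("note"))))
    println(viaWrite(Tag("t2", None)))
  }
}
```

Printing the second value is a quick way to observe how the default formats treat a None field.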
Related
I'm trying to run an example from the Spark book Spark: The Definitive Guide
build.sbt
ThisBuild / scalaVersion := "3.2.1"
libraryDependencies ++= Seq(
("org.apache.spark" %% "spark-sql" % "3.2.0" % "provided").cross(CrossVersion.for3Use2_13)
)
Compile / run := Defaults.runTask(Compile / fullClasspath, Compile / run / mainClass, Compile / run / runner).evaluated
lazy val root = (project in file("."))
.settings(
name := "scalalearn"
)
main.scala
// imports
...
object spark1 {
@main
def main(args: String*): Unit = {
...
case class Flight(
DEST_COUNTRY_NAME: String,
ORIGIN_COUNTRY_NAME: String,
count: BigInt
)
val flightsDF = spark.read
.parquet(s"$dataRootPath/data/flight-data/parquet/2010-summary.parquet/")
// import spark.implicits._ // FAILS
// import spark.sqlContext.implicits._ // FAILS
val flights = flightsDF.as[Flight]
// in Scala
flights
.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
.map(flight_row => flight_row)
.take(5)
spark.stop()
}
}
I get an error on the line val flights = flightsDF.as[Flight]:
Unable to find encoder for type Flight. An implicit Encoder[Flight] is needed to store Flight
instances in a Dataset. Primitive types (Int, String, etc) and Product types (case classes)
are supported by importing spark.implicits._ Support for serializing other types will be added in
future releases.
Any help is appreciated.
Scala - 3.2.1
Spark - 3.2.0
Tried importing implicits from spark.implicits._ and spark.sqlContext.implicits._
The example works on scala 2.x
Looking for a way to convert DF to case class without any third party workarounds
You need to add a Scala 3 dependency that provides the Spark codecs:
https://github.com/vincenzobaz/spark-scala3
libraryDependencies += "io.github.vincenzobaz" %% "spark-scala3" % "0.1.3"
and use the Scala 3 import
import scala3encoders.given
instead of the Scala 2 ones
import spark.implicits._ // FAILS
import spark.sqlContext.implicits._ // FAILS
Scala Spark Encoders.product[X] (where X is a case class) keeps giving me "No TypeTag available for X" error
Regarding BigInt,
Does Spark support BigInteger type?
Spark does support Java BigIntegers but possibly with some loss of precision. If the numerical value of the BigInteger fits in a long (i.e. between -2^63 and 2^63-1) then it will be stored by Spark as a LongType. Otherwise it will be stored as a DecimalType, but this type only supports 38 digits of precision.
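The rule quoted above can be sketched with the standard library alone. This is only an illustration of the quoted rule, not Spark's own code; the BigIntStorage object, the storageFor function and the returned labels are hypothetical names.

```scala
object BigIntStorage {
  // DecimalType supports at most 38 digits of precision, per the quote above.
  val MaxDecimalDigits = 38

  // Classifies a value according to the quoted rule:
  // fits in a Long -> LongType, else up to 38 digits -> DecimalType, else too large.
  def storageFor(n: BigInt): String =
    if (n.isValidLong) "LongType"
    else if (n.abs.toString.length <= MaxDecimalDigits) "DecimalType(38,0)"
    else "exceeds 38 digits"

  def main(args: Array[String]): Unit = {
    println(storageFor(BigInt(42)))
    println(storageFor(BigInt(2).pow(63)))
    println(storageFor(BigInt(10).pow(38)))
  }
}
```

A check like this on sample values may help decide which of the two codec variants is appropriate for a given file.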
Correct codecs for comparatively small BigInts (fitting into LongType) are
import scala3encoders.derivation.{Deserializer, Serializer}
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.expressions.objects.{Invoke, StaticInvoke}
import org.apache.spark.sql.types.{DataType, LongType, ObjectType}
given Deserializer[BigInt] with
def inputType: DataType = LongType
def deserialize(path: Expression): Expression =
StaticInvoke(
BigInt.getClass,
ObjectType(classOf[BigInt]),
"apply",
path :: Nil,
returnNullable = false
)
given Serializer[BigInt] with
def inputType: DataType = ObjectType(classOf[BigInt])
def serialize(inputObject: Expression): Expression =
Invoke(inputObject, "longValue", LongType, returnNullable = false)
import scala3encoders.given
https://github.com/DmytroMitin/spark_stackoverflow/blob/87ef5361dd3553f8cc5ced26fed4c17c0061d6a2/src/main/scala/main.scala
(https://github.com/databricks/Spark-The-Definitive-Guide)
https://github.com/yashwanthreddyg/spark_stackoverflow/pull/1
https://gist.github.com/DmytroMitin/3c0fe6983a254b350ff9feedbb066bef
https://github.com/vincenzobaz/spark-scala3/pull/22
For large BigInts (those that do not fit into LongType, so DecimalType is necessary) the codecs are
import scala3encoders.derivation.{Deserializer, Serializer}
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.expressions.objects.{Invoke, StaticInvoke}
import org.apache.spark.sql.types.{DataType, DataTypes, Decimal, ObjectType}
val decimalType = DataTypes.createDecimalType(38, 0)
given Deserializer[BigInt] with
def inputType: DataType = decimalType
def deserialize(path: Expression): Expression =
Invoke(path, "toScalaBigInt", ObjectType(classOf[scala.math.BigInt]), returnNullable = false)
given Serializer[BigInt] with
def inputType: DataType = ObjectType(classOf[BigInt])
def serialize(inputObject: Expression): Expression =
StaticInvoke(
Decimal.getClass,
decimalType,
"apply",
inputObject :: Nil,
returnNullable = false
)
import scala3encoders.given
which is almost the same as
import org.apache.spark.sql.catalyst.DeserializerBuildHelper.createDeserializerForScalaBigInt
import org.apache.spark.sql.catalyst.SerializerBuildHelper.createSerializerForScalaBigInt
import scala3encoders.derivation.{Deserializer, Serializer}
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.types.{DataType, DataTypes, ObjectType}
val decimalType = DataTypes.createDecimalType(38, 0)
given Deserializer[BigInt] with
def inputType: DataType = decimalType
def deserialize(path: Expression): Expression =
createDeserializerForScalaBigInt(path)
given Serializer[BigInt] with
def inputType: DataType = ObjectType(classOf[BigInt])
def serialize(inputObject: Expression): Expression =
createSerializerForScalaBigInt(inputObject)
import scala3encoders.given
https://gist.github.com/DmytroMitin/8124d2a4cd25c8488c00c5a32f244f64
The runtime exception you observed means that the BigInts in the parquet file are comparatively small (fitting into LongType), while you tried my codecs for large BigInts (DecimalType):
https://gist.github.com/DmytroMitin/ad77677072c1d8d5538c94cb428c8fa4 (ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java': A method named "toScalaBigInt" is not declared in any enclosing class nor any supertype, nor through a static import)
Vice versa, for large BigInts (DecimalType) and codecs for small BigInts (LongType): https://gist.github.com/DmytroMitin/3a3a61082fbfc12447f6e926fc45c7cd (ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java': No applicable constructor/method found for actual parameters "org.apache.spark.sql.types.Decimal"; candidates are: ...)
We can't use both codecs for LongType and DecimalType: https://gist.github.com/DmytroMitin/32040a6b702fff5c53c727616b318cb5 (Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: All input types must be the same except nullable, containsNull, valueContainsNull flags. The input types found are LongType DecimalType(38,0))
For a mixture of small and large BigInts, the correct choice is the codecs for DecimalType: https://gist.github.com/DmytroMitin/626e09a63a387e6ff1d7fe264fc14d6b
The approach with manually created TypeTags seems to work too (not using scala3encoders)
// libraryDependencies += scalaOrganization.value % "scala-reflect" % "2.13.10" // in Scala 3
import scala.reflect.api
import scala.reflect.runtime.universe.{NoType, Type, TypeTag, internal}
import scala.reflect.runtime.universe
inline def createTypeTag[T](mirror: api.Mirror[_ <: api.Universe with Singleton], tpe: mirror.universe.Type): mirror.universe.TypeTag[T] = {
mirror.universe.TypeTag.apply[T](mirror.asInstanceOf[api.Mirror[mirror.universe.type]],
new api.TypeCreator {
override def apply[U <: api.Universe with Singleton](m: api.Mirror[U]): m.universe.Type = {
tpe.asInstanceOf[m.universe.Type]
}
}
)
}
val rm = universe.runtimeMirror(this.getClass.getClassLoader)
// val bigIntTpe = internal.typeRef(internal.typeRef(NoType, rm.staticPackage("scala.math"), Nil), rm.staticClass("scala.math.BigInt"), Nil)
// val strTpe = internal.typeRef(internal.typeRef(NoType, rm.staticPackage("java.lang"), Nil), rm.staticClass("java.lang.String"), Nil)
val flightTpe = internal.typeRef(NoType, rm.staticClass("Flight"), Nil)
// given TypeTag[BigInt] = createTypeTag[BigInt](rm, bigIntTpe)
// given TypeTag[String] = createTypeTag[String](rm, strTpe)
given TypeTag[Flight] = createTypeTag[Flight](rm, flightTpe)
import spark.implicits._
https://gist.github.com/DmytroMitin/bb0ccd5f1c533b2baec1756da52f8824
I have some code that works in Scala 2.{10,11,12,13} that I'm now trying to convert to Scala 3. Scala 3 does Enumeration differently than Scala 2. I'm trying to figure out how to convert the following code that interacts with play-json so that it will work with Scala 3. Any tips or pointers to code from projects that have already crossed this bridge?
// Scala 2.x style code in EnumUtils.scala
import play.api.libs.json._
import scala.language.implicitConversions
// see: http://perevillega.com/enums-to-json-in-scala
object EnumUtils {
def enumReads[E <: Enumeration](enum: E): Reads[E#Value] =
new Reads[E#Value] {
def reads(json: JsValue): JsResult[E#Value] = json match {
case JsString(s) => {
try {
JsSuccess(enum.withName(s))
} catch {
case _: NoSuchElementException =>
JsError(s"Enumeration expected of type: '${enum.getClass}', but it does not appear to contain the value: '$s'")
}
}
case _ => JsError("String value expected")
}
}
implicit def enumWrites[E <: Enumeration]: Writes[E#Value] = new Writes[E#Value] {
def writes(v: E#Value): JsValue = JsString(v.toString)
}
implicit def enumFormat[E <: Enumeration](enum: E): Format[E#Value] = {
Format(EnumUtils.enumReads(enum), EnumUtils.enumWrites)
}
}
// ----------------------------------------------------------------------------------
// Scala 2.x style code in Xyz.scala
import play.api.libs.json.{Reads, Writes}
object Xyz extends Enumeration {
type Xyz = Value
val name, link, unknown = Value
implicit val enumReads: Reads[Xyz] = EnumUtils.enumReads(Xyz)
implicit def enumWrites: Writes[Xyz] = EnumUtils.enumWrites
}
As an option you can switch to jsoniter-scala.
It supports enums for Scala 2 and Scala 3 out of the box.
Also it has handy derivation of safe and efficient JSON codecs.
You just need to add the required libraries to your dependencies:
libraryDependencies ++= Seq(
// Use the %%% operator instead of %% for Scala.js and Scala Native
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "2.13.5",
// Use the "provided" scope instead when the "compile-internal" scope is not supported
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "2.13.5" % "compile-internal"
)
And then derive a codec and use it:
import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._
implicit val codec: JsonValueCodec[Xyz.Xyz] = JsonCodecMaker.make
println(readFromString[Xyz.Xyz]("\"name\""))
BTW, you can run the full code on Scastie: https://scastie.scala-lang.org/Evj718q6TcCZow9lRhKaPw
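If staying on play-json is preferred, here is one possible sketch, assuming a play-json build published for your Scala version. Two things in the Scala 2 helper tend to break under Scala 3: enum is now a keyword, so it cannot be a parameter name, and a type projection such as E#Value on a type parameter is no longer accepted. Using the path-dependent type e.Value avoids both and should still compile on Scala 2. EnumJson and its method names are hypothetical; Xyz mirrors the enumeration from the question.

```scala
import play.api.libs.json._

object EnumJson {
  // The result type depends on the argument, so no E#Value projection is needed.
  def enumReads(e: Enumeration): Reads[e.Value] = Reads[e.Value] {
    case JsString(s) =>
      e.values.find(_.toString == s) match {
        case Some(v) => JsSuccess(v)
        case None    => JsError(s"'$s' is not a value of ${e.getClass.getName}")
      }
    case _ => JsError("String value expected")
  }

  // Writes the value's name as a JSON string.
  def enumWrites(e: Enumeration): Writes[e.Value] =
    Writes[e.Value](v => JsString(v.toString))
}

object Xyz extends Enumeration {
  type Xyz = Value
  val name, link, unknown = Value

  implicit val reads: Reads[Xyz] = EnumJson.enumReads(Xyz)
  implicit val writes: Writes[Xyz] = EnumJson.enumWrites(Xyz)
}
```

This keeps scala.Enumeration as it is; a native Scala 3 enum would need a different helper.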
I'm trying to serialize a case class with an optional value-class field to JSON using json4s. So far, I haven't been able to get the optional value-class field rendered correctly (see the snippet below with examples).
I tried the json4s-native and json4s-jackson libraries; the results are identical.
Here's a short self-contained test
import org.json4s.DefaultFormats
import org.scalatest.FunSuite
import org.json4s.native.Serialization._
class JsonConversionsTest extends FunSuite {
implicit val jsonFormats = DefaultFormats
test("optional value-class instance conversion") {
val json = writePretty(Foo(Option(Id(123)), "foo-name", Option("foo-opt"), Id(321)))
val actual =
"""
|{
| "id":{
| "value":123
| },
| "name":"foo-name",
| "optField":"foo-opt",
| "nonOptId":321
|}
|""".stripMargin.trim
assert(json === actual)
val correct =
"""
|{
| "id": 123,
| "name":"foo-name",
| "optField":"foo-opt",
| "nonOptId":321
|}
|""".stripMargin.trim
assert(json !== correct)
}
}
case class Id(value: Int) extends AnyVal
case class Foo(id: Option[Id], name: String, optField: Option[String], nonOptId: Id)
I'm using scala 2.12 and the latest json4s-native version:
"org.json4s" %% "json4s-native" % "3.6.7"
It looks very similar to this issue, which doesn't seem to have been fixed or commented on.
A custom serializer would save your day.
object IdSerializer extends CustomSerializer[Id] ( format => (
{ case JInt(a) => Id(a.toInt) },
{ case a: Id => JInt(a.value) }
))
implicit val formats = DefaultFormats + IdSerializer
val json = writePretty(Foo(Option(Id(123)), "foo-name", Option("foo-opt"), Id(321)))
In any case, consider other options such as jsoniter-scala. It is much safer and more efficient than json4s, especially when serializing prettified JSON.
Add/replace dependencies:
libraryDependencies ++= Seq(
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "0.55.2" % Compile,
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "0.55.2" % Provided // required only in compile-time
)
Derive a codec for the top-level data structure and use it to serialize into a string:
import com.github.plokhotnyuk.jsoniter_scala.macros._
import com.github.plokhotnyuk.jsoniter_scala.core._
case class Id(value: Int) extends AnyVal
case class Foo(id: Option[Id], name: String, optField: Option[String], nonOptId: Id)
implicit val codec: JsonValueCodec[Foo] = JsonCodecMaker.make(CodecMakerConfig())
val json = writeToString(Foo(Option(Id(123)), "foo-name", Option("foo-opt"), Id(321)), WriterConfig(indentionStep = 2))
val correct =
"""{
| "id": 123,
| "name": "foo-name",
| "optField": "foo-opt",
| "nonOptId": 321
|}""".stripMargin
assert(json == correct)
There are also more efficient options that write directly to a byte array, java.nio.ByteBuffer or java.io.OutputStream.
I am trying to construct a JSON object from a list, where the key is "products" and the value is a List[Product], Product being a case class. But I am getting an error that says "type mismatch; found : (String, List[com.mycompnay.ws.client.Product]) required: net.liftweb.json.JObject (which expands to) net.liftweb.json.JsonAST.JObject".
What I have done so far is as below:
val resultJson:JObject = "products" -> resultList
println(compact(render(resultJson)))
You're looking for decompose (doc). See this answer.
I tested the following code and it worked fine:
import net.liftweb.json._
import net.liftweb.json.JsonDSL._
import net.liftweb.json.Extraction._
implicit val formats = net.liftweb.json.DefaultFormats
case class Product(foo: String)
val resultList: List[Product] = List(Product("bar"), Product("baz"))
val resultJson: JObject = ("products" -> decompose(resultList))
println(compact(render(resultJson)))
Result:
{"products":[{"foo":"bar"},{"foo":"baz"}]}
I want to return list of json objects based on my case class objects.
Following is my spray router, which returns list of 'Appointment' objects.
trait GatewayService extends HttpService with SLF4JLogging {
import com.sml.apigw.protocols.AppointmentProtocol._
import spray.httpx.SprayJsonSupport._
implicit def executionContext = actorRefFactory.dispatcher
val router =
pathPrefix("api" / "v1") {
path("appointments") {
get {
complete {
val a = new Appointment("1", "2")
val l = List(a, a, a, a)
l
}
}
}
}
}
Following is the 'AppointmentProtocol'
import spray.json.DefaultJsonProtocol
case class Appointment(id: String, patient: String)
object AppointmentProtocol extends DefaultJsonProtocol {
implicit val appointmentFormat = jsonFormat2(Appointment.apply)
}
It gives the compilation error 'expression of type List[Appointment] doesn't conform to expected type ToResponseMarshallable'
Maybe you missed something in your example, but my brain-based compiler tells me that your code should fail to compile, since every directive in Spray requires a result of type Route, whereas as far as I can see you have a List[Appointment]. Please read this article on Routes. Your route structure should end with complete, so I assume the following would solve your issue:
get {
val a = new Appointment("1", "2")
val l = List(a, a, a, a)
complete(l)
}
Please note the complete directive, which wraps the list. This should help; otherwise, please provide the tree with resolved implicits by compiling your code with the flag -Xprint:typer, which should show where the problem with the implicits is.
I think you should use the spray-json library. Depending on your version of spray, these will be your imports; do not forget to add them to your code:
import MyJsonProtocol._
import spray.json._
For that, make sure you have added the dependency:
libraryDependencies += "io.spray" %% "spray-json" % "1.3.2"
Then you can convert an object like this. I also had problems with the case class being in the same file, but this worked in testing:
case class Color(name: String, red: Int, green: Int, blue: Int)
object MyJsonProtocol extends DefaultJsonProtocol {
implicit val colorFormat = jsonFormat4(Color)
}
import MyJsonProtocol._
import spray.json._
val json = Color("CadetBlue", 95, 158, 160).toJson
val color = json.convertTo[Color]
and the list:
val jsonAst = List(1, 2, 3).toJson
These examples are taken from the spray-json GitHub project.
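The same approach extends to a list of case class instances, which is what the question asks about. The following is a minimal sketch using spray-json only, outside of any route, assuming the spray-json dependency above. ListToJsonSketch and its method names are hypothetical; Appointment and AppointmentProtocol mirror the question.

```scala
import spray.json._

case class Appointment(id: String, patient: String)

object AppointmentProtocol extends DefaultJsonProtocol {
  // An explicit type annotation keeps implicit resolution predictable.
  implicit val appointmentFormat: RootJsonFormat[Appointment] =
    jsonFormat2(Appointment.apply)
}

object ListToJsonSketch {
  import AppointmentProtocol._

  // DefaultJsonProtocol derives the List format from the element format.
  def toJsonString(items: List[Appointment]): String =
    items.toJson.compactPrint

  // Parses the string back into a list, using the same formats.
  def fromJsonString(s: String): List[Appointment] =
    s.parseJson.convertTo[List[Appointment]]

  def main(args: Array[String]): Unit = {
    val a = Appointment("1", "2")
    println(toJsonString(List(a, a)))
  }
}
```

If this compiles but the route does not, the missing piece is probably on the marshalling side (for example the SprayJsonSupport import) rather than in the JSON formats.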