Scala ObjectMapper writes incorrect value - scala

I have a complex object which contains another object, and I want to write it out as a YAML file. I am using Scala with ObjectMapper. My final object looks like this:
Configuration(Some(List(s3://etl/configuration/minioApp/23-08-2021/metrics.yaml)),Some(Map(inputDataFrame -> Input(None,Some(Kafka(List(uguyhi, ytvvt),Some(hgvgvugb),Some(ytfytvi),Some(yftug),None,None,None,Some( uyguytf)))))),None,Some(Output(None,Some(Kafka(null,Some(yrdryft),None)))),None,None,None,None,None,None,None,None,Some(minioApp),None,None,None,None)
I want to write it to a file. My main case class looks like this:
case class Configuration(@BeanProperty var metrics: Option[Seq[String]],
                         @BeanProperty var inputs: Option[Map[String, Input]],
                         @BeanProperty var variables: Option[Map[String, String]] = None,
                         @BeanProperty var output: Option[Output] = None,
                         @BeanProperty var outputs: Option[Map[String, Output]] = None,
                         @BeanProperty var cacheOnPreview: Option[Boolean] = None,
                         @BeanProperty var showQuery: Option[Boolean] = None,
                         @BeanProperty var streaming: Option[Streaming] = None,
                         @BeanProperty var periodic: Option[Periodic] = None,
                         @BeanProperty var logLevel: Option[String] = None,
                         @BeanProperty var showPreviewLines: Option[Int] = None,
                         @BeanProperty var explain: Option[Boolean] = None,
                         @BeanProperty var appName: Option[String] = None,
                         @BeanProperty var continueOnFailedStep: Option[Boolean] = None,
                         @BeanProperty var cacheCountOnOutput: Option[Boolean] = None,
                         @BeanProperty var ignoreDeequValidations: Option[Boolean] = None,
                         @BeanProperty var failedDFLocationPrefix: Option[String] = None) extends Conf
The Input, Output, and other nested classes also have @BeanProperty added. I am trying to write this file using the snippet below:
val objectMapper = new ObjectMapper(new YAMLFactory)
objectMapper.writeValue(new File("/Users/ayush.goyal/Downloads/servingWrapper/src/main/resources/input1.yaml"),jobYamls)
where jobYamls is the object shown above. I am getting the file below:
metrics:
  empty: false
  defined: true
inputs:
  empty: false
  defined: true
variables:
  empty: true
  defined: false
output:
  empty: false
  defined: true
outputs:
  empty: true
  defined: false
cacheOnPreview:
  empty: true
  defined: false
showQuery:
  empty: true
  defined: false
streaming:
  empty: true
  defined: false
periodic:
  empty: true
  defined: false
logLevel:
  empty: true
  defined: false
showPreviewLines:
  empty: true
  defined: false
explain:
  empty: true
  defined: false
appName:
  empty: false
  defined: true
continueOnFailedStep:
  empty: true
  defined: false
cacheCountOnOutput:
  empty: true
  defined: false
ignoreDeequValidations:
  empty: true
  defined: false
failedDFLocationPrefix:
  empty: true
  defined: false
Now this file has 2 issues:
1. The values of the object are not populated.
2. I don't want to populate fields that have a null value.
How can I do that?
EDIT:
As per the suggestion, I was able to get the YAML below:
---
metrics:
- "s3://etl/configuration/minioApp/23-08-2021/metrics.yaml"
inputs:
  inputDataFrame:
    file: null
    kafka:
      servers:
      - "uguyhi"
      - "ytvvt"
      topic: "hgvgvugb"
      topicPattern: "ytfytvi"
      consumerGroup: "yftug"
      options: null
      schemaRegistryUrl: "yrftg"
      schemaSubject: null
      schemaId: " uyguytf"
output:
  file: null
  kafka:
    vservers: null
    checkpointLocation: "yrdryft"
    compressionType: null
streaming: null
appName: "minioApp"
As you can see in the above YAML, several fields have a null value. I don't want to write them; I just don't want those fields to appear in the YAML at all. How can I do that?

Jackson doesn't know how to serialize Option out of the box, as it's a Java library.
You can add the jackson-module-scala library and register the DefaultScalaModule on your ObjectMapper to let it know how to serialize common Scala types.
See https://github.com/FasterXML/jackson-module-scala
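A minimal sketch of how that might look, assuming jackson-module-scala is on the classpath; setSerializationInclusion with NON_ABSENT additionally covers the second issue by skipping fields that are null or None (the output path is shortened here):

import java.io.File
import com.fasterxml.jackson.annotation.JsonInclude
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory
import com.fasterxml.jackson.module.scala.DefaultScalaModule

val objectMapper = new ObjectMapper(new YAMLFactory)
// Teach Jackson how to serialize Option, Seq, Map and other Scala types
objectMapper.registerModule(DefaultScalaModule)
// Skip fields whose value is null or None instead of writing them to the YAML
objectMapper.setSerializationInclusion(JsonInclude.Include.NON_ABSENT)
objectMapper.writeValue(new File("input1.yaml"), jobYamls)

With NON_ABSENT (available since Jackson 2.6), empty Options are treated the same as nulls and left out of the output entirely.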

Related

Scala spark dataframe parse string field to typesafe Config error

I'm trying to parse a String field from a Spark DataFrame into a Config, reading the DataFrame as a specific case class. The objective is to get a new row in the DataFrame for each input and output that the config field contains. I'm far from having a solution, and I'm fighting an error about serializable objects.
The config field comes as this:
data {
  inputs = [
    {
      name = "table1"
      paths = ["/data/master/table1"]
    },
    {
      name = "table2"
      paths = ["/data/master/table2"]
    }
  ]
  outputs = [
    {
      name = "table1"
      type = "parquet"
      mode = "append"
      force = true
      path = "/data/master/table1"
    },
    {
      name = "table2"
      type = "parquet"
      mode = "append"
      force = true
      path = "/data/master/table2"
    }
  ]
}
Code I'm using:
// case class to read the config field
case class JobConf(name: String, job: String, version: Long, updatedAt: Timestamp, conf: String)

// case class for the output dataframe
case class InputOutputJob(name: String, job: String, version: String, path: String,
                          inputOuput: String)

val jobsDF = spark.read.parquet("path")

jobsDF.map { job =>
  val x = JobConf(job.getString(0), job.getString(1), job.getInt(2), job.getTimestamp(3), job.getString(4))
  val c = ConfigFactory.parseString(x.conf, ConfigParseOptions.defaults().setSyntax(ConfigSyntax.PROPERTIES))
  val p: Map[String, ConfigValue] = c.root().asScala
  // Pending
  InputOutputJob()
}.toDF
The error that comes when I try to compile is:
Cause: java.io.NotSerializableException:
Serialization stack:
- object not serializable
- element of array (index: 0)
- array (class [Ljava.lang.Object;, size 1)
- field (class: java.lang.invoke.SerializedLambda, name: capturedArgs, type: class [Ljava.lang.Object;)

Serializing/Deserializing model using spray json scala test

I am new to Scala and am writing test cases using ScalaTest and spray-json. My code is as follows:
case class MyModel(Point1: String,
                   Point2: String,
                   Point3: Seq[String],
                   Point4: Seq[String])
it should "serialise/deserialize a MyModel to JSON" in {
val json= """{"Point1":"","Point3":[],"Point2":"","Point4":[]}""".parseJson
val myModelViaJson= json.convertTo[MyModel]
myModelViaJson.Point1 shouldBe ""
myModelViaJson.Point3.isEmpty shouldBe true
myModelViaJson.Point2 shouldBe ""
myModelViaJson.Point4.isEmpty shouldBe true
}
On doing sbt test I am getting the following error:
should serialise/deserialize a MyModel to JSON *** FAILED ***
[info] spray.json.DeserializationException: Expected String as JsString, but got []
[info] at spray.json.package$.deserializationError(package.scala:23)
[info] at spray.json.ProductFormats.fromField(ProductFormats.scala:63)
[info] at spray.json.ProductFormats.fromField$(ProductFormats.scala:51)
How to solve this?
Define the implicit JsonFormat for MyModel (via jsonFormat4) before calling json.convertTo[MyModel].
Refer: jsonformats-for-case-classes
So the code will look like:
val json= """{"Point1":"","Point3":[],"Point2":"","Point4":[]}""".parseJson
implicit val format = jsonFormat4(MyModel)
val myModelViaJson= json.convertTo[MyModel]
myModelViaJson.Point1 shouldBe ""
myModelViaJson.Point3.isEmpty shouldBe true
myModelViaJson.Point2 shouldBe ""
myModelViaJson.Point4.isEmpty shouldBe true

How can I create a custom CellEncoder in kantan

I have code that converts a list of case classes into a CSV string. I'm using kantan, and when I tried to pass the encoder I got this error:
could not find implicit value for evidence parameter of type kantan.csv.CellEncoder[Option[javax.xml.datatype.XMLGregorianCalendar]]
The original date: 2020-08-13T21:52:27.000Z
This is my code:
import kantan.csv._
import kantan.csv.ops._
import kantan.csv.java8._
import kantan.csv.CellEncoder
val itemsList: List[ItemData] = getItems.getOrElse(Seq.empty[ItemData]).toList

implicit val itemEncoder: HeaderEncoder[ItemData] =
  HeaderEncoder.caseEncoder("absolutePath", "creationDate", "displayName", "fileName",
    "lastModified", "lastModifier", "owner", "parentAbsolutePath", "typeValue")(ItemData.unapply _)

val csvItems: String = itemsList.asCsv(rfc.withHeader)
The case class:
case class ItemData(absolutePath: Option[String] = None,
                    creationDate: Option[javax.xml.datatype.XMLGregorianCalendar] = None,
                    displayName: Option[String] = None,
                    fileName: Option[String] = None,
                    lastModified: Option[javax.xml.datatype.XMLGregorianCalendar] = None,
                    lastModifier: Option[String] = None,
                    owner: Option[String] = None,
                    parentAbsolutePath: Option[String] = None,
                    typeValue: Option[String] = None)
dependencies:
lazy val `kantan-csv` = "com.nrinaudo" %% "kantan.csv" % Version.kantan
lazy val `kantan-csv-commons` = "com.nrinaudo" %% "kantan.csv-commons" % Version.kantan
lazy val `kantan-csv-generic` = "com.nrinaudo" %% "kantan.csv-generic" % Version.kantan
lazy val `kantan-csv-java8` = "com.nrinaudo" %% "kantan.csv-java8" % Version.kantan
I don't actually know much about javax.xml.datatype.XMLGregorianCalendar, so I'm not sure how you'd represent that as a string. This answer assumes it's done by calling toString, but change that to whatever is the correct way of doing so.
You need to provide a CellEncoder[javax.xml.datatype.XMLGregorianCalendar]. This is documented, and fairly straightforward:
implicit val xmlCalendarEncoder: CellEncoder[javax.xml.datatype.XMLGregorianCalendar] = CellEncoder.from(_.toString)
kantan.csv should be able to work out the rest for you.
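For context, a sketch of how this could fit together with the code from the question; _.toString is the assumption from above, and toXMLFormat is only mentioned as an alternative if the strict xsd:dateTime lexical form is wanted:

import kantan.csv._
import kantan.csv.ops._

// Encode the calendar as a string; swap _.toString for _.toXMLFormat if you
// want the lexical xsd:dateTime form, e.g. 2020-08-13T21:52:27.000Z
implicit val xmlCalendarEncoder: CellEncoder[javax.xml.datatype.XMLGregorianCalendar] =
  CellEncoder.from(_.toString)

// With the cell encoder in implicit scope, the header encoder derives as before
implicit val itemEncoder: HeaderEncoder[ItemData] =
  HeaderEncoder.caseEncoder("absolutePath", "creationDate", "displayName", "fileName",
    "lastModified", "lastModifier", "owner", "parentAbsolutePath", "typeValue")(ItemData.unapply _)

val csvItems: String = itemsList.asCsv(rfc.withHeader)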

Read yaml of same nested structure with different properties in scala

I am trying to make a generic function to read YAML files that have the same nested structure but different properties, in Scala, using SnakeYAML. For example, one YAML could be:
myMap:
  -
    name: key1
    value: value1
  -
    name: key2
    value: value2
Another YAML could be:
myMap:
  -
    name: key1
    value: value1
    data: data1
  -
    name: key2
    value: value2
    data: data2
To read the first YAML, I am able to use the code below, from here:
class configParamsKeyValue {
  @BeanProperty var name: String = null
  @BeanProperty var value: String = null
}

class myConfig {
  @BeanProperty var myMap = new java.util.ArrayList[configParamsKeyValue]()
}

def loadConfig(filename: String): myConfig = {
  val yaml = new Yaml(new Constructor(classOf[myConfig]))
  val stream = new FileInputStream(filename)
  try {
    val obj = yaml.load(stream)
    obj.asInstanceOf[myConfig]
  } finally {
    stream.close()
  }
}
I want to be able to pass the element type of the ArrayList (configParamsKeyValue) as a parameter to the class myConfig, so that I can read the second YAML file as well by defining another class like:
class configParamsKeyValueData {
  @BeanProperty var name: String = null
  @BeanProperty var value: String = null
  @BeanProperty var data: String = null
}
Can somebody please help?
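No answer is given in this thread, but one possible direction, sketched under the assumption that SnakeYAML needs a concrete top-level class to bind to, is to parameterise the loader rather than myConfig itself; the class name myConfigData and the file names below are hypothetical:

import java.io.FileInputStream
import scala.beans.BeanProperty
import org.yaml.snakeyaml.Yaml
import org.yaml.snakeyaml.constructor.Constructor

// Hypothetical second top-level config, binding the list to the richer element type
class myConfigData {
  @BeanProperty var myMap = new java.util.ArrayList[configParamsKeyValueData]()
}

// Generic loader: the top-level class to bind to is passed in as a parameter
def loadConfig[T](filename: String, clazz: Class[T]): T = {
  val yaml = new Yaml(new Constructor(clazz))
  val stream = new FileInputStream(filename)
  try yaml.loadAs(stream, clazz)
  finally stream.close()
}

val first = loadConfig("first.yaml", classOf[myConfig])        // name/value entries
val second = loadConfig("second.yaml", classOf[myConfigData])  // name/value/data entries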

Retrieving list of objects from application.conf

I have the following entry in Play for Scala application.conf:
jobs = [
  { number: 0, dir: "/dir1", name: "General" },
  { number: 1, dir: "/dir2", name: "Customers" }
]
I want to retrieve this list of objects in a Scala program:
val conf = ConfigFactory.load
val jobs = conf.getAnyRefList("jobs").asScala
println(jobs)
This prints:
Buffer({number=0, name=General, dir=/dir1}, {number=1, name=Customers, dir=/dir2})
But how do I convert the result to actual Scala objects?
Try this one:
case class Job(number: Int, dir: String, name: String)

object Job {
  implicit val configLoader: ConfigLoader[List[Job]] = ConfigLoader(_.getConfigList).map(
    _.asScala.toList.map(config =>
      Job(
        config.getInt("number"),
        config.getString("dir"),
        config.getString("name")
      )
    )
  )
}
Then, using the Configuration obtained from DI:
Configuration.get[List[Job]]("jobs")
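For illustration, a sketch of how that call might look with a Configuration injected by Play's DI; the class name JobsComponent is hypothetical:

import javax.inject.Inject
import play.api.Configuration

class JobsComponent @Inject() (configuration: Configuration) {
  // Picks up the implicit ConfigLoader[List[Job]] from the Job companion object
  val jobs: List[Job] = configuration.get[List[Job]]("jobs")
}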
Here is a Config object which will extract data from a config file into a type that you specify.
Usage:
case class Job(number: Int, dir: String, name: String)
val jobs = Config[List[Job]]("jobs")
Code:
import com.typesafe.config._
import org.json4s._
import org.json4s.jackson.JsonMethods._
object Config {
  private val conf = ConfigFactory.load()
  private val jData = parse(conf.root.render(ConfigRenderOptions.concise))

  def apply[T](name: String)(implicit formats: Formats = DefaultFormats, mf: Manifest[T]): T =
    Extraction.extract(jData \\ name)(formats, mf)
}
This will throw an exception if the particular config object does not exist or does not match the format of T.