Create a json deserializer and use it - scala

How do you create a json4s custom serializer (using the Jackson backend) and use it in your program? The serializer is used to deserialize data from a Kafka stream, because my job fails if it encounters a null.
I tried the following to create a serializer.
import org.json4s._
import org.json4s.jackson.JsonMethods._

case class Person(user: Option[String])

object PersonSerializer extends CustomSerializer[Person](formats => (
  {
    case JObject(JField("user", JString(user)) :: Nil) => Person(Some(user))
    case JObject(JField("user", null) :: Nil) => Person(None)
  },
  {
    case Person(Some(user)) => JObject(JField("user", JString(user)) :: Nil)
    case Person(None) => JObject(JField("user", JString(null)) :: Nil)
  }
))
I am trying to use it this way.
object ConvertJsonTOASTDeSerializer extends App {

  case class Address(street: String, city: String)
  case class PersonAddress(name: String, address: Address)

  val testJson1 =
    """
    { "user": null,
      "address": {
        "street": "Bulevard",
        "city": "Helsinki",
        "country": {
          "code": "CD"
        }
      },
      "children": [
        {
          "name": "Mary",
          "age": 5,
          "birthdate": "2004-09-04T18:06:22Z"
        },
        {
          "name": "Mazy",
          "age": 3
        }
      ]
    }
    """

  implicit var formats: Formats = DefaultFormats + PersonSerializer

  val output = parse(testJson1).as[Person]
  println(output.user)
}
I am getting an error saying:
Error:(50, 35) No JSON deserializer found for type com.examples.json4s.Person. Try to implement an implicit Reader or JsonFormat for this type.
val output = parse(testJson1).as[Person]

I'm not sure if this answers your question, but here is runnable code:
import org.json4s._
import org.json4s.jackson.JsonMethods._

case class Person(
  user: Option[String],
  address: Address,
  children: List[Child]
)

case class Address(
  street: String,
  city: String,
  country: Country
)

case class Country(
  code: String
)

case class Child(
  name: String,
  age: Int
)

val s =
  """
  { "user": null,
    "address": {
      "street": "Bulevard",
      "city": "Helsinki",
      "country": {
        "code": "CD"
      }
    },
    "children": [
      {
        "name": "Mary",
        "age": 5,
        "birthdate": "2004-09-04T18:06:22Z"
      },
      {
        "name": "Mazy",
        "age": 3
      }
    ]
  }
  """

implicit val formats: Formats = DefaultFormats

parse(s).extract[Person] // Person(None,Address(Bulevard,Helsinki,Country(CD)),List(Child(Mary,5), Child(Mazy,3)))
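For reference, two things explain the original error. First, as[Person] looks up an implicit Reader[Person] (which is exactly what the error message asks for), whereas extract[Person] goes through the implicit Formats and therefore also picks up any registered CustomSerializer. Second, json4s represents a JSON null as JNull, so the pattern JField("user", null) never matches. A minimal sketch of a working custom serializer, assuming the user field comes first in the object:

import org.json4s._
import org.json4s.jackson.JsonMethods._

case class Person(user: Option[String])

object PersonSerializer extends CustomSerializer[Person](_ => (
  {
    // JNull is json4s's representation of an explicit JSON null
    case JObject(JField("user", JString(user)) :: _) => Person(Some(user))
    case JObject(JField("user", JNull) :: _) => Person(None)
  },
  {
    case Person(Some(user)) => JObject(JField("user", JString(user)) :: Nil)
    case Person(None) => JObject(JField("user", JNull) :: Nil)
  }
))

implicit val formats: Formats = DefaultFormats + PersonSerializer
parse("""{ "user": null }""").extract[Person] // Person(None)

That said, for a field that may be null, DefaultFormats with an Option field (as above) is already enough.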

Related

Multiplying event case class depending on the list based on nested IDs

I am processing a DataFrame and converting it into a Dataset[Event] using an Event case class. However, there are nested IDs for which I need to multiply the events, based on flattening the nested deviceId:osId pairs.
I am able to return the Event case class at the Kafka event level, but I am not sure how to multiply the events.
Kafka incoming Event:
{
  "partition": 1,
  "key": "34768_20220203_MFETP501",
  "offset": 1841543,
  "createTime": 1646041475348,
  "topic": "topic_int",
  "publishTime": 1646041475344,
  "errorCode": 0,
  "userActions": {
    "productId": "3MFETP501",
    "createdDate": "2022-02-26T11:19:35.786Z",
    "events": [
      {
        "GUID": "dbb1-f38b-f7f0-44af-90da-80179412f89c",
        "eventDate": "2022-02-26T11:19:35.786Z",
        "familyId": 2010,
        "productTypeId": 1004678,
        "serialID": "890479804",
        "productName": "MFE Total Protection 2021 Family Pack",
        "features": {
          "mapping": [
            {
              "deviceId": 999795,
              "osId": [100]
            },
            {
              "deviceId": 987875,
              "osId": [101]
            }
          ]
        }
      }
    ]
  }
}
The expected output for Event:
Event("3MFETP501","1004678","2010","3MFETP501:890479804","MFE Total Protection 2021 Family Pack","999795_100", Map("targetId"->"999795_100"))
Event("3MFETP501","1004678","2010","3MFETP501:890479804","MFE Total Protection 2021 Family Pack","987875_101", Map("targetId"->"987875_101"))
case class Event(
  productId: String,
  familyId: String,
  productTypeId: String,
  key: String,
  productName: String,
  deviceOS: String,
  var featureMap: mutable.Map[String, String])

val finalDataset: Dataset[Event] = inputDataFrame.flatMap(
  row => {
    val productId = row.getAs[String]("productId")
    val userActions = row.getAs[Row]("userActions")
    val userEvents: mutable.Seq[Row] = userActions.getAs[mutable.WrappedArray[Row]]("events")
    val processedEvents: mutable.Seq[Row] = userEvents.map(
      event => {
        val productTypeId = event.getAs[Int]("productTypeId")
        val familyId = event.getAs[String]("familyId")
        val features = event.getAs[mutable.WrappedArray[Row]]("features")
        val serialId = event.getAs[String]("serialId")
        val key = productId + ":" + serialId
        val featureMap = mutable.Map[String, String]().withDefaultValue(null)
        val device_os_list = List("999795_100", "987875_101")
        // Feature Map is for every device_os (example "targetId" -> "999795_100") for 999795_100
        if (familyId == 2010) {
          val a: Option[List[String]] = ??? // flatten the deviceId, osId ...
          a.get.map(i => {
            val key: String = methodToCombinedeviceIdAndosId
            val featureMapping: mutable.Map[String, String] = getfeatureMapForInvidualKey
            Event(productId, productTypeId, familyId, key, productName, device_os, feature) // ---> This is returning List[Event]
          })
        } else {
          Event(productId, productTypeId, familyId, key, productName, device_os, feature) // --> This is returning Event. THIS WORKS
        }
      }
    )
  }
)
I did not implement it exactly the same, but I think it will be possible to understand the logic and apply it to your case.
I created a JSON file, kafka.json, with content like this (your event):
[{
  "partition": 1,
  "key": "34768_20220203_MFETP501",
  "offset": 1841543,
  "createTime": 1646041475348,
  "topic": "topic_int",
  "publishTime": 1646041475344,
  "errorCode": 0,
  "userActions": {
    "productId": "3MFETP501",
    "createdDate": "2022-02-26T11:19:35.786Z",
    "events": [
      {
        "GUID": "dbb1-f38b-f7f0-44af-90da-80179412f89c",
        "eventDate": "2022-02-26T11:19:35.786Z",
        "familyId": 2010,
        "productTypeId": 1004678,
        "serialID": "890479804",
        "productName": "MFE Total Protection 2021 Family Pack",
        "features": {
          "mapping": [
            {
              "deviceId": 999795,
              "osId": [100]
            },
            {
              "deviceId": 987875,
              "osId": [101]
            }
          ]
        }
      }
    ]
  }
}]
Please find below the first solution, based on flatMap and a for comprehension.
case class Event(
  productId: String,
  familyId: String,
  productTypeId: String,
  key: String,
  productName: String,
  deviceOS: String,
  featureMap: Map[String, String])

import org.apache.spark.sql.{Dataset, Row, SparkSession}
import scala.collection.mutable

val spark = SparkSession
  .builder
  .appName("StructuredStreaming")
  .master("local[*]")
  .getOrCreate()

private val inputDataFrame = spark.read.option("multiline", "true").format("json").load("/absolute_path_to_kafka.json")

import spark.implicits._

val finalDataset: Dataset[Event] = inputDataFrame.flatMap(
  row => {
    val userActions = row.getAs[Row]("userActions")
    val productId = userActions.getAs[String]("productId")
    val userEvents = userActions.getAs[mutable.WrappedArray[Row]]("events")
    for (event <- userEvents;
         familyId = event.getAs[Long]("familyId").toString;
         productTypeId = event.getAs[Long]("productTypeId").toString;
         serialId = event.getAs[String]("serialID");
         productName = event.getAs[String]("productName");
         key = s"$productId:$serialId";
         features = event.getAs[Row]("features");
         mappings = features.getAs[mutable.WrappedArray[Row]]("mapping");
         mappingRow <- mappings;
         deviceId = mappingRow.getAs[Long]("deviceId");
         osIds = mappingRow.getAs[mutable.WrappedArray[Long]]("osId");
         osId <- osIds;
         deviceOs = deviceId + "_" + osId
    ) yield Event(productId, familyId, productTypeId, key, productName, deviceOs, Map("target" -> deviceOs))
  }
)
finalDataset.foreach(e => println(e))
// Event(3MFETP501,2010,1004678,3MFETP501:890479804,MFE Total Protection 2021 Family Pack,999795_100,Map(target -> 999795_100))
// Event(3MFETP501,2010,1004678,3MFETP501:890479804,MFE Total Protection 2021 Family Pack,987875_101,Map(target -> 987875_101))
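The multiplication of events happens because a for comprehension with several generators (event <- userEvents, mappingRow <- mappings, osId <- osIds) desugars into nested flatMap/map calls, so one input row yields one Event per (deviceId, osId) pair. The same pattern in isolation, with the IDs from the example hard-coded for illustration:

// Each (deviceId, osIds) pair expands into one entry per osId.
val mappings = List((999795L, List(100L)), (987875L, List(101L)))
val deviceOs = for {
  (deviceId, osIds) <- mappings
  osId <- osIds
} yield s"${deviceId}_$osId"
println(deviceOs) // List(999795_100, 987875_101)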
Also, you can solve this task using the select, withColumn, explode, and concat functions.
case class Event(
  productId: String,
  familyId: String,
  productTypeId: String,
  key: String,
  productName: String,
  deviceOS: String,
  featureMap: Map[String, String])

import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.{col, concat, explode, lit, map}

val spark = SparkSession
  .builder
  .appName("StructuredStreaming")
  .master("local[*]")
  .getOrCreate()

private val inputDataFrame = spark.read.option("multiline", "true").format("json").load("/absolute_path_to_kafka.json")

val transformedDataFrame = inputDataFrame
  .select(
    col("userActions.productId").as("productId"),
    explode(col("userActions.events")).as("event"))
  .select(
    col("productId"),
    col("event.familyId").as("familyId"),
    col("event.productTypeId").as("productTypeId"),
    col("event.serialID").as("serialID"),
    col("event.productName").as("productName"),
    explode(col("event.features.mapping")).as("features"))
  .select(
    col("productId"),
    col("familyId"),
    col("productTypeId"),
    col("serialID"),
    col("productName"),
    col("features.deviceId").as("deviceId"),
    explode(col("features.osId")).as("osId"))
  .withColumn("key", concat(col("productId"), lit(":"), col("serialID")))
  .withColumn("deviceOS", concat(col("deviceId"), lit("_"), col("osId")))
  .withColumn("featureMap", map(lit("target"), col("deviceOS")))

import spark.implicits._

private val result: Dataset[Event] = transformedDataFrame.as[Event]
result.foreach(e => println(e))
// Event(3MFETP501,2010,1004678,3MFETP501:890479804,MFE Total Protection 2021 Family Pack,999795_100,Map(target -> 999795_100))
// Event(3MFETP501,2010,1004678,3MFETP501:890479804,MFE Total Protection 2021 Family Pack,987875_101,Map(target -> 987875_101))
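Note that as[Event] resolves columns by name, so the select/withColumn chain above must expose every field name of the Event case class (extra columns such as serialID are simply ignored), and Spark casts compatible column types (for example the numeric familyId read from the JSON) to the target field types. A quick way to check the shape before converting, as a debugging aid:

transformedDataFrame.printSchema()
transformedDataFrame.show(truncate = false)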
Adding an option to customize the response based on the value of one of the fields: here I replace the for comprehension with map/flatMap, so that one or several events can be returned depending on the type. I also customized the JSON a little to show more examples in the result.
New JSON:
[{
  "partition": 1,
  "key": "34768_20220203_MFETP501",
  "offset": 1841543,
  "createTime": 1646041475348,
  "topic": "topic_int",
  "publishTime": 1646041475344,
  "errorCode": 0,
  "userActions": {
    "productId": "3MFETP501",
    "createdDate": "2022-02-26T11:19:35.786Z",
    "events": [
      {
        "GUID": "dbb1-f38b-f7f0-44af-90da-80179412f89c",
        "eventDate": "2022-02-26T11:19:35.786Z",
        "familyId": 2010,
        "productTypeId": 1004678,
        "serialID": "890479804",
        "productName": "MFE Total Protection 2021 Family Pack",
        "features": {
          "mapping": [
            {
              "deviceId": 999795,
              "osId": [100, 110]
            },
            {
              "deviceId": 987875,
              "osId": [101]
            }
          ]
        }
      },
      {
        "GUID": "1111-2222-f7f0-44af-90da-80179412f89c",
        "eventDate": "2022-03-26T11:19:35.786Z",
        "familyId": 2011,
        "productTypeId": 1004679,
        "serialID": "890479805",
        "productName": "Product name",
        "features": {
          "mapping": [
            {
              "deviceId": 999796,
              "osId": [103]
            },
            {
              "deviceId": 987877,
              "osId": [104]
            }
          ]
        }
      }
    ]
  }
}]
Please find code below:
case class Event(
  productId: String,
  familyId: String,
  productTypeId: String,
  key: String,
  productName: String,
  deviceOS: String,
  featureMap: Map[String, String])

import org.apache.spark.sql.{Dataset, Row, SparkSession}
import scala.collection.mutable

val spark = SparkSession
  .builder
  .appName("StructuredStreaming")
  .master("local[*]")
  .getOrCreate()

private val inputDataFrame = spark.read.option("multiline", "true").format("json").load("/absolute_path_to_kafka.json")

import spark.implicits._

val finalDataset: Dataset[Event] = inputDataFrame.flatMap(
  row => {
    val userActions = row.getAs[Row]("userActions")
    val productId = userActions.getAs[String]("productId")
    val userEvents = userActions.getAs[mutable.WrappedArray[Row]]("events")
    userEvents.flatMap(event => {
      val productTypeId = event.getAs[Long]("productTypeId").toString
      val serialId = event.getAs[String]("serialID")
      val productName = event.getAs[String]("productName")
      val key = s"$productId:$serialId"
      val familyId = event.getAs[Long]("familyId")
      if (familyId == 2010) {
        val features = event.getAs[Row]("features")
        val mappings = features.getAs[mutable.WrappedArray[Row]]("mapping")
        mappings.flatMap(mappingRow => {
          val deviceId = mappingRow.getAs[Long]("deviceId")
          val osIds = mappingRow.getAs[mutable.WrappedArray[Long]]("osId")
          osIds.map(osId => {
            val deviceOs = deviceId + "_" + osId
            Event(productId, familyId.toString, productTypeId, key, productName, deviceOs, Map("target" -> deviceOs))
          })
        })
      } else {
        Seq(Event(productId, familyId.toString, productTypeId, key, productName, "default_defice_os", Map("target" -> "default_defice_os")))
      }
    })
  }
)
finalDataset.foreach(e => println(e))
// Event(3MFETP501,2010,1004678,3MFETP501:890479804,MFE Total Protection 2021 Family Pack,999795_100,Map(target -> 999795_100))
// Event(3MFETP501,2010,1004678,3MFETP501:890479804,MFE Total Protection 2021 Family Pack,999795_110,Map(target -> 999795_110))
// Event(3MFETP501,2010,1004678,3MFETP501:890479804,MFE Total Protection 2021 Family Pack,987875_101,Map(target -> 987875_101))
// Event(3MFETP501,2011,1004679,3MFETP501:890479805,Product name,default_defice_os,Map(target -> default_defice_os))
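The key to the List[Event] vs Event mismatch from the question is that flatMap flattens exactly one level, so every branch must return a collection; that is why the single-event branch above is wrapped in Seq(...). The same idea in isolation:

// Both branches return a Seq, so flatMap can flatten them uniformly.
val xs = List(1, 2, 3)
val out = xs.flatMap(n => if (n % 2 == 0) Seq(n * 10, n * 100) else Seq(n))
println(out) // List(1, 20, 200, 3)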
As this is inside a Row of the DataFrame, returning the Event case class converts it into a Dataset. The issue here is that for one condition I get a List[Event], and for the other types I get only a single Event.
FYI: this is not an answer, but my further attempt at a solution.
if (familyId == 2010) {
  val a: Option[List[String]] = ??? // flatten the deviceId, osId ...
  a.get.map(i => {
    val key: String = methodToCombinedeviceIdAndosId
    val featureMapping: mutable.Map[String, String] = getfeatureMapForInvidualKey
    Event(productId, productTypeId, familyId, key, productName, device_os, feature) // ---> This is returning List[Event]
  })
} else {
  Event(productId, productTypeId, familyId, key, productName, device_os, feature) // --> This is returning Event
}

How to parse list of dictionaries as string in scala?

I am trying to parse a list of dictionaries (which is in a string) in Scala. Basically, I want to build another list so that I can traverse through it using a for loop.
When there is a single dictionary at the top level, it works fine.
import scala.util.parsing.json.JSON

class CC[T] { def unapply(a: Any): Option[T] = Some(a.asInstanceOf[T]) }

object M extends CC[Map[String, Any]]
object A extends CC[List[Any]] // for s3
object I extends CC[Double]
object S extends CC[String]
object N extends CC[String]
object E extends CC[String]
object F extends CC[String]
object G extends CC[Map[String, Any]]

val jsonString =
  """{
    |  "index": 1,
    |  "source": "a",
    |  "name": "v",
    |  "s3": [{
    |    "path": "s3://1",
    |    "bucket": "p",
    |    "key": "r"
    |  }]
    |}""".stripMargin

//println(List(JSON.parseFull(jsonString)))
val result = for {
  Some(M(map)) <- List(JSON.parseFull(jsonString))
  //L(text) = map("text")
  //M(texts) <- text
  I(index) = map("index")
  S(source) = map("source")
  N(name) = map("name")
  A(s3q) = map("s3")
  G(s3data) <- s3q
  F(path) = s3data("path")
} yield {
  (index.toInt, source, name, path)
}
But when I added another one, making the top level a list, it gives an error stating "java.lang.ClassCastException: scala.collection.immutable.$colon$colon cannot be cast to scala.collection.immutable.Map".
import scala.util.parsing.json.JSON

class CC[T] { def unapply(a: Any): Option[T] = Some(a.asInstanceOf[T]) }

object M extends CC[Map[String, Any]]
object A extends CC[List[Any]] // for s3
object I extends CC[Double]
object S extends CC[String]
object N extends CC[String]
object E extends CC[String]
object F extends CC[String]
object G extends CC[Map[String, Any]]

val jsonString =
  """[{
    |  "index": 1,
    |  "source": "a",
    |  "name": "v",
    |  "s3": [{
    |    "path": "s3://1",
    |    "bucket": "p",
    |    "key": "r"
    |  }]
    |}, {
    |  "index": 1,
    |  "source": "a",
    |  "name": "v",
    |  "s3": [{
    |    "path": "s3://1",
    |    "bucket": "p",
    |    "key": "r"
    |  }]
    |}]""".stripMargin

//println(List(JSON.parseFull(jsonString)))
val result = for {
  Some(M(map)) <- List(JSON.parseFull(jsonString))
  //L(text) = map("text")
  //M(texts) <- text
  I(index) = map("index")
  S(source) = map("source")
  N(name) = map("name")
  A(s3q) = map("s3")
  G(s3data) <- s3q
  F(path) = s3data("path")
} yield {
  (index.toInt, source, name, path)
}
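The ClassCastException happens because JSON.parseFull now returns the top-level JSON array as a List[Any], while the pattern Some(M(map)) tries to cast it to a Map. A sketch of a fix, reusing the same CC extractors: match the outer value with A first, then each element with M:

val result = for {
  Some(A(entries)) <- List(JSON.parseFull(jsonString))
  M(map) <- entries
  I(index) = map("index")
  S(source) = map("source")
  N(name) = map("name")
  A(s3q) = map("s3")
  G(s3data) <- s3q
  F(path) = s3data("path")
} yield {
  (index.toInt, source, name, path)
}
// List((1,a,v,s3://1), (1,a,v,s3://1))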

Do not know how to convert JArray(List(JString(dds3), JString(sdds))) into class java.lang.String

~ pathPrefix("system") {
post {
entity(as[JValue]) { system =>
val newPerms = for {
sitePerms <- findAllPermissions((system \ "siteId").extract[String])
} yield {
sitePerms.groupBy(_.userId).mapValues(_.map(_.permissionType).toSet)
}.flatMap { case (userId, perms) =>
val systemId = (system \ "id").extract[String]
perms.map(Permission(userId, systemId, _, "system"))
}
onComplete(newPerms.flatMap(addPermissions)) {
case Success(_) => complete(StatusCodes.NoContent)
case Failure(error) => failWith(error)
}
}
Request Body
[{
  "name": "dds3",
  "description": "",
  "siteId": "abs",
  "companyId": "local"
},
{
  "name": "dds3",
  "description": "",
  "siteId": "abc",
  "companyId": "local"
}]
Error:
The request content was malformed:
No usable value for name
Do not know how to convert JArray(List(JString(dds3), JString(sdds))) into class java.lang.String
I want to pass a list of objects in the request body but don't know how to do that in Scala. Can anyone please help me with that?
The easiest option is to let the entity directive unpack your data for you:

case class System(
  name: String,
  description: String,
  siteId: String,
  companyId: String
)

entity(as[List[System]]) { systems =>

systems will contain a parsed list of System objects that can be processed in the usual way.
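For as[List[System]] to resolve, a matching unmarshaller has to be in scope. Since the route already unmarshals a json4s JValue, presumably via the akka-http-json4s bridge, the setup would look something like this (an assumption about the project's dependencies, not something stated in the question):

import de.heikoseeberger.akkahttpjson4s.Json4sSupport._
import org.json4s.{DefaultFormats, Formats, jackson}

implicit val serialization: org.json4s.Serialization = jackson.Serialization
implicit val formats: Formats = DefaultFormats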

Scala JSON If key matches value return string

I have the JSON response given below.
If an entry's metadata has organic=true, then label='true-Organic'; otherwise label='non-Organic'.
In the end, return a List or Map[modelId, label].
import net.liftweb.json.{DefaultFormats, _}

object test1 extends App {

  val json_response =
    """{
      "requestId": "91ee60d5f1b45e#316",
      "error": null,
      "errorMessages": [],
      "entries": [
        {
          "modelId": "RT001",
          "sku": "SKU-ASC001",
          "store": "New Jersey",
          "ttlInSeconds": 8000,
          "metadata": {
            "manufactured_date": "2019-01-22T01:25Z",
            "organic": "true"
          }
        },
        {
          "modelId": "RT002",
          "sku": "SKU-ASC002",
          "store": "livingstone",
          "ttlInSeconds": 8000,
          "metadata": {
            "manufactured_date": "2019-10-03T01:25Z",
            "organic": "false"
          }
        }
      ]
    }"""
I tried like this:

  val json = parse(json_response)
  implicit val formats = DefaultFormats

  var map = Map[String, String]()

  case class Sales(modelId: String, sku: String, store: String, ttlInSeconds: Int, metadata: Map[String, String])
  case class Response(entries: List[Sales])

  val response = json.extract[Response]

After this, I am not sure how to proceed.
This is a straightforward map operation on the entries field:
response.entries.map { e =>
  e.modelId -> (if (e.metadata.get("organic").contains("true")) {
    "true-Organic"
  } else {
    "non-Organic"
  })
}
This will return List[(String, String)], but you can call toMap to turn this into a Map if required.
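For example, materializing the Map[modelId, label] form asked for in the question:

val labels: Map[String, String] = response.entries.map { e =>
  e.modelId -> (if (e.metadata.get("organic").contains("true")) "true-Organic" else "non-Organic")
}.toMap
println(labels) // Map(RT001 -> true-Organic, RT002 -> non-Organic)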

Akka-HTTP JSON serialization

How does one control the deserialization for spray-json? For example, I have a class defined as:
case class A(Name: String, Value: String)
And I would like to deserialize the following JSON into a List of A objects:
{
  "one": "1",
  "two": "2"
}
and it should become:
List(A("one", "1"), A("two", "2"))
The problem is that the default JSON representation of that List is this one, which I do not want:
[
{ "Name": "one", "Value": "1" },
{ "Name": "two", "Value": "2" }
]
How can I accomplish this?
You can write your own custom deserializer for the structure you are looking for:
import spray.json._

case class A(Name: String, Value: String)

implicit object ListAFormat extends RootJsonReader[List[A]] {
  override def read(json: JsValue): List[A] = {
    json.asJsObject.fields.toList.collect {
      case (k, JsString(v)) => A(k, v)
    }
  }
}

def main(args: Array[String]): Unit = {
  val json =
    """
      |{
      |  "one": "1",
      |  "two": "2"
      |}
    """.stripMargin

  val result = json.parseJson.convertTo[List[A]]
  println(result)
}
Prints:
List(A(one,1), A(two,2))
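If the same wire shape also has to be produced on output, a corresponding writer can mirror the reader (a sketch along the same lines, not part of the original answer):

implicit object ListAWriter extends RootJsonWriter[List[A]] {
  override def write(list: List[A]): JsValue =
    JsObject(list.map(a => a.Name -> JsString(a.Value)): _*)
}

// List(A("one", "1"), A("two", "2")).toJson yields {"one":"1","two":"2"}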