Map a different value to a case class property during serialization and deserialization using Jackson - Scala

I am trying to deserialize this JSON using the Jackson library -
{
"name": "abc",
"ageInInt": 30
}
to the case class Person:
case class Person(name: String, @JsonProperty(value = "ageInInt") @JsonAlias(Array("ageInInt")) age: Int)
but I am getting -
No usable value for age
Did not find value which can be converted into int
org.json4s.package$MappingException: No usable value for age
Did not find value which can be converted into int
Basically, I want to deserialize the JSON so that the key ageInInt is mapped to the field age.
Here is the complete code -
val json =
"""{
|"name": "Tausif",
|"ageInInt": 30
|}""".stripMargin
implicit val format = DefaultFormats
println(Serialization.read[Person](json))

The error comes from json4s: Serialization.read does not apply Jackson annotations such as @JsonProperty. Use Jackson itself instead, and register DefaultScalaModule on your JsonMapper.
import com.fasterxml.jackson.databind.json.JsonMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.core.`type`.TypeReference
import com.fasterxml.jackson.annotation.JsonProperty
val mapper = JsonMapper.builder()
.addModule(DefaultScalaModule)
.build()
case class Person(name: String, @JsonProperty(value = "ageInInt") age: Int)
val json =
"""{
|"name": "Tausif",
|"ageInInt": 30
|}""".stripMargin
val person: Person = mapper.readValue(json, new TypeReference[Person]{})
println(person) // Prints Person(Tausif,30)
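Because @JsonProperty renames the field in both directions, writing the object back out should use the external name as well. A quick round-trip check (not from the original answer, reusing the mapper defined above):
val roundTrip = mapper.writeValueAsString(person)
println(roundTrip) // expected to print {"name":"Tausif","ageInInt":30}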

Related

Convert PreparedStatement object to Json in Scala

I'm trying to convert a PreparedStatement (the object used for sending SQL statements to the database) to Json with Scala.
So far, I've found that a common way to convert an object to Json in Scala is with the net.liftweb library.
But when I tried it, I got an empty Json.
This is the code:
import java.sql.DriverManager
import net.liftweb.json._
import net.liftweb.json.Serialization.write

object Main {
  def main(args: Array[String]): Unit = {
    implicit val formats = DefaultFormats

    val jdbcSqlConnStr = "sqlserverurl**"
    val conn = DriverManager.getConnection(jdbcSqlConnStr)
    val statement = conn.prepareStatement("exec select_all")

    val piedPierJSON2 = write(statement)
    println(piedPierJSON2)
  }
}
This is the result:
{}
I used an object I created, and the conversion worked.
case class Person(name: String, address: Address)
case class Address(city: String, state: String)
val p = Person("Alvin Alexander", Address("Talkeetna", "AK"))
val piedPierJSON3 = write(p)
println(piedPierJSON3)
This is the result
{"name":"Alvin Alexander","address":{"city":"Talkeetna","state":"AK"}}
I understood where the problem was: PreparedStatement is an interface, and none of its subtypes are serializable...
I'm going to try to wrap it up and put it in a different class.
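A minimal sketch of that idea (not from the original thread; StatementInfo and its fields are made-up names, the SQL text is passed in explicitly because PreparedStatement has no portable getter for it, and conn plus the implicit formats come from the code above):

// Plain case class holding only the data we actually want in the JSON
case class StatementInfo(sql: String, fetchSize: Int)

val sql = "exec select_all"
val statement = conn.prepareStatement(sql)

// Serialize the wrapper instead of the PreparedStatement itself
val statementJson = write(StatementInfo(sql, statement.getFetchSize))
println(statementJson) // e.g. {"sql":"exec select_all", ...}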

Scala deserialize JSON to Collection

My JSON file contains the details below:
{
"category":"age, gender,post_code"
}
My Scala code is below:
import scala.io.Source
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper

val filename = args.head
println(s"Reading ${args.head} ...")
val json = Source.fromFile(filename)

val mapper = new ObjectMapper() with ScalaObjectMapper
mapper.registerModule(DefaultScalaModule)

val parsedJson = mapper.readValue[Map[String, Any]](json.reader())
val data = parsedJson.get("category").toSeq
It returns Seq[Any], for example List(age, gender, post_code), but I need a Seq[String]. If anyone has an idea about this, please help me.
The idea in Scala is to be typesafe whenever possible, which you give away by using Map[String, Any].
So, I recommend using a data class that represents your JSON data.
For example, define a mapper:
scala> import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.ObjectMapper
scala> import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
scala> import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.DefaultScalaModule
scala> val mapper = new ObjectMapper() with ScalaObjectMapper
mapper: com.fasterxml.jackson.databind.ObjectMapper with com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper = $anon$1@d486a4d
scala> mapper.registerModule(DefaultScalaModule)
res0: com.fasterxml.jackson.databind.ObjectMapper = $anon$1@d486a4d
Now, when you deserialise to Map[K, V] you cannot specify all the nested data structures:
scala> val jsonString = """{"category": ["metal", "metalcore"], "age": 10, "gender": "M", "postCode": "98109"}"""
jsonString: String = {"category": ["metal", "metalcore"], "age": 10, "gender": "M", "postCode": "98109"}
scala> mapper.readValue[Map[String, Any]](jsonString)
res2: Map[String,Any] = Map(category -> List(metal, metalcore), age -> 10, gender -> M, postCode -> 98109)
The following is a solution that casts a key to the desired data structure, but I personally do not recommend it:
scala> mapper.readValue[Map[String, Any]](jsonString).get("category").map(_.asInstanceOf[List[String]]).getOrElse(List.empty[String])
res3: List[String] = List(metal, metalcore)
The best solution is to define a data class, which I'm calling SomeData in the following example, and deserialize to it. SomeData is defined based on your JSON data structure.
scala> final case class SomeData(category: List[String], age: Int, gender: String, postCode: String)
defined class SomeData
scala> mapper.readValue[SomeData](jsonString)
res4: SomeData = SomeData(List(metal, metalcore),10,M,98109)
scala> mapper.readValue[SomeData](jsonString).category
res5: List[String] = List(metal, metalcore)
Just read the JSON as a JsonNode, and access the property directly:
val jsonNode = objectMapper.readTree(json.reader())
val parsedJson = jsonNode.get("category").asText
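Note that category in the original JSON is a single comma-separated string, so you would still need to split it to get a Seq[String] (a small follow-up, not part of the original answer):
val categories: Seq[String] = parsedJson.split(",").map(_.trim).toSeq
// e.g. List(age, gender, post_code)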
By using a generic Scala function for converting a JSON string to a case class/object, you can deserialize to anything you want, such as:
JSON to a collection,
JSON to a case class, and
JSON to a case class with an object as a field.
Please find a working and detailed answer which I have provided using generics here.
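Such a generic helper might look roughly like this (a sketch using the same ScalaObjectMapper setup as above; JsonUtil and fromJson are made-up names):

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper

object JsonUtil {
  private val mapper = new ObjectMapper() with ScalaObjectMapper
  mapper.registerModule(DefaultScalaModule)

  // Deserialize any JSON string into the requested type T
  def fromJson[T](json: String)(implicit m: Manifest[T]): T =
    mapper.readValue[T](json)
}

// Usage: a collection, or a case class such as SomeData from above
val asMap = JsonUtil.fromJson[Map[String, Any]](jsonString)
val asCaseClass = JsonUtil.fromJson[SomeData](jsonString)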

Consuming RESTful API and converting to Dataframe in Apache Spark

I am trying to convert the output of a URL from a RESTful API directly to a DataFrame in the following way:
package trials

import org.apache.spark.sql.SparkSession
import org.json4s.jackson.JsonMethods.parse
import scala.io.Source.fromURL

object DEF {
  implicit val formats = org.json4s.DefaultFormats

  case class Result(success: Boolean,
                    message: String,
                    result: Array[Markets])

  case class Markets(
    MarketCurrency: String,
    BaseCurrency: String,
    MarketCurrencyLong: String,
    BaseCurrencyLong: String,
    MinTradeSize: Double,
    MarketName: String,
    IsActive: Boolean,
    Created: String,
    Notice: String,
    IsSponsored: String,
    LogoUrl: String
  )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName(s"${this.getClass.getSimpleName}")
      .config("spark.sql.shuffle.partitions", "4")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val parsedData = parse(fromURL("https://bittrex.com/api/v1.1/public/getmarkets").mkString).extract[Array[Result]]
    val mySourceDataset = spark.createDataset(parsedData)
    mySourceDataset.printSchema
    mySourceDataset.show()
  }
}
The error is as follows and it repeats for every record:
Caused by: org.json4s.package$MappingException: Expected collection but got JObject(List((success,JBool(true)), (message,JString()), (result,JArray(List(JObject(List((MarketCurrency,JString(LTC)), (BaseCurrency,JString(BTC)), (MarketCurrencyLong,JString(Litecoin)), (BaseCurrencyLong,JString(Bitcoin)), (MinTradeSize,JDouble(0.01435906)), (MarketName,JString(BTC-LTC)), (IsActive,JBool(true)), (Created,JString(2014-02-13T00:00:00)), (Notice,JNull), (IsSponsored,JNull), (LogoUrl,JString(https://bittrexblobstorage.blob.core.windows.net/public/6defbc41-582d-47a6-bb2e-d0fa88663524.png))))))))) and mapping Result[][Result, Result]
at org.json4s.reflect.package$.fail(package.scala:96)
The structure of the JSON returned from this URL is:
{
"success": boolean,
"message": string,
"result": [ ... ]
}
So the Result class should be aligned with this structure:
case class Result(success: Boolean,
message: String,
result: List[Markets])
Update
I also slightly refined the Markets class:
case class Markets(
MarketCurrency: String,
BaseCurrency: String,
MarketCurrencyLong: String,
BaseCurrencyLong: String,
MinTradeSize: Double,
MarketName: String,
IsActive: Boolean,
Created: String,
Notice: Option[String],
IsSponsored: Option[Boolean],
LogoUrl: String
)
End-of-update
But the main issue is in the extraction of the main data part from the parsed JSON:
val parsedData = parse(fromURL("{url}").mkString).extract[Array[Result]]
The root of the returned structure is not an array, but corresponds to Result. So it should be:
val parsedData = parse(fromURL("{url}").mkString).extract[Result]
Then, I suppose you need to load not the wrapper into the DataFrame, but rather the Markets inside it. That is why it should be loaded like this:
val mySourceDataset = spark.createDataset(parsedData.result)
And it finally produces the DataFrame:
+--------------+------------+------------------+----------------+------------+----------+--------+-------------------+------+-----------+--------------------+
|MarketCurrency|BaseCurrency|MarketCurrencyLong|BaseCurrencyLong|MinTradeSize|MarketName|IsActive| Created|Notice|IsSponsored| LogoUrl|
+--------------+------------+------------------+----------------+------------+----------+--------+-------------------+------+-----------+--------------------+
| LTC| BTC| Litecoin| Bitcoin| 0.01435906| BTC-LTC| true|2014-02-13T00:00:00| null| null|https://bittrexbl...|
| DOGE| BTC| Dogecoin| Bitcoin|396.82539683| BTC-DOGE| true|2014-02-13T00:00:00| null| null|https://bittrexbl...|
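Putting the pieces together, the parsing part of main would then read roughly like this (the same fixes as above, just combined):

val parsedData = parse(fromURL("https://bittrex.com/api/v1.1/public/getmarkets").mkString).extract[Result]
val mySourceDataset = spark.createDataset(parsedData.result)
mySourceDataset.printSchema
mySourceDataset.show()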

Extracting a field from a Json string using jackson mapper in Scala

I have a JSON string:
val message =
  """{"me":"a",
    |"version":"1.0",
    |"message_metadata": "{\"event_type\":\"UpdateName\",\"start_date\":\"1515\"}"
    |}""".stripMargin
I want to extract the value of the field event_type from this json string.
I have used the code below to extract the value:
val mapper = new ObjectMapper
val root = mapper.readTree(message)
val metadata = root.at("/message_metadata").asText()
val root1 = mapper.readTree(metadata)
val event_type = root1.at("/event_type").asText()
print("eventType:" + event_type.toString) //UpdateName
This works fine and I get the value UpdateName. But when I try to get the event type in a single line as below:
val mapper = new ObjectMapper
val root = mapper.readTree(message)
val event_type = root.at("/message_metadata/event_type").asText()
print("eventType:" + event_type.toString) //Empty string
Here event_type returns an empty string. This might be because message_metadata holds the JSON object as a string value. Is there a way I can get the value of event_type in a single line?
The problem is that your JSON message contains an object whose message_metadata field itself contains JSON, so it must be decoded separately. I'd suggest that you don't put JSON into JSON but only encode the data structure once.
Example:
val message = "{\"me\":\"a\",
\"version\":\"1.0\",
\"message_metadata\": {
\"event_type\":\"UpdateName\",
\"start_date\":\"1515\"
}
}"
You can parse your JSON using case classes and then get your event_type field from there.
case class Json(me: String, version: String, message_metadata: Message)
case class Message(event_type: String, start_date: String)
object Mapping {
  def main(args: Array[String]): Unit = {
    import com.fasterxml.jackson.databind.ObjectMapper
    import com.fasterxml.jackson.module.scala.DefaultScalaModule
    import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper

    val objectMapper = new ObjectMapper() with ScalaObjectMapper
    objectMapper.registerModule(DefaultScalaModule)

    val str = "{\n \"me\": \"a\",\n \"version\": \"1.0\",\n \"message_metadata\": {\n \"event_type\": \"UpdateName\",\n \"start_date\": \"1515\"\n }\n}"
    val json = objectMapper.readValue(str, classOf[Json])

    // to print event_type
    println(json.message_metadata.event_type)
    // output: UpdateName
  }
}
You can even convert JSON to a Scala case class and then get the particular field from the case class.
Please find a working and detailed answer which I have provided using generics here.

Spark convert RDD to DataFrame - Enumeration is not supported

I have a case class which contains an enumeration field, PersonType. I would like to insert this record into a Hive table.
object PersonType extends Enumeration {
  type PersonType = Value
  val BOSS = Value
  val REGULAR = Value
}
case class Person(firstname: String, lastname: String)
case class Holder(personType: PersonType.Value, person: Person)
And:
val hiveContext = new HiveContext(sc)
import hiveContext.implicits._
val item = new Holder(PersonType.REGULAR, new Person("tom", "smith"))
val content: Seq[Holder] = Seq(item)
val data : RDD[Holder] = sc.parallelize(content)
val df = data.toDF()
...
When I try to convert the corresponding RDD to DataFrame, I get the following exception:
Exception in thread "main" java.lang.UnsupportedOperationException:
Schema for type com.test.PersonType.Value is not supported
...
at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:691)
at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:30)
at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:630)
at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:30)
at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:414)
at org.apache.spark.sql.SQLImplicits.rddToDataFrameHolder(SQLImplicits.scala:94)
I'd like to convert PersonType to String before inserting to Hive.
Is it possible to extend the implicit conversion to handle PersonType as well?
I tried something like this, but it didn't work:
object PersonTypeConversions {
implicit def toString(personType: PersonTypeConversions.Value): String = personType.toString()
}
import PersonTypeConversions._
Spark: 1.6.0
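A common workaround (a sketch, not an answer from the original thread; HolderRow is a made-up name) is to keep the Spark-facing case class free of the enumeration and store the value's name as a String before calling toDF, reusing sc and the hiveContext.implicits import from the code above:

// Spark-friendly mirror of Holder: the enum is represented by its String name
case class HolderRow(personType: String, person: Person)

val rows: Seq[HolderRow] =
  Seq(Holder(PersonType.REGULAR, Person("tom", "smith")))
    .map(h => HolderRow(h.personType.toString, h.person))

// Every field type is now supported by Spark SQL's schema inference
val df = sc.parallelize(rows).toDF()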