I'm new to Scala and the Play Framework. I have written the following controller:
@Singleton
class MyController #Inject()(val controllerComponents: ControllerComponents) extends BaseController {
implicit val newMeasurementJson: OFormat[MeasurementModel] = Json.format[MeasurementModel]
def addMeasurement(): Action[AnyContent] = Action { implicit request =>
val content = request.body
val jsonObject: Option[JsValue] = content.asJson
val measurement: Option[MeasurementModel] =
jsonObject.flatMap(
Json.fromJson[MeasurementModel](_).asOpt
)
...
}
...
}
Where the endpoint receives the following JSON:
{
"sensor_id": "1029",
"sensor_type": "BME280",
"location": 503,
"lat": 48.12,
"lon": 11.488,
"timestamp": "2022-04-05T00:34:24",
"pressure": 94667.38,
"altitude": null,
"pressure_sealevel": null,
"temperature": 3.91,
"humidity": 65.85
}
MeasurementModel looks like this:
case class MeasurementModel(
sensor_id: String,
sensor_type: String,
location: Int,
lat: Float,
lon: Float,
timestamp: String,
pressure: Float,
altitude: Int,
pressure_sealevel: Int,
temperature: Float,
humidity: Float) {
}
Through testing I have seen that the null values in the JSON cause the creation of the measurement object to fail. How can I handle the null values and have them set in the generated MeasurementModel object?
In Scala, the idiomatic type for a value that may be absent is Option[...] (technically Null is also a type that can hold null, but you normally don't work with null directly).
Consider the following REPL code:
scala> val mightBeIntOrNull: Option[Int] = Option(1)
val mightBeIntOrNull: Option[Int] = Some(1)
scala> val mightBeIntOrNull: Option[Int] = null
val mightBeIntOrNull: Option[Int] = null
The Option wraps the Int value in Some, which can be extracted by pattern matching.
scala> val mightBeIntOrNull: Option[Int] = Option(1)
val mightBeIntOrNull: Option[Int] = Some(1)
scala> mightBeIntOrNull match {
| case Some(myIntVal) => println("This is an integer :" + myIntVal)
| case _ => println("This might be a null")
| }
This is an integer :1
scala> val mightBeIntOrNull: Option[Int] = null
val mightBeIntOrNull: Option[Int] = null
scala> mightBeIntOrNull match {
| case Some(myIntVal) => println("This is an integer :" + myIntVal)
| case _ => println("This might be a null")
| }
This might be a null
As Gaël J mentioned, you should wrap the relevant field types in Option in your case class.
So the solution is to use Option for every field where you expect a null, for example:
altitude: Option[Int],
pressure_sealevel: Option[Int],
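A sketch of the full updated case class (all other field types kept as in the question); the format created by Json.format[MeasurementModel] then reads a JSON null or a missing key for these fields as None:
case class MeasurementModel(
sensor_id: String,
sensor_type: String,
location: Int,
lat: Float,
lon: Float,
timestamp: String,
pressure: Float,
altitude: Option[Int],
pressure_sealevel: Option[Int],
temperature: Float,
humidity: Float)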
I am trying to refactor some code and put the general logic into a trait. I basically want to process datasets, group them by some key and aggregate:
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.{ Dataset, Encoder, Encoders, TypedColumn }
case class SomeKey(a: String, b: Boolean)
case class InputRow(
key: SomeKey,
v: Double
)
trait MyTrait {
def processInputs: Dataset[InputRow]
def groupAndAggregate(
logs: Dataset[InputRow]
): Dataset[(SomeKey, Long)] = {
import logs.sparkSession.implicits._
logs
.groupByKey(i => i.key)
.agg(someAggFunc)
}
//Whatever agg function: here, it counts the number of v that are >= 0.5
def someAggFunc: TypedColumn[InputRow, Long] =
new Aggregator[
/*input type*/ InputRow,
/* "buffer" type */ Long,
/* output type */ Long
] with Serializable {
def zero = 0L
def reduce(b: Long, a: InputRow) = {
if (a.v >= 0.5)
b + 1
else
b
}
def merge(b1: Long, b2: Long) =
b1 + b2
// map buffer to output type
def finish(b: Long) = b
def bufferEncoder: Encoder[Long] = Encoders.scalaLong
def outputEncoder: Encoder[Long] = Encoders.scalaLong
}.toColumn
}
Everything works fine: I can instantiate a class that inherits from MyTrait and override the way I process inputs:
import spark.implicits._
case class MyTraitTest(testDf: DataFrame) extends MyTrait {
override def processInputs: Dataset[InputRow] = {
val ds = testDf
.select(
$"a",
$"b",
$"v",
)
.rdd
.map(
r =>
InputRow(
SomeKey(r.getAs[String]("a"), r.getAs[Boolean]("b")),
r.getAs[Double]("v")
)
)
.toDS
ds
}
val df: DataFrame = Seq(
("1", false, 0.40),
("1", false, 0.54),
("0", true, 0.85),
("1", true, 0.39)
).toDF("a", "b", "v")
val myTraitTest = MyTraitTest(df)
val ds: Dataset[InputRow] = myTraitTest.processInputs
val res = myTraitTest.groupAndAggregate(ds)
res.show(false)
+----------+----------------------------------+
|key |InputRow |
+----------+----------------------------------+
|[1, false]|1 |
|[0, true] |1 |
|[1, true] |0 |
+----------+----------------------------------+
Now the problem: I want SomeKey to derive from a more generic trait Key, because the key will not always have only two fields, the fields won't have the same type etc. It will always be a simple tuple of some basic primitive types though.
So I tried to do the following:
trait Key extends Product
case class SomeKey(a: String, b: Boolean) extends Key
case class SomeOtherKey(x: Int, y: Boolean, z: String) extends Key
case class InputRow[T <: Key](
key: T,
v: Double
)
trait MyTrait[T <: Key] {
def processInputs: Dataset[InputRow[T]]
def groupAndAggregate(
logs: Dataset[InputRow[T]]
): Dataset[(T, Long)] = {
import logs.sparkSession.implicits._
logs
.groupByKey(i => i.key)
.agg(someAggFunc)
}
def someAggFunc: TypedColumn[InputRow[T], Long] = {...}
I now do:
case class MyTraitTest(testDf: DataFrame) extends MyTrait[SomeKey] {
override def processInputs: Dataset[InputRow[SomeKey]] = {
...
}
etc.
But now I get the error : Unable to find encoder for type T. An implicit Encoder[T] is needed to store T instances in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
.groupByKey(i => i.key)
I really don't know how to work around this issue; I tried lots of things without success. Sorry for this quite lengthy description, but hopefully you have all the elements to help me understand... thanks!
Spark needs to be able to implicitly create the encoder for product type T so you'll need to help it work around the JVM type erasure and pass the TypeTag for T as an implicit parameter of your groupAndAggregate method.
A working example:
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.{ DataFrame, Dataset, Encoders, TypedColumn }
import scala.reflect.runtime.universe.TypeTag
trait Key extends Product
case class SomeKey(a: String, b: Boolean) extends Key
case class SomeOtherKey(x: Int, y: Boolean, z: String) extends Key
case class InputRow[T <: Key](key: T, v: Double)
trait MyTrait[T <: Key] {
def processInputs: Dataset[InputRow[T]]
def groupAndAggregate(
logs: Dataset[InputRow[T]]
)(implicit tTypeTag: TypeTag[T]): Dataset[(T, Long)] = {
import logs.sparkSession.implicits._
logs
.groupByKey(i => i.key)
.agg(someAggFunc)
}
def someAggFunc: TypedColumn[InputRow[T], Long] =
new Aggregator[InputRow[T], Long, Long] with Serializable {
def reduce(b: Long, a: InputRow[T]) = b + (a.v * 100).toLong
def merge(b1: Long, b2: Long) = b1 + b2
def zero = 0L
def finish(b: Long) = b
def bufferEncoder = Encoders.scalaLong
def outputEncoder = Encoders.scalaLong
}.toColumn
}
with a wrapping case class
case class MyTraitTest(testDf: DataFrame) extends MyTrait[SomeKey] {
import testDf.sparkSession.implicits._
import org.apache.spark.sql.functions.struct
override def processInputs = testDf
.select(struct($"a", $"b") as "key", $"v" )
.as[InputRow[SomeKey]]
}
and a test execution
val df = Seq(
("1", false, 0.40),
("1", false, 0.54),
("0", true, 0.85),
("1", true, 0.39)
).toDF("a", "b", "v")
val myTraitTest = MyTraitTest(df)
val ds = myTraitTest.processInputs
val res = myTraitTest.groupAndAggregate(ds)
res.show(false)
+----------+-----------------------------------------------+
|key |$anon$1($line5460910223.$read$$iw$$iw$InputRow)|
+----------+-----------------------------------------------+
|[1, false]|94 |
|[1, true] |39 |
|[0, true] |85 |
+----------+-----------------------------------------------+
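As a side note, a variant of the same idea (my own sketch, not from the original answer) is to require an implicit Encoder[T] instead of a TypeTag: groupByKey only needs an encoder for the key type, and a concrete caller such as MyTraitTest can still obtain it from its sparkSession.implicits._ import.
// requires import org.apache.spark.sql.Encoder
def groupAndAggregate(
logs: Dataset[InputRow[T]]
)(implicit keyEncoder: Encoder[T]): Dataset[(T, Long)] = {
logs
.groupByKey(i => i.key) // resolved by the implicit keyEncoder
.agg(someAggFunc) // agg builds the result encoder internally, no extra implicits needed
}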
I am using https://circe.github.io/circe/ and would like to figure out if a property contains an empty JSON object or not.
For example:
val json: String = """
{
"id": "c730433b-082c-4984-9d66-855c243266f0",
"name": "Foo",
"counts": [1, 2, 3],
"values": {
}
}
"""
As you can see in the code above, the property values is an empty JSON object.
How can I validate whether a property is empty or not?
There are lots of ways you could do this. For example:
import io.circe.jawn.parse
def valuesIsEmpty(in: String): Option[Boolean] = for {
parsed <- parse(in).right.toOption
parsedObj <- parsed.asObject
values <- parsedObj("values")
valuesObj <- values.asObject
} yield valuesObj.size == 0
And then:
scala> valuesIsEmpty(json)
res0: Option[Boolean] = Some(true)
Here None would indicate that the input is not valid JSON or isn't an object with a values member.
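For instance (my own illustration, not from the original answer):
scala> valuesIsEmpty("""{"id": "c730433b-082c-4984-9d66-855c243266f0"}""")
res1: Option[Boolean] = None

scala> valuesIsEmpty("not json at all")
res2: Option[Boolean] = None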
In general you wouldn't perform validation at this level, though—you'd build it into your decoder. For example:
import io.circe.Decoder, io.circe.generic.semiauto.deriveDecoder
case class Entry(id: String, name: String, counts: List[Int], values: Map[String, String])
implicit val decodeEntry: Decoder[Entry] = deriveDecoder[Entry].emap {
case e if e.values.isEmpty => Left("empty values")
case e => Right(e)
}
And then:
scala> io.circe.jawn.decode[Entry](json)
res0: Either[io.circe.Error,Entry] = Left(DecodingFailure(empty values, List()))
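And if values is non-empty, the derived decoder succeeds (again with my own sample values):
scala> io.circe.jawn.decode[Entry]("""{"id": "1", "name": "Foo", "counts": [1], "values": {"k": "v"}}""")
res1: Either[io.circe.Error,Entry] = Right(Entry(1,Foo,List(1),Map(k -> v)))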
I have 2 case classes:
case class Person(fname: String, lname: String, age: Int, address: String)
case class PersonUpdate(fname: String, lname: String, age: Int)
So PersonUpdate doesn't have all the fields Person has, and I want to write an efficient function that takes a Person and a PersonUpdate and finds the fields that have different values. For example:
def findChangedFields(person: Person, personUpdate: PersonUpdate): Seq[String] = {
var listOfChangedFields: List[String] = List.empty
if (person.fname != personUpdate.fname)
listOfChangedFields = listOfChangedFields :+ "fname"
if (person.lname != personUpdate.lname)
listOfChangedFields = listOfChangedFields :+ "lname"
if (person.age != personUpdate.age)
listOfChangedFields = listOfChangedFields :+ "age"
listOfChangedFields
}
findChangedFields(per, perUpdate)
But this is very ugly; how can I write this nicely with the magic of Scala?
Something like this, maybe?
val fields = Seq("fname", "lname", "age")
val changedFields = person.productIterator
.zip(personUpdate.productIterator)
.zip(fields.iterator)
.collect { case ((a, b), name) if a != b => name }
.toList
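For example, with the sample values from the question (my own check, not part of the original answer):
val person = Person("Pedro", "Luis", 22, "street")
val personUpdate = PersonUpdate("Pedro", "Luis", 27)
// zip stops at the shorter iterator, so the extra "address" field is never compared
// changedFields == List("age")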
Something like this:
case class Person(fname: String, lname: String, age: Int, address: String)
case class PersonUpdate(fname: String, lname: String, age: Int)
def findFirstNameChanged(person: Person, personUpdate: PersonUpdate): List[String] =
{
if (person.fname != personUpdate.fname) List("fname")
else Nil
}
def findLastNameChanged(person: Person, personUpdate: PersonUpdate): List[String] = {
if (person.lname != personUpdate.lname) List("lname")
else Nil
}
def findAgeNameChanged(person: Person, personUpdate: PersonUpdate): List[String] = {
if (person.age != personUpdate.age) List("age")
else Nil
}
def findChangedFields(person: Person, personUpdate: PersonUpdate): Seq[String] = {
findFirstNameChanged(person,personUpdate):::
findLastNameChanged(person,personUpdate) ::: findAgeNameChanged(person,personUpdate)
}
val per = Person("Pedro","Luis",22,"street")
val personUpdate = PersonUpdate("Pedro", "Luis",27)
findChangedFields(per, personUpdate)
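With these sample values only the age differs (22 vs 27), so the expected result of findChangedFields here is List("age").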
I think your problem is similar to comparing two Sets of tuples. Please feel free to correct me.
So here is my solution, which will work for any two case classes having field names in any order:
import scala.reflect.ClassTag

def caseClassToSet[T](inp: T)(implicit ct: ClassTag[T]): Set[(String, AnyRef)] = {
ct.runtimeClass.getDeclaredFields.map(f => {
f.setAccessible(true)
val res = (f.getName, f.get(inp))
f.setAccessible(false)
res
}).toSet
}
val person = Person("x", "y", 10, "xy")
val personUpdate = PersonUpdate("z","y",12)
val personParams: Set[(String, AnyRef)] = caseClassToSet(person)
val personUpdateParams: Set[(String, AnyRef)] = caseClassToSet(personUpdate)
println(personUpdateParams diff personParams)
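This prints the name/value pairs from personUpdate whose values differ from person, here fname and age, i.e. something like Set((fname,z), (age,12)) (element order of a Set is not significant).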
Got help from Get field names list from case class
I have a case class:
case class EvaluateAddress(addressFormat: String,
screeningAddressType: String,
value: Option[String])
This was working fine until I got a new use case where the "value" parameter can be a class object instead of a String.
My initial implementation to handle this use case:
case class EvaluateAddress(addressFormat: String,
screeningAddressType: String,
addressId: Option[String],
addressValue: Option[MailingAddress]) {
def this(addressFormat: String, screeningAddressType: String, addressId: String) = {
this(addressFormat, screeningAddressType, Option(addressId), None)
}
def this(addressFormat: String, screeningAddressType: String, address: MailingAddress) = {
this(addressFormat, screeningAddressType, None, Option(address))
}
}
But because of some problem, I cannot have four parameters in any constructor.
Is there a way I can create a class containing three parameters (addressFormat, screeningAddressType, value) and handle both use cases?
Your code works fine; to use the other constructors you just need to use the new keyword:
case class MailingAddress(i: Int)
case class EvaluateAddress(addressFormat: String, screeningAddressType: String, addressId: Option[String], addressValue: Option[MailingAddress]) {
def this(addressFormat: String, screeningAddressType: String, addressId: String) = {
this(addressFormat, screeningAddressType, Option(addressId), None)
}
def this(addressFormat: String, screeningAddressType: String, address: MailingAddress) = {
this(addressFormat, screeningAddressType, None, Option(address))
}
}
val e1 = EvaluateAddress("a", "b", None, None)
val e2 = new EvaluateAddress("a", "b", "c")
val e3 = new EvaluateAddress("a", "b", MailingAddress(0))
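For reference (my own check), e2 here evaluates to EvaluateAddress(a,b,Some(c),None) and e3 to EvaluateAddress(a,b,None,Some(MailingAddress(0))).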
You can create an auxiliary ADT to wrap the different types of values. Inside EvaluateAddress you can then check which alternative was provided with a match:
case class EvaluateAddress(addressFormat: String,
screeningAddressType: String,
value: Option[EvaluateAddress.Value]
) {
import EvaluateAddress._
def doEvaluation() = value match {
case Some(Value.AsId(id)) =>
case Some(Value.AsAddress(mailingAddress)) =>
case None =>
}
}
object EvaluateAddress {
sealed trait Value
object Value {
case class AsId(id: String) extends Value
case class AsAddress(address: MailingAddress) extends Value
}
}
It's then possible to also define some implicit conversions to automatically convert Strings and MailingAddresses into Values:
object EvaluateAddress {
sealed trait Value
object Value {
case class AsId(id: String) extends Value
case class AsAddress(address: MailingAddress) extends Value
implicit def idAsValue(id: String): Value = AsId(id)
implicit def addressAsValue(address: MailingAddress): Value = AsAddress(address)
}
def withRawValue[T](addressFormat: String,
screeningAddressType: String,
rawValue: Option[T])(implicit asValue: T => Value): EvaluateAddress =
{
EvaluateAddress(addressFormat, screeningAddressType, rawValue.map(asValue))
}
}
Some examples of using those implicit conversions:
scala> EvaluateAddress("a", "b", Some("c"))
res1: EvaluateAddress = EvaluateAddress(a,b,Some(AsId(c)))
scala> EvaluateAddress("a", "b", Some(MailingAddress("d")))
res2: EvaluateAddress = EvaluateAddress(a,b,Some(AsAddress(MailingAddress(d))))
scala> val id: Option[String] = Some("id")
id: Option[String] = Some(id)
scala> EvaluateAddress.withRawValue("a", "b", id)
res3: EvaluateAddress = EvaluateAddress(a,b,Some(AsId(id)))
How should I extract the value of a case class field, given a String containing the field's name?
For example:
case class Person(name: String, age: Int)
val a = Person("test",10)
Now, given the string "name" or "age", I want to extract the corresponding value from the variable a. How do I do this? I know this can be done using reflection, but I am not exactly sure how.
What you're looking for can be achieved using Shapeless lenses. This also enforces, at compile time rather than run time, that the field actually exists on the case class:
import shapeless._
case class Person(name: String, age: Int)
val nameLens = lens[Person] >> 'name
val p = Person("myName", 25)
nameLens.get(p)
Yields:
res0: String = myName
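The same works for the other fields, e.g. (my own addition, not part of the original answer):
val ageLens = lens[Person] >> 'age
ageLens.get(p) // 25
ageLens.set(p)(30) // Person(myName,30)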
If you try to extract a non-existing field, you get a compile-time error, which is a much stronger guarantee:
import shapeless._
case class Person(name: String, age: Int)
val nonExistingLens = lens[Person] >> 'bla
val p = Person("myName", 25)
nonExistingLens.get(p)
Compiler yells:
Error:(5, 44) could not find implicit value for parameter mkLens: shapeless.MkFieldLens[Person,Symbol with shapeless.tag.Tagged[String("bla")]]
val nonExistingLens = lens[Person] >> 'bla
I don't know exactly what you had in mind, but a match statement would do. It is not very generic or extensible with regard to changes to the Person case class, but it does meet your basic requirement of not using reflection:
scala> val a = Person("test",10)
a: Person = Person(test,10)
scala> def extract(p: Person, fieldName: String) = {
| fieldName match {
| case "name" => p.name
| case "age" => p.age
| }
| }
extract: (p: Person, fieldName: String)Any
scala> extract(a, "name")
res1: Any = test
scala> extract(a, "age")
res2: Any = 10
scala> extract(a, "name####")
scala.MatchError: name#### (of class java.lang.String)
at .extract(<console>:14)
... 32 elided
UPDATE as per comment:
scala> case class Person(name: String, age: Int)
defined class Person
scala> val a = Person("test",10)
a: Person = Person(test,10)
scala> def extract(p: Person, fieldName: String) = {
| fieldName match {
| case "name" => Some(p.name)
| case "age" => Some(p.age)
| case _ => None
| }
| }
extract: (p: Person, fieldName: String)Option[Any]
scala> extract(a, "name")
res4: Option[Any] = Some(test)
scala> extract(a, "age")
res5: Option[Any] = Some(10)
scala> extract(a, "name####")
res6: Option[Any] = None
scala>
I think it can be done by converting the case class to a Map and then getting the field by name:
def ccToMap(cc: AnyRef): Map[String, Any] =
cc.getClass.getDeclaredFields.foldLeft(Map[String, Any]()) { (a, f) =>
f.setAccessible(true)
a + (f.getName -> f.get(cc))
}
Usage
case class Person(name: String, age: Int)
val column = Person("me", 16)
println(ccToMap(column))
val name = ccToMap(column)("name")
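Here ccToMap(column) evaluates to something like Map(name -> me, age -> 16) (the field order comes from getDeclaredFields and is not strictly guaranteed), so name ends up as "me".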