Idiomatic handling of JSON null in scala upickle / ujson - scala

I am new to Scala and would like to learn the idiomatic way to solve common problems, as in pythonic for Python. My question regards reading JSON data with upickle, where the JSON value contains a string when present, and null when not present. I want to use a custom value to replace null. A simple example:
import upickle.default._
val jsonString = """[{"always": "foo", "sometimes": "bar"}, {"always": "baz", "sometimes": null}]"""
val jsonData = ujson.read(jsonString)
for (m <- jsonData.arr) {
println(m("always").str.length) // this will work
println(m("sometimes").str.length) // this will fail, Exception in thread "main" ujson.Value$InvalidData: Expected ujson.Str (data: null)
}
The issue is with the field "sometimes": when null, we cannot apply .str (or any other function mapping to a static type other than null). I am looking for something like m("sometimes").str("DEFAULT").length, where "DEFAULT" is the replacement for null.
Idea 1
Using pattern matching, the following works:
val sometimes = m("sometimes") match {
case s: ujson.Str => s.str
case _ => "DEFAULT"
}
println(sometimes.length)
Given Scala's concise syntax, this looks a bit complicated and will be repetitive when done for a number of values.
Idea 2
Answers to a related question mention creating a case class with default values. For my problem, the creation of a case class seems inflexible to me when different replacement values are needed depending depending on context.
Idea 3
Anwers to another question (not specific to upickle) discuss using Try().getOrElse(), i.e.:
import scala.util.Try
// ...
println(Try(m("sometimes").str).getOrElse("DEFAULT").length)
However, the discussion mentions that throwing an exception for a regular program path is expensive.
What are idiomatic, yet concise ways to solve this?

Idiomatic or scala way to do this by using scala's Option.
Fortunately, upickle Values offers them. Refer strOpt method in this source code.
Your problem in code is str methods in m("always").str and m("sometimes").str
With this code, you are prematurely assuming that all the values are strings. That's where the strOpt method comes. It either outputs a string if its value is a string or a None type if it not. And we can use getOrElse method coupled with it to decide what to throw if the value is None.
Following would be the optimum way to handle this.
val jsonString = """[{"always": "foo", "sometimes": "bar"}, {"always": "baz", "sometimes": null}]"""
for (m <- jsonData.arr) {
println(m("always").strOpt.getOrElse("").length)
println(m("sometimes").strOpt.getOrElse("").length)
}
Output:
3
3
3
0
Here if we get any value other than a string (null, float, int), the code will output it as an empty string. And its length will be calculated as 0.
Basically, this is similar to your "Idea1" approach but this is the scala way. Instead of "DEFAULT", I am throwing an empty string because you wouldn't want to have null values' length to be 7 (Length of string "DEFAULT").

Related

How to avoid NullPointerException in Scala while storing query result in variable

Here is a code that requires a change:
val activityDate = validation.select("activity_date").first.get(0).toString
When we run a job, 'activityDate' might return null as a result of query since there might not be any data in db. In this case we get NullPointerException. I need to update this code to avoid NPE.
I tried to do it in different ways but there is always smth missing. I should probably use Match Expression here but have face some errors while initializing it.
The usual way to model some kind of data that might or might not be there in Scala is the Option type. The Option type has two concrete implementations, Some for a value which is there and the None singleton to represent any absent value. Option conveniently has a constructor that wraps a nullable value and turns it into either a Some(value) for non-null values and None for nulls. You can use it as follows:
Option(validation.select("activity_date").first.get(0))
You can apply transformations to it using various combinators. If you want to transform the piece of data itself into something more meaningful for your application, map is usually a good call. The following applies the logic you had before:
val activityDate: Option[String] =
Option(validation.select("activity_date").first.get(0)).
map { activityDate => activityDate.toString }
Note that now activityDate is an Option itself, which means that you have to explicitly handle the case in which the data is not there. You can do so with a match on the concrete type of the option as follows:
activityDate match {
case Some(date) => // `date` is there for sure!
case None => // handle the `select` returned nothing
}
Altenrnatively if you want to apply a default value you can use the getOrElse method on the Option:
val activityDate: String =
Option(validation.select("activity_date").first.get(0)).
map { activityDate => activityDate.toString }.
getOrElse("No Data")
Another possibility to apply a default value on a None and a function to the value in a Some is using fold:
val activityDate: String =
Option(validation.select("activity_date").first.get(0)).
fold("No Data")(activityDate => activityDate.toString)
As a final note, you can shorten anonymous functions in these cases as follows:
val activityDate: String =
Option(validation.select("activity_date").first.get(0)).
fold("No Data")(_.toString)
Where _ is used to refer to the only parameter.

Convert string to a parameterized type in scala

I am trying to convert string to scala type by using asInstanceof method, while executing this am getting below exception
java.lang.ClassCastException: java.lang.String cannot be cast to scala.Tuple2
my code as below
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.col
val cond : String = "(null, col(field).isNotNull)" // Will get this condition from properties file.
type mutliColumnType = (Column, Column)
def condition( value : String , field : String = "somefield") : mutliColumnType = {
value match {
case "a" => (null, col(field).isNull)
case _ => convertStringToMutliColumnType(cond) //cond.asInstanceOf[mutliColumnType]
}
}
condition("a") // returns data
condition("ab") // Exception
How can we convert string to multiColumnType here ?
UPDATE:
Currently I written below code snippet to parse string to mutliColumnType :
def convertStringToMutliColumnType(cond : String) : mutliColumnType = {
val colArray=cond.trim.substring(1, cond.length-1).split(",")
(col(colArray(0)), col(colArray(1)))
}
You seem to be wanting asInstanceOf to evaluate a string as Scala code. This isn't really the same thing as "casting", it's not what asInstanceOf does, and in fact evaluating strings as code is not something Scala supports at all (apart from some internal APIs and ill-advised libraries like the now-defunct twitter-util-eval).
It's quite unclear what you're trying to do here, but your options are basically either to write a parser that takes strings and returns mutliColumnType values (which is a lot of work and almost certainly a bad idea), or just not to do this—i.e. to use Scala code where you need Scala code and strings where you need strings.
As a footnote: asInstanceOf is only really useful for downcasting (when you've lost type information and have something typed as Any that you as the programmer "know" is actually a String or whatever), and even then it should be considered an advanced technique that's unsafe and not generally idiomatic. Any time you write asInstanceOf you're telling the compiler to get out of the way because you know better, and in my experience you'll generally be wrong.

Anorm null values

I'm writing server that executes any select query on my db and returns json. I've done most of that task but I stuck with parsing nullable column to string.
val result = SQL("SELECT * FROM Table limit 5;")().map(_.asList.map({_.toString})).toList
val jsonResp = Json.toJson(result)
And that generates if the column could have null value string Some(123) instead of 123. I tried with match but I failed with compose that command. Maybe you had some similar problem and you know how to deal with that kind of response?
Edit:
I made some progress by adding pattern matching:
val result = SQL(query)()
.map(_.asList.map(
{
case Some(s) => s.toString
case None => ""
case v => v.toString
}
)).toList
but I'm not sure is it good way to solve that problem. Still waiting for ideas
Anorm is supporting nullable column as optional value.
There you return row as a list of raw value. It would be better to use the parser API to indicate how to extract properly values. E.g.
SQL("SELECT a, b, c ...").as(get[Option[String]]("a") ~ int("b") ~ str("c) map { case a ~ b ~ c => MyClass(a, b, c) }.*)
Returns SQL results as list of MyClass, with properties being in order Option[String], Int and String.
Anorm documentation has plenty of other examples.
It would probably be best to map optional entries in the database to optional fields in the json; most scala json libraries will render a field with Option type appropriately. It also seems odd that you would want to render an integer value as a string for json - json has a perfectly good numeric type for integers.
If you definitely want to convert an Option[Int] to a string in this fashion, the clearest way is probably o.map(_.toString).getOrElse(""); you could therefore write _.asList.map{_.map{_.toString}.getOrElse("")} rather than your pattern-match. I don't know the specific SQL library well enough to know whether the results are statically known to be of type Option or not; if the values are of type Any then you probably do need to match options and non-options the way you have.

avoid type conversion in Scala

I have this weird requirement where data comes in as name ->value pair from a service and all the name-> value type is string only (which really they are not but that's how data is stored)
This is a simplified illustration.
case class EntityObject(type:String,value:String)
EntityObject("boolean","true")
now when getting that EntityObject if type is "boolean" then I have to make sure value is not anything else but boolean so first get type out and check value and cast value to that type. e.g in this case check value is boolean so have to cast string value to boolean to validate. If it was anything else besides boolean then it should fail.
e.g. if data came in as below, casting will fail and it should report back to the caller about this error.
EntityObject("boolean","1")
Due to this weird requirement it forces type conversion in validation code which doesn't look elegant and against type safe programming. Any elegant way to handle this in scala (may be in a more type safe manner)?
Here is where I'm going to channel an idea taken from a tweet by Miles Sabin in regards to hereogenous mappings (see this gist on github.) If you know the type of object mapping names a head of time you can use a nifty little trick which involves dependent types. Hold on, 'cause it's a wild ride:
trait AssocConv[K] { type V ; def convert: String => V }
def makeConv[V0](name: String, con: String => V0) = new AssocConv[name.type]{
V = V0
val convert = con
}
implicit val boolConv = makeConv("boolean", yourMappingFunc)
def convEntity(name: String, value: String)(implicit conv: AssocConv[name.type]): Try[conv.V] = Try{ conv.convert(value) }
I haven't tested this but it "should" work. I've also enclosed it in a Scala Try so that it catches exceptions thrown by your conversion function (in case you're doing things like _.toInt as the converter.)
You're really talking about conversion, not casting. Casting would be if the value really were an instance of Boolean at runtime, whereas what you have is a String representation of a Boolean.
If you're already working with a case class, I think a pattern matching expression would work pretty well here.
For example,
def convert(entity : EntityObject) : Any = entity match {
case EntityObject("boolean", "true") => true
case EntityObject("boolean", "false") => false
case EntityObject("string", s) => s
// TODO: add Regex-based matchers for numeric types
}
Anything that doesn't match one of the specified patterns would cause a MatchError, or you could put a catchall expression at the end to throw your own exception.
In this particular example, since the function returns Any, the calling coffee would need to do an actual type cast to get the specific type, but at least by that point all validation/conversion would have already been performed. Alternatively, you could just put the code that uses the values directly into the above function and avoid casting. I don't know what your specific needs are, so I can't offer anything more detailed.

How to get object from Play cache (scala)

How to get object from Play cache (scala)
Code to set:
play.api.cache.Cache.set("mykey98", new Product(98), 0)
Code to get:
val product1: Option[Any] = play.api.cache.Cache.get("mykey98")
I get Option object. How to get actual Product object I stored in first step.
First and foremost, I would suggest using Cache.getAs, which takes a type parameter. That way you won't be stuck with Option[Any]. There are a few ways you can do this. In my example, I'll use String, but it will work the same with any other class. My preferred way is by pattern matching:
import play.api.cache.Cache
Cache.set("mykey", "cached string", 0)
val myString:String = Cache.getAs[String]("mykey") match {
case Some(string) => string
case None => SomeOtherClass.getNewString() // or other code to handle an expired key
}
This example is a bit over-simplified for pattern matching, but I think its a nicer method when needing to branch code based on the existence of a key. You could also use Cache.getOrElse:
val myString:String = Cache.getOrElse[String]("mykey") {
SomeOtherClass.getNewString()
}
In your specific case, replace String with Product, then change the code to handle what will happen if the key does not exist (such as setting a default key).