Anorm: implicit conversion of all values (including null) to String - Scala

I'm new to Scala and the Play Framework. I'm trying to query all the data for selected columns from a database table and save it as an Excel file.
The selected columns usually have different types, such as Int, String, Timestamp, etc.
I want to convert values of any type, including null, into String
(with null converted to the empty string "")
without knowing the actual type of a column, so the code can be used for any table.
Following Play's documentation, I wrote the implicit converter below; however, it cannot handle null. I've googled this for a long time and cannot find a solution. Can someone please let me know how to handle null in the implicit converter?
Thanks in advance~
import anorm._

implicit def valueToString: Column[String] =
  Column.nonNull1[String] { (value, meta) =>
    val MetaDataItem(qualified, nullable, clazz) = meta
    value match {
      case s: String             => Right(s)          // provided default case
      case i: Int                => Right(i.toString) // Int to String
      case t: java.sql.Clob      => Right(t.toString) // Clob/Text to String
      case d: java.sql.Timestamp => Right(d.toString) // Timestamp to String
      case _ => Left(TypeDoesNotMatch(
        s"Cannot convert $value: ${value.asInstanceOf[AnyRef].getClass} to String for column $qualified"))
    }
  }

As indicated in the documentation, if there is a Column[T] allowing a column of type T to be parsed, and if the column(s) can be null, then Option[T] should be asked for, benefiting from the generic support for Option[T].
Here there is a custom Column[String] (make sure the custom one is used, not the provided Column[String]), so Option[String] should be asked for.
import myImplicitStrColumn
val parser = get[Option[String]]("col")
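Putting it together, a minimal sketch, assuming the custom converter above is in scope, an implicit java.sql.Connection is available, and a hypothetical table my_table with a column col:

import anorm._
import anorm.SqlParser.get

// None (a SQL NULL) becomes the empty string "".
val colAsString = get[Option[String]]("col").map(_.getOrElse(""))

val values: List[String] = SQL("SELECT col FROM my_table").as(colAsString.*)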

Related

Typecasting in Scala

I have an alphanumeric field in an RDD, of type AnyRef.
Case 1: if it's 99898, I want to cast it as Long.
Case 2: if it's 0099898, I want to cast it as String.
Case 3: if it's AB998, I want to cast it as String.
I am trying this:
try {
  account_number.asInstanceOf[Long]
} catch {
  case _: Throwable => account_number.asInstanceOf[String]
}
But with this I miss case 2, because 0099898 is converted to 99898. Any ideas?
If this field is AnyRef I wouldn't expect AnyVals there at all (like Long) - Scala's numbers are not the same as Java's numbers. At best you can have some instance of java.lang.Number there (e.g. java.lang.Long, which is NOT scala.Long).
To turn it into a Long you would have to use pattern matching (with type matching or regexp pattern matching) and conversion (NOT casting!):
val isStringID = raw"(0[0-9]+)".r
val isLongID   = raw"([0-9]+)".r

account_number match {
  case isStringID(id)    => id        // numeric string starting with 0
  case isLongID(id)      => id.toLong // numeric string convertible to Long
  case l: java.lang.Long => l.toLong  // Java's long
  case _ => throw new IllegalArgumentException("Expected long or numeric string")
}
However, I would find that completely useless - right now you have Any instead of AnyRef. You could expect it to hold a Long or a String, but that is not represented by the returned value, so the compiler would NOT have any information about the safe usages. Personally, I would recommend doing something immediately after matching, e.g. wrapping it in an Either, creating an ADT, or passing it to a function which needs a String or a Long.
// can be exhaustively pattern matched, or .folded, or passed on, etc.
val stringOrLong: Either[String, Long] = account_number match {
  case isStringID(id)    => Left(id)
  case isLongID(id)      => Right(id.toLong)
  case l: java.lang.Long => Right(l.toLong)
  case _ => throw new IllegalArgumentException("Expected long or numeric string")
}
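Downstream code can then consume the result without falling back to Any, e.g. with the standard library's Either.fold:

// Collapse both cases to a display string while keeping the types explicit.
val display: String = stringOrLong.fold(identity, _.toString)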
You cannot use .asInstanceOf to turn an AnyRef into a Long because neither is a subtype nor a supertype of the other, so this operation would always fail.
        Any
       /   \
  AnyVal   AnyRef
     |       |
    Long     |
      \     /
      Nothing
.asInstanceOf would only make sense if you were moving vertically in this hierarchy, not horizontally.
Another option you have is:
import scala.util.{Failure, Success, Try}

def tryConvert(s: String): Either[Long, String] =
  Try(s.toLong).filter(_.toString == s) match {
    case Success(value) => Left(value) // round-trips cleanly, treat as Long
    case Failure(_)     => Right(s)    // leading zeros or non-numeric, keep as String
  }
Code run at Scastie.
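Worked through the three cases from the question, the round-trip filter gives:

tryConvert("99898")   // Left(99898L):    "99898".toLong.toString == "99898"
tryConvert("0099898") // Right("0099898"): the leading zeros are lost on round-trip
tryConvert("AB998")   // Right("AB998"):   not numeric at all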

Can I make json4s's extract method case insensitive?

I am using case classes to extract JSON with json4s's extract method. Unfortunately, the Natural Earth source data I am using isn't consistent about casing... at some resolutions a field is called iso_a2 and at some it's ISO_A2. I can only make json4s accept the one that matches the field in the case class:

object TopoJSON {
  case class Properties(ISO_A2: String)
  ...
}
// only accepts the capitalised version

Is there any way to make json4s ignore case and accept both?
There is no way to make it case-insensitive using the configuration properties, but a similar result can be achieved by lowercasing (or uppercasing) the field names in the parsed JSON.
For example, given this input:

import org.json4s._
import org.json4s.native.JsonMethods.parse // or org.json4s.jackson.JsonMethods.parse

case class Properties(iso_a2: String)

implicit val formats: Formats = DefaultFormats

val parsedLower = parse("""{ "iso_a2": "test1" }""")
val parsedUpper = parse("""{ "ISO_A2": "test2" }""")
We can lowercase all field names using a short function:

private def lowercaseAllFieldNames(json: JValue) = json transformField {
  case (field, value) => (field.toLowerCase, value)
}

or do it for specific fields only:

private def lowercaseFieldByName(fieldName: String, json: JValue) = json transformField {
  case (field, value) if field == fieldName => (fieldName.toLowerCase, value)
}
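A middle-ground variant - a sketch, not a json4s API, with normalizeFieldName a hypothetical helper - is to match the target field case-insensitively, so both iso_a2 and ISO_A2 are normalised in one pass:

private def normalizeFieldName(fieldName: String, json: JValue) = json transformField {
  case (field, value) if field.equalsIgnoreCase(fieldName) => (fieldName.toLowerCase, value)
}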
Now, to extract the case class instances:
val resultFromLower   = lowercaseAllFieldNames(parsedLower).extract[Properties]
val resultFromUpper   = lowercaseAllFieldNames(parsedUpper).extract[Properties]
val resultByFieldName = lowercaseFieldByName("ISO_A2", parsedUpper).extract[Properties]

// all produce the expected instances:
// Properties(test1)
// Properties(test2)
// Properties(test2)

Null check for Double/Int Value in Spark

I am new to Spark.
How can I check for a null value in a Double or Int value in Scala or Spark?
For a String we can do this:

val value = (FirstString.isEmpty()) match {
  case true => SecondString
  case _    => FirstString
}

I searched for this a lot but I only found examples for String values. Can you please suggest something for the other datatypes as well?
Thanks in advance.
null is only applicable to AnyRef types (i.e. non-primitive types) in Scala. AnyVal types cannot be set to null.
For example:
// the below are AnyVal(s) and won't compile
val c: Char = null
val i: Int = null
val d: Double = null
String is an AnyRef and so can be null:
// this is ok!
val c: String = null
That's why pattern matching null against Int/Double types is not possible:

// won't compile!
null match {
  case a: Int => "is a null Int"
  case _      => "something else"
}
Maybe you can simply use Option instead. For example:

// For a boxed value (e.g. coming from a Java API) the null check is meaningful:
val d: java.lang.Double = ...
val isNull = Option(d).isEmpty // None means the value was null

Or you can use pattern matching:

Option(d) match {
  case Some(v) => // use v
  case None    => // you got null
}
isEmpty is not at all the same as a null check. Calling isEmpty on null will fail:

val s: String = null
s.isEmpty // throws NullPointerException
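If null should be treated like an empty string, one option - a sketch reusing the question's variables - is to go through Option first:

// Falls back to SecondString when FirstString is null or empty.
val value = Option(FirstString).filter(_.nonEmpty).getOrElse(SecondString)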
Int and Double can't be null (and neither can any other primitive type), so there is no need to check whether they are. If you are talking specifically about Spark Rows, you need to check for null before getting an Int/Double/other primitive value; quoting the Row scaladoc:
It is invalid to use the native primitive interface to retrieve a value that is null, instead a user must check isNullAt before attempting to retrieve a value that might be null.
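For example, a minimal sketch of that Row-level check, where df and the column index 0 are placeholders:

import org.apache.spark.sql.Row

// Fall back to a default instead of calling the primitive getter on a NULL cell.
def doubleOrDefault(row: Row, index: Int, default: Double): Double =
  if (row.isNullAt(index)) default else row.getDouble(index)

val firstColumn = df.rdd.map(row => doubleOrDefault(row, 0, 0.0))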

scala extractor pattern for complex validation but with nice error output

I am struggling with using the extractor pattern in a certain use case where it seems that it could be very powerful.
I start with an input of Map[String, String] coming from a web request. This is either a searchRequest or a countRequest to our API.
searchRequest has keys
query(required)
fromDate(optional-defaulted)
toDate(optional-defaulted)
nextToken(optional)
maxResults(optional-defaulted)
countRequest has keys
query(required)
fromDate(optional-defaulted)
toDate(optional-defaulted)
bucket(optional-defaulted)
Then, I want to convert both of these to a composition-style type structure like so:

protected case class CommonQueryRequest(
  originalQuery: String,
  fromDate: DateTime,
  toDate: DateTime
)

case class SearchQueryRequest(
  commonRequest: CommonQueryRequest,
  maxResults: Int,
  nextToken: Option[Long])

case class CountRequest(commonRequest: CommonQueryRequest, bucket: String)
As you can see, I am converting Strings to DateTime, Int, Long, etc. My issue is that I really need separate errors for an invalid fromDate format vs. an invalid toDate format vs. an invalid maxResults vs. an invalid nextToken, IF available.
At the same time, I need to plug in defaults (which vary depending on whether it is a search or a count request).
Naturally, since the Map is passed in, you can tell search vs. count, so in my first go at this I added a key "type" with a value of search or count so that I could match at least on that.
Am I even going down the correct path? I thought using matching could be cleaner than our existing implementation, but the further I go down this path, the uglier it seems to get.
thanks,
Dean
I would suggest you take a look at scalaz.Validation and ValidationNel. It's a super nice way to collect validation errors and a perfect fit for input request validation.
You can learn more about Validation here: http://eed3si9n.com/learning-scalaz/Validation.html. Note that in my example I use scalaz 7.1, which can differ a bit from what is described in that article; the main idea remains the same.
Here's a small example for your use case:
import java.util.NoSuchElementException

import org.joda.time.DateTime
import org.joda.time.format.DateTimeFormat

import scala.util.Try
import scalaz.ValidationNel
import scalaz.syntax.applicative._
import scalaz.syntax.validation._

type Input = Map[String, String]
type Error = String

case class CommonQueryRequest(originalQuery: String,
                              fromDate: DateTime,
                              toDate: DateTime)

case class SearchQueryRequest(commonRequest: CommonQueryRequest,
                              maxResults: Int,
                              nextToken: Option[Long])

case class CountRequest(commonRequest: CommonQueryRequest, bucket: String)

def stringField(field: String)(input: Input): ValidationNel[Error, String] =
  input.get(field) match {
    case None        => s"Field $field is not defined".failureNel
    case Some(value) => value.successNel
  }

val dateTimeFormat = DateTimeFormat.fullTime()

// A missing date field falls back to now; an unparseable one becomes an error.
def dateTimeField(field: String)(input: Input): ValidationNel[Error, DateTime] =
  Try(dateTimeFormat.parseDateTime(input(field))) recover {
    case _: NoSuchElementException => DateTime.now()
  } match {
    case scala.util.Success(dt)  => dt.successNel
    case scala.util.Failure(err) => err.toString.failureNel
  }

def intField(field: String)(input: Input): ValidationNel[Error, Int] =
  Try(input(field).toInt) match {
    case scala.util.Success(i)   => i.successNel
    case scala.util.Failure(err) => err.toString.failureNel
  }

def countRequest(input: Input): ValidationNel[Error, CountRequest] =
  (
    stringField  ("query")   (input) |@|
    dateTimeField("fromDate")(input) |@|
    dateTimeField("toDate")  (input) |@|
    stringField  ("bucket")  (input)
  ) { (query, from, to, bucket) =>
    CountRequest(CommonQueryRequest(query, from, to), bucket)
  }

val validCountReq = Map("query" -> "a", "bucket" -> "c")
val badCountReq   = Map("fromDate" -> "invalid format", "bucket" -> "c")

println(countRequest(validCountReq))
println(countRequest(badCountReq))
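For the search side, the same combinators compose the same way. A sketch, under the assumption that a missing optional nextToken should become None rather than an error (optionalLongField is a hypothetical helper, not part of the example above):

def optionalLongField(field: String)(input: Input): ValidationNel[Error, Option[Long]] =
  input.get(field) match {
    case None        => Option.empty[Long].successNel
    case Some(value) => Try(value.toLong) match {
      case scala.util.Success(l)   => Option(l).successNel
      case scala.util.Failure(err) => err.toString.failureNel
    }
  }

def searchRequest(input: Input): ValidationNel[Error, SearchQueryRequest] =
  (
    stringField      ("query")     (input) |@|
    dateTimeField    ("fromDate")  (input) |@|
    dateTimeField    ("toDate")    (input) |@|
    intField         ("maxResults")(input) |@|
    optionalLongField("nextToken") (input)
  ) { (query, from, to, max, token) =>
    SearchQueryRequest(CommonQueryRequest(query, from, to), max, token)
  }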
scalactic looks pretty cool as well and I may go that route (though I'm not sure whether we can use that lib, but I think I will just proceed until someone says no).

Access database column names from a Table?

Let's say I have a table:
object Suppliers extends Table[(Int, String, String, String)]("SUPPLIERS") {
  def id    = column[Int]("SUP_ID", O.PrimaryKey)
  def name  = column[String]("SUP_NAME")
  def state = column[String]("STATE")
  def zip   = column[String]("ZIP")
  def *     = id ~ name ~ state ~ zip
}
Table's database name
The table's database name can be accessed by going: Suppliers.tableName
This is supported by the Scaladoc on AbstractTable.
For example, the above table's database name is "SUPPLIERS".
Columns' database names
Looking through AbstractTable, getLinearizedNodes and indexes looked promising, but there are no column names in their string representations.
I assume that * means "all the columns I'm usually interested in." * is a MappedProjection, which has this signature:

final case class MappedProjection[T, P <: Product](
    child: Node,
    f: (P) ⇒ T,
    g: (T) ⇒ Option[P])(proj: Projection[P])
  extends ColumnBase[T] with UnaryNode with Product with Serializable

*.getLinearizedNodes contains a huge sequence of numbers, and I realized that at this point I'm just doing a brute-force inspection of everything in the API in the hope of finding the column names in a String.
Has anybody also encountered this problem before, or could anybody give me a better understanding of how MappedProjection works?
It requires you to rely on Slick internals, which may change between versions, but it is possible. Here is how it works for Slick 1.0.1: you have to go via the FieldSymbol. From there you can extract the information you want, like columnInfo(driver: JdbcDriver, column: FieldSymbol): ColumnInfo does.
To get a FieldSymbol from a Column you can use fieldSym(node: Node): Option[FieldSymbol] and fieldSym(column: Column[_]): FieldSymbol.
To get the (qualified) column names you can simply do the following:
Suppliers.id.toString
Suppliers.name.toString
Suppliers.state.toString
Suppliers.zip.toString
It's not explicitly stated anywhere that the toString will yield the column name, so your question is a valid one.
Now, if you want to programmatically get all the column names, that's a bit harder. You could try using reflection to get all the methods that return a Column[_] and call toString on them, but it wouldn't be elegant. Or you could hack a bit and get a select * SQL statement from a query like this:
val selectStatement = DB withSession {
  Query(Suppliers).selectStatement
}
And then parse out the column names.
This is the best I could do. If someone knows a better way then please share - I'm interested too ;)
The code is based on the Lightbend Activator template "slick-http-app".
Slick version: 3.1.1
I added this method to the BaseDal:
def getColumns(): mutable.Map[String, Type] = {
  val columns = mutable.Map.empty[String, Type]

  def selectType(t: Any): Option[Any] = t match {
    case t: TableExpansion => Some(t.columns)
    case t: Select         => Some(t.field)
    case _                 => None
  }

  def selectArray(t: Any): Option[ConstArray[Node]] = t match {
    case t: TypeMapping => Some(t.child.children)
    case _              => None
  }

  def selectFieldSymbol(t: Any): Option[FieldSymbol] = t match {
    case t: FieldSymbol => Some(t)
    case _              => None
  }

  val t = selectType(tableQ.toNode) // TableExpansion -> its columns node
  val c = selectArray(t.get)        // TypeMapping -> child nodes, one per column

  for (se <- c.get) {
    val col = selectType(se)             // Select -> field
    val fs  = selectFieldSymbol(col.get) // the FieldSymbol carries name + type
    columns += (fs.get.name -> fs.get.tpe)
  }
  columns
}
This method gets the column names (the real names in the DB) and their types from the tableQ.
The imports used are:
import slick.ast._
import slick.util.ConstArray
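A hypothetical usage sketch, assuming a DAL instance supplierDal built on this BaseDal:

// Print every DB column name together with its Slick Type.
supplierDal.getColumns().foreach { case (name, tpe) =>
  println(s"$name: $tpe")
}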