Casting a variable to a method in Scala "Runtime Evaluation" - scala

I want to evaluate a function passed as a variable string in scala (sorry but i'm new to scala )
def concate(a:String,b:String): String ={
a+" "+b
}
var func="concate" //i'll get this function name from config as string
I want to perform something like
eval(func("hello","world)) //Like in Python
so output will be like
hello world
Eventually I want to execute few in built functions on a string coming from my config and I don't want to hard code the function names in the code.
EDIT
To Be More clear with my exact usecase
I have a Config file which has multiple functions defined in it that are Spark inbuilt functions on Data frame
application.conf looks like
transformations = [
{
"table" : "users",
"function" : "from_unixtime",
"column" : "epoch"
},
{
"table" : "users",
"function" : "yearofweek",
"column" : "epoch"
}
]
Now functions yearofweek and from_unixtime are Spark inbuilt functions now I want to eval my Dataframe by the functions defined in config. #all the functions are applied to a column defined.
the Obvious way is to write an if else and do string comparison calling a particular inbuilt function but that is way to much..
i am looking for a better solution.

This is indeed possible in scala, as scala is JSR 223 compliant scripting language. Here is an example (running with scala 2.11.8). Note that you need to import your method because otherwise the interpreter will not find it:
package my.example
object EvalDemo {
// evalutates scala code and returns the result as T
def evalAs[T](code: String) = {
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox
val toolbox = currentMirror.mkToolBox()
import toolbox.{eval, parse}
eval(parse(code)).asInstanceOf[T]
}
def concate(a: String, b: String): String = a + " " + b
def main(args: Array[String]): Unit = {
var func = "concate" //i'll get this function name from config as string
val code =
s"""
|import my.example.EvalDemo._
|${func}("hello","world")
|""".stripMargin
val result: String = evalAs[String](code)
println(result) // "hello world"
}
}

Have Function to name mapping in the code
def foo(str: String) = str + ", foo"
def bar(str: String) = str + ", bar"
val fmap = Map("foo" -> foo _, "bar" -> bar _)
fmap("foo")("hello")
now based on the function name we get from the config, pass the name to the map and lookup the corresponding function and invoke the arguments on it.
Scala repl
scala> :paste
// Entering paste mode (ctrl-D to finish)
def foo(str: String) = str + ", foo"
def bar(str: String) = str + ", bar"
val fmap = Map("foo" -> foo _, "bar" -> bar _)
fmap("foo")("hello")
// Exiting paste mode, now interpreting.
foo: (str: String)String
bar: (str: String)String
fmap: scala.collection.immutable.Map[String,String => String] = Map(foo -> $$Lambda$1104/1335082762#778a1250, bar -> $$Lambda$1105/841090268#55acec99)
res0: String = hello, foo

Spark offers you a way to write your transformations or queries using SQL. So, you really don't have to worry about Scala functions, casting and evaluation in this case. You just have to parse your config to generate the SQL query.
Let's say you have registered a table users with Spark and want to do a select and transform based on provided config,
// your generated query will look like this,
val query = "SELECT from_unixtime(epoch) as time, weekofyear(epoch) FROM users"
val result = spark.sql(query)
So, all you need to do is - build that query from your config.

Related

Anorm - generic insert

Is there a way to use Anorm like a regular ORM? I'd like to have a method that just inserts an element provided.
def insert[T](element: T)(implicit connection: Connection) = {
element.insert(connection)
}
I can definitely implement it by myself, but feels like I'm re-implementing an ORM... Old anorm version had this Magic[T] but I can't see it now
The documentation clearly states that Anorm is not an ORM (and will never be).
As indicated, to insert or update a T value, an instance of the ToStatement typeclass must be provided.
Some macros are provided to automatically materialize such instance.
import anorm.{ Macro, SQL, ToParameterList }
import anorm.NamedParameter
case class Bar(v: Int)
val bar1 = Bar(1)
// Convert all supported properties as parameters
val toParams1: ToParameterList[Bar] = Macro.toParameters[Bar]
val params1: List[NamedParameter] = toParams1(bar1)
// --> List(NamedParameter(v,ParameterValue(1)))
val names1: List[String] = params1.map(_.name)
// --> List(v)
val placeholders = names1.map { n => s"{$n}" } mkString ", "
// --> "{v}"
val generatedStmt = s"""INSERT INTO bar(${names1 mkString ", "}) VALUES ($placeholders)"""
val generatedSql1 = SQL(generatedStmt).on(params1: _*)

scala underscore parameter not acting as named parameter, in spark map reduce

I see some difference when using underscore parameter or named parameter in spark map function.
look at this code (executed in spark-shell):
var ds = Seq(1,2,3).toDS()
ds.map(t => Array("something", "" + t)).collect // works cool
ds.map(Array("funk", "" + _)).collect // doesn't work
the exception I get for the not working line is:
error: Unable to find encoder for type stored in a
Dataset. Primitive types (Int, String, etc) and Product types (case
classes) are supported by importing spark.implicits._ Support for
serializing other types will be added in future releases.
That's because the expansion of:
ds.map(Array("funk", "" + _)).collect
Doesn't work as you think. It expands to:
ds.map(Array("funk", ((x: Any) => "" + x))).collect
The _ in your array creation expands to a function. According to the documentation of DataSets, functions are not supported.
If we take a minimal reproduce:
val l = List(1,2,3)
val res = l.map(Array("42", "" + _))
And see the typer expansion (scalac -Xprint:typer), you can see:
def main(args: Array[String]): Unit = {
val l: List[Int] = scala.collection.immutable.List.apply[Int](1, 2, 3);
val res: List[Object] =
l.map[Object, List[Object]]
(scala.Predef.wrapRefArray[Object]
(scala.Array.apply[Object]("42", ((x$1: Any) => "".+(x$1))
If we isolate the specific relevant part, we can see that:
(x$1: Any) => "".+(x$1)
Is the expansion that happens inside the array creation.

Scala equivalent of Haskell's insertWith for Maps

I'm looking to do the simple task of counting words in a String. The easiest way I've found is to use a Map to keep track of word frequencies. Previously with Haskell, I used its Map's function insertWith, which takes a function that resolves key collisions, along with the key and value pair. I can't find anything similar in Scala's library though; only an add function (+), which presumably overwrites the previous value when re-inserting a key. For my purposes though, instead of overwriting the previous value, I want to add 1 to it to increase its count.
Obviously I could write a function to check if a key already exists, fetch its value, add 1 to it, and re-insert it, but it seems odd that a function like this isn't included. Am I missing something? What would be the Scala way of doing this?
Use a map with default value and then update with +=
import scala.collection.mutable
val count = mutable.Map[String, Int]().withDefaultValue(0)
count("abc") += 1
println(count("abc"))
If it's a string then why not use the split module
import Data.List.Split
let mywords = "he is a good good boy"
length $ nub $ splitOn " " mywords
5
If you want to stick with Scala's immutable style, you could create your own class with immutable semantics:
class CountMap protected(val counts: Map[String, Int]){
def +(str: String) = new CountMap(counts + (str -> (counts(str) + 1)))
def apply(str: String) = counts(str)
}
object CountMap {
def apply(counts: Map[String, Int] = Map[String, Int]()) = new CountMap(counts.withDefaultValue(0))
}
And then you can use it:
val added = CountMap() + "hello" + "hello" + "world" + "foo" + "bar"
added("hello")
>>2
added("qux")
>>0
You might also add apply overloads on the companion object so that you can directly input a sequence of words, or even a sentence:
object CountMap {
def apply(counts: Map[String, Int] = Map[String, Int]()): CountMap = new CountMap(counts.withDefaultValue(0))
def apply(words: Seq[String]): CountMap = CountMap(words.groupBy(w => w).map { case(word, group) => word -> group.length })
def apply(sentence: String): CountMap = CountMap(sentence.split(" "))
}
And then the you can even more easily:
CountMap(Seq("hello", "hello", "world", "world", "foo", "bar"))
Or:
CountMap("hello hello world world foo bar")

What `JObject(rec) <- someJArray` means inside for-comprehension

I'm learning Json4s library.
I have a json fragment like this:
{
"records":[
{
"name":"John Derp",
"address":"Jem Street 21"
},
{
"name":"Scala Jo",
"address":"in my sweet dream"
}
]
}
And, I have Scala code, which converts a json string into a List of Maps, like this:
import org.json4s._
import org.json4s.JsonAST._
import org.json4s.native.JsonParser
val json = JsonParser.parse( """{"records":[{"name":"John Derp","address":"Jem Street 21"},{"name":"Scala Jo","address":"in my sweet dream"}]}""")
val records: List[Map[String, Any]] = for {
JObject(rec) <- json \ "records"
JField("name", JString(name)) <- rec
JField("address", JString(address)) <- rec
} yield Map("name" -> name, "address" -> address)
println(records)
The output of records to screen gives this:
List(Map(name -> John Derp, address -> Jem Street 21), Map(name ->
Scala Jo, address -> in my sweet dream))
I want to understand what the lines inside the for loop mean. For example, what is the meaning of this line:
JObject(rec) <- json \ "records"
I understand that the json \ "records" produces a JArray object, but why is it fetched as JObject(rec) at left of <-? What is the meaning of the JObject(rec) syntax? Where does the rec variable come from? Does JObject(rec) mean instantiating a new JObject class from rec input?
BTW, I have a Java programming background, so it would also be helpful if you can show me the Java equivalent code for the loop above.
You have the following types hierarchy:
sealed abstract class JValue {
def \(nameToFind: String): JValue = ???
def filter(p: (JValue) => Boolean): List[JValue] = ???
}
case class JObject(val obj: List[JField]) extends JValue
case class JField(val name: String, val value: JValue) extends JValue
case class JString(val s: String) extends JValue
case class JArray(val arr: List[JValue]) extends JValue {
override def filter(p: (JValue) => Boolean): List[JValue] =
arr.filter(p)
}
Your JSON parser returns following object:
object JsonParser {
def parse(s: String): JValue = {
new JValue {
override def \(nameToFind: String): JValue =
JArray(List(
JObject(List(
JField("name", JString("John Derp")),
JField("address", JString("Jem Street 21")))),
JObject(List(
JField("name", JString("Scala Jo")),
JField("address", JString("in my sweet dream"))))))
}
}
}
val json = JsonParser.parse("Your JSON")
Under the hood Scala compiler generates the following:
val res = (json \ "records")
.filter(_.isInstanceOf[JObject])
.flatMap { x =>
x match {
case JObject(obj) => //
obj //
.withFilter(f => f match {
case JField("name", _) => true
case _ => false
}) //
.flatMap(n => obj.withFilter(f => f match {
case JField("address", _) => true
case _ => false
}).map(a => Map(
"name" -> (n.value match { case JString(name) => name }),
"address" -> (a.value match { case JString(address) => address }))))
}
}
First line JObject(rec) <- json \ "records" is possible because JArray.filter returns List[JValue] (i.e. List[JObject]). Here each value of List[JValue] maps to JObject(rec) with pattern matching.
Rest calls are series of flatMap and map (this is how Scala for comprehensions work) with pattern matching.
I used Scala 2.11.4.
Of course, match expressions above are implemented using series of type checks and casts.
UPDATE:
When you use Json4s library there is an implicit conversion from JValue to org.json4s.MonadicJValue. See package object json4s:
implicit def jvalue2monadic(jv: JValue) = new MonadicJValue(jv)
This conversion is used here: JObject(rec) <- json \ "records". First, json is converted to MonadicJValue, then def \("records") is applied, then def filter is used on the result of def \ which is JValue, then it is again implicitly converted to MonadicJValue, then def filter of MonadicJValue is used. The result of MonadicJValue.filter is List[JValue]. After that steps described above are performed.
You are using a Scala for comprehension and I believe much of the confusion is about how for comprehensions work. This is Scala syntax for accessing the map, flatMap and filter methods of a monad in a concise way for iterating over collections. You will need some understanding of monads and for comprehensions in order to fully comprehend this. The Scala documentation can help, and so will a search for "scala for comprehension". You will also need to understand about extractors in Scala.
You asked about the meaning of this line:
JObject(rec) <- json \ "records"
This is part of the for comprehension.
Your statement:
I understand that the json \ "records" produces a JArray object,
is slightly incorrect. The \ function extracts a List[JSObject] from the parser result, json
but why is it fetched as JObject(rec) at left of <-?
The json \ "records" uses the json4s extractor \ to select the "records" member of the Json data and yield a List[JObject]. The <- can be read as "is taken from" and implies that you are iterating over the list. The elements of the list have type JObject and the construct JObject(rec) applies an extractor to create a value, rec, that holds the content of the JObject (its fields).
how come it's fetched as JObject(rec) at left of <-?
That is the Scala syntax for iterating over a collection. For example, we could also write:
for (x <- 1 to 10)
which would simply give us the values of 1 through 10 in x. In your example, we're using a similar kind of iteration but over the content of a list of JObjects.
What is the meaning of the JObject(rec)?
This is a Scala extractor. If you look in the json4s code you will find that JObject is defined like this:
case class JObject(obj: List[JField]) extends JValue
When we have a case class in Scala there are two methods defined automatically: apply and unapply. The meaning of JObject(rec) then is to invoke the unapply method and produce a value, rec, that corresponds to the value obj in the JObject constructor (apply method). So, rec will have the type List[JField].
Where does the rec variable come from?
It comes from simply using it and is declared as a placeholder for the obj parameter to JObject's apply method.
Does JObject(rec) mean instantiating new JObject class from rec input?
No, it doesn't. It comes about because the JArray resulting from json \ "records" contains only JObject values.
So, to interpret this:
JObject(rec) <- json \ "records"
we could write the following pseudo-code in english:
Find the "records" in the parsed json as a JArray and iterate over them. The elements of the JArray should be of type JObject. Pull the "obj" field of each JObject as a list of JField and assign it to a value named "rec".
Hopefully that makes all this a bit clearer?
it's also helpful if you can show me the Java equivalent code for the loop above.
That could be done, of course, but it is far more work than I'm willing to contribute here. One thing you could do is compile the code with Scala, find the associated .class files, and decompile them as Java. That might be quite instructive for you to learn how much Scala simplifies programming over Java. :)
why I can't do this? for ( rec <- json \ "records", so rec become JObject. What is the reason of JObject(rec) at the left of <- ?
You could! However, you'd then need to get the contents of the JObject. You could write the for comprehension this way:
val records: List[Map[String, Any]] = for {
obj: JObject <- json \ "records"
rec = obj.obj
JField("name", JString(name)) <- rec
JField("address", JString(address)) <- rec
} yield Map("name" -> name, "address" -> address)
It would have the same meaning, but it is longer.
I just want to understand what does the N(x) pattern mean, because I only ever see for (x <- y pattern before.
As explained above, this is an extractor which is simply the use of the unapply method which is automatically created for case classes. A similar thing is done in a case statement in Scala.
UPDATE:
The code you provided does not compile for me against 3.2.11 version of json4s-native. This import:
import org.json4s.JsonAST._
is redundant with this import:
import org.json4s._
such that JObject is defined twice. If I remove the JsonAST import then it compiles just fine.
To test this out a little further, I put your code in a scala file like this:
package example
import org.json4s._
// import org.json4s.JsonAST._
import org.json4s.native.JsonParser
class ForComprehension {
val json = JsonParser.parse(
"""{
|"records":[
|{"name":"John Derp","address":"Jem Street 21"},
|{"name":"Scala Jo","address":"in my sweet dream"}
|]}""".stripMargin
)
val records: List[Map[String, Any]] = for {
JObject(rec) <- json \ "records"
JField("name", JString(name)) <- rec
JField("address", JString(address)) <- rec
} yield Map("name" -> name, "address" -> address)
println(records)
}
and then started a Scala REPL session to investigate:
scala> import example.ForComprehension
import example.ForComprehension
scala> val x = new ForComprehension
List(Map(name -> John Derp, address -> Jem Street 21), Map(name -> Scala Jo, address -> in my sweet dream))
x: example.ForComprehension = example.ForComprehension#5f9cbb71
scala> val obj = x.json \ "records"
obj: org.json4s.JValue = JArray(List(JObject(List((name,JString(John Derp)), (address,JString(Jem Street 21)))), JObject(List((name,JString(Scala Jo)), (address,JString(in my sweet dream))))))
scala> for (a <- obj) yield { a }
res1: org.json4s.JValue = JArray(List(JObject(List((name,JString(John Derp)), (address,JString(Jem Street 21)))), JObject(List((name,JString(Scala Jo)), (address,JString(in my sweet dream))))))
scala> import org.json4s.JsonAST.JObject
for ( JObject(rec) <- obj ) yield { rec }
import org.json4s.JsonAST.JObject
scala> res2: List[List[org.json4s.JsonAST.JField]] = List(List((name,JString(John Derp)), (address,JString(Jem Street 21))), List((name,JString(Scala Jo)), (address,JString(in my sweet dream))))
So:
You are correct, the result of the \ operator is a JArray
The "iteration" over the JArray just treats the entire array as the only value in the list
There must be an implicit conversion from JArray to JObject that permits the extractor to yield the contents of JArray as a List[JField].
Once everything is a List, the for comprehension proceeds as normal.
Hope that helps with your understanding of this.
For more on pattern matching within assignments, try this blog
UPDATE #2:
I dug around a little more to discover the implicit conversion at play here. The culprit is the \ operator. To understand how json \ "records" turns into a monadic iterable thing, you have to look at this code:
org.json4s package object: This line declares an implicit conversion from JValue to MonadicJValue. So what's a MonadicJValue?
org.json4s.MonadicJValue: This defines all the things that make JValues iterable in a for comprehension: filter, map, flatMap and also provides the \ and \\ XPath-like operators
So, essentially, the use of the \ operator results in the following sequence of actions:
- implicitly convert the json (JValue) into MonadicJValue
- Apply the \ operator in MonadicJValue to yield a JArray (the "records")
- implicitly convert the JArray into MonadicJValue
- Use the MonadicJValue.filter and MonadicJValue.map methods to implement the for comprehension
Just simplified example, how for-comprehesion works here:
scala> trait A
defined trait A
scala> case class A2(value: Int) extends A
defined class A2
scala> case class A3(value: Int) extends A
defined class A3
scala> val a = List(1,2,3)
a: List[Int] = List(1, 2, 3)
scala> val a: List[A] = List(A2(1),A3(2),A2(3))
a: List[A] = List(A2(1), A3(2), A2(3))
So here is just:
scala> for(A2(rec) <- a) yield rec //will return and unapply only A2 instances
res34: List[Int] = List(1, 3)
Which is equivalent to:
scala> a.collect{case A2(rec) => rec}
res35: List[Int] = List(1, 3)
Collect is based on filter - so it's enough to have filter method as JValue has.
P.S. There is no foreach in JValue - so this won't work for(rec <- json \ "records") rec. But there is map, so that will: for(rec <- json \ "records") yield rec
If you need your for without pattern matching:
for {
rec <- (json \ "records").filter(_.isInstanceOf[JObject]).map(_.asInstanceOf[JObject])
rcobj = rec.obj
name <- rcobj if name._1 == "name"
address <- rcobj if address._1 == "address"
nm = name._2.asInstanceOf[JString].s
vl = address._2.asInstanceOf[JString].s
} yield Map("name" -> nm, "address" -> vl)
res27: List[scala.collection.immutable.Map[String,String]] = List(Map(name -> John Derp, address -> Jem Street 21), Map(name -> Scala Jo, address -> in my sweet dream))

Deep access of fields in Scala using runtime reflection

I have code that deeply walks a case class' constructor fields, which of course may themselves be complex (list of things, maps, options, and other case classes). The code I found to extract field values at runtime works great on the highest-level fields but explodes when I try to access deeper fields. Example below.
I real life my application introspects the fields at each level, so I know that 'stuff' is another case class (I have the Symbol/Type), and I know Dos' field Symbols/Types. But this is obtained at runtime so I think it's blowing up because it doesn't know [T]/Manifest[T]. Is there a way to get this at runtime via reflection? How might my code change? The examples I found seemed to all require various things[T], which I wouldn't have for 'dos', right?
case class Uno( name:String, age:Int, pets:List[String], stuff:Dos )
case class Dos( foo:String )
object Boom extends App {
val ru = scala.reflect.runtime.universe
val m = ru.runtimeMirror(getClass.getClassLoader)
val u = Uno("Marcus",19,List("fish","bird"),Dos("wow"))
println("NAME: "+unpack(u,"name")) // Works
println("PETS: "+unpack(u,"pets")) // Works
// ----- Goes Boom -------
val dos = unpack(u,"stuff")
println("Other: "+unpack(dos,"foo")) // Boom!
// -----------------------
// Get object value for named parameter of target
def unpack[T]( target:T, name:String )(implicit man:Manifest[T]) : Any = {
val im = m.reflect(target)
val fieldX = ru.typeOf[T].declaration(ru.newTermName(name)).asTerm.accessed.asTerm
im.reflectField(fieldX).get
}
}
You're exactly right, the type of your dos is Any.
FieldMirror.symbol.typeSignature is what you'd get from typeOf[Dos].
So consider returning a pair (Any, Type) from unpack to have something to pass to unpack(target, type, name). Somewhat like:
case class Uno(name: String, age: Int, pets: List[String], stuff: Dos)
case class Dos(foo: String)
object Boom extends App {
import scala.reflect.runtime.universe._
import scala.reflect.runtime.{ currentMirror => cm }
import scala.reflect.ClassTag
val u = Uno("Marcus", 19, List("fish", "bird"), Dos("wow"))
println("NAME: " + unpack(u, "name")) // Works
println("PETS: " + unpack(u, "pets")) // Works
// ----- Goes Boom -------
val (dos, dosT) = unpack(u, "stuff")
println("Other: " + unpack(dos, dosT, "foo")) // Boom! ...or fizzle
// -----------------------
def unpack[T: TypeTag](target: T, name: String): (Any, Type) = unpack(target, typeOf[T], name)
// Get object value for named parameter of target
def unpack[T](target: T, t: Type, name: String): (Any, Type) = {
val im = cm.reflect(target)(ClassTag(target.getClass))
val fieldX = t.declaration(newTermName(name)).asTerm.accessed.asTerm
val fm = im.reflectField(fieldX)
(fm.get, fm.symbol.typeSignature)
}
}