Using Scala reflection to compile and load object - scala

Given the following String:
"println(\"Hello\")"
It is possible to use reflection to evaluate the code, as follows.
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox
object Eval {
  def apply[A](string: String): A = {
    val toolbox = currentMirror.mkToolBox()
    val tree = toolbox.parse(string)
    toolbox.eval(tree).asInstanceOf[A]
  }
}
However, let's say that the string contains an object with a function definition, such as:
"""object MyObj { def getX="X"}"""
Is there a way to use Scala reflection to compile the string, load it, and run the function? What I have tried so far has not worked. If anyone has some example code, it would be much appreciated.

It depends on how strictly you define the acceptable input string. Should the object always be called MyObj? Should the method always be called getX? Should it always be 1 method or can it be multiple?
For the more general case you could try to extract all method names from the AST and generate a call to each one. The following code calls every method of the object that takes zero arguments and is not a constructor, and returns the result of the last call. It does not take inheritance into account:
import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox
def eval(string: String): Any = {
  val toolbox = currentMirror.mkToolBox()
  val tree = toolbox.parse(string)
  // oName is the name of the object
  // defs is the list of all method definitions
  val ModuleDef(_, oName, Template(_, _, defs)) = tree
  // only generate calls for non-constructor, zero-arg methods
  val defCalls = defs.collect {
    case DefDef(_, name, _, params, _, _)
      if name != termNames.CONSTRUCTOR && params.flatten.isEmpty => q"$oName.$name"
  }
  // put the method calls after the object definition
  val block = tree :: defCalls
  toolbox.eval(q"..$block")
}
And using it:
scala> eval("""object MyObj { def bar() = println("bar"); def foo(a: String) = println(a); def getX = "x" }""")
bar
res60: Any = x
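For the narrower case where the object and method names are known in advance, a simpler approach may be enough: append the call to the source text and evaluate both together. The following is a minimal sketch, assuming Scala 2 with scala-reflect and scala-compiler on the classpath; `EvalKnownMethod` and `evalCall` are hypothetical names introduced only for illustration.

```scala
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox

object EvalKnownMethod {
  // Appends a call to an object and method whose names are assumed to be known,
  // then parses and evaluates the definition and the call as one block.
  def evalCall(source: String, objectName: String, methodName: String): Any = {
    val toolbox = currentMirror.mkToolBox()
    val tree = toolbox.parse(s"$source\n$objectName.$methodName")
    toolbox.eval(tree)
  }

  def main(args: Array[String]): Unit = {
    val result = evalCall("""object MyObj { def getX = "X" }""", "MyObj", "getX")
    println(result)
  }
}
```

This avoids pattern matching on the AST, at the cost of only working when the caller already knows what to invoke.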

Related

Scala: Making config data as a global object

I have a case class for configuration parameters which is populated (using NO external library) before starting the actual application.
I pass this config object throughout the application, in far too many places.
Now the question is: can this object be made global, so that I can refer to it across the application, given that the values are going to be constant?
case class ConfigParam() extends Serializable {
var JobId: Int = 0
var jobName: String = null
var snapshotDate: Date = null
}
val configParam = ???
val ss = getSparkSession(configParam) //Method call...
Using ConfigParam as a global object could have bad implications for you. First of all, it will make it harder to test any function that uses that global object.
Maybe you could just pass ConfigParam as an implicit argument?
For example, let's say you've got 3 functions:
def funA(name: String)(implicit configParam: ConfigParam): String = ???
def funB(number: Int)(implicit configParam: ConfigParam): String = ???
//you don't have to explicitly pass configParam to funA or funB
def funC(name: String)(implicit configParam: ConfigParam): String = funA(name) + funB(100)
implicit val configParam: ConfigParam = ??? //you need to initialise configParam as an implicit val
funC("somename") //you can now just call funC without explicitly passing configParam
//it will also be passed to all function calls inside funC
//as long as they have an implicit parameter list with ConfigParam
Another solution could be to use some kind of dependency-injection framework, like guice.
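The implicit-parameter approach can be tried end to end with a small self-contained sketch. It assumes an immutable variant of the config class with only two fields; the field values and the bodies of `funA` and `funB` are made up for illustration.

```scala
// Immutable config: constant values need no vars.
final case class ConfigParam(jobId: Int, jobName: String)

object ImplicitConfigDemo {
  def funA(name: String)(implicit config: ConfigParam): String =
    s"$name@${config.jobName}"

  def funB(number: Int)(implicit config: ConfigParam): String =
    s"#${config.jobId + number}"

  // funC forwards its implicit config to funA and funB automatically.
  def funC(name: String)(implicit config: ConfigParam): String =
    funA(name) + funB(100)

  def main(args: Array[String]): Unit = {
    implicit val config: ConfigParam = ConfigParam(jobId = 1, jobName = "nightly")
    println(funC("somename")) // prints somename@nightly#101
  }
}
```

In a test, a different `ConfigParam` can be passed explicitly, for example `funC("x")(ConfigParam(2, "test"))`, which is the testability advantage over a global object.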

How can I make a function generic on an MLReader

I am working in Spark 1.6.3. Here are two functions that do the same thing:
def modelFromBytesCV(modelArray: Array[Byte]): CountVectorizerModel = {
val tempPath: Path = KAZOO_TEMP_DIR.resolve(s"model_${System.currentTimeMillis()}")
Files.write(tempPath, modelArray)
CountVectorizerModel.read.load(tempPath.toString)
}
def modelFromBytesIDF(modelArray: Array[Byte]): IDFModel = {
val tempPath: Path = KAZOO_TEMP_DIR.resolve(s"model_${System.currentTimeMillis()}")
Files.write(tempPath, modelArray)
IDFModel.read.load(tempPath.toString)
}
I would like to make these functions generic. What I am hung up on is that the common trait between the CountVectorizerModel object and IDFModel is MLReadable[T] which itself must take as a type either CountVectorizerModel or IDFModel. This is sort of a recursive parent class loop that I can't figure out a solution to.
By comparison, the generic model writer is easy, because MLWritable is a common trait extended by all the models I am interested in:
def modelToBytes[M <: MLWritable](model: M): Array[Byte] = {
val tempPath: Path = KAZOO_TEMP_DIR.resolve(s"model_${System.currentTimeMillis()}")
model.write.overwrite().save(tempPath.toString)
Files.readAllBytes(tempPath)
}
How can I make a generic reader that will turn a byte array into a spark-ml model?
To make it work you'll need access to a specific MLReadable object.
import org.apache.spark.ml.util.MLReadable
def modelFromBytes[M](obj: MLReadable[M], modelArray: Array[Byte]): M = {
val tempPath: Path = ???
...
obj.read.load(tempPath.toString)
}
which could be later used as:
val bytes: Array[Byte] = ???
modelFromBytes(CountVectorizerModel, bytes)
Note that, despite first appearances, there is nothing recursive here: MLReadable[M] refers to the companion object, not the class as such. So, for example, the CountVectorizerModel object is an MLReadable, while the CountVectorizerModel class isn't.
Internally, Spark MLReader handles this in a different way - it creates an instance of the class using reflection, and then sets its Params. However this path won't be very useful for you here*.
If compatibility with the current API is required, you can try making readable object implicit:
def modelFromBytes[M](modelArray: Array[Byte])(implicit obj: MLReadable[M]): M = {
...
}
and then
implicit val readable: MLReadable[CountVectorizerModel] = CountVectorizerModel
modelFromBytes[CountVectorizerModel](bytes)
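To see why the type parameter is not circular, here is a minimal sketch that does not depend on Spark. `Readable`, `CvModel`, and `IdfModel` are hypothetical stand-ins for `MLReadable` and the two model classes; only the shape of the relationship between class and companion object is meant to carry over.

```scala
// Hypothetical stand-in for Spark's MLReadable: implemented by companion objects.
trait Readable[T] {
  def load(path: String): T
}

// Hypothetical model classes; each companion object knows how to load its own class.
final class CvModel(val path: String)
object CvModel extends Readable[CvModel] {
  def load(path: String): CvModel = new CvModel(path)
}

final class IdfModel(val path: String)
object IdfModel extends Readable[IdfModel] {
  def load(path: String): IdfModel = new IdfModel(path)
}

object GenericReaderDemo {
  // Explicit variant: the companion object is passed as an ordinary argument.
  def fromPath[M](reader: Readable[M], path: String): M = reader.load(path)

  // Implicit variant: the companion object is resolved from implicit scope.
  def fromPathImplicit[M](path: String)(implicit reader: Readable[M]): M =
    reader.load(path)

  def main(args: Array[String]): Unit = {
    val cv: CvModel = fromPath(CvModel, "/tmp/cv")
    implicit val idfReadable: Readable[IdfModel] = IdfModel
    val idf: IdfModel = fromPathImplicit[IdfModel]("/tmp/idf")
    println(cv.path)
    println(idf.path)
  }
}
```

In both variants `M` is inferred from the companion object, so one generic function serves every model type.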
* Technically speaking it is possible to get companion object via reflection
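import scala.reflect.ClassTag
def modelFromBytesCV[M <: MLWritable](
    modelArray: Array[Byte])(implicit ct: ClassTag[M]): M = {
  val tempPath: Path = ???
  ...
  // the companion object's class name is the class name plus "$"
  val cls = Class.forName(ct.runtimeClass.getName + "$")
  // MODULE$ is a static field, so the receiver argument is ignored
  cls.getField("MODULE$").get(null).asInstanceOf[MLReadable[M]]
    .read.load(tempPath.toString)
}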
but I don't think that is a path worth exploring here. In particular, we cannot really provide strict type bounds: using MLWritable is a hack to limit human error, but it is of little use to the compiler.

How to save a Type or TypeTag to a val for later use?

I would like to save a Type or TypeTag in a val for later use. At this time, I am having to specify a type in several locations in a block of code. I do not need to parameterize the code because only one type will be used. This is more of a curiosity than a necessity.
I tried using typeOf, classOf, getClass, and several other forms of accessing the class and type. The solution is likely simple but my knowledge of Scala typing or type references is missing this concept.
object Example extends App {
import scala.reflect.runtime.universe._
object TestClass { val str = "..." }
case class TestClass() { val word = ",,," }
def printType[A: TypeTag](): Unit = println(typeOf[A])
printType[List[Int]]() //prints 'List[Int]'
printType[TestClass]() //prints 'Example.TestClass'
val typeOfCompanion: ??? = ??? //TODO what goes here?
val typeOfCaseClass: ??? = ??? //TODO what goes here?
printType[typeOfCompanion]() //TODO should print something like 'Example.TestClass'
printType[typeOfCaseClass]() //TODO should print something like 'Example.TestClass'
}
The solution should be able to save a Type or TypeTag, or whatever the solution requires, and then pass typeOfCompanion or typeOfCaseClass for printing, as in printType[typeOfCompanion](). Changing the printing portion of the code may be required; I am not certain.
You have to be more explicit here:
import scala.reflect.runtime.universe._
def printType(a: TypeTag[_]): Unit = println(a)
val typeOfCompanion = typeTag[List[Int]]
printType(typeOfCompanion)
def printType[A: TypeTag](): Unit = println(typeOf[A])
is exactly the same as
def printType[A]()(implicit a: TypeTag[A]): Unit = println(typeOf[A])
(except for the parameter name). So it can be called as
val listTypeTag /* : TypeTag[List[Int]] */ = typeTag[List[Int]]
printType()(listTypeTag)
(you can remove the empty parameter list from printType if you want).
For the companion, you need to use a singleton type:
val companionTag = typeTag[TestClass.type]
val caseClassTag = typeTag[TestClass]
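Putting the pieces together, a saved tag can be reused in several places. The sketch below assumes Scala 2 with scala-reflect on the classpath; `SavedTagDemo` and the `Saved` alias are names introduced for illustration. A type alias is shown as a possible alternative when the goal is only to avoid repeating the type.

```scala
import scala.reflect.runtime.universe._

object SavedTagDemo {
  // A type alias is another way to name a type once and reuse it.
  type Saved = List[Int]

  def printType[A: TypeTag](): Unit = println(typeOf[A])

  // The tag is an ordinary value and can be stored in a val.
  val listTag: TypeTag[List[Int]] = typeTag[List[Int]]

  // The underlying Type can be stored as well.
  val listType: Type = listTag.tpe

  def main(args: Array[String]): Unit = {
    printType()(listTag)                    // passes the saved tag explicitly
    println(listType =:= typeOf[List[Int]]) // compares the saved Type with a fresh one
    printType[Saved]()                      // uses the alias instead of a saved tag
  }
}
```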

Scala fails to initialize a val

I have found something odd in the following Scala program (sorry to include all the code, but you'll see why I added it all):
object md2html extends App {
private val DEFAULT_THEME = Themes.AMAZON_LIGHT
private val VALID_OPTIONS = Set("editorTheme", "logo", "style")
try {
// some code 1
} catch {
case t: Throwable => t.printStackTrace(); exitWithError(t.getMessage)
}
// some code 2 (method definitions only)
private def parseOption(key: String, value: String) = {
println(key + " " + VALID_OPTIONS)
if (! Set("theme","editorTheme", "logo", "style").contains(key)) exitWithError(s"$key is not a valid option")
if (key == "theme") Themes(value).toMap else Map(key.drop(2) -> value)
}
// some code 3 (method definitions only)
}
If VALID_OPTIONS is defined after one of the some code..., it is evaluated to null in parseOption. I can see no good reason for that. I truncated the code for clarity, but if some more code is required I'll be happy to add it.
EDIT : I looked a bit more into it, and here is what I found.
When extending App, the val is not initialized with this code
object Test extends App {
printTest // prints "null": test has not been assigned yet
def printTest = println(test)
val test = "test"
}
With a regular main method, it works fine :
object Test {
def main(args: Array[String]): Unit = {
printTest
}
def printTest = println(test)
val test = "test"
}
I had overlooked that you use extends App. This is another pitfall in Scala, unfortunately:
object Foo extends App {
val bar = "bar"
}
Foo.bar // null!
Foo.main(Array())
Foo.bar // now initialized
The App trait defers the object's initialization to the invocation of the main method, so all the vals are null until the main method has been called.
In summary, the App trait and vals do not mix well. I have fallen into that trap many times. If you use App, avoid vals; if you have to use global state, use lazy vals instead.
Constructor bodies, and this goes for singleton objects as well, are evaluated strictly top to bottom. This is a common pitfall in Scala, unfortunately, as it becomes relevant where the vals are defined if they are referenced in other places of the constructor.
object Foo {
val rab = useBar // oops, involuntarily referring to uninitialized val
val bar = "bar"
def useBar: String = bar.reverse
}
Foo // NPE
Of course, in a better world, the Scala compiler would either disallow the above code, re-order the initialization, or at least warn you. But it doesn't...
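The lazy val advice can be checked with a small sketch in plain Scala, no App trait involved. `Eager` and `Deferred` are hypothetical names; `Eager` uses string interpolation instead of `reverse` so that the uninitialized value is visible rather than throwing.

```scala
object InitOrderDemo {
  object Eager {
    val rab: String = describeBar // evaluated first, while bar is still null
    val bar: String = "bar"
    def describeBar: String = s"bar is $bar"
  }

  object Deferred {
    val rab: String = useBar      // forces the lazy val on first access
    lazy val bar: String = "bar"
    def useBar: String = bar.reverse
  }

  def main(args: Array[String]): Unit = {
    println(Eager.rab)    // prints "bar is null"
    println(Deferred.rab) // prints "rab"
  }
}
```

A lazy val is initialized on first access rather than at its position in the constructor body, which is why the order of definitions stops mattering.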

Syntactic sugar for compile-time object creation in Scala

Let's say I have
trait fooTrait[T] {
def fooFn(x: T, y: T) : T
}
I want to enable users to quickly declare new instances of fooTrait with their own defined bodies for fooFn. Ideally, I'd want something like
val myFoo : fooTrait[T] = newFoo((x:T, y:T) => x+y)
to work. However, I can't just do
def newFoo[T](f: (T, T) => T) = new fooTrait[T] { def fooFn(x: T, y: T): T = f(x, y) }
because this uses closures, and so results in different objects when the program is run multiple times. What I really need is to be able to get the classOf of the object returned by newFoo and then have that be constructable on a different machine. What do I do?
If you're interested in the use case, I'm trying to write a Scala wrapper for Hadoop that allows you to execute
IO("Data") --> ((x: Int, y: Int) => (x, x+y)) --> IO("Out")
The thing in the middle needs to be turned into a class that implements a particular interface and can then be instantiated on different machines (executing the same jar file) from just the class name.
Note that Scala does the right thing with the syntactic sugar that converts (x:Int) => x+5 to an instance of Function1. My question is whether I can replicate this without hacking the Scala internals. If this was lisp (as I'm used to), this would be a trivial compile-time macro ... :sniff:
Here's a version that matches the syntax of what you list in the question and serializes/executes the anon-function. Note that this serializes the state of the Function2 object so that the serialized version can be restored on another machine. Just the classname is insufficient, as illustrated below the solution.
You should write your own encode/decode functions, if only to plug in your own Base64 implementation rather than relying on Sun's HotSpot-specific classes.
object SHadoopImports {
import java.io._
implicit def functionToFooString[T](f:(T,T)=>T) = {
val baos = new ByteArrayOutputStream()
val oo = new ObjectOutputStream(baos)
oo.writeObject(f)
new sun.misc.BASE64Encoder().encode(baos.toByteArray())
}
implicit def stringToFun(s: String) = {
val decoder = new sun.misc.BASE64Decoder();
val bais = new ByteArrayInputStream(decoder.decodeBuffer(s))
val oi = new ObjectInputStream(bais)
val f = oi.readObject()
new {
def fun[T](x:T, y:T): T = f.asInstanceOf[Function2[T,T,T]](x,y)
}
}
}
// I don't really know what this is supposed to do
// just supporting the given syntax
case class IO(src: String) {
import SHadoopImports._
def -->(s: String) = new {
def -->(to: IO) = {
val IO(snk) = to
println("From: " + src)
println("Applying (4,5): " + s.fun(4,5))
println("To: " + snk)
}
}
}
object App extends Application {
import SHadoopImports._
IO("MySource") --> ((x:Int,y:Int)=>x+y) --> IO("MySink")
println
IO("Here") --> ((x:Int,y:Int)=>x*y+y) --> IO("There")
}
/*
From: MySource
Applying (4,5): 9
To: MySink
From: Here
Applying (4,5): 25
To: There
*/
To convince yourself that the classname is insufficient to use the function on another machine, consider the code below which creates 100 different functions. Count the classes on the filesystem and compare.
object App extends Application {
import SHadoopImports._
for (i <- 1 to 100) {
IO(i + ": source") --> ((x:Int,y:Int)=>(x*i)+y) --> IO("sink")
}
}
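On the point about not relying on Sun's encoder classes: on Java 8 or later, `java.util.Base64` is one option. The following is a minimal sketch of the same serialize-then-encode round trip for a two-argument Int function, assuming the function and everything it captures are serializable. `FnCodec` is a hypothetical name, and the round trip here happens inside a single JVM; another machine would still need the same classes on its classpath.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.Base64

object FnCodec {
  // Serializes the function object (including captured state) to a Base64 string.
  def encode(f: (Int, Int) => Int): String = {
    val baos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(baos)
    try oos.writeObject(f) finally oos.close()
    Base64.getEncoder.encodeToString(baos.toByteArray)
  }

  // Restores the function from its Base64 string.
  def decode(s: String): (Int, Int) => Int = {
    val ois = new ObjectInputStream(new ByteArrayInputStream(Base64.getDecoder.decode(s)))
    try ois.readObject().asInstanceOf[(Int, Int) => Int] finally ois.close()
  }

  def main(args: Array[String]): Unit = {
    val restored = decode(encode((x: Int, y: Int) => x * y + y))
    println(restored(4, 5)) // prints 25
  }
}
```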
Quick suggestion: why don't you try creating an implicit def that transforms a FunctionN object into the trait expected by the --> method?
I do hope you won't have to use any macro for this!
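As a rough sketch of that suggestion, an implicit conversion in the trait's companion object can turn a two-argument function into the trait. `FooTrait` mirrors the `fooTrait` from the question with a capitalized name; `fromFunction` and `FooConversionDemo` are hypothetical. Note that this only covers the syntax: the resulting object still wraps a closure, so it does not by itself address instantiating it on another machine.

```scala
import scala.language.implicitConversions

trait FooTrait[T] {
  def fooFn(x: T, y: T): T
}

object FooTrait {
  // Found automatically because it lives in the companion of the expected type.
  implicit def fromFunction[T](f: (T, T) => T): FooTrait[T] =
    new FooTrait[T] {
      def fooFn(x: T, y: T): T = f(x, y)
    }
}

object FooConversionDemo {
  def run[T](foo: FooTrait[T], a: T, b: T): T = foo.fooFn(a, b)

  def main(args: Array[String]): Unit = {
    // The function literal is adapted to FooTrait[Int] at the assignment.
    val add: FooTrait[Int] = (x: Int, y: Int) => x + y
    println(run(add, 4, 5)) // prints 9
  }
}
```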