What is the best way to inject a snippet of code to scala? something like eval in javascript and GroovyScriptEngine. I want to keep my rules/computations/formulas outside the actual data processing class. I have close to 100+ formulas to be executed. The data flow is same for all only the formulas change. What is the best way to do it in scala? and the number of formulas will grow over time.
You could use either scala-lang API for that or twitter-eval. Here is the snippet of a simple use case of scala-lang
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.IMain
object ScalaReflectEvaluator {
def evaluate() = {
val clazz = prepareClass
val settings = new Settings
settings.usejavacp.value = true
settings.deprecation.value = true
val eval = new IMain(settings)
val evaluated = eval.interpret(clazz)
val res = eval.valueOfTerm("res0").get.asInstanceOf[Int]
println(res) //yields 9
}
private def prepareClass: String = {
s"""
|val x = 4
|val y = 5
|x + y
|""".stripMargin
}
}
or with twitter:
import com.twitter.util.Eval
object TwitterUtilEvaluator {
def evaluate() = {
val clazz = prepareClass
val eval = new Eval
eval.apply[Int](clazz)
}
private def prepareClass: String = {
s"""
|val x = 4
|val y = 5
|x + y
|""".stripMargin
}
}
I am not able to compile it at the moment to check whether I have missed something but you should get the idea.
I've found that scala.tools.reflect.ToolBox is the fastest eval in scala (measured interpreter, twitter's eval and custom tool). It's API:
import scala.reflect.runtime.universe
import scala.tools.reflect.ToolBox
val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()
tb.eval(tb.parse("""println("hello!")"""))
Related
I have a method called split that accepts an RDD[T] and a splitSize and returns an Array[RDD[T]].
Now, one of the test cases I write for it should verify that this function also randomly shuffles the RDD.
So I create a sorted RDD, and then see the results:
it should "randomize shuffle" in {
val inputRDD = sc.parallelize((0 until 16))
val result = RDDUtils.split(inputRDD, 2)
result.foreach(rdd => {
rdd.collect.foreach(println)
})
// Asset result is not sorted
}
If the results are:
0
1
2
3
..
15
Then it's not working as expected.
A good result can be something like:
11
3
9
14
...
1
6
How can I assert the output Array[RDD[T]]] is not sorted?
You could try something like this
val resultOrder = result.sortBy(....)
assert(!resultOrder.sameElements(result))
or
val resultOrder = result.sortBy(....)
assert(!resultOrder.toList == result.toList)
It's important to note that the key is to know how to sort the Array. For an Integer data type it would be easy, but for a complex data type you could need an implicit Ordering for your data type. e.g:
implicit val ordering: Ordering[T] =
Ordering.fromLessThan[T]((sa: T, sb: T) => sa < sb)
// OR
implicit val ordering: Ordering[MyClass] =
Ordering.fromLessThan[MyClass]((sa: MyClass, sb: MyClass) => sa.field1 < sb.field1)
The exact code would depend of your data type.
As a full example of this
package tests
import org.apache.log4j.{Level, Logger}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
object SortArrayRDD {
val spark = SparkSession
.builder()
.appName("SortArrayRDD")
.master("local[*]")
.config("spark.sql.shuffle.partitions","4") //Change to a more reasonable default number of partitions for our data
.config("spark.app.id","SortArrayRDD") // To silence Metrics warning
.getOrCreate()
val sc = spark.sparkContext
def main(args: Array[String]): Unit = {
try {
Logger.getRootLogger.setLevel(Level.ERROR)
val arrRDD: Array[RDD[Int]] = Array(sc.parallelize(List(2,3)),sc.parallelize(List(10,11)),sc.parallelize(List(6,7)),sc.parallelize(List(8,9)),
sc.parallelize(List(4,5)),sc.parallelize(List(0,1)),sc.parallelize(List(12,13)),sc.parallelize(List(14,15)))
val aux = arrRDD
implicit val ordering: Ordering[RDD[Int]] = Ordering.fromLessThan[RDD[Int]]((sa: RDD[Int], sb: RDD[Int]) => sa.sum() < sb.sum())
aux.sorted.foreach(rdd => println(rdd.collect().mkString(",")))
val resultOrder = aux.sorted
assert(!resultOrder.sameElements(arrRDD))
println("It's unordered")
} finally {
sc.stop()
}
}
}
I need to compile function and then evaluate it with different parameters of type List[Map[String, AnyRef]].
I have the following code that does not compile with such the type but compiles with simple type like List[Int].
I found that there are just certain implementations of Liftable in scala.reflect.api.StandardLiftables.StandardLiftableInstances
import scala.reflect.runtime.universe
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox
val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()
val functionWrapper =
"""
object FunctionWrapper {
def makeBody(messages: List[Map[String, AnyRef]]) = Map.empty
}""".stripMargin
val functionSymbol =
tb.define(tb.parse(functionWrapper).asInstanceOf[tb.u.ImplDef])
val list: List[Map[String, AnyRef]] = List(Map("1" -> "2"))
tb.eval(q"$functionSymbol.function($list)")
Getting compilation error for this, how can I make it work?
Error:(22, 38) Can't unquote List[Map[String,AnyRef]], consider using
... or providing an implicit instance of
Liftable[List[Map[String,AnyRef]]]
tb.eval(q"$functionSymbol.function($list)")
^
The problem comes not from complicated type but from the attempt to use AnyRef. When you unquote some literal, it means you want the infrastructure to be able to create a valid syntax tree to create an object that would exactly match the object you pass. Unfortunately this is obviously not possible for all objects. For example, assume that you've passed a reference to Thread.currentThread() as a part of the Map. How it could possible work? Compiler is just not able to recreate such a complicated object (not to mention making it the current thread). So you have two obvious alternatives:
Make you argument also a Tree i.e. something like this
def testTree() = {
val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()
val functionWrapper =
"""
| object FunctionWrapper {
|
| def makeBody(messages: List[Map[String, AnyRef]]) = Map.empty
|
| }
""".stripMargin
val functionSymbol =
tb.define(tb.parse(functionWrapper).asInstanceOf[tb.u.ImplDef])
//val list: List[Map[String, AnyRef]] = List(Map("1" -> "2"))
val list = q"""List(Map("1" -> "2"))"""
val res = tb.eval(q"$functionSymbol.makeBody($list)")
println(s"testTree = $res")
}
The obvious drawback of this approach is that you loose type safety at compile time and might need to provide a lot of context for the tree to work
Another approach is to not try to pass anything containing AnyRef to the compiler-infrastructure. It means you create some function-like Wrapper:
package so {
trait Wrapper {
def call(args: List[Map[String, AnyRef]]): Map[String, AnyRef]
}
}
and then make your generated code return a Wrapper instead of directly executing the logic and call the Wrapper from the usual Scala code rather than inside compiled code. Something like this:
def testWrapper() = {
val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()
val functionWrapper =
"""
|object FunctionWrapper {
| import scala.collection._
| import so.Wrapper /* <- here probably different package :) */
|
| def createWrapper(): Wrapper = new Wrapper {
| override def call(args: List[Map[String, AnyRef]]): Map[String, AnyRef] = Map.empty
| }
|}
| """.stripMargin
val functionSymbol = tb.define(tb.parse(functionWrapper).asInstanceOf[tb.u.ImplDef])
val list: List[Map[String, AnyRef]] = List(Map("1" -> "2"))
val tree: tb.u.Tree = q"$functionSymbol.createWrapper()"
val wrapper = tb.eval(tree).asInstanceOf[Wrapper]
val res = wrapper.call(list)
println(s"testWrapper = $res")
}
P.S. I'm not sure what are you doing but beware of performance issues. Scala is a hard language to compile and thus it might easily take more time to compile your custom code than to run it. If performance becomes an issue you might need to use some other methods such as full-blown macro-code-generation or at least caching of the compiled code.
Unfortunately, the most intuitive way,
val world = "Earth"
val tree = q"""println("Hello $world")"""
results in
Error:(16, 36) Don't know how to unquote here
val tree = q"""println("Hello $world")"""
^
because $ within quasiquotes expects a tree.
val world = "Earth"
val tree = q"""println(${c.literal(s"Hello $world")})"""
works, but is very ugly AND I get an Intellij warning that the c.literal is deprecated and I should use quasiquotes, instead.
So ... how do I do this?
UPDATE
In response to flavian's comment:
import scala.language.experimental.macros
import scala.reflect.macros._
object TestMacros {
def doTest() = macro impl
def impl(c: blackbox.Context)(): c.Expr[Unit] = {
import c.universe._ //access to AST classes
/*
val world = "Earth"
val tree = q"""println(${c.literal(s"Hello $world")})"""
*/
val world = TermName("Earth")
val tree = q"""println("Hello $world")"""
tree match {
case q"""println("Hello Earth")""" => println("succeeded")
case _ => c.abort(c.enclosingPosition, s"huh? was: $tree")
}
c.Expr(tree) //wrap tree and tag with its type
}
}
gives
Error:(18, 40) Don't know how to unquote here
val tree = q"""println("Hello $world")"""
^
You need a TermName or something that's a compiler primitive.
The real problem is that you are mixing interpolators, without realising. The interpolator in hello world is really a string interpolator, not a quasiquote one which is good at unquoting trees as you suggest.
This is one way to go about it:
import c.universe._
val world = TermName("Earth")
val tree = q"""println("Hello" + ${world.decodedName.toString})"""
I just started learning Macro-Fu.
For those also exploring Scala 2 Macros / Scalameta Quasiquotes, seems to me the simplest approach is the following (using SBT 1.5.5; inline explanations):
scala> import scala.language.experimental.macros
| import scala.reflect.macros.blackbox
|
| object UnquoteString {
|
| def helloWorld(): Unit = macro Impl.helloWorld
|
| object Impl {
| def helloWorld(c: blackbox.Context)(): c.Expr[Unit] = {
| import c.universe._
|
| val world = "Earth" // or any other value here...
|
| val msg = s"Hello $world" // build the entire string with it here...
|
| implicitly[Liftable[String]] // as we can lift 'whole' string values with this Liftable[_]...
|
| val tree = q"""println($msg)""" // quasi-unquote the entire string here...
|
| c.Expr[Unit](Block(Nil, tree))
| }
| }
| }
import scala.language.experimental.macros
import scala.reflect.macros.blackbox
object UnquoteString
scala> UnquoteString.helloWorld()
Hello Earth
scala>
Also the following change would also work
val tree = q"""println("Hello, " + $world)""" // quasi-unquote the string to append here...
I could not find how to convert a String into runnable code, for instance:
val i = "new String('Yo')"
// conversion
println(i)
should print
Yo
after the conversion.
I found the following example in another post:
import scala.tools.nsc.interpreter.ILoop
import java.io.StringReader
import java.io.StringWriter
import java.io.PrintWriter
import java.io.BufferedReader
import scala.tools.nsc.Settings
object FuncRunner extends App {
val line = "sin(2 * Pi * 400 * t)"
val lines = """import scala.math._
|var t = 1""".stripMargin
val in = new StringReader(lines + "\n" + line + "\nval f = (t: Int) => " + line)
val out = new StringWriter
val settings = new Settings
val looper = new ILoop(new BufferedReader(in), new PrintWriter(out))
val res = looper process settings
Console println s"[$res] $out"
}
link: How to convert a string from a text input into a function in a Scala
But it seems like scala.tools is not available anymore, and I'm a newbie in Scala so i could not figure out how to replace it.
And may be there are just other ways to do it now.
Thanks !
You can simple execute your code contained inside String using Quasiquotes(Experimental Module).
import scala.reflect.runtime.universe._
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox
// TO compile and run code we will use a ToolBox api.
val toolbox = currentMirror.mkToolBox()
// write your code starting with q and put it inside double quotes.
// NOTE : you will have to use triple quotes if you have any double quotes usage in your code.
val code1 = q"""new String("hello")"""
//compile and run your code.
val result1 = toolbox.compile(code1)()
// another example
val code2 = q"""
case class A(name:String,age:Int){
def f = (name,age)
}
val a = new A("Your Name",22)
a.f
"""
val result2 = toolbox.compile(code2)()
Output in REPL :
// Exiting paste mode, now interpreting.
import scala.reflect.runtime.universe._
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox
toolbox: scala.tools.reflect.ToolBox[reflect.runtime.universe.type] = scala.tools.reflect.ToolBoxFactory$ToolBoxImpl#69b34f89
code1: reflect.runtime.universe.Tree = new String("hello")
result1: Any = hello
code2: reflect.runtime.universe.Tree =
{
case class A extends scala.Product with scala.Serializable {
<caseaccessor> <paramaccessor> val name: String = _;
<caseaccessor> <paramaccessor> val age: Int = _;
def <init>(name: String, age: Int) = {
super.<init>();
()
};
def f = scala.Tuple2(name, age)
};
val a = new A("Your Name", 22);
a.f
}
result2: Any = (Your Name,22)
scala>
To learn more about Quasiquotes :
http://docs.scala-lang.org/overviews/quasiquotes/setup.html
I found a simple solution using the ToolBox tool :
val cm = universe.runtimeMirror(getClass.getClassLoader)
val tb = cm.mkToolBox()
val str = tb.eval(tb.parse("new String(\"Yo\")"))
println(str)
This is printing:
Yo
The scala compiler (and the "interpreter loop") are available here. For example:
https://github.com/scala/scala/blob/v2.11.7/src/repl/scala/tools/nsc/interpreter/ILoop.scala
A compiled jar that has that class will be in your scala distribution under lib\scala-compiler.jar.
One way could be to take/parse that string code and then write it yourself in some file as Scala code. This way it would be executed by Scala compiler when you will call it.
An example: just like Scala is doing with Java. It takes this code and then convert it into Java by making use of main method.
object Xyz extends App {
println ("Hello World!")
}
To embed Scala as a "scripting language", I need to be able to compile text fragments to simple objects, such as Function0[Unit] that can be serialised to and deserialised from disk and which can be loaded into the current runtime and executed.
How would I go about this?
Say for example, my text fragment is (purely hypothetical):
Document.current.elements.headOption.foreach(_.open())
This might be wrapped into the following complete text:
package myapp.userscripts
import myapp.DSL._
object UserFunction1234 extends Function0[Unit] {
def apply(): Unit = {
Document.current.elements.headOption.foreach(_.open())
}
}
What comes next? Should I use IMain to compile this code? I don't want to use the normal interpreter mode, because the compilation should be "context-free" and not accumulate requests.
What I need to get hold off from the compilation is I guess the binary class file? In that case, serialisation is straight forward (byte array). How would I then load that class into the runtime and invoke the apply method?
What happens if the code compiles to multiple auxiliary classes? The example above contains a closure _.open(). How do I make sure I "package" all those auxiliary things into one object to serialize and class-load?
Note: Given that Scala 2.11 is imminent and the compiler API probably changed, I am happy to receive hints as how to approach this problem on Scala 2.11
Here is one idea: use a regular Scala compiler instance. Unfortunately it seems to require the use of hard disk files both for input and output. So we use temporary files for that. The output will be zipped up in a JAR which will be stored as a byte array (that would go into the hypothetical serialization process). We need a special class loader to retrieve the class again from the extracted JAR.
The following assumes Scala 2.10.3 with the scala-compiler library on the class path:
import scala.tools.nsc
import java.io._
import scala.annotation.tailrec
Wrapping user provided code in a function class with a synthetic name that will be incremented for each new fragment:
val packageName = "myapp"
var userCount = 0
def mkFunName(): String = {
val c = userCount
userCount += 1
s"Fun$c"
}
def wrapSource(source: String): (String, String) = {
val fun = mkFunName()
val code = s"""package $packageName
|
|class $fun extends Function0[Unit] {
| def apply(): Unit = {
| $source
| }
|}
|""".stripMargin
(fun, code)
}
A function to compile a source fragment and return the byte array of the resulting jar:
/** Compiles a source code consisting of a body which is wrapped in a `Function0`
* apply method, and returns the function's class name (without package) and the
* raw jar file produced in the compilation.
*/
def compile(source: String): (String, Array[Byte]) = {
val set = new nsc.Settings
val d = File.createTempFile("temp", ".out")
d.delete(); d.mkdir()
set.d.value = d.getPath
set.usejavacp.value = true
val compiler = new nsc.Global(set)
val f = File.createTempFile("temp", ".scala")
val out = new BufferedOutputStream(new FileOutputStream(f))
val (fun, code) = wrapSource(source)
out.write(code.getBytes("UTF-8"))
out.flush(); out.close()
val run = new compiler.Run()
run.compile(List(f.getPath))
f.delete()
val bytes = packJar(d)
deleteDir(d)
(fun, bytes)
}
def deleteDir(base: File): Unit = {
base.listFiles().foreach { f =>
if (f.isFile) f.delete()
else deleteDir(f)
}
base.delete()
}
Note: Doesn't handle compiler errors yet!
The packJar method uses the compiler output directory and produces an in-memory jar file from it:
// cf. http://stackoverflow.com/questions/1281229
def packJar(base: File): Array[Byte] = {
import java.util.jar._
val mf = new Manifest
mf.getMainAttributes.put(Attributes.Name.MANIFEST_VERSION, "1.0")
val bs = new java.io.ByteArrayOutputStream
val out = new JarOutputStream(bs, mf)
def add(prefix: String, f: File): Unit = {
val name0 = prefix + f.getName
val name = if (f.isDirectory) name0 + "/" else name0
val entry = new JarEntry(name)
entry.setTime(f.lastModified())
out.putNextEntry(entry)
if (f.isFile) {
val in = new BufferedInputStream(new FileInputStream(f))
try {
val buf = new Array[Byte](1024)
#tailrec def loop(): Unit = {
val count = in.read(buf)
if (count >= 0) {
out.write(buf, 0, count)
loop()
}
}
loop()
} finally {
in.close()
}
}
out.closeEntry()
if (f.isDirectory) f.listFiles.foreach(add(name, _))
}
base.listFiles().foreach(add("", _))
out.close()
bs.toByteArray
}
A utility function that takes the byte array found in deserialization and creates a map from class names to class byte code:
def unpackJar(bytes: Array[Byte]): Map[String, Array[Byte]] = {
import java.util.jar._
import scala.annotation.tailrec
val in = new JarInputStream(new ByteArrayInputStream(bytes))
val b = Map.newBuilder[String, Array[Byte]]
#tailrec def loop(): Unit = {
val entry = in.getNextJarEntry
if (entry != null) {
if (!entry.isDirectory) {
val name = entry.getName
// cf. http://stackoverflow.com/questions/8909743
val bs = new ByteArrayOutputStream
var i = 0
while (i >= 0) {
i = in.read()
if (i >= 0) bs.write(i)
}
val bytes = bs.toByteArray
b += mkClassName(name) -> bytes
}
loop()
}
}
loop()
in.close()
b.result()
}
def mkClassName(path: String): String = {
require(path.endsWith(".class"))
path.substring(0, path.length - 6).replace("/", ".")
}
A suitable class loader:
class MemoryClassLoader(map: Map[String, Array[Byte]]) extends ClassLoader {
override protected def findClass(name: String): Class[_] =
map.get(name).map { bytes =>
println(s"defineClass($name, ...)")
defineClass(name, bytes, 0, bytes.length)
} .getOrElse(super.findClass(name)) // throws exception
}
And a test case which contains additional classes (closures):
val exampleSource =
"""val xs = List("hello", "world")
|println(xs.map(_.capitalize).mkString(" "))
|""".stripMargin
def test(fun: String, cl: ClassLoader): Unit = {
val clName = s"$packageName.$fun"
println(s"Resolving class '$clName'...")
val clazz = Class.forName(clName, true, cl)
println("Instantiating...")
val x = clazz.newInstance().asInstanceOf[() => Unit]
println("Invoking 'apply':")
x()
}
locally {
println("Compiling...")
val (fun, bytes) = compile(exampleSource)
val map = unpackJar(bytes)
println("Classes found:")
map.keys.foreach(k => println(s" '$k'"))
val cl = new MemoryClassLoader(map)
test(fun, cl) // should call `defineClass`
test(fun, cl) // should find cached class
}