Dynamically parse a string and return a function in Scala using reflection and interpreters - scala

I am trying to dynamically interpret code given as a String.
E.g.:
val myString = "def f(x:Int):Int=x+1"
I'm looking for a method that will return the actual function from it:
E.g.:
val myIncrementFunction = myDarkMagicFunctionThatWillBuildMyFunction(myString)
println(myIncrementFunction(3))
which will print 4.
Use case: I want to use some simple functions from that interpreted code later in my code. For example, a user can provide something like def fun(x: Int): Int = x + 1 as a string; I then use the interpreter to compile and execute that code, and I'd like to be able to use this fun(x) in a map, for example.
The problem is that the function's type is unknown to me, which is a big obstacle because I need to cast the value I get back from IMain.
I've read about reflection, the type system and so on, and after some googling I reached this point. I also checked Twitter's util-eval, but I can't learn much from the docs, and the examples in its tests do pretty much the same thing.
If I know the type, I can do something like:
val settings = new Settings
val imain = new IMain(settings)
val res = imain.interpret("def f(x:Int):Int=x+1; val ret=f _ ")
val myF = imain.valueOfTerm("ret").get.asInstanceOf[Function[Int,Int]]
println(myF(2))
which works correctly and prints 3. However, I am blocked by the problem described above: I don't know the type of the function, and this example only works because I cast to the type I used when defining the string function to test how IMain works.
Do you know of any way I could achieve this functionality?
I'm a newbie, so please excuse any mistakes.
Thanks

OK, I managed to achieve the functionality I wanted. I am still looking to improve this code, but this snippet does what I want.
I used the Scala toolbox and quasiquotes:
import scala.reflect.runtime.universe.{Quasiquote, runtimeMirror}
import scala.tools.reflect.ToolBox
object App {
def main(args: Array[String]): Unit = {
val mirror = runtimeMirror(getClass.getClassLoader)
val tb = ToolBox(mirror).mkToolBox()
val data = Array(1, 2, 3)
println("Data before function applied on it")
println(data.mkString(","))
println("Please enter the map function you want:")
val function = scala.io.StdIn.readLine()
val functionWrapper = "object FunctionWrapper { " + function + "}"
val functionSymbol = tb.define(tb.parse(functionWrapper).asInstanceOf[tb.u.ImplDef])
// Map each element using user specified function
val dataAfterFunctionApplied = data.map(x => tb.eval(q"$functionSymbol.function($x)"))
println("Data after function applied on it")
println(dataAfterFunctionApplied.mkString(","))
}
}
And here is the result in the terminal:
Data before function applied on it
1,2,3
Please enter the map function you want:
def function(x: Int): Int = x + 2
Data after function applied on it
3,4,5
Process finished with exit code 0

I want to elaborate on the previous answer and the comment on it by measuring the performance of both solutions:
import scala.reflect.runtime.universe.{Quasiquote, runtimeMirror}
import scala.tools.reflect.ToolBox
object Runtime {
def time[R](block: => R): R = {
val t0 = System.nanoTime()
val result = block // call-by-name
val t1 = System.nanoTime()
println("Elapsed time: " + (t1 - t0) + " ns")
result
}
def main(args: Array[String]): Unit = {
val mirror = runtimeMirror(getClass.getClassLoader)
val tb = ToolBox(mirror).mkToolBox()
val data = Array(1, 2, 3)
println(s"Data before function applied on it: '${data.toList}")
val function = "def apply(x: Int): Int = x + 2"
println(s"Function: '$function'")
println("#######################")
// Function with tb.eval
println(".... with tb.eval")
val functionWrapper = "object FunctionWrapper { " + function + "}"
// This takes around 1sec!
val functionSymbol = time { tb.define(tb.parse(functionWrapper).asInstanceOf[tb.u.ImplDef])}
// This takes around 0.5 sec!
val result = time {data.map(x => tb.eval(q"$functionSymbol.apply($x)"))}
println(s"Data after function applied on it: '${result.toList}'")
println(".... without tb.eval")
// This takes around 0.4 sec, but it is paid only once!
val func = time {tb.eval(q"$functionSymbol.apply _").asInstanceOf[Int => Int]}
// This takes well under a millisecond: no further evaluation is needed
val result2 = time {data.map(func)}
println(s"Data after function applied on it: '${result2.toList}'")
}
}
If we execute the code above we see the following output:
Data before function applied on it: 'List(1, 2, 3)
Function: 'def apply(x: Int): Int = x + 2'
#######################
.... with tb.eval
Elapsed time: 716542980 ns
Elapsed time: 661386581 ns
Data after function applied on it: 'List(3, 4, 5)'
.... without tb.eval
Elapsed time: 394119232 ns
Elapsed time: 85713 ns
Data after function applied on it: 'List(3, 4, 5)'
This emphasizes the importance of evaluating once to extract a Function and then applying it to the data, without needing to evaluate again, as the comment on the answer indicates.
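The "evaluate once, then reuse" idea can also be sketched without the wrapper object, by parsing a function literal directly. This is only a minimal sketch, assuming Scala 2 with scala-compiler on the classpath (as in the answers above) and assuming the user supplies a lambda such as "(x: Int) => x + 2" rather than a def. EvalOnce and compileIntFunction are hypothetical names chosen for illustration.

```scala
import scala.reflect.runtime.universe.runtimeMirror
import scala.tools.reflect.ToolBox

object EvalOnce {
  // Building the toolbox is expensive, so keep a single instance around.
  private val tb = ToolBox(runtimeMirror(getClass.getClassLoader)).mkToolBox()

  // Assumption: `source` is a function literal of type Int => Int.
  // The cast is unchecked, so a wrong type only fails when the function is called.
  def compileIntFunction(source: String): Int => Int =
    tb.eval(tb.parse(source)).asInstanceOf[Int => Int]

  def main(args: Array[String]): Unit = {
    // Compile once, then reuse the resulting function as often as needed.
    val f = compileIntFunction("(x: Int) => x + 2")
    println(Array(1, 2, 3).map(f).mkString(","))
  }
}
```

The cast still requires knowing the function type up front, so this does not remove the limitation discussed in the question; it only avoids calling tb.eval for every element.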

You can use the twitter-util library to do this; see the test file:
https://github.com/twitter/util/blob/b0696d0/util-eval/src/test/scala/com/twitter/util/EvalTest.scala
If you need to use IMain, perhaps because you want to use the interpreter with your own custom settings, you can do something like this:
a. First create a class meant to hold your result:
class ResHolder(var value: Any)
b. Create a container object to hold the result and interpret the code into that object:
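import javax.script.ScriptException
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.IMain
import scala.tools.nsc.interpreter.Results.{Error, Incomplete, Success}
val settings = new Settings()
val writer = new java.io.StringWriter()
// IMain expects a PrintWriter, so wrap the StringWriter
val interpreter = new IMain(settings, new java.io.PrintWriter(writer))
// The code must be an expression (a function literal), because it is assigned to $result.value below
val code = "(x: Int) => x + 1"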
// Create a container object to hold the result and bind in the interpreter
val holder = new ResHolder(null)
interpreter.bind("$result", holder.getClass.getName, holder) match {
case Success =>
case Error => throw new ScriptException("error in: binding '$result' value\n" + writer)
case Incomplete => throw new ScriptException("incomplete in: binding '$result' value\n" + writer)
}
val ir = interpreter.interpret("$result.value = " + code)
// Return cast value or throw an exception based on result
ir match {
case Success =>
val any = holder.value
any.asInstanceOf[(Int) => Int]
case Error => throw new ScriptException("error in: '" + code + "'\n" + writer)
case Incomplete => throw new ScriptException("incomplete in :'" + code + "'\n" + writer)
}

Related

Scala Future Sequence Mapping: finding length?

I want to return both a Future[Seq[String]] from a method and the length of that Seq[String] as well. Currently I'm building the Future[Seq[String]] using a mapping function from another Future[T].
Is there any way to do this without awaiting the Future?
You can map over the current Future to create a new one with the new data added to the type.
val fss: Future[Seq[String]] = Future(Seq("a","b","c"))
val x: Future[(Seq[String],Int)] = fss.map(ss => (ss, ss.length))
If you somehow know what the length of the Seq will be without actually waiting for it, then you can do something like this:
val t: Future[T] = ???
def foo: (Int, Future[Seq[String]]) = {
val length = 42 // ???
val fut: Future[Seq[String]] = t map { v =>
genSeqOfLength42(v)
}
(length, fut)
}
If you don't, then you will have to return Future[(Int, Seq[String])] as jwvh said, or you can easily get the length later in the calling function.
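To make the mapping approach concrete, here is a small self-contained sketch. The names FutureLength, source and namesWithLength are hypothetical, and source merely stands in for whatever Future[T] the sequence is really derived from. Await appears only in main to display the result; the pairing itself never blocks.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureLength {
  // Hypothetical stand-in for the Future[T] that the sequence is built from.
  def source: Future[Int] = Future(3)

  // Pair the sequence with its length inside the same Future; nothing blocks here.
  def namesWithLength: Future[(Seq[String], Int)] =
    source
      .map(n => Seq.tabulate(n)(i => s"item$i"))
      .map(ss => (ss, ss.length))

  def main(args: Array[String]): Unit = {
    // Await is used only at the very edge of the program to show the result.
    val (names, length) = Await.result(namesWithLength, 2.seconds)
    println(s"$names has length $length")
  }
}
```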

Cats Writer Vector is empty

I wrote this simple program in my attempt to learn how Cats Writer works
import cats.data.Writer
import cats.syntax.applicative._
import cats.syntax.writer._
import cats.instances.vector._
object WriterTest extends App {
type Logged2[A] = Writer[Vector[String], A]
Vector("started the program").tell
val output1 = calculate1(10)
val foo = new Foo()
val output2 = foo.calculate2(20)
val (log, sum) = (output1 + output2).pure[Logged2].run
println(log)
println(sum)
def calculate1(x : Int) : Int = {
Vector("came inside calculate1").tell
val output = 10 + x
Vector(s"Calculated value ${output}").tell
output
}
}
class Foo {
def calculate2(x: Int) : Int = {
Vector("came inside calculate 2").tell
val output = 10 + x
Vector(s"calculated ${output}").tell
output
}
}
The program works and the output is
> run-main WriterTest
[info] Compiling 1 Scala source to /Users/Cats/target/scala-2.11/classes...
[info] Running WriterTest
Vector()
50
[success] Total time: 1 s, completed Jan 21, 2017 8:14:19 AM
But why is the vector empty? Shouldn't it contain all the strings on which I used the "tell" method?
When you call tell on your Vectors, each time you create a Writer[Vector[String], Unit]. However, you never actually do anything with your Writers, you just discard them. Further, you call pure to create your final Writer, which simply creates a Writer with an empty Vector. You have to combine the writers together in a chain that carries your value and message around.
type Logged[A] = Writer[Vector[String], A]
val (log, sum) = (for {
_ <- Vector("started the program").tell
output1 <- calculate1(10)
foo = new Foo()
output2 <- foo.calculate2(20)
} yield output1 + output2).run
def calculate1(x: Int): Logged[Int] = for {
_ <- Vector("came inside calculate1").tell
output = 10 + x
_ <- Vector(s"Calculated value ${output}").tell
} yield output
class Foo {
def calculate2(x: Int): Logged[Int] = for {
_ <- Vector("came inside calculate2").tell
output = 10 + x
_ <- Vector(s"calculated ${output}").tell
} yield output
}
Note the use of for notation. The definition of calculate1 is really
def calculate1(x: Int): Logged[Int] = Vector("came inside calculate1").tell.flatMap { _ =>
val output = 10 + x
Vector(s"Calculated value ${output}").tell.map { _ => output }
}
flatMap is the monadic bind operation, which means it understands how to take two monadic values (in this case Writer) and join them together to get a new one. In this case, it makes a Writer containing the concatenation of the logs and the value of the one on the right.
Note how there are no side effects. There is no global state by which Writer can remember all your calls to tell. You instead make many Writers and join them together with flatMap to get one big one at the end.
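To see what flatMap does with the logs without pulling in Cats, here is a stripped-down sketch. MiniWriter is a hypothetical stand-in written for illustration, not Cats' actual Writer implementation; it only mimics the behaviour described above, where the logs are concatenated and the value of the right-hand side is kept.

```scala
object MiniWriterDemo {
  // Hypothetical, simplified stand-in for Writer[Vector[String], A].
  final case class MiniWriter[A](log: Vector[String], value: A) {
    def map[B](f: A => B): MiniWriter[B] = MiniWriter(log, f(value))

    // Keep the left log, append the right log, and keep the right value.
    def flatMap[B](f: A => MiniWriter[B]): MiniWriter[B] = {
      val next = f(value)
      MiniWriter(log ++ next.log, next.value)
    }
  }

  // Analogue of tell: a log entry with no meaningful value.
  def tell(msg: String): MiniWriter[Unit] = MiniWriter(Vector(msg), ())

  def calculate1(x: Int): MiniWriter[Int] = for {
    _ <- tell("came inside calculate1")
    output = 10 + x
    _ <- tell(s"Calculated value $output")
  } yield output

  // Every tell is part of the chain, so no message is discarded.
  val program: MiniWriter[Int] = for {
    _ <- tell("started the program")
    a <- calculate1(10)
    b <- calculate1(20)
  } yield a + b
}
```

Here program.log holds all five messages in order and program.value is 50, because each step's log was threaded through flatMap instead of being thrown away.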
The problem with your example code is that you're not using the result of the tell method.
If you take a look at its signature, you'll see this:
final class WriterIdSyntax[A](val a: A) extends AnyVal {
def tell: Writer[A, Unit] = Writer(a, ())
}
It is clear that tell returns a Writer[A, Unit], which is immediately discarded because you didn't assign it to a value.
The proper way to use a Writer (and any monad in Scala) is through its flatMap method. It would look similar to this:
println(
Vector("started the program").tell.flatMap { _ =>
15.pure[Logged2].flatMap { i =>
Writer(Vector("ended program"), i)
}
}
)
The code above, when executed, will give you this:
WriterT((Vector(started the program, ended program),15))
As you can see, both messages and the int are stored in the result.
Now this is a bit ugly, and Scala actually provides a better way to do this: for-comprehensions. For-comprehensions are a bit of syntactic sugar that allows us to write the same code in this way:
println(
for {
_ <- Vector("started the program").tell
i <- 15.pure[Logged2]
_ <- Vector("ended program").tell
} yield i
)
Now, going back to your example, I would recommend changing the return type of calculate1 and calculate2 to Writer[Vector[String], Int] and then trying to make your application compile using what I wrote above.

Scala TypeTags and performance

There are some answers around for equivalent questions about Java, but is Scala reflection (2.11, TypeTags) really slow? There's a long narrative write-up about it at http://docs.scala-lang.org/overviews/reflection/overview.html, from which the answer to this question is hard to extract.
I see a lot of advice floating around about avoiding reflection, maybe some of it predating the improvements of 2.11, but if this works well, it looks like it could solve the debilitating aspect of the JVM's type erasure for Scala code.
Thanks!
Let's measure it.
I've created a simple class C that has one method. All this method does is sleep for 10 ms.
Let's invoke this method
via reflection
directly
and see which is faster, and by how much.
I've created three tests.
Test 1. Invoke via reflection. The execution time includes all the work needed to set up reflection:
create the runtimeMirror, reflect the class, look up the method declaration and, as the last step, execute the method.
Test 2. Leave out this preparation stage, as it can be reused.
We measure only the time of invoking the method via reflection.
Test 3. Invoke the method directly.
Results:
Reflection from start : job done in 2561ms got 101 (about 1.5 seconds of setup in total, i.e. roughly 15 ms per execution)
Invoke method reflection: job done in 1093ms got 101 ( < 1ms for setup each execution)
No reflection: job done in 1087ms got 101 ( < 1ms for setup each execution)
Conclusion:
The setup phase increases execution time dramatically, but there is no need to perform the setup on each execution (it is like class initialization and can be done once). So if you use reflection the right way (with a separate init stage), it shows comparable performance and can be used in production.
Source code:
import org.scalatest.FunSpec
class C {
def x = {
Thread.sleep(10)
1
}
}
class XYZTest extends FunSpec {
def withTime[T](procName: String, f: => T): T = {
val start = System.currentTimeMillis()
val r = f
val end = System.currentTimeMillis()
print(s"$procName job done in ${end-start}ms")
r
}
describe("SomeTest") {
it("rebuild each time") {
val s = withTime("Reflection from start : ", (0 to 100). map {x =>
val ru = scala.reflect.runtime.universe
val m = ru.runtimeMirror(getClass.getClassLoader)
val im = m.reflect(new C)
val methodX = ru.typeOf[C].declaration(ru.TermName("x")).asMethod
val mm = im.reflectMethod(methodX)
mm().asInstanceOf[Int]
}).sum
println(s" got $s")
}
it("invoke each time") {
val ru = scala.reflect.runtime.universe
val m = ru.runtimeMirror(getClass.getClassLoader)
val im = m.reflect(new C)
val s = withTime("Invoke method reflection: ", (0 to 100). map {x =>
val methodX = ru.typeOf[C].declaration(ru.TermName("x")).asMethod
val mm = im.reflectMethod(methodX)
mm().asInstanceOf[Int]
}).sum
println(s" got $s")
}
it("invoke directly") {
val c = new C()
val s = withTime("No reflection: ", (0 to 100). map {x =>
c.x
}).sum
println(s" got $s")
}
}
}
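The "separate init stage" from the conclusion can be sketched as follows: a minimal example, assuming Scala 2.11 or later with scala-reflect on the classpath. Probe, ReflectOnce and callX are hypothetical names used only for illustration. The mirror and the method symbol are created once and reused for every call.

```scala
import scala.reflect.runtime.{universe => ru}

// Hypothetical target class, standing in for class C above (without the sleep).
class Probe {
  def x: Int = 1
}

object ReflectOnce {
  // One-time setup: these are the expensive steps, so they are cached.
  private val mirror = ru.runtimeMirror(getClass.getClassLoader)
  private val methodX = ru.typeOf[Probe].decl(ru.TermName("x")).asMethod

  // Per call: bind the cached method symbol to a receiver and invoke it.
  def callX(target: Probe): Int =
    mirror.reflect(target).reflectMethod(methodX)().asInstanceOf[Int]
}
```

If the same receiver is called repeatedly, the MethodMirror returned by reflectMethod could be cached as well, which would remove the remaining per-call lookup.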

Delayed Execution of a series of operations

I'm trying to write a class where when you call a function defined in the class, it will store it in an array of functions instead of executing it right away, then user calls exec() to execute it:
class TestA(val a: Int, newAction: Option[ArrayBuffer[(Int) => Int]]) {
val action: ArrayBuffer[(Int) => Int] = if (newAction.isEmpty) ArrayBuffer.empty[(Int) => Int] else newAction.get
def add(b: Int): TestA = {action += (a => a + b); new TestA(a, Some(action))}
def exec(): Int = {
var result = 0
action.foreach(r => result += r.apply(a))
result
}
def this(a:Int) = this(a, None)
}
Then this is my test code:
"delayed action" should "delay action till ready" in {
val test = new TestA(3)
val result = test.add(5).add(5)
println(result.exec())
}
This gives me a result of 16 because 3 was passed in twice and got added twice. I guess the easy way for me to solve this problem is to not pass in value for the second round, like change val a: Int to val a: Option[Int]. It helps but it doesn't solve my real problem: letting the second function know the result of the first execution.
Does anyone have a better solution to this?? Or if this is a pattern, can anyone share a tutorial of it?
Just save the result of the action in the 'result' variable (initialize it with 'a') and use the previous result as the input for the current iteration:
def exec(): Int = {
var result = a
action.foreach(r => result = r.apply(result))
result
}
or use the more functional solution that does the same:
def exec(): Int = {
action.foldLeft(a)((r, f) => f.apply(r))
}
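Putting the foldLeft version into a complete class gives something like the sketch below. It is one possible shape rather than the only one: Delayed is a hypothetical name, and it swaps the mutable ArrayBuffer for an immutable Vector so that each add returns an independent instance instead of sharing state with the previous one.

```scala
object DelayedDemo {
  // Immutable variant: each add returns a new instance with one more queued step.
  final class Delayed(start: Int, steps: Vector[Int => Int] = Vector.empty) {
    def add(b: Int): Delayed = new Delayed(start, steps :+ ((n: Int) => n + b))

    // Thread the running result through every queued step, in order.
    def exec(): Int = steps.foldLeft(start)((acc, f) => f(acc))
  }

  def main(args: Array[String]): Unit = {
    val result = new Delayed(3).add(5).add(5)
    println(result.exec())
  }
}
```

With the test from the question, new Delayed(3).add(5).add(5).exec() yields 13 rather than 16, because the second step now receives the result of the first.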

How do I create a partial function with generics in scala?

I'm trying to write a performance measurements library for Scala. My idea is to transparently 'mark' sections so that the execution time can be collected. Unfortunately I wasn't able to bend the compiler to my will.
An admittedly contrived example of what I have in mind:
// generate a timing function
val myTimer = mkTimer('myTimer)
// see how the timing function returns the right type depending on the
// type of the block that is passed to it
val act = actor {
loop {
receive {
case 'Int =>
val calc = myTimer { (1 to 100000).sum }
val result = calc + 10 // calc must be Int
self reply (result)
case 'String =>
val calc = myTimer { (1 to 100000).mkString }
val result = calc + " String" // calc must be String
self reply (result)
}
}
}
Now, this is the farthest I got:
trait Timing {
def time[T <: Any](name: Symbol)(op: => T) :T = {
val start = System.nanoTime
val result = op
val elapsed = System.nanoTime - start
println(name + ": " + elapsed)
result
}
def mkTimer[T <: Any](name: Symbol) : (() => T) => () => T = {
type c = () => T
time(name)(_ : c)
}
}
Using the time function directly works and the compiler correctly uses the return type of the anonymous function to type the 'time' function:
val bigString = time('timerBigString) {
(1 to 100000).mkString("-")
}
println (bigString)
Great as it seems, this pattern has a number of shortcomings:
forces the user to reuse the same symbol at each invocation
makes it more difficult to do more advanced stuff like predefined project-level timers
does not allow the library to initialize a data structure for 'timerBigString just once
This is where mkTimer comes in: it would allow me to partially apply the time function and reuse it. I use mkTimer like this:
val myTimer = mkTimer('aTimer)
val myString= myTimer {
(1 to 100000).mkString("-")
}
println (myString)
But I get a compiler error:
error: type mismatch;
found : String
required: () => Nothing
(1 to 100000).mkString("-")
I get the same error if I inline the currying:
val timerBigString = time('timerBigString) _
val bigString = timerBigString {
(1 to 100000).mkString("-")
}
println (bigString)
This works if I do val timerBigString = time('timerBigString) (_: String), but this is not what I want. I'd like to defer typing of the partially applied function until application.
I conclude that the compiler decides the return type of the partial function when I first create it, choosing "Nothing" because it can't make a better-informed choice.
So I guess what I'm looking for is a sort of late-binding of the partially applied function. Is there any way to do this? Or maybe is there a completely different path I could follow?
Well, thanks for reading this far
-teo
The usual pattern when you want "lazy" generics is to use a class with an apply method:
class Timer(name: Symbol) {
def apply[T](op: => T) = time(name)(op)
}
def mkTimer(name: Symbol) = new Timer(name)
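A self-contained sketch of this pattern, with the time function from the question inlined so that it compiles on its own. TimingDemo, total and text are hypothetical names, and Symbol("aTimer") is used instead of the 'aTimer literal only because symbol literals are deprecated in newer Scala versions.

```scala
object TimingDemo {
  // Same shape as the time function in the question.
  def time[T](name: Symbol)(op: => T): T = {
    val start = System.nanoTime
    val result = op
    println(s"$name: ${System.nanoTime - start}")
    result
  }

  // T is chosen at each call to apply, not when the timer is created.
  class Timer(name: Symbol) {
    def apply[T](op: => T): T = time(name)(op)
  }

  def mkTimer(name: Symbol): Timer = new Timer(name)

  val myTimer = mkTimer(Symbol("aTimer"))

  // The same timer instance is reused with two different result types.
  val total: Int = myTimer { (1 to 1000).sum }
  val text: String = myTimer { (1 to 3).mkString("-") }
}
```

Because the type parameter lives on apply rather than on mkTimer, the compiler infers it from each block, which is the late binding the question was asking for. The Timer class is also a natural place to initialize per-timer data once.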