Creating serializable objects from Scala source code at runtime - scala

To embed Scala as a "scripting language", I need to be able to compile text fragments to simple objects, such as Function0[Unit] that can be serialised to and deserialised from disk and which can be loaded into the current runtime and executed.
How would I go about this?
Say for example, my text fragment is (purely hypothetical):
Document.current.elements.headOption.foreach(_.open())
This might be wrapped into the following complete text:
package myapp.userscripts
import myapp.DSL._
object UserFunction1234 extends Function0[Unit] {
def apply(): Unit = {
Document.current.elements.headOption.foreach(_.open())
}
}
What comes next? Should I use IMain to compile this code? I don't want to use the normal interpreter mode, because the compilation should be "context-free" and not accumulate requests.
What I need to get hold of from the compilation is, I guess, the binary class file? In that case, serialisation is straightforward (byte array). How would I then load that class into the runtime and invoke the apply method?
What happens if the code compiles to multiple auxiliary classes? The example above contains a closure _.open(). How do I make sure I "package" all those auxiliary things into one object to serialize and class-load?
Note: Given that Scala 2.11 is imminent and the compiler API has probably changed, I am happy to receive hints as to how to approach this problem on Scala 2.11.

Here is one idea: use a regular Scala compiler instance. Unfortunately it seems to require the use of hard disk files both for input and output. So we use temporary files for that. The output will be zipped up in a JAR which will be stored as a byte array (that would go into the hypothetical serialization process). We need a special class loader to retrieve the class again from the extracted JAR.
The following assumes Scala 2.10.3 with the scala-compiler library on the class path:
import scala.tools.nsc
import java.io._
import scala.annotation.tailrec
Wrapping user provided code in a function class with a synthetic name that will be incremented for each new fragment:
val packageName = "myapp"
var userCount = 0
def mkFunName(): String = {
val c = userCount
userCount += 1
s"Fun$c"
}
def wrapSource(source: String): (String, String) = {
val fun = mkFunName()
val code = s"""package $packageName
|
|class $fun extends Function0[Unit] {
| def apply(): Unit = {
| $source
| }
|}
|""".stripMargin
(fun, code)
}
A function to compile a source fragment and return the byte array of the resulting jar:
/** Compiles a source code consisting of a body which is wrapped in a `Function0`
* apply method, and returns the function's class name (without package) and the
* raw jar file produced in the compilation.
*/
def compile(source: String): (String, Array[Byte]) = {
val set = new nsc.Settings
val d = File.createTempFile("temp", ".out")
d.delete(); d.mkdir()
set.d.value = d.getPath
set.usejavacp.value = true
val compiler = new nsc.Global(set)
val f = File.createTempFile("temp", ".scala")
val out = new BufferedOutputStream(new FileOutputStream(f))
val (fun, code) = wrapSource(source)
out.write(code.getBytes("UTF-8"))
out.flush(); out.close()
val run = new compiler.Run()
run.compile(List(f.getPath))
f.delete()
val bytes = packJar(d)
deleteDir(d)
(fun, bytes)
}
def deleteDir(base: File): Unit = {
base.listFiles().foreach { f =>
if (f.isFile) f.delete()
else deleteDir(f)
}
base.delete()
}
Note: Doesn't handle compiler errors yet!
The packJar method uses the compiler output directory and produces an in-memory jar file from it:
// cf. http://stackoverflow.com/questions/1281229
def packJar(base: File): Array[Byte] = {
import java.util.jar._
val mf = new Manifest
mf.getMainAttributes.put(Attributes.Name.MANIFEST_VERSION, "1.0")
val bs = new java.io.ByteArrayOutputStream
val out = new JarOutputStream(bs, mf)
def add(prefix: String, f: File): Unit = {
val name0 = prefix + f.getName
val name = if (f.isDirectory) name0 + "/" else name0
val entry = new JarEntry(name)
entry.setTime(f.lastModified())
out.putNextEntry(entry)
if (f.isFile) {
val in = new BufferedInputStream(new FileInputStream(f))
try {
val buf = new Array[Byte](1024)
@tailrec def loop(): Unit = {
val count = in.read(buf)
if (count >= 0) {
out.write(buf, 0, count)
loop()
}
}
loop()
} finally {
in.close()
}
}
out.closeEntry()
if (f.isDirectory) f.listFiles.foreach(add(name, _))
}
base.listFiles().foreach(add("", _))
out.close()
bs.toByteArray
}
A utility function that takes the byte array found in deserialization and creates a map from class names to class byte code:
def unpackJar(bytes: Array[Byte]): Map[String, Array[Byte]] = {
import java.util.jar._
import scala.annotation.tailrec
val in = new JarInputStream(new ByteArrayInputStream(bytes))
val b = Map.newBuilder[String, Array[Byte]]
@tailrec def loop(): Unit = {
val entry = in.getNextJarEntry
if (entry != null) {
if (!entry.isDirectory) {
val name = entry.getName
// cf. http://stackoverflow.com/questions/8909743
val bs = new ByteArrayOutputStream
var i = 0
while (i >= 0) {
i = in.read()
if (i >= 0) bs.write(i)
}
val bytes = bs.toByteArray
b += mkClassName(name) -> bytes
}
loop()
}
}
loop()
in.close()
b.result()
}
def mkClassName(path: String): String = {
require(path.endsWith(".class"))
path.substring(0, path.length - 6).replace("/", ".")
}
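Note that mkClassName above rejects any entry that is not a class file, so a jar that also contains resources would fail the require. A minimal sketch of a more tolerant variant follows, assuming that skipping such entries is acceptable. The names JarEntries, classNameOf, jarOf and classes are hypothetical and not part of the answer; jarOf exists only to build a small jar for the demonstration. It also reads each entry through a buffer rather than byte by byte.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.jar.{JarEntry, JarInputStream, JarOutputStream}

object JarEntries {
  /** Maps "myapp/Fun0.class" to Some("myapp.Fun0"); any other entry yields None. */
  def classNameOf(path: String): Option[String] =
    if (path.endsWith(".class")) Some(path.stripSuffix(".class").replace('/', '.'))
    else None

  /** Builds a small in-memory jar, for demonstration purposes only. */
  def jarOf(entries: Map[String, Array[Byte]]): Array[Byte] = {
    val bs  = new ByteArrayOutputStream
    val out = new JarOutputStream(bs)
    entries.foreach { case (name, bytes) =>
      out.putNextEntry(new JarEntry(name))
      out.write(bytes)
      out.closeEntry()
    }
    out.close()
    bs.toByteArray
  }

  /** Like `unpackJar`, but silently skips entries that are not class files. */
  def classes(jar: Array[Byte]): Map[String, Array[Byte]] = {
    val in = new JarInputStream(new ByteArrayInputStream(jar))
    try {
      val b     = Map.newBuilder[String, Array[Byte]]
      val buf   = new Array[Byte](4096)
      var entry = in.getNextJarEntry
      while (entry != null) {
        classNameOf(entry.getName).foreach { name =>
          val bs = new ByteArrayOutputStream
          var n  = in.read(buf) // returns -1 at the end of the current entry
          while (n >= 0) { bs.write(buf, 0, n); n = in.read(buf) }
          b += name -> bs.toByteArray
        }
        entry = in.getNextJarEntry
      }
      b.result()
    } finally in.close()
  }
}
```

Directory entries end in "/" and therefore never match the ".class" suffix, so they are skipped without a separate check.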
A suitable class loader:
class MemoryClassLoader(map: Map[String, Array[Byte]]) extends ClassLoader {
override protected def findClass(name: String): Class[_] =
map.get(name).map { bytes =>
println(s"defineClass($name, ...)")
defineClass(name, bytes, 0, bytes.length)
} .getOrElse(super.findClass(name)) // throws exception
}
And a test case which contains additional classes (closures):
val exampleSource =
"""val xs = List("hello", "world")
|println(xs.map(_.capitalize).mkString(" "))
|""".stripMargin
def test(fun: String, cl: ClassLoader): Unit = {
val clName = s"$packageName.$fun"
println(s"Resolving class '$clName'...")
val clazz = Class.forName(clName, true, cl)
println("Instantiating...")
val x = clazz.newInstance().asInstanceOf[() => Unit]
println("Invoking 'apply':")
x()
}
locally {
println("Compiling...")
val (fun, bytes) = compile(exampleSource)
val map = unpackJar(bytes)
println("Classes found:")
map.keys.foreach(k => println(s" '$k'"))
val cl = new MemoryClassLoader(map)
test(fun, cl) // should call `defineClass`
test(fun, cl) // should find cached class
}
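The answer leaves the actual serialization step hypothetical. As a rough sketch of that step, the pair returned by compile (class name and jar bytes) could be written to disk with a simple layout: the name as a UTF string, then the byte count, then the raw bytes. The object FunctionStore and this file layout are assumptions made for illustration, not something prescribed above.

```scala
import java.io._

object FunctionStore {
  /** Writes the class name and the jar bytes as: UTF string, int length, raw bytes. */
  def write(file: File, fun: String, jar: Array[Byte]): Unit = {
    val out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file)))
    try {
      out.writeUTF(fun)
      out.writeInt(jar.length)
      out.write(jar)
    } finally out.close()
  }

  /** Reads back what `write` produced. */
  def read(file: File): (String, Array[Byte]) = {
    val in = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))
    try {
      val fun = in.readUTF()
      val jar = new Array[Byte](in.readInt())
      in.readFully(jar) // blocks until the whole array is filled or throws EOFException
      (fun, jar)
    } finally in.close()
  }
}
```

After reading, the bytes would go through unpackJar and MemoryClassLoader exactly as in the test case above.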

Related

How to list out all the files in the public directory in a Play Framework 2.X Scala application?

Here is my controller
class Proguard extends Controller {
val proguardFolder = "/public/proguards/"
val proguardFolderFix = "/public/proguards"
val proguardSuffix = "proguard-"
val proguardExtension = ".pro"
val title = "# Created by https://www.proguard.io/api/%s\n\n%s"
def proguard(libraryName: String) = Action {
val libraries = libraryName.split(',')
val availableLibs = listInDir(proguardFolderFix)
val result = availableLibs.filter(libraries.contains).map(readFile).mkString
Ok(title.format(libraryName, result))
}
def list() = Action {
Ok(Json.toJson(listInDir(proguardFolder)))
}
private def listInDir(filePath: String): List[String] = {
getListOfFiles(Play.getFile(filePath)).map(_.getName.replace(proguardExtension, "").replace(proguardSuffix, ""))
}
def getListOfFiles(dir: File): List[File] = {
dir.listFiles.toList
}
def readFile(string: String): String = {
val source = scala.io.Source.fromFile(Play.getFile(s"$proguardFolder$proguardSuffix$string$proguardExtension"))
val lines = try source.mkString finally source.close()
lines
}
}
It worked totally okay in debug mode, but in production on Heroku dir.listFiles throws a NullPointerException.
I've tried different approaches, but it looks like the only solution is to move my files to S3 or a database.
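For context, java.io.File.listFiles returns null rather than an empty array when the path does not denote a directory or cannot be read, which is the likely source of the NPE if the public folder is not present as a plain directory in the deployed application. The sketch below only guards against the null; it does not make the files appear. The name SafeListing is hypothetical.

```scala
import java.io.File

object SafeListing {
  /** Lists the files in `dir`, or Nil if `dir` is missing, not a directory, or unreadable. */
  def listFiles(dir: File): List[File] =
    Option(dir.listFiles()).map(_.toList).getOrElse(Nil)
}
```

With this in place, getListOfFiles would return an empty list in production instead of throwing, which makes the missing directory visible as an empty result.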

What is the best way to get the name of the caller class in an object?

I could get this working using this:
scala> object LOGGER {
| def warning(msg: String)(implicit className:String) = {
| className
| }
| }
defined object LOGGER
scala> class testing {
| lazy implicit val className = this.getClass.getName
| def test = LOGGER.warning("Testing")
| }
defined class testing
scala> val obj = new testing()
obj: testing = testing@11fb4f69
scala> obj.test
res51: String = testing <=======
scala> class testing2 {
| lazy implicit val className = this.getClass.getName
| def test = LOGGER.warning("Testing")
| }
defined class testing2
scala> val obj2 = new testing2()
obj2: testing2 = testing2@2ca3a203
scala> obj2.test
res53: String = testing2 <=====
I also tried using Thread.currentThread.getStackTrace in the object LOGGER but couldn't get it to print the calling class testing in the warning function.
Any other ways to do this?
Dynamic variable
One way to do it is DynamicVariable:
import scala.util.DynamicVariable
object LOGGER {
val caller = new DynamicVariable[String]("---")
def time = new java.util.Date().toString
def warning(msg: String) = println(s"[${caller.value} : $time] $msg")
}
trait Logging {
def logged[T](action: => T) = LOGGER.caller.withValue(this.getClass.getName)(action)
}
class testing extends Logging {
def test = logged {
//some actions
LOGGER.warning("test something")
//some other actions
}
}
val t = new testing
t.test
will print something like
[testing : Wed Nov 25 11:29:23 MSK 2015] test something
Or instead of mixing in Logging you can use it directly
class testing {
def test = LOGGER.caller.withValue(this.getClass.getName) {
//some actions
LOGGER.warning("test something")
//some other actions
}
}
Macro
Another approach, more powerful but more complex to maintain, is to build a simple macro.
You could define it in another source file, preferably in another subproject:
import scala.reflect.macros.blackbox.Context
import scala.language.experimental.macros
class LoggerImpl(val c: Context) {
import c.universe._
def getClassSymbol(s: Symbol): Symbol = if (s.isClass) s else getClassSymbol(s.owner)
def logImpl(msg: Expr[String]): Expr[Unit] = {
val cl = getClassSymbol(c.internal.enclosingOwner).toString
val time = c.Expr[String](q"new java.util.Date().toString")
val logline = c.Expr[String](q""" "[" + $cl + " : " + $time + "]" + $msg """)
c.Expr[Unit](q"println($logline)")
}
}
object Logger {
def warning(msg: String): Unit = macro LoggerImpl.logImpl
}
Now you don't need to change the testing class:
class testing {
def test = {
//some actions
Logger.warning("something happen")
//some other actions
}
}
And you will see the desired output.
This can be a much more performant alternative to runtime stack introspection.
I use this technique in my custom class loader project to get the name of the first class up the stack that is not in my package. The general idea is copied from URLClassLoader.
String callerClassName="";
StackTraceElement[] stackTrace=Thread.currentThread().getStackTrace();
for (int i=1; i < stackTrace.length; i++) {
String candidateClassName=stackTrace[i].getClassName();
if(!candidateClassName.startsWith("to.be.ignored") &&
!candidateClassName.startsWith("java")){
callerClassName=candidateClassName;
break;
}
}
The approach has its drawbacks, since it only gets the name of the class, not the actual class or, even better, the object.
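The same stack walk can be written in Scala. This is a minimal sketch of the loop above, not a different technique: the object name CallerLookup and the ignoredPrefixes parameter are hypothetical, and the caller decides which package prefixes to skip.

```scala
object CallerLookup {
  /** First class name on the current stack that matches none of the ignored prefixes. */
  def callerClassName(ignoredPrefixes: Seq[String]): Option[String] =
    Thread.currentThread().getStackTrace
      .iterator
      .drop(1) // the first element is typically Thread.getStackTrace itself
      .map(_.getClassName)
      .find(name => !ignoredPrefixes.exists(p => name.startsWith(p)))
}
```

A logger would typically pass its own package and "java" as prefixes, for example CallerLookup.callerClassName(Seq("java", "to.be.ignored")). Returning an Option avoids the empty-string default of the Java version.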

Serializing to disk and deserializing Scala objects using Pickling

Given a stream of homogeneous typed object, how would I go about serializing them to binary, writing them to disk, reading them from disk and then deserializing them using Scala Pickling?
For example:
object PicklingIteratorExample extends App {
import scala.pickling.Defaults._
import scala.pickling.binary._
import scala.pickling.static._
case class Person(name: String, age: Int)
val personsIt = Iterator.from(0).take(10).map(i => Person(i.toString, i))
val pklsIt = personsIt.map(_.pickle)
??? // Write to disk
val readIt: Iterator[Person] = ??? // Read from disk and unpickle
}
I found a way to do so for standard files:
object PickleIOExample extends App {
import scala.pickling.Defaults._
import scala.pickling.binary._
import scala.pickling.static._
val tempPath = File.createTempFile("pickling", ".gz").getAbsolutePath
val outputStream = new FileOutputStream(tempPath)
val inputStream = new FileInputStream(tempPath)
val persons = for{
i <- 1 to 100
} yield Person(i.toString, i)
val output = new StreamOutput(outputStream)
persons.foreach(_.pickleTo(output))
outputStream.close()
val personsIt = new Iterator[Person]{
val streamPickle = BinaryPickle(inputStream)
override def hasNext: Boolean = inputStream.available > 0
override def next(): Person = streamPickle.unpickle[Person]
}
println(personsIt.mkString(", "))
inputStream.close()
}
But I am still unable to find a solution that works with gzipped files, since I do not know how to detect EOF. The following throws an EOFException, since the available method of GZIPInputStream does not indicate EOF:
object PickleIOExample extends App {
import scala.pickling.Defaults._
import scala.pickling.binary._
import scala.pickling.static._
val tempPath = File.createTempFile("pickling", ".gz").getAbsolutePath
val outputStream = new GZIPOutputStream(new FileOutputStream(tempPath))
val inputStream = new GZIPInputStream(new FileInputStream(tempPath))
val persons = for{
i <- 1 to 100
} yield Person(i.toString, i)
val output = new StreamOutput(outputStream)
persons.foreach(_.pickleTo(output))
outputStream.close()
val personsIt = new Iterator[Person]{
val streamPickle = BinaryPickle(inputStream)
override def hasNext: Boolean = inputStream.available > 0
override def next(): Person = streamPickle.unpickle[Person]
}
println(personsIt.mkString(", "))
inputStream.close()
}
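Two details of GZIPInputStream matter here. Its constructor already reads the gzip header, so it has to be opened after the output has been written and closed. Its available method returns 1 until a read has actually hit the end of the stream, so it cannot be used as a hasNext test. One possible workaround, independent of Scala Pickling, is to frame each record with a length prefix and treat an EOFException on the prefix as the end of the stream. The sketch below uses only the JDK and works on raw byte arrays; GzipFraming, writeAll and readAll are hypothetical names, and plugging in pickled bytes is left as an assumption.

```scala
import java.io._
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

object GzipFraming {
  /** Writes each record as a 4-byte length followed by the payload. */
  def writeAll(file: File, records: Iterator[Array[Byte]]): Unit = {
    val out = new DataOutputStream(new GZIPOutputStream(new FileOutputStream(file)))
    try records.foreach { r => out.writeInt(r.length); out.write(r) }
    finally out.close()
  }

  /** Reads all records back; EOF while reading a length prefix ends the stream. */
  def readAll(file: File): List[Array[Byte]] = {
    // Opened only now, because the constructor reads the gzip header.
    val in = new DataInputStream(new GZIPInputStream(new FileInputStream(file)))
    try {
      val b    = List.newBuilder[Array[Byte]]
      var more = true
      while (more) {
        val len = try in.readInt() catch { case _: EOFException => -1 }
        if (len < 0) more = false
        else {
          val buf = new Array[Byte](len)
          in.readFully(buf) // a truncated payload still raises EOFException
          b += buf
        }
      }
      b.result()
    } finally in.close()
  }
}
```

An alternative with the same effect would be to write the number of records up front, at the cost of knowing the count before writing.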

Chisel: Access to Module Parameters from Tester

How does one access the parameters used to construct a Module from inside the Tester that is testing it?
In the test below I am passing the parameters explicitly both to the Module and to the Tester. I would prefer not to have to pass them to the Tester but instead extract them from the module that was also passed in.
Also I am new to scala/chisel so any tips on bad techniques I'm using would be appreciated :).
import Chisel._
import math.pow
class TestA(dataWidth: Int, arrayLength: Int) extends Module {
val dataType = Bits(INPUT, width = dataWidth)
val arrayType = Vec(gen = dataType, n = arrayLength)
val io = new Bundle {
val i_valid = Bool(INPUT)
val i_data = dataType
val i_array = arrayType
val o_valid = Bool(OUTPUT)
val o_data = dataType.flip
val o_array = arrayType.flip
}
io.o_valid := io.i_valid
io.o_data := io.i_data
io.o_array := io.i_array
}
class TestATests(c: TestA, dataWidth: Int, arrayLength: Int) extends Tester(c) {
val maxData = pow(2, dataWidth).toInt
for (t <- 0 until 16) {
val i_valid = rnd.nextInt(2)
val i_data = rnd.nextInt(maxData)
val i_array = List.fill(arrayLength)(rnd.nextInt(maxData))
poke(c.io.i_valid, i_valid)
poke(c.io.i_data, i_data)
(c.io.i_array, i_array).zipped foreach {
(element,value) => poke(element, value)
}
expect(c.io.o_valid, i_valid)
expect(c.io.o_data, i_data)
(c.io.o_array, i_array).zipped foreach {
(element,value) => expect(element, value)
}
step(1)
}
}
object TestAObject {
def main(args: Array[String]): Unit = {
val tutArgs = args.slice(0, args.length)
val dataWidth = 5
val arrayLength = 6
chiselMainTest(tutArgs, () => Module(
new TestA(dataWidth=dataWidth, arrayLength=arrayLength))){
c => new TestATests(c, dataWidth=dataWidth, arrayLength=arrayLength)
}
}
}
If you make the arguments dataWidth and arrayLength members of TestA you can just reference them. In Scala this can be accomplished by inserting val into the argument list:
class TestA(val dataWidth: Int, val arrayLength: Int) extends Module ...
Then you can reference them from the test as members with c.dataWidth or c.arrayLength
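The effect of val in the parameter list can be seen in plain Scala without Chisel. In the sketch below, Params stands in for the module and ParamsReader for the tester; both names are hypothetical and exist only for illustration.

```scala
// Stand-in for the module: `val` makes the constructor parameters public members.
class Params(val dataWidth: Int, val arrayLength: Int)

// Stand-in for the tester: it receives only the instance and reads the members from it.
class ParamsReader(c: Params) {
  // Same value as pow(2, dataWidth).toInt for widths below 31.
  val maxData: Int = 1 << c.dataWidth
  val zeros: List[Int] = List.fill(c.arrayLength)(0)
}
```

Without val, the parameters would only be visible inside the class body, and c.dataWidth would not compile.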

ProcessBuilder - Start another process / JVM in Scala - HowTo?

I already managed to start another VM in Java.
See ProcessBuilder - Start another process / JVM - HowTo?
For some reason, I can't manage to do the same in Scala.
Here's my code
object NewProcTest {
def main(args :Array[String]) {
println("Main")
// val clazz = classOf[O3]
val clazz = O4.getClass
Proc.spawn(clazz, true)
println("fin")
}
}
object Proc{
def spawn(clazz :Class[_], redirectStream :Boolean) {
val separator = System.getProperty("file.separator")
val classpath = System.getProperty("java.class.path")
val path = System.getProperty("java.home") +
separator + "bin" + separator + "java"
val processBuilder =
new ProcessBuilder(path, "-cp",
classpath,
clazz.getCanonicalName())
processBuilder.redirectErrorStream(redirectStream)
val process = processBuilder.start()
process.waitFor()
System.out.println("Fin")
}
}
I've tried to define the main in an object and in class. Both within the same .scala file or within a separate one.
What am I doing wrong?
The issue seems to be that the class name for an object has a '$' suffix.
If you strip off that suffix, the Java invocation line triggered from ProcessBuilder works.
I've hacked something together below to show a couple of test cases. I'm not sure yet why this is the case, but at least it provides a workaround.
import java.io.{InputStreamReader, BufferedReader}
import System.{getProperty => Prop}
object O3 {def main(args: Array[String]) {println("hello from O3")}}
package package1 {
object O4 {def main(args: Array[String]) {println("hello from O4")}}
}
object NewProcTest {
val className1 = O3.getClass().getCanonicalName().dropRight(1)
val className2 = package1.O4.getClass().getCanonicalName().dropRight(1)
val sep = Prop("file.separator")
val classpath = Prop("java.class.path")
val path = Prop("java.home")+sep+"bin"+sep+"java"
println("className1 = " + className1)
println("className2 = " + className2)
def spawn(className: String,
redirectStream: Boolean) {
val processBuilder = new ProcessBuilder(path, "-cp", classpath, className)
val pbcmd = processBuilder.command().toString()
println("processBuilder = " + pbcmd)
processBuilder.redirectErrorStream(redirectStream)
val process = processBuilder.start()
val reader = new BufferedReader(new InputStreamReader(process.getInputStream()))
println(reader.readLine())
reader.close()
process.waitFor()
}
def main(args :Array[String]) {
println("start")
spawn(className1, false)
spawn(className2, false)
println("end")
}
}
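As background on the '$' suffix: the Scala compiler emits a top-level object O3 as two classes, O3$ (the singleton instance, which is what getClass returns) and O3 (which carries static forwarders, including main). The java launcher needs the latter. A small helper can strip the suffix instead of calling dropRight(1) blindly; MainClassName is a hypothetical name, and the sketch assumes a top-level object.

```scala
object MainClassName {
  /** Name to pass to the `java` launcher for a top-level Scala `object` with a `main` method. */
  def of(obj: AnyRef): String = obj.getClass.getName.stripSuffix("$")
}
```

Unlike dropRight(1), stripSuffix leaves the name untouched when it is given an instance of an ordinary class, whose name has no trailing '$'.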