Syntactic sugar for compile-time object creation in Scala

Let's say I have
trait fooTrait[T] {
def fooFn(x: T, y: T) : T
}
I want to enable users to quickly declare new instances of fooTrait with their own defined bodies for fooFn. Ideally, I'd want something like
val myFoo : fooTrait[T] = newFoo((x:T, y:T) => x+y)
to work. However, I can't just do
def newFoo[T](f: (T, T) => T) = new fooTrait[T] { def fooFn(x:T, y:T):T = f(x,y) }
because this uses closures, and so results in different objects when the program is run multiple times. What I really need is to be able to get the classOf of the object returned by newFoo and then have that be constructable on a different machine. What do I do?
If you're interested in the use case, I'm trying to write a Scala wrapper for Hadoop that allows you to execute
IO("Data") --> ((x: Int, y: Int) => (x, x+y)) --> IO("Out")
The thing in the middle needs to be turned into a class that implements a particular interface and can then be instantiated on different machines (executing the same jar file) from just the class name.
Note that Scala does the right thing with the syntactic sugar that converts (x:Int) => x+5 to an instance of Function1. My question is whether I can replicate this without hacking the Scala internals. If this was lisp (as I'm used to), this would be a trivial compile-time macro ... :sniff:

Here's a version that matches the syntax of what you list in the question and serializes/executes the anon-function. Note that this serializes the state of the Function2 object so that the serialized version can be restored on another machine. Just the classname is insufficient, as illustrated below the solution.
You should write your own encode/decode functions, if only to plug in your own Base64 implementation (so that you do not rely on Sun's HotSpot).
object SHadoopImports {
import java.io._
implicit def functionToFooString[T](f:(T,T)=>T) = {
val baos = new ByteArrayOutputStream()
val oo = new ObjectOutputStream(baos)
oo.writeObject(f)
new sun.misc.BASE64Encoder().encode(baos.toByteArray())
}
implicit def stringToFun(s: String) = {
val decoder = new sun.misc.BASE64Decoder();
val bais = new ByteArrayInputStream(decoder.decodeBuffer(s))
val oi = new ObjectInputStream(bais)
val f = oi.readObject()
new {
def fun[T](x:T, y:T): T = f.asInstanceOf[Function2[T,T,T]](x,y)
}
}
}
// I don't really know what this is supposed to do
// just supporting the given syntax
case class IO(src: String) {
import SHadoopImports._
def -->(s: String) = new {
def -->(to: IO) = {
val IO(snk) = to
println("From: " + src)
println("Applying (4,5): " + s.fun(4,5))
println("To: " + snk)
}
}
}
object App extends Application {
import SHadoopImports._
IO("MySource") --> ((x:Int,y:Int)=>x+y) --> IO("MySink")
println
IO("Here") --> ((x:Int,y:Int)=>x*y+y) --> IO("There")
}
/*
From: MySource
Applying (4,5): 9
To: MySink
From: Here
Applying (4,5): 25
To: There
*/
To convince yourself that the classname is insufficient to use the function on another machine, consider the code below which creates 100 different functions. Count the classes on the filesystem and compare.
object App extends Application {
import SHadoopImports._
for (i <- 1 to 100) {
IO(i + ": source") --> ((x:Int,y:Int)=>(x*i)+y) --> IO("sink")
}
}
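The point about captured state can be checked in isolation. The following is a minimal sketch, assuming Java 8+ (for java.util.Base64, used here in place of the sun.misc classes) and a Scala version whose function literals are serializable; FunctionCodec, CodecDemo, scaled and roundTrip are hypothetical names that do not appear in the answer above.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.Base64

object FunctionCodec {
  // Serialize the function object (captured values included) to Base64 text.
  def encode(f: (Int, Int) => Int): String = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(f) finally out.close()
    Base64.getEncoder.encodeToString(bytes.toByteArray)
  }

  // Rebuild the function from the Base64 text produced by encode.
  def decode(s: String): (Int, Int) => Int = {
    val in = new ObjectInputStream(new ByteArrayInputStream(Base64.getDecoder.decode(s)))
    try in.readObject().asInstanceOf[(Int, Int) => Int] finally in.close()
  }
}

object CodecDemo {
  // Every closure returned here shares one compiled lambda body;
  // only the captured value of i differs between them.
  def scaled(i: Int): (Int, Int) => Int = (x, y) => x * i + y

  def roundTrip(i: Int): (Int, Int) => Int =
    FunctionCodec.decode(FunctionCodec.encode(scaled(i)))
}
```

Here CodecDemo.roundTrip(2) and CodecDemo.roundTrip(3) behave differently after decoding even though they come from the same code, which is why a class name alone cannot identify the function.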

Quick suggestion: why don't you try creating an implicit def that transforms a FunctionN object into the trait expected by the --> method?
I do hope you won't have to use any macro for this!
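A minimal sketch of that suggestion, using the fooTrait from the question. FooConversions, functionToFoo and FooDemo are hypothetical names. Note that the resulting object still wraps a closure, so this only addresses the syntax, not the question of recreating the object from a class name on another machine.

```scala
import scala.language.implicitConversions

trait fooTrait[T] { def fooFn(x: T, y: T): T }

object FooConversions {
  // Wrap any two-argument function in an anonymous fooTrait instance.
  implicit def functionToFoo[T](f: (T, T) => T): fooTrait[T] =
    new fooTrait[T] { def fooFn(x: T, y: T): T = f(x, y) }
}

object FooDemo {
  import FooConversions._
  val plus: (Int, Int) => Int = _ + _
  // The expected type fooTrait[Int] triggers the implicit conversion.
  val myFoo: fooTrait[Int] = plus
}
```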

Related

Is it possible to call a scala macro from generic scala code?

I'm trying to use Scala macros to convert untyped, Map[String, Any]-like expressions to their corresponding typed case class expressions.
The following scala macro (almost) gets the job done:
trait ToTyped[+T] {
def apply(term: Any): T
}
object TypeConversions {
// At compile-time, "type-check" an untyped expression and convert it to
// its appropriate typed value.
def toTyped[T]: ToTyped[T] = macro toTypedImpl[T]
def toTypedImpl[T: c.WeakTypeTag](c: Context): c.Expr[ToTyped[T]] = {
import c.universe._
val tpe = weakTypeOf[T]
if (tpe <:< typeOf[Int] || tpe <:< typeOf[String]) {
c.Expr[ToTyped[T]](
q"""new ToTyped[$tpe] {
def apply(term: Any): $tpe = term.asInstanceOf[$tpe]
}""")
} else {
val companion = tpe.typeSymbol.companion
val maybeConstructor = tpe.decls.collectFirst {
case m: MethodSymbol if m.isPrimaryConstructor => m
}
val constructorFields = maybeConstructor.get.paramLists.head
val subASTs = constructorFields.map { field =>
val fieldName = field.asTerm.name
val fieldDecodedName = fieldName.toString
val fieldType = tpe.decl(fieldName).typeSignature
q"""
val subTerm = term.asInstanceOf[Map[String, Any]]($fieldDecodedName)
TypeConversions.toTyped[$fieldType](subTerm)
"""
}
c.Expr[ToTyped[T]](
q"""new ToTyped[$tpe] {
def apply(term: Any): $tpe = $companion(..$subASTs)
}""")
}
}
}
Using the above toTyped function, I can convert for example an untyped person value to its corresponding typed Person case class:
object TypeConversionTests {
case class Person(name: String, age: Int, address: Address)
case class Address(street: String, num: Int, zip: Int)
val untypedPerson = Map(
"name" -> "Max",
"age" -> 27,
"address" -> Map("street" -> "Palm Street", "num" -> 7, "zip" -> 12345))
val typedPerson = TypeConversions.toTyped[Person](untypedPerson)
typedPerson shouldEqual Person("Max", 27, Address("Palm Street", 7, 12345))
}
However, my problem arises when trying to use the toTyped macro from above in generic scala code. Suppose I have a generic function indirection that uses the toTyped macro:
object CanIUseScalaMacrosAndGenerics {
def indirection[T](value: Any): T = TypeConversions.toTyped[T](value)
import TypeConversionTests._
val indirectlyTyped = indirection[Person](untypedPerson)
indirectlyTyped shouldEqual Person("Max", 27, Address("Palm Street", 7, 12345))
}
Here, I get a compile-time error from the toTyped macro complaining that the type T is not yet instantiated with a concrete type. I think the reason for the error is that from the perspective of toTyped inside indirection, the type T is still generic and not inferred to be Person just yet. And therefore the macro cannot build the corresponding Person case class when called via indirection. However, from the perspective of the call-site indirection[Person](untypedPerson), we have T == Person, so I wonder if there is a way to obtain the instantiated type of T (i.e., Person) inside the macro toTyped.
Put differently: Can I combine the Scala macro toTyped with the generic function indirection and yet be able to figure out the instantiated type for type parameter T inside the toTyped macro? Or am I on a hopeless track here and there is no way to combine Scala macros and generics like this? In the latter case I would like to know if the only solution here is to push the macro usage so far "out" that I can call it instantiated as toTyped[Person] rather than as toTyped[T].
Any insights are very much appreciated. Thank you! :-)
Macros need to be expanded. Every time you use a function whose body is a macro, Scala has to generate the code and put it at the call site. As you suspect, this is very specific and contradicts the idea of parametric polymorphism, where you write code independent of specific knowledge about your type.
Type classes are one solution to the general problem of wanting one generic (parametric) definition plus multiple per-type implementations of certain parts of your algorithm. You basically define something you could consider an interface, which (most likely) needs to follow some contract (in OOP terminology), and pass this interface as an argument:
// example
trait SpecificPerType[T] {
def doSomethingSpecific(t: T): String
}
val specificForString: SpecificPerType[String] = new SpecificPerType[String] {
def doSomethingSpecific(t: String): String = s"MyString: $t"
}
val specificForInt: SpecificPerType[Int] = new SpecificPerType[Int] {
def doSomethingSpecific(t: Int): String = s"MyInt: $t"
}
def genericAlgorithm[T](values: List[T])(specific: SpecificPerType[T]): String =
values.map(specific.doSomethingSpecific).mkString("\n")
genericAlgorithm(List(1,2,3))(specificForInt)
genericAlgorithm(List("a","b","c"))(specificForString)
As you can see, it would be pretty annoying to pass this specific part around, which is one of the reasons implicits were introduced.
So you could write it using implicits like this:
implicit val specificForString: SpecificPerType[String] = new SpecificPerType[String] {
def doSomethingSpecific(t: String): String = s"MyString: $t"
}
implicit val specificForInt: SpecificPerType[Int] = new SpecificPerType[Int] {
def doSomethingSpecific(t: Int): String = s"MyInt: $t"
}
def genericAlgorithm[T](values: List[T])(implicit specific: SpecificPerType[T]): String =
values.map(specific.doSomethingSpecific).mkString("\n")
/* for implicits with one type parameter there exists a special syntax
(a context bound) that expresses them as if they were type constraints, e.g.:
def genericAlgorithm[T: SpecificPerType](values: List[T]): String =
values.map(implicitly[SpecificPerType[T]].doSomethingSpecific).mkString("\n")
implicitly[SpecificPerType[T]] summons the implicit so that you can access it
by type rather than by its variable's name
*/
genericAlgorithm(List(1,2,3)) // finds specificForInt using its type
genericAlgorithm(List("a","b","c")) // finds specificForString using its type
If you generate that trait implementation using macro, you will be able to have a generic algorithm e.g.:
implicit def generate[T]: SpecificPerType[T] =
macro SpecificPerTypeMacros.impl // assuming that you defined this macro there
As far as I can tell, this (extracting macros into type classes) is one of the common patterns for generating some code with macros while still building logic on top of it using normal, parametric code.
(Just to be clear: I do not claim that the role of type classes is limited as the carriers of macro generated code).
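Applied to the question, the pattern could look like the sketch below. It makes two assumptions: the ToTyped instances are written by hand where the macro would generate them, and ToTyped is declared invariant for simplicity. The names instance, field and Generic are hypothetical helpers. The generic indirection function compiles because the concrete instance is resolved at the call site, where T is already known.

```scala
trait ToTyped[T] { def apply(term: Any): T }

object ToTyped {
  // Small helper so instances can be written as plain functions.
  def instance[T](f: Any => T): ToTyped[T] =
    new ToTyped[T] { def apply(term: Any): T = f(term) }

  implicit val intToTyped: ToTyped[Int] = instance(_.asInstanceOf[Int])
  implicit val stringToTyped: ToTyped[String] = instance(_.asInstanceOf[String])

  // Look up one field of an untyped map and convert it with that field's instance.
  def field[T](term: Any, name: String)(implicit ev: ToTyped[T]): T =
    ev(term.asInstanceOf[Map[String, Any]](name))
}

case class Address(street: String, num: Int, zip: Int)

object Address {
  import ToTyped._
  // Hand-written stand-in for what the macro would generate for Address.
  implicit val addressToTyped: ToTyped[Address] = instance { term =>
    Address(field[String](term, "street"), field[Int](term, "num"), field[Int](term, "zip"))
  }
}

object Generic {
  // T stays abstract here; the instance is supplied by the caller's scope.
  def indirection[T: ToTyped](value: Any): T = implicitly[ToTyped[T]].apply(value)
}
```

With a macro-backed implicit def in place of the hand-written instances, the call Generic.indirection[Address](untyped) would expand the macro at the call site, where T is already Address.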

How can I make a function generic on an MLReader

I am working in Spark 1.6.3. Here are two functions that do the same thing:
def modelFromBytesCV(modelArray: Array[Byte]): CountVectorizerModel = {
val tempPath: Path = KAZOO_TEMP_DIR.resolve(s"model_${System.currentTimeMillis()}")
Files.write(tempPath, modelArray)
CountVectorizerModel.read.load(tempPath.toString)
}
def modelFromBytesIDF(modelArray: Array[Byte]): IDFModel = {
val tempPath: Path = KAZOO_TEMP_DIR.resolve(s"model_${System.currentTimeMillis()}")
Files.write(tempPath, modelArray)
IDFModel.read.load(tempPath.toString)
}
I would like to make these functions generic. What I am hung up on is that the common trait between the CountVectorizerModel object and IDFModel is MLReadable[T] which itself must take as a type either CountVectorizerModel or IDFModel. This is sort of a recursive parent class loop that I can't figure out a solution to.
By comparison, the generic model writer is easy, because MLWritable is a common trait extended by all the models I am interested in:
def modelToBytes[M <: MLWritable](model: M): Array[Byte] = {
val tempPath: Path = KAZOO_TEMP_DIR.resolve(s"model_${System.currentTimeMillis()}")
model.write.overwrite().save(tempPath.toString)
Files.readAllBytes(tempPath)
}
How can I make a generic reader that will turn a byte array into a spark-ml model?
To make it work you'll need access to a specific MLReadable object.
import org.apache.spark.ml.util.MLReadable
def modelFromBytes[M](obj: MLReadable[M], modelArray: Array[Byte]): M = {
val tempPath: Path = ???
...
obj.read.load(tempPath.toString)
}
which could be later used as:
val bytes: Array[Byte] = ???
modelFromBytes(CountVectorizerModel, bytes)
Note that, despite first appearances, there is nothing recursive here: MLReadable[M] refers to the companion object, not the class as such. So, for example, the CountVectorizerModel object is MLReadable, while the CountVectorizerModel class isn't.
Internally, Spark MLReader handles this in a different way - it creates an instance of the class using reflection, and then sets its Params. However this path won't be very useful for you here*.
If compatibility with the current API is required, you can try making readable object implicit:
def modelFromBytes[M](modelArray: Array[Byte])(implicit obj: MLReadable[M]): M = {
...
}
and then
implicit val readable: MLReadable[CountVectorizerModel] = CountVectorizerModel
modelFromBytes[CountVectorizerModel](bytes)
* Technically speaking, it is possible to get the companion object via reflection:
import scala.reflect.ClassTag
def modelFromBytesCV[M <: MLWritable](
modelArray: Array[Byte])(implicit ct: ClassTag[M]): M = {
val tempPath: Path = ???
...
val cls = Class.forName(ct.runtimeClass.getName + "$");
cls.getField("MODULE$").get(cls).asInstanceOf[MLReadable[M]]
.read.load(tempPath.toString)
}
but I don't think that is a path worth exploring here. In particular, we cannot really provide strict type bounds here: using MLWritable is a hack to limit human errors, but it is rather useless for the compiler.
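The companion-object pattern described in this answer can be tried without Spark on the classpath. The sketch below uses hypothetical stand-in types (SimpleReader, SimpleReadable, ToyModel, Loader) that only mimic the shape of the relationship; they are not Spark's API.

```scala
// Stand-ins for the reader / readable pair; NOT Spark classes.
trait SimpleReader[M] { def load(path: String): M }
trait SimpleReadable[M] { def read: SimpleReader[M] }

// The class itself is not readable ...
class ToyModel(val path: String)

// ... its companion object is, and it is parameterised by the class.
object ToyModel extends SimpleReadable[ToyModel] {
  def read: SimpleReader[ToyModel] = new SimpleReader[ToyModel] {
    def load(path: String): ToyModel = new ToyModel(path)
  }
}

object Loader {
  // Explicit variant: the caller passes the companion object.
  def modelFrom[M](obj: SimpleReadable[M], path: String): M = obj.read.load(path)

  // Implicit variant: the companion is looked up by type.
  def modelFromImplicit[M](path: String)(implicit obj: SimpleReadable[M]): M =
    obj.read.load(path)
}

object LoaderDemo {
  implicit val toyReadable: SimpleReadable[ToyModel] = ToyModel
  val explicitModel: ToyModel = Loader.modelFrom(ToyModel, "/tmp/a")
  val implicitModel: ToyModel = Loader.modelFromImplicit[ToyModel]("/tmp/b")
}
```

M is inferred from the companion object in the explicit variant, so the call site never has to spell out the model type.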

How can I create a DSL involving "blocks" where certain functions are in-scope?

Overview
I have a Kotlin-based project that defines a DSL, but for reasons given below I'm now investigating whether it would be better to write my project in Scala. As Scala doesn't seem to lend itself to creating DSLs with as much ease as in Kotlin, I'm not entirely sure how I'd recreate the same DSL in Scala.
Before this gets flagged as a duplicate of this question, I've looked at that but my DSL requirements are somewhat different and I haven't been able to figure out a solution from that.
Details
I'm trying to create a flow-based programming system for developing automated vehicle part test procedures, and for the past couple of weeks I've been testing out an implementation of this in Kotlin, since it seems to support a lot of features that are really nice for creating FBP systems (native coroutine support, easy creation of DSLs using type-safe builders, etc.).
As awesome as Kotlin is though, I'm starting to realise that it would help a lot if the implementation language for the FBP system was more functional, since FBP systems seem to share a lot in common with functional languages. In particular, being able to define and consume typeclasses would be really useful for a project like this.
In Kotlin, I've created a DSL representing the "glue" language between nodes in a flow-based system. For example, given the existence of two blackbox processes Add and Square, I can define a "composite" node that squares the sum of two numbers:
@CompositeNode
private fun CompositeOutputtingScalar<Int>.addAndSquare(x: Int, y: Int) {
val add = create<Add>()
val square = create<Square>()
connect {
input(x) to add.x
input(y) to add.y
add.output to square.input
square.output to output
}
}
The idea is that connect is a function that takes a lambda of form ConnectionContext.() -> Unit, where ConnectionContext defines various overloads of an infix function to (shadowing the built-in to function in the Kotlin stdlib) allowing me to define the connections between these processes (or nodes).
This is my attempt to do something similar in Scala:
class OutputPort[-A] {
def connectTo(inputPort: InputPort[A]) {}
}
class InputPort[+A]
object connect {
val connections = new ListBuffer[Connection[_]]()
case class Connection[A](outputPort: OutputPort[A], inputPort: InputPort[A])
class ConnectionTracker() {
def track[A](connection: Connection[A]) {}
}
// Cannot make `OutputPort.connectTo` directly return a `Connection[A]`
// without sacrificing covariance, so make an implicit wrapper class
// that does this instead
implicit class ExtendedPort[A](outputPort: OutputPort[A]) {
def |>(inputPort: InputPort[A]): Unit = {
outputPort connectTo inputPort
connections += Connection(outputPort, inputPort)
}
}
}
def someCompositeFunction() {
val output = new OutputPort[Int]
val input = new InputPort[Int]
output |> input // Should not be valid here
connect {
output |> input // Should be valid here
}
}
Right now this won't compile because ExtendedPort isn't in scope. I can bring it into scope by doing:
import connect._
connect {
output |> input // Should be valid here
}
However, it's undesirable to have to do this within the node definition.
To summarise, how can I recreate the DSL I've made in Kotlin within Scala? For reference, this is how I've defined my Kotlin DSL:
interface Composite {
fun <U : ExecutableNode> create(id: String? = null): U
fun connect(apply: ConnectionContext.() -> Unit)
class ConnectionContext {
val constants = mutableListOf<Constant<*>>()
fun <T> input(parameter: T): OutputPort<T> = error("Should not actually be invoked after annotation processing")
fun <T> input(parameterPort: OutputPort<T>) = parameterPort
fun <T> constant(value: T) = Constant(value.toString(), value)
infix fun <U, V> U.to(input: InputPort<V>): Nothing = error("Cannot connect value to specified input")
infix fun <U> OutputPort<U>.to(input: InputPort<U>) = this join input
infix fun <T, U> T.to(other: U): Nothing = error("Invalid connection")
}
}
interface CompositeOutputtingScalar<T> : Composite {
val output: InputPort<T>
}
interface CompositeOutputtingCluster<T : Cluster> : Composite {
fun <TProperty> output(output: T.() -> TProperty): InputPort<TProperty>
}
Just turning on the |> operator is pretty straightforward in Scala if you put the implicit class in the companion object; it is then always available wherever an OutputPort is used:
class OutputPort[-A] {
def connectTo(inputPort: InputPort[A]):Unit = {}
}
class InputPort[+A]
object OutputPort{
implicit class ConnectablePort[A](outputPort: OutputPort[A]) {
def |>(inputPort: InputPort[A]): Unit = outputPort connectTo inputPort
}
}
def someCompositeFunction() {
val output = new OutputPort[Int]
val input = new InputPort[Int]
output |> input // Should be valid here
}
Judiciously deciding where to place imports is a core Scala concept. Importing is how we turn on implicits in our code. The following pattern is very common, since it is also how we turn on our type classes:
class OutputPort[-A] {
def connectTo(inputPort: InputPort[A]): Unit = {}
}
class InputPort[+A]
object Converter {
implicit class ConnectablePort[A](outputPort: OutputPort[A]) {
def |>(inputPort: InputPort[A]): Unit = outputPort connectTo inputPort
}
}
def someCompositeFunction() {
val output = new OutputPort[Int]
val input = new InputPort[Int]
import Converter._
output |> input // Should be valid here
}
Now, I think this is what you are looking for. Some import and implicit setup is still needed, but this encloses the implicit behavior in a block:
class OutputPort[-A] {
def connectTo(inputPort: InputPort[A]): Unit = {}
}
class InputPort[+A]
object Converter {
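// The class must stay visible because it appears in convert's signature;
// only its constructor is restricted to Converter.
class ConnectablePort[A] private[Converter] (outputPort: OutputPort[A]) {
def |>(inputPort: InputPort[A]): Unit = outputPort connectTo inputPort
}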
def convert[A](f: (OutputPort[A] => ConnectablePort[A]) => Unit): Unit = {
def connectablePortWrapper(x: OutputPort[A]): ConnectablePort[A] = new ConnectablePort[A](x)
f(connectablePortWrapper _)
}
}
object MyRunner extends App {
val output = new OutputPort[Int]
val input = new InputPort[Int]
import Converter.convert
//output |> input won't work
convert[Int] { implicit wrapper =>
output |> input // Should be valid here
}
}
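Another way to approximate the Kotlin receiver-lambda style is to pass a context object into the block and import its members there. This is a sketch under assumptions: ConnectionContext, Connectable, Dsl and DslDemo are hypothetical names, the ports are simplified to invariant classes, and one import line inside the block is still needed.

```scala
import scala.collection.mutable.ListBuffer

class OutputPort[A]
class InputPort[A]
final case class Connection[A](from: OutputPort[A], to: InputPort[A])

// Holds the operator and the connections recorded by one connect block.
class ConnectionContext {
  val connections: ListBuffer[Connection[_]] = ListBuffer.empty

  implicit class Connectable[A](out: OutputPort[A]) {
    def |>(in: InputPort[A]): Unit = {
      connections += Connection(out, in)
      ()
    }
  }
}

object Dsl {
  // Runs the block against a fresh context and returns what it recorded.
  def connect(body: ConnectionContext => Unit): List[Connection[_]] = {
    val ctx = new ConnectionContext
    body(ctx)
    ctx.connections.toList
  }
}

object DslDemo {
  val output = new OutputPort[Int]
  val input = new InputPort[Int]

  // output |> input does not compile out here: Connectable is not in scope.
  val recorded: List[Connection[_]] = Dsl.connect { ctx =>
    import ctx._
    output |> input
  }
}
```

Because the implicit class lives inside the context instance, each connect block also gets its own connection list instead of sharing a global one.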

Using Scala reflection to compile and load object

Given the following String:
"println(\"Hello\")"
It is possible to use reflection to evaluate the code, as follows.
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox
object Eval {
def apply[A](string: String): A = {
val toolbox = currentMirror.mkToolBox()
val tree = toolbox.parse(string)
toolbox.eval(tree).asInstanceOf[A]
}
}
However, let's say that the string contains an object with a function definition, such as:
"""object MyObj { def getX="X"}"""
Is there a way to use Scala reflection to compile the string, load it and run the function? What I have tried has not worked; if anyone has some example code, it would be much appreciated.
It depends on how strictly you define the acceptable input string. Should the object always be called MyObj? Should the method always be called getX? Should it always be 1 method or can it be multiple?
For the more general case you could try to extract all method names from the AST and generate calls to each one. The following code will call every method (and returns the result of the last one) that takes 0 arguments and is not a constructor, in some object, not taking inheritance into account:
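import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox
def eval(string: String): Any = {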
val toolbox = currentMirror.mkToolBox()
val tree = toolbox.parse(string)
//oName is the name of the object
//defs is the list of all method definitions
val ModuleDef(_,oName,Template(_,_,defs)) = tree
//only generate calls for non-constructor, zero-arg methods
val defCalls = defs.collect{
case DefDef(_,name,_,params,_,_)
if name != termNames.CONSTRUCTOR && params.flatten.isEmpty => q"$oName.$name"
}
//put the method calls after the object definition
val block = tree :: defCalls
toolbox.eval(q"..$block")
}
And using it:
scala> eval("""object MyObj { def bar() = println("bar"); def foo(a: String) = println(a); def getX = "x" }""")
bar
res60: Any = x

Is there an easy way to chain java setters that are void instead of return this

I have a bunch of auto-generated java code that I will be calling in scala. Currently all of the objects were generated with void setters instead of returning this which makes it really annoying when you need to set a bunch of values (I'm not going to use the constructor by initializing everything since there's like 50 fields). For example:
val o = new Obj()
o.setA("a")
o.setB("b")
o.setC("c")
It would be really cool if I could do something like this
val o = with(new Obj()) {
_.setA("a")
_.setB("b")
_.setC("c")
}
I can't use andThen with anon functions since they require objects to be returned. Am I stuck with the current way I'm doing things, or is there some magic I'm not aware of?
Sure, you can use tap (the Kestrel combinator), which you presently have to define yourself:
implicit class Tapper[A](val a: A) extends AnyVal {
def tap[B](f: A => B): A = { f(a); a }
def taps[B](fs: A => B*): A = { fs.map(_(a)); a }
}
It works like so:
scala> "salmon".taps(
| println,
| println
| )
salmon
salmon
res2: String = salmon
Note also
val myFavoriteObject = {
val x = new Obj
x.setA("a")
x // return the object itself; setA returns Unit
}
will allow you to use a short name to do all the setting while assigning to a more meaningful name for longer-term use.
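On Scala 2.13 and later, a tap method is also available from the standard library through scala.util.chaining, so it no longer has to be defined by hand. A minimal sketch, in which Obj is a hypothetical stand-in for one of the generated Java classes with void setters:

```scala
import scala.util.chaining._

// Hypothetical stand-in for a generated Java bean with void setters.
class Obj {
  private var a: String = ""
  private var b: String = ""
  def setA(v: String): Unit = { a = v }
  def setB(v: String): Unit = { b = v }
  def getA: String = a
  def getB: String = b
}

object TapDemo {
  // tap runs the block for its side effects and returns the original object.
  val o: Obj = new Obj().tap { x =>
    x.setA("a")
    x.setB("b")
  }
}
```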
You can use an implicit converter from/to a wrapper class that allows chaining.
Something like:
case class ObjWrapper(o: Obj) {
def setA(a: String) = { o.setA(a); this }
def setB(b: String) = { o.setB(b); this }
def setC(c: String) = { o.setC(c); this }
}
implicit def wrapped2Obj(ow: ObjWrapper): Obj = ow.o
ObjWrapper(myObj).setA("a").setB("b").setC("c")
Actually, you don't even need the implicit converter here, since those methods have already been called on myObj.
Take a look at Scalaxy/Beans. Note however that it's using macros, so it should be considered experimental.