Using callUDF to create a method that chains UDF calls - scala

I monkey patched the org.apache.spark.sql.Column class to add a chainUDF method. It works well for udfs that don't take arguments and I need help to make it generic for udfs that take arguments.
Here's the current chainUDF method definition.
object ColumnExt {
implicit class ColumnMethods(c: Column) {
def chainUDF(udfName: String): Column = {
callUDF(udfName, c)
}
}
}
Here's the chainUDF method in action.
def appendZ(s: String): String = {
s"${s}Z"
}
spark.udf.register("appendZUdf", appendZ _)
def prependA(s: String): String = {
s"A${s}"
}
spark.udf.register("prependAUdf", prependA _)
val hobbiesDf = Seq(
("dance"),
("sing")
).toDF("word")
val actualDf = hobbiesDf.withColumn(
"fun",
col("word").chainUDF("appendZUdf").chainUDF("prependAUdf")
)
I'd like to update the chainUDF method definition so it takes an optional list of Column arguments. Something like this:
def appendWord(s: String, word: String): String = {
s"${s}${word}"
}
spark.udf.register("appendWordUdf", appendWord _)
val hobbiesDf = Seq(
("dance"),
("sing")
).toDF("word")
val actualDf = hobbiesDf.withColumn(
"fun",
col("word").chainUDF("appendZUdf").chainUDF("appendWordUdf", lit("cool"))
)
I think we'll need to update the chainUDF method definition to something like this:
object ColumnExt {
implicit class ColumnMethods(c: Column) {
def chainUDF(udfName: String, cols: Column* = some_default_value): Column = {
callUDF(udfName, c + cols)
}
}
}
I'm sure there is some Scala magic trick to make this happen.

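The signature is:
def callUDF(udfName: String, cols: Column*): Column
so you don't need magic. A varargs parameter cannot have a default value, and it doesn't need one, because callers can simply leave it out. Drop the default, prepend c to cols, and expand the result with : _*
def chainUDF(udfName: String, cols: Column*): Column = {
callUDF(udfName, c +: cols: _*)
}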
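
To see the prepend-and-expand idiom in isolation, here is a minimal sketch that runs without Spark. It uses String where the real method uses Column, and the names VarargsSketch, call, Chainable and chain are made up for illustration; they are not part of Spark.

```scala
object VarargsSketch {
  // Stand-in for callUDF: takes a name plus any number of arguments.
  def call(name: String, args: String*): String =
    s"$name(${args.mkString(", ")})"

  // Hypothetical extension method mirroring chainUDF, on String instead of Column.
  implicit class Chainable(s: String) {
    def chain(name: String, extra: String*): String =
      call(name, s +: extra: _*) // prepend the receiver, then expand the Seq into varargs
  }
}
```

With VarargsSketch._ imported, "word".chain("f") passes no extra arguments and "word".chain("f").chain("g", "cool") passes one, which is the optional-arguments behaviour the question asks for.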

Related

How to Reflect Construction Method Based on Method Signature In Scala

class Test(a: String, b: Array[String], c: Array[String]){
def this(b: Array[String], c: Array[String]) {
this("1", b, c)
}
def this() = {
this(null, null, null)
}
}
I have a class Test like the one above, and I would like to use Scala reflection to invoke one of its constructors.
I tried the following code:
import scala.reflect.runtime.{universe => ru}
val clsTest = ru.typeOf[Test].typeSymbol.asClass
val m = ru.runtimeMirror(getClass.getClassLoader)
val cm = m.reflectClass(clsTest)
val ctor = ru.typeOf[Test].decl(ru.termNames.CONSTRUCTOR).asTerm.alternatives.map(_.asMethod)
but I don't know how to select a constructor based on its signature.
Is there a way to select the method by its type signature, as in Java reflection code? Thanks!
I have read the Scala documentation on reflection, but it doesn't solve my problem: its example has only one constructor.
I have found an approach that filters by parameter types:
methodName: the name of the method you want to reflect
allScope: true searches this class and its superclasses, false searches only this class
types: the parameter types; you can pass Seq(typeOf[T1], typeOf[T2]): _*
x: the instance to reflect on
The key is that we can get a method's parameter types with methodSymbol.paramLists.head.map(_.info)
val ru: scala.reflect.runtime.universe.type = scala.reflect.runtime.universe
def m: ru.Mirror = {
ru.runtimeMirror(Thread.currentThread().getContextClassLoader)
}
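// Reflect a method by name and parameter types. Curried methods (several parameter lists) are not supported.
// Needs import scala.reflect.ClassTag; getTypeTag and getRuntimeClass are helpers that are not shown in this post.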
def reflectMethod[T: ru.TypeTag : ClassTag](methodName: String, allScope: Boolean, types: ru.Type*)(x: T): ru.MethodMirror = {
val instanceMirror = m.reflect(x)
val methodSymbols = if (allScope) {
val members = getTypeTag(x).tpe.member(ru.TermName(methodName))
if (members.equals(ru.NoSymbol)) {
throw new NoSuchMethodException(noSuchMethodException(methodName, allScope, types: _*)(x))
}
members
.asTerm
.alternatives
.map(_.asMethod)
} else {
val decls = getTypeTag(x).tpe.decl(ru.TermName(methodName))
if (decls.equals(ru.NoSymbol)) {
throw new NoSuchMethodException(noSuchMethodException(methodName, allScope, types: _*)(x))
}
decls
.asTerm
.alternatives
.map(_.asMethod)
}
methodSymbols.foreach(item => assert(item.paramLists.size < 2, "curried methods are not supported yet"))
val methodSymbol = methodSymbols.find { item =>
// a method declared without a parameter list has no head, so treat it as taking no parameters
val paramTypes = item.paramLists.headOption.getOrElse(Nil).map(_.info)
// compare lengths first: zip alone would silently truncate the longer list
paramTypes.size == types.size && paramTypes.zip(types).forall { case (actual, expected) => actual =:= expected }
}.getOrElse(throw new NoSuchMethodException(noSuchMethodException(methodName, allScope, types: _*)(x)))
val methodMirror = instanceMirror.reflectMethod(methodSymbol)
methodMirror
}
private def noSuchMethodException[T: ru.TypeTag : ClassTag](methodName: String, allScope: Boolean, types: ru.Type*)(x: T): String = {
s"no such method: $methodName, allScope: $allScope type: $types in ${getRuntimeClass(x)}"
}
// paramLists is List(List(a, b, c))
// and this is the primary constructor
class Test(a: String, b: Array[String], c: Array[String]) {
// paramLists is List(List(b, c))
def this(b: Array[String], c: Array[String]) = {
this("1", b, c)
}
// paramLists is List(List())
def this() = {
this(null, null, null)
}
// paramLists is List(List(a), List(b, c))
def this(a: String)(b: String, c: String) = {
this(null, null, null)
}
}
import scala.reflect.runtime.universe._
val constructor = typeOf[Test].members
// keep only the constructors
.filter(e => e.isConstructor).map(e => e.asMethod)
// find the one you want
.find { ctor =>
val methodParamsType = ctor.paramLists.head.map(p => p.typeSignature)
// the parameter types you are looking for
val expectParamsType = List(typeOf[Array[String]], typeOf[Array[String]])
methodParamsType.length == expectParamsType.length &&
methodParamsType.zip(expectParamsType).forall { case (l, r) => l =:= r }
}
// or
// .find(e => e.isPrimaryConstructor)
// .find(e => e.paramLists.head.length == 2)
.get
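
The snippets above find the constructor symbol but stop before invoking it. Here is a minimal sketch of the remaining step, assuming Scala 2 with scala-reflect on the classpath. Greeter, CtorSketch and newGreeter are hypothetical names made up for this example, with Greeter standing in for Test.

```scala
import scala.reflect.runtime.{universe => ru}

// Hypothetical class with overloaded constructors, standing in for Test.
class Greeter(val greeting: String, val names: Array[String]) {
  def this(names: Array[String]) = this("hello", names)
  def this() = this("hello", Array.empty[String])
}

object CtorSketch {
  private val mirror = ru.runtimeMirror(getClass.getClassLoader)

  // Pick the constructor whose first parameter list matches paramTypes, then call it.
  def newGreeter(paramTypes: List[ru.Type], args: Any*): Greeter = {
    val tpe = ru.typeOf[Greeter]
    val ctor = tpe.decl(ru.termNames.CONSTRUCTOR).asTerm.alternatives
      .map(_.asMethod)
      .find { c =>
        val actual = c.paramLists.head.map(_.typeSignature)
        actual.size == paramTypes.size &&
          actual.zip(paramTypes).forall { case (a, e) => a =:= e }
      }
      .getOrElse(throw new NoSuchMethodException(s"no constructor with parameters $paramTypes"))
    // A ClassMirror turns the constructor symbol into something callable.
    mirror.reflectClass(tpe.typeSymbol.asClass)
      .reflectConstructor(ctor)(args: _*)
      .asInstanceOf[Greeter]
  }
}
```

For example, CtorSketch.newGreeter(List(ru.typeOf[Array[String]]), Array("a", "b")) should select the one-argument constructor. The cast is needed because a MethodMirror returns Any.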

Type List takes type parameters

I have the following data:
val param1 = List(("f1","f2","f3"),
("d1","d2","d2"))
The number of rows in this List is not fixed.
I want to pass these data as the input parameter of my Scala object.
import scopt.OptionParser
object Test {
case class Params( param1: List[(String, String, String)] = null )
def main(args: Array[String]) {
val defaultParams = Params()
val parser = new OptionParser[Params]("Test") {
head("bla-bla")
opt[List[(String, String, String)]]("param1")
.required()
.text("xxx")
.action((x, c) => c.copy(param1 = x))
}
parser.parse(args, defaultParams).map { params =>
val processor = new TestProcessor()
processor.run(params)
} getOrElse {
System.exit(1)
}
}
}
How can I do it? Currently I declare the option as String, but that is not correct. If I use List(String,String,String), the compiler says "Type List takes type parameters".
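
On the error itself: List(String,String,String) is written like a value, whereas the type of a list of triples is List[(String, String, String)], and the compiler usually reports "takes type parameters" when List appears in a type position without its type argument. Below is a minimal sketch of turning one command-line string into that type. The encoding (rows separated by ';', fields by ',') and the names TripleArgs and parseTriples are assumptions made for this example. Wiring the parser into scopt, for instance through a custom scopt.Read instance, is not shown.

```scala
object TripleArgs {
  type Triple = (String, String, String)

  // Assumed encoding: "f1,f2,f3;d1,d2,d2" -> List(("f1","f2","f3"), ("d1","d2","d2"))
  def parseTriples(raw: String): Either[String, List[Triple]] = {
    val rows = raw.split(';').toList.map(_.trim).filter(_.nonEmpty)
    rows.foldRight[Either[String, List[Triple]]](Right(Nil)) { (row, acc) =>
      row.split(',').map(_.trim) match {
        case Array(a, b, c) => acc.map((a, b, c) :: _) // keep any earlier error
        case _              => Left(s"expected 3 comma-separated fields, got: $row")
      }
    }
  }
}
```

Returning Either keeps malformed input as a value the caller can report, instead of throwing from inside the parser.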

Extractor object and class in Scala

I'm writing an extractor object for function-call expressions. Here is what it looks like:
object FunctionTemplate2 {
private final val pattern = Pattern.compile("^(.+?)\\((.+?)\\,(.+?)\\)")
//e.g. foo(1, "str_arg")
def unapply(functionCallExpression: String): Option[(String, String, String)] = {
//parse expression and extract
}
}
And I can extract as follows:
"foo(1, \"str_arg\")" match {
case FunctionTemplate2("foo", first, second) =>
println(s"$first,$second")
}
But this is not as cute as it could be. I would like to have something like this:
case FunctionTemplate2("foo")(first, second) =>
println(s"$first,$second")
That is, something like a curried extractor. So I tried this:
case class Function2Extractor(fooName: String){
private final val pattern = Pattern.compile("^(.+?)\\((.+?)\\,(.+?)\\)")
println("creating")
def unapply(functionCallExpression: String): Option[(String, String, String)] =
//parse and extract as before
}
But it did not work:
"foo(1, \"str_arg\")" match {
case Function2Extractor("foo")(first, second) =>
println(s"$first,$second")
}
Is there a way to do this in Scala?
You can simplify it by using some utilities from the Scala standard library.
Notice how pattern is used in the match case.
Scala REPL
scala> val pattern = "^(.+?)\\((.+?)\\,(.+?)\\)".r
pattern: scala.util.matching.Regex = ^(.+?)\((.+?)\,(.+?)\)
scala> "foo(1, \"str_arg\")" match { case pattern(x, y, z) => println(s"$x $y $z")}
foo 1 "str_arg"
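
On the curried form itself: a pattern needs a stable identifier in extractor position, so Function2Extractor("foo")(first, second) is not accepted. A common workaround is to build the extractor instance first and bind it to a val. The sketch below assumes that approach; FunctionCall2, ExtractorDemo, Foo and describe are hypothetical names, and the regex is a slightly tightened variant of the one in the question.

```scala
import scala.util.matching.Regex

// Extractor parameterised by the expected function name.
final class FunctionCall2(name: String) {
  private val pattern: Regex = """^(.+?)\((.+?),\s*(.+?)\)$""".r

  // Succeeds only for a two-argument call to `name`, yielding the two arguments.
  def unapply(expr: String): Option[(String, String)] = expr match {
    case pattern(fn, first, second) if fn == name => Some((first, second))
    case _                                        => None
  }
}

object ExtractorDemo {
  // Binding the instance to a val gives the pattern a stable identifier.
  val Foo = new FunctionCall2("foo")

  def describe(expr: String): String = expr match {
    case Foo(first, second) => s"$first,$second"
    case _                  => "no match"
  }
}
```

The function name moves out of the pattern and into the val, which gets close to the curried look without needing any new syntax.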

Get all arguments passed to a class in Scala

Is it generally possible to obtain all arguments/constructor parameters given to a class?
Something like:
trait GetMyArgs {
def myArgs = ???
}
class Foo(i: Int, d: Double) extends GetMyArgs
=>
scala> val f = new Foo(5, 6.6)
scala> f.myArgs
(5, 6.6) // or similar
def myArgs = {
this.getClass.getDeclaredFields
.toList
.map(i => {
i.setAccessible(true)
i.getName -> i.get(this)
}).toMap
}
You can use Java reflection to do this. One caveat: if a constructor parameter is not declared as a val and is never used in the class body, no field is generated for it, so myArgs returns an empty Map.
Here is the same example as above, but returning a List of name/value pairs instead of a Map :)
def myArgs: List[(String, AnyRef)] = {
this.getClass.getDeclaredFields.toList
.map(i => {
i.setAccessible(true)
i.getName -> i.get(this)
})
}
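
If Foo can be a case class, an alternative that avoids reflection is productIterator, which every case class provides. This is a different technique from the field-based answers above, sketched under the assumption that making Foo a case class is acceptable:

```scala
// The self-type restricts this trait to Product types, which includes every case class.
trait GetMyArgs { self: Product =>
  def myArgs: List[Any] = productIterator.toList
}

final case class Foo(i: Int, d: Double) extends GetMyArgs
```

Here Foo(5, 6.6).myArgs yields the constructor arguments in declaration order, and it works even when the parameters are never used in the class body.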

In Slick 3.0, how do I get from a query to a case class?

I am trying to use Slick for database in a Scala application, and running into some issues (or my misunderstandings) of how to query (find) and convert the result to a case class.
I am not mapping the case class, but the actual values, with the intent of creating the case class on the fly. So, my table is:
object Tables {
class Names(tag: Tag) extends Table[Name](tag, "NAMES") {
def id = column[Long]("id", O.PrimaryKey, O.AutoInc)
def first = column[String]("first")
def middle = column[String]("middle")
def last = column[String]("last")
def * = (id.?, first, middle.?, last) <> ((Name.apply _).tupled, Name.unapply)
}
object NamesQueries {
lazy val query = TableQuery[Names]
val findById = Compiled { k: Rep[Long] =>
query.filter(_.id === k)
}
}
}
and here is the query:
object NamesDAO {
def insertName(name: Name) {
NamesQueries.query += name.copy(id = None)
}
def findName(nameId: Long) = {
val q = NamesQueries.findById(nameId) // AppliedCompiledFunction[Long, Query[Tables.Names, Tables.Names.TableElementType, Seq],Seq[Tables.Names.TableElementType]]
val resultSeq = Database.forConfig("schoolme").run(q.result) // Future[Seq[Tables.Names.TableElementType]]
val result = resultSeq.map { r => // val result: Future[(Option[Long], String, Option[String], String) => Name]
val rr = r.map{ name => // val rr: Seq[(Option[Long], String, Option[String], String) => Name]
Name.apply _
}
rr.head
}
result
}
}
However, the findName method seems to return Future[(Option[Long], String, Option[String], String) => Name] instead of Future[Name]. What am I doing wrong? Is it just a matter of using asInstanceOf[Name]?
EDIT: expanded findName to smaller chunks with comments for each one, as sap1ens suggested.
Well, I'll be damned.
Following sap1ens' comment above, I broke findName into multiple steps (and edited the question). After that, I went back and gave my val an explicit type, and that worked. Note that the inner map now returns name itself rather than Name.apply _. See here:
def findName(nameId: Long) = {
val q = NamesQueries.findById(nameId)
val resultSeq: Future[Seq[Name]] = Database.forConfig("schoolme").run(q.result)
val result = resultSeq.map { r =>
val rr = r.map{ name =>
name
}
rr.head
}
result
}
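So relying on type inference was the (/my) culprit this time: with no annotation, the compiler happily inferred a Future of a function from the Name.apply _ in the inner map instead of reporting an error. Remember, remember.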
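
The same shape can be reproduced without a database. The sketch below replaces the Slick call with an already-completed Future; the Name case class is reconstructed from the types in the comments above, and FindNameSketch, fetchRows and the sample row are hypothetical.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Assumed shape of Name, inferred from the tuple types shown in the question.
final case class Name(id: Option[Long], first: String, middle: Option[String], last: String)

object FindNameSketch {
  // Stand-in for Database.forConfig(...).run(q.result): rows arrive already mapped to Name.
  def fetchRows(): Future[Seq[Name]] =
    Future.successful(Seq(Name(Some(1L), "Ada", None, "Lovelace")))

  // The rows are already Name values, so no Name.apply is needed.
  // The explicit result type makes the compiler reject a stray function value.
  def findName(): Future[Option[Name]] =
    fetchRows().map(_.headOption)
}
```

Using headOption instead of head also keeps an empty result from failing the Future with a NoSuchElementException.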