Scala implicit classes as function arguments

I am learning about implicit classes and how to decorate existing classes with additional methods.
https://coderwall.com/p/k_1jzw/scala-s-pimp-my-library-pattern-example
My question is how to pass these additional methods as arguments to another function.
Example:
I defined the class and implicit function like this
class BlingString(string: String) {
def bling = "*" + string + "*"
def doubleBling = "**" + string + "**"
}
implicit def blingYoString(string: String) = new BlingString(string)
Now I am able to use these functions on String objects.
"this is one".bling // returns *this is one*
"this is two".doubleBling // returns **this is two**
I want to create a generic function by passing these implicit functions to it,
def blingMyString(name: String, blingFunc: ???) = name.blingFunc
so that I can use this like below,
blingMyString("this is one", bling)
blingMyString("this is two", doubleBling)
I need this because I have a class with 20+ additional functions on Spark DataFrames (data cleansing, data formatting and transformations).
Depending on the scenario, I have to execute one or more of these functions on a DataFrame.
val out1 = df.operation1().operation2().operation3()
val out2 = df.operation1().operation3()
val out3 = df.operation1().operation4().operation3()
def clean(dynamicFn: ???): DataFrame = df.operation1().dynamicFn().operation3()
val out1 = clean(operation2)
val out2 = clean()
val out3 = clean(operation4)
Thanks in advance.

class BlingString(s: String) {
def bling1 = "*" + s + "*"
def bling2 = "**" + s + "**"
}
implicit def string2bling(s: String) = new BlingString(s)
def bling(s: String, f: BlingString => String) = f(s)
bling("ciao", _.bling1)
bling("ciao", _.bling2)
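The same idea carries over to the clean(dynamicFn) case from the question: accept a function from the decorated type to itself and give it a default that does nothing. The following is only a sketch under stated assumptions: Frame is a hypothetical stand-in for Spark's DataFrame (so the code runs without Spark), and the operationN methods merely record their names.

```scala
object CleanDemo {
  // Hypothetical stand-in for a DataFrame: it only records which steps ran.
  final case class Frame(steps: Vector[String])

  // Extension methods, playing the role of the 20+ DataFrame helpers.
  implicit class FrameOps(val f: Frame) {
    def operation1(): Frame = Frame(f.steps :+ "op1")
    def operation2(): Frame = Frame(f.steps :+ "op2")
    def operation3(): Frame = Frame(f.steps :+ "op3")
    def operation4(): Frame = Frame(f.steps :+ "op4")
  }

  // The middle step is an ordinary function value; identity means "skip it".
  def clean(df: Frame, middle: Frame => Frame = identity): Frame =
    middle(df.operation1()).operation3()

  val df = Frame(Vector.empty)
  val out1 = clean(df, _.operation2()) // op1, op2, op3
  val out2 = clean(df)                 // op1, op3
  val out3 = clean(df, _.operation4()) // op1, op4, op3
}
```

The implicit conversion is applied inside the lambda (_.operation2()), so clean itself never needs to know about the implicit class.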

Related

how to print a collection as valid code in scala

Assume I have
val x = List("a","b","c")
I'd like to have a function f which when called, returns
List("a","b","c")
Currently, println(x) just prints List(a,b,c), which will not compile when pasted into a Scala notebook or a unit test.
I'm struggling to find a general solution that also works for Seq[Double] etc. I managed to get something working for Seq[String] by re-adding the quotes, but I can't find a proper solution for all collection types.
Sounds like you want a custom type class Show:
trait Show[T] {
def show(t: T): String
}
trait LowPriorityShow {
implicit def default[T]: Show[T] = _.toString
}
object Show extends LowPriorityShow {
implicit val str: Show[String] = s => s""""$s""""
// other exceptions for element types
implicit def list[T: Show]: Show[List[T]] = _.map(show(_)).mkString("List(", ",", ")")
implicit def seq[T: Show]: Show[Seq[T]] = _.map(show(_)).mkString("Seq(", ",", ")")
// other exceptions for collection types
}
def show[T](t: T)(implicit s: Show[T]): String = s.show(t)
val x = List("a","b","c")
show(x) //List("a","b","c")
val x1 = Seq("a","b","c")
show(x1) //Seq("a","b","c")
You can try to replace the instances for collections (Show.list, Show.seq...) with a more generic one:
import shapeless.Typeable
implicit def collection[Col[X] <: Iterable[X], T: Show](implicit ev: Typeable[Col[_]]): Show[Col[T]] = {
val col = Typeable[Col[_]].describe.takeWhile(_ != '[')
_.map(show(_)).mkString(s"$col(", ",", ")")
}
You'll have to check for yourself whether the result is always valid Scala code.
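One case worth checking is strings that contain quotes or backslashes, which the simple str instance above would print unescaped. Below is a self-contained sketch of the same type class with an escaping String instance and a Double example. The escaping rule shown (backslashes and double quotes only) is an assumption chosen for illustration, and control characters such as newlines are not handled.

```scala
object ShowDemo {
  trait Show[T] { def show(t: T): String }

  trait LowPriorityShow {
    // Fallback: toString is already valid code for Int, Double, Boolean.
    implicit def default[T]: Show[T] = new Show[T] {
      def show(t: T): String = t.toString
    }
  }

  object Show extends LowPriorityShow {
    // Strings are quoted; backslashes and double quotes are escaped first.
    implicit val str: Show[String] = new Show[String] {
      def show(s: String): String =
        "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
    }
    implicit def list[T](implicit ev: Show[T]): Show[List[T]] = new Show[List[T]] {
      def show(xs: List[T]): String = xs.map(ev.show).mkString("List(", ",", ")")
    }
  }

  def show[T](t: T)(implicit s: Show[T]): String = s.show(t)

  val doubles = show(List(1.5, 2.0))         // List(1.5,2.0)
  val quoted  = show(List("a", "say \"hi\"")) // List("a","say \"hi\"")
}
```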

Using callUDF to create a method that chains UDF calls

I monkey patched the org.apache.spark.sql.Column class to add a chainUDF method. It works well for UDFs that take no additional arguments, and I need help making it generic for UDFs that do.
Here's the current chainUDF method definition.
object ColumnExt {
implicit class ColumnMethods(c: Column) {
def chainUDF(udfName: String): Column = {
callUDF(udfName, c)
}
}
}
Here's the chainUDF method in action.
def appendZ(s: String): String = {
s"${s}Z"
}
spark.udf.register("appendZUdf", appendZ _)
def prependA(s: String): String = {
s"A${s}"
}
spark.udf.register("prependAUdf", prependA _)
val hobbiesDf = Seq(
("dance"),
("sing")
).toDF("word")
val actualDf = hobbiesDf.withColumn(
"fun",
col("word").chainUDF("appendZUdf").chainUDF("prependAUdf")
)
I'd like to update the chainUDF method definition so it takes an optional list of Column arguments. Something like this:
def appendWord(s: String, word: String): String = {
s"${s}${word}"
}
spark.udf.register("appendWordUdf", appendWord _)
val hobbiesDf = Seq(
("dance"),
("sing")
).toDF("word")
val actualDf = hobbiesDf.withColumn(
"fun",
col("word").chainUDF("appendZUdf").chainUDF("appendWordUdf", lit("cool"))
)
I think we'll need to update the chainUDF method definition to something like this:
object ColumnExt {
implicit class ColumnMethods(c: Column) {
def chainUDF(udfName: String, cols: Column* = some_default_value): Column = {
callUDF(udfName, c + cols)
}
}
}
I'm sure there is some Scala magic trick to make this happen.
The signature is:
def callUDF(udfName: String, cols: Column*): Column
so you don't need magic. A varargs parameter cannot have a default value, but it can simply be left empty at the call site:
def chainUDF(udfName: String, cols: Column*): Column = {
callUDF(udfName, c +: cols: _*)
}
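The prepend-and-splat step (c +: cols: _*) can be tried without a Spark session. The sketch below is only an illustration: call is a hypothetical stand-in for callUDF that renders the call as a string, and chain plays the role of chainUDF.

```scala
object VarargsDemo {
  // Hypothetical stand-in for callUDF: renders the call instead of running it.
  def call(name: String, args: String*): String =
    s"$name(${args.mkString(", ")})"

  implicit class Chain(val c: String) {
    // Prepend the receiver to the (possibly empty) varargs, then splat them.
    def chain(name: String, args: String*): String =
      call(name, c +: args: _*)
  }

  val noExtraArgs = "word".chain("appendZUdf")
  val withExtraArg = "word".chain("appendZUdf").chain("appendWordUdf", "cool")
}
```

Here noExtraArgs is appendZUdf(word) and withExtraArg is appendWordUdf(appendZUdf(word), cool), so the receiver always ends up as the first argument.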

How to design function in flow style in scala

Suppose I have a util object with two functions:
object t {
def funA(input:String,x:Int):String = input*x
def funB(input:String,tail:String):String = input + ":" + tail
}
If I run
t.funB(t.funA("x",3),"tail")
I get the result xxx:tail
The question is how to design these two functions so that I can call them in a flow style like:
"x" funA(3) funB("tail")
Extend String using an implicit class:
implicit class CustomString(str: String) {
def funA(count:Int) = str * count
def funB(tail:String):String = str + ":" + tail
}
println("x".funA(3).funB("tail"))
Alternatively, with a case class that has a String field (corresponding to the original functions' first argument),
case class StringOps(s: String) {
def funA(x:Int):String = s*x
def funB(tail:String):String = s + ":" + tail
}
and an implicit for converting String to StringOps,
implicit def String2StringOps(s: String) = StringOps(s)
you can enable the following usage,
scala> "hello" funA 3
hellohellohello
scala> "hello" funA 3 funB "tail"
hellohellohello:tail
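If the original utility object has to stay as it is, the extension methods can simply forward to it. This is a sketch, assuming funA is meant to repeat its input (as the expected result xxx:tail implies); FlowOps is a name made up for this example.

```scala
object FlowDemo {
  // The utility functions from the question.
  object t {
    def funA(input: String, x: Int): String = input * x
    def funB(input: String, tail: String): String = input + ":" + tail
  }

  // Extension methods that forward to the utilities, receiver first.
  implicit class FlowOps(val s: String) {
    def funA(x: Int): String = t.funA(s, x)
    def funB(tail: String): String = t.funB(s, tail)
  }

  val nested = t.funB(t.funA("x", 3), "tail") // original nested style
  val flow   = "x".funA(3).funB("tail")       // flow style, same result
}
```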

Scala Functional "no-op" syntax [duplicate]

When programming in Java, I always log the input parameters and the return value of a method, but in Scala the last line of a method is the return value, so I have to do something like:
def myFunc() = {
val rs = calcSomeResult()
logger.info("result is:" + rs)
rs
}
To make this easier, I wrote a utility:
class LogUtil(val f: (String) => Unit) {
def logWithValue[T](msg: String, value: T): T = { f(msg); value }
}
object LogUtil {
def withValue[T](f: String => Unit): ((String, T) => T) = new LogUtil(f).logWithValue _
}
Then I used it as:
val rs = calcSomeResult()
withValue(logger.info)("result is:" + rs, rs)
It will log the value and return it. It works for me, but seems weird. As I am an old Java programmer but new to Scala, I don't know whether there is a more idiomatic way to do this in Scala.
Thanks for your help. I have now created a better util using the Kestrel combinator mentioned by romusz:
object LogUtil {
def kestrel[A](x: A)(f: A => Unit): A = { f(x); x }
def logV[A](f: String => Unit)(s: String, x: A) = kestrel(x) { y => f(s + ": " + y)}
}
I added the f parameter so that I can pass it a logger from slf4j, and the test case is:
class LogUtilSpec extends FlatSpec with ShouldMatchers {
val logger = LoggerFactory.getLogger(this.getClass())
import LogUtil._
"LogUtil" should "print log info and keep the value, and the calc for value should only be called once" in {
def calcValue = { println("calcValue"); 100 } // to confirm it's called only once
val v = logV(logger.info)("result is", calcValue)
v should be === 100
}
}
What you're looking for is called the Kestrel combinator (K combinator): Kxy = x. You can do all kinds of side-effecting operations (not only logging) while returning the value passed to it. Read https://github.com/raganwald/homoiconic/blob/master/2008-10-29/kestrel.markdown#readme
In Scala the simplest way to implement it is:
def kestrel[A](x: A)(f: A => Unit): A = { f(x); x }
Then you can define your printing/logging function as:
def logging[A](x: A) = kestrel(x)(println)
def logging[A](s: String, x: A) = kestrel(x){ y => println(s + ": " + y) }
And use it like:
logging(1 + 2) + logging(3 + 4)
Your example function becomes a one-liner:
def myFunc() = logging("result is", calcSomeResult())
If you prefer OO notation you can use implicits as shown in other answers, but the problem with such an approach is that you create a new object every time you want to log something, which may degrade performance if you do it often enough. For completeness, it looks like this:
implicit def anyToLogging[A](a: A) = new {
def log = logging(a)
def log(msg: String) = logging(msg, a)
}
Use it like:
def myFunc() = calcSomeResult().log("result is")
You have the basic idea right--you just need to tidy it up a little bit to make it maximally convenient.
class GenericLogger[A](a: A) {
def log(logger: String => Unit)(str: A => String): A = { logger(str(a)); a }
}
implicit def anything_can_log[A](a: A) = new GenericLogger(a)
Now you can
scala> (47+92).log(println)("The answer is " + _)
The answer is 139
res0: Int = 139
This way you don't need to repeat yourself (e.g. no rs twice).
If you like a more generic approach better, you could define
implicit def idToSideEffect[A](a: A) = new {
def withSideEffect(fun: A => Unit): A = { fun(a); a }
def |!>(fun: A => Unit): A = withSideEffect(fun) // forward pipe-like
def tap(fun: A => Unit): A = withSideEffect(fun) // public demand & ruby standard
}
and use it like
calcSomeResult() |!> { rs => logger.info("result is:" + rs) }
calcSomeResult() tap println
Starting with Scala 2.13, the chaining operation tap can be used to apply a side effect (in this case some logging) to any value while returning the original value:
def tap[U](f: (A) => U): A
For instance:
scala> val a = 42.tap(println)
42
a: Int = 42
or in our case:
import scala.util.chaining._
def myFunc() = calcSomeResult().tap(x => logger.info(s"result is: $x"))
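To see that tap runs the side effect once and hands back the untouched value, here is a small self-contained sketch for Scala 2.13 or later. The messages buffer is a stand-in for a real logger, and calcSomeResult is a made-up computation that counts its own invocations.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.chaining._

object TapDemo {
  // Stand-in for logger.info: collects messages so they can be inspected.
  val messages = ListBuffer.empty[String]

  // Hypothetical computation; `calls` counts how often it actually runs.
  var calls = 0
  def calcSomeResult(): Int = { calls += 1; 47 + 92 }

  // tap applies the side effect and then returns the original value.
  def myFunc(): Int = calcSomeResult().tap(x => messages += s"result is: $x")

  val result = myFunc()
}
```

After TapDemo is initialised, result is 139, messages holds the single line "result is: 139", and calls is 1.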
Let's say you already have a base class for all your loggers:
abstract class Logger {
def info(msg:String):Unit
}
Then you could extend String with the ## logging method:
object ExpressionLog {
// default logger
implicit val logger = new Logger {
def info(s:String) {println(s)}
}
// adding ## method to all String objects
implicit def stringToLog (msg: String) (implicit logger: Logger) = new {
def ## [T] (exp: T) = {
logger.info(msg + " = " + exp)
exp
}
}
}
To use the logging you have to import the members of the ExpressionLog object; then you can log expressions using the following notation:
import ExpressionLog._
def sum (a:Int, b:Int) = "sum result" ## (a+b)
val c = sum("a" ## 1, "b" ##2)
Will print:
a = 1
b = 2
sum result = 3
This works because every time you call the ## method on a String, the compiler realises that String doesn't have such a method and silently converts the String into an object of an anonymous type that has the ## method defined (see stringToLog). As part of the conversion, the compiler picks up the desired logger as an implicit parameter. This way you don't have to keep passing the logger to ## every time, yet you retain full control over which logger is used.
As far as precedence goes, when the ## method is used in infix notation it has the highest priority, making it easier to reason about what will be logged.
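The flip side of that precedence is that an unparenthesised arithmetic expression on the right is not logged as a whole. The sketch below illustrates this with a hypothetical operator #> (it starts with '#', so it has the same precedence class as ##, but avoids any confusion with the built-in ## hash method on Any); the logged buffer is a stand-in for a real logger.

```scala
import scala.collection.mutable.ListBuffer

object PrecedenceDemo {
  // Collects log lines so the effect is easy to inspect.
  val logged = ListBuffer.empty[String]

  // Hypothetical labelled-logging operator, analogous to ## above.
  implicit class Labelled(val label: String) {
    def #>[T](value: T): T = { logged += s"$label = $value"; value }
  }

  // Parsed as ("x" #> 1) + 2, because '#' binds tighter than '+'.
  val unparenthesised = "x" #> 1 + 2

  // Parentheses make the whole sum the logged expression.
  val parenthesised = "sum" #> (1 + 2)
}
```

Both values are 3, but the first call logs x = 1 while the second logs sum = 3, which is why the examples here wrap the expression in parentheses.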
So what if you wanted to use a different logger in one of your methods? This is very simple:
import ExpressionLog.{logger=>_,_} // import everything but default logger
// define specific local logger
// this can be as simple as: implicit val logger = new MyLogger
implicit val logger = new Logger {
var lineno = 1
def info(s:String) {
println("%03d".format(lineno) + ": " + s)
lineno+=1
}
}
// start logging
def sum (a:Int, b:Int) = a+b
val c = "sum result" ## sum("a" ## 1, "b" ##2)
Will output:
001: a = 1
002: b = 2
003: sum result = 3
Compiling all the answers, pros and cons, I came up with this (context is a Play application):
import play.api.LoggerLike
object LogUtils {
implicit class LogAny2[T](val value : T) extends AnyVal {
def ##(str : String)(implicit logger : LoggerLike) : T = {
logger.debug(str);
value
}
def ##(f : T => String)(implicit logger : LoggerLike) : T = {
logger.debug(f(value))
value
}
}
}
As you can see, LogAny2 is an AnyVal, so there shouldn't be any overhead of new object creation.
You can use it like this:
scala> import utils.LogUtils._
scala> val a = 5
scala> val b = 7
scala> implicit val logger = play.api.Logger
scala> val c = (a + b) ## { c => s"result of $a + $b = $c" }
c: Int = 12
Or if you don't need a reference to the result, just use:
scala> val c = (a + b) ## "Finished this very complex calculation"
c: Int = 12
Any downsides to this implementation?
Edit:
I've made this available with some improvements in a gist here
