I have an iteration module that can apply an arbitrary function (see "Build generic reusable iteration module from higher order function") and would love to wrap it in a progress bar.
val things = Range(1,10)
def iterationModule[A](
iterationItems: Seq[A],
functionToApply: A => Any
): Unit = {
def foo(s:Int) = println(s)
iterationModule[Int](things, foo)
A basic progressbar could look like:
import me.tongfei.progressbar.ProgressBar
val pb = new ProgressBar("Test", things.size)
things.foreach(t=> {
But how can the function that is passed to the iteration module be intercepted and surrounded with a progress bar, i.e. so that pb.step() is called for each item?
An annoying possibility would be to pass the mutable pb object into each function (have it implement an interface).
But is it also possible to intercept and surround the function being passed by this stepping logic?
However, when looping with Seq().par.foreach, this might be problematic.
I need the code to work in Scala 2.11.
A more complex example:
val things = Range(1,100).map(_.toString)
def iterationModule[A](
iterationItems: Seq[A],
functionToApply: A => Any,
parallel: Boolean = false
): Unit = {
val pb = new ProgressBar(functionToApply.toString(), iterationItems.size)
if (parallel) {
} else {
def doStuff(inputDay: String, inputConfigSomething: String): Unit = println(inputDay + "__"+ inputConfigSomething)
iterationModule[String](things, doStuff(_, "foo"))
The function should be able to take the iteration item and additional parameters.
edit 2
import me.tongfei.progressbar.ProgressBar
val things = Range(1,100).map(_.toString)
def doStuff(inputDay: String, inputConfigSomething: String): Unit = println(inputDay + "__"+ inputConfigSomething)
def iterationModulePb[A](items: Seq[A], f: A => Any, parallel: Boolean = false): Unit = {
val pb = new ProgressBar(f.toString, items.size)
val it = if (parallel) {
} else {
it.foreach { x =>
iterationModulePb[String](things, doStuff(_, "foo"))
After a little discussion I figured out how to use a Seq with standard iterators.
For Scala 2.13 this would be the most general form.
import me.tongfei.progressbar.ProgressBar
def iterationModule[A](items: IterableOnce[A], f: A => Any): Unit = {
val (it, pb) =
if (items.knownSize != -1)
items.iterator -> new ProgressBar("Test", items.knownSize)
else {
val (iter1, iter2) = items.iterator.duplicate
iter1 -> new ProgressBar("Test", iter2.size)
it.foreach { x =>
Note: most of the changes are just to make the code more generic, but the general idea is just to create a function that wraps both the original function and the call to the ProgressBar.
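The wrapping idea from the note can be sketched without the ProgressBar dependency. This is only an illustration under assumptions: `withStep`, `StepWrapperSketch` and `demo` are hypothetical names, and an AtomicInteger stands in for the call to pb.step() so that the counter also stays correct if the items are processed in parallel.

```scala
import java.util.concurrent.atomic.AtomicInteger

object StepWrapperSketch {
  // Hypothetical helper: returns a function that runs `f` and then calls `step`.
  def withStep[A](f: A => Any)(step: () => Unit): A => Unit = { a =>
    f(a)
    step()
  }

  // Same shape as the iteration module in the thread, but the progress
  // callback is injected, so the sketch has no library dependency.
  def iterationModule[A](items: Seq[A], f: A => Any)(step: () => Unit): Unit =
    items.foreach(withStep(f)(step))

  // Runs the module over 9 items and returns how often `step` was called.
  def demo(): Int = {
    val steps = new AtomicInteger(0) // thread-safe stand-in for pb.step()
    iterationModule[Int](1 to 9, (i: Int) => i * 2)(() => { steps.incrementAndGet(); () })
    steps.get()
  }
}
```

With the real library, the `step` argument would presumably be `() => pb.step()`; the function passed by the caller does not need to know about the progress bar at all.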
A simplified solution for 2.11
def iterationModule[A](items: Seq[A], parallel: Boolean = false)
(f: A => Any): Unit = {
val pb = new ProgressBar("test", items.size)
val it = if (parallel) {
} else {
it.foreach { a =>
I am trying to implement a custom Transformer in Flink following the instructions in its documentation, but when I try to execute it, it seems the fit operation is never called. Here is what I've done so far:
class InfoGainTransformer extends Transformer[InfoGainTransformer] {
import InfoGainTransformer._
private[this] var counts: Option[collection.immutable.Vector[Map[Key, Double]]] = None
// here setters for params, as Flink does
object InfoGainTransformer {
// ====================================== Parameters =============================================
// ...
// ==================================== Factory methods ==========================================
// ...
// ========================================== Operations =========================================
implicit def fitLabeledVectorInfoGain = new FitOperation[InfoGainTransformer, LabeledVector] {
override def fit(instance: InfoGainTransformer, fitParameters: ParameterMap, input: DataSet[LabeledVector]): Unit = {
val counts = collection.immutable.Vector[Map[Key, Double]]()
input.map {
v =>
v.vector.map {
case (i, value) =>
val key = Key(value, v.label)
val cval = counts(i).getOrElse(key, .0)
counts(i) + (key -> cval)
implicit def fitVectorInfoGain[T <: Vector] = new FitOperation[InfoGainTransformer, T] {
override def fit(instance: InfoGainTransformer, fitParameters: ParameterMap, input: DataSet[T]): Unit = {
implicit def transformLabeledVectorsInfoGain = {
new TransformDataSetOperation[InfoGainTransformer, LabeledVector, LabeledVector] {
override def transformDataSet(
instance: InfoGainTransformer,
transformParameters: ParameterMap,
input: DataSet[LabeledVector]): DataSet[LabeledVector] = input
implicit def transformVectorsInfoGain[T <: Vector : BreezeVectorConverter : TypeInformation : ClassTag] = {
new TransformDataSetOperation[InfoGainTransformer, T, T] {
override def transformDataSet(instance: InfoGainTransformer, transformParameters: ParameterMap, input: DataSet[T]): DataSet[T] = input
Then I tried to use it in two ways:
val scaler = StandardScaler()
val polyFeatures = PolynomialFeatures()
val mlr = MultipleLinearRegression()
val gain = InfoGainTransformer().setK(2)
// Construct the pipeline
val pipeline = scaler
val r = pipeline.predict(dataSet map (_.vector))
And only my transformer:
In both cases, when I set a breakpoint inside fitLabeledVectorInfoGain, for example on the line input.map, the debugger stops there, but if I also set a breakpoint inside the nested map, for example below println("INSIDE!!!"), it never stops there.
Does anyone know how I could debug this custom transformer?
It seems it's working now. I think the problem was that I wasn't implementing the FitOperation correctly, because nothing was being saved in the instance state. This is the implementation now:
implicit def fitLabeledVectorInfoGain = new FitOperation[InfoGainTransformer, LabeledVector] {
override def fit(instance: InfoGainTransformer, fitParameters: ParameterMap, input: DataSet[LabeledVector]): Unit = {
// val counts = collection.immutable.Vector[Map[Key, Double]]()
val r = input.map {
v =>
v.vector.foldLeft(Map.empty[Key, Double]) {
case (m, (i, value)) =>
println("INSIDE fit!!!")
val key = Key(value, v.label)
val cval = m.getOrElse(key, .0) + 1.0
m + (key -> cval)
instance.counts = Some(r)
Now the debugger stops correctly at all breakpoints and the TransformOperation is also being called.
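A likely reason the inner breakpoint was never hit in the first version is that DataSet transformations are lazy: a map whose result is discarded never becomes part of the executed plan. The sketch below is only an analogy using a plain Scala Iterator, not Flink code; `LazyPipelineAnalogy` and `demo` are hypothetical names.

```scala
object LazyPipelineAnalogy {
  // Returns the number of function calls before and after the pipeline is consumed.
  def demo(): (Int, Int) = {
    var calls = 0
    // Building the pipeline does not run the function...
    val pending = Iterator(1, 2, 3).map { x => calls += 1; x * 2 }
    val before = calls // nothing has consumed `pending` yet
    // ...only consuming the result does.
    pending.toList
    (before, calls)
  }
}
```

In the fixed version the result of input.map is stored in the instance and used later, which is what makes the function inside it run.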
Let's say that in my pure Scala program I have a dependency on a Java service.
This Java service accepts a listener to notify me when some data changes.
Let's say the data is a tuple (x, y) and the Java service calls the listener whenever x or y changes, but I'm interested only in changes to x.
For this, my listener has to save the last value of x and forward the update only when oldX != x, so my impure Scala listener implementation has to hold a var oldX:
val listener = new JavaServiceListener() {
var oldX;
def updated(val x, val y): Unit = {
if (oldX != x) {
oldX = x
//do stuff
How would I go about designing a wrapper for this kind of thing in Scala without var or mutable collections? I can't do it at the JavaServiceListener level since I'm bound by the method signature, so I need another layer above it, to which the Java listener forwards somehow.
My preference would be to wrap it in a Monix Observable, then you can use distinctUntilChanged to eliminate consecutive duplicates. Something like:
import monix.reactive._
val observable = Observable.create(OverflowStrategy.Fail(10)){(sync) =>
val listener = new JavaServiceListener() {
def updated(val x, val y): Unit = {
Cancelable{() => javaService.unregister(listener)}
val distinctObservable = observable.distinctUntilChanged
Reactive programming allows you to use a pure model while the library handles all the difficult stuff.
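To make the semantics concrete, here is a minimal sketch of what "eliminate consecutive duplicates" means, modelled on a plain List of events rather than on a Monix Observable. `distinctUntilChangedBy` and `DistinctUntilChangedSketch` are hypothetical names for this illustration, not library API.

```scala
object DistinctUntilChangedSketch {
  // Keeps an event only when its key differs from the key of the previously kept event.
  def distinctUntilChangedBy[A, K](xs: List[A])(key: A => K): List[A] =
    xs.foldLeft(List.empty[A]) {
      case (acc @ (last :: _), a) if key(last) == key(a) => acc // same key as before: skip
      case (acc, a)                                      => a :: acc
    }.reverse

  // Events are (x, y) pairs; only changes of x should be forwarded.
  val events: List[(Int, Int)] = List((1, 10), (1, 11), (2, 11), (2, 12), (1, 12))
  val forwarded: List[(Int, Int)] = distinctUntilChangedBy(events)(_._1)
}
```

The last seen value lives only in the fold's accumulator, so no var is needed; a reactive library does the same bookkeeping for a stream that arrives over time.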
First of all, if you are designing a purely functional program you cannot return Unit (nor Future[Unit], because Future does not suppress side effects).
If performance is not an issue I would make use of Kleisli[Option, xType, IO[Unit]], i.e. a Kleisli whose effect type is Option. So the first thing you have to do is define the following (add the appropriate types):
def updated(oldX, x): Kleisli[Option, xType, xType] = Kleisli liftF {
if(x != oldX) None
else Some(x)
def doStuff(x, y): Kleisli[Option, xType, IO[Unit]] = Kleisli pure {
Now you can compose them in a for-comprehension, something like this:
val result: Kleisli[Option, xType, IO[Unit]] = for{
xx <- updated(oldX, x)
effect <- doStuff(xx, y)
} yield effect
You can perform stateful computation with ReaderWriterStateT, so you keep oldX as state.
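As a rough illustration of keeping oldX as state, here is a hand-rolled State type. It is a deliberately small stand-in for ReaderWriterStateT, not the cats API, and it is not stack-safe for long inputs; `State`, `updated` and `forwarded` are names assumed for this sketch.

```scala
object StatefulListenerSketch {
  // A state transition: takes the previous state, returns the next state and a result.
  final case class State[S, A](run: S => (S, A)) {
    def flatMap[B](f: A => State[S, B]): State[S, B] = State { s =>
      val (s1, a) = run(s)
      f(a).run(s1)
    }
    def map[B](f: A => B): State[S, B] = flatMap(a => State(s => (s, f(a))))
  }

  // The state is the last seen x (None before the first update).
  // The result is Some(x) when x changed and None otherwise.
  def updated(x: Int): State[Option[Int], Option[Int]] = State { oldX =>
    if (oldX.contains(x)) (oldX, None) else (Some(x), Some(x))
  }

  // Feeds a sequence of x values through the transition and collects the forwarded ones.
  def forwarded(xs: List[Int]): List[Int] = {
    val start = State[Option[Int], List[Int]](s => (s, Nil))
    val program = xs.foldLeft(start) { (acc, x) =>
      for {
        out     <- acc
        changed <- updated(x)
      } yield changed.fold(out)(_ :: out)
    }
    program.run(None)._2.reverse
  }
}
```

For example, `StatefulListenerSketch.forwarded(List(1, 1, 2, 2, 3))` evaluates to `List(1, 2, 3)`.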
I found the solution I like with Cats and Cats-Effect:
trait MyListener {
def onChange(n: Int): Unit
class MyDistinctFunctionalListener(private val last: Ref[IO, Int], consumer: Int => Unit) extends MyListener {
override def onChange(newValue: Int): Unit = {
val program =
.flatMap(oldValue => notify(newValue, oldValue))
private def notify(newValue: Int, oldValue: Int): IO[Unit] = {
if (oldValue != newValue) IO(consumer(newValue)) else IO.delay(println("found duplicate"))
object MyDistinctFunctionalListener {
def create(consumer: Int => Unit): IO[MyDistinctFunctionalListener] =
Ref[IO].of(0).map(v => new MyDistinctFunctionalListener(v, consumer))
val printer: Int => Unit = println(_)
val functionalDistinctPrinterIO = MyDistinctFunctionalListener.create(printer)
functionalDistinctPrinterIO.map(fl =>
List(1, 1, 2, 2, 3, 3, 3, 4, 5, 5).foreach(fl.onChange)
More stuff about handling shared state here https://github.com/systemfw/scala-italy-2018
It is debatable whether this is worth it over the private var solution.
When programming in Java, I always log the input parameters and return value of a method, but in Scala the last line of a method is the return value, so I have to do something like:
def myFunc() = {
val rs = calcSomeResult()
logger.info("result is:" + rs)
To make this easier, I wrote a utility:
class LogUtil(val f: (String) => Unit) {
def logWithValue[T](msg: String, value: T): T = { f(msg); value }
object LogUtil {
def withValue[T](f: String => Unit): ((String, T) => T) = new LogUtil(f).logWithValue _
Then I used it as:
val rs = calcSomeResult()
withValue(logger.info)("result is:" + rs, rs)
It will log the value and return it. It works for me, but it seems weird. As I am an old Java programmer but new to Scala, I don't know whether there is a more idiomatic way to do this in Scala.
Thanks for your help. I have now created a better utility using the Kestrel combinator mentioned by romusz:
object LogUtil {
def kestrel[A](x: A)(f: A => Unit): A = { f(x); x }
def logV[A](f: String => Unit)(s: String, x: A) = kestrel(x) { y => f(s + ": " + y)}
I added the f parameter so that I can pass it a logger from slf4j, and the test case is:
class LogUtilSpec extends FlatSpec with ShouldMatchers {
val logger = LoggerFactory.getLogger(this.getClass())
import LogUtil._
"LogUtil" should "print log info and keep the value, and the calc for value should only be called once" in {
def calcValue = { println("calcValue"); 100 } // to confirm it's called only once
val v = logV(logger.info)("result is", calcValue)
v should be === 100
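The test relies on calcValue running only once. A minimal sketch of why that holds, assuming the same kestrel definition as in the utility: the first parameter is passed by value, so the argument expression is evaluated once before the body of kestrel runs. The counter, the list of logged lines and the name `KestrelEvaluationSketch` are hypothetical and exist only to make this observable.

```scala
object KestrelEvaluationSketch {
  def kestrel[A](x: A)(f: A => Unit): A = { f(x); x }

  // Returns the value, the number of times calcValue ran, and the logged lines.
  def demo(): (Int, Int, List[String]) = {
    var evaluations = 0
    var logged = List.empty[String]
    def calcValue: Int = { evaluations += 1; 100 }

    // `x` is a by-value parameter, so calcValue runs once, before kestrel's body.
    val v = kestrel(calcValue) { y => logged = s"result is: $y" :: logged }
    (v, evaluations, logged)
  }
}
```

Had the parameter been declared by name (`x: => A`), the two uses of x inside kestrel would each re-evaluate the expression.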
What you're looking for is called the Kestrel combinator (K combinator): Kxy = x. You can do all kinds of side-effect operations (not only logging) while returning the value passed to it. Read https://github.com/raganwald/homoiconic/blob/master/2008-10-29/kestrel.markdown#readme
In Scala the simplest way to implement it is:
def kestrel[A](x: A)(f: A => Unit): A = { f(x); x }
Then you can define your printing/logging function as:
def logging[A](x: A) = kestrel(x)(println)
def logging[A](s: String, x: A) = kestrel(x){ y => println(s + ": " + y) }
And use it like:
logging(1 + 2) + logging(3 + 4)
Your example function becomes a one-liner:
def myFunc() = logging("result is", calcSomeResult())
If you prefer OO notation you can use implicits as shown in other answers, but the problem with such an approach is that you'll create a new object every time you want to log something, which may cause performance degradation if you do it often enough. But for completeness, it looks like this:
implicit def anyToLogging[A](a: A) = new {
def log = logging(a)
def log(msg: String) = logging(msg, a)
Use it like:
def myFunc() = calcSomeResult().log("result is")
You have the basic idea right--you just need to tidy it up a little bit to make it maximally convenient.
class GenericLogger[A](a: A) {
def log(logger: String => Unit)(str: A => String): A = { logger(str(a)); a }
implicit def anything_can_log[A](a: A) = new GenericLogger(a)
Now you can
scala> (47+92).log(println)("The answer is " + _)
The answer is 139
res0: Int = 139
This way you don't need to repeat yourself (e.g. no rs twice).
If you like a more generic approach better, you could define
implicit def idToSideEffect[A](a: A) = new {
def withSideEffect(fun: A => Unit): A = { fun(a); a }
def |!>(fun: A => Unit): A = withSideEffect(fun) // forward pipe-like
def tap(fun: A => Unit): A = withSideEffect(fun) // public demand & ruby standard
and use it like
calcSomeResult() |!> { rs => logger.info("result is:" + rs) }
calcSomeResult() tap println
Starting Scala 2.13, the chaining operation tap can be used to apply a side effect (in this case some logging) on any value while returning the original value:
def tap[U](f: (A) => U): A
For instance:
scala> val a = 42.tap(println)
a: Int = 42
or in our case:
import scala.util.chaining._
def myFunc() = calcSomeResult().tap(x => logger.info(s"result is: $x"))
Let's say you already have a base class for all your loggers:
abstract class Logger {
def info(msg:String):Unit
Then you could extend String with the ## logging method:
object ExpressionLog {
// default logger
implicit val logger = new Logger {
def info(s:String) {println(s)}
// adding ## method to all String objects
implicit def stringToLog (msg: String) (implicit logger: Logger) = new {
def ## [T] (exp: T) = {
logger.info(msg + " = " + exp)
To use the logging you'd have to import members of ExpressionLog object and then you could easily log expressions using the following notation:
import ExpressionLog._
def sum (a:Int, b:Int) = "sum result" ## (a+b)
val c = sum("a" ## 1, "b" ##2)
Will print:
a = 1
b = 2
sum result = 3
This works because every time you call the ## method on a String, the compiler realises that String doesn't have the method and silently converts the String into an object of an anonymous type that has the ## method defined (see stringToLog). As part of the conversion the compiler picks the desired logger as an implicit parameter; this way you don't have to keep passing the logger to ## every time, yet you retain full control over which logger is used.
As far as precedence goes, when the ## method is used in infix notation it has the highest priority, making it easier to reason about what will be logged.
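The mechanism described here can be sketched in a self-contained form. To keep the illustration simple it uses an implicit class and a method with the hypothetical name `logging` instead of ##, plus a hypothetical `RecordingLogger` that stores messages so the effect is observable; none of these names come from the answer.

```scala
object ImplicitLoggerSketch {
  trait Logger { def info(msg: String): Unit }

  // Hypothetical logger that records messages instead of printing them.
  final class RecordingLogger extends Logger {
    private val buffer = scala.collection.mutable.ListBuffer.empty[String]
    def info(msg: String): Unit = { buffer += msg; () }
    def messages: List[String] = buffer.toList
  }

  // Enrichment: wraps a String label and adds `logging`,
  // which takes whatever Logger is in implicit scope at the call site.
  implicit class LabelOps(val label: String) {
    def logging[T](exp: T)(implicit logger: Logger): T = {
      logger.info(label + " = " + exp)
      exp
    }
  }

  def demo(): (Int, List[String]) = {
    val recorder = new RecordingLogger
    implicit val logger: Logger = recorder // the logger chosen for this scope
    def sum(a: Int, b: Int): Int = "sum result".logging(a + b)
    val c = sum("a".logging(1), "b".logging(2))
    (c, recorder.messages)
  }
}
```

The arguments are logged before the sum because they are evaluated first; swapping the implicit value in scope changes the logger without touching any call site.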
So what if you wanted to use a different logger in one of your methods? This is very simple:
import ExpressionLog.{logger=>_,_} // import everything but default logger
// define specific local logger
// this can be as simple as: implicit val logger = new MyLogger
implicit val logger = new Logger {
var lineno = 1
def info(s:String) {
println("%03d".format(lineno) + ": " + s)
// start logging
def sum (a:Int, b:Int) = a+b
val c = "sum result" ## sum("a" ## 1, "b" ##2)
Will output:
001: a = 1
002: b = 2
003: sum result = 3
Compiling all the answers, pros and cons, I came up with this (context is a Play application):
import play.api.LoggerLike
object LogUtils {
implicit class LogAny2[T](val value : T) extends AnyVal {
def ##(str : String)(implicit logger : LoggerLike) : T = {
def ##(f : T => String)(implicit logger : LoggerLike) : T = {
As you can see, LogAny2 is an AnyVal, so there shouldn't be any overhead of new object creation.
You can use it like this:
scala> import utils.LogUtils._
scala> val a = 5
scala> val b = 7
scala> implicit val logger = play.api.Logger
scala> val c = a + b ## { c => s"result of $a + $b = $c" }
c: Int = 12
Or if you don't need a reference to the result, just use:
scala> val c = a + b ## "Finished this very complex calculation"
c: Int = 12
Any downsides to this implementation?
I've made this available with some improvements in a gist here
The general question is how to return additional information from methods, besides the actual result of the computation. But I want this information to be silently ignorable.
Take for example the method dropWhile on Iterator. The returned result is the mutated iterator. But maybe sometimes I might be interested in the number of elements dropped.
In the case of dropWhile, this information could be generated externally by adding an index to the iterator and calculating the number of dropped steps afterwards. But in general this is not possible.
A simple solution is to return a tuple with the actual result and optional information. But then I need to handle the tuple whenever I call the method - even if I'm not interested in the optional information.
So the question is, whether there is some clever way of gathering such optional information?
Maybe through Option[X => Unit] parameters with call-back functions that default to None? Is there something more clever?
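The callback idea could look roughly like this. It is a sketch on List rather than Iterator, and `dropWhileReporting` and `onDropped` are hypothetical names chosen for the illustration.

```scala
object OptionalCallbackSketch {
  // `onDropped` defaults to None, so callers that do not care can simply omit it.
  def dropWhileReporting[A](xs: List[A], onDropped: Option[Int => Unit] = None)(
      p: A => Boolean): List[A] = {
    val (dropped, rest) = xs.span(p) // longest prefix satisfying p, and the remainder
    onDropped.foreach(callback => callback(dropped.size))
    rest
  }
}
```

A caller that ignores the extra information writes `dropWhileReporting(xs)(_ < 10)`; a caller that wants the count passes `Some((n: Int) => ...)` as the second argument. The cost is that the information leaves the method through a side effect.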
Just my two cents here…
You could declare this:
case class RichResult[+A, +B](val result: A, val info: B)
with an implicit conversion to A:
implicit def unwrapRichResult[A, B](richResult: RichResult[A, B]): A = richResult.result
def someMethod: RichResult[Int, String] = /* ... */
val richRes = someMethod
val res: Int = someMethod
It's definitely not more clever, but you could just create a method that drops the additional information.
def removeCharWithCount(str: String, x: Char): (String, Int) =
(str.replace(x.toString, ""), str.count(x ==))
// alias that drops the additional return information
def removeChar(str: String, x: Char): String =
removeCharWithCount(str, x)._1
Here is my take (with some edits with a more realistic example):
package info {
trait Info[T] { var data: Option[T] }
object Info {
implicit def makeInfo[T]: Info[T] = new Info[T] {
var data: Option[T] = None
Then suppose your original method (and use case) is implemented like this:
object Test extends App {
def dropCounterIterator[A](iter: Iterator[A]) = new Iterator[A] {
def hasNext = iter.hasNext
def next() = iter.next()
override def dropWhile(p: (A) => Boolean): Iterator[A] = {
var count = 0
var current: Option[A] = None
while (hasNext && p({current = Some(next()); current.get})) { count += 1 }
current match {
case Some(a) => Iterator.single(a) ++ this
case None => Iterator.empty
val i = dropCounterIterator(Iterator.from(1))
val ii = i.dropWhile(_ < 10)
To provide and get access to the info, the code would be modified only slightly:
import info.Info // line added
object Test extends App {
def dropCounterIterator[A](iter: Iterator[A]) = new Iterator[A] {
def hasNext = iter.hasNext
def next() = iter.next()
// note: overloaded variant because of the extra parameter list, not overridden
def dropWhile(p: (A) => Boolean)(implicit info: Info[Int]): Iterator[A] = {
var count = 0
var current: Option[A] = None
while (hasNext && p({current = Some(next()); current.get})) { count += 1 }
info.data = Some(count) // line added here
current match {
case Some(a) => Iterator.single(a) ++ this
case None => Iterator.empty
val i = dropCounterIterator(Iterator.from(1))
val info = implicitly[Info[Int]] // line added here
val ii = i.dropWhile((x: Int) => x < 10)(info) // line modified
println(info.data.get) // line added here
Note that for some reason the type inference is affected and I had to annotate the type of the function passed to dropWhile.
You want dropWhileM with the State monad threading a counter through the computation.
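The essence of that suggestion is that the counter is threaded through the computation as a value instead of being kept in a var. The sketch below shows only that threading with explicit recursion on a List; it is not the library's dropWhileM or State, and `dropWhileCounting` is a hypothetical name.

```scala
object CountingDropWhileSketch {
  import scala.annotation.tailrec

  // Returns the number of dropped elements together with the remaining list.
  def dropWhileCounting[A](xs: List[A])(p: A => Boolean): (Int, List[A]) = {
    @tailrec
    def loop(count: Int, rest: List[A]): (Int, List[A]) = rest match {
      case head :: tail if p(head) => loop(count + 1, tail) // pass the updated counter along
      case _                       => (count, rest)
    }
    loop(0, xs)
  }
}
```

A State monad packages the same `count => (newCount, result)` step so that the counter no longer appears in every signature.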