Abstractions & Design Patterns for Complex IPython Widget Dashboards - ipython

The ipywidgets manual is very helpful for the most part, but it lacks an explanation of how to go about creating complex dashboards. In particular, I am trying to figure out how to:
Design abstractions to assist in building easily extendable dashboards which contain multiple interdependent widgets, some of which hide/show other widgets.
Do this in a way which would allow me to get and set the state of all the widgets as a dict so I could subsequently implement buttons to save and load the configuration of my dashboard to a JSON file.
To make this question more concrete, I designed a minimal example of my current approach, which is based on a design pattern I received from @jasongrout on the Jupyter Widgets channel on Gitter. When answering, please demonstrate your design pattern by reimplementing this example in it, ensuring the two criteria above are clearly fulfilled:
from IPython.display import display
import IPython.display as ipd
import ipywidgets as widgets
import matplotlib.pyplot as plt


class ModelUI:
    def __init__(self):
        self.par = dict()
        self.initUI()

    def initUI(self):
        self.funcPars = widgets.VBox()
        # Parameter widgets must exist before the dropdown that switches between them.
        self.initLinear()
        self.initQuadratic()
        self.initFunc()
        self.initButton()
        self.initOutput()
        self.controls = widgets.HBox([
            widgets.VBox([self.func, self.funcPars]),
            self.plot,
        ])
        self.UI = widgets.VBox([
            self.controls,
            self.output,
        ])

    def initFunc(self):
        self.func = widgets.Dropdown(
            description="Function:",
            options=["Linear", "Quadratic"])
        self.func.observe(self.updateFunc, "value")
        self.updateFunc(None)  # show the parameters for the initial selection

    def initButton(self):
        self.plot = widgets.Button(description="Plot")
        self.plot.on_click(self.plotFunction)

    def updateFunc(self, change):
        # Swap the visible parameter widgets to match the selected function.
        if self.func.value == "Linear":
            self.funcPars.children = self.linPars
            self.par['func'] = "Linear"
        elif self.func.value == "Quadratic":
            self.funcPars.children = self.quadPars
            self.par['func'] = "Quadratic"

    def initLinear(self):
        self.m = widgets.FloatSlider(
            description="m",
            min=-10, max=10, value=2)
        self.k = widgets.FloatSlider(
            description="k",
            min=-10, max=10, value=1)
        self.linPars = [self.m, self.k]
        self.m.observe(self.updateLinear, "value")
        self.k.observe(self.updateLinear, "value")
        self.updateLinear(None)  # seed self.par with the initial slider values

    def updateLinear(self, change):
        self.par['m'] = self.m.value
        self.par['k'] = self.k.value

    def initQuadratic(self):
        self.a = widgets.FloatSlider(
            description="a",
            min=-10, max=10, value=1)
        self.b = widgets.FloatSlider(
            description="b",
            min=-10, max=10, value=2)
        self.c = widgets.FloatSlider(
            description="c",
            min=-10, max=10, value=3)
        self.quadPars = [self.a, self.b, self.c]
        self.a.observe(self.updateQuadratic, "value")
        self.b.observe(self.updateQuadratic, "value")
        self.c.observe(self.updateQuadratic, "value")
        self.updateQuadratic(None)  # seed self.par with the initial slider values

    def updateQuadratic(self, change):
        self.par['a'] = self.a.value
        self.par['b'] = self.b.value
        self.par['c'] = self.c.value

    def initOutput(self):
        self.output = widgets.Output()

    def plotFunction(self, change):
        self.function = {
            "Linear": lambda x: self.par['m']*x + self.par['k'],
            "Quadratic": lambda x: self.par['a']*x**2 + self.par['b']*x + self.par['c']
        }
        with self.output:
            ipd.clear_output()
            xvals = [i/10 for i in range(-100, 100)]
            yvals = list(map(self.function[self.par['func']], xvals))
            plt.plot(xvals, yvals)
            plt.show()

    def _ipython_display_(self):
        display(self.UI)
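To make the second criterion (getting and setting all widget state as a dict) more tangible, here is a minimal sketch of one possible shape for it. It is not a pattern from the ipywidgets documentation: the `WidgetState` class and its method names are hypothetical, and the `SimpleNamespace` objects are stand-ins so the sketch runs without a notebook. The only assumption about real widgets is that value-carrying ones such as `Dropdown` and `FloatSlider` expose a `.value` attribute.

```python
import json
from types import SimpleNamespace


class WidgetState:
    """Hypothetical registry mapping names to objects with a `.value` attribute."""

    def __init__(self):
        self._widgets = {}

    def register(self, name, widget):
        # Return the widget so registration can wrap construction inline.
        self._widgets[name] = widget
        return widget

    def get_state(self):
        return {name: w.value for name, w in self._widgets.items()}

    def set_state(self, state):
        # Unknown keys are skipped so that older config files still load.
        for name, value in state.items():
            if name in self._widgets:
                self._widgets[name].value = value

    def save(self, path):
        with open(path, "w") as fh:
            json.dump(self.get_state(), fh, indent=2)

    def load(self, path):
        with open(path) as fh:
            self.set_state(json.load(fh))


# Stand-ins for ipywidgets objects; a real dashboard would register
# widgets.Dropdown / widgets.FloatSlider instances instead.
state = WidgetState()
func = state.register("func", SimpleNamespace(value="Linear"))
m = state.register("m", SimpleNamespace(value=2.0))
k = state.register("k", SimpleNamespace(value=1.0))

snapshot = state.get_state()
state.set_state({"func": "Quadratic", "m": -3.5})
```

With real widgets, assigning to `.value` fires the callbacks registered through `observe`, so a handler like `updateFunc` would also swap the visible parameter widgets when a configuration is loaded.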


Scala: how to modify the default metric for cross validation

I found the code below on this site:
// Note that the evaluator here is a BinaryClassificationEvaluator and its default metric
// is areaUnderROC.
val cv = new CrossValidator()
.setEvaluator(new BinaryClassificationEvaluator)
.setNumFolds(2) // Use 3+ in practice
.setParallelism(2) // Evaluate up to 2 parameter settings in parallel
As the comment says, the default metric for BinaryClassificationEvaluator is areaUnderROC (AUC).
How can I change this default metric to the F1 score?
I tried:
// Note that the evaluator here is a BinaryClassificationEvaluator and its default metric
// is areaUnderROC.
val cv = new CrossValidator()
.setEvaluator(new BinaryClassificationEvaluator.setMetricName("f1"))
.setNumFolds(2) // Use 3+ in practice
.setParallelism(2) // Evaluate up to 2 parameter settings in parallel
But I got some errors...
I searched many sites but did not find a solution...
setMetricName only accepts "areaUnderPR" or "areaUnderROC". You will need to write your own Evaluator; something like this:
import org.apache.spark.ml.evaluation.Evaluator
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.param.shared.{HasLabelCol, HasPredictionCol}
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.types.IntegerType
import org.apache.spark.sql.{Dataset, functions => F}
class FScoreEvaluator(override val uid: String) extends Evaluator with HasPredictionCol with HasLabelCol {

  def this() = this(Identifiable.randomUID("FScoreEvaluator"))

  def evaluate(dataset: Dataset[_]): Double = {
    // Aggregate counts expressed as column expressions
    val truePositive = F.sum(((F.col(getLabelCol) === 1) && (F.col(getPredictionCol) === 1)).cast(IntegerType))
    val predictedPositive = F.sum((F.col(getPredictionCol) === 1).cast(IntegerType))
    val actualPositive = F.sum((F.col(getLabelCol) === 1).cast(IntegerType))

    val precision = truePositive / predictedPositive
    val recall = truePositive / actualPositive
    val fScore = F.lit(2) * (precision * recall) / (precision + recall)

    // Evaluate the aggregate expression and return it as a Double
    dataset.select(fScore).collect()(0)(0).asInstanceOf[Double]
  }

  override def copy(extra: ParamMap): Evaluator = defaultCopy(extra)
}
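As a sanity check on the aggregate expressions in the evaluator above, the same arithmetic can be sketched in plain Python on small lists. The function name `f1_score` and the sample data are made up for illustration. Returning 0.0 when there are no true positives is a choice made for this sketch only and is not part of the Spark evaluator above.

```python
def f1_score(labels, predictions):
    """F1 for binary labels/predictions encoded as 0 and 1."""
    pairs = list(zip(labels, predictions))
    true_positive = sum(1 for y, p in pairs if y == 1 and p == 1)
    predicted_positive = sum(1 for _, p in pairs if p == 1)
    actual_positive = sum(1 for y, _ in pairs if y == 1)
    if true_positive == 0:
        # Sketch-only convention: avoid dividing by zero.
        return 0.0
    precision = true_positive / predicted_positive
    recall = true_positive / actual_positive
    return 2 * precision * recall / (precision + recall)


labels = [1, 1, 0, 0, 1]
predictions = [1, 0, 0, 1, 1]
score = f1_score(labels, predictions)  # precision = recall = 2/3
```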
Based on the answer of @gmds. Make sure your Spark version is >= 2.3.
You can also follow the implementation of RegressionEvaluator in Spark to implement other custom evaluators.
I also added isLargerBetter so that the instantiated evaluator can be used in model selection (e.g. CV).
import org.apache.spark.ml.evaluation.Evaluator
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.param.shared.{HasLabelCol, HasPredictionCol, HasWeightCol}
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.types.IntegerType
import org.apache.spark.sql.{Dataset, functions => F}
class WRmseEvaluator(override val uid: String) extends Evaluator with HasPredictionCol with HasLabelCol with HasWeightCol {

  def this() = this(Identifiable.randomUID("wrmseEval"))

  def setPredictionCol(value: String): this.type = set(predictionCol, value)

  def setLabelCol(value: String): this.type = set(labelCol, value)

  def setWeightCol(value: String): this.type = set(weightCol, value)

  def evaluate(dataset: Dataset[_]): Double = {
    dataset
      .withColumn("residual", F.col(getLabelCol) - F.col(getPredictionCol))
      .select(
        F.sqrt(F.sum(F.col(getWeightCol) * F.pow(F.col("residual"), 2)) / F.sum(getWeightCol))
      )
      .collect()(0)(0).asInstanceOf[Double]
  }

  override def copy(extra: ParamMap): Evaluator = defaultCopy(extra)

  // A smaller RMSE is better, so model selection should minimize this metric
  override def isLargerBetter: Boolean = false
}
The following is how to use it.
val wrmseEvaluator = new WRmseEvaluator()
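For reference, the quantity this evaluator computes is sqrt(sum(w * (label - prediction)^2) / sum(w)). A small plain-Python sketch of the same formula follows; the function name and the sample numbers are made up for illustration.

```python
import math


def weighted_rmse(labels, predictions, weights):
    """Weighted root-mean-square error over parallel lists."""
    weighted_sq = sum(
        w * (y - p) ** 2 for y, p, w in zip(labels, predictions, weights)
    )
    return math.sqrt(weighted_sq / sum(weights))


# Only the third row has an error (residual -2) and it carries weight 2,
# so the result is sqrt(2 * 4 / 4) = sqrt(2).
value = weighted_rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], [1.0, 1.0, 2.0])
```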

How to use chisel dsptools with floats

I need to convert a Float32 into a Chisel FixedPoint, perform some computation, and convert the FixedPoint result back to Float32.
For example, I need the following:
val a = 3.1F
val b = 2.2F
val res = a * b // REPL returns res: Float 6.82
Now, I do this:
import chisel3.experimental.FixedPoint
val fp_tpe = FixedPoint(6.W, 2.BP)
val a_fix = a.Something (fp_tpe) // convert a to FixPoint
val b_fix = b.Something (fp_tpe) // convert b to FixPoint
val res_fix = a_fix * b_fix
val res0 = res_fix.Something (fp_tpe) // convert back to Float
As a result, I'd expect the difference to be within some small tolerance, e.g.:
val eps = 1e-4
assert ( abs(res - res0) < eps, "The error is too big")
Can anyone provide a working example using the Chisel3 FixedPoint class for the pseudocode above?
Take a look at the following code:
import chisel3._
import chisel3.core.FixedPoint
import dsptools._

class FPMultiplier extends Module {
  val io = IO(new Bundle {
    val a = Input(FixedPoint(6.W, binaryPoint = 2.BP))
    val b = Input(FixedPoint(6.W, binaryPoint = 2.BP))
    val c = Output(FixedPoint(12.W, binaryPoint = 4.BP))
  })

  io.c := io.a * io.b
}

class FPMultiplierTester(c: FPMultiplier) extends DspTester(c) {
  // This will PASS, there is sufficient precision to model the inputs
  poke(c.io.a, 3.25)
  poke(c.io.b, 2.5)
  expect(c.io.c, 8.125)

  // This will FAIL, there is not sufficient precision to model the inputs
  // But this is only caught on output, this is likely the right approach
  // because you can't really pass in wrong precision data in hardware.
  poke(c.io.a, 3.1)
  poke(c.io.b, 2.2)
  expect(c.io.c, 6.82)
}

object FPMultiplierMain {
  def main(args: Array[String]): Unit = {
    iotesters.Driver.execute(Array("-fiv"), () => new FPMultiplier) { c =>
      new FPMultiplierTester(c)
    }
  }
}
I'd also suggest looking at ParameterizedAdder in dsptools, which gives you a feel for how to write hardware modules that can be passed different types. Generally you start with DspReals, confirm the model, and then start experimenting with FixedPoint sizes until the results have the desired precision.
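The reason the second test in FPMultiplierTester fails can be reproduced with ordinary arithmetic. The sketch below is plain Python, not Chisel, and it assumes the inputs are quantized by rounding to the nearest representable value; whether the tester rounds or truncates is an assumption here. With 2 fractional bits the representable step is 0.25, so 3.25 and 2.5 survive unchanged while 3.1 and 2.2 do not.

```python
import math


def quantize(x, binary_point):
    """Round x to the nearest multiple of 2**-binary_point (assumed rounding mode)."""
    scale = 2 ** binary_point
    return math.floor(x * scale + 0.5) / scale


BP = 2  # fractional bits of the inputs in FPMultiplier

# Exactly representable inputs: the product matches the floating-point result.
exact = quantize(3.25, BP) * quantize(2.5, BP)

# 3.1 becomes 3.0 and 2.2 becomes 2.25, so the product drifts away from 6.82.
lossy = quantize(3.1, BP) * quantize(2.2, BP)
error = abs(lossy - 3.1 * 2.2)
```

Under this assumption the first product is 8.125, while the second is 6.75 instead of 6.82, which is far outside a tolerance like the 1e-4 asked for in the question. More fractional bits on the inputs shrink that error.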
For the benefit of others, here is an improved version of the solution from @Chick, rewritten in more abstract Scala with a configurable DSP tolerance.
package my_pkg

import chisel3._
import chisel3.core.{FixedPoint => FP}
import dsptools.{DspTester, DspTesterOptions, DspTesterOptionsManager}

class FPGenericIO (inType:FP, outType:FP) extends Bundle {
  val a = Input(inType)
  val b = Input(inType)
  val c = Output(outType)
}

class FPMul (inType:FP, outType:FP) extends Module {
  val io = IO(new FPGenericIO(inType, outType))
  io.c := io.a * io.b
}

class FPMulTester(c: FPMul) extends DspTester(c) {
  val uut = c.io

  // This will PASS, there is sufficient precision to model the inputs
  poke(uut.a, 3.25)
  poke(uut.b, 2.5)
  expect(uut.c, 3.25*2.5)

  // This will FAIL unless you increase the tolerance, which is eps = 0.0 by default
  poke(uut.a, 3.1)
  poke(uut.b, 2.2)
  expect(uut.c, 3.1*2.2)
}

object FPUMain extends App {
  val fpInType = FP(8.W, 4.BP)
  val fpOutType = FP(12.W, 6.BP)

  // Update default DspTester options and increase tolerance
  val opts = new DspTesterOptionsManager {
    dspTesterOptions = DspTesterOptions(
      fixTolLSBs = 2,
      genVerilogTb = false,
      isVerbose = true
    )
  }

  dsptools.Driver.execute (() => new FPMul(fpInType, fpOutType), opts) {
    c => new FPMulTester(c)
  }
}
Here's my ultimate DSP multiplier implementation, which should support both FixedPoint and DspComplex numbers. @ChickMarkley, how do I update this class to implement a complex multiplication?
package my_pkg

import chisel3._
import dsptools.numbers.{Ring,DspComplex}
import dsptools.numbers.implicits._
import dsptools.{DspContext}
import chisel3.core.{FixedPoint => FP}
import dsptools.{DspTester, DspTesterOptions, DspTesterOptionsManager}

class FPGenericIO[A <: Data:Ring, B <: Data:Ring] (inType:A, outType:B) extends Bundle {
  val a = Input(inType.cloneType)
  val b = Input(inType.cloneType)
  val c = Output(outType.cloneType)

  override def cloneType = (new FPGenericIO(inType, outType)).asInstanceOf[this.type]
}

class FPMul[A <: Data:Ring, B <: Data:Ring] (inType:A, outType:B) extends Module {
  val io = IO(new FPGenericIO(inType, outType))

  DspContext.withNumMulPipes(3) {
    io.c := io.a * io.b
  }
}

class FPMulTester[A <: Data:Ring, B <: Data:Ring](c: FPMul[A,B]) extends DspTester(c) {
  val uut = c.io

  // This will PASS, there is sufficient precision to model the inputs
  poke(uut.a, 3.25)
  poke(uut.b, 2.5)
  expect(uut.c, 3.25*2.5)

  // This will FAIL, there is not sufficient precision to model the inputs
  // But this is only caught on output, this is likely the right approach
  // because you can't really pass in wrong precision data in hardware.
  poke(uut.a, 3.1)
  poke(uut.b, 2.2)
  expect(uut.c, 3.1*2.2)
}

object FPUMain extends App {
  val fpInType = FP(8.W, 4.BP)
  val fpOutType = FP(12.W, 6.BP)
  //val comp = DspComplex[Double] // How to declare a complex DSP type ?

  val opts = new DspTesterOptionsManager {
    dspTesterOptions = DspTesterOptions(
      fixTolLSBs = 0,
      genVerilogTb = false,
      isVerbose = true
    )
  }

  dsptools.Driver.execute (() => new FPMul(fpInType, fpOutType), opts) {
  //dsptools.Driver.execute (() => new FPMul(comp, comp), opts) { // <-- this won't compile
    c => new FPMulTester(c)
  }
}

Spark: Draw learning curve of a model with spark

I am using Spark and I would like to train a machine learning model.
Because of bad results, I would like to display the error made by the model at each epoch of the training (on train and test dataset).
I will then use this information to determine whether my model is underfitting or overfitting the data.
Question: How can I draw the learning curve of a model with Spark?
In the following example, I implemented my own evaluator and overrode the evaluate method to print the metrics I needed, but only two values were displayed (maxIter = 1000).
import org.apache.spark.SparkConf
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit}
import org.apache.spark.sql.SparkSession
object Min extends App {
// Open spark session.
val conf = new SparkConf()
.set("spark.network.timeout", "800")
val ss = SparkSession.builder
// Load data.
val data = ss.createDataFrame(ss.sparkContext.parallelize(
(Vectors.dense(1, 2), 1),
(Vectors.dense(1, 3), 2),
(Vectors.dense(1, 2), 1),
(Vectors.dense(1, 3), 2),
(Vectors.dense(1, 2), 1),
(Vectors.dense(1, 3), 2),
(Vectors.dense(1, 2), 1),
(Vectors.dense(1, 3), 2),
(Vectors.dense(1, 2), 1),
(Vectors.dense(1, 3), 2),
(Vectors.dense(1, 4), 3)
.withColumnRenamed("_1", "features")
.withColumnRenamed("_2", "label")
val Array(training, test) = data.randomSplit(Array(0.8, 0.2), seed = 42)
// Create model of linear regression.
val lr = new LinearRegression().setMaxIter(1000)
// Create parameters grid that will be used to train different version of the linear model.
val paramGrid = new ParamGridBuilder()
.addGrid(lr.regParam, Array(0.001))
.addGrid(lr.elasticNetParam, Array(0.5))
// Create trainer using validation split to evaluate which set of parameters performs the best.
val trainValidationSplit = new TrainValidationSplit()
.setEvaluator(new CustomRegressionEvaluator)
.setTrainRatio(0.8) // 80% of the data will be used for training and the remaining 20% for validation.
// Run train validation split, and choose the best set of parameters.
var model = trainValidationSplit.fit(training)
// Close spark session.
import org.apache.spark.ml.evaluation.{Evaluator, RegressionEvaluator}
import org.apache.spark.ml.param.{Param, ParamMap, Params}
import org.apache.spark.ml.util.{DefaultParamsReadable, DefaultParamsWritable, Identifiable}
import org.apache.spark.mllib.evaluation.RegressionMetrics
import org.apache.spark.sql.{Dataset, Row}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
final class CustomRegressionEvaluator (override val uid: String) extends Evaluator with HasPredictionCol with HasLabelCol with DefaultParamsWritable {
def this() = this(Identifiable.randomUID("regEval"))
def checkNumericType(
schema: StructType,
colName: String,
msg: String = ""): Unit = {
val actualDataType = schema(colName).dataType
val message = if (msg != null && msg.trim.length > 0) " " + msg else ""
require(actualDataType.isInstanceOf[NumericType], s"Column $colName must be of type " +
s"NumericType but was actually of type $actualDataType.$message")
def checkColumnTypes(
schema: StructType,
colName: String,
dataTypes: Seq[DataType],
msg: String = ""): Unit = {
val actualDataType = schema(colName).dataType
val message = if (msg != null && msg.trim.length > 0) " " + msg else ""
s"Column $colName must be of type equal to one of the following types: " +
s"${dataTypes.mkString("[", ", ", "]")} but was actually of type $actualDataType.$message")
var i = 0 // count the number of time the evaluate method is called
override def evaluate(dataset: Dataset[_]): Double = {
val schema = dataset.schema
checkColumnTypes(schema, $(predictionCol), Seq(DoubleType, FloatType))
checkNumericType(schema, $(labelCol))
val predictionAndLabels = dataset
.select(col($(predictionCol)).cast(DoubleType), col($(labelCol)).cast(DoubleType))
.map { case Row(prediction: Double, label: Double) => (prediction, label) }
val metrics = new RegressionMetrics(predictionAndLabels)
val metric = "mae" match {
case "rmse" => metrics.rootMeanSquaredError
case "mse" => metrics.meanSquaredError
case "r2" => metrics.r2
case "mae" => metrics.meanAbsoluteError
println(s"$i $metric") // Print the metrics
i = i + 1 // Update counter
override def copy(extra: ParamMap): RegressionEvaluator = defaultCopy(extra)
object RegressionEvaluator extends DefaultParamsReadable[RegressionEvaluator] {
override def load(path: String): RegressionEvaluator = super.load(path)
private[ml] trait HasPredictionCol extends Params {

  /**
   * Param for prediction column name.
   * @group param
   */
  final val predictionCol: Param[String] = new Param[String](this, "predictionCol", "prediction column name")

  setDefault(predictionCol, "prediction")

  /** @group getParam */
  final def getPredictionCol: String = $(predictionCol)
}

private[ml] trait HasLabelCol extends Params {

  /**
   * Param for label column name.
   * @group param
   */
  final val labelCol: Param[String] = new Param[String](this, "labelCol", "label column name")

  setDefault(labelCol, "label")

  /** @group getParam */
  final def getLabelCol: String = $(labelCol)
}
Here is a possible solution for the specific case of LinearRegression and any other algorithm that supports objective history (in this case, LinearRegressionTrainingSummary does the job).
Let's first create a minimal, complete, and verifiable example:
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.regression.{LinearRegression, LinearRegressionModel}
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit}
import org.apache.spark.mllib.util.{LinearDataGenerator, MLUtils}
import org.apache.spark.sql.SparkSession
val spark: SparkSession = SparkSession.builder().getOrCreate()
import org.apache.spark.ml.evaluation.RegressionEvaluator
import spark.implicits._
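val data = {
  // Generate a synthetic linear dataset as an RDD of LabeledPoint
  val tmp = LinearDataGenerator.generateLinearRDD(
    spark.sparkContext,
    nexamples = 10000,
    nfeatures = 4,
    eps = 0.05
  ).toDF

  // Convert the old mllib vector column to the new ml vector type
  MLUtils.convertVectorColumnsToML(tmp, "features")
}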
As you may have noticed, when you want to generate test data for spark-mllib or spark-ml, it is advisable to use the data generators.
Now, let's train a linear regressor:
// Create model of linear regression.
val lr = new LinearRegression().setMaxIter(1000)
// The following line will create two sets of parameters
val paramGrid = new ParamGridBuilder().addGrid(lr.regParam, Array(0.001)).addGrid(lr.fitIntercept).addGrid(lr.elasticNetParam, Array(0.5)).build()
// Create trainer using validation split to evaluate which set of parameters performs the best.
// I'm using the regular RegressionEvaluator here
val trainValidationSplit = new TrainValidationSplit()
  .setEstimator(lr)
  .setEvaluator(new RegressionEvaluator)
  .setEstimatorParamMaps(paramGrid)
  .setTrainRatio(0.8) // 80% of the data will be used for training and the remaining 20% for validation.
  // To retrieve subModels, make sure to set collectSubModels to true before fitting.
  .setCollectSubModels(true)

// Run train validation split, and choose the best set of parameters.
var model = trainValidationSplit.fit(data)
Now that our model is trained, all we need is to get the objective history.
The following part needs a bit of gymnastics between the model and the sub-model parameters.
If you have a Pipeline or similar, this code needs to be modified, so use it carefully. It's just an example:
val objectiveHist = spark.sparkContext.parallelize(
  model.subModels.zip(model.getEstimatorParamMaps).map {
    case (m: LinearRegressionModel, pm: ParamMap) =>
      val history: Array[Double] = m.summary.objectiveHistory
      val idx: Seq[Int] = 1 until history.length
      // regParam, elasticNetParam, fitIntercept
      val parameters = pm.toSeq.map(pair => (pair.param.name, pair.value.toString)) match {
        case Seq(x, y, z) => (x._2, y._2, z._2)
      }
      (parameters._1, parameters._2, parameters._3, idx.zip(history).toMap)
  }).toDF("regParam", "elasticNetParam", "fitIntercept", "objectiveHistory")
We can now examine those metrics:
// +--------+---------------+------------+-------------------------------------------------------------------------------------------------------+
// |regParam|elasticNetParam|fitIntercept|objectiveHistory |
// +--------+---------------+------------+-------------------------------------------------------------------------------------------------------+
// |0.001 |0.5 |true |[1 -> 0.4999999999999999, 2 -> 0.4038796441909531, 3 -> 0.02659222058006269, 4 -> 0.026592220340980147]|
// |0.001 |0.5 |false |[1 -> 0.5000637621421942, 2 -> 0.4039303922115196, 3 -> 0.026592220673025396, 4 -> 0.02659222039347222]|
// +--------+---------------+------------+-------------------------------------------------------------------------------------------------------+
Notice that the training process actually stops after 4 iterations.
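One way to read the objective history shown above is to look at the relative improvement between consecutive iterations. The plain-Python sketch below uses the first row of the table and compares each step with 1e-6, which is the default `tol` of Spark's LinearRegression. The optimizer's real stopping rule is more involved than this single ratio, so treat it as an illustration of why so few iterations are needed rather than a reimplementation.

```python
# Objective history copied from the first row of the table above (fitIntercept = true).
history = [0.4999999999999999, 0.4038796441909531,
           0.02659222058006269, 0.026592220340980147]


def relative_improvements(values):
    """Relative decrease of the objective between consecutive iterations."""
    return [(prev - cur) / abs(prev) for prev, cur in zip(values, values[1:])]


steps = relative_improvements(history)

TOL = 1e-6  # default `tol` of LinearRegression
# steps[i] compares iteration i + 1 with iteration i + 2 (1-based, as in the table).
converged_at = next((i + 2 for i, r in enumerate(steps) if abs(r) < TOL), None)
```

Here the first two steps improve the objective by roughly 19% and 93%, while the step from iteration 3 to 4 changes it by less than one part in a hundred million, so further iterations would gain almost nothing.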
If you just want the number of iterations, you can do the following instead:
val objectiveHist2 = spark.sparkContext.parallelize(
  model.subModels.zip(model.getEstimatorParamMaps).map {
    case (m: LinearRegressionModel, pm: ParamMap) =>
      val history: Array[Double] = m.summary.objectiveHistory
      // regParam, elasticNetParam, fitIntercept
      val parameters = pm.toSeq.map(pair => (pair.param.name, pair.value.toString)) match {
        case Seq(x, y, z) => (x._2, y._2, z._2)
      }
      (parameters._1, parameters._2, parameters._3, history.size)
  }).toDF("regParam", "elasticNetParam", "fitIntercept", "iterations")
I've changed the number of features in the generator (nfeatures = 100) for the sake of demonstration:
// +--------+---------------+------------+----------+
// |regParam|elasticNetParam|fitIntercept|iterations|
// +--------+---------------+------------+----------+
// | 0.001| 0.5| true| 11|
// | 0.001| 0.5| false| 11|
// +--------+---------------+------------+----------+

Dynamic code evaluation in scala

What is the best way to inject a snippet of code into Scala, something like eval in JavaScript or GroovyScriptEngine? I want to keep my rules/computations/formulas outside the actual data processing class. I have over 100 formulas to execute, and the number will grow over time. The data flow is the same for all of them; only the formulas change. What is the best way to do this in Scala?
You could use either the scala-lang API or twitter-eval for that. Here is a snippet showing a simple use case with scala-lang:
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.IMain

object ScalaReflectEvaluator {

  def evaluate() = {
    val clazz = prepareClass
    val settings = new Settings
    settings.usejavacp.value = true
    settings.deprecation.value = true

    val eval = new IMain(settings)
    val evaluated = eval.interpret(clazz)
    val res = eval.valueOfTerm("res0").get.asInstanceOf[Int]
    println(res) //yields 9
  }

  private def prepareClass: String = {
    s"""
       |val x = 4
       |val y = 5
       |x + y
       |""".stripMargin
  }
}
or with twitter:
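import com.twitter.util.Eval

object TwitterUtilEvaluator {
  def evaluate() = {
    val clazz = prepareClass
    val eval = new Eval
    println(eval.apply[Int](clazz)) // compiles and evaluates the snippet as an Int
  }

  private def prepareClass: String = {
    s"""
       |val x = 4
       |val y = 5
       |x + y
       |""".stripMargin
  }
}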
I am not able to compile it at the moment to check whether I have missed something but you should get the idea.
I've found that scala.tools.reflect.ToolBox is the fastest eval in Scala (I measured the interpreter, Twitter's Eval, and a custom tool). Its API:
import scala.reflect.runtime.universe
import scala.tools.reflect.ToolBox
val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()

Is there some way to use MacroImplementations code for "f" string interpolator in my macro?

Currently, I tried doing something like the following:
def macroImpl(cx: Context)(...) = {
new MacroImplementations { val c = cx }
but it complains that c in MacroImplementations is of type scala.reflect.macros.runtime.Context, while cx is of type scala.reflect.macros.Context.
What is the difference between those two contexts?
I ended up with the following solution - it's quite ugly, but it works:
import scala.language.experimental.macros
import scala.reflect.macros.Context
import scala.reflect.macros.runtime.{Context => ContextR}
import scala.tools.reflect.MacroImplementations
def putfImpl(cx: Context)(args: cx.Expr[Any]*): cx.Expr[Unit] = {
val cx2 = cx.asInstanceOf[ContextR]
val args2 = args.toList.asInstanceOf[List[cx2.Expr[Any]]]
import cx2.universe._
val Apply(_, List(Apply(_, partsE))) = cx2.prefix.tree
val mi = new { val c : cx2.type = cx2 } with MacroImplementations
val res = mi.macro_StringInterpolation_f(partsE, args2.map(_.tree), cx2.enclosingPosition)