Call a function with arguments from a list - scala

Is there a way to call a function with arguments from a list? The equivalent in Python is sum(*args).
// Scala
def sum(x: Int, y: Int) = x + y
val args = List(1, 4)
sum.???(args) // equivalent to sum(1, 4)
sum(args: _*) wouldn't work here.
Please don't suggest changing the function's declaration. I'm already aware of functions with repeated parameters, e.g. def sum(args: Int*).

Well, you can write
sum(args(0), args(1))
But I assume you want this to work for any list length? Then you would go for fold or reduce:
args.reduce(sum) // args must be non-empty!
(0 /: args)(sum) // aka args.foldLeft(0)(sum)
These methods assume a pair-wise reduction of the list. For example, foldLeft[B](init: B)(fun: (B, A) => B): B reduces a list of elements of type A to a single element of type B. In this example, A = B = Int. It starts with the initial value init. Since you want to sum, the sum of an empty list would be zero. It then calls the function with the current accumulator (the running sum) and each successive element of the list.
So it's like
var result = 0
result = sum(result, 1)
result = sum(result, 4)
...
The reduce method assumes that the list is non-empty and requires that the element type doesn't change (the function must map from two Ints to an Int).
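For the concrete args above, the two approaches agree; they differ only on the empty list:

```scala
def sum(x: Int, y: Int) = x + y

val args = List(1, 4)
val folded  = args.foldLeft(0)(sum) // 5
val reduced = args.reduce(sum)      // 5

val emptySum = List.empty[Int].foldLeft(0)(sum) // 0: the initial value covers the empty case
// List.empty[Int].reduce(sum)                  // would throw UnsupportedOperationException
```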

I wouldn't recommend it for most uses, since it's complicated, hard to read, and bypasses compile-time checks, but if you know what you're doing and really need this, you can use reflection. It works with arbitrary parameter types. For example, here's how you might call a constructor with arguments from a list:
import scala.reflect.runtime.universe
class MyClass(
  val field1: String,
  val field2: Int,
  val field3: Double)
// Get our runtime mirror
val runtimeMirror = universe.runtimeMirror(getClass.getClassLoader)
// Get the MyClass class symbol
val classSymbol = universe.typeOf[MyClass].typeSymbol.asClass
// Get a class mirror for the MyClass class
val myClassMirror = runtimeMirror.reflectClass(classSymbol)
// Get a MyClass constructor representation
val myClassCtor = universe.typeOf[MyClass].decl(universe.termNames.CONSTRUCTOR).asMethod
// Get an invokable version of the constructor
val myClassInvokableCtor = myClassMirror.reflectConstructor(myClassCtor)
val myArgs: List[Any] = List("one", 2, 3.0)
val myInstance = myClassInvokableCtor(myArgs: _*).asInstanceOf[MyClass]
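The same machinery works for ordinary methods via reflectMethod. The Adder object below is a hypothetical home for the question's sum; only the reflection calls are the point:

```scala
import scala.reflect.runtime.universe

object Adder {
  def sum(x: Int, y: Int): Int = x + y
}

val mirror = universe.runtimeMirror(Adder.getClass.getClassLoader)
// Reflect the Adder instance, then look up and reflect its `sum` method
val instanceMirror = mirror.reflect(Adder)
val sumSymbol = universe.typeOf[Adder.type].decl(universe.TermName("sum")).asMethod
val invokableSum = instanceMirror.reflectMethod(sumSymbol)

val args: List[Any] = List(1, 4)
val result = invokableSum(args: _*).asInstanceOf[Int] // equivalent to Adder.sum(1, 4)
```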

Related

Scala Map as parameters for spark ML models

I have developed a tool using pyspark. In that tool, the user provides a dict of model parameters, which is then passed to a spark.ml model such as LogisticRegression in the form of LogisticRegression(**params).
Since I am moving to Scala now, I was wondering how this can be done in Spark using Scala? Coming from Python, my intuition is to pass a Scala Map such as:
val params = Map("regParam" -> 100)
val model = new LogisticRegression().set(params)
Obviously, it's not as trivial as that. It seems that in Scala we need to set every single parameter separately, like:
val model = new LogisticRegression()
.setRegParam(0.3)
I really want to avoid being forced to iterate over all user input parameters and set the appropriate parameters with tons of if clauses.
Any ideas how to solve this as elegantly as in Python?
According to the LogisticRegression API you need to set each param individually via setter:
Users can set and get the parameter values through setters and
getters, respectively.
An idea is to build your own mapping function to dynamically call the corresponding param setter using reflection.
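To make that idea concrete, here is a minimal sketch using plain Java reflection. The Model class below is a hypothetical stand-in for Spark's LogisticRegression (the real class has many more params), and the setter-naming convention ("regParam" -> setRegParam) is the assumption this relies on:

```scala
// Hypothetical stand-in for a spark.ml model with setter-style params
class Model {
  private var regParam: Double = 0.0
  def setRegParam(v: Double): Model = { regParam = v; this }
  def getRegParam: Double = regParam
}

// Look up "setFoo" for each key "foo" and invoke it reflectively
def applyParams(model: AnyRef, params: Map[String, Any]): Unit =
  params.foreach { case (name, value) =>
    val setterName = "set" + name.capitalize
    val setter = model.getClass.getMethods.find(_.getName == setterName)
      .getOrElse(sys.error(s"No setter for param '$name'"))
    setter.invoke(model, value.asInstanceOf[AnyRef])
  }

val m = new Model
applyParams(m, Map("regParam" -> 0.3))
m.getRegParam
```

This trades away compile-time checking of both param names and value types, which is exactly why the per-setter API exists.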
Scala is a statically typed language and hence, by design, has nothing like Python's **params. As already noted, you can store the parameters in a Map[K, Any], but the values are then statically typed as Any, and the JVM's type erasure means generic types (like List[Int]) can't be recovered at runtime.
Shapeless provides some neat mixed-type features that can circumvent the problem. An alternative is to use Scala's TypeTag to preserve type information, as in the following example:
import scala.reflect.runtime.universe._
case class Params[K](m: Map[(K, TypeTag[_]), Any]) extends AnyVal {
  def add[V](k: K, v: V)(implicit vt: TypeTag[V]) = this.copy(
    m = this.m + ((k, vt) -> v)
  )
  def grab[V](k: K)(implicit vt: TypeTag[V]) = m((k, vt)).asInstanceOf[V]
}
val params = Params[String](Map.empty).
add[Int]("a", 100).
add[String]("b", "xyz").
add[Double]("c", 5.0).
add[List[Int]]("d", List(1, 2, 3))
// params: Params[String] = Params( Map(
// (a,TypeTag[Int]) -> 100, (b,TypeTag[String]) -> xyz, (c,TypeTag[Double]) -> 5.0,
// (d,TypeTag[scala.List[Int]]) -> List(1, 2, 3)
// ) )
params.grab[Int]("a")
// res1: Int = 100
params.grab[String]("b")
// res2: String = xyz
params.grab[Double]("c")
// res3: Double = 5.0
params.grab[List[Int]]("d")
// res4: List[Int] = List(1, 2, 3)

Expression of type list(object) doesn't conform to expected list scala

Supposedly I have the following
case class test {
a:string
b: string
c: Int
d: Int }
var temp = List(test("lol","lel",1,2))
var total = List(test)
total = total:::temp // this doesn't work because temp is of type List[test] while total is of type List[test.type]
I do not understand the difference.
The reason I want to use this is that I want to have a running list where elements will be conditionally added in a loop.
So in this instance, total should initially be an empty list that takes test objects. How do I do this?
Any feedback is appreciated!
Let me begin by explaining a few basics about Scala.
In Scala, you define a class like following,
scala> class Demo(a: String, b: Int) {
| def stringify: String = a + " :: " + b
| }
// defined class Demo
You can think of a class as a blueprint given to Scala which will be used to create instances of that class. Here, every instance of class Demo will have two properties - a which will be a String and b which will be an Int and one method - stringify which will return a String.
scala> val demo1 = new Demo("demo1", 1)
// demo1: Demo = Demo@21eee94f
scala> demo1.getClass
// res0: Class[_ <: Demo] = class Demo
Here demo1 is an instance of class Demo and has type Demo.
Scala also has a concept of objects, which are singleton instances of specially generated classes.
scala> object OtherDemo {
| val a: Int = 10
| }
// defined object OtherDemo
scala> OtherDemo.getClass
// res2: Class[_ <: OtherDemo.type] = class OtherDemo$
Here OtherDemo will be the only instance of that specially generated class OtherDemo$ and has type OtherDemo.type.
And then there are case class in Scala
scala> case class AnotherDemo(a: Int)
// defined class AnotherDemo
This will create not only a class AnotherDemo but also an object AnotherDemo, which we call its companion object. It is equivalent to:
class AnotherDemo(a: Int)
object AnotherDemo {
def apply(a: Int): AnotherDemo = new AnotherDemo(a)
def unapply(anotherDemo: AnotherDemo): Option[Int] = Some(anotherDemo.a)
// And many more utility functions
}
We call this object AnotherDemo as companion object of class AnotherDemo.
We can create instances of AnotherDemo in two ways,
// By using new keyword, as we can do for any class
scala> val anotherDemo1 = new AnotherDemo(1)
// anotherDemo1: AnotherDemo = AnotherDemo(1)
// Or we can use `apply` method provided by companion object
scala> val anotherDemo2 = AnotherDemo(2)
// anotherDemo2: AnotherDemo = AnotherDemo(2)
scala> anotherDemo1.getClass
// res6: Class[_ <: AnotherDemo] = class AnotherDemo
scala> anotherDemo2.getClass
// res7: Class[_ <: AnotherDemo] = class AnotherDemo
scala> AnotherDemo.getClass
// res8: Class[_ <: AnotherDemo.type] = class AnotherDemo$
Also, in Scala your class names should start with a capital letter. This lets you easily distinguish them from instance variables, which should start with a lowercase letter, and helps you avoid confusion.
Now, it is supposed to be a: String and not a: string.
scala> case class Test(
| a: String,
| b: String,
| c: Int,
| d: Int
| )
// defined class Test
Now, when you write,
scala> var temp = List(Test("lol","lel",1,2))
// temp: List[Test] = List(Test(lol,lel,1,2))
It is actually equivalent to,
var temp = List.apply(Test.apply("lol","lel",1,2))
Or,
val test1 = Test.apply("lol","lel",1,2)
var temp = List.apply(test1)
The Test in Test.apply is not your class Test but the companion object Test. And calling Test.apply returns an instance of class Test which is being passed to List.apply to finally get a List of type List[Test] containing this instance of Test.
But when you write this,
scala> var total = List(Test)
// total: List[Test.type] = List(Test)
You are creating a List of type List[Test.type] containing that companion object of Test.
Focus on the total: List[Test.type] part. It means that total is a variable of type List[Test.type], so it can only point to a value of type List[Test.type] and will refuse to point to anything else.
Now... you are trying to do this,
total = total ::: temp
Which is equivalent to,
val x = total ::: temp
total = x
which is actually,
val x = temp.:::(total)
total = x
Now look at this val x = total ::: temp,
scala> val x = total ::: temp
// x: List[Serializable] = List(Test, Test(lol,lel,1,2))
You see... this x is of type List[Serializable]. So when you try total = x, you will get following error,
scala> total = x
// <console>:13: error: type mismatch;
// found : List[Serializable]
// required: List[Test.type]
// total = x
// ^
Which means that total required a List[Test.type] but you are giving it a List[Serializable].
You are looking for total = List.empty[test] rather than List(test).
The former creates an empty list of type List[test]; the latter is a one-element list of type List[test.type] (test.type is not the same as test: it is the singleton type of the companion object, not the type of instances of test).
Also, do not use var. They are evil, and not really needed in Scala in 99% of use-cases. Just pretend that keyword does not exist at all, until you have enough of a grip on the language to confidently distinguish the other 1%.
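Concretely, with the class renamed to the conventional Test:

```scala
case class Test(a: String, b: String, c: Int, d: Int)

val temp  = List(Test("lol", "lel", 1, 2))
val total = List.empty[Test] ::: temp // List[Test], no type mismatch

// Conditional accumulation without var: fold over the source, keeping matches
val kept = temp.foldLeft(List.empty[Test]) { (acc, t) =>
  if (t.c > 0) t :: acc else acc
}
```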
When you do this:
var total = List(test)
You are not instantiating the class test, which is why the type of the list is List[test.type]; you are only creating a list containing a template for objects.
When you do this instead:
var temp = List(test("lol","lel",1,2))
You have an object instantiated from a template (a class, in this case test), so the type of temp is List[test].
So, if you do something like:
val template = Test
Then the type of template is test.type
And you can instantiate an object Test from template like this:
val instantiated = template("lol","lel",1,2)
As you can see in your example, the total variable is just a list of templates from which you can instantiate objects, while the temp variable is a list of instantiated objects.
To create an empty list of objects of type Test you just have to do:
val t: List[Test] = List.empty
Then you can add any object (of type Test) to this list
Based on your description ('I want to have a running list where elements will be conditionally added in a loop'), my understanding is that you are getting Test objects from some source and want to put them in a list but only if they meet certain criteria. We can express this requirement as a method. For convenience, we'll put the method in the Test companion object. Companion objects are a place to put things that should be available without having to instantiate any objects.
case class Test(a: String, b: String, c: Int, d: Int)
object Test {
  /**
   * Returns a list of `Test` objects that pass the given criteria.
   *
   * @param tests some source of tests that we can loop over one at a time.
   * @param condition checks whether a `Test` object should go into our output list.
   */
  def runningList(tests: Iterable[Test])(condition: Test => Boolean): List[Test] =
    tests.filter(condition).toList
}
You can use it like (e.g.):
Test.runningList(testsSource) { test => test.c > 0 && test.d < 100 }
As you can see here, I've used a few Scala features, like iterables and their list conversion method, multi-parameter-list methods, first-class functions, function-as-last-argument DSL, and so on. If you have more questions on these topics I'd recommend a Scala tutorial.

Scala: a template for function to accept only a certain arity and a certain output?

I have a class, where all of its functions have the same arity and same type of output. (Why? Each function is a separate processor that is applied to a Spark DataFrame and yields another DataFrame).
So, the class looks like this:
class Processors {
def p1(df: DataFrame): DataFrame {...}
def p2(df: DataFrame): DataFrame {...}
def p3(df: DataFrame): DataFrame {...}
...
}
I then apply all the methods to a given DataFrame by mapping over Processors.getClass.getMethods, which allows me to add more processors without changing anything else in the code.
What I'd like to do is define a template to the methods under Processors which will restrict all of them to accept only one DataFrame and return a DataFrame. Is there a way to do this?
Implementing a restriction on what kind of functions can be added to a "list" is possible by using an appropriate container class instead of a generic class to hold the methods that are restricted. The container of restricted methods can then be part of some new class or object or part of the main program.
What you lose below by using containers (e.g. a Map with string keys and restricted values) to hold specific kinds of functions is compile-time checking of the names of the methods. e.g. calling triple vs trilpe
The restriction of a function to take a type T and return that same type T can be defined as a type F[T] using Function1 from the scala standard library. Function1[A,B] allows any single-parameter function with input type A and output type B, but we want these input/output types to be the same, so:
type F[T] = Function1[T,T]
For a container, I will demonstrate scala.collection.mutable.ListMap[String,F[T]] assuming the following requirements:
string names reference the functions (doThis, doThat, instead of 1, 2, 3...)
functions can be added to the list later (mutable)
though you could choose some other mutable or immutable collection class (e.g. Vector[F[T]] if you only want to number the methods) and still benefit from the restriction of what kind of functions future developers can include into the container.
An abstract type can be defined as:
type TaskMap[T] = ListMap[String, F[T]]
For your specific application you would then instantiate this as:
val Processors: TaskMap[DataFrame] = ListMap(
"p1" -> ((df: DataFrame) => {...code for p1 goes here...}),
"p2" -> ((df: DataFrame) => {...code for p2 goes here...}),
"p3" -> ((df: DataFrame) => {...code for p3 goes here...})
)
and then to call one of these functions you use
Processors("p2")(someDF)
For simplicity of demonstration, let's forget about DataFrames for a moment and consider whether this scheme works with integers.
Consider the short program below. The collection "myTasks" can only contain functions from Int to Int. All of the lines below have been tested in the scala interpreter, v2.11.6, so you can follow along line by line.
import scala.collection.mutable.ListMap
type F[T] = Function1[T,T]
type TaskMap[T] = ListMap[String, F[T]]
val myTasks: TaskMap[Int] = ListMap(
"negate" -> ((x:Int)=>(-x)),
"triple" -> ((x:Int)=>(3*x))
)
we can add a new function to the container that adds 7 and name it "add7"
myTasks += ( "add7" -> ((x:Int)=>(x+7)) )
and the scala interpreter responds with:
res0: myTasks.type = Map(add7 -> <function1>, negate -> <function1>, triple -> <function1>)
but we can't add a function named "half" because it would return a Double, and a Double is not an Int, so it triggers a type error:
myTasks += ( "half" -> ((x:Int)=>(0.5*x)) )
Here we get this error message:
scala> myTasks += ( "half" -> ((x:Int)=>(0.5*x)) )
<console>:12: error: type mismatch;
found : Double
required: Int
myTasks += ( "half" -> ((x:Int)=>(0.5*x)) )
^
In a compiled application, this would be found at compile time.
Calling functions stored this way is a bit more verbose for single calls, but can be very convenient.
Suppose we want to call "triple" on 10.
We can't write
triple(10)
<console>:9: error: not found: value triple
Instead it is
myTasks("triple")(10)
res4: Int = 30
Where this notation becomes more useful is if you have a list of tasks to perform but only want to allow tasks listed in myTasks.
Suppose we want to run all the tasks on the input data "10"
myTasks mapValues { _ apply 10 }
res9: scala.collection.Map[String,Int] =
Map(add7 -> 17, negate -> -10, triple -> 30)
Suppose we want to triple, then add7, then negate
If each result is desired separately, as above, that becomes:
List("triple","add7","negate") map myTasks.apply map { _ apply 10 }
res11: List[Int] = List(30, 17, -10)
But "triple, then add 7, then negate" could also describe a series of steps to apply to 10, i.e. we want -((3*10)+7), and Scala can do that too:
val myProgram = List("triple","add7","negate")
myProgram map myTasks.apply reduceLeft { _ andThen _ } apply 10
res12: Int = -37
opening the door to writing an interpreter for your own customizable set of tasks because we can also write
val magic = myProgram map myTasks.apply reduceLeft { _ andThen _ }
and magic is then a function from Int to Int that can take arbitrary Ints or otherwise do work as a function should.
scala> magic(1)
res14: Int = -10
scala> magic(2)
res15: Int = -13
scala> magic(3)
res16: Int = -16
scala> List(10,20,30) map magic
res17: List[Int] = List(-37, -67, -97)
Is this what you mean?
class Processors {
type Template = DataFrame => DataFrame
val p1: Template = ...
val p2: Template = ...
val p3: Template = ...
def applyAll(df: DataFrame): DataFrame =
p1(p2(p3(df)))
}
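Relatedly, if the processors live in an ordered collection, the standard library's Function.chain composes them into a single pipeline. Here Int => Int stands in for DataFrame => DataFrame so the sketch is runnable on its own:

```scala
type Template = Int => Int // stand-in for DataFrame => DataFrame

val steps: List[Template] = List(_ + 1, _ * 2)
val pipeline: Template = Function.chain(steps) // applies the functions left to right

pipeline(3) // (3 + 1) * 2 = 8
```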

Partially applied isBefore function in Scala gives error

I am trying to merge two sequences of dates in Scala such that the merged sequence has sorted elements. I am using a partial implementation of isBefore as follows:
val seq1 = Seq(LocalDate.of(2014, 4, 5), LocalDate.of(2013, 6 ,7), LocalDate.of(2014, 3, 1))
val seq2 = Seq(LocalDate.of(2012, 2, 2), LocalDate.of(2015, 2, 1))
var arr = (seq1 ++ seq2).sortWith(_.isBefore(_) = 1)
println(arr)
But it shows compilation error for the isBefore function:
Multiple markers at this line
- missing arguments for method isBefore in class LocalDate; follow this method with `_' if you want to
treat it as a partially applied function
- missing arguments for method isBefore in class LocalDate; follow this method with `_' if you want to
treat it as a partially applied function
I am relatively new to Scala. What seems to be the problem?
First of all, there is no such term as "partial implementation", at least not that I've heard of; I guess you meant partial application. But there is no partial application in this case either: partial application is about curried functions, which is what the compiler is trying to tell you in the error message. An example:
def test(a: String)(f: String => String) = f(a)
val onString = test("hello world") _
onString(_.capitalize)
test: (a: String)(f: String => String)String
onString: (String => String) => String = <function1>
res8: String = Hello world
This is partial application: you take a curried function, which returns another function, pass it one argument (partially applying it), and later pass the remaining argument.
As for your sorting problem, that approach should work. I don't know which library you are using, but with joda-time it's similar. The problem is the assignment (_.isBefore(_) = 1), which is illegal. It should be like this:
val seq1 = Seq(LocalDate.parse("2014-04-05"), LocalDate.parse("2013-06-07"), LocalDate.parse("2014-03-01"))
val seq2 = Seq(LocalDate.parse("2012-02-02"), LocalDate.parse("2015-02-01"))
var arr = (seq1 ++ seq2).sortWith(_.isBefore(_))
arr: Seq[org.joda.time.LocalDate] = List(2012-02-02, 2013-06-07, 2014-03-01, 2014-04-05, 2015-02-01)
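Note that with java.time.LocalDate (which implements Comparable) you can alternatively use sorted, which picks up an Ordering implicitly:

```scala
import java.time.LocalDate

val seq1 = Seq(LocalDate.parse("2014-04-05"), LocalDate.parse("2013-06-07"), LocalDate.parse("2014-03-01"))
val seq2 = Seq(LocalDate.parse("2012-02-02"), LocalDate.parse("2015-02-01"))

// LocalDate implements Comparable, so `sorted` can summon an Ordering for it
val arr = (seq1 ++ seq2).sorted
```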

Why we need implicit parameters in scala?

I am new to Scala, and today when I came across this Akka source code I was puzzled:
def traverse[A, B](in: JIterable[A], fn: JFunc[A, Future[B]],
executor: ExecutionContext): Future[JIterable[B]] = {
implicit val d = executor
scala.collection.JavaConversions.iterableAsScalaIterable(in).foldLeft(
Future(new JLinkedList[B]())) { (fr, a) ⇒
val fb = fn(a)
for (r ← fr; b ← fb) yield { r add b; r }
}
}
Why is the code intentionally written using an implicit parameter? Why can't it be written as:
scala.collection.JavaConversions.iterableAsScalaIterable(in).foldLeft(
Future(new JLinkedList[B](),executor))
without declaring a new implicit value d? Is there any advantage to doing this? For now I only find that implicits increase the ambiguity of the code.
I can give you 3 reasons.
1) It hides boilerplate code.
Let's sort some lists:
import math.Ordering
List(1, 2, 3).sorted(Ordering.Int) // Fine. I can tell compiler how to sort ints
List("a", "b", "c").sorted(Ordering.String) // .. and strings.
List(1 -> "a", 2 -> "b", 3 -> "c").sorted(Ordering.Tuple2(Ordering.Int, Ordering.String)) // Not so fine...
With implicit parameters:
List(1, 2, 3).sorted // Compiler knows how to sort ints
List(1 -> "a", 2 -> "b", 3 -> "c").sorted // ... and some other types
2) It allows you to create APIs with generic methods:
scala> (70 to 75).map{ _.toChar }
res0: scala.collection.immutable.IndexedSeq[Char] = Vector(F, G, H, I, J, K)
scala> (70 to 75).map{ _.toChar }(collection.breakOut): String // You can change default behaviour.
res1: String = FGHIJK
3) It allows you to focus on what really matters:
Future(new JLinkedList[B]())(executor) // matters: what to do, `new JLinkedList[B]()`; doesn't matter: how to do it, `executor`
It's not so bad, but what if you need 2 futures:
val f1 = Future(1)(executor)
val f2 = Future(2)(executor) // You have to specify the same executor every time.
Implicit creates "context" for all actions:
implicit val d = executor // All `Future` in this scope will be created with this executor.
val f1 = Future(1)
val f2 = Future(2)
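A self-contained version of that pattern, using the standard global execution context:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

implicit val ec: ExecutionContext = ExecutionContext.global

// Both futures pick up `ec` from implicit scope; no executor argument in sight
val f1 = Future(1)
val f2 = Future(2)
val sum = for (a <- f1; b <- f2) yield a + b

Await.result(sum, 1.second) // 3
```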
3.5) Implicit parameters allow type-level programming. See shapeless.
About "ambiguity of the code":
You don't have to use implicits; alternatively, you can specify all parameters explicitly. It sometimes looks ugly (see the sorted example), but you can do it.
If you can't find which implicit values are used as parameters, you can ask the compiler:
>echo object Test { List( (1, "a") ).sorted } > test.scala
>scalac -Xprint:typer test.scala
You'll find math.this.Ordering.Tuple2[Int, java.lang.String](math.this.Ordering.Int, math.this.Ordering.String) in output.
In the code from Akka you linked, it is true that executor could be just passed explicitly. But if there was more than one Future used throughout this method, declaring implicit parameter would definitely make sense to avoid passing it around many times.
So I would say that in the code you linked, implicit parameter was used just to follow some code style. It would be ugly to make an exception from it.
Your question intrigued me, so I searched a bit on the net. Here's what I found on this blog: http://daily-scala.blogspot.in/2010/04/implicit-parameters.html
What is an implicit parameter?
An implicit parameter is a parameter to method or constructor that is marked as implicit. This means that if a parameter value is not supplied then the compiler will search for an "implicit" value defined within scope (according to resolution rules.)
Why use an implicit parameter?
Implicit parameters are very nice for simplifying APIs. For example the collections use implicit parameters to supply CanBuildFrom objects for many of the collection methods. This is because normally the user does not need to be concerned with those parameters. Another example is supplying an encoding to an IO library so the encoding is defined once (perhaps in a package object) and all methods can use the same encoding without having to define it for every method call.