Writing a generic mean function in Scala


I'm trying to write a generic mean function that operates on an Iterable that contains numeric types. It would operate, say, on arrays, like so:
val rand = new scala.util.Random()
val a = Array.fill(1000) { rand.nextInt(101) }
val b = Array.fill(1000) { rand.nextDouble }
println(mean(a))
println(mean(b))
etc., hopefully being able to work on other iterables, such as lists.
I have tried various incantations for the mean method, to no avail:
def mean[T <% Numeric[T]](xs: Iterable[T]) = xs.sum.toDouble / xs.size
def mean[A](xs: Iterable[Numeric[A]]):Double = xs.sum.toDouble / xs.size
def mean[T](xs: Iterable[T])(implicit num: Numeric[T]):Double = xs.sum / xs.size
def mean(xs: Iterable[Double]) = xs.sum / xs.size
What is the proper way to do this in Scala?

This works:
def mean[T : Numeric](xs: Iterable[T]): T = implicitly[Numeric[T]] match {
case num: Fractional[_] => import num._; xs.sum / fromInt(xs.size)
case num: Integral[_] => import num._; xs.sum / fromInt(xs.size)
case _ => sys.error("Undivisable numeric!")
}
Some explanation is in order. First, Numeric must be used as a type class. That is, you don't say that a type T is, or can be converted into, a Numeric. Instead, Numeric[T] provides methods that operate on values of type T. One such method is num.fromInt.
Next, Numeric does not provide a common division operator. Instead, one must choose between Fractional (which has div) and Integral (which has quot). Here, I'm matching on the Numeric[T] instance to distinguish between the two.
Note that I don't use T on the match, because Scala cannot check for type parameters on matches, as they are erased. Instead, I use _, and Scala infers the correct type if possible (as it is here).
After that, I'm importing num._, where num is either Fractional or Integral. This brings some implicit conversions into context that let me do stuff like calling the method / directly. If I did not do that import, I'd be forced to write this in the Fractional case (Integral has quot instead of div):
num.div(xs.sum, num.fromInt(xs.size))
Note that I do not have to pass the implicit parameter to xs.sum, since it is already implicitly available in the scope.
I guess that's it. Did I miss anything?
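For comparison, here is a minimal sketch of the same idea that names T in the patterns and calls div and quot explicitly instead of importing the operators. It assumes Scala 2.13 or later; the object name MeanSketch is made up for illustration and the compiler may emit an unchecked warning on the patterns.

```scala
object MeanSketch {
  // Mean that stays in the element type T.
  // Fractional instances divide with div, Integral instances with quot.
  def mean[T](xs: Iterable[T])(implicit num: Numeric[T]): T = num match {
    case frac: Fractional[T] => frac.div(xs.sum, frac.fromInt(xs.size))
    case int: Integral[T]    => int.quot(xs.sum, int.fromInt(xs.size))
    case _                   => sys.error("Numeric instance supports neither div nor quot")
  }
}
```

With this sketch, MeanSketch.mean(List(1.0, 2.0, 3.0)) stays a Double, while MeanSketch.mean(List(1, 2, 4)) uses integer division and returns an Int.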

One of your versions is pretty close:
def mean[T](xs: Iterable[T])(implicit num: Numeric[T]):Double =
num.toDouble(xs.sum) / xs.size
Here is the other syntax:
def mean[T: Numeric](xs: Iterable[T]):Double =
implicitly[Numeric[T]].toDouble(xs.sum) / xs.size

def mean[A](it:Iterable[A])(implicit n:Numeric[A]) = {
it.map(n.toDouble).sum / it.size
}
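None of the versions above says what should happen for an empty collection, where the division is by zero. A small sketch of one possible choice, returning an Option; the names SafeMean and meanOption are hypothetical and not part of any answer here.

```scala
object SafeMean {
  // Returns None for an empty collection instead of dividing by zero.
  def meanOption[A](xs: Iterable[A])(implicit n: Numeric[A]): Option[Double] =
    if (xs.isEmpty) None
    else Some(n.toDouble(xs.sum) / xs.size)
}
```

The caller then decides what an empty input means, for example SafeMean.meanOption(values).getOrElse(0.0).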

This is quite an old question, but I am basically doing this
def average[A](list: List[Any])(implicit numerics: Numeric[A]): Double = {
  // Option(null) is None, so flatMap drops the null elements
  list.flatMap(Option(_)) match {
    case Nil => 0.0
    // sum only the non-null elements; casting a null would fail for primitive A
    case definedElements => numerics.toDouble(definedElements.map(_.asInstanceOf[A]).sum) / definedElements.length.toDouble
  }
}
for a list which might contain null values (I have to keep interoperability with Java). The null elements are not counted towards the average.

Related

Convert Array[AnyVal] to primitive Array[T], T can be Int, Double and etc in scala

I learnt that in Scala Array is not a covariant collection. If I have an array of AnyVal, and all elements in the array have the same type, how can I make it an array of primitives?
I am thinking of using the first element of the array to detect the data type, like the code below:
def convert(arr:Array[AnyVal]):Array[_] = {
val firstElement = arr.head
firstElement match {
case y:Int => ???
case y:Long => ???
case y:Float => ???
case y:Double => ???
...
}
}

The return type of a function is defined at compile time. You can't change the return type of a function based on data that is passed to the function.
The type of a value is also defined at compile time at the point where it is created. However that value can be held in a variable of any supertype of the value type, which is what makes the type system so powerful. Scala also makes it easy to tell the actual type of a value by using pattern matching.
If you know the type you want at compile time, use collect to narrow the type of a value using pattern matching. For example, to get a list of Double from a list of AnyVal do this:
val doubles: List[Double] = anyList.collect{ case d: Double => d }
Any non-Double values will be discarded.
Also note that Array is a Java hang-over, so prefer Scala types like List or Vector.
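The same collect approach also works directly on an Array if you do need one. A minimal sketch, assuming the target element type (Double here) is known at compile time; the object name and sample values are made up for illustration.

```scala
object CollectSketch {
  // An array whose static element type is AnyVal, holding mixed values.
  val mixed: Array[AnyVal] = Array(1.5, 2, 3.25, 'x')

  // collect keeps only the Double elements; the result is typed Array[Double].
  val doubles: Array[Double] = mixed.collect { case d: Double => d }
}
```

Here the Int 2 and the Char 'x' are silently dropped, so check the resulting length if every element is expected to match.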

I am not sure if this is an optimal solution (or for that matter the idea of taking AnyVals input, and converting them to primitives). But typecasting all elements to the desired type should work. Here is a snippet I just tried out.
def convert(ar: Array[AnyVal]): Array[_] = {
ar.head match {
case y: Int => ar.map(_.asInstanceOf[Int])
case y: Long => ar.map(_.asInstanceOf[Long])
case _ => Array() // do check if this can be the case
}
}

Type parameters applied to Scala Function

I am trying to understand the type parameters when applied to a function.
I would like to use generic types in the method below, but I am using String and Int for now to aid my understanding.
When I define a function as below
def myfunc[Int](f:String => Int):Int = {
Integer.min(1,2)
}
it complains
found : scala.this.Int
required: Int&0
Integer.min(1,2)
However if I remove the return type of the function ( which I understand is not required), it compiles fine.
I am not able to infer why removing the return type makes the compilation successful.
Appreciate your help.
-Amit

Try
def myfunc(f:String => Int):Int = {
Integer.min(1,2)
}
When you write def myfunc[Int](f:String => Int):Int you declare type parameter Int, which hides standard type scala.Int. This is the same as if you declared def myfunc[A](f:String => A):A. When you remove return type it's inferred to be scala.Int, i.e. def myfunc[A](f:String => A) is def myfunc[A](f:String => A):Int
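A small sketch that makes the shadowing visible. The method names are hypothetical; the point is only that a type parameter called Int behaves exactly like one called A.

```scala
object ShadowSketch {
  // The type parameter is *named* Int, so inside this method
  // "Int" no longer refers to scala.Int.
  def shadowed[Int](f: String => Int): Int = f("abc")

  // Identical method with a conventional type parameter name.
  def generic[A](f: String => A): A = f("abc")

  // No type parameter: here Int really is scala.Int.
  def concrete(f: String => Int): Int = Integer.min(1, 2)
}
```

Because shadowed is generic, it can even be called with a String result, as in ShadowSketch.shadowed[String](_.toUpperCase), which would be impossible if Int meant scala.Int.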

If you want to use generics, first you have to understand that type variable names conventionally start with a capital letter, and that they are just names. So [Int] in your function is the name of a type variable, not the type scala.Int. An example:
object Main extends App{
val f: String => Int = s => 4
println(myfunc(f, "nothing useful"))
def myfunc[A,B](f:A => B, x: A):B = {
f(x)
}
}
here the names are A and B and the return type is of type B

Question: What's the difference between these 3 methods?
def myfunc1[X](f:String => X):X =
Integer.min(1,2)
def myfunc2[Int](f:String => Int):Int =
Integer.min(1,2)
def myfunc3[IDontGetTypeParameters](f:String => IDontGetTypeParameters):IDontGetTypeParameters =
Integer.min(1,2)
Answer: Nothing. From the compiler's point of view they are the same, and they fail to compile for the same reason: each is declared to return the type of its type parameter but tries to return an integer (scala.Int) instead.

A quick one liner:
def myfunc(f:String => Int):Int = Integer.min(1,2)

It's good trying to make your own examples, but have you tried any examples from books, articles or tutorials? There's probably a good one in Scala for the Impatient by Cay Horstmann.
Here's a decent example from the Tour of Scala:
def listOfDuplicates[A](x: A, length: Int): List[A] = {
if (length < 1)
Nil
else
x :: listOfDuplicates(x, length - 1)
}
Sometimes you can omit the type parameter, but let's ignore that for now and declare the types explicitly:
listOfDuplicates[Int](43, 5) // Should give a list with 43 five times
listOfDuplicates[String]("Hello, world! ", 3) // "Hello, world!" thrice
listOfDuplicates[(Int, Int)]((0, 1), 8) // The pair (0, 1) eight times
This shows that A can be Int, String, (Int, Int) or just about anything else we can think of. Not sure you'd ever have a practical need for this, but you can even do something like this:
def wrapLength(str: String): Int = str.length
listOfDuplicates[String => Int](wrapLength(_), 2)
Here's a Scastie snippet in which you can play around with this.

Your generic type name shouldn't shadow an existing type. Int is not a reserved word, but it is the name of the standard type scala.Int, so a type parameter called Int hides it.
In these cases, for simplicity and readability, use a short name such as T or R for the generic type if you are really keen to use generics in other functions.

scala simple example of proper subtyping

I'm new to scala and trying to understand the right way to think about subtypes, so here's a simple example.
Let's say I want to make a function truncation() that takes a number, rounds it down to a few decimal places, and returns the result. I might go about this as follows:
def truncation(number:Double, level:Int)={
math.floor(number * math.pow(10,level)) / math.pow(10,level)
}
truncation(1.2345, 2)
res0: Double = 1.23
But I probably also want this function to work with other numeric types besides Double, such as Float.
So how should I think about generalizing this function to work well with multiple types?
I'm thinking I should be using generic types such as
def truncation [A](number:A, level:Int):A={
math.floor(number * math.pow(10,level)) / math.pow(10,level)
}
but this doesn't compile.
In the case of just two types, I see that the Either type is a good option. But in the more general case, maybe I'll want to be able to handle Ints as well, and have different implementations that match on the type of the input object.
What's the best way to be thinking about this? Thanks for your help.

For a generic that you want to constrain to numeric types, you can use Numeric:
def truncation[T](number: T, level:Int)(implicit n: Numeric[T]) = {
import math._
val doubleValue = n.toDouble(number)
floor(doubleValue * pow(10,level)) / pow(10,level)
}
Or equivalently:
def truncation[T : Numeric](number: T, level:Int) = {
import math._
val doubleValue = implicitly[Numeric[T]].toDouble(number)
floor(doubleValue * pow(10,level)) / pow(10,level)
}
These will work for Ints, Doubles, Floats, and other numeric types.
The first example uses an implicit parameter. The second version uses a context bound together with the implicitly method. See the documentation of Numeric for all the available methods.
Note that the versions above both return Double. If you want them to return T (whatever the input type is), you can try:
def truncation[T : Numeric](number: T, level:Int): T = implicitly[Numeric[T]] match {
case n:Fractional[T] =>
val tenPow = n.fromInt(math.pow(10, level).toInt)
n.div(n.fromInt(n.toInt(n.times(number, tenPow))), tenPow)
case n:Integral[T] => number
}
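One detail to watch in the T-returning version: n.toInt truncates toward zero, whereas the Double versions use floor, so the two disagree on negative inputs. A sketch of a variant that keeps the floor behaviour, under the assumption that the scaled value fits in an Int; the object name TruncationSketch is made up for illustration.

```scala
object TruncationSketch {
  // Returns T; Fractional inputs are rounded down (floor) to `level` decimals,
  // everything else (e.g. Int, Long) is returned unchanged.
  def truncation[T](number: T, level: Int)(implicit num: Numeric[T]): T = num match {
    case n: Fractional[T] =>
      val tenPow = n.fromInt(math.pow(10, level).toInt)
      // floor on the Double view, so negative values round down rather than toward zero
      val scaled = math.floor(n.toDouble(n.times(number, tenPow)))
      n.div(n.fromInt(scaled.toInt), tenPow)
    case _ => number
  }
}
```

With this sketch, a negative input such as -1.2345 at level 2 rounds down to -1.24, matching the floor-based versions.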

What's the difference between multiple parameters lists and multiple parameters per list in Scala?

In Scala one can write (curried?) functions like this
def curriedFunc(arg1: Int) (arg2: String) = { ... }
What is the difference between the above curriedFunc function definition with two parameters lists and functions with multiple parameters in a single parameter list:
def curriedFunc(arg1: Int, arg2: String) = { ... }
From a mathematical point of view this is (curriedFunc(x))(y) and curriedFunc(x,y) but I can write def sum(x) (y) = x + y and the same will be def sum2(x, y) = x + y
I know of only one difference: partial application. Otherwise both forms look equivalent to me.
Are there any other differences?

Strictly speaking, this is not a curried function, but a method with multiple argument lists, although admittedly it looks like a function.
As you said, the multiple arguments lists allow the method to be used in the place of a partially applied function. (Sorry for the generally silly examples I use)
object NonCurr {
def tabulate[A](n: Int, fun: Int => A) = IndexedSeq.tabulate(n)(fun)
}
NonCurr.tabulate[Double](10, _) // not possible
val x = IndexedSeq.tabulate[Double](10) _ // possible. x is Function1 now
x(math.exp(_)) // complete the application
Another benefit is that you can use curly braces instead of parentheses, which looks nice if the second argument list consists of a single function, or thunk. E.g.
NonCurr.tabulate(10, { i => val j = util.Random.nextInt(i + 1); i - i % 2 })
versus
IndexedSeq.tabulate(10) { i =>
val j = util.Random.nextInt(i + 1)
i - i % 2
}
Or for the thunk:
IndexedSeq.fill(10) {
println("debug: operating the random number generator")
util.Random.nextInt(99)
}
Another advantage is, you can refer to arguments of a previous argument list for defining default argument values (although you could also say it's a disadvantage that you cannot do that in single list :)
// again I'm not very creative with the example, so forgive me
def doSomething(f: java.io.File)(modDate: Long = f.lastModified) = ???
Finally, there are three other applications in an answer to the related post Why does Scala provide both multiple parameters lists and multiple parameters per list? I will just copy them here, but the credit goes to Knut Arne Vedaa, Kevin Wright, and extempore.
First: you can have multiple var args:
def foo(as: Int*)(bs: Int*)(cs: Int*) = as.sum * bs.sum * cs.sum
...which would not be possible in a single argument list.
Second, it aids the type inference:
def foo[T](a: T, b: T)(op: (T,T) => T) = op(a, b)
foo(1, 2){_ + _} // compiler can infer the type of the op function
def foo2[T](a: T, b: T, op: (T,T) => T) = op(a, b)
foo2(1, 2, _ + _) // compiler too stupid, unfortunately
And last, this is the only way you can have implicit and non implicit args, as implicit is a modifier for a whole argument list:
def gaga [A](x: A)(implicit mf: Manifest[A]) = ??? // ok
def gaga2[A](x: A, implicit mf: Manifest[A]) = ??? // not possible
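The points above can be tried out together in one small sketch. All names here (ParamListsSketch, scale, repeat, g, combine) are made up for illustration.

```scala
object ParamListsSketch {
  // Two parameter lists: supplying only the first yields a function of the second.
  def scale(factor: Int)(x: Int): Int = factor * x
  val twice: Int => Int = scale(2)

  // A single-argument second list can be passed as a block in curly braces.
  def repeat(n: Int)(body: => Int): Seq[Int] = Seq.fill(n)(body)

  // A default in a later list may refer to a parameter from an earlier list.
  def g(x: Int)(y: Int = x * 2): Int = x + y

  // T is fixed by the first list, so the lambda's parameter types are known.
  def combine[T](a: T, b: T)(op: (T, T) => T): T = op(a, b)
}
```

For example, ParamListsSketch.g(3)() uses the default y = 6, and ParamListsSketch.combine(1, 2)(_ + _) needs no type annotations on the lambda.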

There's another difference that was not covered by 0__'s excellent answer: default parameters. A parameter from one parameter list can be used when computing the default in another parameter list, but not in the same one.
For example:
def f(x: Int, y: Int = x * 2) = x + y // not valid
def g(x: Int)(y: Int = x * 2) = x + y // valid

The whole point is that the curried and uncurried forms are equivalent! As others have pointed out, one or the other form can be syntactically more convenient to work with depending on the situation, and that is the only reason to prefer one over the other.
It's important to understand that even if Scala didn't have special syntax for declaring curried functions, you could still construct them; this is just a mathematical inevitability once you have the ability to create functions which return functions.
To demonstrate this, imagine that the def foo(a)(b)(c) = {...} syntax didn't exist. Then you could still achieve the exact same thing like so: def foo(a) = (b) => (c) => {...}.
Like many features in Scala, this is just a syntactic convenience for doing something that would be possible anyway, but with slightly more verbosity.
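A concrete sketch of that equivalence with types filled in. The names are hypothetical; curried is the standard method on function values of two or more arguments.

```scala
object CurrySketch {
  // Method with three parameter lists.
  def add(a: Int)(b: Int)(c: Int): Int = a + b + c

  // The same thing built by hand: a method returning a function returning a function.
  def addManual(a: Int): Int => Int => Int = b => c => a + b + c

  // An ordinary two-argument function value and its curried form.
  val plain: (Int, Int) => Int = (a, b) => a + b
  val curried: Int => Int => Int = plain.curried
}
```

Both add(1)(2)(3) and addManual(1)(2)(3) are called the same way and give the same result.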

The two forms are isomorphic. The main difference is that curried functions are easier to apply partially, while non-curried functions have slightly nicer syntax, at least in Scala.

Scala compiler not recognizing a view bound

I've tried this line of code
def **[A <% Numeric[A]](l:List[A],m:List[A])=l.zip(m).map({t=>t._1*t._2})
However on compilation, I get this error
error: value * is not a member of type parameter A
def **[A <% Numeric[A]](l:List[A],m:List[A])=l.zip(m).map({t=>t._1*t._2})
When I look at the source for the Numeric trait, I see a * op defined.
What am I doing wrong?

The instance of Numeric is not a number itself, but it is an object that offers operations to do the arithmetic. For example, an object num of type Numeric[Int] can add two integers like this: num.plus(3, 5). The result of this operation is the integer 8.
For integers, this is very trivial. However, for all basic numerical types, there is one implicit instance of Numeric available. And if you define your own numeric types, you can provide one.
Therefore, you should leave the bounds for A open and add an implicit parameter of type Numeric[A], with which you do the calculations. Like this:
def **[A](l:List[A],m:List[A])(implicit num:Numeric[A])=l.zip(m).map({t=>num.times(t._1, t._2)})
Of course, num.times(a,b) looks less elegant than a*b. In most of the cases, one can live with that. However, you can wrap the value a in an object of type Ops that supports operators, like this:
// given are: num:Numeric[A], a:A and b:A
val a_ops = num.mkNumericOps(a)
val product = a_ops * b
Since the method mkNumericOps is declared implicit, you can also import it and use it implicitly:
// given are: num:Numeric[A], a:A and b:A
import num._
val product = a * b
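Putting the pieces together, a minimal self-contained sketch of the original function with the import in place. The names DotSketch and multiplyPairs are made up, standing in for the ** method of the question.

```scala
object DotSketch {
  // Element-wise product of two lists for any type that has a Numeric instance.
  def multiplyPairs[A](l: List[A], m: List[A])(implicit num: Numeric[A]): List[A] = {
    import num._ // brings the implicit mkNumericOps conversion into scope, enabling `*`
    l.zip(m).map { case (x, y) => x * y }
  }
}
```

Since zip stops at the shorter list, the result has as many elements as the shorter input.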

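You can also solve this with a context bound. Using the context method from the linked answer (its definition is not reproduced here; the standard implicitly[Numeric[A]] retrieves the same instance), you can write: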
def **[A : Numeric](l:List[A],m:List[A]) =
l zip m map { t => context[A]().times(t._1, t._2) }
or
def **[A : Numeric](l:List[A],m:List[A]) = {
val num = context[A]()
import num._
l zip m map { t => t._1 * t._2 }
}
