Scala - calculate average of SomeObj.double in a List[SomeObj]

I'm on my second evening of Scala, and I'm resisting the urge to write things the way I would in Java while trying to learn the idioms. In this case I'm looking to compute an average using things like closures, mapping, and perhaps list comprehensions. Irrespective of whether this is the best way to compute an average, I just want to know how to do these things in Scala, for learning purposes only.
Here's an example: the average method below is left pretty much unimplemented. I've got a couple of other methods for looking up the rating an individual userId gave that use the find method of TraversableLike (I think), but nothing more that is really Scala-specific. How would I compute, in a Scala-like manner, an average over a List[RatingEvent], where RatingEvent.rating is a Double value?
package com.brinksys.liftnex.model

class Movie(val id: Int, val ratingEvents: List[RatingEvent]) {

  def getRatingByUser(userId: Int): Double =
    getRatingEventByUserId(userId).rating

  def getRatingEventByUserId(userId: Int): RatingEvent = {
    val result = ratingEvents find { e => e.userId == userId }
    result.get
  }

  def average(): Double = {
    /*
     fill in the blanks where an average of all ratingEvent.rating values is expected
    */
    3.8
  }
}
How would a seasoned Scala pro fill in that method, using the features of Scala to make it as concise as possible? I know how I would do it in Java, which is what I want to avoid.
If I were doing it in Python, I assume the most Pythonic way would be:
sum([re.rating for re in ratingEvents]) / len(ratingEvents)
or, if I were forcing myself to use a closure (which is something I at least want to learn in Scala):
reduce(lambda x, y: x + y, [re.rating for re in ratingEvents]) / len(ratingEvents)
It's the usage of these types of things I want to learn in scala.
Your suggestions? Any pointers to good tutorials/reference material relevant to this are welcome :D

If you're going to be doing math on things, using List is not always the fastest way to go, because List has no idea how long it is, so ratingEvents.length takes time proportional to the length. (Not very much time, granted, but it does have to traverse the whole list to tell.) But if you're mostly manipulating data structures and only occasionally need to compute a sum or whatever, so it's not the time-critical core of your code, then using List is dandy. (If you do need the length often, an indexed collection such as Vector tracks it for you; see the sketch below.)
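For instance, a quick sketch of the difference (my addition, not from the original answer), with ratings already extracted as plain Doubles:
val ratingsList: List[Double]  = List(3.5, 4.0, 2.5)
val ratingsVec: Vector[Double] = ratingsList.toVector

ratingsList.length // O(n): has to walk the whole list
ratingsVec.length  // effectively O(1): Vector knows its size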
Anyway, the canonical way to do it would be with a fold to compute the sum:
(0.0 /: ratingEvents){_ + _.rating} / ratingEvents.length
// Equivalently, though more verbosely:
// ratingEvents.foldLeft(0.0)(_ + _.rating) / ratingEvents.length
or by mapping and then summing (2.8 only):
ratingEvents.map(_.rating).sum / ratingEvents.length
For more information on maps and folds, see this question on that topic.

You might calculate the sum and the length in one go, but I doubt that this helps except for very long lists. It would look like this:
val (s, l) = ratingEvents.foldLeft((0.0, 0))((t, r) => (t._1 + r.rating, t._2 + 1))
val avg = s / l
I think for this example Rex's solution is much better, but in other use cases the "fold-over-tuple trick" can be essential (see the sketch below).
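For instance (my addition, not from the original answer), an Iterator has no cheap length and can only be consumed once, so the tuple fold computes sum and count in the single pass available:
val it: Iterator[Double] = Iterator(1.0, 2.0, 3.0)
val (sum, count) = it.foldLeft((0.0, 0)) { case ((s, n), x) => (s + x, n + 1) }
val mean = sum / count // 2.0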

Since mean and other descriptive statistics like standard deviation or median are needed in different contexts, you could also use a small reusable implicit helper class to allow for more streamlined chained commands:
implicit class ImplDoubleVecUtils(values: Seq[Double]) {
  def mean: Double = values.sum / values.length
}
val meanRating = ratingEvents.map(_.rating).mean
It even seems to be possible to write this in a generic fashion for all number types; a sketch follows.
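A minimal sketch of such a generic version (my addition, as a replacement for the Double-specific class above), using the Fractional type class for the division, so it covers Double, Float, BigDecimal and the like, but not the integral types:
implicit class FractionalVecUtils[T](values: Seq[T])(implicit num: Fractional[T]) {
  // sum comes from the Numeric part of Fractional; div and fromInt handle the division
  def mean: T = num.div(values.sum, num.fromInt(values.length))
}

val m = Seq(1.0, 2.0, 3.0).mean // 2.0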

A tail-recursive solution can achieve both a single traversal and low memory allocation rates:
def tailrec(input: List[RatingEvent]): Double = {
  @annotation.tailrec
  def go(next: List[RatingEvent], sum: Double, count: Int): Double =
    next match {
      case Nil    => sum / count
      case h :: t => go(t, sum + h.rating, count + 1)
    }
  go(input, 0.0, 0)
}
Here are jmh measurements of the approaches from the above answers, on a list of a million elements:
[info] Benchmark                           Mode  Score         Units
[info] Mean.foldLeft                       avgt  0.007         s/op
[info] Mean.foldLeft:·gc.alloc.rate        avgt  4217.549      MB/sec
[info] Mean.foldLeft:·gc.alloc.rate.norm   avgt  32000064.281  B/op
...
[info] Mean.mapAndSum                      avgt  0.039         s/op
[info] Mean.mapAndSum:·gc.alloc.rate       avgt  1690.077      MB/sec
[info] Mean.mapAndSum:·gc.alloc.rate.norm  avgt  72000009.575  B/op
...
[info] Mean.tailrec                        avgt  0.004         s/op
[info] Mean.tailrec:·gc.alloc.rate         avgt  ≈ 10⁻⁴        MB/sec
[info] Mean.tailrec:·gc.alloc.rate.norm    avgt  0.196         B/op

I can suggest 2 ways:
def average(x: Array[Double]): Double = x.foldLeft(0.0)(_ + _) / x.length
def average(x: Array[Double]): Double = x.sum / x.length
Both are fine, but in the first case, fold does not tie you to "+": you can swap in another operation (for example "-" or "*"), as sketched below.
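For instance, a small sketch (my addition) swapping the folded operation to multiplication to get a product instead of a sum:
def product(x: Array[Double]): Double = x.foldLeft(1.0)(_ * _)

product(Array(2.0, 3.0, 4.0)) // 24.0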

Related

What is the Scala equivalent of Python's NumPy np.random.choice? (Random weighted selection in Scala)

I was looking for Scala's equivalent code, or the underlying theory, for Python's np.random.choice (NumPy as np). I have a similar implementation that uses Python's np.random.choice method to select random moves from a probability distribution.
Python's code:
Input list: ['pooh', 'rabbit', 'piglet', 'Christopher'] and probabilities: [0.5, 0.1, 0.1, 0.3]
I want to select one of the values from the input list given the associated probability of each input element.
The Scala standard library has no equivalent to np.random.choice, but it shouldn't be too difficult to build your own, depending on which options/features you want to emulate.
Here, for example, is a way to get an infinite Stream of the submitted items, with the probability of any one item weighted relative to the others.
def weightedSelect[T](input: (T, Int)*): Stream[T] = {
  val items: Seq[T] = input.flatMap { x => Seq.fill(x._2)(x._1) }
  def output: Stream[T] = util.Random.shuffle(items).toStream #::: output
  output
}
With this, each input item is given a multiplier. So to get an infinite pseudorandom selection of the characters c and v, with c coming up 3/5ths of the time and v coming up 2/5ths of the time:
val cvs = weightedSelect(('c',3),('v',2))
Thus the rough equivalent of the np.random.choice(aa_milne_arr,5,p=[0.5,0.1,0.1,0.3]) example would be:
weightedSelect("pooh"-> 5
,"rabbit" -> 1
,"piglet" -> 1
,"Christopher" -> 3).take(5).toArray
Or perhaps you want a better (less pseudo) random distribution that might be heavily lopsided.
def weightedSelect[T](items: Seq[T], distribution: Seq[Double]): Stream[T] = {
  assert(items.length == distribution.length)
  assert(math.abs(1.0 - distribution.sum) < 0.001) // must be at least close

  val dsums: Seq[Double] = distribution.scanLeft(0.0)(_ + _).tail
  val distro: Seq[Double] = dsums.init :+ 1.1 // close a possible gap

  // draw one uniform sample per element; calling nextDouble() inside the
  // indexWhere predicate would re-roll at every comparison and skew the result
  Stream.continually {
    val roll = util.Random.nextDouble()
    items(distro.indexWhere(_ > roll))
  }
}
The result is still an infinite Stream of the specified elements but the passed-in arguments are a bit different.
val choices: Stream[String] = weightedSelect( List("this", "that")
                                            , Array(4998/5000.0, 2/5000.0))

// let's test the distribution
val (choiceA, choiceB) = choices.take(10000).partition(_ == "this")
choiceA.length // res0: Int = 9995
choiceB.length // res1: Int = 5 (not bad)

chisel3 arithmetic operations on Doubles

I am having trouble with arithmetic operations on Doubles in Chisel. The examples I have seen use only the following types: Int, UInt, SInt.
I saw here that arithmetic operations were described only for SInt and UInt. What about Double?
I tried to declare my output out as a Double, but didn't know how, and the output of my code is a Double.
Is there a way to declare an input and an output of type Double in a Bundle?
Here is my code:
class hashfunc(val k: Int, val n: Int) extends Module {
  val a = k + k
  val io = IO(new Bundle {
    val b   = Input(UInt(k.W))
    val w   = Input(UInt(k.W))
    var out = Output(UInt(a.W))
  })

  val tabHash1 = new Array[Array[Double]](n)
  val x = new ArrayBuffer[(Double, Data)]
  val tabHash = new Array[Double](tabHash1.size)

  for (ind <- tabHash1.indices) {
    var sum = 0.0
    for (ind2 <- 0 until x.size) {
      sum += (x(ind2) * tabHash1(ind)(ind2))
    }
    tabHash(ind) = ((sum + io.b) / io.w)
  }
  io.out := tabHash.reduce(_ + _)
}
When I compile the code, I get the following error:
code error
Thank you for your kind attention, looking forward to your responses.
Chisel does have a native FixedPoint type, which may be of use. It is in the experimental package:
import chisel3.experimental.FixedPoint
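A minimal sketch (my addition, not part of the original answer) of how FixedPoint ports might be declared, assuming chisel3's experimental API and illustrative widths (16 total bits, 8 of them fractional):
import chisel3._
import chisel3.experimental._ // FixedPoint and the .BP (binary point) syntax

class FixedIO extends Bundle {
  val in  = Input(FixedPoint(16.W, 8.BP))  // 16-bit value with 8 fractional bits
  val out = Output(FixedPoint(16.W, 8.BP))
}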
There is also a project, DspTools, that has simulation support for Doubles. It has some nice features, e.g. it allows modules to be parameterized on the numeric type (Complex, Double, FixedPoint, SInt), so that you can run simulations on Double to validate the desired mathematical behavior and then switch to a synthesizable number format that meets your precision criteria.
DspTools is an ongoing research project and the team would appreciate feedback from outside users.
Operations on floating point numbers (Double in this case) are not supported directly by any HDL. The reason is that while addition/subtraction/multiplication of fixed point numbers is well defined, there are a lot of design-space trade-offs for floating point hardware, as it is a much more complex piece of hardware.
That is to say, a high-performance floating point unit is a significant piece of hardware in its own right and would be time-shared in any realistic design.

What is the most efficient way to iterate a collection in parallel in Scala (TOP N pattern)

I am new to Scala and would like to build a real-time application to match people. For a given Person, I would like to get the TOP 50 people with the highest matching scores.
The idiom is as follows:
val persons = new mutable.HashSet[Person]() // collection of people
/* feed omitted */
val personsPar = persons.par // make it parallel
val person = ... // the given person

val res = personsPar
  .filter(...) // some filters
  .map { p => (p, computeMatchingScoreAsFloat(person, p)) }
  .toList
  .sortBy(-_._2)
  .take(50)
  .map(t => t._1 + "=" + t._2).mkString("\n")
In the sample code above a HashSet is used, but it could be any type of collection; I am pretty sure it is not optimal.
The problem is that persons contains over 5M elements, and the computeMatchingScoreAsFloat method computes a kind of correlation value between 2 vectors of 200 floats. This computation takes ~2s on my computer with 6 cores.
My question is: what is the fastest way of doing this TOP N pattern in Scala?
Subquestions :
- What implementation of collection (or something else?) should I use ?
- Should I use Futures ?
NOTE: It has to be computed in parallel; the pure computation of computeMatchingScoreAsFloat alone (with no ranking/TOP N) takes more than a second single-threaded, and < 200 ms multi-threaded, on my computer
EDIT: Thanks to Guillaume, compute time has been decreased from 2s to 700ms
def top[B](n: Int, t: Traversable[B])(implicit ord: Ordering[B]): collection.mutable.PriorityQueue[B] = {
  // reversed ordering, so the queue's head is the weakest of the kept top n
  val starter = collection.mutable.PriorityQueue[B]()(ord.reverse)
  t.foldLeft(starter) { (myQueue, a) =>
    if (myQueue.length < n) { myQueue.enqueue(a); myQueue } // fill up to n elements first
    else if (ord.compare(a, myQueue.head) < 0) myQueue      // below the current cutoff: skip
    else {
      myQueue.dequeue                                       // evict the weakest
      myQueue.enqueue(a)
      myQueue
    }
  }
}
Thanks
I would propose a few changes:
1- I believe that the filter and map steps require traversing the collection twice. Having a lazy collection would reduce that to one pass. Either start from a lazy collection (like Stream) or convert to one, for instance for a list:
myList.view
(a fuller sketch appears at the end of this answer)
2- the sort step requires sorting all elements. Instead, you can do a foldLeft with an accumulator storing the top N records. See this question for an example of an implementation: Simplest way to get the top n elements of a Scala Iterable. I would probably test a PriorityQueue instead of a list if you want maximum performance (this really falls into its wheelhouse). For instance, something like this:
def IntStream(n: Int): Stream[(Int, Int)] =
  if (n == 0) Stream.empty
  else (util.Random.nextInt, util.Random.nextInt) #:: IntStream(n - 1)

def top[B](n: Int, t: Traversable[B])(implicit ord: Ordering[B]): collection.mutable.PriorityQueue[B] = {
  // reversed ordering, so the queue's head is the weakest of the kept top n
  val starter = collection.mutable.PriorityQueue[B]()(ord.reverse)
  t.foldLeft(starter) { (myQueue, a) =>
    if (myQueue.length < n) { myQueue.enqueue(a); myQueue } // fill up to n elements first
    else if (ord.compare(a, myQueue.head) < 0) myQueue      // below the current cutoff: skip
    else {
      myQueue.dequeue                                       // evict the weakest
      myQueue.enqueue(a)
      myQueue
    }
  }
}

def diff(t2: (Int, Int)) = t2._2

top(10, IntStream(10000))(Ordering.by(diff)) // select top 10
I really think that your problem requires a SINGLE traversal of the collection, so you should be able to get your run time down to below 1 sec.
Good luck!
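As a rough sequential sketch of point 1 above (my addition; someFilter is a hypothetical stand-in for the question's filters, the other names come from the question), a view fuses filter and map into a single traversal, forced only when the result is consumed:
val scored = persons.view
  .filter(p => someFilter(p)) // placeholder predicate
  .map(p => (p, computeMatchingScoreAsFloat(person, p)))

// nothing is traversed yet; passing the view to top forces a single pass
val best = top(50, scored)(Ordering.by((t: (Person, Float)) => t._2))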

Chisel compiler is very slow

I am working on a matrix-summation kind of design. The compiler takes 4+ hours to generate 1+ million lines of code, and every line is "assign ...". I don't know if this is inefficiency in the compiler or bad coding style on my part. If someone could suggest some alternatives, that would be great!
Here is a description of the code:
The input is ANDed with a random matrix element by element and summed up using .reduce, so the result matrix should be a 140x6 Vec; concatenating the elements gives me an 840-bit output.
(rndvec is supposed to be a 140x840x6-bit random matrix. Since I don't know how to generate random values, I started with a fixed 140x6 slice to represent one row and feed it with the input over and over again.)
The following is my code:
import Chisel._
import scala.collection.mutable.HashMap
import util.Random

class LBio(n: Int) extends Bundle {
  var myinput  = UInt(INPUT, 840)
  var myoutput = UInt(OUTPUT, 840)
}

class Lbi(q: Int, n: Int, m: Int) extends Module {
  def mask(orig: Vec[UInt], maska: UInt, mi: Int) = {
    val result = Vec.fill(840){ UInt(width = 6) }
    for (i <- 0 until 840) {
      result(i) := orig(i) & Fill(6, maska(i)) // every bit of input ANDed with random vector
    }
    result
  }

  val io = new LBio(840)
  val rndvec = Vec.fill(840){ UInt("h13", 6) } // random vector; for now just replication of 0x13
  val resultvec = Vec.fill(140){ UInt(width = 6) }

  for (i <- 0 until 140) {
    // add the entire row of 6-bit elements together with reduce
    resultvec(i) := mask(rndvec, io.myinput, m).reduce(_ + _)
  }
  io.myoutput := resultvec.toBits
}
The terminal report:
started inference
finished inference (4)
start width checking
finished width checking
started flattenning
finished flattening (941783)
resolving nodes to the components
finished resolving
started transforms
finished transforms
checking for combinational loops
NO COMBINATIONAL LOOP FOUND
COMPILING class TutorialExamples.Lbi 0 CHILDREN (0,0)
[success] Total time: 33453 s, completed Oct 16, 2013 10:32:10 PM
There's nothing obviously wrong with your Chisel code, but I should point out that if rndvec is 140x840x6 bits, that's ~706,000 bits (~86 kB) of state! And your reduce operation is over ~5,000 bits of state.
Chisel uses "assign" statements because your code is entirely combinational and Chisel produces a very structural form of Verilog.
I suspect the part that is killing the compile time (aside from the huge amount of state) is that you are generating and manipulating 140 Vecs with the mask() function.
I tried my hand at rewriting your code and got it down from 941,783 nodes to 202,723 (takes about 10-15 minutes to compile, but generates 11MB of Verilog code). I'm pretty sure this does what your code was doing:
class Hello(q: Int, dim_n: Int) extends Module {
  val io = new LBio(dim_n)
  val rndvec = Vec.fill(dim_n){ UInt("h13", 6) }
  val resultvec = Vec.fill(dim_n/6){ UInt(width = 6) }

  // lift this work outside of the for loop
  val padded_input = Vec.fill(dim_n){ UInt(width = 6) }
  for (i <- 0 until dim_n) {
    padded_input(i) := Fill(6, io.myinput(i))
  }

  for (i <- 0 until dim_n/6) {
    val result = Bits(width = dim_n*6)
    result := rndvec.toBits & padded_input.toBits
    var sum = UInt(0) // advanced Chisel - be careful with the use of var!
    for (j <- 0 until dim_n by 6) {
      sum = sum + result(j+6, j)
    }
    resultvec(i) := sum
  }

  io.myoutput := resultvec.toBits
}
What I did was avoid doing the same work over and over again - like padding out the myinput Vec inside of the for loop's mask() function. I also kept everything in Bits() instead of Vecs. Sadly it means I lose the awesome .reduce() function.
I think maybe the answer is "be cognizant of how much state you're creating" and "Vecs are awesome, but use carefully".
Do you have a Verilog version that's short and concise? It'd be interesting to see if there are areas where Chisel is losing out efficiency wise.

Adding immutable Vectors

I am trying to work more with Scala's immutable collections, since they are easy to parallelize, but I am struggling with some newbie problems. I am looking for a way to create (efficiently) a new Vector from an operation. To be precise, I want something like
val v: Vector[Double] = RandomVector(10000)
val w: Vector[Double] = RandomVector(10000)
val r = v + w
I tested the following:
// 1)
val r: Vector[Double] = (v zip w).map { t: (Double, Double) => t._1 + t._2 }

// 2)
val vb = new VectorBuilder[Double]()
var i = 0
while (i < v.length) {
  vb += v(i) + w(i)
  i = i + 1
}
val r = vb.result
Both take a really long time compared to working with an Array:
[Vector Zip/Map ] Elapsed time 0.409 msecs
[Vector While Loop] Elapsed time 0.374 msecs
[Array While Loop ] Elapsed time 0.056 msecs
// with warm-up (10000) and avg. over 10000 runs
Is there a better way to do it? I think the zip/map/reduce style has the advantage that it can run in parallel as soon as the collections support it.
Thanks
Vector is not specialized for Double, so you're going to pay a sizable performance penalty for using it. If you are doing a simple operation, you're probably better off using an array on a single core than a Vector or other generic collection on the entire machine (unless you have 12+ cores). If you still need parallelization, there are other mechanisms you can use, such as using scala.actors.Futures.future to create instances that each do the work on part of the range:
val a = Array(1,2,3,4,5,6,7,8)
(0 to 4).map(_ * (a.length/4)).sliding(2).map(i => scala.actors.Futures.future {
  var s = 0
  var j = i(0)
  while (j < i(1)) {
    s += a(j)
    j += 1
  }
  s
}).map(_()).sum // _() applies the future--blocks until it's done
Of course, you'd need to use this on a much longer array (and on a machine with four cores) for the parallelization to improve things.
You should use lazily built collections when you chain more than one higher-order method:
v1.view zip v2 map { case (a,b) => a+b }
If you don't use a view or an iterator, each method will create a new immutable collection even when it is not needed.
Probably the immutable code won't be as fast as mutable code, but the lazy collection will improve the execution time of your code a lot; see the sketch below.
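A minimal sketch of the difference (my addition), assuming the v and w Vectors from the question:
// strict: zip materializes an intermediate Vector of pairs that map then discards
val strict = v.zip(w).map { case (a, b) => a + b }

// lazy: the view fuses zip and map into one pass; toVector forces the result
val fused = (v.view zip w).map { case (a, b) => a + b }.toVector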
Arrays are not type-erased; Vectors are. Basically, the JVM gives Array an advantage over other collections when handling primitives that cannot be overcome. Scala's specialization might decrease that advantage but, given its cost in code size, it can't be used everywhere.