I would like to find the fastest way to write euclidean distance in Scala. After some attemps, i'm here.
def euclidean[V <: Seq[Double]](dot1: V, dot2: V): Double = {
var d = 0D
var i = 0
while( i < dot1.size ) {
val toPow2 = dot1(i) - dot2(i)
d += toPow2 * toPow2
i += 1
}
sqrt(d)
}
Fastest results are obtain with mutable.ArrayBuffer[Double] as V and no collection.parallel._ are authorized for various vector size from 2 up to 10000
For those who desire to test breeze its slower with following distance function :
def euclideanDV(v1: DenseVector[Double], v2: DenseVector[Double]) = norm(v1 - v2)
If anyone knows any pure scala code or library that could help to improve speed it would be greatly appreciated.
The way i tested speed was i follow.
val te1 = 0L
val te2 = 0L
val runNumber = 100000
val warmUp = 60000
(0 until runNumber).foreach{ x =>
val t1 = System.nanoTime
euclidean1(v1, v2)
val t2 = System.nanoTime
euclidean2(v1, v2)
val t3 = System.nanoTime
if( x >= warmUp ) {
te1 += t2 - t1
te2 += t3 - t2
}
}
Here a some of my tries
// Fast on ArrayBuffer, quadratic on List
def euclidean1[V <: Seq[Double]](v1: V, v2: V) =
{
var d = 0D
var i = 0
while( i < v1.size ){
val toPow2 = v1(i) - v2(i)
d += toPow2 * toPow2
i += 1
}
sqrt(d)
}
// Breeze test
def euclideanDV(v1: DenseVector[Double], v2: DenseVector[Double]) = norm(v1 - v2)
// Slower than euclidean1
def euclidean2[V <: Seq[Double]](v1: V, v2: V) =
{
var d = 0D
var i = 0
while( i < v1.size )
{
d += pow(v1(i) - v2(i), 2)
i += 1
}
d
}
// Slower than 1 for Vsize ~< 1000 and a bit better over 1000 on ArrayBuffer
def euclidean3[V <: Seq[Double]](v1: V, v2: V) =
{
var d = 0D
var i = 0
(0 until v1.size).foreach{ i=>
val toPow2 = v1(i) - v2(i)
d += toPow2 * toPow2
}
sqrt(d)
}
// Slower than 1 for Vsize ~< 1000 and a bit better over 1000 on ArrayBuffer
def euclidean3bis(dot1: Seq[Double], dot2: Seq[Double]): Double =
{
var sum = 0D
dot1.indices.foreach{ id =>
val toPow2 = dot1(id) - dot2(id)
sum += toPow2 * toPow2
}
sqrt(sum)
}
// Slower than 1
def euclidean4[V <: Seq[Double]](v1: V, v2: V) =
{
var d = 0D
var i = 0
val vz = v1.zip(v2)
while( i < vz.size )
{
val (a, b) = vz(i)
val toPow2 = a - b
d += toPow2 * toPow2
i += 1
}
d
}
// Slower than 1
def euclideanL1(v1: List[Double], v2: List[Double]) = sqrt(v1.zip(v2).map{ case (a, b) =>
val toPow2 = a - b
toPow2 * toPow2
}.sum)
// Slower than 1
def euclidean5(dot1: Seq[Double], dot2: Seq[Double]): Double =
{
var sum = 0D
dot1.zipWithIndex.foreach{ case (a, id) =>
val toPow2 = a - dot2(id)
sum += toPow2 * toPow2
}
sqrt(sum)
}
// super super slow
def euclidean6(v1: Seq[Double], v2: Seq[Double]) = sqrt(v1.zip(v2).map{ case (a, b) => pow(a - b, 2) }.sum)
// Slower than 1
def euclidean7(dot1: Seq[Double], dot2: Seq[Double]): Double =
{
var sum = 0D
dot1.zip(dot2).foreach{ case (a, b) => sum += pow(a - b, 2) }
sum
}
// Slower than 1
def euclidean8(v1: Seq[Double], v2: Seq[Double]) =
{
def inc(n: Int, v: Double) = {
val toPow2 = v1(n) - v2(n)
v + toPow2 * toPow2
}
#annotation.tailrec
def go(n: Int, v: Double): Double =
{
if( n < v1.size - 1 ) go(n + 1, inc(n, v))
else inc(n, v)
}
sqrt(go(0, 0D))
}
// Slower than 1
def euclideanL2(v1: List[Double], v2: List[Double]) =
{
def inc(vzz: List[(Double, Double)], v: Double): Double =
{
val (a, b) = vzz.head
val toPow2 = a - b
v + toPow2 * toPow2
}
#annotation.tailrec
def go(vzz: List[(Double, Double)], v: Double): Double =
{
if( vzz.isEmpty ) v
else go(vzz.tail, inc(vzz, v))
}
sqrt(go(v1.zip(v2), 0D))
}
I tried tailrecursion on List but not enough efficiently on ArrayBuffer, i totally agree with the fact that proper tools like JMH are needed to test speed efficiency properly. But when order of magnitude is between 10-50% faster, we can be confident that it is better.
Even if it is V <: Seq[Double] it is NOT appropriate for List but for ArrayLike structure.
Here my proposal
def euclideanF[V <: Seq[Double]](v1: V, v2: V) = {
#annotation.tailrec
def go(d: Double, i: Int): Double = {
if( i < v1.size ) {
val toPow2 = v1(i) - v2(i)
go(d + toPow2 * toPow2, i + 1)
}
else d
}
sqrt(go(0D, 0))
}
Related
I'm trying to fit a curve with SimpleCurveFitter of commons.math3.fitting in Scala but I catch an exception :
org.apache.commons.math3.exception.ConvergenceException : Unable to permorm
Qr decomposition on jacobian
However, I have checked my gradient calculations.... I still don't see why the exception is raised.
See the code by yourself
def main(args: Array[String]): Unit = {
var xv: DenseVector[Double] = linspace(0, 3, 300)
var yv: DenseVector[Double] = DenseVector.zeros(300)
for (i <- xv.findAll(x => x < 1.0)) yv.update(i, 1)
for (i <- xv.findAll(x => x >= 1.0)) yv.update(i, exp(-(xv(i) - 1.0)/1))
val wop: Array[WeightedObservedPoint] = new Array[WeightedObservedPoint](xv.length)
for (i <- 0 to xv.length - 1) wop.update(i, new WeightedObservedPoint(1, xv(i), yv(i)))
val f: ParametricUnivariateFunction = new ParametricUnivariateFunction {
override def value(x: Double, parameters: Double*): Double = {
val a = parameters(0)
val b = parameters(1)
1.0 / (1.0 + a * pow(x, 2 * b))
}
override def gradient(x: Double, parameters: Double*): Array[Double] = {
val a = parameters(0)
val b = parameters(1)
val ga = - pow(x, 2 * b) / pow(1 + a * pow(x, 2 * b), 2)
val gb = - (2 * a * pow(x, 2 * b) * log(x)) / pow(1 + a * pow(x, 2 * b), 2)
val grad = Array(ga, gb)
grad
}
}
val wopc = JavaConverters.asJavaCollection(wop)
val cf = SimpleCurveFitter.create(f, Array(1, 1))
val param = cf.fit(wopc)
println(param(0), param(1))
}
Thank you for your help :)
I am wondering if there's a way to deal with a while (n > 0) loop in a more functional way, I have a small Scala app that counts the number of digits equal to K from a range from 1 to N:
for example 30 and 3 would return 4 [3, 13, 23, 30]
object NumKCount {
def main(args: Array[String]): Unit = {
println(countK(30,3))
}
def countKDigit(n:Int, k:Int):Int = {
var num = n
var count = 0
while (num > 10) {
val digit = num % 10
if (digit == k) {count += 1}
num = num / 10
}
if (num == k) {count += 1}
count
}
def countK(n:Int, k:Int):Int = {
1.to(n).foldLeft(0)((acc, x) => acc + countKDigit(x, k))
}
}
I'm looking for a way to define the function countKDigit using a purely functional approach
First expand number n into a sequence of digits
def digits(n: Int): Seq[Int] = {
if (n < 10) Seq(n)
else digits(n / 10) :+ n % 10
}
Then reduce the sequence by counting occurrences of k
def countKDigit(n:Int, k:Int):Int = {
digits(n).count(_ == k)
}
Or you can avoid countKDigit entirely by using flatMap
def countK(n:Int, k:Int):Int = {
1.to(n).flatMap(digits).count(_ == k)
}
Assuming that K is always 1 digit, you can convert n to String and use collect or filter, like below (there's not much functional stuff you can do with Integer):
def countKDigit(n: Int, k: Int): Int = {
n.toString.collect({ case c if c.asDigit == k => true }).size
}
or
def countKDigit(n: Int, k: Int): Int = {
n.toString.filter(c => c.asDigit == 3).length
}
E.g.
scala> 343.toString.collect({ case c if c.asDigit == 3 => true }).size
res18: Int = 2
scala> 343.toString.filter(c => c.asDigit == 3).length
res22: Int = 2
What about the following approach:
scala> val myInt = 346763
myInt: Int = 346763
scala> val target = 3
target: Int = 3
scala> val temp = List.tabulate(math.log10(myInt).toInt + 1)(x => math.pow(10, x).toInt)
temp: List[Int] = List(1, 10, 100, 1000, 10000, 100000)
scala> temp.map(x => myInt / x % 10)
res17: List[Int] = List(3, 6, 7, 6, 4, 3)
scala> temp.count(x => myInt / x % 10 == target)
res18: Int = 2
Counting the occurrences of a single digit in a number sequence.
def countK(n:Int, k:Int):Int = {
assert(k >= 0 && k <= 9)
1.to(n).mkString.count(_ == '0' + k)
}
If you really only want to modify countKDigit() to a more functional design, there's always recursion.
def countKDigit(n:Int, k:Int, acc: Int = 0):Int =
if (n == 0) acc
else countKDigit(n/10, k, if (n%10 == k) acc+1 else acc)
I am trying to understand specifically what happens when I change this line of code from:
else f(a) * product(f)(a + 1, b)
to:
else f(a) * product(x => x)(a + 1, b)
or:
else f(a) * product(x => x * x)(a + 1, b)
This is the complete code block:
object exercise {
class Prod (){
var acc: Int = 0
def product(f: Int => Int)(a: Int, b: Int): Int = {
acc = {
if (a > b) 1
else f(a) * product(f)(a + 1, b)
}
println(s"a: $a; b: $b; acc: $acc")
acc
}
}
val x = new Prod()
x.product(x => x * x)(1, 5);
}
I'm using a scala worksheet in IntelliJ community edition 2016.2.4, if that matters. I get different results each time I changeproduct(???)(a + 1, b). When I try to view just the output of the recursive function call, I get java.lang.StackOverflowError. As the code is written, I see:
a: 5; b: 4; acc: 1
a: 4; b: 4; acc: 16
a: 3; b: 4; acc: 144
a: 2; b: 4; acc: 576
res2: Int = 576
How can I view the output of just that statement for each recursive iteration without causing an error?
Is there a better way than this example to find three numbers from a list that sum to zero in scala? Right now, I feel like my functional way may not be the most efficient and it contains duplicate tuples. What is the most efficient way to get rid of duplicate tuples in my current example?
def secondThreeSum(nums:List[Int], n:Int):List[(Int,Int,Int)] = {
val sums = nums.combinations(2).map(combo => combo(0) + combo(1) -> (combo(0), combo(1))).toList.toMap
nums.flatMap { num =>
val tmp = n - num
if(sums.contains(tmp) && sums(tmp)._1 != num && sums(tmp)._2 != num) Some((num, sums(tmp)._1, sums(tmp)._2)) else None
}
}
This is pretty simple, and doesn't repeat any tuples:
def f(nums: List[Int], n: Int): List[(Int, Int, Int)] = {
for {
(a, i) <- nums.zipWithIndex;
(b, j) <- nums.zipWithIndex.drop(i + 1)
c <- nums.drop(j + 1)
if n == a + b + c
} yield (a, b, c)
}
Use .combinations(3) to generate all distinct possible triplets of your start list, then keep only those that sum up to n :
scala> def secondThreeSum(nums:List[Int], n:Int):List[(Int,Int,Int)] = {
nums.combinations(3)
.collect { case List(a,b,c) if (a+b+c) == n => (a,b,c) }
.toList
}
secondThreeSum: (nums: List[Int], n: Int)List[(Int, Int, Int)]
scala> secondThreeSum(List(1,2,3,-5,2), 0)
res3: List[(Int, Int, Int)] = List((2,3,-5))
scala> secondThreeSum(List(1,2,3,-5,2), -1)
res4: List[(Int, Int, Int)] = List((1,3,-5), (2,2,-5))
Here is a solution that's O(n^2*log(n)). So it's quite a lot faster for large lists.
Also it uses lower level language features to increase the speed even further.
def f(nums: List[Int], n: Int): List[(Int, Int, Int)] = {
val result = scala.collection.mutable.ArrayBuffer.empty[(Int, Int, Int)]
val array = nums.toArray
val mapValueToMaxIndex = scala.collection.mutable.Map.empty[Int, Int]
nums.zipWithIndex.foreach {
case (n, i) => mapValueToMaxIndex += (n -> math.max(i, (mapValueToMaxIndex.getOrElse(n, i))))
}
val size = array.size
var i = 0
while(i < size) {
val a = array(i)
var j = i+1
while(j < size) {
val b = array(j)
val c = n - b - a
mapValueToMaxIndex.get(c).foreach { maxIndex =>
if(maxIndex > j) result += ((a, b, c))
}
j += 1
}
i += 1
}
result.toList
}
def comb(c: Int, r: Int): Int = {
if(r == 1) c
else if(r < c) comb(c-1, r-1) + comb(c-1, r)
else 1
}
comb(20,10) //184,756
What I'd like to do is to call it as comb(10,20) and get the same result. I tried to replace c with r and r with c except in the signature, but it doesn't work.
def comb(c: Int, r: Int): Int = {
if(c == 1) r
else if(c < r) comb(r-1, c-1) + comb(r-1, c)
else 1
}
comb(10,20) //2 -> not right
You would also need to change the order in the sub-calls:
def comb(c: Int, r: Int): Int = {
if (c == 1) r
else if (c < r) comb(c - 1, r - 1) + comb(c, r - 1)
else 1
}
This gives the expected result:
comb(10, 20) // 184756
I would add, don't be afraid of local defs:
scala> def comb(c: Int, r: Int): Int = {
| if(r == 1) c
| else if(r < c) comb(c-1, r-1) + comb(c-1, r)
| else 1
| }
comb: (c: Int, r: Int)Int
scala> comb(20,10) //184,756
res0: Int = 184756
scala> def cc(c: Int, r: Int): Int = {
| def cc2(c: Int, r: Int): Int = {
| if(r == 1) c
| else if(r < c) comb(c-1, r-1) + comb(c-1, r)
| else 1
| }
| if (c < r) cc2(r,c) else cc2(c,r)
| }
cc: (c: Int, r: Int)Int
scala> cc(20,10)
res1: Int = 184756
scala> cc(10,20)
res2: Int = 184756