Here is some code I wrote to solve Project Euler #14 in Scala; the output is shown below as well. My issue is that I expected better performance from the cached version, but the opposite is true. I think I did something wrong, since I don't think HashMap's overhead alone should make it this slow.
Any suggestions?
object Problem14 {
  def main(args: Array[String]) {
    val collatzCacheMap = collection.mutable.HashMap[Long,Long]()

    def collatzLengthWithCache(num: Long): Long = {
      def collatzR(currNum: Long, solution: Long): Long = {
        val cacheVal = collatzCacheMap.get(currNum)
        if(cacheVal != None) {
          val answer = solution + cacheVal.get
          collatzCacheMap.put(num, answer)
          answer
        }
        else if(currNum == 1) { collatzCacheMap.put(num, solution + 1); solution + 1 }
        else if(currNum % 2 == 0) collatzR(currNum/2, solution + 1)
        else collatzR(3*currNum + 1, solution + 1)
      }
      collatzR(num, 0)
    }

    def collatzLength(num: Long): Long = {
      def collatzR(currNum: Long, solution: Long): Long = {
        if(currNum == 1) solution + 1
        else if(currNum % 2 == 0) collatzR(currNum/2, solution + 1)
        else collatzR(currNum*3 + 1, solution + 1)
      }
      collatzR(num, 0)
    }

    var startTime = System.currentTimeMillis()
    //val answer = (1L to 1000000).reduceLeft((x,y) => if(collatzLengthWithCache(x) > collatzLengthWithCache(y)) x else y)
    val answer = (1L to 1000000).zip((1L to 1000000).map(collatzLengthWithCache)).reduceLeft((x,y) => if(x._2 > y._2) x else y)
    println(answer)
    println("Cached time: " + (System.currentTimeMillis() - startTime))

    collatzCacheMap.clear()

    startTime = System.currentTimeMillis()
    //val answer2 = (1L to 1000000).par.reduceLeft((x,y) => if(collatzLengthWithCache(x) > collatzLengthWithCache(y)) x else y)
    val answer2 = (1L to 1000000).par.zip((1L to 1000000).par.map(collatzLengthWithCache)).reduceLeft((x,y) => if(x._2 > y._2) x else y)
    println(answer2)
    println("Cached time parallel: " + (System.currentTimeMillis() - startTime))

    startTime = System.currentTimeMillis()
    //val answer3 = (1L to 1000000).reduceLeft((x,y) => if(collatzLength(x) > collatzLength(y)) x else y)
    val answer3 = (1L to 1000000).zip((1L to 1000000).map(collatzLength)).reduceLeft((x,y) => if(x._2 > y._2) x else y)
    println(answer3)
    println("No Cached time: " + (System.currentTimeMillis() - startTime))

    startTime = System.currentTimeMillis()
    //val answer4 = (1L to 1000000).par.reduceLeft((x,y) => if(collatzLength(x) > collatzLength(y)) x else y)
    val answer4 = (1L to 1000000).par.zip((1L to 1000000).par.map(collatzLength)).reduceLeft((x,y) => if(x._2 > y._2) x else y)
    println(answer4)
    println("No Cached time parallel: " + (System.currentTimeMillis() - startTime))
  }
}
Output:
(837799,525)
Cached time: 1070
(837799,525)
Cached time parallel: 954
(837799,525)
No Cached time: 450
(837799,525)
No Cached time parallel: 241
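A side note on the cache (my own observation, not a verified diagnosis): as written, collatzR only ever stores a length for the starting num, never for the intermediate numbers it passes through, so the map fills far more slowly than it could while every step still pays for a lookup. A minimal sketch of a variant that memoizes every number on the chain (the names here are mine, not from the code above):

// Sketch only: cache the Collatz length of every value encountered.
val cache = collection.mutable.HashMap[Long, Long](1L -> 1L)

def collatzLengthMemo(num: Long): Long =
  cache.get(num) match {
    case Some(len) => len
    case None =>
      val next = if (num % 2 == 0) num / 2 else 3 * num + 1
      val len = 1 + collatzLengthMemo(next) // not tail-recursive, but chains here are short
      cache.put(num, len)
      len
  }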
I am writing a naive implementation of K-means in Spark for my homework:
import breeze.linalg.{ Vector, DenseVector, squaredDistance }
import scala.math

def parse(line: String): Vector[Double] = {
  DenseVector(line.split(' ').map(_.toDouble))
}

def closest_assign(p: Vector[Double], centres: Array[Vector[Double]]): Int = {
  var bestIndex = 1
  var closest = Double.PositiveInfinity
  for (i <- 0 until centres.length) {
    val tempDist = squaredDistance(p, centres(i))
    if (tempDist < closest) {
      closest = tempDist
      bestIndex = i
    }
  }
  bestIndex
}

val fileroot: String = "/FileStore/tables/"
val file = sc.textFile(fileroot + "data.txt")
  .map(parse _)
  .cache()
val c1 = sc.textFile(fileroot + "c1.txt")
  .map(parse _)
  .collect()
val c2 = sc.textFile(fileroot + "c2.txt")
  .map(parse _)
  .collect()

val K = 10
val MAX_ITER = 20
var kPoints = c2

for (i <- 0 until MAX_ITER) {
  val closest = file.map(p => (closest_assign(p, kPoints), (p, 1)))
  val pointStats = closest.reduceByKey { case ((x1, y1), (x2, y2)) => (x1 + x2, y1 + y2) }
  val newPoints = pointStats.map { pair =>
    (pair._1, pair._2._1 * (1.0 / pair._2._2))
  }.collectAsMap()
  for (newP <- newPoints) {
    kPoints(newP._1) = newP._2
  }
  val tempDist = closest
    .map { x => squaredDistance(x._2._1, newPoints(x._1)) }
    .fold(0) { _ + _ }
  println(i + " time finished iteration (cost = " + tempDist + ")")
}
In theory tempDist should become smaller and smaller as the program runs, but in reality it goes the other way around. I also found that c1 and c2 change value after the for(i <- 0 until MAX_ITER) loop, even though c1 and c2 are vals. Is the way I load c1 and c2 wrong? c1 and c2 are two different sets of initial clusters for the data.
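One thing that stands out in the code as posted (a guess at the c1/c2 mystery, not at the cost behaviour): var kPoints = c2 does not copy the array, so the in-place update kPoints(newP._1) = newP._2 also rewrites the contents of c2. A val only fixes the reference, not the array elements. A minimal sketch of a defensive copy:

// kPoints and c2 currently point at the same Array object, so mutating
// kPoints also changes what c2 appears to hold. Copying avoids that:
var kPoints = c2.clone() // or: c2.map(identity)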
I have three different objects that I've written in IntelliJ IDEA, labelled PartA, PartB, and PartC. However, when I attempt to run them, only PartB gives me the option to run; when I right-click on the code for PartA or PartC, there is no run option at all. What's going on here, and how can I fix it so I can run the other objects I have written?
Edit: Sorry, first time posting a question here. Here's the code I have written.
object PartB extends App {
  def easter(Y: Int): Int = {
    val N = Y - 1900
    val A = N - (N / 19) * 19
    val B = (7 * A + 1) / 19
    val C = 11 * A + 4 - B
    val M = C - (C / 29) * 29
    val Q = N / 4
    val S = N + Q + 31 - M
    val W = S - (S / 7) * 7
    val DATE = 25 - M - W
    return DATE
  }

  println("Enter a year: ")
  val year = scala.io.StdIn.readInt()
  val date = easter(year)
  var easter_day: String = ""
  if (date == 0) {
    easter_day = "March, 31"
  } else if (date < 0) {
    easter_day = "March, " + (31 + year)
  } else {
    easter_day = "April, " + date
  }
  println("In " + year + ", Easter is on " + easter_day + ".")
}

////////////////////////////////////////////////////////////////////////////////

object PartC {
  def ack(m: Int, n: Int): Int = {
    if (m == 0) {
      return n + 1
    } else if (n == 0) {
      return ack(m - 1, 1)
    } else {
      return ack(m - 1, ack(m, n - 1))
    }
  }

  println("Enter a value for m: ")
  val m = scala.io.StdIn.readInt()
  println("Enter a value for n: ")
  val n = scala.io.StdIn.readInt()
  println(ack(m, n))
}
PartB extends App, but PartC doesn't. Presumably PartA doesn't either.
The App trait can be used to quickly turn objects into executable programs... the whole class body becomes the “main method”.
So PartB defines a main method.
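A minimal sketch of the two usual fixes for PartC (the same applies to PartA); PartCWithMain below is just an illustrative name:

// Option 1: extend App, so the object body becomes the program
object PartC extends App {
  def ack(m: Int, n: Int): Int =
    if (m == 0) n + 1
    else if (n == 0) ack(m - 1, 1)
    else ack(m - 1, ack(m, n - 1))

  println("Enter a value for m: ")
  val m = scala.io.StdIn.readInt()
  println("Enter a value for n: ")
  val n = scala.io.StdIn.readInt()
  println(ack(m, n))
}

// Option 2: keep a plain object and add an explicit entry point
object PartCWithMain {
  def ack(m: Int, n: Int): Int =
    if (m == 0) n + 1
    else if (n == 0) ack(m - 1, 1)
    else ack(m - 1, ack(m, n - 1))

  def main(args: Array[String]): Unit = {
    println("Enter a value for m: ")
    val m = scala.io.StdIn.readInt()
    println("Enter a value for n: ")
    val n = scala.io.StdIn.readInt()
    println(ack(m, n))
  }
}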
I want to generate a list of Tuple2 objects. Each tuple (a, b) in the list should meet these conditions: a and b are both perfect squares, (b/30) < a < b,
and a > N and b > N (N can even be a BigInt).
I am trying to write a Scala function that generates the list of tuples meeting the above requirements.
This is my attempt. It works fine for Ints and Longs, but for BigInt I run into the problem of taking a square root. Here is my approach:
scala> def genTups(N:Long) ={
| val x = for(s<- 1L to Math.sqrt(N).toLong) yield s*s;
| val y = x.combinations(2).map{ case Vector(a,b) => (a,b)}.toList
| y.filter(t=> (t._1*30/t._2)>=1)
| }
genTups: (N: Long)List[(Long, Long)]
scala> genTups(30)
res32: List[(Long, Long)] = List((1,4), (1,9), (1,16), (1,25), (4,9), (4,16), (4,25), (9,16), (9,25), (16,25))
I improved this using a BigInt square-root algorithm as below:
def genTups(N1: BigInt, N2: BigInt) = {
  def sqt(n: BigInt): BigInt = {
    var a = BigInt(1)
    var b = (n >> 5) + BigInt(8)
    while ((b - a) >= 0) {
      var mid: BigInt = (a + b) >> 1
      if (mid * mid - n > 0) b = mid - 1
      else a = mid + 1
    }
    a - 1
  }
  val x = for (s <- sqt(N1) to sqt(N2)) yield s * s
  val y = x.combinations(2).map { case Vector(a, b) => (a, b) }.toList
  y.filter(t => (t._1 * 30 / t._2) >= 1)
}
I would appreciate any help to improve my algorithm.
You can avoid sqrt in your algorithm by changing the way you calculate x to this:
val x = (BigInt(1) to N).map(x => x*x).takeWhile(_ <= N)
The final function is then:
def genTups(N: BigInt) = {
  val x = (BigInt(1) to N).map(x => x * x).takeWhile(_ <= N)
  val y = x.combinations(2).map { case Vector(a, b) if (a < b) => (a, b) }.toList
  y.filter(t => (t._1 * 30 / t._2) >= 1)
}
You can also re-write this as a single chain of operations like this:
def genTups(N: BigInt) =
  (BigInt(1) to N)
    .map(x => x * x)
    .takeWhile(_ <= N)
    .combinations(2)
    .map { case Vector(a, b) if a < b => (a, b) }
    .filter(t => (t._1 * 30 / t._2) >= 1)
    .toList
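As a quick sanity check (my addition, not part of the original answer), calling the chained version with the value from the REPL session above should reproduce the same pairs, just as BigInts:

genTups(BigInt(30))
// List((1,4), (1,9), (1,16), (1,25), (4,9), (4,16), (4,25), (9,16), (9,25), (16,25))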
In a quest for performance, I came up with this recursive version, which appears to be significantly faster:
import scala.annotation.tailrec

def genTups(N1: BigInt, N2: BigInt) = {
  def sqt(n: BigInt): BigInt = {
    var a = BigInt(1)
    var b = (n >> 5) + BigInt(8)
    while ((b - a) >= 0) {
      var mid: BigInt = (a + b) >> 1
      if (mid * mid - n > 0) {
        b = mid - 1
      } else {
        a = mid + 1
      }
    }
    a - 1
  }

  @tailrec
  def loop(a: BigInt, rem: List[BigInt], res: List[(BigInt, BigInt)]): List[(BigInt, BigInt)] =
    rem match {
      case Nil => res
      case head :: tail =>
        val a30 = a * 30
        val thisRes = rem.takeWhile(_ <= a30).map(b => (a, b))
        loop(head, tail, thisRes.reverse ::: res)
    }

  val squares = (sqt(N1) to sqt(N2)).map(s => s * s).toList
  loop(squares.head, squares.tail, Nil).reverse
}
Each recursion of the loop adds all the matching pairs for a given value of a. The result is built in reverse because prepending to a long list is much faster than appending to it.
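The two-argument version can be checked the same way (again, my addition): with bounds covering the earlier example it returns the same ten pairs.

genTups(BigInt(1), BigInt(30))
// List((1,4), (1,9), (1,16), (1,25), (4,9), (4,16), (4,25), (9,16), (9,25), (16,25))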
First, create a function that checks whether a number is a perfect square, returning its square root if so.
def squareRootOfPerfectSquare(a: Int): Option[Int] = {
  val sqrt = math.sqrt(a)
  if (sqrt % 1 == 0)
    Some(sqrt.toInt)
  else
    None
}
Then create another function that builds the list of tuples according to the conditions mentioned above.
def generateTuples(n1: Int, n2: Int) = {
  for {
    b <- 1 to n2
    a <- 1 to n1 if (b > a && squareRootOfPerfectSquare(b).isDefined && squareRootOfPerfectSquare(a).isDefined)
  } yield (a, b)
}
Then, on calling the function as generateTuples(5,10),
you will get this output:
res0: scala.collection.immutable.IndexedSeq[(Int, Int)] = Vector((1,4), (1,9), (4,9))
Hope that helps!
I'm migrating from Java to Scala and I am trying to come up with the merge procedure for the mergesort algorithm. My solution:
import scala.annotation.tailrec

def merge(src: Array[Int], dst: Array[Int], from: Int,
          mid: Int, until: Int): Unit = {
  /*
   * Iteration of merge:
   * i - index of src[from, mid)
   * j - index of src[mid, until)
   * k - index of dst[from, until)
   */
  @tailrec
  def loop(i: Int, j: Int, k: Int): Unit = {
    if (k >= until) {
      // end of recursive calls
    } else if (i >= mid) {
      // left half exhausted: take from the right
      dst(k) = src(j)
      loop(i, j + 1, k + 1)
    } else if (j >= until) {
      // right half exhausted: take from the left
      dst(k) = src(i)
      loop(i + 1, j, k + 1)
    } else if (src(i) <= src(j)) {
      dst(k) = src(i)
      loop(i + 1, j, k + 1)
    } else {
      dst(k) = src(j)
      loop(i, j + 1, k + 1)
    }
  }
  loop(from, mid, from)
}
It seems to work, but it strikes me as written in quite an "imperative" style (even though I used recursion and no mutable variables, except for the arrays, for which the side effect is intended). I want something like this:
/*
 * this code is not working and does entirely the wrong thing
 */
for (i <- (from until mid); j <- (mid until until);
     k <- (from until until) if <???>) yield dst(k) = src(<???>)
But I can't come up with a proper solution of that kind. Can you please help me?
Consider this:
val left = src.slice(from, mid).iterator.buffered
val right = src.slice(mid, until).iterator.buffered

(from until until) foreach { k =>
  dst(k) = if (!left.hasNext) right.next()
           else if (!right.hasNext || left.head < right.head) left.next()
           else right.next()
}
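For context, a small self-contained sketch of how that snippet slots into the merge signature from the question, with a throwaway example (the array values are just an illustration):

def merge(src: Array[Int], dst: Array[Int], from: Int, mid: Int, until: Int): Unit = {
  val left = src.slice(from, mid).iterator.buffered
  val right = src.slice(mid, until).iterator.buffered
  (from until until) foreach { k =>
    dst(k) = if (!left.hasNext) right.next()
             else if (!right.hasNext || left.head < right.head) left.next()
             else right.next()
  }
}

val src = Array(1, 3, 5, 2, 4, 6)
val dst = new Array[Int](src.length)
merge(src, dst, 0, 3, 6)
// dst is now Array(1, 2, 3, 4, 5, 6)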
I have worked on the Prime Generator problem for almost 3 days.
I want to write a functional Scala solution (meaning no var and no mutable data), but every attempt exceeds the time limit.
My solution is:
object Main {
  def sqrt(num: Int) = math.sqrt(num).toInt

  def isPrime(num: Int): Boolean = {
    val end = sqrt(num)
    def isPrimeHelper(current: Int): Boolean = {
      if (current > end) true
      else if (num % current == 0) false
      else isPrimeHelper(current + 1)
    }
    isPrimeHelper(2)
  }

  val feedMax = sqrt(1000000000)
  val feedsList = (2 to feedMax).filter(isPrime)
  val feedsSet = feedsList.toSet

  def findPrimes(min: Int, max: Int) = (min to max) filter {
    num => if (num <= feedMax) feedsSet.contains(num)
           else feedsList.forall(p => num % p != 0 || p * p > num)
  }

  def main(args: Array[String]) {
    val total = readLine().toInt
    for (i <- 1 to total) {
      val Array(from, to) = readLine().split("\\s+")
      val primes = findPrimes(from.toInt, to.toInt)
      primes.foreach(println)
      println()
    }
  }
}
I'm not sure what can be improved. I also searched a lot, but couldn't find a Scala solution (most are C/C++ ones).
Here is a nice fully functional Scala solution using the Sieve of Eratosthenes: http://en.literateprograms.org/Sieve_of_Eratosthenes_(Scala)#chunk def:ints
Check out this elegant and efficient one-liner by Daniel Sobral: http://dcsobral.blogspot.se/2010/12/sieve-of-eratosthenes-real-one-scala.html?m=1
lazy val unevenPrimes: Stream[Int] = {
  def nextPrimes(n: Int, sqrt: Int, sqr: Int): Stream[Int] =
    if (n > sqr) nextPrimes(n, sqrt + 1, (sqrt + 1) * (sqrt + 1))
    else if (unevenPrimes.takeWhile(_ <= sqrt).exists(n % _ == 0)) nextPrimes(n + 2, sqrt, sqr)
    else n #:: nextPrimes(n + 2, sqrt, sqr)

  3 #:: 5 #:: nextPrimes(7, 3, 9)