JVM OOM Error - Finding prime numbers using Scala Stream

I am using a Scala Stream to find the prime numbers between two numbers, but it throws a Java OutOfMemoryError. Can someone tell me why this is happening?
def sieve(nums: Stream[Int]): Stream[Int] = {
nums.head #:: sieve(nums.tail filter (_ % nums.head != 0))
}
def listPrimesinRange(start: Int, end: Int): List[Int] = {
sieve(Stream.from(2)).filter(x => (x >= start && x <= end)).toList
}
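One likely cause, offered as a sketch rather than a confirmed diagnosis: `filter` on an infinite stream cannot know that no later element will satisfy `x <= end`, so `toList` keeps forcing (and memoizing) primes until memory runs out. Cutting the stream off with `takeWhile` makes it finite. The version below assumes Scala 2.13+ and uses `LazyList` in place of `Stream`; `listPrimesInRange` is the question's function with adjusted capitalization.

```scala
// Same lazy sieve as in the question, written against LazyList.
def sieve(nums: LazyList[Int]): LazyList[Int] =
  nums.head #:: sieve(nums.tail.filter(_ % nums.head != 0))

// dropWhile skips primes below `start`; takeWhile ends the stream at the
// first prime above `end`, so toList only sees a finite prefix.
def listPrimesInRange(start: Int, end: Int): List[Int] =
  sieve(LazyList.from(2)).dropWhile(_ < start).takeWhile(_ <= end).toList
```

For example, `listPrimesInRange(10, 30)` yields `List(11, 13, 17, 19, 23, 29)`. Every prime still adds another nested filter, so this stays slow (and stack-hungry) for large bounds.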

Related

Code to compute Stream of primes in Scala

I have slightly modified Daniel Sobral's prime Stream function from this SO post:
def primeStream: Stream[Int] => Stream[Int] =
s => s.head #:: primeStream(s.tail filter(_ % s.head != 0))
I'm using it with:
primeStream(Stream.from(2)).take(100).foreach(println)
and it works fine enough, but I'm wondering if I could get rid of that pesky Stream.from(2) with the following:
def primeStream: () => Stream[Int] =
() => Stream.from(2)
def primeStream: Stream[Int] => Stream[Int] =
s => s.head #:: primeStream(s.tail filter(_ % s.head != 0))
to achieve:
primeStream().take(100).foreach(println)
But that doesn't work. What am I missing?
I tried also:
def primeStream: Stream[Int] => Stream[Int] = {
() => Stream.from(2)
s: Stream[Int] => s.head #:: primeStream(s.tail filter(_ % s.head != 0))
}
which doesn't work.
This works:
def primeStream2(s: Stream[Int] = Stream.from(2)): Stream[Int] =
s.head #:: primeStream2(s.tail filter(_ % s.head != 0))
But I wanted to understand what I missed to make the syntax work for the more symmetric syntax above with 2 parallel definitions of primeStream .
The 1st attempt doesn't work because you're trying to define 2 different methods with the same name. Methods can't be differentiated by their return types. Also, other than their names they appear to be totally unrelated so if you were able to invoke one of them the existence of the other would be immaterial.
The 2nd attempt tries to put 2 unrelated, and unnamed, functions in the same code block. It will compile if you wrap the 1st function in parentheses but the result isn't what you're after.
I completely understand your desire to make Stream.from(2) automatic because if you pass anything else, like Stream.from(13), you don't get a Stream of prime integers.
There are a few different ways to get a lazy sequence of prime numbers with only one Stream invocation. This one is a little complicated because it tries to reduce the number of inner iterations when searching for the next prime.
val primeStream: Stream[Int] = 2 #:: Stream.iterate[Int](3)(x =>
Stream.iterate(x+2)(_+2).find(i => primeStream.takeWhile(p => p*p <= i)
.forall(i%_ > 0)).get)
You can also use the new (Scala 2.13) unfold() method to create the Stream.
val primes = Stream.unfold(List(2)) { case hd::tl =>
Option((hd, Range(hd+1, hd*2).find(n => tl.forall(n % _ > 0)).get::hd::tl))
}
Note that Stream has been deprecated since Scala 2.13 and should be replaced with the new LazyList.
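As a minimal sketch of what that replacement could look like (assuming Scala 2.13+; the name `primes` is only illustrative), the same "test against the primes found so far" idea can be written against `LazyList`:

```scala
// Self-referential lazy list: each odd candidate is tested only against
// the already-computed primes whose square does not exceed it.
lazy val primes: LazyList[Int] =
  2 #:: LazyList.from(3, 2).filter { n =>
    primes.takeWhile(p => p * p <= n).forall(n % _ != 0)
  }
```

`primes.take(10).toList` then gives the first ten primes. Note that `p * p` is computed in `Int`, so the sketch is only meant for candidates well below `Int.MaxValue`.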

Collatz - maximum number of steps and the corresponding number

I am trying to write a Scala function that takes an upper bound as an argument and calculates the Collatz steps for the numbers in the range from 1 up to this bound. It has to return the maximum number of steps and the corresponding number that needs that many steps (as a pair: the first element is the number of steps and the second is the corresponding index).
I have already created a function called "collatz" which computes the number of steps. I am very new to Scala and I am a bit stuck because of the limitations. Here's how I thought to start the function:
def max(x:Int):Int = {
for (i<-(1 to x).toList) yield collatz(i)
The way I think to solve this problem is to: 1. iterate through the range and apply collatz to every element, putting the step counts in a new list; 2. find the maximum of the new list using List.max; 3. use List.indexOf to find its index. However, I'm really stuck, since I don't know how to do this without using var (only val). Thanks!
Something like this:
def collatzMax(n: Long): (Long, Long) = {
require(n > 0, "Collatz function is not defined for n <= 0")
def collatz(n: Long, steps: Long): Long = n match {
case n if (n <= 1) => steps
case n if (n % 2 == 0) => collatz(n / 2, steps + 1)
case n if (n % 2 == 1) => collatz(3 * n + 1, steps + 1)
}
def loop(n: Long, current: Long, acc: List[(Long, Long)]): List[(Long, Long)] =
if (current > n) acc
else {
loop(n, current + 1, collatz(current, 0) -> current :: acc)
}
loop(n, 1, Nil).sortBy(-_._1).head
}
Example:
collatzMax(12)
result: (Long, Long) = (19,9) // 19 steps for collatz(9)
Using for:
def collatzMax(n: Long) =
(for(i <- 1L to n) yield collatz(i) -> i).sortBy(-_._1).head
Or (continuing your idea):
def maximum(x: Long): (Long, Long) = {
val lst = for (i <- 1L to x) yield collatz(i)
val maxValue = lst.max
(maxValue, lst.indexOf(maxValue) + 1)
}
Try:
(1 to x).map(i => (collatz(i), i)).maxBy(_._1)
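A self-contained sketch of the `maxBy` approach, for readers who want something that runs as-is. The names `collatzMaxBy` and `collatzSteps` are hypothetical; `collatzSteps` stands in for the asker's `collatz`, which is not shown in the question.

```scala
import scala.annotation.tailrec

def collatzMaxBy(bound: Int): (Int, Int) = {
  require(bound >= 1, "bound must be at least 1")

  // Hypothetical stand-in for the asker's `collatz`: steps needed to reach 1.
  @tailrec
  def collatzSteps(n: Long, steps: Int): Int =
    if (n <= 1) steps
    else if (n % 2 == 0) collatzSteps(n / 2, steps + 1)
    else collatzSteps(3 * n + 1, steps + 1)

  // Pair each number with its step count, then keep the pair with the most steps.
  (1 to bound).map(i => (collatzSteps(i.toLong, 0), i)).maxBy(_._1)
}
```

`collatzMaxBy(12)` returns `(19, 9)`, matching the example above. No `var` is needed because `maxBy` does the comparison internally.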

Scala functional solution for spoj "Prime Generator"

I worked on the Prime Generator problem for almost 3 days.
I want to write a functional Scala solution (which means "no var", "no mutable data"), but every time it exceeds the time limit.
My solution is:
object Main {
def sqrt(num: Int) = math.sqrt(num).toInt
def isPrime(num: Int): Boolean = {
val end = sqrt(num)
def isPrimeHelper(current: Int): Boolean = {
if (current > end) true
else if (num % current == 0) false
else isPrimeHelper(current + 1)
}
isPrimeHelper(2)
}
val feedMax = sqrt(1000000000)
val feedsList = (2 to feedMax).filter(isPrime)
val feedsSet = feedsList.toSet
def findPrimes(min: Int, max: Int) = (min to max) filter {
num => if (num <= feedMax) feedsSet.contains(num)
else feedsList.forall(p => num % p != 0 || p * p > num)
}
def main(args: Array[String]): Unit = {
val total = scala.io.StdIn.readLine().toInt
for (i <- 1 to total) {
val Array(from, to) = scala.io.StdIn.readLine().split("\\s+")
val primes = findPrimes(from.toInt, to.toInt)
primes.foreach(println)
println()
}
}
}
I'm not sure what can be improved. I also searched a lot, but couldn't find a Scala solution (most are C/C++ ones).
Here is a nice fully functional scala solution using the sieve of eratosthenes: http://en.literateprograms.org/Sieve_of_Eratosthenes_(Scala)#chunk def:ints
Check out this elegant and efficient one liner by Daniel Sobral: http://dcsobral.blogspot.se/2010/12/sieve-of-eratosthenes-real-one-scala.html?m=1
lazy val unevenPrimes: Stream[Int] = {
def nextPrimes(n: Int, sqrt: Int, sqr: Int): Stream[Int] =
if (n > sqr) nextPrimes(n, sqrt + 1, (sqrt + 1)*(sqrt + 1)) else
if (unevenPrimes.takeWhile(_ <= sqrt).exists(n % _ == 0)) nextPrimes(n + 2, sqrt, sqr)
else n #:: nextPrimes(n + 2, sqrt, sqr)
3 #:: 5 #:: nextPrimes(7, 3, 9)
}
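One detail in the question's `findPrimes` is worth a sketch: `forall` only stops early on a `false`, so for a prime `num` the predicate `num % p != 0 || p * p > num` is evaluated for every base prime, even those far above the square root of `num`. The version below stops at the first base prime whose square exceeds `num`. It is an illustration under assumptions (the helper name `isPrimeInRange` is hypothetical), not a solution known to pass the SPOJ time limit.

```scala
// Base primes up to sqrt(10^9), found once by plain trial division.
val limit: Int = math.sqrt(1000000000).toInt
val basePrimes: Vector[Int] =
  (2 to limit).filter(n => (2 to math.sqrt(n).toInt).forall(n % _ != 0)).toVector

// Hypothetical helper: the lazy iterator stops at the first base prime
// whose square exceeds num, so larger base primes are never visited.
def isPrimeInRange(num: Int): Boolean =
  num >= 2 &&
    basePrimes.iterator.takeWhile(p => p.toLong * p <= num).forall(num % _ != 0)

def findPrimes(min: Int, max: Int): IndexedSeq[Int] =
  (min to max).filter(isPrimeInRange)
```

With this, `findPrimes(1, 10)` gives 2, 3, 5, 7, and no mutable state is involved.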

Scala's Stream and StackOverflowError

Consider this code (taken from "Functional programming principles in Scala" course by Martin Odersky):
def sieve(s: Stream[Int]): Stream[Int] = {
s.head #:: sieve(s.tail.filter(_ % s.head != 0))
}
val primes = sieve(Stream.from(2))
primes.take(1000).toList
It works just fine. Notice that sieve is in fact NOT tail recursive (or is it?), even though Stream's tail is lazy.
But this code:
def sieve(n: Int): Stream[Int] = {
n #:: sieve(n + 1).filter(_ % n != 0)
}
val primes = sieve(2)
primes.take(1000).toList
throws StackOverflowError.
What is the problem with the second example? I guess filter messes things up, but I can't understand why. It returns a Stream, so it shouldn't make evaluation eager (am I right?)
You can highlight the problem with a bit of tracking code:
var counter1, counter2 = 0
def sieve1(s: Stream[Int]): Stream[Int] = {
counter1 += 1
s.head #:: sieve1(s.tail.filter(_ % s.head != 0))
}
def sieve2(n: Int): Stream[Int] = {
counter2 += 1
n #:: sieve2(n + 1).filter(_ % n != 0)
}
sieve1(Stream.from(2)).take(100).toList
sieve2(2).take(100).toList
We can run this and check the counters:
scala> counter1
res2: Int = 100
scala> counter2
res3: Int = 540
So in the first case the depth of the call stack is the number of primes, and in the second it's the largest prime itself (well, minus one).
Neither one of these are tail recursive.
Using the tailrec annotation will tell you whether or not a function is tail recursive.
Adding @tailrec to the two functions above gives:
import scala.annotation.tailrec
@tailrec
def sieve(s: Stream[Int]): Stream[Int] = {
s.head #:: sieve(s.tail.filter(_ % s.head != 0))
}
@tailrec
def sieve(n: Int): Stream[Int] = {
n #:: sieve(n + 1).filter(_ % n != 0)
}
Loading this shows that both definitions are not tail recursive:
<console>:10: error: could not optimize @tailrec annotated method sieve: it contains a recursive call not in tail position
s.head #:: sieve(s.tail.filter(_ % s.head != 0))
^
<console>:10: error: could not optimize @tailrec annotated method sieve: it contains a recursive call not in tail position
n #:: sieve(n + 1).filter(_ % n != 0)
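For contrast, here is a sketch of a prime finder that the annotation does accept, because the recursive call is the last thing each branch does and the result is carried in an accumulator. The names `firstPrimes` and `loop` are illustrative, and this is plain trial division, not a lazy stream.

```scala
import scala.annotation.tailrec

def firstPrimes(count: Int): List[Int] = {
  // Each recursive call is in tail position, so the compiler turns it into a loop.
  @tailrec
  def loop(candidate: Int, found: Vector[Int]): Vector[Int] =
    if (found.length >= count) found
    else if (found.takeWhile(p => p * p <= candidate).forall(candidate % _ != 0))
      loop(candidate + 1, found :+ candidate)
    else
      loop(candidate + 1, found)

  loop(2, Vector.empty).toList
}
```

`firstPrimes(1000)` runs in constant stack space, whereas the stream-based sieves above grow the stack with every element forced.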

Scala performance - Sieve

Right now, I am trying to learn Scala. I've started small, writing some simple algorithms. I've encountered some problems when I wanted to implement the Sieve algorithm for finding all prime numbers lower than a certain threshold.
My implementation is:
import scala.math
object Sieve {
// Returns all prime numbers until maxNum
def getPrimes(maxNum : Int) = {
def sieve(list: List[Int], stop : Int) : List[Int] = {
list match {
case Nil => Nil
case h :: list if h <= stop => h :: sieve(list.filterNot(_ % h == 0), stop)
case _ => list
}
}
val stop : Int = math.sqrt(maxNum).toInt
sieve((2 to maxNum).toList, stop)
}
def main(args: Array[String]) = {
val ap = printf("%d ", (_:Int));
// works
getPrimes(1000).foreach(ap(_))
// works
getPrimes(100000).foreach(ap(_))
// out of memory
getPrimes(1000000).foreach(ap(_))
}
}
Unfortunately it fails when I want to compute all the prime numbers smaller than 1000000 (1 million). I get an OutOfMemoryError.
Do you have any idea how to optimize the code, or how I can implement this algorithm in a more elegant fashion?
PS: I've done something very similar in Haskell, and there I didn't encounter any issues.
I would go with an infinite Stream. Using a lazy data structure lets you code pretty much like in Haskell. It automatically reads more "declarative" than the code you wrote.
import Stream._
val primes = 2 #:: sieve(3)
def sieve(n: Int) : Stream[Int] =
if (primes.takeWhile(p => p*p <= n).exists(n % _ == 0)) sieve(n + 2)
else n #:: sieve(n + 2)
def getPrimes(maxNum : Int) = primes.takeWhile(_ < maxNum)
Obviously, this isn't the most performant approach. Read The Genuine Sieve of Eratosthenes for a good explanation (it's Haskell, but not too difficult). For really big ranges you should consider the Sieve of Atkin.
The code in question is not tail recursive, so Scala cannot optimize the recursion away. Also, Haskell is non-strict by default, so you can hardly compare it to Scala. For instance, whereas Haskell benefits from foldRight, Scala benefits from foldLeft.
There are many Scala implementations of Sieve of Eratosthenes, including some in Stack Overflow. For instance:
(n: Int) => (2 to n) |> (r => r.foldLeft(r.toSet)((ps, x) => if (ps(x)) ps -- (x * x to n by x) else ps))
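The `|>` operator in that one-liner is not part of the Scala standard library (it comes from libraries such as Scalaz). A sketch of the same fold using only the standard library follows; `primesUpTo` is an illustrative name, and the extra `x.toLong * x <= n` guard is an addition that keeps `x * x` from overflowing `Int` for large `n`.

```scala
def primesUpTo(n: Int): List[Int] = {
  val candidates = 2 to n
  val survivors = candidates.foldLeft(candidates.toSet) { (ps, x) =>
    // Cull multiples only for a surviving x whose square is still in range;
    // the Long multiplication guards against Int overflow.
    if (ps(x) && x.toLong * x <= n) ps -- (x * x to n by x) else ps
  }
  // A Set is unordered, so sort before returning.
  survivors.toList.sorted
}
```

`primesUpTo(30)` gives `List(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)`. Building and shrinking an immutable `Set` is convenient but slow for large ranges.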
The following answer is about 100 times faster than the "one-liner" answer using a Set (and the results don't need sorting into ascending order). It is also more functional in form than the other answer using an array, although it uses a mutable BitSet as the sieving array:
object SoE {
def makeSoE_Primes(top: Int): Iterator[Int] = {
val topndx = (top - 3) / 2
val nonprms = new scala.collection.mutable.BitSet(topndx + 1)
def cullp(i: Int) = {
import scala.annotation.tailrec; val p = i + i + 3
@tailrec def cull(c: Int): Unit = if (c <= topndx) { nonprms += c; cull(c + p) }
cull((p * p - 3) >>> 1)
}
(0 to (Math.sqrt(top).toInt - 3) >>> 1).filterNot { nonprms }.foreach { cullp }
Iterator.single(2) ++ (0 to topndx).filterNot { nonprms }.map { i: Int => i + i + 3 }
}
}
It can be tested by the following code:
object Main extends App {
import SoE._
val top_num = 10000000
val strt = System.nanoTime()
val count = makeSoE_Primes(top_num).size
val end = System.nanoTime()
println(s"Successfully completed without errors. [total ${(end - strt) / 1000000} ms]")
println(f"Found $count primes up to $top_num" + ".")
println("Using one large mutable BitSet and functional code.")
}
With the results from the above as follows:
Successfully completed without errors. [total 294 ms]
Found 664579 primes up to 10000000.
Using one large mutable BitSet and functional code.
There is an overhead of about 40 milliseconds for even small sieve ranges, and there are various non-linear responses with increasing range as the size of the BitSet grows beyond the different CPU caches.
It looks like List isn't very efficient space-wise. You can get an out-of-memory exception by doing something like this:
(1 to 2000000).toList
I "cheated" and used a mutable array. Didn't feel dirty at all.
def primesSmallerThan(n: Int): List[Int] = {
val nonprimes = Array.tabulate(n + 1)(i => i == 0 || i == 1)
val primes = new collection.mutable.ListBuffer[Int]
for (x <- nonprimes.indices if !nonprimes(x)) {
primes += x
for (y <- x * x until nonprimes.length by x if (x * x) > 0) {
nonprimes(y) = true
}
}
primes.toList
}