Collatz - maximum number of steps and the corresponding number - Scala

I am trying to write a Scala function that takes an upper bound as an argument and calculates the steps for the numbers in a range from 1 up to this bound. It has to return the maximum number of steps and the corresponding number that needs that many steps, as a pair (the first element is the number of steps and the second is the corresponding number).
I have already created a function called "collatz" which computes the number of steps. I am very new to Scala and I am a bit stuck because of the limitations. Here's how I thought to start the function:
def max(x: Int): Int = {
  for (i <- (1 to x).toList) yield collatz(i)
The way I think to solve this problem is to: 1. iterate through the range and apply collatz to all elements, putting the results in a new list which stores the number of steps; 2. find the maximum of the new list using List.max; 3. use List.indexOf to find the index of that maximum. However, I'm really stuck since I don't know how to do this without using var (only val). Thanks!

Something like this:
def collatzMax(n: Long): (Long, Long) = {
  require(n > 0, "Collatz function is not defined for n <= 0")
  def collatz(n: Long, steps: Long): Long = n match {
    case n if (n <= 1) => steps
    case n if (n % 2 == 0) => collatz(n / 2, steps + 1)
    case n if (n % 2 == 1) => collatz(3 * n + 1, steps + 1)
  }
  def loop(n: Long, current: Long, acc: List[(Long, Long)]): List[(Long, Long)] =
    if (current > n) acc
    else loop(n, current + 1, collatz(current, 0) -> current :: acc)
  loop(n, 1, Nil).sortBy(-_._1).head
}
Example:
collatzMax(12)
result: (Long, Long) = (19,9) // 19 steps for collatz(9)
Using for (with your one-argument collatz):
def collatzMax(n: Long) =
  (for (i <- 1L to n) yield collatz(i) -> i).sortBy(-_._1).head
Or, continuing your idea:
def maximum(x: Long): (Long, Long) = {
  val lst = for (i <- 1L to x) yield collatz(i)
  val maxValue = lst.max
  (maxValue, lst.indexOf(maxValue) + 1)
}
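A quick check, reusing the example above (collatz(9) needs 19 steps):
maximum(12) // (Long, Long) = (19,9)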

Try (again assuming your one-argument collatz, which returns the step count):
(1L to x).map(i => collatz(i) -> i).maxBy(_._1)
This pairs each step count with its number and picks the pair with the most steps.

Related

Area under the curve programmatically in Scala

I'm trying to solve for the area under the curve in Example 1 of: http://tutorial.math.lamar.edu/Classes/CalcI/AreaProblem.aspx
f(x) = x^3 - 5x^2 + 6x + 5 and the x-axis on [0, 4], with n = 5 subintervals.
The answer is supposed to be 25.12, but I'm getting slightly less: 23.78880035448074.
What am I doing wrong? Here's my code:
import scala.math.BigDecimal.RoundingMode

def summation(low: Int, up: Int, coe: List[Int], ex: List[Int]) = {
  def eva(coe: List[Int], ex: List[Int], x: Double) = {
    (for (i <- 0 until coe.size) yield coe(i) * math.pow(x, ex(i))).sum
  }
  @annotation.tailrec
  def build_points(del: Float, p: Int, xs: List[BigDecimal]): List[BigDecimal] = {
    if (p <= 0) xs map { x => x.setScale(3, RoundingMode.HALF_EVEN) }
    else build_points(del, p - 1, ((del * p): BigDecimal) :: xs)
  }
  val sub = 5
  val diff = (up - low).toFloat
  val deltaX = diff / sub
  val points = build_points(deltaX, sub, List(0.0f)); println(points)
  val middle_points =
    for (i <- 0 until points.size - 1) yield (points(i) + points(i + 1)) / 2
  (for (elem <- middle_points) yield deltaX * eva(coe, ex, elem.toDouble)).sum
}
val coe = List(1, -5, 6, 5)
val exp = List(3, 2, 1, 0)
print(summation(0, 4, coe, exp))
val coe = List(1,-5,6,5)
val exp = List(3,2,1,0)
print(summation(0,4,coe,exp))
I'm guessing the problem is that build_points prepends each new point, so the seed you pass in ends up at the tail: build_points(deltaX, 5, List(0.0f)) returns [0.8, 1.6, 2.4, 3.2, 4.0, 0.0]. The first midpoint 0.4 is never produced, and the last "midpoint" is computed between 4.0 and 0.0, which is where the missing area goes. You want the six points in ascending order with 0.0 at the head, [0.0, 0.8, 1.6, 2.4, 3.2, 4.0], which you can get by seeding with an empty list and letting the recursion run down to p = 0.
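A minimal sketch of that fix (only the stop condition and the seed change from your code):
@annotation.tailrec
def build_points(del: Float, p: Int, xs: List[BigDecimal]): List[BigDecimal] = {
  if (p < 0) xs map { x => x.setScale(3, RoundingMode.HALF_EVEN) } // stop after prepending del * 0
  else build_points(del, p - 1, ((del * p): BigDecimal) :: xs)
}
// ...
val points = build_points(deltaX, sub, Nil) // List(0.000, 0.800, 1.600, 2.400, 3.200, 4.000)
With this change the midpoints become 0.4, 1.2, 2.0, 2.8, 3.6, and summation(0, 4, coe, exp) should print approximately 25.12, matching the tutorial.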

Scala fast way to parallelize collection

My code is equivalent to this:
def iterate(prev: Vector[Int], acc: Int): Vector[Int] = {
  val next = (for { i <- 1.to(1000000) }
              yield prev(Random.nextInt(i))).toVector
  if (acc < 20) iterate(next, acc + 1)
  else next
}
iterate(1.to(1000000).toVector, 1)
For a large number of iterations, it performs an operation on a collection and yields each value, converting everything to a vector at the end. It then proceeds to the next recursive self-call, but cannot do so until all the iterations are done. The number of recursive self-calls is very small.
I want to parallelize this, so I tried to use .par on the 1.to(1000000) range. This used 8 processes instead of 1, and the result was only twice as fast! .toParArray was only slightly faster than .par. I was told it could be much faster if I used something different, like maybe a ThreadPool - this makes sense, because all of the time is spent in constructing next, and I assume that concatenating the outputs of different processes onto shared memory will not result in huge slowdowns, even for very large outputs (this is a key assumption and it might be wrong). How can I do it? If you provide code, parallelizing the code I gave will be sufficient.
Note that the code I gave is not my actual code. My actual code is much longer and more complex (Held-Karp algorithm for TSP with constraints, BitSets and more stuff), and the only notable difference is that in my code, prev's type is ParMap instead of Vector.
Edit, extra information: the ParMap has 350k elements on the worst iteration at the biggest sample size I can handle, and otherwise it's typically 5k-200k (that varies on a log scale). If it inherently needs a lot of time to concatenate the results from the processes into one single process (I assume this is what's happening), then there is nothing much I can do, but I rather doubt this is the case.
I implemented a few versions after the original one proposed in the question:
rec0 is the original, with a for comprehension;
rec1 uses par.map instead of the for comprehension;
rec2 follows rec1 but employs the parallel collection ParArray (for lazy builders and fast access on bulk traversal operations);
rec3 is a non-idiomatic, non-parallel version with a mutable ArrayBuffer.
Thus,
import scala.collection.mutable.ArrayBuffer
import scala.collection.parallel.mutable.ParArray
import scala.util.Random

// Original
def rec0() = {
  def iterate(prev: Vector[Int], acc: Int): Vector[Int] = {
    val next = (for { i <- 1.to(1000000) }
                yield prev(Random.nextInt(i))).toVector
    if (acc < 20) iterate(next, acc + 1)
    else next
  }
  iterate(1.to(1000000).toVector, 1)
}

// par map
def rec1() = {
  def iterate(prev: Vector[Int], acc: Int): Vector[Int] = {
    val next = (1 to 1000000).par.map { i => prev(Random.nextInt(i)) }.toVector
    if (acc < 20) iterate(next, acc + 1)
    else next
  }
  iterate(1.to(1000000).toVector, 1)
}

// ParArray par map
def rec2() = {
  def iterate(prev: ParArray[Int], acc: Int): ParArray[Int] = {
    val next = (1 to 1000000).par.map { i => prev(Random.nextInt(i)) }.toParArray
    if (acc < 20) iterate(next, acc + 1)
    else next
  }
  iterate((1 to 1000000).toParArray, 1).toVector
}

// Non-idiomatic non-parallel
def rec3() = {
  def iterate(prev: ArrayBuffer[Int], acc: Int): ArrayBuffer[Int] = {
    val next = ArrayBuffer.tabulate(1000000) { i => i + 1 }
    var i = 0
    while (i < 1000000) {
      next(i) = prev(Random.nextInt(i + 1))
      i = i + 1
    }
    if (acc < 20) iterate(next, acc + 1)
    else next
  }
  iterate(ArrayBuffer.tabulate(1000000) { i => i + 1 }, 1).toVector
}
Then a little testing, averaging the elapsed times:
def elapsed[A](f: => A): Double = {
  val start = System.nanoTime()
  f
  val stop = System.nanoTime()
  (stop - start) * 1e-6d
}
val times = 10
val e0 = (1 to times).map { i => elapsed(rec0) }.sum / times
val e1 = (1 to times).map { i => elapsed(rec1) }.sum / times
val e2 = (1 to times).map { i => elapsed(rec2) }.sum / times
val e3 = (1 to times).map { i => elapsed(rec3) }.sum / times
// time in ms.
e0: Double = 2782.341
e1: Double = 2454.828
e2: Double = 3455.976
e3: Double = 1275.876
shows that the non-idiomatic, non-parallel version is the fastest on average. Perhaps for larger input data the parallel, idiomatic versions would prove beneficial.
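On the ThreadPool idea from the question: a minimal sketch under my own assumptions (the chunking scheme, names, and pool size are illustrative, not from the question) that splits the range into one slice per core and fills a shared preallocated array from a fixed thread pool, bypassing the parallel-collection combiners entirely. Each task writes a disjoint index range, so the writes need no synchronization.
import java.util.concurrent.Executors
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.util.Random

val pool = Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors)
implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)

def iterate(prev: Array[Int], acc: Int): Array[Int] = {
  val n = prev.length
  val next = new Array[Int](n) // shared output; each task owns a disjoint slice
  val chunks = Runtime.getRuntime.availableProcessors
  val tasks = (0 until chunks).map { c =>
    Future {
      var i = c * n / chunks
      val end = (c + 1) * n / chunks
      while (i < end) {
        next(i) = prev(Random.nextInt(i + 1)) // i + 1 because the original range is 1-based
        i += 1
      }
    }
  }
  Await.result(Future.sequence(tasks), Duration.Inf) // barrier before the next recursion
  if (acc < 20) iterate(next, acc + 1) else next
}

val result = iterate(Array.tabulate(1000000)(_ + 1), 1).toVector
pool.shutdown()
One caveat: the shared scala.util.Random is a contention point under this scheme; calling java.util.concurrent.ThreadLocalRandom.current() inside each task would avoid it.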

Scala - can a 'for-yield' clause yield nothing for some condition?

In Scala, I want to write a function that yields odd numbers within a given range. The function should print a log message when it skips an even number. The first version of the function is:
import scala.collection.mutable

def getOdds(N: Int): Traversable[Int] = {
  val list = new mutable.MutableList[Int]
  for (n <- 0 until N) {
    if (n % 2 == 1) {
      list += n
    } else {
      println("skip even number " + n)
    }
  }
  list
}
If I omit printing logs, the implementation become very simple:
def getOddsWithoutPrint(N: Int) =
  for (n <- 0 until N if n % 2 == 1) yield n
However, I don't want to miss the logging part. How do I rewrite the first version more compactly? It would be great if it can be rewritten similar to this:
def IWantToDoSomethingSimilar(N: Int) =
  for (n <- 0 until N) if (n % 2 == 1) yield n else println("skip even number " + n)
def IWantToDoSomethingSimilar(N: Int) =
  for {
    n <- 0 until N
    if n % 2 != 0 || { println("skip even number " + n); false }
  } yield n
Using filter instead of a for expression would be slightly simpler though.
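For instance, a sketch of that filter version (the side-effecting predicate is my own, mirroring the trick above):
def getOdds(N: Int): Seq[Int] =
  (0 until N).filter { n =>
    if (n % 2 == 0) println("skip even number " + n) // log as a side effect of the predicate
    n % 2 == 1
  }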
If you want to keep the sequentiality of your processing (handling odds and evens in order, not separately), you can use something like this (edited):
def IWantToDoSomethingSimilar(N: Int) =
  (for (n <- 0 until N) yield {
    if (n % 2 == 1) {
      Option(n)
    } else {
      println("skip even number " + n)
      None
    }
  // flatten transforms the Seq[Option[Int]] into a Seq[Int]
  }).flatten
EDIT, following the same concept, a shorter solution :
def IWantToDoSomethingSimilar(N: Int) =
  (0 until N) map {
    case n if n % 2 == 0 => println("skip even number " + n)
    case n => n
  } collect { case i: Int => i }
If you want to dig into a functional approach, something like the following is a good place to start.
First some common definitions:
// use scalaz 7
import scalaz._, Scalaz._

// transforms a function returning either E or B into a
// function returning an optional B and optionally writing a log of type E
def logged[A, E, B, F[_]](f: A => E \/ B)(
    implicit FM: Monoid[F[E]], FP: Pointed[F]): (A => Writer[F[E], Option[B]]) =
  (a: A) => f(a).fold(
    e => Writer(FP.point(e), None),
    b => Writer(FM.zero, Some(b)))

// helper for fixing the log storage format to List
def listLogged[A, E, B](f: A => E \/ B) = logged[A, E, B, List](f)

// shorthand for a String logger with List storage
type W[+A] = Writer[List[String], A]
Now all you have to do is write your filtering function:
def keepOdd(n: Int): String \/ Int =
  if (n % 2 == 1) \/.right(n) else \/.left(n + " was even")
You can try it instantly:
scala> List(5, 6) map(keepOdd)
res0: List[scalaz.\/[String,Int]] = List(\/-(5), -\/(6 was even))
Then you can use the traverse function to apply your function to a list of inputs, and collect both the logs written and the results:
scala> val x = List(5, 6).traverse[W, Option[Int]](listLogged(keepOdd))
x: W[List[Option[Int]]] = scalaz.WriterTFunctions$$anon$26@503d0400
// unwrap the results
scala> x.run
res11: (List[String], List[Option[Int]]) = (List(6 was even),List(Some(5), None))
// we may even drop the None-s from the output
scala> val (logs, results) = x.map(_.flatten).run
logs: List[String] = List(6 was even)
results: List[Int] = List(5)
I don't think this can be done easily with a for comprehension. But you could use partition.
def getOdds(N: Int) = {
  val (evens, odds) = 0 until N partition { x => x % 2 == 0 }
  evens foreach { x => println("skipping " + x) }
  odds
}
EDIT: To avoid printing the log messages after the partitioning is done, you can change the first line of the method like this:
val (evens, odds) = (0 until N).view.partition { x => x % 2 == 0 }
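A quick check of the eager version (with the view variant, the "skipping" lines only print once evens is actually traversed):
getOdds(6)
// prints: skipping 0, skipping 2, skipping 4
// returns the odds: 1, 3, 5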

Integer partitioning in Scala

Given n (say, 3 people) and s (say, $100), we'd like to partition s among the n people, so we need all possible n-tuples that sum to s.
My Scala code below:
def weights(n: Int, s: Int): List[List[Int]] = {
  List.concat((0 to s).toList.map(List.fill(n)(_)).flatten, (0 to s).toList).
    combinations(n).filter(_.sum == s).map(_.permutations.toList).toList.flatten
}
println(weights(3, 100))
This works for small values of n (n = 1, 2, 3 or 4). Beyond n = 4 it takes a very long time and is practically unusable. I'm looking for ways to rework my code using lazy evaluation / Streams.
My requirement: it must work for n up to 10.
Warning: the problem gets really big really fast. My results from Matlab for s = 100, n = 1 through 5:
n=1 :1 combinations
n=2 :101 combinations
n=3 :5151 combinations
n=4 :176851 combinations
n=5: 4598126 combinations
You need dynamic programming, or memoization. Same concept, anyway.
Let's say you have to divide s among n. Recursively, that's defined like this:
def permutations(s: Int, n: Int): List[List[Int]] = n match {
  case 0 => Nil
  case 1 => List(List(s))
  case _ => (0 to s).toList flatMap (x => permutations(s - x, n - 1) map (x :: _))
}
Now, this will STILL be slow as hell, but there's a catch here... you don't need to recompute permutations(s, n) for numbers you have already computed. So you can do this instead:
val memoP = collection.mutable.Map.empty[(Int, Int), List[List[Int]]]
def permutations(s: Int, n: Int): List[List[Int]] = {
  def permutationsWithHead(x: Int) = permutations(s - x, n - 1) map (x :: _)
  n match {
    case 0 => Nil
    case 1 => List(List(s))
    case _ =>
      memoP getOrElseUpdate ((s, n),
        (0 to s).toList flatMap permutationsWithHead)
  }
}
And this can be improved even further, because as written it computes every permutation. You only need to compute every combination, and then permute each one without recomputing.
To compute every combination, we can change the code like this:
val memoC = collection.mutable.Map.empty[(Int, Int, Int), List[List[Int]]]
def combinations(s: Int, n: Int, min: Int = 0): List[List[Int]] = {
  def combinationsWithHead(x: Int) = combinations(s - x, n - 1, x) map (x :: _)
  n match {
    case 0 => Nil
    case 1 => List(List(s))
    case _ =>
      memoC getOrElseUpdate ((s, n, min),
        (min to s / 2).toList flatMap combinationsWithHead)
  }
}
Running combinations(100, 10) is still slow, given the sheer number of combinations alone. The permutations for each combination can be obtained simply by calling .permutations on it.
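For example, a small sanity check (my own illustration) expanding the memoized combinations back into ordered tuples:
combinations(4, 2)
// List(List(0, 4), List(1, 3), List(2, 2))
combinations(4, 2).flatMap(_.permutations)
// List(List(0, 4), List(4, 0), List(1, 3), List(3, 1), List(2, 2)) - 5 ordered pairs summing to 4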
Here's a quick and dirty Stream solution (note the argument order: the total s first, then the tuple length n):
def weights(s: Int, n: Int) = (1 until n).foldLeft(Stream(Nil: List[Int])) {
  (a, _) => a.flatMap(c => Stream.range(0, s - c.sum + 1).map(_ :: c))
}.map(c => (s - c.sum) :: c)
It works for n = 6 in about 15 seconds on my machine:
scala> var x = 0
scala> weights(100, 6).foreach(_ => x += 1)
scala> x
res81: Int = 96560646
As a side note: by the time you get to n = 10, there are 4,263,421,511,271 of these things. That's going to take days just to stream through.
My solution to this problem; it can compute up to n = 6:
object Partition {
  implicit def i2p(n: Int): Partition = new Partition(n)
  def main(args: Array[String]): Unit = {
    for (n <- 1 to 6) println(100.partitions(n).size)
  }
}

class Partition(n: Int) {
  def partitions(m: Int): Iterator[List[Int]] = new Iterator[List[Int]] {
    val nums = Array.ofDim[Int](m)
    nums(0) = n
    var hasNext = m > 0 && n > 0
    override def next: List[Int] = {
      if (hasNext) {
        val result = nums.toList
        var idx = 0
        while (idx < m - 1 && nums(idx) == 0) idx = idx + 1
        if (idx == m - 1) hasNext = false
        else {
          nums(idx + 1) = nums(idx + 1) + 1
          nums(0) = nums(idx) - 1
          if (idx != 0) nums(idx) = 0
        }
        result
      }
      else Iterator.empty.next
    }
  }
}
1
101
5151
176851
4598126
96560646
However, we can compute just the number of possible n-tuples:
val pt: (Int, Int) => BigInt = {
  val buf = collection.mutable.Map[(Int, Int), BigInt]()
  (s, n) => buf.getOrElseUpdate((s, n),
    if (n == 0 && s > 0) BigInt(0)
    else if (s == 0) BigInt(1)
    else (0 to s).map { k => pt(s - k, n - 1) }.sum
  )
}
for (n <- 1 to 20) printf("%2d :%s%n", n, pt(100, n).toString)
for(n <- 1 to 20) printf("%2d :%s%n",n,pt(100,n).toString)
1 :1
2 :101
3 :5151
4 :176851
5 :4598126
6 :96560646
7 :1705904746
8 :26075972546
9 :352025629371
10 :4263421511271
11 :46897636623981
12 :473239787751081
13 :4416904685676756
14 :38393094575497956
15 :312629484400483356
16 :2396826047070372396
17 :17376988841260199871
18 :119594570260437846171
19 :784008849485092547121
20 :4910371215196105953021
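For what it's worth, these counts have a closed form (stars and bars): the number of n-tuples of nonnegative integers summing to s is C(s + n - 1, n - 1). A minimal sketch to cross-check the memoized table above:
def count(s: Int, n: Int): BigInt =
  // running product stays integral: after step k it equals C(s + k, k)
  (1 until n).foldLeft(BigInt(1))((acc, k) => acc * (s + k) / k)
count(100, 3)  // 5151
count(100, 10) // 4263421511271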

Scala performance - Sieve

Right now, I am trying to learn Scala. I've started small, writing some simple algorithms. I've encountered some problems when I wanted to implement the Sieve algorithm for finding all prime numbers lower than a certain threshold.
My implementation is:
import scala.math

object Sieve {
  // Returns all prime numbers until maxNum
  def getPrimes(maxNum: Int) = {
    def sieve(list: List[Int], stop: Int): List[Int] = {
      list match {
        case Nil => Nil
        case h :: list if h <= stop => h :: sieve(list.filterNot(_ % h == 0), stop)
        case _ => list
      }
    }
    val stop: Int = math.sqrt(maxNum).toInt
    sieve((2 to maxNum).toList, stop)
  }

  def main(args: Array[String]) = {
    val ap = printf("%d ", (_: Int));
    // works
    getPrimes(1000).foreach(ap(_))
    // works
    getPrimes(100000).foreach(ap(_))
    // out of memory
    getPrimes(1000000).foreach(ap(_))
  }
}
Unfortunately it fails when I want to compute all the prime numbers smaller than 1000000 (1 million); I get an OutOfMemoryError.
Do you have any idea how to optimize the code, or how I can implement this algorithm in a more elegant fashion?
PS: I've done something very similar in Haskell, and there I didn't encounter any issues.
I would go with an infinite Stream. Using a lazy data structure allows you to code pretty much like in Haskell, and it reads more "declarative" than the code you wrote.
import Stream._

val primes = 2 #:: sieve(3)

def sieve(n: Int): Stream[Int] =
  if (primes.takeWhile(p => p * p <= n).exists(n % _ == 0)) sieve(n + 2)
  else n #:: sieve(n + 2)

def getPrimes(maxNum: Int) = primes.takeWhile(_ < maxNum)
Obviously, this isn't the most performant approach. Read The Genuine Sieve of Eratosthenes for a good explanation (it's Haskell, but not too difficult). For really big ranges you should consider the Sieve of Atkin.
The code in question is not tail recursive, so Scala cannot optimize the recursion away. Also, Haskell is non-strict by default, so you can hardly compare it to Scala. For instance, whereas Haskell benefits from foldRight, Scala benefits from foldLeft.
There are many Scala implementations of the Sieve of Eratosthenes, including some on Stack Overflow. For instance (|> is a forward-pipe operator, not in the standard library):
(n: Int) => (2 to n) |> (r => r.foldLeft(r.toSet)((ps, x) => if (ps(x)) ps -- (x * x to n by x) else ps))
The following answer is about 100 times faster than the "one-liner" answer using a Set (and the results don't need sorting to ascending order), and it is more functional in form than the other answer using an array, although it uses a mutable BitSet as a sieving array:
object SoE {
  import scala.annotation.tailrec
  def makeSoE_Primes(top: Int): Iterator[Int] = {
    val topndx = (top - 3) / 2
    val nonprms = new scala.collection.mutable.BitSet(topndx + 1)
    def cullp(i: Int) = {
      val p = i + i + 3
      @tailrec def cull(c: Int): Unit = if (c <= topndx) { nonprms += c; cull(c + p) }
      cull((p * p - 3) >>> 1)
    }
    (0 to (Math.sqrt(top).toInt - 3) >>> 1).filterNot { nonprms }.foreach { cullp }
    Iterator.single(2) ++ (0 to topndx).filterNot { nonprms }.map { i: Int => i + i + 3 }
  }
}
It can be tested by the following code:
object Main extends App {
  import SoE._
  val top_num = 10000000
  val strt = System.nanoTime()
  val count = makeSoE_Primes(top_num).size
  val end = System.nanoTime()
  println(s"Successfully completed without errors. [total ${(end - strt) / 1000000} ms]")
  println(f"Found $count primes up to $top_num" + ".")
  println("Using one large mutable BitSet and functional code.")
}
With the results from the above as follows:
Successfully completed without errors. [total 294 ms]
Found 664579 primes up to 10000000.
Using one large mutable BitSet and functional code.
There is an overhead of about 40 milliseconds even for small sieve ranges, and there are various non-linear responses with increasing range as the size of the BitSet grows beyond the different CPU caches.
It looks like List isn't very efficient space-wise. You can get an out-of-memory exception by doing something like this:
(1 to 2000000).toList
I "cheated" and used a mutable array. Didn't feel dirty at all.
def primesSmallerThan(n: Int): List[Int] = {
  val nonprimes = Array.tabulate(n + 1)(i => i == 0 || i == 1)
  val primes = new collection.mutable.ListBuffer[Int]
  for (x <- nonprimes.indices if !nonprimes(x)) {
    primes += x
    // the (x * x) > 0 guard skips culling when x * x overflows Int and wraps negative
    for (y <- x * x until nonprimes.length by x if (x * x) > 0) {
      nonprimes(y) = true
    }
  }
  primes.toList
}
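A quick sanity check (there are 78498 primes below one million):
primesSmallerThan(20)             // List(2, 3, 5, 7, 11, 13, 17, 19)
primesSmallerThan(1000000).length // 78498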