help rewriting in functional style - scala

I'm learning Scala as my first functional-ish language. As one of the problems, I was trying to find a functional way of generating the sequence S up to n places. S is defined so that S(1) = 1, and S(x) = the number of times x appears in the sequence. (I can't remember what this is called, but I've seen it in programming books before.)
In practice, the sequence looks like this:
S = 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7 ...
I can generate this sequence pretty easily in Scala using an imperative style like this:
def genSequence(numItems: Int) = {
require(numItems > 0, "numItems must be >= 1")
var list: List[Int] = List(1)
var seq_no = 2
var no = 2
var no_nos = 0
var num_made = 1
while(num_made < numItems) {
if(no_nos < seq_no) {
list = list :+ no
no_nos += 1
num_made += 1
} else if(no % 2 == 0) {
no += 1
no_nos = 0
} else {
no += 1
seq_no += 1
no_nos = 0
}
}
list
}
But I don't really have any idea how to write this without using vars and the while loop.
Thanks!

Pavel's answer has come closest so far, but it's also inefficient. Two flatMaps and a zipWithIndex are overkill here :)
My understanding of the required output:
The results contain all the positive integers (starting from 1) at least once
each number n appears in the output (n/2) + 1 times
As Pavel has rightly noted, the solution is to start with a Stream then use flatMap:
Stream from 1
This generates a Stream, a potentially never-ending sequence that only produces values on demand. In this case, it's generating 1, 2, 3, 4... all the way up to Infinity (in theory) or Integer.MAX_VALUE (in practice)
Streams can be mapped over, as with any other collection. For example: (Stream from 1) map { 2 * _ } generates a Stream of even numbers.
You can also use flatMap on Streams, allowing you to map each input element to zero or more output elements; this is key to solving your problem:
val strm = (Stream from 1) flatMap { n => Stream.fill(n/2 + 1)(n) }
So... How does this work? For the element 3, the lambda { n => Stream.fill(n/2 + 1)(n) } will produce the output stream 3,3. For the first 5 integers you'll get:
1 -> 1
2 -> 2, 2
3 -> 3, 3
4 -> 4, 4, 4
5 -> 5, 5, 5
etc.
and because we're using flatMap, these will be concatenated, yielding:
1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, ...
Streams are memoised, so once a given value has been calculated it'll be saved for future reference. However, all the preceeding values have to be calculated at least once. If you want the full sequence then this won't cause any problems, but it does mean that generating S(10796) from a cold start is going to be slow! (a problem shared with your imperative algorithm). If you need to do this, then none of the solutions so far is likely to be appropriate for you.

The following code produces exactly the same sequence as yours:
val seq = Stream.from(1)
.flatMap(Stream.fill(2)(_))
.zipWithIndex
.flatMap(p => Stream.fill(p._1)(p._2))
.tail
However, if you want to produce the Golomb sequence (that complies with the definition, but differs from your sample code result), you may use the following:
val seq = 1 #:: a(2)
def a(n: Int): Stream[Int] = (1 + seq(n - seq(seq(n - 2) - 1) - 1)) #:: a(n + 1)
You may check my article for more examples of how to deal with number sequences in functional style.

Here is a translation of your code to a more functional style:
def genSequence(numItems: Int): List[Int] = {
genSequenceR(numItems, 2, 2, 0, 1, List[Int](1))
}
def genSequenceR(numItems: Int, seq_no: Int, no:Int, no_nos: Int, numMade: Int, list: List[Int]): List[Int] = {
if(numMade < numItems){
if(no_nos < seq_no){
genSequenceR(numItems, seq_no, no, no_nos + 1, numMade + 1, list :+ no)
}else if(no % 2 == 0){
genSequenceR(numItems, seq_no, no + 1, 0, numMade, list)
}else{
genSequenceR(numItems, seq_no + 1, no + 1, 0, numMade, list)
}
}else{
list
}
}
The genSequenceR is the recursive function that accumulates values in the list and calls the function with new values based on the conditions. Like the while loop, it terminates, when numMade is less than numItems and returns the list to genSequence.
This is a fairly rudimentary functional translation of your code. It can be improved and there are better approaches typically used. I'd recommend trying to improve it with pattern matching and then work towards the other solutions that use Stream here.

Here's an attempt from a Scala tyro. Keep in mind I don't really understand Scala, I don't really understand the question, and I don't really understand your algorithm.
def genX_Ys[A](howMany : Int, ofWhat : A) : List[A] = howMany match {
case 1 => List(ofWhat)
case _ => ofWhat :: genX_Ys(howMany - 1, ofWhat)
}
def makeAtLeast(startingWith : List[Int], nextUp : Int, howMany : Int, minimumLength : Int) : List[Int] = {
if (startingWith.size >= minimumLength)
startingWith
else
makeAtLeast(startingWith ++ genX_Ys( howMany, nextUp),
nextUp +1, howMany + (if (nextUp % 2 == 1) 1 else 0), minimumLength)
}
def genSequence(numItems: Int) = makeAtLeast(List(1), 2, 2, numItems).slice(0, numItems)
This seems to work, but re-read the caveats above. In particular, I am sure there is a library function that performs genX_Ys, but I couldn't find it.
EDIT Could be
def genX_Ys[A](howMany : Int, ofWhat : A) : Seq[A] =
(1 to howMany) map { x => ofWhat }

Here is a very direct "translation" of the definition of the Golomb seqence:
val it = Iterator.iterate((1,1,Map(1->1,2->2))){ case (n,i,m) =>
val c = m(n)
if (c == 1) (n+1, i+1, m + (i -> n) - n)
else (n, i+1, m + (i -> n) + (n -> (c-1)))
}.map(_._1)
println(it.take(10).toList)
The tripel (n,i,m) contains the actual number n, the index i and a Map m, which contains how often an n must be repeated. When the counter in the Map for our n reaches 1, we increase n (and can drop n from the map, as it is not longer needed), else we just decrease n's counter in the map and keep n. In every case we add the new pair i -> n into the map, which will be used as counter later (when a subsequent n reaches the value of the current i).
[Edit]
Thinking about it, I realized that I don't need indexes and not even a lookup (because the "counters" are already in the "right" order), which means that I can replace the Map with a Queue:
import collection.immutable.Queue
val it = 1 #:: Iterator.iterate((2, 2, Queue[Int]())){
case (n,1,q) => (n+1, q.head, q.tail + (n+1))
case (n,c,q) => (n,c-1,q + n)
}.map(_._1).toStream
The Iterator works correctly when starting by 2, so I had to add a 1 at the beginning. The second tuple argument is now the counter for the current n (taken from the Queue). The current counter could be kept in the Queue as well, so we have only a pair, but then it's less clear what's going on due to the complicated Queue handling:
val it = 1 #:: Iterator.iterate((2, Queue[Int](2))){
case (n,q) if q.head == 1 => (n+1, q.tail + (n+1))
case (n,q) => (n, ((q.head-1) +: q.tail) + n)
}.map(_._1).toStream

Related

List whose elements depend on the previous ones

Suppose I have a list of increasing integers. If the difference of 2 consecutive numbers is less than a threshold, then we index them by the same number, starting from 0. Otherwise, we increase the index by 1.
For example: for the list (1,2,5,7,8,11,15,16,20) and the threshold = 3, the output will be: (0, 0, 1, 1, 1, 2, 3, 3, 4).
Here is my code:
val listToGroup = List(1,2,5,7,8,11,15,16,20)
val diff_list = listToGroup.sliding(2,1).map{case List(i, j) => j-i}.toList
val thres = 2
var j=0
val output_ = for(i <- diff_list.indices) yield {
if (diff_list(i) > thres ) {
j += 1
}
j
}
val output = List.concat(List(0), output_)
I'm new to Scala and I feel the list is not used efficiently. How can this code be improved?
You can avoid the mutable variable by using scanLeft to get a more idiomatic code:
val output = diff_list.scanLeft(0) { (count, i) =>
if (i > thres) count + 1
else count
}
Your code shows some constructs which are usually avoided in Scala, but common when coming from procedural langugues, like: for(i <- diff_list.indices) ... diff_list(i) can be replaced with for(i <- diff_list).
Other than that, I think your code is efficient - you need to traverse the list anyway and you do it in O(N). I would not worry about efficiency here, more about style and readability.
My rewrite to how I think it would be more natural in Scala for the whole code would be:
val listToGroup = List(1,2,5,7,8,11,15,16,20)
val thres = 2
val output = listToGroup.zip(listToGroup.drop(1)).scanLeft(0) { case (count, (i, j)) =>
if (j - i > thres) count + 1
else count
}
My adjustments to your code:
I use scanLeft to perform the result collection construction
I prefer x.zip(x.drop(1)) over x.sliding(2, 1) (constructing tuples seems a bit more efficient than constructing collections). You could also use x.zip(x.tail), but that does not handle empty x
I avoid the temporary result diff_list
val listToGroup = List(1, 2, 5, 7, 8, 11, 15, 16, 20)
val thres = 2
listToGroup
.sliding(2)
.scanLeft(0)((a, b) => { if (b.tail.head - b.head > thres) a + 1 else a })
.toList
.tail
You don't need to use mutable variable, you can achieve the same with scanLeft.

How to pair each element of a Seq with the rest?

I'm looking for an elegant way to combine every element of a Seq with the rest for a large collection.
Example: Seq(1,2,3).someMethod should produce something like
Iterator(
(1,Seq(2,3)),
(2,Seq(1,3)),
(3,Seq(1,2))
)
Order of elements doesn't matter. It doesn't have to be a tuple, a Seq(Seq(1),Seq(2,3)) is also acceptable (although kinda ugly).
Note the emphasis on large collection (which is why my example shows an Iterator).
Also note that this is not combinations.
Ideas?
Edit:
In my use case, the numbers are expected to be unique. If a solution can eliminate the dupes, that's fine, but not at additional cost. Otherwise, dupes are acceptable.
Edit 2: In the end, I went with a nested for-loop, and skipped the case when i == j. No new collections were created. I upvoted the solutions that were correct and simple ("simplicity is the ultimate sophistication" - Leonardo da Vinci), but even the best ones are quadratic just by the nature of the problem, and some create intermediate collections by usage of ++ that I wanted to avoid because the collection I'm dealing with has close to 50000 elements, 2.5 billion when quadratic.
The following code has constant runtime (it does everything lazily), but accessing every element of the resulting collections has constant overhead (when accessing each element, an index shift must be computed every time):
def faceMap(i: Int)(j: Int) = if (j < i) j else j + 1
def facets[A](simplex: Vector[A]): Seq[(A, Seq[A])] = {
val n = simplex.size
(0 until n).view.map { i => (
simplex(i),
(0 until n - 1).view.map(j => simplex(faceMap(i)(j)))
)}
}
Example:
println("Example: facets of a 3-dimensional simplex")
for ((i, v) <- facets((0 to 3).toVector)) {
println(i + " -> " + v.mkString("[", ",", "]"))
}
Output:
Example: facets of a 3-dimensional simplex
0 -> [1,2,3]
1 -> [0,2,3]
2 -> [0,1,3]
3 -> [0,1,2]
This code expresses everything in terms of simplices, because "omitting one index" corresponds exactly to the face maps for a combinatorially described simplex. To further illustrate the idea, here is what the faceMap does:
println("Example: how `faceMap(3)` shifts indices")
for (i <- 0 to 5) {
println(i + " -> " + faceMap(3)(i))
}
gives:
Example: how `faceMap(3)` shifts indices
0 -> 0
1 -> 1
2 -> 2
3 -> 4
4 -> 5
5 -> 6
The facets method uses the faceMaps to create a lazy view of the original collection that omits one element by shifting the indices by one starting from the index of the omitted element.
If I understand what you want correctly, in terms of handling duplicate values (i.e., duplicate values are to be preserved), here's something that should work. Given the following input:
import scala.util.Random
val nums = Vector.fill(20)(Random.nextInt)
This should get you what you need:
for (i <- Iterator.from(0).take(nums.size)) yield {
nums(i) -> (nums.take(i) ++ nums.drop(i + 1))
}
On the other hand, if you want to remove dups, I'd convert to Sets:
val numsSet = nums.toSet
for (num <- nums) yield {
num -> (numsSet - num)
}
seq.iterator.map { case x => x -> seq.filter(_ != x) }
This is quadratic, but I don't think there is very much you can do about that, because in the end of the day, creating a collection is linear, and you are going to need N of them.
import scala.annotation.tailrec
def prems(s : Seq[Int]):Map[Int,Seq[Int]]={
#tailrec
def p(prev: Seq[Int],s :Seq[Int],res:Map[Int,Seq[Int]]):Map[Int,Seq[Int]] = s match {
case x::Nil => res+(x->prev)
case x::xs=> p(x +: prev,xs, res+(x ->(prev++xs)))
}
p(Seq.empty[Int],s,Map.empty[Int,Seq[Int]])
}
prems(Seq(1,2,3,4))
res0: Map[Int,Seq[Int]] = Map(1 -> List(2, 3, 4), 2 -> List(1, 3, 4), 3 -> List(2, 1, 4),4 -> List(3, 2, 1))
I think you are looking for permutations. You can map the resulting lists into the structure you are looking for:
Seq(1,2,3).permutations.map(p => (p.head, p.tail)).toList
res49: List[(Int, Seq[Int])] = List((1,List(2, 3)), (1,List(3, 2)), (2,List(1, 3)), (2,List(3, 1)), (3,List(1, 2)), (3,List(2, 1)))
Note that the final toList call is only there to trigger the evaluation of the expressions; otherwise, the result is an iterator as you asked for.
In order to get rid of the duplicate heads, toMap seems like the most straight-forward approach:
Seq(1,2,3).permutations.map(p => (p.head, p.tail)).toMap
res50: scala.collection.immutable.Map[Int,Seq[Int]] = Map(1 -> List(3, 2), 2 -> List(3, 1), 3 -> List(2, 1))

Code efficiency in Scala loops, counting up or counting down?

Clearly, if you need to count up, count up. If you need to count down, count down. However, other things being equal, is one faster than the other?
Here is my Scala code for a well-known puzzle - checking if a number is divisible by 13.
In the first example, I reverse my array and count upwards in the subsequent for-loop. In the second example I leave the array alone and do a decrementing for-loop. On the surface, the second example looks faster. Unfortunately, on the site where I run the code, it always times out.
// works every time
object Thirteen {
import scala.annotation.tailrec
#tailrec
def thirt(n: Long): Long = {
val getNum = (n: Int) => Array(1, 10, 9, 12, 3, 4)(n % 6)
val ni = n.toString.split("").reverse.map(_.toInt)
var s: Long = 0
for (i <- 0 to ni.length-1) {
s += ni(i) * getNum(i)
}
if (s == n) s else thirt(s)
}
}
// times out every time
object Thirteen {
import scala.annotation.tailrec
#tailrec
def thirt(n: Long): Long = {
val getNum = (n: Int) => Array(1, 10, 9, 12, 3, 4)(n % 6)
val ni = n.toString.split("").map(_.toInt)
var s: Long = 0
for (i <- ni.length-1 to 0 by -1) {
s = s + ni(i) * getNum(i)
}
if (s == n) s else thirt(s)
}
}
I ask the following questions:
Is there an obvious rule I am unaware of?
What is an easy way to test two code versions for performance – reliably measuring performance in the JVM appears difficult.
Does it help to look at the underlying byte code?
Is there a better piece of code solving
the same problem, If so, I'd be very grateful to see it.
Whilst I've seen similar questions, I can't find a definitive answer.
Here's how I'd be tempted to tackle it.
val nums :Stream[Int] = 1 #:: 10 #:: 9 #:: 12 #:: 3 #:: 4 #:: nums
def thirt(n :Long) :Long = {
val s :Long = Stream.iterate(n)(_ / 10)
.takeWhile(_ > 0)
.zip(nums)
.foldLeft(0L){case (sum, (i, num)) => sum + i%10 * num}
if (s == n) s else thirt(s)
}

How to capture inner matched value in indexWhere vector expression?

Using a Vector[Vector[Int]] reference v, and the expression to find a given number num:
val posX = v.indexWhere(_.indexOf(num) > -1)
Is there any way to capture the value of _.indexOf(num) to use after the expression (i.e. the posY value)? The following signals an error 'Illegal start of simple expression':
val posX = v.indexWhere((val posY = _.indexOf(num)) > -1)
If we do not mind using a variable then we can capture indexOf() value of inner Vector (_ in the below code) in a var and use it later to build the y position:
val posX = v.indexWhere(_.indexOf(num) > -1)
val posY = v(posX).indexOf(num)
There are lots of nice functional ways to do this. The following is probably one of the more concise:
val v = Vector(Vector(1, 2, 3), Vector(4, 5, 6), Vector(7, 8, 9))
val num = 4
val Some((posY, posX)) = v.map(_ indexOf num).zipWithIndex.find(_._1 > -1)
// posY: Int = 0
// posX: Int = 1
Note that there's a lot of extra work going on here, though—we're creating a couple of intermediate collections, parts of which we don't need, etc. If you're calling this thing a lot or on very large collections, you unfortunately may need to take a more imperative approach. In that case I'd suggest bundling up all the unpleasantness:
def locationOf(v: Vector[Vector[Int]])(num: Int): Option[(Int, Int)] = {
var i, j = 0
var found = false
while (i < v.size && !found) {
j = 0
while (j < v(i).size && !found)
if (v(i)(j) == num) found = true else j += 1
if (!found) i += 1
}
if (!found) None else Some(i, j)
}
Not as elegant, but this method is probably going to be a lot faster and more memory efficient. It's small enough that it isn't likely to contain any of the bugs that this kind of programming is so prone to, and it's referentially transparent—all the mutation is local.
From my armchair,
scala> val v = Vector(Vector(1, 2, 3), Vector(4, 5, 6), Vector(7, 8, 9))
scala> v.zipWithIndex collectFirst {
| case (e, i) if (e indexOf num) >= 0 =>
| (i, e indexOf num)
| }
res7: Option[(Int, Int)] = Some((1,0))
I haven't done the armchair math, but that's one intermediate collection compared to Travis's. But see Travis's comment that the result inner index is computed twice here, and the whole point was not to do that.
Here is a solution that will only evaluate up until it finds the required element. I personally find it more readable and you can reuse it across programs. You can obviously make this more general if need be.
val v = Vector(Vector(1, 2, 3), Vector(4, 5, 6))
def findElem(i: Int, vs: Vector[Vector[Int]]): (Int, Int) =
(for {
row <- vs.indices.toStream
col <- vs(row).indices.toStream
if vs(row)(col) == i
} yield (row, col)).head
findElem(5, v) // (1, 1)
You could remove the .toStream methods if you want all positions. Using the .toStream just means that you will only evaluate up until the first occurrence.

Using Streams for iteration in Scala

SICP says that iterative processes (e.g. Newton method of square root calculation, "pi" calculation, etc.) can be formulated in terms of Streams.
Does anybody use streams in Scala to model iterations?
Here is one way to produce the stream of approximations of pi:
val naturals = Stream.from(0) // 0, 1, 2, ...
val odds = naturals.map(_ * 2 + 1) // 1, 3, 5, ...
val oddInverses = odds.map(1.0d / _) // 1/1, 1/3, 1/5, ...
val alternations = Stream.iterate(1)(-_) // 1, -1, 1, ...
val products = (oddInverses zip alternations)
.map(ia => ia._1 * ia._2) // 1/1, -1/3, 1/5, ...
// Computes a stream representing the cumulative sum of another one
def sumUp(s : Stream[Double], acc : Double = 0.0d) : Stream[Double] =
Stream.cons(s.head + acc, sumUp(s.tail, s.head + acc))
val pi = sumUp(products).map(_ * 4.0) // Approximations of pi.
Now, say you want the 200th iteration:
scala> pi(200)
resN: Double = 3.1465677471829556
...or the 300000th:
scala> pi(300000)
resN : Double = 3.14159598691202
Streams are extremely useful when you are doing a sequence of recursive calculations and a single result depends on previous results, such as calculating pi. Here's a simpler example, consider the classic recursive algorithm for calculating fibbonacci numbers (1, 2, 3, 5, 8, 13, ...):
def fib(n: Int) : Int = n match {
case 0 => 1
case 1 => 2
case _ => fib(n - 1) + fib(n - 2)
}
One of the main points of this code is that while very simple, is extremely inefficient. fib(100) almost crashed my computer! Each recursion branches into two calls and you are essentially calculating the same values many times.
Streams allow you to do dynamic programming in a recursive fashion, where once a value is calculated, it is reused every time it is needed again. To implement the above using streams:
val naturals: Stream[Int] = Stream.cons(0, naturals.map{_ + 1})
val fibs : Stream[Int] = naturals.map{
case 0 => 1
case 1 => 2
case n => fibs(n - 1) + fibs( n - 2)
}
fibs(1) //2
fibs(2) //3
fibs(3) //5
fibs(100) //1445263496
Whereas the recursive solution runs in O(2^n) time, the Streams solution runs in O(n^2) time. Since you only need the last 2 generated members, you can easily optimize this using Stream.drop so that the stream size doesn't overflow memory.