Splitting a Scala Range into evenly-sized contiguous sub-Ranges

If I have a Range, how can I split it into a sequence of contiguous sub-ranges, where the number of sub-ranges (buckets) is specified? Empty buckets should be omitted if there are not enough items.
For example:
splitRange(1 to 6, 3) == Seq(Range(1,2), Range(3,4), Range(5,6))
splitRange(1 to 2, 3) == Seq(Range(1), Range(2))
Some additional constraints, that rule out some of the solutions I've seen:
Roughly even bucket size - the bucket size should vary by 1, at most
The length of the input range may sometimes be very large, so the ranges should not be materialized into sequences (e.g. can't use grouped)
This also implies that we don't allocate numbers to buckets in round-robin fashion, because then numbers in each bucket wouldn't be contiguous and so wouldn't form a Range
Ideally, the sub-ranges would be produced in order, i.e (1,2)(3,4), not (3,4)(1,2)
A colleague found a solution here:
def splitRange(r: Range, chunks: Int): Seq[Range] = {
if (r.step != 1)
throw new IllegalArgumentException("Range must have step size equal to 1")
val nchunks = scala.math.max(chunks, 1)
val chunkSize = scala.math.max(r.length / nchunks, 1)
val starts = r.by(chunkSize).take(nchunks)
val ends = starts.map(_ - 1).drop(1) :+ r.end
starts.zip(ends).map(x => x._1 to x._2)
}
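but this can produce very uneven bucket sizes when the range is short relative to the number of chunks, because r.length / nchunks is truncated and the last bucket absorbs the whole remainder, e.g.: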
splitRange(1 to 14, 5)
//> Vector(Range(1, 2), Range(3, 4), Range(5, 6),
//| Range(7, 8), Range(9, 10, 11, 12, 13, 14))
// ^ the last bucket holds 6 elements, the others only 2

Floating-point approaches
One way is to generate a fractional (floating-point) offset for each bucket, then convert these to integer Ranges, by zipping. Empty Ranges also need filtering out using collect.
def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  val m = r.length.toDouble
  // bins(x) is the rounded offset from r.start at which chunk x begins
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  val pairs = bins zip (bins.tail)
  // shift the offsets by r.start; b is exclusive, so a chunk ends at r.start + b - 1
  pairs.collect { case (a, b) if b > a => (r.start + a) to (r.start + b - 1) }
}
(The first version of this solution had a rounding problem such that it could not handle Int.MaxValue - this has now been fixed based on Rex Kerr's recursive floating-point solution below)
Another floating-point approach is to recurse down the range, taking the head off the range each time, so we cannot miss any elements. This version can handle Int.MaxValue correctly.
def splitRange(r: Range, chunks: Int): Seq[Range] = {
require(r.step == 1, "Range must have step size equal to 1")
require(chunks >= 1, "Must ask for at least 1 chunk")
val chunkSize = r.length.toDouble / chunks
def go(i: Int, r: Range, delta: Double, acc: List[Range]): List[Range] = {
if (i == chunks) r :: acc
// ensures the last chunk has all remaining values, even if error accumulates
else {
val s = delta + chunkSize
val (chunk, rest) = r.splitAt(s.toInt)
go(i + 1, rest, s - s.toInt, if (chunk.length > 0) chunk :: acc else acc)
}
}
go(1, r, 0.0D, Nil).reverse
}
One can also recurse to generate the (start, end) pairs, rather than zipping them. This is adapted from Rex Kerr's answer to a similar question:
def splitRange(r: Range, chunks: Int): Seq[Range] = {
require(r.step == 1, "Range must have step size equal to 1")
require(chunks >= 1, "Must ask for at least 1 chunk")
val m = r.length
val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
def snip(r: Range, ns: Seq[Int], got: Vector[Range]): Vector[Range] = {
if (ns.length < 2) got
else {
val (i, j) = (ns.head, ns.tail.head)
snip(r.drop(j - i), ns.tail, got :+ r.take(j - i))
}
}
snip(r, bins, Vector.empty).filter(_.length > 0)
}
Integer approach
Finally, I realized that this can be done with purely integer arithmetic by adapting Bresenham's line-drawing algorithm, which solves a basically equivalent problem - how to allocate the x-pixels evenly across the y rows, using only integer operations!
I initially translated the pseudo-code into an imperative solution using var and ArrayBuffer, then converted it into a tail-recursive solution:
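import scala.annotation.tailrec

def splitRange(r: Range, chunks: Int): List[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  val dy = r.length
  val dx = chunks
  // y0 is the top of the bucket being built, y walks down the range, d is the
  // Bresenham error term, and ch counts the buckets still to emit
  @tailrec
  def go(y0: Int, y: Int, d: Int, ch: Int, acc: List[Range]): List[Range] = {
    if (ch == 0) acc
    else if (d > 0) go(y0, y - 1, d - dx, ch, acc)
    else go(y - 1, y, d + dy, ch - 1, if (y > y0) acc else (y to y0) :: acc)
  }
  // r.end is the last element only for inclusive ranges such as `1 to n`
  go(r.end, r.end, dy - dx, chunks, Nil)
}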
See the Wikipedia article on Bresenham's line algorithm for a full explanation, but essentially the algorithm zig-zags up the slope of a line, alternately adding the y-range dy and subtracting the x-range dx. If dx does not divide dy exactly, the remainder accumulates as an error term, and whenever it has built up far enough a sub-range receives one extra pixel (element).
splitRange(3 to 15, 5)
//> List(Range(3, 4), Range(5, 6, 7), Range(8, 9),
//| Range(10, 11, 12), Range(13, 14, 15))
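The constraints from the question can be checked mechanically. The following is a minimal sketch, not part of any of the solutions above; checkSplit is a hypothetical helper name, and it assumes a non-empty input range with step 1.

```scala
// Hypothetical helper: checks a candidate split against the constraints in the question.
def checkSplit(r: Range, chunks: Int, parts: Seq[Range]): Boolean = {
  require(r.nonEmpty && r.step == 1, "expects a non-empty range with step 1")
  val sizes = parts.map(_.length)
  parts.nonEmpty &&
  parts.length <= chunks &&                      // never more buckets than requested
  parts.forall(p => p.nonEmpty && p.step == 1) && // empty buckets must be omitted
  sizes.max - sizes.min <= 1 &&                  // bucket sizes differ by at most 1
  parts.head.head == r.head &&                   // starts where the input starts
  parts.last.last == r.last &&                   // ends where the input ends
  // in order and contiguous: each bucket begins right after the previous one
  parts.zip(parts.tail).forall { case (a, b) => a.last + 1 == b.head }
}

// The Bresenham-style output shown above passes
val evenSplit = checkSplit(3 to 15, 5, List(3 to 4, 5 to 7, 8 to 9, 10 to 12, 13 to 15))
// The colleague's output for 1 to 14 fails on bucket size
val unevenSplit = checkSplit(1 to 14, 5, List(1 to 2, 3 to 4, 5 to 6, 7 to 8, 9 to 14))
```

Here evenSplit is true and unevenSplit is false, because the last bucket in the second call is four elements larger than the others.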

Related

Scala: Problem with foldLeft with negative numbers in list

I am writing a Scala function that returns the sum of even elements in a list, minus sum of odd elements in a list. I cannot use mutables, recursion or for/while loops for my solution. The code below passes 2/3 tests, but I can't seem to figure out why it can't compute the last test correctly.
def sumOfEvenMinusOdd(l: List[Int]) : Int = {
if (l.length == 0) return 0
val evens = l.filter(_%2==0)
val odds = l.filter(_%2==1)
val evenSum = evens.foldLeft(0)(_+_)
val oddSum = odds.foldLeft(0)(_+_)
evenSum-oddSum
}
//BEGIN TESTS
val i1 = sumOfEvenMinusOdd(List(1,3,5,4,5,2,1,0)) //answer: -9
val i2 = sumOfEvenMinusOdd(List(2,4,5,6,7,8,10)) //answer: 18
val i3 = sumOfEvenMinusOdd(List(109, 19, 12, 1, -5, -120, -15, 30,-33,-13, 12, 19, 3, 18, 1, -1)) //answer -133
My code is outputting this:
defined function sumOfEvenMinusOdd
i1: Int = -9
i2: Int = 18
i3: Int = -200
I am extremely confused why these negative numbers are tripping up the rest of my code. I saw a post explaining the order of operations with foldLeft and foldRight, but even changing to foldRight still yields i3: Int = -200. Is there a detail I'm missing? Any guidance / help would be greatly appreciated.
The problem isn't foldLeft or foldRight; it is the way you filter out odd values:
val odds = l.filter(_ % 2 == 1)
Should be:
val odds = l.filter(_ % 2 != 0)
The predicate _ % 2 == 1 only yields true for positive odd elements, because the result of % takes the sign of the left operand. For example, the expression -15 % 2 is equal to -1, not 1.
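A quick way to see the sign behaviour is to evaluate a few remainders directly. This is a small illustrative sketch; isOdd and isOddFloor are hypothetical names, and java.lang.Math.floorMod is shown only as one alternative that always returns a non-negative result for a positive divisor.

```scala
// The result of % takes the sign of the left operand
val r1 = 15 % 2   // 1
val r2 = -15 % 2  // -1, so `_ % 2 == 1` misses negative odd numbers
val r3 = -14 % 2  // 0, even numbers are unaffected

// Two predicates that classify negative odd numbers correctly
def isOdd(n: Int): Boolean = n % 2 != 0
def isOddFloor(n: Int): Boolean = java.lang.Math.floorMod(n, 2) == 1

val odds = List(109, -5, -120, -15, 12).filter(isOdd)  // List(109, -5, -15)
```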
As a side note, we can also make this a bit more efficient by traversing the list only once:
def sumOfEvenMinusOdd(l: List[Int]): Int = {
val (evenSum, oddSum) = l.foldLeft((0, 0)) {
case ((even, odd), element) =>
if (element % 2 == 0) (even + element, odd) else (even, odd + element)
}
evenSum - oddSum
}
Or even better by accumulating the difference only:
def sumOfEvenMinusOdd(l: List[Int]): Int = {
l.foldLeft(0) {
case (diff, element) =>
diff + element * (if (element % 2 == 0) 1 else -1)
}
}
The problem is the filter condition you apply to the list to find the odd numbers.
Your odd condition doesn't work for negative odd numbers, because % 2 returns -1 for them.
number % 2 == 0 if number is even
number % 2 != 0 if number is odd
So if you change the filter conditions, everything works as expected.
Another suggestion:
Why use foldLeft for a simple sum when you can call sum directly?
test("Test sum Of even minus odd") {
def sumOfEvenMinusOdd(l: List[Int]) : Int = {
val evensSum = l.filter(_%2 == 0).sum
val oddsSum = l.filter(_%2 != 0).sum
evensSum-oddsSum
}
assert(sumOfEvenMinusOdd(List.empty[Int]) == 0)
assert(sumOfEvenMinusOdd(List(1,3,5,4,5,2,1,0)) == -9) //answer: -9
assert(sumOfEvenMinusOdd(List(2,4,5,6,7,8,10)) == 18) //answer: 18
assert(sumOfEvenMinusOdd(List(109, 19, 12, 1, -5, -120, -15, 30,-33,-13, 12, 19, 3, 18, 1, -1)) == -133)
}
With this solution your function is clearer, and you can remove the if at the start of the function.

Collatz - maximum number of steps and the corresponding number

I am trying to write a Scala function that takes an upper bound as argument and calculates the steps for the numbers in a range from 1 up to this bound. It has to return the maximum number of steps and the corresponding number that needs that many steps (as a pair: the first element is the number of steps and the second is the corresponding index).
I already have created a function called "collatz" which computes the number of steps. I am very new with Scala and I am a bit stuck because of the limitations. Here's how I thought to start the function:
def max(x:Int):Int = {
for (i<-(1 to x).toList) yield collatz(i)
The way I think to solve this problem is to: 1. iterate through the range and apply collatz to every element, collecting the step counts in a new list; 2. find the maximum of the new list using List.max; 3. use List.indexOf to find the index of that maximum. However, I'm really stuck since I don't know how to do this without using var (only val). Thanks!
Something like this:
def collatzMax(n: Long): (Long, Long) = {
require(n > 0, "Collatz function is not defined for n <= 0")
def collatz(n: Long, steps: Long): Long = n match {
case n if (n <= 1) => steps
case n if (n % 2 == 0) => collatz(n / 2, steps + 1)
case n if (n % 2 == 1) => collatz(3 * n + 1, steps + 1)
}
def loop(n: Long, current: Long, acc: List[(Long, Long)]): List[(Long, Long)] =
if (current > n) acc
else {
loop(n, current + 1, collatz(current, 0) -> current :: acc)
}
loop(n, 1, Nil).sortBy(-_._1).head
}
Example:
collatzMax(12)
result: (Long, Long) = (19,9) // 19 steps for collatz(9)
Using for:
def collatzMax(n: Long) =
(for(i <- 1L to n) yield collatz(i) -> i).sortBy(-_._1).head
Or (continuing your idea):
def maximum(x: Long): (Long, Long) = {
val lst = for (i <- 1L to x) yield collatz(i)
val maxValue = lst.max
(maxValue, lst.indexOf(maxValue) + 1)
}
Try:
(1 to x).map(i => (collatz(i), i)).maxBy(_._1) // (steps, number)
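The shorter snippets above assume a collatz function is already in scope. A self-contained sketch might look like the following; the collatz definition here is an assumed stand-in for the asker's function (it counts steps until 1 is reached), and longestCollatz is a hypothetical name.

```scala
import scala.annotation.tailrec

// Assumed stand-in for the asker's `collatz`: number of steps needed to reach 1
def collatz(n: Long): Long = {
  @tailrec
  def go(m: Long, steps: Long): Long =
    if (m <= 1) steps
    else if (m % 2 == 0) go(m / 2, steps + 1)
    else go(3 * m + 1, steps + 1)
  go(n, 0L)
}

// Returns (steps, number) for the number in 1..bound that needs the most steps
def longestCollatz(bound: Long): (Long, Long) = {
  require(bound >= 1, "bound must be at least 1")
  (1L to bound).map(i => (collatz(i), i)).maxBy(_._1)
}

val best = longestCollatz(12)  // (19, 9)
```

Using maxBy avoids sorting the whole list just to read its head, and no var is needed.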

Complexity estimation for simple recursive algorithm

I wrote some code in Scala, and now I want to estimate its time and memory complexity.
Problem statement
Given a positive integer n, find the least number of perfect square numbers (for example, 1, 4, 9, 16, ...) which sum to n.
For example, given n = 12, return 3 because 12 = 4 + 4 + 4; given n = 13, return 2 because 13 = 4 + 9.
My code
def numSquares(n: Int): Int = {
import java.lang.Math._
def traverse(n: Int, ns: Int): Int = {
val max = ((num: Int) => {
val sq = sqrt(num)
// a perfect square!
if (sq == floor(sq))
num.toInt
else
sq.toInt * sq.toInt
})(n)
if (n == max)
ns + 1
else
traverse(n - max, ns + 1)
}
traverse(n, 0)
}
I use a recursive solution here. IMHO the time complexity is O(n), because I need to traverse the sequence of numbers using recursion. Am I right? Have I missed anything?
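Two observations may help when estimating the cost. First, each call of traverse subtracts the largest square not exceeding n, which leaves a remainder of at most 2·√n, so the number of recursive calls grows far more slowly than n. Second, the greedy choice does not always give the minimum: for n = 12 it picks 9 + 1 + 1 + 1 and returns 4 rather than 3. For comparison, here is a minimal dynamic-programming sketch of the problem statement. It is not the asker's method; numSquaresDp is a hypothetical name, and the approach takes roughly O(n·√n) time and O(n) memory.

```scala
// Hypothetical name. table(k) holds the least number of perfect squares summing to k.
def numSquaresDp(n: Int): Int = {
  require(n >= 1, "n must be positive")
  val table = (1 to n).foldLeft(Vector(0)) { (acc, k) =>
    // all perfect squares 1, 4, 9, ... that do not exceed k
    val squares = Iterator.from(1).map(i => i * i).takeWhile(_ <= k)
    // use one square sq, then the best known answer for the remainder k - sq
    acc :+ squares.map(sq => acc(k - sq) + 1).min
  }
  table(n)
}

val twelve = numSquaresDp(12)    // 3, because 12 = 4 + 4 + 4
val thirteen = numSquaresDp(13)  // 2, because 13 = 4 + 9
```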

How to capture inner matched value in indexWhere vector expression?

Using a Vector[Vector[Int]] reference v, and the expression to find a given number num:
val posX = v.indexWhere(_.indexOf(num) > -1)
Is there any way to capture the value of _.indexOf(num) to use after the expression (i.e. the posY value)? The following signals an error 'Illegal start of simple expression':
val posX = v.indexWhere((val posY = _.indexOf(num)) > -1)
If we do not mind using an extra variable, we can look up the matching inner Vector (the _ in the code below) again afterwards and call indexOf on it to get the y position:
val posX = v.indexWhere(_.indexOf(num) > -1)
val posY = v(posX).indexOf(num)
There are lots of nice functional ways to do this. The following is probably one of the more concise:
val v = Vector(Vector(1, 2, 3), Vector(4, 5, 6), Vector(7, 8, 9))
val num = 4
val Some((posY, posX)) = v.map(_ indexOf num).zipWithIndex.find(_._1 > -1)
// posY: Int = 0
// posX: Int = 1
Note that there's a lot of extra work going on here, though—we're creating a couple of intermediate collections, parts of which we don't need, etc. If you're calling this thing a lot or on very large collections, you unfortunately may need to take a more imperative approach. In that case I'd suggest bundling up all the unpleasantness:
def locationOf(v: Vector[Vector[Int]])(num: Int): Option[(Int, Int)] = {
var i, j = 0
var found = false
while (i < v.size && !found) {
j = 0
while (j < v(i).size && !found)
if (v(i)(j) == num) found = true else j += 1
if (!found) i += 1
}
if (!found) None else Some((i, j))
}
Not as elegant, but this method is probably going to be a lot faster and more memory efficient. It's small enough that it isn't likely to contain any of the bugs that this kind of programming is so prone to, and it's referentially transparent—all the mutation is local.
From my armchair,
scala> val v = Vector(Vector(1, 2, 3), Vector(4, 5, 6), Vector(7, 8, 9))
scala> v.zipWithIndex collectFirst {
| case (e, i) if (e indexOf num) >= 0 =>
| (i, e indexOf num)
| }
res7: Option[(Int, Int)] = Some((1,0))
I haven't done the armchair math, but that's only one intermediate collection, compared to Travis's. But see Travis's comment that the inner index is computed twice here, and the whole point was not to do that.
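One way to avoid computing the inner index twice, while still stopping at the first match, is to map lazily over an iterator. This is a sketch under that assumption; locate and grid are hypothetical names.

```scala
// Hypothetical helper: returns (row, column) of the first match, if any
def locate(rows: Vector[Vector[Int]], num: Int): Option[(Int, Int)] =
  rows.iterator.zipWithIndex
    .map { case (row, i) => (i, row.indexOf(num)) }  // lazy: one indexOf per visited row
    .find { case (_, j) => j >= 0 }                  // stops at the first row that matches

val grid = Vector(Vector(1, 2, 3), Vector(4, 5, 6), Vector(7, 8, 9))
val hit = locate(grid, 4)    // Some((1,0))
val miss = locate(grid, 10)  // None
```

Because the iterator is consumed on demand, rows after the first match are never searched and no intermediate collection is built.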
Here is a solution that will only evaluate up until it finds the required element. I personally find it more readable and you can reuse it across programs. You can obviously make this more general if need be.
val v = Vector(Vector(1, 2, 3), Vector(4, 5, 6))
def findElem(i: Int, vs: Vector[Vector[Int]]): (Int, Int) =
(for {
row <- vs.indices.toStream
col <- vs(row).indices.toStream
if vs(row)(col) == i
} yield (row, col)).head
findElem(5, v) // (1, 1)
You could remove the .toStream methods if you want all positions. Using the .toStream just means that you will only evaluate up until the first occurrence.

help rewriting in functional style

I'm learning Scala as my first functional-ish language. As one of the problems, I was trying to find a functional way of generating the sequence S up to n places. S is defined so that S(1) = 1, and S(x) = the number of times x appears in the sequence. (I can't remember what this is called, but I've seen it in programming books before.)
In practice, the sequence looks like this:
S = 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7 ...
I can generate this sequence pretty easily in Scala using an imperative style like this:
def genSequence(numItems: Int) = {
require(numItems > 0, "numItems must be >= 1")
var list: List[Int] = List(1)
var seq_no = 2
var no = 2
var no_nos = 0
var num_made = 1
while(num_made < numItems) {
if(no_nos < seq_no) {
list = list :+ no
no_nos += 1
num_made += 1
} else if(no % 2 == 0) {
no += 1
no_nos = 0
} else {
no += 1
seq_no += 1
no_nos = 0
}
}
list
}
But I don't really have any idea how to write this without using vars and the while loop.
Thanks!
Pavel's answer has come closest so far, but it's also inefficient. Two flatMaps and a zipWithIndex are overkill here :)
My understanding of the required output:
The results contain all the positive integers (starting from 1) at least once
each number n appears in the output (n/2) + 1 times
As Pavel has rightly noted, the solution is to start with a Stream then use flatMap:
Stream from 1
This generates a Stream, a potentially never-ending sequence that only produces values on demand. In this case, it's generating 1, 2, 3, 4... all the way up to Infinity (in theory) or Integer.MAX_VALUE (in practice)
Streams can be mapped over, as with any other collection. For example: (Stream from 1) map { 2 * _ } generates a Stream of even numbers.
You can also use flatMap on Streams, allowing you to map each input element to zero or more output elements; this is key to solving your problem:
val strm = (Stream from 1) flatMap { n => Stream.fill(n/2 + 1)(n) }
So... How does this work? For the element 3, the lambda { n => Stream.fill(n/2 + 1)(n) } will produce the output stream 3,3. For the first 5 integers you'll get:
1 -> 1
2 -> 2, 2
3 -> 3, 3
4 -> 4, 4, 4
5 -> 5, 5, 5
etc.
and because we're using flatMap, these will be concatenated, yielding:
1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, ...
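On Scala 2.13 or later, where Stream is deprecated in favour of LazyList, the same construction can be sketched as follows. The names lazySeq and first11 are illustrative only.

```scala
// Assumes Scala 2.13+: LazyList is the lazily evaluated replacement for Stream
val lazySeq: LazyList[Int] =
  LazyList.from(1).flatMap(n => LazyList.fill(n / 2 + 1)(n))

// Only the requested prefix is ever evaluated
val first11 = lazySeq.take(11).toList  // List(1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5)
```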
Streams are memoised, so once a given value has been calculated it'll be saved for future reference. However, all the preceding values have to be calculated at least once. If you want the full sequence then this won't cause any problems, but it does mean that generating S(10796) from a cold start is going to be slow! (a problem shared with your imperative algorithm). If you need to do this, then none of the solutions so far is likely to be appropriate for you.
The following code produces exactly the same sequence as yours:
val seq = Stream.from(1)
.flatMap(Stream.fill(2)(_))
.zipWithIndex
.flatMap(p => Stream.fill(p._1)(p._2))
.tail
However, if you want to produce the Golomb sequence (that complies with the definition, but differs from your sample code result), you may use the following:
val seq = 1 #:: a(2)
def a(n: Int): Stream[Int] = (1 + seq(n - seq(seq(n - 2) - 1) - 1)) #:: a(n + 1)
You may check my article for more examples of how to deal with number sequences in functional style.
Here is a translation of your code to a more functional style:
def genSequence(numItems: Int): List[Int] = {
genSequenceR(numItems, 2, 2, 0, 1, List[Int](1))
}
def genSequenceR(numItems: Int, seq_no: Int, no:Int, no_nos: Int, numMade: Int, list: List[Int]): List[Int] = {
if(numMade < numItems){
if(no_nos < seq_no){
genSequenceR(numItems, seq_no, no, no_nos + 1, numMade + 1, list :+ no)
}else if(no % 2 == 0){
genSequenceR(numItems, seq_no, no + 1, 0, numMade, list)
}else{
genSequenceR(numItems, seq_no + 1, no + 1, 0, numMade, list)
}
}else{
list
}
}
genSequenceR is the recursive function that accumulates values in the list and calls itself with new values based on the conditions. Like the while loop, it keeps going while numMade is less than numItems; once numMade reaches numItems it stops and returns the list to genSequence.
This is a fairly rudimentary functional translation of your code. It can be improved and there are better approaches typically used. I'd recommend trying to improve it with pattern matching and then work towards the other solutions that use Stream here.
Here's an attempt from a Scala tyro. Keep in mind I don't really understand Scala, I don't really understand the question, and I don't really understand your algorithm.
def genX_Ys[A](howMany : Int, ofWhat : A) : List[A] = howMany match {
case 1 => List(ofWhat)
case _ => ofWhat :: genX_Ys(howMany - 1, ofWhat)
}
def makeAtLeast(startingWith : List[Int], nextUp : Int, howMany : Int, minimumLength : Int) : List[Int] = {
if (startingWith.size >= minimumLength)
startingWith
else
makeAtLeast(startingWith ++ genX_Ys( howMany, nextUp),
nextUp +1, howMany + (if (nextUp % 2 == 1) 1 else 0), minimumLength)
}
def genSequence(numItems: Int) = makeAtLeast(List(1), 2, 2, numItems).slice(0, numItems)
This seems to work, but re-read the caveats above. In particular, I am sure there is a library function that performs genX_Ys, but I couldn't find it.
EDIT Could be
def genX_Ys[A](howMany : Int, ofWhat : A) : Seq[A] =
(1 to howMany) map { x => ofWhat }
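The library function being looked for is most likely List.fill, which builds a list containing the same element a given number of times. A minimal sketch, reusing the genX_Ys signature from above:

```scala
// List.fill(n)(elem) returns a list of n copies of elem
def genX_Ys[A](howMany: Int, ofWhat: A): List[A] = List.fill(howMany)(ofWhat)

val threes = genX_Ys(3, 7)  // List(7, 7, 7)
```

Unlike the recursive version, this returns an empty list when howMany is 0 or negative.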
Here is a very direct "translation" of the definition of the Golomb sequence:
val it = Iterator.iterate((1,1,Map(1->1,2->2))){ case (n,i,m) =>
val c = m(n)
if (c == 1) (n+1, i+1, m + (i -> n) - n)
else (n, i+1, m + (i -> n) + (n -> (c-1)))
}.map(_._1)
println(it.take(10).toList)
The triple (n,i,m) contains the current number n, the index i and a Map m, which records how often each n must be repeated. When the counter in the Map for our n reaches 1, we increase n (and can drop n from the map, as it is no longer needed); otherwise we just decrease n's counter in the map and keep n. In every case we add the new pair i -> n to the map, which will be used as a counter later (when a subsequent n reaches the value of the current i).
[Edit]
Thinking about it, I realized that I don't need indexes and not even a lookup (because the "counters" are already in the "right" order), which means that I can replace the Map with a Queue:
import collection.immutable.Queue
val it = 1 #:: Iterator.iterate((2, 2, Queue[Int]())){
case (n,1,q) => (n+1, q.head, q.tail :+ (n+1)) // :+ appends to the Queue
case (n,c,q) => (n,c-1,q :+ n)
}.map(_._1).toStream
The Iterator works correctly when starting from 2, so I had to add a 1 at the beginning. The second tuple element is now the counter for the current n (taken from the Queue). The current counter could be kept in the Queue as well, so we have only a pair, but then it's less clear what's going on due to the complicated Queue handling:
val it = 1 #:: Iterator.iterate((2, Queue[Int](2))){
case (n,q) if q.head == 1 => (n+1, q.tail :+ (n+1))
case (n,q) => (n, ((q.head-1) +: q.tail) :+ n)
}.map(_._1).toStream