Percentile calculator - scala

I have been trying to write a small method that calculates a given percentile from a Seq. It almost works, but I don't understand why it fails. I was hoping someone a bit smarter than me could help.
What I want is for it to return the item from the Seq such that p percent of the items are less than or equal to the returned value.
def percentile[Int](p: Int)(seq: Seq[Int]) = {
require(0 <= p && p <= 100) // some value requirements
require(!seq.isEmpty) // more value requirements
val sorted = seq.sorted
val k = math.ceil((seq.length - 1) * (p / 100)).toInt
return sorted(k)
}
So for example if I have
val v = Vector(7, 34, 39, 18, 16, 17, 21, 36, 17, 2, 4, 39, 4, 19, 2, 12, 35, 13, 40, 37)
and I call percentile(11)(v), the return value is 2. However, only 10% of the vector's elements are less than or equal to 2, not the 11% I asked for. percentile(11)(v) should return 4.

Your error is in this line:
val k = math.ceil((seq.length - 1) * (p / 100)).toInt
specifically in p / 100. Since p is an Int between 0 and 100, p / 100 is an integer division and always evaluates to 0, or to 1 when p == 100. To get a floating-point result, make one of the two operands a Double: p / 100.0
val k = math.ceil((seq.length - 1) * (p / 100.0)).toInt
On a side note: you don't need the [Int] type parameter. It declares a new type parameter named Int that shadows the built-in Int type.

The problem is with the part p / 100 in
val k = math.ceil((seq.length - 1) * (p / 100)).toInt
Since p is of type Int and 100 is also an Int, the division is an integer division that returns an Int. If either p or 100 is a Double, the result will be a Double.
The easiest fix is to change that part to p / 100.0.
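Putting both answers together, a corrected version might look like the sketch below. It assumes the asker's rank formula is kept as is and only the division and the type parameter change; the messages passed to require are additions for illustration.

```scala
// Sketch of the corrected method: no [Int] type parameter, floating-point division.
def percentile(p: Int)(seq: Seq[Int]): Int = {
  require(0 <= p && p <= 100, "p must be between 0 and 100")
  require(seq.nonEmpty, "seq must not be empty")
  val sorted = seq.sorted
  // p / 100.0 is a Double, so the fractional rank survives until math.ceil
  val k = math.ceil((seq.length - 1) * (p / 100.0)).toInt
  sorted(k)
}

val v = Vector(7, 34, 39, 18, 16, 17, 21, 36, 17, 2, 4, 39, 4, 19, 2, 12, 35, 13, 40, 37)
println(percentile(11)(v)) // prints 4
```

With the vector from the question, the rank is ceil(19 * 0.11) = 3, and the element at index 3 of the sorted vector is 4.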

Related

How do I iterate a sequence with varying starting positions

Say I have an array:
[10,12,20,50]
I can iterate through this array as normal, which looks at position 0, then 1, 2, and 3.
What if I wanted to start at an arbitrary position in the array and then go through all the numbers in order, wrapping around at the end?
So the other permutations would be:
10,12,20,50
12,20,50,10
20,50,10,12
50,10,12,20
Is there a general function that would allow me to do this type of sliding iteration?
so looking at the index positions from the above it would be:
0,1,2,3
1,2,3,0
2,3,0,1
3,0,1,2
It would be great if some languages have this built in, but I want to know the algorithm to do this also so I understand.
Let's iterate over an array.
val arr = Array(10, 12, 20, 50)
for (i <- 0 to arr.length - 1) {
println(arr(i))
}
With output:
10
12
20
50
Pretty basic.
What about:
val arr = Array(10, 12, 20, 50)
for (i <- 2 to (2 + arr.length - 1)) {
println(arr(i))
}
Oops. Out of bounds. But what if we modulo that index by the length of the array?
val arr = Array(10, 12, 20, 50)
for (i <- 2 to (2 + arr.length - 1)) {
println(arr(i % arr.length))
}
20
50
10
12
Now you just need to wrap it up in a function that replaces 2 in that example with an argument.
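One way to wrap the modulo idea into a function is sketched below. The name rotatedFrom and the choice to reject negative start positions are assumptions made for this example, not something from the original answer.

```scala
// Hypothetical helper: visits every element of xs exactly once, beginning at index
// `start` and wrapping around to the front with the modulo operator.
def rotatedFrom[A](xs: IndexedSeq[A], start: Int): IndexedSeq[A] = {
  require(start >= 0, "start must be non-negative")
  xs.indices.map(i => xs((start + i) % xs.length))
}

val numbers = Vector(10, 12, 20, 50)
numbers.indices.foreach(s => println(rotatedFrom(numbers, s).mkString(",")))
```

A start position larger than the length simply wraps, so rotatedFrom(numbers, 5) gives the same result as rotatedFrom(numbers, 1).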
There is no built-in for this. There is a similar method, permutations, but it generates every permutation rather than only the rotations that preserve order, so it doesn't fit your need.
Your requirement can be implemented with a simple algorithm that concatenates two slices:
def orderedPermutation(in: List[Int]): Seq[List[Int]] = {
for(i <- 0 until in.size) yield
in.slice(i, in.size) ++ in.slice(0, i)
}
orderedPermutation(List(10,12,20,50)).foreach(println)

Scala: Problem with foldLeft with negative numbers in list

I am writing a Scala function that returns the sum of even elements in a list, minus sum of odd elements in a list. I cannot use mutables, recursion or for/while loops for my solution. The code below passes 2/3 tests, but I can't seem to figure out why it can't compute the last test correctly.
def sumOfEvenMinusOdd(l: List[Int]) : Int = {
if (l.length == 0) return 0
val evens = l.filter(_%2==0)
val odds = l.filter(_%2==1)
val evenSum = evens.foldLeft(0)(_+_)
val oddSum = odds.foldLeft(0)(_+_)
evenSum-oddSum
}
//BEGIN TESTS
val i1 = sumOfEvenMinusOdd(List(1,3,5,4,5,2,1,0)) //answer: -9
val i2 = sumOfEvenMinusOdd(List(2,4,5,6,7,8,10)) //answer: 18
val i3 = sumOfEvenMinusOdd(List(109, 19, 12, 1, -5, -120, -15, 30,-33,-13, 12, 19, 3, 18, 1, -1)) //answer -133
My code is outputting this:
defined function sumOfEvenMinusOdd
i1: Int = -9
i2: Int = 18
i3: Int = -200
I am extremely confused about why these negative numbers are tripping up the rest of my code. I saw a post explaining the order of operations with foldLeft and foldRight, but even changing to foldRight still yields i3: Int = -200. Is there a detail I'm missing? Any guidance or help would be greatly appreciated.
The problem isn't foldLeft or foldRight, the problem is the way you filter out odd values:
val odds = l.filter(_ % 2 == 1)
Should be:
val odds = l.filter(_ % 2 != 0)
The predicate _ % 2 == 1 only yields true for positive odd elements. For a negative odd element the remainder is negative: the expression -15 % 2 is equal to -1, not 1.
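The sign behaviour of % is easy to check directly. The sketch below uses a hypothetical helper isOdd and also shows java.lang.Math.floorMod, which always returns a non-negative result for a positive divisor, as one possible alternative.

```scala
// Hypothetical helper: treats any non-zero remainder as odd, so it works for negatives.
def isOdd(n: Int): Boolean = n % 2 != 0

// In Scala (as in Java), the result of % takes the sign of the dividend.
val remainders = List(15, -15, 4, -4).map(_ % 2)
println(remainders) // prints List(1, -1, 0, 0)

// floorMod takes the sign of the divisor instead, so -15 maps to 1.
val floored = java.lang.Math.floorMod(-15, 2)
println(floored) // prints 1
```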
As a side note, we can also make this a bit more efficient by traversing the list only once:
def sumOfEvenMinusOdd(l: List[Int]): Int = {
val (evenSum, oddSum) = l.foldLeft((0, 0)) {
case ((even, odd), element) =>
if (element % 2 == 0) (even + element, odd) else (even, odd + element)
}
evenSum - oddSum
}
Or even better by accumulating the difference only:
def sumOfEvenMinusOdd(l: List[Int]): Int = {
l.foldLeft(0) {
case (diff, element) =>
diff + element * (if (element % 2 == 0) 1 else -1)
}
}
The problem is the filter condition you use to find the odd numbers.
Your odd condition doesn't work for negative odd numbers, because % 2 returns -1 for them.
number % 2 == 0 if number is even
number % 2 != 0 if number is odd
So if you change the filter condition, everything works as expected.
Another suggestion:
Why use foldLeft for a simple sum when you can call sum directly?
test("Test sum Of even minus odd") {
def sumOfEvenMinusOdd(l: List[Int]) : Int = {
val evensSum = l.filter(_%2 == 0).sum
val oddsSum = l.filter(_%2 != 0).sum
evensSum-oddsSum
}
assert(sumOfEvenMinusOdd(List.empty[Int]) == 0)
assert(sumOfEvenMinusOdd(List(1,3,5,4,5,2,1,0)) == -9) //answer: -9
assert(sumOfEvenMinusOdd(List(2,4,5,6,7,8,10)) == 18) //answer: 18
assert(sumOfEvenMinusOdd(List(109, 19, 12, 1, -5, -120, -15, 30,-33,-13, 12, 19, 3, 18, 1, -1)) == -133)
}
With this solution your function is clearer, and you can remove the if at the start of the function.

Splitting a Scala Range into evenly-sized contiguous sub-Ranges

If I have a Range, how can I split it into a sequence of contiguous sub-ranges, where the number of sub-ranges (buckets) is specified? Empty buckets should be omitted if there are not enough items.
For example:
splitRange(1 to 6, 3) == Seq(Range(1,2), Range(3,4), Range(5,6))
splitRange(1 to 2, 3) == Seq(Range(1), Range(2))
Some additional constraints, that rule out some of the solutions I've seen:
Roughly even bucket size - the bucket size should vary by 1, at most
The length of the input range may sometimes be very large, so the ranges should not be materialized into sequences (e.g. can't use grouped)
This also implies that we don't allocate numbers to buckets in round-robin fashion, because then numbers in each bucket wouldn't be contiguous and so wouldn't form a Range
Ideally, the sub-ranges would be produced in order, i.e (1,2)(3,4), not (3,4)(1,2)
A colleague found a solution here:
def splitRange(r: Range, chunks: Int): Seq[Range] = {
if (r.step != 1)
throw new IllegalArgumentException("Range must have step size equal to 1")
val nchunks = scala.math.max(chunks, 1)
val chunkSize = scala.math.max(r.length / nchunks, 1)
val starts = r.by(chunkSize).take(nchunks)
val ends = starts.map(_ - 1).drop(1) :+ r.end
starts.zip(ends).map(x => x._1 to x._2)
}
but this can produce very uneven bucket sizes when N is small, e.g:
splitRange(1 to 14, 5)
//> Vector(Range(1, 2), Range(3, 4), Range(5, 6),
//| Range(7, 8), Range(9, 10, 11, 12, 13, 14))
^^^^^^^^^^^^^^^^^^^^^
Floating-point approaches
One way is to generate a fractional (floating-point) offset for each bucket, then convert these to integer Ranges, by zipping. Empty Ranges also need filtering out using collect.
def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  val m = r.length.toDouble
  // Rounded fractional offsets into r, from 0 up to r.length
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  val pairs = bins zip (bins.tail)
  // bins hold positions, not values, so cut r by position; drop and take keep it a Range
  pairs.collect { case (a, b) if b > a => r.drop(a).take(b - a) }
}
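To see how the rounded offsets spread the remainder across buckets, the sketch below computes only the offsets and the resulting bucket sizes. The helper names binOffsets and bucketSizes are made up for this illustration; the rounding expression is the one used above.

```scala
// Hypothetical helper: rounded fractional offsets for `chunks` buckets over `length` items.
def binOffsets(length: Int, chunks: Int): IndexedSeq[Int] = {
  require(length >= 0, "length must be non-negative")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  (0 to chunks).map(x => math.round(x.toDouble * length / chunks).toInt)
}

// Hypothetical helper: sizes of the non-empty buckets between consecutive offsets.
def bucketSizes(length: Int, chunks: Int): IndexedSeq[Int] = {
  val offsets = binOffsets(length, chunks)
  offsets.zip(offsets.tail).map { case (a, b) => b - a }.filter(_ > 0)
}

println(binOffsets(14, 5))  // prints Vector(0, 3, 6, 8, 11, 14)
println(bucketSizes(14, 5)) // prints Vector(3, 3, 2, 3, 3)
```

For the 14-item, 5-bucket case that the earlier solution split as 2, 2, 2, 2, 6, the sizes now differ by at most 1.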
(The first version of this solution had a rounding problem such that it could not handle Int.MaxValue - this has now been fixed based on Rex Kerr's recursive floating-point solution below)
Another floating-point approach is to recurse down the range, taking the head off the range each time, so we cannot miss any elements. This version can handle Int.MaxValue correctly.
def splitRange(r: Range, chunks: Int): Seq[Range] = {
require(r.step == 1, "Range must have step size equal to 1")
require(chunks >= 1, "Must ask for at least 1 chunk")
val chunkSize = r.length.toDouble / chunks
def go(i: Int, r: Range, delta: Double, acc: List[Range]): List[Range] = {
if (i == chunks) r :: acc
// ensures the last chunk has all remaining values, even if error accumulates
else {
val s = delta + chunkSize
val (chunk, rest) = r.splitAt(s.toInt)
go(i + 1, rest, s - s.toInt, if (chunk.length > 0) chunk :: acc else acc)
}
}
go(1, r, 0.0D, Nil).reverse
}
One can also recurse to generate the (start,end) pairs, rather than zipping them. This is adapted from Rex Kerr's answer to a similar question
def splitRange(r: Range, chunks: Int): Seq[Range] = {
require(r.step == 1, "Range must have step size equal to 1")
require(chunks >= 1, "Must ask for at least 1 chunk")
val m = r.length
val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
def snip(r: Range, ns: Seq[Int], got: Vector[Range]): Vector[Range] = {
if (ns.length < 2) got
else {
val (i, j) = (ns.head, ns.tail.head)
snip(r.drop(j - i), ns.tail, got :+ r.take(j - i))
}
}
snip(r, bins, Vector.empty).filter(_.length > 0)
}
Integer approach
Finally, I realized that this can be done with purely integer arithmetic by adapting Bresenham's line-drawing algorithm, which solves a basically equivalent problem - how to allocate the x-pixels evenly across the y rows, using only integer operations!
I initially translated the pseudo-code into an imperative solution using var and ArrayBuffer, then converted it into a tail-recursive solution:
def splitRange(r: Range, chunks: Int): List[Range] = {
  import scala.annotation.tailrec
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  val dy = r.length
  val dx = chunks
  // Walks down from r.end; y0 is the top of the current chunk, d the accumulated error
  @tailrec
  def go(y0: Int, y: Int, d: Int, ch: Int, acc: List[Range]): List[Range] = {
    if (ch == 0) acc
    else {
      if (d > 0) go(y0, y - 1, d - dx, ch, acc)
      else go(y - 1, y, d + dy, ch - 1,
        if (y > y0) acc // empty chunk, omit it
        else (y to y0) :: acc)
    }
  }
  go(r.end, r.end, dy - dx, chunks, Nil)
}
Please see the Wikipedia link for a full explanation, but essentially the algorithm zig-zags up the slope of a line, alternately adding the y-range dy and subtracting the x-range dx. If these don't divide exactly, an error accumulates until it is large enough to be absorbed, leading to an extra pixel in some sub-ranges.
splitRange(3 to 15, 5)
//> List(Range(3, 4), Range(5, 6, 7), Range(8, 9),
//| Range(10, 11, 12), Range(13, 14, 15))

Create Range with inclusive end value when stepping

Is there any way to create a range which includes the end value when using a step which doesn't align?
For instance the following yields:
scala> Range.inclusive(0, 35, 10)
res3: scala.collection.immutable.Range.Inclusive = Range(0, 10, 20, 30)
But I would also like the end value (35) included like so:
scala> Range.inclusive(0, 35, 10)
res3: scala.collection.immutable.Range.Inclusive = Range(0, 10, 20, 30, 35)
As mentioned, this is not standard semantics. A workaround:
for (i <- 0 to 35 by 10) yield if (35 % 10 != 0 && 35 - i < 10) 35 else i
where you must replace the boundary and step values as needed.
No, not with the current definition/implementation. It would be strange behaviour to have the same step for all intermediate elements but a different one for the last.
The workaround above does not work because it omits the value 30. Here is an unfold-style solution that produces a List rather than a Range.
def unfoldRange(i: Int, j: Int, s: Int): List[Int] = {
if (i >= j) List(j)
else i :: unfoldRange(i+s,j,s)
}
I think you can tackle this by extending Range with the Pimp my Library pattern as well.
object Extensions {
  // An implicit class avoids the structural type (and reflective call) of `new { ... }`
  implicit class RichRange(value: Range) {
    def withEnd: IndexedSeq[Int] =
      if (value.last != value.end) value :+ value.end
      else value
  }
}
although you get an IndexedSeq[Int] rather than a Range. Use it like:
import Extensions._
(0 to 5 by 2).withEnd // produces 0, 2, 4, 5
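If an extension method is more than you need, a plain function can do the same job. This is a minimal sketch, assuming a positive step and start <= end; the name rangeWithEnd is hypothetical. It builds the exclusive range and then appends the end value, so the end appears exactly once whether or not the step aligns.

```scala
// Hypothetical helper: steps from start towards end, then always appends end itself.
def rangeWithEnd(start: Int, end: Int, step: Int): IndexedSeq[Int] = {
  require(step > 0, "step must be positive")
  require(start <= end, "start must not exceed end")
  // `until` excludes end, so appending it never produces a duplicate
  (start until end by step) :+ end
}

println(rangeWithEnd(0, 35, 10).mkString(", ")) // prints 0, 10, 20, 30, 35
```

As with the other answers, the result is an IndexedSeq[Int] rather than a Range, so it is materialized in memory.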

Complexity estimation for simple recursive algorithm

I wrote some code in Scala, and now I want to estimate its time and memory complexity.
Problem statement
Given a positive integer n, find the least number of perfect square numbers (for example, 1, 4, 9, 16, ...) which sum to n.
For example, given n = 12, return 3 because 12 = 4 + 4 + 4; given n = 13, return 2 because 13 = 4 + 9.
My code
def numSquares(n: Int): Int = {
import java.lang.Math._
def traverse(n: Int, ns: Int): Int = {
val max = ((num: Int) => {
val sq = sqrt(num)
// a perfect square!
if (sq == floor(sq))
num.toInt
else
sq.toInt * sq.toInt
})(n)
if (n == max)
ns + 1
else
traverse(n - max, ns + 1)
}
traverse(n, 0)
}
I use a recursive solution here. So IMHO the time complexity is O(n), because I need to traverse the sequence of numbers using recursion. Am I right? Have I missed anything?
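One way to probe the estimate empirically is to count the recursive calls for a few inputs. The sketch below is a hypothetical re-statement of the greedy step in traverse (always subtract the largest square that fits), written so that it returns the number of calls; in the code above that count is the same as the returned value, since ns goes up by one per call.

```scala
import scala.annotation.tailrec

// Hypothetical helper mirroring traverse: returns how many calls the greedy descent makes.
def greedyCalls(n: Int): Int = {
  require(n > 0, "n must be positive")
  @tailrec
  def loop(remaining: Int, calls: Int): Int = {
    val root = math.sqrt(remaining.toDouble).toInt
    val largestSquare = root * root // largest perfect square <= remaining
    if (remaining == largestSquare) calls + 1
    else loop(remaining - largestSquare, calls + 1)
  }
  loop(n, 0)
}

List(12, 13, 1000000).foreach(n => println(s"$n -> ${greedyCalls(n)}"))
```

For n = 13 this gives 2, but for n = 12 the greedy descent takes 9 + 1 + 1 + 1 and gives 4, not the 3 from the problem statement, so the counts measure the cost of this greedy code rather than of a correct solution.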