"Find the Parity Outlier Code Wars (Scala)" - scala

I am doing some of CodeWars challenges recently and I've got a problem with this one.
"You are given an array (which will have a length of at least 3, but could be very large) containing integers. The array is either entirely comprised of odd integers or entirely comprised of even integers except for a single integer N. Write a method that takes the array as an argument and returns this "outlier" N."
I've looked at some solutions, that are already on our website, but I want to solve the problem using my own approach.
The main problem in my code, seems to be that it ignores negative numbers even though I've implemented Math.abs() method in scala.
If you have an idea how to get around it, that is more than welcome.
Thanks a lot
object Parity {
var even = 0
var odd = 0
var result = 0
def findOutlier(integers: List[Int]): Int = {
for (y <- 0 until integers.length) {
if (Math.abs(integers(y)) % 2 == 0)
even += 1
else
odd += 1
}
if (even == 1) {
for (y <- 0 until integers.length) {
if (Math.abs(integers(y)) % 2 == 0)
result = integers(y)
}
} else {
for (y <- 0 until integers.length) {
if (Math.abs(integers(y)) % 2 != 0)
result = integers(y)
}
}
result
}

Your code handles negative numbers just fine. The problem is that you rely on mutable sate, which leaks between runs of your code. Your code behaves as follows:
val l = List(1,3,5,6,7)
println(Parity.findOutlier(l)) //6
println(Parity.findOutlier(l)) //7
println(Parity.findOutlier(l)) //7
The first run is correct. However, when you run it the second time, even, odd, and result all have the values from your previous run still in them. If you define them inside of your findOutlier method instead of in the Parity object, then your code gives correct results.
Additionally, I highly recommend reading over the methods available to a Scala List. You should almost never need to loop through a List like that, and there are a number of much more concise solutions to the problem. Mutable var's are also a pretty big red flag in Scala code, as are excessive if statements.

Related

Difference between size and sizeIs

What is the semantic difference between size and sizeIs? For example,
List(1,2,3).sizeIs > 1 // true
List(1,2,3).size > 1 // true
Luis mentions in a comment that
...on 2.13+ one can use sizeIs > 1 which will be more efficient than
size > 1 as the first one does not compute all the size before
returning
Add size comparison methods to IterableOps #6950 seems to be the pull request that introduced it.
Reading the scaladoc
Returns a value class containing operations for comparing the size of
this $coll to a test value. These operations are implemented in terms
of sizeCompare(Int)
it is not clear to me why is sizeIs more efficient than regular size?
As far as I understand the changes.
The idea is that for collections that do not have a O(1) (constant) size. Then, sizeIs can be more efficient, specially for comparisons with small values (like 1 in the comment).
But why?
Simple, because instead of computing all the size and then doing the comparison, sizeIs returns an object which when computing the comparison, can return early.
For example, lets check the code
def sizeCompare(otherSize: Int): Int = {
if (otherSize < 0) 1
else {
val known = knownSize
if (known >= 0) Integer.compare(known, otherSize)
else {
var i = 0
val it = iterator
while (it.hasNext) {
if (i == otherSize) return if (it.hasNext) 1 else 0 // HERE!!! - return as fast as possible.
it.next()
i += 1
}
i - otherSize
}
}
}
Thus, in the example of the comment, suppose a very very very long List of three elements. sizeIs > 1 will return as soon as it knows that the List has at least one element and hasMore. Thus, saving the cost of traversing the other two elements to compute a size of 3 and then doing the comparison.
Note that: If the size of the collection is greater than the comparing value, then the performance would be roughly the same (maybe slower than just size due the extra comparisons on each cycle). Thus, I would only recommend this for comparisons with small values, or when you believe the values will be smaller than the collection.

Scala - conditionally sum elements in list

I am trying to solve a beginner problem with lists but can't find an example to help me get it work. I am given a list of positive and negative integers (AccountHistory) and I need to check if the negative integers in this list have ever exceeded -1000. I expected my code to work with a freshly introduced helper function like this:
def checkAccount(account: AccountHistory): Boolean = {
def helper(i: AccountHistory): Int = {
var total = 0
i.collect{case x if x < 0 => Math.abs(x) + total}
return total
}
if (helper(account) >1000) true else false
}
But it doesn't work. Please help me find my mistake or problem in wrong approach.
Edit: The pre-given tests include
assert(checkAccount(List(10,-5,20)))
assert(!checkAccount(List(-1000,-1)))
So if assert expects true then my approach is wrong to solve it like this.
By 'exceeded' I mean <-1000, for any or all elements in a list (like exceeding a credit amount in given period).
i.collect{case x if x < 0 => Math.abs(x) + total}
In the above code snippet, not assign back to total, maybe you need:
val total = i.filter(_ < 0).map(Math.abs).sum
I think this is what you're supposed to do:
def checkAccount(account: AccountHistory): Boolean =
account.forall(_ > -1000)

Merge Sort algorithm efficiency

I am currently taking an online algorithms course in which the teacher doesn't give code to solve the algorithm, but rather rough pseudo code. So before taking to the internet for the answer, I decided to take a stab at it myself.
In this case, the algorithm that we were looking at is merge sort algorithm. After being given the pseudo code we also dove into analyzing the algorithm for run times against n number of items in an array. After a quick analysis, the teacher arrived at 6nlog(base2)(n) + 6n as an approximate run time for the algorithm.
The pseudo code given was for the merge portion of the algorithm only and was given as follows:
C = output [length = n]
A = 1st sorted array [n/2]
B = 2nd sorted array [n/2]
i = 1
j = 1
for k = 1 to n
if A(i) < B(j)
C(k) = A(i)
i++
else [B(j) < A(i)]
C(k) = B(j)
j++
end
end
He basically did a breakdown of the above taking 4n+2 (2 for the declarations i and j, and 4 for the number of operations performed -- the for, if, array position assignment, and iteration). He simplified this, I believe for the sake of the class, to 6n.
This all makes sense to me, my question arises from the implementation that I am performing and how it effects the algorithms and some of the tradeoffs/inefficiencies it may add.
Below is my code in swift using a playground:
func mergeSort<T:Comparable>(_ array:[T]) -> [T] {
guard array.count > 1 else { return array }
let lowerHalfArray = array[0..<array.count / 2]
let upperHalfArray = array[array.count / 2..<array.count]
let lowerSortedArray = mergeSort(array: Array(lowerHalfArray))
let upperSortedArray = mergeSort(array: Array(upperHalfArray))
return merge(lhs:lowerSortedArray, rhs:upperSortedArray)
}
func merge<T:Comparable>(lhs:[T], rhs:[T]) -> [T] {
guard lhs.count > 0 else { return rhs }
guard rhs.count > 0 else { return lhs }
var i = 0
var j = 0
var mergedArray = [T]()
let loopCount = (lhs.count + rhs.count)
for _ in 0..<loopCount {
if j == rhs.count || (i < lhs.count && lhs[i] < rhs[j]) {
mergedArray.append(lhs[i])
i += 1
} else {
mergedArray.append(rhs[j])
j += 1
}
}
return mergedArray
}
let values = [5,4,8,7,6,3,1,2,9]
let sortedValues = mergeSort(values)
My questions for this are as follows:
Do the guard statements at the start of the merge<T:Comparable> function actually make it more inefficient? Considering we are always halving the array, the only time that it will hold true is for the base case and when there is an odd number of items in the array.
This to me seems like it would actually add more processing and give minimal return since the time that it happens is when we have halved the array to the point where one has no items.
Concerning my if statement in the merge. Since it is checking more than one condition, does this effect the overall efficiency of the algorithm that I have written? If so, the effects to me seems like they vary based on when it would break out of the if statement (e.g at the first condition or the second).
Is this something that is considered heavily when analyzing algorithms, and if so how do you account for the variance when it breaks out from the algorithm?
Any other analysis/tips you can give me on what I have written would be greatly appreciated.
You will very soon learn about Big-O and Big-Theta where you don't care about exact runtimes (believe me when I say very soon, like in a lecture or two). Until then, this is what you need to know:
Yes, the guards take some time, but it is the same amount of time in every iteration. So if each iteration takes X amount of time without the guard and you do n function calls, then it takes X*n amount of time in total. Now add in the guards who take Y amount of time in each call. You now need (X+Y)*n time in total. This is a constant factor, and when n becomes very large the (X+Y) factor becomes negligible compared to the n factor. That is, if you can reduce a function X*n to (X+Y)*(log n) then it is worthwhile to add the Y amount of work because you do fewer iterations in total.
The same reasoning applies to your second question. Yes, checking "if X or Y" takes more time than checking "if X" but it is a constant factor. The extra time does not vary with the size of n.
In some languages you only check the second condition if the first fails. How do we account for that? The simplest solution is to realize that the upper bound of the number of comparisons will be 3, while the number of iterations can be potentially millions with a large n. But 3 is a constant number, so it adds at most a constant amount of work per iteration. You can go into nitty-gritty details and try to reason about the distribution of how often the first, second and third condition will be true or false, but often you don't really want to go down that road. Pretend that you always do all the comparisons.
So yes, adding the guards might be bad for your runtime if you do the same number of iterations as before. But sometimes adding extra work in each iteration can decrease the number of iterations needed.

For loop in scala without sequence?

So, while working my way through "Scala for the Impatient" I found myself wondering: Can you use a Scala for loop without a sequence?
For example, there is an exercise in the book that asks you to build a counter object that cannot be incremented past Integer.MAX_VALUE. In order to test my solution, I wrote the following code:
var c = new Counter
for( i <- 0 to Integer.MAX_VALUE ) c.increment()
This throws an error: sequences cannot contain more than Int.MaxValue elements.
It seems to me that means that Scala is first allocating and populating a sequence object, with the values 0 through Integer.MaxValue, and then doing a foreach loop on that sequence object.
I realize that I could do this instead:
var c = new Counter
while(c.value < Integer.MAX_VALUE ) c.increment()
But is there any way to do a traditional C-style for loop with the for statement?
In fact, 0 to N does not actually populate anything with integers from 0 to N. It instead creates an instance of scala.collection.immutable.Range, which applies its methods to all the integers generated on the fly.
The error you ran into is only because you have to be able to fit the number of elements (whether they actually exist or not) into the positive part of an Int in order to maintain the contract for the length method. 1 to Int.MaxValue works fine, as does 0 until Int.MaxValue. And the latter is what your while loop is doing anyway (to includes the right endpoint, until omits it).
Anyway, since the Scala for is a very different (much more generic) creature than the C for, the short answer is no, you can't do exactly the same thing. But you can probably do what you want with for (though maybe not as fast as you want, since there is some performance penalty).
Wow, some nice technical answers for a simple question (which is good!) But in case anyone is just looking for a simple answer:
//start from 0, stop at 9 inclusive
for (i <- 0 until 10){
println("Hi " + i)
}
//or start from 0, stop at 9 inclusive
for (i <- 0 to 9){
println("Hi " + i)
}
As Rex pointed out, "to" includes the right endpoint, "until" omits it.
Yes and no, it depends what you are asking for. If you're asking whether you can iterate over a sequence of integers without having to build that sequence first, then yes you can, for instance using streams:
def fromTo(from : Int, to : Int) : Stream[Int] =
if(from > to) {
Stream.empty
} else {
// println("one more.") // uncomment to see when it is called
Stream.cons(from, fromTo(from + 1, to))
}
Then:
for(i <- fromTo(0, 5)) println(i)
Writing your own iterator by defining hasNext and next is another option.
If you're asking whether you can use the 'for' syntax to write a "native" loop, i.e. a loop that works by incrementing some native integer rather than iterating over values produced by an instance of an object, then the answer is, as far as I know, no. As you may know, 'for' comprehensions are syntactic sugar for a combination of calls to flatMap, filter, map and/or foreach (all defined in the FilterMonadic trait), depending on the nesting of generators and their types. You can try to compile some loop and print its compiler intermediate representation with
scalac -Xprint:refchecks
to see how they are expanded.
There's a bunch of these out there, but I can't be bothered googling them at the moment. The following is pretty canonical:
#scala.annotation.tailrec
def loop(from: Int, until: Int)(f: Int => Unit): Unit = {
if (from < until) {
f(from)
loop(from + 1, until)(f)
}
}
loop(0, 10) { i =>
println("Hi " + i)
}

Scala vals vs vars

I'm pretty new to Scala but I like to know what is the preferred way of solving this problem. Say I have a list of items and I want to know the total amount of the items that are checks. I could do something like so:
val total = items.filter(_.itemType == CHECK).map(._amount).sum
That would give me what I need, the sum of all checks in a immutable variable. But it does it with what seems like 3 iterations. Once to filter the checks, again to map the amounts and then the sum. Another way would be to do something like:
var total = new BigDecimal(0)
for (
item <- items
if item.itemType == CHECK
) total += item.amount
This gives me the same result but with 1 iteration and a mutable variable which seems fine too. But if I wanted to to extract more information, say the total number of checks, that would require more counters or mutable variables but I wouldn't have to iterate over the list again. Doesn't seem like the "functional" way of achieving what I need.
var numOfChecks = 0
var total = new BigDecimal(0)
items.foreach { item =>
if (item.itemType == CHECK) {
numOfChecks += 1
total += item.amount
}
}
So if you find yourself needing a bunch of counters or totals on a list is it preferred to keep mutable variables or not worry about it do something along the lines of:
val checks = items.filter(_.itemType == CHECK)
val total = checks.map(_.amount).sum
return (checks.size, total)
which seems easier to read and only uses vals
Another way of solving your problem in one iteration would be to use views or iterators:
items.iterator.filter(_.itemType == CHECK).map(._amount).sum
or
items.view.filter(_.itemType == CHECK).map(._amount).sum
This way the evaluation of the expression is delayed until the call of sum.
If your items are case classes you could also write it like this:
items.iterator collect { case Item(amount, CHECK) => amount } sum
I find that speaking of doing "three iterations" is a bit misleading -- after all, each iteration does less work than a single iteration with everything. So it doesn't automatically follows that iterating three times will take longer than iterating once.
Creating temporary objects, now that is a concern, because you'll be hitting memory (even if cached), which isn't the case of the single iteration. In those cases, view will help, even though it adds more method calls to do the same work. Hopefully, JVM will optimize that away. See Moritz's answer for more information on views.
You may use foldLeft for that:
(0 /: items) ((total, item) =>
if(item.itemType == CHECK)
total + item.amount
else
total
)
The following code will return a tuple (number of checks -> sum of amounts):
((0, 0) /: items) ((total, item) =>
if(item.itemType == CHECK)
(total._1 + 1, total._2 + item.amount)
else
total
)