Sort and Remove performance in Scala

I'm writing a program that determines whether an Array would be in strictly increasing order if exactly one element were removed. It works, but it does not pass the time limit set by codefights. (To be clear, it runs instantly on my local machine, but on their servers it fails the 30-second time limit with 5000 numbers in the array.)
Which operation is the most expensive? The sort runs only once; the operations that run on every iteration are patching the array, removing by value (implemented with array diff), and distinct. Thanks.
def remove(num: Int, array: Array[Int]): Array[Int] = array diff Array(num)
def almostIncreasingSequence(sequence: Array[Int]): Boolean = {
var i = 0
val sortedSequence = sequence.sortWith(_ < _)
while (i < sequence.length) {
val patchedSequence = sequence.patch(i, Nil, 1)
if(patchedSequence.sameElements(remove(sequence(i), sortedSequence).distinct)) return true
i += 1
}
return false
}

So the trick with these things is usually avoid sorting (if possible) and traverse only once. Maybe something like this.
def almostIncreasingSequence(sequence: Array[Int]): Boolean = {
sequence.indices.tail.filter(x => sequence(x-1) >= sequence(x)) match {
case Seq(x) => x == 1 || //remove sequence.head
sequence(x-2) < sequence(x) || //remove sequence(x-1)
!sequence.isDefinedAt(x+1) || //remove sequence.last
sequence(x-1) < sequence(x+1) //remove sequence(x)
case Seq() => true //no dips in sequence
case _ => false //too many dips in sequence
}
}
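One way to gain confidence in a single-pass check like the one above is to compare it against a slow but obvious brute-force version on a few inputs. The sketch below repeats the answer's function and adds two hypothetical helpers (strictlyIncreasing and bruteForce are names invented here, not part of the original answer). It assumes the input has at least two elements, as the challenge does.

```scala
// Single-pass check, reproduced from the answer above.
def almostIncreasingSequence(sequence: Array[Int]): Boolean =
  sequence.indices.tail.filter(x => sequence(x - 1) >= sequence(x)) match {
    case Seq(x) =>
      x == 1 ||                              // remove sequence.head
        sequence(x - 2) < sequence(x) ||     // remove sequence(x-1)
        !sequence.isDefinedAt(x + 1) ||      // remove sequence.last
        sequence(x - 1) < sequence(x + 1)    // remove sequence(x)
    case Seq() => true                       // no dips in sequence
    case _     => false                      // too many dips in sequence
  }

// Hypothetical helper: true when every element is smaller than its successor.
def strictlyIncreasing(xs: Seq[Int]): Boolean =
  xs.zip(xs.drop(1)).forall { case (a, b) => a < b }

// Hypothetical reference implementation: try removing each index in turn.
// Quadratic, so only suitable for small test inputs.
def bruteForce(sequence: Array[Int]): Boolean =
  sequence.indices.exists(i => strictlyIncreasing(sequence.toVector.patch(i, Nil, 1)))

val samples = List(Array(1, 3, 2, 1), Array(1, 3, 2), Array(10, 1, 2, 3, 4, 5), Array(1, 2, 1, 2), Array(1, 1))
val allAgree = samples.forall(a => almostIncreasingSequence(a) == bruteForce(a))
```

If allAgree is false for some input, the brute-force result is the one to trust, since it follows the problem statement directly.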

Related

How to speed up this Scala solution for an easy Leetcode problem?

I solved easy Leetcode problem Ransom Note in Scala like this :
def canConstruct(ransomNote: String, magazine: String): Boolean = {
val magazineChars = magazine.toSeq.groupBy(identity).mapValues(_.size)
val ransomChars = ransomNote.toSeq.groupBy(identity).mapValues(_.size)
ransomChars.forall { case (c, num) => magazineChars.getOrElse(c, 0) >= num }
}
This solution is OK but slower than other accepted solutions in Scala.
Now I wonder how to speed it up. How would you suggest optimizing this solution?
For performance, you should use low-level data structures (primitive types instead of object types, arrays of primitives instead of List or Map, etc.) and low-level syntax (while loops instead of foreach, etc.).
Here is my solution, which beats 90% to 100% of submissions (the figure varies between runs). You can speed it up further by replacing foreach and forall with while loops, but that is tedious:
A slightly optimized version of the above solution:
def canConstruct(ransomNote: String, magazine: String): Boolean = {
if (magazine.length < ransomNote.length) {
false // if the magazine has fewer letters than the ransom note, then we definitely can't make the note
} else {
var i = 0
val counts = Array.ofDim[Int](26)
while (i < magazine.length) {
counts(magazine(i) - 'a') += 1
if (i < ransomNote.length) counts(ransomNote(i) - 'a') -= 1 // avoid the need for another loop for the ransom note letters
i += 1
}
var c = 0;
while (c < counts.length) {
if (counts(c) < 0) {
return false
}
c += 1
}
true
}
}
with the following results (after a few runs):

Understanding performance of a tailrec annotated recursive method in scala

Consider the following method, which has been verified to be properly tail recursive:
@tailrec
def getBoundaries(grps: Seq[(BigDecimal, Int)], groupSize: Int, curSum: Int = 0, curOffs: Seq[BigDecimal] = Seq.empty[BigDecimal]): Seq[BigDecimal] = {
if (grps.isEmpty) curOffs
else {
val (id, cnt) = grps.head
val newSum = curSum + cnt.toInt
if (newSum%50==0) { println(s"id=$id newsum=$newSum") }
if (newSum >= groupSize) {
getBoundaries(grps.tail, groupSize, 0, curOffs :+ id) // r1
} else {
getBoundaries(grps.tail, groupSize, newSum, curOffs) // r2
}
}
}
This is running very slowly: about 75 loops per second. When I sample the stack trace (a nice feature of IntelliJ), the line being executed is almost always the second tail-recursive call, r2. That makes me suspicious of the claim that "Scala unwraps the recursive calls into a while loop". If the unwrapping were occurring, why are we seeing so much time in the invocations themselves?
Beyond having a properly structured tail-recursive method, are there other considerations for getting a recursive routine to perform close to direct iteration?
The performance will depend on the underlying type of the Seq.
If it is a List, the problem is appending (:+) to the List. This gets very slow with long lists because each append has to scan the whole list to find the end.
One solution is to prepend to the list (+:) each time and then reverse at the end. This can give very significant performance improvements, because adding to the start of a list is very quick.
Other Seq types will have different performance characteristics, but you can convert to a List before the recursive call so that you know how it is going to perform.
Here is sample code
def getBoundaries(grps: Seq[(BigDecimal, Int)], groupSize: Int): Seq[BigDecimal] = {
@tailrec
def loop(grps: List[(BigDecimal, Int)], curSum: Int, curOffs: List[BigDecimal]): List[BigDecimal] =
if (grps.isEmpty) curOffs
else {
val (id, cnt) = grps.head
val newSum = curSum + cnt.toInt
if (newSum >= groupSize) {
loop(grps.tail, 0, id +: curOffs) // r1
} else {
loop(grps.tail, newSum, curOffs) // r2
}
}
loop(grps.toList, 0, Nil).reverse
}
This version gives 10x performance improvement over the original code using the test data provided by the questioner in his own answer to the question.
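The prepend-then-reverse idea can be seen in isolation with a small sketch. The names ids, appended and prepended are made up for illustration; both folds build the same List, but the first re-walks the accumulator on every step while the second does constant work per element plus one final reverse.

```scala
val ids = (1 to 1000).toList

// Appending: each :+ copies the accumulated List, so the whole loop is O(n^2).
val appended = ids.foldLeft(List.empty[Int])((acc, x) => acc :+ x)

// Prepending: each +: is O(1); a single reverse at the end restores the order.
val prepended = ids.foldLeft(List.empty[Int])((acc, x) => x +: acc).reverse
```

On a list this small the difference is hard to notice; it tends to show up once the accumulator reaches tens of thousands of elements.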
The issue is not in the recursion but in the array manipulation. With the following test case it runs at about 200K recursions per second:
type Fgroups = Seq[(BigDecimal, Int)]
test("testGetBoundaries") {
val N = 200000
val grps: Fgroups = (N to 1 by -1).flatMap { x => Array.tabulate(x % 20){ x2 => (BigDecimal(x2 * 1e9), 1) }}
val sgrps = grps.sortWith { case (a, b) =>
a._1.longValue.compare(b._1.longValue) < 0
}
val bb = getBoundaries(sgrps, 100 )
println(bb.take(math.min(50,bb.length)).mkString(","))
assert(bb.length==1900)
}
My production data sample has a similar number of entries (an Array with 233K rows) but runs three orders of magnitude more slowly. I am looking into the tail operation and other culprits now.
Update: The following reference from Alvin Alexander indicates that the tail operation should be very fast for immutable collections such as List, but very slow for long mutable ones, including Arrays (tail on an Array copies the remaining elements)!
https://alvinalexander.com/scala/understanding-performance-scala-collections-classes-methods-cookbook
Wow! I had no idea about the performance implications of using mutable collections in Scala!
Update: By adding code to convert the Array to an (immutable) Seq, I see a three orders of magnitude performance improvement on the production data sample:
val grps = if (grpsIn.isInstanceOf[mutable.WrappedArray[_]] || grpsIn.isInstanceOf[Array[_]]) {
Seq(grpsIn: _*)
} else grpsIn
The (now fast ~200K/sec) final code is:
type Fgroups = Seq[(BigDecimal, Int)]
val cntr = new java.util.concurrent.atomic.AtomicInteger
@tailrec
def getBoundaries(grpsIn: Fgroups, groupSize: Int, curSum: Int = 0, curOffs: Seq[BigDecimal] = Seq.empty[BigDecimal]): Seq[BigDecimal] = {
val grps = if (grpsIn.isInstanceOf[mutable.WrappedArray[_]] || grpsIn.isInstanceOf[Array[_]]) {
Seq(grpsIn: _*)
} else grpsIn
if (grps.isEmpty) curOffs
else {
val (id, cnt) = grps.head
val newSum = curSum + cnt.toInt
if (cntr.getAndIncrement % 500==0) { println(s"[${cntr.get}] id=$id newsum=$newSum") }
if (newSum >= groupSize) {
getBoundaries(grps.tail, groupSize, 0, curOffs :+ id)
} else {
getBoundaries(grps.tail, groupSize, newSum, curOffs)
}
}
}
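Another option, not taken from either answer, is to sidestep both tail and :+ by walking the input once and collecting results in a builder. The sketch below is an assumption-laden alternative: boundariesSinglePass is a hypothetical name, and it returns a Vector rather than the original Seq. Its cost should not depend on whether the input is a List, an Array wrapper, or a Vector, because it only uses foreach.

```scala
// Hypothetical single-pass variant of getBoundaries: no recursion, no tail, no :+.
def boundariesSinglePass(grps: Seq[(BigDecimal, Int)], groupSize: Int): Vector[BigDecimal] = {
  val out = Vector.newBuilder[BigDecimal]
  var curSum = 0
  grps.foreach { case (id, cnt) =>
    curSum += cnt
    if (curSum >= groupSize) { // group is full: record the boundary and start over
      out += id
      curSum = 0
    }
  }
  out.result()
}

val sample = Seq((BigDecimal(1), 2), (BigDecimal(2), 2), (BigDecimal(3), 1), (BigDecimal(4), 3))
val bounds = boundariesSinglePass(sample, 3)
```

This trades the recursive style for a local var, which stays hidden inside the method.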

Get First nonrecurring element in a list using scala

Getting a compilation error: "forward reference extends over definition of value lst":
val lt = List(1,2,3,3,2,4,5,1,5,7,8,7)
var cond = false
do
{
var cond = if (lt.tail contains lt.head) true else false
if (cond == true) {
val lst : List[Int]= lt.filter(_!=lt.head)
val lt = lst
}
else {
println(lt.head)
}
}
while(cond == false)
You can implement "Get first" using find and you can implement "non-recurring" using count == 1 so the code is
lt.find(x => lt.count(_ == x) == 1)
This will return an Option[Int] that can be unpicked in the usual way.
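For readers unsure what unpicking the Option looks like, here is a minimal sketch using the list from the question. The names firstUnique, message and valueOrDefault are illustrative only, and -1 is an arbitrary fallback chosen for the example.

```scala
val lt = List(1, 2, 3, 3, 2, 4, 5, 1, 5, 7, 8, 7)
val firstUnique: Option[Int] = lt.find(x => lt.count(_ == x) == 1)

// Pattern match when both outcomes need handling.
val message = firstUnique match {
  case Some(v) => s"first non-recurring element: $v"
  case None    => "every element recurs"
}

// Or supply a fallback value when a plain Int is required.
val valueOrDefault = firstUnique.getOrElse(-1)
```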
This algorithm is clear but not efficient, so for a very long list you might want to pre-compute the count, or use a recursive function to implement your original algorithm. This would be less clear but more efficient, so avoid it unless you can prove that the inefficiency is causing a problem.
Update
Here is an example of pre-computing the count for each value. This is potentially faster for long lists because hash-based Map operations typically take effectively constant time, so the function is roughly O(n) rather than O(n^2) for the previous version.
import scala.collection.mutable
def firstUniq[A](in: Seq[A]): Option[A] = {
val m = mutable.Map.empty[A, Int]
for (elem <- in) {
m.update(elem, m.getOrElseUpdate(elem, 0) + 1)
}
val singles = m.filter(_._2 == 1)
in.find(singles.contains)
}
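If the mutable Map is unwelcome, roughly the same pre-computation can be written with groupBy. This is a sketch of an alternative rather than the answerer's code; firstUniqGrouped is a hypothetical name. It allocates an intermediate collection per distinct value, so it may be somewhat slower than the mutable version.

```scala
def firstUniqGrouped[A](in: Seq[A]): Option[A] = {
  // Count occurrences of each distinct value in one grouping pass.
  val counts: Map[A, Int] = in.groupBy(identity).map { case (k, v) => k -> v.size }
  // Scan in original order so the first unique element wins.
  in.find(counts(_) == 1)
}
```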
first non recurring element in whole list
You can use filter and count as
val firstNonRecurringValue = lt.filter(x => lt.count(_ == x) == 1)(0)
so firstNonRecurringValue is 4
first non recurring element in the list after the element
But looking at your do while code, it seems that you are trying to print the first element that does not recur after its own position. For that, the following code should work:
val firstNonRecurringValue = lt.zipWithIndex.filter(x => lt.drop(x._2).count(_ == x._1) == 1)(0)._1
Now firstNonRecurringValue should be 3

Generate all IP addresses given a string in scala

I was trying my hand at writing an IP generator given a string of digits. The generator takes as input a string of digits such as "17234" and returns all possible IPs, as follows:
1.7.2.34
1.7.23.4
1.72.3.4
17.2.3.4
I attempted to write a snippet to do the generation as follows:
def genip(ip:String):Unit = {
def legal(ip:String):Boolean = (ip.size == 1) || (ip.size == 2) || (ip.size == 3)
def genips(ip:String,portion:Int,accum:String):Unit = portion match {
case 1 if legal(ip) => println(accum+ip)
case _ if portion > 1 => {
genips(ip.drop(1),portion-1,if(accum.size == 0) ip.take(1)+"." else accum+ip.take(1)+".")
genips(ip.drop(2),portion-1,if(accum.size == 0) ip.take(2)+"." else accum+ip.take(2)+".")
genips(ip.drop(3),portion-1,if(accum.size == 0) ip.take(3)+"." else accum+ip.take(3)+".")
}
case _ => return
}
genips(ip,4,"")
}
The idea is to partition the string into four octets and then further partition the octet into strings of size "1","2" and "3" and then recursively descend into the remaining string.
I am not sure if I am on the right track but it would be great if somebody could suggest a more functional way of accomplishing the same.
Thanks
Here is an alternative version of the attached code:
def generateIPs(digits : String) : Seq[String] = generateIPs(digits, 4)
private def generateIPs(digits : String, partsLeft : Int) : Seq[String] = {
if ( digits.size < partsLeft || digits.size > partsLeft * 3) {
Nil
} else if(partsLeft == 1) {
Seq(digits)
} else {
(1 to 3).map(n => generateIPs(digits.drop(n), partsLeft - 1)
.map(digits.take(n) + "." + _)
).flatten
}
}
println("Results:\n" + generateIPs("17234").mkString("\n"))
Major changes:
Methods now return a collection of strings (rather than Unit), so they are proper functions (rather than relying on side effects) and can be easily tested;
Avoiding repeating the same code 3 times depending on the size of the group of digits we take;
Not passing the accumulated interim result as a method parameter. In this case it doesn't make sense, since you'll have at most 4 nested recursive calls and it's easier to read without it, though as you're losing the tail recursion, in many cases it might be reasonable to keep it.
Note: The last map statement is a good candidate to be replaced by for comprehension, which many developers find easier to read and reason about, though I will leave it as an exercise :)
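For anyone who wants to check their answer to that exercise, one possible for-comprehension form is sketched below. The name generateIPsFor is made up to avoid clashing with the code above, and the default argument replaces the separate public overload. Like the version it mirrors, it does not check that each part is at most 255.

```scala
def generateIPsFor(digits: String, partsLeft: Int = 4): Seq[String] =
  if (digits.length < partsLeft || digits.length > partsLeft * 3) Nil
  else if (partsLeft == 1) Seq(digits)
  else
    for {
      n    <- 1 to 3                                        // size of the next part
      rest <- generateIPsFor(digits.drop(n), partsLeft - 1) // every completion of the remainder
    } yield digits.take(n) + "." + rest

val ips = generateIPsFor("17234")
```

The two generators desugar to the same map and flatten chain as before, so the results come out in the same order.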
Your code is the right idea; I'm not sure making it functional really helps anything, but I'll show both functional and side-effecting ways to do what you want. First, we'd like a good routine to chunk off some of the digits, making sure that enough (but not too many) digits are left for the remaining chunks, and making sure each chunk is in range for an IP octet:
def validSize(i: Int, len: Int, more: Int) = i + more <= len && i + 3*more >= len
def chunk(s: String, more: Int) = {
val parts = for (i <- 1 to 3 if validSize(i, s.length, more)) yield s.splitAt(i)
parts.filter(_._1.toInt < 256)
}
Now we need to use chunk recursively four times to generate the possibilities. Here's a solution that is mutable internally and iterative:
def genIPs(digits: String) = {
var parts = List(("", digits))
for (i <- 1 to 4) {
parts = parts.flatMap{ case (pre, post) =>
chunk(post, 4-i).map{ case (x,y) => (pre+x+".", y) }
}
}
parts.map(_._1.dropRight(1))
}
Here's one that recurses using Iterator:
def genIPs(digits: String) = Iterator.iterate(List((3,"",digits))){ _.flatMap{
case(j, pre, post) => chunk(post, j).map{ case(x,y) => (j-1, pre+x+".", y) }
}}.dropWhile(_.head._1 >= 0).next.map(_._2.dropRight(1))
The logic is the same either way. Here it is working:
scala> genIPs("1238516")
res2: List[String] = List(1.23.85.16, 1.238.5.16, 1.238.51.6,
12.3.85.16, 12.38.5.16, 12.38.51.6,
123.8.5.16, 123.8.51.6, 123.85.1.6)

Efficient way to fold list in scala, while avoiding allocations and vars

I have a bunch of items in a list, and I need to analyze the content to find out how many of them are "complete". I started out with partition, but then realized that I didn't need two lists back, so I switched to a fold:
val counts = groupRows.foldLeft( (0,0) )( (pair, row) =>
if(row.time == 0) (pair._1+1,pair._2)
else (pair._1, pair._2+1)
)
but I have a lot of rows to go through for a lot of parallel users, and it is causing a lot of GC activity (assumption on my part...the GC could be from other things, but I suspect this since I understand it will allocate a new tuple on every item folded).
for the time being, I've rewritten this as
var complete = 0
var incomplete = 0
list.foreach(row => if(row.time != 0) complete += 1 else incomplete += 1)
which fixes the GC, but introduces vars.
I was wondering if there was a way of doing this without using vars while also not abusing the GC?
EDIT:
Hard call on the answers I've received. A var implementation seems to be considerably faster on large lists (like by 40%) than even a tail-recursive optimized version that is more functional but should be equivalent.
The first answer from dhg seems to be on-par with the performance of the tail-recursive one, implying that the size pass is super-efficient...in fact, when optimized it runs very slightly faster than the tail-recursive one on my hardware.
The cleanest two-pass solution is probably to just use the built-in count method:
val complete = groupRows.count(_.time == 0)
val counts = (complete, groupRows.size - complete)
But you can do it in one pass if you use partition on an iterator:
val (complete, incomplete) = groupRows.iterator.partition(_.time == 0)
val counts = (complete.size, incomplete.size)
This works because the two returned iterators are linked behind the scenes. Calling next on one moves the original iterator forward until it finds a matching element, while remembering the non-matching elements for the other iterator so that they don't need to be recomputed.
Example of the one-pass solution:
scala> val groupRows = List(Row(0), Row(1), Row(1), Row(0), Row(0)).view.map{x => println(x); x}
scala> val (complete, incomplete) = groupRows.iterator.partition(_.time == 0)
Row(0)
Row(1)
complete: Iterator[Row] = non-empty iterator
incomplete: Iterator[Row] = non-empty iterator
scala> val counts = (complete.size, incomplete.size)
Row(1)
Row(0)
Row(0)
counts: (Int, Int) = (3,2)
I see you've already accepted an answer, but you rightly mention that that solution will traverse the list twice. The way to do it efficiently is with recursion.
def counts(xs: List[...], complete: Int = 0, incomplete: Int = 0): (Int,Int) =
xs match {
case Nil => (complete, incomplete)
case row :: tail =>
if (row.time == 0) counts(tail, complete + 1, incomplete)
else counts(tail, complete, incomplete + 1)
}
This is effectively just a customized fold, except we use 2 accumulators which are just Ints (primitives) instead of tuples (reference types). It should also be just as efficient as a while-loop with vars; in fact, the bytecode should be identical.
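To have the compiler confirm that the recursion really is turned into a loop, the method can be annotated with @tailrec. The sketch below assumes a hypothetical Row type with an Int time field, since the original elides the element type as List[...]; countRows is likewise an invented name.

```scala
import scala.annotation.tailrec

// Hypothetical row type standing in for the elided List[...] element type.
final case class Row(time: Int)

@tailrec // compilation fails if either recursive call is not in tail position
def countRows(xs: List[Row], complete: Int = 0, incomplete: Int = 0): (Int, Int) =
  xs match {
    case Nil => (complete, incomplete)
    case row :: tail =>
      if (row.time == 0) countRows(tail, complete + 1, incomplete)
      else countRows(tail, complete, incomplete + 1)
  }

val tally = countRows(List(Row(0), Row(1), Row(1), Row(0), Row(0)))
```

Only one tuple is allocated, at the very end, which is the point of keeping the two counters as separate Int parameters.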
Maybe it's just me, but I prefer using the various specialized folds (.size, .exists, .sum, .product) if they are available. I find it clearer and less error-prone than the heavy-duty power of general folds.
val complete = groupRows.view.filter(_.time==0).size
(complete, groupRows.length - complete)
How about this one? No import tax.
import scala.collection.generic.CanBuildFrom
import scala.collection.Traversable
import scala.collection.mutable.Builder
case class Count(n: Int, total: Int) {
def not = total - n
}
object Count {
implicit def cbf[A]: CanBuildFrom[Traversable[A], Boolean, Count] = new CanBuildFrom[Traversable[A], Boolean, Count] {
def apply(): Builder[Boolean, Count] = new Counter
def apply(from: Traversable[A]): Builder[Boolean, Count] = apply()
}
}
class Counter extends Builder[Boolean, Count] {
var n = 0
var ttl = 0
override def +=(b: Boolean) = { if (b) n += 1; ttl += 1; this }
override def clear() { n = 0 ; ttl = 0 }
override def result = Count(n, ttl)
}
object Counting extends App {
val vs = List(4, 17, 12, 21, 9, 24, 11)
val res: Count = vs map (_ % 2 == 0)
Console println s"${vs} have ${res.n} evens out of ${res.total}; ${res.not} were odd."
val res2: Count = vs collect { case i if i % 2 == 0 => i > 10 }
Console println s"${vs} have ${res2.n} evens over 10 out of ${res2.total}; ${res2.not} were smaller."
}
OK, inspired by the answers above, but really wanting to only pass over the list once and avoid GC, I decided that, in the face of a lack of direct API support, I would add this to my central library code:
class RichList[T](private val theList: List[T]) {
def partitionCount(f: T => Boolean): (Int, Int) = {
var matched = 0
var unmatched = 0
theList.foreach(r => { if (f(r)) matched += 1 else unmatched += 1 })
(matched, unmatched)
}
}
object RichList {
implicit def apply[T](list: List[T]): RichList[T] = new RichList(list)
}
Then in my application code (if I've imported the implicit), I can write var-free expressions:
val (complete, incomplete) = groupRows.partitionCount(_.time != 0)
and get what I want: an optimized GC-friendly routine that prevents me from polluting the rest of the program with vars.
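On Scala 2.10 or later, the same enrichment can probably be written more compactly as an implicit class. This is a sketch under that assumption; ListSyntax, PartitionCountOps and the Row type are hypothetical names introduced for the example.

```scala
// Hypothetical row type for the usage example.
final case class Row(time: Int)

object ListSyntax {
  // The implicit class replaces the RichList class plus its companion's implicit def.
  implicit class PartitionCountOps[T](private val self: List[T]) {
    def partitionCount(f: T => Boolean): (Int, Int) = {
      var matched = 0
      var unmatched = 0
      self.foreach(r => if (f(r)) matched += 1 else unmatched += 1)
      (matched, unmatched)
    }
  }
}

import ListSyntax._

val rows = List(Row(0), Row(5), Row(0))
val (complete, incomplete) = rows.partitionCount(_.time != 0)
```

The vars remain, but they are confined to the one method that needs them.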
However, I then saw Luigi's benchmark, and updated it to:
Use a longer list so that multiple passes on the list were more obvious in the numbers
Use a boolean function in all cases, so that we are comparing things fairly
http://pastebin.com/2XmrnrrB
The var implementation is definitely considerably faster, even though Luigi's routine should be identical (as one would expect with optimized tail recursion). Surprisingly, dhg's dual-pass original is just as fast (slightly faster if compiler optimization is on) as the tail-recursive one. I do not understand why.
It is slightly tidier to use a mutable accumulator pattern, like so, especially if you can re-use your accumulator:
case class Accum(var complete: Int = 0, var incomplete: Int = 0) {
def inc(compl: Boolean): this.type = {
if (compl) complete += 1 else incomplete += 1
this
}
}
val counts = groupRows.foldLeft( Accum() ){ (a, row) => a.inc( row.time == 0 ) }
If you really want to, you can hide your vars as private; if not, you still are a lot more self-contained than the pattern with vars.
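A sketch of the private-vars variant mentioned above might look like the following. It is one possible shape rather than the answerer's code: the class is a plain class instead of a case class so the counters are not exposed as constructor parameters, and Row is a hypothetical type with an Int time field.

```scala
// Hypothetical row type for the usage example.
final case class Row(time: Int)

final class Accum {
  // Mutable state is private; callers only see the read-only accessors.
  private var completeCount = 0
  private var incompleteCount = 0

  def inc(isComplete: Boolean): this.type = {
    if (isComplete) completeCount += 1 else incompleteCount += 1
    this
  }

  def complete: Int = completeCount
  def incomplete: Int = incompleteCount
}

val groupRows = List(Row(0), Row(1), Row(0))
val acc = groupRows.foldLeft(new Accum) { (a, row) => a.inc(row.time == 0) }
```

The fold passes the same Accum instance through every step, so only one accumulator object is allocated for the whole list.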
You could just calculate it using the difference like so:
def counts(groupRows: List[Row]) = {
val complete = groupRows.foldLeft(0){ (count, row) =>
if(row.time == 0) count + 1 else count
}
(complete, groupRows.length - complete)
}