Upper Triangular Matrix in Scala

Is there a way I can perform a faster computation of upper triangle matrix in scala?
/** Returns a vector which consists of the upper triangular elements of a matrix */
def getUpperTriangle(A: Array[Array[Double]]) = {
  var A_ = Seq(0.)
  for (i <- 0 to A.size - 1; j <- 0 to A(0).size - 1) {
    if (i <= j) {
      A_ = A_ ++ Seq(A(i)(j))
    }
  }
  A_.tail.toArray
}

I don't know about faster, but this is a lot shorter and more "functional" (I note you tagged your question with functional-programming)
def getUpperTriangle(a: Array[Array[Double]]) =
(0 until a.size).flatMap(i => a(i).drop(i)).toArray
or, more or less the same idea:
def getUpperTriangle(a: Array[Array[Double]]) =
a.zipWithIndex.flatMap{case(r,i) => r.drop(i)}

Here are three basic things you can do to streamline your logic to improve performance:
Start with an empty Seq, so you don't need the Seq(0.) placeholder element or the Seq.tail call at the end to drop it.
Use Seq.:+ to append a single element to the Seq, instead of constructing a one-element Seq and appending it with Seq.++. Both end up adding one element, but :+ avoids allocating and concatenating an intermediate Seq, so it has less overhead.
You can start j at i instead of starting j at 0 and testing i <= j in the body of the loop. This will save n^2/2 no-op loop iterations.
Some stylistic things:
It's best to always include the return type on a public method; it documents the contract and avoids surprises from type inference.
We use lowercase (camelCase) names for variables and parameters in Scala, so a rather than A.
0 until size is perhaps more readable than 0 to size - 1
def getUpperTriangle(a: Array[Array[Double]]): Array[Double] = {
  var result = Seq[Double]()
  for (i <- 0 until a.size; j <- i until a(0).size) {
    result = result :+ a(i)(j)
  }
  result.toArray
}
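If raw speed is the main concern, a mutable builder avoids the repeated immutable-Seq appends entirely. A minimal sketch (my own variant, assuming a rectangular, not necessarily square, matrix):

def getUpperTriangleFast(a: Array[Array[Double]]): Array[Double] = {
  val builder = Array.newBuilder[Double]
  if (a.nonEmpty) builder.sizeHint(a.length * a(0).length) // upper bound, avoids repeated re-allocation
  var i = 0
  while (i < a.length) {
    var j = i                     // start j at i (the diagonal), as suggested above
    while (j < a(i).length) {
      builder += a(i)(j)
      j += 1
    }
    i += 1
  }
  builder.result()
}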

How to count the number of iterations in a for comprehension in Scala?

I am using a for-comprehension on a Stream and I would like to know how many iterations it took to get to the final result.
In code:
var count = 0
for {
  xs <- xs_generator
  x <- xs
  count = count + 1 // doesn't work!!
  if prop(x)
} yield x
Is there a way to achieve this?
Edit: If you don't want to return only the first item, but the entire stream of solutions, take a look at the second part.
Edit-2: Shorter version with zipWithIndex appended.
It's not entirely clear what you are attempting to do. To me it seems as if you are trying to find something in a stream of lists, and additionally record the number of checked elements.
If this is what you want, consider doing something like this:
/** Returns the `x` that satisfies predicate `prop`,
  * as well as the total number of tested `x`s.
  */
def findTheX(): (Int, Int) = {
  val xs_generator = Stream.from(1).map(a => (1 to a).toList).take(1000)
  var count = 0
  def prop(x: Int): Boolean = x % 317 == 0
  for (xs <- xs_generator; x <- xs) {
    count += 1
    if (prop(x)) {
      return (x, count)
    }
  }
  throw new Exception("No solution exists")
}
println(findTheX())
// prints:
// (317,50403)
Several important points:
Scala's for-comprehensions have nothing to do with Python's yield. Just in case you thought they did: re-read the documentation on for-comprehensions.
There is no built-in syntax for breaking out of a for-comprehension. It's easiest to wrap it in a function and then call return. There is also scala.util.control.Breaks.breakable, but it works by throwing exceptions.
The function returns the found item and the total count of checked items, therefore the return type is (Int, Int).
The throw at the end, after the for-comprehension, ensures that the type of that code path is Nothing, which is a subtype of (Int, Int), rather than Unit, which is not a subtype of (Int, Int).
Think twice before using Stream for purposes like this: once elements have been generated, the Stream keeps them in memory. This can lead to "GC overhead limit exceeded" errors if the Stream isn't used carefully.
Just to emphasize it again: the yield in Scala for-comprehensions is unrelated to Python's yield. Scala has no built-in support for coroutines and generators. You don't need them as often as you might think, but it requires some readjustment.
EDIT
I've re-read your question. If you want an entire stream of solutions together with a counter of how many xs have been checked, you can use something like this instead:
val xs_generator = Stream.from(1).map(a => (1 to a).toList)
var count = 0
def prop(x: Int): Boolean = x % 317 == 0

val xsWithCounter = for {
  xs <- xs_generator
  x <- xs
  _ = { count = count + 1 }
  if (prop(x))
} yield (x, count)
println(xsWithCounter.take(10).toList)
// prints:
// List(
// (317,50403), (317,50721), (317,51040), (317,51360), (317,51681),
// (317,52003), (317,52326), (317,52650), (317,52975), (317,53301)
// )
Note the _ = { ... } part. There is a limited number of things that can occur in a for-comprehension:
generators (the x <- things)
filters/guards (if-s)
value definitions
Here, we sort of abuse the value-definition syntax to update the counter. We use the block { count = count + 1 } as the right-hand side of the definition. It returns Unit. Since we don't need the result of the block, we use _ as the left-hand side. In this way, the block is executed once for every x.
EDIT-2
If mutating the counter is not your main goal, you can of course use zipWithIndex directly:
val xsWithCounter =
  xs_generator.flatten.zipWithIndex.filter { x => prop(x._1) }
It gives almost the same result as the previous version, but the second components are shifted by -1 (they are zero-based indices, not the number of tried xs).
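If you want the second component to match the mutable-counter version exactly, shift the index by one. A minimal sketch, reusing the xs_generator and prop definitions from above:

val xsWithCount =
  xs_generator.flatten // Stream[Int] of all candidate xs, in order
    .zipWithIndex      // pair each x with its zero-based index
    .collect { case (x, idx) if prop(x) => (x, idx + 1) } // idx + 1 = number of xs tried so far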

Scala: IndexedSeq.newBuilder vs. ArrayBuffer

I had assumed that building an IndexedSeq in a loop should use an ArrayBuffer, followed by a conversion to a Vector via .toVector.
Profiling an example showed the CPU hotspot was in this section, so I tried an alternative: use IndexedSeq.newBuilder followed by conversion to the immutable result via .result().
This change gave a significant performance improvement. The code looks almost the same, so it seems using IndexedSeq.newBuilder is best practice. Is this correct? The example method is shown below, with the ArrayBuffer difference commented out.
def interleave[T](a: IndexedSeq[T], b: IndexedSeq[T]): IndexedSeq[T] = {
  val al = a.length
  val bl = b.length
  val buffer = IndexedSeq.newBuilder[T]
  //---> val buffer = new ArrayBuffer[T](al + bl)
  val commonLength = Math.min(al, bl)
  val aExtra = al - commonLength
  val bExtra = bl - commonLength
  var i = 0
  while (i < commonLength) {
    buffer += a(i)
    buffer += b(i)
    i += 1
  }
  if (aExtra > 0) {
    while (i < al) {
      buffer += a(i)
      i += 1
    }
  } else if (bExtra > 0) {
    while (i < bl) {
      buffer += b(i)
      i += 1
    }
  }
  buffer.result()
  //---> buffer.toVector
}
As to which is best practice, I guess it depends upon your requirements. Both approaches are acceptable and understandable. All things being equal, in this particular case, I would favor the IndexedSeq.newBuilder over ArrayBuilder (since the latter targets the creation of an Array, while the former's result is a Vector).
Just one point on benchmarking: this is a real art form, due to caching, JIT & HotSpot performance, garbage collection, etc. One piece of software you might consider using to do this is ScalaMeter. You will need to write both versions of the function to populate the final vector, and ScalaMeter will give you accurate statistics on both. ScalaMeter allows the code to warm-up before taking measurements, and can also look at memory requirements as well as CPU time.
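For reference, a minimal sketch of how such a ScalaMeter comparison might be set up (the Bench.LocalTime base class, the Gen ranges, and the object name are assumptions about the ScalaMeter version in use; the interleave method above is assumed to be in scope):

import org.scalameter.api._

object InterleaveBench extends Bench.LocalTime {
  // Hypothetical input sizes; choose ranges that match your real workload.
  val sizes = Gen.range("size")(100000, 500000, 100000)
  val pairs = sizes.map(n => (IndexedSeq.tabulate(n)(identity), IndexedSeq.tabulate(n)(identity)))

  performance of "interleave" in {
    measure method "newBuilder" in {
      using(pairs) in { case (a, b) => interleave(a, b) }
    }
    // A second `measure method` block would cover the ArrayBuffer + toVector variant.
  }
}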
In this example informal testing did not deceive, but ScalaMeter does provide a clearer picture of performance. Building the result in an ArrayBuffer (top orange line) is definitely slower than the more direct newBuilder (blue line).
Returning the ArrayBuffer as an IndexedSeq is the fastest (green line), but of course it does not give you the true protection of an immutable collection.
Building the result in an intermediate Array (red line) falls between ArrayBuffer and newBuilder.
The "zipAll" collection method allows the interleave to be done in a more functional style:
def interleaveZipAllBuilderPat[T >: Null](a: IndexedSeq[T], b: IndexedSeq[T]): IndexedSeq[T] = {
  // null is used as the zipAll padding value, hence the T >: Null bound (reference types only)
  a.zipAll(b, null, null).foldLeft(Vector.newBuilder[T]) { (z, tp) =>
    tp match {
      case (x, null) => z += x      // a was longer: only the left element remains
      case (null, y) => z += y      // b was longer: only the right element remains
      case (x, y)    => z += x += y // both present: interleave
    }
  }.result()
}
The slowest are the functional methods; the top two lines are almost the same and differ only in that one uses a pattern match and the other an if statement, so the pattern match itself is not slow.
The functional style is marginally worse than the direct loop if an ArrayBuffer is used to accumulate the result, but the direct loop using newBuilder is significantly faster.
If "zipAll" could return a builder, and if the builder were iterable, the functional style could be faster - no need to produce the immutable result if the next step just requires an iteration over elements.
So for me newBuilder is the clear winner.

Tail recursion with List + .toVector or Vector?

import scala.annotation.tailrec
import breeze.linalg.DenseVector

val dimensionality = 10
val zeros = DenseVector.zeros[Double](dimensionality)

@tailrec private def specials(list: List[DenseVector[Int]], i: Int): List[DenseVector[Int]] = {
  if (i >= dimensionality) list
  else {
    val vec = zeros.copy
    vec(i to i) := 1
    specials(vec :: list, i + 1)
  }
}

val specialList = specials(Nil, 0).toVector
specialList.map(...doing my thing...)
Should I write my tail-recursive function using a List as the accumulator, as above, and then call
specials(Nil, 0).toVector
or should I write my tail recursion with a Vector in the first place? What is computationally more efficient?
By the way: specialList is a list of DenseVectors where every entry is 0 except for a single entry, which is 1. There are as many DenseVectors as each one is long.
I'm not sure what you're trying to do here, but you could rewrite your code like so:
type Mat = List[Vector[Int]]
val zeros = Vector.fill(dimensionality)(0)

@tailrec
private def specials(mat: Mat, i: Int): Mat = i match {
  case `dimensionality` => mat
  case _ =>
    val v = zeros.updated(i, 1) // immutable Vector: updated returns a copy, no explicit .copy needed
    specials(v :: mat, i + 1)
}
As you are dealing with a matrix, Vector is probably a better choice.
Let's compare the performance characteristics of both variants:
List: prepending takes constant time, conversion to Vector takes linear time.
Vector: prepending takes "effectively" constant time (eC), no subsequent conversion needed.
If you compare the implementations of List and Vector, then you'll find out that prepending to a List is a simpler and cheaper operation than prepending to a Vector. Instead of just adding another element at the front as it is done by List, Vector potentially has to replace a whole branch/subtree internally. On average, this still happens in constant time ("effectively" constant, because the subtrees can differ in their size), but is more expensive than prepending to List. On the plus side, you can avoid the call to toVector.
Eventually, the crucial point of interest is the size of the collection you want to create (or, in other words, the number of recursive prepend steps you are doing). It's entirely possible that there is no clear winner: one of the two variants may be faster for <= n steps, while the other is faster for > n steps. In my naive toy benchmark, List/toVector seemed to be faster for fewer than 8k elements, but you should perform a set of well-chosen benchmarks that represent your scenario adequately.
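For completeness, here is the kind of naive toy benchmark referred to above, reduced to plain Int elements (System.nanoTime, no warm-up control; use ScalaMeter or JMH for trustworthy numbers):

def buildViaList(n: Int): Vector[Int] = {
  @annotation.tailrec
  def loop(acc: List[Int], i: Int): List[Int] =
    if (i >= n) acc else loop(i :: acc, i + 1)
  loop(Nil, 0).toVector // prepend to List, convert once at the end
}

def buildViaVector(n: Int): Vector[Int] = {
  @annotation.tailrec
  def loop(acc: Vector[Int], i: Int): Vector[Int] =
    if (i >= n) acc else loop(i +: acc, i + 1)
  loop(Vector.empty, 0)  // prepend directly to Vector, no conversion
}

def time[A](body: => A): Long = {
  val start = System.nanoTime()
  body
  System.nanoTime() - start
}

// Numbers vary per run and JVM; run each variant many times before drawing conclusions.
println(time(buildViaList(8000)))
println(time(buildViaVector(8000)))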

How to write an efficient groupBy-size filter in Scala, can be approximate

Given a List[Int] in Scala, I wish to get the Set[Int] of all Ints which appear at least thresh times. I can do this using groupBy or foldLeft, then filter. For example:
val thresh = 3
val myList = List(1,2,3,2,1,4,3,2,1)
myList
  .foldLeft(Map[Int, Int]()) { case (m, i) => m + (i -> (m.getOrElse(i, 0) + 1)) }
  .filter(_._2 >= thresh)
  .keys
will give Set(1,2).
Now suppose the List[Int] is very large. How large is hard to say, but in any case this seems wasteful, as I don't care about each Int's exact frequency, only whether it is at least thresh. Once it has passed thresh there's no need to keep counting; just add the Int to the Set[Int].
The question is: can I do this more efficiently for a very large List[Int],
a) if I need a true, accurate result (no room for mistakes)
b) if the result can be approximate, e.g. by using some Hashing trick or Bloom Filters, where Set[Int] might include some false-positives, or whether {the frequency of an Int > thresh} isn't really a Boolean but a Double in [0-1].
First of all, you can't do better than O(N), as you need to check each element of your initial list at least once. Your current approach is O(N), presuming that operations with IntMap are effectively constant.
Now what you can try in order to increase efficiency:
Update the map only when the current counter value is less than or equal to the threshold. This eliminates a huge number of the most expensive operations: map updates.
Try a faster map instead of IntMap. If you know that the values of the initial List are in a fixed range, you can use an Array instead of IntMap (index as the key). Another option is a mutable HashMap with sufficient initial capacity. As my benchmark shows, this actually makes a significant difference.
As @ixx proposed, after incrementing a value in the map, check whether it equals the threshold and, if so, add the element immediately to the result list. This saves one linear traversal (which turns out not to be that significant for large input).
I don't see how any approximate solution can be faster (only if you ignore some elements at random). Otherwise it will still be O(N).
Update
I created a microbenchmark to measure the actual performance of the different implementations. For sufficiently large input and output, Ixx's suggestion about immediately adding elements to the result list doesn't produce a significant improvement. However, a similar approach can be used to eliminate unnecessary map updates (which appear to be the most expensive operation).
Results of the benchmarks (average run times on 1,000,000 elements, with pre-warming):
Author's solution: 447 ms
Ixx's solution: 412 ms
Ixx's solution 2 (eliminated excessive map writes): 150 ms
My solution: 57 ms
My solution uses a mutable HashMap instead of an immutable IntMap and includes all the other optimizations mentioned above.
Ixx's updated solution:
val tuple = (Map[Int, Int](), List[Int]())
val res = myList.foldLeft(tuple) {
  case ((m, s), i) =>
    val count = m.getOrElse(i, 0) + 1
    (if (count <= thresh) m + (i -> count) else m, if (count == thresh) i :: s else s)
}
My solution:
val map = new mutable.HashMap[Int, Int]()
val res = new ListBuffer[Int]
myList.foreach {
i =>
val c = map.getOrElse(i, 0) + 1
if (c == thresh) {
res += i
}
if (c <= thresh) {
map(i) = c
}
}
The full microbenchmark source is available here.
You could use foldLeft to collect the matching items, like this:
val tuple = (Map[Int, Int](), List[Int]())
myList.foldLeft(tuple) {
  case ((m, s), i) =>
    val count = m.getOrElse(i, 0) + 1
    (m + (i -> count), if (count == thresh) i :: s else s)
}
I could measure a performance improvement of about 40% with a small list, so it's definitely an improvement...
Edited to use List and prepend, which takes constant time (see comments).
If by "more efficiently" you mean the space efficiency (in extreme case when the list is infinite), there's a probabilistic data structure called Count Min Sketch to estimate the frequency of items inside it. Then you can discard those with frequency below your threshold.
There's a Scala implementation in the Algebird library.
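As a rough illustration of the idea (the exact CMS API differs between Algebird versions, and the eps/delta/seed values below are arbitrary), you could estimate frequencies and filter on the estimate:

import com.twitter.algebird.CMS

// CMS.monoid needs an implicit CMSHasher; Algebird ships one for Long.
// Smaller eps/delta give more accuracy at the cost of more space.
val cmsMonoid = CMS.monoid[Long](eps = 0.001, delta = 1e-7, seed = 1)
val cms = cmsMonoid.create(myList.map(_.toLong))

// Estimates may over-count (false positives possible), never under-count.
val approxRes: Set[Int] = myList.toSet.filter(i => cms.frequency(i.toLong).estimate >= thresh)

For a finite List this is only illustrative (the exact counts are already available); the payoff comes when the data is streamed and you cannot keep an exact per-key map.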
You can change your foldLeft example a bit by using a mutable.Set that is built incrementally and at the same time used as a filter while iterating over your Seq via withFilter. However, because I'm using withFilter I cannot use foldLeft and have to make do with foreach and a mutable map:
import scala.collection.mutable

def getItems[A](in: Seq[A], threshold: Int): Set[A] = {
  val counts: mutable.Map[A, Int] = mutable.Map.empty
  val result: mutable.Set[A] = mutable.Set.empty
  in.withFilter(!result(_)).foreach { x =>
    counts.update(x, counts.getOrElse(x, 0) + 1)
    if (counts(x) >= threshold) {
      result += x
    }
  }
  result.toSet
}
So this skips items that have already been added to the result set while running through the Seq a single time, because withFilter filters the Seq inside the applied function (map, flatMap, foreach) rather than returning a filtered Seq.
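For the question's example input, a quick usage check of the getItems function above:

val myList = List(1, 2, 3, 2, 1, 4, 3, 2, 1)
println(getItems(myList, 3)) // Set(1, 2)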
EDIT:
I changed my solution to not use Seq.count, which was stupid, as Aivean correctly pointed out.
Using Aivean's microbenchmark I can see that it is still slightly slower than his approach, but still better than the author's first approach.
Author's solution: 377 ms
Ixx's solution: 399 ms
Ixx's solution 2 (eliminated excessive map writes): 110 ms
Sascha Kolberg's solution: 72 ms
Aivean's solution: 54 ms

What is time complexity of slice in scala?

What is the time complexity of Scala's slice method? Is it O(m) or O(n),
where m is the number of elements in the slice and n is the number of elements in the collection?
A more specific question: what is the time complexity of
someMap.slice(i, i + 1).keys.head, where i is a random Int less than someMap.size? If slice complexity is O(m), then it should be O(1), right?
This clearly depends on the underlying datatype. Slicing an Array, ArrayBuffer, ByteBuffer, or any other subclass of IndexedSeqOptimized, for example, is O(k) if you're slicing k elements from the container. List, for example, is O(n), as you can see from its implementation. You'll likely want to check the source for your specific type.
Map gets its implementation of slice from IterableLike. From the implementation pasted below, it's clearly the cost to iterate over elements in the collection plus the cost to create a newly built map.
For TreeMap this should be O(n + k lg k) if you're slicing k elements out of the map.
For HashMap this should be bounded by O(n).
override /*TraversableLike*/ def slice(from: Int, until: Int): Repr = {
  val lo = math.max(from, 0)
  val elems = until - lo
  val b = newBuilder
  if (elems <= 0) b.result()
  else {
    b.sizeHintBounded(elems, this)
    var i = 0
    val it = iterator drop lo
    while (i < elems && it.hasNext) {
      b += it.next
      i += 1
    }
    b.result()
  }
}
When in doubt, look at the source.
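To tie this back to the specific question: with the IterableLike implementation above, someMap.slice(i, i + 1).keys.head is dominated by the iterator drop lo step, so it costs O(i), not O(1). A minimal sketch that makes the same cost explicit (keyAt is a hypothetical helper, not a standard library method):

def keyAt[K, V](m: Map[K, V], i: Int): K =
  m.keysIterator.drop(i).next() // drop(i) walks past i entries, so this is O(i), not O(1)

val someMap = (1 to 100).map(k => k -> k.toString).toMap
println(keyAt(someMap, 42)) // same key as someMap.slice(42, 43).keys.head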