Is it faster to create a new Map or clear it and use again? - scala

I need to use many Maps in my project, so I wonder which way is more efficient:
val map: mutable.Map[Int, Int] = mutable.Map.empty
for (_ <- 0 until big_number)
{
  // do something with map
  map.clear()
}
or
for (_ <- 0 until big_number)
{
  val map: mutable.Map[Int, Int] = mutable.Map.empty
  // do something with map
}
in terms of time and memory?

Well, my formal answer would always be: it depends. You need to benchmark your own scenario and see what fits best. I'll show an example of how you can benchmark your own code. Let's start with writing a measuring method:
def measure(name: String, f: () => Unit): Unit = {
  val start = System.currentTimeMillis()
  f()
  println(name + ": " + (System.currentTimeMillis() - start) + " ms")
}
Let's assume that in each iteration we need to insert one key-value pair into the map and then print it:
import scala.collection.mutable
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

Await.result(Future.sequence(Seq(
  Future {
    measure("inner", () => {
      for (i <- 0 until 10) {
        val map2 = mutable.Map.empty[Int, Int]
        map2(i) = i
        println(map2)
      }
    })
  },
  Future {
    measure("outer", () => {
      val map1 = mutable.Map.empty[Int, Int]
      for (i <- 0 until 10) {
        map1(i) = i
        println(map1)
        map1.clear()
      }
    })
  })), 10.seconds)
The output in this case is almost always equal between the inner and the outer. Please note that I run the two options in parallel; if I didn't, whichever option ran first would always take significantly more time, no matter which of them it is (most likely JVM warm-up effects).
Therefore, we can conclude that in this case the two options perform almost the same.
But if, for example, I add an immutable option:
Future {
  measure("immutable", () => {
    for (i <- 0 until 10) {
      val map1 = Map[Int, Int](i -> i)
      println(map1)
    }
  })
}
it always ends up first. This makes sense in this micro-benchmark: very small immutable maps (up to four entries) are implemented as specialized classes (Map.Map1 through Map.Map4), so creating a one-entry immutable map is very cheap. Don't read this as a general rule that immutable collections always outperform mutable ones.
For better performance tests you should probably use a third-party benchmarking library, such as ScalaMeter, or others that exist.
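For example, a minimal ScalaMeter sketch might look like the following (the object name, generator ranges, and benchmark labels are made up for illustration):
import org.scalameter.api._
import scala.collection.mutable

object MapReuseBench extends Bench.LocalTime {
  // Axis of input sizes to benchmark over.
  val sizes = Gen.range("iterations")(10000, 50000, 10000)

  performance of "mutable.Map" in {
    measure method "fresh map each iteration" in {
      using(sizes) in { n =>
        for (i <- 0 until n) {
          val m = mutable.Map.empty[Int, Int]
          m(i) = i
        }
      }
    }
    measure method "clear and reuse" in {
      using(sizes) in { n =>
        val m = mutable.Map.empty[Int, Int]
        for (i <- 0 until n) {
          m(i) = i
          m.clear()
        }
      }
    }
  }
}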

Related

Understanding performance of a tailrec annotated recursive method in scala

Consider the following method, which has been verified to be properly tail-recursive:
import scala.annotation.tailrec

@tailrec
def getBoundaries(grps: Seq[(BigDecimal, Int)], groupSize: Int, curSum: Int = 0, curOffs: Seq[BigDecimal] = Seq.empty[BigDecimal]): Seq[BigDecimal] = {
  if (grps.isEmpty) curOffs
  else {
    val (id, cnt) = grps.head
    val newSum = curSum + cnt.toInt
    if (newSum % 50 == 0) { println(s"id=$id newsum=$newSum") }
    if (newSum >= groupSize) {
      getBoundaries(grps.tail, groupSize, 0, curOffs :+ id) // r1
    } else {
      getBoundaries(grps.tail, groupSize, newSum, curOffs) // r2
    }
  }
}
This is running very slowly - about 75 loops per second. When I pause and inspect the stack trace (a nice feature of IntelliJ), almost every time the line currently being invoked is the second tail-recursive call, r2. That fact makes me suspicious of the purported "Scala unwraps the recursive calls into a while loop". If the unwrapping were occurring, then why are we seeing so much time in the invocations themselves?
Beyond having a properly structured tail-recursive method, are there other considerations for getting a recursive routine to perform close to a direct iteration?
The performance will depend on the underlying type of the Seq.
If it is a List, then the problem is appending (:+) to it: this gets very slow with long lists, because the whole list has to be traversed (and rebuilt) to add an element at the end.
One solution is to prepend to the list (+:) each time and then reverse at the end. This can give a very significant performance improvement, because adding to the start of a list is very quick.
Other Seq types will have different performance characteristics, but you can convert to a List before the recursive call so that you know how it is going to perform.
Here is sample code:
def getBoundaries(grps: Seq[(BigDecimal, Int)], groupSize: Int): Seq[BigDecimal] = {
  @tailrec
  def loop(grps: List[(BigDecimal, Int)], curSum: Int, curOffs: List[BigDecimal]): List[BigDecimal] =
    if (grps.isEmpty) curOffs
    else {
      val (id, cnt) = grps.head
      val newSum = curSum + cnt.toInt
      if (newSum >= groupSize) {
        loop(grps.tail, 0, id +: curOffs) // r1
      } else {
        loop(grps.tail, newSum, curOffs) // r2
      }
    }
  loop(grps.toList, 0, Nil).reverse
}
This version gives a 10x performance improvement over the original code, using the test data provided by the questioner in his own answer to the question.
The issue is not in the recursion but in the array manipulation. With the following test case it runs at about 200K recursions per second:
type Fgroups = Seq[(BigDecimal, Int)]

test("testGetBoundaries") {
  val N = 200000
  val grps: Fgroups = (N to 1 by -1).flatMap { x => Array.tabulate(x % 20) { x2 => (BigDecimal(x2 * 1e9), 1) } }
  val sgrps = grps.sortWith { case (a, b) =>
    a._1.longValue.compare(b._1.longValue) < 0
  }
  val bb = getBoundaries(sgrps, 100)
  println(bb.take(math.min(50, bb.length)).mkString(","))
  assert(bb.length == 1900)
}
My production data sample has a similar number of entries (an Array with 233K rows) but runs three orders of magnitude more slowly. I am looking into the tail operation and other culprits now.
Update: The following reference from Alvin Alexander indicates that the tail operation should be very fast for immutable collections - but deadly slow for long mutable ones - including Arrays!
https://alvinalexander.com/scala/understanding-performance-scala-collections-classes-methods-cookbook
Wow! I had no idea about the performance implications of using mutable collections in Scala!
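To see that asymmetry concretely, here is a quick, unscientific check (the timing helper and sizes are made up for illustration; List.tail is O(1), while Array.tail copies the remaining elements, i.e. O(n) per call):
def time[A](label: String)(f: => A): A = {
  val t0 = System.nanoTime()
  val result = f
  println(s"$label: ${(System.nanoTime() - t0) / 1e6} ms")
  result
}

val xs = List.range(0, 100000)
val arr = Array.range(0, 100000)

// List.tail just follows a pointer: constant time per step.
time("100K List tails") { var l = xs; while (l.nonEmpty) l = l.tail }
// Array.tail copies all remaining elements: linear time per step.
time("1K Array tails") { var a = arr; var i = 0; while (i < 1000) { a = a.tail; i += 1 } }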
Update: By adding code to convert the Array to an (immutable) Seq, I see the three-orders-of-magnitude performance improvement on the production data sample:
val grps = if (grpsIn.isInstanceOf[mutable.WrappedArray[_]] || grpsIn.isInstanceOf[Array[_]]) {
  Seq(grpsIn: _*)
} else grpsIn
The (now fast ~200K/sec) final code is:
import scala.annotation.tailrec
import scala.collection.mutable

type Fgroups = Seq[(BigDecimal, Int)]

val cntr = new java.util.concurrent.atomic.AtomicInteger

@tailrec
def getBoundaries(grpsIn: Fgroups, groupSize: Int, curSum: Int = 0, curOffs: Seq[BigDecimal] = Seq.empty[BigDecimal]): Seq[BigDecimal] = {
  val grps = if (grpsIn.isInstanceOf[mutable.WrappedArray[_]] || grpsIn.isInstanceOf[Array[_]]) {
    Seq(grpsIn: _*)
  } else grpsIn
  if (grps.isEmpty) curOffs
  else {
    val (id, cnt) = grps.head
    val newSum = curSum + cnt.toInt
    if (cntr.getAndIncrement % 500 == 0) { println(s"[${cntr.get}] id=$id newsum=$newSum") }
    if (newSum >= groupSize) {
      getBoundaries(grps.tail, groupSize, 0, curOffs :+ id)
    } else {
      getBoundaries(grps.tail, groupSize, newSum, curOffs)
    }
  }
}

Cats Writer Vector is empty

I wrote this simple program in my attempt to learn how Cats Writer works:
import cats.data.Writer
import cats.syntax.applicative._
import cats.syntax.writer._
import cats.instances.vector._

object WriterTest extends App {
  type Logged2[A] = Writer[Vector[String], A]

  Vector("started the program").tell
  val output1 = calculate1(10)
  val foo = new Foo()
  val output2 = foo.calculate2(20)
  val (log, sum) = (output1 + output2).pure[Logged2].run
  println(log)
  println(sum)

  def calculate1(x: Int): Int = {
    Vector("came inside calculate1").tell
    val output = 10 + x
    Vector(s"Calculated value ${output}").tell
    output
  }
}

class Foo {
  def calculate2(x: Int): Int = {
    Vector("came inside calculate 2").tell
    val output = 10 + x
    Vector(s"calculated ${output}").tell
    output
  }
}
The program works, and the output is:
> run-main WriterTest
[info] Compiling 1 Scala source to /Users/Cats/target/scala-2.11/classes...
[info] Running WriterTest
Vector()
50
[success] Total time: 1 s, completed Jan 21, 2017 8:14:19 AM
But why is the vector empty? Shouldn't it contain all the strings on which I used the "tell" method?
When you call tell on your Vectors, each call creates a new Writer[Vector[String], Unit]. However, you never actually do anything with these Writers; you just discard them. Further, you call pure to create your final Writer, which simply creates a Writer with an empty Vector. You have to combine the Writers together in a chain that carries your value and your messages around.
type Logged[A] = Writer[Vector[String], A]

val (log, sum) = (for {
  _ <- Vector("started the program").tell
  output1 <- calculate1(10)
  foo = new Foo()
  output2 <- foo.calculate2(20)
} yield output1 + output2).run

def calculate1(x: Int): Logged[Int] = for {
  _ <- Vector("came inside calculate1").tell
  output = 10 + x
  _ <- Vector(s"Calculated value ${output}").tell
} yield output

class Foo {
  def calculate2(x: Int): Logged[Int] = for {
    _ <- Vector("came inside calculate2").tell
    output = 10 + x
    _ <- Vector(s"calculated ${output}").tell
  } yield output
}
Note the use of for notation. The definition of calculate1 really desugars to:
def calculate1(x: Int): Logged[Int] = Vector("came inside calculate1").tell.flatMap { _ =>
  val output = 10 + x
  Vector(s"Calculated value ${output}").tell.map { _ => output }
}
flatMap is the monadic bind operation, which means it understands how to take two monadic values (in this case Writer) and join them together to get a new one. In this case, it makes a Writer containing the concatenation of the logs and the value of the one on the right.
Note how there are no side effects. There is no global state by which Writer can remember all your calls to tell. You instead make many Writers and join them together with flatMap to get one big one at the end.
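As a tiny, self-contained illustration of that joining (a sketch; it assumes cats.implicits._ for the required Monad and Semigroup instances):
import cats.data.Writer
import cats.implicits._

val a = Writer(Vector("step 1"), 10)
val b = Writer(Vector("step 2"), 20)

// flatMap concatenates the two logs and combines the values:
val c = a.flatMap(x => b.map(y => x + y))
println(c.run) // (Vector(step 1, step 2),30)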
The problem with your example code is that you're not using the result of the tell method.
If you take a look at its signature, you'll see this:
final class WriterIdSyntax[A](val a: A) extends AnyVal {
  def tell: Writer[A, Unit] = Writer(a, ())
}
It is clear that tell returns a Writer[A, Unit] result, which is immediately discarded because you didn't assign it to a value.
The proper way to use a Writer (and any monad in Scala) is through its flatMap method. It would look similar to this:
println(
  Vector("started the program").tell.flatMap { _ =>
    15.pure[Logged2].flatMap { i =>
      Writer(Vector("ended program"), i)
    }
  }
)
The code above, when executed will give you this:
WriterT((Vector(started the program, ended program),15))
As you can see, both messages and the int are stored in the result.
Now this is a bit ugly, and Scala actually provides a better way to do this: for-comprehensions. For-comprehensions are a bit of syntactic sugar that allow us to write the same code this way:
println(
  for {
    _ <- Vector("started the program").tell
    i <- 15.pure[Logged2]
    _ <- Vector("ended program").tell
  } yield i
)
Now, going back to your example, what I would recommend is that you change the return types of calculate1 and calculate2 to Writer[Vector[String], Int] and then try to make your application compile using what I wrote above.
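For reference, here is a minimal sketch of that change for calculate1, reusing the imports from the question; it is essentially what the first answer above arrives at via a for-comprehension:
def calculate1(x: Int): Writer[Vector[String], Int] =
  Vector("came inside calculate1").tell.flatMap { _ =>
    val output = 10 + x
    // The final map keeps the computed value while appending the log entry.
    Vector(s"Calculated value ${output}").tell.map(_ => output)
  }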

Parallel Aggregate is not working on lists .length > 8

I'm writing a small exercise app that counts the occurrences of each letter (including Unicode) in a seq of strings, and I'm using aggregate for it, as I'm trying to run it in parallel.
here's my code:
class Frequency(seq: Seq[String]) {
  type FreqMap = Map[Char, Int]

  def calculate() = {
    val freqMap: FreqMap = Map[Char, Int]()
    val pattern = "(\\p{L}+)".r

    val seqop: (FreqMap, String) => FreqMap = (fm, s) => {
      s.toLowerCase().foldLeft(freqMap) { (fm, c) =>
        c match {
          case pattern(char) => fm.get(char) match {
            case None => fm + ((char, 1))
            case Some(i) => fm.updated(char, i + 1)
          }
          case _ => fm
        }
      }
    }

    val reduce: (FreqMap, FreqMap) => FreqMap =
      (m1, m2) => {
        m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }
      }

    seq.par.aggregate(freqMap)(seqop, reduce)
  }
}
and then the code that makes use of it:
object Frequency extends App {
  val text = List("abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc")

  def frequency(seq: Seq[String]): Map[Char, Int] = {
    new Frequency(seq).calculate()
  }

  Console println frequency(seq = text)
}
Though I supplied "abc" 9 times, the result is Map(a -> 8, b -> 8, c -> 8), as it is for any number of "abc"s > 8.
I've looked at this, and it seems like I'm using aggregate correctly
Any suggestions to make it work?
You're discarding the already collected results (the first fm) in your seqop. You need to add them to the new results you're computing, e.g. like this:
def calculate() = {
  val freqMap: FreqMap = Map[Char, Int]()
  val pattern = "(\\p{L}+)".r

  val reduce: (FreqMap, FreqMap) => FreqMap =
    (m1, m2) => {
      m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }
    }

  val seqop: (FreqMap, String) => FreqMap = (fm, s) => {
    val res = s.toLowerCase().foldLeft(freqMap) { (fm, c) =>
      c match {
        case pattern(char) => fm.get(char) match {
          case None => fm + ((char, 1))
          case Some(i) => fm.updated(char, i + 1)
        }
        case _ => fm
      }
    }
    // I'm reusing your existing combinator function here:
    reduce(res, fm)
  }

  seq.par.aggregate(freqMap)(seqop, reduce)
}
Depending on how the parallel collections divide the work, you discard some of it. In your case (9x "abc") the work is split into 8 parallel seqop chains, which means you discard exactly one result set. This varies with the numbers: if you run it with, say, 17x "abc", it runs in 13 parallel chains, discarding 4 result sets (on my machine anyway - I'm not familiar with the underlying code and how it divides the work; this probably depends on the ExecutionContext/thread pool used and subsequently the number of CPUs/cores and so on).
Generally, parallel collections are a drop-in replacement for sequential collections, meaning that if you drop .par you should still get the same result, albeit usually more slowly. If you do this with your original code you get counts of 1 (Map(a -> 1, b -> 1, c -> 1)), which tells you that it's not a parallelization problem. This is a good way to test whether you're doing the right thing when using these.
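A hedged sketch of that sanity check, reusing the freqMap/seqop/reduce names from the snippet above:
// Sequential and parallel aggregation should agree if seqop/reduce are correct.
val sequential = seq.aggregate(freqMap)(seqop, reduce) // note: no .par
val parallel = seq.par.aggregate(freqMap)(seqop, reduce)
assert(sequential == parallel, "seqop/reduce lose or double-count partial results")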
And last but not least: this was harder to spot than usual for me because you use the same variable name twice, subsequently shadowing fm. Not doing that would make the code more readable and mistakes such as this one easier to spot.

How do I flatten a nested For Comprehension that uses I/O?

I am having trouble flattening a nested For Generator into a single For Generator.
I created MapSerializer to save and load Maps.
Listing of MapSerializer.scala:
import java.io.{ObjectInputStream, ObjectOutputStream}

object MapSerializer {
  def loadMap(in: ObjectInputStream): Map[String, IndexedSeq[Int]] =
    (for (_ <- 1 to in.readInt()) yield {
      val key = in.readUTF()
      for (_ <- 1 to in.readInt()) yield {
        val value = in.readInt()
        (key, value)
      }
    }).flatten.groupBy(_._1).mapValues(_.map(_._2))

  def saveMap(out: ObjectOutputStream, map: Map[String, Seq[Int]]) {
    out.writeInt(map.size)
    for ((key, values) <- map) {
      out.writeUTF(key)
      out.writeInt(values.size)
      values.foreach(out.writeInt(_))
    }
  }
}
Modifying loadMap to assign key within the generator causes it to fail:
def loadMap(in: ObjectInputStream): Map[String, IndexedSeq[Int]] =
  (for (_ <- 1 to in.readInt();
        key = in.readUTF()) yield {
    for (_ <- 1 to in.readInt()) yield {
      val value = in.readInt()
      (key, value)
    }
  }).flatten.groupBy(_._1).mapValues(_.map(_._2))
Here is the stacktrace I get:
java.io.UTFDataFormatException
at java.io.ObjectInputStream$BlockDataInputStream.readWholeUTFSpan(ObjectInputStream.java)
at java.io.ObjectInputStream$BlockDataInputStream.readOpUTFSpan(ObjectInputStream.java)
at java.io.ObjectInputStream$BlockDataInputStream.readWholeUTFSpan(ObjectInputStream.java)
at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java)
at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2819)
at java.io.ObjectInputStream.readUTF(ObjectInputStream.java:1050)
at MapSerializer$$anonfun$loadMap$1.apply(MapSerializer.scala:8)
at MapSerializer$$anonfun$loadMap$1.apply(MapSerializer.scala:7)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
at scala.collection.immutable.Range.foreach(Range.scala:76)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:194)
at scala.collection.immutable.Range.map(Range.scala:43)
at MapSerializer$.loadMap(MapSerializer.scala:7)
I would like to flatten the loading code to a single For Comprehension, but I get errors that suggest that it is either executing in a different order or repeating steps I am not expecting it to repeat.
Why is it that moving the assignment of key into the generator causes it to fail?
Can I flatten this into a single generator? If so, what would that generator be?
Thank you for the self-contained, compiling code in your question. I don't think you want to flatten the loops, as the structure is not flat; you then need to use groupBy to recover the structure. Also, if you had "zero" -> Seq() as an element of the map, it would be lost. Using this simple map avoids the groupBy and preserves the elements mapped to empty sequences:
def loadMap(in: ObjectInputStream): Map[String, IndexedSeq[Int]] = {
  val size = in.readInt
  (1 to size).map { _ =>
    val key = in.readUTF
    val nval = in.readInt
    key -> (1 to nval).map(_ => in.readInt)
  }(collection.breakOut)
}
I use breakOut to generate the right type, as otherwise I think the compiler complains about a mismatch between the generic Map and the immutable Map. You can also use Map() ++ (...).
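That alternative would look like this (a sketch with the same body as above; loadMap2 is just an illustrative name):
def loadMap2(in: ObjectInputStream): Map[String, IndexedSeq[Int]] = {
  val size = in.readInt
  // Prepending Map() ++ converts the sequence of pairs to an immutable Map.
  Map() ++ (1 to size).map { _ =>
    val key = in.readUTF
    val nval = in.readInt
    key -> (1 to nval).map(_ => in.readInt)
  }
}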
Note: I arrived at this solution after being confused by your for loop and starting to rewrite it using flatMap and map:
val tuples = (1 to size).flatMap { _ =>
  val key = in.readUTF
  println("key " + key)
  val nval = in.readInt
  (1 to nval).map(_ => key -> in.readInt)
}
I think something happens in the for loop when you don't use some of the generators. I thought this would be equivalent to:
val tuples = for {
  _ <- 1 to size
  key = in.readUTF
  nval = in.readInt
  _ <- 1 to nval
  value = in.readInt
} yield { key -> value }
But this is not the case, so I think I'm missing something in the translation.
Edit: figured out what's wrong with the single for loop. Short story: the translation of definitions within for loops causes the key = in.readUTF statement to be evaluated consecutively before the inner loop is executed. To work around this, use view and force:
val tuples = (for {
  _ <- (1 to size).view
  key = in.readUTF
  nval = in.readInt
  _ <- 1 to nval
  value = in.readInt
} yield { key -> value }).force
The issue can be demonstrated more clearly with this piece of code:
val iter = Iterator.from(1)
val tuple = for {
  _ <- 1 to 3
  outer = iter.next
  _ <- 1 to 3
  inner = iter.next
} yield (outer, inner)
It returns Vector((1,4), (1,5), (1,6), (2,7), (2,8), (2,9), (3,10), (3,11), (3,12)), which shows that all outer values are evaluated before the inner values. This is due to the fact that it is more or less translated to something like:
for {
  (i, outer) <- for (i <- (1 to 3)) yield (i, iter.next)
  _ <- 1 to 3
  inner = iter.next
} yield (outer, inner)
This computes all the outer iter.next values first. Going back to the original use case, all the in.readUTF calls would happen consecutively before the in.readInt calls.
Here is the compacted version of @huynhjl's answer that I eventually deployed:
def loadMap(in: ObjectInputStream): Map[String, IndexedSeq[Int]] =
  (1 to in.readInt()).map { _ =>
    in.readUTF() -> (1 to in.readInt()).map(_ => in.readInt())
  }(collection.breakOut)
The advantage of this version is that there are no direct assignments.

Tune Nested Loop in Scala

I was wondering if I can tune the following Scala code:
def removeDuplicates(listOfTuple: List[(Class1, Class2)]): List[(Class1, Class2)] = {
  var listNoDuplicates: List[(Class1, Class2)] = Nil
  for (outerIndex <- 0 until listOfTuple.size) {
    if (outerIndex != listOfTuple.size - 1)
      for (innerIndex <- outerIndex + 1 until listOfTuple.size) {
        if (listOfTuple(outerIndex)._1.flag.equals(listOfTuple(innerIndex)._1.flag))
          listNoDuplicates = listOfTuple(outerIndex) :: listNoDuplicates
      }
  }
  listNoDuplicates
}
Usually, if you have something looking like:
var accumulator: A = new A
for (b <- collection) {
  accumulator = update(accumulator, b)
}
val result = accumulator
it can be converted into something like:
val result = collection.foldLeft( new A ){ (acc,b) => update( acc, b ) }
So here we can first use a map to force the uniqueness of flags. Supposing the flag has type F:
val result = listOfTuples.foldLeft( Map[F,(ClassA,ClassB)]() ){
  ( map, tuple ) => map + ( tuple._1.flag -> tuple )
}
Then the remaining tuples can be extracted from the map and converted to a list:
val uniqList = result.values.toList
It will keep the last tuple encountered; if you want to keep the first one, replace foldLeft with foldRight and swap the arguments of the lambda.
Example:
case class ClassA( flag: Int )
case class ClassB( value: Int )
val listOfTuples =
  List( (ClassA(1),ClassB(2)), (ClassA(3),ClassB(4)), (ClassA(1),ClassB(-1)) )
val result = listOfTuples.foldRight( Map[Int,(ClassA,ClassB)]() ) {
  ( tuple, map ) => map + ( tuple._1.flag -> tuple )
}
val uniqList = result.values.toList
//uniqList: List((ClassA(1),ClassB(2)), (ClassA(3),ClassB(4)))
Edit: If you need to retain the order of the initial list, use instead:
val uniqList = listOfTuples.filter( result.values.toSet )
This compiles, but as I can't test it, it's hard to say whether it does "The Right Thing" (tm):
def removeDuplicates(listOfTuple: List[(Class1, Class2)]): List[(Class1, Class2)] =
  (for {
    outerIndex <- 0 until listOfTuple.size
    if outerIndex != listOfTuple.size - 1
    innerIndex <- outerIndex + 1 until listOfTuple.size
    if listOfTuple(outerIndex)._1.flag == listOfTuple(innerIndex)._1.flag
  } yield listOfTuple(outerIndex)).reverse.toList
Note that you can use == instead of equals (use eq if you need reference equality).
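A quick illustration of the difference:
val s1 = new String("a")
val s2 = new String("a")
s1 == s2 // true: value equality (== delegates to equals, with null-safety)
s1 eq s2 // false: reference equality; these are two distinct objects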
BTW: https://codereview.stackexchange.com/ is better suited for this type of question.
Do not use indexed access with lists (like listOfTuple(outerIndex)): indexing into a List has very poor performance, since each lookup is O(n). So, some ways...
The easiest:
import scala.collection.immutable.SortedSet

def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] =
  SortedSet(listOfTuple: _*)(Ordering by (_._1.flag)).toList
This will preserve the last element of the list. If you want it to preserve the first element, pass listOfTuple.reverse instead. Because of the sorting, performance is, at best, O(n log n). So, here's a faster way, using a mutable HashSet:
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] = {
  // Produce a hash set to find the duplicates
  import scala.collection.mutable.HashSet
  val seen = HashSet[Flag]()
  // now fold
  listOfTuple.foldLeft(Nil: List[(Class1,Class2)]) {
    case (acc, el) =>
      val result = if (seen(el._1.flag)) acc else el :: acc
      seen += el._1.flag
      result
  }.reverse
}
One can avoid using a mutable HashSet in two ways:
1. Make seen a var, so that it can be updated.
2. Pass the set along with the list being created in the fold. The case then becomes:
case ((seen, acc), el) =>
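A minimal sketch of that second option (Flag stands for whatever type the flag field has, as in the HashSet version above):
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] =
  listOfTuple.foldLeft((Set.empty[Flag], Nil: List[(Class1,Class2)])) {
    case ((seen, acc), el) =>
      // Keep the element only the first time its flag is seen.
      if (seen(el._1.flag)) (seen, acc)
      else (seen + el._1.flag, el :: acc)
  }._2.reverse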