save initial state of array and reset in scala

save initial state of array and reset in scala - scala

I am wondering what is the best way to save the initial state of an Array in scala so I can reset with initial values after manipulating a working copy. I would like to do something like this
val initialValue = Array(Array(1,2,3),Array(4,5,6))
val workingCopy = initialValue.clone
The problem is that when I change values of workingCopy, I also change the values of initialValue.
I also tried
val workingCopy = Array.fill(2,3)(0)
Array.copy(initialValue,0,workingCopy,2)
But I get the same result.
This holds even if i use var instead of val when defining the arrays. I think this shallow copy behavior might be caused by the nested Array structure, but I'm not sure how to deal with it.

As pointed out by Angelo, you usually want to use immutable data structures to avoid problems like this. However, if you really need to go the mutable way, e.g. for performance reasons (although the "modification" of immutable collections, such as Vector, is not as expensive as you might think), then you need to do a deep copy of your nested arrays and the content. The implementation is up to you.
If it really is just an Array[Array[Int]], it's enough to do something like this:
val initialValue = Array(Array(1,2,3), Array(4,5,6))
val workingCopy = initialValue.map(_.clone)
A more general example using a reference type instead of simple Ints:
scala> class Cell(var x: String) { def copy = new Cell(x); override def toString = x }
defined class Cell
scala> val initialValue = Array(Array(new Cell("foo")))
initialValue: Array[Array[Cell]] = Array(Array(foo))
scala> val workingCopy = initialValue.map(_.map(_.copy))
workingCopy: Array[Array[Cell]] = Array(Array(foo))
scala> initialValue(0)(0).x = "bar"
initialValue(0)(0).x: String = bar
scala> initialValue
res0: Array[Array[Cell]] = Array(Array(bar))
scala> workingCopy
res1: Array[Array[Cell]] = Array(Array(foo))
Instead of initialValue.map(_.map(_.copy)), there are of course other ways to do the same thing (e.g. a nested for expression which copies the objects as its side effect).

var vs val really just determines whether the reference is immutable, it would have no effect on the contents of the data structure. if you were to convert to an immutable data structure, for example a List, then the copy would be unneeded as the operations for "modifying" the contents of such a structure create a new reference leaving the original untouched.
for example:
val initialValue = List(List(1,2,3),List(4,5,6))
val sumInnerLists = initialValue.map(l => l.foldLeft(0)(_ + _))
println(sumInnerLists) // List(6, 15)
println(initialValue) // List(List(1,2,3),List(4,5,6))

Related

Scala ListBuffer add-all behavior is not consistent, some elements are lost

I came across a Scala collection behavior that somewhat dubious. Is this an expected behavior?
Following is a simplified code to reproduce the issue.
import scala.collection.mutable.{ Map => MutableMap }
import scala.collection.mutable.ListBuffer
val relationCache = MutableMap.empty[String, String]
val relationsToFlush = new ListBuffer[String]()
def addRelation(relation: String) = relationCache(relation) = relation
Range(0,170).map("string-#" + _).foreach(addRelation(_))
val relations = relationCache.values.toSeq /* Bad */
// val relations = relationCache.map(_._2).toSeq /* Good */
relationCache.clear
relationsToFlush ++= relations
relationsToFlush.size
Has two collections, mutable map (relationCache) and mutable list (relationsToFlush). relationCache takes elements and at later point it should be transferred to relationsToFlush and the cache should be cleared up.
However, not all elements transferred to relationsToFlush, output as below:
scala> relationsToFlush ++= relCache
res14: relationsToFlush.type = ListBuffer(string-#80, string-#27)
scala> relationsToFlush.size
res15: Int = 2
Where else if the code changed to
val relations = relationCache.map(_._2).toSeq /* Good */
Then we get the expected result (170 elements)
My guess is 'good' code creates new mutable list with those element while the other returns directly from map, hence its lost when clear is called on map. However, shouldn't the reference count gets bumped up when it returns to relations variable?
Scala Version: 2.11

You've stumbled across one of the vagaries of the Seq trait.
Since Seq is a trait, and not a class, it's not really a collection type distinct from the others, leading some to refer to it as a failed abstraction.
Consider the following REPL session. (Scala 2.12.7)
scala> List(1,2,3).toSeq
res4: scala.collection.immutable.Seq[Int] = List(1, 2, 3)
scala> Stream(1,2,3).toSeq
res5: scala.collection.immutable.Seq[Int] = Stream(1, ?)
scala> Vector(1,2,3).toSeq
res6: scala.collection.immutable.Seq[Int] = Vector(1, 2, 3)
Notice how the underlying collection type is retained. In particular the lazy Stream has not realized all its elements. That's what's happening here:
relationCache.values.toSeq
The toSeq transformation returns a Stream and nothing thereafter is forcing the realization of the rest of the elements.

The problem is combining lazy evaluation with mutable data structures.
The value of relations is lazy and is not computed until it is used. Since it is based on a mutable.Map collection, the results will be based on whatever that mutable.Map has at the time relations is first used.
To complicate things, relations is actually a Stream which means that those values are locked the first time they are read, meaning the subsequent changes to the mutable.Map will not affect that value of relations.
The simple fix is to use toList rather than toSeq, because List is not a lazy collection and will be evaluated immediately.

Fastest way to append sequence objects in loop

I have a for loop within which I get an Seq[Seq[(String,Int)]] for every run. I have the usual way of running through the Seq[Seq[(String,Int)]] to get every Seq[(String,Int)] and then append it to a ListBuffer[Seq[String,Int]].
Here is the following code:
var lis; //Seq[Seq[Tuple2(String,Int)]]
var matches = new ListBuffer[(String,Int)]
someLoop.foreach(k=>
// someLoop gives lis object on evry run,
// and that needs to be added to matches list
lis.foreach(j => matches.appendAll(j))
)
Is there better way to do this process without running through Seq[Seq[String,Int]] loop, say directly adding all the seq objects from the Seq to the ListBuffer?
I tried the ++ operator, by adding matches and lis directly. It didn't work either. I use Scala 2.10.2

Try this:
matches.appendAll(lis.flatten)
This way you can avoid the mutable ListBuffer at all. lis.flatten will be the Seq[(String, Int)]. So you can shorten your code like this:
val lis = ... //whatever that is Seq[Seq[(String, Int)]]
val flatLis = lis.flatten // Seq[(String, Int)]
Avoid var's and mutable structures like ListBuffer as much as you can

You don't need to append to an empty ListBuffer, just create it directly:
import collection.breakOut
val matches: ListBuffer[(String,Int)] =
lis.flatten(breakOut)
breakOut is the magic here. Calling flatten on a Seq[Seq[T]] would usually create a Seq[T] that you'd then have to convert to a ListBuffer. Using breakOut causes it to look at the expected output type and build that kind of collection instead.
Of course... You were only using ListBuffer for mutability anyway, so a Seq[T] is probably exactly what you really want. In which case, just let the inferencer do its thing:
val matches = lis.flatten

Converting mutable collection to immutable

I'm looking for a best way of converting a collection.mutable.Seq[T] to collection.immutable.Seq[T].

If you want to convert ListBuffer into a List, use .toList. I mention this because that particular conversion is performed in constant time. Note, though, that any further use of the ListBuffer will result in its contents being copied first.
Otherwise, you can do collection.immutable.Seq(xs: _*), assuming xs is mutable, as you are unlikely to get better performance any other way.

As specified:
def convert[T](sq: collection.mutable.Seq[T]): collection.immutable.Seq[T] =
collection.immutable.Seq[T](sq:_*)
Addition
The native methods are a little tricky to use. They are already defined on scala.collection.Seq and you’ll have to take a close look whether they return a collection.immutable or a collection.mutable. For example .toSeq returns a collection.Seq which makes no guarantees about mutability. .toIndexedSeq however, returns a collection.immutable.IndexedSeq so it seems to be fine to use. I’m not sure though, if this is really the intended behaviour as there is also a collection.mutable.IndexedSeq.
The safest approach would be to convert it manually to the intended collection as shown above. When using a native conversion, I think it is best practice to add a type annotation including (mutable/immutable) to ensure the correct collection is returned.

toList (or toStream if you want it lazy) are the preferred way if you want a LinearSeq, as you can be sure what you get back is immutable (because List and Stream are). There's no toVector method if you want an immutable IndexedSeq, but it seems that toIndexedSeq gives you a Vector (which is immutable) most if not all of the time.
Another way is to use breakOut. This will look at the type you're aiming for in your return type, and if possible oblige you. e.g.
scala> val ms = collection.mutable.Seq(1,2,3)
ms: scala.collection.mutable.Seq[Int] = ArrayBuffer(1, 2, 3)
scala> val r: List[Int] = ms.map(identity)(collection.breakOut)
r: List[Int] = List(1, 2, 3)
scala> val r: collection.immutable.Seq[Int] = ms.map(identity)(collection.breakOut)
r: scala.collection.immutable.Seq[Int] = Vector(1, 2, 3)
For more info on such black magic, get some strong coffee and see this question.

If you are also working with Set and Map you can also try these, using TreeSet as an example.
import scala.collection.mutable
val immutableSet = TreeSet(blue, green, red, yellow)
//converting a immutable set to a mutable set
val mutableSet = mutable.Set.empty ++= immutableSet
//converting a mutable set back to immutable set
val anotherImmutableSet = Set.empty ++ mutableSet
The above example is from book Programming in Scala

Scala collection of elements accessible by name

I have some Scala code roughly analogous to this:
object Foo {
val thingA = ...
val thingB = ...
val thingC = ...
val thingD = ...
val thingE = ...
val thingsOfAKind = List(thingA, thingC, thingE)
val thingsOfADifferentKind = List(thingB, thingD)
val allThings = thingsOfAKind ::: thingsOfADifferentKind
}
Is there some nicer way of declaring a bunch of things and being able to access them both individually by name and collectively?
The issue I have with the code above is that the real version has almost 30 different things, and there's no way to actually ensure that each new thing I add is also added to an appropriate list (or that allThings doesn't end up with duplicates, although that's relatively easy to fix).
The various things are able to be treated in aggregate by almost all of the code base, but there are a few places and a couple of things where the individual identity matters.
I thought about just using a Map, but then the compiler looses the ability to check that the individual things being looked up actually exist (and I have to either wrap code to handle a failed lookup around every attempt, or ignore the problem and effectively risk null pointer exceptions).
I could make the kind each thing belongs to an observable property of the things, then I would at least have a single list of all things and could get lists of each of the kinds with filter, but the core issue remains that I would ideally like to be able to declare that a thing exists, has a name (identifier), and is part of a collection.
What I effectively want is something like a compile-time Map. Is there a good way of achieving something like this in Scala?

How about this type of pattern?
class Things[A] {
var all: List[A] = Nil
def ->: (x: A): A = { all = x :: all; x }
}
object Test {
val things1 = new Things[String]
val things2 = new Things[String]
val thingA = "A" ->: things1
val thingB = "B" ->: things2
val thingC = "C" ->: things1
val thingD = ("D" ->: things1) ->: things2
}
You could also add a little sugar, making Things automatically convertible to List,
object Things {
implicit def thingsToList[A](things: Things[A]): List[A] = things.all
}
I can't think of a way to do this without the var that has equally nice syntax.

Scala do I have to convert to a Seq when creating a collection from an iterable?

Perhaps I am barking up the wrong tree (again) but if it is normal practice to have a property typed as a scala.collection.immutable.Set[A], then how would you create one of these given a scala.Iterable[A]? For example:
class ScalaClass {
private var s: scala.collection.immutable.Set[String]
def init(): Unit = {
val i = new scala.collection.mutable.HashSet[String]
//ADD SOME STUFF TO i
s = scala.collection.immutable.Set(i) //DOESN'T WORK
s = scala.collection.immutable.Set(i toSeq : _ *) //THIS WORKS
}
}
Can someone explain why it is necessary to create the immutable set via a Seq (or if it is not, then how do I do it)?

Basically because you're creating the immutable Set through the "canonical factory method" apply in Set's companion object, which takes a sequence, or "varargs" (as in Set(a,b,c)). See this:
http://scala-tools.org/scaladocs/scala-library/2.7.1/scala/collection/immutable/Set$object.html
I don't think there is another to do it in the standard library.

It will make an immutable copy:
scala> val mu = new scala.collection.mutable.HashSet[String]
mu: scala.collection.mutable.HashSet[String] = Set()
scala> val im = mu.clone.readOnly
im: scala.collection.Set[String] = ro-Set()