Copy constructor of BitSet (or other collections) in Scala - scala

I need to create a new instance of BitSet class from another BitSet object (input).
I expected something like new BitSet(input), but found nothing of the kind. I can get a new instance with the map() method as follows, but I don't think this is the best solution.
var r = input.map(_ + 0)(BitSet.canBuildFrom)
What's the copy constructor of BitSet? What's the general rule for copy constructors in Scala?

You can create another with the bitmask of the first:
var r = new BitSet(input.toBitMask)

I think the general rule is to use immutable collections. They are, well, immutable, so you can pass them around freely without taking special care to copy them.
When you need mutable collections, however, copying becomes useful. I discovered that using the standard to method works:
scala> mutable.Set(1, 2, 3)
res0: scala.collection.mutable.Set[Int] = Set(1, 2, 3)
scala> res0.to[mutable.Set]
res1: scala.collection.mutable.Set[Int] = Set(1, 2, 3)
scala> res0 eq res1
res2: Boolean = false
However, it won't work with BitSet because it is not a generic collection, and to needs a type constructor as its generic parameter. For BitSet you can use the method suggested by Lee. BTW, it is intended exactly for scala.collection.mutable.BitSet, because scala.collection.immutable.BitSet does not have such a constructor (nor does it need one).
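The to[mutable.Set] syntax above belongs to the pre-2.13 collections (2.13 takes the target as a value argument instead). As a minimal sketch of a copy that should not depend on the version, you can add all elements to a fresh empty collection. The names MutableSetCopy, original and copy are only illustrative:

```scala
import scala.collection.mutable

object MutableSetCopy {
  val original = mutable.Set(1, 2, 3)

  // ++= adds every element of `original` to a brand-new, empty set,
  // so `copy` is a separate instance with the same contents.
  val copy = mutable.Set.empty[Int] ++= original

  // Mutating the original afterwards does not affect the copy.
  original += 4
}
```

This only copies the collection itself. The elements are shared, which does not matter for Int but would for mutable element types.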

The "copy" method on collections is called clone (to be consistent with Java style).
scala> collection.mutable.BitSet(1,2,3)
res0: scala.collection.mutable.BitSet = BitSet(1, 2, 3)
scala> res0.clone
res1: scala.collection.mutable.BitSet = BitSet(1, 2, 3)
scala> res0 += 4
res2: res0.type = BitSet(1, 2, 3, 4)
scala> res1
res40: scala.collection.mutable.BitSet = BitSet(1, 2, 3)
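To put the two BitSet suggestions from this thread side by side, here is a small sketch. It assumes a mutable BitSet as input and uses BitSet.fromBitMask (the companion-object counterpart of the constructor shown earlier). The object and value names are made up for the example:

```scala
import scala.collection.mutable

object BitSetCopy {
  val input = mutable.BitSet(1, 2, 3)

  // clone() returns an independent mutable.BitSet with the same bits.
  val viaClone = input.clone()

  // toBitMask hands back a fresh Array[Long]; fromBitMask builds a new
  // BitSet from it, so this copy does not share state with `input` either.
  val viaMask = mutable.BitSet.fromBitMask(input.toBitMask)

  // Changing the original leaves both copies untouched.
  input += 4
}
```

Both copies still contain only 1, 2 and 3 after input has been modified.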

Related

How to efficiently delete all elements from ListBuffer in Scala?

I have a ListBuffer with a thousand elements. After the program has done its calculations I want to fill it with new data. Is there a way, like free() in C, to empty it? Or is it fine to assign null to my ListBuffer and let the garbage collector do all the work?
The method clear does just that.
scala> val xs = scala.collection.mutable.ListBuffer(1,2,3,4,5)
xs: scala.collection.mutable.ListBuffer[Int] = ListBuffer(1, 2, 3, 4, 5)
scala> xs.clear()
scala> xs
res2: scala.collection.mutable.ListBuffer[Int] = ListBuffer()
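For the "fill it with new data" part of the question, a rough sketch of reusing the same buffer after clear() could look like this (ReuseBuffer and the sample values are just placeholders):

```scala
import scala.collection.mutable.ListBuffer

object ReuseBuffer {
  val xs = ListBuffer(1, 2, 3, 4, 5)

  // clear() empties the buffer in place; the old elements become eligible
  // for garbage collection once nothing else refers to them.
  xs.clear()
  val emptyAfterClear = xs.isEmpty

  // The same buffer instance can now be refilled with new data.
  xs ++= Seq(10, 20, 30)
}
```

There is no need to assign null: the reference stays valid and only the contents change.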

Index with Many Indices

Is there a quick Scala idiom to retrieve multiple elements of a traversable using indices?
I am looking for something like
val L = (1 to 4).toList
L(List(1,2)) //doesn't work
I have been using map so far, but I am wondering if there is a more "Scala" way:
List(1,2) map {L(_)}
Thanks in advance
Since a List is a Function you can write just
List(1,2) map L
Although, if you're going to be looking things up by index, you should probably use an IndexedSeq like Vector instead of a List.
You could add an implicit class that adds the functionality:
implicit class RichIndexedSeq[T](seq: IndexedSeq[T]) {
def apply(i0: Int, i1: Int, is: Int*): Seq[T] = (i0+:i1+:is) map seq
}
You can then use the sequence's apply method with one index or multiple indices:
scala> val data = Vector(1,2,3,4,5)
data: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5)
scala> data(0)
res0: Int = 1
scala> data(0,2,4)
res1: Seq[Int] = ArrayBuffer(1, 3, 5)
You can do it with a for comprehension but it's no clearer than the code you have using map.
scala> val indices = List(1,2)
indices: List[Int] = List(1, 2)
scala> for (index <- indices) yield L(index)
res0: List[Int] = List(2, 3)
I think the most readable would be to implement your own function takeIndices(indices: List[Int]) that takes a list of indices and returns the values of a given List at those indices. e.g.
L.takeIndices(List(1,2))
List[Int] = List(2,3)
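takeIndices is not a standard library method, so it would have to be added yourself. One possible sketch, using an implicit class (the names IndexOps and TakeIndicesDemo are invented here):

```scala
object TakeIndicesDemo {
  // Hypothetical extension: adds takeIndices to any Seq.
  implicit class IndexOps[A](private val xs: Seq[A]) {
    // A Seq[A] is also a function Int => A, so it can be passed to map directly.
    def takeIndices(indices: Seq[Int]): Seq[A] = indices.map(xs)
  }

  val L = Vector(1, 2, 3, 4)
  val picked = L.takeIndices(List(1, 2))
}
```

An out-of-range index throws IndexOutOfBoundsException, just like a plain L(i) lookup. On a List each lookup is linear, which is why the Vector suggestion above still applies.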

Mutable or Immutable Set? [duplicate]

Consider the following:
scala> val myset = Set(1,2)
myset: scala.collection.immutable.Set[Int] = Set(1, 2)
scala> myset += 3
<console>:9: error: reassignment to val
myset += 3
^
scala> var myset = Set(1,2)
myset: scala.collection.immutable.Set[Int] = Set(1, 2)
scala> myset += 3
scala> myset
res47: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
In one case I can add 3 and in the other I can't.
I know the difference between val and var. What confuses me is that in both cases Scala tells me that the Set is immutable, yet it allows different behaviour.
I don't want an answer that just says it's because in one case myset is a val and in the other it is a var. I want a deeper answer that explains why in both cases Scala says myset is an immutable set and yet treats the two differently, because that is counterintuitive.
For example, is there any difference between using a mutable set and declaring an immutable set as a var?
And why does Scala let you bend the rules? Would it not be better if immutable really meant immutable?
Thanks.
First of all let's translate the += call
val myset = Set(1,2)
myset = myset + 3 //this is what the compiler does for myset += 3
This means that we're actually calling the + method on Set, whose scaladoc says
Creates a new set with an additional element, unless the element is already present.
What the code is trying to do is therefore to create a new Set with the added element and reassign it to the immutable reference myset.
Now if we change the reference to a mutable one (using var), you can reassign it to the newly made, immutable Set(1,2,3).
The original Set(1, 2) is still immutable and no rule is broken. Let's illustrate this with an example:
var myset = Set(1,2)
val holdIt = Some(myset)
myset += 3
println(myset) // will print Set(1, 2, 3)
println(holdIt)// will print Some(Set(1, 2))
As you can see, the original Set, which is captured by the Option, was never changed; we just reassigned its variable reference to a newly created Set.
When you use an immutable Set, a new Set is created and reassigned to the var, which is not possible when using a val. A mutable Set, on the other hand, can be mutated in place. While the following is not a perfectly safe demonstration, it still shows that with an immutable Set a new object is created, whereas the mutable Set instance stays the same:
scala> var set1 = Set(1, 2)
set1: scala.collection.immutable.Set[Int] = Set(1, 2)
scala> System.identityHashCode(set1)
res0: Int = 2119337193
scala> set1 += 3
scala> System.identityHashCode(set1)
res2: Int = 1866831260
scala> var set2 = collection.mutable.Set(1, 2)
set2: scala.collection.mutable.Set[Int] = Set(2, 1)
scala> System.identityHashCode(set2)
res3: Int = 594308521
scala> set2 += 3
res4: scala.collection.mutable.Set[Int] = Set(2, 1, 3)
scala> System.identityHashCode(set2)
res5: Int = 594308521
Consider myset as the pointer and Set(1,2) as the object.
In the case of val, the myset pointer is assigned only once and cannot point to a different object (i.e. the pointer is immutable), whereas
in the case of var, myset can be assigned any number of times and can point to different objects (i.e. the pointer is mutable).
In both cases the object Set(1,2) is immutable.
When you do
scala> val myset = Set(1,2)
myset: scala.collection.immutable.Set[Int] = Set(1, 2)
scala> myset += 3
<console>:9: error: reassignment to val
myset += 3
^
myset += 3 is actually treated as myset = myset + 3, and myset + 3 gives you a new Set that also contains 3.
Since the myset pointer is immutable, you cannot reassign it to the new object returned by the + method.
Whereas in the case of
scala> var myset = Set(1,2)
myset: scala.collection.immutable.Set[Int] = Set(1, 2)
scala> myset += 3
Since the myset pointer is mutable, you can reassign it to the new object returned by the + method.
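The difference between a var holding an immutable Set and a val holding a mutable Set shows up as soon as a second reference to the same object exists. A minimal sketch (all names are made up for the illustration):

```scala
import scala.collection.mutable

object SetAliasing {
  // var + immutable Set: += rebinds the variable to a NEW set.
  var immutableSet = Set(1, 2)
  val aliasOfImmutable = immutableSet
  immutableSet += 3 // same as: immutableSet = immutableSet + 3

  // val + mutable Set: += changes the one existing set in place.
  val mutableSet = mutable.Set(1, 2)
  val aliasOfMutable = mutableSet
  mutableSet += 3
}
```

Here aliasOfImmutable still sees Set(1, 2), while aliasOfMutable sees the added 3, because it refers to the very same object as mutableSet.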

Scala: Yielding from one type of collection to another

Concerning the yield command in Scala and the following example:
val values = Set(1, 2, 3)
val results = for {v <- values} yield (v * 2)
Can anyone explain how Scala knows which type of collection to yield into? I know it is based on values, but how would I go about writing code that replicates yield?
Is there any way for me to change the type of the collection to yield into? In the example I want results to be of type List instead of Set.
Failing this, what is the best way to convert from one collection to another? I know about : _*, but as a Set is not a Seq this does not work. The best I could find thus far is val listResults = List() ++ results.
P.S. I know the example does not follow the recommended functional way (which would be to use map), but it is just an example.
For comprehensions are translated by the compiler to map/flatMap/filter calls using this scheme.
This excellent answer by Daniel answers your first question.
To change the type of the result collection, you can use collection.breakOut (also explained in the post I linked above).
scala> val xs = Set(1, 2, 3)
xs: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> val ys: List[Int] = (for(x <- xs) yield 2 * x)(collection.breakOut)
ys: List[Int] = List(2, 4, 6)
You can convert a Set to a List in one of the following ways:
scala> List.empty[Int] ++ xs
res0: List[Int] = List(1, 2, 3)
scala> xs.toList
res1: List[Int] = List(1, 2, 3)
Recommended read: The Architecture of Scala Collections
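To make the translation of for comprehensions mentioned above concrete, here is a small sketch showing a few comprehensions next to the method calls they correspond to (ForDesugar and the value names are illustrative only):

```scala
object ForDesugar {
  val values = Set(1, 2, 3)

  // A single generator with yield becomes a map call.
  val viaFor = for (v <- values) yield v * 2
  val viaMap = values.map(v => v * 2)

  // A guard becomes withFilter before the map.
  val filteredFor = for (v <- values if v > 1) yield v * 2
  val filteredMap = values.withFilter(v => v > 1).map(v => v * 2)

  // Several generators become flatMap around an inner map.
  val nestedFor = for (a <- values; b <- values) yield a * b
  val nestedFlatMap = values.flatMap(a => values.map(b => a * b))
}
```

Because map is called on a Set, the result is a Set again, which is how the comprehension "knows" which collection type to produce.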
If you use map/flatMap/filter instead of for comprehensions, you can use scala.collection.breakOut to create a different type of collection:
scala> val result:List[Int] = values.map(2 * _)(scala.collection.breakOut)
result: List[Int] = List(2, 4, 6)
If you wanted to build your own collection classes (which is the closest thing to "replicating yield" that makes any sense to me), you should have a look at this tutorial.
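Note that breakOut was removed in Scala 2.13, so it is only available in older versions. A sketch that should not depend on the version converts before or while mapping (YieldIntoList and the value names are made up for the example):

```scala
object YieldIntoList {
  val values = Set(1, 2, 3)

  // Convert first, then map: the result is a List from the start.
  val viaToList: List[Int] = values.toList.map(_ * 2)

  // Or map lazily over an iterator and materialise the result once.
  val viaIterator: List[Int] = values.iterator.map(_ * 2).toList

  // The comprehension itself can also range over the converted collection.
  val viaFor: List[Int] = for (v <- values.toList) yield v * 2
}
```

One difference from mapping the Set and converting afterwards: mapping a Set removes duplicate results, whereas the variants here keep one result per input element.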
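Try this (the parentheses are needed so that toList applies to the whole comprehension rather than to v * 2):
val values = Set(1, 2, 3)
val results = (for {v <- values} yield v * 2).toList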

Scala: Can I rely on the order of items in a Set?

This was quite an unpleasant surprise:
scala> Set(1, 2, 3, 4, 5)
res18: scala.collection.immutable.Set[Int] = Set(4, 5, 1, 2, 3)
scala> Set(1, 2, 3, 4, 5).toList
res25: List[Int] = List(5, 1, 2, 3, 4)
The example by itself suggests a "no" answer to my question. Then what about ListSet?
scala> import scala.collection.immutable.ListSet
scala> ListSet(1, 2, 3, 4, 5)
res21: scala.collection.immutable.ListSet[Int] = Set(1, 2, 3, 4, 5)
This one seems to work, but should I rely on this behavior?
What other data structure is suitable for an immutable collection of unique items, where the original order must be preserved?
By the way, I do know about the distinct method in List. The problem is, I want to enforce uniqueness of items (while preserving the order) at the interface level, so using distinct would mess up my neat design.
EDIT
ListSet doesn't seem very reliable either:
scala> ListSet(1, 2, 3, 4, 5).toList
res28: List[Int] = List(5, 4, 3, 2, 1)
EDIT2
In my search for a perfect design I tried this:
scala> class MyList[A](list: List[A]) { val values = list.distinct }
scala> implicit def toMyList[A](l: List[A]) = new MyList(l)
scala> implicit def fromMyList[A](l: MyList[A]) = l.values
Which actually works:
scala> val l1: MyList[Int] = List(1, 2, 3)
scala> l1.values
res0: List[Int] = List(1, 2, 3)
scala> val l2: List[Int] = new MyList(List(1, 2, 3))
l2: List[Int] = List(1, 2, 3)
The problem, however, is that I do not want to expose MyList outside the library. Is there any way to have the implicit conversion when overriding? For example:
trait T { def l: MyList[_] }
object O extends T { val l: MyList[_] = List(1, 2, 3) }
scala> O.l mkString(" ") // Let's test the implicit conversion
res7: String = 1 2 3
I'd like to do it like this:
object O extends T { val l = List(1, 2, 3) } // Doesn't work
That depends on the Set you are using. If you do not know which Set implementation you have, then the answer is simply no, you cannot be sure. In practice I usually encounter the following three cases:
You need the items in the Set to be sorted. For this, use a class mixing in the SortedSet trait, which, when you use only the standard Scala API, is always a TreeSet. It guarantees that the elements are ordered according to their compareTo method (see the Ordered trait). You pay a (very) small performance penalty for the sorting, as the runtime of inserts and retrievals is now logarithmic rather than (almost) constant as with a HashSet (assuming a good hash function).
You need to preserve the order in which the items are inserted. Then use a LinkedHashSet. It is practically as fast as a normal HashSet but needs a little more storage space for the additional links between elements.
You do not care about order in the Set. Then use a HashSet. (That is the default when using the Set.apply method, as in your first example.)
All this applies to Java as well: Java has TreeSet, LinkedHashSet and HashSet, and the corresponding interfaces SortedSet, Comparable and plain Set.
It is my belief that you should never rely on the order in a set. In no language.
Apart from that, have a look at this question which talks about this in depth.
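The first two cases can be sketched with standard library classes. Note that the LinkedHashSet in the Scala standard library is the mutable one (scala.collection.mutable.LinkedHashSet). SetOrdering and the value names are only for illustration:

```scala
import scala.collection.immutable.TreeSet
import scala.collection.mutable

object SetOrdering {
  // TreeSet keeps its elements sorted by the implicit Ordering[Int].
  val sorted: List[Int] = TreeSet(3, 1, 2).toList

  // LinkedHashSet iterates in insertion order; re-adding an existing
  // element (the second 3) does not change its position.
  val insertionOrder: List[Int] = mutable.LinkedHashSet(3, 1, 2, 3).toList
}
```

Here sorted is List(1, 2, 3) and insertionOrder is List(3, 1, 2).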
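In the Scala versions this question is about (before 2.12), ListSet will always return elements in the reverse order of insertion because it is backed by a List, and the optimal way of adding elements to a List is by prepending them. (Since Scala 2.12, ListSet iterates in insertion order.)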
Immutable data structures are problematic if you want first in, first out (a queue). You can get O(logn) or amortized O(1). Given the apparent need to build the set and then produce an iterator out of it (ie, you'll first put all elements, then you'll remove all elements), I don't see any way to amortize it.
You can rely on a ListSet always returning elements in last in, first out order (a stack). If that suffices, then go for it.