The documentation for '--' says: Creates a new $coll from this $coll by removing all elements of another collection.
The documentation for '&~' says: The difference of this set and another set.
I get the same result with both operators.
So is there any difference between the two? For example, do they differ in time complexity or memory usage?
According to the Scaladoc, -- accepts any kind of IterableOnce, whereas &~ only accepts another Set.
Arguably, if you have two sets, you should prefer &~, since an operation that knows both operands are sets is more likely to be optimized.
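As a small illustration of the difference in accepted argument types, here is a minimal sketch (the object and value names are made up for the example):

```scala
object SetDifferenceDemo {
  val base: Set[Int] = Set(1, 2, 3, 4)

  // -- accepts any collection of elements, here a List
  val viaMinusMinus: Set[Int] = base -- List(3, 4)

  // &~ (an alias of diff) requires another Set
  val viaDiff: Set[Int] = base &~ Set(3, 4)

  // base &~ List(3, 4) would not compile: a List is not a Set
}
```

Both values hold the same set, so the choice mostly comes down to what kind of collection you have on the right-hand side.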
I have to append elements to my collection. Which structure is preferable? Appending to List costs O(n); what about ListBuffer, ArrayBuffer, Set, Map and other structures?
ListBuffer, according to the docs:
It provides constant time prepend and append.
But it is a mutable structure, so be careful when using it, preferably only in a very limited scope (e.g. inside a function or method).
ArrayBuffer according to the documentation:
Prepends and removes are linear in the buffer size.
Because this structure is built on top of a dynamic array, an append occasionally requires copying the internal array into a larger one. On the JVM this copy is fast, so appends are almost, but not exactly, constant time (amortized constant). See the System.arraycopy documentation for more details. It is also a mutable structure.
Set and Map are not list-like at all. A Set is an unordered structure (a list IS ordered) that contains ONLY unique elements. A Map[K, V] stores, as the name suggests, a mapping from keys of type K to values of type V.
So, in conclusion: if you need to append elements, I'd suggest going with ListBuffer, but since it is a mutable structure, limit the scope of its usage, and whenever you need to pass it somewhere, convert it to a List.
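A minimal sketch of that pattern, assuming a hypothetical helper squaresUpTo that builds its result by appending:

```scala
import scala.collection.mutable.ListBuffer

object AppendDemo {
  // Hypothetical helper: collects the squares of 1..n by appending.
  def squaresUpTo(n: Int): List[Int] = {
    val buffer = ListBuffer.empty[Int]  // mutable, but never leaves this method
    for (i <- 1 to n) buffer += i * i   // constant-time append
    buffer.toList                       // hand out an immutable List
  }
}
```

The buffer stays local to the method, so callers only ever see the immutable List.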
I have an immutable list and need a new copy of it with elements replaced at multiple index locations. List.updated is an O(n) operation and can only replace one element at a time. What is an efficient way of doing this? Thanks!
List is not a good fit if you need random element access/update. From the documentation:
This class is optimal for last-in-first-out (LIFO), stack-like access patterns. If you need another access pattern, for example, random access or FIFO, consider using a collection more suited to this than List.
More generally, what you need is an indexed sequence instead of a linear one (such as List). From the documentation of IndexedSeq:
Indexed sequences support constant-time or near constant-time element access and length computation. They are defined in terms of abstract methods apply for indexing and length.
Indexed sequences do not add any new methods to Seq, but promise efficient implementations of random access patterns.
The default concrete implementation of IndexedSeq is Vector, so you may consider using it.
Here's an extract from its documentation:
Vector is a general-purpose, immutable data structure. It provides random access and updates in effectively constant time, as well as very fast append and prepend. Because vectors strike a good balance between fast random selections and fast random functional updates, they are currently the default implementation of immutable indexed sequences
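To make this concrete, here is a rough sketch of replacing several positions in a Vector. The helper updatedAll and the replacement map are assumptions made for the example, not part of the standard library:

```scala
object VectorUpdateDemo {
  // Hypothetical helper: apply every (index -> newValue) replacement to a Vector.
  // Vector.updated throws IndexOutOfBoundsException for an invalid index.
  def updatedAll[A](xs: Vector[A], replacements: Map[Int, A]): Vector[A] =
    replacements.foldLeft(xs) { case (acc, (index, value)) =>
      acc.updated(index, value) // effectively constant time per update
    }

  val original: Vector[String] = Vector("a", "b", "c", "d")
  val changed: Vector[String] = updatedAll(original, Map(1 -> "B", 3 -> "D"))
}
```

The original Vector is left untouched; each update returns a new Vector that shares most of its structure with the previous one.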
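// One O(n) pass over the list that replaces all positions at once.
// Note that zipWithIndex yields (element, index) pairs, in that order.
list
  .iterator
  .zipWithIndex
  .map { case (element, index) => newElementFor(index) }
  .toList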
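If only some positions change, the same single pass can fall back to the existing element. The following is a sketch under that assumption; replaceAt and the replacement map are hypothetical names:

```scala
object ListReplaceDemo {
  // Hypothetical helper: replace the elements at the given indices in one pass.
  def replaceAt[A](list: List[A], replacements: Map[Int, A]): List[A] =
    list.iterator.zipWithIndex.map { case (element, index) =>
      replacements.getOrElse(index, element) // keep the element unless replaced
    }.toList

  val result: List[Int] = replaceAt(List(10, 20, 30, 40), Map(0 -> 11, 2 -> 33))
}
```

This costs one traversal of the list regardless of how many indices are replaced, instead of one O(n) List.updated call per index.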
Below are the descriptions of both data structures (from the book Programming in Scala):
Linked lists
Linked lists are mutable sequences that consist of nodes
that are linked with next pointers. In most languages null would be
picked as the empty linked list. That does not work for Scala
collections, because even empty sequences must support all sequence
methods. LinkedList.empty.isEmpty, in particular, should return true
and not throw a NullPointerException. Empty linked lists are encoded
instead in a special way: Their next field points back to the node
itself. Like their immutable cousins, linked lists are best operated
on sequentially. In addition, linked lists make it easy to insert an
element or linked list into another linked list.
Mutable lists
A MutableList consists of a single linked list together with a pointer
that refers to the terminal empty node of that list. This makes list
append a constant time operation because it avoids having to
traverse the list in search for its terminal node. MutableList is
currently the standard implementation of mutable.LinearSeq in Scala.
The main difference is the additional pointer to the last element in MutableList.
The question is: when would one prefer LinkedList over MutableList? Isn't MutableList functionally equivalent (apart from the new pointer) and even more practical, at the cost of a tiny amount of extra memory (the pointer to the last element)?
Since MutableList wraps a LinkedList, most operations involve an extra indirection step. Note that wrapping means it contains an internal variable referring to a LinkedList (in fact two, because of the efficient lookup of the last element). So the linked list is a required building block for realising the mutable list.
If you do not need prepend or lookup of the last element, you could thus just go for the LinkedList. Scala offers you a large choice of data structures, so it is best to first make a checklist of all the operations that you require (and their preferred efficiency), then choose the best fit.
Generally, I recommend using immutable structures: they are often as efficient as the mutable ones and don't cause problems with concurrency.
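The tail-pointer idea from the quoted passage can be sketched roughly as follows. This is not the library's implementation: Node and TailList are hypothetical names, and null marks the end of the chain here for brevity, unlike the self-pointing empty node the book describes.

```scala
object TailPointerDemo {
  // Hypothetical node type; null marks the end of the chain in this sketch.
  final class Node[A](val value: A, var next: Node[A])

  // Hypothetical wrapper: a linked chain plus a pointer to its last node.
  final class TailList[A] {
    private var first: Node[A] = null
    private var last: Node[A] = null

    // Constant time: no traversal is needed to find the end.
    def append(value: A): Unit = {
      val node = new Node[A](value, null)
      if (first == null) first = node else last.next = node
      last = node
    }

    // Walks the chain from the first node and copies it into a List.
    def toList: List[A] = {
      val out = List.newBuilder[A]
      var cur = first
      while (cur != null) { out += cur.value; cur = cur.next }
      out.result()
    }
  }
}
```

Without the last pointer, append would have to walk the whole chain first, which is the extra cost a bare linked list pays.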
The apidoc of distinct in SeqLike says:
Builds a new sequence from this sequence without any duplicate elements.
Returns: A new sequence which contains the first occurrence of every element of this sequence.
Am I right that no ordering guarantee is provided? More generally, do methods of SeqLike provide any process-in-order (and return-in-order) guarantee?
On the contrary: operations on Seqs guarantee the output order (unless the API says otherwise). This is one of the basic properties of sequences, where the order matters, versus sets, where only containment matters.
It depends on the collection you were using in the first place. If you had a list you'll get your order. If on the other hand you had a set, then probably not.
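A quick check on a List shows the first-occurrence order being kept (the names here are only for illustration):

```scala
object DistinctDemo {
  val input: List[Int] = List(3, 1, 3, 2, 1)

  // Keeps the first occurrence of each element, in the original order.
  val unique: List[Int] = input.distinct
}
```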
The other day I was wondering why scala.collection.Map defines its unzip method as
def unzip [A1, A2] (implicit asPair: ((A, B)) ⇒ (A1, A2)): (Iterable[A1], Iterable[A2])
Since the method returns "only" a pair of Iterable instead of a pair of Seq, it is not guaranteed that the key/value pairs in the original map occur at matching indices in the returned sequences, since Iterable doesn't guarantee the order of traversal. So if I had a
Map((1,A), (2,B))
then after calling
Map((1,A), (2,B)) unzip
I might end up with
... = (List(1, 2),List(A, B))
just as well as with
... = (List(2, 1),List(B, A))
While I can imagine storage-related reasons behind this (think of HashMaps, for example) I wonder what you guys think about this behavior. It might appear to users of the Map.unzip method that the items were returned in the same pair order (and I bet this is probably almost always the case) yet since there's no guarantee this might in turn yield hard-to-find bugs in the library user's code.
Maybe that behavior should be expressed more explicitly in the accompanying scaladoc?
EDIT: Please note that I'm not referring to maps as ordered collections. I'm only interested in "matching" sequences after unzip, i.e. for
val (keys, values) = someMap.unzip
it holds for all i that (keys(i), values(i)) is an element of the original mapping.
Actually, the examples you gave will not occur. The Map will always be unzipped in a pair-wise fashion. Your statement that Iterable does not guarantee the ordering is not entirely true. It is more accurate to say that any given Iterable does not have to guarantee the ordering, but this is dependent on implementation. In the case of Map.unzip, the ordering of pairs is not guaranteed, but items in the pairs will not change the way they are matched up -- that matching is a fundamental property of the Map. You can read the source of GenericTraversableTemplate to verify this is the case.
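One way to convince yourself is to zip the two results back together and compare them with the original entries. This is a minimal sketch; the object and value names are made up:

```scala
object UnzipDemo {
  val original: Map[Int, String] = Map(1 -> "A", 2 -> "B", 3 -> "C")

  // Both Iterables come from the same traversal of the map's pairs.
  val (keys, values) = original.unzip

  // Re-pairing position by position recovers exactly the original entries.
  val rebuilt: Set[(Int, String)] = keys.zip(values).toSet
}
```

The comparison uses sets, so it checks only that the pairs still match up, not the order in which they were traversed.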
If you expand unzip's description, you'll get the answer:
definition classes: GenericTraversableTemplate
In other words, it didn't get specialized for Map.
Your argument is sound, though, and I daresay you might get your wish if you open an enhancement ticket with your reasoning. Especially if you go ahead and produce a patch as well -- if nothing else, at least you'll learn a lot more about Scala collections in doing so.
Maps, generally, do not have a natural sequence: they are unordered collections. The fact your keys happen to have a natural order does not change the general case.
(However I am at a loss to explain why Map has a zipWithIndex method. This provides a counter-argument to my point. I guess it is there for consistency with other collections and that, although it provides indices, they are not guaranteed to be the same on subsequent calls.)
If you use a LinkedHashMap or LinkedHashSet, the iterators are supposed to return the pairs in the original order of insertion. With other HashMaps, yeah, you have no control. Retaining the original order of insertion is quite useful in UI contexts: it allows you to re-sort tables on any column in a Web application without changing types, for instance.
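A short sketch of the insertion-order behaviour with the mutable LinkedHashMap (the keys and values are arbitrary):

```scala
import scala.collection.mutable

object InsertionOrderDemo {
  val linked: mutable.LinkedHashMap[String, Int] =
    mutable.LinkedHashMap("b" -> 2, "a" -> 1, "c" -> 3)

  linked("d") = 4 // a later insertion goes to the end of the iteration order

  // Iteration follows insertion order, not key order.
  val keysInOrder: List[String] = linked.keys.toList
}
```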