I am looking for a class in scala collections which allows me to traverse to next and previous element of a list of items.
For example:
val container = SomeClassFromScala(Int,Double,classOf[String],7)
container.getPreviousItem(Double) => Option[Int]
container.getNextItem(7) => None
Is there any class in Scala collections with this api and constant time for getNext/getPrevious.
I can write the code but I wanted to see if there is anything I can use right away.
If you want to have an immutable collection with your requirements, you can have a look at Zipper in scalaz:
Provides a pointed stream, which is a non-empty zipper-like stream structure that tracks an index (focus)
position in a stream. Focus can be moved forward and backwards through the stream, elements can be inserted
before or after the focused position, and the focused item can be deleted.
All operations are constant time. Though the constant is larger as one would expect from something that wraps an array (and doesn't allow insertion/deletion of elements), as it involves object creation.
Implementation is basically by having two lists (Streams, whatever), where one holds the previous, reversed elements. Moving is done by swapping over the head element from one list to the other.
Have a look at DoubleLinkedList. It adds a prev that gives you a list with the previous item as its head.
import collection.mutable.DoubleLinkedList
val a = DoubleLinkedList(1,2,3,4)
val b = a.next.next // DoubleLinkedList(3, 4)
val c = b.prev // DoubleLinkedList(2, 3, 4)
Downsides: It's not constant time, and it's mutable.
Related
Can someone help me to create immutable doubly linked list in Scala by showing one of the method implementation? Also, can you provide me one of example of it by let's implementing prepend(element:Int): DoublyLinkedList[Int] ?
Complementing the already linked video with a link to a text version and includinging important quotes:
This is described at Immutable Doubly Linked Lists in Scala with Call-By-Name and Lazy Values
A doubly-linked list doesn’t offer too much benefit in Scala, quite the contrary.
So this data structure is pretty useless from a practical standpoint, which is why ... there is no doubly-linked list in the Scala standard collection library
All that said, a doubly-linked list is an excellent exercise on the mental model of immutable data structures.
Think about it: this operation in Java would have been done in 3 steps:
new node
new node’s prev is 1
1’s next is the new node
If we can execute that new 1 ⇆ 2 in the same expression, we’ve made it; we can then recursively apply this operation on the rest of the list. Strangely enough, it’s possible.
class DLCons[+T](override val value: T, p: => DLList[T], n: => DLList[T]) extends DLList[T] {
// ... other methods
override def updatePrev[S >: T](newPrev: => DLList[S]) = {
lazy val result: DLCons[S] = new DLCons(value, newPrev, n.updatePrev(result))
result
}
}
Look at updatePrev: we’re creating a new node, whose next reference immediately points to a recursive call; the recursive call is on the current next node, which will update its own previous reference to the result we’re still in the process of defining! This forward reference is only possible in the presence of lazy values and the by-name arguments.
Call By Name & lazy val (Call by need) makes it possible to have immutable doubly linked list.
I was following a Scala video tutorial and he mentioned prepend :: takes constant time and append :+ time increases with length of list. And, also he mentioned most of the time reversing the list prepending and re-reversing the list gives better performance than appending.
Question 1
Why prepend :: takes constant time and append :+ time increases with length of list?
But reason for that is not mentioned in the tutorial and I tried in google. I didn’t find the answer but I found another surprising thing.
Question 2
ListBuffer takes constant time for both append and prepend. If possible why it wasnt implemented in List?
Obvious there would be reason behind! Appreciate if someone could explain.
Answer 1:
List is implemented as Linked list. The reference you hold is to it's head.
e.g. if you have a list of 4 elements (1 to 4) it will be:
[1]->[2]->[3]->[4]->//
Prepending meaning adding new element to the head and return the new head:
[5]->[1]->[2]->[3]->[4]->//
The reference to the old head [1] still valid and from it's point of view there are still 4 elements.
On the other hand, appending meaning adding element to the end of the list.
Since List is immutable, we can't just add it to the end, but we need to clone the entire List:
[1']->[2']->[3']->[4']->[5]->//
Since clone mean copy the entire list in the same order, we need to iterate over each element and append it.
Answer 2:
ListBuffer is mutable collection, changing it will change all the references.
Ad. 1. The list in Scala is defined (simplifying) as a head and a tail. The tail is also a list. Adding an element to the head means creation a new list with a new head and the existing list as a new tail. The existing list is not changed. This is why it is a constant time operation.
Appending to a list needs rebuilding the existing list, which cannot be done in constant time.
Ad. 2. ListBuffer is a mutable collection. It may be more efficient in some applications, but on the other hand immutable collections are thread-safe and easily scalable.
This SO answer describes how scala.collection.breakOut can be used to prevent creating wasteful intermediate collections. For example, here we create an intermediate Seq[(String,String)]:
val m = List("A", "B", "C").map(x => x -> x).toMap
By using breakOut we can prevent the creation of this intermediate Seq:
val m: Map[String,String] = List("A", "B", "C").map(x => x -> x)(breakOut)
Views solve the same problem and in addition access elements lazily:
val m = (List("A", "B", "C").view map (x => x -> x)).toMap
I am assuming the creation of the View wrappers is fairly cheap, so my question is: Is there any real reason to use breakOut over Views?
You're going to make a trip from England to France.
With view: you're taking a set of notes in your notebook and boom, once you've called .force() you start making all of them: buy a ticket, board on the plane, ....
With breakOut: you're departing and boom, you in the Paris looking at the Eiffel tower. You don't remember how exactly you've arrived there, but you did this trip actually, just didn't make any memories.
Bad analogy, but I hope this give you a taste of what is the difference between them.
I don't think views and breakOut are identical.
A breakOut is a CanBuildFrom implementation used to simplify transformation operations by eliminating intermediary steps. E.g get from A to B without the intermediary collection. A breakOut means letting Scala choose the appropriate builder object for maximum efficiency of producing new items in a given scenario. More details here.
views deal with a different type of efficiency, the main sale pitch being: "No more new objects". Views store light references to objects to tackle different usage scenarios: lazy access etc.
Bottom line:
If you map on a view you may still get an intermediary collection of references created before the expected result can be produced. You could still have superior performance from:
collection.view.map(somefn)(breakOut)
Than from:
collection.view.map(someFn)
As of Scala 2.13, this is no longer a concern. Breakout has been removed and views are the recommended replacement.
Scala 2.13 Collections Rework
Views are also the recommended replacement for collection.breakOut.
For example,
val s: Seq[Int] = ...
val set: Set[String] = s.map(_.toString)(collection.breakOut)
can be expressed with the same performance characteristics as:
val s: Seq[Int] = ...
val set = s.view.map(_.toString).to(Set)
What flavian said.
One use case for views is to conserve memory. For example, if you had a million-character-long string original, and needed to use, one by one, all of the million suffixes of that string, you might use a collection of
val v = original.view
val suffixes = v.tails
views on the original string. Then you might loop over the suffixes one by one, using suffix.force() to convert them back to strings within the loop, thus only holding one in memory at a time. Of course, you could do the same thing by iterating with your own loop over the indices of the original string, rather than creating any kind of collection of the suffixes.
Another use-case is when creation of the derived objects is expensive, you need them in a collection (say, as values in a map), but you only will access a few, and you don't know which ones.
If you really have a case where picking between them makes sense, prefer breakOut unless there's a good argument for using view (like those above).
Views require more code changes and care than breakOut, in that you need to add force() where needed. Depending on context, failure to do so is
often only detected at run-time. With breakOut, generally if it
compiles, it's right.
In cases where view does not apply, breakOut
will be faster, since view generation and forcing is skipped.
If you use a debugger, you can inspect the collection contents, which you
can't meaningfully do with a collection of views.
My present use case is pretty trivial, either mutable or immutable Map will do the trick.
Have a method that takes an immutable Map, which then calls a 3rd party API method that takes an immutable Map as well
def doFoo(foo: String = "default", params: Map[String, Any] = Map()) {
val newMap =
if(someCondition) params + ("foo" -> foo) else params
api.doSomething(newMap)
}
The Map in question will generally be quite small, at most there might be an embedded List of case class instances, a few thousand entries max. So, again, assume little impact in going immutable in this case (i.e. having essentially 2 instances of the Map via the newMap val copy).
Still, it nags me a bit, copying the map just to get a new map with a few k->v entries tacked onto it.
I could go mutable and params.put("bar", bar), etc. for the entries I want to tack on, and then params.toMap to convert to immutable for the api call, that is an option. but then I have to import and pass around mutable maps, which is a bit of hassle compared to going with Scala's default immutable Map.
So, what are the general guidelines for when it is justified/good practice to use mutable Map over immutable Maps?
Thanks
EDIT
so, it appears that an add operation on an immutable map takes near constant time, confirming #dhg's and #Nicolas's assertion that a full copy is not made, which solves the problem for the concrete case presented.
Depending on the immutable Map implementation, adding a few entries may not actually copy the entire original Map. This is one of the advantages to the immutable data structure approach: Scala will try to get away with copying as little as possible.
This kind of behavior is easiest to see with a List. If I have a val a = List(1,2,3), then that list is stored in memory. However, if I prepend an additional element like val b = 0 :: a, I do get a new 4-element List back, but Scala did not copy the orignal list a. Instead, we just created one new link, called it b, and gave it a pointer to the existing List a.
You can envision strategies like this for other kinds of collections as well. For example, if I add one element to a Map, the collection could simply wrap the existing map, falling back to it when needed, all while providing an API as if it were a single Map.
Using a mutable object is not bad in itself, it becomes bad in a functional programming environment, where you try to avoid side-effects by keeping functions pure and objects immutable.
However, if you create a mutable object inside a function and modify this object, the function is still pure if you don't release a reference to this object outside the function. It is acceptable to have code like:
def buildVector( x: Double, y: Double, z: Double ): Vector[Double] = {
val ary = Array.ofDim[Double]( 3 )
ary( 0 ) = x
ary( 1 ) = y
ary( 2 ) = z
ary.toVector
}
Now, I think this approach is useful/recommended in two cases: (1) Performance, if creating and modifying an immutable object is a bottleneck of your whole application; (2) Code readability, because sometimes it's easier to modify a complex object in place (rather than resorting to lenses, zippers, etc.)
In addition to dhg's answer, you can take a look to the performance of the scala collections. If an add/remove operation doesn't take a linear time, it must do something else than just simply copying the entire structure. (Note that the converse is not true: it's not beacuase it takes linear time that your copying the whole structure)
I like to use collections.maps as the declared parameter types (input or return values) rather than mutable or immutable maps. The Collections maps are immutable interfaces that work for both types of implementations. A consumer method using a map really doesn't need to know about a map implementation or how it was constructed. (It's really none of its business anyway).
If you go with the approach of hiding a map's particular construction (be it mutable or immutable) from the consumers who use it then you're still getting an essentially immutable map downstream. And by using collection.Map as an immutable interface you completely remove all the ".toMap" inefficiency that you would have with consumers written to use immutable.Map typed objects. Having to convert a completely constructed map into another one simply to comply to an interface not supported by the first one really is absolutely unnecessary overhead when you think about it.
I suspect in a few years from now we'll look back at the three separate sets of interfaces (mutable maps, immutable maps, and collections maps) and realize that 99% of the time only 2 are really needed (mutable and collections) and that using the (unfortunately) default immutable map interface really adds a lot of unnecessary overhead for the "Scalable Language".
What is the difference between Scala's MutableList and ListBuffer classes in scala.collection.mutable? When would you use one vs the other?
My use case is having a linear sequence where I can efficiently remove the first element, prepend, and append. What's the best structure for this?
A little explanation on how they work.
ListBuffer uses internally Nil and :: to build an immutable List and allows constant-time removal of the first and last elements. To do so, it keeps a pointer on the first and last element of the list, and is actually allowed to change the head and tail of the (otherwise immutable) :: class (nice trick allowed by the private[scala] var members of ::). Its toList method returns the normal immutable List in constant time as well, as it can directly return the structure maintained internally. It is also the default builder for immutable Lists (and thus can indeed be reasonably expected to have constant-time append). If you call toList and then again append an element to the buffer, it takes linear time with respect to the current number of elements in the buffer to recreate a new structure, as it must not mutate the exported list any more.
MutableList works internally with LinkedList instead, an (openly, not like ::) mutable linked list implementation which knows of its element and successor (like ::). MutableList also keeps pointers to the first and last element, but toList returns in linear time, as the resulting List is constructed from the LinkedList. Thus, it doesn't need to reinitialize the buffer after a List has been exported.
Given your requirements, I'd say ListBuffer and MutableList are equivalent. If you want to export their internal list at some point, then ask yourself where you want the overhead: when you export the list, and then no overhead if you go on mutating buffer (then go for MutableList), or only if you mutable the buffer again, and none at export time (then go for ListBuffer).
My guess is that in the 2.8 collection overhaul, MutableList predated ListBuffer and the whole Builder system. Actually, MutableList is predominantly useful from within the collection.mutable package: it has a private[mutable] def toLinkedList method which returns in constant time, and can thus efficiently be used as a delegated builder for all structures that maintain a LinkedList internally.
So I'd also recommend ListBuffer, as it may also get attention and optimization in the future than “purely mutable” structures like MutableList and LinkedList.
This gives you an overview of the performance characteristics: http://www.scala-lang.org/docu/files/collections-api/collections.html ; interestingly, MutableList and ListBuffer do not differ there. The documentation of MutableList says it is used internally as base class for Stack and Queue, so maybe ListBuffer is more the official class from the user perspective?
You want a list (why a list?) that is growable and shrinkable, and you want constant append and prepend. Well, Buffer, a trait, has constant append and prepend, with most other operations linear. I'm guessing that ListBuffer, a class that implements Buffer, has constant time removal of the first element.
So, my own recommendation is for ListBuffer.
First, lets go over some of the relevant types in Scala
List - An Immutable collection. A Recursive implementation i.e . i.e An instance of list has two primary elements the head and the tail, where the tail references another List.
List[T]
head: T
tail: List[T] //recursive
LinkedList - A mutable collection defined as a series of linked nodes, where each node contains a value and a pointer to the next node.
Node[T]
value: T
next: Node[T] //sequential
LinkedList[T]
first: Node[T]
List is a functional data structure (immutability) compared to LinkedList which is more standard in imperative languages.
Now, lets look at
ListBuffer - A mutable buffer implementation backed by a List.
MutableList - An implementation based on LinkedList ( Would have been more self explanatory if it had been named LinkedListBuffer instead )
They both offer similar complexity bounds on most operations.
However, if you request a List from a MutableList, then it has to convert the existing linear representation into the recursive representation which takes O(n) which is what #Jean-Philippe Pellet points out. But, if you request a Seq from MutableList the complexity is O(1).
So, IMO the choice narrows down to the specifics of your code and your preference. Though, I suspect there is a lot more List and ListBuffer out there.
Note that ListBuffer is final/sealed, while you can extend MutableList.
Depending on your application, extensibility may be useful.