Scala append and prepend method performance - scala

I was following a Scala video tutorial, and the presenter mentioned that prepending with :: takes constant time, while the time for appending with :+ grows with the length of the list. He also mentioned that, most of the time, reversing the list, prepending, and re-reversing it gives better performance than appending.
Question 1
Why does prepending with :: take constant time, while the time for appending with :+ grows with the length of the list?
The tutorial did not give the reason, and I searched on Google. I didn't find the answer, but I did find another surprising thing.
Question 2
ListBuffer takes constant time for both append and prepend. If that is possible, why wasn't List implemented the same way?
Obviously there must be a reason behind it. I'd appreciate it if someone could explain.

Answer 1:
List is implemented as a singly linked list. The reference you hold is to its head.
For example, a list of 4 elements (1 to 4) looks like this:
[1]->[2]->[3]->[4]->//
Prepending means adding a new element at the head and returning the new head:
[5]->[1]->[2]->[3]->[4]->//
The reference to the old head [1] is still valid, and from its point of view there are still 4 elements.
Appending, on the other hand, means adding an element to the end of the list.
Since List is immutable, we can't just attach it to the last node; we need to clone the entire List:
[1']->[2']->[3']->[4']->[5]->//
Since cloning means copying the entire list in the same order, we need to iterate over every element, which takes time proportional to the length of the list.
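The difference can be observed directly. The following is a minimal sketch (the object and value names are made up for illustration) that uses reference equality to show that a prepended list reuses the old list as its tail, while an appended list is a separate copy:

```scala
object PrependVsAppend {
  val xs: List[Int] = List(1, 2, 3, 4)

  // Prepending allocates one new cell whose tail is the existing list.
  val prepended: List[Int] = 5 :: xs

  // Appending has to build a new list containing copies of all cells of xs.
  val appended: List[Int] = xs :+ 5

  def main(args: Array[String]): Unit = {
    println(prepended.tail eq xs) // true: the old list is shared, not copied
    println(xs)                   // List(1, 2, 3, 4): the original is unchanged
    println(appended)             // List(1, 2, 3, 4, 5)
  }
}
```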
Answer 2:
ListBuffer is a mutable collection: changing it is visible through every reference to it.
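As a small illustration of what "mutable" means here (a sketch with made-up names, not taken from the tutorial), two references to the same ListBuffer both see an in-place append and prepend:

```scala
import scala.collection.mutable.ListBuffer

object SharedBuffer {
  val buffer: ListBuffer[Int] = ListBuffer(1, 2, 3)
  val alias: ListBuffer[Int] = buffer // a second reference to the same object

  buffer += 4       // appends in place
  buffer.prepend(0) // prepends in place

  def main(args: Array[String]): Unit = {
    println(alias)           // ListBuffer(0, 1, 2, 3, 4)
    println(alias eq buffer) // true: there is only one buffer
  }
}
```

This is the trade-off against List: the buffer can be updated cheaply at both ends, but nobody holding a reference can rely on its contents staying the same.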

Ad. 1. The list in Scala is defined (simplifying) as a head and a tail. The tail is also a list. Adding an element at the head means creating a new list with a new head and the existing list as its tail. The existing list is not changed. This is why it is a constant-time operation.
Appending to a list requires rebuilding the existing list, which takes time proportional to its length.
Ad. 2. ListBuffer is a mutable collection. It may be more efficient in some applications, but on the other hand immutable collections are thread-safe and easily scalable.

Related

Scala how to split a list at a certain index

If I have a list of ten elements in Scala, how can I create a new list that consists of only the elements of the previous list between two indexes? So if the original list was ten items long, the new one could be something like:
val N=Oldlist(0) to Oldlist(10)
Please do not suggest the splitAt method; that's not what I'm trying to do.
List has a slice(from, until) method, where from is inclusive and until is exclusive. You should probably use that. I thought it used structural sharing, but it doesn't (as discussed in the comments).
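A short sketch of slice on a ten-element list (the names oldList and middle are only for illustration):

```scala
object SliceExample {
  val oldList: List[Int] = (1 to 10).toList

  // Takes the elements at indices 2, 3 and 4; index 5 is excluded.
  val middle: List[Int] = oldList.slice(2, 5)

  def main(args: Array[String]): Unit = {
    println(middle)  // List(3, 4, 5)
    println(oldList) // the original list is unchanged
  }
}
```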
If I understand your question correctly, you can do:
val list = (oldList(0) to oldList(9)).toList
oldList(0) to oldList(9) creates a new Range from the values of the first and last elements (the last valid index of a ten-element list is 9), which is then converted to a List.

Immutable DataStructures In Scala

We know that Scala supports immutable data structures, i.e. each time you update a list, it creates a new object and reference on the heap.
Example
val xs: List[Int] = List(22)
val newList = xs :+ 33 // :+ appends a single element; ++ expects a collection
So when I append the second element to the list, it creates a new list containing both 22 and 33. This works exactly like immutable String in Java.
So the question is: each time I append an element to the list, a new object is created. This does not look efficient to me.
Are special data structures, such as persistent data structures, used to deal with this? Does anyone know about this?
Appending to a list has O(n) complexity and is inefficient. A general approach is to prepend to a list while building it, and finally reverse it.
Your question about creating a new object still applies to prepending, but there the cost is small: since xs is immutable, the new list can simply point to xs for the rest of its data, so only one new node is allocated.
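The prepend-then-reverse approach could look roughly like this. It is only a sketch; squares is a hypothetical helper chosen for illustration:

```scala
object BuildByPrepending {
  // Hypothetical helper: collects the squares of 1..n in ascending order.
  def squares(n: Int): List[Int] = {
    // Each step prepends in O(1), so the accumulator is built in reverse order.
    val reversed = (1 to n).foldLeft(List.empty[Int]) { (acc, i) =>
      (i * i) :: acc
    }
    // A single O(n) pass restores the intended order.
    reversed.reverse
  }

  def main(args: Array[String]): Unit = {
    println(squares(4)) // List(1, 4, 9, 16)
  }
}
```

Building n elements this way costs O(n) overall, whereas appending each element with :+ would copy the growing list every time and cost O(n^2).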
While @manojlds is correct in his analysis, the original post asked about the efficiency of duplicating list nodes whenever you do an operation.
As @manojlds said, constructing lists often requires thinking backwards, i.e., building a list and then reversing it. There are a number of other situations where list building requires "needless" copying.
To that end, there is a mutable data structure available in Scala called ListBuffer which you can use to build up your list and then extract the result as an immutable list:
import scala.collection.mutable.ListBuffer
val xsa = ListBuffer[Int](22)
xsa += 33 // constant-time append
val newList = xsa.toList // List(22, 33)
However, the fact that the list data structure is, in general, immutable means that you have some very useful tools to analyze, decompose and recompose the list. Many built-in operations take advantage of the immutability. By extension, your own programs can also take advantage of it.

Scala's mutable.LinkedList append

Quote from sources:
If this is empty then it does nothing and returns that.
There are some questions where authors ask how to append to a LinkedList, but I didn't find out why LinkedList is designed with this behavior.
And one more question: does Scala have any List with add/append (which changes this in O(1)) and map operations?
If you expand the documentation for append in the mutable LinkedList API doc, there is something more that at least explains the O(n) performance of append:
def append(that: LinkedList[A]): LinkedList[A]
If this is empty then it does nothing and returns that. Otherwise,
appends that to this. The append requires a full traversal of this.
append takes a second LinkedList (that) and appends it to the current one (this). If the current LinkedList is empty the result of appending a second LinkedList to an empty one is just the second LinkedList.
I may be misunderstanding your question, but I didn't think this could be controversial or require particular design decisions.
As for the performance characteristics of operations on Scala collections, I'm not sure if there's anything newer, but I've always pointed to this doc.

A Scala collection container with next and previous methods

I am looking for a class in the Scala collections which allows me to move to the next and previous element of a list of items.
For example:
val container = SomeClassFromScala(Int,Double,classOf[String],7)
container.getPreviousItem(Double) => Option[Int]
container.getNextItem(7) => None
Is there any class in the Scala collections with this API and constant time for getNext/getPrevious?
I can write the code but I wanted to see if there is anything I can use right away.
If you want to have an immutable collection with your requirements, you can have a look at Zipper in scalaz:
Provides a pointed stream, which is a non-empty zipper-like stream structure that tracks an index (focus)
position in a stream. Focus can be moved forward and backwards through the stream, elements can be inserted
before or after the focused position, and the focused item can be deleted.
All operations are constant time, though the constant is larger than one would expect from something that wraps an array (and doesn't allow insertion/deletion of elements), as it involves object creation.
Implementation is basically by having two lists (Streams, whatever), where one holds the previous, reversed elements. Moving is done by swapping over the head element from one list to the other.
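To make the two-list idea concrete, here is a minimal hand-written sketch. It is not the scalaz Zipper API; ListZipper, before, focus, after, next, previous and fromList are all names invented for this illustration:

```scala
// Hypothetical minimal zipper: `before` holds the earlier elements in reverse order.
final case class ListZipper[A](before: List[A], focus: A, after: List[A]) {
  // Moves the focus one step forward by pushing the old focus onto `before`.
  def next: Option[ListZipper[A]] = after match {
    case head :: tail => Some(ListZipper(focus :: before, head, tail))
    case Nil          => None
  }

  // Moves the focus one step back by pushing the old focus onto `after`.
  def previous: Option[ListZipper[A]] = before match {
    case head :: tail => Some(ListZipper(tail, head, focus :: after))
    case Nil          => None
  }
}

object ListZipper {
  // Starts with the focus on the first element; an empty list has no zipper.
  def fromList[A](xs: List[A]): Option[ListZipper[A]] = xs match {
    case head :: tail => Some(ListZipper(Nil, head, tail))
    case Nil          => None
  }
}

object ZipperDemo {
  def main(args: Array[String]): Unit = {
    val start  = ListZipper.fromList(List(1, 2, 3))
    val second = start.flatMap(_.next)
    println(second.map(_.focus))                      // Some(2)
    println(second.flatMap(_.previous).map(_.focus))  // Some(1)
    println(start.flatMap(_.previous))                // None: nothing before the first element
  }
}
```

Each move only takes the head off one list and prepends to the other, which is why both directions are constant time.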
Have a look at DoubleLinkedList. It adds a prev that gives you a list with the previous item as its head.
import collection.mutable.DoubleLinkedList
val a = DoubleLinkedList(1,2,3,4)
val b = a.next.next // DoubleLinkedList(3, 4)
val c = b.prev // DoubleLinkedList(2, 3, 4)
Downsides: It's not constant time, and it's mutable.

Difference between MutableList and ListBuffer

What is the difference between Scala's MutableList and ListBuffer classes in scala.collection.mutable? When would you use one vs the other?
My use case is having a linear sequence where I can efficiently remove the first element, prepend, and append. What's the best structure for this?
A little explanation on how they work.
ListBuffer internally uses Nil and :: to build an immutable List, and allows constant-time append and prepend as well as constant-time removal of the first element (removing the last element still needs a traversal, because the list is singly linked). To do so, it keeps a pointer to the first and last elements of the list, and is actually allowed to change the head and tail of the (otherwise immutable) :: class (a nice trick allowed by the private[scala] var members of ::). Its toList method returns the normal immutable List in constant time as well, as it can directly return the structure maintained internally. It is also the default builder for immutable Lists (and thus can indeed be reasonably expected to have constant-time append). If you call toList and then append another element to the buffer, it takes linear time with respect to the current number of elements in the buffer to recreate a new structure, as it must not mutate the exported list any more.
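The visible consequence of that last point can be checked with a small sketch (names are illustrative): a list exported with toList stays unchanged when the buffer is mutated afterwards.

```scala
import scala.collection.mutable.ListBuffer

object ExportThenMutate {
  val buffer: ListBuffer[Int] = ListBuffer(1, 2, 3)

  // Hands out the immutable List built so far.
  val exported: List[Int] = buffer.toList

  // Mutating the buffer afterwards must not affect the exported list.
  buffer += 4

  def main(args: Array[String]): Unit = {
    println(exported)      // List(1, 2, 3)
    println(buffer.toList) // List(1, 2, 3, 4)
  }
}
```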
MutableList works internally with LinkedList instead, an (openly, not like ::) mutable linked list implementation which knows of its element and successor (like ::). MutableList also keeps pointers to the first and last element, but toList returns in linear time, as the resulting List is constructed from the LinkedList. Thus, it doesn't need to reinitialize the buffer after a List has been exported.
Given your requirements, I'd say ListBuffer and MutableList are equivalent. If you want to export their internal list at some point, then ask yourself where you want the overhead: when you export the list, with no overhead if you go on mutating the buffer (then go for MutableList), or only if you mutate the buffer again, with none at export time (then go for ListBuffer).
My guess is that in the 2.8 collection overhaul, MutableList predated ListBuffer and the whole Builder system. Actually, MutableList is predominantly useful from within the collection.mutable package: it has a private[mutable] def toLinkedList method which returns in constant time, and can thus efficiently be used as a delegated builder for all structures that maintain a LinkedList internally.
So I'd also recommend ListBuffer, as it is more likely to get attention and optimization in the future than "purely mutable" structures like MutableList and LinkedList.
This gives you an overview of the performance characteristics: http://www.scala-lang.org/docu/files/collections-api/collections.html ; interestingly, MutableList and ListBuffer do not differ there. The documentation of MutableList says it is used internally as base class for Stack and Queue, so maybe ListBuffer is more the official class from the user perspective?
You want a list (why a list?) that is growable and shrinkable, and you want constant append and prepend. Well, Buffer, a trait, has constant append and prepend, with most other operations linear. I'm guessing that ListBuffer, a class that implements Buffer, has constant time removal of the first element.
So, my own recommendation is for ListBuffer.
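For the use case in the question (remove the first element, prepend, append), a ListBuffer-based sketch might look like this. The object and value names are made up for illustration:

```scala
import scala.collection.mutable.ListBuffer

object BufferAsDeque {
  val buffer: ListBuffer[Int] = ListBuffer(2, 3)

  buffer.prepend(1) // ListBuffer(1, 2, 3)
  buffer += 4       // ListBuffer(1, 2, 3, 4)

  // remove(0) drops the first element and returns it.
  val first: Int = buffer.remove(0)

  def main(args: Array[String]): Unit = {
    println(first)  // 1
    println(buffer) // ListBuffer(2, 3, 4)
  }
}
```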
First, let's go over some of the relevant types in Scala.
List - An immutable collection with a recursive implementation, i.e. an instance of List has two primary elements, the head and the tail, where the tail references another List.
List[T]
  head: T
  tail: List[T] // recursive
LinkedList - A mutable collection defined as a series of linked nodes, where each node contains a value and a pointer to the next node.
Node[T]
  value: T
  next: Node[T] // sequential
LinkedList[T]
  first: Node[T]
List is a functional data structure (immutability) compared to LinkedList which is more standard in imperative languages.
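The recursive head/tail shape of List is what makes pattern matching on it natural. As a minimal sketch (the length helper below is hypothetical and only for illustration; List already has its own length method):

```scala
import scala.annotation.tailrec

object RecursiveList {
  // Counts elements by following tail references until the empty list (Nil) is reached.
  @tailrec
  def length[A](xs: List[A], acc: Int = 0): Int = xs match {
    case Nil       => acc
    case _ :: tail => length(tail, acc + 1)
  }

  def main(args: Array[String]): Unit = {
    println(length(List("a", "b", "c"))) // 3
  }
}
```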
Now, let's look at:
ListBuffer - A mutable buffer implementation backed by a List.
MutableList - An implementation based on LinkedList ( Would have been more self explanatory if it had been named LinkedListBuffer instead )
They both offer similar complexity bounds on most operations.
However, if you request a List from a MutableList, it has to convert the existing linear representation into the recursive representation, which takes O(n), as @Jean-Philippe Pellet points out. But if you request a Seq from a MutableList, the complexity is O(1).
So, IMO the choice narrows down to the specifics of your code and your preference. Though, I suspect there is a lot more List and ListBuffer out there.
Note that ListBuffer is final/sealed, while you can extend MutableList.
Depending on your application, extensibility may be useful.