Disadvantages of immutable objects - Scala

I know that immutable objects offer several advantages over mutable ones: they are easier to reason about, they do not have complex state spaces that change over time, we can pass them around freely, they make safe hash table keys, and so on. So my question is: what are the disadvantages of immutable objects?

Quoting from Effective Java:
The only real disadvantage of immutable classes is that they require a
separate object for each distinct value. Creating these objects can be
costly, especially if they are large. For example, suppose that you
have a million-bit BigInteger and you want to change its low-order
bit:
BigInteger moby = ...;
moby = moby.flipBit(0);
The flipBit method
creates a new BigInteger instance, also a million bits long, that
differs from the original in only one bit. The operation requires time
and space proportional to the size of the BigInteger. Contrast this to
java.util.BitSet. Like BigInteger, BitSet represents an arbitrarily
long sequence of bits, but unlike BigInteger, BitSet is mutable. The
BitSet class provides a method that allows you to change the state of
a single bit of a million-bit instance in constant time.
Read the full discussion in Item 15: Minimize mutability.
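To make the contrast from the quote concrete, here is a minimal Scala sketch using only java.math.BigInteger and java.util.BitSet from the JDK:
import java.math.BigInteger
import java.util.BitSet

// Immutable: flipBit allocates a brand-new million-bit BigInteger.
val moby = BigInteger.ONE.shiftLeft(1000000)  // a number with roughly one million bits
val flipped = moby.flipBit(0)                 // O(n) time and space; moby itself is untouched

// Mutable: flip changes one bit of the existing BitSet in place.
val bits = new BitSet(1000000)
bits.flip(0)                                  // O(1), no new allocation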

Apart from possible performance drawbacks (possible! because with the complexity of GC and HotSpot optimisations, immutable structures are not necessarily slower) - one drawback can be that state must now be threaded through your whole application. For simple applications or tiny scripts the effort to maintain state this way might be too high to buy you concurrency safety.
For example, think of a GUI framework like Swing. It would definitely be possible to write a GUI framework entirely using immutable structures and one main "unsafe" outer loop, and I guess this has been done in Haskell. Some of the problems of maintaining nested immutable state can be addressed, for example, with lenses. But managing all the interactions (registering listeners etc.) may get quite involved, so you might instead want to introduce new abstractions such as functional-reactive or hybrid-reactive GUIs.
Basically you lose some of OO's encapsulation by going all immutable, and when this becomes a problem there are alternative approaches such as actors or STM.
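As an illustration of the nested-state point above, here is a small hedged sketch (Point, Widget and Gui are made-up names) of updating nested immutable state with case-class copy, which is exactly the plumbing that lens libraries automate:
case class Point(x: Int, y: Int)
case class Widget(label: String, position: Point)
case class Gui(focused: Widget)

// Updating a deeply nested field means copying every enclosing level by hand...
def moveRight(gui: Gui, dx: Int): Gui =
  gui.copy(focused =
    gui.focused.copy(position =
      gui.focused.position.copy(x = gui.focused.position.x + dx)))
// ...which is the boilerplate a lens library would abstract away.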

I work with Scala on a daily basis. Immutability has certain key advantages as we all know. However sometimes it's just plain easier to allow mutable content in some situations. Here's a contrived example:
var counter = 0
something.map { e =>
  // ...
  counter += 1
}
Of course I could just have the map return a tuple with the payload and count, or use a collection.size if available. But in this case the mutable counter is arguably more clear. In general I prefer immutability but also allow myself to make exceptions.
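For comparison, a hedged sketch of the immutable alternatives mentioned above (something here is just a stand-in collection):
val something = List("a", "b", "c")

// 1. If you only need the count, ask the collection directly.
val count1 = something.size

// 2. If you need a result plus a count, fold them together instead of mutating a var.
//    (The prepend reverses the order; reverse the result if order matters.)
val (results, count2) = something.foldLeft((List.empty[String], 0)) {
  case ((acc, n), e) => (e.toUpperCase :: acc, n + 1)
}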

To answer this question I would quote Programming in Scala, Second Edition, chapter "Next Steps in Scala", item 11, by Lex Spoon, Bill Venners and Martin Odersky:
The Scala perspective, however, is that val and var are just two different tools in your toolbox, both useful, neither inherently evil. Scala encourages you to lean towards vals, but ultimately reach for the best tool given the job at hand.
So I would say that, just as with programming languages, val and var solve different problems: there is no "disadvantage / advantage" without context, there is just a problem to solve, and val and var each address that problem differently.
Hope it helps, even if it does not provide a concrete list of pros / cons!

Related

Is there a possibility to create a memory-efficient sequence of bits in the JVM?

I've got a piece of code that takes into account a given number of features, where each feature is Boolean. I'm looking for the most efficient way to store such a set of features. My initial thought was to store them as a BitSet. But then I realized that this implementation is meant for storing numbers in bit format rather than for manipulating each bit individually, which is something I'd like to do (to see the effect of switching any feature on and off). I then thought of using a Boolean array, but apparently the JVM uses much more memory for each Boolean element than the one bit it actually needs.
I'm therefore left with the question: What is the most efficient way to store a set of bits that I'd like to treat as independent bits rather than the building blocks of some number?
Please refer to this question: boolean[] vs. BitSet: Which is more efficient?
According to Peter Lawrey's answer there, boolean[] (not Boolean[]) is the way to go, since its values can be manipulated directly and it costs only one byte of memory per element. Note that there is no way for a JVM application to store a directly (array-like) manipulable flag in a single bit of memory, because addressing the flag requires a pointer and the smallest addressable unit is a byte.
The site you referenced already states that Scala's mutable BitSet is the same as java.util.BitSet. There is nothing you can do in Java that you can't do in Scala. But since you are using Scala, you probably want a safe implementation, which may even need to be used from multiple threads. Mutable data types are not well suited for that, so I would simply use an immutable BitSet and accept the memory cost.
However, BitSets have their limits (deriving from the maximum value of an Int, since bits are addressed by Int indices). If you need larger data sizes, you may use LongBitSets, which are basically a Map<Long, BitSet>. If you need even more space, you can nest them in another map, Map<Long, LongBitSet>, but then you need two or more identifiers (longs).
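A tiny hedged sketch of that approach with Scala's immutable BitSet (the feature indices are made up):
import scala.collection.immutable.BitSet

val features = BitSet(0, 3, 7)      // features 0, 3 and 7 are "on"
val withFive = features + 5         // turning a feature on yields a new BitSet
val withoutThree = features - 3     // turning one off does too
val isSet = features(3)             // membership test: true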

How is Scala suitable for Big Scalable Application

I am taking the Coursera course Functional Programming Principles in Scala.
I fail to understand how, with immutability, so many functions, and so much reliance on recursion, Scala is really suitable for real-world applications.
I mean, coming from imperative languages, I see a risk of stack overflow or of garbage collection kicking in, and with multiple copies of everything, of running out of memory.
What am I missing here?
Stack overflow: it's often possible to make your recursive function tail recursive. Add @tailrec from scala.annotation to make sure your function is 100% tail recursive; the compiler then turns it into what is basically a loop.
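A minimal sketch of what that looks like (the sum function is just an illustration):
import scala.annotation.tailrec

// The compiler rejects the annotation unless the recursive call is in tail position,
// in which case the recursion runs in constant stack space, like a loop.
@tailrec
def sum(xs: List[Int], acc: Int = 0): Int = xs match {
  case Nil => acc
  case head :: tail => sum(tail, acc + head)
}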
Most importantly, recursive solutions are only one of many patterns available. See "Effective Java" for why mutability is bad. Immutable data is much better suited to large applications: no need to synchronize access, clients can't mess with the data's internals, etc. Immutable structures are also very efficient in many cases. If you add an element to the head of a list with elem :: list, all data is shared between the two lists - awesome! Only a new head cell is created and pointed at the existing list. Imagine having to create a deep clone of the list every time a client asks for it.
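For example, prepending shares the whole existing list rather than copying it:
val list = List(2, 3, 4)
val bigger = 1 :: list         // allocates just one new cons cell...
println(bigger.tail eq list)   // ...the rest is shared: prints true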
Expressions in Scala are more succinct and can be lazier - you create filters and maps and all of that is applied only as needed. You can do the same in Java, but the ceremony takes forever, so devs usually just create multiple temporary collections along the way.
Martin Odersky defines mutability as a dependence on time/history. That's very interesting, because you can use a var inside a function as long as no other code can be affected in any way, i.e. the results are always the same.
Look at Option[T] and compare it to null. Use them in for comprehensions. Exceptions become really exceptional, and Option, Try, Box, Either communicate failures in a very nice way.
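A small sketch of that style (findUser and findEmail are made-up functions):
// Lookups that may fail compose in a for comprehension without any null checks.
def findUser(id: Int): Option[String] = if (id == 1) Some("alice") else None
def findEmail(user: String): Option[String] = Some(s"$user@example.com")

val email: Option[String] =
  for {
    user <- findUser(1)
    mail <- findEmail(user)
  } yield mail   // Some("alice@example.com"); any None short-circuits the whole thing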
Scala allows you to write more modular and generic code with less effort compared to Java.
Find a good piece of Scala code and try to see how you would do it in Java - it will be self-evident.
Real-world applications are becoming more event-driven, which involves passing data around across different processes or systems and therefore calls for immutable data structures.
In most cases we are either manipulating data or waiting on a resource.
In that case it's easy to hook in a callback with actors.
Take a look at
http://pavelfatin.com/scala-for-project-euler/
which gives you some examples of using functions like map, filter, etc. Functions like these are used routinely in Ruby applications.
The combination of immutability and recursion avoids a lot of stack overflow problems. This comes in handy when dealing with event-driven applications.
akka.io is a classic example of what can be built very concisely in Scala.
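As an illustration of the callback-with-actors point, here is a minimal hedged sketch using the classic (untyped) Akka actor API; EventCounter and the message are made up for the example:
import akka.actor.{Actor, ActorSystem, Props}

// Each actor works through its mailbox one message at a time, so the callbacks
// registered this way never race with each other, even with this local var.
class EventCounter extends Actor {
  private var seen = 0
  def receive: Receive = {
    case event =>
      seen += 1
      println(s"event #$seen: $event")
  }
}

val system = ActorSystem("demo")
val counter = system.actorOf(Props(new EventCounter))
counter ! "price-update"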

Why do Eclipse APIs use Arrays instead of Collections?

In the Eclipse APIs, the return and argument types are mostly arrays instead of collections. An example is the members method on IContainer, which returns IResource[].
I am interested in why this is the case. Maybe it is one of the following:
The APIs were designed before generics were available, so IResource[] was better than just Collection or List
Memory concerns, e.g. ArrayList internally holds an array which has more space than is needed (to offer an efficient implementation of add), whereas an array is always constructed for just the needed target size
It's not possible to add/remove elements on an array, so it is safe for iterating (but defensive copying is still necessary, because one can still change elements, e.g. set them to null)
Does anyone have any insights or other ideas why the API was developed that way?
Posting this as an answer, so it can be accepted.
Eclipse predates generics and they are really serious about API stability. Also, at the low level of SWT passing arrays seems to be used to reflect the operating system APIs that are being wrapped. Once you have a bunch of tooling using Arrays I guess it makes sense to keep things consistent. Also note that arrays aren't subject to all of the type erasure issues when using reflection.
Yeah, I hear you as far as the collections api being generally much easier to work with for dynamic lists of items.

Scala: var List vs val MutableList

In Odersky et al.'s Scala book, they say to use lists. I haven't read the book cover to cover, but all the examples seem to use val List. As I understand it, one is also encouraged to use vals over vars. But in most applications, is there not a trade-off between using a var List and a val MutableList? Obviously we use a val List when we can. But is it good practice to be using a lot of var Lists (or var Vectors etc.)?
I'm pretty new to Scala coming from C#. There I had a lot of:
public List<T> myList {get; private set;}
collections that could easily have been declared as vals if C# had immutability built in, because the collection reference itself never changed after construction, even though elements were added to and removed from the collection during its lifetime. So declaring a var collection almost feels like a step away from immutability.
In response to answers and comments: one of the strong selling points of Scala is that it can bring many benefits without requiring you to completely change the way you write code, as is the case with, say, Lisp or Haskell.
Is it good practice to be using a lot of var Lists (or var Vectors etc.)?
I would say it's better practice to use var with immutable collections than it is to use val with mutable ones. Off the top of my head, because
You have more guarantees about behaviour: if your object has a mutable list, you never know if some other external object is going to update it
You limit the extent of mutability; methods returning a collection will yield an immutable one, so you only have mutability within your one object
It's easy to immutabilize a var by simply assigning it to a val, whereas to make a mutable collection immutable you have to use a different collection type and rebuild it
In some circumstances, such as time-dependent applications with extensive I/O, the simplest solution is to use mutable state. And in some circumstances, a mutable solution is just more elegant. However in most code you don't need mutability at all. The key is to use collections with higher order functions instead of looping, or recursion if a suitable function doesn't exist. This is simpler than it sounds. You just need to spend some time getting to know the methods on List (and other collections, which are mostly the same). The most important ones are:
map: applies your supplied function to each element in the collection - use instead of looping and updating values in an array
foldLeft: returns a single result from a collection - use instead of looping and updating an accumulator variable
for-yield expressions: simplify your mapping and filtering, especially for nested-loop style problems (a short sketch of all three follows below)
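A minimal sketch of those three in action (xs is just a stand-in list):
val xs = List(1, 2, 3, 4)

val doubled = xs.map(_ * 2)                    // map instead of a loop that updates an array
val total   = xs.foldLeft(0)(_ + _)            // foldLeft instead of a mutable accumulator
val pairs   = for (x <- xs; y <- xs if x < y)  // for-yield instead of nested loops
              yield (x, y)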
Ultimately, much of functional programming is a consequence of immutability and you can't really have one without the other; however, local vars are mostly an implementation detail: there's nothing wrong with a bit of mutability so long as it cannot escape from the local scope. So use vars with immutable collections since the local vars are not what will be exported.
You are assuming either the List must be mutable, or whatever is pointing to it must be mutable. In other words, that you need to pick one of the two choices below:
val list: collection.mutable.LinkedList[T]
var list: List[T]
That is a false dichotomy. You can have both:
val list: List[T]
So, the question you ought to be asking is "how do I do that?", but we can only answer that when you try it out and face a specific problem. There's no generic answer that will solve all your problems (well, there is -- monads and recursion -- but that's too generic).
So... give it a try. You might be interested in looking at Code Review, where most Scala questions pertain precisely to how to make some code more functional. But ultimately, I think the best way to go about it is to try, and resort to Stack Overflow when you just can't figure out some specific problem.
Here is how I see this problem of mutability in functional programming.
Best solution: Values are best, so the best in functional programming usage is values and recursive functions:
def func(n: Int): List[Int] = if (n > 0) n :: func(n - 1) else Nil
val myList = func(4)
Need mutable stuff: sometimes mutable things are needed or make everything a lot easier. My inclination in that situation is to use the mutable structures, i.e. val list: collection.mutable.LinkedList[T] instead of var list: List[T] - not because of any real performance improvement, but because the mutating operations are already defined on the mutable collection.
This advice is personal and maybe not recommended when you want performance, but it is a guideline I use for daily programming in Scala.
I believe you can't separate the question of mutable val versus immutable var from the specific use case. Let me dig a bit deeper: there are two questions you want to ask yourself:
How am I exposing my collection to the outside?
I want a snapshot of the current state, and this snapshot should not change regardless of the changes that are made to the entity hosting the collection. In such a case, you should prefer an immutable var: the only way to achieve this with a mutable val is through a defensive copy.
I want a view on the state that changes to reflect changes to the original object's state. In this case, you should opt for a mutable val and return it wrapped through an unmodifiable wrapper (much like what Collections.unmodifiableList() does in Java; here is a question where it's asked how to do so in Scala). I see no way to achieve this with an immutable var, so I believe your choice here is forced.
I only care about the absence of side effects. In this case the two approaches are very similar. With an immutable var you can directly return the internal representation, so it is slightly clearer maybe.
How am I modifying my collection?
I usually make bulk changes, namely I set the whole collection at once. This makes an immutable var the better choice: you can just assign the input to your current state and you are done. With a mutable val, you would need to first clear your collection and then copy the new contents in. Definitely worse.
I usually make pointwise changes, namely, I add/remove a single element (or few of them) to/from the collection. This is what I actually see most of the time, the collection being just an implementation detail and the trait only exposing methods for the pointwise manipulation of its status. In this case, a mutable val may be generally better performance-wise, but in case this is an issue I'd recommend taking a look at Scala's collections performance.
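To make the two exposure styles concrete, here is a small hedged sketch (SnapshotLog and LiveViewLog are made-up names, and the "view" is simply typed read-only rather than wrapped):
import scala.collection.mutable.ListBuffer

class SnapshotLog {                          // immutable collection behind a var
  private var entries: List[String] = Nil
  def add(e: String): Unit = entries = e :: entries
  def all: List[String] = entries            // callers get a fixed snapshot
}

class LiveViewLog {                          // mutable collection behind a val
  private val entries = ListBuffer.empty[String]
  def add(e: String): Unit = entries += e
  def all: collection.Seq[String] = entries  // callers observe later additions through this
}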
If it is necessary to use var lists, why not? To avoid problems you could for example limit the scope of the variable. There was a similar question a while ago with a pretty good answer: scala's mutable and immutable set when to use val and var.

Using Scala, does a functional paradigm make sense for analyzing live data?

For example, when analyzing live stock market data I expose a method to my clients
def onTrade(trade: Trade) {
}
The clients may choose to do anything from counting the number of trades, calculating averages, storing high lows, price comparisons and so on. The method I expose returns nothing and the clients often use vars and mutable structures for their computation. For example when calculating the total trades they may do something like
var numTrades = 0
def onTrade(trade: Trade) {
numTrades += 1
}
A single onTrade call may have to do six or seven different things. Is there any way to reconcile this kind of flexibility with a functional paradigm - in other words, with return values, vals and immutable data structures?
You might want to look into Functional Reactive Programming. Using FRP, you would express your trades as a stream of events, and manipulate this stream as a whole, rather than focusing on a single trade at a time.
You would then use various combinators to construct new streams, for example one that would return the number of trades or highest price seen so far.
The link above contains links to several Haskell implementations, but there are probably several Scala FRP implementations available as well.
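No particular FRP library is shown here, but the core idea can be sketched with Scala 2.13's LazyList and scanLeft (Trade and the sample data are made up for the example):
case class Trade(symbol: String, price: Double)

val trades: LazyList[Trade] =
  LazyList(Trade("ACME", 10.0), Trade("ACME", 10.5), Trade("ACME", 9.8))

// Derive whole new streams from the stream of trades, rather than updating vars per trade.
val tradeCounts: LazyList[Int]   = trades.scanLeft(0)((n, _) => n + 1).tail
val highSoFar: LazyList[Double]  = trades.scanLeft(Double.MinValue)((hi, t) => hi max t.price).tail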
One possibility is using monads to encapsulate state within a purely functional program. You might check out the Scalaz library.
Also, according to reports, the Scala team is developing a compiler plug-in for an effect system. Then you might consider providing an interface like this to your clients,
def callbackOnTrade[A, B](f: (A, Trade) => B)
The clients define their input and output types A and B, and define a pure function f that processes the trade. All "state" gets encapsulated in A and B and threaded through f.
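Reusing the Trade type and trades values from the sketch above, the client's "state" can be a plain value and the update a pure function (TradeStats is a made-up name):
case class TradeStats(count: Int, high: Double)

val update: (TradeStats, Trade) => TradeStats =
  (s, t) => TradeStats(s.count + 1, s.high max t.price)

// The state is threaded through the trades with a fold; no var in client code.
val stats = trades.foldLeft(TradeStats(0, Double.MinValue))(update)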
Callbacks may not be the best approach, but there are certainly functional designs that can solve such a problem. You might want to consider FRP or a state-monad solution as already suggested, actors are another possibility, as is some form of dataflow concurrency, and you can also take advantage of the copy method that's automatically generated for case classes.
A different approach is to use STM (software transactional memory) and stick with the imperative paradigm whilst still retaining some safety.
The best approach depends on exactly how you're persisting the data and what you're actually doing in these state changes. As always, let a profiler be your guide if performance is critical.