Why do Eclipse APIs use Arrays instead of Collections? - eclipse

In the Eclipse APIs, the return and argument types are mostly arrays instead of collections. An example is the members method on IContainer, which returns IResources[].
I am interested in why this is the case. Maybe it is one of the following:
The APIs were designed before generics generics were available, so IResource[] was better than just Collection or List
Memory concerns, e.g. ArrayList internally holds an array which has more space than is needed (to offer an efficient implementation of add), whereas an array is always constructed for just the needed target size
It's not possible to add/remove elements on an array, so it is safe for iterating (but defensive copying is still necessary, because one can still change elements, e.g. set them to null)
Does anyone have any insights or other ideas why the API was developed that way?

Posting this as an answer, so it can be accepted.
Eclipse predates generics and they are really serious about API stability. Also, at the low level of SWT passing arrays seems to be used to reflect the operating system APIs that are being wrapped. Once you have a bunch of tooling using Arrays I guess it makes sense to keep things consistent. Also note that arrays aren't subject to all of the type erasure issues when using reflection.
Yeah, I hear you as far as the collections api being generally much easier to work with for dynamic lists of items.

Related

Disadvantages of Immutable objects

I know that Immutable objects offer several advantages over mutable objects like they are easier to reason about than mutable ones, they do not have complex state spaces that change over time, we can pass them around freely, they make safe hash table keys etc etc.So my question is what are the disadvantages of immutable objects??
Quoting from Effective Java:
The only real disadvantage of immutable classes is that they require a
separate object for each distinct value. Creating these objects can be
costly, especially if they are large. For example, suppose that you
have a million-bit BigInteger and you want to change its low-order
bit:
BigInteger moby = ...;
moby = moby.flipBit(0);
The flipBit method
creates a new BigInteger instance, also a million bits long, that
differs from the original in only one bit. The operation requires time
and space proportional to the size of the BigInteger. Contrast this to
java.util.BitSet. Like BigInteger, BitSet represents an arbitrarily
long sequence of bits, but unlike BigInteger, BitSet is mutable. The
BitSet class provides a method that allows you to change the state of
a single bit of a millionbit instance in constant time.
Read the full item on Item 15: Minimize mutability
Apart from possible performance drawbacks (possible! because with the complexity of GC and HotSpot optimisations, immutable structures are not necessarily slower) - one drawback can be that state must now be threaded through your whole application. For simple applications or tiny scripts the effort to maintain state this way might be too high to buy you concurrency safety.
For example think of a GUI framework like Swing. It would be definitely possible to write a GUI framework entirely using immutable structures and one main "unsafe" outer loop, and I guess this has been done in Haskell. Some of the problems of maintaining nested immutable state can be addressed for example with lenses. But managing all the interactions (registering listeners etc.) may get quite involved, so you might instead want to introduce new abstractions such as functional-reactive or hybrid-reactive GUIs.
Basically you lose some of OO's encapsulation by going all immutable, and when this becomes a problem there are alternative approaches such as actors or STM.
I work with Scala on a daily basis. Immutability has certain key advantages as we all know. However sometimes it's just plain easier to allow mutable content in some situations. Here's a contrived example:
var counter = 0
something.map {e =>
...
counter += 1
}
Of course I could just have the map return a tuple with the payload and count, or use a collection.size if available. But in this case the mutable counter is arguably more clear. In general I prefer immutability but also allow myself to make exceptions.
To answer this question I would quote Programming in Scala, second Edition, chapter "Next Steps in Scala", item 11, by Lex Spoon, Bill Venners and Martin Odersky :
The Scala perspective, however, is that val and var are just two different tools in your toolbox, both useful, neither inherently evil. Scala encourages you to lean towards vals, but ultimately reach for the best tool given the job at hand.
So I would say that just as for programming languages, val and var solves different problems : there is no "disavantage / avantage" without context, there is just a problem to solve, and both of val / var address differently the problem.
Hope it helps, even if it does not provide a concrete list of pros / cons !

How is Scala suitable for Big Scalable Application

I am taking course Functional Programming Principles in Scala | Coursera on Scala.
I fail to understand with immutability , so many functions and so much dependencies on recursion , how is Scala is really suitable for real world applications.
I mean coming from imperative languages I see a risk of StackOverflow or Garbage Collection kicking in and with multiple copies of everything I am running Out Of Memory
What I a missing here?
Stack overflow: it's possible to make your recursive function tail recursive. Add #tailrec from scala.annotation.tailrec to make sure your function is 100% tail recursive. This is basically a loop.
Most importantly recursive solutions is only one of many patterns available. See "Effective Java" why mutability is bad. Immutable data is much better suitable for large applications: no need to synchronize access, client can't mess with data internals, etc. Immutable structures are very efficient in many cases. If you add an element to the head of a list: elem :: list all data is shared between 2 lists - awesome! Only head is created and pointed to the list. Imagine that you have to create a new deep clone of a list every time client asks for.
Expressions in Scala are more succinct and maybe more lazy - create filter and map and all that applied as needed. You can do the same in Java but ceremony takes forever so usually devs just create multiple temp collections on the way.
Martin Odersky defines mutability as a dependence on time/history. That's very interesting because you can use var inside of a function as long as no other code can be affected in any way, i.e. results are always the same.
Look at Option[T] and compare to null. Use them in for comprehensions. Exception becomes really exceptional and Option, Try, Box, Either communicate failures in a very nice way.
Scala allows to write more modular and generic code with less effort compared to Java.
Find a good piece of Scala code and try to see how you would do it in Java - it will be self evident.
Real world applications are getting more event-driven which involves passing around data across different processes or systems needing immutable data structures
In most of the cases we are either manipulating data or waiting on a resource.
In that case its easy to hook in a callback with Actors
Take a look at
http://pavelfatin.com/scala-for-project-euler/
Which gives you some examples on using functions like map fllter etc. Functions like these are used routinely by Ruby applications
Combination of immutability and recursion avoids a lot of stackoverflow problems. This come in handly while dealing with event driven applications
akka.io is a classic example which could have been build very concisely in scala.

In which scenario we need a ReadOnlyCollection?

Dotnet 4.5 has introduce ReadOnlyCollection. My question is what is the practical useage of it? What scenarios we may need this kind of data structure?
You need read-only collections when your API returns collection objects to your callers, copying is too expensive, and you would prefer to stay away from returning IEnumerable<T>. This is commonly desirable in situations when random access is required over the returned collection.
When you want to return a collection that the caller should not be able to modify, but you still want to have the guarantees that an IList gives over an IEnumerable, e.g. a free .Count property, an indexer and the ability to safely iterate over it multiple times, both which aren't guaranteed on an IEnumerable.
This class is useful in a multithreading application. In a multithreading environment can it be a real problem to have a collection of objects, which might be changed by some other thread. This assures threadsafety and lessens the complexity of the code.

Tree collections in Scala

I want to implement a tree in Scala. My particular tree uses Swing Split panes to give multiple views of a geographical map. Any pane within a split pane can itself be further divided to give an extra view. Am I right in saying that neither TreeMap nor TreeSet provide Tree functionality? Please excuse me if I've misunderstood this. It strikes me there should be standard Tree collections and it is bad practice to keep reinventing the wheel. Are there any Tree implementation out there that might be the future Scala standard?
All Trees have three types of elements: a Root, Nodes and Leaves. Leaves and Nodes must have a single reference to a parent. Root and Nodes can have multiple references to child nodes and leaves. Leaves have zero children. Nodes and Root can not be deleted without their children being deleted. there's probably other rules / constraints that I've missed.
This seems like enough common specification to justify a standard collection. I would also suggest that there should be a standard subclass collection for the case where Root and Nodes can only have 2 children or a single leaf child. Which is what I want in my particular case.
Actually, a tree by itself is both pretty useless and pretty difficult to specify.
Starting with the latter, speaking strictly about the data structure, how many children can a tree have? Do nodes store values or not? Do nodes store metadata? Do children have pointers to their parents? Do you store the tree as nodes with pointers, or as positional elements on an array?
These are all questions to which the answer is "it depends". In fact, you stated that children have pointers to their parents, but that is not true for any immutable tree! You also seem to assume trees are always stored as node objects with references, when some trees are actually stored as nodes on a single array (such as a Heap).
Furthermore, not all these requirements can be accommodated -- some are mutually exclusive. Even if you ignore those, you are still left with a data structure which is not optimized for anything and clumsy to use because you have to deal with lots of details not relevant to you.
And, then, there's the second problem, which is that a tree, by itself, is useless. TreeSet and TreeMap take advantage of specific trees whose insertion/deletion/lookup algorithm makes them good data structures for sorted data. That, however, is not at all the only use for trees. Trees can be used to search for spatial algorithms, to represent tree-like real world information, to make up filesystems, etc. Sometimes the task is finding a tree inside a graph. Each of these uses require different representations and different algorithms -- the algorithms being what make them at all useful.
And, to top it off, writing a tree class is trivial. The problem is writing the algorithms to manipulate it.
There is a bit of mismatch between the notion of "tree" as a GUI widget—which you seem to be referring to—and tree as an ordered data structure. In the former case it is just a nested sequence, in the latter the aim is to provide for instance fast search algorithms, and you don't arbitrary manipulate the internal structure, where often the branching factor is constant and the tree height is kept balanced, etc. An example of the latter is collection.immutable.TreeMap which uses a self-balancing binary tree structure called Red-Black-Tree.
So these data structures are rather useless for bridging to javax.swing.TreeModel. There is little that can be done about this interface, so you'll probably stick with the default implementation DefaultTreeModel, a mutable non-generic structure (which is all that single threaded Swing needs).
For a discussion about having a scala-swing JTree wrapper, see this question. It also has a link to a Scala library for JTree.
Since you can use java classes with Scala, take a look at the javax.swing.tree package: http://docs.oracle.com/javase/6/docs/api/javax/swing/tree/package-summary.html, especially TreeModel and TreeNode, MutableTreeNode and DefaultMutableTreeNode. They were designed to be used with Swing, but are pretty much a standard tree implementation.
Other than that, implementing a (generic) tree should be pretty straightforward.
Since gui application imposes low performance demands on the tree collection you may use general graph library constrained to represent only tree structured-graph: http://scala-graph.org/
TreeSet and TreeMap are both based on RedBlack:
Red-black trees are a form of balanced binary trees where some nodes
are designated “red” and others designated “black.” Like any balanced
binary tree, operations on them reliably complete in time logarithmic
to the size of the tree.
(quote from Scala 2.8 Collections)
RedBlack is not documented very well but if you look at the source of TreeSet and TreeMap it's pretty easy to figure out how to use it, though it doesn't fill all (most?) of your requirements (nodes don't have references to the parent, etc).

Is Guava's ImmutableList.Builder thread safe?

What are the thread safety guarantees for Guava's ImmutableList.Builder? The javadocs don't say.
While the Guava Immutable classes are threadsafe, their builders are not. For most applications, only one thread will interact with any particular Builder instance.
While the absence of thread-safety usually doesn't need to be documented, such Javadoc might make sense for the Immutable collection builders. People may be surprised that ImmutableList is threadsafe while ImmutableList.Builder isn't.
If thread-safety is not mentioned in the javadocs, don't assume it!
More seriously, "no".
I would also prefer javadocs of ImmutableList and friends include such a -rather obvious, yes- remark (so you wouldn't have to assume it yourself), because the "obvious" is not always the case. Just the other day I was discussing scala.List, an immutable list, and some surprizing issues it may cause if exchanged between threads inappropriately (via a data race), which people didn't think about because they see the word "immutable" on the tin, plus they equate "immutable == thread-safe", so it pays off to be on the safe side even when documenting "obvious" thread-safety aspects.
Agree with #Dimitris Andreou: definitely do not assume thread safety if its not documented as such. When you go to the effort of making a non-trivial class threadsafe, you want users to know it.
Beyond that, I think the most common use case for a builder will be thread-confined: ie as a local variable in some method. If you need multiple threads to build a List, is is really immutable yet?
If you have multiple threads feeding into a list, but want to snapshot it at some point and say "no more changes going forward, its immutable" then I'd write something that takes the elements from those threads and freezes the contents into a new ImmutableList when you know its ready.