Why doesn't a class Set exist in Scala? - scala

There are many classes that inherit from the trait Set: HashSet, TreeSet, and so on.
There is also an object Set alongside the trait Set. (Can I call it the companion object of the trait Set, even though there is no class Set?)
It seems to me that adding one more class, "Set", to this list would make the structure much easier to understand.
Is there any reason a class Set should not exist?

If you just need a set, use Set.apply and you will get a valid set that supports all the important operations. You don't need to worry about how it is implemented; it is designed to work well for most use cases.
On the other hand, if the performance of certain operations matters to you, instantiate a concrete set implementation, and you will know exactly what you have.
In Java you would write:
Set<String> strings = new HashSet<>(Arrays.asList("a", "b"));
In Scala you can spell out the types in the same way:
val strings: Set[String] = HashSet("a", "b")
but you can also use the handy factory when you don't need to care about the concrete type:
val strings = Set("a", "b")
Nothing is wrong with this, and I don't see how adding another class would help. Having an interface or trait plus concrete implementations is normal; nothing in between is needed or helpful.
Set.apply is a factory for sets. You can check the actual class of the resulting object with getClass. The factory creates special, optimized sets for sizes 0 to 4, for example
scala.collection.immutable.Set$EmptySet$
scala.collection.immutable.Set$Set1
scala.collection.immutable.Set$Set2
For bigger sets it is a hash set, namely scala.collection.immutable.HashSet$HashTrieSet (the exact class name depends on the Scala version).
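To see what the factory hands back, here is a minimal sketch that records the runtime class names for a few sizes. The object and value names are made up for illustration, and the names it prints are implementation details that can differ between Scala versions.

```scala
object SetFactoryDemo {
  // Runtime class names of sets built through the Set factory.
  val emptyName: String = Set.empty[Int].getClass.getName
  val oneName: String   = Set(1).getClass.getName
  val fourName: String  = Set(1, 2, 3, 4).getClass.getName
  // Five elements is past the specialised small-set classes.
  val fiveName: String  = Set(1, 2, 3, 4, 5).getClass.getName

  def main(args: Array[String]): Unit =
    Seq(emptyName, oneName, fourName, fiveName).foreach(println)
}
```

Whatever the concrete class turns out to be, the static type is still just Set[Int], which is the point of the answer above.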

In Scala there is no overlap between classes and traits. Classes are implementations that can be instantiated, while traits are independently mixable interfaces. Set.apply gives you an object with the interface Set, and that is all you need to know to use it. I fully understand wanting a concrete type, but it is unnecessary here. The right thing to do is to store the result in a val of type Set and use only the interface that Set provides.
I know that may not be satisfying, but give it time and the Scala type system will make sense on its own terms, even where it differs from what Java does.

Related

Is it a good idea to add methods to Scala case classes

Case classes are supposed to be algebraic data types, so some people are against adding methods to a case class.
Can somebody please give an example of why it's a bad idea?
This is one of those questions that leads to more questions.
Here is my take on it.
Let's see what happens when a case class is defined. The Scala compiler does the following:
Creates a class and its companion object.
Implements the apply method, which you can use as a factory. This lets you create instances of the class without the new keyword.
Prefixes every argument in the parameter list with val, which makes the fields immutable.
Adds implementations of hashCode, equals and toString.
Implements the unapply method, so a case class supports pattern matching. This is important when you define an algebraic data type.
Generates accessors for the fields. Note that it does not generate "mutators".
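The generated members listed above can be seen in a small sketch. Point is a made-up case class used only for illustration.

```scala
object CaseClassDemo {
  case class Point(x: Int, y: Int)

  val p: Point = Point(1, 2)              // companion apply, no `new` needed
  val same: Boolean = p == Point(1, 2)    // generated equals compares field values
  val text: String = p.toString           // generated toString
  val moved: Point = p.copy(y = 5)        // fields are vals, so "changes" produce a copy
  val sum: Int = p match {                // generated unapply enables pattern matching
    case Point(a, b) => a + b
  }
}
```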
As we can see, case classes are not exact counterparts of Java Beans.
A case class tends to represent a data type more than it represents an entity.
I look at them as good friends of the programmer, because they cut down on boilerplate: endless getters, overridden equals and hashCode methods, and so on.
Now, coming to the question:
If you look at it from a functional programming standpoint, case classes are the way to go, since you are looking for immutability and equality, and you can be sure that the case class represents a data structure. That is why people programming in an FP style so often say to use them for ADTs.
If your case class has logic that works on the class's state, that makes it a bad choice for functional programming.
I prefer to use case classes where I am sure that I need a class to represent a data structure, because that is where I get the help of the auto-generated methods and the added advantage of pattern matching. When I program in an OO way, with side effects and mutable state, I use a plain class.
Having said that, there could still be scenarios where a case class with utility methods makes sense. I just think they are less common.

Why use Collection.empty[T] instead of new Collection[T]()

I was wondering if there is a good reason to use Collection.empty[T] instead of new Collection[T]() (or the reverse)? Or is it just personal preference?
Thanks.
Calling new Collection[T]() will create a new instance every time. On the other hand, Collection.empty[T] will most likely always return the same singleton object, usually defined somewhere as
object Empty extends Collection[Nothing] ...
which will be much faster. Edit: This is only possible for immutable collections; mutable collections have to return a new instance every time empty is called.
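A quick way to check the sharing claim is reference equality. This is a sketch with made-up names; that the immutable List and Set factories return a shared empty instance is an implementation detail of the standard library, not something the Seq or Set interfaces promise.

```scala
import scala.collection.mutable

object EmptyDemo {
  // Immutable collections: empty hands back a shared instance.
  val sharedList: Boolean = List.empty[Int] eq Nil
  val sharedSet: Boolean  = Set.empty[Int] eq Set.empty[Int]

  // Mutable collections: every call must produce a fresh instance,
  // otherwise callers would see each other's updates.
  val freshBuffers: Boolean =
    mutable.ArrayBuffer.empty[Int] ne mutable.ArrayBuffer.empty[Int]
}
```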
You should always prefer Collection.empty[Type].
Besides Collection.empty[T] being clearer about the intent, you should favour it for the same reason that you should favour factory methods in general when instantiating a collection: those factories abstract away implementation details that you might not (or should not) care about.
For example, when you write Seq.empty[String] you actually get an instance of List[String]. You could instantiate a List[String] directly, but if all you care about is having some Seq, you would introduce a needless dependency on List. (Well, OK, as it stands you actually cannot, because List is abstract, but let's pretend we can for the sake of the argument.)
The whole point of factories is precisely to have some amount of separation of concern and not bother with unnecessary instantiation details.
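As a small sketch of that separation, the helper below (a made-up function, not part of any library) only ever mentions Seq, both in its signature and in its starting value, so nothing in it depends on which concrete class the factory picks.

```scala
object SeqFactoryDemo {
  // Trims the inputs and drops blank entries, building the result from Seq.empty.
  def collectNames(raw: Seq[String]): Seq[String] =
    raw.foldLeft(Seq.empty[String]) { (acc, s) =>
      val trimmed = s.trim
      if (trimmed.isEmpty) acc else acc :+ trimmed
    }
}
```

If the library later chose a different default implementation for Seq, this code would not need to change.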
As another, more elaborate example, let's talk about collection.immutable.HashMap. This one is very much a concrete class, so you might think there is no need for a factory here. Except that, for optimization purposes, the factory in the companion object collection.immutable.HashMap will actually create different subclasses depending on the number of elements that you initialize the map with (see this question: Scala: how to make a Hash(Trie)Map from a Map (via Anorm in Play)). Obviously, if you instantiate collection.immutable.HashMap directly you will lose this optimization.
Another common optimization for empty is to always return (when it is an immutable collection) the same instance, yet another useful optimization that you would lose by directly instantiating the collection.
So as a rule of thumb, as far as you can you should use the factories that are provided by the various collection companion objects, so as to shield yourself from unneeded dependencies while at the same time benefiting from potential optimizations provided by the collection framework.
empty is just a special case of factory, and so the same logic applies.

Why does the Scala API have two strategies for organizing types?

I've noticed that the Scala standard library uses two different strategies for organizing classes, traits, and singleton objects.
1. Using packages whose members are then imported. This is, for example, how you get access to scala.collection.mutable.ListBuffer. This technique is familiar coming from Java, Python, etc.
2. Using type members of traits. This is, for example, how you get access to the Parser type. You first need to mix in scala.util.parsing.combinator.Parsers. This technique is not familiar coming from Java, Python, etc., and isn't much used in third-party libraries.
I guess one advantage of (2) is that it organizes both methods and types, but in light of Scala 2.8's package objects the same can be done using (1). Why have both these strategies? When should each be used?
The term to know here is path-dependent types. That's the option number 2 you talk of, and I'll speak only of it. Unless you happen to have a problem that it solves, you should always take option number 1.
What you miss is that the Parser class refers to things defined in the Parsers trait. In fact, the Parser class itself depends on what Input has been defined in Parsers:
abstract class Parser[+T] extends (Input => ParseResult[T])
The type Input is defined like this:
type Input = Reader[Elem]
And Elem is abstract. Consider, for instance, RegexParsers and TokenParsers. The former defines Elem as Char, while the latter defines it as Token. That means the Parser for each is different. More importantly, because Parser is a member (an inner class) of Parsers, its type depends on the enclosing instance, so the Scala compiler will make sure at compile time that you aren't passing the RegexParsers's Parser to TokenParsers or vice versa. As a matter of fact, you won't even be able to pass the Parser of one instance of RegexParsers to another instance of it.
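The same effect can be reproduced without the parser library. The following is a minimal sketch with made-up names (Grammar and Rule stand in for Parsers and Parser): each Grammar instance gets its own Rule type, and the compiler keeps them apart.

```scala
object PathDependentDemo {
  class Grammar(val name: String) {
    // Rule is an inner class, so its full type is <instance>.Rule.
    class Rule(val label: String)

    def rule(label: String): Rule = new Rule(label)

    // Accepts only rules created by this very Grammar instance.
    def describe(r: Rule): String = s"$name accepts ${r.label}"
  }

  val g1 = new Grammar("g1")
  val g2 = new Grammar("g2")

  val ok: String = g1.describe(g1.rule("digit"))
  // g1.describe(g2.rule("digit"))  // does not compile: found g2.Rule, required g1.Rule

  // A type projection opts out of the check and accepts a Rule of any Grammar.
  def labelOf(r: Grammar#Rule): String = r.label
}
```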
The second is also known as the Cake pattern.
It has the benefit that the code inside a class that has a trait mixed in becomes independent of the particular implementation of the methods and types in that trait. It lets you use the members of the trait without knowing their concrete implementation.
trait Logging {
  def log(msg: String): Unit
}
trait App extends Logging {
  log("My app started.")
}
Above, the Logging trait is the requirement for the App (requirements can also be expressed with self-types). Then, at some point in your application you can decide what the implementation will be and mix the implementation trait into the concrete class.
trait ConsoleLogging extends Logging {
  def log(msg: String): Unit = println(msg)
}
object MyApp extends App with ConsoleLogging
This has an advantage over imports, in the sense that the requirements of your piece of code aren't bound to the implementation defined by the import statement. Furthermore, it allows you to build and distribute an API which can be used in a different build somewhere else provided that its requirements are met by mixing in a concrete implementation.
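The self-type variant mentioned above could look roughly like this. It is only a sketch: Service, BufferLogging and MyService are made-up names, and the logger collects messages in a buffer instead of printing so the effect can be inspected.

```scala
import scala.collection.mutable.ListBuffer

object SelfTypeDemo {
  trait Logging {
    def log(msg: String): Unit
  }

  // The self-type states the requirement without Service extending Logging.
  trait Service { self: Logging =>
    def start(): Unit = log("service started")
  }

  // One possible implementation, chosen where the pieces are wired together.
  trait BufferLogging extends Logging {
    val messages: ListBuffer[String] = ListBuffer.empty[String]
    def log(msg: String): Unit = { messages += msg; () }
  }

  object MyService extends Service with BufferLogging
  // object Broken extends Service   // does not compile: self-type Logging is not satisfied
}
```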
However, there are a few things to be careful with when using this pattern.
All of the classes defined inside the trait will have a reference to the outer class. This can be an issue where performance is concerned, or when you're using serialization (when the outer class is not serializable, or worse, if it is, but you don't want it to be serialized).
If your 'module' gets really large, you will either have a very big trait and a very big source file, or will have to distribute the module trait code across several files. This can lead to some boilerplate.
It can force you to have to write your entire application using this paradigm. Before you know it, every class will have to have its requirements mixed in.
The concrete implementation must be known at compile time, unless you use some sort of hand-written delegation. You cannot mix in an implementation trait dynamically based on a value available at runtime.
I guess the library designers didn't regard any of the above as an issue where Parsers are concerned.

Practical uses for Structural Types?

Structural types are one of those "wow, cool!" features of Scala. However, for every example I can think of where they might help, implicit conversions and dynamic mixin composition often seem like better matches. What are some common uses for them, and/or advice on when they are appropriate?
Aside from the rare case of classes that provide the same method but are unrelated and do not implement a common interface (for example, the close() method -- Source, for one, does not extend Closeable), I find no use for structural types with their present restriction. If they were more flexible, however, I could well write something like this:
def add[T: { def +(x: T): T }](a: T, b: T) = a + b
which would neatly handle numeric types. Every time I think structural types might help me with something, I hit that particular wall.
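For the close() case that does work, a minimal sketch might look like the following. It assumes Scala 2, where the reflectiveCalls import silences the feature warning; withResource and FakeConnection are made-up names for illustration.

```scala
import scala.language.reflectiveCalls

object StructuralCloseDemo {
  // Anything with a matching close() method conforms, no shared interface needed.
  type Closable = { def close(): Unit }

  def withResource[A <: Closable, B](resource: A)(body: A => B): B =
    try body(resource)
    finally resource.close() // invoked via reflection at runtime

  // An unrelated class that happens to have close(), but extends nothing.
  class FakeConnection {
    var closed: Boolean = false
    def read(): String = "data"
    def close(): Unit = { closed = true }
  }
}
```

A call such as withResource(new FakeConnection)(_.read()) returns the body's result and closes the connection afterwards.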
EDIT
However little use I find for structural types myself, the compiler does use them to handle anonymous classes. For example:
implicit def toTimes(count: Int) = new {
  def times(block: => Unit) = 1 to count foreach { _ => block }
}
5 times { println("This uses structural types!") }
The object resulting from (the implicit) toTimes(5) is of type { def times(block: => Unit) }, i.e., a structural type.
I don't know if Scala does that for every anonymous class; perhaps it does. Either way, that is one reason why doing pimp my library this way is slow: structural types use reflection to invoke their methods. Instead of an anonymous class, one should use a real class to avoid performance issues in pimp my library.
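One way to follow that advice is an implicit class, which gives the extension method a real, named class so no reflection is involved. This is a sketch with made-up names (IntTimes, countRuns), not the code from the answer.

```scala
object TimesDemo {
  // A named wrapper class instead of an anonymous one; as a value class it
  // usually avoids allocating the wrapper as well.
  implicit class IntTimes(val count: Int) extends AnyVal {
    def times(block: => Unit): Unit = (1 to count).foreach(_ => block)
  }

  // Runs a counter increment n times through the extension method.
  def countRuns(n: Int): Int = {
    var runs = 0
    n.times { runs += 1 }
    runs
  }
}
```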
Structural types are very cool constructs in Scala. I've used them to represent multiple unrelated types that share an attribute upon which I want to perform a common operation without a new level of abstraction.
I have heard one argument against structural types from people who are strict about an application's architecture. They feel it is dangerous to apply a common operation across types without an associative trait or parent type, because you then leave the rule of what type the method should apply to open-ended. Daniel's close() example is spot on, but what if you have another type that requires different behavior? Someone who doesn't understand the architecture might use it and cause problems in the system.
I think structural types are one of those features that you don't need very often, but when you do need them, they help a lot. One area where structural types really shine is "retrofitting", e.g. when you need to glue together several pieces of software that you have no source code for and that were not intended for reuse. But if you find yourself using structural types a lot, you're probably doing it wrong.
[Edit]
Of course implicits are often the way to go, but there are cases when you can't use them: imagine you have a mutable object that you can modify with methods, but which hides important parts of its state, a kind of "black box". You still have to work with this object somehow.
Another use case for structural types is when code relies on naming conventions without a common interface, e.g. in machine-generated code. In the JDK we can find such things as well, like the StringBuffer / StringBuilder pair (where the common interfaces Appendable and CharSequence are way too general).
Structural types give a statically typed language some of the benefits of dynamic languages, specifically loose coupling. If you want a method foo() to call instance methods of class Bar, you don't need an interface or base class that both foo() and Bar know about. You can define a structural type that foo() accepts and whose existence Bar knows nothing about. As long as Bar contains methods that match the structural type's signatures, foo() will be able to call them.
It's great because you can put foo() and Bar in distinct, completely unrelated libraries, that is, with no commonly referenced contract. This reduces linkage requirements and thus further contributes to loose coupling.
In some situations, a structural type can be used as an alternative to the Adapter pattern, because it offers the following advantages:
Object identity is preserved (there is no separate object for the adapter instance, at least in the semantic level).
You don't need to instantiate an adapter - just pass a Bar instance to foo().
You don't need to implement wrapper methods - just declare the required signatures in the structural type.
The structural type doesn't need to know the actual instance class or interface, while the adapter must know Bar so it can call its methods. This way, a single structural type can be used for many actual types, whereas with adapter it's necessary to code multiple classes - one for each actual type.
The only drawback of structural types compared to adapters is that a structural type can't be used to translate method signatures. So, when signatures don't match, you must use adapters that contain some translation logic. I particularly dislike coding "intelligent" adapters because they often end up being more than just adapters and increase complexity. If a client of a class needs some additional method, I prefer to simply add that method, since it usually doesn't affect the footprint.

Where case classes should NOT be used in Scala?

Case classes in Scala are standard classes enhanced with pattern matching, equals, ... (or am I wrong?). Moreover, they require no "new" keyword for their instantiation. It seems to me that they are simpler to define than regular classes (or am I wrong again?).
There are lots of web pages telling where they should be used (mostly about pattern matching). But where should they be avoided? Why don't we use them everywhere?
There are many places where case classes are not adequate:
When one wishes to hide the data structure.
As part of a type hierarchy of more than two or three levels.
When the constructor requires special considerations.
When the extractor requires special considerations.
When equality and hash code require special considerations.
Sometimes these requirements show up late in the design and require one to convert a case class into a normal class. Since the benefits of a case class really aren't all that great -- aside from the few special cases they were specially made for -- my own recommendation is not to make anything a case class unless there's a clear use for it.
Or, in other words, do not overdesign.
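To make a few of those bullet points concrete, here is a sketch of a plain class where the constructor, the data representation and equality all need special care. Email is a made-up example, and the "contains @" check is a deliberately naive placeholder for real validation.

```scala
import java.util.Locale

object CustomEqualityDemo {
  // Private constructor: instances can only come from the validating factory.
  final class Email private (val address: String) {
    // Hidden representation used for comparison.
    private val normalized: String = address.toLowerCase(Locale.ROOT)

    // Equality ignores case, which a generated case class equals would not do.
    override def equals(other: Any): Boolean = other match {
      case that: Email => normalized == that.normalized
      case _           => false
    }
    override def hashCode: Int = normalized.hashCode
    override def toString: String = s"Email($address)"
  }

  object Email {
    // The factory may refuse to build an instance at all.
    def apply(address: String): Option[Email] =
      if (address.contains("@")) Some(new Email(address)) else None
  }
}
```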
Inheriting from case classes is problematic. Suppose you have code like so:
case class Person(name: String) { }
case class Homeowner(address: String, override val name: String)
  extends Person(name) { }
scala> Person("John") == Homeowner("1 Main St","John")
res0: Boolean = true
scala> Homeowner("1 Main St","John") == Person("John")
res1: Boolean = false
Perhaps this is what you want, but usually you want a==b if and only if b==a. Unfortunately, the compiler can't sensibly fix this for you automatically.
This gets even worse because the hashCode of Person("John") is not the same as the hashCode of Homeowner("1 Main St","John"), so now equals acts weird and hashCode acts weird.
As long as you know what to expect, inheriting from case classes can give comprehensible results, but it has come to be viewed as bad form (and thus has been deprecated in 2.8; newer compilers reject case-to-case inheritance outright).
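One common way around the problem, sketched here as an assumption rather than as what the answer prescribes, is to keep case classes as leaves under a shared sealed trait. Resident is a made-up name; equality then stays symmetric because neither case class inherits the other's equals.

```scala
object ResidentDemo {
  sealed trait Resident { def name: String }

  // Both case classes are leaves; neither extends the other.
  final case class Person(name: String) extends Resident
  final case class Homeowner(address: String, name: String) extends Resident

  val john: Resident  = Person("John")
  val owner: Resident = Homeowner("1 Main St", "John")

  val forward: Boolean  = john == owner   // false
  val backward: Boolean = owner == john   // false as well
}
```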
One downside that is mentioned in Programming in Scala is that due to the things automatically generated for case classes the objects get larger than for normal classes, so if memory efficiency is important, you might want to use regular classes.
It can be tempting to use case classes because you want free toString/equals/hashCode. This can cause problems, so avoid doing that.
I do wish there were an annotation that let you get those handy things without making a case class, but maybe that's harder than it sounds.