Where case classes should NOT be used in Scala? - scala

Case classes in Scala are standard classes enhanced with pattern-matching, equals, ... (or am I wrong?). Moreover they require no "new" keyword for their instanciation. It seems to me that they are simpler to define than regular classes (or am I again wrong?).
There are lots of web pages telling where they should be used (mostly about pattern matchin). But where should they be avoided ? Why don't we use them everywhere ?

There are many places where case classes are not adequate:
When one wishes to hide the data structure.
As part of a type hierarchy of more than two or three levels.
When the constructor requires special considerations.
When the extractor requires special considerations.
When equality and hash code requires special considerations.
Sometimes these requirements show up late in the design, and requires one to convert a case class into a normal class. Since the benefits of a case class really aren't all that great -- aside from the few special cases they were specially made for -- my own recommendation is not to make anything a case class unless there's a clear use for it.
Or, in other words, do not overdesign.

Inheriting from case classes is problematic. Suppose you have code like so:
case class Person(name: String) { }
case class Homeowner(address: String,override val name: String)
extends Person(name) { }
scala> Person("John") == Homeowner("1 Main St","John")
res0: Boolean = true
scala> Homeowner("1 Main St","John") == Person("John")
res1: Boolean = false
Perhaps this is what you want, but usually you want a==b if and only if b==a. Unfortunately, the compiler can't sensibly fix this for you automatically.
This gets even worse because the hashCode of Person("John") is not the same as the hashCode of Homeowner("1 Main St","John"), so now equals acts weird and hashCode acts weird.
As long as you know what to expect, inheriting from case classes can give comprehensible results, but it has come to be viewed as bad form (and thus has been deprecated in 2.8).

One downside that is mentioned in Programming in Scala is that due to the things automatically generated for case classes the objects get larger than for normal classes, so if memory efficiency is important, you might want to use regular classes.

It can be tempting to use case classes because you want free toString/equals/hashCode. This can cause problems, so avoid doing that.
I do wish there were an annotation that let you get those handy things without making a case class, but maybe that's harder than it sounds.

Related

why class Set does not exists in scala?

There are many classes which inherit trait Set.
HashSet, TreeSet, etc.
And there's Object(could i call it companion object of trait Set? not in the case of class Set?) Set and trait Set.
It seems to me that just adding one more class "Set" to this list make it seems to be really easy to understand the structure.
is there any reason Class Set should not exists?
If you just need a set, use Set.apply and you will have a valid set that supports all important operations. You don't need to worry how is it implemented. It is prepared to work well for most use cases.
On the other hand, if performance of certain operations matters for you, create a concrete class for concrete implementation of set, and you will know exactly what you have.
In java you would write:
Set<String> strings = new HashSet<>(Arrays.asList("a", "b"));
in scala you could as well have those types
val strings: Set[String] = HashSet("a", "b")
but you can also use a handy factory if you don't need to worry about the type and simply use
val strings = Set("a", "b")
and nothing is wrong with this, and I don't see how adding another class would help at all. It is normal thing to have an interface/trait and concrete implementations, nothing in the middle is needed nor helpful.
Set.apply is a factory for sets. You can check what is the actual class of resulting object using getClass. This factory creates special, optimized sets for sizes 0-4, for example
scala.collection.immutable.Set$EmptySet$
scala.collection.immutable.Set$Set1
scala.collection.immutable.Set$Set2
for bigger sets it is a hash set, namely scala.collection.immutable.HashSet$HashTrieSet.
In Scala there is no overlap between classes and traits. Classes are implementations that can be instantiated, while traits are independently mixable interfaces. The use of Set.apply gives an object with interface Set and that is all you need to know to use it. I fully understand wanting a concrete type, but that would be unnecessary. The right thing to do here is save it to a val of type Set and use only the interface Set provides.
I know that may not be satisfying, but give it time and the Scala type system will make sense in terms of itself, even if that is different than what Java does.

Is it a good idea to add methods to Scala case classes

Case classes are suppose to be algebraic types, therefore some people are against adding methods to the case class.
Can somebody please give an example for why it's a bad idea?
This is one of those questions that leads to more questions.
Following is my take on this.
Lets see what happens when a case class is defined,
The Scala compiler does the following,
Creates a class and its companion object.
Implements the apply method that you can use as a factory. This lets
you create instances of the class without the new keyword.
Prefixes all arguments, in the parameter list, with val. ie. makes it immutable
Adds implementations of hashCode, equals and toString
Implements the unapply method, a case class supports pattern matching. This is important when you define an Algebraic Data Type.
Generates accessors for fields. Note that it does not generate "mutators"
Now as we can see case classes are not exact peers of the Java Beans.
Case classes tend to represent Datatype more than it represents a entity.
I look at them as good friends of programmers in terms of the fact that it cuts down on the boiler plate of endless getters , override equals and hashcode methods etc.
Now coming to the question,
If you look at it from a functional programming standpoint then case classes are the way to go since you would looking at immutability , equality and you are sure that the case class represents a data structure. It is here that a lot of the times people programming in FP say to use them for ADTs.
If your case class has logic that works on the class's state then that makes it a bad choice for functional programming.
I prefer to use case classes for scenarios where i am sure that i need a class to represent a datastructure because thats where i get the help of auto generated methods and the added advantage of patter-matching. When i program in a OO way with side effects ,mutable state i use class .
Having said that there still could be scenarios where you could have a case class with utlity methods. I just think those chances are less.

Why doesn't scala have easier support for Equals?

Reading Martin's book, chapter about equality Object equality chapter one may notice that properly implementing equals in scala (as actually in any other language) is not very straightforward. However scala is extremely powerful and agile, and I cannot believe it could not simplify things a bit. I know that scala generates proper equals for case classes, so I wonder why couldn't it generate simplifications for normal classes?
To show my point, I wrote an example of how would I see this should look like. It probably has flaws, and I had to use ClassTag which I know is very wrong for such basic thing as equals due to performance (any tip how could I do it without ClassTag?), but thinking that scala can generate proper equals for case classes, I'd say it should be able to generate proper code for normal classes, giving that developer provides the Key which should be used to compare objects.
trait Equality[T] extends Equals {
val ttag: ClassTag[T]
def Key: Seq[Any]
def canEqual(other: Any): Boolean = other match {
case that: Equality[_] if that.ttag == ttag => true
case _ => false
}
override def equals(other: Any): Boolean =
other match {
case that: Equality[T] => canEqual(that) && Key == that.Key
case _ => false
}
override def hashCode = Key.foldLeft(1)((x, y) => 41 * x + y.hashCode)
}
Then you can use it like this:
class Point(val x: Int, val y: Int)(implicit val ttag: ClassTag[Point]) extends Equality[Point]{
override def Key: Seq[Any] = Seq(x, y)
}
I'm not very much into ClassTags, so I might have made it wrong, but it seems to be working. Still that's not what I am asking - I want to know is there any serious reasons, why scala itself do not simplify implementing equality checks?
It is an interesting idea, and I can see from a comment that scalaz has something like this. I'm not sure I have a complete answer, but some factors to consider are:
Case classes are a bit unique in that you're not supposed to inherit from them. Assuming they're not sub-classed, canEqual isn't even required.
There's something elegant about the idea that even '+' in scala is a function not a 'language feature.' It's not necessarily better to have this as a language feature instead of a library (and there are a number of utilities even in Java to help with implementing hashcode/equals). The existing "equals" method isn't a language feature, it's just a method on the parent class Object, inherited from Java.
Scala does still need to play-nice with Java, and that could be one barrier to radically changing how equals works. Interface requirements can't be imposed on existing Java classes that scala might want to inherit from.
When you do consider what the syntax might look like if it were a language feature, I'm not sure what could actually be eliminated or changed.
For example, the developer still has to specify the Key you suggest. Also, to be able to support, for example, anonymous subclasses and trait mix-ins where canEqual won't change, with a language feature you'd still need to explicitly define your class tag.
Maybe the analysis could be more interesting if you provide what, syntactically, it might look like if it were a language feature instead of a helper library. I might be missing some aspects of how this would work.
There is no such thing "proper equals" since it depends on the use cases.
For simple cases what you suggest may work, but in such case, using case class with also work. The problem is that while for simple Data classes it works, the same isn't true for other cases. For example, when there are inheritance - what is the right thing to do? if class have private members - does it should be part of the equals?? does public getters should be ? What will happen when when there are cyclic dependencies ?
One of the main reasons for the case classes cannot be inherit by other case class is the equality.

Memory overhead of Case classes in scala

What is the memory overhead of a case class in scala ?
I've implemented some code to hold a lexicon with multiple types of interned tokens for NLP processing. I've got a case class for each token type.
For example, the canonical lemma/stem token is as follows:
sealed trait InternedLexAtom extends LexAtom{
def id : Int
}
case class Lemma(id: Int) extends InternedLexAtom
I'm going to be returning document vectors of these interned tokens, the reason I wrap them in case classes is to be able to add methods to the tokens via implicit classes. The reason I use this way of adding behaviour to the lexeme's is because I want the lexemes to have different methods based on different contexts.
So I'm hoping the answer will be zero memory overhead due to type erasure. Is this the case ?
I have a suspicion that a single pointer might be packed with the parameters for some of the magic Scala can do :(
justification
To put things in perspective. The JVM uses 1.5-2gigs of memory with my lexicon loaded (the lexicon does not use cases classes in it's in-memory representation), and C++ does the same in 500-700 mb of memory. If my codebase keeps scaling it's memory requirements the way it is now I'm not going to be able to do this stuff on my laptop (in-memory)
I'll sidestep the problem by structuring my code differently. For example I can just strip away the case classes in vector representations if I need to. Would be nice if I didn't have to.
Question Extension.
Robin and Pedro have addressed the use-case, thank you. In this case I was missing value classes. With those there are no more downsides. additionally: I tried my best not to mention C++'s POD concept. But now I must ask :D A c++ POD is just a struct with primitive values. If I wanted to pack more than just one value into value class, how would I achieve this ? I am assuming this would be what I want to do ?
class SuperTriple(val underlying: Tuple2[Int,Int]) extends AnyVal {
def super: underlying._1
def triple: underlying._2
}
I do actually need the above construct, since a SuperTriple is what I am using as my vector model symbol :D
The original question still remains "what is the overhead of a case class".
In Scala 2.10 you can use value classes. (In older versions of Scala, for something with zero overhead for just one member, you need to use unboxed tagged types.)

which GOF Design pattern(s) has entirely different implementation (java vs Scala)

Recently I read following SO question :
Is there any use cases for employing the Visitor Pattern in Scala?
Should I use Pattern Matching in Scala every time I would have used
the Visitor Pattern in Java?
The link to the question with title:
Visitor Pattern in Scala. The accepted answer begins with
Yes, you should probably start off with pattern matching instead of
the visitor pattern. See this
http://www.artima.com/scalazine/articles/pattern_matching.html
My question (inspired by above mentioned question) is which GOF Design pattern(s) has entirely different implementation in Scala? Where should I be careful and not follow java based programming model of Design Patterns (Gang of Four), if I am programming in Scala?
Creational patterns
Abstract Factory
Builder
Factory Method
Prototype
Singleton : Directly create an Object (scala)
Structural patterns
Adapter
Bridge
Composite
Decorator
Facade
Flyweight
Proxy
Behavioral patterns
Chain of responsibility
Command
Interpreter
Iterator
Mediator
Memento
Observer
State
Strategy
Template method
Visitor : Patten Matching (scala)
For almost all of these, there are Scala alternatives that cover some but not all of the use cases for these patterns. All of this is IMO, of course, but:
Creational Patterns
Builder
Scala can do this more elegantly with generic types than can Java, but the general idea is the same. In Scala, the pattern is most simply implemented as follows:
trait Status
trait Done extends Status
trait Need extends Status
case class Built(a: Int, b: String) {}
class Builder[A <: Status, B <: Status] private () {
private var built = Built(0,"")
def setA(a0: Int) = { built = built.copy(a = a0); this.asInstanceOf[Builder[Done,B]] }
def setB(b0: String) = { built = built.copy(b = b0); this.asInstanceOf[Builder[A,Done]] }
def result(implicit ev: Builder[A,B] <:< Builder[Done,Done]) = built
}
object Builder {
def apply() = new Builder[Need, Need]
}
(If you try this in the REPL, make sure that the class and object Builder are defined in the same block, i.e. use :paste.) The combination of checking types with <:<, generic type arguments, and the copy method of case classes make a very powerful combination.
Factory Method (and Abstract Factory Method)
Factory methods' main use is to keep your types straight; otherwise you may as well use constructors. With Scala's powerful type system, you don't need help keeping your types straight, so you may as well use the constructor or an apply method in the companion object to your class and create things that way. In the companion-object case in particular, it is no harder to keep that interface consistent than it is to keep the interface in the factory object consistent. Thus, most of the motivation for factory objects is gone.
Similarly, many cases of abstract factory methods can be replaced by having a companion object inherit from an appropriate trait.
Prototype
Of course overridden methods and the like have their place in Scala. However, the examples used for the Prototype pattern on the Design Patterns web site are rather inadvisable in Scala (or Java IMO). However, if you wish to have a superclass select actions based on its subclasses rather than letting them decide for themselves, you should use match rather than the clunky instanceof tests.
Singleton
Scala embraces these with object. They are singletons--use and enjoy!
Structural Patterns
Adapter
Scala's trait provides much more power here--rather than creating a class that implements an interface, for example, you can create a trait which implements only part of the interface, leaving the rest for you to define. For example, java.awt.event.MouseMotionListener requires you to fill in two methods:
def mouseDragged(me: java.awt.event.MouseEvent)
def mouseMoved(me: java.awt.event.MouseEvent)
Maybe you want to ignore dragging. Then you write a trait:
trait MouseMoveListener extends java.awt.event.MouseMotionListener {
def mouseDragged(me: java.awt.event.MouseEvent) {}
}
Now you can implement only mouseMoved when you inherit from this. So: similar pattern, but much more power with Scala.
Bridge
You can write bridges in Scala. It's a huge amount of boilerplate, though not quite as bad as in Java. I wouldn't recommend routinely using this as a method of abstraction; think about your interfaces carefully first. Keep in mind that with the increased power of traits that you can often use those to simplify a more elaborate interface in a place where otherwise you might be tempted to write a bridge.
In some cases, you may wish to write an interface transformer instead of the Java bridge pattern. For example, perhaps you want to treat drags and moves of the mouse using the same interface with only a boolean flag distinguishing them. Then you can
trait MouseMotioner extends java.awt.event.MouseMotionListener {
def mouseMotion(me: java.awt.event.MouseEvent, drag: Boolean): Unit
def mouseMoved(me: java.awt.event.MouseEvent) { mouseMotion(me, false) }
def mouseDragged(me: java.awt.event.MouseEvent) { mouseMotion(me, true) }
}
This lets you skip the majority of the bridge pattern boilerplate while accomplishing a high degree of implementation independence and still letting your classes obey the original interface (so you don't have to keep wrapping and unwrapping them).
Composite
The composite pattern is particularly easy to achieve with case classes, though making updates is rather arduous. It is equally valuable in Scala and Java.
Decorator
Decorators are awkward. You usually don't want to use the same methods on a different class in the case where inheritance isn't exactly what you want; what you really want is a different method on the same class which does what you want instead of the default thing. The enrich-my-library pattern is often a superior substitute.
Facade
Facade works better in Scala than in Java because you can have traits carry partial implementations around so you don't have to do all the work yourself when you combine them.
Flyweight
Although the flyweight idea is as valid in Scala as Java, you have a couple more tools at your disposal to implement it: lazy val, where a variable is not created unless it's actually needed (and thereafter is reused), and by-name parameters, where you only do the work required to create a function argument if the function actually uses that value. That said, in some cases the Java pattern stands unchanged.
Proxy
Works the same way in Scala as Java.
Behavioral Patterns
Chain of responsibility
In those cases where you can list the responsible parties in order, you can
xs.find(_.handleMessage(m))
assuming that everyone has a handleMessage method that returns true if the message was handled. If you want to mutate the message as it goes, use a fold instead.
Since it's easy to drop responsible parties into a Buffer of some sort, the elaborate framework used in Java solutions rarely has a place in Scala.
Command
This pattern is almost entirely superseded by functions. For example, instead of all of
public interface ChangeListener extends EventListener {
void stateChanged(ChangeEvent e)
}
...
void addChangeListener(ChangeListener listener) { ... }
you simply
def onChange(f: ChangeEvent => Unit)
Interpreter
Scala provides parser combinators which are dramatically more powerful than the simple interpreter suggested as a Design Pattern.
Iterator
Scala has Iterator built into its standard library. It is almost trivial to make your own class extend Iterator or Iterable; the latter is usually better since it makes reuse trivial. Definitely a good idea, but so straightforward I'd hardly call it a pattern.
Mediator
This works fine in Scala, but is generally useful for mutable data, and even mediators can fall afoul of race conditions and such if not used carefully. Instead, try when possible to have your related data all stored in one immutable collection, case class, or whatever, and when making an update that requires coordinated changes, change all things at the same time. This won't help you interface with javax.swing, but is otherwise widely applicable:
case class Entry(s: String, d: Double, notes: Option[String]) {}
def parse(s0: String, old: Entry) = {
try { old.copy(s = s0, d = s0.toDouble) }
catch { case e: Exception => old }
}
Save the mediator pattern for when you need to handle multiple different relationships (one mediator for each), or when you have mutable data.
Memento
lazy val is nearly ideal for many of the simplest applications of the memento pattern, e.g.
class OneRandom {
lazy val value = scala.util.Random.nextInt
}
val r = new OneRandom
r.value // Evaluated here
r.value // Same value returned again
You may wish to create a small class specifically for lazy evaluation:
class Lazily[A](a: => A) {
lazy val value = a
}
val r = Lazily(scala.util.Random.nextInt)
// not actually called until/unless we ask for r.value
Observer
This is a fragile pattern at best. Favor, whenever possible, either keeping immutable state (see Mediator), or using actors where one actor sends messages to all others regarding the state change, but where each actor can cope with being out of date.
State
This is equally useful in Scala, and is actually the favored way to create enumerations when applied to methodless traits:
sealed trait DayOfWeek
final trait Sunday extends DayOfWeek
...
final trait Saturday extends DayOfWeek
(often you'd want the weekdays to do something to justify this amount of boilerplate).
Strategy
This is almost entirely replaced by having methods take functions that implement a strategy, and providing functions to choose from.
def printElapsedTime(t: Long, rounding: Double => Long = math.round) {
println(rounding(t*0.001))
}
printElapsedTime(1700, math.floor) // Change strategy
Template Method
Traits offer so many more possibilities here that it's best to just consider them another pattern. You can fill in as much code as you can from as much information as you have at your level of abstraction. I wouldn't really want to call it the same thing.
Visitor
Between structural typing and implicit conversion, Scala has astoundingly more capability than Java's typical visitor pattern. There's no point using the original pattern; you'll just get distracted from the right way to do it. Many of the examples are really just wishing there was a function defined on the thing being visited, which Scala can do for you trivially (i.e. convert an arbitrary method to a function).
Ok, let's have a brief look at these patterns. I'm looking at all these patterns purely from a functional programming point of view, and leaving out many things that Scala can improve from an OO point of view. Rex Kerr answer provides an interesting counter-point to my own answers (I only read his answer after writing my own).
With that in mind, I'd like to say that it is important to study persistent data structures (functionally pure data structures) and monads. If you want to go deep, I think category theory basics are important -- category theory can formally describe all program structures, including imperative ones.
Creational Patterns
A constructor is nothing more than a function. A parameterless constructor for type T is nothing more than a function () => T, for example. In fact, Scala's syntactical sugar for functions is taken advantage on case classes:
case class T(x: Int)
That is equivalent to:
class T(val x: Int) { /* bunch of methods */ }
object T {
def apply(x: Int) = new T(x)
/* other stuff */
}
So that you can instantiate T with T(n) instead of new T(n). You could even write it like this:
object T extends Int => T {
def apply(x: Int) = new T(x)
/* other stuff */
}
Which turns T into a formal function, without changing any code.
This is the important point to keep in mind when thinking of creational patterns. So let's look at them:
Abstract Factory
This one is unlikely to change much. A class can be thought of as a group of closely related functions, so a group of closely related functions is easily implemented through a class, which is what this pattern does for constructors.
Builder
Builder patterns can be replaced by curried functions or partial function applications.
def makeCar: Size => Engine => Luxuries => Car = ???
def makeLargeCars = makeCar(Size.Large) _
def makeCar: (Size, Engine, Luxuries) => Car = ???
def makeLargeCars = makeCar(Size.Large, _: Engine, _: Luxuries)
Factory Method
Becomes obsolete if you discard subclassing.
Prototype
Doesn't change -- in fact, this is a common way of creating data in functional data structures. See case classes copy method, or all non-mutable methods on collections which return collections.
Singleton
Singletons are not particularly useful when your data is immutable, but Scala object implements this pattern is a safe manner.
Structural Patterns
This is mostly related to data structures, and the important point on functional programming is that the data structures are usually immutable. You'd be better off looking at persistent data structures, monads and related concepts than trying to translate these patterns.
Not that some patterns here are not relevant. I'm just saying that, as a general rule, you should look into the things above instead of trying to translate structural patterns into functional equivalents.
Adapter
This pattern is related to classes (nominal typing), so it remains important as long as you have that, and is irrelevant when you don't.
Bridge
Related to OO architecture, so the same as above.
Composite
Lot at Lenses and Zippers.
Decorator
A Decorator is just function composition. If you are decorating a whole class, that may not apply. But if you provide your functionality as functions, then composing a function while maintaining its type is a decorator.
Facade
Same comment as for Bridge.
Flyweight
If you think of constructors as functions, think of flyweight as function memoization. Also, Flyweight is intrinsic related to how persistent data structures are built, and benefits a lot from immutability.
Proxy
Same comment as for Adapter.
Behavioral Patterns
This is all over the place. Some of them are completely useless, while others are as relevant as always in a functional setting.
Chain of Responsibility
Like Decorator, this is function composition.
Command
This is a function. The undo part is not necessary if your data is immutable. Otherwise, just keep a pair of function and its reverse. See also Lenses.
Interpreter
This is a monad.
Iterator
It can be rendered obsolete by just passing a function to the collection. That's what Traversable does with foreach, in fact. Also, see Iteratee.
Mediator
Still relevant.
Memento
Useless with immutable objects. Also, its point is keeping encapsulation, which is not a major concern in FP.
Note that this pattern is not serialization, which is still relevant.
Observer
Relevant, but see Functional Reactive Programming.
State
This is a monad.
Strategy
A strategy is a function.
Template Method
This is an OO design pattern, so it's relevant for OO designs.
Visitor
A visitor is just a method receiving a function. In fact, that's what Traversable's foreach does.
In Scala, it can also be replaced with extractors.
I suppose, Command pattern not needed in functional languages at all. Instead of encapsulation command function inside object and then selecting appropriate object, just use appropriate function itself.
Flyweight is just cache, and has default implementation in most functional languages (memoize in clojure)
Even Template method, Strategy and State can be implemented with just passing appropriate function in method.
So, I recommend to not go deep in Design Patterns when you tries yourself in functional style but reading some books about functional concepts (high-order functions, laziness, currying, and so on)