Case classes are suppose to be algebraic types, therefore some people are against adding methods to the case class.
Can somebody please give an example for why it's a bad idea?
This is one of those questions that leads to more questions.
Following is my take on this.
Lets see what happens when a case class is defined,
The Scala compiler does the following,
Creates a class and its companion object.
Implements the apply method that you can use as a factory. This lets
you create instances of the class without the new keyword.
Prefixes all arguments, in the parameter list, with val. ie. makes it immutable
Adds implementations of hashCode, equals and toString
Implements the unapply method, a case class supports pattern matching. This is important when you define an Algebraic Data Type.
Generates accessors for fields. Note that it does not generate "mutators"
Now as we can see case classes are not exact peers of the Java Beans.
Case classes tend to represent Datatype more than it represents a entity.
I look at them as good friends of programmers in terms of the fact that it cuts down on the boiler plate of endless getters , override equals and hashcode methods etc.
Now coming to the question,
If you look at it from a functional programming standpoint then case classes are the way to go since you would looking at immutability , equality and you are sure that the case class represents a data structure. It is here that a lot of the times people programming in FP say to use them for ADTs.
If your case class has logic that works on the class's state then that makes it a bad choice for functional programming.
I prefer to use case classes for scenarios where i am sure that i need a class to represent a datastructure because thats where i get the help of auto generated methods and the added advantage of patter-matching. When i program in a OO way with side effects ,mutable state i use class .
Having said that there still could be scenarios where you could have a case class with utlity methods. I just think those chances are less.
Related
What is the memory overhead of a case class in scala ?
I've implemented some code to hold a lexicon with multiple types of interned tokens for NLP processing. I've got a case class for each token type.
For example, the canonical lemma/stem token is as follows:
sealed trait InternedLexAtom extends LexAtom{
def id : Int
}
case class Lemma(id: Int) extends InternedLexAtom
I'm going to be returning document vectors of these interned tokens, the reason I wrap them in case classes is to be able to add methods to the tokens via implicit classes. The reason I use this way of adding behaviour to the lexeme's is because I want the lexemes to have different methods based on different contexts.
So I'm hoping the answer will be zero memory overhead due to type erasure. Is this the case ?
I have a suspicion that a single pointer might be packed with the parameters for some of the magic Scala can do :(
justification
To put things in perspective. The JVM uses 1.5-2gigs of memory with my lexicon loaded (the lexicon does not use cases classes in it's in-memory representation), and C++ does the same in 500-700 mb of memory. If my codebase keeps scaling it's memory requirements the way it is now I'm not going to be able to do this stuff on my laptop (in-memory)
I'll sidestep the problem by structuring my code differently. For example I can just strip away the case classes in vector representations if I need to. Would be nice if I didn't have to.
Question Extension.
Robin and Pedro have addressed the use-case, thank you. In this case I was missing value classes. With those there are no more downsides. additionally: I tried my best not to mention C++'s POD concept. But now I must ask :D A c++ POD is just a struct with primitive values. If I wanted to pack more than just one value into value class, how would I achieve this ? I am assuming this would be what I want to do ?
class SuperTriple(val underlying: Tuple2[Int,Int]) extends AnyVal {
def super: underlying._1
def triple: underlying._2
}
I do actually need the above construct, since a SuperTriple is what I am using as my vector model symbol :D
The original question still remains "what is the overhead of a case class".
In Scala 2.10 you can use value classes. (In older versions of Scala, for something with zero overhead for just one member, you need to use unboxed tagged types.)
Recently I read following SO question :
Is there any use cases for employing the Visitor Pattern in Scala?
Should I use Pattern Matching in Scala every time I would have used
the Visitor Pattern in Java?
The link to the question with title:
Visitor Pattern in Scala. The accepted answer begins with
Yes, you should probably start off with pattern matching instead of
the visitor pattern. See this
http://www.artima.com/scalazine/articles/pattern_matching.html
My question (inspired by above mentioned question) is which GOF Design pattern(s) has entirely different implementation in Scala? Where should I be careful and not follow java based programming model of Design Patterns (Gang of Four), if I am programming in Scala?
Creational patterns
Abstract Factory
Builder
Factory Method
Prototype
Singleton : Directly create an Object (scala)
Structural patterns
Adapter
Bridge
Composite
Decorator
Facade
Flyweight
Proxy
Behavioral patterns
Chain of responsibility
Command
Interpreter
Iterator
Mediator
Memento
Observer
State
Strategy
Template method
Visitor : Patten Matching (scala)
For almost all of these, there are Scala alternatives that cover some but not all of the use cases for these patterns. All of this is IMO, of course, but:
Creational Patterns
Builder
Scala can do this more elegantly with generic types than can Java, but the general idea is the same. In Scala, the pattern is most simply implemented as follows:
trait Status
trait Done extends Status
trait Need extends Status
case class Built(a: Int, b: String) {}
class Builder[A <: Status, B <: Status] private () {
private var built = Built(0,"")
def setA(a0: Int) = { built = built.copy(a = a0); this.asInstanceOf[Builder[Done,B]] }
def setB(b0: String) = { built = built.copy(b = b0); this.asInstanceOf[Builder[A,Done]] }
def result(implicit ev: Builder[A,B] <:< Builder[Done,Done]) = built
}
object Builder {
def apply() = new Builder[Need, Need]
}
(If you try this in the REPL, make sure that the class and object Builder are defined in the same block, i.e. use :paste.) The combination of checking types with <:<, generic type arguments, and the copy method of case classes make a very powerful combination.
Factory Method (and Abstract Factory Method)
Factory methods' main use is to keep your types straight; otherwise you may as well use constructors. With Scala's powerful type system, you don't need help keeping your types straight, so you may as well use the constructor or an apply method in the companion object to your class and create things that way. In the companion-object case in particular, it is no harder to keep that interface consistent than it is to keep the interface in the factory object consistent. Thus, most of the motivation for factory objects is gone.
Similarly, many cases of abstract factory methods can be replaced by having a companion object inherit from an appropriate trait.
Prototype
Of course overridden methods and the like have their place in Scala. However, the examples used for the Prototype pattern on the Design Patterns web site are rather inadvisable in Scala (or Java IMO). However, if you wish to have a superclass select actions based on its subclasses rather than letting them decide for themselves, you should use match rather than the clunky instanceof tests.
Singleton
Scala embraces these with object. They are singletons--use and enjoy!
Structural Patterns
Adapter
Scala's trait provides much more power here--rather than creating a class that implements an interface, for example, you can create a trait which implements only part of the interface, leaving the rest for you to define. For example, java.awt.event.MouseMotionListener requires you to fill in two methods:
def mouseDragged(me: java.awt.event.MouseEvent)
def mouseMoved(me: java.awt.event.MouseEvent)
Maybe you want to ignore dragging. Then you write a trait:
trait MouseMoveListener extends java.awt.event.MouseMotionListener {
def mouseDragged(me: java.awt.event.MouseEvent) {}
}
Now you can implement only mouseMoved when you inherit from this. So: similar pattern, but much more power with Scala.
Bridge
You can write bridges in Scala. It's a huge amount of boilerplate, though not quite as bad as in Java. I wouldn't recommend routinely using this as a method of abstraction; think about your interfaces carefully first. Keep in mind that with the increased power of traits that you can often use those to simplify a more elaborate interface in a place where otherwise you might be tempted to write a bridge.
In some cases, you may wish to write an interface transformer instead of the Java bridge pattern. For example, perhaps you want to treat drags and moves of the mouse using the same interface with only a boolean flag distinguishing them. Then you can
trait MouseMotioner extends java.awt.event.MouseMotionListener {
def mouseMotion(me: java.awt.event.MouseEvent, drag: Boolean): Unit
def mouseMoved(me: java.awt.event.MouseEvent) { mouseMotion(me, false) }
def mouseDragged(me: java.awt.event.MouseEvent) { mouseMotion(me, true) }
}
This lets you skip the majority of the bridge pattern boilerplate while accomplishing a high degree of implementation independence and still letting your classes obey the original interface (so you don't have to keep wrapping and unwrapping them).
Composite
The composite pattern is particularly easy to achieve with case classes, though making updates is rather arduous. It is equally valuable in Scala and Java.
Decorator
Decorators are awkward. You usually don't want to use the same methods on a different class in the case where inheritance isn't exactly what you want; what you really want is a different method on the same class which does what you want instead of the default thing. The enrich-my-library pattern is often a superior substitute.
Facade
Facade works better in Scala than in Java because you can have traits carry partial implementations around so you don't have to do all the work yourself when you combine them.
Flyweight
Although the flyweight idea is as valid in Scala as Java, you have a couple more tools at your disposal to implement it: lazy val, where a variable is not created unless it's actually needed (and thereafter is reused), and by-name parameters, where you only do the work required to create a function argument if the function actually uses that value. That said, in some cases the Java pattern stands unchanged.
Proxy
Works the same way in Scala as Java.
Behavioral Patterns
Chain of responsibility
In those cases where you can list the responsible parties in order, you can
xs.find(_.handleMessage(m))
assuming that everyone has a handleMessage method that returns true if the message was handled. If you want to mutate the message as it goes, use a fold instead.
Since it's easy to drop responsible parties into a Buffer of some sort, the elaborate framework used in Java solutions rarely has a place in Scala.
Command
This pattern is almost entirely superseded by functions. For example, instead of all of
public interface ChangeListener extends EventListener {
void stateChanged(ChangeEvent e)
}
...
void addChangeListener(ChangeListener listener) { ... }
you simply
def onChange(f: ChangeEvent => Unit)
Interpreter
Scala provides parser combinators which are dramatically more powerful than the simple interpreter suggested as a Design Pattern.
Iterator
Scala has Iterator built into its standard library. It is almost trivial to make your own class extend Iterator or Iterable; the latter is usually better since it makes reuse trivial. Definitely a good idea, but so straightforward I'd hardly call it a pattern.
Mediator
This works fine in Scala, but is generally useful for mutable data, and even mediators can fall afoul of race conditions and such if not used carefully. Instead, try when possible to have your related data all stored in one immutable collection, case class, or whatever, and when making an update that requires coordinated changes, change all things at the same time. This won't help you interface with javax.swing, but is otherwise widely applicable:
case class Entry(s: String, d: Double, notes: Option[String]) {}
def parse(s0: String, old: Entry) = {
try { old.copy(s = s0, d = s0.toDouble) }
catch { case e: Exception => old }
}
Save the mediator pattern for when you need to handle multiple different relationships (one mediator for each), or when you have mutable data.
Memento
lazy val is nearly ideal for many of the simplest applications of the memento pattern, e.g.
class OneRandom {
lazy val value = scala.util.Random.nextInt
}
val r = new OneRandom
r.value // Evaluated here
r.value // Same value returned again
You may wish to create a small class specifically for lazy evaluation:
class Lazily[A](a: => A) {
lazy val value = a
}
val r = Lazily(scala.util.Random.nextInt)
// not actually called until/unless we ask for r.value
Observer
This is a fragile pattern at best. Favor, whenever possible, either keeping immutable state (see Mediator), or using actors where one actor sends messages to all others regarding the state change, but where each actor can cope with being out of date.
State
This is equally useful in Scala, and is actually the favored way to create enumerations when applied to methodless traits:
sealed trait DayOfWeek
final trait Sunday extends DayOfWeek
...
final trait Saturday extends DayOfWeek
(often you'd want the weekdays to do something to justify this amount of boilerplate).
Strategy
This is almost entirely replaced by having methods take functions that implement a strategy, and providing functions to choose from.
def printElapsedTime(t: Long, rounding: Double => Long = math.round) {
println(rounding(t*0.001))
}
printElapsedTime(1700, math.floor) // Change strategy
Template Method
Traits offer so many more possibilities here that it's best to just consider them another pattern. You can fill in as much code as you can from as much information as you have at your level of abstraction. I wouldn't really want to call it the same thing.
Visitor
Between structural typing and implicit conversion, Scala has astoundingly more capability than Java's typical visitor pattern. There's no point using the original pattern; you'll just get distracted from the right way to do it. Many of the examples are really just wishing there was a function defined on the thing being visited, which Scala can do for you trivially (i.e. convert an arbitrary method to a function).
Ok, let's have a brief look at these patterns. I'm looking at all these patterns purely from a functional programming point of view, and leaving out many things that Scala can improve from an OO point of view. Rex Kerr answer provides an interesting counter-point to my own answers (I only read his answer after writing my own).
With that in mind, I'd like to say that it is important to study persistent data structures (functionally pure data structures) and monads. If you want to go deep, I think category theory basics are important -- category theory can formally describe all program structures, including imperative ones.
Creational Patterns
A constructor is nothing more than a function. A parameterless constructor for type T is nothing more than a function () => T, for example. In fact, Scala's syntactical sugar for functions is taken advantage on case classes:
case class T(x: Int)
That is equivalent to:
class T(val x: Int) { /* bunch of methods */ }
object T {
def apply(x: Int) = new T(x)
/* other stuff */
}
So that you can instantiate T with T(n) instead of new T(n). You could even write it like this:
object T extends Int => T {
def apply(x: Int) = new T(x)
/* other stuff */
}
Which turns T into a formal function, without changing any code.
This is the important point to keep in mind when thinking of creational patterns. So let's look at them:
Abstract Factory
This one is unlikely to change much. A class can be thought of as a group of closely related functions, so a group of closely related functions is easily implemented through a class, which is what this pattern does for constructors.
Builder
Builder patterns can be replaced by curried functions or partial function applications.
def makeCar: Size => Engine => Luxuries => Car = ???
def makeLargeCars = makeCar(Size.Large) _
def makeCar: (Size, Engine, Luxuries) => Car = ???
def makeLargeCars = makeCar(Size.Large, _: Engine, _: Luxuries)
Factory Method
Becomes obsolete if you discard subclassing.
Prototype
Doesn't change -- in fact, this is a common way of creating data in functional data structures. See case classes copy method, or all non-mutable methods on collections which return collections.
Singleton
Singletons are not particularly useful when your data is immutable, but Scala object implements this pattern is a safe manner.
Structural Patterns
This is mostly related to data structures, and the important point on functional programming is that the data structures are usually immutable. You'd be better off looking at persistent data structures, monads and related concepts than trying to translate these patterns.
Not that some patterns here are not relevant. I'm just saying that, as a general rule, you should look into the things above instead of trying to translate structural patterns into functional equivalents.
Adapter
This pattern is related to classes (nominal typing), so it remains important as long as you have that, and is irrelevant when you don't.
Bridge
Related to OO architecture, so the same as above.
Composite
Lot at Lenses and Zippers.
Decorator
A Decorator is just function composition. If you are decorating a whole class, that may not apply. But if you provide your functionality as functions, then composing a function while maintaining its type is a decorator.
Facade
Same comment as for Bridge.
Flyweight
If you think of constructors as functions, think of flyweight as function memoization. Also, Flyweight is intrinsic related to how persistent data structures are built, and benefits a lot from immutability.
Proxy
Same comment as for Adapter.
Behavioral Patterns
This is all over the place. Some of them are completely useless, while others are as relevant as always in a functional setting.
Chain of Responsibility
Like Decorator, this is function composition.
Command
This is a function. The undo part is not necessary if your data is immutable. Otherwise, just keep a pair of function and its reverse. See also Lenses.
Interpreter
This is a monad.
Iterator
It can be rendered obsolete by just passing a function to the collection. That's what Traversable does with foreach, in fact. Also, see Iteratee.
Mediator
Still relevant.
Memento
Useless with immutable objects. Also, its point is keeping encapsulation, which is not a major concern in FP.
Note that this pattern is not serialization, which is still relevant.
Observer
Relevant, but see Functional Reactive Programming.
State
This is a monad.
Strategy
A strategy is a function.
Template Method
This is an OO design pattern, so it's relevant for OO designs.
Visitor
A visitor is just a method receiving a function. In fact, that's what Traversable's foreach does.
In Scala, it can also be replaced with extractors.
I suppose, Command pattern not needed in functional languages at all. Instead of encapsulation command function inside object and then selecting appropriate object, just use appropriate function itself.
Flyweight is just cache, and has default implementation in most functional languages (memoize in clojure)
Even Template method, Strategy and State can be implemented with just passing appropriate function in method.
So, I recommend to not go deep in Design Patterns when you tries yourself in functional style but reading some books about functional concepts (high-order functions, laziness, currying, and so on)
I've noticed that the Scala standard library uses two different strategies for organizing classes, traits, and singleton objects.
Using packages whose members are them imported. This is, for example, how you get access to scala.collection.mutable.ListBuffer. This technique is familiar coming from Java, Python, etc.
Using type members of traits. This is, for example, how you get access to the Parser type. You first need to mix in scala.util.parsing.combinator.Parsers. This technique is not familiar coming from Java, Python, etc, and isn't much used in third-party libraries.
I guess one advantage of (2) is that it organizes both methods and types, but in light of Scala 2.8's package objects the same can be done using (1). Why have both these strategies? When should each be used?
The nomenclature of note here is path-dependent types. That's the option number 2 you talk of, and I'll speak only of it. Unless you happen to have a problem solved by it, you should always take option number 1.
What you miss is that the Parser class makes reference to things defined in the Parsers class. In fact, the Parser class itself depends on what input has been defined on Parsers:
abstract class Parser[+T] extends (Input => ParseResult[T])
The type Input is defined like this:
type Input = Reader[Elem]
And Elem is abstract. Consider, for instance, RegexParsers and TokenParsers. The former defines Elem as Char, while the latter defines it as Token. That means the Parser for the each is different. More importantly, because Parser is a subclass of Parsers, the Scala compiler will make sure at compile time you aren't passing the RegexParsers's Parser to TokenParsers or vice versa. As a matter of fact, you won't even be able to pass the Parser of one instance of RegexParsers to another instance of it.
The second is also known as the Cake pattern.
It has the benefit that the code inside the class that has a trait mixed in becomes independent of the particular implementation of the methods and types in that trait. It allows to use the members of the trait without knowing what's their concrete implementation.
trait Logging {
def log(msg: String)
}
trait App extends Logging {
log("My app started.")
}
Above, the Logging trait is the requirement for the App (requirements can also be expressed with self-types). Then, at some point in your application you can decide what the implementation will be and mix the implementation trait into the concrete class.
trait ConsoleLogging extends Logging {
def log(msg: String) = println(msg)
}
object MyApp extends App with ConsoleLogging
This has an advantage over imports, in the sense that the requirements of your piece of code aren't bound to the implementation defined by the import statement. Furthermore, it allows you to build and distribute an API which can be used in a different build somewhere else provided that its requirements are met by mixing in a concrete implementation.
However, there are a few things to be careful with when using this pattern.
All of the classes defined inside the trait will have a reference to the outer class. This can be an issue where performance is concerned, or when you're using serialization (when the outer class is not serializable, or worse, if it is, but you don't want it to be serialized).
If your 'module' gets really large, you will either have a very big trait and a very big source file, or will have to distribute the module trait code across several files. This can lead to some boilerplate.
It can force you to have to write your entire application using this paradigm. Before you know it, every class will have to have its requirements mixed in.
The concrete implementation must be known at compile time, unless you use some sort of hand-written delegation. You cannot mix in an implementation trait dynamically based on a value available at runtime.
I guess the library designers didn't regard any of the above as an issue where Parsers are concerned.
Structural types are one of those "wow, cool!" features of Scala. However, For every example I can think of where they might help, implicit conversions and dynamic mixin composition often seem like better matches. What are some common uses for them and/or advice on when they are appropriate?
Aside from the rare case of classes which provide the same method but aren't related nor do implement a common interface (for example, the close() method -- Source, for one, does not extend Closeable), I find no use for structural types with their present restriction. If they were more flexible, however, I could well write something like this:
def add[T: { def +(x: T): T }](a: T, b: T) = a + b
which would neatly handle numeric types. Every time I think structural types might help me with something, I hit that particular wall.
EDIT
However unuseful I find structural types myself, the compiler, however, uses it to handle anonymous classes. For example:
implicit def toTimes(count: Int) = new {
def times(block: => Unit) = 1 to count foreach { _ => block }
}
5 times { println("This uses structural types!") }
The object resulting from (the implicit) toTimes(5) is of type { def times(block: => Unit) }, ie, a structural type.
I don't know if Scala does that for every anonymous class -- perhaps it does. Alas, that is one reason why doing pimp my library that way is slow, as structural types use reflection to invoke the methods. Instead of an anonymous class, one should use a real class to avoid performance issues in pimp my library.
Structural types are very cool constructs in Scala. I've used them to represent multiple unrelated types that share an attribute upon which I want to perform a common operation without a new level of abstraction.
I have heard one argument against structural types from people who are strict about an application's architecture. They feel it is dangerous to apply a common operation across types without an associative trait or parent type, because you then leave the rule of what type the method should apply to open-ended. Daniel's close() example is spot on, but what if you have another type that requires different behavior? Someone who doesn't understand the architecture might use it and cause problems in the system.
I think structural types are one of these features that you don't need that often, but when you need it, it helps you a lot. One area where structural types really shine is "retrofitting", e.g. when you need to glue together several pieces of software you have no source code for and which were not intended for reuse. But if you find yourself using structural types a lot, you're probably doing it wrong.
[Edit]
Of course implicits are often the way to go, but there are cases when you can't: Imagine you have a mutable object you can modify with methods, but which hides important parts of it's state, a kind of "black box". Then you have to work somehow with this object.
Another use case for structural types is when code relies on naming conventions without a common interface, e.g. in machine generated code. In the JDK we can find such things as well, like the StringBuffer / StringBuilder pair (where the common interfaces Appendable and CharSequence are way to general).
Structural types gives some benefits of dynamic languages to a statically linked language, specifically loose coupling. If you want a method foo() to call instance methods of class Bar, you don't need an interface or base-class that is common to both foo() and Bar. You can define a structural type that foo() accepts and whose Bar has no clue of existence. As long as Bar contains methods that match the structural type signatures, foo() will be able to call.
It's great because you can put foo() and Bar on distinct, completely unrelated libraries, that is, with no common referenced contract. This reduces linkage requirements and thus further contributes for loose coupling.
In some situations, a structural type can be used as an alternative to the Adapter pattern, because it offers the following advantages:
Object identity is preserved (there is no separate object for the adapter instance, at least in the semantic level).
You don't need to instantiate an adapter - just pass a Bar instance to foo().
You don't need to implement wrapper methods - just declare the required signatures in the structural type.
The structural type doesn't need to know the actual instance class or interface, while the adapter must know Bar so it can call its methods. This way, a single structural type can be used for many actual types, whereas with adapter it's necessary to code multiple classes - one for each actual type.
The only drawback of structural types compared to adapters is that a structural type can't be used to translate method signatures. So, when signatures doesn't match, you must use adapters that will have some translation logic. I particularly don't like to code "intelligent" adapters because in many times they are more than just adapters and cause increased complexity. If a class client needs some additional method, I prefer to simply add such method, since it usually doesn't affect footprint.
Case classes in Scala are standard classes enhanced with pattern-matching, equals, ... (or am I wrong?). Moreover they require no "new" keyword for their instanciation. It seems to me that they are simpler to define than regular classes (or am I again wrong?).
There are lots of web pages telling where they should be used (mostly about pattern matchin). But where should they be avoided ? Why don't we use them everywhere ?
There are many places where case classes are not adequate:
When one wishes to hide the data structure.
As part of a type hierarchy of more than two or three levels.
When the constructor requires special considerations.
When the extractor requires special considerations.
When equality and hash code requires special considerations.
Sometimes these requirements show up late in the design, and requires one to convert a case class into a normal class. Since the benefits of a case class really aren't all that great -- aside from the few special cases they were specially made for -- my own recommendation is not to make anything a case class unless there's a clear use for it.
Or, in other words, do not overdesign.
Inheriting from case classes is problematic. Suppose you have code like so:
case class Person(name: String) { }
case class Homeowner(address: String,override val name: String)
extends Person(name) { }
scala> Person("John") == Homeowner("1 Main St","John")
res0: Boolean = true
scala> Homeowner("1 Main St","John") == Person("John")
res1: Boolean = false
Perhaps this is what you want, but usually you want a==b if and only if b==a. Unfortunately, the compiler can't sensibly fix this for you automatically.
This gets even worse because the hashCode of Person("John") is not the same as the hashCode of Homeowner("1 Main St","John"), so now equals acts weird and hashCode acts weird.
As long as you know what to expect, inheriting from case classes can give comprehensible results, but it has come to be viewed as bad form (and thus has been deprecated in 2.8).
One downside that is mentioned in Programming in Scala is that due to the things automatically generated for case classes the objects get larger than for normal classes, so if memory efficiency is important, you might want to use regular classes.
It can be tempting to use case classes because you want free toString/equals/hashCode. This can cause problems, so avoid doing that.
I do wish there were an annotation that let you get those handy things without making a case class, but maybe that's harder than it sounds.