Scala - Does pattern matching break the Open-Closed principle? [duplicate] - scala

If I add a new case class, does that mean I need to search through all of the pattern matching code and find out where the new class needs to be handled? I've been learning the language recently, and as I read about some of the arguments for and against pattern matching, I've been confused about where it should be used. See the following:
Pro:
Odersky1 and
Odersky2
Con:
Beust
The comments are pretty good in each case, too. So is pattern matching something to be excited about or something I should avoid using? Actually, I imagine the answer is "it depends on when you use it," but what are some positive use cases for it and what are some negative ones?

Jeff, I think you have the right intuition: it depends.
Object-oriented class hierarchies with virtual method dispatch are good when you have a relatively fixed set of methods that need to be implemented, but many potential subclasses that might inherit from the root of the hierarchy and implement those methods. In such a setup, it's relatively easy to add new subclasses (just implement all the methods), but relatively difficult to add new methods (you have to modify all the subclasses to make sure they properly implement the new method).
Data types with functionality based on pattern matching are good when you have a relatively fixed set of classes that belong to a data type, but many potential functions that operate on that data type. In such a setup, it's relatively easy to add new functionality for a data type (just pattern match on all its classes), but relatively difficult to add new classes that are part of the data type (you have to modify all the functions that match on the data type to make sure they properly support the new class).
The canonical example for the OO approach is GUI programming. GUI elements need to support very little functionality (drawing themselves on the screen is the bare minimum), but new GUI elements are added all the time (buttons, tables, charts, sliders, etc). The canonical example for the pattern matching approach is a compiler. Programming languages usually have a relatively fixed syntax, so the elements of the syntax tree will change rarely (if ever), but new operations on syntax trees are constantly being added (faster optimizations, more thorough type analysis, etc).
Fortunately, Scala lets you combine both approaches. Case classes can both be pattern matched and support virtual method dispatch. Regular classes support virtual method dispatch and can be pattern matched by defining an extractor in the corresponding companion object. It's up to the programmer to decide when each approach is appropriate, but I think both are useful.

While I respect Cedric, he's completely wrong on this issue. Scala's pattern matching can be fully-encapsulated from class changes when desired. While it is true that a change to a case class would require changing any corresponding pattern matching instances, this is only when using such classes in a naive fashion.
Scala's pattern matching always delegates to the deconstructor of a class's companion object. With a case class, this deconstructor is automatically generated (along with a factory method in the companion object), though it is still possible to override this auto-generated version. At all times, you can assert complete control over the pattern matching process, insulating any patterns from potential changes in the class itself. Thus, pattern matching is simply another way of accessing class data through the safe filter of encapsulation, just like any other method.
So, Dr. Odersky's opinion would be the one to trust here, particularly given the sheer volume of research he has performed in the area of object-oriented programming and design.
As for where it should be used, that is entirely according to taste. If it makes your code more concise and maintainable, use it! Otherwise, don't. For most object-oriented programs, pattern matching is unnecessary. However, once you begin to integrate more functional idioms (Option, List, etc) I think you'll find that pattern matching will significantly reduce syntactic overhead as well as improving the safety offered by the type system. In general, any time you want to extract data while simultaneously testing some condition (e.g. extracting a value from Some), pattern matching will likely be of use.

Pattern matching is definitely good if you are doing functional programming. In case of OO, there are some cases where it is good. In Cedric's example itself, it depends on how you view the print() method conceptually. Is it a behavior of each Term object? Or is it something outside it? I would say it is outside, and makes sense to do pattern matching. On the other hand if you have an Employee class with various subclasses, it is a poor design choice to do pattern matching on an attribute of it (say name) in the base class.
Also pattern matching offers an elegant way of unpacking members of a class.

Related

How to wrap procedural algorithms in OOP language

I have to implement an algorithm which fits perfectly to the procedural design approach. It has no relations with some data structure, it just takes couple of objects, bunch of control parameters and performs complicated operations on them, including creating and modifying intermediate temporal data, subroutines calls, many cpu-intensive data transformations. The algorithm is too specific to include in either parameter object as method.
What is idiomatic way to wrap such algorithms in an OOP language? Define static object with static method that performs calculation? Define class that takes all algorithm parameters as constructor arguments and have result method to return result? Any other way?
If you need more specifics, I'm writing in scala. But any general OOP approach is also applicable.
A static method (or a method on a singleton object in the case of Scala -- which I'm just gonna call a static method because that's the most common terminology) can work perfectly fine and is probably the most common approach to this.
There's some reasons to use other approaches, but they aren't strictly necessary and I'd avoid them unless you actually need an advantage that they give. The reason for this is because static methods are the simplest (if least versatile) approach.
Using a non-static method can be useful because you can then utilize design patterns like the factory pattern. For example, you might have an Operator class with a method evaluate. Now you could have different factories create different Operators so that you can swap your algorithm on the fly. Perhaps a calculator might have an AddOperatorFactory, MultiplyOperatorFactory and so on. Obviously this requires that you are able to instantiate an object that represents the algorithm. Of course, you could just pass a function around directly, as Scala and many other languages allow. Classes allow for inheritance, though, which opens the doors for some design patterns and, well, you're asking about OOP, not Scala specifically.
Also useful is the ability to have state with an object. With static methods, your only options for retaining state are either having global state (ew) or making the user of the static methods keep track of this state (more work for the users). With an instance of an object, you can keep that state inside the instance. For example, if your algorithm is a graph search, perhaps you'd want to allow resuming a search after you find the first match (which obviously requires storing state).
It's not much harder to have to do new MyAlgorithm().doStuff() instead of MyAlgorithm.doStuff(), so if in doubt, I would err on the side of avoiding static methods if you think you'll need the functionality that having an instance offers.

Why don't imperative languages have pattern matching?

So, pattern matching in functional languages is pretty awesome. I wondering why most imperative languages haven't implemented this feature? To my understanding, Scala is the only "mainstream" imperative language that has pattern matching. The case/switch structure is just so much less powerful.
In particular, I am interested in whether the lack of pattern matching is because of technical reasons or historical reasons?
It's mostly historic. Pattern matching -- and more to the point, algebraic data types -- was invented around 1980 for the functional language Hope. From there it quickly made it into ML, and was later adopted in other functional languages like Miranda and Haskell. The mainstream imperative world usually takes a few decades longer to pick up new programming language ideas.
One reason that particularly hindered adoption is that the mainstream has long been dominated by object-oriented ideology. In that world, anything that isn't expressed by objects and subtyping is considered morally "wrong". One could argue that algebraic data types are kind of an antithesis to that.
Perhaps there are also some technical reasons that make it more natural in functional languages:
Regular scoping rules and fine-grained binding constructs for variables are the norm in functional languages, but less common in mainstream imperative languages.
Especially so since patterns bind immutable variables.
Type checking pattern matches relies on the more well-formed structure and rigidness of functional type systems, and their close ties to computational logic. Mainstream type systems are usually far away from that.
Algebraic data types require heap allocation (unless you want to waste a lot of space and prohibit recursion), and would be very inconvenient without garbage collection. However, GCs in mainstream languages, where they exist, are typically optimised for heavyweight objects, not lightweight functional data.
Until recently (more precisely: until Scala), it was believed that pattern matching was incompatible with representation ignorance (i.e. the defining characteristic of OO). Since OO is a major paradigm in mainstream languages, having a seemingly irreconcilable feature in a mainstream language seemingly didn't make sense.
In Scala, pattern matching is reconciled with OO, simply by having the match operations be method calls on an object. (Rather simple in hindsight, no?) In particular, matches are performed by calling methods on extractor objects, which, just like any other object, only have access to the public API of the object being examined, thus not breaking encapsulation.
A pattern matching library inspired by Scala, in which patterns are first-class objects themselves (inspired by F#'s Active Patterns) was added to Newspeak, a very dynamic language that takes OO very seriously. (Newspeak doesn't even have variables, just methods.)
Note that regular expressions are an example of a limited form of pattern matching. Polymorphic method dispatch can also be seen as an example of a limited form of pattern matching (without the extraction features). In fact, method dispatch is powerful enough to implement full pattern matching as evidenced by Scala and especially Newspeak (in the latter, pattern matching is even implemented as a library, completely separate from the language).
Here are my 2 cents. Take a simple Option pattern match:
val o = Some(1)
o match {
case Some(i) => i + 1
case None => 0
}
There are so many things going on here in Scala. Compiler checks that you have exhaustive match, creates a new variable i for the scope of the case statement and of course extracts the value from Option in a first place somehow.
Extracting a value would be doable in languages like Java. Implement unapply method(s) of some agreed upon interface and you are done. Now you can return values to a caller.
Providing this extracted value to the caller, which essentially requires a closure is not so convenient to do in regular OO languages without closure support. It can become quite ugly in Java7 where you would probably use Observer pattern.
If you add other pattern matching abilities of Scala in the mix like matching on specific types, i.e. case i: Int =>; using default clause _ when you want to (compiler has to check exhaustiveness somehow whether you use _ or not); additional checks like case i if i > 0 =>; and so on it quickly becomes very ugly from client side to use (think Java).
If you drop all those fancy pattern matching features your pattern match would be pretty much at the level of Java's switch statement.
It looks like it just would not be worth it, even if possible, to implement using anonymous classes without lambdas support and strong type system.
I would say that it is more for historical than technical reasons. Pattern matching works well with algebraic data types which have also historically been associated with functional languages.
Scala is probably a bad example of an imperative language with pattern matching because it tends to favour the functional style, though it doesn't enforce it.
An example of a modern, mostly imperative language with pattern matching is Rust.
Imperative and runs on the metal, but still has algebraic data types, pattern matching and other features that are more common to functional languages. But its' compiler implementation is a lot more complex than that of a C compiler

Can you suggest any good intro to Scala philosophy and programs design?

In Java and C++ designing program's objects hierarchy is pretty obvious. But beginning Scala I found myself difficult to decide what classes to define to better employ Scala's syntactic sugar facilities (an even idealess about how should I design for better performance). Any good readings on this question?
I have read 4 books on Scala, but I have not found what you are asking for. I guess you have read "Programming in Scala" by Odersky (Artima) already. If not, this is a link to the on-line version:
http://www.docstoc.com/docs/8692868/Programming-In-Scala
This book gives many examples how to construct object-oriented models in Scala, but all examples are very small in number of classes. I do not know of any book that will teach you how to structure large scale systems using Scala.
Imperative object-orientation has
been around since Smalltalk, so we
know a lot about this paradigm.
Functional object-orientation on the
other hand, is a rather new concept,
so in a few years I expect books
describing large scale FOO systems to
appear. Anyway, I think that the PiS
book gives you a pretty good picture
how you can put together the basic
building blocks of a system, like
Factory pattern, how to replace the
Strategy pattern with function
literals and so on.
One thing that Viktor Klang once told me (and something I really agree upon) is that one difference between C++/Java and Scala OO is that you define a lot more (smaller) classes when you use Scala. Why? Because you can! The syntactic sugar for the case class result in a very small penalty for defining a class, both in typing and in readability of the code. And as you know, many small classes usually means better OO (fewer bugs) but worse performance.
One other thing I have noticed is that I use the factory pattern a lot more when dealing with immutable objects, since all "changes" of an instance results in creating a new instance. Thank God for the copy() method on the case class. This method makes the factory methods a lot shorter.
I do not know if this helped you at all, but I think this subject is very interesting myself, and I too await more literature on this subject.
Cheers!
This is still an evolving matter. For instance, the just released Scala 2.8.0 brought support of type constructor inference, which enabled a pattern of type classes in Scala. The Scala library itself has just began using this pattern. Just yesterday I heard of a new Lift module in which they are going to try to avoid inheritance in favor of type classes.
Scala 2.8.0 also introduced lower priority implicits, plus default and named parameters, both of which can be used, separately or together, to produce very different designs than what was possible before.
And if we go back in time, we note that other important features are not that old either:
Extractor methods on case classes object companions where introduced February 2008 (before that, the only way to do extraction on case classes was through pattern matching).
Lazy values and Structural types where introduced July 2007.
Abstract types support for type constructors was introduced in May 2007.
Extractors for non-case classes was introduced in January 2007.
It seems that implicit parameters were only introduced in March 2006, when they replaced the way views were implemented.
All that means we are all learning how to design Scala software. Be sure to rely on tested designs of functional and object oriented paradigms, to see how new features in Scala are used in other languages, like Haskell and type classes or Python and default (optional) and named parameters.
Some people dislike this aspect of Scala, others love it. But other languages share it. C# is adding features as fast as Scala. Java is slower, but it goes through changes too. It added generics in 2004, and the next version should bring some changes to better support concurrent and parallel programming.
I don't think that there are much tutorials for this. I'd suggest to stay with the way you do it now, but to look through "idiomatic" Scala code as well and to pay special attention in the following cases:
use case classes or case objects instead of enums or "value objects"
use objects for singletons
if you need behavior "depending on the context" or dependency-injection-like functionality, use implicits
when designing a type hierarchy or if you can factor things out of a concrete class, use traits when possible
Fine grained inheritance hierarchies are OK. Keep in mind that you have pattern matching
Know the "pimp my library" pattern
And ask as many questions as you feel you need to understand a certain point. The Scala community is very friendly and helpful. I'd suggest the Scala mailing list, Scala IRC or scala-forum.org
I've just accidentally googled to a file called "ScalaStyleGuide.pdf". Going to read...

Mixins vs composition in scala

In java world (more precisely if you have no multiple inheritance/mixins) the rule of thumb is quite simple: "Favor object composition over class inheritance".
I'd like to know if/how it is changed if you also consider mixins, especially in scala?
Are mixins considered a way of multiple inheritance, or more class composition?
Is there also a "Favor object composition over class composition" (or the other way around) guideline?
I've seen quite some examples when people use (or abuse) mixins when object composition could also do the job and I'm not always sure which one is better. It seems to me that you can achieve quite similar things with them, but there are some differences also, some examples:
visibility - with mixins everything becomes part of the public api, which is not the case with composition.
verbosity - in most cases mixins are less verbose and a bit easier to use, but it's not always the case (e.g. if you also use self types in complex hierarchies)
I know the short answer is "It depends", but probably there are some typical situation when this or that is better.
Some examples of guidelines I could come up with so far (assuming I have two traits A and B and A wants to use some methods from B):
If you want to extend the API of A with the methods from B then mixins, otherwise composition. But it does not help if the class/instance that I'm creating is not part of a public API.
If you want to use some patterns that need mixins (e.g. Stackable Trait Pattern) then it's an easy decision.
If you have circular dependencies then mixins with self types can help. (I try to avoid this situation, but it's not always easy)
If you want some dynamic, runtime decisions how to do the composition then object composition.
In many cases mixins seem to be easier (and/or less verbose), but I'm quite sure they also have some pitfalls, like the "God class" and others described in two artima articles: part 1, part 2 (BTW it seems to me that most of the other problems are not relevant/not so serious for scala).
Do you have more hints like these?
A lot of the problems that people have with mix-ins can be averted in Scala if you only mix-in abstract traits into your class definitions, and then mix in the corresponding concrete traits at object instantiation time. For instance
trait Locking{
// abstract locking trait, many possible definitions
protected def lock(body: =>A):A
}
class MyService{
this:Locking =>
}
//For this time, we'll use a java.util.concurrent lock
val myService:MyService = new MyService with JDK15Locking
This construct has several things to recommend it. First, it prevents there from being an explosion of classes as different combinations of trait functionalities are needed. Second, it allows for easy testing, as one can create and mix-in "do-nothing" concrete traits, similar to mock objects. Finally, we've completely hidden the locking trait used, and even that locking is going on, from consumers of our service.
Since we've gotten past most of the claimed drawbacks of mix-ins, we're still left with a tradeoff
between mix-in and composition. For myself, I normally make the decision based on whether a hypothetical delegate object would be entirely encapsulated by the containing object, or whether it could potentially be shared and have a lifecycle of its own. Locking provides a good example of entirely encapsulated delegates. If your class uses a lock object to manage concurrent access to its internal state, that lock is entirely controlled by the containing object, and neither it nor its operations are advertised as part of the class's public interface. For entirely encapsulated functionality like this, I go with mix-ins. For something shared, like a datasource, use composition.
Other differences you haven't mentioned:
Trait classes do not have any independent existence:
(Programming Scala)
If you find that a particular trait is used most often as a parent of other classes, so that the child classes behave as the parent trait, then consider defining the trait as a class instead, to make this logical relationship more clear.
(We said behaves as, rather than is a, because the former is the more precise definition of inheritance, based on the Liskov Substitution Principle - see [Martin2003], for example.)
[Martin2003]: Robert C. Martin, Agile Software Development: Principles, Patterns, and Practices, Prentice-Hall, 2003
mixins (trait) have no constructor parameters.
Hence the advice, still from Programming Scala:
Avoid concrete fields in traits that can’t be initialized to suitable default values.
Use abstract fields instead or convert the trait to a class with a constructor.
Of course, stateless traits don’t have any issues with initialization.
It’s a general principle of good object-oriented design that an instance should always be in a known valid state, starting from the moment the construction process finishes.
That last part, regarding the initial state of an object, has often helped decide between class (and class composition) and trait (and mixins) for a given concept.

How do you go from an abstract project description to actual code?

Maybe its because I've been coding around two semesters now, but the major stumbling block that I'm having at this point is converting the professor's project description and requirements to actual code. Since I'm currently in Algorithms 101, I basically do a bottom-up process, starting with a blank whiteboard and draw out the object and method interactions, then translate that into classes and code.
But now the prof has tossed interfaces and abstract classes into the mix. Intellectually, I can recognize how they work, but am stubbing my toes figuring out how to use these new tools with the current project (simulating a web server).
In my professors own words, mapping the abstract description to Java code is the real trick. So what steps are best used to go from English (or whatever your language is) to computer code? How do you decide where and when to create an interface, or use an abstract class?
So what steps are best used to go from English (or whatever your language is) to computer code?
Experience is what teaches you how to do this. If it's not coming naturally yet (and don't feel bad if it doesn't, because it takes a long time!), there are some questions you can ask yourself:
What are the main concepts of the system? How are they related to each other? If I was describing this to someone else, what words and phrases would I use? These thoughts will help you decide what classes are useful to think about.
What sorts of behaviors do these things have? Are there natural dependencies between them? (For example, a LineItem isn't relevant or meaningful without the context of an Order, nor is an Engine much use without a Car.) How do the behaviors affect the state of the other objects? Do they communicate with each other, and if so, in what way? These thoughts will help you develop the public interfaces of your classes.
That's just the tip of the iceberg, of course. For more about this thought process in general, see Eric Evans's excellent book, Domain-Driven Design.
How do you decide where and when to create an interface, or use an abstract class?
There's no hard and fast prescriptions; again, experience is the best guide here. That said, there's certainly some rules of thumb you can follow:
If several unrelated or significantly different object types all provide the same kind of functionality, use an interface. For example, if the Steerable interface has a Steer(Vector bearing) method, there may be lots of different things that can be steered: Boats, Airplanes, CargoShips, Cars, et cetera. These are completely unrelated things. But they all share the common interface of being able to be steered.
In general, try to favor an interface instead of an abstract base class. This way you can define a single implementation which implements N interfaces. In the case of Java, you can only have one abstract base class, so you're locked into a particular inheritance hierarchy once you say that a class inherits from another one.
Whenever you don't need implementation from a base class, definitely favor an interface over an abstract base class. This would also be handy if you're operating in a language where inheritance doesn't apply. For example, in C#, you can't have a struct inherit from a base class.
In general...
Read a lot of other people's code. Open source projects are great for that. Respect their licenses though.
You'll never get it perfect. It's an iterative process. Don't be discouraged if you don't get it right.
Practice. Practice. Practice.
Research often. Keep tackling more and more challenging projects / designs. Even if there are easy ones around.
There is no magic bullet, or algorithm for good design.
Nowadays I jump in with a design I believe is decent and work from that.
When the time is right I'll implement understanding the result will have to refactored ( rewritten ) sooner rather than later.
Give this project your best shot, keep an eye out for your mistakes and how things should've been done after you get back your results.
Keep doing this, and you'll be fine.
What you should really do is code from the top-down, not from the bottom-up. Write your main function as clearly and concisely as you can using APIs that you have not yet created as if they already existed. Then, you can implement those APIs in similar fashion, until you have functions that are only a few lines long. If you code from the bottom-up, you will likely create a whole lot of stuff that you don't actually need.
In terms of when to create an interface... pretty much everything should be an interface. When you use APIs that don't yet exist, assume that every concrete class is an implementation of some interface, and use a declared type that is indicative of that interface. Your inheritance should be done solely with interfaces. Only create concrete classes at the very bottom when you are providing an implementation. I would suggest avoiding abstract classes and just using delegation, although abstract classes are also reasonable when two different implementations differ only slightly and have several functions that have a common implementation. For example, if your interface allows one to iterate over elements and also provides a sum function, the sum function is a trivial to implement in terms of the iteration function, so that would be a reasonable use of an abstract class. An alternative would be to use the decorator pattern in that case.
You might also find the Google Techtalk "How to Design a Good API and Why it Matters" to be helpful in this regard. You might also be interested in reading some of my own software design observations.
Also, for the coming future, you can keep in pipeline to read the basics on domain driven design to align yourself to the real world scenarios - it gives a solid foundation for requirements mapping to the real classes.