Classes vs. Functions [closed] - class

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
What is the difference between functional programming and object oriented programming? How should one decide which programming paradigm to choose? What are the benefits of one over the other?
Functions are easy to understand even for someone without any programming experience, but with a fair math background. On the other hand, classes seem to be more difficult to grasp.
Let's say I want to make a class/function that calculates the age of a person given his/her birth year and the current year. Should I create a class for this or a function?
Or is the choice dependent on the scenario?
P.S. I am working on Python, but I guess the question is generic.

Create a function. Functions do specific things, classes are specific things.
Classes often have methods, which are functions that are associated with a particular class, and do things associated with the thing that the class is - but if all you want is to do something, a function is all you need.
Essentially, a class is a way of grouping functions (as methods) and data (as properties) into a logical unit revolving around a certain kind of thing. If you don't need that grouping, there's no need to make a class.
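For the age example in the question, a plain function is enough. Here is a minimal sketch, assuming only the two year values are known; the name `age_in_years` is made up for illustration:

```python
def age_in_years(birth_year: int, current_year: int) -> int:
    """Return the age a person reaches during current_year.

    Assumption: only the years are known, so the result can be off by one
    for someone whose birthday has not happened yet this year.
    """
    if current_year < birth_year:
        raise ValueError("current_year must not be earlier than birth_year")
    return current_year - birth_year


print(age_in_years(1990, 2024))  # 34
```

Nothing here needs to remember anything between calls, so there is no state that a class would hold.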

As Amber says in her answer: create a function. In fact, you don't need a class when all you have is something like:
    class Person(object):
        def __init__(self, arg1, arg2):
            self.arg1 = arg1
            self.arg2 = arg2

        def compute(self, other):
            """ Example of bad class design, don't care about the result """
            return self.arg1 + self.arg2 % other
Here you just have a function encapsulated in a class. This just makes the code less readable and less efficient. In fact, the function compute can be written just like this:
    def compute(arg1, arg2, other):
        return arg1 + arg2 % other
You should use classes only if you have more than one function to put in them and if keeping internal state (with attributes) makes sense. Otherwise, if you just want to group functions, create a module in a new .py file.
You might watch this video (YouTube, about 30 min), which explains my point. Jack Diederich shows why classes are evil in that case and why it's such a bad design, especially in things like APIs.
It's quite a long video but it's a must see.
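For contrast, here is a sketch of a case where the rule of thumb above (more than one function, plus internal state that is worth keeping) would arguably justify a class. `RunningMean` is a hypothetical example, not something from the video:

```python
class RunningMean:
    """Keeps internal state (count and total) between calls."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value: float) -> None:
        # Update the state; the caller does not have to carry it around.
        self.count += 1
        self.total += value

    def mean(self) -> float:
        if self.count == 0:
            raise ValueError("no values added yet")
        return self.total / self.count


stats = RunningMean()
for sample in (2.0, 4.0, 6.0):
    stats.add(sample)
print(stats.mean())  # 4.0
```

Both methods read or change the same attributes, which is what distinguishes this from a function wrapped in a class.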

I know it is a controversial topic, and I'll likely get burned now, but here are my thoughts.
For myself, I figured that it is best to avoid classes as long as possible. If I need a complex datatype I use a simple struct (C/C++), dict (Python), JSON (JS), or similar, i.e. no constructor, no class methods, no operator overloading, no inheritance, etc. When using classes, you can get carried away by OOP itself (which design pattern, what should be private, bla bla) and lose focus on the essential stuff you wanted to code in the first place.
If your project grows big and messy, then OOP starts to make sense because some sort of helicopter-view system architecture is needed. "Function vs. class" also depends on the task ahead of you.
function
- purpose: process data, manipulate data, create result sets.
- when to use: always code a function if you want to do this: “y=f(x)”
struct/dict/json/etc. (instead of class)
- purpose: store attr./param., maintain attr./param., reuse attr./param., use attr./param. later.
- when to use: if you deal with a set of attributes/params (preferably not mutable)
- different languages, same thing: struct (C/C++), JSON (JS), dict (Python), etc.
- always prefer a simple struct/dict/json/etc. over complicated classes (keep it simple!)
class (if it is a new data type)
- a simple perspective: it is a struct (C), dict (Python), JSON (JS), etc. with methods attached.
- the methods should only make sense in combination with the data/params stored in the class.
- my advice: never code complex stuff inside class methods (call an external function instead)
- warning: do not misuse classes as fake namespaces for functions! (this happens very often!)
- other use cases: if you want to do a lot of operator overloading, then use classes (e.g. your own matrix/vector multiplication class)
- ask yourself: is it really a new “data type”? (Yes => class | No => can you avoid using a class?)
array/vector/list (to store a lot of data)
- purpose: store a lot of homogeneous data of the same data type, e.g. time series
- advice #1: just use what your programming language already has. Do not reinvent it.
- advice #2: if you really want your “class mysupercooldatacontainer”, then derive it from an existing array/vector/list/etc. class (e.g. “class mycontainer : public std::vector…”)
enum (enum class)
- I just mention it
- advice #1: use enum plus switch-case instead of overcomplicated OOP design patterns
- advice #2: use finite state machines
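The enum and finite-state-machine advice can be sketched in a few lines of Python. This is only an illustration: the traffic light, the `Light` enum and the `TRANSITIONS` table are invented for the example, and a dict lookup stands in for switch-case.

```python
from enum import Enum


class Light(Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"


# The whole state machine is plain data: current state -> next state.
TRANSITIONS = {
    Light.RED: Light.GREEN,
    Light.GREEN: Light.YELLOW,
    Light.YELLOW: Light.RED,
}


def next_light(current: Light) -> Light:
    """Pure function in the y = f(x) sense: no hidden state."""
    return TRANSITIONS[current]


state = Light.RED
for _ in range(3):
    state = next_light(state)
print(state.value)  # red
```

No state pattern or class hierarchy is involved; adding a state means adding one enum member and one table entry.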

Classes (or rather their instances) are for representing things. Classes are used to define the operations supported by a particular class of objects (its instances). If your application needs to keep track of people, then Person is probably a class; the instances of this class represent particular people you are tracking.
Functions are for calculating things. They receive inputs and produce an output and/or have effects.
Classes and functions aren't really alternatives, as they're not for the same things. It doesn't really make sense to consider making a class to "calculate the age of a person given his/her birthday year and the current year". You may or may not have classes to represent any of the concepts of Person, Age, Year, and/or Birthday. But even if Age is a class, it shouldn't be thought of as calculating a person's age; rather the calculation of a person's age results in an instance of the Age class.
If you are modelling people in your application and you have a Person class, it may make sense to make the age calculation be a method of the Person class. A method is basically a function which is defined as part of a class; this is how you "define the operations supported by a particular class of objects" as I mentioned earlier.
So you could create a method on your person class for calculating the age of the person (it would probably retrieve the birthday year from the person object and receive the current year as a parameter). But the calculation is still done by a function (just a function that happens to be a method on a class).
Or you could simply create a stand-alone function that receives arguments (either a person object from which to retrieve a birth year, or simply the birth year itself). As you note, this is much simpler if you don't already have a class where this method naturally belongs! You should never create a class simply to hold an operation; if that's all there is to the class then the operation should just be a stand-alone function.
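Both options can be sketched side by side. This assumes a very small `Person` class with a `birth_year` attribute; the names `years_between` and `age_in` are made up for the example:

```python
def years_between(birth_year: int, current_year: int) -> int:
    """Stand-alone function: works without any Person class."""
    return current_year - birth_year


class Person:
    """Represents a particular person the application keeps track of."""

    def __init__(self, name: str, birth_year: int):
        self.name = name
        self.birth_year = birth_year

    def age_in(self, current_year: int) -> int:
        # The method is still just a function; it takes the birth year
        # from the object instead of from an argument.
        return years_between(self.birth_year, current_year)


alice = Person("Alice", 1985)
print(alice.age_in(2024))         # 39
print(years_between(1985, 2024))  # 39
```

If `Person` did not already exist for other reasons, only the first four lines would be needed.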

It depends on the scenario. If you only want to compute the age of a person, then use a function since you want to implement a single specific behaviour.
But if you want to create an object that contains the date of birth of a person (and possibly other data) and allows it to be modified, then computing the age could be one of many operations related to the person, and it would be sensible to use a class instead.
Classes provide a way to merge together some data and related operations. If you have only one operation on the data, then using a function and passing the data as an argument gives you equivalent behaviour with less complex code.
Note that a class of the kind:
    class A(object):
        def __init__(self, ...):
            #initialize
        def a_single_method(self, ...):
            #do stuff
isn't really a class, it is only a (complicated) function. A legitimate class should always have at least two methods (not counting __init__).

I'm going to break from the herd on this one (Edit 7 years later: I'm not a lone voice on this anymore, there is an entire coding movement to do just this, called 'Functional Programming') and provide an alternate point of view:
Never create classes. Always use functions.
Edit: Research has repeatedly shown that Classes are an outdated method of programming. Nearly every research paper on the topic sides with Functional Programming rather than Object Oriented Programming.
Reliance on classes has a significant tendency to cause coders to create bloated and slow code. Classes getting passed around (since they're objects) take a lot more computational power than calling a function and passing a string or two. Proper naming conventions on functions can do pretty much everything creating a class can do, and with only a fraction of the overhead and better code readability.
That doesn't mean you shouldn't learn to understand classes though. If you're coding with others, people will use them all the time and you'll need to know how to juggle those classes. Writing your code to rely on functions means the code will be smaller, faster, and more readable. I've seen huge sites written using only functions that were snappy and quick, and I've seen tiny sites that had minimal functionality that relied heavily on classes and broke constantly. (When you have classes extending classes that contain classes as part of their classes, you know you've lost all semblance of easy maintainability.)
When it comes down to it, all data you're going to want to pass can easily be handled by the existing datatypes.
Classes were created as a mental crutch and provide no actual extra functionality, and the overly-complicated code they have a tendency to create defeats the point of that crutch in the long run.
Edit: Update 7 years later...
Recently, a new movement in coding has been validating this exact point I've made. It is the movement to replace Object Oriented Programming (OOP) with functional programming, and it's based on a lot of these exact issues with OOP. There are lots of research papers showing the benefits of functional programming over Object Oriented Programming. In addition to the points I've mentioned, it makes reusing code much easier and makes bugfixing and unit testing faster and easier. Honestly, with the vast number of benefits, the only reason to go with OOP over functional is compatibility with legacy code that hasn't been updated yet.

Before answering your question:
If you do not have a Person class, first you must consider whether you want to create a Person class. Do you plan to reuse the concept of a Person very often? If so, you should create a Person class. (You have access to this data in the form of a passed-in variable and you don't care about being messy and sloppy.)
To answer your question:
You have access to their birthyear, so in that case you likely have a Person class with a someperson.birthdate field. In that case, you have to ask yourself, is someperson.age a value that is reusable?
The answer is yes. We often care about age more than the birthdate, so if the birthdate is a field, age should definitely be a derived field. (A case where we would not do this: if we were calculating values like someperson.chanceIsFemale or someperson.positionToDisplayInGrid or other irrelevant values, we would not extend the Person class; you just ask yourself, "Would another program care about the fields I am thinking of extending the class with?" The answer to that question will determine if you extend the original class, or make a function (or your own class like PersonAnalysisData or something).)
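A derived field like someperson.age can be sketched in Python with a property. This is one possible shape, assuming the birthdate is stored as a `datetime.date`; the helper `age_on` is an invented name that keeps the calculation testable with a fixed date:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    birthdate: date

    def age_on(self, today: date) -> int:
        # Subtract one year if the birthday has not happened yet in today's year.
        had_birthday = (today.month, today.day) >= (
            self.birthdate.month,
            self.birthdate.day,
        )
        return today.year - self.birthdate.year - (0 if had_birthday else 1)

    @property
    def age(self) -> int:
        """Derived field: computed from birthdate, never stored."""
        return self.age_on(date.today())


someperson = Person("Ada", date(1990, 6, 15))
print(someperson.age_on(date(2024, 6, 14)))  # 33
print(someperson.age_on(date(2024, 6, 15)))  # 34
```

Callers read `someperson.age` like any other field, while only the birthdate is actually kept.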

Never create classes. At least the OOP kind of classes in Python being discussed.
Consider this simplistic class:
    class Person(object):
        def __init__(self, id, name, city, account_balance):
            self.id = id
            self.name = name
            self.city = city
            self.account_balance = account_balance

        def adjust_balance(self, offset):
            self.account_balance += offset

    if __name__ == "__main__":
        p = Person(123, "bob", "boston", 100.0)
        p.adjust_balance(50.0)
        print("done!: {}".format(p.__dict__))
vs this namedtuple version:
    from collections import namedtuple

    Person = namedtuple("Person", ["id", "name", "city", "account_balance"])

    def adjust_balance(person, offset):
        return person._replace(account_balance=person.account_balance + offset)

    if __name__ == "__main__":
        p = Person(123, "bob", "boston", 100.0)
        p = adjust_balance(p, 50.0)
        print("done!: {}".format(p))
The namedtuple approach is better because:
- namedtuples have more concise syntax and standard usage.
- In terms of understanding existing code, namedtuples are basically effortless to understand. Classes are more complex. And classes can get very complex for humans to read.
- namedtuples are immutable. Managing mutable state adds unnecessary complexity.
- class inheritance adds complexity, and hides complexity.
I can't see a single advantage to using OOP classes. Obviously, it is different if you are used to OOP, or if you have to interface with code that requires classes, like Django.
BTW, most other languages have some record type feature like namedtuples. Scala, for example, has case classes. This logic applies equally there.
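The immutability point can be checked directly. A short sketch reusing the `Person` namedtuple from above; the variable names are only for illustration:

```python
from collections import namedtuple

Person = namedtuple("Person", ["id", "name", "city", "account_balance"])

original = Person(123, "bob", "boston", 100.0)
# _replace builds a new tuple; the original is left untouched.
updated = original._replace(account_balance=150.0)

print(original.account_balance)  # 100.0
print(updated.account_balance)   # 150.0

try:
    original.account_balance = 0.0
except AttributeError:
    # Fields of a namedtuple cannot be reassigned in place.
    print("cannot modify a namedtuple in place")
```

Any code holding `original` keeps seeing the old balance, so there is no shared mutable state to reason about.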

Related

How to define a class whos only role is to perform an action

This is a question about the definition of a class.
Of course I have read the endless examples on the Internet of what should be called a class. I have read that it is all the verbs and nouns that make up a thing. I understand the concept of a car class with properties like size and colour, and methods like drive.
I also understand the idea that a class should have only one responsibility and adhere to the other SOLID principles.
My problem relates to a program I have developed.
The responsibility of the program is to extract all the similar words from a document. It is therefore not a 'noun' like a car or animal but a verb type class I suppose.
In order to do this, the program iterates through a folder of text files, extracts all the text, splits the text up by line and then into 20-character chunks, compares each of the chunks in one file to all of the others by similarity, keeps only the words that are similar between two files, cleans the words to get rid of various characters, adds the words to a text file, and repeats this for all the files in the folder.
So I have one responsibility for the class and I have written methods for each of the phrases between the commas.
Having read more about class design, it occurred to me that some of these methods might be classes in their own right. If a class is defined by having a single responsibility, then presumably I could define more classes instead of these methods. E.g. why don't I have a class to find word similarity with only one method....
So my question is: how do I define a class on a single-responsibility basis if a method also has a single responsibility and the class doesn't define a thing but more of an action? What are the boundaries of what defines a class?
Please no...'Have you read'...because I have read them all. A simple explanation with a well illustrated example (conceptual example is fine)
The term "single responsibility" is very nebulous. I find it much easier to think of it in terms of cohesion and coupling. In short, we have to get things that tend to change together (i.e. are strongly cohesive) into one class and things that don't (i.e. are loosely coupled) into separate classes.
In practice that means things that tend to work with the same "data" belong to the same class. This can be easily enforced if data does not leave the object. Even more pragmatically that means avoiding "getter" methods that return data from an object.
Regarding your problem. You're saying it's not a noun, but only because you don't think of it that way. What is your "business logic"? To collect SimilarWords from a Document. Both are nouns. Your phrases are all about what steps should be taken. Rethink your application in terms of what things are involved and what actions those things would be able to do for you.
Here is a short/incomplete design for the things you describe:
    public interface Folder {
        public SimilarWords extract();
    }
Meaning: I want to extract SimilarWords from a Folder.
    public interface TextFile {
        public void chunk(Consumer<Chunk> chunkConsumer);
    }
Meaning: TextFile chunks the text.
    public class Comparison {
        public Comparison(TextFile file1, TextFile file2);
        public SimilarWords extract();
    }
Meaning: Two TextFiles are compared where the SimilarWords come from. You didn't use the word "Comparison" explicitly, I made that up.
And of course SimilarWords need to be added together for all file pairs (?) and then written to some output:
    public interface SimilarWords {
        public SimilarWords add(SimilarWords other);
        public void writeTo(OutputStream output);
    }
So that would be a proper OO design. I didn't catch all the details of your domain, so this model may be not exactly what you want, but I think you get the point.
Let's think a little about your problem, problems in general, and SRP.
SRP states that a class should be concerned with one thing. This doesn't mean exactly to have a single method that does only one thing.
Actually this can be applied outside OOP too: a function should do only a single thing.
Now imagine your program has to implement 200 features. Imagine they are so simple that a single function is enough to implement any feature. And suppose you are using only functions. By the same principle you have to write (at least) 200 functions. Now this is not as great as it looks. First, your program structure looks like an endless list of micro-sized pieces of code. Second, if they are micro-sized, they can't do much by themselves (this is not bad per se). As you suspected, a feature doesn't usually map to a single function in the real world. Third, if they do almost nothing, they have to ask someone else for everything. Or someone is doing that somewhere else. So there is some place where a function, or a class, is calling all the others. That place centralizes a lot of knowledge about the system. It has to know about everything to be able to call everyone. This is not good for an architecture.
The alternative is to distribute the knowledge.
If you allow those functions or classes to do a little more, they ask less of others, and some of those things are solved locally. Let me guess: as all these classes are in the same application, some of them are related to each other. They can form a group and collaborate. Maybe they can be the same class, or inherit from others. This reduces communication paths. Communication becomes more local.
Communication paths matter. Imagine there are 125 people in your company, and the company needs to take collective decisions. Would you hold a 125-person meeting, or would you put people in, say, 5 groups, each with 5 teams of 5 people, hold small meetings instead, and then have the team and group leaders meet among themselves? This is a form of hierarchy or structure that helps things.
Can you imagine the fan-in and fan-out in the new structure? 5/5/5 is much better than 1/125.
So this is about a trade-off. You are trading communication paths for responsibilities. What you want in the end is to have a reasonable architecture, with knowledge distributed evenly.
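To put rough numbers on the 125-person example, one can count pairwise communication channels. That metric is an assumption of this sketch (the text above only speaks of fan-in and fan-out), and the function name is invented:

```python
from math import comb


def pairwise_channels(people: int) -> int:
    """Number of distinct pairs that can talk to each other in one meeting."""
    return comb(people, 2)


# One flat meeting with everybody.
flat = pairwise_channels(125)

# 5 groups x 5 teams x 5 people, with leaders meeting one level up.
team_meetings = 25 * pairwise_channels(5)  # 25 teams of 5 people
group_meetings = 5 * pairwise_channels(5)  # 5 team leaders per group
top_meeting = pairwise_channels(5)         # the 5 group leaders
hierarchical = team_meetings + group_meetings + top_meeting

print(flat)          # 7750
print(hierarchical)  # 310
```

Under this way of counting, the structured version needs about 25 times fewer channels, which is the same effect that grouping related functions into collaborating classes aims for.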

Difference between an instance of a class and a class representing an instance already?

I use Java as an example but this is more of a general OOP design related question.
Let's take the IOExceptions in Java as an example. Why is there a class FileNotFoundException, for example? Should that not be an instance of an IOException where the cause is FileNotFound? I would say FileNotFoundException is an instance of IOException. Where does this end? FileNotFoundButOnlyCheckedOnceException, FileNotFoundNoMatterHowHardITriedException..?
I have also seen code in projects I worked in where classes such as FirstLineReader and LastLineReader existed. To me, such classes actually represent instances, but I see such design in many places. Look at the Spring Framework source code for example, it comes with hundreds of such classes, where every time I see one I see an instance instead of a blueprint. Are not classes meant to be blueprints?
What I am trying to ask is, how does one make the decision between these 2 very simple options:
Option 1:
    enum DogBreed {
        Bulldog, Poodle;
    }
    class Dog {
        DogBreed dogBreed;
        public Dog(DogBreed dogBreed) {
            this.dogBreed = dogBreed;
        }
    }
Option 2:
    class Dog {}
    class Bulldog extends Dog {
    }
    class Poodle extends Dog {
    }
The first option gives the caller the requirement to configure the instance it is creating. In the second option, the class represents the instance itself already (as I see it, which might be totally wrong ..).
If you agree that these classes represent instances instead of blueprints, would you say it is a good practice to create classes that represent instances, or is the way I am looking at this totally wrong and my statement "classes representing instances" just a load of nonsense?
Edited
First of all: we know the definition of inheritance, and we can find a lot of examples on SO and the internet. But I think we should look at it in more depth and a little more scientifically.
Note 0:
Clarification of the inheritance and instance terminology.
First, let me use the name Development Scope for the development life cycle, when we are modeling and programming our system, and Runtime Scope for the time when our system is running.
We have classes, which we model and develop in Development Scope, and objects in Runtime Scope. There are no objects in Development Scope.
And in object orientation, the definition of instance is: creating an object from a class.
On the other hand, when we are talking about classes and objects, we should make clear whether our viewpoint is Development Scope or Runtime Scope.
So, with this introduction, I want to clarify inheritance:
Inheritance is a relationship between classes, NOT objects.
Inheritance can exist in Development Scope, not in Runtime Scope. There is no inheritance in Runtime Scope.
After running our project, there is no relationship between parent and child (if there is only inheritance between a child class and a parent class). So the question is: what are super.invokeMethod1() or super.attribute1? They are not a relationship between child and parent. All attributes and methods of a parent are transmitted to the child, and that is just a notation to access the parts that were transmitted from the parent.
Also, there are no objects in Development Scope, so there are no instances in Development Scope. There are just Is-A and Has-A relationships.
Therefore, when we say:
I would say FileNotFoundException is an instance of an IOException
we should be clear about our scope (Development or Runtime).
For example, if FileNotFoundException is an instance of IOException, then what is the relationship between a specific FileNotFoundException at runtime (the object) and the FileNotFoundException class? Is it an instance of an instance?
Note 1:
Why do we use inheritance? The goal of inheritance is to extend the parent class's functionality (based on the same type).
This extension can happen by adding new attributes or new methods,
or by overriding existing methods.
In addition, by extending a parent class, we also gain reusability.
We cannot restrict the parent class's functionality (Liskov principle).
We should be able to substitute the child for the parent in the system (Liskov principle).
etc.
Note 2:
The Width and Depth of Inheritance Hierarchies
The width and depth of inheritance can be related to many factors:
The project: the complexity of the project (type complexity) and its architecture and design; the size of the project, the number of classes, etc.
The team: the expertise of the team in controlling the complexity of the project.
etc.
However, we have some heuristics about it (Object-Oriented Design Heuristics, Arthur J. Riel):
In theory, inheritance hierarchies should be deep—the deeper, the better.
In practice, inheritance hierarchies should be no deeper than
an average person can keep in his or her short-term memory. A popular
value for this depth is six.
Note that these are heuristics, based on the short-term memory number (7). The expertise of a team may affect this number. But the same heuristic is used in many hierarchies, such as organizational charts.
Note 3:
When are we using inheritance wrongly?
Based on:
Note 1: the goal of inheritance (extending the parent class's functionality)
Note 2: the width and depth of inheritance
we are using inheritance wrongly under these conditions:
We have some classes in an inheritance hierarchy that do not extend the parent class's functionality. The extension should be reasonable and should be enough to justify a new class. "Reasonable" means from an observer's point of view; the observer can be the project architect or designer (or other architects and designers).
We have a lot of classes in the inheritance hierarchy. This is called over-specialization. Some possible causes:
Maybe we did not consider Note 1 (extending parent functionality).
Maybe our modularization (packaging) is not correct: we put many system use cases in one package, and we should do a design refactoring.
There are other reasons, but they are not directly related to this answer.
Note 4:
What should we do when we are using inheritance wrongly?
Solution 1: We should perform a design refactoring that checks whether each class is worth having in terms of extending parent functionality. In this refactoring, many classes of the system may be deleted.
Solution 2: We should perform a design refactoring of the modularization. In this refactoring, some classes of our package may be moved to other packages.
Solution 3: Use composition over inheritance.
We can use this technique for many reasons. A dynamic hierarchy is one of the popular reasons to prefer composition to inheritance.
See Tim Boudreau's (of Sun) notes here:
Object hierarchies don't scale
Solution 4: Use instances over subclasses.
This question is about this technique. Let me name it "instances over subclasses".
When we can use it:
(Tip 1): Consider Note 1: when we do not really extend the parent class's functionality, or the extensions are not reasonable or not enough.
(Tip 2): Consider Note 2: if we have a lot of subclasses (nearly or fully identical classes) that extend the parent class only a little, and we can control this extension without inheritance. Note that it is not easy to say that. We should prove that it does not violate other object-oriented principles such as the Open-Closed Principle.
What should we do?
Martin Fowler recommends (Book 1 page 232 and Book 2 page 251):
Replace Subclass with Fields: change the methods to superclass fields and eliminate the subclasses.
We can also use other techniques, like the enum the question mentioned.
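A rough Python sketch of the "instances over subclasses" idea, mirroring Option 1 from the question. The breed becomes a field whose value comes from an enum, so `Bulldog` and `Poodle` are instances of `Dog` rather than subclasses. The dataclass and the sample list are assumptions made for the example:

```python
from dataclasses import dataclass
from enum import Enum


class DogBreed(Enum):
    BULLDOG = "Bulldog"
    POODLE = "Poodle"


@dataclass(frozen=True)
class Dog:
    # The former subclass is now just a value stored on the instance.
    breed: DogBreed


dogs = [Dog(DogBreed.BULLDOG), Dog(DogBreed.POODLE), Dog(DogBreed.POODLE)]
poodles = [dog for dog in dogs if dog.breed is DogBreed.POODLE]
print(len(poodles))  # 2
```

Adding a breed means adding an enum member, not a new class, which keeps the hierarchy from growing when the subclasses would add no behaviour.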
First, by including the exceptions question along with a general system design issue, you're really asking two different questions.
Exceptions are just complicated values. Their behaviors are trivial: provide the message, provide the cause, etc. And they're naturally hierarchical. There's Throwable at the top, and other exceptions repeatedly specialize it. The hierarchy simplifies exception handling by providing a natural filter mechanism: when you say catch (IOException..., you know you'll get everything bad that happened regarding i/o. Can't get much clearer than that. Testing, which can be ugly for big object hierarchies, is no problem for exceptions: There's little or nothing to test in a value.
It follows that if you are designing similar complex values with trivial behaviors, a tall inheritance hierarchy is a reasonable choice: Different kinds of tree or graph nodes constitute a good example.
Your second example seems to be about objects with more complex behaviors. These have two aspects:
Behaviors need to be tested.
Objects with complex behaviors often change their relationships with each other as systems evolve.
These are the reasons for the often heard mantra "composition over inheritance." It's been well-understood since the mid-90s that big compositions of small objects are generally easier to test, maintain, and change than big inheritance hierarchies of necessarily big objects.
Having said that, the choices you've offered for implementation are missing the point. The question you need to answer is "What are the behaviors of dogs I'm interested in?" Then describe these with an interface, and program to the interface.
    interface Dog {
        Breed getBreed();
        Set<Dog> getFavoritePlaymates(DayOfWeek dayOfWeek);
        void emitBarkingSound(double volume);
        Food getFavoriteFood(Instant asOfTime);
    }
When you understand the behaviors, implementation decisions become much clearer.
Then a rule of thumb for implementation is to put simple, common behaviors in an abstract base class:
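    abstract class AbstractDog implements Dog {
        private Breed breed;
        // The constructor must carry the class name, not the interface name.
        AbstractDog(Breed breed) { this.breed = breed; }
        // Interface methods are public, so the implementation must be public too.
        @Override public Breed getBreed() { return breed; }
    }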
You should be able to test such base classes by creating minimal concrete versions that just throw UnsupportedOperationException for the unimplemented methods and verify the implemented ones. A need for any fancier kind of setup is a code smell: you've put too much into the base.
Implementation hierarchies like this can be helpful for reducing boilerplate, but more than 2 deep is a code smell. If you find yourself needing 3 or more levels, it's very likely you can and should wrap chunks of common behavior from the low-level classes in helper classes that will be easier to test and available for composition throughout the system. For example, rather than offering a protected void emitSound(Mp3Stream sound); method in the base class for inheritors to use, it would be far preferable to create a new class SoundEmitter {} and add a final member with this type in Dog.
Then make concrete classes by filling in the rest of the behavior:
    class Poodle extends AbstractDog {
        Poodle() { super(Breed.POODLE); }
        @Override public Set<Dog> getFavoritePlaymates(DayOfWeek dayOfWeek) { ... }
        @Override public Food getFavoriteFood(Instant asOfTime) { ... }
    }
Observe: The need for a behavior - that the dog must be able to return its breed - and our decision to implement the "get breed" behavior in an abstract base class resulted in a stored enum value.
We ended up adopting something closer to your Option 1, but this wasn't an a priori choice. It flowed from thinking about behaviors and the cleanest way to implement them.
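The helper-class suggestion made earlier in this answer (a `SoundEmitter` member instead of a protected `emitSound` method in the base class) might look roughly like this in Python. All names and the string-based "sound" are placeholders for illustration, not part of the original design:

```python
class SoundEmitter:
    """Small helper that can be tested on its own and reused anywhere."""

    def emit(self, sound: str, volume: float) -> str:
        # Stand-in for real audio output: just describe what would be played.
        return f"{sound} at volume {volume:.1f}"


class Dog:
    def __init__(self, breed: str, emitter: SoundEmitter):
        self.breed = breed
        self._emitter = emitter  # composition: the dog has an emitter

    def emit_barking_sound(self, volume: float) -> str:
        return self._emitter.emit("woof", volume)


rex = Dog("poodle", SoundEmitter())
print(rex.emit_barking_sound(0.5))  # woof at volume 0.5
```

Because the emitter is passed in, a test can hand the dog a fake emitter without subclassing anything.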
The following comments apply to the case where sub-classes do not actually extend the functionality of their super class.
From Oracle doc:
Signals that an I/O exception of some sort has occurred. This class is the general class of exceptions produced by failed or interrupted I/O operations.
It says IOException is a general exception. If we have a cause enum:
    enum cause {
        FileNotFound, CharacterCoding, ...;
    }
We will not be able to throw an IOException if the cause in our custom code is not included in the enum. In other words, it makes IOException more specific instead of general.
Assuming we are not programming a library, and the functionality of class Dog below is specific to our business requirements:
    enum DogBreed {
        Bulldog, Poodle;
    }
    class Dog {
        DogBreed dogBreed;
        public Dog(DogBreed dogBreed) {
            this.dogBreed = dogBreed;
        }
    }
Personally I think it is good to use an enum because it simplifies the class structure (fewer classes).
The first code you cite involves exceptions.
Inheritance is a natural fit for exception types because the language-provided construct to differentiate exceptions of interest in the try-catch statement is through use of the type system. This means we can easily choose to handle just a more specific type (FileNotFound), or the more general type (IOException).
Testing a field's value, to see whether to handle an exception, means stepping out of the standard language construct and writing some boiler plate guard code (e.g. test value(s) and rethrow if not interested).
(Further, exceptions need to be extensible across DLL (compilation) boundaries. When we use enums, we may have problems extending the design without modifying the source that introduces the enum (and other source that consumes it).)
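The difference between filtering by type and filtering by a field value can be seen in Python as well, where FileNotFoundError is a subclass of OSError (the counterpart of the Java pair discussed here). A small sketch; the temporary directory is only there to get a path that certainly does not exist:

```python
import errno
import os
import tempfile

# A freshly created temp directory is empty, so this file cannot exist.
missing_path = os.path.join(tempfile.mkdtemp(), "missing.txt")

# 1) Filter by type: the except clause itself selects the specific case.
try:
    open(missing_path).close()
except FileNotFoundError:
    handled_by = "type"
except OSError:
    handled_by = "general I/O handler"

# 2) Filter by field: catch the general type, test a value, rethrow the rest.
try:
    open(missing_path).close()
except OSError as exc:
    if exc.errno != errno.ENOENT:
        raise  # not the case we are interested in
    handled_by_field = "errno check"

print(handled_by)                              # type
print(handled_by_field)                        # errno check
print(issubclass(FileNotFoundError, OSError))  # True
```

The second form works, but the guard and rethrow are exactly the boilerplate the type hierarchy makes unnecessary.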
When it comes to things other than exceptions, today's wisdom encourages composition over inheritance as this tends to result in less complex and more maintainable designs.
Your Option 1 is more of a composition example, whereas your Option 2 is clearly an inheritance example.
If you agree that these classes represent instances instead of blueprints, would you say it is good practice to create classes that represent instances, or is the way I am looking at this totally wrong and my statement "classes representing instances" just a load of nonsense?
I agree with you, and would not say this represents good practice. These classes as shown are not particularly customizable and don't represent added value.
A class that offers no overrides, no new state, and no new methods is not particularly differentiated from its base. So there is little merit in declaring such a class, unless we seek to do instance-of tests on it (like the exception handling language construct does under the covers). We can't really tell from this example, which is contrived for the purposes of asking the question, whether there is any added value in these subclasses, but it doesn't appear so.
To be clear, though, there are lots of worse examples of inheritance, such as when a (pre)occupation like Teacher or Student inherits from Person. This means that a Teacher cannot be a Student at the same time unless we engage in adding even more classes, e.g. TeacherStudent, perhaps using multiple inheritance.
We might call this class explosion, as sometimes we end up needing a matrix of classes because of inappropriate is-a relationships. (Add one new class, and you need a whole new row or column of exploded classes.)
Working with a design that suffers class explosion actually creates more work for clients consuming these abstractions, so it is a lose-lose situation.
At issue here is our trust in natural language: when we say someone is-a Student, this is not, from a logical perspective, the same permanent "is-a"/instance-of relationship (of subclassing), but rather a potentially temporary role being played by the Person, and one of many possible roles a Person might play concurrently at that. In these cases composition is clearly superior to inheritance.
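As a rough illustration of treating Teacher and Student as roles rather than subclasses, here is a small Python sketch. The Role enum and the roles field are assumptions of mine, not something from the question.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Role(Enum):               # hypothetical set of roles
    TEACHER = auto()
    STUDENT = auto()


@dataclass
class Person:
    name: str
    roles: set = field(default_factory=set)  # roles can be added or dropped over time

    def has_role(self, role):
        return role in self.roles


alice = Person("Alice")
alice.roles.add(Role.TEACHER)
alice.roles.add(Role.STUDENT)        # both at once, no TeacherStudent class needed
held_both = alice.has_role(Role.TEACHER) and alice.has_role(Role.STUDENT)
alice.roles.discard(Role.STUDENT)    # the role was temporary; alice is still the same Person
```

Adding a new role here means adding one enum member, not a new row or column of classes.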
In your scenario, however, the BullDog is unlikely to be able to be anything other than the BullDog, so the permanent is-a relationship of subclassing holds, and while adding little value, at least this hierarchy does not risk class explosion.
Note that the main drawback of the enum approach is that the enum may not be extensible, depending on the language you're using. If you need arbitrary extensibility (e.g. by others and without altering your code), you have the choice of using something extensible but more weakly typed, like strings (typos aren't caught, duplicates aren't caught, etc.), or you can use inheritance, as it offers decent extensibility with stronger typing. Exceptions need this kind of extensibility by others, without modification and recompilation of the originals, since they are used across DLL boundaries.
If you control the enum and can recompile the code as a unit as needed to handle new dog types, then you don't need this extensibility.
Option 1 has to list all known causes at declaration time.
Option 2 can be extended by creating new classes, without touching the original declaration.
This is important when the base/original declaration is done by the framework. If there were 100 known, fixed, reasons for I/O problems, an enum or something similar could make sense, but if new ways to communicate can crop up that should also be I/O exceptions, then a class hierarchy makes more sense. Any class library that you add to your application can extend with more I/O exceptions without touching the original declaration.
This is basically the O in SOLID: open for extension, closed for modification.
But this is also why, as an example, DayOfWeek-style enumerations exist in many frameworks. It is extremely unlikely that the western world suddenly wakes up one day and decides to go for 14 unique days, or 8, or 6. So having classes for those is probably overkill. These things are more or less set in stone (knock on wood).
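A small Python sketch of that extension point, under the assumption of a made-up framework: TransportError, run_safely and CarrierPigeonError are all hypothetical names used only for illustration.

```python
class TransportError(Exception):
    """Hypothetical base exception declared by the 'framework'."""


def run_safely(action):
    """Framework code: written once, and it knows only the base type."""
    try:
        return action()
    except TransportError as err:
        return f"transport failed: {err}"


# Added later by an independent library, without editing anything above.
class CarrierPigeonError(TransportError):
    pass


def send_by_pigeon():
    raise CarrierPigeonError("pigeon got lost")
```

Calling run_safely(send_by_pigeon) is handled by the existing except clause even though the framework never heard of carrier pigeons. With a closed enum of causes, the new failure would have needed a change to the original declaration.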
The two options you present do not actually express what I think you're trying to get at. What you're trying to differentiate between is composition and inheritance.
Composition works like this:
class Poodle {
    Legs legs;
    Tail tail;
}
class Bulldog {
    Legs legs;
    Tail tail;
}
Both have a common set of characteristics that we can aggregate to 'compose' a class. We can specialize components where we need to, but can just expect that "Legs" mostly work like other legs.
Java has chosen inheritance instead of composition for IOException and FileNotFoundException.
That is, a FileNotFoundException is a kind of (i.e. extends) IOException and permits handling based on the identity of the superclass only (though you can specify special handling if you choose to).
The arguments for choosing composition over inheritance are well-rehearsed by others and can be easily found by searching for "composition vs. inheritance."

Scala - Does pattern matching break the Open-Closed principle? [duplicate]

If I add a new case class, does that mean I need to search through all of the pattern matching code and find out where the new class needs to be handled? I've been learning the language recently, and as I read about some of the arguments for and against pattern matching, I've been confused about where it should be used. See the following:
Pro: Odersky1 and Odersky2
Con: Beust
The comments are pretty good in each case, too. So is pattern matching something to be excited about or something I should avoid using? Actually, I imagine the answer is "it depends on when you use it," but what are some positive use cases for it and what are some negative ones?
Jeff, I think you have the right intuition: it depends.
Object-oriented class hierarchies with virtual method dispatch are good when you have a relatively fixed set of methods that need to be implemented, but many potential subclasses that might inherit from the root of the hierarchy and implement those methods. In such a setup, it's relatively easy to add new subclasses (just implement all the methods), but relatively difficult to add new methods (you have to modify all the subclasses to make sure they properly implement the new method).
Data types with functionality based on pattern matching are good when you have a relatively fixed set of classes that belong to a data type, but many potential functions that operate on that data type. In such a setup, it's relatively easy to add new functionality for a data type (just pattern match on all its classes), but relatively difficult to add new classes that are part of the data type (you have to modify all the functions that match on the data type to make sure they properly support the new class).
The canonical example for the OO approach is GUI programming. GUI elements need to support very little functionality (drawing themselves on the screen is the bare minimum), but new GUI elements are added all the time (buttons, tables, charts, sliders, etc). The canonical example for the pattern matching approach is a compiler. Programming languages usually have a relatively fixed syntax, so the elements of the syntax tree will change rarely (if ever), but new operations on syntax trees are constantly being added (faster optimizations, more thorough type analysis, etc).
Fortunately, Scala lets you combine both approaches. Case classes can both be pattern matched and support virtual method dispatch. Regular classes support virtual method dispatch and can be pattern matched by defining an extractor in the corresponding companion object. It's up to the programmer to decide when each approach is appropriate, but I think both are useful.
While I respect Cedric, he's completely wrong on this issue. Scala's pattern matching can be fully-encapsulated from class changes when desired. While it is true that a change to a case class would require changing any corresponding pattern matching instances, this is only when using such classes in a naive fashion.
Scala's pattern matching always delegates to the deconstructor (the extractor, i.e. the unapply method) in a class's companion object. With a case class, this deconstructor is automatically generated (along with a factory method in the companion object), though it is still possible to override this auto-generated version. At all times, you can assert complete control over the pattern matching process, insulating any patterns from potential changes in the class itself. Thus, pattern matching is simply another way of accessing class data through the safe filter of encapsulation, just like any other method.
So, Dr. Odersky's opinion would be the one to trust here, particularly given the sheer volume of research he has performed in the area of object-oriented programming and design.
As for where it should be used, that is entirely according to taste. If it makes your code more concise and maintainable, use it! Otherwise, don't. For most object-oriented programs, pattern matching is unnecessary. However, once you begin to integrate more functional idioms (Option, List, etc) I think you'll find that pattern matching will significantly reduce syntactic overhead as well as improving the safety offered by the type system. In general, any time you want to extract data while simultaneously testing some condition (e.g. extracting a value from Some), pattern matching will likely be of use.
Pattern matching is definitely good if you are doing functional programming. In OO, there are some cases where it is good. In Cedric's example itself, it depends on how you view the print() method conceptually. Is it a behavior of each Term object? Or is it something outside it? I would say it is outside, so it makes sense to do pattern matching. On the other hand, if you have an Employee class with various subclasses, it is a poor design choice to do pattern matching on an attribute of it (say name) in the base class.
Pattern matching also offers an elegant way of unpacking the members of a class.

When to use mutable vs immutable classes in Scala

Much is written about the advantages of immutable state, but are there common cases in Scala where it makes sense to prefer mutable classes? (This is a Scala newbie question from someone with a background in "classic" OOP design using mutable classes.)
For something trivial like a 3-dimensional Point class, I get the advantages of immutability. But what about something like a Motor class, which exposes a variety of control variables and/or sensor readings? Would a seasoned Scala developer typically write such a class to be immutable? In that case, would 'speed' be represented internally as a 'val' instead of a 'var', and the 'setSpeed' method return a new instance of the class? Similarly, would every new reading from a sensor describing the motor's internal state cause a new instance of Motor to be instantiated?
The "old way" of doing OOP in Java or C# using classes to encapsulate mutable state seems to fit the Motor example very well. So I'm curious to know if once you gain experience using the immutable-state paradigm, you would even design a class like Motor to be immutable.
I'll use a different, classic, OO modeling example: bank accounts.
These are used in practically every OO course on the planet, and the design you usually end up with is something like this:
class Account(var balance: BigDecimal) {
  def transfer(amount: BigDecimal, to: Account): Unit = {
    balance -= amount
    to.balance += amount
  }
}
IOW: the balance is data, and the transfer is an operation. (Note also that the transfer is a complex operation involving multiple mutable objects, which however should be atomic, not complex … so you need locking etc.)
However, that is wrong. That's not how banking systems are actually designed. In fact, that's not how actual real-world (physical) banking works, either. Actual physical banking and actual banking systems work like this:
class Account(implicit transactionLog: TransactionLog) {
  def balance = transactionLog.reduceLeft(_ + _)
}
class TransactionSlip(from: Account, to: Account, amount: BigDecimal)
IOW: the balance is an operation and the transfer is data. Note that everything here is immutable. The balance is just a left fold of the transaction log.
Note also that we didn't even end up with a purely functional, immutable design as an explicit design goal. We just wanted to model the banking system correctly and ended up with a purely functional, immutable design by coincidence. (Well, it's actually not by coincidence. There's a reason why real-world banking works that way, and it has the same benefits as it has in programming: mutable state and side-effects make systems complex and confusing … and in banking that means money disappearing.)
The point here is that the exact same problem can be modeled in very different ways, and depending on the model, you might end up with something that is either trivial or very hard to make purely immutable.
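For readers who want to run the transaction-log idea, here is a minimal Python sketch. It is my own simplification rather than a translation of the Scala above: accounts are plain string ids, and the field names source and target are assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal
from functools import reduce


@dataclass(frozen=True)
class TransactionSlip:
    source: str        # id of the account the money leaves
    target: str        # id of the account the money enters
    amount: Decimal


def balance(account_id, log):
    """Left fold over the log: the balance is computed, never stored."""
    def step(total, slip):
        if slip.target == account_id:
            total += slip.amount
        if slip.source == account_id:
            total -= slip.amount
        return total
    return reduce(step, log, Decimal("0"))


log = (
    TransactionSlip("bank", "alice", Decimal("100")),
    TransactionSlip("alice", "bob", Decimal("30")),
)
# A transfer is just a new, longer log; the old one is untouched.
later_log = log + (TransactionSlip("bob", "alice", Decimal("5")),)
```

Nothing is ever updated in place: a transfer adds a slip, and any balance at any point in time can be recomputed from the log that existed then.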
I think the short answer is most likely: Yes, immutable data structures are far more usable and efficient than you realize.
The question you've posed is a bit ambiguous because the answer depends less on the motor you've described than on the software system that you haven't described. The great mistake of how OOP is always taught, in my opinion, is recommending bottom-up design of "domain" classes prior to considering how the classes will be used. Maybe your system even needs more than one data structure holding the same information about a motor in different ways.
The "old way" of doing OOP in Java or C# using classes to encapsulate mutable state seems to fit the motor example very well.
The "new way" (arguably), in support of multithreaded systems, is to encapsulate mutable state within actors. An actor that represents the current state of a motor would be mutable. But if you were to take a "snapshot" of the motor's state and pass that information to another actor, the message needs to be immutable.
In that [immutable] case, would 'speed' be represented internally as a 'val' instead of a 'var', and the 'setSpeed' method return a new instance of the class?
Yes, but you don't actually have to write that method if you use a case class. Suppose you have a class defined as case class Motor(speed: Speed, rpm: Int, mass: Mass, color: Color). Using the copy method, you could write something like motor2 = motor1.copy(rpm = 3500, speed = 88.mph).
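For comparison, the closest Python equivalent I know of is a frozen dataclass plus dataclasses.replace. This is only a sketch: the two-field Motor below, with speed as a plain float in mph, is a simplified stand-in for the case class above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Motor:
    speed_mph: float
    rpm: int


motor1 = Motor(speed_mph=60.0, rpm=2500)
# replace() builds a new instance with the given fields changed; motor1 is unchanged.
motor2 = replace(motor1, rpm=3500, speed_mph=88.0)
```

As with copy, there is no setSpeed method to write: the "setter" is just constructing a modified copy.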

What functions to put inside a class

If I have a function (say messUp) that does not need to access any private variables of a class (say room), should I write the function inside the class, like room.messUp(), or outside of it, like messUp(room)? The second version reads better to me.
There's a tradeoff involved here. Using a member function lets you:
Override the implementation in derived classes, so that messing up a kitchen could involve trashing the cupboards even if no cupboards are available in a generic room.
Decide that you need to access private variables later on, without having to refactor all the code that uses the function.
Make the function part of an interface, so that a piece of code may require that its argument be mess-up-able.
Using an external function lets you:
Make that function generic, so that you may apply it to rooms, warehouses and oil rigs equally (if they provide the member functions required for messing up).
Keep the class signature small, so that creating mock versions for unit testing (or different implementations) becomes easier.
Change the class implementation without having to examine the code for that function.
There's no real way to have your cake and eat it too, so you have to make choices. A common OO decision is to make everything a method (unless clearly idiotic) and sacrifice the three latter points, but that doesn't mean you should do it in all situations.
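Here is a short Python sketch of both sides of that tradeoff. The scatter method, the Kitchen subclass and the OilRig class are hypothetical, made up to show overriding on one hand and genericity on the other.

```python
class Room:
    def __init__(self):
        self.items_on_floor = 0

    def scatter(self, count):          # public member the free function relies on
        self.items_on_floor += count

    def mess_up(self):                 # member version: subclasses can override it
        self.scatter(10)


class Kitchen(Room):
    def __init__(self):
        super().__init__()
        self.cupboards_trashed = False

    def mess_up(self):                 # kitchen-specific behaviour
        super().mess_up()
        self.cupboards_trashed = True


class OilRig:                          # unrelated class that happens to offer scatter()
    def __init__(self):
        self.items_on_floor = 0

    def scatter(self, count):
        self.items_on_floor += count


def mess_up(place):
    """Free-function version: works on anything with a scatter() method."""
    place.scatter(10)
```

The member version lets Kitchen do something extra. The free version works on an OilRig that shares no base class with Room, but it knows nothing about cupboards.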
Any behaviour of a class of objects should be written as an instance method.
So room.messUp() is the OO way to do this.
Whether messUp has to access any private members of the class or not is irrelevant. The fact that it's a behaviour of the room suggests that it's an instance method, as would be cleanUp or paint, etc.
Ignoring which language, I think my first question is if messUp is related to any other functions. If you have a group of related functions, I would tend to stick them in a class.
If they don't access any class variables then you can make them static. This way, they can be called without needing to create an instance of the class.
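In Python that might look roughly like the sketch below. RoomTools and its two helpers are hypothetical names, and in Python a plain module would often serve the same grouping purpose.

```python
class RoomTools:
    """Hypothetical grouping of related helpers that need no instance state."""

    @staticmethod
    def floor_area(width, length):
        return width * length

    @staticmethod
    def paint_needed(area, coverage_per_litre):
        return area / coverage_per_litre


area = RoomTools.floor_area(4, 5)            # called on the class, no instance created
litres = RoomTools.paint_needed(area, 10)
```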
Beyond that, I would look to the language. In some languages, every function must be a method of some class.
In the end, I don't think it makes a big difference. OOP is simply a way to help organize your application's data and logic. If you embrace it, then you would choose room.messUp() over messUp(room).
I base this on "C++ Coding Standards: 101 Rules, Guidelines, And Best Practices" by Sutter and Alexandrescu, and also on Bob Martin's SOLID. I agree with them on this point, of course ;-).
If the message/function doesn't interact much with your class, you should make it an ordinary function taking your class object as an argument.
You should not pollute your class with behaviours that are not intimately related to it.
This respects the Single Responsibility Principle: your class should remain simple and aim at the most precise goal possible.
However, if you think your message/function is intimately related to your object guts, then you should include it as a member function of your class.