How to distinguish composition and self-typing use cases - Scala

Scala has two instruments for expressing object composition: the original self-type concept and the well-known trivial composition. I'm curious which situations call for which.
There are obvious differences in their applicability. Self-types require you to use traits. Object composition lets you change extensions at run time through a var declaration.
Leaving technical details aside, I can think of two indicators that help classify use cases. If an object is used as a combinator for a complex structure such as a tree, or simply has several parts of the same type (a 1 car to 4 wheels relation), then it should use composition. There is an extreme opposite use case. Let's assume one trait became too big to take in at a glance and got split. It is quite natural to use self-types in this case.
These rules are not absolute. You can do extra work to convert code between the two techniques, e.g. you can replace the 4 wheels composition with self-typing over Product4. You can use Cake[T <: MyType] {part : MyType} instead of Cake { this : MyType => } for cake pattern dependencies. But both cases seem counterintuitive and cost you extra work.
There are plenty of boundary use cases, though. One-to-one relations are very hard to decide. Is there a simple rule for deciding which technique is preferable?
Self-types make your classes abstract; composition makes your code verbose. Self-types give you problems with blended namespaces, and they also give you extra typing for free (you get not just a cocktail of two elements but a gasoline and motor oil cocktail, known as a petrol bomb).
How can I choose between them? What hints are there?
Update:
Let us discuss the following example:
The Adapter pattern: what benefits does it get from the self-typing approach and from the composition approach?

The hints below are derived from a heuristic approach (a trial-and-error method of problem solving used when an algorithmic approach is impractical) and are not supported by any formula (mathematically based reasoning).
*** The hints given here should be evaluated together with the accompanying hints; no single hint is a perfect rule for distinguishing composition and self-typing use cases.
(In following the hints below, I don't consider verbosity, the number of lines of code, or programming effort.)
composition (dictionary meaning): the act of combining parts or elements to form a whole (Trivial Composition)
trait (Dictionary meaning): a distinguishing characteristic or quality
Hints for Trivial Composition (which can be achieved by the super/sub class mechanism or by an association relationship) (e.g. Car and Wheels):
It can be counted discretely (e.g. Wheels)
It can be classified further, based on different criteria (e.g. Wheels: alloy wheel, steel wheel, etc.)
It can be added or removed (Note: when we say a wheel stopped, it is actually the wheel's rotation speed that has stopped; when we say a heart stopped, it is actually the heart's pulse rate that has become zero)
It is generally applicable to few (in the universe, some vehicles and some machinery have wheels)
("Few" can mean 10-15 or millions. To explain: when geologists talk about time and say "some time ago", they mean a few million years ago; it depends on the subject at hand.)
Hints for Self Type (Trait) (e.g. Car and Speed):
It is unidimensional (not in the physics sense) and can be plotted on a number line, whatever the physical unit is (e.g. Speed)
It can't be classified further naturally (e.g. Speed), or at least you will not classify it further. "Naturally" here means that to classify it you would have to rely on your own criteria, and it could well be split into millions of subtypes. Take move: you can have millions of move sub-traits, like zigzag move, rotate and go forward, ... (millions of possibilities through various permutations and combinations).
It can be increased, decreased, or stopped (e.g. speed, anger, love, etc.)
It is generally seen, or can be seen, in very distantly placed classes (e.g. speed of light, speed of the earth, speed of a runner)
It is generally applicable to many (in the universe the majority of objects, here every object, has speed)
Software Development is like making your own universe and as a creator you define everything. A Trait will be seen among distantly placed classes in your domain (your own universe).
Please note that I have not seen, in any language (I know very few), a specific word (a counterpart for "trait") for a part that is used for Trivial Composition.
Further Explanation:
To get the answer you need to dig deep into the philosophy of the class-oriented or object-oriented approach to software development, and to understand the mind and logic of the creators of programming languages such as Java and Scala (and many more) that have built the class-oriented or object-oriented paradigm into the language.
You also need a deep understanding of semantics (the study of meaning, or the study of linguistic development by classifying and examining changes in meaning and form): both the semantics we use to describe the real world and the semantics behind the keywords we use as programmers.
I believe that when we create a class we want to manifest the real world in software. The class becomes a representation of something from the real world, be it a car, human, star, dream, thought, imagination, etc.
When somebody says "Wheels", you have a clear-cut picture of their shape and application, and you can think of a driving wheel or of wheels that roll on the road. A wheel is always part of something. Wheels can be counted in discrete numbers. Wheels can be further classified based on criteria like material, application, size, etc. Wheel-like things qualify for Trivial Composition.
When somebody says "Speed", you do not have any clear-cut picture of it: no shape, no colour. But you can relate it to any moving (relativity) part of the universe. It is a characteristic, a trait. Speed is not part of anything. It can be there or not. It can be plotted on a single line (in either direction, + or -). It is hard to classify "Speed". Speed-like things qualify as traits.
In my opinion,
If we take Car as a class (Object), "Speed"-like characteristics should go in as traits in Scala, and "Wheel"-like parts and components should go in as "Trivial Composition". "Speed"-like characteristics have no natural classification, whereas "Wheels" can have many classes and are themselves independent objects (in reality).
If we take Human as a class (Object), behaviors like "anger, crying, laughing, etc." should go in as traits, and "hands, legs, brain, heart, etc." should go in as "Trivial Composition", as they are themselves independent objects (in reality).
If we think of a name, it can be given to anything and anyone: our nearest star has the name "Sun", the highest mountain has the name "Himalaya", my dog has the name "Rocky", the river has the name "Amazon", and so on. "Name" is a trait and should not be considered for "Trivial Composition".
If we think of a heart, animals have a heart as one of their parts. It must be considered for "Trivial Composition" and not as a trait.
What is class?
Class is a description or a blue print of a particular Object.
What is object?
Object is a reality which can be described by the class definition.
(Egg or hen? Which came first?) I believe a software engineer first thinks of objects and then, to describe them or make them from a blueprint, defines a class. (Please note that in object-oriented modelling and design, class and object complement each other's existence.) ("Egg or hen? Which came first?" is about the co-existence of class and object and has no relevance to the famous Circle-Ellipse Problem (http://en.wikipedia.org/wiki/Circle-ellipse_problem), as the latter is related to inheritance or subtype polymorphism.)
interface: a thing that enables separate and sometimes incompatible elements to coordinate effectively
Software Development is like making your own universe and as a creator you define everything. Composition should be preferred over Inheritance. ( Gang of Four - Design Patterns)

Related

Difference between an instance of a class and a class representing an instance already?

I use Java as an example but this is more of a general OOP design related question.
Let's take the IOExceptions in Java as an example. Why is there a class FileNotFoundException, for example? Shouldn't that be an instance of an IOException where the cause is FileNotFound? I would say FileNotFoundException is an instance of IOException. Where does this end? FileNotFoundButOnlyCheckedOnceException, FileNotFoundNoMatterHowHardITriedException..?
I have also seen code in projects I worked in where classes such as FirstLineReader and LastLineReader existed. To me, such classes actually represent instances, but I see such design in many places. Look at the Spring Framework source code for example, it comes with hundreds of such classes, where every time I see one I see an instance instead of a blueprint. Are not classes meant to be blueprints?
What I am trying to ask is, how does one make the decision between these 2 very simple options:
Option 1:
enum DogBreed {
    Bulldog, Poodle;
}
class Dog {
    DogBreed dogBreed;
    public Dog(DogBreed dogBreed) {
        this.dogBreed = dogBreed;
    }
}
Option 2:
class Dog {}
class Bulldog extends Dog {}
class Poodle extends Dog {}
The first option gives the caller the requirement to configure the instance it is creating. In the second option, the class represents the instance itself already (as I see it, which might be totally wrong ..).
If you agree that these classes represent instances instead of blueprints, would you say it is a good practice to create classes that represents instances or is it totally wrong the way I am looking at this and my statement "classes representing instances" is just load of nonsense?
Edited
First of all: We know the Inheritance definition and we can find a lot of examples in SO and internet. But, I think we should look in-depth and a little more scientific.
Note 0:
Clarification about Inheritance and Instance terminology.
First, let me use Development Scope for the development life cycle, when we are modeling and programming our system, and Runtime Scope for the time when our system is running.
We have Classes, which we model and develop in the Development Scope, and Objects in the Runtime Scope. There is no Object in the Development Scope.
And in object orientation, an Instance is defined as an Object created from a Class.
On the other hand, when we talk about classes and objects, we should make clear whether our viewpoint is the Development Scope or the Runtime Scope.
So, with this introduction, I want to clarify Inheritance:
Inheritance is a relationship between Classes, NOT Objects.
Inheritance can exist in Development Scope, not in Runtime Scope. There is no Inheritance in Runtime Scope.
After running our project, there is no relationship between parent and child (if the only relationship between the child class and the parent class is Inheritance). So the question is: what are super.invokeMethod1() and super.attribute1? They are not a relationship between child and parent. All attributes and methods of a parent are transmitted to the child, and super is just a notation for accessing the parts that were transmitted from the parent.
Also, there are no Objects in the Development Scope, so there are no Instances in the Development Scope. There are just Is-A and Has-A relationships.
Therefore, when we said:
I would say FileNotFoundException is a instance of an IOException
we should be clear about our scope (Development or Runtime).
For example, if FileNotFoundException is an instance of IOException, then what is the relationship between a specific FileNotFoundException at runtime (the Object) and the FileNotFoundException class? Is it an instance of an instance?
Note 1:
Why do we use Inheritance? The goal of inheritance is to extend the parent class's functionality (based on the same type).
This extension can happen by adding new attributes or new methods,
or by overriding existing methods.
In addition, by extending a parent class, we gain reusability too.
We cannot restrict the parent class's functionality (Liskov Principle).
We should be able to substitute the child for the parent in the system (Liskov Principle),
etc.
Note 2:
The Width and Depth of Inheritance Hierarchies
The Width and Depth of Inheritance can be related to many factors:
The project: the complexity of the project (type complexity) and its architecture and design; the size of the project, the number of classes, etc.
The team: the expertise of the team in controlling the complexity of the project.
etc.
However, we have some heuristics about it. (Object-Oriented Design Heuristics, Arthur J. Riel)
In theory, inheritance hierarchies should be deep—the deeper, the better.
In practice, inheritance hierarchies should be no deeper than
an average person can keep in his or her short-term memory. A popular
value for this depth is six.
Note that these are heuristics, based on the short-term memory number (7). The expertise of a team may affect this number. But it is used in many hierarchies, such as organizational charts.
Note 3:
When are we using Inheritance wrongly?
Based on :
Note 1: the goal of Inheritance (Extending parent class functionalities)
Note 2: the width and depth of Inheritance
We are using inheritance wrongly under these conditions:
We have some classes in an inheritance hierarchy that do not extend the parent class's functionality. The extension should be reasonable and substantial enough to justify a new class. "Reasonable" is meant from an observer's point of view. The observer can be the project architect or designer (or other architects and designers).
We have a lot of classes in the inheritance hierarchy. This is called Over-Specialization. Several reasons may cause it:
Maybe we did not consider Note 1 (extending the parent's functionality).
Maybe our modularization (packaging) is not correct: we put many system use cases in one package, and we should perform a Design Refactoring.
There are other reasons, but they are not directly related to this answer.
Note 4:
What should we do when we are using Inheritance wrongly?
Solution 1: We should perform a Design Refactoring to check how much each class contributes to extending the parent's functionality. In this refactoring, many classes of the system may be deleted.
Solution 2: We should perform a Design Refactoring of the modularization. In this refactoring, some classes of our package may be moved to other packages.
Solution 3: Use Composition over Inheritance.
We can use this technique for many reasons. A dynamic hierarchy is one of the popular reasons to prefer Composition over Inheritance.
see Tim Boudreau (of Sun) notes here:
Object hierarchies don't scale
Solution 4: Use instances over subclasses.
This question is about this technique. Let me name it "instances over subclasses".
When can we use it:
(Tip 1): Consider Note 1: when we do not really extend the parent class's functionality, or the extensions are not reasonable and substantial enough.
(Tip 2): Consider Note 2: when we have a lot of subclasses (similar or identical classes) that extend the parent class only a little, and we can handle this extension without inheritance. Note that this is not an easy call to make. We should prove that it does not violate other object-oriented principles such as the Open-Closed Principle.
What should we do?
Martin Fowler recommends (Book 1 page 232 and Book 2 page 251):
Replace Subclass with Fields: change the methods to superclass fields and eliminate the subclasses.
We can also use other techniques, like the enum mentioned in the question.
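As a rough illustration of "instances over subclasses", here is a minimal sketch in Python rather than Java. DogBreed and Dog mirror the question's Option 1; the describe method and the enum values are hypothetical, added only so there is something to call. The point is that one class plus a field replaces a family of subclasses that add no behavior.

```python
from enum import Enum


class DogBreed(Enum):
    # One enum member per former subclass (Bulldog, Poodle).
    BULLDOG = "Bulldog"
    POODLE = "Poodle"


class Dog:
    """Single class; the breed is data on the instance, not a subclass."""

    def __init__(self, breed):
        self.breed = breed

    def describe(self):
        # Hypothetical helper: behavior that only varies by a field value.
        return f"a {self.breed.value}"
```

A caller then configures the instance instead of picking a subclass, e.g. Dog(DogBreed.POODLE).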
First, by including the exceptions question along with a general system design issue, you're really asking two different questions.
Exceptions are just complicated values. Their behaviors are trivial: provide the message, provide the cause, etc. And they're naturally hierarchical. There's Throwable at the top, and other exceptions repeatedly specialize it. The hierarchy simplifies exception handling by providing a natural filter mechanism: when you say catch (IOException..., you know you'll get everything bad that happened regarding i/o. Can't get much clearer than that. Testing, which can be ugly for big object hierarchies, is no problem for exceptions: There's little or nothing to test in a value.
It follows that if you are designing similar complex values with trivial behaviors, a tall inheritance hierarchy is a reasonable choice: Different kinds of tree or graph nodes constitute a good example.
Your second example seems to be about objects with more complex behaviors. These have two aspects:
Behaviors need to be tested.
Objects with complex behaviors often change their relationships with each other as systems evolve.
These are the reasons for the often heard mantra "composition over inheritance." It's been well-understood since the mid-90s that big compositions of small objects are generally easier to test, maintain, and change than big inheritance hierarchies of necessarily big objects.
Having said that, the choices you've offered for implementation are missing the point. The question you need to answer is "What are the behaviors of dogs I'm interested in?" Then describe these with an interface, and program to the interface.
interface Dog {
    Breed getBreed();
    Set<Dog> getFavoritePlaymates(DayOfWeek dayOfWeek);
    void emitBarkingSound(double volume);
    Food getFavoriteFood(Instant asOfTime);
}
When you understand the behaviors, implementation decisions become much clearer.
Then a rule of thumb for implementation is to put simple, common behaviors in an abstract base class:
abstract class AbstractDog implements Dog {
    private Breed breed;
    // The constructor must carry the class's own name.
    AbstractDog(Breed breed) { this.breed = breed; }
    @Override public Breed getBreed() { return breed; }
}
You should be able to test such base classes by creating minimal concrete versions that just throw UnsupportedOperationException for the unimplemented methods and verify the implemented ones. A need for any fancier kind of setup is a code smell: you've put too much into the base.
Implementation hierarchies like this can be helpful for reducing boilerplate, but more than 2 deep is a code smell. If you find yourself needing 3 or more levels, it's very likely you can and should wrap chunks of common behavior from the low-level classes in helper classes that will be easier to test and available for composition throughout the system. For example, rather than offering a protected void emitSound(Mp3Stream sound); method in the base class for inheritors to use, it would be far preferable to create a new class SoundEmitter {} and add a final member with this type in Dog.
Then make concrete classes by filling in the rest of the behavior:
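class Poodle extends AbstractDog {
    Poodle() { super(Breed.POODLE); }
    // Interface methods are public, so the implementations must be public too.
    @Override public Set<Dog> getFavoritePlaymates(DayOfWeek dayOfWeek) { ... }
    @Override public Food getFavoriteFood(Instant asOfTime) { ... }
}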
Observe: The need for a behavior - that the dog must be able to return its breed - and our decision to implement the "get breed" behavior in an abstract base class resulted in a stored enum value.
We ended up adopting something closer to your Option 1, but this wasn't an a priori choice. It flowed from thinking about behaviors and the cleanest way to implement them.
The following comments apply to the case where subclasses do not actually extend the functionality of their superclass.
From Oracle doc:
Signals that an I/O exception of some sort has occurred. This class is the general class of exceptions produced by failed or interrupted I/O operations.
It says IOException is a general exception. If we have a cause enum:
enum cause{
FileNotFound, CharacterCoding, ...;
}
We would not be able to throw an IOException if the cause in our custom code were not included in the enum. In other words, it makes IOException more specific instead of general.
Assuming we are not programming a library, and the functionality of class Dog below is specific in our business requirement:
enum DogBreed {
    Bulldog, Poodle;
}
class Dog {
    DogBreed dogBreed;
    public Dog(DogBreed dogBreed) {
        this.dogBreed = dogBreed;
    }
}
Personally, I think it is good to use an enum because it simplifies the class structure (fewer classes).
The first code you cite involves exceptions.
Inheritance is a natural fit for exception types because the language-provided construct to differentiate exceptions of interest in the try-catch statement is through use of the type system. This means we can easily choose to handle just a more specific type (FileNotFound), or the more general type (IOException).
Testing a field's value to decide whether to handle an exception means stepping outside the standard language construct and writing some boilerplate guard code (e.g. test the value(s) and rethrow if not interested).
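To make the contrast concrete, here is a minimal sketch in Python rather than Java (Python's built-in FileNotFoundError is a subclass of OSError, which plays a role similar to IOException here). The CauseError class, its cause field, and all function names are hypothetical, invented only for illustration.

```python
class CauseError(Exception):
    """Hypothetical single exception type that carries its cause as a field."""

    def __init__(self, cause):
        super().__init__(cause)
        self.cause = cause


def handle_by_type(action):
    # The except clause itself filters on the exception type.
    try:
        action()
    except FileNotFoundError:
        return "missing file handled"
    return "ok"


def handle_by_field(action):
    # With one type and a cause field we need a manual guard and a rethrow.
    try:
        action()
    except CauseError as err:
        if err.cause != "FileNotFound":
            raise  # not interested, pass it on unchanged
        return "missing file handled"
    return "ok"


def raise_missing_file():
    raise FileNotFoundError("no such file")


def raise_cause(cause):
    raise CauseError(cause)
```

The type-based handler needs no guard at all, while the field-based one has to inspect the value and rethrow anything it does not want.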
(Further, exceptions need to be extensible across DLL (compilation) boundaries. When we use enums, we may have problems extending the design without modifying the source that introduces the enum (and the other sources that consume it).)
When it comes to things other than exceptions, today's wisdom encourages composition over inheritance as this tends to result in less complex and more maintainable designs.
Your Option 1 is more of a composition example, whereas your Option 2 is clearly an inheritance example.
If you agree that these classes represent instances instead of blueprints, would you say it is a good practice to create classes that represents instances or is it totally wrong the way I am looking at this and my statement "classes representing instances" is just load of nonsense?
I agree with you, and would not say this represents good practice. These classes as shown are not particularly customizable and don't represent added value.
A class that offers no overrides, no new state, and no new methods is not particularly differentiated from its base. So there is little merit in declaring such a class, unless we seek to do instance-of tests on it (like the exception handling language construct does under the covers). We can't really tell from this example, which is contrived for the purposes of asking the question, whether there is any added value in these subclasses, but it doesn't appear so.
To be clear, though, there are lots of worse examples of inheritance, such as when an occupation (or preoccupation) like Teacher or Student inherits from Person. This means that a Teacher cannot be a Student at the same time unless we add even more classes, e.g. TeacherStudent, perhaps using multiple inheritance.
We might call this class explosion, as sometimes we end up needing a matrix of classes because of inappropriate is-a relationships. (Add one new class, and you need a whole new row or column of exploded classes.)
Working with a design that suffers class explosion actually creates more work for clients consuming these abstractions, so it is a lose-lose situation.
At issue here is our trust in natural language: when we say someone is-a Student, this is not, from a logical perspective, the same permanent "is-a"/instance-of relationship (of subclassing), but rather a potentially temporary role that the Person is playing, and one of many possible roles a Person might play concurrently at that. In these cases composition is clearly superior to inheritance.
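A minimal sketch of roles as composition, written in Python for brevity. The Role enum and the add_role, remove_role, and has_role methods are hypothetical names chosen for this illustration; the idea is only that a Person holds a changeable set of roles instead of being a Teacher or Student subclass.

```python
from enum import Enum


class Role(Enum):
    TEACHER = "teacher"
    STUDENT = "student"


class Person:
    """A person has roles; roles can be held concurrently and dropped later."""

    def __init__(self, name):
        self.name = name
        self.roles = set()

    def add_role(self, role):
        self.roles.add(role)

    def remove_role(self, role):
        # discard() does nothing if the role is not currently held.
        self.roles.discard(role)

    def has_role(self, role):
        return role in self.roles
```

With this shape, someone who teaches and studies at the same time needs no TeacherStudent class, and adding a new role adds one enum member rather than a row or column of subclasses.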
In your scenario, however, the Bulldog is unlikely to be able to be anything other than a Bulldog, so the permanent is-a relationship of subclassing holds, and while adding little value, at least this hierarchy does not risk class explosion.
Note that the main drawback of the enum approach is that the enum may not be extensible, depending on the language you're using. If you need arbitrary extensibility (e.g. by others and without altering your code), you have the choice of using something extensible but more weakly typed, like strings (typos aren't caught, duplicates aren't caught, etc.), or you can use inheritance, as it offers decent extensibility with stronger typing. Exceptions need this kind of extensibility by others, without modification and recompilation of the originals, since they are used across DLL boundaries.
If you control the enum and can recompile the code as a unit as needed to handle new dog types, then you don't need this extensibility.
Option 1 has to list all known causes at declaration time.
Option 2 can be extended by creating new classes, without touching the original declaration.
This is important when the base/original declaration is done by the framework. If there were 100 known, fixed, reasons for I/O problems, an enum or something similar could make sense, but if new ways to communicate can crop up that should also be I/O exceptions, then a class hierarchy makes more sense. Any class library that you add to your application can extend with more I/O exceptions without touching the original declaration.
This is basically the O in SOLID: open for extension, closed for modification.
But this is also why, as an example, DayOfWeek-style enumerations exist in many frameworks. It is extremely unlikely that the western world suddenly wakes up one day and decides to go for 14 unique days, or 8, or 6. So having classes for those is probably overkill. These things are more or less set in stone (knock on wood).
The two options you present do not actually express what I think you're trying to get at. What you're trying to differentiate between is composition and inheritance.
Composition works like this:
class Poodle {
    Legs legs;
    Tail tail;
}
class Bulldog {
    Legs legs;
    Tail tail;
}
Both have a common set of characteristics that we can aggregate to 'compose' a class. We can specialize components where we need to, but can just expect that "Legs" mostly work like other legs.
Java has chosen inheritance instead of composition for IOException and FileNotFoundException.
That is, a FileNotFoundException is a kind of (i.e. extends) IOException and permits handling based on the identity of the superclass only (though you can specify special handling if you choose to).
The arguments for choosing composition over inheritance are well-rehearsed by others and can be easily found by searching for "composition vs. inheritance."

Classes vs. Functions [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
What is the difference between functional programming and object-oriented programming? How should one decide which programming paradigm to choose? What are the benefits of one over the other?
Functions are easy to understand even for someone without any programming experience, but with a fair math background. On the other hand, classes seem to be more difficult to grasp.
Let's say I want to make a class/function that calculates the age of a person given his/her birth year and the current year. Should I create a class for this or a function?
Or is the choice dependent on the scenario?
P.S. I am working on Python, but I guess the question is generic.
Create a function. Functions do specific things, classes are specific things.
Classes often have methods, which are functions that are associated with a particular class, and do things associated with the thing that the class is - but if all you want is to do something, a function is all you need.
Essentially, a class is a way of grouping functions (as methods) and data (as properties) into a logical unit revolving around a certain kind of thing. If you don't need that grouping, there's no need to make a class.
As Amber says in her answer: create a function. In fact, you don't have to make a class if you have something like:
class Person(object):
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2

    def compute(self, other):
        """ Example of bad class design, don't care about the result """
        return self.arg1 + self.arg2 % other
Here you just have a function encapsulated in a class. This just makes the code less readable and less efficient. In fact, the function compute can be written just like this:
def compute(arg1, arg2, other):
    return arg1 + arg2 % other
You should use classes only if you have more than one function and if keeping an internal state (with attributes) makes sense. Otherwise, if you want to group functions, just create a module in a new .py file.
You might watch this video (YouTube, about 30 min), which explains my point. Jack Diederich shows why classes are evil in that case and why it's such a bad design, especially in things like APIs.
It's quite a long video but it's a must-see.
I know it is a controversial topic, and I will likely get burned now. But here are my thoughts.
For myself, I figured that it is best to avoid classes as long as possible. If I need a complex data type I use a simple struct (C/C++), dict (Python), JSON (JS), or similar, i.e. no constructor, no class methods, no operator overloading, no inheritance, etc. When using classes, you can get carried away by OOP itself (which design pattern, what should be private, blah blah) and lose focus on the essential stuff you wanted to code in the first place.
If your project grows big and messy, then OOP starts to make sense because some sort of helicopter-view system architecture is needed. "function vs class" also depends on the task ahead of you.
function
purpose: process data, manipulate data, create result sets.
when to use: always code a function if you want to do this: “y=f(x)”
struct/dict/json/etc (instead of class)
purpose: store attr./param., maintain attr./param., reuse attr./param., use attr./param. later.
when to use: if you deal with a set of attributes/params (preferably not mutable)
different languages same thing: struct (C/C++), JSON (js), dict (python), etc.
always prefer simple struct/dict/json/etc over complicated classes (keep it simple!)
class (if it is a new data type)
a simple perspective: is a struct (C), dict (python), json (js), etc. with methods attached.
The method should only make sense in combination with the data/param stored in the class.
my advice: never code complex stuff inside class methods (call an external function instead)
warning: do not misuse classes as fake namespace for functions! (this happens very often!)
other use cases: if you want to do a lot of operator overloading then use classes (e.g. your own matrix/vector multiplication class)
ask yourself: is it really a new “data type”? (Yes => class | No => can you avoid using a class)
array/vector/list (to store a lot of data)
purpose: store a lot of homogeneous data of the same data type, e.g. time series
advice#1: just use what your programming language already has; do not reinvent it
advice#2: if you really want your “class mysupercooldatacontainer”, then derive from an existing array/vector/list/etc. class (e.g. “class mycontainer : public std::vector…”)
enum (enum class)
i just mention it
advice#1: use enum plus switch-case instead of overcomplicated OOP design patterns
advice#2: use finite state machines
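The last two pieces of advice above (enum plus branching, finite state machines) can be sketched together in Python. The traffic-light states and the next_light function are hypothetical examples, not taken from the answer; they only show a state machine written as an enum and one plain function, with no class hierarchy per state.

```python
from enum import Enum


class Light(Enum):
    RED = 1
    GREEN = 2
    YELLOW = 3


def next_light(state):
    """Return the state that follows `state` in the cycle RED, GREEN, YELLOW."""
    # Plain branching on the enum value takes the place of a State pattern.
    if state is Light.RED:
        return Light.GREEN
    if state is Light.GREEN:
        return Light.YELLOW
    if state is Light.YELLOW:
        return Light.RED
    raise ValueError(f"unknown state: {state!r}")
```

The function is an instance of “y=f(x)”: the current state goes in, the next state comes out, and nothing is stored.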
Classes (or rather their instances) are for representing things. Classes are used to define the operations supported by a particular class of objects (its instances). If your application needs to keep track of people, then Person is probably a class; the instances of this class represent particular people you are tracking.
Functions are for calculating things. They receive inputs and produce an output and/or have effects.
Classes and functions aren't really alternatives, as they're not for the same things. It doesn't really make sense to consider making a class to "calculate the age of a person given his/her birthday year and the current year". You may or may not have classes to represent any of the concepts of Person, Age, Year, and/or Birthday. But even if Age is a class, it shouldn't be thought of as calculating a person's age; rather the calculation of a person's age results in an instance of the Age class.
If you are modelling people in your application and you have a Person class, it may make sense to make the age calculation be a method of the Person class. A method is basically a function which is defined as part of a class; this is how you "define the operations supported by a particular class of objects" as I mentioned earlier.
So you could create a method on your Person class for calculating the age of the person (it would probably retrieve the birth year from the person object and receive the current year as a parameter). But the calculation is still done by a function (just a function that happens to be a method on a class).
Or you could simply create a stand-alone function that receives arguments (either a person object from which to retrieve a birth year, or simply the birth year itself). As you note, this is much simpler if you don't already have a class where this method naturally belongs! You should never create a class simply to hold an operation; if that's all there is to the class then the operation should just be a stand-alone function.
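Both options fit in a few lines. The names below (`age_in_years`, `Person.age`) are illustrative assumptions; the point is only that the method version is still a function, one that reads the birth year from the object instead of taking it as an argument.

```python
from dataclasses import dataclass


def age_in_years(birth_year: int, current_year: int) -> int:
    """Stand-alone function: needs no class at all."""
    return current_year - birth_year


@dataclass
class Person:
    name: str
    birth_year: int

    def age(self, current_year: int) -> int:
        # The method is still just a function; it takes birth_year from self.
        return age_in_years(self.birth_year, current_year)
```

If the application has no other reason to model people, the first function alone is enough.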
It depends on the scenario. If you only want to compute the age of a person, then use a function since you want to implement a single specific behaviour.
But if you want to create an object that contains the date of birth of a person (and possibly other data) and allows modifying it, then computing the age could be one of many operations related to the person, and it would be sensible to use a class instead.
Classes provide a way to merge together some data and related operations. If you have only one operation on the data, then using a function and passing the data as an argument gives equivalent behaviour with less complex code.
Note that a class of the kind:
class A(object):
    def __init__(self, ...):
        #initialize
    def a_single_method(self, ...):
        #do stuff
isn't really a class; it is only a (complicated) function. A legitimate class should always have at least two methods (not counting __init__).
I'm going to break from the herd on this one (Edit 7 years later: I'm not a lone voice on this anymore, there is an entire coding movement to do just this, called 'Functional Programming') and provide an alternate point of view:
Never create classes. Always use functions.
Edit: Research has repeatedly shown that Classes are an outdated method of programming. Nearly every research paper on the topic sides with Functional Programming rather than Object Oriented Programming.
Reliance on classes has a significant tendency to cause coders to create bloated and slow code. Classes getting passed around (since they're objects) take a lot more computational power than calling a function and passing a string or two. Proper naming conventions on functions can do pretty much everything creating a class can do, and with only a fraction of the overhead and better code readability.
That doesn't mean you shouldn't learn to understand classes though. If you're coding with others, people will use them all the time and you'll need to know how to juggle those classes. Writing your code to rely on functions means the code will be smaller, faster, and more readable. I've seen huge sites written using only functions that were snappy and quick, and I've seen tiny sites that had minimal functionality that relied heavily on classes and broke constantly. (When you have classes extending classes that contain classes as part of their classes, you know you've lost all semblance of easy maintainability.)
When it comes down to it, all data you're going to want to pass can easily be handled by the existing datatypes.
Classes were created as a mental crutch and provide no actual extra functionality, and the overly-complicated code they have a tendency to create defeats the point of that crutch in the long run.
Edit: Update 7 years later...
Recently, a new movement in coding has been validating this exact point I've made. It is the movement to replace Object Oriented Programming (OOP) with functional programming, and it's based on a lot of these exact issues with OOP. There are lots of research papers showing the benefits of Functional programming over Object Oriented Programming. In addition to the points I've mentioned, it makes reusing code much easier, and makes bugfixing and unit testing faster and easier. Honestly, with the vast number of benefits, the only reason to go with OOP over Functional is compatibility with legacy code that hasn't been updated yet.
Before answering your question:
If you do not have a Person class, first you must consider whether you want to create a Person class. Do you plan to reuse the concept of a Person very often? If so, you should create a Person class. (You have access to this data in the form of a passed-in variable and you don't care about being messy and sloppy.)
To answer your question:
You have access to their birthyear, so in that case you likely have a Person class with a someperson.birthdate field. In that case, you have to ask yourself, is someperson.age a value that is reusable?
The answer is yes. We often care about age more than the birthdate, so if the birthdate is a field, age should definitely be a derived field. (A case where we would not do this: if we were calculating values like someperson.chanceIsFemale or someperson.positionToDisplayInGrid or other irrelevant values, we would not extend the Person class; you just ask yourself, "Would another program care about the fields I am thinking of extending the class with?" The answer to that question will determine if you extend the original class, or make a function (or your own class like PersonAnalysisData or something).)
Never create classes. At least the OOP kind of classes in Python being discussed.
Consider this simplistic class:
class Person(object):
    def __init__(self, id, name, city, account_balance):
        self.id = id
        self.name = name
        self.city = city
        self.account_balance = account_balance

    def adjust_balance(self, offset):
        self.account_balance += offset

if __name__ == "__main__":
    p = Person(123, "bob", "boston", 100.0)
    p.adjust_balance(50.0)
    print("done!: {}".format(p.__dict__))
vs this namedtuple version:
from collections import namedtuple

Person = namedtuple("Person", ["id", "name", "city", "account_balance"])

def adjust_balance(person, offset):
    return person._replace(account_balance=person.account_balance + offset)

if __name__ == "__main__":
    p = Person(123, "bob", "boston", 100.0)
    p = adjust_balance(p, 50.0)
    print("done!: {}".format(p))
The namedtuple approach is better because:
namedtuples have more concise syntax and standard usage.
In terms of understanding existing code, namedtuples are basically effortless to understand. Classes are more complex. And classes can get very complex for humans to read.
namedtuples are immutable. Managing mutable state adds unnecessary complexity.
class inheritance adds complexity, and hides complexity.
I can't see a single advantage to using OOP classes. The obvious exceptions are if you are used to OOP, or if you have to interface with code that requires classes, such as Django.
BTW, most other languages have some record type feature like namedtuples. Scala, for example, has case classes. This logic applies equally there.
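For comparison, Python's standard library has a second record-like option besides namedtuple: a frozen dataclass. The sketch below reuses the `Person` fields from the example above and is not part of the original answer; `dataclasses.replace` plays the role of `_replace`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    id: int
    name: str
    city: str
    account_balance: float


def adjust_balance(person: Person, offset: float) -> Person:
    # Returns a new record; the original stays untouched.
    return replace(person, account_balance=person.account_balance + offset)


if __name__ == "__main__":
    p = Person(123, "bob", "boston", 100.0)
    p = adjust_balance(p, 50.0)
    print("done!: {}".format(p))
```

As with the namedtuple version, state changes are expressed as new values rather than in-place mutation.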

Elegant AST model

I am in the process of writing a toy compiler in Scala. The target language itself looks like Scala but is an open field for experiment.
After several large refactorings I can't find a good way to model my abstract syntax tree. I would like to use the facilities of Scala's pattern matching; the problem is that the tree carries changing information (like types, symbols) along the compilation process.
I can see a couple of solutions, none of which I like:
case classes with mutable fields (I believe the Scala compiler does this): the problem is that those fields are not present at each stage of the compilation and thus have to be nulled (or Option'd), and it becomes really heavy to debug/write code. Moreover, if, for example, I find a node with a null type after the typing phase, I have a really hard time finding the cause of the bug.
huge trait/case class hierarchy : something like Node, NodeWithSymbol, NodeWithType, ... Seems like a pain to write AND work with
something completely hand-crafted with extractors
I'm also not sure if it is good practice to go with a fully immutable AST, especially in Scala where there is no implicit sharing (because the compiler is not aware of immutability) and it could hurt performance to copy the tree all the time.
Can you think of an elegant pattern to model my tree using Scala's powerful type system?
TL;DR I prefer to keep the AST immutable and carry things like type information in a separate structure, e.g. a Map, that can be referred to by IDs stored in the AST. But there is no perfect answer.
You're by no means the first to struggle with this question. Let me list some options:
1) Mutable structures that get updated at each phase. All the up and downsides you mention.
2) Traits/cake pattern. Feasible, but expensive (there's no sharing) and kinda ugly.
3) A new tree type at each phase. In some ways this is the theoretically cleanest. Each phase can deal only with a structure produced for it by the previous phase. Plus the same approach carries all the way from front end to back end. For instance, you may "desugar" at some point and having a new tree type means that downstream phase(s) don't have to even consider the possibility of node types that are eliminated by desugaring. Also, low level optimizations usually need IRs that are significantly lower level than the original AST. But this is also a lot of code since almost everything has to be recreated at each step. This approach can also be slow since there can be almost no data sharing between phases.
4) Label every node in the AST with an ID and use that ID to reference information in other data structures (maps and vectors and such) that hold information computed for each phase. In many ways this is my favorite. It retains immutability, maximizes sharing and minimizes the "excess" code you have to write. But you still have to deal with the potential for "missing" information that can be tricky to debug. It's also not as fast as the mutable option, though faster than any option that requires producing a new tree at each phase.
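Option 4 can be sketched compactly. The example below is in Python rather than Scala to keep it short, and every name in it (`Num`, `Add`, `infer_types`, the `types` table) is a hypothetical illustration, not part of any compiler. The tree is immutable, each node carries only an ID, and the typing phase writes its results into a separate dictionary keyed by that ID.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Union

_ids = count()  # hypothetical source of unique node IDs


@dataclass(frozen=True)
class Num:
    node_id: int
    value: int


@dataclass(frozen=True)
class Add:
    node_id: int
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]


def num(value: int) -> Num:
    return Num(next(_ids), value)


def add(left: Expr, right: Expr) -> Add:
    return Add(next(_ids), left, right)


def infer_types(node: Expr, types: Dict[int, str]) -> str:
    """Typing phase: records each node's type in `types`, never in the tree."""
    if isinstance(node, Add):
        infer_types(node.left, types)
        infer_types(node.right, types)
    result = "Int"  # toy language: every expression is an Int
    types[node.node_id] = result
    return result
```

A later phase looks up `types[node.node_id]`. The "missing information" downside shows up here as a `KeyError` when a phase asks for a node that the typing phase never visited.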
I recently started writing a toy verifier for a small language, and I am using the Kiama library for the parser, resolver and type checker phases.
Kiama is a Scala library for language processing. It enables convenient analysis and transformation of structured data. The programming styles supported by the library are based on well-known formal language processing paradigms, including attribute grammars, tree rewriting, abstract state machines, and pretty printing.
I'll try to summarise my (fairly limited) experience:
[+] Kiama comes with several examples, and the main contributor usually responds quickly to questions asked on the mailing list
[+] The attribute grammar paradigm allows for a nice separation into "immutable components" of the nodes, e.g., names and subnodes, and "mutable components", e.g., type information
[+] The library comes with a versatile rewriting system which - so far - covered all my use cases
[+] Parts of the library, e.g., the pretty printer, make nice examples of DSLs and of various functional patterns/approaches/ideas
[-] The learning curve is definitely steep, even with examples and the mailing list at hand
[-] Implementing the resolving phase in a "purely functional" style (cf. my question) seems tricky, but a hybrid approach (which I haven't tried yet) seems to be possible
[-] The attribute grammar paradigm and the resulting separation of concerns doesn't make it obvious how to document the properties nodes have in the end (cf. my question)
[-] Rumour has it that the attribute grammar paradigm does not yield the fastest implementations
Summarising my summary, I enjoy using Kiama a lot and I strongly recommend that you give it a try, or at least have a look at the examples.
(PS. I am not affiliated with Kiama)

OOP: Is it normal to have a lot of inherited classes?

I started writing some code for a 2D game and created a class "objets", trying to keep it as generic as possible. I have a few methods and attributes that are common to every kind of element (buildings, people, interface buttons, etc.), like w, h, x, y, but most of them only make sense when applied to a specific type of item.
So would I have to derive a new class for every type of actor in the game?
Just wondering if this is a common practice, or maybe I should manage it in a different way.
Thanks in advance.
If you're introducing behaviour then subclass, however if the difference is attribute based then don't e.g.
Animal (has .colour and .makeSound) -> Dog (has .eatOwnPoop) -> RedDog (no, too specific, covered by colour)
Notice how I had ".makeSound" in Animal. I could have put .bark in dog, but then I'd have to put .meow in cat etc. The subclass can simply override and provide a concrete sound.
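The rule of thumb above might look like this in Python (a sketch; the `Cat` class and the concrete sounds are assumptions added for illustration). Colour is an attribute, so it needs no subclass; the sound is behaviour, so each subclass overrides it.

```python
class Animal:
    def __init__(self, colour: str) -> None:
        self.colour = colour  # attribute difference: no subclass needed

    def make_sound(self) -> str:
        raise NotImplementedError("subclasses provide a concrete sound")


class Dog(Animal):
    def make_sound(self) -> str:
        return "Woof"


class Cat(Animal):
    def make_sound(self) -> str:
        return "Meow"


# A red dog is just a Dog whose colour is "red", not a RedDog class.
red_dog = Dog(colour="red")
sounds = [animal.make_sound() for animal in (red_dog, Cat("black"))]
```

Callers only ever use `make_sound`, so adding another animal never requires touching the code that loops over them.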
However, you can use interfaces to better cross-cut your code, but that's quite a lengthy topic and probably overkill for your needs (although it could help any unit testing you do).
It sounds like you are over-using inheritance. It is certainly a red flag when you simultaneously say "common attributes like ..." and "...only make sense when applied to a specific type." Also, it is a red flag that domain objects such as building share a common base class with an interface object like button. Finally, it is quite unusual to define your own objet (object?) class from which every class in your system derives. It's not inconceivable, but in combination with your other comments, it sounds like you've started down an unproductive path.
You might want to refer to a good tutorial on object-oriented design and analysis such as "Head First OOA&D"
You do not HAVE to do anything. Generally, derived classes are useful when they share some commonality but become more specialised, requiring specific functionality at each level of inheritance. Inheritance is also appropriate when you want polymorphic behaviour. You have asked a very open-ended question, but basically do not feel that you HAVE to use inheritance: not every problem requires it, and some people overuse it, introducing it in places where it really is not needed. All in all, if you haven't already, I would recommend reading a good book on object-oriented design, as it will get you to think about your code from a different perspective and greatly improve the way you view and design software. It may sound like a cop-out, but this kind of question is very hard to answer without knowing all the details of what you are doing.

How do you go from an abstract project description to actual code?

Maybe it's because I've been coding for around two semesters now, but the major stumbling block that I'm having at this point is converting the professor's project description and requirements to actual code. Since I'm currently in Algorithms 101, I basically follow a bottom-up process: I start with a blank whiteboard, draw out the object and method interactions, then translate that into classes and code.
But now the prof has tossed interfaces and abstract classes into the mix. Intellectually, I can recognize how they work, but am stubbing my toes figuring out how to use these new tools with the current project (simulating a web server).
In my professor's own words, mapping the abstract description to Java code is the real trick. So what steps are best used to go from English (or whatever your language is) to computer code? How do you decide where and when to create an interface, or use an abstract class?
So what steps are best used to go from English (or whatever your language is) to computer code?
Experience is what teaches you how to do this. If it's not coming naturally yet (and don't feel bad if it doesn't, because it takes a long time!), there are some questions you can ask yourself:
What are the main concepts of the system? How are they related to each other? If I was describing this to someone else, what words and phrases would I use? These thoughts will help you decide what classes are useful to think about.
What sorts of behaviors do these things have? Are there natural dependencies between them? (For example, a LineItem isn't relevant or meaningful without the context of an Order, nor is an Engine much use without a Car.) How do the behaviors affect the state of the other objects? Do they communicate with each other, and if so, in what way? These thoughts will help you develop the public interfaces of your classes.
That's just the tip of the iceberg, of course. For more about this thought process in general, see Eric Evans's excellent book, Domain-Driven Design.
How do you decide where and when to create an interface, or use an abstract class?
There are no hard-and-fast prescriptions; again, experience is the best guide here. That said, there are certainly some rules of thumb you can follow:
If several unrelated or significantly different object types all provide the same kind of functionality, use an interface. For example, if the Steerable interface has a Steer(Vector bearing) method, there may be lots of different things that can be steered: Boats, Airplanes, CargoShips, Cars, et cetera. These are completely unrelated things. But they all share the common interface of being able to be steered.
In general, try to favor an interface over an abstract base class. This way a single class can implement N interfaces. In Java, a class can extend only one base class, so you're locked into a particular inheritance hierarchy once you say that a class inherits from another one.
Whenever you don't need implementation from a base class, definitely favor an interface over an abstract base class. This would also be handy if you're operating in a language where inheritance doesn't apply. For example, in C#, you can't have a struct inherit from a base class.
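The `Steerable` example can be sketched as follows. It is written in Python for brevity, with `typing.Protocol` (Python 3.8+) standing in for a Java or C# interface; the `Vector` alias and the attribute names are assumptions made for the illustration. `Boat` and `Car` share no base class, only the ability to be steered.

```python
from typing import List, Protocol, Tuple

Vector = Tuple[float, float]  # stand-in for a real vector type


class Steerable(Protocol):
    """Anything with a matching steer() method satisfies this interface."""

    def steer(self, bearing: Vector) -> None: ...


class Boat:
    def __init__(self) -> None:
        self.heading: Vector = (0.0, 0.0)

    def steer(self, bearing: Vector) -> None:
        self.heading = bearing


class Car:
    def __init__(self) -> None:
        self.wheel_direction: Vector = (0.0, 0.0)

    def steer(self, bearing: Vector) -> None:
        self.wheel_direction = bearing


def steer_all(vehicles: List[Steerable], bearing: Vector) -> None:
    # Depends only on the interface, not on any concrete class.
    for vehicle in vehicles:
        vehicle.steer(bearing)
```

`steer_all([Boat(), Car()], (1.0, 0.0))` works even though the two classes are unrelated, which is the situation the first rule of thumb describes.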
In general...
Read a lot of other people's code. Open source projects are great for that. Respect their licenses though.
You'll never get it perfect. It's an iterative process. Don't be discouraged if you don't get it right.
Practice. Practice. Practice.
Research often. Keep tackling more and more challenging projects / designs. Even if there are easy ones around.
There is no magic bullet, or algorithm for good design.
Nowadays I jump in with a design I believe is decent and work from that.
When the time is right I'll implement it, understanding that the result will have to be refactored (rewritten) sooner rather than later.
Give this project your best shot, keep an eye out for your mistakes and how things should've been done after you get back your results.
Keep doing this, and you'll be fine.
What you should really do is code from the top-down, not from the bottom-up. Write your main function as clearly and concisely as you can using APIs that you have not yet created as if they already existed. Then, you can implement those APIs in similar fashion, until you have functions that are only a few lines long. If you code from the bottom-up, you will likely create a whole lot of stuff that you don't actually need.
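A tiny illustration of that workflow, in Python and with entirely made-up names and rules: `summarize_log` was written first, against helpers that did not exist yet, and the three helpers were filled in afterwards, each only a line or two long.

```python
from typing import List


def summarize_log(lines: List[str]) -> str:
    """Top-level function, written first as if the helpers already existed."""
    requests = parse_requests(lines)
    statuses = [handle(request) for request in requests]
    return format_report(statuses)


def parse_requests(lines: List[str]) -> List[str]:
    # Keep non-empty lines, stripped of surrounding whitespace.
    return [line.strip() for line in lines if line.strip()]


def handle(request: str) -> int:
    # Toy rule for the sketch: only GET requests succeed.
    return 200 if request.startswith("GET ") else 405


def format_report(statuses: List[int]) -> str:
    ok = sum(1 for status in statuses if status == 200)
    return f"{ok}/{len(statuses)} requests succeeded"
```

Nothing here was written that the top-level function did not ask for, which is the main benefit claimed for working top-down.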
In terms of when to create an interface... pretty much everything should be an interface. When you use APIs that don't yet exist, assume that every concrete class is an implementation of some interface, and use a declared type that is indicative of that interface. Your inheritance should be done solely with interfaces. Only create concrete classes at the very bottom when you are providing an implementation. I would suggest avoiding abstract classes and just using delegation, although abstract classes are also reasonable when two different implementations differ only slightly and have several functions that have a common implementation. For example, if your interface allows one to iterate over elements and also provides a sum function, the sum function is trivial to implement in terms of the iteration function, so that would be a reasonable use of an abstract class. An alternative would be to use the decorator pattern in that case.
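The iterate-and-sum case could look like the sketch below, written in Python with `abc` in place of a Java abstract class. The class names are hypothetical. Iteration is the only thing each implementation must supply; `total` is written once in the abstract base in terms of it.

```python
from abc import ABC, abstractmethod
from typing import Iterator


class NumberCollection(ABC):
    @abstractmethod
    def __iter__(self) -> Iterator[float]:
        """Each implementation decides how its elements are produced."""

    def total(self) -> float:
        # Shared implementation: defined once, in terms of iteration.
        return sum(self)


class CountTo(NumberCollection):
    """The numbers 1..n, generated on demand."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __iter__(self) -> Iterator[float]:
        return iter(range(1, self.n + 1))


class Fixed(NumberCollection):
    """A fixed set of values stored in a tuple."""

    def __init__(self, *values: float) -> None:
        self.values = values

    def __iter__(self) -> Iterator[float]:
        return iter(self.values)
```

The two implementations differ only in how they iterate, which is the "differ only slightly" situation in which the answer considers an abstract class reasonable.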
You might also find the Google Techtalk "How to Design a Good API and Why it Matters" to be helpful in this regard. You might also be interested in reading some of my own software design observations.
Also, going forward, consider reading the basics of domain-driven design so that you can relate your designs to real-world scenarios; it gives a solid foundation for mapping requirements to actual classes.