This is a question about the definition of a class.
Of course I have read the endless examples on the Internet of what should be called a class. I have read that it is all the verbs and nouns that make up a thing. I understand the concept of a car class with properties like size, colour, and methods like drive
I also understand the idea that a class should have only one responsibility and adhere to the other SOLID principles
My problem relates to a program I have developed.
The responsibility of the program is to extract all the similar words from a document. It is therefore not a 'noun' like a car or animal but a verb type class I suppose.
In order to do this, the program iterates through a folder of text files, extracts all the text, splits the text up by line and then into 20-character chunks, compares each of the chunks in one file to all of the others by similarity, keeps only the words that are similar between two files, cleans the words to get rid of various characters, and then adds the words to a text file. It repeats this for all the files in the folder.
So I have one responsibility for the class and I have written methods for each of the phrases between the commas.
Having read more about class design, it occurred to me that some of these methods might be classes in their own right. If a class is defined by having a single responsibility, then presumably I could define more classes instead of these methods. E.g. why don't I have a class to find word similarity with only one method?
So my question is: how do I define a class on a single-responsibility basis if a method also has a single responsibility and the class doesn't define a thing but more of an action? What are the boundaries of what defines a class?
Please no...'Have you read'...because I have read them all. A simple explanation with a well illustrated example (conceptual example is fine)
The term "single responsibility" is very nebulous. I find it much easier to think of it in terms of cohesion and coupling. In short, we have to get things that tend to change together (i.e. are strongly cohesive) into one class and things that don't (i.e. are loosely coupled) into separate classes.
In practice that means things that tend to work with the same "data" belong to the same class. This can be easily enforced if data does not leave the object. Even more pragmatically that means avoiding "getter" methods that return data from an object.
Regarding your problem. You're saying it's not a noun, but only because you don't think of it that way. What is your "business logic"? To collect SimilarWords from a Document. Both are nouns. Your phrases are all about what steps should be taken. Rethink your application in terms of what things are involved and what actions those things would be able to do for you.
Here is a short/incomplete design for the things you describe:
public interface Folder {
    public SimilarWords extract();
}
Meaning: I want to extract SimilarWords from a Folder.
public interface TextFile {
    public void chunk(Consumer<Chunk> chunkConsumer);
}
Meaning: TextFile chunks the text.
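public class Comparison {
    public Comparison(TextFile file1, TextFile file2) {
        // keep both files for the comparison (body omitted in this sketch)
    }
    public SimilarWords extract() {
        throw new UnsupportedOperationException("design sketch only");
    }
}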
Meaning: Two TextFiles are compared where the SimilarWords come from. You didn't use the word "Comparison" explicitly, I made that up.
And of course SimilarWords need to be added together for all file pairs (?) and then written to some output:
public interface SimilarWords {
    public SimilarWords add(SimilarWords other);
    public void writeTo(OutputStream output);
}
So that would be a proper OO design. I didn't catch all the details of your domain, so this model may not be exactly what you want, but I think you get the point.
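To make the design above a little more tangible, here is a minimal runnable sketch. Several things in it are assumptions and not part of the original design: a chunk is simplified to a single lower-cased word, "similarity" is reduced to exact equality, InMemoryTextFile is a hypothetical stand-in for a real file, and writeTo takes a StringBuilder instead of an OutputStream to keep the example short.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

// Simplified version of the TextFile idea: a chunk is just a String here (assumption).
interface TextFile {
    void chunk(Consumer<String> chunkConsumer);
}

// Hypothetical in-memory implementation so the sketch runs without a file system.
final class InMemoryTextFile implements TextFile {
    private final String text;

    InMemoryTextFile(String text) {
        this.text = text;
    }

    @Override
    public void chunk(Consumer<String> chunkConsumer) {
        // The raw text never leaves the object; callers only receive chunks.
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                chunkConsumer.accept(word);
            }
        }
    }
}

final class SimilarWords {
    private final Set<String> words;

    SimilarWords(Set<String> words) {
        this.words = new TreeSet<>(words);
    }

    // Combines the results of two comparisons into a new value.
    SimilarWords add(SimilarWords other) {
        Set<String> merged = new TreeSet<>(words);
        merged.addAll(other.words);
        return new SimilarWords(merged);
    }

    // Writes the words in sorted order, separated by single spaces.
    void writeTo(StringBuilder output) {
        output.append(String.join(" ", words));
    }
}

final class Comparison {
    private final TextFile file1;
    private final TextFile file2;

    Comparison(TextFile file1, TextFile file2) {
        this.file1 = file1;
        this.file2 = file2;
    }

    // "Similar" is reduced to "appears in both files" for this sketch.
    SimilarWords extract() {
        Set<String> first = new TreeSet<>();
        file1.chunk(first::add);
        Set<String> shared = new TreeSet<>();
        file2.chunk(word -> {
            if (first.contains(word)) {
                shared.add(word);
            }
        });
        return new SimilarWords(shared);
    }
}

final class SimilarWordsDemo {
    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        new Comparison(new InMemoryTextFile("the quick brown fox"),
                       new InMemoryTextFile("a quick red fox"))
                .extract()
                .writeTo(out);
        System.out.println(out); // prints "fox quick"
    }
}
```

Note that no class exposes a getter: each object is asked to do something with its own data, which is the cohesion point made above.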
Let's think a little about both your problem, problems in general, and SRP.
SRP states that a class should be concerned with one thing. This doesn't mean exactly to have a single method that does only one thing.
Actually this can be applied outside OOP too: a function should do only a single thing.
Now imagine your program has to implement 200 features. Imagine they are so simple that a single function is enough to implement any feature. And suppose you are using only functions. By the same principle you have to write (at least) 200 functions. Now this is not as great as it looks. First, your program structure looks like an endless list of micro-sized pieces of code. Second, if they are micro-sized, they can't do much by themselves (this is not bad per se). As you suspected, a feature doesn't usually map to a single function in the real world. Third, if they do almost nothing, they have to ask someone else for everything. Or someone is doing that somewhere else. So there is some place where a function, or a class, is calling all the others. That place centralizes a lot of knowledge about the system. It has to know about everything to be able to call everyone. This is not good for an architecture.
The alternative is to distribute the knowledge.
If you allow those functions or classes to do a little more, they ask fewer things of others, and some of those things are solved locally. Let me guess: as all these classes are in the same application, some of them are related to each other. They can form a group and collaborate. Maybe they can be the same class, or inherit from others. This reduces communication paths. Communication becomes more local.
Communication paths matter. Imagine there are 125 people in your company, and the company needs to take collective decisions. Would you hold a 125-person meeting, or would you split people into, say, 5 groups, each with 5 teams of 5 people, hold small meetings instead, and then have the team and group leaders meet among themselves? This is a form of hierarchy or structure that helps things.
Can you imagine the fan-in and fan-out in the new structure? 5/5/5 is much better than 1/125.
So this is about a trade-off. You are trading communication paths for responsibilities. What you want in the end is a reasonable architecture, with knowledge distributed evenly.
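The 125-person comparison can be put into numbers. The sketch below rests on one assumption that is not in the answer: in any meeting, every participant may need to talk to every other participant, so a meeting of n people has n * (n - 1) / 2 communication paths. The class and method names are made up for illustration.

```java
final class CommunicationPaths {
    private CommunicationPaths() {
    }

    // Number of distinct pairs among n participants.
    static int pairwise(int n) {
        return n * (n - 1) / 2;
    }

    // Paths in a hierarchy where every meeting has `branching` participants
    // and there are `levels` levels of meetings (leaders are team members too).
    static int hierarchical(int branching, int levels) {
        int total = 0;
        int meetings = 1; // one meeting at the top level
        for (int level = 0; level < levels; level++) {
            total += meetings * pairwise(branching);
            meetings *= branching; // each participant leads one meeting below
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(pairwise(125));      // prints 7750
        System.out.println(hierarchical(5, 3)); // prints 310
    }
}
```

Under that assumption, the flat meeting has 7750 possible paths, while the 5/5/5 structure has 250 (teams) + 50 (team leaders per group) + 10 (group leaders) = 310.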
Scala has two instruments for expressing object composition: the original self-type concept and the well-known trivial composition. I'm curious in which situations I should use which.
There are obvious differences in their applicability. Self-types require you to use traits. Object composition allows you to change extensions at run time with a var declaration.
Leaving technical details behind, I can see two indicators that help classify use cases. If an object is used as a combinator for a complex structure such as a tree, or just has several similarly typed parts (a 1 car to 4 wheels relation), then it should use composition. There is an extreme opposite use case: let's assume one trait became too big to be clearly understood and it got split. It is quite natural to use self-types in this case.
These rules are not absolute. You may do extra work to convert code between these techniques. E.g. you may replace the 4 wheels composition with self-typing over Product4. You may use Cake[T <: MyType] {part : MyType} instead of Cake { this : MyType => } for cake pattern dependencies. But both cases seem counterintuitive and give you extra work.
There are plenty of boundary use cases, though. One-to-one relations are very hard to decide. Is there any simple rule to decide which technique is preferable?
Self-types make your classes abstract; composition makes your code verbose. Self-types give you problems with blended namespaces and also give you extra typing for free (you get not just a cocktail of two elements but a gasoline and motor oil cocktail known as a petrol bomb).
How can I choose between them? What hints are there?
Update:
Let us discuss the following example:
Adapter pattern. What benefits does it have with the self-typing and composition approaches?
The hints below are derived from a heuristic approach (a trial-and-error method of problem solving used when an algorithmic approach is impractical) and are not supported by any formula (mathematically based reasoning).
Note: the hints given here should be evaluated together with the accompanying hints; no single hint is a perfect rule for distinguishing composition and self-typing use cases.
(While following below mentioned hints, I don't care or focus on verbosity or number of lines of the code or programming effort inputs.)
composition (dictionary meaning): the act of combining parts or elements to form a whole (Trivial Composition)
trait (Dictionary meaning): a distinguishing characteristic or quality
Hints for Trivial Composition (which can be achieved by super - sub class mechanism or association relationship) (e.g. Car and Wheels):
Which can be counted discretely (e.g. Wheels)
Which can be classified further (based on different criteria) (e.g. Wheels: alloy wheel, steel wheel, etc.)
Which can be added or removed (Note: when we say a wheel stopped, it is actually the wheel's rotation speed that stopped; when we say a heart stopped, it is actually the heart's pulse rate that has become zero)
Generally applicable to few (in the universe, some vehicles and some machinery have wheels)
("Few" can be 10-15 or millions. To explain: when geologists talk about time and say "some time ago", they mean a few million years ago; it depends on the actual subject.)
Hints for Self Type (Trait)(eg. Car and Speed):
Which is unidimensional (not in terms of physics), can be plotted on a number line (whatever the physical unit is) (e.g. Speed)
Which can't be classified further naturally (e.g. Speed), or at least you will not classify it further. "Naturally" here means that to classify it you would have to depend on your own criteria, and there would be a definite possibility of classifying it into millions of subtypes. Take move: you can have millions of move sub-traits, like zigzag move, rotate and go forward, and so on (millions of possibilities with various permutations and combinations).
Which can be increased, decreased or stopped (e.g. speed, anger, love, etc.)
Which is generally seen, or can be seen, in very distantly placed classes (e.g. speed of light, speed of the earth, speed of a runner)
Generally applicable to many (in the universe, the majority of objects, here every object, have speed)
Software Development is like making your own universe and as a creator you define everything. A Trait will be seen among distantly placed classes in your domain (your own universe).
Please note that I have not seen any specific word (here counterpart for trait) in any language ( I know very few) for part which is used for Trivial Composition.
Further Explanation:
To get the answer, you need to dig deep into the philosophy of the class-oriented or object-oriented approach to software development, and to understand the mind and logic of the creators of programming languages such as Java and Scala (and many more), who have built the class-oriented or object-oriented paradigm into those languages.
Another thing you need is the deep understanding of semantics (the study of meaning or the study of linguistic development by classifying and examining changes in meaning and form )which we use to describe the real world and the semantics behind the keywords (in the programming language) we use as programmers.
I believe, when we create class we want to manifest the real world into software. The class becomes representation of something from the real world may it be car, human, star, dream, thought or imagination etc.
When somebody says "Wheels", you will have clear cut picture of its shape and application and you can think of driving wheel or wheels which roll on road. Wheel always be part of something. It can be counted in discrete numbers. Wheels can be further classified based on criteria like material, application, size etc. Wheel like things qualify for Trivial Composition.
When somebody says "Speed", you will not have any clear cut picture of it... no shape ...no colour... but you can relate it with any moving (relativity) part in the universe. It is a characteristic, trait. Speed is not part of anything. It can be there or it can not be there. It can be plotted on a single line (either direction + or - ). It is hard to classify "Speed". Speed like things qualifies for trait.
In my opinion,
If we take Car as a class (Object), "Speed" like characteristics should go as trait in scala. And "Wheel" like parts, components should go in as "Trivial Composition". "Speed" like characteristics will not have natural classification, where "Wheels" can have many classes and they themselves are independent objects (in reality).
If we take Human as a class (Object), "Anger, crying, laughing, etc." like behaviors should go as trait and "hands, legs, brain, heart, etc." should go in as "Trivial Composition" as they themselves are independent objects (in reality).
If we think of name, it can be given to anything and anyone i.e. our nearest star has a name "Sun", highest mountain has a name "Himalaya", my dog has a name "Rocky", the river has a name "Amazon".... "Name" is a trait and should not be considered for "Trivial Composition".
If we think of heart, animals have heart as their part. It must be considered for "Trivial Composition" and not as a trait.
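The Car example can be sketched in code. Java has no self-types, so this is only an approximation: a Java interface with default methods stands in for a Scala trait, and every name below (HasSpeed, Wheel, Car, Runner) is hypothetical. Wheels are countable, replaceable parts held by composition; speed is a characteristic shared by otherwise unrelated classes.

```java
import java.util.ArrayList;
import java.util.List;

// Trait-like characteristic: a single value that can go up, down, or to zero.
interface HasSpeed {
    double speed();

    void setSpeed(double speed);

    default void accelerate(double delta) {
        setSpeed(speed() + delta);
    }

    default void stop() {
        setSpeed(0.0);
    }
}

// A part: countable, classifiable (here by material), and replaceable.
final class Wheel {
    private final String material;

    Wheel(String material) {
        this.material = material;
    }

    String material() {
        return material;
    }
}

// Car is composed of wheels and mixes in the speed characteristic.
final class Car implements HasSpeed {
    private final List<Wheel> wheels = new ArrayList<>();
    private double speed;

    Car(List<Wheel> wheels) {
        this.wheels.addAll(wheels);
    }

    int wheelCount() {
        return wheels.size();
    }

    void replaceWheel(int index, Wheel wheel) {
        wheels.set(index, wheel);
    }

    Wheel wheel(int index) {
        return wheels.get(index);
    }

    @Override
    public double speed() {
        return speed;
    }

    @Override
    public void setSpeed(double speed) {
        this.speed = speed;
    }
}

// A distantly related class that shares the same characteristic but has no wheels.
final class Runner implements HasSpeed {
    private double speed;

    @Override
    public double speed() {
        return speed;
    }

    @Override
    public void setSpeed(double speed) {
        this.speed = speed;
    }
}
```

In Scala the speed state could live in the trait itself; the Java version has to repeat the field in each class, which is one reason this is only a rough analogue.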
What is class?
Class is a description or a blue print of a particular Object.
What is object?
Object is a reality which can be described by the class definition.
(Egg or hen? Which came first?) I believe a software engineer first thinks of objects and then defines a class (a blueprint) to describe or make them. (Please note that in object-oriented modelling and design, class and object complement each other's existence.) ("Egg or hen? Which came first?" refers to the co-existence of class and object and has nothing to do with the famous circle-ellipse problem (http://en.wikipedia.org/wiki/Circle-ellipse_problem), as the latter is related to inheritance or subtype polymorphism.)
interface: a thing that enables separate and sometimes incompatible elements to coordinate effectively
Software Development is like making your own universe and as a creator you define everything. Composition should be preferred over Inheritance. ( Gang of Four - Design Patterns)
I would like to make an RPG in the object oriented programming style. I have experience with oo programming, but have never worked with large groups of classes and subclasses. I am starting with this
http://members.gamedev.net/emmanuel_deloget/
and creating my own structure similar to it. The problem is I don't understand how the structure is to be used. Do you create static classes for things like races and use those, or do you create an object from the single race class? It is confusing to me because I would assume you create a single race class and, when the 'Main' class initializes, you create all of the individual race objects, but the above chart does not show any methods for initializing these objects with the exception of the constructor. But because races will have different member variable values, how would I use the above chart to initialize a race object? Or is this chart incomplete? (If it's incomplete, then I think that is what has been confusing me.)
Strong suggestion: Don't use an inheritance hierarchy for representing objects in an RPG.
Use composition instead, and use prototype-based programming objects for defining object attributes and behaviours.
The simple reason is that it is impossible to represent the behaviours of a complex RPG in an inheritance-style class hierarchy.
As an example, suppose you have an iron golem as a monster in your game. It needs to have:
The properties of the "iron" material in relation to what is able to cause damage (i.e. very resistant to impact, very resistant to fire, very vulnerable to rust or acid)
The properties of a "large humanoid" with respect to combat and actions (cannot fly, can walk, can be decapitated, can wield oversized weapons and armour)
The properties of an "artificial construct" for magic effects (invulnerable to fear, not living, magically animated)
The properties of a "semi-intelligent monster" with respect to AI and behaviour (hostile to player, attacks player on sight, etc.)
All of these kinds of properties could theoretically be mixed and matched. So you will never be able to define a simple inheritance hierarchy that contains them all. Don't even try, just make your iron golem a composite of the relevant properties.
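A minimal sketch of the iron golem as a composite of property bundles might look like the following. This is not how Tyrant is implemented; the Thing class, its methods, and the property names are all hypothetical, chosen only to show bundles being mixed and matched without any inheritance hierarchy.

```java
import java.util.HashMap;
import java.util.Map;

// A game object is just a bag of properties (hypothetical design).
final class Thing {
    private final Map<String, Object> properties = new HashMap<>();

    // Sets one property and returns this object so calls can be chained.
    Thing with(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    // Composition: copy the properties of every bundle; later bundles win on conflicts.
    static Thing composedOf(Thing... bundles) {
        Thing result = new Thing();
        for (Thing bundle : bundles) {
            result.properties.putAll(bundle.properties);
        }
        return result;
    }

    // Prototype-style creation: a new object starts as an independent copy of this one.
    Thing spawn() {
        return composedOf(this);
    }

    boolean is(String key) {
        return Boolean.TRUE.equals(properties.get(key));
    }

    Object get(String key) {
        return properties.get(key);
    }
}

final class GolemDemo {
    static Thing ironGolemPrototype() {
        Thing iron = new Thing().with("resistsFire", true).with("vulnerableToRust", true);
        Thing largeHumanoid = new Thing().with("canFly", false).with("canWalk", true);
        Thing construct = new Thing().with("living", false).with("immuneToFear", true);
        Thing monster = new Thing().with("hostileToPlayer", true);
        return Thing.composedOf(iron, largeHumanoid, construct, monster)
                .with("name", "iron golem");
    }

    public static void main(String[] args) {
        Thing golem = ironGolemPrototype().spawn();
        System.out.println(golem.get("name") + " resists fire: " + golem.is("resistsFire"));
    }
}
```

A wooden golem or an iron dragon would reuse the same bundles in a different combination, which is exactly what a single inheritance tree cannot express.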
I speak from experience of implementing RPG object models - if you are interested take a look at the source code for Tyrant (a roguelike game I wrote many years ago). I started with an inheritance hierarchy but eventually had to refactor the entire code base to a prototype-based model.
This is a general design question not relating to any language. I'm a bit torn between going for minimum code or optimum organization.
I'll use my current project as an example. I have a bunch of tabs on a form that perform different functions. Let's say Tab 1 reads in a file with a specific layout, Tab 2 exports a file to a specific location, etc. The problem I'm running into now is that I need these tabs to do something slightly different based on the contents of a variable. If it contains a 1 I may need to use Layout A and perform some extra concatenation, if it contains a 2 I may need to use Layout B and do no concatenation but add two integer fields, etc. There could be 10+ codes that I will be looking at.
Is it preferable to create an individual path for each code early on, or to attempt to create a single path that branches out only when absolutely required?
Creating an individual path for each code would allow my code to be extremely easy to follow at a glance, which in turn will help me out later on down the road when debugging or making changes. The downside to this is that I will increase the amount of code written by calling some of the same functions in multiple places (for example, steps 3, 5, and 9 for every single code may be exactly the same).
Creating a single path that would branch out only when required will be a bit messier and more difficult to follow at a glance, but I would create less code by placing conditionals only at steps that are unique.
I realize that this may be a case-by-case decision, but in general, if you were handed a previously built program to work on, which would you prefer?
Edit: I've drawn some simple images to help express it. Codes 1/2/3 are the variables and the lines under them represent the paths they would take. All of these steps need to be performed in a linear chronological fashion, so there would be a function to essentially just call other functions in the proper order.
Different Paths
Single Path
"Creating a single path that would branch out only when required will be a bit messier and more difficult to follow at a glance, but I would create less code by placing conditionals only at steps that are unique."
I'm not buying this statement. There is a level of finesse in deciding when to write new functions. Functions should be as simple and reusable as possible (but no simpler). The correct answer is almost never 'one big file that does a lot of branching'.
Less LOC (lines of code) should not be the goal. Readability and maintainability should be the goal. When you create functions, the names should be self documenting. If you have a large block of code, it is good to do something like
function doSomethingComplicated() {
    stepOne();
    stepTwo();
    // and so on
}
where the function names are self documenting. Not only will the code be more readable, you will make it easier to unit test each segment of the code in isolation.
For the case where you will have a lot of methods that call exactly the same methods, you can use good OO design and design patterns to minimize the number of functions that do the same thing. This is in reference to your statement "The downside to this is that I will increase the amount of code written by calling some of the same functions in multiple places (for example, steps 3, 5, and 9 for every single code may be exactly the same)."
The biggest danger in starting with one big block of code is that it will never actually get refactored into smaller units. Just start down the right path to begin with....
EDIT --
For your picture, I would create a base class with all of the common methods that are used. The base class would be abstract, with an abstract method. Subclasses would implement the abstract method and use the common functions they need. Of course, replace 'abstract' with whatever your language of choice provides.
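One way to read that suggestion is the template method pattern. The sketch below is only an illustration under assumed names: CodePath, CodeOnePath, CodeTwoPath, and the step names are hypothetical, and each step merely records a label instead of doing real file work.

```java
import java.util.ArrayList;
import java.util.List;

// Base class: owns the fixed order of steps and the steps shared by every code.
abstract class CodePath {
    private final List<String> log = new ArrayList<>();

    // The one place that knows the sequence; subclasses cannot reorder it.
    final List<String> run() {
        readFile();
        applyLayout(); // the only step that differs per code
        export();
        return log;
    }

    protected void record(String step) {
        log.add(step);
    }

    private void readFile() {
        record("read file");
    }

    private void export() {
        record("export");
    }

    // Each code supplies its own version of this step.
    protected abstract void applyLayout();
}

final class CodeOnePath extends CodePath {
    @Override
    protected void applyLayout() {
        record("layout A with concatenation");
    }
}

final class CodeTwoPath extends CodePath {
    @Override
    protected void applyLayout() {
        record("layout B with two integer fields");
    }
}

final class CodePathDemo {
    public static void main(String[] args) {
        System.out.println(new CodeOnePath().run());
        System.out.println(new CodeTwoPath().run());
    }
}
```

Each code gets its own easy-to-follow class, while the shared steps are written once, so both goals from the question are met.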
You should always err on the side of generalization, with the only exception being early prototyping (where the throughput of producing working code is significantly reduced by designing correct abstractions/generalizations). Having said that, you should NEVER leave that mess of non-generalized cloned branches in place past the early prototype stage, as it leads to messy, hard-to-maintain code (if you are doing almost the same thing in 3 different places and need to change that thing, you're almost sure to forget to change 1 of the 3).
Again it's hard to specifically answer such an open ended question, but I believe you don't have to sacrifice one for the other.
OOP techniques solve this issue by allowing you to encapsulate the reusable portions of your code and create child classes to handle object-specific behaviors.
Personally, I think you might (if your API allows it) create inherited forms, create them on the fly in the master form (with tabs), pass arguments and embed them in the tab container.
When to inherit form and when to decide to use arguments (code) to show/hide/add/remove functionality is up to you, yet master form should contain only decisions and argument passing and embeddable forms just plain functionality - this way you can separate organisation from implementation.
Currently I am making some decisions for my first objective-c API. Nothing big, just a little help for myself to get things done faster in the future.
After reading for a few hours about different patterns like making categories, singletons, and so on, I came across something that I like because it seems easy for me to maintain. I'm making a set of useful functions that can be used everywhere.
So what I did is:
1) I created two new files (.h, .m), and gave the "class" a name: SLUtilsMath, SLUtilsGraphics, SLUtilsSound, and so on. I think of that as kind of "namespace", so all those things will always be called SLUtils******. I added all of them into a Group SL, which contains a subgroup SLUtils.
2) Then I just put my functions signatures in the .h file, and the implementations of the functions in the .m file. And guess what: It works!! I'm happy with it, and it's easy to use. The only nasty thing about it is, that I have to include the appropriate header every time I need it. But that's okay, since that's normal. I could include it in the header prefix pch file, though.
But then I went to the toilet and a ghost came out of there, saying: "Hey! Isn't it better to make real methods, instead of functions? Shouldn't you make class methods, so that you have to call a method rather than a function? Isn't that much cooler and doesn't it have a better performance?" Well, for readability I prefer the functions. On the other hand, they don't have the kind of "named parameters" that methods have, as far as I know.
So what would you prefer in that case?
Of course I don't want to allocate an object before using a useful method or function. That would be harrying.
Maybe the toilet ghost was right. There IS a cooler way. Well, for me, personally, this is great:
MYNAMESPACECoolMath.h
#import <Foundation/Foundation.h>
@interface MYNAMESPACECoolMath : NSObject {
}
+ (float)randomizeValue:(float)value byPercent:(float)percent;
+ (float)calculateHorizontalGravity:(CGPoint)p1 andPoint:(CGPoint)p2;
// and some more
@end
Then in code, I would just import that MYNAMESPACECoolMath.h and just call:
CGFloat myValue = [MYNAMESPACECoolMath randomizeValue:10.0f byPercent:5.0f];
with no nasty instantiation, initialization, allocation, what ever. For me that pattern looks like a static method in java, which is pretty nice and easy to use.
The advantage over a function is, as far as I can tell, better readability in code. When looking at CGRectMake(10.0f, 42.5f, 44.2f, 99.11f), you may have to look up what those parameters stand for if you're not familiar with it. But when you have a method call with "named" parameters, you see immediately what each parameter is.
I think I missed the point of what makes a singleton class so different when it comes to simple, useful methods/functions that can be needed everywhere. Making a special kind of random value doesn't belong to anything; it's global. Like grass. Like trees. Like air. Everyone needs it.
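For comparison, the Java static method that this pattern is likened to might look like the sketch below. The meaning of randomizeValue is an assumption (the original does not say what it computes): here it shifts the value by a random amount of at most the given percentage. The explicit Random parameter is also an addition, made so that results can be reproduced with a seed.

```java
import java.util.Random;

// Static-only utility class: never instantiated, called as CoolMath.randomizeValue(...).
final class CoolMath {
    private CoolMath() {
        // no instances
    }

    // Assumed behaviour: returns value shifted by a random offset of at most
    // +/- percent of value. Not taken from the original code.
    static float randomizeValue(float value, float percent, Random random) {
        float maxDelta = Math.abs(value) * percent / 100f;
        float offset = (random.nextFloat() * 2f - 1f) * maxDelta;
        return value + offset;
    }

    public static void main(String[] args) {
        float myValue = CoolMath.randomizeValue(10.0f, 5.0f, new Random());
        System.out.println(myValue); // somewhere between 9.5 and 10.5
    }
}
```

Unlike the Objective-C call, the Java call site does not label its arguments, so the readability benefit of "named" parameters does not carry over.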
Performance-wise, a static method in a static class compiles to almost the same thing as a function.
Any real performance hits you'd incur would be in object instantiation, which you said you'd want to avoid, so that should not be an issue.
As far as preference or readability goes, there is a trend to use static methods more than necessary because people view Obj-C as an "OO-only" language, like Java or C#. In that paradigm, (almost) everything must belong to a class, so class methods are the norm. In fact, they may even call them functions. The two terms are interchangeable there. However, this is purely convention. Convention may even be too strong a word. There is absolutely nothing wrong with using functions in their place, and it is probably more appropriate if there are no class members (even static ones) that are needed to assist in the processing of those methods/functions.
The problem with your approach is the "util" nature of it. Almost anything with the word "util" in it suggests that you have created a dumping ground for things you don't know where to fit into your object model. That probably means that your object model is not in alignment with your problem space.
Rather than working out how to package up utility functions, you should be thinking about what model objects these functions should be acting upon and then put them on those classes (creating the classes if needed).
To Josh's point, while there is nothing wrong with functions in ObjC, it is a very strongly object-oriented language, based directly on the grand-daddy of object-oriented languages, Smalltalk. You should not abandon the OOP patterns lightly; they are the heart of Cocoa.
I create private helper functions all the time, and I create public convenience functions for some objects (NSLocalizedString() is a good example of this). But if you're creating public utility functions that aren't front-ends to methods, you should be rethinking your patterns. And the first warning sign is the desire to put the word "util" in a file name.
EDIT
Based on the particular methods you added to your question, what you should be looking at are Categories. For instance, +randomizeValue:byPercent: is a perfectly good NSNumber category:
// NSNumber+SLExtensions.h
@interface NSNumber (SLExtensions)
- (double)randomizeByPercent:(CGFloat)percent;
+ (double)randomDoubleNear:(double)number byPercent:(CGFloat)percent;
+ (NSNumber *)randomNumberNear:(double)number byPercent:(CGFloat)percent;
@end
// Some other file that wants to use this
#import "NSNumber+SLExtensions.h"
randomDouble = [aNumber randomizeByPercent:5.0];
randomDouble = [NSNumber randomDoubleNear:5.0 byPercent:7.0];
If you get a lot of these, then you may want to split them up into categories like NSNumber+Random. Doing it with Categories makes it transparently part of the existing object model, though, rather than creating classes whose only purpose is to work on other objects.
You can use a singleton instance instead if you want to avoid instantiating a bunch of utility objects.
There's nothing wrong with using plain C functions, though. Just know that you won't be able to pass them around using @selector for things like performSelectorOnMainThread.
When it comes to performance of methods vs. functions, Mike Ash has some great numbers in his post "Performance Comparisons of Common Operations". Objective-C message send operations are extremely fast, so much so that you'd have to have a really tight computational loop to even see the difference. I think that using functions vs. methods in your approach will come down to the stylistic design issues that others have described.
Optimise the system, not the function calls.
Implement what is easiest to understand and then when the whole system works, profile it and speed up what's slow. I doubt very much that the objective-c runtime overhead of a static class is going to matter one bit to your whole app.