How to wrap procedural algorithms in OOP language - scala

I have to implement an algorithm that fits the procedural design approach perfectly. It is not tied to any particular data structure; it just takes a couple of objects and a bunch of control parameters and performs complicated operations on them, including creating and modifying intermediate temporary data, subroutine calls, and many CPU-intensive data transformations. The algorithm is too specific to include as a method in either parameter object.
What is the idiomatic way to wrap such algorithms in an OOP language? Define a static object with a static method that performs the calculation? Define a class that takes all the algorithm parameters as constructor arguments and has a result method to return the result? Any other way?
If you need more specifics, I'm writing in Scala. But any general OOP approach is also applicable.

A static method (or a method on a singleton object in the case of Scala -- which I'm just gonna call a static method because that's the most common terminology) can work perfectly fine and is probably the most common approach to this.
There are some reasons to use other approaches, but they aren't strictly necessary, and I'd avoid them unless you actually need an advantage that they give, because static methods are the simplest (if least versatile) approach.
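As a rough illustration of the singleton-object approach, here is a minimal sketch. The MovingAverage object and its smooth method are made up for this example and are not from the question; the real algorithm would take its own parameter objects and control parameters.

```scala
// Hypothetical example: the algorithm lives in a singleton object,
// which is Scala's counterpart of a class with only static methods.
object MovingAverage {
  // Control parameters are passed explicitly; intermediate data stays local.
  def smooth(samples: Seq[Double], window: Int): Seq[Double] = {
    require(window > 0, "window must be positive")
    // Average each run of `window` consecutive samples.
    samples.sliding(window).map(w => w.sum / w.size).toVector
  }
}
```

A caller would simply write MovingAverage.smooth(samples, 2), with no instance to create or manage.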
Using a non-static method can be useful because you can then utilize design patterns like the factory pattern. For example, you might have an Operator class with a method evaluate. Now you could have different factories create different Operators so that you can swap your algorithm on the fly. Perhaps a calculator might have an AddOperatorFactory, MultiplyOperatorFactory and so on. Obviously this requires that you are able to instantiate an object that represents the algorithm. Of course, you could just pass a function around directly, as Scala and many other languages allow. Classes allow for inheritance, though, which opens the doors for some design patterns and, well, you're asking about OOP, not Scala specifically.
Also useful is the ability to have state with an object. With static methods, your only options for retaining state are either having global state (ew) or making the user of the static methods keep track of this state (more work for the users). With an instance of an object, you can keep that state inside the instance. For example, if your algorithm is a graph search, perhaps you'd want to allow resuming a search after you find the first match (which obviously requires storing state).
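The resumable graph search could look roughly like the sketch below. The class name ResumableSearch, the adjacency-map representation and the nextMatch method are assumptions made for illustration only; the point is just that the frontier and the visited set live inside the instance.

```scala
import scala.collection.mutable

// Hypothetical resumable breadth-first search over an adjacency map.
final class ResumableSearch(
    graph: Map[Int, List[Int]],
    start: Int,
    isMatch: Int => Boolean
) {
  // State kept between calls: nodes still to visit and nodes already seen.
  private val frontier = mutable.Queue(start)
  private val seen = mutable.Set(start)

  // Returns the next matching node, or None once the graph is exhausted.
  def nextMatch(): Option[Int] = {
    while (frontier.nonEmpty) {
      val node = frontier.dequeue()
      // Queue every neighbour that has not been seen before.
      for (n <- graph.getOrElse(node, Nil) if seen.add(n)) frontier.enqueue(n)
      if (isMatch(node)) return Some(node)
    }
    None
  }
}
```

Each call to nextMatch() picks up where the previous one stopped, which a static method could only do with global state or by handing the frontier back to the caller.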
It's not much harder to have to do new MyAlgorithm().doStuff() instead of MyAlgorithm.doStuff(), so if in doubt, I would err on the side of avoiding static methods if you think you'll need the functionality that having an instance offers.

Related

Pass Objects or Object Properties to Function?

Is there a guideline or recommendation about how to pass object to functions? Is it recommended to always pass the full object to a function:
calculateSomething(car1, car2, aircraft)
Or is it better to only pass the properties that are really needed to the function?
calculateSomething(car1.speed, car1.length, car2.speed, aircraft.height)
The first approach seems to be more convenient, especially when the function requires many more properties. However, my intuition tells me that the second approach is more computationally efficient, as the function does not have to handle the full objects.
Is there a general programming advice for this or is it for every function a trade-off between readability and speed?
Never pass the properties directly, because that breaks a principle of object-oriented programming (encapsulation), especially if it involves making changes to the properties.
Always use getters and setters to make changes to the object properties.

Scala - Does pattern matching break the Open-Closed principle? [duplicate]

If I add a new case class, does that mean I need to search through all of the pattern matching code and find out where the new class needs to be handled? I've been learning the language recently, and as I read about some of the arguments for and against pattern matching, I've been confused about where it should be used. See the following:
Pro: Odersky1 and Odersky2
Con: Beust
The comments are pretty good in each case, too. So is pattern matching something to be excited about or something I should avoid using? Actually, I imagine the answer is "it depends on when you use it," but what are some positive use cases for it and what are some negative ones?
Jeff, I think you have the right intuition: it depends.
Object-oriented class hierarchies with virtual method dispatch are good when you have a relatively fixed set of methods that need to be implemented, but many potential subclasses that might inherit from the root of the hierarchy and implement those methods. In such a setup, it's relatively easy to add new subclasses (just implement all the methods), but relatively difficult to add new methods (you have to modify all the subclasses to make sure they properly implement the new method).
Data types with functionality based on pattern matching are good when you have a relatively fixed set of classes that belong to a data type, but many potential functions that operate on that data type. In such a setup, it's relatively easy to add new functionality for a data type (just pattern match on all its classes), but relatively difficult to add new classes that are part of the data type (you have to modify all the functions that match on the data type to make sure they properly support the new class).
The canonical example for the OO approach is GUI programming. GUI elements need to support very little functionality (drawing themselves on the screen is the bare minimum), but new GUI elements are added all the time (buttons, tables, charts, sliders, etc). The canonical example for the pattern matching approach is a compiler. Programming languages usually have a relatively fixed syntax, so the elements of the syntax tree will change rarely (if ever), but new operations on syntax trees are constantly being added (faster optimizations, more thorough type analysis, etc).
Fortunately, Scala lets you combine both approaches. Case classes can both be pattern matched and support virtual method dispatch. Regular classes support virtual method dispatch and can be pattern matched by defining an extractor in the corresponding companion object. It's up to the programmer to decide when each approach is appropriate, but I think both are useful.
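A small sketch of combining the two styles, assuming a made-up syntax tree (Expr, Num, Add) and a made-up Evaluator object. Marking the trait as sealed also speaks to the original worry: when a new case class is added, the compiler warns about every match that does not handle it.

```scala
// Hypothetical syntax tree: a fixed set of node types.
sealed trait Expr {
  // Virtual dispatch: every node type must implement this.
  def show: String
}
final case class Num(value: Int) extends Expr {
  def show: String = value.toString
}
final case class Add(left: Expr, right: Expr) extends Expr {
  def show: String = s"(${left.show} + ${right.show})"
}

object Evaluator {
  // Pattern matching: a new operation added without touching the node classes.
  // Because Expr is sealed, the compiler warns if a case is missing.
  def eval(e: Expr): Int = e match {
    case Num(v)    => v
    case Add(l, r) => eval(l) + eval(r)
  }
}
```

Here show is the kind of method that is easy to extend with new node types, while eval is the kind of operation that is easy to add from the outside.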
While I respect Cedric, he's completely wrong on this issue. Scala's pattern matching can be fully-encapsulated from class changes when desired. While it is true that a change to a case class would require changing any corresponding pattern matching instances, this is only when using such classes in a naive fashion.
Scala's pattern matching always delegates to the deconstructor (the unapply extractor method) of a class's companion object. With a case class, this deconstructor is automatically generated (along with a factory method in the companion object), though it is still possible to override this auto-generated version. At all times, you can assert complete control over the pattern matching process, insulating any patterns from potential changes in the class itself. Thus, pattern matching is simply another way of accessing class data through the safe filter of encapsulation, just like any other method.
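To make that concrete, here is a minimal sketch of a hand-written extractor. The Temperature class, its Kelvin storage and the Weather object are invented for this example; the idea is only that patterns see what unapply returns, not the private field.

```scala
// Hypothetical class whose internal representation may change later.
final class Temperature private (private val kelvin: Double)

object Temperature {
  def fromCelsius(c: Double): Temperature = new Temperature(c + 273.15)

  // Extractor: patterns see a Celsius value, not the stored field.
  // Returning Some (rather than Option) marks the pattern as always matching.
  def unapply(t: Temperature): Some[Double] = Some(t.kelvin - 273.15)
}

object Weather {
  def describe(t: Temperature): String = t match {
    case Temperature(c) if c < 0 => "freezing"
    case Temperature(_)          => "above freezing"
  }
}
```

If the class later stored Celsius or Fahrenheit instead, only unapply would need to change; the match in Weather.describe would stay as it is.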
So, Dr. Odersky's opinion would be the one to trust here, particularly given the sheer volume of research he has performed in the area of object-oriented programming and design.
As for where it should be used, that is entirely according to taste. If it makes your code more concise and maintainable, use it! Otherwise, don't. For most object-oriented programs, pattern matching is unnecessary. However, once you begin to integrate more functional idioms (Option, List, etc) I think you'll find that pattern matching will significantly reduce syntactic overhead as well as improving the safety offered by the type system. In general, any time you want to extract data while simultaneously testing some condition (e.g. extracting a value from Some), pattern matching will likely be of use.
Pattern matching is definitely good if you are doing functional programming. In case of OO, there are some cases where it is good. In Cedric's example itself, it depends on how you view the print() method conceptually. Is it a behavior of each Term object? Or is it something outside it? I would say it is outside, and makes sense to do pattern matching. On the other hand if you have an Employee class with various subclasses, it is a poor design choice to do pattern matching on an attribute of it (say name) in the base class.
Also pattern matching offers an elegant way of unpacking members of a class.

What functions to put inside a class

If I have a function (say messUp) that does not need to access any private variables of a class (say room), should I write the function inside the class like room.messUp() or outside of it like messUp(room)? It seems the second version reads better to me.
There's a tradeoff involved here. Using a member function lets you:
Override the implementation in derived classes, so that messing up a kitchen could involve trashing the cupboards even if no cupboards are available in a generic room.
Decide that you need to access private variables later on, without having to refactor all the code that uses the function.
Make the function part of an interface, so that a piece of code may require that its argument be mess-up-able.
Using an external function lets you:
Make that function generic, so that you may apply it to rooms, warehouses and oil rigs equally (if they provide the member functions required for messing up).
Keep the class signature small, so that creating mock versions for unit testing (or different implementations) becomes easier.
Change the class implementation without having to examine the code for that function.
There's no real way to have your cake and eat it too, so you have to make choices. A common OO decision is to make everything a method (unless clearly idiotic) and sacrifice the three latter points, but that doesn't mean you should do it in all situations.
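The two sides of that tradeoff can be sketched as follows. All names here (HasItems, Room, Kitchen, Warehouse, Chaos) are hypothetical, and reversing a list merely stands in for whatever "messing up" really means.

```scala
// Hypothetical minimal interface that the external function relies on.
trait HasItems { def items: List[String] }

class Room(val items: List[String]) extends HasItems {
  // Member function: derived classes can override it.
  def messUp(): List[String] = items.reverse
}

class Kitchen(contents: List[String]) extends Room(contents) {
  // A kitchen adds its own step on top of the generic behaviour.
  override def messUp(): List[String] = "open cupboards" :: super.messUp()
}

// Not a Room at all, but it still exposes items.
final class Warehouse(val items: List[String]) extends HasItems

object Chaos {
  // External function: works for anything exposing items, Room or not.
  def messUp(target: HasItems): List[String] = target.items.reverse
}
```

The member version gets overriding for free; the external version applies to Warehouse without Warehouse knowing anything about it.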
Any behaviour of a class of objects should be written as an instance method.
So room.messUp() is the OO way to do this.
Whether messUp has to access any private members of the class or not, is irrelevant, the fact that it's a behaviour of the room, suggests that it's an instance method, as would be cleanUp or paint, etc...
Ignoring which language, I think my first question is if messUp is related to any other functions. If you have a group of related functions, I would tend to stick them in a class.
If they don't access any class variables then you can make them static. This way, they can be called without needing to create an instance of the class.
Beyond that, I would look to the language. In some languages, every function must be a method of some class.
In the end, I don't think it makes a big difference. OOP is simply a way to help organize your application's data and logic. If you embrace it, then you would choose room.messUp() over messUp(room).
I base this on "C++ Coding Standards: 101 Rules, Guidelines, And Best Practices" by Sutter and Alexandrescu, and also on Bob Martin's SOLID. I agree with them on this point, of course ;-).
If the message/function doesn't interact much with your class, you should make it an ordinary function taking your class object as an argument.
You should not pollute your class with behaviours that are not intimately related to it.
This is to respect the Single Responsibility Principle: your class should remain simple, aiming at the most precise goal.
However, if you think your message/function is intimately related to your object's guts, then you should include it as a member function of your class.

Encapsulation in the age of frameworks

At my old C++ job, we always took great care in encapsulating member variables, and only exposing them as properties when absolutely necessary. We'd have really specific constructors that made sure you fully constructed the object before using it.
These days, with ORM frameworks, dependency-injection, serialization, etc., it seems like you're better off just relying on the default constructor and exposing everything about your class in properties, so that you can inject things, or build and populate objects more dynamically.
In C#, it's been taken one step further with Object initializers, which give you the ability to basically define your own constructor. (I know object initializers are not really custom constructors, but I hope you get my point.)
Are there any general concerns with this direction? It seems like encapsulation is starting to become less important in favor of convenience.
EDIT: I know you can still carefully encapsulate members, but I just feel like when you're trying to crank out some classes, you either have to sit and carefully think about how to encapsulate each member, or just expose it as a property, and worry about how it is initialized later. It just seems like the easiest approach these days is to expose things as properties, and not be so careful. Maybe I'm just flat wrong, but that's just been my experience, especially with the new C# language features.
I disagree with your conclusion. There are many good ways of encapsulating in C# with all the above-mentioned technologies, so as to maintain good software coding practices. I would also say that it depends on whose technology demo you're looking at, but in the end it comes down to reducing the state-space of your objects so that you can make sure they hold their invariants at all times.
Take object relational frameworks; most of them allow you to specify how they are going to hydrate the entities; NHibernate for example allows you to say access="property" or access="field.camelcase" and similar. This allows you to encapsulate your properties.
Dependency injection works on the other types you have, mostly those which are not entities, even though you can combine AOP+ORM+IOC in some very nice ways to improve the state of these things. IoC is often used from layers above your domain entities if you're building a data-driven application, which I guess you are, since you're talking about ORMs.
They ("they" being application and domain services and other intrinsic classes to the program) expose their dependencies but in fact can be encapsulated and tested in even better isolation than previously since the paradigms of design-by-contract/design-by-interface which you often use when mocking dependencies in mock-based testing (in conjunction with IoC), will move you towards class-as-component "semantics". I mean: every class, when built using the above, will be better encapsulated.
Updated for urig: This holds true for both exposing concrete dependencies and exposing interfaces. First about interfaces: What I was hinting at above was that services and other application classes which have dependencies can, with OOP, depend on contracts/interfaces rather than specific implementations. C/C++ and older languages had no interface construct, and abstract classes can only go so far. Interfaces allow you to tie different runtime instances to the same interface without having to worry about leaking internal state, which is what you're trying to get away from when abstracting and encapsulating. With abstract classes you can still provide a class implementation, just that you can't instantiate it, but inheritors still need to know about the invariants in your implementation and that can mess up state.
Secondly, about concrete classes as properties: you have to be wary about what types of types ;) you expose as properties. Say you have a List in your instance; then don't expose IList as the property; this will probably leak and you can't guarantee that consumers of the interface don't add things or remove things which you depend on; instead expose something like IEnumerable and return a copy of the List, or even better, do it as a method:
public IEnumerable MyCollection { get { return _List.Enum(); } }
and you can be 100% certain to get both the performance and the encapsulation. No one can add to or remove from that IEnumerable, and you still don't have to perform a costly array copy. The corresponding helper method:
static class Ext {
    // Iterator block: callers get a lazily produced sequence that cannot be cast back to the List.
    public static IEnumerable<T> Enum<T>(this IEnumerable<T> inner) {
        foreach (var item in inner) yield return item;
    }
}
So while you can't get 100% encapsulation in say creating overloaded equals operators/method you can get close with your public interfaces.
You can also use the new features of .Net 4.0 built on Spec# to verify the contracts I talked about above.
Serialization will always be there and has been for a long time. Previously, before the internet era, it was used for saving your object graph to disk for later retrieval; now it's used in web services, in copy-semantics and when passing data to e.g. a browser. This doesn't necessarily break encapsulation if you put a few [NonSerialized] attributes or the equivalents on the correct fields.
Object initializers aren't the same as constructors, they are just a way of collapsing a few lines of code. Values/instances in the {} will not be assigned until all of your constructors have run, so in principle it's just the same as not using object initializers.
I guess, what you have to watch out for is deviating from the good principles you've learnt from your previous job and make sure you are keeping your domain objects filled with business logic encapsulated behind good interfaces and ditto for your service-layer.
Private members are still incredibly important. Controlling access to internal object data is always good, and shouldn't be ignored.
Many times I've found private methods to be overkill. Most of the time, if the work you're doing is important enough to break out, you can refactor it in such a way that either a) the private method is trivial, or b) it is an integral part of other functions.
In addition, having many private methods makes unit testing very hard. There are ways around that (making test objects friends, etc.), but they add difficulties.
I wouldn't discount private methods entirely, though. Any time there are important internal algorithms that really make no sense outside of the class, there's no reason to expose those methods.
I think that encapsulation is still important; it helps more in libraries than anywhere else, imho. You can create a library that does X, but you don't need everyone to know how X was created. You might even design it specifically to obfuscate the way you create X. The way I learned about encapsulation, I remember also that you should always define your variables as private to protect them from a data attack, that is, to protect against a hacker breaking your code and accessing variables that they are not supposed to use.

Understanding Interfaces

I have a class method that returns a list of employees that I can iterate through. What's the best way to return the list? Typically I just return an ArrayList. However, as I understand, interfaces are better suited for this type of action. Which would be the best interface to use? Also, why is it better to return an interface, rather than the implementation (say an ArrayList object)? It just seems like a lot more work to me.
Personally, I would use a List<Employee> for creating the list on the backend, and then use IList when you return. When you use interfaces, it gives you the flexibility to change the implementation without having to alter the code of whoever is using it. If you wanted to stick with an ArrayList, that'd be a non-generic IList.
@Jason: You may as well return IList<>, because an array actually implements this interface.
The best way to do something like this would be to return, as you say, a List, preferably using generics, so it would be List<Employee>.
Returning a List rather than an ArrayList means that if later you decide to use, say, a LinkedList, you don't have to change any of the code other than where you create the object to begin with (i.e., the call to "new ArrayList()").
If all you are doing is iterating through the list, you can define a method that returns the list as IEnumerable (for .NET).
By returning the interface that provides just the functionality you need, if some new collection type comes along in the future that is better/faster/a better match for your application, as long as it still implements IEnumerable you can completely rewrite your method, using the new type inside it, without changing any of the code that calls it.
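The same idea, sketched in Scala rather than .NET: the method promises only Iterable, which plays roughly the role IEnumerable plays above. Employee and Payroll are hypothetical names used for illustration.

```scala
// Hypothetical domain type.
final case class Employee(name: String)

object Payroll {
  // The declared return type is the general Iterable, so callers cannot
  // depend on the concrete collection chosen inside the method.
  def employees(): Iterable[Employee] =
    Vector(Employee("Ann"), Employee("Bob")) // could become a List or a Set later
}
```

Code such as Payroll.employees().map(_.name) keeps compiling unchanged if the Vector is later replaced by another collection.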
Is there any reason the collection needs to be ordered? Why not simply return an IEnumerable<Employee>? This gives the bare minimum that is required - if you later wanted some other form of storage, like a Bag or Set or Tree or whatnot, your contract would remain intact.
I disagree with the premise that it's better to return an interface. My reason is that you want to maximize the usefulness a given block of code exposes.
With that in mind, an interface works for accepting an item as an argument. If a function parameter calls for an array or an ArrayList, that's the only thing you can pass to it. If a function parameter calls for an IEnumerable, it will accept either, as well as a number of other objects. It's more useful.
The return value, however, works the opposite way. When you return an IEnumerable, the only thing you can do is enumerate it. If you have a List handy and return that, then code that calls your function can also easily do a number of other things, like get a count.
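A minimal Scala sketch of that asymmetry, with a made-up Stats.longNames function: the parameter is typed as generally as possible, while the result is a concrete List that callers can index or measure directly.

```scala
object Stats {
  // Parameter typed as the general Iterable: accepts a List, Vector, Set, and so on.
  // Result typed as a concrete List: callers get length, indexing and the rest.
  def longNames(names: Iterable[String], minLength: Int): List[String] =
    names.filter(_.length >= minLength).toList
}
```

Whether the extra capabilities of the concrete return type are worth the lost freedom to change it later is exactly the disagreement between this answer and the others.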
I stand united with those advising you to get away from the ArrayList, though. Generics are so much better.
An interface is a contract between the implementation and the user of the implementation.
By using an interface, you allow the implementation to change as much as it wants as long as it maintains the contract for the users.
It also allows multiple implementations to use the same interface so that users can reuse code that interacts with the interface.
You don't say what language you're talking about, but in something .NETish, then it's no more work to return an IList than a List or even an ArrayList, though the mere mention of that obsolete class makes me think you're not talking about .NET.
An interface is essentially a contract that a class has certain methods or attributes; programming to an interface rather than a direct implementation allows for more dynamic and manageable code, as you can completely swap out implementations as long as the "contract" is still held.
In the case you describe, passing an interface does not give you a particular advantage. If it were me, I would pass the ArrayList with the generic type, or pass the array itself: list.toArray()
Actually, you shouldn't return a List if that's part of a framework, at least not without thinking about it; the recommended class to use is a Collection. The List class has some performance improvements at the cost of severe extensibility issues. It's in fact an FxCop rule.
You have the reasoning for that in this article
Return type for your method should be IList<Employee>.
That means that the caller of your method can use anything that IList offers but cannot use things specific to ArrayList. Then if you feel at some point that LinkedList or YourCustomSuperDuperList offers better performance or other advantages you can safely use it within your method and not screw callers of it.
That's roughly interfaces 101. ;-)