I've been introduced to an Objective-C codebase which has ~50,000 LoC and I'd estimate that 25% or so is duplicate code. Unfortunately, OO principles have been mostly ignored up to this point in the codebase in favor of copy and pasting logic. Yay!
I'm coming from a Java background and a lot of this duplication is fixable with good old-fashioned object-oriented programming. Extracting shared logic into a base class feels like the correct solution in a lot of cases.
However, before I embark on creating a bunch of base classes and sharing common logic between derived classes, I thought I should stop and see if there are any other options available to me. In his 'Writing Easy-To-Change Code' WWDC session from 2011, Ken Kocienda advises keeping object hierarchies as shallow as possible. He doesn't offer up any hard statistics as to why he has this opinion, so I'm wondering whether I'm missing out on something.
I'm not an Objective-C expert by any stretch of the imagination, so I'm wondering if there are any best practices when deciding on an object hierarchy. Basically, I'd like to get opinions on when you decide to stop creating base classes and start using composition instead of inheritance as a way of sharing code between classes.
Also, from a runtime performance standpoint, is there anything to sway me away from creating object hierarchies?
I wrote up some thoughts a while back on coming to iOS from other backgrounds, including Java. Some things have changed due to ARC. In particular, memory management is no longer so front-and-center. That said, all the things you used to do to make memory management easy (use accessors, use accessors, use accessors) are still equally valid in ARC.
@Radu is completely correct that you should often keep your class hierarchies fairly simple and shallow (as you read). Composition is often a much better approach in Cocoa than extensive subclassing (this is likely true in Java, too, but it's common practice in ObjC). ObjC also has no concept of an abstract method or class, which makes certain kinds of subclassing a little awkward. Rather than extracting shared logic into base classes (particularly abstract base classes), it is often better to extract it into a separate strategy object.
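As a rough illustration of the strategy-object idea, here is a minimal sketch in Java (the asker's background) rather than Objective-C. Every name in it (RowFormatter, UpperCaseFormatter, OrdersScreen, CustomersScreen) is hypothetical and invented for the example. It only shows two otherwise unrelated classes sharing logic through an object they hold, instead of through a common base class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo: two screens share formatting logic via a strategy object.
class StrategyDemo {
    public static void main(String[] args) {
        RowFormatter formatter = new UpperCaseFormatter();
        System.out.println(new OrdersScreen(formatter).render(Arrays.asList("widget", "gadget")));
        System.out.println(new CustomersScreen(formatter).render(Arrays.asList("alice")));
    }
}

// The shared behaviour lives behind a small interface (the strategy).
interface RowFormatter {
    String format(String row);
}

final class UpperCaseFormatter implements RowFormatter {
    @Override
    public String format(String row) {
        // Locale.ROOT keeps the result independent of the machine's locale.
        return row.toUpperCase(Locale.ROOT);
    }
}

// Each screen HAS-A formatter; neither needs a shared superclass.
final class OrdersScreen {
    private final RowFormatter formatter;

    OrdersScreen(RowFormatter formatter) {
        this.formatter = formatter;
    }

    String render(List<String> rows) {
        StringBuilder out = new StringBuilder("Orders:");
        for (String row : rows) {
            out.append(' ').append(formatter.format(row));
        }
        return out.toString();
    }
}

final class CustomersScreen {
    private final RowFormatter formatter;

    CustomersScreen(RowFormatter formatter) {
        this.formatter = formatter;
    }

    String render(List<String> rows) {
        StringBuilder out = new StringBuilder("Customers:");
        for (String row : rows) {
            out.append(' ').append(formatter.format(row));
        }
        return out.toString();
    }
}
```

Swapping in a different formatter changes both screens without touching either class, which is the property a deep hierarchy tends to make harder.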
Look at UITableView and its use of delegates and datasources. Look at things like NSAttributedString which HAS-A NSString rather than IS-A. That's common and often keeps things cleaner. As with all large object hierarchies, keep LSP in mind at all times. I see a lot of ObjC design go sideways when someone forgets that a square is not a rectangle. Again, this is true of all languages, but it's worth remembering as you design.
Immutable (value) objects are a real win whenever you can use them.
The other piece you will quickly discover is that there are very few "safety decorations" like "final" or "protected" (there is a @protected, but it isn't actually that useful in practice and is seldom used). People from a Java and C++ background tend to fret about compiler enforcement of various access rules. ObjC doesn't have compiler enforcement of most protections (you can always send any message you want to any object at runtime). You just use consistent naming conventions and don't go poking around at private methods. Programmer discipline takes the place of compiler enforcement. In practice, it works just fine that way in the vast majority of cases.
That said, ObjC has a lot of warnings, and you absolutely must eliminate all warnings. Most ObjC warnings are actually errors.
I've strayed a little from the specific question of object hierarchies, but hopefully it's useful.
One major problem with deep hierarchies in Objective-C is that Xcode doesn't help you at all in understanding or managing them. Another is simply that just about anything in Objective-C gets about twice as complex as the equivalent in Java, so you need to work harder to keep stuff simple.
But I find composition in Objective-C to be awkward (though I can't say exactly why), so there is no "perfect" answer.
I have observed that small subroutines are much rarer in Objective-C vs Java, and one is much more likely to see code duplicated between mostly-identical view controllers and the like. I think a big part of this is simply the development tools and the relative awkwardness with creating new classes.
PS: I had to rework an app that contained roughly 55K lines, close as we could count. As you found, there was likely about 25% duplication, but there was also another 25% or so of totally dead code. (Thankfully, that app has been pretty much abandoned since.)
It seems a lot of Objective-C code is using Singleton nowadays.
While a lot of people complain about Singleton, e.g. Google (Where Have All the Singletons Gone?), their fellow engineers use it anyway: http://code.google.com/mobile/analytics/docs/iphone/
I know we had some answers in Stack Overflow already but they are not totally specific to Objective-C as a dynamic language: Objective C has categories, while many other languages do not.
So what is your opinion? Do you still use Singleton? If so, how do you make your app more testable?
Updated: I think we need to use code as an example for a more concrete discussion; so many discussions on SO are theory-based, without a single line of code.
Let's use the Google Analytics iOS SDK as an example:
// Receives any error reported by trackPageview:withError:
NSError *error = nil;
// Initialization
[[GANTracker sharedTracker] startTrackerWithAccountID:@"UA-0000000-1"
                                       dispatchPeriod:kGANDispatchPeriodSec
                                             delegate:nil];
// Track page view
[[GANTracker sharedTracker] trackPageview:@"/app_entry_point"
                                withError:&error];
The beauty of the above code is that once you have initialized the tracker using the method "startTrackerWithAccountID", you can call the method "trackPageview" throughout your app without passing configuration around.
If you think Singleton is bad, can you improve the above code?
Many thanks for your input, and have a happy Friday.
This post is likely to be downvote-bait, but I don't really understand why singletons get no love. They're perfectly valid, you just have to understand what they're useful for.
In iOS development, you have one and only one instance of the application you currently are. You're only one application, right? You're not two or zero applications, are you? So the framework provides you with a UIApplication singleton through which to get at application-level OS and framework features. It models something appropriately to have that be a singleton.
If you've got data fields of which there can and should be only one, and you need to get to them from all over the place in your app, there's totally nothing wrong with modeling that as a singleton too. Creating a singleton as a globals bucket is probably a misuse of the pattern, and I think that's probably what most people object to about them. But if you're modeling something that has "singleness" to it, a singleton might well be the way to go.
Some developers seem to have a fundamental disgust for singletons, but when actually asked why, they mumble something about globals and namespaces and aesthetics. Which I guess I can understand, if you've really resolved once and for all that Singletons are an anti-pattern and to be abhorred in all cases. But you're not thinking anymore, at that point. And the framework design disagrees with you.
I think most developers go through the Singleton phase, where you have everything you need at your fingertips, in a bunch of wonderful Singletons.
Then you discover that unit testing with Singletons can be difficult. You don't actually want to connect to the database, but your Singleton does. Add a layer of redirection and mock it.
Then you discover that unit testing isn't the only time you need different behaviour. You make your Singleton configurable to have different behaviour based on a parameter. You start to wonder if you need to split it into two Singletons. Then your code needs to know which Singleton to use, so you need a Singleton that knows which Singleton to use.
Then some other code starts messing with the values in your Singleton, while you're using it. How dare they! If you wanted just anybody to get at those values from anywhere, you'd make them global...
Once you get to this point, you start wondering if Singletons were the right solution. You start to see the dangers of global data, particularly within an OO design, where you just assume your data won't get poked at by other people.
So you go back and start passing the data along, rather than looking it up (this used to be called good OO design, but now it has a fancy name like "Dependency Injection").
Eventually you learn that Singletons are fine in moderation. You learn to recognize when your Singleton needs to stop being single.
So you get shared objects like UIApplication and NSUserDefaults. Those are good uses of Singletons.
I got burned enough in the Java Singleton craze a decade ago. I don't even consider writing my own Singletons. The only time I've needed anything similar in recent memory is wanting to cache the result of [NSCalendar currentCalendar] (which takes a long time). I created a category on NSCalendar and cached it as a static variable. I felt a bit dirty, but the alternative was painfully slow code.
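For readers coming from Java, the same "cache one slow lookup in a static" trick might look roughly like the sketch below. Java has no categories, so a small static helper stands in for the category on NSCalendar. The names ZoneCache and expensiveLookup are hypothetical, and ZoneId.systemDefault() is only a stand-in for a genuinely slow call. As with the original, the cached value will not follow later changes to the system setting.

```java
import java.time.ZoneId;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: the expensive lookup runs once, later calls reuse the result.
class CacheDemo {
    public static void main(String[] args) {
        ZoneId first = ZoneCache.current();
        ZoneId second = ZoneCache.current();
        System.out.println(first == second);
        System.out.println(ZoneCache.lookups());
    }
}

final class ZoneCache {
    // Counts how often the slow path actually ran (for illustration only).
    private static final AtomicInteger LOOKUPS = new AtomicInteger();

    private ZoneCache() {
    }

    // Holder idiom: the JVM initializes this nested class lazily and
    // thread-safely the first time VALUE is read.
    private static final class Holder {
        static final ZoneId VALUE = expensiveLookup();
    }

    // Stand-in for a slow call; an assumption made for this sketch.
    private static ZoneId expensiveLookup() {
        LOOKUPS.incrementAndGet();
        return ZoneId.systemDefault();
    }

    static ZoneId current() {
        return Holder.VALUE;
    }

    static int lookups() {
        return LOOKUPS.get();
    }
}
```

The cached object here is immutable, which is what makes sharing it tolerable; caching a mutable object this way would reintroduce the global-state problems described above.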
To summarize and for those who tl;dr:
Singletons are a tool. They're not likely to be the right tool, but you have to discover that for yourself.
Why do you need an answer that is "total Objective C specific"? Singletons aren't totally Obj-C specific either, and you're able to use those. Functions aren't Obj-C-specific, integers aren't Obj-C specific, and yet you're able to use all of those in your Obj-C code.
The obvious replacements for a singleton work in any language.
A singleton is a badly-designed global.
So the simplest replacement is to just make it a regular global, without the silly "one instance only" restriction.
A more thorough solution is, instead of having a globally accessible object at all, pass it as a parameter to the functions that need it.
And finally, you can go for a hybrid solution using a Dependency Injection framework.
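To make the "pass it as a parameter" option concrete, here is a minimal sketch in Java, loosely modeled on the tracker example from the question. PageTracker, RecordingTracker and EntryScreen are hypothetical names and are not part of any real analytics SDK. The screen receives its tracker through the constructor, so a test can hand it a recording stand-in instead of a configured global.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: the tracker is passed in rather than looked up globally.
class TrackerDemo {
    public static void main(String[] args) {
        RecordingTracker tracker = new RecordingTracker();
        new EntryScreen(tracker).show();
        System.out.println(tracker.pages());
    }
}

// Hypothetical abstraction over "something that can track a page view".
interface PageTracker {
    void trackPageview(String path);
}

// Test double: records the calls instead of contacting a server.
final class RecordingTracker implements PageTracker {
    private final List<String> pages = new ArrayList<>();

    @Override
    public void trackPageview(String path) {
        pages.add(path);
    }

    List<String> pages() {
        return pages;
    }
}

// The screen depends only on the interface it was given.
final class EntryScreen {
    private final PageTracker tracker;

    EntryScreen(PageTracker tracker) {
        this.tracker = tracker;
    }

    void show() {
        tracker.trackPageview("/app_entry_point");
    }
}
```

The cost is the one the question points out: the tracker has to be threaded through to whoever needs it, rather than being reachable from anywhere.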
The problem with singletons is that they can lead to tight coupling. Let's say you're building an airline booking system: your booking controller might use an id<FlightsClient>.
A common way to obtain it within the controller would be as follows:
_flightsClient = [FlightsClient sharedInstance];
Drawbacks:
It becomes difficult to test a class in isolation.
If you want to change the flight client for another implementation, it's necessary to search through the application and swap it out one by one.
If there's a case where the application should use a different implementation (eg OnlineFlightClient, OfflineFlightClient), things get tricky.
A good workaround is to apply the dependency injection design pattern.
Think of dependency injection as telling an architectural story. When the key actors in your application are pulled up into an assembly, the application's configuration is correctly modularized (removing duplication). Having created this script, it's now easy to reconfigure or swap one actor for another. In this way we need not understand all of a problem at once; it's easy to evolve our app's design as the requirements evolve.
Here's a dependency injection library: https://github.com/typhoon-framework/Typhoon
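Without any library, the "assembly" idea can also be wired by hand. The sketch below is in Java and is not Typhoon's API; it is only an assumption about what a hand-rolled equivalent could look like. FlightsClient, OnlineFlightClient and OfflineFlightClient are taken from the text above, while Assembly, BookingController's methods and the findFlight signature are invented for the example.

```java
// Hypothetical demo: one assembly decides which client every controller gets.
class AssemblyDemo {
    public static void main(String[] args) {
        BookingController controller = new Assembly(false).bookingController();
        System.out.println(controller.book("LHR"));
    }
}

interface FlightsClient {
    String findFlight(String destination);
}

// Stub: a real implementation would call a remote service here.
final class OnlineFlightClient implements FlightsClient {
    @Override
    public String findFlight(String destination) {
        return "online:" + destination;
    }
}

// Stub: a real implementation would read from a local cache here.
final class OfflineFlightClient implements FlightsClient {
    @Override
    public String findFlight(String destination) {
        return "offline:" + destination;
    }
}

// The controller never asks a singleton for its client; it is handed one.
final class BookingController {
    private final FlightsClient flightsClient;

    BookingController(FlightsClient flightsClient) {
        this.flightsClient = flightsClient;
    }

    String book(String destination) {
        return "Booked " + flightsClient.findFlight(destination);
    }
}

// The single place where concrete classes are chosen and wired together.
final class Assembly {
    private final boolean online;

    Assembly(boolean online) {
        this.online = online;
    }

    FlightsClient flightsClient() {
        return online ? new OnlineFlightClient() : new OfflineFlightClient();
    }

    BookingController bookingController() {
        return new BookingController(flightsClient());
    }
}
```

Switching the whole application between the online and offline client is then a one-line change in the assembly, rather than a search through every call to sharedInstance.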
I'm currently dealing with a code base which contains several dozens of classes generated with SOAP::WSDL. However, having worked with Moose I now think that generating those classes at runtime at meta level (i.e. not to files on disk but directly to objects) might be a better idea (completely excluding performance reasons at this point).
Is this approach sensible? The idea is to avoid changes to generated code and also to avoid re-generating it once in a while.
If so, are there any ready-to-use Perl modules that create classes from a WSDL?
To answer the second question first, there is nothing Moose based that will turn a WSDL into a set of classes. However you could possibly build something based on XML::Toolkit. For full disclosure XML::Toolkit is my module that has tools for converting SAX streams into Moose classes and vice versa. For non-Moose Perl classes, there is XML::Compile which I believe can compile SOAP wsdl -> Perl.
To answer the first question, my experience with XML::Toolkit says that keeping the classes in memory at run-time is tricky. Ignoring the performance overhead, there is a lot of stuff inflated from the WSDL that you'll need to keep in your head. It would be an interesting experiment, but I'm not sure how long-term maintainable it would be.
I've wanted for a long time to try something like this but I haven't had a project that paid me to really focus on it. Unfortunately I don't have the free time to tackle a project of this size either.
I am in a Web Scripting class at school and am working on my first assignment. I tend to overdo things and delve deeper into my subject than what is required in my classes. Right now I am researching CGI.pm to do my HTTP requests and it says there are two programming styles for CGI.pm:
An object-oriented style
A function-oriented style
Unless I overlooked the clear answer or am not knowledgeable enough to discern the answer for myself from the documentation provided at: http://perldoc.perl.org/CGI.html I just don't know what the pros and cons are of using these two different styles.
With that being said what are the pros and cons of using the two different styles? Which one is more commonly used? As far as using object-oriented style it says I can only use one CGI object at the time. Why is that?
Thanks for all your help. You have all made studying Computer Science very enjoyable, satisfying, and rewarding for me. =D
Behind the scenes, CGI.pm is doing the same thing despite the styles. The functional interface actually uses a secret object that you don't see.
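The "functions backed by a hidden object" arrangement is a general pattern, so here is a rough sketch of it in Java rather than Perl. This is not CGI.pm's actual implementation; Query, Functions and the hard-coded input string are hypothetical, and the parsing is deliberately naive (no URL decoding).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: the same lookup through both styles.
class FacadeDemo {
    public static void main(String[] args) {
        // Object style: the caller owns the object explicitly.
        Query query = new Query("name=perl&lang=en");
        System.out.println(query.param("name"));
        // Function style: the caller never sees the object.
        System.out.println(Functions.param("lang"));
    }
}

// Object-oriented interface: parses "key=value&key=value" input.
final class Query {
    private final Map<String, String> params = new HashMap<>();

    Query(String queryString) {
        for (String pair : queryString.split("&")) {
            String[] keyValue = pair.split("=", 2);
            if (keyValue.length == 2) {
                params.put(keyValue[0], keyValue[1]);
            }
        }
    }

    String param(String name) {
        return params.get(name);
    }
}

// Function-style interface: delegates to one hidden, lazily created Query.
final class Functions {
    // Stand-in for the real request input; an assumption made for this sketch.
    static String source = "name=perl&lang=en";
    private static Query hidden;

    private Functions() {
    }

    static String param(String name) {
        if (hidden == null) {
            hidden = new Query(source);
        }
        return hidden.param(name);
    }
}
```

Both calls end up in the same param method; the function style only hides which object receives it.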
For many small-scale CGI projects, you're probably never going to need more than one CGI object at a time, so the functional interface is fine. This might be the more common style, but only because most people make small scripts for very specific tasks. If you have a lot of other stuff going on, you might not like CGI.pm importing a long list (and it is long) of function names into your script. Some of the function names might clash with names that other modules want to import.
I, however, always use the object-oriented interface. I don't have to worry about name collisions, and it's apparent where any method came from since you see its object. It's also easy to pass the object as arguments to other parts of large applications, etc.
Some people might complain about the extra typing, but that's never been the slow part of programming for me. I've been doing Perl for a long time and I don't mind the syntax. However, I only use CGI to get the input and maybe send the output. I don't mess with any of the HTML stuff.
When it talks about one CGI.pm object at a time, it's referring to access to the input. Once you've read STDIN, for instance, another CGI.pm object won't be able to read that. You can have as many objects as you like, though. They just won't share data, and the first one gets all of the POST data.
You can actually use a mixture though. You can import some things, like :html, but still use the OO interface to deal with the input.
I strongly recommend using the object interface.
Will it be absolutely required for your classwork? No, in fact it is arguably overkill for even small production projects.
However, if you are serious about learning to use CGI.pm for larger scale projects you will need to learn the object method. If you reach the point of needing two objects you will have to use the object interface. Programming, like most everything else, gets better with practice. Practicing now on relatively easier problems will help you be ready for more complex ones.
In fact, I'd recommend as a general rule in programming (although there are exceptions) that, when faced with two ways of using a particular tool, you make a habit of using the one most likely to be used in production code and/or the one that is the correct answer for more of the problem space.
I have managed to avoid C and C++ up until now (except for a few HelloWorlds), and have instead worked in higher-level languages. I've worked and lived in VB6, then Java, then C#, then ActionScript, and now Ruby.
I've recently become curious about programming for the iPod touch/iPhone. Though I've seen some possibilities for avoiding Objective-C (like Mono for iPhone), I'm curious about Objective-C. Mostly: does it require the developer to handle garbage collection and manage pointers and that sort of thing?
Edit: I am totally open to the possibility that my concept of higher- and lower-level languages is incorrect or misleading.
Garbage collection: yes, but not nearly as bad as C or C++. Once you understand Objective-C garbage collection it's really easy. It doesn't take that long to learn or master either.
As far as pointers go - you'll be using pointers far less frequently than you would in C or C++. Yes, a bit of knowledge about pointers is useful (knowing the difference between declaring NSString * and NSString) but that's also not that complicated.
Personally I would avoid all the "higher-level" languages that are emerging for iPhone development. They really only eliminate the need for you to learn Objective-C but still force you to use the native framework (Cocoa Touch). They add complexity for laziness' sake. Objective-C can be learnt in a week; debugging framework nuances takes a lot longer. But then again, it is only my opinion.
edit:
C# memory management can also be a pain if not done correctly. Yes, there's a garbage collector that works in a lot of cases. However, there are also objects that need to be disposed and not disposing them correctly can lead to memory leaks. Creating your own IDisposable objects can also be tricky if you're not 100% sure what the order of the Disposing sequence is. I've seen a lot of projects where developers get this horribly wrong. Basically what I'm saying is that the Objective-C memory management is not more complex or more technically difficult than the Dispose chain in C#/.NET
There is garbage collection for Objective-C but not for the iPhone, i.e. if you write software for the iPhone you have to handle memory management on your own, but this is not a general Objective-C problem.
"Mostly: does it require the developer to handle garbage collection and manage pointers and that sort of thing?" - on the iPhone, yes.
It's not as "high-level" as C# and Java, IMO, but it's still considered a high-level language, just like C++ and, yes, C.
Objective-C does have garbage collection, but Objective-C on the iPhone does not.
I think the title speaks for itself guys - why should I write an interface and then implement a concrete class if there is only ever going to be 1 concrete implementation of that interface?
I think you shouldn't ;)
There's no need to shadow all your classes with corresponding interfaces.
Even if you're going to make more implementations later, you can always extract the interface when it becomes necessary.
This is a question of granularity. You should not clutter your code with unnecessary interfaces, but they are useful at boundaries between layers.
Someday you may try to test a class that depends on this interface. Then it's nice that you can mock it.
I'm constantly creating and removing interfaces. Some were not worth the effort and some are really needed. My intuition is mostly right but some refactorings are necessary.
The question is, if there is only going to ever be one concrete implementation, should there be an interface?
YAGNI (You Ain't Gonna Need It), from Wikipedia:
According to those who advocate the YAGNI approach, the temptation to write code that is not necessary at the moment, but might be in the future, has the following disadvantages:
* The time spent is taken from adding, testing or improving necessary functionality.
* The new features must be debugged, documented, and supported.
* Any new feature imposes constraints on what can be done in the future, so an unnecessary feature now may prevent implementing a necessary feature later.
* Until the feature is actually needed, it is difficult to fully define what it should do and to test it. If the new feature is not properly defined and tested, it may not work right, even if it eventually is needed.
* It leads to code bloat; the software becomes larger and more complicated.
* Unless there are specifications and some kind of revision control, the feature may not be known to programmers who could make use of it.
* Adding the new feature may suggest other new features. If these new features are implemented as well, this may result in a snowball effect towards creeping featurism.
Two somewhat conflicting answers to your question:
You do not need to extract an interface from every single concrete class you construct, and
Most Java programmers don't build as many interfaces as they should.
Most systems (even "throwaway code") evolve and change far past what their original design intended for them. Interfaces help them to grow flexibly by reducing coupling. In general, here are the warning signs that you ought to be coding to an interface:
Do you even suspect that another concrete class might need the same interface (like, if you suspect your data access objects might need XML representation down the road -- something that I've experienced)?
Do you suspect that your code might need to live on the other side of a Web Services layer?
Does your code form a service layer to some outside client?
If you can honestly answer "no" to all these questions, then an interface might be overkill. Might. But again, unforeseen consequences are the name of the game in programming.
You need to decide what the programming interface is, by specifying the public functions. If you don't do a good job of that, the class would be difficult to use.
Therefore, if you decide later you need to create a formal interface, you should have the design ready to go.
So, you do need to design an interface, but you don't need to write it as an interface and then implement it.
I use a test driven approach to creating my code. This will often lead me to create interfaces where I want to supply a mock or dummy implementation as part of my test fixture.
I would not normally create any code unless it has some relevance to my tests, and since you cannot easily test an interface, only an implementation, that leads me to create interfaces if I need them when supplying dependencies for a test case.
I will also sometimes create interfaces when refactoring, to remove duplication or improve code readability.
You can always refactor your code to introduce an interface if you find out you need one later.
The only exception to this would be if I were designing an API for release to a third party - where the cost of making API changes is high. In this case I might try to predict the type of changes I might need to do in the future and work out ways of creating my API to minimise future incompatible changes.
One thing that no one has mentioned yet is that an interface is sometimes necessary in order to avoid dependency issues. You can have the interface in a common project with few dependencies and the implementation in a separate project with lots of dependencies.
"Only Ever going to have One implementation" == famous last words
It doesn't cost much to make an interface and then derive a concrete class from it. The process of doing it can make you rethink your design and often leads to a better end product. And once you've done it, if you ever find yourself eating those words - as frequently happens - you won't have to worry about it. You're already set. Whereas otherwise you have a pile of refactoring to do and it's gonna be a pain.
Edited to clarify: I'm working on the assumption that this class is going to be spread relatively far and wide. If it's a tiny utility class used by one or two other classes in a single package then yeah, don't worry about it. If it's a class that's going to be used in multiple packages by multiple other classes then my previous answer applies.
The question should be: "how can you ever be sure, that there is only going to ever be one concrete implementation?"
How can you be totally sure?
By the time you thought this through, you would already have created the interface and be on your way without assumptions that might turn out to be wrong.
With today's coding tools (like ReSharper), it really doesn't take much time at all to create and maintain interfaces alongside your classes, whereas discovering that you now need an extra implementation and have to replace all concrete references can take a long time and is no fun at all - believe me.
A lot of this is taken from a Rainsberger talk on InfoQ: http://www.infoq.com/presentations/integration-tests-scam
There are 3 reasons to have a class:
It holds some Value
It helps Persist some entity
It performs some Service
The majority of services should have interfaces. It creates a boundary, hides implementation, and you already have a second client; all of the tests that interact with that service.
Basically if you would ever want to Mock it out in a unit test it should have an interface.