Chess programming, Scala, JVM: How costly is dynamic dispatch? [closed]

My aim is to write a basic chess playing AI. It doesn't need to be incredible, but I want it to play with some degree of competence against people with some level of familiarity with the game.
I have a trait called Piece which has abstract methods canMakeMove(m: Move, b: Board) and allMovesFrom(p: Position, b: Board). These methods are important to the program logic for obvious reasons, and are implemented by the concrete classes King, Queen, Pawn, Rook, Bishop, and Knight. Thus in other places, for example in the code that determines whether or not a particular board has a King in check, these methods are called on values whose static type is the abstract type Piece (piece canMakeMove (..., ...)), so the actual method called is determined at runtime via dynamic dispatch.
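A minimal sketch of the design described above; the definitions of Move, Board and Position are stand-ins, since the question does not show them:
final case class Position(file: Int, rank: Int)
final case class Move(from: Position, to: Position)
final case class Board(squares: Vector[Vector[Option[Piece]]])

trait Piece {
  def canMakeMove(m: Move, b: Board): Boolean
  def allMovesFrom(p: Position, b: Board): Seq[Move]
}

// One concrete piece as an illustration (board edges, occupancy and check are
// ignored for brevity); King, Queen, Pawn, Rook, Bishop and Knight would each
// implement the trait, and calls made through a value of type Piece dispatch dynamically.
object King extends Piece {
  def canMakeMove(m: Move, b: Board): Boolean =
    m.from != m.to &&
      (m.from.file - m.to.file).abs <= 1 &&
      (m.from.rank - m.to.rank).abs <= 1

  def allMovesFrom(p: Position, b: Board): Seq[Move] =
    for {
      df <- -1 to 1
      dr <- -1 to 1
      if df != 0 || dr != 0
    } yield Move(p, Position(p.file + df, p.rank + dr))
}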
I'm wondering if this is too costly for the purposes of a chess AI program which is going to have to execute this code many times. After looking around online and reading some more about chess programming, I've found that the most common representation of a chess board is not something like my Vector[Vector[Option[Piece]]] but an integer-based representation (an array of piece codes, or 64-bit 'bitboards') that likely uses a switch statement on the values in the board to accomplish the effect that I'm currently relying on dynamic dispatch to achieve. Will this prevent my AI from reaching a viable performance level?

In this answer I will suggest that you are running into a common programming pitfall.
There is one piece of wisdom I learned from my colleagues that has proven its worth to me time and time again:
Only solve problems when they occur, and not any sooner.
Keep in mind that there have been chess programs and chess computers around for a long, long time.
There were even chess programs on the C64, and there were those tiny travel chess computers that were embedded in the chess board, so you could carry them around easily.
Those systems had far less resources than your present day computer by several orders of magnitude.
If you took an algorithm similar to the one that was running on those old systems and implemented it in Scala, with dynamic dispatch, generous vectors instead of bit arrays et cetera, it would still run faster than it did on that old hardware.
Still, writing even a moderate search heuristic for the chess problem is a colossal task for a single developer.
When faced with a colossal task, one intuitively tends to look for the first bit of the problem that one feels competent in solving, in the hope that this will naturally lead to the next little problem one can solve, and so on, eventually leading to a full solution to the colossal problem. Divide and conquer.
However, my personal experience tells me that this is not a good way to tackle bigger programming problems.
Usually, if you follow this path, you end up with a really, really optimized representation of the chess board, consuming only very little memory, and optimized for runtime as much as possible.
And then you're stuck. You have put all your energy into this, you're really proud of your great chessboard class, and somehow it's really dissatisfying that you feel no step closer to actually having a working chess AI.
That's where the rule that I mentioned in the beginning comes in handy.
In short: Don't optimize. Yet. You can still optimize later, if it's necessary.
Instead, do the following:
Make a class or trait that represents a chessboard.
Leave it empty in the beginning - no methods or members.
Do the same with the types representing pieces, board positions, moves etc.
Then, when you start implementing the chess AI, fill in those types with methods. But only add the methods that you really need, only when you need them, and not a single method more.
Prefer to use traits in the beginning, as you can later on produce more optimized versions of a trait, if necessary.
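To make that concrete, a minimal sketch of the "start empty, grow on demand" idea; the names are just placeholders echoing the question:
trait Board
trait Move
trait Position

// A member is added only once the AI code actually demands it, for example:
trait Piece {
  def canMakeMove(m: Move, b: Board): Boolean
}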
In that way, try to get to the first version of your chess AI that does something interesting as soon as you can.
It's up to you to define what "interesting" means in this context.
It's okay if the first version of the chess AI is crappy, and loses every game because it makes really bad decisions.
It should just have that one single thing: It should do something that you find at least remotely interesting.
Then, identify the top problem that you have with your chess AI.
Think about how to solve it, come up with a plan that solves only that one problem, in the simplest way that you can possibly imagine, even if your programmer instincts tell you to do something more complex.
Resist the urge to delve into complexity fast, because the most complex code that you write will eventually turn out to be your biggest problem, and also the most inflexible part of your code base.
Short, simple, almost stupid steps towards a solution. Prototype a lot, create simple versions in a short time, and only optimize when needed.
As for the optimization - that's really not a problem.
For example, let's say that in the beginning you thought it's a good idea to define a board position like this:
case class Position(x:Int, y:Int)
Later on, you figure that it would have been a better choice to just use a tuple of ints to represent a board position. Simply substitute your previous definition with:
type Position = (Int, Int)
There you go. Make that change, compile the code, and the compile errors will show you all the places in the code that you need to adapt. It's not going to take very long to refactor this.
Even later down the line, you decide that since there are only 64 possible positions on the board, you can easily represent the position with a number in the range from 0 to 64 - reserving the number 64 for "off-board".
Again, that's an easy change:
type Position = Byte
Save, compile, fix the compile errors. Should not take longer than 15 minutes.
With this example, I want to illustrate that you shouldn't be afraid of optimizing later, and of waiting to solve problems until they actually occur.
This will always require some refactoring, but the effort is usually not really worth mentioning.
I tend to think of this as "massaging the code".
Of course we know that junior programmers sometimes tend to write "spaghetti code" that is very hard to refactor later.
Maybe that's why there is a certain fear of late optimization that practically every developer knows, and the urge to optimize early.
The only real antidote against such "spaghetti code" is to go through this painful process several times. After a while, you will develop a very good intuition on how to write code that can be optimized and refactored easily.
This will match all the common programming wisdoms, like separation of concerns, short methods, encapsulation and so on.
At this point, I would recommend the book "Clean Code" as an excellent guideline.
Some more tips:
See your program as a bridge. You stand on one side of the bridge, starting off with nothing. On the far end, there is your vision - a decent chess AI. Between the ends, there is a gap. Your program will be the bridge. Don't start by putting all your energy into optimizing the very first brick that you put in your bridge. A bridge with just one perfect brick won't hold. Build a prototype bridge first - crappy, but it connects the two ends. When you have that, it's easier to enhance the crappy bridge step by step.
Abstract all the functionality into traits and classes. Be wary of code repetitions - when you find a bug or need to do a refactoring, repetitious code can be a neck breaker.
Don't be afraid to write code that doesn't look smart, as long as the code is simple and solves the problem. Bonus points for code that reads well and looks almost like an intuitive description of the algorithm.
Keep your methods short. One to six lines.
Make sure that your code doesn't nest too deeply - everyone with a bit of Scala experience should be able to understand what your code is doing.
It can be a good thing to spend some time in order to find good names for your concepts. Maybe there are even some standards of naming certain parts of a chess AI. It's good to do a bit of research first.
While you are researching, see if there are some Scala/Java libraries available that you can use, for important parts of your problem. You can still replace the libraries with your own implementations later, but in the beginning, using the knowledge of other people can give you a jump start into creating your first, prototype version "bridge". Everything that will help you close the "gap" quickly is good.
Don't think of your code as something eternal, perfect, unchangeable. Maybe you have the ambition to write the best representation of a chess board in Scala ever. Don't do that. Eternal, perfect code cannot be refactored or changed easily. Code that can be refactored easily is really good code. Refactoring code is 50% of what a programmer does.
Jump into automated testing. For Scala, the scalatest library can be a good choice. Use a tool like SBT that will run all the automated tests on every build. If you have certain assumptions about the classes you write, hard-code those assumptions into automated tests. It will feel a little stupid and redundant at first. But the first time one of your automated tests shows you the cause of a bug that would otherwise have been very hard to find, all the time spent writing those tests will have paid off.
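For example, a tiny test in the scalatest 3.x FunSuite style; Position is a made-up stand-in, defined here only so the sketch is self-contained:
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical domain type, just for the example.
final case class Position(file: Int, rank: Int) {
  def onBoard: Boolean = file >= 1 && file <= 8 && rank >= 1 && rank <= 8
}

class PositionSpec extends AnyFunSuite {
  test("positions inside the 8x8 board are valid") {
    assert(Position(1, 1).onBoard && Position(8, 8).onBoard)
  }

  test("positions outside the board are rejected") {
    assert(!Position(0, 5).onBoard && !Position(9, 1).onBoard)
  }
}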
An advanced topic in automated testing is code coverage. You can use a tool that generates a coverage report showing how much of your code is covered by automated tests. From my own experience, I would recommend jacoco4sbt.
And finally, the mandatory quotation: "Premature optimization is the root of all evil." There is a lot of wisdom in this one.
Good luck, and a whole lot of fun!

Related

What are "not so well defined problems" that LISP is supposed to solve?

Most people agree that LISP helps to solve problems that are not well defined, or that are not fully understood at the beginning of the project.
"Not fully understood"" might indicate that we don't know what problem we are trying to solve, so the developer refines the problem domain continuously. But isn't this process language independent?
All this refinement does not take away the need for, say, developing algorithms/solutions for the final problem that does need to be solved. And that is the actual work.
So, I'm not sure what advantage LISP provides if the developer has no idea where he's going i.e. solving a problem that is not finalised yet.
Lisp (not "LISP") has a number of advantages when you're facing problems that are not well-defined. First of all, you have a REPL where you can quickly experiment -- that helps in sketching out quick functions and playing with them, leading to a very rapid development cycle. Second, having a dynamically typed language works well in this context too: with a statically typed language you need to "design more" before you begin, and changing the design leads to changing more code -- in contrast, with Lisps you just write the code and the data it operates on can change as needed. In addition to these, there are the usual benefits of a functional language -- one with first-class lambda functions, etc. (e.g., garbage collection).
In general, these advantages have been finding their way into other languages. For example, Javascript has everything that I listed so far. But there is one more advantage of Lisps that is still not present in other languages -- macros. This is an important tool to use when your problem calls for a domain specific language. Basically, in Lisp you can extend the language with constructs that are specific to your problem -- even if these constructs lead to a completely different language.
Finally, you need to plan ahead for what happens when the code becomes more than a quick experiment. In this case you want your language to cope with "growing scripts into applications" -- for example, having a module system means that you can get a more "serious" application. For example, in Racket you can get your solution separated into such modules, where each can be written in its own language -- it even has a statically typed language which makes it possible to start with a dynamically typed development cycle and once the code becomes more stable and/or big enough that maintenance becomes difficult, you can switch some modules into the static language and get the usual benefits from that. Racket is actually unique among Lisps and Schemes in this kind of support, but even with others the situation is still far more advanced than in non-Lisp languages.
In AI (Artificial Intelligence) historically Lisp was seen as the AI assembly language. It was used to build higher-level languages which help to work with the problem domain in a more direct way. Many of these domains need a lot of 'knowledge' for finding usable answers.
A typical example is an expert system for, say, oil exploration. The expert system gets (geological) observations as inputs and gives information about the chances of finding oil, what kind of oil, at what depths, etc. To do that it needs 'expert knowledge' of how to interpret the data. When you start a project to develop such an expert system it is typically not clear what kind of inferences are needed, what kind of 'knowledge' experts can provide and how this 'knowledge' can be written down for a computer.
In this case one typically develops new languages on top of Lisp and you are not working with a fixed predefined language.
As an example see this old paper about Dipmeter Advisor, a Lisp-based expert system developed by Schlumberger in the 1980s.
So, Lisp does not solve any problems. But it was originally used to solve problems that are complex to program, by providing new language layers which should make it easier to express the domain 'knowledge', rules, constraints, etc. to find solutions which are not straightforward to compute.
The "big" win with a language that allows for incremental development is that you (typically) has a read-eval-print loop (or "listener" or "console") that you interact with, plus you tend to not need to lose state when you compile and load new code.
The ability to keep state around from test run to test run means that lengthy computations that are untouched by your changes can simply be kept around instead of being re-computed.
This allows you to experiment and iterate faster. Being able to iterate faster means that exploration is less of a hassle. Very useful for exploratory programming, something that is typical with dealing with less well-defined problems.

Are MooseX::Declare and MooseX::Method::Signatures production ready?

From the current version (0.98) of the Moose::Manual::MooseX are the lines:
We have high hopes for the future of MooseX::Method::Signatures and MooseX::Declare. However, these modules, while used regularly in production by some of the more insane members of the community, are still marked alpha just in case backwards incompatible changes need to be made.
I noticed that for MooseX::Method::Signatures the change log for September 2009 mentions the removal of the "scary ALPHA disclaimer".
So, are these still "alpha"?
Would I still be considered one of the "more insane" to use them?
I'd say they are production ready - I'm using them in production - but there are several things to consider:
Performance
MooseX::Declare and dependencies do almost all of their magic at compile time. Depending on the size of your program, you might find anywhere from half a second to several seconds of additional initialization overhead. If this a problem, don't use MooseX::Declare.
At runtime, the main overhead is type and argument checking, which you should (ideally) be doing anyway. That said, Moose type constraints have some overheads, namely coercion and the more complex (MooseX::Types::Structured-style) constraints. Don't use these if performance is an issue.
Stability
The external syntax of MooseX::Declare and MooseX::Method::Signatures is now stable. But it is important to know that the internals are subject to extreme change. (Fortunately, changes for the better.)
To give you an idea, the signature itself is grabbed using a big block of C code stolen from the Perl tokenizer (toke.c). This can break in some situations since it isn't actually parsing anything. The bit inside the brackets is parsed using PPI, which is designed for pure Perl, but the resulting PPI tree is then hacked up to get something useful. Devel::Declare itself is a hack - after it sees specific keywords (e.g. 'role', 'class', 'method') the Devel::Declare-using module must rewrite the source code by hand, with no interaction with the real Perl parser.
Corner cases may cause Perl to segfault. Or rewrite the source code badly, so you get syntax errors but have no idea what's causing them without -MO=Deparse. If you mess up the MooseX::Declare syntax by accident, there is no guarantee that the module will detect this and give you a sensible error. The ALPHA message may have gone, but this is still doing dark and scary things internally, and you should be prepared for that.
UPDATE
MooseX::Declare has not been updated much, and you may wish to look at alternatives such as Moops. Personally, I have decided to stick with pure Moose until Perl itself begins to support class/method/has syntax natively, which is possibly on the cards.
I think it's a matter of differing perspectives as much as anything -- rafl is one of the aforementioned "more insane members of the community" while Rolsky is more conservative. It's up to you to decide who you agree with, and really I think that the most important variable is your own code.
MooseX::Declare is good code. It won't randomly blow up your machine, it's not awful for performance, and it offers a lot of nifty stuff while reducing the amount of boilerplate that you have to write. But it might change in the future, making your code refuse to compile until it's updated; it might make your editor and other development tools confused when they see syntax they can't parse; it might piss off your collaborators by making them learn a new module to work with your code; or it might piss off your boss by making it so any future maintainer has to learn a new module to work with your code. Which of those things apply to you, and to what degree? You know better than I do, I hope.
There are people who feel that the maturity and stability of MooseX::Declare, Devel::Declare on which it's based, or even Moose itself are not yet ready for "prime time". I also know of two large companies with millions of visitors a month who have MooseX::Declare in their production environment. I personally am happy with the stack I am provided with Moose and do not see a need yet to bring in MooseX::Declare. I know people whose opinion I deeply respect who refuse to write new code without the declarative sugar from MooseX::Declare.
All of this is to say, the decision on whether something is or is not production ready is highly dependent upon your production environment, your development needs, and taste for risk. Without being in your shoes we can't possibly give an informed decision as to how well any given tool matches that profile.
MooseX::Method::Signatures (MXMS), and MooseX::Declare which uses it, is not production ready. This is not because the code isn't stable, but because it is appallingly slow. Simply using the method keyword, no types or arguments, is a 500-1000x runtime performance hit over a regular method call. My Macbook Pro can do about 6,000 simple method calls per second using MXMS vs 5,000,000 with plain Perl.
Method::Signatures, in contrast, has almost no performance hit above what it would normally cost to do the requested checks. The syntax is almost exactly the same as MXMS and it supports Moose (and Mouse) types. Both rely on the same underlying syntax modifying technique. (Full disclosure, I am the author of Method::Signatures.)
If you like MooseX::Declare but want the performance of Method::Signatures, try Method::Signatures::Modifiers.
It depends on what you mean by "production ready". I wouldn't depend on them until their velocity slows down quite a bit. I like my production stuff to not need frequent care from external code changes, API adjustments, and so on. That's not something particular to Moose, but any young project.
You have to judge how much that matters to you. In some situations, pushing stuff into production is a lengthy process, so you must be circumspect with such things. At the other extreme, some places let you edit files directly on the production server. That is, you have to define your tolerance before anyone can tell you which side a given MooseX module is on.
MooseX::Declare and MooseX::Method::Signatures are working well, but they can carry a really nasty penalty depending on what your code does. This can be fixed by simply not using the method keyword, or by using Method::Signatures::Modifiers.
The performance penalty I am seeing is around 2-5x compared to Method::Signatures::Modifiers (5x mostly for one specific class I was using). And it seems to be mostly compile time, or maybe first-time initialization, because it drops under 2x when the computation runs longer.
Method::Signatures::Modifiers has better errors, but you have to turn this optimization off when you use the debugger (it goes haywire because it does not see these methods; you can see for yourself in the -MO=Deparse output).
It may be worth it to get rid of Perl argument shifting hell.

Suggested content for a lunch-time "Introduction to Scala" talk

I'm going to be giving a short (30-40 mins) lunch-time talk on Scala to technical staff at my company. I'd like some suggestions for what would be the most appropriate content. Most people attending will have experience in Java and/or C# (plus various other languages).
What are the key things to cover? I'd like to give a brief introduction to the Scala syntax so that people don't feel lost when looking at code examples. I'll also cover some of the history behind the language and its designers. What would help people get the most out of the talk?
People are almost certainly coming to the talk to get an answer to the question, "Why should I use Scala?" Anything you can provide to help them answer that will be valuable.
Keep the discussion of the history and the personalities behind Scala to a minimum.
A whirlwind tour of the syntax is useful, but keep it short.
Spend a good chunk of the talk demonstrating examples and comparisons to Java. Show cases where Scala shines. You should literally be running and executing code so that people get a real, hands-on feel for how things work.
Make sure to cover weaknesses, too! Provide an objective and balanced overview.
I gave a similar talk - mostly to people with a Java background. I felt that taking a piece of real Java (about 30 lines) and iteratively adding Scala features worked pretty well. The 30 lines of Java eventually ended up as 6 (six!) lines of Scala. The point being (of course) that 6 lines are more readable and maintainable than 30.
I started from a line-by-line Scala equivalent of the Java and then introduced:
Type inference
Option
Closures
Pattern-matching (on lists)
Type aliases
Tail recursion
I found that this segment took quite a long time because the audience were very interested in the minutiae of Scala's syntax (especially around function expressions). Before undertaking the pattern-matching bit, I had a slide explaining the various things you could use in a match.
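Not the talk's actual code, but a small sketch of what a few of those features buy you (type inference, Option, pattern matching on lists, a type alias, tail recursion):
object ScalaFeaturesDemo extends App {
  type Name = String                                    // type alias

  val scores: Map[Name, Int] = Map("ann" -> 3, "bob" -> 5)

  // Option instead of null: Map#get already returns Option[Int]
  def describe(name: Name): String = scores.get(name) match {
    case Some(score) => s"$name scored $score"
    case None        => s"$name has not played"
  }

  // pattern matching on lists, with tail recursion
  @annotation.tailrec
  def total(names: List[Name], acc: Int = 0): Int = names match {
    case Nil          => acc
    case head :: tail => total(tail, acc + scores.getOrElse(head, 0))
  }

  println(describe("ann"))                              // ann scored 3
  println(total(List("ann", "bob", "cid")))             // 8
}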
Tough. One has to balance the new and the familiar. For instance:
Talk about traits, how they differ from interfaces and multiple inheritance. Note that most methods in all of Scala collections can actually be found on the trait Traversable, which has a single abstract method: foreach.
Speak of functions and partial functions, show map/filter/foreach, and how they make use of functions.
Talk about pattern matching -- show how unapply is used to enable representation independence, while at the same time case classes make the common case easy.
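A sketch of that last point, with an invented (non-chess) example: an extractor object keeps the representation hidden - here a plain String - while still allowing pattern matching, which is the same matching a case class gives you for free:
object Email {
  def apply(user: String, domain: String): String = s"$user@$domain"

  // unapply is what makes `case Email(u, d) => ...` work,
  // independently of how the value is actually represented.
  def unapply(s: String): Option[(String, String)] = s.split("@") match {
    case Array(user, domain) => Some((user, domain))
    case _                   => None
  }
}

object ExtractorDemo extends App {
  Email("alice", "example.com") match {
    case Email(user, domain) => println(s"user=$user domain=$domain")
    case other               => println(s"not an address: $other")
  }
}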
Above all, AVOID any topic that might be difficult to understand quickly, or you may waste time on it. Examples of great topics I wouldn't talk about: self types, variance, for-comprehensions.
Pick more topics than you have time for. Let the audience steer the conversation towards the topics they are more interested in. If anyone starts to bog down a topic too much, say you'll be pleased to explain it in more detail later, and ask if they would mind if you moved on to another topic. On the other hand, if everyone seems to be picking up on one thing in particular, stay with it. Otherwise, it might feel like you want to hide something.
I gave a presentation on re-writing Java classes in Scala. It has lots of examples of Java -> Scala and (hopefully) makes the gains obvious. Feel free to borrow any content you want... the presentation took 1 hour 10 minutes, so you might want to cut some stuff out.
Presentation: http://www.colinhowe.co.uk/downloads/rewriting-java-in-scala.ppt
Video: http://skillsmatter.com/podcast/java-jee/re-writing-java-classes-in-scala-and-making-your-code-lovely
You could do worse than running through Jonas Bonér's presentation, Pragmatic Real-World Scala. Perhaps skip some advanced topics in there on different applications of traits and self-type annotations.

Good practice class line count [closed]

I know that there is no right answer to this question, I'm just asking for your opinions.
I know that creating HUGE class files with thousands of lines of code is not a good practice, since it's hard to maintain and also usually means that you should probably review your program logic.
In your opinion, what is an average line count for a class, let's say in Java? (I don't know if the choice of language has anything to do with it, but just in case...)
Yes, I'd say it does have to do with the language, if only because some languages are more verbose than others.
In general, I use these rules of thumb:
< 300 lines: fine
300 - 500 lines: reasonable
500 - 1000 lines: maybe ok, but plan on refactoring
> 1000 lines: definitely refactor
Of course, it really depends more on the nature and complexity of the code than on LOC, but I've found these reasonable.
In general, number of lines is not the issue - a slightly better metric is number of public methods. But there is no correct figure. For example, a utility string class might correctly have hundreds of methods, whereas a business level class might have only a couple.
If you are interested in LOC, cyclomatic and other complexity measurements, I can strongly recommend Source Monitor from http://www.campwoodsw.com, which is free, works with major languages such as Java & C++, and is all round great.
From Eric Raymond's "The Art Of Unix Programming"
In nonmathematical terms, Hatton's empirical results imply a sweet spot between 200 and 400 logical lines of code that minimizes probable defect density, all other factors (such as programmer skill) being equal. This size is independent of the language being used — an observation which strongly reinforces the advice given elsewhere in this book to program with the most powerful languages and tools you can. Beware of taking these numbers too literally however. Methods for counting lines of code vary considerably according to what the analyst considers a logical line, and other biases (such as whether comments are stripped). Hatton himself suggests as a rule of thumb a 2x conversion between logical and physical lines, suggesting an optimal range of 400–800 physical lines.
Taken from here
Better to measure something like cyclomatic complexity and use that as a gauge. You could even stick it in your build script/ant file/etc.
It's too easy, even with a standardized code format, for lines of code to be disconnected from the real complexity of the class.
Edit: See this question for a list of cyclomatic complexity tools.
I focus on methods and (try to) keep them below 20 lines of code. Class length is in general dictated by the single responsibility principle. But I believe that this is no absolute measure because it depends on the level of abstraction, hence somewhere between 300 and 500 lines I start looking over the code for a new responsibility or abstraction to extract.
Small enough to do only the task it is charged with.
Large enough to do only the task it is charged with.
No more, no less.
In my experience any source file over 1000 text lines I will start wanting to break up. Ideally methods should fit on a single screen, if possible.
Lately I've started to realise that removing unhelpful comments can help greatly with this. I comment far more sparingly now than I did 20 years ago when I first started programming.
The short answer: less than 250 lines.
The shorter answer: Mu.
The longer answer: Is the code readable and concise? Does the class have a single responsibility? Does the code repeat itself?
For me, the issue isn't LOC. What I look at is several factors. First, I check my If-Else-If statements. If a lot of them have the same conditions, or result in similar code being run, I try to refactor that. Then I look at my methods and variables. In any single class, that class should have one primary function and only that function. If it has variables and methods for a different area, consider putting those into their own class. Either way, avoid counting LOC for two reasons:
1) It's a bad metric. If you count LOC you're counting not just long lines, but also lines which are whitespace and used for comments as though they are the same. You can avoid this, but at the same time, you're still counting small lines and long lines equally.
2) It's misleading. Readability isn't purely a function of LOC. A class can be perfectly readable, but if it violates whatever LOC limit you've set, you're gonna find yourself working hard to squeeze as many lines out of it as you can. You may even end up making the code LESS readable. Spending lines to assign variables and then use them in a method call is more readable than inlining those assignments directly into the call itself. It's better to have 5 lines of readable code than to condense it into 1 line of unreadable code.
Instead, I'd look at nesting depth and line length. These are better metrics because they tell you two things. First, the nesting depth tells you whether your logic needs to be refactored: if you are looking at if statements or loops nested more than two deep, seriously consider refactoring, and even more than one level of nesting is worth a second look. Second, if a line is long, it is generally very unreadable. Try separating that line out into several more readable lines. This might break your LOC limit if you have one, but it does actually improve readability.
line counting == bean counting.
The moment you start employing tools to find out just how many lines of code a certain file or function has, you're screwed, IMHO, because you've stopped worrying about the manageability of the code and started bureaucratically making rules and placing blame.
Have a look at the file / function, and consider if it is still comfortable to work with, or is starting to get unwieldy. If in doubt, call in a co-developer (or, if you are running a one-man show, some developer unrelated to the project) to have a look, and have a quick chat about it.
It's really just that: a look. Does someone else immediately get the drift of the code, or is it a closed book to the uninitiated? This quick look tells you more about the readability of a piece of code than any line metric ever devised. It depends on so many things: language, problem domain, code structure, working environment, experience. What's OK for one function in one project might be all out of proportion for another.
If you are in a team / project situation, and can't readily agree by this "one quick look" approach, you have a social problem, not a technical one. (Differing quality standards, and possibly a communication failure.) Having rules on file / function lengths is not going to solve your problem. Sitting down and talking about it over a cool drink (or a coffee, depending...) is a better choice.
You're right... there is no answer to this. You cannot put a "best practice" down as a number of lines of code.
However, as a guideline, I often go by what I can see on one page. As soon as a method doesn't fit on one page, I start thinking I'm doing something wrong. As far as the whole class is concerned, if I can't see all the method/property headers on one page then maybe I need to start splitting that out as well.
Again though, there really isn't an answer, some things just have to get big and complex. The fact that you know this is bad and you're thinking about it now, probably means that you'll know when to stop when things get out of hand.
Lines of code is much more about verbosity than anything else. In the project I'm currently working on we have some files with over 1000 LOC. But if you strip the comments, it will probably come down to about 300 or even less. If you change declarations like
int someInt;
int someOtherInt;
to one line, the file will be even shorter.
However, if you're not verbose and you still have a big file, you'll probably need to think about refactoring.

Why use a post compiler?

I am battling to understand why a post compiler, like PostSharp, should ever be needed?
My understanding is that it just inserts code where attributed in the original code, so why doesn't the developer just do that code writing themselves?
I expect that someone will say it's easier to write since you can use attributes on methods and then not clutter them up with boilerplate code, but that can be done using DI or reflection and a touch of forethought without a post compiler. I know that since I have said reflection, the performance elephant will now enter - but I do not care about the relative performance here, when the absolute performance for most scenarios is trivial (sub-millisecond to millisecond).
Let's try to take an architectural point of view on the issue. Say you are an architect (everyone wants to be an architect ;)
You need to deliver the architecture to your team:
a selected set of libraries, architectural patterns, and design patterns. As a part of your design, you say: "we will implement caching using the following design pattern:"
string key = string.Format("[{0}].MyMethod({1},{2})", this, param1, param2);
T value;
if (!cache.TryGetValue(key, out value))
{
    using (cache.Lock(key))
    {
        if (!cache.TryGetValue(key, out value))
        {
            // Do the real job here and store the value into variable 'value'.
            cache.Add(key, value);
        }
    }
}
This is a correct way to do caching. Developers are going to implement this pattern thousands of times, so you write a nice Word document telling them how you want the pattern to be implemented. Yeah, a Word document. Do you have a better solution? I'm afraid you don't. Classic code generators won't help. Functional programming (delegates)? It works fairly well for some aspects, but not here: you need to pass method parameters to the pattern. So what's left? Describe the pattern in natural language and trust developers will implement it.
What will happen?
First, some junior developer will look at the code and tell "Hm. Two cache lookups. Kinda useless. One is enough." (that's not a joke -- ask the DNN team about this issue). And your patterns cease to be thread-safe.
As an architect, how do you ensure that the pattern is properly applied? Unit testing? Fair enough, but you will hardly detect threading issues this way. Code review? That's maybe the solution.
Now, what if you decide to change the pattern? For instance, you detect a bug in the cache component and decide to use your own. Are you going to edit thousands of methods? It's not just refactoring: what if the new component has different semantics?
What if you decide that a method is not going to be cached any more? How difficult will it be to remove caching code?
The AOP solution (whatever the framework is) has the following advantages over plain code:
It reduces the number of lines of code.
It reduces the coupling between components, so you don't have to change as many things when you decide to change the caching component (just update the aspect); therefore it improves the capacity of your source code to cope with new requirements over time.
Because there is less code, the probability of bugs is lower for a given set of features, therefore AOP improves the quality of your code.
So if you put it all together:
Aspects reduce both development costs and maintenance costs of software.
I have a 90 min talk on this topic and you can watch it at http://vimeo.com/2116491.
Again, the architectural advantages of AOP are independent of the framework you choose. The differences between frameworks (also discussed in this video) influence principally the extent to which you can apply AOP to your code, which was not the point of this question.
Suppose you already have a class which is well-designed, well-tested etc. You want to easily add some timing on some of the methods. Yes, you could use dependency injection, create a decorator class which proxies to the original but with timing for each method - but even that class is going to be a mess of repetition...
... or you can add reflection to the mix and use a dynamic proxy of some description, which lets you write the timing code once, but requires you to get that reflection code just right - which isn't as easy as it might be, especially if generics are involved.
... or you can add an attribute to each method that you want timed, write the timing code once, and apply it as a post-compile step.
I know which seems more elegant to me - and more obvious when reading the code. It can be applied even in situations where DI isn't appropriate (and it really isn't appropriate for every single class in a system) and with no other changes elsewhere.
AOP (PostSharp) is for attaching code to all sorts of points in your application, from one location, so you don't have to place it there.
You cannot achieve what PostSharp can do with Reflection.
I personally don't see a big use for it, in a production system, as most things can be done in other, better, ways (logging, etc).
You may like to review the other threads on this matter:
Anyone with Postsharp experience in production?
Other than logging, and transaction management what are some practical applications of AOP?
Aspect Oriented Programming: What do you use PostSharp for?
etc (search)
Aspects take away all the copy-and-paste code and make adding new features faster.
I hate nothing more than, for example, having to write the same piece of code over and over again. Gael has a very nice example regarding INotifyPropertyChanged on his website (www.postsharp.net).
This is exactly what AOP is for. Forget about the technical details, just implement what you are being asked for.
In the long run, I think we all should say goodbye to the way we are writing software now. It's tedious and plainly stupid to write boilerplate code and iterate manually.
The future belongs to a declarative, functional style held together by an object-oriented framework - with the cross-cutting concerns handled by aspects.
I guess the only people who will not get it soon are the guys who are still paid for lines of code.