Why would you NOT want the compiler to optimize your code? - compiler-optimization

Once I realized there is the option for this in GCC, I asked google and plenty of people want to know how to tell the compiler not to optimize the code. This seems counter-productive, what purpose can this serve to help the programmer? Debugging perhaps? How would it help in a situation where it is preferred to do this?

You said it - debugging. The optimizer can restructure code so that functions no longer exist and statements are intermingled. Turning off optimization is often necessary to allow a debugger to map machine/byte code addresses back to a source location.
As Tikhon mentions, it can also be useful if the optimizer has a bug.

The main reason is compile time: turning on optimizations can significantly increase build times without necessarily giving much benefit.
Also, certain optimizations can affect the accuracy and correctness of your program. However, these optimizations usually need to be turned on explicitly rather than with a flag like -O2.
Some optimizations--things like inlining--can increase the size of the executable. In certain cases, this is an important consideration.
Optimization can also have negative effects on your code. For example, speed might increase but at the expense of using more memory. This is not always desirable.

Related

What can use to avoid the msgSend function overhead?(Obj-c)

Hello everyone help me please!!!!
What can use to avoid the msgSend function overhead?
maybe answer is IMP, but I not sure.
You could simply inline the function to avoid any function call overhead. Then it would be faster than even a C function! But before you start down this path - are you certain this level of optimisation is warranted? You are more likely to get a better payoff by optimizing the algorithm.
The use of IMP is very rarely required. The method dispatching in Objective-C (especially in the 64-bit runtime) has been very heavily optimised, and exploits many tricks for speed.
What profiling have you done that tells you that method dispatching is the cause of your performance issue? I suggest first you examine the algorithm to see first of all where the most expensive operations are, and see if there is a more efficient way to implement it.
To answer your question, a quick search finds some directly relevant questions similar to yours right here on SO, with some great and detailed answers:
Objective-C optimization
Objective-C and use of SEL/IMP

Are MooseX::Declare and MooseX::Method::Signatures production ready?

From the current version (0.98) of the Moose::Manual::MooseX are the lines:
We have high hopes for the future of
MooseX::Method::Signatures and
MooseX::Declare. However, these
modules, while used regularly in
production by some of the more insane
members of the community, are still
marked alpha just in case backwards
incompatible changes need to be made.
I noticed that for MooseX::Method::Signatures the change log for September 2009 mentions the removal of the "scary ALPHA disclaimer".
So, are these still "alpha"?
Would I still be considered one of the "more insane" to use them?
I'd say they are production ready - I'm using them in production - but there are several things to consider:
Performance
MooseX::Declare and dependencies do almost all of their magic at compile time. Depending on the size of your program, you might find anywhere from half a second to several seconds of additional initialization overhead. If this a problem, don't use MooseX::Declare.
At runtime, the main overhead is type and argument checking, which you should (ideally) be doing anyway. That said, Moose type constraints have some overheads, namely coercion and the more complex (MooseX::Types::Structured-style) constraints. Don't use these if performance is an issue.
Stability
MooseX::Declare and MooseX::Method::Signature's external syntax is now stable. But it is important to know that the internals are subject to extreme change. (fortunately, changes for the better)
To give you an idea, the signature itself is grabbed using a big block of C code stolen from the Perl tokenizer (toke.c). This can break in some situations since it isn't actually parsing anything. The bit inside the brackets is parsed using PPI, which is designed for pure Perl, but the resulting PPI tree is then hacked up to get something useful. Devel::Declare itself is a hack - after it sees specific keywords (e.g. 'role', 'class', 'method') the Devel::Declare-using module must rewrite the source code by hand, with no interaction with the real Perl parser.
Corner cases may cause Perl to segfault. Or rewrite the source code badly, so you get syntax errors but have no idea what's causing them without -MO::Deparse. If you mess up the MooseX::Declare syntax by accident, there is no guarantee that the module will detect this and give you a sensible error. The ALPHA message may have gone, but this is still doing dark and scary things internally, and you should be prepared for that.
UPDATE
MooseX::Declare has not been updated much, and you may wish to look at alternatives such as Moops. Personally, I have decided to stick with pure Moose until Perl itself begins to support class/method/has syntax natively, which is possibly on the cards.
I think it's a matter of differing perspectives as much as anything -- rafl is one of the aforementioned "more insane members of the community" while Rolsky is more conservative. It's up to you to decide who you agree with, and really I think that the most important variable is your own code.
MooseX::Declare is good code. It won't randomly blow up your machine, it's not awful for performance, and it offers a lot of nifty stuff while reducing the amount of boilerplate that you have to write. But it might change in the future, making your code refuse to compile until it's updated; it might make your editor and other development tools confused when it sees syntax that it can't parse, it might piss off your collaborators by making them learn a new module to work with your code, or it might piss off your boss by making it so any future maintainer has to learn a new module to work with your code. Which of those things apply to you, and to what degree? You know better than I do, I hope.
There are people who feel that the maturity and stability of MooseX::Delcare, Devel::Declare on which it's based, or even Moose itself are not yet ready for "prime time". I also know of two large companies with millions of visitors a month, who have MooseX::Declare in their production environment. I personally am happy with the stack I am provided with Moose and do not see a need yet to bring in MooseX::Declare. I know people who's opinion I deeply respect who refuse to write new code without the declarative sugar from MooseX::Declare.
All of this is to say, the decision on whether something is or is not production ready is highly dependent upon your production environment, your development needs, and taste for risk. Without being in your shoes we can't possibly give an informed decision as to how well any given tool matches that profile.
MooseX::Method::Signatures (MXMS), and MooseX::Declare which uses it, is not production ready. This is not because the code isn't stable, but because it is appallingly slow. Simply using the method keyword, no types or arguments, is a 500-1000x runtime performance hit over a regular method call. My Macbook Pro can do about 6,000 simple method calls per second using MXMS vs 5,000,000 with plain Perl.
Method::Signatures, in contrast, has almost no performance hit above what it would normally cost to do the requested checks. The syntax is almost exactly the same as MXMS and it supports Moose (and Mouse) types. Both rely on the same underlying syntax modifying technique. (Full disclosure, I am the author of Method::Signatures.)
If you like MooseX::Declare but want the performance of Method::Signatures, try Method::Signatures::Modifiers.
It depends on what you mean by "production ready". I wouldn't depend on them until their velocity slows down quite a bit. I like my production stuff to not need frequent care from external code changes, API adjustments, and so on. That's not something particular to Moose, but any young project.
You have to judge how much that matters to you. In some situations, pushing stuff into production is a lengthy process, so you must be circumspect with such things. At the other extreme, some places let you edit files directly on the production server. That is, you have to define your tolerance before anyone can tell you which side a given MooseX module is on.
MooseX::Declare and MooseX::Method::Signatures are working well but they can have really nasty penalty depending on what does your code do. This can be fixed by just not using method keyword or using Method::Signatures::Modifiers.
Performance penalty I am seeing is around 2-5x compared to Method::Signatures::Modifiers (5x being mostly for one specific class I was using). And it seems that it is mostly compile time or maybe first time initialization, because it is getting under 2x when the computation is longer.
Method::Signatures::Modifiers has better errors but you have to turn this optimization off when you use debugger (it goes haywire because it does not see these methods, you can see for yourself in -MO=Deparse output).
It may be worth it to get rid of Perl argument shifting hell.

Is it naughty to have a large utility file?

In my C project I have quite a large utils.c file. It is really full of many utilities of different sorts. I feel a bit naughty just stuffing different miscellaneous functions in there. For example it has some utilities related to low level stuff such as a lowercase() function, and it also has some quite sophisticated utilities such as converting to/from different colour formats.
My question is, is it very naughty to have such a large utils.c with many different types of utilities in it? Should I break it up into many different kinds of utility files? Such as graphics_utils.c and so on What do you think?
Breaking them up into separate files based on categories (ie graphics, strings, etc.) will lead to better organization, making it easier to locate certain pieces of code, having smaller files to go through, instead of just one large file.
You want to break it up, not just for organizational reasons, but because you will have many other files that depend on this one. Because everything will depend on this file, it makes this one file difficult to change because it might cause widespread breakage.
http://ifacethoughts.net/2006/04/15/stable-dependencies-principle/
If it's just you that will EVER maintain the stuff, it's a matter of when the complexity gets to the point where you find yourself searching for things. That would be the time to refactor and reorganize (there's a cost to reorganize, just as there's a cost to not reorganize).
If it's POSSIBLE that anyone else will maintain a project that includes your utils, you have to consider THEIR pain point when deciding when to reorganize. Theirs is MUCH lower than yours.
I tend to break them up into various sub-utils as you say (graphics_utils) when it becomes appropriate.
Break it up. Stuff will be easier to find, easier to reuse, easier to refactor, easier to unit test. I recently needed to get a set of ISO-8601 date handling methods out of a ginormous Java utility class of static methods, and it was really hard to find the 5% of the code I needed.
It is definitely not kosher, because the next guy coming through your code won't know where to look for anything. Break it up by function, and your coworkers will thank you!
Another advantage that comes from breaking up the file into separates is that when you place it under source control, you can have finer grained control. This really is useful if you have bits that are tweaked/extended/specialised frequently, and other bits that are relatively stable.
Another point: You should organize your code, i. e. break it up in smaller modules and categorize it, because at some point in time you will end up writing a second and third function for the same thing, simply for the reason that you wont find that function that you knew it was there, but you don't remember it's name.
I've got a (rather large) project with such a module and there is programming logic for which there are up to 5-6 implementations (for the same thing).
Like everyone else I would break them up. But I tend to use Extension Methods now, so I would have one class (and one file) per class being extended (e.g. StringExtensions, SqlDataReaderExtensions, etc). I find this tends to break up the utility methods nicely.

Good practice class line count [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I know that there is no right answer to this question, I'm just asking for your opinions.
I know that creating HUGE class files with thousand lines of code is not a good practice since it's hard to maintain and also it usually means that you should probably review your program logic.
In your opinion what is an average line count for a class let's say in Java (i don't know if the choice of language has anything to do with it but just in case...)
Yes, I'd say it does have to do with the language, if only because some languages are more verbose than others.
In general, I use these rules of thumb:
< 300 lines: fine
300 - 500 lines: reasonable
500 - 1000 lines: maybe ok, but plan on refactoring
> 1000 lines: definitely refactor
Of course, it really depends more on the nature and complexity of the code than on LOC, but I've found these reasonable.
In general, number of lines is not the issue - a slightly better metric is number of public methods. But there is no correct figure. For example, a utility string class might correctly have hundreds of methods, whereas a business level class might have only a couple.
If you are interested in LOC, cyclomatic and other complexity measurements, I can strongly recommend Source Monitor from http://www.campwoodsw.com, which is free, works with major languages such as Java & C++, and is all round great.
From Eric Raymond's "The Art Of Unix Programming"
In nonmathematical terms, Hatton's empirical results imply a sweet spot between 200 and 400 logical lines of code that minimizes probable defect density, all other factors (such as programmer skill) being equal. This size is independent of the language being used — an observation which strongly reinforces the advice given elsewhere in this book to program with the most powerful languages and tools you can. Beware of taking these numbers too literally however. Methods for counting lines of code vary considerably according to what the analyst considers a logical line, and other biases (such as whether comments are stripped). Hatton himself suggests as a rule of thumb a 2x conversion between logical and physical lines, suggesting an optimal range of 400–800 physical lines.
Taken from here
Better to measure something like cyclomatic complexity and use that as a gauge. You could even stick it in your build script/ant file/etc.
It's too easy, even with a standardized code format, for lines of code to be disconnected from the real complexity of the class.
Edit: See this question for a list of cyclomatic complexity tools.
I focus on methods and (try to) keep them below 20 lines of code. Class length is in general dictated by the single responsibility principle. But I believe that this is no absolute measure because it depends on the level of abstraction, hence somewhere between 300 and 500 lines I start looking over the code for a new responsibility or abstraction to extract.
Small enough to do only the task it is charged with.
Large enough to do only the task it is charged with.
No more, no less.
In my experience any source file over 1000 text lines I will start wanting to break up. Ideally methods should fit on a single screen, if possible.
Lately I've started to realise that removing unhelpful comments can help greatly with this. I comment far more sparingly now than I did 20 years ago when I first started programming.
The short answer: less than 250 lines.
The shorter answer: Mu.
The longer answer: Is the code readable and concise? Does the class have a single responsibility? Does the code repeat itself?
For me, the issue isn't LOC. What I look at is several factors. First, I check my If-Else-If statements. If a lot of them have the same conditions, or result in similar code being run, I try to refactor that. Then I look at my methods and variables. In any single class, that class should have one primary function and only that function. If it has variables and methods for a different area, consider putting those into their own class. Either way, avoid counting LOC for two reasons:
1) It's a bad metric. If you count LOC you're counting not just long lines, but also lines which are whitespace and used for comments as though they are the same. You can avoid this, but at the same time, you're still counting small lines and long lines equally.
2) It's misleading. Readability isn't purely a function of LOC. A class can be perfectly readable but if you have a LOC count which it violates, you're gonna find yourself working hard to squeeze as many lines out of it as you can. You may even end up making the code LESS readable. If you take the LOC to assign variables and then use them in a method call, it's more readable than calling the assignments of those variables directly in the method call itself. It's better to have 5 lines of readable code than to condense it into 1 line of unreadable code.
Instead, I'd look at depth of code and line length. These are better metrics because they tell you two things. First, the nested depth tells you if you're logic needs to be refactored. If you are looking at If statements or loops nested more than 2 deep, seriously consider refactoring. Consider refactoring if you have more than one level of nesting. Second, if a line is long, it is generally very unreadable. Try separating out that line onto several more readable lines. This might break your LOC limit if you have one, but it does actually improve readability.
line counting == bean counting.
The moment you start employing tools to find out just how many lines of code a certain file or function has, you're screwed, IMHO, because you stopped worrying about managebility of the code and started bureaucratically making rules and placing blame.
Have a look at the file / function, and consider if it is still comfortable to work with, or starts getting unwieldly. If in doubt, call in a co-developer (or, if you are running a one-man-show, some developer unrelated to the project) to have a look, and have a quick chat about it.
It's really just that: a look. Does someone else immediately get the drift of the code, or is it a closed book to the uninitiated? This quick look tells you more about the readability of a piece of code than any line metrics ever devised. It is depending on so many things. Language, problem domain, code structure, working environment, experience. What's OK for one function in one project might be all out of proportion for another.
If you are in a team / project situation, and can't readily agree by this "one quick look" approach, you have a social problem, not a technical one. (Differing quality standards, and possibly a communication failure.) Having rules on file / function lengths is not going to solve your problem. Sitting down and talking about it over a cool drink (or a coffee, depending...) is a better choice.
You're right... there is no answer to this. You cannot put a "best practice" down as a number of lines of code.
However, as a guideline, I often go by what I can see on one page. As soon as a method doesn't fit on one page, I start thinking I'm doing something wrong. As far as the whole class is concerned, if I can't see all the method/property headers on one page then maybe I need to start splitting that out as well.
Again though, there really isn't an answer, some things just have to get big and complex. The fact that you know this is bad and you're thinking about it now, probably means that you'll know when to stop when things get out of hand.
Lines of code is much more about verbosity than any other thing. In the project I'm currently working we have some files with over 1000 LOC. But, if you strip the comments, it will probably remain about 300 or even less. If you change declarations like
int someInt;
int someOtherInt;
to one line, the file will be even shorter.
However, if you're not verbose and you still have a big file, you'll probably need to think about refactoring.

How do you define 'unwanted code'?

How would you define "unwanted code"?
Edit:
IMHO, Any code member with 0 active calling members (checked recursively) is unwanted code. (functions, methods, properties, variables are members)
Here's my definition of unwanted code:
A code that does not execute is a dead weight. (Unless it's a [malicious] payload for your actual code, but that's another story :-))
A code that repeats multiple times is increasing the cost of the product.
A code that cannot be regression tested is increasing the cost of the product as well.
You can either remove such code or refactor it, but you don't want to keep it as it is around.
0 active calls and no possibility of use in near future. And I prefer to never comment out anything in case I need for it later since I use SVN (source control).
Like you said in the other thread, code that is not used anywhere at all is pretty much unwanted. As for how to find it I'd suggest FindBugs or CheckStyle if you were using Java, for example, since these tools check to see if a function is used anywhere and marks it as non-used if it isn't. Very nice for getting rid of unnecessary weight.
Well after shortly thinking about it I came up with these three points:
it can be code that should be refactored
it can be code that is not called any more (leftovers from earlier versions)
it can be code that does not apply to your style-guide and way-of-coding
I bet there is a lot more but, that's how I'd define unwanted code.
In java i'd mark the method or class with #Deprecated.
Any PRIVATE code member with no active calling members (checked recursively). Otherwise you do not know if your code is not used out of your scope analysis.
Some things are already posted but here's another:
Functions that almost do the same thing. (only a small variable change and therefore the whole functions is copy pasted and that variable is changed)
Usually I tell my compiler to be as annoyingly noisy as possible, that picks 60% of stuff that I need to examine. Unused functions that are months old (after checking with the VCS) usually get ousted, unless their author tells me when they'll actually be used. Stuff missing prototypes is also instantly suspect.
I think trying to implement automated house cleaning is like trying to make a USB device that guarantees that you 'safely' play Russian Roulette.
The hardest part to check are components added to the build system, few people notice those and unused kludges are left to gather moss.
Beyond that, I typically WANT the code, I just want its author to refactor it a bit and make their style the same as the rest of the project.
Another helpful tool is doxygen, which does help you (visually) see relations in the source tree.. however, if its set at not extracting static symbols / objects, its not going to be very thorough.