Does Perl perform common subexpression elimination? - perl

I wonder if Perl performs common subexpression elimination?
And what kind of optimisations are done?

No, but I do.
Now, I don't unroll loops by hand, because loops are an easier concept once you're familiar with programming. Because you could be doing anything with a sequence of commands, the loop makes it clear that you're repeating a task.
But CSE is something that makes more efficient code regardless of the implementation of the language. So I do it. It doesn't make the code baroque, and it works in languages where it's not automatically included.
Perl offers compression of syntax so there are often less subexpressions that have to be hand-eliminated.

No, and its not possible to do it either, except in very simple cases.
In order to eliminate common subexpressions, you must know that they haven't changed their values in between. But since so much can happen between two expressions a few lines apart, its almost impossible to tell if the subexpressions are still common.
The only things you would be able to eliminate are expressions that are provably pure, like "7 + 5". But proving that something like a function call is safe to eliminate is not going to happen.
To do this, you need powerful and conservative static analysis, which Perl does not have, and is not likely to gain (in C/C++ you need less powerful stuff because the languages are less dynamic, but you still need something).

Related

Emacs custom syntactic analysis

In emacs, the syntactic analysis is surprisingly little.
For example, if I wish to indent parameter names differently than the types in a function declaration, like so:
void myfunction(
int
test
);
int is considered an arglist-intro, and test is considered as arglist-cont. However, if I add any more parameters, they'll all be considered arglist-cont, so indenting arglist-cont wouldn't do the desired effect.
So here's what I'm wondering: Is it possible to make my own syntactic analysis thingy for emacs so that it'll recognize and differentiate cases like this (this isn't the only case, by the way)? And if so, how?
Yes, of course you can write whatever you want. Emacs is free software, it comes with sources, so you can modify them as you wish.
However, please be aware that Emacs is quite widely used, including by some very smart hackers. This means that Emacs limitations usually (but, of course, not always!) have a good reason behind them (in your case, the reason is that the C syntax is quite complex). The implication is that doing what you want to do might be harder than you might be thinking. Not that it should discourage you, of course!
PS. You asked "is it possible to make my own syntactic analysis", not "how to do that" :-)
PPS. As for "how", you will have to start with cc-engine.el.

What's the recommended replacement for Perl's deprecated-ish given/when?

Now that the Perl devs have decided to sort-of deprecate given/when statements, is there a recommended replacement, beyond just going back to if/elsif/else?
if/elsif/else chains are the best option most of the time — except when something completely different is better than both if/elsif/else and given/when, which is actually reasonably often. Examples of "completely different" approaches are creating different types of objects to handle different scenarios, and letting method dispatch do your work for you, or finding an opportunity to make your code more data-driven. Both of those, if they're appropriate and you do them right, can greatly reduce the number of "switch statement" constructs in your code.
Just as a supplement, I've found that a combination of 'for' and if/elsif/else is good if you have some given/when/default code that needs to be quickly updated. Just replace given with for and replace the when statements with a cascade of if & elsif, and replace default with else. This allows all your tests to continue using $_ implicitly, requiring less rewriting. (But be aware that other special smart match features will not work any more.)
This is just for rewriting code that already uses given/when, though. For writing new code, #hobbs has the right answer.

What is Perl's secret of getting small code do so much?

I've seen many (code-golf) Perl programs out there and even if I can't read them (Don't know Perl) I wonder how you can manage to get such a small bit of code to do what would take 20 lines in some other programming language.
What is the secret of Perl? Is there a special syntax that allows you to do complex tasks in few keystrokes? Is it the mix of regular expressions?
I'd like to learn how to write powerful and yet short programs like the ones you know from the code-golf challenges here. What would be the best place to start out? I don't want to learn "clean" Perl - I want to write scripts even I don't understand anymore after a week.
If there are other programming languages out there with which I can write even shorter code, please tell me.
There are a number of factors that make Perl good for code golfing:
No data typing. Values can be used interchangeably as strings and numbers.
"Diagonal" syntax. Usually referred to as TMTOWTDI (There's more than one way to do it.)
Default variables. Most functions act on $_ if no argument is specified. (A few act
on #_.)
Functions that take multiple arguments (like split) often have defaults that
let you omit some arguments or even all of them.
The "magic" readline operator, <>.
Higher order functions like map and grep
Regular expressions are integrated into the syntax (i.e. not a separate library)
Short-circuiting operators return the last value tested.
Short-circuiting operators can be used for flow control.
Additionally, without strictures (which are off be default):
You don't need to declare variables.
Barewords auto-quote to strings.
undef becomes either 0 or '' depending on context.
Now that that's out of the way, let me be very clear on one point:
Golf is a game.
It's great to aspire to the level of perl-fu that allows you to be good at it, but in the name of $DIETY do not golf real code. For one, it's a horrible waste of time. You could spend an hour trying to trim out a few characters. Golfed code is fragile: it almost always makes major assumptions and blithely ignores error checking. Real code can't afford to be so careless. Finally, your goal as a programmer should be to write clear, robust, and maintainable code. There's a saying in programming: Always write your code as if the person who will maintain it is a violent sociopath who knows where you live.
So, by all means, start golfing; but realize that it's just playing around and treat it as such.
Most people miss the point of much of Perl's syntax and default operators. Perl is largely a "DWIM" (do what I mean) language. One of it's major design goals is to "make the common things easy and the hard things possible".
As part of that, Perl designers talk about Huffman coding of the syntax and think about what people need to do instead of just giving them low-level primitives. The things that you do often should take the least amount of typing, and functions should act like the most common behavior. This saves quite a bit of work.
For instance, the split has many defaults because there are some use cases where leaving things off uses the common case. With no arguments, split breaks up $_ on whitespace because that's a very common use.
my #bits = split;
A bit less common but still frequent case is to break up $_ on something else, so there's a slightly longer version of that:
my #bits = split /:/;
And, if you wanted to be explicit about the data source, you can specify the variable too:
my #bits = split /:/, $line;
Think of this as you would normally deal with life. If you have a common task that you perform frequently, like talking to your bartender, you have a shorthand for it the covers the usual case:
The usual
If you need to do something, slightly different, you expand that a little:
The usual, but with onions
But you can always note the specifics
A dirty Bombay Sapphire martini shaken not stirred
Think about this the next time you go through a website. How many clicks does it take for you to do the common operations? Why are some websites easy to use and others not? Most of the time, the good websites require you to do the least amount of work to do the common things. Unlike my bank which requires no fewer than 13 clicks to make a credit card bill payment. It should be really easy to give them money. :)
This doesn't answer the whole question, but in regards to writing code you won't be able to read in a couple days, here's a few languages that will encourage you to write short, virtually unreadable code:
J
K
APL
Golfscript
Perl has a lot of single character special variables that provide a lot of shortcuts eg $. $_ $# $/ $1 etc. I think it's that combined with the built in regular expressions, allows you to write some very concise but unreadable code.
Perl's special variables ($_, $., $/, etc.) can often be used to make code shorter (and more obfuscated).
I'd guess that the "secret" is in providing native operations for often repeated tasks.
In the domain that perl was originally envisioned for you often have to
Take input linewise
Strip off whitespace
Rip lines into words
Associate pairs of data
...
and perl simple provided operators to do these things. The short variable names and use of defaults for many things is just gravy.
Nor was perl the first language to go this way. Many of the features of perl were stolen more-or-less intact (or often slightly improved) from sed and awk and various shells. Good for Larry.
Certainly perl wasn't the last to go this way, you'll find similar features in python and php and ruby and ... People liked the results and weren't about to give them up just to get more regular syntax.
What's Java's secret of copying a variable in only one line, without worrying about buses and memory? Answer: the code is transformed to bigger code. Same for every language ever invented.

What is best practice as far as using perl-isms (idiomatic expressions) in Perl?

A couple of years back I participated in writing the best practices/coding style for our (fairly large and often Perl-using) company. It was done by a committee of "senior" Perl developers.
As anything done by consensus, it had parts which everyone disagreed with. Duh.
The part that rubbed wrong the most was a strong recommendation to NOT use many Perlisms (loosely defined as code idioms not present in, say C++ or Java), such as "Avoid using '... unless X;' constructs".
The main rationale posited for such rules as this one was that non-Perl developers would have much harder time with the Perl code base otherwise. The assumption here I guess is that Perl code jockeys are rarer breed overall - and among new hires to the company - than non-Perlers.
I was wondering whether SO has any good arguments to support or reject this logic... it is mostly academic curiosity at this point as the company's Perl coding standard is ossified and will never be revised again as far as I'm aware.
P.S. Just to be clear, the question is in the context I noted - the answer for an all-Perl smaller development shop is obviously a resounding "use Perl to its maximum capability".
I write code assuming that a competent Perl programmer will be reading it. I don't go out of my way to be clever, but I don't dumb it down either.
If you're writing code for people who don't know the language, you're going to miss most of the point of using that language. I often find that people want to outlaw Perlisms because they refuse to learn any more than they already know.
Since you say that you are in a small Perl shop, it should be pretty easy to ask the person who wrote the code what it means if you don't understand it. That sort of stuff should come up in code reviews and so on. Everyone continues to learn more about the language as you have periodic and regular chances to review the code. You shouldn't let too much time elapse without other eyeballs looking at someone's code. You certainly shouldn't wait until a week after they leave the company.
As for new hires, I'm always puzzled why anyone would think that you should sit them in front of a keyboard and turn them loose expecting productive work in a codebase they have never seen.
This isn't limited to Perl, either. It's a general programming issue. You should always be learning more about your tools. Most of the big shops I know have mini-bootcamps to bring developers up to speed on the codebase, including any bits of tricky code they may encounter.
I ask myself two simple questions:
Am I doing this because it's devilishly clever and/or shows off my extensive knowledge of Perl arcana?
Then it's a bad idea. But,
Am I doing this because it's idiomatic Perl and benefits from Perl's distinct advantages?
Then it's a good idea.
I see no justifiable reason to reject, say, string interpolation just because Java and C don't have it. unless is a funny one but I think having a subroutine start with the occasional
return undef unless <something>;
isn't so bad.
What sort of perlisms do you mean?
Good:
idiomatic for loops: for(1..5) {} or for( #foo ) {}
Scalar context evaluation of arrays: my $count = #items;
map, grep and sort: my %foo = map { $_->id => $_ } #objects;
OK if limited:
statement modifier control - trailing if, unless, etc.
Restrict to error trapping and early returns. die "Bad juju\n" unless $foo eq 'good juju';
As Schwern pointed out, another good use is conditional assignment of default values: my $foo = shift; $foo = 'blarg' unless defined $foo;. This usage is, IMO, cleaner than a my $foo = defined $_[0] ? shift : 'blarg';.
Reason to avoid: if you need to add additional behaviors to the check or an else, you have a big reformatting job. IMO, the hassle to redo a statement (even in a good editor) is more disruptive than typing several "unnecessary" blocks.
Prototypes - use only to create filtery functions like map. Prototypes are compiler hints not 'prototypes' in the sense of any other language.
Logical operators - standardize on when to use and and or vs. && and ||. All your code should be consistent. Best if you use a Perl::Critic policy to enforce.
Avoid:
Local variables. Dynamic scope is damn weird, and local is not the same as local anywhere else.
Package variables. Enables bad practices. If you think you need globally shared state, refactor. If you still need globally shared state, use a singleton.
Symbol table hackery
It must have been, as you say, a few years ago, because Damian Conway has 'cornered the market' in Perl standards with Perl Best Practices for the last few years.
I've worked in a similarly ossified environment - where we were not allowed to adopt the latest best practices, because that would be a change, and no one at a sufficiently high level in the corporate structure understood (or could be bothered to understand) Perl and sign off on moving in to the 21st Century.
A corporation that deploys a technology and retains it, but doesn't either buy in expertise or train up in house, is asking for trouble.
(I'd guess you're working in a highly change-controlled environment - financial perhaps?)
I agree with brian on this by the way.
I'd say Moose kills off 99.9% of Perl-isms, by convention, that shouldn't be used: symbol table hackery, reblessing objects, common blackbox violations: treating objects as arrays or hashes. The great thing, is it does all of this without taking the functionality hit of "not using it".
If the "perl-isms" you're really referring to are mutator form (warn "bad idea" unless $good_idea), unless, and until then I don't think you really have much of an argument because these "perlisms" don't seem to inhibit readability to either perl users, or non-perl users.
Pick up a copy of Effective Perl Programming: Ways to Write Better, More Idiomatic Perl (2nd Edition), and treat that as a guideline. It contains many of the better idioms and is packed with the little bits of information that will get you writing good Perl style Perl code, as opposed to C or Java (or whatever) style Perl code.

Keeping CL and Scheme straight in your head

Depending on my mood I seem to waffle back and forth between wanting a Lisp-1 and a Lisp-2. Unfortunately beyond the obvious name space differences, this leaves all kinds of amusing function name/etc problems you run into. Case in point, trying to write some code tonight I tried to do (map #'function listvar) which, of course, doesn't work in CL, at all. Took me a bit to remember I wanted mapcar, not map. Of course it doesn't help when slime/emacs shows map IS defined as something, though obviously not the same function at all.
So, pointers on how to minimize this short of picking one or the other and sticking with it?
Map is more general than mapcar, for example you could do the following rather than using mapcar:
(map 'list #'function listvar)
How do I keep scheme and CL separate in my head? I guess when you know both languages well enough you just know what works in one and not the other. Despite the syntactic similarities they are quite different languages in terms of style.
Well, I think that as soon you get enough experience in both languages this becomes a non-issue (just with similar natural languages, like Italian and Spanish). If you usually program in one language and switch to the other only occasionally, then unfortunately you are doomed to write Common Lisp in Scheme or vice versa ;)
One thing that helps is to have a distinct visual environment for both languages, using syntax highlighting in some other colors etc. Then at least you will always know whether you are in Common Lisp or Scheme mode.
I'm definitely aware that there are syntactic differences, though I'm certainly not fluent enough yet to automatically use them, making the code look much more similar currently ;-).
And I had a feeling your answer would be the case, but can always hope for a shortcut <_<.
The easiest way to keep both languages straight is to do your thinking and code writing in Common Lisp. Common Lisp code can be converted into Scheme code with relative ease; however, going from Scheme to Common Lisp can cause a few headaches. I remember once where I was using a letrec in Scheme to store both variables and functions and had to split it up into the separate CL functions for the variable and function namespaces respectively.
In all practicality though I don't make a habit of writing CL code, which makes the times that I do have to all the more painful.