What is the CORE:match (opcode) subroutine in Perl profiling? - perl

I previously wrote some utilities in Perl, and I am now rewriting them in order to give some new/better features. However, things seem to be going much more slowly than in the original utilities, so I decided to run one with the NYTProf profiler. Great profiler btw, still trying to figure out all its useful features.
So anyway, it turns out that 93% of my program's time is being spent on calls to the GeneModel::CORE:match (opcode) subroutine, and I have no idea what this is. Most Google hits point to NYTProf profiles others have posted. I indeed wrote the GeneModel class/package, but I don't know what this subroutine is, why it was called so many times, or why it's taking so long. Any ideas?

CORE:match is a call to a regular expression -- in this case, within your GeneModel package.
For example, if we profile this script, Devel::NYTProf reports 1000 calls to Foo::CORE:match.
use strict;
use warnings;
package Foo;
my $s = 'foo foo';
$s =~ /foo/ for 1 .. 1000;

Perl is compiled to opcodes. The match operator results in a match opcode.
> perl -MO=Terse -e'm//'
LISTOP (0x8c4b40) leave [1]
    OP (0x8c4070) enter
    COP (0x8c4780) nextstate
    PMOP (0x8c4260) match
This is not a subroutine; it is merely represented that way because opcode profiling is a recent addition and the UI hasn't been overhauled yet to take that into account. Put simply, the profiler is telling you that most of the time is spent in the regex engine.

Related

Having a perl script make use of one among several secondary scripts

I have a main program mytool.pl to be run from the command line. There are several auxiliary scripts special1.pl, special2.pl, etc., each of which contains a couple of subroutines and a hash, all identically named across scripts. Let's suppose these are named MySpecialFunction(), AnotherSpecialFunction() and %SpecialData.
I'd like for mytool to include/use/import the contents of one of the special*.pl files, only one, according to a command line option. For example, the user will do:
bash> perl mytool.pl --specialcase=5
and mytool will use MySpecialFunction() from special5.pl, and ignore all other special*.pl files.
Is this possible, and if so, how?
It's important to note that the selection of which special file to use is made at runtime, so adding a "use" at the top of mytool.pl probably isn't the right thing to do.
Note I am a long-time C programmer, not a perl expert; I may be asking something obvious.
This is for a one-off project that will turn to dust in only a month. Neither mytool.pl nor special?.pl (nor perl itself) will be of interest beyond the end of this short project. Therefore, we don't care for solutions that are elaborate or require learning some deep magic. Quick and dirty preferred. I'm guessing that Perl's module mechanism is overkill for this, but have no idea what the alternatives are.
You can use a hash or array to map values of specialcase to .pl files and require or do them as needed.
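#!/usr/bin/env perl
use strict; use warnings;
my @handlers = qw(one.pl two.pl);
my ($case) = @ARGV;
$case = 0 unless defined $case;
# check that $case is within range
die "No such case: $case\n" unless $case =~ /^\d+$/ && $case < @handlers;
# the leading "./" is needed on Perl 5.26+, where "." is no longer in @INC
do "./$handlers[$case]";
print special_function(), "\n";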
When you use a module, Perl just requires the module in a BEGIN block (and imports the module's exported items). Since you want to change which script you load at runtime, call require yourself.
if ($special_case_1) {
    # the leading "./" is needed on Perl 5.26+, where "." is no longer in @INC
    require './special1.pl';
    # and go about your business
}
Here's a good reference on when to use use vs. require.
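Putting the two answers together, here is a minimal sketch of loading special<N>.pl at runtime. The helper name load_special() and the stand-in special5.pl written to a temporary directory are assumptions made only so the example runs on its own; in the real tool you would point it at the directory holding your special*.pl files and take N from the command line (for example with Getopt::Long).

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Load special<N>.pl from $dir at runtime. The file-naming scheme comes from
# the question; load_special() itself is a hypothetical helper.
sub load_special {
    my ($dir, $case) = @_;
    die "Case must be a non-negative integer\n" unless $case =~ /\A\d+\z/;
    my $file = File::Spec->rel2abs(File::Spec->catfile($dir, "special$case.pl"));
    die "No such file: $file\n" unless -e $file;
    # An absolute path bypasses the @INC search. require dies on failure,
    # and the loaded file must end with a true value (1;).
    require $file;
    return $file;
}

# Demo only: write a stand-in special5.pl so the sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', File::Spec->catfile($dir, 'special5.pl') or die $!;
print {$fh} <<'END';
our %SpecialData = (answer => 42);
sub MySpecialFunction { return "special five" }
1;
END
close $fh;

load_special($dir, 5);

# The hash must be a package variable ("our") in the special file;
# declaring it here lets strict code refer to it by its short name.
our %SpecialData;
print MySpecialFunction(), "\n";
print $SpecialData{answer}, "\n";
```

Because every special file defines the same names in package main, loading exactly one of them per run avoids "Subroutine redefined" warnings.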

In Perl, is there any way to tie a stash?

Similar to the way AUTOLOAD can be used to define subroutines on demand, I am wondering if there is a way to tie a package's stash so that I can intercept access to variables in that package.
I've tried various permutations of the following idea, but none seem to work:
{package Tie::Stash;
    use Tie::Hash;
    BEGIN {our @ISA = 'Tie::StdHash'}
    sub FETCH {
        print "calling fetch\n";
    }
}
{package Target}
BEGIN {tie %Target::, 'Tie::Stash'}
say $Target::x;
This dies with Bad symbol for scalar ... on the last line, without ever printing "calling fetch". If the say $Target::x; line is removed, the program runs and exits properly.
My guess is that the failure has to do with stashes being like, but not the same as hashes, so the standard tie mechanism is not working right (or it might just be that stash lookup never invokes tie magic).
Does anyone know if this is possible? Pure Perl would be best, but XS solutions are ok.
You're hitting a compile-time internal error ("Bad symbol for scalar"). It happens while Perl is trying to work out what '$Target::x' should be, which you can verify by running a debugging Perl with:
perl -DT foo.pl
...
### 14:LEX_NORMAL/XOPERATOR ";\n"
### Pending identifier '$Target::x'
Bad symbol for scalar at foo.pl line 14.
I think the GV for '::Target' is replaced by something else when you tie() it, so that whatever eventually tries to get to its internal hash cannot. Given that tie() is a little bit of a mess, I suspect what you're trying to do won't work, which is also suggested by this (old) set of exchanges on p5p:
https://groups.google.com/group/perl.perl5.porters/browse_thread/thread/f93da6bde02a91c0/ba43854e3c59a744?hl=en&ie=UTF-8&q=perl+tie+stash#ba43854e3c59a744
A little late to the question, but although it's not possible to use tie to do this, Variable::Magic allows you to attach magic to a stash and thereby achieve something similar.

How can I count the respective lines for each sub in my Perl code?

I am refactoring a rather large body of code and a sort of esoteric question came to me while pondering where to go on with this. What this code needs in large parts is shortening of subs.
As such, it would be very advantageous to point some sort of statistics collector at the directory, which would go through all the .pm, .cgi and .pl files, find all subs (I'm fine if it only gets the named ones) and give me a table of all of them, along with their line counts.
I gave PPI a cursory look, but could not find anything directly relevant; some tools might be appropriate, but they look rather complex to use.
Are there any easier modules that do something like this?
Failing that, how would you do this?
Edit:
Played around with PPI a bit and created a script that collects relevant statistics on a code base: http://gist.github.com/514512
use PPI;
my $document = PPI::Document->new($file);
# Strip out comments and documentation
$document->prune('PPI::Token::Pod');
$document->prune('PPI::Token::Comment');
# Find all the named subroutines
my $sub_nodes = $document->find(
    sub { $_[1]->isa('PPI::Statement::Sub') and $_[1]->name } );
# find() returns a false value when nothing matches, hence the "|| []"
print map { sprintf "%s %s\n", $_->name, scalar split /\n/, $_->content }
    @{ $sub_nodes || [] };
I'm dubious that simply identifying long functions is the best way to identify what needs to be refactored. Instead, I'd run the code through perlcritic at increasing levels of harshness and follow the suggestions.

Can I rate a song in iTunes (on a Mac) using Perl?

I've tried searching CPAN. I found Mac::iTunes, but not a way to assign a rating to a particular track.
If you're not excited by Mac::AppleScript, which just takes a big blob of AppleScript text and runs it, you might prefer Mac::AppleScript::Glue, which provides a more object-oriented interface. Here's the equivalent to Iamamac's sample code:
#!/usr/bin/env perl
use Modern::Perl;
use Mac::AppleScript::Glue;
use Data::Dumper;
my $itunes = Mac::AppleScript::Glue::Application->new('iTunes');
# might crash if iTunes isn't playing anything yet
my $track = $itunes->current_track;
# for expository purposes, let's see what we're dealing with
say Dumper \$itunes, \$track;
say $track->rating; # initially undef
$track->set(rating => 100);
say $track->rating; # should print 100
All that module does is build a big blob of AppleScript, run it, and then break it all apart into another AppleScript expression that it can use on your next command. You can see that in the _ref value of the track object when you run the above script. Because all it's doing is pasting and parsing AppleScript, this module won't be any faster than any other AppleScript-based approach, but it does allow you to intersperse other Perl commands within your script, and it keeps your code looking a little more like Perl, for what that's worth.
You can write AppleScript to fully control iTunes, and there is a Perl binding Mac::AppleScript.
EDIT Code Sample:
use Mac::AppleScript qw(RunAppleScript);
RunAppleScript(qq(tell application "iTunes" \n set rating of current track to $r \n end tell));
Have a look at itunes-perl, it seems to be able to rate tracks.

Why is 'last' called 'last' in Perl?

What is the historical reason that last is called last in Perl rather than break, as it is called in C?
The design of Perl was influenced by C (in addition to awk, sed and sh - see man page below), so there must have been some reasoning behind not going with the familiar C-style naming of break/last.
A bit of history from the Perl 1.000 (released 18 December, 1987) man page:
[Perl] combines (in the author's opinion, anyway) some of the best features of C, sed, awk, and sh, so people familiar with those languages should have little difficulty with it. (Language historians will also note some vestiges of csh, Pascal, and even BASIC-PLUS.)
The semantics of 'break' or 'last' are defined by the language (in this case Perl), not by you.
Why not think of 'last' as "this is the last statement to run for the loop".
It's always struck me as odd that the 'continue' statement in 'C' starts the next pass of a loop. This is definitely a strange use of the concept of "continue". But it is the semantics of 'C', so I accept it.
By trying to map particular programming concepts into single English words with existing meaning there is always going to be some sort of mismatching oddity.
Source
Plus, Larry Wall is kinda weird. Have you seen his picture?
(source: wired.com)
I expect that this is because Perl was created by a linguist, not a computer scientist. In normal English usage, the concept of declaring that you have completed your final pass through a loop is more strongly connected to the word "last" ("this is the last pass") than to the word "break" ("break the loop"? "break out of the loop"? - it's not even clear how "break" is intended to relate to exiting the loop).
The term 'last' makes more sense when you remember that you can use it with more than just the immediate looping control. You can apply it to labeled blocks one or more levels above the block it is in:
LINE: while( <> ) {
    WORD: foreach ( split ) {
        last LINE if /^__END__\z/;
        ...
    }
}
It reads more naturally to say "last" in English when you read it as "last line if it matches ...".
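As a runnable illustration of the labeled form, here is a small sketch; the @lines array is a hypothetical stand-in for input read from <>. The inner loop ends the outer one as soon as it sees the __END__ marker.

```perl
use strict;
use warnings;

# Hypothetical input standing in for lines read from <>.
my @lines = ('alpha beta', 'gamma __END__ delta', 'epsilon');
my @seen;

LINE: for my $line (@lines) {
    WORD: for my $word (split ' ', $line) {
        last LINE if $word eq '__END__';   # leaves both loops at once
        push @seen, $word;
    }
}

print "@seen\n";   # alpha beta gamma
```

A plain "last" without the label would only leave the WORD loop, so "delta" would be skipped but "epsilon" would still be collected.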
There's an additional reason you might want to consider:
Last does more than just loop control.
sub hello {
    my ( $arg ) = @_;
    scope: {
        foo();
        bar();
        last if $arg > 4;
        baz();
        quux();
    }
}
Last as such is a general flow control mechanism not limited to loops. While you can, of course, generalise the above as a loop that runs at most once, the absence of a loop to me indicates "Break? What are we breaking out of?"
Instead, I think of "last" as "Jump to the position of the last brace", which is for this purpose, more semantically sensible.
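To see the bare-block behaviour in action, here is a minimal sketch. The helper steps_run() and the step names it records are hypothetical; they just make it visible which statements execute before "last" jumps past the closing brace.

```perl
use strict;
use warnings;

# Hypothetical helper: records which steps run before "last" leaves the block.
sub steps_run {
    my ($arg) = @_;
    my @ran;
    SCOPE: {
        push @ran, 'foo';
        push @ran, 'bar';
        last SCOPE if $arg > 4;   # jump past the closing brace of SCOPE
        push @ran, 'baz';
    }
    push @ran, 'after';           # always reached
    return join ' ', @ran;
}

print steps_run(1), "\n";   # foo bar baz after
print steps_run(5), "\n";   # foo bar after
```

A bare block counts as a loop that executes once, which is why last (and next) work inside it.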
I was asking the same question to Damian Conway about say. Perl 6 will introduce say, which is nothing more than print that automatically adds a newline. My question was why not simply use echo, because this is what echo does in Bash (and probably elsewhere).
His answer was: echo is 33% longer than say.
He has a point there. :)
Because it goes to the last of the loop. And because Larry Wall was a weird guy.