In perl passing a parameter to a script using a required command - perl

I have two file tmp.pl and tmp2.pl. I want to call tmp2.pl with a require command but also send a parameter. Is there a better way of doing this?
tmp.pl
require "tmp2.pl" "passed parameter";
tmp2.pl
print #_;

As far as I know, require cannot be used to send a parameter. But that's a good thing, I think, because I cannot think of a reason why you should want to. Looks to me that your design is wrong.
tmp2.pl should be either:
a independent perl program, which you should run with system or qx()
a module, with optional exported tags etc.
a package which defines a class
but that's just my idea....

There's probably a better way to accomplish whatever it is you're trying to do, but you could achieve your current sub goal with something like
{
local #_ = ("passed parameter");
require "tmp2.pl";
}
I might consider this idiom in a place where I wanted to run a perl script from within a perl script. That is, I could say
{
local #ARGV = ("foo","bar");
require "my_script.pl";
}
instead of
system("perl","my_script.pl","foo","bar");
(There are plenty of subtle and not-so-subtle differences between these two calls, so a lot depends on what "features" of these calls you need)

Yes, I think this should work. Use "require" to include the script. After that you can pass the parameter by calling the sub function. The modified script can be
require "tmp2.pl" ;
subfunc(parameter);
print #_;

Supposing that you want to execute other program and capture its output I'm partial to using the IPC::Run module.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Run qw( run );
run( [ './tmp2.pl', #ARGV ], \'', \my $out, \my $err );
print "out: $out\n";
print "err: $err\n";

Related

Anyway to tell if perl script is run via do?

I have a small script configuration file which is loaded from a main script.
#main.pl
package MYPACKAGE;
our $isMaster=1;
package main;
my config=do "./config.pl"
#config.pl
my $runViaDoFlag;
$runViaDoFlag=$0=~/main\.pl/; #Test if main.pl is the executing script
$runViaDoFlag=defined $MYPACKAGE::isMaster; #Test custom package variable
#Is there a 'built-in' way to do this?
die "Need to run from main script! " unless $runViaDoFlag;
{
options={
option1=>"some value",
option2=>"some value",
},
mySub=>sub {
# do interesting things here
}
}
In a more complicated config file it might not be so obvious that config.pl script is intended to only be executed by do. Hence I want to include a die with basic usage instructions.
Solutions:
test $0 for the main script name
have custom package variable defined in the main script and checked by the config script
simply have a comment in the config instructing the user how to use it.
These work, however is there some way of knowing if a script is executed via do via built-in variable/subs?
I'd offer a change in design: have that configuration in a normal module, in which you can then test whether it's been loaded by (out of) the main:: namespace or not. Then there is no need for any of that acrobatics with control variables etc.
One way to do that
use warnings;
use strict;
use feature 'say';
use FindBin qw($RealBin);
use lib $RealBin; # so to be able to 'use' from current directory
use ConfigPackage qw(load_config);
my $config = load_config();
# ...
and the ConfigPackage.pm (in the same directory)
package ConfigPackage;
use warnings;
use strict;
use feature 'say';
use Carp;
use Exporter qw(); # want our own import
our #EXPORT_OK = qw(load_config);
sub import {
#say "Loaded by: ", (caller)[0];
croak "Must only be loaded from 'main::'"
if not ( (caller)[0] eq 'main' );
# Now switch to Exporter::import to export symbols as needed
goto &Exporter::import;
}
sub load_config {
# ...
return 'Some config-related data structure';
}
1;
(Note that this use of goto is fine.)
This is just a sketch of course; adjust, develop further, and amend as needed. If this is lodaed out of a package other than main::, and so it fails, then that happens in the compile phase, since that's when import is called. I'd consider that a good thing.
If that config code need be able to run as well, as the question may indicate, then have a separate executable that loads this module and runs what need be run.
As for the question as stated, the title and the question's (apparent) quest differ a little, but both can be treated by using caller EXPR. It won't be a clean little "built-in" invocation though.
The thing about do as intended to be used is that
do './stat.pl' is largely like
eval `cat stat.pl`;
except that ...
(That stat.pl is introduced earlier in docs, merely to signify that do is invoked on a file.)
Then caller(0) will have clear hints to offer (see docs). It returns
my ($package, $filename, $line, $subroutine, $hasargs,
$wantarray, $evaltext, $is_require, $hints, $bitmask, $hinthash)
= caller($i);
In a call asked for, do './config.pl', apart from main (package) and the correct filename, the caller(0) in config.pl also returns:
(eval) for $subroutine
./config.pl for $evaltext
1 for $is_require
Altogether this gives plenty to decide whether the call was made as required.
However, I would not recommend this kind of involved analysis instead of just using a package, what is also incomparably more flexible.

Perl - New definition of myprint() or Overload print command

I am a newb to Perl. I am writing some scripts and want to define my own print called myprint() which will print the stuff passed to it based on some flags (verbose/debug flag)
open(FD, "> /tmp/abc.txt") or die "Cannot create abc.txt file";
print FD "---Production Data---\n";
myprint "Hello - This is only a comment - debug data";
Can someone please help me with some sample code to for myprint() function?
Do you care more about writing your own logging system, or do you want to know how to put logging statements in appropriate parts of your program which you can turn off (and, incur little performance penalty when they are turned off)?
If you want a logging system that is easy to start using, but also offers a world of features which you can incrementally discover and use, Log::Log4perl is a good option. It has an easy mode, which allows you to specify the desired logging level, and emits only those logging messages that are above the desired level.
#!/usr/bin/env perl
use strict; use warnings;
use File::Temp qw(tempfile);
use Log::Log4perl qw(:easy);
Log::Log4perl->easy_init({level => $INFO});
my ($fh, $filename) = tempfile;
print $fh "---Production Data---\n";
WARN 'Wrote something somewhere somehow';
The snippet also shows a better way of opening a temporary file using File::Temp.
As for overriding the built-in print … It really isn't a good idea to fiddle with built-ins except in very specific circumstances. perldoc perlsub has a section on Overriding Built-in Functions. The accepted answer to this question lists the Perl built-ins that cannot be overridden. print is one of those.
But, then, one really does not need to override a built-in to write a logging system.
So, if an already-written logging system does not do it for you, you really seem to be asking "how do I write a function that prints stuff conditionally depending on the value of a flag?"
Here is one way:
#!/usr/bin/env perl
package My::Logger;
{
use strict; use warnings;
use Sub::Exporter -setup => {
exports => [
DEBUG => sub {
return sub {} unless $ENV{MYDEBUG};
return sub { print 'DEBUG: ' => #_ };
},
]
};
}
package main;
use strict; use warnings;
# You'd replace this with use My::Logger qw(DEBUG) if you put My::Logger
# in My/Logger.pm somewhere in your #INC
BEGIN {
My::Logger->import('DEBUG');
}
sub nicefunc {
print "Hello World!\n";
DEBUG("Isn't this a nice function?\n");
return;
}
nicefunc();
Sample usage:
$ ./yy.pl
Hello World!
$ MYDEBUG=1 ./yy.pl
Hello World!
DEBUG: Isn't this a nice function?
I wasn't going to answer this because Sinan already has the answer I'd recommend, but tonight I also happened to be working on the "Filehandle References" chapter to the upcoming Intermediate Perl. That are a couple of relevant paragraphs which I'll just copy directly without adapting them to your question:
IO::Null and IO::Interactive
Sometimes we don't want to send our output anywhere, but we are forced
to send it somewhere. In that case, we can use IO::Null to create
a filehandle that simply discards anything that we give it. It looks
and acts just like a filehandle, but does nothing:
use IO::Null;
my $null_fh = IO::Null->new;
some_printing_thing( $null_fh, #args );
Other times, we want output in some cases but not in others. If we are
logged in and running our program in our terminal, we probably want to
see lots of output. However, if we schedule the job through cron, we
probably don't care so much about the output as long as it does the job.
The IO::Interactive module is smart enough to tell the difference:
use IO::Interactive;
print { is_interactive } 'Bamboo car frame';
The is_interactive subroutine returns a filehandle. Since the
call to the subroutine is not a simple scalar variable, we surround
it with braces to tell Perl that it's the filehandle.
Now that you know about "do nothing" filehandles, you can replace some
ugly code that everyone tends to write. In some cases you want output
and in some cases you don't, so many people use a post-expression
conditional to turn off a statement in some cases:
print STDOUT "Hey, the radio's not working!" if $Debug;
Instead of that, you can assign different values to $debug_fh based
on whatever condition you want, then leave off the ugly if $Debug
at the end of every print:
use IO::Null;
my $debug_fh = $Debug ? *STDOUT : IO::Null->new;
$debug_fh->print( "Hey, the radio's not working!" );
The magic behind IO::Null might give a warning about "print() on
unopened filehandle GLOB" with the indirect object notation (e.g.
print $debug_fh) even though it works just fine. We don't get that
warning with the direct form.

Is there a module that searches for superfluous code?

Is there a module, which can find code not needed?
As an example a script with code not needed to run the script:
#!/usr/bin/env perl
use warnings;
use 5.12.0;
use utf8;
binmode STDOUT, ':utf8';
use DateTime;
use WWW::Mechanize;
sub my_print {
my ( $string, $tab, $color ) = #_;
say $string;
}
sub check {
my $string = shift;
return if length $string > 10;
return $string;
}
my_print( 'Hello World' );
Not categorically. Perl is notoriously difficult to analyze without actually executing, to the point that compiling a Perl program to be run later actually requires including a copy of the perl interpreter! As a result there are very few code analysis tools for Perl. What you can do is use a profiler, but this is a bit overkill (and as I mentioned, requires actually executing the program. I like Devel::NYTProf. This will spit out some HTML files showing how many times eaqch line or sub was executed, as well as how much time was spent there, but this only works for that specific execution of the program. It will allow you to see that WWW::Mechanize is loaded but never called, but it will not be able to tell you if warnings or binmode had any effect on execution.
Devel::Cover provides code coverage metrics that may be of some use here.

Subroutines vs scripts in Perl

I'm fairly new to Perl and was wondering what the best practices regarding subroutines are with Perl. Can a subroutine be too big?
I'm working on a script right now, and it might need to call another script. Should I just integrate the old script into the new one in the form of a subroutine? I need to pass one argument to the script and need one return value.
I'm guessing I'd have to do some sort of black magic to get the output from the original script, so subroutine-ing it makes sense right?
Avoiding "black magic" is always a good idea when writing code. You never want to jump through hoops and come up with an unintuitive hack to solve a problem, especially if that code needs to be supported later. It happens, admittedly, and we're all guilty of it. Circumstances can weigh heavily on "just getting the darn thing to work."
The point is, the best practice is always to make the code clean and understandable. Remember, and this is especially true with Perl code in my experience, any code you wrote yourself more than a few months ago may as well have been written by someone else. So even if you're the only one who needs to support it, do yourself a favor and make it easy to read.
Don't cling to broad sweeping ideas like "favor more files over larger files" or "favor smaller methods/subroutines over larger ones" etc. Those are good guidelines to be sure, but apply the spirit of the guideline rather than the letter of it. Keep the code clean, understandable, and maintainable. If that means the occasional large file or large method/subroutine, so be it. As long as it makes sense.
A key design goal is separation of concerns. Ideally, each subroutine performs a single well-defined task. In this light, the main question revolves not around a subroutine's size but its focus. If your program requires multiple tasks, that implies multiple subroutines.
In more complex scenarios, you may end up with groups of subroutines that logically belong together. They can be organized into libraries or, even better, modules. If possible, you want to avoid a scenario where you end up with multiple scripts that need to communicate with each other, because the usual mechanism for one script to return data to another script is tedious: the first script writes to standard output and the second script must parse that output.
Several years ago I started work at a job requiring that I build a large number of command-line scripts (at least, that's how it turned out; in the beginning, it wasn't clear what we were building). I was quite inexperienced at the time and did not organize the code very well. In hindsight, I should have worked from the premise that I was writing modules rather than scripts. In other words, the real work would have been done by modules, and the scripts (the code executed by a user on the command line) would have remained very small front-ends to invoke the modules in various ways. This would have facilitated code reuse and all of that good stuff. Live and learn, right?
Another option that hasn't been mentioned yet for reusing the code in your scripts is to put common code in a module. If you put shared subroutines into a module or modules, you can keep your scripts short and focussed on what they do that is special, while isolating the common code in a easy to access and reuse form.
For example, here is a module with a few subroutines. Put this in a file called MyModule.pm:
package MyModule;
# Always do this:
use strict;
use warnings;
use IO::Handle; # For OOP filehandle stuff.
use Exporter qw(import); # This lets us export subroutines to other scripts.
# These may be exported.
our #EXPORT_OK = qw( gather_data_from_fh open_data_file );
# Automatically export everything allowed.
# Generally best to leave empty, but in some cases it makes
# sense to export a small number of subroutines automatically.
our #EXPORT = #EXPORT_OK;
# Array of directories to search for files.
our #SEARCH_PATH;
# Parse the contents of a IO::Handle object and return structured data
sub gather_data_from_fh {
my $fh = shift;
my %data;
while( my $line = $fh->readline );
# Parse the line
chomp $line;
my ($key, #values) = split $line;
$data{$key} = \#values;
}
return \%data;
}
# Search a list of directories for a file with a matching name.
# Open it and return a handle if found.
# Die otherwise
sub open_data_file {
my $file_name = shift;
for my $path ( #SEARCH_PATH, '.' ) {
my $file_path = "$path/$file_name";
next unless -e $file_path;
open my $fh, '<', $file_path
or die "Error opening '$file_path' - $!\n"
return $fh;
}
die "No matching file found in path\n";
}
1; # Need to have trailing TRUE value at end of module.
Now in script A, we take a filename to search for and process and then print formatted output:
use strict;
use warnings;
use MyModule;
# Configure which directories to search
#MyModule::SEARCH_PATH = qw( /foo/foo/rah /bar/bar/bar /eeenie/meenie/mynie/moe );
#get file name from args.
my $name = shift;
my $fh = open_data_file($name);
my $data = gather_data_from_fh($fh);
for my $key ( sort keys %$data ) {
print "$key -> ", join ', ', #{$data->{$key}};
print "\n";
}
Script B, searches for a file, parses it and then writes the parsed data structure into a YAML file.
use strict;
use warnings;
use MyModule;
use YAML qw( DumpFile );
# Configure which directories to search
#MyModule::SEARCH_PATH = qw( /da/da/da/dum /tutti/frutti/unruly /cheese/burger );
#get file names from args.
my $infile = shift;
my $outfile = shift;
my $fh = open_data_file($infile);
my $data = gather_data_from_fh($fh);
DumpFile( $outfile, $data );
Some related documentation:
perlmod - About Perl modules in general
perlmodstyle - Perl module style guide; this has very useful info.
perlnewmod - Starting a new module
Exporter - The module used to export functions in the sample code
use - the perlfunc article on use.
Some of these docs assume you will be sharing your code on CPAN. If you won't be publishing to CPAN, simply ignore the parts about signing up and uploading code.
Even if you aren't writing for CPAN, it is beneficial to use the standard tools and CPAN file structure for your module development. Following the standard allows you to use all of the tools CPAN authors use to simplify the development, testing and installation process.
I know that all this seems really complicated, but the standard tools make each step easy. Even adding unit tests to your module distribution is easy thanks to the great tools available. The payoff is huge, and well worth the time you will invest.
Sometimes it makes sense to have a separate script, sometimes it doesn't. The "black magic" isn't that complicated.
#!/usr/bin/perl
# square.pl
use strict;
use warnings;
my $input = shift;
print $input ** 2;
#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
my ($from, $to) = #ARGV;
my $sum;
for my $num ( $from .. $to ) {
$sum += `square.pl $num` // die "square.pl failed: $? $!";
}
print $sum, "\n";
Easier and better error reporting on failure is automatic with IPC::System::Simple:
#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
use IPC::System::Simple 'capture';
my ($from, $to) = #ARGV;
my $sum;
for my $num ( $from .. $to ) {
$sum += capture( "square.pl $num" );
}
print $sum, "\n";

How can I hook into Perl's print?

Here's a scenario. You have a large amount of legacy scripts, all using a common library. Said scripts use the 'print' statement for diagnostic output. No changes are allowed to the scripts - they range far and wide, have their approvals, and have long since left the fruitful valleys of oversight and control.
Now a new need has arrived: logging must now be added to the library. This must be done automatically and transparently, without users of the standard library needing to change their scripts. Common library methods can simply have logging calls added to them; that's the easy part. The hard part lies in the fact that diagnostic output from these scripts were always displayed using the 'print' statement. This diagnostic output must be stored, but just as importantly, processed.
As an example of this processing, the library should only record the printed lines that contain the words 'warning', 'error', 'notice', or 'attention'. The below Extremely Trivial and Contrived Example Code (tm) would record some of said output:
sub CheckPrintOutput
{
my #output = #_; # args passed to print eventually find their way here.
foreach my $value (#output) {
Log->log($value) if $value =~ /warning|error|notice|attention/i;
}
}
(I'd like to avoid such issues as 'what should actually be logged', 'print shouldn't be used for diagnostics', 'perl sucks', or 'this example has the flaws x y and z'...this is greatly simplified for brevity and clarity. )
The basic problem comes down to capturing and processing data passed to print (or any perl builtin, along those lines of reasoning). Is it possible? Is there any way to do it cleanly? Are there any logging modules that have hooks to let you do it? Or is it something that should be avoided like the plague, and I should give up on ever capturing and processing the printed output?
Additional: This must run cross-platform - windows and *nix alike. The process of running the scripts must remain the same, as must the output from the script.
Additional additional: An interesting suggestion made in the comments of codelogic's answer:
You can subclass http://perldoc.perl.org/IO/Handle.html and create your
own file handle which will do the logging work. – Kamil Kisiel
This might do it, with two caveats:
1) I'd need a way to export this functionality to anyone who uses the common library. It would have to apply automatically to STDOUT and probably STDERR too.
2) the IO::Handle documentation says that you can't subclass it, and my attempts so far have been fruitless. Is there anything special needed to make sublclassing IO::Handle work? The standard 'use base 'IO::Handle' and then overriding the new/print methods seem to do nothing.
Final edit: Looks like IO::Handle is a dead end, but Tie::Handle may do it. Thanks for all the suggestions; they're all really good. I'm going to give the Tie::Handle route a try. If it causes problems I'll be back!
Addendum: Note that after working with this a bit, I found that Tie::Handle will work, if you don't do anything tricky. If you use any of the features of IO::Handle with your tied STDOUT or STDERR, it's basically a crapshoot to get them working reliably - I could not find a way to get the autoflush method of IO::Handle to work on my tied handle. If I enabled autoflush before I tied the handle it would work. If that works for you, the Tie::Handle route may be acceptable.
There are a number of built-ins that you can override (see perlsub). However, print is one of the built-ins that doesn't work this way. The difficulties of overriding print are detailed at this perlmonk's thread.
However, you can
Create a package
Tie a handle
Select this handle.
Now, a couple of people have given the basic framework, but it works out kind of like this:
package IO::Override;
use base qw<Tie::Handle>;
use Symbol qw<geniosym>;
sub TIEHANDLE { return bless geniosym, __PACKAGE__ }
sub PRINT {
shift;
# You can do pretty much anything you want here.
# And it's printing to what was STDOUT at the start.
#
print $OLD_STDOUT join( '', 'NOTICE: ', #_ );
}
tie *PRINTOUT, 'IO::Override';
our $OLD_STDOUT = select( *PRINTOUT );
You can override printf in the same manner:
sub PRINTF {
shift;
# You can do pretty much anything you want here.
# And it's printing to what was STDOUT at the start.
#
my $format = shift;
print $OLD_STDOUT join( '', 'NOTICE: ', sprintf( $format, #_ ));
}
See Tie::Handle for what all you can override of STDOUT's behavior.
You can use Perl's select to redirect STDOUT.
open my $fh, ">log.txt";
print "test1\n";
my $current_fh = select $fh;
print "test2\n";
select $current_fh;
print "test3\n";
The file handle could be anything, even a pipe to another process that post processes your log messages.
PerlIO::tee in the PerlIO::Util module seems to allows you to 'tee' the output of a file handle to multiple destinations (e.g. log processor and STDOUT).
Lots of choices. Use select() to change the filehandle that print defaults to. Or tie STDOUT. Or reopen it. Or apply an IO layer to it.
This isn't the answer to your issue but you should be able to adopt the logic for your own use. If not, maybe someone else will find it useful.
Catching malformed headers before they happen...
package PsychicSTDOUT;
use strict;
my $c = 0;
my $malformed_header = 0;
open(TRUE_STDOUT, '>', '/dev/stdout');
tie *STDOUT, __PACKAGE__, (*STDOUT);
sub TIEHANDLE {
my $class = shift;
my $handles = [#_];
bless $handles, $class;
return $handles;
}
sub PRINT {
my $class = shift;
if (!$c++ && #_[0] !~ /^content-type/i) {
my (undef, $file, $line) = caller;
print STDERR "Missing content-type in $file at line $line!!\n";
$malformed_header = 1;
}
return 0 if ($malformed_header);
return print TRUE_STDOUT #_;
}
1;
usage:
use PsychicSTDOUT;
print "content-type: text/html\n\n"; #try commenting out this line
print "<html>\n";
print "</html>\n";
You could run the script from a wrapper script that captures the original script's stdout and writes the output somewhere sensible.