Remove non-digit characters perl - perl

I have a file which has multiple quotes like the one below:
<verse-no>quote</verse-no>
<quote-verse>1:26,27 Man Created to Continually Develop</quote-verse>
<quote>When Adam came from the Creator’s hand, he bore, in his physical, mental, and
spiritual nature, a likeness to his Maker. “God created man in His own image”
(Genesis 1:27), and it was His purpose that the longer man lived the more fully
he should reveal this image—the more fully reflect the glory of the Creator. All
his faculties were capable of development; their capacity and vigor were
continually to increase. Ed 15
</quote>
I want to remove all strings from <quote-verse>.....</quote-verse> line so that the end result will be <quote>1:26,27</quote>.
I have tried perl -pi.bak -e 's#\D*$<\/quote-verse>#<\/quote-verse>#g' file.txt
This does nothing. I am a beginner in perl (self taught) with less than 10 days experience. Please tell me what's wrong and how to proceed.

You have XML. Therefore you want an XML parser. XML::Twig is a good one.
The reason there's a lot of people saying 'don't use regular expressions to parse XML' is because whilst it does work in a limited scope. But XML is a specification, and certain things are valid, some are not. If you make code that's built on assumptions that aren't always true, what you end up with is brittle code - code that will break one day without warning, if someone alters their perfectly valid XML into a slightly different but still perfectly valid XML.
So with that in mind:
This works:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
sub quote_verse_handler {
my ( $twig, $quote ) = #_;
my $text = $quote->text;
$text =~ s/(\d)\D+$/$1/;
$quote->set_text($text);
}
my $parser = XML::Twig->new(
twig_handlers => { 'quote-verse' => \&quote_verse_handler },
pretty_print => 'indented'
);
#$parser -> parsefile ( 'your_file.xml' );
local $/;
$parser->parse(<DATA>);
$parser->print;
__DATA__
<xml>
<verse-no>quote</verse-no>
<quote-verse>1:26,27 Man Created to Continually Develop</quote-verse>
<quote>When Adam came from the Creator's hand, he bore, in his physical, mental, and
spiritual nature, a likeness to his Maker. "God created man in His own image"
(Genesis 1:27), and it was His purpose that the longer man lived the more fully
he should reveal this image-the more fully reflect the glory of the Creator. All
his faculties were capable of development; their capacity and vigor were
continually to increase. Ed 15
</quote>
</xml>
What this does is - run through your file. Each time it encounters a section quote-verse it calls the handler, and gives it 'that bit' of the XML to do stuff with. We apply a regular expression, to chop off the trailing bit of the line, and then update the XML accordingly.
Once parse is finished, we spit out the finished product.
You'll probably want to replace:
local $/;
$parser -> parse ( <DATA> );
with:
$parser -> parsefile ( 'your_file_name' );
You may also find:
$parser -> print_to_file( 'output_filename' );
to be useful.

Related

perl - two stage conditional compilation

I have pretty big perl script executed quite frequently (from cron).
Most executions require pretty short & simple tests.
How to split single file script into two parts with "part two" compiled based on "part 1" decision?
Considered solution:
using BEGIN{ …; exit if …; } block for trivial test.
two file solution with file_1 using require to compile&execute file_2.
I would prefer single file solution to ease maintenance if the cost is reasonable.
First, you should measure how long the compilation really takes, to see if this "optimization" is even necessary. If it does happen to be, then since you said you'd prefer a one-file solution, one possible solution is using the __DATA__ section for code like so:
use warnings;
use strict;
# measure compliation and execution time
use Time::HiRes qw/ gettimeofday tv_interval /;
my $start;
BEGIN { $start = [gettimeofday] }
INIT { printf "%.06f\n", tv_interval($start) }
END { printf "%.06f\n", tv_interval($start) }
my $condition = 1; # dummy for testing
# conditionally compile and run the code in the DATA section
if ($condition) {
eval do { local $/; <DATA>.'; 1' } or die $#;
}
__DATA__
# ... lots of code here ...
I see two ways of achieving what you want. The simple one would be to divide the script in two parts. The first part will do the simple tests. Then, if you need to do more complicated tests you may "add" the second part. The way to do this is using eval like this:
<first-script.pl>
...
eval `cat second-script.pl`;
if( $# ) {
print STDERR $#, "\n";
die "Errors in the second script.\n";
}
Or using File::Slurp in a more robust way:
eval read_file("second-script.pl", binmode => ':utf8');
Or following #amon suggestion and do:
do "second-script.pl";
Only beware that do is different from eval in this way:
It also differs in that code evaluated with do FILE cannot see lexicals in the enclosing scope; eval STRING does. It's the same, however, in that it does reparse the file every time you call it, so you probably don't want to do this inside a loop.
The eval will execute in the context of the first script, so any variables or initializations will be available to that code.
Related to this, there is this question: Best way to add dynamic code to a perl application, which I asked some time ago (and answered myself with the help of the comments provided and some research.) I took some time to document everything I could think of for anyone (and myself) to refer to.
The second way I see would be to turn your testing script into a daemon and have the crontab bit call this daemon as necessary. The daemon remains alive so any data structures that you may need will remain in memory. On the down side, this will take resources in a continuos way as the daemon process will always be running.

PERL: String Replacement on file

I am working on a script to do a string replacement in a file and I will read the variables and values and files from a configuration file and do string replacement.
Here is my logic to do a string replacement.
sub expansion($$$){
my $f = shift(#_) ; # file Name
my $vname = shift(#_) ; # variable name for pattern match
my $value = shift(#_) ; # value to replace
my $n = "$f".".new";
open ( O, "<$f") or print( "Can't open $f file: $!");
open ( N ,">$n" ) or print( "Can't open $n file: $!");
while (<O>)
{
$_ =~ s/$vname/$value/g; #check for pattern
print N "$_" ;
}
close (O);
close (N);
}
In my logic am reading line by line in from input file ($f) for the pattern and writing to a new file ($n) .
Instead of write to a new file is there any way to do a string replacement the original file when I try to do the same it has only empty file with no contents.
Do not. Never, ever1. Don't you dare, Don't even think of, do not use subroutine prototyping. It is horribly broken (that is, it doesn't do what you think it does) and is dangerous.
Now, we got that out of the way:
Yes, you can do what you want. You can open a file as both read and writable by using the mode <+. So far, so good.
However, due to buffering, you cannot use the standard read and write methods to read and write to the file. Instead, you need to use sysread and syswrite.
Then, what you need to do is read the line, use sysseek to go back to the start of where you read, and then write to that spot.
Not only is it very complex to do, but it is full of peril. Let's take a simple example. I have a document, and I want to replace my curly quotes with straight quotes.
$line =~ s/“|”/"/g;
That should work. I'm replacing one character with another. What could go wrong?
If this is a UTF-8 file (what Macs and Linux systems use by default), those curly quotes are two-byte characters and that straight quote is a single byte character. I would be writing back a line that was shorter than the line I read in. My buffer is going to be off.
Back in the days when computer memory and storage were measured in kilobytes, and you serial devices like reel-to-reel tapes, this type of operation was quite common. However, in this age where storage is vast, it's simply not worth the complexity and error prone process that this entails. Stick with reading from one file, and writing to another. Then use unlink and rename to delete the original and to rename the copy to the original's name.
A few more pointers:
Don't print if the file can't be opened. Use die. Otherwise, your program will simply continue on blithely unaware that it is not working. Even better, use the pragma use autodie;, and you won't have to worry about testing whether or not a read/write failed.
Use scalars for file handles.
That is instead of
open OUT, ">my_file.txt";
use
open my $out_fh, ">my_file.txt";
And, it is highly recommended to use the three parameter open:
Use
open my $out_fh, ">", "my_file.txt";
If you aren't, always add use strict; and use warnings;.
In fact, your Perl syntax is a bit ancient. You need to get a book on Modern Perl. Perl originally was written as a hack language to replace shell and awk programming. However, Perl has morphed into a full fledge language that can handle complex data types, object orientation, and large projects. Learning the modern syntax of Perl will help you find errors, and become a better developer.
1. Like all rules, this can be broken, but only if you have a clear and careful understanding what is going on. It's like those shows that say "Don't do this at home. We're professionals."
sub inplace_expansion($$$){
my $f = shift(#_) ; # file Name
my $vname = shift(#_) ; # variable name for pattern match
my $value = shift(#_) ; # value to replace
local #ARGV = ( $f );
local $^I = '';
while (<>)
{
s/\Q$vname/$value/g; #check for pattern
print;
}
}
or, my preference would run closer to this (basically equivalent, changes mostly in formatting, variable names, etc.):
use English;
sub inplace_expansion {
my ( $filename, $pattern, $replacement ) = #_;
local #ARGV = ( $filename ),
$INPLACE_EDIT = '';
while ( <> ) {
s/\Q$pattern/$replacement/g;
print;
}
}
The trick with local basically simulates a command-line script (as one would run with perl -e); for more details, see perldoc perlrun. For more on $^I (aka $INPLACE_EDIT), see perldoc perlvar.
(For the business with \Q (in the s// expression), see perldoc -f quotemeta. This is unrelated to your question, but good to know. Also be aware that passing regex patterns around in variables—as opposed to, e.g., using literal regexes exclusively— can be vulnerable to injection attacks; Perl's built-in taint mode is useful here.)
EDIT: David W. is right about prototypes.

Perl - New definition of myprint() or Overload print command

I am a newb to Perl. I am writing some scripts and want to define my own print called myprint() which will print the stuff passed to it based on some flags (verbose/debug flag)
open(FD, "> /tmp/abc.txt") or die "Cannot create abc.txt file";
print FD "---Production Data---\n";
myprint "Hello - This is only a comment - debug data";
Can someone please help me with some sample code to for myprint() function?
Do you care more about writing your own logging system, or do you want to know how to put logging statements in appropriate parts of your program which you can turn off (and, incur little performance penalty when they are turned off)?
If you want a logging system that is easy to start using, but also offers a world of features which you can incrementally discover and use, Log::Log4perl is a good option. It has an easy mode, which allows you to specify the desired logging level, and emits only those logging messages that are above the desired level.
#!/usr/bin/env perl
use strict; use warnings;
use File::Temp qw(tempfile);
use Log::Log4perl qw(:easy);
Log::Log4perl->easy_init({level => $INFO});
my ($fh, $filename) = tempfile;
print $fh "---Production Data---\n";
WARN 'Wrote something somewhere somehow';
The snippet also shows a better way of opening a temporary file using File::Temp.
As for overriding the built-in print … It really isn't a good idea to fiddle with built-ins except in very specific circumstances. perldoc perlsub has a section on Overriding Built-in Functions. The accepted answer to this question lists the Perl built-ins that cannot be overridden. print is one of those.
But, then, one really does not need to override a built-in to write a logging system.
So, if an already-written logging system does not do it for you, you really seem to be asking "how do I write a function that prints stuff conditionally depending on the value of a flag?"
Here is one way:
#!/usr/bin/env perl
package My::Logger;
{
use strict; use warnings;
use Sub::Exporter -setup => {
exports => [
DEBUG => sub {
return sub {} unless $ENV{MYDEBUG};
return sub { print 'DEBUG: ' => #_ };
},
]
};
}
package main;
use strict; use warnings;
# You'd replace this with use My::Logger qw(DEBUG) if you put My::Logger
# in My/Logger.pm somewhere in your #INC
BEGIN {
My::Logger->import('DEBUG');
}
sub nicefunc {
print "Hello World!\n";
DEBUG("Isn't this a nice function?\n");
return;
}
nicefunc();
Sample usage:
$ ./yy.pl
Hello World!
$ MYDEBUG=1 ./yy.pl
Hello World!
DEBUG: Isn't this a nice function?
I wasn't going to answer this because Sinan already has the answer I'd recommend, but tonight I also happened to be working on the "Filehandle References" chapter to the upcoming Intermediate Perl. That are a couple of relevant paragraphs which I'll just copy directly without adapting them to your question:
IO::Null and IO::Interactive
Sometimes we don't want to send our output anywhere, but we are forced
to send it somewhere. In that case, we can use IO::Null to create
a filehandle that simply discards anything that we give it. It looks
and acts just like a filehandle, but does nothing:
use IO::Null;
my $null_fh = IO::Null->new;
some_printing_thing( $null_fh, #args );
Other times, we want output in some cases but not in others. If we are
logged in and running our program in our terminal, we probably want to
see lots of output. However, if we schedule the job through cron, we
probably don't care so much about the output as long as it does the job.
The IO::Interactive module is smart enough to tell the difference:
use IO::Interactive;
print { is_interactive } 'Bamboo car frame';
The is_interactive subroutine returns a filehandle. Since the
call to the subroutine is not a simple scalar variable, we surround
it with braces to tell Perl that it's the filehandle.
Now that you know about "do nothing" filehandles, you can replace some
ugly code that everyone tends to write. In some cases you want output
and in some cases you don't, so many people use a post-expression
conditional to turn off a statement in some cases:
print STDOUT "Hey, the radio's not working!" if $Debug;
Instead of that, you can assign different values to $debug_fh based
on whatever condition you want, then leave off the ugly if $Debug
at the end of every print:
use IO::Null;
my $debug_fh = $Debug ? *STDOUT : IO::Null->new;
$debug_fh->print( "Hey, the radio's not working!" );
The magic behind IO::Null might give a warning about "print() on
unopened filehandle GLOB" with the indirect object notation (e.g.
print $debug_fh) even though it works just fine. We don't get that
warning with the direct form.

Subroutines vs scripts in Perl

I'm fairly new to Perl and was wondering what the best practices regarding subroutines are with Perl. Can a subroutine be too big?
I'm working on a script right now, and it might need to call another script. Should I just integrate the old script into the new one in the form of a subroutine? I need to pass one argument to the script and need one return value.
I'm guessing I'd have to do some sort of black magic to get the output from the original script, so subroutine-ing it makes sense right?
Avoiding "black magic" is always a good idea when writing code. You never want to jump through hoops and come up with an unintuitive hack to solve a problem, especially if that code needs to be supported later. It happens, admittedly, and we're all guilty of it. Circumstances can weigh heavily on "just getting the darn thing to work."
The point is, the best practice is always to make the code clean and understandable. Remember, and this is especially true with Perl code in my experience, any code you wrote yourself more than a few months ago may as well have been written by someone else. So even if you're the only one who needs to support it, do yourself a favor and make it easy to read.
Don't cling to broad sweeping ideas like "favor more files over larger files" or "favor smaller methods/subroutines over larger ones" etc. Those are good guidelines to be sure, but apply the spirit of the guideline rather than the letter of it. Keep the code clean, understandable, and maintainable. If that means the occasional large file or large method/subroutine, so be it. As long as it makes sense.
A key design goal is separation of concerns. Ideally, each subroutine performs a single well-defined task. In this light, the main question revolves not around a subroutine's size but its focus. If your program requires multiple tasks, that implies multiple subroutines.
In more complex scenarios, you may end up with groups of subroutines that logically belong together. They can be organized into libraries or, even better, modules. If possible, you want to avoid a scenario where you end up with multiple scripts that need to communicate with each other, because the usual mechanism for one script to return data to another script is tedious: the first script writes to standard output and the second script must parse that output.
Several years ago I started work at a job requiring that I build a large number of command-line scripts (at least, that's how it turned out; in the beginning, it wasn't clear what we were building). I was quite inexperienced at the time and did not organize the code very well. In hindsight, I should have worked from the premise that I was writing modules rather than scripts. In other words, the real work would have been done by modules, and the scripts (the code executed by a user on the command line) would have remained very small front-ends to invoke the modules in various ways. This would have facilitated code reuse and all of that good stuff. Live and learn, right?
Another option that hasn't been mentioned yet for reusing the code in your scripts is to put common code in a module. If you put shared subroutines into a module or modules, you can keep your scripts short and focussed on what they do that is special, while isolating the common code in a easy to access and reuse form.
For example, here is a module with a few subroutines. Put this in a file called MyModule.pm:
package MyModule;
# Always do this:
use strict;
use warnings;
use IO::Handle; # For OOP filehandle stuff.
use Exporter qw(import); # This lets us export subroutines to other scripts.
# These may be exported.
our #EXPORT_OK = qw( gather_data_from_fh open_data_file );
# Automatically export everything allowed.
# Generally best to leave empty, but in some cases it makes
# sense to export a small number of subroutines automatically.
our #EXPORT = #EXPORT_OK;
# Array of directories to search for files.
our #SEARCH_PATH;
# Parse the contents of a IO::Handle object and return structured data
sub gather_data_from_fh {
my $fh = shift;
my %data;
while( my $line = $fh->readline );
# Parse the line
chomp $line;
my ($key, #values) = split $line;
$data{$key} = \#values;
}
return \%data;
}
# Search a list of directories for a file with a matching name.
# Open it and return a handle if found.
# Die otherwise
sub open_data_file {
my $file_name = shift;
for my $path ( #SEARCH_PATH, '.' ) {
my $file_path = "$path/$file_name";
next unless -e $file_path;
open my $fh, '<', $file_path
or die "Error opening '$file_path' - $!\n"
return $fh;
}
die "No matching file found in path\n";
}
1; # Need to have trailing TRUE value at end of module.
Now in script A, we take a filename to search for and process and then print formatted output:
use strict;
use warnings;
use MyModule;
# Configure which directories to search
#MyModule::SEARCH_PATH = qw( /foo/foo/rah /bar/bar/bar /eeenie/meenie/mynie/moe );
#get file name from args.
my $name = shift;
my $fh = open_data_file($name);
my $data = gather_data_from_fh($fh);
for my $key ( sort keys %$data ) {
print "$key -> ", join ', ', #{$data->{$key}};
print "\n";
}
Script B, searches for a file, parses it and then writes the parsed data structure into a YAML file.
use strict;
use warnings;
use MyModule;
use YAML qw( DumpFile );
# Configure which directories to search
#MyModule::SEARCH_PATH = qw( /da/da/da/dum /tutti/frutti/unruly /cheese/burger );
#get file names from args.
my $infile = shift;
my $outfile = shift;
my $fh = open_data_file($infile);
my $data = gather_data_from_fh($fh);
DumpFile( $outfile, $data );
Some related documentation:
perlmod - About Perl modules in general
perlmodstyle - Perl module style guide; this has very useful info.
perlnewmod - Starting a new module
Exporter - The module used to export functions in the sample code
use - the perlfunc article on use.
Some of these docs assume you will be sharing your code on CPAN. If you won't be publishing to CPAN, simply ignore the parts about signing up and uploading code.
Even if you aren't writing for CPAN, it is beneficial to use the standard tools and CPAN file structure for your module development. Following the standard allows you to use all of the tools CPAN authors use to simplify the development, testing and installation process.
I know that all this seems really complicated, but the standard tools make each step easy. Even adding unit tests to your module distribution is easy thanks to the great tools available. The payoff is huge, and well worth the time you will invest.
Sometimes it makes sense to have a separate script, sometimes it doesn't. The "black magic" isn't that complicated.
#!/usr/bin/perl
# square.pl
use strict;
use warnings;
my $input = shift;
print $input ** 2;
#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
my ($from, $to) = #ARGV;
my $sum;
for my $num ( $from .. $to ) {
$sum += `square.pl $num` // die "square.pl failed: $? $!";
}
print $sum, "\n";
Easier and better error reporting on failure is automatic with IPC::System::Simple:
#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
use IPC::System::Simple 'capture';
my ($from, $to) = #ARGV;
my $sum;
for my $num ( $from .. $to ) {
$sum += capture( "square.pl $num" );
}
print $sum, "\n";

How can I hook into Perl's print?

Here's a scenario. You have a large amount of legacy scripts, all using a common library. Said scripts use the 'print' statement for diagnostic output. No changes are allowed to the scripts - they range far and wide, have their approvals, and have long since left the fruitful valleys of oversight and control.
Now a new need has arrived: logging must now be added to the library. This must be done automatically and transparently, without users of the standard library needing to change their scripts. Common library methods can simply have logging calls added to them; that's the easy part. The hard part lies in the fact that diagnostic output from these scripts were always displayed using the 'print' statement. This diagnostic output must be stored, but just as importantly, processed.
As an example of this processing, the library should only record the printed lines that contain the words 'warning', 'error', 'notice', or 'attention'. The below Extremely Trivial and Contrived Example Code (tm) would record some of said output:
sub CheckPrintOutput
{
my #output = #_; # args passed to print eventually find their way here.
foreach my $value (#output) {
Log->log($value) if $value =~ /warning|error|notice|attention/i;
}
}
(I'd like to avoid such issues as 'what should actually be logged', 'print shouldn't be used for diagnostics', 'perl sucks', or 'this example has the flaws x y and z'...this is greatly simplified for brevity and clarity. )
The basic problem comes down to capturing and processing data passed to print (or any perl builtin, along those lines of reasoning). Is it possible? Is there any way to do it cleanly? Are there any logging modules that have hooks to let you do it? Or is it something that should be avoided like the plague, and I should give up on ever capturing and processing the printed output?
Additional: This must run cross-platform - windows and *nix alike. The process of running the scripts must remain the same, as must the output from the script.
Additional additional: An interesting suggestion made in the comments of codelogic's answer:
You can subclass http://perldoc.perl.org/IO/Handle.html and create your
own file handle which will do the logging work. – Kamil Kisiel
This might do it, with two caveats:
1) I'd need a way to export this functionality to anyone who uses the common library. It would have to apply automatically to STDOUT and probably STDERR too.
2) the IO::Handle documentation says that you can't subclass it, and my attempts so far have been fruitless. Is there anything special needed to make sublclassing IO::Handle work? The standard 'use base 'IO::Handle' and then overriding the new/print methods seem to do nothing.
Final edit: Looks like IO::Handle is a dead end, but Tie::Handle may do it. Thanks for all the suggestions; they're all really good. I'm going to give the Tie::Handle route a try. If it causes problems I'll be back!
Addendum: Note that after working with this a bit, I found that Tie::Handle will work, if you don't do anything tricky. If you use any of the features of IO::Handle with your tied STDOUT or STDERR, it's basically a crapshoot to get them working reliably - I could not find a way to get the autoflush method of IO::Handle to work on my tied handle. If I enabled autoflush before I tied the handle it would work. If that works for you, the Tie::Handle route may be acceptable.
There are a number of built-ins that you can override (see perlsub). However, print is one of the built-ins that doesn't work this way. The difficulties of overriding print are detailed at this perlmonk's thread.
However, you can
Create a package
Tie a handle
Select this handle.
Now, a couple of people have given the basic framework, but it works out kind of like this:
package IO::Override;
use base qw<Tie::Handle>;
use Symbol qw<geniosym>;
sub TIEHANDLE { return bless geniosym, __PACKAGE__ }
sub PRINT {
shift;
# You can do pretty much anything you want here.
# And it's printing to what was STDOUT at the start.
#
print $OLD_STDOUT join( '', 'NOTICE: ', #_ );
}
tie *PRINTOUT, 'IO::Override';
our $OLD_STDOUT = select( *PRINTOUT );
You can override printf in the same manner:
sub PRINTF {
shift;
# You can do pretty much anything you want here.
# And it's printing to what was STDOUT at the start.
#
my $format = shift;
print $OLD_STDOUT join( '', 'NOTICE: ', sprintf( $format, #_ ));
}
See Tie::Handle for what all you can override of STDOUT's behavior.
You can use Perl's select to redirect STDOUT.
open my $fh, ">log.txt";
print "test1\n";
my $current_fh = select $fh;
print "test2\n";
select $current_fh;
print "test3\n";
The file handle could be anything, even a pipe to another process that post processes your log messages.
PerlIO::tee in the PerlIO::Util module seems to allows you to 'tee' the output of a file handle to multiple destinations (e.g. log processor and STDOUT).
Lots of choices. Use select() to change the filehandle that print defaults to. Or tie STDOUT. Or reopen it. Or apply an IO layer to it.
This isn't the answer to your issue but you should be able to adopt the logic for your own use. If not, maybe someone else will find it useful.
Catching malformed headers before they happen...
package PsychicSTDOUT;
use strict;
my $c = 0;
my $malformed_header = 0;
open(TRUE_STDOUT, '>', '/dev/stdout');
tie *STDOUT, __PACKAGE__, (*STDOUT);
sub TIEHANDLE {
my $class = shift;
my $handles = [#_];
bless $handles, $class;
return $handles;
}
sub PRINT {
my $class = shift;
if (!$c++ && #_[0] !~ /^content-type/i) {
my (undef, $file, $line) = caller;
print STDERR "Missing content-type in $file at line $line!!\n";
$malformed_header = 1;
}
return 0 if ($malformed_header);
return print TRUE_STDOUT #_;
}
1;
usage:
use PsychicSTDOUT;
print "content-type: text/html\n\n"; #try commenting out this line
print "<html>\n";
print "</html>\n";
You could run the script from a wrapper script that captures the original script's stdout and writes the output somewhere sensible.