How to print code embedded in a here-doc to lexical filehandle - perl

I'm trying to create a script that will generate perl code from a template, and I'm having trouble understanding the error being thrown and why my workaround fixes it.
This example is contrived, but it demonstrates the issue:
use strict;
use warnings;
my $name = shift; # from @ARGV
my $file = sprintf "%s.pm", $name;
open my $fh, ">", $file
or die "error: open(>, '$file'): $!";
print $fh << "MODULE";
package $name;
#
# blah blah
#
use strict;
use warnings;
require Exporter;
our \@ISA = qw| Exporter |;
our \@EXPORT = qw| |; # automatic exports
our \@EXPORT_OK = qw| |; # on-demand exports
# CODE
1;
MODULE
close $fh;
When running this script, I get the following error:
$ perl script.pl Foo
Invalid version format (non-numeric data) at script.pl line 11, near "package "
syntax error at script.pl line 11, near "package $name"
BEGIN not safe after errors--compilation aborted at script.pl line 17.
Originally this script was just printing to stdout instead of writing to file -- no errors thrown. After adding the file handling and receiving this error, I then tried to just use a bare filehandle -- again no errors thrown.
So if I merely replace "$fh" with "FH" everywhere, the script works as expected. What is it about the lexical filehandle causing this to choke?

There should be no space after << marking the here document, so
print $fh << "MODULE";
should be
print $fh <<"MODULE";
or more neatly
print $fh <<MODULE;
or perhaps
print $fh (<< "MODULE");
As it is, the << is being treated as a left-shift operator, so Perl goes on to compile the package statement as code. Finding no valid package name, it tries to use $name as a version number and complains because it isn't one.

Perl is an ambiguous language: it's not always clear how a piece of code should be parsed, and in some situations perl has to guess. There's a grammatical ambiguity in
print $fh << "MODULE";
Specifically, the << can be a left shift operator or the start of here-doc.
There are two paths you can follow to address the issue.
You can remove the ambiguity:
print $fh +<< "MODULE";
print $fh (<< "MODULE");
print { $fh } << "MODULE";
$fh->print(<< "MODULE");
You can trick perl into guessing correctly:
print $fh <<"MODULE";
Note that print $fh +<< "MODULE"; introduces an alternate ambiguity. Is + a binary or unary + operator? Thankfully, it's interpreted as a unary-+ as desired.
By the way, <<"MODULE" can be shortened to <<MODULE.
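To make the unambiguous forms concrete, here is a minimal sketch. It writes to an in-memory filehandle rather than a real file so it can be run anywhere; the module name Foo and the variable $buffer are placeholders of mine, not part of the question.

```perl
use strict;
use warnings;
use IO::Handle;    # needed for $fh->print on perls older than 5.14

my $name = 'Foo';  # hypothetical module name

# In-memory filehandle: everything printed to $fh ends up in $buffer.
open(my $fh, '>', \my $buffer) or die "Cannot open in-memory handle: $!";

# Block form of the filehandle, no space after <<: this can only be a here-doc.
print {$fh} <<"MODULE";
package $name;
our \@ISA = qw( Exporter );
1;
MODULE

# Method form: inside the parentheses << starts a term, so the space is harmless.
$fh->print(<< 'RAW');
# $name is not interpolated in a single-quoted here-doc
RAW

close $fh;
print $buffer;
```

The first here-doc interpolates $name, and the second is written out literally because its terminator is single-quoted.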

Related

Problem with here-doc in `print $fh <<'EOF'`: Perl executing the here doc

(According to https://stackoverflow.com/a/17479551/6607497 it should work, but doesn't)
I have some code like this:
use strict;
use warnings;
if (open(my $fh, '>', '/tmp/test')) {
print $fh << 'TAG';
BEGIN {
something;
}
TAG
close($fh);
}
If I leave out $fh (which is a file handle opened for output, BTW), the BEGIN block is output correctly (to STDOUT).
However, when I add $fh, Perl (5.18, 5.26) tries to execute something, which causes an error:
Bareword "something" not allowed while "strict subs" in use at /tmp/heredoc2.pl line 6.
syntax error at /tmp/heredoc2.pl line 9, near "FOO
close"
Execution of /tmp/heredoc2.pl aborted due to compilation errors.
What is wrong?
The details of the problem are interesting (Original Perl was 5.18.2, but using 5.26.1 for the example):
First some code that works without $fh:
#!/usr/bin/perl
use strict;
use warnings;
if (open(my $fh, '>', '/tmp/test')) {
print << 'FOO_BAR';
BEGIN {
something;
}
FOO_BAR
close($fh);
}
perl -c says : /tmp/heredoc.pl syntax OK, but nothing is output!
If I add $fh before <<, I get this error:
Bareword "something" not allowed while "strict subs" in use at /tmp/heredoc.pl line 7.
syntax error at /tmp/heredoc.pl line 10, near "FOO_BAR
close"
/tmp/heredoc.pl had compilation errors.
Finally if I remove the space before 'FOO_BAR', it works:
#!/usr/bin/perl
use strict;
use warnings;
if (open(my $fh, '>', '/tmp/test')) {
print $fh <<'FOO_BAR';
BEGIN {
something;
}
FOO_BAR
close($fh);
}
> perl -c /tmp/heredoc.pl
/tmp/heredoc.pl syntax OK
> perl /tmp/heredoc.pl
> cat /tmp/test
BEGIN {
something;
}
Maybe the true pitfall is the statement in perlop(1):
There may not be a space between the "<<" and the identifier,
unless the identifier is explicitly quoted. (...)

What type is STDOUT, and how do I optionally write to it?

Does STDOUT have a "type"?
printf STDERR ("STDOUT = %s\n", STDOUT);
printf STDERR ("\*STDOUT = %s\n", *STDOUT);
printf STDERR ("\\\*STDOUT = %s\n", \*STDOUT);
Produces:
STDOUT = STDOUT
*STDOUT = *main::STDOUT
\*STDOUT = GLOB(0x600078848)
I understand the *main::STDOUT and GLOB(0x600078848) entries. The "bareword" one leaves me curious.
I'm asking because I want to pass a file handle-like argument to a method call. In 'C', I'd use a file descriptor or a File *. I want it to default to STDOUT. What I've done is:
$OUT_FILE_HANDLE = \*STDOUT;
if (@ARGV > 0) {
open($OUT_FILE_HANDLE, ">", "$ARGV[0]") or die $!;
}
It works, but I don't know exactly what I've done. Have I botched up STDOUT? I suspect I have "ruined" (overwritten) STDOUT, which is NOT what I want.
Please pardon the compound question; they seemed related.
Create a lexical filehandle to be a copy of STDOUT and manipulate that as needed:
sub manip_fh {
my ($fh) = @_;
say $fh "hi"; # goes to STDOUT
open $fh, '>', 'a_file.txt' or die $!; # reopen the same handle; now it writes to a file
say $fh "hello";
}
open my $fh, '>&', STDOUT; # via dup2
manip_fh($fh);
say "hi"; # still goes where STDOUT went before being dup-ed (terminal)
This new, independent, filehandle can then be reopened to another resource without affecting STDOUT. See open.
The $OUT_FILE_HANDLE = \*STDOUT; from the question creates an alias and so the STDOUT does indeed get changed when the "new" one changes. You can see that by printing the typeglob
our $NEW = \*STDOUT; # "our" only for checks here, otherwise better "my"
say *{$main::NEW}; #--> *main::STDOUT
or by printing the IO slot from the symbol table for both
say for *{$main::NEW}{IO}, *{$main::{STDOUT}}{IO};
and seeing (that the object stringifies to) the same (eg IO::File=IO(0x1a8ca50)).
When it's duped using open with mode >& as in the first code snippet (but as global our) it prints *main::NEW, and its IO::File object is not the same as for STDOUT. (Make it a global our so that it is in the symbol table for these checks, but not for real use; it's much better having a my.)
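A quick way to see the same difference without looking at the symbol table is to compare file descriptors. This is only a sketch under the assumption that STDOUT is open when the script starts; the names $alias and $dup are mine.

```perl
use strict;
use warnings;

# Alias: a reference to the very same glob as STDOUT.
my $alias = \*STDOUT;

# Duplicate: a new handle on a new file descriptor for the same output.
open(my $dup, '>&', \*STDOUT) or die "Cannot dup STDOUT: $!";

my $alias_same_fd = fileno($alias) == fileno(STDOUT) ? 1 : 0;
my $dup_same_fd   = fileno($dup)   == fileno(STDOUT) ? 1 : 0;

print "alias shares STDOUT's descriptor: $alias_same_fd\n";
print "dup shares STDOUT's descriptor: $dup_same_fd\n";

close $dup;                      # closes only the copy
print "STDOUT is still open\n";
```

The alias should report the same descriptor as STDOUT and the duplicate a different one, which is why reopening or closing the duplicate leaves STDOUT alone.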
From perlvar:
Perl identifiers that begin with digits or punctuation characters are exempt from the effects of the package declaration and are always forced to be in package main; they are also exempt from strict 'vars' errors. A few other names are also exempt in these ways: [...] STDOUT
So, STDOUT is a global variable containing a pre-opened file handle.
From perlfunc:
If FILEHANDLE is an undefined scalar variable (or array or hash element), a new filehandle is autovivified, meaning that the variable is assigned a reference to a newly allocated anonymous filehandle. Otherwise if FILEHANDLE is an expression, its value is the real filehandle.
Your $OUT_FILE_HANDLE is not undefined, so it is its value, STDOUT, that is being opened. AFAIK, if you open an already open handle, it is implicitly closed first.
There are several ways to do what you want. The first is obvious from the above quote — do not define $OUT_FILE_HANDLE before the open:
if (@ARGV > 0 ) {
open($OUT_FILE_HANDLE, ">", "$ARGV[0]") or die $!;
} else {
$OUT_FILE_HANDLE = \*STDOUT;
}
# do stuff to $OUT_FILE_HANDLE
Another is to use select, so you don't need to pass a file handle:
if (@ARGV > 0 ) {
open($OUT_FILE_HANDLE, ">", "$ARGV[0]") or die $!;
select $OUT_FILE_HANDLE;
}
# do stuff (without specifying a file handle)
select STDOUT;
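If you would rather pass a handle around, the first approach can be wrapped in a small helper. This is a sketch, not anything from the question: output_handle is a hypothetical name, and the temporary file only stands in for $ARGV[0].

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return a handle for $path, or STDOUT when no path is given.
# The scalar is undefined when open is called, so a fresh handle is autovivified
# and STDOUT itself is never reopened.
sub output_handle {
    my ($path) = @_;
    return \*STDOUT unless defined $path;
    open(my $fh, '>', $path) or die "Cannot open '$path': $!";
    return $fh;
}

my $default = output_handle();              # STDOUT itself, untouched
my (undef, $tmp) = tempfile(UNLINK => 1);   # scratch file for the demo
my $file_fh = output_handle($tmp);

print {$file_fh} "to the file\n";
close $file_fh;
print {$default} "to STDOUT\n";
```

Callers only ever see a scalar holding a handle, so the code that prints does not need to know which case it got.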
This part of your question wasn't answered:
The "bareword" one leaves me curious.
An identifier with no other meaning is a string literal that produces itself. For example, foo is the same as 'foo'.
$ perl -e'my $x = foo; print "$x\n";'
foo
This is error-prone, so we use use strict qw( subs ); to prevent this.
$ perl -e'use strict; my $x = foo; print "$x\n";'
Bareword "foo" not allowed while "strict subs" in use at -e line 1.
Execution of -e aborted due to compilation errors.
See this for other meanings Perl could assign.

Creating HTML file via heredoc in Perl fails with Bareword error

I'm trying to figure out how to create a simple HTML file using a heredoc in Perl but I keep getting
Bareword found where operator expected at pscratch.pl line 12, near "<title>Test"
(Missing operator before Test?)
Having no space between pattern and following word is deprecated at pscratch.pl line 13.
syntax error at pscratch.pl line 11, near "head>"
Execution of pscratch.pl aborted due to compilation errors.
I can't figure out what the issue is. This is the script in its entirety:
use strict;
use warnings;
my $fh;
my $file = "/home/msadmin1/bin/testing/html.test";
open($fh, '>', $file) or die "Cannot open $file: \n $!";
print $fh << "EOF";
<html>
<head>
<title>Test</title>
</head>
<body>
<h1>This is a test</h1>
</body>
</html>
EOF
close($fh);
I've tried using both single and double quotes around EOF. I've also tried escaping all of the <> tags which didn't help.
What should I be doing to prevent this error?
EDIT
I know there are modules out there that will simplify this, but I'd like to know what the problem is with this before I simplify the task with a module.
EDIT 2
The error seems to indicate that Perl is looking at the text within the heredoc as a substitution due to the / in the closing tags. If I escape them part of the error goes away regarding space between pattern and following word but the rest of the error remains.
Remove the space between << and "EOF", as it does not interact nicely with printing to a filehandle.
Here are various working/non-working variants:
#!/usr/bin/env perl
use warnings;
use strict;
my $foo = << "EOF";
OK: with space into a variable
EOF
print $foo;
print <<"EOF";
OK: without space into a regular print
EOF
print << "EOF";
OK: with space into a regular print
EOF
open my $fh, ">foo" or die "Unable to open foo : $!";
print $fh <<"EOF";
OK: without space into a filehandle print
EOF
# Show file output
close $fh;
print `cat foo`;
# This croaks
eval '
print $fh << "EOF";
with space into a filehandle print
EOF
';
if ($@) {
print "FAIL: with space into a filehandle print\n"
}
# Throws a bitshift warning:
print "FAIL: space and filehandle means bitshift!\n";
print $fh << "EOF";
print "\n";
Output
OK: with space into a variable
OK: without space into a regular print
OK: with space into a regular print
OK: without space into a filehandle print
FAIL: with space into a filehandle print
FAIL: space and filehandle means bitshift!
Argument "EOF" isn't numeric in left bitshift (<<) at foo.pl line 42.
152549948

Errors reading PDF with CAM-PDF: use of uninitialized value in addition <+> line 667

I'm new to Perl and I'm trying to read a PDF file using CAM::PDF (my code is below).
When I try to run this in the command prompt I get these errors:
"Use of uninitialized value in addition <+> at
C:/Strawberry/perl/site/lib/CAM/PDF.pm line 667 ... substr outside of
str at C:/Strawberry/perl/site/lib/CAM/PDF.pm line 657 ... (at the
end)... "Bad request for object 60 at position 0 in the file Can't
call method "getPageContentTree" on the undefined value at C:...
The weird thing is I have the exact same files and program on a separate computer that runs just fine. It prints everything perfectly where this computer can't.
I've tried reinstalling CAM::PDF and reinstalling cpan. The reinstall actually failed for some reason too. Thanks for the help.
#!/usr/bin/perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::PageText;
#in cmd: courts.pl samplePDF.pdf
my $filename = shift || die "Supply pdf on command line\n";
my $pdf = CAM::PDF->new($filename);
#print text_from_page(1);
my $string = text_from_page(1);
#print $string;
$string =~ s/\b \b//g;
print $string;
open(my $fh, '>', 'reports.txt');
print $fh "$string";
close $fh;
print "done\n";
sub text_from_page {
my $pg_num = shift;
return
CAM::PDF::PageText->render($pdf->getPageContentTree($pg_num));
}

In Perl, how can I handle continuation lines in a configuration file?

So I'm trying to read in a config file in Perl. The config file uses a trailing backslash to indicate a line continuation. For instance, the file might look like this:
=== somefile ===
foo=bar
x=this\
is\
a\
multiline statement.
I have code that reads in the file, and then processes the trailing backslash(es) to concatenate the lines. However, it looks like Perl already did it for me. For instance, the code:
open(fh, 'somefile');
@data = <fh>;
print join('', @data);
prints:
foo=bar
x=thisisamultiline statement
Lo and behold, the '@data = <fh>;' statement appears to have already handled the trailing backslash!
Is this defined behavior in Perl?
I have no idea what you are seeing, but that is not valid Perl code and that is not a behavior in Perl. Here is some Perl code that does what you want:
#!/usr/bin/perl
use strict;
use warnings;
while (my $line = <DATA>) {
#collapse lines that end with \
while ($line =~ s/\\\n//) {
$line .= <DATA>;
}
print $line;
}
__DATA__
foo=bar
x=this\
is\
a\
multiline statement.
Note: If you are typing the file in on the commandline like this:
perl -ple 1 <<!
foo\
bar
baz
!
Then you are seeing the effect of your shell, not Perl. Consider the following counterexample:
printf 'foo\\\nbar\nbaz\n' | perl -ple 1
My ConfigReader::Simple module supports continuation lines in config files, and should handle your config if it's the format in your question.
If you want to see how to do it yourself, check out the source for that module. It's not a lot of code.
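For reference, a do-it-yourself version can be as small as the sketch below. It is not the module's actual implementation; read_logical_lines is a hypothetical helper of mine, and it reads from an in-memory copy of the sample config so it runs without a file.

```perl
use strict;
use warnings;

# Hypothetical helper: read logical lines from a filehandle, joining any
# physical line that ends in a backslash with the line that follows it.
sub read_logical_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        # While the line ends with a backslash, drop it and append the next line.
        while ($line =~ s/\\\z//) {
            my $next = <$fh>;
            last unless defined $next;   # backslash on the last line of the file
            chomp $next;
            $line .= $next;
        }
        push @lines, $line;
    }
    return @lines;
}

# Same content as the sample config in the question.
my $config = "foo=bar\nx=this\\\nis\\\na\\\nmultiline statement.\n";
open(my $in, '<', \$config) or die "Cannot open in-memory file: $!";
my @logical = read_logical_lines($in);
close $in;

print "$_\n" for @logical;
```

Unlike the earlier loop, this one chomps every line, so the caller gets each setting back as one string with no embedded newlines.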
I don't know what exactly you are doing, but the code you gave us doesn't even run:
=> cat z.pl
#!/usr/bin/perl
fh = open('somefile', 'r');
@data = <fh>;
print join('', @data);
=> perl z.pl
Can't modify constant item in scalar assignment at z.pl line 2, near ");"
Execution of z.pl aborted due to compilation errors.
And if I change the snippet to be actual perl:
=> cat z.pl
#!/usr/bin/perl
open my $fh, '<', 'somefile';
my @data = <$fh>;
print join('', @data);
it clearly doesn't mangle the data:
=> perl z.pl
foo=bar
x=this\
is\
a\
multiline statement.