using file handle returned by select - perl

I am pulling my hair out trying to use the file handle returned by select.
The documentation about select reads:
select
Returns the currently selected filehandle.
I have a piece of code that prints some data and is usually executed without any redirection. But there is one use case where select is used to redirect the print output to a file.
In this piece of code, I need to use the currently selected file handle. I tried the following code fragment:
my $fh = select;
print $fh "test\n";
I wrote a short test program to demonstrate my problem:
#!/usr/bin/perl
use strict;
use warnings;
sub test
{
my $fh=select;
print $fh "@_\n";
}
my $oldfh;
# this works :-)
open my $test1, "> test1.txt";
$oldfh = select $test1;
test("test1");
close select $oldfh if defined $oldfh;
#this doesn't work. :-(
# Can't use string ("main::TEST2") as a symbol ref while "strict refs" in use
open TEST2,">test2.txt";
$oldfh = select TEST2;
test("test2");
close select $oldfh if defined $oldfh;
#this doesn't work, too. :-(
# gives Can't use string ("main::STDOUT") as a symbol ref while "strict refs" in use at
test("test");
It seems that select is not returning a reference to the file handle but a string containing the name of the file handle.
What do I have to do to always get a usable file handle from select's return value?
P.S. I need to pass this file handle as OutputFile to XML::Simple's XMLout().

Just use
print XMLout(...);
It seems that select is not returning a reference to the file handle but a string containing the name of the file handle.
It can indeed return a plain ordinary string.
>perl -MDevel::Peek -E"Dump(select())"
SV = PV(0x6cbe38) at 0x260e850
REFCNT = 1
FLAGS = (PADTMP,POK,pPOK)
PV = 0x261ce48 "main::STDOUT"\0
CUR = 12
LEN = 24
But that's perfectly acceptable as a file handle to Perl. There are four things that Perl accepts as file handles:
A reference to an IO object.
>perl -e"my $fh = *STDOUT{IO}; CORE::say($fh 'foo');"
foo
A glob that contains a reference to an IO object.
>perl -e"my $fh = *STDOUT; CORE::say($fh 'foo');"
foo
A reference to a glob that contains a reference to an IO object.
>perl -e"my $fh = \*STDOUT; CORE::say($fh 'foo');"
foo
A "symbolic reference" to a glob that contains a reference to an IO object.
>perl -e"my $fh = 'STDOUT'; CORE::say($fh 'foo');"
foo
This type doesn't work under strict refs, though.
>perl -Mstrict -e"my $fh = 'STDOUT'; CORE::say($fh 'foo');"
Can't use string ("STDOUT") as a symbol ref while "strict refs" in use at -e line 1.
What do I have to do to always get a usable file handle from select's return value?
As demonstrated above, it already returns a perfectly usable file handle. If XMLout doesn't support it, then it's a bug in XMLout. You could work around it as follows:
my $fh = select();
if (!ref($fh) && ref(\$fh) ne 'GLOB') {
no strict qw( refs );
$fh = \*$fh;
}
The same conversion can also be used to make the handle usable under strict refs.
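As a rough sketch of the workaround above, the conversion can be wrapped in a small helper. The name selected_fh is made up for this illustration, and the in-memory file is only there so the example runs without touching the disk:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns the currently
# selected handle as something usable under "strict refs".
sub selected_fh {
    my $fh = select();
    # Already a reference or a glob: nothing to convert.
    return $fh if ref($fh) || ref(\$fh) eq 'GLOB';
    # Otherwise it is a name such as "main::STDOUT"; turn it into a glob reference.
    no strict 'refs';
    return \*{$fh};
}

# Redirect the default output to an in-memory file for the demonstration.
my $out = '';
open my $mem, '>', \$out or die "open failed: $!";
my $old = select($mem);
print { selected_fh() } "to memory\n";
select($old);
close $mem;

# Back on the original default handle (normally STDOUT).
print { selected_fh() } "captured: $out";
```

The block form print { ... } is used because the handle comes from an expression rather than a simple scalar variable.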
As bad as XML::Simple is at reading XML, it's a million times worse at generating it. See Why is XML::Simple Discouraged?.
Consider XML::LibXML or XML::Twig if you're modifying XML.
Consider XML::Writer if you're generating XML.

The point of select is you don't need to specify the handle at all, since it's the default one.
sub test {
print "@_\n";
}
That's also the reason why select isn't recommended: it introduces global state which is hard to track and debug.

First of all, you shouldn't use XML::Simple, because it takes a lot of work to make sure that it generates consistent XML. At least make sure you're using the appropriate ForceArray parameters.
Instead of doing filehandle shenanigans, why don't you use the simpler
print XMLout($data, %options);
... instead of trying to pass a default filehandle around?

Related

Perl: Variable value is 'glob', but should be 'scalar'

I do have the following simple code:
my $TimeZone = $hCache->{'TimeZone'}; # Cache gets filled earlier
my $DateTime = DateTime->now();
$DateTime->set_time_zone($TimeZone);
This code runs in an application server which is basically a long running perl process that accepts incoming network connections.
From time to time this application server somehow gets "dirty", and the code above prints the following error:
The 'name' parameter ("Europe/Berlin") to DateTime::TimeZone::new was
a 'glob', which is not one of the allowed types: scalar at
/srv/epages/eproot/Perl/lib/site_perl/linux/DateTime.pm line 1960.
When I try to debug the variable "$TimeZone" I'm getting no further details.
E.g.
print ref($TimeZone); # prints nothing (scalar?)
print $TimeZone; # prints "Europe/Berlin"
The code works if I'm forcing the timezone to be a string again, like so:
my $TimeZone = $hCache->{'TimeZone'}; # Cache gets filled earlier
my $DateTime = DateTime->now();
$DateTime->set_time_zone($TimeZone."");
My questions are:
If 'glob' is not a reference, how can I debug the variable properly?
How can I create a 'glob' variable? What is the syntax for it? I'm quite sure that my huge codebase has some accidents in it, but I don't know what to search for.
Is there a way to 'monitor' the variable? Basically, getting a stack trace if the variable changes.
How can I create a 'glob' variable?
Glob, short for "typeglob" is a structure (in the C sense of the word) that contains a field for each type of variable that can be found in the symbol table (scalar, array, hash, code, glob, etc). They form the symbol table.
Globs are created by simply mentioning a package variable.
@a = 4..6; # Creates glob *main::a containing a reference to the new array.
Since globs are themselves package variables, you can bring a glob into existence just by mentioning it.
my $x = *glob; # The glob *main::glob is created by this line at compile-time.
Note that file handles are often accessed via globs. For example, open(my $fh, '<', ...) populates $fh with a reference to a glob that contains a reference to an IO.
$fh # Reference to glob that contains a reference to an IO.
*$fh # Glob that contains a reference to an IO.
*$fh{IO} # Reference to an IO.
If 'glob' is not a reference, how can I debug the variable properly?
ref(\$var) will return GLOB for a glob.
$ perl -e'$x = *STDOUT; CORE::say ref(\$x)'
GLOB
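Building on that, here is a small sketch of a debugging helper that tells a glob apart from a reference and from a plain scalar. The name kind_of is invented for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: classify what a scalar currently holds.
sub kind_of {
    my $ref = ref($_[0]);
    return "reference to $ref" if $ref;
    # ref() of a reference to the value reports GLOB for a glob.
    return ref(\$_[0]) eq 'GLOB' ? 'glob' : 'plain scalar';
}

my $plain   = "Europe/Berlin";
my $glob    = *STDOUT;
my $globref = \*STDOUT;

print kind_of($plain),   "\n";   # plain scalar
print kind_of($glob),    "\n";   # glob
print kind_of($globref), "\n";   # reference to GLOB
```

Calling something like this on the cached value right before it is passed on should show whether it is really a string at that point.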
Is there a way to 'monitor' the variable?
Yes. You can add magic to it.
$ perl -e'
use feature qw( say );
use Carp qw( cluck );
use Variable::Magic qw( wizard cast );
my $wiz = wizard(
data => sub { $_[1] },
set => sub { cluck("Variable $_[1] modified"); },
);
my $x;
cast($x, $wiz, q{$x});
$x = 123; # Line 14
'
Variable $x modified at -e line 9.
main::__ANON__(SCALAR(0x50bcee23c0), "\$x") called at -e line 14
eval {...} called at -e line 14
More work is needed to detect if a hash or array changes, but the above can be used to monitor the elements of hashes and arrays.
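If installing Variable::Magic is not an option, a similar effect can be sketched with core Perl's tie mechanism. This is a different technique from the one above, and the package name WatchedScalar is made up for the illustration:

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical tie class: stores the value and reports every assignment.
package WatchedScalar;

sub TIESCALAR {
    my ($class, $name) = @_;
    return bless { name => $name, value => undef }, $class;
}

sub FETCH { return $_[0]{value} }

sub STORE {
    my ($self, $new) = @_;
    # cluck() warns with a full stack trace on STDERR.
    Carp::cluck("Variable $self->{name} modified");
    $self->{value} = $new;
}

package main;

tie my $x, 'WatchedScalar', '$x';
$x = 123;            # triggers STORE and prints a stack trace
print "x is $x\n";
```

Unlike the magic approach, a tied variable changes how the value is stored, so it is better suited to a temporary debugging session than to production code.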

CGI.pm: file handle in global hash lost between subroutines?

I'd like to store a file handle in a global hash and read() from it in a subroutine without exposing the CGI object, but it doesn't work: the output file ends up zero-sized.
Here is the simplified perl code:
#!/usr/local/bin/perl
use CGI;
use vars qw(%in);
&init_hash;
$fname = &process_uploaded_file if($in{'upfile'});
$fsize = -s $fname;
print "Content-Type: text/plain\r\n\r\n";
print "in{'upfile'}=",$in{'upfile'},"\r\n";
print "in{'desc'}=",$in{'desc'},"\r\n";
print "fname=",$fname,"\r\n";
print "fsize=",$fsize,"\r\n";
sub init_hash{
my $q = new CGI;
$in{'desc'} = $q->param('desc');
$in{'upfile'} = $q->param('upfile');
$in{'upfh'} = $q->upload('upfile') if($in{'upfile'});
}
sub process_uploaded_file{
my $fname = time.'.bin';
open(OUT,'>',$fname) || die('open file failed');
while(my $read = read($in{'upfh'}, my $buff, 4096)){
print OUT $buff;
}
close(OUT);
eval { close($in{'upfh'}); };
return $fname;
}
EDIT: I should provide perl and cgi.pm version.
Perl version: This is perl 5, version 12, subversion 2 (v5.12.2) built for MSWin32-x86-multi-thread
(with 8 registered patches, see perl -V for more detail)
$CGI::VERSION='3.50';
There is so much wrong with your code.
First your problem: you are trying to optimize where optimization isn't due. And the temp files of the CGI object are deleted before you actually access them. Your code should work when you extend the lifetime of the CGI object, e.g. by adding it to the %in hash.
Always use strict; use warnings;. There are no excuses.
Global variables are declared with our. The vars pragma is a historical artifact. But please don't use global variables, as they are unnecessary here.
Don't call functions like &foo unless you can tell me what exactly this does. Until you have this knowledge: foo().
Use the header method of the CGI object to write headers: $q->header('text/plain').
The \n may not be what you think it is. Do a binmode STDOUT to remove the :crlf PerlIO layer if it is currently applied. Although equivalent to \r\n, it may be clearer to write \015\012 to demonstrate that you care about the actual bytes.
You can interpolate variables into strings, you know. You can also specify a string that is to be appended after each print by setting $\:
{
local $\ = "\015\012";
print "in{'upfile'}=$in{'upfile'}";
print "in{'desc'}=$in{'desc'}";
print "fname=$fname";
print "fsize=$fsize";
}
Don't use bareword filehandles. Instead of open OUT, ">", $fname you should write open my $outfh, ">", $fname.
Why did you put one close in an eval? I don't see how this should die.

Unquoting a String in Perl

I have a Perl subroutine, findfiles. I have to pass the value "*/*" as a quoted input parameter, since Perl complains without the quotes. Inside the subroutine, on the other hand, I (maybe) need it unquoted. When I print the value to check it, I don't see any quotes or anything else unusual, but maybe there are hidden special characters I don't know about? My code works properly when I use */* directly, but not when I pass it as an input parameter.
Do you have any idea?
sub findfiles {
$dirname=$_[0];
my @temp = grep {-f} <$dirname>;
print @temp;
}
&findfiles("*/*"); # does not work
but
sub findfiles {
$dirname=$_[0];
my @temp = grep {-f} <*/*>;
print @temp;
}
does its job
With your updated code, I can see where your error lies. While
my @temp = grep {-f} <*/*>;
works as a glob,
my @temp = grep {-f} <$dirname>;
is interpreted as a readline() on the file handle $dirname.
If you want to avoid ambiguity you can use the function for glob:
my @temp = grep -f, glob $dirname;
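Here is a self-contained sketch of the glob() version of the subroutine. The temporary directory, the file names and the lexical parameter name $pattern are all invented for this example, and it assumes the temporary path contains no whitespace, since glob splits its pattern on whitespace:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a throwaway directory tree: <root>/sub/a.txt and <root>/sub/b.txt.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir failed: $!";
for my $name (qw(a.txt b.txt)) {
    open my $fh, '>', "$root/sub/$name" or die "open failed: $!";
    close $fh;
}

# Calling glob() explicitly avoids <$var> being read as a readline().
sub findfiles {
    my ($pattern) = @_;
    return grep { -f $_ } glob $pattern;
}

my @found = findfiles("$root/*/*");
print "$_\n" for @found;
```

The pattern is passed as an ordinary quoted string; the quotes are only delimiters and are not part of the value glob receives.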
You might also be interested in using File::Find, which finds files recursively.
NOTE: This problem could have been avoided if you had warnings turned on. As a rule of thumb, coding in perl without using
use strict;
use warnings;
...is a very bad idea. These two pragmas will help you identify problems with your code.
The problem is when I print the value to check ,I don't see any quote
$test="*/*"
      ^ string delimiter
       ^^^ string
          ^ string delimiter
When you print a string (be it from a string literal, a scalar or whatever) you print the string.
The delimiters don't get printed. They just tell perl where the edges of the data are.
Do you know about File::Find?
use File::Find ();
File::Find::find( sub { say $File::Find::name if -f; } => $my_root );
Or what about File::Find::Rule (see file)?
say foreach File::Find::Rule->file->in( $my_root );

How can I redefine 'open' properly in Perl?

Some time ago, I asked a question: How do I redefine built in Perl functions?
And the answers have served me well. I have a package that overrides Perl's 'open' function enabling me to log file access.
Now I've come to a case that breaks the functionality of the original code.
use strict;
use warnings;
use Data::Dumper;
sub myopen (*;@) {
my $p;
my $retval = CORE::open($p, $_[1]);
{
no strict;
*{"main::$_[0]"} = $p;
}
return $retval;
}
BEGIN {
*CORE::GLOBAL::open = *myopen;
};
my @a = (1, 2, 3);
open(CHECK, ">dump") or print "UNABLE TO OPEN DUMPER FILE: $!\n";
print CHECK "test\n";
print CHECK Data::Dumper->Dump(\@a);
close CHECK;
Now I get this message:
Can't locate object method "CHECK" via package "Data::Dumper"
How do I fix it?
Try using a name other than "CHECK".
"CHECK" is a special function which is called during compile time, and you really shouldn't use it.
$ open CHECK , '<', 'foo.txt';
Took 0.00224494934082031 seconds.
Runtime error: Undefined subroutine &Devel::REPL::Plugin::Packages::DefaultScratchpad::CHECK called at (eval 329) line 5.
$ open CHECKS , '<', 'foo.txt';
Took 0.00155806541442871 seconds.
$
More on 'CHECK'
Why that specific error?
perl -MO=Deparse -e 'print CHECK Data::Dumper 1';
print 'Data::Dumper'->CHECK(1);
Also, you're using global file handles, which are problematic.
use this notation:
open my $fh, '<' , $foo ;
print <$fh>;
close $fh;
These are extra beneficial as they self-close when they go out of scope.
Compare:
> perl -MData::Dumper -e'local*_=*STDOUT;print _ Data::Dumper->Dump([2]);'
Can't locate object method "_" via package "Data::Dumper" at -e line 1.
to
> perl -MData::Dumper -e'local*_=*STDOUT;print _ ( Data::Dumper->Dump([2]) );'
$VAR1 = 2;
I used a different name from "STDOUT" because Perl seems to get the indirect object wrong only when the handle is not a built-in one.
This will work without producing the error:
print {*CHECK} Data::Dumper->Dump(\@a);
This stops it from being parsed as indirect object syntax.
However, I recommend steering clear of CHECK and the other specially named code blocks in Perl; using lexical variables for filehandles is the preferred method (see Perl Best Practices).
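Putting both suggestions together, a minimal sketch might look like this. It uses a lexical handle plus the block form, writes to an in-memory file so it can run anywhere, and the variable name 'list' given to Dump is just an illustrative choice:

```perl
use strict;
use warnings;
use Data::Dumper;

my @a = (1, 2, 3);

# In-memory file, so the example does not create anything on disk.
my $dump = '';
open my $fh, '>', \$dump or die "open failed: $!";

# The braces mark $fh unambiguously as the filehandle, so the
# Data::Dumper->Dump(...) call is parsed as an ordinary argument.
print {$fh} Data::Dumper->Dump([\@a], ['list']);
close $fh;

print $dump;
```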

Why can't I say print $somehash{$var}{fh} "foo"?

I have a line of code along the lines of:
print $somehash{$var}{fh} "foo";
The hash contains the filehandle a few levels down. The error is:
String found where operator expected at test.pl line 10, near "} "foo""
I can fix it by doing this:
my $fh = $somehash{$var}{fh};
print $fh "foo";
...but is there a one-liner?
See http://perldoc.perl.org/functions/print.html:
Note that if you're storing FILEHANDLEs in an array, or if you're using any other expression more complex than a scalar variable to retrieve it, you will have to use a block returning the filehandle value instead: ...
So, in your case, you would use a block like this:
print { $somehash{$var}{fh} } "foo";
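For a version you can run as-is, here is a small sketch that stores an in-memory filehandle in a nested hash. The key name 'log' and the buffer variable are made up for the example:

```perl
use strict;
use warnings;

# In-memory file standing in for a real output file.
my $buffer = '';
open my $fh, '>', \$buffer or die "open failed: $!";

my %somehash = (log => { fh => $fh });
my $var = 'log';

# The block returns the filehandle, so print parses cleanly.
print { $somehash{$var}{fh} } "foo\n";
close $somehash{$var}{fh};

print "buffer holds: $buffer";
```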
If you have anything other than a simple scalar as your filehandle, you need to wrap the reference holding the filehandle in braces so Perl knows how to parse the statement:
print { $somehash{$var}{fh} } $foo;
Part of Perl Best Practices says to always wrap filehandles in braces just for this reason, although I don't get that nutty with it.
The syntax is odd because print is an indirect method on a filehandle object:
method_name Object @arguments;
You might have seen this in old-school CGI.pm. Here are two indirect method calls:
use CGI;
my $cgi_object = new CGI 'cat=Buster&bird=nightengale';
my $value = param $cgi_object 'bird';
print "Indirect value is $value\n";
That almost works fine (see Schwern's answer about the ambiguity) as long as the object is in a simple scalar. However, if I put the $cgi_object in a hash, I get the same syntax error you got with print. I can put the braces around the hash access to make it work out. Continuing with the previous code:
my %hash;
$hash{animals}{cgi} = $cgi_object;
# $value = param $hash{animals}{cgi} 'cat'; # syntax error
$value = param { $hash{animals}{cgi} } 'cat';
print "Braced value is $value\n";
That's all a bit clunky so just use the arrow notation for everything instead:
my $cgi_object = CGI->new( ... );
$cgi_object->param( ... );
$hash{animals}{cgi}->param( ... );
You can do the same with filehandles, although you have to use the IO::Handle module to make it all work out:
use IO::Handle;
STDOUT->print( 'Hello World' );
open my( $fh ), ">", $filename or die ...;
$fh->print( ... );
$hash{animals}{fh} = $fh;
$hash{animals}{fh}->print( ... );
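The elided pieces above can be filled in for a quick experiment. This sketch uses an in-memory file and invented strings, so treat it as an illustration rather than the only way to do it:

```perl
use strict;
use warnings;
use IO::Handle;

# In-memory file so the example has no side effects on disk.
my $captured = '';
open my $fh, '>', \$captured or die "open failed: $!";

my %hash;
$hash{animals}{fh} = $fh;

# Arrow notation works on any expression that yields a filehandle.
$hash{animals}{fh}->print("Buster\n");
$hash{animals}{fh}->printf("%s has %d legs\n", 'bird', 2);
$hash{animals}{fh}->close;

STDOUT->print("captured: $captured");
```

With method calls there is no special slot for the filehandle, so no braces are needed however deeply the handle is nested.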
The above answers are all correct. The reason a full expression isn't allowed there is that print FH LIST is already pretty weird syntax. Allowing anything more complicated would introduce a ton of ambiguous syntax. The block removes that ambiguity.
To see where this madness leads to, consider the horror that is indirect object syntax.
foo $bar; # Is that foo($bar) or $bar->foo()? Good luck!