Using Perl FileHandle with a scalar containing string instead of filename - perl

My script downloads a plain text file from the internet using LWP::Simple's get() function.
I'd then like to process this string the way I would a filehandle. I found this 'elegant' (well, I like it) way of doing so at http://www.perlmonks.org/?node_id=745018 .
my $filelike = get($url); # whole text file sucked up in single string
open my $fh, '<', \$filelike or die $!;
while (<$fh>) {
# do wildly exciting stuff;
};
However, I like using FileHandle, and I have not found a way of doing the above with it. So:
my $filelike = get($url);
my $fh = new FileHandle \$filelike; # does not work
my $fh = new FileHandle $filelike; # does not work either
Any ideas?
Thanks.

FileHandle provides an fdopen method which can give you a FileHandle object from a symbol reference. You can open a raw filehandle to the scalar ref and then wrap that in a FileHandle object.
open my $string_fh, '<', \$filelike;
my $fh = FileHandle->new->fdopen( $string_fh, 'r' );
(Also, see this answer for why you should use Class->new instead of the indirect new Class notation.)

Do you realize that all file handles are objects of the IO::Handle class? If all you want is to use the file handle as an object, you don't have to do anything at all.
$ perl -e'
open my $fh, "<", \"abcdef\n";
STDOUT->print($fh->getline());
'
abcdef
Note: In older versions of Perl, you will need to add use IO::Handle;.
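As a minimal sketch of the point above, the following reads an in-memory string line by line through IO::Handle methods, with no FileHandle wrapper. The sample text is a hypothetical stand-in for what get($url) would return.

```perl
use strict;
use warnings;
use IO::File;    # loads the IO::Handle methods explicitly, for older perls

# Hypothetical sample text standing in for the downloaded content.
my $filelike = "first line\nsecond line\n";

# Open a read handle on the scalar itself rather than on a file.
open my $fh, '<', \$filelike or die "Cannot open in-memory handle: $!";

my @lines;
while (defined(my $line = $fh->getline)) {
    chomp $line;
    push @lines, $line;
}
$fh->close;

print scalar(@lines), " lines read\n";
```

The handle behaves like any other lexical filehandle, so <$fh> and $fh->getline can be used interchangeably here.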

Related

Perl script find and replace not working?

I am trying to create a script in Perl to replace text in all HTML files in a given directory. However, it is not working. Could anyone explain what I'm doing wrong?
my @files = glob "ACM_CCS/*.html";
foreach my $file (@files)
{
open(FILE, $file) || die "File not found";
my @lines = <FILE>;
close(FILE);
my @newlines;
foreach(@lines) {
$_ =~ s/Authors Here/Authors introduced this subject for the first time in this paper./g;
#$_ =~ s/Authors Elsewhere/Authors introduced this subject in a previous paper./g;
#$_ =~ s/D4-/D4: Is the supporting evidence described or cited?/g;
push(@newlines,$_);
}
open(FILE, $file) || die "File not found";
print FILE @newlines;
close(FILE);
}
For example, I'd want to replace "D4-" with "D4: Is the...", etc. Thanks, I'd appreciate any tips.
You are using the two-argument version of open. If $file does not start with "<", ">", or ">>", it will be opened as a read filehandle, and you cannot write to a read filehandle. To solve this, use the three-argument version of open:
open my $in, "<", $file or die "could not open $file: $!";
open my $out, ">", $file or die "could not open $file: $!";
Also note the use of lexical filehandles ($in) instead of the bareword file handles (FILE). Lexical filehandles have many benefits over bareword filehandles:
They are lexically scoped instead of global
They close when they go out of scope instead of at the end of the program
They are easier to pass to functions (i.e., you don't have to use a typeglob reference).
You use them just like you would use a bareword filehandle.
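Putting the two open calls together, a minimal sketch of the read, substitute, and write-back cycle might look like this. It works on a temporary file created with File::Temp so that it is self-contained; the sample content is made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file with made-up content to operate on.
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
print {$tmp_fh} "D4- first\nnothing here\n";
close $tmp_fh or die "could not close $file: $!";

# Read every line through a lexical read handle.
open my $in, '<', $file or die "could not open $file: $!";
my @lines = <$in>;
close $in;

# Apply the substitution to each line in place.
s/D4-/D4: Is the supporting evidence described or cited?/g for @lines;

# Reopen the same file for writing and put the lines back.
open my $out, '>', $file or die "could not open $file: $!";
print {$out} @lines;
close $out or die "could not close $file: $!";
```

Reading the whole file first matters here: opening with '>' truncates the file, so the input handle must be finished before the output handle is opened.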
Other things you might want to consider:
1. use the strict pragma
2. use the warnings pragma
3. work on files a line or chunk at a time rather than reading them in all at once
4. use an HTML parser instead of regex
5. use named variables instead of the default variable ($_)
6. if you are using the default variable, don't include it where it is already going to be used (e.g. s/foo/bar/; instead of $_ =~ s/foo/bar/;)
Number 4 may be very important for what you are doing. If you are not certain of the format these HTML files are in, then you could easily miss things. For instance, "Authors Here" and "Authors\nHere" mean the same thing to HTML, but your regex will miss the latter. You might want to take a look at XML::Twig (I know it says XML, but it handles HTML as well). It is a very easy to use XML/HTML parser.

PERL Net::DNS output to file

Completely new to Perl (in the process of learning) and need some help. Here is some code that I found which prints results to the screen great, but I want it printed to a file. How can I do this? When I open a file and send output to it, I get garbage data.
Here is the code:
use Net::DNS;
my $res = Net::DNS::Resolver->new;
$res->nameservers("ns.example.com");
my @zone = $res->axfr("example.com");
foreach my $rr (@zone) {
$rr->print;
}
When I add:
open(my $fh, '>', $filename) or die "Could not open file '$filename' $!";
.....
$rr -> $fh; #I get garbage.
Your @zone array contains a list of Net::DNS::RR objects, whose print method stringifies the object and prints it to the currently selected file handle.
To print the same thing to a different file handle you will have to stringify the object yourself.
This should work:
open my $fh, '>', $filename or die "Could not open file '$filename': $!";
print $fh $_->string, "\n" for @zone;
When you're learning a new language, making random changes to code in the hope that they will do what you want is not a good idea. A far better approach is to read the documentation for the libraries and functions that you are using.
The original code uses $rr->print. The documentation for Net::DNS::Resolver says:
print
$resolver->print;
Prints the resolver state on the standard output.
The print() method there is named after the standard Perl print function which we can use to print data to any filehandle. There's a Net::DNS::Resolver method called string which is documented like this:
string
print $resolver->string;
Returns a string representation of the resolver state.
The Net::DNS::RR objects returned by axfr provide the same pair of print and string methods, so it looks like $rr->print is equivalent to print $rr->string. And it's simple enough to change that to print to your new filehandle.
print $fh $rr->string;
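The pattern can be tried without a DNS server. The sketch below uses Demo::Record, a hypothetical stand-in class that is not part of Net::DNS and only mimics the string method, and writes to an in-memory buffer instead of a real file.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a record class; NOT part of Net::DNS.
package Demo::Record {
    sub new    { my ($class, $text) = @_; return bless { text => $text }, $class }
    sub string { my ($self) = @_; return $self->{text} }
}

# Made-up records using documentation-only names and addresses.
my @zone = map { Demo::Record->new($_) }
    ('a.example. A 192.0.2.1', 'b.example. A 192.0.2.2');

# Write to an in-memory buffer here; a real filename works the same way.
my $buffer = '';
open my $fh, '>', \$buffer or die "Could not open buffer: $!";

# Stringify each object yourself and name the target handle explicitly.
print {$fh} $_->string, "\n" for @zone;
close $fh or die "Could not close buffer: $!";

print $buffer;
```

The braces around $fh are optional for a simple scalar, but they make it obvious which argument is the handle and which is the data.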
p.s. And, by the way, it's "Perl", not "PERL".

File handles not working properly in Perl

I tried initializing two file handles to NULL and using them later in my program.
This was my code:
my $fh1 = " ";
my $fh2 = " ";
open($fh1, ">", $fname1);
open($fh2, ">", $fname2);
print $fh1 "HI this is fh1";
After executing, my files contained this:
fname1 is empty
fname2 contains
HI this is fh1
What was the mistake?
Why is fname1 empty while fname2 contains a string, even though I haven't inserted any string in fh2?
You have set $fh1 and $fh2 to the same value (a space character, not NULL) and so they refer to the same underlying typeglob for I/O.
Filehandles in Perl are a special variable type called a glob or typeglob. In the old days of Perl 4, you always referred to a glob as a character string, often as a bareword. The barewords STDIN, STDOUT, and STDERR are relics of this simpler time.
Nowadays, you can (and usually should) use lexical filehandles, but the underlying reference to a typeglob will still be there. For example, you can write
my $fh = 'STDOUT';
print $fh "hello world\n";
and this will do the exact same thing as
print STDOUT "hello world\n";
Now if you pass an uninitialized scalar as the first argument to open, Perl will assign an arbitrary typeglob to it. You probably don't need to know which typeglob it is.
But if the argument to open is already initialized, Perl uses the typeglob with that argument's value. So this snippet of code will create and add data to a file:
my $fh = "FOO";
open $fh, '>', '/tmp/1';
print FOO "This is going into /tmp/1\n";
close $fh;
Now we can look at your example. You have set $fh1 and $fh2 to the same value -- a string consisting of a space character. So your open call to $fh1 creates an association between a typeglob named " " and the file descriptor for the output stream to $fname1.
When you call open on $fh2, you are reusing the typeglob named " ", which automatically closes the other filehandle using the same typeglob ($fh1). It is the same as if you say open FOO, ">/tmp/1"; open FOO, ">/tmp/2"; where the second open call implicitly closes the first filehandle.
Now you are printing on $fh1, which refers to the typeglob named " ", which is associated with the output stream to file $fname2, and that's where the output goes.
It was a mistake to initialize $fh1 and $fh2. Just leave them undefined:
my ($fh1, $fh2);
open $fh1, ">", ... # assigns $fh1 to arbitrary typeglob
open $fh2, ">", ... # assigns $fh2 to different arbitrary typeglob
You shouldn't initialise your file handles at all, otherwise Perl will try to use that value as a file handle instead of creating a new one. In this case you have opened $fname1 on the file handle ' ' (a single space) and then opened $fname2 on the same file handle, which closes $fname1.
Rather than declaring the file handles separately, it is best to declare them in the open statement, like this:
open my $fh1, '>', $fname1;
open my $fh2, '>', $fname2;
Then there is less that can go wrong.
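A minimal sketch of the corrected version, assuming two scratch files in a temporary directory (the file names are made up), shows each handle writing to its own file:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory and made-up file names for the demonstration.
my $dir = tempdir(CLEANUP => 1);
my ($fname1, $fname2) = ("$dir/one.txt", "$dir/two.txt");

# Declaring the handles inside open gives each its own fresh filehandle.
open my $fh1, '>', $fname1 or die "Cannot open $fname1: $!";
open my $fh2, '>', $fname2 or die "Cannot open $fname2: $!";

print {$fh1} "HI this is fh1\n";
print {$fh2} "HI this is fh2\n";

close $fh1 or die "Cannot close $fname1: $!";
close $fh2 or die "Cannot close $fname2: $!";

# Read both files back to confirm where each line ended up.
for my $name ($fname1, $fname2) {
    open my $in, '<', $name or die "Cannot open $name: $!";
    print "$name: ", <$in>;
    close $in;
}
```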

How do I save a file in an IO::Handle?

I'm a little confused on how to save a file that is an IO::Handle.
Here is what I have
use IO::File;
my $iof = IO::File->new;
# open file
$iof->open($path, "w") || die "$! : $path";
# ensure binary
$iof->binmode;
# output file to disk
print $iof $self->File_Upload;
$iof->close;
File_Upload is the IO::Handle given to me via the CGI module for a file upload, but the output in the file is...
IO::Handle=GLOB(0x20dabec)
Not the binary data of the uploaded PDF.
If I have a file in a file handle how do I save it?
Do I need IO::File if I have an IO::Handle?
Your input is appreciated.
1DMF
Read from the CGI provided file handle using readline:
print $iof readline($self->File_Upload);
The fact that your output contents were 'IO::Handle=GLOB(0x20dabec)' implied that the $self->File_Upload is of type IO::Handle and should be treated as such.
Using readline in a list context pulls all the lines as demonstrated above. Alternatively, you could use the object method ->getlines():
print $iof $self->File_Upload->getlines();
How does one slurp a file?
my $fh = $self->File_Upload();
my $file = do { local $/; <$fh> };
Yes, this works for IO::Handle objects in addition to the usual globs (STDIN), references to globs (from open my $fh, ...) and IO scalars (*STDIN{IO}).
Then to print it,
print($iof $file);
In this particular case, you could simply use
print($iof $self->File_Upload()->getlines());
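For large binary uploads, an alternative to reading everything at once is to copy in fixed-size chunks with read. The sketch below is only an illustration under assumptions: an in-memory handle stands in for the handle returned by $self->File_Upload, and the 8192-byte chunk size is an arbitrary choice.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Made-up binary payload; the in-memory handle stands in for the upload handle.
my $payload = join '', map { chr } 0 .. 255;
open my $upload_fh, '<:raw', \$payload or die "Cannot open payload: $!";

# Temporary output file, switched to binary mode before writing.
my ($out_fh, $path) = tempfile(UNLINK => 1);
binmode $out_fh;

# Copy in chunks until read reports end of file (0) or an error (undef).
my $chunk;
while (1) {
    my $bytes = read($upload_fh, $chunk, 8192);
    die "Read failed: $!" unless defined $bytes;
    last if $bytes == 0;
    print {$out_fh} $chunk or die "Write failed: $!";
}
close $out_fh or die "Cannot close $path: $!";

my $size = -s $path;
print "$size bytes written\n";
```

Unlike getlines, this does not depend on where newlines happen to fall in the binary data, and memory use stays bounded by the chunk size.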

How to dereference a copy of a STDIN filehandle?

I'm trying to figure out how to get a Perl module to dereference and open a reference to a filehandle. You'll understand what I mean when you see the main program:
#!/usr/bin/perl
use strict;
use warnings;
use lib '/usr/local/share/custom_pm';
use Read_FQ;
# open the STDIN filehandle and make a copy of it for (safe handling)
open(FILECOPY, "<&STDIN") or die "Couldn't duplicate STDIN: $!";
# filehandle ref
my $FH_ref = \*FILECOPY;
# pass a reference of the filehandle copy to the module's subroutine
# the value the perl module returns gets stored in $value
my $value = {Read_FQ::read_fq($FH_ref)};
# do something with $value
Basically, I want the main program to receive input via STDIN, make a copy of the STDIN filehandle (for safe handling) then pass a reference to that copy to the read_fq() subroutine in the Read_FQ.pm file (the perl module). The subroutine will then read the input from that file handle, process it, and return a value. Here the Read_FQ.pm file:
package Read_FQ;
sub read_fq{
my ($filehandle) = @_;
my $contents = '';
open my $fh, '<', $filehandle or die "Too bad! Couldn't open $filehandle for read\n";
while (<$fh>) {
# do something
}
close $fh;
return $contents;
}
1;
Here's where I'm running into trouble. In the terminal, when I pass a filename to the main program to open:
cat file.txt | ./script01.pl
it gives the following error message: Too bad! Couldn't open GLOB(0xfa97f0) for read
This tells me that the problem is how I'm dereferencing and opening the reference to the filehandle in the perl module. The main program is okay. I read that $refGlob = \*FILE; is a reference to a file handle and in most cases should automatically be dereferenced by Perl. However, that isn't the case here. Does anyone know how to dereference a filehandle ref so that I can process it?
thanks. Any suggestions are greatly appreciated.
Your $filehandle should already be open: you opened FILECOPY, took a reference to it and put that in $FH_ref, which is what arrives as $filehandle. If you want to open it again, use the <& mode in open; otherwise just start reading from it right away.
If I understand correctly, you want the 3-arg equivalent of
open my $fh, '<&STDIN'
That would be
open my $fh, '<&', $filehandle
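Both suggestions can be sketched together. In the example below a temporary file stands in for STDIN so the script runs unattended, and read_all is a hypothetical helper name: it reads straight from the handle it is given, while the caller makes the duplicate with the three-argument '<&' mode.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: reads directly from an already open handle.
sub read_all {
    my ($filehandle) = @_;
    my $contents = '';
    while (my $line = <$filehandle>) {
        $contents .= $line;
    }
    return $contents;
}

# A temporary file with made-up content stands in for STDIN.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line 1\nline 2\n";
close $tmp_fh or die "Cannot close $path: $!";

open my $orig, '<', $path or die "Cannot open $path: $!";

# Three-argument dup: the third argument is a handle, not a filename.
open my $copy, '<&', $orig or die "Cannot duplicate handle: $!";

my $got = read_all($copy);
print $got;

close $copy;
close $orig;
```

Passing the handle or a reference to it into the subroutine needs no dereferencing; the readline operator accepts either.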