Perl: Opening File - perl

I am trying to open the file received as an argument.
When I store the argument in a global variable, open works successfully.
But if I declare the variable with my, open fails to open the file.
What is the reason?
#use strict;
use warnings;
#my $FILE=$ARGV[0]; #open Fails to open the file $FILE
$FILE=$ARGV[0]; #Works Fine with Global $FILE
open(FILE)
or
die "\n ". "Cannot Open the file specified :ERROR: $!". "\n";

Unary open works only on package (global) variables. This is documented on the manpage.
A better way to open a file for reading would be:
my $filename = $ARGV[0]; # store the 1st argument into the variable
open my $fh, '<', $filename or die $!; # open the file using lexically scoped filehandle
print <$fh>; # print file contents
P.S. always use strict and warnings while debugging your Perl scripts.

It's all in perldoc -f open:
If EXPR is omitted, the scalar variable of the same name as
the FILEHANDLE contains the filename. (Note that lexical
variables--those declared with "my"--will not work for this
purpose; so if you're using "my", specify EXPR in your call
to open.)
Note that this isn't a very good way to specify the file name. As you can see, it has a hard constraint on the variable type it's in, and either the global variable it requires or the global filehandle it opens are usually best avoided.
Using a lexical filehandle keeps its scope in control, and handles closing automatically:
open my $fh, '<', "filename" or die "string involving $!";
And if you're taking that file name from the command line, you could possibly do away with that open or any handle altogether, and use the plain <> operator to read from command-line arguments or STDIN. (see comments for more on this)
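As a rough sketch of that last suggestion: the helper below (count_lines is a made-up name, not anything from the question) reads with <>, which walks the files named in @ARGV and falls back to STDIN when @ARGV is empty. No explicit open or filehandle is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines of every file named in @ARGV,
# or of STDIN when @ARGV is empty. The <> operator does the opening.
sub count_lines {
    my $count = 0;
    while (my $line = <>) {
        $count++;
    }
    return $count;
}

# Typical use at the end of a script:
#   print count_lines(), "\n";
```

Called as ./script.pl file.txt it reads file.txt; called as cat file.txt | ./script.pl it reads STDIN.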

use strict;
use warnings;
my $file_name = shift @ARGV;
open(my $file, '<', $file_name) or die $!;
…
close($file);
Always use strict and warnings. If either of them complains, fix the code, do not comment out the pragmas. You can also use autodie to avoid the explicit or die after open, see autodie.
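A minimal sketch of the autodie variant, assuming you only need the first line of the file (first_line is a name invented for this example). With autodie in effect, a failed open or close throws an exception on its own, so the explicit or die goes away.

```perl
use strict;
use warnings;
use autodie;    # open and close now die with a descriptive message on failure

# Hypothetical helper: return the first line of the named file.
sub first_line {
    my ($file_name) = @_;
    open(my $fh, '<', $file_name);    # no "or die" needed under autodie
    my $line = <$fh>;
    close($fh);
    return $line;
}
```

Note that autodie is lexically scoped, so it only affects code in the file or block where it is loaded.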

From Perl's docs for open()
If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE contains the filename. (Note that lexical variables--those declared with my--will not work for this purpose; so if you're using my, specify EXPR in your call to open.)

Related

Perl script find and replace not working?

I am trying to create a script in Perl to replace text in all HTML files in a given directory. However, it is not working. Could anyone explain what I'm doing wrong?
my @files = glob "ACM_CCS/*.html";
foreach my $file (@files)
{
open(FILE, $file) || die "File not found";
my @lines = <FILE>;
close(FILE);
my @newlines;
foreach(@lines) {
$_ =~ s/Authors Here/Authors introduced this subject for the first time in this paper./g;
#$_ =~ s/Authors Elsewhere/Authors introduced this subject in a previous paper./g;
#$_ =~ s/D4-/D4: Is the supporting evidence described or cited?/g;
push(@newlines,$_);
}
open(FILE, $file) || die "File not found";
print FILE @newlines;
close(FILE);
}
For example, I'd want to replace "D4-" with "D4: Is the...", etc. Thanks, I'd appreciate any tips.
You are using the two-argument version of open. If $file does not start with "<", ">", or ">>", it will be opened as a read filehandle. You cannot write to a read filehandle. To solve this, use the three-argument version of open:
open my $in, "<", $file or die "could not open $file: $!";
open my $out, ">", $file or die "could not open $file: $!";
Also note the use of lexical filehandles ($in) instead of the bareword file handles (FILE). Lexical filehandles have many benefits over bareword filehandles:
They are lexically scoped instead of global
They close when they go out of scope instead of at the end of the program
They are easier to pass to functions (ie you don't have to use a typeglob reference).
You use them just like you would use a bareword filehandle.
Other things you might want to consider:
1. use the strict pragma
2. use the warnings pragma
3. work on files a line or chunk at a time rather than reading them in all at once
4. use an HTML parser instead of regex
5. use named variables instead of the default variable ($_)
6. if you are using the default variable, don't include it where it is already going to be used (eg s/foo/bar/; instead of $_ =~ s/foo/bar/;)
Number 4 (using an HTML parser) may be very important for what you are doing. If you are not certain of the format these HTML files are in, then you could easily miss things. For instance, "Authors Here" and "Authors\nHere" mean the same thing to HTML, but your regex will miss the latter. You might want to take a look at XML::Twig (I know it says XML, but it handles HTML as well). It is a very easy to use XML/HTML parser.
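Putting the three-argument open, lexical filehandles, and line-at-a-time processing together, a minimal sketch could look like this. The helper name replace_in_file and the ".tmp" scratch file name are assumptions made for the example, and it still uses a plain substitution rather than an HTML parser, so the caveat about text split across lines applies.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every literal occurrence of $from with $to
# in $file, one line at a time, via a scratch file that is renamed over
# the original once writing has succeeded.
sub replace_in_file {
    my ($file, $from, $to) = @_;
    my $tmp = "$file.tmp";    # assumed scratch name, not from the question

    open my $in,  '<', $file or die "could not open $file: $!";
    open my $out, '>', $tmp  or die "could not open $tmp: $!";

    while (my $line = <$in>) {
        $line =~ s/\Q$from\E/$to/g;    # \Q...\E treats $from as literal text
        print {$out} $line;
    }

    close $in;
    close $out or die "could not close $tmp: $!";
    rename $tmp, $file or die "could not rename $tmp to $file: $!";
    return;
}
```

In the loop over glob "ACM_CCS/*.html" you would then call, for example, replace_in_file($file, 'D4-', 'D4: Is the supporting evidence described or cited?').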

How to dereference a copy of a STDIN filehandle?

I'm trying to figure out how to get a Perl module to dereference and open a reference to a filehandle. You'll understand what I mean when you see the main program:
#!/usr/bin/perl
use strict;
use warnings;
use lib '/usr/local/share/custom_pm';
use Read_FQ;
# open the STDIN filehandle and make a copy of it for (safe handling)
open(FILECOPY, "<&STDIN") or die "Couldn't duplicate STDIN: $!";
# filehandle ref
my $FH_ref = \*FILECOPY;
# pass a reference of the filehandle copy to the module's subroutine
# the value the perl module returns gets stored in $value
my $value = {Read_FQ::read_fq($FH_ref)};
# do something with $value
Basically, I want the main program to receive input via STDIN, make a copy of the STDIN filehandle (for safe handling), then pass a reference to that copy to the read_fq() subroutine in the Read_FQ.pm file (the Perl module). The subroutine will then read the input from that filehandle, process it, and return a value. Here is the Read_FQ.pm file:
package Read_FQ;
sub read_fq{
my ($filehandle) = @_;
my $contents = '';
open my $fh, '<', $filehandle or die "Too bad! Couldn't open $filehandle for read\n";
while (<$fh>) {
# do something
}
close $fh;
return $contents;
}
1;
Here's where I'm running into trouble. In the terminal, when I pass a filename to the main program to open:
cat file.txt | ./script01.pl
it gives the following error message: Too bad! Couldn't open GLOB(0xfa97f0) for read
This tells me that the problem is how I'm dereferencing and opening the reference to the filehandle in the Perl module. The main program is okay. I read that $refGlob = \*FILE; is a reference to a filehandle and in most cases should automatically be dereferenced by Perl. However, that isn't the case here. Does anyone know how to dereference a filehandle ref so that I can process it?
thanks. Any suggestions are greatly appreciated.
Your $filehandle should already be open - you had opened FILECOPY, taken a reference and put it in $FH_ref, which is $filehandle. If you want to re-open it again use the <& argument in open or just start reading from it right away.
If I understand correctly, you want the 3-arg equivalent of
open my $fh, '<&STDIN'
That would be
open my $fh, '<&', $filehandle
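Both suggestions can be sketched as follows. This is only an illustration under the assumption that the subroutine should return everything it reads; read_fq_dup is a name made up here, and the real "do something" processing from the question is not shown.

```perl
use strict;
use warnings;

# Option 1: the argument is already an open handle (a glob reference or a
# lexical handle), so just read from it. No second open is needed.
sub read_fq {
    my ($filehandle) = @_;
    my $contents = '';
    while (my $line = <$filehandle>) {
        $contents .= $line;    # placeholder for the real processing
    }
    return $contents;
}

# Option 2 (hypothetical variant): duplicate the handle first with the
# three-argument '<&' mode, then read from the duplicate.
sub read_fq_dup {
    my ($filehandle) = @_;
    open my $fh, '<&', $filehandle
        or die "Couldn't duplicate the filehandle: $!";
    my $contents = '';
    while (my $line = <$fh>) {
        $contents .= $line;
    }
    close $fh;
    return $contents;
}
```

With the main program from the question, Read_FQ::read_fq($FH_ref) would then work unchanged, since \*FILECOPY is read from directly.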

Why does this program not find the word 'error' in my text file?

open(LOG,"logfile.txt") or die "Unable to open $logfile:$!";
print "\n";
while(<$LOG>){
print if /\berror\b/i;
}
close(LOG);
Your typo actually takes you one step closer to opening the file the right way -- namely, using the recommended 3-argument form of open.
use strict;
use warnings;
my $logfile = "logfile.txt"; # declared so the error message compiles under strict
open(my $log, '<', $logfile) or die "Open failed : $logfile : $!";
while (<$log>) {
...
}
This approach is better because your file handle can be stored in a lexically scoped variable (rather than in a global name like LOG). This provides an added benefit in automatically closing the file when the lexical variable goes out of scope. Also, lexical file handles can be passed around between subroutines using a more familiar syntax.
If you wanted an even more effortless open, you could do this:
@ARGV = 'logfile.txt';
while ( <> ) {
print if /\berror\b/i;
}
open LOG, "logfile.txt";
while (<LOG>) {
print if /\berror\b/i;
}
You have an error:
while (<$LOG>)
should read
while (<LOG>)
A bareword filehandle such as LOG is not a scalar variable, so it takes no $.

Which one is good practice, a lexical filehandle or a typeglob?

Some say we should use a lexical filehandle instead of a typeglob, like this:
open $fh, $filename;
But most Perl books, including The Llama Book, use a typeglob, like this:
open LOGFILE, $filename;
So what are the differences? Which one is considered a better practice?
The earliest edition of the Llama Book is from 1993, before lexical filehandles were part of the Perl language. Lexical filehandles are a better practice for a variety of reasons. The most important disadvantages of typeglobs are
they are always global in scope, which can lead to insidious bugs like this one:
sub doSomething {
my ($input) = @_;
# let's compare $input to something we read from another file
open(F, "<", $anotherFile);
@F = <F>;
close F;
do_some_comparison($input, @F);
}
open(F, "<", $myfile);
while (<F>) {
doSomething($_); # do'h -- just closed the F filehandle
}
close F;
they are harder to pass to a subroutine than a lexical filehandle
package package1;
sub log_time { # print timestamp to filehandle
my ($fh) = @_;
print $fh scalar localtime, "\n";
}
package package2;
open GLOB, '>', 'log1';
open $lexical, '>', 'log2';
package1::log_time($lexical); # works as expected
package1::log_time(GLOB); # doesn't work
package1::log_time('GLOB'); # doesn't work
package1::log_time(*GLOB); # works
package1::log_time(package2::GLOB); # works
package1::log_time('package2::GLOB'); # works
See also: Why is three-argument open calls with autovivified filehandles a Perl best practice?
When lexical variables are used, the filehandles have the scope of these variables and are automatically closed whenever you leave that scope:
{
open my $fh, '<', 'file' or die $!;
# ...
# the fh is closed upon leaving the scope
}
So you do not create permanent global variables.
Lexical filehandles can be passed easily as arguments; bareword filehandles cannot. Typeglobs can (or at least references to them can), but that's kind of messy. Consider sticking with lexical variables, and make sure to declare them first, so you know that they're really lexical and not local or global. I.e.
my $fh;
open $fh, $filename;
Also consider using IO::Handle or IO::File as options. It used to be FileHandle, but I was informed by ysth that FileHandle now just uses IO::Handle in turn, which is news to me since 5.6, but there's a lot to learn here. :-)
Also, don't forget use strict :-)
Using typeglob filehandles is not recommended because, if you don't pay attention, they can lead to several issues. For example, if you write a recursive function that reuses the same typeglob, you'll get warnings when you try to close the filehandle unless you localize the package glob with local. Lexical variables are scoped to the block in which they are defined, while a typeglob is visible throughout the package in which it is defined.
To summarize:
If you want to stay with a typeglob filehandle, make sure to localize the package glob:
...
local *FH;
open FH, '<', $filepath or die(sprintf('Could not open %s: %s', $filepath, $!));
...
Otherwise, use a lexical variable:
...
open my $fh, '<', $filepath or die(sprintf('Could not open %s: %s', $filepath, $!));
...

How can I change the standard output filehandle for a system call in Perl?

I am trying to change the stdout of the system function using select, but it does not seem to be working. The system output is displayed on the console rather than being redirected to the file.
use strict;
use warnings;
chdir "C:\\Documents and Settings\\" or die "cannot open the dir\n";
open FH, ">out.txt" or die "cannot open out.txt\n";
select FH or die " cannot change the stdout\n";
system "dir /a" ;
Although I can do system "dir /a > out.txt", I want to know why the above code is not working.
The select function changes the default filehandle for output. It does not change STDOUT, which points to whatever it points to regardless of whether it is the default filehandle for output. If you want to change STDOUT, you have to do something like ylebre's or Jon's answer.
When you start a child process, it gets the same standard output as the parent. It doesn't care what the default filehandle of the parent is though.
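To make that concrete, here is a minimal sketch that changes STDOUT itself (not just the selected handle) for the duration of one system call and then restores it. The helper name run_with_stdout_to is invented for this example; the save, reopen, and restore steps follow the dup idiom described in perldoc -f open.

```perl
use strict;
use warnings;
use IO::Handle;    # for STDOUT->flush

# Hypothetical helper: run a command with STDOUT temporarily pointing at
# $path, so the child process inherits the redirected standard output.
sub run_with_stdout_to {
    my ($path, @cmd) = @_;

    STDOUT->flush;    # don't let buffered parent output land in the file
    open my $saved, '>&', \*STDOUT or die "Cannot save STDOUT: $!";
    open STDOUT, '>', $path or die "Cannot redirect STDOUT to $path: $!";

    my $status = system @cmd;    # the child inherits the redirected STDOUT

    STDOUT->flush;
    open STDOUT, '>&', $saved or die "Cannot restore STDOUT: $!";
    close $saved;
    return $status;
}
```

For the question's case this would be called as run_with_stdout_to("out.txt", "dir /a"). Anything the parent prints after the call goes to the original standard output again.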
The problem with your code is that system command doesn't capture the output for you. You can use qx//:
use strict;
use warnings;
open my $fh, ">", "output.txt" or die $!;
my $oldfh= select $fh;
print qx/ls -a/;
select $oldfh;
close $fh or warn $!;
There are other options to run an external application and read its output, like the IPC::Open2 and IPC::Open3 modules. They're included in perl by default.
A side note: don't use globs for file handles: they are globals, and globals are evil. Use lexical variables instead (FH vs. $fh). Also, you should use the three argument open instead of two parameters, since the former is safer than the latter.
Maybe this helps
I'm not sure 'select' is what you are looking for in this case. To redirect STDOUT to a file, you can do the following:
use strict;
use warnings;
use IO::Handle;
open FH, "> out.txt";
STDOUT->fdopen( \*FH, 'w' ) or die $!;
print "Hello world!\n";
system "dir /a";
Hope this helps!
I reassign the file descriptors at the system (not Perl) level:
sub reopen($$$) {
open $_[0], $_[1]. '&='. fileno($_[2]) or die "failed to reopen: $!";
}
reopen *STDOUT, '>', $new_stdout_handle;
Now anything that you fork will think $new_stdout_handle is standard output.
I pasted a full example to gist: http://gist.github.com/129407