In Perl, is \*STDIN the same as STDIN?

I'm the author of Pythonizer and I'm trying to translate the code of CGI.pm from the standard perl library to Python. I came across this code in read_from_client:
read(\*STDIN, $$buff, $len, $offset)
Is \*STDIN the same thing as just STDIN? I'm not understanding why they are using it this way. Thanks for your help!
The module also references \*main::STDIN - is this the same as STDIN too (I would translate plain STDIN to sys.stdin in python)? Code:
foreach my $fh (
\*main::STDOUT,
\*main::STDIN,
\*main::STDERR,
) { ... }

Instead of translating CGI.pm line for line, I recommend you understand the interface, then do whatever Python would do for that. Or, better yet, just forget it exists. It often seems like a translation will be a drop-in replacement, but the libraries and structures you'll use in the new language are different enough that you are just going to make new bugs. Since you are going to make new bugs anyway, you might as well do something smarter.
But, I know nothing about your situation, so let's get to the literal question.
You're looking at:
# Read data from a file handle
sub read_from_client {
my($self, $buff, $len, $offset) = @_;
local $^W=0; # prevent a warning
return $MOD_PERL
? $self->r->read($$buff, $len, $offset)
: read(\*STDIN, $$buff, $len, $offset);
}
Instead of worrying about the Perl code, just do whatever you need to do in Python to satisfy the interface. Given a buffer and a length, get some more data from the filehandle. Since you are not handling mod_perl (I'm guessing, because how would you?), you can ignore most stuff there.
The \*main::STDIN and \*STDIN are references to a typeglob, which is a way to track all the Perl variables with the same name (scalar, array, hash, subroutine, filehandle, and a few others). The STDIN identifier is a special case that lives in package main by default, so adding the main:: package prefix is probably merely developer comfort.
When you use those references in a place that wants a filehandle, the filehandle portion of the typeglob is used. It's just a way to pass the identifier STDIN around and have something else use it as a filehandle.
You see this as a way to pass around the named, standard file handles.
The read takes a filehandle (or reference to typeglob) as its first argument.
In python, you'd do something like sys.stdin.read(...).
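A minimal Python sketch of that interface (not Pythonizer's actual output; `read_from_client`, the `bytearray` buffer, and the `stream` parameter are illustrative names, with `io.BytesIO` standing in for stdin so the example is self-contained):

```python
import io
import sys

def read_from_client(buff, length, offset=0, stream=None):
    """Rough analogue of Perl's read(\*STDIN, $$buff, $len, $offset)."""
    # stream defaults to binary stdin, the analogue of \*STDIN
    stream = stream if stream is not None else sys.stdin.buffer
    data = stream.read(length)
    # place the bytes at `offset` in the buffer, like four-argument read
    buff[offset:offset + len(data)] = data
    return len(data)  # like Perl's read, return the byte count
```

With a real CGI request you would pass `sys.stdin.buffer`; the test stream just makes the behavior visible.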

The following can usually be used as a file handle:
An IO object (*STDIN{IO})
A glob containing an IO object (*STDIN)
A reference to a glob containing an IO object (\*STDIN)
The name of a glob containing an IO object ("STDIN")
The builtin operators that expect a file handle allow you to omit the * when providing a glob. For example, read( FH, ... ) means read( *FH, ... ).
The builtin functions that expect a file handle should accept all of these, so you could use any of the following:
read( *STDIN{IO}, ... )
read( STDIN, ... )
read( *STDIN, ... )
read( \*STDIN, ... )
read( "STDIN", ... ).
They will have the same effect.
Third-party libraries probably accept the globs and references to globs, and they should also accept IO objects. I expect the least support for providing the name as a string. Your mileage may vary.
You can't go wrong with a reference to a glob (\*FH) since that's what open( my $fh, ... ) produces.

Related

What is a "handle" in Perl?

I'm wondering what a handle is in Perl.
I can see file handles, directory handles, etc., but want to know the general meaning of "handle" in Perl.
For example, in IO::Pipe, I can see the explanation below, and want to understand the meaning of "becomes a handle":
reader ([ARGS])
The object is re-blessed into a sub-class of IO::Handle,
and becomes a handle at the reading end of the pipe. If
ARGS are given then fork is called and ARGS are passed
to exec.
Also could you please explain the meaning of bless?
A handle is a way to get to something without actually being that thing. A handle has an interface to interact with something managed by the system (or something else).
Start with the idea of a scalar. Defining a simple scalar stores a value, and that value is actually in the memory of your program. In very simplistic terms, you manage that resource directly and wholly within your program. You don't need to ask the system to increment the variable for you:
my $n = 5;
$n++;
Talking to the outside world
A handle represents a connection to something managed by something else, typically through "system calls".
A file handle is your connection to a file (so, managed by the filesystem or OS) but is not the file itself. With that filehandle, you can read from or write to the file; there is code behind all that to talk to the system to do the actual work.
open my $filehandle, '<', $filename or die "$!";
Since you are not managing the actual work and since you depend on the system to do the work, you check the $! system error variable to check that the system was able to do what you wanted. If it couldn't, it tells you how it ran into a problem (although the error may not be very specific).
A directory handle is a way to get a list of the things inside a directory, but is not the directory itself. To get that, you have to ask the system to do things for you. And so on.
Perl is wonderful though
But, in Perl, you can make a handle to anything you like (and I write a lot about this in both Effective Perl Programming and Mastering Perl). You can use the interface for a handle even if you wholly control the thing and don't need to ask the system to do something on your behalf.
For example, you can use the filehandle interface on a string:
open my $string_filehandle, '>', \my $string;
print {$string_filehandle} "This goes to the string";
To your code as you read it, it looks like the thing is a file (socket, whatever), because that is the main use of the handle interface. This is quite handy when you are handcuffed to using a filehandle because someone else wrote some code you can't change. This function is designed to only send $message to some output handle:
sub print_to_file_only {
my( $filehandle, $message ) = @_;
print {$filehandle} $message;
}
But sometimes you don't want that message to go to the terminal, file, socket, or whatever. You want to see it in your program. You can capture the message in your $string_filehandle because it uses the same handle interface even though it's not a static resource.
print_to_file_only( $string_filehandle, $message );
Now you'll see the message show up in $string, and you can do whatever you like with it.
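The same trick has a direct Python analogue: `io.StringIO` provides the file interface over an in-memory string, so a function written against a handle never notices the difference (a sketch; `print_to_file_only` is re-created here in Python for illustration):

```python
import io

def print_to_file_only(fh, message):
    # like the Perl sub: it only knows how to print to a handle
    print(message, file=fh)

# the analogue of: open my $string_filehandle, '>', \my $string;
capture = io.StringIO()
print_to_file_only(capture, "This goes to the string")
# the "file" contents are just the accumulated string
```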
There are many more tricks like this, and I'm tempted to talk about all of them. But, this should be a good start.
This can be a very broad topic and I'll try to stay with the crux of the question, captured in a line quoted from IO::Pipe docs which needs explaining
The object is re-blessed into a sub-class of IO::Handle,
and becomes a handle at the reading end of the pipe.
A "handle" in Perl is a construct built around some resource, out in the OS or in our program, which allows us to manage that resource. A filehandle, for instance, may facilitate access to a file, via libraries and OS facilities, and is more than a plain file descriptor
use warnings;
use strict;
use feature 'say';
my $file = shift // die "Usage: $0 file\n";
open my $fh, '<', $file or die "Can't open $file: $!";
say fileno $fh; # file descriptor, a small integer, normally >= 3
say $fh->fileno; # can use handle as object of IO::Handle or IO::File
print while <$fh>; # use it to access data at/via the resource
close $fh;
The opened file got a file descriptor in the OS, a small integer, the first one available.† But we get the "filehandle" $fh associated with the opened file, which is far nicer to work with and with which various tools can be used. (The fileno was used to get the fd from it.)
In newer Perls (since v5.14.0) a (file)handle can in fact be treated as an object of IO::Handle or IO::File, as these classes get loaded on demand once a method call from them is used on the variable with the handle (if the call can't be resolved otherwise).‡
This brings us to the second question, of "re-bless"-ing.
When a reference is bless-ed into a package, it becomes an object of (the class supposedly defined in) that package. The sub this is done in is thus a constructor, and such a "blessed" reference is returned to the caller. That is an "instance" of the class in the caller, an object.
The object's internal structure has fields saying what package it's from so it gets treated accordingly, one can call methods defined in the package on it, etc. This is a bit simplified, see perlootut and perlobj for starters.
The quote from the docs comes from the reader or writer methods in IO::Pipe class. Once they are called on an object of that class it becomes beneficial for the object to have facilities from IO::Handle, so it is "made" into an object of that class. (Not of IO::File class since a pipe isn't seekable while IO::File inherits from IO::Seekable as well.)
Since bless is a crucial and telling part of the process, people often simply say that it's "blessed" (or "re-blessed" here, since it was already an object of another class), but as you can see from the linked sources there is a bit more to do.
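A loose Python analogue of re-blessing is reassigning an object's `__class__`, which likewise changes what methods resolve on an already-existing object (class names here are illustrative, not from any real library):

```python
class Handle:
    """A plain object, before any 're-blessing'."""

class ReadHandle(Handle):
    def role(self):
        return "reader"

h = Handle()
# rough analogue of Perl's re-bless into a subclass:
# the same object now answers to the subclass's methods
h.__class__ = ReadHandle
```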
As a final comment, note that a "(file)handle" can be opened to things entirely other than an OS resource like a file or socket or such. For example, it can be "tied" (see perltie, Tie::Handle); or, opened to a scalar ("in-memory file").§
† If STDIN/STDOUT/STDERR (fd's 0,1,2) aren't closed and this is the first thing opened, it gets 3
‡ The IO::File inherits from IO::Handle and IO::Seekable, adding only a few methods. Most classes that represent handles, like IO::Pipe or IO::Select, inherit from IO::Handle. So its docs, first, provide a feel for what is available for a handle, so what a "handle" is.
§ This isn't a full filehandle though; try fileno on it (-1). But it behaves well enough to be useful. One example: I use it in forked processes to accumulate prints in a child in a string, which is in the end sent back to the parent. That way they can be logged/printed coherently, and in some order.
# in a child
my $stdout;
open my $fh_stdout, '>', \$stdout or croak "Can't open var for write: $!";
my $fh_STDOUT = select $fh_stdout; # set as default, save old (STDOUT)
say "goes to string"; # winds up in $stdout (sent to parent in the end)
select $fh_STDOUT; # can switch back (normally not needed in child)
This can be done in other ways of course (append messages to a string, for example) but this way we can print normally once the handle is select-ed as default (and can use existing subs/libraries which may just print, not caring where they run, etc).
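In Python, the closest analogue to select-ing a string filehandle as the default output is `contextlib.redirect_stdout`, which temporarily re-points `print` the same way (a sketch under that analogy):

```python
import io
from contextlib import redirect_stdout

def chatty():
    print("goes to string")  # library code that just prints, not caring where

buf = io.StringIO()
with redirect_stdout(buf):   # analogue of: select $fh_stdout
    chatty()
# stdout is restored here, like select $fh_STDOUT switching back
```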
Most languages create variables and objects in very different ways. In Perl, they are very similar.
Perl allows most types of variables to be marked as an object with the bless function.
This confers additional powers on the variable: you can call methods from the class on it, and Perl will search that class for a method of that name. If you fail to supply the second argument to bless, it will use the current package as the class to search.
In the IO::Pipe example, you call IO::Pipe's new() method to obtain a blessed object. To then make use of it, they fork() and the parent converts ("re-blesses") $pipe into a reader subclass of IO::Pipe whose method calls work as a reader. The child process converts its $pipe to a writer. Now they may communicate from child to parent via the pipe.
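A bare-bones Python analogue of the pipe's two ends becoming a reading handle and a writing handle, using `os.pipe` (no fork, to keep the sketch short):

```python
import os

r_fd, w_fd = os.pipe()           # analogue of IO::Pipe->new: raw descriptors
reader = os.fdopen(r_fd, "r")    # one end "becomes a handle" for reading...
writer = os.fdopen(w_fd, "w")    # ...the other a handle for writing
writer.write("hello from the pipe\n")
writer.close()                   # close so the reader sees EOF after the line
line = reader.readline()
reader.close()
```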

How to pipe to and read from the same tempfile handle without race conditions?

Was debugging a perl script for the first time in my life and came over this:
$my_temp_file = File::Temp->tmpnam();
system("cmd $blah | cmd2 > $my_temp_file");
open(FIL, "$my_temp_file");
...
unlink $my_temp_file;
This works pretty much like I want, except for the obvious race conditions in lines 1-3. Even if using a proper tempfile() there is no way (that I can think of) to ensure that the file streamed to at line 2 is the same one opened at line 3. One solution might be pipes, but errors during cmd might occur late because of limited pipe buffering, and that would complicate my error handling (I think).
How do I:
Write all output from cmd $blah | cmd2 into a tempfile opened file handle?
Read the output without re-opening the file (risking race condition)?
You can open a pipe to a command and read its contents directly with no intermediate file:
open my $fh, '-|', 'cmd', $blah;
while( <$fh> ) {
...
}
With short output, backticks might do the job, although in this case you have to be more careful to scrub the inputs so they aren't misinterpreted by the shell:
my $output = `cmd $blah`;
There are various modules on CPAN that handle this sort of thing, too.
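For reference, a rough Python equivalent of the piped open, using `subprocess` (`echo` is a stand-in for the real command pipeline):

```python
import subprocess

# analogue of: open my $fh, '-|', 'cmd', $blah;
# the argument list form avoids the shell, like Perl's list-form pipe open
proc = subprocess.Popen(["echo", "hello"], stdout=subprocess.PIPE, text=True)
lines = [line for line in proc.stdout]  # read the command's output directly
proc.wait()
```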
Some comments on temporary files
The comments mentioned race conditions, so I thought I'd write a few things for those wondering what people are talking about.
In the original code, Andreas uses File::Temp, a module from the Perl Standard Library. However, they use the tmpnam POSIX-like call, which has this caveat in the docs:
Implementations of mktemp(), tmpnam(), and tempnam() are provided, but should be used with caution since they return only a filename that was valid when function was called, so cannot guarantee that the file will not exist by the time the caller opens the filename.
This is discouraged and was removed from Perl v5.22's POSIX.
That is, you get back the name of a file that does not exist yet. After you get the name, you don't know if that filename was made by another program. And, that unlink later can cause problems for one of the programs.
The "race condition" comes in when two programs that probably don't know about each other try to do the same thing at roughly the same time. Your program tries to make a temporary file named "foo", and so does some other program. They both might see at the same time that a file named "foo" does not exist, then try to create it. They both might succeed, and as they both write to it, they might interleave or overwrite the other's output. Then, one of those programs thinks it is done and calls unlink. Now the other program wonders what happened.
In the malicious exploit case, some bad actor knows a temporary file will show up, so it recognizes a new file and gets in there to read or write data.
But this can also happen within the same program. Two or more versions of the same program run at the same time and try to do the same thing. With randomized filenames, it is probably exceedingly rare that two running programs will choose the same name at the same time. However, we don't care how rare something is; we care how devastating the consequences are should it happen. And, rare is much more frequent than never.
File::Temp
Knowing all that, File::Temp handles the details of ensuring that you get a filehandle:
my( $fh, $name ) = File::Temp->tempfile;
This uses a default template to create the name. When the filehandle goes out of scope, File::Temp also cleans up the mess.
{
my( $fh, $name ) = File::Temp->tempfile;
print $fh ...;
...;
} # file cleaned up
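Python's `tempfile` module makes the same guarantee: `NamedTemporaryFile` creates and opens the file atomically and removes it on close, roughly like `File::Temp` cleaning up when the handle goes out of scope (a sketch):

```python
import tempfile

# created and opened in one atomic step, so no window for another
# process to sneak in between choosing the name and creating the file
with tempfile.NamedTemporaryFile(prefix="myprog-") as tmp:
    tmp.write(b"scratch data\n")
    tmp.flush()
    tmp.seek(0)
    data = tmp.read()
# the file is deleted when the block exits
```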
Some systems might automatically clean up temp files, although I haven't cared about that in years. Typically it was a batch thing (say, once a week).
I often go one step further by giving my temporary filenames a template, where the Xs are literal characters the module recognizes and fills in with randomized characters:
my( $fh, $name ) = File::Temp->tempfile(
sprintf "$0-%d-XXXXXX", time );
I'm often doing this while I'm developing things so I can watch the program make the files (and in which order) and see what's in them. In production I probably want to obscure the source program name ($0) and the time; I don't want to make it easier to guess who's making which file.
A scratchpad
I can also open a temporary file with open by not giving it a filename. This is useful when you don't need the data to exist outside the program. Opening it read-write means you can output some stuff, then move around that file (we show a fixed-length record example in Learning Perl):
open(my $tmp, "+>", undef) or die ...
print $tmp "Some stuff\n";
seek $tmp, 0, 0;
my $line = <$tmp>;
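The anonymous read-write scratch file has a close Python analogue in `tempfile.TemporaryFile`, which likewise never exposes a name (a sketch):

```python
import tempfile

# anonymous scratch file, like: open(my $tmp, "+>", undef)
tmp = tempfile.TemporaryFile(mode="w+")
tmp.write("Some stuff\n")
tmp.seek(0)                # rewind, like seek $tmp, 0, 0
line = tmp.readline()
tmp.close()                # the unnamed file disappears on close
```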
File::Temp opens the temp file in O_RDWR mode so all you have to do is use that one file handle for both reading and writing, even from external programs. The returned file handle is overloaded so that it stringifies to the temp file name so you can pass that to the external program. If that is dangerous for your purpose you can get the fileno() and redirect to /dev/fd/<fileno> instead.
All you have to do is mind your seeks and tells. :-) Just remember to always set autoflush!
use File::Temp;
use Data::Dump;
$fh = File::Temp->new;
$fh->autoflush;
system "ls /tmp/*.txt >> $fh" and die $!;
@lines = <$fh>;
printf "%s\n\n", Data::Dump::pp(\@lines);
print $fh "How now brown cow\n";
seek $fh, 0, 0 or die $!;
@lines2 = <$fh>;
printf "%s\n", Data::Dump::pp(\@lines2);
Which prints
[
"/tmp/cpan_htmlconvert_DPzx.txt\n",
"/tmp/cpan_htmlconvert_DunL.txt\n",
"/tmp/cpan_install_HfUe.txt\n",
"/tmp/cpan_install_XbD6.txt\n",
"/tmp/cpan_install_yzs9.txt\n",
]
[
"/tmp/cpan_htmlconvert_DPzx.txt\n",
"/tmp/cpan_htmlconvert_DunL.txt\n",
"/tmp/cpan_install_HfUe.txt\n",
"/tmp/cpan_install_XbD6.txt\n",
"/tmp/cpan_install_yzs9.txt\n",
"How now brown cow\n",
]
HTH

What is the Perl's IO::File equivalent to open($fh, ">:utf8",$path)?

It's possible to write a file UTF-8 encoded as follows:
open my $fh,">:utf8","/some/path" or die $!;
How do I get the same result with IO::File, preferably in 1 line?
I got this one, but does it do the same and can it be done in just 1 line?
my $fh_out = IO::File->new($target_file, 'w');
$fh_out->binmode(':utf8');
For reference, the script starts as follows:
use 5.020;
use strict;
use warnings;
use utf8;
# code here
Yes, you can do it in one line.
open accepts one, two or three parameters. With one parameter, it is just a front end for the built-in open function. With two or three parameters, the first parameter is a filename that may include whitespace or other special characters, and the second parameter is the open mode, optionally followed by a file permission value.
[...]
If IO::File::open is given a mode that includes the : character, it passes all the three arguments to the three-argument open operator.
So you just do this.
my $fh_out = IO::File->new('/some/path', '>:utf8');
It is the same as your first open line because it gets passed through.
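For comparison, Python folds the encoding into `open` the same way; here a temporary path stands in for the question's /some/path so the sketch is self-contained:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.txt")  # stand-in for /some/path
# encoding= plays the role of the ':utf8' layer, in one line
with open(path, "w", encoding="utf-8") as fh_out:
    fh_out.write("caf\u00e9\n")   # 'café'
```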
I would suggest to try out Path::Tiny. For example, to open and write out your file
use Path::Tiny;
path('/some/path')->spew_utf8(@data);
From the docs, on spew, spew_raw, spew_utf8
Writes data to a file atomically. [ ... ]
spew_raw is like spew with a binmode of :unix for a fast, unbuffered, raw write.
spew_utf8 is like spew with a binmode of :unix:encoding(UTF-8) (or PerlIO::utf8_strict ). If Unicode::UTF8 0.58+ is installed, a raw spew will be done instead on the data encoded with Unicode::UTF8.
The module integrates many tools for handling files and directories, paths and content. It is often simple calls like the above, but also method chaining, a recursive directory iterator, hooks for callbacks, etc. There is error handling throughout, consistent and thoughtful dealing with edge cases, flock on input/output handles, its own tiny and useful class for exceptions ... see the docs.
Edit:
You could also use File::Slurp, if it were not discouraged.
E.g.:
use File::Slurp qw(write_file);
write_file( 'filename', {binmode => ':utf8'}, $buffer ) ;
The first argument to write_file is the filename. The next argument is
an optional hash reference and it contains key/values that can modify
the behavior of write_file. The rest of the argument list is the data
to be written to the file.
Some good reasons to not use?
Not reliable
Has some bugs
And as @ThisSuitIsBlackNot said, File::Slurp is broken and wrong

PERL: String Replacement on file

I am working on a script to do a string replacement in a file and I will read the variables and values and files from a configuration file and do string replacement.
Here is my logic to do a string replacement.
sub expansion($$$){
my $f = shift(@_) ; # file Name
my $vname = shift(@_) ; # variable name for pattern match
my $value = shift(@_) ; # value to replace
my $n = "$f".".new";
open ( O, "<$f") or print( "Can't open $f file: $!");
open ( N ,">$n" ) or print( "Can't open $n file: $!");
while (<O>)
{
$_ =~ s/$vname/$value/g; #check for pattern
print N "$_" ;
}
close (O);
close (N);
}
In my logic am reading line by line in from input file ($f) for the pattern and writing to a new file ($n) .
Instead of writing to a new file, is there any way to do the string replacement on the original file? When I try that, I end up with only an empty file with no contents.
Do not. Never, ever.1 Don't you dare, don't even think of it: do not use subroutine prototypes. They are horribly broken (that is, they don't do what you think they do) and dangerous.
Now, we got that out of the way:
Yes, you can do what you want. You can open a file as both readable and writable by using the mode +<. So far, so good.
However, due to buffering, you cannot use the standard read and write methods to read and write to the file. Instead, you need to use sysread and syswrite.
Then, what you need to do is read the line, use sysseek to go back to the start of where you read, and then write to that spot.
Not only is it very complex to do, but it is full of peril. Let's take a simple example. I have a document, and I want to replace my curly quotes with straight quotes.
$line =~ s/“|”/"/g;
That should work. I'm replacing one character with another. What could go wrong?
If this is a UTF-8 file (which Macs and Linux systems use by default), those curly quotes are multi-byte characters and that straight quote is a single-byte character. I would be writing back a line that was shorter than the line I read in. My buffer is going to be off.
Back in the days when computer memory and storage were measured in kilobytes, and you used serial devices like reel-to-reel tapes, this type of operation was quite common. However, in this age where storage is vast, it's simply not worth the complexity and error-prone process this entails. Stick with reading from one file and writing to another. Then use unlink and rename to delete the original and rename the copy to the original's name.
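That read-one-file, write-another, rename-over approach might look like this in Python (a sketch; `expansion` mirrors the question's sub name, and `os.replace` performs the rename atomically on POSIX):

```python
import os

def expansion(path, pattern, replacement):
    # read the original, write a sibling ".new" file, then rename it
    # over the original -- the safe approach recommended above
    tmp = path + ".new"
    with open(path, encoding="utf-8") as src, \
         open(tmp, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.replace(pattern, replacement))
    os.replace(tmp, path)  # replaces unlink-plus-rename in one call
```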
A few more pointers:
Don't print if the file can't be opened. Use die. Otherwise, your program will simply continue on blithely unaware that it is not working. Even better, use the pragma use autodie;, and you won't have to worry about testing whether or not a read/write failed.
Use scalars for file handles.
That is instead of
open OUT, ">my_file.txt";
use
open my $out_fh, ">my_file.txt";
And, it is highly recommended to use the three parameter open:
Use
open my $out_fh, ">", "my_file.txt";
If you aren't, always add use strict; and use warnings;.
In fact, your Perl syntax is a bit ancient. You should get a book on Modern Perl. Perl was originally written as a hack language to replace shell and awk programming. However, Perl has morphed into a full-fledged language that can handle complex data types, object orientation, and large projects. Learning the modern syntax of Perl will help you find errors and become a better developer.
1. Like all rules, this can be broken, but only if you have a clear and careful understanding what is going on. It's like those shows that say "Don't do this at home. We're professionals."
sub inplace_expansion($$$){
my $f = shift(@_) ; # file Name
my $vname = shift(@_) ; # variable name for pattern match
my $value = shift(@_) ; # value to replace
local @ARGV = ( $f );
local $^I = '';
while (<>)
{
s/\Q$vname/$value/g; #check for pattern
print;
}
}
or, my preference would run closer to this (basically equivalent, changes mostly in formatting, variable names, etc.):
use English;
sub inplace_expansion {
my ( $filename, $pattern, $replacement ) = @_;
local @ARGV = ( $filename ),
$INPLACE_EDIT = '';
while ( <> ) {
s/\Q$pattern/$replacement/g;
print;
}
}
The trick with local basically simulates a command-line script (as one would run with perl -e); for more details, see perldoc perlrun. For more on $^I (aka $INPLACE_EDIT), see perldoc perlvar.
(For the business with \Q (in the s// expression), see perldoc -f quotemeta. This is unrelated to your question, but good to know. Also be aware that passing regex patterns around in variables—as opposed to, e.g., using literal regexes exclusively— can be vulnerable to injection attacks; Perl's built-in taint mode is useful here.)
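Python's closest analogue to `local $^I = ''` is `fileinput` with `inplace=True`, which re-points `print` into the file being read (a sketch; `inplace_expansion` mirrors the Perl sub's name):

```python
import fileinput

def inplace_expansion(filename, pattern, replacement):
    # inplace=True redirects stdout into the file, like Perl's $^I = ''
    with fileinput.input(files=(filename,), inplace=True) as fh:
        for line in fh:
            # plain substring replace; the Perl \Q...\E quoting has the
            # same effect of treating the pattern as literal text
            print(line.replace(pattern, replacement), end="")
```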
EDIT: David W. is right about prototypes.

Should I manually set Perl's @ARGV so I can use <> to open, scan, and close files?

I have recently started learning Perl and one of my latest assignments involves searching a bunch of files for a particular string. The user provides the directory name as an argument and the program searches all the files in that directory for the pattern. Using readdir() I have managed to build an array with all the searchable file names and now need to search each and every file for the pattern, my implementation looks something like this -
sub searchDir($) {
my $dirN = shift;
my @dirList = glob("$dirN/*");
for(@dirList) {
push @fileList, $_ if -f $_;
}
@ARGV = @fileList;
while(<>) {
## Search for pattern
}
}
My question is - is it alright to manually load the @ARGV array as has been done above and use the <> operator to scan individual lines, or should I open/scan/close each file individually? Will it make any difference if this processing exists in a subroutine and not in the main program?
On the topic of manipulating @ARGV - that's definitely working code, Perl certainly allows you to do that. I don't think it's a good coding habit though. Most of the code I've seen that uses the "while (<>)" idiom is using it to read from standard input, and that's what I initially expect your code to do. A more readable pattern might be to open/close each input file individually:
foreach my $file (@files) {
open FILE, "<$file" or die "Error opening file $file ($!)";
my @lines = <FILE>;
close FILE or die $!;
foreach my $line (@lines) {
if ( $line =~ /$pattern/ ) {
# do something here!
}
}
}
That would read more easily to me, although it is a few more lines of code. Perl allows you a lot of flexibility, but I think that makes it that much more important to develop your own style in Perl that's readable and understandable to you (and your co-workers, if that's important for your code/career).
Whether this processing lives in the main program or in a subroutine is also mostly a stylistic decision that you should play around with and think about. Modern computers are so fast at this stuff that style and readability are much more important for scripts like this, as you're not likely to encounter situations in which such a script over-taxes your hardware.
Good luck! Perl is fun. :)
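If it helps, the explicit per-file open/scan/close loop translates naturally to Python (a sketch; the names are illustrative):

```python
import re

def search_files(files, pattern):
    # open, scan, and close each file explicitly, mirroring the
    # per-file loop suggested above
    hits = []
    for path in files:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                if re.search(pattern, line):
                    hits.append((path, line))
    return hits
```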
Edit: It's of course true that if he had a very large file, he should do something smarter than slurping the entire file into an array. In that case, something like this would definitely be better:
while ( my $line = <FILE> ) {
if ( $line =~ /$pattern/ ) {
# do something here!
}
}
The point when I wrote "you're not likely to encounter situations in which such a script over-taxes your hardware" was meant to cover that, sorry for not being more specific. Besides, who even has 4GB hard drives, let alone 4GB files? :P
Another Edit: After perusing the Internet on the advice of commenters, I've realized that there are hard drives that are much larger than 4GB available for purchase. I thank the commenters for pointing this out, and promise in the future to never-ever-ever try to write a sarcastic comment on the internet.
I would prefer this more explicit and readable version:
#!/usr/bin/perl -w
foreach my $file (<$ARGV[0]/*>){
open(F, $file) or die "$!: $file";
while(<F>){
# search for pattern
}
close F;
}
But it is also okay to manipulate @ARGV:
#!/usr/bin/perl -w
@ARGV = <$ARGV[0]/*>;
while(<>){
# search for pattern
}
Yes, it is OK to adjust the argument list before you start the 'while (<>)' loop; it would be more nearly foolhardy to adjust it while inside the loop. If you process option arguments, for instance, you typically remove items from @ARGV; here, you are adding items, but it still changes the original value of @ARGV.
It makes no odds whether the code is in a subroutine or in the 'main function'.
The previous answers cover your main Perl-programming question rather well.
So let me comment on the underlying question: How to find a pattern in a bunch of files.
Depending on the OS it might make sense to call a specialised external program, say
grep -l <pattern> <path>
on unix.
Depending on what you need to do with the files containing the pattern, and how big the hit/miss ratio is, this might save quite a bit of time (and re-uses proven code).
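A sketch of shelling out that way from Python, with a throwaway directory so the example is self-contained (`grep -l` prints the names of files containing the pattern):

```python
import os
import subprocess
import tempfile

d = tempfile.mkdtemp()   # stand-in for the user-supplied directory
for name, text in [("a.txt", "needle here\n"), ("b.txt", "nothing\n")]:
    with open(os.path.join(d, name), "w") as f:
        f.write(text)

# analogue of: grep -l <pattern> <path>
files = sorted(os.path.join(d, n) for n in os.listdir(d))
out = subprocess.run(["grep", "-l", "needle", *files],
                     capture_output=True, text=True)
matching = out.stdout.splitlines()
```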
The big issue with tweaking @ARGV is that it is a global variable. Also, you should be aware that while (<>) has special magic attributes (reading each file in @ARGV, or processing STDIN if @ARGV is empty; testing for definedness rather than truth). To reduce the magic that needs to be understood, I would avoid it, except for quickie hack jobs.
You can get the filename of the current file by checking $ARGV.
You may not realize it, but you are actually affecting two global variables, not just #ARGV. You are also hitting $_. It is a very, very good idea to localize $_ as well.
You can reduce the impact of munging globals by using local to localize the changes.
BTW, there is another important, subtle bit of magic with <>. Say you want to return the line number of the match in the file. You might think, ok, check perlvar and find $. gives the linenumber in the last handle accessed--great. But there is an issue lurking here--$. is not reset between @ARGV files. This is great if you want to know how many lines total you have processed, but not if you want a line number for the current file. Fortunately there is a simple trick with eof that will solve this problem.
use strict;
use warnings;
...
searchDir( 'foo' );
sub searchDir {
my $dirN = shift;
my $pattern = shift;
local $_;
my @fileList = grep { -f $_ } glob("$dirN/*");
return unless @fileList; # Don't want to process STDIN.
local @ARGV;
@ARGV = @fileList;
while(<>) {
my $found = 0;
## Search for pattern
if ( $found ) {
print "Match at $. in $ARGV\n";
}
}
continue {
# reset line numbering after each file.
close ARGV if eof; # don't use eof().
}
}
WARNING: I just modified your code in my browser. I have not run it, so it may have typos and probably won't work without a bit of tweaking.
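For what it's worth, Python's `fileinput` handles the per-file line-number reset for you, which is what the `eof`/`close ARGV` trick accomplishes in the Perl version (a sketch; `find_matches` is an illustrative name):

```python
import fileinput

def find_matches(files, needle):
    # fileinput.filelineno() is the line number within the current file,
    # and fileinput.filename() is the current file -- the analogues of
    # a per-file $. and $ARGV, with no manual reset needed
    hits = []
    with fileinput.input(files=files) as fh:
        for line in fh:
            if needle in line:
                hits.append((fileinput.filename(), fileinput.filelineno()))
    return hits
```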
Update: The reason to use local instead of my is that they do very different things. my creates a new lexical variable that is only visible in the contained block and cannot be accessed through the symbol table. local saves the existing package variable and aliases it to a new variable. The new localized version is visible in any subsequent code, until we leave the enclosing block. See perlsub: Temporary Values Via local().
In the general case of making new variables and using them, my is the correct choice. local is appropriate when you are working with globals, but you want to make sure you don't propagate your changes to the rest of the program.
This short script demonstrates local:
$foo = 'foo';
print_foo();
print_bar();
print_foo();
sub print_bar {
local $foo;
$foo = 'bar';
print_foo();
}
sub print_foo {
print "Foo: $foo\n";
}