perl: canot open file within a loop - perl

I am trying to read in a bunch of similar files and process them one by one. Here is the code I have. But somehow the perl script doesn't read in the files correctly. I'm not sure how to fix it. The files are definitely readable and writable by me.
#!/usr/bin/perl
use strict;
use warnings;
my #olap_f = `ls /full_dir_to_file/*txt`;
foreach my $file (#olap_f){
my %traits_h;
open(IN,'<',$file) || die "cannot open $file";
while(<IN>){
chomp;
my #array = split /\t/;
my $trait = $array[4];
$traits_h{$trait} ++;
}
close IN;
}
When I run it, the error message (something like below) showed up:
cannot open /full_dir_to_file/a.txt

You have newlines at the end of each filename:
my #olap_f = `ls ~dir_to_file/*txt`;
chomp #olap_f; # Remove newlines
Better yet, use glob to avoid launching a new process (and having to trim newlines):
my #olap_f = glob "~dir_to_file/*txt";
Also, use $! to find out why a file couldn't be opened:
open(IN,'<',$file) || die "cannot open $file: $!";
This would have told you
cannot open /full_dir_to_file/a.txt
: No such file or directory
which might have made you recognize the unwanted newline.

I'll add a quick plug for IO::All here. It's important to know what's going on under the hood but it's convenient sometimes to be able to do:
use IO::All;
my #olap_f = io->dir('/full_dir_to_file/')->glob('*txt');
In this case it's not shorter than #cjm's use of glob but IO::All does have a few other convenient methods for working with files as well.

Related

How to match and find common based on substring from two files?

I have two files. File1 contains list of email addresses. File2 contains list of domains.
I want to filter out all the email addresses after matching exact domain using Perl script.
I am using below code, but I don't get correct result.
#!/usr/bin/perl
#use strict;
#use warnings;
use feature 'say';
my $file1 = "/home/user/domain_file" or die " FIle not found\n";
my $file2 = "/home/user/email_address_file" or die " FIle not found\n";
my $match = open(MATCH, ">matching_domain") || die;
open(my $data1, '<', $file1) or die "Could not open '$file1' $!\n";
my #wrd = <$data1>;
chomp #wrd;
# loop on the fiile to be searched
open(my $data2, '<', $file2) or die "Could not open '$file2' $!\n";
while(my $line = <$data2>) {
chomp $line;
foreach (#wrd) {
if($line =~ /\#$_$/) {
print MATCH "$line\n";
}
}
}
File1
abc#1gmail.com.au
abc#gmail.com
abc#gmail.com1
abc#2outlook.com2
abc#outlook.com1
abc#yahoo.com
abc#yahooo1.com
abc#yahooo.com
File2
yahoo.com
gmail.com
Expected output
abc#gmail.com
abc#yahoo.com
First off, since you seem to be on *nix, you might want to check out grep -f, which can take search patterns from a given file. I'm no expert in grep, but I would try the file and "match whole words" and this should be fairly easy.
Second: Your Perl code can be improved, but it works as expected. If you put the emails and domains in the files as indicated by your code. It may be that you have mixed the files up.
If I run your code, fixing only the paths, and keeping the domains in file1, it does create the file matching_domain and it contains your expected output:
abc#gmail.com
abc#yahoo.com
So I don't know what you think your problem is (because you did not say). Maybe you were expecting it to print output to the terminal. Either way, it does work, but there are things to fix.
#use strict;
#use warnings;
It is a huge mistake to remove these two. Biggest mistake you will ever do while coding Perl. It will not remove your errors, just hide them. You will spend 10 times as much time bug fixing. Uncomment this as your first thing you do to fix this.
use feature 'say';
You never use this. You could for example replace print MATCH "$line\n" with say MATCH $line, which is slightly more concise.
my $file1 = "/home/user/domain_file" or die " FIle not found\n";
my $file2 = "/home/user/email_address_file" or die " FIle not found\n";
This is very incorrect. You are placing a condition on the creation of a variable. If the condition fails, does the variable exist? Don't do this. I assume this is to check if the file exists, but that is not what this does. To check if a file exists, you can use -e, documented as perldoc "-X" (various file tests).
Furthermore, a statement in the form of a string, "/home/user..." is TRUE ("truthy"), as far as Perl conditions are concerned. It is only false if it is "0" (zero), "" (empty) or undef (undefined). So your or clause will never be executed. E.g. "foo" or die will never die.
Lastly, this test is quite meaningless, as you will be testing this in your open statement later on anyway. If the file does not exist, the open will fail and your program will die.
my $match = open(MATCH, ">matching_domain") || die;
This is also very incorrect. First off, you never use the $match variable. Secondly, I bet it does not contain what you think it does. (it contains a boolean which states whether open was successful or not, see perldoc -f open) Thirdly, again, don't put conditions on my declarations of variables, it is a bad idea.
What this statement really means is that $match will contain either the return value of the open, or the return value of die. This should probably be simply:
open my $match, ">", "matching_domain" or die "Cannot open '$match': $!;
Also, use the three argument open with explicit open MODE, and use lexical file handles, like you have done elsewhere.
And one more thing on top of all the stuff I've already badgered you with: I don't recommend hard coding output files for small programs like this. If you want to redirect the output, use shell redirection: perl foo.pl > output.txt. I think this is what has prompted you to think something is wrong with your code: You don't see the output.
Other than that, your code is fine, as near as I can tell. You may want to chomp the lines from the domain file, but it should not matter. Also remember that indentation is a good thing, and it helps you read your code. I mentioned this in a comment, but it was removed for some reason. It is important though.
Good luck!
This assumes that the lines labeled File1 are in the file pointed to by $file1 and the lines labeled File2 are in the file pointed to by $file2.
You have your variables swapped. You want to match what is in $line against $_, not the other way around:
# loop on the file to be searched
open( my $data2, '<', $file2 ) or die "Could not open '$file2' $!\n";
while ( my $line = <$data2> ) {
chomp $line;
foreach (#wrd) {
if (/\#$line$/) {
print MATCH "$_\n";
}
}
}
You should un-comment the warnings and strict lines:
use strict;
use warnings;
warnings shows you that the or die checks are not really working the way you intended in the file name assignment statements. Just use :
my $file1 = "/home/user/domain_file";
my $file2 = "/home/user/email_address_file";
You are already doing the checks where they belong (on open).

Replace multiple lines in text file

I have text files containing the text below (amongst other text)
DIFF_COEFF= 1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,4.000e+05,
and I need to replace it with the following text:
DIFF_COEFF= 2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,8.000e+05,
Each line above corresponds to a new line in the text file.
After some googling, I thought making use of Perl in the following might work, but it did not. I got the error message
Illegal division by zero at -e line 1, <> chunk 1
s_orig='DIFF_COEFF=*4.000e+05,'
s_new='DIFF_COEFF= 2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,8.000e+05,'
perl -0 -i -pe "s:\Q${s_orig}\E:${s_new}:/igs" file.txt
Does anyone here know the right way to do this?
Edit - some more details: the text after this block is "DIFF_COEFF_Q=" followed by the same set of numbers, so I need to search for and replace the specific lines shown. The text files are not very large in size.
Copy the file over to a new one, except that within the range of text between these markers drop the replacement text instead. Then move that file to replace the original, as it may be needed judging by the attempted perl -0 -i in the question.
Note that when changing a file we have to build new content and then replace the file. There are a few ways to do this and modules that make it easier, shown further below.
The code below uses the range operator and the fact that it returns the counter for lines within the range, 1 for the first and the number ending with E0 for the last. So we don't copy lines inside that region while we write the replacement text (and the post-region-end marker) on the last line.
I consider the region of interest to end right before DIFF_COEFF_Q= line, per the question edit.
use warnings;
use strict;
use feature 'say';
use File::Copy 'move';
my $replacement = "replacement text";
my $file = 'input.txt';
my $out_file = 'new_' . $file;
open my $fh_out, '>', $out_file or die "Can't open $out_file: $!";
open my $fh, '<', $file or die "Can't open $file: $!";
while (<$fh>)
{
if (my $range_cnt = /^\s*DIFF_COEFF\s*=/ .. /^\s*DIFF_COEFF_Q\s*=/) #/
{
if ($range_cnt =~ /E0$/)
{
print $fh_out $replacement; # may need a newline
print $fh_out $_;
}
}
else {
print $fh_out $_;
}
}
close $fh or die "Can't close $file: $!"; # don't overwrite original
close $fh_out or die "Can't close $out_file: $!"; # if there are problems
#move $out_file, $file or die "Can't move $file to $out_file: $!";
Uncomment the move line once this has been tested well enough on your actual files, if you want to replace the original. You may or may not need a newline after $replacement, depending on it.
An alternative is to use flags for entering/leaving that range. But this won't be cleaner since there are two distinct actions, to stop copying when entering the range and write replacement when leaving. Thus multiple flags need be set and checked, what may end up messier.
If the files can't ever be huge it is simpler to read and process the file in memory. Then open the same file for writing and dump the new content
my $text = do { # slurp file into a scalar
local $/;
open my $fh, '<', $file or die "Can't open $file: $!";
<$fh>
};
$text =~ s/^\s*DIFF_COEFF\s*=.*?(\n\s*DIFF_COEFF_Q)/$replacement$1/ms;
# Change $out_file to $file to overwrite
open my $fh_out, '>', $out_file or die "Can't open $out_file: $!";
print $fh_out $text;
Here /m modifier is for multiline mode in which we can use ^ for the beginning of a line (not the whole string), what is helpful here. The /s makes . match a newline, too. Also note that we can slurp a file with Path::Tiny as simply as: my $text = path($file)->slurp;
Another option is to use Path::Tiny, which in newer versions has edit and edit_lines methods
use Path::Tiny;
# NOTE: edits $file in place (changes it)
path($file)->edit(
sub { s/DIFF_COEFF=.*?(\n\s*DIFF_COEFF_Q)/$replacement$1/s }
);
For more on this see, for example, this post and this post and this post.
The first and last way change the inode number of the file. See this post if that is a problem.
It's an interesting error that you've made and I can see what has led you to make it. But I don't think I've ever seen anyone else make the same mistake :-)
Your substitution statement is this:
s:\Q${s_orig}\E:${s_new}:/igs
So you've decided to use : as the delimiter of the substitution operator. But you want to use the options i, g and s and everywhere you've seen people talk about options on a substitution operator, they talk about using / to introduce the options. So you've added /igs to your substitution operator.
But what you've missed (and I completely understand why) is that the / that comes before the options is actually the closing delimiter of the standard, s/.../.../, version of the substitution operator. If you change the delimiter (as you have done) then your altered closing delimiter is all you need.
In your case, Perl doesn't expect the / as it has already seen the closing delimiter. It, therefore, decides that the / is a division operator and tries to divide the result of your substitution by igs. It interprets igs as zero and you get your error.
The fix is to remove that / so:
s:\Q${s_orig}\E:${s_new}:/igs
becomes:
s:\Q${s_orig}\E:${s_new}:igs

Printing a text file using perl

I am completely new to this and this should be the easiest thing to do but for some reason I cannot get my local text file to print. After trying multiple times with different code I came to use the following code but it doesn't print.
I have searched for days on various threads to solve this and have had no luck. Please help. Here is my code:
#!/usr/bin/perl
$newfile = "file.txt";
open (FH, $newfile);
while ($file = <FH>) {
print $file;
}
I updated my code to the following:
#!/user/bin/perl
use strict; # Always use strict
use warnings; # Always use warnings.
open(my $fh, "<", "file.txt") or die "unable to open file.txt: $!";
# Above we open file using 3 handle method
# or die die with error if unable to open it.
while (<$fh>) { # While in the file.
print $_; # Print each line
}
close $fh; # Close the file
system('C:\Users\RSS\file.txt');
It returns the following: my first report generated by perl. I do not know where this is coming from. Nowhere do I have a print "my first report generated by perl."; statement and it definitely is not in my text file.
My text file is full of various emails, addresses, phone numbers and snippets of emails.
Thank you all for your help. I figured out my problem. I somehow managed to kick myself out of my directory and did not realize it.
This is most likely a combination of a failure to open the file, and a failure to check the return value of open.
If you are completely new to perl, I warmly recommend reading the excellent "perlintro" man page, using either man perlintro or perldoc perlintro on the command line, or taking a look here: https://perldoc.perl.org/perlintro.html.
The "Files and I/O" section there gives a good and concise way of doing this:
open(my $in, "<", "input.txt") or die "Can't open input.txt: $!";
while (<$in>) { # assigns each line in turn to $_
print "Just read in this line: $_";
}
This version will give you an explanation and abort if anything goes wrong while trying to open the file. For example, if there is no file named file.txt in the current working directory, your version will quietly fail to open the file, and afterwards it will quietly fail to read from the closed file handle.
Also, always adding at least one of these to your perl scripts will save you a lot of trouble in the long run:
use warnings; # or use the -w command line switch to turn warnings on globally
use diagnostics;
These won't catch the failure to open the file, but will alert on the failed read.
In the first example here you can see that without the diagnostics module, the code fails without any error messages. The second example shows how the diagnostics module changes this.
$ perl -le 'open FH, "nonexistent.txt"; while(<FH>){print "foo"}'
$ perl -le 'use diagnostics; open FH, "nonexistent.txt"; while(<FH>){print "foo"}'
readline() on closed filehandle FH at -e line 1 (#1)
(W closed) The filehandle you're reading from got itself closed sometime
before now. Check your control flow.
By the way, the legendary "Camel Book" is basically the perl man pages formatted for paper printing, so reading the perldocs in the order listed in perldoc perl will give you a high level of understanding of the language in a reasonably accessible and inexpensive manner.
Happy hacking!
This is simple and including explanations.
use strict; # Always use strict
use warnings; # Always use warnings.
open(my $fh, "<", "file.txt") or die "unable to open file.txt: $!";
# Above we open file using 3 handle method
# or die die with error if unable to open it.
while (<$fh>) { # While in the file.
print $_; # Print each line
}
close $fh; # Close the file
There is then also the case where you are trying to open a file which is not in a location where you think it is. So consider doing full path, if not in the same dir.
open(my $fh, "<", 'F:\Workdir\file.txt') or die "unable to open < input.txt: $!";
EDIT: After your comments, it seems that you are opening an empty file. Please add this at the bottom of that same script and rerun. It will open the file in C:\Users\RSS and make sure it does actually contain data?
system('C:\Users\RSS\file.txt');
First, of all as you are starting out, it is better to enable all warnings by 'use warnings' and disable all such expression which can lead to uncertain behavior or are difficult to debug by pragma 'use strict'.
As you are dealing with file stream, it is always recommended to the check if you were able to open the stream. so, try to use croak or die both would terminate the program with a given message.
Instead of reading inside the while condition, I would recommend checking for end of file. So, loop breaks as end is found. Usually, when reading a line you would use it for further processing, so it is good idea to remove end of lines using chomp.
A sample for reading a file in perl can be as follows:
#!/user/bin/perl
use strict;
use warnings;
my $newfile = "file.txt";
open (my $fh, $newfile) or die "Could not open file '$newfile' $!";
while (!eof($fh))
{
my $line=<$fh>;
chomp($line);
print $line , "\n";
}

File truncated, when opened in Perl

Im new to perl, so sorry if this is obvious, but i looked up how to open a file, and use the flags, but for the life of me they dont seem to work right I narrowed it down to these lines of code.
if ($flag eq "T"){
open xFile, ">" , "$lUsername\\$openFile";
}
else
{
open xFile, ">>", "$lUsername\\$openFile";
}
Both of these methods seem to delete the contents of my file. I also checked if the flag is formatted correctly and it is, i know for a fact ive gone down both conditions.
EDIT: codepaste of a larger portion of my code http://codepaste.net/n52sma
New to Perl? I hope you're using use strict and use warnings.
As other's have stated, you should be using a test to make sure your file is open. However, that's not really the problem here. In fact, I used your code, and it seems to work fine for me. Maybe you should try printing some debugging messages to see if this is doing what you think it's doing:
use strict;
use warnings;
use autodie; #Will stop your program if the "open" doesn't work.
my $lUsername = "ABaker";
my $openFile = "somefile.txt";
if ($flag eq "T") {
print qq(DEBUG: Flag = "$flag": Deleting file "$lUsername/$openFile");
open xFile, ">" , "$lUsername/$openFile";
}
else {
print qq(DEBUG: Flag = "$flag": Appending file "$lUsername/$openFile");
open xFile, ">>", "$lUsername/$openFile";
}
You want to use strict and warnings in order to make sure you're not having issues with variable names. The use strict forces you to declare your variables first. For example, are you setting $Flag, but then using $flag? Maybe $flag is set the first time through, but you're setting $Flag the second time through.
Anyway, the DEBUG: statements will give you a better idea of what your error could be.
By the way, in Perl, you're checking if $flag is set to T and not t. If you want to test against both t and T, test whether uc $flag eq 'T' and not just $flag eq 'T'.
#Ukemi
I reformated to comply with use strict, i also made print statements to make sure i was trunctating when i want to, and not when i dont. It still is deleting the file. Although now sometimes its simply not writing, im going to give a larger portion of my code in a link, id really appreciate it if you gave it a once over.
Are you seeing it say Truncating, but the file is empty? Are you sure the file already existed? There's a reason why I put the flag and everything in my debug statements. The more you print, the more you know. Try the following section of code:
$file = "lUsername/$openFile" #Use forward slashes vs. back slashes.
if ($flag eq "T") {
print qq(Flag = "$flag". Truncating file "$file"\n);
open $File , '>', $file
or die qq(Unable to open file "$file" for writing: $!\n);
}
else {
print qq(Flag = "$flag". Appending to file "$file"\n);
if (not -e $file) {
print qq(File "$file" does not exist. Will create it\n");
}
open $File , '>>', $file
or die qq(Unable to open file "$file" for appending: $!\n);
}
Note I'm printing out the flag and the name of the file in quotes. This will allow me to see if there are any hidden characters in my file name.
I'm using the qq(...) method to quote strings, so I can use the quotation marks in my print statements.
Also note I'm checking for the existence of the file when I truncate. This way, I make sure the file actually exists.
This should point out any possible errors in your logic. The other thing you can do is to stop your program when you finish writing out the file and verify that the file was written out as expected.
print "Write to file now:\n";
my $writeToFile = <>;
printf $File "$writeToFile";
close $File;
print "DEBUG: Temporary stop. Examine file\n";
<STDIN>; #DEBUG:
Now, if you see it saying it's appending to the file, and the file exists, and you still see the file being overwritten, we'll know the problem lies in your actual open xFile, ">>" $file statement.
You should use the three-argument-version of open, lexical filehandles and check wether there might have been an error:
# Writing to file (clobbering it if it exists)
open my $file , '>', $filename
or die "Unable to write to file '$filename': $!";
# Appending to file
open my $file , '>>', $filename
or die "Unable to append to file '$filename': $!";
>> does not clobber or truncate. Either you ended up in the "then" clause when you expected to be in the "else" clause, or the problem is elsewhere.
To check what $flag contains:
use Data::Dumper;
local $Data::Dumper::Useqq = 1;
print(Dumper($flag));
For your reference I have mentioned some basic file handling techniques below.
open FILE, "filename.txt" or die $!;
The command above will associate the FILE filehandle with the file filename.txt. You can use the filehandle to read from the file. If the file doesn't exist - or you cannot read it for any other reason - then the script will die with the appropriate error message stored in the $! variable.
open FILEHANDLE, MODE, EXPR
The available modes are the following:
read < #this mode will read the file
write > # this mode will create the new file. If the file already exists it will truncate and overwrite.
append >> #this will append the contents if the file already exists,else it will create new one.
if you have confusion on this, you can use the module called File::Slurp;
I have mentioned the sample codes using File::Slurp module.
use strict;
use File::Slurp;
my $read_mode=read_file("test.txt"); #to read file contents
write_file("test2.txt",$read_mode); #to write file
my #all_files=read_dir("/home/desktop",keep_dot_dot=>0); #read a dir
write_file("test2.txt",{append=>1},"#all_files"); #Append mode

What's the best way to open and read a file in Perl?

Please note - I am not looking for the "right" way to open/read a file, or the way I should open/read a file every single time. I am just interested to find out what way most people use, and maybe learn a few new methods at the same time :)*
A very common block of code in my Perl programs is opening a file and reading or writing to it. I have seen so many ways of doing this, and my style on performing this task has changed over the years a few times. I'm just wondering what the best (if there is a best way) method is to do this?
I used to open a file like this:
my $input_file = "/path/to/my/file";
open INPUT_FILE, "<$input_file" || die "Can't open $input_file: $!\n";
But I think that has problems with error trapping.
Adding a parenthesis seems to fix the error trapping:
open (INPUT_FILE, "<$input_file") || die "Can't open $input_file: $!\n";
I know you can also assign a filehandle to a variable, so instead of using "INPUT_FILE" like I did above, I could have used $input_filehandle - is that way better?
For reading a file, if it is small, is there anything wrong with globbing, like this?
my #array = <INPUT_FILE>;
or
my $file_contents = join( "\n", <INPUT_FILE> );
or should you always loop through, like this:
my #array;
while (<INPUT_FILE>) {
push(#array, $_);
}
I know there are so many ways to accomplish things in perl, I'm just wondering if there are preferred/standard methods of opening and reading in a file?
There are no universal standards, but there are reasons to prefer one or another. My preferred form is this:
open( my $input_fh, "<", $input_file ) || die "Can't open $input_file: $!";
The reasons are:
You report errors immediately. (Replace "die" with "warn" if that's what you want.)
Your filehandle is now reference-counted, so once you're not using it it will be automatically closed. If you use the global name INPUT_FILEHANDLE, then you have to close the file manually or it will stay open until the program exits.
The read-mode indicator "<" is separated from the $input_file, increasing readability.
The following is great if the file is small and you know you want all lines:
my #lines = <$input_fh>;
You can even do this, if you need to process all lines as a single string:
my $text = join('', <$input_fh>);
For long files you will want to iterate over lines with while, or use read.
If you want the entire file as a single string, there's no need to iterate through it.
use strict;
use warnings;
use Carp;
use English qw( -no_match_vars );
my $data = q{};
{
local $RS = undef; # This makes it just read the whole thing,
my $fh;
croak "Can't open $input_file: $!\n" if not open $fh, '<', $input_file;
$data = <$fh>;
croak 'Some Error During Close :/ ' if not close $fh;
}
The above satisfies perlcritic --brutal, which is a good way to test for 'best practices' :). $input_file is still undefined here, but the rest is kosher.
Having to write 'or die' everywhere drives me nuts. My preferred way to open a file looks like this:
use autodie;
open(my $image_fh, '<', $filename);
While that's very little typing, there are a lot of important things to note which are going on:
We're using the autodie pragma, which means that all of Perl's built-ins will throw an exception if something goes wrong. It eliminates the need for writing or die ... in your code, it produces friendly, human-readable error messages, and has lexical scope. It's available from the CPAN.
We're using the three-argument version of open. It means that even if we have a funny filename containing characters such as <, > or |, Perl will still do the right thing. In my Perl Security tutorial at OSCON I showed a number of ways to get 2-argument open to misbehave. The notes for this tutorial are available for free download from Perl Training Australia.
We're using a scalar file handle. This means that we're not going to be coincidently closing someone else's file handle of the same name, which can happen if we use package file handles. It also means strict can spot typos, and that our file handle will be cleaned up automatically if it goes out of scope.
We're using a meaningful file handle. In this case it looks like we're going to write to an image.
The file handle ends with _fh. If we see us using it like a regular scalar, then we know that it's probably a mistake.
If your files are small enough that reading the whole thing into memory is feasible, use File::Slurp. It reads and writes full files with a very simple API, plus it does all the error checking so you don't have to.
There is no best way to open and read a file. It's the wrong question to ask. What's in the file? How much data do you need at any point? Do you need all of the data at once? What do you need to do with the data? You need to figure those out before you think about how you need to open and read the file.
Is anything that you are doing now causing you problems? If not, don't you have better problems to solve? :)
Most of your question is merely syntax, and that's all answered in the Perl documentation (especially (perlopentut). You might also like to pick up Learning Perl, which answers most of the problems you have in your question.
Good luck, :)
It's true that there are as many best ways to open a file in Perl as there are
$files_in_the_known_universe * $perl_programmers
...but it's still interesting to see who usually does it which way. My preferred form of slurping (reading the whole file at once) is:
use strict;
use warnings;
use IO::File;
my $file = shift #ARGV or die "what file?";
my $fh = IO::File->new( $file, '<' ) or die "$file: $!";
my $data = do { local $/; <$fh> };
$fh->close();
# If you didn't just run out of memory, you have:
printf "%d characters (possibly bytes)\n", length($data);
And when going line-by-line:
my $fh = IO::File->new( $file, '<' ) or die "$file: $!";
while ( my $line = <$fh> ) {
print "Better than cat: $line";
}
$fh->close();
Caveat lector of course: these are just the approaches I've committed to muscle memory for everyday work, and they may be radically unsuited to the problem you're trying to solve.
I once used the
open (FILEIN, "<", $inputfile) or die "...";
my #FileContents = <FILEIN>;
close FILEIN;
boilerplate regularly. Nowadays, I use File::Slurp for small files that I want to hold completely in memory, and Tie::File for big files that I want to scalably address and/or files that I want to change in place.
For OO, I like:
use FileHandle;
...
my $handle = FileHandle->new( "< $file_to_read" );
croak( "Could not open '$file_to_read'" ) unless $handle;
...
my $line1 = <$handle>;
my $line2 = $handle->getline;
my #lines = $handle->getlines;
$handle->close;
Read the entire file $file into variable $text with a single line
$text = do {local(#ARGV, $/) = $file ; <>};
or as a function
$text = load_file($file);
sub load_file {local(#ARGV, $/) = #_; <>}
If these programs are just for your productivity, whatever works! Build in as much error handling as you think you need.
Reading in a whole file if it's large may not be the best way long-term to do things, so you may want to process lines as they come in rather than load them up in an array.
One tip I got from one of the chapters in The Pragmatic Programmer (Hunt & Thomas) is that you might want to have the script save a backup of the file for you before it goes to work slicing and dicing.
The || operator has higher precedence, so it is evaluated first before sending the result to "open"... In the code you've mentioned, use the "or" operator instead, and you wouldn't have that problem.
open INPUT_FILE, "<$input_file"
or die "Can't open $input_file: $!\n";
Damian Conway does it this way:
$data = readline!open(!((*{!$_},$/)=\$_)) for "filename";
But I don't recommend that to you.