How can I read input from a text file in Perl? - perl

I would like to take input from a text file in Perl. Though lot of info are available over the net, it is still very confusing as to how to do this simple task of printing every line of text file. So how to do it? I am new to Perl, thus the confusion .

eugene has already shown the proper way. Here is a shorter script:
#!/usr/bin/perl
print while <>
or, equivalently,
#!/usr/bin/perl -p
on the command line:
perl -pe0 textfile.txt
You should start learning the language methodically, following a decent book, not through haphazard searches on the web.
You should also make use of the extensive documentation that comes with Perl.
See perldoc perltoc or perldoc.perl.org.
For example, opening files is covered in perlopentut.

First, open the file:
open my $fh, '<', "filename" or die $!;
Next, use the while loop to read until EOF:
while (<$fh>) {
# line contents's automatically stored in the $_ variable
}
close $fh or die $!;

# open the file and associate with a filehandle
open my $file_handle, '<', 'your_filename'
or die "Can't open your_filename: $!\n";
while (<$file_handle>) {
# $_ contains each record from the file in turn
}

Related

Open a file and overwrite the file with adjustments and no backup

I have the following three lines:
rename($file_path, $file_fh.'.bak');
open( my $file_IN_fh, '<' , $file_path.'.bak') || die "die message";
open( my $file_OUT_fh, '>' , $file_path) || die "die message";
It works great. It allows me to go through the in file while(<$file_IN_fh>), make a bunch of changes with a script (s///g, if() to determine if the line stays or not, etc), and write to the out file. In the end I get my edited file and the file name is unchanged.
My issue is that I am at a point where I no longer (currently) want the backup files, so I want to replace the code with something similar that won't create the backup file, and comment back and forth the three lines over the years if my needs change.
How do I do this kind of editing in place not from the command line?
One basic way is to read the file line by line and write desired output lines to a temporary file, which is then renamed so to overwrite the original.
use File::Copy qw(move);
open my $fh, '<', $file or die "Can't open $file: $!";
open my $fh_out, '>', $outfile or die "Can't open $outfile: $!";
while (<$fh>) {
next if /line_to_skip/;
s/patt/repl/g;
print $fh_out $_;
}
close $_ for ($fh, $fh_out);
move ($outfile, $file) or die "Can't move $outfile to $file: $!";
This is what is normally done by tools that edit files "in place" (with additional safety, checks, and flexibility). Since the $outfile is temporary use File::Temp.
Add checks when close-ing files.
Note that this changes the file's inode number, which may matter for some applications.†
If the file isn't huge you can simplify this and read it in first
open my $fh, '<', $file or die "Can't open $file: $!";
my #lines = <$fh>;
open $fh, '>', $file or die "Can't open $file for writing: $!";
for (#lines) {
next if /line_to_skip/;
s/patt/repl/g;
print $fh_out $_;
}
close $fh;
what preserves the inode number, since > mode truncates the existing inode data.
† If this is indeed a problem, you can still keep the same inode. After the temporary file is written, open it for reading and open the original file for writing; that truncates the contents of that inode. Then copy the temporary file to the original. Close handles and delete the temporary file.
If the file is huge, then I'd question why you'd want to avoid the temporary file. Otherwise, I'd suggest just loading the file into memory, make modifications, then write it back out.
use File::Slurp qw( read_file write_file );
my $in = read_file($qfn, array_ref => 1);
my #out;
while (defined( $_ = shift(#$in) )) {
s/a/b/g; # For example.
push #out, $_ if /c/; # For example.
}
write_file($qfn, \#out);
I avoided using expensive splice by using two arrays.
Note that using Tie::File might save one line of code, but this will be 30x faster[1], and probably use less memory (despite memory-saving being Tie::File's goal). Tie::File is never the answer!!!
This is not necessarily representative of all Tie::File uses, but I have indeed timed Tie::File taking 30x longer than the alternative at some basic task. That means that 2 seconds worth of work would have taken 1 minute with Tie::File!
Take a look at the Tie::File module. It is a core module and so shouldn't need installing, and the code is as simple as
use Tie::File;
tie my #file, 'Tie::File', $filepath or die $!;
Thereafter the array #file will hold the contents of the file, one line per element, and any changes to the array will be reflected in the file. All array operations such as push, splice, etc. will work fine
Note that line one of the file is in element zero of the array etc.

Replace multiple lines in text file

I have text files containing the text below (amongst other text)
DIFF_COEFF= 1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,
1.000e+07,1.000e+07,1.000e+07,1.000e+07,1.000e+07,4.000e+05,
and I need to replace it with the following text:
DIFF_COEFF= 2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,
2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,8.000e+05,
Each line above corresponds to a new line in the text file.
After some googling, I thought making use of Perl in the following might work, but it did not. I got the error message
Illegal division by zero at -e line 1, <> chunk 1
s_orig='DIFF_COEFF=*4.000e+05,'
s_new='DIFF_COEFF= 2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,\n2.000e+07,2.000e+07,2.000e+07,2.000e+07,2.000e+07,8.000e+05,'
perl -0 -i -pe "s:\Q${s_orig}\E:${s_new}:/igs" file.txt
Does anyone here know the right way to do this?
Edit - some more details: the text after this block is "DIFF_COEFF_Q=" followed by the same set of numbers, so I need to search for and replace the specific lines shown. The text files are not very large in size.
Copy the file over to a new one, except that within the range of text between these markers drop the replacement text instead. Then move that file to replace the original, as it may be needed judging by the attempted perl -0 -i in the question.
Note that when changing a file we have to build new content and then replace the file. There are a few ways to do this and modules that make it easier, shown further below.
The code below uses the range operator and the fact that it returns the counter for lines within the range, 1 for the first and the number ending with E0 for the last. So we don't copy lines inside that region while we write the replacement text (and the post-region-end marker) on the last line.
I consider the region of interest to end right before DIFF_COEFF_Q= line, per the question edit.
use warnings;
use strict;
use feature 'say';
use File::Copy 'move';
my $replacement = "replacement text";
my $file = 'input.txt';
my $out_file = 'new_' . $file;
open my $fh_out, '>', $out_file or die "Can't open $out_file: $!";
open my $fh, '<', $file or die "Can't open $file: $!";
while (<$fh>)
{
if (my $range_cnt = /^\s*DIFF_COEFF\s*=/ .. /^\s*DIFF_COEFF_Q\s*=/) #/
{
if ($range_cnt =~ /E0$/)
{
print $fh_out $replacement; # may need a newline
print $fh_out $_;
}
}
else {
print $fh_out $_;
}
}
close $fh or die "Can't close $file: $!"; # don't overwrite original
close $fh_out or die "Can't close $out_file: $!"; # if there are problems
#move $out_file, $file or die "Can't move $file to $out_file: $!";
Uncomment the move line once this has been tested well enough on your actual files, if you want to replace the original. You may or may not need a newline after $replacement, depending on it.
An alternative is to use flags for entering/leaving that range. But this won't be cleaner since there are two distinct actions, to stop copying when entering the range and write replacement when leaving. Thus multiple flags need be set and checked, what may end up messier.
If the files can't ever be huge it is simpler to read and process the file in memory. Then open the same file for writing and dump the new content
my $text = do { # slurp file into a scalar
local $/;
open my $fh, '<', $file or die "Can't open $file: $!";
<$fh>
};
$text =~ s/^\s*DIFF_COEFF\s*=.*?(\n\s*DIFF_COEFF_Q)/$replacement$1/ms;
# Change $out_file to $file to overwrite
open my $fh_out, '>', $out_file or die "Can't open $out_file: $!";
print $fh_out $text;
Here /m modifier is for multiline mode in which we can use ^ for the beginning of a line (not the whole string), what is helpful here. The /s makes . match a newline, too. Also note that we can slurp a file with Path::Tiny as simply as: my $text = path($file)->slurp;
Another option is to use Path::Tiny, which in newer versions has edit and edit_lines methods
use Path::Tiny;
# NOTE: edits $file in place (changes it)
path($file)->edit(
sub { s/DIFF_COEFF=.*?(\n\s*DIFF_COEFF_Q)/$replacement$1/s }
);
For more on this see, for example, this post and this post and this post.
The first and last way change the inode number of the file. See this post if that is a problem.
It's an interesting error that you've made and I can see what has led you to make it. But I don't think I've ever seen anyone else make the same mistake :-)
Your substitution statement is this:
s:\Q${s_orig}\E:${s_new}:/igs
So you've decided to use : as the delimiter of the substitution operator. But you want to use the options i, g and s and everywhere you've seen people talk about options on a substitution operator, they talk about using / to introduce the options. So you've added /igs to your substitution operator.
But what you've missed (and I completely understand why) is that the / that comes before the options is actually the closing delimiter of the standard, s/.../.../, version of the substitution operator. If you change the delimiter (as you have done) then your altered closing delimiter is all you need.
In your case, Perl doesn't expect the / as it has already seen the closing delimiter. It, therefore, decides that the / is a division operator and tries to divide the result of your substitution by igs. It interprets igs as zero and you get your error.
The fix is to remove that / so:
s:\Q${s_orig}\E:${s_new}:/igs
becomes:
s:\Q${s_orig}\E:${s_new}:igs

Printing a text file using perl

I am completely new to this and this should be the easiest thing to do but for some reason I cannot get my local text file to print. After trying multiple times with different code I came to use the following code but it doesn't print.
I have searched for days on various threads to solve this and have had no luck. Please help. Here is my code:
#!/usr/bin/perl
$newfile = "file.txt";
open (FH, $newfile);
while ($file = <FH>) {
print $file;
}
I updated my code to the following:
#!/user/bin/perl
use strict; # Always use strict
use warnings; # Always use warnings.
open(my $fh, "<", "file.txt") or die "unable to open file.txt: $!";
# Above we open file using 3 handle method
# or die die with error if unable to open it.
while (<$fh>) { # While in the file.
print $_; # Print each line
}
close $fh; # Close the file
system('C:\Users\RSS\file.txt');
It returns the following: my first report generated by perl. I do not know where this is coming from. Nowhere do I have a print "my first report generated by perl."; statement and it definitely is not in my text file.
My text file is full of various emails, addresses, phone numbers and snippets of emails.
Thank you all for your help. I figured out my problem. I somehow managed to kick myself out of my directory and did not realize it.
This is most likely a combination of a failure to open the file, and a failure to check the return value of open.
If you are completely new to perl, I warmly recommend reading the excellent "perlintro" man page, using either man perlintro or perldoc perlintro on the command line, or taking a look here: https://perldoc.perl.org/perlintro.html.
The "Files and I/O" section there gives a good and concise way of doing this:
open(my $in, "<", "input.txt") or die "Can't open input.txt: $!";
while (<$in>) { # assigns each line in turn to $_
print "Just read in this line: $_";
}
This version will give you an explanation and abort if anything goes wrong while trying to open the file. For example, if there is no file named file.txt in the current working directory, your version will quietly fail to open the file, and afterwards it will quietly fail to read from the closed file handle.
Also, always adding at least one of these to your perl scripts will save you a lot of trouble in the long run:
use warnings; # or use the -w command line switch to turn warnings on globally
use diagnostics;
These won't catch the failure to open the file, but will alert on the failed read.
In the first example here you can see that without the diagnostics module, the code fails without any error messages. The second example shows how the diagnostics module changes this.
$ perl -le 'open FH, "nonexistent.txt"; while(<FH>){print "foo"}'
$ perl -le 'use diagnostics; open FH, "nonexistent.txt"; while(<FH>){print "foo"}'
readline() on closed filehandle FH at -e line 1 (#1)
(W closed) The filehandle you're reading from got itself closed sometime
before now. Check your control flow.
By the way, the legendary "Camel Book" is basically the perl man pages formatted for paper printing, so reading the perldocs in the order listed in perldoc perl will give you a high level of understanding of the language in a reasonably accessible and inexpensive manner.
Happy hacking!
This is simple and including explanations.
use strict; # Always use strict
use warnings; # Always use warnings.
open(my $fh, "<", "file.txt") or die "unable to open file.txt: $!";
# Above we open file using 3 handle method
# or die die with error if unable to open it.
while (<$fh>) { # While in the file.
print $_; # Print each line
}
close $fh; # Close the file
There is then also the case where you are trying to open a file which is not in a location where you think it is. So consider doing full path, if not in the same dir.
open(my $fh, "<", 'F:\Workdir\file.txt') or die "unable to open < input.txt: $!";
EDIT: After your comments, it seems that you are opening an empty file. Please add this at the bottom of that same script and rerun. It will open the file in C:\Users\RSS and make sure it does actually contain data?
system('C:\Users\RSS\file.txt');
First, of all as you are starting out, it is better to enable all warnings by 'use warnings' and disable all such expression which can lead to uncertain behavior or are difficult to debug by pragma 'use strict'.
As you are dealing with file stream, it is always recommended to the check if you were able to open the stream. so, try to use croak or die both would terminate the program with a given message.
Instead of reading inside the while condition, I would recommend checking for end of file. So, loop breaks as end is found. Usually, when reading a line you would use it for further processing, so it is good idea to remove end of lines using chomp.
A sample for reading a file in perl can be as follows:
#!/user/bin/perl
use strict;
use warnings;
my $newfile = "file.txt";
open (my $fh, $newfile) or die "Could not open file '$newfile' $!";
while (!eof($fh))
{
my $line=<$fh>;
chomp($line);
print $line , "\n";
}

Perl File Handling

The below is the Perl script that I wrote today. This reads the content from one file and writes on the other file. It works but, not completely.
#---------------------------------------------------------------------------
#!/usr/bin/perl
open IFILE, "text3.txt" or die "File not found";
open OFILE, ">text4.txt" or die "File not found";
my $lineno = 0;
while(<IFILE>)
{
#var=<IFILE>;
$lineno++;
print OFILE "#var";
}
close(<IFILE>);
close(<OFILE>);
#---------------------------------------------------------------------------
The issue is, it reads and writes contens, but not all.
text3.txt has four lines. The above script reads only from second line and writes on text4.txt. So, finally I get only three lines (line.no 2 to line.no 4) of text3.txt.
What is wrong with the above program. I don't have any idea about how to check the execution flow on Perl scripts. Kindly help me.
I'm completely new to Programming. I believe, learning all these would help me in changing my career path.
Thanks in Advance,
Vijay
<IFILE> reads one line from IFILE (only one because it's in scalar context). So while(<IFILE>) reads the first line, then the <IFILE> in list context within the while block reads the rest. What you want to do is:
# To read each line one by one:
while(!eof(IFILE)) { # check if end of file is reached instead of reading a line
my $line = <IFILE>; # scalar context, reads only one line
print OFILE $line;
}
# Or to read the whole file at once:
my #content = <IFILE>; # list context, read whole file
print OFILE #content;
The problem is that this line...
while(<IFILE>)
...reads one line from text3.txt, and then this line...
#var=<IFILE>;
...reads ALL of the remaining lines from text3.txt.
You can do it either way, by looping with while or all at once with #var=<IFILE>, but trying to do both won't work.
This is how I would have written the code in your question.
#!/usr/bin/perl
use warnings;
use strict;
use autodie;
# don't need to use "or die ..." when using the autodie module
open my $input, '<', 'text3.txt';
open my $output, '>', 'text4.txt';
while(<$input>){
my $lineno = $.;
print {$output} $_;
}
# both files get closed automatically when they go out of scope
# so no need to close them explicitly
I would recommend always putting use strict and use warnings at the beginning of all Perl files. At least until you know exactly why it is recommended.
I used autodie so that I didn't have to check the return value of open manually. ( autodie was added to Core in version 5.10.1 )
I used the three argument form of open because it is more robust.
It is important to note that while (<$input>){ ... } gets transformed into while (defined($_ = <$input>)){ ... } by the compiler. Which means that the current line is in the $_ variable.
I also used the special $. variable to get the current line number, rather than trying to keep track of the number myself.
There is a couple of questions you might want to think about, if you are strictly copying a file you could use File::Copy module.
If you are going to process the input before writing it out, you might also consider whether you want to keep both files open at the same time or instead read the whole content of the first file (into memory) first, and then write it to the outfile.
This depends on what you are doing underneath. Also if you have a huge binary file, each line in the while-loop might end up huge, so if memory is indeed an issue you might want to use more low-level stream-based reading, more info on I/O: http://oreilly.com/catalog/cookbook/chapter/ch08.html
My suggestion would be to use the cleaner PBP suggested way:
#!/usr/bin/perl
use strict;
use warnings;
use English qw(-no_match_vars);
my $in_file = 'text3.txt';
my $out_file = 'text4.txt';
open my $in_fh, '<', $in_file or die "Unable to open '$in_file': $OS_ERROR";
open my $out_fh, '>', $out_file or die "Unable to open '$out_file': $OS_ERROR";
while (<$in_fh>) {
# $_ is automatically populated with the current line
print { $out_fh } $_ or die "Unable to write to '$out_file': $OS_ERROR";
}
close $in_fh or die "Unable to close '$in_file': $OS_ERROR";
close $out_fh or die "Unable to close '$out_file': $OS_ERROR";
OR just print out the whole in-file directly:
#!/usr/bin/perl
use strict;
use warnings;
use English qw(-no_match_vars);
my $in_file = 'text3.txt';
my $out_file = 'text4.txt';
open my $in_fh, '<', $in_file or die "Unable to open '$in_file': $OS_ERROR";
open my $out_fh, '>', $out_file or die "Unable to open '$out_file': $OS_ERROR";
local $INPUT_RECORD_SEPARATOR; # Slurp mode, read in all content at once, see: perldoc perlvar
print { $out_fh } <$in_fh> or die "Unable to write to '$out_file': $OS_ERROR";;
close $in_fh or die "Unable to close '$in_file': $OS_ERROR";
close $out_fh or die "Unable to close '$out_file': $OS_ERROR";
In addition if you just want to apply a regular expression or similar to a file quickly, you can look into the -i switch of the perl command: perldoc perlrun
perl -p -i.bak -e 's/foo/bar/g' text3.txt; # replace all foo with bar in text3.txt and save original in text3.txt.bak
When you're closing the files, use just
close(IFILE);
close(OFILE);
When you surround a file handle with angle brackets like <IFILE>, Perl interprets that to mean "read a line of text from the file inside the angle brackets". Instead of reading from the file, you want to close the actual file itself here.

How do I resolve a "print() on closed filehandle" error in Perl?

I am getting this error while executing my Perl script. Please, tell me how to rectify this error in Perl.
print() on closed filehandle MYFILE
This is the code that is giving the error:
sub return_error
{
$DATA= "Sorry this page is corrently being updated...<p>";
$DATA.= " Back ";
open(MYFILE,">/home/abc/xrt/sdf/news/top.html");
print MYFILE $DATA;
close(MYFILE);
exit;
}
I hope that now I'm clearer.
You want to do some action on MYFILE after you (or the interpreter itself because of an error) closed it.
According to your code sample, the problem could be that open doesn't really open the file, the script may have no permission to write to the file.
Change your code to the following to see if there was an error:
open(MYFILE, ">", "/home/abc/xrt/sdf/news/top.html") or die "Couldn't open: $!";
Update
ysth pointed out that -w is not really good at checking if you can write to the file, it only ‘checks that one of the relevant flags in the mode is set’. Furthermore, brian d foy told me that the conditional I've used isn't good at handling the error. So I removed the misleading code. Use the code above instead.
It appears that the open call is failing. You should always check the status when opening a filehandle.
my $file = '/home/abc/xrt/sdf/news/top.html';
open(MYFILE, ">$file") or die "Can't write to file '$file' [$!]\n";
print MYFILE $DATA;
close MYFILE;
If the open is unsuccessful, the built-in variable $! (a.k.a. $OS_ERROR) will contain the OS-depededant error message, e.g. "Permission denied"
It's also preferable (for non-archaic versions of Perl) to use the three-argument form of open and lexical filehandles:
my $file = '/home/abc/xrt/sdf/news/top.html';
open(my $fh, '>', $file) or die "Can't write to file '$file' [$!]\n";
print {$fh} $DATA;
close $fh;
An alternate solution to saying or die is to use the autodie pragma:
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
open my $fh, "<", "nsdfkjwefnbwef";
print "should never get here (unless you named files weirdly)\n";
The code above produces the following error (unless a file named nsdfkjwefnbwef exists in the current directory):
Can't open 'nsdfkjwefnbwef' for reading: 'No such file or directory' at example.pl line 7
This:
open(MYFILE,">/home/abc/xrt/sdf/news/top.html");
In modern Perl, it could be written as:
open(my $file_fh, ">", "/home/abc/xrt/sdf/news/top.html") or die($!);
This way you get a $variable restricted to the scope, there is no "funky business" if you have weird filenames (e.g. starting with ">") and error handling (you can replace die with warn or with error handling code).
Once you close $file_fh or simply go out of scope, you can not longer print to it.
I had this problem when my files were set to READ-ONLY.
Check this also, before giving up! :)
Check that the open worked
if(open(my $FH, ">", "filename") || die("error: $!"))
{
print $FH "stuff";
close($FH);
}
If you use a global symbol MYFILE as your filehandle, rather than a local lexical ($myfile), you will invariably run into issues if your program is multithreaded, e.g. if it is running via mod_perl. One process could be closing the filehandle while another process is attempting to write to it. Using $myfile will avoid this issue as each instance will have its own local copy, but you will still run into issues where one process could overwrite the data that another is writing. Use flock() to lock the file while writing to it.
Somewhere in you're script you will be doing something like:
open MYFILE, "> myfile.txt";
# do stuff with myfile
close MYFILE;
print MYFILE "Some more stuff I want to write to myfile";
The last line will throw an error because MYFILE has been closed.
Update
After seeing your code, it looks like the file you are trying to write to can't be opened in the first place. As others have already mentioned try doing something like:
open MYFILE, "> myfile.txt" or die "Can't open myfile.txt: $!\n"
Which should give you some feedback on why you can't open the file.