How to read and write a file, syntax wrong - perl

I end up having my script appending the new changes that I wanted to make to the end of the file instead of in the actual file.
open (INCONFIG, "+<$text") or die $!;
@config = <INCONFIG>;
foreach (@config)
{
if ( $_ =~ m/$checker/ )
{
$_ = $somethingnew;
}
print INCONFIG $_;
}
close INCONFIG or die;
Ultimately I wanted to rewrite the whole text again, but with certain strings modified if it matched the search criterion. But so far it only appends ANOTHER COPY of the entire file(with changes) to the bottom of the old file.
I know that I can just close the file, open another filehandle for writing, and write the result out. But I was hoping to learn what I did wrong and how to fix it.

As I understand open, using read/write access for a text file isn't a good idea. After all a file just is a byte stream: Updating a part of the file with something of a different length is the stuff headaches are made of ;-)
Here is my approach: Try to emulate the -i "inplace" switch of perl. So essentially we write to a backup file, which we will later rename. (On *nix systems, there is some magic with open filehandles keeping deleted files available, so we don't have to create a new file. Let's do it anyway.)
my $filename = ...;
my $tempfile = "$filename.tmp";
open my $inFile, '<', $filename or die $!;
open my $outFile, '>', $tempfile or die $!;
while (my $line = <$inFile>) {
$line = doWeirdSubstitutions($line);
print $outFile $line;
}
close $inFile or die $!;
close $outFile or die $!;
rename $tempfile, $filename
or die "rename failed: $!"; # will break under weird circumstances.
# No unlink needed: after a successful rename the temp file no longer exists.
Untested, but obvious code. Does this help with your problem?

Your problem is a misunderstanding of what +< "open for update" does. It is discussed in the Perl Tutorial under Mixing Reads and Writes.
What you really want to do is copy the old file to a new file and then rename it after the fact. This is discussed in the perlfaq5 mentioned by daxim. Plus there are entire modules dedicated to doing this safely, such as File::AtomicWrite. These help with the issue of your program aborting and leaving you with a clobbered file.

As others pointed out, there are better ways :)
But if you really want to read and write using +<, you should remember that, after reading the file, you're at the end of the file. That explains why your output is appended after the original content.
What you need to do is reset the file-pointer to the beginning of the file, using seek:
seek(INCONFIG ,0,0);
Then start writing...
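A minimal sketch of that seek-based approach, for illustration only. The helper name rewrite_in_place and the sample data are hypothetical, not from the question. It also calls truncate after writing, because if the new content is shorter than the old, the tail of the old content would otherwise remain in the file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: rewrite $path through a single "+<" handle,
# replacing every line that matches $pattern with $replacement.
sub rewrite_in_place {
    my ($path, $pattern, $replacement) = @_;

    open my $fh, '+<', $path or die "Can't open $path: $!";
    my @lines = <$fh>;                     # reading leaves the pointer at EOF

    for my $line (@lines) {
        $line = "$replacement\n" if $line =~ $pattern;
    }

    seek($fh, 0, 0) or die "Can't seek in $path: $!";    # back to the start
    print {$fh} @lines;
    # Cut off leftover bytes in case the new content is shorter.
    truncate($fh, tell($fh)) or die "Can't truncate $path: $!";
    close $fh or die "Can't close $path: $!";
    return;
}

# Demo on a throwaway file.
my ($demo_fh, $demo_path) = tempfile(UNLINK => 1);
print {$demo_fh} "keep me\nold and very long setting\nkeep me too\n";
close $demo_fh or die "Can't close $demo_path: $!";

rewrite_in_place($demo_path, qr/^old/, 'new');

open my $check_fh, '<', $demo_path or die "Can't open $demo_path: $!";
my $demo_result = do { local $/; <$check_fh> };
close $check_fh;
print $demo_result;
```

This holds the whole file in memory, so it only suits files of modest size.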

perlopentut says this about mixing reads and writes
In fact, when it comes to updating a file, unless you're working on a
binary file as in the WTMP case above, you probably don't want to use
this approach for updating. Instead, Perl's -i flag comes to the
rescue.
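The -i behaviour mentioned in the quote can also be switched on from inside a script by localizing $^I and @ARGV, an idiom described in perlfaq5. The sketch below assumes a hypothetical helper name (edit_like_dash_i) and a made-up config file; it keeps a .bak copy of the original.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: edit $path the way "perl -i.bak -pe ..." would.
sub edit_like_dash_i {
    my ($path, $pattern, $replacement) = @_;

    # $^I holds the backup suffix; @ARGV lists the files that <> reads.
    local ($^I, @ARGV) = ('.bak', $path);
    while (my $line = <>) {
        $line = "$replacement\n" if $line =~ $pattern;
        print $line;    # the default output handle now points at the file
    }
    return;
}

# Demo on a throwaway file.
my $conf_dir  = tempdir(CLEANUP => 1);
my $conf_path = "$conf_dir/app.conf";
open my $seed, '>', $conf_path or die "Can't create $conf_path: $!";
print {$seed} "name=demo\nmode=old\n";
close $seed or die "Can't close $conf_path: $!";

edit_like_dash_i($conf_path, qr/^mode=/, 'mode=new');
```

Note that only a bare print (no explicit filehandle) is redirected into the edited file.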
Another way is to use the Tie::File module. The code reduces to just this:
tie my @config, 'Tie::File', $text or die $!;
s/$checker/$somethingnew/g for @config;
But remember to back the file up before you modify it until you have debugged your program.

Open a file and overwrite the file with adjustments and no backup

I have the following three lines:
rename($file_path, $file_path.'.bak');
open( my $file_IN_fh, '<' , $file_path.'.bak') || die "die message";
open( my $file_OUT_fh, '>' , $file_path) || die "die message";
It works great. It allows me to go through the in file while(<$file_IN_fh>), make a bunch of changes with a script (s///g, if() to determine if the line stays or not, etc), and write to the out file. In the end I get my edited file and the file name is unchanged.
My issue is that I currently no longer want the backup files, so I want to replace the code with something similar that won't create a backup, and switch between the two versions by commenting lines in and out if my needs change.
How do I do this kind of editing in place not from the command line?
One basic way is to read the file line by line and write desired output lines to a temporary file, which is then renamed so to overwrite the original.
use File::Copy qw(move);
open my $fh, '<', $file or die "Can't open $file: $!";
open my $fh_out, '>', $outfile or die "Can't open $outfile: $!";
while (<$fh>) {
next if /line_to_skip/;
s/patt/repl/g;
print $fh_out $_;
}
close $_ for ($fh, $fh_out);
move ($outfile, $file) or die "Can't move $outfile to $file: $!";
This is what is normally done by tools that edit files "in place" (with additional safety, checks, and flexibility). Since $outfile is temporary, consider creating it with File::Temp.
Add checks when close-ing files.
Note that this changes the file's inode number, which may matter for some applications.†
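A possible way to combine the temporary-file approach with File::Temp is sketched below. The helper name rewrite_via_tempfile and its callback interface are assumptions made for illustration. The temporary file is created in the target's own directory so that the final rename does not cross filesystems.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: pass every line of $file through $transform, a code
# ref that returns the new line, or undef to drop the line.
sub rewrite_via_tempfile {
    my ($file, $transform) = @_;

    open my $in, '<', $file or die "Can't open $file: $!";
    # Same directory as the target, so rename stays on one filesystem.
    my ($out, $tmp_name) = tempfile(DIR => dirname($file), UNLINK => 0);

    while (my $line = <$in>) {
        my $new = $transform->($line);
        print {$out} $new if defined $new;
    }
    close $in  or die "Can't close $file: $!";
    close $out or die "Can't close $tmp_name: $!";

    rename($tmp_name, $file) or die "Can't rename $tmp_name to $file: $!";
    return;
}

# Demo on a throwaway file.
my ($seed_fh, $target) = tempfile(UNLINK => 1);
print {$seed_fh} "alpha\nline_to_skip\npatt here\n";
close $seed_fh or die "Can't close $target: $!";

rewrite_via_tempfile($target, sub {
    my ($line) = @_;
    return undef if $line =~ /line_to_skip/;
    $line =~ s/patt/repl/g;
    return $line;
});
```

File::Temp creates its files with restrictive permissions, so the replaced file may end up with a different mode than the original; restore it with chmod if that matters.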
If the file isn't huge you can simplify this and read it in first
open my $fh, '<', $file or die "Can't open $file: $!";
my @lines = <$fh>;
open $fh, '>', $file or die "Can't open $file for writing: $!";
for (@lines) {
next if /line_to_skip/;
s/patt/repl/g;
print $fh $_;
}
close $fh;
which preserves the inode number, since > mode truncates the existing inode data.
† If this is indeed a problem, you can still keep the same inode. After the temporary file is written, open it for reading and open the original file for writing; that truncates the contents of that inode. Then copy the temporary file to the original. Close handles and delete the temporary file.
If the file is huge, then I'd question why you'd want to avoid the temporary file. Otherwise, I'd suggest just loading the file into memory, make modifications, then write it back out.
use File::Slurp qw( read_file write_file );
my $in = read_file($qfn, array_ref => 1);
my @out;
while (defined( $_ = shift(@$in) )) {
s/a/b/g; # For example.
push @out, $_ if /c/; # For example.
}
write_file($qfn, \@out);
I avoided using expensive splice by using two arrays.
Note that using Tie::File might save one line of code, but this will be 30x faster[1], and probably use less memory (despite memory-saving being Tie::File's goal). Tie::File is never the answer!!!
[1] This is not necessarily representative of all Tie::File uses, but I have indeed timed Tie::File taking 30x longer than the alternative at some basic task. That means that 2 seconds worth of work would have taken 1 minute with Tie::File!
Take a look at the Tie::File module. It is a core module and so shouldn't need installing, and the code is as simple as
use Tie::File;
tie my @file, 'Tie::File', $filepath or die $!;
Thereafter the array @file will hold the contents of the file, one line per element, and any changes to the array will be reflected in the file. All array operations such as push, splice, etc. will work fine.
Note that line one of the file is in element zero of the array etc.

perl: cannot open file within a loop

I am trying to read in a bunch of similar files and process them one by one. Here is the code I have. But somehow the perl script doesn't read in the files correctly. I'm not sure how to fix it. The files are definitely readable and writable by me.
#!/usr/bin/perl
use strict;
use warnings;
my @olap_f = `ls /full_dir_to_file/*txt`;
foreach my $file (@olap_f){
my %traits_h;
open(IN,'<',$file) || die "cannot open $file";
while(<IN>){
chomp;
my @array = split /\t/;
my $trait = $array[4];
$traits_h{$trait} ++;
}
close IN;
}
When I run it, the error message (something like below) showed up:
cannot open /full_dir_to_file/a.txt
You have newlines at the end of each filename:
my @olap_f = `ls ~dir_to_file/*txt`;
chomp @olap_f; # Remove newlines
Better yet, use glob to avoid launching a new process (and having to trim newlines):
my @olap_f = glob "~dir_to_file/*txt";
Also, use $! to find out why a file couldn't be opened:
open(IN,'<',$file) || die "cannot open $file: $!";
This would have told you
cannot open /full_dir_to_file/a.txt
: No such file or directory
which might have made you recognize the unwanted newline.
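Putting those suggestions together, the loop might look roughly like the sketch below. The directory and file contents are made up so the example can run on its own; quoting the filename in the error message makes a stray newline or space visible at once.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory standing in for /full_dir_to_file/ (assumed to have
# no whitespace in its path, since glob splits its pattern on spaces).
my $data_dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt notes.log)) {
    open my $fh, '>', "$data_dir/$name" or die "Can't create '$data_dir/$name': $!";
    print {$fh} join("\t", qw(c1 c2 c3 c4), "trait_$name"), "\n";
    close $fh or die "Can't close '$data_dir/$name': $!";
}

my @txt_files = glob "$data_dir/*txt";    # names come back without newlines
my %traits;
for my $file (@txt_files) {
    open my $in, '<', $file or die "cannot open '$file': $!";
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /\t/, $line;
        $traits{ $fields[4] }++;          # fifth tab-separated column
    }
    close $in;
}
printf "%d files, %d distinct traits\n", scalar @txt_files, scalar keys %traits;
```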
I'll add a quick plug for IO::All here. It's important to know what's going on under the hood but it's convenient sometimes to be able to do:
use IO::All;
my @olap_f = io->dir('/full_dir_to_file/')->glob('*txt');
In this case it's not shorter than @cjm's use of glob but IO::All does have a few other convenient methods for working with files as well.

File truncated, when opened in Perl

I'm new to Perl, so sorry if this is obvious, but I looked up how to open a file and use the flags, and for the life of me they don't seem to work right. I narrowed it down to these lines of code.
if ($flag eq "T"){
open xFile, ">" , "$lUsername\\$openFile";
}
else
{
open xFile, ">>", "$lUsername\\$openFile";
}
Both of these methods seem to delete the contents of my file. I also checked that the flag is formatted correctly, and it is; I know for a fact I've gone down both branches.
EDIT: codepaste of a larger portion of my code http://codepaste.net/n52sma
New to Perl? I hope you're using use strict and use warnings.
As others have stated, you should be using a test to make sure your file is open. However, that's not really the problem here. In fact, I used your code, and it seems to work fine for me. Maybe you should try printing some debugging messages to see if this is doing what you think it's doing:
use strict;
use warnings;
use autodie; #Will stop your program if the "open" doesn't work.
my $lUsername = "ABaker";
my $openFile = "somefile.txt";
if ($flag eq "T") {
print qq(DEBUG: Flag = "$flag": Deleting file "$lUsername/$openFile");
open xFile, ">" , "$lUsername/$openFile";
}
else {
print qq(DEBUG: Flag = "$flag": Appending file "$lUsername/$openFile");
open xFile, ">>", "$lUsername/$openFile";
}
You want to use strict and warnings in order to make sure you're not having issues with variable names. The use strict forces you to declare your variables first. For example, are you setting $Flag, but then using $flag? Maybe $flag is set the first time through, but you're setting $Flag the second time through.
Anyway, the DEBUG: statements will give you a better idea of what your error could be.
By the way, in Perl, you're checking if $flag is set to T and not t. If you want to test against both t and T, test whether uc $flag eq 'T' and not just $flag eq 'T'.
@Ukemi
I reformatted to comply with use strict. I also added print statements to make sure I was truncating when I want to, and not when I don't. It is still deleting the file, although now sometimes it's simply not writing. I'm going to give a larger portion of my code in a link; I'd really appreciate it if you gave it a once-over.
Are you seeing it say Truncating, but the file is empty? Are you sure the file already existed? There's a reason why I put the flag and everything in my debug statements. The more you print, the more you know. Try the following section of code:
my $file = "$lUsername/$openFile"; # Use forward slashes vs. backslashes.
my $File;
if ($flag eq "T") {
print qq(Flag = "$flag". Truncating file "$file"\n);
open $File , '>', $file
or die qq(Unable to open file "$file" for writing: $!\n);
}
else {
print qq(Flag = "$flag". Appending to file "$file"\n);
if (not -e $file) {
print qq(File "$file" does not exist. Will create it\n);
}
open $File , '>>', $file
or die qq(Unable to open file "$file" for appending: $!\n);
}
Note I'm printing out the flag and the name of the file in quotes. This will allow me to see if there are any hidden characters in my file name.
I'm using the qq(...) method to quote strings, so I can use the quotation marks in my print statements.
Also note I'm checking for the existence of the file before I append. This way, I make sure the file actually exists.
This should point out any possible errors in your logic. The other thing you can do is to stop your program when you finish writing out the file and verify that the file was written out as expected.
print "Write to file now:\n";
my $writeToFile = <>;
print $File $writeToFile;
close $File;
print "DEBUG: Temporary stop. Examine file\n";
<STDIN>; #DEBUG:
Now, if you see it saying it's appending to the file, and the file exists, and you still see the file being overwritten, we'll know the problem lies in your actual open xFile, ">>", $file statement.
You should use the three-argument version of open, lexical filehandles, and check whether there might have been an error:
# Writing to file (clobbering it if it exists)
open my $file , '>', $filename
or die "Unable to write to file '$filename': $!";
# Appending to file
open my $file , '>>', $filename
or die "Unable to append to file '$filename': $!";
>> does not clobber or truncate. Either you ended up in the "then" clause when you expected to be in the "else" clause, or the problem is elsewhere.
To check what $flag contains:
use Data::Dumper;
local $Data::Dumper::Useqq = 1;
print(Dumper($flag));
For your reference I have mentioned some basic file handling techniques below.
open FILE, "filename.txt" or die $!;
The command above will associate the FILE filehandle with the file filename.txt. You can use the filehandle to read from the file. If the file doesn't exist - or you cannot read it for any other reason - then the script will die with the appropriate error message stored in the $! variable.
open FILEHANDLE, MODE, EXPR
The available modes are the following:
read < #this mode will read the file
write > # this mode will create the new file. If the file already exists it will truncate and overwrite.
append >> #this will append the contents if the file already exists,else it will create new one.
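The difference between the write and append modes can be checked with a small experiment like the following sketch. The helper names write_with_mode and slurp, and the file name, are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $mode_dir  = tempdir(CLEANUP => 1);
my $mode_file = "$mode_dir/modes.txt";

# Hypothetical helper: open $path with $mode, write $text, close again.
sub write_with_mode {
    my ($mode, $path, $text) = @_;
    open my $fh, $mode, $path or die "Can't open $path with '$mode': $!";
    print {$fh} $text;
    close $fh or die "Can't close $path: $!";
    return;
}

# Hypothetical helper: return the whole file as one string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    local $/;                         # undef separator: read everything
    my $content = <$fh>;
    close $fh;
    return $content;
}

write_with_mode('>',  $mode_file, "first\n");    # creates the file
write_with_mode('>>', $mode_file, "second\n");   # keeps "first", adds "second"
my $after_append = slurp($mode_file);

write_with_mode('>',  $mode_file, "third\n");    # discards everything so far
my $after_truncate = slurp($mode_file);

print "after >>:\n$after_append", "after >:\n$after_truncate";
```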
If this is confusing, you can use the File::Slurp module instead.
Here is some sample code using File::Slurp.
use strict;
use File::Slurp;
my $read_mode=read_file("test.txt"); #to read file contents
write_file("test2.txt",$read_mode); #to write file
my @all_files=read_dir("/home/desktop",keep_dot_dot=>0); #read a dir
write_file("test2.txt",{append=>1},"@all_files"); #Append mode

perl appending issues

I have some code that appends into some files in the nested for loops. After exiting the for loops, I want to append .end to all the files.
foreach my $file (@SPICE_FILES)
{
open(FILE1, ">>$file") or die "[ERROR $0] cannot append to file : $file\n";
print FILE1 "\n.end\n";
close FILE1;
}
I noticed in some strange cases that the ".end" is appended into the middle of the files!
How do I resolve this?
Since I do not yet have the comment-privilege I'll have to write this as an 'answer'.
Do you use any dodgy modules?
I have run into issues where (obviously) broken perl-modules have done something to the output buffering. For me placing
$| = 1;
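in the code has helped. The above statement turns on autoflush for the currently selected output handle (STDOUT by default), so each print is written out immediately instead of sitting in a buffer. It might have had other effects too, but I have not seen anything negative come out of it.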
I guess you've got data buffered in some previously opened file descriptors. Try closing them before re-opening:
open my $fd, ">>", $file or die "Can't open $file: $!";
print $fd $data;
close $fd or die "Can't close: $!";
Better yet, you can append those filehandles to an array/hash and write to them in cleanup:
push @handles, $fd;
# later
print {$_} "\n.end\n" for @handles;
Here's a case to reproduce the "impossible" append in the middle:
#!/usr/bin/perl -w
use strict;
my $file = "file";
open my $fd, ">>", $file;
print $fd "begin"; # no \n -- write buffered
open my $fd2, ">>", $file;
print $fd2 "\nend\n";
close $fd2; # file flushed on close
# program ends here -- $fd finally closed
# you're left with "\nend\nbegin"
It’s not possible to append something to the middle of the file. The O_APPEND flag guarantees that each write(2) syscall will place its contents at the old EOF and update the st_size field by incrementing it by however many bytes you just wrote.
Therefore if you find that your own data is not showing up at the end when you go to look at it, then another agent has written more data to it afterwards.
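One way to avoid that ordering problem, sketched below under the assumption that the earlier handle cannot simply be closed first, is to turn on autoflush for it so nothing stays in Perl's buffer. The file name is made up; autoflush is the IO::Handle method.

```perl
use strict;
use warnings;
use IO::Handle;                        # provides the autoflush method
use File::Temp qw(tempdir);

my $append_dir  = tempdir(CLEANUP => 1);
my $append_file = "$append_dir/netlist.sp";

open my $first, '>>', $append_file or die "Can't open $append_file: $!";
$first->autoflush(1);                  # hand every print to the OS at once
print {$first} "begin";

# A second handle on the same file, as in the reproduction above.
open my $second, '>>', $append_file or die "Can't open $append_file: $!";
print {$second} "\n.end\n";
close $second or die "Can't close $append_file: $!";
close $first  or die "Can't close $append_file: $!";

open my $check, '<', $append_file or die "Can't open $append_file: $!";
my $append_result = do { local $/; <$check> };
close $check;
print $append_result;
```

Closing the first handle before opening the second would have the same effect and is usually the simpler fix.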

What's the best way to open and read a file in Perl?

Please note - I am not looking for the "right" way to open/read a file, or the way I should open/read a file every single time. I am just interested to find out what way most people use, and maybe learn a few new methods at the same time :)
A very common block of code in my Perl programs is opening a file and reading or writing to it. I have seen so many ways of doing this, and my style on performing this task has changed over the years a few times. I'm just wondering what the best (if there is a best way) method is to do this?
I used to open a file like this:
my $input_file = "/path/to/my/file";
open INPUT_FILE, "<$input_file" || die "Can't open $input_file: $!\n";
But I think that has problems with error trapping.
Adding a parenthesis seems to fix the error trapping:
open (INPUT_FILE, "<$input_file") || die "Can't open $input_file: $!\n";
I know you can also assign a filehandle to a variable, so instead of using "INPUT_FILE" like I did above, I could have used $input_filehandle - is that way better?
For reading a file, if it is small, is there anything wrong with globbing, like this?
my @array = <INPUT_FILE>;
or
my $file_contents = join( "\n", <INPUT_FILE> );
or should you always loop through, like this:
my @array;
while (<INPUT_FILE>) {
push(@array, $_);
}
I know there are so many ways to accomplish things in perl, I'm just wondering if there are preferred/standard methods of opening and reading in a file?
There are no universal standards, but there are reasons to prefer one or another. My preferred form is this:
open( my $input_fh, "<", $input_file ) || die "Can't open $input_file: $!";
The reasons are:
You report errors immediately. (Replace "die" with "warn" if that's what you want.)
Your filehandle is now reference-counted, so once you're not using it it will be automatically closed. If you use the global name INPUT_FILEHANDLE, then you have to close the file manually or it will stay open until the program exits.
The read-mode indicator "<" is separated from the $input_file, increasing readability.
The following is great if the file is small and you know you want all lines:
my @lines = <$input_fh>;
You can even do this, if you need to process all lines as a single string:
my $text = join('', <$input_fh>);
For long files you will want to iterate over lines with while, or use read.
If you want the entire file as a single string, there's no need to iterate through it.
use strict;
use warnings;
use Carp;
use English qw( -no_match_vars );
my $data = q{};
{
local $RS = undef; # This makes it just read the whole thing,
my $fh;
croak "Can't open $input_file: $!\n" if not open $fh, '<', $input_file;
$data = <$fh>;
croak 'Some Error During Close :/ ' if not close $fh;
}
The above satisfies perlcritic --brutal, which is a good way to test for 'best practices' :). $input_file is still undefined here, but the rest is kosher.
Having to write 'or die' everywhere drives me nuts. My preferred way to open a file looks like this:
use autodie;
open(my $image_fh, '<', $filename);
While that's very little typing, there are a lot of important things to note which are going on:
We're using the autodie pragma, which means that all of Perl's built-ins will throw an exception if something goes wrong. It eliminates the need for writing or die ... in your code, it produces friendly, human-readable error messages, and has lexical scope. It's available from the CPAN.
We're using the three-argument version of open. It means that even if we have a funny filename containing characters such as <, > or |, Perl will still do the right thing. In my Perl Security tutorial at OSCON I showed a number of ways to get 2-argument open to misbehave. The notes for this tutorial are available for free download from Perl Training Australia.
We're using a scalar file handle. This means that we're not going to be coincidently closing someone else's file handle of the same name, which can happen if we use package file handles. It also means strict can spot typos, and that our file handle will be cleaned up automatically if it goes out of scope.
We're using a meaningful file handle. In this case it looks like we're going to read from an image.
The file handle ends with _fh. If we see us using it like a regular scalar, then we know that it's probably a mistake.
If your files are small enough that reading the whole thing into memory is feasible, use File::Slurp. It reads and writes full files with a very simple API, plus it does all the error checking so you don't have to.
There is no best way to open and read a file. It's the wrong question to ask. What's in the file? How much data do you need at any point? Do you need all of the data at once? What do you need to do with the data? You need to figure those out before you think about how you need to open and read the file.
Is anything that you are doing now causing you problems? If not, don't you have better problems to solve? :)
Most of your question is merely syntax, and that's all answered in the Perl documentation (especially perlopentut). You might also like to pick up Learning Perl, which answers most of the problems you have in your question.
Good luck, :)
It's true that there are as many best ways to open a file in Perl as there are
$files_in_the_known_universe * $perl_programmers
...but it's still interesting to see who usually does it which way. My preferred form of slurping (reading the whole file at once) is:
use strict;
use warnings;
use IO::File;
my $file = shift @ARGV or die "what file?";
my $fh = IO::File->new( $file, '<' ) or die "$file: $!";
my $data = do { local $/; <$fh> };
$fh->close();
# If you didn't just run out of memory, you have:
printf "%d characters (possibly bytes)\n", length($data);
And when going line-by-line:
my $fh = IO::File->new( $file, '<' ) or die "$file: $!";
while ( my $line = <$fh> ) {
print "Better than cat: $line";
}
$fh->close();
Caveat lector of course: these are just the approaches I've committed to muscle memory for everyday work, and they may be radically unsuited to the problem you're trying to solve.
I once used the
open (FILEIN, "<", $inputfile) or die "...";
my @FileContents = <FILEIN>;
close FILEIN;
boilerplate regularly. Nowadays, I use File::Slurp for small files that I want to hold completely in memory, and Tie::File for big files that I want to scalably address and/or files that I want to change in place.
For OO, I like:
use FileHandle;
...
my $handle = FileHandle->new( "< $file_to_read" );
croak( "Could not open '$file_to_read'" ) unless $handle;
...
my $line1 = <$handle>;
my $line2 = $handle->getline;
my @lines = $handle->getlines;
$handle->close;
Read the entire file $file into variable $text with a single line
$text = do {local(@ARGV, $/) = $file ; <>};
or as a function
$text = load_file($file);
sub load_file {local(@ARGV, $/) = @_; <>}
If these programs are just for your productivity, whatever works! Build in as much error handling as you think you need.
Reading in a whole file if it's large may not be the best way long-term to do things, so you may want to process lines as they come in rather than load them up in an array.
One tip I got from one of the chapters in The Pragmatic Programmer (Hunt & Thomas) is that you might want to have the script save a backup of the file for you before it goes to work slicing and dicing.
The || operator has higher precedence, so it is evaluated first before sending the result to "open"... In the code you've mentioned, use the "or" operator instead, and you wouldn't have that problem.
open INPUT_FILE, "<$input_file"
or die "Can't open $input_file: $!\n";
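The precedence difference can be observed directly. The sketch below tries to open a path that is assumed not to exist, once with || and once with or, and records whether die was actually triggered; the path and variable names are made up.

```perl
use strict;
use warnings;

my $missing = "/no/such/dir/no_such_file_$$.txt";    # assumed not to exist

# "||" binds tighter than the comma, so this parses as
#   open(my $fh, '<', ($missing || die ...))
# and die is never reached because $missing is a true value.
my $pipes_ok = eval {
    open my $fh, '<', $missing || die "never reached\n";
    1;
};

# "or" has the lowest precedence, so it tests the return value of open.
my $or_ok = eval {
    open my $fh, '<', $missing or die "Can't open $missing: $!\n";
    1;
};

print "with ||: ", ($pipes_ok ? "failure went unnoticed" : "died"), "\n";
print "with or: ", ($or_ok    ? "failure went unnoticed" : "died"), "\n";
```

Wrapping the arguments in parentheses, as in open(...) || die ..., also works, because the parentheses end the argument list before || is applied.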
Damian Conway does it this way:
$data = readline!open(!((*{!$_},$/)=\$_)) for "filename";
But I don't recommend that to you.