Most efficient way to write over file after reading - perl

I'm reading in some data from a file, manipulating it, and then overwriting it to the same file. Until now, I've been doing it like so:
open (my $inFile, $file) or die "Could not open $file: $!";
$retString .= join ('', <$inFile>);
...
close ($inFile);
open (my $outFile, '>', $file) or die "Could not open $file: $!";
print $outFile $retString;
close ($outFile);
However I realized I can just use the truncate function and open the file for read/write:
open (my $inFile, '+<', $file) or die "Could not open $file: $!";
$retString .= join ('', <$inFile>);
...
truncate $inFile, 0;
print $inFile $retString;
close ($inFile);
I don't see any examples of this anywhere. It seems to work well, but am I doing it correctly? Is there a better way to do this?

The canonical way to read a file's entire contents is to (temporarily) set the input record separator $/ to undef, after which the next readline will return all of the rest of the file. But I think your original technique is much clearer and less involved than just reopening the file for write.
Note that all the following examples make use of the autodie pragma, which avoids the need to explicitly test the status of open, close, truncate, seek, and many other related calls that aren't used here.
Opening in read/write mode and using truncate would look like this
use strict;
use warnings;
use autodie;
use Fcntl 'SEEK_SET';
my ($file) = @ARGV;
open my $fh, '+<', $file;
my $ret_string = do { local $/; <$fh>; };
# Modify $ret_string;
seek $fh, 0, SEEK_SET;
truncate $fh, 0;
print $fh $ret_string;
close $fh;
whereas a simple second open would be
use strict;
use warnings;
use autodie;
my ($file) = @ARGV;
open my $fh, '<', $file;
my $ret_string = do { local $/; <$fh>; };
# Modify $ret_string;
open $fh, '>', $file;
print $fh $ret_string;
close $fh;
There is a third way, which is to use the equivalent of the edit-in-place -i command-line option. If you set the built-in variable $^I to anything other than undef and pass a file on the command line to the program, then Perl will transparently rename that file by appending the value of $^I and open a new output file with the original name.
If you set $^I to the empty string '' then the original file will be deleted from the directory and will disappear when it is closed (note that Windows doesn't support this, and you have to specify a non-null value). But while you are testing your code it is best to set it to something else so that you have a route of retreat if you succeed in destroying the data.
That mode would look like this
use strict;
use warnings;
$^I = '.old';
my $ret_string = do { local $/; <>; };
# Modify $ret_string;
print $ret_string;
Note that the new output file is selected as the default output, and if you want to print to the console you have to write an explicit print STDOUT ....

I would recommend using $INPLACE_EDIT:
use strict;
use warnings;
my $file = '...';
local @ARGV = $file;
local $^I = '.bak';
while (<>) {
# Modify the line;
print;
}
# unlink "$file$^I"; # Optionally delete backup
For additional methods for editing a file, just read perlfaq5 - How do I change, delete, or insert a line in a file, or append to the beginning of a file?.

I would change the way you're reading the file, not how you open it. Joining lines is less efficient than reading whole file at once,
$retString .= do { local $/; <$inFile> };
As for truncate, you might want to seek to the beginning of the file first, as perldoc suggests:
The position in the file of FILEHANDLE is left unchanged. You may want to call seek before writing to the file.

The Tie::File module may help if you are changing some lines in a file.
Like this
use strict;
use warnings;
use Tie::File;
tie my @source, 'Tie::File', 'file.txt' or die $!;
for my $line (@source) {
# Modify $line here
}
untie @source;

Related

writing a script in perl to convert all files in a folder to another format

Question is 2 fold:
I'm writing a perl script (I'm not too experienced with perl) and can get it to convert one file at a time from csv to ascii. I want to do a loop that takes all csv's in a folder and converts them to ascii/txt.
Is perl the best language to be attempting this with? I assumed yes since i can successfully do it one file at a time but having a very hard time figuring out a way to loop it.
I was trying to figure out how to load all the files into an array then run the loop for each one, but my googling has reached its limit and i'm out of ideas.
here's my working script:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use open IN => ':encoding(UTF-16)';
use open OUT => ':encoding(ascii)';
my $buffer;
open(my $ifh, '<', 'Software_compname.csv');
read($ifh, $buffer, -s $ifh);
close($ifh);
open(my $ofh, '>', 'Software_compname.txt');
print($ofh $buffer);
close($ofh);
Just add the following loop to your script and give it the files to process as arguments:
for my $input_file (glob shift) {
(my $output_file = $input_file) =~ s/csv$/txt/ or do {
warn "Invalid file name: $input_file\n";
next;
};
my $buffer;
open my $ifh, '<', $input_file;
read $ifh, $buffer, -s $ifh;
close $ifh;
open my $ofh, '>', $output_file;
print{$ofh} $buffer;
close $ofh;
}
If you want to do it from the Perl side only, I would suggest using File::Find::Rule:
use File::Find::Rule;
my @files = File::Find::Rule->file()
->name('*.in')
->in(my @input_directories = ('.'));
# etc.
regards,
matthias

Why is my Perl program not reading from the input file?

I'm trying to read in this file:
Oranges
Apples
Bananas
Mangos
using this:
open (FL, "fruits");
@fruits
while(<FL>){
chomp($_);
push(@fruits,$_);
}
print @fruits;
But I'm not getting any output. What am I missing here? I'm trying to store all the lines in the file into an array, and printing out all the contents on a single line. Why isn't chomp removing the newlines from the file, like it's supposed to?
You should always use:
use strict;
use warnings;
at the beginning of your scripts.
Also use three-argument open and lexical filehandles, and test the open for failure, so your script becomes:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my #fruits;
my $file = 'fruits';
open my $fh, '<', $file or die "unable to open '$file' for reading :$!";
while(my $line = <$fh>){
chomp($line);
push #fruits, $line;
}
print Dumper \@fruits;
I'm guessing that you have DOS-style newlines (i.e., \r\n) in your fruits file. The chomp command normally only works on unix-style (i.e., \n.)
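If the file really does have DOS-style line endings, one way to cope is to strip an optional carriage return together with the newline instead of relying on chomp. This is only a minimal sketch: the helper name strip_eol is made up for illustration, and the sample lines stand in for what would be read from the fruits file.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a trailing "\n" or "\r\n" from one line.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# In-memory sample standing in for lines read from the 'fruits' file.
my @raw    = ("Oranges\r\n", "Apples\n", "Bananas");
my @fruits = map { strip_eol($_) } @raw;

print join(',', @fruits), "\n";    # prints: Oranges,Apples,Bananas
```

With a plain chomp and the default $/, the "\r" in the first line would be left in place, which can make printed output look as if it is missing or overwritten on a terminal.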
You're not opening any file. FL is a file handle that never is opened, and therefore you can't read from it.
The first thing you need to do is put use warnings at the top of your program to help you with these problems.
#!/usr/bin/env perl
use strict;
use warnings;
use IO::File;
use Data::Dumper;
my $fh = IO::File->new('fruits', 'r') or die "$!\n";
my @fruits = grep {s/\n//} $fh->getlines;
print Dumper \@fruits;
that's nice and clean
You should check open for errors:
open( my $FL, '<', 'fruits' ) or die $!;
while(<$FL>) {
...
1) You should always print the errors from IO: open() or die "Can't open file $f, $!";
2) you probably started the program from different directory from where file "fruits" is

Perl File Handling

The below is the Perl script that I wrote today. This reads the content from one file and writes on the other file. It works but, not completely.
#---------------------------------------------------------------------------
#!/usr/bin/perl
open IFILE, "text3.txt" or die "File not found";
open OFILE, ">text4.txt" or die "File not found";
my $lineno = 0;
while(<IFILE>)
{
@var=<IFILE>;
$lineno++;
print OFILE "@var";
}
close(<IFILE>);
close(<OFILE>);
#---------------------------------------------------------------------------
The issue is that it reads and writes the contents, but not all of them.
text3.txt has four lines. The above script reads only from second line and writes on text4.txt. So, finally I get only three lines (line.no 2 to line.no 4) of text3.txt.
What is wrong with the above program. I don't have any idea about how to check the execution flow on Perl scripts. Kindly help me.
I'm completely new to Programming. I believe, learning all these would help me in changing my career path.
Thanks in Advance,
Vijay
<IFILE> reads one line from IFILE (only one because it's in scalar context). So while(<IFILE>) reads the first line, then the <IFILE> in list context within the while block reads the rest. What you want to do is:
# To read each line one by one:
while(!eof(IFILE)) { # check if end of file is reached instead of reading a line
my $line = <IFILE>; # scalar context, reads only one line
print OFILE $line;
}
# Or to read the whole file at once:
my @content = <IFILE>; # list context, read whole file
print OFILE @content;
The problem is that this line...
while(<IFILE>)
...reads one line from text3.txt, and then this line...
@var=<IFILE>;
...reads ALL of the remaining lines from text3.txt.
You can do it either way, by looping with while or all at once with @var=<IFILE>, but trying to do both won't work.
This is how I would have written the code in your question.
#!/usr/bin/perl
use warnings;
use strict;
use autodie;
# don't need to use "or die ..." when using the autodie module
open my $input, '<', 'text3.txt';
open my $output, '>', 'text4.txt';
while(<$input>){
my $lineno = $.;
print {$output} $_;
}
# both files get closed automatically when they go out of scope
# so no need to close them explicitly
I would recommend always putting use strict and use warnings at the beginning of all Perl files. At least until you know exactly why it is recommended.
I used autodie so that I didn't have to check the return value of open manually. ( autodie was added to Core in version 5.10.1 )
I used the three argument form of open because it is more robust.
It is important to note that while (<$input>){ ... } gets transformed into while (defined($_ = <$input>)){ ... } by the compiler. Which means that the current line is in the $_ variable.
I also used the special $. variable to get the current line number, rather than trying to keep track of the number myself.
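To see why the implicit defined test mentioned above matters, consider a file whose last line is just 0 with no trailing newline. The sketch below reads from an in-memory string (an assumption made only to keep it self-contained), and both helper names are hypothetical. It compares the defined test with a plain truth test:

```perl
use strict;
use warnings;

# Sample data: the last line is "0" with no newline, which is a false value.
my $data = "first\n0";

# Hypothetical helper: count lines using an explicit defined() test,
# which is what while (<$fh>) does implicitly.
sub count_with_defined {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $count = 0;
    while (defined(my $line = <$fh>)) { $count++ }
    close $fh;
    return $count;
}

# Hypothetical helper: count lines but stop on any false value.
sub count_with_truth {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $count = 0;
    while (1) {
        my $line = <$fh>;
        last unless $line;    # wrongly stops at the line "0"
        $count++;
    }
    close $fh;
    return $count;
}

print count_with_defined($data), " ", count_with_truth($data), "\n";    # prints: 2 1
```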
There are a couple of things you might want to think about. If you are strictly copying a file, you could use the File::Copy module.
If you are going to process the input before writing it out, you might also consider whether you want to keep both files open at the same time or instead read the whole content of the first file (into memory) first, and then write it to the outfile.
This depends on what you are doing underneath. Also if you have a huge binary file, each line in the while-loop might end up huge, so if memory is indeed an issue you might want to use more low-level stream-based reading, more info on I/O: http://oreilly.com/catalog/cookbook/chapter/ch08.html
My suggestion would be to use the cleaner PBP suggested way:
#!/usr/bin/perl
use strict;
use warnings;
use English qw(-no_match_vars);
my $in_file = 'text3.txt';
my $out_file = 'text4.txt';
open my $in_fh, '<', $in_file or die "Unable to open '$in_file': $OS_ERROR";
open my $out_fh, '>', $out_file or die "Unable to open '$out_file': $OS_ERROR";
while (<$in_fh>) {
# $_ is automatically populated with the current line
print { $out_fh } $_ or die "Unable to write to '$out_file': $OS_ERROR";
}
close $in_fh or die "Unable to close '$in_file': $OS_ERROR";
close $out_fh or die "Unable to close '$out_file': $OS_ERROR";
OR just print out the whole in-file directly:
#!/usr/bin/perl
use strict;
use warnings;
use English qw(-no_match_vars);
my $in_file = 'text3.txt';
my $out_file = 'text4.txt';
open my $in_fh, '<', $in_file or die "Unable to open '$in_file': $OS_ERROR";
open my $out_fh, '>', $out_file or die "Unable to open '$out_file': $OS_ERROR";
local $INPUT_RECORD_SEPARATOR; # Slurp mode, read in all content at once, see: perldoc perlvar
print { $out_fh } <$in_fh> or die "Unable to write to '$out_file': $OS_ERROR";
close $in_fh or die "Unable to close '$in_file': $OS_ERROR";
close $out_fh or die "Unable to close '$out_file': $OS_ERROR";
In addition if you just want to apply a regular expression or similar to a file quickly, you can look into the -i switch of the perl command: perldoc perlrun
perl -p -i.bak -e 's/foo/bar/g' text3.txt; # replace all foo with bar in text3.txt and save original in text3.txt.bak
When you're closing the files, use just
close(IFILE);
close(OFILE);
When you surround a file handle with angle brackets like <IFILE>, Perl interprets that to mean "read a line of text from the file inside the angle brackets". Instead of reading from the file, you want to close the actual file itself here.

What's the easiest way to write to a file using Perl?

Currently I am using
system("echo $panel_login $panel_password $root_name $root_pass $port $panel_type >> /home/shared/ftp");
What is the easiest way to do the same thing using Perl? IE: a one-liner.
Why does it need to be one line? You're not paying by the line, are you? This is probably too verbose, but it took a total of two minutes to type it out.
#!/usr/bin/env perl
use strict;
use warnings;
my @values = qw/user secret-password ftp-address/;
open my $fh, '>>', 'ftp-stuff' # Three argument form of open; lexical filehandle
or die "Can't open [ftp-stuff]: $!"; # Always check that the open call worked
print $fh "@values\n"; # Quote the array and you get spaces between items for free
close $fh or die "Can't close [ftp-stuff]: $!";
You might find IO::All to be helpful:
use IO::All;
#stuff happens to set the variables
io("/home/shared/ftp")->write("$panel_login $panel_password $root_name $root_pass $port $panel_type");
EDIT (By popular and editable demand)
http://perldoc.perl.org/functions/open.html
In your case you would have to :
#21st century perl.
my $handle;
open ($handle,'>>','/home/shared/ftp') or die("Cant open /home/shared/ftp");
print $handle "$panel_login $panel_password $root_name $root_pass $port $panel_type";
close ($handle) or die ("Unable to close /home/shared/ftp");
Alternatively, you could use the autodie pragma (as @Chas Owens suggested in comments).
This way, no check (the or die(...)) part needs to be used.
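A rough sketch of what the autodie variant might look like. The helper name append_line and the temporary file are assumptions made for illustration; in the question the target would be /home/shared/ftp.

```perl
use strict;
use warnings;
use autodie;                   # open and close now die on failure
use File::Temp qw(tempfile);

# Hypothetical helper: append one space-separated line to a file.
sub append_line {
    my ($path, @fields) = @_;
    open my $fh, '>>', $path;  # no "or die" needed with autodie
    print {$fh} "@fields\n";   # note: print itself is not checked by autodie
    close $fh;                 # a failed flush is reported here
    return;
}

# Demo against a temporary file instead of the real target.
my (undef, $path) = tempfile(UNLINK => 1);
append_line($path, qw(user secret-password ftp-address));
append_line($path, qw(user2 other-password ftp-address2));

open my $in, '<', $path;
print <$in>;
close $in;
```

Keeping the explicit close matters here, because buffered write errors usually only surface when the handle is flushed.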
Hope to get it right this time. If so, will erase this Warning.
Old deprecated way
Use print (not one liner though). Just open your file before and get a handle.
open (MYFILE,'>>/home/shared/ftp');
print MYFILE "$panel_login $panel_password $root_name $root_pass $port $panel_type";
close (MYFILE);
http://perl.about.com/od/perltutorials/a/readwritefiles_2.htm
You might want to use the simple File::Slurp module:
use File::Slurp;
append_file("/home/shared/ftp",
"$panel_login $panel_password $root_name $root_pass ".
"$port $panel_type\n");
It's not a core module though, so you'll have to install it.
(open my $FH, ">", "${filename}" and print $FH "Hello World" and close $FH)
or die ("Couldn't output to file: ${filename}: $!\n");
Of course, it's impossible to do proper error checking in a one liner... That should be written slightly differently:
open my $FH, ">", "${filename}" or die("Can't open file: ${filename}: $!\n");
print $FH "Hello World";
close $FH;
For advanced one-liners like this, you could also use the psh command from Psh, a simple pure Perl shell.
psh -c '{my $var = "something"; print $var} >/tmp/out.txt'
I use FileHandle. From the POD:
use FileHandle;
$fh = new FileHandle ">> FOO"; # modified slightly from the POD, to append
if (defined $fh) {
print $fh "bar\n";
$fh->close;
}
If you want something closer to a "one-liner," you can do this:
use FileHandle;
my $fh = FileHandle->new( '>> FOO' ) || die $!;
$fh->print( "bar\n" );
## $fh closes when it goes out of scope
You can do a one-liner like this one:
print "$panel_login $panel_password $root_name $root_pass $port $panel_type" >> io('/home/shared/ftp');
You only need to add the IO::All module to your code, like this:
use IO::All;
Some good reading about editing files with perl:
FMTYEWTK About Mass Edits In Perl

What is the best way to slurp a file into a string in Perl?

Yes, There's More Than One Way To Do It but there must be a canonical or most efficient or most concise way. I'll add answers I know of and see what percolates to the top.
To be clear, the question is how best to read the contents of a file into a string.
One solution per answer.
How about this:
use File::Slurp;
my $text = read_file($filename);
ETA: note Bug #83126 for File-Slurp: Security hole with encoding(UTF-8). I now recommend using File::Slurper (disclaimer: I wrote it), also because it has better defaults around encodings:
use File::Slurper 'read_text';
my $text = read_text($filename);
or Path::Tiny:
use Path::Tiny;
path($filename)->slurp_utf8;
I like doing this with a do block in which I localize #ARGV so I can use the diamond operator to do the file magic for me.
my $contents = do { local(@ARGV, $/) = $file; <> };
If you need this to be a bit more robust, you can easily turn this into a subroutine.
If you need something really robust that handles all sorts of special cases, use File::Slurp. Even if you aren't going to use it, take a look at the source to see all the wacky situations it has to handle. File::Slurp has a big security problem that doesn't look to have a solution. Part of this is its failure to properly handle encodings. Even my quick answer has that problem. If you need to handle the encoding (maybe because you don't make everything UTF-8 by default), this expands to:
my $contents = do {
open my $fh, '<:encoding(UTF-8)', $file or die '...';
local $/;
<$fh>;
};
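Wrapped up as a subroutine, the encoding-aware version could look roughly like this. The name slurp_utf8 is made up for this sketch, and it assumes the file is UTF-8 encoded.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole file as one decoded string.
sub slurp_utf8 {
    my ($file) = @_;
    open my $fh, '<:encoding(UTF-8)', $file
        or die "Can't open '$file': $!";
    local $/;    # undef record separator: one read returns everything
    my $contents = <$fh>;
    close $fh or die "Can't close '$file': $!";
    return $contents;
}
```

It would be called as my $contents = slurp_utf8($file); and the local $/ is restored automatically when the subroutine returns.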
If you don't need to change the file, you might be able to use File::Map.
In writing File::Slurp (which is the best way), Uri Guttman did a lot of research into the many ways of slurping and which is most efficient. He wrote down his findings here and incorporated them into File::Slurp.
open(my $f, '<', $filename) or die "OPENING $filename: $!\n";
$string = do { local($/); <$f> };
close($f);
Things to think about (especially when compared with other solutions):
Lexical filehandles
Reduce scope
Reduce magic
So I get:
my $contents = do {
local $/;
open my $fh, $filename or die "Can't open $filename: $!";
<$fh>
};
I'm not a big fan of magic <> except when actually using magic <>. Instead of faking it out, why not just use the open call directly? It's not much more work, and is explicit. (True magic <>, especially when handling "-", is far more work to perfectly emulate, but we aren't using it here anyway.)
mmap (Memory mapping) of strings may be useful when you:
Have very large strings, that you don't want to load into memory
Want a blindingly fast initialisation (you get gradual I/O on access)
Have random or lazy access to the string.
May want to update the string, but are only extending it or replacing characters:
#!/usr/bin/perl
use warnings; use strict;
use IO::File;
use Sys::Mmap;
sub sip {
my $file_name = shift;
my $fh;
open ($fh, '+<', $file_name)
or die "Unable to open $file_name: $!";
my $str;
mmap($str, 0, PROT_READ|PROT_WRITE, MAP_SHARED, $fh)
or die "mmap failed: $!";
return $str;
}
my $str = sip('/tmp/words');
print substr($str, 100,20);
Update: May 2012
The following should be pretty well equivalent, after replacing Sys::Mmap with File::Map
#!/usr/bin/perl
use warnings; use strict;
use File::Map qw{map_file};
map_file(my $str => '/tmp/words', '+<');
print substr($str, 100, 20);
use Path::Class;
file('/some/path')->slurp;
{
open F, $filename or die "Can't read $filename: $!";
local $/; # enable slurp mode, locally.
$file = <F>;
close F;
}
This is neither fast, nor platform independent, and really evil, but it's short (and I've seen this in Larry Wall's code ;-):
my $contents = `cat $file`;
Kids, don't do that at home ;-).
use IO::All;
# read into a string (scalar context)
$contents = io($filename)->slurp;
# read all lines into an array (list context)
@lines = io($filename)->slurp;
See the summary of Perl6::Slurp which is incredibly flexible and generally does the right thing with very little effort.
For one-liners you can usually use the -0 switch (with -n) to make perl read the whole file at once (if the file doesn't contain any null bytes):
perl -n0e 'print "content is in $_\n"' filename
If it's a binary file, you could use -0777:
perl -n0777e 'print length' filename
Here is a nice comparison of the most popular ways to do it:
http://poundcomment.wordpress.com/2009/08/02/perl-read-entire-file/
Nobody said anything about read or sysread, so here is a simple and fast way:
my $string;
{
open my $fh, '<', $file or die "Can't open $file: $!";
read $fh, $string, -s $file; # or sysread
close $fh;
}
Candidate for the worst way to do it! (See comment.)
open(F, $filename) or die "OPENING $filename: $!\n";
@lines = <F>;
close(F);
$string = join('', @lines);
Adjust the special record separator variable $/
undef $/;
open FH, '<', $filename or die "$!\n";
my $contents = <FH>;
close FH;