Cannot read from a file opened in read/write/append mode - perl

Here is my code
use strict;
use warnings;
open(INFILE,"<File1") or die "Can't open the file";
open(OUTFILE,"+>>File2") or die "Can't open the file";
while(<INFILE>) {
print OUTFILE "$_";
}
while(<OUTFILE>) {
print "$_\n";
}
Am trying to move the contents of File1 to File2 which is created in read,write and append mode. The problem here is I have my File2 created with File1 contents, but the print statement inside the while(<OUTFILE>) {} is not printing the contents of File2.
Anyone help me understand what is wrong here!(I might be making a silly mistake)

After your first while loop, you are at the end of the OUTFILE. So, the second while loop is never really executed. Use seek before you read OUTFILE:
use strict;
use warnings;
open(INFILE,"<File1")or die "Can't open the file";
open(OUTFILE,"+>>File2")or die "Can't open the file";
while(<INFILE>)
{
print OUTFILE "$_";
}
seek OUTFILE, 0, 0;
while(<OUTFILE>)
{
print "$_\n";
}

Set the cursor back to the start of OUTFILE:
seek OUTFILE, 0, 0;

Related

Recursive search in Perl?

I'm incredibly new to Perl, and never have been a phenomenal programmer. I have some successful BVA routines for controlling microprocessor functions, but never anything embedded, or multi-facted. Anyway, my question today is about a boggle I cannot get over when trying to figure out how to remove duplicate lines of text from a text file I created.
The file could have several of the same lines of txt in it, not sequentially placed, which is problematic as I'm practically comparing the file to itself, line by line. So, if the first and third lines are the same, I'll write the first line to a new file, not the third. But when I compare the third line, I'll write it again since the first line is "forgotten" by my current code. I'm sure there's a simple way to do this, but I have issue making things simple in code. Here's the code:
my $searchString = pseudo variable "ideally an iterative search through the source file";
my $file2 = "/tmp/cutdown.txt";
my $file3 = "/tmp/output.txt";
my $count = "0";
open (FILE, $file2) || die "Can't open cutdown.txt \n";
open (FILE2, ">$file3") || die "Can't open output.txt \n";
while (<FILE>) {
print "$_";
print "$searchString\n";
if (($_ =~ /$searchString/) and ($count == "0")) {
++ $count;
print FILE2 $_;
} else {
print "This isn't working\n";
}
}
close (FILE);
close (FILE2);
Excuse the way filehandles and scalars do not match. It is a work in progress... :)
The secret of checking for uniqueness, is to store the lines you have seen in a hash and only print lines that don't exist in the hash.
Updating your code slightly to use more modern practices (three-arg open(), lexical filehandles) we get this:
my $file2 = "/tmp/cutdown.txt";
my $file3 = "/tmp/output.txt";
open my $in_fh, '<', $file2 or die "Can't open cutdown.txt: $!\n";
open my $out_fh, '>', $file3 or die "Can't open output.txt: $!\n";
my %seen;
while (<$in_fh>) {
print $out_fh unless $seen{$_}++;
}
But I would write this as a Unix filter. Read from STDIN and write to STDOUT. That way, your program is more flexible. The whole code becomes:
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
while (<>) {
print unless $seen{$_}++;
}
Assuming this is in a file called my_filter, you would call it as:
$ ./my_filter < /tmp/cutdown.txt > /tmp/output.txt
Update: But this doesn't use your $searchString variable. It's not clear to me what that's for.
If your file is not very large, you can store each line readed from the input file as a key in a hash variable. And then, print the hash keys (ordered). Something like that:
my %lines = ();
my $order = 1;
open my $fhi, "<", $file2 or die "Cannot open file: $!";
while( my $line = <$fhi> ) {
$lines {$line} = $order++;
}
close $fhi;
open my $fho, ">", $file3 or die "Cannot open file: $!";
#Sort the keys, only if needed
my #ordered_lines = sort { $lines{$a} <=> $lines{$b} } keys(%lines);
for my $key( #ordered_lines ) {
print $fho $key;
}
close $fho;
You need two things to do that:
a hash to keep track of all the lines you have seen
a loop reading the input file
This is a simple implementation, called with an input filename and an output filename.
use strict;
use warnings;
open my $fh_in, '<', $ARGV[0] or die "Could not open file '$ARGV[0]': $!";
open my $fh_out, '<', $ARGV[1] or die "Could not open file '$ARGV[1]': $!";
my %seen;
while (my $line = <$fh_in>) {
# check if we have already seen this line
if (not $seen{$line}) {
print $fh_out $line;
}
# remember this line
$seen{$line}++;
}
To test it, I've included it with the DATA handle as well.
use strict;
use warnings;
my %seen;
while (my $line = <DATA>) {
# check if we have already seen this line
if (not $seen{$line}) {
print $line;
}
# remember this line
$seen{$line}++;
}
__DATA__
foo
bar
asdf
foo
foo
asdfg
hello world
This will print
foo
bar
asdf
asdfg
hello world
Keep in mind that the memory consumption will grow with the file size. It should be fine as long as the text file is smaller than your RAM. Perl's hash memory consumption grows a faster than linear, but your data structure is very flat.

what is wrong with the syntax that does not give me an outfile?

I dont know what exactly is wrong but everytime I execute this script i keep getting "No such file or directory at ./reprioritize line 35, line 1".
here is my script that is having an issue:
my $newresult = "home/user/newresults_percengtage_vs_pn";
sub pushval
{
my #fields = #_;
open OUTFILE, ">$newresult/fixedhomdata_030716-031316.csv" or die $!; #line 35
while(<OUTFILE>)
{
if($fields[5] >= 13)
{
print OUTFILE "$fields[0]", "$fields[1]","$fields[2]","$fields[3]","$fields[4]","$fields[5]", "0";
}
elsif($fields[5] < 13 && $fields[5] > 1)
{
print OUTFILE "$fields[0]", "$fields[1]","$fields[2]","$fields[3]","$fields[4]","$fields[5]", "1";
}
elsif($fields[5] <= 1)
{
print OUTFILE "$fields[0]", "$fields[1]","$fields[2]","$fields[3]","$fields[4]","$fields[5]", "2";
}
}
close (OUTFILE);
You may want to have a look at Perl's tutorial on opening files.
I simplify it a bit. There are basically three modes: open for reading, open for writing, and open for appending.
Reading
Opening for reading is indicated by either a < preceeding the filename or on its own, as a separate parameter to the open() call (preferred), i.e.:
my $fh = undef;
my $filename = 'fixedhomdata_030716-031316.csv';
open($fh, "<$filename") or die $!; # bad
open($fh, '<', $filename) or die $!; # good
while( my $line = <$fh> ) { # read one line from filehandle $fh
...
}
close($fh);
When you open the file this way, it must exist, else you get your error (No such file or directory at ...).
Writing
Opening for writing is indicated by a >, i.e.:
open($fh, ">$filename") or die $!; # bad
open($fh, '>', $filename) or die $!; # good
print $fh "some text\n"; # write to filehandle $fh
print $fh "more text\n"; # write to filehandle $fh
...
close($fh);
When you open the file this way, it is truncated (cleared) and overwritten if it existed. If it did not exist, it will get created.
Appending
Opening for appending is indicated by a >>, i.e.:
open($fh, ">>$filename") or die $!; # bad
open($fh, '>>', $filename) or die $!; # good
print $fh "some text\n"; # append to filehandle $fh
print $fh "more text\n"; # append to filehandle $fh
...
close($fh);
When you open the file this way and it existed, then the new lines will be appended to the file, i.e. nothing is lost. If the file did not
exist, it will be created (as if only > had been given).
Your error message doesn't match your code. You opened the file for writing (>) but got doesn't exist, which indicates that you actually opened it for reading.
This might have happened because you use OUTPUT as a filehandle instead of a scoped variable, e.g. $fh. OUTPUT is a global filehandle, i.e. if you open a file this way, then all of your code (no matter which function in) can use OUTPUT. Don't do that. From the docs:
An older style is to use a bareword as the filehandle, as
open(FH, "<", "input.txt")
or die "cannot open < input.txt: $!";
Then you can use FH as the filehandle, in close FH and and so on.
Note that it's a global variable, so this form is not recommended
in new code.
To summarize:
use scoped variables as filehandles ($fh instead of OUTPUT)
open your file in the right mode (> vs. <)
always use three-argument open (open($fh, $mode, $filename) vs. open($fh, "$mode$filename")
The comments explain that your two issues with the snippet are
The missing leading '/' in the $newresult declaration
You are treating your filehandle as both a read and a write.
The first is easy to fix. The second is not as easy to fix properly with knowing the rest of the script. I am making an assumption that pushval is called once per record in a Array of Arrays(?). This snippet below should get the result you want, but there is likely a better way of doing it.
my $newresult = "/home/user/newresults_percengtage_vs_pn";
sub pushval{
my #fields = #_;
open OUTFILE, ">>$newresult/fixedhomdata_030716-031316.csv" or die $!; #line 35
print OUTFILE "$fields[0]", "$fields[1]","$fields[2]","$fields[3]","$fields[4]","$fields[5]"
if($fields[5] >= 13) {
print OUTFILE "0\n";
} elsif($fields[5] < 13 && $fields[5] > 1) {
print OUTFILE "1\n";
} elsif($fields[5] <= 1) {
print OUTFILE "2\n";
}
close (OUTFILE);

Read and Write to a file in perl

this
is just
an example.
Lets assume the above is out.txt. I want to read out.txt and write onto the same file.
<Hi >
<this>
<is just>
<an example.>
Modified out.txt.
I want to add tags in the beginning and end of some lines.
As I will be reading the file several times I cannot keep writing it onto a different file each time.
EDIT 1
I tried using "+<" but its giving an output like this :
Hi
this
is just
an example.
<Hi >
<this>
<is just>
<an example.>
**out.txt**
EDIT 2
Code for reference :
open(my $fh, "+<", "out.txt");# or die "cannot open < C:\Users\daanishs\workspace\CCoverage\out.txt: $!";
while(<$fh>)
{
$s1 = "<";
$s2 = $_;
$s3 = ">";
$str = $s1 . $s2 . $s3;
print $fh "$str";
}
The very idea of what you are trying to do is flawed. The file starts as
H i / t h i s / ...
If you were to change it in place, it would look as follows after processing the first line:
< H i > / i s / ...
Notice how you clobbered "th"? You need to make a copy of the file, modify the copy, the replace the original with the copy.
The simplest way is to make this copy in memory.
my $file;
{ # Read the file
open(my $fh, '<', $qfn)
or die "Can't open \"$qfn\": $!\n";
local $/;
$file = <$fh>;
}
# Change the file
$file =~ s/^(.*)\n/<$1>\n/mg;
{ # Save the changes
open(my $fh, '>', $qfn)
or die "Can't create \"$qfn\": $!\n";
print($fh $file);
}
If you wanted to use the disk instead:
rename($qfn, "$qfn.old")
or die "Can't rename \"$qfn\": $!\n";
open(my $fh_in, '<', "$qfn.old")
or die "Can't open \"$qfn\": $!\n";
open(my $fh_out, '>', $qfn)
or die "Can't create \"$qfn\": $!\n";
while (<$fh_in>) {
chomp;
$_ = "<$_>";
print($fh_out "$_\n");
}
unlink("$qfn.old");
Using a trick, the above can be simplified to
local #ARGV = $qfn;
local $^I = '';
while (<>) {
chomp;
$_ = "<$_>";
print(ARGV "$_\n");
}
Or as a one-liner:
perl -i -pe'$_ = "<$_>"' file
Read contents in memory and then prepare required string as you write to your file. (SEEK_SET to zero't byte is required.
#!/usr/bin/perl
open(INFILE, "+<in.txt");
#a=<INFILE>;
seek INFILE, 0, SEEK_SET ;
foreach $i(#a)
{
chomp $i;
print INFILE "<".$i.">"."\n";
}
If you are worried about amount of data being read in memory, you will have to create a temporary result file and finally copy the result file to original file.
You could use Tie::File for easy random access to the lines in your file:
use Tie::File;
use strict;
use warnings;
my $filename = "out.txt";
my #array;
tie #array, 'Tie::File', $filename or die "can't tie file \"$filename\": $!";
for my $line (#array) {
$line = "<$line>";
# or $line =~ s/^(.*)$/<$1>/g; # -- whatever modifications you need to do
}
untie #array;
Disclaimer: Of course, this option is only viable if the file is not shared with other processes. Otherwise you could use flock to prevent shared access while you modify the file.
Disclaimer-2 (thanks to ikegami): Don't use this solution if you have to edit big files and are concerned about performance. Most of the performance loss is mitigated for small files (less than 2MB, though this is configurable using the memory arg).
One option is to open the file twice: Open it once read-only, read the data, close it, process it, open it again read-write (no append), write the data, and close it. This is good practice because it minimizes the time you have the file open, in case someone else needs it.
If you only want to open it once, then you can use the +< file type - just use the seek call between reading and writing to return to the beginning of the file. Otherwise, you finish reading, are at the end of the file, and start writing there, which is why you get the behavior you're seeing.
Need to specify
use Fcntl qw(SEEK_SET);
in order to use
seek INFILE, 0, SEEK_SET;
Thanks user1703205 for the example.

How to print into two files at once?

I'm having trouble getting this line of code to work:
for my $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
I found it at perldoc but it doesn't work for me.
The code I have so far is:
my $archive_dir = '/some/cheesy/dir/';
my ($stat_file,$stat_file2) = ($archive_dir."file1.txt",$archive_dir."file2.txt");
my ($fh1,$fh2);
for my $fh (fh1, fh2) { print $fh "whatever\n"; }
I'm getting "Bareword" errors on the (fh1, fh2) part because I'm using strict. I also noticed they were missing a ; in the example, so I'm guessing there might be some more errors aside from that.
What's the correct syntax for printing to two files at once?
You haven't opened the files.
my ($fh1,$fh2);
open($fh1, ">", $stat_file) or die "Couldn't open $stat_file: $!";
open($fh2, ">", $stat_file2) or die "Couldn't open $stat_file2: $!";
for my $fh ($fh1, $fh2) { print $fh "whatever\n"; }
Notice that I'm not using barewords. In the olden days, you would have used:
open(FH1, ">$stat_file");
...
for my $fh (FH1, FH2) { print $fh "whatever\n"; }
but the modern approach is the former.
I would just use IO::Tee.
use strict;
use warnings;
use autodie; # open will now die on failure
use IO::Tee;
open my $fh1, '>', 'file1';
open FH2, '>', 'file2';
my $both = IO::Tee->new( $fh1, \*FH2 );
print {$both} 'This is file number ';
print {$fh1} 'one';
print FH2 'two';
print {$both} "\n";
print {$both} "foobar\n";
$both->close;
Running the above program results in:
file1
This is file number one
foobar
file2
This is file number two
foobar
I would recommend reading the whole perldoc file for more advanced usage.
That looks about right, it's just that it used to be common in Perl to use barewords as file handles, but nowadays it's recommended to use normal scalars.
So make sure that you actually have the files open, then just substitute the (fh1, fh2) part with the actual file handles (which would be ($fh1, $fh2) or something)
another version based off of Brian's answer:
open(my $fh1, ">", $stat_file) or die "Couldn't open $stat_file!";
open(my $fh2, ">", $stat_file2) or die "Couldn't open $stat_file2!";
for ($fh1, $fh2) { print $_ "whatever\n"; }
You first need to open the file in order to get valid filehandles
open (MYFILEA, $stat_file);
open (MYFILEB, $stat_file2);
for my $fh ( \*MYFILEA, \*MYFILEB ) { print $fh "whatever\n" }
close (MYFILEA);
close (MYFILEB);

Problems reading a binary file with ActivePerl?

I'm trying to read a binary file with the following code:
open(F, "<$file") || die "Can't read $file: $!\n";
binmode(F);
$data = <F>;
close F;
open (D,">debug.txt");
binmode(D);
print D $data;
close D;
The input file is 16M; the debug.txt is only about 400k. When I look at debug.txt in emacs, the last two chars are ^A^C (SOH and ETX chars, according to notepad++) although that same pattern is present in the debug.txt. The next line in the file does have a ^O (SI) char, and I think that's the first occurrence of that particular character.
How can I read in this entire file?
If you really want to read the whole file at once, use slurp mode. Slurp mode can be turned on by setting $/ (which is the input record separator) to undef. This is best done in a separate block so you don't mess up $/ for other code.
my $data;
{
open my $input_handle, '<', $file or die "Cannot open $file for reading: $!\n";
binmode $input_handle;
local $/;
$data = <$input_handle>;
close $input_handle;
}
open $output_handle, '>', 'debug.txt' or die "Cannot open debug.txt for writing: $!\n";
binmode $output_handle;
print {$output_handle} $data;
close $output_handle;
Use my $data for a lexical and our $data for a global variable.
TIMTOWTDI.
File::Slurp is the shortest way to express what you want to achieve. It also has built-in error checking.
use File::Slurp qw(read_file write_file);
my $data = read_file($file, binmode => ':raw');
write_file('debug.txt', {binmode => ':raw'}, $data);
The IO::File API solves the global variable $/ problem in a more elegant fashion.
use IO::File qw();
my $data;
{
my $input_handle = IO::File->new($file, 'r') or die "could not open $file for reading: $!";
$input_handle->binmode;
$input_handle->input_record_separator(undef);
$data = $input_handle->getline;
}
{
my $output_handle = IO::File->new('debug.txt', 'w') or die "could not open debug.txt for writing: $!";
$output_handle->binmode;
$output_handle->print($data);
}
I don't think this is about using slurp mode or not, but about correctly handling binary files.
instead of
$data = <F>;
you should do
read(F, $buffer, 1024);
This will only read 1024 bytes, so you have to increase the buffer or read the whole file part by part using a loop.