I want to remove all lines in a text file that start with HPL_. I have achieved this and can print the result to the screen, but when I try to write to a file, the new file contains only the last line of the amended text. Any help, please!
open(FILE,"<myfile.txt");
@LINES = <FILE>;
close(FILE);
open(FILE,">myfile.txt");
foreach $LINE (@LINES) {
@array = split(/\:/,$LINE);
my $file = "changed";
open OUTFILE, ">$file" or die "unable to open $file $!";
print OUTFILE $LINE unless ($array[0] eq "HPL_");
}
close(FILE);
close (OUTFILE);
exit;
You just want to remove all lines that start with HPL_? That's easy!
perl -pi -e 's/^HPL_.*//s' myfile.txt
Yes, it really is just a one-liner. :-)
If you don't want to use the one-liner, re-write the "write to file" portion as follows:
my $file = "changed";
open( my $outfh, '>', $file ) or die "Could not open file $file: $!\n";
foreach my $LINE (@LINES) {
my @array = split(/:/,$LINE);
next if $array[0] eq 'HPL_';
print $outfh $LINE;
}
close( $outfh );
Note how you are open()ing the file each time through the loop. This is causing the file to only contain the last line, as using open() with > means "overwrite what's in the file". That's the major problem with your code as it stands.
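The effect described above can be reproduced in isolation. The following is a minimal sketch, not the original poster's program: the sample lines, the temporary file, and the helper names slurp, write_reopening and write_once are all made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data standing in for the question's lines.
my @lines = ( "keep 1\n", "keep 2\n", "keep 3\n" );

# Read a whole file back as one string.
sub slurp {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "Could not open $path: $!";
    local $/;    # undefined record separator: read everything at once
    return <$fh>;
}

# Buggy pattern: reopening with '>' on every iteration truncates the file.
sub write_reopening {
    my ( $path, @data ) = @_;
    for my $line (@data) {
        open( my $out, '>', $path ) or die "Could not open $path: $!";
        print $out $line;
        close($out);
    }
}

# Fixed pattern: open once before the loop, close once after it.
sub write_once {
    my ( $path, @data ) = @_;
    open( my $out, '>', $path ) or die "Could not open $path: $!";
    print $out $_ for @data;
    close($out);
}

# Temporary file (assumed scratch location, removed at exit).
my ( undef, $path ) = tempfile( UNLINK => 1 );

write_reopening( $path, @lines );
print "reopened each time: ", slurp($path);

write_once( $path, @lines );
print "opened once:\n", slurp($path);
```

With the reopening version only the final line survives, which matches the symptom in the question.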
Edit: As an aside, you want to clean up your code. Use lexical filehandles as I've shown. Always add the three lines that tchrist posted at the top of every one of your Perl programs. Use the three-argument version of open(). Don't slurp the entire file into an array; if you try to read a huge file, it could cause your computer to run out of memory. Your program could be rewritten as:
#!perl
use strict;
use autodie;
use warnings FATAL => "all";
my $infile = "myfile.txt";
my $outfile = "changed.txt";
open( my $infh, '<', $infile );
open( my $outfh, '>', $outfile );
while( my $line = <$infh> ) {
next if $line =~ /^HPL_/;
print $outfh $line;
}
close( $outfh );
close( $infh );
Note how with use autodie you don't need to add or die ... to the open() function, as the autodie pragma handles that for you.
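To illustrate what autodie does on failure, here is a small sketch. The path is a made-up one that is assumed not to exist; the failed open() throws an exception, which eval catches, without any explicit "or die".

```perl
use strict;
use warnings;
use autodie;

# Hypothetical path that is assumed not to exist on this machine.
my $missing = "/nonexistent-dir-for-demo/input.txt";

# With autodie, a failed open() throws instead of returning false,
# so eval can catch it without any explicit "or die".
my $ok = eval {
    open( my $fh, '<', $missing );
    1;
};

print "open failed as expected: ", ( $ok ? "no" : "yes" ), "\n";
print "error text: $@" unless $ok;
```

In a real program you would normally let the exception propagate, so the script stops with autodie's message.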
The issue with your code is that you open the file for output within your line-processing loop which, due to your use of the '>' form of open, opens the file each time for write, obliterating any previous content.
Move the invocation of open() to the top of your file, above the loop, and it should work.
Also, I'm not sure of your intent but at line 4 of your example, you reopen your input file for write (using '>'), which also clobbers anything it contains.
As a side note, you might try reading up on Perl's grep() function, which is designed to do exactly what you need, as in:
#!/usr/bin/perl
use strict;
use warnings;
open(my $in, '<', 'myfile.txt') or die "failed to open input for read: $!";
my @lines = <$in> or die 'no lines to read from input';
close($in);
# collect all lines that do not begin with HPL_ into @result
my @result = grep ! /^HPL_/, @lines;
open(my $out, '>', 'changed.txt') or die "failed to open output for write: $!";
print { $out } @result;
close($out);
This is my script count.pl, I am trying to count the number of lines in a file.
The script's code :
chdir $filepath;
if (-e "$filepath"){
$total = `wc -l < file.list`;
printf "there are $total number of lines in file.list";
}
I get the correct output, but I do not want to count blank lines or any line in the file that starts with #. Any ideas?
As this is already a Perl program, open the file and read it, filtering out the lines that shouldn't be counted with
open my $fh, '<', $filename or die "Can't open $filename: $!";
my $num_lines = grep { not /^$|^\s*#/ } <$fh>;
where $filename is "file.list". If by "blank lines" you also mean lines containing only whitespace, then change the regex to /^\s*$|^\s*#/. See grep, and perlretut for the regex used in its condition.
That filehandle $fh gets closed when the control exits the current scope, or add close $fh; after the file isn't needed for processing any more. Or, wrap it in a block with do
my $num_lines = do {
open my $fh, '<', $filename or die "Can't open $filename: $!";
grep { not /^$|^\s*#/ } <$fh>;
};
This makes sense if the sole purpose of opening that file is to count lines.
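To see how the two regexes mentioned above differ, here is a small sketch that runs both filters over an in-memory list. The sample lines are invented stand-ins for the contents of file.list.

```perl
use strict;
use warnings;

# Hypothetical sample content standing in for file.list.
my @lines = (
    "alpha\n",
    "\n",                        # truly empty line
    "   \n",                     # whitespace-only line
    "# comment\n",
    "  # indented comment\n",
    "beta\n",
);

# Skips only truly empty lines and comment lines.
my $loose = grep { not /^$|^\s*#/ } @lines;

# Also skips lines that contain nothing but whitespace.
my $strict = grep { not /^\s*$|^\s*#/ } @lines;

print "empty-only filter counts $loose lines\n";
print "whitespace-aware filter counts $strict lines\n";
```

The only line the two filters disagree on is the whitespace-only one, so the counts differ by one.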
Another thing though: an operation like chdir should always be checked, and then there is no need for the race-sensitive if (-e $filepath) either. Altogether
# Perhaps save the old cwd first so to be able to return to it later
#my $old_cwd = Cwd::cwd;
chdir $filepath or die "Can't chdir to $filepath: $!";
open my $fh, '<', $filename or die "Can't open $filename: $!";
my $num_lines = grep { not /^$|^\s*#/ } <$fh>;
A couple of other notes:
There is no reason for printf. For all normal prints use say, for which you need use feature qw(say); at the beginning of the program. See feature pragma
Just in case, allow me to add: every program must have at the beginning
use warnings;
use strict;
Perhaps the original intent of the code in the question is to allow a program to try a non-existing location, and not die? In any case, one way to keep the -e test, as asked for
#my $old_cwd = Cwd::cwd;
chdir $filepath or warn "Can't chdir to $filepath: $!";
my $num_lines;
if (-e $filepath) {
open my $fh, '<', $filename or die "Can't open $filename: $!";
$num_lines = grep { not /^$|^\s*#/ } <$fh>;
}
where I still added a warning if chdir fails. Remove that if you really don't want it. I also added a declaration of the variable that is assigned the number of lines, with my $num_lines;. If it is declared earlier in your real code then of course remove that line here.
perl -ne '$n++ unless /^$|^#/; print "$n\n" if eof'
Works with multiple files too.
perl -ne '$n++ unless /^$|^#/; END {print "$n\n"}'
Better for a single file.
open(my $fh, '<', $filename);
my $n = 0;
for(<$fh>) { $n++ unless /^$|^#/}
print $n;
Using sed to filter out the "unwanted" lines in a single file:
sed '/^\s*#/d;/^\s*$/d' infile | wc -l
Obviously, you can also replace infile with a list of files.
The solution is very simple; no magic is needed.
use strict;
use warnings;
use feature 'say';
my $count = 0;
while( <> ) {
$count++ unless /^\s*$|^\s*#/;
}
say "Total $count lines";
Reference: the <> (diamond) operator, described in perlop under I/O Operators.
My Perl file generates the text file which usually contains 200 lines. Sometimes it exceeds 200 lines (For example 217 lines). I need to trim off the rest of the lines from the 201st line. I have used the counter method to trim the exceeded lines. Is there any other simple and efficient way to do this?
Code:
#!/usr/bin/perl -w
use strict;
use warnings;
my $filename1="channel.txt";
my $filename2="channel1.txt";
my $fh;
my $fh1;
my $line;
my $line1;
my $count=1;
open $fh, '<', $filename1 or die "Can't open > $filename1: $!";
open $fh1, '>', $filename2 or die "Can't open > $filename2: $!";
while(my $line = <$fh>)
{
chomp $line;
chomp $line1;
if($count<201)
{
print $fh1 "$line\n";
}
$count++;
}
close ($fh1);
close($fh);
As I already mentioned in my comment, here is a short version of it. If you are just trying to trim the file, you can use a Perl one-liner instead of writing a whole script:
perl -pe 'last if($. == 201);' input.txt >result.txt
-p processes the file line by line and prints each line
-e executes the Perl code given on the command line
With a Perl script you can also do this (the example copies the first 10 lines; use 1..200 for 200 lines):
open my $fh,"<","input.txt";
open my $wh,">","result.txt";
print $wh scalar <$fh> for(1..10);
xxfelixxx already gave you the correct answer. I am just changing my earlier posted answer, to clean up your code and to write back to the original file:
use strict;
use warnings;
my @array;
my $filename="channel.txt";
open my $fh, '<', $filename or die "Can't open > $filename: $!";
while( my $line = <$fh> ) {
last if $. > 200;
push @array, $line;
}
close($fh);
open $fh, '>', $filename or die "Can't open > $filename: $!";
print $fh @array;
close($fh);
There is no need to keep your own counter, perl has a special variable $. which keeps track of the input line number. You can simplify your loop like so:
while( my $line = <$fh> ) {
last if $. > 200;
print $fh1 $line;
}
perldoc perlvar - Search for INPUT_LINE_NUMBER.
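As a quick check of how $. behaves, here is a sketch that reads from an in-memory string instead of channel.txt. The five sample lines and the limit of 3 are assumptions chosen to keep the output short; the real case would use 200.

```perl
use strict;
use warnings;

# Hypothetical input: 5 lines held in memory instead of channel.txt.
my $input = join '', map { "line $_\n" } 1 .. 5;
my $limit = 3;    # stands in for the 200-line limit

open( my $in, '<', \$input ) or die "Can't open in-memory input: $!";

my $kept = '';
while ( my $line = <$in> ) {
    last if $. > $limit;    # $. is the number of the line just read
    $kept .= $line;
}
close($in);

print $kept;
```

Only the first three lines end up in $kept, with no hand-written counter.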
To write back to the original file: input.txt without using redirection:
perl -pi.tmp -we "last if $.>200;" input.txt
where
-i : edits the file in place; output goes to a new file that replaces
the original, and the original is kept as a backup with the
'.tmp' suffix appended (input.txt.tmp)
-w : command line flag to 'use warnings'
-p : magic; basically equivalent to coding:
LINE: while (defined($_ = <ARGV>)) {
"your code here"
} continue { print }
-e : perl code follows this flag (enclosed in double quotes for MSWin32 aficionados)
I have a problem getting the script to print the whole line of the text file into a result text file:
use strict;
use warnings;
use autodie;
my $out = "result2.txt";
open my $outFile, ">$out" or die $!;
my %permitted = do {
open my $fh, '<', 'f1.txt';
map { /(.+?)\s+\(/, 1 } <$fh>;
};
open my $fh, '<', 'f2.txt';
while (<$fh>) {
my ($phrase) = /(.+?)\s+->/;
if ($permitted{$phrase}) {
print $outFile $fh;
}
close $outFile;
The problem is in this line
print $outFile $fh;
Any idea please?
Thank you
print $outFile $fh is printing the value of the file handle $fh to the file handle $outFile. Instead you want to print the entire current line, which is in $_.
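The difference is easy to see in a tiny sketch. The input string is made up, and an in-memory filehandle is used so the example needs no files.

```perl
use strict;
use warnings;

my $data = "first line\n";    # hypothetical input
open( my $fh, '<', \$data ) or die "Can't open in-memory input: $!";

my $as_text = "$fh";          # what print would write for the handle itself
my $line    = <$fh>;          # the actual line content
close($fh);

print "handle stringifies to: $as_text\n";
print "line read: $line";
```

The handle stringifies to something of the form GLOB(0x...), which is what ends up in the result file when $fh is printed instead of $_.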
There are a couple of other improvements that can be made
You should always use the three-parameter form of open, so the open mode appears on its own as the second parameter
There is no need to test the success of an open if autodie is in place
If you have a variable that contains the name of the output file, then you really should have ones for the names of the two input files as well
This is how your program should look. I hope it helps.
use strict;
use warnings;
use autodie;
my ($in1, $in2, $out) = qw/ f1.txt f2.txt result2.txt /;
my %permitted = do {
open my $fh, '<', $in1;
map { /(.+?)\s+\(/, 1 } <$fh>;
};
open my $fh, '<', $in2;
open my $outfh, '>', $out;
while (<$fh>) {
my ($phrase) = /(.+?)\s+->/;
if ($permitted{$phrase}) {
print $outfh $_;
}
}
close $outfh;
I think you want print $outfile $phrase here, don't you? The line you currently have is trying to print out a file handle reference ($fh) to a file ($outfile).
Also, just as part of perl best practices, you'll want to use the three argument open for your first open line:
open my $outFile, ">", $out or die $!;
(FWIW, you're already using 3-arg open for your other two calls to open.)
Although Borodin has provided an excellent solution to your question, here's another option where you pass your 'in' files' names to the script on the command line, and let Perl handle the opening and closing of those files:
use strict;
use warnings;
my $file2 = pop;
my %permitted = map { /(.+?)\s+\(/, 1 } <>;
push @ARGV, $file2;
while (<>) {
my ($phrase) = /(.+?)\s+->/;
print if $permitted{$phrase};
}
Usage: perl script.pl inFile1 inFile2 [>outFile]
The last, optional parameter directs output to a file.
The pop command implicitly removes inFile2's name from the end of @ARGV and stores it in $file2. Then inFile1 is read using the <> operator. The file name of inFile2 is then pushed onto @ARGV, and that file is read and a line is printed if $permitted{$phrase} is true.
Running the script without the last, optional parameter will print results (if any) to the screen. Using the last parameter saves output to a file.
Hope this helps!
I have this little Perl script which opens a txt file, reads the number in it, then overwrites the file with the number incremented by 1. I can open and read from the file, and I can write to the file, but I'm having issues overwriting. In addition, I'm wondering if there is a way to do this without opening the file twice. Here's my code:
#!/usr/bin/perl
open (FILE, "<", "data.txt") or die "$! error trying to append";
undef $/;
$number = <FILE>;
$number = int($number);
$myNumber = $number++;
print $myNumber+'\n';
close(FILE);
open(FILE, ">data.txt") or die "$! error";
print FILE $myNumber;
close(FILE);
Change the line
$myNumber = $number++;
to
$myNumber = $number+1;
That should solve the problem.
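The reason is that the post-increment operator returns the old value. A short sketch with a made-up starting value of 5 shows the three variants side by side:

```perl
use strict;
use warnings;

my $number = 5;
my $post   = $number++;        # $post gets the old value 5; $number becomes 6
my $after_post = $number;
print "post-increment stored $post, number is now $after_post\n";

$number = 5;
my $plus_one = $number + 1;    # 6; $number itself stays 5
print "number + 1 stored $plus_one, number is still $number\n";

$number = 5;
my $pre = ++$number;           # pre-increment: both are 6
print "pre-increment stored $pre, number is now $number\n";
```

So with $myNumber = $number++ the file is rewritten with the unchanged number, which looks as if the overwrite had failed.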
Below is how you could do by opening the file just once:
open(FILE, "+<data.txt") or die "$! error";
undef $/;
$number = <FILE>;
$number = int($number);
$myNumber = $number+1;
seek(FILE, 0, 0);
truncate(FILE, tell FILE);
print "$myNumber\n";
print FILE $myNumber;
close(FILE);
It's good that you used the three-argument form of open the first time. You also needed to do that in your second open. Also, you should use lexical variables, i.e., those which begin with my, in your script--even for your file handles.
You can just increment the variable that holds the number, instead of assigning it to a new variable. Also, it's a good idea to use chomp. That being said, consider the following option:
#!/usr/bin/env perl
use strict;
use warnings;
undef $/;
open my $fhIN, "<", "data.txt" or die "Error trying to open for reading: $!";
chomp( my $number = <$fhIN> );
close $fhIN;
$number++;
open my $fhOUT, ">", "data.txt" or die "Error trying to open for writing: $!";
print $fhOUT $number;
close $fhOUT;
Another option is to use the Module File::Slurp, letting it handle all the I/O operations:
#!/usr/bin/env perl
use strict;
use warnings;
use File::Slurp qw/edit_file/;
edit_file { chomp; $_++ } 'data.txt';
Try this:
#!/usr/bin/perl
use strict;
use warnings;
my $file = "data.txt";
my $number = 0;
my $fh;
if( -e $file ) {
open $fh, "+<", $file or die "Opening '$file' failed, because $!\n";
$number = <$fh>;
seek( $fh, 0, 0 );
} else { # if no data.txt exists - yet
open $fh, ">", $file or die "Creating '$file' failed, because $!\n";
}
$number++;
print "$number\n";
print $fh $number;
close( $fh );
If you're using a bash shell, and you save the code to test.pl, you can test it with:
for i in {1..10}; do ./test.pl; done
Then 'cat data.txt', should show a 10.
I am 100% new to Perl but do have some PHP knowledge. I'm trying to create a quick script that will take the @urls values and save them to a .txt file. The problem I'm having is that it saves the URLs again every time it runs through the loop, which is super annoying. So when the loop runs, the output looks like this.
url1.com
url1.com url2.com
url1.com url2.com url3.com
What I would like it to look like is just a plain and simple:
url1.com
url2.com
url3.com
Here is my code. If anyone can help, I would appreciate it SO SO much!
#!/usr/bin/perl
use strict;
use warnings;
my $file = "data.rdf.u8";
my @urls;
open(my $fh, "<", $file) or die "Unable to open $file\n";
while (my $line = <$fh>) {
if ($line =~ m/<(?:ExternalPage about|link r:resource)="([^\"]+)"\/?>/) {
push @urls, $1;
}
open (FH, ">>my_urls.txt") or die "$!";
print FH "@urls ";
close(FH);
}
close $fh;
Your print is inside your while loop. It sounds like you want to move your print outside of the loop.
Or, if you want to print each URL as you go through each line, move the declaration of "my @urls" down into the loop, so that it is reset for each line.
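A minimal sketch of the first suggestion, collecting inside the loop and writing once afterwards, might look like this. The three input lines are invented, an in-memory filehandle replaces data.rdf.u8, and the result is built in $report rather than written to my_urls.txt.

```perl
use strict;
use warnings;

# Hypothetical input standing in for data.rdf.u8.
my $data = join '',
    qq{<ExternalPage about="url1.com">\n},
    qq{<Title>ignored</Title>\n},
    qq{<link r:resource="url2.com"/>\n};

open( my $in, '<', \$data ) or die "Can't open in-memory input: $!";

my @urls;
while ( my $line = <$in> ) {
    # Only collect inside the loop; no output happens here.
    push @urls, $1
        if $line =~ m{<(?:ExternalPage about|link r:resource)="([^"]+)"/?>};
}
close($in);

# Written once, after the loop, one URL per line.
my $report = join '', map { "$_\n" } @urls;
print $report;
```

Because the output step runs only once, each URL appears exactly once.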
Shouldn't this part:
open (FH, ">>my_urls.txt") or die "$!";
print FH "@urls ";
close(FH);
...be placed outside of the while loop? It makes no sense within the while loop, as @urls is apparently incomplete there.
And two regex-related side notes: first, with the m operator you may choose another set of delimiters so you don't have to escape the / sign; second, it's not necessary to escape the " sign within a character class definition. In fact, it's not required to escape it in a regex at all, unless you choose this character as a delimiter.
So your regex may look like this:
$line =~ m#<(?:ExternalPage about|link r:resource)="([^"]+)"/?>#
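As a sanity check that the two spellings are equivalent, this sketch applies both to the same line. The sample line is a made-up one in the style of the input file.

```perl
use strict;
use warnings;

# Hypothetical line in the style of the RDF dump from the question.
my $line = '<ExternalPage about="http://url1.com/">';

# Original form: / delimiters, so the slash and (needlessly) the quote are escaped.
my ($escaped) = $line =~ m/<(?:ExternalPage about|link r:resource)="([^\"]+)"\/?>/;

# Same pattern with # delimiters: no escaping needed.
my ($plain) = $line =~ m#<(?:ExternalPage about|link r:resource)="([^"]+)"/?>#;

print "escaped form captured: $escaped\n";
print "plain form captured:   $plain\n";
```

Both forms capture the same URL; the second is simply easier to read.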
Do you need the @urls array elsewhere? If not, you could simply do:
#!/usr/bin/perl
use strict;
use warnings;
my $file = "data.rdf.u8";
my @urls;
open(my $fh, "<", $file) or die "Unable to open $file\n";
open (FH, ">>my_urls.txt") or die "$!";
while (my $line = <$fh>) {
if ($line =~ m/<(?:ExternalPage about|link r:resource)="([^\"]+)"\/?>/) {
print FH "$1\n";
}
}
close(FH);
close $fh;