Picking a specific line with a specific string - perl

I am trying to pick out the one complete line in a document that contains "CURRENT_RUN_ID". I have been using the code below, but execution never enters the while loop.
my $sSuccessString = "CURRENT_RUN_ID";
open(LOG, "$slogfile") or die("Can't open $slogfile\n");
my $sLines;
{
local $/ = undef;
$sLines=<LOG>;
}
my $spool = 0;
my @matchingLines;
while (<LOG>)
{
print OUTLOG "in while loop\n";
if (m/$sSuccessString/i) {
print OUTLOG "in if loop\n";
$spool = 1;
print map { "$_ \n" } @matchingLines;
@matchingLines = ();
}
if ($spool) {
push (@matchingLines, $_);
}
}

You are already done reading from the filehandle LOG after you have slurped it into $sLines. <LOG> in the head of the while will return undef because it has reached eof. You either have to use that variable $sLines in your while loop or get rid of it. You're not using it anyway.
If you only want to print the line that matches, all you need to do is this:
use strict;
use warnings;
open my $fh_in, '<', 'input_file' or die $!;
open my $fh_out, '>', 'output_file' or die $!;
while (my $line = <$fh_in>) {
print $fh_out $line if $line =~ m/CURRENT_RUN_ID/;
}
close $fh_in;
close $fh_out;
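If you would rather keep the slurped $sLines and use it, a minimal sketch of that route might look like the following. The sample text and the matching_lines helper are made up for illustration; in your script $sLines would come from the log file.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the slurped contents of the log file.
my $sLines = "start\nCURRENT_RUN_ID=42\nend\n";

# Return every complete line of $text that contains $needle (case-insensitive).
# matching_lines is an illustrative helper, not something from the question.
sub matching_lines {
    my ($text, $needle) = @_;
    # split /^/m breaks the string into lines and keeps each trailing newline.
    return grep { /\Q$needle\E/i } split /^/m, $text;
}

print for matching_lines($sLines, 'CURRENT_RUN_ID');
```

The \Q...\E quoting keeps any regex metacharacters in the search string from being interpreted as a pattern.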

When you execute this code:
$sLines=<LOG>;
it reads all of the data from LOG into $sLines and it leaves the file pointer for LOG at the end of the file. So when you next try to read from that file handle with:
while (<LOG>)
nothing is returned as there is no more data to read.
If you want to read the file twice, then you will need to use the seek() function to reset the file pointer before your second read.
seek LOG, 0, 0;
But, given that you never do anything with $sLines I suspect that you can probably just remove that whole section of the code.
The whole thing with $spool and @matchingLines seems strange too. What were you trying to achieve there?
I think your code can be simplified to just:
my $sSuccessString = "CURRENT_RUN_ID";
open(LOG, $slogfile) or die("Can't open $slogfile\n");
while (<LOG>) {
print OUTLOG if /$sSuccessString/i;
}
Personally, I'd make it even simpler, by reading from STDIN and writing to STDOUT.
my $sSuccessString = 'CURRENT_RUN_ID';
while (<>) {
print if /$sSuccessString/i;
}
And then using Unix I/O redirection to connect up the correct files.
$ ./this_filter.pl < your_input.log > your_output.log
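For completeness, here is a small self-contained sketch of the seek() approach for the case where you really do need two passes over the same handle. The temporary file and its contents are invented so the example can run on its own; a lexical handle is used instead of LOG.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway file so the example is self-contained.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "first\nCURRENT_RUN_ID=7\nlast\n";
close $tmp_fh or die "Can't close $tmp_name: $!";

open my $log, '<', $tmp_name or die "Can't open $tmp_name: $!";

# First pass: slurp the whole file; the handle is now at end of file.
my $content = do { local $/; <$log> };

# Rewind to the start (offset 0, whence 0) before reading again.
seek($log, 0, 0) or die "Can't seek in $tmp_name: $!";

# Second pass: read line by line and collect the matches.
my @matches;
while (my $line = <$log>) {
    push @matches, $line if $line =~ /CURRENT_RUN_ID/i;
}
close $log or die "Can't close $tmp_name: $!";

print @matches;
```

Without the seek() call the second loop body would never run, which is the symptom described in the question.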

Related

Perl File Write Issue

I'm having a really weird problem with this perl script. The basic point is that sometimes a file write/append doesn't happen. On a run of the program, either all of the writes will happen or none of them will. Here is the subroutine, with some comments:
sub process_svs {
my $F;
open($F, '<', $_[0]);
if($log_dups==1) {
open($dfh, '>>',"./duplicates.txt");
}
while (my $line = <$F>) {
chomp $line;
if($line =~ /somepattern/) {
if (! -e "somefile") {
copy("source","dest") or warn ("couldn't copy");
} elsif($log_dups==1) {
system("touch ./duplicates.txt"); # ghetto workaround
print $dfh "./".$_[0]."_files/".$1.",v already exists\n" or die("Couldn't write duplicate"); # problem line
}
}
}
close $F;
}
The print statements to stdout always work, but if I remove the touch ./duplicates.txt crap, nothing is written to duplicates.txt.
The other "weird" thing, is that earlier in the program, I create a directory with perl mkdir, and if the directory exists when the program is run, I don't need the workaround, the duplicates.txt writing works just fine. If I delete the directory, and let the program mkdir it, it doesn't work. Seems relevant, but I can't figure out how since the directory and the text file are not in the same location, or related in any way, that I can think of.
Additionally, I have run it through the debugger, and can see the write call being executed, but inspecting duplicates.txt immediately after the write shows nothing written.
Any possible reasons for this would be greatly appreciated.
If you want to see a modified, but more complete, version of the script, it is here:
use strict;
use warnings;
use File::Copy;
my $svs = $ARGV[0];
my $rhis_str = system("rhis $svs > ./tmp_history");
my $fh;
my $dfh;
my @versions;
my $all_revs = 0;
my $current_rev = "";
my $log_dups = 0;
sub process_svs {
my $F;
open($F, '<', $_[0]);
if($log_dups==1) {
open($dfh, '>>',"./duplicates.txt");
}
while (my $line = <$F>) {
chomp $line;
if($line =~ /something/) {
if (! -e "something") {
copy("source","dest") or warn ("couldn't copy ");
} elsif($log_dups==1) {
system("touch ./duplicates.txt"); # ghetto workaround
print $dfh "something already exists\n" or die("Couldn't write duplicate");
}
}
}
close $F;
}
for(my $i = 0; $i <= scalar @ARGV; $i++) {
my $arg = $ARGV[$i];
if($arg eq "-a") {
$all_revs = 1;
} elsif($arg eq "-r") {
$all_revs = 0;
$current_rev = $ARGV[$i+1];
} elsif($arg eq "--log-dups") {
$log_dups = 1;
}
}
open($fh, '<','./tmp_history') or die(">>> Failed to open ./tmp_history");;
mkdir "./".$svs."_files";
if($all_revs == 1) {
print ">>> Processing all revisions of ".$svs;
if($log_dups==1) {
print" (and logging duplicates)\n";
}
while(my $line = <$fh>) {
chomp $line;
if ($line =~ /something/) {
push @versions, $1;
}
}
}
system("some_cmd &>/dev/null");
process_svs($svs);
}
You're not checking to see if your files opened. This is a very, very basic mistake and you should fix this immediately. Either add or die $! after each open or, better yet, use autodie and it will take care of catching all IO exceptions for you and give you good, consistent error messages.
Most importantly, this will tell you why it failed to open. $! tells you why it failed. You don't have that in your check on print.
print $dfh "./".$_[0]."_files/".$1.",v already exists\n" or die("Couldn't write duplicate"); # problem line
You're checking if print failed, but you're not including $!. Either add $! like die "Couldn't write to duplicate: $!" or use autodie, remove the or die clause, and let autodie take care of it. I recommend the second.
I suspect you'll find that something else is deleting duplicates.txt between the open and the print.
The second thing that grabs my attention is here.
if($log_dups==1) {
open($dfh, '>>',"./duplicates.txt");
}
You're using a global variable $log_dups to decide whether or not to open the file for writing (and not checking if it succeeded). This should be a variable that gets passed into the function; that's just good programming practice. Later you decide whether to print to $dfh based on that global variable.
if (! -e "something") {
copy("source","dest") or warn ("couldn't copy ");
} elsif($log_dups==1) {
system("touch ./duplicates.txt"); # ghetto workaround
print $dfh "something already exists\n" or die("Couldn't write duplicate");
}
Because $log_dups is global it's possible something else is changing $log_dups between deciding to open duplicates.txt and writing to it. To avoid all these problems, and to make the code simpler, $log_dups should be an argument passed into the function.
Furthermore, the filehandle $dfh is inexplicably a global. Same problem, something else could be closing it. It will also not be automatically closed at the end of the function which might leave writes to duplicates.txt buffered until the program exits. $dfh should be a lexical.
Other problems...
my $rhis_str = system("rhis $svs > ./tmp_history");
$rhis_str will contain the exit status of the rhis program. I don't think that's what you want. You don't use this variable anyway.
There's no need to pass ./file to open; it's safe and easier to read to just pass file. That it's in the current working directory is implied.
If you fix these basic problems and still have trouble, then edit your question with the revised code and we can look again.
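To make those suggestions concrete, here is a rough sketch of the shape I have in mind: a lexical filehandle, the log file name passed in as an argument instead of a global flag, and autodie checking open and close. The log_existing sub, its parameters, and the temporary files are all hypothetical stand-ins, not your real logic. As far as I know autodie does not wrap print, so the sketch keeps an explicit check there.

```perl
use strict;
use warnings;
use autodie;                      # open/close now die with a useful message
use File::Temp qw(tempdir);

# Hypothetical helper: read paths from $list_file and, if $dup_log is given,
# append a note for every path that already exists. Returns the number logged.
sub log_existing {
    my ($list_file, $dup_log) = @_;

    open my $in, '<', $list_file;
    my $dfh;                      # lexical, so nothing else can close it
    open $dfh, '>>', $dup_log if defined $dup_log;

    my $logged = 0;
    while (my $path = <$in>) {
        chomp $path;
        next unless -e $path;     # only existing paths count as duplicates
        next unless $dfh;         # logging was not requested
        print {$dfh} "$path already exists\n"
            or die "Couldn't write to $dup_log: $!";
        $logged++;
    }
    close $in;
    close $dfh if $dfh;           # flushes buffered writes before returning
    return $logged;
}

# Self-contained demo with throwaway files.
my $dir  = tempdir(CLEANUP => 1);
my $list = "$dir/list.txt";
open my $out, '>', $list;
print {$out} "$list\n$dir/missing\n";
close $out;

my $logged = log_existing($list, "$dir/duplicates.txt");
print "logged $logged duplicate(s)\n";
```

Closing the log handle inside the sub means the writes reach the file before the sub returns, instead of sitting in a buffer until the program exits.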

How to write a correct name using a combination of variable and string as a filehandle?

I want to write a tool that distributes each line of an input file among several files,
but there seems to be a problem with how I name the filehandles, so I'm stuck. How do I solve this?
Here is my program.
$ARGV[0] is the input file
$ARGV[1] is the number of classes
#!/usr/bin/perl
use POSIX;
use warnings;
# open input file
open(Raw,"<","./$ARGV[0]") or die "Can't open $ARGV[0] \n";
# create a directory class to store class files
system("mkdir","Class");
# create files for store class informations
for($i=1;$i<=$ARGV[1];$i++)
{
# it seems something wrong in here
open("Class$i",">","./Class/$i.class") or die "Can't create $i.class \n";
}
# read each line and random decide which class to store
while( eof(Raw) != 1)
{
$Line = readline(*Raw);
$Random_num = ceil(rand $ARGV[1]);
for($k=1;$k<=$ARGV[1];$k++)
{
if($Random_num == $k)
{
# Store to the file
print "Class$k" $Line;
last;
}
}
}
for($h=1;$h<=$ARGV[1];$h++)
{
close "Class$h";
}
close Raw;
thanks
Later I followed the advice from Bill Ruppert.
I put the filehandle names into an array, but now there seems to be a syntax error that I can't fix.
I marked the offending line with ######## A syntax error but it looks quite OK ########
Here is my code:
#!/usr/bin/perl
use POSIX;
use warnings;
use Data::Dumper;
# open input file
open(Raw,"<","./$ARGV[0]") or die "Can't open $ARGV[0] \n";
# create a directory class to store class files
system("mkdir","Class");
# put the names of the filehandles into an array
for($i=0;$i<$ARGV[1];$i++)
{
push(@Name,("Class".$i));
}
# create files of classes
for($i=0;$i<=$#Name;$i++)
{
$I = ($i+1);
open($Name[$i],">","./Class/$I.class") or die "Can't create $I.class \n";
}
# read each line and random decide which class to store
while( eof(Raw) != 1)
{
$Line = readline(*Raw);
$Random_num = ceil(rand $ARGV[1]);
for($k=0;$k<=$#Name;$k++)
{
if($Random_num == ($k+1))
{
print $Name[$k] $Line; ######## A syntax error but it looks quite OK ########
last;
}
}
}
for($h=0;$h<=$#Name;$h++)
{
close $Name[$h];
}
close Raw;
thanks
To quote the Perl documentation on the print function:
If you're storing handles in an array or hash, or in general whenever you're using any expression more complex than a bareword handle or a plain, unsubscripted scalar variable to retrieve it, you will have to use a block returning the filehandle value instead, in which case the LIST may not be omitted:
print { $files[$i] } "stuff\n";
print { $OK ? STDOUT : STDERR } "stuff\n";
Thus, print $Name[$k] $Line; needs to be changed to print { $Name[$k] } $Line;.
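Putting that together, a minimal sketch with lexical filehandles stored in an array could look like this. The class count, the sample lines, and the temporary directory are assumptions made so the example runs on its own; in the question they would come from $ARGV[1], the input file, and ./Class.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $class_count = 3;                      # hypothetical; the question uses $ARGV[1]
my $dir         = tempdir(CLEANUP => 1);  # stands in for the ./Class directory

# Open one lexical filehandle per class and keep them in an array.
my @handles;
for my $i (1 .. $class_count) {
    open my $fh, '>', "$dir/$i.class" or die "Can't create $i.class: $!";
    push @handles, $fh;
}

# Send each line to a randomly chosen class.
for my $line ("a\n", "b\n", "c\n", "d\n") {
    my $k = int rand $class_count;        # index 0 .. $class_count - 1
    # The block around the handle is what makes this parse.
    print { $handles[$k] } $line;
}

for my $fh (@handles) {
    close $fh or die "Can't close class file: $!";
}
```

Indexing the array with the random number also removes the inner for loop from the question, since the handle can be looked up directly.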
How about this one:
#! /usr/bin/perl -w
use strict;
use POSIX;
my $input_file = shift;
my $file_count = shift;
my %hash;
open(INPUT, "<$input_file") || die "Can't open file $input_file";
while(my $line = <INPUT>) {
my $num = ceil(rand($file_count));
$hash{$num} .= $line
}
foreach my $i (1..$file_count) {
open(OUTPUT, ">$i.txt") || die "Can't open file $i.txt";
print OUTPUT $hash{$i};
close OUTPUT;
}
close INPUT;

perl file read, truncate

I am trying to modify a config file.
I first read it into @buffer, keeping the lines that match a regex.
The modified buffer is then written back to disk, and the file is truncated in case it got smaller.
Unfortunately this does not work: it already crashes at fseek, although as far as I can tell my usage of fseek conforms to the Perl documentation.
open (my $file, "+<", "somefilethatexists.txt");
flock ($file, LOCK_EX);
foreach my $line (<$file>) {
if ($line =~ m/(something)*/) {
push (@buffer, $line);
}
}
print "A\n";
seek($file,0,0); #seek to the beginning, we read some data already
print "B\n"; # never appears
write($file, join('\n',@buffer)); #write new data
truncate($file, tell($file)); #get rid of everything beyond the just written data
flock($file, LOCK_UN);
close ($file);
perlopentut says this about Mixing Reads and Writes
... when it comes to updating a file ... you probably don't want to
use this approach for updating.
You should use Tie::File for this. It opens the file for both read and write on the same filehandle and allows you to treat a file as an array of lines
use strict;
use warnings;
use Tie::File;
tie my @file, 'Tie::File', 'somefilethatexists.txt' or die $!;
for (my $i = 0; $i < @file; ) {
if ($file[$i] =~ m/(something)*/) {
$i++;
}
else {
splice @file, $i, 1;
}
}
untie @file;
Where are your fseek(), fwrite() and ftruncate() functions defined? Perl doesn't have those functions. You should be using seek(), print() (or syswrite()) and truncate(). We can't really help you if you're using functions that we know nothing about.
You also don't need (and probably don't want) that explicit call to unlock the file or the call to close the file. The filehandle will be closed and unlocked as soon as your $file variable goes out of scope.
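For reference, here is a rough sketch of the same read, rewind, write, truncate sequence using seek(), print() and truncate(). The temporary file and the /^keep/ filter are invented for the example. Note that the lines are not chomped, so they can be printed back as they are; join('\n', ...) with single quotes would insert a literal backslash and n rather than a newline.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);
use File::Temp qw(tempfile);

# Throwaway config file so the sketch can run on its own.
my ($tmp, $name) = tempfile(UNLINK => 1);
print {$tmp} "keep one\ndrop two\nkeep three\n";
close $tmp or die "Can't close $name: $!";

open my $file, '+<', $name or die "Can't open $name: $!";
flock($file, LOCK_EX) or die "Can't lock $name: $!";

# Keep only the wanted lines (hypothetical filter: lines starting with "keep").
my @buffer = grep { /^keep/ } <$file>;

# Rewind, overwrite from the start, then cut off whatever is left over.
seek($file, 0, SEEK_SET)        or die "Can't seek in $name: $!";
print {$file} @buffer           or die "Can't write to $name: $!";
truncate($file, tell($file))    or die "Can't truncate $name: $!";
close $file                     or die "Can't close $name: $!";
```

Every call is checked, so if one of the steps fails you get the reason from $! instead of a silent stop.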
Maybe you can try this:
$^I = '.bak';
@ARGV = 'somefilethatexists.txt';
while (<>) {
if (/(something)*/) {
print;
}
}

Open filehandle or assign stdout

I'm working in a program where the user can pass a -o file option, and output should be then directed to that file. Otherwise, it should go to stdout.
To retrieve the option I'm using the Getopt::Long module, and that's not the problem. The problem is that I want to create a file handle with that file or assign stdout to it if the option was not set.
if ($opt) {
open OUTPUT, ">", $file;
} else {
open OUTPUT, # ???
}
That's because this way, later in my code I can just:
print OUTPUT "...";
Without worrying if OUTPUT is stdout or a file the user specified. Is this possible? If I'm doing a bad design here, please let me know.
This would be a good example on how to use select.
use strict;
use warnings;
use autodie;
my $fh;
if ($opt) {
open $fh, '>', $file;
select $fh;
}
print "This goes to the file if \$opt is set, otherwise to STDOUT.\n";
Look at the open documentation. The easiest is to reopen STDOUT itself and not use a filehandle in your code.
if ($opt) {
open(STDOUT, ">", $file);
}
...
print "this goes to $file or STDOUT\n";
(Add some error checking of course.)
A bareword filehandle such as OUTPUT cannot be assigned to. Using a scalar variable such as $output works better. For example:
my ($output, $display_filename);
if ($opt)
{
if ($opt eq '-')
{
$display_filename = 'stdout';
$output = *STDOUT;
}
else
{
$display_filename = $opt;
open($output, '>', $opt) or
die("Cannot open $opt for writing: $!\n");
}
}
That way the program can print to standard output and/or to an output file:
print $output "This might go to a file\n";
print "Data written to $display_filename\n" if ($verbose);

How do I determine whether a Perl file handle is a read or write handle?

You are given either an IO::File object or a typeglob (\*STDOUT or Symbol::symbol_to_ref("main::FH")); how would you go about determining if it is a read or write handle? The interface cannot be extended to pass this information (I am overriding close to add calls to flush and sync before the actual close).
Currently I am attempting to flush and sync the filehandle and ignoring the error "Invalid argument" (which is what I get when I attempt to flush or sync a read filehandle):
eval { $fh->flush; 1 } or do {
#this seems to exclude flushes on read handles
unless ($! =~ /Invalid argument/) {
croak "could not flush $fh: $!";
}
};
eval { $fh->sync; 1 } or do {
#this seems to exclude syncs on read handles
unless ($! =~ /Invalid argument/) {
croak "could not sync $fh: $!";
}
};
Have a look at the fcntl options. Maybe F_GETFL with O_ACCMODE.
Edit: I did a little googling and playing over lunch and here is some probably non-portable code but it works for my Linux box, and probably any Posix system (perhaps even Cygwin, who knows?).
use strict;
use Fcntl;
use IO::File;
my $file;
my %modes = ( 0 => 'Read only', 1 => 'Write only', 2 => 'Read / Write' );
sub open_type {
my $fh = shift;
my $mode = fcntl($fh, F_GETFL, 0);
print "File is: " . $modes{$mode & 3} . "\n";
}
print "out\n";
$file = new IO::File();
$file->open('> /tmp/out');
open_type($file);
print "\n";
print "in\n";
$file = new IO::File();
$file->open('< /etc/passwd');
open_type($file);
print "\n";
print "both\n";
$file = new IO::File();
$file->open('+< /tmp/out');
open_type($file);
Example output:
$ perl test.pl
out
File is: Write only
in
File is: Read only
both
File is: Read / Write
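The same check can be written with the named constants from Fcntl instead of the literal 0, 1, 2 and the mask 3. This is only a sketch and carries the same portability caveat as the code above; access_mode is a made-up helper name and the temporary file exists purely for the demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL O_ACCMODE O_RDONLY O_WRONLY O_RDWR);
use File::Temp qw(tempfile);

# Hypothetical helper: classify a handle as 'read', 'write' or 'read/write'
# from the access-mode bits of its open flags.
sub access_mode {
    my ($fh) = @_;
    my $flags = fcntl($fh, F_GETFL, 0);
    defined $flags or die "fcntl failed: $!";
    my $mode = $flags & O_ACCMODE;
    return $mode == O_RDONLY ? 'read'
         : $mode == O_WRONLY ? 'write'
         :                     'read/write';
}

# Demonstration on a throwaway file opened in three different ways.
my ($tmp, $name) = tempfile(UNLINK => 1);
open my $in,   '<',  $name or die "Can't open $name for reading: $!";
open my $out,  '>>', $name or die "Can't open $name for appending: $!";
open my $both, '+<', $name or die "Can't open $name for update: $!";

print "$_\n" for map { access_mode($_) } $in, $out, $both;
```

Inside an overridden close, you could call something like this first and only flush and sync when the result is not 'read'.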