Read content of file passed in argument to perl program - perl

I am trying read content of file which I passed as command line argument to perl program
ERROR:
[localhost#dharm myPerlCodes]$ perl data.pl m
FILE NAME IS ==
Could not read from , program halting. at data.pl line 31.
CODE:
sub get_record {
# the file_data should be passed in as a parameter
my $file = shift;
print "FILE NAME IS == $file\n";
open(FILE, $file) or die "Could not read from $file, program halting.";
# read the record, and chomp off the newline
chomp(my $record = <FILE>);
close FILE;
return $record;
}
$text = &get_record();
print "text in given file is = $text\n";
What am I doing wrong ?

You should be passing the filename to get_record, like:
$text = &get_record(shift #ARGV);
shift inside a subroutine gets arguments passed to the subroutine (#_); only outside of a subroutine does it get command line arguments.

Outside of a function, shift returns the next parameter to the script, but inside of a function it returns the next parameter.
You need to pass the parameter in to the get_record function: $text = &get_record(shift);

If you want to read the content of a file passed as a parameter, you don't need to do any of that. The file will automatically be available in <>. Also, there's no need to use & unless you're dealing with prototypes (and you shouldn't be).
sub get_record {
chomp( my $record = <> );
return $record;
}
my $text = get_record();

Related

Read array data from a CSV or a config file instead of __DATA__ inside Perl script

I'm a novice in Perl and recently came through a perl script which reads array data from __DATA__; (a inline data statement inside the perl script at the bottom) using #flatfiledata = <DATA>;
data looks as below inside the script at the bottom:
__DATA__;
Arrays
#cab_method::Concentration::Dilution
#cab04_cartypes::XXXX Dye::Gag XXXXXX::LuciXXXX::Firefly LuciXXXX::Renilla LucifXXXX
__END__;
I would like to keep this data inside a config file either csv or a tab delimited file
and then read this data into array.
below is the current subroutine which read this data from __DATA__ statement at the bottom.
# read data from __DATA__ section at end of program and put data into #flatfiledata array
#flatfiledata = <DATA>;
&ReadArraysAndHashes;
sub ReadArraysAndHashes {
foreach $line (#flatfiledata) {
chomp $line;
# ok, some weird newline chars have ended up in data section, so...
$line =~ s/\r|\n//g;
# skip any lines that do not contain at least one '::'
# (probably blank lines or comments)
unless ( $line =~ /::/ ) { next }
#split all the elements in the line into an array
my #elements = split( /::/, $line );
# The first element is the key;
my $key = shift(#elements);
# if the key starts with a '#,' you have an array;
# if it starts with a '%,' it is a hash
# either way, delete the symbol from the key
$key =~ s/^(.)//;
my $array_or_hash = $1;
# create a hash of hashes
if ( $array_or_hash eq '#' ) {
$clrdata{array}{$key} = \#elements;
#{$key} = #{ $clrdata{array}{$key} };
}
elsif ( $array_or_hash eq '%' ) {
if ( $#elements % 2 != 1 ) {
print "odd number of elements for $key\n";
}
my %hash = #elements;
$clrdata{hash}{$key} = \%hash;
}
}
}
__DATA__;
Arrays
#cab_method::Concentration::Dilution
#cab04_cartypes::XXXX Dye::Gag XXXXXX::LuciXXXX::Firefly LuciXXXX::Renilla LucifXXXX
__END__;
Basic answer
DATA is just a filehandle, so it can easily be replaced with any other filehandle.
open my $config_fh, '<', 'some_config_file'
or die "Can't open config file: $!\n";
#flatfiledata = <$config_fh>;
More tips
One thing that we've learned from many decades of computer programming is that it's a bad idea for a subroutine to access global variables. All data used by a subroutine should either be passed to the subroutine as parameters or created within the subroutine. Your ReadArraysAndHashes() subroutine uses the global #flatfiledata variable. You should either pass the data into the subroutine:
open my $config_fh, '<', 'some_config_file'
or die "Can't open config file: $!\n";
#flatfiledata = <$config_fh>;
ReadArraysAndHashes(#flatfiledata);
# And in the subroutine
sub ReadArraysAndHashes {
my #flatfiledata = #_;
# Rest of the code is the same
...
}
Or you should read the data inside the subroutine:
ReadArraysAndHashes();
# And in the subroutine
sub ReadArraysAndHashes {
# We don't need the #flatfiledata array
# as we can read the filehandle directly
open my $config_fh, '<', 'some_config_file'
or die "Can't open config file: $!\n";
foreach $line (<$config_fh>) {
# Your existing code
...
}
}
It also look like your subroutine is writing to a global variable called %clrdata. That's also a bad idea. Consider returning that variable from your subroutine instead.
%clrdata = ReadArraysAndHashes();
# And in the subroutine
sub ReadArraysAndHashes {
# All of the code
...
return %clrdata;
}
Extra reading
Two final tips to consider:
Take a look at restricting the scope of your variables using my.
Don't use ampersands to call subroutines.

Perl - Could not open and read files

I've created a script for validating xml files after given input folder. It should grep xml files from the input directory then sort out the xml files and check the condition. But it throws a command that not Open at line , <STDIN> line 1.
But it creates an empty log file.
Since i faced numeric error while sorting, comment that.
so i need to be given input location, the script should check the xml files and throw errors in a mentioned log file.
Anyone can help this?
Script
#!/usr/bin/perl
# use strict;
use warnings;
use Cwd;
use File::Basename;
use File::Path;
use File::Copy;
use File::Find;
print "Enter the path: ";
my $filepath = <STDIN>;
chomp $filepath;
die "\n\tpleas give input folder \n" if(!defined $filepath or !-d $filepath);
my $Toolpath = dirname($0);
my $base = basename($filepath);
my $base_path = dirname($filepath);
my ($xmlF, #xmlF);
my #errors=();
my #warnings=();
my #checkings=();
my $ecount=0;
my $wcount=0;
my $ccount=0;
my ($x, $y);
my $z="0";
opendir(DIR,"$filepath");
my #xmlFiles = grep{/\.xml$/} readdir(DIR);
closedir(DIR);
my $logfile = "$base_path\\$base"."_Err.log";
# #xmlF=sort{$a <=> $b}#xmlFiles;
#xmlF=sort{$a cmp $b}#xmlFiles;
open(OUT, ">$logfile") || die ("\nLog file couldnt write $logfile :$!");
my $line;
my $flcnt = scalar (#xmlF);
for ($x=0; $x < $flcnt; $x++)
{
open IN, "$xmlF[$x]" or die "not Open";
print OUT "\n".$xmlF[$x]."\n==================\n";
print "\nProcessing File $xmlF[$x] .....\n";
local $/;
while ($line=<IN>)
{
while ($line=~m#(<res(?: [^>]+)? type="weblink"[^>]*>)((?:(?!</res>).)*)</res>#igs)
{
my $tmp1 = $1; my $tmp2 = $&; my $pre1 = $`;
if($tmp1 =~ m{ subgroup="Weblink"}i){
my $pre = $pre1.$`;
if($tmp2 !~ m{<tooltip><\!\[CDATA\[Weblink\]\]><\/tooltip>}ms){
my $pre = $pre1.$`;
push(#errors,lineno($pre),"\t<tooltip><\!\[CDATA\[Weblink\]\]></tooltip> is missing\n");
}
}
}
foreach my $warnings(#warnings)
{
$wcount = $wcount+1;
}
foreach my $checkings(#checkings)
{
$ccount = $ccount+1;
}
foreach my $errors(#errors)
{
$ecount = $ecount+1;
}
my $count_err = $ecount/2;
print OUT "".$count_err." Error(s) Found:-\n------------------------\n ";
print OUT "#errors\n";
$ecount = 0;
my $count_war = $wcount/2;
print OUT "$count_war Warning(s) Found:-\n-------------------------\n ";
print OUT "#warnings\n";
$wcount = 0;
my $count_check = $ccount/2;
print OUT "$count_check Checking(s) Found:-\n-------------------------\n ";
print OUT "#checkings\n";
$wcount = 0;
undef #errors;
undef #warnings;
undef #checkings;
close IN;
}
}
The readdir returns bare file names, without the path.
So when you go ahead to open those files you need to prepend the names returned by readdir with the name of the directory the readdir read them from, here $filepath. Or build the full path names right away
use warnings;
use strict;
use feature 'say';
use File::Spec;
print "Enter the path: ";
my $filepath = <STDIN>;
chomp $filepath;
die "\nPlease give input folder\n" if !defined $filepath or !-d $filepath;
opendir(my $fh_dir, $filepath) or die "Can't opendir $filepath: $!";
my #xml_files =
map { File::Spec->catfile($filepath, $_) }
grep { /\.xml$/ }
readdir $fh_dir;
closedir $fh_dir;
say for #xml_files;
where I used File::Spec to portably piece together the file name.
The map can be made to also do grep's job so to make only one pass over the file list
my #xml_files =
map { /\.xml$/ ? File::Spec->catfile($filepath, $_) : () }
readdir $fh_dir;
The empty list () gets flattened in the returned list, effectively disappearing altogether.
Here are some comments on the code. Note that this is normally done at Code Review but I feel that it is needed here.
First: a long list of variables is declared upfront. It is in fact important to declare in as small a scope as possible. It turns out that most of those variables can indeed be declared where they are used, as seen in comments below.
The location of the executable is best found using
use FindBin qw($RealBin);
where $RealBin also resolves links (as opposed to $Bin, also available)
Assigning () to an array at declaration doesn't do anything; it is exactly the same as normal my #errors;. They can also go together, my (#errors, #warnings, #checks);. If the array has something then = () clears it, what is a good way to empty an array
Assigning a "0" makes the variable a string. While Perl normally converts between strings and numbers as needed, if a number is needed then use a number, my $z = 0;
Lexical filehandles (open my $fh, ...) are better than globs (open FH, ...)
I don't understand the comment about "numeric error" in sorting. The cmp operator sorts lexicographically, for numeric sort use <=>
When array is used in scalar context – when assigned to a scalar for example – the number of elements is returned. So no need for scalar but do my flcnt = #xmlF;
For iteration over array indices use $#ary, the index of the last element of #ary, for
foreach my $i (0..$#xmlF) { ... }
But if there aren't any uses of the index (I don't see any) then loop over elements
foreach my $file (#xmlF) { ... }
When you check the file open print the error $!, open ... or die "... : $!";. This is done elsewhere in the code, and it should be done always.
The local $/; unsets the input record separator, what makes the following read take the whole file. If that is intended then $line is not a good name. Also note that a variable can be declared inside the condition, while (my $line = <$fh>) { }
I can't comment on the regex as I don't know what it's supposed to accomplish, but it is complex; any chance to simplify all that?
The series of foreach loops only works out the number of elements of those arrays; there is no need for loops then, just my $ecount = #errors; (etc). This also allows you to keep the declaration of those counter variables in minimal scope.
The undef #errors; (etc) aren't needed since those arrays count for each file and so you can declare them inside the loops, anew at each iteration (and at smallest scope). When you wish to empty an array it is better to do #ary = (); than to undef it; that way it's not allocated all over again on the next use

Unable to redirect the output of the system command to a file named error.log and stderr to another file named test_file.errorlog

This perl script is traversing all directories and sub directories, searching for a file named RUN in it. Then it opens the file and runs the 1st line written in the file. The problem is that I am not able to redirect the output of the system command to a file named error.log and STDERR to another file named test_file.errorlog, but no such file is created.
Note that all variable are declared if not found.
find (\&pickup_run,$path_to_search);
### Subroutine for extracting path of directories with RUN FILE PRESENT
sub pickup_run {
if ($File::Find::name =~/RUN/) {
### If RUN file is present , push it into array named run_file_present
push(#run_file_present,$File::Find::name);
}
}
###### Iterate over the array containing paths to directories containing RUN files one by one
foreach my $var (#run_file_present) {
$var =~ s/\//\\/g;
($path_minus_run=$var) =~ s/RUN\b//;
#print "$path_minus_run\n";
my $test_case_name;
($test_case_name=$path_minus_run) =~ s/expression to be replced//g;
chdir "$path_minus_run";
########While iterating over the paths, open each file
open data, "$var";
#####Run the first two lines containing commands
my #lines = <data>;
my $return_code=system (" $lines[0] >error.log 2>test_file.errorlog");
if($return_code) {
print "$test_case_name \t \t FAIL \n";
}
else {
print "$test_case_name \t \t PASS \n";
}
close (data);
}
The problem is almost certainly that $lines[0] has a newline at the end after being read from the file
But there are several improvements you could make
Always use strict and use warnings at the top of every Perl program, and declare all your variables using my as close as possible to their first point of use
Use the three-parameter form of open and always check whether it succeeded, putting the built-in variable $! into your die string to say why it failed. You can also use autodie to save writing the code for this manually for every open, but it requires Perl v5.10.1 or better
You shouldn't put quotes around scalar variables -- just used them as they are. so chdir $path_minus_run and open data, $var are correct
There is also no need to save all the files to be processed and deal with them later. Within the wanted subroutine, File::Find sets you up with $File::Find::dir set to the directory containing the file, and $_ set to the bare file name without a path. It also does a chdir to the directory for you, so the context is ideal for processing the file
use strict;
use warnings;
use v5.10.1;
use autodie;
use File::Find;
my $path_to_search;
find( \&pickup_run, $path_to_search );
sub pickup_run {
return unless -f and $_ eq 'RUN';
my $cmd = do {
open my $fh, '<', $_;
<$fh>;
};
chomp $cmd;
( my $test_name = $File::Find::dir ) =~ s/expression to be replaced//g;
my $retcode = system( "$cmd >error.log 2>test_file.errorlog" );
printf "%s\t\t%s\n", $test_name, $retcode ? 'FAIL' : 'PASS';
}

Validate perl input - filter out inexistent files

I have a perl script to which i supply input(text file) from batch or sometimes from command prompt. When i supply input from batch file sometimes the file may not exisits. I want to catch the No such file exists error and do some other task when this error is thrown. Please find the below sample code.
while(<>) //here it throws an error when file doesn't exists.
{
#parse the file.
}
#if error is thrown i want to handle that error and do some other task.
Filter #ARGV before you use <>:
#ARGV = grep {-e $_} #ARGV;
if(scalar(#ARGV)==0) die('no files');
# now carry on, if we've got here there is something to do with files that exist
while(<>) {
#...
}
<> reads from the files listed in #ARGV, so if we filter that before it gets there, it won't try to read non-existant files. I've added the check for the size of #ARGV because if you supply a list files which are all absent, it will wait on stdin (the flipside of using <>). This assumes that you don't want to do that.
However, if you don't want to read from stdin, <> is probably a bad choice; you might as well step through the list of files in #ARGV. If you do want the option of reading from stdin, then you need to know which mode you're in:
$have_files = scalar(#ARGV);
#ARGV = grep {-e $_} #ARGV;
if($have_files && scalar(grep {defined $_} #ARGV)==0) die('no files');
# now carry on, if we've got here there is something to do;
# have files that exist or expecting stdin
while(<>) {
#...
}
The diamond operator <> means:
Look at the names in #ARGV and treat them as files you want to open.
Just loop through all of them, as if they were one big file.
Actually, Perl uses the ARGV filehandle for this purpose
If no command line arguments are given, use STDIN instead.
So if a file doesn't exist, Perl gives you an error message (Can't open nonexistant_file: ...) and continues with the next file. This is what you usually want. If this is not the case, just do it manually. Stolen from the perlop page:
unshift(#ARGV, '-') unless #ARGV;
FILE: while ($ARGV = shift) {
open(ARGV, $ARGV);
LINE: while (<ARGV>) {
... # code for each line
}
}
The open function returns a false value when a problem is encountered. So always invoke open like
open my $filehandle "<", $filename or die "Can't open $filename: $!";
The $! contains a reason for the failure. Instead of dieing, we can do some other error recovery:
use feature qw(say);
#ARGV or #ARGV = "-"; # the - symbolizes STDIN
FILE: while (my $filename = shift #ARGV) {
my $filehandle;
unless (open $filehandle, "<", $filename) {
say qq(Oh dear, I can't open "$filename". What do you wan't me to do?);
my $tries = 5;
do {
say qq(Type "q" to quit, or "n" for the next file);
my $response = <STDIN>;
exit if $response =~ /^q/i;
next FILE if $response =~ /^n/i;
say "I have no idea what that meant.";
} while --$tries;
say "I give up" and exit!!1;
}
LINE: while (my $line = <$filehandle>) {
# do something with $line
}
}

Can I read and write to multiple filehandles simultaneously (Perl)?

I'm trying to read from two files, and generate output in a third. I first wanted to edit the first one on the go but I didn't find a suitable method save for arrays.
My problem is that the third file (output) is empty whenever I uncomment the "_ref_param_handling" function. BUT the following is what puzzles me the most: If I do a UNIX very basic `cat` system call on the output file at then end (see code below), it works just fine. If I open the filehandle just before and close it right after editing, it also works fine (around my print FILEHANDLE LIST).
I undoubtedly am missing something here. Apart from a problem between my keyboard and my chair, what is it? A filehandle conflict? A scope problem?
Every variable is declared and has the value I want it to have.
Edit (not applicable anymore).
Using IO::File on the three files didn't change anything.
Edit 2 : New full subroutine code
My code works (except when my ref already exists, but that's because of the "append" mode i think) but there might be some mistakes and unperlish ways of coding (sorry, Monks). I, however, use Strict and warnings !
sub _ref_edit($) {
my $manda_def = "$dir/manda_def.list";
my $newrefhandle;
my $ref = $_[0];
(my $refout = $ref) =~ s/empty//;
my $refhandle;
my $parname = '';
my $parvalue = '';
my #val;
_printMan;
my $flush = readline STDIN; # Wait for <enter>
# If one or both of the ref. and the default values are missing
if ( !( -e $manda_def && -e $ref ) ) {
die "Cannot find $ref and/or $manda_def";
}
# Open needed files (ref & default)
open( $refhandle, "<", $ref ) or die "Cannot open ref $ref : $!";
open( $newrefhandle, ">>", $refout )
or die "Cannot open new ref $refout : $!";
# Read each line
while ( my $refline = <$refhandle> ) {
# If line read not an editable macro
if ( $refline =~ /^define\({{(.+)}},\s+{{.*__VALUE__.*}}\)/ ){
$parname = $1; # $1 = parameter name captured in regexp
# Prompt user
$parvalue = _ref_param_handling( $parname, $manda_def );
# Substitution in ref
$refline =~ s/__VALUE__/$parvalue/;
# Param not specified and no default value
$parvalue eq '' ? $refline=~s/__COM__/#/ : $refline=~s/__COM__//;
}
print $newrefhandle $refline;
}
close $newrefhandle;
close $refhandle;
return $refout;
} # End ref edit
the _ref_param_handle subroutine still is :
open( $mde, '<', $_[1] )
or die "Cannot open mandatory/default list $_[1] : $!";
# Read default/mandatory file list
while (<$mde>) {
( $name, $manda, $default, $match, $descript ) = split( /\s+/, $_, 5 );
next if ( $name !~ $ref_param ); # If param read differs from parname
(SOME IF/ELSE)
} # End while <MDE>
close $mde;
return $input;
}
Extract from manda_def file :
NAME Mandatory? Default Match Comm.
PORT y NULL ^\d+$ Database port
PROJECT y NULL \w{1,5} Project name
SERVER y NULL \w+ Server name
modemRouting n NULL .+
modlib y bin .+
modules y sms .+
Extract from ref_file :
define({{PORT}}, {{__VALUE__}})dnl
define({{PROJECT}}, {{__VALUE__}})dnl
define({{SERVER}}, {{__VALUE__}})dnl
define({{modemRouting}}, {{__COM__{{$0}} '__VALUE__'}})dnl
define({{modlib}}, {{__COM__{{$0}} '__VALUE__'}})dnl
define({{modules}}, {{__COM__{{$0}} '__VALUE__'}})dnl
Any help appreciated.
It is unclear what is initialising $refhandle, $newrefhandle and $mde. Depending on the values they have will affect the behaviour of open - i.e. whether it will close any filehandles before opening a new one.
I would suggest that you start using the IO::File interface to open/write to files, as this makes the job of filehandle management much easier, and will avoid any inadvertent closes. Something like...
use IO::File;
my $refhandle = IO::File->new("< $ref") or die "open() - $!";
$refhandle->print(...);
As far as editing files in place goes, this is a common pattern I use to achieve this, make sure of the -i behaviour of perl.
sub edit_file
{
my ($filename) = #_;
# you can re-create the one-liner above by localizing #ARGV as the list of
# files the <> will process, and localizing $^I as the name of the backup file.
local (#ARGV) = ($filename);
local($^I) = '.bak';
while (<>)
{
s/original string/new string/g;
}
continue
{
print;
}
}
try opening the second file handle for input outside the loop and pass a reference to the subroutine _ref_param_handle.Use seek function to seek file back to start.If your file is not too large you can also think of storing the content in an array and the accessing it instead of looping over same contents.
EDIT:
Here is a small example to support what I was trying to say above:
#!/usr/bin/perl -w
sub test
{
my $fh_to_read = $_[0] ;
my $fh_to_write = $_[1] ;
while(<$fh_to_read>)
{
print $fh_to_write $_ ;
}
seek($fh_to_read,0,0) ;
}
open(FH1,"<dummy1");
open(FH2,"<dummy2");
open(FH3,">dummy3");
while(<FH2>)
{
print FH3 "$_" ;
test(\*FH1,\*FH3);
}
Info about perl references
From what I gather, your script wants to convert a file in the following form:
define({{VAR1}}, {{__VALUE__}})
define({{VAR2}}, {{__VALUE__}})
define({{VAR3}}, {{__VALUE__}})
define({{VAR4}}, {{__VALUE__}})
to something like this:
define({{VAR1}}, {{}})
define({{VAR2}}, {{VALUE2}})
define({{VAR3}}, {{VALUE3}})
define({{VAR4}}, {{}})
The following works. I don't know what manda_def means, and also I didn't bother to create an actual variable replacement function.
#!/usr/bin/perl
use strict;
use warnings;
sub work {
my ($ref, $newref, $manda_def) = #_;
# Open needed files (ref & default)
open(my $refhandle, '<', $ref) or die "Cannot open ref $ref : $!";
open(my $newrefhandle, '>', $newref) or die "Cannot open new ref $newref: $!";
# Read each line
while (my $refline = <$refhandle>) {
# if line read is not an editable macro
if ($refline =~ /^define\({{(.+)}},\s+{{.*__VALUE__.*}}\)/){
my $parvalue = _ref_param_handling($1, $manda_def); # manda_def?
# Substitution in ref
$refline =~ s/__VALUE__/$parvalue/;
# Param not specified and no default value
$refline =~ s/__COM__/#/ if $parvalue eq '';
}
print $newrefhandle $refline;
}
close $newrefhandle;
close $refhandle;
return $newref;
}
sub _ref_param_handling {
my %parms = (VAR2 => 'VALUE2', VAR3 => 'VALUE3');
return $parms{$_[0]} if exists $parms{$_[0]};
}
work('ref.txt', 'newref.txt', 'manda.txt');
Guys, I seriously consider hanging myself with my wireless mouse.
My script never failed. I just didn't ran it through the end (it's actually a very long parameter list). The printing is just done as soon as the filehandle is closed (or so I guessed)...
/me *cries*
I've spent 24 hours on this...