I've created a script for validating xml files after given input folder. It should grep xml files from the input directory then sort out the xml files and check the condition. But it throws a command that not Open at line , <STDIN> line 1.
But it creates an empty log file.
Since i faced numeric error while sorting, comment that.
so i need to be given input location, the script should check the xml files and throw errors in a mentioned log file.
Anyone can help this?
Script
#!/usr/bin/perl
# use strict;
use warnings;
use Cwd;
use File::Basename;
use File::Path;
use File::Copy;
use File::Find;
print "Enter the path: ";
my $filepath = <STDIN>;
chomp $filepath;
die "\n\tpleas give input folder \n" if(!defined $filepath or !-d $filepath);
my $Toolpath = dirname($0);
my $base = basename($filepath);
my $base_path = dirname($filepath);
my ($xmlF, #xmlF);
my #errors=();
my #warnings=();
my #checkings=();
my $ecount=0;
my $wcount=0;
my $ccount=0;
my ($x, $y);
my $z="0";
opendir(DIR,"$filepath");
my #xmlFiles = grep{/\.xml$/} readdir(DIR);
closedir(DIR);
my $logfile = "$base_path\\$base"."_Err.log";
# #xmlF=sort{$a <=> $b}#xmlFiles;
#xmlF=sort{$a cmp $b}#xmlFiles;
open(OUT, ">$logfile") || die ("\nLog file couldnt write $logfile :$!");
my $line;
my $flcnt = scalar (#xmlF);
for ($x=0; $x < $flcnt; $x++)
{
open IN, "$xmlF[$x]" or die "not Open";
print OUT "\n".$xmlF[$x]."\n==================\n";
print "\nProcessing File $xmlF[$x] .....\n";
local $/;
while ($line=<IN>)
{
while ($line=~m#(<res(?: [^>]+)? type="weblink"[^>]*>)((?:(?!</res>).)*)</res>#igs)
{
my $tmp1 = $1; my $tmp2 = $&; my $pre1 = $`;
if($tmp1 =~ m{ subgroup="Weblink"}i){
my $pre = $pre1.$`;
if($tmp2 !~ m{<tooltip><\!\[CDATA\[Weblink\]\]><\/tooltip>}ms){
my $pre = $pre1.$`;
push(#errors,lineno($pre),"\t<tooltip><\!\[CDATA\[Weblink\]\]></tooltip> is missing\n");
}
}
}
foreach my $warnings(#warnings)
{
$wcount = $wcount+1;
}
foreach my $checkings(#checkings)
{
$ccount = $ccount+1;
}
foreach my $errors(#errors)
{
$ecount = $ecount+1;
}
my $count_err = $ecount/2;
print OUT "".$count_err." Error(s) Found:-\n------------------------\n ";
print OUT "#errors\n";
$ecount = 0;
my $count_war = $wcount/2;
print OUT "$count_war Warning(s) Found:-\n-------------------------\n ";
print OUT "#warnings\n";
$wcount = 0;
my $count_check = $ccount/2;
print OUT "$count_check Checking(s) Found:-\n-------------------------\n ";
print OUT "#checkings\n";
$wcount = 0;
undef #errors;
undef #warnings;
undef #checkings;
close IN;
}
}
The readdir returns bare file names, without the path.
So when you go ahead to open those files you need to prepend the names returned by readdir with the name of the directory the readdir read them from, here $filepath. Or build the full path names right away
use warnings;
use strict;
use feature 'say';
use File::Spec;
print "Enter the path: ";
my $filepath = <STDIN>;
chomp $filepath;
die "\nPlease give input folder\n" if !defined $filepath or !-d $filepath;
opendir(my $fh_dir, $filepath) or die "Can't opendir $filepath: $!";
my #xml_files =
map { File::Spec->catfile($filepath, $_) }
grep { /\.xml$/ }
readdir $fh_dir;
closedir $fh_dir;
say for #xml_files;
where I used File::Spec to portably piece together the file name.
The map can be made to also do grep's job so to make only one pass over the file list
my #xml_files =
map { /\.xml$/ ? File::Spec->catfile($filepath, $_) : () }
readdir $fh_dir;
The empty list () gets flattened in the returned list, effectively disappearing altogether.
Here are some comments on the code. Note that this is normally done at Code Review but I feel that it is needed here.
First: a long list of variables is declared upfront. It is in fact important to declare in as small a scope as possible. It turns out that most of those variables can indeed be declared where they are used, as seen in comments below.
The location of the executable is best found using
use FindBin qw($RealBin);
where $RealBin also resolves links (as opposed to $Bin, also available)
Assigning () to an array at declaration doesn't do anything; it is exactly the same as normal my #errors;. They can also go together, my (#errors, #warnings, #checks);. If the array has something then = () clears it, what is a good way to empty an array
Assigning a "0" makes the variable a string. While Perl normally converts between strings and numbers as needed, if a number is needed then use a number, my $z = 0;
Lexical filehandles (open my $fh, ...) are better than globs (open FH, ...)
I don't understand the comment about "numeric error" in sorting. The cmp operator sorts lexicographically, for numeric sort use <=>
When array is used in scalar context – when assigned to a scalar for example – the number of elements is returned. So no need for scalar but do my flcnt = #xmlF;
For iteration over array indices use $#ary, the index of the last element of #ary, for
foreach my $i (0..$#xmlF) { ... }
But if there aren't any uses of the index (I don't see any) then loop over elements
foreach my $file (#xmlF) { ... }
When you check the file open print the error $!, open ... or die "... : $!";. This is done elsewhere in the code, and it should be done always.
The local $/; unsets the input record separator, what makes the following read take the whole file. If that is intended then $line is not a good name. Also note that a variable can be declared inside the condition, while (my $line = <$fh>) { }
I can't comment on the regex as I don't know what it's supposed to accomplish, but it is complex; any chance to simplify all that?
The series of foreach loops only works out the number of elements of those arrays; there is no need for loops then, just my $ecount = #errors; (etc). This also allows you to keep the declaration of those counter variables in minimal scope.
The undef #errors; (etc) aren't needed since those arrays count for each file and so you can declare them inside the loops, anew at each iteration (and at smallest scope). When you wish to empty an array it is better to do #ary = (); than to undef it; that way it's not allocated all over again on the next use
Related
I'm having some problem with a subroutine that locates certain files and extracts some data out of them.
This subroutine is called inside a foreach loop, but whenever the call is made the loop skips to its next iteration. So I am wondering whether any of the next;'s are somehow escaping from the subroutine to the foreach loop where it is called?
To my knowledge the sub looks solid though so I'm hoping if anyone can see something I'm missing?
sub FindKit{
opendir(DH, "$FindBin::Bin\\data");
my #kitfiles = readdir(DH);
closedir(DH);
my $nametosearch = $_[0];
my $numr = 1;
foreach my $kitfile (#kitfiles)
{
# skip . and .. and Thumbs.db and non-K-files
if($kitfile =~ /^\.$/) {shift #kitfiles; next;}
if($kitfile =~ /^\.\.$/) {shift #kitfiles; next;}
if($kitfile =~ /Thumbs\.db/) {shift #kitfiles; next;}
if($kitfile =~ /^[^K]/) {shift #kitfiles; next;}
# $kitfile is the file used on this iteration of the loop
open (my $fhkits,"<","data\\$kitfile") or die "$!";
while (<$fhkits>) {}
if ($. <= 1) {
print " Empty File!";
next;
}
seek($fhkits,0,0);
while (my $kitrow = <$fhkits>) {
if ($. == 0 && $kitrow =~ /Maakartikel :\s*(\S+)\s+Montagekit.*?($nametosearch)\s{3,}/g) {
close $fhkits;
return $1;
}
}
$numr++;
close $fhkits;
}
return 0;
}
To summarize comments, the refactored code:
use File::Glob ':bsd_glob';
sub FindKit {
my $nametosearch = $_[0];
my #kitfiles = glob "$FindBin::Bin/data/K*"; # files that start with K
foreach my $kitfile (#kitfiles)
{
open my $fhkits, '<', $kitfile or die "$!";
my $kitrow_first_line = <$fhkits>; # read first line
return if eof; # next read is end-of-file so it was just header
my ($result) = $kitrow_first_line =~
/Maakartikel :\s*(\S+)\s+Montagekit.*?($nametosearch)\s{3,}/;
return $result if $result;
}
return 0;
}
I use core File::Glob and enable :bsd_glob option, which can handle spaces in filenames. I follow the docs note to use "real slash" on Win32 systems.
I check whether there is only a header line using eof.†
I do not see how this can affect the calling code, other than by its return value. Also, I don't see how the posted code can make the caller skip the beat, either. That problem is unlikely to be in this sub.
Please let me know if I missed some point with the above rewrite.
† Previous version used to check whether there is just one (header) line by
1 while <$fhkits>; # check number of lines ...
return if $. == 1; # there was only one line, the header
Also correct but eof is way better
The thing that is almost certainly screwing you here, is that you are shifting the list that you are iterating.
That's bad news, as you're deleting elements ... but in places you aren't necessarily thinking.
For example:
#!/usr/bin/env perl
use strict;
use warnings;
my #list = qw ( one two three );
my $count;
foreach my $value ( #list ) {
print "Iteration ", ++$count," value is $value\n";
if ( $value eq 'two' ) { shift #list; next };
}
print "#list";
How many times do you think that should iterate, and which values should end up in the array?
Because you shift you never process element 'three' and you delete element 'one'. That's almost certainly what's causing you problems.
You also:
open using a relative path, when your opendir used an absolute one.
skip a bunch of files, and then skip anything that doesn't start with K. Why not just search for things that do start with K?
read the file twice, and one is to just check if it's empty. The perl file test -z will do this just fine.
you set $kitrow for each line in the file, but don't really use it for anything other than pattern matching. It'd probably work better using implicit variables.
You only actually do anything on the first line - so you don't ever need to iterate the whole file. ($numr seems to be discarded).
you use a global match, but only use one result. The g flag seems redundant here.
I'd suggest a big rewrite, and do something like this:
#!/usr/bin/env perl
use strict;
use warnings;
use FindBin;
sub FindKit{
my ($nametosearch) = #_;
my $numr = 1;
foreach my $kitfile (glob "$FindBin::Bin\\data\\K*" )
{
if ( -z $kitfile ) {
print "$kitfile is empty\n";
next;
}
# $kitfile is the file used on this iteration of the loop
open (my $fhkits,"<", $kitfile) or die "$!";
<$kitfile> =~ m/Maakartikel :\s*(\S+)\s+Montagekit.*?($nametosearch)\s{3,}/
and return $1;
return 0;
}
}
As a big fan of the Path::Tiny module (me have it always installed and using it in every project) my solution would be:
use strict;
use warnings;
use Path::Tiny;
my $found = FindKit('mykit');
print "$found\n";
sub FindKit {
my($nametosearch) = #_;
my $datadir = path($0)->realpath->parent->child('data');
die "$datadir doesn't exists" unless -d $datadir;
for my $file ($datadir->children( qr /^K/ )) {
next if -z $file; #skip empty
my #lines = $file->lines;
return $1 if $lines[0] =~ /Maakartikel :\s*(\S+)\s+Montagekit.*?($nametosearch)\s{3,}/;
}
return;
}
Some comments and still opened issues:
Using the Path::Tiny you could always use forward slashes in the path-names, regardless of the OS (UNIX/Windows), e.g. the data/file will work on windows too.
AFAIK the FindBin is considered broken - so the above uses the $0 and realpath ...
what if the Kit is in multiple files? The above always returns on the 1st found one
the my #lines = $file->lines; reads all lines - unnecessary - but on small files doesn't big deal.
the the reality this function returns the arg for the Maakartikel, so probably better name would be find_articel_by_kit or find_articel :)
easy to switch to utf8 - just change the $file->lines to $file->lines_utf8;
I wrote a perl script to count the occurrences of a character in a file.
So far this is what I have got,
#!/usr/bin/perl -w
use warnings;
no warnings ('uninitialized', 'substr');
my $lines_ref;
my #lines;
my $count;
sub countModule()
{
my $file = "/test";
open my $fh, "<",$file or die "could not open $file: $!";
my #contents = $fh;
my #filtered = grep (/\// ,#contents);
return \#filtered;
}
#lines = countModule();
##lines = $lines_ref;
$count = #lines;
print "###########\n $count \n###########\n";
My test file looks like this:
10.0.0.1/24
192.168.10.0/24
172.16.30.1/24
I am basically trying to count the number of instances of "/"
This is the output that I get:
###########
1
###########
I am getting 1 instead of 3, which is the number of occurrences.
Still learning perl, so any help will be appreciated..Thank you!!
Here are a few points about your code
You should always use strict at the top of your program, and only use no warnings for special reasons in a limited scope. There is no general reason why a working Perl program should need to disable warnings globally
Declare your variables close to their first point of use. The style of declaring everything at the top of the file is unnecessary and is a legacy of C
Never use prototypes in your code. They are available for very special purposes and shouldn't be used for the vast majority of Perl code. sub countModule() { ... } insists that countModule may never be called with any parameters and isn't necessary or useful. The definition should be just sub countModule { ... }
A big well done! for using a lexical file handle, the three-parameter form of open, and putting $! in your die string
my #contents = $fh will just set #contents to a single-element list containing just the filehandle. To read the whole file into the array you need my #contents = <$fh>
You can avoid escaping slashes in a regular expression if you use a different delimiter. To do that you need to use the m operator explicitly, like my #filtered = grep m|/|, #contents)
You return an array reference but assign the returned value to an array, so #lines = countModule() sets #lines to a single-element list containing just the array reference. You should either return a list with return #filtered or dereference the return value on assignment with #lines = #{ countModule }
If all you need to do is to print the number of lines in the file that contain a slash character then you could write something like this
use strict;
use warnings;
my $count;
sub countModule {
open my $fh, '<', '/test' or die "Could not open $file: $!";
return [ grep m|/|, <$fh> ];
}
my $lines = countModule;
$count = #$lines;
print "###########\n $count \n###########\n";
Close, but a few issues:
use strict;
use warnings;
sub countModule
{
my $file = "/test";
open my $fh, "<",$file or die "could not open $file: $!";
my #contents = <$fh>; # The <> brackets are used to read from $fh.
my #filtered = grep (/\// ,#contents);
return #filtered; # Remove the reference.
}
my #lines = countModule();
my $count = scalar #lines; # 'scalar' is not required, but lends clarity.
print "###########\n $count \n###########\n";
Each of the changes I made to your code are annotated with a #comment explaining what was done.
Now in list context your subroutine will return the filtered lines. In scalar context it will return a count of how many lines were filtered.
You did also mention find the occurrences of a character (despite everything in your script being line-oriented). Perhaps your counter sub would look like this:
sub file_tallies{
my $file = '/test';
open my $fh, '<', $file or die $!;
my $count;
my $lines;
while( <$fh> ) {
$lines++;
$count += $_ =~ tr[\/][\/];
}
return ( $lines, $count );
}
my( $line_count, $slash_count ) = file_tallies();
In list context,
return \#filtered;
returns a list with one element -- a reference to the named array #filtered. Maybe you wanted to return the list itself
return #filtered;
Here's some simpler code:
sub countMatches {
my ($file, $c) = #_; # Pass parameters
local $/;
undef $/; # Slurp input
open my $fh, "<",$file or die "could not open $file: $!";
my $s = <$fh>; # The <> brackets are used to read from $fh.
close $fh;
my $ptn = quotemeta($c); # So we can match strings like ".*" verbatim
my #hits = $s =~ m/($ptn)/g;
0 + #hits
}
print countMatches ("/test", '/') . "\n";
The code pushes Perl beyond the very basics, but not too much. Salient points:
By undeffing $/ you can read the input into one string. If you're counting
occurrences of a string in a file, and not occurrences of lines that contain
the string, this is usually easier to do.
m/(...)/g will find all the hits, but if you want to count strings like
"." you need to quote the meta characters in them.
Store the results in an array to evaluate m// in list context
Adding 0 to a list gives the number of items in it.
I have a list of words and I want to group them into different groups depending on whether they are verbs/adjectives/nouns/etc. So, basically I am looking for a Perl module which tells whether a word is verb/noun etc.
I googled but couldn't find what I was looking for. Thanks.
Lingua::EN::Tagger, Lingua::EN::Semtags::Engine, Lingua::EN::NamedEntity
See the Lingua::EN:: namespace in CPAN. Specifically, Link Grammar and perhaps Lingua::EN::Tagger can help you. Also WordNet provides that kind of information and you can query it using this perl module.
follow code perl help you to find all this thing in your text file in your folder only give the path of directory and it will process all file at once and save result in report.txt file strong text
#!/usr/local/bin/perl
# for loop execution
# Perl Program to calculate Factorial
sub fact
{
# Retriving the first argument
# passed with function calling
my $x = $_[0];
my #names = #{$_[1]};
my $length = $_[2];
# checking if that value is 0 or 1
if ($x < $length)
{
#print #names[$x],"\n";
use Lingua::EN::Fathom;
my $text = Lingua::EN::Fathom->new();
# Analyse contents of a text file
$dirlocation="./2015/";
$path =$dirlocation.$names[$x];
$text->analyse_file($path); # Analyse contents of a text file
$accumulate = 1;
# Analyse contents of a text string
$text->analyse_block($text_string,$accumulate);
# TO Do, remove repetition
$num_chars = $text->num_chars;
$num_words = $text->num_words;
$percent_complex_words = $text->percent_complex_words;
$num_sentences = $text->num_sentences;
$num_text_lines = $text->num_text_lines;
$num_blank_lines = $text->num_blank_lines;
$num_paragraphs = $text->num_paragraphs;
$syllables_per_word = $text->syllables_per_word;
$words_per_sentence = $text->words_per_sentence;
# comment needed
%words = $text->unique_words;
foreach $word ( sort keys %words )
{
# print("$words{$word} :$word\n");
}
$fog = $text->fog;
$flesch = $text->flesch;
$kincaid = $text->kincaid;
use strict;
use warnings;
use 5.010;
my $filename = 'report.txt';
open(my $fh, '>>', $filename) or die "Could not open file '$filename' $!";
say $fh $text->report;
close $fh;
say 'done';
print($text->report);
$x = $x+1;
fact($x,\#names,$length);
}
# Recursively calling function with the next value
# which is one less than current one
else
{
done();
}
}
# Driver Code
$a = 0;
#names = ("John Paul", "Lisa", "Kumar","touqeer");
opendir DIR1, "./2015" or die "cannot open dir: $!";
my #default_files= grep { ! /^\.\.?$/ } readdir DIR1;
$length = scalar #default_files;
print $length;
# Function call and printing result after return
fact($a,\#default_files,$length);
sub done
{
print "Done!";
}
I have been trying to get rid of a weird bug for hours, with no success. I have a subroutine that sorts a file. here is the code:
sub sort_file {
$filename = #_;
print #_;
print $filename;
open(SRTINFILE,"<$filename");
#lines=<SRTINFILE>;
close(SRTINFILE);
open(SRTOUTFILE,">$filename");
#sorted = sort { #aa=split(/ /,$a); #bb=split(/ /,$b); return ($aa[1] <=> $bb[1]); } #lines;
print SRTOUTFILE #sorted;
close(SRTOUTFILE);
}
any time this function is run, perl creates a file, called "1". i have no idea why. I am a complete perl noob and am just using it for quick and dirty text file processing. anyone know what's wrong?
An array in scalar context evalutes to the number of elements in the array. If you pass one argument to the function, the following assigns 1 to $filename.
$filename = #_;
You want any of the following:
$filename = $_[0];
$filename = shift;
($filename) = #_;
Furthermore, you want to limit the scope of the variable to the function, so you want
my $filename = $_[0];
my $filename = shift;
my ($filename) = #_;
(my $filename) = #_; # Exact same as previous.
The other answers are sufficient to tell you why you were getting strange errors.
I would like to show you how a more experienced Perl programmer might write this subroutine.
use warnings;
use strict;
use autodie;
sub sort_file {
my( $filename ) = #_;
my #lines;
{
# 3 arg open
open my $in_fh, '<', $filename;
#lines = <$in_fh>;
close $in_fh;
}
# Schwartzian transform
my #sorted = map{
$_->[0]
} sort {
$a->[2] <=> $b->[2]
} map {
[ $_, split ' ', $_ ]
} #lines;
{
open my $out_fh, '>', $filename;
print {$out_fh} #sorted;
close $out_fh;
}
}
use strict;
prevents you from using a variable without declaring it (among other things).
use warnings;
Informs you of some potential errors.
use autodie;
Now you don't need to write open .... or die ....
{ open ...; #lines = <$fh>; close $fh }
Limits the scope of the FileHandle.
#sorted = map { ... } sort { ... } map { ... } #list
This is an examples of a Schwartzian transform, which reduces the number of times that the values are split. In this example, it may be overkill.
How confusing. Assigning $filename = #_ the way you are means that you are evaluating an array in scalar context, which means that $filename is assigned the number of elements in #_. Because you don't check to see if the first open call succeeds, reading the file 1 likely fails, but you continue anyway and open for writing a file named 1. The solution is to use $filename in an array context and begin your subroutine with ($filename) = #_ or $filename = shift.
Why aren't you using use strict by the way?
Always use:
use strict;
use warnings;
Then Perl will tell you when you're off the mark.
As you've observed, the notation:
$filename = #_;
means that an unscoped variable is assigned the number of elements in the argument list to the function, and since you pass one file, the name of the created file will be '1'.
You meant to write:
my($filename) = #_;
This provides list context for the array, and assigns $_[0] to $filename, ignoring any extra arguments to the function.
OK... nevermind. it just dawned on me. $filename = #_; makes no sense. should be $filename = #_[0]; . There goes 2 hours of my life. note to other perl noobs: beware.
The script below takes function names in a text file and scans on a
folder that contains multiple c,h files. It opens those files one-by-one and
reads each line. If the match is found in any part of the files, it prints the
line number and the line that contains the match.
Everything is working fine except that the comparison is not working properly. I would be very grateful to whoever solves my problem.
#program starts:
use FileHandle;
print "ENTER THE PATH OF THE FILE THAT CONTAINS THE FUNCTIONS THAT YOU WANT TO
SEARCH: ";#getting the input file
our $input_path = <STDIN>;
$input_path =~ s/\s+$//;
open(FILE_R1,'<',"$input_path") || die "File open failed!";
print "ENTER THE PATH OF THE FUNCTION MODEL: ";#getting the folder path that
#contains multiple .c,.h files
our $model_path = <STDIN>;
$model_path =~ s/\s+$//;
our $last_dir = uc(substr ( $model_path,rindex( $model_path, "\\" ) +1 ));
our $output = $last_dir."_FUNC_file_names";
while(our $func_name_input = <FILE_R1> )#$func_name_input is the function name
#that is taken as the input
{
$func_name_input=reverse($func_name_input);
$func_name_input=substr($func_name_input,rindex($func_name_input,"\("+1);
$func_name_input=reverse($func_name_input);
$func_name_input=substr($func_name_input,index($func_name_input," ")+1);
#above 4 lines are func_name_input is choped and only part of the function
#name is taken.
opendir FUNC_MODEL,$model_path;
while (our $file = readdir(FUNC_MODEL))
{
next if($file !~ m/\.(c|h)/i);
find_func($file);
}
close(FUNC_MODEL);
}
sub find_func()
{
my $fh1 = FileHandle->new("$model_path//$file") or die "ERROR: $!";
while (!$fh1->eof())
{
my $func_name = $fh1->getline(); #getting the line
**if($func_name =~$func_name_input)**#problem here it does not take the
#match
{
next if($func_name=~m/^\s+/);
print "$.,$func_name\n";
}
}
}
$func_name_input=substr($func_name_input,rindex($func_name_input,"\("+1);
You're missing an ending parenthesis. Should be:
$func_name_input=substr($func_name_input,rindex($func_name_input,"\(")+1);
There's probably an easier way than those four statements, too. But it's a little early to wrap my head around it all. Do you want to match "foo" in "function foo() {"? If so, you could use a regex like /\s+([^) ]+)/.
When you say $func_name =~$func_name_input, you're treating all characters in $func_name_input as special regex characters. If this is not what you mean to do, you can use quotemeta (perldoc -f quotemeta): $func_name =~quotemeta($func_name_input) or $func_name =~ qr/\Q$func_name_input\E/.
Debugging will be easier with strictures (and a syntax-hilighting editor). Also note that, if you're not using those variables in other files, "our" doesn't do anything "my" wouldn't do for file-scoped variables.
find + xargs + grep does 90% of what you want.
find . -name '*.[c|h]' | xargs grep -n your_pattern
ack does it even easier.
ack --type=cc your_pattern
Simply take your list of patterns from your file and "or" them together.
ack --type=cc 'foo|bar|baz'
This has the benefit of only search the files once, and not once for each pattern being searched for as you're doing.
I still think you should just use ack, but your code needed some serious love.
Here is an improved version of your program. It now takes the directory to search and patterns on the command line rather than having to ask for (and the user write) files. It searches all the files under the directory, not just the ones in the directory, using File::Find. It does this in one pass by concatenating all the patterns into regular expressions. It uses regexes instead of index() and substr() and reverse() and oh god. It simply uses built in filehandles rather than the FileHandle module and checking for eof(). Everything is declared lexical (my) instead of global (our). Strict and warnings are on for easier debugging.
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
die "Usage: search_directory function ...\n" unless #ARGV >= 2;
my $Search_Dir = shift;
my $Pattern = build_pattern(#ARGV);
find(
{
wanted => sub {
return unless $File::Find::name =~ m/\.(c|h)$/i;
find_func($File::Find::name, $pattern);
},
no_chdir => 1,
},
$Search_Dir
);
# Join all the function names into one pattern
sub build_pattern {
my #patterns;
for my $name (#_) {
# Turn foo() into foo. This replaces all that reverse() and rindex()
# and substr() stuff.
$name =~ s{\(.*}{};
# Use \Q to protect against regex metacharacters in the input
push #patterns, qr{\Q$name\E};
}
# Join them up into one pattern.
return join "|", #patterns;
}
sub find_func {
my( $file, $pattern ) = #_;
open(my $fh, "<", $file) or die "Can't open $file: $!";
while (my $line = <$fh>) {
# XXX not all functions are unindented, but your choice
next if $line =~ m/^\s+/;
print "$file:$.: $line" if $line =~ $pattern;
}
}