I am trying to rename existing files to Kernel.txt when they contain "Linux kernel Version" or "USB_STATE=DISCONNECTED". The script runs without any error but produces no output. The renamed file needs to stay in the same folder (F1, F2, F3) as before.
Top dir: Log
SubDir: F1,F2,F3
F1: .bin file,.txt file,.jpg file
F2: .bin file,.txt file,.jpg file
F3: .bin file,.txt file,.jpg file
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Basename;
use File::Spec;
use Cwd;
chdir('C:\\doc\\logs');
my $dir_01 = getcwd;
my $all_file=find ({ 'wanted' => \&renamefile }, $dir_01);
sub renamefile
{
    if ( -f and /.txt?/ )
    {
        my @files = $_;
        foreach my $file (@files)
        {
            open (FILE,"<" ,$file) or die"Can not open the file";
            my @lines = <FILE>;
            close FILE;
            for my $line ( @lines )
            {
                if($line=~ /Linux kernel Version/gi || $line=~ /USB_STATE=DISCONNECTED/gi)
                {
                    my $dirname = dirname($file); # file's directory, so we rename only the file itself.
                    my $file_name = basename($file); # File name for renaming.
                    my $new_file_name = $file_name;
                    $new_file_name =~ s/.* /Kernal.txt/g; # replace the name with Kernal.txt
                    rename($file, File::Spec->catfile($dirname, $new_file_name)) or die $!;
                }
            }
        }
    }
}
This code looks a bit like cargo-cult programming. That is, some constructs are here with no indication that you understand what they do.
chdir('C:\\doc\\logs');
my $dir_01 = getcwd;
Do yourself a favour and use forward slashes, even for Windows pathnames. This is generally supported.
Your directory diagram says that there is a top dir Log, yet you chdir to C:/doc/logs. What is it?
You do realize that $dir_01 is a very nondescriptive name, and is the path you just chdir'd to? Also, File::Find does not require you to start in the working directory. That is, the chdir is a bit useless here. You actually want:
my $start_directory = "C:/doc/Log"; # or whatever
my $all_file=find ({ 'wanted' => \&renamefile }, $dir_01);
I'm not sure what the return value of find would mean. But I'm sure that we don't have to put it into some unused variable.
When we provide key names with the => fat comma, we don't have to manually quote these keys. Therefore:
find({ wanted => \&renamefile }, $start_directory);
/.txt?/
This regex does the following:
match any character (that isn't a newline),
followed by literal tx,
and optionally a t. the ? is a zero-or-one quantifier.
If you want to match filenames that end with .txt, you should do
/\.txt$/
the \. matches a literal period. The $ anchors the regex at the end of the string.
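To see the difference between the two patterns, here is a small sketch that runs both against a few file names. The names are made up for illustration and are not from the question; they were chosen only to expose the false positives.

```perl
use strict;
use warnings;

# Hypothetical file names, chosen to expose false positives.
my @names = qw(log.txt log.txt.bak matx.bin photo.jpg);

# Returns the names matched by the given compiled regex.
sub matching {
    my ($regex, @candidates) = @_;
    return grep { $_ =~ $regex } @candidates;
}

my @loose  = matching(qr/.txt?/,  @names);  # any char, "tx", optional "t"
my @strict = matching(qr/\.txt$/, @names);  # literal ".txt" at the end

print "loose:  @loose\n";
print "strict: @strict\n";
```

The loose pattern also accepts log.txt.bak and matx.bin, while the anchored one keeps only log.txt.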
my @files = $_;
foreach my $file (@files) {
...;
}
This would normally be written as
my $file = $_;
...;
You assign the value of $_ to the @files array, which then has one element: the contents of $_. Then you loop over this one element. Such loops don't deserve to be called loops.
open (FILE,"<" ,$file) or die"Can not open the file";
my @lines = <FILE>;
close FILE;
for my $line ( @lines )
{ ... }
Ah, where to begin?
Use lexical variables for file handles. These have the nice property of closing themselves.
For error handling, use autodie. If you really want to do it yourself, the error message should contain two important pieces of information:
the name of the file you couldn't open ($file)
the reason why the open failed ($!)
That would mean something like ... or die "Can't open $file: $!".
Don't read the whole file into an array and loop over that. Instead, be memory-efficient and iterate over the lines, using a while(<>)-like loop. This only reads one line at a time, which is much better.
Combined, this would look like
use autodie; # at the top
open my $fh, "<", $file;
LINE: while (<$fh>) {
...; # no $line variable, let's use $_ instead
}
Oh, and I labelled the loop (with LINE) for later reference.
if($line=~ /Linux kernel Version/gi || $line=~ /USB_STATE=DISCONNECTED/gi) { ... }
Putting the /g flag on a regex in scalar context turns it into an iterator that resumes from the previous match position. You really don't want that. And I'm not quite sure that case-insensitive matching is really necessary. You can move the || into the regex by using the regex alternation |. As we now use $_ to hold the lines, we don't have to manually bind the regex to a string. Therefore, we can write:
if (/Linux Kernel Version|USB_STATE=DISCONNECTED/i) { ... }
my $dirname = dirname($file); # file's directory, so we rename only the file itself.
my $file_name = basename($file); # File name for renaming.
By default, the original $_, and therefore our $file, contains only the filename, not the directory. This isn't a problem: File::Find has already chdir'd into the correct directory. This makes our processing a lot easier. If you want to have the directory, use the $File::Find::dir variable.
my $new_file_name = $file_name;
$new_file_name =~ s/.* /Kernal.txt/g;
The /.* / regex says:
match anything up to and including the last space
If this matches, replace the matched part with Kernal.txt.
The /g flag is completely useless here. Are you sure you don't want Kernel.txt with an e? And why the space in the filename? I don't quite understand that. If you want to rename the file to Kernel.txt, just assign that as a string, instead of doing weird stuff with substitutions:
my $new_file_name = "Kernel.txt";
rename($file, File::Spec->catfile($dirname, $new_file_name)) or die $!;
We already established that an error message should also include the filename, or even better: we should use automatic error handling.
Also, we are already in the correct directory, so we don't have to concatenate the new name with the directory.
rename $file => $new_file_name; # error handling by autodie
last LINE;
That should be enough. Also note that I leave the LINE loop. Once we renamed the file, there is no need to check the other lines as well.
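Putting the pieces above together, a minimal sketch could look like the following. The has_marker and rename_kernel_logs helpers are names introduced here for illustration, and the start directory is passed on the command line because the real path is not clear from the question. The check is moved into its own sub so that the file handle is closed before the rename happens, which may matter on Windows.

```perl
use strict;
use warnings;
use autodie;
use File::Find;

# True if the file contains one of the marker strings.
# The handle is closed when $fh goes out of scope, i.e. before any rename.
sub has_marker {
    my ($file) = @_;
    open my $fh, '<', $file;
    while (my $line = <$fh>) {
        return 1 if $line =~ /Linux kernel Version|USB_STATE=DISCONNECTED/i;
    }
    return 0;
}

# Called by File::Find for every entry; $_ is the bare name and the
# current directory is the one that contains it.
sub renamefile {
    my $file = $_;
    return unless -f $file and $file =~ /\.txt$/;
    return if $file eq 'Kernel.txt';          # already has the target name
    rename $file => 'Kernel.txt' if has_marker($file);
}

# Hypothetical wrapper so the traversal can be started from any top directory.
sub rename_kernel_logs {
    my ($start_directory) = @_;
    find({ wanted => \&renamefile }, $start_directory);
}

rename_kernel_logs($ARGV[0]) if @ARGV;   # e.g. perl rename.pl C:/doc/Log
```

One caveat: if a single folder holds several matching .txt files, they would all be renamed to the same Kernel.txt, so later renames may overwrite earlier ones or fail.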
I am new to perl. I have a directory structure. In each directory, I have a log file. I want to grep pattern from that file and do post processing. Right now I am grepping the pattern from those files using unix grep and putting into text file and reading that text file to do post processing, But I want to automate task of reading each file and grepping pattern from that file. In the code below the mdp_cgdis_1102.txt have grepped pattern from directories. I would really appreciate any help
#!usr/bin/perl
use strict;
use warnings;
open FILE, 'mdp_cgdis_1102.txt' or die "Cannot open file $!";
my @array = <FILE>;
my @arr;
my @brr;
foreach my $i (@array){
    @arr = split (/\//, $i);
    @brr = split (/\:/, $i);
    print " $arr[0] --- $brr[2]";
}
It is unclear to me which part of the process needs automating. I'll go by "want to automate reading each file and grepping pattern from that file," whereby you presumably already have a list of files. If you actually need to build the file list as well see the added code below.
One way: pull all patterns from each file and store that in a hash (filename => arrayref-with-patterns)
my %file_pattern;
foreach my $file (@filelist) {
    open my $fh, '<', $file or die "Can't open $file: $!";
    $file_pattern{$file} = [ grep { /$pattern/ } <$fh> ];
    close $fh;
}
The [ ] takes a reference to the list returned by grep, ie. constructs an "anonymous array", and that (reference) is assigned as a value to the $file key.
Now you can process your patterns, per log file
foreach my $filename (sort keys %file_pattern) {
    print "Processing log $filename.\n";
    my @patterns = @{$file_pattern{$filename}};
    # Process the list of patterns in this log file
}
ADDED
In order to build the list of files @filelist used above from a known list of directories, use the core File::Find module, which recursively scans the supplied directories and applies the supplied subroutines
use File::Find;
find( { wanted => \&process_logs, preprocess => \&select_logs }, @dir_list);
Your subroutine process_logs() is applied to each file/directory that passed preprocessing by the second sub, with its name available as $File::Find::name, and in it you can either populate the hash with patterns-per-log as shown above, or run complete processing as needed.
Your subroutine select_logs() contains code to filter the log files from all entries in each directory that File::Find would normally process, so that process_logs() only gets the log files.
Another way would be to use the other invocation
find(\&process_all, @dir_list);
where now the sub process_all() is applied to all entries (files and directories) found and thus this sub itself needs to ensure that it only processes the log files. See linked documentation.
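As a rough sketch of what the two subs for the first invocation might contain: the ERROR pattern, the ".log" suffix and the scan_logs wrapper below are assumptions made for illustration, not details from the question.

```perl
use strict;
use warnings;
use File::Find;

my $pattern = qr/ERROR/;   # hypothetical pattern to look for
my %file_pattern;          # log file name => arrayref of matching lines

# preprocess hook: gets the entries of the current directory and returns
# the ones File::Find should visit. Directories are kept so the scan can
# descend; the ".log" suffix is an assumption about how logs are named.
sub select_logs {
    return grep { -d $_ or /\.log\z/ } @_;
}

# wanted hook: called for each entry that passed select_logs().
sub process_logs {
    return unless -f $_;
    my $name = $File::Find::name;
    open my $fh, '<', $_ or die "Can't open $name: $!";
    $file_pattern{$name} = [ grep { /$pattern/ } <$fh> ];
    close $fh;
}

# Hypothetical wrapper: scan the given directories, return the filled hash.
sub scan_logs {
    my @dir_list = @_;
    %file_pattern = ();
    find({ wanted => \&process_logs, preprocess => \&select_logs }, @dir_list);
    return \%file_pattern;
}
```

Calling scan_logs('/some/dir') would then give a hash reference keyed by full file name, ready for the per-log processing loop shown earlier.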
The equivalent of
find ... -name '*.txt' -type f -exec grep ... {} +
is
use File::Find::Rule qw( );
my $base_dir_qfn = ...;
my $re = qr/.../;
my @log_qfns =
    File::Find::Rule
        ->name(qr/\.txt\z/)
        ->file
        ->in($base_dir_qfn);
my $success = 1;
for my $log_qfn (@log_qfns) {
    open(my $fh, '<', $log_qfn)
        or do {
            $success = 0;
            warn("Can't open log file \"$log_qfn\": $!\n");
            next;
        };
    while (<$fh>) {
        print if /$re/;
    }
}
exit(1) if !$success;
Use File::Find to traverse the directory.
In a loop go through all the logfiles:
Open the file
read it line by line
For each line, do a regular expression match (
if ($line =~ /pattern/) ) or use
if (index($line, $searchterm) >= 0) if you are looking for a certain static string.
If you find a match, print the line.
close the file
I hope that gives you enough pointers to get started. You will learn more if you find out how to do each of these steps in Perl by yourself (I pointed out the hard ones).
I am currently working on a code that changes certain words to Shakespearean words. I have to extract the sentences that contain the words and print them out into another file. I had to remove .START from the beginning of each file.
First I split the files with the text by spaces, so now I have the words. Next, I iterated the words through a hash. The hash keys and values are from a tab delimited file that is structured as so, OldEng/ModernEng (lc_Shakespeare_lexicon.txt). Right now, I'm trying to figure out how to find the exact position of each modern English word that is found, change it to the Shakespearean; then find the sentences with the change words and printing them out to a different file. Most of the code is finished except for this last part. Here is my code so far:
#!/usr/bin/perl -w
use diagnostics;
use strict;
#Declare variables
my $counter=();
my %hash=();
my $conv1=();
my $conv2=();
my $ssph=();
my @text=();
my $key=();
my $value=();
my $conversion=();
my @rmv=();
my $splits=();
my $words=();
my @word=();
my $vals=();
my $existingdir='/home/nelly/Desktop';
my @file='Sentences.txt';
my $eng_words=();
my $results=();
my $storage=();
#Open file to tab delimited words
open (FILE,"<", "lc_shakespeare_lexicon.txt") or die "could not open lc_shakespeare_lexicon.txt\n";
#split words by tabs
while (<FILE>){
chomp($_);
($value, $key)= (split(/\t/), $_);
$hash{$value}=$key;
}
#open directory to Shakespearean files
my $dir="/home/nelly/Desktop/input";
opendir(DIR,$dir) or die "can't opendir Shakespeare_input.tar.gz";
#Use grep to get WSJ file and store into an array
my @array= grep {/WSJ/} readdir(DIR);
#store file in a scalar
foreach my $file(@array){
    #open files inside of input
    open (DATA,"<", "/home/nelly/Desktop/input/$file") or die "could not open $file\n";
    #loop through each file
    while (<DATA>){
        @text=$_;
        chomp(@text);
        #Remove .START
        @rmv=grep(!/.START/, @text);
        foreach $splits(@rmv){
            #split data into separate words
            @word=(split(/ /, $splits));
            #Loop through each word and replace with Shakespearean word that exists
            $counter=0;
            foreach $words(@word){
                if (exists $hash{$words}){
                    $eng_words= $hash{$words};
                    $results=$counter;
                    print "$counter\n";
                    $counter++;
                    #create a new directory and store senteces with Shakespearean words in new file called "Sentences.txt"
                    mkdir $existingdir unless -d $existingdir;
                    open my $FILE, ">>", "$existingdir/@file", or die "Can't open $existingdir/conversion.txt'\n";
                    #print $FILE "@words\n";
                    close ($FILE);
                }
            }
        }
    }
}
close (FILE);
close (DIR);
Natural language processing is very hard to get right except in trivial cases, for instance it is difficult to define exactly what is meant by a word or a sentence, and it is awkward to distinguish between a single quote and an apostrophe when they are both represented using the U+0027 "apostrophe" character '
Without any example data it is difficult to write a reliable solution, but the program below should be reasonably close
Please note the following
use warnings is preferable to -w on the shebang line
A program should contain as few comments as possible as long as it is comprehensible. Too many comments just make the program bigger and harder to grasp without adding any new information. The choice of identifiers should make the code mostly self documenting
I believe use diagnostics to be unnecessary. Most messages are fairly self-explanatory, and diagnostics can produce large amounts of unnecessary output
Because you are opening multiple files it is more concise to use autodie which will avoid the need to explicitly test every open call for success
It is much better to use lexical file handles, such as open my $fh ... instead of global ones, like open FH .... For one thing a lexical file handle will be implicitly closed when it goes out of scope, which helps to tidy up the program a lot by making explicit close calls unnecessary
I have removed all of the variable declarations from the top of the program except those that are non-empty. This approach is considered to be best practice as it aids debugging and assists the writing of clean code
The program lower-cases the original word using lc before checking to see if there is a matching entry in the hash. If a translation is found, then the new word is capitalised using ucfirst if the original word started with a capital letter
I have written a regular expression that will take the next sentence from the beginning of the string $contents. But this is one of the things that I can't get right without sample data, and there may well be problems, for instance, with sentences that end with a closing quotation mark or a closing parenthesis
use strict;
use warnings;
use autodie;
my $lexicon = 'lc_shakespeare_lexicon.txt';
my $dir = '/home/nelly/Desktop/input';
my $existing_dir = '/home/nelly/Desktop';
my $sentences = 'Sentences.txt';
my %lexicon = do {
    open my ($fh), '<', $lexicon;
    local $/;
    reverse(<$fh> =~ /[^\t\n\r]+/g);
};
my @files = do {
    opendir my ($dh), $dir;
    grep /WSJ/, readdir $dh;
};
for my $file (@files) {
    my $contents = do {
        open my $fh, '<', "$dir/$file";
        join '', grep { not /\A\.START/ } <$fh>;
    };
    # Change any CR or LF to a space, and reduce multiple spaces to single spaces
    $contents =~ tr/\r\n/ /;
    $contents =~ s/ {2,}/ /g;
    # Find and process each sentence
    while ( $contents =~ / \s* (.+?[.?!]) (?= \s+ [A-Z] | \s* \z ) /gx ) {
        my $sentence = $1;
        my @words = split ' ', $sentence;
        my $changed;
        for my $word (@words) {
            my $eng_word = $lexicon{lc $word};
            if ($eng_word) {
                # Only capitalise once we know a translation exists
                $eng_word = ucfirst $eng_word if $word =~ /\A[A-Z]/;
                $word = $eng_word;
                ++$changed;
            }
        }
        if ($changed) {
            mkdir $existing_dir unless -d $existing_dir;
            open my $out_fh, '>>', "$existing_dir/$sentences";
            print $out_fh "@words\n";
        }
    }
}
I have files named lin.txt and lin1.txt, along with other .txt files. I need to find only these files and print their contents one by one. I have the code below, but somehow it is not matching the files starting with lin*. What is the issue?
$te_dir= "/projects/xxx/";
opendir (DIR, $te_dir) or die $!;
while (my $file = readdir(DIR))
{
if ($file=~/\.txt/)
{
#// Doing some tasks.
if($file ~= 'lin*.txt')
{
$linfile=$te_dir/$file;
open(LINFILE, $linfile) or die "Couldn't open file $file:$!";
while(my $line = <LINFILE>)
{
print $line;
}
close LINFILE;
}
}
}
You are mixing globs (shell wildcards) with regular expressions. These are two different formalisms with different syntax and semantics. In regular expressions (which is what Perl matching uses), n* matches zero or more occurrences of the character n. You probably mean
if ($file =~ /lin.*\.txt/)
Notice also the syntax error in the operator. You correctly have =~ in the first conditional, but you misspelled it as ~= where you do this comparison. (Maybe it's just a transcription error; for me, this creates a clear syntax error, so the script would not run in the first place.)
As noted in @brianadams' answer, the proper regular expression for this is
if ($file =~ /^lin.*\.txt$/)
with beginning of line ^ and end of line $ anchors to prevent e.g. feline.txt.html from matching. The default behavior of Perl's regular expressions is to find a match anywhere in the input string.
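A quick way to convince yourself is to run the candidate patterns over a handful of names. The sketch below uses made-up file names (only lin.txt and lin1.txt come from the question) and compares the glob-style pattern read as a regex, the unanchored fix, and the anchored fix.

```perl
use strict;
use warnings;

# lin.txt and lin1.txt are from the question; the other names are hypothetical.
my @names = qw(lin.txt lin1.txt feline.txt.html li.txt);

# Returns the names matched by the given compiled regex.
sub names_matching {
    my ($regex, @candidates) = @_;
    return grep { $_ =~ $regex } @candidates;
}

my @globlike   = names_matching(qr/lin*\.txt/,    @names);  # "li", zero or more "n", ".txt"
my @unanchored = names_matching(qr/lin.*\.txt/,   @names);  # "lin" anywhere in the name
my @anchored   = names_matching(qr/^lin.*\.txt$/, @names);  # whole name must fit

print "glob-like:  @globlike\n";
print "unanchored: @unanchored\n";
print "anchored:   @anchored\n";
```

Only the anchored pattern selects exactly lin.txt and lin1.txt; the glob-like one misses lin1.txt and accepts li.txt, and the unanchored one lets feline.txt.html through.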
Here's a quick (and minimal) rewrite of your code that might help:
use strict;
use warnings;
my $te_dir = "/projects/xxx/";
opendir( my $dirh, $te_dir ) or die "Could not open '$te_dir': $!";
while ( my $file = readdir($dirh) ) {
next unless $file =~ /\.txt$/;
#// Doing some tasks.
if ( $file =~ /^ lin \d* \.txt $/x ) {
my $linfile = "$te_dir/$file";
open( my $fh, '<', $linfile ) or die "Couldn't open file $linfile: $!";
while ( my $line = <$fh> ) {
print $line;
}
close $fh or die "Could not close $linfile: $!";
}
}
First, note that we've put strict and warnings at the top of the code. That will tell you about all sorts of interesting issues, including misspelled variable names.
Next, we've switched to lexical handles (e.g., my $dirh instead of DIR). The "bareword" handles you're using (DIR and LINFILE) have been discouraged for a long time because they are effectively global constructs, and global data is generally bad: when it gets broken, it's awfully hard to tell what broke it. So we much, much prefer the lexical versions (the handles declared with the my builtin).
Also, this line you had probably doesn't do what you're thinking:
$linfile=$te_dir/$file;
You're trying to smash together a directory and filename with a forward slash, but since you didn't use string interpolation, you're actually using division. Both your directory and filename will, in this numeric context, probably evaluate to zero, giving you a divide-by-zero error when you're trying to open a file!
However, if you're willing to use a CPAN module, you can make this even easier:
use strict;
use warnings;
use File::Find::Rule;
my $te_dir = "/projects/xxx/";
my @files = File::Find::Rule->file->name('lin*.txt')->in($te_dir);
foreach my $linfile (@files) {
#// Doing some tasks.
open my $fh, '<', $linfile or die "Couldn't open file $linfile: $!";
while ( my $line = <$fh> ) {
print $line;
}
}
No muss, no fuss. Get only the files you want in the first pass and already have the correct file names (note that I didn't close the filehandle because it will close automatically when $fh goes out of scope at the end of the foreach loop.)
To match files starting with lin
if ( $file =~ /^lin.*\.txt$/ )
Try changing your 2nd if condition from this,
if($file ~= 'lin*.txt')
to this,
if($file =~ /lin.*\.txt/)
You could also try: if($file =~ /^lin.*\.txt/) , as already pointed out in other answers, but you'll need to make sure that the file names stored in the $file variable contain only the file name and not the entire path as well.
I have multiple files that have the extension .tdx.
Currently my program works on individual files using $ARGV[0], however the number of files are growing and I would like to use a wildcard based upon the file extension.
After much research I am at a loss.
I would like to read each file individually so the extract from the file is identified by the user.
#!C:\Perl\bin\perl.exe
use warnings;
use FileHandle;
open my $F_IN, '<', $ARGV[0] or die "Unable to open file: $!\n";
open my $F_OUT, '>', 'output.txt' or die "Unable to open file: $!\n";
while (my $line = $F_IN->getline) {
if ($line =~ /^User/) {
$F_OUT->print($line);
}
if ($line =~ /--FTP/) {
$F_OUT->print($line);
}
if ($line =~ /^ftp:/) {
$F_OUT->print($line);
}
}
close $F_IN;
close $F_OUT;
All the files are in one directory, so I assume I will need to open the directory.
I am just not sure how if I need to build an array of files or build a list and chomp it.
You have many options:
1. Loop over @ARGV, allowing the user to pass in a list of files.
2. Use glob to pass in a pattern that perl will expand into a list of files (and then loop over that list, as in #1). This can be messy, as users have to make sure to quote the pattern so the shell doesn't expand it first.
3. Write some wrapper to call your existing script over and over again.
There's also a variant of the first one, which is to read from <>. This is set to either STDIN, or it'll automatically open the files named in @ARGV. See eof for an example of how to use it.
As a variant of #2, you can pass in a directory name, and use either opendir and readdir to loop over the list (making sure to grab only files with your extension, or at the very least ignore . and ..) or append /* or /*.tdx to it and use glob again.
The glob function can help you. Just try
my #files = glob '*.tdx';
for my $file (#files) {
# Process $file...
}
In list context, glob expands its argument to the list of file names that match the pattern. For details, see glob in perlfunc.
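As a rough sketch of how this could be combined with the filtering from the question: the extract_lines helper is a name introduced here for illustration, and the three separate tests from the question are folded into one alternation. Files given on the command line take precedence; otherwise the sketch falls back to glob in the current directory.

```perl
use strict;
use warnings;

# Collect the lines of interest from each file, keyed by file name,
# so the output can show which file each extract came from.
sub extract_lines {
    my @files = @_;
    my %found;
    for my $file (@files) {
        open my $in, '<', $file or die "Unable to open $file: $!\n";
        while (my $line = <$in>) {
            push @{ $found{$file} }, $line if $line =~ /^User|--FTP|^ftp:/;
        }
        close $in;
    }
    return \%found;
}

# Use the files named on the command line, or every .tdx file in the
# current directory when none are given.
my @files = @ARGV ? @ARGV : glob '*.tdx';
my $found = extract_lines(@files);

for my $file (sort keys %$found) {
    print "== $file ==\n", @{ $found->{$file} };
}
```

Printing a header per file is one way to keep each extract identifiable; writing to output.txt instead of STDOUT only needs an output handle passed to print.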
I never got glob to work. What I ended up doing was building an array based on the file extension .tdx. From there I copied the array to a file list and read from that. What I ended up with is:
#!C:\Perl\bin\perl.exe
use warnings;
use FileHandle;
open my $F_OUT, '>', 'output.txt' or die "Unable to open file: $!\n";
open(FILELIST, "dir /b /s \"%USERPROFILE%\\Documents\\holding\\*.tdx\" |");
@filelist=<FILELIST>;
close(FILELIST);
foreach $file (@filelist)
{
chomp($file);
open my $F_IN, '<', $file or die "Unable to open file: $!\n";
while (my $line = $F_IN->getline)
{
Doing Something
}
close $F_IN;
}
close $F_OUT;
Thank you for your answers; they helped in the learning experience.
If you're on a Windows machine, putting in *.tdx on the command line might not work, nor may glob which historically used the shell's globbing abilities. (It now appears that the built in glob function now uses File::Glob, so that may no longer be an issue).
One thing you can do is not use globs, but allow the user to input the directories and suffixes they want. Then use opendir and readdir to go through the directories yourself.
use strict;
use warnings;
use feature qw(say);
use autodie;
use Getopt::Long; # Why not do it right?
use Pod::Usage; # It's about time to learn about POD documentation
my @suffixes; # Hey, why not let people put in more than one suffix?
my @directories; # Let people put in the directories they want to check
my $help;
GetOptions (
    "suffix=s" => \@suffixes,
    "directory=s" => \@directories,
    "help" => \$help,
) or pod2usage ( -message => "Invalid usage" );
if ( not @suffixes ) {
    @suffixes = qw(tdx);
}
if ( not @directories ) {
    @directories = qw(.);
}
if ( $help ) {
    pod2usage;
}
my $regex = join "|", @suffixes;
$regex = qr/\.($regex)$/; # Will equal /\.(foo|bar|txt)$/ if suffixes are foo, bar, txt
for my $directory ( @directories ) {
    opendir my ($dir_fh), $directory; # Autodie will take care of this:
    while ( my $file = readdir $dir_fh ) {
        next unless -f "$directory/$file"; # readdir returns bare names, so add the directory
        next unless $file =~ /$regex/;
        # ... Here be dragons ...
    }
}
This will go through all of the directories your user input and then examines each entry. It uses the suffixes your user inputs (With .tdx being the default) to create a regular expression to check against the file name. If the file name matches the regular expression, do whatever you wanted to do with that file.
The script below takes function names in a text file and scans on a
folder that contains multiple c,h files. It opens those files one-by-one and
reads each line. If the match is found in any part of the files, it prints the
line number and the line that contains the match.
Everything is working fine except that the comparison is not working properly. I would be very grateful to whoever solves my problem.
#program starts:
use FileHandle;
print "ENTER THE PATH OF THE FILE THAT CONTAINS THE FUNCTIONS THAT YOU WANT TO
SEARCH: ";#getting the input file
our $input_path = <STDIN>;
$input_path =~ s/\s+$//;
open(FILE_R1,'<',"$input_path") || die "File open failed!";
print "ENTER THE PATH OF THE FUNCTION MODEL: ";#getting the folder path that
#contains multiple .c,.h files
our $model_path = <STDIN>;
$model_path =~ s/\s+$//;
our $last_dir = uc(substr ( $model_path,rindex( $model_path, "\\" ) +1 ));
our $output = $last_dir."_FUNC_file_names";
while(our $func_name_input = <FILE_R1> )#$func_name_input is the function name
#that is taken as the input
{
$func_name_input=reverse($func_name_input);
$func_name_input=substr($func_name_input,rindex($func_name_input,"\("+1);
$func_name_input=reverse($func_name_input);
$func_name_input=substr($func_name_input,index($func_name_input," ")+1);
#above 4 lines are func_name_input is choped and only part of the function
#name is taken.
opendir FUNC_MODEL,$model_path;
while (our $file = readdir(FUNC_MODEL))
{
next if($file !~ m/\.(c|h)/i);
find_func($file);
}
close(FUNC_MODEL);
}
sub find_func()
{
my $fh1 = FileHandle->new("$model_path//$file") or die "ERROR: $!";
while (!$fh1->eof())
{
my $func_name = $fh1->getline(); #getting the line
**if($func_name =~$func_name_input)**#problem here it does not take the
#match
{
next if($func_name=~m/^\s+/);
print "$.,$func_name\n";
}
}
}
$func_name_input=substr($func_name_input,rindex($func_name_input,"\("+1);
You're missing an ending parenthesis. Should be:
$func_name_input=substr($func_name_input,rindex($func_name_input,"\(")+1);
There's probably an easier way than those four statements, too. But it's a little early to wrap my head around it all. Do you want to match "foo" in "function foo() {"? If so, you could use a regex like /\s+([^( ]+)/.
When you say $func_name =~$func_name_input, you're treating all characters in $func_name_input as special regex characters. If this is not what you mean to do, you can use quotemeta (perldoc -f quotemeta): $func_name =~quotemeta($func_name_input) or $func_name =~ qr/\Q$func_name_input\E/.
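To illustrate the difference, here is a small sketch. The function name and the sample line are made up, and contains_literal and contains_by_index are helper names introduced only for this example.

```perl
use strict;
use warnings;

# True if $line contains $text literally (regex metacharacters escaped).
sub contains_literal {
    my ($line, $text) = @_;
    return $line =~ /\Q$text\E/ ? 1 : 0;
}

# Same check without a regex; index returns -1 when $text is absent.
sub contains_by_index {
    my ($line, $text) = @_;
    return index($line, $text) >= 0 ? 1 : 0;
}

my $name = 'get.value';               # hypothetical name with a metacharacter
my $line = "int getXvalue(void);";    # hypothetical line that should NOT match

# Unescaped, the "." matches any character, so this is a false positive.
print "unescaped: ", ($line =~ /$name/ ? 1 : 0), "\n";
print "escaped:   ", contains_literal($line, $name), "\n";
print "index:     ", contains_by_index($line, $name), "\n";
```

For plain substring checks like this, index avoids the regex engine entirely, so there is nothing to escape.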
Debugging will be easier with strictures (and a syntax-highlighting editor). Also note that, if you're not using those variables in other files, "our" doesn't do anything "my" wouldn't do for file-scoped variables.
find + xargs + grep does 90% of what you want.
find . -name '*.[ch]' | xargs grep -n your_pattern
ack does it even easier.
ack --type=cc your_pattern
Simply take your list of patterns from your file and "or" them together.
ack --type=cc 'foo|bar|baz'
This has the benefit of searching the files only once, not once for each pattern as you're doing.
I still think you should just use ack, but your code needed some serious love.
Here is an improved version of your program. It now takes the directory to search and patterns on the command line rather than having to ask for (and the user write) files. It searches all the files under the directory, not just the ones in the directory, using File::Find. It does this in one pass by concatenating all the patterns into regular expressions. It uses regexes instead of index() and substr() and reverse() and oh god. It simply uses built in filehandles rather than the FileHandle module and checking for eof(). Everything is declared lexical (my) instead of global (our). Strict and warnings are on for easier debugging.
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
die "Usage: search_directory function ...\n" unless @ARGV >= 2;
my $Search_Dir = shift;
my $Pattern = build_pattern(@ARGV);
find(
    {
        wanted => sub {
            return unless $File::Find::name =~ m/\.(c|h)$/i;
            find_func($File::Find::name, $Pattern);
        },
        no_chdir => 1,
    },
    $Search_Dir
);
# Join all the function names into one pattern
sub build_pattern {
    my @patterns;
    for my $name (@_) {
        # Turn foo() into foo. This replaces all that reverse() and rindex()
        # and substr() stuff.
        $name =~ s{\(.*}{};
        # Use \Q to protect against regex metacharacters in the input
        push @patterns, qr{\Q$name\E};
    }
    # Join them up into one pattern.
    return join "|", @patterns;
}
sub find_func {
    my( $file, $pattern ) = @_;
    open(my $fh, "<", $file) or die "Can't open $file: $!";
    while (my $line = <$fh>) {
        # XXX not all functions are unindented, but your choice
        next if $line =~ m/^\s+/;
        print "$file:$.: $line" if $line =~ $pattern;
    }
}