Build an array of the contents of the working directory in Perl

I am working on a script which utilizes files in surrounding directories using a path such as
"./dir/file.txt"
This works fine, as long as the working directory is the one containing the script. However the script is going out to multiple users and some people may not change their working directory and run the script by typing its entire path like this:
./path/to/script/my_script.pl
This poses a problem: when the script tries to access ./dir/file.txt, it looks for the dir directory in the home directory and, of course, can't find it.
I am trying to utilize readdir and chdir to correct the directory if it isn't the right one, here is what I have so far:
my $working_directory = $ENV{PWD};
print "Working directory: $working_directory\n"; #accurately prints working directory
my @directory = readdir $working_directory; #crashes script
if (!("my_script.pl" ~~ @directory)){ #if my_script.pl isn't in @directory, do this
print "Adjusting directory so I work\n";
print "Your old directory: $ENV{PWD}\n";
chdir $ENV{HOME}; #make the directory home
chdir "./path/to/script/my_script.pl"; #make the directory correct
print "Your new directory: $ENV{PWD}\n";
}
The line containing readdir crashes my script with the following error
Bad symbol for dirhandle at ./path/to/script/my_script.pl line 250.
which I find very strange because I am running this from the home directory which prints out properly right beforehand and contains nothing to do with the "bad symbol"
I'm open to any solutions
Thank you in advance

readdir operates on a directory handle, not on a path held in a string. You need to do something like:
opendir(my $dh, $working_directory) || die "can't opendir: $!";
my @directory = readdir($dh);
Check perldoc for both readdir and opendir.
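As a minimal sketch of that opendir/readdir pattern (the helper name list_dir is hypothetical, not something from the question), this reads a directory into an array and drops the "." and ".." entries that readdir always returns:

```perl
use strict;
use warnings;

# Hypothetical helper: return the entries of $path, excluding "." and "..".
sub list_dir {
    my ($path) = @_;
    opendir(my $dh, $path) or die "can't opendir $path: $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @entries;
}

# List the current working directory.
my @directory = list_dir('.');
print "$_\n" for sort { $a cmp $b } @directory;
```

The names come back without any directory prefix, so prepend the directory yourself before opening or testing them.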

I think you're going about this the wrong way. If you're looking for a file that's travelling with your script, then what you probably should consider is the FindBin module - that lets you figure out the path to your script, for use in path links.
So e.g.
use FindBin;
my $script_path = $FindBin::Bin;
open ( my $input, '<', "$script_path/dir/file.txt" ) or warn $!;
That way you don't have to faff about with chdir and readdir etc.

Related

Optimized way to print directory paths recursively without file comparison in perl

I have a directory which contains multiple levels of sub dirs. I want to print path for each and every directory.
Currently, I am using
use File::Find;
find(
{
wanted => \&findfiles,
}, $maindirectory);
sub findfiles
{
if (-d) {
push @arrayofdirs,$File::Find::dir;
}
}
But each subdirectory contains thousands of files at each level. The above code takes a lot of time to produce the result because it tests every file to see whether it is a directory. Is there a way to get the subdirectory paths without testing every file, or any other optimized method?
Edit: This issue got partially resolved but a new issue came up because of this solution. I have listed it here: Multiple File search in varying level of directories in perl
If you are on a UNIX/Linux platform then you can try reading output of find $maindirectory -type d command into your program (see this answer for a safe way to do that.). This command prints the names of directories in $maindirectory. It is faster because a compiled C program (find) does all the hard work. The following script should print all directory paths found.
Sample script:
use strict;
use warnings;
my $maindirectory = '.';
open my $fh, '-|', 'find', $maindirectory, '-type', 'd' or die "Can't open pipe: $!";
while( my $dir = <$fh>) {
print $dir;
}
close $fh or warn "can't close pipe: $!";
Note that there is no point in calling find through perl and then just printing its output without any processing. You can just as well run find $maindirectory -type d in shell itself.

Perl: How to go to the root directory from a script

I want to go to a directory, print in some files, then go back to the root directory.
So, I did this :
chdir "corpus";
open (OUTFILE, ">para$i") or die "Impossible d'ouvrir le fichier\n";
print OUTFILE $tab[$i];
close OUTFILE;
`cd /`;
But it obviously does not work (the cd / part). How do I go back to the root directory once I moved to the child directory in a Perl script?
Thanks a lot :).
OK, now I have another issue with this:
for (my $i=0; $i<$number_para;$i++){
open (OUTFILE, ">", "para$i.txt") or die ;
print OUTFILE $tab[$i];
}
worked fine, but when I added the chdir:
for (my $i=0; $i<$number_para;$i++){
chdir "corpus"
open (OUTFILE, ">", "para$i.txt") or die ;
print OUTFILE $tab[$i];
chdir "/"
}
It says "print() on closed filehandle OUTFILE". I don't understand why, since it worked fine before...
chdir "/"
will work just fine. Or if you have a set directory in a variable:
chdir $dir or die $!;
Or as Miller says, you can refer to the parent directory as "..". However, you should be aware that you do not have to change directory. If you want to open a file in another directory, you can supply the relative path to it:
open (my $out, ">", "Corpus/para$i") or die $!;
Note that you should use three argument open, with explicit mode, and lexical file handle.
You say root directory, but it looks like you actually just want the parent directory.
To go to the parent directory, use '..';
chdir "..";
Or if you want to be paranoid about cross-platform compatibility:
use File::Spec;
chdir File::Spec->updir();
To actually go to the root directory, just use chdir like you did the first time:
chdir '/';
or, once again being paranoid about cross-platform compatibility:
use File::Spec;
chdir File::Spec->rootdir();
It might be worth pointing out why using cd in backticks didn't work.
Running a command in backticks starts up a completely new shell process for the command. That new process starts with a copy of the state of the process your program is running in, including its environment variables and its current working directory (which shells also expose as $ENV{PWD}).
Your new process starts up. The first (and only) thing that it does is to change directory. So the current directory of the new process is changed. But the current directory of your original process stays the same as it was.
Your new process then exits as its job is done. Everything it held is removed from memory, and control returns to the original process, which still has its original current directory.
A child process cannot change the working directory (or the environment variables) of its parent. So any attempt to change directory using an external program is doomed to failure.
But changing directory using Perl's built-in function chdir works just fine, because that changes the current directory of the Perl process itself.
Hope that's helpful.
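To see the difference for yourself, here is a small sketch, assuming a Unix-like system where backticks run the command through sh. It compares the working directory before and after a cd in a child shell, then after a real chdir:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

my $before = getcwd();

# The backticks start a child shell; the cd happens there and is lost on exit.
my $child_dir = `cd / && pwd`;
chomp $child_dir;

my $after = getcwd();
print "child was in:    $child_dir\n";
print "parent still in: $after\n";
print "parent unchanged\n" if $before eq $after;

# The built-in chdir changes the working directory of this Perl process.
chdir '/' or die "can't chdir to /: $!";
print "after chdir:     ", getcwd(), "\n";
```

Note that the built-in chdir does not update $ENV{PWD}, so printing $ENV{PWD} after a chdir still shows the old value. getcwd from the Cwd module asks the operating system instead.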

perl chdir and system commands

I am trying to chdir in perl but I am just not able to get my head around what's going wrong.
This code works.
chdir('C:\Users\Server\Desktop')
But when trying to get the user's input, it doesn't work. I even tried using chomp to remove any spaces that might come.
print "Please enter the directory\n";
$p=<STDIN>;
chdir ('$p') or die "sorry";
system("dir");
Also could someone please explain how I could use the system command in this same situation and how it differs from chdir.
The final aim is to access two folders, check for files that are named the same (eg: if both the folders have a file named "water") and copy the file that has the same name into a third folder.
chdir('$p') tries to change to a directory literally named $p. Drop the single quotes:
chdir($p)
Also, after reading it in, you probably want to remove the newline (unless the directory name really does end with a newline):
$p = <STDIN>;
chomp($p);
But if you are just chdiring to be able to run dir and get the results in your script, you probably don't want to do that. First of all, system runs a command but doesn't capture its output. Secondly, you can just do:
opendir my $dirhandle, $p or die "unable to open directory $p: $!\n";
my @files = readdir $dirhandle;
closedir $dirhandle;
and avoid the chdir and running a command prompt command altogether.
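For the stated final aim (copying files whose names appear in both folders into a third one), a rough sketch along the same lines might look like this. The helper names plain_files and copy_common and the folder names are hypothetical, and it assumes the copy should come from the first folder:

```perl
use strict;
use warnings;
use File::Copy qw(copy);

# Hypothetical helper: names of the plain files directly inside $dir.
sub plain_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "unable to open directory $dir: $!\n";
    my @names = grep { -f "$dir/$_" } readdir $dh;
    closedir $dh;
    return @names;
}

# Hypothetical helper: copy every file whose name occurs in both $dir_a and
# $dir_b from $dir_a into $dest; returns the sorted list of copied names.
sub copy_common {
    my ($dir_a, $dir_b, $dest) = @_;
    my %in_b   = map { $_ => 1 } plain_files($dir_b);
    my @common = sort { $a cmp $b } grep { $in_b{$_} } plain_files($dir_a);
    for my $name (@common) {
        copy("$dir_a/$name", "$dest/$name")
            or die "copy of $name failed: $!\n";
    }
    return @common;
}

# Example call; the three directory names are placeholders:
# my @copied = copy_common('folder_a', 'folder_b', 'folder_c');
```

No chdir is needed anywhere, because every path is built from the directory name plus the file name.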
I will use it this way.
chdir "C:/Users/Server/Desktop"
The above works for me

Trying to pass a subdirectory as a parameter in Perl

I have a Perl program that reads .html files, but it only works if the program is in the same directory as the .html files.
I would like to be able to start in different directories and pass the html's location as a parameter. The program (shell example below) traverses the subdirectory "sub"
and its subdirectories to look for .html's, but only works when my perl file is in the same subdirectory "sub". If I put the Perl file
in the home directory, which is one step back from the subdirectory "sub", it doesn't work.
In the shell, if I type "perl project.pl ./sub" from my home directory, it says could
not open ./sub/file1.html. No such file or directory. Yet the file does exist in that exact spot.
file1.html is the first file it is trying to read.
If I change directories in the shell to that subdirectory and move the .pl file
there and then say in the shell: "perl project.pl ./" everything is ok.
To search the directories, I have been using the File::Find concept which I found here:
How to traverse all the files in a directory; if it has subdirectories, I want to traverse files in subdirectories too
Find::File to search a directory of a list of files
#!/usr/bin/perl -w
use strict;
use warnings;
use File::Find;
find( \&directories, $ARGV[0]);
sub directories {
$_ = $File::Find::name;
if(/.*\.html$/){#only read file on local drive if it is an .html
my $file = $_;
open my $info, $file or die "Could not open $file: $!";
while(my $line = <$info>) {
#perform operations on file
}
close $info;
}
return;
}
In the documentation of File::Find it says:
You are chdir()'d to $File::Find::dir when the function is called,
unless no_chdir was specified. Note that when changing to directories
is in effect the root directory (/) is a somewhat special case
inasmuch as the concatenation of $File::Find::dir, '/' and $_ is not
literally equal to $File::Find::name.
So you actually are at ~/sub already. Only use the filename, which is $_. You do not need to overwrite it. Remove the line:
$_ = $File::Find::name;
find changes directory automatically so that $File::Find::name is no longer relative to the current directory.
You can delete this line to get it to work:
$_ = $File::Find::name;
See also File::Find no_chdir.
From the File::Find documentation:
For each file or directory found, it calls the &wanted subroutine.
(See below for details on how to use the &wanted function).
Additionally, for each directory found, it will chdir() into that
directory and continue the search, invoking the &wanted function on
each file or subdirectory in the directory.
(emphasis mine)
The reason it's not finding ./sub/file1.html is because, when open is called, File::Find has already chdired you into ./sub/. You should be able to open the file as just file1.html.
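If you would rather keep working with full paths, the no_chdir option mentioned above is the other way out. A minimal sketch, with html_files as a hypothetical helper name:

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect the paths of all .html files below $top.
sub html_files {
    my ($top) = @_;
    my @found;
    find(
        {
            # With no_chdir, find() stays in the starting directory and
            # $_ holds the same path as $File::Find::name.
            no_chdir => 1,
            wanted   => sub {
                return unless -f $_ && /\.html$/;
                open my $info, '<', $_ or die "Could not open $_: $!";
                # ... perform operations on <$info> here ...
                close $info;
                push @found, $File::Find::name;
            },
        },
        $top
    );
    return sort { $a cmp $b } @found;
}

# Usage: perl project.pl ./sub
if (@ARGV) {
    print "$_\n" for html_files($ARGV[0]);
}
```

Without no_chdir you would open $_ (the bare file name) instead, since find has already changed into the file's directory.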

How can I scan multiple log files to find which ones have a particular IP address in them?

Recently there have been a few attackers trying malicious things on my server so I've decided to somewhat "track" them even though I know they won't get very far.
Now, I have an entire directory containing the server logs and I need a way to search through every file in the directory, and return a filename if a string is found. So I thought to myself, what better of a language to use for text & file operations than Perl? So my friend is helping me with a script to scan all files for a certain IP, and return the filenames that contain the IP so I don't have to search for the attacker through every log manually. (I have hundreds)
#!/usr/bin/perl
$dir = ".";
opendir(DIR, "$dir");
@files = grep(/\.*$/,readdir(DIR));
closedir(DIR);
foreach $file(@files) {
open FILE, "$file" or die "Unable to open files";
while(<FILE>) {
print if /12.211.23.200/;
}
}
although it is giving me directory read errors. Any assistance is greatly appreciated.
EDIT: Code edited, still saying permission denied cannot open directory on line 10. I am just going to run the script from within the logs directory if you are questioning the directory change to "."
Mike.
Can you use grep instead?
To get all the lines with the IP, I would directly use grep, no need to show a list of files, it's a simple command:
grep '12\.211\.23\.200' *
I like to pipe it to another file and then open that file in an editor...
If you insist on wanting the filenames, it's also easy
grep -l '12\.211\.23\.200' *
grep is available on all Unix/Linux systems with the GNU tools, or on Windows using one of the many implementations (unxutils, Cygwin, etc.).
You have to concatenate $dirname with $filname when using files found through readdir; remember that you haven't chdir'ed into the directory where those files reside.
open FH, "<", "$dirname/$filname" or die "Cannot open $filname:$!";
Incidentally, why not just use grep -r to recursively search all subdirectories under your log dir for your string?
EDIT: I see your edits, and two things. First, this line:
@files = grep(/\.*$/,readdir(DIR));
Is not effective, because you are searching for zero or more . characters at the end of the string. Since it's zero or more, it'll match everything in the directory. If you're trying to exclude files ending in ., try this:
@files = grep(!/\.$/,readdir(DIR));
Note the ! sign for negation if you're trying to exclude those files. Otherwise (if you only want those files and I'm misunderstanding your intent), leave the ! out.
In any case, if you're getting your die message on line 10, most likely you're hitting a file that has permissions such that you can't read it. Try putting the filename in the die output so you can see which file it's failing on:
open FILE, "$file" or die "Unable to open file: $file";
But as with other answers, and to reiterate: Why not use grep? The unix command, not the Perl function.
This will get the file names you are looking for in perl, and probably do it much faster than running and doing a perl regex.
@files = `find ~/ServerLogs -name "*.log" | xargs grep -l "<ip address>"`;
Although, this will require a *nix compliant system, or Cygwin on Windows.
Firstly get a list of files within your source directory:
opendir(DIR, "$dir");
@files = grep(/\.log$/,readdir(DIR));
closedir(DIR);
And then loop through those files
foreach $file(@files)
{
# file processing code
}
My first suggestion would be to use grep instead. The right tool for the job, they say...
But to answer your question:
readdir just returns the filenames from the directory. You'll need to concatenate the directory name and filename together.
$path = "$dirname/$filname";
open FH, $path or die ...
Then you should ignore files that are actually directories, such as "." and "..". After getting the $path, check to see if it's a file.
if (-f $path) {
open FH, $path or die ...
while (<FH>)
BTW, I thought I would throw in a mention for File::Next. To iterate over all files in a directory (recursively):
use feature 'say'; # needed for say() below
use Path::Class; # always useful.
use File::Next;
my $files = File::Next::files( dir(qw/path to files/) ); # look in path/to/files
while( defined ( my $file = $files->() ) ){
$file = file( $file );
say "Examining $file";
say "found foo" if $file->slurp =~ /foo/;
}
File::Next is taint-safe.
~ doesn't auto-expand in Perl.
opendir my $fh, '~/' or die("Doin It Wrong"); # Doing It Wrong.
opendir my $fh, glob('~/') and die( "Thats right!" );
Also, if you must use readdir(), make sure you guard the expression thus:
while (defined(my $filename = readdir(DH))) {
...
}
If you don't do the defined() test, the loop will terminate if it finds a file called '0'.
Have you looked on CPAN for log parsers? I searched with 'log parse' and it yielded over 200 hits. Some (probably many) won't be relevant - some may be. It depends, in part, on which web server you are using.
Am I reading this right? Your line 10 that gives you the error is
open FILE, "$file" or die "Unable to open files";
And the $file you are trying to read, according to line 6,
@files = grep(/\.*$/,readdir(DIR));
is a file that ends with zero or more dots. Is this what you really wanted? This basically matches every file in the directory, including "." and "..". Maybe you don't have enough permission to open the parent directory for reading?
EDIT: if you only want to read all files (including hidden ones), you might want to use something like the following:
opendir(DIR, ".");
@files = readdir(DIR);
closedir(DIR);
foreach $file (@files) {
if ($file ne "." and $file ne "..") {
open FILE, "$file" or die "cannot open $file\n";
# do stuff with FILE
}
}
Note that this doesn't take care of sub directories.
I know I am way late to this discussion (ran across it while searching for grep related posts) but I am going to answer anyway:
It isn't specified clearly if these are web server logs (Apache, IIS, W3SVC, etc.) but the best tool for mining those for data is the LogParser tool from Microsoft. See logparser.com for more info.
LogParser will allow you to write SQL-like statements against the log files. It is very flexible and very fast.
Use perl from the command line, like a better grep
perl -wnl -e '/12.211.23.200/ and print;' *.log > output.txt
the benefit here is that you can chain logic far easier
perl -wnl -e '(/12.211.23.20[1-11]/ or /denied/i ) and print;' *.log
if you are feeling wacky you can also use more advanced command line options to feed perl one liner result into other perl one liners.
You really need to read "Minimal Perl: For UNIX and Linux People", awesome book on this very sort of thing.
First, use grep.
But if you don't want to, here are two small improvements you can make that I haven't seen mentioned yet:
1) Change:
@files = grep(/\.*$/,readdir(DIR));
to
@files = grep({ !-d "$dir/$_" } readdir(DIR));
This way you will exclude not just "." and ".." but also any other subdirectories that may exist in the server log directory (which the open downstream would otherwise choke on).
2) Change:
print if /12.211.23.200/;
to
print if /12\.211\.23\.200/;
"." is a regex wildcard meaning "any character". Changing it to "\." will reduce the number of false positives (unlikely to change your results in practice but it's more correct anyway).
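Putting the suggestions from this thread together (prefix the directory, skip anything that is not a plain file, guard readdir with defined, and match the dots literally), a Perl-only version could be sketched like this. The helper name files_with_ip and the script name in the usage comment are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: names of the plain files in $dir that contain $ip.
sub files_with_ip {
    my ($dir, $ip) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    my @matches;
    while (defined(my $name = readdir($dh))) {
        my $path = "$dir/$name";
        next unless -f $path;    # skips ".", ".." and subdirectories
        open(my $fh, '<', $path) or do {
            warn "cannot open $path: $!\n";    # report it and move on
            next;
        };
        while (my $line = <$fh>) {
            # \Q...\E quotes the dots so they only match a literal "."
            if ($line =~ /\Q$ip\E/) {
                push @matches, $name;
                last;    # one hit is enough for this file
            }
        }
        close($fh);
    }
    closedir($dh);
    return sort { $a cmp $b } @matches;
}

# Usage: perl scan_logs.pl /path/to/logs 12.211.23.200
if (@ARGV == 2) {
    print "$_\n" for files_with_ip(@ARGV);
}
```

This only looks at the top level of the directory, and it reports each file once, which is roughly what grep -l does.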