Perl: how to read files one by one from a directory, other than the array concept?

How can I read log files one by one from a directory, rather than reading the whole directory into an array? I tried the array approach, but it didn't meet my requirements: log files keep being added to the current working directory, and if I build the file list once up front, the latest log files are missed. Is there a better solution for this? Below is the code I tried; here the array holds all the files of the directory.
opendir ( DIR, $readDir ) || die "Error in opening dir $readDir\n";
my @files = grep { !/^\.\.?$/ } readdir DIR;
print STDERR "files: @files\n\n";

If you are using Linux:
my $log_content = `cat /log/dir/*.log`;
This will combine the contents of all the log files into one string.
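If you would rather handle the files one at a time from Perl, so that newly added logs are picked up on the next pass, here is a minimal sketch; the rescan loop and the 5-second interval are assumptions, and tracking of already-processed files is omitted:
use strict;
use warnings;

my $readDir = '/log/dir';
while (1) {
    opendir(my $dh, $readDir) or die "Error in opening dir $readDir: $!";
    my @files = grep { /\.log$/ } readdir $dh;   # re-read the directory on every pass
    closedir $dh;

    for my $file (@files) {
        open my $fh, '<', "$readDir/$file" or next;
        while (my $line = <$fh>) {
            # process $line here
        }
        close $fh;
    }
    sleep 5;   # wait before scanning the directory again
}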

Related

Optimized way to print directory paths recursively without file comparison in perl

I have a directory which contains multiple levels of subdirectories. I want to print the path of each and every directory.
Currently, I am using
use File::Find;

find(
    {
        wanted => \&findfiles,
    }, $maindirectory);

sub findfiles
{
    if (-d) {
        push @arrayofdirs, $File::Find::dir;
    }
}
But each subdirectory contains thousands of files at each level. The above code takes a lot of time to produce the result, because it tests every file to see whether it is a directory. Is there a way to get the subdirectory paths without examining every file, or any other optimized method?
Edit: This issue got partially resolved but a new issue came up because of this solution. I have listed it here: Multiple File search in varying level of directories in perl
If you are on a UNIX/Linux platform, you can try reading the output of the find $maindirectory -type d command into your program (see this answer for a safe way to do that). This command prints only the names of directories under $maindirectory. It is faster because a compiled C program (find) does all the hard work. The following script should print all directory paths found.
Sample script:
use strict;
use warnings;

my $maindirectory = '.';
open my $fh, '-|', 'find', $maindirectory, '-type', 'd' or die "Can't open pipe: $!";
while ( my $dir = <$fh> ) {
    print $dir;
}
close $fh or warn "Can't close pipe: $!";
Note that there is no point in calling find from Perl and then just printing its output without any processing; you could just as well run find $maindirectory -type d in the shell itself.
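If you do want to process the paths in Perl, here is a minimal sketch that filters the output as it arrives; the hidden-directory filter is just an illustrative example, not part of the original answer:
use strict;
use warnings;

my $maindirectory = '.';
open my $fh, '-|', 'find', $maindirectory, '-type', 'd'
    or die "Can't open pipe: $!";
while ( my $dir = <$fh> ) {
    chomp $dir;
    next if $dir =~ m{/\.};   # illustrative filter: skip hidden directories
    print "$dir\n";
}
close $fh or warn "Can't close pipe: $!";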

Perl File::Copy is not actually copying the file

Quick synopsis: let's say there are multiple files of the same type in one directory (in this example, ten .txt files). I am trying to use Perl's copy function to copy five of them into a new directory, then zip up that directory.
The code works... except the folder that is supposed to have the .txt files copied into it doesn't actually have anything in it, and I don't know why. Here is my complete code:
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy;

my $source_dir = "C:\\Users\\user\\Documents\\test";
my $dest_dir   = "C:\\Users\\user\\Desktop\\files";
my $count = 0;

opendir(DIR, $source_dir) or die $!;
system('mkdir files');

while (my $file = readdir(DIR)) {
    print "$file\n";
    if ($count eq 5) {
        last;
    }
    if ($file =~ /\.txt/) {
        copy("$file", "$dest_dir/$file");
        $count++;
    }
}
closedir DIR;

system('"C:\Program Files\Java\jdk1.8.0_66\bin\jar.exe" -cMf files.zip files');
system('del /S /F /Q files');
system('rmdir files');
Everything runs... the directory files is created, then zipped up into files.zip... but when I open the zip file, the files directory is empty, so it's as if the copy statement didn't copy anything over.
In the $source_dir are 10 .txt files, like this (for testing purposes):
test1.txt
test2.txt
test3.txt
test4.txt
test5.txt
test6.txt
test7.txt
test8.txt
test9.txt
test10.txt
The files don't actually get copied over. NOTE: the print "$file\n" was added for testing, and it does print test1.txt, test2.txt, etc., up to test6.txt, so I know the script is finding the files, just not copying them over.
Any thoughts as to where I'm going wrong?
I think there is a typo in your script:
system('mkdir files');
should be:
system("mkdir $dest_dir");
but, the real issue is that you are not using the full path of the source file. Change your copy to:
copy("$source_dir/$file", $dest_dir);
and see if that helps.
You might also want to look at File::Path and Archive::Zip; they would eliminate the system calls.
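A minimal sketch of that module-based approach, assuming the same paths as in the question (File::Path is core; Archive::Zip is on CPAN and may need to be installed):
use strict;
use warnings;
use File::Copy;
use File::Path qw(make_path remove_tree);
use Archive::Zip qw(:ERROR_CODES);

my $source_dir = "C:\\Users\\user\\Documents\\test";
my $dest_dir   = "C:\\Users\\user\\Desktop\\files";

make_path($dest_dir);                # replaces system('mkdir ...')

opendir(my $dh, $source_dir) or die $!;
my $count = 0;
while (my $file = readdir($dh)) {
    last if $count == 5;
    if ($file =~ /\.txt$/) {
        # note the full source path, per the fix above
        copy("$source_dir\\$file", "$dest_dir\\$file") or die "Copy failed: $!";
        $count++;
    }
}
closedir $dh;

my $zip = Archive::Zip->new();       # replaces the jar.exe call
$zip->addTree($dest_dir, 'files');
$zip->writeToFileNamed('files.zip') == AZ_OK or die "Zip write failed";

remove_tree($dest_dir);              # replaces the del /S /F /Q and rmdir calls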

Build array of the contents of the working directory in perl

I am working on a script which utilizes files in surrounding directories using a path such as
"./dir/file.txt"
This works fine as long as the working directory is the one containing the script. However, the script is going out to multiple users, and some people may not change their working directory; they run the script by typing its entire path, like this:
./path/to/script/my_script.pl
This poses a problem: when the script tries to access ./dir/file.txt, it looks for the dir directory under the home directory and, of course, can't find it.
I am trying to use readdir and chdir to correct the directory if it isn't the right one. Here is what I have so far:
my $working_directory = $ENV{PWD};
print "Working directory: $working_directory\n"; # accurately prints working directory
my @directory = readdir $working_directory;      # crashes script
if (!("my_script.pl" ~~ @directory)) { # if my_script.pl isn't in @directory, do this
    print "Adjusting directory so I work\n";
    print "Your old directory: $ENV{PWD}\n";
    chdir $ENV{HOME};                      # make the directory home
    chdir "./path/to/script/my_script.pl"; # make the directory correct
    print "Your new directory: $ENV{PWD}\n";
}
The line containing readdir crashes my script with the following error:
Bad symbol for dirhandle at ./path/to/script/my_script.pl line 250.
which I find very strange, because I am running this from the home directory, which prints out properly right beforehand and has nothing to do with the "bad symbol".
I'm open to any solutions. Thank you in advance.
readdir operates on a directory handle, not on a path string. You need to do something like:
opendir(my $dh, $working_directory) || die "can't opendir: $!";
my @directory = readdir($dh);
Check perldoc for both readdir and opendir.
I think you're going about this the wrong way. If you're looking for a file that travels with your script, then what you should probably consider is the FindBin module; it lets you figure out the path to your script, for use in building file paths.
So e.g.
use FindBin;
my $script_path = $FindBin::Bin;
open ( my $input, '<', "$script_path/dir/file.txt" ) or warn $!;
That way you don't have to faff about with chdir and readdir etc.
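If you really do want to change the working directory, as the original script attempts, a minimal sketch along the same lines (this assumes the data directory travels with the script):
use strict;
use warnings;
use FindBin;

# chdir to the directory containing the script, so relative paths
# like ./dir/file.txt resolve the same way for every user
chdir $FindBin::Bin or die "Can't chdir to $FindBin::Bin: $!";
open my $input, '<', "./dir/file.txt" or warn $!;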

Delete files in a folder in perl

I'm trying to delete all the files in the directory called spool, but it doesn't work. I'm trying to use unlink:
unlink glob "$dir/*home/roz/newfolder/spool*";
is the code I'm trying to use, but it doesn't work.
First of all, spool* does not return the files in the spool folder, but rather all files in the newfolder folder whose names start with spool.
To get all the files in a folder named 'spool' use:
glob("$dir/*home/roz/newfolder/spool/*");
To get all hidden files in the 'spool' folder use:
glob("$dir/*home/roz/newfolder/spool/.*");
And finally, are you sure that '*home' is really what you want?
If it is a typo and you meant 'home', it will be clearer and less error-prone (you don't have to care about hidden files or files with spaces in the name) to use
my $path = "$dir/home/roz/newfolder/spool";
opendir(my $sdir, $path) or die "Unable to open $path: $!";
unlink grep { -f } map { "$path/$_" } readdir $sdir;

How can I scan multiple log files to find which ones have a particular IP address in them?

Recently there have been a few attackers trying malicious things on my server so I've decided to somewhat "track" them even though I know they won't get very far.
Now, I have an entire directory containing the server logs, and I need a way to search through every file in the directory and return a filename if a string is found. So I thought to myself: what better language to use for text and file operations than Perl? A friend is helping me with a script to scan all files for a certain IP and return the filenames that contain it, so I don't have to search for the attacker through every log manually. (I have hundreds.)
#!/usr/bin/perl
$dir = ".";
opendir(DIR, "$dir");
@files = grep(/\.*$/, readdir(DIR));
closedir(DIR);
foreach $file (@files) {
    open FILE, "$file" or die "Unable to open files";
    while (<FILE>) {
        print if /12.211.23.200/;
    }
}
It is giving me directory read errors, though. Any assistance is greatly appreciated.
EDIT: Code edited; it is still saying "permission denied, cannot open directory" on line 10. (I am just going to run the script from within the logs directory, if you are questioning the directory change to ".".)
Mike.
Can you use grep instead?
To get all the lines containing the IP, I would use grep directly; there's no need to list the files first, it's a single command:
grep 12\.211\.23\.200 *
I like to pipe it to another file and then open that file in an editor...
If you insist on wanting the filenames, that's also easy:
grep -l 12\.211\.23\.200 *
grep is available on all Unix/Linux systems with the GNU tools, and on Windows through one of the many implementations (unxutils, Cygwin, etc.).
You have to concatenate $dirname with $filname when using files found through readdir; remember you haven't chdir'ed into the directory where those files reside.
open FH, "<", "$dirname/$filname" or die "Cannot open $filname:$!";
Incidentally, why not just use grep -r to recursively search all subdirectories under your log dir for your string?
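For example (the log directory path is a placeholder; -l prints only the names of matching files):
grep -rl 12.211.23.200 /path/to/logs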
EDIT: I see your edits, and two things. First, this line:
@files = grep(/\.*$/,readdir(DIR));
is not effective, because you are searching for zero or more . characters at the end of the string. Since it's zero or more, it will match everything in the directory. If you're trying to exclude files ending in ., try this:
@files = grep(!/\.$/,readdir(DIR));
Note the ! sign for negation if you're trying to exclude those files. Otherwise (if you only want those files and I'm misunderstanding your intent), leave the ! out.
In any case, if you're getting your die message on line 10, most likely you're hitting a file whose permissions prevent you from reading it. Try putting the filename in the die output so you can see which file it's failing on:
open FILE, "$file" or die "Unable to open file: $file";
But as with other answers, and to reiterate: Why not use grep? The unix command, not the Perl function.
This will get the filenames you are looking for from within Perl, and probably much faster than scanning every file with a Perl regex.
@files = `find ~/ServerLogs -name "*.log" | xargs grep -l "<ip address>"`;
This will require a *nix-compliant system, though, or Cygwin on Windows.
First, get a list of files within your source directory:
opendir(DIR, "$dir");
@files = grep(/\.log$/, readdir(DIR));
closedir(DIR);
And then loop through those files
foreach $file (@files)
{
    # file processing code
}
My first suggestion would be to use grep instead. The right tool for the job, as they say...
But to answer your question:
readdir just returns the filenames from the directory. You'll need to concatenate the directory name and filename together.
$path = "$dirname/$filname";
open FH, $path or die ...
Then you should ignore files that are actually directories, such as "." and "..". After getting the $path, check to see if it's a file.
if (-f $path) {
    open FH, $path or die ...;
    while (<FH>) {
        ...
    }
}
By the way, I thought I would throw in a mention of File::Next. To iterate over all files in a directory (recursively):
use v5.10;          # enables say
use Path::Class;    # always useful.
use File::Next;

my $files = File::Next::files( dir(qw/path to files/) ); # look in path/to/files
while ( defined( my $file = $files->() ) ) {
    $file = file($file);
    say "Examining $file";
    say "found foo" if $file->slurp =~ /foo/;
}
File::Next is taint-safe.
~ doesn't auto-expand in Perl.
opendir my $fh, '~/' or die "Doin It Wrong";           # fails: ~ is not expanded
opendir my $fh, glob('~/') or die "Can't opendir: $!"; # that's right: glob expands the ~
Also, if you must use readdir(), make sure you guard the expression thus:
while (defined(my $filename = readdir(DH))) {
    ...
}
If you don't do the defined() test, the loop will terminate if it finds a file called '0'.
Have you looked on CPAN for log parsers? I searched with 'log parse' and it yielded over 200 hits. Some (probably many) won't be relevant - some may be. It depends, in part, on which web server you are using.
Am I reading this right? Your line 10, which gives you the error, is
open FILE, "$file" or die "Unable to open files";
And the $file you are trying to read, according to line 6,
@files = grep(/\.*$/,readdir(DIR));
is a file that ends with zero or more dots. Is this what you really wanted? It basically matches every file in the directory, including "." and "..". Maybe you don't have enough permission to open the parent directory for reading?
EDIT: if you only want to read all files (including hidden ones), you might want to use something like the following:
opendir(DIR, ".");
@files = readdir(DIR);
closedir(DIR);

foreach $file (@files) {
    if ($file ne "." and $file ne "..") {
        open FILE, "$file" or die "cannot open $file\n";
        # do stuff with FILE
    }
}
Note that this doesn't take care of sub directories.
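If you do need to descend into subdirectories, a minimal sketch using the core File::Find module (the IP is the one from the question; each matching file is reported once):
use strict;
use warnings;
use File::Find;

find(sub {
    return unless -f;                    # skip directories and special files
    open my $fh, '<', $_ or return;      # skip files we can't read
    while (<$fh>) {
        if (/12\.211\.23\.200/) {
            print "$File::Find::name\n"; # full path of the matching file
            last;                        # one report per file is enough
        }
    }
    close $fh;
}, '.');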
I know I am way late to this discussion (I ran across it while searching for grep-related posts), but I am going to answer anyway:
It isn't specified clearly whether these are web server logs (Apache, IIS, W3SVC, etc.), but the best tool for mining those for data is Microsoft's LogParser tool. See logparser.com for more info.
LogParser will allow you to write SQL-like statements against the log files. It is very flexible and very fast.
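For a rough flavor, a query along these lines might count hits from the attacker's IP (this example is an illustration, not from the original answer; the ex*.log file pattern, the c-ip field, and the -i:IISW3C input format are all assumptions that depend on your log format):
logparser "SELECT COUNT(*) FROM ex*.log WHERE c-ip = '12.211.23.200'" -i:IISW3C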
Use perl from the command line, like a better grep:
perl -wnl -e '/12\.211\.23\.200/ and print;' *.log > output.txt
The benefit here is that you can chain logic far more easily:
perl -wnl -e '(/12\.211\.23\.2(?:0[1-9]|1[01])/ or /denied/i) and print;' *.log
(Note that a character class like the original [1-11] matches only the character 1; an alternation is needed to match the numeric range 201-211.)
If you are feeling wacky, you can also use more advanced command-line options to feed the result of one Perl one-liner into another Perl one-liner, as sketched below.
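For instance (the field choice is just an illustration): the first one-liner selects the matching lines, and the second autosplits each line and prints its first whitespace-separated field, which might be a timestamp depending on your log format:
perl -wnl -e '/12\.211\.23\.200/ and print;' *.log | perl -wnla -e 'print $F[0];'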
You really should read "Minimal Perl: For UNIX and Linux People", an awesome book on this very sort of thing.
First, use grep.
But if you don't want to, here are two small improvements you can make that I haven't seen mentioned yet:
1) Change:
@files = grep(/\.*$/,readdir(DIR));
to
@files = grep({ !-d "$dir/$_" } readdir(DIR));
This way you will exclude not just "." and ".." but also any other subdirectories that may exist in the server log directory (which the open downstream would otherwise choke on).
2) Change:
print if /12.211.23.200/;
to
print if /12\.211\.23\.200/;
"." is a regex wildcard meaning "any character". Changing it to "\." will reduce the number of false positives (unlikely to change your results in practice but it's more correct anyway).