How do I read in the contents of a directory in Perl?

How do I get Perl to read the contents of a given directory into an array?
Backticks can do it, but is there some method using 'scandir' or a similar term?

opendir(D, "/path/to/directory") || die "Can't open directory: $!\n";
while (my $f = readdir(D)) {
print "\$f = $f\n";
}
closedir(D);
EDIT: Oh, sorry, missed the "into an array" part:
my $d = shift;
opendir(D, "$d") || die "Can't open directory $d: $!\n";
my @list = readdir(D);
closedir(D);
foreach my $f (@list) {
print "\$f = $f\n";
}
EDIT2: Most of the other answers are valid, but I wanted to comment on this answer specifically, in which this solution is offered:
opendir(DIR, $somedir) || die "Can't open directory $somedir: $!";
@dots = grep { (!/^\./) && -f "$somedir/$_" } readdir(DIR);
closedir DIR;
First, to document what it's doing, since the poster didn't: it passes the list returned by readdir() through a grep() that keeps only the entries that are plain files (as opposed to directories, devices, named pipes, etc.) and that do not begin with a dot (which makes the array name @dots misleading, but that's due to the change he made when copying it over from the readdir() documentation). Since it limits the contents of the directory it returns, I don't think it's technically a correct answer to this question, but it illustrates a common idiom for filtering filenames in Perl, and I thought it would be valuable to document. Another example seen a lot is:
@list = grep !/^\.\.?$/, readdir(D);
This snippet reads all contents from the directory handle D except '.' and '..', since those are very rarely desired to be used in the listing.
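Putting the pieces of this answer together, here is a minimal sketch that reads a directory into an array through a lexical handle and drops '.' and '..'. The name list_dir and the default of '.' are assumptions made for illustration; they are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: return every entry of $dir except '.' and '..'.
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";
    # \z anchors at the true end of the string, so a name that merely
    # ends in a newline is not mistaken for '.' or '..'.
    my @entries = grep { !/^\.\.?\z/ } readdir($dh);
    closedir($dh);
    @entries = sort @entries;
    return @entries;
}

# Assumed usage: directory from the command line, else the current one.
my $dir = @ARGV ? $ARGV[0] : '.';
print "$_\n" for list_dir($dir);
```

The returned names are relative to the directory, so prepend "$dir/" before passing them to file tests or open().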

A quick and dirty solution is to use glob:
@files = glob('/path/to/dir/*');

This will do it, in one line (note the '*' wildcard at the end)
@files = </path/to/directory/*>;
# To demonstrate:
print join(", ", @files);
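One caveat with both glob forms above: a plain * skips dotfiles, and the built-in glob splits its pattern on whitespace. Below is a rough sketch of one way around both, assuming the core File::Glob module is acceptable. The name list_with_glob is made up for illustration.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: every entry of $dir, dotfiles included, as full paths.
# bsd_glob treats its argument as one pattern, so a directory name that
# contains spaces is not split into several patterns.
# Assumption: $dir itself contains no glob metacharacters such as * or [.
sub list_with_glob {
    my ($dir) = @_;
    # {*,.*} matches normal and hidden names; the grep drops '.' and '..'.
    my @paths = grep { !m{/\.\.?\z} } bsd_glob("$dir/{*,.*}");
    @paths = sort @paths;
    return @paths;
}

# Assumed usage: directory from the command line, else the current one.
print "$_\n" for list_with_glob(@ARGV ? $ARGV[0] : '.');
```

Unlike readdir, this returns paths that already include the directory, which is convenient when the next step is a file test or stat.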

IO::Dir is nice and provides a tied hash interface as well.
From the perldoc:
use IO::Dir;
$d = IO::Dir->new(".");
if (defined $d) {
while (defined($_ = $d->read)) { something($_); }
$d->rewind;
while (defined($_ = $d->read)) { something_else($_); }
undef $d;
}
tie %dir, 'IO::Dir', ".";
foreach (keys %dir) {
print $_, " " , $dir{$_}->size,"\n";
}
So you could do something like:
tie %dir, 'IO::Dir', $directory_name;
my @dirs = keys %dir;

You could use DirHandle:
use DirHandle;
$d = new DirHandle ".";
if (defined $d)
{
while (defined($_ = $d->read)) { something($_); }
$d->rewind;
while (defined($_ = $d->read)) { something_else($_); }
undef $d;
}
DirHandle provides an alternative, cleaner interface to the opendir(), closedir(), readdir(), and rewinddir() functions.

Similar to the above, but I think the best version is (slightly modified) from "perldoc -f readdir":
opendir(DIR, $somedir) || die "can't opendir $somedir: $!";
@dots = grep { (!/^\./) && -f "$somedir/$_" } readdir(DIR);
closedir DIR;

You can also use the children method from the popular Path::Tiny module:
use Path::Tiny;
my @files = path("/path/to/dir")->children;
This creates an array of Path::Tiny objects, which are often more useful than just filenames if you want to do things to the files, but if you want just the names:
my @files = map { $_->stringify } path("/path/to/dir")->children;

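Here's an example of recursing through a directory structure and copying files, taken from a backup script I wrote. It relies on File::Copy for copy() and on File::stat for the object-style stat() results.
use File::Copy;
use File::stat;
sub copy_directory {
my ($source, $dest) = @_;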
my $start = time;
# get the contents of the directory.
opendir(D, $source) or die "Can't open directory $source: $!\n";
my @f = readdir(D);
closedir(D);
# recurse through the directory structure and copy files.
foreach my $file (@f) {
# Setup the full path to the source and dest files.
my $filename = $source . "\\" . $file;
my $destfile = $dest . "\\" . $file;
# get the file info for the 2 files.
my $sourceInfo = stat( $filename );
my $destInfo = stat( $destfile );
# make sure the destination directory exists.
mkdir( $dest, 0777 );
if ($file eq '.' || $file eq '..') {
} elsif (-d $filename) { # if it's a directory then recurse into it.
#print "entering $filename\n";
copy_directory($filename, $destfile);
} else {
# Only backup the file if it has been created/modified since the last backup
if( (not -e $destfile) || ($sourceInfo->mtime > $destInfo->mtime ) ) {
#print $filename . " -> " . $destfile . "\n";
copy( $filename, $destfile ) or print "Error copying $filename: $!\n";
}
}
}
print "$source copied in " . (time - $start) . " seconds.\n";
}
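The "copy only if newer" decision in the script above can be isolated into a small helper. This is only a sketch: needs_copy is a hypothetical name, and it uses Perl's built-in list-returning stat, so it assumes File::stat has not been loaded in the same package.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $src should be copied over $dst, i.e. when
# $dst does not exist yet or $src has a later modification time.
sub needs_copy {
    my ($src, $dst) = @_;
    my @src_stat = stat($src) or die "Can't stat $src: $!\n";
    my @dst_stat = stat($dst);
    return 1 unless @dst_stat;                     # destination is missing
    return $src_stat[9] > $dst_stat[9] ? 1 : 0;    # element 9 is the mtime
}
```

Checking for the missing destination first avoids comparing against an undefined mtime.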

from: http://perlmeme.org/faqs/file_io/directory_listing.html
#!/usr/bin/perl
use strict;
use warnings;
my $directory = '/tmp';
opendir (DIR, $directory) or die $!;
while (my $file = readdir(DIR)) {
next if ($file =~ m/^\./);
print "$file\n";
}
The following example (based on a code sample from perldoc -f readdir) gets all the files (not directories) beginning with a period from the open directory. The filenames are found in the array @dots.
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '/tmp';
opendir(DIR, $dir) or die $!;
my @dots
= grep {
/^\./ # Begins with a period
&& -f "$dir/$_" # and is a file
} readdir(DIR);
# Loop through the array printing out the filenames
foreach my $file (@dots) {
print "$file\n";
}
closedir(DIR);
exit 0;

Related

Directory Handle in Perl Not Working Properly

I am trying to read files inside a folder in Perl using Directory Handle. The script is able to show the file name but it is throwing two errors: readdir() attempted on invalid dirhandle DIR and closedir() attempted on invalid dirhandle DIR.
I am calling a subroutine and passing two values:
if($fileEnding eq "directory")
{
print "$fileName is a directory\n";
FolderInvestigator1($a, $fileName);
}
$a holds the directory name and its path which is being passed via command-line argument. I am passing the control to a subroutine.
Below is my code:-
sub FolderInvestigator1
{
my $prevPath = shift;
my $receivedFolder = shift;
my $realPath = "$prevPath/$receivedFolder";
my $path = File::Spec->rel2abs($realPath);
print "$path\n";
print "$receivedFolder Folder Received\n";
opendir(DIR, $path) or die "You've Passed Invalid Directory as Arguments\n";
while(my $fileName = readdir DIR)
{
next if $fileName =~ /^\./;
print "The Vacant Folder has $fileName file\n";
}
closedir(DIR);
}
Here is my complete code:-
FirstResponder();
sub FirstResponder
{
if (@ARGV == 0)
{
print "No Arguments Passed\n";
}
else
{
foreach my $a(@ARGV)
{
print "Investigating $a directory below:-\n";
opendir(DIR, $a) or die "You've Passed Invalid Directory as Arguments\n";
while(my $fileName = readdir DIR)
{
next if $fileName =~ /^\./;
$ending = `file --mime-type $a/$fileName`;
#print $ending;
$fileEnding = `basename -s $ending`;
#print $fileEnding;
chomp($fileEnding);
#print $fileName,"\n";
if($fileEnding eq "directory")
{
print "$fileName is a directory\n";
FolderInvestigator1($a, $fileName);
}
else
{
CureExtensions($a, $fileName);
}
}
closedir(DIR);
my @files = glob("$a/*");
my $size = @files;
if($size == 0)
{
print "The $a is an empty directory\n";
}
}
}#Foreach Ends Here..
}
Please see the screenshot for more information on what's going on!
I am not able to realize why Directory Handle is throwing error even though I made the path correct. Some guidance will be highly appreciated.
The problem with your code is that you have a nested use of the bareword (global) dir handle DIR, and hence the inner loop closes the handle before the outer loop is finished:
opendir(DIR, $arg) or die "...";
while(my $fileName = readdir DIR) {
# ... more code here
opendir(DIR, $path) or die "...";
while(my $file = readdir DIR) {
# ... more code here
}
closedir DIR;
}
closedir DIR;
Here is an example of how you could write the first loop using a lexical dir handle $DIR instead of using a legacy global bareword handle DIR:
use feature qw(say);
use strict;
use warnings;
use File::Spec;
FirstResponder();
sub FirstResponder {
foreach my $arg (@ARGV) {
print "Investigating $arg directory below:-\n";
opendir(my $DIR, $arg) or die "You've Passed Invalid Directory as Arguments\n";
my $size = 0;
while(my $fileName = readdir $DIR) {
next if $fileName =~ /^\./;
my $path = File::Spec->catfile( $arg, $fileName );
if( -d $path) {
print "$fileName is a directory\n";
say "FolderInvestigator1($arg, $fileName)"
}
else {
say "CureExtensions($arg, $fileName)";
}
$size++;
}
closedir $DIR;
if($size == 0) {
print "The $arg is an empty directory\n";
}
}
}
The use of bareword filehandle names is old style and discouraged in new code, according to perldoc open:
An older style is to use a bareword as the filehandle, as
open(FH, "<", "input.txt")
or die "Can't open < input.txt: $!";
Then you can use FH as the filehandle, in close FH and <FH> and so on. Note that it's a global
variable, so this form is not recommended in new code.
See also:
Why does Perl open() documentation use two different FILEHANDLE style?
Don't Open Files in the old way

Perl cannot stat $_

I want the last modified time for each file in the directory. To make sure my loop is working I print $_ and I see the file names of the directory:
for ( @Files ) {
opendir( D, $path . '\/' . $_ ) or die "$!";
my @textfiles = grep { ! /^\.{1,2}$/ } readdir( D );
for ( @textfiles ) {
# print "$_\n"; <----the file names.
my $epoch_timestamp = ( stat( $_ ) )[9];
print "$epoch_timestamp\n";
}
I get this error
Use of uninitialized value $epoch_timestamp in concatenation (.) or string
What am I doing wrong?
readdir returns only the names of the files. If your current working directory is different then you must build the full path as you did with the parameter to opendir. The easiest way is to use map in the list for the for loop
I'm concerned about your statement
opendir( D, $path . '\/' . $_ ) or die "$!";
which will put, literally, \/ between $path and $_. I think you need just /, but it is simplest to interpolate the variables with
opendir( D, "$path/$_" ) or die "$!";
But $_ comes from the array @Files. If these are indeed file names then your opendir will fail. They need to be directory names.
In my solution I've built the variable $dir as
my $dir = "$path/$_"
so that it can be used in the call to opendir as well as to build the full path to the files in the following for loop
Note that I have also used a lexical directory handle, my $dh, which is far superior to a global handle such as D.
for ( @Files ) {
my $dir = "$path/$_";
opendir my $dh, $dir or die $!;
my @textfiles = grep { ! /^\.{1,2}$/ } readdir $dh;
for ( map { "$dir/$_" } @textfiles ) {
# print "$_\n"; <----the file names.
my $epoch_timestamp = ( stat( $_ ) )[9];
print "$epoch_timestamp\n";
}
}
As an alternative to the answers above, you could use a module to make your life easier, such as Path::Tiny:
use 5.014;
use warnings;
use Path::Tiny;
my $path = path('/etc');
my @Files = qw(defaults cups ssl);
for my $dir (@Files) {
my @textfiles = $path->child($dir)->children;
for my $file (@textfiles) {
say "$file: ", $file->stat->mtime;
}
}
Of course, the nested loop above could be written as
for my $dir (@Files) {
my @textfiles = $path->child($dir)->children;
say "$_: ", $_->stat->mtime for (@textfiles);
}
and since storing the list of files in @textfiles isn't necessary, it could be reduced to:
for my $dir (@Files) {
say "$_: ", $_->stat->mtime for ( $path->child($dir)->children );
}
Path::Tiny conveniently throws a clean exception message on error.
readdir only returns the name of the file in the directory. You need to provide a qualified path to the file to stat.
my $dir_qfn = ...;
opendir(my $dh, $dir_qfn)
or do {
warn("Can't read dir \"$dir_qfn\": $!\n");
next;
};
while (defined( my $fn = readdir($dh) )) {
next if $fn =~ /^\.\.?\z/;
my $qfn = "$dir_qfn/$fn";
my $mtime = ( stat($qfn) )[9];
defined($mtime)
or do {
warn("Can't stat file \"$qfn\": $!\n");
next;
};
...
}
Using glob instead
my $dir = ...;
my %ts =
map { $_ => (stat $_)[9] }
grep { !m{/\.\.?\z} } #/
glob "\Q$dir\E/{*,.*}";
say "$ts{$_} => $_" for sort keys %ts;
I use a hash name => timestamp to collect both in a data structure. The pattern $dir/{*,.*} is there to catch dot files as well, or it would be just $dir/*.
The grep filters out the . and .. entries, which the m{..} match finds at the end of the path. The glob pattern needs \Q..\E to prevent an injection bug with particular directory names. It also escapes spaces, so File::Glob with its :bsd_glob option isn't needed. Thanks to ikegami for comments.
If you'd rather process files one at a time, retrieve the list with glob and then iterate through it.
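The point shared by all of the answers above is that stat needs a path that includes the directory. A short sketch of that, collecting name => mtime pairs through a lexical handle, is below. The name mtimes_in and the printed format are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map each entry name in $dir to its mtime
# (seconds since the epoch).
sub mtimes_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";
    my %mtime;
    while (defined(my $name = readdir($dh))) {
        next if $name =~ /^\.\.?\z/;
        # readdir returns bare names, so prepend the directory before stat.
        my @st = stat("$dir/$name");
        if (!@st) {
            warn "Can't stat $dir/$name: $!\n";
            next;
        }
        $mtime{$name} = $st[9];
    }
    closedir($dh);
    return %mtime;
}

# Assumed usage: directory from the command line, else the current one.
my %mtime = mtimes_in(@ARGV ? $ARGV[0] : '.');
printf "%s => %s\n", $_, scalar localtime($mtime{$_}) for sort keys %mtime;
```

Calling stat on the bare name only works when the script's current working directory happens to be the directory being read, which is why the original loop printed an uninitialized value.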

How to get files names with specific extension from a folder in perl

Currently in a perl script I am using the glob function to get a list of files with specific extensions.
my @filearray = glob("$DIR/*.abc $DIR/*.llc");
Is there any alternative to glob, to get the list of files with specific extension from a folder? If so please provide me some example? Thank you
Yes, there are much more complicated ways, like opendir, readdir and a regex filter. They will also give you the hidden files (or dotfiles):
opendir DIR, $DIR or die $!;
my @filearray = grep { /\.(abc|llc)$/ } readdir DIR;
closedir DIR;
#Using:
opendir(DIR, $dir) || die "$!";
my @files = grep(/\.(abc|lic)$/, readdir(DIR));
closedir(DIR);
#Reference: CPAN
use Path::Class; # Exports dir() by default
my $dir = dir('foo', 'bar'); # Path::Class::Dir object
my $dir = Path::Class::Dir->new('foo', 'bar'); # Same thing
my $file = $dir->file('file.txt'); # A file in this directory
my $handle = $dir->open;
while (my $file = $handle->read)
{
$file = $dir->file($file); # Turn into Path::Class::File object
...
}
#Reference: http://accad.osu.edu/~mlewis/Class/Perl/perl.html#cd
# search for a file in all subdirectories
#!/usr/local/bin/perl
if ($#ARGV != 0) {
print "usage: findfile filename\n";
exit;
}
$filename = $ARGV[0];
# look in current directory
use Cwd;
$dir = getcwd(); # getcwd() returns no trailing newline, so no chop() is needed
&searchDirectory($dir);
sub searchDirectory
{
local($dir);
local(@lines);
local($line);
local($file);
local($subdir);
$dir = $_[0];
# check for permission
if(-x $dir)
{
# search this directory
@lines = `cd $dir; ls -l | grep $filename`;
foreach $line (@lines)
{
$line =~ /\s+(\S+)$/;
$file = $1;
print "Found $file in $dir\n";
}
# search any sub directories
@lines = `cd $dir; ls -l`;
foreach $line (@lines)
{
if($line =~ /^d/)
{
$line =~ /\s+(\S+)$/;
$subdir = $dir."/".$1;
&searchDirectory($subdir);
}
}
}
}
Please try another one:
use Cwd;
use File::Find;
my $dir = getcwd();
my @abclicfiles;
find(\&wanted, $dir);
sub wanted
{
push(@abclicfiles, $File::Find::name) if($File::Find::name=~m/\.(abc|lic)$/i);
}
print join "\n", @abclicfiles;
This version reads the directory name from the user:
print "Please enter the directory: ";
my $dir = <STDIN>;
chomp($dir);
opendir(DIR, $dir) || die "Couldn't read dir: $!";
my @files = grep(/\.(txt|lic)$/, readdir(DIR));
closedir(DIR);
print join "\n", @files;
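The readdir-plus-regex approach from these answers can be generalized to any list of extensions. Below is a minimal sketch. The name files_with_ext is hypothetical, and matching case-insensitively is an assumption rather than something the question asked for.

```perl
use strict;
use warnings;

# Hypothetical helper: names of plain files in $dir that end in one of
# the given extensions (passed without the leading dot).
sub files_with_ext {
    my ($dir, @exts) = @_;
    # Build an alternation such as (?:abc|llc); quotemeta guards against
    # extensions that contain regex metacharacters.
    my $alt = join '|', map { quotemeta($_) } @exts;
    my $re  = qr/\.(?:$alt)\z/i;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";
    my @names = grep { /$re/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    @names = sort @names;
    return @names;
}

# Assumed usage: the two extensions from the question, current directory.
print "$_\n" for files_with_ext('.', 'abc', 'llc');
```

Note that a character class such as [abc|llc] would match single characters, not whole extensions, which is why the sketch uses a grouped alternation.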

Perl opening files from recursive directory

So my program is supposed to recursively go through a directory and then for each file in the directory, open up the file and search for the words "error" "fail" and "failed." Then it should write the instances where these words occur, as well as the rest of the characters on the line after those words, out to a file designated in the command prompt. I have been having some trouble making sure the program performs the search on the files that are found in the directory. Right now it does recurse through the directory and even creates a file to write out to, however, it does not seem to be searching through the files found in the recursing. Here is my code:
#!/usr/local/bin/perl
use warnings;
use strict;
use File::Find;
my $argument2 = $ARGV[0];
my $dir = "c:/program/Scripts/Directory1"; #directory to search through
open FILE, ">>$argument2" or die $!; #file to write out
my $unsuccessful = 0;
my @errors = ();
my @failures= ();
my @failures2 = ();
my @name = ();
my @file;
my $file;
my $filename;
opendir(DIR, $dir) or die $!;
while($file = readdir(DIR)){
next if($file =~ m/^\./);
foreach(0..$#file){
print $_;
open(FILELIST, '<', $_);
while(<FILELIST>){
if (/Unsuccessful/i){
$unsuccessful = 1;
}
if(/ERROR/ ){
push(@errors, "ERROR in line $.\n");
print "\t\tERROR in line $.:$1\n" if (/Error\s+(.+)/);
}
if(/fail/i ){
push(@failures, "ERROR in line $.\n");
print FILE "ERROR in line $.:$1\n" if (/fail\s+(.+)/);
}
if(/failed/i ){
push(@failures2, "ERROR in line $.\n");
print FILE "ERROR in line $.:$1\n" if (/failed\s+(.+)/);
}
if ($unsuccessful){
}
}
close FILELIST;
}
}
closedir(DIR);
close FILE;
So, to clarify, my problem is that the search contained in the "while()" loop does not seem to be executing on the files found in the directory recursively. Any comments/suggestions/help that you can give on why this may be happening would be very helpful. I am new to Perl so some sample code would also just help me understand what you are trying to say. Thank you very much.
Typically, when I want to do something with files recursively, I start with find2perl . -print, which generates the boilerplate, including a wanted function that I can modify to do whatever I want.
For example
# Traverse desired filesystems
File::Find::find({wanted => \&wanted}, '.');
exit;
sub wanted {
return unless -f $File::Find::name;
return unless -R $File::Find::name;
open (F,"<",$File::Find::name) or warn("Error opening $File::Find::name : $!\n");
while(<F>) {
if(m/error/) { print; }
if(m/fail/) { print; }
}
}
This is an example of a recursive Perl directory listing. In reality, I would probably use File::Find, or really just grep -R, but I am assuming this is homework of some kind:
use strict;
my $dir = $ARGV[0];
my $level = 0;
depthFirstDirectoryList($dir, $level);
sub depthFirstDirectoryList{
my ($dir, $level) = @_;
opendir (my $ind, $dir) or die "Can't open $dir for reading: $!\n";
while(my $file = readdir($ind)){
if(-d "$dir/$file" && $file ne "." && $file ne ".."){
depthFirstDirectoryList("$dir/$file", $level + 1);
}
else{
no warnings 'uninitialized';
print "\t" x $level . "file: $dir/$file\n";
}
}
}
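Combining the two answers above, a sketch of what the question seems to be after could look like the following: walk the tree with File::Find, open each plain file, and collect the lines that mention "error" or "fail". The name scan_tree and the "path:line: text" output format are assumptions; the original script's exact report layout is not reproduced here.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: scan every plain file below $dir and return
# "path:line: text" strings for lines mentioning "error" or "fail".
sub scan_tree {
    my ($dir) = @_;
    my @hits;
    find({
        no_chdir => 1,    # stay in the starting directory while walking
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path;
            open(my $fh, '<', $path) or do {
                warn "Can't open $path: $!\n";
                return;
            };
            while (my $line = <$fh>) {
                chomp $line;
                # "fail" also matches "failed" and "failure".
                push @hits, sprintf('%s:%d: %s', $path, $., $line)
                    if $line =~ /error|fail/i;
            }
            close($fh);
        },
    }, $dir);
    return @hits;
}

# Assumed usage: script.pl DIRECTORY REPORT_FILE
if (@ARGV == 2) {
    my ($dir, $report) = @ARGV;
    open(my $out, '>>', $report) or die "Can't open $report: $!\n";
    print {$out} "$_\n" for scan_tree($dir);
    close($out);
}
```

The original script never opened the files it found because its inner loop iterated over an empty array; here each file is opened inside the wanted callback as soon as File::Find reports it.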

How to check in Perl whether there is any sub folder exist in a folder?

Is there a way to check whether any subfolder exists inside a folder? I would like to do this in Perl.
Glob through the contents of the directory, and check whether it is a directory with -d.
sub has_subfolder {
my $directory = shift;
for ( <$directory/*>, <$directory/.*> ) {
next if m#/\.\.?$#; # skip . and ..
return 1 if -d;
}
return 0;
}
if (grep -d, glob("$folder/*")) {
print "$folder has subfolder(s)\n";
}
If you want to deal with directories matching .*, you could do:
if (grep -d && !/\.\.?$/, glob("$folder/.* $folder/*")) {
print "$folder has subfolder(s)\n";
}
You can use the File::Find module for this purpose. File::Find scans a directory recursively and calls your subroutine for each entry. Here is the sample code:
use File::Find;
my $DirName = 'dirname' ;
sub has_subdir
{
#The path of the file/dir being visited.
my $subdir = $File::Find::name;
#Ignore if this is a file.
return unless -d $subdir;
#Ignore if $subdir is $DirName itself.
return if ( $subdir eq $DirName);
# if we have reached here, this is a subdirectory.
print "Sub directory found - $subdir\n";
}
#For each file and sub directory in $DirName, 'find' calls
#the 'has_subdir' subroutine recursively.
find (\&has_subdir, $DirName);
In order to check if a subfolder exists in a directory (without knowing any names):
my $dir_name = "some_directory";
opendir my $dir, $dir_name
or die "Could not open directory $dir_name: $!";
my $has_subfolder = grep { -d && !/(^|\/)\.\.?$/ } map { ("$dir_name"||'.')."/$_" } readdir $dir;
In other words, it checks for one or more files in the directory which are themselves directories.
If you want a specific subfolder, just use Geo's answer.
Edit: This is getting silly now, but here's a truly general-purpose answer. :-P Someone else is getting the check mark anyway.
sub hasSubDir {
my $dir_name = shift;
opendir my $dir, $dir_name
or die "Could not open directory $dir_name: $!";
my @files = readdir($dir);
closedir($dir);
for my $file (@files) {
if($file !~ /^\.\.?$/) {
return 1 if -d "$dir_name/$file";
}
}
return 0;
}
OK, I'm just gonna have to submit my own answer
sub has_subfolder {
my $dir = shift;
my $found = 0;
opendir my $dh, $dir or die "Could not open directory $dir: $!";
while (defined(my $entry = readdir($dh))) {
next if $entry =~ /^\.\.?$/; # skip '.' and '..'
my $path = $dir . '/' . $entry; # readdir doesn't return the whole path
if (-d $path) { # found a dir? record it, and leave the loop!
$found = 1;
last;
}
}
closedir($dh); # make sure we cleanup after!
return $found;
}
Compared to other answers:
finds hidden directories
completes as soon as it finds a match
doesn't traverse the tree twice (once for normal files, and again for hidden files)
EDIT - I see the requirements just changed (sigh). Fortunately the code above is trivially modified:
sub get_folders {
my $dir = shift;
my @found;
opendir my $dh, $dir or die "Could not open directory $dir: $!";
while (defined(my $entry = readdir($dh))) {
next if $entry =~ /^\.\.?$/; # skip '.' and '..'
my $path = $dir . '/' . $entry; # readdir doesn't return the whole path
push(@found, $entry) if (-d $path); # found a dir? record it
}
closedir($dh); # make sure we cleanup after!
return @found;
}
if(-e "some_folder/some_subfolder") {
print "folder exists";
}
else {
print "folder does not exist";
}
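One caveat with the snippet above: -e is true for any existing path, including a plain file named like the folder. If the intent is "exists and is a directory", -d is the stricter test. A tiny sketch follows; is_folder is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if $path exists and is a directory.
# -e alone is also true for plain files, so -d is the stricter check.
sub is_folder {
    my ($path) = @_;
    return -d $path ? 1 : 0;
}

# Assumed usage: the current directory always exists and is a directory.
print is_folder('.') ? "folder exists\n" : "folder does not exist\n";
```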