How can I recursively read out directories in Perl?

I want to read out a directory recursively to print the data structure in an HTML page with Template::Toolkit.
But I'm stuck on how to save the paths and files in a form that can be read out easily.
My idea started like this:
sub list_dirs{
my ($rootPath) = @_;
my (@paths);
$rootPath .= '/' if($rootPath !~ /\/$/);
for my $eachFile (glob($path.'*'))
{
if(-d $eachFile)
{
push (@paths, $eachFile);
&list_dirs($eachFile);
}
else
{
push (@files, $eachFile);
}
}
return @paths;
}
How could I solve this problem?

This should do the trick
use strict;
use warnings;
use File::Find qw(finddepth);
my @files;
finddepth(sub {
return if($_ eq '.' || $_ eq '..');
push @files, $File::Find::name;
}, '/my/dir/to/search');

You should always use strict and warnings to help you debug your code. Perl would have warned you, for example, that @files is not declared. But the real problem with your function is that you declare a lexical variable @paths on every recursive call to list_dirs and don't push the return value back after the recursion step.
push @paths, list_dirs($eachFile);
If you don't want to install additional modules, the following solution should probably help you:
use strict;
use warnings;
use File::Find qw(find);
sub list_dirs {
my @dirs = @_;
my @files;
find({ wanted => sub { push @files, $_ } , no_chdir => 1 }, @dirs);
return @files;
}
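The fix described above (keep a fresh @paths per call and push the recursive call's return value onto it) can be sketched roughly as follows. This is only a minimal illustration that keeps the question's glob-based approach. It assumes paths without whitespace (glob splits its pattern on spaces) and, like the original, it skips dot-files. The symlink check is an addition that was not in the question.

```perl
use strict;
use warnings;

# Recursive listing without File::Find: every call builds its own
# @paths and returns it, and the caller appends the result.
sub list_dirs {
    my ($root) = @_;
    $root .= '/' unless $root =~ m{/\z};

    my @paths;
    for my $entry (glob("$root*")) {
        push @paths, $entry;
        # Descend into real directories only; skipping symlinks
        # (an assumption added here) avoids endless loops.
        push @paths, list_dirs($entry) if -d $entry && !-l $entry;
    }
    return @paths;
}
```

Calling it as print "$_\n" for list_dirs('/my/dir/to/search'); should then print directories and files in one flat list.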

The answer by mdom explains how your initial attempt went astray. I would also suggest that you consider friendlier alternatives to File::Find. CPAN has several options. Here's one.
use strict;
use warnings;
use File::Find::Rule;
my @paths = File::Find::Rule->in(@ARGV);
Also see here:
SO answer providing CPAN alternatives to File::Find.
SO question on directory iterators.
And here is a rewrite of your recursive solution. Things to note: use strict; use warnings; and the use of a scoping block to create a static variable for the subroutine.
use strict;
use warnings;
print $_, "\n" for dir_listing(@ARGV);
{
my @paths;
sub dir_listing {
my ($root) = @_;
$root .= '/' unless $root =~ /\/$/;
for my $f (glob "$root*"){
push @paths, $f;
dir_listing($f) if -d $f;
}
return @paths;
}
}

I think the problem is in the following line of your code:
for my $eachFile (glob($path.'*'))
Change the $path variable to $rootPath.
It will then hold the path correctly.

I use this script to remove hidden files (created by Mac OS X) from my USB pen drive, which I usually use to listen to music in the car. Without it, any file ending with ".mp3", even one starting with "._", is listed in the car audio list.
#!/bin/perl
use strict;
use warnings;
use File::Find qw(find);
sub list_dirs {
my @dirs = @_;
my @files;
find({ wanted => sub { push @files, $_ } , no_chdir => 1 }, @dirs);
return @files;
}
if ( ! @ARGV || !$ARGV[0] ) {
print "** Invalid dir!\n";
exit ;
}
if ( $ARGV[0] !~ /\/Volumes\/\w/s ) {
print "** Dir should be at /Volumes/... > $ARGV[0]\n";
exit ;
}
my @paths = list_dirs($ARGV[0]) ;
foreach my $file (@paths) {
my ($filename) = ( $file =~ /([^\\\/]+)$/s ) ;
if ($filename =~ /^\._/s ) {
unlink $file ;
print "rm> $file\n" ;
}
}

You can use this method as a recursive file search that picks out specific file types:
use File::Find;
my @files;
push @files, list_dir($outputDir);
sub list_dir {
my @dirs = @_;
my @files;
find({ wanted => sub { push @files, glob "\"$_/*.txt\"" } , no_chdir => 1 }, @dirs);
return @files;
}

Related

How to store result from Find::File into array

I want to list the file in directory and subdirectory. I use perl File::Find. Is it possible for me to store the result into an array?
Here is the code
use warnings;
use strict;
use File::Find;
my $location="tmp";
sub find_txt {
my $F = $File::Find::name;
if ($F =~ /txt$/ ) {
push @filelist, $F;
return @filelist;
}
}
my @fileInDir = find({ wanted => \&find_txt, no_chdir=>1}, $location);
print OUTPUT @fileInDir
The code above doesn't display any output.

Sure, just push into an array declared outside:
use warnings;
use strict;
use File::Find;
my $location = "tmp";
my @results;
my $find_txt = sub {
my $F = $File::Find::name;
if ($F =~ /txt$/ ) {
push @results, $F;
}
};
find({ wanted => $find_txt, no_chdir=>1}, $location);
for my $result (@results) {
print "found $result\n";
}
The return value of the wanted callback is ignored. find itself has no documented or useful return value either.
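Since neither the callback's nor find's return value is usable, one way to get list-returning behaviour is to hide the outer array inside a small wrapper. The sketch below assumes a helper named find_matching, which is made up for illustration and is not part of File::Find.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not part of File::Find): returns every path
# under $top whose full name matches the regex $pattern.
sub find_matching {
    my ($top, $pattern) = @_;
    my @results;
    find(
        {
            # The closure appends to @results; whatever it returns
            # is ignored by find.
            wanted => sub {
                push @results, $File::Find::name
                    if $File::Find::name =~ $pattern;
            },
            no_chdir => 1,
        },
        $top,
    );
    return @results;
}
```

With that in place, my @fileInDir = find_matching('tmp', qr/txt$/); gives the array the question was after, without a file-level variable.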

For posterity, this is much more straightforward with Path::Iterator::Rule.
use strict;
use warnings;
use Path::Iterator::Rule;
my $location = 'tmp';
my $rule = Path::Iterator::Rule->new->not_dir->name(qr/txt$/);
my @paths = $rule->all($location);

Replace
my @fileInDir = find({ wanted => \&find_txt, no_chdir=>1}, $location);
with
my @fileInDir;
find({ wanted => sub { push @fileInDir, find_txt(); }, no_chdir=>1 }, $location);
and add the missing
return;
aka
return ();
to find_txt. Unlike the solution in the earlier answer, this allows you to have reusable and conveniently located "wanted" subs.

How to pass an anonymous sub to Find::File

I know I can do this as an expression modifier:
#!/usr/bin/perl -w
use strict;
use File::Find;
sub file_find{
my ($path,$filter) = @_;
find(sub {print $File::Find::name."\n" if /$filter/}, $path);
}
file_find($newdir,'\.txt');
or this which is less readable:
find(sub {if(/$filter/){print $File::Find::name."\n"}}, $path);
But if I wanted to do something like this, how can I do it?
sub file_find{
my ($path,$filter) = @_;
find(\&print, $path);
sub print {
if(/$filter/){ #Variable $filter will not stay shared
print $File::Find::name."\n";
}
}
}
file_find($newdir,'\.txt')
I get 'variable will not stay shared'. I believe I'm supposed to make it an anonymous sub:
my $print = sub {
if(/$filter/){
print $File::Find::name."\n";
}
}
But then I don't know how to pass the reference to the find sub. Perhaps it's something silly I'm missing.
Edit: Never mind, this seems to work:
sub file_find{
my ($path,$filter) = @_;
my $subref = sub{
if(/$filter/){
print $File::Find::name."\n";
}
};
find($subref,$path);
}
file_find($newdir,'\.txt');
I had to push the find sub to the bottom! Man I feel so dumb :)

I would separate the subs apart (and rename the print() one as it conflicts with the built-in with the same name!), then you can do something along these lines (if I'm understanding what you want correctly):
use warnings;
use strict;
use File::Find;
file_find('.', '.txt');
sub file_find{
my ($path,$filter) = @_;
my @files = find(sub {my_print($filter)}, $path);
}
sub my_print {
my $filter = shift;
my $fname = $File::Find::name;
if($fname =~ /$filter/){
print "$fname\n";
}
}
However, with that said, File::Find::Rule can make these things very, very easy (particularly handling the file filters as it handles regex natively):
use warnings;
use strict;
use File::Find::Rule;
my $filter = '*.txt';
my $dir = '.';
my @files = File::Find::Rule->file()
->name($filter)
->in($dir);
print "$_\n" for @files;
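If you would rather stay with core File::Find, the closure idea from the question's edit can be pulled out into a small factory. The names make_wanted and the collecting file_find below are made up for this sketch. The point is only that an anonymous sub captures $filter safely, whereas a named nested sub does not.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: builds a "wanted" callback that closes over
# $filter and the array reference $collect.
sub make_wanted {
    my ($filter, $collect) = @_;
    return sub {
        # By default find chdir()s into each directory, so $_ is the
        # bare file name and $File::Find::name is the full path.
        push @$collect, $File::Find::name if /$filter/;
    };
}

# Returns the matching paths instead of printing them.
sub file_find {
    my ($path, $filter) = @_;
    my @found;
    find(make_wanted($filter, \@found), $path);
    return @found;
}
```

Usage would look like print "$_\n" for file_find('.', qr/\.txt\z/);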

Search file in directory structure

Does anybody know a method to search for a file in a directory structure without using File::Find? I know step-by-step how to do it but if it is possible to make it smoother that will be helpful.

File::Find is a core module since perl 5.000 so I don't see a reason for not using it.
But if you still want to take your crazy way you could call the find command.

From one File::Find hater to another: DirWalk.pm, inspired by the Python's os.walk().
package DirWalk;
use strict;
use warnings;
sub new {
my ($class, @dirs) = @_;
my @odirs = @dirs;
@dirs = qw/./ unless @dirs;
s!/+$!! for @dirs;
s!/+\.$!! for @dirs;
my $self = { _odirs => [@odirs], _dirs => [@dirs], _dhstack => [], _dnstack => [] };
opendir my($dirh), $dirs[0];
return undef unless $dirh;
shift @{ $self->{_dirs} };
unshift @{ $self->{_dhstack} }, $dirh;
unshift @{ $self->{_dnstack} }, $dirs[0];
return bless $self, $class;
}
sub _walk_op {
my ($self) = @_;
if (wantarray) {
my @ret;
while (defined(my $x = $self->next())) {
push @ret, $x;
}
return @ret;
}
elsif (defined wantarray) {
return $self->next();
}
return undef;
}
sub next
{
my ($self) = @_;
my $dstack = $self->{_dhstack};
my $nstack = $self->{_dnstack};
if (@$dstack) {
my $x;
do {
$x = readdir $dstack->[0];
} while (defined($x) && ($x eq '.' || $x eq '..'));
if (defined $x) {
my $nm = $nstack->[0].'/'.$x;
if (-d $nm) {
# open dir, and put the handle on the stack
opendir my($dh), $nm;
if (defined $dh) {
unshift @{ $self->{_dhstack} }, $dh;
unshift @{ $self->{_dnstack} }, $nm;
}
else {
warn "can't walk into $nm!"
}
$nm .= '/';
}
# return the name
return $nm;
}
else {
closedir $dstack->[0];
shift @$dstack;
shift @$nstack;
unless (@$dstack) {
while (@{ $self->{_dirs} }) {
my $dir = shift @{ $self->{_dirs} };
opendir my($dirh), $dir;
next unless defined $dirh;
unshift @{ $self->{_dhstack} }, $dirh;
unshift @{ $self->{_dnstack} }, $dir;
last;
}
}
return $self->next();
}
}
else {
return undef;
}
}
use overload '<>' => \&_walk_op;
use overload '""' => sub { 'DirWalk('.join(', ', @{$_[0]->{_odirs}}).')'; };
1;
Example:
# prepare test structure
mkdir aaa
touch aaa/bbb
mkdir aaa/ccc
touch aaa/ccc/ddd
# example invocation:
perl -mDirWalk -E '$dw=DirWalk->new("aaa"); say while <$dw>;'
#output
aaa/ccc/
aaa/ccc/ddd
aaa/bbb
Another example:
use strict;
use warnings;
use DirWalk;
# iteration:
my $dw = DirWalk->new("aaa");
while (<$dw>) {
print "$_\n";
}
# or as a list:
$dw = DirWalk->new("aaa");
my @list = <$dw>;
for (@list) {
print "$_\n";
}

The method I've been implementing uses three commands: opendir, readdir, and closedir. See below for an example:
opendir my $dir1, $cwd or die "cannot read the directory $cwd: $!";
@cwd = readdir $dir1;
closedir $dir1;
shift @cwd; shift @cwd;
foreach(@cwd){if ($_=~/$file_search_name/){print "I have found the file in $_!\n";}}
The directory contents will be stored in @cwd, which includes . and .. On Windows, the two shift @cwd calls remove these. Unfortunately I am tight for time, but you can build on this idea with an anonymous array to store the directory handles and another array to store the directory paths. Perhaps use -d to check whether an entry is a directory. There might be file permission issues, so unless(opendir ...) would be a good option.
Best of luck.
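The idea outlined above (opendir/readdir/closedir plus an array of directory paths still to visit) might look something like the sketch below. The name search_tree and its parameters are invented for illustration. It filters . and .. by name rather than by position, which is a deliberate change from the two shift calls, and it skips symlinked directories to avoid loops.

```perl
use strict;
use warnings;

# Hypothetical helper: iterative search for entries whose name matches
# the regex $pattern, using only opendir/readdir/closedir.
sub search_tree {
    my ($top, $pattern) = @_;
    my @pending = ($top);    # directories still to be read
    my @matches;

    while (defined(my $dir = pop @pending)) {
        my $dh;
        unless (opendir $dh, $dir) {
            # Permission problems and similar: report and carry on.
            warn "cannot read the directory $dir: $!\n";
            next;
        }
        # Drop . and .. by name; their position is not guaranteed.
        my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
        closedir $dh;

        for my $entry (@entries) {
            my $path = "$dir/$entry";
            push @matches, $path if $entry =~ $pattern;
            push @pending, $path if -d $path && !-l $path;
        }
    }
    return @matches;
}
```

Because each handle is closed before descending, only one directory handle is open at a time, however deep the tree is.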

I'm sure I will be flayed alive for this answer but you could always use either system() or backticks `` to execute the regular linux find command. Or do some sort of ls...
@files = `ls $var/folder/*.logfile`
@files = `find . -name $file2find`
I expect some seasoned perlers have many good reasons not to do this.

You can also try something like this:
# I want to find file xyz.txt in $dir (say C:\sandbox)
Findfile("xyz.txt", $dir);
sub Findfile
{
my $file = shift;
my $Searchdir = shift;
my @content = <$Searchdir/*>;
foreach my $element (@content)
{
if($element =~ /\Q$file\E$/)
{
print "found";
last;
}
elsif (-d $element)
{
Findfile($file, $element); #recursive search
}
}
}

File::Find::Rule is "smoother".
use feature qw( say );
use File::Find::Rule qw( );
say for File::Find::Rule->in(".");

perl script to count files in windows directory tree

I am new to Perl scripting. I am trying to get the count of directories and subdirectories.
I have searched all the available help on scripting,
but I am unable to get the count of subdirectories. Below is the script I used.
use strict;
use warnings;
use File::Slurp;
my @dirs = ('.');
my $directory_count = 0;
my $file_count = 0;
my $outfile = 'log.txt';
open my $fh, '>', $outfile or die "can't create logfile; $!";
for my $dir (@dirs) {
for my $file (read_dir ($dir)) {
if ( -d "$dir/$file" ) {
$directory_count++;
}
else {
$file_count++;
}
}
print $fh "Directories: $directory_count\n";
print $fh "Files: $file_count\n";
}
close $fh;
Here, I am unable to work out what to change to get the equivalent of the dir command with /s.
Please help; it will save a lot of manual work.
Ravi

Never EVER write your own directory traversal. There are too many pitfalls, gotchas and edge cases. Things like path delimiters, files with spaces, alternate data streams, soft links, hard links, DFS paths... just don't do it.
Use File::Find or if you prefer File::Find::Rule.
As I prefer the former, I'll give an example:
use strict;
use warnings;
use File::Find;
my $dir_count = 0;
my $file_count = 0;
#find runs this for every file in its traversal.
#$_ is 'current file'. $File::Find::name is full path to file.
sub count_stuff {
if ( -d ) { $dir_count++ };
if ( -f ) { $file_count++ };
}
find ( \&count_stuff, "." );
print "Dirs: $dir_count\n";
print "Files: $file_count\n";

Here is a script that does it: 1) without global variables; and 2) without adding another sub to the namespace.
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;
run(\@ARGV);
sub run {
my $argv = shift;
for my $dir ( @$argv ) {
my $ret = count_files_and_directories( $dir );
printf(
"%s: %d files and %d directories\n",
$dir,
$ret->{files},
$ret->{directories}
);
}
return;
}
sub count_files_and_directories {
my $top = shift;
my %ret = (directories => 0, files => 0);
find(
{
wanted => sub {
-d and $ret{directories} += 1;
-f and $ret{files} += 1;
},
no_chdir => 1,
},
$top,
);
\%ret;
}

It seems simpler to use File::Find::Rule. For example:
use warnings;
use strict;
use File::Find::Rule;
my @files = File::Find::Rule->new->file->in('.');
my @dirs = File::Find::Rule->new->directory->in('.');

directory tree warning

I have written a script that recursively prints a directory's contents, but it prints a warning for each folder. How can I fix this?
Sample folder:
dev# cd /tmp/test
dev# ls -p -R
test2/
testfile
testfile2
./test2:
testfile3
testfile4
my code:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp/test');
sub browseDir {
my $path = shift;
opendir(my $dir, $path);
while (readdir($dir)) {
next if /^\.{1,2}$/;
if (-d "$path/$_") {
browseDir("$path/$_");
}
print "$path/$_\n";
}
closedir($dir);
}
and the output:
dev# perl /tmp/cotest.pl
/tmp/test/test2/testfile3
/tmp/test/test2/testfile4
Use of uninitialized value $_ in concatenation (.) or string at /tmp/cotest.pl line 16.
/tmp/test/
/tmp/test/testfile
/tmp/test/testfile2

You could try this code:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp');
sub browseDir {
my $path = shift;
opendir(my $dir, $path);
while (readdir($dir)) {
next if /^\.{1,2}$/;
print "$path/$_\n";
if (-d "$path/$_") {
browseDir("$path/$_");
}
}
closedir($dir);
}
You get that warning because you call browseDir() before you use the variable $_: the recursive call's readdir loop leaves $_ undefined by the time you print it.

Why not use the File::Find module? It's included in almost all distributions of Perl since Perl 5.x. It's not my favorite module due to the sort of messy way it works, but it does a good job.
You define a wanted subroutine that does what you want and filter out what you don't want. In this case, you're printing pretty much everything, so all wanted does is print out what is found.
In File::Find, the name of the file is kept in $File::Find::name and the directory for that file is in $File::Find::dir. The $_ is the file itself, and can be used for testing.
Here's a basic way to do what you want:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
find ( \&wanted, $directory );
sub wanted {
say $File::Find::name;
}
I prefer to put my wanted function in my find subroutine, so they're together. This is equivalent to the above:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
find (
sub {
say $File::Find::name
},
$directory,
);
Good programming says not to print in subroutines. Instead, you should use the subroutine to store and return your data. Unfortunately, find doesn't return anything at all. You have to use a global array to capture the list of files, and later print them out:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find (
sub {
push @directory_list, $File::Find::name
}, $directory );
for my $file (@directory_list) {
say $file;
}
Or, if you prefer a separate wanted subroutine:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find ( \&wanted, $directory );
sub wanted {
push @directory_list, $File::Find::name;
}
for my $file (@directory_list) {
say $file;
}
The fact that my wanted subroutine depends upon an array that's not local to the subroutine bothers me which is why I prefer embedding the wanted subroutine inside my find call.
One thing you can do is use your subroutine to filter out what you want. Let's say you're only interested in JPG files:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find ( \&wanted, $directory );
sub wanted {
return unless /\.jpg$/i; #Skip everything that doesn't have .jpg suffix
push @directory_list, $File::Find::name;
}
for my $file (@directory_list) {
say $file;
}
Note how the wanted subroutine does a return on any file I don't want before I push it into my @directory_list array. Again, I prefer the embedding:
find (sub {
return unless /\.jpg$/i; #Skip everything that doesn't have .jpg suffix
push @directory_list, $File::Find::name;
}, $directory );
I know this isn't exactly what you asked, but I just wanted to let you know about the File::Find module and introduce you to Perl modules (if you didn't already know about them), which can add a lot of functionality to Perl.

You place a value in $_ before calling browseDir and you expect that value to still be present after calling browseDir (a reasonable expectation), but browseDir modifies that variable.
Just add local $_; to browseDir to make sure that any changes to it are undone before the sub exits.
Unrelated to your question, here are three other issues:
Not even minimal error checking!
You could run out of directory handles while navigating a deep directory.
You filter out files named ".\n" and "..\n".
Fix:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp/test');
sub browseDir {
my $path = shift;
opendir(my $dh, $path) or die $!;
my @files = readdir($dh);
closedir($dh);
for (@files) {
next if /^\.{1,2}\z/;
if (-d "$path/$_") {
browseDir("$path/$_");
}
print "$path/$_\n";
}
}
Finally, why don't you use a module like File::Find::Rule?
use File::Find::Rule qw( );
print "$_\n" for File::Find::Rule->in('/tmp');
Note: Before 5.12, while (readdir($dh)) would have to be written while (defined($_ = readdir($dh)))
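Another way to sidestep the clobbered $_ altogether, offered here only as a sketch, is to read each entry into a lexical variable, which a recursive call cannot touch. The name browse_dir and the $out array reference are assumptions made for this example. Note that, like the question's code, this version keeps a handle open while it recurses, and -d follows symlinks.

```perl
use strict;
use warnings;

# Sketch: same post-order traversal as browseDir, but with a lexical
# $entry instead of the global $_. Paths are collected in @$out.
sub browse_dir {
    my ($path, $out) = @_;
    opendir(my $dh, $path) or die "Cannot open $path: $!";
    # The explicit defined() test works on any Perl 5 version.
    while (defined(my $entry = readdir $dh)) {
        next if $entry =~ /\A\.\.?\z/;    # skip exactly . and ..
        browse_dir("$path/$entry", $out) if -d "$path/$entry";
        push @$out, "$path/$entry";
    }
    closedir $dh;
    return $out;
}
```

It could be used as print "$_\n" for @{ browse_dir('/tmp/test', []) };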