How to pass an anonymous sub to File::Find - perl

I know I can do this with a statement modifier:
#!/usr/bin/perl -w
use strict;
use File::Find;
sub file_find{
my ($path,$filter) = @_;
find(sub {print $File::Find::name."\n" if /$filter/}, $path);
}
file_find($newdir,'\.txt');
or this which is less readable:
find(sub {if(/$filter/){print $File::Find::name."\n"}}, $path);
But if I wanted to do something like this, how can I do it?
sub file_find{
my ($path,$filter) = @_;
find(\&print, $path);
sub print {
if(/$filter/){ #Variable $filter will not stay shared
print $File::Find::name."\n";
}
}
}
file_find($newdir,'\.txt')
I get 'variable will not stay shared'. I believe I'm supposed to make it an anonymous sub:
my $print = sub {
if(/$filter/){
print $File::Find::name."\n";
}
}
But then I don't know how to pass the reference to the find sub. Perhaps it's something silly I'm missing.
Edit: Never mind, this seems to work:
sub file_find{
my ($path,$filter) = @_;
my $subref = sub{
if(/$filter/){
print $File::Find::name."\n";
}
};
find($subref,$path);
}
file_find($newdir,'\.txt');
I had to push the find sub to the bottom! Man I feel so dumb :)

I would separate the subs (and rename the print() one, as it conflicts with the built-in of the same name!). Then you can do something along these lines (if I'm understanding what you want correctly):
use warnings;
use strict;
use File::Find;
file_find('.', '\.txt$');
sub file_find{
my ($path,$filter) = @_;
find(sub {my_print($filter)}, $path);
}
sub my_print {
my $filter = shift;
my $fname = $File::Find::name;
if($fname =~ /$filter/){
print "$fname\n";
}
}
However, with that said, File::Find::Rule can make these things very, very easy (particularly for file filters, since name() accepts both globs and regexes):
use warnings;
use strict;
use File::Find::Rule;
my $filter = '*.txt';
my $dir = '.';
my @files = File::Find::Rule->file()
->name($filter)
->in($dir);
print "$_\n" for @files;
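The anonymous-sub approach works because each call creates a fresh closure over the current $filter, which a named nested sub does not do. A minimal sketch of that idea follows. The helper name make_collector and the scratch directory built with File::Temp are assumptions made for illustration, not part of the original code:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: returns a closure that remembers $filter and $found.
sub make_collector {
    my ($filter, $found) = @_;
    return sub {
        # $_ is the current file name, set by File::Find.
        push @$found, $File::Find::name if /$filter/;
    };
}

# Build a small scratch tree so the sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.log c.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh or die "Cannot close $dir/$name: $!";
}

my @txt;
find(make_collector(qr/\.txt\z/, \@txt), $dir);
print "$_\n" for sort @txt;
```

Because the closure is built by a separate sub, the same collector can be reused with different filters without triggering the "will not stay shared" warning.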

Related

How to store result from File::Find into array

I want to list the files in a directory and its subdirectories. I use Perl's File::Find. Is it possible to store the result in an array?
Here is the code:
use warnings;
use strict;
use File::Find;
my $location="tmp";
sub find_txt {
my $F = $File::Find::name;
if ($F =~ /txt$/ ) {
push @filelist, $F;
return @filelist;
}
}
my @fileInDir = find({ wanted => \&find_txt, no_chdir=>1}, $location);
print OUTPUT @fileInDir
The code above doesn't display any output.
Sure, just push into an array declared outside:
use warnings;
use strict;
use File::Find;
my $location = "tmp";
my @results;
my $find_txt = sub {
my $F = $File::Find::name;
if ($F =~ /txt$/ ) {
push @results, $F;
}
};
find({ wanted => $find_txt, no_chdir=>1}, $location);
for my $result (@results) {
print "found $result\n";
}
The return value of the wanted callback is ignored. find itself has no documented or useful return value either.
For posterity, this is much more straightforward with Path::Iterator::Rule.
use strict;
use warnings;
use Path::Iterator::Rule;
my $location = 'tmp';
my $rule = Path::Iterator::Rule->new->not_dir->name(qr/txt$/);
my @paths = $rule->all($location);
Replace
my @fileInDir = find({ wanted => \&find_txt, no_chdir=>1}, $location);
with
my @fileInDir;
find({ wanted => sub { push @fileInDir, find_txt(); }, no_chdir=>1 }, $location);
and add the missing
return;
aka
return ();
to find_txt. Unlike the solution in the earlier answer, this allows you to have reusable and conveniently located "wanted" subs.
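Since find ignores the callback's return value, one way to keep the "wanted" logic reusable is to wrap find in a sub that collects matches and returns them. The following is only a sketch under assumptions: find_matching and is_txt are hypothetical names, and the scratch tree is created with File::Temp so the example runs on its own:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical wrapper: runs find() and returns every path for which
# the predicate returns true. The predicate receives the full path.
sub find_matching {
    my ($dir, $predicate) = @_;
    my @matches;
    find({
        wanted => sub {
            my $path = $File::Find::name;
            push @matches, $path if $predicate->($path);
        },
        no_chdir => 1,
    }, $dir);
    return @matches;
}

# A reusable predicate that can live anywhere in the program.
sub is_txt { return $_[0] =~ /\.txt\z/ }

# Scratch tree for the demonstration.
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/sub" or die "mkdir failed: $!";
for my $path ("$dir/one.txt", "$dir/sub/two.txt", "$dir/sub/skip.dat") {
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh or die "Cannot close $path: $!";
}

my @found = find_matching($dir, \&is_txt);
@found = sort @found;
print "found $_\n" for @found;
```

The array that accumulates results is private to find_matching, so callers only ever see the returned list.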

include/eval perl file into unique namespace defined at runtime

I'm writing a tool that must import a number of other perl config files. The files are not wrapped w/packages and may have similar or conflicting variables/functions. I don't have the ability to change the format of these files, so I must work around what they are. What I was thinking to do was import each into a unique name space, but I've not found a way to do that using do, require, or use. If I don't use dynamic names, just a hardcoded name, I can do it.
Want something like this:
sub sourceTheFile {
my ($namespace, $file) = @_;
package $namespace;
do $file;
1;
return;
}
That doesn't work because the package command requires a constant for the name. So then I try something like this:
sub sourceTheFile {
my ($namespace, $file) = @_;
eval "package $namespace;do $file;1;"
return;
}
But the contents of the file read by do are placed in the main:: scope, not the one I want. The target scope is created, just not populated by the do. (I tried require, and just a straight cat $file inside the eval as well.)
I'm using Devel::Symdump to verify that the namespaces are built correctly or not.
example input file:
my $xyz = "some var";
%all_have_this = ( common=>"stuff" );
ADDITIONAL CHALLENGE
Using the answer that does the temp file build and do call, I can make this work dynamically as I require. BUT, big but, how do I now reference the data inside this new namespace? Perl doesn't seem to have the loose ability to build a variable name from a string and use that as the variable.
I am not sure why the eval did not work. Maybe a bug? Here is a workaround using a temp file. This works for me:
use strict;
use warnings;
use Devel::Symdump;
use File::Temp;
my $file = './test.pl';
my $namespace = 'TEST';
{
my $fh = File::Temp->new();
print $fh "package $namespace;\n";
print $fh "do '$file';\n";
print $fh "1;\n";
close $fh;
do $fh->filename;
}
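On the follow-up question of reading data back out of a namespace whose name is only known at runtime: Perl does allow building a variable name from a string through symbolic references, provided strict refs is switched off for that small scope. A hedged sketch follows. The package name Config::Alpha, the variable $level, and the inline $config_source string (standing in for a real config file) are all assumptions for illustration:

```perl
use strict;
use warnings;

# Assumed stand-in for the contents of one config file.
my $config_source = q{
    %all_have_this = ( common => "stuff" );
    $level = 3;
};

my $namespace = 'Config::Alpha';   # hypothetical, chosen at runtime

# Compile the config text inside the chosen package.
# "no strict" is needed because the config uses undeclared globals.
eval "package $namespace; no strict; $config_source; 1"
    or die "Failed to load config into $namespace: $@";

# Read the package variables back through symbolic references.
my (%settings, $level);
{
    no strict 'refs';
    %settings = %{"${namespace}::all_have_this"};
    $level    = ${"${namespace}::level"};
}

print "common = $settings{common}, level = $level\n";
```

Only package (global) variables are reachable this way; a variable declared with my inside the config file, such as $xyz in the example input, stays private to that file.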
Perl's use and require facilities make use of any hooks you might have installed in @INC. You can simply install a hook which looks in a specific location to load modules with a prefix you choose:
package MyIncHook;
use strict;
use warnings;
use autouse Carp => qw( croak );
use File::Spec::Functions qw( catfile );
sub import {
my ($class, $prefix, $location) = @_;
unshift @INC, _loader_for($prefix, $location);
return;
}
sub _loader_for {
my $prefix = shift;
my $location = shift;
$prefix =~ s{::}{/}g;
return sub {
my $self = shift;
my $wanted = shift;
return unless $wanted =~ /^\Q$prefix/;
my $path = catfile($location, $wanted);
my ($is_done);
open my $fh, '<', $path
or croak "Failed to open '$path' for reading: $!";
my $loader = sub {
if ($is_done) {
close $fh
or croak "Failed to close '$path': $!";
return 0;
}
if (defined (my $line = <$fh>)) {
$_ = $line;
return 1;
}
else {
$_ = "1\n";
$is_done = 1;
return 1;
}
};
(my $package = $wanted) =~ s{/}{::}g;
$package =~ s/[.]pm\z//;
my @ret = (\"package $package;", $loader);
return @ret;
}
}
__PACKAGE__;
__END__
Obviously, modify the construction of $path according to your requirements.
You can use it like this:
#!/usr/bin/env perl
use strict;
use warnings;
use MyIncHook ('My::Namespace', "$ENV{TEMP}/1");
use My::Namespace::Rand;
print $My::Namespace::Rand::settings{WARNING_LEVEL}, "\n";
where $ENV{TEMP}/1/My/Namespace/Rand.pm contains:
%settings = (
WARNING_LEVEL => 'critical',
);
Output:
C:\Temp> perl t.pl
critical
You can, obviously, define your own mapping from made up module names to file names.

Learning Higher-Order Perl: issue with iterator

I'm studying the Higher-Order Perl book and have an issue with the iterators in Section 4.3.4.
The code:
main_script.pl
#!/perl
use strict;
use warnings;
use FindBin qw($Bin);
use lib $Bin;
use Iterator_Utils qw(:all);
use FlatDB;
my $db = FlatDB->new("$Bin/db.csv") or die "$!";
my $q = $db->query('STATE', 'NY');
while (my $rec = NEXTVAL($q) )
{
print $rec;
}
Iterator_Utils.pm
#!/perl
use strict;
use warnings;
package Iterator_Utils;
use Exporter 'import';
our @EXPORT_OK = qw(NEXTVAL Iterator
append imap igrep
iterate_function filehandle_iterator list_iterator);
our %EXPORT_TAGS = ('all' => \@EXPORT_OK);
sub NEXTVAL { $_[0]->() }
sub Iterator (&) { return $_[0] }
FlatDB.pm
#!/perl
use strict;
use warnings;
package FlatDB;
my $FIELDSEP = qr/:/;
sub new
{
my $class = shift;
my $file = shift;
open my $fh, "<", $file or return;
chomp(my $schema = <$fh>);
my @field = split $FIELDSEP, $schema;
my %fieldnum = map { uc $field[$_] => $_ } (0..$#field);
bless
{
FH => $fh,
FIELDS => \@field,
FIELDNUM => \%fieldnum,
FIELDSEP => $FIELDSEP
} => $class;
}
use Fcntl ':seek';
sub query
{
my $self = shift;
my ($field, $value) = @_;
my $fieldnum = $self->{FIELDNUM}{uc $field};
return unless defined $fieldnum;
my $fh = $self->{FH};
seek $fh, 0, SEEK_SET;
<$fh>; # discard schema line
return Iterator
{
local $_;
while (<$fh>)
{
chomp;
my @fields = split $self->{FIELDSEP}, $_, -1;
my $fieldval = $fields[$fieldnum];
return $_ if $fieldval eq $value;
}
return;
};
}
db.csv
LASTNAME:FIRSTNAME:CITY:STATE:OWES
Adler:David:New York:NY:157.00
Ashton:Elaine:Boston:MA:0.00
Dominus:Mark:Philadelphia:PA:0.00
Orwant:Jon:Cambridge:MA:26.30
Schwern:Michael:New York:NY:149658.23
Wall:Larry:Mountain View:CA:-372.14
Just as in the book so far, right? However I do not get the output (the strings with Adler and Schwern should occur). The error message is:
Can't use string ("Adler:David:New York:NY:157.00") as a subroutine ref while
"strict refs" in use at N:/Perle/Learn/Iterators/Iterator_Utils.pm line 12, <$fh>
line 3.
What am I doing wrong?
Thanks in advance!
FlatDB calls Iterator, which is defined in Iterator_Utils, so it needs to import that function from Iterator_Utils. If you add
use Iterator_Utils qw(Iterator);
after package FlatDB, the program will work.
Thanks very much for finding this error. I will add this to the errata on the web site. If you would like to be credited by name, please email me your name.
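The import matters because Iterator relies on its (&) prototype, which Perl must already know when it compiles the call. Otherwise the block appears to run immediately instead of being wrapped as a code reference, which matches the error message above. A single-file sketch where the prototype is declared before use follows. The upto function is a hypothetical example, not taken from the book:

```perl
use strict;
use warnings;

# Declared before use, so the (&) prototype is known when the
# calls below are compiled.
sub Iterator (&) { return $_[0] }
sub NEXTVAL      { return $_[0]->() }

# Hypothetical example iterator: counts from $from up to $to,
# then returns undef to signal exhaustion.
sub upto {
    my ($from, $to) = @_;
    return Iterator {
        return $from <= $to ? $from++ : undef;
    };
}

my $it = upto(1, 3);
my @seen;
while (defined(my $n = NEXTVAL($it))) {
    push @seen, $n;
}
print "@seen\n";   # prints "1 2 3"
```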

directory tree warning

I have written a script that recursively prints a directory's contents, but it prints a warning for each folder. How can I fix this?
sample folder:
dev# cd /tmp/test
dev# ls -p -R
test2/
testfile
testfile2

./test2:
testfile3
testfile4
my code:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp/test');
sub browseDir {
my $path = shift;
opendir(my $dir, $path);
while (readdir($dir)) {
next if /^\.{1,2}$/;
if (-d "$path/$_") {
browseDir("$path/$_");
}
print "$path/$_\n";
}
closedir($dir);
}
and the output:
dev# perl /tmp/cotest.pl
/tmp/test/test2/testfile3
/tmp/test/test2/testfile4
Use of uninitialized value $_ in concatenation (.) or string at /tmp/cotest.pl line 16.
/tmp/test/
/tmp/test/testfile
/tmp/test/testfile2
Try this code:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp');
sub browseDir {
my $path = shift;
opendir(my $dir, $path);
while (readdir($dir)) {
next if /^\.{1,2}$/;
print "$path/$_\n";
if (-d "$path/$_") {
browseDir("$path/$_");
}
}
closedir($dir);
}
You got that warning because you called browseDir() before using $_, and the recursive call's own readdir loop leaves $_ undefined.
Why not use the File::Find module? It has been a core module since Perl 5, so it ships with standard Perl installations. It's not my favorite module due to the somewhat messy way it works, but it does a good job.
You define a wanted subroutine that does what you want and filters out what you don't want. In this case, you're printing pretty much everything, so all wanted does is print out what is found.
In File::Find, the full path of the file is kept in $File::Find::name and the directory for that file is in $File::Find::dir. $_ holds the file's name within that directory (find changes into it by default), and can be used for testing.
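To make the three variables concrete, here is a small sketch that records what each holds for a single file. The scratch tree and the file name note.txt are assumptions made so the example is self-contained:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Scratch tree: <top>/sub/note.txt
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "mkdir failed: $!";
open my $fh, '>', "$top/sub/note.txt" or die "open failed: $!";
close $fh or die "close failed: $!";

my %seen;
find(sub {
    return unless $_ eq 'note.txt';
    %seen = (
        basename => $_,                  # name within the directory
        dir      => $File::Find::dir,    # directory being visited
        name     => $File::Find::name,   # directory plus name
    );
}, $top);

print "$_: $seen{$_}\n" for sort keys %seen;
```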
Here's a basic way to do what you want:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
find ( \&wanted, $directory );
sub wanted {
say $File::Find::name;
}
I prefer to put my wanted function in my find subroutine, so they're together. This is equivalent to the above:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
find (
sub {
say $File::Find::name
},
$directory,
);
Good programming says not to print in subroutines. Instead, you should use the subroutine to store and return your data. Unfortunately, find doesn't return anything at all. You have to use a global array to capture the list of files, and later print them out:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find (
sub {
push @directory_list, $File::Find::name
}, $directory );
for my $file (@directory_list) {
say $file;
}
Or, if you prefer a separate wanted subroutine:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find ( \&wanted, $directory );
sub wanted {
push @directory_list, $File::Find::name;
}
for my $file (@directory_list) {
say $file;
}
The fact that my wanted subroutine depends upon an array that's not local to the subroutine bothers me which is why I prefer embedding the wanted subroutine inside my find call.
One thing you can do is use your subroutine to filter out what you want. Let's say you're only interested in JPG files:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find ( \&wanted, $directory );
sub wanted {
return unless /\.jpg$/i; #Skip everything that doesn't have .jpg suffix
push @directory_list, $File::Find::name;
}
for my $file (@directory_list) {
say $file;
}
Note how the wanted subroutine returns early for any file I don't want, before it pushes onto my @directory_list array. Again, I prefer the embedding:
find (sub {
return unless /\.jpg$/i; #Skip everything that doesn't have .jpg suffix
push @directory_list, $File::Find::name;
}, $directory );
I know this isn't exactly what you asked, but I just wanted to let you know about the File::Find module and introduce you to Perl modules (if you didn't already know about them), which can add a lot of functionality to Perl.
You place a value in $_ before calling browseDir and you expect that value to still be present after the call (a reasonable expectation), but browseDir modifies that variable.
Just add local $_; to browseDir to make sure that any changes to it are undone before the sub exits.
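The effect of local on $_ can be seen in isolation. In this sketch, clobbers and preserves are hypothetical subs that stand in for a browseDir without and with the fix:

```perl
use strict;
use warnings;

# Overwrites the caller's $_ (as a nested readdir loop would).
sub clobbers  { $_ = 'inner'; return }

# Same assignment, but the old value is restored when the sub exits.
sub preserves { local $_ = 'inner'; return }

$_ = 'outer';
preserves();
my $after_preserves = $_;

$_ = 'outer';
clobbers();
my $after_clobbers = $_;

print "after preserves: $after_preserves\n";   # outer
print "after clobbers:  $after_clobbers\n";    # inner
```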
Unrelated to your question, here are three other issues:
Not even minimal error checking!
You could run out of directory handles while navigating a deep directory.
You filter out files ".\n" and "..\n".
Fix:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp/test');
sub browseDir {
my $path = shift;
opendir(my $dh, $path) or die $!;
my @files = readdir($dh);
closedir($dh);
for (@files) {
next if /^\.{1,2}\z/;
if (-d "$path/$_") {
browseDir("$path/$_");
}
print "$path/$_\n";
}
}
Finally, why don't you use a module like File::Find::Rule?
use File::Find::Rule qw( );
print "$_\n" for File::Find::Rule->in('/tmp');
Note: Before 5.12, while (readdir($dh)) would have to be written while (defined($_ = readdir($dh))).
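A related way to sidestep $_ entirely is to read each entry into a lexical variable with an explicit defined test, which behaves the same on old and new Perls. A minimal sketch, where dir_entries is a hypothetical helper and the scratch directory is an assumption for the demonstration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: returns the sorted entry names of one directory,
# without touching the global $_.
sub dir_entries {
    my ($path) = @_;
    opendir(my $dh, $path) or die "Cannot open $path: $!";
    my @names;
    while (defined(my $entry = readdir $dh)) {
        next if $entry eq '.' || $entry eq '..';
        push @names, $entry;
    }
    closedir $dh;
    return sort @names;
}

# Scratch directory containing a file whose name is the false value "0".
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(0 a b)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh or die "Cannot close $dir/$name: $!";
}

$_ = 'untouched';
my @entries = dir_entries($dir);
print "@entries\n";
```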

How can I recursively read out directories in Perl?

I want to read out a directory recursively to print the data-structure in an HTML-Page with Template::Toolkit.
But I'm stuck on how to save the paths and files in a form that can be read out easily.
My idea started like this
sub list_dirs{
my ($rootPath) = @_;
my (@paths);
$rootPath .= '/' if($rootPath !~ /\/$/);
for my $eachFile (glob($path.'*'))
{
if(-d $eachFile)
{
push (@paths, $eachFile);
&list_dirs($eachFile);
}
else
{
push (@files, $eachFile);
}
}
return @paths;
}
How could I solve this problem?
This should do the trick
use strict;
use warnings;
use File::Find qw(finddepth);
my @files;
finddepth(sub {
return if($_ eq '.' || $_ eq '..');
push @files, $File::Find::name;
}, '/my/dir/to/search');
You should always use strict and warnings to help you debug your code. Perl would have warned you, for example, that @files is not declared. But the real problem with your function is that you declare a lexical variable @paths on every recursive call to list_dirs and don't push the return value back after the recursion step.
push @paths, list_dirs($eachFile)
If you don't want to install additional modules, the following solution should probably help you:
use strict;
use warnings;
use File::Find qw(find);
sub list_dirs {
my @dirs = @_;
my @files;
find({ wanted => sub { push @files, $_ } , no_chdir => 1 }, @dirs);
return @files;
}
The answer by mdom explains how your initial attempt went astray. I would also suggest that you consider friendlier alternatives to File::Find. CPAN has several options. Here's one.
use strict;
use warnings;
use File::Find::Rule;
my @paths = File::Find::Rule->in(@ARGV);
Also see here:
SO answer providing CPAN alternatives to File::Find.
SO question on directory iterators.
And here is a rewrite of your recursive solution. Things to note: use strict; use warnings; and the use of a scoping block to create a static variable for the subroutine.
use strict;
use warnings;
print $_, "\n" for dir_listing(@ARGV);
{
my @paths;
sub dir_listing {
my ($root) = @_;
$root .= '/' unless $root =~ /\/$/;
for my $f (glob "$root*"){
push @paths, $f;
dir_listing($f) if -d $f;
}
return @paths;
}
}
I think you have a problem in the following line of your code:
for my $eachFile (glob($path.'*'))
Change the $path variable to $rootPath.
It will then store the path correctly.
I use this script to remove hidden files (created by Mac OS X) from the USB pen drive I use to listen to music in the car. The car stereo lists any file ending in ".mp3", even one whose name starts with "._".
#!/bin/perl
use strict;
use warnings;
use File::Find qw(find);
sub list_dirs {
my @dirs = @_;
my @files;
find({ wanted => sub { push @files, $_ } , no_chdir => 1 }, @dirs);
return @files;
}
if ( ! @ARGV || !$ARGV[0] ) {
print "** Invalid dir!\n";
exit ;
}
if ( $ARGV[0] !~ /\/Volumes\/\w/s ) {
print "** Dir should be at /Volumes/... > $ARGV[0]\n";
exit ;
}
my @paths = list_dirs($ARGV[0]) ;
foreach my $file (@paths) {
my ($filename) = ( $file =~ /([^\\\/]+)$/s ) ;
if ($filename =~ /^\._/s ) {
unlink $file ;
print "rm> $file\n" ;
}
}
You can use this method as a recursive file search that picks out specific file types:
use File::Find;
my @files;
push @files, list_dir($outputDir);
sub list_dir {
my @dirs = @_;
my @files;
find({ wanted => sub { push @files, glob "\"$_/*.txt\"" } , no_chdir => 1 }, @dirs);
return @files;
}
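For the original goal of feeding a template, a nested structure of hashes and arrays may be easier to walk than a flat list of paths. The following is only a sketch under assumptions: dir_tree and print_tree are hypothetical helpers, the name and children keys are chosen for illustration, and symlinked directories are deliberately not followed:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: returns { name => ..., children => [...] } for
# directories and { name => ... } for anything else.
sub dir_tree {
    my ($path, $name) = @_;
    my %node = (name => $name);
    return \%node unless -d $path && !-l $path;

    opendir(my $dh, $path) or die "Cannot open $path: $!";
    my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    $node{children} = [ map { dir_tree("$path/$_", $_) } @entries ];
    return \%node;
}

# Walks the structure the way a template loop might, indenting by depth.
sub print_tree {
    my ($node, $depth) = @_;
    print '  ' x $depth, $node->{name}, "\n";
    print_tree($_, $depth + 1) for @{ $node->{children} || [] };
}

# Scratch tree: a.txt and sub/b.txt
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "mkdir failed: $!";
for my $path ("$top/a.txt", "$top/sub/b.txt") {
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh or die "Cannot close $path: $!";
}

my $tree = dir_tree($top, 'root');
print_tree($tree, 0);
```

Each directory handle is closed before recursing, so a deep tree does not hold many handles open at once.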