How to store results from File::Find into an array - perl

I want to list the files in a directory and its subdirectories. I use Perl's File::Find. Is it possible for me to store the result into an array?
Here is the code
use warnings;
use strict;
use File::Find;
my $location="tmp";
sub find_txt {
    my $F = $File::Find::name;
    if ($F =~ /txt$/ ) {
        push @filelist, $F;
        return @filelist;
    }
}
my @fileInDir = find({ wanted => \&find_txt, no_chdir=>1}, $location);
print OUTPUT @fileInDir
The code above doesn't display any output.

Sure, just push into an array declared outside:
use warnings;
use strict;
use File::Find;
my $location = "tmp";
my @results;
my $find_txt = sub {
    my $F = $File::Find::name;
    if ($F =~ /txt$/ ) {
        push @results, $F;
    }
};
find({ wanted => $find_txt, no_chdir=>1}, $location);
for my $result (@results) {
    print "found $result\n";
}
The return value of the wanted callback is ignored. find itself has no documented or useful return value either.
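If you want a call that does return the list, one idiomatic option is to wrap find in your own sub that collects into a lexical array and returns it. A minimal sketch (the sub name and the "tmp" directory are just illustrative):
use strict;
use warnings;
use File::Find;

# Wrap find() in a sub that returns the collected paths, since find()
# itself returns nothing useful.
sub find_txt_files {
    my ($dir) = @_;
    my @found;
    find({
        wanted   => sub { push @found, $File::Find::name if $File::Find::name =~ /txt$/ },
        no_chdir => 1,
    }, $dir);
    return @found;
}

my @fileInDir = find_txt_files("tmp");
print "found $_\n" for @fileInDir;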

For posterity, this is much more straightforward with Path::Iterator::Rule.
use strict;
use warnings;
use Path::Iterator::Rule;
my $location = 'tmp';
my $rule = Path::Iterator::Rule->new->not_dir->name(qr/txt$/);
my @paths = $rule->all($location);

Replace
my @fileInDir = find({ wanted => \&find_txt, no_chdir=>1}, $location);
with
my @fileInDir;
find({ wanted => sub { push @fileInDir, find_txt(); }, no_chdir=>1 }, $location);
and add the missing
return;
aka
return ();
to find_txt. Unlike the solution in the earlier answer, this allows you to have reusable and conveniently located "wanted" subs.
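Put together, a minimal sketch of that approach might look like this (here find_txt returns the matching name directly instead of pushing to a package array; that detail is my assumption, not spelled out in the answer above):
use strict;
use warnings;
use File::Find;

my $location = "tmp";

# Reusable "wanted" helper: returns the matching name, or an empty list.
sub find_txt {
    my $F = $File::Find::name;
    if ($F =~ /txt$/) {
        return $F;
    }
    return ();
}

my @fileInDir;
find({ wanted => sub { push @fileInDir, find_txt(); }, no_chdir => 1 }, $location);
print "$_\n" for @fileInDir;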

Related

How to pass an anonymous sub to File::Find

I know I can do this as an expression modifier:
#!/usr/bin/perl -w
use strict;
use File::Find;
sub file_find {
    my ($path, $filter) = @_;
    find(sub { print $File::Find::name."\n" if /$filter/ }, $path);
}
file_find($newdir, '\.txt');
or this which is less readable:
find(sub {if(/$filter/){print $File::Find::name."\n"}}, $path);
But if I wanted to do something like this, how can I do it?
sub file_find {
    my ($path, $filter) = @_;
    find(\&print, $path);
    sub print {
        if (/$filter/) {    # Variable $filter will not stay shared
            print $File::Find::name."\n";
        }
    }
}
file_find($newdir, '\.txt');
I get 'variable will not stay shared'. I believe I'm supposed to make it an anonymous sub:
my $print = sub {
    if (/$filter/) {
        print $File::Find::name."\n";
    }
};
But then I don't know how to pass the reference to the find sub. Perhaps it's something silly I'm missing.
Edit: Never mind, this seems to work:
sub file_find {
    my ($path, $filter) = @_;
    my $subref = sub {
        if (/$filter/) {
            print $File::Find::name."\n";
        }
    };
    find($subref, $path);
}
file_find($newdir, '\.txt');
I had to push the find sub to the bottom! Man I feel so dumb :)
I would separate the subs (and rename the print() one, as it conflicts with the built-in of the same name!); then you can do something along these lines (if I'm understanding what you want correctly):
use warnings;
use strict;
use File::Find;
file_find('.', '.txt');
sub file_find {
    my ($path, $filter) = @_;
    my @files = find(sub { my_print($filter) }, $path);
}
sub my_print {
    my $filter = shift;
    my $fname = $File::Find::name;
    if ($fname =~ /$filter/) {
        print "$fname\n";
    }
}
However, with that said, File::Find::Rule can make these things very, very easy (particularly handling the file filters as it handles regex natively):
use warnings;
use strict;
use File::Find::Rule;
my $filter = '*.txt';
my $dir = '.';
my @files = File::Find::Rule->file()
                            ->name($filter)
                            ->in($dir);
print "$_\n" for @files;

Search file in directory structure

Does anybody know a method to search for a file in a directory structure without using File::Find? I know how to do it step by step, but if it is possible to make it smoother, that would be helpful.
File::Find has been a core module since perl 5.000, so I don't see a reason for not using it.
But if you still want to go your own crazy way, you could call the find command.
From one File::Find hater to another: DirWalk.pm, inspired by the Python's os.walk().
package DirWalk;
use strict;
use warnings;

sub new {
    my ($class, @dirs) = @_;
    my @odirs = @dirs;
    @dirs = qw/./ unless @dirs;
    s!/+$!!   for @dirs;
    s!/+\.$!! for @dirs;
    my $self = { _odirs => [@odirs], _dirs => [@dirs], _dhstack => [], _dnstack => [] };
    opendir my($dirh), $dirs[0];
    return undef unless $dirh;
    shift @{ $self->{_dirs} };
    unshift @{ $self->{_dhstack} }, $dirh;
    unshift @{ $self->{_dnstack} }, $dirs[0];
    return bless $self, $class;
}

sub _walk_op {
    my ($self) = @_;
    if (wantarray) {
        my @ret;
        while (defined(my $x = $self->next())) {
            push @ret, $x;
        }
        return @ret;
    }
    elsif (defined wantarray) {
        return $self->next();
    }
    return undef;
}

sub next
{
    my ($self) = @_;
    my $dstack = $self->{_dhstack};
    my $nstack = $self->{_dnstack};
    if (@$dstack) {
        my $x;
        do {
            $x = readdir $dstack->[0];
        } while (defined($x) && ($x eq '.' || $x eq '..'));
        if (defined $x) {
            my $nm = $nstack->[0].'/'.$x;
            if (-d $nm) {
                # open dir, and put the handle on the stack
                opendir my($dh), $nm;
                if (defined $dh) {
                    unshift @{ $self->{_dhstack} }, $dh;
                    unshift @{ $self->{_dnstack} }, $nm;
                }
                else {
                    warn "can't walk into $nm!"
                }
                $nm .= '/';
            }
            # return the name
            return $nm;
        }
        else {
            closedir $dstack->[0];
            shift @$dstack;
            shift @$nstack;
            unless (@$dstack) {
                while (@{ $self->{_dirs} }) {
                    my $dir = shift @{ $self->{_dirs} };
                    opendir my($dirh), $dir;
                    next unless defined $dirh;
                    unshift @{ $self->{_dhstack} }, $dirh;
                    unshift @{ $self->{_dnstack} }, $dir;
                    last;
                }
            }
            return $self->next();
        }
    }
    else {
        return undef;
    }
}

use overload '<>' => \&_walk_op;
use overload '""' => sub { 'DirWalk('.join(', ', @{$_[0]->{_odirs}}).')'; };

1;
Example:
# prepare test structure
mkdir aaa
touch aaa/bbb
mkdir aaa/ccc
touch aaa/ccc/ddd
# example invocation:
perl -mDirWalk -E '$dw=DirWalk->new("aaa"); say while <$dw>;'
#output
aaa/ccc/
aaa/ccc/ddd
aaa/bbb
Another example:
use strict;
use warnings;
use DirWalk;

# iteration:
my $dw = DirWalk->new("aaa");
while (<$dw>) {
    print "$_\n";
}

# or as a list:
$dw = DirWalk->new("aaa");
my @list = <$dw>;
for (@list) {
    print "$_\n";
}
The method I've been implementing uses three functions: opendir, readdir, and closedir. See below for an example:
opendir my $dir1, $cwd or die "cannot read the directory $cwd: $!";
my @cwd = readdir $dir1;
closedir $dir1;
shift @cwd; shift @cwd;
foreach (@cwd) {
    if ($_ =~ /$file_search_name/) {
        print "I have found the file in $_!\n";
    }
}
The directory contents will be stored in @cwd, which includes the . and .. entries; on Windows, the two shift @cwd calls remove these. I unfortunately am tight for time, but you can extend this idea with an array to store the directory handles and another array for the directory paths, and use -d to check whether an entry is a directory. There might be file permission issues, so wrapping the opendir in unless(opendir ...) would be a good option.
Best of luck.
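For what it's worth, here is a minimal sketch of that opendir/readdir/closedir idea, written iteratively with a worklist of directories instead of a handle stack (the starting directory and the pattern in $file_search_name are just placeholders):
use strict;
use warnings;

# Keep a worklist of directories, descend with -d, and match file names.
my $file_search_name = qr/\.txt$/;
my @todo = ('.');

while (my $dir = shift @todo) {
    opendir my $dh, $dir or next;          # skip unreadable directories
    for my $entry (readdir $dh) {
        next if $entry eq '.' or $entry eq '..';
        my $path = "$dir/$entry";
        if (-d $path) {
            push @todo, $path;             # descend into it later
        }
        elsif ($entry =~ $file_search_name) {
            print "I have found the file in $path!\n";
        }
    }
    closedir $dh;
}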
I'm sure I will be flayed alive for this answer but you could always use either system() or backticks `` to execute the regular linux find command. Or do some sort of ls...
@files = `ls $var/folder/*.logfile`;
@files = `find . -name $file2find`;
I expect some seasoned perlers have many good reasons not to do this.
You can also try some stuff like this:
# I want to find file xyz.txt in $dir (say C:\sandbox)
Findfile("xyz.txt", $dir);

sub Findfile {
    my $file = shift;
    my $Searchdir = shift;
    my @content = <$Searchdir/*>;
    foreach my $element (@content) {
        if ($element =~ /.*$file$/) {
            print "found";
            last;
        }
        elsif (-d $element) {
            Findfile($file, $element);    # recursive search
        }
    }
}
File::Find::Rule is "smoother".
use feature qw( say );
use File::Find::Rule qw( );

say for File::Find::Rule->in(".");

Match two strings based on common substring

I have a list of files that needs to be grouped in pairs. (I need to append an HTML 'File B' (body) to 'File A' (header) because I need to serve them statically without server-side includes).
Example:
/path/to/headers/.../matching_folder/FileA.html
/someother/path/to/.../matching_folder/body/FileB.html
Emphasizing with the ellipses that the paths are not of uniform length, nor is 'matching_folder' always in the same position in the path.
It seems I need to match/join based on the common substring 'matching_folder', but I am stumped on scanning each string, storing, matching (excerpt):
my @dirs = ( $headerPath, $bodyPath );
my @files = ();

find( { wanted => \&wanted, no_chdir => 1 }, @dirs );

foreach my $file (@files) {
    # pseudocode: append $file[0] to $file[1] if both paths contain same 'matching_folder'
};

sub wanted {
    return unless -f and /(FileA\.html$)|(FileB\.html$)/i;
    push @files, $_;
};
Hash the files by all the directory steps in their names.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use File::Find;

my $headerPath = 'headers';
my $bodyPath   = 'bodies';
my @dirs = ($headerPath, $bodyPath);
my @files;

sub wanted {
    return unless -f and /file.\.html$/;
    push @files, $_;
};

find({ wanted => \&wanted, no_chdir => 1 }, @dirs);

my %common;
for my $file (@files) {
    my @steps = split m(/), $file;
    push @{ $common{$_} }, $file for @steps;
};

# All the headers and all the bodies share their prefixes,
# but that's not what we're interested in.
delete @common{qw{ bodies headers }};

for my $step (keys %common) {
    next if 1 == @{ $common{$step} };
    print "$step common for @{ $common{$step} }\n";
}
Tested on the following structure:
bodies/3/something/C/something2/fileA.html
bodies/2/junk/B/fileB.html
bodies/1/A/fileC.html
headers/a/B/fileD.html
headers/c/one/A/two/fileE.html
headers/b/garbage/C/fileF.html
Output:
B common for headers/a/B/fileD.html bodies/2/junk/B/fileB.html
C common for headers/b/garbage/C/fileF.html bodies/3/something/C/something2/fileA.html
A common for headers/c/one/A/two/fileE.html bodies/1/A/fileC.html
With the above, I can get to
for my $step (keys %common) {
    next unless 2 == @{ $common{$step} };    # pairs
    my @pairs = @{ $common{$step} };
    my $html;
    foreach my $f (@pairs) {
        $html .= &readfile($f);
    };
    &writefile($html, $step . '.html');
}
And get what I need for now. Thanks all! (I love Perl, making hard things possible indeed).

How to put data from CSV file to Perl hash

I have Perl and CSV file with something like:
"Name","Lastname"
"Homer","Simpsons"
"Ned","Flanders"
In this CSV file I have header in the first line and in other lines there are
data.
I want to convert this CSV file to such Perl data:
[
{
Lastname => "Simpsons",
Name => "Homer",
},
{
Lastname => "Flanders",
Name => "Ned",
},
]
I've written a function that uses Text::CSV and does what I need.
Here is the sample script:
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use 5.010;
use utf8;
use open qw(:std :utf8);
use Text::CSV;
sub read_csv {
    my ($filename) = @_;

    my @first_line;
    my $result;

    my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
    open my $fh, "<:encoding(utf8)", $filename or die "$filename: $!";
    while (my $row = $csv->getline ($fh)) {
        if (not @first_line) {
            @first_line = @{$row};
        } else {
            push @{$result}, { map { $first_line[$_] => $row->[$_] } 0..$#first_line };
        }
    }
    close $fh;

    return $result;
}
my $data = read_csv('sample.csv');
This works fine, but I want to use this function in several scripts. I'm greatly surprised that Text::CSV doesn't have this feature.
My question: what should I do to simplify solving such tasks in the future, for me and others?
Should I use some Perl module from CPAN, should I try to add this function to
Text::CSV, or something else?
Huh? Why so complicated? First, we fetch the header outside of the loop:
my $headers = $csv->getline($fh) or die "no header";
Assign these to be the column names:
$csv->column_names(@$headers);
Then, each call to getline_hr will provide a hashref:
while (my $hashref = $csv->getline_hr($fh)) {
    push @$result, $hashref;
}
We can also use getline_hr_all:
$result = $csv->getline_hr_all($fh);
In other words, it ain't complex, most pieces are already provided by Text::CSV, and it can be done in very few lines.
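Putting those pieces together, a compact version of the question's read_csv could look roughly like this (a sketch; it assumes your Text::CSV is recent enough to provide getline_hr_all):
use strict;
use warnings;
use Text::CSV;

sub read_csv {
    my ($filename) = @_;

    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });
    open my $fh, "<:encoding(utf8)", $filename or die "$filename: $!";

    my $headers = $csv->getline($fh) or die "no header";
    $csv->column_names(@$headers);

    # Arrayref of hashrefs, keyed by the header line.
    my $result = $csv->getline_hr_all($fh);
    close $fh;
    return $result;
}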
Also, a module like this seems to already exist: Text::CSV::Slurp. (note: reverse dependency search through metacpan is awesome)
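If I remember its interface correctly, Text::CSV::Slurp reduces this to a single call; something along these lines (an unverified sketch):
use strict;
use warnings;
use Text::CSV::Slurp;

# load() should return an arrayref of hashrefs keyed by the header row.
my $data = Text::CSV::Slurp->load(file => 'sample.csv');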
It's probably not a standard feature because different people will want their CSV files parsed into different data structures.
Why not create your own module that wraps this function?
package CSVRead;

use strict;
use warnings;
use 5.010;
use open qw(:std :utf8);
use Text::CSV;

require Exporter;
our @ISA    = qw(Exporter);
our @EXPORT = qw(read_csv);

sub read_csv {
    my ($filename) = @_;

    my @first_line;
    my $result;

    my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
    open my $fh, "<:encoding(utf8)", $filename or die "$filename: $!";
    while (my $row = $csv->getline ($fh)) {
        if (not @first_line) {
            @first_line = @{$row};
        } else {
            push @{$result}, { map { $first_line[$_] => $row->[$_] } 0..$#first_line };
        }
    }
    close $fh;

    return $result;
}

1;    # a module must return a true value
Then, use it like this:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper;
use CSVRead;
my $data = read_csv('sample.csv');
say Dumper $data;

How can I recursively read out directories in Perl?

I want to read a directory recursively and print the data structure in an HTML page with Template::Toolkit.
But I'm stuck on how to save the paths and files in a form that can be read out easily.
My idea started like this
sub list_dirs {
    my ($rootPath) = @_;
    my (@paths);

    $rootPath .= '/' if ($rootPath !~ /\/$/);
    for my $eachFile (glob($path.'*'))
    {
        if (-d $eachFile)
        {
            push (@paths, $eachFile);
            &list_dirs($eachFile);
        }
        else
        {
            push (@files, $eachFile);
        }
    }
    return @paths;
}
How could I solve this problem?
This should do the trick
use strict;
use warnings;
use File::Find qw(finddepth);

my @files;
finddepth(sub {
    return if ($_ eq '.' || $_ eq '..');
    push @files, $File::Find::name;
}, '/my/dir/to/search');
You should always use strict and warnings to help you debug your code. Perl would have warned you, for example, that @files is not declared. But the real problem with your function is that you declare a lexical variable @paths on every recursive call to list_dirs and don't push the return value back after the recursion step:
push @paths, list_dirs($eachFile);
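Following that advice, a corrected version of the recursive sub from the question might look like this (a sketch; it returns directories and files together rather than keeping two separate arrays):
use strict;
use warnings;

sub list_dirs {
    my ($rootPath) = @_;
    my @paths;

    $rootPath .= '/' unless $rootPath =~ /\/$/;
    for my $eachFile (glob($rootPath . '*')) {
        push @paths, $eachFile;
        # collect the results of the recursive call instead of discarding them
        push @paths, list_dirs($eachFile) if -d $eachFile;
    }
    return @paths;
}

print "$_\n" for list_dirs('.');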
If you don't want to install additional modules, the following solution should probably help you:
use strict;
use warnings;
use File::Find qw(find);
sub list_dirs {
    my @dirs = @_;
    my @files;
    find({ wanted => sub { push @files, $_ }, no_chdir => 1 }, @dirs);
    return @files;
}
The answer by mdom explains how your initial attempt went astray. I would also suggest that you consider friendlier alternatives to File::Find. CPAN has several options. Here's one.
use strict;
use warnings;
use File::Find::Rule;
my @paths = File::Find::Rule->in(@ARGV);
Also see here:
SO answer providing CPAN
alternatives to File::Find.
SO question on directory iterators.
And here is a rewrite of your recursive solution. Things to note: use strict; use warnings; and the use of a scoping block to create a static variable for the subroutine.
use strict;
use warnings;
print $_, "\n" for dir_listing(#ARGV);
{
my #paths;
sub dir_listing {
my ($root) = #_;
$root .= '/' unless $root =~ /\/$/;
for my $f (glob "$root*"){
push #paths, $f;
dir_listing($f) if -d $f;
}
return #paths;
}
}
I think you have a problem in the following line of your code:
for my $eachFile (glob($path.'*'))
Change the $path variable to $rootPath, and it will store the path correctly.
I use this script to remove hidden files (created by Mac OS X) from my USB pen drive, which I usually use to listen to music in the car; any file ending with ".mp3", even when it starts with "._", will be listed in the car audio list.
#!/bin/perl
use strict;
use warnings;
use File::Find qw(find);
sub list_dirs {
    my @dirs = @_;
    my @files;
    find({ wanted => sub { push @files, $_ }, no_chdir => 1 }, @dirs);
    return @files;
}

if ( !@ARGV || !$ARGV[0] ) {
    print "** Invalid dir!\n";
    exit;
}

if ( $ARGV[0] !~ /\/Volumes\/\w/s ) {
    print "** Dir should be at /Volume/... > $ARGV[0]\n";
    exit;
}

my @paths = list_dirs($ARGV[0]);

foreach my $file (@paths) {
    my ($filename) = ( $file =~ /([^\\\/]+)$/s );

    if ($filename =~ /^\._/s) {
        unlink $file;
        print "rm> $file\n";
    }
}
You can also use this method as a recursive file search that separates out specific file types:
my @files;
push @files, list_dir($outputDir);

sub list_dir {
    my @dirs = @_;
    my @files;
    find({ wanted => sub { push @files, glob "\"$_/*.txt\"" }, no_chdir => 1 }, @dirs);
    return @files;
}