How to process and push multiple directories using simple code in Perl

I am trying to get all subdirectories and files inside dir1, dir2, dir3, dir4 and dir5 as shown below and push them to another directory. The code below gets everything I need. However, I need to process more directories like this. Is there a simple way to process all of them (dir1 to dirx) instead of assigning each directory individually as below? Thanks in advance.
use File::Find::Rule;
my @pushdir;
my @pushdir1 = File::Find::Rule->directory->in('/tmp/dirx');
my @pushdir2 = File::Find::Rule->directory->in('/tmp/nextdir');
my @pushdir3 = File::Find::Rule->directory->in('/tmp/pushdir');
my @pushdir4 = File::Find::Rule->directory->in('/tmp/logdir');
my @pushdir5 = File::Find::Rule->directory->in('/tmp/testdir');
push @pushdir, @pushdir1, @pushdir2, @pushdir3, @pushdir4, @pushdir5;
my @Files;
foreach my $dir (@pushdir) {
    push @Files, sort glob "$dir/*.txt";
}

Use a subroutine:
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find::Rule;
my @dirnames = qw( dirx temp_dir testdir nextdir );
my @dirs_to_search = map { "/tmp/$_" } @dirnames;
my @files;
for my $dir (@dirs_to_search) {
    push @files, find_files_in_dir($dir);
}
sub find_files_in_dir {
    my ($dir) = @_;
    my @subdirs = File::Find::Rule->directory->in( $dir );
    my @txt_files;
    for my $subdir ( @subdirs ) {
        push @txt_files, sort glob "$subdir/*.txt";
    }
    return @txt_files;
}

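It sounds like you want a list of all *.txt files contained in the child directories of /tmp.
For the immediate child directories that's easily done with a single glob call, like this:
my @files = glob '/tmp/*/*.txt';
Note that Perl's glob has no recursive ** wildcard, so files in deeper descendants need a module such as File::Find or File::Find::Rule.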
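For *.txt files at any depth below several root directories, one option is a single pass with the core File::Find module in place of File::Find::Rule plus glob. The following is only a sketch: the helper name txt_files_under is made up for illustration, and the root names are the ones from the question.

```perl
use strict;
use warnings;
use File::Find qw( find );

# Hypothetical helper: collect every *.txt file at any depth below
# each of the given root directories.
sub txt_files_under {
    my @roots = @_;
    my @found;
    find(
        {
            # With no_chdir, $_ holds the full path (same as $File::Find::name)
            wanted   => sub { push @found, $_ if -f $_ && /\.txt\z/ },
            no_chdir => 1,
        },
        @roots,
    );
    return sort @found;
}

# Example roots taken from the question; directories that do not exist are skipped
my @roots = grep { -d $_ } map { "/tmp/$_" } qw( dirx nextdir pushdir logdir testdir );
my @files = @roots ? txt_files_under(@roots) : ();
print "$_\n" for @files;
```

Adding another directory then only means adding a name to the qw() list.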


Make a .tar or .gz file on Windows using Perl

I am trying to create a .tar or .gz file, but I have an issue.
The archive stores the complete path:
D:\test\jtax-issue11-16\title.xml
D:\test\jtax-issue11-16\artwork
D:\test\jtax-issue11-16\artwork\cover.png
Note: the above is also my folder structure.
But my requirement is:
jtax-issue11-16\title.xml
jtax-issue11-16\artwork
jtax-issue11-16\artwork\cover.png
That is, I want to create the .tar or .gz file with paths that start at the current folder name only.
My code is:
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use File::Basename 'basename';
use Cwd;
my $current_path = getcwd;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, $current_path);
my $tar = Archive::Tar->new();
$tar->add_files(@inventory);
$tar->write('a.tar');
If I use basename, then it produces an error. I don't understand how to use basename or how to create .tar or .gz file with the current folder name.
use File::Find::Rule qw( );
my $base_dir = '.';
my @files =
    map { s{^\Q$base_dir/}{}r }
    File::Find::Rule
        ->mindepth(1)
        ->in($base_dir);
or
use File::Find qw( find );
my $base_dir = '.';
my @files;
find(
    {
        wanted   => sub { push @files, s{^\Q$base_dir/}{}r },
        no_chdir => 1,
    },
    $base_dir
);
shift(@files);    # drop the entry for $base_dir itself
Given that you are in $current_path when you call find(), you should just pass . to find(). That way all of the paths you get in $File::Find::name will be relative to the current directory:
my $current_path = getcwd;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, '.');
That will give you paths like:
.\jtax-issue11-16\title.xml
.\jtax-issue11-16\artwork
.\jtax-issue11-16\artwork\cover.png
But you could use s/^\.\\// to remove the .\ from the beginning of that path if it's important to you. The easiest place to do that might be after you have built @inventory.
@inventory = map { s/^\.\\//; $_ } @inventory;
I added Dave Cross's code to my script and got the output I wanted.
The code is below:
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use File::Basename 'basename';
use 5.010;
use Cwd;
my $current_dir = getcwd;
# Assumption: the archive is named after the current folder
my $file_name = basename($current_dir);
my @tar_files;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, $current_dir);
my $tar = Archive::Tar->new();
# Strip the absolute prefix so that only relative paths remain
@inventory = map { s/^\Q$current_dir\E\///; $_ } @inventory;
foreach my $temp (@inventory)
{
    # skip my file name which is conv.pl
    # and skip blank entry in tar, which is created by $current_dir
    if ($temp =~ m/conv\.pl/ || $temp =~ m/conv\.exe/ || $temp =~ /\Q$current_dir\E/)
    {
    }
    else
    {
        push (@tar_files, $temp);
    }
}
$tar->add_files(@tar_files);
$tar->write($file_name.".tar");
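The suggestions in this thread can be combined into one helper that changes into the directory first, so that every member name in the archive is relative to it. This is a sketch under assumptions, not a definitive implementation: the name tar_relative and its arguments are hypothetical, and only core modules are used.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd qw( getcwd );
use File::Find qw( find );
use File::Spec;

# Hypothetical helper: archive everything below $dir into $tar_file,
# storing member names relative to $dir (no drive, no leading "./").
sub tar_relative {
    my ($dir, $tar_file) = @_;
    $tar_file = File::Spec->rel2abs($tar_file);    # resolve before chdir
    my $old_cwd = getcwd();
    chdir $dir or die "Cannot chdir to '$dir': $!";

    my @inventory;
    find(
        {
            wanted => sub {
                return if $_ eq '.';                   # skip the root itself
                push @inventory, $_ =~ s{^\./}{}r;     # drop the leading "./"
            },
            no_chdir => 1,    # keep $_ equal to $File::Find::name
        },
        '.',
    );
    @inventory = sort @inventory;

    my $tar = Archive::Tar->new;
    $tar->add_files(@inventory);
    my $ok = $tar->write($tar_file);
    chdir $old_cwd or die "Cannot chdir back to '$old_cwd': $!";
    die "Cannot write '$tar_file': " . $tar->error unless $ok;
    return @inventory;
}

# Example call (paths are placeholders):
# tar_relative('D:/test', 'D:/a.tar');
```

Archive::Tar's write also accepts a compression argument such as COMPRESS_GZIP to produce a .tar.gz, which relies on IO::Zlib being available.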

Perl: How to read subfolder output

I am writing a script to read the contents of multiple subfolders in a directory.
I now also need to read the contents of a folder inside each of those subfolders, and would like to know how to write the code for that.
This is the new layout:
Multiple Sub-folder -> Local folder -> fileAAA.csv
How do I read fileAAA.csv in the Local folder of each subfolder?
The code I have now handles the following layout and works well:
Multiple Sub-folder -> fileAAA.csv
Below is the code I use to read it:
my ( $par_dir, $sub_dir );
opendir( $par_dir, $parent ) or die "Cannot open '$parent': $!";
while ( my $sub_folders = readdir($par_dir) ) {
    next if ( $sub_folders =~ /^\.\.?$/ );    # skip . and ..
    my $path = $parent . '/' . $sub_folders;
    next unless ( -d $path );    # skip anything that isn't a directory
    opendir( $sub_dir, $path ) or die "Cannot open '$path': $!";
    while ( my $file = readdir($sub_dir) ) {
        next unless $file =~ /\.csv$/i;    # only .csv files
        my $full_path = $path . '/' . $file;
        print_file_names($full_path);
    }
    closedir($sub_dir);
    $flag = 0;
}
closedir($par_dir);
......
Updated
You should look at the File::Find module which has everything already in place to do searches like this, and has taken account of all corner cases for you
I wrote that on my tablet and at the time I couldn't offer sample code to support it. I believe this will do what you're asking for, which is simply to find all CSV files at any level beneath a parent directory
use strict;
use warnings;
use File::Find qw/ find /;
STDOUT->autoflush;
my $parent = '/Multiple Sub-folder';
find(sub {
    return unless -f and /\.csv$/i;
    print_file_names($File::Find::name);
}, $parent);
sub print_file_names {
    my ($fn) = @_;
    print $fn, "\n";
}
Without using a module, try this.
Instead of opendir, you can use glob to search the subdirectories.
In the script below I wrote a subroutine that searches recursively.
When the elsif condition is satisfied, the path of that directory is passed back to the find subroutine, which searches it in turn, and so on.
my $v = "/Multiple Sub-folder";
find($v);
sub find {
    my ($s) = @_;
    foreach my $ma (glob "$s/*")
    {
        if (-f $ma)
        {
            if ($ma =~ m/\.csv$/)    # Here search for csv files.
            {
                print "$ma\n";
            }
        }
        elsif (-d $ma)
        {
            find($ma);
        }
    }
}
However, using the File::Find module to search the directory, as in Borodin's answer, is the best approach.

directory tree warning

I have written a script that recursively prints a directory's contents, but it prints a warning for each folder. How do I fix this?
sample folder:
dev# cd /tmp/test
dev# ls -p -R
test2/
testfile
testfile2

./test2:
testfile3
testfile4
my code:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp/test');
sub browseDir {
my $path = shift;
opendir(my $dir, $path);
while (readdir($dir)) {
next if /^\.{1,2}$/;
if (-d "$path/$_") {
browseDir("$path/$_");
}
print "$path/$_\n";
}
closedir($dir);
}
and the output:
dev# perl /tmp/cotest.pl
/tmp/test/test2/testfile3
/tmp/test/test2/testfile4
Use of uninitialized value $_ in concatenation (.) or string at /tmp/cotest.pl line 16.
/tmp/test/
/tmp/test/testfile
/tmp/test/testfile2
Try this code:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp');
sub browseDir {
my $path = shift;
opendir(my $dir, $path);
while (readdir($dir)) {
next if /^\.{1,2}$/;
print "$path/$_\n";
if (-d "$path/$_") {
browseDir("$path/$_");
}
}
closedir($dir);
}
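You get that warning because browseDir() is called recursively before $_ is used in the print statement, and the recursive call leaves $_ undefined by the time it returns. Printing before recursing, as above, avoids it.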
Why not use the File::Find module? It's included in almost all distributions of Perl since Perl 5.x. It's not my favorite module due to the sort of messy way it works, but it does a good job.
You define a wanted subroutine that does what you want and filter out what you don't want. In this case, you're printing pretty much everything, so all wanted does is print out what is found.
In File::Find, the name of the file is kept in $File::Find::name and the directory for that file is in $File::Find::dir. The $_ is the file itself, and can be used for testing.
Here's a basic way to do what you want:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
find ( \&wanted, $directory );
sub wanted {
    say $File::Find::name;
}
I prefer to put my wanted function in my find subroutine, so they're together. This is equivalent to the above:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
find (
    sub {
        say $File::Find::name
    },
    $directory,
);
Good programming says not to print in subroutines. Instead, you should use the subroutine to store and return your data. Unfortunately, find doesn't return anything at all. You have to use a global array to capture the list of files, and later print them out:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find (
    sub {
        push @directory_list, $File::Find::name
    }, $directory );
for my $file (@directory_list) {
    say $file;
}
Or, if you prefer a separate wanted subroutine:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find ( \&wanted, $directory );
sub wanted {
    push @directory_list, $File::Find::name;
}
for my $file (@directory_list) {
    say $file;
}
The fact that my wanted subroutine depends upon an array that's not local to the subroutine bothers me which is why I prefer embedding the wanted subroutine inside my find call.
One thing you can do is use your subroutine to filter out what you want. Let's say you're only interested in JPG files:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find ( \&wanted, $directory );
sub wanted {
    return unless /\.jpg$/i; #Skip everything that doesn't have .jpg suffix
    push @directory_list, $File::Find::name;
}
for my $file (@directory_list) {
    say $file;
}
Note how the wanted subroutine does a return on any file I don't want before I push it into my @directory_list array. (It has to be return rather than next, because wanted is a subroutine, not a loop body.) Again, I prefer the embedding:
find (
    sub {
        return unless /\.jpg$/i; #Skip everything that doesn't have .jpg suffix
        push @directory_list, $File::Find::name;
    },
    $directory,
);
I know this isn't exactly what you asked, but I just wanted to let you know about the File::Find module and introduce you to Perl modules (if you didn't already know about them), which can add a lot of functionality to Perl.
You place a value in $_ before calling browseDir and you expect that value to still be there after the call (a reasonable expectation), but browseDir modifies that variable.
Just add local $_; to browseDir to make sure that any changes to it are undone before the sub exits.
Unrelated to your question, here are three other issues:
Not even minimal error checking!
You could run out of directory handles while navigating a deep directory.
You filter out files named ".\n" and "..\n" (because $ also matches before a trailing newline).
Fix:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp/test');
sub browseDir {
    my $path = shift;
    opendir(my $dh, $path) or die $!;
    my @files = readdir($dh);
    closedir($dh);
    for (@files) {
        next if /^\.{1,2}\z/;
        if (-d "$path/$_") {
            browseDir("$path/$_");
        }
        print "$path/$_\n";
    }
}
Finally, why don't you use a module like File::Find::Rule?
use File::Find::Rule qw( );
print "$_\n" for File::Find::Rule->in('/tmp');
Note: Before 5.12, while (readdir($dh)) would have to be written while (defined($_ = readdir($dh)))
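Another way to sidestep the $_ clobbering discussed in this thread is to avoid the global $_ in the loop altogether and use a lexical loop variable, which a recursive call cannot overwrite. A minimal sketch, assuming a hypothetical helper list_tree that returns the paths instead of printing them:

```perl
use strict;
use warnings;

# Hypothetical helper: return every path below $path, parents before children.
sub list_tree {
    my ($path) = @_;
    opendir(my $dh, $path) or die "Cannot open '$path': $!";
    # Read everything up front so only one handle is open per call
    my @entries = sort grep { !/\A\.{1,2}\z/ } readdir($dh);
    closedir($dh);

    my @paths;
    for my $entry (@entries) {    # lexical variable, so recursion cannot clobber it
        my $full = "$path/$entry";
        push @paths, $full;
        # Recurse into real directories only, not into symlinks
        push @paths, list_tree($full) if -d $full && !-l $full;
    }
    return @paths;
}

# Example call (the path is the one from the question):
# print "$_\n" for list_tree('/tmp/test');
```

Returning a list also keeps the printing out of the subroutine, which makes the traversal easier to reuse.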

How to get a list of leaf subdirectories in a root folder in Perl

I am very new to Perl (and scripting languages in general) and I was wondering how to use Perl to get a listing of all the leaf directories under a root folder. For example, let's say my root directory is C:
C: -> I have folder "A" and "B" and files a.txt and b.txt
Folder "A" -> I have folder "D" and file c.html
Folder "B" -> I have folder "E" and "F" and file d.html
Folder "D", "E" and "F" -> bunch of text files
How do I get a bunch of directory paths as output for this scenario of:
C:\A\D\
C:\B\E\
C:\B\F\
As you can see, I just want a list of all the leaf directories. I don't want C:\A\ and C:\B\ to show up. After doing some research myself, I have noticed that I may be able to use the File::Find module in Perl, but I am not 100% sure how to go about it.
Thanks for any help you may be able to provide :)
Another approach:
use strict;
use warnings;
use feature qw( say );
use File::Find::Rule qw( );
use Path::Class qw( dir );
my $root = dir('.')->absolute();
my @dirs = File::Find::Rule->directory->in($root);
shift(@dirs);
my @leaf_dirs;
if (@dirs) {
    my $last = shift(@dirs);
    for (@dirs) {
        push @leaf_dirs, $last if !/^\Q$last/;
        $last = $_ . "/";
    }
    push @leaf_dirs, $last;
}
say for @leaf_dirs;
Or using find's preprocess option:
use strict;
use warnings;
use File::Find;
find({ wanted =>sub{1}, # required--in version 5.8.4 at least
       preprocess=>sub{ # @_ is files in current directory
           @_ = grep { -d && !/^\.{1,2}$/ } @_;
           print "$File::Find::dir\n" unless @_;
           return @_;
       }
     }, ".");
From an answer to the question How to Get the Last Subdirectories by liverpole on Perlmonks:
prints all leaf directories under the current directory (see "./"):
use strict;
use warnings;
my $h_dirs = terminal_subdirs("./");
my @dirs = sort keys %$h_dirs;
print "Terminal Directories:\n", join("\n", @dirs);
sub terminal_subdirs {
    my ($top, $h_results) = @_;
    $h_results ||= { };
    opendir(my $dh, $top) or die "Arrggghhhh -- can't open '$top' ($!)\n";
    my @files = readdir($dh);
    closedir $dh;
    my $nsubdirs = 0;
    foreach my $fn (@files) {
        next if ($fn eq '.' or $fn eq '..');
        my $full = "$top/$fn";
        if (!-l $full and -d $full) {
            ++$nsubdirs;
            terminal_subdirs($full, $h_results);
        }
    }
    $nsubdirs or $h_results->{$top} = 1;
    return $h_results;
}

How do I use $File::Find::prune?

I have a need to edit cue files in the first directory and not go recursively in the subdirectories.
find(\&read_cue, $dir_source);
sub read_cue {
    /\.cue$/ or return;
    my $fd = $File::Find::dir;
    my $fn = $File::Find::name;
    tie my @lines, 'Tie::File', $fn
        or die "could not tie file: $!";
    foreach (@lines) {
        s/some substitution//;
    }
    untie @lines;
}
I've tried variations of
$File::Find::prune = 1;
return;
but with no success. Where should I place and define $File::Find::prune?
Thanks
If you don't want to recurse, you probably want to use glob:
for (glob("*.cue")) {
read_cue($_);
}
If you want to filter the subdirectories recursed into by File::Find, you should use the preprocess function (not the $File::Find::prune variable) as this gives you much more control. The idea is to provide a function which is called once per directory, and is passed a list of files and subdirectories; the return value is the filtered list to pass to the wanted function, and (for subdirectories) to recurse into.
As msw and Brian have commented, your example would probably be better served by a glob, but if you wanted to use File::Find, you might do something like the following. Here, the preprocess function calls -f on every file or directory it's given, returning a list of files. Then the wanted function is called only for those files, and File::Find does not recurse into any of the subdirectories:
use strict;
use File::Find;
# Function is called once per directory, with a list of files and
# subdirectories; the return value is the filtered list to pass to
# the wanted function.
sub preprocess { return grep { -f } @_; }
# Function is called once per file or subdirectory.
sub wanted { print "$File::Find::name\n" if /\.cue$/; }
# Find files in or below the current directory.
find { preprocess => \&preprocess, wanted => \&wanted }, '.';
This can be used to create much more sophisticated file finders. For example, I wanted to find all files in a Java project directory, without recursing into subdirectories starting with ".", such as ".idea" and ".svn", created by IntelliJ and Subversion. You can do this by modifying the preprocess function:
# Function is called once per directory, with a list of files and
# subdirectories; return value is the filtered list to pass to the
# wanted function.
sub preprocess { return grep { -f or (-d and /^[^.]/) } @_; }
If you only want the files in a directory without searching subdirectories, you don't want to use File::Find. A simple glob probably does the trick:
my #files = glob( "$dir_source/*.cue" );
You don't need that subroutine. In general, when you're doing a lot of work for a task that you think should be simple, you're probably doing it wrong. :)
Say you have a directory subtree with
/tmp/foo/file.cue
/tmp/foo/bar/file.cue
/tmp/foo/bar/baz/file.cue
Running
#! /usr/bin/perl
use warnings;
use strict;
use File::Find;
sub read_cue {
if (-f && /\.cue$/) {
print "found $File::Find::name\n";
}
}
@ARGV = (".") unless @ARGV;
find \&read_cue => @ARGV;
outputs
found /tmp/foo/file.cue
found /tmp/foo/bar/file.cue
found /tmp/foo/bar/baz/file.cue
But if you remember the directories in which you found cue files
#! /usr/bin/perl
use warnings;
use strict;
use File::Find;
my %seen_cue;
sub read_cue {
if (-f && /\.cue$/) {
print "found $File::Find::name\n";
++$seen_cue{$File::Find::dir};
}
elsif (-d && $seen_cue{$File::Find::dir}) {
$File::Find::prune = 1;
}
}
@ARGV = (".") unless @ARGV;
find \&read_cue => @ARGV;
you get only the toplevel cue file:
found /tmp/foo/file.cue
That's because $File::Find::prune emulates the -prune option of find that affects directory processing:
-prune
True; if the file is a directory, do not descend into it.
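To answer the original question of where to set $File::Find::prune: it goes inside the wanted function, at the moment wanted is called for a directory that should not be entered. The sketch below prunes every directory except the starting one, so only the top level is scanned. The helper name top_level_cue_files is hypothetical, and the code relies on File::Find's default chdir behaviour, where $_ is '.' for the starting directory.

```perl
use strict;
use warnings;
use File::Find qw( find );

# Hypothetical helper: return the .cue files directly inside $top,
# never descending into its subdirectories.
sub top_level_cue_files {
    my ($top) = @_;
    my @cue;
    find(
        sub {
            if (-d $_) {
                # $_ is '.' only for the starting directory; prune all others
                $File::Find::prune = 1 if $_ ne '.';
                return;
            }
            push @cue, $File::Find::name if /\.cue\z/;
        },
        $top,
    );
    return sort @cue;
}

# Example call (the path is the one from the answer above):
# print "found $_\n" for top_level_cue_files('/tmp/foo');
```

Note that $File::Find::prune has no effect when the bydepth option is used, because directories are then visited after their contents.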