Perl: How to copy directory without any files?

I would like to copy a folder with its subfolders, but without the files. With dircopy from the package File::Copy::Recursive, the whole structure including the files is copied:
my $source = 'C:/dir_source';
my $target = 'C:/dir_target';
dircopy $source, $target or die "Could not perform dircopy of $source to $target: $!";
Is there an appropriate module, or do I have to use finddepth from the File::Find module together with rmdir?

I don't know about libraries; some probably exist which can be coerced into giving you that.
Here's one way to do it, indeed using File::Find to find the hierarchy. Then go through that list of full-path locations and make them under the target:
use warnings;
use strict;
use feature 'say';
use File::Find;

sub get_dir_hier {
    my ($src) = @_;

    my @dirs;
    find( sub {
        push @dirs, $File::Find::name
            if -d and $File::Find::name ne $src;  # only UNDER source dir
    }, $src );

    return \@dirs;
}

sub copy_dir_hier {
    my ($dirs, $tgt, $verbose) = @_;

    for my $dir (@$dirs) {
        next if not $dir;
        say "mkdir $tgt/$dir" if $verbose;

        # mkdir "$tgt/$dir"  # UNCOMMENT AFTER TESTING
        #     or warn "Error with mkdir $tgt/$dir: $!";
    }
}

my ($source_dir, $target_dir) = @ARGV;
die "Usage: $0 source-dir target-dir\n"
    if not $source_dir or not $target_dir;

say "Copy directory hierarchy from under $source_dir to under $target_dir\n";

say "Find directory hierarchy under $source_dir";
my $dirs = get_dir_hier($source_dir);
say for @$dirs;
say '-' x 60;

say "\nCopy that hierarchy under $target_dir";
copy_dir_hier( $dirs, $target_dir, 1 );
This obtains the listing of directories under the given source directory, not including the source directory itself; that is easily changed. Those are then copied under the target directory, which itself is not created.
For making directories, the target directory (under which the hierarchy is built) must exist for mkdir to work, since mkdir does not create directories recursively; the source's directory hierarchy is built by adding directories in order, so that is not a problem for the rest of the hierarchy.
To create paths recursively, see make_path in File::Path.
All code shown here has been tested and it works as it stands -- but it needs a lot more testing, and probably debugging.
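As an illustration, here is a minimal, untested sketch of a make_path-based variant of copy_dir_hier (the name copy_dir_hier_mp is made up here); since make_path creates intermediate directories as needed, the target directory no longer has to exist beforehand:

use File::Path qw(make_path);

sub copy_dir_hier_mp {
    my ($dirs, $tgt) = @_;

    for my $dir (@$dirs) {
        next if not $dir;
        # make_path creates all missing intermediate directories
        # and does nothing for paths that already exist
        make_path( "$tgt/$dir", { error => \my $err } );
        warn "Problem creating $tgt/$dir" if $err and @$err;
    }
}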
One other way is by using the Linux tree command.
The tree command can be made to print directories only, with full paths and without the pretty graphics: tree -dfi. With these options, the default output of
$ tree t1
t1
├── f1.txt
├── t21
│   ├── f21a.txt
│   ├── f21.txt
│   └── t31
│       └── f3.txt
└── t22
turns into
$ tree -dfi t1
t1
t1/t21
t1/t21/t31
t1/t22
This is a convenient form for making those directories.
Here's (barely tested) code for finding the directory hierarchy this way:
# Uses "tree" command
sub get_dir_hier_using_tree {
my ($src) = #_;
my #out = qx(tree -dfi $src);
my #dirs = grep { m{^/|\.} } #out; #/ keep only directories (full path)
chomp #dirs;
#say for #dirs; say '---';
# Remove the leading part of the path, to the source-directory name
s{^$src/?}{} for #dirs;
# Remove possibly empty entries
#dirs = grep { defined and /\S/ } #dirs;
#say for #dirs; say '-'x40;
return \#dirs
}

Related

Perl - concatenate files with similar names pattern and write concatenated file names to a list

I have a directory with multiple sub-directories in it and each subdir has a fixed set of files - one for each category like -
1) Main_dir
   1.1) Subdir1 with files
        - Test.1.age.txt
        - Test.1.name.txt
        - Test.1.place.csv
        ..........
   1.2) Subdir2 with files
        - Test.2.age.txt
        - Test.2.name.txt
        - Test.2.place.csv
        .........
There are around 20 folders with 10 files each. I need to first concatenate the files in each category, like Test.1.age.txt and Test.2.age.txt, into a Combined.age.txt file, and once I have done all the concatenation I want to print those filenames to a new Final_list.txt file, like
./Main_dir/Combined.age.txt
./Main_dir/Combined.name.txt
I am able to read all the files from all subdirs into an array, but I am not sure how to do the pattern search for the similar file names. I will be able to figure out the printout part of the code. Can anyone please share how to do this pattern search for the concatenation? My code so far:
use warnings;
use strict;
use File::Spec;
use Data::Dumper;
use File::Basename;

foreach my $file (@files) {
    print "$file\n";
}

my $testdir = './Main_dir';
my @Comp_list = glob("$testdir/test_dir*/*.txt");
I am trying to do the pattern search on the array contents in @Comp_list, which I surely need to learn:
foreach my $f1 (@Comp_list) {
    if ($f1 !~ /^(\.\/)\.txt$/) {
        print $f1;    # check if reading the file right
        # push it to a file using concatfile(
    }
}
Thanks a lot!
This should work for you. I've only tested it superficially, as it would take me a while to create some test data, so as you have some at hand I'm hoping you'll report back with any problems.
The program segregates all the files found by the equivalent of your glob call, and puts them in buckets according to their type. I've assumed that the names are exactly as you've shown, so the type is the penultimate field when the file name is split on dots; i.e. the type of Test.1.age.txt is age.
Having collected all of the file lists, I've used a technique that is originally designed to read through all of the files specified on the command line. If @ARGV is set to a list of files, then an <ARGV> operation will read through all of them as if they were a single file, so they can easily be copied to a new output file.
If you need the files concatenated in a specific order then I will have to amend my solution. At present they will be processed in the order that glob returns them -- probably in lexical order of their file names, but you shouldn't rely on that.
use strict;
use warnings 'all';
use v5.14.0;    # For autoflush method

use File::Spec::Functions 'catfile';

use constant ROOT_DIR => './Main_dir';

my %files;

my $pattern = catfile(ROOT_DIR, 'test_dir*', '*.txt');

for my $file ( glob $pattern ) {
    my @fields = split /\./, $file;
    my $type   = lc $fields[-2];
    push @{ $files{$type} }, $file;
}

STDOUT->autoflush;    # Get prompt reports of progress

for my $type ( keys %files ) {

    my $outfile = catfile(ROOT_DIR, "Combined.$type.txt");
    open my $out_fh, '>', $outfile
        or die qq{Unable to open "$outfile" for output: $!};

    my $files = $files{$type};

    printf qq{Writing aggregate file "%s" from %d input file%s ... },
        $outfile,
        scalar @$files,
        @$files == 1 ? '' : 's';

    local @ARGV = @$files;
    print $out_fh $_ while <ARGV>;

    print "complete\n";
}
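If a specific order does matter, one small tweak (my suggestion, not part of the original answer) is to sort each list before reading it, by replacing the local @ARGV line with, say:

local @ARGV = sort @$files;    # lexical order; adjust the comparator as needed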
I think it's easier if you categorize the files first; then you can work with them.
use warnings;
use strict;
use File::Spec;
use Data::Dumper;
use File::Basename;

my %hash = ();
my $testdir = './main_dir';
my @comp_list = glob("$testdir/**/*.txt");

foreach my $file (@comp_list) {
    $file =~ /(\w+\.\d\..+\.txt)/;
    next if not defined $1;
    my @tmp = split(/\./, $1);
    if (not defined $hash{$tmp[-2]}) {
        $hash{$tmp[-2]} = [$file];
    } else {
        push @{ $hash{$tmp[-2]} }, $file;
    }
}
print Dumper(\%hash);
Files:
main_dir
├── sub1
│   ├── File.1.age.txt
│   └── File.1.name.txt
└── sub2
    ├── File.2.age.txt
    └── File.2.name.txt
Result:
$VAR1 = {
          'age' => [
                     './main_dir/sub1/File.1.age.txt',
                     './main_dir/sub2/File.2.age.txt'
                   ],
          'name' => [
                      './main_dir/sub1/File.1.name.txt',
                      './main_dir/sub2/File.2.name.txt'
                    ]
        };
You can then create a loop to concatenate and combine the files.
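For instance, a minimal sketch of such a loop (untested; it assumes the %hash layout shown above and writes Combined.<type>.txt files under main_dir):

for my $type (keys %hash) {
    my $out = "./main_dir/Combined.$type.txt";
    open my $ofh, '>', $out or die "Can't open $out: $!";

    for my $file (@{ $hash{$type} }) {
        open my $ifh, '<', $file or die "Can't open $file: $!";
        print $ofh $_ while <$ifh>;    # append each input file in turn
    }
}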

How to list all files in the directories

How can I list all files in parent and subdirectories for multiple dirs?
$dir="/home/httpd/cgi-bin/r/met";
opendir(DIR,"/home/httpd/cgi-bin/r/met")||die"error";
while($line=readdir DIR)
{
print"$line\n";
opendir DIR1,"$dir/$line"||die"error";
while($line1=readdir DIR1)
{
print"$line1\n";
}
}
closedir DIR;
closedir DIR1;
Don't do it this way, use File::Find instead.
use strict;
use warnings;
use File::Find;

my $search = "/home/httpd/cgi-bin/r/met";

sub print_file_names {
    print $_, "\n";
}

find( \&print_file_names, $search );
File::Find effectively walks through a list of directories and executes a subroutine defined by you for each file or directory found recursively below the starting directory. Before calling your subroutine, find (a function exported by the File::Find module) by default changes to the directory being scanned and sets the following (global) variables:
$File::Find::dir -- visited directory path relative to the starting directory
$File::Find::name -- full path of the file being visited relative to the starting directory
$_ -- basename of the file being visited (used in my example)
One way to solve your problem would be:
#!/usr/bin/perl
# Usage: ffind [dir1 ...]
use strict; use warnings;
use 5.010;    # to be able to use say
use File::Find;

# use current working directory if no command line argument
@ARGV = qw(.) unless @ARGV;

find( sub { say if -f }, @ARGV );

script in perl to copy directory structure from the source to the destination

#!/usr/bin/perl -w
use File::Copy;
use strict;

my $i = "0";
my $j = "1";
my $source_directory = $ARGV[$i];
my $target_directory = $ARGV[$j];
#print $source_directory,"\n";
#print $target_directory,"\n";

my @list = process_files($source_directory);
print "remaining files\n";
print @list;

# Accepts one argument: the full path to a directory.
# Returns: A list of files that reside in that path.
sub process_files {
    my $path = shift;
    opendir (DIR, $path)
        or die "Unable to open $path: $!";

    # We are just chaining the grep and map from
    # the previous example.
    # You'll see this often, so pay attention ;)
    # This is the same as:
    # LIST = map(EXP, grep(EXP, readdir()))
    my @files =
        # Third: Prepend the full path
        map { $path . '/' . $_ }
        # Second: take out '.' and '..'
        grep { !/^\.{1,2}$/ }
        # First: get all files
        readdir (DIR);
    closedir (DIR);

    for (@files) {
        if (-d $_) {
            # Add all of the new files from this directory
            # (and its subdirectories, and so on... if any)
            push @files, process_files($_);
        } else { #print @files,"\n";
            # for (@files)
            while (@files)
            {
                my $input = pop @files;
                print $input, "\n";
                copy($input, $target_directory);
            }
        }
        # NOTE: we're returning the list of files
        return @files;
    }
}
This basically copies files from source to destination, but I need some guidance on how to copy the directory structure as well. The main thing to note here is that no CPAN modules are allowed except copy, move, and path.
Instead of rolling your own directory processing adventure, why not simply use File::Find to go through the directory structure for you.
#! /usr/bin/env perl
use v5.10;
use warnings;

use File::Find;
use File::Path qw(make_path);
use File::Copy;
use Cwd;

# The first two arguments are source and dest
# 'shift' pops those arguments off the front of
# the @ARGV list, and returns what was removed
# I use "cwd" to get the current working directory
# and prepend that to $dest_dir. That way, $dest_dir
# is in correct relationship to my input parameter.
my $source_dir = shift;
my $dest_dir   = cwd . "/" . shift;

# I change into my $source_dir, so the $source_dir
# directory isn't in the file name when I find them.
chdir $source_dir
    or die qq(Cannot change into "$source_dir");

find ( sub {
    return unless -f;    # We want files only
    make_path "$dest_dir/$File::Find::dir"
        unless -d "$dest_dir/$File::Find::dir";
    copy "$_", "$dest_dir/$File::Find::dir"
        or die qq(Can't copy "$File::Find::name" to "$dest_dir/$File::Find::dir");
}, ".");
Now, you don't need a process_files subroutine. You let File::Find::find handle recursing the directory for you.
By the way, you could rewrite the find like this, which is how you usually see it in the documentation:
find ( \&wanted, ".");
sub wanted {
return unless -f; #We want files only
make_path "$dest_dir/$File::Find::dir"
unless -d "$dest_dir/$File::Find::dir";
copy "$_", "$dest_dir/$File::Find::dir"
or die qq(Can't copy "$File::Find::name" to "$dest_dir/$File::Find::dir");
}
I prefer to embed my wanted subroutine into my find command instead because I think it just looks better. It first of all guarantees that the wanted subroutine is kept with the find command. You don't have to look at two different places to see what's going on.
Also, the find command has a tendency to swallow up your entire program. Imagine a case where I get a list of files and do some complex processing on them. The entire program can end up in the wanted subroutine. To avoid this, you simply create an array of the files you want to operate on, and then operate on them inside your program:
...
my @file_list;
find ( \&wanted, "$source_dir" );

for my $file ( @file_list ) {
    ...
}

sub wanted {
    return unless -f;
    push @file_list, $File::Find::name;
}
I find this a programming abomination. First of all, what is going on with find? It's modifying my @file_list, but how? Nowhere in the find command is @file_list mentioned. What is it doing?
Then at the end of my program is this sub wanted function that is using a variable, @file_list, in a global manner. That's bad programming practice.
Embedding my subroutine directly into my find command solves many of these issues:
my @file_list;
find ( sub {
    return unless -f;
    push @file_list, $File::Find::name;
}, $source_dir );

for my $file ( @file_list ) {
    ...
}
This just looks better. I can see that @file_list is being manipulated directly by my find command. Plus, that pesky wanted subroutine has disappeared from the end of my program. It's the exact same code. It just looks better.
Let's get to what that find command is doing and how it works with the wanted subroutine:
The find command finds each and every file, directory, link, or whatnot located in the directory list you pass to it. With each item it finds in that directory, it passes it to your wanted subroutine for processing. A return leaves the wanted subroutine and allows find to fetch the next item.
Each time the wanted subroutine is called, find sets three variables:
$File::Find::name: The name of the item found with the full path attached to it.
$File::Find::dir: The name of the directory where the item was found.
$_: The name of the item without the directory name.
In Perl, that $_ variable is very special. It's sort of a default variable for many commands. That is, if you execute a command and don't give it a variable to use, that command will use $_. For example:
print
prints out $_
return if -f;
Is the same as saying this:
if ( -f $_ ) {
    return;
}
This for loop:
for ( @file_list ) {
    ...
}
Is the same as this:
for $_ ( @file_list ) {
    ...
}
Normally, I avoid the default variable. It's global in scope and it's not always obvious what is being acted upon. However, there are a few circumstances where I'll use it because it really clarifies the program's meaning:
return unless -f;
in my wanted function is very obvious. I exit the wanted subroutine unless I was handed a file. Here's another:
return unless /\.txt$/;
This will exit my wanted function unless the item ends with '.txt'.
I hope this clarifies what my program is doing. Plus, I eliminated a few bugs while I was at it. I miscopied $File::Find::dir as $File::Find::name, which is why you got the error.

find folders with no further subfolders in perl

How do I find, in a given path, all folders with no further subfolders? They may contain files but no further folders.
For example, given the following directory structure:
time/aa/
time/aa/bb
time/aa/bb/something/*
time/aa/bc
time/aa/bc/anything/*
time/aa/bc/everything/*
time/ab/
time/ab/cc
time/ab/cc/here/*
time/ab/cc/there/*
time/ab/cd
time/ab/cd/everywhere/*
time/ac/
The output of find(time) should be as follows:
time/aa/bb/something/*
time/aa/bc/anything/*
time/aa/bc/everything/*
time/ab/cc/here/*
time/ab/cc/there/*
time/ab/cd/everywhere/*
* above represents files.
Any time you want to write a directory walker, always use the standard File::Find module. When dealing with the filesystem, you have to be able to handle odd corner cases, and naïve implementations rarely do.
The environment provided to the callback (named wanted in the documentation) has three variables that are particularly useful for what you want to do.
$File::Find::dir is the current directory name
$_ is the current filename within that directory
$File::Find::name is the complete pathname to the file
When we find a directory that is not . or .., we record the complete path and delete its parent, which we now know cannot be a leaf directory. At the end, any recorded paths that remain must be leaves because find in File::Find performs a depth-first search.
#! /usr/bin/env perl

use strict;
use warnings;

use File::Find;

@ARGV = (".") unless @ARGV;

my %dirs;

sub wanted {
    return unless -d && !/^\.\.?\z/;
    ++$dirs{$File::Find::name};
    delete $dirs{$File::Find::dir};
}

find \&wanted, @ARGV;

print "$_\n" for sort keys %dirs;
You can run it against a subdirectory of the current directory
$ leaf-dirs time
time/aa/bb/something
time/aa/bc/anything
time/aa/bc/everything
time/ab/cc/here
time/ab/cc/there
time/ab/cd/everywhere
or use a full path
$ leaf-dirs /tmp/time
/tmp/time/aa/bb/something
/tmp/time/aa/bc/anything
/tmp/time/aa/bc/everything
/tmp/time/ab/cc/here
/tmp/time/ab/cc/there
/tmp/time/ab/cd/everywhere
or plumb multiple directories in the same invocation.
$ mkdir -p /tmp/foo/bar/baz/quux
$ leaf-dirs /tmp/time /tmp/foo
/tmp/foo/bar/baz/quux
/tmp/time/aa/bb/something
/tmp/time/aa/bc/anything
/tmp/time/aa/bc/everything
/tmp/time/ab/cc/here
/tmp/time/ab/cc/there
/tmp/time/ab/cd/everywhere
Basically, you open the root folder and use the following procedure:
sub child_dirs {
    my ($directory) = @_;
1. Open the directory:
    opendir my $dir, $directory or die $!;
2. Select those entries of this directory which are themselves directories:
    my @subdirs = grep {-d $_ and not m</\.\.?$>} map "$directory/$_", readdir $dir;
    #             ^-- directory and not . or ..  ^-- use full name
3. If the list of such selected entries contains elements,
3.1. then recurse into each such directory,
3.2. else this directory is a "leaf" and it is appended to the output list.
    if (@subdirs) {
        return map {child_dirs($_)} @subdirs;
    } else {
        return "$directory/*";
    }
    # OR: @subdirs ? map {child_dirs($_)} @subdirs : "$directory/*";
}
Example usage:
say $_ for child_dirs("time"); # dir `time' has to be in current directory.
This function will do it. Just call it with your initial path:
sub isChild {
    my $folder = shift;
    my $isChild = 1;

    opendir(my $dh, $folder) || die "can't opendir $folder: $!";
    while (readdir($dh)) {
        next if (/^\.{1,2}$/);    # skip . and ..
        if (-d "$folder/$_") {
            $isChild = 0;
            isChild("$folder/$_");
        }
    }
    closedir $dh;

    if ($isChild) { print "$folder\n"; }
}
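For example, with the directory layout from the question, the call would simply be:

isChild('time');    # prints each leaf directory under time/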
I tried the readdir way of doing things. Then I stumbled upon this...
use File::Find::Rule;

# find all the subdirectories of a given directory
my @subdirs = File::Find::Rule->directory->in( $directory );
From this output, I then eliminated any entry that matches the initial part of another entry, i.e. any directory that still has subdirectories below it, leaving only the leaf entries.
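A rough sketch of that filtering step (untested; it assumes the @subdirs list from above):

# keep only directories that are not a prefix of any other entry's path
my @leaves = grep {
    my $d = $_;
    not grep { $_ ne $d and index($_, "$d/") == 0 } @subdirs;
} @subdirs;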

How can I list files under a directory with a specific name pattern using Perl?

I have a directory /var/spool and inside that, directories named
a b c d e f g h i j k l m n o p q r s t u v x y z
And inside each "letter directory", a directory called "user" and inside this, many directories called auser1 auser2 auser3 auser4 auser5 ...
Every user directory contains mail messages and the file names have the following format: 2. 3. 4. 5. etc.
How can I list the email files for every user in every directory in the following way:
/var/spool/a/user/auser1/11.
/var/spool/a/user/auser1/9.
/var/spool/a/user/auser1/8.
/var/spool/a/user/auser1/10.
/var/spool/a/user/auser1/2.
/var/spool/a/user/auser1/4.
/var/spool/a/user/auser1/12.
/var/spool/b/user/buser1/12.
/var/spool/b/user/buser1/134.
/var/spool/b/user/buser1/144.
etc.
I need those files, and then to open every single file to modify the header and body. That part I already have, but I need this first part.
I am trying this:
dir = "/var/spool";
opendir ( DIR, $dir ) || die "No pude abrir el directorio $dirname\n";
while( ($filename = readdir(DIR))){
#directorios1 = `ls -l "$dir/$filename"`;
print("#directorios1\n");
}
closedir(DIR);
But it does not work the way I need it to.
You can use File::Find.
As others have noted, use File::Find:
#!/usr/bin/perl

use strict;
use warnings;

use File::Find;

find(\&find_emails => '/var/spool');

sub find_emails {
    return unless /\A[0-9]+[.]\z/;
    return unless -f $File::Find::name;
    process_an_email($File::Find::name);
    return;
}

sub process_an_email {
    my ($file) = @_;
    print "Processing '$file'\n";
}
Use File::Find to traverse a directory tree.
For a fixed level of directories, sometimes it's easier to use glob than File::Find:
while (my $file = </var/spool/[a-z]/user/*/*>) {
    print "Processing $file\n";
}
People keep recommending File::Find, but the other piece that makes it easy is my File::Find::Closures, which provides the convenience functions for you:
use File::Find;
use File::Find::Closures qw( find_by_regex );

my( $wanted, $reporter ) = find_by_regex( qr/^\d+\.\z/ );

find( $wanted, @directories_to_search );

my @files = $reporter->();
You don't even need to use File::Find::Closures. I wrote the module so that you could lift out the subroutine you wanted and paste it into your own code, perhaps tweaking it to get what you needed.
Try this:
sub browse($);

sub browse($)
{
    my $path = $_[0];

    # append a / if missing
    if ($path !~ /\/$/)
    {
        $path .= '/';
    }

    # loop through the files contained in the directory
    for my $eachFile (glob($path.'*'))
    {
        # if the file is a directory
        if (-d $eachFile)
        {
            # browse directory recursively
            browse($eachFile);
        }
        else
        {
            # your file processing here
        }
    }
} # browse
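Called with the directory from the question, for example:

browse('/var/spool');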