Perl Rename Folders And Files With File::Find - perl

I'm using this code to process my folders and files with two subs:
"sub folders" for folder names only
"sub files" for file names with extensions only
But I realized that "sub folders" messes with my files with extensions during the rename process.
How can I keep the two processes apart? What is a sensible way to make "sub folders" rename only names with no extension and "sub files" rename only names with an extension?
find(\&folders, $dir_source);
sub folders {
my $fh = $File::Find::dir;
my $artist = (File::Spec->splitdir($fh))[3];
if (-d $fh) {
my $folder_name = $_;
# some substitution
rename $folder_name, $_;
}
}
find(\&files, $dir_source);
sub files {
/\.\w+$/ or return;
my $fn = $File::Find::name;
my ($genre, $artist, $boxset, $album, $disc);
if ($fn =~ /Singles/ or $fn =~ /Box Set/) {
($genre, $artist, $boxset, $album, $disc) = (File::Spec->splitdir($fn))[2..6];
}
else {
($genre, $artist, $album, $disc) = (File::Spec->splitdir($fn))[2..5];
}
if (-e $fn) {
my $file_name = $_;
# some substitution
rename $file_name, $_;
}
}

File::Find::find() calls your sub for every file and folder. If you only want to affect folders, ignore everything else with return unless -d.
You'll also need to call finddepth() instead of find(), since you're changing directory names: the "deeper" directories have to be renamed before the more "shallow" ones.
finddepth(sub {
return unless -d;
(my $new = $_) =~ s/this/that/ or return;
rename $_, $new or warn "Err renaming $_ to $new in $File::Find::dir: $!";
}, ".");
Alternative for multiple substitutions:
finddepth(sub {
return unless -d;
my $new = $_;
for ($new) {
s/this/that/;
s/something/something_else/;
}
return if $_ eq $new;
rename $_, $new or warn "Err renaming $_ to $new in $File::Find::dir: $!";
}, ".");
And in the files sub, I'd make the first statement:
return unless -f;
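Putting the two tests together, here is a minimal sketch of a single bottom-up pass that treats directories and plain files differently. The helper name rename_tree and both substitutions are placeholders invented for illustration; they are not taken from the question.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: rename directories and plain files under $top in one
# bottom-up pass. Both substitutions are placeholders for illustration only.
sub rename_tree {
    my ($top) = @_;
    finddepth(sub {
        return if $_ eq '.' || $_ eq '..';
        my $new = $_;
        if (-d) {
            $new =~ s/old_dir/new_dir/;    # placeholder rule for folders
        }
        elsif (-f) {
            $new =~ s/\.text\z/.txt/;      # placeholder rule for files
        }
        else {
            return;                        # skip anything else
        }
        return if $new eq $_;
        # File::Find has chdir'ed into $File::Find::dir, so bare names work.
        rename $_, $new
            or warn "Err renaming $_ to $new in $File::Find::dir: $!";
    }, $top);
    return;
}
```

Because finddepth() visits a directory only after its contents, the files inside a folder are renamed before the folder itself.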

Related

Data driven perl script

I want to list the files and folders in a directory. Here is the list of files in this directory:
Output1.sv
Output2.sv
Folder1
Folder2
file_a
file_b
file_c.sv
But I don't want some of them to be listed. The excluded names are given in INPUT.txt, as shown below. Note: some of them are files and some are folders.
NOT_INCLUDED=file_a
NOT_INCLUDED=file_b
NOT_INCLUDED=file_c.sv
Here is the code.
#!/usr/intel/perl
use strict;
use warnings;
my $input_file = "INPUT.txt";
open ( OUTPUT, ">OUTPUT.txt" );
file_in_directory();
close OUTPUT;
sub file_in_directory {
my $path = "experiment/";
my @unsort_output;
my @not_included;
open ( INFILE, "<", $input_file);
while (<INFILE>){
if ( $_ =~ /NOT_INCLUDED/){
my @file = $_;
foreach my $file (@file) {
$file =~ s/NOT_INCLUDED=//;
push @not_included, $file;
}
}
}
close INFILE;
opendir ( DIR, $path ) || die "Error in opening dir $path\n";
while ( my $filelist = readdir (DIR) ) {
chomp $filelist;
next if ( $filelist =~ m/\.list$/ );
next if ( $filelist =~ m/\.swp$/ );
next if ( $filelist =~ s/\.//g);
foreach $_ (@not_included){
chomp $_;
my $not_included = "$_";
if ( $filelist eq $not_included ){
next;
}
push @unsort_output, $filelist;
}
closedir(DIR);
my @output = sort @unsort_output;
print OUTPUT @output;
}
The output that I want is a list of all the files in that directory except those marked NOT_INCLUDED in INPUT.txt:
Output1.sv
Output2.sv
Folder1
Folder2
But the output that I get still seems to include the unwanted files.
This part of the code makes no sense:
while ( my $filelist = readdir (DIR) ) {
...
foreach $_ (@not_included){
chomp $_;
my $not_included = "$_";
if ( $filelist eq $not_included ){
next;
} # (1)
push @unsort_output, $filelist; # (2)
}
This code contains three opening braces ({) but only two closing braces (}). If you try to run your code as-is, it fails with a syntax error.
The push line (marked (2)) is part of the foreach loop, but indented as if it were outside. Either it should be indented more (to line up with (1)), or you need to add a } before it. Neither alternative makes much sense:
If push is outside of the foreach loop, then the next statement (and the whole foreach loop) has no effect. It could just be deleted.
If push is inside the foreach loop, then every directory entry ($filelist) will be pushed multiple times, once for each line in @not_included (except for the names listed somewhere in @not_included; those will be pushed one time less).
There are several other problems. For example:
$filelist =~ s/\.//g removes all dots from the file name, transforming e.g. file_c.sv into file_csv. That means it will never match NOT_INCLUDED=file_c.sv in your input file.
Worse, the next if s/// part means the loop skips all files whose names contain dots, such as Output1.sv or Output2.sv.
Results are printed without separators, so you'll get something like
Folder1Folder1Folder1Folder2Folder2Folder2file_afile_afile_bfile_b in OUTPUT.txt.
Global variables are used for no reason, e.g. INFILE and DIR.
Here is how I would structure the code:
#!/usr/intel/perl
use strict;
use warnings;
my $input_file = 'INPUT.txt';
my %is_blacklisted;
{
open my $fh, '<', $input_file or die "$0: $input_file: $!\n";
while (my $line = readline $fh) {
chomp $line;
if ($line =~ s!\ANOT_INCLUDED=!!) {
$is_blacklisted{$line} = 1;
}
}
}
my $path = 'experiment';
my @results;
{
opendir my $dh, $path or die "$0: $path: $!\n";
while (my $entry = readdir $dh) {
next
if $entry eq '.' || $entry eq '..'
|| $entry =~ /\.list\z/
|| $entry =~ /\.swp\z/
|| $is_blacklisted{$entry};
push @results, $entry;
}
}
@results = sort @results;
my $output_file = 'OUTPUT.txt';
{
open my $fh, '>', $output_file or die "$0: $output_file: $!\n";
for my $result (@results) {
print $fh "$result\n";
}
}
The contents of INPUT.txt (more specifically, the parts after NOT_INCLUDED=) are read into a hash (%is_blacklisted). This allows easy lookup of entries.
Then we process the directory entries. We skip over . and .. (I assume you don't want those) as well as all files ending with .list or .swp (that was in your original code). We also skip any file that is blacklisted, i.e. that was specified as excluded in INPUT.txt. The remaining entries are collected in @results.
We sort our results and write them to OUTPUT.txt, one entry per line.
Not deviating too much from your code, here is the solution. Please find the comments:
#!/usr/intel/perl
use strict;
use warnings;
my $input_file = "INPUT.txt";
open ( OUTPUT, ">OUTPUT.txt" );
file_in_directory();
close OUTPUT;
sub file_in_directory {
my $path = "experiment/";
my @unsort_output;
my %not_included; # a hash instead of an array, for a cleaner and faster lookup
open ( INFILE, "<", $input_file) || die "Error in opening $input_file: $!\n";
while (my $file = <INFILE>) {
chomp $file; # strip the newline so the key matches the readdir entry
if ($file =~ s/^NOT_INCLUDED=//) {
$not_included{$file}++; # build a lookup hash (filename => 1, filename2 => 1)
}
}
close INFILE;
opendir ( DIR, $path ) || die "Error in opening dir $path\n";
while ( my $filelist = readdir (DIR) ) {
next if $filelist =~ /^\.\.?$/; # discard . and ..
next if ( $filelist =~ m/\.list$/ );
next if ( $filelist =~ m/\.swp$/ );
next if $not_included{$filelist}; # skip the excluded names
push @unsort_output, $filelist;
}
closedir(DIR); # earlier the closedir was inside the while loop, which is wrong.
my @output = sort @unsort_output;
print OUTPUT join("\n", @output), "\n";
}

Can't find file trying to move

I'm trying to clean up a directory that contains a lot of sub directories that actually belong in some of the sub directories, not the main directory.
For example, there is
Main directory
sub1
sub2
sub3
HHH
And HHH belongs in sub3. HHH has multiple text files inside of it (as well as some ..txt and ...txt files that I would like to ignore), and each of these text files has a string
some_pattern [sub3].
So, I attempted to write a script that looks into the file and then moves it into its corresponding directory
use File::Find;
use strict;
use warnings;
use File::Copy;
my $DATA = "D:/DATA/DATA_x/*";
my @dirs = grep { -d } glob $DATA;
foreach (@dirs) {
if ($_ =~ m/HHH/) {
print "$_\n";
my $file = "$_/*";
my @files = grep { -f } glob $file;
foreach (@files) {
print "file $_\n";
}
foreach (@files) {
print "\t$_\n";
my @folders = split('/', $_);
if ($folders[4] eq '..txt' or $folders[4] eq '...txt') {
print "$folders[4] ..txt\n";
}
foreach (@folders) {
print "$_\n";
}
open(FH, '<', $_);
my $value;
while (my $line = <FH>) {
if ($line =~ m/some_pattern/) {
($value) = $line =~ /\[(.+?)\]/;
($value) =~ s/\s*$//;
print "ident'$value'\n";
my $new_dir = "$folders[0]/$folders[1]/$folders[2]/$value/$folders[3]/$folders[4]";
print "making $folders[0]/$folders[1]/$folders[2]/$value/$folders[3]\n";
print "file is $folders[4]\n";
my $new_over_dir = "$folders[0]/$folders[1]/$value/$folders[2]/$folders[3]";
mkdir $new_over_dir or die "Can't make it $!";
print "going to swap\n '$_'\n for\n '$new_dir'\n";
move($_, $new_dir) or die "Can't $!";
}
}
}
}
}
It's saying
Can't make it No such file or directory at foo.pl line 57, <FH> line 82.
Why is it saying that it won't make a file that doesn't exist?
A while later: here is my final script:
use File::Find;
use strict;
use warnings;
use File::Copy;
my $DATA = "D:/DATA/DATA_x/*";
my @dirs = grep { -d } glob $DATA;
foreach (@dirs) {
if ($_ =~ m/HHH/) {
my $value;
my @folders;
print "$_\n";
my $file = "$_/*";
my @files = grep { -f } glob $file;
foreach (@files) {
print "file $_\n";
}
foreach (@files) {
print "\t$_\n";
@folders = split('/', $_);
if ($folders[4] eq '..txt' or $folders[4] eq '...txt') {
print "$folders[4] ..txt\n";
}
foreach (@folders) {
print "$_\n";
}
open(FH, '<', $_);
while (my $line = <FH>) {
if ($line =~ m/some_pattern/) {
($value) = $line =~ /\[(.+?)\]/;
($value) =~ s/\s*$//;
print "ident'$value'\n";
}
}
}
if($value){
print "value $value\n";
my $dir1 = "/$folders[1]/$folders[2]/$folders[3]/$folders[4]/$folders[5]";
my $dir2 = "/$folders[1]/$folders[2]/$folders[3]/$folders[4]/$value";
system("cp -r $dir1 $dir2");
}
}
}
}
This works. It looks like part of my problem from before was that I was trying to run this on a directory in my D: drive--when I moved it to the C: drive, it worked fine without any permissions errors or anything. I did try to implement something with Path::Tiny, but this script was so close to being functional (and it was functional in a Unix environment), that I decided to just complete it.
You really should read the Path::Tiny documentation. It probably contains everything you need.
Some starting points, without error handling and so on...
use strict;
use warnings;
use feature 'say';
use Path::Tiny;
my $start = path('D:/DATA/DATA_x');
my $iter = $start->iterator({recurse => 1});
while ( my $curr = $iter->() ) {
#select here the needed files - add more conditions if need
next if $curr->is_dir; #skip directories
next if $curr =~ m/HHH.*\.{2,3}txt$/; #skip ...?txt
#say "$curr";
my $content = $curr->slurp;
if( $content =~ m/some_pattern/ ) {
#do something with the file
say "doing something with $curr";
my $newfilename = path("insert what you need here"); #create the needed new path for the file ..
path($newfilename->dirname)->mkpath; #make directories
$curr->move($newfilename); #move the file
}
}
Are you sure of the directory path you are trying to create? The mkdir call will fail if any of the intermediate directories don't exist. If you are confident that $new_over_dir contains the directory path you want to create, you can use make_path from the File::Path module instead of mkdir.
From the documentation of make_path:
The make_path function creates the given directories if they don't exist before, much like the Unix command mkdir -p.
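To illustrate the difference, here is a small sketch. It assumes a throwaway temporary directory in place of the real data root, and the path components sub3 and HHH are only borrowed from the question's example layout.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Assumption: a temporary directory stands in for the real data root.
my $root = tempdir(CLEANUP => 1);
my $new_over_dir = "$root/sub3/HHH";    # "$root/sub3" does not exist yet

# mkdir creates only the last path component, so it fails here.
mkdir $new_over_dir
    or print "mkdir failed: $!\n";

# make_path creates every missing intermediate directory as well.
make_path($new_over_dir);
print "created $new_over_dir\n" if -d $new_over_dir;
```

The failed mkdir reports "No such file or directory", which is the same message the question shows: it refers to the missing parent directory, not to the directory being created.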

Move files to directories

Is there any reason why the following construction would not work? The file list contains file names. The name list contains a list of names that when matched as a substring to a file name, causes the loop to move the file to the directory called $name. It seems like it should work but it is not moving files. What is a better way to construct this?
FILE: for my $file (@file_list) {
for my $name (@name_list) {
if ($file =~ /^\Q$name\E/) {
rename "/Users/path/to/file/I/need/to/move/$file", "/Users/path/to/directory/i/need/to/move/file/to/$name/$file" or die "rename failed because: $!\n";
next FILE;
}
}
print "no match for $file\n";
}
Right code:
for my $file (@file_list) {
my $found = 0;
for my $name (@name_list) {
if ($file =~ /^\Q$name\E/) {
print "failed to rename $file: $!\n" unless rename "/Users/path/to/file/I/need/to/move/$file", "/Users/path/to/directory/i/need/to/move/file/to/$name/$file";
$found = 1;
last;
}
}
print "no match for $file\n" unless $found;
}
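The question does not say why the files are not moving. One possible cause, offered here only as an assumption, is that the target directory for $name does not exist yet, in which case rename fails. A minimal sketch that creates the directory first and uses move from File::Copy (which also works across filesystems) could look like this; the helper name move_by_prefix and its arguments are hypothetical.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Path qw(make_path);

# Hypothetical helper: move each plain file in $src_dir whose name starts
# with one of @$names into "$dst_dir/<name>/". Returns the number moved.
sub move_by_prefix {
    my ($src_dir, $dst_dir, $names) = @_;
    opendir my $dh, $src_dir or die "Cannot open $src_dir: $!\n";
    my @files = grep { -f "$src_dir/$_" } readdir $dh;
    closedir $dh;

    my $moved = 0;
    FILE: for my $file (@files) {
        for my $name (@$names) {
            next unless $file =~ /^\Q$name\E/;
            make_path("$dst_dir/$name");    # does nothing if it already exists
            move("$src_dir/$file", "$dst_dir/$name/$file")
                or die "Cannot move $file: $!\n";
            $moved++;
            next FILE;
        }
        print "no match for $file\n";
    }
    return $moved;
}
```

Reading the file list from the source directory itself also guards against a file list whose entries carry stray newlines or path prefixes.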

How to check in Perl whether there is any sub folder exist in a folder?

Is there a way to check whether there is any subfolder exist inside a folder. I would like to do this in Perl?
Glob through the contents of the directory, and check whether it is a directory with -d.
sub has_subfolder {
my $directory = shift;
for ( <$directory/*>, <$directory/.*> ) {
next if m#/\.\.?$#; # skip . and ..
return 1 if -d;
}
return 0;
}
if (grep -d, glob("$folder/*")) {
print "$folder has subfolder(s)\n";
}
If you want to deal with directories matching .*, you could do:
if (grep -d && !/\.\.?$/, glob("$folder/.* $folder/*")) {
print "$folder has subfolder(s)\n";
}
You can use the File::Find module for this purpose. File::Find scans a directory recursively and calls your sub for every entry. Here is the sample code:
use File::Find;
my $DirName = 'dirname' ;
sub has_subdir
{
#The path of the file/dir being visited.
my $subdir = $File::Find::name;
#Ignore if this is a file.
return unless -d $subdir;
#Ignore if $subdir is $DirName itself.
return if ( $subdir eq $DirName);
# if we have reached here, this is a subdirectory.
print "Sub directory found - $subdir\n";
}
#For each file and sub directory in $DirName, 'find' calls
#the 'has_subdir' subroutine recursively.
find (\&has_subdir, $DirName);
In order to check if a subfolder exists in a directory (without knowing any names):
my $dir_name = "some_directory";
opendir my $dir, $dir_name
or die "Could not open directory $dir_name: $!";
my $has_subfolder = grep { -d && !/(^|\/)\.\.?$/ } map { ("$dir_name"||'.')."/$_" } readdir $dir;
In other words, it checks for one or more files in the directory which are themselves directories.
If you want a specific subfolder, just use Geo's answer.
Edit: This is getting silly now, but here's a truly general-purpose answer. :-P Someone else is getting the check mark anyway.
sub hasSubDir {
my $dir_name = shift;
opendir my $dir, $dir_name
or die "Could not open directory $dir_name: $!";
my @files = readdir($dir);
closedir($dir);
for my $file (@files) {
if($file !~ /^\.\.?$/) {
return 1 if -d "$dir_name/$file";
}
}
return 0;
}
OK, I'm just gonna have to submit my own answer
sub has_subfolder {
my $dir = shift;
my $found = 0;
opendir my $dh, $dir or die "Could not open directory $dir: $!";
while (defined(my $entry = readdir($dh))) {
next if $entry =~ /^\.\.?$/; # skip '.' and '..'
my $path = $dir . '/' . $entry; # readdir doesn't return the whole path
if (-d $path) { # found a dir? record it, and leave the loop!
$found = 1;
last;
}
}
closedir($dh); # make sure we cleanup after!
return $found;
}
Compared to other answers:
finds hidden directories
completes as soon as it finds a match
doesn't traverse the tree twice (once for normal files, and again for hidden files)
EDIT - I see the requirements just changed (sigh). Fortunately the code above is trivially modified:
sub get_folders {
my $dir = shift;
my @found;
opendir my $dh, $dir or die "Could not open directory $dir: $!";
while (defined(my $entry = readdir($dh))) {
next if $entry =~ /^\.\.?$/; # skip '.' and '..'
my $path = $dir . '/' . $entry; # readdir doesn't return the whole path
push(@found, $entry) if -d $path; # found a dir? record it
}
closedir($dh); # make sure we cleanup after!
return @found;
}
if(-d "some_folder/some_subfolder") {
print "folder exists";
}
else {
print "folder does not exist";
}

How do I read in the contents of a directory in Perl?

How do I get Perl to read the contents of a given directory into an array?
Backticks can do it, but is there some method using 'scandir' or a similar term?
opendir(D, "/path/to/directory") || die "Can't open directory: $!\n";
while (my $f = readdir(D)) {
print "\$f = $f\n";
}
closedir(D);
EDIT: Oh, sorry, missed the "into an array" part:
my $d = shift;
opendir(D, "$d") || die "Can't open directory $d: $!\n";
my @list = readdir(D);
closedir(D);
foreach my $f (@list) {
print "\$f = $f\n";
}
EDIT2: Most of the other answers are valid, but I wanted to comment on this answer specifically, in which this solution is offered:
opendir(DIR, $somedir) || die "Can't open directory $somedir: $!";
@dots = grep { (!/^\./) && -f "$somedir/$_" } readdir(DIR);
closedir DIR;
First, to document what it's doing since the poster didn't: it's passing the returned list from readdir() through a grep() that only returns those values that are files (as opposed to directories, devices, named pipes, etc.) and that do not begin with a dot (which makes the list name @dots misleading, but that's due to the change he made when copying it over from the readdir() documentation). Since it limits the contents of the directory it returns, I don't think it's technically a correct answer to this question, but it illustrates a common idiom used to filter filenames in Perl, and I thought it would be valuable to document. Another example seen a lot is:
@list = grep !/^\.\.?$/, readdir(D);
This snippet reads all contents from the directory handle D except '.' and '..', since those are very rarely desired to be used in the listing.
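The same idiom can be wrapped in a small sub that uses a lexical directory handle instead of the bareword D. This is only a sketch; the name list_dir is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted entries of $dir, minus '.' and '..'.
sub list_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Can't open directory $dir: $!\n";
    my @entries = sort grep { !/\A\.\.?\z/ } readdir $dh;
    closedir $dh;
    return @entries;
}
```

A lexical handle is closed automatically when it goes out of scope, so an early return or die cannot leave the directory open.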
A quick and dirty solution is to use glob
@files = glob ('/path/to/dir/*');
This will do it, in one line (note the '*' wildcard at the end)
@files = </path/to/directory/*>;
# To demonstrate:
print join(", ", @files);
IO::Dir is nice and provides a tied hash interface as well.
From the perldoc:
use IO::Dir;
$d = IO::Dir->new(".");
if (defined $d) {
while (defined($_ = $d->read)) { something($_); }
$d->rewind;
while (defined($_ = $d->read)) { something_else($_); }
undef $d;
}
tie %dir, 'IO::Dir', ".";
foreach (keys %dir) {
print $_, " " , $dir{$_}->size,"\n";
}
So you could do something like:
tie %dir, 'IO::Dir', $directory_name;
my @dirs = keys %dir;
You could use DirHandle:
use DirHandle;
$d = new DirHandle ".";
if (defined $d)
{
while (defined($_ = $d->read)) { something($_); }
$d->rewind;
while (defined($_ = $d->read)) { something_else($_); }
undef $d;
}
DirHandle provides an alternative, cleaner interface to the opendir(), closedir(), readdir(), and rewinddir() functions.
Similar to the above, but I think the best version is (slightly modified) from "perldoc -f readdir":
opendir(DIR, $somedir) || die "can't opendir $somedir: $!";
@dots = grep { (!/^\./) && -f "$somedir/$_" } readdir(DIR);
closedir DIR;
You can also use the children method from the popular Path::Tiny module:
use Path::Tiny;
my @files = path("/path/to/dir")->children;
This creates an array of Path::Tiny objects, which are often more useful than just filenames if you want to do things to the files, but if you want just the names:
my @files = map { $_->stringify } path("/path/to/dir")->children;
Here's an example of recursing through a directory structure and copying files from a backup script I wrote.
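use File::Copy; # provides copy()
use File::stat; # makes stat() return an object with an mtime method
sub copy_directory {
my ($source, $dest) = @_;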
my $start = time;
# get the contents of the directory.
opendir(D, $source);
my @f = readdir(D);
closedir(D);
# recurse through the directory structure and copy files.
foreach my $file (@f) {
# Setup the full path to the source and dest files.
my $filename = $source . "\\" . $file;
my $destfile = $dest . "\\" . $file;
# get the file info for the 2 files.
my $sourceInfo = stat( $filename );
my $destInfo = stat( $destfile );
# make sure the destination directory exists.
mkdir( $dest, 0777 );
if ($file eq '.' || $file eq '..') {
} elsif (-d $filename) { # if it's a directory then recurse into it.
#print "entering $filename\n";
copy_directory($filename, $destfile);
} else {
# Only backup the file if it has been created/modified since the last backup
if( (not -e $destfile) || ($sourceInfo->mtime > $destInfo->mtime ) ) {
#print $filename . " -> " . $destfile . "\n";
copy( $filename, $destfile ) or print "Error copying $filename: $!\n";
}
}
}
print "$source copied in " . (time - $start) . " seconds.\n";
}
from: http://perlmeme.org/faqs/file_io/directory_listing.html
#!/usr/bin/perl
use strict;
use warnings;
my $directory = '/tmp';
opendir (DIR, $directory) or die $!;
while (my $file = readdir(DIR)) {
next if ($file =~ m/^\./);
print "$file\n";
}
The following example (based on a code sample from perldoc -f readdir) gets all the files (not directories) beginning with a period from the open directory. The filenames are found in the array @dots.
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '/tmp';
opendir(DIR, $dir) or die $!;
my @dots
= grep {
/^\./ # Begins with a period
&& -f "$dir/$_" # and is a file
} readdir(DIR);
# Loop through the array printing out the filenames
foreach my $file (@dots) {
print "$file\n";
}
closedir(DIR);
exit 0;