Odd file handling in perl on OS X - perl

I'm very much a perl newbie, so bear with me.
I was looking for a way to recurse through folders in OS X and came across this solution: How to traverse all the files in a directory...
I modified perreal's answer (see code below) slightly so that I could specify the search folder in an argument; i.e. I changed my #dirs = ("."); to my #dirs = ($ARGV[0]);
But for some reason this wouldn't work -- it would open the folder, but would not identify any of the subdirectories as folders, apart from '.' and '..', so it never actually went beyond the specified root.
If I actively specified the folder (e.g. \Volumes\foo\bar) it still doesn't work. But, if I go back to my #dirs = ("."); and then sit in my desired folder (foo\bar) and call the script from its own folder (foo\boo\script.pl) it works fine.
Is this 'expected' behaviour? What am I missing?!
Many thanks,
Mat
use warnings;
use strict;
my #dirs = (".");
my %seen;
while (my $pwd = shift #dirs) {
opendir(DIR,"$pwd") or die "Cannot open $pwd\n";
my #files = readdir(DIR);
closedir(DIR);
foreach my $file (#files) {
if (-d $file and ($file !~ /^\.\.?$/) and !$seen{$file}) {
$seen{$file} = 1;
push #dirs, "$pwd/$file";
}
next if ($file !~ /\.txt$/i);
my $mtime = (stat("$pwd/$file"))[9];
print "$pwd $file $mtime";
print "\n";
}
}

The problem is that you are using the -d operator on the file basename without its path. Perl will look in the current working directory for a directory of that name and return true if it finds one there, when it should be looking in $pwd.
This solution changes $file to always hold the full name of the file or directory, including the path.
use strict;
use warnings;
my #dirs = (shift);
my %seen;
while (my $pwd = shift #dirs) {
opendir DIR, $pwd or die "Cannot open $pwd\n";
my #files = readdir DIR;
closedir DIR;
foreach (#files) {
next if /^\.\.?$/;
my $file = "$pwd/$_";
next if $seen{$file};
if ( -d $file ) {
$seen{$file} = 1;
push #dirs, $file;
}
elsif ( $file =~ /\.txt$/i ) {
my $mtime = (stat $file)[9];
print "$file $mtime\n";
}
}
}

use full path with -d
-d "$pwd/$file"

Related

How to read directories and sub-directories without knowing the directory name in perl?

Hi i want to read directories and sub-directories without knowing the directory name. Current directory is "D:/Temp". 'Temp' has sub-directories like 'A1','A2'. Again 'A1' has sub-directories like 'B1','B2'. Again 'B1' has sub-directories like 'C1','C2'. Perl script doesn't know these directories. So it has to first find directory and then read one file at a time in dir 'C1' once all files are read in 'C1' it should changes to dir 'C2'. I tried with below code here i don't want to read all files in array(#files) but need one file at time. In array #dir elements should be as fallows.
$dir[0] = "D:/Temp/A1/B1/C1"
$dir[1] = "D:/Temp/A1/B1/C2"
$dir[2] = "D:/Temp/A1/B2/C1"
Below is the code i tried.
use strict;
use File::Find::Rule;
use Data::Dumper;
my $dir = "D:/Temp";
my #dir = File::Find::Rule->directory->in($dir);
print Dumper (\#dir);
my $readDir = $dir[3];
opendir ( DIR, $readDir ) || die "Error in opening dir $readDir\n";
my #files = grep { !/^\.\.?$/ } readdir DIR;
print STDERR "files: #files \n\n";
for my $fil (#files) {
open (F, "<$fil");
read (F, my $data);
close (F);
print "$data";
}
use File::Find;
use strict;
use warnings;
my #dirs;
my %has_children;
find(sub {
if (-d) {
push #dirs, $File::Find::name;
$has_children{$File::Find::dir} = 1;
}
}, 'D:/Temp');
my #ends = grep {! $has_children{$_}} #dirs;
print "$_\n" for (#ends);
Your Goal: Find the absolute paths to those directories that do not themselves have child directories.
I'll call those directories of interest terminal directories. Here's the prototype for a function that I believe provides the convenience you are looking for. The function returns its result as a list.
my #list = find_terminal_directories($full_or_partial_path);
And here's an implementation of find_terminal_directories(). Note that this implementation does not require the use of any global variables. Also note the use of a private helper function that is called recursively.
On my Windows 7 system, for the input directory C:/Perl/lib/Test, I get the output:
== List of Terminal Folders ==
c:/Perl/lib/Test/Builder/IO
c:/Perl/lib/Test/Builder/Tester
c:/Perl/lib/Test/Perl/Critic
== List of Files in each Terminal Folder: ==
c:/Perl/lib/Test/Builder/IO/Scalar.pm
c:/Perl/lib/Test/Builder/Tester/Color.pm
c:/Perl/lib/Test/Perl/Critic/Policy.pm
Implementation
#!/usr/bin/env perl
use strict;
use warnings;
use Cwd qw(abs_path getcwd);
my #dir_list = find_terminal_directories("C:/Perl/lib/Test");
print "== List of Terminal Directories ==\n";
print join("\n", #dir_list), "\n";
print "\n== List of Files in each Terminal Directory: ==\n";
for my $dir (#dir_list) {
for my $file (<"$dir/*">) {
print "$file\n";
open my $fh, '<', $file or die $!;
my $data = <$fh>; # slurp entire file contents into $data
close $fh;
# Now, do something with $data !
}
}
sub find_terminal_directories {
my $rootdir = shift;
my #wanted;
my $cwd = getcwd();
chdir $rootdir;
find_terminal_directories_helper(".", \#wanted);
chdir $cwd;
return #wanted;
}
sub find_terminal_directories_helper {
my ($dir, $wanted) = #_;
return if ! -d $dir;
opendir(my $dh, $dir) or die "open directory error!";
my $count = 0;
foreach my $child (readdir($dh)) {
my $abs_child = abs_path($child);
next if (! -d $child || $child eq "." || $child eq "..");
++$count;
chdir $child;
find_terminal_directories_helper($abs_child, $wanted); # recursion!
chdir "..";
}
push #$wanted, abs_path($dir) if ! $count; # no sub-directories found!
}
Perhaps the following will be helpful:
use strict;
use warnings;
use File::Find::Rule;
my $dir = "D:/Temp";
local $/;
my #dirs =
sort File::Find::Rule->exec( sub { File::Find::Rule->directory->in($_) == 1 }
)->directory->in($dir);
for my $dir (#dirs) {
for my $file (<"$dir/*">) {
open my $fh, '<', $file or die $!;
my $data = <$fh>;
close $fh;
print $data;
}
}
local $/; lets us slurp the file's contents into a variable. Delete it if you only want to read the first line.
The sub in the exec() is used to pass only those dirs which don't contain a dir
sort is used to arrange those dirs in your wanted order
A file glob <"$dir/*"> is used to get the files in each dir
Edit: Have modified the code to find only 'terminal directories.' Thanks to DavidRR for this spec clarification.
I would use File::Find
Sample script:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
my $dir = "/home/chris";
find(\&wanted, $dir);
sub wanted {
print "dir: $File::Find::dir\n";
print "file in dir: $_\n";
print "complete path to file: $File::Find::name\n";
}
OUTPUTS:
$ test.pl
dir: /home/chris/test_dir
file in dir: test_dir2
complete path to file: /home/chris/test_dir/test_dir2
dir: /home/chris/test_dir/test_dir2
file in dir: foo.txt
complete path to file: /home/chris/test_dir/test_dir2/foo.txt
...
Using backticks, write subdirs and files to a file called filelist:
`ls -R $dir > filelist`

Perl program help on opendir and readdir

So I have a program that I want to clean some text files. The program asks for the user to enter the full pathway of a directory containing these text files. From there I want to read the files in the directory, print them to a new file (that is specified by the user), and then clean them in the way I need. I have already written the script to clean the text files.
I ask the user for the directory to use:
chomp ($user_supplied_directory = <STDIN>);
opendir (DIR, $user_supplied_directory);
Then I need to read the directory.
my #dir = readdir DIR;
foreach (#dir) {
Now I am lost.
Any help please?
I'm not certain of what do you want. So, I made some assumptions:
When you say clean the text file, you meant delete the text file
The names of the files you want to write into are formed by a pattern.
So, if I'm right, try something like this:
chomp ($user_supplied_directory = <STDIN>);
opendir (DIR, $user_supplied_directory);
my #dir = readdir DIR;
foreach (#dir) {
next if (($_ eq '.') || ($_ eq '..'));
# Reads the content of the original file
open FILE, $_;
$contents = <FILE>;
close FILE;
# Here you supply the new filename
$new_filename = $_ . ".new";
# Writes the content to the new file
open FILE, '>'.$new_filename;
print FILE $content;
close FILE;
# Deletes the old file
unlink $_;
}
I would suggest that you switch to File::Find. It can be a bit of a challenge in the beginning but it is powerful and cross-platform.
But, to answer your question, try something like:
my #files = readdir DIR;
foreach $file (#files) {
foo($user_supplied_directory/$file);
}
where "foo" is whatever you need to do to the files. A few notes might help:
using "#dir" as the array of files was a bit misleading
the folder name needs to be prepended to the file name to get the right file
it might be convenient to use grep to throw out unwanted files and subfolders, especially ".."
I wrote something today that used readdir. Maybe you can learn something from it. This is just a part of a (somewhat) larger program:
our #Perls = ();
{
my $perl_rx = qr { ^ perl [\d.] + $ }x;
for my $dir (split(/:/, $ENV{PATH})) {
### scanning: $dir
my $relative = ($dir =~ m{^/});
my $dirpath = $relative ? $dir : "$cwd/$dir";
unless (chdir($dirpath)) {
warn "can't cd to $dirpath: $!\n";
next;
}
opendir(my $dot, ".") || next;
while ($_ = readdir($dot)) {
next unless /$perl_rx/o;
### considering: $_
next unless -f;
next unless -x _;
### saving: $_
push #Perls, "$dir/$_";
}
}
}
{
my $two_dots = qr{ [.] .* [.] }x;
if (grep /$two_dots/, #Perls) {
#Perls = grep /$two_dots/, #Perls;
}
}
{
my (%seen, $dev, $ino);
#Perls = grep {
($dev, $ino) = stat $_;
! $seen{$dev, $ino}++;
} #Perls;
}
The crux is push(#Perls, "$dir/$_"): filenames read by readdir are basenames only; they are not full pathnames.
You can do the following, which allows the user to supply their own directory or, if no directory is specified by the user, it defaults to a designated location.
The example shows the use of opendir, readdir, stores all files in the directory in the #files array, and only files that end with '.txt' in the #keys array. The while loop ensures that the full path to the files are stored in the arrays.
This assumes that your "text files" end with the ".txt" suffix. I hope that helps, as I'm not quite sure what's meant by "cleaning the files".
use feature ':5.24';
use File::Copy;
my $dir = shift || "/some/default/directory";
opendir(my $dh, $dir) || die "Can't open $dir: $!";
while ( readdir $dh ) {
push( #files, "$dir/$_");
}
# store ".txt" files in new array
foreach $file ( #files ) {
push( #keys, $file ) if $file =~ /(\S+\.txt\z)/g;
}
# Move files to new location, even if it's across different devices
for ( #keys ) {
move $_, "/some/other/directory/"; || die "Couldn't move files: $!\n";
}
See the perldoc of File::Copy for more info.

How can I list all files in a directory using Perl?

I usually use something like
my $dir="/path/to/dir";
opendir(DIR, $dir) or die "can't open $dir: $!";
my #files = readdir DIR;
closedir DIR;
or sometimes I use glob, but anyway, I always need to add a line or two to filter out . and .. which is quite annoying.
How do you usually go about this common task?
my #files = grep {!/^\./} readdir DIR;
This will exclude all the dotfiles as well, but that's usually What You Want.
I often use File::Slurp. Benefits include: (1) Dies automatically if the directory does not exist. (2) Excludes . and .. by default. It's behavior is like readdir in that it does not return the full paths.
use File::Slurp qw(read_dir);
my $dir = '/path/to/dir';
my #contents = read_dir($dir);
Another useful module is File::Util, which provides many options when reading a directory. For example:
use File::Util;
my $dir = '/path/to/dir';
my $fu = File::Util->new;
my #contents = $fu->list_dir( $dir, '--with-paths', '--no-fsdots' );
I will normally use the glob method:
for my $file (glob "$dir/*") {
#do stuff with $file
}
This works fine unless the directory has lots of files in it. In those cases you have to switch back to readdir in a while loop (putting readdir in list context is just as bad as the glob):
open my $dh, $dir
or die "could not open $dir: $!";
while (my $file = readdir $dh) {
next if $file =~ /^[.]/;
#do stuff with $file
}
Often though, if I am reading a bunch of files in a directory, I want to read them in a recursive manner. In those cases I use File::Find:
use File::Find;
find sub {
return if /^[.]/;
#do stuff with $_ or $File::Find::name
}, $dir;
If some of the dotfiles are important,
my #files = grep !/^\.\.?$/, readdir DIR;
will only exclude . and ..
When I just want the files (as opposed to directories), I use grep with a -f test:
my #files = grep { -f } readdir $dir;
Thanks Chris and Ether for your recommendations. I used the following to read a listing of all files (excluded directories), from a directory handle referencing a directory other than my current directory, into an array. The array was always missing one file when not using the absolute path in the grep statement
use File::Slurp;
print "\nWhich folder do you want to replace text? " ;
chomp (my $input = <>);
if ($input eq "") {
print "\nNo folder entered exiting program!!!\n";
exit 0;
}
opendir(my $dh, $input) or die "\nUnable to access directory $input!!!\n";
my #dir = grep { -f "$input\\$_" } readdir $dh;

How can I add a prefix to all filenames under a directory?

I am trying to prefix a string (reference_) to the names of all the *.bmp files in all the directories as well sub-directories. The first time we run the silk script, it will create directories as well subdirectories, and under each subdirectory it will store each mobile application's sceenshot with .bmp extension.
When I run the automated silkscript for second time it will again create the *.bmp files in all the subdirectories. Before running the script for second time I want to prefix all the *.bmp with a string reference_.
For example first_screen.bmp to reference_first_screen.bmp,
I have the directory structure as below:
C:\Image_Repository\BG_Images\second
...
C:\Image_Repository\BG_Images\sixth
having first_screen.bmp and first_screen.bmp files etc...
Could any one help me out?
How can I prefix all the image file names with reference_ string?
When I run the script for second time, the Perl script in silk will take both the images from the sub-directory and compare them both pixel by pixel. I am trying with code below.
Could you please guide me how can I proceed to complete this task.
#!/usr/bin/perl -w
&one;
&two;
sub one {
use Cwd;
my $dir ="C:\\Image_Repository";
#print "$dir\n";
opendir(DIR,"+<$dir") or "die $!\n";
my #dir = readdir DIR;
#$lines=#dir;
delete $dir[-1];
print "$lines\n";
foreach my $item (#dir)
{
print "$item\n";
}
closedir DIR;
}
sub two {
use Cwd;
my $dir1 ="C:\\Image_Repository\\BG_Images";
#print "$dir1\n";
opendir(D,"+<$dir1") or "die $!\n";
my #dire = readdir D;
#$lines=#dire;
delete $dire[-1];
#print "$lines\n";
foreach my $item (#dire)
{
#print "$item\n";
$dir2="C:\\Image_Repository\\BG_Images\\$item";
print $dir2;
opendir(D1,"+<$dir2") or die " $!\n";
my #files=readdir D1;
#print "#files\n";
foreach $one (#files)
{
$one="reference_".$one;
print "$one\n";
#rename $one,Reference_.$one;
}
}
closedir DIR;
}
I tried open call with '+<' mode but I am getting compilation error for the read and write mode.
When I am running this code, it shows the files in BG_images folder with prefixed string but actually it's not updating the files in the sub-directories.
You don't open a directory for writing. Just use opendir without the mode parts of the string:
opendir my($dir), $dirname or die "Could not open $dirname: $!";
However, you don't need that. You can use File::Find to make the list of files you need.
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use File::Find;
use File::Find::Closures qw(find_regular_files);
use File::Spec::Functions qw(catfile);
my( $wanted, $reporter ) = find_regular_files;
find( $wanted, $ARGV[0] );
my $prefix = 'recursive_';
foreach my $file ( $reporter->() )
{
my $basename = basename( $file );
if( index( $basename, $prefix ) == 0 )
{
print STDERR "$file already has '$prefix'! Skipping.\n";
next;
}
my $new_path = catfile(
dirname( $file ),
"recursive_$basename"
);
unless( rename $file, $new_path )
{
print STDERR "Could not rename $file: $!\n";
next;
}
print $file, "\n";
}
You should probably check out the File::Find module for this - it will make recursing up and down the directory tree simpler.
You should probably be scanning the file names and modifying those that don't start with reference_ so that they do. That may require splitting the file name up into a directory name and a file name and then prefixing the file name part with reference_. That's done with the File::Basename module.
At some point, you need to decide what happens when you run the script the third time. Do the files that already start with reference_ get overwritten, or do the unprefixed files get overwritten, or what?
The reason the files are not being renamed is that the rename operation is commented out. Remember to add use strict; at the top of your script (as well as the -w option which you did use).
If you get a list of files in an array #files (and the names are base names, so you don't have to fiddle with File::Basename), then the loop might look like:
foreach my $one (#files)
{
my $new = "reference_$one";
print "$one --> $new\n";
rename $one, $new or die "failed to rename $one to $new ($!)";
}
With the aid of find utility from coreutils for Windows:
$ find -iname "*.bmp" | perl -wlne"chomp; ($prefix, $basename) = split(m~\/([^/]+)$~, $_); rename($_, join(q(/), ($prefix, q(reference_).$basename))) or warn qq(failed to rename '$_': $!)"

How do I read in the contents of a directory in Perl?

How do I get Perl to read the contents of a given directory into an array?
Backticks can do it, but is there some method using 'scandir' or a similar term?
opendir(D, "/path/to/directory") || die "Can't open directory: $!\n";
while (my $f = readdir(D)) {
print "\$f = $f\n";
}
closedir(D);
EDIT: Oh, sorry, missed the "into an array" part:
my $d = shift;
opendir(D, "$d") || die "Can't open directory $d: $!\n";
my #list = readdir(D);
closedir(D);
foreach my $f (#list) {
print "\$f = $f\n";
}
EDIT2: Most of the other answers are valid, but I wanted to comment on this answer specifically, in which this solution is offered:
opendir(DIR, $somedir) || die "Can't open directory $somedir: $!";
#dots = grep { (!/^\./) && -f "$somedir/$_" } readdir(DIR);
closedir DIR;
First, to document what it's doing since the poster didn't: it's passing the returned list from readdir() through a grep() that only returns those values that are files (as opposed to directories, devices, named pipes, etc.) and that do not begin with a dot (which makes the list name #dots misleading, but that's due to the change he made when copying it over from the readdir() documentation). Since it limits the contents of the directory it returns, I don't think it's technically a correct answer to this question, but it illustrates a common idiom used to filter filenames in Perl, and I thought it would be valuable to document. Another example seen a lot is:
#list = grep !/^\.\.?$/, readdir(D);
This snippet reads all contents from the directory handle D except '.' and '..', since those are very rarely desired to be used in the listing.
A quick and dirty solution is to use glob
#files = glob ('/path/to/dir/*');
This will do it, in one line (note the '*' wildcard at the end)
#files = </path/to/directory/*>;
# To demonstrate:
print join(", ", #files);
IO::Dir is nice and provides a tied hash interface as well.
From the perldoc:
use IO::Dir;
$d = IO::Dir->new(".");
if (defined $d) {
while (defined($_ = $d->read)) { something($_); }
$d->rewind;
while (defined($_ = $d->read)) { something_else($_); }
undef $d;
}
tie %dir, 'IO::Dir', ".";
foreach (keys %dir) {
print $_, " " , $dir{$_}->size,"\n";
}
So you could do something like:
tie %dir, 'IO::Dir', $directory_name;
my #dirs = keys %dir;
You could use DirHandle:
use DirHandle;
$d = new DirHandle ".";
if (defined $d)
{
while (defined($_ = $d->read)) { something($_); }
$d->rewind;
while (defined($_ = $d->read)) { something_else($_); }
undef $d;
}
DirHandle provides an alternative, cleaner interface to the opendir(), closedir(), readdir(), and rewinddir() functions.
Similar to the above, but I think the best version is (slightly modified) from "perldoc -f readdir":
opendir(DIR, $somedir) || die "can't opendir $somedir: $!";
#dots = grep { (!/^\./) && -f "$somedir/$_" } readdir(DIR);
closedir DIR;
You can also use the children method from the popular Path::Tiny module:
use Path::Tiny;
my #files = path("/path/to/dir")->children;
This creates an array of Path::Tiny objects, which are often more useful than just filenames if you want to do things to the files, but if you want just the names:
my #files = map { $_->stringify } path("/path/to/dir")->children;
Here's an example of recursing through a directory structure and copying files from a backup script I wrote.
sub copy_directory {
my ($source, $dest) = #_;
my $start = time;
# get the contents of the directory.
opendir(D, $source);
my #f = readdir(D);
closedir(D);
# recurse through the directory structure and copy files.
foreach my $file (#f) {
# Setup the full path to the source and dest files.
my $filename = $source . "\\" . $file;
my $destfile = $dest . "\\" . $file;
# get the file info for the 2 files.
my $sourceInfo = stat( $filename );
my $destInfo = stat( $destfile );
# make sure the destinatin directory exists.
mkdir( $dest, 0777 );
if ($file eq '.' || $file eq '..') {
} elsif (-d $filename) { # if it's a directory then recurse into it.
#print "entering $filename\n";
copy_directory($filename, $destfile);
} else {
# Only backup the file if it has been created/modified since the last backup
if( (not -e $destfile) || ($sourceInfo->mtime > $destInfo->mtime ) ) {
#print $filename . " -> " . $destfile . "\n";
copy( $filename, $destfile ) or print "Error copying $filename: $!\n";
}
}
}
print "$source copied in " . (time - $start) . " seconds.\n";
}
from: http://perlmeme.org/faqs/file_io/directory_listing.html
#!/usr/bin/perl
use strict;
use warnings;
my $directory = '/tmp';
opendir (DIR, $directory) or die $!;
while (my $file = readdir(DIR)) {
next if ($file =~ m/^\./);
print "$file\n";
}
The following example (based on a code sample from perldoc -f readdir) gets all the files (not directories) beginning with a period from the open directory. The filenames are found in the array #dots.
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '/tmp';
opendir(DIR, $dir) or die $!;
my #dots
= grep {
/^\./ # Begins with a period
&& -f "$dir/$_" # and is a file
} readdir(DIR);
# Loop through the array printing out the filenames
foreach my $file (#dots) {
print "$file\n";
}
closedir(DIR);
exit 0;
closedir(DIR);
exit 0;