I need to search for files in a directory that begin with a particular pattern, say "abc". I also need to eliminate all the files in the result that end with ".xh". I am not sure how to go about doing it in Perl.
I have something like this:
opendir(MYDIR, $newpath);
my @files = grep(/abc\*.*/,readdir(MYDIR)); # DOES NOT WORK
I also need to eliminate all files from result that end with ".xh"
Thanks, Bi
Try:
@files = grep {!/\.xh$/} <$MYDIR/abc*>;
where $MYDIR is a string containing the path of your directory.
opendir(MYDIR, $newpath); my @files = grep(/abc*.*/,readdir(MYDIR)); #DOES NOT WORK
You are confusing a regex pattern with a glob pattern.
#!/usr/bin/perl
use strict;
use warnings;
opendir my $dir_h, '.'
or die "Cannot open directory: $!";
my @files = grep { /^abc/ and not /\.xh$/ } readdir $dir_h;
closedir $dir_h;
print "$_\n" for @files;
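To spell out the regex/glob difference mentioned in this answer, here is a small sketch. The sub names `loose_match` and `prefix_match` and the sample file names are made up for illustration; only the two patterns come from the thread.

```perl
use strict;
use warnings;

# As a regex, /abc*.*/ means: "ab", zero or more "c", then anything,
# and it may match anywhere in the name.
sub loose_match  { return grep { /abc*.*/ } @_ }

# What the glob pattern abc* means, written as a regex:
# the name starts with "abc".
sub prefix_match { return grep { /^abc/ } @_ }

my @names = qw(abc.txt ab.txt xabc abcd.xh);
print "loose:  @{[ loose_match(@names) ]}\n";    # all four names match
print "prefix: @{[ prefix_match(@names) ]}\n";   # abc.txt abcd.xh
```

The unanchored regex accepts every sample name because each one contains "ab", which is why the original grep returned far more than expected.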
opendir(MYDIR, $newpath) or die "Cannot open $newpath: $!";
my @files = grep { /^abc/ && !/\.xh$/ } readdir(MYDIR);
closedir MYDIR;
foreach (@files) {
    # do something
}
The point that kevinadc and Sinan Unur are using but not mentioning is that readdir() returns a list of all the entries in the directory when called in list context. You can then use any list operator on that. That's why you can use:
my @files = grep { /^abc/ && !/\.xh$/ } readdir MYDIR;
So:
readdir MYDIR
returns a list of all the files in MYDIR.
And:
grep { /^abc/ && !/\.xh$/ }
returns all the elements returned by readdir MYDIR that match the criteria there.
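To make the list-context behaviour concrete, here is a minimal, self-contained sketch. The helper name `matching_files` and the scratch directory created with File::Temp are assumptions for illustration only; the filter is anchored with `^abc` because the question asked for names that begin with the pattern.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Hypothetical helper: names in $dir that start with "abc"
# and do not end in ".xh".
sub matching_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    # readdir in list context returns every entry; grep filters that list.
    my @files = sort grep { /^abc/ && !/\.xh$/ } readdir $dh;
    closedir $dh;
    return @files;
}

# Demo in a throwaway directory so nothing real is touched.
my $tmp = tempdir(CLEANUP => 1);
for my $name (qw(abc1.txt abc2.xh xabc.txt other.log)) {
    open my $fh, '>', catfile($tmp, $name) or die "Cannot create $name: $!";
    close $fh;
}
print "$_\n" for matching_files($tmp);   # prints abc1.txt
```

The returned names are bare; join them with the directory (for example with `catfile`) before opening them.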
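foreach my $file (@files)
{
    # Take the last path component as the file name.
    my ($fileN) = $file =~ /([^\/]+)$/;
    if ($fileN =~ /\.xh$/)
    {
        unlink $file;    # note: this deletes the file from disk
        next;
    }
    if ($fileN =~ /^abc/)
    {
        open(my $fh, '<', $file) or die "Cannot open $file: $!";
        while (<$fh>)
        {
            # read through file.
        }
        close $fh;
    }
}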
Also, all files in a directory can be accessed by doing:
my $DIR = "/somedir/somepath";
foreach my $file (<$DIR/*>)
{
    # apply file checks here like above.
}
Alternatively, you can use the Perl module File::Find.
Instead of using opendir and filtering readdir (don't forget to closedir!), you could instead use glob:
use File::Spec::Functions qw(catfile splitpath);
my @files =
grep !/\.xh$/, # filter out names ending in ".xh"
map +(splitpath $_)[-1], # filename only
glob # perform shell-like glob expansion
catfile $newpath, 'abc*'; # "$newpath/abc*" (or \ or :, depending on OS)
If you don't care about eliminating the $newpath prefixed to the results of glob, get rid of the map+splitpath.
I have a problem with a Perl script, as follows.
I must open and analyze all the *.txt files in a directory, but I cannot.
I can read file names that are saved in the @files array and printed, but I cannot open those files for reading.
This is my code:
my $dir= "../Scrivania/programmi" ;
opendir my ($dh), $dir;
my @files = grep { -f and /\.txt/i } readdir $dir;
closedir $dh;
for my $file ( @files ) {
$file = catfile($dir, $file);
print qq{Opening "$file"\n};
open my $fh, '<', $file;
# Do stuff with the data from $fh
print "sono nel foreach\n";
print " in : "."$fh\n";
#open(CANALI,$fh);
#@righe=<CANALI>;
#close(CANALI);
#print "canali:"."@righe\n";
#foreach $canali (@righe)
#{
# $canali =~ /\d\d:\d\d (-) (.*)/;
# $ora= $1;
#
# if($hhSplit[0] == $ora)
# {
# push(@output, "$canali");
#
# }
#}
}
The main problem you have is that the file names returned by readdir have no path, so you're trying to open, say, x.txt when you should be opening ../Sc/direct/x.txt. The file doesn't exist in the current working directory, so your open call fails.
You also have a strange mixture of stuff in glob("$dir/(.*).txt/"), which looks a little like a regex pattern, and glob doesn't understand regexes. The value of $dir is a directory handle left open from the opendir on the first line. What you should be using is glob '../Sc/direct/*.txt', but then there's no need for the readdir.
There are two ways to find the contents of a directory. You can use opendir and readdir to read everything in the directory, or you can use glob.
The first method returns only the bare name of each entry, which means you must concatenate each name with the path to the containing directory, preferably using catfile from File::Spec::Functions. It also includes the pseudo-directories . and .. so you must filter those out before you can use the list of names.
glob has neither of these disadvantages. All the strings it returns are real directory entries, and they will include a path if you provided one in the pattern you passed as a parameter.
You seem to have become rather muddled over the two, so I have written this program, which separates the two approaches. I hope it makes things clearer.
use strict;
use warnings;
use v5.10.1;
use autodie;
use File::Spec::Functions qw/ catfile /;
my $dir = '../Sc/direct';
### Using glob
for my $file ( glob catfile($dir, '*.txt') ) {
print qq{Opening "$file"\n};
open my $fh, '<', $file;
# Do stuff with the data from $fh
}
### Using opendir / readdir
opendir my ($dh), $dir;
my @files = grep { -f catfile($dir, $_) and /\.txt$/i } readdir $dh;
closedir $dh;
for my $file ( @files ) {
$file = catfile($dir, $file);
print qq{Opening "$file"\n};
open my $fh, '<', $file;
# Do stuff with the data from $fh
}
Using $dir in the glob is incorrect: $dir is a GLOB type, not a string value. Rather, you should loop over the @files array and look for names that match what you want. Maybe something like this:
foreach my $fp (@files) {
    if ($fp =~ /\.txt$/) {
        print "$fp is a .txt\n";
        open(my $in, "<", $fp) or die "Cannot open $fp: $!";
        while (<$in>) { ... }
    }
}
I'm just a beginner in Perl. I try to rename a file or directory using the following script, but it is not renaming the file. Please help me in identifying the problem.
I'm using Perl version 5.8.4
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy;
my $dir="/home/hari/perl-s/abc/";
opendir (DIR, $dir);
my @fileList = readdir DIR;
foreach (@fileList){
next if -d;
my $oldname = $_;
print "Newfile after assigning: $_ \n";
s/(^[0-9])(.)//;
print "Newfile: $_ \n";
print "oldname: $oldname \n";
rename ($oldname,$_);
}
The return values of readdir are just filenames; they do not include the path that was provided to opendir. You generally have to include that manually.
opendir (DIR, $dir);
my @fileList = readdir DIR;
foreach (@fileList){
# $_ is just "filename"
$_ = "$dir/$_"; # now $_ is "/home/hari/perl-s/abc/$filename"
next if -d;
...
}
There's more than one way to do things in Perl. Another way to get the set of files in a directory is with the glob function. One of the advantages of glob is that you can use it in such a way so that it returns filenames with their full paths, and so sometimes glob is preferable to the opendir/readdir/closedir idioms:
my @filelist = glob("$dir/*");
foreach (@filelist) {
# $_ is "/home/hari/perl-s/abc/filename"
...
}
I have the following code for listing all files in a directory, but I have trouble with the path. My directory is */tmp/*; basically, I want the files that are in a directory inside a tmp directory, but I am not allowed to use * there. Do you have any idea?
my $directory="*/tmp/*/";
opendir(DIR, $directory) or die "couldn't open $directory: $!\n";
my @files = readdir DIR;
foreach $files (@files){
#...
} ;
closedir DIR;
opendir can't work with wildcards.
For your task there is a slightly ugly but working solution:
my @files = grep {-f} <*/tmp/*>; # this is equivalent of ls */tmp/*
# grep {-f} will stat on each entry and filter folders
# So @files would contain only file names with relative path
foreach my $file (@files) {
# do with $file whatever you want
}
Without globbing and * wildcard:
use 5.010;
use Path::Class::Rule qw();
for my $tmp_dir (Path::Class::Rule->new->dir->and(sub { return 'tmp' eq (shift->dir_list(1,1) // q{}) })->all) {
say $_ for $tmp_dir->children;
}
I usually use something like
my $dir="/path/to/dir";
opendir(DIR, $dir) or die "can't open $dir: $!";
my @files = readdir DIR;
closedir DIR;
or sometimes I use glob, but anyway, I always need to add a line or two to filter out . and .. which is quite annoying.
How do you usually go about this common task?
my @files = grep {!/^\./} readdir DIR;
This will exclude all the dotfiles as well, but that's usually What You Want.
I often use File::Slurp. Benefits include: (1) It dies automatically if the directory does not exist. (2) It excludes . and .. by default. Its behavior is like readdir in that it does not return the full paths.
use File::Slurp qw(read_dir);
my $dir = '/path/to/dir';
my @contents = read_dir($dir);
Another useful module is File::Util, which provides many options when reading a directory. For example:
use File::Util;
my $dir = '/path/to/dir';
my $fu = File::Util->new;
my @contents = $fu->list_dir( $dir, '--with-paths', '--no-fsdots' );
I will normally use the glob method:
for my $file (glob "$dir/*") {
#do stuff with $file
}
This works fine unless the directory has lots of files in it. In those cases you have to switch back to readdir in a while loop (putting readdir in list context is just as bad as the glob):
opendir my $dh, $dir
or die "could not open $dir: $!";
while (defined(my $file = readdir $dh)) {
next if $file =~ /^[.]/;
#do stuff with $file
}
Often though, if I am reading a bunch of files in a directory, I want to read them in a recursive manner. In those cases I use File::Find:
use File::Find;
find sub {
return if /^[.]/;
#do stuff with $_ or $File::Find::name
}, $dir;
If some of the dotfiles are important,
my @files = grep !/^\.\.?$/, readdir DIR;
will only exclude . and ..
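If you would rather not write that regex yourself, the core File::Spec module has a `no_upwards` method that removes only `.` and `..` from a list of names. A minimal sketch, where the helper name `entries` is hypothetical:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: all directory entries except "." and "..".
# Other dotfiles (for example ".bashrc") are kept.
sub entries {
    my ($dir) = @_;
    opendir my $dh, $dir or die "can't open $dir: $!";
    my @names = File::Spec->no_upwards(readdir $dh);
    closedir $dh;
    return @names;
}

print "$_\n" for entries('.');
```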
When I just want the files (as opposed to directories), I use grep with a -f test:
my @files = grep { -f } readdir $dir;
Thanks, Chris and Ether, for your recommendations. I used the following to read a listing of all files (excluding directories), from a directory handle referencing a directory other than my current directory, into an array. The array was always missing one file when I did not use the absolute path in the grep statement.
use File::Slurp;
print "\nWhich folder do you want to replace text? " ;
chomp (my $input = <>);
if ($input eq "") {
print "\nNo folder entered exiting program!!!\n";
exit 0;
}
opendir(my $dh, $input) or die "\nUnable to access directory $input!!!\n";
my @dir = grep { -f "$input\\$_" } readdir $dh;
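The missing-file symptom comes from `-f` being applied to a bare name, which Perl resolves against the current working directory rather than the directory being read. Below is a sketch of a portable variant, assuming a hypothetical helper `plain_files` and using `catfile` in place of the hard-coded `\\` separator:

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catfile);

# Hypothetical helper: names of regular files (not directories) in $dir.
sub plain_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Unable to access directory $dir: $!";
    # Test the full path; a bare name would be looked up in the cwd.
    my @files = grep { -f catfile($dir, $_) } readdir $dh;
    closedir $dh;
    return @files;
}

# Directory from the command line, defaulting to the current one.
my $input = @ARGV ? $ARGV[0] : '.';
print "$_\n" for plain_files($input);
```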
I'm using this code to get a list of all the files in a specific directory:
opendir DIR, $dir or die "cannot open dir $dir: $!";
my @files= readdir DIR;
closedir DIR;
How can I modify this code or append something to it so that it only looks for text files and only loads the array with the prefix of the filename?
Example directory contents:
.
..
923847.txt
98398523.txt
198.txt
deisi.jpg
oisoifs.gif
lksdjl.exe
Example array contents:
files[0]=923847
files[1]=98398523
files[2]=198
my @files = glob "$dir/*.txt";
for (0..$#files){
$files[$_] =~ s/\.txt$//;
}
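Note that `glob "$dir/*.txt"` returns each name with the `$dir/` prefix attached, so stripping the extension alone leaves the path in place. Here is a sketch that reduces each result to the bare prefix with the core File::Basename module; the helper name `txt_prefixes` is an assumption, and the code assumes `$dir` contains no whitespace, since glob splits its pattern on spaces.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: returns "923847" for "$dir/923847.txt", and so on.
# Assumes $dir has no whitespace (glob would split the pattern on it).
sub txt_prefixes {
    my ($dir) = @_;
    # basename() drops the directory part and the listed suffix.
    return map { basename($_, '.txt') } glob "$dir/*.txt";
}

print "$_\n" for txt_prefixes('.');
```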
It is enough to change one line:
my @files= map{s/\.[^.]+$//;$_}grep {/\.txt$/} readdir DIR;
If you can use the new features of Perl 5.10, this is how I would write it.
use strict;
use warnings;
use 5.10.1;
use autodie; # don't need to check the output of opendir now
my $dir = ".";
{
opendir my($dirhandle), $dir;
for( readdir $dirhandle ){ # sets $_
when(-d $_ ){ next } # skip directories
when(/^[.]/){ next } # skip dot-files
when(/(.+)[.]txt$/){ say "text file: ", $1 }
default{
say "other file: ", $_;
}
}
# $dirhandle is automatically closed here
}
Or if you have very large directories, you could use a while loop.
{
opendir my($dirhandle), $dir;
while( my $elem = readdir $dirhandle ){
given( $elem ){ # sets $_
when(-d $_ ){ next } # skip directories
when(/^[.]/){ next } # skip dot-files
when(/(.+)[.]txt$/){ say "text file: ", $1 }
default{
say "other file: ", $_;
}
}
}
}
This is the simplest way I've found (as in human readable) using the glob function:
# Store only TXT-files in the @files array using glob
my @files = grep { -f } <*.txt>;
# Write them out
foreach my $file (@files) {
print "$file\n";
}
Additionally the "-f" ensures that only actual files (and not directories) are stored in the array.
To get just the ".txt" files, you can use a file test operator (-f : regular file) and a regex.
my @files = grep { -f && /\.txt$/ } readdir $dir;
Otherwise, you can look for just text files, using perl's -T (ascii-text file test operator)
my @files = grep { -T } readdir $dir;
Just use this:
my @files = map {-f && s{\.txt\z}{} ? $_ : ()} readdir DIR;