Checking existence of a file given a directory format in Perl

I am struggling with a method of walking a directory tree to check existence of a file in multiple directories. I am using Perl and I only have the ability to use File::Find as I am unable to install any other modules for this.
Here's the layout of the file system I want to traverse:
Cars/Honda/Civic/Setup/config.txt
Cars/Honda/Pathfinder/Setup/config.txt
Cars/Toyota/Corolla/Setup/config.txt
Cars/Toyota/Avalon/Setup/
Note that the last Setup folder is missing a config.txt file.
Edit: also, each of the Setup folders contains a number of other files that vary from Setup folder to Setup folder. There really isn't any single file I could search for to get into the Setup folder itself.
So you can see that the file path stays the same except for the make and model folders. I want to find all of the Setup folders and then check to see if there is a config.txt file in that folder.
At first I was using the following code with File::Find
my $dir = '/test/Cars/';
find(\&find_config, $dir);
sub find_config {
    # find all Setup folders from the given top-level dir
    if ($File::Find::dir =~ m/Setup/) {
        # create the file path of config.txt whether it exists or not; we'll check it in the next line
        $config_filepath = $File::Find::dir . "/config.txt";
        # check existence of file; further processing
        ...
    }
}
You can obviously see the flaw in trying to use $File::Find::dir =~ m/Setup/ since it will return a hit for every single file in the Setup folder. Is there any way to use a -d or some sort of directory check rather than a file check? The config.txt is not always in the folder (I will need to create it if it doesn't exist) so I can't really use something like return unless ($_ =~ m/config\.txt/) since I don't know if it's there or not.
I'm trying to find a way to use something like return unless ( <is a directory> and <the directory has a regex match of m/Setup/>).
Maybe File::Find is not the right method for something like this but I've been searching around for a while now without any good leads on working with directory names rather than file names.

File::Find finds directory names, too. You want to check for when $_ eq 'Setup' (note: eq, not your regular expression, which would also match XXXSetupXXX), and then see if there's a config.txt file in the directory ( -f "$File::Find::name/config.txt" ). If you want to avoid complaining about files named Setup, check that the found 'Setup' is a directory with -d.
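A minimal sketch of that approach (the processing and creation steps are left as stubs):
use File::Find;

find(\&find_config, '/test/Cars/');

sub find_config {
    # only act on directories named exactly 'Setup'
    return unless $_ eq 'Setup' && -d $File::Find::name;
    my $config = "$File::Find::name/config.txt";
    if (-f $config) {
        # config.txt exists; process it here
    }
    else {
        # config.txt is missing; create it here
    }
}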

I'm trying to find a way to use something like return unless ( <is a directory> and <the directory has a regex match of m/Setup/>).
use File::Find;
use File::Spec::Functions qw( catfile );

my $dir = '/test/Cars/';
find(\&find_config, $dir);

sub find_config {
    return unless $_ eq 'Setup' and -d $File::Find::name;
    my $config_filepath = catfile $File::Find::name => 'config.txt';
    # check for existence etc
}
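Since the question mentions creating config.txt when it doesn't exist, the "# check for existence etc" step might be fleshed out like this (a sketch; the default contents are up to you):
unless (-f $config_filepath) {
    open my $fh, '>', $config_filepath
        or die "Cannot create $config_filepath: $!";
    print {$fh} "# default config\n";    # hypothetical default contents
    close $fh;
}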

Related

Perl script does not recognize paths which contain environment variables

I got a file with a path on each line. I insert those lines into an array @dirs. Some of the paths include environment variables. An example of a valid file with paths:
/some/valid/path
$HOME/dir
$SOME_ENV/dir
For each path, I would like to check whether it contains a file called abc. So I use:
foreach my $dir (@dirs) {
    chomp($dir);
    my $file = $dir . "/" . "abc";
    print "invalid dir: $dir" unless ((-e $file) && (-s $file));
}
But, for some reason, it does not recognize the environment variables, meaning it fails even though $SOME_ENV/dir contains the abc file.
Also, the script does recognize those environment variables if I use them as follows:
print $ENV{SOME_ENV}."\n";
print $ENV{HOME}."\n";
Furthermore, I tried to use abs_path from the Cwd module in order to get the real path (so it won't include the environment variable), but it also does not recognize the environment variable.
Why does (-e $file) not recognize the environment variable? How can I solve this issue?
There is nothing in your code that expands environment variables inside $dir, so you'd need to add that. A very simplistic way is to use a regular expression to find the variables and replace them with their values from the %ENV hash:
$dir =~ s/\$([A-Z0-9_]+)/$ENV{$1}/g;
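In context, the loop from the question might become (a sketch; this simple substitution doesn't handle ${VAR} braces or lowercase variable names):
foreach my $dir (@dirs) {
    chomp($dir);
    $dir =~ s/\$([A-Z0-9_]+)/$ENV{$1}/g;   # expand $HOME, $SOME_ENV, etc.
    my $file = "$dir/abc";
    print "invalid dir: $dir\n" unless -e $file && -s $file;
}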

Determining file size in Perl doesn't always work

I'm trying to enumerate directory content and check the sizes of the files there (no recursion). So I opendir/readdir through the directory, skip certain types of files (directories and such), and get the size of the current file with something like my $size = -s "$file_path".
However, I'm having a weird situation: in one directory (containing all .exe files) I can't get the size of any file. The same program runs fine on another directory (.txt files, .pl and similar).
If I copy some .exe file from the first directory to the other one, its size is properly determined.
If I run the program on the first directory again, the size of that one copied .exe is properly determined, all others still fail. So it seems like some weird caching problem.
Any idea why this is happening?
Edit: with the -f check, the .exe files for which the size check doesn't work are not plain files. However, they "become" plain files if I just copy them from that directory to somewhere else. Then the size check works.
The part of the code used for enumerating files:
my $dir_handle;
my $dir_entry;
my $retVal = opendir($dir_handle, "$path");
if (!$retVal)
{
    print "Unable to open directory. \n$!";
    exit(0);
}
while ($dir_entry = readdir($dir_handle))
{
    print "Current file: $dir_entry \n";
    next if (! -f $dir_entry);
    my $size_bytes = -s "$dir_entry";
    if ($size_bytes)
    {
        print "Size: $size_bytes \n";
    }
}
closedir($dir_handle);
readdir() returns the file name only, and doesn't include the path information - so if the directory is not the same as the current working dir, this will fail.
You want to include the path:
while ($dir_entry = readdir($dir_handle))
{
    print "Current file: $dir_entry \n";
    next if (! -f "$path/$dir_entry");
    my $size_bytes = -s "$path/$dir_entry";
    if ($size_bytes)
    {
        print "Size: $size_bytes \n";
    }
}
(and yes, using Unix-style path separators works fine here; feel free to use \ instead if you like escaping things)
readdir only returns the name of each directory entry. It doesn't include the path to the directory being read. For example, if you're reading /var/tmp and that directory contains a file named foo, then readdir is going to return foo, not /var/tmp/foo.
To check whether a directory entry is a file or to get its size, you have to provide a full pathname to the file, including the directory part. Unless you're specifically calling readdir on the current directory, you will need to convert each filename to a pathname:
while ($dir_entry = readdir($dir_handle))
{
    my $pn = $path . '/' . $dir_entry;
    print "Current file: $pn \n";
    next if (! -f $pn);
    my $size_bytes = -s $pn;
    ...
}
As has already been stated, your bug was in not including the path information during your file operations.
I would recommend using Path::Class to make your file and directory operations cross-platform compatible, and also to handle issues like this automatically:
use strict;
use warnings;
use Path::Class;

my $path = '.';
my $dir  = dir($path);

for my $child ( $dir->children ) {
    printf "Current file: %s\n", $child->basename;
    next if !-f $child;
    if ( my $size = -s $child ) {
        print "Size: $size\n";
    }
}

Trying to pass a subdirectory as a parameter in Perl

I have a Perl program that reads .html files, and it only works if the program is in the same directory as the .html files.
I would like to be able to start in different directories and pass the location of the .html files as a parameter. The program (shell example below) traverses the subdirectory "sub"
and its subdirectories to look for .html files, but it only works when my Perl file is in that same subdirectory "sub". If I put the Perl file
in the home directory, which is one step back from the subdirectory "sub", it doesn't work.
In the shell, if I type "perl project.pl ./sub" from my home directory, it says could
not open ./sub/file1.html. No such file or directory. Yet the file does exist in that exact spot.
file1.html is the first file it is trying to read.
If I change directories in the shell to that subdirectory and move the .pl file
there and then say in the shell: "perl project.pl ./" everything is ok.
To search the directories, I have been using the File::Find concept which I found here:
How to traverse all the files in a directory; if it has subdirectories, I want to traverse files in subdirectories too
Find::File to search a directory of a list of files
#!/usr/bin/perl -w
use strict;
use warnings;
use File::Find;

find( \&directories, $ARGV[0] );

sub directories {
    $_ = $File::Find::name;
    if (/.*\.html$/) {    # only read the file on the local drive if it is an .html
        my $file = $_;
        open my $info, $file or die "Could not open $file: $!";
        while (my $line = <$info>) {
            # perform operations on file
        }
        close $info;
    }
    return;
}
In the documentation of File::Find it says:
You are chdir()'d to $File::Find::dir when the function is called,
unless no_chdir was specified. Note that when changing to directories
is in effect the root directory (/) is a somewhat special case
inasmuch as the concatenation of $File::Find::dir, '/' and $_ is not
literally equal to $File::Find::name.
So you actually are at ~/sub already. Only use the filename, which is $_. You do not need to overwrite it. Remove the line:
$_ = $File::Find::name;
find changes directory automatically so that $File::Find::name is no longer relative to the current directory.
You can delete this line to get it to work:
$_ = $File::Find::name;
See also File::Find no_chdir.
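Alternatively, if you would rather keep working with full paths in $_ (as the posted code assumes), File::Find's no_chdir option does exactly that; a sketch:
use strict;
use warnings;
use File::Find;

# with no_chdir => 1, find() does not chdir(), and $_ is set to the
# full path of each entry (the same value as $File::Find::name)
find( { wanted => \&directories, no_chdir => 1 }, $ARGV[0] );

sub directories {
    return unless /\.html$/;
    open my $info, '<', $_ or die "Could not open $_: $!";
    while (my $line = <$info>) {
        # perform operations on file
    }
    close $info;
}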
From the File::Find documentation:
For each file or directory found, it calls the &wanted subroutine.
(See below for details on how to use the &wanted function).
Additionally, for each directory found, it will chdir() into that
directory and continue the search, invoking the &wanted function on
each file or subdirectory in the directory.
(emphasis mine)
The reason it's not finding ./sub/file1.html is that, by the time open is called, File::Find has already chdir()ed you into ./sub/. You should be able to open the file as just file1.html.

How can I sync two directories with Perl?

I have a folder called "Lib" on my drive that contains many files, and this "Lib" folder also exists in many other places on the drive. My Perl script has to copy the most recently updated contents from the "Lib" folders and paste them into the folder "d:\perl\Latest_copy_of_Lib".
For example, I have Lib folders in d:\functions, d:\abc, and many other places. I want to find the latest copy of each file in those directories. So, if the file d:\functions\foo.txt was last modified on 2009-10-12 and d:\abc\foo.txt was last modified on 2009-10-13, then I want the version in d:\abc to be copied to the target directory.
I have used File::Find, but it searches the whole directory and copies contents that are not the latest copy.
I think you just described rsync. Unless you have some sort of weird requirements here, I don't think you need to write any code to do this. I certainly wouldn't reach for Perl to do the job you described.
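For example, something like this (a sketch; it assumes an rsync build is available and accepts these forward-slash Windows paths). Run it once per source Lib; --update only overwrites a file in the target when the source copy is newer, so after all the runs the newest copy of each file wins:
rsync -av --update d:/functions/Lib/ d:/perl/Latest_copy_of_Lib/
rsync -av --update d:/abc/Lib/ d:/perl/Latest_copy_of_Lib/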
You need to use File::Find to create a hash of files to move. Only put the path to a file in the hash if the file is newer than the one already stored in the hash. Here is a simple implementation. Note: there may be problems on the Windows platform; I am not used to using File::Spec to work with files and paths in a cross-platform manner.
#!/usr/bin/perl

use warnings;
use strict;

use File::Find;
use File::Spec;

my %copy;
my @sources = qw{
    /Users/cowens/foo/Lib
    /Users/cowens/bar/Lib
    /Users/cowens/baz/Lib
};

find sub {
    my ($volume, $dir, $file) = File::Spec->splitpath($File::Find::name);
    my @dirs = File::Spec->splitdir($dir);
    my @base = ($volume); # the base directory of the file
    for my $dir (@dirs) {
        last if $dir eq 'Lib';
        push @base, $dir;
    }
    # the part that is common among the various bases
    my @rest = @dirs[$#base .. $#dirs];
    my $base = File::Spec->catdir(@base);
    my $rest = File::Spec->catfile(@rest, $file);
    # if we don't have this file yet, or if the file is newer than the one
    # we have
    if (not exists $copy{$rest} or (stat $File::Find::name)[9] > $copy{$rest}{mtime}) {
        $copy{$rest} = {
            mtime => (stat _)[9],
            base  => $base
        };
    }
}, @sources;

print "copy\n";
for my $rest (sort keys %copy) {
    print "\t$rest from $copy{$rest}{base}\n";
}
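The code above stops at printing the plan. An actual copy pass over %copy could look something like this (a sketch using the core File::Copy and File::Path modules; $target is a hypothetical destination such as the asker's d:\perl\Latest_copy_of_Lib):
use File::Copy qw(copy);
use File::Path qw(make_path);
use File::Basename qw(dirname);

my $target = '/Users/cowens/Latest_copy_of_Lib';    # hypothetical destination
for my $rest (sort keys %copy) {
    my $from = File::Spec->catfile($copy{$rest}{base}, $rest);
    my $to   = File::Spec->catfile($target, $rest);
    make_path(dirname($to));    # create intermediate directories as needed
    copy($from, $to) or die "Cannot copy $from to $to: $!";
}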
If you can standardize on a single location for your libraries, then use one of the following: set the PERL5LIB environment variable, or add
use lib 'C:\Lib';
to your scripts, or run them as
perl -I C:\Lib myscript
Any of these will give you a single copy of your lib directory that any of your scripts will be able to access.

How can I change the case of filenames in Perl?

I'm trying to create a process that renames all my filenames to Camel/Capital Case. The closest I have gotten is this:
perl -i.bak -ple 's/\b([a-z])/\u$1/g;' *.txt # or similar .extension.
This creates a backup file (which I'll remove once I've verified it does what I want), but instead of renaming the file, it capitalizes the text inside the file. Is there an easier way to do this? The idea is that I have several office documents in various formats and, as I'm a bit anal-retentive, I would like them to look like this:
New Document.odt
Roffle.ogg
Etc.Etc
Bob Cat.flac
Cat Dog.avi
Is this possible with Perl, or do I need to change to another language/combination of them?
Also, is there any way to make this recursive, such that /foo/foo/documents has all its files renamed, as does /foo/foo/documents/foo?
You need to use rename.
Here is its signature:
rename OLDNAME,NEWNAME
To make it recursive, use it along with File::Find:
use strict;
use warnings;
use File::Basename;
use File::Find;

# default: searches just the current directory
my @directories = (".");

find(\&wanted, @directories);

sub wanted {
    # renaming goes here
}
This snippet will run the code inside wanted against every file and directory that is found. You have to complete the code inside wanted to do what you want; a possible completion is sketched below.
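A possible completion of wanted, assuming the capitalization rule from the question (note that find() has already chdir()ed into each file's directory, so $_ is just the filename; this simple regex also capitalizes the extension, which the answers below address):
sub wanted {
    return unless -f $_;                    # only rename plain files
    (my $new = $_) =~ s/\b([a-z])/\u$1/g;   # Camel/Capital Case each word
    return if $new eq $_;                   # nothing to change
    rename $_, $new or warn "Cannot rename $_ to $new: $!";
}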
EDIT: I tried to accomplish this task using File::Find, and I don't think you can easily achieve it. You can succeed by following these steps:
if the parameter is a dir, capitalize it and obtain all the files
for each file, if it's a dir, go back at the beginning with this file as argument
if the file is a regular file, capitalize it
Perl just got in my way while writing this script, so I wrote it in Ruby:
require "rubygems"
require "ruby-debug"
# camelcase files
class File
class << self
alias :old_rename :rename
end
def self.rename(arg1,arg2)
puts "called with #{arg1} and #{arg2}"
self.old_rename(arg1,arg2)
end
end
def capitalize_dir_and_get_files(dir)
if File.directory?(dir)
path_c = dir.split(/\//)
#base = path_c[0,path_c.size-1].join("/")
path_c[-1].capitalize!
new_dir_name = path_c.join("/")
File.rename(dir,new_dir_name)
files = Dir.entries(new_dir_name) - [".",".."]
files.map! {|file| File.join(new_dir_name,file)}
return files
end
return []
end
def camelize(dir)
files = capitalize_dir_and_get_files(dir)
files.each do |file|
if File.directory?(file)
camelize(file.clone)
else
dir_name = File.dirname(file)
file_name = File.basename(file)
extname = File.extname(file)
file_components = file_name.split(/\s+/)
file_components.map! {|file_component| file_component.capitalize}
new_file_name = File.join(dir_name,file_components.join(" "))
#if extname != ""
# new_file_name += extname
#end
File.rename(file,new_file_name)
end
end
end
camelize(ARGV[0])
I tried the script on my PC and it capitalizes all dirs, subdirs and files by the rule you mentioned. I think this is the behaviour you want. Sorry for not providing a Perl version.
Most systems have the rename command ....
NAME
rename - renames multiple files
SYNOPSIS
rename [ -v ] [ -n ] [ -f ] perlexpr [ files ]
DESCRIPTION
"rename" renames the filenames supplied according to the rule specified as the first argument. The perlexpr argument is a Perl expression which
is expected to modify the $_ string in Perl for at least some of the filenames specified. If a given filename is not modified by the expression,
it will not be renamed. If no filenames are given on the command line, filenames will be read via standard input.
For example, to rename all files matching "*.bak" to strip the extension, you might say
rename 's/\.bak$//' *.bak
To translate uppercase names to lower, you’d use
rename 'y/A-Z/a-z/' *
OPTIONS
-v, --verbose
Verbose: print names of files successfully renamed.
-n, --no-act
No Action: show what files would have been renamed.
-f, --force
Force: overwrite existing files.
AUTHOR
Larry Wall
DIAGNOSTICS
If you give an invalid Perl expression you’ll get a syntax error.
Since Perl runs just fine on multiple platforms, let me warn you that FAT (and FAT32, etc) filesystems will ignore renames that only change the case of the file name. This is true under Windows and Linux and is probably true for other platforms that support the FAT filesystem.
Thus, in addition to Geo's answer, note that you may have to actually change the file name (by adding a character to the end, for example) and then change it back to the name you want with the correct case.
If you will only rename files on NTFS filesystems or only on ext2/3/4 filesystems (or other UNIX/Linux filesystems) then you probably don't need to worry about this. I don't know how the Mac OSX filesystem works, but since it is based on BSDs, I assume it will allow you to rename files by only changing the case of the name.
I'd just use the find command to recurse through the subdirectories and mv to do the renaming, but still leverage Perl to get the renaming right.
find /foo/foo/documents -type f \
-execdir bash -c 'mv "$0" \
"$(echo "$0" \
| perl -pe "s/\b([[:lower:]])/\u\$1/g; \
s/\.(\w+)$/.\l\$1/;")"' \
{} \;
Cryptic, but it works.
Another one:
find . -type f -exec perl -e'
    map {
        ( $p, $n, $s ) = m|(.*/)([^/]*)(\.[^.]*)$|;
        $n =~ s/(\w+)/ucfirst($1)/ge;
        rename $_, $p . $n . $s;
    } @ARGV
' {} +
Keep in mind that on case-insensitive but case-preserving filesystems (FAT/NTFS), you'll need to rename the file to something else first, then rename it to the case-changed name. A direct rename from "etc.etc" to "Etc.Etc" will fail or be ignored, so you'll need to do two renames: "etc.etc" to "etc.etc~" and then "etc.etc~" to "Etc.Etc", for example.
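A sketch of that two-step rename in Perl, using a hypothetical helper; the ~ suffix is arbitrary, and any temporary name that differs by more than case will do:
sub rename_case_safely {
    my ($old, $new) = @_;
    return if $old eq $new;
    my $tmp = "$old~";    # temporary name differing by more than just case
    rename $old, $tmp or die "Cannot rename $old to $tmp: $!";
    rename $tmp, $new or die "Cannot rename $tmp to $new: $!";
}

rename_case_safely('etc.etc', 'Etc.Etc');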