Find symlink using Find in Perl - perl

I have run into an issue finding a file behind a symlink inside a directory using the File::Find::Rule library: the symlinked copy is not found, and I am very new to Perl.
use strict;
use warnings;
use File::Find::Rule;
use File::Find;
use File::Copy;

my @files = File::Find::Rule->file()
                            ->name( 'table.txt' )
                            ->in( '/a/b/c' );

for my $file (@files)
{
    print "\nfile: $file\n";
}
There is supposed to be another directory, /a/b/d, that contains the specified file table.txt, but that directory is a symlink, and it does not show up in the printed output.

File::Find::Rule implements the -X filetests as methods. The one that tests whether an entry is a symlink (-l) is called symlink.
In my reading of the question you don't know the name of that directory (otherwise, why "find" a file in it?), except that it is a symbolic link. Then you need to first find directories which are symlinks, then find files in them. For this second task you'll need to tell the module to follow the links.
I use a test structure on disk
test_find/a/c/c.txt
test_find/a/b/b.txt
test_find/is_link -> a/c ("is_link" is a symbolic link to "a/c")
and from the directory right above it I run the program
use warnings;
use strict;
use feature 'say';
use File::Find::Rule;

my $top_dir = shift @ARGV || './test_find';

my @symlink_dirs = File::Find::Rule->directory->symlink->in($top_dir);

foreach my $dir (@symlink_dirs) {
    my @files = File::Find::Rule->file->extras({follow => 1})->in($dir);
    say "in $dir: @files";
}
which prints
in test_find/is_link: test_find/is_link/c.txt
Note that the program will find all files in all directories which are symlinks. Writing an explicit loop as above allows you to add code to decide which of the symlink-directories to actually search. If you don't mind looking through all of them you can just do
my @files = File::Find::Rule->file->extras({follow => 1})->in(@symlink_dirs);
See documentation for features to limit the search using what you know about the directory/file.
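For instance, to restrict that second search to the file name from the question (table.txt here is carried over from the question as an assumption), the name method can be combined with the same extras call:
my @tables = File::Find::Rule->file
                             ->name('table.txt')
                             ->extras({ follow => 1 })
                             ->in(@symlink_dirs);
print "$_\n" for @tables;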
If the link directory is in a hierarchy not including the target, then simply search that hierarchy.
With test structure
other_hier/is_link -> ../test_find/a/c
test_find/a/c/c.txt
you only need to tell it to follow the links
my @all_files = File::Find::Rule   # may use linebreaks and spaces
    -> file
    -> extras({ follow => 1 })
    -> in('other_hier');
When this is added to the program above and its result printed, it also finds
other_hier/is_link/c.txt
You can of course replace the literal 'other_hier' with $top_dir and invoke the program with argument other_hier (and make that other_hier directory and the link in it).
If both link and target are in the same hierarchy that is searched then you can't do this; the search would run into circular links (the module detects that and exits with error).

Related

Perl: How to use File::Find::Rule to list all symbolic links from a given path

ls /foo/bar/
lrwxr-xr-x a1 -> ../../../a1
lrwxr-xr-x a2 -> ../../../a2
lrwxr-xr-x a3 -> ../../../a3
This is a curtailed output of ls.
My goal:
1. Go to /foo/bar/ and find the latest version of a (which is a symbolic link). So in this case, a3. Copy the contents of a3 to a temp location
I am trying to use File::Find::Rule but I am unable to figure out how to use it to list all the symbolic links. Reading through various Google sites, I see people explaining how to follow the symbolic links but not to list them.
What I have figured out so far:
my $filePath = "/foo/bar";
my @files = File::Find::Rule->file->in($filePath);
This returns an empty array because there are no files, only symbolic links, in /foo/bar.
I also tried
my @files = File::Find::Rule->in($makeFilePath)->extras({follow =>1});
but I feel that asks it to follow the symbolic links rather than list them.
Use the symlink method, one of the -X filetest synonyms provided by File::Find::Rule:
use warnings 'all';
use strict;
use File::Find::Rule;

my $rule = File::Find::Rule->new;
my @links = $rule->symlink->in('.');

print "@links\n";
This finds all entries in the current directory that satisfy the -l file test. Also see -X.
With the list of links in hand, you can use the -M file test or stat (or its File::stat by-name interface) to sort it by the timestamps of the target files. For example
use List::Util 'max';
my %ts_name = map { (stat)[9] => $_ } @links;
my $latest = $ts_name{ max (keys %ts_name) };
There are other ways to sort/filter/etc a list. If you use -M then you need min. If you wanted the timestamps for the link itself for some reason, use the lstat instead. The module also provides an mtime method for work with timestamps, but it is meant for search and not suitable for sorting.
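For example, a rough equivalent of the snippet above using -M and min might look like this (the variable names are only illustrative):
use List::Util 'min';

# -M follows the symlink, so this measures the age (in days) of the *target*;
# the smallest age belongs to the most recently modified target
my %age_name = map { -M $_ => $_ } @links;
my $newest   = $age_name{ min keys %age_name };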
Note that you don't have to actually create an object first, but can directly do
use File::Find::Rule;
my @links = File::Find::Rule->symlink->in('.');
To copy/move things use core File::Copy, while for temporary files core File::Temp is useful.
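Putting these together, a minimal sketch of the "copy to a temp location" step might look like the following, assuming $latest from above names a plain file (the destination file name is made up):
use File::Copy qw(copy);
use File::Temp qw(tempdir);

# copy the newest link's target into a freshly created temporary directory;
# CLEANUP => 1 removes the directory when the program exits
my $tmpdir = tempdir( CLEANUP => 1 );
copy($latest, "$tmpdir/latest_a.dat")
    or die "copy failed: $!";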

Preferred way of getting absolute path of path relative to current directory

I am trying to expand a path relative to the current directory:
use feature qw(say);
use strict;
use warnings;
use Cwd;
use File::Spec;
my $fn = 'test/my_file';
say File::Spec->rel2abs( $fn );
say Cwd::abs_path( $fn );
Here, Cwd::abs_path() fails if directory test does not exist.
Why does File::Spec->rel2abs() work fine, while Cwd::abs_path() fails?
The documentation of the two modules, Cwd and File::Spec, gives little clue as to why this happens. According to the following bug reports (the first from 2004), it could be due to the expansion of symbolic links: "Cwd::abs_path returns undef for non-existent paths":
bugs.debian.org
rt.perl.org
File::Spec and the friendlier Path::Class do not touch the file system, so they can be used for paths that you will create. Cwd::abs_path does, so it can be used to return valid paths. Use whichever is most appropriate.
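A small demonstration of the difference, using a made-up path that does not exist on disk:
use strict;
use warnings;
use File::Spec;
use Cwd;

my $fn = 'no/such/dir/file.txt';              # does not exist on disk
print File::Spec->rel2abs($fn), "\n";         # pure string manipulation, prints a path
print Cwd::abs_path($fn) // 'undef', "\n";    # consults the filesystem, prints undef here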

simple way to test if a given filename is underneath a directory?

Is there a Perl module (preferably core) that has a function that will tell me if a given filename is inside a directory (or a subdirectory of the directory, recursively)?
For example:
my $f = "/foo/bar/baz";
# prints 1
print is_inside_of($f, "/foo");
# prints 0
print is_inside_of($f, "/foo/asdf");
I could write my own, but there are some complicating factors such as symlinks, relative paths, whether it's OK to examine the filesystem or not, etc. I'd rather not reinvent the wheel.
Path::Tiny is not in core, but it has no non-core dependencies, so is a very quick and easy installation.
use Path::Tiny qw(path);
path("/usr/bin")->subsumes("/usr/bin/perl"); # true
Now, it does this entirely by looking at the file paths (after canonicalizing them), so it may or may not be adequate depending on what sort of behaviour you're expecting in edge cases like symlinks. But for most purposes it should be sufficient. (If you want to take into account hard links, the only way is to search through the entire directory structure and compare inode numbers.)
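If symlinks do matter for your case, one option (an assumption on my part, not something the subsumes documentation promises) is to resolve both paths with realpath first and only then do the textual check; note that realpath requires the paths to actually exist:
use Path::Tiny qw(path);

# resolve symlinks first, then do the purely textual containment check
my $real_dir  = path("/foo")->realpath;
my $real_file = path("/foo/bar/baz")->realpath;   # dies if the path does not exist
print $real_dir->subsumes($real_file) ? 1 : 0;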
If you want the function to work for only a filename (without a path) and a path, you can use File::Find:
#!/usr/bin/perl
use warnings;
use strict;
use File::Find;
sub is_inside_of {
    my ($file, $path) = @_;
    my $found;
    find( sub { $found = 1 if $_ eq $file }, $path );
    return $found;
}
If you don't want to check the filesystem, but only process the path, see File::Spec for some functions that can help you. If you want to process symlinks, though, you can't avoid touching the file system.
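As a rough sketch of that path-only approach with File::Spec (no symlink handling, and the helper is just for illustration):
use strict;
use warnings;
use File::Spec;

# purely textual containment check: absolutize, canonicalize, compare components
sub is_inside_of {
    my ($file, $dir) = @_;
    my @f = File::Spec->splitdir( File::Spec->canonpath( File::Spec->rel2abs($file) ) );
    my @d = File::Spec->splitdir( File::Spec->canonpath( File::Spec->rel2abs($dir) ) );
    return 0 if @d > @f;
    for my $i (0 .. $#d) {
        return 0 unless $f[$i] eq $d[$i];
    }
    return 1;
}

print is_inside_of("/foo/bar/baz", "/foo"), "\n";      # 1
print is_inside_of("/foo/bar/baz", "/foo/asdf"), "\n"; # 0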

Perl Subdirectory Traversal

I am writing a script that goes through our large directory of Perl Scripts and checks for certain things. Right now, it takes two kinds of input: the main directory, or a single file. If a single file is provided, it runs the main function on that file. If the main directory is provided, it runs the main function on every single .pm and .pl inside that directory (due to the recursive nature of the directory traversal).
How can I write it (or what package may be helpful)- so that I can also enter one of the seven SUBdirectories, and it will traverse ONLY that subdirectory (instead of the entire thing)?
I can't really see the difference in processing between the two directory arguments. Surely, using File::Find will just do the right thing in both instances.
Something like this...
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
my $input = shift;
if (-f $input) {
    # Handle a single file
    handle_a_file($input);
} else {
    # Handle a directory
    handle_a_directory($input);
}

sub handle_a_file {
    my $file = shift;
    # Do whatever you need to with a single file
}

sub handle_a_directory {
    my $dir = shift;
    find(\&do_this, $dir);
}

sub do_this {
    return unless -f;
    return unless /\.p[ml]$/;
    handle_a_file($File::Find::name);
}
One convenient way would be to use the excellent Path::Class module, more precisely the traverse() method of Path::Class::Dir. You'd control what to process from within the callback function which is supplied as the first argument to traverse(). The manpage has sample snippets.
Using the built-ins like opendir is perfectly fine, of course.
I've just turned to using Path::Class almost everywhere, though, as it has so many nice convenience methods and simply feels right. Be sure to read the docs for Path::Class::File to know what's available. Really does the job 99% of the time.
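A small sketch of the traverse() callback style, assuming a ./scripts directory as the starting point and the same .pm/.pl filter as above:
use strict;
use warnings;
use Path::Class;

dir('scripts')->traverse(sub {
    my ($child, $cont) = @_;
    if (!$child->is_dir && $child->basename =~ /\.p[ml]$/) {
        print "found: $child\n";    # call handle_a_file($child) in the real script
    }
    return $cont->();               # descend into subdirectories
});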
If you know exactly which directory and subdirectories you want to look at, you can use glob("$dir/*/*/*.txt"), for example, to get every .txt file at the third level under the given $dir.

How can I generate random unique temp file names?

I am trying to create a temp file using the following code:
use File::Temp;

my $tmp = File::Temp->new( TEMPLATE => 'tempXXXXX',
                           DIR      => 'mydir',
                           SUFFIX   => '.dat' );
This creates the temp file. Because of a permission issue, the other program is not able to write into the file.
So I just want to generate the file name without creating the file. Is there any way to do that?
If you don't create the file at the same time you create the name, then it is possible for a file with the same name to be created before you create the file manually. If you need to have a different process open the file, simply close it first:
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp;
sub get_temp_filename {
    my $fh = File::Temp->new(
        TEMPLATE => 'tempXXXXX',
        DIR      => 'mydir',
        SUFFIX   => '.dat',
    );
    return $fh->filename;
}

my $filename = get_temp_filename();

open my $fh, ">", $filename
    or die "could not open $filename: $!";
The best way to handle the permissions problem is to make sure the users that run the two programs are both in the same group. You can then use chmod to change the permissions inside the first program to allow the second program (or any user in that group) to modify the file:
my $filename = get_temp_filename();
chmod 0660, $filename;
Just to obtain the name of the tempfile you can do:
#!/usr/bin/perl
use strict;
use warnings;
use 5.10.1;
use File::Temp qw/tempfile/;
my $file;
(undef, $file) = tempfile('tmpXXXXXX', OPEN=>0);
say $file;
But as Chas. Owens said, be careful the same name could be created before you use it.
The get_temp_filename function proposed by Chas. Owens uses a local filehandle object ($fh) which is destroyed upon function return, and that destroys the created temp file as well.
To avoid this, and therefore keep the file (less risk), add:
UNLINK => 0
to the new() method's arguments, which prevents the file from being unlinked when the object is destroyed.
Actually, I agree with Chas. Owens: the design is fatally flawed.
It really feels like you need to fix the design, so:
If you have control of the 2nd program, have that program create the filename and the file, and pass the filename to the 1st program.
But, if the 2nd program isn't something you wrote and so you cannot modify it then I'd recommend one of the following:
1 - Use the first process's PID as part of the file name in an attempt to minimize the risk of duplicate filenames (a small sketch follows this list).
2 - Have the 2nd program pipe its output to the 1st program and don't bother with a file at all. Personally, I think this is a much better solution than 1.
3 - Wrap the 2nd program in a script (shell, perl, whatever) which creates the name and the file and passes that to both programs.
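For option 1, a bare-bones sketch (the directory and suffix are made up, and this still does not rule out collisions entirely; option 2 avoids the file altogether):
# build a name from the first program's own PID and the current time
my $name = sprintf "mydir/temp_%d_%d.dat", $$, time;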