How can I detect that a symlink is broken in Perl? - perl

I would like to remove a broken symlink in a directory using Perl.
In my mind I just had to list the file of a directory and test is this a symlink (-l) and if it returns false just unlink it.
But it appears that when using readir to list all files my broken symlinks are not recoganized as a file. Because my link is pointing to nothing I understand why.
All the file in $myDir are symlinks, either valid or broken.
When I display #files I only get a list of valid symlink.
opendir DIR, $myDir;
my #files = grep(/$regexp/,readdir(DIR));
closedir DIR;
print "filenames : #files\n";

There are two main relevant system calls, stat() and lstat(). The lstat() call will tell you that it is a symlink (but on other files, behaves the same as stat()). This allows you to determine that the name is a symlink. The stat() system call follows a symlink to its end, and tells you about the file (or directory) at the end of the link. If the stat() call fails on the symlink, then the symlink is broken or you're trying to access a directory or file where you have no permission.
The Perl file test operators include -l to detect whether a name is a symlink. You can use the Perl functions stat and lstat explicitly. Between these, you should be able to sort out whether a symlink is broken or not - but you should probably plan to write a function to do the job.
You probably don't need to use the readlink Perl function. Beware the underlying system readlink() call; it does not return a null-terminated string!
It is interesting that neither Perl nor its POSIX module supports the realpath() function. However, the PathTools module does support it. If realpath fails, on a symlink, the symlink is non-functional (aka broken).

Here's some code I've used to remove broken links:
chdir $dir or die;
opendir(DIR, '.') or die;
foreach my $link (readdir DIR) {
next unless -l $link and not -e readlink($link);
print "Removing broken link $link\n";
unlink $link;
}
closedir DIR;
Note that it's important that the directory containing the links is the current directory. readdir returns only filenames, and the links might be relative.

Combining lstat with stat:
say "dangling link at $fn" if (lstat $fn and not stat $fn);
update: it works for me...
salva#topo:~/t/dl$ perl -E 'opendir $dh, "."; say $_ for grep { !stat $_ and lstat $_ } readdir $dh'
foo
salva#topo:~/t/dl$ ls -l
total 0
-rw-rw-r-- 1 salva salva 0 2011-07-05 12:34 f
lrwxrwxrwx 1 salva salva 11 2011-07-05 12:00 fii -> /etc/shadow
lrwxrwxrwx 1 salva salva 12 2011-07-05 11:59 foo -> /etc/hjdkshf

Check for broken symlinks (checking only the top level if there are symlinks to symlinks):
use strict;
use warnings;
use autodie;
opendir my $dirh, '.';
while (my $file = readdir $dirh) {
if ( -l $file ) {
my $target = readlink $file;
if ( ! -e $target && ! -l $target ) {
print "$file -> $target broken\n";
}
}
}

Use readlink() and stat() the result.

Using the built-in Perl glob function ?
For examples:
#files = <*>;
foreach $file (#files) {
print $file . "\n";
}
For a specific $dir:
#files = <$dir*>;
foreach $file (#files) {
print $file . "\n";
}

A broken symlink is a link (-l) that does not exists (!-e)
perl -e 'print "broken: $_\n" for grep { -l and ! -e } glob("*");'

Related

Perl directory is getting past if(! -d) statement? [readdir results]

Okay so I have a program that basically looks into a passed in directory, if any file names match a pattern I will make a directory and move that specific file and any that matches it (regardless of extension) into that directory. Now if they don't match I should move them into the PassedInDir/misc/ directory.
I have a condition in both cases to avoid passing in any directory (as my program isn't ready to deal with those yet) something like if( ! -d $fp).
Everything works fine when I run it the first time in the directory. However when I run it again on the same directory (which should now only contain directories) I get the Error Could not move file assignmentZ to destination DataB/misc at projectSorter.pl line 16.. AssignmentZ is a directory however its somehow getting past the (!-d) in the second case.
#!/usr/bin/perl -w
use File::Copy;
if(#ARGV < 1){
print "\nUsage: proj6.pl <directory>\n\n";
exit;
}
die("\nDirectory $ARGV[0] does not exist\n\n") if( ! -e $ARGV[0]);
opendir( DIR, $ARGV[0]) or die("\nCould not open directory $ARGV[0]\n\n");
while(($fp = readdir(DIR))){
if($fp =~ m/proj(.*)\./){
(! -d "$ARGV[0]/assignment$1") && (mkdir "$ARGV[0]/assignment$1");
move("$ARGV[0]/$fp" , "$ARGV[0]/assignment$1") or die("Could not move file $fp to destination $ARGV[0]/assignment$1");
}
elsif(! -d $fp){ #gets past here!!!
(! -d "$ARGV[0]/misc") && (mkdir "$ARGV[0]/misc");
move("$ARGV[0]/$fp" , "$ARGV[0]/misc") or die("Could not move file $fp to destination $ARGV[0]/misc");
}
}
It is the only directory to do it out of the ones previously made by running my program once. I am curious about why this is happening.
$fp as set by readdir is relative to scanned directory. chdir to the scanned directory or prepend the scanned directory name for -d test.
You use "$ARGV[0]/$fp" as argument to move function.
perldoc -f readdir
readdir DIRHANDLE
Returns the next directory entry for a directory opened by
"opendir". […]
If you're planning to filetest the return values out of a
"readdir", you'd better prepend the directory in question.
Otherwise, because we didn't "chdir" there, it would have been
testing the wrong file.
Some suggestions.
‣ Don't use the -w flag with Perl. Some modules turn warnings off to do their work but the -w flag is global. With it, they will report warnings that should be ignored.
‣ Always have these two lines at the top of every script.
use strict;
use warnings;
These will catch a lot of errors in your code. See perldoc strict and perldoc warnings for more details.
‣ Use glob() or Find::Find instead of opendir/readdir/closedir.
‣ Use make_path() from File::Path instead of mkdir.
‣ Use an if statement for conditional execution instead of &&.
‣ Place blank lines in your code to make reading it easier.
File::Find and File::path are standard modules that come installed with Perl. For a list of the standard modules, see perldoc perlmodlib.
#!/usr/bin/perl
# --------------------------------------
# pragmas
use strict;
use warnings;
# --------------------------------------
# modules
use File::Copy;
use File::Path qw( make_path );
# --------------------------------------
# main
# make sure there is something to work on
if(#ARGV < 1){
print "\nUsage: proj6.pl <directory>\n\n";
exit;
}
# arguments should be directories
for my $src_dir ( #ARGV ){
# validate the source directory
die("\n$src_dir does not exist\n\n") if( ! -e $src_dir);
die("\n$src_dir is not a directory\n\n") if( ! -d $src_dir);
# move proj* files
for my $proj ( glob( "$src_dir/proj*" )){
# get the proj number
( my $number ) = $proj =~ m/proj(.*)\./;
# get the destination directory
my $dst_dir = "$src_dir/assignment$number";
# create the directory where it goes
if( ! -d $dst_dir ){
make_path( $dst_dir ) or die "could not make path $dst_dir";
}
# move the file
move( $proj, $dst_dir ) or die( "could not move file $proj to destination $dst_dir" );
} # end of $proj files
# move other files
for my $file ( grep { ! -d } glob( "$src_dir/*" )){
# get the destination directory
my $dst_dir = "$src_dir/misc";
# create the directory where it goes
if( ! -d $dst_dir ){
make_path( $dst_dir ) or die "could not make path $dst_dir";
}
# move the file
move( $file, $dst_dir ) or die( "could not move file $file to destination $dst_dir" );
} # end other files
} # end of src_dir

Perl, how to choose a directory

I'm trying to determine which of the content of a folder is a directory and which is a file, I wrote the following but the result is not what I would expect:
opendir DH, $dir or die "Cannot open Dir: $!";
my #dirs = grep !/^\.\.?$/, readdir DH ;
foreach my $files (#dirs) {
print $files."<br>";
if ( -d $files )
{
print $files." is a directory<br>";
}
}
closedir DH;
The result is something as the example below:
.file1
file.log
file3.zip
file4
file5.zip
dir1.name1.suffix1.yyyy.MM.dd.hh.mm.ss
file5.zip
file6.tar
dir2
dir3.name1.suffix1.yyyy.MM.dd.hh.mm.ss
where the item starting with dir are actual directory, so my question is why the if is failing discover them as such?
What am I doing wrong?
$diris missing...
if ( -d "$dir/$files" )
{
print $files." is a directory<br>";
}
It's easiest to chdir to $dir so that you don't have to prefix the node names with the path. You can also use autodie if you are running Perl v5.10.1 or better. Finally, if you use $_ as your loop control variable (the file/directory names) you can omit it from the parameters of print, -d and regex matches
Like this
use strict;
use warnings;
use v5.10.1;
use autodie;
my ($dir) = #ARGV;
opendir my $dh, $dir;
chdir $dh;
while ( readdir $dh ) {
next if /\A\.\.?\z/;
print;
print " is a directory" if -d;
print "<br/>\n";
}
... # local expires. working directory returns to its original value
Update
In view of ikegami's (deleted) comment about returning back to the original working directory, here's an example of using the File::chdir module to do this tidily. It exports a tied variable $CWD which will change your working directory if you assign to it. You can also localise it, so just wrapping the above code in braces and adding a new local value for $CWD keeps things neat. Note that File::chdir is not a core module so you will likely need to install it
Note however that there is still a very small possibility that the process may be started with a present working directory that it cannot chdir to. This module will not solve that problem
use strict;
use warnings;
use v5.10.1;
use autodie;
use File::chdir;
my ($dir) = #ARGV;
{
opendir my $dh, $dir;
local $CWD = $dir;
while ( readdir $dh ) {
next if /\A\.\.?\z/;
print;
print " is a directory" if -d;
print "<br/>\n";
}
}

Perl Directory Walking issue - Can't go back up more than 1 directory properly

First off, I don't have the ability to use File::Find.
So I have my script to walk through directories and find a certain type of file. But if I go more than one sub-directory deep, my script doesn't properly exit all the way back up to the starting directory. I think I need to have a $previousDir variable that keeps track of the last directory so I can say to go back out to that one when I'm done in the sub-directory. But I've tried putting it in multiple places without success...
File Structure (BOLD is a Dir, Italic is a file):
startingdirectory/Logs - AAA, Dir1, zzz, adstatlog.299, adstatlog.tgz, file
/AAA - filefile
/Dir1 - /Dir2, config.tar.gz
/Dir2 - EMPTY
/zzz - withinzzz
Here is my current script:
# specify the directory where you want to start the search
my $startingDir = $ARGV[0];
my $directoryCount = 0;
my $directory = shift;
my $previousDir;
my #directories;
my $tarOutput;
# Calling the Subroutine, which searches the Directory
readDirectory($startingDir);
sub readDirectory
{
# Open and close the startingDir
opendir(DIR, #_[0]) or die("ERROR: Couldn't open specified directory $!");
my #files = grep { $_ !~ /^\.{1,2}$/ } readdir DIR;
closedir DIR;
print "------------------------------------------------------------------------\n\n";
foreach my $currentFile (#files)
{
print "Current File: ", $currentFile, "\n\n";
#Directory currently searching through
print "Searching in $directory\n\n";
my $fullPath = "$directory/$currentFile";
print "FULL PATH: $fullPath\n\n";
if ( -d $fullPath )
{
print "Found New Directory: ", $currentFile, "\n\n";
push (#directories, $currentFile);
$directoryCount++;
print "Current number = $directoryCount\n\n";
print "Directories: #directories \n\n";
$previousDir = $directory;
$directory = $fullPath;
# The Subroutine is calling hisself with the new parameters
readDirectory($directory);
}
elsif ( $currentFile =~ /\.tar.gz$/i || $currentFile =~ /\.tar$/i || $currentFile =~ /\.tgz$/i)
{
print "File: ", $currentFile, "\n\n";
my $tarOutput = `tar -tvzf $currentFile`;
print $tarOutput, "\n";
$previousDir = $directory;
}
print "PREVIOUSDIR: $previousDir\n\n";
print "-----------------------------------------------------------------------\n\n";
$directory = $previousDir;
}
}
And the output: (scroll down to see where issue begins)
------------------------------------------------------------------------
Current File: AAA
Searching in /home/gackerma/Logs
FULL PATH: /home/gackerma/Logs/AAA
Found New Directory: AAA
Current number = 1
Directories: AAA
------------------------------------------------------------------------
Current File: filefile
Searching in /home/gackerma/Logs/AAA
FULL PATH: /home/gackerma/Logs/AAA/filefile
PREVIOUSDIR: /home/gackerma/Logs
------------------------------------------------------------------
PREVIOUSDIR: /home/gackerma/Logs
------------------------------------------------------------------
Current File: Dir1
Searching in /home/gackerma/Logs
FULL PATH: /home/gackerma/Logs/Dir1
Found New Directory: Dir1
Current number = 2
Directories: AAA Dir1
------------------------------------------------------------------------
Current File: DIR2
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/DIR2
Found New Directory: DIR2
Current number = 3
Directories: AAA Dir1 DIR2
------------------------------------------------------------------------
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
Current File: configs.tar.gz
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/configs.tar.gz
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
PREVIOUSDIR: /home/gackerma/Logs/Dir1 ***THIS IS WHERE THE ISSUE STARTS -
PREVIOUSDIR SHOULD BE /Logs!!***
------------------------------------------------------------------
Current File: file
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/file
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
Current File: adstatlog.299
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/adstatlog.299
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
Current File: zzz
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/zzz
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
Current File: adstatlog.tgz
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/adstatlog.tgz
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
I would really use File::Find if you can.
Here's a working, simplified version of your recursive try:
use warnings;
use strict;
die "Usage: $0 (abs path to dir) " if #ARGV != 1;
my $dir = shift #ARGV;
file_find($dir);
sub file_find {
my $dir = shift;
opendir my $dh, $dir or warn "$dir: $!";
my #files = grep { $_ !~ /^\.{1,2}$/ } readdir $dh;
closedir $dh;
for my $file ( #files ) {
my $path = "$dir/$file";
if( $path =~ /(\.tar\.gz|\.tar|\.tgz)$/ ) {
print "do tar for $path\n";
}
file_find($path) if -d $path;
}
}
The File::Find module has been a standard Unix module since Perl 5.000. In fact, it's been a standard module since Perl 3.x, maybe even before. In fact, I have Perl 5.12 installed on my Mac, and I still see the old find.pl file sitting in one of the #INC directories.
Back before Perl 5 (or maybe even before Perl 4), you'd do this:
require "find.pl";
instead of
use File::Find;
TO get the find command on your system (find.pl is there for backwards compatibility). This is why I find it so hard to believe you don't have File::Find on your system. It'd be like saying you don't have the dir command on your Windows PC.
Run the command perl -V. That's a capital V. This will print out the #INC directory list. See if you can find a File directory in only of those directories listed in that list. Under that directory should be a Find.pm Perl module.
Here's what it looks like on my PC running Strawberry Perl:
#INC:
C:/perl/perl/site/lib
C:/perl/perl/vendor/lib
C:/perl/perl/lib
.
On my Mac, 10 directories are listed in that #INC list.
Also see which version of Perl you have on your system. And, make sure the directories listed in #INC are readable by you.
There is something definitely wrong with your Perl installation if you don't have File::Find on your system. I'd be more worried about that than File::Find itself.
One more thing, see if you have perldoc command installed. If you do, type:
$ perldoc File::Find
and see if that gives you any documentation on File::Find. If it does, it means that File::Find is on your system. Then run:
$ perldoc -l File::Find
which will give you the location of the File::Find module.
Before doing anything else, verify that File::Find really, really doesn't exist on your system, or that you don't have read access to it before doing anything else. As I said before, if this module doesn't exist on your system, you have major problems with your Perl installation, and I'd be worried whether it can be trusted. This needs to be resolved.
If everything is okay, then we need to see your program to figure out why you can't use File::Find. It might be something minor, like using quotes around the program's name.
There are a number of problems with your program. The main error is that you are using too many global variables and trying to manually keep them in synch with the directory you are currently processing.
Here is a list
Always use strict and use warnings for every program you write
warnings would have told you that you should write opendir(DIR, $_[0]) instead of opendir(DIR, #_[0])
You are setting $directory to $previousDir after every entry in a directory. But $previousDir is being set only when the current entry is another directory, so after ordinary files the value is restored even though it hasn't been saved.
You are getting confused about whether you should be reading the directory specified by global variable $directory or by the parameter passed to the subroutine.
By far the easiest way to do this is to use only the subroutine parameter to specify the current directory and forget about the global variable. Here is a program that does what yours is intended to
use strict;
use warnings;
process_dir($ARGV[0]);
sub process_dir {
my ($dir) = #_;
opendir my $dh, $dir or die $!;
my #entries = grep { not /^\.\.?$/ } readdir $dh;
closedir $dh;
for my $entry (#entries) {
my $fullname = "$dir/$entry";
if (-d $fullname) {
process_dir($fullname);
}
elsif ($entry=~ /(?:\.tar|\.tgz|\.tar\.gz)$/i)
print "File: ", $fullname, "\n\n";
print `tar -tvzf $fullname`;
}
}
}

ls -ltr dirname* in perl

I need to find for a certain dirname, i have a code which greps for the latest-file . Could someone help me to find the ls -ltr dirname* without using the ls -ltr command in the perl script. Below code may help:
my $dir = "/abc/pqr/xyz";
opendir(my $DH, $dir) or die "Error opening $dir: $!";
my %files = map { $_ => (stat("$dir/$_"))[9] } grep(! /^\.\.?$/, readdir($DH));
closedir($DH);
my #sorted_files = sort { $files{$b} <=> $files{$a} } (keys %files);
print "the file is $sorted_files[0] \n";
I need a to find a dir name as new_123 in /abc/pqr/xyz. As ls -ltr new*, as these directories are created everyday so looking for new*.
Its a bit unclear what you are asking for, but:
I need a to find a dir name as new_123 in /abc/pqr/xyz. As ls -ltr
new*, as these directories are created everyday so looking for new*.
For that, you can use glob.
My PERL is a bit rusty but I think this would do it:
while (my $dir = glob("new*")) {
next unless (-d "$dir");
print "$dir\n";
}
If system calls are not a restriction, you can use find:
find /some/path/ -type d -name "new*"

How do I check for a sub-subdirectory in Perl?

I have a folder called Client which contains many subfolders. I want to create a Perl script to look at each of those subfolders and check for a folder there. If it is there, I want to skip it and move on, if it is not there, I want to create it and do some processing.
How do I go about looping through all of the subfolders and checking for the directory I want? I have found a lot of information on how to get all the files in a folder and/or subfolders, but nothing on checking for a directory within each subfolder.
Augh! Too much complexity in the other answers. The original question doesn't appear to be asking for a recursive traversal. As far as I can see, this is a perfectly sensible solution, and vastly more readable to boot:
foreach my $dir (glob "Client/*") {
next if ! -d $dir; # skip if it's not a directory
next if -d "$dir/subfolder"; # skip if subfolder already exists
mkdir "$dir/subfolder" or die; # create it
do_some_processing(); # do some processing
}
Seriously folks: opendir/readdir? Really?
It's pretty easy once you break it into steps. Get a list of the subdirectories with glob then see which ones don't have the second-level directory. If you are using a File::Find-like module, you are probably doing too much work:
#!perl
use strict;
use warnings;
use File::Spec::Functions;
my $start = 'Clients';
my $subdir = 'already_there';
# #queue is the list of directories you need to process
my #queue = grep { ! -d catfile( $_, $subdir ) } # filter for the second level
grep { -d } # filter for directories
glob catfile( $start, '*' ); # everything below $start
#!/usr/bin/perl
use strict;
use Fcntl qw( :DEFAULT :flock :seek );
use File::Spec;
use IO::Handle;
my $startdir = shift #ARGV || '.';
die "$startdir is not a directory\n"
unless -d $startdir;
my $verify_dir_name = 'MyDir';
my $dh = new IO::Handle;
opendir $dh, $startdir or
die "Cannot open $startdir: $!\n";
while(defined(my $cont = readdir($dh))) {
next
if $cont eq '.' || $cont eq '..';
my $fullpath = File::Spec->catfile($dir, $cont);
next
unless -d $fullpath && -r $fullpath && -w $fullpath;
my $verify_path = File::Spec->catfile($fullpath, $verify_dir_name);
next
if -d $verify_path;
mkdir($verify_path, 0755);
# do whatever other operations you want to $verify_path
}
closedir($dh);
The short answer is use File::FInd.
The long answer is first write a subroutine that validates the existence of the folder and if the folder is not there, create it and then do the processing needed. Then invoke the find method of the File::Find module with a reference to the subroutine and the starting folder to process all the subfolders.