Perl recursively copy files of specific format [duplicate] - perl

This question already has answers here:
How can I copy a directory recursively and filter filenames in Perl?
(5 answers)
Closed 8 years ago.
Note: The related question is solved using a deprecated module, and its answer is not consistent across operating systems. The answer to the current question uses newer modules and hence is being posted here.
I have a module that in turn uses the File::NCopy CPAN module to copy files recursively from a source directory to a destination directory.
The problem is - I need to recursively copy only the files of specific file type to the destination. Is there any way to filter the source by extension?
As an alternative - is there a way to copy all files except the hidden files? My main problem being the .git folder also gets copied - which is not desired.
Platform: MacOS
Alternatives explored:
1) File::Copy::Recursive module :- seems to provide only recursive copying of files or directories; it does not seem to help with either skipping hidden files or applying an exclude filter.
2) Using rsync -avz --exclude=".*" :- was unable to combine this with the recursive copy functionality.
3) Homegrown solution similar to How can I copy a directory recursively and filter filenames in Perl? :- Might be the last resort - but does not seem portable unless tweaked and tested across different platforms. Will be falling back to this unless a module already exists.
4) https://metacpan.org/pod/Path::Class::Dir :- Seems plausible - will be running a quick implementation using this.
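For the record, alternative (2) may be workable after all: rsync's -a flag already implies recursion, so the exclude filter combines with a recursive copy directly. A minimal sketch calling rsync from Perl (the paths are placeholders):

```perl
my ($sourceDir, $destinationDir) = ('/path/to/src', '/path/to/dest');

# -a implies recursive copy; --exclude='.*' skips hidden entries such as .git
# (the trailing slash on the source copies its contents, not the directory itself)
system('rsync', '-az', '--exclude=.*', "$sourceDir/", $destinationDir) == 0
    or die "rsync failed: $?";
```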
Implemented Solution:
I used fcopy from File::Copy::Recursive together with Path::Class::Dir's recurse:
use Path::Class qw(dir file);
use File::Copy::Recursive qw(fcopy);
dir($sourceDir)->recurse(callback => sub {
    my $file = shift;
    return if $file eq '.' || $file eq '..';
    return if -d $file;
    if (<custom filter>)
    {
        my $path = file($file)->relative($sourceDir);
        fcopy("$sourceDir/$path", "$destinationDir/$path") or die "Could not perform fcopy: $!";
    }
});
Without the relative path - the destination folder structure does not seem to be the same as the source folder structure.

I think the easiest solution is to use File::Copy::Recursive to copy the directory structure fully, and then to go back with File::Find::Rule to determine all the dirs that you want to filter and then remove them.
Given that .git folders don't necessarily hold that much data, I think the performance hit from copying more files than you need to is likely to be pretty small. The following would be sufficient to accomplish what you desire:
use strict;
use warnings;
use File::Copy::Recursive qw(dircopy pathrmdir);
use File::Find::Rule;
my $src = '...src...';
my $dest = '...dest...';
dircopy($src, $dest) or die "Can't dircopy: $!";
my @git = File::Find::Rule->directory()
    ->name('.git')
    ->in($dest);
pathrmdir($_) or die "Can't remove $_: $!" for (@git);
Alternatively, if you'd like to roll your own, you might take a look at the "Further Examples" section of the File::Find::Rule documentation, which includes an example of how to "ignore CVS directories".
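That documented pattern adapts directly to pruning .git directories; a sketch following the docs' "ignore CVS directories" example (the source path is a placeholder):

```perl
use File::Find::Rule;

my $src = '/path/to/src';    # placeholder

# Collect all files under $src while pruning .git directories entirely
my @files = File::Find::Rule->or(
    File::Find::Rule->directory->name('.git')->prune->discard,
    File::Find::Rule->file,
)->in($src);
```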

Related

Find symlink using Find in Perl

I have faced an issue with finding a symlink inside a directory using the Find library: it can't be found, and I am very new to Perl.
use strict;
use warnings;
use File::Find::Rule;
use File::Find;
use File::Copy;
my @files = File::Find::Rule->file()
    ->name( 'table.txt' )
    ->in( '/a/b/c' );
for my $file (@files)
{
    print "\nfile: $file\n";
}
Supposedly there is another directory, /a/b/d, that contains the specified file table.txt, but that directory is a symlink, and it does not show up in what is printed out.
File::Find::Rule implements the -X filetests as methods. The one that tests whether an entry is a symlink (-l) is called symlink.
In my reading of the question you don't know the name of that directory (otherwise, why "find" a file in it?), except that it is a symbolic link. Then you need to first find directories which are symlinks, then find files in them. For this second task you'll need to tell the module to follow the links.
I use a test structure on disk
test_find/a/c/c.txt
test_find/a/b/b.txt
test_find/is_link -> a/c ("is_link" is a symbolic link to "a/c")
and from the directory right above it I run the program
use warnings;
use strict;
use feature 'say';
use File::Find::Rule;
my $top_dir = shift @ARGV || './test_find';
my @symlink_dirs = File::Find::Rule->directory->symlink->in($top_dir);
foreach my $dir (@symlink_dirs) {
    my @files = File::Find::Rule->file->extras({follow => 1})->in($dir);
    say "in $dir: @files";
}
which prints
in test_find/is_link: test_find/is_link/c.txt
Note that the program will find all files in all directories which are symlinks. Writing an explicit loop as above allows you to add code to decide which of the symlink-directories to actually search. If you don't mind looking through all of them you can just do
my @files = File::Find::Rule->file->extras({follow => 1})->in(@symlink_dirs);
See documentation for features to limit the search using what you know about the directory/file.
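For instance, if you know the file's name (table.txt in the question), a name rule narrows the search; a sketch, reusing the @symlink_dirs list built in the program above:

```perl
use File::Find::Rule;

# @symlink_dirs is assumed to come from the earlier search for symlinked dirs
my @files = File::Find::Rule->file
    ->name('table.txt')          # limit by file name
    ->extras({ follow => 1 })    # follow symlinks while descending
    ->in(@symlink_dirs);
```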
If the link directory is in a hierarchy not including the target, then simply search that hierarchy.
With test structure
other_hier/is_link -> ../test_find/a/c
test_find/a/c/c.txt
you only need to tell it to follow the links
my @all_files = File::Find::Rule   # may use linebreaks and spaces
    -> file
    -> extras({ follow => 1 })
    -> in('other_hier');
When added to the above program and printed this adds
other_hier/is_link/c.txt
You can of course replace the literal 'other_hier' with $top_dir and invoke the program with argument other_hier (and make that other_hier directory and the link in it).
If both link and target are in the same hierarchy that is searched then you can't do this; the search would run into circular links (the module detects that and exits with error).

Unzipping files in perl with wildcard file name

I am using the Archive::Zip module to extract a specific file.
my $file = shift;
my $zip = Archive::Zip->new('zipped.zip');
$zip->extractMember($file . '.txt');
The problem is that sometimes the complete file name is not known, and I want to do something like this:
$zip->extractMember($file.*.'txt');
I searched around online and can't find anything for this module (or any similar module). Is there a module that allows for wildcarding in file extraction?
It seems you could use the documented membersMatching($regex) method:
my @files = $zip->membersMatching(qr/$file.*\.txt/);
$zip->extractMember($_) for @files;
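Note that membersMatching returns member objects, which extractMember accepts directly. A slightly fuller sketch with error checking ($file and the archive name are taken from the question; \Q...\E guards against regex metacharacters in the prefix):

```perl
use Archive::Zip qw(:ERROR_CODES);

my $zip = Archive::Zip->new();
$zip->read('zipped.zip') == AZ_OK
    or die "Cannot read zipped.zip";

# $file is the (possibly partial) name from the question
for my $member ($zip->membersMatching(qr/^\Q$file\E.*\.txt$/)) {
    $zip->extractMember($member) == AZ_OK
        or die "Cannot extract ", $member->fileName;
}
```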

How to recursively read directory contents in Perl starting from root, and retrieve files ending in .txt down to a user-specified level

I was trying to ask for help when I posted a previous question. Also, I don't want to use any modules unless they are built in; I prefer to write my own. I know the recursive part needed to list all files from multiple directories, but I don't understand where exactly, or how, I would specify the desired level of search. So if I pass root and 3 as parameters, it should search at most 3 directory levels deep and retrieve all files at a depth less than or equal to 3. Any help is greatly appreciated.
Do you just want it to list all files, or to return them in an array? If merely printing them is enough, you can do something like:
sub print_txt_recurse {
    my ($filepath, $level) = @_;
    opendir my $dh, $filepath or die "Can't open $filepath: $!";
    for my $entry (readdir $dh) {
        next if $entry eq '.' || $entry eq '..';
        my $file = "$filepath/$entry";
        if (-f $file && $file =~ /\.txt$/) {
            print "$file\n";
        }
        elsif (-d $file && $level > 1) {
            print_txt_recurse($file, $level - 1);
        }
    }
    return;
}
You could use File::Find, a core module of Perl, which means it is supposed to be available everywhere.
See Core modules (F)
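A depth-limited sketch using only core modules might look like the following ($root and $max_level are hypothetical parameters; $File::Find::prune stops descent into directories already at the depth limit):

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;

my ($root, $max_level) = ('.', 3);   # hypothetical parameters
my @txt;

find(sub {
    my $rel   = File::Spec->abs2rel($File::Find::name, $root);
    my $depth = $rel eq '.' ? 0 : scalar(split m{/}, $rel);
    if (-d && $depth >= $max_level) {
        $File::Find::prune = 1;      # don't descend below the limit
        return;
    }
    push @txt, $File::Find::name if -f && /\.txt$/;
}, $root);

print "$_\n" for @txt;
```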

Renaming items in a File::Find folder traversal

I am supposed to traverse through a whole tree of folders and rename everything (including folders) to lower case. I looked around quite a bit and saw that the best way was to use File::Find. I tested this code:
#!/usr/bin/perl -w
use File::Find;
use strict;
print "Folder: ";
chomp(my $dir = <STDIN>);
find(\&lowerCase, $dir);
sub lowerCase{
print $_," = ",lc($_),"\n";
rename $_, lc($_);
}
and it seems to work fine. But can anyone tell me if I might run into trouble with this code? I remember posts on how I might run into trouble because of renaming folders before files or something like that.
If you are on Windows, as comments stated, then no, renaming files or folders in any order won't be a problem, because a path DIR1/file1 is the same as dir1/file1 to Windows.
It MAY be a problem on Unix though, in which case you are better off doing a recursive BFS by hand.
Also, when doing system calls like rename, ALWAYS check result:
rename($from, $to) || die "Error renaming $from to $to: $!";
As noted in comments, take care about renaming "ABC" to "abc". On Windows this is not a problem.
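If a direct case-only rename turns out to be a no-op on a given case-insensitive filesystem, a common workaround is a two-step rename through a temporary name. A sketch (rename_case and the .tmp$$ suffix are arbitrary choices, not part of any module):

```perl
# Hypothetical helper: case-only rename via a temporary name, for
# filesystems where renaming "ABC" to "abc" directly may be a no-op
sub rename_case {
    my ($from, $to) = @_;
    return rename($from, $to) if lc($from) ne lc($to);
    my $tmp = "$to.tmp$$";   # arbitrary temporary suffix
    rename($from, $tmp) or die "Error renaming $from to $tmp: $!";
    rename($tmp, $to)   or die "Error renaming $tmp to $to: $!";
    return 1;
}
```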
Personally, I prefer to:
List files to be renamed using find dir/ > 2b_renamed
Review the list manually, using an editor of choice (vim 2b_renamed, in my case)
Use the rename from CPAN on that list: xargs rename 'y/A-Z/a-z/' < 2b_renamed
That manual review is very important to me, even when I can easily rollback changes (via git or even Time Machine).

How to include a "diff" in a Perl test? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
How can I use Perl to determine whether the contents of two files are identical?
If I am writing a Perl module test, and for example I want to test that an output file is exactly what is expected, if I use an external command like diff, the test might fail on some operating systems which don't provide the diff command. What would be a simple way to do something like diff on files, which doesn't rely on external commands? I understand that there are modules on CPAN which can do file diffs, but I would rather not complicate the build process unless necessary.
File::Compare, in core since 5.004.
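A minimal usage sketch (compare returns 0 when the files are equal, 1 when they differ, and -1 on error):

```perl
use File::Compare;

if (compare('expected_results', 'actual_results') == 0) {
    print "Files are identical\n";
}
else {
    die "Files differ (or could not be read)";
}
```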
When testing and looking for differences in files or strings, I always use Test::Differences, which uses Text::Diff. I know you would probably prefer a non-module solution, but looking for differences has many corner cases, so it is not trivial. I am also writing this answer for googlers (in case you already know these modules).
I like the table output of this module. It is very convenient when there are only a small number of differences.
Why not just read and compare the two files in perl? Something like...
sub readfile
{
local ($/) = undef;
open READFILE, "<", $_[0]
or die "Can't read '$_[0]': $!";
my $contents = <READFILE>;
close READFILE or die "Can't close '$_[0]': $!";
return $contents;
}
my $expected = readfile("expected_results");
my $actual = readfile("actual_results");
if ($expected ne $actual) {
die "Got wrong results!";
}
(If you're concerned about multiple OS portability, you may also need to do something about line endings, either in your test program or here, because some OSs use CRLF instead of LF to separate lines in text files. If you want to handle it here, a regular expression replace will do the trick.)
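For instance, the comparison above could normalize line endings first; a sketch that maps CRLF (and lone CR) to LF before comparing:

```perl
# Normalize Windows (CRLF) and old-Mac (CR) line endings to LF
for ($expected, $actual) {
    s/\r\n/\n/g;
    s/\r/\n/g;
}

if ($expected ne $actual) {
    die "Got wrong results!";
}
```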