Perl Archive::Tar: how to exclude the root directory

I need to extract a .tgz file (containing a directory tree with files) into the d:\work directory, make some changes, and repack it.
The extracted folder will look like D:\work\123a\... (so the .tgz contains the folder 123a and many subfolders and files).
I tried to use Archive::Tar and pack recursively, but the problem is that the "work" folder is also included in the archive, so the stored paths become \work\123a\...
How can I exclude this "work" folder (the cwd or root dir)?
Any help is appreciated.
My code:
my $tar = Archive::Tar->new;
$tar->read("$tgzfiles[0]");
$tar->extract();
(my $tgzfolder = $tgzfiles[0]) =~ s/\.[^.]+$//;
# Here I make some changes in content{.....}
our $arc = Archive::Tar->new();
find( \&archiveit, "d:/work/$tgzfolder" );
$arc->write("$tgzfiles[0]", 9);
sub archiveit {
$arc->add_files( $File::Find::name );
}

The following works for me. The initial test.tgz is located in the home folder.
I used chdir to switch to the working directory, but then I needed to pass the no_chdir option to find so that it does not change into the subdirectories while calling the wanted function.
Also, I guess the $tgzfolder should be the path to the extracted files, so it should be extracted from the archive contents, not from its name.
#!/usr/bin/perl
use warnings;
use strict;
use Archive::Tar;
use File::Find;
my @tgzfiles = "$ENV{HOME}/test.tgz";
chdir 'd:/work' or die "Cannot chdir to d:/work: $!";
my $tar = 'Archive::Tar'->new;
$tar->read($tgzfiles[0]);
$tar->extract;
# Modify the contents.
my $tgzfolder = ($tar->list_files)[0];
my $arc = 'Archive::Tar'->new;
find({wanted => sub { $arc->add_files($File::Find::name) },
no_chdir => 1},
$tgzfolder );
$arc->write($tgzfiles[0], 9);

Related

Perl tar file creates directory recursively

I am tarring the contents of a directory using the Archive::Tar module.
My script is below:
#!/usr/bin/perl -w
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use Data::Dumper;
my $home_dir = "C:/Users/Documents/Vinod/Perl_Scripts/test/";
my $src_location = $home_dir."LOG_DIR";
my $dst_location = $home_dir."file.tar.gz";
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, $src_location);
print "Files:".Dumper(\@inventory);
my $tar = Archive::Tar->new();
$tar->add_files( @inventory );
$tar->write( $dst_location , 9 );
The script creates file.tar.gz in C:/Users/Documents/Vinod/Perl_Scripts/test/.
But when I extract file.tar.gz manually, it recreates the whole path again, so the LOG_DIR contents end up in C:/Users/Documents/Vinod/Perl_Scripts/test/file.tar/file/Users/Documents/Vinod/Perl_Scripts/test/LOG_DIR/
How can I make the contents of C:/Users/Documents/Vinod/Perl_Scripts/test/LOG_DIR/ land in C:/Users/Documents/Vinod/Perl_Scripts/test/file.tar/file/ when extracting?
If you don't want to recreate the full path, chdir into the home directory, and make the source dir relative:
my $home_dir = "C:/Users/Documents/Vinod/Perl_Scripts/test/";
chdir $home_dir;
my $src_location = "LOG_DIR";
my $dst_location = $home_dir."file.tar.gz";
Since you use $File::Find::name for your list, you get the absolute path to each file. That's the name that you give Archive::Tar, so that's the name that it uses. You can see the files in a tarball:
$ tar -tzf archive.tgz
There are various ways to get relative paths instead. You might do that in the wanted function. Remove the part of the path that you do not want. That's typically not going to be the directory you used for find (src_location) because you want to keep that level of structure:
use File::Spec::Functions qw(abs2rel); # abs2rel is not imported by default
my @inventory;
find(
sub {
return if /\A\.\.?\z/; # skip . and ..
push @inventory, abs2rel( $File::Find::name, $home_dir )
}, $src_location
);
Or do it after:
@inventory = map { abs2rel($_, $home_dir) } @inventory;
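To make the effect of relative names concrete, here is a minimal, self-contained sketch of the chdir approach described above. It is not taken from either answer: the temporary directory, the LOG_DIR/sub/app.log layout, and the variable names are assumptions made up for illustration. It builds a small tree, archives it using names relative to the home directory, and then reads the archive back to show which names were stored.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd qw(getcwd);
use File::Find;
use File::Path qw(make_path);
use File::Spec::Functions qw(catdir catfile);
use File::Temp qw(tempdir);

# Hypothetical layout for illustration: <tmp>/LOG_DIR/sub/app.log
my $home_dir = tempdir( CLEANUP => 1 );
make_path( catdir( $home_dir, 'LOG_DIR', 'sub' ) );
open my $fh, '>', catfile( $home_dir, 'LOG_DIR', 'sub', 'app.log' )
    or die "Cannot create sample file: $!";
print {$fh} "hello\n";
close $fh;

# Work from inside the home directory so File::Find reports relative names.
my $old_cwd = getcwd();
chdir $home_dir or die "Cannot chdir to $home_dir: $!";

my @inventory;
find(
    {
        # no_chdir keeps the cwd fixed, so the relative names stay valid
        wanted   => sub { push @inventory, $File::Find::name if -f $File::Find::name },
        no_chdir => 1,
    },
    'LOG_DIR'
);

my $archive = catfile( $home_dir, 'file.tar.gz' );
my $tar     = Archive::Tar->new;
$tar->add_files(@inventory);
$tar->write( $archive, 9 ) or die "Cannot write $archive: " . $tar->error;

chdir $old_cwd or die "Cannot chdir back to $old_cwd: $!";

# Read the archive back and list the stored member names.
my @stored = Archive::Tar->new($archive)->list_files;
print "$_\n" for sort @stored;
```

If the stored names start with LOG_DIR rather than with the full path to the home directory, extracting the archive should recreate only the LOG_DIR level.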

Perl: how to read subfolder output

I am writing a script to read the contents of multiple subfolders in a directory.
Recently I also needed to read the contents of a folder inside each of those subfolders.
How can I write the code to read those nested folders?
This is the new layout:
Multiple Sub-folder -> Local folder -> fileAAA.csv
How do I read fileAAA in the Local folder of each subfolder?
The code I have now works well for this layout:
Multiple Sub-folder -> fileAAA.csv
It is able to read fileAAA from multiple subfolders.
Below is the code I use to read
Multiple Sub-folder -> fileAAA.csv
my ( $par_dir, $sub_dir );
opendir( $par_dir, "$parent" );
while ( my $sub_folders = readdir($par_dir) ) {
next if ( $sub_folders =~ /^\.\.?$/ ); # skip . and .. (dots escaped so short names are not skipped)
my $path = $parent . '/' . $sub_folders;
next unless ( -d $path ); # skip anything that isn't a directory
opendir( $sub_dir, $path );
while ( my $file = readdir($sub_dir) ) {
next unless $file =~ /\.csv$/i; # only .csv files
my $full_path = $path . '/' . $file;
print_file_names($full_path);
}
closedir($sub_dir);
$flag = 0;
}
closedir($par_dir);
......
Updated
You should look at the File::Find module which has everything already in place to do searches like this, and has taken account of all corner cases for you
I wrote that on my tablet and at the time I couldn't offer sample code to support it. I believe this will do what you're asking for, which is simply to find all CSV files at any level beneath a parent directory
use strict;
use warnings;
use File::Find qw/ find /;
STDOUT->autoflush;
my $parent = '/Multiple Sub-folder';
find(sub {
return unless -f and /\.csv$/i;
print_file_names($File::Find::name);
}, $parent);
sub print_file_names {
my ($fn) = @_;
print $fn, "\n";
}
Without using a module, try this.
Instead of opendir, you can use glob to search the subdirectories.
In the script below I wrote a subroutine that searches recursively.
When the elsif condition is satisfied (the entry is a directory), its path is passed to the find subroutine again, which searches it in turn, and so on.
my $v = "/Multiple Sub-folder";
find($v);
sub find{
my ($s) = @_;
foreach my $ma (glob "$s/*")
{
if(-f $ma)
{
if($ma =~ m/\.csv$/) # Here search for csv files (dot escaped to match a literal ".").
{
print "$ma\n";
}
}
elsif(-d $ma)
{
find("$ma")
}
}
}
That said, you can use the File::Find module to search for files in the directory, as in Borodin's answer, which is the best approach.

Determine if directory is the root directory

I am trying to traverse the directory tree upwards, searching for a given directory name, if the directory is found, I should chdir to it, otherwise give an error message. For example:
use warnings;
use strict;
use Cwd qw(getcwd);
die "Base directory not found!" if (!gotoDir());
sub gotoDir {
my $baseDir = '.test';
my $curdir = getcwd();
while (1) {
return 1 if (-d $baseDir);
if (! chdir("..")) {
chdir($curdir);
return 0;
}
}
}
The problem is that chdir does not fail when going beyond the root, so the above program enters an infinite loop if .test is not found.
Of course, I could just test for / since I am on Linux, but I would like to do this in a system independent manner.
As @Gnouc has answered, the File::Spec module has a portable representation of the root directory with its rootdir method.
This is how I would write your goto_dir subroutine. Note that capital letters are conventionally reserved for global identifiers like Package::Names.
I think it is best to pass the directory you are searching for as a parameter to the subroutine to make it more general. I have also written it so that the subroutine does a chdir to the .test directory if it is found, which is what you say you want but not what your own solution tries to do.
Finally, since portability is important, I have used File::Spec->updir in place of a literal '..' to refer to the parent of the current directory.
#!/usr/bin/env perl
use strict;
use warnings;
use Cwd 'cwd';
use File::Spec;
goto_dir('.test') or die 'Base directory not found!';
sub goto_dir {
my ($base_dir) = @_;
my $original_dir = cwd;
while () {
if (-d $base_dir) {
chdir $base_dir;
return 1;
}
elsif (cwd eq File::Spec->rootdir) {
chdir $original_dir;
return 0;
}
else {
chdir File::Spec->updir;
}
}
}
You can use File::Spec to get the root directory:
$ perl -MFile::Spec -E 'say File::Spec->rootdir()'
/
File::Spec is great for obtaining the root directory, but testing whether a given directory is the root is not so easy. For that you likely want to use stat and check whether the dev and ino fields are equal:
use File::stat;
my $rootstat = stat(File::Spec->rootdir);
...
my $thisstat = stat($dir);
if( $thisstat->dev == $rootstat->dev and $thisstat->ino == $rootstat->ino ) {
say "This is the root directory";
}
This avoids the problems of comparing path strings, since the same directory may be spelled in many ways, for example as ../../../../../..
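As a hedged sketch of how the stat comparison could be combined with the upward search from the question: the helper names is_root_dir and find_upwards are hypothetical, not part of File::Spec or File::stat. The sketch assumes a POSIX-like filesystem where device and inode numbers are meaningful (on Windows the ino field may not be reliable). Unlike the answer above, it builds up a path instead of calling chdir, so the current directory is left alone.

```perl
use strict;
use warnings;
use File::Spec;
use File::stat;

# Hypothetical helper: true if $dir is the filesystem root, judged by
# device and inode number rather than by the spelling of the path.
sub is_root_dir {
    my ($dir) = @_;
    my $root = stat( File::Spec->rootdir ) or return 0;
    my $this = stat($dir)                  or return 0;
    return ( $this->dev == $root->dev && $this->ino == $root->ino ) ? 1 : 0;
}

# Hypothetical helper: walk upwards from $start looking for a directory
# called $name. Returns the path to it, or nothing once the root is passed.
sub find_upwards {
    my ( $name, $start ) = @_;
    my $dir = $start;
    while (1) {
        return unless -d $dir;    # guard against a missing start directory
        my $candidate = File::Spec->catdir( $dir, $name );
        return $candidate if -d $candidate;
        return if is_root_dir($dir);
        $dir = File::Spec->catdir( $dir, File::Spec->updir );
    }
}

my $hit = find_upwards( '.test', File::Spec->curdir );
print defined $hit ? "Found: $hit\n" : "Base directory not found\n";
```

The caller can then chdir to the returned path if that is what is wanted. Note that the path grows by one updir component per level, which is fine for ordinary trees but is a trade-off against calling chdir at each step.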

Adding a member to zip file from a file handle in Perl

I am trying to add a remote file to a local zip archive.
Currently, I am doing something like this.
use Modern::Perl;
use Archive::Zip;
use File::Remote;
my $remote = File::Remote->new(rsh => "/usr/bin/ssh", rcp => "/usr/bin/scp");
my $zip = Archive::Zip->new();
$remote->open(*FH,'host2:/file/to/add.txt');
my $fh = IO::File->new_from_fd(*FH,'r');
#this is what I want to do.
$zip->addFileHandle($fh,'add.txt');
...
Unfortunately, Archive::Zip does not have an addFileHandle method.
Is there another way that I can do that?
Thanks.
Do something like this (copy to local path):
$remote->copy("host:/remote/file", "/local/file");
and use the addFile method provided by Archive::Zip with the local file
Archive::Zip might not have filehandle support for writing to a zip file, but Archive::Zip::SimpleZip does.
Here is a self-contained example that shows how to read from a filehandle & write directly to the zip file without any need for a temporary file.
use warnings;
use strict;
use Archive::Zip::SimpleZip;
use File::Remote;
# create a file to add to the zip archive
system "echo hello world >/tmp/hello" ;
my $remote = File::Remote->new(rsh => "/usr/bin/ssh", rcp => "/usr/bin/scp");
my $zip = Archive::Zip::SimpleZip->new("/tmp/r.zip");
$remote->open(*FH,'/tmp/hello');
# Create a filehandle to write to the zip file.
my $member = $zip->openMember(Name => 'add.txt');
my $buffer;
while (read(FH, $buffer, 1024*16))
{
print $member $buffer ;
}
$member->close();
$zip->close();
# dump the contents of the zip file to stdout
system "unzip -p /tmp/r.zip" ;

How do I use $File::Find::prune?

I need to edit cue files in the top-level directory only, without recursing into the subdirectories.
find(\&read_cue, $dir_source);
sub read_cue {
/\.cue$/ or return;
my $fd = $File::Find::dir;
my $fn = $File::Find::name;
tie my @lines, 'Tie::File', $fn
or die "could not tie file: $!";
foreach (@lines) {
s/some substitution//;
}
untie @lines;
}
I've tried variations of
$File::Find::prune = 1;
return;
but with no success. Where should I place and define $File::Find::prune?
Thanks
If you don't want to recurse, you probably want to use glob:
for (glob("*.cue")) {
read_cue($_);
}
If you want to filter the subdirectories recursed into by File::Find, you should use the preprocess function (not the $File::Find::prune variable) as this gives you much more control. The idea is to provide a function which is called once per directory, and is passed a list of files and subdirectories; the return value is the filtered list to pass to the wanted function, and (for subdirectories) to recurse into.
As msw and Brian have commented, your example would probably be better served by a glob, but if you wanted to use File::Find, you might do something like the following. Here, the preprocess function calls -f on every file or directory it's given, returning a list of files. Then the wanted function is called only for those files, and File::Find does not recurse into any of the subdirectories:
use strict;
use File::Find;
# Function is called once per directory, with a list of files and
# subdirectories; the return value is the filtered list to pass to
# the wanted function.
sub preprocess { return grep { -f } @_; }
# Function is called once per file or subdirectory.
sub wanted { print "$File::Find::name\n" if /\.cue$/; }
# Find files in or below the current directory.
find { preprocess => \&preprocess, wanted => \&wanted }, '.';
This can be used to create much more sophisticated file finders. For example, I wanted to find all files in a Java project directory, without recursing into subdirectories starting with ".", such as ".idea" and ".svn", created by IntelliJ and Subversion. You can do this by modifying the preprocess function:
# Function is called once per directory, with a list of files and
# subdirectories; return value is the filtered list to pass to the
# wanted function.
sub preprocess { return grep { -f or (-d and /^[^.]/) } @_; }
If you only want the files in a directory without searching subdirectories, you don't want to use File::Find. A simple glob probably does the trick:
my @files = glob( "$dir_source/*.cue" );
You don't need that subroutine. In general, when you're doing a lot of work for a task that you think should be simple, you're probably doing it wrong. :)
Say you have a directory subtree with
/tmp/foo/file.cue
/tmp/foo/bar/file.cue
/tmp/foo/bar/baz/file.cue
Running
#! /usr/bin/perl
use warnings;
use strict;
use File::Find;
sub read_cue {
if (-f && /\.cue$/) {
print "found $File::Find::name\n";
}
}
@ARGV = (".") unless @ARGV;
find \&read_cue => @ARGV;
outputs
found /tmp/foo/file.cue
found /tmp/foo/bar/file.cue
found /tmp/foo/bar/baz/file.cue
But if you remember the directories in which you found cue files
#! /usr/bin/perl
use warnings;
use strict;
use File::Find;
my %seen_cue;
sub read_cue {
if (-f && /\.cue$/) {
print "found $File::Find::name\n";
++$seen_cue{$File::Find::dir};
}
elsif (-d && $seen_cue{$File::Find::dir}) {
$File::Find::prune = 1;
}
}
@ARGV = (".") unless @ARGV;
find \&read_cue => @ARGV;
you get only the toplevel cue file:
found /tmp/foo/file.cue
That's because $File::Find::prune emulates the -prune option of find that affects directory processing:
-prune
True; if the file is a directory, do not descend into it.
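To connect this back to the original question of where to set $File::Find::prune, here is a minimal sketch that prunes every directory except the starting one, so only the top-level cue files are visited. The temporary tree and the variable names ($top, @found) are assumptions made up for the example; the flag is set inside the wanted function at the moment a subdirectory is visited.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical tree for illustration, mirroring the layout above:
#   <top>/file.cue, <top>/bar/file.cue, <top>/bar/baz/file.cue
my $top = tempdir( CLEANUP => 1 );
make_path("$top/bar/baz");
for my $file ( "$top/file.cue", "$top/bar/file.cue", "$top/bar/baz/file.cue" ) {
    open my $fh, '>', $file or die "Cannot create $file: $!";
    close $fh;
}

my @found;
find(
    sub {
        if (-d) {
            # Set the flag while visiting a directory: File::Find then
            # skips its contents. The starting directory itself is kept.
            $File::Find::prune = 1 if $File::Find::name ne $top;
            return;
        }
        push @found, $File::Find::name if /\.cue\z/;
    },
    $top
);

print "found $_\n" for @found;
```

Setting the flag while visiting a plain file has no effect, which may be why placing it after the `/\.cue$/ or return;` check did not work. Note also that prune is ignored when find is run with the bydepth option.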