Perl tar file creates directory recursively - perl

I am tarring the directory contents using the Archive::Tar module.
My script is below:
#!/usr/bin/perl -w
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use Data::Dumper;
my $home_dir = "C:/Users/Documents/Vinod/Perl_Scripts/test/";
my $src_location = $home_dir."LOG_DIR";
my $dst_location = $home_dir."file.tar.gz";
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, $src_location);
print "Files:".Dumper(\@inventory);
my $tar = Archive::Tar->new();
$tar->add_files( @inventory );
$tar->write( $dst_location , 9 );
The script is able to create file.tar.gz in C:/Users/Documents/Vinod/Perl_Scripts/test/.
But when I extract file.tar.gz manually it recreates the whole path, so the LOG_DIR contents end up in C:/Users/Documents/Vinod/Perl_Scripts/test/file.tar/file/Users/Documents/Vinod/Perl_Scripts/test/LOG_DIR/
How can I make the contents of C:/Users/Documents/Vinod/Perl_Scripts/test/LOG_DIR/ appear directly in C:/Users/Documents/Vinod/Perl_Scripts/test/file.tar/file/ when extracting?

If you don't want to recreate the full path, chdir into the home directory, and make the source dir relative:
my $home_dir = "C:/Users/Documents/Vinod/Perl_Scripts/test/";
chdir $home_dir;
my $src_location = "LOG_DIR";
my $dst_location = $home_dir."file.tar.gz";
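For completeness, here is a minimal sketch of how the whole script from the question might look with that change (same paths as in the question; untested):
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Tar;
use File::Find;

my $home_dir     = "C:/Users/Documents/Vinod/Perl_Scripts/test/";
my $dst_location = $home_dir . "file.tar.gz";

# Work from the home directory so find() reports paths relative to it.
chdir $home_dir or die "Cannot chdir to $home_dir: $!";

my @inventory;
find(sub { push @inventory, $File::Find::name }, 'LOG_DIR');

# The archive now stores names like LOG_DIR/... instead of the full C:/ path.
my $tar = Archive::Tar->new();
$tar->add_files(@inventory);
$tar->write($dst_location, 9);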

Since you use $File::Find::name for your list, you get the absolute path to each file. That's the name that you give Archive::Tar, so that's the name that it uses. You can see the files in a tarball:
$ tar -tzf archive.tgz
There are various ways to get relative paths instead. You might do that in the wanted function. Remove the part of the path that you do not want. That's typically not going to be the directory you used for find ($src_location) because you want to keep that level of structure:
use File::Spec::Functions qw(abs2rel);

my @inventory;
find(
    sub {
        return if /\A\.\.?\z/;
        push @inventory, abs2rel( $File::Find::name, $home_dir );
    }, $src_location
);
Or do it after:
@inventory = map { abs2rel($_, $home_dir) } @inventory;
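Putting it together with the question's script, here is a sketch of the abs2rel approach; since add_files() resolves the now-relative names against the current directory, the sketch chdirs to $home_dir before adding them:
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use File::Spec::Functions qw(abs2rel);

my $home_dir     = "C:/Users/Documents/Vinod/Perl_Scripts/test/";
my $src_location = $home_dir . "LOG_DIR";
my $dst_location = $home_dir . "file.tar.gz";

# Collect paths relative to $home_dir so the archive keeps only LOG_DIR/...
my @inventory;
find(
    sub {
        return if /\A\.\.?\z/;
        push @inventory, abs2rel( $File::Find::name, $home_dir );
    },
    $src_location
);

chdir $home_dir or die "Cannot chdir to $home_dir: $!";
my $tar = Archive::Tar->new();
$tar->add_files(@inventory);
$tar->write($dst_location, 9);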

Related

Perl Archive::Tar how to exclude root directory

I need to extract a .tgz file (containing a directory tree/files) into the d:\work directory, make some changes, and repack it again.
The extracted folder will be like D:\work\123a\... (so the .tgz contains folder 123a and many subfolders/files).
I tried to use Archive::Tar and pack recursively, but the problem is that the "work" folder is also included, so the archive contains \work\123a\...
How do I exclude this "work" folder (the cwd or root dir)?
Any help is appreciated.
my code:
my $tar = Archive::Tar->new;
$tar->read("$tgzfiles[0]");
$tar->extract();
(my $tgzfolder = $tgzfiles[0]) =~ s/\.[^.]+$//;
# Here I make some changes in content{.....}
our $arc = Archive::Tar->new();
find( \&archiveit, "d:/work/$tgzfolder" );
$arc->write("$tgzfiles[0]", 9);
sub archiveit {
    $arc->add_files( $File::Find::name );
}
The following works for me. The initial test.tgz is located in the home folder.
I used chdir to switch to the working directory, but then I needed to pass the no_chdir option to find so that it does not change directory into the subdirectories (otherwise add_files could not resolve $File::Find::name against the current directory).
Also, I guess $tgzfolder should be the path to the extracted files, so it should be taken from the archive contents, not from the archive's name.
#!/usr/bin/perl
use warnings;
use strict;
use Archive::Tar;
use File::Find;
my @tgzfiles = "$ENV{HOME}/test.tgz";
chdir 'd:/work';
my $tar = 'Archive::Tar'->new;
$tar->read($tgzfiles[0]);
$tar->extract;
# Modify the contents.
my $tgzfolder = ($tar->list_files)[0];
my $arc = 'Archive::Tar'->new;
find({ wanted   => sub { $arc->add_files($File::Find::name) },
       no_chdir => 1 },
     $tgzfolder);
$arc->write($tgzfiles[0], 9);

Make a .tar or .gz file on Windows using Perl

I am trying to make a .tar or .gz file, but I have an issue: it takes the complete path:
D:\test\jtax-issue11-16\title.xml
D:\test\jtax-issue11-16\artwork
D:\test\jtax-issue11-16\artwork\cover.png
Note: the above is also my folder structure.
But my requirement is:
jtax-issue11-16\title.xml
jtax-issue11-16\artwork
jtax-issue11-16\artwork\cover.png
That is, I want to create the .tar or .gz file with only the current folder name.
My code is:
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use File::Basename 'basename';
use Cwd;
my $current_path = getcwd;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, $current_path);
my $tar = Archive::Tar->new();
$tar->add_files(@inventory);
$tar->write('a.tar');
If I use basename, it produces an error. I don't understand how to use basename, or how to create the .tar or .gz file with only the current folder name.
use File::Find::Rule qw( );

my $base_dir = '.';

my @files =
    map { s{^\Q$base_dir/}{}r }
        File::Find::Rule
            ->mindepth(1)
            ->in($base_dir);
or
use File::Find qw( find );

my $base_dir = '.';

my @files;
find(
    {
        wanted   => sub { push @files, s{^\Q$base_dir/}{}r },
        no_chdir => 1,
    },
    $base_dir
);

shift(@files);
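Either way you end up with paths relative to $base_dir; here is a short sketch of handing them to Archive::Tar (the archive name a.tar is taken from the question):
use Archive::Tar;

# @files holds relative names such as jtax-issue11-16/title.xml,
# so those are the names stored in the archive.
my $tar = Archive::Tar->new();
$tar->add_files(@files);
$tar->write('a.tar');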
Given that you are in $current_path when you call find(), you can just pass . to find(). That way all of the paths you get in $File::Find::name will be relative to the current directory:
my $current_path = getcwd;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, '.');
That will give you paths like:
.\jtax-issue11-16\title.xml
.\jtax-issue11-16\artwork
.\jtax-issue11-16\artwork\cover.png
But you could use s/^\.\\// to remove the .\ from the beginning of that path if it's important to you. The easiest place to do that might be after you have built @inventory.
@inventory = map { s/^\.\\//; $_ } @inventory;
I added Dave Cross's code to my file and got my output.
The code is below:
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use File::Basename 'basename';
use 5.010;
use Cwd;
my $current_dir = getcwd;
my $file_name   = basename($current_dir);   # assumed: name the tarball after the current folder
my @tar_files;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, $current_dir);
my $tar = Archive::Tar->new();
@inventory = map { s/^$current_dir\///; $_ } @inventory;
foreach my $temp (@inventory)
{
    # skip my own script (conv.pl / conv.exe)
    # and skip the entry left over from $current_dir itself
    if ($temp =~ m/conv\.pl/ || $temp =~ m/conv\.exe/ || $temp =~ /$current_dir/)
    {
    }
    else
    {
        push (@tar_files, $temp);
    }
}
$tar->add_files(@tar_files);
$tar->write($file_name.".tar");

Determine if directory is the root directory

I am trying to traverse the directory tree upwards, searching for a given directory name; if the directory is found, I should chdir to it, otherwise give an error message. For example:
use warnings;
use strict;
use Cwd qw(getcwd);
die "Base directory not found!" if (!gotoDir());
sub gotoDir {
    my $baseDir = '.test';
    my $curdir = getcwd();
    while (1) {
        return 1 if (-d $baseDir);
        if (! chdir("..")) {
            chdir($curdir);
            return 0;
        }
    }
}
The problem is that chdir does not fail when going beyond the root, so the above program enters an infinite loop if .test is not found.
Of course, I could just test for / since I am on Linux, but I would like to do this in a system independent manner.
As @Gnouc has answered, the File::Spec module has a portable representation of the root directory with its rootdir method.
This is how I would write your goto_dir subroutine. Note that capital letters are conventionally reserved for global identifiers like Package::Names.
I think it is best to pass the directory you are searching for as a parameter to the subroutine to make it more general. I have also written it so that the subroutine does a chdir to the .test directory if it is found, which is what you say you want but not what your own solution tries to do.
Finally, since portability is important, I have used File::Spec->updir in place of a literal '..' to refer to the parent of the current directory.
#!/usr/bin/env perl
use strict;
use warnings;
use Cwd 'cwd';
use File::Spec;
goto_dir('.test') or die 'Base directory not found!';
sub goto_dir {
    my ($base_dir) = @_;
    my $original_dir = cwd;
    while () {
        if (-d $base_dir) {
            chdir $base_dir;
            return 1;
        }
        elsif (cwd eq File::Spec->rootdir) {
            chdir $original_dir;
            return 0;
        }
        else {
            chdir File::Spec->updir;
        }
    }
}
You can use File::Spec to get the root directory:
$ perl -MFile::Spec -E 'say File::Spec->rootdir()'
/
File::Spec is great for obtaining what the root directory is, but testing whether a given directory is or isn't the root is not so easy. For that you likely want to use stat and compare whether the dev and ino fields are equal:
use File::stat;
use File::Spec;

my $rootstat = stat(File::Spec->rootdir);
...
my $thisstat = stat($dir);
if( $thisstat->dev == $rootstat->dev and $thisstat->ino == $rootstat->ino ) {
    say "This is the root directory";
}
This avoids problems with the string-formatted form of a path to the directory; it may be, for example, that the path you have is ../../../../../..
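Wrapped up as a helper, here is a sketch of that comparison (is_root_dir is just a hypothetical name used for illustration):
use strict;
use warnings;
use File::stat;
use File::Spec;

# True if $dir refers to the same filesystem object as the root directory.
sub is_root_dir {
    my ($dir) = @_;
    my $rootstat = stat(File::Spec->rootdir) or return 0;
    my $dirstat  = stat($dir)                or return 0;
    return $dirstat->dev == $rootstat->dev
        && $dirstat->ino == $rootstat->ino;
}

# For example, stop the upward search once the root has been reached:
# last if is_root_dir('.');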

Adding a member to zip file from a file handle in Perl

I am trying to add a remote file to a local zip archive.
Currently, I am doing something like this.
use Modern::Perl;
use Archive::Zip;
use File::Remote;
my $remote = File::Remote->new(rsh => "/usr/bin/ssh", rcp => "/usr/bin/scp");
my $zip = Archive::Zip->new();
$remote->open(*FH,'host2:/file/to/add.txt');
my $fh = IO::File->new_from_fd(*FH,'r');
#this is what I want to do.
$zip->addFileHandle($fh,'add.txt');
...
Unfortunately, Archive::Zip does not have an addFileHandle method.
Is there another way that I can do that?
Thanks.
Do something like this (copy to local path):
$remote->copy("host:/remote/file", "/local/file");
and then use the addFile method provided by Archive::Zip with the local file.
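A minimal sketch of that approach, reusing the remote path from the question; the local paths /tmp/add.txt and /tmp/archive.zip are only illustrative:
use strict;
use warnings;
use Archive::Zip qw( :ERROR_CODES );
use File::Remote;

my $remote = File::Remote->new(rsh => "/usr/bin/ssh", rcp => "/usr/bin/scp");

# Copy the remote file to a local scratch path first...
$remote->copy('host2:/file/to/add.txt', '/tmp/add.txt')
    or die "copy failed: $!";

# ...then add that local copy under the member name we want.
my $zip = Archive::Zip->new();
$zip->addFile('/tmp/add.txt', 'add.txt');
$zip->writeToFileNamed('/tmp/archive.zip') == AZ_OK
    or die "zip write failed";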
Archive::Zip might not have filehandle support for writing to a zip file, but Archive::Zip::SimpleZip does.
Here is a self-contained example that shows how to read from a filehandle and write directly to the zip file without any need for a temporary file.
use warnings;
use strict;
use Archive::Zip::SimpleZip;
use File::Remote;
# create a file to add to the zip archive
system "echo hello world >/tmp/hello" ;
my $remote = File::Remote->new(rsh => "/usr/bin/ssh", rcp => "/usr/bin/scp");
my $zip = Archive::Zip::SimpleZip->new("/tmp/r.zip");
$remote->open(*FH,'/tmp/hello');
# Create a filehandle to write to the zip file.
my $member = $zip->openMember(Name => 'add.txt');
my $buffer;
while (read(FH, $buffer, 1024*16))
{
    print $member $buffer;
}
$member->close();
$zip->close();
# dump the contents of the zip file to stdout
system "unzip -p /tmp/r.zip" ;

How do I use $File::Find::prune?

I need to edit cue files in the first directory only and not recurse into the subdirectories.
find(\&read_cue, $dir_source);
sub read_cue {
    /\.cue$/ or return;
    my $fd = $File::Find::dir;
    my $fn = $File::Find::name;
    tie my @lines, 'Tie::File', $fn
        or die "could not tie file: $!";
    foreach (@lines) {
        s/some substitution//;
    }
    untie @lines;
}
I've tried variations of
$File::Find::prune = 1;
return;
but with no success. Where should I place and define $File::Find::prune?
Thanks
If you don't want to recurse, you probably want to use glob:
for (glob("*.cue")) {
    read_cue($_);
}
If you want to filter the subdirectories recursed into by File::Find, you should use the preprocess function (not the $File::Find::prune variable) as this gives you much more control. The idea is to provide a function which is called once per directory, and is passed a list of files and subdirectories; the return value is the filtered list to pass to the wanted function, and (for subdirectories) to recurse into.
As msw and Brian have commented, your example would probably be better served by a glob, but if you wanted to use File::Find, you might do something like the following. Here, the preprocess function calls -f on every file or directory it's given, returning a list of files. Then the wanted function is called only for those files, and File::Find does not recurse into any of the subdirectories:
use strict;
use File::Find;
# Function is called once per directory, with a list of files and
# subdirectories; the return value is the filtered list to pass to
# the wanted function.
sub preprocess { return grep { -f } @_; }
# Function is called once per file or subdirectory.
sub wanted { print "$File::Find::name\n" if /\.cue$/; }
# Find files in or below the current directory.
find { preprocess => \&preprocess, wanted => \&wanted }, '.';
This can be used to create much more sophisticated file finders. For example, I wanted to find all files in a Java project directory, without recursing into subdirectories starting with ".", such as ".idea" and ".svn", created by IntelliJ and Subversion. You can do this by modifying the preprocess function:
# Function is called once per directory, with a list of files and
# subdirectories; return value is the filtered list to pass to the
# wanted function.
sub preprocess { return grep { -f or (-d and /^[^.]/) } @_; }
If you only want the files in a directory without searching subdirectories, you don't want to use File::Find. A simple glob probably does the trick:
my #files = glob( "$dir_source/*.cue" );
You don't need that subroutine. In general, when you're doing a lot of work for a task that you think should be simple, you're probably doing it wrong. :)
Say you have a directory subtree with
/tmp/foo/file.cue
/tmp/foo/bar/file.cue
/tmp/foo/bar/baz/file.cue
Running
#! /usr/bin/perl
use warnings;
use strict;
use File::Find;
sub read_cue {
    if (-f && /\.cue$/) {
        print "found $File::Find::name\n";
    }
}

@ARGV = (".") unless @ARGV;
find \&read_cue => @ARGV;
outputs
found /tmp/foo/file.cue
found /tmp/foo/bar/file.cue
found /tmp/foo/bar/baz/file.cue
But if you remember the directories in which you found cue files
#! /usr/bin/perl
use warnings;
use strict;
use File::Find;
my %seen_cue;

sub read_cue {
    if (-f && /\.cue$/) {
        print "found $File::Find::name\n";
        ++$seen_cue{$File::Find::dir};
    }
    elsif (-d && $seen_cue{$File::Find::dir}) {
        $File::Find::prune = 1;
    }
}

@ARGV = (".") unless @ARGV;
find \&read_cue => @ARGV;
you get only the toplevel cue file:
found /tmp/foo/file.cue
That's because $File::Find::prune emulates the -prune option of find that affects directory processing:
-prune
True; if the file is a directory, do not descend into it.
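Applied to the original question (edit only the .cue files in the top level of $dir_source), here is a sketch that sets $File::Find::prune for every directory below the starting one, so find never descends:
use strict;
use warnings;
use File::Find;

my $dir_source = shift // '.';

find(sub {
    # Prune every directory except the starting one, so find stays at the top level.
    if (-d && $File::Find::name ne $dir_source) {
        $File::Find::prune = 1;
        return;
    }
    return unless /\.cue$/;
    print "would edit $File::Find::name\n";   # do the Tie::File edit here
}, $dir_source);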