Make .tar or .gz file in Windows using Perl

I am trying to create a .tar or .gz file, but I have an issue: the archive stores the complete path:
D:\test\jtax-issue11-16\title.xml
D:\test\jtax-issue11-16\artwork
D:\test\jtax-issue11-16\artwork\cover.png
Note: the above is also my folder structure.
But my requirement is:
jtax-issue11-16\title.xml
jtax-issue11-16\artwork
jtax-issue11-16\artwork\cover.png
That is, I want to create the .tar or .gz file with paths starting at the current folder name only.
My code is:
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use File::Basename 'basename';
use Cwd;
my $current_path = getcwd;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, $current_path);
my $tar = Archive::Tar->new();
$tar->add_files(@inventory);
$tar->write('a.tar');
If I use basename, it produces an error. I don't understand how to use basename here, or how to create the .tar or .gz file with only the current folder name.

use File::Find::Rule qw( );
my $base_dir = '.';
my @files =
map { s{^\Q$base_dir/}{}r }
File::Find::Rule
->mindepth(1)
->in($base_dir);
or
use File::Find qw( find );
my $base_dir = '.';
my @files;
find(
{
wanted => sub { push @files, s{^\Q$base_dir/}{}r },
no_chdir => 1,
},
$base_dir
);
shift(@files);

Given that you are in $current_path when you call find(), you should just pass '.' to find(). That way all of the paths you get in $File::Find::name will be relative to the current directory:
my $current_path = getcwd;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, '.');
That will give you paths like:
.\jtax-issue11-16\title.xml
.\jtax-issue11-16\artwork
.\jtax-issue11-16\artwork\cover.png
But you could use s/^\.\\// to remove the .\ from the beginning of the path if it's important to you. The easiest place to do that might be after you have built @inventory.
@inventory = map { s/^\.\\//; $_ } @inventory;
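Putting the pieces together, here is one way the whole script could look. This is a sketch, not the poster's exact code: it chdirs to the parent directory first, so every path find() reports already starts with the folder name; the folder and file names are the ones from the question.

```perl
use strict;
use warnings;
use Archive::Tar;               # core module
use File::Find;
use File::Basename qw(basename dirname);
use Cwd qw(getcwd);

# Archive a directory so its members are stored relative to its parent,
# i.e. as "jtax-issue11-16/..." rather than "D:/test/jtax-issue11-16/...".
# Note: this chdirs as a side effect.
sub tar_current_folder {
    my ($current_path) = @_;             # e.g. D:\test\jtax-issue11-16
    my $parent = dirname($current_path);
    my $folder = basename($current_path);

    chdir $parent or die "can't chdir to $parent: $!";

    my @inventory;
    find({ wanted   => sub { push @inventory, $File::Find::name },
           no_chdir => 1 }, $folder);    # names now start with "$folder/"

    my $tar = Archive::Tar->new();
    $tar->add_files(@inventory);
    $tar->write("$folder.tar");
    # For a gzipped archive instead (needs IO::Zlib):
    # $tar->write("$folder.tar.gz", 9);
    return "$folder.tar";
}

# tar_current_folder(getcwd());          # archive the directory you are in
```

Because the paths handed to add_files are already relative, no post-processing of @inventory is needed.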

I added Dave Cross's code to my file and got my output.
The code is below:
use strict;
use warnings;
use Archive::Tar;
use File::Find;
use File::Basename 'basename';
use 5.010;
use Cwd;
my $current_dir = getcwd;
my $file_name = basename($current_dir); # name the archive after the current folder
my @tar_files;
my @inventory = ();
find (sub { push @inventory, $File::Find::name }, $current_dir);
my $tar = Archive::Tar->new();
@inventory = map { s/^$current_dir\///; $_ } @inventory;
foreach my $temp (@inventory)
{
# skip my own script, conv.pl / conv.exe,
# and skip the blank entry created by $current_dir itself
if ($temp =~ m/conv\.pl/ || $temp =~ m/conv\.exe/ || $temp =~ /$current_dir/)
{
}
else
{
push (@tar_files, $temp);
}
}
$tar->add_files(@tar_files);
$tar->write($file_name . ".tar");


How to perl process and push multiple directories using simple code

I am trying to get all subdirectories and files inside dir1, dir2, dir3, dir4, dir5 (like below) and push them to another directory. With this code I am getting everything; however, I need to process more directories like this. Is there a simple way to process all of dir1 to ... dirx, instead of assigning each directory individually as below? Thanks in advance.
use File::Find::Rule;
my @pushdir;
my @pushdir1 = File::Find::Rule->directory->in('/tmp/dirx');
my @pushdir2 = File::Find::Rule->directory->in('/tmp/nextdir');
my @pushdir3 = File::Find::Rule->directory->in('/tmp/pushdir');
my @pushdir4 = File::Find::Rule->directory->in('/tmp/logdir');
my @pushdir5 = File::Find::Rule->directory->in('/tmp/testdir');
push @pushdir, @pushdir1, @pushdir2, @pushdir3, @pushdir4, @pushdir5;
my @Files;
foreach my $dir (@pushdir) {
push @Files, sort glob "$dir/*.txt";
}
Use a subroutine:
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find::Rule;
my @dirnames = qw( dirx temp_dir testdir nextdir );
my @dirs_to_search = map { "/tmp/$_" } @dirnames;
my @files;
for my $dir (@dirs_to_search) {
push @files, find_files_in_dir($dir);
}
sub find_files_in_dir {
my ($dir) = @_;
my @subdirs = File::Find::Rule->directory->in( $dir );
my @txt_files;
for my $subdir ( @subdirs ) {
push @txt_files, sort glob "$subdir/*.txt";
}
return @txt_files;
}
It sounds like you want a list of all *.txt files contained in /tmp's child directories and their descendants.
That's easily done with a single glob call, like this:
my @files = glob '/tmp/**/*.txt';
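If your glob doesn't expand ** recursively (support for that varies across platforms and Perl builds), a core-modules-only sketch with File::Find does the same job; the directory names here are placeholders:

```perl
use strict;
use warnings;
use File::Find;

# Recursively collect every *.txt file under the given root directories,
# silently skipping roots that don't exist.
sub txt_files_under {
    my (@roots) = @_;
    my @existing = grep { -d } @roots;
    return () unless @existing;

    my @txt;
    find({
        wanted   => sub { push @txt, $File::Find::name if -f && /\.txt\z/ },
        no_chdir => 1,    # $_ and $File::Find::name are the full path
    }, @existing);
    return sort @txt;
}

# my @files = txt_files_under(map { "/tmp/$_" } qw( dir1 dir2 dir3 ));
```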

perl script to count files in windows directory tree

I am new to Perl scripting. I am trying to get the count of directories and subdirectories.
I have searched all the available help on scripting,
but I am unable to get the count of subdirectories. Below is the script I used.
use strict;
use warnings;
use File::Slurp;
my @dirs = ('.');
my $directory_count = 0;
my $file_count = 0;
my $outfile = 'log.txt';
open my $fh, '>', $outfile or die "can't create logfile; $!";
for my $dir (@dirs) {
for my $file (read_dir ($dir)) {
if ( -d "$dir/$file" ) {
$directory_count++;
}
else {
$file_count++;
}
}
print $fh "Directories: $directory_count\n";
print $fh "Files: $file_count\n";
}
close $fh;
Here, I am unable to work out how to get the equivalent of the dir command with /s (i.e. recursing into subdirectories).
Please help; it will save a lot of manual work.
Ravi
Never EVER write your own directory traversal. There are too many pitfalls, gotchas and edge cases. Things like path delimiters, files with spaces, alternate data streams, soft links, hard links, DFS paths... just don't do it.
Use File::Find or, if you prefer, File::Find::Rule.
As I prefer the former, I'll give an example:
use strict;
use warnings;
use File::Find;
my $dir_count;
my $file_count;
# find() runs this sub for every file in its traversal.
# $_ is the current file; $File::Find::name is the full path to the file.
sub count_stuff {
if ( -d ) { $dir_count++ };
if ( -f ) { $file_count++ };
}
find ( \&count_stuff, "." );
print "Dirs: $dir_count\n";
print "Files: $file_count\n";
Here is a script that does it: 1) without global variables; and 2) without adding another sub to the namespace.
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;
run(\@ARGV);
sub run {
my $argv = shift;
for my $dir ( @$argv ) {
my $ret = count_files_and_directories( $dir );
printf(
"%s: %d files and %d directories\n",
$dir,
$ret->{files},
$ret->{directories}
);
}
return;
}
sub count_files_and_directories {
my $top = shift;
my %ret = (directories => 0, files => 0);
find(
{
wanted => sub {
-d and $ret{directories} += 1;
-f and $ret{files} += 1;
},
no_chdir => 1,
},
$top,
);
\%ret;
}
It seems simpler to use File::Find::Rule. For example:
use warnings;
use strict;
use File::Find::Rule;
my @files = File::Find::Rule->new->file->in('.');
my @dirs = File::Find::Rule->new->directory->in('.');

directory tree warning

I have written a script that recursively prints a directory's contents, but it prints a warning for each folder. How do I fix this?
sample folder:
dev# cd /tmp/test
dev# ls -p -R
test2/  testfile  testfile2
./test2:
testfile3  testfile4
my code:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp/test');
sub browseDir {
my $path = shift;
opendir(my $dir, $path);
while (readdir($dir)) {
next if /^\.{1,2}$/;
if (-d "$path/$_") {
browseDir("$path/$_");
}
print "$path/$_\n";
}
closedir($dir);
}
and the output:
dev# perl /tmp/cotest.pl
/tmp/test/test2/testfile3
/tmp/test/test2/testfile4
Use of uninitialized value $_ in concatenation (.) or string at /tmp/cotest.pl line 16.
/tmp/test/
/tmp/test/testfile
/tmp/test/testfile2
Try this code:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp');
sub browseDir {
my $path = shift;
opendir(my $dir, $path);
while (readdir($dir)) {
next if /^\.{1,2}$/;
print "$path/$_\n";
if (-d "$path/$_") {
browseDir("$path/$_");
}
}
closedir($dir);
}
You got that error because the recursive call to browseDir() runs before you use $_, and it overwrites that variable. Printing before recursing avoids the problem.
Why not use the File::Find module? It's included in almost all distributions of Perl since Perl 5.x. It's not my favorite module due to the sort of messy way it works, but it does a good job.
You define a wanted subroutine that does what you want and filter out what you don't want. In this case, you're printing pretty much everything, so all wanted does is print out what is found.
In File::Find, the name of the file is kept in $File::Find::name and the directory for that file is in $File::Find::dir. The $_ is the file itself, and can be used for testing.
Here's a basic way of what you want:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
find ( \&wanted, $directory );
sub wanted {
say $File::Find::name;
}
I prefer to put my wanted function in my find subroutine, so they're together. This is equivalent to the above:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
find (
sub {
say $File::Find::name;
},
$directory,
);
Good programming says not to print in subroutines. Instead, you should use the subroutine to store and return your data. Unfortunately, find doesn't return anything at all. You have to use a global array to capture the list of files, and later print them out:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find (
sub {
push @directory_list, $File::Find::name;
}, $directory );
for my $file (@directory_list) {
say $file;
}
Or, if you prefer a separate wanted subroutine:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find ( \&wanted, $directory );
sub wanted {
push @directory_list, $File::Find::name;
}
for my $file (@directory_list) {
say $file;
}
The fact that my wanted subroutine depends upon an array that's not local to the subroutine bothers me which is why I prefer embedding the wanted subroutine inside my find call.
One thing you can do is use your subroutine to filter out what you want. Let's say you're only interested in JPG files:
use strict;
use warnings;
use feature qw(say);
use File::Find;
my $directory = '/tmp/test';
my @directory_list;
find ( \&wanted, $directory );
sub wanted {
return unless /\.jpg$/i; # skip everything that doesn't have a .jpg suffix
push @directory_list, $File::Find::name;
}
for my $file (@directory_list) {
say $file;
}
Note how the wanted subroutine returns early on any file I don't want, before pushing into my @directory_list array. Again, I prefer the embedding:
find (sub {
return unless /\.jpg$/i; # skip everything that doesn't have a .jpg suffix
push @directory_list, $File::Find::name;
}, $directory );
I know this isn't exactly what you asked, but I just wanted to let you know about the File::Find module and introduce you to Perl modules (if you didn't already know about them), which can add a lot of functionality to Perl.
You place a value in $_ before calling browseDir and you expect the value to be present after calling browseDir (a reasonable expectation), but browseDir modifies that variable.
Just add local $_; to browseDir to make sure that any change to it are undone before the sub exits.
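With local $_ added (and minimal error checking while we're at it), the sub could look like this sketch:

```perl
use strict;
use warnings;

# Recursively print a directory's contents. local $_ keeps the caller's
# $_ intact across the recursive calls, which is what caused the warning.
sub browseDir {
    local $_;
    my $path = shift;
    opendir(my $dir, $path) or die "can't open $path: $!";
    while (readdir($dir)) {
        next if /^\.\.?\z/;          # skip . and ..
        print "$path/$_\n";
        if (-d "$path/$_") {
            browseDir("$path/$_");
        }
    }
    closedir($dir);
}

# browseDir('/tmp/test');
```

The implicit assignment to $_ in while (readdir($dir)) needs Perl 5.12 or later, as noted below.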
Unrelated to your question, here are three other issues:
Not even minimal error checking!
You could run out of directory handles while navigating a deep directory tree.
You filter out files ".\n" and "..\n".
Fix:
#!/usr/bin/perl
use strict;
use warnings;
browseDir('/tmp/test');
sub browseDir {
my $path = shift;
opendir(my $dh, $path) or die $!;
my #files = readdir($dh);
closedir($dh);
for (#files) {
next if /^\.{1,2}\z/;
if (-d "$path/$_") {
browseDir("$path/$_");
}
print "$path/$_\n";
}
}
Finally, why don't you use a module like File::Find::Rule?
use File::Find::Rule qw( );
print "$_\n" for File::Find::Rule->in('/tmp');
Note: Before 5.12, while (readdir($dh)) would have to be written while (defined($_ = readdir($dh))).

How to add one more node information to xml file

I have written a script that creates one XML file from multiple files. My script is like this:
#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;
use Carp;
use File::Find;
use File::Spec::Functions qw( canonpath );
use XML::LibXML::Reader;
use Digest::MD5 'md5';
if ( @ARGV == 0 ) {
push @ARGV, "c:/main/work";
warn "Using default path $ARGV[0]\n Usage: $0 path ...\n";
}
open( my $allxml, '>', "all_xml_contents.combined.xml" )
or die "can't open output xml file for writing: $!\n";
print $allxml '<?xml version="1.0" encoding="UTF-8"?>',
"\n<Shiporder xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\n";
my %shipto_md5;
find(
sub {
return unless ( /(_stc\.xml)$/ and -f );
extract_information();
return;
},
@ARGV
);
print $allxml "</Shiporder>\n";
sub extract_information {
my $path = $_;
if ( my $reader = XML::LibXML::Reader->new( location => $path )) {
while ( $reader->nextElement( 'data' )) {
my $elem = $reader->readOuterXml();
my $md5 = md5( $elem );
print $allxml $reader->readOuterXml() unless ( $shipto_md5{$md5}++ );
}
}
return;
}
The above script extracts the data node information from all the XML files and stores it in a new XML file. But I have one more node, "details"; I need to extract that information and add it to the file as well. I tried like this:
$reader->nextElement( 'details' );
my $information = $reader->readOuterXml();
I added this in the while loop, but how can I assign or print this data into the same file ($allxml)? Please help me with this problem.
After your suggestion I tried like this, but it gives an error:
#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;
use Carp;
use File::Find;
use File::Spec::Functions qw( canonpath );
use XML::LibXML::Reader;
if ( @ARGV == 0 ) {
push @ARGV, "V:/main/work";
warn "Using default path $ARGV[0]\n Usage: $0 path ...\n";
}
my $libXML = new XML::LibXML;
my $outputDom = $libXML->parse_string('<?xml version="1.0" encoding="UTF-8"?>', '<Shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">');
my $shiporder = $outputDom->documentElement;
find(
sub {
return unless ( /(_stc\.xml)$/ and -f );
extract_information();
return;
},
@ARGV
);
sub extract_information {
my $path = $_;
if (my @inputDom = XML::LibXML->load_xml(location => $path)) {
$inputDom->findnodes('//data || //deatils');
foreach (@$inputDom) {
$shiporder->appendChild($_->parentNode->cloneNode(1));
}
$outputDom->toFile("allfiles.xml");
}
}
but it gives an error like ":1: parser error : Start tag expected, '<' not found". Can you help me with the script? I am very new to Perl.
You would do a lot better if you used what XML::LibXML and related modules give you; it is a very large and comprehensive module and allows you to do a lot in a few lines.
You can use the parser to start a new dom document using parse_string, storing the root node using documentElement. From there, use parse_file to load up each of your input files, then findnodes on the input files to extract the nodes you want to clone. Then append a clone of your input nodes to the output document, and finally use the toFile method to write out your output.
Something like:
my $libXML = XML::LibXML->new;
my $outputDom = $libXML->parse_string(
'<?xml version="1.0" encoding="UTF-8"?>' . "\n"
. '<Shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>' . "\n"
);
my $shiporder = $outputDom->documentElement;
...
my $inputDom = $libXML->parse_file('some_file_name');
my @nodes = $inputDom->findnodes('//data | //details'); # use a more suitable xpath
foreach my $node (@nodes) {
$shiporder->appendChild($node->parentNode->cloneNode(1)); # if you want the parent too...
}
...
$outputDom->toFile('some_output_file');
You will have to allow for namespaces and whatnot, but this gives one approach to start with.
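A self-contained sketch of that approach (the element names are the ones from the question; everything else, such as the merge_nodes name, is illustrative):

```perl
use strict;
use warnings;
use XML::LibXML;

# Build a new document with a <Shiporder> root, then append deep clones of
# every <data> and <details> node found in the input XML string.
sub merge_nodes {
    my ($input_xml) = @_;
    my $dom = XML::LibXML->load_xml(
        string => '<?xml version="1.0" encoding="UTF-8"?>'
                . '<Shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>'
    );
    my $root = $dom->documentElement;

    my $in = XML::LibXML->load_xml(string => $input_xml);
    for my $node ($in->findnodes('//data | //details')) {
        $root->appendChild($node->cloneNode(1));   # deep clone
    }
    return $dom;
}

# merge_nodes('<order><data>x</data><details>y</details></order>')->toFile('allfiles.xml');
```

Note the root element in the parse_string/load_xml call must be well-formed (hence the self-closing <Shiporder .../>); passing the XML declaration and the root as two separate arguments is what produced the "Start tag expected" parser error.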

How to get a list of leaf subdirectories in a root folder in Perl

I am very new to Perl (and scripting languages in general) and I was wondering how to use Perl to get a listing of all the leaf directories under a root folder. For example, let's say my root directory is C:
C: -> I have folder "A" and "B" and files a.txt and b.txt
Folder "A" -> I have folder "D" and file c.html
Folder "B" -> I have folder "E" and "F" and file d.html
Folder "D", "E" and "F" -> bunch of text files
How do I get a bunch of directory paths as output for this scenario of:
C:\A\D\
C:\B\E\
C:\B\F\
As you can see, I just want a list of all the possible leaf directories. I don't want C:\A\ and C:\B\ to show up. After doing some research myself, I have noticed that I may somehow be able to use the File::Find module in Perl, but I am not 100% sure how to go about it.
Thanks for any help you may be able to provide :)
Another approach:
use strict;
use warnings;
use feature qw( say );
use File::Find::Rule qw( );
use Path::Class qw( dir );
my $root = dir('.')->absolute();
my @dirs = File::Find::Rule->directory->in($root);
shift(@dirs);
my @leaf_dirs;
if (@dirs) {
my $last = shift(@dirs) . "/";
for (@dirs) {
push @leaf_dirs, $last if !/^\Q$last/;
$last = $_ . "/";
}
push @leaf_dirs, $last;
}
say for @leaf_dirs;
Or using find's preprocess option:
use strict;
use warnings;
use File::Find;
find({ wanted => sub { 1 }, # required--in version 5.8.4 at least
preprocess => sub { # @_ is the list of files in the current directory
@_ = grep { -d && !/^\.{1,2}$/ } @_;
print "$File::Find::dir\n" unless @_;
return @_;
}
}, ".");
From an answer to the question How to Get the Last Subdirectories by liverpole on Perlmonks:
prints all leaf directories under the current directory (see "./"):
use strict;
use warnings;
my $h_dirs = terminal_subdirs("./");
my #dirs = sort keys %$h_dirs;
print "Terminal Directories:\n", join("\n", @dirs);
sub terminal_subdirs {
my ($top, $h_results) = @_;
$h_results ||= { };
opendir(my $dh, $top) or die "Arrggghhhh -- can't open '$top' ($!)\n";
my @files = readdir($dh);
closedir $dh;
my $nsubdirs = 0;
foreach my $fn (@files) {
next if ($fn eq '.' or $fn eq '..');
my $full = "$top/$fn";
if (!-l $full and -d $full) {
++$nsubdirs;
terminal_subdirs($full, $h_results);
}
}
$nsubdirs or $h_results->{$top} = 1;
return $h_results;
}
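For completeness, here is a core-modules-only sketch of the same idea using File::Find: a directory is a leaf exactly when no other directory has it as its parent. The leaf_dirs name is mine, not from the question.

```perl
use strict;
use warnings;
use File::Find;

# Return the sorted list of leaf directories (directories containing
# no subdirectories) under $root.
sub leaf_dirs {
    my ($root) = @_;
    my %has_subdir;
    find({
        wanted => sub {
            return unless -d;                     # directories only
            $has_subdir{$File::Find::dir} = 1     # parent has a subdir
                unless $File::Find::name eq $root;
            $has_subdir{$File::Find::name} //= 0; # seen, no subdirs yet
        },
        no_chdir => 1,
    }, $root);
    return sort grep { !$has_subdir{$_} } keys %has_subdir;
}

# print "$_\n" for leaf_dirs(".");
```

The //= operator needs Perl 5.10 or later.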