Why does my jzip process hang when I call it with Perl's system? - perl

I am definitely new to Perl, so please forgive me if this seems like a stupid question to you.
I am trying to unzip a bunch of .cab files with jzip in Perl (ActivePerl, jzip, Windows XP):
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use IO::File;
use v5.10;

my $prefix = 'myfileprefix';
my $dir    = '.';

File::Find::find(
    sub {
        my $file = $_;
        return if -d $file;
        return if $file !~ /^$prefix(.*)\.cab$/;
        my $cmd = 'jzip -eo ' . $file;
        system($cmd);
    }, $dir
);
The code decompresses the first .cab file in the folder and then hangs (without any errors) until I press Ctrl+C to stop it. Does anyone know what the problem is?
EDIT: I used processxp to inspect the processes, and I found that the correct number of jzip processes is fired up (one per .cab file residing in the source folder). However, only one of them runs under cmd.exe => perl, and none of these processes shuts down after being fired. It seems to me I need to shut down each process and execute them one by one, but I have no clue how to do that in Perl. Any pointers?
EDIT: I also tried replacing jzip with notepad; it turns out it opens notepad with one file at a time (in sequential order), and only when I manually close notepad is another instance fired. Is this common behavior in ActivePerl?
EDIT: I finally solved it, though I am still not entirely sure why. What I did was remove the XML library from the script, which should not be relevant. Sorry, I removed "use XML::DOM" on purpose in the beginning because I thought it was completely irrelevant to this problem.
OLD:
use strict;
use warnings;
use File::Find;
use IO::File;
use File::Copy;
use XML::DOM;
use DBI;
use v5.10;
NEW:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use IO::File;
use File::Copy;
use DBI;
use v5.10;
my $prefix = 'myfileprefix';
my $dir    = '.';

# retrieve xml file within given folder
File::Find::find(
    sub {
        my $file = $_;
        return if -d $file;
        return if $file !~ /^$prefix(.*)\.cab$/;
        say $file;
        #say $file or die $!;
        my $cmd = 'jzip -eo ' . $file;
        say $cmd;
        system($cmd);
    }, $dir
);
This, however, poses another problem: when the extracted file already exists, the script hangs again. I highly suspect this is a problem with jzip, and an alternative solution is simply to replace jzip with extract, as @ghostdog74 pointed out below.

First off, if you are running commands via system(), you should always redirect their output and errors to a log, or at least process them within your program.
In this particular case, if you do that, you'll have a log of what every single command is doing and will see if/when any of them get stuck.
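A minimal sketch of that logging idea (the log file name is hypothetical, and a stand-in command is used so the snippet is self-contained; in the question's case $cmd would be 'jzip -eo ' . $file):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in command ($^X is the running perl binary); in the question this
# would be 'jzip -eo ' . $file. 2>&1 merges STDERR into STDOUT so both
# streams end up in the log.
my $cmd    = qq{"$^X" -e "print qq(extracting...)"};
my $output = `$cmd 2>&1`;
my $status = $? >> 8;    # shift out the signal bits to get the exit code

open my $log, '>>', 'unzip.log' or die "Cannot open log: $!";
print $log "[$cmd] exited with status $status\n$output\n";
close $log;
```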
Second, just a general tip: it's a good idea to use native Perl libraries wherever possible. In this case that may be impossible, of course (I'm not experienced enough with Perl on Windows to know whether there's a jzip module, but search CPAN).
UPDATE: I didn't find a native Perl CAB extractor, but I found a jzip replacement that might work better and is worth a try: http://www.cabextract.org.uk/ (there's a DOS version which will hopefully work on Windows).

Based on your edit, this is what I suggest:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use IO::File;
use v5.10;

my $prefix = 'myfileprefix';
my $dir    = '.';

my @commands;
File::Find::find(
    sub {
        my $file = $_;
        return if -d $file;
        return if $file !~ /^$prefix(.*)\.cab$/;
        my $cmd = "jzip -eo $File::Find::name";
        push @commands, $cmd;
    }, $dir
);

# asynchronously kick off jzips
my $fresult;
for my $cmd (@commands)
{
    $fresult = fork();
    if (!defined($fresult))
    {
        die("Fork failed");
    }
    elsif ($fresult == 0)    # child
    {
        `$cmd`;
        exit;    # the child must exit here, or it will keep running the loop
    }
    else
    {
        # parent: no-op, just keep moving
    }
}
edit: added asynch. edit2: fixed scope issue.

What happens when you run the jzip command from the DOS window? Does it work correctly? What happens if you add an end-of-line character (\n) to the command in the script? Does that prevent the hang?

Here's an alternative using extract.exe, which you can download here or here:
use strict;
use warnings;
use File::Find;
use IO::File;
use v5.10;

my $prefix = 'myfileprefix';
my $dir    = '.';

File::Find::find({ wanted => \&wanted }, $dir);
exit;

sub wanted {
    my $destination = q(c:\test\temp);
    if ( -f $_ && $_ =~ /^$prefix(.*)\.cab$/ ) {
        my $filename = $_;                 # basename; $File::Find::name would be the full path
        my $path     = $File::Find::dir;
        my $cmd      = "extract /Y $path\\$filename /E /L $destination";
        print $cmd . "\n";
        system($cmd);
    }
}

Although no one has mentioned it explicitly, system blocks until the process finishes. The real problem, as people have noted, is figuring out why the process doesn't exit. Forking or any other parallelization won't help because you'll be left with a lot of hung processes.
Until you can figure out the issue, start small. Make the smallest Perl script that demonstrates the problem:
#!perl
system( '/path/to/jzip', '-eo', 'literal_file_name' ); # full path, list syntax!
print "I finished!\n";
Now the trick is to figure out why it hangs, and sometimes that means different solutions for different external programs. Sometimes you need to close STDIN before you run the external process or it sits there waiting for it to close, sometimes you do some other thing.
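As a concrete sketch of the close-STDIN idea, assuming the hang really is the child waiting for input: redirect the child's STDIN from the null device so it sees EOF instead of blocking (the jzip invocation and file name are taken from the question and not verified here):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The null device is 'nul' on Windows and /dev/null elsewhere.
my $null = $^O eq 'MSWin32' ? 'nul' : '/dev/null';

# Hypothetical invocation from the question: with STDIN redirected from
# the null device, a child that tries to read input gets EOF immediately.
my $cmd = "jzip -eo literal_file_name < $null";
print "would run: $cmd\n";
# system($cmd);    # uncomment where jzip is actually installed
```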
Instead of system, you might also try things such as IPC::System::Simple, which handles a lot of platform-specific details for you, or modules like IPC::Run or IPC::Open3.
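For instance, a sketch with IPC::System::Simple (a CPAN module, so this assumes it is installed): its systemx() takes the command and arguments as a list, bypasses the shell, and dies with a descriptive error if the command cannot be started or exits non-zero.

```perl
use strict;
use warnings;
use IPC::System::Simple qw(systemx);

# Hypothetical file name; wrapping the call in eval turns a failed command
# into a catchable error with a useful message instead of a silent hang.
eval {
    systemx('jzip', '-eo', 'literal_file_name');
    1;
} or warn "jzip failed: $@";
```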
Sometimes it just sucks, and this situation is one of those times.

Related

Perl search for a particular file extension in folder and sub folder

I have a folder which has over 1500 files scattered around in different sub-folders with extension .fna. I was wondering if there is a simple way in Perl to extract all these files and store them in a different location?
As File::Find is recommended everywhere, let me add that there are other, sometimes nicer, options, like https://metacpan.org/pod/Path::Iterator::Rule or Path::Class traverse function.
Which OS are you using? If it's Windows, I think a simple xcopy command would be a lot easier. Open a console window and type "xcopy /?" to get the info on this command. It should be something simple like:
xcopy directory1/*.fna directory2 /s
use File::Find;

my @files;
find(\&search, '/some/path');
doSomethingWith(@files);
exit;

sub search {
    # collect only .fna files; $_ is the basename inside the wanted sub
    push @files, $File::Find::name if /\.fna$/;
    return;
}
Without much more information to go on: you don't need a Perl script for something as easy as this.
Here's a *nix one-liner:
find /source/dir -name "*.fna" -exec mv -t /target/dir '{}' \+ -print
Sorry for the late response; I was away at a conference. Here is my code, which seems to work fine so far.
use strict;
use warnings;
use Cwd;
use FileHandle;

open my $out, '>>', 'results7.txt' or die $!;
my $parent = "/home/denis/Denis_data/Ordered species";
my ($par_dir, $sub_dir);
opendir($par_dir, $parent);

while (my $sub_folders = readdir($par_dir)) {
    next if ($sub_folders =~ /^\.\.?$/);    # skip . and ..
    my $path = $parent . '/' . $sub_folders;
    #my $path = $sub_folders;
    next unless (-d $path);                 # skip anything that isn't a directory
    chdir($path) or die $!;
    system 'perl batch_hmm.pl';
    print $out $path . "\n";
    #chdir('..') or die;
    #closedir($sub_dir);
}
closedir($par_dir);
I will also try the File::Finder option. The above one looks quite messy.

Simple PERL script to loop very quickly

I'm trying to get a perl script to loop very quickly (in Solaris).
I have something like this:
#! /bin/perl
while ('true')
{
    use strict;
    use warnings;
    use Time::HiRes;
    system("sh", "shell script.sh");
    Time::HiRes::usleep(10);
}
I want the perl script to execute a shell script every 10 microseconds. The script doesn't fail but no matter how much I change the precision of usleep within the script, the script is still only being executed approx 10 times per second. I need it to loop much faster than that.
Am I missing something fundamental here? I've never used perl before but I can't get the sleep speed I want in Solaris so I've opted for perl.
TIA
Huskie.
EDIT:
Revised script idea thanks to user comments: I'm now trying to do it all within Perl and failing miserably!
Basically I'm trying to run the ps command to capture processes; if the process exists, I want to capture the line and output it to a text file.
#! /bin/perl
while ('true')
{
    use strict;
    use warnings;
    use Time::HiRes;
    open(PS,"ps -ef | grep <program> |egrep -v 'shl|grep' >> grep_out.txt");
    Time::HiRes::usleep(10);
}
This returns the following error:
Name "main::PS" used only once: possible typo at ./ps_test_loop.pl line 9.
This is a pure Perl program (it does not launch any external process) that looks for processes running some particular executable:
#!/usr/bin/perl
use strict;
use warnings;

my $cmd    = 'lxc-start';
my $cmd_re = qr|/\Q$cmd\E$|;

$| = 1;

while (1) {
    opendir PROC, "/proc" or die $!;
    while (defined(my $pid = readdir PROC)) {
        next unless $pid =~ /^\d+$/;
        if (defined(my $exe = readlink "/proc/$pid/exe")) {
            if ($exe =~ $cmd_re) {
                print "pid: $pid\n";
            }
        }
    }
    closedir PROC;
    # sleep 1;
}
On my computer this runs at 250 times/second.
The bottleneck is the creation of processes, pipes, and opening the output file. You should be doing that at most once, instead of doing it in each iteration. That's why you need to do everything in Perl if you want to make this faster. Which means: don't call the ps command, or any other command. Instead, read from /proc or use Proc::ProcessTable, as the comments suggest.
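A sketch of the Proc::ProcessTable route the comments suggest (a CPAN module, Unix-only; the 'lxc-start' pattern matches the example above):

```perl
use strict;
use warnings;
use Proc::ProcessTable;

# One pass over the kernel's process table: no ps, no pipes, no grep.
my $table = Proc::ProcessTable->new;
for my $proc (@{ $table->table }) {
    next unless $proc->cmndline =~ /lxc-start/;
    print $proc->pid, "\t", $proc->cmndline, "\n";
}
```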
Incidentally: the use statement is executed only once (it is essentially shorthand for a require statement wrapped in a BEGIN { } block), so you might as well put it at the top of the file for clarity.
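A sketch of that equivalence, using Time::HiRes from the script above:

```perl
# These two forms do roughly the same thing, and both run exactly once
# at compile time, which is why the use belongs at the top of the file.
use Time::HiRes;

BEGIN {
    require Time::HiRes;
    Time::HiRes->import;
}
```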

Recursive directory traversal in Perl

I'm trying to write a script that prints out the file structure starting at the folder the script is located in. The script works fine without the recursive call, but with that call it prints the contents of the first folder and crashes with the following message: closedir() attempted on invalid dirhandle DIR at printFiles.pl line 24. The folders are printed, and the execution reaches the last line, so why isn't the recursive call made? And how should I solve this instead?
#!/usr/bin/perl -w
printDir(".");

sub printDir {
    opendir(DIR, $_[0]);
    local(@files);
    local(@dirs);
    (@files) = readdir(DIR);
    foreach $file (@files) {
        if (-f $file) {
            print $file . "\n";
        }
        if (-d $file && $file ne "." && $file ne "..") {
            push(@dirs, $file);
        }
    }
    foreach $dir (@dirs) {
        print "\n";
        print $dir . "\n";
        printDir($dir);
    }
    closedir(DIR);
}
You should always use strict; and use warnings; at the start of your Perl program, especially before you ask for help with it. That way Perl will show up a lot of straightforward errors that you may not notice otherwise.
The invalid filehandle error is likely because DIR is a global directory handle and has already been closed by a previous execution of the subroutine. It is best to always use lexical handles for both files and directories, and to test the return code to make sure the open succeeded, like this
opendir my $dh, $_[0] or die "Failed to open $_[0]: $!";
One advantage of lexical file handles is that they are closed implicitly when they go out of scope, so there is no need for your closedir call at the end of the subroutine.
local isn't meant to be used like that. It doesn't suffice as a declaration, and you are creating a temporary copy of a global variable that everything can access. It is best to use my instead, like this
my @dirs;
my @files = readdir $dh;
Also, the file names you are using from readdir have no path, and so your file tests will fail unless you either chdir to the directory being processed or append the directory path string to the file name before testing it.
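A sketch of that last point: build the full path before applying -f or -d to names returned by readdir (the directory here is just an example):

```perl
use strict;
use warnings;

my $dir = '.';    # example directory
opendir my $dh, $dir or die "Failed to open $dir: $!";
for my $entry (readdir $dh) {
    next if $entry eq '.' or $entry eq '..';
    my $path = "$dir/$entry";    # file tests need the path, not the bare name
    print "file: $path\n"      if -f $path;
    print "directory: $path\n" if -d $path;
}
closedir $dh;
```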
Use the File::Find module. The way I usually do this is with the find2perl tool that comes with Perl, which takes the same parameters as find and creates a suitable Perl script using File::Find. Then I fine-tune the generated script to do what I want. But it's also possible to use File::Find directly.
Why not use File::Find?
use strict; #ALWAYS!
use warnings; #ALWAYS!
use File::Find;
find(sub{print "$_\n";},".");

perl mktemp and echo

I am trying to put some words into a temp file via the command line.
The temp file is created, but the words are not put into it.
#!/usr/bin/perl -w
system ('clear');
$TMPFILE = "mktemp /tmp/myfile/devid.XXXXXXXXXX";
$echo = "echo \"hello world\" >$TMPFILE";
system ("$TMPFILE");
system ("$echo");
Please help me solve this.
To capture the name output by mktemp, do this instead:
chomp($TMPFILE = `mktemp /tmp/myfile/devid.XXXXXXXXXX`);
But Perl can do all the things you are doing without resorting to the shell.
Avoid using external commands from a Perl script as much as possible.
You can use the File::Temp module in this case.
Here's a specific demonstration of the advice that others have given you: where possible, use Perl directly rather than invoking system. Also, you should get in the habit of including use strict and use warnings in your Perl scripts.
use strict;
use warnings;
use File::Temp;

my $ft = File::Temp->new(
    UNLINK   => 0,
    TEMPLATE => '/tmp/myfile/devid.XXXXXXXXXX',
);

print "Writing to temp file: ", $ft->filename, "\n";
print $ft "Hello, world.\n";

How do I get a directory listing in Perl? [duplicate]

This question already has answers here:
How do I read in the contents of a directory in Perl?
(9 answers)
Closed 7 years ago.
I would like to execute ls in a Perl program as part of a CGI script. For this I used exec(ls), but this does not return from the exec call.
Is there a better way to get a listing of a directory in Perl?
Exec doesn't return at all. If you wanted that, use system.
If you just want to read a directory, open/read/close-dir may be more appropriate.
opendir my($dh), $dirname or die "Couldn't open dir '$dirname': $!";
my #files = readdir $dh;
closedir $dh;
#print files...
Everyone else seems stuck on the exec portion of the question.
If you want a directory listing, use Perl's built-in glob or opendir. You don't need a separate process.
exec does not give control back to the Perl program.
system will, but it does not return the output of ls; it returns a status code.
Backticks (``) will give you the output of your command, but are considered by some to be unsafe.
Use the built-in directory functions: opendir, readdir, and so on.
http://perldoc.perl.org/functions/opendir.html
http://perldoc.perl.org/functions/readdir.html
In order to get the output of a system command you need to use backticks.
$listing = `ls`;
However, Perl is good in dealing with directories for itself. I'd recommend using File::Find::Rule.
Yet another example:
chdir $dir or die "Cannot chdir to $dir: $!\n";
my #files = glob("*.txt");
Use Perl Globbing:
my @files = </dir/path/*>;
EDIT: Whoops! I thought you just wanted a listing of the directories... remove the 'directory' call to make this script do what you want it to...
Playing with filehandles is the wrong way to go, in my opinion. The following is an example of using File::Find::Rule to find all the directories in a specified directory. It may seem like overkill for what you're doing, but later down the line it may be worth it.
First, my one line solution:
File::Find::Rule->maxdepth(1)->directory->in($base_dir);
Now a more drawn out version with comments. If you have File::Find::Rule installed you should be able to run this no problem. Don't fear the CPAN.
#!/usr/bin/perl
use strict;
use warnings;

# See http://search.cpan.org/~rclamp/File-Find-Rule-0.32/README
use File::Find::Rule;

# If a base directory was not passed to the script, assume the current working directory
my $base_dir = shift // '.';

my $find_rule = File::Find::Rule->new;

# Do not descend past the first level
$find_rule->maxdepth(1);

# Only return directories
$find_rule->directory;

# Apply the rule and retrieve the subdirectories
my @sub_dirs = $find_rule->in($base_dir);

# Print out the name of each directory on its own line
print join("\n", @sub_dirs);
I would recommend you have a look at IPC::Open3. It allows for far more control over the spawned process than system or the backticks do.
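For instance, a minimal IPC::Open3 sketch (a core module) that captures both the output and the exit status of ls:

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol qw(gensym);

# open3 returns separate handles for the child's STDIN, STDOUT and STDERR.
my $err = gensym;
my $pid = open3(my $in, my $out, $err, 'ls');

my @listing = <$out>;    # everything ls printed, one line per entry
waitpid $pid, 0;         # reap the child; its exit status lands in $?
my $status = $? >> 8;

print "ls exited with $status and printed ", scalar(@listing), " lines\n";
```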
On Linux, I prefer find:
my @files = map { chomp; $_ } `find`;