I am trying to write a Perl script that brings up a Postgres client once and sends several files through it, capturing the output separately for each file.
If I do something like:
system ("cat $query1.sql | psql -p 2070 super &> $HOME/Results1.txt");
system ("cat $query2.sql | psql -p 2070 super &> $HOME/Results2.txt");
then Perl starts a new client for each query. As I'll be running hundreds and maybe thousands of queries, I want to avoid that start-up overhead for all but the first one.
I think I should be able to bring up the Postgres client via IPC::Open2, but it hangs when I try. I'm doing this on a SUSE Linux machine with Perl 5.10.0.
Here's my code:
use IPC::Open2;
use IO::Handle;
our $pid = open2(*CHILDOUT, *CHILDINT, '../installdir/bin/psql -p 2070 super');
print STDOUT $pid;
print CHILDINT "cat $dumpR5KScript";
print STDOUT 'Sent the commands';
$output = <CHILDOUT>;
close(CHILDIN);
close(CHILDOUT);
It appears to be hanging with "open2" because I never see the pid.
Can someone point out what I am doing wrong so my call to open2 doesn't hang?
And if anyone has advice on the larger issue of the best way of bringing up a Postgres client and running queries through it I would be grateful.
You have already been told in the comments to your post to use DBI, and it would be a good thing if you did. It is much easier than fiddling with IPC and stitching together a makeshift API between Perl and a command-line database client, parsing its output and formatting its input.
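For instance, something along these lines connects once and reuses the handle for every query file. It is only a sketch: the connection parameters and file names are placeholders taken from your post, and it assumes each .sql file holds a single statement.
use strict;
use warnings;
use DBI;

# Connect once; database name and port are placeholders from the question.
my $dbh = DBI->connect("dbi:Pg:dbname=super;port=2070", "", "",
                       { RaiseError => 1, AutoCommit => 1 });

my $n = 1;
for my $file ("query1.sql", "query2.sql") {          # hypothetical file names
    open my $in, '<', $file or die "Can't read $file: $!";
    my $sql = do { local $/; <$in> };                 # slurp the whole file

    open my $out, '>', "$ENV{HOME}/Results$n.txt"
        or die "Can't write Results$n.txt: $!";

    my $sth = $dbh->prepare($sql);
    $sth->execute;
    while (my @row = $sth->fetchrow_array) {
        print {$out} join("\t", map { defined $_ ? $_ : '' } @row), "\n";
    }
    $n++;
}
$dbh->disconnect;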
Regarding your problem, however:
It should be \*CHILDIN instead of *CHILDIN (a reference to a typeglob rather than the typeglob itself),
and in any case you should use lexical variables instead of typeglobs and other ancient idioms:
my ($childout, $childin);
our $pid = open2( $childout, $childin, '../installdir/bin/psql -p 2070 super');
print STDOUT $pid;
Please read the documentation for IPC::Open2.
Also, it is better to use open3 so that you can handle STDERR as well.
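A minimal sketch, assuming the same psql path and port as in the question and a throwaway query:
use IPC::Open3;
use Symbol 'gensym';

my $err = gensym;                          # open3 will not autovivify the STDERR handle
my $pid = open3(my $in, my $out, $err,
                '../installdir/bin/psql', '-p', '2070', 'super');

print {$in} "SELECT 1;\n";
close $in;                                 # tell psql there is no more input

# Note: reading one stream to the end before the other can still deadlock
# if the child fills the other pipe's buffer first.
my @stdout_lines = <$out>;
my @stderr_lines = <$err>;
waitpid $pid, 0;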
Finally, I do not know the Postgres client, but the possibility of a deadlock (which you are experiencing) is very real with open2. As the IPC::Open2 documentation warns:
This whole affair is quite dangerous, as you may block forever. It assumes it's going to talk to something like bc, both writing to it and reading from it. This is presumably safe because you "know" that commands like bc will read a line at a time and output a line at a time. Programs like sort that read their entire input stream first, however, are quite apt to cause deadlock.
The big problem with this approach is that if you don't have control over source code being run in the child process, you can't control what it does with pipe buffering. Thus you can't just open a pipe to cat -v and continually read and write a line from it.
The IO::Pty and Expect modules from CPAN can help with this, as they provide a real tty (well, a pseudo-tty, actually), which gets you back to line buffering in the invoked command again.
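For example, here is a rough Expect-based sketch that keeps one psql session open and feeds it script files; the prompt pattern and file names are assumptions, not tested against your setup.
use Expect;

my $exp = Expect->spawn('../installdir/bin/psql', '-p', '2070', 'super')
    or die "Cannot spawn psql: $!";

$exp->expect(10, '-re', 'super=[#>] ')            # wait for the psql prompt
    or die "Never saw a psql prompt";

for my $file ('query1.sql', 'query2.sql') {       # hypothetical file names
    $exp->send("\\i $file\n");                    # psql's \i runs a script file
    $exp->expect(60, '-re', 'super=[#>] ');       # wait for the next prompt
    my $output = $exp->before;                    # everything printed before the prompt
    # ... write $output to the per-file results file here ...
}

$exp->send("\\q\n");
$exp->soft_close;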
Related
I have a program with lots of system commands to handle searching for, examining, and killing processes:
system qq(kill $pid);
and
for my $pid ( qx( pgrep -f "$pgrep_re") ) {
chomp $pid;
...
}
and
my $command_line = qx(ps -o command="" $pid);
chomp $command_line;
....
Not only is this system specific, but I'm depending upon the user to have these particular commands in their path the correct way, leaving me with a system security issue (like some joker setting alias ps="rm -rf *").
I would like to do this stuff in a nice, Perl way which would be less dependent upon all of these system commands and be a bit more platform independent.¹
Is there a Perl module for this? Bonus points for one that does it in a nice object-oriented way and doesn't depend on those very same external commands.
1. A lot of this deals with using ssh and setting up tunnels, so since Windows doesn't have ssh as a native command, I'm willing to exclude it as long as this works well for other Unix/Linux systems.
kill: use the builtin kill (perldoc -f kill)
ps: search CPAN; there is UNIX::Process. On Linux you could also scan through /proc/
pgrep: combine ps with perl pattern matching
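As a rough illustration of the last two points on Linux (the pattern and signal are placeholders), something like this avoids shelling out to ps, pgrep and kill entirely:
use strict;
use warnings;

my $pattern = qr/my_daemon/;                       # hypothetical process pattern

opendir my $proc, '/proc' or die "Can't read /proc: $!";
for my $pid (grep { /^\d+$/ } readdir $proc) {
    open my $fh, '<', "/proc/$pid/cmdline" or next;   # process may have exited
    my $cmdline = do { local $/; <$fh> };
    next unless defined $cmdline;
    $cmdline =~ tr/\0/ /;                          # arguments are NUL-separated
    if ($cmdline =~ $pattern) {
        print "Killing $pid ($cmdline)\n";
        kill 'TERM', $pid;                         # Perl's builtin kill
    }
}
closedir $proc;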
I have a question about this answer, quoted below, by friedo to another question here. (I don't have permission to comment on it, so I am asking this as a question.)
"You can use File::Tee.
use File::Tee qw(tee);
tee STDOUT, '>>', 'some_file.out';
print "w00p w00p";
If File::Tee is unavailable, it is easily simulated with a pipeline:
open my $tee, "|-", "tee some_file.out";
print $tee "w00p w00p";
close $tee;
Are both of these tees the same? Or is one from Perl and the other from Linux/Unix?
They're mostly the same, but the implementation details differ.
Opening a pipe to tee some_file.out forks a new process and runs the Unix / Linux utility program tee(1) in it. This program reads its standard input (i.e. anything you write to the pipe) and writes it both to some_file.out as well as to stdout (which it inherits from your program).
Obviously, this will not work under Windows, or on any other system that doesn't provide a Unix-style tee command.
The File::Tee module, on the other hand, is implemented in pure Perl, and doesn't depend on any external programs. However, according to its documentation, it also works by forking a new process and running what is essentially a Perl reimplementation of the Unix tee command under it. This does have some advantages, as the documentation states:
"It is implemeted around fork, creating a new process for every tee'ed stream. That way, there are no problems handling the output generated by external programs run with system or by XS modules that don't go through perlio."
On the other hand, the use of fork has its down sides as well:
"BUGS
Does not work on Windows (patches welcome)."
If you do want a pure Perl implementation of the tee functionality that works on all platforms, consider using IO::Tee instead. Unlike File::Tee, this module is implemented using PerlIO and does not use fork.
Alas, this also means that it may not correctly capture the output of external programs executed with system or XS modules that bypass PerlIO.
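For comparison, a minimal IO::Tee equivalent of the snippets above:
use IO::Tee;

open my $log, '>>', 'some_file.out' or die "Can't open some_file.out: $!";
my $tee = IO::Tee->new(\*STDOUT, $log);   # one handle that writes to both
print $tee "w00p w00p\n";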
In a nutshell: I wrote a Perl script using flock(). On Linux it behaves as expected. On AIX, flock() always returns 1, even though another instance of the script, using flock(), should be holding an exclusive lock on the lockfile.
We ship a Bash script that restarts our program, relying on flock(1) to prevent simultaneous restarts from spawning multiple processes. Recently we deployed on AIX, where flock(1) doesn't come by default and won't be provided by the admins. Hoping to keep things simple, I wrote a Perl script called flock, like this:
#!/usr/bin/perl
use Fcntl ':flock';
use Getopt::Std 'getopts';
getopts("nu:x:");
%switches = (LOCK_EX => $opt_x, LOCK_UN => $opt_u, LOCK_NB => $opt_n);
my $lockFlags = 0;
foreach $key (keys %switches) {
if($switches{$key}) {$lockFlags |= eval($key)};
}
$fileDesc = $opt_x || $opt_u;
open(my $lockFile, ">&=$fileDesc") || die "Can't open file descriptor: $!";
flock($lockFile, $lockFlags) || die "Can't change lock - $!\n";
I tested the script by running (flock -n -x 200; sleep 60) 200>lockfile twice, nearly simultaneously, from two terminal tabs.
On Linux, the second run dies with "Resource temporarily unavailable", as expected.
On AIX, the second run acquires the lock, with flock() returning 1, as most definitely not expected.
I understand that flock() is implemented differently on the two systems, the Linux version using flock(2) and the AIX one using, I think, fcntl(2). I don't have enough expertise to understand how this causes my problem, or how to solve it.
Many thanks for any advice.
This doesn't have anything to do with AIX; the open() call in your script is incorrect.
It should be something like:
open(my $lockFile, ">>", $fileDesc) or die "Can't open $fileDesc: $!"; # for LOCK_EX, the file must be opened for writing
You were using the "dup() previously opened file handle" syntax with >&=, but the script had not opened any files to duplicate, nor should it.
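Put together, the adjusted script might look roughly like this (a sketch only; the flag handling is simplified, and the debug prints and trailing sleep are additions for testing, not part of your original):
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl ':flock';
use Getopt::Std 'getopts';

our ($opt_n, $opt_u, $opt_x);
getopts("nu:x:");

my $lockFlags = 0;
$lockFlags |= LOCK_EX if $opt_x;
$lockFlags |= LOCK_UN if $opt_u;
$lockFlags |= LOCK_NB if $opt_n;

my $file = $opt_x || $opt_u;
open(my $lockFile, ">>", $file) or die "Can't open $file: $!";
print "opened $file\n";
flock($lockFile, $lockFlags) or die "Can't change lock - $!\n";
print "locked\n";
sleep 60;    # hold the lock so a second invocation can be tested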
My quick tests show the correct behavior (debugging added):
first window:
$ ./flock.pl -n -x lockfile
opened lockfile
locked
second window:
$ ./flock.pl -n -x lockfile
opened lockfile
Can't change lock - Resource temporarily unavailable
$
It's not about different commands, I suppose; it's more about global differences between AIX and Linux.
On POSIX systems, file locks are advisory: each program is expected to check the lock state itself and decide what to do with the file. No explicit checks = no locking.
On Linux, however, one can try to enforce a mandatory lock, although the documentation itself states that it would be unwise to rely on this: the implementation is (and probably always will be) buggy.
Therefore, I suggest implementing such advisory-lock checks within the script itself.
More about it: man 2 fcntl, man 2 flock.
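A minimal example of such a check, assuming both cooperating processes agree on the same lockfile path:
use Fcntl ':flock';

open my $fh, '>>', '/tmp/myapp.lock' or die "Can't open lockfile: $!";
if (flock($fh, LOCK_EX | LOCK_NB)) {
    print "Got the lock\n";
    # ... do the protected work ...
    flock($fh, LOCK_UN);
} else {
    die "Another instance holds the lock: $!\n";
}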
I am using Perl to execute psexec and capture the output from the console. What seems odd to me is that when I execute the command with backticks, it correctly captures output every time.
For example, this Perl script works, and I've used this for many years on many different configurations:
use strict;
my @out;
@out = `psexec \\\\192.168.1.105 -u admin -p pass netstat -a`;
print @out;
This Perl script fails, and seems to reliably cause psexesvc to hang on the remote system:
use IPC::Open2;
my($chld_out, $chld_in, $pid);
$pid = open2($chld_out, $chld_in, 'psexec \\\\192.168.1.105 -u admin -p pass netstat -a');
waitpid( $pid, 0 );
my $child_exit_status = $? >> 8;
my $answer = <$chld_out>;
print "\n\n answer: $answer";
What is so strange to me is that backticks seem to never have any problem. Everything else does, including examples in C++ from MSDN.
My suspicion is that the problem with IPC::Open2 and the example in C++ (linked above) is related to the fact that I'm redirecting STDIN and STDOUT from the command shell (cmd.exe), and the child process (psexec) does the same thing when communicating with my remote system.
Also, where in the perldocs can I find detailed information on how backticks work? I'm most interested in their "internals" on Windows.
Or, where in the Perl source can I review the inner workings of backticks (that may be biting off more than I can chew, but it's worth a shot at this point).
UPDATE:
Following Andy's suggestion, I found this works:
use IPC::Open2;
my($chld_out, $chld_in, $pid);
$pid = open2($chld_out, $chld_in, 'psexec \\\\192.168.1.105 -u admin -p pass netstat -a');
my @answer = <$chld_out>;
print "\n\n answer: @answer";
waitpid( $pid, 0 );
my $child_exit_status = $? >> 8;
I know very little about how this works on Windows, so maybe somebody can provide a more specific answer, but when piping between processes in Perl, you need to be careful to avoid undesired blocking and deadlocks. There is some discussion of various problem scenarios in perlipc.
In your example, the immediate call to waitpid causes problems. One possibility is that the child cannot exit until something reads the output, so everything hangs since the parent is not going to read the output. Another possibility is that part of the data stream is shut down as part of the waitpid call, and this causes a problem with the remote process.
In any case, it would be better to read the output from the child process before calling waitpid.
I want to measure the throughput of a link using the Windows built-in FTP tool inside a Perl script. The script therefore creates the following command script:
open <ip>
<username>
<password>
hash
get 500k.txt
quit
Afterwards I run the command script using the following Perl code:
@args = ("ftp", "-s:c:\\ftp_dl.txt");
system(@args);
If I run the command inside a DOS-box the output looks like this:
ftp> open <ip>
Connected to <ip>
220 "Welcome to the fast and fabulous DUFTP005 ftp-server :-) "
User (<ip>:(none)):
331 Please specify the password.
230 Login successful.
ftp> hash
Hash mark printing On ftp: (2048 bytes/hash mark) .
ftp> get 500k.txt
200 PORT command successful. Consider using PASV.
150 Opening BINARY mode data connection for 500k.txt (14336 bytes).
#######
226 File send OK.
ftp: 14336 bytes received in 0.00Seconds 14336000.00Kbytes/sec.
ftp> quit
221 Goodbye.
To be able to get the throughput I need to extract this line:
ftp: 14336 bytes received in 0.00Seconds 14336000.00Kbytes/sec.
I'm not very familiar with Perl. Does anybody have an idea how to get that line?
Use either open in pipe mode:
open($filehandle, "$command|") or die "did not work: $! $?";
while (<$filehandle>) {
    # do something with $_
}
or use backticks:
my @programoutput = `$command`;
You can't get the output with system().
Instead use backticks:
my $throughput = 0;
my $output = `ftp -s:c:\\ftp_dl.txt`;
if (($? == 0) && ($output =~ /([\d+\.]+)\s*K?bytes\/sec/m)) {
$throughput = $1;
}
$output will contain all the lines from the execution of the ftp command (but not any error message sent to STDERR).
Then we check if ftp returned success (0) and if we got a throughput somewhere in the output.
If so, we set $throughput to it.
This being Perl, there are many ways to do this:
You could also use the Net::FTP module, which supports Windows, to handle the file transfer, and a timing module like Time::HiRes to time it and calculate your throughput.
This way you won't depend on the ftp program (your script would not work on a localised version of Windows, for instance, without a lot of rework, and you have to rely on the ftp program being installed and in the expected location).
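A sketch of that approach; the host, credentials and file name are placeholders:
use Net::FTP;
use Time::HiRes qw(gettimeofday tv_interval);

my $ftp = Net::FTP->new('192.168.1.1') or die "Cannot connect: $@";
$ftp->login('username', 'password')    or die "Login failed: ", $ftp->message;
$ftp->binary;

my $t0 = [gettimeofday];
$ftp->get('500k.txt')                  or die "Download failed: ", $ftp->message;
my $elapsed = tv_interval($t0);
$ftp->quit;

my $bytes = -s '500k.txt';
printf "Received %d bytes in %.3f s (%.1f KB/s)\n",
       $bytes, $elapsed, $elapsed ? $bytes / 1024 / $elapsed : 0;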
See perlfaq8, which has several answers that deal with this topic. The ones you probably need for this question are:
Why can't I get the output of a command with system()?
How can I capture STDERR from an external command?
Also, you might be interested in some of the IPC (Interprocess Communication) Perl modules that come in the standard library:
IPC::Open2
IPC::Open3
Some of the Perl documentation might also help:
perlipc - Perl interprocess communication
perlopentut - Perl open tutorial
If you're not familiar with the Perl documentation, you might check out my Perl documentation documentation.
Good luck,
You should try libcurl, which is better suited to this task.
It has an easy-to-use API.
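For example, through the WWW::Curl::Easy binding; this is only a sketch, so double-check the option and constant names against that module's documentation:
use WWW::Curl::Easy;

my $curl = WWW::Curl::Easy->new;
my $body;
open my $sink, '>', \$body or die $!;              # collect the download in memory

$curl->setopt(CURLOPT_URL, 'ftp://username:password@192.168.1.1/500k.txt');
$curl->setopt(CURLOPT_WRITEDATA, $sink);

my $retcode = $curl->perform;
die "curl failed with code $retcode\n" if $retcode;

# libcurl measures the transfer itself; the speed is reported in bytes/second.
printf "%.1f KB/s\n", $curl->getinfo(CURLINFO_SPEED_DOWNLOAD) / 1024;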