How can I pause Perl processing without hard-coding the duration? - perl

I have a Perl script that contains this code snippet, which calls the system shell to get some files by SFTP and unzip them with WinZip:
# Run script to get files from remote server
system "exec_SFTP.vbs";
# Unzip any files that were retrieved
foreach $zipFile (<*.zip>) {
system "wzunzip $zipFile";
}
Even if some files are retrieved, they are never unzipped, because by the time the files are retrieved and the SFTP connection is closed, the Perl script has already completed the unzip step, with the result that it doesn't find anything to unzip.
My short-term fix is to insert
sleep(60);
before the unzip step, but that assumes that the SFTP connection will finish within 60 seconds, which may sometimes be a gross over-estimate, and other times an under-estimate.
Is there a more sound way to cause Perl to pause until the SFTP connection is closed before proceeding with the unzip step?
Edit: Responders have questioned (and reasonably so) the use of a VB script rather than having Perl do the file transfer. It has to do with security -- the VB script is maintained by others and is authorized to do the SFTP.

Check the code in your *.vbs file. The system function waits for the child process to finish before execution continues. It appears that your *.vbs file is forking a background task to do the FTP and returning immediately.

In a perfect world your script would be rewritten to use Net::SFTP::Foreign and Archive::Extract..
An ugly quick-hackish kind of way might be to create a touch-file before your first system call, alter your sftp-fetching script to delete the file once it is done and have a while like so
while(-e 'touch.file') {
sleep 5;
}
# foreach [...]
Of course, you would need to take care if your .vbs fails and leaves the touchfile undeleted and many other bad side effects. This would be for a quick solution (if none of the other suggestions work) until you get the time to rewrite without system() calls.

You need a way for Perl to wait until the SFTP transfer is done, but as your script is currently written, Perl has no way of knowing this. (It looks like you're combining at least two scripting languages and a (GUI?) SFTP client; this can work, but it's not exactly reliable or robust. Why use VBscript to start the SFTP transfer?)
I can think of four options:
Your Perl script could do the SFTP transfer itself, using something like CPAN's Net::SFTP module, rather than spawning an external job whose status it cannot track.
Your Perl script could spawn a command-line SFTP utility (like PSFTP) that doesn't return until the transfer is done.
Or change exec_SFTP.vbs script to not return until the transfer is done.
If you're currently using a graphical SFTP client and can't switch for whatever reason, I'd recommend using a scripting language like AutoIt instead of Perl. AutoIt has features to wait for windows to change state and so on, so it could more easily monitor for an activity's completion.
Options 1 or 2 would be the most robust and reliable.

The best I can suggest is modifying exec_SFTP.vbs to exit only after the file transfer is complete. system waits for the program it called to complete, so that should solve your problem:
system LIST
system PROGRAM LIST
Does exactly the same thing as "exec LIST", except
that a fork is done first, and the parent process
waits for the child process to complete.

If you can't modify the vbs script to stay alive until it terminates, you may be able to track subprocess creation. If you get subprocess ids, you can monitor them thereby know when the vbs' various offspring terminate.
Win32::Process::Info lets you get a subprocess ids from a running process.

Maybe this is a dumb question, but why not just use the Net::SFTP and Archive::Extract Perl modules to download and unzip the files?

system will not return until the shell it's running the command in has returned; this may be wrong for launching graphical programs and file associations.
See if any of the following help?
system('cscript exec_SFTP.vbs');
use Win32::Process;
use Win32;
Win32::Process::Create(my $proc, 'wscript.exe',
'wscript exec_SFTP.vbs', 0, NORMAL_PRIORITY_CLASS, '.');
$proc->Wait(INFINITE);

Have a look at IPC::Open3
IPC::Open3 - open a process for reading, writing, and error handling using open3()

Related

Using Perl modules vs. using system() calls

Quite recently, I wrote a few scripts in Perl for a cPanel plugin in which, though most of the code was in Perl, there was quite a lot of system() commands as well which I used to execute shell commands directly.
I am pretty sure that there are Perl modules that I could have used instead. Keeping in mind the time crunch, I thought using the system command was easier (to complete the project in time). In retrospective, I think that was a bad programming practice.
My question is, is there any tradeoff, memory-wise or otherwise when using Perl's modules and using system() commands. For example, what would be the difference in using:
my $directory = "temp";
mkdir $directory;
and
system ("mkdir temp");
Also, if I am to use Perl modules, wouldn't that involve installing a whole lot of modules in the beginning?
The most obvious economy is that, in the first case, your Perl process is creating the directory, while in the second, Perl is starting a new process that runs a command shell which parses the command line and runs the shell mkdir command to create the directory, and then the child process is deleted. You would be creating and deleting a process and running the shell for every call to system: there is no caching of processes or similar economy.
The second thing that comes to mind is that, if your original mkdir fails, it is simple to handle the error in Perl, whereas shelling out to run a mkdir command puts your program at a distance from the error, and it is far more awkward to handle the many different problems that may arise.
There is also the question of maintainability and portability, which will affect you even if you aren't expecting to run your program on more than one machine. Once you abandon control to a system command you have no control over what happens. I could have written a mkdir that will delete your home directory or, less disastrously, your program may find itself on a system where mkdir doesn't exist, or does something slightly different.
In the particular case of mkdir, this is a built-in Perl operator and is part of every Perl installation. There are also many core libraries that require you to put use Module in your program, but are already installed and need no further action.
I am sure others will come up with more reasons to prefer a Perl operator or module over a shell command. In general you should prefer to keep everything you can within the language. There are only a few cases where you have to run a third-party program, and they usually involve custom software that allows you act on proprietary data formats.

Pausing a perl script while SFTP transfers files

FYI, I'm a complete newbie with Perl, as in I can spell it and only a little more so I'm trying to learn. What I'm trying to accomplish is using SFTP to transfer files from a Windows machine to a Linux machine.
I've noticed that Perl issues the SFTP get command, but doesn't wait for the transfer to finish so when the Perl script tries to use a file it can't find it. I know there is the sleep command, but the number and size of files will vary on a weekly basis so using sleep(600) seems a little silly.
Is there a standard way to pause a Perl script until SFTP finishes transferring all necessary files?
TIA.
Using Net::SFTP might have solved this dilemma, but my workplace won't allow me to download and install stuff, especially in production. So rather than waiting on the typical bureaucracy I did some more digging around and discovered this:
By calling SFTP in batch mode using a separate file that contains the SFTP commands, the Perl script has to wait for SFTP to finish executing the commands in the separate "command" file. So by using the batch mode option, the Perl script is paused as long as it takes for SFTP to finish its work of file transfer.

Should I turn a perl script that parses a /var/log/.* file into a daemon?

I am writing a perl script to parse, for example, /var/log/syslog.
The perl script triggers further subsequent tasks when particular events in the log appear. The log is parsed following the advice of this post:
Command line: monitor log file and add data to database
Which what I believe is the use of a pipe.
Now I'd like this script to forever run in the background.
This sounds like a daemon to me, and the daemon program referenced in the following question seems ideal:
How can I run a Perl script as a system daemon in linux?
But from this post, it seems clear that daemon's have no open file handles. So how can I have a daemon, or a perl script that becomes a daemon, that monitors a logfile?
It sounds like what you want is a daemon. In that case the advise given in the second post you reference is the best practice. However, you do have other options like daemontools, which removes the fork complexity.
Daemons are allowed to have filehandles, but you should close STDIN, STDOUT, and STDERRR because you shouldn't really use them anymore. A lot of this has to do with the way fork works in *nix systems. Just open the pipe filehandle after your second fork, and you shouldn't have any issues.
this doesn't answer your question, but is another route to consider which may or may not be appropriate for you:
rsyslog can execute a program when a certain message is logged
see Filter Conditions for setting up the up the trigger, Templates for formatting the output that's passed to the script, and Actions > Shell Execute for specifying the executable.
Be sure to read the security implications, and that ryslog blocks while the external program runs. But if your script runs reliably quickly, it may be an option.

parallel SSH in perl

I am trying to create a script in perl which can ssh to multiple hosts (500+), execute a desired command, and show output on the screen. I have done this with the Net::OpenSSH module, as ssh-keys are not configured and I am not allowed to configure those. So, I have to use a thing which can supply the password while doing ssh.
Due to the many connections, it takes considerable time while doing the thing. I searched for "parallel ssh in perl" and discovered that there is a module for opening parallel ssh (Net::OpenSSH:Parallel), but I read somewhere on some forums that I cannot capture output with this module like I can capture using Net::OpenSSH ($ssh->caputre(ls)).
So, how can I accomplish parallel ssh in a more expedient manner? Also, I welcome any other suggestions I can use to save time. Would using Net:OpenSSH in threads save my time or will it work exactly like parallel?
You can fork your program and manage the forks with something like Parallel::ForkManager. Then do the SSH work + capture using Net::OpenSSH and display the results to the screen. You'll need to be careful with your IO though since all those processes trying to write to STDOUT/STDERR at the same time will get garbled results. You'll need to do something like the answer from this question (pipes between parent and child processes): fork() and STDOUT/STDERR to the console from child processes
Parallel programming is harder than serial, so be prepared for some fun :)
One way is to use a shell script to execute your perl script:
#!/bin/bash
for host in $(cat myhosts)
do
perl myperl.pl $host $1 $2 &
done
where myhosts is a file containing 500+ host names

Why doesn't my shell script work when I run it from Perl?

I have this command that I load (example.sh) works well in the unix command line.
However, if I execute it in Perl using the system or ` syntax, it doesn't work.
I am guessing certain settings like environment variables and other external sh files weren't loaded.
Is there an example coding to ensure it will work?
More Updates on coding execution failure (I have been trying with different codes):
push (#JOBSTORUN, "cd $a/$b/$c/$d; loadproject cats; sleep 60;");
...
my $pm = new Parallel::ForkManager(3);
foreach my $job (#JOBSTORUN) {
$pm->start and next;
print(`$job`);
$pm->finish;
}
print "\n\n[DONE] FINISHED EXECUTING JOBS\n";
Output Messages:
sh: loadproject: command not found
Can you show us what you have tried so far? How are you running this program?
My first suspicion wouldn't be the environment if you are running it from a login shell. any Perl script you start (well, any program, really) inherits the same environment. However, if you are running the program through cron, then that's a different story.
The other mistakes I usually make in these situations is specifying the relative paths incorrectly. The paths are fine from the command line, but my Perl script has some other current working directory.
For general advice, see Interact with the system when Perl isn't enough. There's also a chapter in Learning Perl about this.
That's about the best advice you can hope for given the very limited information you've shared with us.