Redirect output of sys process to stdout, no buffering - scala

I would like to run an external process and have its stdout and stderr piped immediately to stdout, without any buffering (at least not line-buffering). With line buffering, this isn't difficult:
val cmd: String = ...
cmd ! ProcessLogger(println,println)
However, how can I do this so that I don't have to wait for a whole line before outputting? The problem is that the process I am running has a percentage output that keeps updating, but when I use something like the above, it waits for a bit and then finally outputs 100%.
I've looked into ProcessIO and the helpful BasicIO companion, but I haven't managed to get anything working end to end.
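One direction that the ProcessIO/BasicIO machinery suggests (a rough sketch only, with some-command --progress standing in for the real program) is to read the child's stdout and stderr in small chunks and flush after every read, instead of waiting for complete lines:
import java.io.InputStream
import scala.sys.process._

object RunUnbuffered {
  // Copy whatever bytes are available from the child's stream to our stdout,
  // flushing after every read so partial lines (e.g. "42%") appear at once.
  def pump(in: InputStream): Unit = {
    val buf = new Array[Byte](128)
    var n = in.read(buf)
    while (n != -1) {
      System.out.write(buf, 0, n)
      System.out.flush()
      n = in.read(buf)
    }
    in.close()
  }

  def main(args: Array[String]): Unit = {
    val cmd = Seq("some-command", "--progress")     // stand-in for the real command
    val io  = new ProcessIO(_.close(), pump, pump)  // ignore stdin, pump stdout and stderr
    val exitCode = cmd.run(io).exitValue()          // block until the child exits
    println(s"exit code: $exitCode")
  }
}
The caveat, which the related Perl threads below spell out, is that this only removes buffering on the reading side; if the child itself block-buffers its output when it detects a pipe, it may still need to be run under something like unbuffer or stdbuf.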

Related

Line buffered reading in Perl

I have a Perl script, say "process_output.pl", which is used in the following context:
long_running_command | "process_output.pl"
The process_output script needs to behave like the Unix "tee" command: dump the output of "long_running_command" to the terminal as it is generated, also capture that output to a text file, and, when "long_running_command" finishes, fork another process with the text file as input.
The behavior I am currently seeing is that the output of "long_running_command" is dumped to the terminal only once it has completed, instead of appearing as it is generated. Do I need to do something special to fix this?
Based on my reading of a few other Stack Exchange posts, I tried the following in "process_output.pl", without much luck:
select(STDOUT); $| =1;
select(STDIN); $| =1; # Not sure even if this is needed
use FileHandle; STDOUT->autoflush(1);
stdbuf -oL -eL long_running_command | "process_output.pl"
Any pointers on how to proceed further?
Thanks
AB
This is more likely an issue with the output of the first process being buffered, rather than the input of your script. The easiest solution would be to try using the unbuffer command (I believe it's part of the expect package), something like
unbuffer long_running_command | "process_output.pl"
The unbuffer command will disable the buffering that happens normally when output is directed to a non-interactive place.
This is down to the output handling of long_running_command. More than likely it uses stdio, which checks what the output file descriptor is connected to before it writes anything. If that is a terminal (tty), it will generally line-buffer its output, but in the case above it notices it is writing to a pipe and therefore buffers the output into larger chunks.
You can control the buffering in your own process by using, as you showed
select(STDOUT); $| =1;
This means that anything your process prints to STDOUT is not buffered. It makes no sense doing this for input, as you control how much buffering is done there: if you use sysread() you read unbuffered, whereas if you use a construct like <$fh>, Perl waits until it has a "whole line" (strictly, it reads up to the next input record separator, as defined by the variable $/, which is a newline by default) before returning data to you.
unbuffer can be used to "disable" the output buffering. What it actually does is make the writing process think it is talking to a tty (by using a pseudo-tty), so that process does not buffer its output.
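The same pseudo-tty trick applies to the Scala question at the top of this page: if the child block-buffers because it sees a pipe, prefixing the command with unbuffer changes its behavior regardless of how the parent reads. A rough sketch, assuming unbuffer is installed and long_running_command stands in for the real program:
import scala.sys.process._

object RunUnderUnbuffer {
  def main(args: Array[String]): Unit = {
    // unbuffer (from the expect package) gives the child a pseudo-tty,
    // so the child stops block-buffering and the parent sees its output promptly.
    val exitCode = Seq("unbuffer", "long_running_command").!(ProcessLogger(println, println))
    println(s"exit code: $exitCode")
  }
}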

Perl STDOUT redirected to a pipe, no output after calling sleep()

I'm having problems with Perl on Windows (both ActivePerl and Strawberry) when redirecting a script's STDOUT to a pipe and using sleep(). Try this:
perl -e "for (;;) { print 'Printing line ', $i++, \"\n\"; sleep(1); }"
This works as expected. Now pipe it to tee (or to some file; same result):
perl -e "for (;;) { print 'Printing line ', $i++, \"\n\"; sleep(1); }" | tee
There's no output at all; tee captures nothing. However, the Perl script is still running; there's just nothing on STDOUT until the script finishes, and then all output is dumped to tee. Except that if the STDOUT buffer fills up, the script might hang.
Now, if you remove the sleep() call, the pipe works as expected! What's going on?
I found a workaround: disabling the STDOUT buffering with $|=1 makes the pipe work even with the sleep, but... why? Can anyone explain and offer a better solution?
You are suffering from buffering. Add $| = 1; to unbuffer STDOUT.
All file handles except STDERR are buffered by default, but STDOUT uses a minimal form of buffering (flushed on newlines) when connected to a terminal. When you replace the terminal with a pipe, normal block buffering is reinstated.
Removing the sleep call doesn't change anything except to speed things up: instead of taking minutes to fill up the buffer, it takes milliseconds. With or without it, the output is still written in 4k or 8k blocks (depending on your version of Perl).

Perl execute command, capture and display output

I need to execute a command from my Perl script, and the command is going to take a while (1-2 hours). I want to capture the output of the command so that my script can check everything went OK, but as it takes such a long time, I'd like the user to see the command's output while it runs, too.
What I've tried:
backticks - Can only get output when command is finished
system - Can only get output when command is finished
open - Almost perfect, but the command's output is buffered, meaning users don't see an update for a long time. The internet is full of suggestions to set $| = 1, but that only affects my own script's output buffering and doesn't help here
Piping to tee - Similar results to open - 'tee' only seems to print later
Redirecting output and using Proc::Background and File::Tail - Almost perfect again, but I can't think of an elegant way to stop the print loop
Would love to have a suggestion!
Edit: I've accepted Barmar's answer. I believe it works because Expect.pm uses a pseudo-terminal. Just for others looking at this question in the future, this is how I've implemented it:
my $process = Expect->spawn($command, @params) or die "Cannot spawn $command: $!\n";
while ($process->expect(undef))
{
    print $process->before();
}
Using Expect.pm should disable output buffering in the command.

What can be the possible situations where one should prefer the unbuffered output?

From the discussion on my previous question, I learned that Perl buffers output by default.
$| = 0; # for buffered output (by default)
If you want unbuffered output, set the special variable $| to 1, i.e.
$| = 1; # for unbuffered output
Now I want to know: what are the possible situations where one should prefer unbuffered output?
You want unbuffered output for interactive tasks. By that, I mean you don't want output stuck in some buffer when you expect someone or something else to respond to the output.
For example, you wouldn't want user prompts sent to STDOUT to be buffered. (That's why STDOUT is never fully buffered when attached to a terminal. It is only line buffered, and the buffer is flushed by attempts to read from STDIN.)
For example, you'd want requests sent over pipes and sockets not to get stuck in some buffer, as the other end of the connection would never see them.
The only other reason I can think of is when you don't want important data to be stuck in a buffer in the event of an unrecoverable error such as a panic or death by signal.
For example, you might want to keep a log file unbuffered in order to be able to diagnose serious problems. (This is why STDERR isn't buffered by default.)
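None of this is Perl-specific; the same habits carry over to the Scala program in the original question. A small sketch (debug.log is just a hypothetical path) showing a prompt flushed explicitly before blocking on input, and a diagnostic log opened with autoflush so its last lines survive an abrupt exit:
import java.io.{FileWriter, PrintWriter}
import scala.io.StdIn

object UnbufferedHabits {
  def main(args: Array[String]): Unit = {
    // Interactive prompt: there is no trailing newline to trigger a
    // line-buffered flush, so flush explicitly before waiting for input.
    print("Proceed? [y/N] ")
    Console.out.flush()
    val answer = StdIn.readLine()

    // Diagnostic log: autoFlush = true flushes after every println, so the
    // most recent messages reach the OS even if the process dies suddenly.
    val log = new PrintWriter(new FileWriter("debug.log"), true) // hypothetical path
    log.println(s"user answered: $answer")
    log.close()
  }
}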
Here's a small sample of Perl users from Stack Overflow who have benefited from learning to set $| = 1:
STDOUT redirected externally and no output seen at the console
Perl Print function does not work properly when Sleep() is used
can't write to file using print
perl appending issues
Unclear perl script execution
Perl: Running a "Daemon" and printing
Redirecting STDOUT of a pipe in Perl
Why doesn't my parent process see the child's output until it exits?
Why does adding or removing a newline change the way this perl for loop functions?
Perl not printing properly
In Perl, how do I process input as soon as it arrives, instead of waiting for newline?
What is the simple way to keep the output stream exactly as it shown out on the screen (while interactive data used)?
Is it possible to print from a perl CGI before the process exits?
Why doesn't print output anything on each iteration of a loop when I use sleep?
Perl Daemon Not Working with Sleep()
It can be useful when writing to another program over a socket or pipe. It can also be useful when you are writing debugging information to STDOUT to watch the state of your program live.

Problem with piped filehandle in perl

I am trying to run bp_genbank2gff3.pl (from the BioPerl package) from another Perl script that gets a GenBank file as its argument.
This does not work (no output files are generated):
my $command = "bp_genbank2gff3.pl -y -o /tmp $ARGV[0]";
open( my $command_out, "-|", $command );
close $command_out;
but this does
open( my $command_out, "-|", $command );
sleep 3; # why do I need to sleep?
close $command_out;
Why?
I thought that close is supposed to block until the command is done:
Closing any piped filehandle causes the parent process to wait for the child to finish...
(see http://perldoc.perl.org/functions/open.html).
Edit
I added this as the last line:
say "ret=$ret, \$?=$?, \$!=$!";
and in both cases the printout is:
ret=, $?=13, $!=
(which means close failed in both cases, right?)
$? = 13 means your child process was terminated by a SIGPIPE signal. Your external program (bp_genbank2gff3.pl) tried to write some output to a pipe to your Perl program, but the Perl program had closed its end of the pipe, so your OS sent a SIGPIPE to the external program.
By sleeping for 3 seconds, you are letting the external program run for 3 seconds before the OS kills it, so it can get something done. Note that pipes have a limited capacity, though: if your parent Perl script is not reading from the pipe and the external program is writing a lot to standard output, its write operations will eventually block, and you may not really get 3 seconds of effort out of it.
The workaround is to read the output from the external program, even if you are just going to throw it away.
open( my $command_out, "-|", $command );
my @ignore_me = <$command_out>;
close $command_out;
Update: If you really don't care about the command's output, you can avoid SIGPIPE issues by redirecting the output to /dev/null:
open my $command_out, "-|", "$command > /dev/null";
close $command_out; # succeeds, no SIGPIPE
Of course if you are going to go to that much trouble to ignore the output, you might as well just use system.
Additional info: As the OP says, closing a piped filehandle causes the parent to wait for the child to finish (by using waitpid or something similar). But before it starts waiting, it closes its end of the pipe. In this case, that end is the read end of the pipe that the child process is writing its standard output to. The next time the child tries to write something to standard output, the OS detects that the read end of that pipe is closed and sends a SIGPIPE to the child process, killing it and quickly letting the close statement in the parent finish.
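For what it's worth, the same "read it even if you throw it away" rule applies in the Scala setting of the original question: if the parent never consumes a chatty child's output, the child eventually blocks on a full pipe. A sketch using sys.process, with long_running_command again standing in for the real program:
import scala.sys.process._

object DrainAndDiscard {
  def main(args: Array[String]): Unit = {
    // ProcessLogger consumes stdout and stderr as they arrive; here both
    // callbacks do nothing, so the output is drained and discarded instead
    // of piling up in the pipe and stalling the child.
    val discard = ProcessLogger(_ => (), _ => ())
    val exitCode = Seq("long_running_command").!(discard)
    println(s"exit code: $exitCode")
  }
}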
I'm not sure what you're trying to do but system is probably better in this case...