Perl operator: $|++; dollar sign pipe plus plus - perl

I'm working on a new version of an already released code of perl, and found the line:
$|++;
AFAIK, $| is related with pipes, as explained in this link, and I understand this, but I cannot figure out what the ++ (plus plus) means here.
Thank you in advance.
EDIT: Found the answer in this link:
In short: It forces to print (flush) to your console before the next statement, in case the script is too fast.
Sometimes, if you put a print statement inside of a loop that runs really really quickly, you won’t see the output of your print statement until the program terminates. sometimes, you don’t even see the output at all. the solution to this problem is to “flush” the output buffer after each print statement; this can be performed in perl with the following command:
$|++;
[update]
as has been pointed out by r. schwartz, i’ve misspoken; the above command causes print to flush the buffer preceding the next output.

$| defaults to 0; doing $|++ thus increments it to 1. Setting it to nonzero enables autoflush on the currently-selected file handle, which is STDOUT by default, and is rarely changed.
So the effect is to ensure that print statements and the like output immediately. This is useful if you're outputting to a socket or the like.

$| is an abbreviation for $OUTPUT_AUTOFLUSH, as you had found out. The ++ increments this variable.
$| = 1 would be the clean way to do this (IMHO).

It's an old idiom, from the days before IO::Handle. In modern code this should be written as
use IO::Handle;
STDOUT->autoflush(1);

It increments autoflush, which is most probably equivalent to turning it on.

Related

Perl STDIN without buffering or line buffering

I have a Perl script that received input piped from another program. It's buffering with an 8k (Ubuntu default) input buffer, which is causing problems. I'd like to use line buffering or disable buffering completely. It doesn't look like there's a good way to do this. Any suggestions?
use IO::Handle;
use IO::Poll qw[ POLLIN POLLHUP POLLERR ];
use Text::CSV;
my $stdin = new IO::Handle;
$stdin->fdopen(fileno(STDIN), 'r');
$stdin->setbuf(undef);
my $poll = IO::Poll->new() or die "cannot create IO::Poll object";
$poll->mask($stdin => POLLIN);
STDIN->blocking(0);
my $halt = 0;
for(;;) {
$poll->poll($config{poll_timout});
for my $handle ($poll->handles(POLLIN | POLLHUP | POLLERR)) {
next unless($handle eq $stdin);
if(eof) {
$halt = 1;
last;
}
my #row = $csv->getline($stdin);
# Do more stuff here
}
last if($halt);
}
Polling STDIN kind of throws a wrench into things since IO::Poll uses buffering and direct calls like sysread do not (and they can't mix). I don't want to infinitely call sysread without no blocking. I require the use of select or poll since I don't want to hammer the CPU.
PLEASE NOTE: I'm talking about STDIN, NOT STDOUT. $|++ is not the solution.
[EDIT]
Updating my question to clarify based on the comments and other answers.
The program that is writing to STDOUT (on the other side of the pipe) is line buffered and flushed after every write. Every write contains a newline, so in effect, buffering is not an issue for STDOUT of the first program.
To verify this is true, I wrote a small C program that reads piped input from the same program with STDIN buffering disabled (setvbuf with _IONBF). The input appears in STDIN of the test program immediately. Sadly, it does not appear to be an issue with the output from the first program.
[/EDIT]
Thanks for any insight!
PS. I have done a fair amount of Googling. This link is the closest I've found to an answer, but it certainly doesn't satisfy all my needs.
Say there are two short lines in the pipe's buffer.
IO::Poll notifies you there's data to read, which you proceed to read (indirectly) using readline.
Reading one character at a time from a file handle is very inefficient. As such, readline (aka <>) reads a block of data from the file handle at a time. The two lines ends up in a buffer and the first of the two lines is returned.
Then you wait for IO::Poll to notify you that there is more data. It doesn't know about Perl's buffer; it just knows the pipe is empty. As such, it blocks.
This post demonstrates the problem. It uses IO::Select, but the principle (and solution) is the same.
You're actually talking about the other program's STDOUT. The solution is $|=1; (or equivalent) in the other program.
If you can't, you might be able to convince the other program use line-buffering instead of block buffering by connecting its STDOUT to a pseudo-tty instead of a pipe (like Expect.pm does, for example).
The unix program expect has a tool called unbuffer which does that exactly that. (It's part of the expect-dev package on Ubuntu.) Just prefix the command name with unbuffer.

Perl, what does $|++ do?

I'm re-factoring some perl code, and as seems to be the case, Perl has some weird constructs that are a pain to look up.
In this case I encountered the following...
$|++;
This is on a line by itself just after the "use" statements.
What does this command do?
From perldoc perlvar:
$|
If set to nonzero, forces a flush right away and after every write or print on the currently selected output channel. Default is 0 (regardless of whether the channel is really buffered by the system or not; $| tells you only whether you've asked Perl explicitly to flush after each write). STDOUT will typically be line buffered if output is to the terminal and block buffered otherwise. Setting this variable is useful primarily when you are outputting to a pipe or socket, such as when you are running a Perl program under rsh and want to see the output as it's happening. This has no effect on input buffering. See getc for that. See select on how to select the output channel. See also IO::Handle.
Therefore, as it always starts as 0, this increments it to 1, forcing a flush after every write/print.
You can replace it with the following to be much clearer.
use English '-no_match_vars';
$OUTPUT_AUTOFLUSH = 1;
Looking up variables is best done with perlvar (perldoc perlvar, or http://perldoc.perl.org/perlvar.html)
From that:
HANDLE->autoflush( EXPR )
$OUTPUT_AUTOFLUSH
$|
If set to nonzero,
forces a flush right away and after every write or print on the
currently selected output channel. Default is 0 (regardless of whether
the channel is really buffered by the system or not; $| tells you only
whether you've asked Perl explicitly to flush after each write).
STDOUT will typically be line buffered if output is to the terminal
and block buffered otherwise. Setting this variable is useful
primarily when you are outputting to a pipe or socket, such as when
you are running a Perl program under rsh and want to see the output as
it's happening. This has no effect on input buffering. See getc for
that. See select on how to select the output channel. See also
IO::Handle.
++ is the increment operator, which adds one to the variable.
So $|++ sets autoflush true (default 0 + 1 = 1, which boolean evals as true), which forces writes to stdout to not be buffered.
$| is one of Perl's special variables.
According to perlvar:
If set to nonzero, forces a flush right away and after every write or print on the currently selected output channel.
If Google is your only source of information, I can understand how looking up special variables in Perl could cause consternation. Fortunately there is perldoc! Every machine with perl on it should also have perldoc. Use it without command line parameters to get a list of all the Core documentation that comes with your version of Perl.
To look up all special variables: perldoc perlvar
To look up a specific special variable:perldoc -v '$|' ( on *nix,
use double quotes on Windows)
To look up perl's list of functions: perldoc perlfunc
To look up a specific function: perldoc -f sprintf
To look up the operators (including precedence): perldoc perlop
Armed with that information, you'll know what happens when you post-increment the Output Autoflush variable.
As a special bonus, perldoc.perl.org can manage all of these jobs with the exception of the -v search...
As others have pointed out, it enables autoflush on the selected output filehandle (which is likely STDOUT). What nobody else has said, though, is that while you're generally refactoring and neatening up code, you really ought to replace it with the equivalent but much more obvious
STDOUT->autoflush(1);

perl $|=1; What is this?

I am learning Writing CGI Application with Perl -- Kevin Meltzer . Brent Michalski
Scripts in the book mostly begin with this:
#!"c:\strawberry\perl\bin\perl.exe" -wT
# sales.cgi
$|=1;
use strict;
use lib qw(.);
What's the line $|=1; How to space it, eg. $| = 1; or $ |= 1; ?
Why put use strict; after $|=1; ?
Thanks
perlvar is your friend. It documents all these cryptic special variables.
$OUTPUT_AUTOFLUSH (aka $|):
If set to nonzero, forces a flush right away and after every write or print on the currently selected output channel. Default is 0 (regardless of whether the channel is really buffered by the system or not; $| tells you only whether you've asked Perl explicitly to flush after each write). STDOUT will typically be line buffered if output is to the terminal and block buffered otherwise. Setting this variable is useful primarily when you are outputting to a pipe or socket, such as when you are running a Perl program under rsh and want to see the output as it's happening. This has no effect on input buffering. See getc for that. See select on how to select the output channel. See also IO::Handle.
Mnemonic: when you want your pipes to be piping hot.
Happy coding.
For the other questions:
There is no reason that use strict; comes after $|, except by the programmers convention. $| and other special variables are not affected by strict in this way. The spacing is also not important -- just pick your convention and be consistent. (I prefer spaces.)
$| = 1; forces a flush after every write or print, so the output appears as soon as it's generated rather than being buffered.
See the perlvar documentation.
$| is the name of a special variable. You shouldn't introduce a space between the $ and the |.
Whether you use whitespace around the = or not doesn't matter to Perl. Personally I think using spaces makes the code more readable.
Why the use strict; comes after $| = 1; in your script I don't know, except that they're both the sort of thing you'd put right at the top, and you have to put them in one order or the other. I don't think it matters which comes first.
It does not matter where in your script you put a use statement, because they all get evaluated at compile time.
$| is the built-in variable for autoflush. I agree that in this case, it is ambiguous. However, a lone $ is not a valid statement in perl, so by process of elimination, we can say what it must mean.
use lib qw(.) seems like a silly thing to do, since "." is already in #INC by default. Perhaps it is due to the book being old. This statement tells perl to add "." to the #INC array, which is the "path environment" for perl, i.e. where it looks for modules and such.

Why use an empty print after enabling autoflush?

I've found something similar like this in a piece of code:
use IO::Handle;
autoflush STDOUT 1;
print '';
Is the purpose of "print" to empty a possibly filled buffer?
The print forces all text in buffer (from previous prints) to be ouputted immediately. The code basically disable buffering and flush everything.
The print call should be a wasted system call. perlvar states, "If set to nonzero, forces a flush right away and after every write or print on the currently selected output channel." The code in this example should turn on autoflush, causing a flush, then add noting to the STDOUT buffer and cause a flush. There may be another reason for the print but my guess is that the original author of the code made the same assumption as bvr that there would be data left in the buffer after the call to autoflush that would need to be flushed.

What does "select((select(s),$|=1)[0])" do in Perl?

I've seen some horrific code written in Perl, but I can't make head nor tail of this one:
select((select(s),$|=1)[0])
It's in some networking code that we use to communicate with a server and I assume it's something to do with buffering (since it sets $|).
But I can't figure out why there's multiple select calls or the array reference. Can anyone help me out?
It's a nasty little idiom for setting autoflush on a filehandle other than STDOUT.
select() takes the supplied filehandle and (basically) replaces STDOUT with it, and it returns the old filehandle when it's done.
So (select($s),$|=1) redirects the filehandle (remember select returns the old one), and sets autoflush ($| = 1). It does this in a list ((...)[0]) and returns the first value (which is the result of the select call - the original STDOUT), and then passes that back into another select to reinstate the original STDOUT filehandle. Phew.
But now you understand it (well, maybe ;)), do this instead:
use IO::Handle;
$fh->autoflush;
The way to figure out any code is to pick it apart. You know that stuff inside parentheses happens before stuff outside. This is the same way you'd figuring out what code is doing in other languages.
The first bit is then:
( select(s), $|=1 )
That list has two elements, which are the results of two operations: one to select the s filehandle as the default then one to set $| to a true value. The $| is one of the per-filehandle variables which only apply to the currently selected filehandle (see Understand global variables at The Effective Perler). In the end, you have a list of two items: the previous default filehandle (the result of select), and 1.
The next part is a literal list slice to pull out the item in index 0:
( PREVIOUS_DEFAULT, 1 )[0]
The result of that is the single item that is previous default filehandle.
The next part takes the result of the slice and uses it as the argument to another call to select
select( PREVIOUS_DEFAULT );
So, in effect, you've set $| on a filehandle and ended up back where you started with the default filehandle.
select($fh)
Select a new default file handle. See http://perldoc.perl.org/functions/select.html
(select($fh), $|=1)
Turn on autoflush. See http://perldoc.perl.org/perlvar.html
(select($fh), $|=1)[0]
Return the first value of this tuple.
select((select($fh), $|=1)[0])
select it, i.e. restore the old default file handle.
Equivalent to
$oldfh = select($fh);
$| = 1;
select($oldfh);
which means
use IO::Handle;
$fh->autoflush(1);
as demonstrated in the perldoc page.
In another venue, I once proposed that a more comprehensible version would be thus:
for ( select $fh ) { $| = 1; select $_ }
This preserves the compact idiom’s sole advantage that no variable needs be declared in the surrounding scope.
Or if you’re not comfortable with $_, you can write it like this:
for my $prevfh ( select $fh ) { $| = 1; select $prevfh }
The scope of $prevfh is limited to the for block. (But if you write Perl you really have no excuse to be skittish about $_.)
It's overly clever code for turning on buffer flushing on handle s and then re-selecting the current handle.
See perldoc -f select for more.
please check perldoc -f select. For the meaning of $|, please check perldoc perlvar
It is overoptimization to skip loading IO::Handle.
use IO::Handle;
$fh->autoflush(1);
is much more readable.