How to print before a while loop is started in Perl?

I have this code in Perl:
print "Processing ... ";
while ( some condition ) {
    # do something that takes more than 10 minutes
}
print "OK\n";
The first print only shows up after the while loop has finished.
How can I print the message before the while loop starts?

Output is buffered, meaning the program decides when it actually writes out what you printed. You can put
$| = 1;
at the top of your script to enable autoflush on STDOUT in this single instance. For more methods (auto-flushing, file flushing, etc.) you can search around SO for questions about this.

Ordinarily, Perl will buffer up to 8KB of output text before flushing it to the device, or up to the next newline if the device is a terminal. You can avoid this by adding
STDOUT->autoflush;
to the top of your code, assuming that you are printing to STDOUT (on Perls before 5.14 you also need use IO::Handle; for the method to be available). This will force the data to be flushed after every print, say, or write operation.
Note that this is the same as using $| = 1, but it is significantly less cryptic and lets you set the property on any given file handle.
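For instance, here is the code from the question with autoflush enabled (a minimal sketch; some_condition() and the loop body are placeholders):
use strict;
use warnings;
use IO::Handle;             # only needed before Perl 5.14, but harmless

STDOUT->autoflush;          # equivalent to $| = 1 for STDOUT

print 'Processing ... ';    # now appears immediately
while ( some_condition() ) {
    # do something that takes more than 10 minutes
}
print "OK\n";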

You can see the prints by flushing the buffers immediately after.
print "Processing ... ";
STDOUT->flush;
If you are using autoflush, you should save the current configuration by duplicating the file handle.
use autodie; # dies if error on open and close.
{
    STDOUT->flush;                     # empty its buffer
    open my $saved_stdout, '>&', \*STDOUT;
    STDOUT->autoflush;
    # ... output with autoflush active
    open STDOUT, '>&', $saved_stdout;  # restore old STDOUT
}
See perldoc -f open and search for >&.

Related

printing messages to the screen and a file

I have a perl script that I am running on Windows. In the script I call another perl script. I am trying to get both of those scripts to print to the cmd window and to a file. This is basically how I am doing the call, using IO::Tee:
use IO::Tee;
open (my $file, '>>', "C:\\Logs\\logfile.txt") or die "couldn't open log file: $!";
my $tee = IO::Tee->new(\*STDOUT, $file);
# doing some stuff
print $tee "log about what i just did";
# do more stuff
print $tee "more logs";
print $tee `c:\\secondScript.pl arg1`;
print $tee "done with script";
The second script is basically
# do stuff
print "script 2 log about stuff";
# do more stuff
print "script 2 log about more stuff";
print "script 2 done";
This does get everything to the screen and a file. However, I don't see the "script 2 log about stuff", "script 2 log about more stuff", and "script 2 done" until after script 2 has finished. I would like to see all of that stream to the screen and the file as soon as the print is reached.
Printing to STDOUT is usually line buffered (to speed things up) when output goes to a terminal and block buffered otherwise (e.g. when redirecting output to a file).
You can think of it as if anything printed is first placed into a buffer (typically 4096 bytes large) and only when the buffer is full (i.e. 4096 characters were printed), it gets output to the screen.
line buffered means the output is only shown on screen after a \n is found (or the buffer is full). You don't have \ns in your 2nd script, so no output is shown until either a) a \n comes, b) the buffer is full, or c) the script ends.
block buffered means the output is shown only when the buffer is full. \ns don't influence this here (except for counting as one character).
To avoid buffering there's a magic variable called $|. From the docs (see the $| entry in perlvar):
If set to nonzero, forces a flush right away and after every write or
print on the currently selected output channel.
So you could append "\n" to your print statements or, better, set $| = 1; at the top of your 2nd script (only once, not for each print). This will slow down the output of the 2nd script (in theory), but for a few lines it will make no difference.
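Applied to the question's second script, that looks like this (a minimal sketch; the comments stand in for the real work):
$| = 1;   # autoflush STDOUT so the tee in the parent sees each print immediately

# do stuff
print "script 2 log about stuff";
# do more stuff
print "script 2 log about more stuff";
print "script 2 done";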

Why won't my perl daemon print?

I am debugging a daemon and I'm trying to use print statements to output information to the terminal. The gist of my code is:
#!/usr/bin/env perl
use strict;
use warnings;
use Readonly;
Readonly my $TIMEOUT => ...;
...
while (1) {
    print "DEBUG INFO";
    ...
    sleep $TIMEOUT;
}
However, no output is getting printed to my terminal. Why is this?
Summary:
Use $| = 1 or add a newline, "\n" to the print.
Explanation:
The reason this isn't printing to the terminal is because perl is buffering the output for efficiency. Once the print buffer has been filled it will be flushed and the output will appear in your terminal. It may be desirable for you to force flushing the buffer, as depending on the length of $TIMEOUT you could be waiting for a considerable length of time for output!
There are two main approaches to flushing the buffer:
1) As you're printing to your terminal, your filehandle is most likely STDOUT. Any file handles attached to the terminal are in line-buffered mode by default, and we can flush the buffer and force output by adding a newline character to the print statement:
while (1) {
    print "DEBUG INFO\n";
    ...
    sleep $TIMEOUT;
}
2) The second approach is to use $|, which when set to non-zero makes the current filehandle (STDOUT by default, or the last to be selected) hot and forces a flush of the buffer immediately. Therefore, the following will also force printing of the debug information:
$| = 1;
while (1) {
    print "DEBUG INFO";
    ...
    sleep $TIMEOUT;
}
If using syntax such as this is confusing, then you may like to consider:
use IO::Handle;
STDOUT->autoflush(1);
while (1) {
    print "DEBUG INFO";
    ...
    sleep $TIMEOUT;
}
In many code examples where immediate flushing of the buffer is required, you may see $|++ used to make a file-handle hot and immediately flush the buffer, and --$| to make a file-handle cold and switch off auto-flushing. See these two answers for more details:
Perl operator: $|++; dollar sign pipe plus plus
How does --$| work in Perl?
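A quick sketch of why those idioms behave the way they do: $| is magical and only ever holds 0 or 1, so incrementing and decrementing it act unusually (the values in the comments follow from that rule):
$|++;   # $| is now 1 (hot), no matter how many times you increment
$|++;   # still 1
--$|;   # toggles: $| is now 0 (cold)
--$|;   # toggles again: $| is now 1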
If you're interested in learning more about perl buffers, then I would suggest reading Suffering from Buffering, which gives great insight into why we have buffering and explains how to switch it on and off.

Perl bidirectional pipe IPC, how to avoid output buffering

I am trying to communicate with an interactive process. I want my perl script to be a "middle man" between the user and the process. The process writes text to stdout, prompts the user for a command, writes more text to stdout, prompts the user for a command, and so on. A primitive graphic is provided:
User <----STDOUT---- interface.pl <-----STDOUT--- Process
User -----STDIN----> interface.pl ------STDIN---> Process
User <----STDOUT---- interface.pl <-----STDOUT--- Process
User -----STDIN----> interface.pl ------STDIN---> Process
User <----STDOUT---- interface.pl <-----STDOUT--- Process
User -----STDIN----> interface.pl ------STDIN---> Process
The following simulates what I'm trying to do:
#!/usr/bin/perl
use strict;
use warnings;
use FileHandle;
use IPC::Open2;
my $pid = open2( \*READER, \*WRITER, "cat -n" );
WRITER->autoflush(); # default here, actually
my $got = "";
my $input = " ";
while ($input ne "") {
    chomp($input = <STDIN>);
    print WRITER "$input \n";
    $got = <READER>;
    print $got;
}
Due to output buffering, the above example does not work. No matter what text is typed in, or how many enters are pressed, the program just sits there. The way to fix it is to issue:
my $pid = open2( \*READER, \*WRITER, "cat -un" );
Notice "cat -un" as opposed to just "cat -n". -u turns off output buffering on cat. When output buffering is turned off, this works. The process I am trying to interact with most likely buffers output, as I am facing the same issues with it as with "cat -n". Unfortunately I cannot turn off output buffering on the process I am communicating with, so how do I handle this issue?
UPDATE 1 (using a pty):
#!/usr/bin/perl
use strict;
use warnings;
use IO::Pty;
use IPC::Open2;
my $reader = new IO::Pty;
my $writer = new IO::Pty;
my $pid = open2( $reader, $writer, "cat -n" );
my $got = "";
my $input = " ";
$writer->autoflush(1);
while ($input ne "") {
    chomp($input = <STDIN>);
    $writer->print("$input \n");
    $got = $reader->getline;
    print $got;
}
There are three kinds of buffering:
Block buffering: Output is placed into a fixed-sized buffer. The buffer is flushed when it becomes full. You'll see the output come out in chunks.
Line buffering: Output is placed into a fixed-sized buffer. The buffer is flushed when a newline is added to the buffer and when it becomes full.
No buffering: Output is passed directly to the OS.
In Perl, buffering works as follows:
File handles are buffered by default. One exception: STDERR is not buffered by default.
Block buffering is used. One exception: STDOUT is line buffered if and only if it's connected to a terminal.
Reading from STDIN flushes the buffer for STDOUT.
Until recently, Perl used 4KB buffers. Now, the default is 8KB, but that can be changed when Perl is built.
These first two are surprisingly standard across all applications. That means:
User -------> interface.pl
The user is a person. He doesn't buffer per se, though he is a very slow source of data. OK
interface.pl ----> Process
interface.pl's output is block buffered. BAD
Fixed by adding the following to interface.pl:
use IO::Handle qw( );
WRITER->autoflush(1);
Process ----> interface.pl
Process's output is block buffered. BAD
Fixed by adding the following to Process:
use IO::Handle qw( );
STDOUT->autoflush(1);
Now, you're probably going to tell me you can't change Process. If so, that leaves you three options:
Use a command line or configuration option provided by the tool to change its buffering behaviour. I don't know of any tools that provide such an option.
Fool the child into using line buffering instead of block buffering by using a pseudo-tty instead of a pipe (see the sketch below).
Quitting.
interface.pl -------> User
interface.pl's output is line buffered. OK (right?)
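Here is a hedged sketch of that pseudo-tty approach using the CPAN module IPC::Run, with "cat -n" standing in for Process. Note that a pty typically echoes input back, which this sketch does not handle:
use strict;
use warnings;
use IPC::Run qw( start pump finish );

my ($in, $out) = ('', '');

# '<pty<' and '>pty>' attach the child's stdin/stdout to a pseudo-tty,
# so its stdio defaults to line buffering instead of block buffering.
my $h = start [ 'cat', '-n' ], '<pty<', \$in, '>pty>', \$out;

$in .= "hello\n";
pump $h until $out =~ /\n/;   # run the child until a full line arrives
print "got: $out";

finish $h;                    # close the pty and reap the child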

In Perl, why does print not generate any output after I close STDOUT?

I have the code:
open(FILE, "<$new_file") or die "Can't open file\n";
@lines = <FILE>;
close FILE;
open(STDOUT, ">$new_file") or die "Can't open file\n";
$old_fh = select(OUTPUT_HANDLE);
$| = 1;
select($old_fh);
for (@lines) {
    s/(.*?xsl.*?)xsl/$1xslt/;
    print;
}
close(STDOUT);
STDOUT -> autoflush(1);
print "file changed";
After closing STDOUT, the program does not write the last print, print "file changed";. Why is this?
Edited: I want to print the message to the console, not to the file.
I suppose it is because print's default filehandle is STDOUT, which at that point is already closed. You could reopen it, or print to another filehandle, for example STDERR:
print STDERR "file changed";
It's because you've closed the filehandle stored in STDOUT, so print can't use it anymore. Generally speaking opening a new filehandle into one of the predefined handle names isn't a very good idea because it's bound to lead to confusion. It's much clearer to use lexical filehandles, or just a different name for your output file. Yes you then have to specify the filehandle in your print call, but then you don't have any confusion over what's happened to STDOUT.
A print statement will output the string to STDOUT, which is the default output file handle.
So the statement
print "This is a message";
is the same as
print STDOUT "This is a message";
In your code, you have closed STDOUT and are then printing the message, which will not work. Reopen the STDOUT filehandle, or do not close it at all; when the script ends, the file handles will be closed automatically:
open OLDOUT, ">&", \*STDOUT;   # save a duplicate of STDOUT
close STDOUT;
open(STDOUT, ">$new_file") or die "Can't open file\n";
...
close(STDOUT);
open STDOUT, ">&", \*OLDOUT;   # restore the saved STDOUT
print "file changed";
You seem to be confused about how file IO operations are done in perl, so I would recommend you read up on that.
What went wrong?
What you are doing is:
Open a file for reading
Read the entire file and close it
Open the same file for overwrite (the original file is truncated), using the STDOUT file handle.
Juggle around the default print handle in order to set autoflush on a file handle which is not even opened in the code you show.
Perform a substitution on all lines and print them
Close STDOUT then print a message when everything is done.
Your biggest mistake is trying to reopen the default output file handle STDOUT. I assume this is because you did not know how print works, i.e. that you can supply a file handle to print to: print FILEHANDLE "text". Or that you did not know that STDOUT is a pre-defined file handle.
Your other errors:
You did not use use strict; use warnings;. No program you write should be without these. They will prevent you from doing bad things, and give you information on errors, and will save you hours of debugging.
You should never "slurp" a file (read the entire file into a variable) unless you really need to, because this is inefficient and slow, and for huge files it will cause your program to crash due to lack of memory.
Never reassign the default file handles STDIN, STDOUT, STDERR, unless A) you really need to, B) you know what you are doing.
select sets the default file handle for print; read the documentation. This is rarely something that you need to concern yourself with. The variable $| sets autoflush on (if set to a true value) for the currently selected file handle. So what you did actually accomplished nothing, because OUTPUT_HANDLE is a non-existent file handle. If you had skipped the select statements, it would have set autoflush for STDOUT (but you wouldn't have noticed any difference). A sketch of the correct dance follows this list.
print uses print buffers because it is efficient. I assume you are trying to autoflush because you think your prints get caught in the buffer, which is not true. Generally speaking, this is not something you need to worry about. All the print buffers are automatically flushed when a program ends.
For the most part, you do not need to explicitly close file handles. File handles are automatically closed when they go out of scope, or when the program ends.
Using lexical file handles, e.g. open my $fh, ... instead of global, e.g. open FILE, .. is recommended, because of the previous statement, and because it is always a good idea to avoid global variables.
Using three-argument open is recommended: open FILEHANDLE, MODE, FILENAME. This is because you otherwise risk meta-characters in your file names corrupting your open statement.
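As promised above, here is a minimal sketch of the select/$| dance the original code attempted (the file name out.txt is made up):
use strict;
use warnings;

open my $fh, '>', 'out.txt' or die "Can't open out.txt: $!";

my $old_fh = select $fh;   # make $fh the default handle for print
$| = 1;                    # autoflush now applies to $fh, not STDOUT
select $old_fh;            # restore the previous default handle

print {$fh} "written out immediately\n";   # no longer held in a buffer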
The quick fix:
Now, as I said in the comments, this (or rather, what you intended, because this code is wrong) is pretty much identical to the idiomatic usage of the -p command line switch:
perl -pi.bak -e 's/(.*?xsl.*?)xsl/$1xslt/' file.txt
This short little snippet actually does all that your program does, but does it much better. Explanation:
The -p switch automatically wraps the code you provide in a while (<>) { } loop and prints each line after your code is executed.
The -i switch tells perl to do an in-place edit on the file, saving a backup copy in "file.txt.bak".
So, that one-liner is equivalent to a program such as this:
$^I = ".bak";   # turns in-place edit on
while (<>) {    # diamond operator automatically reads STDIN or the files in @ARGV
    s/(.*?xsl.*?)xsl/$1xslt/;
    print;
}
Which is equivalent to this:
my $file = shift;   # first argument from @ARGV
open my $fh, "<", $file or die $!;
open my $tmp, ">", "/tmp/foo.bar" or die $!;   # not sure where tmpfile is
while (<$fh>) {      # read lines from the original file
    s/(.*?xsl.*?)xsl/$1xslt/;
    print $tmp $_;   # print line to tmp file
}
rename($file, "$file.bak") or die $!;      # save backup
rename("/tmp/foo.bar", $file) or die $!;   # overwrite original file
The in-place edit option actually creates a separate file, then copies it over the original. If you use the backup option, the original file is first backed up. You don't need to know this information, just know that using the -i switch will cause the -p (and -n) options to actually perform changes on your original file.
Using the -i switch with the backup option activated is not required (except on Windows), but recommended. A good idea is to run the one-liner without the option first, so the output is printed to screen instead, and then adding it once you see the output is ok.
The regex
s/(.*?xsl.*?)xsl/$1xslt/;
You search for a string that contains "xsl" twice. The usage of .*? is good in the second case, but not in the first. Any time you find yourself starting a regex with a wildcard string, you're probably doing something wrong, unless you are trying to capture that part.
In this case, though, you capture it and remove it, only to put it back, which is completely useless. So the first order of business is to take that part out:
s/(xsl.*?)xsl/$1xslt/;
Now, removing something and putting it back is really just a magic trick for not removing it at all. We don't need magic tricks like that, when we can just not remove it in the first place. Using look-around assertions, you can achieve this.
In this case, since you have a variable length expression and need a look-behind assertion, we have to use the \K (mnemonic: Keep) option instead, because variable length look-behinds are not implemented.
s/xsl.*?\Kxsl/xslt/;
So, since we didn't take anything out, we don't need to put anything back using $1. Now, you may notice, "Hey, if I replace 'xsl' with 'xslt', I don't need to remove 'xsl' at all." Which is true:
s/xsl.*?xsl\K/t/;
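To see the effect on a made-up sample line (the input here is hypothetical):
my $line = "foo.xsl imports bar.xsl here";
(my $new = $line) =~ s/xsl.*?xsl\K/t/;
print "$new\n";   # prints "foo.xsl imports bar.xslt here"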
You may consider using options for this regex, such as /i, which causes it to ignore case and thus also match strings such as "XSL FOO XSL". Or the /g option which will allow it to perform all possible matches per line, and not just the first match. Read more in perlop.
Conclusion
The finished one-liner is:
perl -pi.bak -e 's/xsl.*?xsl\K/t/' file.txt

Will data in a pipe queue up for reading by Perl?

I have a Perl script that executes a long running process and observes its command line output (log messages), some of which are multiple lines long. Once it has a full log message, it sends it off to be processed and grabs the next log message.
open(PS_F, "run.bat |") or die $!;
$logMessage = "";
while (<PS_F>) {
    $lineRead = $_;
    if ($lineRead =~ m!(\d{4}-\d{2}-\d{2}\ \d{2}:\d{2}:\d{2})!) {
        # process the previous log message
        $logMessage = $lineRead;
    }
    else {
        $logMessage = $logMessage . $_;
    }
}
close(PS_F);
In its current form, do I have to worry about the line reading and processing "backing up"? For example, if I get a new log message every 1 second and it takes 5 seconds to do all the processing (random numbers I pulled out), do I have to worry that I will miss log messages or have memory issues?
In general, data output on the pipeline by one application will be buffered if the next cannot consume it fast enough. If the buffer fills up, the outputting application is blocked (i.e. calls to write to the output file handle just stall) until the consumer catches up. I believe the buffer on Linux is (or was) 65536 bytes.
In this fashion, you can never run out of memory, but you can seriously stall the producer application in the pipeline.
No you will not lose messages. The writing end of the pipe will block if the pipe buffer is full.
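A hedged way to see this blocking for yourself (the sizes and messages are only illustrative; the exact stall point depends on the OS pipe buffer):
use strict;
use warnings;

pipe my $reader, my $writer or die "pipe: $!";

my $pid = fork;
die "fork: $!" unless defined $pid;

if ($pid) {                  # parent: a deliberately slow consumer
    close $writer;
    sleep 5;                 # give the child time to fill the pipe buffer
    print while <$reader>;   # drain the pipe; the child unblocks as we read
    waitpid $pid, 0;
}
else {                       # child: a fast producer
    close $reader;
    select $writer;
    $| = 1;                  # push each print straight into the pipe
    for my $n (1 .. 200) {
        print "x" x 1023, "\n";   # 1KB per iteration
        warn "wrote ${n}KB\n";    # progress on STDERR; this stalls once the
                                  # OS pipe buffer (often 64KB) is full
    }
    exit 0;
}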
Strictly speaking, this should be a comment: Please consider re-writing your code as
# use a lexical filehandle and 3-arg open
open my $PS_F, '-|', 'run.bat'
    or die "Cannot open pipe to 'run.bat': $!";

# explicit initialization not needed; limit scope
my $logMessage;
while (<$PS_F>) {
    # you probably meant to anchor the pattern, and there is no need
    # to capture if you are not going to use the captured matches;
    # there is also no need to escape a space, although you might
    # want to use [ ] for clarity
    $logMessage = '' if m!^\d{4}-\d{2}-\d{2}[ ]\d{2}:\d{2}:\d{2}!;
    $logMessage .= $_;
}
close $PS_F
    or die "Cannot close pipe: $!";