Update command line output - perl

My program (which happens to be in Perl, though I don't think this question is Perl-specific) outputs status messages at one point in the program of the form Progress: x/yy where x and yy are a number, like: Progress: 4/38.
I'd like to "overwrite" the previous output when a new status message is printed so I don't fill the screen with status messages. So far, I've tried this:
my $progressString = "Progress\t$counter / " . $total . "\n";
print $progressString;
#do lots of processing, update $counter
my $i = 0;
while ($i < length($progressString)) {
print "\b";
++$i;
}
The backspace character won't print if I include a newline in $progressString. If I leave out the newline, however, the output buffer is never flushed and nothing prints.
What's a good solution for this?

Use autoflush with STDOUT:
local $| = 1; # Or use IO::Handle; STDOUT->autoflush;
print 'Progress: ';
my $progressString;
while ...
{
# remove prev progress
print "\b" x length($progressString) if defined $progressString;
# do lots of processing, update $counter
$progressString = "$counter / $total"; # No more newline
print $progressString; # Will print, because auto-flush is on
# end of processing
}
print "\n"; # Don't forget the trailing newline

Say
$| = 1
somewhere early in your program to turn autoflushing on for the output buffer.
Also consider using "\r" to move the cursor back to the beginning of the line, rather than trying to explicitly count how many spaces you need to move back.
Like you said, don't print out a newline while your progress counter is running or else you will print out your progress on a separate line instead of overwriting the old line.

I know it's not quite what you asked for, but possibly better. I happened on this same problem and so rather than deal with it too much went to using Term::ProgressBar which looks nice too.

You can also use the ANSI escape codes to directly control the cursor. Or you can use Term::ReadKey to do the same thing.

I had to tackle something similar to this today.
If you don't mind reprinting the entire line, you could do something like this:
print "\n";
while (...) {
print "\rProgress: $counter / $total";
# do processing work here
$counter++;
}
print "\n";
The "\r" character is a carriage return-- it brings the cursor back to the beginning of the line. That way, anything you print out overwrites the previous progress notification's text.

Related

Updating multiple lines of perl output on terminal

I am trying to update the progress of my Perl script on the terminal. The output looks something like this
Progress: ||||||||| [46%]
The progress keeps on getting updated until it reaches 100%. This is being done by printing "\r" after updating the progress. I wish to update multiple lines at the same time, how can it be done? The expectation is something like this
Progress: ||||||||| [46%]
Run-time: 100sec
After some progress(and or time) I wish to update it like this
Progress: |||||||||| [50%]
Run-time: 150sec
I tried printing "\r" two times to go to the previous line. But it didn't work.
I found similar questions (here and here), but they were answered for Python using modules. Mine is a Perl script, and I am not preferring to use external modules.
Term::ANSIScreen provides terminal control using ANSI escape sequences:
use Term::ANSIScreen qw!savepos loadpos!;
print savepos();
for my $i (1..10) {
print '|' x $i, "\n";
print "Step: $i\n";
sleep 1;
print loadpos();
}
or
use Term::ANSIScreen qw!up!;
for my $i (1..10) {
print '|' x $i, "\n";
print "Step: $i\n";
sleep 1;
print up(2);
}
These constants can be used instead of the module:
my $savepos = "\e[s";
my $loadpos = "\e[u";
my $up2 = "\e[2A";
ANSI escape codes

Perl CGI produces unexpected output

I have a Perl CGI script for online concordance application that searches for an instance of word in a text and prints the sorted output.
#!/usr/bin/perl -wT
# middle.pl - a simple concordance
# require
use strict;
use diagnostics;
use CGI;
# ensure all fatals go to browser during debugging and set-up
# comment this BEGIN block out on production code for security
BEGIN {
$|=1;
print "Content-type: text/html\n\n";
use CGI::Carp('fatalsToBrowser');
}
# sanity check
my $q = new CGI;
my $target = $q->param("keyword");
my $radius = $q->param("span");
my $ordinal = $q->param("ord");
my $width = 2*$radius;
my $file = 'concordanceText.txt';
if ( ! $file or ! $target ) {
print "Usage: $0 <file> <target>\n";
exit;
}
# initialize
my $count = 0;
my #lines = ();
$/ = ""; # Paragraph read mode
# open the file, and process each line in it
open(FILE, " < $file") or die("Can not open $file ($!).\n");
while(<FILE>){
# re-initialize
my $extract = '';
# normalize the data
chomp;
s/\n/ /g; # Replace new lines with spaces
s/\b--\b/ -- /g; # Add spaces around dashes
# process each item if the target is found
while ( $_ =~ /\b$target\b/gi ){
# find start position
my $match = $1;
my $pos = pos;
my $start = $pos - $radius - length($match);
# extract the snippets
if ($start < 0){
$extract = substr($_, 0, $width+$start+length($match));
$extract = (" " x -$start) . $extract;
}else{
$extract = substr($_, $start, $width+length($match));
my $deficit = $width+length($match) - length($extract);
if ($deficit > 0) {
$extract .= (" " x $deficit);
}
}
# add the extracted text to the list of lines, and increment
$lines[$count] = $extract;
++$count;
}
}
sub removePunctuation {
my $string = $_[0];
$string = lc($string); # Convert to lowercase
$string =~ s/[^-a-z ]//g; # Remove non-aplhabetic characters
$string =~ s/--+/ /g; #Remove 2+ hyphens with a space
$string =~s/-//g; # Remove hyphens
$string =~ s/\s=/ /g;
return($string);
}
sub onLeft {
#USAGE: $word = onLeft($string, $radius, $ordinal);
my $left = substr($_[0], 0, $_[1]);
$left = removePunctuation($left);
my #word = split(/\s+/, $left);
return($word[-$_[2]]);
}
sub byLeftWords {
my $left_a = onLeft($a, $radius, $ordinal);
my $left_b = onLeft($b, $radius, $ordinal);
lc($left_a) cmp lc($left_b);
}
# process each line in the list of lines
print "Content-type: text/plain\n\n";
my $line_number = 0;
foreach my $x (sort byLeftWords #lines){
++$line_number;
printf "%5d",$line_number;
print " $x\n\n";
}
# done
exit;
The perl script produces expected result in terminal (command line). But the CGI script for online application produces unexpected output. I cannot figure out what mistake I am making in the CGI script. The CGI script should ideally produce the same output as the command line script. Any suggestion would be very helpful.
Command Line Output
CGI Output
The BEGIN block executes before anything else and thus before
my $q = new CGI;
The output goes to the server process' stdout and not to the HTTP stream, so the default is text/plain as you can see in the CGI output.
After you solve that problem you'll find that the output still looks like a big ugly block because you need to format and send a valid HTML page, not just a big block of text. You cannot just dump a bunch of text to the browser and expect it to do anything intelligent with it. You must create a complete HTML page with tags to layout your content, probably with CSS as well.
In other words, the output required will be completely different from the output when writing only to the terminal. How to structure it is up to you, and explaining how to do that is out of scope for StackOverflow.
As the other answers state, the BEGIN block is executed at the very start of your program.
BEGIN {
$|=1;
print "Content-type: text/html\n\n";
use CGI::Carp('fatalsToBrowser');
}
There, you output an HTTP header Content-type: text/html\n\n. The browser sees that first, and treats all your output as HTML. But you only have text. Whitespace in an HTML page is collapsed into single spaces, so all your \n line breaks disappear.
Later, you print another header, the browser cannot see that as a header any more, because you already had one and finished it off with two newlines \n\n. It's now too late to switch back to text/plain.
It is perfectly fine to have a CGI program return text/plain and just have text without markup be displayed in a browser when all you want is text, and no colors or links or tables. For certain use cases this makes a lot of sense, even if it doesn't have the hyper in Hypertext any more. But you're not really doing that.
Your BEGIN block serves a purpose, but you are overdoing it. You're trying to make sure that when an error occurs, it gets nicely printed in the browser, so you don't need to deal with the server log while developing.
The CGI::Carp module and it's functionality fatalsToBrowser bring their own mechanism for that. You don't have to do it yourself.
You can safely remove the BEGIN block and just put your use CGI::CARP at the top of the script with all the other use statements. They all get run first anyway, because use gets run at compile time, while the rest of your code gets run at run time.
If you want, you can keep the $|++, which turns off the buffering for your STDOUT handle. It gets flushed immediately and every time you print something, that output goes directly to the browser instead of collecting until it's enough or there is a newline. If your process runs for a long time, this makes it easier for the user to see that stuff is happening, which is also useful in production.
The top of your program should look like this now.
#!/usr/bin/perl -T
# middle.pl - a simple concordance
use strict;
use warnigns;
use diagnostics;
use CGI;
use CGI::Carp('fatalsToBrowser');
$|=1;
my $q = CGI->new;
Finally, a a few quick words on the other parts I deleted from there.
Your comment requires over the use statements is misleading. Those are use, not require. As I said above, use gets run at compile time. require on the other hand gets run at run time and can be done conditionally. Misleading comments will make it harder for others (or you) to maintain your code later on.
I removed the -w flag from the shebang (#!/usr/bin/perl) and put the use warnings pragma in. That's a more modern way to turn on warnings, because sometimes the shebang can be ignored.
The use diagnostics pragma gives you extra long explanations when things go wrong. That's useful, but also extra slow. You can use it during development, but please remove it for production.
The comment sanity check should be moved down under the CGI instantiation.
Please use the invocation form of new to instantiate CGI, and any other classes. The -> syntax will take care of inheritance properly, while the old new CGI cannot do that.
I ran your cgi. The BEGIN block is run regardless and you print a content-type header here - you have explicitly asked for HTML here. Then later you attemp to print another header for PLAIN. This is why you can see the header text (that hasn't taken effect) at the beginning of the text in the browser window.

print doesn't work while iterations are going inside foreach loop

I try to find a way to print a progressbar on the commandline while parsing logfiles. Get logfiles=> foreach file => foreach line {do}.
The idea: I want to print a part of the progressbar in every "foreach file" loop. Meaing: print the whole bar if you just parse 1 file. print half of the bar for every file when u parse 2 files and so on. You find the specific code at the bottom.
The problem: The output (print "*") is printed after ALL foreach iteration are done - not in between. Details are in the Code.
Does someone have an idea how to print inside a foreach? Or can tell me the problem? I don't get it :(.
my #logfiles=glob($logpath);
print "<------------------>\n";
$vari=20/(scalar #logfiles);
foreach my $logfile (#logfiles){
open(LOGFILEhandle, $logfile);
#lines = <LOGFILEhandle>;
print "*" x $vari; #won't work, only after loop. Even a "print "*";" doesn't work
foreach my $line (#lines){
#print "*"; works "in between". print "*" x $vari; does not.
if ($line=~/xyz/){
......
......
}
close(LOGFILEhandle);
}
}
I would like to suggest Term::ProgressBar module to avoid reinventing the wheel.
#!/usr/bin/perl
use strict;
use warnings;
use Term::ProgressBar;
my #files = qw (file1 file2 file3 file4);
my $progress = Term::ProgressBar->new(scalar #files);
for (0..#files) {
$| = 1;
sleep(1); #introducing sleep for demo purpose otherwise bar will fill up quickly
#open the file, do some operations and when you are done
#update the bar
$progress->update($_);
}
You are suffering from buffering. The output is buffered until a certain amount is reached or you print a newline. To change this behaviour simply add
$| = 1 ;
at the top of your file. This will turn on autoflush for STDOUT.
There is more than one way to do it and a little bit longer and less cryptic is Borodins suggestion:
STDOUT->autoflush();

Read perl file handle with $INPUT_RECORD_SEPARATOR as a regex

I'm looking for a way to read from a file handle line by line (and then execute a function on each line) with the following twist: what I want to treat as a "line" shall be terminated by varying characters and not just a single character that I define as $/. I now that $INPUT_RECORD_SEPARATOR or $/ do not support regular expressions or passing a list of characters to be treated as line terminators and this is where my problem lies.
My file handle comes from stdout of a process. Thus, I cannot seek inside the file handle and the full content is not available immediately but is produced bit by bit as the process is executed. I want to be able to attach things like a timestamp to each "line" the process produces using a function that I called handler in my examples. Each line should be handled as soon as it gets produced by the program.
Unfortunately, I can only come up with a way that either executes the handler function immediately but seems horribly inefficient or a way that uses a buffer but will only lead to "grouped" calls of the handler function and thus, for example, produce wrong timestamps.
In fact, in my specific case, my regex would even be very simple and just read /\n|\r/. So for this particular problem I don't even need full regex support but just the possibility to treat more than one character as the line terminator. But $/ doesn't support this.
Is there an efficient way to solve this problem in Perl?
Here is some quick pseudo-perl code to demonstrate my two approaches:
read the input file handle byte-by-byte
This would look like this:
my $acc = "";
while (read($fd, my $b, 1)) {
$acc .= $b;
if ($acc =~ /someregex$/) {
handler($acc);
$acc = "";
}
}
The advantage here is, that handler gets immediately dispatched once enough bytes are read. The disadvantage is, that we do string appending and check the regex for every single byte we read from $fd.
read the input file handle with blocks of X-byte at a time
This would look like this:
my $acc = "";
while (read($fd, my $b, $bufsize)) {
if ($b =~ /someregex/) {
my #parts = split /someregex/, $b;
# for brevity lets assume we always get more than 2 parts...
my $first = shift #parts;
handler(acc . $first);
my $last = pop #parts;
foreach my $part (#parts) {
handler($part);
}
$acc = $last;
}
}
The advantage here is, that we are more efficient as we only check every $bufsize bytes. The disadvantage is, that the execution of handler has to wait until $bufsize bytes have been read.
Setting $INPUT_RECORD_SEPARATOR to a regex wouldn't help, because Perl's readline uses buffered IO, too. The trick is to use your second approach but with unbuffered sysread instead of read. If you sysread from a pipe, the call will return as soon as data is available, even if the whole buffer couldn't be filled (at least on Unix).
The suggestion by nwellnhof allowed me to implement a solution to this problem:
my $acc = "";
while (1) {
my $ret = sysread($fh, my $buf, 1000);
if ($ret == 0) {
last;
}
# we split with a capturing group so that we also retain which line
# terminator was used
# a negative limit is used to also produce trailing empty fields if
# required
my #parts = split /(\r|\n)/, $buf, -1;
my $numparts = scalar #parts;
if ($numparts == 1) {
# line terminator was not found
$acc .= $buf;
} elsif ($numparts >= 3) {
# first match needs special treatment as it needs to be
# concatenated with $acc
my $first = shift #parts;
my $term = shift #parts;
handler($acc . $first . $term);
my $last = pop #parts;
for (my $i = 0; $i < $numparts - 3; $i+=2) {
handler($parts[$i] . $parts[$i+1]);
}
# the last part is put into the accumulator. This might
# just be the empty string if $buf ended in a line
# terminator
$acc = $last;
}
}
# if the output didn't end with a linebreak, handle the rest
if ($acc ne "") {
handler($acc);
}
My tests show that indeed sysread will return even before having read 1000 characters if there is a pause in the input stream. The code above takes care to concatenate multiple messages of length 1000 and split messages with a lesser length or multiple terminators correctly.
Please shout if you see any bug in above code.

delete previous and next lines in perl

I have the following file:
#TWEETY:150:000000000-ACFKE:1:2104:27858:17965
AAATTAGCAAAAAACAATAACAAAACTGGGAAAATGCAATTTAACAACGAAAATTTTCCGAGAACTTGAAAGCGTACGAAAACGATACGCTCC
+
D1FFFB11FDG00EE0FFFA1110FAA1F/ABA0FGHEGDFEEFGDBGGGGFEHBFDDG/FE/EGH1#GF#F0AEEEEFHGGFEFFCEC/>EE
#TWEETY:150:000000000-ACFKE:1:1105:22044:20029
AAAAAATATTAAAACTACGAATGCATAAATTATTTCGTTCGAAATAAACTCACACTCGTAACATTGAACTACGCGCTCC
+
CCFDDDFGGGGGGGGGGHGGHHHHGHHHHHHHHHHHHHHHGHHGHHHHHHHHHHHHHGHGHGGHHHHHHGHHEGGGGGG
#TWEETY:150:000000000-ACFKE:1:2113:14793:7182
TATATAAAGCGAGAGTAGAAACTTTTTAATTGACGCGGCGAGAAAGTATATAGCAACAAGCGAGCACCCGCTCC
+
BBFFFFFGGGGFFGGFGHHHHHHHHHHHHHHHHHGGAEEEAFGGGHHFEGHHGHHHHHGHHGGGGFHHGG?EEG
#TWEETY:150:000000000-ACFKE:1:2109:5013:22093
AAAAAAATAATTCATATCGCCATATCGACTGACAGATAATCTATCTATAATCATAACTTTTCCCTCGCTCC
+
DAFAADDGF1EAGG3EG3A00ECGDFFAEGFCHHCAGHBGEAGBFDEDGGHBGHGFGHHFHHHBDG?/FA/
#TWEETY:150:000000000-ACFKE:1:2106:25318:19875
+
CCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGGGFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
The lines are in groups of four (each time there is a name, starting with #TWEETY, a string of letters, a + character, and another string of letters).
The second and fourth lines should have the same number of characters.
But there are cases where the second line is empty, as in the last four lines.
In these cases, I would like to get rid of the whole block (the previous line before the empty line and the next two lines).
I have just started perl and have been trying to write a script for my problem, but am having a hard time. Does anyone have some feedback?
Thanks!
Keep an array buffer of the last four lines. When it's full, check the second line, print the lines or not, empty the buffer, repeat.
#!/usr/bin/perl
use warnings;
use strict;
my #buffer;
sub output {
print #buffer unless 1 == length $buffer[1];
#buffer = ();
}
while (<>) {
if (4 == #buffer) {
output();
}
push #buffer, $_;
}
output(); # Don't forget to process the last four lines.
Yes. Start with looking at $/ and set it so you can work on a chunk at a time. I would suggest you can treat # as a record separator in your example.
Then iterate your records using a while loop. E.g. while ( <> ) {
Use split on \n to turn the current chunk into an array of lines.
Perform your test on the appropriate lines, and either print - or not - depending on whether it passed.
If you get stuck with that, then I'm sure a specific question including your code and where you're having problems will be well received here.
If you chunk the data correctly, this becomes almost trivial.
#!/usr/bin/perl
use strict;
use warnings;
# Use '#TWEETY' as the record separator to make it
# easy to chunk the data.
local $/ = '#TWEETY';
while (<DATA>) {
# The first entry will be empty (as the separator
# is the first thing in the file). Skip that record.
next unless /\S/;
# Skip any records with two consecutive newlines
# (as they will be the ones with the empty line 2)
next if /\n\n/;
# Print the remaining records
# (with $/ stuck back on the front)
print "$/$_";
}
__DATA__
#TWEETY:150:000000000-ACFKE:1:2104:27858:17965
AAATTAGCAAAAAACAATAACAAAACTGGGAAAATGCAATTTAACAACGAAAATTTTCCGAGAACTTGAAAGCGTACGAAAACGATACGCTCC
+
D1FFFB11FDG00EE0FFFA1110FAA1F/ABA0FGHEGDFEEFGDBGGGGFEHBFDDG/FE/EGH1#GF#F0AEEEEFHGGFEFFCEC/>EE
#TWEETY:150:000000000-ACFKE:1:1105:22044:20029
AAAAAATATTAAAACTACGAATGCATAAATTATTTCGTTCGAAATAAACTCACACTCGTAACATTGAACTACGCGCTCC
+
CCFDDDFGGGGGGGGGGHGGHHHHGHHHHHHHHHHHHHHHGHHGHHHHHHHHHHHHHGHGHGGHHHHHHGHHEGGGGGG
#TWEETY:150:000000000-ACFKE:1:2113:14793:7182
TATATAAAGCGAGAGTAGAAACTTTTTAATTGACGCGGCGAGAAAGTATATAGCAACAAGCGAGCACCCGCTCC
+
BBFFFFFGGGGFFGGFGHHHHHHHHHHHHHHHHHGGAEEEAFGGGHHFEGHHGHHHHHGHHGGGGFHHGG?EEG
#TWEETY:150:000000000-ACFKE:1:2109:5013:22093
AAAAAAATAATTCATATCGCCATATCGACTGACAGATAATCTATCTATAATCATAACTTTTCCCTCGCTCC
+
DAFAADDGF1EAGG3EG3A00ECGDFFAEGFCHHCAGHBGEAGBFDEDGGHBGHGFGHHFHHHBDG?/FA/
#TWEETY:150:000000000-ACFKE:1:2106:25318:19875
+
CCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGGGFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
Thanks everyone for the feedback!
It was all really useful. Thanks to your suggestions, I explored all the options and learned the unless statement.
The easiest solution given my existing code, was just to add an unless statement at the end.
### Write to output, but remove non-desired Gs
open OUT, ">$outfile";
my #accorder = #{$store0{"accorder"}};
foreach my $acc (#accorder){
# retrieve seq(2nd line) and qual(4th line)
my $seq = $store0{$acc}{"seq"};
my $qual = $store0{$acc}{"qual"};
# clean out polyG at end
$seq =~ s/G{3,}.{0,1}$//;
my $lenseq = length($seq);
my $lenqual = length($qual);
my $startqual = $lenqual - $lenseq;
$qual = substr($qual, 0, $lenseq);
#the above was in order to remove multiple G characters at the end of the
#second line, which is what led to empty lines (lines that were made up of
#only Gs got cut out)
# print to output, unless sequence has become empty
unless($lenseq == 0){ #this is the unless statement I added
print OUT "\#$acc\n$seq\n+\n$qual\n";
}
}
close(OUT);