Localizing "$|" - perl

Can I use both ways to localize $| or should I use one in favor of the other?
Way 1: backup old value of $| in "_init_scr" and set back $| to the old value when "_end_win" is called.
Way 2: calling local $| = 1 after "_init_scr" is called.
package Package_name;
# ...
sub _init_scr {
my ( $arg ) = @_;
$arg->{old_handle} = select( $arg->{handle_out} );
#$arg->{backup_flush} = $|; # way 1
$| = 1;
# ...
}
sub _end_win {
my ( $arg ) = @_;
# ...
#$| = $arg->{backup_flush}; # way 1
select( $arg->{old_handle} );
}
sub choose {
my $arg = ...;
# ...
_init_scr( $arg );
# way 2 - instead of setting `$|` in "_init_scr" set it here:
#local $| = 1;
# ...
while ( 1 ) {
my $c = _getch( $arg );
# ...;
given ( $c ) {
# ...
when ( $c == CONTROL_C ) {
_end_win( $arg );
print "^C";
kill( 'INT', $$ );
return;
}
when ( $c == KEY_ENTER ) {
# ...
_end_win( $arg );
return $result;
}
}
}
}

Use local. That way, $| is restored no matter how the sub is exited (exception, early return, etc).
By the way, you could use select()->flush; instead of toggling $| back and forth.
use IO::Handle qw( ); # Required in older versions of Perl.
print "^C";
select()->flush();
That said, the advantage of local $| is gone since you need to call _end_win to clean up anyway. So let's get rid of the need for _end_win.
use Sub::ScopeFinalizer qw( scope_finalizer );
sub _local_scr {
my ( $arg ) = @_;
my $old_autoflush = $|;
my $old_handle = select( $arg->{handle_out} );
$| = 1;
return scope_finalizer {
$| = $old_autoflush;
select($old_handle);
};
}
sub choose {
my $arg = ...;
my $guard = _local_scr( $arg );
while ( 1 ) {
...
print "^C";
kill( 'INT', $$ );
return;
...
}
}
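If adding a CPAN dependency such as Sub::ScopeFinalizer is not an option, the same scope-exit cleanup can be approximated with a small guard object whose DESTROY method runs a callback. This is only a sketch: the package name My::ScopeGuard and the sub guarded_work are made up for illustration and are not part of any module mentioned above.

```perl
use strict;
use warnings;

package My::ScopeGuard;    # hypothetical name, not a CPAN module

sub new {
    my ( $class, $code ) = @_;
    return bless { code => $code }, $class;
}

# Runs when the last reference to the guard goes away, i.e. at scope exit.
sub DESTROY {
    my ($self) = @_;
    local $@;              # do not clobber an exception in flight
    $self->{code}->();
}

package main;

my @events;

sub guarded_work {
    my $guard = My::ScopeGuard->new( sub { push @events, 'cleanup' } );
    push @events, 'work';
    return 'done';         # early return still triggers the guard
}

guarded_work();
print join( ',', @events ), "\n";
```

The cleanup closure would hold the `$| = $old_autoflush; select($old_handle);` pair from _local_scr in place of the push.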

If you want to localize the value, just use local. It will handle restoring the original value when the scope where it was localized is exited without any additional effort (or chance for mistakes) on your part.
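A small sketch of that behavior, using two made-up subs (flush_then_die and autoflush_restored_after_exception): the first localizes $| and then dies, the second checks that the previous value is back afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: turns autoflush on for its own scope, then fails.
sub flush_then_die {
    local $| = 1;
    die "simulated failure\n";
}

sub autoflush_restored_after_exception {
    my $before = $|;
    eval { flush_then_die(); 1 };     # the exception is deliberately ignored
    return $| == $before ? 1 : 0;
}

print autoflush_restored_after_exception() ? "restored\n" : "leaked\n";
```

One caveat that seems worth checking in the question's code: $| always refers to the currently selected handle, so if select() switches handles between `local $| = 1` and the end of the scope, the saved value is written back to whichever handle is selected at that point.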

Third way:
use IO::Handle;
# ...
$arg->{handle_out}->autoflush(1);
# ...
$arg->{handle_out}->autoflush(0);
There are various other convenient methods available in IO::Handle.
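As a rough illustration of the per-handle approach, the sketch below toggles autoflush on an in-memory handle without ever changing the selected handle. The helper autoflush_state and the in-memory handle are assumptions made for the demo; in the question the handle would be $arg->{handle_out}.

```perl
use strict;
use warnings;
use IO::Handle;

# Report the autoflush flag of a handle by briefly selecting it.
sub autoflush_state {
    my ($fh) = @_;
    my $old   = select($fh);
    my $state = $|;
    select($old);
    return $state;
}

open( my $out, '>', \my $buffer ) or die "Cannot open in-memory handle: $!";

$out->autoflush(1);
my $during = autoflush_state($out);

$out->autoflush(0);
my $after = autoflush_state($out);

print "during: $during, after: $after\n";
```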

You should use local $| = 1;, as this is the idiomatic Perl way of doing this. It's simpler than keeping track of the value in another variable.
Use an extra set of braces (if needed) to create a scope so that it only applies to the part of the program that should have it set.
{
#Buffering is turned off only in here.
local $| = 1;
unbuffered_commands();
}
buffered_commands();

Weakening captures using Sub::Quote

I'd like to weaken captured variables in the code generated by Sub::Quote. For example, here's the non-quoted alternative:
use 5.10.0;
use Scalar::Util qw[ weaken ];
{
my $s = 'foo';
my $x = sub { say $s };
weaken( my $y = $x );
my $bar = sub { &$y };
&$bar;
$x = undef;
&$bar
}
and the output:
foo
Can't use an undefined value as a subroutine reference [...]
And here's my Sub::Quote attempt:
use 5.10.0;
use Sub::Quote;
use Scalar::Util qw[ weaken ];
{
my $s = 'foo';
my $x = sub { say $s };
weaken( my $y = $x );
my $bar = quote_sub( '&$y', { '$y' => \$y } );
&$bar;
$x = undef;
&$bar;
}
and the output:
foo
foo
Obviously the captured $y isn't weakened. Is there a way of altering the generated code to weaken captured variables?
The documentation is sparse, and the Sub::Quote implementation is complex; I'm fairly convinced this isn't possible with the current code, but I'd love to be shown to be wrong.
my $bar = quote_sub( '&$y', { '$y' => \$y } );
is roughly the same as
my $bar = eval(q{ my $y = $y; sub { &$y } });
(It does more, but those bits are irrelevant to this question). As you can see, that creates a new strong reference to the sub[1].
As a workaround, you could add a layer of indirection:
my $bar = eval(q{ my $y_ref = \$y; sub { &{ $$y_ref } } });
This can be achieved by using:
my $bar = quote_sub( '&{$$y_ref}', { '$y_ref' => \\$y } );
There wouldn't be any problems if the $y created by Sub::Quote was an alias for your $y. This can be achieved using Data::Alias or an experimental feature introduced in 5.22.
This can be demonstrated using the following:
{
package Sub::Quote;
my $sub = sub {
my ($from, $captures, $indent) = @_;
join(
'',
"use feature qw( refaliasing );\n",
"no warnings qw( experimental::refaliasing );\n",
map {
/^([\@\%\$])/
or croak "capture key should start with \@, \% or \$: $_";
(' ' x $indent).qq{\\my ${_} = \\${1}{${from}->{${\quotify $_}}};\n};
} keys %$captures
)
};
no warnings qw( redefine );
*capture_unroll = $sub;
}
my $bar = quote_sub( '&$y', { '$y' => \$y } );
You could talk to the module's maintainer about adding an option that would cause the use of aliasing.
[1] When you create a copy of a (strong or weak) reference, the copy is a strong reference.
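The copy-is-strong rule, and why the extra level of indirection helps, can be checked with Scalar::Util alone. This is a sketch that does not involve Sub::Quote; the sub name weak_copy_report is made up for the demo.

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken isweak );

sub weak_copy_report {
    my $s = 'foo';
    my $x = sub { $s };          # a closure, as in the question
    weaken( my $y = $x );

    my $copy  = $y;              # copying a weak reference yields a strong one
    my $y_ref = \$y;             # a reference to $y itself; no new reference to the sub

    my @report = (
        isweak($y)    ? 'weak' : 'strong',
        isweak($copy) ? 'weak' : 'strong',
    );

    undef $x;                    # $copy still keeps the sub alive
    push @report, defined($$y_ref) ? 'alive' : 'gone';

    undef $copy;                 # last strong reference dropped
    push @report, defined($$y_ref) ? 'alive' : 'gone';

    return @report;
}

print join( ' ', weak_copy_report() ), "\n";
```

Here $copy plays the role of the `my $y = $y;` that the generated code performs, and $y_ref plays the role of the `'$y_ref' => \\$y` capture.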

Understanding async in perl on specific example

I have to write a script that fetches some URLs in parallel and does some work on the results. In the past I have always used Parallel::ForkManager for such things, but now I want to learn something new and try asynchronous programming with AnyEvent (and AnyEvent::HTTP or AnyEvent::Curl::Multi). However, I'm having trouble understanding AnyEvent and writing a script that should:
open a file (every line is a separate URL)
(from here on in parallel, but limited to, e.g., 10 concurrent requests)
read the file line by line (I don't want to load the whole file into memory, as it might be big)
make an HTTP request for that URL
read the response
update the MySQL record accordingly
(next file line)
I have read many manuals and tutorials, but it's still hard for me to understand the difference between blocking and non-blocking code. I found a similar script at http://perlmaven.com/fetching-several-web-pages-in-parallel-using-anyevent, where Mr. Szabo explains the basics, but I still can't understand how to implement something like:
...
open my $fh, "<", $file;
while ( my $line = <$fh> )
{
# http request, read response, update MySQL
}
close $fh
...
... and add a concurrency limit in this case.
I would be very grateful for help ;)
UPDATE
Following Ikegami's advice I gave Net::Curl::Multi a try. I'm very pleased with results. After years of using Parallel::ForkManager just for concurrent grabbing thousands of URLs, Net::Curl::Multi seems to be awesome.
Here is my code with while loop on filehandle. It seems to work as it should, but considering it's my first time writing something like this I would like to ask more experienced Perl users to take a look and tell me if there are some potential bugs, something I missed, etc.
Also, if I may ask: as I don't fully understand how Net::Curl::Multi's concurrency works, please tell me whether I should expect any problems with putting MySQL UPDATE command (via DBI) inside RESPONSE loop (besides higher server load obviously - I expect final script to run with about 50 concurrent N::C::M workers, maybe more).
#!/usr/bin/perl
use Net::Curl::Easy qw( :constants );
use Net::Curl::Multi qw( );
sub make_request {
my ( $url ) = @_;
my $easy = Net::Curl::Easy->new();
$easy->{url} = $url;
$easy->setopt( CURLOPT_URL, $url );
$easy->setopt( CURLOPT_HEADERDATA, \$easy->{head} );
$easy->setopt( CURLOPT_FILE, \$easy->{body} );
return $easy;
}
my $maxWorkers = 10;
my $multi = Net::Curl::Multi->new();
my $workers = 0;
my $i = 1;
open my $fh, "<", "urls.txt";
LINE: while ( my $url = <$fh> )
{
chomp( $url );
$url .= "?$i";
print "($i) $url\n";
my $easy = make_request( $url );
$multi->add_handle( $easy );
$workers++;
my $running = 0;
do {
my ($r, $w, $e) = $multi->fdset();
my $timeout = $multi->timeout();
select $r, $w, $e, $timeout / 1000
if $timeout > 0;
$running = $multi->perform();
RESPONSE: while ( my ( $msg, $easy, $result ) = $multi->info_read() ) {
$multi->remove_handle( $easy );
$workers--;
printf( "%s getting %s\n", $easy->getinfo( CURLINFO_RESPONSE_CODE ), $easy->{url} );
}
# don't max out the CPU while waiting
select( undef, undef, undef, 0.01 );
} while ( $workers == $maxWorkers || ( eof && $running ) );
$i++;
}
close $fh;
Net::Curl is a rather good library that's extremely fast. Furthermore, it can handle parallel requests too! I'd recommend using this instead of AnyEvent.
use Net::Curl::Easy qw( :constants );
use Net::Curl::Multi qw( );
sub make_request {
my ( $url ) = @_;
my $easy = Net::Curl::Easy->new();
$easy->{url} = $url;
$easy->setopt( CURLOPT_URL, $url );
$easy->setopt( CURLOPT_HEADERDATA, \$easy->{head} );
$easy->setopt( CURLOPT_FILE, \$easy->{body} );
return $easy;
}
my $max_running = 10;
my @urls = ( 'http://www.google.com/' );
my $multi = Net::Curl::Multi->new();
my $running = 0;
while (1) {
while ( @urls && $running < $max_running ) {
my $easy = make_request( shift( @urls ) );
$multi->add_handle( $easy );
++$running;
}
last if !$running;
my ( $r, $w, $e ) = $multi->fdset();
my $timeout = $multi->timeout();
select( $r, $w, $e, $timeout / 1000 )
if $timeout > 0;
$running = $multi->perform();
while ( my ( $msg, $easy, $result ) = $multi->info_read() ) {
$multi->remove_handle( $easy );
printf( "%s getting %s\n", $easy->getinfo( CURLINFO_RESPONSE_CODE ), $easy->{url} );
}
}
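The question asks for URLs to be read lazily from a file rather than from an in-memory array. The sketch below shows only that refill pattern, with the network part simulated: the pool holds plain strings where real code would hold Net::Curl::Easy handles, and "finishing" a request is faked by shifting the oldest entry. The names run_limited and $on_done are made up for illustration.

```perl
use strict;
use warnings;

# Keep at most $max_running entries in flight, reading one line at a time.
sub run_limited {
    my ( $fh, $max_running, $on_done ) = @_;
    my @running;
    my $peak = 0;

    while (1) {
        # Top up the pool; the short-circuit avoids reading a line when full.
        while ( @running < $max_running && defined( my $line = <$fh> ) ) {
            chomp $line;
            push @running, $line;    # real code: $multi->add_handle(...)
        }
        last if !@running;

        $peak = @running if @running > $peak;

        # Stand-in for perform()/info_read(): pretend the oldest one finished.
        $on_done->( shift @running );
    }
    return $peak;
}

my $list = "u1\nu2\nu3\nu4\nu5\n";
open( my $fh, '<', \$list ) or die "Cannot open in-memory file: $!";

my @done;
my $peak = run_limited( $fh, 2, sub { push @done, $_[0] } );
print "peak=$peak done=@done\n";
```

With a real Net::Curl::Multi loop, the fill step would sit where the inner `while ( @urls && ... )` loop is, and the end-of-input test would replace the `@urls` check.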
The YADA module from CPAN does exactly what you want, asynchronously, by wrapping Net::Curl in a safe fashion:
#!/usr/bin/env perl
package MyDownloader;
use strict;
use warnings qw(all);
use Moo;
extends 'YADA::Worker';
has '+use_stats'=> (default => sub { 1 });
has '+retry' => (default => sub { 10 });
after init => sub {
my ($self) = @_;
$self->setopt(
encoding => '',
verbose => 1,
);
};
after finish => sub {
my ($self, $result) = @_;
if ($self->has_error) {
print "ERROR: $result\n";
} else {
# do the interesting stuff here
printf "Finished downloading %s: %d bytes\n", $self->final_url, length ${$self->data};
}
};
around has_error => sub {
my $orig = shift;
my $self = shift;
return 1 if $self->$orig(@_);
return 1 if $self->getinfo('response_code') =~ m{^5[0-9]{2}$}x;
};
1;
package main;
use strict;
use warnings qw(all);
use Carp;
use YADA;
my $q = YADA->new(
max => 8,
timeout => 30,
);
open(my $fh, '<', 'file_with_urls_per_line.txt')
or croak "can't open queue: $!";
while (my $url = <$fh>) {
chomp $url;
$q->append(sub {
MyDownloader->new($url)
});
}
close $fh;
$q->wait;

Is it unpolite to put an END block in a module?

Would it be OK to keep the END block in this example, since nobody wants a broken terminal, or should I avoid putting an END block in a module?
package My_Package;
use warnings;
use strict;
use Term::ReadKey;
sub _init_scr {
my ( $arg ) = @_;
$arg->{backup_flush} = $|;
$| = 1;
Term::ReadKey::ReadMode 'ultra-raw';
}
sub _end_win {
my ( $arg ) = @_;
print "\n\r";
Term::ReadKey::ReadMode 'restore';
$| = $arg->{backup_flush};
}
END {
Term::ReadKey::ReadMode 'restore';
}
sub my_function {
my $arg = {};
_init_scr( $arg );
while ( 1 ) {
my $c = ReadKey 0;
if ( ! defined $c ) {
_end_win( $arg );
warn "EOT";
return;
}
next if $c eq "\e";
given ( $c ) {
when ( $c ge 'a' && $c le 'z' ) {
print $c;
$arg->{string} .= $c;
}
when ( $c eq "\cC" ) {
_end_win( $arg );
print STDERR "^C";
kill( 'INT', $$ );
return;
}
when ( $c eq "\r" ) {
_end_win( $arg );
return $arg->{string};
}
}
}
}
If your module changes the terminal mode, then I would think the most polite thing to do would be for it to also install an END block to restore the terminal mode before the program exits.
No, it's polite and expected that you put things back as you found them.
However, it's unwelcome to tidy up someone else's workspace unless you've been asked to do so.
That is, your END routine shouldn't run unless it has reason to do so, and your module probably ought to allow a developer to disable the automatic cleanup. (E.g., use My_Package qw(:no_auto_restore).)
Failing that, the POD ought to explicitly document that the module fiddles with a system resource upon exit.
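One possible shape for that advice, as a sketch only: the END block restores the terminal only if the module actually changed it, and an import flag lets the caller opt out. The package name My_Package_Sketch and the @restore_log array are invented for the demo, and the Term::ReadKey calls are replaced by comments so the example runs without that module.

```perl
use strict;
use warnings;

package My_Package_Sketch;    # hypothetical stand-in for My_Package

my $auto_restore = 1;         # default: clean up automatically
my $mode_changed = 0;         # true only while we have altered the terminal
our @restore_log;             # records restores so the sketch is observable

# use My_Package_Sketch qw(:no_auto_restore); would disable the END cleanup.
sub import {
    my ( $class, @opts ) = @_;
    $auto_restore = 0 if grep { $_ eq ':no_auto_restore' } @opts;
}

sub _init_scr { $mode_changed = 1 }    # real code: ReadMode 'ultra-raw'

sub _end_win {
    return if !$mode_changed;          # nothing of ours to undo
    push @restore_log, 'restore';      # real code: ReadMode 'restore'
    $mode_changed = 0;
}

END { _end_win() if $auto_restore }

package main;

My_Package_Sketch::_init_scr();
My_Package_Sketch::_end_win();
My_Package_Sketch::_end_win();         # second call is a no-op
print scalar(@My_Package_Sketch::restore_log), "\n";
```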

Perl Classes :: Can not write output

I am new to Object oriented programming in perl. So, I have a silly question.
What --
I am writing a script which will do something and write result to stream ( STDOUT or NETWORK ).
How --
[main.pl]
#!/usr/bin/perl
use strict;
require const::StreamTypes;
require output;
my $out = output->new("output");
$out->writeLine("Sample output");
[output.pm]
#!/usr/bin/perl
use strict;
require const::StreamTypes;
package output;
my $OUTSTR;
sub new{
my $class = shift();
my $stream = shift();
if($stream eq const::StreamTypes->STDNET){
}elsif($stream eq const::StreamTypes->STDWEB){
}else{
*OUTSTR = *STDOUT;
}
my $self = {
"_outStream" => $stream,
"_outStreamPtr" => $OUTSTR
};
bless($self, $class);
}
sub writeLine{
my $msg = shift();
print(OUTSTR "$msg\n");
}
return 1;
So, can anyone help me understand what is going wrong here? The program runs without error but produces no output.
Thanks!
I changed a couple of things here:
the first parameter of a method is the invocant (instance or class) itself
indirect file handles are globals!
the autodie module comes in handy, if using open
consider using strict in your modules, too
I would not recommend the use of package global variable ( my $OUTSTR; ), because that's going to be messy with multiple instances, which want to have different streams.
And I have definitely gotten into the habit of using accessors for all attributes. You can use a lightweight system like Class::Accessor, or perhaps you are even lucky enough to be able to use Moose or Mouse. Of course, there are a couple of other modules that also provide accessors in different ways.
package output;
use strict;
use warnings;
use autodie;
use Class::Accessor "moose-like";
has "outStream" => ( is => 'rw' );
sub new{
my ( $class, $stream ) = @_;
my $self = bless( {}, $class );
if ( 0 ) {
# ...
} else {
open( my $outStream, '>&', \*STDOUT );
$self->outStream( $outStream );
}
return $self;
}
sub writeLine{
my ( $self, $msg ) = @_;
print { $self->outStream } "$msg\n";
}
return 1;
Moose would create a constructor for you, but you can insert your parameter processing as easy as follows:
use Moose;
has "outStream" => ( is => 'rw' );
sub BUILDARGS {
my ( $class, $stream ) = @_;
open( my $outStream, '>&', \*STDOUT );
return {
outStream => $outStream,
};
}
$OUTSTR and *OUTSTR are very different things -- you should clear up your misunderstanding about this before you worry about object oriented programming.
That said, you can probably fix this script by getting everything to refer to $OUTSTR:
...
}else{
$OUTSTR = *STDOUT;
}
...
print $OUTSTR "$msg\n";
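To make the difference concrete, here is a small sketch (the variable names $lexical_is_set and $glob_is_stdout are made up for the demo): assigning to the glob *OUTSTR aliases the package-level filehandle, while the lexical scalar $OUTSTR declared with my is left untouched.

```perl
use strict;
use warnings;

my $OUTSTR;            # a lexical scalar, as in the question
*OUTSTR = *STDOUT;     # aliases the package glob OUTSTR, not the lexical

my $lexical_is_set = defined $OUTSTR ? 1 : 0;
my $glob_is_stdout = fileno(*OUTSTR) == fileno(*STDOUT) ? 1 : 0;

print "lexical set: $lexical_is_set, glob aliases STDOUT: $glob_is_stdout\n";
```

This is why storing $OUTSTR in the object, as the question's constructor does, stores undef rather than a handle.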
How about just passing a file handle directly into the object's constructor?
package output;
sub new {
my ($class, $fh) = @_;
bless { file_handle => $fh }, $class;
}
sub writeLine {
my $self = shift;
my $line = shift;
print {$self->{file_handle}} $line;
}
1;
Example usage:
use IO::Socket::INET;
my $output = output->new(\*STDOUT); # write to stdout
my $socket = IO::Socket::INET->new(PeerAddr => 'www.perl.org', PeerPort => 'http(80)', Proto => 'tcp');
my $net_output = output->new($socket); # write to a socket
Please don't use barenames for file handles. Use lexical file handles.
The following lines assume that there is a hash %type_handlers somewhere that looks something like this:
{ const::StreamTypes->STDNET => \&constructor_for_stdnet_handles
, const::StreamTypes->STDWEB => \&constructor_for_stdweb_handles
}
Then you can replace the bottom of your constructor with:
my $handler = $type_handlers{ $stream };
my $outstr
= $handler ? $handler->()
: do { my $h; open( $h, '>&', \*::STDOUT ) and $h; }
;
return bless( {
_outStream => $stream
, _outStreamPtr => $outstr
}
, $class
);
Then writeLine becomes:
sub writeLine {
my ( $self, $msg ) = @_;
( $self->{_outStreamPtr} || *::STDOUT{IO} )->say( $msg );
}
The method is a little more robust in cases where somebody just blessed themselves into your class.
my $q_and_d = bless {}, 'output';
If you don't want to allow "quick & dirty" instances, and want more precise messages from possible failures, you could do this:
Carp::croak( 'No outstream!' )
unless my $h = Params::Util::_HANDLE( $self->{_outStreamPtr} )
;

How do you create a callback function (dispatch table) in Perl using hashes?

I want to call a main controller function that dispatches other function dynamically, something like this:
package Controller;
my %callback_funcs = ();
sub register_callback{
my ($class,$callback,$options) = _#;
#apppend to %callback_funcs hash ... ?
}
sub main{
%callback_funcs = ( add => 'add_func', rem => 'remove_func', edit => 'edit_func');
while(<STDIN>){
last if ($_ =~ /^\s*$/);
if($_ == 'add' || _$ == 'rem' || _$ == 'edit'){
$result = ${callback_funcs['add']['func']}(callback_funcs['add']['options']);
}
}
}
sub add_func{
...
}
One caveat is that the subs are defined in other Modules, so the callbacks would have to be able to reference them... plus
I'm having a hard time getting the hashes right!
So, it's possible to have a hash that contains anonymous subroutines that you can invoke from stdin.
my %callbacks = (
add => sub {
# do stuff
},
fuzzerbligh => sub {
# other stuff
},
);
And you can insert more hashvalues into the hash:
$callbacks{next} = sub {
...
};
And you would invoke one like this
$callbacks{next}->(@args);
Or
my $coderef = $callbacks{next};
$coderef->(@args);
You can get the hashkey from STDIN, or anywhere else.
You can also define them as ordinary named subs and then take a reference to them.
sub delete {
# regular sub definition
}
$callbacks{delete} = \&delete;
I wouldn't call these callbacks, however. Callbacks are subs that get called after another subroutine has returned.
Your code is also rife with syntax errors which may be obscuring the deeper issues here. It's also not clear to me what you're trying to do with the second level of arrays. When are you defining these subs, and who is using them when, and for what?
Perhaps this simplified example will help:
# Very important.
use strict;
use warnings;
# Define some functions.
sub multiply { $_[0] * $_[1] }
sub divide { $_[0] / $_[1] }
sub add { $_[0] + $_[1] }
sub subtract { $_[0] - $_[1] }
# Create a hash of references to those functions (dispatch table).
my %funcs = (
multiply => \&multiply,
divide => \&divide,
add => \&add,
subtract => \&subtract,
);
# Register some more functions.
sub register {
my ($key, $func) = @_;
$funcs{$key} = $func;
}
register('+', \&add); # As above.
register('sum', sub { # Or using an anonymous subroutine.
my $s = 0;
$s += $_ for @_;
return $s;
});
# Invoke them dynamically.
while (<>){
my ($op, @args) = split;
last unless $op and exists $funcs{$op}; # No need for equality tests.
print $funcs{$op}->(@args), "\n";
}
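The question also wanted per-callback options stored alongside each function. One way that might look, as a sketch: each table entry is a small hash holding the code reference and its options. The entry layout (func, options), the dispatch helper, and the start option are assumptions made for illustration.

```perl
use strict;
use warnings;

my %callback_funcs;

# Store the code reference together with its options under one key.
sub register_callback {
    my ( $name, $func, $options ) = @_;
    $callback_funcs{$name} = { func => $func, options => $options // {} };
    return;
}

# Look the entry up and call it; unknown names simply return nothing.
sub dispatch {
    my ( $name, @args ) = @_;
    my $entry = $callback_funcs{$name} or return;
    return $entry->{func}->( $entry->{options}, @args );
}

register_callback(
    add => sub {
        my ( $opt, @numbers ) = @_;
        my $sum = $opt->{start};
        $sum += $_ for @numbers;
        return $sum;
    },
    { start => 10 },
);

print dispatch( add => 1, 2 ), "\n";
```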
You've already got some good answers on how to build a dispatch table and call functions through it within a single file, but you also keep talking about wanting the functions to be defined in other modules. If that's the case, then wouldn't it be better to build the dispatch table dynamically based on what dispatchable functions each module says it has rather than having to worry about keeping it up to date manually? Of course it would!
Demonstrating this requires multiple files, of course, and I'm using Module::Pluggable from CPAN to find the modules which provide the function definitions.
dispatch_core.pl:
#!/usr/bin/env perl
use strict;
use warnings;
my %dispatch;
use lib '.'; # a demo is easier if I can put modules in the same directory
use Module::Pluggable require => 1, search_path => 'DTable';
for my $plugin (plugins) {
%dispatch = (%dispatch, $plugin->dispatchable);
}
for my $func (sort keys %dispatch) {
print "$func:\n";
$dispatch{$func}->(2, 5);
}
DTable/Add.pm:
package DTable::Add;
use strict;
use warnings;
sub dispatchable {
return (add => \&add);
}
sub add {
my ($num1, $num2) = @_;
print "$num1 + $num2 = ", $num1 + $num2, "\n";
}
1;
DTable/MultDiv.pm:
package DTable::MultDiv;
use strict;
use warnings;
sub dispatchable {
return (multiply => \&multiply, divide => \&divide);
}
sub multiply {
my ($num1, $num2) = @_;
print "$num1 * $num2 = ", $num1 * $num2, "\n";
}
sub divide {
my ($num1, $num2) = @_;
print "$num1 / $num2 = ", $num1 / $num2, "\n";
}
1;
Then, on the command line:
$ ./dispatch_core.pl
add:
2 + 5 = 7
divide:
2 / 5 = 0.4
multiply:
2 * 5 = 10
Adding new functions is now as simple as dropping a new file into the DTable directory with an appropriate dispatchable sub. No need to ever touch dispatch_core.pl just to add a new function again.
Edit: In response to the comment's question about whether this can be done without Module::Pluggable, here's a modified dispatch_core.pl which doesn't use any external modules other than the ones defining the dispatchable functions:
#!/usr/bin/env perl
use strict;
use warnings;
my %dispatch;
my @dtable = qw(
DTable::Add
DTable::MultDiv
);
use lib '.';
for my $plugin (@dtable) {
eval "use $plugin; 1" or die "Cannot load $plugin: $@";
%dispatch = (%dispatch, $plugin->dispatchable);
}
for my $func (sort keys %dispatch) {
print "$func:\n";
$dispatch{$func}->(2, 5);
}
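Merging with `%dispatch = (%dispatch, ...)` silently lets a later plugin overwrite an earlier one that uses the same key. A slightly more defensive merge might look like the sketch below. To keep it self-contained, the two plugin packages are defined inline under made-up names (DTable::Demo::Add and DTable::Demo::Mult) instead of being loaded from files, and build_dispatch is a hypothetical helper.

```perl
use strict;
use warnings;

package DTable::Demo::Add;     # hypothetical inline plugin
sub dispatchable { return ( add => sub { $_[0] + $_[1] } ) }

package DTable::Demo::Mult;    # hypothetical inline plugin
sub dispatchable { return ( multiply => sub { $_[0] * $_[1] } ) }

package main;

# Merge the tables of all plugins, refusing missing methods and duplicates.
sub build_dispatch {
    my @plugins = @_;
    my %dispatch;
    for my $plugin (@plugins) {
        my $lister = $plugin->can('dispatchable')
            or die "$plugin does not provide dispatchable()\n";
        my %entries = $plugin->$lister();
        for my $name ( keys %entries ) {
            die "duplicate dispatch key '$name' from $plugin\n"
                if exists $dispatch{$name};
            $dispatch{$name} = $entries{$name};
        }
    }
    return %dispatch;
}

my %dispatch = build_dispatch(qw( DTable::Demo::Add DTable::Demo::Mult ));
print "$_: ", $dispatch{$_}->( 2, 5 ), "\n" for sort keys %dispatch;
```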