How do I make two Perl files communicate?

So I have something like this:
fork.pl
for $str (@files)
{
my($command) = "perl command.pl ".$str;
exec( $command );
}
command.pl
$file=$ARGV[0].".csv";
#code that counts rows here
print $rowcount;
So as the end result I have 10 files launched which count how many rows are in each csv file.
I do not need help editing this code; it works (this is just a compressed version). I need help figuring out how to take the output ($rowcount) of the ten runs and combine it into one value for further processing.

I keep some utility code around for just this purpose... this is tweaked slightly to your question and including a synchronized global counting method.
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;
my @workers;
my $num_threads = 10;
my $queue = Thread::Queue->new;
my $total_lines :shared = 0; # shared so every thread updates the same counter
for (0..$num_threads-1) {
$workers[$_] = threads->create(\&worker);
}
while (defined(my $file = shift @ARGV)) {
$queue->enqueue($file);
}
sub worker {
while (defined(my $file = $queue->dequeue)) {
my $lines_counted = 0;
#line counting code here
global_counter($lines_counted);
}
}
sub global_counter {
#add to the number of lines counted
lock($total_lines); # held until the end of this sub
$total_lines += shift;
}
for (0..$num_threads-1) { $queue->enqueue(undef); }
for (0..$num_threads-1) { $workers[$_]->join; }
print $total_lines;

This kind of communication is solved using pipes (let me write a simple example):
# -- fork.pl -------------------------
for (1..3) {
open my $PIPE, "perl command.pl |";
print "catch: $_\n" while(<$PIPE>);
close $PIPE;
}
# -- command.pl ----------------------
print rand(1);
It prints (random numbers):
catch: 0.58929443359375
catch: 0.1290283203125
catch: 0.907012939453125

You need to look either at threads or Interprocess communication with e.g. sockets or shared memory when using fork.
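For the fork route, a pipe is usually the simplest of those IPC mechanisms. The following is only a minimal sketch under assumptions: `count_rows` and `total_rows` are hypothetical names, and the "files" are passed in as strings so the example runs without any CSV files on disk.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real row-counting code.
sub count_rows {
    my ($text) = @_;
    my $n = () = $text =~ /\n/g;   # number of newline characters
    return $n;
}

# Fork one child per input; each child writes its count to its own pipe.
sub total_rows {
    my @inputs = @_;
    my @children;
    for my $text (@inputs) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                 # child
            close $reader;
            print {$writer} count_rows($text), "\n";
            close $writer;
            exit 0;
        }
        close $writer;                   # parent keeps only the read end
        push @children, [$pid, $reader];
    }
    my $total = 0;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        my $count = <$reader>;
        close $reader;
        waitpid($pid, 0);
        chomp $count;
        $total += $count;
    }
    return $total;
}

print total_rows("a\nb\n", "c\n", "d\ne\nf\n"), "\n";
```

The parent only reads after all children have been started, so the children run concurrently while the results are still collected in a fixed order.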

Compressed, but it won't work as shown: exec replaces the current process, so the loop never gets past the first file. I'm assuming that in fork.pl you fork before exec'ing? Backticks capture the output of the called process, namely your prints:
fork.pl
for $str (@files)
{
my($command) = "perl command.pl ".$str;
print `$command`;
}
But rather than forking and launching processes, wouldn't it be smarter to turn the second file into a module?
package MyCommand;
use Exporter 'import'; # makes the @EXPORT list below take effect
our @EXPORT = qw( command );
sub command {
my $file = $_[0] . '.csv';
...
return $rowcount;
}
1;
fork.pl:
use MyCommand;
...
my @rowcounts;
for my $str (@files) {
push @rowcounts, command($str);
}
A bit of self-promotion, but I just posted this in your other thread, which seems relevant enough: How to run in parallel two child command from a parent one?

Accumulate pipes from children:
#!/usr/bin/perl -w
use strict;
my @files = qw/one.csv two.csv three.csv/;
my $command = "perl command.pl";
my @pipes;
foreach (@files) {
my $fd;
open $fd, "-|", "$command $_" and push @pipes, $fd;
};
my $sum = 0;
foreach my $pp (@pipes) {
$sum += $_ if defined ($_=<$pp>);
};
print $sum;
Then you can just read them one by one (as in example), or use IO::Select to read data as it appears in each pipe.
A hash table in addition to array is also good if you want to know which data comes from which source.
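The hash idea can be sketched roughly as follows. This is an illustration under assumptions, not the poster's code: `collect_outputs` is a hypothetical helper, the children are one-line stand-ins started with `$^X -e` instead of command.pl, and the list form of a piped open is assumed to be available on the platform.

```perl
use strict;
use warnings;

# Start one command per label, then return a hash of label => first output line.
# Each command is an array ref, so no shell quoting is involved.
sub collect_outputs {
    my (%commands) = @_;
    my %pipe_for;
    for my $label (sort keys %commands) {
        open(my $fh, '-|', @{ $commands{$label} })
            or die "cannot start command for $label: $!";
        $pipe_for{$label} = $fh;
    }
    my %output_for;
    for my $label (sort keys %pipe_for) {
        my $line = readline $pipe_for{$label};
        close $pipe_for{$label};
        next unless defined $line;
        chomp $line;
        $output_for{$label} = $line;
    }
    return %output_for;
}

# Stand-in children: each prints a fixed number instead of running command.pl.
my %counts = collect_outputs(
    'one.csv' => [ $^X, '-e', 'print 3' ],
    'two.csv' => [ $^X, '-e', 'print 4' ],
);
my $sum = 0;
$sum += $_ for values %counts;
print "$_: $counts{$_}\n" for sort keys %counts;
print "total: $sum\n";
```

Keying the pipes by file name means each count stays attached to its source, which the plain array of pipes cannot tell you.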

Related

How to pass hash contents of a forked subroutine back to main program?

I need to access in a main program the contents of hashes that were generated via subroutines that were forked. Here specifically is what I am trying to do:-
use Benchmark;
use File::Find;
use File::Basename;
use File::Path;
use Data::Dumper;
use strict;
use warnings;
print "Process ID: $$ \n";
my @PartitionRoots = qw(/nfs/dir1 /nfs/dir2 /nfs/dir3 /nfs/dir4);
my @PatternsToCheck = qw(prefix1 prefix2);
my @MatchedDirnames = qw();
my $DirCount = 0;
my $Forks = 0;
my @AllDirs = qw();
my %SweepStats = ();
foreach my $RootPath (@PartitionRoots) {
foreach my $Pattern (@PatternsToCheck) {
if (grep {-e} glob ("$RootPath/$Pattern*")) {
my @Files = glob ("$RootPath/$Pattern*");
foreach my $FileName (@Files) {
if (-d $FileName) {
$DirCount++;
push (@AllDirs, $FileName);
my $PID = fork;
if (not defined $PID) {
warn "Could not fork!\n";
next;
}
if ($PID) {
$Forks++;
print "In the parent PID ($$), Child pid: $PID Number of forked child processes: $Forks\n";
} else {
print "In the child PID ($$)\n";
find(\&file_stats, $FileName);
print "Child ($$) exiting...\n";
exit;
}
}
}
}
}
}
for (1 .. $Forks) {
my $PID = wait();
print "Parent saw child $PID exit.\n";
}
print "Parent ($$) ending.\n";
print Dumper (\%SweepStats);
foreach my $DirName (@AllDirs) {
print ("Printing $DirName contents...\n");
foreach (@{$SweepStats{$DirName}}) {
my $uname = $_->{uname};
my $mtime = $_->{mtime};
my $size = $_->{size};
my $file = $_->{file};
print ("$uname $mtime $size $file\n");
}
}
sub file_stats {
if (-f $File::Find::name) {
my $FileName = $_;
my $PathName = dirname($_);
my $DirName = basename($_);
my $uid = (stat($_))[4];
my $uname = getpwuid($uid);
my $size = (stat($_))[7];
my $mtime = (stat($_))[9];
if (defined $uname && $uname ne '') {
push @{$SweepStats{$FileName}}, {path=>$PathName,dir=>$DirName,uname=>$uname,mtime=>$mtime,size=>$size,file=>$File::Find::name};
} else {
push @{$SweepStats{$FileName}}, {path=>$PathName,dir=>$DirName,uname=>$uid,mtime=>$mtime,size=>$size,file=>$File::Find::name};
}
}
return;
}
exit;
...but Dumper is coming up empty, so the dereferencing and printing that immediately follows is empty, too. I know the file stat collecting is working, because if I replace the "push @{$SweepStats{$FileName}}" statements with print statements, I see exactly what is expected. I just need to properly access the hashes from the global level, but I cannot get it quite right. What am I doing wrong here? There are all kinds of posts about passing hashes to subroutines, but not the other way around.
Thanks!
The fork call creates a new, independent process. That child process and its parent cannot write to each other's data. So in order for data to be exchanged between the parent and the child we need to use some Inter-Process-Communication (IPC) mechanism.†
It is by far easiest to use a library that takes care of details, and Parallel::ForkManager seems rather suitable here as it provides an easy way to pass the data from child back to the parent, and it has a simple queue (to keep the number of simultaneous processes limited to a given number).
Here is some working code, and comments follow
use warnings;
use strict;
use feature 'say';
use File::Find;
use File::Spec;
use Parallel::ForkManager;
my %file_stats; # written from callback in run_on_finish()
my $pm = Parallel::ForkManager->new(16);
$pm->run_on_finish(
sub { # 6th argument is what is passed back from finish()
my ($pid, $exit, $ident, $signal, $core, $dataref) = @_;
foreach my $file_name (keys %$dataref) {
$file_stats{$file_name} = $dataref->{$file_name};
}
}
);
my @PartitionRoots = '.'; # For my tests: current directory,
my @PatternsToCheck = ''; # no filter (pattern is empty string)
my %stats; # for use by File::Find in child processes
foreach my $RootPath (@PartitionRoots) {
foreach my $Pattern (@PatternsToCheck) {
my @dirs = grep { -d } glob "$RootPath/$Pattern*";
foreach my $dir (@dirs) {
#say "Looking inside $dir";
$pm->start and next; # child process
find(\&get_file_stats, $dir);
$pm->finish(0, { %stats }); # exits, {%stats} passed back
}
}
}
$pm->wait_all_children;
sub get_file_stats {
return if not -f;
#say "\t$File::Find::name";
my ($uid, $size, $mtime) = (stat)[4,7,9];
my $uname = getpwuid $uid;
push @{$stats{$File::Find::name}}, {
path => $File::Find::dir,
dir => ( File::Spec->splitdir($File::Find::dir) )[-1],
uname => (defined $uname and $uname ne '') ? $uname : $uid,
mtime => $mtime,
size => $size,
file => $File::Find::name
};
}
Comments
The main question in all this is: at which part of your three-level hierarchy to spawn child processes? I left it as in the question, where for each directory a child is forked. This may be suitable if (most of) directories have many files; but if it isn't so and there is little work for each child to do then it may all get too busy and the overhead may reduce/deny the speedup
The %stats hash, necessary for File::Find to store the data it finds, need be declared outside of all loops so that it is seen in the sub. So it is inherited by all child processes, but each gets its own copy as due and we need not worry about data overlap or such
I simplified (and corrected) code other than the forking as well, following what seemed to me to be desired. Please let me know if that is off
See linked documentation, and for example this post and links in it for details
In order to display complex data structures use a library, of which there are many.
I use Data::Dump, intended to simply print nicely,
use Data::Dump qw(dd pp);
...
dd \%file_stats; # or: say "Stats for all files: ", pp \%file_stats;
for its compact output, while the most widely used is the core Data::Dumper
use Data::Dumper;
...
say Dumper \%file_stats;
which also "understands" data structures (so you can mostly eval them back).
(Note: In this case there'll likely be a lot of output! So redirect to a file, or exit those loops after the first iteration so just to see how it's all going.)
† As a process is forked the variables and data from the parent are available to the child. They aren't copied right away for efficiency reasons, so initially the child does read parent's data. But any data generated after the fork call in the parent or child cannot be seen by the other process.
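The footnote can be checked with a small experiment. This is only a sketch under assumptions: the `%seen` hash and its keys are made up for illustration, and a pipe is used solely so the child can report what it sees.

```perl
use strict;
use warnings;

my %seen = (before => 1);       # set before the fork: visible to the child

pipe(my $reader, my $writer) or die "pipe failed: $!";
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {                # child
    close $reader;
    $seen{child} = 1;           # changes only the child's copy
    print {$writer} join(',', sort keys %seen), "\n";
    close $writer;
    exit 0;
}

close $writer;
my $child_view = <$reader>;
close $reader;
waitpid($pid, 0);
chomp $child_view;

my $parent_view = join(',', sort keys %seen);
print "child saw:  $child_view\n";    # before,child
print "parent has: $parent_view\n";   # before
```

The parent's hash never gains the `child` key, which is the same reason %SweepStats stays empty in the question's parent process.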
Try this module: IPC::Shareable.
It is recommended by perldoc perlipc, and you can find an answer to your question here.

Perl - Implementing Perl Script with Perl Module

I would imagine this is too big and too specific for a normal StackOverflow question, so I can understand if there isn't any possible help. However I will try and show what is the issue I am facing. Also I am new to Perl and I know you shouldn't declare all variables at the start, I'm just trying to see if I can get this implemented first.
I have a Perl script:
use 5.010;
use Math::Trig ':radial';
use Math::Trig;
use List::Util qw(max min);
#Input parameters:
#The output filename:
$outfile = 'Tree.scad';
#The coordinates of the points that is to be supported.
$min_X=0;
$max_X=60;
$min_Y=0;
$max_Y=60;
$distance=10;
#The minimum angle from horizontal your printer can make, in degrees
$min_angle= 40;
#Ignore the next line, it is not an input parameter.
($X_ref,$Y_ref)=grid($min_X,$max_X,$min_Y,$max_Y,$distance);@X=@$X_ref;@Y=@{$Y_ref};
for $i (0..$#X){
$Z[$i]=20;#The function that defined the height of each point. This setting will give you a flat roof. For a more advanced tree, try:
#$Z[$i]=-0.01*$X[$i]**2+0.2*$Y[$i]-0.005*$Y[$i]**2+20;
}
#End of input parameters.
$min_radian = deg2rad($min_angle);
$b = tan($min_radian);
@Z=map{$_/$b}@Z;
open $output, '>', $outfile or die "error writing to '$outfile'";
print $output "width=2;\n";
print $output "sphere_radius=0;\n";
print $output "base_plate_size=10;\n\n";
while ($#X>0){
($I,$J)=find_min_dist(\@X,\@Y,\@Z);
($X_branch,$Y_branch,$Z_branch)=find_branch($X[$I],$Y[$I],$Z[$I],$X[$J],$Y[$J],$Z[$J]);
@X_list=($X_branch,$X[$I],$X[$J]);
@Y_list=($Y_branch,$Y[$I],$Y[$J]);
@Z_list=($Z_branch,$Z[$I],$Z[$J]);
for $j (0..$#Y_list){
if (abs($X_list[$j]) < 0.001){
$X_list[$j]=0;
}
if (abs($Y_list[$j]) < 0.001){
$Y_list[$j]=0;
}
if (abs($Z_list[$j]) < 0.001){
$Z_list[$j]=0;
}
}
branch(\@X_list,\@Y_list,\@Z_list);
splice(@X,$I,1,$X_branch);
splice(@X,$J,1);
splice(@Y,$I,1,$Y_branch);
splice(@Y,$J,1);
splice(@Z,$I,1,$Z_branch);
splice(@Z,$J,1);
}
print $output 'if(base_plate_size>0){';
print $output "\n translate([$X[0],$Y[0],$Z[0]*$b])\n";
print $output "cube([base_plate_size,base_plate_size,1],center=true);}";
sub grid{
my $d=$_[4];
@X_values=$_[0]/$d..$_[1]/$d;
@X_values=map{$_*$d} @X_values;
@Y_values=$_[2]/$d..$_[3]/$d;
@Y_values=map{$_*$d} @Y_values;
for $i (0..$#X_values){
@Y=(@Y,@Y_values);
for $j (0..$#Y_values){
$X[$i*($#Y_values+1)+$j]= $X_values[$i];
}
}
return (\@X,\@Y);
}
sub branch{
my @X=@{ $_[0] };
my @Y=@{ $_[1] };
my @Z=@{ $_[2] };
@Z=map{$_*$b}@Z;
for $i (1..$#X){
($rho, $theta, $phi) = cartesian_to_spherical($X[$i]-$X[0],$Y[$i]-$Y[0],$Z[$i]-$Z[0]);
$phi = rad2deg($phi);
if (abs($phi)<0.001){$phi=0;}
$theta = rad2deg($theta)+90;
if (abs($theta)<0.001){$theta=0;}
if (abs($rho)>0.001){
print $output "translate([$X[0],$Y[0],$Z[0]])\n";
print $output "rotate([0,0,$theta])\n";
print $output "rotate([$phi,0,0])\n";
print $output "translate([-width/2,-width/2,0])";
print $output "cube([width,width,$rho]);\n";
print $output 'if (sphere_radius>0){';
print $output "\n translate([$X[$i],$Y[$i],$Z[$i]])\n";
print $output "sphere(sphere_radius,center=1);}\n";}
}
}
sub find_min_dist{
my @X=@{ $_[0] };
my @Y=@{ $_[1] };
my @Z=@{ $_[2] };
my $min_dist=($X[0]-$X[1])**2+($Y[0]-$Y[1])**2+($Z[0]-$Z[1])**2;
my $max_Z=$Z[0];
my $I=0;
my $J=1;
for $i (1..$#Z){
if ($Z[$i]>=$max_Z){
$max_Z=$Z[$i];
$I=$i;}
}
for $j (0..$#X){
if ($j!=$I){
$dist=(($X[$I]-$X[$j])**2+($Y[$I]-$Y[$j])**2+($Z[$I]-$Z[$j])**2);
if ($min_dist>$dist){
$min_dist=$dist;
$J=$j;
}}}
return ($I,$J);
}
sub find_branch{
my $X1=$_[0];
my $Y1=$_[1];
my $Z1=$_[2];
my $X2=$_[3];
my $Y2=$_[4];
my $Z2=$_[5];
$rXY=sqrt(($X1-$X2)**2+($Y1-$Y2)**2);
if (abs($Z1-$Z2) < $rXY) {
$Z_branch=($Z1+$Z2-$rXY)/2;
$a=($Z1-$Z_branch)/$rXY;
$X_branch=(1-$a)*$X1+$a*$X2;
$Y_branch=(1-$a)*$Y1+$a*$Y2;
}
elsif ($Z1 < $Z2) {
$X_branch=$X1;
$Y_branch=$Y1;
$Z_branch=$Z1;
}
else {
$X_branch=$X2;
$Y_branch=$Y2;
$Z_branch=$Z2;
}
return ($X_branch,$Y_branch,$Z_branch);
}
Which produces a scad file and outputs it as this:
I thought it would be good to implement this method in a slicing program, Slic3r. Now what I have done is attempted to still keep it separate since I would like to show at least this structure in the program and decide whether or not it is possible to do.
Slic3r Original Code: https://github.com/slic3r/Slic3r/blob/21eb603cc16946b14e77d3c10cbee2f1163503c6/lib/Slic3r/Print/SupportMaterial.pm
Modified Slic3r Code: https://pastebin.com/aHzXT4RW
So the comparison is, I removed the generate_pillar_supports and added my grid subroutine. I assumed I would just have to call it since this script is separate to how it's generated compared to the other support structures on:
So replaced this:
my $shape = [];
if ($self->object_config->support_material_pattern eq 'pillars') {
$self->generate_pillars_shape($contact, $support_z, $shape);
}
With this:
my $shape = [];
if ($self->object_config->support_material_pattern eq 'pillars') {
$self->grid($min_X,$max_X,$min_Y,$max_Y,$distance);
}
However unfortunately, I have not been able to get a nice structure to form but rather this:
As I said, I know this is a large question and I'm not diving into the entire Slic3ing program so it might be even harder to understand. However just from a brief look, would anyone know what the issue is? Am I calling the subroutine wrong, does the script only work to produce a scad file, etc. All I would need is just to see if this is able to show or not. Thanks.
sub grid does not appear to be a method, but you are calling it as one
$self->grid($min_X,$max_X,$min_Y,$max_Y,$distance);
This syntax actually sends $self as the first argument, so that call is equivalent to the function call
grid($self,$min_X,$max_X,$min_Y,$max_Y,$distance);
What you probably want is to just say
grid($min_X,$max_X,$min_Y,$max_Y,$distance);
(You also really want to say
use strict;
use warnings;
at the top of every script)
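The difference between the two call styles is easy to see in isolation. The sketch below uses a made-up `Demo` package and a two-argument `grid` purely for illustration; it is not Slic3r code.

```perl
use strict;
use warnings;

package Demo;

sub new { return bless {}, shift }

# A plain function: it treats its first argument as the minimum X value.
sub grid {
    my ($min_x, $max_x) = @_;
    return "min=$min_x max=$max_x";
}

package main;

my $self = Demo->new;
my $as_function = Demo::grid(0, 60);
my $as_method   = $self->grid(0, 60);   # $self arrives as $min_x

print "$as_function\n";   # min=0 max=60
print "$as_method\n";     # min=Demo=HASH(0x...) max=0
```

With the method call every argument shifts one position to the right, so the function computes with an object reference where it expects a number.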

Perl non-blocking user input

#!/usr/bin/perl -w
use Term::ReadKey;
ReadMode('cbreak');
while (1) {
$char = ReadKey(-1);
next unless defined $char;
printf("Char: $char Decimal: %d\tHex: %x\n", ord($char), ord($char));
}
ReadMode('normal');
The above works great, but I want to be able to get user input while some executable is running. I tried the code below, but it is not working. Maybe running an executable while trying to read user input interferes? If so, how do I go about doing it?
I am getting output from $myexe and, depending on the user input, I would like to filter different things from that output.
#!/usr/bin/perl -w
use Term::ReadKey;
my $myexe = 'bin/myexecutable';
open my $EXE,
"$myexe distribute 2>&1 |"
or die 'Cannot open EXE';
ReadMode('cbreak');
while (<$EXE>) {
$char = ReadKey(-1);
if (defined $char) {
print ">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> $char\n"; #i would press a key but nothin prints out
}
print "$_\n";
}
ReadMode('normal');
I'm wary of running a 'busy-wait' loop like you'd get with Term::ReadKey. But what I'd suggest - if you're trying to do two things at once - is that it may be worth considering doing a spot of parallel code.
Something like:
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;
use Term::ReadKey;
my $myexe = 'bin/myexecutable';
my $filter : shared;
sub worker {
open my $EXE, "$myexe distribute 2>&1 |"
or die 'Cannot open EXE';
while ( my $line = <$EXE> ) {
#do something with filter here;
print "$filter : $line";
}
}
$filter = 0;
threads->create( \&worker );
my $keypress;
ReadMode 4;
while ( threads->list(threads::running) ) {
while ( not defined( $keypress = ReadKey(-1) )
and threads->list(threads::running) )
{
print "Waiting\nRunning:" . threads->list(threads::running) . "\n";
sleep 1;
}
print "Got $keypress\n";
$filter = $keypress;
}
ReadMode 0;
foreach my $thr ( threads->list ) {
$thr->join();
}
This is some fairly simple example code - you can extend it in a variety of ways, but the principle is this:
you start a thread to 'do the work'.
you handle the 'keypress watching' in the 'main' thread.
Because there's a sleep in there, you're not busy-waiting on a keypress (e.g. polling as fast as a processor will spin).

Show bash script output in live on perl - curses script

I'm a newb' on Perl, and try to do a simple script's launcher in Perl with Curses (Curses::UI)
On Stackoverflow I found a solution to print (in Perl) in real time the output of a Bash script.
But I can't do this with my Curses script, to write this output in a TextEditor field.
For example, the Perl script :
#!/usr/bin/perl -w
use strict;
use Curses::UI;
use Curses::Widgets;
use IO::Select;
my $cui = new Curses::UI( -color_support => 1 );
[...]
my $process_tracking = $container_middle_right->add(
"text", "TextEditor",
-readonly => 1,
-text => "",
);
sub launch_and_read()
{
my $s = IO::Select->new();
open my $fh, '-|', './test.sh';
$s->add($fh);
while (my @readers = $s->can_read()) {
for my $fh (@readers) {
if (eof $fh) {
$s->remove($fh);
next;
}
my $l = <$fh>;
$process_tracking->text( $l );
my $actual_text = $process_tracking->text() . "\n";
my $new_text = $actual_text . $l;
$process_tracking->text( $new_text );
$process_tracking->cursor_to_end();
}
}
}
[...]
$cui->mainloop();
This script contains a button to launch launch_and_read().
And the test.sh :
#!/bin/bash
for i in $( seq 1 5 )
do
sleep 1
echo "from $$ : $( date )"
done
The result is that my application freezes while the bash script is executing, and the output is written to my TextEditor field only at the end.
Is there a solution to show in real time what's happened in the Shell script, without blocking the Perl script ?
Many thanks, and sorry if this question seems to be stupid :x
You can't block. Curses's loop needs to run to process events. So you must poll. select with a timeout of zero can be used to poll.
my $sel;
sub launch_child {
$sel = IO::Select->new();
open my $fh, '-|', './test.sh';
$sel->add($fh);
}
sub read_from_child {
if (my @readers = $sel->can_read(0)) {
for my $fh (@readers) {
my $rv = sysread($fh, my $buf, 64*1024);
if (!$rv) {
$sel->remove($fh);
close($fh);
next;
}
... add contents of $buf to the ui here ...
}
}
}
launch_child();
$cui->set_timer(read_from_child => \&read_from_child, 1);
$cui->mainloop();
Untested.
Note that I switched from readline (<>) to sysread since the former blocks until a newline is received. Using blocking calls like read or readline defies the point of using select. Furthermore, using buffering calls like read or readline can cause select to say nothing is waiting when there actually is. Never use read and readline with select.
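One consequence of switching to sysread is that chunks no longer line up with line boundaries, so the caller has to reassemble lines itself. A minimal sketch of that bookkeeping, assuming a hypothetical helper named `extract_lines` and a caller-owned buffer (neither is part of Curses::UI or IO::Select):

```perl
use strict;
use warnings;

# Append a freshly read chunk to the pending buffer and return
# every complete line now available; the partial tail stays buffered.
sub extract_lines {
    my ($buffer_ref, $chunk) = @_;
    $$buffer_ref .= $chunk;
    my @lines;
    while ($$buffer_ref =~ s/\A([^\n]*)\n//) {
        push @lines, $1;
    }
    return @lines;
}

# Two chunks as sysread might deliver them: the second line is split.
my $pending = '';
my @first  = extract_lines(\$pending, "from 1 : Mon\nfrom 1 : T");
my @second = extract_lines(\$pending, "ue\n");
print "$_\n" for @first, @second;
```

In the timer callback, each $buf returned by sysread would be passed through such a helper, and only the completed lines appended to the text widget.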

Pimp my Perl code

I'm an experienced developer, but not in Perl. I usually learn Perl to hack a script, then I forget it again until the next time. Hence I'm looking for advice from the pros.
This time around I'm building a series of data analysis scripts. Grossly simplified, the program structure is like this:
01 my $config_var = 999;
03 my $result_var = 0;
05 foreach my $file (@files) {
06 open(my $fh, $file);
07 while (<$fh>) {
08 &analyzeLine($_);
09 }
10 }
12 print "$result_var\n";
14 sub analyzeLine ($) {
15 my $line = shift(@_);
16 $result_var = $result_var + calculatedStuff;
17 }
In real life, there are up to about half a dozen different config_vars and result_vars.
These scripts differ mostly in the values assigned to the config_vars. The main loop will be the same in every case, and analyzeLine() will be mostly the same but could have some small variations.
I can accomplish my purpose by making N copies of this code, with small changes here and there; but that grossly violates all kinds of rules of good design. Ideally, I would like to write a series of scripts containing only a set of config var initializations, followed by
do theCommonStuff;
Note that config_var (and its siblings) must be available to the common code, as must result_var and its lookalikes, upon which analyzeLine() does some calculations.
Should I pack my "common" code into a module? Create a class? Use global variables?
While not exactly code golf, I'm looking for a simple, compact solution that will allow me to DRY and write code only for the differences. I think I would rather not drive the code off a huge table containing all the configs, and certainly not adapt it to use a database.
Looking forward to your suggestions, and thanks!
Update
Since people asked, here's the real analyzeLine:
# Update stats with time and call data in one line.
sub processLine ($) {
my $line = shift(@_);
return unless $line =~ m/$log_match/;
# print "$1 $2\n";
my ($minute, $function) = ($1, $2);
$startMinute = $minute if not $startMinute;
$endMinute = $minute;
if ($minute eq $currentMinute) {
$minuteCount = $minuteCount + 1;
} else {
if ($minuteCount > $topMinuteCount) {
$topMinute = $currentMinute;
$topMinuteCount = $minuteCount;
printf ("%40s %s : %d\n", '', $topMinute, $topMinuteCount);
}
$totalMinutes = $totalMinutes + 1;
$totalCount = $totalCount + $minuteCount;
$currentMinute = $minute;
$minuteCount = 1;
}
}
Since these variables are largely interdependent, I think a functional solution with separate calculations won't be practical. I apologize for misleading people.
Two comments: First, don't post line numbers as they make it more difficult than necessary to copy, paste and edit. Second, don't use &func() to invoke a sub. See perldoc perlsub:
A subroutine may be called using an explicit & prefix. The & is optional in modern Perl, ... Not only does the & form make the argument list optional, it also disables any prototype checking on arguments you do provide.
In short, using & can be surprising unless you know what you are doing and why you are doing it.
Also, don't use prototypes in Perl. They are not the same as prototypes in other languages and, again, can have very surprising effects unless you know what you are doing.
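One of the surprises with the & form is that calling a sub as &name with no parentheses hands it the caller's current @_. A small sketch, with made-up sub names:

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub outer {
    my $plain     = count_args();   # explicit empty list: 0 arguments
    my $ampersand = &count_args;    # no parens: callee sees outer's @_
    return ($plain, $ampersand);
}

my ($plain, $ampersand) = outer('a', 'b', 'c');
print "count_args() saw $plain, &count_args saw $ampersand\n";
```

Here the plain call receives no arguments, while the ampersand call silently receives the three arguments that were passed to `outer`.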
Do not forget to check the return value of system calls such as open. Use autodie with modern perls.
For your specific problem, collect all configuration variables in a hash. Pass that hash to analyzeLine.
#!/usr/bin/perl
use warnings; use strict;
use autodie;
my %config = (
frobnicate => 'yes',
machinate => 'no',
);
my $result;
$result += analyze_file(\%config, $_) for @ARGV;
print "Result = $result\n";
sub analyze_file {
my ($config, $file) = @_;
my $result;
open my $fh, '<', $file;
while ( my $line = <$fh> ) {
$result += analyze_line($config, $line);
}
close $fh;
return $result;
}
sub analyze_line {
my ($config, $line) = @_;
return length $line;
}
Of course, you will note that $config is being passed all over the place, which means you might want to turn this into an OO solution:
#!/usr/bin/perl
package My::Analyzer;
use strict; use warnings; use autodie; # autodie is lexical, so it must be enabled in this package too
use base 'Class::Accessor::Faster';
__PACKAGE__->follow_best_practice;
__PACKAGE__->mk_accessors( qw( analyzer frobnicate machinate ) );
sub analyze_file {
my $self = shift;
my ($file) = @_;
my $result;
open my $fh, '<', $file;
while ( my $line = <$fh> ) {
$result += $self->analyze_line($line);
}
close $fh;
return $result;
}
sub analyze_line {
my $self = shift;
my ($line) = @_;
return $self->get_analyzer->($line);
}
package main;
use warnings; use strict;
use autodie;
my $x = My::Analyzer->new;
$x->set_analyzer(sub {
my $length; $length += length $_ for @_; return $length;
});
$x->set_frobnicate('yes');
$x->set_machinate('no');
my $result;
$result += $x->analyze_file($_) for @ARGV;
print "Result = $result\n";
Go ahead and create a class hierarchy. Your task is an ideal playground for OOP style of programming.
Here's an example:
package Common;
sub new{
my $class=shift;
my $this=bless{},$class;
$this->init();
return $this;
}
sub init{}
sub theCommonStuff{
my $this=shift;
for(1..10){ $this->analyzeLine($_); }
}
sub analyzeLine{
my($this,$line)=@_;
$this->{'result'}.=$line;
}
package Special1;
our @ISA=qw/Common/;
sub init{
my $this=shift;
$this->{'sep'}=','; # special param: separator
}
sub analyzeLine{ # modified logic
my($this,$line)=@_;
$this->{'result'}.=$line.$this->{'sep'};
}
package main;
my $c = new Common;
my $s = new Special1;
$c->theCommonStuff;
$s->theCommonStuff;
print $c->{'result'}."\n";
print $s->{'result'}."\n";
If all the common code is in one function, a function taking your config variables as parameters, and returning the result variables (either as return values, or as in/out parameters), will do. Otherwise, making a class ("package") is a good idea, too.
sub common_func {
my ($config, $result) = @_;
# ...
$result->{foo} += do_stuff($config->{bar});
# ...
}
Note in the above that both the config and result are hashes (actually, references thereto). You can use any other data structure that you feel will suit your goal.
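To make the pattern concrete, here is a runnable sketch under assumptions: `do_stuff`, the `pattern` config key, and the `lines`/`matches` result keys are all hypothetical, since the real calculation is not shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical per-line calculation; the real one is not shown in the thread.
sub do_stuff {
    my ($line, $pattern) = @_;
    return $line =~ $pattern ? 1 : 0;
}

# Shared code: reads settings from %$config, accumulates into %$result.
sub common_func {
    my ($config, $result, @lines) = @_;
    for my $line (@lines) {
        $result->{lines}   += 1;
        $result->{matches} += do_stuff($line, $config->{pattern});
    }
    return;
}

# A per-script front end then only has to set up its own config.
my %config = (pattern => qr/ERROR/);
my %result = (lines => 0, matches => 0);
common_func(\%config, \%result, "ok\n", "ERROR one\n", "ERROR two\n");
print "$result{matches} of $result{lines} lines matched\n";
```

Because the result hash is passed by reference, several interdependent totals can be updated in one pass without resorting to global variables.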
Some thoughts:
If there are several $result_vars, I would recommend creating a separate subroutine for calculating each one.
If a subroutine relies on information outside that function, it should be passed in as a parameter to that subroutine, rather than relying on global state.
Alternatively wrap the whole thing in a class, with $result_var as an attribute of the class.
Practically speaking, there are a couple ways you could implement this:
(1) Have your analyzeLine function return calculatedStuff, and add it to $result_var in a loop outside the function:
$result_var = 0;
foreach my $file (@files) {
open(my $fh, $file);
while (<$fh>) {
$result_var += analyzeLine($_);
}
}
sub analyzeLine ($) {
my $line = shift(@_);
return calculatedStuff;
}
(2) Pass $result_var into analyzeLine explicitly, and return the changed $result_var.
$result_var = 0;
foreach my $file (@files) {
open(my $fh, $file);
while (<$fh>) {
$result_var = addLineToResult($result_var, $_);
}
}
sub addLineToResult ($$) {
my $running_total = shift(@_);
my $line = shift(@_);
return $running_total + calculatedStuff;
}
The important part is that if you separate out functions for each of your several $result_vars, you'll be more readily able to write clean code. Don't worry about optimizing yet. That can come later, when your code has proven itself slow. The improved design will make optimization easier when the time comes.
Why not create a function that takes $config_var and $result_var as parameters?