Perl Creating hash reference and looping through one element from each branch at a time - perl

As a beginner I have what I think is a rather complicated problem I am hoping someone could help with.
I have the following text file (tab delminated)...
FILE1.txt
Dog Big
Dog Medium
Dog Small
Rabbit Huge
Rabbit Tiny
Rabbit Middle
Donkey Massive
Donkey Little
Donkey Gigantic
I need to read FILE1.txt into a hash reference to get something like the following... (using Data::Dumper)
$VAR1 = {
'Dog' => {
'Big',
'Medium',
'Small'
},
'Rabbit => {
'Huge',
'Tiny',
'Middle'
},
'Donkey => {
'Massive',
'Little',
'Gigantic'
},
};
The problem I am having:
I then need to loop through each branch of the hash reference one at a time, I will use the value from the hash reference to check if this matches my keyword, if so it will then return it's corresponding key.... for example...
What I need it to do:
my $keyword == "Little";
Dog->Big
if 'Big' matches my keyword then return $found = Dog
else go to the next branch
Rabbit->Huge
if 'Huge' matches my keyword then return $found = Rabbit
else go to the next branch
Donkey->Massive
if 'Massive' matches my keyword then return $found = Donkey
else go to the next branch (which is Dog again, but the second element this time)
Dog->Medium
if 'Medium' matches my keyword then return $found = Dog
else go to the next branch
Rabbit->Tiny
if 'Tiny' matches my keyword then return $found = Rabbit
else go the the next branch
Donkey->Little
if 'Little' matches my keyword then return $found = Donkey
..... and so on until the keyword is found or we reach the end of the hash reference
This is the kind of thing I am trying to achieve but don't know how to go about doing this, or whether a hash reference is the best way to do this, or if it can even be done with a hash/hash reference?
your help with this is much appreciated, thanks

Choosing proper data structure is often key step to the solution, but first of all you should define what you are trying achieve. What is overall goal? For example I have this data file and in mine application/program I need frequently ask for this information. It is crucial to ask proper question because for example if you don't need ask frequently for keyword it doesn't make sense creating hash at all.
perl -anE'say $F[0] if $F[1] eq "Little"' FILE1.txt
Yes it is that simple. Look in perlrun manpage for switches and what they mean and how to do same thing in bigger application.
If you need frequently ask for this question you should arrange your data in way which helps you and not in way you have to battle with.
use strict;
use warnings;
use feature qw(say);
use autodie;
open my $f, '<', 'FILE1.txt';
my %h;
while(<$f>) {
chomp;
my ($animal, $keyword) = split' ';
$h{$keyword} = $animal unless exists $h{$keyword};
}
close $f;
for my $keyword (qw(Little Awkward Small Tiny)) {
say $h{$keyword} ? "$keyword $h{$keyword}" : "keyword $keyword not found";
}
But if you still insist you want to traverse hash you can do it but you has been warned.
open my $f, '<', 'FILE1.txt';
my %h;
while (<$f>) {
chomp;
my ( $animal, $keyword ) = split ' ';
push #{ $h{$animal} }, $keyword;
}
close $f;
KEYWORD:
for my $keyword (qw(Little Awkward Small Tiny)) {
for my $animal (keys %h) {
for my $k (#{$h{$animal}}) {
if($k eq $keyword) {
say "$keyword $animal";
next KEYWORD;
}
}
}
say "keyword $keyword not found";
}

To critique my own answer: the structuring of the part that does the search could be better. And maybe it is pointless even using an ordered hash as the search is through a linear list. Maybe it should be an array of arrays
use strict;
use warnings;
use Tie::IxHash;
#open file
open(my $fh,"ani.txt") ||die $!;
#make an ordered hash
tie my %sizes, 'Tie::IxHash';
#read file into hash of arrays
while(<$fh>) {
(my $animal,my $size)=split(/\s+/);
if (!exists($sizes{$animal})) {
$sizes{$animal} = [$size];
} else {
push #{$sizes{$animal}},$size;
}
}
my $keyword="Little";
my $running=1;
my $depth=0;
while( $running ) {
$running = 0;
for my $search (keys %sizes) {
next if ($depth > #{$sizes{$search}});
$running = 1;
if ($keyword eq $sizes{$search}[$depth]) {
print "FOUND!!!!!! $search $depth";
exit(0);
}
}
$depth++;
}
Here is another version of a solution to the stated problem. To solve the actual problem given there is no need to store anything except the first "size" key for each animal in a hash
This hash can then be trivally used to look up the animal
use strict;
use warnings;
open(my $fh,"ani.txt") ||die $!;
my %animals;
#read file into hash
while(<$fh>) {
(my $animal,my $size)=split(/\s+/);
#only add the animal the first time the size is found
if (!exists($animals{$size})) {
$animals{$size} = $animal;
}
}
my $keyword="Little";
print "animal is ", $animals{$keyword};

Related

How to get a user-configurable buffer for printing?

I'd like to have a print function supporting a user-configurable buffer, so to print what I have in the buffer only when the buffer is > a threshold).
I need to write multiple files, so I have multiple filehandles to write to, and for this an object oriented module might be handier.
I imagine something like this:
my $printer1 = Print::Buffer->new({ size => 1000, filehandle => \$OUT1 });
for (my $i=1; $i<1000; $i++) {
$printer1->print("This string will be eventually printed ($i/1000)");
}
# and at the end print the remaining buffer
$printer1->flush();
Any recommendation? I probably don't use the right keywords as with print/buffer I didn't find clear matches in CPAN.
UPDATE:
Thanks everyone for the very useful comments. As some of you pointed out, the problem is more complex than I initially thought, and probably a bad idea. (This question arose as I was printing very large files [>100Gb] in with a print statement at each loop iteration, and noted that if I was printing every hunderth iteration I had a speedup, but it could be dependent on how the loop was changed...)
UPDATE 2:
I need/want to accept an answer. To me both have been instructive and they are both useful. I tested both and they both need further work before being able to benchmark the improvement (if any, see update above). The tie handle is a less known feature that I loved, that's why I accepted that. They were both equally close to the desired answer in my opinion. Thank you all very much for the discussion and the insights.
I'd like to have a print function supporting a user-configurable buffer, [...]
I imagine something like this: [...]
It's not hard to write something like it. Here's a basic sketch
File PrintBuffer.pm
package PrintBuffer;
use warnings;
use strict;
sub new {
my ($class, %args) = #_;
my $self = {
_size => $args{size} // 64*1024, #//
_fh => $args{filehandle} // *STDOUT,
_buf => ''
};
$self->{_fh}->autoflush; # want it out once it's printed
bless $self, $class;
}
sub print {
my ($self, $string) = #_;
$self->{_buf} .= $string;
if ( length($self->{_buf}) > $self->{_size} ) {
print { $self->{_fh} } $self->{_buf};
$self->{_buf} = '';
}
return $self;
}
sub DESTROY {
my $self = shift;
print { $self->{_fh} } $self->{_buf} if $self->{_buf} ne '';
$self->{_buf} = '';
}
1;
There's a bit more to do here, and a whole lot that can be added, and since it relies only on basic tools one can add/change as desired.† For one, I can imagine a size method to manipulate the buffer size of an existing object (print if there's already more data than the new size), and flush.
Note that DESTROY method provides for the buffer to be printed as the object drops out of any scope, and is getting destroyed, what seems reasonable to do.
A driver
use warnings;
use strict;
use feature 'say';
use PrintBuffer;
my $fout = shift // die "Usage: $0 out-file\n";
open my $fh, '>', $fout or die "Can't open $fout: $!";
my $obj_file = PrintBuffer->new(size => 100, filehandle => $fh);
my $obj_stdout = PrintBuffer->new(size => 100);
$obj_file->print('a little bit');
$obj_stdout->print('a little bit');
say "printed 'a little bit' ..."; sleep 10;
$obj_file->print('out'x30); # push it over a 100 chars
$obj_stdout->print('out'x30);
say "printed 'out'x30 ... "; sleep 10;
$obj_file->print('again...'); # check DESTROY
$obj_stdout->print('again');
say "printed 'again' (and we're done)";
Check the size of output file in another terminal after each informational print.
I tried PerlIO::buffersize brought up by Grinnz in a comment and it seems to work "as advertised" as they say. It doesn't allow you to do all you may wish but it may be a ready solution for basic needs. Note that this doesn't work with :encoding layer in use.
Thanks to ikegami for comments and tests (linked in comments).
† The print works with an autoflush-ed handle. Still, the first change could be to use syswrite instead, which is unbuffered and attempts to directly write all that's asked of it, via one write(2) call. But since there's no guarantee that all got written we also need to check
use Carp; # for croak
WRITE: {
my $bytes_written = 0;
while ( $bytes_written < length $self->{_buf} ) {
my $rv = syswrite(
$self->{_fh},
$self->{_buf},
length($self->{_buf}) - $bytes_written,
$bytes_written
);
croak "Error writing: $!" if not defined $rv;
$bytes_written += $rv;
}
$self->{_buf} = '';
};
I've put this in a block only to limit the scope of $bytes_written and any other variables that one may wish to introduce so to reduce the number of dereferences of $self (but note that $self->{_buf} may be quite large and copying it "to optimize" dereferencing may end up slower).
Naively we'd only need syswrite(FH, SCALAR) but if it happens that not all of SCALAR gets written then we need to continue writing from past what was written, thus the need to use the form with length-to-write and offset as well.
Since this is unbuffered it mustn't be mixed with buffered IO (or that need be done very carefully); see the docs. Also, :encoding layers can't be used with it. Consider these restrictions against other capabilities that may be wanted in this class.
I don't see a general solution on CPAN, either. But this is straightforward enough with tied filehandles. Something like
use Symbol;
sub Print::Buffer::new {
my ($class,$mode,$file,#opts) = #_;
my $x = Symbol::gensym;
open ($x, $mode, $file) or die "failed to open '$file': $!";
tie *$x, "Print::Buffer", fh => $fh, #opts;
$x;
}
sub Print::Buffer::TIEHANDLE {
my $pkg = shift;
my $self = { #_ };
$self->{bufsize} //= 16 * 1024 * 1024;
$self->{_buffer} = "";
bless $self, $pkg;
}
sub Print::Buffer::PRINT {
my ($self,#msg) = #_;
$self->{buffer} .= join($,,#msg);
$self->_FLUSH if length($self->{buffer}) > $self->{bufsize};
}
sub Print::Buffer::_FLUSH {
my $self = shift;
print {$self->{fh}} $self->{buffer};
$self->{buffer} = "";
}
sub Print::Buffer::CLOSE {
my $self = shift;
$self->_FLUSH;
close( $self->{fh} );
}
sub Print::Buffer::DESTROY {
my $self = shift;
$self->_FLUSH;
}
# ----------------------------------------
my $fh1 = Print::Buffer->new(">", "/tmp/file1",
bufsize => 16*1024*1024);
for (my $i=1; $i<1000; $i++) {
print $fh1 "This string will be eventually printed ($i/1000)\n";
}

Push into end of hash in Perl

So what I am trying to do with the following code is push a string, let's say "this string" onto the end of each key in a hash. I'm completely stumped on how to do this. Here's my code:
use warnings;
use strict;
use File::Find;
my #name;
my $filename;
my $line;
my #severity = ();
my #files;
my #info = ();
my $key;
my %hoa;
my $xmlfile;
my $comment;
my #comments;
open( OUTPUT, "> $ARGV[0]" );
my $dir = 'c:/programs/TEST/Test';
while ( defined( $input = glob( $dir . "\\*.txt" ) ) ) {
open( INPUT, "< $input" );
while (<INPUT>) {
chomp;
if (/File/) {
my #line = split /:/;
$key = $line[1];
push #{ $hoa{$key} }, "Filename\n";
}
if ( /XML/ ... /File/ ) {
$xmlfile = $1;
push #{ $hoa{$key} }, "XML file is $xmlfile\n";
}
if (/Important/) {
push #{ $hoa{$key} }, "Severity is $_\n";
}
if (/^\D/) {
next if /Important/;
push #{ $hoa{$key} }, "Given comment is $_\n";
}
push #{ $hoa{$key} }, "this string\n";
}
}
foreach my $k ( keys %hoa ) {
my #list = #{ $hoa{$k} };
foreach my $l (#list) {
print OUTPUT $l, "\n";
}
}
}
close INPUT;
close OUTPUT;
Where I have "this string" is where I was trying to push that string onto the end of the array. However, what ended up happening was that it ended up printing "this string" three times, and not at the end of every key like I wanted. When I tried to put it outside the while() loop, it said that the value of $key was not initialized. So please, any help? And if you need any clarification on what I'm asking, just let me know. Thank you!
No offence, but there are so many issues in this code I don't even know where to start...
First, the 'initialization block' (all these my $something; my #somethings lines at the beginning of this script) is not required in Perl. In fact, it's not just 'redundant' - it's actually confusing: I had to move my focus back and forth every time I encountered a new variable just to check its type. Besides, even with all this $input var is still not declared as local; it's either missing in comments, or the code given has omissions.
Second, why do you declare your intention to use File::Find (good) - but then do not use it at all? It could greatly simplify all this while(glob) { while(<FH>) { ... } } routine.
Third, I'm not sure why you assign something to $key only when the line read is matched by /File/ - but then use its value as a key in all the other cases. Is this an attempt to read the file organized in sections? Then it can be done a bit more simple, either by slurp/splitting or localizing $/ variable...
Anyway, the point is that if the first line of the file scanned is not matched by /File/, the previous (i.e., from the previous file!) value is used - and I'm not quite sure that it's intended. And if the very first line of the first file is not /File/-matched, then an empty string is used as a key - again, it smells like a bug...
Could you please describe your task in more details? Give some test input/output results, perhaps... It'd be great to proceed in short tasks, organizing your code in process.
Your program is ill-conceived and breaks a lot of good practice rules. Rather than enumerate them all, here is an equivalent program with a better structure.
I wonder if you are aware that all of the if statements will be tested and possibly executed? Perhaps you need to make use of elsif?
Aside from the possibility that $key is undefined when it is used, you are also setting $xmlfile to $1 which will never be defined as there are no captures in any of your regular expressions.
It is impossible to tell from your code what you are trying to do, so we can help you only if you show us your output, input and say how to derive one from the other.
use strict;
use warnings;
use File::Find;
my ($outfile) = #ARGV;
my $dir = 'c:/programs/TEST/Test';
my %hoa;
my $key;
while (my $input = glob "$dir/*.txt") {
open my $in, '<', $input or die $!;
while (<$in>) {
chomp;
if (/File/) {
my $key = (split /:/)[1];
push #{ $hoa{$key} }, "Filename\n";
}
if (/XML/ ... /File/) {
my $xmlfile = $1;
push #{ $hoa{$key} }, "XML file is $xmlfile\n";
}
if (/Important/) {
push #{ $hoa{$key} }, "Severity is $_\n";
}
if (/^\D/) {
next if /Important/;
push #{ $hoa{$key} }, "Given comment is $_\n";
}
push #{ $hoa{$key} }, "this string\n";
}
close $in;
}
open my $out, '>', $outfile or die $!;
foreach my $k (keys %hoa) {
foreach my $l (#{ $hoa{$k} }) {
print $out $l, "\n";
}
}
close $out;
I suspect based on your code, that the line where $key is set is not called each time through the loop, and that you do not trigger any of the other if statements.
This would append "this string" to the end of the array. Based on that you are getting 3 of the "this strings" at the end of the array, I would suspect that two lines do not go through the if (/FILE/) or any of the other if statements. This would leave the $key value the same and at the end, you would append "this string" to the array, using whatever the last value of $key was when it was set.
This will append the string "this string" to every element of the hash %hoa, which elements are array refs:
for (values(%hoa)) { push #{$_}, "this string"; }
Put that outside your while loop, and you'll print "this string" at the end of each element of %hoa.
It will autovivify array refs where it finds undefined elements. It will also choke if it cannot dereference an element as an array, and will manipulate arrays by symbolic reference if it finds a simple scalar and is not running under strict:
my %autoviv = ( a => ['foo'], b => undef );
push #$_, "PUSH" for values %autoviv; # ( a => ['foo', 'PUSH'], b => ['PUSH'] )
my %fatal = ( a => {} );
push #$_, "PUSH" for values %fatal; # FATAL: "Not an ARRAY reference at..."
my %dangerous = (a => "foo");
push #$_, "PUSH" for values %dangerous; # Yikes! #foo is now ("PUSH")
use strict;
my %kablam = (a => "foo");
push #$_, "PUSH" for values %kablam; # "Can't use string ("foo") as an ARRAY ref ..."
As I understand it, traverse the hash with a map command to modify its keys. An example:
EDIT: I've edited because I realised that the map command can be assigned to the same hash. No need to create a new one.
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my %hash = qw|
key1 value1
key2 value2
key3 value3
|;
my %hash = map { $_ . "this string" => $hash{ $_ } } keys %hash;
print Dump \%hash;
Run it like:
perl script.pl
With following output:
$VAR1 = {
'key3this string' => 'value3',
'key2this string' => 'value2',
'key1this string' => 'value1'
};

What is the idiomatic way in Perl to determine whether a string variable matches a string in a list?

Part of the specification says "Some names are special, e.g. Hughie, Dewey, Louis, and Donald. Other names may be added over the lifetime of the project at arbitrary times. Whenever you input one of those names, play quack.wav."
I could write ...
while (<>) {
if ($_ =~ /Hughie|Dewey|Louis/) {
quack() ;
}
elsif ($_ =~ /Donald/ {
quack() ;
you_re_fired_apprentice() ; # Easter egg don't tell QA
}
}
... but though untaxing to implement, it looks WTF-y: Where's the binary search? What if there were a sudden stupendous increase in the number of duck names? It would not scale at all!
I could create empty files using those names in a temporary directory, and then use the "file exists" API, but that seems roundabout, and I would have to be sure they were deleted at the end.
Surely there is a better way?
You could write that, but you should write this:
my %ducks = map {$_ => 1} qw(Hughie Dewey Louis);
while (<>) {
if ($ducks{$_}) {
quack() ;
}
elsif ($_ eq 'Donald') {
quack() ;
you_re_fired_apprentice() ; # Easter egg don't tell QA
}
}
Creating the hash takes a little bit of time, but not more than O(n). Lookup with a hash is O(1) though, so it is much more efficient than sequential search (via grep or a regex with alternation) assuming you will be checking for more than one or two items.
By the way, the regex that you have will match the words anywhere in the search string. You need to add start and end anchors if you want an exact match.
Alternatively, you could use smart matching
my #ducks = qw(Hughie Dewey Louis);
my $name = 'Dewey';
say 'smart match' if $name ~~ #ducks;
This is what is used by switch statements, so you could write
given ($name) {
when (#ducks) {
quack();
}
when ('Donald') {
quack();
you_re_fired_apprentice(); # Easter egg don't tell QA
}
}
As mentioned, hashes are the way to go for this. This is sort of what OOP looked like before OOP.
use strict;
use warnings;
my %duck_action = (
Donald => sub {quack(); you_re_fired_apprentice()},
Hughie => sub {quack()},
Dewie => sub {quack()},
Louis => sub {quack()},
);
for my $duck (qw( Hughie Dewie Donald Louis Porkie )) {
print "$duck: ";
my $action = $duck_action{$duck} || &no_such_duck;
$action->();
}
sub quack {
print "Quack!\n";
}
sub you_re_fired_apprentice {
print "You're fired!\n";
}
sub no_such_duck {
print "No such duck!\n";
}
You can use a Perl Hash. See also How can I represent sets in Perl? and Representing Sets in Perl.
Using hashes to implement a set is not exactly pretty, but it should be fast.
To find a string in a list, you could also use any in List::MoreUtils
use List::MoreUtils qw(any);
my #ducks = qw(Hughie Dewey Louis);
my $name = 'Dewey';
say 'any' if any {$name eq $_} #ducks;
If you're tied to using an array rather than a hash, you can use perl's grep function to search the array for a string.
#specialnames = qw(Hughie Dewey Louis);
while (my $value = <>) {
if (grep {$value eq $_}, #specialnames) {
quack() ;
}
elsif ($_ =~ /Donald/ {
quack() ;
you_re_fired_apprentice() ; # Easter egg don't tell QA
}
}
This does scale a lot worse than a hash, and might even scale worse than copying the array into a hash and then doing hash lookups.

Dealing with multiple capture groups in multiple records

Data Format:
attribname: data
Data Example:
cheese: good
pizza: good
bagel: good
fire: bad
Code:
my $subFilter='(.+?): (.+)';
my #attrib = ($dataSet=~/$subFilter/g);
for (#attrib)
{
print "$_\n";
}
The code spits out:
cheese
good
pizza
good
[etc...]
I was wondering what an easy Perly way to do this is? I am parsing the data from a log the data above is trash for simplicity. I am newer to Perl, I suspect I could do this via fanangling indexes, but I was wondering if there is a short method of implementing this? Is there any way to have the capture groups put into two different variables instead of serially appended to the list along with all matches?
Edit: I want the attribute and it's associated value together so I can the do what I need to to them. For example if within my for loop I could access both the attribute name and attribute value.
Edit:
I tried
my %attribs;
while (my $line = <$data>)
{
my ($attrib, $value) = ($line=~m/$subFilter/);
print $attribs{$attrib}," : ", $value,"\n";
}
and no luck :( I don't get any output with this. My data is in a variable not a file, because it parsed out of a set of parent data which is in a file. It would be convenient if the my variable worked so that my (#attrib, #value) = ($line=~/$subFilter/g); filled the lists appropriately with the multiple matches.
Solution:
my #line = ($7 =~/(.+?)\n/g);
for (#line)
{
my ($attrib, $value) = ($_=~m/$subFilter/);
if ($attrib ne "")
{
print $attrib," : ", $value,"\n";
}
}
I'm not really clear on what you actually want to store, but here's how you could store the data in a hash table, with '1' indicating good and '0' indicating 'bad':
use strict;
use warnings;
use Data::Dumper;
my %foods;
while (my $line = <DATA>)
{
chomp $line;
my ($food, $good) = ($line =~ m/^(.+?): (.+)$/);
$foods{$food} = ($good eq 'good' ? 1 : 0);
}
print Dumper(\%foods);
__DATA__
cheese: good
pizza: good
bagel: good
fire: bad
This prints:
$VAR1 = {
'bagel' => 1,
'cheese' => 1,
'fire' => 0,
'pizza' => 1
};
A sensible approach would be to make use of the split function:
my %attrib;
open my $data, '<', 'fileName' or die "Unable to open file: $!";
while ( my $line = <$data> ) {
my ( $attrib, $value ) = split /:\s*/, $line, 2;
$attrib{$attrib} = $value;
}
close $data;
foreach my $attrib ( keys %attrib ) {
print "$attrib: $attrib{$attrib}\n";
}
If you're into one-liners, the following would achieve the same:
$ perl -F/:\s*/ -ane '$attrib{$F[0]} = $F[1]; } END { print $_,"\t",$attrib{$_},"\n" foreach keys %attrib;" fileName

Pimp my Perl code

I'm an experienced developer, but not in Perl. I usually learn Perl to hack a script, then I forget it again until the next time. Hence I'm looking for advice from the pros.
This time around I'm building a series of data analysis scripts. Grossly simplified, the program structure is like this:
01 my $config_var = 999;
03 my $result_var = 0;
05 foreach my $file (#files) {
06 open(my $fh, $file);
07 while (<$fh>) {
08 &analyzeLine($_);
09 }
10 }
12 print "$result_var\n";
14 sub analyzeLine ($) {
15 my $line = shift(#_);
16 $result_var = $result_var + calculatedStuff;
17 }
In real life, there are up to about half a dozen different config_vars and result_vars.
These scripts differ mostly in the values assigned to the config_vars. The main loop will be the same in every case, and analyzeLine() will be mostly the same but could have some small variations.
I can accomplish my purpose by making N copies of this code, with small changes here and there; but that grossly violates all kinds of rules of good design. Ideally, I would like to write a series of scripts containing only a set of config var initializations, followed by
do theCommonStuff;
Note that config_var (and its siblings) must be available to the common code, as must result_var and its lookalikes, upon which analyzeLine() does some calculations.
Should I pack my "common" code into a module? Create a class? Use global variables?
While not exactly code golf, I'm looking for a simple, compact solution that will allow me to DRY and write code only for the differences. I think I would rather not drive the code off a huge table containing all the configs, and certainly not adapt it to use a database.
Looking forward to your suggestions, and thanks!
Update
Since people asked, here's the real analyzeLine:
# Update stats with time and call data in one line.
sub processLine ($) {
my $line = shift(#_);
return unless $line =~ m/$log_match/;
# print "$1 $2\n";
my ($minute, $function) = ($1, $2);
$startMinute = $minute if not $startMinute;
$endMinute = $minute;
if ($minute eq $currentMinute) {
$minuteCount = $minuteCount + 1;
} else {
if ($minuteCount > $topMinuteCount) {
$topMinute = $currentMinute;
$topMinuteCount = $minuteCount;
printf ("%40s %s : %d\n", '', $topMinute, $topMinuteCount);
}
$totalMinutes = $totalMinutes + 1;
$totalCount = $totalCount + $minuteCount;
$currentMinute = $minute;
$minuteCount = 1;
}
}
Since these variables are largely interdependent, I think a functional solution with separate calculations won't be practical. I apologize for misleading people.
Two comments: First, don't post line numbers as they make it more difficult than necessary to copy, paste and edit. Second, don't use &func() to invoke a sub. See perldoc perlsub:
A subroutine may be called using an explicit & prefix. The & is optional in modern Perl, ... Not only does the & form make the argument list optional, it also disables any prototype checking on arguments you do provide.
In short, using & can be surprising unless you know what you are doing and why you are doing it.
Also, don't use prototypes in Perl. They are not the same as prototypes in other languages and, again, can have very surprising effects unless you know what you are doing.
Do not forget to check the return value of system calls such as open. Use autodie with modern perls.
For your specific problem, collect all configuration variables in a hash. Pass that hash to analyzeLine.
#!/usr/bin/perl
use warnings; use strict;
use autodie;
my %config = (
frobnicate => 'yes',
machinate => 'no',
);
my $result;
$result += analyze_file(\%config, $_) for #ARGV;
print "Result = $result\n";
sub analyze_file {
my ($config, $file) = #_;
my $result;
open my $fh, '<', $file;
while ( my $line = <$fh> ) {
$result += analyze_line($config, $line);
}
close $fh;
return $result;
}
sub analyze_line {
my ($line) = #_;
return length $line;
}
Of course, you will note that $config is being passed all over the place, which means you might want to turn this in to a OO solution:
#!/usr/bin/perl
package My::Analyzer;
use strict; use warnings;
use base 'Class::Accessor::Faster';
__PACKAGE__->follow_best_practice;
__PACKAGE__->mk_accessors( qw( analyzer frobnicate machinate ) );
sub analyze_file {
my $self = shift;
my ($file) = #_;
my $result;
open my $fh, '<', $file;
while ( my $line = <$fh> ) {
$result += $self->analyze_line($line);
}
close $fh;
return $result;
}
sub analyze_line {
my $self = shift;
my ($line) = #_;
return $self->get_analyzer->($line);
}
package main;
use warnings; use strict;
use autodie;
my $x = My::Analyzer->new;
$x->set_analyzer(sub {
my $length; $length += length $_ for #_; return $length;
});
$x->set_frobnicate('yes');
$x->set_machinate('no');
my $result;
$result += $x->analyze_file($_) for #ARGV;
print "Result = $result\n";
Go ahead and create a class hierarchy. Your task is an ideal playground for OOP style of programming.
Here's an example:
package Common;
sub new{
my $class=shift;
my $this=bless{},$class;
$this->init();
return $this;
}
sub init{}
sub theCommonStuff(){
my $this=shift;
for(1..10){ $this->analyzeLine($_); }
}
sub analyzeLine(){
my($this,$line)=#_;
$this->{'result'}.=$line;
}
package Special1;
our #ISA=qw/Common/;
sub init{
my $this=shift;
$this->{'sep'}=','; # special param: separator
}
sub analyzeLine(){ # modified logic
my($this,$line)=#_;
$this->{'result'}.=$line.$this->{'sep'};
}
package main;
my $c = new Common;
my $s = new Special1;
$c->theCommonStuff;
$s->theCommonStuff;
print $c->{'result'}."\n";
print $s->{'result'}."\n";
If all the common code is in one function, a function taking your config variables as parameters, and returning the result variables (either as return values, or as in/out parameters), will do. Otherwise, making a class ("package") is a good idea, too.
sub common_func {
my ($config, $result) = #_;
# ...
$result->{foo} += do_stuff($config->{bar});
# ...
}
Note in the above that both the config and result are hashes (actually, references thereto). You can use any other data structure that you feel will suit your goal.
Some thoughts:
If there are several $result_vars, I would recommend creating a separate subroutine for calculating each one.
If a subroutine relies on information outside that function, it should be passed in as a parameter to that subroutine, rather than relying on global state.
Alternatively wrap the whole thing in a class, with $result_var as an attribute of the class.
Practically speaking, there are a couple ways you could implement this:
(1) Have your &analyzeLine function return calculatedStuff, and add it to &result_var in a loop outside the function:
$result_var = 0;
foreach my $file (#files) {
open(my $fh, $file);
while (<$fh>) {
$result_var += analyzeLine($_);
}
}
}
sub analyzeLine ($) {
my $line = shift(#_);
return calculatedStuff;
}
(2) Pass $result_var into analyzeLine explicitly, and return the changed $result_var.
$result_var = 0;
foreach my $file (#files) {
open(my $fh, $file);
while (<$fh>) {
$result_var = addLineToResult($result_var, $_);
}
}
}
sub addLineToResult ($$) {
my $running_total = shift(#_);
my $line = shift(#_);
return $running_total + calculatedStuff;
}
The important part is that if you separate out functions for each of your several $result_vars, you'll be more readily able to write clean code. Don't worry about optimizing yet. That can come later, when your code has proven itself slow. The improved design will make optimization easier when the time comes.
why not create a function and using $config_var and $result_var as parameters?