Perl + recursive subroutine + accessing variable defined outside of subroutine - perl

I am pulling bitbucket repo list using Perl. The response from bitbucket will contain only 10 repositories and a marker for the next page where there will be another 10 repos and so on ... (they call it paging response)
So, I wrote a recursive subroutine which calls itself if the next page marker is present. This will continue until it reaches the last page.
Here is my code:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use LWP::UserAgent;
use JSON;
my #array;
recursive("my_bitbucket_url");
foreach ( #array ) { print $_."\n"; }
sub recursive
{
my $url = $_[0];
### here goes my LWP::UserAgent code which connects to bitbucket and pulls back the response in a JSON as $response->decoded_content
### hence, removing this code for brevity
my $hash = decode_json $response->decoded_content;
#print Dumper ($hash);
foreach my $a ( #{$hash->{values}} )
{
push #array, $a->{links}->{self}->{href};
}
if ( defined $hash->{next})
{
print "Next page Exists \n";
print "Recursing with $hash->{next} \n";
recursive( $hash->{next} );
}
else
{
print "Last page reached. No more recursion \n"
}
}
Now, my code works fine and it lists all the repos.
Question:
I am not sure about the way I have used the variable my #array; above. I have defined it outside the subroutine, However, I am accessing it directly from the subroutine. Somehow, I feel this is not right.
So, how to append to an array using a recursive subroutine in such cases. Does my code obey Perl ethics or is it something really absurd (yet correct because it works) ?
UPDATE
After following suggestions from #ikegami, #Sobrique and #Hynek -Pichi- Vychodil, I have come with below code which uses while loop and avoids recusrsion.
Here is my thought process:
Define an array #array.
Call the subroutine call_url with initial bitbucket URL and save the response in $hash
Check the $hash for the next page marker
If next page marker exists, then push the elements to #array and call call_url with the new marker. This will be done with the while loop.
If the next page marker does NOT exists, then push the elements to #array. Period.
Print #array content.
And here is my code:
my #array;
my $hash = call_url("my_bitbucket_url ");
if (defined $hash->{next})
{
while (defined $hash->{next})
{
foreach my $a ( #{$hash->{values}} )
{
push #array, $a->{links}->{self}->{href};
}
$hash = call_url($hash->{next});
}
}
foreach my $a ( #{$hash->{values}} )
{
push #array, $a->{links}->{self}->{href};
}
foreach (#array) { print $_."\n"; }
sub call_url
{
### here goes my LWP::UserAgent code which connects to bitbucket and pulls back the response in a JSON as $response->decoded_content
### hence, removing this code for brevity
my $hash = decode_json $response->decoded_content;
#print Dumper ($hash);
return $hash;
}
Would definitely like to hear whether this looks OK or is there still a room for improvement.

Using global variables to return values demonstrates high coupling, something to be avoided.
You're asking if the following is acceptable:
my $sum;
sum(4, 5);
print("$sum\n");
sub sum {
my ($x, $y) = #_;
$sum = $x + $y;
}
The fact that the sub is recursive is completely irrelevant; it just makes your example larger.
Problem fixed:
sub recursive
{
my $url = $_[0];
my #array;
my $hash = ...;
foreach my $a ( #{$hash->{values}} )
{
push #array, $a->{links}->{self}->{href};
}
if ( defined $hash->{next})
{
print "Next page Exists \n";
print "Recursing with $hash->{next} \n";
push #array, recursive( $hash->{next} );
}
else
{
print "Last page reached. No more recursion \n"
}
return #array;
}
{
my #array = recursive("my_bitbucket_url");
foreach ( #array ) { print $_."\n"; }
}
Recursion removed:
sub recursive
{
my $url = $_[0];
my #array;
while (defined($url)) {
my $hash = ...;
foreach my $a ( #{$hash->{values}} )
{
push #array, $a->{links}->{self}->{href};
}
$url = $hash->{next};
if ( defined $url)
{
print "Next page Exists \n";
print "Recursing with $url\n";
}
else
{
print "Last page reached. No more recursion \n"
}
}
return #array;
}
{
my #array = recursive("my_bitbucket_url");
foreach ( #array ) { print $_."\n"; }
}
Clean up of the latest code you posted:
my $url = "my_bitbucket_url";
my #array;
while ($url) {
my $hash = call_url($url);
for my $value ( #{ $hash->{values} } ) {
push #array, $value->{links}{self}{href};
}
$url = $hash->{next};
}
print("$_\n") for #array;

Yes, using a global variable is bad habit even it is lexical scoped variable.
Each recursive code can be rewritten into its imperative loop version and vice versa. It is because all of this is implemented on CPU which doesn't know anything about recursion at all. Thre are only jumps. All calls and returns are just jumps with some stack manipulation so you can rewrite your recursion algorithm into loop. If it is not obvious and simple as in this case you can even emulate stack and behaviour as it is done in your favourite language interpreter or compiler. In this case it's very simple:
my #array = with_loop("my_bitbucket_url");
foreach ( #array ) { print $_."\n"; }
sub with_loop
{
my $url = $_[0];
my #array;
while(1)
{
### here goes my LWP::UserAgent code which connects to bitbucket and
### pulls back the response in a JSON as $response->decoded_content
### hence, removing this code for brevity
my $hash = decode_json $response->decoded_content;
#print Dumper ($hash);
foreach my $a ( #{$hash->{values}} )
{
push #array, $a->{links}->{self}->{href};
}
unless ( defined $hash->{next})
{
print "Last page reached. No more recursion \n";
last
};
print "Next page Exists \n";
print "Recursing with $hash->{next} \n";
$url = $hash->{next};
};
return #array;
}
But when you would like to stick with recursion you can but it is a little bit more tricky. First of all, there is not tail call optimization so you don't have to try write tail call code as your original version does. So you can do this:
my #array = recursion("my_bitbucket_url");
foreach ( #array ) { print $_."\n"; }
sub recursion
{
my $url = $_[0];
### here goes my LWP::UserAgent code which connects to bitbucket and
### pulls back the response in a JSON as $response->decoded_content
### hence, removing this code for brevity
my $hash = decode_json $response->decoded_content;
#print Dumper ($hash);
# this map version is same as foreach with push but more perlish
my #array = map $_->{links}->{self}->{href}, #{$hash->{values}};
if (defined $hash->{next})
{
print "Next page Exists \n";
print "Recursing with $hash->{next} \n";
push #array, recursive( $hash->{next} );
}
else
{
print "Last page reached. No more recursion \n"
}
return #array;
}
But this version is not very efficient so there is way how to write tail call recursive version in perl which is a little bit tricky.
my #array = tail_recursive("my_bitbucket_url");
foreach ( #array ) { print $_."\n"; }
sub tail_recursive
{
my $url = $_[0];
my #array;
return tail_recursive_inner($url, \#array);
# url is mutable parameter
}
sub tail_recursive_inner
{
my $url = $_[0];
my $array = $_[1];
# $array is reference to accumulator #array
# from tail_recursive function
### here goes my LWP::UserAgent code which connects to bitbucket and
### pulls back the response in a JSON as $response->decoded_content
### hence, removing this code for brevity
my $hash = decode_json $response->decoded_content;
#print Dumper ($hash);
foreach my $a ( #{$hash->{values}} )
{
push #$array, $a->{links}->{self}->{href};
}
if (defined $hash->{next})
{
print "Next page Exists \n";
print "Recursing with $hash->{next} \n";
# first parameter is mutable so its OK to assign
$_[0] = $hash->{next};
goto &tail_recursive_inner;
}
else
{
print "Last page reached. No more recursion \n"
}
return #$array;
}
And if you are interested in some real perl trickery
print $_."\n" for tricky_tail_recursion("my_bitbucket_url");
sub tricky_tail_recursion {
my $url = shift;
### here goes my LWP::UserAgent code which connects to bitbucket and
### pulls back the response in a JSON as $response->decoded_content
### hence, removing this code for brevity
my $hash = decode_json $response->decoded_content;
#print Dumper ($hash);
push #_, $_->{links}->{self}->{href} for #{$hash->{values}};
if (defined $hash->{next}) {
print "Next page Exists \n";
print "Recursing with $hash->{next} \n";
unshift #_, $hash->{next};
goto &tricky_tail_recursion;
} else {
print "Last page reached. No more recursion \n"
};
return #_;
}
See also: LWP::UserAgent docs.

A variable defined outside any closures is available to the whole program. It works fine, there's nothing to worry about. Some might call it 'bad style' in certain cases (mostly around program length and action at distance) but that's not a hard constraint.
I'm not sure I necessarily see the advantage of recursion here though - your problem doesn't seem to warrant it. That's not a problem per-se, but it can be a little confusing for future maintenance programmers ;).
I'd be thinking something along the lines of (non recursive):
my $url = "my_bitbucket_url";
while ( defined $url ) {
##LWP Stuff;
my $hash = decode_json $response->decoded_content;
foreach my $element ( #{ $hash->{values} } ) {
print join( "\n", #{ $element->{links}->{self}->{href} } ), "\n";
}
$url = $hash->{next}; #undef if it doesn't exist, so loop breaks.
}

Related

Array ref empty when received by a sub

I am trying to access elements of arrays by reference, passing references into a sub. Here is my code snippet:
my #arr1 = (1,2);
my #arr2 = (3,4);
my #arr3;
push #arr3, \#arr1;
push #arr3, \#arr2;
for my $i (#arr3) {
print "$i\n";
}
print "Entered Sub func()\n";
for my $i (#arr3) {
func($i);
}
sub func{
my $par = shift;
print $par."\n";
}
print "------------------------\n";
for my $elem(#$par) {
print $elem."\n";
}
And here is the ouput:
C:\Users\ag194y>perl arrs.pl
ARRAY(0x357b28)
ARRAY(0x3575e8)
Entered Sub func()
ARRAY(0x357b28)
ARRAY(0x3575e8)
------------------------
C:\Users\ag194y>
I was expecting to access the elements of #arr1 and a#rr2 with the for loop in the sub, but it looks like array refs are empty. What am I doing wrong? Many thanks.
I think the problem is, loop being outside of func. You are calling func twice, and only after that you are looping through $par, which is undefined at the time.
You might be looking for something like:
sub func{
my $par = shift;
print $par."\n";
print "------------------------\n";
for my $elem (#$par){
print $elem."\n";
}
}

How do I add variables to be set based on a numeric input in perl?

I am making a score-keeping script in Perl, and would like to have it ask how many players there are, and ask for a name, then score, for each player. I have a good bit of this script done, but only for 3 players. the current script can be found on github here: skore
(from link:)
#!/usr/bin/env perl
use strict;
my $version = "1.0";
my $arg = shift(#ARGV);
my $subname = $arg;
if (!defined($arg)){
cmd_go();
}
$subname =~ s/-/_/g;
my $sub = main->can("cmd_$subname") || main->can("dev_$subname") || main->can("hid_$subname");
if (!defined($sub))
{
print STDERR "Invalid command given.\nUse \e[1;32m./skore help\e[0m for a list of commands.\n";
exit 1;
}
else
{
$sub->(#ARGV);
exit 0;
}
# Main command
sub cmd_go()
{
print "\e[2J\e[0G\e[0d"; # J = Erase in Display, 2 = Entire Screen, (G, d) = Move cursor to (..,..)
print "••••••••••••••••••••\n";
print "• Welcome to \e[1;32mskore\e[0m •\n";
print "••••••••••••••••••••\n\n";
my #game = prompt("What game are we scoring?\n");
print "••• Game name locked: #game\n\n";
my #p1name = prompt("Player 1 name?\n");
my #p2name = prompt("Player 2 name?\n");
my #p3name = prompt("Player 3 name?\n");
print "\n";
print "••• Player names locked: #p1name #p2name #p3name\n\n";
my #p1score = prompt_num("score for #p1name?\n");
my #p2score = prompt_num("score for #p2name?\n");
my #p3score = prompt_num("score for #p3name?\n");
print "\n";
print "••• Game: #game\n";
print "••• #p1name\n";
print "••••• \e[1;32m#p1score\e[0m\n";
print "••• #p2name\n";
print "••••• \e[1;32m#p2score\e[0m\n";
print "••• #p3name\n";
print "••••• \e[1;32m#p3score\e[0m\n";
exit 1;
}
sub cmd_help()
{
print "To get right into using skore, simply type ./skore\n";
print "For details about skore, such as version, use ./skore pkg\n";
}
sub cmd_pkg()
{
print "skore version: $version\n";
print "Detected OS: ";
exec "uname -r";
}
sub prompt {
my ($query) = #_; # take a prompt string as argument
local $| = 1; # activate autoflush to immediately show the prompt
print $query;
chomp(my $answer = <STDIN>); return $answer;
}
sub prompt_num {
NSTART:
my ($querynum) = #_;
print $querynum;
chomp(my $pnum = <STDIN>);
if ($pnum eq $pnum+0) { return $pnum; }
else { print "Error: That is not a number. Try again.\n"; goto NSTART; }
}
sub prompt_yn {
my ($queryyn) = #_;
my $answer = prompt("$queryyn (y/N): ");
return lc($answer) eq 'y';
}
I'd like to also point out that I'm new to perl.
OK, wow. Stop for a moment, step back and put the code down. Think about what you're trying to accomplish here.
There's a bunch of things you're doing in your code that's really going to benefit from taking a step back, and understanding what's going on, before proceeding.
First off:
my $arg = shift(#ARGV);
my $subname = $arg;
if (!defined($arg)){
cmd_go();
}
What is this intended to do? You only use $arg 3 times here, and one of those is to copy it to $subname.
This could be quite simplified by:
my $subname = shift;
cmd_go() unless defined $subname;
Now this:
my $sub = main->can("cmd_$subname") || main->can("dev_$subname") || main->can("hid_$subname");
Where did that come from? Because I'm pretty sure that - as a beginner to perl - you didn't write that yourself, not least because you don't have any subroutines prefixed with dev_ or hid. And this sort of redirect is serious overkill for a program that basically does just one thing.
(And normally, you'd use flags like getopt rather than a command that you leave blank in a default state).
You are also massively overusing arrays - which suggests you're not really sure the difference between #game and $game.
E.g. this:
my #game = prompt("What game are we scoring?\n");
prompt does this though:
chomp(my $answer = <STDIN>); return $answer;
It returns a scalar (a single line) and you're putting it into an array for - as far as I can tell - no particular reason.
Likewise this:
my #p1score = prompt_num("score for #p1name?\n");
my #p2score = prompt_num("score for #p2name?\n");
my #p3score = prompt_num("score for #p3name?\n");
First off - you're using a bunch of single element arrays. But then you're numbering them. When ... actually, the whole point of arrays is to have numbered values.
So how about instead:
print "Num players?:\n";
chomp ( my $num = <STDIN> );
my #players;
my %scores;
for ( 1..$num ) {
print "Player name\n";
chomp ( my $name = <STDIN> );
push ( #players, $name );
}
foreach my $person ( #players ) {
print "Score for $person:\n";
chomp ( my $score = <STDIN> );
while ( $score =~ /\D/ ) {
print "Invalid - please enter numeric value\n";
chomp ( $score = <STDIN> );
}
$scores{$person} = $score;
}
foreach my $person ( #players ) {
print "$person => $score{$person}\n";
}
There are a bunch of other things that you're doing that is more complicated than it needs to be.
What I would suggest you do:
go re-read the perl basics. perldata in particular.
have a look at getopt which is a good (and standard) way to take program 'flag' style input. (e.g. showing version, if that's what you really want.
it looks a lot like you've cargo-culted the code here. I would suggest you re-write from the ground up, and when you hit a problem - ask about it on Stack Overflow, if you can't figure it out from the perl docs.
Try this. Hope this is what you wanted.
#!/usr/bin/env perl
use strict;
my $version = "1.0";
my $arg = shift(#ARGV);
my $subname = $arg;
if (!defined($arg)){
cmd_go();
}
$subname =~ s/-/_/g;
my $sub = main->can("cmd_$subname") || main->can("dev_$subname") || main->can("hid_$subname");
if (!defined($sub))
{
print STDERR "Invalid command given.\nUse \e[1;32m./skore help\e[0m for a list of commands.\n";
exit 1;
}
else
{
$sub->(#ARGV);
exit 0;
}
# Main command
sub cmd_go()
{
print "\e[2J\e[0G\e[0d"; # J = Erase in Display, 2 = Entire Screen, (G, d) = Move cursor to (..,..)
print "••••••••••••••••••••\n";
print "• Welcome to \e[1;32mskore\e[0m •\n";
print "••••••••••••••••••••\n\n";
my #game = prompt("What game are we scoring?\n");
print "••• Game name locked: #game\n\n";
my $players= prompt("Enter total number of players:\n");
my #players_list;
for(my $i=0;$i<$players;$i++){
push(#players_list , prompt("Enter Player ".($i+1)." name\n"));
}
print "\n";
print "••• Player names locked: ";
for(my $i=0;$i<$players;$i++){
print $players_list[$i]."\t";
}
print "\n\n";
my #players_score;
for(my $i=0;$i<$players;$i++){
push(#players_score, prompt("score for $players_list[$i]?\n"));
}
print "\n";
print "••• Game: #game\n";
for(my $i=0;$i<$players;$i++){
print "$players_list[$i]\n";
print "••••• \e[1;32m$players_score[$i]\e[0m\n";
}
exit 1;
}
sub cmd_help()
{
print "To get right into using skore, simply type ./skore\n";
print "For details about skore, such as version, use ./skore pkg\n";
}
sub cmd_pkg()
{
print "skore version: $version\n";
print "Detected OS: ";
exec "uname -r";
}
sub prompt {
my ($query) = #_; # take a prompt string as argument
local $| = 1; # activate autoflush to immediately show the prompt
print $query;
chomp(my $answer = <STDIN>); return $answer;
}
sub prompt_num {
NSTART:
my ($querynum) = #_;
print $querynum;
chomp(my $pnum = <STDIN>);
if ($pnum eq $pnum+0) { return $pnum; }
else { print "Error: That is not a number. Try again.\n"; goto NSTART; }
}
sub prompt_yn {
my ($queryyn) = #_;
my $answer = prompt("$queryyn (y/N): ");
return lc($answer) eq 'y';
}

How to use refernce concept and access element of subroutine argument using Perl?

I am writing a code for calling a subroutine which has 4 argument(3 hashes and one file handler).i want to know how to access them in subroutine.My code is as below.
#print OUTFILE "Content of TPC file:.\n";
my $DATA_INFO = $ARGV[0];
my $OUT_DIR = $ARGV[1];
my $log= "$OUT_DIR/log1";
open(LOG1,">$log");
require "$DATA_INFO";
my $SCRIPT_DIR = $ENV{"SCRIPT_DIR"} ;
require "$SCRIPT_DIR/cmp_fault.pl";
require "$SCRIPT_DIR/pattern_mismatch.pl";
require "$SCRIPT_DIR/scan_count.pl";
print "\nComparing data:\n\n" ;
pattern_mismatch("\%data","\%VAR1","\%status",*LOG1);
cmp_fault("\%data","\%VAR1","\%status",*LOG1);
scan_count("\%data","\%status",*LOG1);
print "\n Comparison done:\n";
foreach $pattern (keys %status) {
print "pattern";
foreach $attr (keys %{$status{$pattern}}) {
print ",$attr";
}
print "\n";
last;
}
#Print Data
foreach $pattern (keys %status) {
print "$pattern";
foreach $attr (keys %{$status{$pattern}}) {
print ",$status{$pattern}{$attr}";
}
print "\n";
Sub routine cmp_fault is here:
sub cmp_fault {
use strict;
use warning;
$data_ref= $_[0];;
$VAR1_ref= $_[1];
$status_ref = $_[2];
$log1_ref=$_[3];
# print LOG1"For TPC : First find the pattern and then its fault type\n";
for $pat ( keys %$data_ref ) {
print "fgh:\n$pat,";
for $key (keys %{$data_ref{$pat}}) {
if($key=~/fault/){
print LOG1 "$key:$data_ref{$pat}{$key},\n";
}
}
}
# print LOG1 "\nFor XLS : First find the pattern and then its pattern type\n";
for $sheet (keys %$VAR1_ref){
if ("$sheet" eq "ATPG") {
for $row (1 .. $#{$VAR1_ref->{$sheet}}) {
$patname = $VAR1_ref->{'ATPG'}[$row]{'Pattern'} ;
next if ("$patname" eq "") ;
$faultXls = $VAR1_ref->{'ATPG'}[$row]{'FaultType'} ;
# print LOG1 " $patname==>$faultXls \n";
if (defined $data{$patname}{'fault'}) {
$faultTpc = $data{$patname}{'fault'} ;
# print LOG1 "\n $patname :XLS: $faultXls :TPC: $faultTpc\n";
if("$faultXls" eq "$faultTpc") {
print LOG1 "PASS: FaultType Matched $patname :XLS: $faultXls :TPC: $faultTpc\n\n\n";
print "PASS: FaultType Matched $patname :XLS: $faultXls :TPC: $faultTpc\n\n";
$status_ref->{$patname}{'FaultType'} = PASS;
}
else {
print LOG1 "FAIL: FaultType Doesn't Match\n\n";
$status_ref->{$patname}{'FaultType'} = Fail;
}
}
}
}
}
}
return 1;
When passing parameters into an array, you can only ever pass a single list of parameters.
For scalars, this isn't a problem. If all you're acting on is a single array, this also isn't a problem.
If you need to send scalars and an array or hash, then the easy way is to 'extract' the scalar parameters first, and then treat 'everything else' as the list.
use strict;
use warnings;
sub scalars_and_array {
my ( $first, $second, #rest ) = #_;
print "$first, $second, ", join( ":", #rest ), "\n";
}
scalars_and_array( "1", "2", "3", 4, 5, 6 );
But it should be noted that by doing so - you're passing values. You can do this with hashes too.
To pass data structure references, it's as you note - pass by reference, then dereference. It's useful to be aware though, that -> becomes useful, because it's accessing a hash and dereferencing it.
use strict;
use warnings;
use Data::Dumper;
sub pass_hash {
my ( $hashref ) = #_;
print $hashref,"\n";
print $hashref -> {"one"},"\n";
print $hashref -> {"fish"} -> {"haddock"};
}
my %test_hash = ( "one" => 2,
"three" => 4,
"fish" => { "haddock" => "plaice" }, );
pass_hash ( \%test_hash );
print "\n";
print Dumper \%test_hash;
The core of your problem here though, is that you haven't turned on strict and warnings which would tell you that:
for $pat ( keys %data_ref ) {
is wrong - there is no hash called data_ref there's only a scalar (which holds a hash reference) called $data_ref.
You need %$data_ref here.
And here:
for $key ( keys %{ $data{$pat} } ) {
You also have no $data - your code says $data_ref. (You might have %data in scope, but that's a really bad idea to mess around with within a sub).
There's a bunch of other errors - which would also be revealed by strict and warnings. That's a very basic debugging step, and you will generally get a much better response from Stack Overflow if you do this before asking for assistance. So please - do that, tidy up your code and remove errors/warnings. If you are still having problems after that, then by all means make a post outlining where and what problem you're having.

Push into end of hash in Perl

So what I am trying to do with the following code is push a string, let's say "this string" onto the end of each key in a hash. I'm completely stumped on how to do this. Here's my code:
use warnings;
use strict;
use File::Find;
my #name;
my $filename;
my $line;
my #severity = ();
my #files;
my #info = ();
my $key;
my %hoa;
my $xmlfile;
my $comment;
my #comments;
open( OUTPUT, "> $ARGV[0]" );
my $dir = 'c:/programs/TEST/Test';
while ( defined( $input = glob( $dir . "\\*.txt" ) ) ) {
open( INPUT, "< $input" );
while (<INPUT>) {
chomp;
if (/File/) {
my #line = split /:/;
$key = $line[1];
push #{ $hoa{$key} }, "Filename\n";
}
if ( /XML/ ... /File/ ) {
$xmlfile = $1;
push #{ $hoa{$key} }, "XML file is $xmlfile\n";
}
if (/Important/) {
push #{ $hoa{$key} }, "Severity is $_\n";
}
if (/^\D/) {
next if /Important/;
push #{ $hoa{$key} }, "Given comment is $_\n";
}
push #{ $hoa{$key} }, "this string\n";
}
}
foreach my $k ( keys %hoa ) {
my #list = #{ $hoa{$k} };
foreach my $l (#list) {
print OUTPUT $l, "\n";
}
}
}
close INPUT;
close OUTPUT;
Where I have "this string" is where I was trying to push that string onto the end of the array. However, what ended up happening was that it ended up printing "this string" three times, and not at the end of every key like I wanted. When I tried to put it outside the while() loop, it said that the value of $key was not initialized. So please, any help? And if you need any clarification on what I'm asking, just let me know. Thank you!
No offence, but there are so many issues in this code I don't even know where to start...
First, the 'initialization block' (all these my $something; my #somethings lines at the beginning of this script) is not required in Perl. In fact, it's not just 'redundant' - it's actually confusing: I had to move my focus back and forth every time I encountered a new variable just to check its type. Besides, even with all this $input var is still not declared as local; it's either missing in comments, or the code given has omissions.
Second, why do you declare your intention to use File::Find (good) - but then do not use it at all? It could greatly simplify all this while(glob) { while(<FH>) { ... } } routine.
Third, I'm not sure why you assign something to $key only when the line read is matched by /File/ - but then use its value as a key in all the other cases. Is this an attempt to read the file organized in sections? Then it can be done a bit more simple, either by slurp/splitting or localizing $/ variable...
Anyway, the point is that if the first line of the file scanned is not matched by /File/, the previous (i.e., from the previous file!) value is used - and I'm not quite sure that it's intended. And if the very first line of the first file is not /File/-matched, then an empty string is used as a key - again, it smells like a bug...
Could you please describe your task in more details? Give some test input/output results, perhaps... It'd be great to proceed in short tasks, organizing your code in process.
Your program is ill-conceived and breaks a lot of good practice rules. Rather than enumerate them all, here is an equivalent program with a better structure.
I wonder if you are aware that all of the if statements will be tested and possibly executed? Perhaps you need to make use of elsif?
Aside from the possibility that $key is undefined when it is used, you are also setting $xmlfile to $1 which will never be defined as there are no captures in any of your regular expressions.
It is impossible to tell from your code what you are trying to do, so we can help you only if you show us your output, input and say how to derive one from the other.
use strict;
use warnings;
use File::Find;
my ($outfile) = #ARGV;
my $dir = 'c:/programs/TEST/Test';
my %hoa;
my $key;
while (my $input = glob "$dir/*.txt") {
open my $in, '<', $input or die $!;
while (<$in>) {
chomp;
if (/File/) {
my $key = (split /:/)[1];
push #{ $hoa{$key} }, "Filename\n";
}
if (/XML/ ... /File/) {
my $xmlfile = $1;
push #{ $hoa{$key} }, "XML file is $xmlfile\n";
}
if (/Important/) {
push #{ $hoa{$key} }, "Severity is $_\n";
}
if (/^\D/) {
next if /Important/;
push #{ $hoa{$key} }, "Given comment is $_\n";
}
push #{ $hoa{$key} }, "this string\n";
}
close $in;
}
open my $out, '>', $outfile or die $!;
foreach my $k (keys %hoa) {
foreach my $l (#{ $hoa{$k} }) {
print $out $l, "\n";
}
}
close $out;
I suspect based on your code, that the line where $key is set is not called each time through the loop, and that you do not trigger any of the other if statements.
This would append "this string" to the end of the array. Based on that you are getting 3 of the "this strings" at the end of the array, I would suspect that two lines do not go through the if (/FILE/) or any of the other if statements. This would leave the $key value the same and at the end, you would append "this string" to the array, using whatever the last value of $key was when it was set.
This will append the string "this string" to every element of the hash %hoa, which elements are array refs:
for (values(%hoa)) { push #{$_}, "this string"; }
Put that outside your while loop, and you'll print "this string" at the end of each element of %hoa.
It will autovivify array refs where it finds undefined elements. It will also choke if it cannot dereference an element as an array, and will manipulate arrays by symbolic reference if it finds a simple scalar and is not running under strict:
my %autoviv = ( a => ['foo'], b => undef );
push #$_, "PUSH" for values %autoviv; # ( a => ['foo', 'PUSH'], b => ['PUSH'] )
my %fatal = ( a => {} );
push #$_, "PUSH" for values %fatal; # FATAL: "Not an ARRAY reference at..."
my %dangerous = (a => "foo");
push #$_, "PUSH" for values %dangerous; # Yikes! #foo is now ("PUSH")
use strict;
my %kablam = (a => "foo");
push #$_, "PUSH" for values %kablam; # "Can't use string ("foo") as an ARRAY ref ..."
As I understand it, traverse the hash with a map command to modify its keys. An example:
EDIT: I've edited because I realised that the map command can be assigned to the same hash. No need to create a new one.
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my %hash = qw|
key1 value1
key2 value2
key3 value3
|;
my %hash = map { $_ . "this string" => $hash{ $_ } } keys %hash;
print Dump \%hash;
Run it like:
perl script.pl
With following output:
$VAR1 = {
'key3this string' => 'value3',
'key2this string' => 'value2',
'key1this string' => 'value1'
};

How to call a subroutine with a variable pre-assigned to some value?

In Perl, when one uses the sort function with a custom comparison, the variables $a and $b are already assigned to the current pair of elements to compare, e.g. in:
#decreasing = sort { $b <=> $a } #list;
How can I write other subroutines with a similar functionality? For example, imagine that I want to write sort of process_and_store function that does something special with each item of a list and then stores it in a database; and where the variable $item is already assigned to the current item being processed. I would like to write for example something like:
process_and_store { do_something_with($item); } #list;
Rather than
process_and_store { my $item = shift; do_something_with($item); } #list;
How should I go about doing this?
UPDATE: For completeness, although flesk's answer works without problems, in order to “properly” localize the changes I make to the $item variable I had to follow the advice from Axeman. In SomePackage.pm I placed something like:
package SomePackage;
use strict;
require Exporter;
our #ISA = qw/Exporter/;
our #EXPORT = qw(process_and_store);
our $item;
sub import {
my $import_caller = caller();
{ no strict 'refs';
*{ $import_caller . '::item' } = \*item;
}
# Now, cue Exporter!
goto &{ Exporter->can( 'import' ) };
}
sub process_and_store (&#) {
my $code = shift;
for my $x (#_) {
local *item = \$x;
$code->();
print "stored: $item\n"
}
}
1;
Then I call this from main.pl with something like:
#!/usr/bin/perl -w
use strict;
use SomePackage;
process_and_store { print "seen: $item\n"; } (1, 2, 3);
And get the expected result:
seen: 1
stored: 1
seen: 2
stored: 2
seen: 3
stored: 3
In my "associative array" processing library, I do something similar. The user can export the variables $k and $v (key-value) so that they can do things like this:
each_pair { "$k => $v" } some_source_list()
Here's how I do it:
I declare our ( $k, $v ) in the implementing package.
In import I allow packages to export those symbols and alias them in the
receiving package: *{$import_caller.'::'.$name} = \*{ $name };
In the pair processors, I do the following:
local *::_ = \$_[0];
local *k = \$_[0];
local *v = \$_[1];
#res = $block->( $_[0], $_[1] );
Thus $k and $v are aliases of what's in the queue. If this doesn't have to be the case, then you might be happy enough with something like the following:
local ( $k, $v ) = splice( #_, 0, 2 );
local $_ = $k;
But modifiable copies also allow me to do things like:
each_pair { substr( $k, 0, 1 ) eq '-' and $v++ } %some_hash;
UPDATE:
It seems that you're neglecting step #2. You have to make sure that the symbol in the client package maps to your symbol. It can be as simple as:
our $item;
sub import {
my $import_caller = caller();
{ no strict 'refs';
*{ $import_caller . '::item' } = \*item;
}
# Now, cue Exporter!
goto &{ Exporter->can( 'import' ) };
}
Then when you localize your own symbol, the aliased symbol in the client package is localized as well.
The main way that I can see that it would work without the local, is if you were calling it from the same package. Otherwise, $SomePackage::item and $ClientPackage::item are two distinct things.
I think it's a bit of a hack, but you could do something like this:
#!/usr/bin/perl
use strict;
use warnings;
my $item;
sub process_and_store(&#) {
my $code = shift;
for (#_) {
$item = $_;
&$code();
}
undef $item;
}
The thing is, $item has to be a global scalar for this to work, so process_and_store has to update that scalar while looping over the list. You should also undef $item at the end of the sub routine to limit any potential side-effects. If I were to write something like this, I'd tuck it away in a module and make it possible to define the iterator variable, so as to limit name conflicts.
Test:
my #list = qw(apples pears bananas);
process_and_store { do_something_with($item) } #list;
sub do_something_with {
my $fruit = shift;
print "$fruit\n";
}
Output:
apples
pears
bananas
The $a and $b variables are special in Perl; they're real global variables and hence exempt from use strict, and also used specifically by the sort() function.
Most other similar uses in Perl would use the $_ global for this sort of thing:
process_and_store { do_something_with( $_ ) } #list;
Which is already handled by the normal $_ rules. Don't forget to localise $_:
sub process_and_store(&#)
{
my $code = shift;
foreach my $item (#_) {
local $_ = $item;
$code->();
}
}