Subroutine argument apparently lost in loop - perl

I have a CGI script pulling bibliography data from a BibTeX file, building HTML from it. It uses CGI::Ajax to call the subroutine below with one or two arguments. Most of the time, it will be a search term that is passed as $s, but if I pass a string through my HTML form, the subroutine will not be entirely happy with it. There is a foreach loop checking the entries and jumping over the entries that do not match. Now I can print the argument outside this loop alright, but the loop itself won’t print anything for $s, nor will it find any entries matching it. If within the loop $s were simply empty, the subroutine would print the entire bibliography, but this is not the case.
Basically it is as if $s passed as an argument breaks the loop, whereas an explicit definition in the subroutine works fine.
Here is a simplified version of my code. Please excuse sloppy or ignorant coding, I’m just dabbling in Perl.
sub outputBib {
    my ( $self,$s,$kw ) = @_;
    my @k;
    @k = ('foo','bar'); # this is fine
    @k = keys (%{$self->{_bib}}); # this causes problems
    foreach my $k (@k) {
        $output .= "Key = $k<br/>";
        $output .= "Search Term = $s<br/>";
    }
    return $output;
}
The problem seems to be the array built from the keys of the $self->{_bib} hash. It is odd that:
The loop is fine when $s is not passed through CGI::Ajax; all elements are processed.
As soon as the subroutine is called with $s, the loop does not return anything.
If @k is defined as a straightforward array, the loop works and $s can be printed within the loop.
I build $self->{_bib} like so:
sub parseBib {
    my ( $self ) = @_;
    while (my $e = new Text::BibTeX::Entry $self->{_bibFileObject}) {
        next unless $e->parse_ok;
        my %entry_hash;
        $entry_hash{'title'} = $e->get('title');
        $entry_hash{'keywords'} = $e->get('keywords');
        $self->{_bib}{$e->key} = \%entry_hash;
    }
}
Any ideas? Thanks.

My first suggestion would be to use warn/print STDERR to verify on the live running copy that, when called via CGI::Ajax, all of your variables ($self, $s, $kw, $self->{_bib}) have the values that you're expecting. Although I'm a big fan of CGI::Ajax, it does a fair bit of magic behind your back and it may not be calling outputBib in quite the way you think it is.
Also keep in mind that CGI operates on a per-request model, not per-page. Are you perhaps populating $self->{_bib} when you send the initial page (and also doing all of your successful tests in that environment), then expecting it to still be there when the AJAX requests come in? If so, you're out of luck - you'll need to rebuild it in the AJAX handler, either within outputBib or earlier in your code, before you call ->build_html and hand it off to CGI::Ajax.
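Both suggestions can be sketched together. This is only an illustration under stated assumptions: the BibDemo package and its stubbed parseBib are hypothetical stand-ins for the asker's class (the real one reads a BibTeX file), and the entry data is made up. It logs what outputBib actually received and rebuilds $self->{_bib} when the current request has not populated it yet.

```perl
use strict;
use warnings;

package BibDemo;   # hypothetical stand-in for the asker's class

sub new {
    my ($class) = @_;
    return bless { _bib => undef }, $class;
}

# Stub: the real parseBib would read entries via Text::BibTeX.
sub parseBib {
    my ($self) = @_;
    $self->{_bib} = { example2020 => { title => 'An Example Title' } };
    return;
}

sub outputBib {
    my ($self, $s, $kw) = @_;

    # Log what the handler really received; on a CGI server this goes to the error log.
    warn sprintf "outputBib: s=%s, entries=%d\n",
        defined $s ? "'$s'" : 'undef',
        scalar keys %{ $self->{_bib} || {} };

    # Each request starts from scratch, so rebuild the data if it is missing.
    $self->parseBib unless $self->{_bib} && %{ $self->{_bib} };

    my $output = '';
    for my $k (sort keys %{ $self->{_bib} }) {
        $output .= "Key = $k<br/>Search Term = $s<br/>";
    }
    return $output;
}

package main;

my $html = BibDemo->new->outputBib('foo');
print "$html\n";
```

If the warn line reports zero entries on the AJAX request but not on the initial page load, the per-request model described above is the likely culprit.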

Related

Variable not getting expanded

Variable not getting substituted
Even after defining the two variables explicitly, they are not getting substituted.
sub updatekey{
    my $key_url = File::Spec->catfile($dir. "/keys/cert.key.$label.$type");
    $eol = '';
    open(FILE, $key_url) or die "$!";
    my $key_file;
    while (read(FILE, $buf, 60*57)) {
        $keyfile = $key_file . encode_base64($buf,$eol);
    }
}
The open on FILE fails because $type is not getting substituted. If I modify the line as below
my $key_url = File::Spec->catfile($dir. "/keys/cert.key.$label.pem");
it's working fine.
This is pretty basic debugging.
Add this to your code as the first lines in the subroutine:
if (!defined $type) {
    print "\$type is undefined\n";
} else {
    print "\$type is '$type'\n";
}
That way, you'll see exactly what value $type has just before you try to use it. I expect you'll see either "$type is undefined" or "$type is ''".
Your problem then becomes finding where $type is supposed to be set and working out why that isn't happening.
Two other pieces of advice:
When writing a Perl program, it's always a good idea to have use strict and use warnings in your code (near the top of the file) as they will find a large number of basic programming errors.
When programming in any language, you should ensure that any subroutine only uses variables that are passed in as parameters or variables that are declared within the subroutine. Using global variables (as you seem to be doing here) makes your code less portable and harder to debug.
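A minimal sketch of both pieces of advice, assuming a hypothetical rewrite named read_key_base64 that receives $dir, $label and $type as parameters instead of reading globals. The temporary directory and the demo file contents exist only to make the example runnable.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use MIME::Base64 qw(encode_base64);

# Hypothetical rewrite: everything the subroutine needs is passed in.
sub read_key_base64 {
    my ($dir, $label, $type) = @_;
    die "\$type is not set\n" unless defined $type && length $type;

    my $key_path = File::Spec->catfile($dir, 'keys', "cert.key.$label.$type");
    open my $fh, '<:raw', $key_path or die "Cannot open $key_path: $!\n";

    my $encoded = '';
    my $buf;
    # 60*57 is a multiple of 3, so the encoded chunks concatenate cleanly.
    while (read $fh, $buf, 60 * 57) {
        $encoded .= encode_base64($buf, '');
    }
    close $fh;
    return $encoded;
}

# Demo setup: a throwaway directory holding one small "key" file.
my $dir = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($dir, 'keys') or die "mkdir failed: $!\n";
my $path = File::Spec->catfile($dir, 'keys', 'cert.key.demo.pem');
open my $out, '>:raw', $path or die "Cannot write $path: $!\n";
print {$out} 'secret';
close $out or die "Cannot close $path: $!\n";

my $encoded = read_key_base64($dir, 'demo', 'pem');
print "$encoded\n";
```

With use strict in place, the $keyfile / $key_file mismatch in the original subroutine would be reported at compile time, and an unset $type fails with a clear message instead of a confusing open error.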

Line Input operator with glob returning old values

The following code excerpt, when run on Perl 5.16.3 and older versions, behaves strangely: subsequent calls to a glob in the line input operator cause the glob to continue returning previous values rather than running the glob anew.
#!/usr/bin/env perl
use strict;
use warnings;
my @dirs = ("/tmp/foo", "/tmp/bar");
foreach my $dir (@dirs) {
    my $count = 0;
    my $glob = "*";
    print "Processing $glob in $dir\n";
    while (<$dir/$glob>) {
        print "Processing file $_\n";
        $count++;
        last if $count > 0;
    }
}
If I put two files in /tmp/foo and one or more in /tmp/bar and run the code, I get the following output:
Processing * in /tmp/foo
Processing file /tmp/foo/foo.1
Processing * in /tmp/bar
Processing file /tmp/foo/foo.2
I thought that when the while terminates after the last, that the new invocation of the while on the second iteration would re-run the glob and give me the files listed /tmp/bar, but instead I get a continuation of what's in /tmp/foo.
It's almost like the angle operator glob is acting like a precompiled pattern. My hypothesis is that the angle operator is creating a filehandle in the symbol table that's still open and being reused behind the scenes, and that it's scoped to the containing foreach, or possibly the whole subroutine.
From I/O Operators in perlop
(my emphasis)
A (file)glob evaluates its (embedded) argument only when it is starting a
new list. All values must be read before it will start over. In list
context, this isn't important because you automatically get them all
anyway. However, in scalar context the operator returns the next value
each time it's called, or undef when the list has run out.
Since <> is called in scalar context here and you exit the loop with last after the first iteration, the next time you enter it, it keeps reading from the original list.
It is clarified in comments that there is a practical need behind this quest: process only some of the files from a directory and never return all filenames since there can be many.
So assigning from glob to a list and working with it, or better yet using for instead of while as commented by ysth, doesn't help here as it returns a huge list.
I haven't found a way to make glob (what <> with a filename pattern uses) drop and rebuild the list once it's generated it, without getting to its end first.
Apparently, each instance of the operator gets its own list. So using another <> inside the while loop with the hope of resetting it, in any way and even with the same pattern, doesn't affect the list being iterated over in while (<$glob>).
Just to note, breaking out of the loop with a die (with while in an eval) doesn't help either; the next time we come to that while the same list is continued. Wrapping it in a closure
sub iter_glob { my $dir = shift; return sub { scalar <"$dir/*"> } }
for my $d (@dirs) {
    my $iter = iter_glob($d);
    while (my $f = $iter->()) {
        # ...
    }
}
met with the same fate; the original list keeps being used.
The solution then is to use readdir instead.
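A minimal sketch of the readdir approach, assuming a hypothetical helper first_entries that stops after a given number of entries. Each call opens its own directory handle, so leaving the loop early cannot leak state into the next directory. The temporary directories stand in for /tmp/foo and /tmp/bar.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: return at most $limit entries of $dir without
# building the full listing first.
sub first_entries {
    my ($dir, $limit) = @_;
    opendir my $dh, $dir or die "Cannot open directory $dir: $!\n";
    my @found;
    while (defined(my $entry = readdir $dh)) {
        next if $entry =~ /^\./;   # skip dot entries, as the glob "*" does
        push @found, File::Spec->catfile($dir, $entry);
        last if @found >= $limit;
    }
    closedir $dh;
    return @found;
}

# Demo setup: two throwaway directories with two files each.
my @dirs = map { tempdir(CLEANUP => 1) } 1 .. 2;
for my $i (0 .. $#dirs) {
    for my $n (1 .. 2) {
        my $path = File::Spec->catfile($dirs[$i], "file$i.$n");
        open my $fh, '>', $path or die "Cannot create $path: $!\n";
        close $fh;
    }
}

my @picked = map { first_entries($_, 1) } @dirs;
print "Processing file $_\n" for @picked;
```

Note that readdir returns entries in no particular order, so which file comes first in each directory is not guaranteed; the point is only that the second result comes from the second directory.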

Dereferencing hash of hashes in Perl

Sorry for this long post, the code should be easy to understand for veterans of Perl. I'm new to Perl and I'm trying to figure out this bit of code:
my %regression;
print "Reading regression dir: $opt_dir\n";
foreach my $f ( glob("$opt_dir/*.regress") ) {
    my $name = ( fileparse( $f, '\.regress' ) )[0];
    $regression{$name}{file} = $f;
    say "file $regression{$name}{file}";
    say "regression name $regression{$name}";
    say "regression name ${regression}{$name}";
    &read_regress_file( $f, $regression{$name} );
}
sub read_regress_file {
    say "args @_";
    my $file = shift;
    my $href = shift;
    say "href $href";
    open FILE, $file or die "Cannot open $file: $!\n";
    while ( <FILE> ) {
        next if /^\s*\#/ or /^\s*$/;
        chomp;
        my @tokens = split "=";
        my $key = shift @tokens;
        $$href{$key} = join( "=", @tokens );
    }
    close FILE;
}
The say lines are things I added to debug.
My confusion is the last part of the subroutine read_regress_file. It looks like href is a reference from the line my $href = shift;. However, I'm trying to figure out how the hash that was passed got referenced in the first place.
%regression is a hash with keys of $name. The .regress files the code reads are simple files containing variables and their values in the form of:
var1=value
var2=value
...
So it looks like the line
my $name = (fileparse($f,'\.regress'))[0];
is creating the keys as scalars and the line
$regression{$name}{file} = $f;
actually makes $name into a hash.
In my debugging lines
say "regression name $regression{$name}";
prints the reference, for instance
regression name HASH(0x7cd198)
but
say "regression name ${regression}{$name}";
prints a name, like
regression name {filename}
with the file name inside the braces.
However, using
say "regression name $$regression{$name}";
prints nothing.
From my understanding, it looks like regression is an actual hash, but the references are the nested hashes, name.
Why does my dereference test line using braces work, but the other form of dereferencing ($$) not work?
Also, why is the name still surrounded by braces when it prints? Shouldn't I be dereferencing $name instead?
I'm sorry if this is difficult to read. I'm confused about which hash is actually referenced, and how to dereference them if the reference is the nested hash.
This is a tough one. You've found some very awkward code that displays what may well be a bug in Perl, and you're getting confused over dereferencing Perl data structures. Standard Perl installations include the full set of documentation, and I suggest you take a look at perldoc perlreftut which is also available online at perldoc.com
The most obvious thing is that you are writing very old-fashioned Perl. Using an ampersand & to call a Perl subroutine hasn't been considered good practice since v5.8 was released fourteen years ago.
I don't think there's much need to go beyond your clearly experimental lines at the start of the first for loop. Once you have understood this, the rest should follow.
say "file $regression{$name}{file}";
say "regression name $regression{$name}";
say "regression name ${regression}{$name}";
First of all, expanding data structure references within a string is unreliable. Perl tries to do what you mean, but it's very easy to write something ambiguous without realising it. It is often much better to use printf so that you can specify the embedded value separately. For instance
printf "file %s\n", $regression{$name}{file};
That said, you have a problem. $regression{$name} accesses the element of hash %regression whose key is equal to $name. That value is a reference to another hash, so the line
say "regression name $regression{$name}";
prints something like
regression name HASH(0x29348b0)
which you really don't want to see
Your first try $regression{$name}{file} accesses the element of the secondary hash that has the key file. That works fine
But ${regression}{$name} should be the same as $regression{$name}. Outside a string it is, but inside it's like ${regression} and {$name} are treated separately
There are really too many issues here for me to start guessing where you're stuck, especially without being able to talk about specifics. But it may help if I rewrite the initial code like this
my %regression;
print "Reading regression dir: $opt_dir\n";
foreach my $f ( glob("$opt_dir/*.regress") ) {
    my ($name, $path, $suffix) = fileparse($f, '\.regress');
    $regression{$name}{file} = $f;
    my $file_details = $regression{$name};
    say "file $file_details->{file}";
    read_regress_file($f, $file_details);
}
I've copied the hash reference to $file_details and passed it to the subroutine like that. Can you see that each element of %regression is keyed by the name of the file, and that each value is a reference to another hash that contains the values filled in by read_regress_file?
I hope this helps. This isn't really a forum for teaching language basics so I don't think I can do much better
What I understand is that this:
$regression{$name}
represents a hashref, which looks like this:
{ file => '...something...'}
So, in order to dereference the hashref returned by $regression{$name}, you have to do something like:
%{ $regression{$name} }
In order to get the full hash.
In order to get the file property of the hash, do this:
$regression{$name}->{file}
Hope this helps.
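The access forms discussed in both answers can be compared side by side. This is a minimal sketch with made-up data: the name 'smoke' and the file path are hypothetical. It also hints at why $$regression{$name} printed nothing: that expression dereferences a scalar variable $regression as a hash reference, which is unrelated to the hash %regression, and under use strict it would not even compile here because no such scalar is declared.

```perl
use strict;
use warnings;

my %regression;
my $name = 'smoke';                                 # hypothetical regression name
$regression{$name}{file} = "/tmp/$name.regress";    # autovivifies the inner hash

my $href = $regression{$name};    # reference to the inner hash
$href->{var1} = 'value';          # modern spelling of $$href{var1} = 'value'

my @keys = sort keys %{ $regression{$name} };   # dereference the whole inner hash
my $file = $regression{$name}{file};            # arrow is optional between subscripts
my $same = $regression{$name}->{file};          # identical element, explicit arrow
my $old  = $$href{var1};                        # old-style element access via $href

printf "keys: %s\n", join ', ', @keys;
printf "file: %s\n", $file;
printf "var1: %s\n", $old;
```

Because $href and $regression{$name} point at the same inner hash, anything read_regress_file stores through $href shows up in %regression afterwards.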

what does print for mean in Perl?

I need to edit some Perl script and I'm new to this language.
I encountered the following statement:
print for (@$result);
I know that $result is a reference to an array and @$result returns the whole array.
But what does print for mean?
Thank you in advance.
In Perl, there's such a thing as an implicit variable. You may have seen it already as $_. There are a lot of built-in functions in Perl that will work on $_ by default.
$_ is set in a variety of places, such as loops. So you can do:
while ( <$filehandle> ) {
    chomp;
    tr/A-Z/a-z/;
    s/oldword/newword/;
    print;
}
Each of these lines is using $_ and modifying it as it goes. Your for loop is doing the same - each iteration of the loop sets $_ to the current value and print is then doing that by default.
I would point out though - whilst useful and clever, it's also a really good way to make confusing and inscrutable code. In nested loops, for example, it can be quite unclear what's actually going on with $_.
So I'd typically:
avoid writing it explicitly - if you need to do that, you should consider actually naming your variable properly.
only use it in places where it makes it clearer what's going on. As a rule of thumb - if you use it more than twice, you should probably use a named variable instead.
I find it particularly useful if iterating on a file handle. E.g.:
while ( <$filehandle> ) {
    next unless m/keyword/; #skips any line without 'keyword' in it.
    my ( $wiggle, $wobble, $fronk ) = split ( /:/ ); #split $_ into 3 variables on ':'
    print $wobble, "\n";
}
It would be redundant to assign a variable name to capture a line from <$filehandle>, only to immediately discard it - thus instead we use split which by default uses $_ to extract 3 values.
If it's hard to figure out what's going on, then one of the more useful ways is to use perl -MO=Deparse which'll re-print the 'parsed' version of the script. So in the example you give:
foreach $_ (@$result) {
    print $_;
}
It is equivalent to for (@$result) { print; }, which is equivalent to for (@$result) { print $_; }. $_ refers to the current element.
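The equivalence is easy to check with a small sketch. The array contents are made up for illustration; the statement-modifier form and the explicit loop visit the same elements in the same order.

```perl
use strict;
use warnings;

my $result = [ "alpha\n", "beta\n" ];   # hypothetical data behind the reference

# Statement-modifier form: $_ is set to each element in turn.
my $implicit = '';
$implicit .= $_ for @$result;

# The same loop with a named variable instead of $_.
my $explicit = '';
foreach my $item (@{$result}) {
    $explicit .= $item;
}

# The original statement: print with no arguments prints $_.
print for @$result;
```

The named-variable form is longer but tends to be easier to follow once the loop body grows beyond a single statement.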

Strange behavior using POST data in perl scripts

Server is linux. I am having inexplicable problems when I send POST data to the script.
For example, I send the following POST data: choice=update
Here is the script:
#!/usr/bin/perl -w
print "Content-type: text/html\n\n";
if ( $ENV{'REQUEST_METHOD'} eq "GET" ) {
    $in = $ENV{'QUERY_STRING'};
} elsif ($ENV{'REQUEST_METHOD'} eq "POST") {
    read(STDIN,$in,$ENV{'CONTENT_LENGTH'});
}
@in = split(/&/,$in);
foreach $i (0 .. $#in) {
    # Convert plus's to spaces
    $in[$i] =~ s/\+/ /g;
    # Split into key and value.
    ($key, $val) = split(/=/,$in[$i],2); # splits on the first =.
    # Convert %XX from hex numbers to alphanumeric
    $key =~ s/%(..)/pack("c",hex($1))/ge;
    $val =~ s/%(..)/pack("c",hex($1))/ge;
    # Associate key and value
    $in{$key} .= "\0" if (defined($in{$key})); # \0 is the multiple separator
    $in{$key} .= $val;
}
print $in{'choice'};
The first time I access the script, it prints update
The second time I access it, it prints updateupdate
The third time, it prints updateupdateupdate
...and so on.
What on earth could be causing it to keep appending the string to itself between requests? I am sending exactly the same POST data every time by simply refreshing with my browser. Cookies are not being used. There is nothing else in the file that is not commented out.
Edit: Also, when I print <STDIN> it says choice=update every time. The other updates don't appear to be added to STDIN
My guess is that the script is kept running between requests. As %in is a global variable it is never cleared, so $in{$key} .= $val ends up making the string longer and longer. You can probably avoid the problem by using lexical variables.
This means you'll need to find out how the script is being run by the web server.
You'll also want to look at using modules to do all this parsing work for you, and learn about ways to write Perl code that avoid the problem you've encountered. I'd suggest taking a look at Modern Perl and working from there.
It sounds / looks like it's related to the web server's configuration and not the script itself.
However, at the beginning of the code, try adding:
my %in;
This would scope the variable you're printing.
Also, at the end of the code I would add: exit 0;
(Although usually not necessary).
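To illustrate why lexical variables help, here is a minimal sketch that assumes the script stays resident between requests. The helper parse_form is hypothetical and only mirrors the decoding steps of the original script; a maintained form-parsing module would normally be preferable. Because %in is declared inside the subroutine, every call starts with an empty hash.

```perl
use strict;
use warnings;

# Hypothetical helper: decode an application/x-www-form-urlencoded string.
sub parse_form {
    my ($raw) = @_;
    my %in;    # lexical: a fresh, empty hash on every call
    for my $pair (split /&/, $raw // '') {
        my ($key, $val) = split /=/, $pair, 2;   # split on the first =
        next unless defined $key && length $key;
        $val //= '';
        for ($key, $val) {
            tr/+/ /;                              # plus signs become spaces
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # %XX becomes the character
        }
        # \0 separates repeated keys, as in the original script.
        $in{$key} = exists $in{$key} ? "$in{$key}\0$val" : $val;
    }
    return \%in;
}

# Simulate three requests handled by the same long-running process.
for my $request (1 .. 3) {
    my $in = parse_form('choice=update');
    print "request $request: $in->{choice}\n";
}
```

Each simulated request sees only its own data, whereas a package-level %in would keep growing for as long as the process lives.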