Perl/DBI selectrow_array scope confusion - perl

So, I've got a list of files and I'm checking a database table to see whether an entry exists for each one, pulling the id and destination filename if it does. If it's not there, I insert an entry and re-pull it (note the id is auto-increment, so I have to do a second query no matter what).
The issue is that when I re-run the query after the insert, the variable it goes into is lexical (I think that's the correct wording), and once I leave the scope of the if (!defined) block, it loses its value.
#lookup file db entry
my ($fileId, $destFilename) = $dbh->selectrow_array("select fileId, destFilename from myTable where sourceFilename = '$file'");
if (! defined $fileId) {
# calculate out what the destination filename should be here..
# add missing entry into table
($fileId, $destFilename) = $dbh->selectrow_array("select fileId, destFilename from myTable where sourceFilename = '$file'");
print Dumper $destFilename;
}
print Dumper $destFilename;
This will result in:
$VAR1 = "correctfilenamehere"
$VAR1 = undef
I have tried defining the variables before assigning them via the selectrow_array call. I've tried changing above from my to our for these variables. I'm confused on why it's doing this.
Also to note, this code is within another block so those variables are already lexical to that scope. I had presumed they would be available in the child blocks, but it's not really working that way as far as I can see.

The code you posted does not exhibit the behaviour you describe.
$ perl -e'
use strict;
use warnings;
use Data::Dumper qw( Dumper );
sub f { $_[0] ? (4, "abc") : () }
my ($fileId, $destFilename) = f(0);
if (!defined $fileId) {
($fileId, $destFilename) = f(1);
print Dumper $destFilename;
}
print Dumper $destFilename;
'
$VAR1 = 'abc';
$VAR1 = 'abc';
You could get the described behaviour if you introduced a new variable with the same name.
$ perl -e'
use strict;
use warnings;
use Data::Dumper qw( Dumper );
sub f { $_[0] ? (4, "abc") : () }
my ($fileId, $destFilename) = f(0);
if (!defined $fileId) {
my ($fileId, $destFilename) = f(1);
print Dumper $destFilename;
}
print Dumper $destFilename;
'
$VAR1 = 'abc';
$VAR1 = undef;

Found the issue: a bit of code was declaring a lexical variable of the same name and I wasn't noticing, so the parent $destFilename was being shadowed by a local variant.

Related

how can I display a variable's name in perl, along with the value of the variable?

To diagnose or debug my perl code, I would like to easily display the name of a variable along with its value. In bash, one types the following:
#!/bin/bash
dog=pitbull
declare -p dog
In perl, consider the following script, junk.pl:
#!/usr/bin/perl
use strict; use warnings;
my $dog="pitbull";
my $diagnosticstring;
print STDERR "dog=$dog\n";
sub checkvariable {
foreach $diagnosticstring (@_) { print "nameofdiagnosticstring=$diagnosticstring\n"; }
}
checkvariable "$dog";
If we call this script, we obtain
bash> junk.pl
dog=pitbull
nameofdiagnosticstring=pitbull
bash>
But instead, when the subroutine checkvariable is called, I would like the following to be printed:
dog=pitbull
This would make coding easier and less error-prone, since one would not have to type the variable's name twice.
You can do something like this with PadWalker (which you'll need to install from CPAN). But it's almost certainly far more complex than you'd like it to be.
#!/usr/bin/perl
use strict;
use warnings;
use PadWalker 'peek_my';
my $dog="pitbull";
print STDERR "dog=$dog\n";
sub checkvariable {
my $h = peek_my(0);
foreach (@_) {
print '$', $_,'=', ${$h->{'$'. $_}}, "\n";
}
}
checkvariable "dog";
Data::Dumper::Names may be what you're looking for.
#! perl
use strict;
use warnings;
use Data::Dumper::Names;
my $dog = 'pitbull';
my $cat = 'lynx';
my @mice = qw(jumping brown field);
checkvariable($dog, $cat, \@mice);
sub checkvariable {
print Dumper @_;
}
1;
Output:
perl test.pl
$dog = 'pitbull';
$cat = 'lynx';
@mice = (
'jumping',
'brown',
'field'
);
(not an answer, a formatted comment)
The checkvariable sub receives only a value, and there's no (simple or reliable) way to find out what variable holds that value.
This is why Data::Dumper forces you to specify the varnames as strings:
perl -MData::Dumper -E '
my $x = 42;
my $y = "x";
say Data::Dumper->Dump([$x, $y]);
say Data::Dumper->Dump([$x, $y], [qw/x y/])
'
$VAR1 = 42;
$VAR2 = 'x';
$x = 42;
$y = 'x';
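The two-argument form of Data::Dumper->Dump shown above can be wrapped in a small helper so that each value is passed next to its name. This is only a sketch: dump_named is a hypothetical name, not part of Data::Dumper, and the variable names still have to be typed once as strings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper (not part of Data::Dumper): takes alternating
# name/value pairs and labels each value with its name instead of $VARn.
sub dump_named {
    my (@names, @values);
    while (@_) {
        push @names,  shift;
        push @values, shift;
    }
    local $Data::Dumper::Indent = 1;
    return Data::Dumper->Dump(\@values, \@names);
}

my $dog  = 'pitbull';
my @mice = qw(jumping brown field);

# A leading '*' in a name asks Data::Dumper to print the reference as the
# array or hash it points to (here: @mice = (...)).
my $out = dump_named(dog => $dog, '*mice' => \@mice);
print $out;
```

Arrays and hashes are passed by reference so that each name lines up with exactly one value.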
Something like the following usually helps:
use strict;
use warnings;
use Data::Dumper;
my $debug = 1;
my $container = 20;
my %hash = ( 'a' => 7, 'b' => 2, 'c' => 0 );
my @array = ( 1, 7, 9, 8, 21, 16, 37, 42 );
debug('$container', $container) if $debug;
debug('%hash', \%hash) if $debug;
debug('@array', \@array) if $debug;
sub debug {
my $name = shift;
my $value = shift;
print "DEBUG: $name [ARRAY]\n", Dumper($value) if ref $value eq 'ARRAY';
print "DEBUG: $name [HASH]\n", Dumper($value) if ref $value eq 'HASH';
print "DEBUG: $name = $value\n" if ref $value eq '';
}
But why not run the Perl script under the built-in debugger (option -d)? See perldebug, "The Perl Debugger".

perl passing hash reference not behaving as expected

(NOTE: I sort of figured this out, see all the way at the end)
It's kind of late and I've been staring at this code for far too long. I finally wrote a short test program to test hashes and passing them by reference and it is not behaving as I expect. I'm sure there is something very simple I'm missing ... can anyone spot it?
#!/usr/bin/perl
use Data::Dumper;
my %hash = ();
print "BEFORE ADDING KEYS\n";
print Dumper (\%hash);
test (\%hash, 10);
print "AFTER ADDING KEYS\n";
print Dumper (\%hash);
sub test {
my %hash = %{$_[0]};
my $number = $_[1];
if ($number == 0) { return; }
print "BEFORE ADDING KEY HASH_REF=$_[0] NUMBER=$number\n";
print Dumper (\%hash);
$hash{$number} = $number;
print "AFTER ADDING KEY\n";
print Dumper (\%hash);
test ($_[0], $number - 1);
}
I expect this code to add the numbers 10 to 1 into my hash, but instead the hash gets wiped out and doesn't contain anything once the test routine finishes recursing. What am I missing? Here is the output:
BEFORE ADDING KEYS
$VAR1 = {};
BEFORE ADDING KEY HASH_REF=HASH(0xdb82fd0) NUMBER=10
$VAR1 = {};
AFTER ADDING KEY
$VAR1 = {
'10' => 10
};
BEFORE ADDING KEY HASH_REF=HASH(0xdb82fd0) NUMBER=9
$VAR1 = {};
AFTER ADDING KEY
$VAR1 = {
'9' => 9
};
...
BEFORE ADDING KEY HASH_REF=HASH(0xdb82fd0) NUMBER=1
$VAR1 = {};
AFTER ADDING KEY
$VAR1 = {
'1' => 1
};
AFTER ADDING KEYS
$VAR1 = {};
Changing this line:
$hash{$number} = $number;
to:
$_[0]->{$number} = $number;
made everything work as expected. Why does the first statement modify a fresh local %hash when I would expect this local %hash to be pointing to the same de-referenced hash reference I originally passed into the routine?
Everything works as intended. Your first statement in the test sub makes a copy of the passed hash:
my %hash = %{$_[0]};
To mutate the passed hash, you should work with the hashref, like:
my $hashref = $_[0];
$hashref->{key} = 'val';
This approach will change the original hash, not its copy.
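Applied to the recursive test from the question, the hashref approach might look like the sketch below. The sub name add_keys is made up for illustration, and the debugging output from the original is left out.

```perl
use strict;
use warnings;

# Add the keys $number, $number - 1, ..., 1 to the hash behind $hashref.
sub add_keys {
    my ($hashref, $number) = @_;
    return if $number == 0;
    $hashref->{$number} = $number;      # writes through the reference
    add_keys($hashref, $number - 1);    # same reference, so same hash
    return;
}

my %hash;
add_keys(\%hash, 10);
print join(',', sort { $a <=> $b } keys %hash), "\n";
```

Because every level of the recursion receives the same reference, all ten keys end up in the caller's %hash.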

Build hash of hash in perl

I'm new to using perl and I'm trying to build a hash of a hash from a tsv. My current process is to read in a file and construct a hash and then insert it into another hash.
my %hoh = ();
while (my $line = <$tsv>)
{
chomp $line;
my %hash;
my @data = split "\t", $line;
my $id;
my $iter = each_array(@columns, @data);
while(my($k, $v) = $iter->())
{
$hash{$k} = $v;
if($k eq 'Id')
{
$id = $v;
}
}
$hoh{$id} = %hash;
}
print "dump: ", Dumper(%hoh);
This outputs:
dump
$VAR1 = '1234567890';
$VAR2 = '17/32';
$VAR3 = '1234567891';
$VAR4 = '17/32';
.....
Instead of what I would expect:
dump
{
'1234567890' => {
'k1' => 'v1',
'k2' => 'v2',
'k3' => 'v3',
'k4' => 'v4',
'id' => '1234567890'
},
'1234567891' => {
'k1' => 'v1',
'k2' => 'v2',
'k3' => 'v3',
'k4' => 'v4',
'id' => '1234567891'
},
........
};
My limited understanding is that when I do $hoh{$id} = %hash; it's inserting a reference to %hash? What am I doing wrong? Also, is there a more succinct way to use my columns and data arrays as key/value pairs in my %hash object?
-Thanks in advance,
Niru
To get a reference, you have to use \:
$hoh{$id} = \%hash;
%hash is the hash, not the reference to it. In scalar context, it returns the string X/Y, where X is the number of used buckets and Y the total number of buckets in the hash (i.e. nothing useful). Since Perl 5.26 it returns the number of keys instead, which is no more useful here.
To get a reference to a hash variable, you need to use \%hash (as choroba said).
A more succinct way to assign values to columns is to assign to a hash slice, like this:
my %hoh = ();
while (my $line = <$tsv>)
{
chomp $line;
my %hash;
@hash{@columns} = split "\t", $line;
$hoh{$hash{Id}} = \%hash;
}
print "dump: ", Dumper(\%hoh);
A hash slice (@hash{@columns}) means essentially the same thing as ($hash{$columns[0]}, $hash{$columns[1]}, $hash{$columns[2]}, ...) up to however many columns you have. By assigning to it, I'm assigning the first value from split to $hash{$columns[0]}, the second value to $hash{$columns[1]}, and so on. It does exactly the same thing as your while ... $iter loop, just without the explicit loop (and it doesn't extract the $id).
There's no need to compare each $k to 'Id' inside a loop; just store it in the hash as a normal field and extract it afterwards with $hash{Id}. (Aside: Is your column header Id or id? You use Id in your loop, but id in your expected output.)
If you don't want to keep the Id field in the individual entries, you could use delete (which removes the key from the hash and returns the value):
$hoh{delete $hash{Id}} = \%hash;
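Here is a self-contained sketch of the hash-slice version that runs without an input file. The column names and the two sample lines are invented stand-ins for the real TSV header and data.

```perl
use strict;
use warnings;

my @columns = qw(Id name size);               # assumed header names
my @lines   = ("17\tfoo\t3", "42\tbar\t5");   # stand-in for lines read from the TSV

my %hoh;
for my $line (@lines) {
    my %row;
    @row{@columns} = split /\t/, $line;   # hash slice: one value per column
    $hoh{ $row{Id} } = \%row;             # store a reference, not the hash itself
}

print "$_: $hoh{$_}{name}\n" for sort { $a <=> $b } keys %hoh;
```

Each pass through the loop declares a fresh %row, so the references stored in %hoh point to distinct hashes.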
Take a look at the documentation included in Perl. The command perldoc is very helpful. You can also look at the Perldoc webpage too.
One of the tutorials is a tutorial on Perl references. It will help clarify a lot of your questions and explain referencing and dereferencing.
I also recommend that you look at CPAN. This is an archive of various Perl modules that can do many various tasks. Look at Text::CSV. This module will do exactly what you want, and even though it says "CSV", it works with tab separated files too.
You missed putting a backslash in front of the hash you're trying to take a reference to. You have:
$hoh{$id} = %hash;
Probably want:
$hoh{$id} = \%hash;
Also, when you run Data::Dumper on a hash, you should pass it a reference to the hash. A bare %hoh is flattened into a list of keys and values in the argument list, so Dumper prints each one as a separate $VARn.
You have:
print "dump: ", Dumper(%hoh);
You should have:
print "dump: ", Dumper( \%hoh );
My attempt at the program:
#! /usr/bin/env perl
#
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
use constant {
FILE => "test.txt",
};
open my $fh, "<", FILE;
#
# First line with headers
#
my $line = <$fh>;
chomp $line;
my @headers = split /\t/, $line;
my %hash_of_hashes;
#
# Rest of file
#
while ( my $line = <$fh> ) {
chomp $line;
my %line_hash;
my @values = split /\t/, $line;
for my $index ( ( 0..$#values ) ) {
$line_hash{ $headers[$index] } = $values[ $index ];
}
$hash_of_hashes{ $line_hash{id} } = \%line_hash;
}
say Dumper \%hash_of_hashes;
You should only store a reference to a variable if you do so in the last line before the variable goes out of scope. In your script, you declare %hash inside the while loop, so placing this statement as the last in the loop is safe:
$hoh{$id} = \%hash;
If it's not the last statement (or you're not sure it's safe), create an anonymous structure to hold the contents of the variable:
$hoh{$id} = { %hash };
This makes a copy of %hash, which is slower, but any subsequent changes to it will not affect what you stored.
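The difference between storing \%hash and { %hash } can be seen in a short sketch (the key and values are arbitrary):

```perl
use strict;
use warnings;

my %hash = (colour => 'red');

my $alias = \%hash;      # reference to the same hash
my $copy  = { %hash };   # new anonymous hash holding a shallow copy

$hash{colour} = 'blue';  # change the original afterwards

print "alias: $alias->{colour}\n";   # follows the change
print "copy:  $copy->{colour}\n";    # keeps the old value
```

The copy is shallow: nested references inside %hash would still be shared between the original and the copy.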

Perl modifying hash reference in subroutine

I am having trouble understanding the hash references and changing the hash in place, instead of returning it. I want to write a sub routine which will return a value from hash and also modify the hash. I was facing some issues while coding for it. So, I wrote the following basic code to understand modifying the hash in place.
#!/usr/local/bin/perl
#Check hash and array references
#Author: Sidartha Karna
use warnings;
use strict;
use Data::Dumper;
sub checkHashRef{
my ($hashRef, $arrVal) = @_;
my %hashDeref = %{$hashRef};
$hashDeref{'check'} = 2;
push(@{$arrVal}, 3);
print "There:" ;
print Dumper $hashRef;
print Dumper %hashDeref;
print Dumper $arrVal
}
my %hashVal = ('check', 1);
my @arrVal = (1, 2);
checkHashRef(\%hashVal, \@arrVal);
print "here\n";
print Dumper %hashVal;
print Dumper @arrVal;
The output observed is:
There:$VAR1 = {
'check' => 1
};
$VAR1 = 'check';
$VAR2 = 2;
$VAR1 = [
1,
2,
3
];
here
$VAR1 = 'check';
$VAR2 = 1;
$VAR1 = 1;
$VAR2 = 2;
$VAR3 = 3;
From the output, I inferred that, changes to hashDeref are not modifying the data in the reference. Is my understanding correct? Is there a way to modify the hash variable in place instead of returning it.
This is making a (shallow) copy of %hashVal:
my %hashDeref = %{$hashRef};
The hash-ref $hashRef still points to %hashVal but %hashDeref doesn't, it is just a copy. If you want to modify the passed hash-ref in-place, then work with the passed hash-ref:
sub checkHashRef{
my ($hashRef, $arrVal) = @_;
$hashRef->{'check'} = 2;
#...
That will leave your changes in %hashVal. In the array case, you never make a copy, you just dereference it in-place:
push(@{$arrVal}, 3);
and the change to $arrVal shows up in @arrVal.
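Since the question asks for a sub that both returns a value from the hash and modifies it, here is a minimal sketch of that pattern. The name bump_check and the choice to return the old value are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the old value of 'check' and
# overwrites it in the caller's hash through the reference.
sub bump_check {
    my ($hashRef, $arrRef) = @_;
    my $old = $hashRef->{check};
    $hashRef->{check} = $old + 1;   # modifies the caller's hash in place
    push @{$arrRef}, 3;             # modifies the caller's array in place
    return $old;
}

my %hashVal = (check => 1);
my @arrVal  = (1, 2);
my $old = bump_check(\%hashVal, \@arrVal);
print "old=$old new=$hashVal{check} array=@arrVal\n";
```

No copy of the hash is made inside the sub, so nothing needs to be returned to pass the changes back.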

How can I store entire contents of a Perl array to a scalar variable?

How can I store entire contents of an array to a scalar variable.
eg:
my $code = do { local $/; <FILE HANDLE>; };
This works fine for file handles but I need this for an array.
Use join.
my $str = join '', @array;
You can also take a reference to the array:
my @array = 'a'..'z';
my $scalar = \@array;
foo( $scalar );
sub foo {
my $array_ref = shift;
for my $f ( @$array_ref ) {
do_something( $f );
}
}
Which approach you take really depends on what you are trying to accomplish.
my @arr = ("1","2","3");
my $arr = "@arr";
print "$arr";
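Note that join '' and string interpolation do not produce the same string: interpolating an array joins its elements with the list separator $", which is a single space by default. A short sketch of the three variants:

```perl
use strict;
use warnings;

my @arr = (1, 2, 3);

my $joined = join '', @arr;                    # no separator
my $interp = "@arr";                           # elements joined with $" (a space)
my $custom = do { local $" = ','; "@arr" };    # temporarily change the separator

print "$joined\n$interp\n$custom\n";
```

Using local keeps the change to $" confined to the do block.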
You can actually use a scalar variable as a filehandle:
my $bigbuffer;
my $f;
open $f, ">", \$bigbuffer; # opens $f for writing into the variable $bigbuffer
# do whatever prints fwrites etc you want here
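A complete version of the in-memory filehandle idea, applied to the array case from the question, might look like this (the array contents are made up):

```perl
use strict;
use warnings;

my @array = ("first\n", "second\n", "third\n");

my $bigbuffer = '';
# Opening a reference to a scalar gives a filehandle that writes into it.
open my $fh, '>', \$bigbuffer or die "Cannot open in-memory handle: $!";
print {$fh} @array;   # each print appends to $bigbuffer
close $fh;

print $bigbuffer;
```

For a plain array, join is simpler; the filehandle form is mainly useful when existing code already expects something to print to.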