Passing string and hash ref to subroutines - perl

I am passing a string and a hash to a subroutine where the hash is accepted as a reference to the subroutine. At the end of the subroutine call, I expect my hash to be filled. Snapshot:
#!/usr/local/bin/perl5.8
sub passHashAndFile {
my ($file, $hashRef) = @_;
open(HANDLE, $file) or die("Can not open file $file \n");
while(<HANDLE>) {
my @splitted_values = split("--", $_);
$hashRef->{$spllited_values[0]} = $hashRef->{$spllited_values[1]};
}
close(HANDLE);
}
my %hash;
passHashAndFile("test.txt", %hash);
foreach my $elem (keys %hash) {
print "Key = $elem, Value = $hash{$elem}\n";
}
And my test.txt looks like this:
1--2
3--4
5--6
7--8
I am receiving the hash as a reference and then dereferencing it to fill in the values. What did I do wrong?

passHashAndFile("test.txt", %hash);
should be
passHashAndFile("test.txt", \%hash);
Additionally, you have misspelled variable names, you're assigning the wrong value, you're using unlocalized global vars, and you're using problematic 2-arg open. Also, most people would have the sub return a hash ref instead of taking a ref to an empty hash.
#!/usr/local/bin/perl5.8
use strict;
use warnings;
sub parse_file {
my ($qfn) = @_;
open(my $fh, '<', $qfn)
or die("Can't open file \"$qfn\": $!\n");
my %hash;
while (my $line = <$fh>) {
my ($key, $val) = split(/--/, $line);
$hash{$key} = $val;
}
return \%hash;
}
my $hash = parse_file("test.txt");
for my $key (keys %$hash) {
print "Key = $key, Value = $hash->{$key}\n";
}
We usually leave it out of our answers, but always use use strict; use warnings;. It would have found at least one of the problems.
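To see why the original call left %hash empty, it helps to count what actually arrives in @_. The sketch below is illustrative only; count_args, %empty and %full are hypothetical names that do not appear in the question. A hash passed by value is flattened into the argument list, so an empty hash contributes nothing, and the sub's $hashRef ends up undef (dereferencing it then autovivifies a new, private hash that the caller never sees).

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments the sub received.
sub count_args { return scalar @_ }

my %empty;
my %full = (a => 1, b => 2);

# An empty hash flattens to an empty list: only the string arrives.
print count_args('test.txt', %empty), "\n";

# A non-empty hash flattens to its keys and values: 1 + 4 arguments.
print count_args('test.txt', %full), "\n";

# A reference is a single scalar, whatever the hash contains.
print count_args('test.txt', \%full), "\n";
```

Passing \%hash is what lets the sub modify the caller's hash.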

Related

perl: How to make 'warn' think we read from a file?

I have a function (a variation of string++):
sub inc
{
$_[0] =~ /^(.*?)([0-9]+)$/;
my ($a,$b)=($1,$2);
die "cannot increment [$_[0]]" unless defined $b;
warn "increment overflow [$_[0]]" if length(++$b) != length($2);
$a.$b;
}
It is invoked in many places of a script, on different data (sometimes from a file, sometimes from a database).
When I read from a filehandle, die and warn print a message like this:
cannot increment [abc] at script line 5, <filehandle> line 123.
otherwise a shorter message is printed:
cannot increment [abc] at script line 5.
When I read from database I would like to have a message like this:
cannot increment [abc] at script line 5, <SELECT...> line 123.
Is it possible?
Setting the line number is quite simple: an assignment to $. can be made. But how to set the 'filehandle' part and make it visible?
I have found such a workaround:
my $fh = "SELECT...";
open $fh, "/dev/null";
<$fh>;
but it is a bit long, and it actually does open a file.
The filehandle information that appears in warn and die messages is only set after calls to <HANDLE>, readline, tell, eof, and seek. When you fetch data from a database with DBI, for example, you're not calling any of these, so you have to pass the extra data yourself.
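A small sketch of that behaviour, assuming an in-memory file in place of a real one (the variable names are made up for the illustration). The warning issued before any read carries no handle information; the one issued after a readline does.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @messages;
local $SIG{__WARN__} = sub { push @messages, $_[0] };

warn "before any read";              # no handle has been read yet

my $text = "first\nsecond\n";
open my $fh, '<', \$text or die "Can't open in-memory file: $!";
my $line = <$fh>;                    # this read records the handle and sets $.

warn "after one read";               # now carries ", <$fh> line 1."

print @messages;
```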
One way to do this is to write a custom exception class that stringifies to the text you want:
package MyException;
use strict;
use warnings 'all';
use v5.18.0;
use overload '""' => \&as_string;
sub new {
my ($self, $message, $src, $src_line) = @_;
my ($package, $file, $line) = caller;
if (! defined $src && ref ${^LAST_FH} eq 'GLOB') {
$src = *${^LAST_FH}{NAME};
$src_line = $.;
}
bless { message => $message,
file => $file,
line => $line,
src => $src,
src_line => $src_line }, $self;
}
sub as_string {
my ($self) = @_;
my $message = "$self->{message} at $self->{file} line $self->{line}";
if (defined $self->{src} && defined $self->{src_line}) {
$message .= ", <$self->{src}> line $self->{src_line}";
}
$message .= "\n";
}
1;
Note that Perl 5.18.0 or up is required to use the read-only ${^LAST_FH} variable, which holds a reference to the last read filehandle.
Here's how you would use this when reading from a file:
use strict;
use warnings 'all';
use MyException;
while (<DATA>) {
warn MyException->new('foo'); # equivalent to warn 'foo'
}
__DATA__
first
second
Output:
foo at ./myscript line 9, <DATA> line 1
foo at ./myscript line 9, <DATA> line 2
And here's how you would use it when fetching records from a database:
use strict;
use warnings 'all';
use DBI;
use MyException;
my $dbh = DBI->connect('dbi:mysql:test', 'user', 'pass', {
RaiseError => 1
});
my $sql = 'SELECT * FROM test';
my $sth = $dbh->prepare($sql);
$sth->execute;
my $count;
while (my $row = $sth->fetch) {
warn MyException->new('foo', $sql, ++$count);
}
Output:
foo at ./myscript line 19, <SELECT * FROM test> line 1
foo at ./myscript line 19, <SELECT * FROM test> line 2
(Unfortunately, DBI doesn't provide a method to get the number of rows that have been fetched so far, so you have to count them yourself.)
Since you're trying to warn or die from inside a subroutine, you have to do a little bit more work. The simplest approach for die would be to trap exceptions from your subroutine with eval and re-throw them:
my $count = 1;
while (my $row = $sth->fetch) {
eval {
inc($row->[0]);
};
if ($@ =~ /^(cannot increment \[.*?\])/) {
die MyException->new($1, $sql, $count);
}
elsif ($@) {
die $@;
}
$count++;
}
You can handle warnings in a similar way by creating a __WARN__ handler:
{
my $count = 1;
local $SIG{__WARN__} = sub {
if ($_[0] =~ /^(increment overflow \[.*?\])/) {
warn MyException->new($1, $sql, $count);
}
else {
warn @_;
}
};
while (my $row = $sth->fetch) {
inc($row->[0]);
$count++;
}
}
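If a full exception class is more than you need, the same __WARN__ idea can be reduced to rewriting the message text directly. This is only a sketch under stated assumptions: the 'SELECT ...' label, the sample values and the @seen array are placeholders invented for the example, and no database is involved.

```perl
use strict;
use warnings;

my @seen;    # rewritten warnings are collected here for the demonstration
{
    my $source = 'SELECT ...';    # placeholder label for the data source
    my $row_no = 0;

    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        $msg =~ s/\.?\n\z//;      # drop the trailing "." and newline
        push @seen, "$msg, <$source> line $row_no.\n";
    };

    # Stand-in for a fetch loop: warn on values ending in 9.
    for my $value ('a9', 'b1') {
        $row_no++;
        warn "increment overflow [$value]" if $value =~ /9\z/;
    }
}
print @seen;
```

In real code the handler would pass the rewritten message on to warn (as in the handler above) rather than storing it.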
You may prefer this implementation of your inc subroutine. Your own uses the special variables $a and $b (which are reserved for sort), and it needlessly captures and re-attaches the initial non-numeric part of the string.
Note that the STDERR output is not in sync with STDOUT, so the warning appears prematurely in the aggregated text. In reality the warning is issued only when the passed string has an all-nines numeric field.
use strict;
use warnings 'all';
my $s = 'ZZ90';
for ( 1 .. 20 ) {
$s = inc_str($s);
print $s, "\n";
}
sub inc_str {
my ($str) = @_;
$str =~ s{([0-9]+)$}{
my $num = $1;
warn "Increment overflow [$str]" unless $num =~ /[^9]/;
sprintf '%0*d', length($num), $num+1;
}e or die "Cannot increment [$str]";
return $str;
}
output
Increment overflow [ZZ99] at E:\Perl\source\inc_str.pl line 18.
ZZ91
ZZ92
ZZ93
ZZ94
ZZ95
ZZ96
ZZ97
ZZ98
ZZ99
ZZ100
ZZ101
ZZ102
ZZ103
ZZ104
ZZ105
ZZ106
ZZ107
ZZ108
ZZ109
ZZ110

How to show matching and mismatching records of two text files in command prompt using Perl?

I'm using two text files, sampleA.txt and sampleB.txt. Each file has two fields per record. I need to compare the records (rows) of sampleA.txt with those of sampleB.txt and show both the matching and the mismatching records in the command prompt. I need to do that in Perl.
Using the below script I'm getting one output but it is wrong. I need to populate both matching as well as mismatching. How to do that?
sampleA.txt:
1|X
2|A
4|Z
5|A
sampleB.txt:
2|A
2|X
3|B
4|C
Output I'm getting:
2|A
2|X
4|C
Outputs I want:
Matching-Output:
2|A
Miss-matching-Output:
1|X
4|Z
5|A
3|B
4|C
Perl Script:
#!/usr/bin/perl
use strict;
use warnings;
open(FILE1,'C:\Users\sathiya.kumar\Desktop\sampleA.txt') || die $!;
open(FILE2,'C:\Users\sathiya.kumar\Desktop\sampleB.txt') || die $!;
my $interline;
while (my $line= <FILE1>) {
my @fields = split('\|',$line);
parser($fields[0]);
}
sub parser {
my $mergeid = shift;
while (defined $interline || ($interline= <FILE2>)) {
my @fields = split('\|',$interline);
my $key = $fields[0];
if ($key lt $mergeid) {
# Skip non-matching records
$interline = undef;
next;
} elsif ($key gt $mergeid) {
# wait for next key
last;
} else {
print $interline;
$interline = undef;
}
}
}
close(FILE1);
close(FILE2);
Let me know if you need more information.
You left out 2|X:
use strict;
use warnings;
use 5.016;
use Data::Dumper;
#Create a set from the entries in sampleA.txt:
my $fname = 'sampleA.txt';
open my $A_INFILE, '<', $fname
or die "Couldn't open $fname: $!";
my %a;
while (my $line = <$A_INFILE>) {
chomp $line;
$a{$line} = undef;
}
close $A_INFILE;
say Dumper(\%a);
#Create a set from the entries in sampleB.txt:
$fname = 'sampleB.txt';
open my $B_INFILE, '<', $fname
or die "Couldn't open $fname: $!";
my %b;
while (my $line = <$B_INFILE>) {
chomp $line;
$b{$line} = undef;
}
close $B_INFILE;
say Dumper(\%b);
#Divide the entries in both files into matches and mismatches:
my (@matches, @mismatches);
for my $a_val (keys %a) {
if (exists $b{$a_val}) {
push @matches, $a_val;
}
else {
push @mismatches, $a_val;
}
}
for my $b_val (keys %b) {
if (not exists $a{$b_val}) {
push @mismatches, $b_val;
}
}
say Dumper(\@matches);
say Dumper(\@mismatches);
--output:--
$VAR1 = {
'5|A' => undef,
'4|Z' => undef,
'1|X' => undef,
'2|A' => undef
};
$VAR1 = {
'2|X' => undef,
'3|B' => undef,
'4|C' => undef,
'2|A' => undef
};
$VAR1 = [
'2|A'
];
$VAR1 = [
'5|A',
'4|Z',
'1|X',
'2|X',
'3|B',
'4|C'
];
If you evaluate a hash in scalar context, it returns false if the hash is empty. If there are any key/value pairs, it returns true; more precisely, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. This is pretty much useful only to find out whether Perl's internal hashing algorithm is performing poorly on your data set. For example, you stick 10,000 things in a hash, but evaluating %HASH in scalar context reveals "1/16" , which means only one out of sixteen buckets has been touched, and presumably contains all 10,000 of your items. This isn't supposed to happen. If a tied hash is evaluated in scalar context, the SCALAR method is called (with a fallback to FIRSTKEY ).
http://perldoc.perl.org/perldata.html
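Because hash keys come back in no particular order, the Dumper output above does not follow the order of the files. A minimal sketch that keeps file order and prints the headings from the question is shown here; it assumes the records are inlined in two arrays (@a and @b are stand-ins for the contents of sampleA.txt and sampleB.txt) and uses the hashes only as lookup sets.

```perl
use strict;
use warnings;

# Sample records inlined (an assumption) instead of being read from the files.
my @a = ('1|X', '2|A', '4|Z', '5|A');
my @b = ('2|A', '2|X', '3|B', '4|C');

# Lookup sets: a record is present if its key exists.
my %in_a = map { $_ => 1 } @a;
my %in_b = map { $_ => 1 } @b;

# Records of A that also occur in B, in A's order.
my @matches = grep { $in_b{$_} } @a;

# Records found in only one of the two lists: A's first, then B's.
my @mismatches = ( (grep { !$in_b{$_} } @a), (grep { !$in_a{$_} } @b) );

print "Matching-Output:\n";
print "$_\n" for @matches;
print "Miss-matching-Output:\n";
print "$_\n" for @mismatches;
```

As in the answer above, 2|X is reported as a mismatch because whole records are compared, not just the first field.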

function call in perl

As a part of my course work I have been learning the Perl programming language for the first time over the last few weeks. I have been writing small functions and making function calls. I have written a function for string matching.
use strict;
use warnings;
sub find_multi_string {
my ($file, @strings) = @_;
my $fh;
open ($fh, "<$file");
#store the whole file in an array
my @array = <$fh>;
for my $string (@strings) {
if (grep /$string/, @array) {
next;
} else {
die "Cannot find $string in $file";
}
}
return 1;
}
find_multi_string('file name', 'string1','string2','string3','string4','string 5');
In the above script I'm passing the arguments in the function call. The script works.
But I'd like to know if there is a way to specify the file name and string1... string n in an array in the program itself and just make the function call.
find_multi_string();
That would be a mistake; always pass parameters to your subroutines and return values from them.
What you're describing is essentially using subroutines solely to subdivide and document your code. If you were to do that, it would be better to just remove the subroutine entirely and include a comment before the section of code.
Overall, your code looks good as is. You probably will want to use quotemeta though, and your logic can be simplified a little:
use strict;
use warnings;
use autodie;
sub find_multi_string {
my ($file, @strings) = @_;
# Load the file
my $data = do {
open my $fh, "<", $file;
local $/;
<$fh>
};
for my $string (@strings) {
if ($data !~ /\Q$string/) {
die "Cannot find $string in $file";
}
}
return 1;
}
find_multi_string('file name', 'string1','string2','string3','string4','string 5');
A few improvements of your original code:
use autodie
use 3-args open
as you want to check anywhere in the file, just load the file as a single string
if the strings to match are plain text with no regexp metacharacters, just use the index function
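The difference between an unquoted pattern, quotemeta (\Q...\E) and index can be sketched as follows. The sample text and search string are made up for the illustration; the point is that a search string containing a regexp metacharacter such as + behaves differently in each case.

```perl
use strict;
use warnings;

my $data   = "price: a+b\n";    # made-up sample text
my $needle = 'a+b';             # contains the regexp metacharacter "+"

# Unquoted, "+" is a quantifier: the pattern means "one or more a, then b".
my $raw = $data =~ /$needle/ ? 1 : 0;

# \Q...\E applies quotemeta, so every character is matched literally.
my $quoted = $data =~ /\Q$needle\E/ ? 1 : 0;

# index() is a plain substring search; no regexp is involved at all.
my $by_index = index($data, $needle) >= 0 ? 1 : 0;

print "raw=$raw quoted=$quoted index=$by_index\n";
```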
Your question is about passing the function arguments from your program.
I suspect that you are looking for @ARGV. See perlvar.
Here is the modified code:
use strict;
use warnings;
use autodie;
sub find_multi_string {
my ($file, @strings) = @_;
my $content = do {
open my $fh, '<', $file;
local $/;
<$fh>
};
foreach (@strings) {
die "Cannot find $_ in $file" unless index($content, $_) >= 0;
}
return 1;
}
find_multi_string(@ARGV);

Learning Higher-Order Perl: issue with iterator

I am studying the Higher-Order Perl book and have an issue with the iterators in Chapter 4.3.4.
The code:
main_script.pl
#!/perl
use strict;
use warnings;
use FindBin qw($Bin);
use lib $Bin;
use Iterator_Utils qw(:all);
use FlatDB;
my $db = FlatDB->new("$Bin/db.csv") or die "$!";
my $q = $db->query('STATE', 'NY');
while (my $rec = NEXTVAL($q) )
{
print $rec;
}
Iterator_Utils.pm
#!/perl
use strict;
use warnings;
package Iterator_Utils;
use Exporter 'import';;
our @EXPORT_OK = qw(NEXTVAL Iterator
append imap igrep
iterate_function filehandle_iterator list_iterator);
our %EXPORT_TAGS = ('all' => \@EXPORT_OK);
sub NEXTVAL { $_[0]->() }
sub Iterator (&) { return $_[0] }
FlatDB.pm
#!/perl
use strict;
use warnings;
package FlatDB;
my $FIELDSEP = qr/:/;
sub new
{
my $class = shift;
my $file = shift;
open my $fh, "<", $file or return;
chomp(my $schema = <$fh>);
my @field = split $FIELDSEP, $schema;
my %fieldnum = map { uc $field[$_] => $_ } (0..$#field);
bless
{
FH => $fh,
FIELDS => \@field,
FIELDNUM => \%fieldnum,
FIELDSEP => $FIELDSEP
} => $class;
}
use Fcntl ':seek';
sub query
{
my $self = shift;
my ($field, $value) = @_;
my $fieldnum = $self->{FIELDNUM}{uc $field};
return unless defined $fieldnum;
my $fh = $self->{FH};
seek $fh, 0, SEEK_SET;
<$fh>; # discard schema line
return Iterator
{
local $_;
while (<$fh>)
{
chomp;
my @fields = split $self->{FIELDSEP}, $_, -1;
my $fieldval = $fields[$fieldnum];
return $_ if $fieldval eq $value;
}
return;
};
}
db.csv
LASTNAME:FIRSTNAME:CITY:STATE:OWES
Adler:David:New York:NY:157.00
Ashton:Elaine:Boston:MA:0.00
Dominus:Mark:Philadelphia:PA:0.00
Orwant:Jon:Cambridge:MA:26.30
Schwern:Michael:New York:NY:149658.23
Wall:Larry:Mountain View:CA:-372.14
Just as in the book so far, right? However I do not get the output (the strings with Adler and Schwern should occur). The error message is:
Can't use string ("Adler:David:New York:NY:157.00") as a subroutine ref while
"strict refs" in use at N:/Perle/Learn/Iterators/Iterator_Utils.pm line 12, <$fh>
line 3.
What am I doing wrong?
Thanks in advance!
FlatDB calls Iterator, which is defined in Iterator_Utils, so it needs to import that function from Iterator_Utils. If you add
use Iterator_Utils qw(Iterator);
after package FlatDB, the program will work.
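The import matters because of the (&) prototype. When Iterator is visible at compile time, the block that follows it is compiled as an anonymous subroutine and handed over as a code reference. Without the import, Perl has no prototype to go by and does not build a code reference from the block, which is consistent with the "Can't use string ... as a subroutine ref" error. A minimal sketch, with both subs defined in one file rather than imported (an arrangement chosen only to keep the example self-contained):

```perl
use strict;
use warnings;

# Same definitions as in Iterator_Utils, placed inline for the example.
sub Iterator (&) { return $_[0] }
sub NEXTVAL      { $_[0]->() }

my $n  = 0;
my $it = Iterator { return $n++ };   # the block becomes a code ref

print ref($it), "\n";                # the iterator is a CODE reference
my $first  = NEXTVAL($it);
my $second = NEXTVAL($it);
print "$first $second\n";
```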
Thanks very much for finding this error. I will add this to the errata on the web site. If you would like to be credited by name, please email me your name.

How do I use a class property/variable as a print filehandle in Perl?

I want to do the same thing as
open MYFILE, ">", "data.txt";
print MYFILE "Bob\n";
but instead in class variable like
sub _init_tmp_db
{
my ($self) = @_;
open $$self{tmp_db_fh}, ">", "data.txt";
print $$self{tmp_db_fh} "Bob\n";
}
It gave me this error : 'String found where operator expected near "Bob\n"'
what should I do?
From the print manpage:
If you're storing handles in an array or hash, or in general whenever
you're using any expression more complex than a bareword handle or a
plain, unsubscripted scalar variable to retrieve it, you will have to
use a block returning the filehandle value instead.
You should be using:
print { $$self{tmp_db_fh} } "Bob\n";
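A self-contained sketch of the block form, next to the equivalent method call. It assumes an in-memory file ($buffer) instead of data.txt so that nothing is written to disk, and a plain hash reference standing in for the object.

```perl
use strict;
use warnings;
use IO::Handle;    # makes $fh->print available on older perls too

my $buffer = '';
my $self   = {};   # stand-in for the object

# Open a handle stored in a hash element, writing to an in-memory file.
open $self->{tmp_db_fh}, '>', \$buffer
    or die "Can't open in-memory file: $!";

# Block form: the block returns the filehandle to print to.
print { $self->{tmp_db_fh} } "Bob\n";

# Method form: no special syntax is needed for the handle expression.
$self->{tmp_db_fh}->print("Alice\n");

close $self->{tmp_db_fh} or die "Can't close: $!";
print $buffer;
```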
This code won't work under use strict. To fix it just use a my variable:
open my $fh, ">", "data.txt" or die $!;
$$self{tmp_db_fh} = $fh;
print { $$self{tmp_db_fh} } "Bob\n";
You should use the IO::File module instead.
use IO::File;
my $file = IO::File->new;
$file->open("> data.txt");
print_something($file);
sub print_something {
my ($file) = @_;
$file->print("hello world\n");
}
Or in your example function:
use IO::File;
# ...
sub _init_tmp_db
{
my ($self) = @_;
$self->{tmp_db_fh} = IO::File->new;
$self->{tmp_db_fh}->open("data.txt", ">");
$self->{tmp_db_fh}->print("Bob\n");
}
(Note: you can still use non -> based calls too, but I wrote the above
using the more traditional ->open() type calls.)
Filehandles can only be scalars.
Either $$self{tmp_db_fh} already holds an open filehandle (to data.txt), in which case this would work:
sub _init_tmp_db
{
my ($self) = @_;
my $filehandle = $$self{tmp_db_fh} ;
print $filehandle "Bob\n";
}
or you open the filehandle inside _init_tmp_db
sub _init_tmp_db
{
my ($self) = @_;
open my $filehandle , ">", "data.txt" or die "Cannot open data.txt" ;
print $filehandle "Bob\n";
}
But providing a string in $$self{tmp_db_fh} (like 'FILEHANDLE') won't work.
This is easily solved by creating a variable for a file handle:
sub _init_tmp_db {
my $self = shift;
my $fh;
open $fh, ">", "data.txt" or die "Cannot open data.txt: $!";
$self->{temp_db_fh} = $fh;
# Sometime later...
$fh = $self->{temp_db_fh};
print $fh "Bob\n";
}
This is an issue because of the way the print syntax is parsed and the early sloppiness of the syntax. The print statement really has two separate formats. In format #1, you simply pass it the stuff to print. In format #2, the first item may be a file handle, and the rest is the stuff you want to print to that file handle. If print can't easily determine that the first parameter is a file handle, it fails.
If you look at other languages, they'll use a parameter for passing the file handle, and maybe the stuff to print. Or in object oriented languages, they'll overload >> for the file handle parameter. They'll look something like this:
print "This is my statement", file=file_handle;
or
print "This is my statement" >> file_handle;
You might be able to munge the syntax to get away from using a variable. However, it doesn't make the program more efficient or more readable, and may simply make the program harder to maintain. So, just use a variable for the file handle.
You said class in your title. I assume that you are interested in writing a fully fledged object-oriented package to do this. Here's a quick example. Notice that in the write method I retrieve the file handle into a variable and use the variable in the print statement.
#! /usr/bin/env perl
#
use strict;
use warnings;
#######################################################
# MAIN PROGRAM
#
my $file = File->new;
$file->open("OUTPUT") or
die "Can't open 'OUTPUT' for writing\n";
$file->write("This is a test");
#
#######################################################
package File;
use Carp;
sub new {
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
sub open {
my $self = shift;
my $file = shift;
my $fh;
if (defined $file) {
$self->{FILE} = $file;
open ($fh, ">", $file) and $self->_fh($fh);
}
return $self->_fh;
}
sub _fh {
my $self = shift;
my $fh = shift;
if (defined $fh) {
$self->{FH} = $fh;
}
return $self->{FH};
}
sub write {
my $self = shift;
my $note = shift;
my $fh = $self->_fh;
print $fh $note . "\n";
return;
}