How to do `defined` with a hash slice - perl

I'm trying to learn Perl better, and learn hash slices.
Instead of 3 different if (defined statements, I'm trying to tidy the code to make it more maintainable and readable, but have come across the following conundrum:
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use feature 'say';
use autodie ':all';
use Carp 'confess';
use DDP; # a.k.a. Data::Printer
use JSON 'decode_json';
my $hashref;
$hashref->{Jane} = decode_json('{"sex":"Female","Mortality_Status":"Alive", "latest_date":"2020-11-26","Hospitalized":"no","Risk_Status":"NA"}');
p $hashref; # pretty print the data
my #needed_terms = qw(age BMI sex);
if (defined #{ $hashref->{Jane} }{#needed_terms}) {
say 'all terms are defined.'; # this is what it says, which is WRONG!!!
} else {
say 'some terms are missing.'; # Jane is missing BMI and age, so the script should print here
}
I've read How to do sum of hash reference slice? and https://perlmonks.org/?node=References+quick+reference to no avail.
This person Jane is missing both age and BMI information, so the if (defined statement should say that some terms are missing, but is instead passing.
I get the same error whether I use #{ $hashref->{Jane} }{#needed_terms} or %{ $hashref->{Jane} }{#needed_terms}
I've also thought that maybe defined is returning how many terms of the slice are defined, but that isn't true.
How can I set if (defined statement on a hash slice properly?

This is a good place to use all from List::Util:
use List::Util qw(all);
if (all { exists $hashref->{Jane}{$_} } #needed_terms) {
say 'all terms are defined.';
} else {
say 'some terms are missing.'; # Jane is missing BMI and age, so the script should print here
}
It loops through all the needed terms and checks to see if each exists as key to the Jane hash.
One of my favorite docs for Perl Data Structures is perldoc perldsc. It is more of a step-by-step tutorial than References quick reference.

Related

how to substitute actual value of variable in a file which is present in another file in perl

I have two file.Variables are declared in first file($one=1;) , in second file variable name is given ($one). I want to substitute this variable name with actual value and print the output.
File1.txt
variables are gieven here
$one=1;
$name="gorge";
$animal="cat";
File2.txt
This number is x=$one/or less then two
his name is $name
It is a $animal/ kind of animal.
Expected output
This number is x=1/or less then two
his name is gorge
It is a cat/ kind of animal.
I tried with this code:
open (data1,"</home/file1");
open (data2,"</home/file2");
while (<data1>){
while (<data2>){
print $_;
}
}
close data2;
close data1;
Thank You.
You need a templating system
One of the most popular ones is Template Toolkit
For example, with this template file
File2.template
This number is x=[% one %]/or less then two
his name is [% name %]
It is a [% animal %]/kind of animal.
And this Perl code
main.pl
use strict;
use warnings 'all';
use Template;
my $tt = Template->new;
my $vars = {
one => 1,
name => 'gorge',
animal => 'cat',
};
$tt->process('File2.template', $vars);
The result is this
output
This number is x=1/or less then two
his name is gorge
It is a cat/kind of animal.
I think you're fishing for something that is a horribly bad idea.
So I'll suggest a different approach, of building regular expressions to replace the text. In doing this though - the use of $one is going to be a bit confusing, because that means a scalar variable in perl, and this is "just" going to be a pattern match.
So if you can change that - you should:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my %replace = ( 'one' => '1',
'name' => 'gorge',
'animal' => 'cat' );
my $search = join ( '|', keys %replace );
$search = qr/\$($search)/;
print Dumper \%replace;
print $search;
while ( <DATA> ) {
s/$search/$replace{$1}/g;
print;
}
__DATA__
This number is x=$one/or less then two
his name is $name
It is a $animal/ kind of animal.
You can build your replace patterns something like this:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my %replace = map { m/\$(\w+)=\"?([^;\"]+)/ } <DATA>;
print Dumper \%replace;
__DATA__
$one=1;
$name="gorge";
$animal="cat";
This gives you:
$VAR1 = {
'name' => 'gorge',
'one' => '1',
'animal' => 'cat'
};
If you're going to be any kind of Perl programmer, then you'll need to read the Perl FAQ.
In there, you'll find an answer to your question.
How can I expand variables in a text string?
If you read that answer, you'll end up with code very similar to what Sobrique gave you. However, in order to get to that code, you'll need to first pass the first paragraph in the answer which says:
If you can avoid it, don't, or if you can use a templating system, such as Text::Template or Template Toolkit, do that instead.
That's really good advice. You should follow it.

How do you treat hashes in arrays properly?

I've got an array of hashes:
my #questions = (
{"Why do you study here?" => "bla"},
{"What are your hobbies?" => "blabla"});
And I try to loop through it:
foreach (#questions) {
my $key = (keys $_)[0];
$content .= "\\section{$key}\n\n$_{$key}\n\n";
}
giving me
Use of uninitialized value in concatenation (.) or string at
convert.pl line 44.
Where does the error come from?
$_{$key} looks up $key in the hash variable %_. The sigil $ at the beginning indicates that the type of the result is a scalar. It's the syntactic construct VAR{KEY} that determines that VAR must be a hash. Although $_ and %_ use the same symbol as a name, the different sigils make them unrelated variables.
You need to dereference the hash reference $_ into the underlying hash. The syntax for this is $_->{$key} or ${$_}{$key}.
See the reference tutorial for a more general presentation of the topic.
Gilles already explained how to use your current data structure, but I would recommend that you use a different data structure altogether: a simple hash.
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my %answers = (
"Why do you study here?" => "bla",
"What are your hobbies?" => "blabla"
);
while (my ($question, $answer) = each %answers) {
say "Question: $question";
say "Answer: $answer";
}
Output:
Question: Why do you study here?
Answer: bla
Question: What are your hobbies?
Answer: blabla
I find this easier to work with than an array of hashes, each of which only contains a single key/value pair.
If you want to iterate through the hash in a certain (non-sorted) order, there are a couple of options. The simplistic solution is to maintain an array of keys in the order you want to access them:
# In the order you want to access them
my #questions = ("What are your hobbies?", "Why do you study here?");
my %answers;
#answers{#questions} = ("blabla", "bla");
foreach my $question (#questions) {
say "Question: $question";
say "Answer: $answers{$question}";
}
Output:
Question: What are your hobbies?
Answer: blabla
Question: Why do you study here?
Answer: bla
Another option is to use Tie::IxHash (or the faster XS module Tie::Hash::Indexed) to access keys in insertion order:
use Tie::IxHash;
tie my %answers, "Tie::IxHash";
%answers = (
"Why do you study here?" => "bla",
"What are your hobbies?" => "blabla"
);
while (my ($question, $answer) = each %answers) {
say "Question: $question";
say "Answer: $answer";
}
Output:
Question: Why do you study here?
Answer: bla
Question: What are your hobbies?
Answer: blabla
The elements of #questions are references to hash, not hashes. Therefore, you should use them like this:
foreach (#questions) {
my $key = (keys %$_)[0];
print "\\section{$key}\n\n$_->{$key}\n\n";
}
See perlref for how to create and use reference.

Using # and $ with the same variable name in Perl

I am declaring the same variable name with # and $:
#ask=(1..9);
$ask="insanity";
print ("Array #ask\n");
print ("Scalar $ask\n");
Without using use strict I am getting output correctly but when I am using use strict it gives me a compilation error.
Do these two variables refer to two different memory locations or is it the same variable?
You've got two variables:
#ask
$ask
You could have %ask (a hash) too if you wanted. Then you'd write:
print $ask, $ask[0], $ask{0};
to reference the scalar, the array and the hash.
Generally, you should avoid this treatment, but the variables are all quite distinct and Perl won't be confused.
The only reason use strict; is complaining is because you don't prefix your variables with my:
#!/usr/bin/env perl
use strict;
use warnings;
my #ask = (1..9);
my $ask = "insanity";
my %ask = ( 0 => 'infinity', infinity => 0 );
print "Array #ask\n";
print "Scalar $ask\n";
print "Hash $ask{0}\n";
with use strict; you need to declare your variables first before using it.
For example:
use strict;
my #ask=(1..9);
my $ask="insanity";
print ("Array #ask\n");
print ("Scalar $ask\n");
#ask and $ask are different variables — as is %ask — and it is not an error to do this. It is however poor style.
Because the sigil changes when you use them, such as when you use $ask[1] to get the second element of #ask, the code becomes harder to read and use strict will also not be able to tell if you've gotten confused. Thus it's a good idea to use names that differ in more than the sigil unless you know what you're doing. So you could use e.g. #asks and $ask.
The error you are getting with strict is not due to variable names. It is because you are not declaring the variables (using one of my, our, local, or state. Nor are you using the vars pragma.
Short answer: Stick a my in front of each variable, and you'll be strict-compliant.
For package variables, you can examine entries in the symbol table. $ask and #ask are separate entities:
#!/usr/bin/env perl
use Devel::Symdump;
use YAML;
#ask=(1..9);
$ask="insanity";
my $st = Devel::Symdump->new('main');
print Dump [ $st->$_ ] for qw(
scalars
arrays
);
Among other things, this code will output:
--
…
- main::ask
…
---
…
- main::ask
…
Being able to use the same name can help when, say, you have an array of fish and you are doing something with each fish in the array:
for my $fish (#fish) {
go($fish);
}
Normally, it is more expressive to use the plural form for arrays and hashes, the singular form for elements of an array, and something based on the singular form for keys in a hash:
#!/usr/bin/env perl
use strict;
use warnings;
my #ships = ('Titanic', 'Costa Concordia');
my %ships = (
'Titanic' => {
maiden_voyage => '10 April 1912',
capacity => 3_327,
},
'Costa Concordia' => {
maiden_voyage => '14 July 2006',
capacity => 4_880,
},
);
for my $ship (#ships) {
print "$ship\n";
}
while (my ($ship_name, $ship_details) = each %ships) {
print "$ship_name capacity: $ship_details->{capacity}\n";
}

Lexically importing useful functions in a big script

Sometimes I need a useful utility function, like List::Util::max in the middle of a large program that does lots of stuff. So if I do
use List::Util 'max';
At the top of my program, I'm stuck with that symbol, polluting my whole namespace, even though I only need it in one subroutine.
So I've been thinking of trying a different pattern, instead:
use List::Util ();
# a whole bunch of stuff later...
sub blah {
List::Util->import( 'max' );
$blah = max #foobar;
...
}
There are two problems with this, though. For one, it doesn't automatically unimport at the end of the block (drat.) I would have to undo everything with an unimport.
The other problem is that apparently prototypes don't get applied correctly, so I have to say max( #foobar ) instead of the prettier parenthesisless version.
Is there an easy way to temporarily import symbols for a block, which would automagically make them go away at the end of the block, and which would also handle prototypes correctly?
Just do this, it's much better and cleaner:
package Foo;
use strict; use warnings;
use List::Util 'max';
use namespace::autoclean;
# your method definitions here...
namespace::autoclean will "unimport" the symbol after the package's compilation cycle is done. The call to it in your method will still work, but you have no namespace pollution (the *Foo::max symbol is removed) and calling $obj->max() will fail.
Alternatively, you might want to take a look at Lexical::Import (I know nothing about it; an irc birdie mentioned it).
If you only use max in one subroutine, I wouldn't import it into the namespace at all. My solution is to
use List::Util;
sub blah {
print List::Util::max(#list);
}
You can localize a symbol table entry:
use List::Util ();
#y = qw(1 3 5 -9 4);
sub max { # return maximum *absolute value* of list
my $max = abs(shift);
$max<abs($_) && ($max=$abs($_)) for #_;
return $max;
}
sub max2 {
local *max = *List::Util::max;
return max(#_);
}
print "My max: ", max(#y), "\n"; # ==> 9
print "List::Util::max ", max2(#y), "\n"; # ==> 5
perlfunc implies that no MODULE should do what you want:
sub blah {
use List::Util qw(max);
say max #foobar;
no List::Util;
}
but that doesn't work -- at least not for List::Util. I believe that it would need to define an unimport method. Even then, I'm not sure if you could have a bare max in your module call different definitions.

In Perl, how can I concisely check if a $variable is defined and contains a non zero length string?

I currently use the following Perl to check if a variable is defined and contains text. I have to check defined first to avoid an 'uninitialized value' warning:
if (defined $name && length $name > 0) {
# do something with $name
}
Is there a better (presumably more concise) way to write this?
You often see the check for definedness so you don't have to deal with the warning for using an undef value (and in Perl 5.10 it tells you the offending variable):
Use of uninitialized value $name in ...
So, to get around this warning, people come up with all sorts of code, and that code starts to look like an important part of the solution rather than the bubble gum and duct tape that it is. Sometimes, it's better to show what you are doing by explicitly turning off the warning that you are trying to avoid:
{
no warnings 'uninitialized';
if( length $name ) {
...
}
}
In other cases, using some sort of null value instead of the actual data gets around the problem. With Perl 5.10's defined-or operator, give length an explicit empty string (defined, and gives back zero length) instead of the variable that would trigger the warning:
use 5.010;
if( length( $name // '' ) ) {
...
}
In Perl 5.12, it's a bit easier because length on an undefined value also returns undefined. That might seem like a bit of silliness, but that pleases the mathematician I might have wanted to be. That doesn't issue a warning, which is the reason this question exists.
use 5.012;
use warnings;
my $name;
if( length $name ) { # no warning
...
}
As mobrule indicates, you could use the following instead for a small savings:
if (defined $name && $name ne '') {
# do something with $name
}
You could ditch the defined check and get something even shorter, e.g.:
if ($name ne '') {
# do something with $name
}
But in the case where $name is not defined, although the logic flow will work just as intended, if you are using warnings (and you should be), then you'll get the following admonishment:
Use of uninitialized value in string ne
So, if there's a chance that $name might not be defined, you really do need to check for definedness first and foremost in order to avoid that warning. As Sinan Ünür points out, you can use Scalar::MoreUtils to get code that does exactly that (checks for definedness, then checks for zero length) out of the box, via the empty() method:
use Scalar::MoreUtils qw(empty);
if(not empty($name)) {
# do something with $name
}
First, since length always returns a non-negative number,
if ( length $name )
and
if ( length $name > 0 )
are equivalent.
If you are OK with replacing an undefined value with an empty string, you can use Perl 5.10's //= operator which assigns the RHS to the LHS unless the LHS is defined:
#!/usr/bin/perl
use feature qw( say );
use strict; use warnings;
my $name;
say 'nonempty' if length($name //= '');
say "'$name'";
Note the absence of warnings about an uninitialized variable as $name is assigned the empty string if it is undefined.
However, if you do not want to depend on 5.10 being installed, use the functions provided by Scalar::MoreUtils. For example, the above can be written as:
#!/usr/bin/perl
use strict; use warnings;
use Scalar::MoreUtils qw( define );
my $name;
print "nonempty\n" if length($name = define $name);
print "'$name'\n";
If you don't want to clobber $name, use default.
In cases where I don't care whether the variable is undef or equal to '', I usually summarize it as:
$name = "" unless defined $name;
if($name ne '') {
# do something with $name
}
You could say
$name ne ""
instead of
length $name > 0
It isn't always possible to do repetitive things in a simple and elegant way.
Just do what you always do when you have common code that gets replicated across many projects:
Search CPAN, someone may have already the code for you. For this issue I found Scalar::MoreUtils.
If you don't find something you like on CPAN, make a module and put the code in a subroutine:
package My::String::Util;
use strict;
use warnings;
our #ISA = qw( Exporter );
our #EXPORT = ();
our #EXPORT_OK = qw( is_nonempty);
use Carp qw(croak);
sub is_nonempty ($) {
croak "is_nonempty() requires an argument"
unless #_ == 1;
no warnings 'uninitialized';
return( defined $_[0] and length $_[0] != 0 );
}
1;
=head1 BOILERPLATE POD
blah blah blah
=head3 is_nonempty
Returns true if the argument is defined and has non-zero length.
More boilerplate POD.
=cut
Then in your code call it:
use My::String::Util qw( is_nonempty );
if ( is_nonempty $name ) {
# do something with $name
}
Or if you object to prototypes and don't object to the extra parens, skip the prototype in the module, and call it like: is_nonempty($name).
The excellent library Type::Tiny provides an framework with which to build type-checking into your Perl code. What I show here is only the thinnest tip of the iceberg and is using Type::Tiny in the most simplistic and manual way.
Be sure to check out the Type::Tiny::Manual for more information.
use Types::Common::String qw< NonEmptyStr >;
if ( NonEmptyStr->check($name) ) {
# Do something here.
}
NonEmptyStr->($name); # Throw an exception if validation fails
How about
if (length ($name || '')) {
# do something with $name
}
This isn't quite equivalent to your original version, as it will also return false if $name is the numeric value 0 or the string '0', but will behave the same in all other cases.
In perl 5.10 (or later), the appropriate approach would be to use the defined-or operator instead:
use feature ':5.10';
if (length ($name // '')) {
# do something with $name
}
This will decide what to get the length of based on whether $name is defined, rather than whether it's true, so 0/'0' will handle those cases correctly, but it requires a more recent version of perl than many people have available.
if ($name )
{
#since undef and '' both evaluate to false
#this should work only when string is defined and non-empty...
#unless you're expecting someting like $name="0" which is false.
#notice though that $name="00" is not false
}