Round brackets enclosing private variables. Why used in this case? - perl

I am reading Learning Perl 6th edition, and the subroutines chapter has this code:
foreach (1..10) {
my($square) = $_ * $_; # private variable in this loop
print "$_ squared is $square.\n";
}
Now I understand that the list syntax, i.e. the brackets, is used to distinguish between list context and scalar context, as in:
my($num) = @_; # list context, same as ($num) = @_;
my $num = @_; # scalar context, same as $num = @_;
But in the foreach loop case I can't see how a list context is appropriate.
And I can change the code to be:
foreach (1..10) {
my $square = $_ * $_; # private variable in this loop
print "$_ squared is $square.\n";
}
And it works exactly the same. So why did the author use my($square) when a simple my $square could have been used instead?
Is there any difference in this case?

Certainly in this case, the brackets aren't necessary. They're not strictly wrong in the sense that they do do what the author intends. As with so much in Perl, there's more than one way to do it.
So there's the underlying question: why did the author choose to do this this way? I wondered at first whether it was the author's preferred style: perhaps he chose always to put his lists of new variables in brackets simply so that something like:
my ($count) = 4;
where the brackets aren't doing anything helpful, at least looked consistent with something like:
my ($min, $max) = (2, 3);
But looking at the whole book, I can't find a single example of this use of brackets for a single value other than the section you referenced. As one example of many, the m// in List Context section in Chapter 9 contains a variety of different uses of my with assignments, but does not use brackets with any single values.
I'm left with the conclusion that as the author introduced my in subroutines with my($m, $n); he tried to vary the syntax as little as possible the next time he used it, ending up with my($max_so_far) and then tried to explain scalar and list contexts, as you quoted above. I'm not sure this is terribly helpful.
TL;DR It's not necessary, although it's not actually wrong. Probably a good idea to avoid this style in your code.

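You're quite correct: it's redundant. It makes no difference in this case, because the right-hand side is a single value, so assigning it to a one-element list gives the same result as a plain scalar assignment. You're effectively writing a list-to-list assignment.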
E.g.
my ( $square ) = ( $_ * $_ );
That also produces the same result, so in this case it doesn't matter. Generally speaking, though, it is not good coding style.
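For contrast, here is a minimal sketch of a case where the brackets do change the result. The subroutine names first_arg and arg_count are made up for illustration; only the my($num) = @_ and my $num = @_ forms come from the question.

```perl
use strict;
use warnings;

# Hypothetical helpers: the only difference is the parentheses around $num.
sub first_arg { my ($num) = @_; return $num }    # list assignment: first element
sub arg_count { my $num  = @_; return $num }     # scalar assignment: element count

my $first = first_arg( 7, 8, 9 );
my $count = arg_count( 7, 8, 9 );
print "first argument: $first\n";     # first argument: 7
print "argument count: $count\n";     # argument count: 3

# With a single scalar on the right-hand side, both forms agree.
my ($square) = 5 * 5;
my $same     = 5 * 5;
print "$square $same\n";              # 25 25
```

The brackets only matter when the right-hand side can return something different in list and scalar context, which a multiplication cannot.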

Related

Topicalising a variable using "for" is apparently bad. Why?

So I answered a question on SO and got a lot of flak for it.
I have been using Perl for many years and use this topicalising quite a lot.
So let's start with some code. I am doing search and replace in these examples. The idea is to search for one and three from two strings and replace them.
$values = 'one two three four five';
$values2 = 'one 2 three four 5';
$values =~ s/one//g;
$values =~ s/three//g;
$values2 =~ s/one//g;
$values2 =~ s/three//g;
This code is simple and everyone accepts it.
I can also build an array or hash with a list of values to search and replace which is also acceptable.
However, when I write a script that topicalises $values and $values2 to cut down on the typing, it seems to be misunderstood.
Here is the code.
$values = 'one two three four five';
$values2 = 'one 2 three four 5';
for ( $values, $values2 ) {
s/one//g;
s/three//g;
}
The above code will topicalise the variables for the duration of the for block, but many programmers are against this. I want to understand why it is considered unacceptable.
There are several points to consider.
Your code performs multiple substitutions on a list of variables. You can do that without using $_:
for my $s ($values, $values2) {
$s =~ s/one//g;
$s =~ s/three//g;
}
Personally I think nothing is wrong with the above code.
The general problem with $_ is that it's not a local variable. E.g. if the body of your for loop calls a function (that calls a function ...) that modifies $_ without localizing it (e.g. by assigning to it or using a bare s/// or using while (<...>)), then it will overwrite the variables you're iterating over. With a my variable you're protected because other functions can't see your local variables.
That said, if the rest of your code doesn't have this bug (scribbling over $_ without localizing it), $_ will work fine here.
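A small sketch of the failure mode described above, assuming a hypothetical helper named careless that assigns to $_ without localizing it. Neither helper is from the original answer.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
sub careless { $_ = 'clobbered'; return }           # writes to the global $_
sub careful  { local $_ = 'scratch'; return }       # restores $_ on exit

my @with_topic = ( 'one', 'two' );
for (@with_topic) { careless() }    # $_ aliases each element, so both get overwritten
print "@with_topic\n";              # clobbered clobbered

my @with_lexical = ( 'one', 'two' );
for my $s (@with_lexical) { careless() }    # careless() cannot see $s
print "@with_lexical\n";                    # one two

my @with_local = ( 'one', 'two' );
for (@with_local) { careful() }     # local protects the aliased elements
print "@with_local\n";              # one two
```

Either a lexical loop variable in the caller or local $_ in the callee is enough to avoid the problem; the bug needs both sides to rely on the bare global.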
However, the code in your answer people originally complained about is different:
for ($brackets) {
s/\\left/(/g;
s/\\right/)/g;
}
Here you're not trying to perform the same substitutions on many variables; you're just saving yourself some typing by getting rid of the $brackets =~ part:
$brackets =~ s/\\left/(/g;
$brackets =~ s/\\right/)/g;
Using an explicit loop variable wouldn't be a solution because you'd still have to type out $foo =~ on every line.
This is more a matter of taste. You're only using for for its aliasing effect, not to loop over multiple values. Personally I'm still OK with this.
perldoc perlsyn has this
The foreach is the non-experimental way to set a topicalizer.
The OP's construct is a perfectly valid way of writing Perl code. The only provisos I have regarding their earlier answer are
Unlike the example here, only two operations were being applied to a single variable. That is only marginally briefer than simply writing two substitutions and I wouldn't bother here, although I may consider
s/one//g, s/three//g for $value;
Other than the topicaliser, the answer is identical to another one already posted. I don't believe this makes it sufficiently different to warrant another post

Exchange hash values without temporary variable

I have two hash values a couple of levels deep in a data structure that I would like to exchange and then switch back later.
$hashref->{$irrelevant}{$key1} and $hashref->{$irrelevant}{$key2}
Since they are such long names, ($a, $b) = ($b, $a) would be way too long for a single line of code.
Is there a way to do this elegantly, or am I stuck taking up three lines by exchanging with a temporary variable?
You people who hide "irrelevant" data meanings aren't doing anyone any favours. We still have to write a solution, but it has to be in abstract terms that make no sense either to you or me!
The neatest way I can think of is with a pair of hash slices
my $irrelevant_href = $hashref->{$irrelevant};
@{$irrelevant_href}{$key1, $key2} = @{$irrelevant_href}{$key2, $key1};
Create a sub to make it clear what the long line full of symbols is doing.
sub swap { ($_[0], $_[1]) = ($_[1], $_[0]) }
And it also makes the line shorter.
swap($hashref->{$irrelevant}{$key1}, $hashref->{$irrelevant}{$key2});
You could even use
swap(@{ $hashref->{$irrelevant} }{ $key1, $key2 });
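Here is a runnable sketch of both approaches. The swap sub works because the elements of @_ are aliases to the caller's arguments. The hash contents and the key names settings, primary and secondary are invented stand-ins for the question's "irrelevant" data.

```perl
use strict;
use warnings;

# Assigning to elements of @_ changes the caller's data, because
# @_ holds aliases rather than copies.
sub swap { ( $_[0], $_[1] ) = ( $_[1], $_[0] ) }

# Hypothetical stand-in for the nested structure in the question.
my $hashref = { settings => { primary => 'red', secondary => 'blue' } };
my $inner   = $hashref->{settings};

# A hash slice passes two aliased hash elements to the sub.
swap( @{$inner}{ 'primary', 'secondary' } );
my $after_swap = "$inner->{primary} $inner->{secondary}";
print "$after_swap\n";    # blue red

# Swap back with a plain slice assignment, no sub needed.
@{$inner}{ 'primary', 'secondary' } = @{$inner}{ 'secondary', 'primary' };
print "$inner->{primary} $inner->{secondary}\n";    # red blue
```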
There are far graver sins in coding than declaring a lexical temp variable, but since you're asking, let me throw an idea at you.
If you'll be doing any significant work with $hashref->{$irrelevant}, perhaps you should grab a copy of it specifically, to both make your code briefer for a maintenance programmer & cut out a level of dereferencing with every use.
For example:
# capture inner reference 'cause we'll be using it a lot anyway...
my $ir_h_ref = $hashref->{$irrelevant};
#stuff here
# Do swap with shorter reference chain
( $ir_h_ref->{$key1}, $ir_h_ref->{$key2} ) =
( $ir_h_ref->{$key2}, $ir_h_ref->{$key1} );
Now this new variable is probably not worth it just for the sake of the switch, but if you'll be doing much more with that hash in the same code block, it just may become attractive.

Perl operators that modify inputs in-place

I recently took a Perl test and one of the questions was to find all the Perl operations that can be used to modify their inputs in-place. The options were
sort
map
do
grep
eval
I don't think any of these can modify the inputs in-place. Am I missing anything here or is the question wrong?
Try this:
my @array = qw(1 2 3 4);
print "@array\n";
my @new_array = map ++$_, @array;
print "@new_array\n";
print "@array\n"; # oops, we modified this in-place
grep is similar. For sort, the $a and $b variables are aliases back to the original array, so can also be used to modify it. The result is somewhat unpredictable, depending on what sorting algorithm Perl is using (which has historically changed in different versions of Perl, though hasn't changed in a while).
my @arr = qw(1 2 3 4 5);
my @new = sort { ++$a } @arr;
print "@arr\n";
do and eval can take an arbitrary code block, so can obviously modify any non-readonly variable, though it's not clear whether that counts as modifying inputs in place. Slade's example using the stringy form of eval should certainly count though.
I'm assuming the question is testing to see if the student knows to properly use the return values of sort, map, and so on instead of using them in void context and expecting side effects. It's totally possible to modify the parameters given, though.
map and grep alias $_ to each element, so modifying $_ will change the values of the variables in the list passed to it (assuming they're not constants or literals).
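A short sketch of the same aliasing with grep, plus one non-destructive alternative. The word lists are made up, and the /r flag is an addition not mentioned above (it needs Perl 5.14 or later).

```perl
use strict;
use warnings;

my @words = qw(apple banana cherry);

# s///g returns the number of substitutions made, so grep keeps only the
# elements that changed. The change itself lands in @words via the $_ alias.
my @changed = grep { s/a/A/g } @words;
print "@changed\n";    # Apple bAnAnA
print "@words\n";      # Apple bAnAnA cherry

# With /r the substitution returns a modified copy and leaves $_ alone.
my @fruits = qw(apple banana cherry);
my @upper  = map { s/a/A/gr } @fruits;
print "@upper\n";      # Apple bAnAnA cherry
print "@fruits\n";     # apple banana cherry
```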
eval EXPR and do EXPR can do anything, more or less, so there's nothing stopping you from doing something like:
my $code = q($code = 'modified');
eval $code;
say $code;
The arguments to do BLOCK and eval BLOCK are always a literal block of code, which aren't valid lvalues in any way I know of.
sort has a special optimization when called like @array = sort { $a <=> $b } @array;. If you look at the opcodes generated by this with B::Concise, you'll see something like:
9 <@> sort lK/INPLACE,NUM
But for a question about the language semantics, an implementation detail is irrelevant.

A better variable naming system?

I'm a newbie to programming. The task is to extract particular data from a string, and I chose to write the code as follows:
while ($line =<IN>) {
chomp $line;
@tmp=(split /\t/, $line);
next if ($tmp[0] !~ /ch/);
@tgt1=@tmp[8..11];
@tgt2=@tmp[12..14];
@tgt3=@tmp[15..17];
@tgt4=@tmp[18..21];
foreach (1..4) {
print @tgt($_), "\n";
}
I thought @tgt($_) would be interpreted as @tgt1, @tgt2, @tgt3, @tgt4, but I still get the error message that @tgt is a global symbol (@tgt1, @tgt2, @tgt3 and @tgt4 have been declared).
Q1. Did I misunderstand the foreach loop?
Q2. Why couldn't perl see @tgt($_) as @tgt1, @tgt2, etc.?
Q3. From experience, this is probably a bad way to name variables. What would be a preferred way to name variables that have similar features?
Q2. Why couldn't perl see @tgt($_) as @tgt1, @tgt2, etc.?
Q3. From experience, this is probably a bad way to name variables. What would be a preferred way to name variables that have similar features?
I'll answer both together.
@tgt($_) does NOT mean what you hope it means
First off, it's invalid syntax (you can't use () after an array name; the perl interpreter will produce a compile error).
What you're trying to do is access distinct variables by accessing a variable via an expression resulting in its name (aka symbolic references). This IS possible to do; but is typically a bad idea and poor-style Perl (as in, you CAN but you SHOULD NOT do it, without a very very good reason).
To access the array numbered $_ the way you tried, you would use the @{"tgt$_"} syntax. But I repeat: Do Not Do That, even if you can.
A correct idiomatic solution: use an array of arrayrefs, with your 1-4 (or rather 0-3) indexing the outer array:
# Old bad code: @tgt1=@tmp[8..11];
# New correct code:
$tgt[0]=[ @tmp[8..11] ]; # [] creates an array reference from a list.
# etc... repeat 4 times - you can even do it in a smart loop later.
What this does is store a reference to a new array, holding a copy of the slice, in the zeroth element of a single @tgt array.
At the end, the @tgt array has 4 elements, each a reference to an array containing one of the slices.
Q1. Did I misunderstand foreach loop?
Your foreach loop (as opposed to its contents - see above) was correct, with one style caveat - again, while you CAN use a default $_ variable, you should almost never use it, instead always use named variables for readability.
You print the abovementioned array of arrayrefs as follows (ask separately if any of the syntax is unclear - this is a mid-level data structure handling, not for beginners):
foreach my $index (0..3) {
print join(",", @{ $tgt[$index] }) . "\n";
}
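Putting the pieces together, one possible form of the "smart loop" mentioned above looks like this. The sample @tmp contents and the @ranges table are assumptions made up for the sketch; only the index ranges come from the question.

```perl
use strict;
use warnings;

# Hypothetical input: 22 dummy fields standing in for one split line.
my @tmp = map { "f$_" } 0 .. 21;

# Start and end index of each group, taken from the question's slices.
my @ranges = ( [ 8, 11 ], [ 12, 14 ], [ 15, 17 ], [ 18, 21 ] );

# Build the array of arrayrefs: one anonymous array per slice.
my @tgt = map { [ @tmp[ $_->[0] .. $_->[1] ] ] } @ranges;

# Iterate over the groups directly instead of over index numbers.
for my $group (@tgt) {
    print join( ",", @{$group} ), "\n";
}
```

Adding a fifth group is then a one-line change to @ranges rather than a new variable name.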

In Perl, why does the `while(<HANDLE>) {...}` construct not localize `$_`?

What was the design (or technical) reason for Perl not automatically localizing $_ with the following syntax:
while (<HANDLE>) {...}
Which gets rewritten as:
while (defined( $_ = <HANDLE> )) {...}
All of the other constructs that implicitly write to $_ do so in a localized manner (for/foreach, map, grep), but with while, you must explicitly localize the variable:
local $_;
while (<HANDLE>) {...}
My guess is that it has something to do with using Perl in "Super-AWK" mode with command line switches, but that might be wrong.
So if anyone knows (or better yet was involved in the language design discussion), could you share with us the reasoning behind this behavior? More specifically, why was allowing the value of $_ to persist outside of the loop deemed important, despite the bugs it can cause (which I tend to see all over the place on SO and in other Perl code)?
In case it is not clear from the above, the reason why $_ must be localized with while is shown in this example:
sub read_handle {
while (<HANDLE>) { ... }
}
for (1 .. 10) {
print "$_: \n"; # works, prints a number from 1 .. 10
read_handle;
print "done with $_\n"; # does not work, prints the last line read from
# HANDLE or undef if the file was finished
}
From the thread on perlmonks.org:
There is a difference between foreach
and while because they are two totally
different things. foreach always
assigns to a variable when looping
over a list, while while normally
doesn't. It's just that while (<>) is
an exception and only when there's a
single diamond operator there's an
implicit assignment to $_.
And also:
One possible reason for why while(<>)
does not implicitly localize $_ as
part of its magic is that sometimes
you want to access the last value of
$_ outside the loop.
Quite simply, while never localises. No variable is associated with a while construct, so it doesn't even have anything to localise.
If you change some variable in the while loop expression or in a while loop body, it's your responsibility to adequately scope it.
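One common way to take that responsibility is to read into a lexical variable instead of $_. The sketch below assumes an in-memory filehandle as a stand-in for a real file, and both subroutine names are hypothetical.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";    # stand-in for a file's content

sub count_lines_clobbering {
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my $count = 0;
    while (<$fh>) { $count++ }    # writes each line, and finally undef, to the global $_
    return $count;
}

sub count_lines_safely {
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my $count = 0;
    while ( my $line = <$fh> ) { $count++ }    # lexical variable, $_ untouched
    return $count;
}

my @clobbered = ( 1, 2, 3 );
for (@clobbered) { count_lines_clobbering() }    # each aliased element ends up undef

my @intact = ( 1, 2, 3 );
for (@intact) { count_lines_safely() }

printf "defined after clobbering: %d\n", scalar( grep { defined } @clobbered );    # 0
printf "defined after safe read:  %d\n", scalar( grep { defined } @intact );       # 3
```

Writing local $_; before the while (<$fh>) loop in the first sub would protect the caller equally well; the lexical form just avoids touching the global at all.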
Speculation: Because for and foreach are iterators and loop over values, while while operates on a condition. In the case of while (<FH>) the condition is that data was read from the file. The <FH> is what writes to $_, not the while. The implicit defined() test is just an affordance to prevent naive code from terminating the loop on a read of false value.
For other forms of while loops, e.g. while (/foo/) you wouldn't want to localize $_.
While I agree that it would be nice if while (<FH>) localized $_, it would have to be a very special case, which could cause other problems with recognizing when to trigger it and when not to, much like the rules for <EXPR> distinguishing being a handle read or a call to glob.
As a side note, we only write while(<$fh>) because Perl doesn't have real iterators. If Perl had proper iterators, <$fh> would return one. for would use that to iterate a line at a time rather than slurping the whole file into an array. There would be no need for while(<$fh>) or the special cases associated with it.