Perl operators that modify inputs in-place - perl

I recently took a Perl test and one of the questions was to find all the Perl operations that can be used to modify their inputs in-place. The options were
sort
map
do
grep
eval
I don't think any of these can modify the inputs in-place. Am I missing anything here or is the question wrong?

Try this:
my @array = qw(1 2 3 4);
print "@array\n";
my @new_array = map ++$_, @array;
print "@new_array\n";
print "@array\n"; # oops, we modified this in-place
grep is similar. For sort, the $a and $b variables are aliases back to the original array, so can also be used to modify it. The result is somewhat unpredictable, depending on what sorting algorithm Perl is using (which has historically changed in different versions of Perl, though hasn't changed in a while).
my @arr = qw(1 2 3 4 5);
my @new = sort { ++$a } @arr;
print "@arr\n";
do and eval can take an arbitrary code block, so can obviously modify any non-readonly variable, though it's not clear whether that counts as modifying inputs in place. Slade's example using the stringy form of eval should certainly count though.

I'm assuming the question is testing to see if the student knows to properly use the return values of sort, map, and so on instead of using them in void context and expecting side effects. It's totally possible to modify the parameters given, though.
map and grep alias $_ to each element, so modifying $_ will change the values of the variables in the list passed to it (assuming they're not constants or literals).
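The aliasing is easy to trigger by accident in grep. A minimal sketch, with a made-up @words list, where a chomp inside the filter block also strips the newlines from the original array:

```perl
use strict;
use warnings;

my @words = ("apple\n", "kiwi\n", "banana\n");

# chomp works on $_, and $_ is an alias for each element of @words,
# so the newlines are stripped from the original array as a side effect.
my @long = grep { chomp; length($_) > 4 } @words;

print join(",", @words), "\n";   # apple,kiwi,banana
print join(",", @long), "\n";    # apple,banana
```

If the side effect is not wanted, copying $_ into a lexical inside the block and working on the copy avoids it.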
eval EXPR and do EXPR can do anything, more or less, so there's nothing stopping you from doing something like:
use feature 'say';
my $code = q($code = 'modified');
eval $code;
say $code;
The arguments to do BLOCK and eval BLOCK are always a literal block of code, which aren't valid lvalues in any way I know of.
sort has a special optimization when called like @array = sort { $a <=> $b } @array;. If you look at the opcodes generated by this with B::Concise, you'll see something like:
9 <@> sort lK/INPLACE,NUM
But for a question about the language semantics, an implementation detail is irrelevant.

Related

How to avoid input modification in PDL subroutines

I would like to avoid the assignment operator .= to modify the user input from a subroutine.
One way to avoid this is to perform a copy of the input inside the subroutine. Is this the best way to proceed? Are there other solutions?
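use strict;
use PDL;
my $a = pdl(1);
f_0($a); print "$a\n";
f_1($a); print "$a\n";
sub f_0 {
    my ($input) = @_;
    my $x = $input->copy;   # work on a copy; the caller's piddle is untouched
    $x .= 0;
}
sub f_1 {
    my ($input) = @_;
    $input .= 0;            # assigns into the caller's piddle in place
}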
In my case (perl 5.22.1), running the script above prints 1 and then 0 on separate lines. f_0 does not modify the user input in-place, while f_1 does.
According to the FAQ 6.17 What happens when I have several references to the same PDL object in different variables :
Piddles behave like Perl references in many respects. So when you say
$a = pdl [0,1,2,3]; $b = $a;
then both $b and $a point to the same
object, e.g. then saying
$b++;
will not create a copy of the original piddle but just
increment in place
[...]
It is important to keep the "reference nature" of piddles in mind when
passing piddles into subroutines. If you modify the input piddles you
modify the original argument, not a copy of it. This is different from
some other array processing languages but makes for very efficient
passing of piddles between subroutines. If you do not want to modify
the original argument but rather a copy of it just create a copy
explicitly...
So yes, to avoid modification of the original, create a copy as you did:
my $x = $input->copy;
or alternatively:
my $x = pdl( $input );

Round brackets enclosing private variables. Why used in this case?

I am reading Learning Perl 6th edition, and the subroutines chapter has this code:
foreach (1..10) {
my($square) = $_ * $_; # private variable in this loop
print "$_ squared is $square.\n";
}
Now I understand that the list syntax, ie the brackets, are used to distinguish between list context and scalar context as in:
my($num) = @_; # list context, same as ($num) = @_;
my $num = @_; # scalar context, same as $num = @_;
But in the foreach loop case I can't see how a list context is appropriate.
And I can change the code to be:
foreach (1..10) {
my $square = $_ * $_; # private variable in this loop
print "$_ squared is $square.\n";
}
And it works exactly the same. So why did the author use my($square) when a simple my $square could have been used instead?
Is there any difference in this case?
Certainly in this case, the brackets aren't necessary. They're not strictly wrong in the sense that they do do what the author intends. As with so much in Perl, there's more than one way to do it.
So there's the underlying question: why did the author choose to do this this way? I wondered at first whether it was the author's preferred style: perhaps he chose always to put his lists of new variables in brackets simply so that something like:
my ($count) = 4;
where the brackets aren't doing anything helpful, at least looked consistent with something like:
my ($min, $max) = (2, 3);
But looking at the whole book, I can't find a single example of this use of brackets for a single value other than the section you referenced. As one example of many, the m// in List Context section in Chapter 9 contains a variety of different uses of my with assignments, but does not use brackets with any single values.
I'm left with the conclusion that as the author introduced my in subroutines with my($m, $n); he tried to vary the syntax as little as possible the next time he used it, ending up with my($max_so_far) and then tried to explain scalar and list contexts, as you quoted above. I'm not sure this is terribly helpful.
TL;DR It's not necessary, although it's not actually wrong. Probably a good idea to avoid this style in your code.
You're quite correct. It's redundant. It doesn't make any difference in this case, because the right-hand side yields a single value, so the list assignment and the scalar assignment store the same thing.
E.g.
my ( $square ) = ( $_ * $_ );
This also produces the same result. So in this case it doesn't matter, but generally speaking it is not good coding style.
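To see where the brackets do matter, here is a small sketch (the @items list is invented for illustration) contrasting the two forms when the right-hand side is an array, next to the case from the book where it is a plain scalar expression:

```perl
use strict;
use warnings;

my @items = (10, 20, 30);

my ($first) = @items;   # list assignment: $first gets the first element
my $count   = @items;   # scalar assignment: $count gets the number of elements

print "first=$first count=$count\n";   # first=10 count=3

# With a plain scalar expression on the right, both forms agree.
my ($sq_list) = 4 * 4;
my $sq_scalar = 4 * 4;
print "$sq_list $sq_scalar\n";         # 16 16
```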

Array in scalar context in Perl

In Perl, if you assign an array to a scalar, you get the length of that array.
Example:
my @arr = (1,2,3,4);
my $x = @arr;
Now if you check the content of $x, you will find that $x contains the length of the @arr array.
I want to know why Perl does this. I tried to work it out myself but could not find a good reason. Can someone help me understand what is going on behind the scenes?
Perl uses contexts where other languages use functions to get some info or convert the value between different types. The concept of context is thoroughly explained in perldata manual page (man perldata).
In Perl, the same data looks different in different contexts. An array looks like a list of its elements in list context, while it looks like the number of its elements in scalar context.
How else could it possibly look in scalar context?
It could be the first element of the array. This can be done with my ($x) = @arr; or my $x = shift @arr; (which also removes the element from the array) or my $x = $arr[0];.
It could be the last element of the array. This can be done with my ($x) = reverse @arr; or my $x = pop @arr; (which also removes the element from the array) or my $x = $arr[-1];.
I cannot think of any other reasonable way to make a scalar from an array. Obviously using array length as its scalar value is better than these two, because it is somewhat global property of the array, while these two are fairly local. And it is also very logical when you look at typical use of array in scalar context:
die "Not enough arguments" if @ARGV < 5;
You can read < quite naturally as “is smaller than”.
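The different readings of an array can be put side by side. A short sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

my @arr = (1, 2, 3, 4);

my $count    = @arr;           # scalar context: number of elements
my $explicit = scalar(@arr);   # the same thing, spelled out
my ($first)  = @arr;           # list context: first element
my $last_idx = $#arr;          # index of the last element

print "$count $explicit $first $last_idx\n";   # 4 4 1 3
print "has more than three elements\n" if @arr > 3;
```

The numeric comparison in the last line puts @arr in scalar context, which is why it compares the element count.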

perl encapsulate single variable in double quotes

In Perl, is there any reason to encapsulate a single variable in double quotes (no concatenation) ?
I often find this in the source of the program I am working on (written 10 years ago by people who don't work here anymore):
my $sql_host = "something";
my $sql_user = "somethingelse";
# a few lines down
my $db = sub_for_sql_conection("$sql_host", "$sql_user", "$sql_pass", "$sql_db");
As far as I know there is no reason to do this. When I work on an old script I usually remove the quotes so my editor colors them as variables, not as strings.
I think they saw this somewhere and copied the style without understanding why it is so. Am I missing something ?
Thank you.
All this does is explicitly stringify the variables. In 99.9% of cases, it is a newbie error of some sort.
There are things that may happen as a side effect of this calling style:
my $foo = "1234";
sub bar { $_[0] =~ s/2/two/ }
print "Foo is $foo\n";
bar( "$foo" );
print "Foo is $foo\n";
bar( $foo );
print "Foo is $foo\n";
Here, stringification created a copy and passed that to the subroutine, circumventing Perl's pass by reference semantics. It's generally considered to be bad manners to munge calling variables, so you are probably okay.
You can also stringify an object or other value here. For example, undef stringifies to the empty string. Objects may specify arbitrary code to run when stringified. It is possible to have dual valued scalars that have distinct numerical and string values. This is a way to specify that you want the string form.
There is also one deep spooky thing that could be going on. If you are working with XS code that looks at the flags that are set on scalar arguments to a function, stringifying the scalar is a straightforward way to say to perl, "Make me a nice clean new string value" with only stringy flags and no numeric flags.
I am sure there are other odd exceptions to the 99.9% rule. These are a few. Before removing the quotes, take a second to check for weird crap like this. If you do happen upon a legit usage, please add a comment that identifies the quotes as a workable kludge, and give their reason for existence.
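As an illustration of the "objects may run code when stringified" case, here is a minimal sketch. The Temperature class is hypothetical and exists only for this example; overload is the core pragma that hooks stringification:

```perl
use strict;
use warnings;

package Temperature;   # hypothetical class, for illustration only
use overload '""' => sub { $_[0]{degrees} . " C" };   # runs whenever the object is stringified
sub new { my ($class, $deg) = @_; return bless { degrees => $deg }, $class }

package main;

my $t    = Temperature->new(21);
my $copy = "$t";   # the quotes force stringification; $copy is a plain string

print ref($t)    ? "object\n" : "plain string\n";   # object
print ref($copy) ? "object\n" : "plain string\n";   # plain string
print "$copy\n";                                    # 21 C
```

Removing the quotes in a line like my $copy = "$t"; would change behaviour, because $copy would then hold the object itself.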
In this case the double quotes are unnecessary. Moreover, using them is inefficient as this causes the original strings to be copied.
However, sometimes you may want to use this style to "stringify" an object. For example, URI objects support stringification:
my $uri = URI->new("http://www.perl.com");
my $str = "$uri";
I don't know why, but it's a pattern commonly used by newcomers to Perl. It's usually a waste (as it is in the snippet you posted), but I can think of two uses.
It has the effect of creating a new string with the same value as the original, and that could be useful in very rare circumstances.
In the following example, an explicit copy is done to protect $x from modification by the sub because the sub modifies its argument.
$ perl -E'
sub f { $_[0] =~ tr/a/A/; say $_[0]; }
my $x = "abc";
f($x);
say $x;
'
Abc
Abc
$ perl -E'
sub f { $_[0] =~ tr/a/A/; say $_[0]; }
my $x = "abc";
f("$x");
say $x;
'
Abc
abc
By virtue of creating a copy of the string, it stringifies objects. This could be useful when dealing with code that alters its behaviour based on whether its argument is a reference or not.
In the following example, explicit stringification is done because require handles references in @INC differently than strings.
$ perl -MPath::Class=file -E'
BEGIN { $lib = file($0)->dir; }
use lib $lib;
use DBI;
say "ok";
'
Can't locate object method "INC" via package "Path::Class::Dir" at -e line 4.
BEGIN failed--compilation aborted at -e line 4.
$ perl -MPath::Class=file -E'
BEGIN { $lib = file($0)->dir; }
use lib "$lib";
use DBI;
say "ok";
'
ok
In your case the quotes are completely useless. We can even say that it is wrong because it is not idiomatic, as others wrote.
However, quoting a variable may sometimes be necessary: it explicitly triggers stringification of the variable's value. Stringification may give a different result for some values, namely dual vars and blessed values with overloaded stringification.
Here is an example with dual vars:
use 5.010;
use strict;
use Scalar::Util 'dualvar';
my $x = dualvar 1, "2";
say 0+$x;
say 0+"$x";
Output:
1
2
My theory has always been that it's people coming over from other languages with bad habits. It's not that they're thinking "I will use double quotes all the time", but that they're just not thinking!
I'll be honest and say that I used to fall into this trap because I came to Perl from Java, so the muscle memory was there, and just kept firing.
PerlCritic finally got me out of the habit!
It definitely makes your code less efficient, but if you're not thinking about whether or not you want your strings interpolated, you are very likely to make silly mistakes, so I'd go further and say that it's dangerous.

Question about the foreach-value

I've found in a Module a for-loop written like this
for( @array ) {
my $scalar = $_;
...
...
}
Is there a difference between this and the following way of writing a for-loop?
for my $scalar ( @array ) {
...
...
}
Yes, in the first example, the for loop is acting as a topicalizer (setting $_ which is the default argument to many Perl functions) over the elements in the array. This has the side effect of masking the value $_ had outside the for loop. $_ has dynamic scope, and will be visible in any functions called from within the for loop. You should primarily use this version of the for loop when you plan on using $_ for its special features.
Also, in the first example, $scalar is a copy of the value in the array, whereas in the second example, $scalar is an alias to the value in the array. This matters if you plan on setting the array's value inside the loop. Or, as daotoad helpfully points out, the first form is useful when you need a copy of the array element to work on, such as with destructive function calls (chomp, s///, tr/// ...).
And finally, the first example will be marginally slower.
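The copy-versus-alias difference can be checked directly. A small sketch, with array names invented for the example, running chomp in both loop styles:

```perl
use strict;
use warnings;

my @copied  = ("a\n", "b\n");
my @aliased = ("a\n", "b\n");

for (@copied) {
    my $line = $_;   # $line is a copy of the element
    chomp $line;     # only the copy loses its newline
}

for my $line (@aliased) {
    chomp $line;     # $line is an alias, so the array element itself changes
}

print length($copied[0]), "\n";    # 2
print length($aliased[0]), "\n";   # 1
```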
$_ is the "default input and pattern matching space". In other words, if you read in from a file handle at the top of a while loop, or run a foreach loop and don't name a loop variable, $_ is set up for you.
However, if you write a foreach loop and name a loop variable, $_ is not set up. This can be demonstrated with the following code:
1. #!/usr/bin/perl -w
2. @array = (1,2,3);
3. foreach my $elmnt (@array)
4. {
5. print "$_ ";
6. }
The output being "Use of uninitialized value in concatenation (.)"
However if you replace line 3 by:
foreach (@array)
The output is "1 2 3" as expected.
Now in your case, it is always better to name a loop variable in a foreach loop to make the code more readable (Perl already gets criticized plenty for being hard to read). That way there is also no need for the explicit assignment from $_, and no need to worry about the resulting scoping issues.
I can't explain better than the doc can