Get variable from shell script in a perl IF statement - perl

This is a follow-up question from Modify text column based on the column before it
I wanna change the starting index of the line processing, say start from the third line. I notice that in order for perl to use the variable in shell, I must export the variable and use $ENV{} in perl, see:
#!/bin/bash
t=3
export t
perl -e 'print $ENV{t}'
perl -lane '$F[3] += sin($F[2]/10 * 4 * atan2 1, 1) if($ENV{t} .. 4);
print "@F"
' test.txt > test_new.txt
Here test.txt is the same as in the previous question:
A 0.016333 0.003203 0.472723
A 0.016333 0.035228 0.472723
B 0.016333 0.067253 0.472723
B 0.016333 0.099278 0.472723
C 0.016333 0.131303 0.472723
C 0.016333 0.163328 0.472723
However, the $ENV{t} does not work at all: the line processing still starts from the first line. Maybe in IF statement the usage is different??
What should I do to control which line to start?

It's the range operator that's doing it. The particular rule you are using for (3..4) is
If either operand of scalar ".." is a constant expression, that operand is considered true if it is equal (== ) to the current input line number (the $. variable).
Otherwise,
It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again. It doesn't become false till the next time the range operator is evaluated.
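When one end point is a variable rather than a constant, it is not compared with $. but simply evaluated as a boolean. Here $ENV{t} is 3, which is always true, so the left operand is true on every line. The range turns on at the first line and, although it ends at line 4, it restarts immediately on line 5, so the operator never returns false and all lines are processed.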
As for how to do it, forego the elegance and test explicitly,
if $. >= $ENV{t} and $. <= 4
You can still use the range operator, for a more compact expression
if $.==$ENV{t} .. 4
However, at this point this may not be as clear as a normal test, while any gain in performance is probably too small to measure. Thanks to ikegami for bringing this up and for further comments.
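The difference between these forms can be checked with a small script. This is only a sketch under stated assumptions: the six in-memory lines, `$start` (standing in for `$ENV{t}`), and the helper `selected` are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Six dummy lines held in memory; $start stands in for $ENV{t}.
my $text  = join '', map { "row $_\n" } 1 .. 6;
my $start = 3;

# Hypothetical helper: read $text line by line and return the line
# numbers ($.) for which the predicate returned true.
sub selected {
    my ($pred) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @hits;
    while (my $line = <$fh>) {
        push @hits, $. if $pred->();
    }
    close $fh;
    return "@hits";
}

# scalar() forces the flip-flop form of ".." rather than the list range.
my $constant = selected(sub { scalar(3 .. 4) });            # constants are compared with $.
my $variable = selected(sub { scalar($start .. 4) });       # left end is only tested for truth
my $flipflop = selected(sub { scalar($. == $start .. 4) }); # explicit test on the left end
my $explicit = selected(sub { $. >= $start && $. <= 4 });   # plain comparison

print "constant: $constant\n";
print "variable: $variable\n";
print "flipflop: $flipflop\n";
print "explicit: $explicit\n";
```

Only the `variable` form selects every line; the other three select lines 3 and 4.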

#!/bin/bash
t=3
export t
perl -e 'print $ENV{t}'
perl -lane '$F[3] += sin($F[2]/10 * 4 * atan2 1, 1) if(($.>=$ENV{t})&&($.<= 4));
print "@F"' test.txt > test_new.txt
The above code works! It is great to know the current line number is $.
The result is:
A 0.016333 0.003203 0.472723
A 0.016333 0.035228 0.472723
B 0.016333 0.067253 0.493849581177725
B 0.016333 0.099278 0.503907047205915
C 0.016333 0.131303 0.472723
C 0.016333 0.163328 0.472723

Related

Please explain perl operator associativity

The DOC explains Perl operator precedence and associativity. I am interested in the -> and * operators. Both are left associative.
I have the following example, and it seems that -> is right associative, not left as the documentation states:
my @args;
sub call { print "call\n"; shift @args }
sub test { print "test\n"; 1 }
sub t4 { print "t4\n"; 4 }
@args=( \&test, 2, 3 );
call()->( @args, t4 );
sub m1 { print "m1\n"; 2 }
sub m2 { print "m2\n"; 4 }
m1()*(@args, m2);
The output is:
t4
call
test
m1
m2
Because -> has left associativity I expect next output:
call
t4
test
m1
m2
What did I miss? Why is t4 called before call?
You are asking about operand evaluation order (the order in which operands are evaluated), not operator associativity (the order in which operators of the same operator precedence are evaluated).
The operand evaluation order is undefined (or at least undocumented) for many operators, including ->. You will find it consistent, but one shouldn't rely on that.
You can use statement breaks to force the desired order:
my $sub = call();
$sub->( @args, t4 );
This would also work:
$_->( @args, t4 ) for call();
Operator Associativity
Associativity is the order in which operators of the same precedence are evaluated.
For example, * and / have the same precedence, so associativity determines that
3 / 4 * 5 * 6
is equivalent to
( ( 3 / 4 ) * 5 ) * 6 = 22.5
and not
3 / ( 4 * ( 5 * 6 ) ) = 0.025
As you can see, associativity can have a direct effect on the outcome of the expression.
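A quick way to confirm the grouping is to compare the bare expression with explicitly parenthesized versions. The snippet below is a small sketch; the `**` line is an added illustration of a right-associative operator and is not taken from the answer.

```perl
use strict;
use warnings;

my $left   = 3 / 4 * 5 * 6;        # parsed as ((3 / 4) * 5) * 6
my $forced = 3 / (4 * (5 * 6));    # explicit right-to-left grouping
my $power  = 2 ** 3 ** 2;          # ** is right associative: 2 ** (3 ** 2)

print "left=$left forced=$forced power=$power\n";
```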
Operand Evaluation Order
Operand evaluation order is the order in which operands are evaluated.
For example, operand evaluation order determines whether 1+2 is evaluated before 3+4, or vice-versa in
( 1 + 2 ) * ( 3 + 4 )
Unless evaluating the operand has a side-effect (e.g. changing a variable or printing to the screen), then operand evaluation order has no effect on the outcome of the expression. As such, some languages leave it undefined to allow for optimization.
In Perl, it's undefined (or at least undocumented) for many operators. It's only documented for the following:
The comma operator (, and =>): left to right
Binary boolean operators (and, or, &&, || and //): left, then maybe right[1]
Conditional operators (? :): left, then either second or third[1]
List and scalar assignment operators (=): right, then left[2]
Binding operators (=~ and !~): left, then right[3]
[1] Required for short-circuited evaluation.
[2] Required for my $x = $x;.
[3] Required to function at all.
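The documented cases can be observed by logging each operand as it is evaluated. This is a minimal sketch; the helper `note` and its labels are hypothetical names introduced only for the demonstration.

```perl
use strict;
use warnings;

my @log;

# Hypothetical helper: record a label, then return the given value.
sub note {
    my ($label, $value) = @_;
    push @log, $label;
    return $value;
}

my @list = (note('a', 1), note('b', 2), note('c', 3));    # comma: left to right
my $and  = note('left', 0) && note('right', 1);           # right operand is skipped
my $pick = note('cond', 1) ? note('then', 'T') : note('else', 'E');

print "@log\n";
```

The log shows `a b c left cond then`: neither `right` nor `else` is ever evaluated.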
Associativity is a rule that tells you how to parse an expression where the same operator appears twice, or more than one operator with the same precedence level appears. For example, associativity determines that:
foo()->bar()->baz()
is equivalent to
(foo()->bar())->baz() # left associative
and not
foo()->(bar()->baz()) # right associative
Since your code only has one -> operator in it, the associativity of that operator is irrelevant.
You seem to be looking for a rule about the order of evaluation of operands to the -> operator. As far as I know, there is no guaranteed order of evaluation for most operators in perl (much like C). If the order matters, you should split the expression into multiple statements with an explicit temporary variable:
my $tmp = call();
$tmp->( @args, t4 );
Associativity describes the order that multiple operators of equal precedence are handled. It has nothing to do with whether the left arguments or the right arguments are processed first. -> being left associative means that a->b->c is processed as (a->b)->c, not as a->(b->c).
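The temporary-variable approach can be checked by logging calls instead of printing them. This sketch reuses the sub names from the question; the `@log` array is an assumption added for the demonstration.

```perl
use strict;
use warnings;

my (@log, @args);
sub call { push @log, 'call'; return shift @args }
sub test { push @log, 'test'; return 1 }
sub t4   { push @log, 't4';   return 4 }

@args = (\&test, 2, 3);

my $sub = call();       # statement boundary: call() has finished here
$sub->(@args, t4());    # t4() runs while the argument list is built

print "@log\n";
```

With the call split across two statements, the order is `call t4 test` regardless of how the interpreter orders the operands of `->`.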

How does incrementation operator work within a loop

I've just started off with perl, and while trying out a few compound statements, I wrote this:
my $ct;
while ($ct++ < 10) {
print $ct;
}
It prints out:
12345678910
I was not expecting it to print 10. How does the logic for the loop really work?
According to perldoc, a TERM operator has the highest precedence. $ct gets incremented to 10 after iterating the loop where it is 9. When it becomes 10, the while loop is supposed to exit. So why is 10 still printed out?
Think of it like
while ($ct < 10) {
$ct += 1;
print $ct;
}
(increment after comparison)
On the other hand, ++ on the left side of the variable will increment first, and then do comparison,
while (++$ct < 10) {
print $ct;
}
This is quite intuitive for someone with C background; from perldoc:
"++" and "--" work as in C. That is, if placed before a variable, they increment or decrement the variable by one before returning the value, and if placed after, increment or decrement after returning the value.
It's because you are using the postfix operator: the value is first compared, then incremented.
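The two loops can be run side by side to see the difference. This is a small sketch; it collects the values in arrays (an assumption made here so the results can be compared) instead of printing inside the loop.

```perl
use strict;
use warnings;

my $ct = 0;
my @postfix;
while ($ct++ < 10) {    # compares the old value, then increments
    push @postfix, $ct;
}

$ct = 0;
my @prefix;
while (++$ct < 10) {    # increments first, then compares the new value
    push @prefix, $ct;
}

print "postfix: @postfix\n";
print "prefix:  @prefix\n";
```

The postfix loop body sees 1 through 10, because the last successful test compares 9 and the body then runs with the incremented value; the prefix loop stops at 9.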

Perl booleans, negation (and how to explain it)?

I'm new here. After reading through how to ask and format, I hope this will be an OK question. I'm not very skilled in Perl, but it is the programming language I know best.
I'm trying to apply Perl to real life, but I haven't found much understanding - especially not from my wife. I tell her that:
if she didn't bring to me 3 beers in the evening, that means I got zero (or nothing) beers.
As you probably guessed, without much success. :(
Now factually. From perlop:
Unary "!" performs logical negation, that is, "not".
For languages that have boolean types (which can hold only two "values"), this is OK:
if it is not the one value -> must be the another one.
so naturally:
!true -> false
!false -> true
But Perl doesn't have boolean variables - it only has a truth system, where everything that is not 0, '0', undef, or '' is TRUE. The problem comes when applying logical negation to a non-logical value, e.g. numbers.
E.g. if some number IS NOT 3, that means it IS ZERO or empty, instead of the real-life meaning, where if something is NOT 3, it can be anything but 3 (e.g. zero too).
So the next code:
use 5.014;
use Strictures;
my $not_3beers = !3;
say defined($not_3beers) ? "defined, value>$not_3beers<" : "undefined";
say $not_3beers ? "TRUE" : "FALSE";
my $not_4beers = !4;
printf qq{What is not 3 nor 4 mean: They're same value: %d!\n}, $not_3beers if( $not_3beers == $not_4beers );
say qq(What is not 3 nor 4 mean: @{[ $not_3beers ? "some bears" : "no bears" ]}!) if( $not_3beers eq $not_4beers );
say ' $not_3beers>', $not_3beers, "<";
say '-$not_3beers>', -$not_3beers, "<";
say '+$not_3beers>', -$not_3beers, "<";
prints:
defined, value><
FALSE
What is not 3 nor 4 mean: They're same value: 0!
What is not 3 nor 4 mean: no bears!
$not_3beers><
-$not_3beers>0<
+$not_3beers>0<
Moreover:
perl -E 'say !!4'
what is not not 4 IS 1, instead of 4!
The above statements with wife are "false" (mean 0) :), but really trying teach my son Perl and he, after a while, asked my wife: why, if something is not 3 mean it is 0 ? .
So the questions are:
how to explain this to my son
why perl has this design, so why !0 is everytime 1
Is here something "behind" what requires than !0 is not any random number, but 0.
as I already said, I don't know well other languages - in every language is !3 == 0?
I think you are focusing too much on negation and too little on what Perl booleans mean.
Historical/Implementation Perspective
What is truth? The detection of a higher voltage than x Volts.
On a higher abstraction level: If this bit here is set.
The abstraction of a sequence of bits can be considered an integer. Is this integer false? Yes, if no bit is set, i.e. the integer is zero.
A hardware-oriented language will likely use this definition of truth, e.g. C, and all C descendants incl Perl.
The negation of 0 could be bitwise negation—all bits are flipped to 1—, or we just set the last bit to 1. The results would usually be decoded as integers -1 and 1 respectively, but the latter is more energy efficient.
Pragmatic Perspective
It is convenient to think of all numbers but zero as true when we deal with counts:
my $wordcount = ...;
if ($wordcount) {
say "We found $wordcount words";
} else {
say "There were no words";
}
or
say "The array is empty" unless @array; # notice scalar context
A pragmatic language like Perl will likely consider zero to be false.
Mathematical Perspective
There is no reason for any number to be false, every number is a well-defined entity. Truth or falseness emerges solely through predicates, expressions which can be true or false. Only this truth value can be negated. E.g.
¬(x ≤ y) where x = 2, y = 3
is false. Many languages which have a strong foundation in maths won't consider anything false but a special false value. In Lisps, '() or nil is usually false, but 0 will usually be true. That is, a value is only true if it is not nil!
In such mathematical languages, !3 == 0 is likely a type error.
Re: Beers
Beers are good. Any number of beers are good, as long as you have one:
my $beers = ...;
if (not $beers) {
say "Another one!";
} else {
say "Aaah, this is good.";
}
Boolification of a beer-counting variable just tells us if you have any beers. Consider !! to be a boolification operator:
my $enough_beer = !! $beers;
The boolification doesn't concern itself with the exact amount. But maybe any number ≥ 3 is good. Then:
my $enough_beer = ($beers >= 3);
The negation is not enough beer:
my $not_enough_beer = not($beers >= 3);
or
my $not_enough_beer = not $beers;
fetch_beer() if $not_enough_beer;
Sets
A Perl scalar does not symbolize a whole universe of things. Especially, not 3 is not the set of all entities that are not three. Is the expression 3 a truthy value? Yes. Therefore, not 3 is a falsey value.
The suggested behaviour of 4 == not 3 to be true is likely undesirable: 4 and “all things that are not three” are not equal, the four is just one of many things that are not three. We should write it correctly:
4 != 3 # four is not equal to three
or
not( 4 == 3 ) # the same
It might help to think of ! and not as logical-negation-of, but not as except.
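The distinction between "is not equal to 3" and "equals the negation of 3" can be seen directly. This is a minimal sketch; `$beers` and the other variable names are chosen here for illustration only.

```perl
use strict;
use warnings;

my $beers = 4;

my $differs_from_3  = ($beers != 3)  ? 'yes' : 'no';    # comparison: 4 is not 3
my $equals_negation = ($beers == !3) ? 'yes' : 'no';    # compares 4 with false, i.e. 0
my $enough_beer     = ($beers >= 3)  ? 'yes' : 'no';    # predicate on the count

print "differs from 3:  $differs_from_3\n";
print "equals !3:       $equals_negation\n";
print "enough beer:     $enough_beer\n";
```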
How to teach
It might be worth introducing mathematical predicates: expressions which can be true or false. If we only ever “create” truthness by explicit tests, e.g. length($str) > 0, then your issues don't arise. We can name the results: my $predicate = (1 < 2), but we can decide to never print them out, instead: print $predicate ? "True" : "False". This sidesteps the problem of considering special representations of true or false.
Considering values to be true/false directly would then only be a shortcut, e.g. foo if $x can be considered roughly a shortcut for
foo if defined $x and length($x) > 0 and $x != 0;
Perl is all about shortcuts.
Teaching these shortcuts, and the various contexts of perl and where they turn up (numeric/string/boolean operators) could be helpful.
List Context
Even-sized List Context
Scalar Context
Numeric Context
String Context
Boolean Context
Void Context
as I already said, I don't know well other languages - in every language is !3 == 0?
Yes. In C (and thus C++), it's the same.
#include <stdio.h>

int main(void) {
    int i = 3;
    int n = !i;
    int nn = !n;
    printf("!3=%i ; !!3=%i\n", n, nn);
    return 0;
}
Prints (see http://codepad.org/vOkOWcbU )
!3=0 ; !!3=1
how to explain this to my son
Very simple. !3 means "opposite of some non-false value, which is of course false". This is called "context" - in a Boolean context imposed by negation operator, "3" is NOT a number, it's a statement of true/false.
The result is also not a "zero" but merely a convenient Perl representation of false - which turns into a zero if used in a numeric context (but an empty string if used in a string context - see the difference between 0 + !3 and !3 . "a")
The Boolean context is just a special kind of scalar context where no conversion to a string or a number is ever performed. (perldoc perldata)
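The two faces of the false value can be inspected with a few lines of Perl. This is a small sketch; the variable names are assumptions chosen for the demonstration.

```perl
use strict;
use warnings;

my $false = !3;     # the value Perl returns for "false"
my $true  = !!3;    # the value Perl returns for "true"

my $as_number = 0 + $false;      # numeric context
my $as_string = $false . "a";    # string context

print "defined: ", (defined $false ? "yes" : "no"), "\n";
print "as number: $as_number\n";
print "as string: $as_string\n";
print "double negation: $true\n";
```

The false value is defined, behaves as 0 in arithmetic, and contributes nothing when concatenated, which matches the output in the question.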
why perl has this design, so why !0 is everytime 1
See above. Among other likely reasons (though I don't know if that was Larry's main reason), C has the same logic and Perl took a lot of its syntax and ideas from C.
For a VERY good underlying technical detail, see the answers here: " What do Perl functions that return Boolean actually return " and here: " Why does Perl use the empty string to represent the boolean false value? "
Is here something "behind" what requires than !0 is not any random number, but 0.
Nothing aside from simplicity of implementation. It's easier to produce a "1" than a random number.
If you're asking a different question of "why is it 1 instead of the original number that was negated to get 0", the answer to that is simple - by the time the Perl interpreter gets to negate that zero, it no longer knows/remembers that zero was a result of "!3" as opposed to some other expression that resulted in a value of zero/false.
If you want to test that a number is not 3, then use this:
$my_variable != 3;
Using the syntax !3, since ! is a boolean operator, first converts 3 into a boolean (even though perl may not have an official boolean type, it still works this way), which, since it is non-zero, means it gets converted to the equivalent of true. Then, !true yields false, which, when converted back to an integer context, gives 0. Continuing with that logic shows how !!3 converts 3 to true, which then is inverted to false, inverted again back to true, and if this value is used in an integer context, gets converted to 1. This is true of most modern programming languages (although maybe not some of the more logic-centered ones), although the exact syntax may vary some depending on the language...
Logically negating a false value requires some value be chosen to represent the resulting true value. "1" is as good a choice as any. I would say it is not important which value is returned (or conversely, it is important that you not rely on any particular true value being returned).

What's the difference between | and || in MATLAB?

What is the difference between the | and || logical operators in MATLAB?
I'm sure you've read the documentation for the short-circuiting operators, and for the element-wise operators.
One important difference is that element-wise operators can operate on arrays whereas the short-circuiting operators apply only to scalar logical operands.
But probably the key difference is the issue of short-circuiting. For the short-circuiting operators, the expression is evaluated from left to right and as soon as the final result can be determined for sure, then remaining terms are not evaluated.
For example, consider
x = a && b
If a evaluates to false, then we know that a && b evaluates to false irrespective of what b evaluates to. So there is no need to evaluate b.
Now consider this expression:
NeedToMakeExpensiveFunctionCall && ExpensiveFunctionCall
where we imagine that ExpensiveFunctionCall takes a long time to evaluate. If we can perform some other, cheap, test that allows us to skip the call to ExpensiveFunctionCall, then we can avoid calling ExpensiveFunctionCall.
So, suppose that NeedToMakeExpensiveFunctionCall evaluates to false. In that case, because we have used short-circuiting operators, ExpensiveFunctionCall will not be called.
In contrast, if we used the element-wise operator and wrote the function like this:
NeedToMakeExpensiveFunctionCall & ExpensiveFunctionCall
then the call to ExpensiveFunctionCall would never be skipped.
In fact the MATLAB documentation, which I do hope you have read, includes an excellent example that illustrates the point very well:
x = (b ~= 0) && (a/b > 18.5)
In this case we cannot perform a/b if b is zero. Hence the test for b ~= 0. The use of the short-circuiting operator means that we avoid calculating a/b when b is zero and so avoid the run-time error that would arise. Clearly the element-wise logical operator would not be able to avoid the run-time error.
For a longer discussion of short-circuit evaluation, refer to the Wikipedia article on the subject.
Logical Operators
MATLAB offers three types of logical operators and functions:
| is Element-wise — operate on corresponding elements of logical arrays.
Example:
vector inputs A and B
A = [0 1 1 0 1];
B = [1 1 0 0 1];
A | B = 11101
|| is Short-circuit — operate on scalar, logical expressions
Example:
|| : Returns logical 1 (true) if either input, or both, evaluate to true, and logical 0 (false) if they do not.
Operand: logical expressions containing scalar values.
A || B (B is only evaluated if A is false)
A = 1;
B = 0;
C =(A || (B = 1));
B is 0 after this expression and C is 1.
The third type is bit-wise: these operate on corresponding bits of integer values or arrays.
|| is used for scalar inputs
| takes array input in if/while statements
From the source:-
Always use the && and || operators when short-circuiting is required.
Using the elementwise operators (& and |) for short-circuiting can
yield unexpected results.
Short-circuit || means that operands are evaluated only if necessary for the expression.
In our example expr1 || expr2, if expr1 evaluates to TRUE, then there is no need to evaluate the second operand - the result will always be TRUE. If you have a long chain of short-circuit operators A || B || C || D and the first evaluates to true, then the others won't be evaluated.
If you substitute Element-wise logical | to A | B | C | D then all elements will be evaluated regardless of previous operands.
| represents OR as a logical operator. || is also a logical operator called a short-circuit OR
The most important advantage of short-circuit operators is that you can use them to evaluate an expression only when certain conditions are satisfied. For example, you want to execute a function only if the function file resides on the current MATLAB path. Short-circuiting keeps the following code from generating an error when the file, myfun.m, cannot be found:
comp = (exist('myfun.m') == 2) && (myfun(x) >= y)
Similarly, this statement avoids attempting to divide by zero:
x = (b ~= 0) && (a/b > 18.5)
You can also use the && and || operators in if and while statements to take advantage of their short-circuiting behavior:
if (nargin >= 3) && (ischar(varargin{3}))

Is it guaranteed anywhere in docs that hashes with same keys will also have same order?

There's not much guarantees about order of hash keys in Perl. Is there any mention in docs that I can't find that would say that as long as two hashes use exactly same keys, they will go in exactly same order?
Short test seems to confirm that. Even if I generate some additional keys for internal key table between assigning to two different hashes, their keys are returned in same order:
my %aaa;
my %bbb;
my %ccc;
my %ddd;
@aaa{qw(a b c d e f g h i j k l)}=();
# Let's see if generating more keys for internal table matters
@ccc{qw(m n o p q r s t u v w x)}=();
@bbb{qw(a b c d e f g h i j k l)}=();
# Just to test if different insertion order matters
@ddd{qw(l k c d e f g h i j a)}=(); $ddd{b} = ();
print keys %aaa, "\n";
print keys %bbb, "\n";
print keys %ddd, "\n";
However, I wouldn't rely on undocumented behavior, and the only fact that can easily be found in the docs is that keys, values and each all use the same order as long as the hash is not modified.
From perlsec:
Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order.
http://perldoc.perl.org/perlsec.html
A longer test disproves it.
So, different hashes with the same set of keys won't always have the same order. For me the program below demonstrates that two hashes with keys qw(a b c d e f) can differ in ordering:
v5.16.0
%h1: ecabdf
%h2: eadcbf
Program:
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
# http://stackoverflow.com/q/12724071/132382
use constant KEYS => qw(a b c d e f);
my %h1 = map { $_ => undef } KEYS;
my %h2 = map { $_ => undef } KEYS;
delete @h2{'b', 'd', 'f'};
@h2{'x', 'y', 'z'} = ();
@h2{'b', 'd', 'f'} = ();
delete @h2{'x', 'y', 'z'};
say $^V;
say '%h1: ', keys(%h1);
say '%h2: ', keys(%h2);
Update
Here's a simpler demonstration that insertion order alone matters:
$ perl -MList::Util=shuffle -E \
> '@keys = ('a'..'z'); @h1{@keys} = @h2{shuffle @keys} = ();
> say keys(%$_) for (\%h1, \%h2)'
wraxdjyukhgftienvmslcpqbzo
warxdjyukhgftienmvslpcqbzo
#^^ ^^ ^^
#|| || ||
It is specifically guaranteed that this is undependable.
See the Algorithmic Complexity Attacks section of perlsec in full. Despite its regrettable incoherency, it states that
in 5.8.1, the order is guaranteed to be random every time.
in 5.8.2 and later, the order will be the same unless Perl detects pathological behavior (specifically, a series of keys that would all hash to a small number of buckets, causing the hash performance to suffer). In those cases, "the function is perturbed by a pseudorandom seed".
The documentation does not guarantee that the ordering will always be the same; in fact, it specifically states that it will not be predictable in a pathological case. Should the hashing function be changed in a future release, it's possible that data which previously did not generate degenerate hashing would now do so, and then would be subject to the random perturbation.
So the takeaway is that if you're not using 5.8.1, maybe you'll get the same order, and maybe it won't change when you update your Perl, but it might. If you are using 5.8.1, then a random order is guaranteed.
If you want a dependable order, use one of the CPAN classes that provides a hash that has a guaranteed key order - Tie::Hash::Indexed, Tie::IxHash - or just sort your keys. If you have a hash that has less than a few thousand keys, you probably won't notice an appreciable difference. If it has more than that, maybe you should consider a heavier-weight solution such as a database anyway.
Edit: and just to make it more interesting, keys will be randomly ordered as of 5.18.
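Sorting the keys is the simplest of the options above. The sketch below builds two hashes with the same keys in different insertion orders and compares the sorted key lists; the hash names and the use of `shuffle` from List::Util are choices made for this illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

my @keys = ('a' .. 'f');

my (%h1, %h2);
@h1{@keys}          = ();    # inserted in alphabetical order
@h2{ shuffle @keys } = ();   # inserted in a random order

# The raw key order may differ between the hashes; the sorted order cannot.
my $sorted1 = join '', sort keys %h1;
my $sorted2 = join '', sort keys %h2;

print "h1 sorted: $sorted1\n";
print "h2 sorted: $sorted2\n";
```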
Here is a counter-example that is shorter than @pilcrow's (apparently I missed his answer when I first looked at this question):
#!/usr/bin/env perl
use strict; use warnings;
my @hashes = (
    { map { $_ => rand } 'a' .. 'z' },
    { map { $_ => rand } 'a' .. 'd', 'f' .. 'z' }
);
delete $hashes[0]{e};
print "@{[ keys %$_ ]}\n" for @hashes;
Output:
C:\temp> t
w r a x d j y u k h g f t i n v m s l c p q b z o
w r a x d j y u k h g f t i n v m s l c p b q z o
perldoc -f keys has some info on the ordering:
The keys of a hash are returned in an apparently random order. The actual random order is subject to change in future versions of Perl, but it is guaranteed to be the same order as either the values or each function produces (given that the hash has not been modified). Since Perl 5.8.1 the ordering can be different even between different runs of Perl for security reasons (see Algorithmic Complexity Attacks in perlsec).
So the only guarantee is that no ordering is guaranteed.
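The one ordering guarantee that the documentation does give, that keys and values line up for an unmodified hash, can be checked as follows. This is a minimal sketch; the hash contents (each value is the upper-cased key) are an assumption chosen so the pairing is easy to verify.

```perl
use strict;
use warnings;

# Each value is the upper-cased key, so any misalignment is detectable.
my %h = map { $_ => uc $_ } 'a' .. 'z';

my @k = keys %h;      # some unspecified order
my @v = values %h;    # documented to be the same order as keys

my $aligned = 1;
for my $i (0 .. $#k) {
    $aligned = 0 unless $v[$i] eq uc $k[$i];
}

print $aligned ? "keys and values line up\n" : "keys and values differ\n";
```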
Since at least 5.18, the following is explicitly mentioned in perldoc perlsec:
keys, values, and each return items in a per-hash randomized order. Modifying a hash by insertion will change the iteration order of that hash.
Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order and the history of changes made to the hash over its lifetime.
Therefore two hashes with same set of keys are explicitly NOT guaranteed to be iterated in same order.