While reading "Perl: extract rows from 1 to n (Windows)", I didn't understand the flip-flop-operator/readline-counter part.
perl -nE 'say $c if $c=1..3' my_file
1
2
3E0
Could someone explain in more detail where this output comes from?
To quote perlop:
In scalar context, ".." returns a
boolean value. The operator is
bistable, like a flip-flop, and
emulates the line-range (comma)
operator of sed, awk, and various
editors. Each ".." operator maintains
its own boolean state, even across
calls to a subroutine that contains
it. It is false as long as its left
operand is false. Once the left
operand is true, the range operator
stays true until the right operand is
true, AFTER which the range operator
becomes false again. It doesn't become
false till the next time the range
operator is evaluated. It can test the
right operand and become false on the
same evaluation it became true (as in
awk), but it still returns true once.
If you don't want it to test the right
operand until the next evaluation, as
in sed, just use three dots ("...")
instead of two. In all other regards,
"..." behaves just like ".." does.
The right operand is not evaluated
while the operator is in the "false"
state, and the left operand is not
evaluated while the operator is in the
"true" state. The precedence is a
little lower than || and &&. The value
returned is either the empty string
for false, or a sequence number
(beginning with 1) for true. The
sequence number is reset for each
range encountered. The final sequence
number in a range has the string "E0"
appended to it, which doesn't affect
its numeric value, but gives you
something to search for if you want to
exclude the endpoint. You can exclude
the beginning point by waiting for the
sequence number to be greater than 1.
If either operand of scalar ".." is a
constant expression, that operand is
considered true if it is equal (==)
to the current input line number (the
$. variable).
(emphasis added)
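The quoted rules can be reproduced in a small script. This is only a sketch: the five-line in-memory input is made up for illustration, and the loop stands in for what the -n switch does in the one-liner.

```perl
use strict;
use warnings;

# Hypothetical five-line input, read through an in-memory filehandle.
my $text = join '', map { "line $_\n" } 1 .. 5;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @seen;
while (my $line = <$fh>) {
    # Scalar "..": the constant operands 1 and 3 are compared with $.
    # (the current input line number), not used as a list range.
    my $c = 1 .. 3;
    push @seen, $c if $c;    # false is the empty string
}
close $fh;

print "$_\n" for @seen;    # prints 1, 2, 3E0
```

The last value carries the "E0" suffix because line 3 is where the right operand became true; it is still numerically equal to 3.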
Why is the following use line legal Perl syntax? (Adapted from the POD for parent; tested on Perl 5.26.2 x64 on Cygwin.)
package MyHash;
use strict;
use Tie::Hash;
use parent -norequire, "Tie::StdHash";
# ^^^^^^^^^^ A bareword with nothing to protect it!
Under -MO=Deparse, the use line becomes
use parent ('-norequire', 'Tie::StdHash');
but I can't tell from the use docs where the quoting on -norequire comes from.
If use strict were not in force, I would understand it. The bareword norequire would become the string "norequire", the unary minus would turn that string into "-norequire", and the resulting string would go into the use import list. For example:
package MyHash;
use Tie::Hash;
use parent -norequire, "Tie::StdHash";
Similarly, if there were a fat comma, I would understand it. -foo => bar becomes "-foo", bar because => turns foo into "foo", and then the unary minus works its magic again. For example:
package MyHash;
use strict;
use Tie::Hash;
use parent -norequire => "Tie::StdHash";
Both of those examples produce the same deparse for the use line. However, both have quoting that the original example does not. What am I missing that makes the original example (with strict, without =>) legal? Thanks!
You already cited perldoc perlop, but it is relevant here.
Unary - performs arithmetic negation if the operand is numeric, including any string that looks like a number. If the operand is an identifier, a string consisting of a minus sign concatenated with the identifier is returned. ... One effect of these rules is that -bareword is equivalent to the string "-bareword".
This behavior of the unary minus operator is applied to the bareword before the strict checks are applied. Therefore, unary minus is a kind of quoting operator that also works in strict mode.
Similarly, barewords as the invocant in method invocation do not need to be quoted as long as they are not a function call:
Foo->bar; # 'Foo'->bar(); --- but only if no sub Foo exists
print->bar; # print($_)->bar();
However, the unary minus behaviour seems to be due to constant folding, not due to a special case in the parser. For example, this code
use strict;
0 ? foo : bar;
will only complain about the bareword "bar" being disallowed, suggesting that the bareword check happens very late during parsing and compilation. In the unary minus case, the bareword will already have been constant-folded into a proper string value at that point, and no bareword remains visible.
While this is arguably buggy, it is also impossible to change without breaking backwards compatibility – and this behaviour is used by many modules such as use parent to communicate options. Compare also similar idioms on command line interfaces, where options usually begin with a dash.
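A minimal sketch of the effect, outside of a use line. The variable name @import_list is made up for illustration; the list simply mirrors the arguments from the question.

```perl
use strict;
use warnings;

# Under strict, a plain bareword in this list would be a compile-time error,
# but the unary minus turns the identifier into the string "-norequire".
my @import_list = (-norequire, 'Tie::StdHash');

print join(', ', map { "'$_'" } @import_list), "\n";
# prints: '-norequire', 'Tie::StdHash'
```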
From perlop
Symbolic Unary Operators
Unary "-" performs arithmetic negation if the operand is numeric, including any
string that looks like a number. If the operand is an identifier, a string
consisting of a minus sign concatenated with the identifier is returned.
Otherwise, if the string starts with a plus or minus, a string starting with
the opposite sign is returned. One effect of these rules is that -bareword is
equivalent to the string "-bareword". If, however, the string begins with a
non-alphabetic character (excluding "+" or "-"), Perl will attempt to convert
the string to a numeric and the arithmetic negation is performed. If the string
cannot be cleanly converted to a numeric, Perl will give the warning Argument
"the string" isn't numeric in negation (-) at ....
So, because of these parsing rules, -name is treated as "-name" even under use strict.
I'm trying to increment:
Text_1_string(0)
to
Text_1_string(1)
and so on.
Note that I only want to increment the number in the parenthesis.
I've used:
$name =~ s/\(([0-9]+)\)/$1 + 1/e;
but it turns out as:
Text_1_string1
and I don't understand why. The group captured is the number, it shouldn't replace the parenthesis.
The substitution replaces everything the pattern matched, not only the captured part. So you need to put the parentheses back in the replacement:
$name =~ s/\(([0-9]+)\)/'('.($1 + 1).')'/e;
Since the replacement part is evaluated as code (because of /e), it needs to be normal Perl code; thus the quotes and concatenation, and the parentheses for precedence.
To add, there are patterns that need not be put back in the replacement part: lookahead and lookbehind assertions. Like common anchors, these are zero-width assertions, so they do not consume what they match; you only "look":
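A short sketch comparing the two substitutions on the string from the question; the variable names $broken and $fixed are only for illustration.

```perl
use strict;
use warnings;

my $name = 'Text_1_string(0)';

# The whole match "(0)" is replaced by the result of $1 + 1.
(my $broken = $name) =~ s/\(([0-9]+)\)/$1 + 1/e;

# Re-adding the parentheses in the replacement keeps them in the result.
(my $fixed = $name) =~ s/\(([0-9]+)\)/'(' . ($1 + 1) . ')'/e;

print "$broken\n";    # Text_1_string1
print "$fixed\n";     # Text_1_string(1)
```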
$name =~ s/(?<=\() ([0-9]+) (?=\))/$1 + 1/xe;
The lookbehind can't be of variable length (like \w+); it takes only a fixed string pattern.
The (?<=...) asserts that the (fixed-length) pattern in its parentheses (which do not capture!) must precede the number, while (?=...) asserts that the pattern in its parentheses must follow, for the whole pattern to match.
Often very useful is the lookbehind-type construct \K, which makes the engine keep in the string what it had matched up to that point (instead of "consuming" it); so it "drops" previous matches, much like the (?<=...) form
$name =~ s/\(\K ([0-9]+) (?=\))/$1 + 1/xe;
This is also more efficient. While it is also termed a "lookbehind" in documentation, there are in fact distinct differences in behavior. See this post and comments. Thanks to ikegami for a comment.
All these are positive lookarounds; there are also negative ones, asserting that given patterns must not be there for the whole thing to match.
A bit of an overkill in this case but a true gift in some other cases.
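As a quick check, both the lookaround form and the \K form can be run on a sample string. The value 41 and the variable names are made up for this sketch.

```perl
use strict;
use warnings;

# Lookbehind and lookahead: the parentheses are asserted, not consumed.
my $with_lookaround = 'Text_1_string(41)';
$with_lookaround =~ s/(?<=\() ([0-9]+) (?=\))/$1 + 1/xe;

# \K: the "(" is matched but kept out of the part that gets replaced.
my $with_keep = 'Text_1_string(41)';
$with_keep =~ s/\(\K ([0-9]+) (?=\))/$1 + 1/xe;

print "$with_lookaround\n";    # Text_1_string(42)
print "$with_keep\n";          # Text_1_string(42)
```

In both cases only the digits are replaced, so nothing needs to be put back in the replacement part.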
What does '-' mean in the parameters?
$cgi->start_html(-title => uc($color), -BGCOLOR => $color);
I only know => from hashes, but here it appears in the arguments to a sub. That confuses me, and I searched for a long time without finding an answer.
Whenever you come across confusing syntax in Perl, a handy tool is the -MO=Deparse option. This causes Perl to check the syntax of a script and output the script in a normalized form, rather than executing it.
So if I do
perl -MO=Deparse -e '$cgi->start_html(-title => uc($color), -BGCOLOR => $color);'
I get a result of:
$cgi->start_html(-'title', uc $color, -'BGCOLOR', $color);
-e syntax OK
There are three differences here:
Quotes were added to title and BGCOLOR.
The => operators changed to commas.
The parentheses disappeared from uc($color).
The first two are the normal effects of the => ("fat comma") operator: It's equivalent to a comma, except that if the thing to the left is an identifier (starting with a letter or underscore and containing only alphanumeric characters and underscores), that identifier becomes a quoted string.
And the parentheses after uc just aren't strictly necessary in this situation, since the builtin function uc is prototyped to take 0 or 1 arguments.
But now we have -'title' and -'BGCOLOR', so what's the negative of a string? Checking perldoc perlop, we see that unary minus follows the rules:
1. If the operand is a number or a string representation of a number, does an arithmetic negation.
2. Otherwise, if the string starts with '+' or '-', switches just the first character of the string to the opposite sign.
3. Otherwise, if the string starts with a letter, adds a '-' to the beginning of the string.
4. Otherwise, attempts to convert the string to a number, probably prints a warning if warnings are enabled, and then does an arithmetic negation.
Here we have case 3, so -'title' is '-title' and -'BGCOLOR' is '-BGCOLOR'.
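The first three cases can be tried directly on string operands. This is only a sketch with made-up sample strings; the fourth case is left out because it emits a warning.

```perl
use strict;
use warnings;

my $negated_number = -"10";       # numeric string: arithmetic negation
my $flipped_sign   = -"-foo";     # leading sign is switched to "+"
my $prefixed       = -"title";    # starts with a letter: "-" is prepended

print "$negated_number $flipped_sign $prefixed\n";    # -10 +foo -title
```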
So presumably the start_html method expects a list of arguments which come in key-value pairs, and the key strings are supposed to start with hyphens. (It might or might not internally use these arguments to create a hash, with a line like my %options = @_;.)
This is all a little roundabout, plus you'd get confusing results if you ever tried passing something like -3zzz => $value. So I'd personally add explicit quotes here to make it obvious what's being passed, but keep using the fat commas anyway to emphasize the arguments are meant to be key/value pairs:
$cgi->start_html('-title' => uc($color), '-BGCOLOR' => $color);
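To illustrate how such key/value arguments could be consumed, here is a sketch with a hypothetical sub named describe_page. It is not CGI.pm's actual implementation; it only shows the my %options = @_; idea with hyphen-prefixed keys.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method like start_html.
sub describe_page {
    my %options = @_;    # the flat argument list becomes key/value pairs
    my $title   = $options{-title}   // 'untitled';
    my $bgcolor = $options{-BGCOLOR} // 'white';
    return "title=$title bgcolor=$bgcolor";
}

my $color   = 'red';
my $summary = describe_page(-title => uc($color), -BGCOLOR => $color);
print "$summary\n";    # title=RED bgcolor=red
```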
It has no effect here. It's just treated as part of the string. I assume that the original author of CGI.pm wanted to make the options look more like command-line options. I think that was a terrible idea.
It's a string literal, just like "-title" or "-BGCOLOR".
perldoc perlop:
[Unary "-" ...] If the operand is an identifier, a string consisting of a minus sign concatenated with the identifier is returned. Otherwise, if the string starts with a plus or minus, a string starting with the opposite sign is returned.
In other words, -"foo" is "-foo".
The => operator (sometimes pronounced "fat comma") is a synonym for the comma except that it causes a word on its left to be interpreted as a string if it begins with a letter or underscore and is composed only of letters, digits and underscores.
In other words, foo => 42 is "foo", 42.
Taken together, this means -title => uc($color) is "-title", uc($color).
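A small check that the fat-comma form and the explicitly quoted form build the same list; the variable names are made up for this sketch.

```perl
use strict;
use warnings;

my $color = 'red';

# => quotes the word on its left; unary minus then prepends the "-".
my @with_fat_comma = (-title => uc($color), foo => 42);
my @with_quotes    = ('-title', uc($color), 'foo', 42);

my $verdict = "@with_fat_comma" eq "@with_quotes" ? 'same' : 'different';
print "$verdict\n";    # same
```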
I am just starting out with Swift, coming from Objective-C. I have this line:
self.collectionView?.insertItems(at: [IndexPath.init(row: (self.flights.count -1), section: 0)])
I get an error saying: Expected ',' separator after 'count'.
Why on earth can I not do a simple subtraction, count -1? Is this a little quirk of Swift I haven't learnt yet, or is it just too early in the morning?
Referring to Apple's "Lexical Structure" Documentation:
The whitespace around an operator is used to determine whether an
operator is used as a prefix operator, a postfix operator, or a binary
operator. This behavior is summarized in the following rules:
If an operator has whitespace around both sides or around neither side, it is treated as a binary operator. As an example, the +++
operator in a+++b and a +++ b is treated as a binary operator.
If an operator has whitespace on the left side only, it is treated as a prefix unary operator. As an example, the +++ operator in a +++b
is treated as a prefix unary operator.
If an operator has whitespace on the right side only, it is treated as a postfix unary operator. As an example, the +++ operator in a+++ b
is treated as a postfix unary operator.
If an operator has no whitespace on the left but is followed immediately by a dot (.), it is treated as a postfix unary operator.
As an example, the +++ operator in a+++.b is treated as a postfix
unary operator (a+++ .b rather than a +++ .b).
Note: ++ and -- have been removed in Swift 3. For more information, see Swift Evolution proposal 0004.
This means that the minus operator in self.flights.count -1 is treated as a prefix unary operator (second rule).
To make it more clear, the compiler reads self.flights.count -1 as: self.flights.count, next to it there is a minus one, but NOT a subtraction operation. By applying the second rule, the minus is a prefix unary operator for the 1.
Obviously, you want the compiler to treat the minus operator as a binary operator, so what you should do is to add whitespace around both sides of the minus (applying the first rule):
self.collectionView.insertItems(at: [IndexPath.init(row: (flights.count - 1), section: 0)])
Hope this helped.
All you need to do is add a space
self.collectionView?.insertItems(at: [IndexPath.init(row: (self.flights.count - 1), section: 0)])
I'm new to Perl and I have been learning about the Perl basics for past two days.
I'm converting a Perl script to Java program gradually.
In the Perl script, I came across this code.
if( $arr[$i]=~/^0$/ ){
...
...
}
I know that $arr[$i] means getting the ith element from the array arr.
But what does =~/^0$/ mean?
To what are they comparing the array's element?
I searched for this, but I couldn't find it.
Could someone please explain this to me?
FYI, the arr contains floating values.
if ($arr[$i] =~ /^0$/) is roughly equivalent to if ($arr[$i] eq "0"), but not exactly the same, as it will match both the strings "0" and "0\n". If $arr[$i] was read from a file or stdin and has not been chomped, this can be a very significant distinction.
if ($arr[$i] == 0), on the other hand, will match any string beginning with a non-numeric character or a string of zeroes/whitespace which is not followed by a numeric character, although it will generate a warning if the string contains non-whitespace, non-digit characters or contains only whitespace (and warnings are enabled, of course).
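The three comparisons can be put side by side. The sample values below are made up for this sketch; each one is a valid numeric string, so == does not warn.

```perl
use strict;
use warnings;

my @values = ("0", "0\n", "0.0");

my @regex_hits = grep { /^0$/ }     @values;    # "0" and "0\n"
my @eq_hits    = grep { $_ eq "0" } @values;    # only "0"
my @num_hits   = grep { $_ == 0 }   @values;    # all three

printf "regex: %d, eq: %d, ==: %d\n",
    scalar @regex_hits, scalar @eq_hits, scalar @num_hits;
# prints: regex: 2, eq: 1, ==: 3
```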
=~ is a binding operator.
"Binary "=~" binds a scalar expression to a pattern match"
/^0$/ on the right hand side is the regex
^ Match the beginning of the line
$ Match the end of the line (or before newline at the end)
And the zero has no special meaning.
^ and $ are regex anchors; together they say that $arr[$i] must begin with 0 and that the end of the string (or a final newline) must follow immediately after it.
It can be written as
if ($arr[$i] eq "0" or $arr[$i] eq "0\n")