Let's take the following minimalistic script:
#!/usr/bin/perl
#
# conditional_operator.pl
#
use strict;
print ( 1 ? "true" : "false" )." statement\n";
exit;
I expect the output always to be "true statement". But when I execute this snippet, I see ...
deviolog@home:~/test$ perl conditional_operator.pl
true
The " statement\n" concatenation seems to be ignored.
My Perl version is v5.14.2. I read the perlop manual about the conditional operator and think a string concatenation should be possible.
Can somebody explain this behaviour?
Always include use warnings; at the top of every script.
To get your desired behavior, just add parentheses so print is called with the entire argument instead of just the first part:
print(( 1 ? "true" : "false" )." statement\n");
If you'd had warnings turned on, you would've gotten this alert:
Useless use of concatenation (.) or string in void context
You can also avoid the undesired behavior by leading with a blank concatenation, or you could put a plus sign before the parentheses:
print +( 1 ? "true" : "false" )." statement\n";
print ''.( 1 ? "true" : "false" )." statement\n";
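Another way to sidestep the parsing rule is to build the string first and print it afterwards. This is only a minimal sketch; the variable name $message is made up for illustration:

```perl
use strict;
use warnings;

# Build the whole string first, so print receives one complete argument.
my $message = ( 1 ? "true" : "false" ) . " statement\n";
print $message;
```

Since print is no longer followed directly by an opening parenthesis, it is not interpreted as a function call that ends at the closing parenthesis.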
Adding use warnings to your code gives this:
print (...) interpreted as function at ./cond line 10.
Useless use of concatenation (.) or string in void context at ./cond line 10.
Even better, add use diagnostics and you get this:
print (...) interpreted as function at ./cond line 11 (#1)
(W syntax) You've run afoul of the rule that says that any list operator
followed by parentheses turns into a function, with all the list
operators arguments found inside the parentheses. See
"Terms and List Operators (Leftward)" in perlop.
Useless use of concatenation (.) or string in void context at ./cond line 11 (#2)
(W void) You did something without a side effect in a context that does
nothing with the return value, such as a statement that doesn't return a
value from a block, or the left side of a scalar comma operator. Very
often this points not to stupidity on your part, but a failure of Perl
to parse your program the way you thought it would. For example, you'd
get this if you mixed up your C precedence with Python precedence and
said
$one, $two = 1, 2;
when you meant to say
($one, $two) = (1, 2);
Another common error is to use ordinary parentheses to construct a list
reference when you should be using square or curly brackets, for
example, if you say
$array = (1,2);
when you should have said
$array = [1,2];
The square brackets explicitly turn a list value into a scalar value,
while parentheses do not. So when a parenthesized list is evaluated in
a scalar context, the comma is treated like C's comma operator, which
throws away the left argument, which is not what you want. See
perlref for more on this.
This warning will not be issued for numerical constants equal to 0 or 1
since they are often used in statements like
1 while sub_with_side_effects();
String constants that would normally evaluate to 0 or 1 are warned
about.
Perl wants to explain the problems to you. You just need to ask it what you're doing wrong :-)
First, please note that I ask this question out of curiosity, and I'm aware that using variable names like @# is probably not a good idea.
When using double quotes (or the qq operator), scalars and arrays are interpolated:
$v = 5;
say "$v"; # prints: 5
$@ = 6;
say "$@"; # prints: 6
@a = (1,2);
say "@a"; # prints: 1 2
Yet, with array names of the form @ + special character, like @#, @!, @,, @%, @; etc., the array isn't interpolated:
@; = (1,2);
say "@;"; # prints nothing
say @; ; # prints: 1 2
So here is my question: does anyone know why such arrays aren't interpolated? Is it documented anywhere?
I couldn't find any information or documentation about that. There are too many articles/posts on google (or SO) about the basics of interpolation, so maybe the answer was just hidden in one of them, or at the 10th page of results..
If you wonder why I could need variable names like those :
The -n (and -p for that matter) flag adds a semicolon ; at the end of the code (I'm not sure it works on every version of perl though). So I can make this program perl -nE 'push@a,1;say"@a"}{say@a' shorter by doing instead perl -nE 'push@;,1;say"@;"}{say@', because that last ; converts say@ to say@;. Well, actually I can't do that because @; isn't interpolated in double quotes. It won't be useful every day of course, but in some golfing challenges, why not!
It can be useful to obfuscate some code. (whether obfuscation is useful or not is another debate!)
Unfortunately I can't tell you why, but this restriction comes from code in toke.c that goes back to perl 5.000 (1994!). My best guess is that it's because Perl doesn't use any built-in array punctuation variables (except for @- and @+, added in 5.6 (2000)).
The code in S_scan_const only interprets @ as the start of an array if the following character is
a word character (e.g. @x, @_, @1), or
a : (e.g. @::foo), or
a ' (e.g. @'foo (this is the old syntax for ::)), or
a { (e.g. @{foo}), or
a $ (e.g. @$foo), or
a + or - (the arrays @+ and @-), but not in regexes.
As you can see, the only punctuation arrays that are supported are @- and @+, and even then not inside a regex. Initially no punctuation arrays were supported; @- and @+ were special-cased in 2000. (The exception in regex patterns was added to make /[\c@-\c_]/ work; it used to interpolate @- first.)
There is a workaround: Because @{ is treated as the start of an array variable, the syntax "@{;}" works (but that doesn't help your golf code because it makes the code longer).
Perl's documentation says that the result is "not strictly predictable".
The following, from perldoc perlop (Perl 5.22.1), refers to interpolation of scalars. I presume it applies equally to arrays.
Note also that the interpolation code needs to make a decision on
where the interpolated scalar ends. For instance, whether
"a $x -> {c}" really means:
"a " . $x . " -> {c}";
or:
"a " . $x -> {c};
Most of the time, the longest possible text that does not include
spaces between components and which contains matching braces or
brackets. because the outcome may be determined by voting based on
heuristic estimators, the result is not strictly predictable.
Fortunately, it's usually correct for ambiguous cases.
Some things are just because "Larry coded it that way". Or as I used to say in class, "It works the way you think, provided you think like Larry thinks", sometimes adding "and it's my job to teach you how Larry thinks."
I am trying to extract a part of a string and put it into a new variable. The string I am looking at is:
maker-scaffold_26653|ref0016423-snap-gene-0.1
(inside a $gene_name variable)
and the thing I want to match is:
scaffold_26653|ref0016423
I'm using the following piece of code:
my $gene_name;
my $scaffold_name;
if ($gene_name =~ m/scaffold_[0-9]+\|ref[0-9]+/) {
$scaffold_name = $1;
print "$scaffold_name\n";
}
I'm getting the following error when trying to execute:
Use of uninitialized value $scaffold_name in concatenation (.) or string
I know that the pattern is right, because if I use $' instead of $1 I get
-snap-gene-0.1
I'm at a bit of a loss: why will $1 not work here?
If you want to use a value from the match, you have to put () around the part of the regex you want to capture.
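Applied to the code in the question, that could look like the sketch below. The only changes are the added parentheses and an initial value for $gene_name, taken from the string quoted in the question:

```perl
use strict;
use warnings;

my $gene_name = 'maker-scaffold_26653|ref0016423-snap-gene-0.1';
my $scaffold_name;

# The parentheses create capture group 1, which is what fills $1.
if ($gene_name =~ m/(scaffold_[0-9]+\|ref[0-9]+)/) {
    $scaffold_name = $1;
    print "$scaffold_name\n";    # prints: scaffold_26653|ref0016423
}
```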
To expand on Jens' answer, () in a regex signifies an anonymous capture group. The content matched in a capture group is stored in $1-9+ from left to right, so for example,
/(..):(..):(..)/
on an HH:MM:SS time string will store hours, minutes, and seconds in $1, $2, $3 respectively. Naturally this begins to become unwieldy and is not self-documenting, so you can assign the results to a list instead:
my ($hours, $mins, $secs) = $time =~ m/(..):(..):(..)/;
So your example could bypass the use of $ variables by doing direct assignment:
my ($scaffold_name) = $gene_name =~ m/(scaffold_[0-9]+[|]ref[0-9]+)/;
# $scaffold_name now contains 'scaffold_26653|ref0016423'
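One thing the direct assignment does not show is what happens when the match fails. A rough sketch, assuming a second, made-up input that has no scaffold part: a failed match returns an empty list, so the list assignment itself can serve as the condition.

```perl
use strict;
use warnings;

# Hypothetical inputs: the first is from the question, the second is invented.
my @gene_names = (
    'maker-scaffold_26653|ref0016423-snap-gene-0.1',
    'maker-unplaced-snap-gene-0.1',
);

my @report;
for my $gene_name (@gene_names) {
    # A failed match returns an empty list, so the list assignment is false.
    if ( my ($scaffold_name) = $gene_name =~ m/(scaffold_[0-9]+[|]ref[0-9]+)/ ) {
        push @report, "matched: $scaffold_name";
    }
    else {
        push @report, "no match: $gene_name";
    }
}
print "$_\n" for @report;
```

This avoids the "uninitialized value" warning from the question, because the variable is only used inside the branch where the match succeeded.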
You can even get rid of the ugly =~ binding by using for as a topicalizer:
my $scaffold_name;
for ($gene_name) {
($scaffold_name) = m/(scaffold_\d+[|]ref\d+)/;
print $scaffold_name;
}
If things start to get more complex, I prefer to use named capture groups (introduced in Perl v5.10.0):
$gene_name =~ m{
(?<scaffold_name> # ?<name> creates a named capture group
scaffold_\d+? # 'scaffold' and its trailing digits
[|] # Literal pipe symbol
ref\d+ # 'ref' and its trailing digits
)
}xms; # The x flag lets us write more readable regexes
print $+{scaffold_name}, "\n";
The results of named capture groups are stored in the magic hash %+. Access is done just like any other hash lookup, with the capture groups as the keys. %+ is locally scoped in the same way the $ are, so it can be used as a drop-in replacement for them in most situations.
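As a small illustration with more than one named group, here is the HH:MM:SS example from earlier rewritten with names. The sample value and the %parts hash are assumptions for the sketch; copying %+ straight away is just a precaution, since a later successful match would replace its contents.

```perl
use strict;
use warnings;

my $time = '12:34:56';    # sample value, assumed for illustration
my %parts;

if ( $time =~ m{ (?<hours>\d\d) : (?<mins>\d\d) : (?<secs>\d\d) }x ) {
    # Copy %+ right away; the next successful match would overwrite it.
    %parts = %+;
}
print "$parts{hours}h $parts{mins}m $parts{secs}s\n";
```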
It's overkill for this particular example, but as regexes start to get larger and more complicated, this saves you the trouble of either having to scroll all the way back up and count anonymous capture groups from left to right to find which of those darn $ variables is holding the capture you wanted, or scan across a long list assignment to find where to add a new variable to hold a capture that got inserted in the middle.
My personal rule of thumb is to assign the results of anonymous captures to descriptively named lexically scoped variables for three or fewer captures, then switch to using named captures, comments, and indentation in regexes when more are necessary.
Perl is not as easy as I thought; it's quite confusing. I just moved on to operators and wrote some code, but I can't figure out how the compiler treats it.
$in = "42" ;
$out = "56"+32+"good";
print $out,;
The output of the above code is 88. Where has the good gone? Now let's look at the other one.
$in ="42";
$out="good52"+32;
print $out ;
For this one the output is 32. Where has the good that we just stored in $out gone, and what about the 52 inside the quotes? Why does the compiler print just 32 and not the remaining text? My other question is:
$in=52;
$in="52";
Both seem to do the same thing: "52" does not behave as text, because "52"+32 gives 84. What is happening? And:
$in = "hello";
$in = hello;
Do both do the same work, or do they differ? When I print them they give the same output. It's confusing: if "52" and 52, or "hello" and hello, do the same job, why were the quotes introduced at all? I just need an explanation of what is happening in the code above.
In Perl, + is a numeric operator. It tries to interpret its two operands as numbers. 51 is the number 51. "51" is a string containing two digits, and the + operator tries to convert the string to a number, which is 51, and uses it in the calculation. "hello" is a string containing five letters, and when the + operator tries to interpret that as a number, it equates to 0 (zero).
Your first example is thus:
$out = "56"+32+"good";
which evaluates just like:
$out = 56 + 32 + 0;
Your print then converts that to a string on output, and yields 88.
In perl, the + operator will treat its arguments as numbers, and try to convert anything that is not a number to a number. The . (dot) operator is used to join strings: it will try to convert its operands to strings if they aren't already strings.
If you put:
use strict;
use warnings;
At the top of your script, you would get warnings such as:
Argument "good" isn't numeric in addition (+) at ...
Argument "good52" isn't numeric in addition (+) at ...
Perl automatically converts a string value to a number when it is used numerically, if possible. So "42" + 10 evaluates to 52. But it cannot do that with a non-numeric string such as "good", which is treated as 0.
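If you would rather check a value before doing arithmetic with it than rely on the warning, one option is looks_like_number from the core Scalar::Util module. A minimal sketch using the strings from the question; the %is_numeric hash is only there for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my %is_numeric;
for my $value ( "42", "good", "good52" ) {
    # looks_like_number returns true if Perl would treat the whole value as a number.
    $is_numeric{$value} = looks_like_number($value) ? 1 : 0;
    print "$value: ", ( $is_numeric{$value} ? "numeric" : "not numeric" ), "\n";
}
```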
In perl, a string in a numerical context (like when you use a + operator) is converted to a number.
In perl, you can concatenate string using the . (dot) operator, not +.
If you use +, perl will try and interpret all of the operands as numbers. This works well for strings that are number representations, otherwise you get 0. This explains what you see.
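The difference between the two operators can be seen side by side in a short sketch. The "52good" case is not from the question; it is added here to show that leading digits are still used. The numeric warnings are switched off only to keep the demonstration quiet:

```perl
use strict;
use warnings;
no warnings 'numeric';    # silence "isn't numeric" warnings for this demonstration

my $sum      = "56" + 32 + "good";     # 56 + 32 + 0
my $leading  = "good52" + 32;          # 0 + 32: the string does not start with a number
my $trailing = "52good" + 32;          # 52 + 32: the leading digits are used
my $joined   = "56" . 32 . "good";     # string concatenation

print "$sum $leading $trailing $joined\n";    # prints: 88 32 84 5632good
```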
$in=52;
$in="52";
both doing the same work "52" not working as a text . becuase when we add "52"+32 it gives as 84.
The problem here is not with the variable definition. One is a string and the other a number. But when you use the string in a numerical expression (+), then it will be converted to a number.
About your second question:
$in = "hello" defines a string, as you expect;
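$in = hello; uses a bareword. Without use strict;, Perl treats hello as the string "hello" (unless a subroutine named hello is defined, in which case it is called), which is why both print the same thing. This is not "strict" Perl: if you put use strict; in your file, perl will complain about it.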
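To see the effect of strict on the bareword without crashing the script, the two variants can be compiled separately with a string eval. This is just a sketch, and it assumes no subroutine named hello exists:

```perl
use strict;
use warnings;

# String eval lets us compile each variant separately and inspect the outcome.
my $loose     = eval q{ no strict 'subs'; no warnings; my $in = hello; $in };
my $strict_ok = eval q{ use strict; my $in = hello; 1 };
my $error     = $@;    # compile error from the strict variant

print "without strict: $loose\n";
print "with strict: ", ( $strict_ok ? "compiled" : "rejected" ), "\n";
print $error;
```

The second variant fails to compile, and $error holds the message about the bareword not being allowed under "strict subs".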
First off, give this a read.
Your problem is that the + is a mathematical addition, which doesn't work on strings. If you use that, Perl will assume that you're working with numbers and therefore discard anything that isn't.
To concatenate strings, use .:
$str = "blah " . "blah " . "blah";
As far as the difference between "52" and 52 goes, there isn't one. Since nothing (commands, comments, etc.) in Perl can start with numbers, the compiler doesn't need the quotes to know what to do.
1 @backwards = reverse qw(yabba dabba doo);
2 $backwards = reverse qw(yabba dabba doo);
3
4 print @backwards; #gives doodabbayabba
5 print $backwards."\n"; #gives oodabbadabbay
6 print @backwards."\n"; #gives 3
In the above code why does line 6 give 3 as the output? Why does it convert to scalar context if it's concatenated with a \n?
Thanks
Line 6 gives 3 because the . operator imposes scalar context on both its operands, and @backwards in a scalar context yields 3 (because any array in a scalar context reports the number of elements in it).
Try:
print "@backwards\n";
To join the elements in an array, use join:
print join(', ', @backwards) . "\n";
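The variants can be compared in one place. A small sketch, storing each result in a variable (the variable names are made up) so the context that produced it is easy to see:

```perl
use strict;
use warnings;

my @backwards = reverse qw(yabba dabba doo);

my $count        = @backwards;                         # scalar context: element count
my $concatenated = @backwards . "\n";                  # "." forces scalar context: "3\n"
my $interpolated = "@backwards\n";                     # elements joined with spaces
my $joined       = join( ', ', @backwards ) . "\n";    # elements joined with ", "

print $concatenated, $interpolated, $joined;
```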
Your question is ultimately "why is @backwards in scalar context in line 6", which
begs the question, "how can I determine a term's context?".
Context is determined by "what is around" (i.e. its "context") the term.
How can I determine a term's context? By looking at the operator/function that
is using the term.
What steps could you follow to figure out the context for @backwards yourself if you
didn't have helpful stackoverflow folks around to tell you its context?
Here we have
print @backwards."\n"
so there are two operators/functions. How do we know which one provides context
to @backwards? By consulting precedence. Near the top of perlop.pod we have Perl's
precedence chart (print is a "list operator"):
left terms and list operators (leftward)
...
left + - .
...
nonassoc list operators (rightward)
Oh great, now we need to know whether print is leftward or rightward. By consulting
the "Terms and List Operators (Leftward)" section in perlop (right after the
precedence list) we see that print is rightward here, because we have not enclosed
its arguments in parentheses.
So concatenation is higher precedence, so concatenation provides context to @backwards.
Next step is to check the docs (perlop again) for concatenation:
Binary "." concatenates two strings.
Strings are scalars, so binary "." concatenates two scalars.
And we finally have it!
@backwards has scalar context because concatenation provides scalar context to
each of its operands.
Woo. That was easy, wasn't it :-)
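If reading the docs is not convincing enough, the context can also be observed directly with wantarray. The sketch below uses a hypothetical helper sub named context that reports how it was called:

```perl
use strict;
use warnings;

# Hypothetical helper that reports the context it was called in.
sub context {
    return wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
}

my @in_list      = ( context(), "\n" );    # element of a comma list: list context
my $concatenated = context() . "\n";       # operand of ".": scalar context

print $in_list[0], "\n";
print $concatenated;
```

The same call reports a different context depending only on the operator next to it, which is the point of the precedence walk above.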
Context is tricky. And sometimes a couple of pixels can make all the difference. Consider the following two lines:
print localtime, "\n";
print localtime. "\n";
Did you spot the difference?
The two statements look really similar. But what is going on in the two cases is very different.
In the first statement, print is passed a list of arguments. And the documentation for print says that it imposes list context on its arguments. Therefore the call to localtime is evaluated in list context and the function returns a list of numbers.
In the second statement, print receives a single argument. That argument is a string created by concatenating the results of calling localtime with a newline character. The concatenation operator imposes scalar context on both of its operands. Therefore the call to localtime returns the current time in a human-readable string format. That string is then concatenated with the newline and the resulting string is passed to print. The print call still evaluates its argument in list context, but that context makes no difference to the way that the single string argument is interpreted.
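The two behaviours can be made visible by storing the results instead of printing them directly. This sketch swaps in gmtime(0) for localtime, purely so that the result does not depend on the current time or time zone; gmtime follows the same list/scalar rules.

```perl
use strict;
use warnings;

my @fields = gmtime(0);           # list context: (sec, min, hour, mday, mon, year, ...)
my $string = gmtime(0) . "\n";    # scalar context: a human-readable timestamp

print scalar(@fields), " fields\n";    # prints: 9 fields
print $string;
```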
For another, similar, look at the weirder corners of context in Perl, see the question I posted on my blog a couple of days ago. I'll be posting the solution in a few days time.
Is there any reason to use a scalar comma operator anywhere other than in a for loop?
Since the Perl scalar comma is a "port" of the C comma operator, these comments are probably apropos:
Once in a while, you find yourself in
a situation in which C expects a
single expression, but you have two
things you want to say. The most
common (and in fact the only common)
example is in a for loop, specifically
the first and third controlling
expressions. What if (for example) you
want to have a loop in which i counts
up from 0 to 10 at the same time that
j is counting down from 10 to 0?
So, your instinct that it's mainly useful in for loops is a good one, I think.
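The loop described in that quote carries over to Perl almost unchanged. A minimal sketch; the @pairs array is only there to record what each iteration saw:

```perl
use strict;
use warnings;

my @pairs;

# The third expression uses the scalar comma operator to update both counters.
for ( my ( $i, $j ) = ( 0, 10 ); $i <= 10; $i++, $j-- ) {
    push @pairs, "$i/$j";
}
print "@pairs\n";
```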
I occasionally use it in the conditional (sometimes erroneously called "the ternary") operator, if the code is easier to read than breaking it out into a real if/else:
my $blah = condition() ? do_this(), do_that() : do_the_other_thing();
It could also be used in some expression where the last result is important, such as in a grep expression, but in this case it's just the same as if a semicolon was used:
my @results = grep { setup(), condition() } @list;