Why does this subroutine work if I type out its arguments literally, but not if I give the arguments in the form of a variable? - perl

I am using a Perl package (Biomart) that includes a subroutine called addFilter(). That subroutine takes a couple of arguments, including one that must be in the format "nr:nr:nr".
If I use the subroutine as follows, it works fine:
$query->addFilter("chromosomal_region", ["1:1108138:1108138","1:1110294:1110294"]);
However, if I use it like this, it does not work:
my $string = '"1:1108138:1108138","1:1110294:1110294","1:1125105:1125105"';
$query->addFilter("chromosomal_region", ['$string']);
Since there are tens of thousands of those arguments that I construct in a for loop, I really need the second way to work... What could be causing this? I hope someone can help me out, many thanks in advance!

Because you seem to be trying to write in a language that's not Perl. '"this","that","another"' isn't an array, it's a string. And '$string' doesn't interpolate or include $string in any way because it uses single quotes. It just produces a string that starts with a dollar sign and ends with "string".
Something more like what you intend would be:
my @things = ("1:1108138:1108138","1:1110294:1110294","1:1125105:1125105");
$query->addFilter("chromosomal_region", \@things);
-or-
$query->addFilter("chromosomal_region", [ @things ] );
And to build it up dynamically, you can simply do push @things, $value in a loop or whatever you need.
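As a rough sketch of the "push in a loop" idea: the @positions data and the add_filter_stub sub below are made up for illustration (the stub only stands in for $query->addFilter so the snippet runs without Biomart). The point is that each region becomes its own array element, and the array is passed by reference.

```perl
use strict;
use warnings;

# Hypothetical input: chromosome/position pairs (invented for this example).
my @positions = ( [ 1, 1108138 ], [ 1, 1110294 ], [ 1, 1125105 ] );

# Build one "chr:start:end" string per pair and collect them in an array.
my @things;
for my $pos (@positions) {
    my ( $chr, $at ) = @$pos;
    push @things, "$chr:$at:$at";
}

# Stand-in for $query->addFilter; it only reports what it received.
sub add_filter_stub {
    my ( $name, $values ) = @_;
    return "$name gets " . scalar(@$values) . " values";
}

print add_filter_stub( "chromosomal_region", \@things ), "\n";
```

With the real module, the last line would presumably become $query->addFilter("chromosomal_region", \@things), as in the answer above.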

'$string' is literally "$string"; the variable isn't replaced with its contents. Lose the single quotes.
Of course, it's unlikely passing a reference to an array consisting of a single comma-separated string with quotes embedded in it is going to do the same thing as passing a reference to an array of strings.
Try something like:
my $ref = ["1:1108138:1108138","1:1110294:1110294"];
$query->addFilter("chromosomal_region", $ref);

I agree with hobbs. If you want to take many inputs like that, you can use a loop and an array like this (provided you are reading the inputs from STDIN):
my @values;
while (defined(my $line = <STDIN>)) {
    chomp($line);
    last if $line eq "end";
    push @values, $line;
}
It reads the lines into the @values array. You mark the end of the data with a line containing "end".
As for your error, what the others said is right: Perl interpolates variables in double-quoted strings, not in single-quoted ones.
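A small sketch of the quoting difference, using a made-up $string value; nothing here is specific to Biomart:

```perl
use strict;
use warnings;

my $string = "1:1108138:1108138";

my $single = '$string';    # no interpolation: seven literal characters
my $double = "$string";    # interpolation: the variable's contents

print "$single\n";
print "$double\n";
```

The first print shows the literal text $string, because interpolation is not applied a second time to the contents of $single.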

Related

Topicalising a variable using "for" is apparently bad. Why?

So I answered a question on SO and got a lot of flak for it.
I have been using Perl for many years and use this topicalising quite a lot.
So let's start with some code. I am doing search and replace in these examples. The idea is to search for one and three from two strings and replace them.
$values = 'one two three four five';
$values2 = 'one 2 three four 5';
$values =~ s/one//g;
$values =~ s/three//g;
$values2 =~ s/one//g;
$values2 =~ s/three//g;
This code is simple and everyone accepts it.
I can also build an array or hash with a list of values to search and replace which is also acceptable.
However, when I write a script that topicalises $values and $values2 to reduce the amount of typing, it seems to be misunderstood.
Here is the code.
$values = 'one two three four five';
$values2 = 'one 2 three four 5';
for ( $values, $values2 ) {
s/one//g;
s/three//g;
}
The above code will topicalise the variables for the duration of the for block, but many programmers are against this. I want to understand why this is unacceptable.
There are several points to consider.
Your code performs multiple substitutions on a list of variables. You can do that without using $_:
for my $s ($values, $values2) {
$s =~ s/one//g;
$s =~ s/three//g;
}
Personally I think nothing is wrong with the above code.
The general problem with $_ is that it's not a local variable. E.g. if the body of your for loop calls a function (that calls a function ...) that modifies $_ without localizing it (e.g. by assigning to it or using a bare s/// or using while (<...>)), then it will overwrite the variables you're iterating over. With a my variable you're protected because other functions can't see your local variables.
That said, if the rest of your code doesn't have this bug (scribbling over $_ without localizing it), $_ will work fine here.
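To make the "scribbling over $_" risk concrete, here is a minimal sketch. The subs careless and careful are hypothetical helpers invented for this example; one assigns to $_ directly, the other localizes it first.

```perl
use strict;
use warnings;

# Hypothetical helper that assigns to the global $_ without localizing it.
sub careless { $_ = 'clobbered'; return }

# Same idea, but "local" restores the caller's $_ when the sub returns.
sub careful { local $_ = 'scratch'; return }

my ( $x, $y ) = ( 'one two', 'one 2' );
for ( $x, $y ) {
    careful();     # leaves the aliased variable alone
    s/one//g;
}

my ( $p, $q ) = ( 'one two', 'one 2' );
for ( $p, $q ) {
    careless();    # overwrites $p, then $q, through the alias
    s/one//g;
}

print "[$x] [$y]\n";
print "[$p] [$q]\n";
```

In the second loop the original strings are lost before the substitution ever runs, which is the failure mode a my loop variable protects against.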
However, the code in your answer people originally complained about is different:
for ($brackets) {
s/\\left/(/g;
s/\\right/)/g;
}
Here you're not trying to perform the same substitutions on many variables; you're just saving yourself some typing by getting rid of the $brackets =~ part:
$brackets =~ s/\\left/(/g;
$brackets =~ s/\\right/)/g;
Using an explicit loop variable wouldn't be a solution because you'd still have to type out $foo =~ on every line.
This is more a matter of taste. You're only using for for its aliasing effect, not to loop over multiple values. Personally I'm still OK with this.
perldoc perlsyn has this
The foreach is the non-experimental way to set a topicalizer.
The OP's construct is a perfectly valid way of writing Perl code. The only reservations I have regarding their earlier answer are
Unlike the example here, only two operations were being applied to a single variable. That is only marginally briefer than simply writing two substitutions and I wouldn't bother here, although I might consider
s/one//g, s/three//g for $value;
Other than the topicaliser, the answer is identical to another one already posted. I don't believe this makes it sufficiently different to warrant another post.

Perl: Grep in an array

I have an array in the below format
array
Link-IF-A<->IF-B
Link-IF-C<->IF-D
Link-IF-E<->IF-F
Link-IF-G<->IF-H
Link-IF-I<->IF-J
I am trying to search for the interface "IF-D", but the result always shows as 0.
I want to see 1 when it matches, else 0. I have tried all of the methods below, but every time the result is 0.
$link = IF-D
method 1:
my $result = grep /$link/,@array;
method 2:
my $result = grep /^$link,/,@array;
method 3:
my $result = grep(/^$link$/, @array)
Thanks
Your second and third methods can never match, as none of your target strings begin with, or contain only IF-D. The second method uses ^, anchoring to the beginning of your target string, and the third method contains both ^ and $ mandating that the pattern match the entire target string, not just some portion of it. So those will always fail (it appears that you're just trying things at random to see if they work, and especially in the case of regular expressions, that's not a good way to accomplish the goal.)
The first example will match one time; the 2nd element, because the pattern IF-D matches at the end of the target string Link-IF-C<->IF-D. However, it's only going to work if your target string and your pattern are what you think they are. In the example code you showed us, the pattern string wasn't wrapped in quotes. It must be.
So, for example, this will do what you seem to want:
my $link = "IF-D";
my @array = qw(
Link-IF-A<->IF-B
Link-IF-C<->IF-D
Link-IF-E<->IF-F
Link-IF-G<->IF-H
Link-IF-I<->IF-J
);
my $found = grep /\Q$link/, @array;
print "$found\n"; # 1
The \Q isn't strictly necessary for the pattern you've demonstrated. That construct forces the contents of $link to be treated literally, rather than possibly as regex metacharacters. Your example pattern doesn't contain any metacharacters, but if it accidentally did, the \Q would escape them.
If you think you've implemented something semantically equal to this example, and yet it's not working, then you've found exactly why people ask that those asking questions post a small self-contained snippet of code that demonstrates the behavior they're describing. If my example code doesn't clear up the problem, boil your code down to a single snippet that demonstrates the problem, and add it as an update to your question so that we can run it ourselves and see exactly what you're talking about.
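To illustrate what \Q buys you, here is a small sketch with a pattern that does contain a metacharacter. The value "IF.D" and the extra element IFxD-only are invented for the example.

```perl
use strict;
use warnings;

my @array = qw(Link-IF-A<->IF-B Link-IF-C<->IF-D IFxD-only);

# Hypothetical pattern containing a metacharacter: "." matches any character.
my $link = "IF.D";

# grep in scalar context returns the number of matching elements.
my $unquoted = grep { /$link/ } @array;        # "." matches "-" and "x"
my $quoted   = grep { /\Q$link\E/ } @array;    # "." must be a literal dot

print "plain pattern: $unquoted, quoted pattern: $quoted\n";
```

Since grep returns a count in scalar context, a strict 1-or-0 result would need something like $count ? 1 : 0 on top.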
You must use double quote or single quote in your assignment:
$link = "IF-D";
instead of $link = IF-D.
use strict;
use warnings;
my @array = qw(
Link-IF-A<->IF-B
Link-IF-C<->IF-D
Link-IF-E<->IF-F
Link-IF-G<->IF-H
Link-IF-I<->IF-J
);
my $link = "IF-D";
print scalar grep /$link/, @array;

A better variable naming system?

I'm a newbie to programming. The task is to extract particular data from a string, and I chose to write the code as follows:
while ($line =<IN>) {
chomp $line;
@tmp=(split /\t/, $line);
next if ($tmp[0] !~ /ch/);
@tgt1=@tmp[8..11];
@tgt2=@tmp[12..14];
@tgt3=@tmp[15..17];
@tgt4=@tmp[18..21];
foreach (1..4) {
print @tgt($_), "\n";
}
I thought @tgt($_) would be interpreted as @tgt1, @tgt2, @tgt3, @tgt4, but I still get the error message that @tgt is a global symbol (@tgt1, @tgt2, @tgt3, @tgt4 have been declared).
Q1. Did I misunderstand foreach loop?
Q2. Why couldn't perl see @tgt($_) as @tgt1, @tgt2 ..etc?
Q3. From experience, this is probably a bad way to name variables. What would be a preferred way to name variables that have similar features?
Q2. Why couldn't perl see @tgt($_) as @tgt1, @tgt2 ..etc?
Q3. From experience, this is probably a bad way to name variables. What would be a preferred way to name variables that have similar features?
I'll answer both together.
@tgt($_) does NOT mean what you hope it means
First off, it's invalid syntax (you can't use () after an array name; the perl interpreter will produce a compile error).
What you're trying to do is access distinct variables by accessing a variable via an expression resulting in its name (aka symbolic references). This IS possible to do; but is typically a bad idea and poor-style Perl (as in, you CAN but you SHOULD NOT do it, without a very very good reason).
To access array number $_ the way you tried, you use the @{"tgt$_"} syntax. But I repeat - Do Not Do That, even if you can.
A correct idiomatic solution: use an array of arrayrefs, with your 1-4 (or rather 0-3) indexing the outer array:
# Old bad code: @tgt1=@tmp[8..11];
# New correct code:
$tgt[0]=[ @tmp[8..11] ]; # [] creates an array reference from a list.
# etc... repeat 4 times - you can even do it in a smart loop later.
What this does is store a reference to a copy of the array slice in the zeroth element of a single @tgt array.
At the end, the @tgt array has 4 elements, each a reference to an array containing one of the slices.
Q1. Did I misunderstand foreach loop?
Your foreach loop (as opposed to its contents - see above) was correct, with one style caveat - again, while you CAN use a default $_ variable, you should almost never use it, instead always use named variables for readability.
You print the abovementioned array of arrayrefs as follows (ask separately if any of the syntax is unclear - this is a mid-level data structure handling, not for beginners):
foreach my $index (0..3) {
print join(",", @{ $tgt[$index]}) . "\n";
}
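Putting the pieces together, a minimal sketch of the array-of-arrayrefs approach might look like this. The input row (fields f0 to f21) is invented so the snippet runs on its own; the column ranges are the ones from the question, and the map over @ranges is just one possible "smart loop".

```perl
use strict;
use warnings;

# Hypothetical row of 22 tab-separated fields, f0 .. f21.
my $line = join "\t", map { "f$_" } 0 .. 21;
my @tmp  = split /\t/, $line;

# Column ranges taken from the question: 8..11, 12..14, 15..17, 18..21.
my @ranges = ( [ 8, 11 ], [ 12, 14 ], [ 15, 17 ], [ 18, 21 ] );

# One array reference per slice, all stored in a single @tgt array.
my @tgt = map { [ @tmp[ $_->[0] .. $_->[1] ] ] } @ranges;

for my $index ( 0 .. $#tgt ) {
    print join( ",", @{ $tgt[$index] } ), "\n";
}
```

Adding a fifth group then only means adding one more pair to @ranges, with no new variable names.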

Preserving backslashes in Perl strings

Is there a way in Perl to preserve and print all backslashes in a string variable?
For example:
$str = 'a\\b';
The output is
a\b
but I need
a\\b
The problem is that I can't process the string in any way to escape the backslashes, because I have to read complex regular expressions from a database, don't know in which combination and number they appear, and have to print them exactly as they are on a web page.
I tried with template toolkit and html and html_entity filters. The only way it works so far is to use a single quoted here document:
print <<'XYZ';
a\\b
XYZ
But then I can't interpolate variables which makes this solution useless.
I tried to write a string to a web page, into file and on the shell, but no luck, always one backslash disappears. Maybe I am totally on the wrong track, but what is the correct way to print complex regular expressions including backslashes in all combinations and numbers without any changes?
In other words:
I have a database containing hundreds of regular expressions as string data. I want to read them with perl and print them on a web page exactly as they are in the database.
There are all the time changes to these regular expressions by many administrators so I don't know in advance how and what to escape.
A typical example would look like this:
'C:\\test\\file \S+'
but it could change the next day to
'\S+ C:\\test\\file'
Maybe a correct conclusion would be to escape every backslash exactly one time no matter in which combination and in which number it appears? This would mean it works to double them up. Then the problem isn't as big as I feared. I tested it in bash and it works with two and even three backslashes in a row (4 backslashes print 2 and 6 backslashes print 3).
The backslash only has significance to Perl when it occurs in Perl source code, e.g.: your assignment of a literal string to a variable:
my $str = 'a\\b';
However, if you read data from a file (or a database or socket etc) any backslashes in the data you read will be preserved without you needing to take any special steps.
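A quick way to convince yourself of this is sketched below. It assumes nothing about the database; an in-memory filehandle stands in for the file or database read, and chr(92) is used to build the backslashes so that no source-code escaping is involved.

```perl
use strict;
use warnings;

# Build the "stored" text without relying on source-code escapes:
# chr(92) is a backslash, so $stored holds a, two backslashes, b.
my $stored = 'a' . chr(92) . chr(92) . 'b' . "\n";

# Read it back through an in-memory filehandle, as if from a file.
open my $fh, '<', \$stored or die "cannot open in-memory file: $!";
my $read = <$fh>;
close $fh;
chomp $read;

print "$read\n";
print length($read), "\n";
```

Both backslashes survive the read and the print, because escape processing only happens when Perl parses string literals in source code.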
my $str = 'a\\b';
print $str;
This prints a\b, because in single-quoted source code \\ stands for a single backslash.
Use
my $str = 'a\\\\b';
instead
It's a PITA, but you will just have to double up the backslashes, e.g.
a\\\\b
Otherwise, you could store the backslash in another variable, and interpolate that.
The minimum to get two backslashes is (unfortunately) three backslashes:
use 5.016;
my $a = 'a\\\b';
say $a;
The problem I tried to solve does not exist. I confused initializing a string directly in the code with using HTML forms. Using a string inside the code while preserving all backslashes is only possible either with a here document or by reading a text file containing the string. But if I just use the HTML form on a web page to insert a string and use escapeHTML() from the CGI module, it takes care of everything, and you can insert the weirdest combinations of special characters. They all get displayed and preserved exactly as inserted. So I should have started directly with HTML and database operations instead of trying to examine things first by using strings directly in the code. Anyway, thanks for your help.
You can use the following regular expression to form your string correctly:
my $str = 'a\\b';
$str =~ s/\\/\\\\/g;
print "$str\n";
This prints a\\b.
EDIT:
You can use non-interpolating here-document instead:
my $str = <<'EOF';
a\\b
EOF
print "$str\n";
This still prints a\\b.
Grant's answer provided the hint I needed. Some of the other answers did not match Perl's operation on my system so ...
#!/usr/bin/perl
use warnings;
use strict;
my $var = 'content';
print "\'\"\N{U+0050}\\\\\\$var\n";
print <<END;
\'\"\N{U+0050}\\\\\\$var\n
END
print '\'\"\N{U+0050}\\\\\\$var\n'.$/;
my $str = '\'\"\N{U+0050}\\\\\\$var\n';
print $str.$/;
print @ARGV;
print $/;
Called from bash ... using the bash means of escaping in quotes which changes \' to '\''.
jamie@debian:~$ ./ft.pl '\'\''\"\N{U+0050}\\\\\\$var\n'
'"P\\\content
'"P\\\content
'\"\N{U+0050}\\\$var\n
'\"\N{U+0050}\\\$var\n
\'\"\N{U+0050}\\\\\\$var\n
The final line, with six backslashes in the middle, was what I had expected. Reality differed.
So:
"in here \" is interpolated
in HEREDOC \ is interpolated
'in single quotes only \' is interpolated and only for \ and ' (are there more?)
my $str = 'same limited \ interpolation';
perl.pl 'escape using bash rules' with @ARGV is not interpolated
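The single-quote rules can be checked by counting characters instead of eyeballing output. This is only a sketch; the sample strings are made up, and the expected lengths follow from the rule that \\ and \' are the only escapes inside single quotes.

```perl
use strict;
use warnings;

# Single quotes: only \\ and \' are escape sequences.
my @cases = (
    [ 'a\b',    3 ],    # lone backslash is kept: a \ b
    [ 'a\\b',   3 ],    # \\ collapses to one backslash
    [ 'a\\\b',  4 ],    # \\ then \b gives two backslashes
    [ 'a\\\\b', 4 ],    # two \\ pairs give two backslashes
    [ '\n',     2 ],    # no newline escape: backslash and n
);

for my $case (@cases) {
    my ( $text, $expected ) = @$case;
    printf "%-6s length %d (expected %d)\n", $text, length($text), $expected;
}
```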

perl: printing object properties

I'm playing a bit with the Net::Amazon::EC2 libraries, and can't find out a simple way to print object properties:
This works:
my $snaps = $ec2->describe_snapshots();
foreach my $snap ( @$snaps ) {
print $snap->snapshot_id . " " . $snap->volume_id . "\n";
}
But if I try:
print "$snap->snapshot_id $snap->volume_id \n";
I get
Net::Amazon::EC2::Snapshot=HASH(0x4c1be90)->snapshot_id
Is there a simple way to print the value of the property inside a print?
$snap->volume_id is not a property, it is a method call. While you could interpolate a method call inside a string, it is exceedingly ugly.
To get all the properties of an object you can use the module Data::Dumper, included with core perl:
use Data::Dumper;
print Dumper($object);
Not in the way you want to do it. In fact, what you're doing with $snap->snapshot_id is calling a method (as in sub). Perl cannot do that inside a double-quoted string. It will interpolate your variable $snap. That becomes something like HASH(0x1234567) because that is what it is: a blessed reference of a hash.
The interpolation only works with scalars (and arrays, but I'll omit that). You can go:
print "$foo $bar"; # scalar
print "$hash->{key}"; # scalar inside a hashref
print "$hash->{key}->{moreKeys}->[0]"; # scalar in an array ref in a hashref...
There is one way to do it, though: You can reference and dereference it inside the quoted string, like I do here:
use DateTime;
my $dt = DateTime->now();
print "${\$dt->epoch }"; # both these
print "@{[$dt->epoch]}"; # examples work
But that looks rather ugly, so I would not recommend it. Use your first approach instead!
If you're still interested in how it works, you might also want to look at these Perl FAQs:
What's wrong with always quoting "$vars"?
How do I expand function calls in a string?
From perlref:
Here's a trick for interpolating a subroutine call into a string:
print "My sub returned @{[mysub(1,2,3)]} that time.\n";
The way it works is that when the @{...} is seen in the double-quoted
string, it's evaluated as a block. The block creates a reference to an
anonymous array containing the results of the call to mysub(1,2,3).
So the whole block returns a reference to an array, which is then
dereferenced by @{...} and stuck into the double-quoted string. This
chicanery is also useful for arbitrary expressions:
print "That yields @{[$n + 5]} widgets\n";
Similarly, an expression that returns a reference to a scalar can be
dereferenced via ${...} . Thus, the above expression may be written
as:
print "That yields ${\($n + 5)} widgets\n";
Stick with the first sample you showed. It looks cleaner and is easier to read.
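For comparison, here are the options side by side in one runnable sketch. The Snap package is a hypothetical stand-in with two accessors (it is not the Net::Amazon::EC2 class), and the sprintf line is an extra alternative not mentioned above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an object with accessor methods.
package Snap {
    sub new         { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub snapshot_id { $_[0]{snapshot_id} }
    sub volume_id   { $_[0]{volume_id} }
}

my $snap = Snap->new( snapshot_id => 'snap-1', volume_id => 'vol-9' );

# Plain concatenation, the array-reference trick, and a format string.
my $concat = $snap->snapshot_id . " " . $snap->volume_id;
my $trick  = "@{[ $snap->snapshot_id ]} @{[ $snap->volume_id ]}";
my $format = sprintf "%s %s", $snap->snapshot_id, $snap->volume_id;

print "$_\n" for $concat, $trick, $format;
```

All three build the same string; which one reads best is a matter of taste, though the concatenation and sprintf forms are probably easier on the next reader.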
I'm answering this because it took me a long time to find this and I feel like other people may benefit as well.
For nicer printing of objects use Data::Printer and p():
use DateTime;
use Data::Printer;
my $dt = DateTime->from_epoch( epoch => time );
p($dt);
Perl only interpolates variable expressions within double quotes, not method calls. Moving the call outside the quotes should solve the problem. Or just load the real values into simple variables that you can print within the quotes. You might need to do that if you have objects which contain references to other objects:
SwissArmyChainSaw =/= PureMagic:
print("xxx".$this->{whatever}."rest of string\n");
The problem is that $snap is being interpolated inside the string, but $snap is a reference. As perldoc perlref tells us: "Using a reference as a string produces both its referent's type, including any package blessing as described in perlobj, as well as the numeric address expressed in hex."
In other words, within a string, you can't dereference $snap. Your first try was the correct way to do it.
I agree with most of the comments: stick to concatenation for easy reading. You can use say (available with use feature 'say') instead of print to avoid typing the "\n".