Backslash before a subroutine call - perl

As I was understanding the difference between [] and \ in references,I used both on subroutine the former was fine but when I tried later I thought it should give error but the below program in perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my #b;
for my $i ( 0 .. 10 ) {
$b[$i] = \somefunc($i);
}
print Dumper( \#b );
sub somefunc {
my $n = shift;
my ( #a, $k );
for my $j ( 11 .. 13 ) {
$k = $n * $j;
push( #a, $k );
}
print "a: #a \n";
return #a;
}
gives output as :
a: 0 0 0
a: 11 12 13
a: 22 24 26
a: 33 36 39
a: 44 48 52
a: 55 60 65
a: 66 72 78
a: 77 84 91
a: 88 96 104
a: 99 108 117
a: 110 120 130
$VAR1 = [
\0,
\13,
\26,
\39,
\52,
\65,
\78,
\91,
\104,
\117,
\130
];
I was unable to understand the output.Need explanation.

What is happening here is:
You are returning an array from somefunc.
But you are assigning it to a scalar. What this is effectively doing therefore, is simply putting the last value in the array, into the scalar value.
my $value = ( 110, 120, 130 );
print $value;
When you do this - $value is set to the last value in the array. This is what's actually happening in your code. See for example perldata:
List values are denoted by separating individual values by commas (and enclosing the list in parentheses where precedence requires it):
(LIST)
In a context not requiring a list value, the value of what appears to be a list literal is simply the value of the final element, as with the C comma operator. For example,
#foo = ('cc', '-E', $bar);
assigns the entire list value to array #foo, but
foo = ('cc', '-E', $bar);
assigns the value of variable $bar to the scalar variable $foo. Note that the value of an actual array in scalar context is the length of the array; the following assigns the value 3 to $foo:
#foo = ('cc', '-E', $bar);
$foo = #foo; # $foo gets 3
It's this latter case that's often the gotcha, because it's a list in a scalar context.
And in your example - the backslash prefix denotes 'reference to' - which is largely meaningless because it's a reference to a number.
But for a scalar, it might be more meaningful:
my $newvalue = "fish";
my $value = ( 110, 120, 130, \$newvalue );
print Dumper $value;
$newvalue = 'barg';
print Dumper $value;
Gives:
$VAR1 = \'fish';
$VAR1 = \'barg';
That's why you're getting the results. Prefix with the slash indicates that you're getting a reference to the result, not a reference to the sub. Reference to 130 isn't actually all that meaningful.
Normally, when doing the assignment above - you'd get a warning about Useless use of a constant (110) in void context but this doesn't apply when you've got a subroutine return.
If you wanted to insert a sub reference, you'd need to add &, but if you just want to insert the returned array by reference - you either need to:
$b[$i] = [somefunc($i)]
Or:
return \#a;

Related

How to pass optional parameters to a Perl subroutine?

I wanted to pass two arguments to my subroutine. The first parameter is required, the second parameter I want to be optional. Actually it works for non-array parameter in this script:
deftest ("string value test",9);
sub deftest{
my ($strvalue, $num) = #_;
if (defined $num){
print "\nCHECK 1 - Defined value $num";
} else {
$num //= 99;
}
print "\nstrvalue: $strvalue num: $num\n";
}
When I invoke this subroutine with ("string value test",9); then on the screen:
CHECK 1 - Defined value 9
strvalue: string value test num: 9
When I don't enter the second argument, then the value is 99. So this works fine.
The problem is, when I use the array in the first argument:
my #arrSwap = (1,2,3);
deftest2 (#arrSwap,5);
sub deftest2{
my (#arrSwap, $num) = #_;
if (defined $num){
print "\n\nCHECK 2 - Defined val $num";
} else {
$num //= 55;
}
print "\narrSwap $arrSwap[1] num: $num\n";
}
After invoking this subroutine with deftest2 (#arrSwap,5); then on the screen I have only: arrSwap 2 num: 55
(but 5 would be expected as it is defined).
Why it doesn't work with the array?
Because perl passes a single list as arguments, and assigning my ( #arrSwap, $num ) = #_ will always mean an empty $num, because #arrSwap will consume the entirety of the input.
If you want to do this, you need to pass a reference
my #arrSwap = ( 1,2,3 );
deftest2 ( \#arrSwap, 5);
sub deftest2 {
my ( $arrSwap_ref, $num ) = #_;
$num //= 55;
print "\narrSwap", $arrSwap -> [1], " num: $num\n";
}
Otherwise perl simply has no way to tell whether the last number is part of the list you sent, or an optional parameter.
The arguments to subroutines are lists in Perl. When you assign the arguments to an array, Perl doesn't know how long that array is supposed to be, so there is no way to do this:
sub frobnicate {
my (#foo, $bar) = #_;
}
In that case, $bar will always be undef, because it just assigns the rest of the list to #foo.
Instead, you need to make the array-part of your arguments a reference, and derefence it correctly.
sub frobnicate {
my ($foo, $bar) = #_;
foreach my $element (#{ $foo }) {
# ...
}
}
And then call it with a reference.
frobnicate(\#foo, 123);
Because a reference is a scalar value, you now know how many elements will be in the argument list, and you can have optional arguments at the end of the list.
The array is expanded in the parameter list and forms the first N parameters, with $num appearing at the end
If you pass the array that way then there is no way to tell from within the subroutine how many parameters came from the array, so it is always best to pass an array by reference
Your code would look like this
my #arrSwap = ( 1, 2, 3 );
deftest2( \#arrSwap, 5 );
deftest2( \#arrSwap );
sub deftest2 {
my ( $arrSwap, $num ) = #_;
if ( defined $num ) {
print "\n\nCHECK 2 - Defined val $num";
}
else {
$num = 55;
}
print "\narrSwap $arrSwap->[1] num: $num\n";
}
output
CHECK 2 - Defined val 5
arrSwap 2 num: 5
arrSwap 2 num: 55

finding highest value in hash

I have a hash with 5 keys, each of these keys have 5 values
foreach $a(#mass){
if($a=~some regex){
#value=($1,$2,$3,$4,$5);
$hash{"keysname$c"}="#value";
c++;
}
}
Each scalar is a value of different parameters , I have to determinate the highest value of the first array for the all keys in hash
Edit:
Code must compare first value of key1 with first value of key2, key3...key5 and print the highest one
This will print max value for structure like
my %hash = ( k1 => [6,4,1], k2 => [16,14,11] );
use List::Util qw(max);
# longest array
my $n = max map $#$_, values %hash;
for my $i (0 .. $n) {
my $max = max map $_->[$i], values %hash;
print "max value on position $i is $max\n";
}
and for strings,
my %hash = ( k1 => "6 4 1", k2 => "16 14 11" );
use List::Util qw(max);
# longest array
my $n = max map $#{[ split ]}, values %hash;
for my $i (0 .. $n) {
my $max = max map [split]->[$i], values %hash;
print "max value on position $i is $max\n";
}
If I understand your question correctly (and it's a little unclear) then I think you want something like this:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use List::Util 'max';
my (#data, #max);
while (<DATA>) {
chomp;
push #data, [split];
}
for my $i (0 .. $#{$data[0]}) {
push #max, max map { $_->[$i] } #data;
}
say "#max";
__DATA__
93 3 26 87 7
66 96 46 77 42
26 3 71 64 91
31 27 14 40 86
82 72 71 34 7
try this
map {push #temp, #{$_}} values %hash;
#desc_sorted= sort {$b <=> $a} #temp;
print $desc_sorted[0],"\n";
map will consolidate all lists to a single list and sort will sort that consolidated array in descending order.

compare second digit of th string in perl

How to compare except the first digit of the string? Suppose I have 713 and 213, the return is 13.
another example:
518 and 21 => no result
423 and 413 => no result
315 and 215 => 15
Thanks
with this you can do:
if(substr($str1,1) eq substr($str2,1))
tested here
The substr operator will extract substrings for you. The second parameter of substr is the offset of the start of the substring that you want, so if you want the second character onwards you must say substr $string, 1.
This program takes your own data and puts the two numbers into $i and $j. Then substr is called twice to copy the second character onwards of those strings into $i2 and $j2. The if statement compares the two values and prints output accordingly.
use strict;
use warnings;
for (
'518 and 21',
'423 and 413',
'315 and 215') {
my ($i, $j) = /\d+/g;
my $i2 = substr $i, 1;
my $j2 = substr $j, 1;
if ($i2 eq $j2) {
print "$i and $j => $i2\n";
}
else {
print "$i and $j => no result\n";
}
}
output
518 and 21 => no result
423 and 413 => no result
315 and 215 => 15

error pushing value into hash of array

I am parsing an output report from psiblast. I used COG alignments and searched a gene database for matches (homologues). One thing that I want to do is to find out which genes match to more than one COG. My partial script is below.
I am specifically having problems creating an array that holds all of the COGs for the genes that are assigned to multiple COGs.
I am getting the following error "Can't use string ("COG0003") as an ARRAY ref while "strict refs" in use at parse_POG_reports.pl line 26, line 67.".
I have looked at other posted relating to pushing elements into hashes of arrays. But I think the error might be occurring when one gene has 2 matches to the same COG, and it tries to push the same COG into the array (ie. the last 2 lines of the sample input). Does this make sense? If so, how can I avoid this problem?
use strict;
use warnings;
my %maxBits;my %COGhit_count;
my $Hohits={};my %COGhits;
my $COG_psi_report=$ARGV[0];
open (IN, $COG_psi_report) or die "cannot open $COG_psi_report\n";
while (my $line=<IN>){
next if ($line =~/^#/);
chomp $line;
my #columns = split(/\t/,$line);
my $bits=$columns[11];
my $COG=$columns[0];
my $hit=$columns[1];
my $Eval=$columns[10];
next if ($Eval > 0.00001); # threshold for significant hits set by DK
$COGhit_count{$hit}++; # count how many COGs each gene is homologous to
$COGhits{$hit}=$COG;
if ($COGhit_count{$hit}>1) {
push #{$COGhits{$hit}}, $COG; #
}
## for those that there are multiple hits we need to select top hit ##
if (!exists $maxBits{$hit}){
$maxBits{$hit}=$bits;
}
elsif (exists $maxBits{$hit} && $bits > $maxBits{$hit}){
$maxBits{$hit}=$bits;
}
$Hohits->{$hit}->{$bits}=$COG;
}
close (IN);
example Input:
POG0002 764184357-stool1_revised_scaffold22981_1_gene47608 23.90 159 112 3 1 156 1 153 2e-06 54.2
POG0002 764062976-stool2_revised_C999233_1_gene54902 23.63 182 121 5 3 169 2 180 2e-06 53.9
POG0002 763901136-stool1_revised_scaffold39447_1_gene145241 26.45 155 89 3 3 137 5 154 3e-06 53.9
POG0002 765701615-stool1_revised_C1349270_1_gene168522 23.53 187 115 5 3 169 2 180 5e-06 53.1
POG0002 158802708-stool2_revised_C1077267_1_gene26470 22.69 216 158 5 3 213 5 216 5e-06 52.7
POG0003 160502038-stool1_revised_scaffold47906_2_gene161164 33.00 297 154 6 169 424 334 626 6e-40 157
POG0003 160502038-stool1_revised_scaffold47906_2_gene161164 16.28 172 128 4 23 192 46 203 1e-06 56.6
POG0003 158337416-stool1_revised_C1254444_1_gene13533 30.06 346 184 7 133 424 57 398 6e-40 155
POG0003 158337416-stool1_revised_scaffold29713_1_gene153054 28.61 332 194 8 132 424 272 599 2e-38 152
POG0003 158337416-stool1_revised_scaffold29713_1_gene153054 24.00 200 131 5 1 193 5 190 9e-11 69.3
You need to get rid of line 24 (counting backwards):
$COGhits{$hit}=$COG;
In it, you are setting $COGhits{$hit} to a scalar value (the value of $COG). Later, in line 26 you are trying to dereference $COGhits{$hit} as an array to push into it. That doesn't work because there's a scalar in there.
Just remove the if and change those lines into this. That should do the trick as now all those $hits are stored in array references.
$COGhit_count{$hit}++; # count how many COGs each gene is homologous to
push #{$COGhits{$hit}}, $COG;
Output of $COGhits:
$VAR4 = {
'158802708-stool2_revised_C1077267_1_gene26470' => [
'POG0002'
],
'764062976-stool2_revised_C999233_1_gene54902' => [
'POG0002'
],
'764184357-stool1_revised_scaffold22981_1_gene47608' => [
'POG0002'
],
'765701615-stool1_revised_C1349270_1_gene168522' => [
'POG0002'
],
'763901136-stool1_revised_scaffold39447_1_gene145241' => [
'POG0002'
],
'160502038-stool1_revised_scaffold47906_2_gene161164' => [
'POG0003',
'POG0003'
]
};
If you however want both the scalar and the array ref, try this code. I don't recommend this, though.
$COGhit_count{$hit}++; # count how many COGs each gene is homologous to
if ($COGhit_count{$hit} == 1) {
$COGhits{$hit}=$COG; # Save as scalar
}
elsif ($COGhit_count{$hit} == 2) { # If we've just found the second hit,
my $temp = $COGhits{$hit}; # save the first and convert $COGhits{$hit}
$COGhits{$hit} = []; # to an array ref, then push both the old and
push #{$COGhits{$hit}}, $temp, $COG; # the new value in it.
} elsif ($COGhit_count{$hit} > 2) {
push #{$COGhits{$hit}}, $COG; # Just push the new value in
}
Thought: You probably had $COGhits{$hit}=$COG first but then noticed that sometimes there can be more than one value, so you added the push line, but you did not realized that you in fact had to replace the old line.
It's telling you exactly what you're doing wrong.
$COGhits{$hit}=$COG; # <--- scalar
if ($COGhit_count{$hit}>1) {
push #{$COGhits{$hit}}, $COG; # <--- array
}
You can't assign the value as a non-reference type and then try to autovivify it as a reference type. Perl will do the latter, but not if you've already stored a conflicting data type in that location.
Also, if this by some miracle worked the first time (it won't), and you ran this more than once, any array that you might have autoviv-ed by the push, would be clobbered by the scalar non-reference assignment.
I'm not sure what you're after, but the first line should probably be deleted.
Instead of that construct, you want to decide whether there will ever be more than one specification of $COG for a value of $hit. If there can be, simply replacing those 4 lines with the push is the way to go.
I have done multi-purpose structure slots before, and they are largely a pain to maintain. But if you wanted to do something like that, you can do this:
my $ref = \$hashref->{ $key }; # autovivifies slot as simple scalar.
# it starts out as undefined.
if ( ref $$ref ) { # ref $ref will always be true
push #$$ref, $value;
}
else {
$$ref = defined( $$ref ) ? [ $$ref, $value ] : $value;
}
But you have to write bifurcated logic every time you want to access the mixed tree in some different way. The performance savings that you get with scalars, is somewhat eaten up by the tests and branching.
So I don't do too much of this anymore. I decide beforehand whether the relationship is 1-1 or 1-n. Routines like the ones below can make it more straightforward dealing with these kinds of tables, to a degree.
sub get_list_from_hash {
my ( $hash, $key ) = #_;
my $ref = \$hash->{ $key };
return unless defined( $$ref );
return ref( $$ref ) ? #$$ref : $$ref;
}
sub store_in_hash {
$_[0] = {} unless ref $_[0];
my ( $hash, $key, #values ) = #_;
my #defined = grep {; defined } #values;
unless ( #defined ) {
delete $hash->{ $key };
return;
}
my $ref = \$hash->{ $key };
if ( ref $$ref ) {
push #$$ref, #defined;
}
elsif ( defined $$ref ) {
$$ref = [ $$ref, #defined ];
}
elsif ( #values > 1 ) {
#$$ref = #defined;
}
else {
( $$ref ) = #defined;
}
}

How to print/extract information listed under a column from two dimensional array in Perl?

I have a output file which is a two dimensional array (this file was output generated after running script written to produce 2D array) and I have to read information under a particular column, say column 1. In other words, how do I read and print out information listed, corresponding to all the rows, under column 1.
Any suggestions?
__DATA__
1 2 3 4 5 6 7 8 9
A B C D E F G H I
93 48 57 66 52 74 33 22 91
From the above data I want to extract information column wise, say if I want information from column 1, I should be able to list only the following output.
want to list Then I want
OUTPUT:
1
A
93
Final version after all corrections:
#!/usr/bin/perl
use strict;
use warnings;
my $column_to_show = 0;
while ( <DATA> ) {
last unless /\S/;
print +(split)[$column_to_show], "\n";
}
__DATA__
1 2 3 4 5 6 7 8 9
A B C D E F G H I
93 48 57 66 52 74 33 22 91
Output:
C:\Temp> u
1
A
93
Explanation of print +(split)[$column_to_show], "\n";:
perldoc -f split:
Splits the string EXPR into a list of strings and returns that list.
...
If EXPR is omitted, splits the $_ string. If PATTERN is also omitted,
splits on whitespace (after skipping any leading whitespace).
So: (split)[3] selects the fourth element of the list returned by split. The + in front of (split) is necessary to help perl parse the expression correctly. See perldoc -f print:
Also be careful not to follow the
print keyword with a left parenthesis
unless you want the corresponding
right parenthesis to terminate the
arguments to the print — interpose a +
or put parentheses around all the
arguments.
I thoroughly recommend every Perl programmer to occasionally skim through all of the documentation perldoc perltoc. It is on your computer.
my $line ;
foreach $line (#DATA)
{
my #DATA1 = split( / +/, "$line" );
print "first element of array is $DATA1[0]";
}
__DATA__
1 2 3 4 5 6 7 8 9
A B C D E F G H I
93 48 57 66 52 74 33 22 91
OUTPUT:-
1
A
93
Try playing with this code. Basically I load the data into an array of arrays
Each line is a reference to a row.
#!/usr/bin/perl
use strict;
use warnings;
my $TwoDimArray;
while (my $line=<DATA>) {
push #$TwoDimArray, [split(/,/,$line)];
};
for my $column (0..2) {
print "[$column,0] : " . $TwoDimArray->[0]->[$column] ."\n";
print "[$column,1] : " . $TwoDimArray->[1]->[$column] ."\n";
print "\n";
}
__DATA__
1,2,3,04,05,06
7,8,9,10,11,12
The map function is your friend:
open FILE, "data.txt";
while ($line = <FILE>) {
chomp($line);
push #data, [split /[, ]+/, $line];
}
close FILE;
#column1 = map {$$_[0]} #data;
print "#column1\n";
And in data.txt something like:
1, 2, 3, 4
5, 6, 7, 8
9, 10, 11, 12
13, 14, 15, 16
perl -lne '#F = split /\s+/ and print $F[1]'
This might be what you want:
use English qw<$OS_ERROR>; # Or just use $!
use IO::Handle;
my #columns;
open my $fh, '<', 'columns.dat' or die "I'm dead. $OS_ERROR";
while ( my $line = <$fh> ) {
my #cols = split /\s+/, $line;
$columns[$_][$fh->input_line_number()-1] = $cols[$_] foreach 0..$#cols;
}
$fh->close();
You can access them directly by element.
$arrays[0][0] = 1;
$arrays[0][1] = 2;
$arrays[1][0] = 3;
$arrays[1][1] = 4;
for (my $i = 0; $i <= $#{$arrays[1]}; $i++) {
print "row for $i\n";
print "\tfrom first array: " . $arrays[0][$i] . "\n";
print "\tfrom second array: " . $arrays[1][$i] . "\n";
}
prints
row for 0
from first array: 1
from second array: 3
row for 1
from first array: 2
from second array: 4