perl - push not appending to the end of the array

DB<2> n
main::(/home/repsa/temper.pl:84): my $tttdiskhumber=$myTemprecord[-1];
DB<2> n
main::(/home/repsa/temper.pl:87): push(@myMainrecord,$tttdiskhumber);
DB<2> p @myMainrecord
t2agvio701vhost03t2adsap7011
DB<3> p $tttdiskhumber
hdisk6
DB<4> n
main::(/home/repsa/temper.pl:88): @myTemprecord=();
DB<4> p @myMainrecord
hdisk6o701vhost03t2adsap7011
DB<5>
Why is my last push not appending to the end of the array?
Any help is appreciated....

Oh, it is. The problem is that you're sending a carriage return to the screen. It's probably trailing the previous element in the array.
$ perl -e'print "abc", "def\r", "ghi", "\n";'
ghidef
You probably read a Windows text file on a non-Windows system without converting the line endings, either in advance (using dos2unix) or when you read the file (by using s/\s+\z//; instead of chomp;).
As jordanm suggested in a comment, the debugger's x command will show you what you have better than p.
$ perl -d
Loading DB routines from perl5db.pl version 1.33
Editor support available.
Enter h or `h h' for help, or `man perldebug' for more help.
my @a = ("abc", "def\r", "ghi");
1;
^D
main::(-:1): my @a = ("abc", "def\r", "ghi");
DB<1> s
main::(-:2): 1;
DB<1> p @a
ghidef
DB<2> x @a
0 'abc'
1 "def\cM"
2 'ghi'
DB<3> q
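As a minimal sketch of the s/\s+\z// fix mentioned above (the helper name clean_line and the sample values are made up for illustration, not taken from the asker's script), stripping all trailing whitespace while reading removes the stray carriage return before it ever reaches the array:

```perl
use strict;
use warnings;

# Hypothetical helper: remove all trailing whitespace, including a
# Windows "\r\n" line ending, where chomp would leave the "\r" behind.
sub clean_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;
    return $line;
}

# Sample values standing in for lines read from a Windows text file.
my @records = map { clean_line($_) } ("vhost03\r\n", "hdisk6\r\n");
print join(",", @records), "\n";    # prints "vhost03,hdisk6"
```

With the carriage returns gone, p in the debugger shows the elements one after another instead of overwriting the start of the line.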

Related

How to parse rows in my txt file properly using perl

I hope to parse a txt file that looks like this:
A a, b, c
B e
C f, g
The format I hope to get is:
A a
A b
A c
B e
C f
C g
I tried this:
perl -ane '@s=split(/\,/, $F[1]); foreach $k (@s){print "$F[0] $k\n";}' txt.txt
but it only works when there's no space after commas. In the original file, there is a space after each comma. What should I do?
$ perl -lane 'print "$F[0] $_" for map { tr/,//rd } @F[1..$#F]' input.txt
A a
A b
A c
B e
C f
C g
Use auto-split mode on whitespace like normal, and for each element of an array slice of @F from the second field to the last one, remove any commas (I used tr//d, the more usual s/// works too, of course) and print it with the first field prepended.
Alternatively, don't use -a because it splits too much.
perl -nle'@F = split(" ", $_, 2); print "$F[0] $_" for split(/,\s*/, $F[1])'
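The second approach can also be written out as a small script. This is only a sketch; the helper name expand_row is invented here, and the rows are the sample data from the question:

```perl
use strict;
use warnings;

# Hypothetical helper: turn "A a, b, c" into ("A a", "A b", "A c").
sub expand_row {
    my ($line) = @_;
    # Split off the first field only; the rest keeps its commas.
    my ($key, $rest) = split ' ', $line, 2;
    return () unless defined $rest;
    # A comma followed by optional whitespace separates the values.
    return map { "$key $_" } split /,\s*/, $rest;
}

print "$_\n" for map { expand_row($_) } ("A a, b, c", "B e", "C f, g");
```

Splitting on /,\s*/ is what makes the space after each comma harmless.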

How can I set a working breakpoint to a constant expression?

I have Perl code that uses a constant with an initializing block like this:
use constant C => map {
...;
} (0..255);
When I try to set a breakpoint at the ...; line, it does not work, meaning: I can set the breakpoint, but the debugger does not stop there.
I tried:
Start the program with the debugger (perl -d program.pl)
Set the breakpoint in the debugger (b 2)
Reload using R, then run (r) the program
But still the debugger did not stop at the line, just as if I had no breakpoint set.
My Perl is not the latest; it's 5.18.2, just in case it matters...
You are trying to set a breakpoint in a use statement.
A use statement is in effect a BEGIN block with a require in it.
By default, the Perl debugger does not stop during the compile phase.
However, you can force the Perl debugger into single-step mode inside a BEGIN block by setting the variable $DB::single to 1.
See "Debugging Compile-Time Statements" in perldoc perldebug.
If you change your code to
use constant C => map {
$DB::single = 1;
...;
} (0..255);
The Perl debugger will stop in the use statement.
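To see why an ordinary breakpoint is never reached, it may help to watch the order of execution. The following sketch (the constant name SQUARES and the @log array are made up for illustration) records when each piece of code runs:

```perl
use strict;
use warnings;

our @log;    # package variable, so BEGIN and the use line share it

BEGIN { push @log, "compile: BEGIN block" }

# The map block below runs while this statement is being compiled.
use constant SQUARES => map { push @log, "compile: map $_"; $_ * $_ } (0 .. 2);

push @log, "run: first runtime statement";
print "$_\n" for @log;
```

All the "compile:" entries are logged before the first runtime statement, which is the point at which the debugger normally hands over control.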
You can avoid altering your code if you create a simple module like this (concept originated here):
package StopBegin;
BEGIN {
$DB::single=1;
}
1;
Then, run your code as
perl -I./ -MStopBegin -d test.pl
Pertinent Answer (previous, not-so-pertinent answer is below this one)
If test.pl looks like this:
use constant C => {
map {;
"C$_" => $_;
} 0 .. 255
};
here's what the debug interaction looks like:
% perl -I./ -MStopBegin -d test.pl
Loading DB routines from perl5db.pl version 1.53
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
StopBegin::CODE(0x55db6287dac0)(StopBegin.pm:8):
8: 1;
DB<1> s
main::CODE(0x55db6287db38)(test.pl:5):
5: };
DB<1> -
1 use constant C => {
2: map {;
3: "C$_" => $_;
4 } 0 .. 255
5==> };
DB<2> b 3
DB<3> c
main::CODE(0x55db6287db38)(test.pl:3):
3: "C$_" => $_;
DB<3>
Note the use of the breakpoint to stop inside the map.
Previous, Not-So-Pertinent Answer
If test.pl looks like this:
my $foo;
BEGIN {
$foo = 1;
};
here's what the debug interaction looks like:
% perl -I./ -MStopBegin -d test.pl
Loading DB routines from perl5db.pl version 1.53
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
StopBegin::CODE(0x5567e3d79a80)(StopBegin.pm:8):
8: 1;
DB<1> s
main::CODE(0x5567e40f0db0)(test.pl:4):
4: $foo = 1;
DB<1> s
main::(test.pl:1): my $foo;
DB<1> s
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.
DB<1>
Note the use of the s command to advance; otherwise the debugger will skip over the BEGIN block in test.pl.

Error in perl code - "Bizarre copy of ARRAY in leave at "

I have a problem with this Perl code.
It shows the error "Bizarre copy of ARRAY in leave at ", although I believe the code is correct. Can anybody help?
#!/usr/bin/perl -w
use strict;
sub getStatus() {
#my $self = shift;
my $status;
my @details;
my $Up = 2;
my $Down = 3;
$status = "Failed";
push @details, $Up, $Down;
my $detailMsg = join(",", @details);
return [$status, $detailMsg];
}
my $info = &getStatus();
my $status = ${@$info}[0];
my $detailMsg = ${@$info}[1];
print $status;
print $detailMsg;
exit 0;
-----------------------
Now debugging using perl -d option.
-----------------------
Loading DB routines from perl5db.pl version 1.28
Editor support available.
Enter h or `h h' for help, or `man perldebug' for more help.
main::(test.pl:19): my $info = &getStatus();
DB<1> n
main::(test.pl:20): my $status = ${@$info}[0];
DB<1> n
main::(test.pl:20): my $status = ${@$info}[0];
DB<1> n
Bizarre copy of ARRAY in leave at test.pl line 20.
at test.pl line 20
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.
DB<1>
Please suggest a solution. If this is caused by a problem in a Perl module, how can we work around it?
${@$info}[0] is an abomination. $info is the return value of your getStatus sub. It is an array reference. Then, @$info is the dereferenced array. But, you are evaluating it in scalar context, so it evaluates to 2. Then you try to evaluate that as an array reference, and take its first element.
Bizarre indeed. The error message is very appropriate.
PS: Don't use &getStatus(). getStatus() is the right way to invoke your sub.
PPPS: You probably do want $info->[0], but then it is hard to be certain, because what you wrote is so bizarre.
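For comparison, here is a sketch of the same logic using the arrow syntax. The sub is renamed get_status and simplified for illustration; it is not the asker's final code:

```perl
use strict;
use warnings;

# Returns a reference to a two-element array: status and detail message.
sub get_status {
    my @details = (2, 3);
    return [ "Failed", join(",", @details) ];
}

my $info = get_status();

# Either index through the reference with the arrow operator...
my $status = $info->[0];
# ...or dereference the whole array once and unpack it.
my (undef, $detail_msg) = @$info;

print "$status $detail_msg\n";    # prints "Failed 2,3"
```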

Perl Debugger - How to step out of a loop

While using perl debugger, is there any way to step out of the current loop?
For example:
line 1
for($i=1;$i<100000;$i++)
{
line2
}
line3
I want the debugger to step out of this for loop and stop at line3
c 5
Demonstration:
>perl -d
Loading DB routines from perl5db.pl version 1.33
Editor support available.
Enter h or `h h' for help, or `perldoc perldebug' for more help.
print "line1\n";
for (1..100000) {
print "line2\n";
}
print "line3\n";
^Z
main::(-:1): print "line1\n";
DB<1> s
line1
main::(-:2): for (1..100000) {
DB<1> s
main::(-:3): print "line2\n";
DB<1> s
line2
main::(-:3): print "line2\n";
DB<1> c 5
line2
line2
line2
...
line2
line2
line2
main::(-:5): print "line3\n";
DB<2> s
line3
Debugged program terminated. Use q to quit or R to restart,
You can just set the loop variable so that the termination condition is met:
$i=100000
For example, in the debugger:
DB<5> $i=1
DB<6> print $i
1
DB<7> $i=100000
DB<8> print $i
100000
DB<9> c
Debugged program terminated. Use q to quit or R to restart,
c 3 means continue execution and stop at line 3
There is no step out.
You can either set a breakpoint on "line 3" and continue (c) to that breakpoint, or explicitly give c <line #> to stop at a particular line.

How to rewrite this code as a one-liner (or in fewer lines on the command line) in Perl?

I have code like this:
#!/usr/bin/perl
use strict;
use warnings;
my %proteins = qw/
UUU F UUC F UUA L UUG L UCU S UCC S UCA S UCG S UAU Y UAC Y UGU C UGC C UGG W
CUU L CUC L CUA L CUG L CCU P CCC P CCA P CCG P CAU H CAC H CAA Q CAG Q CGU R CGC R CGA R CGG R
AUU I AUC I AUA I AUG M ACU T ACC T ACA T ACG T AAU N AAC N AAA K AAG K AGU S AGC S AGA R AGG R
GUU V GUC V GUA V GUG V GCU A GCC A GCA A GCG A GAU D GAC D GAA E GAG E GGU G GGC G GGA G GGG G
/;
open(INPUT,"<dna.txt");
while (<INPUT>) {
tr/[a,c,g,t]/[A,C,G,T]/;
y/GCTA/CGAU/;
foreach my $protein (/(...)/g) {
if (defined $proteins{$protein}) {
print $proteins{$protein};
}
}
}
close(INPUT);
This code is related to my other question's answer: DNA to RNA and Getting Proteins with Perl
The output of the program is:
SIMQNISGREAT
How can I rewrite this code in Perl so that it runs on the command line with less code (a one-liner, if possible)?
PS 1: dna.txt looks like this:
TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT
PS 2: If it makes the code shorter, it is acceptable to put the contents of the %proteins variable into a file.
The only changes I would recommend making are simplifying your while loop:
while (<INPUT>) {
tr/acgt/ACGT/;
tr/GCTA/CGAU/;
foreach my $protein (/(...)/g) {
if (defined $proteins{$protein}) {
print $proteins{$protein};
}
}
}
Since y and tr are synonyms, you should only use one of them. I think tr reads better than y, so I picked tr. Further, you were calling them very differently, but this should have the same effect and only mentions the letters you actually change. (All the other characters were being transliterated to themselves. That makes it much harder to see what is actually being changed.)
You might want to remove the open(INPUT,"<dna.txt"); and corresponding close(INPUT); lines, as they make it much harder to use your program in shell pipelines or with different input files. But that's up to you, if the input file will always be dna.txt and never anything different, this is alright.
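One way to keep the logic independent of where the input comes from is to put it in a function. This is a sketch only: the name translate is invented here, the table is the one from the question, and the input string is the sample dna.txt line:

```perl
use strict;
use warnings;

# Codon table copied from the question.
my %proteins = qw/
UUU F UUC F UUA L UUG L UCU S UCC S UCA S UCG S UAU Y UAC Y UGU C UGC C UGG W
CUU L CUC L CUA L CUG L CCU P CCC P CCA P CCG P CAU H CAC H CAA Q CAG Q CGU R CGC R CGA R CGG R
AUU I AUC I AUA I AUG M ACU T ACC T ACA T ACG T AAU N AAC N AAA K AAG K AGU S AGC S AGA R AGG R
GUU V GUC V GUA V GUG V GCU A GCC A GCA A GCG A GAU D GAC D GAA E GAG E GGU G GGC G GGA G GGG G
/;

# Hypothetical helper: complement the DNA into RNA, then map each
# three-letter codon to its protein letter, skipping unknown codons.
sub translate {
    my ($dna) = @_;
    (my $rna = uc $dna) =~ tr/GCTA/CGAU/;
    return join '', map { exists $proteins{$_} ? $proteins{$_} : () } $rna =~ /(...)/g;
}

print translate("TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT"), "\n";    # prints "SIMQNISGREAT"
```

The same function could then be called on each line of <> so that the script works with files named on the command line or with piped input.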
Somebody (@kamaci) called my name in another thread. This is the best I can come up with while keeping the protein table on the command line:
perl -nE'say+map+substr("FYVDINLHL%VEMKLQL%VEIKLQFYVDINLHCSGASTRPWSGARTRP%SGARTRPCSGASTR",(s/GGG/GGC/i,vec($_,0,32)&101058048)%63,1),/.../g' dna.txt
(Shell quoting; for Windows quoting, swap the ' and " characters.) This version marks invalid codons with %; you can probably fix that by adding =~y/%//d at an appropriate spot.
Hint: This picks out 6 bits from the raw ASCII encoding of an RNA triple, giving 64 codes between 0 and 101058048; to get a string index, I reduce the result modulo 63, but this creates one double mapping which regrettably had to code two different proteins. The s/GGG/GGC/i maps one of them to another that codes the right protein.
Also note the parentheses before the % operator which both isolate the , operator from the argument list of substr and fix the precedence of & vs %. If you ever use that in production code, you're a bad, bad person.
#!/usr/bin/perl
%p=qw/UUU F UUC F UUA L UUG L UCU S UCC S UCA S UCG S UAU Y UAC Y UGU C UGC C UGG W
CUU L CUC L CUA L CUG L CCU P CCC P CCA P CCG P CAU H CAC H CAA Q CAG Q CGU R CGC R CGA R CGG R
AUU I AUC I AUA I AUG M ACU T ACC T ACA T ACG T AAU N AAC N AAA K AAG K AGU S AGC S AGA R AGG R
GUU V GUC V GUA V GUG V GCU A GCC A GCA A GCG A GAU D GAC D GAA E GAG E GGU G GGC G GGA G GGG G/;
$_=uc<DATA>;y/GCTA/CGAU/;map{print if$_=$p{$_}}/(...)/g
__DATA__
TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT
Phew. Best I can come up with, at least this quickly. If you're sure the input is always already in uppercase, you can also drop the uc saving another two characters. Or if the input is always the same, you could assign it to $_ straight away instead of reading it from anywhere.
I guess I don't need to say that this code should not be used in production environments or anywhere else other than pure fun. When doing actual programming, readability almost always wins over compactness.
A few other versions I mentioned in the comments:
Reading %p and the DNA from files:
#!/usr/bin/perl
open A,"<p.txt";map{map{/(...)/;$p{$1}=chop}/(... .)/g}<A>;
open B,"<dna.txt";$_=uc<B>;y/GCTA/CGAU/;map{print if$_=$p{$_}}/(...)/g
From shell with perl -e:
perl -e 'open A,"<p.txt";map{map{/(...)/;$p{$1}=chop}/(... .)/g}<A>;open B,"<dna.txt";$_=uc<B>;y/GCTA/CGAU/;map{print if$_=$p{$_}}/(...)/g'
Most things have already been pointed out, especially that readability matters. I wouldn't try to reduce the program more than what follows.
use strict;
use warnings;
# http://stackoverflow.com/questions/5402405/
my $fnprot = shift || 'proteins.txt';
my $fndna = shift || 'dna.txt';
# build protein table
open my $fhprot, '<', $fnprot or die "open $fnprot: $!";
my %proteins = split /\s+/, do { local $/; <$fhprot> };
close $fhprot;
# process dna data
my @result;
open my $fhdna, '<', $fndna or die "open $fndna: $!";
while (<$fhdna>) {
tr/acgt/ACGT/;
tr/GCTA/CGAU/;
push @result, map $proteins{$_}, grep defined $proteins{$_}, m/(...)/g;
}
close $fhdna;
# check correctness of result (given input as per original post)
my $expected = 'SIMQNISGREAT';
my $got = join '', @result;
die "@result is not expected" if $got ne $expected;
print "@result - $got\n";
The only "one-liner" thing I added is the push map grep m//g in the while loop. Note that Perl 5.10 adds the "defined or" operator - // - which allows you to write:
push @result, map $proteins{$_} // (), m/(...)/g;
Ah okay, the open do local $/ file slurp idiom is handy for slurping small files into memory. Hope you find it a bit inspiring. :-)
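The slurp idiom can be tried without any files on disk by opening a filehandle on a string. In this sketch the $text contents and the %table name are made up for illustration, and split ' ' is used instead of split /\s+/ so that leading whitespace does not produce an empty first field:

```perl
use strict;
use warnings;

# Stand-in for the contents of a small protein table file.
my $text = "UUU F UUC F\nUUA L UUG L\n";

# Perl can open a filehandle that reads from an in-memory string.
open my $fh, '<', \$text or die "open: $!";

# Localizing $/ to undef makes <$fh> return the whole file at once;
# the alternating codon/letter list then populates the hash.
my %table = split ' ', do { local $/; <$fh> };
close $fh;

print "$_ => $table{$_}\n" for sort keys %table;
```

Because local restores $/ when the do block ends, later line-by-line reads in the same program are unaffected.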
If you write the protein data to another file, space-delimited and without line breaks, you can import it by reading the file just once.
#!/usr/bin/perl
use strict;
use warnings;
open(INPUT, "<mydata.txt");
open(DATA, "<proteins.txt");
my %proteins = split(" ",<DATA>);
while (<INPUT>) {
tr/GCTA/CGAU/;
while(/(\w{3})/gi) {print $proteins{$1} if (exists($proteins{$1}))};
}
close(INPUT);
close(DATA);
You can remove the line "tr/a,c,g,t/A,C,G,T/" only if the input is already uppercase: the i option on the match operator does not make tr/GCTA/CGAU/ or the hash lookup case-insensitive. The original foreach loop can be shortened as in the code above. The $1 variable here holds the text captured by the parentheses in the match operation /(\w{3})/gi.