Perl debugger various outputs - perl

While writing a script I noticed some very strange Perl behavior. The task is to trim the last character of the string stored in the $temp1 variable. This can be done using either chop($temp1) or the more complicated substr($temp1, 0, length($temp1) - 1). Both work as one-liners, but in my script the substr solution does not work. Below is an example from the debugger:
1st PROBLEM:
1st SOLUTION - NOT WORKING:
main::(delmid_n.pl:24): if (($state == 0) && ($_ !~ $musr) && ($_ !~ $lusr)) {
>> n
main::(delmid_n.pl:54): $strprn=substr($temp1, 0, length($temp1) - 1);
>> p $temp1
(-,user2,t-mobile.co.uk)\
>> p substr($temp1, 0, length($temp1) - 1) . "\n";
(-,user2,t-mobile.co.uk)
DB<4>
main::(delmid_n.pl:55): print $strprn . "\n";
DB<4>
main::(delmid_n.pl:56): $temp1 = "";
DB<4>
As you can see in the $strprn variable, nothing is stored. If the same piece of code (which is stored into the $strprn variable) is printed via the 'p' command, the output is OK. This "bug" can be overcome using the mentioned chop() function (see code below):
2nd SOLUTION - WORKING:
main::(delmid_w.pl:24): if (($state == 0) && ($_ !~ $musr) && ($_ !~ $lusr)) {
>> p $temp1
(-,user2,t-mobile.co.uk)\
DB<4> n
main::(delmid_w.pl:56): chop ($temp1);
DB<4> n
main::(delmid_w.pl:57): print $temp1;
DB<4> n
(-,user2,t-mobile.co.uk)
main::(delmid_w.pl:58): $temp1 = "";
DB<4>
The code above is exactly the same as in the first example, except that these two lines from the 1st example:
$strprn=substr($temp1, 0, length($temp1) - 1);
print $strprn . "\n";
are replaced with the following two lines in the 2nd example:
chop ($temp1);
print $temp1;
What is wrong with the 1st solution?
2nd PROBLEM:
This is a problem which I do not have a workaround for, so far.
DB<1>
main::(delbeg_n.pl:15): $state = 0;
DB<1>
main::(delbeg_n.pl:16): $muser = qr/\(-,user1,[^,]+\.co\.uk\)\\$/;
DB<1>
main::(delbeg_n.pl:19): line: while (<>) {
DB<1>
main::(delbeg_n.pl:20): chomp; # strip record separator
DB<1>
main::(delbeg_n.pl:21): @Fld = split(/\s+/, $_,);
DB<1>
main::(delbeg_n.pl:23): if (($state == 0) && ($Fld[1] =~ $muser)) {
DB<1> p $Fld[1]
(-,user1,one2one.co.uk)\
DB<2> n
main::(delbeg_n.pl:43): print $_;
DB<2> p $_
netgroup1 (-,user1,one2one.co.uk)\
DB<3> if ($Fld[1] =~ $muser) {print "TRUE"}
TRUE
As you can see, after line 21 is executed, the next line executed is 43 (the else branch). Why is the following condition not evaluated as true, allowing the code to continue with lines 23, 24, 25?
if (($state == 0) && ($Fld[1] =~ $muser))
The following line was entered as a demonstration that the condition should evaluate as true:
if ($Fld[1] =~ $muser) {print "TRUE"}
Thanks a lot.

Both problems are related. There is something strange at the end of the line.
First, it looks as if you have a newline character after the slash at the end of the line that you have not chomped off -- at least that is what the debugger shows:
>> p $temp1
(-,user2,t-mobile.co.uk)\
DB<4> n
If there was no newline it would have been
>> p $temp1
(-,user2,t-mobile.co.uk)\
DB<4> n
So, there should be two characters: \ and newline.
Why the weird behavior -- I don't know without looking at your input file.
Also, instead of using string length, you can just do substr( $string, 0, -1 ) -- that will also return all but the last character.
That is probably the reason for your second problem -- I guess that \(-,user1,[^,]+\.co\.uk\)\\ matches and it is the end of line marker $ that causes the mismatch.
use strict please
check the file types (dos, unix, old mac)
view the file in a hex editor
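As a rough sketch of the "look at the raw bytes" advice, the following prints the last few characters of a string as hex codes. It assumes, without having seen the input file, that the stray character is a carriage return left over from DOS line endings; the helper name tail_hex and the sample line are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last $n characters of a string as hex codes.
sub tail_hex {
    my ($str, $n) = @_;
    my $tail = length($str) > $n ? substr($str, -$n) : $str;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $tail;
}

# Assumed sample: a line from a DOS-format file, ending in "\" CR LF.
my $line = "(-,user2,t-mobile.co.uk)\\\r\n";
chomp(my $chomped = $line);          # with the default $/, only "\n" is removed

print tail_hex($chomped, 3), "\n";   # 29 5c 0d: ")" "\" and a leftover CR

(my $clean = $line) =~ s/\s+\z//;    # strip any trailing CR, LF or other whitespace
print substr($clean, 0, -1), "\n";   # everything except the final "\"
```

If the hex output shows something other than 0d after the backslash, the assumption above is wrong and the cleanup regex would need adjusting.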

Related

Usage of Range operator in perl

I have the following code, which reads the text below from a file and displays the IDs shown below. I am asking in particular about the condition in the if block and how the ID is fetched:
Using a Range operator ..:
use strict;
use warnings;
use autodie;
#open my $fh, '<', 'sha.log';
my $fh = \*DATA;
my @work_items;
while (<$fh>) {
    if ( my $range = /Work items:/ ... !/^\s*\(\d+\) (\d+)/ ) {
        push @work_items, $1 if $range > 1 && $range !~ /E/;
    }
}
print "@work_items\n";
Text in the file
__DATA__
Change sets:
(0345) ---$User1 "test12"
Component: (0465) "textfiles1"
Modified: 14-Sep-2014 02:17 PM
Changes:
---c- (0574) /<unresolved>/sha.txt
Work items:
(0466) 90516 "test defect
(0467) 90517 "test defect
Change sets:
(0345) ---$User1 "test12"
Component: (0465) "textfiles1"
Modified: 14-Sep-2014 02:17 PM
Changes:
---c- (0574) /<unresolved>/sha.txt
Work items:
(0468) 90518 "test defect
Outputs:
90516 90517 90518
Question: The range operator is normally written with two dots. Why is it used with three dots here?
First, it's not really the range operator; it's known as the flip-flop operator when used in scalar context. And like all symbolic operators, it's documented in perlop.
... is almost the same thing as .. (two dots). When ... is used instead of .., the end condition isn't tested on the same pass as the start condition.
$ perl -E'for(qw( a b a c a d a )) { say if $_ eq "a" .. $_ eq "a"; }'
a # Start and stop at the first 'a'
a # Start and stop at the second 'a'
a # Start and stop at the third 'a'
a # Start and stop at the fourth 'a'
$ perl -E'for(qw( a b a c a d a )) { say if $_ eq "a" ... $_ eq "a"; }'
a # Start at the first 'a'
b
a # Stop at the second 'a'
a # Start at the third 'a'
d
a # Stop at the fourth 'a'
Per http://perldoc.perl.org/perlop.html#Range-Operators:
If you don't want it to test the right operand until the next evaluation, as in sed, just use three dots ("...") instead of two. In all other regards, "..." behaves just like ".." does.
So, this:
/Work items:/ ... !/^\s*\(\d+\) (\d+)/
means "from a line that matches /Work items:/ until the next subsequent line that doesn't match /^\s*\(\d+\) (\d+)/", whereas this:
/Work items:/ .. !/^\s*\(\d+\) (\d+)/
would mean "from a line that matches /Work items:/ until the line that doesn't match /^\s*\(\d+\) (\d+)/" (even if it's the same one).
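The question's code also relies on the value the flip-flop returns ($range > 1 && $range !~ /E/). A minimal sketch of those return values follows; the helper name flip_flop_values and the start/stop markers are invented for the illustration.

```perl
use strict;
use warnings;

# Record what the flip-flop returns for each item: the empty string outside
# the range, a sequence number inside it, and the sequence number with "E0"
# appended on the item that ends the range.
sub flip_flop_values {
    my @items = @_;
    my @values;
    for (@items) {
        my $range = /^start/ ... /^stop/;
        push @values, "$_=" . ($range eq '' ? 'off' : $range);
    }
    return @values;
}

print join(' ', flip_flop_values(qw(x start a b stop y))), "\n";
# x=off start=1 a=2 b=3 stop=4E0 y=off
```

Read this way, $range > 1 skips the line that opened the range ("Work items:"), and $range !~ /E/ skips the line that closed it.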

How can I put N's in my final string

I'm a beginner in Perl, so I'm having some problems writing a script.
I want a script that appends the letter N a certain number of times, based on a length that I checked previously. These Ns have to go at the end of a string inside a .txt file. These strings begin with a > and look like this:
>A1_23ABR2014_53_CC07.P10R_E07_009.ab1
attgccttttgctagcttatagaataataattcatataaacaaaaaatat
tttatattatttaaaaataaataaaccaaataaagtcattgttgatccaa
ttgaacaaatcatattccatccatttaaagcgtctggataatcaggaata
cgtctaggcattacattaaatccaagaaaatgcataggtaagaatgttaa
I already wrote that, but I don't know how to do next.
if $qend > $sendi{
my $leg1 = $qendi - $sendi;
open(my @final, '>>', 'contiggeral.fasta') or die;
while (N < $leg1) {
do N++ in @nomecontig
}
Thanks and sorry for my bad English.
The condition of a non-modifier if must be enclosed in parentheses. Variables must start with a sigil (N has none). There is no in operator in Perl.
my $string = 'abc';
my $final_length = 20;
$string .= 'N' x ($final_length - length $string);
print $string, "\n";
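Applied to the question's variables, the same x repetition operator could be used roughly as follows. This is only a sketch: the helper name pad_with_n is hypothetical, and the numbers stand in for the question's $qend and $sendi, whose real values are not shown.

```perl
use strict;
use warnings;

# Hypothetical helper: append $count copies of "N" to a sequence string.
sub pad_with_n {
    my ($sequence, $count) = @_;
    return $sequence if $count <= 0;      # nothing to add
    return $sequence . ('N' x $count);
}

# Assumed values standing in for the question's $qend and $sendi.
my ($qend, $sendi) = (12, 7);
my $sequence = 'attgcc';

$sequence = pad_with_n($sequence, $qend - $sendi) if $qend > $sendi;
print "$sequence\n";    # attgccNNNNN
```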

Use Perl to remove blocks of text from a file between two mark points

I need to cut a set of lines from a file between two mark points. For example the file is
file.txt
END
line 1 not removed
END
line 2 not removed
line 3 not removed
BEGIN
line 1 is to be removed
line 2 is to be removed
line 3 is to be removed
END
line two last not removed
END
line three last not removed
line four last not removed
I want to remove the lines between BEGIN and END. The new file becomes
file2.txt
END
line 1 not removed
END
line 2 not removed
line 3 not removed
line two last not removed
END
line three last not removed
line four last not removed
This means BEGIN and the first END after BEGIN and the lines between them should be removed.
I was able to write this program, and it works perfectly. But is there any better way to do this?
use File::Copy;
$j = $i = 0;
open(DATA, "<file1.txt");
open(DATA1, ">file2.txt");
while (<DATA>) {
if ($_ =~ /^BEGIN/) { $i = 1; }
if ($_ =~ /^END/ && $i == 1) { $i = 0; next if $_ }
if ($i == 1) { next if $_; }
print DATA1 $_;
}
close(DATA);
close(DATA1);
copy "file2.txt", "file1.txt";
while(<DATA>) {
print DATA1 $_ unless /^BEGIN/ .. /^END/;
}
About the range .. operator from perldoc,
In scalar context, ".." returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each ".." operator maintains its own boolean state, even across calls to a subroutine that contains it. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again.
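The flip-flop filter can be tried without touching any files by running it over a list of lines. In this sketch the helper name strip_blocks and the sample lines are made up; only the unless /^BEGIN/ .. /^END/ test comes from the answer.

```perl
use strict;
use warnings;

# Return the lines that fall outside every BEGIN..END block.
sub strip_blocks {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # The flip-flop is true from a BEGIN line through the next END line.
        push @kept, $_ unless /^BEGIN/ .. /^END/;
    }
    return @kept;
}

my @input = (
    "END\n", "keep 1\n",
    "BEGIN\n", "drop 1\n", "drop 2\n", "END\n",
    "keep 2\n", "END\n",
);
print strip_blocks(@input);    # prints END, keep 1, keep 2, END
```

An END line that is not preceded by a BEGIN is kept, because the flip-flop only starts once its left operand has matched.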

Perl program for extracting the functions alone in a Ruby file

I am having the following Ruby program.
puts "hai"
def mult(a,b)
a * b
end
puts "hello"
def getCostAndMpg
cost = 30000 # some fancy db calls go here
mpg = 30
return cost,mpg
end
AltimaCost, AltimaMpg = getCostAndMpg
puts "AltimaCost = #{AltimaCost}, AltimaMpg = #{AltimaMpg}"
I have written a perl script which will extract the functions alone in a Ruby file as follows
while (<DATA>){
print if ( /def/ .. /end/ );
}
Here the <DATA> is reading from the ruby file.
The Perl program produces the following output.
def mult(a,b)
a * b
end
def getCostAndMpg
cost = 30000 # some fancy db calls go here
mpg = 30
return cost,mpg
end
However, if the function contains a block of statements, for example an if block, this does not work: the output stops at the "end" of the if block instead of the "end" of the function. Please suggest a solution.
Example:
def function
if x > 2
puts "x is greater than 2"
elsif x <= 2 and x!=0
puts "x is 1"
else
puts "I can't guess the number"
end #----- My code parsing only up to this
end
Thanks in Advance!
If your code is properly indented, you just want lines that start with def or end, so change your program to:
while (<DATA>){
print if ( /^def/ .. /^end/ );
}
Or run it without a program file at all - run the program from the command line, using -n to have perl treat it as a while loop reading from STDIN:
perl -n -e "print if ( /^def/ .. /^end/ );" < ruby-file.rb
I am not familiar with ruby syntax but if you can ensure good indentation all over the code, you can check based on indentation. Something similar to:
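my $add = 0;
my $spaces;
while (my $str = <DATA>) {
    if (!$add && $str =~ /^(\s*)def function/) {
        # Start of the function: remember its indentation.
        $add    = 1;
        $spaces = $1;
        print $str;
    }
    elsif ($add) {
        print $str;
        # A non-blank line at the def's own indentation (the closing end) finishes the function.
        $add = 0 if $str =~ /^\Q$spaces\E\S/;
    }
}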
Another option could be counting level of program, something like this:
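my $level = 0;
while (<DATA>) {
    # && binds tighter than .., so the parentheses only make the grouping explicit.
    if (/\b def \b/x .. (/\b end \b/x && $level == 0)) {
        $level++ if /\b if \b/x;                  # put all statements that are closed by end here
        $level-- if /\b end \b/x && $level > 0;   # the function's own end must not push $level below 0
        print;
    }
}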
I am not all that familiar with ruby syntax, so you need to put all statements that are closed by end into regex with $level++.
Please note I added \b around those keywords to make sure you are matching whole word and not things like undef as start of function.

Perl: extract rows from 1 to n (Windows)

I want to extract rows 1 to n from my .csv file. Using this
perl -ne 'if ($. == 3) {print;exit}' infile.txt
I can extract only one row. How to put a range of rows into this script?
If you have only a single range and a single, possibly concatenated input stream, you can use:
#!/usr/bin/perl -n
if (my $seqno = 1 .. 3) {
print;
exit if $seqno =~ /E/;
}
But if you want it to apply to each input file, you need to catch the end of each file:
#!/usr/bin/perl -n
print if my $seqno = 1 .. 3;
close ARGV if eof || $seqno =~ /E/;
And if you want to be kind to people who forget args, add a nice warning in a BEGIN or INIT clause:
#!/usr/bin/perl -n
BEGIN { warn "$0: reading from stdin\n" if @ARGV == 0 && -t }
print if my $seqno = 1 .. 3;
close ARGV if eof || $seqno =~ /E/;
Notable points include:
You can use -n or -p on the #! line. You could also put some (but not all) other command line switches there, like ‑l or ‑a.
Numeric literals as operands to the scalar flip‐flop operator are each compared against the readline counter, so a scalar 1 .. 3 is really ($. == 1) .. ($. == 3).
Calling eof with neither an argument nor empty parens means the last file read in the magic ARGV list of files. This contrasts with eof(), which is the end of the entire <ARGV> iteration.
A flip‐flop operator’s final sequence number is returned with an "E0" appended to it.
The -t operator, which calls libc’s isatty(3), defaults to the STDIN handle, unlike any of the other filetest operators.
A BEGIN{} block happens during compilation, so if you try to decompile this script with ‑MO=Deparse to see what it really does, that check will execute. With an INIT{}, it will not.
Doing just that will reveal that the implicit input loop has a label called LINE, which you might in other circumstances use to your advantage.
HTH
What's wrong with:
head -3 infile.txt
If you really must use Perl then this works:
perl -ne 'if ($. <= 3) {print} else {exit}' infile.txt
You can use the range operator:
perl -ne 'if (1 .. 3) { print } else { last }' infile.txt
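For a range that does not start at row 1, the same idea can be written out as a small script. This is a sketch only: the helper name rows_between is hypothetical, and an in-memory filehandle stands in for infile.txt so that the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: return lines $from..$to (1-based) read from a filehandle.
sub rows_between {
    my ($fh, $from, $to) = @_;
    my @rows;
    while (my $line = <$fh>) {
        next if $. < $from;      # $. is the current line number of $fh
        push @rows, $line;
        last if $. >= $to;       # stop reading once the last wanted row is seen
    }
    return @rows;
}

# An in-memory filehandle stands in for infile.txt.
my $text = join '', map { "row $_\n" } 1 .. 5;
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
print rows_between($fh, 2, 4);   # prints row 2, row 3, row 4
close $fh;
```

Leaving the loop with last avoids reading the rest of a large file, which is the same reason the one-liners above call exit or last.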