Perl chomp converts a string into a numerical value

I get weird behavior from chomp. Reading from <STDIN> leaves a trailing newline on the input, and I want to remove it, so I use chomp as follows:
print("Enter Your Name:\n");
$n = <STDIN>;
$n = chomp($n);
print("Your Name: $n");
$out = "";
for ($i =0; $i < 10; $i++)
{
$out .= $n;
print ($out);
print("\n");
}
When I enter any name (a string) such as "Fox", I expect output like:
Fox
FoxFox
FoxFoxFox
FoxFoxFoxFox
etc..
However, "Fox" is replaced by the numerical value 1, i.e.:
1
11
111
1111
I looked up chomp in the official Perl manual but could not find any help there. Could anyone explain why chomp behaves like that and how I can solve this issue?
Edit
I reviewed the example in the book again and found that it uses chomp without assignment, i.e.:
$n = chomp($n);
# Is replaced by
chomp($n);
And indeed, written this way the script prints the expected output! But I don't know how or why.

From the perldoc on chomp:
It returns the total number of characters removed from all its arguments
You're setting $n to the return value of chomp($n):
$n = chomp($n);
To do what you want, simply call chomp without assigning its result; it modifies $n in place:
chomp($n);
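A minimal sketch of the difference between the modified variable and chomp's return value. The helper name clean_line is hypothetical and takes the raw line as an argument instead of reading <STDIN>, so the behaviour can be checked without typing input; it assumes the default input record separator (a newline).

```perl
use strict;
use warnings;

# Hypothetical helper: strips the trailing newline from a copy of the input
# and returns both the cleaned string and chomp's return value.
sub clean_line {
    my ($line) = @_;
    my $removed = chomp($line);   # chomp edits $line in place and returns a count
    return ($line, $removed);
}

my ($name, $removed) = clean_line("Fox\n");
print "name=$name removed=$removed\n";   # prints "name=Fox removed=1"
```

Assigning the return value back to the variable, as in $n = chomp($n), overwrites the cleaned string with that count, which is where the 1 comes from.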

Related

Storing variables dynamically in Perl

I am trying to store and print variables dynamically in Perl: ask the user how many variables to create, ask for the text to store in each one, then output the length of the text held in each. This is what I came up with:
use strict;
use warnings;
sub main {
my %VarStore = ();
print ("How many variables to create: ");
chomp(my $varNum = <STDIN>);
my $counter = 1
while ($counter <= $varNum) {
print "Enter text to variable $counter: \n";
chomp(my $buffer = <STDIN>);
$VarStore{'var'$counter} = $buffer;
$counter ++;
}
while ($counter <= $varNum) {
print "Variable $counter is length($VarStore{'var'$counter}) character long \n";
$counter ++;
}
}
What I would like is:
> How many variables to create: 3
> Enter text to variable 1: ABCQWEPOL
> Enter text to variable 2: xJSAG!HHKSKASK
> Enter text to variable 3: KakA
> Variable 1 is 9 character long
> Variable 2 is 14 character long
> Variable 3 is 4 character long
Any clue why my code is not working? I thought of a hash here so that I can create dynamic variables say with keys var1, var2, var3, etc depending on the input the user gives to create them. Thanks in advance.
You are correct that a hash is a good solution to this problem. You have two problems in your code. First, $VarStore{'var'$counter} is not valid syntax. You need to use the . operator to concatenate the strings, as in $VarStore{'var'.$counter}, or use double quotes to interpolate the variable into the string, as in $VarStore{"var$counter"}.
Unlike variables, function calls can't be interpolated directly into strings, so the length() call should be done separately. Alternatively, you can concatenate the strings with the function call: print "Variable $counter is " . length($VarStore{"var$counter"}) . " long\n";
The second problem is that after your first while loop completes, the $counter variable you reuse for the next while loop is already greater than $varNum, so you need to reset it to 1 first: $counter = 1;
It may be simpler to use foreach loops to iterate through the count. Also, sub main is not needed but if you use it you need to actually call main(); somewhere so it will run.
use strict;
use warnings;
my %VarStore;
print ("How many variables to create: ");
chomp(my $varNum = <STDIN>);
foreach my $counter (1..$varNum) {
print "Enter text to variable $counter: \n";
chomp(my $buffer = <STDIN>);
$VarStore{"var$counter"} = $buffer;
}
foreach my $counter (1..$varNum) {
my $length = length($VarStore{"var$counter"});
print "Variable $counter is $length character long \n";
}

How can I count the characters in STDIN using perl without wc?

I am attempting to write a script to count the number of lines, words, and characters the user enters on STDIN. Using the script below, I can accomplish this when the user passes a file as a command-line argument, but when I attempt to use this code for STDIN, I end up with an infinite loop. What should I change to fix this?
print "Enter a string to be counted";
my $userInput = <STDIN>;
while ($userInput) {
$lines++;
$chars += length ($_);
$words += scalar(split(/\s+/, $_));
}
printf ("%5d %5d %5d %10s", $lines, $words, $chars, $fileName);
Your program is fine, except that you need to read from the file handle in the while test. At present you are reading just one line from STDIN and repeatedly checking that it is true, i.e. not empty, zero, or undef.
Your code should look like this
use strict;
use warnings;
my ($lines, $chars, $words) = (0, 0, 0);
print "Enter a string to be counted";
while (<STDIN>) {
++$lines;
$chars += length;
$words += scalar split;
}
printf "%5d %5d %5d\n", $lines, $words, $chars;
Note that I have used just length instead of length $_ as $_ is the default parameter for the length operator. $_ only really comes into its own if you use the defaults.
Similarly, the default parameters to split are split ' ', $_ which is what you want in preference to split /\s+/, $_ because the latter returns a zero-length initial field if there are any leading spaces in $_. The special value of a single literal space ' ' just extracts all the sequences of non-space characters, which is almost always what you want. Anything other than just a single space is converted to a regex pattern as normal.
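The difference between the two forms of split is easy to see on a string with leading whitespace. This is a small sketch; the helper names fields_awk and fields_regex are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two forms of split discussed above.
sub fields_awk   { my ($s) = @_; my @f = split ' ',   $s; return @f; }
sub fields_regex { my ($s) = @_; my @f = split /\s+/, $s; return @f; }

my $text = "  the quick  fox\n";

my @awk   = fields_awk($text);     # leading whitespace is skipped
my @regex = fields_regex($text);   # leading whitespace yields an empty first field

print scalar(@awk),   "\n";   # prints 3
print scalar(@regex), "\n";   # prints 4
```

In both cases the trailing newline produces only a trailing empty field, which split drops by default.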
Finally, I have used ++$lines instead of $lines++. The latter is popular only because of the name of the language C++, and it is less common that the value returned by the expression needs to be the original value of the variable rather than the new one. Much more often the increment is used as a statement on its own, as here, when the returned value is irrelevant. If Perl didn't optimise it out (because the context is void and the return value is unused) the code would be doing unnecessary additional work to save the original value of the variable so that it can be returned after the increment. I also think ++$var looks more like the imperative "increment $var" and improves the readability of the code.
Your input has to be within the loop. Else you are processing the same string over and over again.
Maybe this is what you need?
use strict;
use warnings;
print "Enter a string to be counted:\n";
my $lines = 0;
my $chars = 0;
my $words = 0;
while (<>) {
chomp;
$lines++;
$chars += length ($_);
$words += scalar(split(/\s+/, $_));
}
printf ("lines: %5d words: %5d chars: %5d\n", $lines, $words, $chars);

Why does chomp($Array[0]) delete part of the string contained in $Array[0]?

I have an input file which looks like this:
1a0i b.40.4.6
1a49 b.58.1.1
1a82 c.37.1.10
1atp d.144.1.7
.
.
.
.
Problem 1
I put each line into @Array.
When I use
$Line = chomp($Array[0]);
print $Line;
the output on screen is 1
but when I use
$Line = $Array[0];
print $Line;
the output on screen is 1a0i b.40.4.6
Why does chomp leave only one character in $Line?
Problem 2
I want to use b.40 as a file name, so here is my code:
$Array[0] is 1a0i b.40.4.6
$Line = $Array[0];
@Element = split(" ",$Line);
@Tiny_element = split(".",$Element[1]);
$File_name = join(".",splice(@Tiny_element,0,2));
When I print $File_name it shows nothing, and Dumper \@Tiny_element shows an empty array.
Printing $Element[1] shows b.40.4.6, and index($Element[1],".") returns 1, so I know the string contains "." but split will not separate on it.
I also tried split("\.",$Element[1]) and split('.',$Element[1]), but neither solved it.
What's wrong with it?
Thanks
Perhaps you could try reading the documentation for a function that you are using, rather than just guessing at its behaviour.
The documentation for chomp says this:
It returns the total number of characters removed from all its arguments
The string is edited in place.
Answer to problem 1:
Use:
chomp($Array[0]);
$Line = $Array[0];
instead of:
$Line = chomp($Array[0]);
That's because return value of chomp is not the string, but the number of trailing characters removed from the string.
Answer to problem 2
$File_name = $1 if ($Line =~ /\s([^\.]+\.[^\.]+)/);
For your second problem, if I were to clarify your code by changing the first parameter of your split statements as follows:
$Line = $Array[0];
@Element = split(/ /,$Line);
@Tiny_element = split(/./,$Element[1]);
$File_name = join(".",splice(@Tiny_element,0,2));
and reminded you that . is a wildcard character in regular expressions, would you better understand your error?
BTW: /[.]/ is a great regex for finding literal periods.
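To spell out the fix: split always treats its first argument as a pattern, so "." and '.' both mean "any character", and in a double-quoted "\." the backslash is consumed by the string before split ever sees it. With /./ every character is a separator, every field is empty, and split discards the trailing empty fields, which leaves an empty list. A minimal sketch of a working version follows; the helper name file_name_from_line is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first two dot-separated parts of the
# second whitespace-separated column, e.g. "b.40" from "1a0i b.40.4.6".
sub file_name_from_line {
    my ($line) = @_;
    my @element = split ' ', $line;
    my @tiny    = split /[.]/, $element[1];   # [.] matches a literal period
    return join '.', @tiny[0, 1];
}

print file_name_from_line("1a0i b.40.4.6"), "\n";   # prints "b.40"
```

Writing the pattern as /\./ works equally well; the character class is just harder to misread.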

Can I search among the keys of a hash and assign the matching value to a variable in Perl?

I want to use the substr function to extract some nucleotides from a set of sequences. Here are those sequences in FASTA format:
>dvex28051
AAAACAAAAACATTCGCTAGAAAGTAATCAGCTGGTCATTTATTTGAAATGTTAATGATATATTTCATGTTGCTAATTTTTTATGAAAAAAATCATTGCTTATTTAATTACTCTTGGTTCTTGACCAACTATAAAAGCATTGTTTAGTATCAAGTGTCCAGGTATCAGCAGTTTTGTTTGAAAACAAACTTTTATTCATGCAGTCAGTGGCGGATCCAGGTAGAGTGCAGAGGCAGCACCCTCCGTCAGAAAACCAAAAAAAGAAGAAATGAAAAATTATAAAAAAAATTTCTAAACGTTGGTGCACTTAAGTGTAGCAAAAAATTCCTGTTTAGATATTCAGTGGGGAGCGACACCTTTTGGGGCCTATAGCTTCAAATCTTACTTGGTGACCTAAAATCGCTTTTTCGTTGGATCTGCGAAAGCTAGAATTTGGTTGCTGCAAATCGAATCGGTGCATCAACTGCATCAATATCAACGATGTGGTGACTGGTGGTATATTTTGGGTTCGTGCAATGCTACATTTATTTCAATCATATTTCAAGGCAGAAAGGGAAAGAAAACATCAGGTCAAGACAGTGGCGTAGCGAGGGAAGGGGGGCATACGTCCCCGGGCGCAACACGATGTCTTTTTTTTTAATCATCTGCGAAATTCAGACATTTTTTAGAGACTAAATGAAACTATGGAAAACCGGGCCCTTATAAAAGTTGAGACCAAGTGAAAAACTGGGGATAAAACATGAAAATCGGGCTCCAAAAGAATGAGAGTCCGCCCTTGGTCTGTACCAGCATGATTTGAGCGCAAATTTCATTAAGCCCCCGGGCGCAAGACACTCACGCTACGCCCCTGGGTAAAGACAAACAGAGTAGTTTTTCTTATAAACACAAGCATGCACAAACAACATAAAAACAAAACACAGTTTTTTTTAAGACGATGTGCTGCGTGCACCCGCTCAATGTTTTTTTTTTTTTTTTATAGAAAAGCAAAACTTTGAAAGGTTAACGTCAACTCATTTTACAACAATTTGTGGCAAATGGTATCAAGGTATCAAGCAATTAACTAAATGTCTTCCACTAGAACGCAGAACACCATTTTGCAATTATTTATTTGATGTAAACCAGTGTGTTAGATCAAAATCACTTCGACGCCGTTTTTTGACTCCGTGAAAATCTTGGTATTCTTCTCGCATTGCATAATGATGGTTTGTTGAAATAAAATTAAACGCTTAACGTTCTTAAAATGAGCGCGATACTACTTTTCTTTGTAGATTTTCTGCATGCGCTCCTTTTAAGTTGATCCCGAGCTACAAACTTCTTTATGAACGTTTTGGATTTCTCCAAAATAAAGCCTGCAAGCAGTTTTCTAAAAACACCGCACCCCCCATTAGGAATTTCTAGATCCGCCCCTGCATACAGTATTTGTTAATTATTAAAACCAACCAGCAGCAATTGTTTATTCAATGACTATTAAACCAACCTGGATAGTGCGTTTGGTCTTGATTGAAGCGATTGCTGCATTGACGTCTTTCGGAACCACATCACC
>dvex294195
GAATCAGTGGAAAAGTCACAACGCAGCTTGCCGAATTACTGCAGATTCTTTACACTTTTTTTTCTACATTATCACTGTTTTGCTTAATTTTCAATTATAGAAATCAAAATTAATAACTGGTATGTAGTTGGTCGGTGCTTCGAGAAAGTAGCCTACTCAATGATTTCTCAGAATGTTACAGTACTTCAAAAAAACAGACTACCCATTTCAAAAAATATAAACCTAGTA
I want to compare each key of the hash with the Hit column (dvex\d++) of this table:
#Query Hit sense start end star_q end_q lenght_q # this line is informative don't make part of the code.
miRNA1 dvex28051 + 205 232 11 38 51
miRNA1 dvex202016 - 75 106 17 48 51
miRNA1 dvex294195 + 55 85 11 48 51
If the key exists, I want to assign its hash value to a variable (e.g. $sequence) so that I can apply the substr function:
my $fragment = substr $sequence, $start, $length_sequence;
I made an array of the sequence lines and tried reading it two values at a time and comparing them:
while (my $line1 = <$MYINPUTFILE>){ #Entry of the sequences Fasta file
chomp $line1;
push @array_lines, $line1;
}
while (my $line2 = <$IN>){ #Entry of the table
chomp $line2;
push @database_lines, $line2;
}
foreach my $database_line (@database_lines){ #each value of the table
my @entry = split /\s++/,$database_line;
$pattern = $entry[1];
$query = $entry[0];
$start = $entry[3];
$l_pattern = length $pattern;
$end = $entry[4];
$lng_sequence = ($end - $start) + 1;
$sense = $entry[2];
$l_query = $entry[7];
my $count = 2;
for (my $i = 0; $i <= $#array_lines; $i +=$count){
chomp $array_lines[$i-2];
chomp $array_lines[$i-1];
$seq = $array_lines[$i-1];
$header = $array_lines[$i-2];
if($new_header =~ /$pattern/ && $l_header == $l_pattern){
if(($end+$right_diff+$increment) > $l_query){
$clean_seq = substr $seq, $start, $l_query;
} else {;}
}
The problem with my code is that Perl always takes $seq to be the last sequence and applies substr to that one. I need to search for $pattern among the sequences and, if it exists, assign the matching sequence to $seq and then apply substr.
Any suggestions?
I see two significant problems with your code. First, in the loop:
for (my $i = 0; $i <= $#array_lines; $i +=$count){
chomp $array_lines[$i-2];
chomp $array_lines[$i-1];
$seq = $array_lines[$i-1];
$i is set to zero the first time through, but you access array elements $i-1 and $i-2. Element -1 will be the last element of the array, and -2 will be the second to the last element. So it looks like $seq and $header will have incorrect values the first time through your loop. Maybe you need to start $i at $count instead of zero?
Secondly, in this line:
if(($end+$right_diff+$increment) > $l_query){
$increment appears only here in your code. It is never set to anything. Did you mean to use $i here?
A few other suggestions:
Make sure you use warnings; use strict; This will catch errors such as the $increment variable above.
Here is a simpler way to read a file into an array:
my @array_lines = <$MYINPUTFILE>;
chomp @array_lines;
Within regexes, ++ is a special quantifier that disables backtracking. If you want to split on one or more whitespace characters, it is more typical to use split /\s+/, or the equivalent split ' '
With this line, you appear to be simply checking that two strings are equal:
if($new_header =~ /$pattern/ && $l_header == $l_pattern)
You could just do this instead:
if($new_header eq $pattern)
When you have multiple conditions, it is clearer to put them all in one if statement instead of using nested statements. If you have many conditions, you can put them on multiple lines for clarity.
It isn't necessary to use else {;} If you don't need to do anything there, just omit the else clause altogether.
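Since the question asks about looking up the Hit column among hash keys, here is a hedged sketch of that approach: load the FASTA records into a hash keyed by header name, then fetch the sequence for each table row directly instead of scanning an array. The helper names read_fasta and fragment_for are hypothetical, the sample data is made up, and treating start and end as 1-based inclusive coordinates is an assumption the question does not confirm.

```perl
use strict;
use warnings;

# Build a hash from FASTA header name (without the leading ">") to sequence.
# Assumes each record is one header line followed by one or more sequence lines.
sub read_fasta {
    my (@lines) = @_;
    my (%seq_for, $name);
    for my $line (@lines) {
        chomp $line;
        if ($line =~ /^>(\S+)/) {
            $name = $1;
            $seq_for{$name} = '';
        }
        elsif (defined $name) {
            $seq_for{$name} .= $line;
        }
    }
    return %seq_for;
}

# Hypothetical helper: extract the hit region for one table row.
# Treats start and end as 1-based, inclusive coordinates (an assumption).
sub fragment_for {
    my ($seq_for, $row) = @_;
    my (undef, $hit, undef, $start, $end) = split ' ', $row;
    return unless exists $seq_for->{$hit};    # hit not in the FASTA file
    return substr $seq_for->{$hit}, $start - 1, $end - $start + 1;
}

my %seq_for = read_fasta(">seqA\n", "ACGTACGTAC\n", ">seqB\n", "TTTTGGGG\n");
print fragment_for(\%seq_for, "miRNA1 seqA + 3 6 11 38 51"), "\n";   # prints "GTAC"
```

With a hash, each row costs one exists check rather than a pass over every sequence, and the wrong-sequence problem disappears because the lookup is by exact header name.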

How can I print a matching line and the next three lines in Perl?

I need to search for a pattern and write that line as well as the next 3 lines into a file (FILE). Is this a correct way to do it? Thanks.
print FILE if /^abc/;
$n=3 if /^abc/;
print FILE if ($n-- > 0);
I like the .. operator:
perl -ne 'print if (/abc/ and $n=3) .. not $n--'
but you haven't described what should happen if the abc pattern is repeated within the following three lines. If you want to restart the counter, your approach is correct once you fix a small bug that prints the matching line twice.
perl -ne'$n=4 if/abc/;print if$n-->0'
This is a feature of the command-line grep(1). No programming needed:
grep abc --after-context=3
You do get '--' lines between groups of context, but those are easy enough to strip. It's also easy enough to do the whole thing in Perl. :)
The trick is what you want to do when one of the following three lines also contains the pattern you're looking for. grep(1) will reset the counter and keep printing lines.
You could simplify it to using a flag variable to know if you should print a line:
while( <$input> ) {
$n=4 if /^abc/;
print FILE if ($n-- > 0);
}
Besides simplification, it also fixes a problem: in your version the abc string will be printed twice.
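The counter logic can be checked in isolation with a small sketch. The helper name match_plus_three is hypothetical, and it collects lines into a list rather than printing to FILE so the result is easy to inspect.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line matching /^abc/ plus the three
# lines after it. A match inside the window restarts the countdown.
sub match_plus_three {
    my (@lines) = @_;
    my $n = 0;
    my @out;
    for my $line (@lines) {
        $n = 4 if $line =~ /^abc/;      # the match itself plus three more
        push @out, $line if $n-- > 0;
    }
    return @out;
}

# Prints the lines abc, 1, 2 and 3; x, 4 and 5 are skipped.
print match_plus_three(map { "$_\n" } qw(x abc 1 2 3 4 5));
```

Setting the counter to 4 rather than 3 is what lets a single print statement cover both the matching line and the three that follow it.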
There is no need to slurp the file in or try to write your code on a single line:
#!/usr/bin/perl
use strict;
use warnings;
while ( my $line = <DATA> ) {
if ( $line =~ /^abc/ ) {
print $line;
print scalar <DATA> for 1 .. 3;
}
}
__DATA__
x
y
z
abc
1
2
3
4
5
6
Another possible solution...
#!/usr/bin/perl
use strict;
my $count = 0;
while (<DATA>) {
$count = 1 if /abc/;
if ($count >= 1 and $count <= 3) {
next if /abc/;
print;
$count++;
}
}
__DATA__
test
alpha
abc
1
234123
new_Data
test
I'd rather take a few extra lines of code and make everything more clear. Something like this should work:
my $count = 0;
while ( defined( my $line = shift @file ) ) {   # shift keeps the original line order
if ( $line =~ /^abc/ ) {
$count = 4;
}
if ( $count > 0 ) {
print FILE $line;
$count--;
}
}
Edit to respond to comments:
missing the regex was a bug, fixed now.
printing the newlines or not is certainly optional, as is slurping the file in. Different people have different styles, that's one of the things that people like about Perl!