Variable scope outside a foreach loop in Perl

Here is the problem:
I am generating 10 sequences of 50 random characters each, and I need to access the full 50-character string outside the inner foreach loop.
I have tried putting the 50x iteration inside a subroutine and calling it, but that was unsuccessful.
Thus far, I only get a single character outside the foreach loop, whether it's in a subroutine or not. I'm fairly certain this is a scope issue that I'm failing to see.
So, the code:
#!/usr/bin/perl
use strict;
use warnings;
my @dna = ('A','G','T','C');
my $i;
my $str;
for ($i=1; $i<11; $i++){
#print $i . " ";
foreach(1..50){
my $nt = int(rand $#dna + 1);
$str = $dna[$nt];
#correct here all 50 nts
print $str;
}
#single nt here
#print $str;
print "\n";
}
Output: Correct, but I need to access $str as shown below, outside the foreach loop and within the first for loop.
TGATTAGCGTCCGCGCGTATTGTATTAAGCCACAGAATGTAATGCCAAGA
GCTATAGGAAGACGCCGATCCCTGGACCGGCACAGGCACGGTAACAGCAG
TTGTTGTAGGATCCCAGGGAGCGAAGCACGTGAACTGCGACTAATTTCAA
TAACCAGGCAACACTAAACAGCTCCCATGTGTAAGGACGTATAGGCAGTT
GTAATTGTAGATCACAAAATTTACACGGTATAGCATTAACTGGAACCTGC
AACAGTGCCGTTTATTAATCTCCTCTAGTGTAGGGACGAATCGACCACGG
CGTGAGCAAGCACAAATATCCTTTAGGGGTGTGCTTAAAACACCCAGTAG
GAGTTCATAGGCCAACAATATGGCAAAGCCTTGCCCCATCAAATTCGGCG
TTGCGTCTGCGAACACTGTTGGTGTGCCTTTAGTGCGGGTTACTCGAGAA
CGCGATCTCCGTTTATAACGCTAGCAAACTACTACGGACCGAGGCATCGC
I removed the extra space in the string. It was superfluous.
This was another attempt at getting to the variable to no avail:
use strict;
use warnings;
my $str;
my @dna = ('A','G','T','C');
for (my $i=1; $i<11; $i++){
fifty();
print $str;
}
sub fifty {
foreach (1 .. 50){
my $nt = int(rand $#dna + 1);
$str = $dna[$nt];
return $str;
}
}

for (my $i=1; $i<11; $i++){
fifty();
In fifty you return something, but you discard it, because there is no assignment such as $str = fifty();
print $str;
}
And here you print something that seems to have no value yet. In fact fifty does assign to it, but only because $str is a global variable, which you shouldn't rely on.
sub fifty {
foreach (1 .. 50){
my $nt = int(rand $#dna + 1);
$str = $dna[$nt];
Here you discard whatever is in $str and assign a single letter instead. You are also assigning to a global variable, which you should avoid.
return $str;
}
}
And here you leave fifty on the very first iteration and return just that one character, which you then discard (see above).
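Putting those points together, a minimal sketch (not the asker's final code) might append to the string with `.=`, return it once after the loop, and assign the return value in the caller. The name `random_dna` and its length parameter are illustrative assumptions, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @dna = ('A', 'G', 'T', 'C');

# Hypothetical helper: returns a random DNA string of the requested length.
sub random_dna {
    my ($length) = @_;
    my $str = '';                        # lexical to the sub, no global needed
    for (1 .. $length) {
        $str .= $dna[ int rand @dna ];   # append, do not overwrite
    }
    return $str;                         # return once, after the loop
}

for my $i (1 .. 10) {
    my $seq = random_dna(50);            # keep the return value
    print "$seq\n";                      # the full 50-character string is available here
}
```

Because `$seq` is declared inside the outer loop, each of the 10 strings is visible after the inner loop has finished, which is what the question asks for.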

I found this to work perfectly. It turned out to be a scope issue as far as I could tell; I'm not sure why I was stuck. Regardless, moving on now.
#!/usr/bin/perl
use strict;
use warnings;
my @dna = ('A','G','T','C');
my $i;
my $str;
for ($i=1; $i<11; $i++){
my $filename = "seq_" . $i;
open(my $OUT, '>', $filename) or die("Can't open $filename($!)");
foreach(1..50){
my $nt = int(rand $#dna + 1);
$str = $dna[$nt];
print $OUT $str;
}
close $OUT;
}

Related

The output of a subroutine is returning 0

I have written a script which uses a subroutine to calculate the percentage of a nucleotide in a given sequence. When I run the script, the reported percentage for each nucleotide is always zero.
Here's my code:
#!/usr/bin/perl
use strict;
use warnings;
#### Subroutine to report percentage of each nucleotide in DNA sequence ####
my $input = $ARGV[0];
my $nt = $ARGV[1];
my $args = $#ARGV +1;
if($args != 2){
print "Error!!! Insufficient number of arguments\n";
print "Usage: $0 <input fasta file>\n";
}
my($FH, $line);
open($FH, '<', $input) || die "Could\'nt open file: $input\n";
$line = do{
local $/;
<$FH>;
};
$line =~ s/>(.*)//g;
$line =~ s/\s+//g;
my $perc = perc_nucleotide($line , $nt);
printf("The percentage of $nt nucleotide in given sequence is %.0f", $perc);
print "\n";
sub perc_nucleotide {
my($line, $nt) = @_;
print "$nt\n";
my $count = 0;
if( $nt eq "A" || $nt eq "T" || $nt eq "G" || $nt eq "C"){
$count++;
}
my $total_len = length($line);
my $perc = ($count/$total_len)*100;
}
I think that I am setting the $count variable wrong. I tried different ways but can't figure it out.
This is the input file
>XM_024894547.1 Trichoderma citrinoviride Redoxin (BBK36DRAFT_1163529), partial mRNA
ATGGCCTTCCGTCTCCCTCTGCGCCGCATTGCCCTGGCCCGCCCCGCCACCGTTGCGCGTGGCTTCCACT
CGACGCCCCGCGCCCTGGTCAAGGTCGGCGACGAGGTCCCGAGCTTGGAGCTGTTCGAGAAGTCGGCCGC
CAGCAAGATCAACCTGGCCGACGAGTTCAAGAAGGGCGACGGCTACATTGTCGGCGTCCCGGGCGCCTTC
TCCGGCACCTGCTCCGGCACCCACGTCCCGTCGTACATCAACCACCCTGACATCAAGACGGCCGGCCAGG
TCTTTGTCGTCTCCGTCAACGACCCCTTTGTCATGAAGGCTTGGGCAGACCAGCTGGATCCCGCCGGAGA
GACAGGAATCCGGTTCGTTGCCGACCCCACGGCTGAGTTCACAAAGGCTCTGGAACTGGGATTCGACGAC
GCTGCTCCTCTGTTCGGAGGCACCCGAAGCAAGCGCTATGCTCTCAAGGTTAAGGATGGCAAGGTCACTG
CCGCCTTTGTTGAGCCCGACAACACGGGCACTTCCGTGTCAATGGCCGACAAGGTCCTCAGCTAA
The problem is here:
my $perc = perc_nucleotide($line , $nt);
printf("The percentage of $nt nucleotide in given sequence is %.0f", $perc);
perc_nucleotide is returning 0.18018018018018, but the format %.0f says to print it with no decimal places, so it gets rounded to 0. You should probably use something more like %.2f.
It's also worth noting that perc_nucleotide does not have a return. It still works, but for reasons that might not be obvious.
perc_nucleotide sets my $perc = ($count/$total_len)*100; but never uses that $perc. The $perc in the main program is a different variable.
perc_nucleotide does return something: every Perl subroutine without an explicit return returns the value of the last evaluated expression. In this case that is my $perc = ($count/$total_len)*100; but the last-evaluated-expression rules can get a bit tricky.
It's easier to read and safer to have an explicit return: return ($count/$total_len)*100;
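As a rough illustration of both points, the sketch below uses an explicit return and prints the value from the question (1 match in 555 bases) with each format. The helper name `percentage` and the zero-length guard are assumptions added for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage of $count out of $total, returned explicitly.
sub percentage {
    my ($count, $total) = @_;
    return 0 if $total == 0;             # guard against division by zero
    return ($count / $total) * 100;
}

my $perc = percentage(1, 555);           # 1 match in 555 bases, as in the question
printf "%%.0f gives %.0f\n", $perc;      # prints "%.0f gives 0"
printf "%%.2f gives %.2f\n", $perc;      # prints "%.2f gives 0.18"
```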
I corrected the script and it gave me the right answers.
#!/usr/bin/perl
use strict;
use warnings;
##### Subroutine to calculate percentage of all nucleotides in a DNA sequence #####
my $input = $ARGV[0];
my $nt = $ARGV[1];
my $args = $#ARGV + 1;
if($args != 2){
print "Error!!! Insufficient number of arguments\n";
print "Usage: $0 <input_fasta_file> <nucleotide>\n";
}
my($FH, $line);
open($FH, '<', $input) || die "Couldn\'t open input file: $input\n";
$line = do{
local $/;
<$FH>;
};
chomp $line;
#print $line;
$line =~ s/>(.*)//g;
$line =~ s/\s+//g;
#print "$line\n";
my $total_len = length($line);
my $perc_of_nt = perc($line, $nt);
printf("The percentage of nucleotide $nt in a given sequence is %.2f%%", $perc_of_nt);   # changed: %.2f instead of %.0f
print "\n";
#print "$total_len\n";
sub perc{
my($line, $nt) = @_;
my $char; my $count = 0;
# changed: count every character that equals $nt
foreach $char (split //, $line){
if($char eq $nt){
$count += 1;
}
}
return (($count/$total_len)*100);   # changed: explicit return
}
The answer for the above input file is:
Total_len = 555
The percentage of nucleotide A in a given sequence is 18.02%
The percentage of nucleotide T in a given sequence is 18.74%
The percentage of nucleotide G in a given sequence is 28.47%
The changes which I made are marked with "# changed" comments.
Thanks for amazing insight!!!

Check how many "," are in each line in Perl

I have to count how many times "," occurs in each line of a file. Does anybody have an idea how I can do this in Perl?
At the moment my code looks like this:
open($list, "<", $student_list);
while ($linelist = <$list>)
{
printf("$linelist");
}
close($list);
But I have no idea how to count the "," characters in each $linelist :/
Use the transliteration operator in counting mode:
my $commas = $linelist =~ y/,//;
Applied to your code:
use warnings;
use strict;
open my $list, "<", "file.csv" or die $!;
while (my $linelist = <$list>)
{
my $commas = $linelist =~ y/,//;
print "$commas\n";
}
close($list);
If you just want to count the number of somethings in a file, you don't need to read it into memory. Since you aren't changing the file, mmap would be just fine:
use File::Map qw(map_file);
map_file my $map, $filename, '<';
my $count = $map =~ tr/,//;
#! perl
# perl script.pl [file path]
use strict;
use warnings;
my $file = shift or die "No file name provided";
open(my $IN, "<", $file) or die "Couldn't open file $file: $!";
my @matches = ();
my $index = 0;
# while <$IN> will get the file one line at a time rather than loading it all into memory
while(<$IN>){
my $line = $_;
my $current_count = 0;
# match globally, meaning keep track of where the last match was
$current_count++ while($line =~ m/,/g);
$matches[$index] = $current_count;
$index++;
}
$index = 0;
for(@matches){
$index++;
print "line $index had $_ matches\n"
}
You can use the mmap PerlIO layer instead of File::Map. It is almost as efficient as the former, but it is most probably already present in your Perl installation, so no module needs to be installed. Also, using y/// is more efficient than m//g in list context.
use strict;
use warnings;
use autodie;
use constant STUDENT_LIST => 'text.txt';
open my $list, '<:mmap', STUDENT_LIST;
while ( my $line = <$list> ) {
my $count = $line =~ y/,//;
print "There is $count commas at $.. line.\n";
}
If you would like grammatically correct output, you can use Lingua::EN::Inflect in the right place:
use Lingua::EN::Inflect qw(inflect);
print inflect "There PL_V(is,$count) $count PL_N(comma,$count) at ORD($.) line.\n";
Example output:
There are 7 commas at 1st line.
There are 0 commas at 2nd line.
There is 1 comma at 3rd line.
There are 2 commas at 4th line.
There are 7 commas at 5th line.
Do you want the number of commas for each line in the file, or the number of commas in the entire file?
On a per-line basis, replace your while loop with:
my @data = <$list>;
foreach my $line (@data) {
my @chars = split //, $line;
my $count = 0;
foreach my $c (@chars) { $count++ if $c eq "," }
print "There were $count commas\n";
}
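For the other case, the whole-file total, a minimal sketch could sum the per-line counts. It uses the `tr` counting idiom from the earlier answers; the helper name `count_commas` and the in-memory sample data are assumptions so that the example runs without a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: total number of commas readable from an open filehandle.
sub count_commas {
    my ($fh) = @_;
    my $total = 0;
    while (my $line = <$fh>) {
        $total += ($line =~ tr/,//);     # tr returns the number of characters matched
    }
    return $total;
}

# In-memory sample data stands in for the student list (an assumption).
my $data = "a,b,c\nno commas here\nx,y\n";
open my $list, '<', \$data or die "Cannot open in-memory file: $!";
my $total = count_commas($list);
close $list;

print "There were $total commas in the file\n";   # 3 for the sample data above
```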

How do I input file line results into an array?

My code so far only reads lines 1 to 4 and prints them. What I want to do instead of printing them is putting them into an array. So any help would be greatly appreciated. And hopefully just the code since it should be short. I learn much faster looking at full code instead of opening another 50 tabs trying to put multiple concepts together. Hopefully I'll learn this at some point and won't require help.
my $x = 1;
my $y = 4;
open FILE, "file.txt" or die "can not open file";
while (<FILE>) {
print if $. == $x .. $. == $y;
}
You should just put each line in an array with push:
my $x = 1;
my $y = 4;
my @array;
open FILE, "file.txt" or die "can not open file";
while (<FILE>) {
push (@array, $_) if ($. >= $x && $. <= $y);
}
The foreach at the end is just proof that it works. Note that it doesn't ignore blank lines; I figured you may want to keep them.
#!/usr/bin/perl
use warnings;
use strict;
my $fi;
my $line;
my $i = 0;
my @array;
open($fi, "<", "file.txt") or die "cannot open file.txt: $!";
while ($line = <$fi>) {
$array[$i] = $line;
if ($i == 3)
{
last;
}
$i++;
}
foreach(@array)
{
print $_;
}
You know, you don't need to keep iterating through the file once you've got all the data you need.
my $x = 1;
my $y = 4;
my @array;
my $file = 'file.txt';
# Lexical filehandle, three-argument open, meaningful error message
open my $file_h, '<', $file or die "cannot open $file: $!";
while (<$file_h>) {
push @array, $_ if $. >= $x; # This condition is unnecessary when $x is 1
last if $. == $y;
}
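The same idea can be wrapped in a small subroutine that returns the array. This is only a sketch: the name `lines_between` is hypothetical, and the in-memory data stands in for file.txt so the example is runnable on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: return lines $first..$last (1-based) from an open filehandle.
sub lines_between {
    my ($fh, $first, $last) = @_;
    my @lines;
    while (my $line = <$fh>) {
        push @lines, $line if $. >= $first;
        last if $. >= $last;             # stop reading once the range is complete
    }
    return @lines;
}

# Ten sample lines held in memory instead of a real file (an assumption).
my $data = join '', map { "line $_\n" } 1 .. 10;
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @array = lines_between($fh, 2, 4);
close $fh;

print @array;                            # prints "line 2", "line 3" and "line 4"
```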

perl printing hash of arrays with out Data::Dumper

Here is the code; I know it is not perfect Perl. If you have insight on how I can do better, let me know. My main question is: how would I print out the arrays without using Data::Dumper?
#!/usr/bin/perl
use Data::Dumper qw(Dumper);
use strict;
use warnings;
open(MYFILE, "<", "move_headers.txt") or die "ERROR: $!";
#First split the list of files and the headers apart
my @files;
my @headers;
my @file_list = <MYFILE>;
foreach my $source_parts (@file_list) {
chomp($source_parts);
my @parts = split(/:/, $source_parts);
unshift(@files, $parts[0]);
unshift(@headers, $parts[1]);
}
# Next get a list of unique headers
my @unique_files;
foreach my $item (@files) {
my $found = 0;
foreach my $i (@unique_files) {
if ($i eq $item) {
$found = 1;
last;
}
}
if (!$found) {
unshift @unique_files, $item;
}
}
@unique_files = sort(@unique_files);
# Now collect the headers is a list per file
my %hash_table;
for (my $i = 0; $i < @files; $i++) {
unshift @{ $hash_table{"$files[$i]"} }, "$headers[$i]";
}
# Process the list with regex
while ((my $key, my $value) = each %hash_table) {
if (ref($value) eq "ARRAY") {
print "$value", "\n";
}
}
The Perl documentation has a tutorial on "Printing of a HASH OF ARRAYS" (without using Data::Dumper)
perldoc perldsc
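In the spirit of that tutorial, a minimal sketch of printing a hash of arrays with a plain loop might look like the following. The sample file names and headers are made up, and the helper name `hoa_lines` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: one "key: v1, v2" line per hash entry, sorted by key.
sub hoa_lines {
    my ($hoa) = @_;
    my @lines;
    for my $key (sort keys %$hoa) {
        my @values = @{ $hoa->{$key} };          # dereference the array reference
        push @lines, "$key: " . join(', ', @values);
    }
    return @lines;
}

# Sample data, invented for illustration only.
my %hash_table = (
    'a.c' => [ 'stdio.h', 'stdlib.h' ],
    'b.c' => [ 'string.h' ],
);

print "$_\n" for hoa_lines(\%hash_table);
# a.c: stdio.h, stdlib.h
# b.c: string.h
```

Printing `$value` directly shows something like ARRAY(0x...) because the hash stores a reference; the `@{ ... }` dereference is what turns it back into a list.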
You're doing a couple things the hard way. First, a hash will already uniqify its keys, so you don't need the loop that does that. It appears that you're building a hash of files, with the values meant to be the headers found in those files. The input data is "filename:header", one per line. (You could use a hash of hashes, since the headers may need uniquifying, but let's let that go for now.)
use strict;
use warnings;
open my $files_and_headers, "<", "move_headers.txt" or die "Can't open move_headers: $!\n";
my %headers_for_file;
while (defined(my $line = <$files_and_headers> )) {
chomp $line;
my($file, $header) = split /:/, $line, 2;
push @{ $headers_for_file{$file} }, $header;
}
# Print the arrays for each file:
foreach my $file (keys %headers_for_file) {
print "$file: @{ $headers_for_file{$file}}\n";
}
We're letting Perl do a chunk of the work here:
If we add keys to a hash, they're always unique.
If we interpolate an array into a double-quoted string, Perl puts spaces between its elements.
If we push onto an empty hash element, Perl automatically puts an empty anonymous array in the element and then pushes onto that.
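A small, self-contained sketch of those three behaviours (the sample file names are invented for illustration):

```perl
use strict;
use warnings;

my %headers_for_file;

# No array exists yet for 'a.c'; push creates one automatically (autovivification).
push @{ $headers_for_file{'a.c'} }, 'stdio.h';
push @{ $headers_for_file{'a.c'} }, 'stdlib.h';   # same key again: still one entry
push @{ $headers_for_file{'b.c'} }, 'string.h';

print ref($headers_for_file{'a.c'}), "\n";         # ARRAY
print scalar(keys %headers_for_file), " files\n";  # 2 files
print "a.c: @{ $headers_for_file{'a.c'} }\n";      # a.c: stdio.h stdlib.h
```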
An alternative to using Data::Dumper is to use Data::Printer:
use Data::Printer;
p $value;
You can also use this to customise the format of the output. E.g. you can have it all in a single line without the indexes (see the documentation for more options):
use Data::Printer {
index => 0,
multiline => 0,
};
p $value;
Also, as a suggestion for getting unique files, put the elements into a hash:
my %unique;
@unique{ @files } = @files;
my @unique_files = sort keys %unique;
Actually, you could even skip that step and put everything into %hash_table in one pass:
my %hash_table;
foreach my $source_parts (@file_list) {
chomp($source_parts);
my @parts = split(/:/, $source_parts);
unshift @{ $hash_table{$parts[0]} }, $parts[1];
}

How can I search and replace a match a specific number of times in a string in Perl?

How can I search and replace a match with specific number of times using s///;. For example:
$string="abcabdaaa";
I want to replace a with i in $string n times. How can I do that? n is an integer provided by user.
The simple answer probably doesn't do what you want.
my $str = 'aaaa';
$str =~ s/a/a_/ for 1..2;
print $str, "\n"; # a__aaa. But you want a_a_aa, right?
You need to count the replacements yourself, and act accordingly:
$str = 'aaaa';
my $n = 0;
$str =~ s/(a)/ ++$n > 2 ? $1 : 'a_' /ge;
print $str, "\n";
See the FAQ, How do I change the Nth occurrence of something? for related examples.
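The same counter technique also covers the FAQ's case of changing only the Nth occurrence. This is a sketch under my own naming; `replace_nth` is a hypothetical helper, not something taken from the FAQ.

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the $nth match of $pattern in $text.
sub replace_nth {
    my ($text, $pattern, $replacement, $nth) = @_;
    my $seen = 0;
    # Every match is visited; only the $nth one is swapped, the rest are put back.
    $text =~ s/($pattern)/ ++$seen == $nth ? $replacement : $1 /ge;
    return $text;
}

print replace_nth('abcabdaaa', qr/a/, 'i', 2), "\n";   # abcibdaaa
```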
Just substitute $n times:
$string =~ s/a/i/ for 1..$n;
This will do it.
A more general solution would be a global substitution with a counter:
my $i = 0; # count the substitutions made
$string =~ s/(a)/ ++$i > $n ? $1 : "i" /ge;
I'm not aware of any flag that would do that. I'd simply use a loop:
for (my $i = 0; $i < $n; $i++)
{
$string =~ s/a/i/;
}
You can try this (the limit must be $n + 1, because splitting into $n + 1 fields consumes $n separators):
$str1 = join('i', split(/a/, $str, $n + 1));
Here is a way to do it, based on the comment you made to eugene y's answer:
#!/usr/bin/perl
use strict; use warnings;
my $string = '***ab***c';
my $n = 3;
1 while $n -- and $string =~ s/\*([^\n])/*\n$1/;
print "$string\n";
Output:
*
*
*
ab***c
Using
sub substitute_n {
my $n = shift;
my $pattern = shift;
my $replace = shift;
local $_ = shift;
my $i = 1;
s{($pattern)} {
$i++ <= $n ? eval qq{"$replace"} : $1;
}ge;
$_;
}
You can then write
my $s = "***ab***c";
print "[", substitute_n(2, qr/\*/, '$1\n', $s), "]\n";
to get the following output:
[*
*
*ab***c]
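If the string eval is a concern, one possible variant takes the replacement as a code reference instead. This is a sketch, not the answerer's code; the name `substitute_n_cb` and the callback interface are assumptions.

```perl
use strict;
use warnings;

# Hypothetical variant of substitute_n: the replacement is a code reference
# that receives the matched text, so no string eval is needed.
sub substitute_n_cb {
    my ($n, $pattern, $callback, $text) = @_;
    my $done = 0;
    $text =~ s{($pattern)}{ $done++ < $n ? $callback->($1) : $1 }ge;
    return $text;
}

my $s = "***ab***c";
# Append a newline to each of the first two asterisks.
print "[", substitute_n_cb(2, qr/\*/, sub { "$_[0]\n" }, $s), "]\n";
```

With the same input as above, this should produce the same bracketed output as the eval-based version, since only the first two matches are passed to the callback.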