I have written below mention code to read a file and and storing data to array #s_arr.
But when I am trying to print that #s_arr array outside the block it shows nothing.
use Data::Dumper;
my #s_arr;
my #err;
my %sort_h_1;
$fname = '/qv/Error.log';
open( IN, "<$fname" );
foreach $line ( <IN> ) {
if ( $line =~ /CODE\+(\w{3})(\d{5})/ ) {
$a = "$1$2";
push #err, $a;
}
}
close IN;
$prev = "";
$count = 0;
my %hash;
foreach ( sort #err ) {
if ( $prev ne $_ ) {
if ( $count ) {
$hash{$prev} = $count;
}
$prev = $_;
$count = 0;
}
$count++;
}
print Dumper \%hash;
printf( "%s:%d\n", $prev, $count ) if $count;
$hash{$prev} = $count;
my $c = 0;
print "Today Error Count\n";
foreach my $name ( sort { $hash{$b} <=> $hash{$a} } keys %hash ) {
#printf "%-8s %s\n", $name, $hash{$name};
#my %sort_h ;
push #s_arr, $name;
push #s_arr, $hash{$name};
#$sort_h{$name} = $hash{$name} ;
#print Dumper \%sort_h ;
#print Dumper \#s_arr ;
$c++;
if ( $c eq 30 ) {
exit;
}
}
print Dumper \#s_arr; # It's showing nothing
You are calling exit inside of your foreach loop. That makes the program stop, and the print Dumper #s_arr is never reached.
To break out of a loop you need to use last.
foreach my $name ( sort ... ) {
# ...
$c++;
last if $c == 30; # break out of the loop when $c reaches 30
}
I used the postfix variant of if here because that makes it way easier to read. Also note that as zdim pointed out above, you should use the numerical equality check == when checking for numbers. eq is for strings.
Related
I've written a script that requires the user to input a response which is saved as a variable. The problem is that I used a loop to do this and I have a second question that requires user input but this is outside of the first loop. The output from the second question requires that both the first variable & the second variable meet the condition which prints a response. I get the error message (using warnings) that the variable is not explicitly declared.
How can I prompt Perl to "remember" the user input from in the loop after the loop is over? Should I just continue the loop? If so, how?
Thanks for your help. I've attached the script. I assume it's unsophisticated and that it may be better to use a different type of loop but I'm just getting the hang of the for fxn so it would be helpful to stick to it. Any other tips in cleaning up the code would be helpful.
#!C:\Perl
#lesson2XUE.pl
use warnings; use strict;
my $fas="MTESMIRDVELAEEALPKKTGGPQG";
my $swi="MSTESMIRDV ELAEEALPKK TGGPQG";
print "Your sequence has been determined. How would you like it displayed? FASTA/SwissProt:";
for ( my $i=0; $i<4; $i++ ) {
my $choice=<STDIN>;
chomp $choice;
if ($choice =~ /FASTA/i) {
print $fas,"\n";
last;
} elsif ($choice =~ /SwissProt/i){
print $swi,"\n";
last;
} else {
print qq[Please type "FASTA" or "SwissProt" only\n];
}
}
sleep (1);
print "\n";
print qq[Would you like to see the other version? Y/N\n];
my $choice2=<STDIN>;
chomp $choice2;
if (($choice2 =~ /Y/i) && **($choice =~ /FASTA/i)**){
print $swi,"\n";
} elsif (($choice2=~/Y/i) && **($choice=~/SwissProt/i)**) {
print $fas, "\n";
} elsif ($choice2 =~ /N/i) {
print qq[Okay,have a good day];
}
The issue occurs where the script is bold. Because I am no longer in the loop, $choice is not defined.
problem with if (($choice2 =~ /Y/i) && **($choice =~ /FASTA/i)){** and } elsif ($choice =~ /SwissProt/i) (Realised that I cannot bold in 'script' view)
Thanks again!
As mentioned before... Declare $choice outside the scope of your loop. in your example $choice is out of scope after leaving the for loop...
use warnings; use strict;
my $fas="MTESMIRDVELAEEALPKKTGGPQG";
my $swi="MSTESMIRDV ELAEEALPKK TGGPQG";
print "Your sequence has been determined. How would you like it displayed? FASTA/SwissProt:";
my $choice="";
for (my $i = 0; $i < 4; $i++) {
$choice=<STDIN>;
chomp $choice;
if ($choice =~ /FASTA/i) {
print $fas,"\n";
last;
} elsif ($choice =~ /SwissProt/i){
print $swi,"\n";
last;
} else {
print qq[Please type "FASTA" or "SwissProt" only\n];
}
}
sleep (1);
print "\n";
print qq[Would you like to see the other version? Y/N\n];
my $choice2=<STDIN>;
chomp $choice2;
if (($choice2 =~ /Y/i) && ($choice =~ /FASTA/i)){
print $swi,"\n";
} elsif (($choice2=~/Y/i) && ($choice=~/SwissProt/i)) {
print $fas, "\n";
} elsif ($choice2 =~ /N/i) {
print qq[Okay,have a good day];
}
I think these would be help.
while(defined($line=<STDIN>)){
print "I saw $line";
}
I made a version:
use warnings;
use strict;
my $fas = "MTESMIRDVELAEEALPKKTGGPQG";
my $swi = "MSTESMIRDV ELAEEALPKK TGGPQG";
my $tmp = "";
main();
sub main () {
print
"Your sequence has been determined. How would you like it displayed? FASTA/SwissProt:";
while ( defined( my $line = <STDIN> ) ) {
chomp $line;
if ( $tmp eq "" ) {
process($line);
next;
}
else {
process2($line);
last;
}
}
}
sub process {
my $choice = shift;
$tmp = $choice;
if ( $choice =~ /FASTA/i ) {
print $fas, "\n";
}
elsif ( $choice =~ /SwissProt/i ) {
print $swi, "\n";
}
else {
$tmp = "";
print qq[Please type "FASTA" or "SwissProt" only\n];
return;
}
print "\n";
print qq[Would you like to see the other version? Y/N\n];
}
sub process2 {
my $choice2 = shift;
if ( ( $choice2 =~ /Y/i ) && ( $tmp =~ /FASTA/i ) ) {
print $swi, "\n";
}
elsif ( ( $choice2 =~ /Y/i ) && ( $tmp =~ /SwissProt/i ) ) {
print $fas, "\n";
}
elsif ( $choice2 =~ /N/i ) {
print qq[Okay,have a good day];
}
else {
print qq[Good day];
}
}
I am trying to print out the key for the hash if the value satisfies a certain condition. However, I am not sure how to access the hash key if it satisfies the value condition. This is the code I have:
foreach my $x (values %hash){
if ($x > $ARGV[1]){
$counter = $counter + 1
print "keys %hash\n"
}
}
print "$counter\n"
When you loop over the values, you have no access to the key.
for my $key (keys %hash) {
if ($hash{$key} > $ARGV[1]) {
$counter = $counter + 1;
print "$key\n";
}
}
print "$counter\n";
or
keys %hash; # reset iterator
while (my ($key, $value) = each %hash) {
if ($value > $ARGV[1]) {
$counter = $counter + 1;
print "$key\n";
}
}
print "$counter\n";
You can't access the key of a hash element given its value. After all, multiple keys may have the same value. But you can rely on Perl giving you the keys in the same order as the values. So you could write something like this
use strict;
use warnings;
my #keys = keys %hash;
my #vals = values %hash;
my $count = 0;
for my $val ( #values ) {
my $key = shift #keys;
if ( $val > $ARGV[1] ) {
++$count;
print $key, "\n";
}
}
print "$count\n";
But it would be far better to use a while loop with each to gather both the key and the value at the same time
while ( my ($key, $val) = each %hash ) {
if ( $val > $ARGV[1] ) {
++$count;
print $key, "\n";
}
}
I'm trying to use a foreach loop to loop through an array and then use a nested while loop to loop through each line of a text file to see if the array element matches a line of text; if so then I push data from that line into a new array to perform calculations.
The outer foreach loop appears to be working correctly (based on printed results with each array element) but the inner while loop is not looping (same data pushed into array each time).
Any advice?
The code is below
#! /usr/bin/perl -T
use CGI qw(:cgi-lib :standard);
print "Content-type: text/html\n\n";
my $input = param('sequence');
my $meanexpfile = "final_expression_complete.txt";
open(FILE, $meanexpfile) or print "unable to open file";
my #meanmatches;
#regex = (split /\s/, $input);
foreach $regex (#regex) {
while (my $line = <FILE>) {
if ( $line =~ m/$regex\s(.+\n)/i ) {
push(#meanmatches, $1);
}
}
my $average = average(#meanmatches);
my $std_dev = std_dev($average, #meanmatches);
my $average_round = sprintf("%0.4f", $average);
my $stdev_round = sprintf("%0.4f", $std_dev);
my $coefficient_of_variation = $stdev_round / $average_round;
my $cv_round = sprintf("%0.4f", $coefficient_of_variation);
print font(
{ color => "blue" }, "<br><B>$regex average: $average_round
 Standard deviation: $stdev_round Coefficient of
variation(Cv): $cv_round</B>"
);
}
sub average {
my (#values) = #_;
my $count = scalar #values;
my $total = 0;
$total += $_ for #values;
return $count ? $total / $count : 0;
}
sub std_dev {
my ($average, #values) = #_;
my $count = scalar #values;
my $std_dev_sum = 0;
$std_dev_sum += ($_ - $average)**2 for #values;
return $count ? sqrt($std_dev_sum / $count) : 0;
}
Yes, my advice would be:
Turn on strict and warnings.
perltidy your code,
use 3 argument open: open ( my $inputfile, "<", 'final_expression.txt' );
die if it doesn't open - the rest of your program is irrelevant.
chomp $line
you are iterating your filehandle, but once you've done this you're at the end of file for the next iteration of the foreach loop so your while loops becomes a null operation. Simplistically, reading the file into an array my #lines = <FILE>; would fix this.
So with that in mind:
#!/usr/bin/perl -T
use strict;
use warnings;
use CGI qw(:cgi-lib :standard);
print "Content-type: text/html\n\n";
my $input = param('sequence');
my $meanexpfile = "final_expression_complete.txt";
open( my $input_file, "<", $meanexpfile ) or die "unable to open file";
my #meanmatches;
my #regex = ( split /\s/, $input );
my #lines = <$input_file>;
chomp (#lines);
close($input_file) or warn $!;
foreach my $regex (#regex) {
foreach my $line (#lines) {
if ( $line =~ m/$regex\s(.+\n)/i ) {
push( #meanmatches, $1 );
}
}
my $average = average(#meanmatches);
my $std_dev = std_dev( $average, #meanmatches );
my $average_round = sprintf( "%0.4f", $average );
my $stdev_round = sprintf( "%0.4f", $std_dev );
my $coefficient_of_variation = $stdev_round / $average_round;
my $cv_round = sprintf( "%0.4f", $coefficient_of_variation );
print font(
{ color => "blue" }, "<br><B>$regex average: $average_round
 Standard deviation: $stdev_round Coefficient of
variation(Cv): $cv_round</B>"
);
}
sub average {
my (#values) = #_;
my $count = scalar #values;
my $total = 0;
$total += $_ for #values;
return $count ? $total / $count : 0;
}
sub std_dev {
my ( $average, #values ) = #_;
my $count = scalar #values;
my $std_dev_sum = 0;
$std_dev_sum += ( $_ - $average )**2 for #values;
return $count ? sqrt( $std_dev_sum / $count ) : 0;
}
The problem here is that starting from the second iteration of foreach you are trying to read from already read file handle. You need to rewind to the beginning to read it again:
foreach $regex (#regex) {
seek FILE, 0, 0;
while ( my $line = <FILE> ) {
However that does not look very performant. Why read file several times at all, when you can read it once before the foreach starts, and then iterate through the list:
my #lines;
while (<FILE>) {
push (#lines, $_);
}
foreach $regex (#regex) {
foreach $line (#lines) {
Having the latter, you might also what to consider using grep instead of the while loop.
This is the program as it stands right now, it takes in a .fasta file (a file containing genetic code), creates a hash table with the data and prints it, however, it is quite slow. It splits a string an compares it against all other letters in the file.
use strict;
use warnings;
use Data::Dumper;
my $total = $#ARGV + 1;
my $row;
my $compare;
my %hash;
my $unique = 0;
open( my $f1, '<:encoding(UTF-8)', $ARGV[0] ) or die "Could not open file '$ARGV[0]' $!\n";
my $discard = <$f1>;
while ( $row = <$f1> ) {
chomp $row;
$compare .= $row;
}
my $size = length($compare);
close $f1;
for ( my $i = 0; $i < $size - 6; $i++ ) {
my $vs = ( substr( $compare, $i, 5 ) );
for ( my $j = 0; $j < $size - 6; $j++ ) {
foreach my $value ( substr( $compare, $j, 5 ) ) {
if ( $value eq $vs ) {
if ( exists $hash{$value} ) {
$hash{$value} += 1;
} else {
$hash{$value} = 1;
}
}
}
}
}
foreach my $val ( values %hash ) {
if ( $val == 1 ) {
$unique++;
}
}
my $OUTFILE;
open $OUTFILE, ">output.txt" or die "Error opening output.txt: $!\n";
print {$OUTFILE} "Number of unique keys: " . $unique . "\n";
print {$OUTFILE} Dumper( \%hash );
close $OUTFILE;
Thanks in advance for any help!
It is not clear from the description what is wanted from this script, but if you're looking for matching sets of 5 characters, you don't actually need to do any string matching: you can just run through the whole sequence and keep a tally of how many times each 5-letter sequence occurs.
use strict;
use warnings;
use Data::Dumper;
my $str; # store the sequence here
my %hash;
# slurp in the whole file
open(IN, '<:encoding(UTF-8)', $ARGV[0]) or die "Could not open file '$ARGV[0]' $!\n";
while (<IN>) {
chomp;
$str .= $_;
}
close(IN);
# not sure if you were deliberately omitting the last two letters of sequence
# this looks at all the sequence
my $l_size = length($str) - 4;
for (my $i = 0; $i < $l_size; $i++) {
$hash{ substr($str, $i, 5) }++;
}
# grep in a scalar context will count the values.
my $unique = grep { $_ == 1 } values %hash;
open OUT, ">output.txt" or die "Error opening output.txt: $!\n";
print OUT "Number of unique keys: ". $unique."\n";
print OUT Dumper(\%hash);
close OUT;
It might help to remove searching for information that you already have.
I don't see that $j depends upon $i so you're actually matching values to themselves.
So you're getting bad counts as well. It works for 1, because 1 is the square of 1.
But if for each five-character string you're counting strings that match, you're going
to get the square of the actual number.
You would actually get better results if you did it this way:
# compute it once.
my $lim = length( $compare ) - 6;
for ( my $i = 0; $i < $lim; $i++ ){
my $vs = substr( $compare, $i, 5 );
# count each unique identity *once*
# if it's in the table, we've already counted it.
next if $hash{ $vs };
$hash{ $vs }++; # we've found it, record it.
for ( my $j = $i + 1; $j < $lim; $j++ ) {
my $value = substr( $compare, $j, 5 );
$hash{ $value }++ if $value eq $vs;
}
}
However, it could be an improvement on this to do an index for your second loop
and let the c-level of perl do your matching for you.
my $pos = $i;
while ( $pos > -1 ) {
$pos = index( $compare, $vs, ++$pos );
$hash{ $vs }++ if $pos > -1;
}
Also, if you used index, and wanted to omit the last two characters--as you do, it might make sense to remove those from the characters you have to search:
substr( $compare, -2 ) = ''
But you could do all of this in one pass, as you loop through file. I believe the code
below is almost an equivalent.
my $last_4 = '';
my $last_row = '';
my $discard = <$f1>;
# each row in the file after the first...
while ( $row = <$f1> ) {
chomp $row;
$last_row = $row;
$row = $last_4 . $row;
my $lim = length( $row ) - 5;
for ( my $i = 0; $i < $lim; $i++ ) {
$hash{ substr( $row, $i, 5 ) }++;
}
# four is the maximum we can copy over to the new row and not
# double count a strand of characters at the end.
$last_4 = substr( $row, -4 );
}
# I'm not sure what you're getting by omitting the last two characters of
# the last row, but this would replicate it
foreach my $bad_key ( map { substr( $last_row, $_ ) } ( -5, -6 )) {
--$hash{ $bad_key };
delete $hash{ $bad_key } if $hash{ $bad_key } < 1;
}
# grep in a scalar context will count the values.
$unique = grep { $_ == 1 } values %hash;
You may be interested in this more concise version of your code that uses a global regex match to find all the subsequences of five characters. It also reads the entire input file in one go, and removes the newlines afterwards.
The path to the input file is expected as a parameter on the command line, and the output is sent to STDIN, and can be redirected to a file on the command line, like this
perl subseq5.pl input.txt > output.txt
I've also used Data::Dump instead of Data::Dumper because I believe it to be vastly superior. However it is not a core module, and so you will probably need to install it.
use strict;
use warnings;
use open qw/ :std :encoding(utf-8) /;
use Data::Dump;
my $str = do { local $/; <>; };
$str =~ tr|$/||d;
my %dups;
++$dups{$1} while $str =~ /(?=(.{5}))/g;
my $unique = grep $_ == 1, values %dups;
print "Number of unique keys: $unique\n";
dd \%dups;
I have a hash of lists that is not getting populated.
I checked that the block at the end that adds to the hash is in fact being called on input. It should either add a singleton list if the key doesn't exist, or else push to the back of the list (referenced under the right key) if it does.
I understand that the GOTO is ugly, but I've commented it out and it has no effect.
The problem is that when printhits is called, nothing is printed, as if there are no values in the hash. I also tried each (%genomehits), no dice.
THANKS!
#!/usr/bin/perl
use strict;
use warnings;
my $len = 11; # resolution of the peaks
#$ARGV[0] is input file
#$ARGV[1] is call number
# optional -s = spread number from call
# optional -o specify output file name
my $usage = "see arguments";
my $input = shift #ARGV or die $usage;
my $call = shift #ARGV or die $usage;
my $therest = join(" ",#ARGV) . " ";
print "the rest".$therest."\n";
my $spread = 1;
my $output = $input . ".out";
if ($therest =~ /-s\s+(\d+)\s/) {$spread = $1;}
if ($therest =~ /-o\s+(.+)\s/) {$output = $1;}
# initialize master hash
my %genomehits = ();
foreach (split ';', $input) {
my $mygenename = "err_naming";
if ($_ =~ /^(.+)-/) {$mygenename = $1;}
open (INPUT, $_);
my #wiggle = <INPUT>;
&singlegene(\%genomehits, \#wiggle, $mygenename);
close (INPUT);
}
&printhits;
#print %genomehits;
sub printhits {
foreach my $key (%genomehits) {
print "key: $key , values: ";
foreach (#{$genomehits{$key}}) {
print $_ . ";";
}
print "\n";
}
}
sub singlegene {
# let %hash be the mapping hash
# let #mygene be the gene to currently process
# let $mygenename be the name of the gene to currently process
my (%hash) = %{$_[0]};
my (#mygene) = #{$_[1]};
my $mygenename = $_[2];
my $chromosome;
my $leftbound = -2;
my $rightbound = -2;
foreach (#mygene) {
#print "Doing line ". $_ . "\n";
if ($_ =~ "track" or $_ =~ "output" or $_ =~ "#") {next;}
if ($_ =~ "Step") {
if ($_ =~ /chrom=(.+)\s/) {$chromosome = $1;}
if ($_ =~ /span=(\d+)/) {$1 == 1 or die ("don't support span not equal to one, see wig spec")};
$leftbound = -2;
$rightbound = -2;
next;
}
my #line = split /\t/, $_;
my $pos = $line[0];
my $val = $line[-1];
# above threshold for a call
if ($val >= $call) {
# start of range
if ($rightbound != ($pos - 1)) {
$leftbound = $pos;
$rightbound = $pos;
}
# middle of range, increment rightbound
else {
$rightbound = $pos;
}
if (\$_ =~ $mygene[-1]) {goto FORTHELASTONE;}
}
# else reinitialize: not a call
else {
FORTHELASTONE:
# typical case, in an ocean of OFFs
if ($rightbound != ($pos-1)) {
$leftbound = $pos;
}
else {
# register the range
my $range = $rightbound - $leftbound;
for ($spread) {
$leftbound -= $len;
$rightbound += $len;
}
#print $range . "\n";
foreach ($leftbound .. $rightbound) {
my $key = "$chromosome:$_";
if (not defined $hash{$key}) {
$hash{$key} = [$mygenename];
}
else { push #{$hash{$key}}, $mygenename; }
}
}
}
}
}
You are passing a reference to %genomehits to the function singlegene, and then copying it into a new hash when you do my (%hash) = %{$_[0]};. You then add values to %hash which goes away at the end of the function.
To fix it, use the reference directly with arrow notation. E.g.
my $hash = $_[0];
...
$hash->{$key} = yadda yadda;
I think it's this line:
my (%hash) = %{$_[0]};
You're passing in a reference, but this statement is making a copy of your hash. All additions you make in singlegene are then lost when you return.
Leave it as a hash reference and it should work.
PS - Data::Dumper is your friend when large data structures are not behaving as expected. I'd sprinkle a few of these in your code...
use Data::Dumper; print Dumper \%genomehash;