It should be a simple nested foreach loop but it's not working and really starting to annoy me that I can't figure this out! Still a perl beginner but I thought I understood this by now. Can someone explain to me where I'm going wrong? The idea is simple: 2 files, 1 small, 1 large with info I want in the small one. Both have unique id's in them. Compare and match the id's and output a new small file with the added info in the small file.
I have 2 pieces of code: 1 without stricts and 1 with, and neither is working. I know I should use stricts, but I'm still curious as to why the one without stricts isn't working either.
WITHOUT STRICTS:
if ($#ARGV != 2){
print "input_file1 input_file2 output_file\n";
exit;
}
$inputfile1=$ARGV[0];
$inputfile2=$ARGV[1];
$outputfile1=$ARGV[2];
open(INFILE1,$inputfile1) || die "No inputfile :$!\n";
open(INFILE2,$inputfile2) || die "No inputfile :$!\n";
open(OUTFILE_1,">$outputfile1") || die "No outputfile :$!\n";
$i = 0;
$j = 0;
@infile1=<INFILE1>;
@infile2=<INFILE2>;
foreach ( @infile1 ){
@elements = split(";",$infile1[$i]);
$id1 = $elements[3];
print "1. $id1\n";
$lat = $elements[5];
$lon = $elements[6];
$lat =~ s/,/./;
$lon =~ s/,/./;
print "2. $lat\n";
print "3. $lon\n";
foreach ( @infile2 ){
@loopelements = split(";",$infile2[$j]);
$id2 = $loopelements[4];
print "4. $id2\n";
if ($id1 == $id2){
print OUTFILE_1 "$loopelements[0];$loopelements[1];$loopelements[2];$loopelements[3];$loopelements[4];$lat,$lon\n";
};
$j = $j+1;
};
@elements = join(";",@elements); # add ';' to all elements
#print "$i\r";
$i = $i+1;
}
close(INFILE1);
close(INFILE2);
close(OUTFILE_1);
Without stricts, the problem seems to be that the second loop never starts, if I'm not mistaken.
WITH STRICTS:
use strict;
use warnings;
my $inputfile1 = shift || die "Give input!\n";
my $inputfile2 = shift || die "Give more input!\n";
my $outputfile = shift || die "Give output!\n";
open my $INFILE1, '<', $inputfile1 or die "In use/Not found :$!\n";
open my $INFILE2, '<', $inputfile2 or die "In use/Not found :$!\n";
open my $OUTFILE, '>', $outputfile or die "In use/Not found :$!\n";
my $i = 0;
my $j = 0;
foreach ( my $infile1 = <$INFILE1> ){
my @elements = split(";",$infile1[$i]);
my $id1 = $elements[3];
print "1: $id1\n";
my $lat = $elements[5];
my $lon = $elements[6];
$lat =~ s/,/./;
$lon =~ s/,/./;
print "2: $lat\n";
print "3: $lon\n";
foreach ( my $infile2 = <$INFILE2> ){
my @loopelements = split(";",$infile2[$j]);
my $id2 = $loopelements[4];
print "4: $id2\n";
if ($id1 == $id2){
print $OUTFILE "$loopelements[0];$loopelements[1];$loopelements[2];$loopelements[3];$loopelements[4];$lat,$lon\n";
};
$j = $j+1;
};
#@elements = join(";",@elements); # add ';' to all elements
#print "$i\r";
$i = $i+1;
}
close($INFILE1);
close($INFILE2);
close($OUTFILE);
The error with stricts:
Global symbol "@infile1" requires explicit package name at Z:\Data-Content\Data\test\jan\bestemming_zonder_acco\add_latlon_dest_test.pl line 16.
Global symbol "@infile2" requires explicit package name at Z:\Data-Content\Data\test\jan\bestemming_zonder_acco\add_latlon_dest_test.pl line 31.
Your 'strict' implementation gives you errors because of confusion about the sigils (the $ and @ characters), which indicate whether a variable is a scalar or an array. In the loop statement you read each line of the file into a scalar called $infile1, but in the following line you try to access an element of the array @infile1. These two variables are not related and, as perl tells you, the latter is not declared.
Another problem with your 'strict' implementation is that you are reading the file inside the loop. This means that for nested loops you will read file 2 in the first iteration of the outer loop, and for all succeeding iterations the inner loop will not be able to read any lines.
I missed the foreach/while issue pointed out by stevenl: even fixing the stricture issues will leave you with foreach loops that run only one iteration.
I'm not sure what your problem with the non-strict script is.
But I wouldn't use a nested loop at all for processing two files. I would un-nest the loops, so it roughly looked like this:
my %coord;    # id => "lat,lon", filled from the first file
while ( my $line = <$INFILE1> ) {
    chomp $line;
    my @elements = split /;/, $line;
    $coord{ $elements[3] } = "$elements[5],$elements[6]";
}
while ( my $line = <$INFILE2> ) {
    chomp $line;
    my @elements = split /;/, $line;
    if ( exists $coord{ $elements[4] } ) {
        print $OUTFILE "....;$coord{ $elements[4] }\n";
    }
}
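To make the un-nested approach concrete, here is a minimal, self-contained sketch. The sample records, the in-memory filehandles and the sub name merge_coords are assumptions made up for illustration; only the column positions (id in field 3 and lat/lon in fields 5 and 6 of the first file, id in field 4 of the second) come from the question.

```perl
use strict;
use warnings;

# Build an id => "lat,lon" lookup from the first file, then stream the
# second file once and append the coordinates to every matching record.
sub merge_coords {
    my ( $coord_fh, $record_fh ) = @_;

    my %coord;
    while ( my $line = <$coord_fh> ) {
        chomp $line;
        my @elements = split /;/, $line;
        next unless @elements >= 7;    # skip short or blank lines
        my ( $lat, $lon ) = @elements[ 5, 6 ];
        s/,/./ for $lat, $lon;         # decimal comma -> decimal point
        $coord{ $elements[3] } = "$lat,$lon";
    }

    my @merged;
    while ( my $line = <$record_fh> ) {
        chomp $line;
        my @elements = split /;/, $line;
        next unless @elements >= 5;
        my $id = $elements[4];
        next unless exists $coord{$id};    # no coordinates for this id
        push @merged, join( ';', @elements[ 0 .. 4 ], $coord{$id} );
    }
    return @merged;
}

# Hypothetical sample data standing in for the two input files.
my $coords  = "a;b;c;101;e;52,37;4,89\n" . "a;b;c;102;e;48,85;2,35\n";
my $records = "x;y;z;w;102\n" . "x;y;z;w;999\n";

open my $coord_fh,  '<', \$coords  or die "Cannot open in-memory file: $!";
open my $record_fh, '<', \$records or die "Cannot open in-memory file: $!";

print "$_\n" for merge_coords( $coord_fh, $record_fh );
```

Each file is read exactly once, so the cost grows with the sum of the two file sizes rather than their product, and there is no inner filehandle to run dry after the first pass.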
I can't see exactly where the problem with the non-strict version is. What is the problem that you are encountering?
The problem with the strict version is particularly in these 2 lines:
foreach ( my $infile1 = <$INFILE1> ){
my @elements = split(";",$infile1[$i]);
You have a scalar $infile1 in the first line, but you are treating it as an array in the next line. Also, change the foreach to a while (see below).
A few comments.
For the non-strict version, you could have collapsed the loop to a C-style for loop as:
for (my $i = 0; $i < @infile1; $i++) {
...
}
That can be made simpler to read if you go without the array indexes altogether:
foreach my $infile1 (@infile1) {
my @elements = split ';', $infile1;
...
}
But with the larger file, it might take time to slurp the entire file into the array at the beginning. So it might be better to iterate through the file as you go:
while (my $infile = <$INFILE1>) {
...
}
Note the last point should be how the strict version looks. You need a while loop rather than a foreach loop, because assigning <$INFILE1> to a scalar means it will return the next line only, which evaluates to true as long as there is another line in the file. (Thus, the foreach would only ever get the first line to loop over.)
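The difference between the two loop forms is easy to check with a small experiment. This is only a sketch: the sub name count_iterations and the three-line sample text are invented for the demonstration, and an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;

# Count how many times each loop style runs its body over the same input.
sub count_iterations {
    my ($text) = @_;

    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $foreach_count = 0;
    # The scalar assignment reads ONE line; foreach then loops over that
    # single value and stops.
    foreach ( my $first = <$fh> ) {
        $foreach_count++;
    }
    close $fh;

    open $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $while_count = 0;
    # while re-evaluates the read before every pass, until end of file.
    while ( my $line = <$fh> ) {
        $while_count++;
    }
    close $fh;

    return ( $foreach_count, $while_count );
}

my ( $foreach_count, $while_count ) = count_iterations("one\ntwo\nthree\n");
print "foreach ran $foreach_count time(s), while ran $while_count time(s)\n";
```

For the three-line sample this prints "foreach ran 1 time(s), while ran 3 time(s)".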
You don't reset $j before the inner foreach loop runs. Therefore, the second time your inner loop runs, you are trying to access elements that are past the end of the array. This mistake exists in both the strict and non-strict version.
You should not be using $i and $j at all; the point of foreach is that it automatically gets each element for you. Here is an example of correctly using foreach in the inner loop:
foreach my $line ( @infile2 ){
@loopelements = split(";",$line);
#...now do stuff as before
}
This puts each element of @infile2 into the variable $line in succession, until you have gone through all of the array.
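Applied to both loops, the index-free style might look like the sketch below. The sub name matching_pairs and the sample lines are hypothetical, and the ids are compared with eq on the assumption that they should match as text (the question uses ==, which compares them as numbers).

```perl
use strict;
use warnings;

# Return a description of every pair of lines whose ids match. Because
# foreach hands over the elements itself, there is no $i or $j that could
# be left pointing past the end of an array.
sub matching_pairs {
    my ( $first, $second ) = @_;    # two array references
    my @pairs;
    foreach my $line1 (@$first) {
        my $id1 = ( split /;/, $line1 )[3];
        foreach my $line2 (@$second) {
            my $id2 = ( split /;/, $line2 )[4];
            push @pairs, "$id1 <-> $line2" if $id1 eq $id2;
        }
    }
    return @pairs;
}

# Made-up sample lines, already chomped.
my @infile1 = ( "a;b;c;1;e;f;g", "a;b;c;2;e;f;g" );
my @infile2 = ( "p;q;r;s;2",     "p;q;r;s;1" );

print "$_\n" for matching_pairs( \@infile1, \@infile2 );
```

Both ids are found here even though they appear in a different order in the two arrays, because the inner loop starts again from the first element on every pass of the outer loop.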
In my program below, I am trying to search for a string in a number of files in a folder, but the output prints continuously rather than stopping after the required match. Can someone please help point out the error?
i.e. I am trying to search for the string "VoLTE SIPTX: [SIPTX-SIP] ==> REGISTER" in @files, but instead of the desired output I get repetitive output of my strings.
# #!/usr/bin/perl
# use strict;
use warnings;
&IMS_Compare_Message();
sub IMS_Compare_Message
{
print "Entering the value i.e. the IMS Message to compare with";
my $value = '';
my $choice = '';
my $loop = '';
print "\nThe script path & name is $0\n";
print "\nPlease enter desired number to select any of the following
(1) Start Comparing REGISTER message !!
(2) Start Comparing SUBSCRIBE message
(3) Start Comparing INVITE message \n";
$value = <STDIN>;
if ($value == 1 )
{
print "\n Start Comparing REGISTER message\n\n";
$IMS_Message = "VoLTE SIPTX: [SIPTX-SIP] ==> REGISTER";
#chomp ($IMS_Message);
}
elsif ($value == 2)
{
print "\n SUBSCRIBE message Flow\n\n";
}
elsif ($value == 3)
{
print "\n INVITE message Flow\n\n";
}
else
{
print "\nThe input is not valid!\n";
print "\nDo you want to continue selecting a Automation Mode again (Y or N)?\n";
$choice = <STDIN>;
if( $choice =~ /[Yy]/) {
test_loop();
} else {
exit;
}
}
my $kw = "$IMS_Message";
my @files = grep {-f} (<*main_log>);
foreach my $file (@files)
{
open(my $fh, '<', $file) or die $!;
my @content = <$fh>;
close($fh);
my $l = 0;
$search = chomp ($kw);
#my $search = quotemeta($kw);
foreach (@content)
{ # go through every line for this keyword
$l++;
if (/$search/)
{
printf 'Found keyword %s in file %s, line %d:%s'.$/, $kw, $file, $l, $_
}
}
}
}
After modification
# #!/usr/bin/perl
use strict;
use warnings;
print "Entering the value i.e. the IMS Message to compare with";
my $value = '';
my $choice = '';
my $loop = '';
my $IMS_Message = '';
my $search = '';
my $kw = '';
print "\nThe script path & name is $0\n";
print "\nPlease enter desired number to select any of the following
(1) Start Comparing REGISTER message !!
(2) Start Comparing SUBSCRIBE message
(3) Start Comparing INVITE message \n";
$value = <STDIN>;
if ($value == 1 )
{
print "\n Start Comparing REGISTER message\n\n";
$IMS_Message = "VoLTE SIPTX: [SIPTX-SIP] ==> REGISTER";
#chomp ($IMS_Message);
}
elsif ($value == 2)
{
print "\n SUBSCRIBE message Flow\n\n";
}
elsif ($value == 3)
{
print "\n INVITE message Flow\n\n";
}
else
{
print "\nThe input is not valid!\n";
print "\nDo you want to continue selecting a Automation Mode again (Y or N)?\n";
$choice = <STDIN>;
if( $choice eq /[Yy]/) {
test_loop();
} else {
exit;
}
$kw = $IMS_Message;
$search = qr/\Q$kw/;
for my $file ( grep { -f } glob '*main_log' ) {
open my $fh, '<', $file or die qq{Unable to open "$file" for input: $!};
while ( <$fh> ) {
if ( /$search/ ) {
printf "Found keyword %s in file %s, line %d: %s\n", $kw, $file, $., $_;
last;
}
}
}
}
Here are some observations on your code
Your approach to debugging appears to be to try things at random to see if they work. It would be far more fruitful to add diagnostic print statements so that you can compare variables' actual values with what you expect
Error and warning messages are useful information, and it is foolish to comment out use strict to make them go away
Don't call subroutines with an ampersand &. That hasn't been best practice for twenty years now
Lay your code out tidily and consistently, so that both you and any people you ask for help can read it easily. As it stands it is impossible to tell where blocks start and end without counting brace characters {...}
Variables should be declared with my as close as possible to their first point of use, and not all at once at the top of the file or subroutine
chomp is necessary only for strings that have been read from the terminal or from a file. It returns the number of characters removed, not the trimmed string
if( $choice =~ /[Yy]/ ) { ... } will check only whether the string contains a Y, so if the operator enters MARRY ME! it will return true. You should use string equality eq to check whether a single Y character has been typed
You shouldn't put scalar variables alone inside double quotes. At best it will make no difference, and just add noise to your code; at worst it will completely change the value of the variable. Just my $kw = $IMS_Message is correct
Unless you require non-sequential access to the contents of a file, it is best to use a while loop to read and process it line by line, rather than read the whole thing into an array and process each element of the array. This also allows you to use the built-in line number variable $. instead of implementing your own $l
The main problem is that you have derived $search from the result of chomp $kw, which sets $search to the number of characters removed by chomp. This is always zero because $kw is a copy of $IMS_Message, which has no newline at the end. That means you are checking all the lines of every file for the character 0, and not for the message that you intended. The correct way is my $search = quotemeta($kw) which you had in place but have commented out, presumably as a result of your policy of "debugging by guesswork"
Fixing these things, your code should look something like this
my $search = qr/\Q$kw/;
for my $file ( grep { -f } glob '*main_log' ) {
open my $fh, '<', $file or die qq{Unable to open "$file" for input: $!};
while ( <$fh> ) {
if ( /$search/ ) {
printf "Found keyword %s in file %s, line %d: %s\n", $kw, $file, $., $_;
last;
}
}
}
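The chomp and quoting points can be checked in isolation. The following is a sketch only: the log line is made up, and just the keyword string is taken from the question.

```perl
use strict;
use warnings;

my $kw = 'VoLTE SIPTX: [SIPTX-SIP] ==> REGISTER';

# chomp returns how many characters it removed, not the trimmed string.
# $kw has no trailing newline, so the result is 0.
my $removed = chomp( my $copy = $kw );

# Hypothetical log line containing the keyword.
my $line = "12:00:01 VoLTE SIPTX: [SIPTX-SIP] ==> REGISTER sip:user\n";

# Unquoted, [SIPTX-SIP] is parsed as a character class, not literal text.
# The eval guards against the pattern failing to compile at all.
my $unquoted_hit = eval { $line =~ /$kw/ ? 1 : 0 };

# \Q (or quotemeta) escapes the metacharacters so the text matches literally.
my $quoted_hit = $line =~ /\Q$kw/ ? 1 : 0;

# Using chomp's return value as the pattern searches for the character 0.
my $zero_hit = "12:00:01 unrelated\n" =~ /$removed/ ? 1 : 0;

printf "chomp returned %d\n", $removed;
printf "unquoted pattern matched: %s\n",
    defined $unquoted_hit ? $unquoted_hit : 'pattern did not compile';
printf "quoted pattern matched: %d\n", $quoted_hit;
printf "pattern '%s' matched an unrelated line: %d\n", $removed, $zero_hit;
```

The last line of output shows why the original script printed so much: any log line containing a 0, such as a timestamp, matches the pattern built from chomp's return value.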
I have to count how many times "," occurs in each line of a file. Does anybody have an idea how I can do this in Perl?
At the moment my code looks like this:
open($list, "<", $student_list);
while ($linelist = <$list>)
{
    print $linelist;
}
close($list);
But I have no idea how to count how many times "," occurs in each $linelist :/
Use the transliteration operator in counting mode:
my $commas = $linelist =~ y/,//;
Applied to your code:
use warnings;
use strict;
open my $list, "<", "file.csv" or die $!;
while (my $linelist = <$list>)
{
my $commas = $linelist =~ y/,//;
print "$commas\n";
}
close($list);
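For a quick check without a file, the same operator can be wrapped in a small helper. This is a sketch: count_commas and the sample strings are invented for the demonstration.

```perl
use strict;
use warnings;

# With an empty replacement list and no /d modifier, tr/// (y/// is the
# same operator) leaves the string unchanged and returns the number of
# characters that matched.
sub count_commas {
    my ($text) = @_;
    return $text =~ tr/,//;
}

for my $line ( "a,b,c\n", "no commas here\n", ",,,,\n" ) {
    printf "%d comma(s) in: %s", count_commas($line), $line;
}
```

The three sample lines report 2, 0 and 4 commas respectively.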
If you just want to count the number of somethings in a file, you don't need to read it into memory. Since you aren't changing the file, mmap would be just fine:
use File::Map qw(map_file);
map_file my $map, $filename, '<';
my $count = $map =~ tr/,//;
#! perl
# perl script.pl [file path]
use strict;
use warnings;
my $file = shift or die "No file name provided";
open(my $IN, "<", $file) or die "Couldn't open file $file: $!";
my @matches = ();
my $index = 0;
# while <$IN> will get the file one line at a time rather than loading it all into memory
while(<$IN>){
my $line = $_;
my $current_count = 0;
# match globally, meaning keep track of where the last match was
$current_count++ while($line =~ m/,/g);
$matches[$index] = $current_count;
$index++;
}
$index = 0;
for(@matches){
$index++;
print "line $index had $_ matches\n"
}
You can use Perl's mmap IO layer instead of File::Map. It is almost as efficient as File::Map, and it is most probably already present in your Perl installation, so no extra module needs to be installed. Also, using y/// is more efficient than m//g in list context.
use strict;
use warnings;
use autodie;
use constant STUDENT_LIST => 'text.txt';
open my $list, '<:mmap', STUDENT_LIST;
while ( my $line = <$list> ) {
my $count = $line =~ y/,//;
print "There is $count commas at $.. line.\n";
}
If you would like grammatically correct output you can use Lingua::EN::Inflect in the right place
use Lingua::EN::Inflect qw(inflect);
print inflect "There PL_V(is,$count) $count PL_N(comma,$count) at ORD($.) line.\n";
Example output:
There are 7 commas at 1st line.
There are 0 commas at 2nd line.
There is 1 comma at 3rd line.
There are 2 commas at 4th line.
There are 7 commas at 5th line.
Do you want #commas for each line in the file, or #commas in the entire file?
On a per-line basis, replace your while loop with:
my @data = <$list>;
foreach my $line (@data) {
    my @chars = split //, $line;
    my $count = 0;
    foreach my $c (@chars) { $count++ if $c eq "," }
    print "There were $count commas\n";
}
The script below basically captures the second column and adds up its values. The only minor issue I have is that the file has empty lines at the end (it's how the values are being exported), and because of these empty lines the script is miscounting. Any ideas please? Thanks.
my $sum_column_b = 0;
open my $file, "<", "file_to_count.txt" or die($!);
while( my $line = <$file>) {
$line =~ m/\s+(\d+)/; #regexpr to catch second column values
$sum_column_b += $1;
}
print $sum_column_b, "\n";
I think the main issue has been established: you are using $1 without tying it to the success of the regex match, which causes you to add values when you should not. This is an alternative solution:
$sum_column_b += $1 if $line =~ m/\s+(\d+)/;
Typically, you should never use $1 unless you check that the regex you expect it to come from succeeded. Use either something like this:
if ($line =~ /(\d+)/) {
$sum += $1;
}
Or use direct assignment to a variable:
my ($num) = $line =~ /(\d+)/;
$sum += $num;
Note that you need to use list context by adding parentheses around the variable, or the regex will simply return 1 for success. Also note that, like Borodin says, this will give an undefined value when the match fails, and you must add code to check for that.
This can be handy when capturing several values:
my @nums = $line =~ /(\d+)/g;
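A runnable sketch of the direct-assignment form, including the defined check mentioned above, could look like this. The helper name second_column and the three sample lines are assumptions; the regex is the one from the question.

```perl
use strict;
use warnings;

# Return the digits captured by the question's regex, or undef when the
# line does not match.
sub second_column {
    my ($line) = @_;
    my ($num) = $line =~ /\s+(\d+)/;    # parentheses force list context
    return $num;
}

my $sum = 0;
for my $line ( "a 10\n", "b 20\n", "\n" ) {
    my $num = second_column($line);
    next unless defined $num;           # failed match: nothing to add
    $sum += $num;
}
print "$sum\n";
```

With this sample data the total is 30; the trailing empty line contributes nothing.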
The main problem is that if the regex does not match, then $1 will hold the value it received in the previous successful match. So every empty line will cause the previous line to be counted again.
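The left-over value is easy to reproduce. In this sketch the two sub names and the sample lines are made up; the regex is the one from the question.

```perl
use strict;
use warnings;

# Adds $1 without checking the match, as in the question.
sub sum_unchecked {
    my $sum = 0;
    for my $line (@_) {
        $line =~ m/\s+(\d+)/;
        $sum += $1;    # after a failed match, $1 still holds the old value
    }
    return $sum;
}

# Adds $1 only when this line's own match succeeded.
sub sum_checked {
    my $sum = 0;
    for my $line (@_) {
        $sum += $1 if $line =~ m/\s+(\d+)/;
    }
    return $sum;
}

# Made-up data that ends in an empty line, like the exported file.
my @lines = ( "a 10\n", "b 20\n", "\n" );
print "unchecked: ", sum_unchecked(@lines), "\n";
print "checked:   ", sum_checked(@lines), "\n";
```

The unchecked version reports 50 because the final empty line adds 20 a second time, while the checked version reports 30.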
An improvement would be:
my $sum_column_b = 0;
open my $file, "<", "file_to_count.txt" or die($!);
while( my $line = <$file>) {
next if $line =~ /^\s*$/; # skip "empty" lines
# ... maybe skip other known invalid lines
if ($line =~ m/\s+(\d+)/) { #regexpr to catch second column values
$sum_column_b += $1;
} else {
warn "problematic line '$line'\n"; # report invalid lines
}
}
print $sum_column_b, "\n";
The else-block is of course optional but can help noticing invalid data.
Try putting this line just after the while line:
next if ( $line =~ /^$/ );
Basically, loop around to the next line if the current line has no content.
#!/usr/bin/perl
use warnings;
use strict;
my $sum_column_b = 0;
open my $file, "<", "file_to_count.txt" or die($!);
while (my $line = <$file>) {
next if $line =~ m/^\s*$/; # skip lines that are empty or whitespace-only
if ($line =~ m/\s+(\d+)/) {
$sum_column_b += $1;
}
}
print "$sum_column_b\n";
I am extremely new to the Perl process. I am very much enjoying the learning curve and Perl but I am frustrated beyond belief and have spent many, many hours on one task achieving little to no results.
#!/usr/bin/perl
use strict;
print "Average value of retroviruses for the length of each genome and each of the genes:\n"; #create a title for the script
my $infile = "Lab1_table.txt"; # This is the file path.
open INFILE, $infile or die "Can't open $infile: $!"; # Provides an error message if the file can't be found.
# set my initial values.
my $tally = 0;
my @header = ();
my @averages = ();
# create my first loop to run through the file by line.
while (my $line = <INFILE>){
chomp $line;
print "$line\n";
# add one to the loop and essentially remove the header line of value.
# the first line is what was preventing me from calculating averages as Perl can't calculate words.
my @row = split /\t/, $line; # split the file by tab characters.
$tally++; #adds one to the tally.
if ( $tally == 1 ) { #if the tally = 1 the row is determined as the header.
@header = @row;
}
# if the tally is anything else besides 1 then it will read those rows.
else {
for( my $i = 1; $i < scalar @row; $i++ ) {
$averages[$i] += $row[$i];
}
foreach my $element (@row){
}
foreach my $i (0..4){
$averages[$i] = $averages[$i] + $row[1..4];
}
}
}
print "Average values of genome, gag, pol and env:\n";
for( my $i = 1; $i < scalar @averages; $i++ ) { # this line is used to determine the averages of the columns and print the values
print $averages[$i]/($tally-1), "\n";
}
So, I got the results to come up with what I wanted (not in the exact format I wanted, but as close as I can seem to get at the moment) and they do average the columns.
The issue now is writing to an outfile. I am trying to get my table and results from the previous code to appear in my outfile. I get a good file name but no results.
foreach my $i (1){
my $outfile= "Average_values".".txt";
open OUTFILE, ">$outfile" or die "$outfile: $!";
print "Average values of genome, gag, pol and env:\n";
}
close OUTFILE;
close INFILE;
I feel like there is an easy way to do this and a hard way and I have taken the very hard way. Any help would be much appreciated.
You did not tell Perl where to print:
print OUTFILE "Average values of genome, gag, pol and env:\n";
BTW, together with use strict, also use warnings. And for working with files, use lexical filehandles and the three argument form of open:
open my $FH, '>', $filename or die $!;
print $FH 'Something';
close $FH or die $!;
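Putting the pieces together, a minimal sketch of averaging the columns and writing them to a file might look like this. The sub name column_averages and the sample rows are invented, and a temporary file from the core File::Temp module stands in for Average_values.txt so that the example can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Average every numeric column of tab-separated rows. The first field of
# each row is assumed to be a label and is skipped.
sub column_averages {
    my (@rows) = @_;
    my @sums;
    for my $row (@rows) {
        my @fields = split /\t/, $row;
        $sums[ $_ - 1 ] += $fields[$_] for 1 .. $#fields;
    }
    return map { $_ / scalar @rows } @sums;
}

# Hypothetical data rows (header line already removed).
my @rows     = ( "virusA\t10\t2\t4\t6", "virusB\t20\t4\t8\t10" );
my @averages = column_averages(@rows);

# Every print names the output filehandle explicitly.
my ( $out_fh, $outfile ) = tempfile( UNLINK => 1 );
print {$out_fh} "Average values of genome, gag, pol and env:\n";
print {$out_fh} "$_\n" for @averages;
close $out_fh or die "Cannot close $outfile: $!";

# Read the file back to confirm that the results were written.
open my $in_fh, '<', $outfile or die "Cannot open $outfile: $!";
print while <$in_fh>;
close $in_fh;
```

A print without a filehandle goes to standard output, which is why the original outfile was created but stayed empty.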
My code so far only reads lines 1 to 4 and prints them. What I want to do instead of printing them is putting them into an array. So any help would be greatly appreciated. And hopefully just the code since it should be short. I learn much faster looking at full code instead of opening another 50 tabs trying to put multiple concepts together. Hopefully I'll learn this at some point and won't require help.
my $x = 1;
my $y = 4;
open FILE, "file.txt" or die "can not open file";
while (<FILE>) {
print if $. == $x .. $. == $y;
}
You should just put each line in an array with push:
my $x = 1;
my $y = 4;
my @array;
open FILE, "file.txt" or die "can not open file";
while (<FILE>) {
    push (@array, $_) if ($. >= $x && $. <= $y);
}
foreach at the end is just proof it works - note it doesn't ignore blank lines - figured you may want to keep them.
#!/usr/bin/perl
use warnings;
use strict;
my $fi;
my $line;
my $i = 0;
my @array;
open($fi, "<", "file.txt") or die "cannot open file.txt: $!";
while ($line = <$fi>) {
$array[$i] = $line;
if ($i == 3)
{
last;
}
$i++;
}
foreach(@array)
{
print $_;
}
You know, you don't need to keep iterating through the file once you've got all the data you need.
my $x = 1;
my $y = 4;
my @array;
my $file = 'file.txt';
# Lexical filehandle, three-argument open, meaningful error message
open my $file_h, '<', $file or die "cannot open $file: $!";
while (<$file_h>) {
    push @array, $_ if $. >= $x; # This condition is unnecessary when $x is 1
last if $. == $y;
}
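The same idea can be wrapped in a reusable sub and tried without a real file. This is a sketch: lines_between and the ten generated sample lines are assumptions, and an in-memory filehandle replaces file.txt.

```perl
use strict;
use warnings;

# Collect lines $first through $last (1-based, inclusive) from a
# filehandle, and stop reading once the last wanted line is stored.
sub lines_between {
    my ( $fh, $first, $last ) = @_;
    my @lines;
    while ( my $line = <$fh> ) {
        push @lines, $line if $. >= $first;
        last if $. >= $last;
    }
    return @lines;
}

# Ten generated sample lines standing in for the file contents.
my $text = join '', map "line $_\n", 1 .. 10;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @array = lines_between( $fh, 2, 4 );
close $fh;

print @array;
```

For the sample text this stores and prints "line 2", "line 3" and "line 4", and lines 5 to 10 are never read.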