How to count the odd number of occurrences in Perl? - perl

I have a program in Perl that is supposed to count the number of times an element appears in an array, and prints out the value of the element if the number of times it appears is odd.
Here is my code.
#!/usr/bin/perl
use strict;
use warnings;
sub FindOddCount($)
{
my @arraynumber = @_;
my $Even = 0;
my $i = 0;
my $j = 0;
my $array_length = scalar(@_);
for ($i = 0; $i <= $array_length; $i++)
{
my $IntCount = 0;
for ($j = 0; $j <= $array_length; $j++)
{
if ($arraynumber[$i] == $arraynumber[$j])
{
$IntCount++;
print($j);
}
}
$Even = $IntCount % 2;
if ($Even != 0)
{
return $arraynumber[$i];
}
}
if ($Even == 0)
{
return "none";
}
}
my @array1 = (1,1,2,2,3,3,4,4,5,5,6,7,7,7,7);
my @array2 = (10,10,7,7,6,6,2,2,3,3,4,4,5,5,6,7,7,7,7,10,10);
my @array3 = (6,6,10,7,7,6,6,2,2,3,3,4,4,5,5,6,7,7,7,7,10.10);
my @array4 = (10,10,7,7,2,2,3,3,4,4,5,5,7,7,7,7,10,10,6);
my @array5 = (6,6);
my @array6 = (1);
my $return_value1 = FindOddCount(@array1);
my $return_value2 = FindOddCount(@array2);
my $return_value3 = FindOddCount(@array3);
my $return_value4 = FindOddCount(@array4);
my $return_value5 = FindOddCount(@array5);
my $return_value6 = FindOddCount(@array6);
print "The Odd value for the first array is $return_value1\n";
print "The Odd value for the 2nd array is $return_value2\n ";
print "The Odd value for the 3rd array is $return_value3\n ";
print "The Odd value for the 4th array is $return_value4\n ";
print "The Odd value for the 5th array is $return_value5\n ";
print "The Odd value for the sixth array is $return_value6\n ";
Here are my results.
The Odd value for the first array is 15
The Odd value for the first array is 21
The Odd value for the first array is 21
The Odd value for the first array is 19
The Odd value for the first array is 2
The Odd value for the first array is 1
In case it isn't clear: it is printing the count of all of the elements of the array instead of returning the element that occurs an odd number of times. In addition, I get this warning.
Use of uninitialized value in numeric eq (==) at OddCount.pl line 17.
Line 17 is where the 1st array and the 2nd array are compared. Yet the values are clearly instantiated and they work when I print them out. What is the issue?

Build a frequency hash for the array, then go through it to see which elements have odd counts
use warnings;
use strict;
use feature 'say';
my @ary = qw(7 o1 7 o2 o1 z z o1); # o1,o2 appear odd number of times
my %freq;
++$freq{$_} for @ary;
foreach my $key (sort keys %freq) {
say "$key => $freq{$key}" if $freq{$key} & 1;
}
This is far simpler than the code in the question, though that code is easily fixed, too. See below.
Some notes
++$freq{$_} increments the value for the key $_ in the hash %freq by 1, or, if the key doesn't exist yet, adds it to the hash (by autovivification) and sets its value to one. So once the array has been iterated over with this code, the hash %freq has the array elements for keys and the elements' counts for their values
Test $n & 1 uses the bitwise AND -- it is true if $n has the lowest bit set, so if it is odd
That ++$freq{$_} for @ary; is a Statement Modifier, running the statement for each element of @ary where the current element is aliased by $_ variable
This prints
o1 => 3
o2 => 1
The odd-frequency elements (if any) are printed sorted alphabetically, for no particular reason. Please change to any particular order that may be needed, or let me know.
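If the result is needed as a list rather than printed, the same frequency-hash idea can be wrapped in a sub. This is only a sketch: the name odd_count_elements is made up for illustration and does not appear in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the question): returns the distinct
# elements that occur an odd number of times, sorted as strings.
sub odd_count_elements {
    my @elements = @_;
    my %freq;
    ++$freq{$_} for @elements;                      # build the frequency hash
    return grep { $freq{$_} & 1 } sort keys %freq;  # keep keys with odd counts
}

my @odd = odd_count_elements(qw(7 o1 7 o2 o1 z z o1));
print "@odd\n";   # o1 o2
```

Call it in list context; in scalar context grep would return the number of matches instead.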
Comments on the code in the question, which is correct with two simple fixes.
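It uses a prototype in a wrong way for the purpose, in sub FindOddCount($). The ($) prototype imposes scalar context on the argument, so FindOddCount(@array1) receives the number of elements (15) rather than the elements themselves, which is why the array sizes come back as results. I suspect that the prototype isn't needed so let's not dwell on it -- just drop that and make it sub FindOddCount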
The index in loops includes the length of the array (<=) so in the last iteration they attempt to index into the array past its last element. Off-by-one error. That can be fixed by changing the condition into < $array_length (instead of <=), but read on
There is no reason to use C-style loops, not even to iterate over the index. (Needed here since the position in the array is used.) Scripting languages provide for cleaner ways†
foreach my $i1 (0 .. $#arraynumber) {
my $IntCount = 0;
foreach my $i2 (0 .. $#arraynumber) {
if ( $arraynumber[$i1] == $arraynumber[$i2] ) {
...
That 0..N is the range operator, which creates the list of numbers within that range. The syntax $#array_name is the index of the last element in the array @array_name. Exactly what's needed. So there is no need for the array length
Multiple (six) arrays, used to check the code, can be manipulated in far better and easier ways by using references; see the tutorial for complex data structures perldsc, and in particular the page perllol, for array-of-arrays
In short: when you remove the prototype and fix off-by-one error your code seems to be correct.
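For reference, here is a sketch of the question's sub with only those two fixes applied (prototype removed, loops running over 0 .. $#arraynumber). The debugging print($j) is left out, and the final check is reduced to a plain return; otherwise the logic is the question's own.

```perl
use strict;
use warnings;

# The question's sub, without the ($) prototype and with the loops
# bounded by the last index instead of the array length.
sub FindOddCount {
    my @arraynumber = @_;
    foreach my $i1 (0 .. $#arraynumber) {
        my $IntCount = 0;
        foreach my $i2 (0 .. $#arraynumber) {
            $IntCount++ if $arraynumber[$i1] == $arraynumber[$i2];
        }
        # Return the first element that occurs an odd number of times
        return $arraynumber[$i1] if $IntCount % 2;
    }
    return "none";
}

print FindOddCount(1,1,2,2,3,3,4,4,5,5,6,7,7,7,7), "\n";   # 6
print FindOddCount(6,6), "\n";                              # none
```

Note that this still does quadratic work; the frequency hash does the same job in a single pass.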
† And not only scripting ones -- for example, C++11 introduced the range-based for loop
for (auto var: container) ... // really const auto&, or auto&, or auto&&
and the link (a standard reference) says
Used as a more readable equivalent to the traditional for loop [...]

Count the number of occurrences in a for loop using a hash. Then print the desired elements using grep, like so:
#!/usr/bin/env perl
use warnings;
use strict;
use feature qw( say );
my @array = (10,10,7,7,6,6,2,2,3,3,4,4,5,5,6,7,7,7,7,10,10);
my %cnt;
# Count each element of the array:
$cnt{$_}++ for @array;
# Print only the array elements that occurred an odd number of times,
# separated by ", ":
say join q{, }, grep { $cnt{$_} % 2 } @array;
# 6, 6, 6
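Because the grep above runs over the original array, a value with an odd count is printed once per occurrence. If each such value should be reported only once, one option (a sketch, not part of the answer above) is to filter the hash keys instead:

```perl
use strict;
use warnings;

my @array = (10,10,7,7,6,6,2,2,3,3,4,4,5,5,6,7,7,7,7,10,10);
my %cnt;
$cnt{$_}++ for @array;

# Filter the keys of %cnt rather than @array, so each value appears once;
# the numeric sort is only there to give a predictable order.
my @odd = grep { $cnt{$_} % 2 } sort { $a <=> $b } keys %cnt;
print join(q{, }, @odd), "\n";   # 6
```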

Related

Why array counter returns a smaller number? [duplicate]

I seem to have come across several different ways to find the size of an array. What is the difference between these three methods?
my @arr = (2);
print scalar @arr; # First way to print array size
print $#arr; # Second way to print array size
my $arrSize = @arr;
print $arrSize; # Third way to print array size
The first and third ways are the same: they evaluate an array in scalar context. I would consider this to be the standard way to get an array's size.
The second way actually returns the last index of the array, which is not (usually) the same as the array size.
First, the second ($#array) is not equivalent to the other two. $#array returns the last index of the array, which is one less than the size of the array.
The other two (scalar @arr and $arrSize = @arr) are virtually the same. You are simply using two different means to create scalar context. It comes down to a question of readability.
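One related pitfall worth showing next to these: parentheses on the left-hand side turn the assignment into a list assignment, which yields the first element rather than the size. A small sketch with made-up values:

```perl
use strict;
use warnings;

my @arr = (7, 8, 9);

my $count   = @arr;    # scalar assignment: the number of elements
my ($first) = @arr;    # list assignment: the first element
my $last    = $#arr;   # the last index

print "$count $first $last\n";   # 3 7 2
```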
I personally prefer the following:
say 0+@array; # Represent @array as a number
I find it clearer than
say scalar(@array); # Represent @array as a scalar
and
my $size = @array;
say $size;
The latter looks quite clear alone like this, but I find that the extra line takes away from clarity when part of other code. It's useful for teaching what @array does in scalar context, and maybe if you want to use $size more than once.
This gets the size by forcing the array into a scalar context, in which it is evaluated as its size:
print scalar @arr;
This is another way of forcing the array into a scalar context, since it's being assigned to a scalar variable:
my $arrSize = @arr;
This gets the index of the last element in the array, so it's actually the size minus 1 (assuming indexes start at 0, which is adjustable in Perl although doing so is usually a bad idea):
print $#arr;
This last one isn't really good to use for getting the array size. It would be useful if you just want to get the last element of the array:
my $lastElement = $arr[$#arr];
Also, as you can see here on Stack Overflow, this construct isn't handled correctly by most syntax highlighters...
To use the second way, add 1:
print $#arr + 1; # Second way to print array size
All three give the same result if we modify the second one a bit:
my @arr = (2, 4, 8, 10);
print "First result:\n";
print scalar @arr;
print "\n\nSecond result:\n";
print $#arr + 1; # Add 1, since $#arr is the last index and indexing starts at 0.
print "\n\nThird result:\n";
my $arrSize = @arr;
print $arrSize;
Example:
my @a = (undef, undef);
my $size = @a;
warn "Size: " . $#a; # Size: 1. It's not the size
warn "Size: " . $size; # Size: 2
The “Perl variable types” section of the perlintro documentation contains
The special variable $#array tells you the index of the last element of an array:
print $mixed[$#mixed]; # last element, prints 1.23
You might be tempted to use $#array + 1 to tell you how many items there are in an array. Don’t bother. As it happens, using @array where Perl expects to find a scalar value (“in scalar context”) will give you the number of elements in the array:
if (@animals < 5) { ... }
The perldata documentation also covers this in the “Scalar values” section.
If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.) The following is always true:
scalar(@whatever) == $#whatever + 1;
Some programmers choose to use an explicit conversion so as to leave nothing to doubt:
$element_count = scalar(@whatever);
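The parenthetical in that quote, that a list in scalar context gives its last value rather than a length, is easy to check. A minimal sketch, where three_values is a hypothetical helper made up for the demonstration:

```perl
use strict;
use warnings;

sub three_values { return (4, 5, 6) }    # hypothetical helper

my @array      = three_values();
my $from_array = @array;                 # array in scalar context: its length
my $from_list  = three_values();         # list in scalar context: last value
my $list_count = () = three_values();    # count a list via empty-list assignment

print "$from_array $from_list $list_count\n";   # 3 6 3
```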
Earlier, the same section documents how to obtain the index of the last element of an array.
The length of an array is a scalar value. You may find the length of array @days by evaluating $#days, as in csh. However, this isn’t the length of the array; it’s the subscript of the last element, which is a different value since there is ordinarily a 0th element.
From perldoc perldata, which should be safe to quote:
The following is always true:
scalar(@whatever) == $#whatever + 1;
Just so long as you don't $#whatever++ and mysteriously increase the size of your array.
The array indices start with 0.
and
You can truncate an array down to nothing by assigning the null list () to it. The following are equivalent:
@whatever = ();
$#whatever = -1;
Which brings me to what I was looking for which is how to detect the array is empty. I found it if $#empty == -1;
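For the emptiness check specifically, evaluating the array itself in boolean context is a common alternative to comparing $#empty with -1. A small sketch with made-up arrays:

```perl
use strict;
use warnings;

my @empty;
my @full = (1, 2);

for my $aref (\@empty, \@full) {
    # An array in boolean (scalar) context is false when it has no elements
    my $state = @$aref ? "not empty" : "empty";
    printf "last index %d: %s\n", $#$aref, $state;
}
```

This should print "last index -1: empty" followed by "last index 1: not empty".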
There are various ways to print size of an array. Here are the meanings of all:
Let’s say our array is my @arr = (3,4);
Method 1: scalar
This is the right way to get the size of arrays.
print scalar @arr; # Prints size, here 2
Method 2: Index number
$#arr gives the last index of an array. So if array is of size 10 then its last index would be 9.
print $#arr; # Prints 1, as last index is 1
print $#arr + 1; # Adds 1 to the last index to get the array size
We are adding 1 here, considering the array as 0-indexed. But, if it's not zero-based then, this logic will fail.
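perl -le 'local $[ = 4; my @arr = (3, 4); print $#arr + 1;' # prints 6
The above example prints 6 because we have set the base index to 4. The indices are now 4 and 5, holding the elements 3 and 4 respectively. (Changing $[ has long been discouraged, and since Perl 5.30 assigning a non-zero value to it is a fatal error.)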
Method 3:
When an array is used in a scalar context, then it returns the size of the array
my $size = @arr;
print $size; # Prints size, here 2
Actually, method 3 and method 1 are the same.
Use int(@array), as it treats its argument as a scalar.
To find the size of an array use the scalar keyword:
print scalar @array;
To find the last index of an array, use $#array. As array indices start at 0, we get the size of the array by adding one to it:
print $#array + 1;
Example:
my @a = qw(1 3 5);
print scalar @a, "\n";
print $#a+1, "\n";
Output:
3
3
As numerous answers pointed out, the first and third way are the correct methods to get the array size, and the second way is not.
Here I expand on these answers with some usage examples.
@array_name evaluates to the length of the array (that is, its size, or the number of elements in it) when used in a scalar context.
Below are some examples of a scalar context, such as @array_name by itself inside if or unless, or in arithmetic comparisons such as == or !=.
All of these examples will work if you change @array_name to scalar(@array_name). This would make the code more explicit, but also longer and slightly less readable. Therefore, the more idiomatic usage omitting scalar() is preferred here.
my @a = (undef, q{}, 0, 1);
# All of these test whether 'array' has four elements:
print q{array has four elements} if @a == 4;
print q{array has four elements} unless @a != 4;
@a == 4 and print q{array has four elements};
!(@a != 4) and print q{array has four elements};
# All of the above print:
# array has four elements
# All of these test whether array is not empty:
print q{array is not empty} if @a;
print q{array is not empty} unless !@a;
@a and print q{array is not empty};
!(!@a) and print q{array is not empty};
# All of the above print:
# array is not empty

Binary search—Can't use string "1" as a symbol ref while strict refs is in use

I've been browsing over the already answered questions regarding this error message.
I am trying to solve a problem from the Rosalind web site that looks for some indexes using a binary search.
When my subroutine finds the number it seems to ignore it, and if I try to print the $found variable, it gives me the error
Can't use string "1" as a symbol ref while strict refs is in use
The code is this
sub binarysearch
{
my $numbertolook = shift;
my @intarray=@_;
my $lengthint = scalar @intarray;
my @sorted = sort {$a <=> $b} @intarray;
#print $numbertolook, " " , @sorted, "\n";
my $low=0;
my $high=$lengthint-1;
my $found =undef;
my $midpoint;
while ($low<$high)
{
$midpoint=int(($low+$high)/2);
#print $midpoint, " ",$low," ", $high, " ", @sorted, "\n";
if ($numbertolook<$sorted[$midpoint])
{
$high=$midpoint;
}
elsif ($numbertolook>$sorted[$midpoint])
{
$low=$midpoint;
}
elsif ($numbertolook==$sorted[$midpoint])
{
$found=1;
print $found "\n";
last;
}
if ($low==$high-1 and $low==$midpoint)
{
if ($numbertolook==$sorted[$high])
{
$found=1;
print $found "\n";
last;
}
$low=$high;
}
}
return $found;
}
You want
print $found, "\n";
Or
print $found . "\n";
With no operator between $found and the newline, it thinks $found is the filehandle to print a newline to, and is getting an error because it isn't a filehandle.
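To make the difference between the two forms of print concrete, here is a small sketch. The in-memory filehandle is only an assumption made to keep the example self-contained; any opened filehandle would behave the same way.

```perl
use strict;
use warnings;

my $found = 1;

print $found, "\n";            # print LIST: writes "1\n" to STDOUT
print STDOUT $found, "\n";     # print FILEHANDLE LIST: no comma after the handle

# An in-memory filehandle, used here only to keep the sketch self-contained
open my $fh, '>', \my $buffer or die "Cannot open in-memory file: $!";
print {$fh} $found, "\n";      # the block form makes the filehandle explicit
close $fh;

print "buffer holds: $buffer";
```

Writing the handle as {$fh} is a common way to make it obvious which argument is the filehandle.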
I'll try to help
First of all, as simple as it may seem, a binary search is quite difficult to code correctly. The main reason is that it's a hotbed of off-by-one errors, which are so prevalent that they have their own Wikipedia page
The issue is that an array containing, say, the values A to Z will have 26 elements with indices 0 to 25. I think FORTRAN bucks the trend, and Lua, but pretty much every other language has the first element of an array at index zero
A zero base works pretty well for everything until you start using divide and conquer algorithms. Merge Sort as well as Binary Search are such algorithms. Binary search goes
Is it in the first half?
If so then search the first half further
Else search the second half further
The hard part is when you have to decide when you've found the object, or when you need to give up looking. Splitting data in two nearly-halves is easy. Knowing when to stop is hard
It's highly efficient for sorted data, but the problem comes when implementing it that, if we do it properly, we have to deal with all sorts of weird index bases beyond zero or one.
Suppose I have an array
my @alpha = 'A' .. 'Q'
If I print scalar @alpha I will see 17, meaning the array has seventeen elements, indexed from 0 to 16
Now I'm looking for E in that array, so I do a binary search, so I want the "first half" and the "second half" of @alpha. If I add 0 to 16 and divide by 2 I get a neat "8", so the middle element is at index 8, which is H
But wait. There are 17 elements, which is an odd number, so if we say the first eight (A .. H) are left of the middle and the last eight (I .. Q) are right of the middle then surely the "middle" is I?
In truth this is all a deception, because a binary search will work however we partition the data. In this case binary means two parts, and although the search would be more efficient if those parts could be equal in size it's not necessary for the algorithm to work. So it can be the first third and the last two-thirds, or just the first element and the rest
That's why using int(($low+$high)/2) is fine. It rounds down to the nearest integer, so that with, say, an 18-element array (indices 0 to 17) the midpoint is a usable 8 instead of 8.5
But your code still has to account for some unexpected things. In the case of our 17-element array we have calculated the middle index to be 8. So indexes 0 .. 7 are the "first half" while 8 .. 16 are the "second half", and the middle index is where the second half starts
But didn't we round the division down? So in the case of an odd number of elements, shouldn't our mid point be at the end of the first half, and not the start of the second? This is an arcane off-by-one error, but let's see if it still works with a simple even number of elements
@alpha = 'A' .. 'D'
The start and end indices are 0 and 3; the middle index is int((0+3)/2) == 1. So the first half is 0..1 and the second half is 2 .. 3. That works fine
But there's still a lot more. Say I have to search an array with two elements X and Y. That has two clear halves, and I'm looking for A, which is before the middle. So I now search the one-element list X for A. The minimum and maximum elements of the target array are both zero. The mid-point is int((0+0)/2) == 0. So what happens next?
It is similar but rather worse when we're searching for Z in the same list. The code has to be exactly right, otherwise we will be either searching off the end of the array or checking the last element again and again
Saving the worst for last, suppose
my @alpha = ( 'A', 'B', 'Y', 'Z' )
and I'm looking for M. That lets loose all sorts of optimisations that involve checks that may make the ordinary case much slower
Because of all of this it's by far the best solution to use a library or a language's built-in function to do all of this. In particular, Perl's hashes are usually all you need to check for specific strings and any associated data. The algorithm used is vastly better than a binary search for any non-trivial data sets
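As a rough illustration of the hash approach, assuming the data is a list of distinct strings, a lookup table can be built once and then queried directly. The name %index_of is made up for this sketch.

```perl
use strict;
use warnings;

my @alpha = ( 'A', 'B', 'Y', 'Z' );

# Build a lookup table mapping each value to its index (hash slice assignment)
my %index_of;
@index_of{@alpha} = (0 .. $#alpha);

for my $wanted (qw(Y M)) {
    if (exists $index_of{$wanted}) {
        print "$wanted is at index $index_of{$wanted}\n";
    }
    else {
        print "$wanted is not present\n";
    }
}
```

This should report Y at index 2 and M as not present, with no midpoint arithmetic to get wrong.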
Wikipedia shows this algorithm for an iterative binary search
The binary search algorithm can also be expressed iteratively with two index limits that progressively narrow the search range.
int binary_search(int A[], int key, int imin, int imax)
{
// continue searching while [imin,imax] is not empty
while (imin <= imax)
{
// calculate the midpoint for roughly equal partition
int imid = midpoint(imin, imax);
if (A[imid] == key)
// key found at index imid
return imid;
// determine which subarray to search
else if (A[imid] < key)
// change min index to search upper subarray
imin = imid + 1;
else
// change max index to search lower subarray
imax = imid - 1;
}
// key was not found
return KEY_NOT_FOUND;
}
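A direct Perl translation of that algorithm might look like the sketch below. It assumes the input is already sorted numerically and is passed as an array reference; the sub name and the choice of returning undef for "not found" are mine, not Wikipedia's.

```perl
use strict;
use warnings;

# Returns the index of $key in the sorted array referenced by $sorted,
# or undef (an empty list in list context) if the key is absent.
sub binary_search {
    my ($sorted, $key) = @_;
    my ($imin, $imax) = (0, $#{$sorted});
    while ($imin <= $imax) {
        my $imid = int( ($imin + $imax) / 2 );
        if    ($sorted->[$imid] == $key) { return $imid }        # found
        elsif ($sorted->[$imid] <  $key) { $imin = $imid + 1 }   # search upper part
        else                             { $imax = $imid - 1 }   # search lower part
    }
    return;   # key was not found
}

my @numbers = (10 .. 99);
my $index = binary_search(\@numbers, 76);
print defined $index ? "found at index $index\n" : "not found\n";
```

Because both limits move past the midpoint on every iteration, the range always shrinks and the loop cannot get stuck on a one- or two-element range.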
And here is a version of your code that is far from bug-free but does what you intended. You weren't so far off
use strict;
use warnings 'all';
print binarysearch( 76, 10 .. 99 ), "\n";
sub binarysearch {
my $numbertolook = shift;
my @intarray = @_;
my $lengthint = scalar @intarray;
my @sorted = sort { $a <=> $b } @intarray;
my $low = 0;
my $high = $lengthint - 1;
my $found = undef;
my $midpoint;
while ( $low < $high ) {
$midpoint = int( ( $low + $high ) / 2 );
#print $midpoint, " ",$low," ", $high, " ", @sorted, "\n";
if ( $numbertolook < $sorted[$midpoint] ) {
$high = $midpoint;
}
elsif ( $numbertolook > $sorted[$midpoint] ) {
$low = $midpoint;
}
elsif ( $numbertolook == $sorted[$midpoint] ) {
$found = 1;
print "FOUND\n";
return $midpoint;
}
if ( $low == $high - 1 and $low == $midpoint ) {
if ( $numbertolook == $sorted[$high] ) {
$found = 1;
print "FOUND\n";
return $midpoint;
}
return;
}
}
return $midpoint;
}
output
FOUND
66
If you call print with several parameters separated with a space, print expects the first one to be a filehandle. This is interpreted as print FILEHANDLE LIST from the documentation.
print $found "\n";
What you want to do is either to separate them with a comma, to call it as print LIST.
print $found, "\n";
or to concat as strings, which will also call it as print LIST, but with only one element in LIST.
print $found . "\n";

what is meaning of $# in perl?

Consider the following code:
my @candidates = get_candidates($marker);
CANDIDATE:
for my $i (0..$#candidates) {
next CANDIDATE if open_region($i);
$candidates[$i] = $incumbent{ $candidates[$i]{region} };
}
What is the meaning of $# in line 3?
It is the last index of the array (in your case, the last index of candidates).
Since candidates is an array, $#candidates is the largest index (number of elements - 1)
For example:
my @x = (4,5,6);
print $#x;
will print 2 since that is the largest index.
Note that if the array is empty, $#candidates will be -1
EDIT: from perldoc perlvar:
$# is also used as sigil, which, when prepended on the name of
an array, gives the index of the last element in that array.
my @array = ("a", "b", "c");
my $last_index = $#array; # $last_index is 2
for my $i (0 .. $#array) {
print "The value of index $i is $array[$i]\n";
}
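A short sketch of two consequences of this: $array[$#array] is the last element (as is $array[-1]), and for an empty array the range 0 .. $#array is 0 .. -1, which is empty, so a loop over it never runs. The variable names are made up for the example.

```perl
use strict;
use warnings;

my @array = ("a", "b", "c");
print "last element: ", $array[$#array], "\n";   # c
print "same thing:   ", $array[-1], "\n";        # c

my @empty;
my $iterations = 0;
$iterations++ for 0 .. $#empty;   # 0 .. -1 is an empty range
print "iterations over empty array: $iterations\n";   # 0
```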
This means array_size - 1. It is the same as (scalar @array) - 1.
In Perl, there are several ways to get an array's size, such as print scalar(@arr) and print $#arr + 1. (A bare print @arr prints the elements, not the size.) There is no deeper reason; just use them, and you will get familiar with these idioms as you keep working with Perl. Unlike C++ or Java, Perl uses a lot of special expressions to simplify coding, but sometimes they make things more confusing.

Perl: increment 2d array cell?

I have a set of numerical data for which it is important to me to know what pairs of numbers occurred together, and how many times. Each set of data contains 7 numbers between 1 and 20. There are several hundred sets of data.
Essentially, by parsing each set of my data, I want to create a 20 x 20 array that I can use to keep a count of when pairs of numbers occurred together.
I have done a lot of searching, but maybe I've used the wrong key words. I've seen loads of examples how to create a "2D array" - I know perl doesn't actually do that, and that it's really an array of references - and to print the values contained therein, but nothing really on how to work with one particular cell by number and alter it.
Below is my conceptual code. The commented lines don't work, but illustrate what I want to achieve. I'm reasonably new to coding Perl, and this just seems too advanced for me to understand the examples I've seen and translate them into something I can actually use.
my @datapairs;
while (<DATAFILE>)
{
chomp;
my @data = split(",",$_);
for ($prcount=0; $prcount <=5; $prcount++)
{
for ($othcount=($prcount+1); $othcount<=6; $othcount++)
{
@data[$prcount]=@data[$prcount]+1;
@data[$othcount]=@data[$othcount]+1;
@data[$prcount]=@data[$prcount]-1;
@data[$othcount]=@data[$othcount]-1;
print @data[$prcount]." ".@data[$othcount]."; ";
#@datapairs[@data[$prcount]][@data[$othcount]]++;
#@datapairs[@data[$othcount]][@data[$prcount]]++;
}
}
}
Any input or suggestions would be much appreciated.
To access a "cell" in a "2-d array" in Perl (as you already figured out, it's an array of arrayrefs) is simple:
my @datapairs;
# Add 1 for a pair with indexes $i and $j
$datapairs[$i]->[$j]++;
# Print that value
print "$datapairs[$i]->[$j]\n";
It's not clear what you mean by "occur together" - if you mean "in the same length-7 array", it's easy:
my @datapairs;
while (<DATAFILE>) {
chomp;
my @data = split(",", $_);
for (my $prcount = 0; $prcount <= 5; $prcount++) {
for (my $othcount = $prcount + 1; $othcount <=6 ; $othcount++) {
$datapairs[ $data[$prcount] ]->[ $data[$othcount] ]++;
}
}
}
# Print (the values run from 1 to 20; pairs that never occurred show as 0)
for (my $i = 1; $i <= 20; $i++) {
for (my $j = 1; $j <= 20; $j++) {
print( ($datapairs[$i]->[$j] // 0), ", " );
}
print "\n";
}
As a side note, personally, just for stylistic reasons, I strongly prefer to reference EVERYTHING, e.g. use arrayref of arrayrefs instead of array of arrays. E.g.
my $datapairs;
# Add 1 for a pair with indexes $i and $j
$datapairs->[$i]->[$j]++;
# Print that value
print "$datapairs->[$i]->[$j]\n";
The second (and third...) arrow dereference operator is optional in Perl but I personally find it significantly more readable to enforce its usage - it spaces out the index expressions.
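Putting the pieces together, here is a self-contained sketch of the pair counting. It assumes that "occur together" means "in the same data set", uses a small made-up @sets in place of the data file, and counts each pair in both directions, as the commented-out lines in the question intended.

```perl
use strict;
use warnings;

my @datapairs;
my @sets = ( [1, 5, 7], [5, 7, 20] );   # hypothetical sample data

for my $set (@sets) {
    for my $i (0 .. $#$set - 1) {
        for my $j ($i + 1 .. $#$set) {
            my ($x, $y) = @{$set}[$i, $j];
            $datapairs[$x][$y]++;   # count the pair...
            $datapairs[$y][$x]++;   # ...symmetrically
        }
    }
}

# Cells that were never incremented are undef, hence the "// 0"
printf "%d %d %d\n",
    $datapairs[5][7] // 0, $datapairs[7][5] // 0, $datapairs[1][20] // 0;   # 2 2 0
```

The rows and cells spring into existence on first increment (autovivification), so nothing needs to be pre-sized to 20 x 20.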

What perl code samples can lead to undefined behaviour?

These are the ones I'm aware of:
The behaviour of a "my" statement modified with a statement modifier conditional or loop construct (e.g. "my $x if ...").
Modifying a variable twice in the same statement, like $i = $i++;
sort() in scalar context
truncate(), when LENGTH is greater than the length of the file
Using 32-bit integers, "1 << 32" is undefined. Shifting by a negative number of bits is also undefined.
Non-scalar assignment to "state" variables, e.g. state @a = (1..3).
One that is easy to trip over is prematurely breaking out of a loop while iterating through a hash with each.
#!/usr/bin/perl
use strict;
use warnings;
my %name_to_num = ( one => 1, two => 2, three => 3 );
find_name(2); # works the first time
find_name(2); # but fails this time
exit;
sub find_name {
my($target) = @_;
while( my($name, $num) = each %name_to_num ) {
if($num == $target) {
print "The number $target is called '$name'\n";
return;
}
}
print "Unable to find a name for $target\n";
}
Output:
The number 2 is called 'two'
Unable to find a name for 2
This is obviously a silly example, but the point still stands - when iterating through a hash with each you should either never last or return out of the loop; or you should reset the iterator (with keys %hash) before each search.
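A sketch of the second option, resetting the iterator with keys before each search. To make the result easy to check, this variant of find_name returns the name instead of printing it; that change is mine.

```perl
use strict;
use warnings;

my %name_to_num = ( one => 1, two => 2, three => 3 );

sub find_name {
    my ($target) = @_;
    keys %name_to_num;   # reset the iterator left over from any earlier call
    while ( my ($name, $num) = each %name_to_num ) {
        return $name if $num == $target;   # early return is now harmless
    }
    return;
}

for my $attempt (1, 2) {
    my $name = find_name(2) // 'not found';
    print "attempt $attempt: $name\n";   # "two" both times
}
```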
These are just variations on the theme of modifying a structure that is being iterated over:
map, grep and sort where the code reference modifies the list of items to sort.
Another issue with sort arises where the code reference is not idempotent (in the comp sci sense)--sort_func($a, $b) must always return the same value for any given $a and $b.