Builtin method of culling all values outside lower and upper, perl array - perl

I've got an array in perl which contains sorted non-contiguous values. For example: 1 2 3 5 7 11 13 15.
I want to remove all values that are outside lower and upper, keeping lower and upper in the returned selection. My method of doing that looks like this (could probably be improved by using slice):
my @culledArray;
for ( my $i = 0; $i < scalar(@array); $i++ ) {
    if ( ( $array[$i] <= $_[1] ) and ( $array[$i] >= $_[0] ) ) {
        push(@culledArray, $array[$i]);
    }
}
where the lower and upper are contained in $_[0] and $_[1], respectively. Is there a perl builtin that does this?

Don't know anything built-in that would do that (that is quite a specific requirement), but you can save yourself some typing by using grep:
my @culledArray = grep {( $_ <= $_[1] ) and ( $_ >= $_[0] )} @array;
If the list is long and you don't want to copy it, finding the start and end indices and using a slice might be interesting.
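As a rough sketch of that idea, assuming the array is sorted in ascending numeric order, a binary search can find both slice boundaries without scanning every element. The helper names first_index_where and cull are made up for this illustration and are not Perl builtins.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the first element of the sorted array
# @$aref for which $pred returns true, or scalar(@$aref) if there is none.
# Assumes $pred is false for a prefix of the array and true for the rest.
sub first_index_where {
    my ($aref, $pred) = @_;
    my ($lo, $hi) = (0, scalar @$aref);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($pred->($aref->[$mid])) { $hi = $mid }
        else                        { $lo = $mid + 1 }
    }
    return $lo;
}

# Hypothetical wrapper: values within [$lower, $upper], taken as a slice.
sub cull {
    my ($lower, $upper, $aref) = @_;
    my $start = first_index_where($aref, sub { $_[0] >= $lower });
    my $end   = first_index_where($aref, sub { $_[0] >  $upper }) - 1;
    return @{$aref}[$start .. $end];    # empty list when $end < $start
}

my @array = (1, 2, 3, 5, 7, 11, 13, 15);
my @kept  = cull(3, 11, \@array);
print "@kept\n";    # 3 5 7 11
```

The array is passed by reference so that only the selected slice is copied.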

This is messy, but my unit tests pass, so it seems to work. Find the lower and upper indexes, relying on the fact that @array is a sorted list and $_[0] <= $_[1], then create @culledArray from @array[$lowerIndex .. $upperIndex]:
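my @culledArray;
my $index = 0;
# Skip elements below the lower bound (stop at the end of the array).
++$index while $index <= $#array and $array[$index] < $_[0];
my $lowerIndex = $index;
# Step past every element that is still within the upper bound.
++$index while $index <= $#array and $array[$index] <= $_[1];
my $upperIndex = $index - 1;    # last element that is <= upper
@culledArray = @array[$lowerIndex .. $upperIndex];
return \@culledArray;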
I'd love to know the efficiency of this vs the answer Mat gave. I'm almost sure that I don't necessarily traverse the entire @array (because I traverse from index 0 only until I find $upperIndex). I'm not sure how the grep method in the linked answer works, or how perl implements the slicing of @array to @culledArray in the above code, though.

It looks like you may be using percentiles or quantiles? If so then Statistics::Descriptive may help.
The percentile method returns the value and index at that percentile, so you can use code as below
use strict;
use warnings;
use Statistics::Descriptive;
my @data = qw/ 1 2 3 5 7 11 13 15 /;
my $stat = Statistics::Descriptive::Full->new;
$stat->add_data(@data);
my ($d25, $i25) = $stat->percentile(25);
my ($d75, $i75) = $stat->percentile(75);
my @subset = ($stat->get_data)[$i25 .. $i75];
print "@subset\n";
output
2 3 5 7 11

Related

Vector binning algorithm in Perl

I have the following vector:
19.01
20.2572347267
16.4893617021
19.0981432361
36.3636363636
20.41
It's actually much longer, but that doesn't matter. I need an algorithm to bin these values into a hash. The hash keys must be floating point values that start from the minimum value + 1 (in this case 17.48...) and increase by 1. The values of the hash must be the number of elements that fall into the corresponding bin, i.e. the end result should be:
$hash{17.49}=1
$hash{18.49}=0
$hash{19.49}=2
$hash{20.49}=2
$hash{21.49}=0
$hash{22.49}=0
.
.
.
$hash{35.49}=0
$hash{37.49}=1
Please help guys.
This seems to work:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use List::Util qw{ min };
my @vector = qw( 19.01
                 20.2572347267
                 16.4893617021
                 19.0981432361
                 36.3636363636
                 20.41
               );
my %hash;
my $min = min(@vector);
for my $n (@vector) {
    my $diff = $n - $min;
    ++$hash{ 1 + $min + int $diff };
}
print Dumper \%hash;
If you need the zeroes as well, just add the following before the loop:
my $max = max(@vector);
# Build the keys with the same arithmetic as the counting loop so they match.
$hash{ 1 + $min + $_ } = 0 for 0 .. int($max - $min);
(And include max in the use clause, too.)
Came up with a sweet solution, hopefully somebody else will also find it helpful.
use POSIX;
sub frac { $_[0]-floor($_[0]) } # saw this little function posted somewhere, kudos to the guy who came up with it
for (my $x = ${min_value} + 1; $x <= ${max_value} + 1; $x += 1) # if you don't need the zeroes, remove this loop
{
    $bins{$x} = 0;
}
foreach my $n (@array)
{
    $bins{floor($n+1)+frac($min_value)}++;
}
floor() or ceil() (with use POSIX;) should be used instead of int(), because int() can produce erroneous results: a computed value that prints as 278 may be stored internally as 277.99999999997899999 (for example), so int() of it turns out equal to 277, which may mess up your computation. I read this somewhere, but can't find the link...
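A small sketch of where the two functions actually diverge: int() truncates toward zero while floor() rounds toward negative infinity, so they disagree for a negative quotient that lands just short of an integer. The expression below is the one I recall from the perldoc entry for int, which reports -268 for it; treat the exact stored value as platform-dependent.

```perl
use strict;
use warnings;
use POSIX qw(floor);

my $q = -6.725 / 0.025;    # mathematically -269, stored slightly above it

printf "int:     %d\n", int($q);             # truncates toward zero
printf "floor:   %d\n", floor($q);           # rounds toward negative infinity
printf "rounded: %s\n", sprintf('%.0f', $q); # rounds to the nearest integer
```

When the goal is rounding rather than truncation, sprintf with '%.0f' is usually the safer choice.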

Selecting elements from an array and putting it in another array in Perl

I have an array containing 10 numbers. I want to pick numbers in array index 0,2,4,6,8 and put them in a new array. Likewise with index 1,3,5,7,9. I am new to Perl (started a few days ago).
My program:
my @b;
@a = (1,2,3,4,5,6,7,8,9,10);
for($i=0;$i<=$#a;$i++)
{
push(@b,$a[$i+1]);
}
print "@b";
What am I doing wrong?
I suggest avoiding the C-style for loop, as it's easy to make a mistake somewhere in its usage, and using foreach instead:
my @a = (1,2,3,4,5,6,7,8,9,10);
my (@even, @odd);
foreach my $i (0 .. $#a) {
    if ($i % 2) { push @odd, $a[$i] } else { push @even, $a[$i] }
}
You can also use map to test the array index modulo 2 and then, for @even, either filter the element out by returning () or take its value with $a[$_]:
my @even = map { $_%2 ? () : $a[$_] } 0 .. $#a;
my @odd = map { $_%2 ? $a[$_] : () } 0 .. $#a;
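The same selection can also be sketched with array slices, where grep builds the list of wanted indices and the slice pulls out the elements. This is just an alternative spelling of the map version, using the same sample data.

```perl
use strict;
use warnings;

my @a = (1 .. 10);

# An array slice takes a list of indices and returns those elements.
my @even = @a[ grep { $_ % 2 == 0 } 0 .. $#a ];    # indices 0, 2, 4, 6, 8
my @odd  = @a[ grep { $_ % 2 == 1 } 0 .. $#a ];    # indices 1, 3, 5, 7, 9

print "@even\n";    # 1 3 5 7 9
print "@odd\n";     # 2 4 6 8 10
```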
Simple.
my @a = (1,2,3,4,5,6,7,8,9,10);
my (@even, @odd);
for ( @a ) {
    $_ % 2 ? push @odd, $_ : push @even, $_;
}
A few things:
Use the pragmas use strict; and use warnings;. These will catch a lot of errors. If you use use strict;, you'll have to declare your variables with my (sometimes you'll use our, but 99% of the time, you'll use my)
In your for loop, you're using the default variable $_. This variable is evil for a variety of reasons. (For one, it's global in scope, so something else could change it on you and you wouldn't know.) Declare your variables except in situations where you must use $_.
The standard is to put the { on the same line as the for or while. Another is to avoid the C-style for loop (and to avoid foreach, which is just an alias for for).
Use spaces. It's much easier to read $i <= $#a than $i<=$#a.
Here's my interpretation of your program:
#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say); #A nicer 'print'
my @a = qw(12 13 14 15 16 17 18 19 20);
my @even;
my @odd;
for my $element (0..$#a) {
    if ( $element % 2 ) {
        push @odd, $a[$element];
    }
    else {
        push @even, $a[$element];
    }
}
say '@even = ' . join ': ', @even;
say '@odd = ' . join ': ', @odd;
The output:
@even = 12: 14: 16: 18: 20
@odd = 13: 15: 17: 19
Note my for loop. I use 0..$#a to go through each index of the array. $#a returns the last index of the array. Note that this is easier to understand than the for($i=0;$i<=$#a;$i++) that you used. It's one of the reasons why C-style for loops are discouraged.
I use the modulo operator % to parse my even/odd. Modulo is like remainder division. If the number is odd, the modulo % 2 will be a 1. Otherwise, it's zero. Modulo operations are great for anything that works on a cycle.
But let's get back to your program. Here's your original code with a few minor tweaks.
I added the use strict; and use warnings;. These catch about 99% of your programming errors.
I use use feature qw(say); because say is nicer when it comes to debugging. I can take a statement, copy it, and then throw say qq(...); around it and see what it's doing.
I added a bunch of say statements to reveal the logic of your code.
Let's watch what happens. Here's your program slightly modified:
#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
my @b;
my @a = (1,2,3,4,5,6,7,8,9,10);
my $i;
for($i=0; $i<=$#a; $i++) {
    say "Index = $i Element = $a[$i + 1]";
    say qq(push(\@b, $a[$i+1]););
    push(@b,$a[$i+1]);
}
print "@b";
And here's the output:
Index = 0 Element = 2
push(@b, 2);
Index = 1 Element = 3
push(@b, 3);
Index = 2 Element = 4
push(@b, 4);
Index = 3 Element = 5
push(@b, 5);
Index = 4 Element = 6
push(@b, 6);
Index = 5 Element = 7
push(@b, 7);
Index = 6 Element = 8
push(@b, 8);
Index = 7 Element = 9
push(@b, 9);
Index = 8 Element = 10
push(@b, 10);
Use of uninitialized value in concatenation (.) or string at ./test.pl line 11.
Index = 9 Element =
Use of uninitialized value within @a in concatenation (.) or string at ./test.pl line 12.
push(@b, );
Use of uninitialized value $b[9] in join or string at ./test.pl line 15.
I can see how each push statement is being executed, and look at that: you're pushing in each and every element. Actually, you're not quite, because you used $a[$i+1] as what you're pushing, so $a[0] is skipped.
With use warnings, I can also see that I am trying to push the nonexistent $a[10] into your @b array.
Let's change your for loop to go to every other element
#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
my @b;
my @a = qw(1 2 3 4 5 6 7 8 9 10);
my $i;
for ($i=0; $i <= $#a; $i += 2) {
    push @b, $a[$i];
}
say "@b";
The first element is $a[0]. The next element in the loop is $a[2] because I added 2 to the index instead of just incrementing it by 1. Now, I'll go through all the even elements and skip all of the odd elements.
And the output:
1 3 5 7 9
(Note that $a[0] = 1. That's why they're all odd numbers. It's why I started at 12 in my program, so $a[0] = 12 which is the even number).
My preference would be to use the while and avoid the for(...; ...; ...) construct:
#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
my @b;
my @a = qw(1 2 3 4 5 6 7 8 9 10);
my $i = 0;
while ( $i <= $#a ) {    # <= so the last index is included for odd-length arrays
    push @b, $a[$i];
    $i += 2;
}
Even:
for($i=0;$i<=$#a;$i+=2)
{
    push(@b,$a[$i]);
}
Odd:
for($i=1;$i<=$#a;$i+=2)
{
    push(@b,$a[$i]);
}
List::MoreUtils has an indexes function:
use List::MoreUtils qw{indexes} ;
use 5.10.0 ;
my @a = (1,2,3,4,5,6,7,8,9,10) ;
# index of array where even
say foreach indexes { $_ % 2 == 0 } @a ;
# index of array where odd
say foreach indexes { $_ % 2 != 0 } @a ;
I admit this may be sort of inelegant, and it's possibly cheating here to use a module, especially one that is not in CORE. It would be convenient if List::MoreUtils and List::Util were just one CORE module, but it would still not be as elegant as some of the other answers here.
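For comparison, here is a sketch that stays within List::Util, which does ship with Perl. It assumes a List::Util recent enough to provide pairkeys and pairvalues (1.29 or later, as far as I know) and an array with an even number of elements; it treats the array as a flat list of pairs, so the "keys" are the elements at even indices and the "values" are those at odd indices.

```perl
use strict;
use warnings;
use List::Util qw(pairkeys pairvalues);

my @a = (1 .. 10);

my @even = pairkeys @a;      # elements at indices 0, 2, 4, ...
my @odd  = pairvalues @a;    # elements at indices 1, 3, 5, ...

print "@even\n";    # 1 3 5 7 9
print "@odd\n";     # 2 4 6 8 10
```

With an odd number of elements the final pair is incomplete, so the index-based approaches are the safer general answer.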

how to add numbers in numeric string in perl

I am facing some problem while adding values in numeric string:
I have a string that looks like 02:03:05:07:04:06. All the numbers have to be <10. Now, I have to take a random number from 1-9 (e.g. 3) and add it to the number in the last position of the string.
If the sum >= 10, then I have to add that number to the number in the second-last position.
So far, I have
#!/usr/bin/perl -w
use strict;
my $str='02:03:05:07:04:06';
my @arr=split(/:/,$str);
my @new_arr=pop(@arr);
my $rand_val=int(rand(9));
my $val=$new_arr[0]+$rand_val;
if($val>=10)
{
# I am unable to come up with the logic here :(
}
Please help me out of this problem.
After adding the number we have to join the string and print it also :)
my $str = '02:03:05:07:04:06';
my @nums = split /:/, $str;
my $add = int(rand(9)) + 1;
my $overflow = 1;
for (1..@nums) {
    if ($nums[-$_] + $add < 10) {
        $nums[-$_] += $add;
        $overflow = 0;
        last;
    }
}
die "Overflow" if $overflow;
$str = join ':', map sprintf('%02d', $_), @nums;
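The same logic can be sketched as a subroutine that takes the number to add as a parameter, which makes it testable without rand. The name add_to_fields is made up for this illustration, and it assumes the intended rule is "add to the rightmost field that stays below 10".

```perl
use strict;
use warnings;

# Hypothetical helper: add $add to the rightmost field whose sum stays
# below 10, and return the re-joined string; die if no field can take it.
sub add_to_fields {
    my ($str, $add) = @_;
    my @nums = split /:/, $str;
    for my $i (reverse 0 .. $#nums) {
        next if $nums[$i] + $add >= 10;    # would overflow, try the field to the left
        $nums[$i] += $add;
        return join ':', map { sprintf '%02d', $_ } @nums;
    }
    die "Overflow: no field can take $add\n";
}

print add_to_fields('02:03:05:07:04:06', 3), "\n";    # 02:03:05:07:04:09
print add_to_fields('02:03:05:07:04:06', 5), "\n";    # 02:03:05:07:09:06
```

Calling it as add_to_fields($str, int(rand(9)) + 1) reproduces the random behaviour.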
I just ran this and it works. The caveat is that the lower the last number of the string is, the smaller the chance that "if ($val>=10)" will be true.
This doesn't solve the problem of your rand_val potentially being 0, but I'll leave that as a task for you to resolve. This should give you what you're looking for in terms of traversing the values in the array from the end until the sum of the random value and the current value is less than 10.
use strict;
my $str='02:03:05:07:04:06';
my @arr=split(/:/,$str);
my $rand_val=int(rand(9));
my $val;

foreach my $i (reverse @arr){
    $val = $i + $rand_val;
    next if ($val >= 10);
    print "val: $val, rand_val: $rand_val, value_used: $i\n";
    last if ($val < 10);
}
I see a mistake: you do
my @new_arr=pop(@arr);
(...)
my $val=$new_arr[0]+$rand_val;
but pop only returns the last element, not a list.

How to sum two lists element-wise

I want to parse a file line by line, each of which containing two integers, then sum these values in two distinct variables. My naive approach was like this:
my $i = 0;
my $j = 0;
foreach my $line (<INFILE>)
{
($i, $j) += ($line =~ /(\d+)\t(\d+)/);
}
But it yields the following warning:
Useless use of private variable in void context
hinting that resorting to the += operator triggers evaluation of the left-hand side in scalar instead of list context (please correct me if I'm wrong on this point).
Is it possible to achieve this elegantly (possibly in one line) without resorting to arrays or intermediate variables?
Related question: How can I sum arrays element-wise in Perl?
No, it's because the expression ($i, $j) += (something, 1) parses as adding 1 to $j only, leaving $i hanging in void context. Perl 5 has no hyper-operators or automatic zipping for the assignment operators such as +=. This works:
my ($i, $j) = (0, 0);
foreach my $line (<INFILE>) {
my ($this_i, $this_j) = split /\t/, $line;
$i += $this_i;
$j += $this_j;
}
You can avoid the repetition by using a compound data structure instead of named variables for the columns.
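A minimal sketch of that suggestion, assuming tab-separated integer columns: the per-column totals live in one array, so the loop body does not change with the number of columns. The @lines array is a stand-in for the lines that would be read from INFILE.

```perl
use strict;
use warnings;

my @sums;

# Stand-in for lines read from INFILE (assumed tab-separated integers).
my @lines = ("11\t1\n", "22\t2\n", "33\t3\n");

for my $line (@lines) {
    chomp $line;
    my @cols = split /\t/, $line;
    $sums[$_] += $cols[$_] for 0 .. $#cols;    # add column-wise
}

print "@sums\n";    # 66 6
```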
First of all, your way of adding arrays pairwise does not work (the related question you posted yourself gives some hints there).
And for the parsing part: How about just splitting the lines? If your lines are formatted accordingly (whitespaces should not be a problem).
split(/\t/, $line, 2)
If you really, really want to do it in one line, you could do something like this (though I don't think you would call it elegant):
my @a = (0, 0);
foreach my $line (<INFILE>)
{
    @a = map { shift(@a)+$_ } split(/\t/, $line, 2);
}
For an input of @lines = ("11\t1\n", " 22 \t 2 \n", "33\t3"); it gave me @a = (66, 6).
I would advise you to use the split part of my answer, but not the adding-up part. There is nothing wrong with using more than one line! If it makes your intention clearer, more lines are better than one. But then again, I'm hardly using perl nowadays but python instead, so my perl coding style might have a "bad" influence there...
It is quite possible to swap the pair over for each addition, meaning you're always adding to the same element in each pair. (This generalises to rotating multi-element arrays if required.)
use strict;
use warnings;
my @pair = (0, 0);
while (<DATA>) {
    @pair = ($pair[1], $pair[0] + $_) for /\d+/g;
}
print "@pair\n";
__DATA__
99 42
12 15
18 14
output
129 71
Here's another option:
use Modern::Perl;
my $i = my $j = 0;
map{$i += $_->[0]; $j += $_->[1]} [split] for <DATA>;
say "$i - $j";
__DATA__
1 2
3 4
5 6
7 8
Output:
16 - 20

How should I use Perl's scalar range operator?

What is the scalar ".." operator typical usage? Is it only selecting blocks of text?
Interesting example by myself:
sub get_next {
print scalar($$..!$$), "\n";
}
get_next for 1 .. 5; # prints numbers from 1 to 5
get_next for 1 .. 5; # prints numbers from 6 to 10
People hardly seem to know about it based on questions here, but, yes, I'd say typical usage is selecting blocks of text, either with
while (<>) {
print if /BEGIN/ .. /END/;
}
or
while (<>) {
print if 3 .. 5; # prints lines 3 through 5
}
The latter is syntactic sugar for checking against the input line-number ($.)
... if $. == 3 .. $. == 5;
which suggests the weird-looking
#! /usr/bin/perl -l
for ($_ = 1; $_ <= 10; ++$_) {
print if $_ == 4 .. $_ == 7;
}
The above program's output is
4
5
6
7
If you have some sort of bracketing condition, test for it in subs:
for (...) {
do_foo($x,$y,$z) if begin($x) .. end($z);
}
Outside of perl -e, you really shouldn't. It is esoteric and funky. See my post from not even 24 hours ago about how it maintains state with the calling context. This stung me in a real-world application, because I was trying to be clever and found what I thought was a good use case for it.
Here's a place where you need to be very careful about unintentional use of the scalar range operator: subroutine returns.
sub range {
my $start = shift;
my $end = shift;
return $start .. $end;
}
@foo = range( 1, 5 ); # ( 1, 2, 3, 4, 5 )
$foo = range( 1, 5 ); # False or maybe true. Who knows.
If you call this subroutine in scalar context, you'll be in for a surprise.
After being bitten by some variant of this problem, I now always make sure I assign a list return into an array, thereby getting array-like context behaviors for my subs. If I need other context specific behavior (very rarely) I use wantarray.
sub range {
my $start = shift;
my $end = shift;
my @result = $start .. $end;
return @result;
}
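For the rare case that calls for context-specific behaviour, a sketch with wantarray might look like the following. Returning the element count in scalar context is an arbitrary choice made for this illustration; the point is only that the range is always built in list context.

```perl
use strict;
use warnings;

# Illustrative variant of range(): the list in list context,
# the number of elements in scalar context.
sub range {
    my ($start, $end) = @_;
    my @result = $start .. $end;    # the range operator is in list context here
    return wantarray ? @result : scalar @result;
}

my @list  = range(1, 5);
my $count = range(1, 5);

print "@list\n";     # 1 2 3 4 5
print "$count\n";    # 5
```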
Another use is simple counters like this:
perl -e 'foreach ( 1 .. 100 ){print"$_\n"}'