How to count frequency in each column, corresponding to first column in perl - perl

I have a input like this.
a x1
a x1
a y1
b x1
b y1
b z1
c y1
c z1
Want a output like this
a = 3, x1= 2, y1= 1, z1= 0
b = 3, x1= 1, y1= 1, z1=1
c =2, x1=0, y1=1, z1=1
I want to make a perl program for this, but do not know .
Please help..

This is a pretty simple homework problem (which is why this has been downvoted a couple of times, no doubt), but it's a good way to show the power of hashes if you're struggling with that.
Let's break this down. You want to classify things by the first column (and count the number of occurrences), counting the associated data. Hashes are perfect for this.
Let's assume that the final hash will look something like this:
{
'a' => {
'*count' => 3,
'x1' => 2,
'y1' => 1,
},
. . .
}
(where '*count' is a string that should never appear in your data) and that your output will look like your specification.
Parsing the data is pretty simple.
my $hash = {};
my $count = '*count';
for( #lines ) {
chomp;
my ($x, $y) = split( /\s+/ );
$hash->{$x}->{$count}++;
$hash->{$x}->{$y}++;
}
Printing it out is just as simple:
for my $x ( sort keys %{$hash} ) {
my #output = ( sprintf( '%s = %d', $x, $hash->{$x}->{$count} ) );
for my $y ( qw{ x1 y1 z1 } ) {
push( #output, sprintf( '%s = %d', $y, $hash->{$x}->{$y} ) );
}
print join( ', ', #output ) . "\n";
}
Note that if, say, $hash->{'c'}->{'z1'} doesn't exist, the value returned will still be 0. You can do the additional checking with exists if you like, but it shouldn't be necessary.

Related

Please explain perl statement

I am reading Intermediate Perl book and in Chapt10 there is this code. I added few print statements but core logic is untouched.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my #input = qw(Gilligan Skipper Professor Ginger Mary Ann);
my #sorted_positions = sort { $input[$a] cmp $input[$b] } 0 .. $#input;
print Dumper( \#sorted_positions );
my #ranks;
#ranks[#sorted_positions] = ( 1 .. #sorted_positions );
print Dumper( \#ranks );
foreach ( 0 .. $#ranks ) {
print "$input[$_] sorts into position $ranks[$_]\n";
}
When i check the Dumper output then for #sorted_positions array it is printing
$VAR1 = [
5,
0,
3,
4,
2,
1
];
which make sense to me but for #ranks array it is printing
$VAR1 = [
2,
6,
5,
3,
4,
1
];
I am unable to understand what this line is doing.
#ranks[#sorted_positions] = ( 1 .. #sorted_positions );
I am able understand what output means in reference to the program but not able to understand how that output is coming i.e. what exactly is perl doing inside that statement.
The line:
#ranks[#sorted_positions] = ( 1 .. #sorted_positions );
is equivalent to:
#ranks[5,0,3,4,2,1] = (1,2,3,4,5,6);
which is equivalent to:
$ranks[5] = 1;
$ranks[0] = 2;
$ranks[3] = 3;
$ranks[4] = 4;
$ranks[2] = 5;
$ranks[1] = 6;
The example is using slices which are documented in the perldata man page.
Let suppose you want to assign string 'x' into the first position of an array, 'y' into the second position and 'z' into the third position. Instead of doing three assignments, you can do them at the same time;
#array[0,1,2] = ("x", "y", "z");
You don't have to do these in order;
#array[2,0,1] = ("z", "x", "y"); # same result
The right-hand side of the line in question produces a list of numbers starting with 1 and finishing at the integer value returned by the expression #sorted_positions (which is 6 as there are 6 things in #sorted_positions) - ie its identical to;
(1,2,3,4,5,6)
So, the whole statement is identical to:
#ranks[5,0,3,4,2] = (1,2,3,4,5,6) ;
So, if we take just one iteration of this:
foreach ( 0 .. $#ranks ) {
print "$input[$_] sorts into position $ranks[$_]\n";
}
we get;
print "$input[0] sorts into position $ranks[0]\n"
# ie: Gilligan sorts into position 2
Hope that helps.

perl: sprintf for element in list

I've been really confused about this, I'm trying to create a big matrix of numbers and I want to use sprintf with perl to have a nicer output. I'm trying to use sprintf like so
my $x = 0;
my $y = 0;
for ($x=1; $x<=$steps; $y++) { # loop through lines
for ($y=0; $y<=$distances; $y++) {
my $format = sprintf ("%s",$matrix[$x][$y]);
but this is really doing my head in, as I am looping through all the values of $x and $y and getting their combinations. So I am not sure if I'm meant to use more formatting arguments like so
my $format = sprintf ("%s%s%s",$matrix[$x][$y]);
(of course this is giving me compilation errors as it's not right)
But when I only use one argument, I can't put spaces in between my columns :/ Can somebody explain what's happening? I really don't understand what I'm meant to do to get the formatting nice. I'm looking to just align the columns and have a couple of whitespaces between them. Thank you all so much.
I would be thinking in terms of using map, as a way to display every element:
#!/usr/bin/env perl
use strict;
use warnings;
my #matrix = ( [1,2,3,4],
[5,6,7,8],
[9,10,11,12], );
print join ("\n", map { join ( "\t", #$_ ) } #matrix );
This is formatting on tab-stops, rather than fixed width columns, and outputs:
1 2 3 4
5 6 7 8
9 10 11 12
If you particularly wanted sprintf though:
foreach my $row ( #matrix ) {
print map { sprintf("%5s", $_) } #$row,"\n";
}
(5 columns wide).
In each of these, I'm working on whole rows - that only really applies though, if I'm right about the assumptions I've made about which elements you're displaying.
At a very basic level - your code could work as:
#!/usr/bin/env perl
use strict;
use warnings;
my #matrix = ( [ 1, 2, 3, 4 ],
[ 5, 6, 7, 8 ],
[ 9, 10, 11, 12 ], );
my $steps = 2;
my $distances = 3;
for ( my $x = 1; $x <= $steps; $x++ ) { # loop through lines
for ( my $y = 0; $y <= $distances; $y++ ) {
printf( "%5s", $matrix[$x][$y] );
}
print "\n";
}
Although note - that will only work with equal numbers of columns. You could, however, do something like:
#!/usr/bin/env perl
use strict;
use warnings;
my #matrix = ( [ 1, 2, ],
[ 3, 4, 5, ],
[ 6, 7, 8, 9, 10, 11, 12 ], );
my $steps = 2;
my $distances = 3;
for ( my $x = 1; $x <= $steps; $x++ ) { # loop through lines
for ( my $y = 0; $y <= $distances; $y++ ) {
printf( "%5s", $matrix[$x][$y] // '' );
}
print "\n";
}
Which omits the first row (because you set $x to 1), and iterates up to 4 columns:
3 4 5
6 7 8 9
This omits the extra values on the last line, and uses // to test if the cell is empty or not.
for my $row (#matrix) {
my $format = join(' ', ('%5.2f') x #$row)."\n";
printf($format, #$row);
}
If all rows have the same number of columns, you could calculate the format once.
if (#matrix) {
my $format = join(' ', ('%5.2f') x #{$matrix[0]})."\n";
for my $row (#matrix) {
printf($format, #$row);
}
}
If the size of the columns isn't unknown in advance, you'll need to need to perform the following in order:
Format the cells (if needed),
Find the length of the largest cell of each column, then
Print out the matrix with padding.
The following assumes every row of the matrix is the same length.
use List::Util qw( max );
if (#matrix) {
for my $row (#matrix) {
$_ = sprinf('%.2f', $_) for #$row;
}
my $num_cols = #{$matrix[0]};
my #col_sizes = (0) x $num_cols;
for my $row (#matrix) {
$col_sizes[$x] = max(0, $col_sizes[$x], $row->[$x]);
}
my $format = join(' ', map { "%$_s" } #col_sizes)."\n";
for my $row (#matrix) {
printf($format, #$row);
}
}

Assignment of multiple array subroutine parameters in Perl doesn't work

I'm confused about perl subroutine parameters in this example
when i use references in subroutine parameters it works:
#a = ( 1, 2 );
#b = ( 5, 8 );
#c = add_vecpair( \#a, \#b );
print "#c\n";
print $a[0];
sub add_vecpair { # assumes both vectors the same length
my ( $x, $y ) = #_; # copy in the array references
my #result;
#$x[0] = 2;
for ( my $i = 0; $i < #$x; $i++ ) {
$result[$i] = $x->[$i] + $y->[$i];
}
return #result;
}
but when i don't use references as parameters like this:
#a = ( 1, 2 );
#b = ( 5, 8 );
#c = add_vecpair( #a, #b );
print "#c\n";
print $a[0];
sub add_vecpair { # assumes both vectors the same length
my ( #x, #y ) = #_; # copy in the array references
my #result;
print #y;
for ( my $i = 0; $i < #x; $i++ ) {
$result[$i] = $x[$i] + $y[$i];
}
return #result;
}
..it doesn't work. When do i need to use references as subroutine parameters?
Short version: The issue is this line:
my (#x, #y) = #_;
Assignments are greedy. #x is treated first, and is given as many values from #_ as it can handle. And as it can handle all of them, it ends up getting all of the contents of #_, and #y get none.
The result is the same as this:
my #x = #_; # Gets all of the arguements
my #y; # Gets nothing, and is therefore declared but uninitialized.
This is why using references is recommended when subroutines take more than one value as arguement, and at least one of those values are arrays or hashes.
Longer version:
#_ is a composite of all of the arguements passed to the subroutine, so the original container doesn't matter. Consider the code snippets below. The first one is yours, the second one does the exact same thing, but more clearly displays what is happening.
#a = (1, 2);
#b = (5, 8);
add_vecpair(#a,#b);
....is the same as:
add_vecpair(1, 2, 5, 8);
To further hilight the problem, hashes get really messy if treated this way:
%a = ('a' => 1,
'b' => 2);
%b = ('c' => 3,
'd' => 4);
somefunction(%a, %b);
...is the same as:
somefunction('a', 1, 'b', 2, 'c', 3, 'd', 4);
When you call Perl subroutines with array or hash parameters, they are flattened out to a single list. Therefore in the second case your two array parameters loose their identities and #_ becomes a single array with the elements of both #a and #b.

How to print range values with specific reference number are given?

I have a set of data file looks like below. I would like to get the interpolation final value (final,P) by referring to 2 set of number range (scoreA and scoreB). Let's say for "Eric", his scoreA is 35 (value between range 30.00 - 40.00) and scoreB is 48 (a value between range 45.00 - 50.00). He will get 2 set of final values range which are (22.88,40.90) & (26.99,38.99). And I would like to get the final value of "Eric" and "George" in the data file. "George"'s scoreA = 38 and scoreB = 26.
After formula calculation, I want to get the exact final value when his scoreA=35 & scoreB=45. Let's assume formula is P=X+Y (P is final value), so far I have been trying the code as shown below. However it cannot get the correct lines.
How to get the exactly final value range by referring to the data given?
data file
Student_name ("Eric")
/* This is a junk line */
scoreA ("10.00, 20.00, 30.00, 40.00")
scoreB ("15.00, 30.00, 45.00, 50.00, 55.00")
final (
"12.23,19.00,37.88,45.98,60.00",\
"07.00,20.11,24.56,45.66,57.88",\
"05.00,15.78,22.88,40.90,57.99",\
"10.00,16.87,26.99,38.99,40.66"\)
Student_name ("Liy")
/* This is a junk line */
scoreA ("5.00, 10.00, 20.00, 60.00")
scoreB ("25.00, 30.00, 40.00, 55.00, 60.00")
final (
"02.23,15.00,37.88,45.98,70.00",\
"10.00,28.11,34.56,45.66,57.88",\
"08.00,19.78,32.88,40.90,57.66",\
"10.00,27.87,39.99,59.99,78.66"\)
Student_name ("Frank")
/* This is a junk line */
scoreA ("2.00, 15.00, 25.00, 40.00")
scoreB ("15.00, 24.00, 38.00, 45.00, 80.00")
final (
"02.23,15.00,37.88,45.98,70.00",\
"10.00,28.11,34.56,45.66,57.88",\
"08.00,19.78,32.88,40.90,57.66",\
"10.00,27.87,39.99,59.99,78.66"\)
Student_name ("George")
/* This is a junk line */
scoreA ("10.00, 15.00, 20.00, 40.00")
scoreB ("25.00, 33.00, 46.00, 55.00, 60.00")
final (
"10.23,25.00,37.88,45.98,68.00",\
"09.00,28.11,34.56,45.66,60.88",\
"18.00,19.78,32.88,40.90,79.66",\
"17.00,27.87,40.99,59.99,66.66"\)
Coding
data();
sub data() {
my $cnt = 0;
while (my #array = <FILE>) {
foreach $line(#array) {
if ($line =~ /Student_name/) {
$a = $line;
if ($a =~ /Eric/ or $cnt > 0 ) {
$cnt++;
}
if ( $cnt > 1 and $cnt <= 3 ) {
print $a;
}
if ( $cnt > 2 and $cnt <= 4 ) {
print $a;
}
if ( $cnt == 5 ) {
$cnt = 0;
}
}
}
}
}
Result
Eric final=42.66
George final=24.30
In my comment I said that parsing is fairly easy. Here is how it could be done. As the question lacks a proper specification of the file format, I will assume the following:
The file consists of properties, which have values:
document ::= property*
property ::= word "(" value ("," value)* ")"
A value is a double-quoted string containing numbers seperated by commata, or a single word:
value ::= '"' ( word | number ("," number)* ) '"'
Spaces, backslashes, and comments are irrelevant.
Here is a possible implementation; I will not go into the details of explaining how to write a simple parser.
package Parser;
use strict; use warnings;
sub parse {
my ($data) = #_;
# perform tokenization
pos($data) = 0;
my $length = length $data;
my #tokens;
while(pos($data) < $length) {
next if $data =~ m{\G\s+}gc
or $data =~ m{\G\\}gc
or $data =~ m{\G/[*].*?[*]/}gc;
if ($data =~ m/\G([",()])/gc) {
push #tokens, [symbol => $1];
} elsif ($data =~ m/\G([0-9]+[.][0-9]+)/gc) {
push #tokens, [number => 0+$1];
} elsif ($data =~ m/\G(\w+)/gc) {
push #tokens, [word => $1];
} else {
die "unreckognized token at:\n", substr $data, pos($data), 10;
}
}
return parse_document(\#tokens);
}
sub token_error {
my ($token, $expected) = #_;
return "Wrong token [#$token] when expecting [#$expected]";
}
sub parse_document {
my ($tokens) = #_;
my #properties;
push #properties, parse_property($tokens) while #$tokens;
return #properties;
}
sub parse_property {
my ($tokens) = #_;
$tokens->[0][0] eq "word"
or die token_error $tokens->[0], ["word"];
my $name = (shift #$tokens)->[1];
$tokens->[0][0] eq "symbol" and $tokens->[0][1] eq '('
or die token_error $tokens->[0], [symbol => '('];
shift #$tokens;
my #vals;
VAL: {
push #vals, parse_value($tokens);
if ($tokens->[0][0] eq 'symbol' and $tokens->[0][1] eq ',') {
shift #$tokens;
redo VAL;
}
}
$tokens->[0][0] eq "symbol" and $tokens->[0][1] eq ')'
or die token_error $tokens->[0], [symbol => ')'];
shift #$tokens;
return [ $name => #vals ];
}
sub parse_value {
my ($tokens) = #_;
$tokens->[0][0] eq "symbol" and $tokens->[0][1] eq '"'
or die token_error $tokens->[0], [symbol => '"'];
shift #$tokens;
my $value;
if ($tokens->[0][0] eq "word") {
$value = (shift #$tokens)->[1];
} else {
my #nums;
NUM: {
$tokens->[0][0] eq 'number'
or die token_error $tokens->[0], ['number'];
push #nums, (shift #$tokens)->[1];
if ($tokens->[0][0] eq 'symbol' and $tokens->[0][1] eq ',') {
shift #$tokens;
redo NUM;
}
}
$value = \#nums;
}
$tokens->[0][0] eq "symbol" and $tokens->[0][1] eq '"'
or die token_error $tokens->[0], [symbol => '"'];
shift #$tokens;
return $value;
}
Now, we get the following data structure as output from Parser::parse:
(
["Student_name", "Eric"],
["scoreA", [10, 20, 30, 40]],
["scoreB", [15, 30, 45, 50, 55]],
[
"final",
[12.23, 19, 37.88, 45.98, 60],
[7, 20.11, 24.56, 45.66, 57.88],
[5, 15.78, 22.88, 40.9, 57.99],
[10, 16.87, 26.99, 38.99, 40.66],
],
["Student_name", "Liy"],
["scoreA", [5, 10, 20, 60]],
["scoreB", [25, 30, 40, 55, 60]],
[
"final",
[2.23, 15, 37.88, 45.98, 70],
[10, 28.11, 34.56, 45.66, 57.88],
[8, 19.78, 32.88, 40.9, 57.66],
[10, 27.87, 39.99, 59.99, 78.66],
],
...,
)
As a next step, we want to transform it into nested hashes, so that we have the structure
{
Eric => {
scoreA => [...],
scoreB => [...],
final => [[...], ...],
},
Liy => {...},
...,
}
So we simply run it through this small sub:
sub properties_to_hash {
my %hash;
while(my $name_prop = shift #_) {
$name_prop->[0] eq 'Student_name' or die "Expected Student_name property";
my $name = $name_prop->[1];
while( #_ and $_[0][0] ne 'Student_name') {
my ($prop, #vals) = #{ shift #_ };
if (#vals > 1) {
$hash{$name}{$prop} = \#vals;
} else {
$hash{$name}{$prop} = $vals[0];
}
}
}
return \%hash;
}
So we have the main code
my $data = properties_to_hash(Parser::parse( $file_contents ));
Now we can move on to Part 2 fo the problem: calculating your scores. That is, once you make clear what you need done.
Edit: Bilinear interpolation
Let f be the function that returns the value at a certain coordinate. If we have a value at those coordinates, we can return that. Else, we perform bilinear interpolation with the next known values.
The formula for bilinear interpolation[1] is:
f(x, y) = 1/( (x_2 - x_1) · (y_2 - y_1) ) · (
f(x_1, y_1) · (x_2 - x) · (y_2 - y)
+ f(x_2, y_1) · (x - x_1) · (y_2 - y)
+ f(x_1, y_2) · (x_2 - x) · (y - y_1)
+ f(x_2, y_2) · (x - x_1) · (y - y_1)
)
Now, scoreA denote the positions of the data points in the final table on the first axis, scoreA the positions on the second axis. We have to do the following:
assert that the requested values x, y are inside the bounds
fetch the next smaller and next larger positions
perform interpolation
.
sub f {
my ($data, $x, $y) = #_;
# do bounds check:
my ($x_min, $x_max, $y_min, $y_max) = (#{$data->{scoreA}}[0, -1], #{$data->{scoreB}}[0, -1]);
die "indices ($x, $y) out of range ([$x_min, $x_max], [$y_min, $y_max])"
unless $x_min <= $x && $x <= $x_max
&& $y_min <= $y && $y <= $y_max;
To fetch the boxing indices x_1, x_2, y_1, y_2 we need to iterate through all possible scores. We'll also remember the physical indices x_i1, x_i2, y_i1, y_i2 of the underlying arrays.
my ($x_i1, $x_i2, $y_i1, $y_i2);
for ([$data->{scoreA}, \$x_i1, \$x_i2], [$data->{scoreB}, \$y_i1, \$y_i2]) {
my ($scores, $a_i1, $a_i2) = #$_;
for my $i (0 .. $#$scores) {
if ($scores->[$i] <= $x) {
($$a_i1, $$a_i2) = $i == $#$scores ? ($i, $i+1) : ($i-1, $i);
last;
}
}
}
my ($x_1, $x_2) = #{$data->{scoreA}}[$x_i1, $x_i2];
my ($y_1, $y_2) = #{$data->{scoreB}}[$y_i1, $y_i2];
Now, interpolation according to above formula can be performed, but each access at a known index can be changed to an access via physical index, so f(x_1, y_2) would become
$final->[$x_i1][$y_i2]
Detailed Explanation of sub f
sub f { ... } declares a sub with name f, although that is probably a bad name. bilinear_interpolation might be a better name.
my ($data, $x, $y) = #_ states that our sub takes three arguments:
$data, a hash reference containing fields scoreA, scoreB and final, which are array references.
$x, the position along the scoreA-axis where interpolation is required.
$y, the position along the scoreB-axis where interpolation is required.
Next, we want to assert that the positions for $x and $y are valid aka inside bounds. The first value in $data->{scoreA} is the minimal value; the maximal value is in the last position (index -1). To get both at once, we use an array slice. A slice accesses multiple values at once and returns a list, like #array[1, 2]. Because we use complex data structures which use references, we have to dereference the array in $data->{scoreA}. This makes the slice look like #{$data->{scoreA}}[0, 1].
Now that we have the $x_min and $x_max values, we throw and error unless the requested value $x is inside the range defined by the min/max values. This is true when
$x_min <= $x && $x <= $x_max
Should either $x or $y be out of bounds, we throw an error which shows the actual bounds. So the code
die "indices ($x, $y) out of range ([$x_min, $x_max], [$y_min, $y_max])"
could, for example, throw an error like
indices (10, 500) out of range ([20, 30], [25, 57]) at script.pl line 42
Here we can see that the value for $x is too small, and $y is too large.
The next problem is to find neighbouring values. Assuming scoreA holds [1, 2, 3, 4, 5], and $x is 3.7, we want to select the values 3 and 4. But because we can pull some nifty tricks a bit later, we would rather remember the position of the neighbouring values, not the values themselves. So this would give 2 and 3 in above example (remember that arrows are zero-based).
We can do this by looping over all indices of our array. When we find a value that is ≤ $x, we remember the index. E.g. 3 is the first value that is ≤ $x, so we remember the index 2. For the next higher value, we have to be a bit carful: Obviously, we can just take the next index, so 2 + 1 = 3. But now assume that $x is 5. This passes the bounds check. The first value that is ≤ $x would be value 5, so we can remember position 4. However, there is no entry at position 5, so we could use the current index itself. Because this would lead to division by zero later on, we would be better off remembering positions 3 and 4 (values 4 and 5).
Expressed as code, that is
my ($x_i1, $x_i2);
my #scoreA = #{ $data->{scoreA} }; # shortcut to the scoreA entry
for my $i (0 .. $#scores) { # iterate over all indices: `$#arr` is the last idx of #arr
if ($scores[$i] <= $x) { # do this if the current value is ≤ $x
if ($i != $#scores) { # if this isn't the last index
($x_i1, $x_i2) = ($i, $i+1);
} else { # so this is the last index
($x_i1, $x_i2) = ($i-1, $i);
}
last; # break out of the loop
}
}
In my original code I choose a more complex solution to avoid copy-pasting the same code for finding the neighbours of $y.
Because we also need the values, we obtain them via a slice with the indices:
my ($x_1, $x_2) = #{$data->{scoreA}}[$x_i1, $x_i2];
Now we have all surrounding values $x1, $x_2, $y_1, $y_2 which define the rectangle in which we want to perform bilinear interpolation. The mathematical formula is easy to translate to Perl: just choose the correct operators (*, not · for multiplication), and the variables need dollar signs before them.
The formula I used is recursive: The definition of f refers to itself. This would imply an infinite loop, unless we do some thinking and break the recursion. f symbolizes the value at a certain position. In most cases, this means interpolating. However, if $x and $y are both equal to values in scoreA and scoreB respectively, we don't need bilinear interpolation, and can return the final entry directly.
This can be done by checking if both $x and $y are members of their arrays, and doing an early return. Or we can use the fact that $x_1, ..., $y_2 all are members of the arrays. Instead of recursing with values we know don't need interpolating, we just do an array access. This is what we have saved the indices $x_i1, ..., $y_i2 for. So wherever the original formula says f(x_1, y_1) or similar, we write the equivalent $data->{final}[$x_i1][$y_i2].

Is there an elegant zip to interleave two lists in Perl 5?

I recently "needed" a zip function in Perl 5 (while I was thinking about How do I calculate relative time?), i.e. a function that takes two lists and "zips" them together to one list, interleaving the elements.
(Pseudo)example:
#a=(1, 2, 3);
#b=('apple', 'orange', 'grape');
zip #a, #b; # (1, 'apple', 2, 'orange', 3, 'grape');
Haskell has zip in the Prelude and Perl 6 has a zip operator built in, but how do you do it in an elegant way in Perl 5?
Assuming you have exactly two lists and they are exactly the same length, here is a solution originally by merlyn (Randal Schwartz), who called it perversely perlish:
sub zip2 {
my $p = #_ / 2;
return #_[ map { $_, $_ + $p } 0 .. $p - 1 ];
}
What happens here is that for a 10-element list, first, we find the pivot point in the middle, in this case 5, and save it in $p. Then we make a list of indices up to that point, in this case 0 1 2 3 4. Next we use map to pair each index with another index that’s at the same distance from the pivot point as the first index is from the start, giving us (in this case) 0 5 1 6 2 7 3 8 4 9. Then we take a slice from #_ using that as the list of indices. This means that if 'a', 'b', 'c', 1, 2, 3 is passed to zip2, it will return that list rearranged into 'a', 1, 'b', 2, 'c', 3.
This can be written in a single expression along ysth’s lines like so:
sub zip2 { #_[map { $_, $_ + #_/2 } 0..(#_/2 - 1)] }
Whether you’d want to use either variation depends on whether you can see yourself remembering how they work, but for me, it was a mind expander.
The List::MoreUtils module has a zip/mesh function that should do the trick:
use List::MoreUtils qw(zip);
my #numbers = (1, 2, 3);
my #fruit = ('apple', 'orange', 'grape');
my #zipped = zip #numbers, #fruit;
Here is the source of the mesh function:
sub mesh (\#\#;\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#) {
my $max = -1;
$max < $#$_ && ($max = $#$_) for #_;
map { my $ix = $_; map $_->[$ix], #_; } 0..$max;
}
I find the following solution straightforward and easy to read:
#a = (1, 2, 3);
#b = ('apple', 'orange', 'grape');
#zipped = map {($a[$_], $b[$_])} (0 .. $#a);
I believe it's also faster than solutions that create the array in a wrong order first and then use slice to reorder, or solutions that modify #a and #b.
For arrays of the same length:
my #zipped = ( #a, #b )[ map { $_, $_ + #a } ( 0 .. $#a ) ];
my #l1 = qw/1 2 3/;
my #l2 = qw/7 8 9/;
my #out;
push #out, shift #l1, shift #l2 while ( #l1 || #l2 );
If the lists are a different length, this will put 'undef' in the extra slots but you can easily remedy this if you don't wish to do this. Something like ( #l1[0] && shift #l1 ) would do it.
Hope this helps!
Algorithm::Loops is really nice if you do much of this kind of thing.
My own code:
sub zip { #_[map $_&1 ? $_>>1 : ($_>>1)+($#_>>1), 1..#_] }
This is totally not an elegant solution, nor is it the best solution by any stretch of the imagination. But it's fun!
package zip;
sub TIEARRAY {
my ($class, #self) = #_;
bless \#self, $class;
}
sub FETCH {
my ($self, $index) = #_;
$self->[$index % #$self][$index / #$self];
}
sub STORE {
my ($self, $index, $value) = #_;
$self->[$index % #$self][$index / #$self] = $value;
}
sub FETCHSIZE {
my ($self) = #_;
my $size = 0;
#$_ > $size and $size = #$_ for #$self;
$size * #$self;
}
sub CLEAR {
my ($self) = #_;
#$_ = () for #$self;
}
package main;
my #a = qw(a b c d e f g);
my #b = 1 .. 7;
tie my #c, zip => \#a, \#b;
print "#c\n"; # ==> a 1 b 2 c 3 d 4 e 5 f 6 g 7
How to handle STORESIZE/PUSH/POP/SHIFT/UNSHIFT/SPLICE is an exercise left to the reader.