Perl push item to subarray

I have 2 empty arrays that I add items to using a for loop.
@main = (
    "a 1 b 2 c 3 ",
    "d 4",
    "e 5 f 6 g 7 h 8",
    "i 9 j 10",
);
@arr1 = (); #only gets the letters
#supposed to look like:
#(
#[a,b,c]
#[d]
#[e,f,g,h]
#[i,j]
#)
@arr2 = (); #only gets the numbers
#supposed to look like:
#(
#[1,2,3]
#[4]
#[5,6,7,8]
#[9,10]
#)
for($i=0;@main;$i+=1){
    @line = split(/\s+/,shift(@main));
    push(@arr1,[]);
    push(@arr2,[]);
    while(@line){
        push(@arr1[$i],shift(@line));
        push(@arr2[$i],shift(@line));
    }
}
error:
Experimental push on scalar is now forbidden at index.pl line 29, near "))"
Experimental push on scalar is now forbidden at index.pl line 30, near "))"
It seems that @arr[$i] returns a reference to an array. How do I get this array and add items to it?

The first argument of push must be an array, not an array slice. I think you wanted @{ $arr1[$i] } (but that's not your only problem). Make sure that you're using use strict; use warnings;!
–ikegami
for(my $i=0;@main;$i+=1){
    my @line = split(/\s+/,shift(@main));
    push(@arr1,[]);
    push(@arr2,[]);
    while(@line){
        #     ↓↓           ↓
        push( @{ $arr1[$i] } ,shift(@line));
        push( @{ $arr2[$i] } ,shift(@line));
    }
}
I'm not exactly sure how it works; it probably turns the reference into an actual array, or something along those lines.
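To make explicit what `@{ $arr1[$i] }` does: `$arr1[$i]` holds an array reference, and wrapping it in `@{ ... }` dereferences it, which gives `push` the real array it requires. A minimal sketch follows; the name `@rows` and the sample values are illustrative and not from the question, and the postfix form `->@*` assumes Perl 5.24 or newer.

```perl
use strict;
use warnings;

my @rows;                       # illustrative array of array references
push @rows, [];                 # $rows[0] is now a reference to an empty array

push @{ $rows[0] }, 'a';        # circumfix dereference: works on any Perl 5
push $rows[0]->@*, 'b';         # postfix dereference: assumes Perl 5.24+

print "@{ $rows[0] }\n";            # prints "a b"
print scalar @{ $rows[0] }, "\n";   # prints 2
```

Both `push` lines add to the same inner array; only the spelling of the dereference differs.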
Another solution I came up with myself, which builds each subarray first and pushes it once it is finished, instead of adding values one at a time:
for($i=0;@main;$i+=1){
    @line = split(/\s+/,shift(@main));
    @subArr1 = ();
    @subArr2 = ();
    while(@line){
        push(@subArr1,shift(@line));
        push(@subArr2,shift(@line));
    }
    push(@arr1,[@subArr1]);
    push(@arr2,[@subArr2]);
}

As of perl 5.36, you can iterate over multiple elements of a list at once in a foreach loop, which lets you clean this up a bit:
#!/usr/bin/env perl
use 5.036;
# The next two lines aren't strictly needed because of the above, but
# I like to be explicit
use strict;
use warnings;
use experimental qw/for_list/;
use Data::Dumper;
my @main = (
    "a 1 b 2 c 3 ",
    "d 4",
    "e 5 f 6 g 7 h 8",
    "i 9 j 10",
);
my (@arr1, @arr2);
for my $line (@main) {
    my (@sub1, @sub2);
    for my ($name, $num) (split " ", $line) {
        push @sub1, $name;
        push @sub2, $num;
    }
    push @arr1, \@sub1;
    push @arr2, \@sub2;
}
print Dumper(\@arr1, \@arr2);
You can use the same approach in older versions using natatime from the List::MoreUtils module (Available from CPAN or your OS's package manager):
#!/usr/bin/env perl
use strict;
use warnings;
use List::MoreUtils qw/natatime/;
use Data::Dumper;
my @main = (
    "a 1 b 2 c 3 ",
    "d 4",
    "e 5 f 6 g 7 h 8",
    "i 9 j 10",
);
my (@arr1, @arr2);
for my $line (@main) {
    my (@sub1, @sub2);
    my $it = natatime 2, split(" ", $line);
    while (my ($name, $num) = $it->()) {
        push @sub1, $name;
        push @sub2, $num;
    }
    push @arr1, \@sub1;
    push @arr2, \@sub2;
}
print Dumper(\@arr1, \@arr2);
The thing these have in common is building up each subarray first, and then pushing references to them onto @arr1 and @arr2 at the end, removing the need for a lot of the arrayref dereferencing syntax. They also iterate directly over the elements you're storing in the subarrays instead of modifying the source directly with shift.
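One detail worth spelling out: pushing a reference to the subarray is safe here only because the subarray is declared with `my` inside the loop body, so each iteration gets a fresh array. A small sketch of the difference, with made-up names (`@shared`, `@fresh_refs` and the sample input are not from the answer):

```perl
use strict;
use warnings;

my @lines = ("a b", "c d");     # illustrative input

# Declared once outside the loop: every pushed reference points at the same array.
my (@shared, @shared_refs);
for my $line (@lines) {
    @shared = split ' ', $line;
    push @shared_refs, \@shared;
}

# Declared inside the loop: each iteration creates a new array.
my @fresh_refs;
for my $line (@lines) {
    my @fresh = split ' ', $line;
    push @fresh_refs, \@fresh;
}

print "shared: @{ $shared_refs[0] } | @{ $shared_refs[1] }\n";  # shared: c d | c d
print "fresh:  @{ $fresh_refs[0] } | @{ $fresh_refs[1] }\n";    # fresh:  a b | c d
```

When the array has to live outside the loop, pushing a copy with `[ @shared ]` avoids the aliasing.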

Please check whether the following demonstration code solves your problem.
The code is based on the comments in the OP's code describing what the result arrays should look like.
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my(@main,@arr1,@arr2);
@main = (
    "a 1 b 2 c 3 ",
    "d 4",
    "e 5 f 6 g 7 h 8",
    "i 9 j 10",
);
for ( @main ) {
    push(@arr1,[ /(\d+)/g ]);
    push(@arr2,[ /([a-z])/g ]);
}
say Dumper(\@arr1,\@arr2);
Output
$VAR1 = [
[
'1',
'2',
'3'
],
[
'4'
],
[
'5',
'6',
'7',
'8'
],
[
'9',
'10'
]
];
$VAR2 = [
[
'a',
'b',
'c'
],
[
'd'
],
[
'e',
'f',
'g',
'h'
],
[
'i',
'j'
]
];

Related

Making more of Perl Data::Dumper output

I have several nested data structures that refer to one another's elements. I would like to be able to check those references, so I am searching for something that will print the memory address of nested structures. An option for Data::Dumper would be fine.
Here are some examples of what I mean:
my @a = ( [1,2,3], [4,5,6] );
print \@a;
Will give you something like:
ARRAY(0x20071dc8)
When you run the same code through the debugger and examine the array with x \@a it will print this:
0 ARRAY(0x20070668)
0 ARRAY(0x2006c0d8)
0 1
1 2
2 3
1 ARRAY(0x2006c1b0)
0 4
1 5
2 6
But using Data::Dumper
print Dumper \@a;
It looks like this
$VAR1 = [
[
1,
2,
3
],
[
4,
5,
6
]
];
What I really want is a mixture of the Data::Dumper output and the details that the debugger provides. Perhaps this
$VAR1 = [ ARRAY(0x20070668)
[ ARRAY(0x2006c0d8)
1,
2,
3
],
[ ARRAY(0x2006c1b0)
4,
5,
6
]
];
Edit
Consider this code. The output doesn't explain that $b[1] is the same reference as in $a[0]
use Data::Dumper;
my @a = ( [1,2,3], [4,5,6] );
my @b = ( ["a","b","c"], $a[0] );
print Dumper \@b;
print $b[1], "\n";
print $a[0], "\n";
output
$VAR1 = [
[
'a',
'b',
'c'
],
[
1,
2,
3
]
];
ARRAY(0x2002bcc0)
ARRAY(0x2002bcc0)
Also, is this approach, where one structure references the contents of another, considered good programming practice? Maybe this is too general a question and depends heavily on the particular code, but I would like to know your opinion.
Data::Dumper will already tell if a reference is reused.
In the following example, the 2nd and 3rd elements of the AoA are identical. This is represented in the Dumper output:
use strict;
use warnings;
my @array1 = (1..3);
my @array2 = (4..6);
my @AoA = (\@array1, \@array2, \@array2);
use Data::Dumper;
print Dumper \@AoA;
Outputs:
$VAR1 = [
[
1,
2,
3
],
[
4,
5,
6
],
$VAR1->[1]
];
Responding to your edit
If you want to find the relation between two different data structures, just make a single call to Dumper with both data structures.
You can do this by passing them as a list, or as values in another anonymous data structure, like a hash or array:
use strict;
use warnings;
my @a = ([1,2,3], [4,5,6]);
my @b = (["a","b","c"], $a[0]);
use Data::Dumper;
print Dumper(\@a, \@b);
Outputs:
$VAR1 = [
[
1,
2,
3
],
[
4,
5,
6
]
];
$VAR2 = [
[
'a',
'b',
'c'
],
$VAR1->[0]
];
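If the goal is to test programmatically whether two slots hold the same reference, rather than to read it off a dump, `refaddr` from the core Scalar::Util module returns the address as a plain number. A minimal sketch using the arrays from the question (the variable names `$same` and `$other` are illustrative):

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my @a = ( [1, 2, 3], [4, 5, 6] );
my @b = ( ["a", "b", "c"], $a[0] );

# refaddr returns the numeric address that a reference points to
my $same  = refaddr($b[1]) == refaddr($a[0]);   # true: both refer to one array
my $other = refaddr($b[0]) == refaddr($a[0]);   # false: distinct arrays

printf "b[1] and a[0] share storage: %s\n", $same  ? "yes" : "no";
printf "b[0] and a[0] share storage: %s\n", $other ? "yes" : "no";

# Because the storage is shared, a change made through one is visible through the other
push @{ $b[1] }, 4;
print "a[0] is now: @{ $a[0] }\n";   # a[0] is now: 1 2 3 4
```

The last line also shows the practical consequence of sharing: modifying the array through `$b[1]` changes what `$a[0]` sees.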
I believe that far too little attention is paid to Data::Dump. It is written by Gisle Aas, the author of the remarkable LWP suite of modules.
It will help you in this case because there is a companion Data::Dump::Filtered module that allows you to supply a callback to dictate exactly how each item should be displayed in the dump.
This program takes the data in your question as an example. It employs a callback that adds the stringified version of the reference as a Perl comment before each array is displayed. The dump is very similar to your requirement, and as a bonus it is still valid Perl code that can be passed through eval if necessary.
Note that all dump output is sent to STDERR so I have called select STDERR to keep the print output in synch with the dumps.
use strict;
use warnings;
use Data::Dump::Filtered qw/ dump_filtered /;
my @a = ( [1,2,3], [4,5,6] );
my @b = ( [ qw/ a b c / ], $a[0] );
select STDERR;
dump_filtered(\@a, \&filter);
print "\n";
dump_filtered(\@b, \&filter);
print "\n";
print '$b[1] is ', $b[1], "\n";
print '$a[0] is ', $a[0], "\n";
sub filter {
    my ($thing, $ref) = @_;
    return { comment => "$ref" } if $thing->is_array;
}
output
# ARRAY(0x45179c)
[
# ARRAY(0xa2d36c)
[1, 2, 3],
# ARRAY(0x44adc4)
[4, 5, 6],
]
# ARRAY(0x4e6964)
[
# ARRAY(0xa2d534)
["a", "b", "c"],
# ARRAY(0xa2d36c)
[1, 2, 3],
]
$b[1] is ARRAY(0xa2d36c)
$a[0] is ARRAY(0xa2d36c)

Joining intervals in perl

First let me show an example:
I have one set of intervals like
[1,4],[5,15],[16,20]
and the other one like
[2,3],[6,14]
and I want it become one set like
[1,2],[3,4],[5,6],[7,15],[16,20]
I am not sure what this operation is called, though; forgive me if the title is misleading. Is there a CPAN module which I can use, or is it better to come up with my own solution? Is there a well-known general algorithm?
Using the pairs function from List::Util is a possible solution.
#!/usr/bin/perl
use strict;
use warnings;
use List::Util 'pairs';
my @a1 = ([1,4],[5,15],[16,20]);
my @a2 = ([2,3],[6,14]);
my @new = pairs sort {$a <=> $b} map {@$_} @a1, @a2;
use Data::Dumper; print Dumper \@new;
This prints
$VAR1 = [
[
1,
2
],
[
3,
4
],
[
5,
6
],
[
14,
15
],
[
16,
20
]
];
Step by step approach:
#!/usr/bin/perl
use Data::Dumper;
my @set1 = ([1,4],[5,15],[16,20]);
my @set2 = ([2,3],[6,14]);
# Make the tuples into an unsorted list
my @nums = ();
foreach my $tuple (@set1,@set2) {
    foreach my $num (@{$tuple}) {
        push @nums, $num;
    }
}
# Sort the list
my @sorted = sort {$a <=> $b} @nums;
print "@sorted\n";
# Retuple
my @finalset = ();
while(my @tuple = splice(@sorted,0,2)) {
    push @finalset, \@tuple;
}
print Dumper(\@finalset);

Making the sort stable in Perl

I have an array of refs. Something like
$a[0] = [qw( 1 2 3 4 )];
$a[1] = [qw( a b c d )];
The 1,2,3,4 are actually website breadcrumbs which are used for navigation (Home, Profile, Contact-us, Contact-me-specifically).
Now, I have to sort this ladder (and, sadly, using the stable sort in perl 5.8 is not an option).
The sorting criteria is
The depth of the ladder
If two ladders have same depth, then sort them depending on their index.
For example, if the array originally contains
$a[0] = [qw( 1 2 3 4 )];
$a[1] = [qw( 1 2 3 )];
Then after the sort, the array should contain
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( 1 2 3 4 )];
But if the arrays are something like:
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( a b c )];
Then after the sort,
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( a b c )];
I can't get it to work the way I tried.
my @sorted_array = sort { @$b <=> @$a || $a <=> $b } @a;
Can someone help me with this?
The description of your data structure (linked list), and the implementation in your sort routine (arrayrefs) do not quite fit together; I will assume the latter.
A non-stable sort can be made stable by sorting by the position as a secondary criterion:
sort { normally or by_index } @stuff
Normally, you seem to want to compare the array length. To be able to test for the index, you have to somehow make the index of the current element available. You can do so by two means:
Do the Schwartzian Transform, and annotate each element with its index. This is silly.
Sort the indices, not the elements.
This would look like:
my @sorted_indices =
    sort { @{ $array[$b] } <=> @{ $array[$a] } or $a <=> $b } 0 .. $#array;
my @sorted = @array[@sorted_indices]; # do a slice
What you were previously doing with $a <=> $b was comparing references. This is not guaranteed to do anything meaningful.
Test of that sort:
use Test::More;
my @array = (
    [qw/1 2 3/],
    [qw/a b c/],
    [qw/foo bar baz qux/],
);
my @expected = (
    [qw/foo bar baz qux/],
    [qw/1 2 3/],
    [qw/a b c/],
);
...; # above code
is_deeply \@sorted, \@expected;
done_testing;
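For comparison, the first option listed above (annotating each element with its index, Schwartzian-Transform style) could look roughly like this. It is a sketch only, reusing the @array from the test and the same longest-first order as the index sort:

```perl
use strict;
use warnings;

my @array = (
    [qw/1 2 3/],
    [qw/a b c/],
    [qw/foo bar baz qux/],
);

my @sorted =
    map  { $_->[1] }                              # 3. unwrap the original element
    sort { @{ $b->[1] } <=> @{ $a->[1] }          # 2. longer arrays first...
           or $a->[0] <=> $b->[0] }               #    ...ties broken by original index
    map  { [ $_, $array[$_] ] } 0 .. $#array;     # 1. pair each element with its index

print join(" ", @$_), "\n" for @sorted;
```

It produces the same ordering as sorting the indices, at the cost of building a temporary pair for every element, which is why sorting the indices directly is the simpler choice here.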
Your code doesn't work because you expect $a and $b to contain the element's value in one place (@$b <=> @$a) and the element's index in another ($a <=> $b).
You need the indexes in your comparison, so your comparison function is going to need the indexes.
By passing the indexes of the array to sort, you have access to both the indexes and the values at those indexes, so your code is going to include
sort { ... } 0..$#array;
After we're finished sorting, we want to retrieve the elements for those indexes. For that, we can use
my @sorted = map $array[$_], @sorted_indexes;
or
my @sorted = @array[ @sorted_indexes ];
All together, we get:
my @sorted =
    map $array[$_],
    sort { @{ $array[$a] } <=> @{ $array[$b] } || $a <=> $b }
    0..$#array;
or
my @sorted = @array[
    sort { @{ $array[$a] } <=> @{ $array[$b] } || $a <=> $b }
    0..$#array
];
I think we need to clear up your sorting algorithm. You said:
The depth of the ladder
Sort them depending on their index.
Here's an example:
$array[0] = [ qw(1 a b c d e) ];
$array[2] = [ qw(1 2 b c d e) ];
$array[3] = [ qw(a b c) ];
$array[4] = [ qw(a b c d e) ];
You want them sorted this way:
$array[3] = [ qw(a b c) ];
$array[2] = [ qw(1 2 b c d e) ];
$array[0] = [ qw(1 a b c d e) ];
$array[4] = [ qw(a b c d e) ];
Is that correct?
What about this?
$array[0] = [ qw(100 21 15 32) ];
$array[1] = [ qw(32 14 32 20) ];
Sorting by numeric, $array[1] should be before $array[0], but sorting by string, $array[0] is before $array[1].
Also, you notice that I cannot tell whether $array[0] should be before or after $array[1] until I look at the second element of the array.
This makes sorting very difficult to do in a single-line function. Even if you can somehow reduce it, it'll be very difficult for someone to analyze what you are doing, or for you to debug the statement.
Fortunately, you can use an entire subroutine as a sort routine:
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
my @array;
$array[0] = [ qw(1 2 3 4 5 6) ];
$array[1] = [ qw(1 2 3) ];
$array[2] = [ qw(a b c d e f) ];
$array[3] = [ qw(0 1 2) ];
my @sorted_array = sort sort_array @array;
say Dumper \@sorted_array;
sub sort_array {
    #my $a = shift; #Array reference to an element in @array
    #my $b = shift; #Array reference to an element in @array
    my @a_array = @{ $a };
    my @b_array = @{ $b };
    #
    #First sort on length of arrays
    #
    if ( scalar @a_array != scalar @b_array ) {
        return scalar @a_array <=> scalar @b_array;
    }
    #
    # Arrays are the same length. Sort on first element in array that differs
    #
    for my $index (0..$#a_array ) {
        if ( $a_array[$index] ne $b_array[$index] ) {
            return $a_array[$index] cmp $b_array[$index];
        }
    }
    #
    # Both arrays are equal in size and content
    #
    return 0;
}
This returns:
$VAR1 = [
[
'0',
'1',
'2'
],
[
'1',
'2',
'3'
],
[
'1',
'2',
'3',
'4',
'5',
'6'
],
[
'a',
'b',
'c',
'd',
'e',
'f'
]
];

how to compare elements of 2 arrays in row

I have 2 arrays (@curNodes and @oldNodes); the elements of each array are in rows.
For example:
Output of print @curNodes    Output of print @oldNodes
US                           London
UK                           US
Now I want to compare each element of @curNodes with @oldNodes.
E.g. first it will check for "US" in @oldNodes; if it is there, do nothing,
otherwise take some other action.
Could you please help me and let me know how this comparison can be done
when the elements are in rows?
For an approach not requiring any external modules, how about making the first array into a hash and then iterating through the second array? See below.
use v5.012;
use warnings;
my @old_nodes = qw/ a b c d /;
my %old = map {; $_ => 1 } @old_nodes;
my @cur_nodes = qw/ a d /;
foreach (@cur_nodes) {
    if ($old{$_}) {
        say "$_ exists in old_nodes";
    }
}
You can use Array::Diff module for this.
You can do it using the smart match operator (~~).
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my @curNodes = qw' US UK ';
my @oldNodes = qw' London US ';
my $flag;
foreach my $item (@curNodes) {
    $flag = $item ~~ @oldNodes ? 0 : 1;
    last if !$flag; #perform some action
}
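Smartmatch has been flagged as experimental since Perl 5.18, so newer code may prefer to avoid it. A hedged alternative uses `any` from the core List::Util module (available from List::Util 1.33) with the node lists from the question. The `@missing` array and the printed messages are placeholders for whatever action is actually needed:

```perl
use strict;
use warnings;
use List::Util qw(any);

my @curNodes = qw( US UK );
my @oldNodes = qw( London US );

my @missing;    # current nodes that are absent from @oldNodes
for my $item (@curNodes) {
    if ( any { $_ eq $item } @oldNodes ) {
        print "$item found in old nodes, nothing to do\n";
    }
    else {
        print "$item is new\n";      # placeholder for the "other action"
        push @missing, $item;
    }
}
```

For long lists, building a lookup hash once is likely cheaper than scanning @oldNodes for every item.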
We don't really have multidimensional arrays in Perl.
But there are certainly arrays of array references.
Assuming that you're running Perl 5.10 or newer, I think the smartmatch operator makes sense.
Based on your description, here's what I came up with:
#!/usr/bin/perl -Tw
use 5.010;
use strict;
use warnings;
use Data::Dumper;
my @curNodes = (
    [ 'US', 'UK' ],
);
my @oldNodes = (
    [ 'London', 'US' ],
);
my @matchedElements = grep { $_ ~~ @{ $oldNodes[0] } } @{ $curNodes[0] };
say Dumper( \@curNodes );
say Dumper( \@oldNodes );
say Dumper( \@matchedElements );
This emits:
$VAR1 = [
[
'US',
'UK'
]
];
$VAR1 = [
[
'London',
'US'
]
];
$VAR1 = [
'US'
];
I imagine that you'd want to iterate over @matchedElements in your program.

How can I partition a Perl array into equal sized chunks?

I have a fixed-size array whose size is always a multiple of 3.
my @array = ('foo', 'bar', 'qux', 'foo1', 'bar', 'qux2', 3, 4, 5);
How can I cluster the members of the array so that we get
an array of arrays, grouped by 3:
$VAR = [ ['foo','bar','qux'],
['foo1','bar','qux2'],
[3, 4, 5] ];
my @VAR;
push @VAR, [ splice @array, 0, 3 ] while @array;
or you could use natatime from List::MoreUtils
use List::MoreUtils qw(natatime);
my #VAR;
{
    my $iter = natatime 3, @array;
    while( my @tmp = $iter->() ){
        push @VAR, \@tmp;
    }
}
I really like List::MoreUtils and use it frequently. However, I have never liked the natatime function. It doesn't produce output that can be used with a for loop or map or grep.
I like to chain map/grep/apply operations in my code. Once you understand how these functions work, they can be very expressive and very powerful.
But it is easy to make a function to work like natatime that returns a list of array refs.
use Carp qw(croak);
sub group_by ($@) {
    my $n = shift;
    my @array = @_;
    croak "group_by count argument must be a non-zero positive integer"
        unless $n > 0 and int($n) == $n;
    my @groups;
    push @groups, [ splice @array, 0, $n ] while @array;
    return @groups;
}
Now you can do things like this:
my @grouped = map [ reverse @$_ ],
    group_by 3, @array;
** Update re Chris Lutz's suggestions **
Chris, I can see merit in your suggested addition of a code ref to the interface. That way a map-like behavior is built in.
# equivalent to my map/group_by above
group_by { [ reverse @_ ] } 3, @array;
This is nice and concise. But to keep the nice {} code ref semantics, we have put the count argument 3 in a hard to see spot.
I think I like things better as I wrote it originally.
A chained map isn't that much more verbose than what we get with the extended API.
With the original approach a grep or other similar function can be used without having to reimplement it.
For example, if the code ref is added to the API, then you have to do:
my @result = group_by { $_[0] =~ /foo/ ? [@_] : () } 3, @array;
to get the equivalent of:
my @result = grep $_->[0] =~ /foo/,
    group_by 3, @array;
Since I suggested this for the sake of easy chaining, I like the original better.
Of course, it would be easy to allow either form:
use Carp qw(croak);
use Scalar::Util qw(reftype);
sub _copy_to_ref { [ @_ ] }
sub group_by ($@) {
    my $code = \&_copy_to_ref;
    my $n = shift;
    if( reftype $n eq 'CODE' ) {
        $code = $n;
        $n = shift;
    }
    my @array = @_;
    croak "group_by count argument must be a non-zero positive integer"
        unless $n > 0 and int($n) == $n;
    my @groups;
    push @groups, $code->(splice @array, 0, $n) while @array;
    return @groups;
}
Now either form should work (untested). I'm not sure whether I like the original API, or this one with the built in map capabilities better.
Thoughts anyone?
** Updated again **
Chris is correct to point out that the optional code ref version would force users to do:
group_by sub { foo }, 3, @array;
Which is not so nice, and violates expectations. Since there is no way to have a flexible prototype (that I know of), that puts the kibosh on the extended API, and I'd stick with the original.
On a side note, I started with an anonymous sub in the alternate API, but I changed it to a named sub because I was subtly bothered by how the code looked. No real good reason, just an intuitive reaction. I don't know if it matters either way.
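For reference, the block-first call syntax discussed above depends on a `&` prototype, which has to come first in the prototype and makes the code block mandatory. A sketch under that assumption; the name `group_map` is hypothetical and not part of any module:

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: the (&$@) prototype allows `group_map { ... } 3, @list`
sub group_map (&$@) {
    my ($code, $n, @items) = @_;
    croak "group size must be a positive integer"
        unless $n > 0 and int($n) == $n;

    my @groups;
    # Call the block once per chunk, passing the chunk's elements in @_
    push @groups, $code->( splice @items, 0, $n ) while @items;
    return @groups;
}

my @array   = ('foo', 'bar', 'qux', 'foo1', 'bar', 'qux2', 3, 4, 5);
my @grouped = group_map { [ reverse @_ ] } 3, @array;

print join(" ", @$_), "\n" for @grouped;
```

Because the block is required with this prototype, a plain `group_map 3, @array` call no longer compiles, which is the trade-off the answer describes.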
Or this:
my $VAR;
while( my @list = splice( @array, 0, 3 ) ) {
    push @$VAR, \@list;
}
Another answer (a variation on Tore's, using splice but avoiding the while loop in favor of the more Perl-y map):
my $result = [ map { [splice(@array, 0, 3)] } (1 .. int((scalar(@array) + 2) / 3)) ];
Try this:
$VAR = [map $_ % 3 == 0 ? ([ $array[$_], $array[$_ + 1], $array[$_ + 2] ])
: (),
0..$#array];
Another generic solution, non-destructive to the original array:
use Data::Dumper;
sub partition {
    my ($arr, $N) = @_;
    my @res;
    my $i = 0;
    while ($i + $N-1 <= $#$arr) {
        push @res, [@$arr[$i .. $i+$N-1]];
        $i += $N;
    }
    if ($i <= $#$arr) {
        push @res, [@$arr[$i .. $#$arr]];
    }
    return \@res;
}
print Dumper partition(
['foo', 'bar', 'qux', 'foo1', 'bar', 'qux2', 3, 4, 5],
3
);
The output:
$VAR1 = [
[
'foo',
'bar',
'qux'
],
[
'foo1',
'bar',
'qux2'
],
[
3,
4,
5
]
];
As a learning experience I decided to do this in Perl6.
The first, perhaps simplest, way I tried was to use map.
my @output := @array.map: -> $a, $b?, $c? { [ $a, $b // Nil, $c // Nil ] };
.say for @output;
foo bar qux
foo1 bar qux2
3 4 5
That didn't seem very scalable. What if I wanted to take the items from the list 10 at a time? That would get very annoying to write. ... Hmmm, I did just mention "take", and there is a keyword named take; let's try that in a subroutine to make it more generally useful.
sub at-a-time ( Iterable \sequence, Int $n where $_ > 0 = 1 ){
    my $is-lazy = sequence.is-lazy;
    my \iterator = sequence.iterator;
    # gather is used with take
    gather loop {
        my Mu @current;
        my \result = iterator.push-exactly(@current,$n);
        # put it into the sequence, and yield
        take @current.List;
        last if result =:= IterationEnd;
    }.lazy-if($is-lazy)
}
For kicks let's try it against an infinite list of the fibonacci sequence
my $fib = (1, 1, *+* ... *);
my @output = at-a-time( $fib, 3 );
.say for @output[^5]; # just print out the first 5
(1 1 2)
(3 5 8)
(13 21 34)
(55 89 144)
(233 377 610)
Notice that I used $fib instead of @fib. It was to prevent Perl6 from caching the elements of the Fibonacci sequence.
It might be a good idea to put it into a subroutine to create a new sequence every time you need one, so that the values can get garbage collected when you are done with them.
I also used .is-lazy and .lazy-if to mark the output sequence lazy if the input sequence is. Since it was going into an array @output, it would otherwise have tried to generate all of the elements from an infinite list before continuing onto the next line.
Wait a minute, I just remembered .rotor.
my @output = $fib.rotor(3);
.say for @output[^5]; # just print out the first 5
(1 1 2)
(3 5 8)
(13 21 34)
(55 89 144)
(233 377 610)
.rotor is actually far more powerful than I've demonstrated.
If you want it to return a partial match at the end you will need to add a :partial to the arguments of .rotor.
Use the spart function from the List::NSect package on CPAN.
perl -e '
use List::NSect qw{spart};
use Data::Dumper qw{Dumper};
my @array = ("foo", "bar", "qux", "foo1", "bar", "qux2", 3, 4, 5);
my $var = spart(3, @array);
print Dumper $var;
'
$VAR1 = [
[
'foo',
'bar',
'qux'
],
[
'foo1',
'bar',
'qux2'
],
[
3,
4,
5
]
];
Below is a more generic solution to the problem, which distributes the items over $n groups:
my @array = ('foo', 'bar', 1, 2);
my $n = 3;
my @VAR = map { [] } 1..$n;
my @idx = sort { $a <=> $b } map { $_ % $n } 0..$#array;
for my $i ( 0..$#array ){
    push @{ $VAR[ $idx[ $i ] ] }, $array[ $i ];
}
This also works when the number of items in the array is not a multiple of 3.
In the above example, the other solutions with e.g. splice would produce two arrays of length 2 and one of length 0.