how to factorize two lists - perl

is there a way to factorize some lists easily in perl?
For example with 2 lists ('a', 'b', 'c') and ('d', 'e', 'f')
I want the output ('ad', 'ae', 'af' .... 'ce', 'cf')
for now i'm doing
use strict;
use warnings;
my #listA = ('a', 'b', 'c');
my #listB = ('d', 'e', 'f');
my #listC = ();
foreach my $elementA (#listA)
{
foreach my $elementB (#listB)
{
push(#listC, $elementA.$elementB);
}
}
This works fine, but I would like to know if there is a more "perlish" way to do so?
thanks :)

You can use map to make it more Perlish.
my #list_a = qw( a b c );
my #list_b = qw( d e f );
my #list_c = map {
my $temporary = $_;
map { $temporary . $_ } #list_b
} #list_a;
This results in the same #list_c you had above.
We need the $temporary variable because both maps will set the topic $_, and the inner map would override the outer topic, so we have to save it in another lexical.
Note that I renamed your variables and used qw(), since you asked for Perlish. The common consensus on style in Perl is to use snake case variable names. Despite camel case being named after our camel, we don't use it for variables. Only for package names.

Solution for arbitrary number of lists:
use Algorithm::Loops qw( NestedLoops );
my #arrays = (
[qw( a b c )],
[qw( d e f )],
...
);
my #result;
NestedLoops(\#arrays, sub { push #result, join("", #_); });
or
my #result;
my $iter = NestedLoops(\#arrays);
while (my #comb = $iter->()) {
push #result, join("", #comb);
}

#listC = glob join '', map '{' . join(',', map quotemeta, #$_) . '}', \#listA, \#listB;

Other alternative CPAN modules are:
Set::Product
Set::CrossProduct
Set::CartesianProduct::Lazy

Related

How can I remove the title name of each unique number in Data::Dumper in perl?

I use Data::Dumper to catch uniqe number in each element.
#!perl
use warnings;
use strict;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
my #names = qw(A A A A B B B C D);
my %counts;
$counts{$_}++ for #names;
print Dumper (\%counts);
exit;
This is output.
$VAR1 = {
'A' => 4,
'B' => 3,
'C' => 1,
'D' => 1
};
How can I remove the title name of each unique number to get output like this format?
$VAR1 = { 4 ,3 ,1 ,1 }
Presuming you want the counts in descending order, you could use the following:
printf "\$VAR1 = { %s};\n",
join ',',
map "$_ ",
sort { $b <=> $a }
values(%counts);
If instead you want the counts sorted by key,
printf "\$VAR1 = { %s};\n",
join ',',
map "$counts{$_} ",
sort
keys(%counts);
Either way, that's a really weird format. Square brackets would make more sense than curly ones.
One of many ways to get desired result
use strict;
use warnings;
use feature 'say';
my #names = qw( A A A A B B B C D );
my %counts;
$counts{$_}++ for #names;
my #values = map { $counts{$_} } sort keys %counts;
say join(',', #values);
output
4,3,1,1

Perl to sort words by user-defined alphabet sequence

I have an array of "words" (strings), which consist of letters from an "alphabet" with user-defined sequence. E.g my "alphabet" starts with "ʔ ʕ b g d", so a list of "words" (bʔd ʔbg ʕʔb bʕd) after sort by_my_alphabet should be ʔbd ʕʔb bʔd bʕd.
sort by_my_alphabet (bʔd ʔbg ʕʔb bʕd) # gives ʔbd ʕʔb bʔd bʕd
Is there a way to make a simple subroutine by_my_alphabet with $a and $b to solve this problem?
Simple, and very fast because it doesn't use a compare callback, but it needs to scan the entire string:
use utf8;
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
my #sorted =
map { join '', #my_chr[ unpack 'W*', $_ ] } # "\x00\x01\x02\x03\x04" ⇒ "ʔʕbgd"
sort
map { pack 'W*', #my_ord{ split //, $_ } } # "ʔʕbgd" ⇒ "\x00\x01\x02\x03\x04"
#unsorted;
Optimized for long strings since it only scans a string up until a difference is found:
use utf8;
use List::Util qw( min );
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
sub my_cmp($$) {
for ( 0 .. ( min map length($_), #_ ) - 1 ) {
my $cmp = $my_ord{substr($_[0], $_, 1)} <=> $my_ord{substr($_[1], $_, 1)};
return $cmp if $cmp;
}
return length($_[0]) <=> length($_[1]);
}
my #sorted = sort my_cmp #unsorted;
Both should be faster than Sobrique's. Theirs uses a compare callback, and it scans the entire strings being compared.
Yes.
sort can take any function that returns a relative sort position. All you need is a function that correctly looks up the 'sort value' of a string for comparing.
So all you need to do here is define a 'relative weight' of your extra letters, and then compare the two.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my #sort_order = qw ( B C A D );
my #array_to_sort = qw ( A B C D A B C D AB BB CCC ABC );
my $count = 0;
my %position_of;
$position_of{$_} = $count++ for #sort_order;
print Dumper \%position_of;
sub sort_by_pos {
my #a = split //, $a;
my #b = split //, $b;
#iterate one letter at a time, using 'shift' to take it off the front
#of the array.
while ( #a and #b ) {
my $result = $position_of{shift #a} <=> $position_of{shift #b};
#result is 'true' if it's "-1" or "1" which indicates relative position.
# 0 is false, and that'll cause the next loop iteration to test the next
#letter-pair
return $result if $result;
}
#return a value based on remaining length - longest 'string' will sort last;
#That's so "AAA" comparing with "AA" comparison actually work,
return scalar #a <=> scalar #b;
}
my #new = sort { sort_by_pos } #array_to_sort;
print Dumper \#new;
Bit of a simple case, but it sorts our array into:
$VAR1 = [
'B',
'B',
'BB',
'C',
'C',
'CCC',
'A',
'A',
'AB',
'ABC',
'D',
'D'
];

Recursive sorting in Perl

I have a hash that contains keys that correspond to database subscripts, but the database can have multidimensional records so the key could be a single subscript, or a list of subscripts.
I need to find a way to sort these records so I can print them in a logical order.
Example:
my $data = {
'1,1,1' => 'data1',
'1,2' => 'data2',
'1,1,3' => 'stuff',
'2,1,1' => 'data3',
'2,1,2' => 'data4',
'2,1,3' => 'data blah',
'2,2,2' => 'datawk2n',
'3,1,2' => 'more',
};
# Should print the keys in the properly sorted order
print join "\n", sort some_function keys %$data;
sub some_function {
# Do some sorting magikz
}
I want it to sort by the leftmost subscript first. If the leftmost value is identical I want to move to the next value and compare those. If those are identical I want to continue to the next one ... and so on ... until all possibilities are exhausted.
This will most likely involve some recursion, but I can't figure out how to make recursion work with those fancy $a and $b variables.
What can I put in some_function to get the following output?
1,1,1
1,1,3
1,2
2,1,1
2,1,2
2,1,3
2,2,2
3,1,2
The following is the fastest solution (by far!):
my #sorted_keys =
map { join ',', unpack 'N*', $_ }
sort
map { pack 'N*', split /,/, $_ }
keys(%$data);
If you want something simpler, and still quite fast, you could use a "natural sort".
Sort::Key::Natural
use Sort::Key::Natural qw( natsort );
my #sorted_keys = natsort(keys(%$data));
Sort::Naturally
use Sort::Naturally qw( nsort );
my #sorted_keys = nsort(keys(%$data));
Benchmarks:
Rate SN SKN grt
SN 3769/s -- -40% -88%
SKN 6300/s 67% -- -79%
grt 30362/s 705% 382% --
Benchmark code:
use strict;
use warnings;
use Benchmark qw( cmpthese );
use List::Util qw( shuffle );
use Sort::Key::Natural qw( );
use Sort::Naturally qw( );
my #keys =
shuffle
split ' ',
'1 1,0 1,1 1,1,1 1,1,3 1,2 2,1,1 2,1,2 2,1,3 2,2,2 3,1,2 10,1,1';
sub grt {
my #sorted_keys =
map { join ',', unpack 'N*', $_ }
sort
map { pack 'N*', split /,/, $_ }
#keys;
}
sub SKN { my #sorted_keys = Sort::Key::Natural::natsort(#keys); }
sub SN { my #sorted_keys = Sort::Naturally::nsort(#keys); }
cmpthese(-3, {
grt => \&grt,
SKN => \&SKN,
SN => \&SN,
});
I thought the Sort::Naturally module would help you here, but it seems not
I must have had a bug in my test. This works fine
use Sort::Naturally 'nsort';
say for nsort keys %$data;
I recommend either this or the Sort::Key::Naturally solution as they are the clearest
It is bad practice to chase speed of execution, especially at the expense of readability, before there is evidence that a given solution is too slow. Even then it is foolish to randomly optimise chunks of your code in the hope of making a difference, and your solution should be run through a profiler to discover where it would be most fruitful to make enhancements
There is no need for recursion. This program shows a sort subroutine by_elements which simply compares each item in the list until it finds either a mismatch or the end of one of the lists
In the former case the result is just the comparison of the two differ elements, and in the latter it is a comparison of the number of elements in the two lists
use strict;
use warnings 'all';
use feature 'say';
my $data = {
'1,1,1' => 'data1',
'1,2' => 'data2',
'1,1,3' => 'stuff',
'2,1,1' => 'data3',
'2,1,2' => 'data4',
'2,1,3' => 'data blah',
'2,2,2' => 'datawk2n',
'3,1,2' => 'more',
'10,1,1' => 'odd',
'1,1' => 'simple',
'1,0' => 'simple0',
'1' => 'simpler',
};
say for sort by_elements keys %$data;
sub by_elements {
my ( $aa, $bb ) = map [/\d+/g], $a, $b;
for ( my $i = 0; $i < #$aa and $i < #$bb; ++$i ) {
my $cmp = $aa->[$i] <=> $bb->[$i];
return $cmp if $cmp;
}
return #$aa <=> #$bb;
}
output
1
1,0
1,1
1,1,1
1,1,3
1,2
2,1,1
2,1,2
2,1,3
2,2,2
3,1,2
10,1,1
Use natsort of Sort::Key::Natural:
#!/usr/bin/env perl
use strict;
use warnings;
use v5.10;
use Sort::Key::Natural qw(natsort);
my %data = (
'1,1,1' => 'data1',
'1,2' => 'data2',
'1,1,3' => 'stuff',
'2,1,1' => 'data3',
'2,1,2' => 'data4',
'2,1,3' => 'data blah',
'2,2,2' => 'datawk2n',
'10,1,2' => 'more',
);
say for natsort keys %data;
Outputs:
1,1,1
1,1,3
1,2
2,1,1
2,1,2
2,1,3
2,2,2
10,1,2
No need for recursion, just a loop that you can break out of.
sub some_function {
my #aa = split /,/, $a;
my #bb = split /,/, $b;
my $cmp = 0;
for (my $i=0; $i<#aa || $i<#bb; $i++) {
$cmp = $aa[$i] <=> $bb[$i];
last if $cmp;
}
$cmp;
}
But if your heart is set on a recursive solution, there's
sub aref_sort_recurse {
my ($c,$d) = #_;
#$c ? #$d ? shift #$c <=> shift #$d || aref_sort_recurse($c,$d) ? 1 : -#$d
}
sub some_function {
aref_sort_recurse( [split /,/, $a], [split /,/, $b] )
}

How to flatten a single nested hash key?

I have a data structure flattened by Hash::Flatten
For example,
flatten( { a => [ 'x', { b => 'y' } ] } )
produces
my $flat = {
'a:0' => 'x'
'a:1.b' => 'y',
};
I want to generate a flattened hash key from the a list of keys and indexes from a key Data::Diver's functions would accept.
For example,
my #key = ('a', 1, 'b');
should return
my $key = "a:1.b";
I have looked at Hash::Flatten, but it seems it can only flatten the whole hash, which is not what I am looking for. I just want to flatten a single (nested) key at a time.
To avoid replicating the escaping mechanism of Hash::Flatten, I tried the following:
use Data::Diver qw( DiveVal );
use Hash::Flatten qw( flatten );
my #key = ('a', 1, 'b');
DiveVal(my $h = {}, #key) = 1;
my ($key) = keys(%{ flatten($h) );
But that can just as easily return a:0 as a:1.b. Does anyone have any recommendations?
Only the key in which you are interested will have a defined value, so only a small change is needed.
use Data::Diver qw( DiveVal );
use Hash::Flatten qw( flatten );
sub flat_key {
DiveVal(my $h = {}, #_) = 1;
my $flat = flatten($h);
return ( grep $flat->{$_}, keys(%$flat) )[0];
}
my #key = ('a', 1, 'b');
my $key = flat_key(#key); # a:1.b
Because this uses Data::Diver, you can also use references to indicate that a number is really a hash key.
my #key = ('a', 1, 'b');
my $key = flat_key(map \$_, #key); # a.1.b
Alternatively, the escaping mechanism is well documented.
sub _flat_key_escape {
my ($s) = #_;
$s =~ s/([\\.:])/\\$1/g;
return $s;
}
sub flat_key {
my $key;
die("usage") if !#_;
for my $subkey (#_) {
if (ref($subkey)) { $key .= '.' . _flat_key_escape($$subkey); }
elsif ($subkey !~ /^-?[0-9]+\z/) { $key .= '.' . _flat_key_escape($subkey); }
else { $key .= ':' . _flat_key_escape($subkey); }
}
return substr($key, 1);
}
This is simple to do without reference to either Hash::Flatten or Data::Diver. The latter's DiveVal distinguishes between hash keys and array indices using the regex /^-?\d+$/, so we can do the same to discover whether a item in a sequence's Hash::Flatten default contraction should be preceded by a colon : (array index) or a dot . (hash key).
That gives the subroutine flatten_key below
use strict;
use warnings;
use 5.010;
my #key = ('a', 1, 'b');
my $key = flatten_key(#key);
say $key;
say flatten_key(qw/ a b c 1 2 3 /);
sub flatten_key {
join '', shift, map /^-?\d+$/ ? ":$_" : ".$_", #_;
}
output
a:1.b
a.b.c:1:2:3
Update
If you need to use the Data::Diver convention that any value passed as a scalar reference is a hash key, even if it looks like a number, then you can expand that subroutine like this. It's slightly more awkward because the first item in the sequence needs to be processed as well, but for some reason it doesn't take a delimiter character. So I've chosen to add a delimiter to all the items and then remove it from the first.
say flatten_key('a', 'b', \1, \2, 'c', 'd', 1, 2);
sub flatten_key {
my #key = map {
ref() ? ".$$_" :
/^-?\d+$/ ? ":$_" :
".$_"
} #_;
$key[0] =~ s/^[:.]//;
join '', #key;
}
output
a.b.1.2.c.d:1:2
Update
Also accounting for hash keys that themselves contain dots or colons:
say flatten_key(qw/ a .. :: b /);
sub flatten_key {
my #key = map {
(my $s = ref() ? $$_ : $_) =~ s/(?=[:.\\])/\\/g;
/^-?\d+$/ ? ":$s" : ".$s"
} #_;
$key[0] =~ s/^[:.]//;
join '', #key;
}
output
a.\.\..\:\:.b

Getting these loop errors in basic Perl

I have the following code in Perl. I am very new to the language:
#!/usr/bin/perl
use strict;
use warnings;
my $date = $ARGV[0];
my $symbols = ('A', 'B', 'C');
foreach $symbol (%symbols)
{
my $print = "$symbol";
print "$print";
}
Getting:
Useless use of a constant in void context at (line of %symbols)
and
Global symbol "$symbol requires explicit package name at ..."
and
Global symbol "%symbols" require explicit package. name at ..."
You are using an Hash when an Array is all that is needed.
#!/usr/bin/perl
use strict;
use warnings;
my $date = $ARGV[0];
my #symbols = ('A', 'B', 'C');
foreach my $symbol (#symbols)
{
print $symbol;
}
1) Your $symbols should be #symbols, since it's an array. Later in the foreach, %symbols should be #symbols.
2) The $symbol is not declared. Say foreach my $symbol... instead.
You are declaring $symbols instead of #symbols, so it is putting that in scalar context and setting it to 'C'. Then you try to loop through a hash with the same name, which you never created. Remember, $a (scalar), #a (array) and %a (hash) are all different.
This is what you wanted:
my #symbols = qw/ A B C /; ## the same as ( 'A', 'B', 'C' )
foreach my $symbol ( #symbols ) {
print $symbol;
}
Really quick:
my #symbols = qw/ A B C /; ## new array with three values
my $symbols = qw/ A B C /; ## new scalar that is the last element of the "A B C" list ($symbols = 'C')
my %symbols = (
A => 1,
B => 2,
C => 3,
); ## a hash with three key/value pairs
Your foreach is looking at each symbol in a non-existent hash called %symbols, not your array #symbols.
foreach $symbol (#symbols)
{
my $print = "$symbol";
print "$print";
}