Learn the order of elements - perl

I want to find an efficient way (in Perl preferably) to learn the fixed order of a family of words by comparing their order in multiple subsets of the group.
(They are job parameters. There are about 30 different ones. Different jobs need different combinations of parameters & there are only ever a few parameters in each job)
For example, given:
first
second
third
sixth
seventh
tenth
first
third
fourth
fifth
sixth
third
fifth
seventh
eighth
ninth
tenth
It should be able to remember the relative order relationships it sees to work out that the order is:
first
second
third
fourth
fifth
sixth
seventh
eighth
ninth
tenth
I have generated lists like:
first.second.third.sixth.seventh.tenth
first.third.fourth.fifth.sixth
third.fifth.seventh.eighth.ninth.tenth
then sorted uniquely + alphabetically and visually compared them, but I have hundreds of different combinations of the 30ish parameters, so it will be a big job to sort through them all and put them together manually.
I think #daniel-tran has answered the "how" in https://stackoverflow.com/a/48041943/224625 and using that and some hackery like:
$order->{$prev}->{$this} = 1;
$order->{$this}->{$prev} = 0;
I've managed to populate a hash of hashes with a 1 or a 0 for each pair of consecutive parameters to say which comes first, like:
$VAR1 = {
'first' => {
'second' => 1,
'third' => 1,
},
'second' => {
'first' => 0,
'third' => 1,
},
'third' => {
'first' => 0,
'second' => 0,
'fourth' => 1,
'fifth' => 1,
'sixth' => 1,
},
'fourth' => {
'third' => 0,
'fifth' => 1,
},
...
but I hit the wall trying to work out what to do in my sort function when it's asked to sort a pair that have never been seen as immediate neighbours, thus don't have a relationship defined.
Is there an easy solution?
Am I going about this the right way?
Is there a better WTDI in the first place?
Thanks,
John

The question you linked to includes another answer using a graph and topological sort. The Graph module is pretty easy to use:
use warnings;
use strict;
use Graph;
my $graph = Graph->new(directed => 1);
my $prev;
while (<DATA>) {
chomp;
$graph->add_edge($prev, $_) if length && length $prev;
$prev = $_;
}
print $_,"\n" for $graph->topological_sort;
__DATA__
first
second
third
sixth
seventh
tenth
first
third
fourth
fifth
sixth
third
fifth
seventh
eighth
ninth
tenth
Output:
first
second
third
fourth
fifth
sixth
seventh
eighth
ninth
tenth

I tried to implement a naive solution myself. I built the %order hash where the values of each key were the elements that followed it. I then created a transitive closure of this structure (i.e. if first was before second and second was before third, then first must be before third). If there was enough information, each key would have a different number of values, and sorting the elements by the number of the values would give the ordered list.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my #partial = (
[qw[ first second third sixth seventh tenth ]],
[qw[ first third fourth fifth sixth ]],
[qw[ third fifth seventh eighth ninth tenth ]]);
my %order;
my %all;
for my $list (#partial) {
undef #all{ #$list };
undef $order{ $list->[ $_ - 1 ] }{ $list->[$_] }
for 1 .. $#$list;
}
my $changed = 1;
while ($changed) {
undef $changed;
for my $from (keys %order) {
if (my #to = keys %{ $order{$from} }) {
if (my #to2 = map keys %{ $order{$_} }, #to) {
my $before = keys %{ $order{$from} };
undef #{ $order{$from} }{#to2};
$changed = 1 if $before != keys %{ $order{$from} };
}
}
}
}
my %key_counts;
$key_counts{ keys %{ $order{$_} } }++ for keys %order;
warn "Not enough information\n"
if keys %key_counts != keys %order;
say join ' ',
sort { keys %{ $order{$b} } <=> keys %{ $order{$a} } }
keys %order;
output
first second third fourth fifth sixth seventh eighth ninth tenth

This is a direct and simple-minded manual solution.
It collects all elements in given sub-sequences and sorts them. The sorting criterion is the position (index) of compared elements in the first sub-sequence that has both. If none of the sub-sequences have both elements an undecided (zero) is returned from the sort's block.
use warnings;
use strict;
use feature 'say';
use List::MoreUtils qw(uniq firstval);
my #all = qw(ant bug frog cat dog elk); # to draw input (sublists) from
my #s1 = #all[0,1,3,5];
my #s2 = #all[1,2,4,5];
my #s3 = #all[2,3,4];
my #inv = ( # for index comparison
{ map { $s1[$_] => $_ } 0..$#s1 },
{ map { $s2[$_] => $_ } 0..$#s2 },
{ map { $s3[$_] => $_ } 0..$#s3 }
);
my #sorted = sort {
my $fv = firstval { exists $_->{$a} and exists $_->{$b} } #inv;
($fv) ? $fv->{$a} <=> $fv->{$b} : 0;
} uniq #s1, #s2, #s3;
say "#sorted";
The complexity of this isn't as good as it can be since some of the comparisons can (in principle) be avoided but that doesn't show for smaller problems. It prints the correct sequence, and for the posted problem as well (replace #sN arrays with the ones provided in the question).
This code assumes consistent and complete enough subsequences.
For an arbitrary number of subsets of the full list (3 above) the ancillary #inv is built as
my #subseqs = (\#s1, \#s2, \#s3);
my #inv;
for my $rr (#subseqs) {
push #inv, { map { $rr->[$_] => $_ } 0..$#$rr }
}

Related

Find nearest option in a Perl Hash

I have a hashref that has data tied to days of the calendar year, for example:
my $calendarEntries = { '1' => 'Entry 1', '5' => 'Entry 2', '15' => 'Entry 3' };
I can obtain the day of the year using DateTime:
state $moduleDateTime = require DateTime;
my $dt = DateTime->now('time_zone' => 'America/Chicago');
my $dayOfTheYear = $dt->strftime('%j');
However, I'm trying to figure out the most efficient way to handle situations where the current day does not match any of the days in the hash. I'd like to always "round down" in those situations. E.g. today (which is the 7th day of the year), I'd like to load the entry with the key '5', since it is the most "recent" entry.
Is there a way to select a key in a hashref that is the closest candidate for being <= $dayOfTheYear? If I were using DBD, I could do a query like this:
'SELECT entry WHERE `key` <= ' . $dayOfTheYear . ' ORDER BY `key` DESC LIMIT 1'
But, I'd rather avoid needing to create a database and call it, if I can do something natively in Perl.
One way, expecting many searches
use List::MoreUtils qw(last_value);
my #entries = sort { $a <=> $b } keys %$calendarEntries;
my $nearest_le = last_value { $day >= $_ } #entries;
This returns the last element that is less or equal, for any input, so the key of interest.
The drawback of using simply a hash is that one needs an extra data structure to build. Any library that offers this sort of lookup must do that as well, of course, but those then come with other goodies and may be considerably better performing (depending on how often this is done).
If this 'rounding' need be done a lot for a given hash then it makes sense to build a lookup table for days, associating each with its nearest key in the hash.† ‡
If #entries is sorted descending ($b <=> $a) then the core List::Util::first does it.
† For example
my %nearest_le;
my #keys = sort { $a <=> $b } keys %$calendarEntries;
for my $day (1..366) {
for my $k (#keys) {
if ($k <= $day) {
$nearest_le{$day} = $k;
}
else { last }
}
};
This enumerates days of the year, as specified in the question.
‡ If this were needed for things other than the days (366 at most), where long lists may be expected, a better algorithmic behavior is afforded by binary searches on sorted lists (O(log n)).
The library used above, List::MoreUtils, also has lower_bound with O(log n)
Returns the index of the first element in LIST which does not compare less than val.
So this needs a few adjustments, for
use List::MoreUtils qw(lower_bound);
my #keys = sort { $a <=> $b } keys %$calendarEntries;
my $nearest_le = exists $calendarEntries->{$day}
? $day
: $keys[ -1 + lower_bound { $_ <=> $day } #keys ];
A nice simple solution.
use List::Util qw( max );
max grep { $_ <= $dayOfTheYear } keys %$calendarEntries
Notes:
Best to make sure $calendarEntries->{ $dayOfTheYear } doesn't exist first.
You'll need to handle the case where there is no matching key.
It's faster than sorting unless you perform many searches. But even then, we're only dealing with at most 365 keys, so simplicity is key here.
The simplest solution is to simply look up the value for your date, and if it is not found, go down until you find a value. In this sample, I included a rudimentary error handling.
use strict;
use warnings;
use feature 'say';
my $calendarEntries = { '1' => 'Entry 1', '5' => 'Entry 2', '15' => 'Entry 3' };
my $find = shift // 7; # for testing purposes
my $date = get_nearest_below($calendarEntries, $find);
if (defined $date) {
say "Nearest date below to '$find' is '$date'";
} else { # error handling
warn "Nearest date below not found for '$find'";
}
sub get_nearest_below {
my ($href, $n) = #_;
while ($n > 0) { # valid dates are > 0
return $n if defined $href->{$n}; # find a defined value
$n--; # or go to the next key below
}
return undef; # or return error if nothing is found before 0
}
Output:
$ foo.pl
Nearest date below to '7' is '5'
$ foo.pl 12
Nearest date below to '12' is '5'
$ foo.pl 123
Nearest date below to '123' is '15'
$ foo.pl 0
Nearest date below not found for '0' at foo.pl line 13.

How to alter the hash values with new values in perl?

How can i reset the hash values in perl
use warnings;
use strict;
my %hash = qw(one 1 two 2 three 3 four 4);
my #key = keys(%hash);
my #avz = (9..12);
my %vzm;
print "Original hash and keys : ",%hash,"\n";
for(my $i = 0; $i<=scalar #avz; $i++){
my #new = "$key[$i] $avz[$i] ";
push(%vzm , #new);
}
print "modified hash and keys",%vzm,"\n";
I tried to alter the keys of original hash with another keys. How can i do it
This program give the error is:
Original hash and keys : three3one1two2four4
Not an ARRAY reference at key.pl line 10.
I expect the output is
Original hash and keys : three3one1two2four4
modified hash and keys : three11one9two10four12
How can i do it
Ok, first off - you're doing something nasty in your code:
You're trying to take an ordered data structure - an array - and push it into a keyed data structure, which has no particular ordering defined.
This isn't going to work very well - it technically works, because internally perl treats arrays and hashes similarly.
But for example your first assignment - what you're actually getting is:
my %hash = (
one => 1,
two => 2,
three => 3,
four => 4
);
You can access the keys (in no particular order) via keys(). And the values via values(). But to try and treat it like an array is undefined behaviour.
To add elements to your array:
$hash{'nine'} = 9;
To delete elements from your array:
delete ( $hash{'one'} );
You can iterate on keys or values - and combined with sort even do them in some sort of order. (Just bear in mind for sorting alphanumeric numbers you'll have a custom sort job).
foreach my $key ( sort keys %hash ) {
print "$key => $hash{$key}\n";
}
(Note - this is sorting by alphanumeric string, so gives:
four => 4
one => 1
three => 3
two => 2
If you want to sort by value:
foreach my $key ( sort { $hash{$a} <=> $hash{$b} } keys %hash ) {
print "$key => $hash{$key}\n";
}
And so you'll get:
one => 1
two => 2
three => 3
four => 4
So the real question remains - what are you actually trying to accomplish? The point of a hash is to give you an unordered mini-database of key-value pairs. Treating one like an array doesn't make an awful lot of sense. Either you're iterating hash elements in arbitrary order, or you're applying a specific sort to it - but one where you're relying on getting elements in a particular order is a bad plan - it may work, but it's not guaranteed to work, and that makes for bad code.
You have to keep the order of the keys in some array, or take it from original list
my #tmp = qw(one 1 two 2 three 3 four 4);
my %hash = #tmp;
# 'one', 'two', ..
my #key = #tmp[ grep !($_%2), 0 .. $#tmp ];
# ..
for my $i (0 .. $#avz) {
$vzm{ $key[$i] } = $avz[$i];
}
or using hash slice as more perlish approach,
#vzm{ #key } = #avz;
You can't do what you want (replace the values for keys in the hash in the order they originally were added) without keeping track of that order separately, since the hash doesn't have any particular order. In other words, this:
my #key = keys(%hash);
needs to be this:
my #key = ( 'one', 'two', 'three', 'four' );
Once you have that, you can just assign the values all at once with a hash slice:
my %vzm;
#vzm{#key} = #avz;
To create a hash element, you use assignment to $var{$key}.
for (my $i = 0; $i < scalar #avz; $i++) {
$vzm{$key[$i]} = $avz[$i];
}
Note also that the loop condition should be <, not <=. List/array indexes end at scalar #avz - 1.

Find duplicates in a hash, store grouped in a new hash

I have the following hash, and I need to find the duplicates between the top most hash values 6 and 4. I've tried a few solutions to no avail, and am not too familiar with Perl syntax to make it work.
The Hash I Have
$VAR1 = {
'6' => [ '1000', '2000', '4000' ],
'4' => [ '1000', '2000', '3000' ]
};
The Hash I Need
$VAR1 = {
'6' => ['4000'],
'4' => ['3000'],
'Both' => ['1000','2000']
}
Find all common elements, e.g. by deduplicating with a hash.
Find all elements that are not common.
Given two arrays #x, #y, this would mean:
use List::MoreUtils 'uniq';
# find all common elements
my %common;
$common{$_}++ for uniq(#x), uniq(#y); # count all elements
$common{$_} == 2 or delete $common{$_} for keys %common;
# remove entries from #x, #y that are common:
#x = grep { not $common{$_} } #x;
#y = grep { not $common{$_} } #y;
# Put the common strings in an array:
my #common = keys %common;
Now all that is left is to do a bit of dereferencing and such, but that should be fairly trivial.
No need for other modules. perl hashes are really good for finding uniq or common values
my %both;
# count the number of times any element was seen in 4 and 6
$both{$_}++ for (#{$VAR1->{4}}, #{$VAR1->{6}});
for (keys %both) {
# if the count is one the element isn't in both 4 and 6
delete $both{$_} if( $both{$_} == 1 );
}
$VAR1->{Both} = [keys %both];

Getting fixed width columnar output using 'printf' in Perl [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Formatting output with 'printf' in Perl
my #selections = ("Hamburger","Frankfurter","French Fries","Large Coke","Medium Coke","Small Coke","Onion Rings");
my #prices = (3.49, 2.19, 1.69, 1.79, 1.59, 1.39, 1.19);
my #quantity = (3, 0, 0, 4, 0, 0, 8);
printf("%s %10s %12s %10s\n", "Qty", "Desc.", "Unit \$", "Total");
for($meh = 0; $meh <= 6; $meh++)
{
if($quantity[$meh] != 0)
{
printf("%d %10s %9.2f %7.2f\n", $quantity[$meh], $selections[$meh], $prices[$meh], $prices[$meh]*$quantity[$meh])
}
}
I can't figure out how to make the columns line up. I followed the suggestions of another post, but it still isn't working.
The problem is that your strings are more than 10 characters long, and Perl won't cut them unless you specify a maximum width, which is given after the dot for strings (%10.10s). Also, you may want to use a negative number so they become aligned to the left (%-10.10s).
If you want the columns to be exactly aligned based on dynamic input data, you need to make two passes over the rows. The first time through, record the maximum length of each column. Then construct a format string using those lengths. Finally, print each row using that format string.
use strict;
use warnings;
my #selections = ("Hamburger","Frankfurter","French Fries","Large Coke","Medium Coke","Small Coke","Onion Rings");
my #prices = (3.49, 2.19, 1.69, 1.79, 1.59, 1.39, 1.19);
my #quantity = (3, 0, 0, 4, 0, 0, 8);
my #rows;
push #rows, ["Qty", "Desc.", "Unit \$", "Total"];
# construct table data as a two-dimensional array
for (my $meh = 0; $meh < #selections; $meh++) {
next unless $quantity[$meh];
push #rows, [$quantity[$meh], $selections[$meh], $prices[$meh], $prices[$meh]*$quantity[$meh]];
}
# first pass over rows: compute the maximum width for each column
my #widths;
for my $row (#rows) {
for (my $col = 0; $col < #$row; $col++) {
$widths[$col] = length $row->[$col] if length $row->[$col] > ($widths[$col] // 0);
}
}
# compute the format. for this data, it works out to "%-3s %-11s %-6s %-5s\n"
my $format = join(' ', map { "%-${_}s" } #widths) . "\n";
# second pass: print each row using the format
for my $row (#rows) {
printf $format, #$row;
}
That yields this output:
Qty Desc. Unit $ Total
3 Hamburger 3.49 10.47
4 Large Coke 1.79 7.16
8 Onion Rings 1.19 9.52
Long time ago, Perl was mainly used for formatting files. It still has this capabilities although I haven't seen it used in a program since Perl 4.x came out.
Check out the perlform documentation, the format function, and the write function.
I'd give you an example on what the code would look like except I haven't done it in years. Otherwise, use the printf statement. You can limit the size of a text field with a %-10.10s type of format. This says to left justify the string, and pad it out to 10 characters, but not more than 10 characters.
I also suggest you get a book on modern Perl. One that will teach you about references.
I've rewritten your program to use references. Notice that all of the data is now in a single array instead of spread over four separate arrays that you hope you keep the index together.
I can talk about the ENTREE of $item[1] by saying $item[1]->{ENTREE}. It's easier to read and easier to maintain.
Also note that I've changed your for loop. In yours, you had to know that you had seven items. If you added a new item, you'd have to change your loop. In mine, I use $#menu to get the last index of my menu. I then use (0..$#menu) to automatically loop from 0 to the last item in the #menu array.
use strict;
use warnings;
use Data::Dumper;
my #menu = (
{ ENTREE => "Hamburger", PRICE => 3.49, QUANTITY => 3 },
{ ENTREE => "Frankfurter", PRICE => 2.19, QUANTITY => 0 },
{ ENTREE => "French Fries", PRICE => 1.69, QUANTITY => 0 },
{ ENTREE => "Large Coke", PRICE => 1.79, QUANTITY => 4 },
{ ENTREE => "Medium Coke", PRICE => 1.59, QUANTITY => 0 },
{ ENTREE => "Small Coke", PRICE => 1.39, QUANTITY => 0 },
{ ENTREE => "Onion Rings", PRICE => 1.19, QUANTITY => 8 },
);
printf "%-3.3s %-10.10s %-6.6s %s\n\n", 'Qty', 'Desc.', 'Unit $', 'Total';
# Use $#menu to get the number of items in the array instead of knowing it's 6
foreach my $item (0..$#menu) {
# Dereference $menu[$item] to make $menu_item a hash
# This makes the syntax easier to read.
my %menu_item = %{ $menu[$item] };
if ( $menu_item{QUANTITY} ) {
printf "%3d %-10.10s %9.2f %7.2f\n",
$menu_item{QUANTITY}, $menu_item{ENTREE}, $menu_item{PRICE},
$menu_item{QUANTITY} * $menu_item{PRICE};
}
}
OUTPUT:
Qty Desc. Unit $ Total
3 Hamburger 3.49 10.47
4 Large Coke 1.79 7.16
8 Onion Ring 1.19 9.52

How can I find out if a hash has an odd number of elements in assignment?

How could I find out if this hash has an odd number of elements?
my %hash = ( 1, 2, 3, 4, 5 );
Ok, I should have written more information.
sub routine {
my ( $first, $hash_ref ) = #_;
if ( $hash_ref refers to a hash with odd numbers of elements ) {
"Second argument refers to a hash with odd numbers of elements.\nFalling back to default values";
$hash_ref = { option1 => 'office', option2 => 34, option3 => 'fast' };
}
...
...
}
routine( [ 'one', 'two', 'three' ], { option1 =>, option2 => undef, option3 => 'fast' );
Well, I suppose there is some terminological confusion in the question that should be clarified.
A hash in Perl always has the same number of keys and values - because it's fundamentally an engine to store some values by their keys. I mean, key-value pair should be considered as a single element here. )
But I guess that's not what was asked really. ) I suppose the OP tried to build a hash from a list (not an array - the difference is subtle, but it's still there), and got the warning.
So the point is to check the number of elements in the list which will be assigned to a hash. It can be done as simple as ...
my #list = ( ... there goes a list ... );
print #list % 2; # 1 if the list had an odd number of elements, 0 otherwise
Notice that % operator imposes the scalar context on the list variable: it's simple and elegant. )
UPDATE as I see, the problem is slightly different. Ok, let's talk about the example given, simplifying it a bit.
my $anhash = {
option1 =>,
option2 => undef,
option3 => 'fast'
};
See, => is just a syntax sugar; this assignment could be easily rewritten as...
my $anhash = {
'option1', , 'option2', undef, 'option3', 'fast'
};
The point is that missing value after the first comma and undef are not the same, as lists (any lists) are flattened automatically in Perl. undef can be a normal element of any list, but empty space will be just ignored.
Take note the warning you care about (if use warnings is set) will be raised before your procedure is called, if it's called with an invalid hash wrapped in reference. So whoever caused this should deal with it by himself, looking at his own code: fail early, they say. )
You want to use named arguments, but set some default values for missing ones? Use this technique:
sub test_sub {
my ($args_ref) = #_;
my $default_args_ref = {
option1 => 'xxx',
option2 => 'yyy',
};
$args_ref = { %$default_args_ref, %$args_ref, };
}
Then your test_sub might be called like this...
test_sub { option1 => 'zzz' };
... or even ...
test_sub {};
The simple answer is: You get a warning about it:
Odd number of elements in hash assignment at...
Assuming you have not been foolish and turned warnings off.
The hard answer is, once assignment to the hash has been done (and warning issued), it is not odd anymore. So you can't.
my %hash = (1,2,3,4,5);
use Data::Dumper;
print Dumper \%hash;
$VAR1 = {
'1' => 2,
'3' => 4,
'5' => undef
};
As you can see, undef has been inserted in the empty spot. Now, you can check for undefined values and pretend that any existing undefined values constitutes an odd number of elements in the hash. However, should an undefined value be a valid value in your hash, you're in trouble.
perl -lwe '
sub isodd { my $count = #_ = grep defined, #_; return ($count % 2) };
%a=(a=>1,2);
print isodd(%a);'
Odd number of elements in hash assignment at -e line 1.
1
In this one-liner, the function isodd counts the defined arguments and returns whether the amount of arguments is odd or not. But as you can see, it still gives the warning.
You can use the __WARN__ signal to "trap" for when a hash assignment is incorrect.
use strict ;
use warnings ;
my $odd_hash_length = 0 ;
{
local $SIG{__WARN__} = sub {
my $msg = shift ;
if ($msg =~ m{Odd number of elements in hash assignment at}) {
$odd_hash_length = 1 ;
}
} ;
my %hash = (1, 2, 3, 4, 5) ;
}
# Now do what you want based on $odd_hash_length
if ($odd_hash_length) {
die "the hash had an odd hash length assignment...aborting\n" ;
} else {
print "the hash was initialized correctly\n";
}
See also How to capture and save warnings in Perl.