Is instantiating a hash in a function inefficient in perl? - perl

Is there any difference in doing the following, efficiency, bad practice...?
(In a context of bigger hashes and sending them through many functions)
sub function {
my ($self, $hash_ref) = #_;
my %hash = %{$hash_ref};
print $hash{$key};
return;
}
Compared to:
sub function {
my ($self, $hash_ref) = #_;
print $hash_ref->{$key};
return;
}

Let's say %$hash_ref contains N elements.
The first snippet does the following in addition to what the second snippet does:
Creates N scalars. (Can involve multiple memory allocations each.)
Adds N*2 scalars to the stack. (Cheap.)
Creates a hash. (More memory allocations...)
Adds N elements to a hash. (More memory allocations...)
The second snippet does the following in addition to what the first snippet does:
[Nothing. It's a complete subset of the first snippet]
The first snippet is therefore far less efficient than the second. It also more complicated by virtue of having extra code. The complete lack of benefit and the numerous costs dictate that one should avoid the pattern used in the first snippet.

1st snippet is silly. But it's convenient practice to emulate named arguments:
sub function {
my ($self, %params ) = #_;
...
}
So, pass arrays/hashes by reference, creation of new (especially big) hash will be much slower. But there is nothing bad in "named arguments" hack.
And did you now that there exist key/value slice (v5.20+ only)? You can copy part of hash easily this way:
my %foo = ( one => 1, two => 2, three => 3, four => 4);
my %bar = %foo{'one', 'four'};
More information in perldoc perldata

The first version of the sub creates a local copy of the data structure which reference is passed to it. As such it is far less efficient, of course.
There is one legitimate reason for this: to make sure the data in the caller isn't changed. That local %hash can be changed in the sub as needed or convenient and the data in the calling code is not affected. This way the data in the caller is also protected against accidental changes.
Another reason why a local copy of data is done, in particular with deeper data structures, is to avoid long chains of dereferencing and thus simplify code; so parts of deep hierarchies may be copied for simpler access. Then this is merely for (presumed) programming convenience.
So in the shown example there'd be absolutely no reason to make a local copy. However, presumably the question is about subs where more work is done and then what's best depends on details.

Related

Which features of Perl make it a functional programming language?

Inspired a little by: https://stackoverflow.com/questions/30977789/why-is-c-not-a-functional-programming-language
I found: Higher Order Perl
It made me wonder about the assertion that Perl is a functional programming language. Now, I appreciate that functional programming is a technique (much like object oriented).
However I've found a list of what makes a functional programming language:
First Class functions
Higher Order Functions
Lexical Closures
Pattern Matching
Single Assignment
Lazy Evaluation
Garbage Collection
Type Inference
Tail Call Optimization
List Comprehensions
Monadic effects
Now some of these I'm quite familiar with:
Garbage collection, for example, is Perl reference counting and releasing memory when no longer required.
Lexical closures are even part of the FAQ: What is a closure? - there's probably a better article here: http://www.perl.com/pub/2002/05/29/closure.html
But I start to get a bit fuzzy on some of these - List Comprehensions, for example - I think that's referring to map/grep (List::Util and reduce?)
I anyone able to help me fill in the blanks here? Which of the above can Perl do easily (and is there an easy example) and are there examples where it falls down?
Useful things that are relevant:
Perl monks rant about functional programming
Higher Order Perl
C2.com functional programming definitions
First Class functions
In computer science, a programming language is said to have first-class functions if it treats functions as first-class citizens. Specifically, this means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures.
So in Perl:
my $print_something = sub { print "Something\n" };
sub do_something {
my ($function) = #_;
$function->();
}
do_something($print_something);
Verdict: Natively supported
Higher Order Functions
In mathematics and computer science, a higher-order function (also functional form, functional or functor) is a function that does at least one of the following:
takes one or more functions as an input
outputs a function
With reference to this post on perlmonks:
In Perl terminology, we often refer to them as callbacks, factories, and functions that return code refs (usually closures).
Verdict: Natively supported
Lexical Closures
Within the perl FAQ we have questions regarding What is a closure?:
Closure is a computer science term with a precise but hard-to-explain meaning. Usually, closures are implemented in Perl as anonymous subroutines with lasting references to lexical variables outside their own scopes. These lexicals magically refer to the variables that were around when the subroutine was defined (deep binding).
Closures are most often used in programming languages where you can have the return value of a function be itself a function, as you can in Perl.
This is explained perhaps a little more clearly in the article: Achieving Closure
sub make_hello_printer {
my $message = "Hello, world!";
return sub { print $message; }
}
my $print_hello = make_hello_printer();
$print_hello->()
Verdict: Natively supported
Pattern Matching
In the context of pure functional languages and of this page, Pattern Matching is a dispatch mechanism: choosing which variant of a function is the correct one to call. Inspired by standard mathematical notations.
Dispatch tables are the closest approximation - essentially a hash of either anonymous subs or code refs.
use strict;
use warnings;
sub do_it {
print join( ":", #_ );
}
my $dispatch = {
'onething' => sub { print #_; },
'another_thing' => \&do_it,
};
$dispatch->{'onething'}->("fish");
Because it's just a hash, you can add code references and anonymous subroutines too. (Note - not entirely dissimilar to Object Oriented programming)
Verdict: Workaround
Single Assignment
Any assignment that changes an existing value (e.g. x := x + 1) is disallowed in purely functional languages.4 In functional programming, assignment is discouraged in favor of single assignment, also called initialization. Single assignment is an example of name binding and differs from assignment as described in this article in that it can only be done once, usually when the variable is created; no subsequent reassignment is allowed.
I'm not sure perl really does this. The closest approximation might be references/anonymous subs or perhaps constant.
Verdict: Not Supported
Lazy Evaluation
Waiting until the last possible moment to evaluate an expression, especially for the purpose of optimizing an algorithm that may not use the value of the expression.
Examples of lazy evaluation techniques in Perl 5?
And again, coming back to Higher Order Perl (I'm not affiliated with this book, honest - it just seems to be one of the key texts on the subject).
The core concept here seems to be - create a 'linked list' in perl (using object oriented techniques) but embed a code reference at your 'end marker' that evaluates if you ever get that far.
Verdict: Workaround
Garbage Collection
"GarbageCollection (GC), also known as automatic memory management, is the automatic recycling of heap memory."
Perl does this via reference counting, and releasing things when they are no longer referenced. Note that this can have implications for certain things that you're (probably!) more likely to encounter when functional programming.
Specifically - circular references which are covered in perldoc perlref
Verdict: Native support
Type Inference
TypeInference is the analysis of a program to infer the types of some or all expressions, usually at CompileTime
Perl does implicitly cast values back and forth as it needs to. Usually this works well enough that you don't need to mess with it. Occasionally you need to 'force' the process, by making an explicit numeric or string operation. Canonically, this is either by adding 0, or concatenating an empty string.
You can overload a scalar to do different things in by using dualvars
Verdict: Native support
Tail Call Optimization
Tail-call optimization (or tail-call merging or tail-call elimination) is a generalization of TailRecursion: If the last thing a routine does before it returns is call another routine, rather than doing a jump-and-add-stack-frame immediately followed by a pop-stack-frame-and-return-to-caller, it should be safe to simply jump to the start of the second routine, letting it re-use the first routine's stack frame (environment).
Why is Perl so afraid of "deep recursion"?
It'll work, but it'll warn if your recursion depth is >100. You can disable this by adding:
no warnings 'recursion';
But obviously - you need to be slightly cautious about recursion depth and memory footprint.
As far as I can tell, there isn't any particular optimisation and if you want to do something like this in an efficient fashion, you may need to (effectively) unroll your recursives and iterate instead.
Tailcalls are supported by perl. Either see the goto ⊂ notation, or see the neater syntax for it provided by Sub::Call::Tail
Verdict: Native
List Comprehensions
List comprehensions are a feature of many modern FunctionalProgrammingLanguages. Subject to certain rules, they provide a succinct notation for GeneratingElements? in a list.
A list comprehension is SyntacticSugar for a combination of applications of the functions concat, map and filter
Perl has map, grep, reduce.
It also copes with expansion of ranges and repetitions:
my #letters = ( "a" .. "z" );
So you can:
my %letters = map { $_ => 1 } ( "A" .. "z" );
Verdict: Native (List::Utils is a core module)
Monadic effects
... nope, still having trouble with these. It's either much simpler or much more complex than I can grok.
If anyone's got anything more, please chip in or edit this post or ... something. I'm still a sketchy on some of the concepts involved, so this post is more a starting point.
Really nice topic, I wanted to write an article titled something link "the camel is functional". Let me contribute with some code.
Perl also support this anonymous functions like
sub check_config {
my ( $class, $obj ) = #_;
my $separator = ' > ';
# Build message from class namespace.
my $message = join $separator, ( split '::', $class );
# Use provided object $obj or
# create an instance of class with defaults, provided by configuration.
my $object = $obj || $class->new;
# Return a Function.
return sub {
my $attribute = shift;
# Compare attribute with configuration,
# just to ensure it is read from there.
is $object->config->{$attribute},
# Call attribute accessor so it is read from config,
# and validated by type checking.
$object->$attribute,
# Build message with attribute.
join $separator, ( $message, $attribute );
}
}
sub check_config_attributes {
my ( $class, $obj ) = #_;
return sub {
my $attributes = shift;
check_config( $class, $obj )->($_) for (#$attributes);
}
}

What's the best Perl practice for returning hashes from functions?

I am mulling over a best practice for passing hash references for return data to/from functions.
On the one hand, it seems intuitive to pass only input values to a function and have only return output variables. However, passing hashes in Perl can only be done by reference, so it is a bit messy and would seem more of an opportunity to make a mistake.
The other way is to pass a reference in the input variables, but then it has to be dealt with in the function, and it may not be clear what is an input and what is a return variable.
What is a best practice regarding this?
Return references to an array and a hash, and then dereference it.
($ref_array,$ref_hash) = $this->getData('input');
#array = #{$ref_array};
%hash = %{$ref_hash};
Pass in references (#array, %hash) to the function that will hold the output data.
$this->getData('input', \#array, \%hash);
Just return the reference. There is no need to dereference the whole
hash like you are doing in your examples:
my $result = some_function_that_returns_a_hashref;
say "Foo is ", $result->{foo};
say $_, " => ", $result->{$_} for keys %$result;
etc.
I have never seen anyone pass in empty references to hold the result. This is Perl, not C.
Trying to create copies by saying
my %hash = %{$ref_hash};
is even more dangerous than using the hashref. This is because it only creates a shallow copy. This will lead you to thinking it is okay to modify the hash, but if it contains references they will modify the original data structure. I find it better to just pass references and be careful, but if you really want to make sure you have a copy of the reference passed in you can say:
use Storable qw/dclone/;
my %hash = %{dclone $ref_hash};
The first one is better:
my ($ref_array,$ref_hash) = $this->getData('input');
The reasons are:
in the second case, getData() needs to
check the data structures to make
sure they are empty
you have freedom to return undef as a special value
it looks more Perl-idiomatic.
Note: the lines
#array = #{$ref_array};
%hash = %{$ref_hash};
are questionable, since you shallow-copy the whole data structures here. You can use references everywhere where you need array/hash, using -> operator for convenience.
If it's getting complicated enough that both the callsite and the called function are paying for it (because you have to think/write more every time you use it), why not just use an object?
my $results = $this->getData('input');
$results->key_value_thingies;
$results->listy_thingies;
If making an object is "too complicated" then start using Moose so that it no longer is.
My personal preference for sub interfaces:
If the routine has 0-3 arguments, they may be passed in list form: foo( 'a', 12, [1,2,3] );
Otherwise pass a list of name value pairs. foo( one => 'a', two => 12, three => [1,2,3] );
If the routine has or may have more than one argument seriously consider using name/value pairs.
Passing in references increases the risk of inadvertent data modification.
On returns I generally prefer to return a list of results rather than an array or hash reference.
I return hash or array refs when it will make a noticeable improvement in speed or memory consumption (ie BIG structures), or when a complex data structure is involved.
Returning references when not needed deprives one of the ability to take advantage of Perl's nice list handling features and exposes one to the dangers of inadvertent modification of data.
In particular, I find it useful to assign a list of results into an array and return the array, which provides the contextual return behaviors of an array to my subs.
For the case of passing in two hashes I would do something like:
my $foo = foo( hash1 => \%hash1, hash2 => \%hash2 ); # gets number of items returned
my #foo = foo( hash1 => \%hash1, hash2 => \%hash2 ); # gets items returned
sub foo {
my %arg = #_;
# do stuff
return #results;
}
I originally posted this to another question, and then someone pointed to this as a "related post", so I'll post it here to for my take on the subject, assuming people will encounter it in the future.
I'm going to contradict the Accepted Answer and say that I prefer to have my data returned as a plain hash (well, as an even-sized list which is likely to be interpreted as a hash). I work in an environment where we tend to do things like the following code snippet, and it's much easier to combine and sort and slice and dice when you don't have to dereference every other line. (It's also nice to know that someone can't damage your hashref because you passed the entire thing by value -- though someone pointed out that if your hash contains more than simple scalars it's not so simple.)
my %filtered_config_slice =
hashgrep { $a !~ /^apparent_/ && defined $b } (
map { $_->build_config_slice(%some_params, some_other => 'param') }
($self->partial_config_strategies, $other_config_strategy)
);
This approximates something that my code might do: building a configuration for an object based on various configuration strategy objects (some of which the object knows about inherently, plus some extra guy) and then filters out some of them as irrelevant.
(Yes, we have nice tools like hashgrep and hashmap and lkeys that do useful things to hashes. $a and $b get set to the key and the value of each item in the list, respectively). (Yes, we have people who can program at this level. Hiring is obnoxious, but we have a quality product.)
If you don't intend to do anything resembling functional programming like this, or if you need more performance (have you profiled?) then sure, use hashrefs.
Uh... "passing hashes can only be done by reference"?
sub foo(%) {
my %hash = #_;
do_stuff_with(%hash);
}
my %hash = (a => 1, b => 2);
foo(%hash);
What am I missing?
I would say that if the issue is that you need to have multiple outputs from a function, it's better as a general practice to output a data structure, probably a hash, that holds everything you need to send out rather than taking modifiable references as arguments.

How can I elegantly call a Perl subroutine whose name is held in a variable?

I keep the name of the subroutine I want to call at runtime in a variable called $action. Then I use this to call that sub at the right time:
&{\&{$action}}();
Works fine. The only thing I don't like is that it's ugly and every time I do it, I feel beholden to add a comment for the next developer:
# call the sub by the name of $action
Anyone know a prettier way of doing this?
UPDATE: The idea here was to avoid having to maintain a dispatch table every time I added a new callable sub, since I am the sole developer, I'm not worried about other programmers following or not following the 'rules'. Sacrificing a bit of security for my convenience. Instead my dispatch module would check $action to make sure that 1) it is the name of a defined subroutine and not malicious code to run with eval, and 2) that it wouldn't run any sub prefaced by an underscore, which would be marked as internal-only subs by this naming convention.
Any thoughts on this approach? Whitelisting subroutines in the dispatch table is something I will forget all the time, and my clients would rather me err on the side of "it works" than "it's wicked secure". (very limited time to develop apps)
FINAL UPDATE: I think I've decided on a dispatch table after all. Although I'd be curious if anyone who reads this question has ever tried to do away with one and how they did it, I have to bow to the collective wisdom here. Thanks to all, many great responses.
Rather than storing subroutine names in a variable and calling them, a better way to do this is to use a hash of subroutine references (otherwise known as a dispatch table.)
my %actions = ( foo => \&foo,
bar => \&bar,
baz => sub { print 'baz!' }
...
);
Then you can call the right one easily:
$actions{$action}->();
You can also add some checking to make sure $action is a valid key in the hash, and so forth.
In general, you should avoid symbolic references (what you're doing now) as they cause all kinds of problems. In addition, using real subroutine references will work with strict turned on.
Just &$action(), but usually it's nicer to use coderefs from the beginning, or use a dispatcher hash. For example:
my $disp = {foo => \&some_sub, bar => \&some_other_sub };
$disp->{'foo'}->();
Huh? You can just say
$action->()
Example:
sub f { return 11 }
$action = 'f';
print $action->();
$ perl subfromscalar.pl
11
Constructions like
'f'->() # equivalent to &f()
also work.
I'm not sure I understand what you mean. (I think this is another in a recent group of "How can I use a variable as a variable name?" questions, but maybe not.)
In any case, you should be able to assign an entire subroutine to a variable (as a reference), and then call it straightforwardly:
# create the $action variable - a reference to the subroutine
my $action = \&sing_out;
# later - perhaps much later - I call it
$action->();
sub sing_out {
print "La, la, la, la, la!\n"
}
The most important thing is: why do you want to use variable as function name. What will happen if it will be 'eval'?
Is there a list of functions that can be used? Or can it be any function? If list exists - how long it is?
Generally, the best way to handle such cases is to use dispatch tables:
my %dispatch = (
'addition' => \&some_addition_function,
'multiplication' => sub { $self->call_method( #_ ) },
);
And then just:
$dispatch{ $your_variable }->( 'any', 'args' );
__PACKAGE__->can($action)->(#args);
For more info on can(): http://perldoc.perl.org/UNIVERSAL.html
I do something similar. I split it into two lines to make it slightly more identifiable, but it's not a lot prettier.
my $sub = \&{$action};
$sub->();
I do not know of a more correct or prettier way of doing it. For what it's worth, we have production code that does what you are doing, and it works without having to disable use strict.
Every package in Perl is already a hash table. You can add elements and reference them by the normal hash operations. In general it is not necessary to duplicate the functionality by an additional hash table.
#! /usr/bin/perl -T
use strict;
use warnings;
my $tag = 'HTML';
*::->{$tag} = sub { print '<html>', #_, '</html>', "\n" };
HTML("body1");
*::->{$tag}("body2");
The code prints:
<html>body1</html>
<html>body2</html>
If you need a separate name space, you can define a dedicated package.
See perlmod for further information.
Either use
&{\&{$action}}();
Or use eval to execute the function:
eval("$action()");
I did it in this way:
#func = qw(cpu mem net disk);
foreach my $item (#func){
$ret .= &$item(1);
}
If it's only in one program, write a function that calls a subroutine using a variable name, and only have to document it/apologize once?
I used this: it works for me.
(\$action)->();
Or you can use 'do', quite similar with previous posts:
$p = do { \&$conn;};
$p->();

Is there a Perl solution for lazy lists this side of Perl 6?

Has anybody found a good solution for lazily-evaluated lists in Perl? I've tried a number of ways to turn something like
for my $item ( map { ... } #list ) {
}
into a lazy evaluation--by tie-ing #list, for example. I'm trying to avoid breaking down and writing a source filter to do it, because they mess with your ability to debug the code. Has anybody had any success. Or do you just have to break down and use a while loop?
Note: I guess that I should mention that I'm kind of hooked on sometimes long grep-map chains for functionally transforming lists. So it's not so much the foreach loop or the while loop. It's that map expressions tend to pack more functionality into the same vertical space.
As mentioned previously, for(each) is an eager loop, so it wants to evaluate the entire list before starting.
For simplicity, I would recommend using an iterator object or closure rather than trying to have a lazily evaluated array. While you can use a tie to have a lazily evaluated infinite list, you can run into troubles if you ever ask (directly or indirectly, as in the foreach above) for the entire list (or even the size of the entire list).
Without writing a full class or using any modules, you can make a simple iterator factory just by using closures:
sub make_iterator {
my ($value, $max, $step) = #_;
return sub {
return if $value > $max; # Return undef when we overflow max.
my $current = $value;
$value += $step; # Increment value for next call.
return $current; # Return current iterator value.
};
}
And then to use it:
# All the even numbers between 0 - 100.
my $evens = make_iterator(0, 100, 2);
while (defined( my $x = $evens->() ) ) {
print "$x\n";
}
There's also the Tie::Array::Lazy module on the CPAN, which provides a much richer and fuller interface to lazy arrays. I've not used the module myself, so your mileage may vary.
All the best,
Paul
[Sidenote: Be aware that each individual step along a map/grep chain is eager. If you give it a big list all at once, your problems start much sooner than at the final foreach.]
What you can do to avoid a complete rewrite is to wrap your loop with an outer loop. Instead of writing this:
for my $item ( map { ... } grep { ... } map { ... } #list ) { ... }
… write it like this:
while ( my $input = calculcate_next_element() ) {
for my $item ( map { ... } grep { ... } map { ... } $input ) { ... }
}
This saves you from having to significantly rewrite your existing code, and as long as the list does not grow by several orders of magnitude during transformation, you get pretty nearly all the benefit that a rewrite to iterator style would offer.
If you want to make lazy lists, you'll have to write your own iterator. Once you have that, you can use something like Object::Iterate which has iterator-aware versions of map and grep. Take a look at the source for that module: it's pretty simple and you'll see how to write your own iterator-aware subroutines.
Good luck, :)
There is at least one special case where for and foreach have been optimized to not generate the whole list at once. And that is the range operator. So you have the option of saying:
for my $i (0..$#list) {
my $item = some_function($list[$i]);
...
}
and this will iterate through the array, transformed however you like, without creating a long list of values up front.
If you wish your map statement to return variable numbers of elements, you could do this instead:
for my $i (0..$#array) {
for my $item (some_function($array[$i])) {
...
}
}
If you wish more pervasive laziness than this, then your best option is to learn how to use closures to generate lazy lists. MJD's excellent book Higher Order Perl can walk you through those techniques. However do be warned that they will involve far larger changes to your code.
Bringing this back from the dead to mention that I just wrote the module List::Gen on CPAN which does exactly what the poster was looking for:
use List::Gen;
for my $item ( #{gen { ... } \#list} ) {...}
all computation of the lists are lazy, and there are map / grep equivalents along with a few other functions.
each of the functions returns a 'generator' which is a reference to a tied array. you can use the tied array directly, or there are a bunch of accessor methods like iterators to use.
Use an iterator or consider using Tie::LazyList from CPAN (which is a tad dated).
I asked a similar question at perlmonks.org, and BrowserUk gave a really good framework in his answer. Basically, a convenient way to get lazy evaluation is to spawn threads for the computation, at least as long as you're sure you want the results, Just Not Now. If you want lazy evaluation not to reduce latency but to avoid calculations, my approach won't help because it relies on a push model, not a pull model. Possibly using Corooutines, you can turn this approach into a (single-threaded) pull model as well.
While pondering this problem, I also investigated tie-ing an array to the thread results to make the Perl program flow more like map, but so far, I like my API of introducing the parallel "keyword" (an object constructor in disguise) and then calling methods on the result. The more documented version of the code will be posted as a reply to that thread and possibly released onto CPAN as well.
If I remember correctly, for/foreach do get the whole list first anyways, so a lazily evaluated list would be read completely and then it would start to iterate through the elements. Therefore, I think there's no other way than using a while loop. But I may be wrong.
The advantage of a while loop is that you can fake the sensation of a lazily evaluated list with a code reference:
my $list = sub { return calculate_next_element };
while(defined(my $element = &$list)) {
...
}
After all, I guess a tie is as close as you can get in Perl 5.

Any good collection module in perl?

Can someone suggest a good module in perl which can be used to store collection of objects?
Or is ARRAY a good enough substitute for most of the needs?
Update:
I am looking for a collections class because I want to be able to do an operation like compute collection level property from each element.
Since I need to perform many such operations, I might as well write a class which can be extended by individual objects. This class will obviously work with arrays (or may be hashes).
There are collection modules for more complex structures, but it is common style in Perl to use Arrays for arrays, stacks and lists. Perl has built in functions for using the array as a stack or list : push/pop, shift/unshift, splice (inserting or removing in the middle) and the foreach form for iteration.
Perl also has a map, called a hashmap which is the equivalent to a Dictionary in Python - allowing you to have an association between a single key and a single value.
Perl developers often compose these two data-structures to build what they need - need multiple values? Store array-references in the value part of the hashtable (Map). Trees can be built in a similar manner - if you need unique keys, use multiple-levels of hashmaps, or if you don't use nested array references.
These two primitive collection types in Perl don't have an Object Oriented api, but they still are collections.
If you look on CPAN you'll likely find modules that provide other Object Oriented data structures, it really depends on your need. Is there a particular data structure you need besides a List, Stack or Map? You might get a more precise answer (eg a specific module) if you're asking about a particular data structure.
Forgot to mention, if you're looking for small code examples across a variety of languages, PLEAC (Programming Language Examples Alike Cookbook) is a decent resource.
I would second Michael Carman's comment: please do not use the term "Hashmap" or "map" when you mean a hash or associative array. Especially when Perl has a map function; that just confuses things.
Having said that, Kyle Burton's response is fundamentally sound: either a hash or an array, or a complex structure composed of a mixture of the two, is usually enough. Perl groks OO, but doesn't enforce it; chances are that a loosely-defined data structure may be good enough for what you need.
Failing that, please define more exactly what you mean by "compute collection level property from each element". And bear in mind that Perl has keywords like map and grep that let you do functional programming things like e.g.
my $record = get_complex_structure();
# $record = {
# 'widgets' => {
# name => 'ACME Widgets',
# skus => [ 'WIDG01', 'WIDG02', 'WIDG03' ],
# sales => {
# WIDG01 => { num => 25, value => 105.24 },
# WIDG02 => { num => 10, value => 80.02 },
# WIDG03 => { num => 8, value => 205.80 },
# },
# },
# ### and so on for 'grommets', 'nuts', 'bolts' etc.
# }
my #standouts =
map { $_->[0] }
sort {
$b->[2] <=> $a->[2]
|| $b->[1] <=> $a->[1]
|| $record->{$a->[0]}->{name} cmp $record->{$b->[0]}->{name}
}
map {
my ($num, $value);
for my $sku (#{$record->{$_}{skus}}) {
$num += $record->{$_}{sales}{$sku}{num};
$value += $record->{$_}{sales}{$sku}{value};
}
[ $_, $num, $value ];
}
keys %$record;
Reading from back to front, this particular Schwarztian transform does three things:
3) It takes a key to $record, goes through the SKUs defined in this arbitrary structure, and works out the aggregate number and total value of transactions. It returns an anonymous array containing the key, the number of transactions and the total value.
2) The next block takes in a number of arrayrefs and sorts them a) first of all by comparing the total value, numerically, in descending orders; b) if the values are equal, by comparing the number of transactions, numerically in descending order; and c) if that fails, by sorting asciibetically on the name associated with this order.
1) Finally, we take the key to $record from the sorted data structure, and return that.
It may well be that you don't need to set up a separate class to do what you want.
I would normally use an #array or a %hash.
What features are you looking for that aren't provided by those?
Base your decision on how you need to access the objects. If pushing them onto an array, indexing into, popping/shifting them off works, then use an array. Otherwise hash them by some key or organize them into a tree of objects that meets your needs. A hash of objects is a very simple, powerful, and highly-optimized way of doing things in Perl.
Since Perl arrays can easily be appended to, resized, sorted, etc., they are good enough for most "collection" needs. In cases where you need something more advanced, a hash will generally do. I wouldn't recommend that you go looking for a collection module until you actually need it.
Either an array or a hash can store a collection of objects. A class might be better if you want to work with the class in certain ways but you'd have to tell us what those ways are before we could make any good recommendations.
i would stick with an ARRAY or a HASH.
#names = ('Paul','Michael','Jessica','Megan');
and
my %petsounds = ("cat" => "meow",
"dog" => "woof",
"snake" => "hiss");
source
It depends a lot; there's Sparse Matrix modules, some forms of persistence, a new style of OO etc
Most people just man perldata, perllol, perldsc to answer their specific issue with a data structure.