I'm trying to write a Perl module for a “persistent YAML hash”, with the following properties:
With every access, check if the YAML file has changed, and if so, reload.
As soon as any data in the hash is changed, save.
Don't save on UNTIE, so that the file isn't updated when you only read values.
My first attempt seemed to work pretty well:
package YAMLHash;
use v5.24;
use warnings;
use experimental 'signatures';
use YAML::XS qw(DumpFile LoadFile);
use File::stat;
sub refresh($self)
{
if (-f $self->{file}) {
if (stat($self->{file})->mtime > $self->{mtime}) {
$self->{data} = LoadFile($self->{file});
$self->{mtime} = stat($self->{file})->mtime;
}
}
}
sub save($self)
{
DumpFile($self->{file}, $self->{data});
$self->{mtime} = stat($self->{file})->mtime;
}
sub TIEHASH($class, @args)
{
my ($filename) = $args[0];
die "No filename specified" unless $filename;
my $self = bless { data=>{}, file=>$filename, mtime=>0 }, $class;
refresh($self);
return $self;
}
sub FETCH($self, $key = '')
{
refresh($self);
return $self->{data}{$key};
}
sub EXISTS($self, $key)
{
refresh($self);
return exists($self->{data}{$key});
}
sub FIRSTKEY($self)
{
refresh($self);
my @ignore = keys %{$self->{data}}; # reset iterator
return each %{$self->{data}};
}
sub NEXTKEY($self, $lastkey)
{
refresh($self);
return each %{$self->{data}};
}
sub SCALAR($self)
{
return scalar %{$self->{data}};
}
sub STORE($self, $key, $value)
{
refresh($self);
$self->{data}{$key} = $value;
save($self);
}
sub DELETE($self, $key)
{
refresh($self);
delete $self->{data}{$key};
save($self);
}
sub CLEAR($self)
{
$self->{data} = {};
save($self);
}
1;
I tried this as follows:
use YAMLHash;
tie my %foo, 'YAMLHash', 'test.yaml';
$foo{hello} = 'world';
$foo{answer} = 42;
$foo{counter}++;
and the resulting YAML file looks like this:
---
answer: 42
counter: 1
hello: world
But then I changed my example code to:
use YAMLHash;
tie my %foo, 'YAMLHash', 'test.yaml';
$foo{hello} = 'world';
$foo{answer} = 42;
$foo{counter}++;
$foo{a}{b}{c}{d} = 'e';
and the result is:
---
a: {}
answer: 42
counter: 2
hello: world
So, obviously, STORE is called when $foo{a} is created, but not when $foo{a}{b}{c}{d} is assigned.
Is there any way to make this do what I want?
You will need to tie %{ $foo{a} }, %{ $foo{a}{b} } and %{ $foo{a}{b}{c} } as well.
You could recursively tie the hashes and arrays in the data structure in TIEHASH. Don't forget to do the same thing to data added to the structure via STORE!
You might want to use a different class for the root of the data structure and non-root nodes.
Warning: Using tie will make accesses slower.
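As a rough sketch of the recursive approach (not the asker's YAMLHash, and hashes only), the class below ties every nested hash in place when it is stored and calls a callback on each change. The name NotifyHash, the wrap helper and the callback argument are all made up for illustration; in the real module the callback would be whatever triggers the root's save. Tying the stored hash itself, rather than a copy, is meant to keep references the caller already holds pointing at the tied hash.

```perl
use strict;
use warnings;

package NotifyHash;    # hypothetical name, not part of the question's module
use Tie::Hash;
our @ISA = ('Tie::ExtraHash');    # data hash lives in $self->[0]

# The extra tie argument (a change callback) is kept in $self->[1].
sub TIEHASH {
    my ($class, $on_change) = @_;
    return bless [ {}, $on_change ], $class;
}

# Tie a plain hash in place so existing references to it stay valid.
sub wrap {
    my ($self, $value) = @_;
    return $value unless ref $value eq 'HASH' && !tied %$value;
    my %copy = %$value;
    %$value = ();
    my $child = tie %$value, ref($self), $self->[1];
    # Fill the child's storage directly so wrapping does not fire the callback.
    $child->[0]{$_} = $child->wrap($copy{$_}) for keys %copy;
    return $value;
}

sub STORE {
    my ($self, $key, $value) = @_;
    $self->[0]{$key} = $self->wrap($value);
    $self->[1]->();
}

sub DELETE {
    my ($self, $key) = @_;
    my $old = delete $self->[0]{$key};
    $self->[1]->();
    return $old;
}

sub CLEAR {
    my ($self) = @_;
    %{ $self->[0] } = ();
    $self->[1]->();
}

package main;

my $changes = 0;
tie my %foo, 'NotifyHash', sub { $changes++ };

$foo{a} = { b => { c => 'old' } };    # nested hashes get tied inside STORE
my $before = $changes;
$foo{a}{b}{c} = 'new';                # reaches STORE on the innermost hash
print "callback ran for the nested store\n" if $changes > $before;
```

Arrays would need a matching TIEARRAY class, and whether a given serializer walks tied hashes correctly is something to check separately.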
Note that you need to tie the scalars too, not just the hashes (and arrays). All of the following change the value of a hash element without calling STORE:
Changing the scalar directly:
++$foo{a};
chomp($foo{a});
$foo{a} =~ s/x/y/g;
...
Changing a scalar via an alias or a reference:
my \$x = \$foo{a}; $x = 123;
my $r = \$foo{a}; $$r = 123;
for ($foo{a}) { $_ = 123; }
sub { $_[0] = 123; }->($foo{a});
...
I am very new to Perl. I am trying to use the code below, which comes from CPAN.
my $C;
# Recursive version of C<each>;
sub reach {
my $ref = shift;
if (ref $ref eq 'HASH') {
if (defined $C->{$ref}{v}) {
if (ref $C->{$ref}{v} eq 'HASH') {
if (my @rec = reach($C->{$ref}{v})) {
return ($C->{$ref}{k},@rec);
}
} elsif (ref $C->{$ref}{v} eq 'ARRAY') {
if (my @rec = reach($C->{$ref}{v})) {
if (defined $C->{$ref}{k}) {
return $C->{$ref}{k},@rec;
}
return @rec;
}
}
undef $C->{$ref};
}
if (my ($k,$v) = each %$ref) {
$C->{$ref}{v} = $v;
$C->{$ref}{k} = $k;
return ($k,reach($v));
}
return ();
} elsif (ref $ref eq 'ARRAY') {
if (defined $C->{$ref}{v}) {
if (ref $C->{$ref}{v} eq 'HASH' ||
ref $C->{$ref}{v} eq 'ARRAY') {
if (my @rec = reach($C->{$ref}{v})) {
if (defined $C->{$ref}{k}) {
return $C->{$ref}{k},@rec;
}
return @rec;
}
}
}
if (my $v = $ref->[$C->{$ref}{i}++ || 0]) {
$C->{$ref}{v} = $v;
return (reach($v));
}
return ();
}
return $ref;
}
input:
bar => {cmd_opts => { gld_upf => ['abc' , 'def']} }
current output:
[bar, cmd_opts, gld_upf, abc]
[bar, cmd_opts, gld_upf, def]
desired output:
[bar, cmd_opts, gld_upf, ['abc', 'def']]
Also, what concepts are being used in this code?
Are there any books or courses I can take to learn them?
Also, what concepts are being used in this code? Are there any books or courses I can take to learn them?
The code you mention, from the Deep::Hash::Utils CPAN module, mainly deals with nested data structures.
A couple of places to read about these:
the official docs: perldsc ; perlreftut ; perlref ;
Modern Perl by chromatic has a section on Nested Data Structures around page 60
Intermediate Perl: Beyond The Basics of Learning Perl 2nd edition has a section about Nested Data Structures around page 44.
In the most basic case, in these nested data structures, every node has one of the following types:
scalar
hashref
arrayref
In turn, the values in the array pointed to by an arrayref can be of type scalar/hashref/arrayref.
The same goes for the values of the hash pointed to by a hashref: each can be of type scalar/hashref/arrayref.
This induces a tree-like structure. The algorithm for traversing such a tree is depth-first search
where some additional logic is required to check the type of the node and depending on the type decide how to proceed further down the tree.
To make a parallel, all of this is not that much different from traversing a filesystem hierarchy (see link1, link2).
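To make the traversal concrete, here is a minimal recursive sketch of such a depth-first walk. It is not the code from Deep::Hash::Utils; the function name leaf_paths is made up, and it returns all paths at once instead of acting as an iterator.

```perl
use strict;
use warnings;

# Depth-first walk: returns one array ref per root-to-leaf path.
sub leaf_paths {
    my ($node, @prefix) = @_;
    my $type = ref $node;
    if ($type eq 'HASH') {
        # Descend into every value, extending the path with its key.
        return map { leaf_paths($node->{$_}, @prefix, $_) } sort keys %$node;
    }
    if ($type eq 'ARRAY') {
        # Array elements do not add a path component of their own.
        return map { leaf_paths($_, @prefix) } @$node;
    }
    return [ @prefix, $node ];    # a scalar leaf ends the path
}

my $data = { bar => { cmd_opts => { gld_upf => [ 'abc', 'def' ] } } };
print join(', ', @$_), "\n" for leaf_paths($data);
```

For the input from the question this prints one line per leaf, matching the "current output" shown there.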
A longer list of Perl resources, called perlres, is also available.
In this specific case, the function reach from Deep::Hash::Utils acts as an iterator, and it returns all paths descending from the root down to each leaf.
Whenever a @path to a leaf is found, its elements are compared side-by-side with another list called @output, and there are three cases:
there's no element on that position, so we store it
the elements are equal, so we skip them
the elements are different, so we merge them together in a list
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Deep::Hash::Utils qw/reach/;
my $input = { bar => {cmd_opts => { gld_upf => ['abc' , 'def']} } };
my @output = ();
while (my @path = reach($input)) {
for(my $i=0;$i<=$#path;$i++){
if(defined $output[$i]) {
if(ref($output[$i]) eq "") {
if($output[$i] eq $path[$i]) {
next;
};
my $e1 = $output[$i];
my $e2 = $path[$i];
$output[$i] = [$e1,$e2];
}elsif(ref($output[$i]) eq "ARRAY"){
push @{$output[$i]}, $path[$i];
};
} else {
$output[$i] = $path[$i];
};
};
}
print Dumper \@output;
OUTPUT:
$VAR1 = [
'bar',
'cmd_opts',
'gld_upf',
[
'abc',
'def'
]
];
I have to process some files in a directory.
So, I am using non-OO Perl code as below (just the important snippets are printed below):
#!/usr/bin/perl
use strict;
use warnings;
my $dnaFilesDirectory = "./projectGeneSequencingPfzr";
my %properties = &returnGeneticSequences($dnaFilesDirectory);
sub returnGeneticSequences {
my $dnaDirectory = shift;
my @dnaFiles = ();
opendir(DNADIR, $dnaFilesDirectory) or die "Cannot open directory:$!";
@dnaFiles = readdir(DIR);
foreach my $file (@dnaFiles) {
my $dnaFilePath = $dnaFilesDirectory."\/".$file;
if($file =~ /dna_file.*\.dnaPrj/) {
my %diseaseStages = &returnDiseasesStages($dnaFilePath);
## Do some data analysis on the %diseaseStages Hash;
}
}
}
sub returnDiseasesStages {
my $dnaFile = shift;
## Do something with DNA file and build a hash called %diseasesStagesHash;
return %diseasesStagesHash;
}
The above code works fine.
But we have to create the equivalent OO Perl code for the above functions.
I am trying to do the following, but it does not seem to work. Obviously, I am doing something wrong in calling the class method returnDiseasesStages from returnGeneticSequences.
#!/usr/bin/perl
use strict;
use warnings;
package main;
my $obj = GeneticSequences->new(dnaFilesDir => "./projectGeneSequencingPfzr");
$obj->returnGeneticSequences();
package GeneticSequences;
sub new {
my $class = shift;
my $self = {
dnaFilesDir => "dnaFilesDir",
@_,
};
return (bless($self,$class));
}
sub returnGeneticSequences {
my $self = shift;
my $dnaFilesDirectoryGS = $self->{dnaFilesDir};
my @dnaFiles = ();
opendir(DNADIR,$dnaFilesDirectoryGS) or die "Cannot open directory:$!";
@dnaFiles = readdir(DIR);
foreach my $file (@dnaFiles) {
my $dnaFilePath = $dnaFilesDirectory."\/".$file;
if($file =~ /dna_file.*\.dnaPrj/) {
my $gsObj = GeneticSequences->new();
my %diseaseStages = $gsObj->returnDiseasesStages($dnaFilePath);
## Do some data analysis on the %diseaseStages Hash;
}
}
}
sub returnDiseasesStages {
my $dnaFile = shift;
##Do something with DNA file and build a hash called %diseasesStagesHash;
return %diseasesStagesHash;
}
Please help me understand what I am doing wrong.
The syntax
$gsObj->returnDiseasesStages($dnaFilePath)
is equivalent to the syntax
returnDiseasesStages($gsObj, $dnaFilePath)
(with Perl checking the reference type of $gsObj to see what package to search for the returnDiseasesStages function in).
So your returnDiseasesStages function should expect two arguments:
sub returnDiseasesStages {
my ($self, $dnaFile) = @_;
...
}
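A small self-contained sketch of that equivalence, using a made-up Greeter class (none of these names come from the question): the same sub is called once as a method and once as a plain function with the object passed explicitly.

```perl
use strict;
use warnings;

package Greeter;    # hypothetical class for illustration

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

# The invocant arrives as the first element of @_.
sub greet {
    my ($self, $name) = @_;
    return "$self->{greeting}, $name";
}

package main;

my $obj = Greeter->new(greeting => 'Hello');
print $obj->greet('world'), "\n";             # method call
print Greeter::greet($obj, 'world'), "\n";    # same sub, called as a function
```

Both calls print the same line. A method that does `my $dnaFile = shift;` without first taking off the invocant ends up with the object in $dnaFile, which appears to be what happens in the question.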
I was given the assignment to implement a linked list in Perl without using the built-in push, pop, shift and unshift. This is my first time learning Perl, coming from C++ and Java, and this is what I came up with:
#!/usr/bin/perl
sub node {
my (@value) = @_;
sub get {
$next;
}
sub push {
#my $next = \@_;
if(defined($next))
{
$next->push(@_);
}
else
{
my $next = \@_;
}
}
sub size {
if(defined($next))
{
$next->size($_[0]);
}
else
{
$_[0]+1;
}
}
sub myprint {
print "$_[0]: ";
foreach (@value) {
print "$_, ";
}
print "\n";
if(defined($next)) {
$next->print($_[0]+1);
}
}
}
while(!defined($done))
{
print "what do you want to do?\n";
print "1 -- push\n";
print "2 -- print nodes\n";
print "3 -- pop\n";
print "4 -- quit\n";
my $val = <STDIN>;
if ($val == 1)
{
print "Type something: ";
$input = <STDIN>;
if(defined($top))
{
$top->push(node($input));
}
else
{
$top = node($input);
}
}
elsif ($val == 2)
{
if(defined($top))
{
$top->myprint(1);
}
}
elsif ($val == 3)
{
if(defined($top))
{
if(defined($top->next))
{
$top=$top->next;
}
}
}
elsif ($val == 4)
{
$done=true;
}
else
{
print "Invalid option\n";
}
}
output:
what do you want to do?
1 -- push
2 -- print nodes
3 -- pop
4 -- quit
1
Type something: q
what do you want to do?
1 -- push
2 -- print nodes
3 -- pop
4 -- quit
1
Type something: w
Can't call method "push" without a package or object reference at ./linkedlistattempt1.pl line 76, <STDIN> line 4.
I'm guessing the "->" operator can only be used with a module or package.
I haven't gotten around to testing the other methods; I'm still working on push. I feel like the best way of doing this is to simply have a holder sub like sub holder { $value = \@_; }, but I don't understand how I would (could?) add more variables, like the next node, hence the sub-within-a-sub design. So without the help of Perl's built-in functions, how would I do this?
It's important to mention that I'm interested in methods that can run on older versions, going down to 5.10. Most if not all of the tutorials show features for 5.16 or 5.18.
A typical Perl implementation using classic Perl OO would look something like this. Read the man pages perlootut and perlobj to learn how it works.
#!/usr/bin/perl
use strict;
use warnings;
package LinkedList::Node;
# Constructor.
sub new {
my ($class, $item) = @_;
my $self = { item => $item };
return bless($self, $class);
}
# Read-only accessor.
sub item {
my $self = shift;
return $self->{item};
}
# Read-write accessor.
sub next {
my $self = shift;
my $next = $self->{next};
if (@_ > 0) {
$self->{next} = shift;
}
return $next;
}
package LinkedList;
# Constructor. Creates an empty linked list.
sub new {
my $class = shift;
return bless({}, $class);
}
# Read-only accessor.
sub head {
my $self = shift;
return $self->{head};
}
# Insert an item at the beginning.
sub push {
my ($self, $item) = @_;
my $node = LinkedList::Node->new($item);
my $head = $self->{head};
if ($head) {
$node->next($head);
}
$self->{head} = $node;
}
package main;
my $list = LinkedList->new;
$list->push(2);
$list->push(5);
$list->push(9);
for (my $node = $list->head; $node; $node = $node->next) {
print($node->item, "\n");
}
If I have the following code
sub a {
my $id = shift;
# does something
print &a_section($texta);
print &a_section($textb);
sub a_section {
my $text = shift;
# combines the $id and the $text to create and return some result.
}
}
Assuming a_section is called only by a, will I run into a memory leak, variable dependability, or other problem?
I am exploring this as an alternative so I can avoid the necessity of passing $id to a_section.
First, it's not a private sub. It's fully visible from the outside. Second, you will have problems.
$ perl -wE'
sub outer {
my ($x) = @_;
sub inner { say $x; }
inner();
}
outer(123);
outer(456);
'
Variable "$x" will not stay shared at -e line 4.
123
123 <--- XXX Not 456!!!!
You could do:
sub a {
my $id = shift;
local *a_section = sub {
my $text = shift;
# combines the $id and the $text to create and return some result.
};
print a_section($texta);
print a_section($textb);
}
(You can call the inner sub recursively using a_section(...).)
or:
sub a {
my $id = shift;
my $a_section = sub {
my $text = shift;
# combines the $id and the $text to create and return some result.
};
print $a_section->($texta);
print $a_section->($textb);
}
(If you want to call the inner sub recursively, use __SUB__->(...) to avoid a memory leak. It is available in Perl 5.16+ once enabled with use feature 'current_sub' or use v5.16.)
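A short sketch of the recursive case, with made-up names (make_countdown and $label are only for illustration). Because the anonymous sub reaches itself through __SUB__ instead of through a variable it closes over, it holds no reference to itself.

```perl
use strict;
use warnings;
use feature 'current_sub';    # enables __SUB__ (Perl 5.16+)

sub make_countdown {
    my $label = shift;
    # No named variable holds the sub, so it never refers to itself.
    return sub {
        my $n = shift;
        return "$label: liftoff" if $n == 0;
        return "$n " . __SUB__->($n - 1);
    };
}

print make_countdown('demo')->(3), "\n";    # prints "3 2 1 demo: liftoff"
```

The closure still sees $label from the enclosing call, which is the point of nesting it in the first place.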
I often find it useful to be able to schedule code to be executed upon leaving the current scope. In my previous life in TCL, a friend created a function we called defer.
It enabled code like:
set fp [open "x"]
defer("close $fp");
which was invoked when the current scope exited. The main benefit is that it's always invoked no matter how/where I leave scope.
So I implemented something similar in Perl, but it seems there should be an easier way. Comments and critiques welcome.
The way I did it in Perl:
create a global, tied variable which holds an array of subs to be executed.
whenever I want to schedule a fn to be invoked on exit, I use local to change the array.
when I leave the current scope, Perl changes the global to the previous value
because the global is tied, I know when this value change happens and can invoke the subs in the list.
The actual code is below.
Is there a better way to do this? Seems this would be a commonly needed capability.
use strict;
package tiescalar;
sub TIESCALAR {
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
sub FETCH {
my $self = shift;
return $self->{VAL};
}
sub STORE {
my $self = shift;
my $value = shift;
if (defined($self->{VAL}) && defined($value)) {
foreach my $s (@{$self->{VAL}}) { &$s; }
}
$self->{VAL} = $value;
}
1;
package main;
our $h;
tie($h, 'tiescalar');
$h = [];
printf "1\n";
printf "2\n";
sub main {
printf "3\n";
local $h = [sub{printf "9\n"}];
push(@$h, sub {printf "10\n";});
printf "4\n";
{
local $h = [sub {printf "8\n"; }];
mysub();
printf "7\n";
return;
}
}
sub mysub {
local $h = [sub {printf "6\n"; }];
print "5\n";
}
main();
printf "11\n";
Well, your specific case is already handled if you use lexical filehandles (as opposed to the old style bareword filehandles). For other cases, you could always use the DESTROY method of an object guaranteed to go to zero references when it goes out of scope:
#!/usr/bin/perl
use strict;
use warnings;
for my $i (1 .. 5) {
my $defer = Defer::Sub->new(sub { print "end\n" });
print "start\n$i\n";
}
package Defer::Sub;
use Carp;
sub new {
my $class = shift;
croak "$class requires a function to call\n" unless @_;
my $self = {
func => shift,
};
return bless $self, $class;
}
sub DESTROY {
my $self = shift;
$self->{func}();
}
ETA: I like brian's name better, Scope::OnExit is a much more descriptive name.
Instead of using tie for this, I think I'd just create an object. You can also avoid the local that way too.
{
my $defer = Scope::OnExit->new( @subs );
$defer->push( $other_sub ); # and pop, shift, etc
...
}
When the variable goes out of scope, you have a chance to do things in the DESTROY method.
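One possible shape for such a class, as a sketch only: the package name and the push method come from the snippet above, but the body is an assumption, including the choice to run the subs in reverse order of registration.

```perl
use strict;
use warnings;

package Scope::OnExit;    # name taken from the answer; this body is an assumption

sub new {
    my ($class, @subs) = @_;
    return bless { subs => [@subs] }, $class;
}

# Register more subs; CORE::push avoids confusion with this method's own name.
sub push {
    my ($self, @subs) = @_;
    CORE::push(@{ $self->{subs} }, @subs);
    return $self;
}

# Runs when the last reference to the guard goes away.
sub DESTROY {
    my $self = shift;
    $_->() for reverse @{ $self->{subs} };
}

package main;

my @log;
{
    my $defer = Scope::OnExit->new(sub { push @log, 'first registered' });
    $defer->push(sub { push @log, 'second registered' });
    push @log, 'body';
}
print "$_\n" for @log;
```

The log shows the body first and then the registered subs, last one first. A sturdier version would probably also guard against exceptions thrown inside DESTROY.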
Also, in the example you posted, you need to check that the values you store are code references, and it's probably a good idea to check that the VAL value is an array reference:
sub TIESCALAR { bless { VAL => [] }, $_[0] }
sub STORE {
my( $self, $value ) = @_;
carp "Can only store array references!" unless ref $value eq ref [];
foreach ( @$value ) {
carp "There should only be code refs in the array"
unless ref $_ eq ref sub {}
}
foreach ( @{ $self->{VAL}} ) { $_->() }
$self->{VAL} = $value;
}
You may want to try out B::Hooks::EndOfScope
I believe this works:
use B::Hooks::EndOfScope;
sub foo {
on_scope_end {
$codehere;
};
$morecode;
return 1; # scope end code executes.
}
foo();
I think you want something like Scope::Guard, but it can't be pushed. Hmmm.
Thanks.
Trivially,
sub OnLeavingScope::DESTROY { ${$_[0]}->() }
used like:
{
...
my $onleavingscope = bless \sub { ... }, 'OnLeavingScope';
my $onleavingscope2 = bless \\&whatever, 'OnLeavingScope';
...
}
(The extra level of having a reference to a reference to a sub is necessary only to work around an optimization (that's arguably a bug) when using a non-closure anonymous sub.)