HTML::TableExtract with a table inside of a table - perl

I have a small script that I am using to manipulate the code from a remote URL (code is separate). The manual page for HTML::TableExtract has the following code section relating to doing a table-in-a-table extract, ie
$te = new HTML::TableExtract
(
headers => [qw(Summary Region)],
chain => [
{ depth => 0, count => 2 },
{ headers => [qw(Part Qty Cost)] }
],
);
My code contains this, ie:
use HTML::TableExtract;
use strict;
use warnings;
my $te = new HTML::TableExtract
(
headers => [qw(Incident Date Time Location Description)],
chain => [
{ depth => 0, count => 2 },
{ headers => [qw(Unit DIS ENR ONS LEF ARR BUS REM COM)] }
],
);
$te->parse_file('data.html');
However, running it gives me this:
Can't locate object method "chain" via package "HTML::TableExtract" at /usr/lib/perl5/HTML/Parser.pm line 80.
Is there something I'm missing? (If anyone has a better way to extract a table from within a table (while printing information from both I'm all ears)

I didn't see any document about chain method in the doc of HTML::TableExtract. Maybe you're using an expired version?
But according to the doc, you could do this using the depth and count attributes:
$te = HTML::TableExtract->new(
headers => [qw(Unit DIS ENR ONS LEF ARR BUS REM COM)],
depth => 1,
count => 1
);
$te->parse($html_string);
depth: Specify how embedded in other tables your tables of interest
should be. Top-level tables in the HTML document have a depth of 0,
tables within top-level tables have a depth of 1, and so on.
count: Specify which table within each depth you are interested in,
beginning with 0.
In your case depth and count should be both 1.

Related

Join attempt throwing exceptions

I'm sure I'm overlooking something glaringly obvious and I apologize for the newbie question, but I've spent several hours back and forth through documentation for DBIx::Class and Catalyst and am not finding the answer I need...
What I'm trying to do is automate creation of sub-menus based on the contents of my database. I have three tables in the database to do so: maps (in which sub-menu items are found), menus (contains names of top-level menus), maps_menus (assigns maps to top-level menus). I've written a subroutine to return a hash of resultsets, with the plan of using a Template Toolkit nested loop to build the top-level and sub-menus.
Basically, for each top-level menu in menus, I'm trying to run the following query and (eventually) build a sub-menu based on the result:
select * FROM maps JOIN maps_menus ON maps.id_maps = maps_menus.id_maps WHERE maps_menus.id_menus = (current id_menus);
Here is the subroutine, located in lib/MyApp/Schema/ResultSet/Menus.pm
# Build a hash of hashes for menu generation
sub build_menu {
my ($self, $maps, $maps_menus) = #_;
my %menus;
while (my $row = $self->next) {
my $id = $row->get_column('id_menus');
my $name = $row->get_column('name');
my $sub = $maps_menus->search(
{ 'id_maps' => $id },
{ join => 'maps',
'+select' => ['maps.id_maps'],
'+as' => ['id_maps'],
'+select' => ['maps.name'],
'+as' => ['name'],
'+select' => ['maps.map_file'],
'+as' => ['map_file']
}
);
$menus{$name} = $sub;
# See if it worked...
print STDERR "$name\n";
while (my $m = $sub->next) {
my $m_id = $m->get_column('id_maps');
my $m_name = $m->get_column('name');
my $m_file = $m->get_column('map_file');
print STDERR "\t$m_id, $m_name, $m_file\n";
}
}
return \%menus;
}
I am calling this from lib/MyApp/Controller/Maps.pm thusly...
$c->stash(menus => [$c->model('DB::Menus')->build_menu($c->model('DB::Map'), $c->model('DB::MapsMenus'))]);
When I attempt to pull up the page, I get all sorts of exceptions, the top-most of which is:
[error] No such relationship maps on MapsMenus at /home/catalyst/perl5/lib/perl5/DBIx/Class/Schema.pm line 1078
Which, as far as I can tell, originates from the call to $sub->next. I take this as meaning I'm doing my query incorrectly and not getting the results I think I should be. However, I'm not sure what I'm missing.
I found the following lines, defining the relationship to maps, in lib/MyApp/Schema/Result/MapsMenus.pm
__PACKAGE__->belongs_to(
"id_map",
"MyApp::Schema::Result::Map",
{ id_maps => "id_maps" },
{ is_deferrable => 1, on_delete => "CASCADE", on_update => "CASCADE" },
);
...and in lib/MyApp/Schema/Result/Map.pm
__PACKAGE__->has_many(
"maps_menuses",
"MyApp::Schema::Result::MapsMenus",
{ "foreign.id_maps" => "self.id_maps" },
{ cascade_copy => 0, cascade_delete => 0 },
);
No idea why it's calling it "maps_menuses" -- that was generated by Catalyst. Could that be the problem?
Any help would be greatly appreciated!
I'd suggest using prefetch of the two relationships which form the many-to-many relationship helper and maybe using HashRefInflator if you don't need access to the row objects.
Note that Catalyst doesn't generate a DBIC (which is btw the official abbreviation for DBIx::Class, DBIx is a whole namespace) schema, SQL::Translator or DBIx::Class::Schema::Loader do. Looks at the docs of the module you've used to find out how to influence its naming.
Also feel free to change the names if they don't fit you.

Perl RRD::Simple no display data

I am new in Perl and also RRDs.
I have tried to implement a simple example, and although it seems that is operating correctly the output is not displayed. The pictures are produced normally but there is no data in the graphs.
I have been following the CPAN documentation for implementation RRD::Simple and theoretically I am doing something wrong. I tried to debug the code and it seems fine, but when it comes to print the graphs there is no data.
#!/usr/bin/perl
use strict;
use warnings;
use RRD::Simple ();
use Data::Dumper;
$| = 1; # Flush the output
my ($rrd, $unixtime, $file);
$file = "perl.txt";
my $path = '/home/os/Desktop/Test_Perl/';
my $period = '3years';
my $rrdfile = 'myfile.rrd';
while (sleep 15) {
open(FH, ">>", $file) || die "Unable to open $file: $!\n";
my $range = 50;
my $minimum = 100;
my $random_number_in = int(rand($range)) + $minimum;
my $random_number_out = int(rand($range)) + $minimum;
my $random_number_sec = int(rand($range)) + $minimum;
# Create an interface object
$rrd = RRD::Simple->new(
file => $rrdfile,
cf => [qw( AVERAGE MIN MAX LAST )],
#default_dstype => "DERIVE",
);
unless (-e $rrdfile) {
# Create a new RRD file with 3 data sources called
# bytesIn, bytesOut and faultsPerSec.
$rrd->create(
$period,
step => 5, # 5 sec interval
bytesIn => "GAUGE",
bytesOut => "GAUGE",
faultsPerSec => "GAUGE"
);
}
# Put some arbitary data values in the RRD file for the same
# 3 data sources called bytesIn, bytesOut and faultsPerSec.
$rrd->update(
bytesIn => $random_number_in,
bytesOut => $random_number_out,
faultsPerSec => $random_number_sec
);
print FH "This is the bytes_in: $random_number_in\n";
print FH "This is the bytes_out: $random_number_out\n";
print FH "This is the bytes_sec: $random_number_sec\n";
# Generate graphs:
# /home/os/Desktop/Test_Perl/myfile-hourly.png, /home/os/Desktop/Test_Perl/myfile-daily.png
# /home/os/Desktop/Test_Perl/myfile-weekly.png, /home/os/Desktop/Test_Perl/myfile-monthly.png
my %rtn = $rrd->graph(
$rrdfile,
destination => $path,
basename => "my_graph",
timestamp => "both", # graph, rrd, both or none
periods => [qw(hour day week month)], # omit to generate all graphs
sources => [qw(bytesIn bytesOut faultsPerSec)],
source_colors => [qw(ff0000 aa3333 000000)],
source_labels => [("Bytes In", "Bytes Out", "Faults Per Second")],
source_drawtypes => [qw(LINE1 AREA LINE)],
line_thickness => 2,
extended_legend => 1,
title => "Network Interface eth0",
vertical_label => "Bytes/Faults",
width => 800,
height => 500,
interlaced => "", # If images are interlaced they become visible to browsers more quickly
);
printf("Created %s\n", join(", ", map { $rtn{$_}->[0] } keys %rtn));
# Return information about an RRD file
my $info = $rrd->info($rrdfile); # This method will return a complex data structure containing details about the RRD file, including RRA and data source information.
print Data::Dumper::Dumper($info);
my #sources = $rrd->sources($rrdfile);
my $seconds = $rrd->retention_period($rrdfile); # This method will return the maximum period of time (in seconds) that the RRD file will store data for.
# Get unixtime of when RRD file was last updated
$unixtime = $rrd->last($rrdfile);
print FH "myfile.rrd was last updated at " . scalar(localtime($unixtime)) . "\n";
# Get list of data source names from an RRD file
my #dsnames = $rrd->sources;
print "Available data sources: " . join(", ", #dsnames) . "\n";
my $heartbeat_In = $rrd->heartbeat($rrdfile, "bytesIn");
my $heartbeat_Out = $rrd->heartbeat($rrdfile, "bytesOut");
my $heartbeat_sec = $rrd->heartbeat($rrdfile, "faultsPerSec"); # This method will return the current heartbeat of a data source.
printf "This is the heartbeat_in: %s\n", $heartbeat_In;
my #rtn_In = $rrd->heartbeat($rrdfile, "bytesIn", 10);
my #rtn_Out = $rrd->heartbeat($rrdfile, "bytesOut", 10);
my #rtn_sec = $rrd->heartbeat($rrdfile, "faultsPerSec", 10); # This method will set a new heartbeat of a data source.
close(FH);
}
Part of the output:
'myfilerrd' => {
'last_ds' => 'U',
'value' => undef,
'min' => '0',
'max' => undef,
'minimal_heartbeat' => 120,
'index' => 3,
'type' => 'DERIVE',
'unknown_sec' => 15
}
I do not understand why the value is undefined?
After 3-4 days of testing and searching over the Internet for more information I just found the answer to my problem. RRD is a very simple to use tool but very very powerful. I would recommend anybody to use it through Perl especially with RRD::Simple module is very easy.
Answer:
I was adjusting the heart beat of my RRD to 10 sec, while my step (data collection time) is 300 by default. If the user do not specify the step "sampling frequency" by default the system will use 300. In result the graph takes 0 values so there is not output. More information and very nice analysis can be found here HeartBeat
Based on my experimentation, I found that since I am using a while loop inside the create function I have to first give the command:
my $rrd = RRD::Simple->new( file => "myfile.rrd" );
and as a second step I had to kill the process and set the step by entering the command:
my $rrd = RRD::Simple->new(
file => "myfile.rrd",
step => 50 );
Based on my experimentation I found that I had to remove this block of code below had to be added to the file as a second step. First had to make the creation and then add it on my loop. This is because initially the "myfile.rrd" has to be created with all the settings, before the user start modifying them.
unless (-f "myfile.rrd") {
$rrd->create(
step => 50,
bytesIn => "GAUGE",
bytesOut => "GAUGE",
faultsPerSec => "COUNTER"
);
}
Another point that worth mentioning here is that by default RRD Data Source (DS) is set to GAUGE. More information can be found here RRDtool
The Perl module can be found easily CPAN RRD::Simple which provides analysis and extra "features" that you can add to your code.
In conclusion RRD::Simple is very simple, it can be executed by copy-paste into your program. Any further modifications (e.g sample rates, Average/Max/Min values etc.) need a bit of reading upon but definitely worth the effort. Unfortunately there is not much of examples online so some testing need it to be done in order to understand what I need to modify in my code to make it work. By writing this short analysis and providing some links to read upon I hope to save someone else from spending a few days to come up with the answer to his problem.
Again I encourage anyone to try implementing RRD's is a very powerful tool to graphically view your results and store the data up to 3 years.
Another update that I think is useful to some people maybe. Instead of following all this process by adding and removing code in order to make the rrd file working.
After modifications and experimentation I found another solution.
use strict;
use RRD::Simple;
use RRDs;
my $rrd = RRD::Simple->new(
file => "myfile.rrd",
rrdtool => "/usr/local/rrdtool-1.2.11/bin/rrdtool", #optional
tmpdir => "/var/tmp", #optional
cf => [ qw(AVERAGE MAX) ], #optional
default_dstype => "COUNTER", #optional
on_missing_ds => "add", #optional
RRDs::tune("myfile.rrd", "-i", "Source_Name:0") #optional -i or --minimum
RRDs::tune("myfile.rrd", "-a", "Source_Name:200") #optional -a or --maximum
);
There are several optional values that someone can use, but I recommend to use all of them so you can take full control of the program.
I am using:
default_dstype => "COUNTER", #optional
Because by default RRD's will set GAUGE as Data Source (DS). By setting the DS to COUNTER the user can set the minimum and maximum values. Short examples can be found here also RRD::Simple::Examples.

Rose::DB::Object::Manager query with a list of object ids

I'm trying to write a Rose::DB::Object query string using either an Array or a Hash, however I'm unsure how to do it. I'm trying to write an update function based off of certain ID's in a list that are enumerated in the array. I unfortunately do not have any other unique key to filter on to build the query, so I need to query specific ID's.
Essentially I am trying to programatically write the follow:
my $list = My::DB::Manager->get_items(query => [
{id => 1},
{id => 14},
{id => 210},
{id => 1102},
{id => 3151},
]);
This is the code I have so far, but I haven't been able to successfully achieve what I am trying to do:
use My::DB::Manager;
my #ary;
foreach (#_) {
my %col = ("id", $_);
push (#ary, \%col);
}
my $list = My::DB::Manager->get_items(query => \#ary);
...
./test.pl
Now the script just hangs with no output indefinately.
I'm trying to avoid iterating through the DB::Manager and making a DB call on a per record basis as this script will be run via cron every 60 seconds and has the potential to return large sets.
The query parameter takes a reference to an array of name/value pairs, not a reference to an array of hash references. If you want objects where the value of the id column is one of a list of values, then use the name id and a reference to an array of ids as the value. This code should work (assuming the id values are in #_):
$list = My::DB::Manager->get_items(query => [ id => \#_ ]);
You push strings into #ary when you need to push perl structures:
use My::DB::Manager;
my #ary;
foreach (#_) {
push (#ary, { id => $_ });
}
my $list = My::DB::Manager->get_items(query => [#ary]);
...
However, I think you can use query => [ id => [$id1, $id2, ... ], ...]:
use My::DB::Manager;
my $list = My::DB::Manager->get_items(query => [ id => \#_ ]);
...
Never used Rose, this based on docs of the module.

How can I use Perl and RRD to plot ping times?

I'm trying to do my first rrd graph through Perl.
I have tried RRD::Simple and rrds and just can't get either one to work.
Here's what I have so far:
use strict;
use RRD::Simple ();
# Create an interface object
my $rrd = RRD::Simple->new( file => "server.rrd" );
# Put some arbitary data values in the RRD file for the same
# 3 data sources called bytesIn, bytesOut and faultsPerSec.
$rrd->create(
EqSearch => "DERIVE",
MfSearch => "DERIVE",
EQCostBasis => "DERIVE",
MFCostBasis => "DERIVE"
);
$rrd->update(
EqSearch => 2,
MfSearch => 3,
EQCostBasis => 10,
MFCostBasis => 15
);
# Generate graphs:
# /var/tmp/myfile-daily.png, /var/tmp/myfile-weekly.png
# /var/tmp/myfile-monthly.png, /var/tmp/myfile-annual.png
my %rtn = $rrd->graph(
destination => "/Users/cmuench/Documents/Code/perl",
title => "Server A",
vertical_label => "",
interlaced => "",
periods => [qw(hour)]
);
The output is:
graph http://www.mediafire.com/imgbnc.php/a39e2bd662adefa823dca66351db637c5g.jpg
From your above script, the main issue is that you don't have enough data to show in graphs. you can see the data in your rrd using 'rrdtool fetch`.
If you can use bash instead of perl. Look at this "Round Trip and Packet Loss stats with rrdtool"
If you still want to use perl module RRD::Simple, please look at the examples provided with this module i.e. RRD::Simple::Examples Or provide more details about the problem you are facing.

How can I prettify Perl code generated by Perl?

I have a test generator written in Perl. It generates tests that connect to a simulator. These tests are themselves written in Perl and connect to the simulator via its API. I would like the generated code to be human-readable, which means I'd like it to be properly indented and formatted. Is there a good way to do it?
Details follow, or you can skip to the actual question below.
This is an example:
my $basic = ABC
TRIGGER => DELAY(
NUM => 500,
),
)
BASIC
my $additional = STATE_IS(
STATE => DEF,
INDEX => 0,
),
ADDITIONAL
I'd like the command ABC to be executed with a delay of 500 (units aren't relevant just now) after I call &event, and the state of index 0 is DEF. Sometimes I'll also want to wait for indeces 1, 2, 3 etc...
For only one index I'd like to see this in my test:
&event(
CMD => ABC
TRIGGER => DELAY(
NUM => 500,
TRIGGER => STATE_IS(
STATE => DEF,
INDEX => 0,
),
),
)
For two indeces I'd like to see:
&event(
CMD => ABC
TRIGGER => DELAY(
NUM => 500,
TRIGGER => STATE_IS(
STATE => DEF,
INDEX => 0,
TRIGGER => STATE_IS(
STATE => DEF,
INDEX => 1,
),
),
),
)
So basically I'm adding a block of:
TRIGGER => STATE_IS(
STATE => DEF,
INDEX => 0,
),
for each index, and the index number changes.
Here's how I'm doing it:
for $i (0..$num_indeces) {
# update the index number
$additional =~ s/(INDEX\s*=>\s*)\d+,/$1 $i,/;
$basic =~ s/(
(\),\s*) # capture sequences of ),
+ # as many as possible
\)\s* # end with ) without a ,
} )/$additional $1/sx; # replace with the additional data
Here's the actual question
The problem here is that the code comes out poorly indented. I'd like to run the resulting $basic through a prettifier like this:
&prettify($basic, "perl");
Which would format it nicely according to Perl's best practices. Is there any good way to do this?
PerlTidy makes your code not only tidy, but really beautiful. You can easily tweak it according to your local coding standards.
I have used this:
use Perl::Tidy;
sub Format {
my $source = shift;
my $result;
Perl::Tidy::perltidy(
source => \$source,
destination => \$result,
argv => [qw(-pbp -nst)]
);
return $result;
}