Bypass optional field in pattern match - perl

I'm trying to pull out the names of the players and totals, but in some cases there is an extra html tag following the number of the player in the list. So how can I bypass that extra field when it appears. I can't put parenthesis around it because it will try to match it, correct?
<tr><td>10<td>MANNY MACHADO - FA</td><td>37</td></tr>
<tr><td>107</td><td>ALEDMYS DIAZ - HOU</td><td>18</td></tr>
while($content =~ /<tr><td>\d+?\S+?<td>(.*?)\s-.*?<\/td><td>(\d+?)</g) {
my $player = $1;
my $total = $2;
print "\nPlayer => $player Total => $total\n";
}
I tried using the '\S+?' to bypass it, but in this case it doesn't print out anything where the number of the player is less than 10.

It is generally a bad idea to use regexes for HTML, XML, etc.
Instead you should use an appropriate parser to convert it to a DOM and then implement your algorithm in the DOM domain. Using your example:
parse the HTML from file or string
(find the correct table in the document - left out in the example as I don't have the complete HTML)
loop over the rows in the table
extract the information you are looking for from the columns of a row
#!/usr/bin/perl
use warnings;
use strict;
use HTML::TreeBuilder;
my $parser = new HTML::TreeBuilder;
my $root = $parser->parse_file(\*DATA)
or die "HTML\n";
foreach my $row ($root->look_down(_tag => 'tr')) {
if (my #columns = $row->look_down(_tag => 'td')) {
my $player = $columns[1]->as_text();
my $total = $columns[2]->as_text();
print "Player => $player Total => $total\n";
}
}
exit 0;
__DATA__
<body>
<tr><td>10<td>MANNY MACHADO - FA</td><td>37</td></tr>
<tr><td>107</td><td>ALEDMYS DIAZ - HOU</td><td>18</td></tr>
</body>
Test run:
$ perl dummy.pl
Player => MANNY MACHADO - FA Total => 37
Player => ALEDMYS DIAZ - HOU Total => 18

With Mojo::DOM:
use strict;
use warnings;
use Mojo::DOM;
my $html = <<'EOD';
<tr><td>10<td>MANNY MACHADO - FA</td><td>37</td></tr>
<tr><td>107</td><td>ALEDMYS DIAZ - HOU</td><td>18</td></tr>
EOD
my $dom = Mojo::DOM->new($html);
foreach my $tr ($dom->find('tr')->each) {
my #cells = $tr->children('td')->each;
my $player = $cells[1]->all_text;
my $total = $cells[2]->all_text;
# or alternatively
my $player = $tr->at('td:nth-of-type(2)')->all_text;
my $total = $tr->at('td:nth-of-type(3)')->all_text;
print "\nPlayer => $player Total => $total\n";
}

You need to match an optional </tr>, so You can do that with the following (?:<\/tr>)? in your regex. This makes a non capturing group, because of the ?: at the start, that matches 0 or 1 times. So your new regex is
/<tr><td>\d+(?:<\/td>)?<td>(.*?)\s-.*?<\/td><td>(\d+?)</g
Normally I'd add a bit about not using regex to parse HTML, but as this is not well formed HTML I'll let it pass. However if you can exercise some control over what is creating the HTML try and fix it so that the <td> and </td> tags are balanced.

I'm also someone who would go for a proper HTML or XML modul to extract information like others above already said. So I will not elaborate on that.
If I would have to extract from the wrongly formatted html you showed, I'd stick with a multi step aproach.
cleanup
extract
more cleanup
For cleanup I'd first check what's common. In this case every line starts with <tr> so I'd settle for that to find my lines, skipping those not starting with <tr>, after some optional whitespace:
while (<>) {
next unless /^\s*<tr>/;
The next common thing I noticed is that every interesting field starts with td. So I'd replace it with something more easy like a tab. Assuming there could be tabs already, I'd first replace them with spaces:
tr/\t/ /;
s/<td>/\t/g;
Now what I have is some tags sprinkled around the data I really need. And the data I really need is prepended with a tab. So let's delete the tags:
s/<.*?>//g;
Finally I can extract my data:
my($dummy, $number, $player, $total)= split /\t/;
But since the player has some stuff appended (after -) let's remove that as well
$player=~ s/\s-.*//;
print "\nPlayer => $player Total => $total\n";
}
Putting it together and using DATA:
while (<DATA>) {
next unless /^\s*<tr>/;
tr/\t/ /;
s/<td>/\t/g;
s/<.*?>//g;
my($dummy, $number, $player, $total)= split /\t/;
$player=~ s/\s-.*//;
print "\nPlayer => $player Total => $total\n";
}
__DATA__
<tr><td>10<td>MANNY MACHADO - FA</td><td>37</td></tr>
<tr><td>107</td><td>ALEDMYS DIAZ - HOU</td><td>18</td></tr>
Please be prepared that you might come across data with more whitespace and the approach will fail.
Example:
<tr>
<td>10
<td>MANNY MACHADO - FA</td>
<td>37</td>
</tr>
<tr><td>107</td>
<td>ALEDMYS DIAZ - HOU</td>
<td>18</td>
</tr>

Related

How can I extract these table rows from HTML? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 2 years ago.
Improve this question
I want to find a team score by entering team after matching then show next number or word after team name to be able to display score.
And I want this to apply on left and right side teams (left side: FT Crystal Palace 1 - right side: 1 Leicester)
The main issue is to find team score from our matching entered team.
#!/usr/bin/perl -wT
use strict;
use warnings;
use CGI;
use HTML::TableExtract;
use LWP::UserAgent;
my $cgi = CGI->new;
my $ua = LWP::UserAgent->new;
my $input = $cgi->param("team");
my $response ='';
if ($input) {
my $req = HTTP::Request->new(GET => "");
my $res = $ua->request($req);
my $html = $res->decoded_content;
$html =~ s/<span.*?<\/span>//gs;
$html =~ s/<script.*?<\/script>//gs;
$html =~ s/<td(?=[^>]*class="events-button button first-occur")[^>]*>//gs;
my $table = HTML::TableExtract->new();
$table->parse($html);
print "Content-type: text/html\n\n";
foreach my $row ($table->rows) {
my $output = join("", #$row);
$output =~ s/\R//g;
print $output;
# Find matching team (Its a must to find team after word FT)
if ($output =~ m/$input/) {
# After that team the next is score (Gets team scores)
my $score = $output =~ /$input\s*?(\S+)/;
$response ="Found Team And Score is: $score";
} else {
$response ="Can't find team";
}
}
}
print <<EOF;
<!DOCTYPE html>
<html>
<body>
<h1>Test</h1>
<form method="post">
<label for="team">Enter Team:</label>
<input type="text" name="team"><br><br>
<input type="submit" value="Find team">
</form>
<h4>$response</h4>
</body>
</html>
EOF
I like using Mojolicious for these things because everything I need is built in. With a little CSS Selector magic, you can easily zero in on the data you want without manual string processing.
Looking at the source, I see that the page has results in a table with class matches_new and within that table, the interesting rows have the class match. Within those rows, the interesting table cells are in classes team-a, team-b, and score-time (although the "score" may also be a game status).
Note that many of the rows don't load their data until you click on the team name. That's JavaScript at work and a Perl web scraping library isn't going to help you there. I don't know why the English Premier League already has data filled in for me (is it that way for everyone?) or if we can rely on that always being the case.
But let's extract that data.
The basic Mojo HTML-parsing process is to make a request then get the DOM (Document Object Model). In that DOM, find and at locate particular parts. find gets everything that matches its CSS Selector, and at gets the next match.
So, I find all the right rows, and each row is another (smaller) DOM object that I can explore further to get the teams and score.
The find returns a Mojo::Collection (fancy interface to an array) and I call map to process each row in turn and extract all the text from each table cell (and trim removes the leading and trailing whitespace). The all_text gets everything, including text in child nodes.
#!perl
use v5.10;
use Mojo::UserAgent;
use Mojo::Util qw(dumper trim);
my $url = 'https://int.soccerway.com/matches/2020/12/28/';
my $ua = Mojo::UserAgent->new;
my #results = $ua
->get( $url )
->result
->dom
->find( 'table.matches_new tr.match' )
->map( sub {
my $row = $_;
my #results =
map { trim( $row->at( $_ )->all_text ) }
qw( td.team-a td.team-b td.score-time );
return \#results;
} )
->to_array;
say dumper( #results );
Here is the output, with just the soccer matches :
[
[
"Everton",
"Manchester City",
"PSTP"
],
[
"Crystal Palace",
"Leicester City",
"1 - 1"
],
[
"Chelsea",
"Aston Villa",
"1 - 1"
]
]
You aren't strictly "wrong" for using LWP or HTTP::Request, but they don't come with powerful tools to handle the data. The HTML::TableExtract is fine, but that table it gets is long and you don't want most of its rows. In my career, I've written many, many programs like the one you presented: grab the source, remove are much irrelevant HTML as you can, and process what's left. Having all the tools in one package where everything is designed to work together in the same milieu is much nicer.
Targeting exactly what you want with selectors is much easier, less fragile, and much less code. You can see this in the Mojo docs and I wrote about in it detail with lots of examples in Mojo Web Clients.
Also, you can just use the Mojo::DOM stuff as long as you have the HTML, no matter the source. I write about this in Extracting from HTML with Mojo::DOM for Perl.com. For you, take your decoded content and give it to Mojo::DOM and do the same processing I've already done:
my $html = HTTP::Request->new( ... )->request( ... )->decoded_content;
my $dom = Mojo::DOM->new( $html );
my #results = $dom->find( ... )->...;

Find Favicons in HTML using Perl

I'm trying to look for favicons (and variants) for a given URL using Perl (I'd like to avoid using an external service such as Google's favicon finder). There's a CPAN module, WWW::Favicon, but it hasn't been updated in over a decade -- a decade in which now important variants such as "apple-touch-icon" have come to replace the venerable "ico" file.
I thought I found the solution in WWW::Mechanize, since it can list all of the links in a given URL, including <link> header tags. However, I cannot seem to find a clean way to use the "find_link" method to search for the rel attribute.
For example, I tried using 'rel' as the search term, hoping maybe it was in there despite not being mentioned in the documentation, but it doesn't work. This code returns an error about an invalid "link-finding parameter."
my $results = $mech->find_link( 'rel' => "apple-touch-icon" );
use Data::Dumper;
say STDERR Dumper $results;
I also tried using other link-finding parameters, but none of them seem to be suited to searching out a rel attribute.
The only way I could figure out how to do it is by iterating through all links and looking for a rel attribute like this:
my $results = $mech->find_all_links( );
foreach my $result (#{ $results }) {
my $attrs = $result->attrs();
#'tag' => "apple-touch-icon"
foreach my $attr (sort keys %{ $attrs }) {
if ($attrs->{'rel'} =~ /^apple-touch-icon.*$/) {
say STDERR "I found it:" . $result->url();
}
# Add tests for other types of icons here.
# E.g. "mask-icon" and "shortcut icon."
}
}
That works, but it seems messy. Is there a better way?
Here's how I'd do it with Mojo::DOM. Once you fetch an HTML page, use dom to do all the parsing. From that, use a CSS selector to find the interesting nodes:
link[rel*=icon i][href]
This CSS selector looks for link tags that have the rel and href tags at the same time. Additionally, I require that the value in rel contain (*=) "icon", case insensitively (the i). If you want to assume that all nodes will have the href, just leave off [href].
Once I have the list of links, I extract just the value in href and turn that list into an array reference (although I could do the rest with Mojo::Collection methods):
use v5.10;
use Mojo::UserAgent;
my $ua = Mojo::UserAgent->new->max_redirects(3);
my $results = $ua->get( shift )
->result
->dom
->find( 'link[rel*=icon i][href]' )
->map( attr => 'href' )
->to_array
;
say join "\n", #$results;
That works pretty well so far:
$ perl mojo.pl https://www.perl.org
https://cdn.perl.org/perlweb/favicon.ico
$ perl mojo.pl https://www.microsoft.com
https://c.s-microsoft.com/favicon.ico?v2
$ perl mojo.pl https://leanpub.com/mojo_web_clients
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/apple-touch-icon-57x57-b83f183ad6b00aa74d8e692126c7017e.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/apple-touch-icon-60x60-6dc1c10b7145a2f1156af5b798565268.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/apple-touch-icon-72x72-5037b667b6f7a8d5ba8c4ffb4a62ec2d.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/apple-touch-icon-76x76-57860ca8a817754d2861e8d0ef943b23.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/apple-touch-icon-114x114-27f9c42684f2a77945643b35b28df6e3.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/apple-touch-icon-120x120-3819f03d1bad1584719af0212396a6fc.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/apple-touch-icon-144x144-a79479b4595dc7ca2f3e6f5b962d16fd.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/apple-touch-icon-152x152-aafe015ef1c22234133158a89b29daf5.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/favicon-16x16-c1207cd2f3a20fd50de0e585b4b307a3.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/favicon-32x32-e9b1d6ef3d96ed8918c54316cdea011f.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/favicon-96x96-842fcd3e7786576fc20d38bbf94837fc.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/favicon-128x128-e97066b91cc21b104c63bc7530ff819f.png
https://d3g6anj9jkury9.cloudfront.net/assets/favicons/favicon-196x196-b8cab44cf725c4fa0aafdbd237cdc4ed.png
Now, the problem comes if you find more interesting cases that you can't easily write a selector for. Suppose not all of the rel values have "icon" in them. You can get a little more fancy by specifying multiple selectors separated by commas so you don't have to use the experimental case insensitivity flag:
link[rel*=icon][href], link[rel*=ICON][href]
or different values in rel:
link[rel="shortcut icon"][href], link[rel="apple-touch-icon-precomposed"][href]
Line up as many of those as you like.
But, you could also filter your results without the selectors. Use Mojo::Collection's grep to pick out the nodes that you want:
my %Interesting = ...;
my $results = $ua->get( shift )
->result
->dom
->find( '...' )
->grep( sub { exists $Interesting{ $_->attr('rel') } } )
->map( attr => 'href' )
->to_array
;
I have a lot more examples of Mojo::DOM in Mojo Web Clients, and I think I'll go add this example now.
The problem is very easy to solve with:
assistance of any module allowing to load webpage
define $regex for all possible favicon variations
look for <link rel="$regex" href="icon_address" ...>
Note:
The script has default YouTube url embedded in the code
use strict;
use warnings;
use feature 'say';
use HTTP::Tiny;
my $url = shift || 'https://www.youtube.com/';
my $icons = get_favicon($url);
say for #{$icons};
sub get_favicon {
my $url = shift;
my #lookup = (
'shortcut icon',
'apple-touch-icon',
'image_src',
'icon',
'alternative icon'
);
my $re = join('|',#lookup);
my $html = load_page($url);
my #icons = ($html =~ /<link rel="(?:$re)" href="(.*?)"/gmsi);
return \#icons;
}
sub load_page {
my $url = shift;
my $response = HTTP::Tiny->new->get($url);
my $html;
if ($response->{success}) {
$html = $response->{content};
} else {
say 'ERROR: Could not extract webpage';
say 'Status: ' . $response->{status};
say 'Reason: ' . $response->{reason};
exit;
}
return $html;
}
Run as script.pl
https://www.youtube.com/s/desktop/8259e7c9/img/favicon.ico
https://www.youtube.com/s/desktop/8259e7c9/img/favicon_32.png
https://www.youtube.com/s/desktop/8259e7c9/img/favicon_48.png
https://www.youtube.com/s/desktop/8259e7c9/img/favicon_96.png
https://www.youtube.com/s/desktop/8259e7c9/img/favicon_144.png
https://www.youtube.com/img/desktop/yt_1200.png
Run as script.pl "http://www.microsoft.com/"
https://c.s-microsoft.com/favicon.ico?v2
Run as script.pl "http://finance.yahoo.com/"
https://s.yimg.com/cv/apiv2/default/icons/favicon_y19_32x32_custom.svg

Get HTML within an <a> element using WWW::Mechanize

I'm extracting special links within an HTML page by using WWW::Mechanize.
my $mech = WWW::Mechanize->new();
$mech->get( $uri );
my #links = $mech->find_all_links(url_regex => qr/cgi-bin/);
for my $link ( #links ) {
# try to get everything between HERE
}
The links look like this
<div><div><span>foo bar</span> I WANT THIS TEXT</div></div>
By using $link->text I get foo bar I WANT THIS TEXT without knowing which text was inside the <span> element.
Is there any way to get the raw HTML code instead of the stripped text?
In other words I need to find a way to only get I WANT THIS TEXT without knowing the exact text within the <span> tag.
As simbabque has said you can't do that with WWW::Mechanize
In fact there's very little point in using WWW::Mechanize if you don't want any of its features. If all you're using it for is to fetch a web page, then use LWP::UserAgent instead. WWW::Mechanize is just a subclass of LWP::UserAgent with lots of additional stuff that you don't want
Here's an example that uses HTML::TreeBuilder to construct a parse tree of the HTML and locate the links that you want. I've used HTML::TreeBuilder because it's pretty good at tolerating malformed HTML in a way similar to modern browsers
I've been unable to test it as you haven't provided proper sample data and I'm not inclined to create my own
use strict;
use warnings 'all';
use feature 'say';
use WWW::Mechanize;
use HTML::TreeBuilder;
my $mech = WWW::Mechanize->new;
$mech->get('http://www.example.com/');
my $tree = HTML::TreeBuilder->new_from_content($mech->content);
for my $link ( #{ $tree->extract_links('a') } ) {
my ($href, $elem, $attr, $tag) = #$link;
# Exclude non-CGI links
next unless $link =~ /cgi-bin/;
# Find all immediate child text nodes and concatenate them
# References are non-text children
my $text = join ' ', grep { not ref } $elem->content_list;
next unless $text =~ /\S/;
# Trim and consolidate spaces
$text =~ s/\A\s+|\s+\z//g;
$text =~ s/\s+/ /g;
say $text;
}

Perl HTML::Tokeparser get raw html between tags

i am using TokeParser to extract tag contents.
...
$text = $p->get_text("/td") ;
...
usually it will return the text cleaned up. What I want is to return everthing between td and /td but including all other html elements. How to do that.
I am using the example in this tutorial. thanks
In the example,
my( $tag, $attr, $attrseq, $rawtxt) = #{ $token };
I believe there is some trick to do with $rawtxt .
HTML::TokeParser does not have a built-in feature to do this. However, it's possible by looking at each token between <td>s individually.
#!/usr/bin/perl
use strictures;
use HTML::TokeParser;
use 5.012;
# dispatch table with subs to handle the different types of tokens
my %dispatch = (
S => sub { $_[0]->[4] }, # Start tag
E => sub { $_[0]->[2] }, # End tag
T => sub { $_[0]->[1] }, # Text
C => sub { $_[0]->[1] }, # Comment
D => sub { $_[0]->[1] }, # Declaration
PI => sub { $_[0]->[2] }, # Process Instruction
);
# create the parser
my $p = HTML::TokeParser->new( \*DATA ) or die "Can't open: $!";
# fetch all the <td>s
TD: while ( $p->get_tag('td') ) {
# go through all tokens ...
while ( my $token = $p->get_token ) {
# ... but stop at the end of the current <td>
next TD if ( $token->[0] eq 'E' && $token->[1] eq 'td' );
# call the sub corresponding to the current type of token
print $dispatch{$token->[0]}->($token);
}
} continue {
# each time next TD is called, print a newline
print "\n";
}
__DATA__
<html><body><table>
<tr>
<td><strong>foo</strong></td>
<td><em>bar</em></td>
<td><font size="10"><font color="#FF0000">frobnication</font></font>
<p>Lorem ipsum dolor set amet fooofooo foo.</p></td>
</tr></table></body></html>
This program will parse the HTML document in the __DATA__ section and print everything including HTML between <td> and </td>. It will print one line per <td>. Let's go through it step by step.
After reading the documentation, I learned that each token from HTML::TokeParser has a type associated with it. There are six types: S, E, T, C, D and PI. The doc says:
This method will return the next token found in the HTML document, or
undef at the end of the document. The token is returned as an array
reference. The first element of the array will be a string denoting
the type of this token: "S" for start tag, "E" for end tag, "T" for
text, "C" for comment, "D" for declaration, and "PI" for process
instructions. The rest of the token array depend on the type like
this:
["S", $tag, $attr, $attrseq, $text]
["E", $tag, $text]
["T", $text, $is_data]
["C", $text]
["D", $text]
["PI", $token0, $text]
We want to access the $text stored in these tokens, because there is no other way to grab stuff that looks like HTML tags. I therefore created a dispatch table to handle them in %dispatch. It stores a bunch of code refs that get called later.
We read the document from __DATA__, which is convenient for this example.
First of all, we need to fetch the <td>s by using the get_tag method. #nrathaus's comment pointed me that way. It will move the parser to the next token after the opening <td>. We don't care about what get_tag returns since we only want the tokens after the <td>.
We use the method get_token to fetch the next token and do stuff with it:
But we only want to do that until we find the corresponding closing </td>. If we see that, we next the outer while loop labelled TD.
At that point, the continue block gets called and prints a newline.
If we are not at the end, the magic happens: the dispatch table; As we saw earlier, the first element in the token array ref holds the type. There is a code ref for each of these types in %dispatch. We call it and pass the complete array ref $token by going $coderef->(#args). We print the result on the current line.
This will produce stuff like <strong>, foo, </strong> and so on in each run.
Please note that this will only work for one table. If there is a table within a table (something like <td> ... <td></td> ... </td>) this will break. You would have to adjust it to take remember of how many levels deep it is.
Another approach would be to use miyagawa's excellent Web::Scraper. That way, we have a lot less code:
#!/usr/bin/perl
use strictures;
use Web::Scraper;
use 5.012;
my $s = scraper {
process "td", "foo[]" => 'HTML'; # grab the raw HTML for all <td>s
result 'foo'; # return the array foo where the raw HTML is stored
};
my $html = do { local $/ = undef; <DATA> }; # read HTML from __DATA__
my $res = $s->scrape( $html ); # scrape
say for #$res; # print each line of HTML
This approach can also handle multi-dimensional tables like a charm.

How can I dynamically generate an HTML tables containing 100 rows each in Perl?

use POSIX;
my $test = "";
my $elements = scalar(#array);
my $tablecount = ($elements / 100);
my $tblnum = ceil($tablecount);
my #hundred;
foreach $test (#array) {
until ($tblcnt == $tblnum){
#hundred(#array, 0, 99);
print qq~<table id="$tblcnt"><tr><td>~;
foreach $test (#hundred){
print qq~<tr><td>$test</td></tr>~;
}
print qq~</table>~;
$tblcnt++;
}
}
UG!!! I am trying to learn this but, I just cannot get it right!!!
I am trying to dynamically generate "x" number of html tables filled with up to 100 lines of data each.
I can get the table count needed, I can loop, I can add but, one thing is for sure: I CANNOT WRITE CODE.
Here is the result wanted:
1- I have 1027 lines of data from an array.
2- I want my script to make 11 html tables with up to 100 lines each. ((Layers actually) which by default will be not visible via css. That way I can do some show hide on the tables like a "next prev" navigation. I don't need help with the cross browser css.)
3- IF there is a better way, a method that I can comprehend anyhow, other than using visible= method, please elaborate!
I gave up trying to have Perl make pages of 100 with "next prev" links to the data so I resorted to using css javascript onclick=yadayada to "display the data".
I thought it would be easier to shift off 100 lines of the array in a loop and put the lines in their own html tables. Not.
I have failed miserably and need assistance.
I think you need to spend more time learning the basics of Perl and CGI before writing any scripts.
It is useful to separate logic from presentation in CGI scripts. To that end, I find HTML::Template very useful.
The following script will generate an HTML document containing 100 tables with 100 rows of 10 cells each. It will take its sweet time doing that.
#!/usr/bin/perl
use strict; use warnings;
use CGI::Simple;
use HTML::Template;
my $tmpl = HTML::Template->new(scalarref => page_template() );
my #tables;
for my $t (1 .. 100) {
my #rows;
for my $r (1 .. 100) {
push #rows, { CELLS => [ map { CELL => $_ }, 1 .. 10 ] };
}
push #tables, { ID => "table_$t", ROWS => \#rows }
}
$tmpl->param(TABLES => \#tables);
my $cgi = CGI::Simple->new;
print $cgi->header('text/html');
$tmpl->output(print_to => \*STDOUT);
sub page_template {
return \ <<EO_TMPL
<!DOCTYPE HTML>
<html>
<head><title>Tables Example</title></head>
<body>
<TMPL_LOOP TABLES>
<table id="<TMPL_VAR ID>">
<TMPL_LOOP ROWS>
<tr>
<TMPL_LOOP CELLS>
<td><TMPL_VAR CELL></td>
</TMPL_LOOP>
</tr>
</TMPL_LOOP>
</table>
</TMPL_LOOP>
</body>
</html>
EO_TMPL
;
}
my $cnt = 0;
while (#array) {
my #rows = splice #array, 0, 100;
print qq(<table id="t$cnt">\n);
for my $row (#rows) {
print "<tr><td>$row</td></tr>\n";
}
print "</table>\n";
++$cnt;
}
You may want to use HTML::Table for generating HTML.