How do I create graphs in Perl on Windows? - perl

How do I use Perl to create graphs?
I'm running a scheduled job that creates text reports. I'd like to take this to the next step (for the management) and also create some graphs to go along with them. Is this possible / feasible? It'd be great if I could do this using Office somehow.
Update: solutions I'm going to investigate, in this order:
Spreadsheet::WriteExcel (this seems to have changed since the last time I investigated it .... wait, this was suggested by the author of the module. Cool.)
GD Graph - this is now available for ActivePerl (it wasn't the last time I looked)
SVG
Open Charts look interesting.
Chartdirector

GD and GD::Graph are probably your best bets, you can use them to create images that you can then embed into whatever you need.

All of the methods mentioned above are really good, but personally I like SVG::TT::Graph. I really like the power that SVG gives you to draw really nice-looking graphs.

Also you can take a look at Google Charts CPAN module
use Google::Chart;
my $chart = Google::Chart->new(
type => "Bar",
data => [ 1, 2, 3, 4, 5 ]
);
print $chart->as_uri, "\n"; # or simply print $chart, "\n"
$chart->render_to_file( filename => 'filename.png' );

At work we have used the excellent Chartdirector.
It's not free, but is very cheap (maybe 50 bucks or so). The cost is well worth it, as the API and docs are both excellent (way better than GD!), so easily saved more than that amount of my time.
There's also a free version, which includes a small yellow banner advertising the product on each chart - to be honest if this is for personal use, you can go for that as it's really not very intrusive at all.
Chartdirector is available for lots of platforms (Win, Linux, Solaris, BSD, OSX) and has an API for lots of languages, too (Perl, ASP, .NET, Java, PHP, Python, Ruby, C++).
The output is easy on the eye, as you can see at their examples page.

Sorry for blowing my own trumpet, but you might be interested to have a look at some slides I did for a short presentation about Graphing With Perl.
It mentions some of the suggestions here, but also gives you some code snippets that you might be able to use to help you get the most out of what you're doing.

Depending on the complexity of your graph, simply generating a command file for Gnuplot—or GraphViz/Dotty, depending on what kind of graph you are referring to—might do the trick?
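As a rough sketch of the Gnuplot route, the scheduled job could print a command file alongside its text report. The title, output file name, axis labels and data points below are made-up placeholders, and the script assumes a Gnuplot build with PNG support:

```perl
use strict;
use warnings;

# Build a Gnuplot command file as a string. The title, PNG file name and
# data points are hypothetical placeholders for values taken from a report.
sub gnuplot_script {
    my ($title, $png_file, $points) = @_;

    my $script = join "\n",
        'set terminal png size 800,600',
        qq{set output "$png_file"},
        qq{set title "$title"},
        'set xlabel "Day"',
        'set ylabel "Jobs"',
        # '-' makes Gnuplot read the data inline, up to a line containing "e"
        q{plot '-' using 1:2 with linespoints title "jobs"};

    $script .= "\n";
    $script .= join '', map { "$_->[0] $_->[1]\n" } @$points;
    $script .= "e\n";
    return $script;
}

my @points = ([1, 12], [2, 15], [3, 9], [4, 21]);
print gnuplot_script('Jobs per day', 'report.png', \@points);
```

Saved as, say, make_plot.pl, this would be run as "perl make_plot.pl > report.gp" followed by "gnuplot report.gp".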

The Perl module Spreadsheet::WriteExcel allows you to create Excel workbooks that include charts.
You first have to create the type of chart that you want in Excel and then extract it out using a utility called chartex which is installed with Spreadsheet::WriteExcel.
The chart template can then be added to a new workbook and made to reference new data.
The documentation is here and there are several examples in the charts directory of the distro.
The mechanism is a little inflexible, however, and it is sometimes tricky to get the exact result that you want.

Haven't tried it yet but Chart::Clicker looks quite nifty.
I think it uses the Cairo graphic library (alternative to GD) but is actually built on top of Graphics::Primitive which is an "interesting" graphics agnostic package.
The author in question (GPHAT) seems to be putting together some integrated tools for producing reports... http://www.onemogin.com/blog/582-pixels-and-painting-my-recent-cpan-releases
On a side note... have used both ChartDirector and OFC and both are good (especially if web based).

Spreadsheet::WriteExcel::Chart
You might need something like strawberry or vanilla Perl to get this to compile. Or PPM might have the module.
Tutorial link:
http://search.cpan.org/dist/Spreadsheet-WriteExcel/charts/charts.pod

It won't work with Office, but I really like Chart::OFC which will create Open Flash Charts. Very slick looking and easy to use.

It depends to a great extent what sort of graphs (the look of them), and the data-source. I've had some good result by using the YUI Charts and feeding them some JSON style versions of the original source data. Rolling over a live chart for exact values is quite easy for example. There are plenty of examples on the developer pages.

If you're set on doing this in MS Office you can use the Win32::OLE module to control Excel via OLE. Be warned, that this tends to run slowly and it can be difficult to find documentation for Excel's API. On the plus side, it allows you to do pretty much everything that you can do manually.

Metaprogramming, of course! Output an R script that creates the graph.
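A minimal sketch of that idea, assuming R is installed and that the labels contain no double quotes or backslashes. The file name, title and numbers are invented for illustration:

```perl
use strict;
use warnings;

# Turn labels and values into an R script that draws a bar chart.
# Labels are assumed to contain no double quotes or backslashes.
sub r_barplot_script {
    my ($png_file, $title, $labels, $values) = @_;

    my $names = join ', ', map { qq{"$_"} } @$labels;
    my $nums  = join ', ', @$values;

    return <<"END_R";
png("$png_file", width = 800, height = 600)
values <- c($nums)
barplot(values, names.arg = c($names), main = "$title")
dev.off()
END_R
}

print r_barplot_script('report.png', 'Jobs per day',
    [qw(Mon Tue Wed)], [12, 15, 9]);
```

Redirect the output to a file such as chart.R and run it with "Rscript chart.R" to get the PNG.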

PGPlot does great graphs. There are some examples here. It works fine with Perl 5.8.8 but is broken in 5.10.0

Spreadsheet::WriteExcel will let you just get the data into Excel, then write Excel equations for the graphs.

Related

How can I get started with web page scraping using Perl?

I am interested in learning Perl. I am using Learning Perl books and cpan's web-sites for reference.
I am looking forward to writing a web/text scraping application in Perl to apply what I have learnt.
Please suggest some good options to begin with.
(This is not homework. I want to do something in Perl that would help me exploit basic Perl features.)
If the web pages you want to scrape require JavaScript to function properly, you are going to need more than what WWW::Mechanize can provide you. You might even have to resort to controlling a specific browser via Perl (e.g. using Win32::IE::Mechanize or WWW::Mechanize::Firefox).
I haven't tried it, but there is also WWW::Scripter with the WWW::Scripter::Plugin::JavaScript plugin.
As others have said, WWW::Mechanize is an excellent module to use for web scraping tasks; you'll do well to learn how to use it, it can make common tasks very easy. I've used it for several web scraping tasks, and it just takes care of all the boring stuff - "go here, find a link with this text and follow it, now find a form with fields named 'username' and 'password', enter these values and submit the form...".
Scrappy is also well worth a look - it lets you do a lot with very little code - an example from its documentation:
my $spidy = Scrappy->new;
$spidy->crawl('http://search.cpan.org/recent', {
'#cpansearch li a' => sub {
print shift->text, "\n";
}
});
Scrappy makes use of Web::Scraper under the hood, which you might want to look at too as another option.
Also, if you need to extract data from HTML tables, HTML::TableExtract makes this dead easy - you can locate the table you're interested in by naming the headings it contains, and extract data very easily indeed, for example:
use HTML::TableExtract;
my $te = HTML::TableExtract->new( headers => [qw(Date Price Cost)] );
$te->parse($html_string) or die "Didn't find table";
foreach my $row ($te->rows) {
    print join(',', @$row), "\n";
}
The most popular web scraping module for Perl is WWW::Mechanize, which is excellent if you can't just retrieve your destination page but need to navigate to it using links or forms, for instance, to log in. Have a look at its documentation for inspiration.
If your needs are simple, you can extract the information you need from the HTML using regular expressions (but beware your sanity), otherwise it might be better to use a module such as HTML::TreeBuilder to do the job.
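For the simple case, a regular-expression approach might look like the sketch below. It is deliberately naive: it only handles double-quoted href attributes and links with no nested tags, and the sample HTML is invented. Anything messier is where a parser module earns its keep.

```perl
use strict;
use warnings;

# Pull href/text pairs out of simple anchor tags with a regular expression.
# Only copes with double-quoted href attributes and links without nested
# tags; real-world HTML usually needs a proper parser instead.
sub extract_links {
    my ($html) = @_;
    my @links;
    while ($html =~ m{<a\s[^>]*?href="([^"]*)"[^>]*>([^<]*)</a>}gi) {
        push @links, { href => $1, text => $2 };
    }
    return @links;
}

my $html = <<'END_HTML';
<ul>
  <li><a href="/dist/GD">GD</a></li>
  <li><a class="x" href="/dist/SVG">SVG</a></li>
</ul>
END_HTML

printf "%s => %s\n", $_->{text}, $_->{href} for extract_links($html);
```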
A module that seems interesting, but that I haven't really tried yet, is WWW::Scripter. It's a subclass of WWW::Mechanize, but has support for Javascript and AJAX, and also integrates HTML::DOM, another way to extract information from the page.
Try the Web-Scraper Perl module. A beginners tutorial can be found here.
It's safe, easy to use and fast.
You may also want to have a look at my new Perl wrapper over Java HtmlUnit. It is very easy to use, e.g. look at the quick tutorial here:
http://code.google.com/p/spidey/wiki/QuickTutorial
By tomorrow I will publish some detailed installation instructions and a first release.
Unlike Mechanize and alike you get some JavaScript support and it is way faster and less memory demanding than screen scraping.

How could I find files that use certain modules in CPAN?

Some modules on CPAN are excellently documented, others.... not so much, but it's usually easy to discern how to use a module via prior art (e.g modules/tests that used the module you're looking to use). I'm wondering what the best way is to find code that uses the code you're looking to use.
example
I want to use (maybe?) Dist::Zilla::App::Tester for something, but the author has elected not to write any documentation on how to use it, so I'm wondering what the path of least resistance is to find code that already uses it.
please don't answer for this module
Give a man a fish; you have fed him for today. Teach a man to fish; and you have fed him for a lifetime
Try Google Code Search, trying to search for strings like "use Dist::Zilla::App::Tester" (quotes are important).
Use CPANTS - The CPAN Testing Service web site.
Search for the distribution
Click Other dists requiring this
Here is the page for Dist-Zilla
As an aside, you can always read the source by hitting the Source button on the top of the page on search.cpan.org. In this case, the package doesn't have much code to begin with. Also, many big modules these days have ::Cookbooks, ::Manuals or ::Tutorials; Dist-Zilla has one too.
My guess is ::Tester just supplies the dzil test command through its test_dzil sub.
One option is to use Google Code Search (Google for that phrase for a link :) ); unioned with pure googling. Search for "use my::module::name" string.
If the module name is not something well-searchable (e.g. too many hits), may be combine with "
For searches over CPAN, I suggest CPAN Grep over Google code search.
For more complex searches, I'd write a very small program using CPAN::Visitor and a minicpan.
For quick dependency checking, I'd use the not-perfect-but-very-good CPANDB.
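If you already have distributions unpacked locally (for example from a minicpan), a plain directory scan is one low-tech way to find prior art. The sketch below uses the core File::Find module rather than CPAN::Visitor, and the directory and module name are whatever you pass in. It only spots direct "use" or "require" lines, so "use base" or "use parent" loads are missed.

```perl
use strict;
use warnings;
use File::Find;

# Return the files under $dir that load $module with "use" or "require".
# $dir is assumed to hold unpacked distributions; both arguments are
# supplied by the caller.
sub files_using {
    my ($dir, $module) = @_;
    my @hits;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path && $path =~ /\.(?:pm|pl|t)\z/;
            open my $fh, '<', $path or return;
            while (my $line = <$fh>) {
                # (?!::) stops Foo::Bar from matching Foo::Bar::Baz
                if ($line =~ /^\s*(?:use|require)\s+\Q$module\E\b(?!::)/) {
                    push @hits, $path;
                    last;
                }
            }
            close $fh;
        },
    }, $dir);
    return sort @hits;
}

# Usage: perl find_users.pl ./unpacked Some::Module
if (@ARGV == 2) {
    print "$_\n" for files_using(@ARGV);
}
```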

Is Perl's CAM::PDF able to aggregate Annotation objects?

I have several copies of the same PDF file. These copies have annotations in it (Rect type with pop-up comments).
I want to know if I can get all these annotations from these copies and aggregate them into a single master copy using CAM::PDF (or another free tool).
an example to illustrate:
I have file1_userA.pdf and file1_userB.pdf. They are both annotated.
I want to generate file1_allusers.pdf aggregating annotations from both files file1_userA.pdf and file1_userB.pdf.
ps: I have the original un-annotated copy.
-- EDIT (Aug, 4):
I have developed an extension for CAM::PDF, namely CAM::PDF::Annot. It 'use base's CAM::PDF and adds extra functionality regarding Drawing Markup Annotations.
I am in the process of tidying up the code so I can post it to CPAN.
-- EDIT (Aug, 19)
I have finally submitted it to PAUSE, but I am running into some world writable related problems...
In any case, if anyone is interested in taking a look at the code, I will try and make it available somewhere... until then, just PM me and I will mail it to you.
Geez, I'm getting such a thrill posting a module to CPAN... I have found the joy of working with Perl...
best regards,
Donato Azevedo
I'm the author of CAM::PDF. I have built only very limited support for annotations to date, specifically just for form field filling. So, no, that's not a supported feature today. The feature you describe is very interesting, though, and I can imagine that others would use it too, so I'd be interested in discussing it further with you offline.

Visualize Friend of a Friend (foaf) graph

I wrote a script to export twitter friends as foaf rdf description. Now I'm looking for a tool to visualize the friend networks. I tried http://foafscape.berlios.de/ but for 300+ Nodes it is really slow and does a bad job on auto formatting.
Any hints for good graph visualization tools? It's ok if they do not support foaf directly, but they should be able to use images for graph nodes and be able to display large graphs. Linux support would be nice.
Oh, and I'm searching for an interactive tool where I can move nodes by hand.
Update: Thanks for your input. I know graphviz and for static images it is really great. But for large datasets I need to be able to select nodes and highlight all neighbours.
Prefuse looks great: http://prefuse.org/gallery/graphview/
Through prefuse I found vizster, which is exactly what I was searching for (I just need to find some source code): http://jheer.org/vizster/
Perhaps the prefuse visualization toolkit might help you. It's based on Java and has many sample apps including a graph viewer.
You could try Graphviz. It runs on Linux, Windows and Mac OS X and it will generate an image (PNG, PS, etc) of the graph. You will have to transform your foaf data into its own custom language, but it's pretty easy to learn.
I don't know of any program that auto-generates graph visualizations and allows you to interactively adjust nodes, but Graphviz is a really popular tool for graph visualization. It can export to SVG so you can edit the result in your favorite vector graphics editor.
As recommended by other posters, definitely Graphviz. It takes an input file, let's call it foaf.dot, in the following format:
graph G {
"George Formby" [shape=custom, shapefile="file:formby.png"];
"Michael Jackson" [shape=custom, shapefile="file:jackson.png"];
"George Formby" -- "Michael Jackson";
"Fred Flinstone" -- "Michael Jackson";
"Michael Jackson" -- "Steve McQueen";
}
Note that this file describes an undirected graph (hopefully your friendships are reciprocal). The syntax for directed graphs is similar.
In order to output your graph to a pdf file (assuming that you have already installed graphviz) run the following command
dot -Tpdf foaf.dot > foaf.pdf
Graphviz supports a number of output formats other than pdf, see its documentation for details.
I find that the 'dot' program usually provides the best output results, however graphviz contains a total of 5 layout programs. From the documentation:
dot - filter for drawing directed graphs
neato - filter for drawing undirected graphs
twopi - filter for radial layouts of graphs
circo - filter for circular layout of graphs
fdp - filter for drawing undirected graphs
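Since the FOAF export is already scripted, the foaf.dot file can be generated the same way. This is only a sketch: the friend pairs are hard-coded placeholders standing in for whatever the export produces, and node images are left out.

```perl
use strict;
use warnings;

# Emit an undirected Graphviz graph from a list of [person, friend] pairs.
# The pairs passed in below are hypothetical sample data.
sub dot_graph {
    my (@edges) = @_;
    my %seen;
    my $dot = "graph G {\n";
    for my $edge (@edges) {
        # Sort each pair so A--B and B--A count as the same friendship
        my ($from, $to) = sort @$edge;
        next if $seen{"$from\0$to"}++;
        # Escape double quotes so names stay valid DOT quoted strings
        s/"/\\"/g for $from, $to;
        $dot .= qq{    "$from" -- "$to";\n};
    }
    return $dot . "}\n";
}

print dot_graph(
    ['George Formby',   'Michael Jackson'],
    ['Michael Jackson', 'George Formby'],   # same friendship, reversed
    ['Michael Jackson', 'Steve McQueen'],
);
```

Redirect the output to foaf.dot and feed it to the dot command shown above, or to neato or fdp for an undirected layout.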
I previously recommended Graphviz, but thought I should add another recommendation now that I have used Gephi, a newer tool than a lot of the stuff here. It's a very powerful interactive graph exploration tool which I have found much more usable and much faster than a lot of the alternatives here.
Try using Google Social Graph. In one of the talks at dConstruct08 last week there was a social graph showing the friend connections of Robert Scoble.
http://code.google.com/apis/socialgraph/
http://dconstruct.org/2008
If you're using Java, you could use JGraph.
I know Adobe Flex has a few graph visualization components out there, and of course that would enable the app to run on Flash which has an excellent penetration rate into your potential userbase. I'd Google up the Flex SpringGraph component, and check that out. There are a ton of graphing components in the wild for Flex, both paid and free versions.
Just one SpringGraph off the top of my head:
http://www.adobe.com/cfusion/exchange/index.cfm?event=extensionDetail&extid=1048510
check this forum: http://goosebumps4all.net/34all/bb/forumdisplay.php?fid=28
for some flare examples, there is a friend of a friend graph around there.
have you tried the Python-based IDE NodeBox (1.0)? That's what I used to generate these: http://givememydata.com/#images
vizster looks cool though, I'll check out that.
For large graphs, Gephi (http://gephi.org/) is very popular. It is highly customisable, with lots of layout and presentation options.

Stand-alone charts in GWT

I've been trying to get pretty charts to work in GWT on our internal network.
Playing around with GWT-Ext's charts is nice, but it requires flash and is really messy to control (it seems buggy, in general).
I'd like to hear about something that works with the fewest dependencies, and it must also work without a connection to the web (so Google's Charts API isn't a solution).
Edit: Indeed, I would rather a library that is all client-side.
I'm building a GWT chart library based on Flot: http://gflot.googlecode.com
I hope you find it useful. Contact me if you have any questions.
Googling for "GWT +sparklines" has gotten me to gchart, which seems like what I need.
From what I understand - it's all client side and requires nothing more than their JAR file.
Google's charts actually come in two flavours, and one of them does not require interaction with Google's servers - so should satisfy your needs.
Google Image Charts is the API you are thinking of, which is an API on Google's servers that returns images.
Google Interactive Charts is a client side javascript API that renders entirely within the browser: Google Interactive Charts
Google provides a GWT wrapper for the interactive charts: GWT Visualization API
It's not all rainbows and unicorns and you can find chart libs out there that make nicer charts, but it's pretty solid, works on all major browsers and we've been using it successfully for quite a while.
http://code.google.com/p/ext-ux-ofcgxt/ is a nice option if you're using ext-gwt
Do you want something that has a server side component or entirely client driven? The best ones I have seen are all flash, alas. I have done little tricks with JS and GWT before, but there is only so sophisticated I will get before I go hunting for a library to do it for me.
There is also "sparklines" - they are available in lots of flavours (very simple charts though).
gchart looks seriously awesome. Go with it !
If you're looking for client-side, check out flotr, which is based on the prototype javascript library, or flot, which is based on jQuery. Both work well, though flot seems like it's got a bigger backing.
If you are willing to go with flash, XML/SWF is a wonderful tool
+1 flot, requires jQuery though, so might not play well with GWT, I haven't used that.
Another flash option, with a pre-built GWT integration - Open Flash Chart / ofcgwt.
I think that gwt-chart is a better framework for you.
Well.. I've used the Yahoo UI chart library (which GWT-Ext uses internally). Pretty neat solution, though still in the beta stage.
Let us know the conclusion you arrive at..
There is one open source api for charts in GWT hosted on http://code.google.com/p/gwt-rcharts/ . The API works on SVG/VML specification. You may find it quite easy to implement and use. You may find the demo at http://gwt-rcharts.appspot.com/