Why is the same module listed three times at metacpan.org? - perl

Searching for Devel::Peek at metacpan.org gives the following screen shot:
Why is the module listed three times? (It looks a little bit strange, and could easily confuse the user..)

Oddly enough, there's one missing. The following are Devel::Peek's official distributions:
perl
Devel-Peek
These two distributions are returned when searching search.cpan.org, and the only two distributions returned when searching search.cpan.org.
Being part of the perl distribution and part of its own distribution is called being a "dual-lifed" module. It allows the module to be bundled with Perl without having to upgrade Perl to upgrade the module.
I don't know why meta::cpan doesn't pick up the official distribution, and I don't know why it doesn't flag the other distributions as unofficial. You could alert the site's maintainers of the problem.
Conversely, I don't know why search.cpan.org doesn't return CookBookA and CookBookB, and why it doesn't flag theses other distributions as unofficial when one goes to it directly. I think it has to do with the fact that Devel::Peek is only present as a documentation file (.pod) —not a module (.pm)— in them.

Related

Simplest way to get a comprehensive listing of package names available in CPAN?

Suppose that, as a private project, I have implemented a Perl package, and tested it, both formally and through extensive everyday use. I find the package useful and solid enough to warrant submitting it to CPAN.
Up to this point, since the package has been a private project, I have not worried too much about the package's name, but now that I want to submitted to CPAN, however, I would like the package's name to fit well within the ecology of package names already in CPAN.
In order to find a suitable "CPAN name" for my package, I would have to inspect a comprehensive listing of all these package names1.
What is the simplest way to get this comprehensive listing of names of packages in CPAN?
ObPedantry
(IOW, if the question above is already clear enough for you, you may safely ignore what follows.)
I don't think that I can give a technically correct formal definition of what I mean here by "package name", so let me at least give an "operational definition".
If, for example, the one-liner
$ perl -MFoo::Bar::Baz -c -e 1
fails with an error beginning with
Can't locate Foo/Bar/Baz.pm in #INC ...
..., but after installing some distributions from CPAN, the same oneliner succeeds with
-e syntax OK
...then I'll say that "Foo::Bar::Baz is a package name in CPAN".
(We could split hairs over the package/module distinction, and consider scenarios in which the distinction matters, but please let's not.)
Furthermore, if after inspecting the list this question asks about I discover that, on the one hand, there are in fact many eminent package names in CPAN that begin with the prefix Foo::Bar::, and on the other, there are none (or negligibly few) that begin with the prefix Fubar::, then this would be a good enough reason for me to change the name of my Fubar::Frobozz package to Foo::Bar::Frobozz before submitting it to CPAN.
1 Of course, after inspecting such a list, I may discover that my package does not add sufficiently new functionality relative to what's already available in CPAN to warrant submitting my package to CPAN after all.
If you have run cpan before, you have downloaded a comprehensive package and distribution list under <cpan-home>/sources/modules/02packages.details.txt.gz.
A fresh copy is available on any CPAN mirror, e.g.
http://www.cpan.org/modules/02packages.details.txt.gz .
PAUSE::Packages can do what you want, however you probably want to use this list, but http://prepan.org/ can provide advice/review before submission to cpan, with of course reading on the naming of modules first.
Are you sure that's a thing you want? There are 33,623 distributions on CPAN at the time of writing. Within cpan you can enter
cpan> d /./
That's d for distributions followed by a regex pattern that matches the names you're interested in
If you're really interested in packages -- and a distribution may contain multiple package names -- you need
cpan> m/./
where m is for modules. There are 163,136 of those, which means there's an average of four or five packages per distribution, and it takes cpan a few minutes to generate the list. (I'm sorry, I didn't monitor the exact time.)
You could use MetaCPAN::Client
I found this article which gives the idea about using this module.
#!/usr/bin/perl
use strict; use warnings; use MetaCPAN::Client;
my $mcpan = MetaCPAN::Client->new();
my $release_results = $mcpan->release({ status => 'latest' } );
while ( my $release = $release_results->next ) {
printf "%s v%s\n", $release->distribution, $release->version;
}
Currently this gave me 32601 result like this:
Proc-tored v0.11
Locale-Utils-PlaceholderBabelFish v0.004
Perinci-To-Doc v0.83
Mojolicious-Plugin-Qooxdoo v0.905
App-cdnget v0.05
Baal-Parser v0.01
Acme-DoOrDie v0.001
Net-Shadowsocks v0.9.0
MetaCPAN-Client v2.006000
This modules also gives information about release, module, author, and file & uses Elasticsearch.
It also get updated regularly on every MetaCPAN API change.

Check if Perl module exists on Cpan

I have a long list of Perl modules. I know some of them, but most not. I need to know which of them are available on Cpan. I know, i can copy and paste each item of the list into the cpan.org, but i want to learn a bit more about Perl and do this with a perl script.
I know that:
cpan -D ModName
will tell me which version is localy installed, and which is the latest on Cpan. Is there another way to get this info?
You can check this tutorial by Gabor Szabo. Not exactly related, but by using MetaCPAN::API->module($module_name); while looping on an array of your #modules_names, you could build a report of which module is available or not on CPAN. Not tested, but I wonder that if the module is not available, it will return undef.
Note that on the documentation of MetaCpan::API, it's author considers the module as deprecated and recommend the use of MetaCPAN::Client.
With smonff's tips I found Module::CheckVersion
which exactly does what I want, THX.

Why is "package" keyword sometimes separated by a comment from the package name?

Analyzing sources of CPAN modules I can see something like this:
...
package # hide from PAUSE
Try::Tiny::ScopeGuard;
...
Obviously, it's taken from Try::Tiny, but I have seen this kind of comments between package keyword and package identifier in other modules too.
Why this procedure is used? What is its goal and what benefits does it have?
It is indeed a hack to hide a package from PAUSE's indexer.
When a distribution is uploaded to PAUSE, the indexer will examine each file in the upload, looking for the names of packages that are included in the distribution. Any indexed packages can show up in CPAN search results.
There are many reasons for not wanting the indexer to discover your packages. Your distribution may have many small or insignificant packages that would clutter up the search results for your module. You may have packages defined in your t (test) directory or some other non-standard directory that are not meant to be installed as part of the distribution. Your distribution may include files from a completely different distribution (that somebody else wrote).
The hack works because the indexer strictly looks for the keyword package and an expression that looks like a package name on the same line.
Nowadays, you can include a META.yml file with your distribution. The PAUSE indexer will look for and respect a no_index specification in this file. But this is a relatively new capability of the indexer so older modules and old-timer CPAN contributors will still use the line break hack.
Here's an example of a no_index spec from Forks::Super
no_index:
directory:
- t
- inc
package:
- Sys::CpuAffinity
- Signals::XSIG
- Signals::XSIG::Default
- Signals::XSIG::TieArray56
Sys::CpuAffinity and Signals::XSIG are separate distributions that are also packaged with Forks::Super. Some of the test scripts contain package declarations (e.g., Arbitrary::Test::Package) that shouldn't be indexed.
Okay, here's another shot at this phenomenon ... I've been whacky-hacking Perl for a dozen years and I've rarely seen this packy hack and possibly simply ignored and never bothered to investigate. One thing seems clear, though. There's some hackish processing going on at PAUSE that's been crafted in the good ol' Perl'n'UNIX school of thought that without the shadow of a doubt involves line-oriented text parsing, so they parse those Perl files, possibly even using grep, but rather perl itself, who knows, to extract package names and then kick of some procedure or get some stats or whatnot. And to trip up this procedure and hack around its ways the author splits the package declaration in two lines so the hacky packy grep job doesn't have a clue that there's a package declared right under its nose and the programmer is happy about his hacky skills and the PAUSE stats or whatever it is they're cobbling together are as they should be. Does that make sense?

Is there any current review of statistical modules for Perl?

I would like to know which is the current status of the statistical modules in CPAN, does any one know any recent review or could comment about its likes/dislikes with those modules?
I have used the clasical: Statistics::Descriptive, Statistics::Distributions, and some others contained in Bundle::Math::Statistics
Some of the modules has not been updated for long time. I don't know if this is because they are rock solid or has been overtaken by better modules.
Does someone know any current review similar to this old one:
Using Perl for Statistics: Data Processing and Statistical Computing
NB (for the people that will suggest to use R ;-)):
All my code is mainly in perl, but I use R a lot for statistics and plotting. I usually prepare the dataframes with perl and write the R script in the perl modules as templates and save to a file and execute them from perl. But sometimes you have small data sets where efficiency is not an issue (well I am using perl insn't it ;-)) and you want to add some statistics and histograms to your report produced with perl.
PDL, the Perl Data Language is alive and thriving so its worth taking a look at that.
And I think the other stats modules you mention are OK. For eg. Statistics::Descriptive is up-to-date and has been used in answers to a few questions here on Stackoverflow.
NB. There is also a Perl to R bridge called Statistics::R which looks interesting.
/I3az/

How can I list all CPAN modules depending on a given module?

How can I list all CPAN modules depending on a given module? For example, create a list of modules using Class::Workflow?
There's two valid questions about dependencies:
What modules does the given module require?
The reversed question: Which modules depend on the given module?
For the former, the authoritative but non-recursive answer is usually to look at the META.yml file that's part of most modern distributions. If there is no such file, you may try looking at the Makefile.PL or Build.PL build tools that ship with it. If you want to know all dependencies and not just the direct ones, cf. ghostdog74's answer. Specifically, David Cantrell's 'CPANDeps' is very, very handy.
Obviously, the latter question is impossible to answer by inspecting the module itself. If you don't want to grep an unpacked minicpan, the best solution is something like the "used by" section of a module's CPANTS entry.
you can use a module such as CPAN::Dependency, or try this online , among many others.
I’ve found CPAN::Dependency and CPAN::FindDependencies on CPAN, they might help you.
Some other options:
Module::ScanDeps
Perl::PrereqScanner