Why isn't my CPAN distribution indexed by PAUSE? - perl

I've uploaded my stasis distribution to PAUSE, but it isn't in the index.
I thought this was because it didn't contain a package, so I added a package declaration to the stasis script in v0.04 like this:
#!/usr/bin/env perl
package stasis;
package main;
...
but it still wasn't indexed.
Is there anyway to get this distribution indexed that doesn't involve creating a boilerplate module file? (e.g. adding lib/stasis.pm to the distribution).

I believe CPAN does not index scripts.
IMO your best option is to make a module that allows doing programmatically what your script does (and make the script use it).
You could put in a fake module or make it think your script is a module (I think listing it in provides works), but I wouldn't if I were you.

Because your package statement was not in a *.pm file.
The PAUSE indexer is open source. It is a little complicated to unpack, but the regex for extracting a package name in a distribution is in PAUSE::pmfile::packages_per_pmfile, which is a method and a package that is meant to process *.pm files only.
The PAUSE::dist::_index_by_meta method provides the alternate method of declaring a package through the provides keyword in the metafile.

Related

Perl - Use all modules from specified subdirectory and solve it's dependencies automatically

I have two modules:
./My/Module1
./My/Module2
Module1 is using subroutines from Module2. So in my script i typed following:
use My::Module1
use My::Module2
But this does not worked and perl complained that subroutines which are used from Module2 by Module1 does not exists. So I added following line to Module1:
use My::Module2
Finally this worked as expected.
I am wondering if there is some solution that will include all modules from specified sub-directory tree and solve dependencies automatically. I do not want to type use keyword in modules which depends on another modules. Following commands was tried but it did not worked (either by syntax errors or it used wrong modules):
use My::;
use My::*;
use My;
Also I would ask if this cross-using modules and calling it's subroutines is considered as a good practice in perl programming?
PS: #INC contains current directory so loading modules is working.
PPS: Modules used Exporter
I do not want to type use keyword in modules which depends on another modules.
Then type the BEGIN, require, and import keywords instead?
Seriously, there's no good way for this to work. Just use use in each module so that it can load the things it needs.
Also I would ask if this cross-using modules and calling it's subroutines is considered as a good practice in perl programming?
Yes. Modularization is considered good practice in all programming.

Does %INC contain all dependencies

I'm interested in knowing which modules a script I'm working with uses (I did not write it from scratch, so I'm not sure). I know that %INC contains modules used by my script, but does it also contain modules used by those modules?
Yes, it does. Every successful require or use adds the module to %INC. (This includes optional modules if they were loaded.) Look at the pseudocode for require in its documentation.

Something wrong with #INC in perl?

Let's imagine we are new to perl and wrote some great module MyModule.pm. We also wrote some great script myscript.pl, that need to use this module.
use strict;
use warnings;
use MyModule;
etc...
Now we will create dir /home/user/GreatScript and put our files in it. And trying to run myscript.pl...
cd /home/user/GreatScript
perl myscript.pl
Great! Now moving to another dir...
cd /
perl /home/user/GreatScript/myscript.pl
Getting some not very usefull error about #INC and list of paths. What is this? Now after some googling we know that #INC contains paths where to search our modules, and this error means that Perl can't find out MyModule.pm.
Now we can:
Add path to system PERL5LIB var, it will add our path to the beginning of #INC
Install our module to one of dirs from #INC
Manually add our path to #INC in the BEGIN section
Add use lib '/home/user/GreatScript'; to our script, but it looks bad. What if we will move our script to other dir?
Or use FindBin module to find our current dir, and use this path in use lib "$FindBin::Bin";, but it's not 100% effective, for example some bugs with mod_perl or issues...
Or use __FILE__ (or $0) variable and extract path from it using abs_path method from Cwd module. Looks like a yet another bicycle?
Or some forgotten or missing ( Did I miss something? )
Why this obvious and regular operation need a lot of work? If I have one script, it's easy... But why I need to write this few additional lines of code to all my scripts? And it will not work 100%! Why perl not adding current script dir to #INC by default, like it does with "."?
PS: I was searching for answer to my question but find only list of solutions from the list above and some others. I hope this question is duplicate...
Or some forgotten or missing ( Did I miss something? )
Another option: Don't keep your module with your script. The module is separate because it is re-usable, so put it in a library folder. This can be anything from a local library for personal projects, included with use lib (and perhaps referencing an environment variable you have set up for the project), to making the module into a CPAN library, and letting cpan manage where it goes. What you decide to do depends on how re-usable your code is, beyond how the script uses it.
Why this obvious and regular operation need a lot of work?
It is not a lot of work in the scheme of things. Your expectation that files grouped together in the file system, should automatically be used by Perl at the language level to resolve require or use statements is not true. It's not that it is impossible, or that your expectation is unreasonable, just that Perl is not implemented that way. Changing it after the fact could affect how many existing projects work, and may be contentious - so is unlikely to change.
If your use case is: "I want to have a script which accesses additional modules or other resources in the same or relative directory, and I don't want to install everything", then FindBin is a perfect solution. The limitation mentioned in the manpage happens only in persistent environments like mod_perl, but here I would propose different solutions (e.g. using FindBin only within httpd.conf, not in the scripts/modules).
(/me is a heavy FindBin user, and I am also the author of half of the KNOWN ISSUES section of the FindBin manpage)
That is need in order to know which module would you like to use. When you move to another catalog as you said perl looks in '.' catalog so if run:
cd /
perl /home/user/GreatScript/myscript.pl
and if MyModule.pm in '/' perl will find it and will be use in myscrpit.pl. Now as Perl finds module in the #INC in order which catologs in it you have to keep an eye on #INC.
Summary: obvious and regular operation is need to prevent using wrong module with the same name what you want to use.

Why is "package" keyword sometimes separated by a comment from the package name?

Analyzing sources of CPAN modules I can see something like this:
...
package # hide from PAUSE
Try::Tiny::ScopeGuard;
...
Obviously, it's taken from Try::Tiny, but I have seen this kind of comments between package keyword and package identifier in other modules too.
Why this procedure is used? What is its goal and what benefits does it have?
It is indeed a hack to hide a package from PAUSE's indexer.
When a distribution is uploaded to PAUSE, the indexer will examine each file in the upload, looking for the names of packages that are included in the distribution. Any indexed packages can show up in CPAN search results.
There are many reasons for not wanting the indexer to discover your packages. Your distribution may have many small or insignificant packages that would clutter up the search results for your module. You may have packages defined in your t (test) directory or some other non-standard directory that are not meant to be installed as part of the distribution. Your distribution may include files from a completely different distribution (that somebody else wrote).
The hack works because the indexer strictly looks for the keyword package and an expression that looks like a package name on the same line.
Nowadays, you can include a META.yml file with your distribution. The PAUSE indexer will look for and respect a no_index specification in this file. But this is a relatively new capability of the indexer so older modules and old-timer CPAN contributors will still use the line break hack.
Here's an example of a no_index spec from Forks::Super
no_index:
directory:
- t
- inc
package:
- Sys::CpuAffinity
- Signals::XSIG
- Signals::XSIG::Default
- Signals::XSIG::TieArray56
Sys::CpuAffinity and Signals::XSIG are separate distributions that are also packaged with Forks::Super. Some of the test scripts contain package declarations (e.g., Arbitrary::Test::Package) that shouldn't be indexed.
Okay, here's another shot at this phenomenon ... I've been whacky-hacking Perl for a dozen years and I've rarely seen this packy hack and possibly simply ignored and never bothered to investigate. One thing seems clear, though. There's some hackish processing going on at PAUSE that's been crafted in the good ol' Perl'n'UNIX school of thought that without the shadow of a doubt involves line-oriented text parsing, so they parse those Perl files, possibly even using grep, but rather perl itself, who knows, to extract package names and then kick of some procedure or get some stats or whatnot. And to trip up this procedure and hack around its ways the author splits the package declaration in two lines so the hacky packy grep job doesn't have a clue that there's a package declared right under its nose and the programmer is happy about his hacky skills and the PAUSE stats or whatever it is they're cobbling together are as they should be. Does that make sense?

How does Perl's lib pragma work?

I use use lib "./DIR" to grab a library from a folder elsewhere. However, it doesn't seem to work on my server, but it works fine on my local desktop.
Any particular reasons?
And one more question, does use lib get propagated within several modules?
Two situations:
Say I make a base class that requires a few libraries, but I know that it needs to be extended and the extended class will need to use another library. Can I put the use lib command in the base class? or will I need to put it in every extending class?
Finally, can I just have a use package where package contains a bunch of use lib, will it propagate the use lib statements over to my current module? <-- I don't think so, but asking anyways
The . in your use lib statement means "current working directory" and will only work when your script is run from the right directory. The server's idea of cwd is probably something different (or undefined). Assuming that the library directory is co-located with with script and private to it you want to do something like this instead:
use FindBin;
use lib "$FindBin::Bin/DIR";
A use lib statement affects #INC -- the list of locations perl searches when you use or require a module. It globally affects the current instance of the interpreter. You should really only put use lib statements in scripts, not in modules.
In principle, you could have a package MyLibs that consisted of a bunch of use lib statements and then use MyLibs before using any of the packages in those locations, but I wouldn't recommend it.
There's no way to know why it isn't working on your server without more information. In particular, check your server's error logs, and dump #INC somewhere if necessary, and compare that to your actual library paths.
use lib modifies #INC, which is global, so as long as you execute your use lib before other packages try to include stuff, it will work and all other packages will see the new include paths.
For more on #INC, see its entry in perlvar.