Something wrong with @INC in Perl?

Let's imagine we are new to Perl and have written a great module, MyModule.pm. We have also written a great script, myscript.pl, that needs to use this module.
use strict;
use warnings;
use MyModule;
etc...
Now we create the directory /home/user/GreatScript, put our files in it, and try to run myscript.pl...
cd /home/user/GreatScript
perl myscript.pl
Great! Now moving to another dir...
cd /
perl /home/user/GreatScript/myscript.pl
We get a not very useful error about @INC and a list of paths. What is this? After some googling we know that @INC contains the paths Perl searches for modules, and this error means that Perl can't find MyModule.pm.
Now we can:
Add the path to the PERL5LIB environment variable, which puts our path at the beginning of @INC
Install our module into one of the directories in @INC
Manually add our path to @INC in a BEGIN block
Add use lib '/home/user/GreatScript'; to our script, but that looks bad. What if we move our script to another directory?
Or use the FindBin module to find the script's directory and pass that path to use lib "$FindBin::Bin";, but it's not 100% effective; for example, there are some bugs with mod_perl and other issues...
Or use the __FILE__ (or $0) variable and extract the path from it using the abs_path function from the Cwd module. Looks like reinventing the wheel yet again?
Or something I forgot or missed (did I miss something?)
Why does this obvious and regular operation need so much work? If I have one script, it's easy... But why do I need to add these few extra lines of code to all my scripts? And it still won't work 100% of the time! Why doesn't Perl add the current script's directory to @INC by default, like it does with "."?
PS: I searched for an answer to my question but found only the solutions listed above and a few others. I hope this question is a duplicate...
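For reference, the "manual" options from the list above (a BEGIN block combined with __FILE__ and Cwd's abs_path) could be sketched roughly as follows. This is only a minimal illustration; the variable name $script_dir is made up for the example, and the commented-out use MyModule line assumes the module sits next to the script.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Basename qw(dirname);

my $script_dir;
BEGIN {
    # Resolve the directory that contains this file and put it first in @INC.
    # This has to happen at compile time, before any "use MyModule;" is seen.
    $script_dir = dirname(abs_path(__FILE__));
    unshift @INC, $script_dir;
}

# use MyModule;   # assumed to live next to the script; now it would be found

print "Modules are searched first in: $INC[0]\n";
```

This is essentially what use lib does, minus a few extras such as adding architecture-specific subdirectories.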

Or something I forgot or missed (did I miss something?)
Another option: Don't keep your module with your script. The module is separate because it is re-usable, so put it in a library folder. This can be anything from a local library for personal projects, included with use lib (and perhaps referencing an environment variable you have set up for the project), to making the module into a CPAN library, and letting cpan manage where it goes. What you decide to do depends on how re-usable your code is, beyond how the script uses it.
Why does this obvious and regular operation need so much work?
It is not a lot of work in the scheme of things. Your expectation that Perl should automatically use files grouped together in the file system to resolve require or use statements does not match how Perl works. It's not that it is impossible, or that your expectation is unreasonable, just that Perl is not implemented that way. Changing it after the fact could affect how many existing projects work, and may be contentious - so it is unlikely to change.
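As a rough sketch of the "local library referenced through an environment variable" idea mentioned above: the variable name GREATSCRIPT_LIB and the fallback path are purely hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# GREATSCRIPT_LIB is a hypothetical project variable; the fallback path is
# just an example location for a personal library folder.
BEGIN { $ENV{GREATSCRIPT_LIB} //= '/home/user/perl5lib'; }
use lib $ENV{GREATSCRIPT_LIB};

# use MyModule;   # would now be looked up in that folder first

print "Library folder on the search path: $ENV{GREATSCRIPT_LIB}\n";
```

The BEGIN block is needed because use lib runs at compile time, so the default has to be in place before that line is compiled.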

If your use case is: "I want to have a script which accesses additional modules or other resources in the same or relative directory, and I don't want to install everything", then FindBin is a perfect solution. The limitation mentioned in the manpage happens only in persistent environments like mod_perl, but here I would propose different solutions (e.g. using FindBin only within httpd.conf, not in the scripts/modules).
(/me is a heavy FindBin user, and I am also the author of half of the KNOWN ISSUES section of the FindBin manpage)
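A minimal sketch of that use case, assuming the extra modules live either next to the script or in a lib subdirectory beside it (the lib name is an assumption made for this example):

```perl
use strict;
use warnings;
use FindBin qw($Bin);

# $Bin is the directory containing the running script, regardless of the
# current working directory. "lib" is an assumed subdirectory name.
use lib $Bin, "$Bin/lib";

# use MyModule;   # found in $Bin or $Bin/lib

print "Script directory: $Bin\n";
```

With this in place, perl /home/user/GreatScript/myscript.pl should behave the same whether it is started from / or from the script's own directory.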

That is needed so that Perl knows which module you want to use. When you move to another directory, as you said, Perl looks in '.' (the current directory), so if you run:
cd /
perl /home/user/GreatScript/myscript.pl
and a MyModule.pm exists in '/', Perl will find it and use it in myscript.pl. Since Perl searches the directories in @INC in the order they are listed, you have to keep an eye on @INC.
Summary: this extra work is needed to prevent Perl from loading a wrong module that happens to have the same name as the one you want to use.
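One way to "keep an eye on @INC" is to print the search order and then check %INC, which records the file each loaded module actually came from. The sketch below uses File::Basename only because it is a core module; any module would do. Note that on Perl 5.26 and later the default list no longer includes ".".

```perl
use strict;
use warnings;
use File::Basename ();   # any module will do; this one ships with Perl

# Directories are searched in exactly this order.
print "Search order:\n";
print "  $_\n" for @INC;

# %INC maps "Some/Module.pm" to the file that was actually loaded.
my $loaded_from = $INC{'File/Basename.pm'};
print "File::Basename was loaded from $loaded_from\n";
```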

Related

Perl - Use all modules from specified subdirectory and solve its dependencies automatically

I have two modules:
./My/Module1
./My/Module2
Module1 uses subroutines from Module2. So in my script I typed the following:
use My::Module1;
use My::Module2;
But this did not work, and Perl complained that the subroutines Module1 uses from Module2 do not exist. So I added the following line to Module1:
use My::Module2;
Finally this worked as expected.
I am wondering if there is some solution that will include all modules from a specified subdirectory tree and resolve dependencies automatically. I do not want to type the use keyword in modules that depend on other modules. I tried the following commands, but they did not work (either because of syntax errors or because the wrong modules were used):
use My::;
use My::*;
use My;
I would also like to ask whether cross-using modules like this and calling their subroutines is considered good practice in Perl programming.
PS: @INC contains the current directory, so loading modules works.
PPS: Modules used Exporter
I do not want to type the use keyword in modules that depend on other modules.
Then type the BEGIN, require, and import keywords instead?
Seriously, there's no good way for this to work. Just use use in each module so that it can load the things it needs.
I would also like to ask whether cross-using modules like this and calling their subroutines is considered good practice in Perl programming.
Yes. Modularization is considered good practice in all programming.
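To illustrate the remark about BEGIN, require, and import: a use statement is roughly shorthand for those three pieces. The sketch below uses the core module List::Util as a stand-in for My::Module2, which is an assumption made only so the example runs on its own.

```perl
use strict;
use warnings;

# Roughly what "use List::Util qw(sum);" does behind the scenes:
BEGIN {
    require List::Util;          # load the file at compile time
    List::Util->import('sum');   # bring sum() into the current package
}

my $total = sum(1, 2, 3);
print "Total: $total\n";
```

Since this is more typing than a plain use line, each module should simply use whatever it depends on.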

Module naming convention: Reserved prefix for internal distributions? [duplicate]

I have read the perldoc on modules, but I don't see a recommendation on naming a package so it won't collide with builtin or CPAN module/package names.
In the past, to develop a local Session.pm module, I have created a local directory using my company's name, such as:
package Company::Session;
... and Session.pm would be found in directory Company/.
But I'm just not a fan of this naming convention. I would rather name the package hierarchy closer to the functionality of the code. But that's how it's done on CPAN generally...
I feel like I am missing something fundamental. I also looked in Damian's Perl Best Practices but I may not have been looking in the right place...
Any recommendations on avoiding package namespace collisions the right way?
Update w/ Related Question: if there is a package name conflict, how does Perl choose which one to use? Thanks everyone.
The namespace Local:: has been reserved for just this purpose. No module that starts with that prefix will be accepted to CPAN or the core. Alternatively, you can use an underscore in the top-level name (like My_Corp::Session or just My_Session). All categories with an underscore have also been reserved. (This is mentioned in perlmodlib, under "Select a name for the module".)
Note that both those reservations apply only to the top-level name. For example, there are CPAN modules named Time::Local and Text::CSV_XS. But Local::Time and Text_CSV::XS are reserved names and would not be accepted on CPAN.
Naming modules after your company is fine too. (Well, unless you work for some really generic sounding company.) Using the reverse domain name is probably overkill, unless you intend to distribute your modules to others. (But in that case, you should probably register a normal module name.)
How Perl resolves a conflict:
Perl searches the directories in @INC for a module with the specified name. The first module found is the one used. So the order of directories in @INC determines which module would be used (if you have modules with the same name installed in different locations).
perl -V will report the contents of @INC (the highest-priority directories are listed first). But there are lots of ways to manipulate @INC at runtime, too.
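The "first match wins" rule can be demonstrated with a small self-contained sketch. It writes two copies of a throwaway module into temporary directories; the module name LocalDemo and its origin() function are invented for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Create two temporary directories, each holding its own LocalDemo.pm.
my @dirs = map { tempdir(CLEANUP => 1) } 1 .. 2;
for my $i (0 .. 1) {
    my $file = File::Spec->catfile($dirs[$i], 'LocalDemo.pm');
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "package LocalDemo;\nsub origin { 'copy $i' }\n1;\n";
    close $fh or die "Cannot close $file: $!";
}

# Put both directories at the front of @INC, $dirs[0] first.
unshift @INC, @dirs;

require LocalDemo;
my $origin = LocalDemo::origin();
print "Loaded: $origin\n";
print "From:   $INC{'LocalDemo.pm'}\n";
```

Only the copy in the first directory is loaded; the second one is never looked at.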
BTW, Raku can handle multiple modules with the same name by different authors, and even use more than one in a single program. That's a different solution.
There is nothing wrong with naming your internal modules after your company; I always do this. 90% of my code ends up on CPAN, so it has "normal" names, but the internal stuff always starts with ClientName::.
I'm sure everyone else does this too.
What's wrong with just picking a name for your package that you like and then googling "perl the-name-you-picked"?
The @INC variable contains a list of directories in which to look for modules. Perl starts with the first entry and moves on to the next if it doesn't find the requested module. @INC has a default value that is created when perl is compiled, but you can change it with the PERL5LIB environment variable, the lib pragma, or by directly manipulating the @INC array in a BEGIN block:
#!/usr/bin/perl
BEGIN {
@INC = (); # no modules can be found
}
use strict; # error: Can't locate strict.pm in @INC (@INC contains:)
If you need the maximum level of certainty that your module name will not conflict with someone else's, you can take a page from Java's book: name the module after the company's domain. So if you work for Example, Inc. and their domain name is example.com, you would name your HTML parser module Com::Example::HTML::Parser or Example::Com::HTML::Parser. The benefit of the first is that if you have multiple subunits they can all have their own namespace, but the modules will still sort together:
Com::Example::Biz::FindCustomers
Com::Example::IT::ParseLogs
Com::Example::QA::TestServer
but it does look odd at first.
(I know this post is old, but as I've had to sort this out in the past few months, I thought I'd weigh in)
At work we decided that 'Local::' felt too geographic. CompanyName:: had some problems for us too that aren't development related, I'll skip those, though I will say that CompanyName is long when you have to type it dozens of times.
So we settled on 'Our::'. Sure, we're not 'CPAN Safe' as there could be the day when we want to use a CPAN module with the Our:: prefix. But it feels nice.
Our::Data is our Class::DBI module
Our::App is our generic app framework that does config handling and Getopt stuff
Nice to read and nice to type.

Old .pl modules versus new .pm modules

I'm a beginner in Perl and I'm trying to build in my head the best ways of structuring a Perl program. I'm proficient in Python and I'm used to the Python from foo import bar way of importing functions and classes from Python modules. As I understand it, in Perl there are many ways of doing this (.pm and .pl modules, @EXPORT and @ISA, use and require, etc.), and it is not easy for a beginner to get a clear idea of the differences, advantages and drawbacks of each (even after reading Beginning Perl and Intermediate Perl).
The problem stated, my current question is related to a sentence from perldoc perlmod:
Perl module files have the extension .pm. The use operator assumes this so you don't have to spell out "Module.pm" in quotes. This also helps to differentiate new modules from old .pl and .ph files.
What are the differences between the old .pl way of preparing modules and the new .pm way?
Are they really the old and the modern way? (I assume they are because perlmod says so, but I would like to get some input about this.)
The use function and .pm-type modules were introduced in Perl 5, released 16 years ago next month. The "old .pl and .ph files" perlmod is referring to were used with Perl 4 (and earlier). At this point, they're only interesting to computer historians. For your purposes, just forget about .pl libraries.
What are the differences between the old .pl way of preparing modules and the new .pm way?
You can find a few old modules inside Perl's own standard library (located via @INC; the paths can be seen in the output of perl -V).
In older times, there were no packages. One would write, e.g., require "open2.pl";, which essentially includes the content of the file as-is in the calling script. All declared functions and all global variables became part of the script's context. In other words, they polluted your context. Including several files could lead to all sorts of conflicts.
New modules use the package keyword to define their own context and the name of their namespace. When use-d by a script, a new-style module can avoid importing anything into the script's immediate context, which prevents namespace pollution and potential conflicts.
The @EXPORT and @EXPORT_OK lists are used by the standard utility module Exporter, which helps import the module's functions into the calling context so that one doesn't have to write the functions' full names all the time. What is actually imported generally depends on the parameter list passed to use, as in use POSIX qw/:errno_h/;. See perldoc Exporter for more details.
@ISA is Perl's inheritance mechanism. It tells Perl that if it can't find a method inside the current package, it should look for it in all the packages listed in @ISA. Simple modules often list only Exporter there, in order to use its import() method (which is also well described in perldoc Exporter).
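The two mechanisms can be seen side by side in a small single-file sketch. The package names Local::Greeter, Local::Animal and Local::Dog are invented for this illustration; in real code each package would normally live in its own .pm file and be loaded with use.

```perl
use strict;
use warnings;

# A module that offers greet() for import on request via @EXPORT_OK.
package Local::Greeter {
    use Exporter 'import';
    our @EXPORT_OK = qw(greet);
    sub greet { my ($name) = @_; return "Hello, $name"; }
}

# A base class and a subclass that inherits from it through @ISA.
package Local::Animal {
    sub new   { my ($class) = @_; return bless {}, $class; }
    sub speak { my ($self) = @_; return ref($self) . ' makes a sound'; }
}
package Local::Dog {
    our @ISA = ('Local::Animal');
}

package main;

# With a separate file, "use Local::Greeter qw(greet);" would do this step.
Local::Greeter->import('greet');
my $greeting = greet('world');

# speak() is not defined in Local::Dog, so Perl finds it through @ISA.
my $sound = Local::Dog->new->speak;

print "$greeting\n$sound\n";
```

The explicit import call is done at run time here because everything is in one file, so @EXPORT_OK is only populated once the package block has executed.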
Reusing code by creating .pl files (the "pl" actually stands for "Perl library") was the way that it was done back in Perl 4 - before we had the 'package' keyword and the 'use' statement.
It's a nasty old way of doing things. If you're coming across documentation that recommends it then that's a strong indication that you should ignore that documentation as it's either really old or written by someone who hasn't kept up to date with Perl development for over fifteen years.
For some examples of the different ways of building Perl modules in the modern way, see my answer to Perl Module Method Calls: Can't call method “X” on an undefined value at ${SOMEFILE} line ${SOMELINE}
I don't know anything about .pl modules other than that they existed some time ago. Nobody seems to use them nowadays, so you probably shouldn't use them either.
Stick to .pm modules and ignore @ISA for now; that's for OOP. Exporting isn't that important either, because you can always call your functions fully qualified.
So rather than writing this:
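file: MyPkg.pm
package MyPkg;
use strict;
use warnings;
use Exporter 'import';          # provides the import() that honors @EXPORT
our @EXPORT = qw(func1 func2);  # exported by default
sub func1 { ... }
sub func2 { ... }
1;                              # a module file must return a true value
file: main.pl
#!/usr/bin/perl
use strict;
use warnings;
use MyPkg;
func1();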
you should, to begin with, write this:
file: MyPkg.pm
package MyPkg;
use strict;
use warnings;
sub func1 { ... }
sub func2 { ... }
1;                              # a module file must return a true value
file: main.pl
#!/usr/bin/perl
use strict;
use warnings;
use MyPkg;
MyPkg::func1();
And later, when you see which functions should really be exported, you can do that without having to change your existing code.
use loads your module and calls import, which makes any exported subs available in your current package. In the second example a require would do, since it doesn't call import, but I tend to always use use.
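The difference can be checked with a short sketch. It uses the core module List::Util instead of MyPkg, purely so that the example is runnable on its own.

```perl
use strict;
use warnings;

# require loads the module but imports nothing into the current package.
require List::Util;

# The fully qualified name works without any import.
my $largest = List::Util::max(3, 7, 5);

# No max() has been added to this package.
my $status = __PACKAGE__->can('max') ? 'imported' : 'not imported';

print "Largest: $largest\n";
print "max() is $status into the current package\n";
```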

How does Perl's lib pragma work?

I use use lib "./DIR" to grab a library from a folder elsewhere. However, it doesn't seem to work on my server, but it works fine on my local desktop.
Any particular reasons?
And one more question, does use lib get propagated within several modules?
Two situations:
Say I make a base class that requires a few libraries, but I know that it needs to be extended and the extended class will need to use another library. Can I put the use lib command in the base class? or will I need to put it in every extending class?
Finally, can I just have a use package where package contains a bunch of use lib, will it propagate the use lib statements over to my current module? <-- I don't think so, but asking anyways
The . in your use lib statement means "current working directory" and will only work when your script is run from the right directory. The server's idea of the cwd is probably something different (or undefined). Assuming that the library directory is co-located with the script and private to it, you want to do something like this instead:
use FindBin;
use lib "$FindBin::Bin/DIR";
A use lib statement affects @INC -- the list of locations perl searches when you use or require a module. It globally affects the current instance of the interpreter. You should really only put use lib statements in scripts, not in modules.
In principle, you could have a package MyLibs that consisted of a bunch of use lib statements and then use MyLibs before using any of the packages in those locations, but I wouldn't recommend it.
There's no way to know why it isn't working on your server without more information. In particular, check your server's error logs, and dump @INC somewhere if necessary, and compare that to your actual library paths.
use lib modifies @INC, which is global, so as long as you execute your use lib before other packages try to include stuff, it will work and all other packages will see the new include paths.
For more on @INC, see its entry in perlvar.
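A small sketch of that global effect: a use lib placed inside one package is visible from every other package in the same interpreter. The package name Local::Base and the path /opt/hypothetical/lib are made up for this illustration, and the directory does not need to exist for the demonstration.

```perl
use strict;
use warnings;

package Local::Base {
    # Hypothetical path; use lib adds it to the single, global @INC.
    use lib '/opt/hypothetical/lib';
}

package main;

# The entry added inside Local::Base is visible here as well.
my $seen    = grep { $_ eq '/opt/hypothetical/lib' } @INC;
my $message = $seen ? 'visible from main' : 'not visible from main';
print "Path added in Local::Base is $message\n";
```

This is also why placing use lib in a module is discouraged: the module silently changes the search path for the whole program.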
