Perl class naming convention - perl

Let's say I create a class named Bar. The file Bar.pm starts
package Bar;
To avoid colliding with other Bar classes, I put the file in a subdirectory Foo. So now, when I use the class, I have to write
use Foo::Bar;
My question is, do I need to change the name of the class to Foo::Bar? In other words, do I need to change the first line of Bar.pm to
package Foo::Bar;
? The problem is, if I do this, I now have to refer to the class as Foo::Bar everywhere, e.g.
my $obj = Foo::Bar->new();
Foo::Bar->doClassMethod();
which is annoying (the same problem was discussed in this question), especially since I am fond of class methods.

Yes, you have to change the name of the package to exactly match that of the use.

If you think having a class name Bar might conflict with other classes named Bar then simply moving the file won't help. If your program eventually uses both Bar and Foo::Bar then both will have been loaded into the same namespace. At that point what happens to your program is anyone's guess.
If don't want to type long class names then you can use a variable to hold the name.
use My::Long::Class::Name::For::Bar;
my $bar_class = 'My::Long::Class::Name::For::Bar';
$bar_class->class_method(); # the same as My::Long::Class::Name::For::Bar->class_method()

You don't strictly need to (i.e. it's a style decision, not something the compiler enforces), but it is a good idea to follow the relevant guidelines set out in perlmod/perlnewmod to make the software easily distributable.
IOW, if long names bother you, get an editor with autocompletion.

Quoting perldoc -f require:
If EXPR is a bareword, the require assumes a ".pm" extension and
replaces "::" with "/" in the filename for you, to make it easy to
load standard modules. This form of loading of modules does not risk
altering your namespace. In other words, if you try this:
require Foo::Bar; # a splendid bareword
The require function will actually look for the "Foo/Bar.pm" file in the
directories specified in the #INC array.
So, yes, it's convention that if your module is located at $dir/Foo/Bar.pm for some $dir in #INC, then it must be called Foo::Bar.

The package Local is reserved just for this use. For example, if you create a package Bar.pm, you could call the package Local::Bar and know it won't clash with something in CPAN.
By default, #INC includes the current directory, so you could create a Local subdirectory, and then put all of your packages there. In a company, I'll divide up the Local package namespace into groups and even names of the developers. For example, I could use Local::Cm::Bar or Local::David::Bar. That way, if I decide I could use Alice's Bar class, I could simply include Local::Alice::Bar.
Yes, the standard CPAN rules state not to use your name in package namespaces, but Local packages never go into CPAN.

Related

Module naming convention: Reserved prefix for internal distributions? [duplicate]

I have read the perldoc on modules, but I don't see a recommendation on naming a package so it won't collide with builtin or CPAN module/package names.
In the past, to develop a local Session.pm module, I have created a local directory using my company's name, such as:
package Company::Session;
... and Session.pm would be found in directory Company/.
But I'm just not a fan of this naming convention. I would rather name the package hierarchy closer to the functionality of the code. But that's how it's done on CPAN generally...
I feel like I am missing something fundamental. I also looked in Damian's Perl Best Practices but I may not have been looking in the right place...
Any recommendations on avoiding package namespace collisions the right way?
Update w/ Related Question: if there is a package name conflict, how does Perl choose which one to use? Thanks everyone.
The namespace Local:: has been reserved for just this purpose. No module that starts with that prefix will be accepted to CPAN or the core. Alternatively, you can use an underscore in the top-level name (like My_Corp::Session or just My_Session). All categories with an underscore have also been reserved. (This is mentioned in perlmodlib, under "Select a name for the module".)
Note that both those reservations apply only to the top-level name. For example, there are CPAN modules named Time::Local and Text::CSV_XS. But Local::Time and Text_CSV::XS are reserved names and would not be accepted on CPAN.
Naming modules after your company is fine too. (Well, unless you work for some really generic sounding company.) Using the reverse domain name is probably overkill, unless you intend to distribute your modules to others. (But in that case, you should probably register a normal module name.)
How Perl resolves a conflict:
Perl searches the directories in #INC for a module with the specified name. The first module found is the one used. So the order of directories in #INC determines which module would be used (if you have modules with the same name installed in different locations).
perl -V will report the contents of #INC (the highest-priority directories are listed first). But there are lots of ways to manipulate #INC at runtime, too.
BTW, Raku can handle multiple modules with the same name by different authors, and even use more than one in a single program. That's a different solution.
There is nothing wrong with naming your internal modules after your company; I always do this. 90% of my code ends up on CPAN, so it has "normal" names, but the internal stuff is always starts with ClientName::.
I'm sure everyone else does this too.
What's wrong with just picking a name for your package that you like and then googling "perl the-name-you-picked"?
The #INC variable contains a list of directories to in which to look for modules. It starts with the first entry and then moves on to next if it doesn't find the request module. #INC has a default value that created when perl is compiled, but you can can change it with the PERL5LIB environment variable, the lib pragma, and directly manipulating the #INC array in a BEGIN block:
#!/usr/bin/perl
BEGIN {
#INC = (); #no modules can be found
}
use strict; #error: Can't locate strict.pm in #INC (#INC contains:)
If you need the maximum level of certainty that your module name will not conflict with someone else's you can take a page from Java's book: name the module with the name of the companies domain. So if you work for Example, Inc. and their domain name is example.com, you would name your HTML parser module Com::Example::HTML::Parser or Example::Com::HTML::Parser. The benefit of the first is that if you have multiple subunits they can all have their own name space, but the modules will still sort together:
Com::Example::Biz::FindCustomers
Com::Example::IT::ParseLogs
Com::Example::QA::TestServer
but it does look odd at first.
(I know this post is old, but as I've had to sort this out in the past few months, I thought I'd weigh in)
At work we decided that 'Local::' felt too geographic. CompanyName:: had some problems for us too that aren't development related, I'll skip those, though I will say that CompanyName is long when you have to type it dozens of times.
So we settled on 'Our::'. Sure, we're not 'CPAN Safe' as there could be the day when we want to use a CPAN module with the Our:: prefix. But it feels nice.
Our::Data is our Class::DBI module
Our::App is our generic app framework that does config handling and Getopt stuff
Nice to read and nice to type.

How to import all "our"-variables from the unnamed Perl module without listing them?

I need to import all our variables from the unnamed Perl module (Module.pm) and use them inside the Perl script (Script.pl).
The following code works well without the "use strict", but failed with it. How can I change this code to work with "use strict" without the manual listing of all imported variables (as described in the answer to other question)?
Thanks a lot for your help!
Script.pl:
use strict;
require Module;
print $Var1;
Module.pm:
our $Var1 = "1\n";
...
our $VarN = "N\n";
return 1;
Run the script:
$> perl Script.pl
Errors:
Global symbol "$Var1" requires explicit package name at Script.pl line 3.
Execution of Script.pl aborted due to compilation errors.
NOTE (1): The module is unnamed, so using a Module:: prefix is not the option.
NOTE (2): Module.pm contains also a set of functions configured by global variables.
NOTE (3): Variables are different and should NOT be stored in one array.
NOTE (4): Design is NOT good, but the question is not about the design. It's about forcing of the listed code to work with minimal modifications with the complexity O(1), i.e. a few lines of code that don't depend on the N.
Solution Candidate (ACCEPTED): Add $:: before all imported variables. It's compliant with strict and also allows to differ my variables from imported in the code.
Change your script to:
use strict;
require Module;
print $Module::Var1;
The problem is the $Var1 isn't in the main namespace, it's in Module's namespace.
Edit: As is pointed out in comments below, you haven't named your module (i.e. it doesn't say package Module; at the top). Because of this, there is no Module namespace. Changing your script to:
use strict;
require Module;
print $main::Var1;
...allows the script to correctly print out 1\n.
If you have to import all the our variables in every module, there's something seriously wrong with your design. I suggest that you redesign your program to separate the elements so there is a minimum of cross-talk between them. This is called decoupling.
You want to export all variables from a module, and you want to do it in such a way that you don't even know what you're exporting? Forget about use strict and use warnings because if you put them in your program, they'll just run screaming out, and curl up in a corner weeping hysterically.
I never, and I don't mean hardly ever, never export variables. I always create a method to pull out the required value. It gives me vital control over what I'm exposing to the outside world and it keeps the user's namespace pure.
Let's look at the possible problems with your idea.
You have no idea what is being exported in your module. How is the program that uses that module going to know what to use? Somewhere, you have to document that the variable $foo and #bar are available for use. If you have to do that, why not simply play it safe?
You have the issue of someone changing the module, and suddenly a new variable is being exported into the program using that module. Imagine if that variable was already in use. The program suddenly has a bug, and you'll never be able to figure it out.
You are exporting a variable in your module, and the developer decides to modify that variable, or even removes it from the program. Again, because you have no idea what is being imported or exported, there's no way of knowing why a bug suddenly appeared in the program.
As I mentioned, you have to know somewhere what is being used in your module that the program can use, so you have to document it anyway. If you're going to insist on importing variables, at least use the EXPORT_OK array and the Exporter module. That will help limit the damage. This way, your program can declare what variables its depending upon and your module can declare what variables it knows programs might be using. If I am modifying the module, I would be extra careful of any variable I see I am exporting. And, if you must specify in your program what variables you're importing, you know to be cautious about those particular variables.
Otherwise, why bother with modules? Why not simply go back to Perl 3.0 and use require instead of use and forget about using the package statement.
It sounds like you have data in a file and are trying to load that data into your program.
As it is now, the our declarations in the module only declare variables for the scope of that file. Once the file finshes running, to access the variables, you need to use their fully qualified name. If your module has a package xyz; line, then the fully qualified name is $xzy::Var1. If there is no package declaration, then the default package main is used, giving your variables the name $main::Var1
However, any time that you are making many variables all with numeric name changes, you probably should be using an array.
Change your module to something like:
#My::Module::Data = ("1\n", "2\n" ... )
and then access the items by index:
$My::Module::Data[1]

Should Perl boilerplate go before or after a package declaration?

Assuming there's only one package in a file, does the order or the following Perl boilerplate matter? if there are no technical reasons are there any aesthetic?
use 5.006;
use strict;
use warnings;
package foo;
The order matters if any part of your boiler plate imports any subroutines or variables, or does anything tricky with the caller's namespace.
If you get into the habit of placing it before the package name, then one day when you want to add use List::Util 'reduce'; to your boiler plate, the subroutine will be imported into main instead of foo. So package foo will not have reduce imported, and you may be scratching your head for a while trying to figure out why it isn't working.
The reason why it doesn't matter with the three imports you have shown is that they are all pragmatic modules (or assertions), and their effect is lexically scoped, not package scoped. Placed at the top of the file, those pragmas will be in effect for the entire file.
The order doesn't matter from a technical standpoint.
It's always been my practice (and, fwiw, what's used in Perl Best Practices) to put the package declaration at the very start. I'd suggest a blank line before the package declaration if there's anything before it, to make it stand out.

How does Perl's lib pragma work?

I use use lib "./DIR" to grab a library from a folder elsewhere. However, it doesn't seem to work on my server, but it works fine on my local desktop.
Any particular reasons?
And one more question, does use lib get propagated within several modules?
Two situations:
Say I make a base class that requires a few libraries, but I know that it needs to be extended and the extended class will need to use another library. Can I put the use lib command in the base class? or will I need to put it in every extending class?
Finally, can I just have a use package where package contains a bunch of use lib, will it propagate the use lib statements over to my current module? <-- I don't think so, but asking anyways
The . in your use lib statement means "current working directory" and will only work when your script is run from the right directory. The server's idea of cwd is probably something different (or undefined). Assuming that the library directory is co-located with with script and private to it you want to do something like this instead:
use FindBin;
use lib "$FindBin::Bin/DIR";
A use lib statement affects #INC -- the list of locations perl searches when you use or require a module. It globally affects the current instance of the interpreter. You should really only put use lib statements in scripts, not in modules.
In principle, you could have a package MyLibs that consisted of a bunch of use lib statements and then use MyLibs before using any of the packages in those locations, but I wouldn't recommend it.
There's no way to know why it isn't working on your server without more information. In particular, check your server's error logs, and dump #INC somewhere if necessary, and compare that to your actual library paths.
use lib modifies #INC, which is global, so as long as you execute your use lib before other packages try to include stuff, it will work and all other packages will see the new include paths.
For more on #INC, see its entry in perlvar.

How do I choose a package name for a custom Perl module that does not collide with builtin or CPAN packages names?

I have read the perldoc on modules, but I don't see a recommendation on naming a package so it won't collide with builtin or CPAN module/package names.
In the past, to develop a local Session.pm module, I have created a local directory using my company's name, such as:
package Company::Session;
... and Session.pm would be found in directory Company/.
But I'm just not a fan of this naming convention. I would rather name the package hierarchy closer to the functionality of the code. But that's how it's done on CPAN generally...
I feel like I am missing something fundamental. I also looked in Damian's Perl Best Practices but I may not have been looking in the right place...
Any recommendations on avoiding package namespace collisions the right way?
Update w/ Related Question: if there is a package name conflict, how does Perl choose which one to use? Thanks everyone.
The namespace Local:: has been reserved for just this purpose. No module that starts with that prefix will be accepted to CPAN or the core. Alternatively, you can use an underscore in the top-level name (like My_Corp::Session or just My_Session). All categories with an underscore have also been reserved. (This is mentioned in perlmodlib, under "Select a name for the module".)
Note that both those reservations apply only to the top-level name. For example, there are CPAN modules named Time::Local and Text::CSV_XS. But Local::Time and Text_CSV::XS are reserved names and would not be accepted on CPAN.
Naming modules after your company is fine too. (Well, unless you work for some really generic sounding company.) Using the reverse domain name is probably overkill, unless you intend to distribute your modules to others. (But in that case, you should probably register a normal module name.)
How Perl resolves a conflict:
Perl searches the directories in #INC for a module with the specified name. The first module found is the one used. So the order of directories in #INC determines which module would be used (if you have modules with the same name installed in different locations).
perl -V will report the contents of #INC (the highest-priority directories are listed first). But there are lots of ways to manipulate #INC at runtime, too.
BTW, Raku can handle multiple modules with the same name by different authors, and even use more than one in a single program. That's a different solution.
There is nothing wrong with naming your internal modules after your company; I always do this. 90% of my code ends up on CPAN, so it has "normal" names, but the internal stuff is always starts with ClientName::.
I'm sure everyone else does this too.
What's wrong with just picking a name for your package that you like and then googling "perl the-name-you-picked"?
The #INC variable contains a list of directories to in which to look for modules. It starts with the first entry and then moves on to next if it doesn't find the request module. #INC has a default value that created when perl is compiled, but you can can change it with the PERL5LIB environment variable, the lib pragma, and directly manipulating the #INC array in a BEGIN block:
#!/usr/bin/perl
BEGIN {
#INC = (); #no modules can be found
}
use strict; #error: Can't locate strict.pm in #INC (#INC contains:)
If you need the maximum level of certainty that your module name will not conflict with someone else's you can take a page from Java's book: name the module with the name of the companies domain. So if you work for Example, Inc. and their domain name is example.com, you would name your HTML parser module Com::Example::HTML::Parser or Example::Com::HTML::Parser. The benefit of the first is that if you have multiple subunits they can all have their own name space, but the modules will still sort together:
Com::Example::Biz::FindCustomers
Com::Example::IT::ParseLogs
Com::Example::QA::TestServer
but it does look odd at first.
(I know this post is old, but as I've had to sort this out in the past few months, I thought I'd weigh in)
At work we decided that 'Local::' felt too geographic. CompanyName:: had some problems for us too that aren't development related, I'll skip those, though I will say that CompanyName is long when you have to type it dozens of times.
So we settled on 'Our::'. Sure, we're not 'CPAN Safe' as there could be the day when we want to use a CPAN module with the Our:: prefix. But it feels nice.
Our::Data is our Class::DBI module
Our::App is our generic app framework that does config handling and Getopt stuff
Nice to read and nice to type.