What do the chained namespace identifiers in Perl do? - perl

I found a code snippet which goes like this:
package File::MP3;
use parent 'File'; # sets #File::MP3::ISA = ('File');
my $mp3 = File::MP3->new( 'Andvari.mp3', $data );
$mp3->save();
Here, I want to ask if File::MP3 is just a name (it is named that way just to show that it inherits from File)
OR
We have to name it that way if it inherits from File?

Update
I made a module named Module.pm in lib folder, & then named the package as package lib::Module
You're getting a little confused
When use or require is actioned, Perl will form a relative path from the package name by changing something like My::Other::Module to My/Other/Module.pm using the obvious substitutions
It will look for that relative path in the list of locations in the built-in #INC array, which contains some paths that were defined when perl was built and others that may be added at run time
Up until very recently, #INC contained the current working directory, ., so if you have your module in ./lib/My/Other/Module.pm then the Perl compiler will find it if you use lib::My::Other::Module.pm. But that's not how it's meant to work
You should add ./lib to #INC (using either use lib './lib'[1] or by adding to the value of the environment variable PERL5LIB) and then use My::Other::Module. That will work fine because perl is looking for the .pm file in in ./lib. The name and path to the .pm file, the package statement, and the use statement should all agree about what the name is
[1] Note that it is a security risk to add relative paths to #INC. That is why . is no longer included as standard in the release client for Perl v5.26. That means you shouldn't use lib './lib' as described above. Instead you need something like use lib '/var/users/Me/Perl/source/lib'
A package name like File::MP3 is just a name. Perl's only requirement is that it has to be unique
But modules are grouped into families in CPAN, and most file-related modules begin with File::. There is also Win32::, Net::, Math:: etc.
It is also used to indicate subsidiary modules in a suite. For instance, Mojo::Message contains the code common to both Mojo::Message::Request and Mojo::Message::Response. But that is a mnemonic for the programmer's convenience only
In the case of Math::Poisson, perl will look for a file Math/Poisson.pm which should have a package declaration package Math::Poisson. If you use that package name elsewhere then anything you declare will be inserted into the module's namespace

The File::MP3 in the package declaration is just a name. However, if you want to use that package then the name matters. The use dispatches to require, for which the name in require EXPR is
If EXPR is a bareword, require assumes a .pm extension and replaces :: with / in the filename for you, to make it easy to load standard modules. This form of loading of modules does not risk altering your namespace.
Thus if you want to use File::MP3 then an MP3.pm file had better be in a File/ directory, which needs to be in one of (the absolute-path) directories listed in #INC.
The class File from which File::MP3 inherits has nothing to do with the directory File/. The use parent (see parent) independently specifies the package to load for the class File, and its File.pm file also has to be found in #INC.
The MP3.pm file with the package may well be in a Media/ directory, in which case it would be best named as package Media::MP3; and would be loaded with use Media::MP3; – and it can still inherit from the File class by use parent File;
Placing file-related modules in a directory named File does make sense, of course.

Related

Eliminate the UserName from the "use" command

All my Perl files start with the /home/UserName/lib use command. Is there a way to eliminate the hardcoded UserName from all my .pl files? I want to simplify duplicating of my website for different users and different domains. This is a Perl Templates website that takes the cfg=UserName param from the URL and renders the site according to the specific user format. But the lib files are all the same for all users!
Can you use something like this instead? use lib './lib';
Sorry, I have very limited knowledge of perl programming.
Example:
use lib '/home/UserName/lib';
use RT::RealTime;
use RT::Site;
....
Anyone who can help is appreciated.
If I'm understanding your quest correctly, one way is
use warnings;
use strict;
my $user_name;
BEGIN { $user_name = $ENV{USER} }
use lib "/home/$user_name/lib";
# use packages from that path...
This sets the username of the user who runs the script.
Better yet, in this particular case the use of BEGIN isn't really needed since %ENV is a global variable set up by the interpreter early enough, so you can simply say
use lib "/home/$ENV{USER}/lib";
or
use lib "$ENV{HOME}/lib"
However, this won't work for many other related needs, which is why I (first) showed a way to work out things which aren't handed to us at compile time, in a BEGIN block.
In that case, the little dance around BEGIN goes as follows. The use runs at compile-time, so $user_name must already be available; thus we set it up in a BEGIN block that must come before the use statement that uses the variable. However, we need to declare the variable outside of the block (and my is a compile-time statement).
All that can then be wrapped in another block, enclosing code from before my $user_name to after user lib..., to scope that $user_name variable so that it is not seen in the rest of the file.
Another option seems to be to simply move those packages to a user-neutral location, and then you don't need the username in that use lib statement ...?
Can you use something like this instead? use lib './lib';
Almost, but not quite.
The problem with use lib './lib'; is that it will look to the lib/ directory underneath the user's current working directory when they run the script. This means that the program will only work correctly if the user is in the correct directory when they run it. It also creates a potential security issue if they happen to be in a directory where an attacker has created a malicious lib/ subdirectory containing module files matching the names of modules your code uses.
Instead, the Perl idiom to do what you want is:
use FindBin '$RealBin';
use lib "$RealBin/lib";
When you use FindBin, it finds the file containing the Perl program you're running and sets $RealBin to the directory path where that file is located. The use lib will then use the lib/ directory under that path, regardless of what directory the program is started from.
To include libraries from a directory relative to the current file, use lib::relative:
# in file /home/user/project/foo.pl
use lib::relative 'lib'; # includes /home/user/project/lib
use lib::relative '../lib'; # includes /home/user/lib
This will convert it to an absolute path, it's a bad idea to have relative paths in #INC since it relies on the current working directory, which could be anything and can be changed at any time.
lib::relative can also be easily replicated using core modules as documented.

Allowing changing location perl modules are loaded from with configuration argument

I currently have a mess of Perl code that includes something like a configuration.pm file that exports a large number of variables that other modules are using. The same module uses at least one module, call it Foo, which we wrote in some of the helper methods provided by the configuration.pm (they should be in a different module, but not ready to change this yet).
Currently it loads the module with something like this right near the top of the file:
Begin{ push #INC, 'hard/coded/directory'}
use Module::Foo;
I'm trying to get rid of this hard coded directory. I've already added a default configuration file for it to read data from. I moved the import down some and replaced the use with a require, something like this...
$script_directory = $config_data_from_file{'script_directory'};
push #inc, $script_directory;
require Module::Foo;
However, I want to add a command line argument to Main.pl to point to a different configuration file if I don't want to use the default one. My problem is that all the other modules expect configuration.pm to have loaded configuration data and required foo as soon as they include it. So I can't have configuration.pm wait to initialize until main.pl is ready. The closest I can come up with is something like this:
package Configuration;
load_config_file('default/file/location');
sub load_config_file($){
$config_data_from_file = read_file(#_[0]);
$script_directory = $config_data_from_file{'script_directory'};
push #inc, $script_directory;
require Module::Foo;
#load the rest
}
and have Main.pl recall the load_config_file if a command line option changes the configuration file.
But this is a problem for two reasons. First, if my default script location doesn't exist I still explode when I try to do the first import. Second, I'm requiring Foo twice, overwriting it, which could lead to issues if there are difference between the files. For that matter adding the default script_directory to #INC should be avoided.
There are a few ways to fix the problem I could see. A way to more cleanly load different versions of a module to replace the old one, a way to make Foo delay it's attempt to load until the first time it's used in the file, or a way to delay the $load_config_file method until after I read the configuration file for example. However, as a perl newbie I don't know how to do any of them, and haven't had much luck finding out how online.
I actually can do this now, with a fragile order of loading data that makes presumptions or by skipping ahead to a more through refactor of dozens of scripts to implement the long term solution sooner (but I'm really afraid to touch that much code before I have a way to test the code on my computer). However, I'm asking partially in hopes of learning more features of Perl I may find useful later; how would this be solved if I couldn't do the refactoring?
If you want to give the configuration file as the first parameter you can do something like this:
Main script:
#!perl
BEGIN {
use Configuration;
}
use Module::Foo;
... rest of script ...
Configuration.pm:
package Configuration;
load_config_file($ARGV[0] || 'default/file/location');
sub load_config_file($){
$config_data_from_file = read_file(#_[0]);
$script_directory = $config_data_from_file;
push #INC, $script_directory;
}
My solution in general was to look for my -f argument for a configuration file in my configuration.pm as soon as it is loaded and load the configuration file if possible then, while leaving the #ARGV variable untouched so that others could still parse it. This means we end up parsing command line args twice (actually 3 times), but that doesn't do any real harm. I am enforcing the -f argument being predefined in any module that uses my configuration.pm, and sort of require configuration.pm to be the very first module we include, but I consider that a minor expense. Anyone using our configuration.pm file for configuration arguments should desire that behavior.
I found AppConfig was the best module for handling this. My solution could be done without it, but AppConfig made it cleaner because it combined means of loading variables from config file and command line. in fact I, by pure accident, ended up adding the ability to modify any single variable directly from the command line if they choose the way I did it.
My configuration.pm looks something like hits (rewriting this from memory, not exact)
$conf = AppConfig -> new({
GLOBAL=> {
EXPAND => AppConfig::Expand_Var,
ARGCOUNT => AppConfig::ARGCOUNT_ONE
}})
$conf.define("script_dir", {DEFAULT = "/default/location"});
$conf->define("f", {ALIAS ="file|conf_file"});
...other defines here
#read config file if -f arg exists
parse_commandline_args();
$conf->file($conf->conf_file()) if defined $conf->conf_file()
#reread command line so that arguments on it override those in conf file
parse_commandline_args();
#at this point script_dir should be correct so safely include it.
push #INC $conf->script_dir();
sub parse_commandline_args(){
$copy_of_args = [#ARGV];
$conf->args($copy_of_args);
}
My main.pl is practically untouched. I use configuration.pm near the top of the module and everything else just works. I still need to go through and redefine all the scripts that Use a script to require it instead so that configuration.pm has time to update the INC before it runs, but other then that the rest just works. Anywhere I want to use content from the configuration file I now just can $conf->variable()
The parse_commandline_args is important. just using $conf->args() will erase the content of #ARGV, making them unavailable for later modules, like my main.pl. By copying the array first we leave the original #ARGV untouched for later use.
Not sure if I would recommend this from scratch, feels wrong the way configuration.pm is automagically doing everything, but for updating our ugly prototype to function long enough to maintain it until were funded to write the proper version, which I will not be doing in perl, it will do.

Convenient way to start a custom OCaml toplevel in Emacs with tuareg

If I start a custom toplevel in Emacs/tuareg with the tuareg-run-caml function, I need to give the path to the toplevel and the various -I options it requires to find the CMI files. This typing is tedious, is there a more convenient way to start a custom toplevel?
One approach is to add a .ocamlinit in the project directory that uses #directory to add whatever paths are necessary to the toplevel. You can also use this to install printers, add shorter names for commonly used modules, run test code, etc.
Note that you probably want that project specific .ocamlinit to execute ~/.ocamlinit, since things like opam tend to put bits and pieces in there. It might look something like this:
#use "/home/foo/.ocamlinit"
#directory "_build"
open Printf
module V = VeryLongModuleName
Note that #use expects a hard-coded path. This unfortunately interferes with distributing the file.
I further automate this by having an emacs command to start a toplevel that searches the current directory for a file called *.top to execute, falling back to ocaml if none is found. As ocamlbuild provides a fairly simple method of building these files, this avoids much of the tedium of getting a project loaded into a useable toplevel.

Add Perl module relative to script

im trying to add the module File-Copy-Recursive to my script as i have done with another module already, but when i try to use it i get an error i can not explain:
use lib "./cpan";
use Recursive qw(dircopy);
dircopy($path1, $path2);
the error i get is: Undefined subroutine &main::dircopy called at ...
I don't understand it, the module clearly has the function dircopy in it.
As other answers have already stated, this isn't working because you've moved the module's location in the include directory from File/Copy/Recursive.pm to just Recursive.pm.
Here's why that doesn't work:
A Perl module (file with a .pm extension) and a Perl package (collection of code under a specific namespace) are two completely different things. Normally, we'll put a package into a module which happens to have the same name, but this is really just to help us humans maintain our sanity. perl doesn't care one way or the other - one module can contain multiple packages, one package can be split across multiple files, and the names of the packages and the modules can be completely unrelated for all perl cares.
But, still... there's that convention of using the same name for both, which the use command exploits to make things a little more convenient. Behind the scenes, use Module; means require Module.pm; Module->import; - note that it calls import on the module name, not the name of the package contained within the module!
And that's the key to your issue. Even though you've moved the file out of the File/Copy/ directory, its contents still specify package File::Copy::Recursive, so that's where all of its code ends up. use Recursive attempts to call Recursive->import, which doesn't exist, so nothing gets imported. The dircopy function would be imported by File::Copy::Recursive->import, but that never gets called.
So, yeah. Move ./cpan/Recursive.pm to ./cpan/File/Copy/Recursive.pm so that the package name and the module name will match up again and sanity will be restored. (If you've been paying attention, you should be able to come up with at least two or three other ways to get this working, but moving the file to the proper place under ./cpan really is your best option if you need to keep the File::Copy::Recursive source in a subdirectory of your project's code.)
Use FindBin for relative lib path:
use FindBin;
use lib "$FindBin::Bin/./cpan";
use File::Copy::Recursive;
And you have to keep the whole 'tree' under ./cpan and the use line have to remain the same.
Files under ./cpan dir:
find ./cpan/
./cpan/File/Copy/Recursive.pm
The module name in Perl comes not only from the path, but also from its package declaration. You installed the module to ./cpan, but the package name specified is still File::Copy::Recursive.

How does Perl module(.pm) call corresponding .so?

I just compiled and run a hello world Perl extension,but I don't know the principle.How does the .pm call .so?
It doesn't need to - the binary code defines some variables and functions in the module's namespace, they can be used just like regular variables. The .pm file only needs to ensure that .so is loaded when it is needed. This is done by the DynaLoader module. By inheriting from DynaLoader you make sure that your .so file is loaded when an unknown method is called on your class.