I have come across code with the following syntax:
$a -> mysub($b);
And after looking into it I am still struggling to figure out what it means. Any help would be greatly appreciated, thanks!
What you have encountered is object oriented perl.
it's documented in perlobj. The principle is fairly simple though - an object is a sort of super-hash, which as well as data, also includes built in code.
The advantage of this, is that your data structure 'knows what to do' with it's contents. At a basic level, that's just validate data - so you can make a hash that rejects "incorrect" input.
But it allows you to do considerably more complicated things. The real point of it is encapsulation, such that I can write a module, and you can make use of it without really having to care what's going on inside it - only the mechanisms for driving it.
So a really basic example might look like this:
#!/usr/bin/env perl
use strict;
use warnings;
package MyObject;
#define new object
sub new {
my ($class) = #_;
my $self = {};
$self->{count} = 0;
bless( $self, $class );
return $self;
}
#method within the object
sub mysub {
my ( $self, $new_count ) = #_;
$self->{count} += $new_count;
print "Internal counter: ", $self->{count}, "\n";
}
package main;
#create a new instance of `MyObject`.
my $obj = MyObject->new();
#call the method,
$obj->mysub(10);
$obj->mysub(10);
We define "class" which is a description of how the object 'works'. In this, class, we set up a subroutine called mysub - but because it's a class, we refer to it as a "method" - that is, a subroutine that is specifically tied to an object.
We create a new instance of the object (basically the same as my %newhash) and then call the methods within it. If you create multiple objects, they each hold their own internal state, just the same as it would if you created separate hashes.
Also: Don't use $a and $b as variable names. It's dirty. Both because single var names are wrong, but also because these two in particular are used for sort.
That's a method call. $a is the invocant (a class name or an object), mysub is the method name, and $b is an argument. You should proceed to read perlootut which explains all of this.
I'm new to OOPerl, and wanted to know how I can reference an instance of a class within that class (i.e. $this in PHP) so that I'm able to call its "private" methods
To make it more clear:
in PHP for instance:
class Foo {
public function __construct(){
}
public function doThis(){
$that = $this->doThat(); //How to reference a "private" function in perl that is defined in the same class as the calling function?
return $that;
}
private function doThat(){
return "Hi";
}
}
Perl methods are ordinary subroutines that expect the first element of their parameter array #_ to be the object on which the method is called.
An object defined as
my $object = Class->new
can then be used to call a method, like this
$object->method('p1', 'p2')
The customary name is $self, and within the method you assign it as an ordinary variable, like this
sub method {
my $self = shift;
my ($p1, $p2) = #_;
# Do stuff with $self according to $p1 and $p2
}
Because the shift removes the object from #_, all that is left are the explicit parameters to the method call, which are copied to the local parameter variables.
There are ways to make inaccessible private methods in Perl, but the vast majority of code simply trusts the calling code to do the right thing.
Is it reasonable to create an instance of a class inside a method of the same class?
I ask about "reasonability" and not "feasibility" because I could write a working example, like this:
package MyPkg;
sub new {
...
}
sub myMethod {
my ($class, $param) = #_;
my $obj = $class->new();
...
}
and the usage is:
use MyPkg;
my $result = MyPkg->myMethod("abc");
This architecture does work as expected (in $result I have the result of the call of myMethod on a "transient" object), and the caller is not even forced to explicitly instantiate a new object and then use it to call its methods.
Though, I feel this is not "Perl-ish", I mean... I don't like it. I feel something is flawed with this approach, but can't identify the exact reason.
Is there a more 'standard' solution?
Nothing requires a constructor to have the name new, that's just convention. Your myMethod sub may have a different name, but it's also a constructor.
One example of this is XML::LibXML. It has a standard constructor, but also two other constructors with specific purposes:
use XML::LibXML '1.70';
# Parser constructor
$parser = XML::LibXML->new();
$parser = XML::LibXML->new(option=>value, ...);
$parser = XML::LibXML->new({option=>value, ...});
# Parsing XML
$dom = XML::LibXML->load_xml(
location => $file_or_url
# parser options ...
);
# Parsing HTML
$dom = XML::LibXML->load_html(...);
The second two are probably wrappers for new, but they're also constructors.
Additionally, some objects are designed to be immutable. If they allow operations to still be performed on them, they will need to return a new object of the same type.
All this is to say, yes, what you've doing is fine.
Yes, you can do this. It's quite widely done.
In script i initialize several handlers and set variables which should be available for functions in separate modules. Which is the best way to use them ($q, $dbh, %conf) in modules?
Example pseudo module:
package My::Module
sub SomeFunction (
#data = $dbh->selectrow_array("SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar} );
return $q->p( "#data" );
)
1;
Example pseudo script:
use CGI;
use DBI;
use My::Module;
our $q = new CGI;
our $dbh = some_connenction_function($dsn);
our %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
I understand that using main:: namespace will work, but there sholud be cleaner way? Or not?
package My::Module
Your modules should be independent of the context. That is, they shouldn't expect that $dbh is the database handler, or that they should return stuff in $q. or that configuration is kept in %conf.
For example, what if you suddenly find yourself with two instances of database handles? What do you do? I hate it when a module requires me to use module specific variables for configuration because it means I can't use two different instances of that module.
So you have two choices:
Either pass in the required data each time.
Create an object instance (that's right Object Oriented Programming) to store the required information.
Let's look at the first instance using your pseudo code:
sub someFunction (
%param = #_;
%conf = %{param{conf};
#data = $param{dbh}->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
)
1;
Example pseudo script:
use CGI;
use DBI;
use My::Module;
my $q = new CGI;
my $dbh = some_connenction_function($dsn);
my %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
someFunction (
q => $q,
dbh => $dbh,
conf => \%conf,
);
Here I'm using parameter calls. Not too bad. Is it? Now if you need another select statement, you can use different variables.
Sure, but what if you don't want to keep passing variables all of the time. Well then, you can use an Object Oriented Techniques. Now, relax and calm down. There are many, many good reasons to use object oriented design:
It can simplify your programming: This confuses people because object oriented programming means thinking about how your program will work, then designing it, then creating the various objects. All you want to do is code and get it out of the way. The truth is that by thinking about your program and designing it makes your program work better, and you code faster. The design aspect keeps the complexity out of your main code and tucks it safely away in small, easily digestible routines.
It's what is cool in Perl: You're going to have to get use to object oriented Perl because that's what everyone else is writing. You'll be seeing lots of my $foo = Foo::Bar->new; type of statements. Newer Perl modules only have object oriented interfaces, and you won't be able to use them.
Chicks dig it: Hey, I'm just grasping at straws here...
Let's see how an object oriented approach might work. First, let's see the main program:
use CGI;
use DBI;
use My::Module;
my $q = new CGI;
my $dbh = some_connenction_function($dsn);
my %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
my $mod_handle = My::Module->new (
q => $q,
dbh => $dbh,
conf => \%conf,
);
$mod_handle->someFunction;
In the above, I now create an object instance that contains these variables. And, magically, I've changed your Functions into Methods. A method is simply a function in your Class (aka module). The trick is that my instance (the variable $mod_handler has all of your required variables stored away nice and neat for you. The $mod_hander-> syntax merely passes this information for my into my functions I mean methods.
So, what does your module now look like? Let's look at the first part where I have the Constructor which is simply the function that creates the storage for my variables I need:
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->{Q} = $q;
$self->{DBH} = $dbh;
$self->{CONF} = $conf;
return $self;
}
Let's look at the first thing that is a bit different: my $class = shift;. Where is this coming from? When I call a function with the syntax Foo->Bar, I am passing Foo as the first parameter in the function Bar. Thus, $class is equal to My::Module. It is the same as if I called your function this way:
my $mod_handle = My::Module::new("My::Module", %params);
instead of:
my $mod_handle = My::Module->new(%params);
The next thing is the my $self = {}; line. This is creating a reference to a hash. If you don't understand references, you should look at Mark's Reference Tutorial that's included in Perldocs. Basically, a reference is the memory location where data is stored. In this case, my hash doesn't have a name, all I have is a reference to the memory where it's stored called $self. In Perl, there's nothing special about the name new or $self, but they're standards that everyone pretty much follows.
The bless command is taking my reference, $self, and declaring it a type My::Module. That way, Perl can track whether $mod_handle is the type of instance that has access to these functions.
As you can see, the $self reference contains all the variables that my functions need. And, I conveniently pass this back to my main program where I store it in $mod_handle.
Now, let's look at my Methods:
sub SomeFunction {
$self = shift;
my $dbh = $self->{DBH};
my $q = $self->{Q};
my %conf = %{self->{CONF}};
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
Again, that $self = shift; line. Remember I'm calling this as:
$mod_handle->SomeFunction;
Which is the same as calling it:
My::Module::SomeFunction($mod_handle);
Thus, the value of $self is the hash reference I stored in $mod_handle. And, that hash reference contains the three values I am always passing to this function.
Conclusion
You should never share variables between your main program and your module. Otherwise, you're stuck not only with the same name in your program each and every time, but you must be careful not to use your module in parallel in another part of your program.
By using object oriented code, you can store variables you need in a single instance and pass them back and forth between functions in the same instance. Now, you can call the variables in your program whatever you want, and you can use your module in parallel instances. It improves your program and your programming skills.
Beside, you might as well get use to object oriented programming because it isn't going away. It works too well. Entire languages are designed to be exclusively object oriented, and if you don't understand how it works, you'll never improve your skills.
And, did I mention chicks dig it?
Adium
Before all the Perl hackers descend upon me. I want to mention that my Perl object oriented design is very bad. It's way better than what you wanted, but there is a serious flaw in it: I've exposed the design of my object to all the methods in my class. That means if I change the way I store my data, I'll have to go through my entire module to search and replace it.
I did it this way to keep it simple, and make it a bit more obvious what I was doing. However, as any good object oriented programmer will tell you (and second rate hacks like me) is that you should use setter/getter functions to set your member values.
A setter function is pretty simple. The template looks like this:
sub My_Method {
my $self = shift;
my $value = shift;
# Something here to verify $value is valid
if (defined $value) {
$self->{VALUE} = $value;
}
return $self->{VALUE};
}
If I call $instance->My_Method("This is my value"); in my program, it will set $self->{VALUE} to This is my value. At the same time, it returns the value of $self->{VALUE}.
Now, let's say I all it this way:
my $value = $instance->My_Method;
My parameter, $value is undefined, so I don't set the value $self->{VALUE}. However, I still return the value anyway.
Thus, I can use that same method to set and get my value.
Let's look at my Constructor (which is a fancy name for that new function):
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->{Q} = $q;
$self->{DBH} = $dbh;
$self->{CONF} = $conf;
return $self;
}
Instead of setting the $self->{} hash reference directly in this program, good design said I should have used getter/setter functions like this:
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->Q = $q; #This line changed
$self->Dbh = $dbh; #This line changed
$self->Conf = $conf; #This line changed
return $self;
}
Now, I'll have to define these three subroutines, Q, Dbh, and Conf, but now my SomeFunction method goes from this:
sub SomeFunction {
$self = shift;
my $dbh = $self->{DBH};
my $q = $self->{Q};
my %conf = %{self->{CONF}};
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
To this:
sub SomeFunction {
$self = shift;
my $dbh = $self->Dbh; #This line changed
my $q = $self->Q; #This line changed
my %conf = %{self->Conf}; #This line changed
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
The changes are subtle, but important. Now my new function and my SomeFunction have no idea how these parameters are stored. The only place that knows how they're stored is the getter/setter function itself. If I change the class data structure, I don't have to modify anything but the getter/setter functions themselves.
Postscript
Here's food for thought...
If all of your SQL calls are in your My::Module function, why not simply initialize the $dbh, and the $q there in the first place. This way, you don't need to include the Use Dbi; module in your program itself. In fact, your program now remains blissfully ignorant exactly how the data is stored. You don't know if it's a SQL database, a Mongo database, or even some flat Berkeley styled db structure.
I'm including a module I did for a job long, long ago where I tried to simplify the way we use our database. This module did several things:
It handled everything about the database. The initialization, and all the handles. Thus, your main program didn't have to use any module but this.
It also moved away from the developer writing select statements. Instead, you defined what fields you wanted, and it figured out how to do the query for you.
It returns an incriminator function that's used to fetch the next row from the database.
Take a look at it. It's not the greatest, but you'll see how I use object oriented design to remove all of the various issues you are having. To see the documentation, simply type in perldoc MFX:Cmdata on the command line.
Using main:: explicitly is perfectly clean
As an alternative, you can pass that data into the module constructor method assuming your module is object based (or copy it into module's own variables from main:: in the constructor). This way the not-quite-perfwectly-pretty main:: is hidden away from the rest of the module.
One cleaner way is to use Exporter to make symbols from other namespaces available in your package. If you are just an occasional Perl programmer, I wouldn't bother with it because there's a bit of a learning curve to it, there are lots of gotchas, and for a toy project it makes the code even more complex. But it is essential for larger projects and for understanding other people's modules.
Usually, there are global variables (or especially global functions) in a package that you want to use in a different module without qualifying it (i.e., prepending package:: to every variable and function call). Here's one way that Exporter might be used to do that:
# MyModule.pm
package MyModule;
use strict; use warnings;
use Exporter;
our #EXPORT = qw($variable function);
our $variable = 42; # we want this var to be used somewhere else
sub function { return 19 }; # and want to call this function
...
1;
# my_script.pl
use MyModule;
$variable = 16; # refers to $MyModule::variable
my $f = function(); # refers to &MyModule::function
Your problem spec is backwards (not that there's necessarily anything wrong with that) -- you want variables in the main script and main package to be visible inside another module. For a short list of variables/functions that are used many times, a cleaner way to do that might be to hack at the symbol table:
package My::Module;
# required if you use strict . You do use strict , don't you?
our ($q, $dbh, %conf);
*q = \$main::q;
*dbh = \$main::dbh;
*conf = \%main::conf;
...
Now $My::Module::q and $main::q refer to the same variable, and you can just use $q in either the main or the My::Module namespace.
I think I read somewhere that some modules only have object oriented interfaces ( though they didn't create objects, they only held utility functions ). Is there a point to that?
First, its important to remember that in Perl, classes are implemented in a weird way, via packages. Packages also serve for general namespace pollution prevention.
package Foo;
sub new {
my ($class) = #_;
my $self = bless {}, $class;
return $self;
}
1;
That is how you make a Foo class in Perl (which can have an objected instantiated by calling Foo->new or new Foo). The use of new is just a convention; it can be anything at all. In fact, that new is what C++ would call a static method call.
You can easily create packages that contain only static method calls, and I suspect this is what you're referring to. The advantage here is that you can still use OO features like inheritance:
package Bar;
sub DoSomething {
my ($class, $arg) = #_;
$class->Compute($arg);
}
sub Compute {
my ($class, $arg) = #_;
$arg * 2;
}
1;
package Baz;
#Baz::ISA = qw(Bar);
sub Compute {
my ($class, $arg) = #_;
$arg * 2 - 1
}
1;
Given that, then
say Bar->DoSomething(3) # 6
say Baz->DoSomething(3) # 5
In fact, you can even use variables for the class name, so these can function very much like singletons:
my $obj = "Baz"; # or Baz->new could just return "Baz"
print $obj->DoSomething(3) # 5
[Code is untested; typos may be present]
I suspect that this is mostly a philosophical choice on the part of authors who prefer OO to imperative programming. Others have mentioned establishing a namespace, but it's the package that does that, not the interface. OO is not required.
Personally, I see little value in creating classes that are never instantiated (i.e. when there's no object in object-oriented). Perl isn't Java; you don't have to write a class for everything. Some modules acknowledge this. For example: File::Spec has an OO interface but also provides a functional interface via File::Spec::Functions.
File::Spec also provides an example of where OO can be useful for uninstantiated "utility" interfaces. Essentially, File::Spec is an abstract base class -- an interface with no implementation. When you load File::Spec it checks which OS you're using and loads the appropriate implementation. As a programmer, you use the interface (e.g. File::Spec->catfile) without having to worry about which version of catfile (Unix, Windows, VMS, etc.) to actually call.
As others have said, inheritance is the big gain if an actual object is not needed. The only thing I have to add here is the advice to name your variables well when writing such interfaces, e.g.:
package Foo;
# just a static method call
sub func
{
my $class = shift;
my (#args) = #_;
# stuff...
}
I named the variable that holds the classname "$class", rather than $this, to make it clear to subsequent maintainers that func() will be called as Foo->func() rather than $foo->func() (with an instantiated Foo object). This helps avoid someone adding this line later to the method:
my $value = $this->{key};
...which will fail, as there is no object to deference to get the "key" key.
If a method might be called either statically or against an instantiated object (for example, when writing a custom AUTOLOAD method), you can write this:
my method
{
my $this = shift;
my $class = ref($this) || $this;
my (#args) = #_;
# stuff...
}
namespacing, mostly. Why not? Everything that improves perl has my full approval.