Perl: convert from script to module, design ideas

I have a script that I would like to convert to a module and call its functions from a Perl script, as I do with CPAN modules now. My question is how others would design the module; I'd like to get some idea of how to proceed, as I have never written a module before. The script, as written now, does this:
1) Sets up logging to a DB using an in-house module
2) Establishes a connection to the db using DBI
3) Fetches file(s) from a remote server using Net::SFTP::Foreign
4) Processes the users in each file and adds the data to the DB
The script currently takes command line options to override defaults using Getopt::Long.
Each file is a pipe-delimited collection of user data, ~4000 lines, which goes into the DB, provided the user has an entry in our LDAP directory.
So more to the point:
How should I design my module? Should everything my script currently does be moved into the module, or are there some things that are best left in the script? For example, I was thinking of designing my module so it would be called like this:
use MyModule;
my $obj = MyModule->new;   # possibly pass some arguments
$obj->setupLogging;
$obj->connectToDB;
$obj->fetchFiles;
$obj->processUsers;
That would keep the script nice and clean, but is this the best idea for the module? I was thinking of having the script retrieve the files and then just pass the paths to the module for processing.
Thanks
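Here is a minimal sketch of what the module skeleton behind that calling style could look like, assuming a plain hash-based object; the option names, the in-house logger and the connection details below are placeholders I've invented, not the poster's actual code:

package MyModule;
use strict;
use warnings;
use DBI;
use Net::SFTP::Foreign;

sub new {
    my ($class, %args) = @_;
    # Defaults go here, overridden by whatever the script collected via Getopt::Long.
    my $self = { sftp_host => 'sftp.example.com', %args };
    return bless $self, $class;
}

sub setupLogging {
    my ($self) = @_;
    # $self->{logger} = In::House::Logger->new(...);   # placeholder for the in-house module
    return $self;
}

sub connectToDB {
    my ($self) = @_;
    $self->{dbh} = DBI->connect( $self->{dsn}, $self->{db_user}, $self->{db_pass},
                                 { RaiseError => 1, AutoCommit => 1 } );
    return $self;
}

sub fetchFiles {
    my ($self) = @_;
    my $sftp = Net::SFTP::Foreign->new( $self->{sftp_host}, user => $self->{sftp_user} );
    # ... fetch the remote files, remember the local paths in @{ $self->{files} } ...
    return $self;
}

sub processUsers {
    my ($self) = @_;
    # ... parse each pipe-delimited file in @{ $self->{files} } and insert rows via $self->{dbh} ...
    return $self;
}

1;

Returning $self from each method also lets the script chain the calls, e.g. MyModule->new(%opts)->setupLogging->connectToDB->fetchFiles->processUsers, if that reads better.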

I think the most useful question is "What of this code is going to be usable by multiple scripts?" Surely the whole thing won't be. I rarely create a Perl module until I find myself writing the same subroutine more than once.
At that stage I generally dump it into something called "Utility.pm" until that gathers enough code that patterns (not really in the Gang of Four sense) start to suggest what belongs in a module with well-defined responsibilities.
The other thing you'll want to think about is whether your module is going to represent an object class or if it's going to be a collection of library subroutines.
In your example I could see Logging and Database connection management (though I'd likely use DBI) belonging in external modules.
But putting everything in there just leaves you with a five line script that calls "MyModule::DoStuff" and that's less than useful.
Most of the time ;)
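To make the object-class versus library-of-subroutines distinction above concrete, here is a rough contrast; MyUtils and the subroutine names are invented purely for illustration:

# Object style: state (dbh, logger, options) lives in the object.
my $importer = MyModule->new( dsn => $dsn );
$importer->connectToDB;
$importer->processUsers($path);

# Library style: no object, just exported subroutines and explicit arguments.
use MyUtils qw(connect_to_db process_users);
my $dbh = connect_to_db($dsn);
process_users($dbh, $path);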

"is this the best idea for the module? I was thinking of having the
script retrieve the files and then just pass the paths to the module
for processing."
It looks like a decent design to me. I also agree that you should not hardcode paths (or URLs) in the module, but I'm a little confused by what "having the script retrieve the files and then just pass the paths to the module for processing" means: do you mean the script would write them to disk? Why do that? Instead, if it makes sense, have fetchFile(URL) retrieve a single file and return a reference/handle/object suitable for submission to processUsers, so the logic is like:
my @FileUrls = ( ... );
foreach (@FileUrls) {
    my $file = $obj->fetchFile($_);
    $obj->processUsers($file);
}
Although, if the purpose of fetchFile is just to get some raw text, you might want to make that a function independent of the database class -- either something that is part of the script, or another module.
If your reason for wanting to write to disk is that you don't want an entire file loaded into memory at once, you might want to adjust everything to process chunks of a file, such that you have one kind of object that fetches files via a socket and outputs chunks as they are read, and another kind of object (the one that adds to the database) that consumes them. In that case:
my $db = DBmodule->new(...);
my $netfiles = NetworkFileReader->new(...);
foreach (@FileUrls) {
    $netfiles->connect($_);
    $db->newfile();    # initialize/reset
    while (my $chunk = $netfiles->more()) {
        $db->process($chunk);
    }
    $db->complete();
    $netfiles->close();
}
Or incorporate the chunk reading into the db class if you think it is more appropriate and unlikely to serve a generic purpose.

nytprofhtml seems to ignore a module named DB

I'm trying to profile a big application that contains a module creatively named DB. After running it under -d:NYTProf and calling nytprofhtml, both without any additional switches or related environment variables, I get the usual nytprof directory with HTML output. However, it seems that due to some internal logic, any output related to my module DB is severely mangled. Just to make sure: DB is pure Perl.
In the "Top and all subroutines" list, while other function links point to the relevant -pm-NN-line.html file, links to subs from DB point to the entry script instead.
The "line" link in the "Source Code Files" section does point to DB-pm-NN-line.html, and that file exists, but unlike all the other files it has no "Statements" table inside, and the "Line" table has absolutely no lines of code, only a summary of calls.
Actually, here's a small example:
# main.pl
use lib '.';
use DB;

for (1..10) {
    print DB::do_stuff();
}

# DB.pm
package DB;

sub do_stuff {
    my $a = 1;
    my $b = 2;
    my $c = $a + $b;
    return $c;
}

1;
Try running perl -d:NYTProf main.pl, then nytprofhtml and then inspect nytprof/DB-pm-8-line.html.
I don't know if it happens because NYTProf itself has an internal module named DB or because it handles modules starting with DB in some magical way; I've noticed that output for functions from DBI looks somewhat different too.
Is there a way to change/disable this behavior short of renaming my DB module?
That's hardly an option
You don't really have an option. The special DB package and the associated Devel:: namespace are coded into the perl compiler/interpreter. Unless you want to do without any debugging facilities altogether and live in fear of mysterious, unquantifiable side effects, you must refactor your code to rename your proprietary DB library.
Anyway, the name is very generic, and that's exactly why it is to be expected that it will be encountered.
On the contrary, Devel::NYTProf is bound to use the existing core package DB. It's precisely because it is a very generic identifier that an engineer should reject it as a choice for production code, which could be required to work with pre-existing third-party code at any point.
a module creatively named DB
This belies your own support of the choice. DB has been a core module since v5.6 in 2000, and anything already in CPAN should have been off the cards from the start. I feel certain that the clash of namespaces must have been discovered before now. Please don't be someone else who just sweeps the issue under the carpet.
Remember that, as it stands, any code you are now putting in the DB package is sharing space with everything that perl has already put there. It is astonishing that you are not experiencing strange and inexplicable symptoms already.
I don't see that it's too big a task to fix this, as long as you have a proper test suite for your 10MB of Perl code. If you don't, then hopefully you will at least not make the same mistakes again.
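To make the rename concrete with the small reproduction from the question, the change might look like this (MyApp::DB is just an arbitrary replacement name of mine):

# MyApp/DB.pm (was DB.pm)
package MyApp::DB;

sub do_stuff {
    my $a = 1;
    my $b = 2;
    my $c = $a + $b;
    return $c;
}

1;

# main.pl
use lib '.';
use MyApp::DB;    # was: use DB;

for (1..10) {
    print MyApp::DB::do_stuff();
}

After that, nytprofhtml should report MyApp::DB like any other module, since the clash with perl's built-in DB package is gone.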

How can I make each instance of a perl script have its own unique id?

I have a perl script that connects to the Plex API. It logs in and performs certain actions (mostly working).
However, the Plex API suggests (insists?) that each instance of the script send a unique ID, such that if I share this script with anyone else they should use a different string.
In the interests of keeping this simple, I don't want to have some configuration file that keeps that value outside of the script. I also can't leave a value hard-coded in; no one who downloads this will change it.
Could the perl script modify itself?
If I were to declare it as such:
my $uuid = 1;
... then could I not check immediately afterward if this value is equal to 1, and if so overwrite that with a randomly generated uuid? The script would then exit, but somehow re-invoke itself (so the user doesn't have to run it a second time).
Is there a safe way to do this? Alternatively, is there a better way to accomplish the goal without using this method?
Make the last line of your script __DATA__ and append the ID to the script either at installation or first run. Reading from the special <DATA> handle reads the data segment of a script.
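A minimal sketch of that idea follows, assuming UUID::Tiny's :std interface for generating the ID and that the script file itself ($0) is writable; error handling is kept to a minimum:

#!/usr/bin/perl
use strict;
use warnings;
use UUID::Tiny ':std';

# Whatever follows __DATA__ in this very file can be read via <DATA>.
my $uuid = <DATA>;

if ( !defined $uuid || $uuid !~ /\S/ ) {
    # First run: generate an ID and append it to this script after __DATA__.
    $uuid = create_uuid_as_string(UUID_V4);
    open my $self_fh, '>>', $0 or die "Cannot append to $0: $!";
    print {$self_fh} "$uuid\n";
    close $self_fh;
}
chomp $uuid;

print "Client ID: $uuid\n";

__DATA__

On the first run the data section is empty, so the script appends a freshly generated UUID to its own file; every later run reads that same value back. Anyone who receives a pristine copy gets their own ID the first time they run it.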
You could use UUID::Tiny to generate a random UUID:
use UUID::Tiny;
my $uuid = create_UUID(UUID_V4);
To preserve the UUID between invocations, you'll have to modify the script itself. The answers in this thread might be helpful.
Update
You say in the comments that you want a different unique ID "per installation", but you also say that "it needs to be the same value for any given user", so I'm no longer sure that my answer will satisfy your requirements.
I suggest that you use the system UUID returned by dmidecode. Of course you will need to have it installed on your computer, and there's a parser module for it on CPAN called Parse::DMIDecode.
It's slightly more complex if you have to support Windows systems. You can use DmiDecode for Windows, which is available as a ready-built binary, but the parser module explicitly checks that there are no colons (amongst other things) in the path to the dmidecode executable, so a call to the probe method won't work. Instead you must call dmidecode yourself and pass the result to the parse method.
This short example works fine on both Linux and Windows:
use strict;
use warnings 'all';
use feature 'say';
use Parse::DMIDecode;
my $decoder = Parse::DMIDecode->new;
$decoder->parse(qx{dmidecode});
say $decoder->keyword('system-uuid');
output
35304535-3439-4344-3232-3245FFFFFFFF

UniData: list all available subroutines / all parameters

I am trying to wrap some UniData subroutines in a SOAP web service. I'm planning to use C# and the UODOTNET library (IBM U2 Data Management Interface for .NET). I'm also looking to create an engine that reads all the available subroutines from the data server, reads all the required parameters, and dynamically generates the required code for the web service.
My code would be something like this:
var session = UniObjects.OpenSession(
    "192.168.0.1",
    "user",
    "password",
    "account"
);
var cmd = session.CreateUniCommand();
cmd.Command = "LIST SUBURB.INDEX"; // ?????
cmd.Execute();
var res = cmd.Response;
Question 1: Is there any command that I can use to retrieve the list of all available subroutines?
Question 2: Is there any command that I can use to retrieve list of all parameters for specific subroutine?
Cheers
The short answer is no.
The longer answer is yes, but with a lot of work.
Since you are asking this question, I'm going to assume you are missing a lot of general knowledge about the platform. Hence, to be able to do this you'll need to:
Learn about how VOC works, specifically how executable code can be catalogued here.
Learn about the CATALOG and how cataloguing programs globally, locally and directly differs.
Understand how your system in particular is designed. Some places have everything directly catalogued in the VOC, others are a mix. If the former, it'll be easier for your question.
Once you understand the above, you'll know how to get a list of all executable programs from VOC, local catalog and global catalog. For example, a simplified example for the VOC is the UniQuery command LIST VOC WITH F1="C".
The hard part is getting the parameter list, for which there isn't any system command. To do this you have 2 options:
Reverse engineer the byte code of every subroutine and tease out the number of parameters
If you have access to the related source code, parse it to generate the list (see the sketch below).
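For option 2, here is a rough Perl sketch of the idea, assuming the source follows the usual UniBasic convention of a SUBROUTINE NAME(ARG1, ARG2, ...) header on a single line; real code may spread the header over continuation lines or hide it behind includes, so treat this as a starting point only:

#!/usr/bin/perl
use strict;
use warnings;

my %params_for;

for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while ( my $line = <$fh> ) {
        # e.g. "SUBROUTINE GET.USER(USER.ID, RESULT, ERR.MSG)"
        if ( $line =~ /^\s*SUBROUTINE\s+([\w.]+)\s*\(([^)]*)\)/i ) {
            my ( $name, $args ) = ( $1, $2 );
            my @args = grep { length } split /\s*,\s*/, $args;
            $params_for{$name} = scalar @args;
        }
    }
    close $fh;
}

printf "%-30s %d\n", $_, $params_for{$_} for sort keys %params_for;

Run it against whatever BASIC source files you can export from the account, e.g. perl count_params.pl BP/* (the BP directory name is only an example).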
Just wanted to add a comment on this one: in UniData there is a MAKE.MAP.FILE command that will identify programs and subroutines (and the number of parameters) and put the information in the '_MAP_' file. This does not tell you what the parameters are used for, but it helps.

Perl: What is the definition of a test?

I feel a bit ashamed to ask this question, but I am curious. I recently wrote a script (without organizing the code in modules) that reads the log files of a store and saves the info to the database.
For example, I wrote something like this (via Richard Huxton):
while (<$infile>) {
    if (/item_id:(\d+)\s*,\s*sold/) {
        my $item_id = $1;
        $item_id_sold_times{$item_id}++;
    }
}

my @matched_item_ids = keys %item_id_sold_times;

my $owner_ids =
    Store::Model::Map::ItemOwnerMap->fetch_by_keys( \@matched_item_ids )
    ->entry();

for my $owner_id (@$owner_ids) {
    $item_id_owner_map{$owner_id}++;
}
Say this script is called script.pl. When testing the file, I made a file script.t and had to repeat some blocks of script.pl in script.t. After copy-pasting the relevant code sections, I do confirmations like:
is( $item_id_sold_times{1}, 1, "Number of sold items of item 1" );
is( $item_id_owner_map{3}, 8, "Number of sold items for owner 3" );
And so on and so on.
But some have pointed out that what I wrote is not a test. It is a confirmation script. A good test would involve organizing the code into modules, writing a script that calls the methods in those modules, and writing tests for the modules.
This has made me think about what definition of a test is most widely used in software engineering. Perhaps some of you who have even tested Perl core functions can give me a hand. Can a script (not organized into modules) not be properly tested?
Regards
An important distinction is: If someone were to edit (bugfix or new feature) your 'script.pl' file, would running your 'script.t' file in its current state give any useful information on the state of the new 'script.pl'? For this to be the case you must include or use the .pl file instead of selectively copy/pasting excerpts.
In the best case you design your software modularly and plan for testing. If you want to test a monolithic script after the fact, I suppose you could write a test script that includes your main program after an exit() and at least have the subroutines to test...
You can test a program just like a module if you set it up as a modulino. I've written about this all over the place, but most notably in Mastering Perl. You can read Chapter 18 online for free right now, since I'm working on the second edition in public. :)
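For illustration, here is roughly what the modulino pattern looks like applied to a script such as the one above; run(), count_sold_items(), store.log and the sample data are all names I've made up to show the shape:

#!/usr/bin/perl
# script.pl as a modulino: it runs normally from the command line, but
# does nothing on its own when loaded with require from a .t file, so
# the tests can call its subroutines directly.
use strict;
use warnings;

run() unless caller;

sub run {
    open my $infile, '<', 'store.log' or die "Cannot open store.log: $!";
    my %item_id_sold_times = count_sold_items($infile);
    # ... the rest of the original main logic ...
}

sub count_sold_items {
    my ($fh) = @_;
    my %sold;
    while (<$fh>) {
        $sold{$1}++ if /item_id:(\d+)\s*,\s*sold/;
    }
    return %sold;
}

1;    # so that require './script.pl' in script.t returns a true value

A matching script.t could then exercise the subroutines without duplicating them:

use Test::More;
require './script.pl';

my %sold = count_sold_items(\*DATA);
is( $sold{1}, 1, "Number of sold items of item 1" );
done_testing();

__DATA__
item_id:1 , sold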
Without getting into verification versus validation, etc., a test is any activity that verifies the behavior of something else: that its behaviors match requirements and that certain undesirable behaviors don't occur. The scope of a test can be limited (only verifying a few of the desired behaviors) or it can be comprehensive (verifying most or all of the required or desired behaviors), probably with lots of steps or components to the test to make it so. In my mind, the term "confirmation" in your context is another way of saying "limited-scope test". Certainly it doesn't completely verify the behavior of what you're testing, but it does something. In direct answer to the question at hand: I think calling what you did a "confirmation" is just a matter of semantics.

Including files in a pm module

I am totally new to Perl/Fastcgi.
I have some pm modules to which I will have to add a lot of scripts, and over time this will grow and grow. Hence, I need a structure which makes the admin easier.
So, I want to create files in some kind of directory structure which I can include. I want the files that I include to be treated exactly as if the text were written in the file where I do the include.
I have tried 'do', 'use' and 'require'. The actual file I want to include is in one of the directories Perl is looking in. (verified using perl -V)
I have tried within and outside BEGIN {}.
How do I do this? Is it possible at all to include pm files in pm files? Does it have to be pm files that I include, or can it be any extension?
I have tried several ways, included below is my last try.
Config.pm
package Kernel::Config;

sub Load {
    # other activities
    require 'severalnines.pm';
    # other activities
}

1;
severalnines.pm
# Filter Queues
$Self->{TicketAcl}->{'ACL-hide-queues'} = {
    Properties  => {},
    PossibleNot => {
        Ticket => { Queue => ['[RegExp]^*'] },
    },
};

1;
I'm not getting any errors related to this in Apache's error_log. Still, the code is not recognized as it would be if I put it in the Config.pm file.
I am not about to start programming a lot, just to do some admin in a 3rd-party application. Still, I have searched around trying to learn how including files works. Is severalnines.pm considered to be a Perl module, and do I need to use a program like h2xs, or similar, in order to "create" the module (told you, totally newbie...)?
Thanks in advance!
I usually create my own module prefix -- named after the project or the place I worked. For example, you might put everything under Mu with modules named like Mu::Foo and Mu::Bar. Use multiple modules (don't try to keep everything in one single file) and name your modules with the *.pm suffix.
Then, if the Mu directory is in the same directory as your programs, you only need to do this:
use Mu::Foo;
use Mu::Bar;
If they're in another directory, you can do this:
use lib qw(/path/to/other/directory);
use Mu::Foo;
use Mu::Bar;
Is it possible at all to include pm files in pm files?
Why certainly yes.
So, I want to create files in some kind of directory structure which I can include. I want the files that I include to be treated exactly as if the text were written in the file where I do the include.
That's a bad, bad idea. You are better off using the package mechanism. That is, declare each of your modules with its own package name. Otherwise, one of your modules will have a variable or function in it that your script overrides, and you'll never, ever know it.
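As a sketch of what that could look like with the files from the question (Kernel::Config::SeveralNines is a name I've made up, and $Self is passed in explicitly; whether the third-party application picks configuration up this way depends on how it loads its config files):

# Kernel/Config/SeveralNines.pm
package Kernel::Config::SeveralNines;

use strict;
use warnings;

# Filter Queues: apply the ACL settings to the config object handed to us.
sub Load {
    my ($Self) = @_;

    $Self->{TicketAcl}->{'ACL-hide-queues'} = {
        Properties  => {},
        PossibleNot => {
            Ticket => { Queue => ['[RegExp]^*'] },
        },
    };

    return 1;
}

1;

Then, in Kernel/Config.pm, instead of require 'severalnines.pm':

use Kernel::Config::SeveralNines;
# ... inside sub Load, where $Self is the config object ...
Kernel::Config::SeveralNines::Load($Self);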
In Perl, you can reference variables in your modules by prefixing them with the module name, as File::Find does: for example, $File::Find::name is the found file's name. This doesn't pollute your namespace.
If you really want your module's functions and variables in your namespace, look at the @EXPORT_OK list variable in Exporter. This is a list of all the variables and functions that your module will allow a script to import into its namespace. However, it's not automatic; you have to list them next to your use statement. That way, you're more likely to know about them. Using Exporter isn't too difficult. In your module, you'd put:
package Mu::Foo;
use Exporter qw(import);
our @EXPORT_OK = qw(convert $fundge @ribitz);
Then, in your program, you'd put:
use Mu::Foo qw(convert $fundge @ribitz);
Now you can access convert, $fundge and @ribitz as if they were part of your main program. However, you now have documented that you're pulling in these subroutines and variables from Mu::Foo.
(If you think this is complex, be glad I didn't tell you that you really should use Object Oriented methods in your Modules. That's really the best way to do it.)
if (    'I want the files that I include to be treated exactly as if the text were written in the file where I do the include.'
     && 'I will have to add a lot of scripts and over time this will grow and grow' ) {
    warn 'This is probably a bad idea because you are not creating any kind of abstraction!';
}
Take a look at Exporter; it will probably give you a good solution!