Is this code which is using Switch.pm safe? - perl

In our company we were using this code (given at the end) for about 10 years and it worked fine.
Some days ago we faced some issues and we had to re-code the complete package, we decided to replace this code with Switch module by Damian (in order to improve the readability of code).
Everything is working fine for us.
Later I found on Perlmonks that Damian had put this module under
Damian modules you shouldn't use in production because their purpose
is to explore and prototype future core language features.
But it is working fine for us because we are not hitting the limitations of this module (I guess).
Now I ask you guys to please have a look at the both implementations (nested if else vs switch) and let me know whether using Switch in the newer implementation is fine or are we creating some future problems for us? Is using Switch in the code given below fine or are there any hidden bugs/problems?
I've already read the bugs and reviews of this module on CPAN and Perlmonks and I guess our code is far away from hitting those bugs (I think so).
We are using Perl 5.8.5.
PS: I know the alternatives of Switch, we have given/when in Perl 5.10, we can use dispatch table and other solutions which are specified here, but right now we just want to compare the new implementation which uses Switch.
Using nested if else
if ($command =~ /^enter$/) {
$self->show_main_frames();
}
elsif ($command =~ /^XYZ_MENU/i) {
$self->show_main_menu($manual, $dbot);
}
elsif ($command =~ /^DBOT/i) {
$dbot->process();
}
# XML is used for the reminders-history: Request 2666
elsif ($command =~ /^XML_DBOT/i) {
$dbot->process();
}
elsif ($command =~ /^UGS/i) {
$ugsui->process();
}
elsif ($command eq "kill") {
my $login = $self->{COMMON_HASH}{login} || "";
my $su_login = $self->{CONF}->get("start", "SU_LOGIN");
if ($login eq $su_login) {
# usually only certain user with certain permission will be
# able to do this.
$self->do_error("Daemon was killed by ".$login);
$self->db_connection->disconnect();
$self->{LOG}->write("User $login killed the daemon", 0);
exit; # this 'exit' actually kill the daemon
}
else {
$self->do_error("User $login tried to kill the daemon. ".
"This incident will be reported");
$self->{LOG}->write("User $login tried to kill the daemon", 2);
}
}
elsif ($command eq "logout") {
# check if we should delete the password cookie
my $forget_me = $self->{CGI}->param("forget_me") || 0;
if ($forget_me) {
$self->{DB_PASSWORD_COOKIE}->delete_cookie();
}
$ugsui->do_logout();
# Cliff edit remove id from logged_in
$session->remove_session($session->login());
# delete the session of the user
delete $self->{SESSIONS}{$session->id()};
if ($self->{CACHE_TO_FILE}) {
my $session_data_path =
XYZ_DIR
."/code/cache/session_data"
.$session->id();
unlink($session_data_path);
}
}
# if we just login we should create all the main frames
elsif ($command eq "login") {
# if extra_param holds "command*XXX" the XXX will be placed instead of
# the command. extra_param holds pairs that are astrics-separated
my $extra_param = $cgi->param("extra_param");
$extra_param = "" if (!defined($extra_param));
$extra_param =~ /command\*([^\*]+)/i;
my $other_command = defined($1) ? $1 : "";
if ($other_command =~ /^dbot/i) { # meanwhile - works only on dbot
# commands
$command = $other_command;
# now we will get the other parameters from the extra_param
# (actually including the command that is still in the
# $extra_param)
while ($extra_param =~ /^\*?([^\*]+)\*([^\*]+)(.*)/) {
$extra_param = $3;
my $name = $1;
my $value = $2;
$cgi->param(-name => $name,
-value => $value);
}#end while
}#end if
else{
$self->show_main_frames();
}
}#end elsif
else {
$self->show_main_frames();
}#end outer else
Using Switch
switch ($command)
{
case /^enter$/ { $self->show_main_frames() }
case /^XYZ_MENU/i { $self->show_main_menu($manual, $dbot) }
case /^DBOT/i { $dbot->process() }
case /^XML_DBOT/i { $dbot->process() }
case /^UGS/i { $ugsui->process() }
case "kill" {
my $login = $self->{COMMON_HASH}{login} || "";
my $su_login = $self->{CONF}->get("start", "SU_LOGIN");
if ($login eq $su_login) {
# usually only certain user with certain permission will be
# able to do this.
$self->do_error("Daemon was killed by ".$login);
$self->db_connection->disconnect();
$self->{LOG}->write("User $login killed the daemon", 0);
exit; # this 'exit' actually kill the daemon
}
else {
$self->do_error("User $login tried to kill the daemon. ".
"This incident will be reported");
$self->{LOG}->write("User $login tried to kill the daemon", 2);
}
}
case "logout" {
# check if we should delete the password cookie
my $forget_me = $self->{CGI}->param("forget_me") || 0;
if ($forget_me) {
$self->{DB_PASSWORD_COOKIE}->delete_cookie();
}
$ugsui->do_logout();
# Cliff edit remove id from logged_in
$session->remove_session($session->login());
# delete the session of the user
delete $self->{SESSIONS}{$session->id()};
if ($self->{CACHE_TO_FILE}) {
my $session_data_path =
XYZ_DIR
."/code/cache/session_data"
.$session->id();
unlink($session_data_path);
}
}
case "login" {
# if extra_param holds "command*XXX" the XXX will be placed instead of
# the command. extra_param holds pairs that are astrics-separated
my $extra_param = $cgi->param("extra_param");
$extra_param = "" if (!defined($extra_param));
$extra_param =~ /command\*([^\*]+)/i;
my $other_command = defined($1) ? $1 : "";
if ($other_command =~ /^dbot/i)
{ # meanwhile - works only on dbot
# commands
$command = $other_command;
# now we will get the other parameters from the extra_param
# (actually including the command that is still in the
# $extra_param)
while ($extra_param =~ /^\*?([^\*]+)\*([^\*]+)(.*)/) {
$extra_param = $3;
my $name = $1;
my $value = $2;
$cgi->param(-name => $name,
-value => $value);
}#end while
}#end if
else {$self->show_main_frames();}
}
else {$self->show_main_frames();}
} # end switch

Switch does its own parsing of the source code. This can lead to hard to diagnose errors in the code that directly uses it. The kind of problems Switch creates are not intermittent, so if your code works, you have nothing to worry about.
But really, it doesn't add much at all.
With Switch:
switch ($command) {
case /^enter$/ { $self->show_main_frames() }
case /^XYZ_MENU/i { $self->show_main_menu($manual, $dbot) }
case /^DBOT/i { $dbot->process() }
case /^XML_DBOT/i { $dbot->process() }
case /^UGS/i { $ugsui->process() }
case "kill" {
my $login = $self->{COMMON_HASH}{login} || "";
Without Switch:
for ($command) {
if (/^enter$/) { $self->show_main_frames() }
elsif (/^XYZ_MENU/i) { $self->show_main_menu($manual, $dbot) }
elsif (/^DBOT/i) { $dbot->process() }
elsif (/^XML_DBOT/i) { $dbot->process() }
elsif (/^UGS/i) { $ugsui->process() }
elsif ($_ eq "kill") {
my $login = $self->{COMMON_HASH}{login} || "";
(elsif (/^kill\z/) would also work.)

Actually Switch module does not provide you any "killer feature"; the same can be done with elsif statement which is secure, stable and does not have drawbacks that Switch does. Here is problems with Switch i got in my project (and i dont use it anymore):
Switch is made throgh Perl filters. This technique have following limits:
Your source code actually rewritten on-the-fly and replaces with
sequent elsif statements.
Some Perl error reports will refer wrong line; some of them showing code you dont have in your source (autogenerated code).
Not filter limit, but limit of module itself:
If the file(.pl or .pm) where you call use Swtich excess 1Mbyte size this can lead to "mysterious errors" (as written in doc). I can confirm these errors do not leading to Switch module and is completely unobivious, so you can have hard debug time after some weeks of coding/documentation.
I recommend to use elsif or given..when statements which is available since Perl 5.10. So if you using perl 5.8.x - use elsif.
Also you can read "Limitations" paragraph for Switch documentation.

Because Switch does own source code parsing, it does not work at all in certain circumstances. For example, it is impossible to use it with mod_perl.
However, if you have Perl 5.10 or later, there is much better replacement with effectively the same functionality: given/when
use v5.10;
given ($var) {
when (/^abc/) { $abc = 1 }
when (/^def/) { $def = 1 }
when (/^xyz/) { $xyz = 1 }
default { $nothing = 1 }
}
given is supported by Perl core (and works everywhere, including mod_perl) - you just use v5.10; and it is instantly available to you.

Related

Perl error: not a reference

I recently migrated some Perl code from SunSolaris to a Linux(Ubuntu) box of 64 bit. After the migration Storable.pm is breaking with the following error:
Byte order is not compatible at /usr/lib/perl/5.18/Storable.pm, at /home/VD/Cache.pm line 347.
After some research on the internet I found that I need to use nfreeze instead of thaw, but now I receive the following error:
not a reference at /home/VD/Cache.pm line 347.
Any suggestions how to fix this?
sub get
{
my($self, $type, $param_ref) = #_;
#return 1 if(!$self->{'INI'}{'sf.system.cache.enabled'});
if($self->{'INI'}{'sf.system.cache.database.enabled'})
{
### DATABASE
my $param = $self->SF::Cache::convert_parameter($type, $param_ref);
if($self->SF::Cache::CACHE_TABLE_USERCONTENT && $$param{'type'} == 2)
{
### user-content
my $query = 'SELECT PARAM_CONTENT AS C, DATA AS D FROM sf_cache_usercontent WHERE SITE=? AND PARAM_USER=?';
my $bindvar = { 1=>$self->{'site'}, 2=>$$param{'user'} };
my $sth = $self->db_select($query, $bindvar);
#print SF::Util::debug_dumpquery($query, $bindvar);
return undef if($self->{'Error'});
my %usercontent;
undef(%usercontent);
while(my $hashref = $self->db_fetch($sth))
{
$usercontent{$$hashref{'C'}} = $$hashref{'D'};# ? 1 : 0;
}
return \%usercontent;
}
else
### ******************************************************************************************************
{
my $ret = $self->SF::Cache::get_database('DATA', $param);
return Storable::nfreeze($ret) if(defined $ret);
}
}
else
{
### FILESYSTEM
my $filename = $self->SF::Cache::filename($type, $param_ref);
if($filename && -e $filename)
{
if($self->{'INI'}{'sf.system.cache.lock.enabled'} && defined &lock_retrieve)
{
return lock_retrieve $filename;
}
else
{
return retrieve $filename;
}
}
else
{
$! = 0;
}
}
return undef;
}
Go back to your original system, thaw then nfreeze the file there to fix it.
perl -MStorable=nstore,retrieve -e'nstore(retrieve($ARGV[0]), $ARGV[1])' file fixed
So, "not a reference" means ... exactly what it says on the tin. Can you try printing the thingy with Data::Dumper from comments it's this line:
return Storable::nfreeze($ret) if(defined $ret)
So - what does:
print Dumper $ret;
produce? Is it a reference?
I'm not so sure though that you're right about needing nfreeze instead of thaw, because they both do different things. freeze packs a variable; thaw unpacks it. So nfreeze can replace freeze.
But the core purpose of doing this is to transfer your packed up scalar to another program on another architecture. Is this what you're doing?
If so, can I suggest instead considering transferring it as JSON or XML instead?

Regular expression handling in elsif block in perl

GMF File:
TSTARTCUSTEVSUMMROW_GPRS
CUSTEVSUMMROW_GPRS GPRS - Subscriber Package (Paygo)|93452|MB|240|33952
CUSTEVSUMMROW_GPRS GPRS - MBB Plan (Paygo)|93452|MB|160|20128
TENDCUSTEVSUMMROW_GPRS
TSTARTCUSTEVSUMMROW_GPRS_SIMPLE
CUSTEVSUMMROW_GPRS_SIMPLE GPRS - LTE Roam Package|1529551|MB|85|260536
CUSTEVSUMMROW_GPRS_SIMPLE GPRS - LTE Roam Package|65461|MB|20000|1309252
TENDCUSTEVSUMMROW_GPRS_SIMPLE
Code:
if ( $line =~ m/^(CUSTEVSUMMROW_SIMPLE|CUSTEVSUMMROW_GPRS_SIMPLE|CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW|CUSTPRODSUMMROW)\s(.*?)\|.*\|(.*?)$/) {
$tag = $1;
$lineTxt = $2;
$amt = $3;
if ( $tag =~ m/^(CUSTEVSUMMROW|CUSTEVSUMMROW_SIMPLE)/ ) {
print "Processing some validations";
} else {
Print " Mapping failed";
} elsif ( $tag =~ m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW_GPRS_SIMPLE)/ ) {
if () {
#It has to do some validations.
} else {
#Failed;
}
}
}
When I try to process the elseif condition is not able to process. Could you please help me out in solving this issue?
Output:
Unable to map:CUSTEVSUMMROW_GPRS | GPRS - Data Only LTE Package Roaming | 34646.2272
Unable to map:CUSTEVSUMMROW_GPRS | GPRS - LTE Dealer1 Package Roaming | 34609.3312
Unable to map:CUSTEVSUMMROW_GPRS_SIMPLE | GPRS - Simple Subscriber Package 3 | 32.1899
Unable to map:CUSTEVSUMMROW_GPRS_SIMPLE | GPRS - Simple Talk and Text Package | 0.2702
I would recommend a change of approach. Rather than individually matching specific parts of the line, and having to do this over and over again, tokenize it at the start. That is, split it into grammatical pieces. Once the parsing is out of the way, it will be much easier to work with.
An example from English, to parse things like "Go to the store", "You go to the store", "I went to the store", "We are going to the store", you could search for go|going|went at various positions, or you can break it up into subject (go), verb (you), object (store) and then work with them.
It looks like you'e got a | delimited set of fields (your post conflicts on this detail, adjust as necessary). Split on that pipe to tokenize.
my($tag, $description, $amount, $units, $limit, $something) = split m{\|}, $line;
Now you can work with $tag without having to do further parsing on the whole line.
if( $tag eq 'CUSTEVSUMMROW' or $tag eq 'CUSTEVSUMMROW_SIMPLE' ) {
...
}
elsif( $tag eq 'CUSTEVSUMMROW_GPRS' or 'CUSTEVSUMMROW_GPRS_SIMPLE' ) {
...
}
You can make the code simpler by pushing the tag logic into a subroutine.
sub is_tag_of_type {
my($tag, $type) = #_;
return 1 if $type eq 'GPRS' and $tag =~ /GPRS/;
return 1 if $type eq 'SIMPLE' and $tag =~ /SIMPLE/;
...
}
Or maybe the tag has its own little grammar and can be split into tokens.
sub tokenize_tag {
my $tag = shift;
my #tokens = split /_/, $tag;
return map { $ _ => 1 } #tokens;
}
Then your code to process a line looks like this.
my($tags, $description, $amount, $units, $limit, $something) = split m{\|}, $line;
my %tags = tokenize_tags($tags);
if( $tags{GPRS} ) {
...
}
else {
...
}
Writing if ... else ... elsif is a syntax error -- your code won't even run. Assuming that the elsif should be between the if and the else you have another problem: the regex in the if condition is more general than the one in the elsif condition. CUSTEVSUMROW will match anything that CUSTEVSUMMROW_GPRS or CUSTEVSUMMROW_GPRS_SIMPLE would. Swap the if and elsif blocks so that the specific check happens before the general one.
if ($tag =~ /^CUSTEVSUMMROW_GPRS/) {
...
}
elsif ($tag =~ /^CUSTEVSUMMROW/) {
...
}
else {
...
}

How can I make my Perl Jabber bot an event-driven program?

I'm trying to make a Jabber bot and I am having trouble keeping it running while waiting for messages. How do I get my script to continuously run? I have tried calling a subroutine that has a while loop that I, in theory, have set up to check for any messages and react accordingly but my script isn't behaving that way.
Here is my source: http://pastebin.com/03Habbvh
# set jabber bot callbacks
$jabberBot->SetMessageCallBacks(chat=>\&chat);
$jabberBot->SetPresenceCallBacks(available=>\&welcome,unavailable=>\&killBot);
$jabberBot->SetCallBacks(receive=>\&prnt,iq=>\&gotIQ);
$jabberBot->PresenceSend(type=>"available");
$jabberBot->Process(1);
sub welcome
{
print "Welcome!\n";
$jabberBot->MessageSend(to=>$jbrBoss->GetJID(),subject=>"",body=>"Hello There!",type=>"chat",priority=>10);
&keepItGoing;
}
sub prnt
{
print $_[1]."\n";
}
#$jabberBot->MessageSend(to=>$jbrBoss->GetJID(),subject=>"",body=>"Hello There! Global...",type=>"chat",priority=>10);
#$jabberBot->Process(5);
#&keepItGoing;
sub chat
{
my ($sessionID,$msg) = #_;
$dump->pl2xml($msg);
if($msg->GetType() ne 'get' && $msg->GetType() ne 'set' && $msg->GetType() ne '')
{
my $jbrCmd = &trimSpaces($msg->GetBody());
my $dbQry = $dbh->prepare("SELECT command,acknowledgement FROM commands WHERE message = '".lc($jbrCmd)."'");
$dbQry->execute();
if($dbQry->rows() > 0 && $jbrCmd !~ /^insert/si)
{
my $ref = $dbQry->fetchrow_hashref();
$dbQry->finish();
$jabberBot->MessageSend(to=>$msg->GetFrom(),subject=>"",body=>$ref->{'acknowledgement'},type=>"chat",priority=>10);
eval $ref->{'command'};
&keepItGoing;
}
else
{
$jabberBot->MessageSend(to=>$msg->GetFrom(),subject=>"",body=>"I didn't understand you!",type=>"chat",priority=>10);
$dbQry->finish();
&keepItGoing;
}
}
}
sub gotIQ
{
print "iq\n";
}
sub trimSpaces
{
my $string = $_[0];
$string =~ s/^\s+//; #remove leading spaces
$string =~ s/\s+$//; #remove trailing spaces
return $string;
}
sub keepItGoing
{
print "keepItGoing!\n";
my $proc = $jabberBot->Process(1);
while(defined($proc) && $proc != 1)
{
$proc = $jabberBot->Process(1);
}
}
sub killBot
{
print "killing\n";
$jabberBot->MessageSend(to=>$_[0]->GetFrom(),subject=>"",body=>"Logging Out!",type=>"chat",priority=>10);
$jabberBot->Process(1);
$jabberBot->Disconnect();
exit;
}
POE has some pretty good event frameworks. I don't know how good the one for Jabber (POE::Component::Jabber) is, but it's probably worth looking at.
AnyEvent::XMPP is ridiculously comprehensive, and, since it uses AnyEvent, can be run in any event driven application with a supported loop (AnyEvent's own, Event, EV, Tk, Glib/Gtk, even POE).
I think you can make your example work by doing this:
0 while $jabber->Process
Having said that, I would strongly recommend using a proper Event handling framework such as AnyEvent (my personal favorite) or POE (the traditional choice).

Obtain a switch/case behaviour in Perl 5

Is there a neat way of making a case or switch statement in Perl 5?. It seems to me they should include a switch on version 6..
I need this control structure in a script, and I've heard you can import a "switch module". But how can I achieve it without imports to minimize dependencies and acquire portability?
If you are using Perl 5.10 you have given/when which is a switch statement (note, it can do more than compare with regexes, read the linked docs to see its full potential):
#or any of the dozen other ways to tell 5.10 to use its new features
use feature qw/switch/;
given($string) {
when (/^abc/) { $abc = 1; }
when (/^def/) { $def = 1; }
when (/^xyz/) { $xyz = 1; }
default { $nothing = 1; }
}
If you are using Perl 5.8 or earlier you must make do with if/elsif/else statements:
if ($string =~ /^abc/) { $abc = 1; }
elsif ($string =~ /^def/) { $def = 1; }
elsif ($string =~ /^zyz/) { $xyz = 1; }
else { $nothing = 1; }
or nested condition operators (?:):
$string =~ /^abc/ ? $abc = 1 :
$string =~ /^def/ ? $def = 1 :
$string =~ /^xyz/ ? $xyz = 1 :
$nothing = 1;
There is a module in Core Perl (Switch) that gives you fake switch statements via source filters, but it is my understanding that it is fragile:
use Switch;
switch ($string) {
case /^abc/ {
case /^abc/ { $abc = 1 }
case /^def/ { $def = 1 }
case /^xyz/ { $xyz = 1 }
else { $nothing = 1 }
}
or the alternate syntax
use Switch 'Perl6';
given ($string) {
when /^abc/ { $abc = 1; }
when /^def/ { $def = 1; }
when /^xyz/ { $xyz = 1; }
default { $nothing = 1; }
}
The suggestion in Programming Perl is:
for ($string) {
/abc/ and do {$abc = 1; last;};
/def/ and do {$def = 1; last;};
/xyz/ and do {$xyz = 1; last;};
$nothing = 1;
}
Just a short comment about the core Switch module that's been mentioned a couple of times in answers. The module in question relies on source filters. Among other things, that may result in wrong lines reported for errors. It's so bad that none of the core developers really remembers or cares to remember why it was accepted into the perl core in the first place.
Furthermore, Switch.pm will be the first Perl module ever to be removed from the perl core. The next major release of perl, 5.12.0, will still have it, albeit with a deprecation warning. That deprecation warning will go away if you explicitly install Switch.pm from CPAN. (You get what you ask for.) In the next release down the road, 5.14, Switch.pm will be entirely removed from core.
An equivalent solution that I like is a dispatch table.
my $switch = {
'case1' => sub { print "case1"; },
'case2' => sub { print "case2"; },
'default' => sub { print "unrecognized"; }
};
$switch->{$case} ? $switch->{$case}->() : $switch->{'default'}->();
print("OK : 1 - CANCEL : 2\n");
my $value = <STDIN>;
SWITCH: {
($value == 1) && last(SWITCH);
($value == 2) && do {print("Cancelled\n"); exit()};
print("??\n");
}

MATLAB: determine dependencies from 'command line' excluding built in dependencies

Is there a way to determine all the dependencies of an .m file and any of the dependencies of the files it calls using a command in a script (command-line)?
There was a question like this before and it was really good because it suggested using the depfun function. BUT the issue with this was that it is outputting the MATLAB related files that it depends on as well.
EXAMPLE:
testing.m
disp('TESTING!!');
The output of depfun('testing')
'C:\testing.m'
'C:\MATLAB\R2008a\toolbox\matlab\datatypes\#opaque\char.m'
'C:\MATLAB\R2008a\toolbox\matlab\datatypes\#opaque\double.m'
'C:\MATLAB\R2008a\toolbox\matlab\datatypes\#opaque\toChar.m'
'C:\MATLAB\R2008a\toolbox\matlab\elfun\log10.m'
'C:\MATLAB\R2008a\toolbox\matlab\elmat\ans.m'
etc.
The list is a little bit longer.
The point here is that I was hoping there would be some similar function or a flag that would remove these unwanted dependencies.
Here are a couple of links I found helpful when I wrote up a simple function to create a table of contents for an m-file:
A thread discussing the undocumented function MLINTMEX
FDEP by Urs Schwarz on the MathWorks File Exchange
FARG by Urs Schwarz on the MathWorks File Exchange
EDIT: Since this problem piqued my curiosity, I started trying out a few ways I might approach it. Finding the dependencies on non-toolbox .m and .mex files was relatively trivial (I did this in MATLAB version 7.1.0.246):
fcnName = 'myfile.m';
fcnList = depfun(fcnName,'-quiet');
listIndex = strmatch('C:\Program Files\MATLAB71\toolbox',fcnList);
fcnList = fcnList(setdiff(1:numel(fcnList),listIndex));
Here, I just used DEPFUN to get the dependencies, then I removed any files that began with 'C:\Program Files\MATLAB71\toolbox', where the MATLAB toolboxes are located on my machine. Note that this assumes you aren't placing any of your own code in these MATLAB directories (which you shouldn't do anyway).
To get dependencies on .mat and .txt files, I took another approach. For each of the files you get from the above code, you could load the text of the file into MATLAB and parse it with a regular expression to find strings that end in a '.mat' or '.txt':
fid = fopen(fcnName,'rt');
fcnText = fscanf(fid,'%c');
fclose(fid);
expr = '[^\'']\''([^\''\n\r]+(?:\w\.(?:mat|txt)){1})\''[^\'']';
dataFiles = regexp(fcnText,expr,'tokens');
dataFiles = unique([dataFiles{:}]).';
There are a few limitations to the regular expression I used:
If you have a string like 'help.txt' that appears in a comment (such as the help comment block of a function), it will still be detected by the regular expression. I tried to get around this with a lookaround operator, but that took too long to run.
If you build a string from variables (like "fileString = [someString '.mat']"), it will not be detected by the regular expression.
The returned strings of file names will be relative path strings. In other words, if you have the strings 'help.txt' or 'C:\temp\junk.mat' in the function, the regular expression matching will return 'help.txt' or 'C:\temp\junk.mat', exactly as they appear in the function. To find the full path, you can use the WHICH function on each data file (assuming the files reside somewhere on the MATLAB path).
Hope you find these useful! =)
Try DepSubFun from TMW FileExchange.
Another way is just to exclude folders you don't need:
localdep = depfunresult(cellfun(#isempty,regexp(a,'toolbox')));
You can use any regexp pattern there.
Thank you for the responses so far.
I do not think that these are quite what I am looking to accomplish.
I was hoping there was already something that would determine local functions called within the main m-file, add them to the list, and proceed to look in each one until there are none left. It doesn't seem that any of these solutions do this
I have come up with a scheme that I will try to implement. It may be a bit brute force and the design might change as I work on it, but here is the concept.
There are quite a few assumptions made in this initial design but since it is mostly for me and a few others I don't think it will be a big issue for my general solution.
Files types to look for: .m .mat .mex* .txt (will be updated as needed)
Determine matlabpath and weed out toolbox paths (this is where it is an assumption your working directories are not called toolbox or that you don't have any special m-files you added to the other toolboxes)
hopefully leaving you only with directories you use and can call functions from. (also assumes you don't hardcode some type of [run 'C:\random\myscript.m']
brute force part:
look for the file types you are interested in and make a list of the ones in your working directory (pwd) and the remaining matlab paths
remove filenames that match one in the working directory.
iterate through searching the main m-file for each filename, if found add it to the array of dependent files. remove dependent files from the original list. search dependent files list with the "new" original list, repeat until no files left or no matches at all.
So far this is just the concept I have, I will also be searching a little more as well.
I got this script finally running today, it is a windows matlab based one as it makes a '!findstr "something" file.txt' call. (I would have preferred a grep but didn't know matlab equivalent.
I am going to ask my boss if I am allowed to post it on the matlab file exchange to share with others so hopefully I will update this soon with the link.
gnovice:
I don't have enough rep to comment on gnovice's comment of my description I wrote prior to writing the code.
But basically to determine which what it does is takes the filename of all files (broken into category of filetype), strips off the fullpathname and the extension, uses the above mentioned !findstr command to search it in the .m file that you are building the dependency for and outputs that to a temp.txt file (this is because I couldn't figure out a way to get a 1 or 0 or isempty return on the output of the command)
here is a breakdown of what I personally search for to determine if each file is used:
.m : 'filename ' or 'filename(' % covers the 'filename (' case
.mex* : same as above
.mat : was doing same as above but am going to change to some sort of load and the 'filename.mat' working on this probably tomorrow
.txt : simply searches for 'filename.txt'
With this method you may end up with a few extra text files or .m files but the key here is you should at least have all the files you need.
It also recursively calls itself on all the dependent files so that their dependencies are taken into account too.
-TaRDy
I wrote code a long time ago to do this for octave. I use it mainly to generate .dot files for graphviz to visualize the dependencies, but I also use it in makefiles for wrapping up dependencies when compiling code. it is perl code, unfortunately, but you can run it from a script by calling it via shell. it is fully recursive.
to run it, you'll have to change the OCT_BASE to point to the root directory of your code. (sorry, it is not matlab's path-variable aware). then I would probably run it as perl octavedepgrapher.pl -l
#! /bin/sh
exec perl -x -S $0 ${1+"$#"} # -*-perl-*-
#!perl
#
# octavedepgrapher.pl
# find the dependancy graph of octave file(s). prints a
# dot file suitable for graphviz
# Author: steven e. pav
# Created: 2006.07.16
# SVN: $Id$
#
# * Thu Aug 30 2007 Steven Pav
# - expanding to recognize matlabs pragma of %#function funcname
# version 0.3 2007.04.17
# add raw output mode.
# version 0.2 2007.03.05
# add media selection
# version 0.1 2006.08.24
# fixed multiple functions within file.
# added multiple edgeout capability.
# adding clusters for files.
# version 0.0 2006.07.16
# created.
#
#
########################################################################
########################################
# change only this
########################################
##OCT_BASE = qw(/home/spav/sys/octave/m/ ./ $ENV{OCTAVE});
#OCT_BASE = qw(/home/spav/sys/octave/m/ ./);
########################################################################
$VERSION = "octavedepgrapher version 0.02 2006.08.23\n";
########################################################################
use Getopt::Long;
$Getopt::Long::ignorecase = 0;
$Getopt::Long::order = $PERMUTE;
%OPT_MEANINGS = (
'H' => 'show Help.',
'l' => 'list the dependencies to standard out. do not make a dot file.',
'p' => 'give full path names.',
'm' => 'multi-edge. one for each function call.',
'g' => 'map connections from functions to global variables.',
'G' => 'map connections between functions which share global variables.',
'C' => 'do not cluster files.',
'D' => 'Debug.',
'd=s' => 'dependency mode for makefiles. sets -p and -l, and but outputs in makefile suitable format. the string is the extension (with dot) to substitute for .m',
'r=s' => 'aspect ratio (can be fill, auto, compact (default))',
'B=s' => 'base directory. if given, all directories are assumed relative to this one.',
'L=s' => 'colon separated list of base directories of libraries (_overrides_ OCT_BASE). should probably include ./',
'l=s' => 'colon separated list of base directories of libraries (in addition to OCT_BASE).',
'X=s' => 'colon separated list of base directories to exclude in the search.',
'M=s' => 'media selection',
);
$OPTS = join('',(map { substr($_,0,1); } keys(%OPT_MEANINGS)));
&GetOptions(keys %OPT_MEANINGS);
$opt_H && &die_usage; #done
$opt_L && (#OCT_BASE = split(/\s*:\s*/,$opt_L));
$opt_l && (push(#OCT_BASE,split(/\s*:\s*/,$opt_l)));
$opt_X && (#OCT_BASE = #{&rm_dirs(\#OCT_BASE,$opt_X)});
if (not $opt_M)
{ $size="25,20";
} else {
($opt_M =~ m/^legal/i) and $size = '8.5,14';
($opt_M =~ m/^letter/i) and $size = '8.5,11';
($opt_M =~ m/^A0$/i) and $size = '33.1,46.8';
($opt_M =~ m/^A1$/i) and $size = '23.4,33.1';
($opt_M =~ m/^A2$/i) and $size = '16.5,23.4';
($opt_M =~ m/^A3$/i) and $size = '11.7,16.5';
($opt_M =~ m/^A4$/i) and $size = '8.3,11.7';
($opt_M =~ m/^A4dj$/i) and $size = '8.3,11.7';
($opt_M =~ m/^A5$/i) and $size = '5.8,8.3';
}
#if (not $opt_r) { $ratio = 'fill'; } else { $ratio = $opt_r; }
$ratio = $opt_r || 'fill';
if ($opt_d)
{
$opt_l = $opt_p = 1;
}
#make sure it has a tailing slash.
if ($opt_B)
{
($opt_B !~ m{/$}) && ($opt_B .= q[/]);
}
########################################################################
$| = 1;
if (! #ARGV)
{
&die_usage;
} else
{
%mfhash = &map_name_to_filename(#ARGV);
}
if ($opt_d)
{
#myargv = #ARGV;
print join(' ',map { s/\.m/$opt_d/e;$_; } #ARGV),qq[ : ];
}
if ($opt_l) {
%bdhash = &find_base_libs(#OCT_BASE);
$alldepref = &find_all_deps(\%mfhash,\%bdhash,0);
print join(' ',#{$alldepref}),qq[\n];
} else {
&print_head();
%bdhash = &find_base_libs(#OCT_BASE);
&find_all_deps(\%mfhash,\%bdhash,1);
&print_tail();
}
$opt_X && (#OCT_BASE = #{&rm_dirs(\#OCT_BASE,$opt_X)});
########################################################################
sub
rm_dirs
#remove directories from OCT_BASE
{
my $ob_ref = shift(#_);
my $oX = shift(#_);
my #excludeus = split(/\s*:\s*/,$oX);
#FIX!
}
########################################################################
sub
make_relative
#just for the sake of opt_B#FOLDUP
{
my $fullname = shift(#_);
if ($opt_B)
{
$fullname =~ s{\Q$opt_B\E}{};
}
return $fullname;
}#UNFOLD
########################################################################
sub
map_name_to_filename#FOLDUP
{
my $mfile;
my %mfiles;
my $mfstub;
while ($mfile = shift(#_))
{
$mfstub = $mfile;
$mfstub =~ s/^\s*(.*\/)?([^\/]+)\.m\s*$/$2/;
$mfiles{$mfstub} = $mfile;
}
return %mfiles;
}#UNFOLD
########################################################################
sub
find_base_libs#FOLDUP
{
my $based;
my %bdhash;
my ($mfile,$mfstub);
my #mfiles;
while ($based = shift(#_))
{
# print "|$based|\n";
#mfiles = split(/\n/,qx(cd $based && find . -name '*.m'));
while ($mfile = shift(#mfiles))
{
$mfstub = $mfile;
$mfstub =~ s/.+\/([^\/]+)\.m/$1/;
$mfile =~ s/^\s*\.\//$based/;
$bdhash{$mfstub} = $mfile;
#print STDERR "|$mfstub| -> |$mfile| |$based|\n";
}
}
return %bdhash;
}#UNFOLD
########################################################################
#returns array of all the dependencies as filename strings.
sub
find_all_deps#FOLDUP
{
my $mfhashref = shift(#_);
my $bdhashref = shift(#_);
my $doprint = shift(#_); #if 0, do not print anything out.
my #mfhashlist = %{$mfhashref};
my %bdhash = %{$bdhashref};
my $output = [];
my %globals;
my $gname;
my %doneok;
my ($mfname,$mfloc);
my ($aline,$acommand,$copyline);
my %eegraph; #store as node::node in this hash set.
#prevents edges from being written multiple times?
my %dangling = {}; #any command which has yet to be found.
#store vals a list of things which want to point in.
my $pointsin;
my $foundnewfunc;
my $foundFuncPragma; #for looking for % #function fname stuff
#my #myDependencies; #every function that I call;
my $edgestr = '';
while ($mfname = shift(#mfhashlist))#FOLDUP
{
$mfloc = shift(#mfhashlist);
$mf_alias = ($opt_p)? &make_relative($mfloc) : $mfname; #full names or not
#prevent node -> self edges.
$eegraph{qq(${mfname}::${mfname})} = 1;
if ((! $opt_C) && $doprint)
{
print qq(subgraph cluster_$mfname {\n);
print qq(rank=min\n);
print qq(ordering=out\n);
}
#node
$doprint &&
print qq{$mfname [label="$mf_alias" shape=plaintext fontsize=44]\n};
push (#{$output},$mf_alias);
$doneok{$mfname} = 1;
#open a file#FOLDUP
open (FH,"$mfloc") || die "no open $mfloc, $!";
while (! eof(FH))
{
$aline = ;
chomp($aline);
$foundFuncPragma = 0;
if ($aline =~ /^[^%]*end\s*%?\s*function/) { $mfname = ''; }
if ($mfname) #inside a function
{
if ($opt_g || $opt_G) #look for globals#FOLDUP
{
if ($aline =~ /global/)
{
$copyline = $aline;
while ($copyline =~ s/(global\s+)([^;\s]+)(\s*;)/$1$3/)
{
$gname = $2;
if (exists $globals{$gname})
{
push(#{$globals{$gname}},$mfname);
} else {
$globals{$gname} = [$mfname];
}
}
}
}#UNFOLD
#look for #function pragma
$foundFuncPragma = ($aline =~ s/%\s*#function\s+(.+)$//);
if ($foundFuncPragma)
{
$opt_D && (print STDERR "found a function pragma! |$1|\n");
#what a bummer that we can't just use this: the
#problem is that we don't really know when a function
#ends in .m code, b/c endfunction is not required. bummer.
#push (#myDependencies,split(/\s+/,$1));
#
#that is, what we would really like to do is just push onto a list
#every time we saw a command, then puke at the end of the function,
#but we do not know really when a function ends in matlab. oops.
foreach $acommand (split(/\s+/,$1))
{
$opt_D && (print STDERR "found a command! |$acommand|\n");
#push (#myDependencies,$acommand);
if (exists($bdhash{$acommand}))
{
$opt_D && (print STDERR "exists in bdhash (prolly means is a file to itself)\n");
if (! $eegraph{qq(${mfname}::${acommand})})
{
if ($opt_C) { $doprint && print "$mfname -> $acommand\n";
} else { $edgestr .= "$mfname -> $acommand\n"; }
if (! $opt_m) { $eegraph{qq(${mfname}::${acommand})} = 1; }
}
if (! $doneok{$acommand})
{
$doneok{$acommand} = 1;
push(#mfhashlist,$acommand,$bdhash{$acommand});
}
} else
{
if (exists($dangling{$acommand}))
{ push(#{$dangling{$acommand}},$mfname);
} else { $dangling{$acommand} = [$mfname]; }
}
}
}
while ($aline =~ /([a-zA-Z0-9_]+)\s*\(/)#FOLDUP
{
$aline =~ s/([a-zA-Z0-9_]+)\s*\(//;
$acommand = $1;
$opt_D && (print STDERR "found a command! |$acommand|\n");
#push (#myDependencies,$acommand);
if (exists($bdhash{$acommand}))
{
$opt_D && (print STDERR "exists in bdhash (prolly means is a file to itself)\n");
if (! $eegraph{qq(${mfname}::${acommand})})
{
if ($opt_C) { $doprint && print "$mfname -> $acommand\n";
} else { $edgestr .= "$mfname -> $acommand\n"; }
if (! $opt_m) { $eegraph{qq(${mfname}::${acommand})} = 1; }
}
if (! $doneok{$acommand})
{
$doneok{$acommand} = 1;
push(#mfhashlist,$acommand,$bdhash{$acommand});
}
} else
{
if (exists($dangling{$acommand}))
{ push(#{$dangling{$acommand}},$mfname);
} else { $dangling{$acommand} = [$mfname]; }
}
}#UNFOLD
} else #not yet inside a function.
{
$foundnewfunc = 0;
if ($aline =~ /^[^%]*function\s+[^=]*=\s*([a-zA-Z0-9_]+)\s*(\(|;|%|$)/)
{
$mfname = $1;$foundnewfunc = 1;
} elsif ($aline =~ /^[^%]*function\s+([a-zA-Z0-9_]+)\s*(\(|;|%|$)/)
{
$mfname = $1;$foundnewfunc = 1;
}
if ($foundnewfunc)
{
##myDependencies = ();
$opt_D && (print STDERR "now looking at function |$mfname|\n");
$eegraph{qq(${mfname}::${mfname})} = 1;
#subnode
$doprint && print "$mfname [shape=box]\n";
$doneok{$mfname} = 1;
$bdhash{$mfname} = 1; #innocent enough since doneok is set too.
if (exists($dangling{$mfname}))
{
while ($pointsin = shift(#{$dangling{$mfname}}))
{
$doprint && print "$pointsin -> $mfname\n";
}
}
}
}
}
close FH;#UNFOLD
if (! $opt_C)
{
$doprint && print qq(}\n);
$doprint && print $edgestr;
$edgestr = '';
}
}#UNFOLD
if ($doprint)
{
if ($opt_g)
{
foreach $key (keys(%globals))
{
print qq{$key [style=dotted label="$key" color=red shape=plaintext fontsize=44]\n};
foreach $f (#{$globals{$key}})
{
print qq{$f -> $key [color=red]\n};
}
}
} elsif ($opt_G)
{
foreach $key (keys(%globals))
{
while (defined($g = shift(#{$globals{$key}})))
{
# foreach $f (#{$globals{$key}}) { print qq{$g -- $f [color=red]\n}; }
foreach $f (#{$globals{$key}}) { print qq{$g -> $f [style=dotted label="$key" fontsize=30 fontcolor=red color=red]\n}; }
}
}
}
}
return $output;
}#UNFOLD
########################################################################
sub
print_head#FOLDUP
{
if (! $opt_m)
{
print qq[strict ];
}
# if ($opt_G) { print qq[octavedep {\n]; } else { print qq[digraph octavedep {\n]; }
print qq[digraph octavedep {\n];
print qq[nslimit=15.0\n];
print qq[mclimit=1.0\n];
print qq[ratio="$ratio"\n];
print qq[size="$size"\n];
}#UNFOLD
sub
print_tail#FOLDUP
{
print "}\n";
}#UNFOLD
########################################################################
sub
die_usage#FOLDUP
{
# print STDERR "usage: perl $0 [-$OPTS] [-$VALOPTS val] octfiles\n\n";
print STDERR "usage: perl $0 [-$OPTS] octfiles\n\n";
if ($opt_H)
{
%OPT_MEANINGS =
map {($a=$_)=~s/(.)+?[=:!]?[ifs]?/$1/;$a=>$OPT_MEANINGS{$_};}
keys %OPT_MEANINGS;
#OPTS = split(//,$OPTS);
while ($OP = shift(#OPTS)) {
print STDERR " $OP $OPT_MEANINGS{$OP}\n";
}
print STDERR "\n";
}
exit;
}#UNFOLD
########################################################################
__END__
works for me...
Though depfun doesn't provide an 'ignore-builtins' option, it does give us a '-toponly' option that we can use within our own recursive function that does exculde built-ins and runs much faster. Below is my solution:
function new_file_list = fastdepfun(paths)
% new_file_list = fastdepfun(paths)
% paths = same input as you use with depfun
[file_list] = depfun(paths,'-toponly','-quiet');
% Remove builtins (implement this part however you like)
mroot = matlabroot;
file_list = file_list(~strncmp(file_list,mroot,length(mroot)));
% Remove files already inspected (otherwise we get stuck in an infinite loop)
new_file_list = setdiff(file_list,paths);
if ~isempty(new_file_list)
new_file_list = fastdepfun(new_file_list);
end
new_file_list = unique([file_list; new_file_list]);