In a debugged script, is it safe to disable `use strict` and `use warnings`? - perl

Consider an ideal case where a script has been written with
use strict;
use warnings FATAL => 'all';
and thoroughly reviewed, tested and debugged and you are happy with how it works.
There is no dynamically generated code or other fancy stuff in the script, and the code is overall simple and low-tech.
For critical applications, would such a script be as correct and safe with checks commented out:
# use strict;
# use warnings FATAL => 'all';
as it is with them on?
Provided that special caution is taken during edits, upgrades and any other maintenance to re-enable both use strict and use warnings and re-test.
Edit:
IMHO the correctness question is worth answering, no matter whether the original reason justifies the hassle in the opinion of the reader. Replies like "you should use it coz you don't lose much" or "you just should coz it's best practice" are non-answers. Let's take a fresh, unbiased look at whether strict and warnings are undoubtedly recommended to be kept on in an already debugged script.
The reason is the time-to-finish-task (performance) penalty that these pragmas introduce.
Update
For a script that does its job quickly, is called numerous times, and whose response time is significant, the cumulative effect can make a difference.
Update: time-to-finish-task penalties
CPU is i5-3320M, OS is OpenBSD 7.2 amd64.
for i in $(seq 3); do
    time for i in $(seq 10000); do
        /usr/bin/perl -e ';'
    done
    sleep 3
done
sleep 3
for i in $(seq 3); do
    time for i in $(seq 10000); do
        /usr/bin/perl -e 'use strict; use warnings;'
    done
    sleep 3
done
perl is v5.32.1, vendor-patched for security (read: at the expense of performance).
3 passes of 10000 of /usr/bin/perl -e ';':
1m32.01s real 0m00.60s user 0m03.24s system
1m32.60s real 0m00.70s user 0m03.42s system
1m31.53s real 0m00.69s user 0m04.17s system
3 passes of 10000 of /usr/bin/perl -e 'use strict; use warnings;':
2m46.08s real 0m00.72s user 0m04.63s system
2m48.99s real 0m00.61s user 0m04.79s system
2m49.64s real 0m00.75s user 0m05.16s system
Roughly 75 seconds stopwatch time difference for 10000 invocations.
Same shell command but perlbrew-installed /opt/p5/perlbrew/perls/perl-5.36.0/bin/perl instead of vendor /usr/bin/perl:
3 passes of 10000 of /opt/p5/perlbrew/perls/perl-5.36.0/bin/perl -e ';':
1m09.31s real 0m00.48s user 0m02.60s system
1m12.06s real 0m00.49s user 0m02.94s system
1m14.81s real 0m00.70s user 0m03.44s system
3 passes of 10000 of /opt/p5/perlbrew/perls/perl-5.36.0/bin/perl -e 'use strict; use warnings;':
2m20.81s real 0m00.55s user 0m04.03s system
2m21.98s real 0m00.72s user 0m04.26s system
2m21.75s real 0m00.58s user 0m03.86s system
Roughly 70 seconds stopwatch time difference for 10000 invocations.
For those who find the times too long: that is due to OpenBSD. I had done some measurements, and a perl 'hello world' turned out to be 8.150 / 1.688 = 4.8 times slower on OpenBSD than on antiX Linux.
Update 2: strict-only time-to-finish-task penalties
/usr/bin/perl -e 'use strict;':
1m59.70s real 0m00.51s user 0m04.27s system
1m59.36s real 0m00.58s user 0m04.04s system
1m57.58s real 0m00.63s user 0m04.50s system
Roughly 26 seconds stopwatch time overhead for 10000 invocations.
/opt/p5/perlbrew/perls/perl-5.36.0/bin/perl -e 'use strict;':
1m29.06s real 0m00.59s user 0m04.55s system
1m30.04s real 0m00.52s user 0m04.57s system
1m31.26s real 0m00.54s user 0m05.30s system
Roughly 20 seconds stopwatch time overhead for 10000 invocations.
Update 3
Up till now, replies have been mostly evasive, pointing out the negligibility of the performance penalties. One may or may not care about 7 milliseconds per invocation or 70 seconds for 10K invocations. Whatever. Please disregard the provided reason, or any other possible reason, and focus on the actual question about correctness, as it deserves a solid answer in its own right.

First of all, neither pragma introduces any run-time performance penalty. They simply set flags which are only checked when the exceptional situations occur, and those checks happen whether the pragmas were used or not. So the whole question rests on a false premise.
But to answer your question: both of these pragmas have a run-time effect, so whether removing them will make a difference depends on the thoroughness of your tests. Your tests probably aren't complete, so a difference is possible, even likely.
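A contrived sketch of what such a run-time difference can look like (hypothetical code, not from the question):
use strict;
use warnings FATAL => 'all';

sub greet {
    my ($name) = @_;
    # If a caller that the tests never exercised passes no argument, this line is a
    # fatal "uninitialized value" error with the pragmas on, but silently prints
    # "Hello, !" with them commented out.
    print "Hello, $name!\n";
}

greet('world');   # the tested path
greet();          # the untested path behaves differently once the pragmas are removed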

strict and warnings are developer tools. If you are done developing and everything is clean, you don't need them anymore. Note what @ikegami has already said, though.
In certain environments where all standard error is logged, you have the possibility of some new perl, changed setting, or untested code path emitting warnings. I've had one situation in my career where a formerly clean script started emitting tons of warnings after a perl upgrade. This eventually filled up the disk and brought the service down. That was not fun. If no one is monitoring the warnings, it's pointless to emit them. But, the lesson here is proper monitoring.
I don't think warnings should be enabled in production code, because you should have either fixed them or decided to ignore them. Sometimes, but rarely, I'll turn off warnings in a very small scope because the fix would make the code harder to read or cause other problems:
{
    no warnings qw(uninitialized);
    ...
}
But really, I usually just fix warnings and leave warnings enabled. I stopped caring about the boilerplate around Perl v5.12, which turns on strict for free:
use v5.12; # free strict
I care more about specifying the minimal Perl version than removing a use warnings or adding a no warnings line.
And, with v5.36, I get warnings as well when I specify that as the minimal version:
use v5.36; # free strict and free warnings
Finally, your stated penalty is 7 ms. If that's the hot path in your code, you're a lucky person. If you are worried about start-up time and need that 7 ms, there are other things you should be doing to reclaim start-up time.
But remember that a one-time benchmark on a multi-user, multi-process machine, even if you did run it for a couple of seconds, is tainted by anything else going on. If you can repeatedly show the 7 ms delay through all sorts of loads and situations, then we should believe it. In my own testing of the same thing on my MacBook Pro, I see differences of as much as 30%. I attribute most of that to operating-system-level stuff happening when I decide to do the test.

Related

Is there any performance benefits that come with using subroutine in Perl?

I am new to Perl 5 and so far I understand that subroutines are good for code reuse and for breaking long procedural scripts into shorter, readable fragments. What I want to know is: are there any performance benefits that come with using subroutines? Do they make the code execute faster?
Take a look at the following snippets:
#!/usr/bin/perl
print "Who is the fastest?\n";
compared to this one:
#!/usr/bin/perl
# defining subroutine
sub speed_test {
    print "Who is the fastest?\n";
}
# calling subroutine
speed_test();
The performance benefit of subroutines is that, used proficiently, they allow developers to write, test, and release code faster.
The resulting code does not run faster, but that's OK, because it's not the 1960s any more. These days, hardware is cheap and programmers are expensive. Hiring a second programmer to speed up slow code production costs orders of magnitude more than buying a second machine (or a higher-powered machine) to speed up slow code execution.
Also, as I note with just about every "how can I micro-optimize Perl code?" question, Perl is not a high-performance language, period. Never has been, almost certainly never will be. That's not what it's designed for. Perl is optimized for speed of development, not speed of execution.
If saving a microsecond here and a microsecond there actually matters to your use case (which it almost certainly doesn't), then you will get far greater speed benefits by switching to a lower-level, higher-performance language than you will get by anything you might do differently in Perl.
There is no performance benefit in using subroutines. Actually there is a performance penalty for using them. That is why in highly speed-optimized code subroutines are sometimes "inlined".
But the root of all evil in software development is premature optimization. Writing code that does not use subroutines/methods to delegate different tasks is far worse in the long run.
Any code block that exceeds 10 (or maybe 25? or 50?) lines usually has to be split. The limit depends on whom you ask.
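To put a rough number on the call overhead mentioned above, here is a minimal sketch using the core Benchmark module; the add() sub is illustrative only, and the numbers it produces vary by machine and Perl build:
use strict;
use warnings;
use Benchmark qw(cmpthese);

sub add { return $_[0] + $_[1] }

# run each variant for about 2 CPU seconds and compare iteration rates
cmpthese(-2, {
    inline   => sub { my $x = 1 + 2 },
    sub_call => sub { my $x = add(1, 2) },
});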

Can I use dtrace on OS X 10.5 to determine which of my perl subs is causing the most memory allocation?

We have a pretty big perl codebase.
Some processes that run for multiple hours (ETL jobs) suddenly started consuming a lot more RAM than usual. Analysis of the changes in the relevant release is a slow and frustrating process. I am hoping to identify the culprit using more automated analysis.
Our live environment is perl 5.14 on Debian squeeze.
I have access to lots of OS X 10.5 machines, though. Dtrace and perl seem to play together nicely on this platform. It seems that using dtrace on Linux requires a bit more work. I am hoping that memory allocation patterns will be similar between our live system and a dev OS X system - or at least similar enough to help me find the origin of this new memory use.
This slide deck:
https://dgl.cx/2011/01/dtrace-and-perl
shows how to use dtrace to show the number of calls to malloc per perl sub. I am interested in tracking the total amount of memory that perl allocates while executing each sub over the lifetime of a process.
Any ideas on how this can be done?
There's no single way to do this, and doing it on a sub-by-sub basis isn't always the best way to examine memory usage. I'm going to recommend a set of tools that you can use, some work on the program as a whole, others allow you to examine a single section of your code or a single variable.
You might want to consider using Valgrind. There's even a Perl module called Test::Valgrind that will help set up a suppression file for your Perl build, and then check for memory leaks or errors in your script.
There's also Devel::Size which does exactly what you asked for, but on a variable-by-variable basis rather than a sub-by-sub basis.
You can use Devel::Cycle to search for inadvertent circular memory references in complex data structures. While a circular reference doesn't mean that you're wasting memory as you use the object, circular references prevent anything in the chain from being freed until the cycle is broken.
Devel::Leak is a little bit more arcane than the rest, but it basically will allow you to get full information on any SVs that are created and not destroyed between two points in your program's execution. If you check this across a sub call, you'll know any new memory that that subroutine allocated.
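For example, here is a rough sketch of the Devel::Leak approach described above; some_sub_under_test() is a hypothetical stand-in for the code you want to examine:
use Devel::Leak;

my $handle;
my $before = Devel::Leak::NoteSV($handle);   # remember every SV alive right now
some_sub_under_test();
my $after  = Devel::Leak::CheckSV($handle);  # report SVs created since NoteSV
print "SV count grew from $before to $after\n";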
You may also want to read the perldebguts section of the Perl manual.
I can't really help more because every codebase is going to wind up being different. Test::Valgrind will work great for some codebases and terribly on others. If you are going to try it, I recommend you use the latest version of Valgrind available and Perl >= 5.10, as Perl 5.8 and Valgrind historically didn't get along too well.
You might want to look at Memory::Usage and Devel::Size
To check the whole process or sub:
use Memory::Usage;
my $mu = Memory::Usage->new();
# Record amount of memory used by current process
$mu->record('starting work');
# Do the thing you want to measure
$object->something_memory_intensive();
# Record amount in use afterwards
$mu->record('after something_memory_intensive()');
# Spit out a report
$mu->dump();
Or to check specific variables:
use Devel::Size qw(size total_size);
my $size = size("A string");
my @foo = (1, 2, 3, 4, 5);
my $other_size = size(\@foo);
my $foo = {
    a => [1, 2, 3],
    b => { a => [1, 3, 4] },
};
my $total_size = total_size($foo);
The answer to the question is 'yes'. Dtrace can be used to analyze memory usage in a perl process.
This snippet of code:
https://github.com/astletron/perl-dtrace-malloc/blob/master/perl-malloc-total-bytes-by-sub.d
tracks how memory use increases between the call and return of every sub in a program. As an added bonus, dtrace seems to sort the output for you (at least on OS X). Cool.
Thanks to all that chimed in. I answered this one myself as the question is really specific to dtrace/perl.
You could write a simple debug module based on Devel::CallTrace that prints the sub entered as well as the current memory size of the current process. (Using /proc or whatever.)
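For instance, a minimal sketch of such a memory-report helper; the /proc read is Linux-specific (matching the live Debian environment), so the OS X dev machines would need something like `ps -o rss= -p $$` instead, and big_sub() is a hypothetical sub under investigation:
sub report_mem {
    my ($label) = @_;
    open my $fh, '<', '/proc/self/status' or return;
    my ($rss) = map { /^VmRSS:\s+(\d+)/ ? $1 : () } <$fh>;
    return unless defined $rss;
    print STDERR "$label: $rss kB\n";
}

report_mem('before big_sub');
big_sub();
report_mem('after big_sub');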

Enable global warnings

I have to optimize an intranet written in Perl (about 3000 files). The first thing I want to do is enable warnings "-w" or "use warnings;" so I can get rid of all those errors, then try to implement "use strict;".
Is there a way of telling Perl to use warnings all the time (like the settings in php.ini for PHP), without the need to modify each script by adding "-w" to its first line?
I even thought of making an alias for /usr/bin/perl, or moving it to another name and putting a simple wrapper script in its place just to add the -w flag (like a proxy).
How would you debug it?
Well…
You could set the PERL5OPT envariable to hold -w. See the perlrun manpage for details. I hope you’ll consider tainting, too, like -T or maybe -t, for security tracking.
But I don’t envy you. Retrofitting code developed without the benefit of use warnings and use strict is usually a royal PITA.
I have something of a standard boiler-plate I use to start new Perl programs. But I haven’t given any thought to one for CGI programs, which would likely benefit from some tweaks against that boiler-plate.
Retrofitting warnings and strict is hard. I don't recommend a Big Bang approach, setting warnings (let alone strictures) on everything. You will be inundated with warnings to the point of uselessness.
You start by enabling warnings on the modules used by the scripts (there are some, aren't there?), rather than applying warnings to everything. Get the core clean, then get to work on the periphery, one unit at a time. So, in fact, I'd recommend having a simple (Perl) script that finds the first line that does not start with a hash and adds use warnings; (and maybe use strict; too, since you're going to be dealing with one script at a time) before it, so you can do the renovations one script at a time; a sketch follows.
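Here is a rough sketch of such a retrofit helper (hypothetical, adjust to taste); it edits each file named on the command line in place, keeping a .bak copy, and inserts the pragmas before the first line that is not a shebang or comment:
#!/usr/bin/perl
use strict;
use warnings;

$^I = '.bak';                # in-place edit via <>, keeping a backup of each file
my $inserted = 0;
while (<>) {
    if (!$inserted && !/^#/) {
        print "use strict;\nuse warnings;\n";
        $inserted = 1;
    }
    print;
    $inserted = 0 if eof;    # reset for the next file on the command line
}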
In other words, you will probably be best off actually editing each file as you're about to renovate it.
I'd only use the blanket option to make a simple assessment of the scope of the problem: is it a complete and utter disaster, or merely a few peccadilloes in a few files. Sadly, if the code was developed without warnings and strict, it is more likely to be 'disaster' than 'minimal'.
You may find that your predecessors were prone to copy and paste and some erroneous idioms crop up repeatedly in copied code. Write a Perl script that fixes each one. I have a bunch of fix* scripts in my personal bin directory that deal with various changes - either fixing issues created by recalcitrant (or, more usually, simply long departed) colleagues or to accommodate my own changing standards.
You can set warnings and strictures for all Perl scripts by adding -Mwarnings -Mstrict to your PERL5OPT environment variable. See perlrun for details.

How can I profile a subroutine without using modules?

I'm tempted to relabel this question 'Look at this brick. What type of house does it belong to?'
Here's the situation: I've effectively been asked to profile some subroutines having access to neither profilers (even Devel::DProf) nor Time::HiRes. The purpose of this exercise is to 'locate' bottlenecks.
At the moment, I'm sprinkling print statements at the beginning and end of each sub that log entries and exits to file, along with the result of the time function. Not ideal, but it's the best I can do given the circumstances. At the very least it'll allow me to see how many times each sub is called.
The code is running under Unix. The closest thing I see to my need is perlfaq8, but that doesn't seem to help (I don't know how to make a syscall, and am wondering if it'll affect the code timing unpredictably).
Not your typical everyday SO question...
This technique should work.
Basically, the idea is if you run Perl with the -d flag, it goes into the debugger. Then, when you run the program, ctrl-Break or ctrl-C should cause it to pause in the middle of whatever it is doing. Then you can type T to show the stack, and examine any other variables if you like, before continuing it.
Do this about 10 or 20 times. Any line of code (or any function, if you prefer) costing a significant percent of time will appear on that percent of stack samples, roughly, so you will not miss it.
For example, if a line of code (typically a function call) costs 20% of time, and you pause the program 20 times, you will see that line on 4 stack samples, give or take 1.8 samples. The amount of time that could be saved if you could avoid executing that line, or execute it a lot less, is a 20% reduction in overall execution time.
Then you can repeat it to find more problems.
You said the purpose is to 'locate' bottlenecks. This method does exactly that. Measuring function execution time is only a very indirect way to do that.
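A related trick, not part of the answer above but in the same spirit and needing only the core Carp module: install a SIGINT handler that prints a stack trace and keeps running, then hit Ctrl-C a dozen times while the program is busy and see which frames keep showing up:
use Carp ();

# each Ctrl-C prints a full stack backtrace to STDERR and execution continues
$SIG{INT} = sub { Carp::cluck('--- stack sample ---') };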
As far as syscall, there's a pretty good example in this post: http://www.cpan.org/scripts/date_and_time/gettimeofday
I think it's clear enough even for someone who never used syscall before (like myself :)
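For the impatient, a sketch of that approach, adapted from the "How can I measure time under a second?" answer in perlfaq8; it assumes sys/syscall.ph has been generated by h2ph on your system, and the L!L! pack template assumes a struct timeval made of two native longs:
require 'sys/syscall.ph';

my $timeval = pack('L!L!', 0, 0);                 # struct timeval { tv_sec, tv_usec }
syscall(SYS_gettimeofday(), $timeval, 0) != -1
    or die "gettimeofday failed: $!";
my ($sec, $usec) = unpack('L!L!', $timeval);
printf "%d.%06d\n", $sec, $usec;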
May I ask what the specifics of "having no access" are?
It's usually possible to get access to CPAN modules, even in cases where installing them in a central location is not in the cards. Is there a problem with downloading the module? Installing it in your home directory? Using software with the module included?
If one of those is a hang-up it can probably be fixed... if it's some company policy, that's priceless :(
Well, you can write your own profiler. It's not as bad as it sounds. A profiler is just a very special-case debugger. You want to read the perldebguts man page for some good first-cut code to get started if you must write your own.
What you want, and what your boss wants, though he or she may not know it, is to use Devel::NYTProf to do a really good job of profiling your code, and getting the job done instead of having to wait for you to partially duplicate the functions of it while learning how it is done.
The comment you made about "personal use" doesn't make sense. You're doing a job for work, and the work needs to get done, and you need (or your manager needs to get you) the resources to do that work. "Personal use" doesn't seem to enter into it.
Is it a question of someone else refusing to sign off on the module to have it installed on the machine running the software to be measured? Is it a licensing question? Is it not being allowed to install arbitrary software on a production machine (understandable, but there's got to be some way the software's tested before it goes live - I hope - profile it there)?
What is the reason that a well-known module from a trustworthy source can't be used? Have you made the money case to your manager that more money will be spent coding a new, less-functional, profiler from scratch than finding a way to use one that is both good and already available?
For each subroutine, create a wrapper around it which reports the time in some format which you can export to something like R, a database, Excel or something similar (CSV would be a good choice). Add something like the following to your code. If you are using a Perl older than 5.7.3 (when Time::HiRes was first added to core), use syscall as mentioned above instead of the Time::HiRes functions below.
INIT {
    sub wrap_sub {
        no strict 'refs';
        my $sub    = shift;
        my $subref = *{$sub}{CODE};
        return sub {
            local *__ANON__ = "wrapped_$sub";
            my @return;
            my $fsecs = Time::HiRes::gettimeofday();
            print STDERR "$sub,$fsecs,";
            if (wantarray) {
                @return = eval { $subref->(@_) } or die $@;
            }
            else {
                $return[0] = eval { $subref->(@_) } or die $@;
            }
            $fsecs = Time::HiRes::gettimeofday();
            print STDERR "$fsecs\n";
            return wantarray ? @return : $return[0];
        };
    }

    require Time::HiRes;

    my @subs = qw{the subs you want to profile};
    no strict 'refs';
    no warnings 'redefine';
    foreach my $sub (@subs) {
        *{$sub} = wrap_sub($sub);
    }
}
Replace 'the subs you want to profile' with the subs you need profiled, and use an open()ed file handle instead of STDERR if you need to, bearing in mind that you can keep the results of the run separate from the output of the script (on Unix, with the Bourne, Korn and Bash shells), like this:
perl ./myscript.pl 2>myscript.profile

Should I use common::sense or just stick with `use strict` and `use warnings`?

I recently installed a module from CPAN and noticed one of its dependencies was common::sense, a module that offers to enable all the warnings you want, and none that you don't. From the module's synopsis:
use common::sense;
# supposed to be the same, with much lower memory usage, as:
#
# use strict qw(vars subs);
# use feature qw(say state switch);
# no warnings;
# use warnings qw(FATAL closed threads internal debugging pack substr malloc
# unopened portable prototype inplace io pipe unpack regexp
# deprecated exiting glob digit printf utf8 layer
# reserved parenthesis taint closure semicolon);
# no warnings qw(exec newline);
Save for undef warnings sometimes being a hassle, I've usually found the standard warnings to be good. Is it worth switching to common::sense instead of my normal use strict; use warnings;?
While I like the idea of reducing boiler-plate code, I am deeply suspicious of tools like Modern::Perl and common::sense.
The problem I have with modules like this is that they bundle up a group of behaviors and hide behind glib names with changeable meanings.
For example, Modern::Perl today consists of enabling some perl 5.10 features and using strict and warnings. But what happens when Perl 5.12 or 5.14 or 5.24 comes out with great new goodies, and the community discovers that we need to use the frobnitz pragma everywhere? Will Modern::Perl provide a consistent set of behaviors, or will it remain "Modern"? If MP keeps with the times, it will break existing systems that don't keep lock-step with its compiler requirements. It adds extra compatibility testing to upgrades. At least that's my reaction to MP. I'll be the first to admit that chromatic is about 10 times smarter than me and a better programmer as well, but I still disagree with his judgment on this issue.
common::sense has a name problem, too. Whose idea of common sense is involved? Will it change over time?
My preference would be for a module that makes it easy for me to create my own set of standard modules, and even create groups of related modules/pragmas for specific tasks (like date time manipulation, database interaction, html parsing, etc).
I like the idea of Toolkit, but it sucks for several reasons: it uses source filters, and the macro system is overly complex and fragile. I have the utmost respect for Damian Conway, and he produces brilliant code, but sometimes he goes a bit too far (at least for production use, experimentation is good).
I haven't lost enough time typing use strict; use warnings; to feel the need to create my own standard import module. If I felt a strong need for automatically loading a set of modules/pragmas, something similar to Toolkit that allows one to create standard feature groups would be ideal:
use My::Tools qw( standard datetime SQLite );
or
use My::Tools;
use My::Tools::DateTime;
use My::Tools::SQLite;
Toolkit comes very close to my ideal. Its fatal defects are a bummer.
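For what it's worth, a minimal sketch of such a roll-your-own bundle, assuming the CPAN module Import::Into is available (ToolSet, mentioned in the update below, packages the same idea):
package My::Tools;
use strict;
use warnings;
use feature ();
use Import::Into;

# make every "use My::Tools;" caller get strict, warnings and the 5.10 feature bundle
sub import {
    my $caller = caller;
    strict->import::into($caller);
    warnings->import::into($caller);
    feature->import::into($caller, ':5.10');
}

1;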
As for whether the choice of pragmas makes sense, that's a matter of taste. I'd rather use the occasional no strict 'foo' or no warnings 'bar' in a block where I need the ability to do something that requires it, than disable the checks over my entire file. Plus, IMO, memory consumption is a red herring. YMMV.
update
It seems that there are many (how many?) different modules of this type floating around CPAN.
There is latest, which is no longer the latest. Demonstrates part of the naming problem.
Also, uni::perl, which adds enabling Unicode to the mix.
ToolSet offers a subset of Toolkit's abilities, but without source filters.
I'll include Moose here, since it automatically adds strict and warnings to the calling package.
And finally Acme::Very::Modern::Perl
The proliferation of these modules, and the potential for overlapping requirements, adds another issue.
What happens if you write code like:
use Moose;
use common::sense;
What pragmas are enabled with what options?
I would say stick with warnings and strict for two main reasons.
If other people are going to use or work with your code, they are (almost certainly) used to warnings and strict and their rules. Those represent a community norm that you and other people you work with can count on.
Even if this or that specific piece of code is just for you, you probably don't want to worry about remembering "Is this the project where I adhere to warnings and strict or the one where I hew to common::sense?" Moving back and forth between the two modes will just confuse you.
There is one bit nobody else seems to have picked up on, and that's FATAL in the warnings list.
So as of 2.0, use common::sense is more akin to:
use strict;
use warnings FATAL => 'all'; # but with the specific list of fatals instead of 'all' that is
This is a somewhat important and frequently overlooked feature of warnings that ramps the strictness up a whole degree. Instead of undef string interpolation or infinite recursion just warning you and then carrying on despite the problem, it actually halts.
To me this is helpful, because in many cases undef string interpolation leads to further, more dangerous errors, which may go silently unnoticed, and failing and bailing early is a good thing.
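A tiny illustration of that fail-and-bail behaviour, using another of the categories on the FATAL list (hypothetical snippet):
use strict;
use warnings FATAL => 'all';

open my $fh, '<', '/etc/hostname' or die "open: $!";
close $fh;
my $line = <$fh>;    # readline() on a closed filehandle: normally a plain warning,
                     # but a fatal error here, so the mistake cannot slip past silently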
I obviously have no common sense, because I'm going more for Modern::Perl ;-)
The "lower memory usage" only works if you use no modules that load strict, feature, warnings, etc. and the "much" part is...not all that much.
Not everyone's idea of common sense is the same - in that respect it's anything but common.
Go with what you know. If you get undef warnings, chances are that your program or its input is incorrect.
Warnings are there for a reason. Anything that reduces them cannot be useful. (I always compile with gcc -Wall too...)
I have never had a warning that wasn't something dodgy/just plain wrong in my code. For me, it's always something technically allowed that I almost certainly don't want to do. I think the full suite of warnings is invaluable. If you find use strict + use warnings adequate for now, I don't see why you'd want to change to using a non-standard module which is then a dependency for every piece of code you write from here on out...
When it comes to warnings, I support the use of any module or built-in language feature that gives you the level of warnings that helps you make your code as solid and reliable as it can possibly be. An ignored warning is not helpful to anyone.
But if you're cozy with the standard warnings, stick with it. Coding to a stricter standard is great if you're used to it! I wouldn't recommend switching just for the memory savings. Only switch if the module helps you turn your code around quicker and with more confidence.
Many people argue in the comments that if Modern::Perl changes, it will break your code. While this can be a real threat, there are already MANY things that change over time and break code (sometimes after a deprecation cycle, sometimes not...).
Other modules have changed their APIs and broken things too, and nobody worries much about them. E.g. Moose has at least two features that are deprecated now and will probably be forbidden in some future release.
Another example: years ago it was allowed to write
for $i qw(some words)
but now it is deprecated. And many others... And this is CORE language syntax.
Everybody survived. So I don't really understand why so many people argue against helper modules. When they change, there will (probably) be a sort of deprecation cycle... So, my view is:
if you write programs for yourself, use any module you want ;)
if you write a program for someone else, where others are going to maintain it, use a minimum of nonstandard "pragma-like" modules (common::sense, Modern::Perl, uni::perl, etc.)
in Stack Overflow questions, you can safely use common::sense or Modern::Perl etc. - most of the users who will answer your questions know them. Everybody understands that it is easier to write use v5.36; (which enables strict, warnings and features in one short line) than to type the three separate lines...