I made a small script in perl that displays the content of a file sent by a user.
However I noticed something strange: if the file is named 0, nothing will be printed, like if I didn't send any file and just refreshed the page.
How can this happen?
Is there any risk of someone dropping in the filename command to make my server execute it? (with the pipe thing)
Here is the code:
#!/usr/bin/perl
use CGI;
my $cgi = CGI->new;
print "Content-type: text/html\n\n";
if ($cgi->upload('file')) {
print '<h1>file uploaded:</h1>';
my $file = $cgi->param('file');
while (<$file>) {
print "a";
print "<p>".$cgi->escapeHTML($_)."</p>";
}
}
Because the string 0, like the empty string and the undef value, evaluate to false in a boolean context like
if ($cgi->upload('file')) { ... }
In filenames and text processing, this is an edge case that is usually less trouble than its worth to think about, but when you do need to worry about it, the workarounds are to evaluate whether the input is an empty string or not
if ($cgi->upload('file') ne '') { ... }
if (length($cgi->upload('file'))) { ... }
Early versions of the CGI module open the temporary file with sysopen and modern versions use File::Temp. Either way is sufficient to ensure that Perl is attempting to open a real file and will not use a shell that can be tricked by pipes or backticks into executing an arbitrary command.
I would probably use autoEscape ().
Another option is to simply test for the file name being "0" and asking the user to change it if it is.
Also, you could use temporary filenames.
See the official CGI documentation for more information.
CGI Documentation
Related
I am writing a script that looks at an access_log file to see how many times each search engine was accessed and to see which one is accessed the most. I am sure there are problems with some of my syntax, but I can't even tell since I am not receiving any information back when running it. Any help would be appreciated!
Code:
#!/usr/bin/perl
use 5.010;
$googleCount = 0;
$msnCount = 0;
$yahooCount = 0;
$askCount = 0;
$bingCount = 0;
while (<STDIN>)
{
if (/(google.com)/)
{
$googleCount++;
}
if (/(msn.com)/)
{
$msnCount++;
}
if (/yahoo.com/)
{
$yahooCount++;
}
if (/ask.com/)
{
$askCount++;
}
if (/bing.com/)
{
$bingCount++;
}
}
print "Google.com was accessed $googleCount times in this log.\n";
print "MSN.com was accessed $msnCount times in this log.\n";
print "Yahoo.com was accessed $yahooCount times in this log.\n";
print "Ask.com was accessed $askCount times in this log.\n";
print "Bing.com was accessed $bingCount times in this log.\n";
I am running MacOS. In the terminal I am typing:
perl -w access_scan.pl access_log.1
When I press enter, nothing happens.
Beside the fact that your script didn't work as you expected, there are a few things wrong with your script:
In regexes, the dot . matches any non-newline character. This includes a literal period, but is not restricted to that. Either escape it (/google\.com/) or protect special characters with \Q...\E: /\Qgoogle.com\E/.
There is a programming proverb “Three or more, use a for”. All your conditionals inside your loop are the same, except for the regex. You counts are actually one variable. Your report at the end is the same line multiple times.
You can use a hash to ease the pain:
#!/usr/bin/perl
use strict; use warnings; use feature 'say';
my %count; # a hash is a mapping of strings to scalars (e.g. numbers)
my #sites = qw/google.com msn.com yahoo.com ask.com bing.com/;
# initialize the counts we are interested in:
$count{$_} = 0 foreach #sites;
while (<>) { # accept input from files specified as command line options or STDIN
foreach my $site (#sites) {
$count{$site}++ if /\Q$site\E/i; # /i for case insensitive matching
}
}
foreach my $site (#sites) {
say "\u$site was accessed $count{$site} times in this log";
}
The \u uppercases the next character, this is required to produce identical output.
The say is exactly like print, but appends a newline. It is available in perl5 v10 or later.
The script is trying to read from STDIN, but you are providing the filename to read from as an argument.
"Nothing happens" because the script is waiting for input (which, since you haven't redirected anything to standard input, it expects you to type).
Change <STDIN> to <> or change the command to perl -w access_scan.pl < access_log.1
Your script is reading from stdin, but you're providing your input as a file. You need to redirect thus:
perl -w access_scan.pl < access_log.1
The < file construct provides the contents of your file as the standard input for your script.
The script works fine (I tested it), but you need to feed it with the log in the STDIN:
cat access_log.1 | perl -w access_scan.pl
We maintain a huge number of perl modules, actualy its so huge that we don't even know of all modules that we are responsible for. We would like to track what scripts and modules accesses another module in some sort of log, preferably stored by module name, so that we can evaluate whether its a risk to update a module and so that we can know what we might affect.
Is there any simple way to do this?
You could do a simple regex search:
use strict;
use warnings;
my %modules;
foreach my $perl_file (#file_list) {
open FILE, $perl_file or die "Can't open $perl_file ($!)";
while (<FILE>) {
if (/\s*(?:use|require)\s*([^;]+);/) {
$modules{$1}{$perl_file}++;
}
}
}
This is quick and dirty, but it should work pretty well. You end up with a hash of modules, each of which points to a hash of the files that use it.
Of course, it will catch things like use strict; but those will be easy enough to ignore.
If you have things like use Module qw/function/; You will grab the whole thing before the semicolon, but it shouldn't be a big deal. You can just search the keys for your known module names.
A downside is that it doesn't track dependencies. If you need that information you could add it by getting it from cpan or something.
Update: If you want to log this at runtime, you could create a wrapper script and have your perl command point to the wrapper on your system. Then make the wrapper something like this:
use strict;
use warnings;
use Module::Loaded;
my $script = shift #ARGV;
#run program
do $script;
#is_loaded() gets the path of these modules if they are loaded.
print is_loaded('Some::Module');
print is_loaded('Another::Module');
You might run the risk of funny side effects, though, since the method of calling your script has changed. It depends on what you are doing.
Maybe edit sitecustomize.pl so that each time when Perl runs, it would write some info in a log, and then analyse it? Add something like this to sitecustomize.pl:
open (LOG, '>>',"absolutepathto/logfile.txt");
print LOG $0,"\t",$$,"\t",scalar(localtime),"\n";
open SELF, $0;
while (<SELF>) {
print LOG $_ if (/use|require/);
}
close SELF;
print LOG "_" x 80,"\n";
close LOG;
EDIT:
Also, we forgot about %INC hash, so the code above may be rewritten as follows, to include more data about which modules were actually loaded + include files required by do function:
open (LOG, '>>',"absolutepathto/logfile.txt");
print LOG $0,' ',$$,' ',scalar(localtime),"\n";
open SELF, $0;
while (<SELF>) {
print LOG $_ if (/use|require/);
}
close SELF;
END {
local $" = "\n";
print LOG "Files loaded by use, eval, or do functions at the end of this program run:\n";
print LOG "#{[values %INC]}","\n";
print LOG "_" x 80,"\n";
close LOG;
}
I am a newb to Perl. I am writing some scripts and want to define my own print called myprint() which will print the stuff passed to it based on some flags (verbose/debug flag)
open(FD, "> /tmp/abc.txt") or die "Cannot create abc.txt file";
print FD "---Production Data---\n";
myprint "Hello - This is only a comment - debug data";
Can someone please help me with some sample code to for myprint() function?
Do you care more about writing your own logging system, or do you want to know how to put logging statements in appropriate parts of your program which you can turn off (and, incur little performance penalty when they are turned off)?
If you want a logging system that is easy to start using, but also offers a world of features which you can incrementally discover and use, Log::Log4perl is a good option. It has an easy mode, which allows you to specify the desired logging level, and emits only those logging messages that are above the desired level.
#!/usr/bin/env perl
use strict; use warnings;
use File::Temp qw(tempfile);
use Log::Log4perl qw(:easy);
Log::Log4perl->easy_init({level => $INFO});
my ($fh, $filename) = tempfile;
print $fh "---Production Data---\n";
WARN 'Wrote something somewhere somehow';
The snippet also shows a better way of opening a temporary file using File::Temp.
As for overriding the built-in print … It really isn't a good idea to fiddle with built-ins except in very specific circumstances. perldoc perlsub has a section on Overriding Built-in Functions. The accepted answer to this question lists the Perl built-ins that cannot be overridden. print is one of those.
But, then, one really does not need to override a built-in to write a logging system.
So, if an already-written logging system does not do it for you, you really seem to be asking "how do I write a function that prints stuff conditionally depending on the value of a flag?"
Here is one way:
#!/usr/bin/env perl
package My::Logger;
{
use strict; use warnings;
use Sub::Exporter -setup => {
exports => [
DEBUG => sub {
return sub {} unless $ENV{MYDEBUG};
return sub { print 'DEBUG: ' => #_ };
},
]
};
}
package main;
use strict; use warnings;
# You'd replace this with use My::Logger qw(DEBUG) if you put My::Logger
# in My/Logger.pm somewhere in your #INC
BEGIN {
My::Logger->import('DEBUG');
}
sub nicefunc {
print "Hello World!\n";
DEBUG("Isn't this a nice function?\n");
return;
}
nicefunc();
Sample usage:
$ ./yy.pl
Hello World!
$ MYDEBUG=1 ./yy.pl
Hello World!
DEBUG: Isn't this a nice function?
I wasn't going to answer this because Sinan already has the answer I'd recommend, but tonight I also happened to be working on the "Filehandle References" chapter to the upcoming Intermediate Perl. That are a couple of relevant paragraphs which I'll just copy directly without adapting them to your question:
IO::Null and IO::Interactive
Sometimes we don't want to send our output anywhere, but we are forced
to send it somewhere. In that case, we can use IO::Null to create
a filehandle that simply discards anything that we give it. It looks
and acts just like a filehandle, but does nothing:
use IO::Null;
my $null_fh = IO::Null->new;
some_printing_thing( $null_fh, #args );
Other times, we want output in some cases but not in others. If we are
logged in and running our program in our terminal, we probably want to
see lots of output. However, if we schedule the job through cron, we
probably don't care so much about the output as long as it does the job.
The IO::Interactive module is smart enough to tell the difference:
use IO::Interactive;
print { is_interactive } 'Bamboo car frame';
The is_interactive subroutine returns a filehandle. Since the
call to the subroutine is not a simple scalar variable, we surround
it with braces to tell Perl that it's the filehandle.
Now that you know about "do nothing" filehandles, you can replace some
ugly code that everyone tends to write. In some cases you want output
and in some cases you don't, so many people use a post-expression
conditional to turn off a statement in some cases:
print STDOUT "Hey, the radio's not working!" if $Debug;
Instead of that, you can assign different values to $debug_fh based
on whatever condition you want, then leave off the ugly if $Debug
at the end of every print:
use IO::Null;
my $debug_fh = $Debug ? *STDOUT : IO::Null->new;
$debug_fh->print( "Hey, the radio's not working!" );
The magic behind IO::Null might give a warning about "print() on
unopened filehandle GLOB" with the indirect object notation (e.g.
print $debug_fh) even though it works just fine. We don't get that
warning with the direct form.
I'm trying to modify a script that someone else has written and I wanted to keep my script separate from his.
The script I wrote ends with a print line that outputs all relevant data separated by spaces.
Ex: print "$sap $stuff $more_stuff";
I want to use this data in the middle of another perl script and I'm not sure if it's possible using a system call to the script I wrote.
Ex: system("./sap_calc.pl $id"); #obtain printed data from sap_calc.pl here
Can this be done? If not, how should I go about this?
Somewhat related, but not using system():
How can I get one Perl script to see variables in another Perl script?
How can I pass arguments from one Perl script to another?
You're looking for the "backtick operator."
Have a look at perlop, Section "Quote-like operators".
Generally, capturing a program's output goes like this:
my $output = `/bin/cmd ...`;
Mind that the backtick operator captures STDOUT only. So in order to capture everything (STDERR, too) the commands needs to be appended with the usual shell redirection "2>&1".
If you want to use the data printed to stdout from the other script, you'd need to use backticks or qx().
system will only return the return value of the shell command, not the actual output.
Although the proper way to do this would be to import the actual code into your other script, by building a module, or simply by using do.
As a general rule, it is better to use all perl solutions, than relying on system/shell as a way of "simplifying".
myfile.pl:
sub foo {
print "Foo";
}
1;
main.pl:
do 'myfile.pl';
foo();
perldoc perlipc
Backquotes, like in shell, will yield the standard output of the command as a string (or array, depending on context). They can more clearly be written as the quote-like qx operator.
#lines = `./sap_calc.pl $id`;
#lines = qx(./sap_calc.pl $id);
$all = `./sap_calc.pl $id`;
$all = qx(./sap_calc.pl $id);
open can also be used for streaming instead of reading into memory all at once (as qx does). This can also bypass the shell, which avoids all sorts of quoting issues.
open my $fh, '-|', './sap_calc.pl', $id;
while (readline $fh) {
print "read line: $_";
}
I've had a search around, and from my perspective using backticks is the only way I can solve this problem. I'm trying to call the mdls command from Perl for each file in a directory to find it's last accessed time. The issue I'm having is that in the file names I have from find I have unescaped spaces which bash obviously doesn't like. Is there an easy way to escape all of the white space in my file names before passing them to mdls. Please forgive me if this is an obvious question. I'm quite new to Perl.
my $top_dir = '/Volumes/hydrogen/FLAC';
sub wanted { # Learn about sub routines
if ($File::Find::name) {
my $curr_file_path = $File::Find::name. "\n";
`mdls $curr_file_path`;
print $_;
}
}
find(\&wanted, $top_dir);
If you are JUST wanting "last access time" in terms of of the OS last access time, mdls is the wrong tool. Use perl's stat. If you want last access time in terms of the Mac registered application (ie, a song by Quicktime or iTunes) then mdls is potentially the right tool. (You could also use osascript to query the Mac app directly...)
Backticks are for capturing the text return. Since you are using mdls, I assume capturing and parsing the text is still to come.
So there are several methods:
Use the list form of system and the quoting is not necessary (if you
don't care about the return text);
Use String::ShellQuote to escape the file name before sending to sh;
Build the string and enclose in single quotes prior to sending to sending to the shell. This is harder than it sounds because files names with single quotes defeats your quotes! For example, sam's song.mp4 is a legal file name, but if you surround with single quotes you get 'sam's song.mp4' which is not what you meant...
Use open to open a pipe to the output of the child process like this: open my $fh, '-|', "mdls", "$curr_file" or die "$!";
Example of String::ShellQuote:
use strict; use warnings;
use String::ShellQuote;
use File::Find;
my $top_dir = '/Users/andrew/music/iTunes/iTunes Music/Music';
sub wanted {
if ($File::Find::name) {
my $curr_file = "$File::Find::name";
my $rtr;
return if -d;
my $exec="mdls ".shell_quote($curr_file);
$rtr=`$exec`;
print "$rtr\n\n";
}
}
find(\&wanted, $top_dir);
Example of pipe:
use strict; use warnings;
use String::ShellQuote;
use File::Find;
my $top_dir = '/Users/andrew/music/iTunes/iTunes Music/Music';
sub wanted {
if ($File::Find::name) {
my $curr_file = "$File::Find::name";
my $rtr;
return if -d;
open my $fh, '-|', "mdls", "$curr_file" or die "$!";
{ local $/; $rtr=<$fh>; }
close $fh or die "$!";
print "$rtr\n\n";
}
}
find(\&wanted, $top_dir);
If you're sure the filenames don't contain newlines (either CR or LF), then pretty much all Unix shells accept backslash quoting, and Perl has the quotemeta function to apply it.
my $curr_file_path = quotemeta($File::Find::name);
my $time = `mdls $curr_file_path`;
Unfortunately, that doesn't work for filenames with newlines, because the shell handles a backslash followed by a newline by deleting both characters instead of just the backslash. So to be really safe, use String::ShellQuote:
use String::ShellQuote;
...
my $curr_file_path = shell_quote($File::Find::name);
my $time = `mdls $curr_file_path`;
That should work on filenames containing anything except a NUL character, which you really shouldn't be using in filenames.
Both of these solutions are for Unix-style shells only. If you're on Windows, proper shell quoting is much trickier.
If you just want to find the last access time, is there some weird Mac reason you aren't using stat? When would it be worse than kMDItemLastUsedDate?
my $last_access = ( stat($file) )[8];
It seems kMDItemLastUsedDate isn't always updated to the last access time. If you work with a file through the terminal (e.g. cat, more), kMDItemLastUsedDate doesn't change but the value that comes back from stat is right. touch appears to do the right thing in both cases.
It looks like you need stat for the real answer, but mdls if you're looking for access through applications.
You can bypass the shell by expressing the command as a list, combined with capture() from IPC::System::Simple:
use IPC::System::Simple qw(capture);
my $output = capture('mdls', $curr_file_path);
Quote the variable name inside the backticks:
`mdls "$curr_file_path"`;
`mdls '$curr_file_path'`;