How to intercept HTTP Request in Perl?

I am developing a web application and I am wondering if there is a way or some kind of module that can intercept or handle any HTTP Request before being displayed to the client. I can't modify the httpd.conf file so it isn't an option.
I want to do this to add some security to my web app by denying access, redirecting to other pages, modifying the response sent to the client, and so on.
I've also heard about Request Dispatching and maybe it could help me.
Does anybody know how to achieve this?

Perhaps you can use Plack::Handler::Apache2 to enable PSGI support. From there, you can use PSGI middleware modules to modify both the request and the response.
It's hard to get more specific without knowing how you've set up Perl to be executed in your mod_perl environment.
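As a rough illustration of the middleware idea, here is a minimal sketch of an inline PSGI middleware that denies access, redirects, or modifies the response. It assumes Plack is installed; the application, the /admin and /old paths, and the extra header are hypothetical placeholders, not anything from the question.

```perl
use strict;
use warnings;
use Plack::Builder;

# Hypothetical application: stands in for the real web app.
my $app = sub {
    my $env = shift;
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello\n" ] ];
};

# Inline middleware: takes the inner app and returns a wrapped app.
my $guard = sub {
    my $inner = shift;
    return sub {
        my $env = shift;

        # Deny access to a (hypothetical) protected path.
        if ($env->{PATH_INFO} =~ m{^/admin}) {
            return [ 403, [ 'Content-Type' => 'text/plain' ], [ "Forbidden\n" ] ];
        }

        # Redirect a (hypothetical) legacy path.
        if ($env->{PATH_INFO} eq '/old') {
            return [ 302, [ 'Location' => '/new' ], [] ];
        }

        # Otherwise run the app and add a header to its response.
        # Only plain array responses are touched; streaming responses pass through.
        my $res = $inner->($env);
        push @{ $res->[1] }, 'X-Frame-Options' => 'DENY' if ref $res eq 'ARRAY';
        return $res;
    };
};

my $wrapped = builder {
    enable $guard;
    $app;
};
```

In an app.psgi file the wrapped app would be the last expression of the file, so that the PSGI server (Plack::Handler::Apache2 or any other) can pick it up.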

You may want to check out HTTP::Proxy, a Perl module for writing a web proxy. An example:
use HTTP::Proxy;
# alternate initialisation
my $proxy = HTTP::Proxy->new;
$proxy->port( 3128 ); # the classical accessors are here!
# this is a MainLoop-like method; it does not return while the proxy is running
$proxy->start;
Alternatively, HTTP::Daemon lets you handle each request yourself:
use HTTP::Daemon;
use HTTP::Status; # exports RC_FORBIDDEN
my $d = HTTP::Daemon->new || die;
while (my $c = $d->accept) {
    while (my $r = $c->get_request) {
        if ($r->method eq 'GET' and $r->uri->path eq "/xyzzy") {
            # remember, this is *not* recommended practice :-)
            $c->send_file_response("/home/hd1/.zshrc");
        }
        else {
            $c->send_error(RC_FORBIDDEN);
        }
    }
    $c->close;
    undef($c);
}

Related

LWP getstore usage

I'm pretty new to Perl. I just created a simple script to retrieve a file with
getstore($url, $file);
But how do I know whether the task completed correctly, or the connection was interrupted in the middle, or authentication failed, or whatever the response was? I searched all over the web and found some things, like a response list, and some talk about useragent stuff, which I totally can't understand, especially the operator $ua->.
What I'd like is an explanation of that operator stuff (I don't even know what -> is used for), the meaning of the RC codes, and finally, how to use it all.
It's a lot of stuff, so I appreciate any answer given, even a partial one. And thanks in advance to whoever is willing to help. =)
The LWP::Simple module is just that: quite simplistic. The documentation states that the getstore function returns the HTTP status code which we can save into a variable. There are also the is_success and is_error functions that tell us whether a certain return value is ok or not.
use LWP::Simple; # exports getstore, is_success and is_error
my $url = "http://www.example.com/";
my $filename = "some-file.html";
my $rc = getstore($url, $filename);
if (is_error($rc)) {
    die "getstore of <$url> failed with $rc";
}
Of course, this doesn't catch errors with the file system.
The die throws a fatal exception that terminates the execution of your script and displays itself on the terminal. If you don't want to abort execution use warn.
The LWP::Simple functions provide high-level controls for common tasks. If you need more control over the requests, you have to manually create an LWP::UserAgent. A user agent (abbreviated ua) is a browser-like object that can make requests to servers. We have very detailed control over these requests, and can even modify the exact header fields.
The -> operator is a general dereference operator, which you'll use a lot when you need complex data structures. It is also used for method calls in object-oriented programming:
$object->method(@args);
would call the method on $object with the @args. We can also call methods on class names. To create a new object, usually the new method is used on the class name:
my $object = The::Class->new();
Methods are just like functions, except that you leave it to the class of the object to figure out which function exactly will be called.
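The dereferencing side of -> is not shown above, so here is a small sketch of it. All variable names are made up for the illustration.

```perl
use strict;
use warnings;

my @list = (10, 20, 30);
my %user = (name => 'alice', roles => ['admin', 'dev']);

my $list_ref = \@list;                      # reference to an array
my $user_ref = \%user;                      # reference to a hash
my $code_ref = sub { return join '-', @_ }; # reference to an anonymous sub

my $second = $list_ref->[1];            # element of the referenced array
my $name   = $user_ref->{name};         # value in the referenced hash
my $role   = $user_ref->{roles}->[0];   # nested structure: hash, then array
my $joined = $code_ref->('a', 'b');     # call the referenced sub

print "$second $name $role $joined\n";  # prints "20 alice admin a-b"
```

The same arrow is what appears in $ua->get($url): there the thing on the left is an object, and the name on the right is a method instead of an index or key.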
The normal workflow with LWP::UserAgent looks like this:
use LWP::UserAgent; # load the class
my $ua = LWP::UserAgent->new();
We can also provide named arguments to the new method. Because these UA objects are robots, it is considered good manners to tell everybody who sent this Bot. We can do so with the from field:
my $ua = LWP::UserAgent->new(
    from => 'ss-tangerine@example.com',
);
We could also change the timeout from the default three minutes. These options can also be set after we constructed a new $ua, so we can do
$ua->timeout(30); # half a minute
The $ua has methods for all the HTTP requests like get and post. To duplicate the behaviour of getstore, we first have to get the URL we are interested in:
my $url = "http://www.example.com/";
my $response = $ua->get($url);
The $response is an object too, and we can ask it whether it is_success:
$response->is_success or die $response->status_line;
So if execution flows past this statement, everything went fine. We can now access the content of the response. NB: use the decoded_content method, as this handles content encodings and character sets for us:
my $content = $response->decoded_content;
We can now print that to a file:
use autodie; # automatic error handling
open my $fh, ">", "some-file.html";
print {$fh} $content;
(when handling binary files on Windows: binmode $fh after opening the file, or use the ">:raw" open mode)
Done!
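If the only goal is to save the body to disk, as getstore does, the get method also accepts a ':content_file' option that writes the raw bytes straight to a file. A minimal sketch, where fetch_to_file is a hypothetical helper name and the X-Died check reflects my understanding of how LWP reports a failure while writing:

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: fetch $url into $file and return the response object.
sub fetch_to_file {
    my ($ua, $url, $file) = @_;

    # ':content_file' makes LWP write the body to $file instead of keeping it in memory.
    my $response = $ua->get($url, ':content_file' => $file);

    die "GET <$url> failed: " . $response->status_line . "\n"
        unless $response->is_success;

    # Assumption: problems while saving the body show up in the X-Died header.
    if (my $died = $response->header('X-Died')) {
        die "Saving <$url> to $file failed: $died\n";
    }
    return $response;
}

my $ua = LWP::UserAgent->new(timeout => 30);
```

A call such as fetch_to_file($ua, "http://www.example.com/", "some-file.html") would then replace the get, decoded_content and open steps. The bytes are stored as received, so no character set decoding takes place.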
To learn about LWP::UserAgent, read the documentation. To learn about objects, read perlootut. You can also visit the perl tag on SO for some book suggestions.

500 Internal Server Error in perl-cgi program

I am getting this error: "Internal Server Error. The server encountered an internal error or misconfiguration and was unable to complete your request."
I am submitting an HTML form and trying to read its values.
HTML Code (index.cgi)
#!c:/perl/bin/perl.exe
print "Content-type: text/html; charset=iso-8859-1\n\n";
print "<html>";
print "<body>";
print "<form name = 'login' method = 'get' action = '/cgi-bin/login.pl'> <input type = 'text' name = 'uid'><br /><input type = 'text' name = 'pass'><br /><input type = 'submit'>";
print "</body>";
print "</html>";
Perl Code to fetch data (login.pl)
#!c:/perl/bin/perl.exe
use CGI::Carp qw(fatalsToBrowser);
my(%frmfields);
getdata(\%frmfields);
sub getdata {
    my ($buffer) = "";
    if (($ENV{'REQUEST_METHOD'} eq 'GET')) {
        my (%hashref) = shift;
        $buffer = $ENV{'QUERY_STRING'};
        foreach (split(/&/,$buffer)) {
            my ($key, $value) = split(/=/, $_);
            $key = decodeURL($key);
            $value= decodeURL($value);
            $hashref{$key} = $value;
        }
    }
    else{
        read(STDIN,$buffer,$ENV{'CONTENT_LENGTH'})
    }
}
sub decodeURL{
    $_=shift;
    tr/+/ /;
    s/%(..)/pack('c', hex($1))/eg;
    return($_);
}
The HTML page opens correctly, but when I submit the form, I get an internal server error.
Please help.
What does the web server's error log say?
Independent of what it says, you must stop parsing the form data yourself. There are modules for that, specifically CGI.pm. Using that, you can do this instead:
use CGI;
my $CGI = CGI->new();
my $uid = $CGI->param( 'uid' );
my $pass = $CGI->param( 'pass' );
# rest of your script
Much cleaner and much safer.
I agree with Tore that you must not parse this yourself. Your code has multiple errors. You don't allow multiple parameter values, you don't allow the ; alternate separator, you don't handle POST with a query string in the URL, and so on.
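To make the listed gaps concrete, here is a small sketch of CGI.pm coping with a repeated parameter, the ; separator, and URL escapes. The query string is made up and passed to the constructor directly so the snippet runs outside a web server; multi_param assumes a reasonably recent CGI.pm (4.08 or later, as far as I know).

```perl
use strict;
use warnings;
use CGI;

# Hypothetical query string: 'tag' appears twice, separated by ';' and '&'.
my $q = CGI->new('tag=perl;tag=cgi&name=J%20Doe+Jr');

my @tags = $q->multi_param('tag');   # all values of a repeated parameter
my $name = $q->param('name');        # escapes and '+' are decoded by CGI.pm

print join(',', @tags), "\n";
print "$name\n";
```

In a real CGI script you would call CGI->new() without arguments and let the module read the request from the environment.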
I don't know how long it will be online for free, but chapter 15 of my new "Beginning Perl" book covers Web programming. That should get you started on some decent basics. Note that the online version is an early, rough draft. The actual book also includes Chapter 19 which has a complete Web app example.
Could it be this line that's the problem?
my (%hashref) = shift;
You're initialising a proper hash, but shift will give you a hash reference, since you did getdata(\%frmfields);. You probably want this, instead:
my $hashref = shift;
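A sketch of how the rest of the function would then use the reference. Only the reference handling is the point here; the query string is passed in as a hypothetical second argument to keep the snippet self-contained, and the hand-rolled parsing is kept purely for illustration.

```perl
use strict;
use warnings;

# Fill the caller's hash through a reference.
sub getdata {
    my ($hashref, $query) = @_;
    for my $pair (split /&/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        $hashref->{$key} = $value;   # writes into the caller's hash, not a copy
    }
    return;
}

my %frmfields;
getdata(\%frmfields, 'uid=alice&pass=secret');
print "$frmfields{uid}\n";   # prints "alice"
```

With my (%hashref) = shift; the values would instead land in a hash local to the function, and %frmfields would stay empty.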
"500 Internal Server Error" just means that something didn't work the way the web server expected. Maybe you don't have CGI enabled. Maybe the script isn't executable. Maybe it's in a directory the web server isn't allowed to access. It's even possible that maybe the web server ran the script successfully and it worked perfectly, but didn't start its output with a valid set of HTTP headers. You need to look in the web server's error log to find out what it didn't like, which may or may not be a Perl issue.
Like everyone else has said, though, don't try to parse query strings and grovel through %ENV yourself. Use one of the many fine modules or frameworks which are available and already known to work correctly. CGI.pm is the granddaddy of them all and works well for smaller projects, but I'd recommend looking into a proper web application framework such as Dancer, Mojolicious, or Catalyst (there are many others, but those are the big three) if you're planning to build anything with more than a handful of relatively simple pages and forms.

mod_perl redirect

So I'm working in a mod_perl environment, and I want to know what the best way is to redirect to a new url. I know in CGI Perl you use print "Location:...", however I've come to find that usually there are better ways to do things in mod_perl, but I can't seem to find anything. Thanks in advance!
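use Apache2::RequestRec (); # provides headers_out and status
use APR::Table ();          # provides set on the headers table
use Apache2::Const -compile => qw(REDIRECT);
sub handler {
    my $r = shift;
    # $url is assumed to hold the target URL
    $r->headers_out->set( Location => $url);
    $r->status(Apache2::Const::REDIRECT); #302
    return Apache2::Const::REDIRECT;
}
This is how to properly redirect in mod_perl2.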

Simple Perl Proxy

We store a large amount of files on Amazon S3 that we want website visitors to be able to access via AJAX but we don't want the actual file locations disclosed to visitors.
To accomplish this, what I'm hoping to do is to make an AJAX request to a very simple Perl script that would simply act as a proxy and return the file to the browser. I already have the script set up to authenticate that the user is logged in and do a database query to figure out the correct URL to access the file on S3, but I'm not sure of the best way to return the file to the visitor's browser in the most efficient manner.
Any suggestions on the best way to accomplish this would be greatly appreciated. Thanks!
The best way is to use the sendfile system call. If you open and read the file from disk manually and then write it block by block to the "sink" end of your Web framework, you waste resources, because the data has to travel through your application's memory, possibly with extra buffering.
What you describe in your question is a very common pattern, therefore many solutions already exist around the idea of just setting a special HTTP header, then letting the Web stack below your application deal with it efficiently.
mod_xsendfile for Apache httpd
X-Sendfile in lighttpd
X-Accel-Redirect for nginx
Employ the XSendfile middleware in Plack to set the appropriate header. The following minimal program will DTRT and take advantage of the system call where possible.
use IO::File::WithPath qw();
use Plack::Builder qw(builder enable);
builder {
    enable 'Plack::Middleware::XSendfile';
    sub {
        return [200, [], IO::File::WithPath->new('/usr/src/linux/COPYING')];
    }
};
OK, here's an example of how to implement this using the Mojolicious framework.
I assume you run this script as a daemon. The script catches all requests to /json_dir/.*, forwards each request to the Stack Overflow API, and returns the response.
You can run this script as ./example.pl daemon and then try http://127.0.0.1:3000/json_dir/perl
In the response you should be able to find your own question titled 'Simple Perl Proxy'.
This code can be used as a standalone daemon that listens on a certain port or as a CGI script (the former is preferred).
#!/usr/bin/env perl
use Mojolicious::Lite;
get '/json_dir/(.filename)' => sub {
    my $self = shift;
    my $filename = $self->stash('filename');
    my $url = "http://api.stackoverflow.com/1.1/questions?tagged=" . $filename;
    $self->ua->get(
        $url => sub {
            my ($client, $tx) = @_;
            json_response($self, $tx);
        }
    );
    $self->render_later;
};
sub json_response {
    my ($self, $tx) = @_;
    if (my $res = $tx->success) {
        $self->tx->res($res);
    }
    else {
        $self->render_not_found;
    }
    $self->rendered;
}
app->start;
__DATA__
@@ not_found.html.ep
<!doctype html><html>
<head><title>Not Found</title></head>
<body>File not found</body>
</html>

How do I get Perl's HTTP::Daemon to accept more than one connection?

I do some testing with HTTP::Daemon:
use HTTP::Daemon;
use HTTP::Status;
my $d = HTTP::Daemon->new || die;
print "Please contact me at: <URL:", $d->url, ">\n";
while (my $c = $d->accept) {
    while (my $r = $c->get_request) {
        if ($r->method eq 'GET') {
            # do some action (about 10s)
        }
        else {
            $c->send_error(RC_FORBIDDEN)
        }
    }
    $c->close;
    undef($c);
}
It works fine, but if I make more requests within those 10s, the requests get queued (I get all requests through $d->accept).
What I want is the following: if a client starts a request, no other request should be queued.
I tried with the Listen option, but without success.
Any suggestions?
HTTP::Daemon doesn't fork for you, and explicitly tells you so in its documentation:
This HTTP daemon does not fork(2) for you. Your application, i.e. the
user of the "HTTP::Daemon" is responsible for forking if that is
desirable. Also note that the user is responsible for generating
responses that conform to the HTTP/1.1 protocol.
If your answering takes too long, fork to answer. Or use another module.
You have only one thread here; it can either handle the first request or handle the next one to come in. You can't deal with new requests until control goes back to accept.
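A rough sketch of the "fork to answer" approach, assuming a Unix-like system where fork is cheap. serve_forking and its $max_conns argument are hypothetical names introduced for this illustration; $max_conns exists only so a demo or test can stop the loop.

```perl
use strict;
use warnings;
use HTTP::Daemon;
use HTTP::Response;
use HTTP::Status qw(RC_FORBIDDEN);

# Hypothetical helper: accept connections on $daemon and fork one child per
# connection, so a slow request does not block the accept loop.
sub serve_forking {
    my ($daemon, $max_conns) = @_;
    local $SIG{CHLD} = 'IGNORE';    # let the OS reap finished children
    my $served = 0;

    while (my $conn = $daemon->accept) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: handle this one connection, then exit.
            $daemon->close;         # the child does not need the listening socket
            while (my $req = $conn->get_request) {
                if ($req->method eq 'GET') {
                    # the slow action (about 10s in the question) would go here
                    $conn->send_response(HTTP::Response->new(
                        200, 'OK', [ 'Content-Type' => 'text/plain' ], "done\n"));
                }
                else {
                    $conn->send_error(RC_FORBIDDEN);
                }
            }
            $conn->close;
            exit 0;
        }

        # Parent: drop its copy of the client socket and go back to accept.
        $conn->close;
        last if defined $max_conns && ++$served >= $max_conns;
    }
    return;
}
```

Calling serve_forking(HTTP::Daemon->new || die) would start the loop and run until the process is killed. Note that this serves requests in parallel; rejecting a second client while the first is busy would need extra bookkeeping in the parent.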