Perl CGI problem

I'm doing some development work that uses an embedded Linux for the OS and Boa for the web server. I have a web page that posts to a CGI script, handles the form data, and replies. My development environment was Ubuntu and everything worked fine, but when I ported my code over to the embedded Linux, the CGI module did not instantiate (or at least does not seem to instantiate). Here is a stripped down section of my code. The print statement complains about an uninitialized variable.
use CGI;
use strict;
use warnings;
my $cgiObj = CGI->new();
print $cgiObj->param('wlanPort');
Again, this works fine in my development environment, but fails in the embedded environment. The CGI.pm is installed and there are no errors generated on the CGI->new() command. I have also verified that the form data is being sent, but obviously can't guarantee that it is being received by the Perl script.
I have a feeling that it is a Boa configuration issue and that's what I'll be looking into next. I'm fairly new to Perl, so I'm not sure what else to do. Any ideas?
EDIT: Definitely not a Boa config issue. Still looking into it.
UPDATE:
I've simplified my code to the following:
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);

my $data = param('wlanPort') || '<i>(No Input)</i>';

print header;
print <<END;
<title>Echoing user input</title>
<p>wlanPort: $data</p>
END
As expected, it prints (No Input)
I should also point out that the form is enctype="multipart/form-data" because I need file upload capability, and I am using the POST method.
I used the HttpFox plugin to inspect the post data and checked on the wlanPort value:
-----------------------------132407047814270795471206851178
Content-Disposition: form-data; name="wlanPort"

eth1
So it is almost definitely being sent...
UPDATE 2: I installed the same versions of Perl and Boa used in the embedded system on my Ubuntu laptop. It works on the laptop but not on the device, the same result as before. I've told my employer that I've exhausted all possibilities other than the way Boa and (Micro) Perl are built on the device vs. in Ubuntu.

CGI is a beautifully simple protocol and while I cannot answer your question directly, I can suggest some techniques to help isolate the problem.
If your form is being submitted via POST, then the contents of that form will appear as a URL-encoded string (or, for enctype="multipart/form-data", a multipart document) in the body of the HTTP request your script is getting. Without using the CGI module at all, you should be able to read the request from STDIN:
my $request = "";
while (<STDIN>) {
$request .= $_;
}
if (open my $out, ">>/tmp/myapp.log") {
print $out $request;
close $out;
}
You can then examine /tmp/myapp.log to see if you are getting all the information from the request that you think you are.
For completeness, if your form submits via GET, then the arguments will be in the environment variable QUERY_STRING, which you can look at in Perl with $ENV{'QUERY_STRING'}.
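For example, here is a minimal sketch of decoding a query string by hand, without CGI.pm (the decoding loop is illustrative, not part of the original answer):

my %params;
for my $pair (split /&/, $ENV{'QUERY_STRING'} // '') {
    my ($key, $value) = split /=/, $pair, 2;
    for ($key, $value) {
        next unless defined;
        tr/+/ /;                               # '+' encodes a space
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # undo %XX escapes
    }
    $params{$key} = $value;
}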
There should be no difference in the way CGI's object interface and functional interface parse the request. I am not familiar with Boa, but I can't imagine it breaking the basic CGI protocol.
A common problem is that you named the form parameter one thing and are looking for a different parameter name in the CGI script. That's always a laugh.
Good luck and hope this helps some.

I know this is a very old post and the OP may not be interested in any new info related to this, but the general question of how to debug CGI scripts still has some relevance. I had similar issues with dev vs. prod environments. To help those who stumble upon this thread, I am posting my experience in dealing with this situation. My simple answer is to use the Log::Log4perl and Data::Dumper modules to demystify this (assuming there is a way to access logs in your prod environment). This way, with negligible overhead, you can turn on tracing when problems creep in (even if the code worked before but started failing due to changes over time). Log every relevant bit of information at the appropriate log level (trace, debug, info, warning, error, fatal) and configure what level is right for operations. Without these mechanisms, it will be very difficult to get insight into production operations. Hope this helps.
use strict;
use warnings;
use CGI;
use Log::Log4perl qw(:easy);
use Data::Dumper;

# Initialize logging once; in production you would point this at a config file
Log::Log4perl->easy_init({ level => $TRACE, file => '>>/tmp/myapp.log' });

my $cgiObj = CGI->new();
my $log    = get_logger();
$log->trace("CGI Data: " . Dumper($cgiObj));
print $cgiObj->param('wlanPort');

Related

How to scrape an ASP website using Perl?

Please tell me which module can be used to scrape a website that is completely developed in ASP and whose contents are not in proper HTML syntax.
It does not matter which language was used to develop the website. All that you (client) get from the website is the produced HTML (or broken HTML in this case).
You can use the LWP library and its get function to read the website content into a variable, and then analyze it using regular expressions.
Like this:
use strict;
use warnings;
use LWP::Simple;

my $url     = 'http://...';
my $content = get $url;
die "Couldn't fetch $url" unless defined $content;

if ($content =~ m/.../) {
    ...
}
Or you could use WWW::Mechanize. It builds upon LWP (which LWP::Simple is a very simple subset of) and provides lots of handy 'browser-like' behavior. For example, the typical session management of ASP-generated websites with login cookies and stuff gets handled by Mechanize automatically.
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new;
$mech->get('http://www.example.org/login.asp');
$mech->submit_form(
    form_number => 3,
    fields      => {
        username => 'test',
        password => 'secret',
    },
);
While this is first of all good for testing, it still has LWP's inherited methods, so you can access the plain request and response while still having the power of the built-in parser to access forms and links.
Also consider using a proper HTML parser, even if the website's output isn't very fancy. There are several of these around that can handle it, and it will be a lot easier than building up a bunch of regexes, which get hard to maintain once the page changes and you have to go back.
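For instance, a minimal sketch with HTML::TreeBuilder (the URL is a placeholder, and the link-extraction loop is just an illustration):

use strict;
use warnings;
use LWP::Simple qw(get);
use HTML::TreeBuilder;

my $html = get('http://www.example.org/') or die "Fetch failed";
my $tree = HTML::TreeBuilder->new_from_content($html);

# Pull out every link, however messy the markup is
for my $a ($tree->look_down(_tag => 'a')) {
    printf "%s -> %s\n", $a->as_trimmed_text, $a->attr('href') // '';
}
$tree->delete;    # free the parse tree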
Here's a list of related questions that have info on this subject:
Parse html using Perl
Perl parse links from HTML Table
What are some good ways to parse HTML and CSS in Perl?

"get" not working in perl

I'm new to Perl. In the past few days, I've made some simple scripts that save websites' source code to my computer via "get". They do what they're supposed to, but will not get the content of a website that is a forum. Non-forum websites work just fine.
Any idea what's going on? Here's the problem chunk:
my $url = 'http://www.computerforum.com/';
my $content = get $url || die "Unable to get content";
http://p3rl.org/LWP::Simple#get:
The get() function will fetch the document identified by the given URL and return it. It returns undef if it fails. […]
You will not be able to examine the response code or response headers (like 'Content-Type') when you are accessing the web using this function. If you need that information you should use the full OO interface (see LWP::UserAgent).
You really need better error reporting; switch to the LWP::UserAgent library. I suspect the forum software blocks the default LWP user agent.
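For example, a minimal LWP::UserAgent sketch that sets a browser-like agent string (the string itself is just an illustration) and reports the actual failure:

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
# Some sites block the default "libwww-perl/#.##" agent string
$ua->agent('Mozilla/5.0 (X11; Linux x86_64)');

my $response = $ua->get('http://www.computerforum.com/');
if ($response->is_success) {
    print $response->decoded_content;
}
else {
    # Unlike LWP::Simple's get, this tells you why the fetch failed
    die 'Unable to get content: ' . $response->status_line . "\n";
}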

is it possible to write a single perl cgi script to serve all http requests

I am wondering if it is possible to use a single Perl CGI script to serve all HTTP requests to my site, no matter what relative URL is given by the visitors.
Please share your thoughts. Many thanks.
If you call your script index.cgi and combine that with a mod_rewrite rule to redirect all requests to /index.cgi/foo then foo will be available as $ENV{'PATH_INFO'}, thereby letting you know what the original request path was.
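As a rough sketch (the routes here are made up), the single dispatching script could branch on $ENV{'PATH_INFO'} like this:

#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);

# After a rewrite of /foo to /index.cgi/foo, PATH_INFO holds "/foo"
my $path = $ENV{'PATH_INFO'} // '/';

print header('text/html');
if ($path eq '/' or $path eq '') {
    print '<h1>Home page</h1>';
}
elsif ($path =~ m{^/foo}) {
    print '<h1>You asked for foo</h1>';
}
else {
    print "<h1>No handler for $path</h1>";
}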
It's quite possible using mod_rewrite as other people have said. But you probably don't want to do it in a CGI program. Far better to write a proper web application using something like Catalyst or Dancer (probably with Plack at the back end).
If you're not really committed to your existing web server, you could use something like this:
use HTTP::Daemon; # need LWP-5.32 or better
use HTTP::Status;
use HTTP::Response;
use URI::Heuristic;
my $server = HTTP::Daemon->new(LocalPort => 89);
my $this_url = $server->url;
etc.
I grabbed that snippet from an existing program that ran as its own web server. Not sure how many of the "use" statements are required after the first one, but hopefully that gives you some ideas.
Per request of the submitter, I'm submitting a more-complete version of the HTTPD script:
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Daemon;

my $PORT   = 89;
my $server = HTTP::Daemon->new(LocalPort => $PORT);

# Init (note: method calls don't interpolate inside double-quoted strings)
print "Starting server at ", $server->url, "\n";
print "You can also use http://localhost:$PORT if browsing from the same machine running this script.\n\n";

# Server
my $count = 0;
while (my $client = $server->accept) {
    CONNECTION:
    while (my $request = $client->get_request) {
        $count++;
        print "Connection #$count:\n";
        print $request->as_string;
        print "\n";
        $client->autoflush;
        RESPONSE:
        print $client "Relative URL used was " . $request->url->path;
        last CONNECTION;
    }
    $client->close;
    undef $client;
}
Instead of the simple line that prints "Relative URL used was", you'd most likely want to parse the URL to decide what function this script should perform for each HTTP request, as sketched below.
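For instance, a sketch of a simple dispatch table (the routes are hypothetical) that could replace that single print inside the request loop above:

# Map request paths to handlers; anything else falls through to a default
my %handlers = (
    '/'       => sub { 'Welcome home' },
    '/status' => sub { "Handled $count connections so far" },
);
my $path    = $request->url->path;
my $handler = $handlers{$path} || sub { "No handler for $path" };
print $client $handler->();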
You can't set up Perl itself to do this. However, you should be able to configure your web server to redirect all requests to a single CGI script, usually passing the original request path as a parameter. If you're running on Apache, look at mod_rewrite.
I can't vote up or comment on davorg's answer due to low reputation (I'm new here).
You can use Mojolicious framework too. Mojolicious::Lite allows you to write full apps in a single file (logic, templating, etc), and I guess you're searching for something like:
http://search.cpan.org/perldoc?Mojolicious::Lite#Wildcard_Placeholders
More info at:
http://mojolicio.us/
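For example, a minimal sketch of such a wildcard route (not tested against any particular setup):

#!/usr/bin/perl
use Mojolicious::Lite;

# The *path wildcard placeholder matches any relative URL, slashes included
any '/*path' => { path => '' } => sub {
    my $self = shift;
    $self->render(text => 'Relative URL used was /' . $self->param('path'));
};

app->start;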

Transparently Handling GZip Encoded content with WWW::Mechanize

I am using WWW::Mechanize and currently handling HTTP responses with the 'Content-Encoding: gzip' header in my code by first checking the response headers and then using IO::Uncompress::Gunzip to get the uncompressed content.
However, I would like to do this transparently so that WWW::Mechanize methods like forms(), links(), etc. work on and parse the uncompressed content. Since WWW::Mechanize is a subclass of LWP::UserAgent, I would prefer to use LWP::UserAgent handlers to do this.
While I have been partly successful (I can print the uncompressed content for example), I am unable to do this transparently in a way that I can call
$mech->forms();
In summary: How do I "replace" the content inside the $mech object so that from that point onwards, all WWW::Mechanize methods work as if the Content-Encoding never happened?
I would appreciate your attention and help.
Thanks
WWW::Mechanize::GZip, I think.
It looks to me like you can replace it by using the $res->content( $bytes ) member.
By the way, I found this stuff by looking at the source of LWP::UserAgent, then HTTP::Response, then HTTP::Message.
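Building on that, here is a hedged sketch using an LWP handler (response_done is one of LWP::UserAgent's documented handler phases) to swap in the decoded bytes before WWW::Mechanize parses the page:

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new;

# Replace the compressed body so forms()/links() later see plain HTML
$mech->add_handler(
    response_done => sub {
        my ($response, $ua, $handler) = @_;
        if (($response->header('Content-Encoding') // '') =~ /gzip/i) {
            my $plain = $response->decoded_content(charset => 'none');
            if (defined $plain) {
                $response->content($plain);
                $response->remove_header('Content-Encoding', 'Content-Length');
            }
        }
        return;
    },
);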
It is built into UserAgent and thus Mechanize. One MAJOR caveat to save you some hair: to debug, make sure you check the error variable $@ after the call to decoded_content.
$html = $r->decoded_content;
die $@ if $@;
Better yet, look through the source of HTTP::Message and make sure all the support packages are there. In my case, decoded_content returned undef while content was raw binary, and I went on a wild goose chase. UserAgent will set the error flag on failure to decode, but Mechanize will just ignore it (it doesn't check or log the incident as its own error/warning). In my case $@ said: "Can't find IO/HTML.pm ..." (it was eval'ed). After having to dive into the source, I found out the built-in decoding process is long, meticulous, and arduous, covering just about every scenario and making tons of guesses (thank you, Gisle!).
If you are paranoid, explicitly set the default header to be used with every request at new():

my $browser = WWW::Mechanize->new(
    default_headers => HTTP::Headers->new(
        'Accept-Encoding' => scalar HTTP::Message::decodable()
    )
);

Passing Perl Data Structures as Serialized GET Strings to a Perl CGI program

I want to pass a serialized Perl data structure as a GET variable to a CGI application. I tried Data::Serializer as my first option. Unfortunately the serialized string is too long for my comfort, in addition to containing options joined by '^' (a caret).
Is there a way I can create short encoded strings from perl data structures so that I can safely pass them as GET variables to a perl CGI application?
I would also appreciate being told that this (serialized, encoded strings) is a bad way to pass complex data structures to web applications, along with suggestions on how else I could accomplish this.
If you need to send URLs to your users that contain a few key datapoints and you want to ensure they can't be forged, you can do this with a digest (such as from Digest::SHA) and a shared secret. This lets you put the data out there in your messages without needing to keep a local database to track it all. My example doesn't include a time element, but that would be easy enough to add in if you want.
use Digest::SHA qw(sha1_base64);
use URI::Escape qw(uri_escape);

my $base_url = 'http://example.com/thing.cgi';
my $email    = 'guy@somewhere.com';
my $msg_id   = '123411';
my $secret   = 'mysecret';

# Sign the key datapoints with the shared secret
my $data   = join(":", $email, $msg_id, $secret);
my $digest = sha1_base64($data);

# Base64 digests can contain '+' and '/', so URI-escape the values
my $url = $base_url . '?email=' . uri_escape($email)
        . '&msg_id=' . $msg_id . '&sign=' . uri_escape($digest);
Then send it along.
In your "thing.cgi" script you just need to extract the parameters and see if the digest submitted in the script matches the one you locally regenerate (using $email and $msg_id, and of course your $secret). If they don't match, don't authorize them, if they do then you have a legitimately authorized request.
Footnote:
I wrote the "raw" methods into Data::Serializer to make translating between serializers much easier and that in fact does help with going between languages (to a point). But that of course is a separate discussion as you really shouldn't ever use a serializer for exchanging data on a web form.
One of the drawbacks of the approach (using a Perl-specific serializer, that is) is that if you ever want to communicate between the client and server using something other than Perl, it will probably be more work than something like JSON or even XML would be. You've already run into the size limitations of GET requests, but that's problematic for any encoding scheme.
It's more likely to be a problem for the next guy down the road who maintains this code than it is for you. I have a situation now where a developer who worked on a large system before I did decided to store several important bits of data as perl Storable objects. Not a horrible decision in and of itself, but it's making it more difficult than it should be to access the data with tools that aren't written in perl.
Passing serialized encoded strings is a bad way to pass complex data structures to web applications.
If you are trying to pass state from page to page, you can use server-side sessions, which would only require you to pass around a session key.
If you need to email a link to someone, you can still create a server-side session with a reasonable expiry time (you'll also need to decide if additional authentication is necessary) and then send the session id in the link. You can/should expire the session immediately once the requested action is taken.
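As an illustration, a minimal CGI::Session sketch (the stored structure and the expiry are placeholders):

use strict;
use warnings;
use CGI;
use CGI::Session;

my $cgi     = CGI->new;
my $session = CGI::Session->new($cgi);    # load an existing session or create one
$session->expire('+1h');                  # reasonable expiry for an emailed link

# Keep the complex structure server-side; only the session id travels in the link
$session->param(wizard_state => { step => 2, choices => ['a', 'b'] });

print $session->header();                 # sets the session cookie
print 'Session id: ', $session->id, "\n";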