How come Catalyst::Controller::WrapCGI doesn't get any post data? - perl

Whenever I POST something to a page with Catalyst::Controller::WrapCGI I notice that my old CGI script doesn't get any POST data.. Data inside the body of the HTTP request. What am I doing wrong and how do I fix this?

In my case, this was because I was using Catalyst::Controller::WrapCGI v0.35 and Catalyst::Controller::REST. This created a problem.. My configuration looked like this,
package MyApp::Controller::REST;
__PACKAGE__->config(namespace => '');
BEGIN { extends 'Catalyst::Controller::REST' }
and
package MyApp::Controller::Root;
__PACKAGE__->config(namespace => '');
BEGIN { extends 'Catalyst::Controller::WrapCGI' }
However, Catalyst::Controller::REST installs a begin action on line 298
sub begin : ActionClass('Deserialize') { }
And, that -- in my case -- was delegating to Catalyst::Action::Deserialize::JSON which is smart enough to seek($body,0,0) but too dumb and inconsiderate to do that for the next guy down the chain.... Code below from here
if(openhandle $body) {
seek($body, 0, 0); # in case something has already read from it
while ( defined( my $line = <$body> ) ) {
$rbody .= $line;
}
}
And, to make matters even worse, the next guy down the chain in this example is Catalyst::Controller::WrapCGI which not just fails to clean up for the next guy, but fails to set it up for itself (code from here),
if ($body) { # Slurp from body filehandle
local $/; $body_content = <$body>;
}
That should look like (at the very least)
if ($body) { # Slurp from body filehandle
local $/;
seek($body,0,0);
$body_content = <$body>;
}
That's why we can't have nice things... I opened a bug in C:C:WrapCGI to get this resolved.

Related

"sh: 1: file: not found" thrown in Perl

So this is an issue I see thrown around on several coding help-sites that always have a slight variation. I'm not entirely familiar with what it means, and what's even more curious is that this error is thrown midway through a larger Upload.pm script, and does not cause any sort of fatal error. It gets tossed into my error log somewhere during this unless conditional snippet
# If this is the first slice, validate the file extension and mime-type. Mime-type of following slices should be "application/octet-stream".
unless ( defined $response{'error'} ) {
if ( $slice->{'index'} == 1 ) {
my ($filename, $directory, $extension) = fileparse($path.$parent_file, qr/\.[^.]*/);
unless ( is_valid_filetype($slice->{'tmp_file'}, $extension) ) {
$response{'error'} = "Invalid file type.";
$response{'retry'} = 0;
}
}
}
Now, let me be perfectly honest. I don't really understand the error message, and I could really use some help understanding it, as well as solving it.
Our Perl based web app has refused to let us upload files correctly since upgrading to Debian Bullseye, and I've been stuck debugging this code I didn't write for a few days now. I'm wondering if the upgrade depreciated some Perl modules, or if the directories to said modules are no longer working?
I'm testing this in a Ubuntu based Docker environment running Debian Bullseye on an Apache 2 server.
If you need any more context, clarification, etc, please let me know.
is_valid_filetype() looks like this:
sub is_valid_filetype
{
my ($tmp_file, $extension) = #_;
if ( $tmp_file && $extension ) {
# Get temp file's actual mime-type.
my $mime = qx/file --mime-type -b '${tmp_file}'/;
$mime =~ s/^\s+|\s+$//g;
# Get valid mime-types matching this extension.
my $dbh = JobTracker::Common::dbh or die("DBH not available.");
my $mime_types = $dbh->selectrow_array('SELECT `mime_types` FROM `valid_files` WHERE `extension` = ?', undef, substr($extension, 1));
if ( $mime && $mime_types ) {
if ( $mime_types !~ /,/ ) {
# Single valid mime-type for this extension.
if ( $mime eq $mime_types ) {
return 1;
}
} else {
# Multiple valid mime-types for this extension.
my %valid_mimes = map { $_ => 1 } split(/,/, $mime_types);
if ( defined $valid_mimes{$mime} ) {
return 1;
}
}
}
}
return 0;
}
It's a message from sh (not Perl). It concerns an error on line 1 of the script, which was apparently an attempt to run the file utility. But sh couldn't find it.
The code in question executes this command using
qx/file --mime-type -b '${tmp_file}'/
Install file or adjust the PATH so it can be found.
Note that this code suffers from a code injection bug. It will fail if the string in $tmp_path contains a single quote ('), possibly resulting in the unintentional execution of code.
Fixed:
use String::ShellQuote qw( shell_quote );
my $cmd = shell_quote( "file", "--mime-type", "-b", $tmp_file" );
qx/$cmd/
Debian Bullseye was reading our CSV files as the wrong mime-type. It was interpreting the file command as application/csv, despite obviously not being an application.
This may be an actual bug in Bullseye, because both my boss and I have scoured the internet with no lucky finding anyone else with this issue. I may even report to Bullseye's devs for further awareness.
The fix was manually adding in our own mime-types that interpreted this file correctly.
It took us dumping the tmp directory to confirm the files existed, and triple checking I had my modules installed.
This was such a weird and crazy upstream issue that either of us could not have imaged it would be the file type interpretation at an OS level in Bullseye.
I really hope this helps someone, saves them the time it took us to find this.

Less verbose debug screen in Catalyst?

in my stage server I would like to activate the debug so the clients can find errors for themselves before the app goes to the production server.
BUT I only want the first part of the message, not the Request, or the Session Data.
For example: Couldn't render template "templates/home.tt2: file error - templates/inc/heater: not found".
The message is enough for me and for my client to see that the "header" call is misspelled.
The Request has a lot of irrelevant information for the client, but also has A LOT of internal developing information that should be hidden all the time!!
Regards
What you want is to override Catalyst's dump_these method. This returns a list of things to display on Catalyst's error debugging page.
The default implementation looks like:
sub dump_these {
my $c = shift;
[ Request => $c->req ],
[ Response => $c->res ],
[ Stash => $c->stash ],
[ Config => $c->config ];
}
but you can make it more restrictive, for example
sub dump_these {
my $c = shift;
return [ Apology => "We're sorry that you encountered a problem" ],
[ Response => substr($c->res->body, 0, 512) ];
}
You would define dump_these in your app's main module -- the one where you use Catalyst.
I had a similar problem that I solved by overriding the Catalyst method log_request_parameters.
Something like this (as #mob said, put it in your main module):
sub log_request_parameters {
my $c = shift;
my %all_params = #_;
my $copy = Clone::clone(\%all_params); # don't change the 'real' request params
# Then, do anything you want to only print what matters to you,
# for example, to hide some POST parameters:
my $body = $copy->{body} || {};
foreach my $key (keys %$body) {
$body->{$key} = '****' if $key =~ /password/;
}
return $c->SUPER::log_request_parameters( %$copy );
}
But you could also simply return at the beginning, if you don't want any GET/POST parameters displayed.
Well, I didn't think of the more obvious solution, in your case: you could simply set your log level to something higher than debug, which would prevent these debug logs from being displayed, but would keep the error logs:
# (or a similar condition to check you are not on the production server)
if ( !__PACKAGE__->config->{dev} ) {
__PACKAGE__->log->levels( 'warn', 'error', 'fatal' ) if ref __PACKAGE__->log;
}

Perl mechanize Find all links array loop issue

I am currently attempting to create a Perl webspider using WWW::Mechanize.
What I am trying to do is create a webspider that will crawl the whole site of the URL (entered by the user) and extract all of the links from every page on the site.
But I have a problem with how to spider the whole site to get every link, without duplicates
What I have done so far (the part im having trouble with anyway):
foreach (#nonduplicates) { #array contain urls like www.tree.com/contact-us, www.tree.com/varieties....
$mech->get($_);
my #list = $mech->find_all_links(url_abs_regex => qr/^\Q$urlToSpider\E/); #find all links on this page that starts with http://www.tree.com
#NOW THIS IS WHAT I WANT IT TO DO AFTER THE ABOVE (IN PSEUDOCODE), BUT CANT GET WORKING
#foreach (#list) {
#if $_ is already in #nonduplicates
#then do nothing because that link has already been found
#} else {
#append the link to the end of #nonduplicates so that if it has not been crawled for links already, it will be
How would I be able to do the above?
I am doing this to try and spider the whole site to get a comprehensive list of every URL on the site, without duplicates.
If you think this is not the best/easiest method of achieving the same result I'm open to ideas.
Your help is much appreciated, thanks.
Create a hash to track which links you've seen before and put any unseen ones onto #nonduplicates for processing:
$| = 1;
my $scanned = 0;
my #nonduplicates = ( $urlToSpider ); # Add the first link to the queue.
my %link_tracker = map { $_ => 1 } #nonduplicates; # Keep track of what links we've found already.
while (my $queued_link = pop #nonduplicates) {
$mech->get($queued_link);
my #list = $mech->find_all_links(url_abs_regex => qr/^\Q$urlToSpider\E/);
for my $new_link (#list) {
# Add the link to the queue unless we already encountered it.
# Increment so we don't add it again.
push #nonduplicates, $new_link->url_abs() unless $link_tracker{$new_link->url_abs()}++;
}
printf "\rPages scanned: [%d] Unique Links: [%s] Queued: [%s]", ++$scanned, scalar keys %link_tracker, scalar #nonduplicates;
}
use Data::Dumper;
print Dumper(\%link_tracker);
use List::MoreUtils qw/uniq/;
...
my #list = $mech->find_all_links(...);
my #unique_urls = uniq( map { $_->url } #list );
Now #unique_urls contains the unique urls from #list.

Perl -- 'Not a HASH reference' error when using JSON::RPC::Client

I'm a newbie in Perl.
I have a JSON-RPC server running at http://localhost:19000 and I need to call checkEmail() method.
use JSON::RPC::Client;
my $client = new JSON::RPC::Client;
my $url = 'http://localhost:19000';
my $callobj = {
method => 'checkEmail',
params => [ 'rprikhodchenko#gmail.com' ],
};
my $res = $client->call($url, $callobj);
if($res) {
if ($res->is_error) {
print "Error : ", $res->error_message;
}
else {
print $res->result;
}
}
else {
print $client->status_line;
}
When I try to launch it it tells following:
perl ./check_ac.pl
Not a HASH reference at /usr/local/share/perl/5.10.1/JSON/RPC/Client.pm line 193.
UPD:
Full stack-trace:
perl -MCarp::Always ./check_ac.pl
Not a HASH reference at /usr/local/share/perl/5.10.1/JSON/RPC/Client.pm line 193
JSON::RPC::ReturnObject::new('JSON::RPC::ReturnObject', 'HTTP::Response=HASH(0x9938d48)', 'JSON=SCALAR(0x96f1518)') called at /usr/local/share/perl/5.10.1/JSON/RPC/Client.pm line 118
JSON::RPC::Client::call('JSON::RPC::Client=HASH(0x944a818)', 'http://localhost:19000', 'HASH(0x96f1578)') called at ./check_ac.pl line 11
This error means that your JSON-RPC server is not actually one, inasmuch as it does not satisfy requirement 7.3. The error is triggered when JSON::RPC::Client assumes the document returned by the JSON-RPC service is well-formed (i.e., a JSON Object), and this assumptions turns out to have been in error. A bug report to the author of JSON::RPC::Client would be an appropriate way to request better error messaging.
I would attack this sort of problem by finding out what the server was returning that was causing JSON::RPC::Client to choke. Unfortunately, JRC fails to provide adequate hookpoints for finding this out, so you'll have to be a little bit tricky.
I don't like editing external libraries, so I recommend an extend-and-override approach to instrumenting traffic with the JSON-RPC server. Something like this (in check_ac.pl):
use Data::Dumper qw();
package JSON::RPC::InstrumentedClient;
use base 'JSON::RPC::Client';
# This would be better done with Module::Install, but I'm limiting dependencies today.
sub _get {
my ($self, #args) = #_;
return $self->_dump_response($self->SUPER::_get(#args));
}
sub _post {
my ($self, #args) = #_;
return $self->_dump_response($self->SUPER::_post(#args));
}
sub _dump_response {
my ($self, $response) = #_;
warn Data::Dumper::Dump([$response->decoded_content], [qw(content)]);
return $response;
}
package main;
my $client = JSON::RPC::InstrumentedClient->new();
my $url = 'http://localhost:19000';
... # rest of check_ac.pl
This wraps the calls to _get and _post that JSON::RPC::Client makes internally in such a way as to let you examine what the web server actually said in response to the request we made. The above code dumps the text content of the page; this might not be the right thing in your case and will blow up if an error is encountered. It's a debugging aid only, to help you figure out from the client code side what is wrong with the server.
That's enough caveats for now, I think. Good luck.
It seems to be a bug in method new of JSON::RPC::ReturnObject.
sub new {
my ($class, $obj, $json) = #_;
my $content = ( $json || JSON->new->utf8 )->decode( $obj->content );
#...
# line 193
$content->{error} ? $self->is_success(0) : $self->is_success(1);
#...
}
$content's value will be something returned from a JSON::decode() call. But looking at the documentation, it seems that JSON->decode() returns a scalar which could be a number, a string, an array reference, or a hash reference.
Unfortunately, JSON::RPC::ReturnObject->new() doesn't check what sort of thing JSON->decode() returned before trying to access it as a hashref. Given your error, I'm going to go ahead and assume what it got in your case was not one. :-)
I don't know if there's a way to force a fix from your code. I'd recommend contacting the author and letting him know about the issue, and/or filing a bug.

How can I set the Cache-Control header for every response in Catalyst?

It seems that by default Catalyst does not output Cache-Control:, etc. headers. I know I can output them in a given controller method like this:
$c->response->headers->last_modified(time);
$c->response->headers->expires(time + $self->{cache_time});
$c->response->headers->header(cache_control => "public, max-age=$self->{cache_time}");
It'd get pretty painful doing that in each method, though! What I'd prefer is:
A default set of headers (expires now, last modified now, cache-control: no-cache, pragma: no-cache)
A way to, per-method, override the default.
Is there a good way to accomplish this?
derobert:
Excellent question. I covered exactly this in an article for the Catalyst advent calendar.
Basically you create a stash variable that defines your cache time for the given action, and then you process it in your Root end routine. See the article for all the details.
JayK
Update: Based on your response to my earlier suggestion, I decided to bite the bullet and look at the Catalyst docs. It seems to me, the place to do this is in:
sub end : Private {
my ( $self, $c ) = #_;
# handle errors etc.
if ( $c->res->body ) {
if ( "some condition" ) {
set_default_response_headers( $c->response->headers );
return;
}
else {
do_something_else();
return;
}
}
$c->forward( 'MyApp::View::TT' ); # render template
}
Earlier response: I do not use Catalyst, but couldn't you just write a sub for your application?
sub set_default_response_headers {
my ($h) = #_;
$h->last_modified(time);
$h->expires(time + $self->{cache_time});
$h->header(cache_control => "public, max-age=$self->{cache_time}");
return $h;
}
Call with set_default_response_headers( $c->response->headers ).