How can I read the URL data sent with POST in Perl? - perl

I'm trying to read out the POST-Data that was sent from a form in a page to my Perl Script. I googled and found out that:
read(STDIN, $param_string, $ENV{'CONTENT_LENGTH'})
reads the whole data string and writes it to $param_string in the form of
Param1=Value1&Param2=Value2&Param3=Value3
By splitting it at the right places, I get the data I need.
But I wonder why my $param_string is empty.
When I try the whole thing with GET:
$param_string = $ENV{'QUERY_STRING'};
everything works fine. Does anybody have an idea?

There is absolutely no real reason for someone at your level to want to hand parse CGI requests.
Please use CGI::Simple or CGI.pm.
CGI.pm has a lot of baggage (HTML generation, function oriented interface) which makes CGI::Simple preferable.
Using any CGI processing module on CPAN is better than trying to write CGI processing code from scratch.
See parse_query_string in CGI::Simple for a way of accessing parameters passed using the query string when processing a form that is POSTed to your script.
If you want to learn how to do it right, you can read the source code of either module. Reading through the CGI.pm CHANGES file is also instructive.

If you are able to retrieve GET data but not POST data, most likely you forgot to change the form method to post. You can check the request method with a condition like this:
if ($ENV{'REQUEST_METHOD'} eq "POST"){
read(STDIN, $param_string, $ENV{'CONTENT_LENGTH'});
}else {
$param_string = $ENV{'QUERY_STRING'};
}
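To make the splitting step concrete, here is a minimal sketch that uses only core Perl. The helper names url_decode, parse_params and raw_param_string are made up for this illustration, and it assumes the usual application/x-www-form-urlencoded body. It is meant to show what happens under the hood, not to replace CGI::Simple or CGI.pm.

```perl
use strict;
use warnings;

# Decode one URL-encoded component ("+" means space, %XX is a byte).
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Split "a=1&b=2" into a hash reference; the last value wins for repeated names.
sub parse_params {
    my ($param_string) = @_;
    my %params;
    for my $pair (split /[&;]/, $param_string // '') {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $params{ url_decode($name) } = url_decode($value // '');
    }
    return \%params;
}

# Fetch the raw parameter string for the current request (hypothetical helper).
sub raw_param_string {
    my $method = $ENV{REQUEST_METHOD} // 'GET';
    return $ENV{QUERY_STRING} // '' unless $method eq 'POST';
    my $body = '';
    read(STDIN, $body, $ENV{CONTENT_LENGTH} // 0);
    return $body;
}
```

In a script you would call parse_params(raw_param_string()) once, since STDIN can only be read one time per request.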

Under mod_perl 2, Apache2::Request works for me.

Related

Selenium WebDriver with Perl

I am trying to run the Selenium driver with Perl bindings, and due to the lack of examples and documentation, I am running into some roadblocks. I have figured out how to do some basic things, but I seem to be running into some issues with other simple things like validating the text on a page using Remote::Driver package.
If I try to do something like this:
$sel->get("https://www.yahoo.com/" );
$ret = $sel->find_element("//div[contains( text(),'Thursday, April 26, 2012')]");
I get a message back that the element couldn't be found. I am using xpath because the driver package doesn't appear to have a sub specific for finding text.. at least not that I've found.
If my xpath setup is wrong or if someone knows a better way, that would be extremely helpful. I'm having problems with some button clicking too.. but this seems like it should be easier and is bugging me.
Finding text on a web page and comparing that text to some "known good value" using Selenium::Remote::Driver can be implemented as follows:
File: SomeWebApp.pm
package SomeWebApp;
sub get_text_present {
my $self = shift;
my $target = shift;
my $locator = shift;
my $text = $self->{driver}->find_element($target, $locator)->get_text();
return $text;
}
Somewhere in your test script: test.pl
my $text = $some_web_app->get_text_present("MainContent_RequiredFieldValidator6", "id");
The above finds the element identified by $target using the locating scheme identified by $locator and stores it in the variable $text. You can then use that to compare / validate as required / needed.
https is a tad slower loading than http. Although WebDriver is pretty good about waiting until it's figured out that the requested page is fully loaded, maybe you need to give it a little help here. Add a sleep(2); after the get() call and see if it works. If it does, try cutting down to 1 second. You can also do a get_title call to see if you've loaded the page you think you have.
The other possibility is that your text target isn't quite exactly the same as what's on the page. You could try looking first for one word, such as "April", and see if you get a hit, and then expand until you find the mismatch (e.g., does this string actually have a newline or break within it? How about an HTML entity such as a non-breaking space?). Also, you are looking for that bit of text anywhere under a div (all child text supposedly is concatenated, and then the search done). That would more likely cast too wide a net than not get anything at all, but it's worth knowing.
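Instead of a fixed sleep, one option is to poll until the element shows up. The sketch below is a generic retry helper in core Perl; wait_for is a name invented here, not part of Selenium::Remote::Driver. It assumes that the lookup you pass in either returns a true value or throws an exception when nothing is found.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Call $check repeatedly until it returns a true value or $timeout seconds pass.
# Exceptions thrown by $check (for example "element not found") count as "not yet".
sub wait_for {
    my ($check, $timeout, $interval) = @_;
    $timeout  //= 5;
    $interval //= 0.25;
    my $deadline = time() + $timeout;
    while (1) {
        my $result = eval { $check->() };
        return $result if $result;
        return if time() >= $deadline;
        sleep($interval);
    }
}

# Hypothetical usage with the driver object from the question:
# my $el = wait_for(sub { $sel->find_element("//div[contains(text(),'April')]") }, 10);
```

Starting with a short substring such as 'April' also helps with the second suggestion, because you can lengthen the text step by step until the match fails.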

Erroring in my Perl script coming from CAM::PDF::Annot module. Don't know why

I believe this may be a bug in the module I am using, or I am just completely overlooking something.
My code is this:
#!/usr/bin/perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::Annot;
sub main()
{
my $pdf = CAM::PDF::Annot->new( 'b.pdf' );
my $otherDoc = CAM::PDF::Annot->new( 'b_an.pdf' );
my $page = 1;
my %refs;
my @list = @{$pdf->getAnnotations($page)};
for my $annotRef (@list){
$otherDoc->appendAnnotation( $page, $pdf, $annotRef, \%refs);
}
$otherDoc->output('pdf_merged.pdf');
}
exit main;
This code was taken almost directly from the synopsis found on the module's CPAN page: http://metacpan.org/pod/CAM::PDF::Annot
The problem comes when I run the script using TWO pdf's with annotations. Using two pdf's without annotations runs. Using one pdf with annotations, and one pdf without annotations, runs. Only when both pdf's have annotations does it error.
The error is: "Can't use string ("46") as an ARRAY ref while "strict refs" in use at /usr/opt/perl5/lib/site_perl/5.10.1/CAM/PDF/Annot.pm line 195"
Line 195 of Annot.pm is:
push @{$annots->{value}}, $pupRef;
Annot.pm is inside the CAM::PDF::Annot module.
Any guidance in fixing this would be greatly appreciated!
P.S. In the error, "string ("x")", x is always a number, and seems to change depending on the pdf and the annotations within the pdf.
And I will try to add any other information that you need to help figure this out!
Whenever I have a problem with a CPAN module, I go to its webpage to try and assess its quality and see if any bugs have already been reported.
http://search.cpan.org/~donatoaz/CAM-PDF-Annot-0.06 shows the following suspicious results:
CPAN Testers PASS (2) FAIL (168) NA (49)
It is surprising that you were able to install the module. No one has reported bugs, but there is clearly a major problem with the code. It seems the author is either unaware of the tester reports (which have been sent to his CPAN email address for more than a year), or has stopped maintaining it.
You could submit a bug report, so at least others will be aware of your issue.
I realize this does not answer your question of how to fix the problem, but even if you do identify a fix, the author may not apply it (in which case, someone could start the process of becoming a co-maintainer).
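As for the error message itself, it is what Perl reports whenever a plain number is dereferenced as an array under "strict refs". The sketch below reproduces that class of error with a made-up hash. It has not been checked against the CAM::PDF::Annot source, so treat the idea that $annots->{value} holds a number instead of an array reference as a guess. safe_push is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical node whose value is a plain number rather than an array reference.
my $annots = { value => 46 };

# Under "strict refs" this push dies, because 46 is not an ARRAY ref.
my $ok = eval { push @{ $annots->{value} }, 'new entry'; 1 };
my $error = $ok ? '' : $@;
print $error;

# Defensive variant: only push when the value really is an array reference.
sub safe_push {
    my ($node, $item) = @_;
    return 0 unless ref $node->{value} eq 'ARRAY';
    push @{ $node->{value} }, $item;
    return 1;
}
```

Dumping $annots with Data::Dumper just before line 195 would show whether this is what happens with your two files.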

Strange problem getting $ENV{'QUERY_STRING'}

I'm having a strange problem here. I'm moving a (working) site to a new apache server to which I don't have direct access (I have to go through two people to get stuff done).
The site uses a perl script called adframe to parse html templates. The URLs with which it's called look like /cgi-bin/adframe/index.html?x=something with adframe being the script. The missing suffix never caused any real problems. But on this new Ubuntu server $ENV{'QUERY_STRING'} is always empty. $ENV{'REQUEST_METHOD'} shows up correctly as GET, but the query_string shows nothing ...
Regular *.cgi scripts show the query_string without problems.
From the logs I gathered that the server seems to be running fastcgi, mod_fcgid and the server doesn't even accept .pl as an extension for scripts. I don't have that much experience with server software, but I figured it might be a problem with the server not accepting adframe as a cgi script and thus not passing the query_string correctly ... Can anyone give me a few hints to where I could point the administrator or maybe something I could do in .htaccess myself? Is there any way to make sure adframe is recognized as a cgi script? (if that's the problem ...)
Any help is appreciated!
thomas
EDIT: I found more details: The server seems to be running a VARNISH cache ... that's the main difference to my usual configurations ...
Also, the way the script works is, if you call /cgi-bin/adframe/somedir/somefile.html?x=something, $ENV{PATH_INFO} tells which template to parse and $ENV{QUERY_STRING} is, well, the query string. Now the query string is empty, but if I call /cgi-bin/adframe?x=something (without any PATH_INFO), the query string shows up!
Does anyone have an idea what's going on here?
thanks!
Got it. The VARNISH cache strips all the query strings off static content (*.html etc) ... phew
Just ran into the same problem. I am complete newbie in perl scripting.
I tried following:
@values = split (/&/, $ENV{'QUERY_STRING'});
but it didn't work
this worked:
@values = split (/&/, "$ENV{'QUERY_STRING'}");
just in case other newbies have run into the same problem.

How do I get the text-form verification code when doing auto site access in Perl?

I'm playing around with Win32::IE::Mechanize to try to access some authentication-required sites automatically. So far I've achieved moderate success; for example, I can automatically log in to my Yahoo mailbox. But I find many sites use some kind of image verification mechanism, probably what is called CAPTCHA, and I can do nothing about those. One of the sites I'm trying to access automatically, however, uses a plain-text verification code. It is composed of four digits, selectable and copyable. But the digits are not in the source that can be fetched using
$mech->content;
I searched for the keyword that appears on the webpage but not in the source file through all the files in the Temporary Internet Files but still can't find it.
Any idea what's going on? I was suspecting that the verification code was somehow hidden in some cookie file but I can't seem to find it :(
The following is the code that completes all the fields requirements except for the verification code:
use warnings;
use Win32::IE::Mechanize;
my $url = "http://www.zjsmap.com/smap/smap_login.jsp";
my $eccode = "myeccode";
my $username = "myaccountname";
my $password = "mypassword";
my $verify = "I can't figure out how to let the script get the code yet";
my $mech = Win32::IE::Mechanize->new(visible=>1);
$mech->get($url);
sleep(1); #avoids undefined value error
$mech->form_name("BaseForm");
$mech->field(ECCODE => $eccode);
$mech->field(MEMBERACCOUNT => $username);
$mech->field(PASSWORD => $password);
$mech->field(verify => $verify);
$mech->click();
Like always any suggestions/comments would be greatly appreciated :)
UPDATE
I've figured out a not-so-smart way to solve this problem. Please comment on my own answer posted below. Thanks like always :)
This is the reason why they are there: to stop programs like yours from doing automated stuff ;-)
A CAPTCHA or Captcha is a type of challenge-response test used in computing to ensure that the response is not generated by a computer.
This appears to be an irrelevant number. The page uses it in three places: generating it, displaying it on the form next to its input field, and checking that the input value equals the random number chosen. That is, it is a client-only check. Still, if you disable JavaScript, I'm guessing that important cookies don't get set. If you can execute JavaScript in the context of the page (you should be able to with a get method call and a javascript: URI), you could change the value of random_number to, for example, 42 and fill that in on the form.
The code is inserted by JavaScript – disable JS, reload the page and see it disappear. You have to hunt through the JS code to get an idea where it comes from and how to replicate it.
Thanks to james2vegas, zoul and Shoban.
I've finally figured out on my own a not-so-smart but at-least-workable way to solve the problem I described here. I'd like to share it here. I think the approach suggested by @james2vegas is probably much better...but anyway I'm learning along the way.
My approach is this:
Although the verification code is not in the source file, it is still selectable and copyable, so I can let my script copy everything on the login page and then extract the verification code.
To do this, I use the SendKeys function in the Win32::GuiTest module to send "Select All" and "Copy" to the login page.
Then I use Win32::Clipboard to get the clipboard content and a regexp to extract the code. Something like this:
$verify = Win32::Clipboard::GetText();
$verify =~ s/.* (\d{4}).*/$1/msg;
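A slightly more defensive way to do the extraction step is to capture the digits instead of substituting the whole text. This is only a sketch: extract_code is a made-up helper, and it assumes the code is the first run of exactly four digits in the copied page text, which may not hold for every page.

```perl
use strict;
use warnings;

# Return the first standalone run of exactly four digits, or undef if there is none.
sub extract_code {
    my ($text) = @_;
    my ($code) = $text =~ /(?<!\d)(\d{4})(?!\d)/;
    return $code;
}

# Hypothetical usage with the clipboard text from the step above:
# my $verify = extract_code(Win32::Clipboard::GetText());
```

Unlike the s/// version, this leaves the original text untouched and gives you undef to test for when no code was found.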
A few thoughts:
The random number is generated by something like this in Perl
my $random_number = int(rand(8999)) + 1000; #var random_number = rand(1000,10000);
And then it checks if $verify == $random_number. I don't know how to catch the value of one-session-only $random_number. I think it is stored somewhere in the memory. If I can capture the value directly then I wouldn't have gone to so much trouble of using this and that extra module.

How can I run a CGI::Application run mode from the command line?

I have a run mode in my CGI::Application web-app that I would like to be able to trigger from the command line so I can automate it. From the web-app's perspective it does some processing then sends the results in an email.
When called from the web interface it passes in a set of parameters (email address, which query to run, date, etc) so these need to be passed in.
How can I construct a call to the CGI::Application app that will be the same as if I ran it from the web?
The original CGI specification makes it easy to run things from the command line and was fully intended not as a specific HTTP-only interface but something that could handle FTP and gopher as well as new top-level URL schemes. I know what I wanted when I helped specify it.
The spec I referenced should give you all you need, but for the most part it is just a collection of environment variables. If you see a request for:
http://some.server.com/some/path?a=b&c=d
The environment variables come out looking like this:
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
HTTP_HOST=some.server.com
SERVER_PORT=80
PATH_INFO=/some/path
QUERY_STRING=a=b&c=d
To reverse the polarity of that in Perl would go something like this:
$ENV{'SERVER_PROTOCOL'} = 'HTTP/1.1';
$ENV{'REQUEST_METHOD'} = 'GET';
$ENV{'SERVER_PORT'} = 80;
$ENV{'PATH_INFO'} = '/some/path';
$ENV{'QUERY_STRING'} = 'a=b&c=d';
system("perl your-CGI-script.pl");
Things get a bit more complicated in handling POST queries and there are more possible environment variables that may be required. Worst case you can enumerate them all with a quick CGI script something like:
print "Content-Type: text/plain\r\n\r\n";
foreach (keys(%ENV))
{
print "$_=$ENV{$_}\r\n";
}
Now put that on the web server in place of your CGI script and you'll see all the environment that gets passed in (and the original environment so you'll need to make a few judgement calls).
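Putting those pieces together, a small wrapper could derive the variables from a URL and run the script with them. This is a sketch under simplifying assumptions: cgi_env_for_url and run_cgi are names invented here, only GET is handled, the URL is parsed with a simple regex, and the whole URL path is passed as PATH_INFO as in the example above (a real server only passes the part after the script name).

```perl
use strict;
use warnings;

# Build a CGI-style environment for a simple GET request to $url.
sub cgi_env_for_url {
    my ($url) = @_;
    my ($scheme, $host, $port, $path, $query) =
        $url =~ m{^(\w+)://([^/:?#]+)(?::(\d+))?([^?#]*)(?:\?([^#]*))?}
        or die "Cannot parse URL: $url\n";
    return (
        GATEWAY_INTERFACE => 'CGI/1.1',
        REQUEST_METHOD    => 'GET',
        SERVER_PROTOCOL   => 'HTTP/1.1',
        HTTP_HOST         => $host,
        SERVER_NAME       => $host,
        SERVER_PORT       => $port // ($scheme eq 'https' ? 443 : 80),
        PATH_INFO         => $path,
        QUERY_STRING      => $query // '',
    );
}

# Run a CGI script with that environment added to the current one.
# The script path is whatever you pass in; $^X is the running perl binary.
sub run_cgi {
    my ($script, $url) = @_;
    local %ENV = (%ENV, cgi_env_for_url($url));
    return system($^X, $script);
}
```

Because %ENV is localized, the extra variables disappear again once run_cgi returns.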
Upon further digging through the CGI::App and the CGI documentation, it appeared to be more straightforward than I thought. The simplest case (no real argument handling or dealing with the output from the webapp run call) is:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use WebApp;
my $cgi = CGI->new( { @ARGV } ); # name/value pairs from the command line become the query parameters
my $webapp = WebApp->new( QUERY => $cgi );
$webapp->run();
It just takes a series of space separated name value pairs to create the CGI. You need to pass in the run mode and all the arguments.
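If you want the script to accept both "name value" and "name=value" on the command line, the arguments can be normalized into a hash reference first. A minimal sketch, where args_to_params is a hypothetical helper and WebApp is the application class from the code above:

```perl
use strict;
use warnings;

# Turn command-line arguments into a parameter hash reference.
# Accepts "name=value" as one argument or "name value" as two.
sub args_to_params {
    my @args = @_;
    my %params;
    while (@args) {
        my $arg = shift @args;
        if ($arg =~ /^([^=]+)=(.*)$/s) {
            $params{$1} = $2;
        } else {
            $params{$arg} = shift(@args) // '';
        }
    }
    return \%params;
}

# Hypothetical usage:
# my $cgi    = CGI->new( args_to_params(@ARGV) );
# my $webapp = WebApp->new( QUERY => $cgi );
# $webapp->run();
```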
I'm the maintainer of CGI::Application, and I do this all the time. I have dozens of cron scripts built with CGI::Application because it's convenient to share the infrastructure with the application.
The simplest approach is this:
# There is no browser to return results to.
$ENV{CGI_APP_RETURN_ONLY} = 1;
my $app = WebApp->new;
$app->direct_run_mode_method;
In that example, you bypass the normal flow and call a method directly. Be sure you don't need any of the "setup" or "teardown" actions to happen in that case.
If you are only calling one run mode, you can also set "start_mode" to it and call run(), so that it runs as the default run mode.
Another idea: you can use a module like Getopt::Long and pass in values through the PARAM hash to new(), or completely replace the run-mode selection process. Here's an example where command line flags are used to determine the run mode:
sub setup {
my $self = shift;
$self->start_mode('send_error_digests');
$self->run_modes([qw/
send_error_digests
help
/]);
my ($dry_run, $help);
GetOptions(
'dry-run' => \$dry_run,
'help' => \$help
);
$self->param('dry_run' => $dry_run);
$self->mode_param(sub {
return 'help' if $help;
return $self->start_mode();
});
}
Thusly:
$ perl yourscript.pl field1=value1 field2=value2
Perl's CGI library takes care of the magic for you, and it appears that CGI::Application relies on CGI (judging from their example code).
Instead of having to go through CGI::Application every time you want to get something done, enforce a proper separation of concerns, perhaps using an MVC setup. All of the functionality should exist outside of the CGI::Application stuff since that should only work as a controller. Once you separate out those bits, you can easily write other controllers for other input methods.
Don't write a web application; write an application that happens to have a web interface. When you have that, you can easily give your application other sorts of interfaces.
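As a rough illustration of that separation, the sketch below keeps the work in a plain package and adds a thin command-line front end using the core Getopt::Long module. ReportApp, build_report and run_from_cli are invented names, and the report itself is a placeholder string. A CGI::Application run mode would call the same build_report function with values taken from the query.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical application logic, independent of any web framework.
package ReportApp {
    sub build_report {
        my (%args) = @_;
        return sprintf 'Report "%s" for %s on %s',
            $args{query}, $args{email}, $args{date};
    }
}

# Command-line front end: translate flags into the arguments a run mode would pass.
sub run_from_cli {
    my @argv = @_;
    my %opt = (date => 'today');
    GetOptionsFromArray(\@argv, \%opt, 'email=s', 'query=s', 'date=s')
        or die "Usage: $0 --email ADDR --query NAME [--date DATE]\n";
    return ReportApp::build_report(%opt);
}

# Hypothetical usage from a cron script:
# print run_from_cli(@ARGV), "\n";
```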
You could automate by calling the web app using curl, wget, or an LWP GET-script with the appropriate parameters. I've used a similar system for cron-driven tasks with a Catalyst application.
That deals with all the environment variables for you.
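If you would rather stay in Perl than shell out to curl or wget, the core HTTP::Tiny module can make the same request. A minimal sketch: run_mode_url and trigger_run_mode are made-up helpers, the base URL is a placeholder, and "rm" is assumed to be the run-mode parameter (the CGI::Application default, but your app may use another name).

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Build the URL for one run mode; parameters are form-encoded.
sub run_mode_url {
    my ($base, %params) = @_;
    my $query = HTTP::Tiny->new->www_form_urlencode(\%params);
    return "$base?$query";
}

# Request the URL and return the response body, or die on failure.
sub trigger_run_mode {
    my ($base, %params) = @_;
    my $response = HTTP::Tiny->new(timeout => 30)->get(run_mode_url($base, %params));
    die "Request failed: $response->{status} $response->{reason}\n"
        unless $response->{success};
    return $response->{content};
}

# Hypothetical usage from cron:
# trigger_run_mode('http://localhost/cgi-bin/webapp.cgi',
#                  rm => 'send_report', email => 'me@example.com');
```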