I have a question here:
For some reason, when I make a GET request from a Perl script, I get only half of the page.
This happens only on certain machines; on others the whole page is returned as expected.
The requested page is only 64 KB, and Perl is running on Windows 7 (8 GB RAM, 64-bit OS).
I am just wondering if anyone has seen this issue before. Is there any way to solve it?
Thank you!
Here is my code:
use strict;
use warnings;
use feature 'say';   # needed for say() below
use File::Slurp;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->default_header( 'Authorization' => 'Basic [encoded credential]' );
$ua->max_size(2000000000);

# Stream the body straight to a file instead of holding it in memory
my $resp = $ua->get( 'url', ':content_file' => 'test.txt' );
say $resp->status_line;

my $content = read_file('test.txt');
print $content;
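One thing worth checking on the affected machines (my own suggestion, not a confirmed diagnosis): LWP flags truncated transfers itself, so inspecting the Client-Aborted and X-Died response headers will show whether the user agent gave up mid-body.
# These headers are added by LWP when it truncates or aborts a transfer,
# e.g. when max_size is exceeded or the content handler dies
print "Client-Aborted: ", $resp->header('Client-Aborted') // 'none', "\n";
print "X-Died: ",         $resp->header('X-Died')         // 'none', "\n";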
I wrote some simple Perl code:
use HTTP::Request::Common qw(POST);
use strict;
use warnings;
use LWP;
my $ua = LWP::UserAgent->new(agent => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.5) Gecko/20060719 Firefox/1.5.0.5');
my $res = $ua->get('http://60606.com/members/search',Cookie => 'incomerr_refid=0; ff_language_id=3; _t_=1; fcode2=4139; ss_time=1527663671; PHPSESSID=otfmog35f0iva0uld1rgld2kj7; _webpush_key_=none; dinged_id=1412807; _language_id=3; _gnum_=950212; refid=1; first_key=TOCz19ls8HfkFW6LoBrIj35p4xSyYPVE; last_valid_url=https%3A//60606.com/login; ping_last=1527663675.939');
print $res->content;
But when I run it, I get this error:
Too many header lines (limit is 128) at C:/Perl64/lib/Net/HTTP/Methods.pm line 377
Please tell me why I get this error. I think it is caused by the cookie format.
The error is about the number of lines in the response header. That's not related to the Cookie header, which is only a single line.
The place where Net::HTTP::Methods complains is when it reads the incoming response header lines. The limit is 128 lines by default, but it can be changed with a constructor argument.
However, I haven't figured out a way yet to inject a different value from LWP::UserAgent. I've reproduced the problem stand-alone, though.
$ perl -Mojo -E \
'a(sub ($c) { $c->res->headers->header("X-$_" => $_) for 1..128; $c->render(text => "Hello") })->start' \
daemon
This will give you a server on localhost:3000 that responds with loads of header lines. If you then make a request, the same problem occurs.
my $res = $ua->get('http://localhost:3000/');
Unfortunately that does not help us very much.
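That said, here is one untested direction (my own sketch, not something the original answer verified): Net::HTTP documents a MaxHeaderLines constructor option, and LWP::Protocol::http passes extra socket options through its private _extra_sock_opts hook, so a small subclass might be able to raise the limit.
# Untested sketch: relies on the private _extra_sock_opts hook of
# LWP::Protocol::http plus Net::HTTP's documented MaxHeaderLines option
package LWP::Protocol::http::BigHeaders;
use parent 'LWP::Protocol::http';

sub _extra_sock_opts {
    my $self = shift;
    # Keep whatever the parent supplies, then raise the header-line limit
    return ($self->SUPER::_extra_sock_opts(@_), MaxHeaderLines => 512);
}

package main;
use strict;
use warnings;
use LWP::Protocol;
use LWP::UserAgent;

# Route all http:// requests through the subclass
LWP::Protocol::implementor(http => 'LWP::Protocol::http::BigHeaders');

my $ua  = LWP::UserAgent->new;
my $res = $ua->get('http://localhost:3000/');
print $res->status_line, "\n";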
I am trying to write a basic webscraping program in Perl. For some reason it is not working correctly and I don't have the slightest clue as to why.
The first part of my code, which fetches the content (saving all of the HTML from the webpage into a variable), does not work with certain websites.
I am testing it by just printing it out, and it does not print anything out with this specific website. It works with some other sites, but not all.
Is there another way of doing this that will work?
use strict;
use warnings;
use LWP::Simple qw/get getstore/;

## Grab a Web page, and throw the content into a Perl variable.
my $url = "https://jobscout.lhh.com/Portal/Page/ResumeProfile.aspx?Mode=View&ResumeId=53650";
my $content = get($url);
warn "get() failed for $url\n" unless defined $content;   # get() returns undef on failure
print $content if defined $content;
You have a badly-written web site there. The request times out with a 500 Internal Server Error.
I can't suggest how to get around it, but the site almost certainly uses JavaScript as well which LWP doesn't support, so I doubt if an answer would be much use to you.
Update
It looks like the site has been written so that it goes crazy if there is no Accept-Language header in the request.
The full LWP::UserAgent module is necessary to set it up, like this:
use strict;
use warnings;
use LWP;
my $ua = LWP::UserAgent->new(timeout => 10);
my $url = 'https://jobscout.lhh.com/Portal/Page/ResumeProfile.aspx?Mode=View&ResumeId=53650';
my $resp = $ua->get($url, accept_language => 'en-gb,en');
print $resp->status_line, "\n\n";
print $resp->decoded_content;
This returns with a status of 200 OK and some HTML.
To interact with a website that uses JavaScript, I would advise using the WWW::Mechanize::Firefox module (note that it drives a real Firefox instance through the MozRepl extension):
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $url = "https://jobscout.lhh.com/Portal/Page/ResumeProfile.aspx?Mode=View&ResumeId=53650";
my $mech = WWW::Mechanize::Firefox->new();
$mech->get($url);

print $mech->status(), "\n";
my $content = $mech->content();
print $content;
How can I redirect a user in Perl?
I've tried many of the methods online, but none seem to work.
For example, I tried the following, but it doesn't seem to work; it just prints "Location: {url....}" out on the page:
#!/usr/bin/perl
use CGI;
$q = CGI->new;
print $q->header("text/html");
$url = $q->param('url');
print "Location: $url\n\n";
Any help would be appreciated. Thanks!
You included the Location header in the body rather than in the header, and you never set the proper response code.
The module's documentation has a section titled GENERATING A REDIRECTION HEADER which shows how to build the header.
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw( );
my $q = CGI->new();
my $url = $q->param('url');
print $q->redirect($url);
Why load a module when you can simply print the header yourself:
print("Status: 301 Moved Permanently\nLocation: $url\n\n");
Or, to be a little more thorough, with a Content-Type as well:
print("Status: 301 Moved Permanently\nLocation: $url\nContent-Type: text/html; charset=UTF-8\n\n");
For a temporary redirect, use "302 Found" instead.
Also remove the print $q->header("text/html"); line, since the redirect must be the only header block sent; printing a header first pushes "Location:" into the body.
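Putting those pieces together, a minimal self-contained version of this approach (keeping the question's url parameter) might look like this:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;

my $q   = CGI->new;
my $url = $q->param('url');

# The CGI "Status:" pseudo-header tells the web server which HTTP status
# line to emit; "Location:" carries the redirect target, and the blank
# line ends the header block.
print "Status: 302 Found\nLocation: $url\n\n";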
I'm having a bit of difficulty with HTTP::Request in Perl.
My script is set up like so:
use CGI;
use MIME::Base64;
use HTTP::Cookies;
use HTTP::Request;
use LWP::UserAgent;
$request = HTTP::Request->new(GET => 'http://www.example.com/');
$ua = LWP::UserAgent->new;
$response = $ua->request($request);
print "$response";
When I run the script on my website, I get a message like so:
HTTP::Response=HASH(0x987f8d8)
(I'm trying to get it to print the content of example.com.)
That output is the stringified object reference (an HTTP::Response is a blessed hash). You have to access the response through its methods, like below:
print $response->decoded_content();  # decoded body (e.g. gunzipped, charset-decoded)
print $response->content();          # raw body bytes
print $response->status_line;        # e.g. "200 OK"
You can also use Data::Dumper to print the whole $response and see what it actually holds:
use Data::Dumper;
print Dumper $response;
The request indeed returns an object: an HTTP::Response, implemented as a blessed hash.
$response->content has the result.
From the documentation:
$r->content( $bytes )
This is used to get/set the content and it is inherited from the HTTP::Message base class. See HTTP::Message for details and other methods that can be used to access the content.
Note that the content should be a string of bytes. Strings in perl can contain characters outside the range of a byte. The Encode module can be used to turn such strings into a string of bytes.
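For illustration, encoding a character string to bytes before setting it as content might look like this (the $string value here is hypothetical; encode comes from the core Encode module):
use Encode qw(encode);
use HTTP::Request;

my $string  = "snowman \x{2603}";        # contains a character above 0xFF
my $bytes   = encode('UTF-8', $string);  # now a plain string of bytes
my $request = HTTP::Request->new(POST => 'http://www.example.com/');
$request->content($bytes);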
You need to do this:
print $response->content();
Take a look at the docs: http://metacpan.org/pod/LWP#An-Example
When I try to print an image to STDOUT in a Perl CGI script, the image gets clipped when viewed in the browser.
Here is the code:
if ($path =~ m/\.jpe?g$/i)
{
my $length = (stat($path))[7];
$| = 1;
print "Content-type: image/jpg\r\n";
print "Content-length: $length\r\n\r\n";
open(IMAGE,"<$path");
binmode(IMAGE);
binmode(STDOUT);
my ($image, $buff);
read IMAGE, $buff, $length;
syswrite STDOUT, $buff, $length;
close IMAGE;
}
If you really want to read the entire file into memory before serving, use File::Slurp:
#!/usr/bin/perl
use strict;
use warnings;

use CGI::Simple;
use File::Slurp;
use File::stat;

local $| = 1;

my $path = 'image.jpg';   # hypothetical; use whatever $path holds in your script

my $cgi = CGI::Simple->new;
my $st  = stat($path) or die "Cannot stat '$path'";

print $cgi->header(
    -type   => 'image/jpeg',
    -length => $st->size,
);

write_file( \*STDOUT, { binmode => ':raw' },
    \ read_file( $path, binmode => ':raw' )
);
However, reading the entire file will consume large amounts of memory for large images. Therefore, see How can I serve an image with a Perl CGI script?.
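Alternatively, here is a minimal streaming sketch (my own addition, not from the linked answer): copy the file in fixed-size chunks so memory use stays constant no matter how large the image is.
# Stream the file to the client in 64 KB chunks instead of slurping it
open my $fh, '<:raw', $path or die "Cannot open '$path': $!";
binmode STDOUT, ':raw';
while ( read( $fh, my $buf, 64 * 1024 ) ) {
    print STDOUT $buf;
}
close $fh;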
EDIT: As the stat doesn't seem to be the problem, some more ideas:
Try unbuffered reading instead of buffered reading, i.e. use sysread instead of read. Or go the other way and use both buffered read and buffered write. Also try commenting out the $| = 1. See "Suffering from Buffering?" for details on Perl's buffered I/O, and see "How can I serve an image with a Perl CGI script?" here on SO for an apparently working solution. EDIT END
You are using the wrong stat field: (stat($path))[10] is ctime, the inode change time in seconds since the epoch. It should be (stat($path))[7], the total size of the file in bytes.
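As a side note, the core File::stat module (already used in the answer above) gives those fields self-documenting names:
use File::stat;

# Named accessors instead of magic list indices
my $st = stat($path) or die "Cannot stat '$path': $!";
my $size  = $st->size;    # same as (stat($path))[7]
my $ctime = $st->ctime;   # same as (stat($path))[10]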
FYI: I have come to the conclusion that the images are in fact corrupt, though they are fully viewable in Windows File Explorer.
Firefox shows the images clipped (no matter how they are accessed, so I guess this is no longer a Perl problem), but Safari displays them completely.
The images were resampled using Java's ImageIO in "jpg" mode. I just changed the mode to "png", and now the newly generated images display perfectly in all browsers. So this was actually a Java ImageIO issue.
It is solved.
Thank you everyone for your responses.