The problem I have is that I need to fetch one URL. (I can't share the exact link; it performs a request and looks like http://link.com/?name=name&password=password& and so on.)
And I need to fetch this URL 100 times in a row. I can't do this manually in a browser, as that takes far too much time.
Is there a way to simply run this link (just run it, as if you had put the link in the browser and pressed Enter) 100 times in a row using a Perl script?
I haven't worked with Perl before, which is why I'm asking for help directly. I googled for some information and put together a little script, but it seems I'm missing something:
#!/usr/bin/perl -w
use LWP::Simple;
my $uri = 'http://my link here';
my $content = get $uri;   # fetches the page once; returns undef on failure
Could you please advise me on how to finish this script?
Use a (simple) for loop.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
my $uri = 'http://my link here';
get $uri for 1 .. 100;
Update: I've just read in a comment that you don't care about the returned data, so I've edited my answer to remove the unnecessary $content variable.
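If you do ever need to check that each request succeeded, here is a minimal sketch using LWP::UserAgent instead (the status check is my addition, not something the question asked for):
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my $uri = 'http://my link here';
my $ua  = LWP::UserAgent->new(timeout => 10);

for my $n (1 .. 100) {
    my $resp = $ua->get($uri);
    warn "Request $n failed: ", $resp->status_line, "\n"
        unless $resp->is_success;
}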
Related
I'm trying to automate the extraction of a transcript found on a website. The entire transcript sits between <dl> tags, since the site formats the interview as a description list. The script I have below lets me search the site and extract the text in plain-text form, but I actually want it to include everything between the <dl> tags, meaning the <dd>s, <dt>s, and so on. This will allow us to develop our own CSS for the interview.
Something to note about the page is that there are break tags (<br>) inserted at various points during the interview. Some tools we've found that extract information from web pages using tag pairings have had trouble with this, since they only grab the information up to the break tag. Just something to keep in mind if you point me in a different direction. Here's what I have so far.
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;
use WWW::Mechanize::TreeBuilder;
my $mech = WWW::Mechanize->new();
WWW::Mechanize::TreeBuilder->meta->apply($mech);
$mech->get("http://millercenter.org/president/clinton/oralhistory/madeleine-k-albright");
# find all <dl> tags
my @list = $mech->find('dl');
foreach ( @list ) {
print $_->as_text();
}
If there is a tool that essentially prints what I have, only this time as HTML, please let me know!
Your code is fine; just change the as_text() method to as_HTML() and it will show the content with the HTML tags included.
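For example, the print loop from the question becomes (as_HTML() is a standard HTML::Element method, so this is a drop-in change):
foreach ( @list ) {
    print $_->as_HTML();   # same element, serialized with its markup intact
}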
I am trying to write a Perl script that will automatically key in search variables on this LexisNexis search page and retrieve the search results.
I am using the WWW::Mechanize module, but I am not sure how to figure out the field name of the search bar itself. This is the script I have so far:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
my $m = WWW::Mechanize->new();
my $url = "http://www.lexisnexis.com/hottopics/lnacademic/?verb=sr&csi=379740";
$m->get($url);
$m->form_name('f');
$m->field('q', 'Test');
my $response = $m->submit();
print $response->content();
However, I think the name of the search box on this website is not "q". I am getting the following error: "Can't call method "value" on an undefined value at site/lib/WWW/Mechanize.pm line 1442." Any help is much appreciated. Thank you!
If you disable JavaScript in your browser, you will notice that the search form doesn't load. That means it is being built by JavaScript, which is why you can't handle it with WWW::Mechanize. Have a look at WWW::Mechanize::Firefox; it might help you with your task. Check out its example scripts, cookbook, and FAQs.
You can also do the same using Selenium, see Gabor's tutorial on Selenium.
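For illustration, here is a minimal WWW::Mechanize::Firefox sketch. It needs Firefox running with the MozRepl add-on, and the field name 'searchterm' below is a placeholder; inspect the rendered page (for example via your browser's dev tools) to find the real one:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = WWW::Mechanize::Firefox->new();
$mech->get('http://www.lexisnexis.com/hottopics/lnacademic/?verb=sr&csi=379740');

# The page is rendered by the real browser, so JavaScript-built forms exist here.
$mech->field('searchterm', 'Test');   # placeholder field name
$mech->submit();
print $mech->content();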
I am trying to write a basic web scraping program in Perl. For some reason it is not working correctly, and I don't have the slightest clue why.
Just the first part of my code, where I get the content (saving all of the HTML from the web page into a variable), does not work with certain websites.
I am testing it by printing the result, and it prints nothing at all for this specific website. It works with some other sites, but not all.
Is there another way of doing this that will work?
use strict;
use warnings;
use LWP::Simple qw/get getstore/;
## Grab a Web page, and throw the content in a Perl variable.
my $content = get("https://jobscout.lhh.com/Portal/Page/ResumeProfile.aspx?Mode=View&ResumeId=53650");
print $content;
You have a badly-written web site there. The request times out with a 500 Internal Server Error.
I can't suggest how to get around it, but the site almost certainly uses JavaScript as well, which LWP doesn't support, so I doubt an answer would be much use to you.
Update
It looks like the site has been written so that it goes crazy if there is no Accept-Language header in the request.
The full LWP::UserAgent module is necessary to set one up, like this:
use strict;
use warnings;
use LWP;
my $ua = LWP::UserAgent->new(timeout => 10);
my $url = 'https://jobscout.lhh.com/Portal/Page/ResumeProfile.aspx?Mode=View&ResumeId=53650';
my $resp = $ua->get($url, accept_language => 'en-gb,en');   # underscores become dashes, so this sends an Accept-Language header
print $resp->status_line, "\n\n";
print $resp->decoded_content;
This returns with a status of 200 OK and some HTML.
To interact with a website that uses JavaScript, I would advise using the following module: WWW::Mechanize::Firefox.
use strict;
use warnings;
use WWW::Mechanize::Firefox;
my $url = "https://jobscout.lhh.com/Portal/Page/ResumeProfile.aspx?Mode=View&ResumeId=53650"
my $mech = WWW::Mechanize::Firefox->new();
$mech->get($url);
print $mech->status();
my $content = $mech->content();   # the rendered HTML, including JavaScript-generated markup
I have a server from which I can access a web page. I want to test the server's reaction when many users (say 60,000) access the same web page simultaneously.
I'm looking for a script to do this; a Perl script would be best.
Here is the code I tried:
#!c:\\perl\\bin
use strict;
use WWW::Mechanize;
my $url = "http://www.cpan.org";
my $searchstring = "WWW::Mechanize";
my $mech = WWW::Mechanize->new();
while (i == 60000)
{
$mech->get($url);
i++;
}
But this script accesses the URL one request at a time, and I need simultaneous access.
You can use ab (ApacheBench). Use its concurrency (-c) and number-of-requests (-n) options.
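A typical invocation looks like this (the counts are examples; tune them to your test):
ab -n 60000 -c 100 http://www.cpan.org/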
Your code, as shown above, doesn't even compile. It would be far easier for people to help you if you gave us the exact code that you're using, rather than trying to retype it and adding typos.
To get your code to compile, I had to declare $i and add the $ before the two uses of i in your programme. I ended up with this:
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
my $url = "http://www.cpan.org";
my $searchstring = "WWW::Mechanize";
my $mech = WWW::Mechanize->new();
my $i;
while ($i == 60000) {
$mech->get($url);
$i++;
}
You say that the loop only gets executed once. That's not true. The loop doesn't get executed at all. The condition on the loop is $i == 60000. The first time this condition is checked, $i is undefined so the condition is false and the loop is skipped.
I think you probably wanted $i <= 60000 instead.
But as Alan points out, you'd be far better off using an existing tool like ab.
You can either use Perl itself to launch your script 60,000 times in parallel (see the sketch below), or write one (or more) .BAT files or similar Windows scripts to launch the Perl script.
Be careful to understand clearly how to execute things in detached mode under Windows.
(Of course, you would have to remove the while loop from your original script.)
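For the "Perl itself" route, here is a minimal sketch using Parallel::ForkManager (my choice of module, not something mentioned above) to run requests in parallel:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use Parallel::ForkManager;

my $url      = 'http://www.cpan.org';
my $children = 50;        # number of concurrent processes; tune for your server
my $requests = 60000;

my $pm = Parallel::ForkManager->new($children);
for my $n (1 .. $requests) {
    $pm->start and next;  # fork; the parent loops on to start the next child
    my $mech = WWW::Mechanize->new();
    $mech->get($url);
    $pm->finish;          # the child exits here
}
$pm->wait_all_children;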
ab has already been mentioned, but it only tests a single URL. We found siege more useful for replicating real load, since we were able to test against a long list of URLs (10,000 or so) taken from our live access log.
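For example, a typical siege run against a URL file (urls.txt holds one URL per line; the flag values are examples):
siege -c 50 -f urls.txt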
Given a video URL, I want to:
- verify that the input URL can be fetched
- get the URL
- see if the URL was redirected
- get the meta tags as key-value pairs
- get the video tags as a list
- get the URL of the next suggested video
I tried Video::Info, which provides the general information. However, I don't know how to find out things like whether the URL was redirected, or how to get the tags and the next video. Any help is appreciated. Thanks!
Take a look at the CGI module. Perhaps this will get you started with the general process:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use CGI;
my $cgi = CGI->new('metadata=abc&nextVideo=efg&foo=bar&blah=etc');
my @URLQueryKeys = $cgi->param;
foreach my $URLQueryKey (@URLQueryKeys) {
print Dumper $URLQueryKey;
print Dumper $cgi->param($URLQueryKey);
}