Why does WWW::Mechanize fail with "X-Died: Illegal field name 'X-Meta-Twitter:title'"? - perl

Why does WWW::Mechanize have blank content after getting the following URL? Using a browser or curl there is a full HTML page retrieved.
use WWW::Mechanize;
$mech = new WWW::Mechanize;
$mech->get("http://www.belizejudiciary.org/web/judgements2/");
print $mech->content # prints nothing
Here is the dump of the response:
HTTP/1.1 200 OK
Connection: close
Date: Fri, 10 Feb 2017 00:51:47 GMT
Server: Apache/2.4
Content-Type: text/html; charset=UTF-8
Client-Aborted: die
Client-Date: Fri, 10 Feb 2017 00:51:48 GMT
Client-Peer: 98.129.229.64:80
Client-Response-Num: 1
Client-Transfer-Encoding: chunked
Link: <http://www.belizejudiciary.org/web/wp-json/>; rel="https://api.w.org/"
Link: <http://www.belizejudiciary.org/web/?p=468>; rel=shortlink
Set-Cookie: X-Mapping-hepadkon=FAB86566672CEB74D66B2818CA030616; path=/
X-Died: Illegal field name 'X-Meta-Twitter:title' at /usr/local/lib/perl5/site_perl/5.16.3/sun4-solaris/HTML/HeadParser.pm line 207.
X-Pingback: http://www.belizejudiciary.org/web/xmlrpc.php
I have version 3.70 of HTML::Parser installed.

Your dump shows that there was an error parsing the response:
X-Died: Illegal field name 'X-Meta-Twitter:title' at /usr/local/lib/perl5/site_perl/5.16.3/sun4-solaris/HTML/HeadParser.pm line 207.
This is caused by a bug in HTML::HeadParser:
<meta> tags can have name attributes with colons in them, and this is perfectly valid. But HTML::HeadParser then tries to register these as X-Meta-<name> headers using HTTP::Headers. Newer versions of HTTP::Headers (since 6.05) have stricter checks for headers, and will refuse them if they contain colons.
This was fixed in version 3.71 of the HTML-Parser distribution, so you should upgrade.
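To make the failure mode concrete, here is a minimal sketch in plain Perl. It is not the code of HTML::HeadParser or HTTP::Headers. The helper names is_legal_field_name and meta_to_field_name are hypothetical, and the legality check assumes the RFC 7230 "token" grammar, which may be stricter than what HTTP::Headers actually enforces.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name contains only RFC 7230 "token" characters.
# A colon is not a token character, because it separates field name from value.
sub is_legal_field_name {
    my ($name) = @_;
    return $name =~ /\A[A-Za-z0-9!#\$%&'*+.^_`|~-]+\z/ ? 1 : 0;
}

# Hypothetical helper: build an X-Meta-* field name from a <meta name="...">
# value, replacing colons with hyphens so the result is a legal field name.
sub meta_to_field_name {
    my ($meta_name) = @_;
    (my $safe = $meta_name) =~ tr/:/-/;
    return 'X-Meta-' . ucfirst($safe);
}

my $naive = 'X-Meta-' . ucfirst('twitter:title');
my $fixed = meta_to_field_name('twitter:title');

print "$naive legal? ", is_legal_field_name($naive), "\n";   # prints 0
print "$fixed legal? ", is_legal_field_name($fixed), "\n";   # prints 1
```

If upgrading is not possible, another option that may work is to switch off head parsing altogether with $mech->parse_head(0) before calling get; parse_head is an LWP::UserAgent attribute that WWW::Mechanize inherits.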

Related

How to correctly display messages in console?

I use a Mojolicious application. Logging works fine, except that when I run the application under morbo I see text like:
$app->log->info('тест лога');
[Sat Oct 6 15:22:43 2018] [info] �е�� лога
There seems to be some problem with UTF-8.
What should I do so that the messages are displayed correctly?
My terminal supports UTF-8. I run Linux Mint v19.3.
[Screenshot: messages printed by the script]
[Screenshot: terminal test]
Don't use the following when using Mojo::Log to write to STDERR:
binmode STDERR, ':encoding(UTF-8)';
Mojo::Log explicitly encodes everything it logs using UTF-8, even when writing to STDERR.
sub append {
  my ($self, $msg) = @_;
  return unless my $handle = $self->handle;
  flock $handle, LOCK_EX;
  $handle->print(encode('UTF-8', $msg)) or croak "Can't write to log: $!";
  flock $handle, LOCK_UN;
}
When using STDERR as the log output (the default), this conflicts with the well-established practice of adding an encoding layer to STD*. In such a circumstance, double-encoding occurs.
One must therefore avoid doing
binmode STDERR, ':encoding(UTF-8)';
Note that this is done as part of
use open ':std', ':encoding(UTF-8)';
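The double-encoding described above can be reproduced without Mojolicious. The following is a minimal sketch using only the core Encode module. The helper name encoded_lengths is hypothetical, and the second encode call stands in for the extra :encoding(UTF-8) layer on STDERR.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the byte length of $text after encoding it
# to UTF-8 once (what Mojo::Log does) and twice (what happens when an
# :encoding(UTF-8) layer on the handle encodes the bytes again).
sub encoded_lengths {
    my ($text) = @_;
    my $once  = encode('UTF-8', $text);   # characters -> UTF-8 bytes
    my $twice = encode('UTF-8', $once);   # bytes treated as Latin-1 characters
    return (length $once, length $twice);
}

my $msg = "\x{442}\x{435}\x{441}\x{442}";  # the four Cyrillic letters of 'тест'
my ($once, $twice) = encoded_lengths($msg);

printf "%d characters, %d bytes once, %d bytes twice\n",
    length $msg, $once, $twice;
# prints: 4 characters, 8 bytes once, 16 bytes twice
```

Each Cyrillic letter becomes two bytes on the first pass, and each of those bytes (all above 0x7F) becomes two more on the second pass, which is the garbage the terminal shows.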
Try running the following code sample on your system. In my test it produced correct UTF-8 output in the terminal.
#!/usr/bin/env perl
use Mojolicious::Lite -signatures;
get '/' => sub ($c) {
$c->render(text => 'Hello World!');
};
app->log->info('тест лога');
app->start;
Running it as morbo test.pl produces the following output:
:u99257852:~/work/perl/mojo$ morbo ./test.pl
Web application available at http://127.0.0.1:3000
[2020-10-31 13:33:57.42056] [83940] [info] тест лога
[2020-10-31 13:35:16.72465] [83940] [debug] [hga9Tgyy] GET "/"
[2020-10-31 13:35:16.72528] [83940] [debug] [hga9Tgyy] Routing to a callback
[2020-10-31 13:35:16.72574] [83940] [debug] [hga9Tgyy] 200 OK (0.001078s, 927.644/s)
(uiserver):u99257852:~/work/perl/mojo$
Tested locally with nc localhost 3000
(uiserver):u99257852:~$ nc localhost 3000
GET / HTTP/1.1
HTTP/1.1 200 OK
Date: Sat, 31 Oct 2020 17:35:16 GMT
Server: Mojolicious (Perl)
Content-Type: text/html;charset=UTF-8
Content-Length: 12
Hello World!
Output of uname -a
(uiserver):u99257852:~/work/perl/mojo$ uname -a
Linux infongwp-us19 4.4.223-icpu-044 #2 SMP Sun May 10 11:26:44 UTC 2020 x86_64 GNU/Linux
(uiserver):u99257852:~/work/perl/mojo$
Bash is the user's shell, configured in ~/.bashrc with the following settings to support UTF-8:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8:

Getting error in perl Can't find string terminator "EOF" anywhere before EOF at /var/www/sandeep/testeof.cgi line 2

I am using Perl to print some data, but it gives me the error Can't find string terminator "EOF" anywhere before EOF at
The code is:
#!/usr/local/bin/perl -w
print <<EOF;
hello
EOF
ERROR:
[Mon May 23 11:32:12 2016] [error] [client 192.168.10.117] Directory index forbidden by Options directive: /var/www/, referer: http://192.168.10.100/
[Mon May 23 11:32:12 2016] [error] [client 192.168.10.117] malformed header from script. Bad header=hello: testeof.cgi
[Mon May 23 11:32:18 2016] [error] [client 192.168.10.117] Directory index forbidden by Options directive: /var/www/, referer: http://192.168.10.100/
[Mon May 23 11:32:18 2016] [error] [client 192.168.10.117] Can't find string terminator "EOF" anywhere before EOF at /var/www/sandeep/testeof.cgi line 2.
[Mon May 23 11:32:18 2016] [error] [client 192.168.10.117] Premature end of script headers: testeof.cgi
I tried putting EOF in single quotes (print <<'EOF';) as shown in this answer, but the error is the same. Printing by this method works in other files in the same directory.
I also looked at this question (Why am I getting “Can't find string terminator ”'“ anywhere before EOF at -e line 1” when I try to run a Perl one-liner on Windows?), but there the OP uses a different method to print and I am using Linux (Ubuntu).
Please tell me what I am doing wrong.
Your code is correct.
Ensure that your EOF is on a line by itself with no spaces around it.
Use quotes to designate your terminator:
print <<_EOF_;
hi $$
_EOF_
print <<"_EOF_";
hi $$
_EOF_
Both are exactly the same (they print hi 12345, where 12345 is the process ID of the current process), but the second one makes the contrast with single quotes clearer:
print <<'_EOF_';
hi $$
_EOF_
This one will print hi $$, because no variable interpolation is done with single quotes.
Web scripts always require a header (at least under Apache). Print this line as the first line of output to get rid of the malformed header from script error:
print "Content-type: text/html\r\n\r\n";
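The advice above about keeping the terminator on a line by itself can be checked mechanically. The following is a minimal sketch, assuming the script source is available as a string. The helper name suspicious_terminator_lines is hypothetical. It flags lines that look like the terminator but carry extra whitespace, including a stray carriage return from Windows line endings.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 1-based numbers of lines that consist of
# the heredoc terminator $tag plus surrounding whitespace (spaces, tabs, or
# a carriage return). Perl only accepts an exact match, so these lines
# would trigger "Can't find string terminator".
sub suspicious_terminator_lines {
    my ($source, $tag) = @_;
    my @bad;
    my $n = 0;
    for my $line (split /\n/, $source, -1) {
        $n++;
        next if $line eq $tag;                       # exact match is fine
        push @bad, $n if $line =~ /^\s*\Q$tag\E\s*$/;
    }
    return @bad;
}

# The terminator on line 3 has a trailing space.
my $script = "print <<EOF;\nhello\nEOF \n";
print join(',', suspicious_terminator_lines($script, 'EOF')), "\n";   # prints 3
```

To check a real file, one could read it in raw mode and pass its contents as the first argument.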

Page not found... yet! with Mojolicious

I am using Mojolicious::Lite.
#!/usr/bin/perl -T
use strict;
use Mojolicious::Lite;
get '/' => 'index';
# Run the Mojolicious script in CGI mode.
app->start;
#template
__DATA__
@@ index.html.ep
<!DOCTYPE html>
<html>
<head>
<title>My title</title>
</head>
<body>
pass 15
</body>
</html>
Everything works fine, but I have an intermittent issue: sometimes I get an error page which says:
"Page not found... yet!
None of these routes could generate a response for your GET request for /, maybe you need to add a new one?"
This happens very rarely (about 1 out of 20 hits).
Can anyone please tell me what the issue is and how I can overcome it?
Thanks in advance..
My Error Log is:
[Tue May 26 18:12:42 2015] [debug] GET "/".
[Tue May 26 18:12:42 2015] [debug] Routing to a callback.
[Tue May 26 18:12:42 2015] [debug] Template "index.html.ep" not found.
[Tue May 26 18:12:42 2015] [debug] Template "not_found.development.html.ep" not found.
[Tue May 26 18:12:42 2015] [debug] Template "not_found.html.ep" not found.
[Tue May 26 18:12:42 2015] [debug] Rendering inline template "3e3201ab0667c1fc7f39089209f0435c".
[Tue May 26 18:12:42 2015] [debug] Rendering inline template "b2d451b47e2053ce583cbfdf7bcc6006".
Finally I found that this is a bug in Mojolicious::Lite (when rendering inline templates).
bug report 1
bug report 2
I resolved my problem by using external templates
Example:
My application file:
#!/usr/bin/perl -T
use strict;
use Mojolicious::Lite;
get '/' => 'index';
# Run the Mojolicious script in CGI mode.
app->start;
and my template file, named index.html.ep:
<!DOCTYPE html>
<html>
<head>
<title>My title</title>
</head>
<body>
pass 15
</body>
</html>
Note: this file has to be placed in the templates directory.

How do I get a refresh token for command line gsutil to work?

I use gsutil to transfer files from a Windows machine to Google Cloud Storage.
I have not used it for more than 6 months and now when I try it I get:
Failure: invalid_grant
From researching this I suspect the access token is no longer valid as it has not been used for 6 months, and I need a refresh token?
I cannot seem to find how to get and use this.
thanks
Running gsutil -DD config produces the following output:
C:\Python27>python c:/gsutil/gsutil -DD config
DEBUG:boto:path=/pub/gsutil.tar.gz
DEBUG:boto:auth_path=/pub/gsutil.tar.gz
DEBUG:boto:Method: HEAD
DEBUG:boto:Path: /pub/gsutil.tar.gz
DEBUG:boto:Data:
DEBUG:boto:Headers: {}
DEBUG:boto:Host: storage.googleapis.com
DEBUG:boto:Params: {}
DEBUG:boto:establishing HTTPS connection: host=storage.googleapis.com, kwargs={'timeout': 70}
DEBUG:boto:Token: None
DEBUG:oauth2_client:GetAccessToken: checking cache for key *******************************
DEBUG:oauth2_client:FileSystemTokenCache.GetToken: key=******************************* not present (cache_file= c:\users\admini~1\appdata\local\temp\2\oauth2_client-tokencache._.ea******************************)
DEBUG:oauth2_client:GetAccessToken: token from cache: None
DEBUG:oauth2_client:GetAccessToken: fetching fresh access token...
INFO:oauth2client.client:Refreshing access_token
connect: (accounts.google.com, 443)
send: 'POST /o/oauth2/token HTTP/1.1\r\nHost: accounts.google.com\r\nContent-Length: 177\r\ncontent-type: application/x-www-form-urlencoded\r\naccept-encoding: gzip, deflate\r\nuser-agent: Python-httplib2/0.7.7 (gzip)\r\n\r\nclient_secret=******************&grant_type=refresh_token&refresh_token=****************************************&client_id=****************.apps.googleusercontent.com'
reply: 'HTTP/1.1 400 Bad Request\r\n'
header: Content-Type: application/json; charset=utf-8
header: Cache-Control: no-cache, no-store, max-age=0, must-revalidate
header: Pragma: no-cache
header: Expires: Fri, 01 Jan 1990 00:00:00 GMT
header: Date: Thu, 08 May 2014 02:02:21 GMT
header: Content-Disposition: attachment; filename="json.txt"; filename*=UTF-8''json.txt
header: Content-Encoding: gzip
header: X-Content-Type-Options: nosniff
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1; mode=block
header: Server: GSE
header: Alternate-Protocol: 443:quic
header: Transfer-Encoding: chunked
INFO:oauth2client.client:Failed to retrieve access token: { "error" : "invalid_grant" }
Traceback (most recent call last):
  File "c:/gsutil/gsutil", line 83, in <module>
    gslib.__main__.main()
  File "c:\gsutil\gslib\__main__.py", line 151, in main
    command_runner.RunNamedCommand('ver', ['-l'])
  File "c:\gsutil\gslib\command_runner.py", line 95, in RunNamedCommand
    self._MaybeCheckForAndOfferSoftwareUpdate(command_name, debug)):
  File "c:\gsutil\gslib\command_runner.py", line 181, in _MaybeCheckForAndOfferSoftwareUpdate
    cur_ver = LookUpGsutilVersion(suri_builder.StorageUri(GSUTIL_PUB_TARBALL))
  File "c:\gsutil\gslib\util.py", line 299, in LookUpGsutilVersion
    obj = uri.get_key(False)
  File "c:\gsutil\third_party\boto\boto\storage_uri.py", line 342, in get_key
    generation=self.generation)
  File "c:\gsutil\third_party\boto\boto\gs\bucket.py", line 102, in get_key
    query_args_l=query_args_l)
  File "c:\gsutil\third_party\boto\boto\s3\bucket.py", line 176, in _get_key_internal
    query_args=query_args)
  File "c:\gsutil\third_party\boto\boto\s3\connection.py", line 547, in make_request
    retry_handler=retry_handler
  File "c:\gsutil\third_party\boto\boto\connection.py", line 947, in make_request
    retry_handler=retry_handler)
  File "c:\gsutil\third_party\boto\boto\connection.py", line 838, in _mexe
    request.authorize(connection=self)
  File "c:\gsutil\third_party\boto\boto\connection.py", line 377, in authorize
    connection._auth_handler.add_auth(self, *********)
  File "c:\gsutil\gslib\third_party\oauth2_plugin\oauth2_plugin.py", line 22, in add_auth
    self.oauth2_client.GetAuthorizationHeader()
  File "c:\gsutil\gslib\third_party\oauth2_plugin\oauth2_client.py", line 338, in GetAuthorizationHeader
    return 'Bearer %s' % self.GetAccessToken().token
  File "c:\gsutil\gslib\third_party\oauth2_plugin\oauth2_client.py", line 309, in GetAccessToken
    access_token = self.FetchAccessToken()
  File "c:\gsutil\gslib\third_party\oauth2_plugin\oauth2_client.py", line 435, in FetchAccessToken
    credentials.refresh(http)
  File "c:\gsutil\third_party\google-api-python-client\oauth2client\client.py", line 516, in refresh
    self._refresh(http.request)
  File "c:\gsutil\third_party\google-api-python-client\oauth2client\client.py", line 653, in _refresh
    self._do_refresh_request(http_request)
  File "c:\gsutil\third_party\google-api-python-client\oauth2client\client.py", line 710, in _do_refresh_request
    raise AccessTokenRefreshError(error_msg)
oauth2client.client.AccessTokenRefreshError: invalid_grant
You can ask gsutil to configure itself. Go to the directory with gsutil and run this:
c:\gsutil> python gsutil config
Gsutil will lead you through the steps to setting up your credentials.
That said, access tokens normally last only about half an hour, so six months of disuse is not what invalidated one. It's more likely that the previously configured refresh token was revoked for some reason. Alternatively, new tokens can only be requested at a certain rate, so it's possible your account has been requesting a very large number of tokens and has been temporarily rate limited by the access service.
The command to authenticate is now
$ gcloud auth login
That should refresh your grant and get you going again.
You may also want to run
$ gcloud components update
to update your installation.
Brandon Yarbrough gave me suggestions which solved this problem. He suspected that the .boto file was corrupted and suggested I delete it and run gsutil config again. I did this and it solved the problem.

Odd cgi behaviour

I am getting some very annoying behaviour from my perl cgi scripts running under apache.
Referer information gets appended to the output of simple print statements, and it's driving me nuts.
[Sun Feb 20 21:34:47 2011] [error] [client xx] ruid: 48, referer: http://www.x.com/
[Sun Feb 20 21:34:47 2011] [error] [client xx] euid: 48, referer: http://www.x.com/
[Sun Feb 20 21:34:47 2011] [error] [client xx] test, referer: http://www.x.com/
[Sun Feb 20 21:34:47 2011] [error] [client xx] Premature end of script headers: test.cgi, referer: http://www.x.com/
This only seems to happen when the URL is reached by navigating from another page (hence having a referer). The Apache log output above was produced by the incredibly simple demo script below:
#!/usr/bin/perl -w
use strict;
use warnings;
use CGI;
my $q = CGI->new;
print STDERR "ruid: $<\n";
print STDERR "euid: $>\n";
print STDERR "test\n";
Anyone seen this before? It feels like an apache setting i need to turn off.
Thanks
Matt
Take a look at your Apache config files (httpd.conf and friends) and find the directive that controls the format of your error log, then modify that format (or create a new one) so that the Referer field is no longer included in the log messages. Note that CustomLog and LogFormat govern the access log; as far as I know, the error log format is only configurable in Apache 2.4 and later, through the ErrorLogFormat directive. (And don't forget to reload the Apache config after changing it, of course.)