Perl CGI script print (utf-8 encoded - japanese html) over http (apache httpd) getting truncated - perl

Environment settings
OS : RHEL 6.6 (kernel 2.6.32) - x86_64
httpd : httpd-2.2.15-39
perl : 5.10.1-136
CGI API :
perl-CGI-3.15-136
perl-CGI-Session-4.35-6
I am using a static html page with Perl-CGI defined variables in the static html. This html is read in through perl, and then passed to a perl CGI script for eval.
Note:
While reading the static html, I am using UTF-8 encoding like
open( IN ,"<:encoding(UTF-8)", $file_path )
After reading the status HTML page, the output is passed back to the CGI script through a variable and then pressed in to eval to evaluate the variables.
Finally, the eval(uated) output from CGI is print which can be read through http daemon.
In the CGI script I am using
binmode(STDIN, ':encoding(UTF-8)');
binmode(STDOUT, ':encoding(UTF-8)');
The static HTML looks something like this
When I check the output of print in the CGI script, I see the complete output as desired, like this
However on the Browser, the hidden input fields are getting truncated in an unwanted manner. Like this
When I checked the wireshark output for the text/html, which is being printed back from the server to the browser, this is also getting truncated.
Like this
The HTML header has proper Content-Type and charset declaration.
The same code is working fine with EN language
The same code is working fine with zh(chinese) language as well.
When the language is set to japanese in the browser, and we read from HTTP_ACCEPT_LANGUAGE that it is 'ja', than we print the japanese specific data.
We are not using cgid module of apache.
Are we supposed to use some special encoding for japanese language??
Or it is a double encoding issue. I have tried removing the encoding when I am reading the static html file, however, that also did not help.
The same code is working fine with RHEL 5.x (2.6.18-308), and perl-CGI-Session (4.42-2), perl (5.8.8-38) httpd (2.2.3-63), there was no perl-CGI in RHEL-5.x.

I kind of found the solution to the problem. In our code we used to add a data in the CGI::Session object, which had the language. In the following form
$session->param( KEY_SESSION_LANG, $code ).
Where $session is a CGI::Session object and KEY_SESSION_LANG is 'language' and $code is something that we get from HTTP_ACCEPT_LANGUAGE. When we see 'en' we used to set it as perl constant 'en', when we got 'ja' we used to set it as perl constant 'ja'.
When we used to form the session object we used to get a session file as (perl version RHEL 5.x (2.6.18-308) & perl-CGI-Session-4.42-2 ) for JA language we used to get the following in the cgi file
cat cgisess_e31c8d21af82b59fd064babc7ca25c01
$D = {'_SESSION_ID' => 'e31c8d21af82b59fd064babc7ca25c01','_SESSION_ETIME' => 6000,'language' => 'ja','permit' => 'yes','_SESSION_REMOTE_ADDR' => '192.168.101.1','_SESSION_CTIME' => 1441090386,'execute' => 'yes','_SESSION_ATIME' => 1441090387,'_SESSION_EXPIRE_LIST' => {}};*a = \undef;;$D
For perl CGI Session in RHEL 6.6 this is coming out to be
cat cgisess_e31c8d21af82b59fd064babc7ca25c01
$D = {'_SESSION_ID' => 'e31c8d21af82b59fd064babc7ca25c01','_SESSION_ETIME' => 6000,'language' => *a,'permit' => 'yes','_SESSION_REMOTE_ADDR' => '192.168.101.1','_SESSION_CTIME' => 1441090386,'execute' => 'yes','_SESSION_ATIME' => 1441090387,'_SESSION_EXPIRE_LIST' => {}};*a = \undef;;$D
language data for ja is becoming *a. The same is also reflected when we use perl dumper for getting in memory data.
I checked the /usr/share/perl5/vendor_perl/CGI/Session.pm and it had following information in it
=head1 A Warning about UTF8
Trying to use UTF8 in a program which uses CGI::Session has lead to problems. See RT#21981 and RT#28516.
In the first case the user tried "use encoding 'utf8';" in the program, and in the second case the user tried "$dbh->do(qq|set names 'utf8'|);".
Until this problem is understood and corrected, users are advised to avoid UTF8 in conjunction with CGI::Session.
For details, see: http://rt.cpan.org/Public/Bug/Display.html?id=28516 (and ...id=21981).
=head1 TRANSLATIONS
This document is also available in Japanese.
Now when I used the perl dumper following things happened. I quote below from my offical analysis presented on our local development portal
I think the problem is because of perl-CGI-Session OSS package, please see the analysis below.
Some inputs from the CGI session source code.
## Inputs ##
From the file /usr/share/perl5/vendor_perl/CGI/Session.pm
## Following are the status of CGI session, set internally after modification to any of the parameters ##
sub STATUS_NEW () { 1 } # denotes session that's just created
sub STATUS_MODIFIED () { 2 } # denotes session that needs synchronization
sub STATUS_DELETED () { 4 } # denotes session that needs deletion
sub STATUS_EXPIRED () { 8 } # denotes session that was expired.
--snip --
I::Session - persistent session data in CGI applications
=head1 SYNOPSIS
# Object initialization:
use CGI::Session;
$session = new CGI::Session();
$CGISESSID = $session->id();
We are setting the "language" parameter in the session object. (We create a CGI object, set cookie to it, to get sid, and through sid get the session object). For setting up the language parameter we do
$session->param( 'language', ); ---> language_value = en(english) or ja(japanese). When we have completed the printing of the HTML page in /opt/packageManager/pm_gui/cgi/status.cgi file, I checked the cgi session object it is as follows
For EN Language
Before executing session flush
[09/10/2015 16:13:41] [23722] <ERROR> status.cgi : 267 :
$VAR1 = bless( {
'_STATUS' => 2,
'_DATA' => {
'_SESSION_ETIME' => 6000,
'_SESSION_ID' => '995d11334f2c39b95b3fdb86cecd9655',
'permit' => 'yes',
'language' => 'en',
Then after this when I flush the session as $session->flush() and check the session object it is
[09/10/2015 16:13:41] [23722] <ERROR> status.cgi : 270 :
$VAR1 = bless( {
'_STATUS' => 0,
'_DATA' => {
'_SESSION_ETIME' => 6000,
'_SESSION_ID' => '995d11334f2c39b95b3fdb86cecd9655',
'permit' => 'yes',
'language' => 'en',
Inference 1: session status changed after doing flush. This is good, and should be done.
For JP Language
Before executing session flush
[09/10/2015 16:14:54] [31910] status.cgi : 267 :
$VAR1 = bless( {
'_STATUS' => 2,
'_DATA' => {
'_SESSION_ID' => '1cd1b7860af4c71264f3969fe74e7a44',
'_SESSION_ETIME' => 6000,
'language' => *a
Then after this, when I flush the session as $session->flush() and check the session object it is NOT THERE. SCRIPT CRASHES HERE IT SELF
Inference 2 : Doing flush with language JP, is terminating the session, and that is why the session gets destroyed. And that is why, ending data in response is truncated
Due to the wrong value being set in memory, in session object, and then the implicit flush by CGI session is failing on the disk. Which results in termination of the session object, and in between termination of session, and and data loss of HTML.
I checked the actual code in sessions.pm file and it seems to be coming in from here
sub param {
my ($self, #args) = #_;
--snip--
# USAGE: $s->param($name, $value);
# USAGE: $s->param($name1 => $value1, $name2 => $value2 [,...]);
# DESC: updates one or more **public** records using simple syntax
if ((#args % 2) == 0) {
my $modified_cnt = 0;
ARG_PAIR:
while (my ($name, $val) = each %args) {
if ( $name =~ m/^_SESSION_/) {
carp "param(): attempt to write to private parameter";
next ARG_PAIR;
}
$self->{_DATA}->{ $name } = $val; ----> HERE
++$modified_cnt;
}
$self->_set_status(STATUS_MODIFIED);
return $modified_cnt;
}
As a solution, we stopped putting 'ja' value as a perl constant, but now are putting it as a string "ja" and it seems to be working fine now.

Related

Attempt to access upserted_id property in perl MongoDB Driver returns useless HASH(0x3572074)

I have a Perl script that pulls a table from a SQL database ($row variable) and attempts to do a MongoDB update like so:
my $res = $users->update({"meeting_id" => $row[0]},
{'$set' => {
"meeting_id" => $row[0],
"case_id" => $row[1],
"case_desc" => $row[2],
"date" => $row[3],
"start_time" => $row[4],
"end_time" => $row[5],
#"mediator_LawyerID" => $row[6],
"mediator_LawyerIDs" => \#medLawIds,
"case_number" => $row[6],
"case_name" => $row[7],
"location" => $row[8],
"number_of_parties" => $row[9],
"case_manager" => $row[10],
"last_updated" => $row[11],
"meeting_result" => $row[12],
"parties" => \#partyList
}},
{'upsert' => 1}) or die "I ain't update!!!";
My client now wants ICS style calendar invites sent to their mediators. Thus, I need to know whether an update or insert happened. The documentation for MongoDB::UpdateResult implies that this is how you access such a property:
my $id = $res->upserted_id;
So I tried:
bless ($res,"MongoDB::UpdateResult");
my $id = $res->upserted_id;
After this code $id is like:
HASH(0x356f8fc)
Are these the actual IDs? If so, how do I convert to a hexadecimal string that can be cast to Mongo's ObjectId type? It should be noted I know absolutely nothing about perl; if more of the code is relevant, at request I will post any section ASAP. Its 300 lines so I didn't want to include the whole file off the bat.
EDIT: I should mention before anyone suggests this that using update_one instead of update returns the exact same result.
HASH(0x356f8fc) is a Perl Hash reference. It's basically some kind of (internal) memory address of some data.
The easiest way to get the contents is Data::Dumper:
use Data::Dumper
[...]
my $result = $res->upserted_id;
print Dumper($result);
HASH(0x356f8fc) is just the human readable representation of the real pointer. You must dump it in the same process and can't pass it from one to another.
You'll probably end up with something like
`my $id = $result->{_id};`
See the PerlRef manpage for details.
See also the MongoDB documentation about write concern.
PS: Also remember that you could use your own IDs for MongoDB. You don't need to work with the generated ones.

First 8 bytes are always wrong when downloading a file from my script

I have a Mojolicious Lite script that "gives out" an executable file (user can download the file from the script's URL). I keep encoded data in an inline template in DATA section, then encode it and render_data.
get '/download' => sub {
my $self = shift;
my $hex_data = $self->render_partial( 'TestEXE' );
my $bin_data;
while( $hex_data =~ /([^\n]+)\n?/g ) {
$bin_data .= pack "H".(length $1), $1;
}
my $headers = Mojo::Headers->new;
$headers->add( 'Content-Type', 'application/x-download;name=Test.exe' );
$headers->add( 'Content-Disposition', 'attachment;filename=Test.exe' );
$headers->add( 'Content-Description', 'File Transfer');
$self->res->content->headers($headers);
$self->render_data( $bin_data );
};
__DATA__
## TestEXE.html.ep
4d5a90000300000004000000ffff0000b8000000000000004000000000000000
00000000000000000000000000000000000000000000000000000000b0000000
0e1fba0e00b409cd21b8014ccd21546836362070726f6772616d2063616e6e6f
....
When I run this locally (via built in webserver on http://127.0.0.1:3000/, Win7) I get the correct file (size and contents). But when I run it in CGI mode on shared hosting (Linux), it comes back with correct size, but first 8 bytes of the file are always incorrect (and always different). The rest of the file is correct.
If in my sub i specify $hex_data instead of $bin_data I get what suppose to be there.
I'm at lost.
render_partial isn't what you want.
First, re-encode the executable in base64 format, and specify that the template is base64 encoded (This is assuming hex is not a requirement for your app):
## template-name (base64)
Also, you don't actually need a controller method at all. Mojolicious will handle the process for you - all you have to do is appropriately name the template.
use Mojolicious::Lite;
app->start;
__DATA__
## Test.exe (base64)
...
http://127.0.0.1:3000/Test.exe will then download the file.
-
If you still want to use a controller method for app-specific concerns, get the data template specifically:
use Mojolicious::Lite;
get '/download' => sub {
my $self = shift;
# http://mojolicio.us/perldoc/Mojolicious/Renderer.pm#get_data_template
my $data = $self->app->renderer->get_data_template({}, 'Test.exe');
# Replace content-disposition instead of adding it,
# to prevent duplication from elsewhere in the app
$self->res->headers->header(
'Content-Disposition', 'attachment;filename=name.exe');
$self->render_data($data);
};
app->start;
__DATA__
## Test.exe (base64)
...
http://127.0.0.1:3000/download will get the template, set the header, and then download it as name.exe.

Perl utf8 not quite working with my script

I'm having a problem with utf-8 support on a perl script I'm writing. The script is meant to send html email messages. The html messages are saved in UTF-8 format inside a PostgreSQL database. Everything seems to be working but still I get corruption sometimes when I receive an email from the script - "�".
In the beginning of the script I have:
#!/usr/bin/perl -w
use utf8;
use Encode;
use MIME::Base64;
use MIME::Lite;
my $connection = DBI->connect('dbi:Pg:dbname='.$db_name.';host='.$db_host.'', $db_user,$db_pass, { AutoCommit=>1, PrintError => 1, pg_enable_utf8 => 1 });
my $fetchHtml = $connection->prepare('SELECT * FROM emails ORDER BY n_id DESC LIMIT 1');
$fetchHtml->execute();
my $message = $fetchHtml->fetchrow_hashref();
my $sendMsg = MIME::Lite->build(
Encoding => 'quoted-printable',
Type => 'multipart',
To => '<atesting#address.com>',
From => '<destination#address.com>',
Subject => encode("MIME-B", $message->{'title'}),
Data => decode_entities($message->{'html'})
);
$sendMsg->attr("Content-Type" => "text/html; charset=utf-8;");
$sendMsg->send_by_smtp('127.0.0.1', Timeout =>30, Debug => 0, SkipBad => 1);
I'm wondering what I'm doing wrong and why do I keep on getting the cool "�" sign ? :)
Another thing is that I get this exception when I execute the script:
Uncaught exception from user code:
Wide character in subroutine entry at /usr/lib/perl5/site_perl/5.10.0/MIME/Lite.pm line 2259.
at /usr/lib/perl5/site_perl/5.10.0/MIME/Lite.pm line 2259
MIME::Lite::print_simple_body('MIME::Lite=HASH(0xa51b9c8)', 'MIME::Lite::SMTP=GLOB(0xa5b8888)', 1) called at /usr/lib/perl5/site_perl/5.10.0/MIME/Lite.pm line 2191
MIME::Lite::print_body('MIME::Lite=HASH(0xa51b9c8)', 'MIME::Lite::SMTP=GLOB(0xa5b8888)', 1) called at /usr/lib/perl5/site_perl/5.10.0/MIME/Lite.pm line 2126
MIME::Lite::print_for_smtp('MIME::Lite=HASH(0xa51b9c8)', 'MIME::Lite::SMTP=GLOB(0xa5b8888)') called at /usr/lib/perl5/site_perl/5.10.0/MIME/Lite.pm line 2897
MIME::Lite::send_by_smtp('MIME::Lite=HASH(0xa51b9c8)', 'bla.example.com', 'Timeout', 30, 'Debug', 0, 'SkipBad', 1) called at ./advanced-daemon.pl line 354
main::send_mail('Subject Title' <webmaster#testing>', 'spam#spam.com', 'HASH(0xa518630)') called at ./advanced-daemon.pl line 225
main::sendEmailsToSubscribers('DBI::db=HASH(0xa517f40)', 24, 'HASH(0xa518630)') called at ./advanced-daemon.pl line 136
I can't understand what exactly is the problem but I think it's related to the utf8..
Any help would be pretty much appreciated.. :)
First you need use utf8 only if you have unicode character in the quellcode. Then decode_entities($message->{'html'}) ist also wrong. Use only $message->{'html'}.
The database must be utf8 by default. Then add Encoding => '8bit'. That works nice for me.
Your MIME::Lite is false: see on http://www.perlmonks.org/?node_id=105262 for a nice example
In my code I use the multipart type and here is the version which seems to work correctly.
my $mail_htm = 'Text of the email with utf8 characters like these: ľščťžýáíúäô';
my $msg = MIME::Lite->new(
From =>'me#me.com',
'Reply-To'=>'me#me.com',
To =>'you#you.com',
Subject => 'Simple ascii non-utf8 subject',
Type =>'multipart/mixed'
);
my $msg_body = MIME::Lite->new(
Type =>'multipart/alternative'
);
$msg_body->attach(
Type =>'text/html; charset=UTF-8',
Encoding=>'quoted-printable',
Data =>$mail_htm
);
$msg->attach($msg_body);
I didn't play much with utf8 encoding for the subject as I didn't need it that much and several solutions I found around didn't work.

"Can't call method "dir_path" on an undefined value" when running Mason component on the command line

Greetings,
I'm trying to develop some tests for Mason components which requires running them on the command line instead of the web server. When I try this, I get an error:
perl -MHTML::Mason::Request -MHTML::Mason::Interp -I./lib \
-e '$int = HTML::Mason::Interp->new( data_dir => "/home/friedo/cache", comp_root => "/home/friedo/comps" ); $m = HTML::Mason::Request->new( comp => "/dummy", interp => $int ); $m->comp("/dummy")'
Results in:
Can't call method "dir_path" on an undefined value at lib/HTML/Mason/Request.pm line 1123.
The error is thrown when the call to ->comp is attempted. I can't figure out what's wrong with the configuration. The component is there and appears to be compiled just fine, and it works via Apache.
This is using HTML::Mason 1.35.
Edit: Let's try a bounty for this one. The alternative is me having to dive deep into Mason's guts! :)
Edit again: Thanks very much to David for pointing out the crucial detail that I missed for getting this to work.
This was actually for a test framework that needed to exercise a module that calls some Mason comps -- under normal operation the module is provided with a Mason request object to use for that purpose, but I couldn't get that to work offline. The key was using an Interpreter object instead, so I ended up doing the following, which is a little silly but makes the tests work:
sub _mason_out {
...
my $buf;
if ( $ENV{MASON_TEST} ) {
my $int = HTML::Mason::Interp->new( comp_root => $self->{env}->comp_dir,
out_method => \$buf );
$int->exec( $comp, %args );
} else {
my $m = $self->{mason_object};
$m->comp( { store => \$buf }, $comp, %args );
}
return $buf;
}
I think this fails because your Request object hasn't built a component stack at the point that it is called. Use the Interp->exec() method instead as described in Using Mason from a Standalone Script
perl -MHTML::Mason::Interp -I./lib \
-e 'HTML::Mason::Interp->new( data_dir => "/home/friedo/cache", comp_root => "/home/friedo/comps" )->exec("/dummy")'

How can I log in to YouTube using Perl?

I am trying to write a Perl script to connect to me YouTube account but it doesnt seem to work. Basically I just want to connect to my account but apparently it is not working. I don't even have an idea on how I could debug this! Maybe it is something related to https protocol?
Please enlighten me! Thanks in advance.
use HTTP::Request::Common;
use LWP::UserAgent;
use strict;
my $login="test";
my $pass = "test";
my $res = "";
my $ua = "";
# Create user agent, make it look like FireFox and store cookies
$ua = LWP::UserAgent->new;
$ua->agent("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20051213 Firefox/1.0.7");
$ua->cookie_jar ( {} );
# Request login page
$res = $ua->request(GET "https://www.google.com/accounts/ServiceLogin?service=youtube&hl=en_US&passive=true&ltmpl=sso&uilel=3&continue=http%3A//www.youtube.com/signup%3Fhl%3Den_US%26warned%3D%26nomobiletemp%3D1%26next%3D/index");
die("ERROR1: GET http://www.youtube.com/login\n") unless ($res->is_success);
# Now we login with our user/pass
$res = $ua->request(
POST "https://www.google.com/accounts/ServiceLoginAuth?service=youtube",
Referer => "http://www.youtube.com/login",
Content_Type => "application/x-www-form-urlencoded",
Content => [
currentform => "login",
next => "/index",
username => $login,
password => $pass,
action_login => "Log+In"
]
);
# YouTube redirects (302) to a new page when login is success
# and returns OK (200) if the login failed.
#die("ERROR: Login Failed\n") unless ($res->is_redirect());
print $res->content;
what i am doing is learning the web features of perl, so i dont want to use any library except wwwlib or mechanize to get the job done.
how can i just connect to my account using a perl script? this is my objective for now
hope someone can post a script or correct mine.
thank you guys for you help.
i am testing Webscarab now..
What data are you trying to grab? Why not just using an existing implementation like WebService::YouTube
Some comments on your code: I always avoided the shortcut $ua->request(GET/POST) method since I always ended up needing more flexibility that only the use of HTTP::Request and HTTP::Response allowed. I always felt the code was cleaner that way too.
Why is your code not working? Who knows.
Make sure your cookiejar is adding your cookies to the outgoing HTTP::Request. I'd suggest dumping all your headers when you do it in a browser and compare with the headers and data that libwww is sending. There may be some additional fields that they are checking for that vary for every hit. They may be checking for your UserAgent string. If you are just looking to learn libwww I'd suggest using a different site as a target as I'm sure YouTube has all sort of anti-scripting hardening.
Are you using YouTube's stable documented API?
Use an HTTP proxy such as WebScarab to watch the data flow.
Trey's suggestion to use somebody else's CPAN package for the mechanics is a good idea too.
Right right by and large, what you want to do is define a cookiejar for most of these websites that have a redirection login. This is what the package has done. Also the package tunes a lot of the lookups and scrapes based on the youtube spec.
Ajax content for example will be rough since its not there when your scraping
You just picked a somewhat rough page to start out with.
Enjoy
I'm actually working on this issue myself. Before, I would suggest read over this the API guide from Google as a good starting reference. If I'm reading it correctly, one begins with passing user credentials through a REST interface to get a Authentication Token. To handle that, I'm using the following:
sub getToken {
my %parms = #_;
my $response = LWP::UserAgent->new->post(
'https://www.google.com/youtube/accounts/ClientLogin',
[
Email => $parms{'username'},
Passwd => $parms{'password'},
service => "youtube",
source => "<<Your Value Here>>",
]
);
my $content = $response->content;
my ($auth) = $content =~ /^Auth=(.*)YouTubeUser(.*)$/msg
or die "Unable to authenticate?\n";
my ($user) = $content =~ /YouTubeUser=(.*)$/msg
or die "Could not extract user name from response string. ";
return ($auth, $user);
}
And I call that from the main part of my program as such:
## Get $AuthToken
my ($AuthToken, $GoogleUserName) = getToken((
username => $email, password => $password
));
Once I have these two things -- $AuthToken and $GoogleUserName, I'm still testing the LWP Post. I'm still writing this unit:
sub test {
my %parms = #_;
## Copy file contents. Use, foy's three param open method.
my $fileSize = -s $parms{'File'};
open(VideoFile, '<', "$parms{'File'}") or die "Can't open $parms{'File'}.";
binmode VideoFile;
read(VideoFile, my $fileContents, $fileSize) or die "Can't read $parms{'File'}";
close VideoFile;
my $r = LWP::UserAgent->new->post(
"http://uploads.gdata.youtube.com/feeds/api/users/$parms{'user'}/uploads",
[
Host => "uploads.gdata.youtube.com",
'Authorization' => "AuthSub token=\"$parms{'auth'}\"",
'GData-Version' => "2",
'X-GData-Key' => "key=$YouTubeDeveloperKey",
'Slug' => "$parms{'File'}",
'Content-Type' => "multipart/related; boundary=\"<boundary_string>\"",
'Content-Length' => "<content_length>",
'video_content_type'=> "video/wmv",
'Connection' => "close",
'Content' => $fileContents
]
);
print Dumper(\$r->content)
}
And that is called as
&test((auth=>$Auth, user=>$user, File=>'test.wmv'));