Mojolicious and directory traversal - perl

I am new to Mojolicious and am trying to build a tiny web service with this framework. I wrote the code below, which serves a file remotely:
use Mojolicious::Lite;
use strict;
use warnings;

app->static->paths->[0] = 'C:\results';

get '/result' => sub {
    my $self    = shift;
    my $headers = $self->res->headers;
    $headers->content_type('text/zip;charset=UTF-8');
    $self->render_static('result.zip');
};

app->start;
But it seems that when I try to fetch the file using the following URL:
http://mydomain:3000/result/./../result
I still get the file.
Is there any option in Mojolicious to prevent such directory traversal? I.e. in the above case I want only
http://mydomain:3000/result
to serve the page; if someone enters this URL:
http://mydomain:3000/result/./../result
the page should not be served.
Is it possible to do this?

/$result^/ is a regular expression, and if you have not defined the scalar variable $result (which it does not appear you have), it resolves to /^/, which matches not just
http://mydomain:3000/result/./../result but also
http://mydomain:3000/john/jacob/jingleheimer/schmidt.
use strict and use warnings, even on tiny webservices.
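As a starting point, one defensive option is to check the request path yourself before handing out the file. A minimal sketch, assuming the route should only ever answer for the exact /result path; the 403 response and the ".." check are illustrative choices, not a built-in Mojolicious option:

use Mojolicious::Lite;

app->static->paths->[0] = 'C:\results';

get '/result' => sub {
    my $self = shift;

    # Refuse anything that is not exactly /result or that smuggles in ".."
    my $path = $self->req->url->path->to_string;
    return $self->render( text => 'Forbidden', status => 403 )
        if $path ne '/result' || $path =~ /\.\./;

    $self->res->headers->content_type('application/zip');    # standard MIME type for ZIP archives
    $self->render_static('result.zip');
};

app->start;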

Related

How do I localize an object that is inside a property of a Moo object in Perl?

I've got an object that stores an LWP::UserAgent. I want to use different cookie jars for different calls with that UA, so I decided to make the cookie_jar local when doing a call.
The following code shows what I did without debug stuff (for reading, not running). Below is another version with lots of debugging output.
package Foo;
use strictures;
use Moo;
use LWP::UserAgent;

has ua => (
    is      => 'ro',
    default => sub { my $ua = LWP::UserAgent->new; $ua->cookie_jar( {} ); return $ua; },
);

sub request {
    my ($self, $cookie_jar) = @_;
    local $self->{ua}->{cookie_jar} = $cookie_jar;
    $self->ua->get('http://www.stackoverflow.com');
}

package main;
use HTTP::Cookies;

my $foo = Foo->new;
my $new_jar = HTTP::Cookies->new;
$foo->request( $new_jar );
So basically I decided to locally overwrite the cookie jar. Unfortunately, when we call get it will still use the cookie jar that is originally inside the UA object.
package Foo;
use strictures;
use Moo;
use LWP::UserAgent;
use HTTP::Cookies;
use Data::Printer;
use feature 'say';

has ua => (
    is      => 'ro',
    default => sub { my $ua = LWP::UserAgent->new; $ua->cookie_jar( {} ); return $ua; },
);

sub request {
    my ($self, $cookie_jar) = @_;
    say "before local " . $self->{ua}->{cookie_jar};
    local $self->{ua}->{cookie_jar} = $cookie_jar;
    $self->ua->get('http://www.stackoverflow.com');
    print "local jar " . p $self->{ua}->{cookie_jar};
    say "after local " . $self->{ua}->{cookie_jar};
}

package main;
use Data::Printer;
use HTTP::Cookies;

my $foo = Foo->new;
say "before outside of local " . $foo->{ua}->{cookie_jar};
my $new_jar = HTTP::Cookies->new;
say "before outside of local " . $new_jar;
$foo->request( $new_jar );
say "after outside of local " . $foo->{ua}->{cookie_jar};
print "global jar " . p $foo->ua->cookie_jar;
__END__
before outside of local HTTP::Cookies=HASH(0x30e1848)
before outside of local HTTP::Cookies=HASH(0x30e3b20)
before local HTTP::Cookies=HASH(0x30e1848)
local jar HTTP::Cookies {
    public methods (13) : add_cookie_header, as_string, clear, clear_temporary_cookies, DESTROY, extract_cookies, load, new, revert, save, scan, set_cookie, set_cookie_ok
    private methods (3) : _host, _normalize_path, _url_path
    internals: {
        COOKIES {}
    }
}after local HTTP::Cookies=HASH(0x30e3b20)
after outside of local HTTP::Cookies=HASH(0x30e1848)
global jar HTTP::Cookies {
    public methods (13) : add_cookie_header, as_string, clear, clear_temporary_cookies, DESTROY, extract_cookies, load, new, revert, save, scan, set_cookie, set_cookie_ok
    private methods (3) : _host, _normalize_path, _url_path
    internals: {
        COOKIES {
            stackoverflow.com {
                / {
                    prov [
                        [0] 0,
                        [1] "185e95c6-a7f4-419a-8802-42394776ef63",
                        [2] undef,
                        [3] 1,
                        [4] undef,
                        [5] 2682374400,
                        [6] undef,
                        [7] {
                            HttpOnly undef
                        }
                    ]
                }
            }
        }
    }
}
As you can see, the HTTP::Cookies object gets localized and replaced correctly. The addresses look totally correct.
But the output of p tells a different story. LWP::UA has not used the local cookie jar at all. That remains a fresh, empty one.
How can I make it use the local one instead?
I have tried using Moo, Moose, and classic blessed objects. All show this behaviour.
Edit: Since this came up in the comments, let me give a little more background why I need to do this. This is going to be a bit of a rant.
TL;DR: why I do not want an alternative solution but instead want to understand and fix the problem.
I'm building a Dancer2-based webapp that will run with Plack and multiple workers (Twiggy::Prefork - multiple threads in multiple forks). It will allow users to use a service of a third company. That company offers a SOAP webservice. Think of my application as a custom frontend to this service. There is a call to 'log the user in' on the webservice. It returns a cookie (sessionid) for that specific user and we need to pass that cookie with each consecutive call.
To do the SOAP-stuff I am using XML::Compile::WSDL11. Compiling the thing is pretty costly, so I do not want to do that each time a route is handled. That would be way inefficient. Thus the SOAP client will be compiled from the WSDL file when the application starts. It will then be shared by all workers.
If the client object is shared, the user agent inside is shared as well. And so is the cookie jar. That means that if there are two requests at the same time, the sessionids might get mixed up. The app could end up sending wrong stuff to the users.
That's why I decided to localize the cookie jar. If it's a local unique one for a request, it will never be able to interfere with another worker's request that is happening in parallel. Just making a new cookie jar for each request will not cut it. They would still be shared, and might even get lost because they would overwrite each other in the worst case.
Another approach would be to implement a locking mechanism, but that would totally beat the purpose of having multiple workers.
The only other solution I see is using another SOAP client altogether. There is SOAP::WSDL, which does not run on newer Perls; according to CPAN Testers it breaks on 5.18, and I have verified that. It would be more efficient, as it works like a code generator and pre-creates classes that are cheaper to use than compiling the WSDL file every time. But since it's broken, it is out of the question.
SOAP::Lite will compile the WSDL, and badly. In my opinion it is not something anyone should use in production if it can be avoided. The only alternative left that I see is to implement the calls without using the WSDL file and to parse the results directly with an XML parser, ignoring the schema. But those are BIG results; it would be very inconvenient.
My conclusion to this rant is that I would really like to understand why Perl does not want to localize the cookie jar in this case and fix that.
Perhaps instead of using local, you could use the clone and cookie_jar methods of LWP::UserAgent.
...
sub request {
    my ($self, $new_cookie_jar) = @_;
    my $ua = $self->ua;    # cache user agent
    if ( defined $new_cookie_jar ) {
        # create a new user agent with the new cookie jar
        $ua = $ua->clone;
        $ua->cookie_jar( $new_cookie_jar );
    }
    my $result = $ua->get('http://www.stackoverflow.com');
    # allow returning the newly cloned user agent
    return ( $result, $ua ) if wantarray;
    return $result;
}
If you don't want to do that, you should at least use the methods instead of manipulating the internals of the objects.
...
sub request {
    my ($self, $new_cookie_jar) = @_;
    my $ua = $self->ua;    # cache user agent
    my $old_cookie_jar = $ua->cookie_jar( $new_cookie_jar );
    my $result = $ua->get('http://www.stackoverflow.com');
    # put the old cookie jar back in place
    $ua->cookie_jar( $old_cookie_jar );
    return $result;
}
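For what it's worth, here is a hypothetical caller for the clone-based request() shown first (continuing package main from the question); calling it in list context hands back the cloned user agent, so you can confirm the cookies landed in the jar you passed in:

use HTTP::Cookies;

my $foo = Foo->new;
my $jar = HTTP::Cookies->new;

# List context also returns the cloned user agent
my ( $result, $cloned_ua ) = $foo->request($jar);
print $result->status_line, "\n";
print $cloned_ua->cookie_jar->as_string;    # cookies ended up in $jar, not in $foo->ua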

Mojo::DOM shortcut to get an absolute URL for a resource?

When parsing a webpage with Mojo::DOM (or any other framework), it's fairly common to be pulling a resource address that could be either relative or absolute. Is there a shortcut method to translate such a resource address to an absolute URL?
The following mojo command pulls all the stylesheets on mojolicio.us:
$ mojo get http://mojolicio.us "link[rel=stylesheet]" attr href
/mojo/prettify/prettify-mojo-light.css
/css/index.css
And the following script does the same, but also uses URI to translate the resource into an absolute URL.
use strict;
use warnings;
use Mojo::UserAgent;
use URI;
my $url = 'http://mojolicio.us';
my $ua = Mojo::UserAgent->new;
my $dom = $ua->get($url)->res->dom;
for my $csshref ($dom->find('link[rel=stylesheet]')->attr('href')->each) {
    my $cssurl = URI->new($csshref)->abs($url);
    print "$cssurl\n";
}
Outputs:
http://mojolicio.us/mojo/prettify/prettify-mojo-light.css
http://mojolicio.us/css/index.css
Obviously, a relative URL in this context should be made absolute using the URL that loaded the DOM. However, I don't know of a way to get a resource's absolute URL except by coding it myself.
There is Mojo::URL's to_abs in Mojolicious. However, I don't know if that would integrate in some way with Mojo::DOM, and by itself it would take more code than URI.
My ideal solution would be if something like the following were possible from both a script and the command line, but I'm looking for any related insights into using Mojo for parsing:
mojo get http://mojolicio.us "link[rel=stylesheet]" attr href to_abs
I'm not sure why you think it would take more code to use Mojo::URL. In the following example I get the actual request URL from the transaction (there might have been redirects, which I've allowed), which I have called $base.
Then, since $base is an instance of Mojo::URL, I can create a new instance with $base->new. Of course, if that seems too magical, you can replace it with Mojo::URL->new.
use Mojo::Base -strict;
use Mojo::UserAgent;
my $url = 'http://mojolicio.us';
my $ua = Mojo::UserAgent->new->max_redirects(10);
my $tx = $ua->get($url);
my $base = $tx->req->url;
$tx->res
   ->dom
   ->find('link[rel=stylesheet]')
   ->map( sub { $base->new( $_->{href} )->to_abs($base) } )
   ->each( sub { say } );

Processing an external page with Perl CGI, or acting as a reverse proxy

There is a page residing on a local server running Apache. I would like to submit the form via a GET request with a single name/value pair, like:
id=item1234
This GET request has to be processed by another server, which I don't have control over; it subsequently returns a page which I would like to transform with the CGI script. In other words:
User submits form
MY apache proxies to external resource
EXTERNAL resource throws back a page
MY apache transforms it with a CGI (maybe another way?)
User gets a modified page
Again, this is more of an architectural question, so I'd be grateful for any hints; even a pointer to some guides would help, as I wasn't able to structure my Google query well enough to find anything related.
Thanks.
Pass the id "17929632" to this CGI code ("proxy.pl?id=17929632"), and you should see this exact page in your browser.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use CGI::Pretty qw(:standard -any -no_xhtml -oldstyle_urls);
print header;
print "<html>\n";
print " <head><title>Proxy Demo</title></head>\n";
print " <body bgcolor=\"white\">\n";
my $id = param('id') || die "No CGI param 'id'\n";
my $ua = LWP::UserAgent->new;
$ua->agent("MyApp/0.1 ");
# Create a request
my $req = HTTP::Request->new(GET => "http://stackoverflow.com/questions/$id");
# Pass request to the user agent and get a response back
my $response = $ua->request($req);
# Check the outcome of the response
if ($response->is_success) {
    my $content = $response->content;
    # Modify the original content here!
    print $content;
}
else {
    print $response->status_line;
}
print "</body></html>\n";
Vague question, vague answer: write your CGI program to include an HTTP user agent, e.g. LWP.

Get files from a given URL based on a pattern, using Perl on Unix

I have been told that a given URL contains several XML and text files, and I need to download all the XML files starting with AAA (that is, AAA*.xml) in a given directory.
Credentials to access that URL have been provided to me.
Please note that the XML files could be gigabytes in size.
I have used the code below to achieve this:
use strict;
use warnings;
use LWP;
my $browser = LWP::UserAgent->new;
my $username ='scott';
my $password='tiger';
# Create HTTP request object
my $req = HTTP::Request->new( GET => "https://url.com/");
# Authenticate the user
$req->authorization_basic( $username , $password);
my $res = $browser->request( $req , ':content_file' => '/fold/AAA1.xml');
print $res->status_line, "\n";
It prints 200 OK status but I am not able to get the file. Any suggestions?
If the server doesn't allow you to retrieve a directory listing (e.g. Apache without "Options +Indexes"), you will not be able to GET the collection of files.
But given the listing, you can filter it with a regex like /AAA.*/, and with the LWP::Simple module it's easy to fetch each file; a sketch of that approach follows.
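A minimal sketch of that approach; it assumes the URL returns an HTML directory index whose links can be matched with a simple regex, and, because the question involves credentials, it uses LWP::UserAgent with basic auth rather than LWP::Simple. The host, path, and target directory are placeholders taken from the question:

use strict;
use warnings;
use LWP::UserAgent;

my $base = 'https://url.com/';                       # hypothetical directory URL
my ( $username, $password ) = ( 'scott', 'tiger' );
my $ua = LWP::UserAgent->new;

# Fetch the directory index (requires the server to send a file listing)
my $req = HTTP::Request->new( GET => $base );
$req->authorization_basic( $username, $password );
my $index = $ua->request($req);
die $index->status_line unless $index->is_success;

# Pull out links that look like AAA*.xml
my @files = $index->decoded_content =~ /href="(AAA[^"]*\.xml)"/gi;

# Download each one, streaming straight to disk so multi-GB files
# never have to fit in memory
for my $name (@files) {
    my $get = HTTP::Request->new( GET => $base . $name );
    $get->authorization_basic( $username, $password );
    my $res = $ua->request( $get, "/fold/$name" );
    print "$name: ", $res->status_line, "\n";
}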

WWW::Mechanize Form Select

I am attempting to log in to YouTube with WWW::Mechanize and use forms() to print out all the forms on the page after logging in. My script is logging in successfully, and also successfully navigating to youtube.com/inbox; however, for some reason Mechanize cannot see any forms at youtube.com/inbox. It just returns blank. Here is my code:
#!"C:\Perl64\bin\perl.exe" -T
use strict;
use warnings;
use CGI;
use CGI::Carp qw/fatalsToBrowser/;
use WWW::Mechanize;
use Data::Dumper;
my $q = CGI->new;
$q->header();
my $url = 'https://www.google.com/accounts/ServiceLogin?uilel=3&service=youtube&passive=true&continue=http://www.youtube.com/signin%3Faction_handle_signin%3Dtrue%26nomobiletemp%3D1%26hl%3Den_US%26next%3D%252Findex&hl=en_US&ltmpl=sso';
my $mechanize = WWW::Mechanize->new(autocheck => 1);
$mechanize->agent_alias( 'Windows Mozilla' );
$mechanize->get($url);
$mechanize->submit_form(
    form_id => 'gaia_loginform',
    fields  => { Email => 'myemail', Passwd => 'mypassword' },
);
die unless ($mechanize->success);
$url = 'http://www.youtube.com/inbox';
$mechanize->get($url);
$mechanize->form_id('comeposeform');
my $page = $mechanize->content();
print Dumper($mechanize->forms());
Mechanize is unable to see any forms at youtube.com/inbox, however, like I said, I can print all of the forms from the initial link, no matter what I change it to...
Thanks in advance.
As always, one of the best debugging approaches is to print what you get and check if it is what you were expecting. This applies to your problem too.
In your case, if you print $mechanize->content() you'll see that you didn't get the page you're expecting. YouTube wants you to follow a JavaScript redirect in order to complete your cross-domain login action. You have multiple options here:
parse the returned content manually, e.g. with /location\.replace\("(.+?)"/ (see the sketch after this list)
try to have your code parse JavaScript (have a look at WWW::Scripter)
[recommended] use YouTube API for managing your inbox
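A minimal sketch of the first option, continuing the poster's script after submit_form(); the regex and the \xNN unescaping are assumptions about what the Google login response contains, not a documented interface:

# Follow the JavaScript redirect by hand before requesting the inbox
my $content = $mechanize->content();
if ( $content =~ /location\.replace\("(.+?)"\)/ ) {
    my $next = $1;

    # The URL sits inside a JS string literal, so \xNN escapes may need decoding
    $next =~ s/\\x([0-9a-fA-F]{2})/chr hex $1/ge;
    $mechanize->get($next);
}

# With the redirect followed, the inbox should contain real forms
$mechanize->get('http://www.youtube.com/inbox');
print Dumper( $mechanize->forms() );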