Perl: Passing a URL in a query string - perl

I'm trying to pass a URL as a query string so it can be read by another website and then used:
www.example.com/domain?return_url=/another/domain
Gets returned as:
www.example.com/domain?return_url=%2Fanother%2Fdomain
Is there a way this URL can be read and parsed by the other application with the escaped characters?
The only way I can think of is to encode it somehow so it comes out like this:
www.example.com/domain?return_url=L2Fub3RoZXIvZG9tYWlu
which the other application can then decode and use?
https://www.base64encode.org/

www.example.com/domain?return_url=%2Fanother%2Fdomain
This is called URL encoding. Not because you put a URL in it, but because it encodes characters that have a special meaning in a URL.
The %2F corresponds to a slash /. You've probably also seen the %20 before, which is a space.
Putting a complete URI into a URL parameter of another URI is totally fine.
http://example.org?url=http%3A%2F%2Fexample.org%2Ffoo%3Fbar%3Dbaz
The application behind the URL you are calling needs to be able to understand URL encoding, but that is a trivial thing. Typical web frameworks and interfaces to the web (like CGI.pm or Plack in Perl) will do that. You should not have to care about it at all.
To URL-encode something in Perl, you have several options.
You could use the URI module to create the whole URI including the URL encoded query.
use URI;
my $u = URI->new("http://example.org");
$u->query_form( return_url => "http://example.org/foo/bar?baz=qrr");
print $u;
__END__
http://example.org?return_url=http%3A%2F%2Fexample.org%2Ffoo%2Fbar%3Fbaz%3Dqrr
This seems like the natural thing to do.
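You could also use the URI::Encode module, which gives you a uri_encode function. That's useful if you want to encode strings without building a URI object. Note that by default uri_encode leaves reserved characters such as / and ? alone, so pass the encode_reserved option when the string is a value that goes inside a query string.
use URI::Encode qw(uri_encode uri_decode);
my $encoded = uri_encode($data, { encode_reserved => 1 });
my $decoded = uri_decode($encoded);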
All of this is a normal part of how the web works. There is no need to do Base 64 encoding.
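To make the round trip concrete, here is a minimal sketch of percent-encoding and decoding in core Perl. The helper names percent_encode and percent_decode are made up for this illustration; in real code you would normally let URI or URI::Escape do this for you.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; prefer URI or URI::Escape in real code.
sub percent_encode {
    my ($text) = @_;
    utf8::encode($text);    # operate on UTF-8 bytes
    # Escape everything except the RFC 3986 "unreserved" characters.
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

sub percent_decode {
    my ($text) = @_;
    # Turn each %XX escape back into the byte it stands for.
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

my $encoded = percent_encode('/another/domain');
my $decoded = percent_decode($encoded);
print "$encoded\n";    # %2Fanother%2Fdomain
print "$decoded\n";    # /another/domain
```

The receiving application gets the original value back after decoding, which is why no extra Base64 layer is needed.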

The correct way would be to uri-encode the second hop as you do in your first example. The URI and URI::QueryParam modules make this nice and easy:
To encode a URI, you simply create a URI object on your base url. Then add any query parameters that you want. (NOTE: they will be automatically uri-encoded by URI::QueryParam):
use strict;
use warnings;
use feature qw(say);
use URI;
use URI::QueryParam;
my $u = URI->new('http://www.example.com/domain');
$u->query_param_append('return_url', 'http://yahoo.com');
say $u->as_string;
# http://www.example.com/domain?return_url=http%3A%2F%2Fyahoo.com
To receive this url and then redirect to return_url, you simply create a new URI object then pull off the return_url query parameter with URI::QueryParam. (NOTE: again URI::QueryParam automatically uri-decodes the parameter for you):
my $u = URI->new(
'http://www.example.com/domain?return_url=http%3A%2F%2Fyahoo.com'
);
my $return_url = $u->query_param('return_url');
say $return_url;
# http://yahoo.com

Related

save directly to file and get filename using WWW::Mechanize

I would normally save to a file using this:
$mech->save_content($mech->response->filename);
But due to some big files which cause "Out of memory" I have to use this instead:
$mech->get( $url, ":content_file"=>$tempfile );
How can I get the filename with the second method, or do I have to make one up?
I want the filename that would be returned in the response object: $mech->response->filename. I don't want to make up my own filename.
The :content_file option is inherited from LWP::UserAgent and behaves in the same way. You don't know the filename beforehand.
You could do a HEAD request to check the filename, and then do a GET request.
Alternatively, have a look at the lwp-download utility that ships with LWP::UserAgent. It provides exactly what you need. You could use it directly, or lift the stuff you want out of it. A WWW::Mechanize object can be dropped into this code and will behave exactly the same as the LWP::UserAgent one.
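The HEAD-then-GET idea could look roughly like the sketch below. The helper name download_with_server_filename and the fallback name download.bin are assumptions made for this example, and it assumes the server answers HEAD requests with the same headers it would send for GET, which is not guaranteed.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: fetch only the headers first, then stream the body to disk.
sub download_with_server_filename {
    my ($mech, $url) = @_;

    # HEAD transfers no body, so memory use stays small.
    my $head = $mech->head($url);
    die 'HEAD failed: ' . $head->status_line unless $head->is_success;

    # HTTP::Response->filename looks at Content-Disposition, then the URL.
    my $filename = $head->filename;
    $filename = 'download.bin'    # fallback name is an assumption of this sketch
        unless defined $filename && length $filename;

    # Stream the body straight into the file instead of into memory.
    $mech->get($url, ':content_file' => $filename);
    return $filename;
}

# Usage: perl script.pl http://example.com/big-file
if (@ARGV) {
    my $mech = WWW::Mechanize->new;
    print download_with_server_filename($mech, $ARGV[0]), "\n";
}
```

The filename comes from the server, so sanitize it before using it on a filesystem you care about.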

Perl::Dancer how to include a file path as a parameter in the URI

I'm new to the Dancer framework and web apps in general. I have a Dancer project in which I have a route that accepts multiple parameters. So far, no sweat. However, if one of the parameters has a file path as its value then the route is not found.
I have tried encoding the parameter string as follows to eliminate the forward slashes:
$paramString =~ s/\//%2F/g;
and this does encode the slashes as expected (I print it out in the log to make sure).
However, after the parameter string is appended to the base URI for the route I'm interested in, the URI shows up in the browser in unencoded form, a 404 error is raised and the log says that the unencoded route can't be found.
I looked into the Request.pm module and found that in the init method a private method called _url_decode is called which removes the encoding. Is there a way to disable this when it is not desired?
I also tried using the uri_for method to create the URI. In this case, the encoded URI does show up in the browser, however, the route is still not found and the log indicates that the unencoded form (with the forward slashes) is being used to search for the route:
Trying to match 'GET /exome_proj_config/project_type=exome&project_root=/usr/local/projects/users/pdagosto/projects&analysis_project_name=Test' against /^\/exome_proj_config\/([^\/]+)$/ (generated from '/exome_proj_config/:project_type:project_root:analysis_project_name') in /home/pdagosto/perl5/lib/perl5/Dancer/Route.pm l. 84 here
Since the regex used for the match is clearly looking for a string without any forward slashes following the one at the end of the base URI it's clear that the route will never be found.
Is there a way to have a URI parameter that contains a path or must some other approach be used?
It is possible to have a URI with a file path or slashes in the parameter provided that the parameter is part of the query string rather than the path. (See http://en.wikipedia.org/wiki/Uniform_resource_locator.)
For example see this Dancer script:
use strict;
use warnings;
use Dancer;
get '/file/action/:action' => sub {
my $filename = param('filename');
my $action = param('action');
return "Filename is $filename and action is $action";
};
dance;
If you put this string into the browser
http://localhost:3000/file/action/delete?filename=/folder/filename.txt
then you should expect to see this:
Filename is /folder/filename.txt and action is delete
In your question the URL you show uses the & character to separate parameters but it looks like you need a ? character first to separate the query string from the path. It is unclear from your question how you are creating your requests - but provided you can put the filename in the query string part of the URL then the approach above should work.
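As a sketch of how the request URL from the question could be built with the values in the query string, the URI module can assemble and encode it. This assumes the route is declared as plain '/exome_proj_config' and reads its values with param(), which is not how the original route was written; the host and port are taken from the example above.

```perl
use strict;
use warnings;
use URI;

# Assumption: the Dancer route is '/exome_proj_config' with no path placeholders.
my $u = URI->new('http://localhost:3000/exome_proj_config');

# query_form adds the "?" and "&" separators and URL-encodes each value.
$u->query_form(
    project_type          => 'exome',
    project_root          => '/usr/local/projects/users/pdagosto/projects',
    analysis_project_name => 'Test',
);

my $request_url = $u->as_string;
print "$request_url\n";
```

The slashes in project_root appear as %2F in the printed URL, and they sit in the query string, so they no longer interfere with route matching.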

Disable URL encoding inside an HTTP request

I'm writing a small Perl tool to help me speed up some processes during a blind SQL injection attack (it's an ethical tool; it's my job).
My script handles HTTP requests that are already URL-encoded with hex values (%xx).
As a result, my request is encoded twice when I use HTTP::Request to send it to the web server.
I use this kind of code:
use LWP::UserAgent;
use HTTP::Request;
my $ua = LWP::UserAgent->new;
my $httpreq = HTTP::Request->new( GET => 'http://192.168.0.1/lab/sqli.php?id=1%20and%20(select%20ascii(substring(user,3,1))%20from%20mysql.user%20limit%201)>100%23' );
my $res = $ua->request($httpreq);
How can I disable the perl URL encoding inside my request?
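HTTP::Request does not re-encode the escapes already present in the provided URL. In the output below the existing %20 and %23 are left untouched; only the raw > is escaped, to %3E.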
Any URL encoding must be done before the URL is assembled — it's actually URL components that get encoded — so HTTP::Request expects the encoding to already be done.
>perl -MHTTP::Request -e"print HTTP::Request->new(GET => 'http://192.168.0.1/lab/sqli.php?id=1%20and%20(select%20ascii(substring(user,3,1))%20from%20mysql.user%20limit%201)>100%23')->as_string;"
GET http://192.168.0.1/lab/sqli.php?id=1%20and%20(select%20ascii(substring(user,3,1))%20from%20mysql.user%20limit%201)%3E100%23

How can I detect the file type of image at a URL?

How can I find the file type of an image in Perl from a website URL?
For example,
$image_name = "logo";
$image_path = "http://stackoverflow.com/content/img/so/" . $image_name;
From this information, how can I find the file type? For this example it should display
"png"
because the image is at http://stackoverflow.com/content/img/so/logo.png.
If there are several files with that name, as on the SO website, it should show all of the file types.
If you're using LWP to fetch the image, you can look at the content-type header returned by the HTTP server.
Both WWW::Mechanize and LWP::UserAgent will give you an HTTP::Response object for any GET request. So you can do something like:
use strict;
use warnings;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new;
$mech->get( "http://stackoverflow.com/content/img/so/logo.png" );
my $type = $mech->response->headers->header( 'Content-Type' );
You can't easily tell. The URL doesn't necessarily reflect the type of the image.
To get the image type you have to make a request via HTTP (GET, or more efficiently, HEAD), and inspect the Content-type header in the HTTP response.
Well, https://stackoverflow.com/content/img/so/logo is a 404. If it were not, then you could use
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
my ($content_type) = head "https://stackoverflow.com/content/img/so/logo.png";
print "$content_type\n" if defined $content_type;
__END__
As Kent Fredric points out, what the web server tells you about content type need not match the actual content sent by the web server. Keep in mind that File::MMagic can also be fooled.
#!/usr/bin/perl
use strict;
use warnings;
use File::MMagic;
use LWP::UserAgent;
my $mm = File::MMagic->new;
my $ua = LWP::UserAgent->new(
max_size => 1_000 * 1_024,
);
my $res = $ua->get('https://stackoverflow.com/content/img/so/logo.png');
if ( $res->code eq '200' ) {
print $mm->checktype_contents( $res->content );
}
else {
print $res->status_line, "\n";
}
__END__
You really can't make assumptions about content based on URL, or even content type headers.
They're only guides to what is being sent.
A handy trick to confuse things that use suffix matching to identify file-types is doing this:
http://example.com/someurl?q=foo#fakeheheh.png
And if you were to arbitrarily permit that image to be added to the page, it might in some cases be a doorway to an attack of some sort if the browser followed it. (For example, http://really_awful_bank.example.com/transfer?amt=1000000;from=123;to=123)
Content-type based forgery is not as detrimental, but nasty things are possible if the person who controls the name works out how you identify things and sends a different content type for HEAD requests than for GET requests.
It could tell the HEAD request that it's an image, but then tell the GET request that it's application/javascript, and goodness knows where that will lead.
The only way to know for certain what it is is downloading the file and then doing MAGIC based identification, or more (i.e., try to decode the image). Then all you have to worry about is images that are too large, and specially crafted images that could trip vulnerabilities in computers that are not yet patched for that vulnerability.
Granted all of the above is extreme paranoia, but if you know the rare possibilities you can make sure they can't happen :)
From what I understand, you're not worried about the content type of an image whose name and extension you already know; you want to find the extension for an image whose base name you know.
In order to do that you'd have to test all the image extensions you wanted individually and store which ones resolved and which ones didn't. For example both https://stackoverflow.com/content/img/so/logo.png and https://stackoverflow.com/content/img/so/logo.gif could exist. They don't in this exact situation but on some arbitrary server you could have multiple images with the same base name but different extensions. Unfortunately there's no way to get a list of available extensions of a file in a remote web directory by supplying its base name without looping through the possibilities.
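Looping through the possibilities could be sketched as follows. The helper name find_image_types and the list of extensions are assumptions chosen for this example; the server's answer to HEAD is taken at face value, with the caveats about forged headers mentioned earlier.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: try each extension with a HEAD request and
# remember the Content-Type of every URL that resolves.
sub find_image_types {
    my ($ua, $base_url, @extensions) = @_;
    my %found;
    for my $ext (@extensions) {
        my $res = $ua->head("$base_url.$ext");
        next unless $res->is_success;    # 404 and friends: no such file
        $found{$ext} = $res->header('Content-Type');
    }
    return \%found;
}

# Usage: perl script.pl https://stackoverflow.com/content/img/so/logo
if (@ARGV) {
    my $ua    = LWP::UserAgent->new(timeout => 10);
    my $found = find_image_types($ua, $ARGV[0], qw(png gif jpg jpeg svg));
    print "$_: ", $found->{$_} // 'unknown', "\n" for sort keys %$found;
}
```

Each candidate extension costs one request, so keep the list short when probing a server you don't control.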

How can I get the entire request body with CGI.pm?

I'm trying to write a Perl CGI script to handle XML-RPC requests, in which an XML document is sent as the body of an HTTP POST request.
The CGI.pm module does a great job at extracting named params from an HTTP request, but I can't figure out how to make it give me the entire HTTP request body (i.e. the XML document in the XML-RPC request I'm handling).
If not CGI.pm, is there another module that would be able to parse this information out of the request? I'd prefer not to have to extract this information "by hand" from the environment variables. Thanks for any help.
You can get the raw POST data by using the special parameter name POSTDATA.
my $q = CGI->new;
my $xml = $q->param( 'POSTDATA' );
Alternatively, you could read STDIN directly instead of using CGI.pm, but then you lose all the other useful stuff that CGI.pm does.
The POSTDATA trick is documented in the CGI.pm documentation.
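Reading STDIN directly could look like the sketch below, which uses only core Perl. The helper name read_request_body is made up for this example, and it assumes the web server sets CONTENT_LENGTH for the POST request, as the CGI interface specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: read up to $length bytes of request body from a filehandle.
sub read_request_body {
    my ($fh, $length) = @_;
    binmode $fh;    # the body is raw bytes
    my $body = '';
    while (length($body) < $length) {
        # Append to $body at its current end; read() may return short counts.
        my $got = read($fh, $body, $length - length($body), length($body));
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # client sent less than it announced
    }
    return $body;
}

# In a CGI script the body arrives on STDIN and its size in CONTENT_LENGTH.
if (($ENV{REQUEST_METHOD} // '') eq 'POST') {
    my $xml = read_request_body(\*STDIN, $ENV{CONTENT_LENGTH} // 0);
    print "Content-Type: text/plain\r\n\r\n", length($xml), " bytes received\n";
}
```

Once STDIN has been consumed this way, CGI.pm can no longer parse the body itself, which is the trade-off mentioned above.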
Right, one could use POSTDATA, but that only works if the request Content-Type has not been set to 'multipart/form-data'.
If it is set to 'multipart/form-data', CGI.pm does its own content processing and POSTDATA is not initialized.
So, other options include $cgi->query_string and/or $cgi->Dump.
The $cgi->query_string returns the contents of the POST in a GET format (param=value&...), and there doesn't seem to be a way to simply get the contents of the POST STDIN as they were passed in by the client.
So to get the actual content of the standard input of a POST request, if modifying CGI.pm is an option for you, you could modify around line 620 to save the content of @lines somewhere in a variable, such as:
$self->{standard_input} = join '', @lines;
And then access it through $cgi->{standard_input}.
To handle all cases, including those when Content-Type is multipart/form-data, read (and put back) the raw data, before CGI does.
use strict;
use warnings;
use IO::Handle;
use IO::Scalar;
STDIN->blocking(1); # ensure to read everything
my $cgi_raw = '';
{
local $/;
$cgi_raw = <STDIN>;
my $s;
tie *STDIN, 'IO::Scalar', \$s;
print STDIN $cgi_raw;
tied(*STDIN)->setpos(0);
}
use CGI qw /:standard/;
...