I need to place a Mojolicious app behind an Apache reverse proxy. I've been unable to get Mojolicious to generate working URLs when behind the proxy.
I'm using Mojolicious 6.14 with Perl 5.18.1.
Here's my Apache reverse proxy configuration which I set based on https://github.com/kraih/mojo/wiki/Apache-deployment (in the path section).
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
ProxyRequests Off
ProxyPreserveHost On
ProxyPass /app1 http://localhost:3000/ keepalive=On
ProxyPassReverse /app1 http://localhost:3000/
RequestHeader set X-Forwarded-HTTPS "0"
Here's my test case.
use 5.014;
use Mojolicious::Lite;
app->hook('before_dispatch' => sub {
    my $self = shift;
    if ($self->req->headers->header('X-Forwarded-Host')) {
        #Proxy Path setting
        my $path = shift @{$self->req->url->path->parts};
        push @{$self->req->url->base->path->parts}, $path;
    }
});
any '/' => sub {
    my $c = shift;
    $c->render('index');
};
any '/test' => sub {
    my $c = shift;
    $c->render('test');
};
app->start;
__DATA__
@@ index.html.ep
<!DOCTYPE html>
<html>
<head><title>Index Page</title></head>
<body>
<p>Index page</p>
<p>
%= link_to 'Go to Test Page' => '/test'
</p>
</body>
</html>
@@ test.html.ep
<!DOCTYPE html>
<html>
<head><title>Test Page</title></head>
<body>
<p>Test page</p>
<p>
%= link_to 'Return to home page' => '/'
</p>
</body>
</html>
I can see the index page when I access http://www.example.com/app1, but the link to the test page is incorrect. The link is //test when I expected it to be http://www.example.com/app1/test.
Here's the HTML output from the test case.
<!DOCTYPE html>
<html>
<head><title>Index Page</title></head>
<body>
<p>Index page</p>
<p>
<a href="//test">Go to Test Page</a>
</p>
</body>
</html>
How can I tell Mojolicious what the base URL is for my app so it generates the correct links?
Maybe need to replace server proxy pass on http://localhost:3000/app1 in apache config:
ProxyPass /app1 http://localhost:3000/app1 keepalive=On
ProxyPassReverse /app1 http://localhost:3000/app1
That's a good question! Judging by some of the answers here and elsewhere, there seems to be a general lack of understanding of how the Apache reverse proxy settings affect the Mojolicious application and what the hook is supposed to do.
You've received an answer that's basically correct, but it begins with "Maybe need to replace server proxy pass..." and it doesn't provide any explanation. A trial-and-error approach may or may not work for you. If your hook works differently, it probably won't.
Apache reverse proxy
This is your reverse proxy configuration (trailing slash removed, see below):
ProxyPass /app1 http://localhost:3000 keepalive=On
Quoting from the Apache documentation:
Suppose the local server has address http://example.com/; then
| ProxyPass /mirror/foo/ http://backend.example.com/
will cause a local request for http://example.com/mirror/foo/bar to be internally converted into a proxy request to http://backend.example.com/bar.
Now, assuming Apache is listening on localhost:80 and your (Morbo) application server is listening on port 3000, a request to http://localhost/app1 is received by Apache and forwarded to your application as /. The app1 prefix has been lost, which is why it's missing from the base url, i.e., it's missing in all the links. To fix urls generated by the application, this prefix must be added to the base url, which leads us to the hook.
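The prefix loss can be modelled in a few lines. This is a simplified sketch in Python, not Apache's actual implementation: the function name proxy_pass is hypothetical, and it only mimics the documented behaviour of swapping the matched local path for the backend URL.

```python
def proxy_pass(local_path: str, backend_url: str, request_path: str) -> str:
    """Simplified model of ProxyPass: swap the matched local prefix for the backend URL."""
    if not request_path.startswith(local_path):
        raise ValueError("request is not covered by this ProxyPass rule")
    # Whatever follows the matched prefix is appended to the backend URL unchanged.
    return backend_url + request_path[len(local_path):]


# Target without the prefix: the backend never sees "app1".
print(proxy_pass("/app1", "http://localhost:3000", "/app1/test"))

# Target with the prefix: the backend sees "/app1/test", so a hook can pick it up.
print(proxy_pass("/app1", "http://localhost:3000/app1", "/app1/test"))
```

In the first call the application only ever sees /test, so nothing in the request tells it that it lives under /app1.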
Hook
This is your hook function:
if ($self->req->headers->header('X-Forwarded-Host')) {     # 1. if
    my $path = shift @{$self->req->url->path->parts};      # 2. shift
    push @{$self->req->url->base->path->parts}, $path;     # 3. push
}
This hook is supposed to fix the base url. As explained above, the app1 prefix needs to be added to the base url, which is prepended to all generated urls. If one of your templates links to /test, the base url should look like /app1 to get the final url /app1/test.
This is what your hook does:
1. By checking for the X-Forwarded-Host header, you make sure to only modify the base url if the request came through the reverse proxy. This works because the mod_proxy_http module (documentation) automatically sets that header. Without that check, you wouldn't be able to access your application server directly under localhost:3000; all urls would be broken.
In fact, I asked a question about how this distinction can be made reliably, so that the url prefix is fixed behind a reverse proxy without breaking requests going directly to the application server. Unfortunately, most of the answers I have received are wrong. But I believe checking for X-Forwarded-Host is good enough as it's set by Apache and not by Morbo or Hypnotoad. In fact, it's set by reverse proxies, which is precisely what you're looking for.
2. This shift is supposed to extract the prefix from the request url.
This is necessary because, strictly speaking, appending the application prefix to the ProxyPass directive manipulates the final request url, so your application receives a request for /app1/. Of course, there's no route for that address because the router in your application doesn't know that /app1 is the prefix of that instance rather than a relative application url.
Clearly, adding the hard-coded prefix /app1 to all templates (as some might be tempted to do) would not work if you deployed another copy of the same application under /app2. Even if you didn't, you'd still have to change all the links if your provider forced you to change the app1 prefix to app_one. This is why the prefix is picked up in that hook, stored to make links work (see #3) and then removed from the request url to make the router happy.
3. This is where the /app1 prefix, a single path token, is appended to the base url. The base url is prepended to urls generated in your templates. This is what turns /test into /app1/test (if the request came through the reverse proxy).
In your case, /test is turned into //test because you're missing the prefix. I've explained that at the end of this answer.
Fix reverse proxy
That being said, your reverse proxy needs to manipulate the request url to include the prefix in order to make the hook work:
ProxyPass /app1 http://localhost:3000/app1
After this modification, your hook works:
1. It modifies the base url only if a reverse proxy header is set because the modification is only necessary when a reverse proxy is used.
2. All requests going to the Mojolicious application will have the /app1 prefix, e.g., /app1/test. In this step, the prefix is removed to turn the url into /test.
3. The prefix removed in step 2 is appended to the base url, which is later used to generate links.
This should explain why you need to add the application prefix to the ProxyPass line. Without that explanation, someone else might try to do just that without success because they might have a different hook function.
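The three steps can be sketched outside of Mojolicious. The following is a rough Python model, not the framework's code: apply_hook is a hypothetical name, paths are plain lists of segments, and the base path is assumed to start out empty.

```python
def apply_hook(request_path, forwarded_host=None):
    """Model of the hook: move the first path segment from the request to the base path."""
    parts = [p for p in request_path.split("/") if p]
    base = []  # assumption: the base path is empty before the hook runs
    if forwarded_host:                                # step 1: only for proxied requests
        prefix = parts.pop(0) if parts else None      # step 2: shift (None mirrors Perl's undef)
        base.append(prefix)                           # step 3: push
    return "/" + "/".join(parts), base


# Proxied request with the prefix in the ProxyPass target.
print(apply_hook("/app1/test", "www.example.com"))

# Direct request to the application server: nothing is touched.
print(apply_hook("/test"))
```

The router then works with the first element of the result, and link generation with the second.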
Slashes
A single slash can break everything and cause most requests to fail with error 404.
Note that the local target url in your ProxyPass line (second argument) has a trailing slash but the path argument doesn't. If those don't match, you might end up with double slashes in the request url and some requests could fail.
From the Apache documentation:
If the first argument ends with a trailing /, the second argument should also end with a trailing /, and vice versa. Otherwise, the resulting requests to the backend may miss some needed slashes and do not deliver the expected results.
Now, if you remove the trailing slash but forget the prefix...
ProxyPass /app1 http://localhost:3000
... generated urls will still have two leading slashes: url_for '/test' = //test
That's because you're appending undef to the base url where you want to append the application prefix.
What happens is that in step 2 (see above), you extract the prefix, assuming the application is running exactly one level below the document root, i.e., your prefix is something like app1 and not apps/app1 (in which case the shift/push routine has to be run twice). But there's no prefix in the ProxyPass directive, so your application sees something like /. In other words, there's nothing to extract from parts. And there's no safeguard in the code either, so you end up pushing undef to the parts array of the base url. If you then generate a url, Mojolicious adds an extra slash for that undef element, which is why you get //test. The parts array looks like this:
"parts" => [
undef,
"test"
],
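To see why an undefined element yields a second slash, here is a small Python model of joining path parts. It is only an illustration under the assumption that an undefined part renders as an empty string; render_path and url_for are hypothetical helpers, not the Mojolicious functions of the same name.

```python
def render_path(parts):
    """Join path parts with slashes; a missing part (None) renders as an empty string."""
    return "/" + "/".join("" if p is None else p for p in parts)


def url_for(base_parts, path):
    """Prepend the base path parts to the parts of an application-relative path."""
    target = [p for p in path.split("/") if p]
    return render_path(list(base_parts) + target)


print(url_for([None], "/test"))    # base path polluted by an undefined prefix
print(url_for(["app1"], "/test"))  # base path with the real prefix
print(url_for([], "/test"))        # untouched base path
```

The empty string between the two separators is what shows up as the doubled slash.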
To fix this double slash error, you can add a safeguard to your hook:
my $path = shift @{$self->req->url->path->parts};
if ($path) { # safeguard
    push @{$self->req->url->base->path->parts}, $path;
}
Of course, as long as your reverse proxy configuration has the prefix in it, $path should always be defined.
One could certainly argue that this approach is a hack because it manipulates the url. Hacks tend to fail under certain circumstances. In this case, it would fail if you were to manually set the X-Forwarded-Host header while accessing the application server directly. I mentioned that in my question. But as the developer, you're probably the only person who has direct access to that application server, as the firewall would only allow external requests to the reverse proxy in a typical production environment. I'll leave it at that.
This is something of a workaround, but it is how I solved the problem:
In the configuration file (.conf) I define the base url:
base_url => 'https://booking.business-apartments.wien',
This allows me to write templates like this:
%= link_to 'Payment Information' => ( config('base_url') . url_for('intern/invoice/list_payments/') );
Maybe you should try to update your Mojolicious to a newer version? I remember I had a similar problem and I solved it with code in which I explicitly defined the url for the proxy and appended it to every request (similar to lanti's answer). After some Mojolicious update the code was not necessary anymore.
Moreover, I think I use the same config as proposed by Logioniz.
When you mount your app at some point other than the root (/), getting it to work can be tricky. Look at Mojolicious::Controller::url_for:
# Make path absolute
my $base_path = $base->path;
unshift @{$path->parts}, @{$base_path->parts};
$base_path->parts([])->trailing_slash(0);
Here you can control what is generated.
Related
I am looking to try and get haproxy to rewrite a url on the backend. For example if the end user navigates to bitechomp.domain.com they should only see this URL but the request to the backend server needs to have the request re-written to include a path. e.g. bitechomp.domain.com/bitechomp
I believe I have the regex to match it, but struggling to find the syntax to then just have it add the folder path at the end.
^([a-zA-Z0-9]/)?(bitechomp).$
I believe I have resolved this.
http-request set-path /bitechomp/ if { path_reg ^([a-zA-Z0-9]/)?(bitechomp).$ }
This works for any domain so both bitechomp.domain1.com and bitechomp.domain2.com would be re-written to bitechomp.domain1.com/bitechomp and bitechomp.domain2.com/bitechomp
What is the neatest way to redirect these
http://example.com/test to http://example.com/test/
(It should work no matter what "test" is). I'm new to HAProxy - I have different apache backends. Have it working for URLs of this format http://example.com/test/ but cannot get the other type to work.
Have tried:
(1)
http-request set-path %[path]/ unless { path_end / }
which doesn't load the html page properly - possibly because it's screwing up e.g. the referenced JS files.
(2)
http-request redirect code 301 prefix / drop-query append-slash if missing_slash
from the documentation. I figure I need some kind of slight adjustment to this but don't know where to start with it. Any hints would be much appreciated.
This would potentially best be done at the back-end, since only the back-end has a way to actually know which paths should be redirected. But, it should be possible from HAProxy.
It should work no matter what "test" is
I am skeptical that this is precisely what you need, because then (e.g.) /css/common.css becomes /css/common.css/, which would be wrong.
I'm inclined to think you want something like this:
http-request redirect code 301 prefix / drop-query append-slash if { path_reg /[^/\.]+$ }
Redirect to the same path, with the query removed and / appended to the end, if the path ends with a / followed by one or more characters that are neither / nor a dot (.).
This should redirect /test to /test/ and /hello/world to /hello/world/ but it should leave paths like /js/example.js and /images/cat.png and /favicon.ico unaffected.
Deeper nesting should work correctly, too, because regular expressions like this either find a match or don't, and this expression only considers whatever is after the final slash in the path. If it contains a dot or nothing, then there is no match, which seems correct.
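The pattern can be checked against the paths mentioned above. This is a quick sketch using Python's re module as a stand-in; it assumes HAProxy's regex engine treats this particular expression the same way, and needs_trailing_slash is a hypothetical helper name.

```python
import re

# Same expression as in the path_reg condition; search() is unanchored at the start.
NEEDS_SLASH = re.compile(r"/[^/\.]+$")


def needs_trailing_slash(path: str) -> bool:
    """True if the last path segment is non-empty and contains no dot."""
    return NEEDS_SLASH.search(path) is not None


for path in ("/test", "/hello/world", "/test/", "/js/example.js", "/favicon.ico"):
    print(path, needs_trailing_slash(path))
```

Only the first two paths qualify for the redirect; the one that already ends in a slash and the ones with a file extension are left alone.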
I have a bunch of strange 404 errors in Google Search Console. The URLs don't exist on my site, and I need to redirect them to my homepage.
http://www.example.com/plugins/feedback.php?href=http%3A%2F%2Fwww.example.com%2Fremain-url-1%2F&_fb_noscript=1
http://www.example.com/plugins/feedback.php?href=http%3A%2F%2Fwww.example.com%2Fremain-url-2%2F&_fb_noscript=1
http://www.example.com/plugins/feedback.php?href=http%3A%2F%2Fwww.example.com%2Fremain-url-3%2F&_fb_noscript=1
I've made these two attempts, but neither works:
rewrite ^/plugins/feedback(/.*)$ http://www.example.com/ permanent;
rewrite ^/plugins/feedback.php?href=http://www.example.com(/.*)$ /blog/ permanent;
Is it possible to redirect this with one wild card?
The rewrite directive cannot be used to filter parameters because it uses a normalized URI which does not include the query string. You can access parameters using the $args variable, or individually using the $arg_xxx variables.
However, the $request_uri contains the entire URI (including query string) and could be used with an if block or map to test for the presence of the parameters you seek.
For example:
if ($request_uri ~ ^/some/regular/expression) {
return 301 /;
}
The block could be placed in the server block scope, or within the location block which would normally process the /plugins/feedback.php URI.
See the following documents for details: if directive, map directive, if usage restrictions.
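The difference between matching the path and matching the full request URI can be illustrated in Python. This is only a sketch: the pattern below is a hypothetical example for the feedback.php URLs in the question, not something the answer prescribes, and urlsplit only approximates what nginx exposes as the normalized URI.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: the script name followed by the start of a query string.
FEEDBACK = re.compile(r"^/plugins/feedback\.php\?")


def should_redirect(request_uri: str) -> bool:
    """Test a URI the way an `if ($request_uri ~ ...)` block would."""
    return FEEDBACK.search(request_uri) is not None


uri = ("/plugins/feedback.php"
       "?href=http%3A%2F%2Fwww.example.com%2Fremain-url-1%2F&_fb_noscript=1")
path_only = urlsplit(uri).path  # roughly what a rewrite regex gets to see

print(should_redirect(uri))        # full URI, query string included
print(should_redirect(path_only))  # path without the query string
```

Since the path alone never contains the question mark, a pattern that depends on the query string can only match when it is tested against the full request URI.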
Please bear with me as I am not a coder by nature.
This is what I am trying to achieve using HAProxy, but after hours of trying I am unable to make it work.
From
domain.com/alpha
domain.com/beta
To
domain.com/alpha will point to backend1/path/index.cgi
domain.com/beta will point to backend2/path/index.cgi
I have tried multiple ways, but to no avail. I did read about rewrite/redirect (e.g. "reqrep"), but it leaves me very confused.
Using alpha.domain.com pointing to backend1/path works as expected, but I need the path inline because of a certificate limitation.
Thank you in advance. If possible, please explain a bit how it works and what the correct terms are (for example: rewrite, redirect) so that I have a clue and can advance from there.
This is what I was able to come up with:
frontend HTTP
    mode http
    bind *:80
    acl alpha url_beg /alpha
    acl beta url_beg /beta
    use_backend backend_alpha if alpha
    use_backend backend_beta if beta
backend backend_alpha
    reqrep ^([^\ ]*\ /)alpha[/]?(.*) \1path/index.cgi
    server server_alpha localhost:8080
backend backend_beta
    reqrep ^([^\ ]*\ /)beta[/]?(.*) \1path/index.cgi
    server server_beta localhost:8081
Obviously you would then replace localhost:8080 and localhost:8081 with the correct locations for your case.
Explanation
First, in the frontend named HTTP there are two ACLs (Access Control Lists) which test what is at the beginning of the URL (hence the keyword url_beg). The result of these rules is that if the url begins with /alpha, the variable called alpha is set to true, and likewise for beta.
Next in the frontend, there are two use_backend commands which will direct the request to backend_alpha if the variable alpha is set to true and the same for backend_beta if beta is set to true.
As a result the frontend does the work of taking the url and deciding which server to use.
The two backends (backend_alpha and backend_beta) are almost identical except for the text alpha and beta and the location of the respective servers. The first command in each backend is the reqrep command which you pointed out. What the reqrep command does is take the HTTP request line, search it for a particular part using a regular expression and then replace the match with something else. For a request to http://example.com/alpha, the request line looks like this:
GET /alpha HTTP/1.1
In the first part of the reqrep command:
^([^\ ]*\ /) matches the method, the space after it and the leading slash of the path (GET /) and stores them in a group called \1
alpha then matches the alpha in the request path
[/]?(.*) matches an optional slash plus everything after it up to the end of the line and stores that in a group called \2 (the replacement does not use \2, so that text is dropped)
Then the second part of the reqrep command says: take the contents of \1 (GET /), add path/index.cgi to the end and make that the new request line to send to the server.
As a result, for both alpha and beta urls the request sent to the server is for /path/index.cgi.
Finally, the server command sends the request off to the appropriate server.
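The substitution can be tried out with an ordinary regex engine. The sketch below uses Python's re module as a stand-in for reqrep and assumes the rule is applied to the whole request line; rewrite is a hypothetical helper, and HAProxy's escaped space (\ ) is written as a plain space.

```python
import re

# Same expression as the alpha backend, with "\ " written as a plain space.
PATTERN = re.compile(r"^([^ ]* /)alpha[/]?(.*)")


def rewrite(request_line: str) -> str:
    """Apply the replacement \\1path/index.cgi to a request line."""
    return PATTERN.sub(r"\1path/index.cgi", request_line)


match = PATTERN.match("GET /alpha HTTP/1.1")
print(repr(match.group(1)))  # the part kept by the replacement
print(repr(match.group(2)))  # the part the replacement discards
print(rewrite("GET /alpha HTTP/1.1"))
print(rewrite("GET /beta HTTP/1.1"))  # no match, left unchanged
```

In this plain-regex model, (.*) also consumes the protocol version at the end of the line, so it is worth checking on a real setup what HAProxy forwards to the backend.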
I would like to point out I am not an expert on the complicated Regular Expression part (I find it a bit confusing as well) but hopefully someone else who knows a bit more can explain it in more detail or correct me if I am wrong.
I hope that helps :)
I understand that baseUrls are mostly automatically set in zend projects, and they come in handy when running an app from a subfolder, But I'm running my app from the root public folder, with server-name myzend.loc.
So when I do an echo $this->baseUrl() or var_dump($this->baseUrl()), I get an empty string ''. Is this normal and a result of not running the app from a subfolder? or do I set baseUrl manually; eg $request->getHelper('baseUrl')->setBaseUrl($_SERVER['SERVER_NAME']);
The quick answer is that on my ZF1 apps serving sites at the root of a virtual web host, I also get an empty string for:
Zend_Controller_Front::getInstance()->getBaseUrl()
As @TimFountain notes, on its own, that doesn't return a true URL, complete with http or https protocol prefix. So, it can be interpreted more as a file-path than as a URL. But it's more of a relative file path: the path relative to the docroot.
And, of course, if you set the baseUrl - perhaps as a real URL complete with protocol prefix - then all interpretations are off: it merely returns whatever you set.
It is also worth noting two things about the baseUrl() view-helper implemented in Zend_View_Helper_BaseUrl:
1. It uses the front-controller getBaseUrl() method above.
2. It takes an argument!
For (2), in my experience, many people use the baseUrl() view-helper as follows (in a view-script, say):
<?php echo $this->baseUrl() . '/assets/css/styles.css'; ?>
But I find it more concise to write:
<?php echo $this->baseUrl('/assets/css/styles.css'); ?>
To me, it's not just a matter of a few extra characters. It's more that the core of the app should know its own internal structure, but not its deployment details. Employing the baseUrl() view-helper this way implicitly acknowledges that there is some deployment-aware mapping required to resolve an internal deployment-independent asset reference to its post-deployment counterpart.
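The idea of a deployment-aware mapping can be shown with a tiny model. This is a Python sketch of the concept only, not the Zend_View_Helper_BaseUrl source; base_url is a hypothetical function that takes the deployment base path as an explicit argument.

```python
from typing import Optional


def base_url(base_path: str, asset: Optional[str] = None) -> str:
    """Map a deployment-independent asset path onto the deployed base path."""
    base_path = base_path.rstrip("/")
    if asset is None:
        return base_path
    return base_path + "/" + asset.lstrip("/")


# App served from the document root: the base path is empty.
print(base_url("", "/assets/css/styles.css"))

# Same app deployed in a sub-folder: only the base path changes.
print(base_url("/subfolder", "/assets/css/styles.css"))
```

The view script passes the same asset path in both deployments; only the base path differs.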
$this->baseUrl() is set by Zend Framework; you don't need to set it. The reason you are getting an empty base url is that, in your Apache httpd.conf, the VirtualHost DocumentRoot is set to the project directory:
DocumentRoot "/usr/local/apache2/docs/yourproject"
Despite the name, baseUrl returns an absolute path, not a URL. It returning an empty string would be what you'd expect from a normal ZF project that doesn't live in a sub-folder.