HAProxy path regexp based on map lookup - haproxy

I'm using HAProxy 1.8.14 on a server running Debian stretch.
I want to route requests to different backends for a certain set of domains, but only for some specific paths. Since there are quite a few domains and the allowed paths vary, I think a solution using maps would be nice.
I've tried to use a map to look up a backend based on hdr(host), with the condition that the path must match a regular expression mapped to hdr(host). I've tried the following but I can't get it to work:
use_backend bk-%[hdr(host),lower,map_dom(/etc/haproxy/host_to_backend.map,bk_default)] if { path_reg %[hdr(host),lower,map_dom(/etc/haproxy/domain_path_whitelist.map)] }
Example of host_to_backend.map:
a.foo.org a
b.foo.org b
c.foo.org c
Example of domain_path_whitelist.map (regexps not tested):
a.foo.org ^/(yada|info)/.*$
b.foo.org ^.*$
c.foo.org ^/bar/.*$
To avoid the regexp complexities I've also tried a 'beg' alternative:
use_backend bk-%[hdr(host),lower,map_dom(/etc/haproxy/host_to_backend.map,bk_default)] if { path_beg %[hdr(host),lower,map_dom(/etc/haproxy/domain_path_whitelist.map)] }
...but no luck.
Is it possible to solve my specific problem using maps? If not, can you suggest an alternative solution?

I found the HAProxy community and asked there too. I'll probably go with the map_reg variant:
use_backend bk-%[base,lower,map_reg(/etc/haproxy/base_to_backend.map,bk_default)]
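For illustration, a map file for that map_reg variant might look like the sketch below. The entries are assumptions derived from the two example maps in the question, not a tested configuration. Since base is the Host header followed by the path, each regex covers both parts; the patterns are lowercase because of the lower converter, and they assume the Host header carries no port.

```haproxy
# /etc/haproxy/base_to_backend.map (hypothetical contents)
# key = regex matched against host + path, value = backend suffix
^a\.foo\.org/(yada|info)/   a
^b\.foo\.org/               b
^c\.foo\.org/bar/           c
```

With these entries, a request for a.foo.org/info/x would select bk-a, and anything that matches no line falls back to the default given in the use_backend rule.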

This cannot be done the way you are attempting because log-format expressions written as %[...] are not evaluated inside an ACL match pattern. The path_beg ACL is trying to match the literal string %[hdr(host),lower,map_dom(/etc/haproxy/domain_path_whitelist.map)].
Also note that to match a regex against the path, you'd want to use path_reg.
An alternative solution is to use an ACL with a fixed pattern instead of one looked up from a map, for example:
use_backend bk-%[hdr(host),lower,map_dom(/etc/haproxy/host_to_backend.map,bk_default)] if { path_reg ^/(yada|info)/.*$ }
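Because each domain in the question has its own allowed paths, the fixed-pattern approach would need one host/path pair per domain. A minimal sketch, assuming the backends bk-a and bk-c exist and using ACL names (host_a, path_a, and so on) that are made up for this example:

```haproxy
# Fragment for a frontend section; ACL names are hypothetical.
acl host_a hdr(host) -i a.foo.org
acl host_c hdr(host) -i c.foo.org
acl path_a path_reg ^/(yada|info)/
acl path_c path_beg /bar/

# Both ACLs on a line must match (implicit AND).
use_backend bk-a if host_a path_a
use_backend bk-c if host_c path_c
```

This stays readable for a handful of domains; with many domains the map-based lookup is likely easier to maintain.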

Related

HA Proxy rewrite and redirect

Trying to do a rewrite and redirect. I've been trying this, it works to some extent but not 100% what I want it to do
acl old url_beg /site/ab
http-request redirect location /new/%[query] if old
the url can be for example
https://host/site/ab/xx
https://host/site/ab/yyyy
https://host/site/ab/zzzzzz
https://host/site/ab/zzzzzz/asdajshdjasd
I am looking to grab the bold marked part (the path segment right after /site/ab/, that is xx, yyyy and zzzzzz above) and simply redirect the user to https://host/new/boldmarkedpart
Any string that comes after the bold marked part can be trashed. For example "/asdajshdjasd" in the last example.
Any idea how to accomplish this? Thank you!!
If I understand correctly, you want to split the path part of the URL and get its 4th part.
In the string foo1/foo2/foo3/foo4/foo5 you want only foo4.
This should work for you:
acl old path_beg /site/ab
http-request redirect location /new/%[path,field(4,/)] if old
It may be confusing that you want 3rd directory from path and here you take 4th word, but that's because when you split /foo2/foo3/foo4/foo5 by / then the first word is empty.
field converter is documented here: https://cbonte.github.io/haproxy-dconv/2.2/configuration.html#7.3.1-field
Other notes:
%[query] returns the query part of the URL, which is everything after the ? character, and your examples have no query part at all.
In my tests url had the form schema://hostname:port/path, so the test acl old url_beg /site/ab never matched; path is the right fetch for this.
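To make the field numbering concrete, here is the same redirect annotated with how one of the example paths from the question splits. This only illustrates the rule above; the anonymous ACL stands in for the named old ACL, and the comments were worked out by hand.

```haproxy
# path          = /site/ab/zzzzzz/asdajshdjasd
# split on "/"  = "" | site | ab | zzzzzz | asdajshdjasd
# field number  =  1 |  2   |  3 |   4    |      5
# The redirect target becomes /new/zzzzzz; fields after the 4th are dropped.
http-request redirect location /new/%[path,field(4,/)] if { path_beg /site/ab/ }
```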

haproxy rewrite on backend to add a path for some requests to a specific domain

I am looking to try and get haproxy to rewrite a url on the backend. For example if the end user navigates to bitechomp.domain.com they should only see this URL but the request to the backend server needs to have the request re-written to include a path. e.g. bitechomp.domain.com/bitechomp
I believe I have the regex to match it, but I'm struggling to find the syntax to then just have it add the folder path at the end.
^([a-zA-Z0-9]/)?(bitechomp).$
I believe I have resolved this.
http-request set-path /bitechomp/ if { path_reg ^([a-zA-Z0-9]/)?(bitechomp).$ }
This works for any domain so both bitechomp.domain1.com and bitechomp.domain2.com would be re-written to bitechomp.domain1.com/bitechomp and bitechomp.domain2.com/bitechomp
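If the rewrite should apply to one specific domain only and keep whatever path the client asked for, a variant along these lines may be closer to the original goal. It is a sketch under assumptions: the host name comes from the question, and the backend is assumed to serve the application under /bitechomp/.

```haproxy
# Prefix the path with /bitechomp for one host, unless it is already there.
# The client still sees the original URL; only the backend request changes.
http-request set-path /bitechomp%[path] if { hdr(host) -i bitechomp.domain.com } !{ path_beg /bitechomp/ }
```

With this rule a request for bitechomp.domain.com/ would reach the backend as /bitechomp/, and /foo would become /bitechomp/foo.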

Neatest way to redirect urls without trailing slash in HAproxy

What is the neatest way to redirect these
http://example.com/test to http://example.com/test/
(It should work no matter what "test" is). I'm new to HAProxy - I have different apache backends. Have it working for URLs of this format http://example.com/test/ but cannot get the other type to work.
Have tried:
(1)
http-request set-path %[path]/ unless { path_end / }
which doesn't load the html page properly - possibly because it's screwing up e.g. the referenced JS files.
(2)
http-request redirect code 301 prefix / drop-query append-slash if missing_slash
from the documentation. I figure I need some kind of slight adjustment to this but don't know where to start with it. Any hints would be much appreciated.
This would potentially best be done at the back-end, since only the back-end has a way to actually know which paths should be redirected. But, it should be possible from HAProxy.
> It should work no matter what "test" is
I am skeptical that this is precisely what you need, because then (e.g.) /css/common.css becomes /css/common.css/, which would be wrong.
I'm inclined to think you want something like this:
http-request redirect code 301 prefix / drop-query append-slash if { path_reg /[^/\.]+$ }
Redirect to the same path, with the query removed and / appended to the end, if the path ends with a / followed by one or more characters that are neither / nor a dot (.).
This should redirect /test to /test/ and /hello/world to /hello/world/ but it should leave paths like /js/example.js and /images/cat.png and /favicon.ico unaffected.
Deeper nesting should work correctly, too, because regular expressions like this either find a match or don't, and this expression only considers whatever is after the final slash in the path. If it contains a dot or nothing, then there is no match, which seems correct.

Concatenating strings in HAProxy

I'd like to have a throttling rule in HAProxy that limits rate at which a user can load any particular path, but I don't know of a way to concatenate strings in HAProxy (at least, in the context of generating a key for a stick table). So what I'd like is
tcp-request content track-sc1 concat(req.cook(user), path)
tcp-request content reject if {sc1_http_req_rate gt 10}
HAProxy manipulate string prior to map lookup suggests using regsub to do something somewhat similar, but I think I can only do constant manipulations with that.
The best I've come up with so far is to track path and req.cook(user) separately, and to reject if each of them is too high, but this isn't the actual behavior that I'm looking for.
The link in your question has the answer. You concat in set-header first and then use that.
http-request set-header X-Concat %[req.cook(user)]__%[path]
http-request track-sc0 hdr(X-Concat) table peruser
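The two lines above reference a table named peruser that is not shown. A fuller sketch might look like this, where the table sizing, the 10s rate window, the frontend name fe_main and the backend bk_app are all assumptions for illustration (bk_app would need to be defined elsewhere):

```haproxy
# Server-less backend used only to hold the stick table.
backend peruser
    stick-table type string len 128 size 100k expire 1m store http_req_rate(10s)

frontend fe_main
    bind *:80
    mode http
    # Build the combined key, track it, then reject when the rate is too high.
    http-request set-header X-Concat %[req.cook(user)]__%[path]
    http-request track-sc0 hdr(X-Concat) table peruser
    http-request deny if { sc0_http_req_rate(peruser) gt 10 }
    default_backend bk_app
```

Because the key is the user cookie joined with the path, each user/path pair gets its own counter, which is the behavior the question asks for.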

HAProxy path to host/path/

Please bear with me as I am not a coder by nature.
This is what I am trying to achieve using HAProxy, but after hours of checking I am unable to make it work.
From
domain.com/alpha
domain.com/beta
To
domain.com/alpha will point to backend1/path/index.cgi
domain.com/beta will point to backend2/path/index.cgi
I have tried multiple ways, to no avail. I did read about rewrite/redirect (e.g. "reqrep"), but it confuses me.
Using alpha.domain.com to point to backend1/path works as expected, but I need the path inline because of a certificate limitation.
Thank you in advance. If possible, please explain a bit how it works and what the correct terms are (for example: rewrite, redirect) so that I have a clue and can advance from there.
This is what I was able to come up with:
frontend HTTP
    mode http
    bind *:80
    acl alpha url_beg /alpha
    acl beta url_beg /beta
    use_backend backend_alpha if alpha
    use_backend backend_beta if beta
backend backend_alpha
    reqrep ^([^\ ]*\ /)alpha[/]?(.*) \1path/index.cgi
    server server_alpha localhost:8080
backend backend_beta
    reqrep ^([^\ ]*\ /)beta[/]?(.*) \1path/index.cgi
    server server_beta localhost:8081
Obviously you would then replace localhost:8080 and localhost:8081 with the correct locations for your case.
Explanation
First, in the frontend named HTTP there are two ACLs (Access Control Lists) which test what is at the beginning of the URL (hence the keyword url_beg). The result of these rules is that if the URL begins with /alpha, the ACL named alpha evaluates to true, and likewise for beta.
Next in the frontend, there are two use_backend commands which direct the request to backend_alpha if the ACL alpha is true, and to backend_beta if beta is true.
As a result the frontend does the work of taking the url and deciding which server to use.
The two backends (backend_alpha and backend_beta) are almost identical except for the text alpha and beta and the location of the respective servers. The first command in each backend is the reqrep command which you pointed out. reqrep matches a regular expression against the HTTP request line (and the header lines), not against a full URL, and replaces a matching line with something else. For a request to http://example.com/alpha the request line is:
GET /alpha HTTP/1.1
In the first part of the reqrep command:
^([^\ ]*\ /) matches the method, the space after it and the leading slash (GET /) and stores them in a group called \1
alpha then matches the alpha in the request path
[/]?(.*) matches an optional slash plus everything that follows on the line and stores the latter in a group called
\2 (the replacement never uses \2, so whatever follows alpha on the line is discarded)
Then the second part of the reqrep command says: take the contents of \1 (GET /), append path/index.cgi, and use that as the new request line.
As a result, for both the alpha and beta URLs the path sent to the server is /path/index.cgi.
Finally, the server command sends the request off to the appropriate server.
I would like to point out I am not an expert on the complicated Regular Expression part (I find it a bit confusing as well) but hopefully someone else who knows a bit more can explain it in more detail or correct me if I am wrong.
I hope that helps :)
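One caveat on the backends above: reqrep was removed in HAProxy 2.1, so on newer versions that configuration will not load. A possible equivalent, sketched here with the same server addresses, uses http-request set-path, which replaces only the path and leaves the rest of the request line alone:

```haproxy
backend backend_alpha
    # Every request routed here is sent to the server as /path/index.cgi
    http-request set-path /path/index.cgi
    server server_alpha localhost:8080

backend backend_beta
    http-request set-path /path/index.cgi
    server server_beta localhost:8081
```

On terminology: this is a rewrite (the client keeps seeing /alpha or /beta while the server receives a different path), whereas a redirect would tell the browser to request a new URL itself.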